Wood and co-workers (Robinson and Cavanaugh, 1998; Wood 2002, 2005a) have devised and promoted the application of some statistical methods to the taxonomy of various biological organisms. Principally, they rely on a technique described as *Baraminic Distance Correlation* (BDC) or a version of multi-dimensional scaling (BDISTMDS). This paper will argue that these methods are based on a shaky understanding of statistical principles, and that their use ought to be abandoned.

**Keywords**: statistical baraminology, distance metrics, distance correlation, bootstrapping

Whereas Darwinian evolution assumes a monophyletic relationship between all life, as depicted in his famous “tree of life,” creationists interpret the differences between biological organisms in terms of original *baramins*, or “created kinds,” followed by a subsequent polyphyletic development. Statistical baraminology (SB) assumes it can uncover the discontinuities between groups of known taxa, which may be a route to determining the original kinds. To identify relationships between taxa, we can enumerate a set of characters that are shared by subsets of the particular organisms being studied. From a statistical viewpoint, we have a set of data comprising measurements of p variables (also called characters or attributes in the literature) for each of *n* objects (taxa) {* x_{i}* }, where

The first step of the BDC procedure is then to define a *n*-dimensional dissimilarity matrix

where the notation *d _{ij}* is shorthand for a function

*Baraminic Distance Correlation* (BDC) typically uses a distance metric that simply counts the number of “matches” between every pair of taxa across all characters1 and subtracts this from the maximum number of matches possible (*p*). Indeed, this is sometimes called a *simple matching distance*. Mathematically, if we have two taxa * x* and

or, equivalently

where the notation [*expr*] denotes the value 1 if the expression *expr* is logically true, and 0 if it is false, and *x _{j}* (resp.

Having calculated this function for all taxa, a *n* × *n* matrix is obtained. The columns of this *dissimilarity matrix* are then treated as variables from which a set of pairwise correlation coefficients *r* can be computed. This is the origin of the term *Baraminic Distance Correlation*. The values of *r* are then assessed for significance using a *t*-test, and the resulting “significance matrix” inspected for patterns. A considerable creationist industry of baraminological data-mining has developed in the last decade in which this idea is applied to many different datasets, of which (Aaron 2014; Cavanaugh and Wood 2002; Garner 2004, 2014; Ingle and Aaron 2015; Wood 2002, 2010, 2011, 2016) form just a sample. Some of the results are equivocal, others problematic (particularly Wood 2010, 2011), which comes to unexpected conclusions in the area of putative human ancestry), but the statistical validity of the method has just been assumed. Some years ago, Wilson (2010) made a plea for better scrutiny of the underlying assumptions of statistical baraminology, but as far as I can tell, this hasn’t happened. On examining SB from a conventional statistical perspective, I have discovered several problems, some of them very serious. These will now be discussed in detail.

BDC assumes we can assign a precise meaning to the idea that (say) object * x* is closer to

where *ϕ _{j}* is the weight of the

Moreover, even if we assume the simplest case—purely nominal variables and the simple matching definition—we are still implicitly assuming that presence and absence of characters is of equal significance. This may not be true, and if a character is expressed by an ordinal variable, it becomes highly dubious. Statisticians are not unaware of this, and several alternative measures have been suggested. Consider first the specific case of binary variables: by counting the presences and absences separately, the distance formula can be rewritten as

where ˄ is the standard symbol for logical “and.” Building on this, some obvious extensions can be seen. If the presence of a character (*x _{j}* = 1) is more significant than its absence (

Still other ideas have been suggested: the *Dice distance*, which gives twice as much weight to the case where both characters are present, has the value

(This could be further generalized to reflect subjective opinions on the importance of presence or absence by using some other value than 2.) These formulae can be extended to the non-binary case if necessary, and weighted versions of all of them can also be defined. For an example of some calculations using these metrics, see Appendix C. A recent study by Finch (2005) concluded that the Jaccard and Dice distances tend to do better than simple matching, in terms of cluster recovery for simulated binary data where the “true” taxonomic structure was known.

For non-binary variables, an obvious modification is to normalize the set of possible values to lie in the range [0, 1], and to modify the contribution made by such components by calculating the absolute distance |*x _{j}* −

Turning now to the various techniques described in Robinson and Cavanaugh (1998) and Wood (2002, 2005a), which can be collectively identified under the heading “statistical baraminology,” we should note that they all assume the computation of a suitable distance metric, although generally only results using the simple matching distance were reported in these papers. In fact, the question of whether this is appropriate appears to be ignored in the SB literature.

The “novel” tool developed in Robinson and Cavanaugh (1998), and routinely employed in papers on SB via a piece of online software known as BDIST (Wood 2001, 2008a, 2008b), is the use of baraminic distance correlation. The columns of the dissimilarity matrix are treated as variables from which a set of pairwise correlation coefficients *r* can be computed using the well-known Pearson product-moment formula. The values of *r* are then assessed for significance using a *t*-test, and the resulting “significance matrix” inspected for patterns.

There are some serious questions about the validity of this procedure. A correlation coefficient measures the strength of a postulated linear relationship between two variables, assuming we have obtained a set of independent random samples of each. Implicitly, we suppose a model

*Y* = *α* + *βX* + *ε*

where *y* and *x* here represent the “response” and “explanatory” variables, respectively, and *ε* represents a random error term. The values of *α* and *β* (and hence the correlation between *Y* and X) are estimated by minimizing the sum of squared errors, which is a *L*_{2} norm. (See Appendix A on norms.) These errors are assumed to be independently and identically distributed. In particular, if the distribution of the errors is Normal, it is possible to test for “significance” of the coefficient *β* by means of a *t*-test with *n* − 2 degrees of freedom and a pre-specified “P-value,” which is the probability of rejecting a null hypothesis that *β* is zero if the hypothesis is true.3 (Testing *β* is equivalent to testing *r*.) Firstly we should realize that the Pearson formula assumes the data are continuous random variables, although formally one can always apply the formula and obtain a result even if this is not so. In any case, it gets worse.

In the application to a distance matrix, the underlying assumptions for correlation are invalid, for the “variables” are simply the columns4 of the distance matrix * D*, which are also just the rows transposed! These values are certainly not

0 *d*

*d* 0

where *d* = *d*(* x_{1}*,

Now clearly this procedure does something, but as the “correlations” do not have a well-defined distribution, we cannot tell which ones are “significant;” that is, we have no idea what the distribution of these values would be under either a null or an alternative hypothesis. Generally, we assign significance to a value that is highly unlikely to arise by chance—usually defined as exceeding the pre-specified critical value. But how can we obtain such values in the absence of Normally distributed variables? In the simplest case, we can sketch a possible answer. If we have a *n* × p matrix of 1s and 0s, the *density* of the matrix * X* can be defined as

In other words, *ρ* tells us the fraction of 1s in the matrix * X*. Let us think about what would influence the distribution of values of the distance between 2 taxa under a null hypothesis that there is no pattern, i.e., if the 1s were assigned at random such that the average density is

which thus provides us with the distribution of “distances.” It’s difficult, however, to make further progress: what is really needed is a *joint* hypothesis test of a set of correlations, but to evaluate the joint distribution of the “correlations” obtained from a set of *n* such variables is an even harder problem! At any rate, it should be clear that this isn’t simply a matter of a t-test with degrees of freedom *n* − 2. Even if we could find it, a “critical value” for *r* will depend not only on n, but also on *p* and *ρ*.

Compounding the problems, however we obtain a critical value, is the non-independence of the columns of D. But even suppose they were independent: in any set of *n* variables, there are

two-way comparisons between them, which implies that the chance of false positives is substantially inflated above the prescribed critical value.5 What constitutes “significance” is therefore exceedingly hard to determine, and in reality, independence is an impossible condition to satisfy anyway.

These problems have been routinely ignored in SB literature—indeed, there is no evidence that they have really been considered. Wood (2011), for example, has argued in favor of statistical baraminology in the case of humanoid species as follows:

With statistical baraminology, the correlation test can be used to estimate the significance of organismal similarity or difference . . . . [so that] we can assign statistical probabilities to the differences that divide human from non-human and to the similarities that unite humans with other non-*sapiens* human species.

which he regards as the clinching argument against “qualitative” approaches. It would be nice if this statement were true, but until a truly rigorous probability model has been formulated and estimated, the jury must stay out.

The BDC procedure can only be seen, then, as a heuristic technique that may help to visualize the structure of the data—but we have no reliable way of knowing whether it does or not, whereas we do know that the assumptions underlying it are false. It would be possible to improve the BDC approach by carrying out a randomization test; that is, having calculated *ρ*, we could simulate a large number of distance matrices with the same statistical characteristics, calculate all the *r* values, and use these to find critical values. But every set of {*n*, *p*, *ρ*} parameters would generate different critical values, so the computational burden would be heavy. And there is a further problem: even when we have obtained “significant” correlations, we still have to interpret the pattern, if any.

At this point, the multi-dimensional scaling (MDS) procedure is generally invoked. Earlier SB literature (Cavanaugh and Wood 2002) used a technique called Analysis of Patterns (ANOPA), but since the publication of Wood (2005b), MDS seems to have become the standard approach.6 Unlike BDC, MDS is a well-documented, principled, tried-and- tested statistical technique for visualizing high-dimensional data in 2 (or occasionally 3) dimensions; it does, however, imply a possibly considerable loss of information. In 2 dimensions, the basic idea is to find a set of coordinates (*u _{i}*,

Whether their additional criticisms regarding the use of statistical techniques in general are truly justified, however, cannot be definitively answered when the BDC/MDS methodology is itself a flawed and unprincipled7 procedure. There are alternative statistical approaches, based on a much firmer foundation, such as cluster analysis. The question as to whether and how this can be applied to baraminology is considered in a separate paper (Reeves 2021).

Wood (2008a) has proposed a modified version of BDC that involves the idea of *bootstrapping*. Efron and Tibshirani (1993) is still the best introduction to this concept, which is in essence rather simple (although the mathematics that justifies it is rather less so). Bootstrapping is most commonly used to estimate the variability of an original single parameter estimate by means of a confidence interval. Classical statistics does this by making certain assumptions about the *sampling distribution* of the estimate of such a parameter—most commonly, that it can be approximated by a Normal (or occasionally some other well-defined) distribution, but in many cases this assumption is invalid. The bootstrap method is a solution for such cases. Our original dataset could have been different, but it is in fact, the best estimate we have of the underlying probability mass function, so we generate a large number *B* of *pseudoreplicates*, the same size as the original sample, by *resampling* from that original sample with replacement. (As it is *with replacement*, some elements of the original sample will appear more than once, while some won’t appear at all. It is this that makes the pseudoreplicates different from the original, and from each other.) Appendix D provides a simple example of the idea, but the critical assumption is this principle:

- Bootstrapping Principle: the sampling distribution of the sample around the population parameter can be approximated by the sampling distribution of the resample around the sample parameter.

But what are we (re-)sampling? The normal approach would be to focus on the *taxa*: assume there are more kinds of creatures than we have employed in the standard BDC analysis. (Perhaps there are even new species out there, or—more likely—we know of other species but just don’t have data on them.) For example, consider the data matrix * X* as a set of rows as follows:

A resample of the row indices in the case *n* = 6 might generate {1, 2, 1, 4, 2, 5}, so the corresponding pseudoreplicate would consist of the rows

from which a distance matrix could be calculated, and the BDC procedure followed through—a process repeated *B* times, each time with a different set of “pseudo-taxa.” But if the datasets are fairly complete in terms of taxa, they effectively *are* a population, not a sample, so what we might gain is puzzling: the groupings that we find are what they are. In fact, any “pseudo-taxon” introduced will be identical to some other real taxon. Consequently, “distances” between them are zero, somewhat distorting the whole process. Nonetheless, as a formal procedure, we could compute a measure of consistency from the pseudoreplicates—for example, how many times each pair of (real) taxa is in the same grouping. (For a review of other ideas, see Hennig 2007).

In any event, Wood’s description of the process he uses in Wood (2008a), though rather opaque, appears to make the primary focus on the *characters* (i.e. the *columns*), which he regards as a sample from a larger set of characters that might have been used if we had measured them, so the pseudoreplicates are obtained by sampling with replacement *B* times from the set of *p* columns. Effectively, if I have understood Wood (2008a) correctly, the normal roles of objects and variables are reversed. I understand Wood is worried as to whether the groupings found are influenced by the particular characters selected, which is certainly a valid subject for inquiry, and the use of “random perturbations in the character data” would be one way to test the robustness of a statistical procedure in this area. But is bootstrapping BDC a statistically principled way of accomplishing this?

To see the problems with this approach, consider an example from Efron and Tibshirani (1993). The dataset consists of values of two measures of performance (LSAT and GPA)—these correspond to characters (columns)—for students entering 15 US law schools (rows)—these correspond to taxa. They create pseudoreplicates by resampling the *rows* (taxa) 15 times, and calculate the correlation between LSAT and GPA for each pseudoreplicate.8 But if they were to resample the *columns*, in many cases this would mean calculating the correlation between LSAT and LSAT, or GPA and GPA, which would obviously be nonsensical! Yet by analogy this is, I think, what Wood is doing.

Now, for a valid application of the bootstrapping concept the (re-)sample values are required to be independently and identically distributed (*iid*). This is usually a reasonable approximation when the sample values are scalar quantities. But here each taxon would in general be a multi-dimensional vector, comprising values from different domains: dichotomies, multiple categories, ordinal, discrete, or even continuous (see Appendix B). Even when all the characters are dichotomous, so that each of the *n* elements of the vector has a Bernoulli distribution, they are not all guaranteed to have the same Bernoulli parameter. More generally, if there are different types of distribution, it is self-evident that the resamples are not identically distributed.9 Moreover, while it seems to be an article of faith in SB that all characters are equal (see Wood and Murray 2003, 115–136), what if some really are “more equal than others?” Even evolutionists such as Conroy (2005, 235–237) recognize that morphological characters (in his case, in hominims) are inextricably dependent on each other, so the values of the sample variables are very likely not independent either. Even if the original columns are independent, the resamples are *guaranteed* not to be, and the fundamental Bootstrapping Principle is in doubt even for the most basic dichotomies-only case. And of course, all the problems with the BDC procedure itself remain. The question of character selection deserves attention, but this approach has too many unexamined assumptions, and some obviously incorrect ones.

The methods used by statistical baraminology have multifarious flaws, despite the surface appearance of rigor and sophistication.

• The basic assumption that all characters are equal appears to be untested. In fact, if Wilson is correct to argue (2010) that it is nearly always the case that all characters are *not* equally important,10 the analysis would be affected in important ways.

• The choice of distance metric is rarely discussed in the context of the nature of the variables involved.

• Typically the distance metric used assumes, without justification, that equal weight should be attached to the presence and absence of characters.

• The BDC technique is not based securely on statistical principles, and the question of true statistical significance of the “correlations” is sidestepped.

• While MDS has a better statistical pedigree than BDC, the inherent subjectivity of the choice of distance metric, and the loss of information in the dimensionality reduction, mean that results should be treated with more than usual caution.

• In a formal sense the application of bootstrapping is uncontroversial, but it is not clear that the bootstrap is correctly applied, nor that the Bootstrapping Principle can hold for SB methods.

As is shown in Reeves (2021), it is possible—and in fact—fairly simple, to apply cluster analysis to the sort of questions being asked by proponents of SB. Statistical software such as the *r* language is readily available (at no cost) with a wide choice of principled algorithms using well-defined statistical tests of significance. BDC really should be abandoned. But even this isn’t the end of the matter: in Reeves (2021), I discuss the important conceptual and practical considerations that arise when a large number of characters are evaluated across a range of taxa. This again is something routinely ignored by SB, as it remains under the spell of a “holistic” treatment of the data in order to facilitate a statistical analysis. Whether the analysis has a rigorous foundation or not, there are some important questions to be asked *of the data* before any software is applied.

I don’t wish to be too hard on the proponents of statistical baraminology; the goal is worthy and the fundamental idea commendable, even if the techniques applied are inadequate, and the treatment of the data too often perfunctory. Moreover, some excellent and painstaking—even heroic—work has clearly been done in preparing and editing a growing collection of very useful datasets. But it is a shame to see much effort ploughed into a mistaken enterprise that may only give ammunition to unfriendly critics of creationist research. Not that creationists are alone in misapplying statistics; evolutionists are no sure guide themselves. (See Mannion et al. 2011 for a fairly recent example of poor statistical understanding.11) If as creationists we wish to apply statistics to the identification of created kinds, just plugging numbers into a computer program (even if as in Reeves (2021) it is more firmly based than BDC) is inadequate. I would suggest we need to

- understand better the nature of the data we have collected and the assumptions upon which rests our measure of “distance”
- understand better the assumptions underlying statistical methods such as multi-dimensional scaling, cluster analysis, randomization tests and bootstrapping
- realize the importance of testing our assumptions and checking results for robustness (e.g., does changing a distance metric alter the conclusions?)
- treat conclusions with much greater caution than has sometimes been the case (cf.
*Australopithecus sediba*).

Finally, I am no geneticist, but it seems highly likely to me that the exclusive focus on phenotypic information is a mistake. Perhaps there aren’t enough sequenced genomes for comparison, and defining “distance” at the level of DNA has its own collection of problems, some even more difficult than in the phenotype. But we surely expect the creatures within the kinds to be related on a genetic level.

Aaron, M. 2014. “Discerning Tyrants from Usurpers: A Statistical Baraminological Analysis of Tyrannosauroidea Yielding the First Dinosaur Holobaramin.” *Answers Research Journal* 7 (November 26): 463−481. https://answersingenesis.org/dinosaurs/discerning-tyrants-usurpers-statistical-baraminological-analysis-tyrannosauroidea-yielding-first-din/.

Cavanaugh, David P., and Todd Charles Wood. 2002. “A Baraminological Analysis of the Tribe Heliantheae *senso latu* (Asteraceae) Using Analysis of Pattern (ANOPA).” *Occasional Papers of the Baraminology Study Group* 1, 1−11.

Conroy, Glenn C. 2005. *Reconstructing Human Origins*. New York: W. W. Norton & Company.

Efron, Bradley. and Robert J. Tibshirani. 1993. *An Introduction to the Bootstrap*. San Francisco, California: Chapman & Hall.

Finch, Holmes. 2005. “Comparison of Distance Measures in Cluster Analysis with Dichotomous Data.” *Journal of Data Science* 3: 85−100.

Garner, Paul A. 2004. “Is the Equidae a Holobaramin?” In *Proceedings of the Third BSG Conference*. Edited by Roger W. Sanders. *Occasional Papers of the BSG* 4 (June 9): 10.

Garner, P. A. 2014. “Baraminological Analysis of the Picidae (Vertebrata:Aves:Piciformes) and Implications for Creationist Design Arguments.” *Journal of Creation Theology and Science Series B: Life Sciences* 4 (July 7): 1−11.

Hennig, C. 2007. “Cluster−wise Assessment of Cluster Stability.” *Computational Statistics and Data Analysis* 52, no. 1: 258−271.

Ingle, Matthew E., and M. Aaron. 2015. “A Baraminic Study of the Blood Flukes of Family Schistosomatidae.” *Answers Research Journal* 8 (June 24): 327–337. https://answersingenesis.org/creation-science/baraminology/baraminic-study-blood-flukes-family-schistosomatidae/.

Mannion, Philip D., Paul Upchurch, Matthew T. Carrano, and Paul M. Barrett. 2011. “Testing the Effect of the Rock Record on Diversity: A Multidisciplinary Approach to Elucidating the Generic Richness of Sauropodomorph Dinosaurs Through Time.” *Biology Reviews* 86, no. 1 (February): 157−181.

Menton, David A., Anne Habermehl, and David DeWitt. 2010. “Baraminological Analysis Places *Homo habilis*, *Homo rudolfensis*, and *Australopithecus sediba* in the Human Holobaramin: Discussion.” *Answers Research Journal* 3 (August 25): 153–158. https://answersingenesis.org/creation-science/baraminology/homo-habilis-homo-rudolfensis-australopithecus-sediba-discussion/.

Reeves, Colin R. 2021. “A Critical Evaluation of Statistical Baraminology: Part 2.” *Answers Research Journal* 14: 271–282.

Robinson, D. Ashley, and David P. Cavanaugh. 1998. “A Quantitative Approach to Baraminology with Examples from Catarrhine Primates.” *Creation Research Society Quarterly* 34, no. 4 (March): 196−208.

Wilson, Gordon. 2010. “Classic Multidimensional Scaling Isn’t the *Sine Qua Non* of Baraminology.” *Answers in Depth* 5 (September 29). https://answersingenesis.org/creation-science/baraminology/classic-multidimensional-scaling-and-baraminology/.

Wood, Todd Charles. 2001. BDIST software, v.1. Dayton, Tennessee: Center for Origins Research and Education, Bryan College. Distributed by the author. http://www. coresci.org/bdistinfo.html.

Wood, Todd Charles. 2002. “A Baraminology Tutorial with Examples from the Grasses (Poaceae).” *Journal of Creation* 16, no. 1 (April): 15−25.

Wood, Todd Charles. 2005a. *A Creationist Review and Preliminary Analysis of the History, Geology, Climate, and Biology of the Galápagos Islands. Center for Origins Research Issues in Creation*, no. 1 (June 15). Eugene, Oregon: Wipf & Stock.

Wood, Todd Charles. 2005b. “Visualizing Baraminic Distances using Classical Multidimensional Scaling.” *Origins* 57: 9−29.

Wood, Todd Charles. 2008a. “Baraminic Distance, Bootstraps, and BDISTMDS.” *Occasional Papers of the Baraminology Study Group* 12: 1−17.

Wood, T. C. 2008b. BDISTMDS Software, v.2.0. Dayton, Tennessee: Center for Origins Research and Education, Bryan College. Distributed by the author. http://www. coresci.org/bdist.html.

Wood, Todd Charles. 2010. “Baraminological Analysis Places *Homo habilis*, *Homo rudolfensis*, and *Australopithecus sediba* in the Human Holobaramin.” *Answers Research Journal* 3 (May 5): 71−90.

Wood, Todd Charles. 2011. “Baraminology, the Image of God, and *Australopithecus sediba*.” *Journal of Creation Theology and Science Series B: Life Sciences* 1 (July 8): 6–14.

Wood, T. C. 2016. “An Evaluation of *Homo naledi* and ‘Early’ Homo from a Young-Age Creationist Perspective.” *Journal of Creation Theology and Science, Series B: Life Sciences* 6: 14–30.

Wood, Todd Charles, and Megan J. Murray. 2003. *Understanding the Pattern of Life: Origins and Organization of the Species*. Nashville, Tennessee: Broadman & Holman.

Mathematically, a distance metric has the following properties:

Given points (or vectors) *x*, *y*, and *z* in a *p*-dimensional vector space *X*, a function

(where IR is the real line )

is a distance metric if

If this looks threatening, consider it intuitively.

Really this is just stating the obvious: the first three conditions say (respectively) that a distance may be zero (but if and only if the points are coincident), but can’t be negative, and that it shouldn’t depend on the direction of travel. The last (the *triangle inequality*) says that taking a detour can’t make the route between two points any shorter. Distances in cluster analysis may not always obey this last condition— the Dice distance, mentioned in the main text, is a case in point. When any condition is not true, we may speak more generally of a distance *measure*.

Closely related to the idea of distance is that of a vector *norm*, **║ x║**, which measures the “size” of a vector

It is a mistake to assume that all statistical techniques can be used on any type of statistical data. Statisticians commonly distinguish between at least 4 types of data: When the categories into which objects can be placed are simply “names” without underlying order (e.g., red/blue/green), the data are **nominal**; if there are only two categories (e.g., male/female), we further denote them as **binary** or **dichotomous**. If the categories can be ordered they are **ordinal** (e.g., small/medium/large). When they can be measured on a scale, they are **quantitative**. We distinguish quantitative variables as **discrete** if they are the result of counting, or **continuous** if they are the result of a measurement. The latter may be further divided: If the measuring scale is merely an **interval** scale (e.g., Fahrenheit temperature), there is no true zero and we can only compare differences, but if it is a **ratio** scale (e.g., blood pressure), we can calculate ratios and percentages. (If my diastolic blood pressure has decreased from 100 to 97, I can say either by 3 units or by 3%, but if my body temperature has decreased from 100° F to 97° F only the *difference* of 3° F makes sense. Try converting the ratios to Celsius!)

As an example for constructing different distance measures, consider the following table of 2 objects (taxa) and 9 binary variables (characters):

Taxon | Character | ||||||||
---|---|---|---|---|---|---|---|---|---|

1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | |

x |
1 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 1 |

y |
0 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 1 |

Comparing * x* and

Thus the simple matching distance is

the Jaccard distance is

while the Dice distance is

Notice how the distance between * x* and

The bootstrap estimator of a statistical parameter assumes that the best “guess” for the true probability distribution of a random variable is the probability mass function obtained from the actual data. For example, if we toss a fair coin (i.e. prob. of a head = prob. of a tail = 0.5) three times and record the number of heads, we can calculate the expected probability distribution analytically using the Binomial distribution, as in the following table:

No. of heads | 0 | 1 | 2 | 3 |
---|---|---|---|---|

Prob. | 1/8 | 3/8 | 3/8 | 1/8 |

But what if we suspect the coin is biased? Theoretical calculations are no good; we need data. Suppose we do 8 experiments, each time tossing the coin 3 times and counting the number of heads. Suppose they were distributed as shown below:

No. of heads | 0 | 1 | 2 | 3 |
---|---|---|---|---|

Prob. | 2/8 | 4/8 | 1/8 | 1/8 |

This is the empirical probability distribution. There are 4+2+3 = 9 heads in these 24 coin tosses, so the probability of a head is estimated as *p̂ _{H}* = 9/24 = 3/8 for this coin, but with what degree of confidence? We could use the Binomial distribution again to find a confidence interval, but we could also estimate the variability of the estimate by drawing (say) 1,000 samples of size 8 with replacement from the empirical mass function, configured as the set {0, 0, 1, 1, 1, 1, 2, 3}. From each resample, calculate an estimate

Cutting-edge creation research. Free. Answers Research Journal (ARJ) is a professional, peer-reviewed technical journal.

Browse Volume- Strictly speaking, BDC doesn’t use every character; there is a preliminary stage which eliminates characters that are not “relevant,” i.e., are sparsely represented in the taxa. Typically, the criterion for relevance seems to be 95% representation—occasionally 90% or even less; whether the results are sensitive to this choice is rarely mentioned. Presumably, the idea could be extended to eliminating taxa as well, although this does not seem to be an issue.
- Imagine the case of a continuous variable with a range of [0,30]cm. Artificially dichotomizing by separating into [0,15) and [15,30] means that two samples that measure 14.99 cm and 15.0 cm are regarded as different, when they are obviously nearly identical.
- This pre-specified or “critical” value (rather confusingly) is usually also given the symbol
*α*, and is the chance of a “false positive.” A value*α*= 5 is often assumed by default, but this is bad practice. - More exactly, the elements of column
*j*, for example, are treated as a set of observations on the*j*’th variable. - If the critical value is α, the chance of
*at least*one false positive is 1 − (1 − α)*M*≈*M*α for small α, if the variables are independent. If they are not, without knowing much more about the dependence structure of the whole ensemble it’s extremely difficult to say what it is! - This is sensible, as the methodology used by ANOPA is inappropriate to nominal data; it entails the calculation of centroids and Euclidean distances from the data matrix
*X*, but the use of medoids is needed for more nominal data where rectangular (or “Manhattan”) distances (the*L*norm again) are given. According to the account in Wood and Murray (2003, 115–136), ANOPA ‘‘reduces the dimensionality while minimizing the loss of information’’ in the dataset. Unfortunately, it is not clear exactly what criterion is measuring “information,” nor how its loss is minimized._{1} *Unprincipled*: not in a moral sense, but a statistical one.- The point of this example is that the histogram of the correlations resulting from the 1,000 pseudoreplicates is highly skewed, demonstrating the non-Normal nature of the distribution.
- For example, suppose each taxon has 20 characters—12 dichotomies, 6 ordinal, and 2 continuous; resampling the columns might generate one “pseudoreplicate” with 15 dichotomies and 5 ordinal characters, another with 10 dichotomies, 7 ordinal, and 3 continuous characters, etc. The sampling distributions clearly cannot be the same.
- It is fair to note that, contra Wilson, Wood and Murray (2003) place great emphasis on what they call the “holistic” virtues of treating all characters alike, although I have seen little in the nature of evidence to support the relevance of this claim.
- Mannion et al. (2011) conduct multiple comparisons of correlation coefficients, apparently without adjusting critical values for the inherent inter-relationships—the same problem that arises with BDC.

ISSN: 1937-9056 Copyright © 2021 Answers in Genesis, Inc. All content is owned by Answers in Genesis (“AiG”) unless otherwise indicated. AiG consents to unlimited copying and distribution of print copies of *Answers Research Journal* articles for non-commercial, non-sale purposes only, provided the following conditions are met: the author of the article is clearly identified; Answers in Genesis is acknowledged as the copyright owner; *Answers Research Journal* and its website, www.answersresearchjournal.org, are acknowledged as the publication source; and the integrity of the work is not compromised in any way. For website and other electronic distribution and publication, AiG consents to republication of article abstracts with direct links to the full papers on the *ARJ* website. All rights reserved. For more information write to: Answers in Genesis, PO Box 510, Hebron, KY 41048, Attn: Editor, *Answers Research Journal.* *The views expressed are those of the writer(s) and not necessarily those of the *Answers Research Journal* Editor or of Answers in Genesis.*

High-quality papers for *Answers Research Journal*, sponsored by Answers in Genesis, are invited for submission.

- Read the Instructions to Authors Manual (PDF).
- Email papers, diagrams, tables, etc. to the email address listed in the Manual.

Support the creation/gospel message by donating or getting involved!

**Answers in Genesis is an apologetics ministry**, dedicated to helping Christians defend their faith and proclaim the good news of Jesus Christ.

- Customer Service 800.778.3390
- © 2021 Answers in Genesis