Phylogenetic comparative methods


Phylogenetic comparative methods (PCMs) use information on the historical relationships of lineages (phylogenies) to test evolutionary hypotheses. The comparative method has a long history in evolutionary biology; indeed, Charles Darwin used differences and similarities between species as a major source of evidence in The Origin of Species. However, because closely related lineages share many traits and trait combinations as a result of descent with modification, lineages are not statistically independent. This realization inspired the development of explicitly phylogenetic comparative methods. Initially, these methods were primarily developed to control for phylogenetic history when testing for adaptation; however, in recent years the use of the term has broadened to include any use of phylogenies in statistical tests. Although most studies that employ PCMs focus on extant organisms, many methods can also be applied to extinct taxa and can incorporate information from the fossil record.
PCMs can generally be divided into two types of approaches: those that infer the evolutionary history of some character across a phylogeny, and those that infer the process of evolutionary branching itself, though some approaches do both simultaneously. Typically, the tree used in conjunction with PCMs has been estimated independently, such that both the relationships between lineages and the lengths of the branches separating them are assumed to be known.

Applications

Phylogenetic comparative approaches can complement other ways of studying adaptation, such as studying natural populations, experimental studies, and mathematical models. Making interspecific comparisons allows researchers to assess the generality of evolutionary phenomena by considering independent evolutionary events. Such an approach is particularly useful when there is little or no variation within species. Because these approaches can explicitly model evolutionary processes occurring over very long time periods, they can also provide insight into macroevolutionary questions, once the exclusive domain of paleontology.
Figure: Home range areas of 49 species of mammals in relation to their body size. Larger-bodied species tend to have larger home ranges, but at any given body size members of the order Carnivora tend to have larger home ranges than ungulates. Whether this difference is considered statistically significant depends on what type of analysis is applied.
Figure: Testes mass of various species of Primates in relation to their body size and mating system. Larger-bodied species tend to have larger testes, but at any given body size species in which females tend to mate with multiple males have males with larger testes.
Phylogenetic comparative methods are commonly applied to such questions as:
Example: how does brain mass vary in relation to body mass?
Example: do canids have larger hearts than felids?
Example: do carnivores have larger home ranges than herbivores?
Example: where did endothermy evolve in the lineage that led to mammals?
Example: where, when, and why did placentas and viviparity evolve?
Example: are behavioral traits more labile during evolution?
Example: why do small-bodied species have shorter life spans than their larger relatives?

Phylogenetically independent contrasts

Felsenstein proposed the first general statistical method for incorporating phylogenetic information in 1985, i.e., the first that could use any arbitrary topology and a specified set of branch lengths. The method is now recognized as an algorithm that implements a special case of what are termed phylogenetic generalized least-squares models. The logic of the method is to use phylogenetic information (and an assumed Brownian-motion-like model of trait evolution) to transform the original tip data into values that are statistically independent and identically distributed.
The algorithm involves computing values at internal nodes as an intermediate step, but they are generally not used for inferences by themselves. An exception occurs for the basal node, which can be interpreted as an estimate of the ancestral value for the entire tree or as a phylogenetically weighted estimate of the mean for the entire set of tip species. The value at the root is equivalent to that obtained from the "squared-change parsimony" algorithm and is also the maximum likelihood estimate under Brownian motion. The independent contrasts algebra can also be used to compute a standard error or confidence interval.
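The contrast computation described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the nested-tuple tree format, the function name `pic`, the four-tip tree, and the trait values are all assumptions made up for the example. It assumes a fully bifurcating tree with known, positive branch lengths and a Brownian motion model.

```python
import math

def pic(node, traits):
    """Return (node_value, extra_branch_length, standardized_contrasts).

    A tip is a string name; an internal node is a pair of
    (child, branch_length) tuples. This layout is hypothetical.
    """
    if isinstance(node, str):
        return traits[node], 0.0, []
    (left, bl_left), (right, bl_right) = node
    x_l, extra_l, c_l = pic(left, traits)
    x_r, extra_r, c_r = pic(right, traits)
    # Branches are lengthened to reflect uncertainty in estimated node values.
    v_l = bl_left + extra_l
    v_r = bl_right + extra_r
    # Standardized contrast: difference scaled by its expected standard deviation.
    contrast = (x_l - x_r) / math.sqrt(v_l + v_r)
    # Node value: average of the children weighted by inverse branch length.
    value = (x_l / v_l + x_r / v_r) / (1.0 / v_l + 1.0 / v_r)
    extra = v_l * v_r / (v_l + v_r)
    return value, extra, c_l + c_r + [contrast]

# Made-up example tree ((A:1,B:1):1,(C:1,D:1):1) and made-up trait data.
ab = (("A", 1.0), ("B", 1.0))
cd = (("C", 1.0), ("D", 1.0))
tree = ((ab, 1.0), (cd, 1.0))
x = {"A": 1.0, "B": 3.0, "C": 6.0, "D": 8.0}
y = {name: 2.0 * value + 1.0 for name, value in x.items()}

root_x, _, cx = pic(tree, x)
root_y, _, cy = pic(tree, y)

# Contrasts have expected mean zero, so the regression is forced through the origin.
slope = sum(a * b for a, b in zip(cx, cy)) / sum(a * a for a in cx)
print("contrasts in x:", cx)
print("root estimate for x:", root_x)
print("slope of y-contrasts on x-contrasts:", slope)
```

A tree with n tips yields n - 1 contrasts, and the value returned for the root is the phylogenetically weighted mean mentioned above.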

Phylogenetic generalized least squares (PGLS)

Probably the most commonly used PCM is phylogenetic generalized least squares (PGLS). This approach is used to test whether there is a relationship between two variables while accounting for the fact that lineages are not independent. The method is a special case of generalized least squares (GLS), and as such the PGLS estimator is also unbiased, consistent, efficient, and asymptotically normal. In many statistical situations where GLS (or ordinary least squares) is used, the residual errors ε are assumed to be independent and identically distributed normal random variables,
$$\varepsilon \mid X \sim \mathcal{N}(0, \sigma^2 I),$$
where σ² is the residual variance and I is the identity matrix, whereas in PGLS the errors are assumed to be distributed as
$$\varepsilon \mid X \sim \mathcal{N}(0, V),$$
where V is a matrix of expected variances and covariances of the residuals given an evolutionary model and a phylogenetic tree. Therefore, it is the residuals, and not the variables themselves, that are assumed to show phylogenetic signal. This has long been a source of confusion in the scientific literature. A number of models have been proposed for the structure of V, such as Brownian motion, Ornstein-Uhlenbeck, and Pagel's λ model. In PGLS, the parameters of the evolutionary model are typically co-estimated with the regression parameters.
PGLS can only be applied to questions where the dependent variable is continuously distributed; however, the phylogenetic tree can also be incorporated into the residual distribution of generalized linear models, making it possible to generalize the approach to a broader set of distributions for the response.
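As a rough illustration of the estimator, the sketch below computes the GLS solution β = (XᵀV⁻¹X)⁻¹XᵀV⁻¹y in plain Python for a single predictor. It makes several simplifying assumptions that are not part of the description above: V is fixed to its Brownian motion form (each entry is the branch length shared by two tips, measured from the root) rather than co-estimated with the regression, the four-tip tree and the data are made up, and the helper names `solve`, `matmul`, and `pgls` are hypothetical.

```python
def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination; a and b are lists of rows."""
    n = len(a)
    m = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [value / p for value in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]

def pgls(x, y, v):
    """Return (intercept, slope) of the GLS fit of y on x with covariance v."""
    design = [[1.0, xi] for xi in x]
    design_t = [list(row) for row in zip(*design)]
    vinv_design = solve(v, design)               # V^-1 X
    vinv_y = solve(v, [[yi] for yi in y])        # V^-1 y
    beta = solve(matmul(design_t, vinv_design),  # (X' V^-1 X) beta = X' V^-1 y
                 matmul(design_t, vinv_y))
    return beta[0][0], beta[1][0]

# Brownian motion covariance for the made-up tree ((A:1,B:1):1,(C:1,D:1):1):
# diagonal = root-to-tip distance, off-diagonal = shared path length.
V = [[2.0, 1.0, 0.0, 0.0],
     [1.0, 2.0, 0.0, 0.0],
     [0.0, 0.0, 2.0, 1.0],
     [0.0, 0.0, 1.0, 2.0]]
x = [1.0, 3.0, 6.0, 8.0]    # made-up predictor values for tips A..D
y = [2.9, 7.4, 12.8, 17.5]  # made-up response values for tips A..D

intercept, slope = pgls(x, y, V)
print("PGLS intercept:", intercept)
print("PGLS slope:", slope)
```

Replacing V with the identity matrix reduces this to ordinary least squares, which is one way to see how the assumed residual covariance changes the fit.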

Phylogenetically informed Monte Carlo computer simulations

Martins and Garland proposed in 1991 that one way to account for phylogenetic relations when conducting statistical analyses was to use computer simulations to create many data sets that are consistent with the null hypothesis under test but that mimic evolution along the relevant phylogenetic tree. If such data sets are analyzed with the same statistical procedure that is used to analyze a real data set, then results for the simulated data sets can be used to create phylogenetically correct null distributions of the test statistic. Such simulation approaches can also be combined with such methods as phylogenetically independent contrasts or PGLS.
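The simulation idea can be sketched as follows, under assumptions chosen purely for illustration: traits evolve by Brownian motion with unit rate, the null hypothesis is that two traits evolve independently along the tree, and the test statistic is the ordinary Pearson correlation across tips. The tree, the observed data, and the function names are hypothetical.

```python
import math
import random

def simulate_bm(node, start, rate, rng, out):
    """Simulate Brownian motion down the tree, storing tip values in `out`.

    A tip is a string name; an internal node is a tuple of
    (child, branch_length) pairs. This layout is hypothetical.
    """
    if isinstance(node, str):
        out[node] = start
        return
    for child, branch_length in node:
        # Change along a branch is normal with variance rate * branch length.
        step = rng.gauss(0.0, math.sqrt(rate * branch_length))
        simulate_bm(child, start + step, rate, rng, out)

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)

def null_distribution(tree, tips, n_sims, seed):
    """Correlations of trait pairs simulated independently on the same tree."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_sims):
        sim_x, sim_y = {}, {}
        simulate_bm(tree, 0.0, 1.0, rng, sim_x)
        simulate_bm(tree, 0.0, 1.0, rng, sim_y)
        stats.append(pearson([sim_x[t] for t in tips], [sim_y[t] for t in tips]))
    return stats

# Made-up tree ((A:1,B:1):1,(C:1,D:1):1) and made-up observed data.
tree = (((("A", 1.0), ("B", 1.0)), 1.0), ((("C", 1.0), ("D", 1.0)), 1.0))
tips = ["A", "B", "C", "D"]
x_obs = [1.0, 3.0, 6.0, 8.0]
y_obs = [2.9, 7.4, 12.8, 17.5]

r_obs = pearson(x_obs, y_obs)
null_rs = null_distribution(tree, tips, n_sims=2000, seed=1)
# Two-sided p-value: share of simulated correlations at least as extreme as observed.
p_value = sum(abs(r) >= abs(r_obs) for r in null_rs) / len(null_rs)
print("observed r:", r_obs)
print("phylogenetically informed p-value:", p_value)
```

Because related tips share simulated history, the null distribution of the correlation is wider than the one assumed by a conventional test on independent data points, which is the reason for building it by simulation.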

Journals