Paterson, Adrian M.; Wallis, Graham P.; Kennedy, Martyn; Gray, Russell D.
doi: 10.1111/cla.12040 pmid: 34784697
Over the past two decades, behavioural biologists and ecologists have made effective use of the comparative method, but have often stopped short of adopting an explicitly phylogenetic approach. We examined 68 behaviour and life history (BLH) traits of 15 penguin species to: (i) infer penguin phylogeny, (ii) assess homology of behavioural characters, and (iii) evaluate hypotheses about character evolution and ancestral states. Parsimony analysis of the BLH dataset found either two shortest trees (characters coded as unordered) or a single shortest tree (characters coded as a combination of unordered and Dollo). The BLH data had significant structure. Kishino–Hasegawa tests indicated that BLH trees were significantly different from most previous estimates of penguin phylogeny. The BLH phylogeny generated from Dollo characters appeared to be less accurate than the tree derived from the completely unordered dataset. Dividing BLH data into display and non‐display traits resulted in no significant differences in level of homoplasy and no difference in the accuracy of phylogeny. Tests for homology of BLH traits were performed by mapping the characters onto a molecular tree. Assuming that independent gains are less likely than losses of character states, 65 of the 68 characters were likely to be homologous across taxa, and several characters appeared to have been stable since the origin of modern penguins around 30 Myr ago. Finally, the likely BLH traits of the most recent common ancestor of extant penguins were reconstructed from character states along the internal branch leading to the penguins. This analysis suggested that the “proto‐penguin” probably had a similar life history to current temperate penguins but few ritualized behaviours. A southern, cool‐temperate origin of penguins is suggested.
doi: 10.1111/cla.12047 pmid: 34788977
Several extensions to implied weighting, recently implemented in TNT, allow a better treatment of data sets combining morphological and molecular data, as well as those comprising large numbers of missing entries (e.g. palaeontological matrices, or combined matrices with some genes sequenced for few taxa). As there have been recent suggestions that molecular matrices may be better analysed using equal weights (rather than implied weighting), a simple way to apply implied weighting to only some characters (e.g. morphology), leaving other characters with a constant weight (e.g. molecules), is proposed. The new methods also allow weighting entire partitions according to their average homoplasy, giving each of the characters in the partition the same weight (this can be used to weight dynamically, e.g., entire genes, or first, second, and third positions collectively). Such an approach is easily implemented in schemes like successive weighting, but in the case of implied weighting poses some particular problems. The approach has the peculiar implication that the inclusion of uninformative characters influences the results (by influencing the implied weights for the partitions). Last, the concern that characters with many missing entries may receive artificially inflated weights (because they necessarily display less homoplasy) can be solved by allowing the use of different weighting functions for different characters, in such a way that the cost of additional transformations decreases more rapidly for characters with more missing entries (thus effectively assuming that the unobserved entries are likely to also display some unobserved homoplasy). The conceptual and practical aspects of all these problems, as well as details of the implementation in TNT, are discussed.
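The weighting ideas in this abstract can be illustrated with a minimal sketch. It uses the standard implied-weighting cost of a character, h / (k + h), where h is the number of extra steps and k is the concavity constant. The partition scheme (each character receives the cost of the partition's average homoplasy) and the `mol_weight` parameter are assumptions made for illustration only; this is not TNT's implementation, and all function names are hypothetical.

```python
def iw_cost(extra_steps, k=3.0):
    """Implied-weighting cost of one character: h / (k + h)."""
    return extra_steps / (k + extra_steps)


def partition_cost(extra_steps_list, k=3.0):
    """Assumed partition scheme: every character in the partition gets
    the cost computed from the partition's average homoplasy."""
    n = len(extra_steps_list)
    average = sum(extra_steps_list) / n
    return n * iw_cost(average, k)


def mixed_cost(morph_steps, mol_steps, k=3.0, mol_weight=1.0):
    """Implied weighting for one set of characters (e.g. morphology) and a
    constant weight per extra step for the rest (e.g. molecules).
    mol_weight is a hypothetical scaling parameter."""
    weighted = sum(iw_cost(h, k) for h in morph_steps)
    constant = mol_weight * sum(mol_steps)
    return weighted + constant


# Adding an uninformative character (0 extra steps) to a partition changes
# the average homoplasy and therefore the partition's total cost.
without_uninformative = partition_cost([0, 6])
with_uninformative = partition_cost([0, 0, 6])
```

In this sketch the extra uninformative character lowers the partition's average homoplasy from 3 to 2, so the two totals differ, which mirrors the "peculiar implication" noted in the abstract.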
doi: 10.1111/cla.12056 pmid: 34794250
Oblong, a program with very low memory requirements, is presented. It is designed for parsimony analysis of data sets comprising many characters for moderate numbers of taxa (the order of up to a few hundred). The program can avoid using vast amounts of RAM by temporarily saving data to disk buffers, only parts of which are periodically read back in by the program. In this way, the entire data set is never held in RAM by the program—only small parts of it. While using disk files to store the data slows down searches, it does so only by a relatively small factor (4× to 5×), because the program minimizes the number of times the data must be accessed (i.e. read back in) during tree searches. Thus, even if the program is not designed primarily for speed, runtimes are within an order of magnitude of those of the fastest existing parsimony programs.
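The disk-buffer idea can be sketched as follows: the character matrix is written to disk in blocks, and the length of a tree is accumulated by reading one block at a time, so the full matrix is never held in memory. This is only an illustration under stated assumptions, not Oblong's algorithm or file format. The JSON-lines layout, the nested-tuple tree representation, and all function names are hypothetical, and the scoring uses a plain Fitch pass for unordered characters without missing-data handling.

```python
import json
import os
import tempfile


def write_chunks(matrix, chunk_size, path):
    """Write the matrix (taxon -> character string) to disk, one block of
    chunk_size characters per line."""
    n_chars = len(next(iter(matrix.values())))
    with open(path, "w") as fh:
        for start in range(0, n_chars, chunk_size):
            block = {t: s[start:start + chunk_size] for t, s in matrix.items()}
            fh.write(json.dumps(block) + "\n")


def fitch_length(tree, block):
    """Fitch step count for one block on a rooted binary tree given as
    nested 2-tuples of taxon names."""
    steps = 0

    def down(node):
        nonlocal steps
        if isinstance(node, str):
            return [{c} for c in block[node]]
        left, right = down(node[0]), down(node[1])
        merged = []
        for a, b in zip(left, right):
            common = a & b
            if common:
                merged.append(common)
            else:
                merged.append(a | b)  # empty intersection costs one step
                steps += 1
        return merged

    down(tree)
    return steps


def streamed_length(tree, path):
    """Sum block lengths while holding only one block in memory."""
    total = 0
    with open(path) as fh:
        for line in fh:
            total += fitch_length(tree, json.loads(line))
    return total


matrix = {"A": "AAGT", "B": "AAGC", "C": "CAGC", "D": "CTGC"}
tree = (("A", "B"), ("C", "D"))
path = os.path.join(tempfile.mkdtemp(), "blocks.jsonl")
write_chunks(matrix, 2, path)
length = streamed_length(tree, path)
```

Because parsimony length is additive over characters, the streamed total equals the length computed on the whole matrix at once; the cost of the approach is the extra disk reads, which is the trade-off the abstract describes.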
doi: 10.1111/cla.12046 pmid: 34788973
One of the most time‐consuming aspects of Bayesian posterior probability analysis of phylogenetic trees is the use of Metropolis‐coupled Markov chain Monte Carlo (MC3) methods to determine relative posteriors and identify maximum a posteriori (MAP) trees. Here, analytical and numerical methods are presented to determine tree likelihoods that are integrated over edge‐length (and other parameter) distributions. Given topological (tree) priors (flat or otherwise), this allows for identification of the maximum posterior probability assignment (MAP‐A) of character states to non‐leaf tree vertices via dynamic programming. Using this form of posterior probability as an optimality criterion, tree space can be searched using standard trajectory techniques and heuristically optimal MAP‐A trees can be identified with considerable time savings over MC3. Example cases are presented using aligned and unaligned molecular sequences as well as combined molecular and anatomical data.
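The dynamic-programming step can be illustrated with a max-product pass over a fixed tree, which finds the score of the most probable assignment of states to internal vertices for a single character. The sketch assumes the per-edge transition probabilities have already been integrated over edge lengths and takes a simple symmetric form controlled by a hypothetical `p_change` parameter; the abstract does not give the actual distributions, and the function names are illustrative only.

```python
import math

STATES = "ACGT"


def edge_logp(parent, child, p_change=0.1):
    """Assumed integrated edge probability: 1 - p_change for no change,
    p_change / 3 for each of the three possible changes."""
    if parent == child:
        return math.log(1 - p_change)
    return math.log(p_change / 3)


def map_assignment_score(tree, leaves, p_change=0.1):
    """Log probability of the best joint assignment of states to internal
    vertices, for a tree given as nested tuples of leaf names."""

    def best(node):
        # Maps each state to the best log probability of the subtree
        # below node, given that node carries that state.
        if isinstance(node, str):
            return {s: (0.0 if s == leaves[node] else -math.inf)
                    for s in STATES}
        scores = {s: 0.0 for s in STATES}
        for child in node:
            child_best = best(child)
            for s in STATES:
                scores[s] += max(edge_logp(s, c, p_change) + child_best[c]
                                 for c in STATES)
        return scores

    root_prior = math.log(1 / len(STATES))  # flat prior on the root state
    return max(root_prior + v for v in best(tree).values())


tree = (("x", "y"), "z")
score_same = map_assignment_score(tree, {"x": "A", "y": "A", "z": "A"})
score_one_change = map_assignment_score(tree, {"x": "A", "y": "A", "z": "C"})
```

Each vertex is visited once and each edge requires a comparison over all state pairs, so the pass is linear in the number of vertices, in contrast to sampling-based estimation of posteriors.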
Stuessy, Tod F.; Hörandl, Elvira
doi: 10.1111/cla.12038 pmid: 34784696
The review of paraphyly in botanical systematics by Schmidt‐Lebuhn brings together a number of useful perspectives for the reader. It fails to offer new ideas, however, and it does not recognize the fallacies of strict cladistic classification, namely accepting only holophyletic groups, and insisting that sister groups have the same rank. The reason for adherence to these rules is to maintain the convenience of cladistic classification. While convenience in biological classification by itself is not necessarily bad, it becomes unacceptable when its use overshadows achieving a higher level of evolutionary (and phylogenetic) information content. Evolutionary divergence and reticulation are both significant parts of the evolutionary process that cannot be ignored in biological classification and that are necessary for high predictive quality.