Abstract

Established methods of exploratory data analysis reveal conflicts in morphological data from rhynchonellide brachiopods, conflicts that result in mismatches between morphological, combined, and molecular phylogenies. Consideration of the mismatches uncovers several findings of general applicability: causes of conflict include the inherent genealogical deficit of comparative morphology, and genetic signal erosion caused by genome remodelling. Because of genetic signal erosion, cladistic analysis cannot be used safely on taxa whose remote ancestors diverged in deep time. In theory, if not practice, morphological-molecular conflict may be avoided by use of genetically validated characters. The preliminary (prior) classification of the organisms concerned is a hitherto-overlooked first step of cladistic analysis; its integral roles are to enable homology hypotheses, and at least partly to determine cladistic results through classification bias. Thus hitherto, cladistics has been misunderstood and misrepresented; many reported taxon relationships may have been determined more by the prior classification than by the supporting characters. If the prior classification is artificial and does not accurately mirror evolutionary history, morphological and molecular phylogeny mismatches will register as spurious homoplasy. To avoid logical circularity in cladistic analysis there must be clear separation of input (taxon descriptions) from output (clades). This requirement rules out PhyloCode-like approaches to systematics if they allow (named) output of cladistic analysis to serve as the input to further such analyses.

Keywords: Brachiopoda, Circular logic, Cladistic analysis, Comparative anatomy, PhyloCode, Rhynchonellida

INTRODUCTION

The predominant non-molecular methodology for the reconstruction of evolutionary patterns and processes is based on phylogenetic inference from the comparative anatomy of morphological characters, analysed by the methods of cladistics (e.g.
Platnick, 1979; Duncan & Stuessy, 1981; Carlson, 1990, 1991, 1995, 1999; Kitching et al., 1998; Carlson & Leighton, 2001; Carlson & Fitzgerald, 2008; Sereno, 2009; Assis, 2017). This approach has the advantage that it can be applied simultaneously to both fossil and Recent forms. It operates under the common assumptions that similarity of morphological phenotypes more-or-less accurately reports genealogical homology and that current classification and taxonomy provide a reliable guide to evolutionary relationships. Since about 1990, it has also been readily possible to use truly genealogical, genotypic data (preferably DNA sequences) to the same ends, raising the possibility of comparing morphological and molecular phylogenies with one another and with current taxonomy. For some groups of organisms, cladistic and molecular phylogenies do match closely (e.g. Marvaldi et al., 2009; Kundrata, Bocakova & Bocak, 2014; Resch et al., 2014), but in many others, such as brachiopods, crinoids and sponges (e.g. Cohen & Pisera, 2017; Morrow et al., 2013), there are mismatches (‘morphological–molecular conflict’), usually attributed to homoplasy. Rhynchonellides (Brachiopoda: Rhynchonellida) are the first group of brachiopods to which both approaches have been applied on an adequate scale for comparison, and in this case morphological–molecular conflict is stark, and there is corresponding conflict between the molecular phylogeny and the current taxonomy (Cohen & Bitner, 2013; Schreiber, Bitner & Carlson, 2013; Bitner & Cohen, 2015; Bapst, Schreiber & Carlson, 2017); none of the clades identified in the molecular analysis corresponds to any of the four Linnean superfamilies represented in the current taxonomy of living forms. Reciprocally, none of the four superfamilies corresponds to any one molecular clade, one of which even contains genera belonging to all four superfamilies.
Yet cladistic analysis based on a substantial and carefully curated morphological data matrix gave results that broadly agreed with the superfamily classification. Subsequent Bayesian (‘total evidence’) analyses of the combined morphological and molecular data sets served only to confirm the conflict, giving results that scarcely differed from the molecular phylogeny (Bapst et al., 2017). See Supporting Information, supplementary file 1 for relevant details of superfamily morphology extracted from Schreiber et al. (2013), and see Carlson (2016) for a useful review of brachiopod evolution. For details of the rhynchonellide classification see Tables 1 and 2 of Schreiber et al. (2013).

This report reflects the author’s attempts to interpret, understand and explain the stark conflict observed in the systematics of this group. Several conclusions are drawn, most of which appear to be generally applicable, specific to neither rhynchonellides nor brachiopods. They may be summarized as follows.

1. Preliminary data exploration (Morrison, 2010) reveals so much conflict in the morphological data that simple treelike resolution should not be expected.

2. The general, epistemological causes of morphological–molecular conflict are the lack of a genealogical basis for comparative morphology and the accretion, with time, of genomic changes (paralogy, co-option, translocation, inversion, chromosome fusion, fission, etc.) that result in disconnection between genotype and phenotype; changes that erode genealogical homology.

3. It is possible (in theory, if not in practice) to eliminate or reduce morphological–molecular conflict by using genetically validated morphological characters.

4. Agreement between the (prior) rhynchonellide superfamily classification and the results of cladistic analysis is attributable to the (prior) classification itself.
This is a general relationship, not peculiar to rhynchonellides or brachiopods, and it means that cladistic analysis has hitherto been misunderstood and misrepresented because the prior classification has not previously been regarded or presented as integral to cladistic analysis. Instead, as will be demonstrated in later sections, the common concept of cladistic analysis requires revision; it should be understood that, because a Darwinian classification based on inheritance with modification is a prerequisite for making homology statements, the prior act of classification is an integral, enabling and determinative component of cladistic analysis. Moreover, if the prior classification is partly or wholly artificial (i.e. not in correspondence with evolutionary history), mismatches between morphology and molecules will generate spurious homoplasy. Note that the prior act of classification referred to here is distinct from the ‘primary homology’ concept of de Pinna (1991).

5. Involvement of the prior classification as the first, integral step in cladistic analysis makes plain the danger of logical circularity in cladistics. To avoid circularity, cladistic analysis input (taxon descriptions) and output (clades) must be separated as rigorously as possible. This rules out (or sets limits to) use of PhyloCode-like approaches to classification, in which the characters of clades named under such systems may become input data for later cladistic analyses, thereby completing a logical circle.

MATERIAL AND METHODS

Rhynchonellide data were downloaded from the Dryad repository specified by Bapst et al. (2017) as the directory runScripts, and the following two files contained therein were extracted.

1. Headers, but not data, in the morphology matrix file RunX_Updated_matrix5_1_15.nex were edited to make it analysable in PAUP*4.0a builds 153 and 157 (Swofford, 2000), after which it was used for parsimony jackknife analysis.
This is a rapid, approximate tree-building tool (Farris et al., 1996). The same edited morphology matrix file was also input into SplitsTree 4.13 (Huson, 1998) and used to build a p-distance Neighbor Net, in which reticulations indicate conflicting splits.

2. The combined data file (Master_A_06-07-15.nex, an alignment of DNA sequences followed by character data) was edited to remove outgroups and make it acceptable to SplitsTree, which uses an apparently undocumented variant of the Nexus format (Maddison, Swofford & Maddison, 1997) and is unable to deal with IUPAC DNA ambiguity codes. The ambiguity codes were replaced by the ‘missing’ symbol ‘?’. This edited alignment was then used for a combined DNA and morphology Neighbor Net analysis in SplitsTree.

The molecular data alignment from Cohen & Bitner (2013) was downloaded as the relevant supplementary file from the journal website and analysed in a similar manner.

RESULTS AND DISCUSSION

Exploratory data analysis

Two rapid, exploratory analyses were performed on the morphological data from Bapst et al. (2017); a p-distance Neighbor Net is shown in Figure 1 and a parsimony jackknife analysis in Figure 2. The reticulations in Figure 1 mean that the morphological data contain widespread internal conflict (Morrison, 2011); little of the reticulation is attributable to the outgroups. Figure 2 confirms the lack of resolution; 11 of 20 rhynchonellides are not grouped. Figure 3 shows that the p-distance Neighbor Net of the combined data also has much internal conflict.

Figure 1. Rhynchonellide p-distance Neighbour Net based on morphological data from Bapst et al. (2017). Outgroup names are in bold. The ingroup creates most of the reticulation.

Figure 2.
Rhynchonellide parsimony jackknife consensus tree with jackknife branch support frequencies (%), rooted with brachiopod outgroups. Based on morphological data from Bapst et al. (2017).

Figure 3. Rhynchonellide p-distance Neighbour Net based on combined morphological and molecular data from Bapst et al. (2017). Outgroup names are in bold.

For comparison with Figures 1 and 2, Figures 4 and 5 are exploratory analyses of the rhynchonellide ingroup molecular data of Cohen & Bitner (2013). This data set has more than eight times as many variable and parsimony-informative characters as the morphological data (482 vs. 59), despite which it shows few reticulations and well-supported clades. However, comparisons of the two data sets are not straightforward owing to non-independence among the nucleotide data; the most one should conclude is that the molecular data do not show much internal conflict.

Figure 4. Rhynchonellide ingroup p-distance Neighbour Net based on molecular data from Cohen & Bitner (2013). Taxa with names prefaced by ‘d’ have a large deletion in the SSU rDNA gene. Outgroup names are in bold.

Figure 5.
Rhynchonellide ingroup parsimony jackknife consensus tree, with jackknife branch support frequencies (%), based on molecular data from Cohen & Bitner (2013). Taxa with names prefaced by ‘d’ have a large deletion in the SSU rDNA gene.

The results in Figures 1–3 and comparison with Figures 4 and 5 show that it is advisable to apply rapid tests of data quality before embarking on computer-intensive analyses. Figure 3, in conjunction with Figures 1 and 2, strongly suggests that no matter how elaborate or sophisticated they may be, combined analyses would not alter the earlier conclusion that the molecular data yield a robust phylogeny, but the morphological data may not (Cohen & Bitner, 2013).

Two simple explanations of morphological–molecular conflict

Why do the morphological and molecular data not yield congruent phylogenetic inferences, as might be expected? Review of the cladistic character matrix and its analyses (Schreiber et al., 2013; Bapst et al., 2017) reveals no obvious faults. Much well-accepted molecular phylogeny of brachiopods (and many other organisms) has been based on rDNA sequences, despite their well-known limitations. Thus, when viewed in isolation, both sets of data and their analyses seem to be appropriate. Even if the morphological data are less informative, the tree topology to which they lead might be expected to be congruent with that from rDNAs (Cohen & Bitner, 2013). To explain the conflict between morphology and molecules, I offer in the next two subsections what appear to be simple, new perspectives based on trite and elementary biology.
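As an aside on the exploratory analyses above, the two data-handling steps they rely on (replacing IUPAC ambiguity codes with the ‘missing’ symbol, and computing a p-distance between aligned sequences) can be sketched in a few lines of Python. This is a minimal illustration only, not the SplitsTree or PAUP* implementation: the function names, the toy sequences and the choice to skip any site at which either sequence is missing or gapped are all assumptions made for the example.

```python
# Illustrative sketch only; names and toy data are hypothetical.

# IUPAC nucleotide ambiguity codes (everything except A, C, G, T/U, '-' and '?').
IUPAC_AMBIGUITY = set("RYSWKMBDHVN")


def mask_ambiguity(seq: str, missing: str = "?") -> str:
    """Replace each IUPAC ambiguity code in a DNA sequence with the 'missing' symbol."""
    return "".join(missing if ch.upper() in IUPAC_AMBIGUITY else ch for ch in seq)


def p_distance(a: str, b: str, skip: str = "?-") -> float:
    """Proportion of compared sites at which two aligned sequences differ.

    Assumption: a site is skipped if either sequence has a missing or gap symbol there.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in skip or y in skip:
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Toy alignment (hypothetical taxa, not data from the study).
toy = {
    "taxonA": "ACGTRACGTA",
    "taxonB": "ACGTTACCTA",
    "taxonC": "ANGTTTCCTA",
}
masked = {name: mask_ambiguity(seq) for name, seq in toy.items()}

if __name__ == "__main__":
    names = sorted(masked)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            print(first, second, round(p_distance(masked[first], masked[second]), 3))
```

A matrix of such pairwise distances is the kind of input from which a distance-based network is built; the network construction itself is not attempted here.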
Morphological data are not genealogical

The first perspective is that, whereas molecular (gene sequence) data are undoubtedly genealogical, and therefore capable of providing evidence of evolutionary history, morphological characters and character states are the product of observations of (and are rooted in) comparative anatomy, i.e. they are complex phenotypes that are generally the product of developmental processes, of which the molecular and physiological bases are largely unknown, probably highly polygenic and only indirectly genealogical; thus, they are intrinsically not capable of reporting evolutionary history. There is also a formal analogy (which may never have been noted before) to the barrier to reverse information flow between digital and analog information exemplified by the central dogma of molecular biology, i.e. that information cannot pass from protein sequence to gene sequence (Koonin, 2015). Morphology represents genetic (and perhaps other) information expressed in analog, phenotypic form, whereas inferences of evolutionary history are obtained from nucleotide sequence records in digital form. Morphological character phenotypes can be genealogically informative only to the extent that they truthfully signal the involvement of homologous genes and developmental pathways; they cannot themselves embody or encode such information. However, especially in organisms separated by long periods of evolution, superficially homologous morphologies need not signal homologous genes or developmental pathways (de Beer, 1971). Not only is there no general answer in terms of cellular, tissue and organismal processes to the question ‘How is anatomical shape shaped?’, but also there are no adequate descriptions of how any anatomical structure’s shape is determined or constructed in giving expression to the genetic information, let alone how the differing shapes of a pair of genetically alternative structures come about (i.e.
ontogenetic products that reflect the presence of allelic genes). The genetic specification and control of morphological characters that are candidate homologues are only exceptionally referable to specific genome elements; generally, such characters are not ‘good genetic markers’. For comparative morphology to acquire a genealogical component and become informative about evolutionary history, the compared morphological variants, like conventional genetic markers, should be under simple genetic control, i.e. they should segregate in genetic crosses (or their equivalent) under the control of identifiable genetic elements (sensu lato). Given that this has rarely been demonstrated (never for fossils), comparative morphology is inferior in genealogical value to molecular sequence variation as a tool for phylogenetic reconstruction. There are apparent exceptions to the generally low utility of morphological characters for phylogenetic reconstruction. For example, among insects, there can be considerable (or even complete) congruence between morphological and molecular phylogenies (e.g. Marvaldi et al., 2009; Kundrata et al., 2014; Resch et al., 2014). Possibly, this is because morphological variation affecting the arthropod cuticle is particularly important in classification and is often comparable in underlying biochemical/developmental basis to simple genetic variations such as have been analysed comprehensively in Drosophila and other model genetic organisms, i.e. they are closer to the levels of gene action than the typical variable characters of comparative anatomy. Perhaps also, arthropod cuticular characters are especially discrete and describable. Thus, both classification and morphology can sometimes accurately reflect evolutionary history. The genealogical deficit of (especially fossil) morphological data can be remedied only by altering its epistemological basis from comparative anatomy to discrete heredity. 
In practice, this may be unattainable because it would require a radical reinvention of morphological characters and character states, with restriction to those that can be shown (even approximately or indirectly) to have their basis in the effects of allelic genes (using gene in its most general, non-specific sense to include, for example, sites marked by single nucleotide polymorphisms). If such gene-based morphological characters could be demonstrated, and the phenotypic expressions of some could be recognized in fossils, then morpho-systematics based on such variations would, in principle, be a valid method to reconstruct fossil evolutionary history. One example of the kind of genetic variation that might satisfy these criteria was described very recently (Mouri et al., 2018); in the mouse, a large, spontaneous chromosomal insertion introduced new regulatory sequences that brought about a new morphological phenotype, syndactyly with webbing. The syndactyly involves bone morphology and, in principle, would be fossilizable. The future of this aspect of palaeobiology appears to depend on the development of many such variants, but it is not likely to succeed generally, for reasons outlined in the next subsection.

Genetic signal erosion

The second perspective that helps to explain the defects of morphological characters for phylogenetic reconstruction may be termed ‘genetic signal erosion’ or ‘homology as residue’ (Wray & Abouheif, 1998; Wagner, 2014, p. 74) and is most vividly exemplified by the synaptonemal complex. In eukaryotes, this highly conserved structure connects synapsed homologous chromosomes (of both animals and plants) in meiotic prophase and is a quasi-universal structure associated with sexual reproduction.
Despite this long-conserved functional homology, the complex shows appreciable between-taxon variation in detailed morphology, and the proteins that make up the ladder-like central element show a surprising amount of between-taxon genetic variation (Cahoon et al., 2017); different structural genes are involved in divergent taxa (Cahoon & Hawley, 2016, Table 1). Thus, despite the universality of the structure and its functions, genetic markers based on the structural genes of ancestral synaptonemal complexes would fail as genealogical evidence for the reconstruction of evolutionary history. The genes that such markers would reflect have been replaced over time; the genealogical signal has been eroded. To echo Kluyver and Monod’s “Escherichia coli to elephant” remarks about biochemical unity (Friedmann, 2004): what is true for the synaptonemal complex is (likely to be) true for anatomical structures whose origins lie in deep time; today they may be mistaken for homologues, but their underlying genetic basis need no longer be homologous (cf. de Beer, 1971).

The wider homology problem

Morphological–molecular conflict, the genealogical deficit of palaeontological morphology and genetic signal erosion are aspects of the deeper problems associated with homology and homology concepts and the disconnect between phenotype and genotype (discussed by the following authors among many others: Hall, 1994; Wray & Abouheif, 1998; Wagner, 2014, p. 74, 2016; Morrison, Morgan & Kelchner, 2015; Brower, 2016; Inkpen & Doolittle, 2016). The characters used in fossil and Recent comparative anatomy are interpreted in terms of developmental (deep) homology, but those of molecular phylogeny are interpreted in terms of evolutionary (phylogenetic) homology (Morrison et al., 2015).
It is therefore not surprising that perennial problems should be recognized (Inkpen & Doolittle, 2016); in cladistic character descriptions, morphological similarity between fossil and Recent specimens may be deceptive. It is possible that the desire for an all-encompassing phylogeny based on morphology is doomed to frustration, like that based on molecules, and for similar reasons; evolutionary changes at the genomic level are more complex and more unpredictable than simple-mindedness would suggest. As de Beer put it: ‘homologous structures need not be controlled by identical genes, and homology of phenotypes does not imply similarity of genotypes’ (de Beer, 1971, p. 15). Given that, and all else that we now know, the observed conflict between molecular and morphological phylogenies of rhynchonellide brachiopods (Cohen & Bitner, 2013; Schreiber et al., 2013) was eminently predictable.

Classification as a source of morphological–molecular conflict

When a classification accurately reflects evolutionary history, conflict may result from the factors that prevent morphology from being reliably genealogical. An additional source of conflict exists when a classification is to some extent artificial, as in the case of brachiopod superfamilies, and such conflict has been described as ‘spurious homoplasy’ (Bitner & Cohen, 2015, p. 498). In many other reported cases, spurious homoplasy has not been distinguished from real homoplasy, i.e. arrival at a particular evolutionary state by an alternative (e.g. convergent) route.

Genome vs. phenome

It has been said that a weakness of the empirical example that prompted the present manuscript is that it contrasts a few-gene rDNA analysis with a many-character array of phenotypes that (presumably) reflect a much larger sample of the genome and that might therefore have greater credibility.
In my view, this gives too much genealogical value to phenotypes whose expression and genetic basis are subject, in the course of evolutionary time, to an unascertainable range of shifting intra- and extra-organismal influences that prevent them from being reliable lineage markers. In the course of the use of genetic variation to report evolutionary change, there has been a succession of cases in which reliance was placed first on phenotypes (e.g. blood groups, allozymes, immune reactions, comparative anatomy) only for it to be recognized eventually that each was an inadequate surrogate for genomic data. Whether the current use of many amino acid sequence phenotypes (so-called ‘phylogenomics’) will escape this fate remains to be seen. These data have two main weaknesses: (1) that their scale makes it necessary for analyses effectively to be by unsupervised machine-based algorithms; and (2) the unascertainable historical record of changing molecular biases (both those we know of that affect codon and nucleotide usage and those of which we yet have neither suspicion nor knowledge or which were active only in some past period). Thus, no completely reliable evolutionary history can ever be derived from even a very large sample of the phenome. Much the same can be said about the genome. But while some cases are known where rDNA sequences are subject to external influence (e.g. Eralesa et al., 2017), they are, in general, insulated from external influences and, despite their limitations, seem reliable as lineage markers.

Against ‘total evidence’ combined analyses

In addition to the fundamental weaknesses of morpho-systematics described above, there are two reasons why the type of combined analysis performed by Bapst et al. (2017) is unlikely to be useful. The first argument concerns the relative weights that should be accorded to morphological vs. molecular characters, a question that has received relatively little attention (e.g.
de Queiroz, Donoghue & Kim, 1995; Huelsenbeck, Bull & Cunningham, 1996; Nylander et al., 2004). Gene sequences often (but not always) change by single base substitutions, the numbers of which provide an approximate measure of evolutionary divergence (e.g. expected number of substitutions per site). In contrast, the number of genetic units of change involved in the difference between any two morphological character states is known only in very special experimental populations. In fossil forms or long-diverged populations, it is not known how many individual genetic changes have occurred, and it seems likely that a large number of changes (or changes of a very different character) may often be involved. Thus, it is unreasonable to treat molecular and morphological characters as carrying equal evolutionary weight, as is usual in combined analyses. Absence of evidence for particular differential weights cannot justify equal weights.

The second argument against combined analyses, particularly when they use maximum likelihood or Bayesian likelihood analyses, is that these approaches require specification of an evolutionary model, but in our present state of knowledge of how morphology develops and evolves, no highly pertinent model is conceivable. Instead, the models used are derivatives of parsimony analysis or of models specific to molecular data and can be only approximate.

A new circumscription of cladistic analysis: the invisible hand of classification revealed

How is morphological–molecular conflict such as that among rhynchonellides to be understood or explained, and what bearing does it have on cladistic analysis? It was first proposed (Bitner & Cohen, 2015) that the conflict resulted from ‘classification bias’ as follows. The first step in cladistic analysis is to define homologous characters (i.e. to make homology statements or hypotheses). A prerequisite for making homology statements is that the organisms concerned should be genetically related (i.e.
share a common ancestor or be in ancestor–descendant or sister-taxon relationships). In other words, cladistic analysis cannot take place in a taxonomic vacuum, but requires ‘a prior act of classification’, and it must be one that implies descent with modification. In palaeontological (and other) laboratory practice, this classification influences which specimens and characters will be described and used to generate the character-state distributions that will serve in cladistic analysis for phylogeny estimation. There is an obligatory temporal relationship: only after an act of organismal classification can characters and their states be described and coded for inclusion in a cladistic data matrix, for use in some form of clade finding. The presence of the superfamilies in the rhynchonellide cladistic results can thus be understood as a direct reflection of the presence of those superfamilies in the prior classification. Morphological traits, identified a priori as characteristics of superfamilies (and listed in taxon diagnoses; Savage et al., 2002) entered the cladistic coding scheme (in some form, to some extent) and appeared in the data matrix, where they ensured superfamily-congruent phylogenetic results (Schreiber et al., 2013; Bapst et al., 2017). This can be regarded as a bias directing cladistic analysis towards the prior classification (Bitner & Cohen, 2015). See Supporting Information, supplementary file 1 for relevant details of taxon diagnoses.

Realization of the significance of the prior classification for cladistic analysis (which underlies and justifies long-standing worry that cladistics is logically circular) means that the upper boundary of cladistic analysis has hitherto been misplaced in concept and diagram (e.g. Williams & Forey, 2001, fig. 5.3; Wägele, 2005, fig. 138); the prior classification should instead be included within cladistic analysis, as in Figure 6.

Figure 6.
Cladistic analysis flowsheet, with classification as an integral component. [Inspired by fig. 5.3 of Williams & Forey (2001).]

It is surprising that the essential role of the prior classification in cladistic analysis is not referred to in any publication that I have seen, including Hennig (1966), Duncan & Stuessy (1981), Farris (1981), Wiley (1981), Schram (1991), Carlson (1995), Pleijel (1995), Nielsen, Scharff & Eibye-Jacobsen (1996), Kitching et al. (1998), Zrzavy et al. (1998), Williams & Forey (2001), Wägele (2005) and Sereno (2009), nor was reference to it detected in any article in the journal Cladistics from volume 1 to date. Indeed, two particularly clear instances of prior classification being excluded from cladistic analysis (‘swept under the cladistic carpet’) are found in the following quotations: ‘Conceptually, a cladistic analysis consists of two main activities [citations omitted]. The first comprises empirical observation, leading to delineation of characters and character states, and to a data set in which those characters are scored for the terminals in the analysis.’ (de Laet, 2005, p. 83), and ‘... how data matrices are made: classification (including naming species) is something that happens much later in the process.’ Mishler (2005). The published comments that come nearest to recognizing the importance of the prior classification in cladistic analysis are non-specific: ‘… phylogenetic conclusions are an extrapolation from hypotheses about natural order’ and ‘… our knowledge of phylogeny stems from our knowledge of taxonomy …’ Platnick (1979, pp. 537, 545).

Towards an improved understanding of cladistic analysis

Imagine entering a completely disordered natural history museum (or the biosphere) in order to conduct cladistic analysis of its entire contents.
Before specimen description, homology hypothesis-making and cladistic coding can begin, the museum’s contents must be put into some kind of preliminary order in which specimens are grouped ‘like with like’, i.e. ‘a preliminary or prior act of classification’ is needed. To permit homology hypotheses, this should be a Darwinian classification in which heredity links individuals and generations, not an artificial one (as rhynchonellide superfamilies may be, see references cited by Cohen & Bitner, 2013). As Darwin presciently noted: ‘… the natural system is founded on descent with modification … the characters are those which have been inherited from a common parent, …’ (Darwin, 1859, p. 420). Hypotheses of ancestor–descendant or sister-taxon relationship based on and arising from the prior classification are thus a prerequisite for making hypotheses of character homology. In this way, the prior classification inevitably and unavoidably influences any subsequent cladistic analysis. The prior classification and its associated taxonomy are a hitherto-unrecognized intrinsic and enabling component of cladistic analysis: the component that enables homology statements. Moreover, only if the classification reflects (even approximately) real evolutionary history will the cladistic results be correspondingly natural, i.e. the prior classification is intrinsic, enabling and, to an extent, determinative of the results of cladistic analysis.

The importance of the foregoing thought experiment is that it shows the generality of the conclusion to which it leads; although prompted by the rhynchonellide case, the conclusion is not specific to that case. For rhynchonellides, cladistic analysis of the morphological character matrix found evidence consistent with the artificial classification into superfamilies because most of the specimens coded were type species or drawn from collections that otherwise mirrored the prevailing classification.
In consequence, some or all of the characters used for the superfamily classification entered the cladistic coding scheme and determined its results (Williams, Carlson & Brunton, 2000; Schreiber et al., 2013; Bapst et al., 2017).

Implications for systematics

The new understanding of the cladistic analysis of morphological variation (comparative anatomy) described here carries implications for the future application of phylogenetic systematics (cladistics) and for systematics more generally. For cladistics, these may be summarized by recommending that to avoid encountering serious genealogical deficits, cladistic analysis should not be used for morphological characters of taxa whose ancestors diverged in deep time. How deep is, of course, lineage-specific and impossible to define, but guidance may be obtained from the presence or absence of morphological–molecular conflict among the taxa concerned, especially when gauged with genotypic (RNA or DNA) rather than phenotypic (amino acid) sequences. If there is such conflict, that may mean that enough time has elapsed for morphology to have lost some, most, or all of its genealogical value. For some animal taxa, a sufficient safeguard may be to restrict cladistic analyses to those carried out within and between Linnean families, less justifiably within and between orders. For most taxa, it may be necessary to avoid using cladistic analysis with taxa more divergent than families and orders. For an example of the type of over-wide data matrix that should be avoided, see Zhang et al. (2014), where the 21 included taxa range in age from Lower Cambrian to Recent and represent such a wide range of taxa (including Xenacoelomorpha) that even on the most generous interpretation (and overlooking many obvious errors) many proposed homologies must be uncertain. For cladistics more generally, the most important implication is the imperative need to reduce or avoid logical circularity.
The foregoing sections have demonstrated that the initial (prior) classification of the subject organisms is an element of the data input into cladistic analysis and must therefore affect its output; if circularity is to be avoided, input (classification = taxa) must remain effectively distinct from output (clades). This categorical conclusion simply rules out the use in systematics of the PhyloCode (Forey, 2001; Cantino & de Queiroz, 2006) or similar approaches in which output clades are named and serve as the basis of classification, an operational sequence that must generate unavoidable logical circularity in subsequent cladistic analyses. Alternatively, it means that any classification based upon analysis under the PhyloCode or similar should be terminal, with its component taxa thereafter being unavailable for use in cladistic analysis. It also means that classification needs to continue in its traditional form as a non-algorithmic exercise based on perceived, integrated similarity, whereas clades are constructed algorithmically from the synapomorphic distribution of selected elements of that similarity. Even in this form, there must be some (dilute) circularity. Contrary to the claim that cladistics, as a distinct school of taxonomic thought, has been underappreciated (Assis, 2017), this analysis shows that it has been consistently overrated; the fundamental roles of the prior classification have been unseen or ignored and the implications of those roles unrecognized. The most uncomfortable implications of the role of prior classification in shaping cladistic results concern the great body of published studies. Which reported relationships owe more to the prior classification than to the character data, and what supposedly settled systematics is therefore questionable?
Where conflict between morphology and molecules has been attributed to homoplasy, does it arise from real convergent/reverse evolution or is it spurious, reflecting artificial elements in the prior classification?

ENVOI

My long struggle to understand rhynchonellide (and other) brachiopod morphological–molecular conflict has uncovered a new and important principle: that the prior classification is an overlooked, but necessary, component of cladistic analysis. Remarkably, this means that cladistics has been misrepresented and misunderstood ever since its introduction. It has been overlooked that cladistic analysis will generally reflect features of the underlying (prior) classification. To the extent that such features are non-evolutionary (= artificial) and have been coded and appear in the data matrix, cladistic analysis will recover non-evolutionary relationships and clades, and will thus fail in its supposed main purpose of evolutionary reconstruction. The striking conflict between cladistic and molecular analyses of Rhynchonellida (and also Dyscolioidea and Terebratuloidea; Bitner & Cohen, 2015) highlighted this effect because the existing superfamily classification was in some respects artificial. The struggle to understand the source of the resulting morphological–molecular conflict led to the realization that cladistic analysis itself must be re-circumscribed to include the preceding act of classification, a realization that would have been unnecessary had the conventional boundaries of cladistic analysis not generally been drawn so as to exclude it. Redrawing the boundary of cladistic analysis led to a clearer realization of the danger of logical circularity in phylogenetic systematics, highlighting the imperative need to separate the output of cladistic analysis (as clades) from its input (as taxon descriptions).

SUPPORTING INFORMATION

Additional Supporting Information may be found in the online version of this article at the publisher’s website.
S1: Relevant details of superfamily morphology extracted from Schreiber, Bitner & Carlson (2013)

ACKNOWLEDGEMENTS

I am grateful to the following who have commented on various drafts connected with this manuscript: Roger Downie, Carsten Lüter, David Morrison and Jonathan Sheps. I also benefited from support from Maria Alexandra Bitner and Andrzej Pisera while the concept of classification bias was first emerging.

REFERENCES

Assis LCS. 2017. The jazz of cladistics. Systematics and Biodiversity 15: 385–390.
Bapst DW, Schreiber HA, Carlson SJ. 2017. Combined analysis of extant Rhynchonellida (Brachiopoda) using morphological and molecular data. Systematic Biology 67: 32–48.
de Beer GR. 1971. Homology: an unsolved problem. Oxford: Oxford University Press.
Bitner MA, Cohen BL. 2015. Congruence and conflict: case studies of morpho-taxonomy versus rDNA gene tree phylogeny among articulate brachiopods (Brachiopoda: Rhynchonelliformea), with description of a new genus. Zoological Journal of the Linnean Society 172: 486–504.
Brower AVZ. 2016. Emergent properties. Cladistics 32: 577–579.
Cahoon CK, Hawley RS. 2016. Regulating the construction and demolition of the synaptonemal complex. Nature Structural & Molecular Biology 23: 369–377.
Cahoon CK, Yua Z, Wang Y, Guo F, Unruh JR, Slaughter BD, Hawley RS. 2017. Superresolution expansion microscopy reveals the three-dimensional organization of the Drosophila synaptonemal complex. Proceedings of the National Academy of Sciences of the United States of America 114: E6857–E6866.
Cantino PD, de Queiroz K. 2006. International code of Phylogenetic nomenclature, version 4c. Available at: https://www.ohio.edu/phylocode/ (accessed February 3, 2018).
Carlson SC. 1991. A phylogenetic perspective on articulate brachiopod diversity and the Permo-Triassic extinctions. In: Dudley EC, ed. The unity of evolutionary biology. Portland, OR: Dioscorides Press, 119–142.
Carlson SC, Fitzgerald PC. 2008. Sampling taxa, estimating phylogeny and inferring macroevolution: an example from Devonian terebratulide brachiopods. Earth and Environmental Science Transactions of the Royal Society of Edinburgh 98: 311–325.
Carlson SJ. 1991. Phylogenetic relationships among brachiopod higher taxa. In: McKinnon DI, Lee DE, Campbell JD, eds. Brachiopods through time. Rotterdam and Brookfield: Balkema, 3–10.
Carlson SJ. 1995. Phylogenetic relationships amongst brachiopods. Cladistics 11: 131–197.
Carlson SJ. 1999. Phylogenetic systematics and palaeontology. In: Harper DAT, ed. Numerical palaeobiology. London and New York: John Wiley and Sons, 41–91.
Carlson SJ. 2016. The evolution of Brachiopoda. Annual Review of Earth and Planetary Science 44: 409–438.
Carlson SJ, Leighton LR. 2001. The phylogeny and classification of Rhynchonelliformea. In: Brachiopods ancient and modern. A tribute to G. Arthur Cooper. Pittsburgh: The Paleontological Society Papers, 7: 27–51.
Cohen BL, Bitner A. 2013. Molecular phylogeny of rhynchonellide articulate brachiopods. Journal of Paleontology 87: 211–216.
Cohen BL, Pisera A. 2017. Crinoid phylogeny: new interpretation of the main Permo-Triassic divergence, comparisons with echinoids and brachiopods, and EvoDevo interpretations of major morphological variations. Biological Journal of the Linnean Society 120: 38–53.
Darwin C. 1859. On the origin of species. London: John Murray.
Duncan T, Stuessy TF, eds. 1981. Cladistics: perspectives on the reconstruction of evolutionary history. New York: Columbia University Press.
Eralesa J, Marchand V, Baptiste P, Gillotf S, Belina S, Ghayada E, Garciaa M, Laforêts F, Marcel V, Baudin-Baillieu A, Bertin P, Couté Y, Adrait A, Meyer M, Therizols G, Yusupov M, Namy O, Ohlmann T, Motorini Y, Cateza F, Diaz J-J. 2017. Evidence for rRNA 2′-O-methylation plasticity: Control of intrinsic translational capabilities of human ribosomes. Proceedings of the National Academy of Sciences of the United States of America 114: 12934–12939.
Farris J. 1981. The logical basis of phylogenetic analysis. Advances in cladistics, Vol. 2. Ann Arbor, MI: Columbia University Press, 7–36.
Farris JS, Albert VA, Källersjö M, Lipscomb D, Kluge AG. 1996. Parsimony jackknifing outperforms neighbor-joining. Cladistics 12: 99–124.
Forey PL. 2001. The PhyloCode: description and commentary. Bulletin of Zoological Nomenclature 58: 81–96.
Friedmann HC. 2004. From “Butyribacterium” to “E. coli”. Perspectives in Biology and Medicine 47: 47–66.
Hall BK, ed. 1994. Homology. The hierarchical basis of comparative biology. San Diego, New York, Boston, London, Sydney, Tokyo, Toronto: Academic Press.
Hennig W. 1966. Phylogenetic systematics. Chicago: University of Illinois Press.
Huelsenbeck JP, Bull JJ, Cunningham CW. 1996. Combining data in phylogenetic analysis. Trends in Ecology & Evolution 11: 152–158.
Huson DH. 1998. SplitsTree: analysing and visualizing evolutionary data. Bioinformatics 14: 68–73.
Inkpen SA, Doolittle WF. 2016. Molecular phylogenetics and the perennial problem of homology. Journal of Molecular Evolution 83: 184–192.
Kitching IJ, Forey PL, Humphries CJ, Williams DM. 1998. Cladistics. Oxford: Oxford University Press.
Koonin EV. 2015. Why the Central Dogma: on the nature of the great biological exclusion principle. Biology Direct 10: 52.
Kundrata R, Bocakova M, Bocak L. 2014. The comprehensive phylogeny of the superfamily Elateroidea (Coleoptera: Elateriformia). Molecular Phylogenetics and Evolution 76: 162–171.
de Laet J. 2005. Parsimony and the problem of inapplicables. In: Albert VA, ed. Parsimony, phylogeny and genomics. Oxford: Oxford University Press, 228.
Maddison DR, Swofford DL, Maddison WP. 1997. NEXUS: an extensible file format for systematic information. Systematic Biology 46: 590–621.
Marvaldi AE, Duckett CN, Kjer KM, Gillespie JJ. 2009. Structural alignment of 18S and 28S rDNA sequences provides insights into phylogeny of Phytophaga (Coleoptera: Curculionoidea and Chrysomeloidea). Zoologica Scripta 38: 63–77.
Mishler BD. 2005. The logic of the data matrix in phylogenetic analysis. In: Albert VA, ed. Parsimony, phylogeny and genomics. Oxford: Oxford University Press, 57–70.
Morrison DA. 2010. Using data-display networks for exploratory data analysis in phylogenetic studies. Molecular Biology and Evolution 27: 1044–1057.
Morrison DA. 2011. Introduction to phylogenetic networks. Uppsala, Sweden: RJR productions.
Morrison DA, Morgan MJ, Kelchner SA. 2015. Molecular homology and multiple-sequence alignment: an analysis of concepts and practice. Australian Systematic Botany 28: 46–62.
Morrow CC, Redmond NE, Picton BE, Thacker RW, Collins AG, Maggs CA, Sigwart JD, Allcock AL. 2013. Molecular phylogenies support homoplasy of multiple morphological characters used in the taxonomy of Heteroscleromorpha (Porifera: Demospongiae). Integrative and Comparative Biology 53: 428–446.
Mouri K, Sagai T, Maeno A, Amano T, Toyoda A, Siroishi T. 2018. Enhancer adoption caused by genomic insertion elicits interdigital Shh expression and syndactyly in mouse. Proceedings of the National Academy of Sciences of the United States of America 115: 1021–1026. https://doi.org/10.1073/pnas.1713339115.
Nielsen C, Scharff N, Eibye-Jacobsen D. 1996. Cladistic analysis of the animal kingdom. Biological Journal of the Linnean Society 57: 385–410.
Nylander JA, Ronquist F, Huelsenbeck JP, Nieves-Aldrey JL. 2004. Bayesian phylogenetic analysis of combined data. Systematic Biology 53: 47–67.
de Pinna MCC. 1991. Concepts and tests of homology in the cladistic paradigm. Cladistics 7: 367–394.
Platnick NI. 1979. Philosophy and the transformation of cladistics. Systematic Zoology 28: 537–546.
Pleijel F. 1995. On character coding for phylogeny reconstruction. Cladistics 11: 309–315.
de Queiroz A, Donoghue MJ, Kim J. 1995. Separate versus combined analysis of phylogenetic evidence. Annual Review of Ecology and Systematics 26: 657–681.
Resch MC, Shrubovych J, Bartel D, Szucsich NU, Timelthaler G, Bu Y, Walzl M, Pass G. 2014. Where taxonomy based on subtle morphological differences is perfectly mirrored by huge genetic distances: DNA barcoding in Protura (Hexapoda). PLoS One 9: e90653.
Savage NM, Manceñido M, Owen EF, Carlson SJ, Grant RE, Dagys AS, Dong-Li S. 2002. Rhynchonellida. In: Selden PA, ed. Treatise on invertebrate paleontology. Boulder, CO and Lawrence, KS: Geological Society of America and University of Kansas, 1027–1377.
Schram FR. 1991. Cladistic analysis of metazoan phyla and the placement of fossil problematica. In: Simonetta AM, Conway Morris S, eds. The early evolution of the metazoa and the significance of problematic taxa. Cambridge: Cambridge University Press, 35–46.
Schreiber HA, Bitner MA, Carlson SJ. 2013. Morphological analysis of phylogenetic relationships among extant rhynchonellide brachiopods. Journal of Paleontology 87: 550–569.
Sereno PC. 2009. Comparative cladistics. Cladistics 25: 624–659.
Swofford DL. 2000. Phylogenetic analysis using parsimony (*and other methods). Sunderland, MA: Sinauer Associates.
Wägele J-W. 2005. Foundations of phylogenetic systematics. Munich: Verlag Pfeil.
Wagner GP. 2014. Homology, genes, and evolutionary innovation. Princeton and Oxford: Princeton University Press.
Wagner GP. 2016. What is “Homology Thinking” and what is it for? Journal of Experimental Zoology. Part B, Molecular and Developmental Evolution 326: 3–8.
Wiley EO. 1981. Phylogenetics. New York, Chichester, Brisbane, Toronto, Singapore: John Wiley & Sons.
Williams A, Carlson SJ, Brunton CHC. 2000. Rhynchonelliformea. In: Kaesler RL, ed. Treatise on invertebrate paleontology, 193–902.
Williams DM, Forey PL, eds. 2001. Milestones in systematics. London: Systematics Association.
Wray GA, Abouheif E. 1998. When is homology not homology? Current Opinion in Genetics & Development 8: 675–680.
Zhang Z-F, Li G-X, Holmer LE, Brock GA, Balthasar U, Skovsted CB, Fu D-J, Zhang X-L, Wang H-Z, Butler A, Zhang Z-L, Cao C-Q, Han J, Liu J-N, Shu D-G. 2014. An early Cambrian agglutinated tubular lophophorate with brachiopod characters. Scientific Reports 4: 4682.
Zrzavy J, Mihulka S, Kepka P, Bezdek A, Tietz D. 1998. Phylogeny of the metazoa based on morphological and 18S ribosomal DNA evidence. Cladistics 14: 249–285.

© 2018 The Linnean Society of London, Zoological Journal of the Linnean Society
Zoological Journal of the Linnean Society – Oxford University Press
Published: Mar 29, 2018