# Testing the Role of the Red Queen and Court Jester as Drivers of the Macroevolution of Apollo Butterflies

Testing the Role of the Red Queen and Court Jester as Drivers of the Macroevolution of Apollo... Abstract In macroevolution, the Red Queen (RQ) model posits that biodiversity dynamics depend mainly on species-intrinsic biotic factors such as interactions among species or life-history traits, while the Court Jester (CJ) model states that extrinsic environmental abiotic factors have a stronger role. Until recently, a lack of relevant methodological approaches has prevented the unraveling of contributions from these 2 types of factors to the evolutionary history of a lineage. Herein, we take advantage of the rapid development of new macroevolution models that tie diversification rates to changes in paleoenvironmental (extrinsic) and/or biotic (intrinsic) factors. We inferred a robust and fully-sampled species-level phylogeny, as well as divergence times and ancestral geographic ranges, and related these to the radiation of Apollo butterflies (Parnassiinae) using both extant (molecular) and extinct (fossil/morphological) evidence. We tested whether their diversification dynamics are better explained by an RQ or CJ hypothesis, by assessing whether speciation and extinction were mediated by diversity-dependence (niche filling) and clade-dependent host-plant association (RQ) or by large-scale continuous changes in extrinsic factors such as climate or geology (CJ). For the RQ hypothesis, we found significant differences in speciation rates associated with different host-plants but detected no sign of diversity-dependence. For CJ, the role of Himalayan–Tibetan building was substantial for biogeography but not a driver of high speciation, while positive dependence between warm climate and speciation/extinction was supported by continuously varying maximum-likelihood models. We find that rather than a single factor, the joint effect of multiple factors (biogeography, species traits, environmental drivers, and mass extinction) is responsible for current diversity patterns and that the same factor might act differently across clades, emphasizing the notion of opportunity. This study confirms the importance of the confluence of several factors rather than single explanations in modeling diversification within lineages. Diversification, extinction, Himalayan orogeny, historical biogeography, host-plant shifts, integrative study, mountain building, Papilionidae, past climate change, speciation Evolutionary biologists have long endeavored to determine which factors govern biodiversity dynamics, aiming to answer questions such as why some clades have diversified more than others, or why some lineages are widely distributed whereas others survive in restricted ranges (Ezard et al. 2016). Two different mechanistic macroevolutionary models have been proposed to explain the generation and maintenance of diversity. The Red Queen (RQ) model (Van Valen 1973), which stems from Darwin and Wallace, posits that diversification is driven by species-intrinsic, biotic factors such as interactions among species, species ecology, or life-history traits. The Court Jester (CJ) model, which builds on paleontological evidence (Barnosky 2001), argues that diversification dynamics result from historical abiotic forces such as abrupt changes in climate or geological tectonic events that drive speciation and extinction rates, usually acting clade-wide across lineages. The CJ and RQ models are generally considered the 2 extremes of a continuum, operating over different geographic and temporal scales. Biotic factors such as species interactions shape ecosystems locally over short time spans, whereas abiotic factors such as climate and tectonic events shape large-scale patterns regionally and globally over millions of years (Benton 2009). However, biotic interactions can also be observed at large spatial and temporal scales (Liow et al. 2015; Silvestro et al. 2015), while Van Valen’s (1973) original RQ hypothesis is now interpreted as accepting the role of a changing environment in shaping species evolution (Voje et al. 2015). Although both abiotic (environmental) and biotic (species-intrinsic) drivers are recognized as fundamental for regulating biodiversity (Ezard et al. 2011), these 2 types of factors are often studied in isolation (Drummond et al. 2012a; Bouchenak-Khelladi et al. 2015; Lagomarsino et al. 2016), searching for correlations between shifts in diversification rates and the evolution of key innovations or the appearance of key opportunities (Maddison et al. 2007; Alfaro et al. 2009; Rabosky et al. 2014a; Givnish et al. 2015). However, often no single factor but a confluence of biotic and abiotic factors is responsible for the diversification rate shift (Donoghue and Sanderson 2015), there could be an interaction effect. For example, C$$_{\mathrm{4}}$$ grasses appeared in the Eocene but their expansion and explosive diversification started only after mid-Miocene aridification in Africa and Central Asia (Edwards et al. 2010). Statistical assessment of the relative contributions of abiotic and biotic factors underlying diversity patterns has been made possible by the development of new probabilistic models in the field of diversification dynamics (Stadler 2013; Morlon 2014; Höhna 2015). One type of model estimates diversification rates that are clade-dependent and identifies differences in diversification rates among clades that can be explained by key innovations (Alfaro et al. 2009; Morlon et al. 2011; Rabosky et al. 2013), or by diversity-dependence and niche filling (i.e., diversification decreasing as the number of species increases, Rabosky and Lovette 2008; Etienne et al. 2012). A second type of model aims to detect statistical associations between diversification rates and changes in species traits (trait-dependent diversification models, Maddison et al. 2007; Ng and Smith 2014), or between geographic evolution and diversification, such as a change in continental connectivity allowing the colonization of a new region and a subsequent increase in allopatric speciation (Goldberg et al. 2011). A third type of model assumes continuous variation in diversification rates over time that depends on a paleoenvironmental variable and investigates whether diversification rates can be affected by abiotic environmental changes (e.g., paleotemperature, Condamine et al. 2013). Finally, episodic birth–death models search for tree-wide rate shifts that act concurrently across all lineages in a tree, for example, a mass extinction event removing a fraction of lineages at a certain time in the past (Stadler 2011; Höhna 2015; May et al. 2016). The first 2 types of models have been used to test RQ-like hypotheses on the effect of biotic interactions, while the other models are often used in the context of the CJ hypothesis (environmental change). Studies using a subset of these models to address both abiotic and biotic factors are becoming more frequent, but are often used at a local or regional geographic scale (Schnitzler et al. 2011; Drummond et al. 2012a; Jønsson et al. 2012; Hutter et al. 2013; Bouchenak-Khelladi et al. 2015; Lagomarsino et al. 2016), and are limited temporally. No study has yet addressed the full set of models using a large (geographic) and long (temporal) scale within the same lineage. Such a study would have substantial power to provide information on questions such as why some lineages diversify and others do not, and the extent that diversification is attributable to trait evolution and/or ecological opportunity (Wagner et al. 2012). It might also shed light on the current debate about whether biodiversity is at an equilibrium bounded by ecological limits (Rabosky et al. 2013), although it is unclear how relevant such limits are compared with other biological explanations (Moen and Morlon 2014). Alternatively, diversity might be unbounded and controlled by rare but drastic environmental changes such as climatic mass extinction events (Antonelli and Sanmartín 2011; Meredith et al. 2011; Kergoat et al. 2014; Condamine et al. 2015b; May et al. 2016). A key to such a study would be to find a group that: 1) has experienced both biotic and abiotic pressures at different temporal and spatial scales; 2) shows large species diversity and high specialization; and 3) has good information available about its taxonomic and evolutionary history. Nearly complete taxon sampling is crucial for accurate estimation of diversification rates (Cusimano and Renner 2010; Brock et al. 2011; Höhna et al. 2011; Davis et al. 2013), whereas rich fossil evidence is important for divergence time estimation and accurate reconstruction of biogeographic history (Meseguer et al., 2014). Phytophagous insects are especially good models because they present complex biotic (trophic) interactions with their host plants (Ehrlich and Raven 1964), and many of these lineages are old enough to have experienced dramatic past climatic changes. Herein, we address the role of biotic (RQ) and abiotic (CJ) factors that drive patterns of species richness in Apollo butterflies (Papilionidae: Parnassiinae). The Parnassiinae comprise 8 genera with at least 70 species currently recognized (Ackery 1975; Weiss 1991–2005; Kocman 2009), grouped into 3 tribes—Luehdorfiini, Zerynthiini, and Parnassiini (Nazari et al. 2007). Apollo butterflies occur from the Western Palearctic to the Western Nearctic (Weiss 1991–2005). Most diversity is concentrated in the Palearctic, where we find 2 non-monophyletic lowland-flying communities separated by the Himalaya and Tibetan Plateau (HTP, Fig. 1): an Eastern Palearctic group formed by Luehdorfia (Luehdorfiini) and Bhutanitis and Sericinus (Zerynthiini), and a Western Palearctic community including Archon (Luehdorfiini), Allancastria and Zerynthia (Zerynthiini), and Hypermnestra (Parnassiini). The largest genus, Parnassius, is mainly distributed in mountainous regions across the Holarctic, with its highest diversity in the HTP (Weiss 1991–2005; Kocman 2009; Nazari et al. 2007; Michel et al. 2008). When compared with the species-poor genera Allancastria, Archon, Bhutanitis, Hypermnestra, Luehdorfia, and Sericinus (each with 5 species maximum), the 50$$+$$ species of Parnassius have been suggested as an example of rapid diversification. This unbalanced species richness, together with their disjunct pattern of distribution in the Himalayan region, suggests that the HTP orogeny might have been an important driver in the biogeographic and diversification history of Parnassiinae. Figure 1. View largeDownload slide Distribution and species richness patterns of Apollo butterflies. Map shows the current separation of west and east communities of Parnassiinae, divided by the Himalayan–Tibetan Plateau. Lower portion of figure compares the current species diversity of each biogeographical unit (which may be non-monophyletic) or genus. Images of Parnassiinae (also in subsequent figures) are created by Fabien Condamine. Figure 1. View largeDownload slide Distribution and species richness patterns of Apollo butterflies. Map shows the current separation of west and east communities of Parnassiinae, divided by the Himalayan–Tibetan Plateau. Lower portion of figure compares the current species diversity of each biogeographical unit (which may be non-monophyletic) or genus. Images of Parnassiinae (also in subsequent figures) are created by Fabien Condamine. Parnassiinae are also remarkable for their (nearly) strict but diverse host-plant specialization (Ehrlich and Raven 1964). Tribes Luehdorfiini and Zerynthiini feed on Aristolochiaceae, the ancestral host of Parnassiinae (Condamine et al. 2012). Within Parnassiini, Hypermnestra feeds on Zygophyllaceae, while Parnassius has experienced 2 host-plant shifts, toward Crassulaceae $$+$$ Saxifragaceae and Papaveraceae (Michel et al. 2008; Condamine et al. 2012), and is regarded as an adaptive radiation (Rebourg et al. 2006). Finally, Apollo butterflies constitute a model to understand the role of past environmental change in lineage diversification. Phylogeographic studies identified an effect of Pleistocene glaciations on population-level dynamics (e.g., Keyghobadi et al. 2005; Gratton et al. 2008; Schoville and Roderick 2009; Dapporto 2010; Todisco et al. 2010, 2012; Zinetti et al. 2013). Time-calibrated phylogenies support an origin of the subfamily in the Early Cenozoic (Condamine et al. 2012), which means that the lineage has experienced the dramatic cooling and warming events of the last 50 Ma, including a drop in global temperatures at the end of the Eocene (Eocene-Oligocene climate transition [EOCT], Liu et al. 2009) that led to the demise of Boreotropical Holarctic vegetation (Pound et al. 2012). In this study, we reconstruct a completely sampled phylogeny of Parnassiinae to explore the role played by abiotic (CJ) and biotic (RQ) factors, isolated or in unison, in the origin and fate of biological radiations, including paleotemperature, mountain orogeny, host specialization, range expansion, and diversity dependence with niche filling. If interactions among species are the dominant drivers of evolution, diversification rates are expected to show diversity-dependent dynamics or be dependent on ecological traits. For example, one might expect diversification rates to decrease as a function of diversity or increase with ecological opportunity. Conversely, if evolution is driven mainly by changes in the physical environment, clade-wide responses due to abrupt abiotic perturbations should dominate macroevolutionary dynamics, for example, diversification rates may shift following major climatic changes that extirpated certain lineages while favoring the radiation of others. We test these hypotheses using trait-dependent, time-dependent, environment-dependent and episodic birth–death models, and compare the fit (explanatory power) of these models using maximum likelihood (ML) and Bayesian inference (BI). Our study is the first to address the CJ and RQ hypotheses at such a large scale, and to compare the performance of these models within a single inference framework. Materials and Methods Taxon Sampling and Molecular Dataset The latest effort to resolve phylogenetic relationships in Parnassiinae was Condamine et al. (2012), but that study did not include all species. There are currently 70 recognized species within Parnassiinae according to morphology (Weiss 1991–2005; Bollino and Racheli 2012; Frankenbach et al. 2012), but some authors have found a higher level of cryptic diversity, based on molecular data (Nazari and Sperling 2007; Gratton et al. 2008; Schoville and Roderick 2009; Dapporto 2010; Todisco et al. 2010, 2012; Zinetti et al. 2013). To address this, we assembled a large molecular dataset comprising all sequences generated by previous studies on Parnassiinae, especially by F.L.C. and F.A.H.S. (Omoto et al. 2004, 2009; Katoh et al. 2005; Nazari and Sperling 2007; Nazari et al. 2007; Michel et al. 2008; Condamine et al. 2012), and includes all currently described species within the family, except for Bhutanitis ludlowi (Möhn, 2005; Weiss 1991–2005; Churkin 2006). The dataset also includes the majority of described subspecies or geographic races proposed in molecular studies (Nazari and Sperling 2007; Gratton et al. 2008; Schoville and Roderick 2009; Dapporto 2010; Todisco et al. 2010, 2012; Zinetti et al. 2013). Additionally, sequences of 9 species of swallowtails were added to the dataset as outgroups, with Baronia brevicornis considered as the sister group to all remaining Papilionidae, and 2 species for each of the 4 tribes included within Papilioninae, the sister subfamily of Parnassiinae (Condamine et al. 2012). DNA sequences were obtained from GenBank using Taxonomy browser searches (full dataset is provided in the Supplementary Material that accompanies this article available on Dryad at http://dx.doi.org/10.5061/dryad.32bp4, Appendix S1). Sequences of less than 250 nucleotides (possibly microsatellites) and genes sequenced in very few specimens were not included. Overall the dataset comprises $$>$$1800 sequences, representing 69 of the traditionally recognized species and 5 genes (4 mitochondrial and 1 nuclear), divided into 730 sequences for cytochrome oxidase I (COI), 214 for NADH dehydrogenase 1 (ND1), 147 for NADH dehydrogenase 5 (ND5), 202 for rRNA 16S (16S), and 115 for elongation factor 1 alpha (EF-1$$\alpha )$$. Sequences were aligned using MAFFT 7.110 (Katoh and Standley 2013) with default settings (E-INS-i algorithm). Reading frames of coding genes were checked in Mesquite 3.03 (www.mesquite.org). Datasets (species/subspecies names and GenBank accession numbers) are available on Dryad (Supplementary Appendices S1 and S2). We first analyzed this specimen-level dataset to generate a gene tree for each of the 5 markers with every species represented by several specimens, each representing different populations. We then examined genetic differences based on branch lengths among specimens/species. For some species, we found strong genetic differences among specimens/populations supporting the recognition of these taxa as putative new species. For example, we identified 4 Zerynthia species instead of 2, 4 Archon species instead of 3, and 2 Hypermnestra species instead of one. For the final analyses below, we used the species-level dataset, with each species represented by a single specimen, comprising the 70 recognized species (1 represented by only morphological characters, see below) plus 15 putative new species. Assessment of Fossil Positions and Calibrations Constraints on clade ages were enforced through fossil calibrations, whose systematic position was assessed using phylogenetic analyses (Donoghue et al. 1989; Sauquet et al. 2012). Phylogenetic analyses of the dataset included both living and fossil taxa and incorporated both morphological and molecular data using a total-evidence approach (Ronquist et al. 2012a). Swallowtail fossils are scarce, but 2 unambiguously belong to Parnassiinae (Nazari et al. 2007). The first is $$\dagger$$Thaites ruminiana (Scudder 1875), a compression fossil from limestone in the Niveau du gypse d’Aix Formation of France (Bouches-du-Rhone, Aix-en-Provence) within the Chattian (23.03–28.1 Ma) of the late Oligocene (Rasnitsyn and Zherikhin 2002; Sohn et al. 2012). The second is $$\dagger$$Doritites bosniaskii (Rebel 1898), an exoskeleton and compression fossil from Italy (Tuscany) from the Messinian (5.33–7.25 Ma, late Miocene, Sohn et al. 2012). Absolute ages of geological formations were taken from Gradstein et al. (2012). To assess the phylogenetic positions of fossils, our morphological dataset comprised 236 characters (Hiura 1980; Nazari et al. 2007) including adult external (160) and internal (54) morphology, egg (2) and larval (10) morphology, pupal morphology (5), and characters for pigment color and chromosome number (5). Morphological characters were coded for all extant species in each genus except Parnassius (only 8 species: 1 per subgenus). Yet, Parnassius had little impact on our analyses since there are no fossils related to this genus (Nazari et al. 2007). Fossils were only coded for adult external morphology. Morphological and molecular data were combined to construct a total-evidence phylogeny using MrBayes 3.2.6 (Ronquist et al. 2012b). Morphological data were modeled using the Mk model (Lewis 2001). The best-fitting partitioning scheme for the molecular data was selected with PartitionFinder 1.1.1 (Lanfear et al. 2012), using the greedy search algorithm and the Bayesian Information Criterion. Rather than using a single substitution model per molecular partition, we sampled across the entire substitution rate model space (Huelsenbeck et al. 2004) using the reversible-jump Markov Chain Monte Carlo (rj-MCMC) option. Two independent analyses with 1 cold chain and 7 heated chains, each was run for 20 million generations, sampled every 2000th. A 50% majority rule consensus tree was built after discarding 25% samples as burn-in. The dataset used for this analysis is available on Dryad (Supplementary Appendix S3). Phylogeny and Estimates of Divergence Times Using the 85-species level dataset, we examined whether the rate of molecular evolution evolved in a clock-like pattern using PATHd8 (Britton et al. 2007). Since the hypothesis of a strict molecular clock was rejected for 68 of the 92 nodes ($$P < 0.05$$), we estimated divergence times using Bayesian relaxed-clock methods accounting for rate variation across lineages (Drummond et al. 2006). MCMC analyses implemented in BEAST 1.8.2 (Drummond et al. 2012b) were employed to approximate the posterior distribution of rates and divergences times and infer their credibility intervals. We set the following settings and priors: a partitioned dataset (after the best-fitting PartitionFinder scheme) was analyzed using the uncorrelated lognormal distribution clock model, with the mean set to a uniform prior between 0 and 1, and an exponential prior (lambda $$=$$ 0.333) for the standard deviation. The branching process prior was set to either a Yule or a birth–death (Gernhard 2008) process (since tree priors may impact estimates of molecular dating, Condamine et al. 2015a, using the following uniform priors: the Yule birth rate and birth–death mean growth rate ranged between 0 and 10 with a starting value at 0.1, and the birth–death relative death rate ranged between 0 and 1 (starting value $$=$$ 0.5). Calibration priors employed the fossil constraints indicated above (see Results section for the phylogenetic positions), using a uniform prior with the minimum age equal to the youngest age of the geological formation where the fossil was found. Additionally, we constrained the crown of Papilionidae with a uniform distribution bounded by a minimum age of 47.8 Ma (Smith et al. 2003; Gradstein et al. 2012) based on 2 fossils $$\dagger$$Praepapilio colorado and †Praepapilio gracilis (Durden and Rose 1978), both from the early Lutetian (mid-Eocene) of the Green River Formation (CO, USA). We could not place these fossils phylogenetically because of limited taxon sampling for Papilioninae. Instead, we placed these unambiguous fossils conservatively at the crown of the family as they share synapomorphies with all extant subfamilies (de Jong 2007), and have proven to be reliable calibration points for the crown group (Condamine et al. 2012). All uniform calibration priors were set with an upper bound equal to the estimated age of angiosperms ($$\sim$$140 Ma, sensuMagallón et al. 2015), which is 6 times older than the oldest Parnassiinae fossil. This upper age is intentionally set as ancient to allow exploration of potentially old ages for the clade. Since the fossil record of Lepidoptera is highly incomplete and biased (Sohn et al. 2015), caution is needed in using the few recent fossil calibrations. To explore the impact of sampled diversity, tree prior, and morphological data on BEAST inferences, we ran 4 analyses as follows: 1) molecular data only and Yule model; 2) molecular data only and birth–death process; 3) molecular and morphological data (total-evidence analysis) and Yule model; and 4) molecular and morphological data (total-evidence analysis) and birth–death process, with all other parameters set equal. The total-evidence analyses provide a completely sampled phylogeny of Parnassiinae due to the inclusion of B. ludlowi (only available with morphological data). Each MCMC analysis was run for 100 million generations, sampled every 10,000th, resulting in 10,000 samples in the posterior distribution of which the first 2500 samples were discarded as burn-in. All analyses were performed on the computer cluster CIPRES Science Gateway (Miller et al. 2015), using BEAGLE (Ayres et al. 2012). Convergence and performance of MCMC runs were evaluated using Tracer 1.6 (Rambaut and Drummond 2016) and the effective sample size (ESS) criterion for each parameter. A maximum-clade credibility (MCC) tree was reconstructed, with median age and 95% height posterior density (HPD). Bayes factor (BF) comparisons using stepping-stone sampling (Xie et al. 2011), which allows unbiased approximation of the marginal likelihood of Bayesian analyses (Baele et al. 2013), was used to select among competing models. We considered 2lnBF values $$>$$10 to significantly favor one model over another (Kass and Raftery 1995). The dataset and BEAST xml files generated for this study are available on Dryad (Supplementary Appendix S4). Historical Biogeography For the phylogenetic tree, we used the BEAST MCC tree with 85 species. Definition of biogeographic units was based on the present-day distribution of species but also informed by plate tectonics (Sanmartín et al. 2001) and alpine orogeny reconstructions (Bouilhol et al. 2013). Ten areas were defined: WN $$=$$ Western North America (including the Rocky Mountains); WP $$=$$ Europe (France, Spain, Pyrenees, Alps, Italy, Greece, Crete, Balkans, Scandinavia, Western Russian, and Ural Mountains); SI $$=$$ Eastern Russia, Siberia, and Kamchatka; CA $$=$$ Central Asia (Turkmenistan, Uzbekistan and Kazakhstan); MO $$=$$ Mongolian steppes and Altai Mountains; TU $$=$$ Turkey, Caucasian region, Syria, Iraq, Iranian Plateau, and Zagros Mountains; HTP $$=$$ Himalaya, Tibetan Plateau, and Pamir region; IN $$=$$ India, Bhutan, Yunnan, and Southern China; CJ $$=$$ Northern China, Korea, and Japan; AF $$=$$ North Africa and Arabia. Species ranges were defined by presence–absence in each region (Supplementary Appendix S5 available on Dryad). Biogeographic analyses were performed on the MCC tree (outgroups removed), using the Dispersal–Extinction–Cladogenesis model of range evolution (DEC; Ree and Smith 2008) implemented in the R-package BioGeoBEARS 0.2.1 (Matzke 2014). We did not use the DEC$$+$$J model (Matzke 2014) because of concerns with statistical validity of model choice among DEC-derived models and in particular under the DEC$$+$$J model (Ree and Sanmartín 2018). Also, DEC$$+$$J often leads to inferences that are decoupled from time, with null or extremely low extinction rates (Sanmartín and Meseguer 2016), an effect of the model favoring cladogenetic events over anagenetic events (Ree and Sanmartín 2018), which makes it inadequate for reconstructing the history of ancient groups with widespread distributions. Biogeographic ranges larger than 4 areas in size were disallowed as valid biogeographic states if they were not subsets of the terminal species ranges; widespread ranges comprising areas that have never been geographically connected (Sanmartín et al. 2001) were also removed. We constructed a time-stratified model (Supplementary Appendix S6 available on Dryad) with 4 time slices that specified constraints on area connections and spanned the Parnassiinae evolutionary history; each interval represents a geological period bounded by major changes in tectonic and climatic conditions thought to have affected the distribution of these butterflies. Testing the RQ and CJ hypotheses To understand the relative contributions of RQ versus CJ-type factors, we ran a series of macroevolutionary models using both ML and BI. Rather than on a single tree, analyses were performed on a sample of trees from the BEAST post-burnin posterior distribution (with outgroups removed); an exception was the diversity-dependent diversification models that, due to time constraints, were run only on the MCC tree. This approach allowed us to assess the robustness of results to phylogenetic uncertainty and estimate confidence intervals for the parameter estimates. We compared the fit of each model to the data using the corrected Akaike Information Criterion (AICc) for the ML analyses, and Bayes Factor comparisons (2lnBF) for BI. Diversification analyses in a RQ model. We first tested the hypothesis that diversity is bounded or at equilibrium, meaning that diversity expanded rapidly in early diversification, filled most niches and saturated toward the present. We explored the effect of diversity-dependence on speciation and extinction rates using the method of Etienne et al. (2012) implemented in the R-package DDD 2.7. We applied 5 different models: 1) speciation depends linearly on diversity without extinction, 2) speciation depends linearly on diversity with extinction, 3) speciation depends exponentially on diversity with extinction, 4) speciation does not depend on diversity and extinction depends linearly on diversity, and 5) speciation does not depend on diversity and extinction depends exponentially on diversity. For each model, the initial carrying capacity was set to the current number of described species. We also tested the effect of host-plant association on diversification rates, since previous study has pointed to a potential correlation between shifts in diversification and host-plant switches (Ehrlich and Raven 1964). We used a diversification model of the state-dependent speciation and extinction (SSE) family of models, in which extinction and speciation rates are associated with phenotypic evolution of a trait along a phylogeny (Maddison et al. 2007). In particular, we used the Multiple State Speciation Extinction model (MuSSE; FitzJohn et al. 2009) implemented in the R-package diversitree 0.9–7 (FitzJohn 2012), which allows for multiple character states. Larval host plant data were taken from previous work (Ehrlich and Raven 1964; Scriber 1984; Collins and Morris 1985; Tyler et al. 1994; Scriber et al. 1995). The following 4 character states were used: 1) Aristolochiaceae; 2) Zygophyllaceae; 3) Papaveraceae; and 4) Crassulaceae $$+$$ Saxifragaceae. Data at a lower taxonomic level than plant family were not used because of the great number of multiple associations exhibited by genera that could alter the phylogenetic signal. Thirty-six different models were run to test whether speciation, extinction, or transition rates were dependent on trait evolution. Models were built with increasing complexity, starting from a model with no difference in speciation, extinction and transition rates (3 parameters) to the most complex model with different speciation, extinction, and transition rates for each character state (20 parameters). We estimated posterior density distribution with Bayesian MCMC analyses (10,000 steps) performed with the best-fitting models and resulting speciation, extinction and transition rates. There have been concerns about the power of SSE models to infer diversification dynamics from a distribution of species traits (Davis et al. 2013; Maddison and FitzJohn 2015; Rabosky and Goldberg 2015). First, SSE models tend to have a high type I error bias related to the shape (topology) of the inferred tree (Rabosky and Goldberg 2015). To test whether our diversification results were potentially biased, we estimated the difference of fit ($$\Delta$$AIC) between the best model from MuSSE and a null model (corresponding to the model with no difference in speciation, extinction and transition rates between the states of a character) and compared this with the difference between the same models as estimated from simulated datasets. 1000 sets of traits were generated on the Parnassiinae phylogeny using the sim.history function from the R-package phytools (Revell 2012). To keep the phylogenetic signal in the trait, transition rates between states were simulated using values obtained from the null model fitted on the observed phylogeny. Second, there is concern whether SSE models are uncovering actual drivers of diversification, or whether they are simply pointing to more complex patterns involving unmeasured and co-distributed factors in the phylogeny (Beaulieu and O’Meara 2016). A model with an additional hidden character may alleviate this issue. We applied the Hidden SSE (HiSSE) model to specifically account for the presence of unmeasured factors that could impact diversification rates estimated for the states of any observed trait. The analyses were performed in the Bayesian software RevBayes (Höhna et al. 2016a). To provide an independent assessment of the relationship between diversification rates and host specificity, we used models that allow diversification rates to vary among clades. BAMM 2.5 (Rabosky et al. 2013, 2014a) was used to explore for differential diversification dynamic regimes among clades differing in their host-plant feeding. BAMM analyses were run for 10 million generations, sampling every 10,000th and with 4 different values of the compound Poisson prior (CPP) to ensure the posterior is independent of the prior (Moore et al. 2016). Mixing and convergence among runs (ESS $$>$$200 after 15% burn-in) were assessed with the R-package BAMMtools 2.1 (Rabosky et al. 2014b). BAMM has been criticized for incorrectly modeling rate-shifts on extinct lineages, that is, unobserved (extinct or unsampled) lineages inherit the ancestral diversification process and cannot experience subsequent diversification-rate shifts (Moore et al. 2016, but see Rabosky et al. 2017). To solve this, we used a novel approach implemented in RevBayes that models rate shifts consistently on extinct lineages by using the SSE framework (Moore et al. 2016; Höhna S., Moore B.R., Rannala B., May M.R., Huelsenbeck J.P. in preparation). Although there is no information of rate shifts for unobserved/extinct lineages in a phylogeny including extant species only, these types of events must be accounted for in computing the likelihood. The number of rate categories is fixed in the analysis but RevBayes allows any number to be specified, thus allowing direct comparison of different macroevolutionary regimes. Diversification analyses in a CJ model. We evaluated the impact of abiotic factors such as abrupt changes in tectonic or climatic settings or mass extinction events (e.g., the EOCT at 33.9 Ma) using episodic birth–death models implemented in the R-packages TreePar 3.3 (Stadler 2011) and TESS 2.1 (CoMET model, Höhna et al. 2016b; May et al. 2016). These models allow detection of discrete changes in speciation and extinction rates concurrently affecting all lineages in a tree. Both approaches estimate changes in diversification rates at discrete points in time, but can also infer mass extinction events, modeled as sampling events in the past in which the extant diversity is reduced by a fraction. Speciation and extinction rates can change at those points but remain constant within time intervals. The underlying likelihood function is identical in the 2 methods (Stadler 2011; Höhna 2015), but TreePar and TESS differ in the inference framework (ML vs. BI) and the method used for model comparison (AICc vs. BF). In addition, TESS uses independent CPPs to simultaneously detect mass extinction events and discrete changes in speciation and extinction rates, while TreePar estimates the magnitude and timing of speciation and extinction changes independently to the occurrence of mass extinctions (i.e., the 3 parameters cannot be estimated simultaneously due to parameter identifiability issues, Stadler 2011). To compare inferences between CoMET and TreePar, we performed 2 independent TreePar analyses allowing and disallowing mass extinction events (argument ME $$=$$ TRUE/FALSE). We compared models with 0, 1, or 2 rate shifts/mass extinction events in TreePar using the AICc, while BF comparisons were used to assess model fit between models with varying number and time of changes in speciation/extinction rates and mass extinctions in TESS. Finally, we used an implementation of CoMET in RevBayes (Höhna et al. 2016a), an episodic birth–death model in which speciation and extinction are allowed to vary at discrete points in time, but are autocorrelated (instead of independent/uncorrelated as in CoMET) across time intervals; a Brownian model with rates in the next time interval centered around the rates in the current time interval was used. Also, the number of diversification rate shifts was set a priori as in population skyline models (Strimmer and Pybus 2001), while in CoMET this is modeled through a CPP prior. To examine the influence of elevational distribution (lowland, mountain, or both) on speciation and extinction rates, we used the trait-dependent diversification model GeoSSE (Geographic State Speciation and Extinction, Goldberg et al. 2011). Geographic characters require a third widespread state, because, unlike morphological traits, ancestors can be present in more than 1 state (area). This requires the modeling of cladogenetic state change, the probability associated with the division of an ancestral range between the 2 descendants, alongside anagenetic state change (i.e., the probability of character change along branches). Species were coded by their elevation zone, using data from literature, museum records, and field observations. We evaluated 12 models of increasing complexity to test the relationship between elevational distributions and diversification. As in MuSSE above, we evaluated the performance of GeoSSE using simulation tests. Trait simulation was more complex here because there is no direct transition between states A and B. Instead, these transitions involve range expansion to an intermediate widespread state AB (dAB, dBA) followed by local extinction (xA, xB). Following Goldberg et al. (2011), we considered transition rates between A and B as null because they involve more than one instantaneous event; the transition rate from A to AB was coded as range expansion dA, from B to AB as dB; transitions from AB to A and from AB to B (“extirpation rates”) were coded as xB and xA, respectively. For our 1000 simulations, we used the transition rates of the model with equal speciation rates (sA $$=$$ sB $$=$$ sAB), but different extinction (xA$$\ne$$xB) and transition rates (dA$$\ne$$dB). Finally, we tested the impact of paleoenvironmental variables on diversification rates, using a birth–death likelihood method in which rates may change continuously with an environmental variable, itself varying through time (Condamine et al. 2013). We tested 4 models: 1) a constant-rate birth–death model in which diversification is not associated to the environmental variable; 2) speciation rate varies according to environment and extinction rate does not vary; 3) speciation rate does not vary and extinction rate varies according to environment; and 4) both speciation and extinction rates vary according to environment. We also tested the corresponding models in which speciation and/or extinction are allowed to vary with time, but independently from the environmental variable (time-dependent birth–death models, Morlon et al. 2011). When rates varied with the environment (E), we assumed exponential variation, such that $$\lambda (E)=\lambda_{0}\thinspace \times \thinspace e^{\alpha E}$$ and $$\mu (E)=\mu_{0}\times \thinspace e^{\beta E}$$, in which $$\lambda_{0}$$ and $$\mu _{0}$$ are the speciation and extinction rates for a given environmental variable for which the value is 0, and $$\alpha$$ and $$\beta$$ are the rates of change according to the environment. Positive values for $$\alpha$$ or $$\beta$$ mean a positive effect of the environment on speciation or extinction (and conversely). As environmental variables (Fig. 2), we used paleotemperature (data retrieved from Zachos et al. 2008) and paleo-elevation of the HTP (a proxy for mountain building), with data compiled from existing data in the literature available on Dryad (Supplementary Appendix S7). The R-package pspline was used to build environmental vectors from the data as input for the birth–death models. Figure 2. View largeDownload slide Environmental changes over the last 40 million years. a) Trends in global climate change estimated from relative proportions of different oxygen isotopes (Zachos et al. 2008). b) Trends in HTP elevation for the last 40 Myrs, compiled from data from a literature survey; our compilation shows that the HTP was as high as 3000 m in the Eocene, congruent with recent reviews (Renner 2016; Xu et al. 2018; Supplementary Appendix S7). c) Map of the current elevation of the Himalayan–Tibetan Plateau with the species diversity of Parnassius in the region. Figure 2. View largeDownload slide Environmental changes over the last 40 million years. a) Trends in global climate change estimated from relative proportions of different oxygen isotopes (Zachos et al. 2008). b) Trends in HTP elevation for the last 40 Myrs, compiled from data from a literature survey; our compilation shows that the HTP was as high as 3000 m in the Eocene, congruent with recent reviews (Renner 2016; Xu et al. 2018; Supplementary Appendix S7). c) Map of the current elevation of the Himalayan–Tibetan Plateau with the species diversity of Parnassius in the region. Alternatively, we used RevBayes and the CoMET framework (episodic birth–death models) to implement an analysis in which diversification rates are allowed to co-vary with, or evolve independently from, paleotemperatures and paleo-elevation across discrete time intervals (Palazzesi L., Hidalgo O., Barreda V.D, Forest F., Höhna S. in review). A correlation coefficient, denoted $$\beta$$, is co-estimated with diversification rates in the model and describes the magnitude and direction of the relationship between the variable and changes in diversification rates: $$\beta$$$$<$$ 0 (negative correlation), $$\beta$$$$>$$0 (positive correlation), and $$\beta = 0$$ (no correlation, in which case the model collapses to the episodic birth–death models; see Supplementary Appendix S8 available on Dryad). RESULTS Phylogeny and Fossil Placements The molecular matrix comprised 4535 nucleotides and 85 extant species (70 formally recognized and 15 putative species) of Parnassiinae plus 2 fossil species, following recent taxonomic reviews (Frankenbach et al. 2012; Bollino and Racheli 2012) and our own results from the species delimitation analyses (see Supplementary Appendices S1 and S2 available on Dryad). PartitionFinder selected 8 partitions as the best scheme for substitution models (Supplementary Appendix S9 available on Dryad). Convergence among runs was supported by the low average standard deviation of split frequencies, PSRF values $$\approx$$1.0, and ESS $$>>$$ 1000 for many parameters. The resulting Bayesian trees were well resolved and robust: 74.4% of the nodes were recovered with posterior probability (PP) $$>$$0.95 with MrBayes (Fig. 3), and 83% in BEAST (70% with PP $$=$$ 1) when fossil taxa were removed. Phylogenetic relationships agree with previous studies (Nazari et al. 2007; Michel et al. 2008), showing Luehdorfiini $$+$$ Zerynthiini as sister-tribes and sister to Parnassiini; within the latter, Hypermnestra is sister to the genus Parnassius. Most of the weak support is found within the radiation of species complexes, such as Parnassius apollo or in the subgenus Kailasius, but 2 deep nodes in Parnassius have PP $$<$$ 0.9. Figure 3. View largeDownload slide Phylogenetic relationships inferred with a Bayesian total-evidence approach combining molecular and morphological data of extant and extinct Parnassiinae. This integrative approach allowed assessment of the phylogenetic placement of 2 fossils, which were then used to calibrate the molecular dating analysis. Backbone nodes show posterior probability values. Nodal support values within genera or subclades are shown in Supplementary Appendix S10 for alternative dating analyses including, or not, morphological data. Grey circles indicate the crown-node of each tribe. Figure 3. View largeDownload slide Phylogenetic relationships inferred with a Bayesian total-evidence approach combining molecular and morphological data of extant and extinct Parnassiinae. This integrative approach allowed assessment of the phylogenetic placement of 2 fossils, which were then used to calibrate the molecular dating analysis. Backbone nodes show posterior probability values. Nodal support values within genera or subclades are shown in Supplementary Appendix S10 for alternative dating analyses including, or not, morphological data. Grey circles indicate the crown-node of each tribe. The total-evidence analysis allowed us to test the phylogenetic placement of the Parnassiinae fossils. $$\dagger$$Thaites was often recovered as sister to Parnassiini (PP $$=$$ 0.6, Fig. 3), and occasionally as sister to Luehdorfiini $$+$$ Zerynthiini. This fossil was used to provide a minimum age for crown Parnassiinae, calibrated with a uniform prior bounded between 23.03 Ma (minimum) and 47.8 Ma (maximum). $$\dagger$$Doritites was reconstructed as sister to Archon (Luehdorfiini, PP $$=$$ 0.96), in agreement with Carpenter (1992), who tentatively synonymized $$\dagger$$Doritites with Luehdorfia. The crown of Luehdorfiini was thus constrained for divergence time estimation using a uniform distribution bounded between 5.33 Ma (minimum) and 47.8 Ma (maximum). Divergence Time and Biogeographic Estimates The BEAST analyses (Yule vs. birth–death and molecular vs. total-evidence) yielded almost identical estimates of divergence times with less than 1 million years of difference for most nodes (Supplementary Appendix S10 available on Dryad). BF comparisons were also not conclusive, with little difference in the stepping-stone marginal likelihood estimate among analyses (Supplementary Appendix S11 available on Dryad). We present the results for the analysis with the birth–death prior and the total-evidence dataset (Fig. 4) because it offered slightly better convergence and incorporating extinction into the tree prior seemed more realistic for such an old lineage (results from the other analyses are available on Dryad, Supplementary Appendix S10). Figure 4. View largeDownload slide Time-calibrated phylogeny and biogeography of Apollo butterfly radiation and paleo-tectonic evolution at important periods. Colored squares on nodes indicate the most likely biogeographical estimate, as in the lower left inset (uncertainties in range estimates are in Supplementary Appendix S12). Triangles indicate range expansions and X’s denote local extinctions. A timescale spans the full evolutionary history of the group. Earth maps represent paleogeography at specific time periods. The red box highlights a notable period of local extinctions in the clade Luehdorfiini $$+$$ Zerynthiini. Panels at bottom represent the main Cenozoic environmental changes occurring with the onset and rise of the HTP or floristic turnovers. Sensitivity analyses were conducted to test the effect of adding or excluding morphological data; results are presented in Supplementary Appendix S10. Plio. $$=$$ Pliocene; Ple. $$=$$ Pleistocene. Figure 4. View largeDownload slide Time-calibrated phylogeny and biogeography of Apollo butterfly radiation and paleo-tectonic evolution at important periods. Colored squares on nodes indicate the most likely biogeographical estimate, as in the lower left inset (uncertainties in range estimates are in Supplementary Appendix S12). Triangles indicate range expansions and X’s denote local extinctions. A timescale spans the full evolutionary history of the group. Earth maps represent paleogeography at specific time periods. The red box highlights a notable period of local extinctions in the clade Luehdorfiini $$+$$ Zerynthiini. Panels at bottom represent the main Cenozoic environmental changes occurring with the onset and rise of the HTP or floristic turnovers. Sensitivity analyses were conducted to test the effect of adding or excluding morphological data; results are presented in Supplementary Appendix S10. Plio. $$=$$ Pliocene; Ple. $$=$$ Pleistocene. Based on this analysis and results from the DEC inference, Parnassiinae originated in the late Eocene (38.6 Ma, 95% HPD 31.4–46.6 Ma) in Central Asia (Fig. 4). The ancestor of sister-tribes Luehdorfiini and Zerynthiini (36.7 Ma, 29.9–44.5 Ma) is also reconstructed in Central Asia, followed by independent dispersal events to India $$+$$ Caucasus-Turkey and the HTP. Luehdorfiini diversified in the early Miocene at 23.3 Ma (17.4–30.2 Ma) within a broad geographic range including India, Turkey-Caucasus, and Central Asia; this was followed by extinction in Central Asia and India, and a later dispersal to China–Japan along the stem of Luehdorfia. Zerynthiini originated in the Oligocene (30.3 Ma, 24.1–37.2 Ma) within a region comprising India, Turkey-Caucasus, Central Asia, and the HTP, and also underwent extinction events in Central Asia and India during the Miocene. In contrast, Parnassiini first diversified within this region in the early Miocene (22.6 Ma, 17.2–28.5 Ma), followed by dispersal to Turkey-Caucasus in Hypermnestra. The ancestor of Parnassius dispersed from Central Asia to HTP around the mid-Miocene (13.4 Ma, 10.5–16.6 Ma), followed by vicariance between subgenus Parnassius and the ancestor of the other subgenera. Dispersal events to adjacent geographic regions and subsequent allopatric speciation are reconstructed within each subgenus, especially in Driopa. A more detailed account of the biogeographic history and alternative nodal reconstructions are available on Dryad (Supplementary Appendix S12). Macroevolutionary Dynamics A summary of diversification rate analysis results is presented in Table 1. Table 1. Summary of diversification analyses Evolutionary Type of birth–death $$\hspace{8pt}$$model Methods used References Data used Settings Main results Time-dependence (rates vary continuously as a function of time and across clades) Red Queen BAMM RevBayes Rabosky et al. (2013), Höhna et al. (2016a) MCC tree of Parnassiinae Bayesian model to test whether speciation rates have varied over time and/or across lineages (exponential variation) A single shift: increase of diversification near the crown of the genus Parnassius Diversity-dependence (rates vary as a function of the number of species) Red Queen DDD (dd_ML) Etienne et al. (2012) MCC tree of the clade and one of Parnassius (to test whether they reach their carrying capacity) 5 models to test whether speciation declines with diversity and/or extinction increases with diversity Both Parnassiinae and Parnassius have not reached their carrying capacity, but show some degree of diversity dependence Trait-dependence (rates vary as a function of a character state for a trait) Red Queen MuSSE (make.musse) RevBayes Maddison et al. 2007, FitzJohn et al. 2009, Höhna et al. (2016a) 200 posterior trees $$+$$ host plant preferences (four states) 36 models to test whether a character state impacted speciation and/or extinction and/or transition rates between host plants Support for independent speciation rates for each host-specialist clade; no significant differences for extinction and transition rates among clades Episodic birth–death model (rates vary discretely as a function of time) Court Jester TreePar (bd.shifts.optim) CoMET Stadler 2011, May et al. 2016 200 posterior trees MCC tree 4 models to test whether diversification rates and turnover episodically changed over time Significant diversification rate and turnover changes in the late Pliocene (3.2 Ma) and a possible mass extinction in the mid-Miocene (15 Ma) Trait-dependence (rates vary as a function of a character state for a trait) Court Jester GeoSSE (make.geosse) RevBayes Maddison et al. 2007, Goldberg et al. 2011, Höhna et al. (2016a) 200 posterior trees $$+$$ geographical occurrence (highland, lowland, or widespread) 12 models to test whether a character state impacted speciation and/or extinction and/or transition rates between mountains and lowlands High speciation in mountain, medium allopatric speciation and low speciation in lowland (no effect of the trait on extinction and transition rates) Environmental-dependence (rates vary as a function of a time-variable environment) Court Jester RPANDA (fit_env) TESS RevBayes Condamine et al. 2013, Höhna et al. (2016b) 200 posterior trees $$+$$ past global temperatures $$+$$ past altitude of the HTP 8 models for each variable to test whether rates vary with the paleoenvironment Speciation and extinction are positively associated with warm climate. Speciation or extinction is not significantly associated with past altitude Evolutionary Type of birth–death $$\hspace{8pt}$$model Methods used References Data used Settings Main results Time-dependence (rates vary continuously as a function of time and across clades) Red Queen BAMM RevBayes Rabosky et al. (2013), Höhna et al. (2016a) MCC tree of Parnassiinae Bayesian model to test whether speciation rates have varied over time and/or across lineages (exponential variation) A single shift: increase of diversification near the crown of the genus Parnassius Diversity-dependence (rates vary as a function of the number of species) Red Queen DDD (dd_ML) Etienne et al. (2012) MCC tree of the clade and one of Parnassius (to test whether they reach their carrying capacity) 5 models to test whether speciation declines with diversity and/or extinction increases with diversity Both Parnassiinae and Parnassius have not reached their carrying capacity, but show some degree of diversity dependence Trait-dependence (rates vary as a function of a character state for a trait) Red Queen MuSSE (make.musse) RevBayes Maddison et al. 2007, FitzJohn et al. 2009, Höhna et al. (2016a) 200 posterior trees $$+$$ host plant preferences (four states) 36 models to test whether a character state impacted speciation and/or extinction and/or transition rates between host plants Support for independent speciation rates for each host-specialist clade; no significant differences for extinction and transition rates among clades Episodic birth–death model (rates vary discretely as a function of time) Court Jester TreePar (bd.shifts.optim) CoMET Stadler 2011, May et al. 2016 200 posterior trees MCC tree 4 models to test whether diversification rates and turnover episodically changed over time Significant diversification rate and turnover changes in the late Pliocene (3.2 Ma) and a possible mass extinction in the mid-Miocene (15 Ma) Trait-dependence (rates vary as a function of a character state for a trait) Court Jester GeoSSE (make.geosse) RevBayes Maddison et al. 2007, Goldberg et al. 2011, Höhna et al. (2016a) 200 posterior trees $$+$$ geographical occurrence (highland, lowland, or widespread) 12 models to test whether a character state impacted speciation and/or extinction and/or transition rates between mountains and lowlands High speciation in mountain, medium allopatric speciation and low speciation in lowland (no effect of the trait on extinction and transition rates) Environmental-dependence (rates vary as a function of a time-variable environment) Court Jester RPANDA (fit_env) TESS RevBayes Condamine et al. 2013, Höhna et al. (2016b) 200 posterior trees $$+$$ past global temperatures $$+$$ past altitude of the HTP 8 models for each variable to test whether rates vary with the paleoenvironment Speciation and extinction are positively associated with warm climate. Speciation or extinction is not significantly associated with past altitude Notes: Evolutionary models are divided by whether they are used to test hypotheses in favor of the Red Queen or to support the Court Jester. MCC $$=$$ maximum clade credibility; HTP $$=$$ Himalaya and Tibetan Plateau. Table 1. Summary of diversification analyses Evolutionary Type of birth–death $$\hspace{8pt}$$model Methods used References Data used Settings Main results Time-dependence (rates vary continuously as a function of time and across clades) Red Queen BAMM RevBayes Rabosky et al. (2013), Höhna et al. (2016a) MCC tree of Parnassiinae Bayesian model to test whether speciation rates have varied over time and/or across lineages (exponential variation) A single shift: increase of diversification near the crown of the genus Parnassius Diversity-dependence (rates vary as a function of the number of species) Red Queen DDD (dd_ML) Etienne et al. (2012) MCC tree of the clade and one of Parnassius (to test whether they reach their carrying capacity) 5 models to test whether speciation declines with diversity and/or extinction increases with diversity Both Parnassiinae and Parnassius have not reached their carrying capacity, but show some degree of diversity dependence Trait-dependence (rates vary as a function of a character state for a trait) Red Queen MuSSE (make.musse) RevBayes Maddison et al. 2007, FitzJohn et al. 2009, Höhna et al. (2016a) 200 posterior trees $$+$$ host plant preferences (four states) 36 models to test whether a character state impacted speciation and/or extinction and/or transition rates between host plants Support for independent speciation rates for each host-specialist clade; no significant differences for extinction and transition rates among clades Episodic birth–death model (rates vary discretely as a function of time) Court Jester TreePar (bd.shifts.optim) CoMET Stadler 2011, May et al. 2016 200 posterior trees MCC tree 4 models to test whether diversification rates and turnover episodically changed over time Significant diversification rate and turnover changes in the late Pliocene (3.2 Ma) and a possible mass extinction in the mid-Miocene (15 Ma) Trait-dependence (rates vary as a function of a character state for a trait) Court Jester GeoSSE (make.geosse) RevBayes Maddison et al. 2007, Goldberg et al. 2011, Höhna et al. (2016a) 200 posterior trees $$+$$ geographical occurrence (highland, lowland, or widespread) 12 models to test whether a character state impacted speciation and/or extinction and/or transition rates between mountains and lowlands High speciation in mountain, medium allopatric speciation and low speciation in lowland (no effect of the trait on extinction and transition rates) Environmental-dependence (rates vary as a function of a time-variable environment) Court Jester RPANDA (fit_env) TESS RevBayes Condamine et al. 2013, Höhna et al. (2016b) 200 posterior trees $$+$$ past global temperatures $$+$$ past altitude of the HTP 8 models for each variable to test whether rates vary with the paleoenvironment Speciation and extinction are positively associated with warm climate. Speciation or extinction is not significantly associated with past altitude Evolutionary Type of birth–death $$\hspace{8pt}$$model Methods used References Data used Settings Main results Time-dependence (rates vary continuously as a function of time and across clades) Red Queen BAMM RevBayes Rabosky et al. (2013), Höhna et al. (2016a) MCC tree of Parnassiinae Bayesian model to test whether speciation rates have varied over time and/or across lineages (exponential variation) A single shift: increase of diversification near the crown of the genus Parnassius Diversity-dependence (rates vary as a function of the number of species) Red Queen DDD (dd_ML) Etienne et al. (2012) MCC tree of the clade and one of Parnassius (to test whether they reach their carrying capacity) 5 models to test whether speciation declines with diversity and/or extinction increases with diversity Both Parnassiinae and Parnassius have not reached their carrying capacity, but show some degree of diversity dependence Trait-dependence (rates vary as a function of a character state for a trait) Red Queen MuSSE (make.musse) RevBayes Maddison et al. 2007, FitzJohn et al. 2009, Höhna et al. (2016a) 200 posterior trees $$+$$ host plant preferences (four states) 36 models to test whether a character state impacted speciation and/or extinction and/or transition rates between host plants Support for independent speciation rates for each host-specialist clade; no significant differences for extinction and transition rates among clades Episodic birth–death model (rates vary discretely as a function of time) Court Jester TreePar (bd.shifts.optim) CoMET Stadler 2011, May et al. 2016 200 posterior trees MCC tree 4 models to test whether diversification rates and turnover episodically changed over time Significant diversification rate and turnover changes in the late Pliocene (3.2 Ma) and a possible mass extinction in the mid-Miocene (15 Ma) Trait-dependence (rates vary as a function of a character state for a trait) Court Jester GeoSSE (make.geosse) RevBayes Maddison et al. 2007, Goldberg et al. 2011, Höhna et al. (2016a) 200 posterior trees $$+$$ geographical occurrence (highland, lowland, or widespread) 12 models to test whether a character state impacted speciation and/or extinction and/or transition rates between mountains and lowlands High speciation in mountain, medium allopatric speciation and low speciation in lowland (no effect of the trait on extinction and transition rates) Environmental-dependence (rates vary as a function of a time-variable environment) Court Jester RPANDA (fit_env) TESS RevBayes Condamine et al. 2013, Höhna et al. (2016b) 200 posterior trees $$+$$ past global temperatures $$+$$ past altitude of the HTP 8 models for each variable to test whether rates vary with the paleoenvironment Speciation and extinction are positively associated with warm climate. Speciation or extinction is not significantly associated with past altitude Notes: Evolutionary models are divided by whether they are used to test hypotheses in favor of the Red Queen or to support the Court Jester. MCC $$=$$ maximum clade credibility; HTP $$=$$ Himalaya and Tibetan Plateau. Diversification in a RQ model. Diversity-dependent diversification models estimated an unlimited carrying capacity for the subfamily (Supplementary Appendix S13 available on Dryad), implying that accumulating diversity did not influence diversification rates. The best-fitting DDD model was one in which extinction rate increased with diversity. For Parnassius the best model was a diversity-dependence process with speciation decreasing with increasing diversity. However, none of these models received significant support when compared with other models ($$\Delta$$AICc $$<$$ 2). MuSSE analyses on host-plant preferences selected a model in which speciation rates varied among clades feeding on different host plants; extinction and transition rates were estimated as equal (Supplementary Appendix S14 available on Dryad). Zygophyllaceae feeders (Hypermnestra) had the lowest speciation rate, followed by Aristolochiaceae feeders (Luehdorfiini and Zerynthiini), with the highest rates for clades feeding on Papaveraceae (all Parnassius except subgenus Parnassius) and Crassulaceae $$+$$ Saxifragaceae (subgenus Parnassius) (Fig. 5). MCMC credibility intervals showed significant differences between the speciation rates of Papaveraceae feeders and Crassulaceae $$+$$ Saxifragaceae feeders and those feeding on other host plants (Supplementary Appendix S15 available on Dryad). Transition rates for host-plant switches were often estimated close to zero with strong niche conservatism in the phylogeny; once a lineage shifted to a new host it did not switch back and only rarely colonized another host plant (Supplementary Appendix S14 available on Dryad). We assessed the robustness of this pattern with a simulation procedure, and Bayesian implementations of HiSSE and MuSSE in RevBayes (Supplementary Appendix S16 available on Dryad). All analyses supported the same diversification pattern with speciation rates being different among lineages feeding on different host plants. Our estimates were robust to known biases of SSE models, suggesting a strong effect of host plant on diversification rates of Parnassiinae. Figure 5. View largeDownload slide Diversification of Parnassiinae linked to their host plant. a) Phylogenetic distribution of host-plant traits (four states) analyzed with MuSSE diversification models. b) Bayesian inferences made with the best-fitting MuSSE model (see Supplementary Appendix S14) showed that speciation rates vary according to the host-plant trait: Parnassius clades feeding on Papaveraceae and Crassulaceae $$+$$ Saxifragaceae have higher speciation rates than their relatives feeding on other families; extinction and transition rates were estimated as equal across clades. Figure 5. View largeDownload slide Diversification of Parnassiinae linked to their host plant. a) Phylogenetic distribution of host-plant traits (four states) analyzed with MuSSE diversification models. b) Bayesian inferences made with the best-fitting MuSSE model (see Supplementary Appendix S14) showed that speciation rates vary according to the host-plant trait: Parnassius clades feeding on Papaveraceae and Crassulaceae $$+$$ Saxifragaceae have higher speciation rates than their relatives feeding on other families; extinction and transition rates were estimated as equal across clades. Bayesian analyses that model rate-heterogeneity across clades provided further corroboration of the SSE inferences (Fig. 6a). Both BAMM and an alternative implementation of lineage-specific birth–death model in RevBayes supported a diversification regime in which there is at least an increase in diversification rates along the stem of Parnassius (and possibly a second upshift at the stem of subgenus Parnassius) and found higher speciation rates for host-plant feeders on Crassulaceae $$+$$ Saxifragaceae and Papaveraceae than those on Aristolochiaceae or Zygophyllaceae (Supplementary Appendices S17–S19 available on Dryad). BAMM was insensitive to the CPP prior. Figure 6. View largeDownload slide Diversification dynamics of Parnassiinae inferred with time-dependent models. a) Rate-heterogeneity-across-clades models implemented in BAMM and RevBayes identified one significant shift in diversification rates along the stem leading to Parnassius (represented by a red circle; see also Supplementary Appendices S17–S19), with a higher speciation rate in this clade than in the rest of the tree. Episodic birth–death models implemented in CoMET (b) and TreePar (c) identified a possible mass extinction event in the mid-Miocene ($$\sim$$15 Ma). The blue histogram shows the uncertainty in the timing of the mass extinction event based on the estimates over 200 trees with TreePar (similar estimates are obtained for CoMET; Supplementary Appendix S20). The 2 models also estimated an increase in diversity around the early-mid Pliocene (b and c), but this was followed by a decrease in diversity toward the present time in CoMET which was not detected in TreePar. The red line shows the lineages-through-time plot. The green lines represent the reconstructed past diversity before and after the mass extinction, based on the net diversification rates and 7.5% of survival probability estimated in TreePar. Plio. $$=$$ Pliocene; Ple. $$=$$ Pleistocene. Figure 6. View largeDownload slide Diversification dynamics of Parnassiinae inferred with time-dependent models. a) Rate-heterogeneity-across-clades models implemented in BAMM and RevBayes identified one significant shift in diversification rates along the stem leading to Parnassius (represented by a red circle; see also Supplementary Appendices S17–S19), with a higher speciation rate in this clade than in the rest of the tree. Episodic birth–death models implemented in CoMET (b) and TreePar (c) identified a possible mass extinction event in the mid-Miocene ($$\sim$$15 Ma). The blue histogram shows the uncertainty in the timing of the mass extinction event based on the estimates over 200 trees with TreePar (similar estimates are obtained for CoMET; Supplementary Appendix S20). The 2 models also estimated an increase in diversity around the early-mid Pliocene (b and c), but this was followed by a decrease in diversity toward the present time in CoMET which was not detected in TreePar. The red line shows the lineages-through-time plot. The green lines represent the reconstructed past diversity before and after the mass extinction, based on the net diversification rates and 7.5% of survival probability estimated in TreePar. Plio. $$=$$ Pliocene; Ple. $$=$$ Pleistocene. Diversification in a CJ model. When mass extinctions are disallowed, TreePar supported a model with a single shift at 3.2 Ma (Table 2), with a low initial rate of diversification (r$$_{\mathrm{2}} = 0.031$$) and high turnover ($$\varepsilon _{\mathrm{2}} = 0.91$$), followed by 4-fold increase in diversification rate (r$$_{\mathrm{1}} = 0.119$$) and a joint decrease of turnover ($$\varepsilon_{\mathrm{1}} = 0.17$$). The second best model identified a second shift in the mid-Miocene (15.3 Ma) marked by an extinction period surrounded by low and negative values for the net diversification rate, and values $$>$$1 for turnover. When mass extinctions are allowed to occur, TreePar favored a significant mass extinction at 14.8 Ma with an estimated survival probability after the mass extinction of 7.5% (Table 2; Fig. 6c). The CoMET model implemented in RevBayes with the autocorrelated model also identified a mass extinction event in the mid-Miocene $$\sim$$15 Ma (2lnBF $$=$$ 6, Supplementary Appendix S20 on Dryad). There was also a significant increase in speciation rates around the Pliocene, followed by a decrease in speciation toward the present (Fig. 6b); similar results were obtained with the TESS implementation of CoMET (Supplementary Appendix S20 available on Dryad). Table 2. Results from episodic diversification analyses in TreePar with or without mass extinction Mass extinction events disallowed Mass extinction events allowed BD with BD with BD with BD with BD with BD with BD with Models no shift time 1 shift time 2 shift times 3 shift times Models no ME 1 ME 2 MEs NP 2 5 8 11 NP 2 4 6 logL –182.840,0.5640 –175.099,0.5727 –172.910,0.5716 –170.504,0.5778 logL –240.789,0.5634 –235.792,0.5667 –234.082,0.5760 AICc 369.827,1.1281 360.959,1.1454 363.714,1.1432 366.624,1.1555 AICc 485.726,1.1267 482.345,1.1334 486.058,1.1520 $$\Delta$$AIC 8.868 0 2.755 5.665 $$\Delta$$AIC r1 0.0934,0.0008 0.1188,0.0057 0.0915,0.0067 0.0711,0.0105 r 0.0930,0.0008 0.1672,0.0012 0.1859,0.0014 $$\varepsilon1$$ 0.5374,0.0047 0.1699,0.0355 0.3109,0.0411 0.5288,0.1224 $$\varepsilon$$ 0.5363,0.0042 0.0279,0.0038 0.0025,0.0010 st1 — 3.21,0.0573 2.81,0.0693 2.56,0.0788 st1 — 14.82,0.12 7.37,0.17 r2 — 0.0312,0.0007 0.0254,0.0166 0.0031,0.0295 sp1 — 0.0749,0.0021 0.466,0.0144 $$\varepsilon2$$ — 0.9142,0.0052 0.8319,0.0529 0.9179,0.0795 st2 — — 16.08,0.29 st2 — — 15.29,0.8932 10.47,0.6951 sp2 — — 0.0759,0.0028 r3 — — –0.3498,0.1026 0.0302,0.0190 $$\varepsilon3$$ — — 0.7616,0.0380 0.8461,0.0506 st3 — — — 20.29,0.9277 r4 — — — –0.9929,0.1753 $$\varepsilon4$$ — — — 0.8571,0.0403 Mass extinction events disallowed Mass extinction events allowed BD with BD with BD with BD with BD with BD with BD with Models no shift time 1 shift time 2 shift times 3 shift times Models no ME 1 ME 2 MEs NP 2 5 8 11 NP 2 4 6 logL –182.840,0.5640 –175.099,0.5727 –172.910,0.5716 –170.504,0.5778 logL –240.789,0.5634 –235.792,0.5667 –234.082,0.5760 AICc 369.827,1.1281 360.959,1.1454 363.714,1.1432 366.624,1.1555 AICc 485.726,1.1267 482.345,1.1334 486.058,1.1520 $$\Delta$$AIC 8.868 0 2.755 5.665 $$\Delta$$AIC r1 0.0934,0.0008 0.1188,0.0057 0.0915,0.0067 0.0711,0.0105 r 0.0930,0.0008 0.1672,0.0012 0.1859,0.0014 $$\varepsilon1$$ 0.5374,0.0047 0.1699,0.0355 0.3109,0.0411 0.5288,0.1224 $$\varepsilon$$ 0.5363,0.0042 0.0279,0.0038 0.0025,0.0010 st1 — 3.21,0.0573 2.81,0.0693 2.56,0.0788 st1 — 14.82,0.12 7.37,0.17 r2 — 0.0312,0.0007 0.0254,0.0166 0.0031,0.0295 sp1 — 0.0749,0.0021 0.466,0.0144 $$\varepsilon2$$ — 0.9142,0.0052 0.8319,0.0529 0.9179,0.0795 st2 — — 16.08,0.29 st2 — — 15.29,0.8932 10.47,0.6951 sp2 — — 0.0759,0.0028 r3 — — –0.3498,0.1026 0.0302,0.0190 $$\varepsilon3$$ — — 0.7616,0.0380 0.8461,0.0506 st3 — — — 20.29,0.9277 r4 — — — –0.9929,0.1753 $$\varepsilon4$$ — — — 0.8571,0.0403 Notes: Bold columns represent the best model. When mass extinction is disallowed, the best-supported model has a single rate-shift, as determined by the lowest AICc and $$\Delta$$AICc. When mass extinction is allowed, the best model also has a single mass-extinction event. Adding more diversification shifts or mass extinctions did not significantly improve the likelihood of the models. “r1” denotes the diversification rate and “$$\varepsilon$$1” is the turnover, both inferred between Present and the shift time 1 (“st1”). BD $$=$$ birth–death; ME $$=$$ mass extinction; NP $$=$$ number of parameters; logL $$=$$ log-likelihood; AICc $$=$$ corrected Akaike Information Criterion; $$\Delta$$AICc $$=$$ the difference in AICc between the model with the lowest AICc and the others. Parameter estimates: r $$=$$ net diversification rate (speciation minus extinction); $$\varepsilon =$$ turnover (extinction over speciation); st $$=$$ shift time; sp $$=$$ survival probability at a mass extinction event. Table 2. Results from episodic diversification analyses in TreePar with or without mass extinction Mass extinction events disallowed Mass extinction events allowed BD with BD with BD with BD with BD with BD with BD with Models no shift time 1 shift time 2 shift times 3 shift times Models no ME 1 ME 2 MEs NP 2 5 8 11 NP 2 4 6 logL –182.840,0.5640 –175.099,0.5727 –172.910,0.5716 –170.504,0.5778 logL –240.789,0.5634 –235.792,0.5667 –234.082,0.5760 AICc 369.827,1.1281 360.959,1.1454 363.714,1.1432 366.624,1.1555 AICc 485.726,1.1267 482.345,1.1334 486.058,1.1520 $$\Delta$$AIC 8.868 0 2.755 5.665 $$\Delta$$AIC r1 0.0934,0.0008 0.1188,0.0057 0.0915,0.0067 0.0711,0.0105 r 0.0930,0.0008 0.1672,0.0012 0.1859,0.0014 $$\varepsilon1$$ 0.5374,0.0047 0.1699,0.0355 0.3109,0.0411 0.5288,0.1224 $$\varepsilon$$ 0.5363,0.0042 0.0279,0.0038 0.0025,0.0010 st1 — 3.21,0.0573 2.81,0.0693 2.56,0.0788 st1 — 14.82,0.12 7.37,0.17 r2 — 0.0312,0.0007 0.0254,0.0166 0.0031,0.0295 sp1 — 0.0749,0.0021 0.466,0.0144 $$\varepsilon2$$ — 0.9142,0.0052 0.8319,0.0529 0.9179,0.0795 st2 — — 16.08,0.29 st2 — — 15.29,0.8932 10.47,0.6951 sp2 — — 0.0759,0.0028 r3 — — –0.3498,0.1026 0.0302,0.0190 $$\varepsilon3$$ — — 0.7616,0.0380 0.8461,0.0506 st3 — — — 20.29,0.9277 r4 — — — –0.9929,0.1753 $$\varepsilon4$$ — — — 0.8571,0.0403 Mass extinction events disallowed Mass extinction events allowed BD with BD with BD with BD with BD with BD with BD with Models no shift time 1 shift time 2 shift times 3 shift times Models no ME 1 ME 2 MEs NP 2 5 8 11 NP 2 4 6 logL –182.840,0.5640 –175.099,0.5727 –172.910,0.5716 –170.504,0.5778 logL –240.789,0.5634 –235.792,0.5667 –234.082,0.5760 AICc 369.827,1.1281 360.959,1.1454 363.714,1.1432 366.624,1.1555 AICc 485.726,1.1267 482.345,1.1334 486.058,1.1520 $$\Delta$$AIC 8.868 0 2.755 5.665 $$\Delta$$AIC r1 0.0934,0.0008 0.1188,0.0057 0.0915,0.0067 0.0711,0.0105 r 0.0930,0.0008 0.1672,0.0012 0.1859,0.0014 $$\varepsilon1$$ 0.5374,0.0047 0.1699,0.0355 0.3109,0.0411 0.5288,0.1224 $$\varepsilon$$ 0.5363,0.0042 0.0279,0.0038 0.0025,0.0010 st1 — 3.21,0.0573 2.81,0.0693 2.56,0.0788 st1 — 14.82,0.12 7.37,0.17 r2 — 0.0312,0.0007 0.0254,0.0166 0.0031,0.0295 sp1 — 0.0749,0.0021 0.466,0.0144 $$\varepsilon2$$ — 0.9142,0.0052 0.8319,0.0529 0.9179,0.0795 st2 — — 16.08,0.29 st2 — — 15.29,0.8932 10.47,0.6951 sp2 — — 0.0759,0.0028 r3 — — –0.3498,0.1026 0.0302,0.0190 $$\varepsilon3$$ — — 0.7616,0.0380 0.8461,0.0506 st3 — — — 20.29,0.9277 r4 — — — –0.9929,0.1753 $$\varepsilon4$$ — — — 0.8571,0.0403 Notes: Bold columns represent the best model. When mass extinction is disallowed, the best-supported model has a single rate-shift, as determined by the lowest AICc and $$\Delta$$AICc. When mass extinction is allowed, the best model also has a single mass-extinction event. Adding more diversification shifts or mass extinctions did not significantly improve the likelihood of the models. “r1” denotes the diversification rate and “$$\varepsilon$$1” is the turnover, both inferred between Present and the shift time 1 (“st1”). BD $$=$$ birth–death; ME $$=$$ mass extinction; NP $$=$$ number of parameters; logL $$=$$ log-likelihood; AICc $$=$$ corrected Akaike Information Criterion; $$\Delta$$AICc $$=$$ the difference in AICc between the model with the lowest AICc and the others. Parameter estimates: r $$=$$ net diversification rate (speciation minus extinction); $$\varepsilon =$$ turnover (extinction over speciation); st $$=$$ shift time; sp $$=$$ survival probability at a mass extinction event. GeoSSE analyses selected a best-fitting model with equal dispersal and extinction rates between altitudinal regions (habitats), but higher speciation rates for highland regions when compared with lowland regions (Supplementary Appendix S14 available on Dryad). Allopatric speciation (for lowland/highland ancestors) was inferred to occur at a higher rate than within-lowland speciation, but at a lower rate than within-highland speciation; these differences remained significant in the Bayesian MCMC analyses, as the credibility interval did not overlap with zero (Fig. 7). Nevertheless, the difference in speciation rates between higher and lower altitudes was not larger than expected by chance in our simulation tests (Supplementary Appendix S16 available on Dryad), so we cannot reject the observed effect of altitude on speciation being affected by the shape of the phylogeny. Figure 7. View largeDownload slide Inference of geographic mode of speciation between mountain and lowland species. a) The phylogenetic distribution of geographic traits (lowland, mountain, or “widespread”) analyzed with GeoSSE models of diversification. b) Bayesian inferences made with the best-fitting GeoSSE model (selected over a series of 12 diversification models) showing that only speciation varies according to the geographic trait. The results indicate that the mountain-dwelling Parnassiinae species have significantly higher speciation rates than their counterparts. The allopatric (or biome divergence) speciation rate is also estimated to be elevated, congruent with our biogeographic estimates. Figure 7. View largeDownload slide Inference of geographic mode of speciation between mountain and lowland species. a) The phylogenetic distribution of geographic traits (lowland, mountain, or “widespread”) analyzed with GeoSSE models of diversification. b) Bayesian inferences made with the best-fitting GeoSSE model (selected over a series of 12 diversification models) showing that only speciation varies according to the geographic trait. The results indicate that the mountain-dwelling Parnassiinae species have significantly higher speciation rates than their counterparts. The allopatric (or biome divergence) speciation rate is also estimated to be elevated, congruent with our biogeographic estimates. The ML-based paleoenvironment-dependent model implemented in RPANDA indicated positive dependence between diversification rates and past temperature (Table 3). In the best model, both speciation and extinction rates increased exponentially with temperature, with a general decrease in diversification rates toward the present, combined with periods of high turnover rates in the late Oligocene and mid-Miocene (Fig. 8). The best paleo-elevation model had speciation increasing with increasing altitude, whereas extinction remained constant (and not affected by elevation fluctuations), but this model does not fit better than constant-rate or temperature-dependent models (Table 3). In contrast, the Bayesian paleoenvironment-dependent model implemented in RevBayes indicated that both past temperature and elevation did not impact the diversification of Parnassiinae, since the correlation parameters $$\alpha$$ and $$\beta$$ had credibility intervals that were not significantly different from zero (Supplementary Appendix S21 available on Dryad). Moreover, estimates with this model (Fig. 8) showed diversification dynamics that were similar to those obtained with CoMET and TreePar, but unlike those from RPANDA. These differences might be attributed to the inference framework (ML in RPANDA vs. BI in RevBayes) or to the different underlying birth–death models (continuous in RPANDA vs. episodic in RevBayes). To test this, we implemented the continuous-paleoenvironmental models (Condamine et al. 2013) in TESS. We then compared these models against a constant birth–death model, episodic (piecewise constant) birth–death models (1–5 shifts), and time-variable models. This procedure ensured that a single statistical framework (ML and AIC) implemented in TESS was used to compare all models. Results showed that the best-fitting model was one in which speciation and extinction rates vary positively with paleotemperature, in agreement with RPANDA (Tables 4 and 5). We, therefore, concluded that the different results are related not to the inference framework but to the underlying model: the use of an episodic birth–death model with autocorrelated rates across time intervals in the RevBayes model rather than the uncorrelated episodic model implemented in TESS. Figure 8. View largeDownload slide Paleoenvironment-dependent diversification processes in Apollo butterflies. ML continuous paleoenvironment-dependent models implemented in RPANDA (a–d) showed positive dependence between paleotemperatures and speciation/extinction; the paleo-elevation-dependent diversification models did not fit the data better than the constant-rate or time-variable models. Bayesian inference of environment-dependent diversification with autocorrelated rates in RevBayes (e–h) supports a model in which both speciation and extinction are not dependent on temperature and elevation changes (represented by dotted lines, temperature and elevation values are given in Figure 2). Colored areas in speciation and extinction plots indicate the credibility interval, with the continuous line representing the median rate. Pli $$=$$ Pliocene; P $$=$$ Pleistocene. Figure 8. View largeDownload slide Paleoenvironment-dependent diversification processes in Apollo butterflies. ML continuous paleoenvironment-dependent models implemented in RPANDA (a–d) showed positive dependence between paleotemperatures and speciation/extinction; the paleo-elevation-dependent diversification models did not fit the data better than the constant-rate or time-variable models. Bayesian inference of environment-dependent diversification with autocorrelated rates in RevBayes (e–h) supports a model in which both speciation and extinction are not dependent on temperature and elevation changes (represented by dotted lines, temperature and elevation values are given in Figure 2). Colored areas in speciation and extinction plots indicate the credibility interval, with the continuous line representing the median rate. Pli $$=$$ Pliocene; P $$=$$ Pleistocene. Table 3. Results from RPANDA analyses Models NP logL AICc $$\Delta$$AIC AI$$\omega$$ $$\lambda_{\mathrm{0}}$$ $$\alpha$$ $$\mu_{\mathrm{0}}$$ $$\beta$$ Constant birth–death 2 $$-240.336 \pm 0.620$$ 484.819 $${\pm}$$ 1.240 5.186 0.044 0.2048 $${\pm}$$ 0.0018 — 0.1123,0.0017 — $$\lambda_{\mathrm{Time}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-238.427 \pm 0.624$$ 483.151 $${\pm}$$ 1.249 3.518 0.101 0.1962 $${\pm}$$ 0.0016 –0.0407,0.0004 5.00E-08,5.00E-08 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Time}}$$ 3 $$-239.089 \pm 0.621$$ 484.474 $${\pm}$$ 1.242 4.841 0.052 0.1914 $${\pm}$$ 0.0016 — 0.0549,0.0010 0.0433,0.0006 $$\lambda_{\mathrm{Time}}$$ and $$\mu_{\mathrm{Time}}$$ 4 $$-238.427 \pm 0.624$$ 485.354 $${\pm}$$ 1.249 5.721 0.034 0.1962 $${\pm}$$ 0.0016 –0.0405,0.00004 8.535E-05,3.24E-05 0.0347,0.0024 $$\lambda_{\mathrm{Alti.}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-240.247 \pm 0.572$$ 486.790 $${\pm}$$ 1.144 7.157 0.016 0.0310 $${\pm}$$ 0.0012 0.0004,6.35E-06 0.0794,0.0021 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Alti.}}$$ 3 $$-241.542 \pm 0.577$$ 489.379 $${\pm}$$ 1.155 9.746 0.004 0.1991 $${\pm}$$ 0.0017 — 0.5116,0.0676 –0.00024,1.35E-05 $$\lambda_{\mathrm{Alti.}}$$ and $$\mu_{\mathrm{Alti.}}$$ 4 $$-240.253 \pm 0.582$$ 489.005 $${\pm}$$ 1.164 9.372 0.005 0.0454 $${\pm}$$ 0.0021 0.0003,9.30E-06 0.1016,0.0058 –2.67E-05,6.29E-05 $$\lambda_{\mathrm{Temp.}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-240.605 \pm 0.632$$ 487.505 $${\pm}$$ 1.264 7.872 0.011 0.2121 $${\pm}$$ 0.0021 –0.0242,0.0012 0.0834,0.0011 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Temp.}}$$ 3 $$-238.058 \pm 0.649$$ 482.412 $${\pm}$$ 1.298 2.779 0.146 0.1817 $${\pm}$$ 0.0013 — 0.0071,0.0005 0.4707,0.0138 $$\lambda_{\mathrm{\mathbf{Temp.}}}$$and$$\mu_{\mathrm{\mathbf{Temp.}}}$$ 4 $$-235.566 \pm 0.595$$ 479.633 $${\pm}$$ 1.189 0 0.586 0.0839 $${\pm}$$ 0.0017 0.3811,0.0073 0.0797,0.0019 0.3988,0.0041 Models NP logL AICc $$\Delta$$AIC AI$$\omega$$ $$\lambda_{\mathrm{0}}$$ $$\alpha$$ $$\mu_{\mathrm{0}}$$ $$\beta$$ Constant birth–death 2 $$-240.336 \pm 0.620$$ 484.819 $${\pm}$$ 1.240 5.186 0.044 0.2048 $${\pm}$$ 0.0018 — 0.1123,0.0017 — $$\lambda_{\mathrm{Time}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-238.427 \pm 0.624$$ 483.151 $${\pm}$$ 1.249 3.518 0.101 0.1962 $${\pm}$$ 0.0016 –0.0407,0.0004 5.00E-08,5.00E-08 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Time}}$$ 3 $$-239.089 \pm 0.621$$ 484.474 $${\pm}$$ 1.242 4.841 0.052 0.1914 $${\pm}$$ 0.0016 — 0.0549,0.0010 0.0433,0.0006 $$\lambda_{\mathrm{Time}}$$ and $$\mu_{\mathrm{Time}}$$ 4 $$-238.427 \pm 0.624$$ 485.354 $${\pm}$$ 1.249 5.721 0.034 0.1962 $${\pm}$$ 0.0016 –0.0405,0.00004 8.535E-05,3.24E-05 0.0347,0.0024 $$\lambda_{\mathrm{Alti.}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-240.247 \pm 0.572$$ 486.790 $${\pm}$$ 1.144 7.157 0.016 0.0310 $${\pm}$$ 0.0012 0.0004,6.35E-06 0.0794,0.0021 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Alti.}}$$ 3 $$-241.542 \pm 0.577$$ 489.379 $${\pm}$$ 1.155 9.746 0.004 0.1991 $${\pm}$$ 0.0017 — 0.5116,0.0676 –0.00024,1.35E-05 $$\lambda_{\mathrm{Alti.}}$$ and $$\mu_{\mathrm{Alti.}}$$ 4 $$-240.253 \pm 0.582$$ 489.005 $${\pm}$$ 1.164 9.372 0.005 0.0454 $${\pm}$$ 0.0021 0.0003,9.30E-06 0.1016,0.0058 –2.67E-05,6.29E-05 $$\lambda_{\mathrm{Temp.}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-240.605 \pm 0.632$$ 487.505 $${\pm}$$ 1.264 7.872 0.011 0.2121 $${\pm}$$ 0.0021 –0.0242,0.0012 0.0834,0.0011 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Temp.}}$$ 3 $$-238.058 \pm 0.649$$ 482.412 $${\pm}$$ 1.298 2.779 0.146 0.1817 $${\pm}$$ 0.0013 — 0.0071,0.0005 0.4707,0.0138 $$\lambda_{\mathrm{\mathbf{Temp.}}}$$and$$\mu_{\mathrm{\mathbf{Temp.}}}$$ 4 $$-235.566 \pm 0.595$$ 479.633 $${\pm}$$ 1.189 0 0.586 0.0839 $${\pm}$$ 0.0017 0.3811,0.0073 0.0797,0.0019 0.3988,0.0041 Notes: Bold columns represent the best model. The model with positive dependence of speciation and extinction on temperature has the lowest AICc and highest AIC$$\omega$$. Macroevolutionary scenarios were compared in which speciation and extinction vary through time (Time), or by paleo-elevation changes of the Himalaya and Tibetan Plateau (Alti.), and global temperature changes (Temp.). Abbreviations: NP, number of parameters; logL, log-likelihood; AICc, corrected Akaike Information Criterion; $$\Delta$$AICc, the difference in AICc between the model with the lowest AICc and the others; AIC$$\omega$$, Akaike weight. Parameter estimates: $$\lambda_{\mathrm{0}}$$ and $$\mu _{\mathrm{0}}$$, speciation and extinction for environmental value at present; $$\alpha$$ and $$\beta$$, parameter controlling variation of speciation ($$\alpha )$$ and extinction ($$\beta )$$ with paleoenvironment. Table 3. Results from RPANDA analyses Models NP logL AICc $$\Delta$$AIC AI$$\omega$$ $$\lambda_{\mathrm{0}}$$ $$\alpha$$ $$\mu_{\mathrm{0}}$$ $$\beta$$ Constant birth–death 2 $$-240.336 \pm 0.620$$ 484.819 $${\pm}$$ 1.240 5.186 0.044 0.2048 $${\pm}$$ 0.0018 — 0.1123,0.0017 — $$\lambda_{\mathrm{Time}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-238.427 \pm 0.624$$ 483.151 $${\pm}$$ 1.249 3.518 0.101 0.1962 $${\pm}$$ 0.0016 –0.0407,0.0004 5.00E-08,5.00E-08 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Time}}$$ 3 $$-239.089 \pm 0.621$$ 484.474 $${\pm}$$ 1.242 4.841 0.052 0.1914 $${\pm}$$ 0.0016 — 0.0549,0.0010 0.0433,0.0006 $$\lambda_{\mathrm{Time}}$$ and $$\mu_{\mathrm{Time}}$$ 4 $$-238.427 \pm 0.624$$ 485.354 $${\pm}$$ 1.249 5.721 0.034 0.1962 $${\pm}$$ 0.0016 –0.0405,0.00004 8.535E-05,3.24E-05 0.0347,0.0024 $$\lambda_{\mathrm{Alti.}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-240.247 \pm 0.572$$ 486.790 $${\pm}$$ 1.144 7.157 0.016 0.0310 $${\pm}$$ 0.0012 0.0004,6.35E-06 0.0794,0.0021 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Alti.}}$$ 3 $$-241.542 \pm 0.577$$ 489.379 $${\pm}$$ 1.155 9.746 0.004 0.1991 $${\pm}$$ 0.0017 — 0.5116,0.0676 –0.00024,1.35E-05 $$\lambda_{\mathrm{Alti.}}$$ and $$\mu_{\mathrm{Alti.}}$$ 4 $$-240.253 \pm 0.582$$ 489.005 $${\pm}$$ 1.164 9.372 0.005 0.0454 $${\pm}$$ 0.0021 0.0003,9.30E-06 0.1016,0.0058 –2.67E-05,6.29E-05 $$\lambda_{\mathrm{Temp.}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-240.605 \pm 0.632$$ 487.505 $${\pm}$$ 1.264 7.872 0.011 0.2121 $${\pm}$$ 0.0021 –0.0242,0.0012 0.0834,0.0011 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Temp.}}$$ 3 $$-238.058 \pm 0.649$$ 482.412 $${\pm}$$ 1.298 2.779 0.146 0.1817 $${\pm}$$ 0.0013 — 0.0071,0.0005 0.4707,0.0138 $$\lambda_{\mathrm{\mathbf{Temp.}}}$$and$$\mu_{\mathrm{\mathbf{Temp.}}}$$ 4 $$-235.566 \pm 0.595$$ 479.633 $${\pm}$$ 1.189 0 0.586 0.0839 $${\pm}$$ 0.0017 0.3811,0.0073 0.0797,0.0019 0.3988,0.0041 Models NP logL AICc $$\Delta$$AIC AI$$\omega$$ $$\lambda_{\mathrm{0}}$$ $$\alpha$$ $$\mu_{\mathrm{0}}$$ $$\beta$$ Constant birth–death 2 $$-240.336 \pm 0.620$$ 484.819 $${\pm}$$ 1.240 5.186 0.044 0.2048 $${\pm}$$ 0.0018 — 0.1123,0.0017 — $$\lambda_{\mathrm{Time}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-238.427 \pm 0.624$$ 483.151 $${\pm}$$ 1.249 3.518 0.101 0.1962 $${\pm}$$ 0.0016 –0.0407,0.0004 5.00E-08,5.00E-08 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Time}}$$ 3 $$-239.089 \pm 0.621$$ 484.474 $${\pm}$$ 1.242 4.841 0.052 0.1914 $${\pm}$$ 0.0016 — 0.0549,0.0010 0.0433,0.0006 $$\lambda_{\mathrm{Time}}$$ and $$\mu_{\mathrm{Time}}$$ 4 $$-238.427 \pm 0.624$$ 485.354 $${\pm}$$ 1.249 5.721 0.034 0.1962 $${\pm}$$ 0.0016 –0.0405,0.00004 8.535E-05,3.24E-05 0.0347,0.0024 $$\lambda_{\mathrm{Alti.}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-240.247 \pm 0.572$$ 486.790 $${\pm}$$ 1.144 7.157 0.016 0.0310 $${\pm}$$ 0.0012 0.0004,6.35E-06 0.0794,0.0021 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Alti.}}$$ 3 $$-241.542 \pm 0.577$$ 489.379 $${\pm}$$ 1.155 9.746 0.004 0.1991 $${\pm}$$ 0.0017 — 0.5116,0.0676 –0.00024,1.35E-05 $$\lambda_{\mathrm{Alti.}}$$ and $$\mu_{\mathrm{Alti.}}$$ 4 $$-240.253 \pm 0.582$$ 489.005 $${\pm}$$ 1.164 9.372 0.005 0.0454 $${\pm}$$ 0.0021 0.0003,9.30E-06 0.1016,0.0058 –2.67E-05,6.29E-05 $$\lambda_{\mathrm{Temp.}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-240.605 \pm 0.632$$ 487.505 $${\pm}$$ 1.264 7.872 0.011 0.2121 $${\pm}$$ 0.0021 –0.0242,0.0012 0.0834,0.0011 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Temp.}}$$ 3 $$-238.058 \pm 0.649$$ 482.412 $${\pm}$$ 1.298 2.779 0.146 0.1817 $${\pm}$$ 0.0013 — 0.0071,0.0005 0.4707,0.0138 $$\lambda_{\mathrm{\mathbf{Temp.}}}$$and$$\mu_{\mathrm{\mathbf{Temp.}}}$$ 4 $$-235.566 \pm 0.595$$ 479.633 $${\pm}$$ 1.189 0 0.586 0.0839 $${\pm}$$ 0.0017 0.3811,0.0073 0.0797,0.0019 0.3988,0.0041 Notes: Bold columns represent the best model. The model with positive dependence of speciation and extinction on temperature has the lowest AICc and highest AIC$$\omega$$. Macroevolutionary scenarios were compared in which speciation and extinction vary through time (Time), or by paleo-elevation changes of the Himalaya and Tibetan Plateau (Alti.), and global temperature changes (Temp.). Abbreviations: NP, number of parameters; logL, log-likelihood; AICc, corrected Akaike Information Criterion; $$\Delta$$AICc, the difference in AICc between the model with the lowest AICc and the others; AIC$$\omega$$, Akaike weight. Parameter estimates: $$\lambda_{\mathrm{0}}$$ and $$\mu _{\mathrm{0}}$$, speciation and extinction for environmental value at present; $$\alpha$$ and $$\beta$$, parameter controlling variation of speciation ($$\alpha )$$ and extinction ($$\beta )$$ with paleoenvironment. Table 4. Results from the TESS analyses using paleoenvironment-dependent diversification models Models NP logL AICc $$\Delta$$AIC AIC$$\omega$$ $$\lambda$$ $$\alpha$$ $$\mu$$ $$\beta$$ Constant birth–death 2 $$-240.687$$ 485.425 3.668 0.044 0.2037 — 0.1113 — $$\lambda_{\mathrm{Time}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-238.811$$ 483.775 2.018 0.101 0.1951 $$-0.0402$$ 2.01E-06 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Time}}$$ 3 $$-239.331$$ 484.816 3.059 0.060 0.1898 — 0.0515 0.0467 $$\lambda_{\mathrm{Time}}$$ and $$\mu_{\mathrm{Time}}$$ 4 $$-238.802$$ 485.916 4.159 0.035 0.1951 $$-0.0398$$ 0.0002 0.0330 $$\lambda_{\mathrm{Alti.}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-239.570$$ 485.294 3.537 0.047 0.0483 0.0003 0.09450 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Alti.}}$$ 3 $$-240.416$$ 486.985 5.228 0.020 0.1999 — 0.3730 $$-0.00014$$ $$\lambda_{\mathrm{Alti.}}$$ and $$\mu_{\mathrm{Alti.}}$$ 4 $$-239.378$$ 487.067 5.310 0.019 0.0472 0.0004 0.1169 0.0002 $$\lambda_{\mathrm{Temp.}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-239.570$$ 485.294 3.537 0.047 0.0483 0.0003 0.0945 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Temp.}}$$ 3 $$-237.826$$ 481.806 0.049 0.271 0.1800 — 0.0064 0.5007 $$\lambda$$$$_{\mathrm{\mathbf{Temp.}}}$$and$$\mu$$$$_{\mathrm{\mathbf{Temp.}}}$$ 4 $$-236.723$$ 481.757 0 0.278 0.1383 0.1463 0.0458 0.3483 Models NP logL AICc $$\Delta$$AIC AIC$$\omega$$ $$\lambda$$ $$\alpha$$ $$\mu$$ $$\beta$$ Constant birth–death 2 $$-240.687$$ 485.425 3.668 0.044 0.2037 — 0.1113 — $$\lambda_{\mathrm{Time}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-238.811$$ 483.775 2.018 0.101 0.1951 $$-0.0402$$ 2.01E-06 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Time}}$$ 3 $$-239.331$$ 484.816 3.059 0.060 0.1898 — 0.0515 0.0467 $$\lambda_{\mathrm{Time}}$$ and $$\mu_{\mathrm{Time}}$$ 4 $$-238.802$$ 485.916 4.159 0.035 0.1951 $$-0.0398$$ 0.0002 0.0330 $$\lambda_{\mathrm{Alti.}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-239.570$$ 485.294 3.537 0.047 0.0483 0.0003 0.09450 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Alti.}}$$ 3 $$-240.416$$ 486.985 5.228 0.020 0.1999 — 0.3730 $$-0.00014$$ $$\lambda_{\mathrm{Alti.}}$$ and $$\mu_{\mathrm{Alti.}}$$ 4 $$-239.378$$ 487.067 5.310 0.019 0.0472 0.0004 0.1169 0.0002 $$\lambda_{\mathrm{Temp.}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-239.570$$ 485.294 3.537 0.047 0.0483 0.0003 0.0945 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Temp.}}$$ 3 $$-237.826$$ 481.806 0.049 0.271 0.1800 — 0.0064 0.5007 $$\lambda$$$$_{\mathrm{\mathbf{Temp.}}}$$and$$\mu$$$$_{\mathrm{\mathbf{Temp.}}}$$ 4 $$-236.723$$ 481.757 0 0.278 0.1383 0.1463 0.0458 0.3483 Notes: Bold columns represent the best model. Table 4. Results from the TESS analyses using paleoenvironment-dependent diversification models Models NP logL AICc $$\Delta$$AIC AIC$$\omega$$ $$\lambda$$ $$\alpha$$ $$\mu$$ $$\beta$$ Constant birth–death 2 $$-240.687$$ 485.425 3.668 0.044 0.2037 — 0.1113 — $$\lambda_{\mathrm{Time}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-238.811$$ 483.775 2.018 0.101 0.1951 $$-0.0402$$ 2.01E-06 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Time}}$$ 3 $$-239.331$$ 484.816 3.059 0.060 0.1898 — 0.0515 0.0467 $$\lambda_{\mathrm{Time}}$$ and $$\mu_{\mathrm{Time}}$$ 4 $$-238.802$$ 485.916 4.159 0.035 0.1951 $$-0.0398$$ 0.0002 0.0330 $$\lambda_{\mathrm{Alti.}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-239.570$$ 485.294 3.537 0.047 0.0483 0.0003 0.09450 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Alti.}}$$ 3 $$-240.416$$ 486.985 5.228 0.020 0.1999 — 0.3730 $$-0.00014$$ $$\lambda_{\mathrm{Alti.}}$$ and $$\mu_{\mathrm{Alti.}}$$ 4 $$-239.378$$ 487.067 5.310 0.019 0.0472 0.0004 0.1169 0.0002 $$\lambda_{\mathrm{Temp.}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-239.570$$ 485.294 3.537 0.047 0.0483 0.0003 0.0945 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Temp.}}$$ 3 $$-237.826$$ 481.806 0.049 0.271 0.1800 — 0.0064 0.5007 $$\lambda$$$$_{\mathrm{\mathbf{Temp.}}}$$and$$\mu$$$$_{\mathrm{\mathbf{Temp.}}}$$ 4 $$-236.723$$ 481.757 0 0.278 0.1383 0.1463 0.0458 0.3483 Models NP logL AICc $$\Delta$$AIC AIC$$\omega$$ $$\lambda$$ $$\alpha$$ $$\mu$$ $$\beta$$ Constant birth–death 2 $$-240.687$$ 485.425 3.668 0.044 0.2037 — 0.1113 — $$\lambda_{\mathrm{Time}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-238.811$$ 483.775 2.018 0.101 0.1951 $$-0.0402$$ 2.01E-06 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Time}}$$ 3 $$-239.331$$ 484.816 3.059 0.060 0.1898 — 0.0515 0.0467 $$\lambda_{\mathrm{Time}}$$ and $$\mu_{\mathrm{Time}}$$ 4 $$-238.802$$ 485.916 4.159 0.035 0.1951 $$-0.0398$$ 0.0002 0.0330 $$\lambda_{\mathrm{Alti.}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-239.570$$ 485.294 3.537 0.047 0.0483 0.0003 0.09450 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Alti.}}$$ 3 $$-240.416$$ 486.985 5.228 0.020 0.1999 — 0.3730 $$-0.00014$$ $$\lambda_{\mathrm{Alti.}}$$ and $$\mu_{\mathrm{Alti.}}$$ 4 $$-239.378$$ 487.067 5.310 0.019 0.0472 0.0004 0.1169 0.0002 $$\lambda_{\mathrm{Temp.}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-239.570$$ 485.294 3.537 0.047 0.0483 0.0003 0.0945 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Temp.}}$$ 3 $$-237.826$$ 481.806 0.049 0.271 0.1800 — 0.0064 0.5007 $$\lambda$$$$_{\mathrm{\mathbf{Temp.}}}$$and$$\mu$$$$_{\mathrm{\mathbf{Temp.}}}$$ 4 $$-236.723$$ 481.757 0 0.278 0.1383 0.1463 0.0458 0.3483 Notes: Bold columns represent the best model. Table 5. Results from the TESS analyses using episodic birth–death models Models NP logL AICc $$\Delta$$AIC AIC$$\omega$$ $$\lambda_{\mathrm{1}}$$ $$\lambda_{\mathrm{2}}$$ $$\lambda_{\mathrm{3}}$$ $$\lambda_{\mathrm{4}}$$ $$\lambda_{\mathrm{5}}$$ $$\lambda_{\mathrm{6}}$$ $$\mu_{\mathrm{1}}$$ $$\mu_{\mathrm{2}}$$ $$\mu_{\mathrm{3}}$$ $$\mu_{\mathrm{4}}$$ $$\mu_{\mathrm{5}}$$ $$\mu_{\mathrm{6}}$$ Shift 1 5 $$-237.008$$ 484.542 2.785 0.069 0.052 0.176 — — — — 0.004 0.018 — — — — Shift 2 8 $$-236.086$$ 489.706 7.949 0.005 0.195 0.084 0.180 — — — 0.112 0.097 0.004 — — — Shift 3 11 $$-236.118$$ 497.379 15.622 0.0001 0.180 0.127 0.155 0.181 — — 0.111 0.051 0.159 0.002 — — Shift 4 14 $$-234.251$$ 501.935 20.178 0 0.164 0.159 0.049 0.161 0.190 — 0.136 0.062 0.074 0.058 0.008 — Models NP logL AICc $$\Delta$$AIC AIC$$\omega$$ $$\lambda_{\mathrm{1}}$$ $$\lambda_{\mathrm{2}}$$ $$\lambda_{\mathrm{3}}$$ $$\lambda_{\mathrm{4}}$$ $$\lambda_{\mathrm{5}}$$ $$\lambda_{\mathrm{6}}$$ $$\mu_{\mathrm{1}}$$ $$\mu_{\mathrm{2}}$$ $$\mu_{\mathrm{3}}$$ $$\mu_{\mathrm{4}}$$ $$\mu_{\mathrm{5}}$$ $$\mu_{\mathrm{6}}$$ Shift 1 5 $$-237.008$$ 484.542 2.785 0.069 0.052 0.176 — — — — 0.004 0.018 — — — — Shift 2 8 $$-236.086$$ 489.706 7.949 0.005 0.195 0.084 0.180 — — — 0.112 0.097 0.004 — — — Shift 3 11 $$-236.118$$ 497.379 15.622 0.0001 0.180 0.127 0.155 0.181 — — 0.111 0.051 0.159 0.002 — — Shift 4 14 $$-234.251$$ 501.935 20.178 0 0.164 0.159 0.049 0.161 0.190 — 0.136 0.062 0.074 0.058 0.008 — Notes: The model with positive dependence of speciation and extinction with temperature has the lowest AICc and highest AIC$$\omega$$. We compared macroevolutionary scenarios in which speciation and extinction vary through time (Time), or according to paleo-elevation changes of the Himalaya and Tibetan Plateau (Alti.), according to global temperature changes (Temp.), and episodic birth–death models (Shift). Parameter estimates: $$\lambda _{\mathrm{0}}$$ and $$\mu _{\mathrm{0}}$$, speciation and extinction for the environmental value at present; $$\alpha$$ and $$\beta$$, parameter controlling the variation of the speciation ($$\alpha )$$ and extinction ($$\beta )$$ with the paleoenvironment. $$\lambda_{\mathrm{1}}$$ and $$\mu _{\mathrm{1}}$$, speciation and extinction for the first period of time (present to first shift). NP $$=$$ number of parameters; logL $$=$$ log-likelihood; AICc $$=$$ corrected Akaike Information Criterion; $$\Delta$$AICc $$=$$ the difference in AICc between the model with the lowest AICc and the others; AIC$$\omega =$$ Akaike weight. Table 5. Results from the TESS analyses using episodic birth–death models Models NP logL AICc $$\Delta$$AIC AIC$$\omega$$ $$\lambda_{\mathrm{1}}$$ $$\lambda_{\mathrm{2}}$$ $$\lambda_{\mathrm{3}}$$ $$\lambda_{\mathrm{4}}$$ $$\lambda_{\mathrm{5}}$$ $$\lambda_{\mathrm{6}}$$ $$\mu_{\mathrm{1}}$$ $$\mu_{\mathrm{2}}$$ $$\mu_{\mathrm{3}}$$ $$\mu_{\mathrm{4}}$$ $$\mu_{\mathrm{5}}$$ $$\mu_{\mathrm{6}}$$ Shift 1 5 $$-237.008$$ 484.542 2.785 0.069 0.052 0.176 — — — — 0.004 0.018 — — — — Shift 2 8 $$-236.086$$ 489.706 7.949 0.005 0.195 0.084 0.180 — — — 0.112 0.097 0.004 — — — Shift 3 11 $$-236.118$$ 497.379 15.622 0.0001 0.180 0.127 0.155 0.181 — — 0.111 0.051 0.159 0.002 — — Shift 4 14 $$-234.251$$ 501.935 20.178 0 0.164 0.159 0.049 0.161 0.190 — 0.136 0.062 0.074 0.058 0.008 — Models NP logL AICc $$\Delta$$AIC AIC$$\omega$$ $$\lambda_{\mathrm{1}}$$ $$\lambda_{\mathrm{2}}$$ $$\lambda_{\mathrm{3}}$$ $$\lambda_{\mathrm{4}}$$ $$\lambda_{\mathrm{5}}$$ $$\lambda_{\mathrm{6}}$$ $$\mu_{\mathrm{1}}$$ $$\mu_{\mathrm{2}}$$ $$\mu_{\mathrm{3}}$$ $$\mu_{\mathrm{4}}$$ $$\mu_{\mathrm{5}}$$ $$\mu_{\mathrm{6}}$$ Shift 1 5 $$-237.008$$ 484.542 2.785 0.069 0.052 0.176 — — — — 0.004 0.018 — — — — Shift 2 8 $$-236.086$$ 489.706 7.949 0.005 0.195 0.084 0.180 — — — 0.112 0.097 0.004 — — — Shift 3 11 $$-236.118$$ 497.379 15.622 0.0001 0.180 0.127 0.155 0.181 — — 0.111 0.051 0.159 0.002 — — Shift 4 14 $$-234.251$$ 501.935 20.178 0 0.164 0.159 0.049 0.161 0.190 — 0.136 0.062 0.074 0.058 0.008 — Notes: The model with positive dependence of speciation and extinction with temperature has the lowest AICc and highest AIC$$\omega$$. We compared macroevolutionary scenarios in which speciation and extinction vary through time (Time), or according to paleo-elevation changes of the Himalaya and Tibetan Plateau (Alti.), according to global temperature changes (Temp.), and episodic birth–death models (Shift). Parameter estimates: $$\lambda _{\mathrm{0}}$$ and $$\mu _{\mathrm{0}}$$, speciation and extinction for the environmental value at present; $$\alpha$$ and $$\beta$$, parameter controlling the variation of the speciation ($$\alpha )$$ and extinction ($$\beta )$$ with the paleoenvironment. $$\lambda_{\mathrm{1}}$$ and $$\mu _{\mathrm{1}}$$, speciation and extinction for the first period of time (present to first shift). NP $$=$$ number of parameters; logL $$=$$ log-likelihood; AICc $$=$$ corrected Akaike Information Criterion; $$\Delta$$AICc $$=$$ the difference in AICc between the model with the lowest AICc and the others; AIC$$\omega =$$ Akaike weight. DISCUSSION Seeking the determinants of species richness is a fundamental aspect of biology (Morlon 2014; Benton 2015). However, studies have generally focused on a single factor to explain a given diversity pattern, whether by historical events promoting changes in diversification and extinction (Barnosky 2001; Erwin 2009; Mayhew et al. 2008), or species ecology and clade competitions as the dominant drivers of diversification (Rabosky et al., 2013; Silvestro et al. 2015). The combined influence of various factors has been tested less often (Drummond et al. 2012a; Bouchenak-Khelladi et al. 2015; Lagomarsino et al. 2016), and at a lower geographic and temporal scale or with a reduced set of methods. Further, these methods have mainly been implemented within a ML framework. In this study, we used existing and some novel methods under both ML and BI to evaluate potential factors driving diversification in Parnassiinae: some are related to a RQ mechanistic model of evolution, others to a CJ hypothesis. Coupled Effect of Mountain Building and Climate Change A long-held tenet in biology is that environmental stability leads to greater species richness than a changing environment, suggesting that species interactions are a dominant force (Wallace 1878). A recent debate has emerged between those arguing that physically dynamic ecosystems play a major role in diversification (Hoorn et al. 2013; Condamine et al. 2015b) and even promote adaptive radiation (Tan et al. 2013), and those maintaining that landscape changes have not been important in diversification history, instead advocating the role of large-scale dispersal events as drivers of speciation (Smith et al. 2014). In the Northern Hemisphere, tectonic events such as the closure of Turgai Sea or the building of Alpine and Himalayan orogenies have caused large-scale landscape and climatic changes (Sanmartín et al. 2001). As a result many new habitats were formed that created potential “new ecological spaces” and favored speciation (Xing and Ree 2017). Also, the newly created corridors stimulated species exchange between communities, while also acting as barriers that separated formerly continuous populations (Brikiatis 2014). Alternatively, landscape dynamics (i.e., unlike long-term stable environments) may have shattered former habitats, leading organisms to extinction. The rise of the HTP has often been invoked as providing important opportunities to diversify (Fjeldså et al. 2012; Price et al. 2014; Wen et al. 2014; Favre et al. 2015; Hughes and Atchison 2015; Renner 2016), but its role as a driver has rarely been formally tested with diversification models (Xing and Ree 2017). Our analyses detected higher in situ speciation rates for mountain species than their lowland counterparts (with similar extinction rates), resulting in higher net diversification rates for mountain species. Yet, this pattern was not robust to phylogenetic shape artifacts (Rabosky and Goldberg 2015). Moreover, ML and BI-based paleo-elevation models found that an explicit link between diversification rates and periods of HTP uplift could be rejected in favor of a constant birth–death. These findings contrast with those of high-elevation taxa in the Andes reported to have diversified faster due to the rise of the Andes when compared with their low-elevation relatives (Lagomarsino et al. 2016; Pérez- Escobar et al. 2017). In contrast, clade rate-heterogeneous models (BAMM, RevBayes) and biogeographic analyses supported a link between diversification rates and the HTP orogeny within Parnassiini, particularly in Parnassius. A significant upshift in speciation rates was detected along the stem of Parnassius, coincident with the colonization of the HTP, although DEC inferred several speciation/vicariance events associated with the HTP orogeny (Fig. 4). The latter is corroborated by the relatively high rate of allopatric (lowland/highland) speciation estimated by GeoSSE. These different results can partially be explained by the HTP rise providing novel, high-altitude habitats for colonization in Parnassius, thereby promoting ecological divergence and initial rapid diversification (followed later by allopatric speciation). Yet, diversification was not strictly parallel to periods of mountain building. Conversely, the rise of the HTP—and concomitant climatic changes such as regional cooling and aridification—led to higher extinction rates in non-mountain-adapted Parnassiinae lineages, such as Luehdorfiini and Zerynthiini that survived west and east of the Himalayan range; several events of extinction are inferred within these lineages around the mid-Miocene. This mixed effect is probably responsible for the lack of a significant association between orogeny and speciation in tree-wide models, which is otherwise detected by clade-heterogeneous models. Paleoclimate change is another abiotic factor postulated to be a major trigger of diversification in terrestrial organisms (Erwin 2009). CoMET, TreePar, and the ML-based paleotemperature-dependent model inferred a pattern of increased turnover (background extinction) rates associated with periods of global climate warming, such as the mid-Miocene Climate Optimum (MMCO, 15–17 Ma, Böhme, 2003) or the late Oligocene Warming Event (LOWE, $$\sim$$25 Ma, Zachos et al. 2008, Fig. 2). DEC also reconstructed several events of (geographic) extinction in Central Asia and India along the stem branches subtending the crown diversification of extant genera in Luehdorfiini and Zerynthiini (Fig. 4). These long branches are consistent with periods of high extinction rates related to the LOWE, which were eventually followed by mid-Miocene radiations, especially in Parnassius (the latter probably linked to the HTP rise). Paleotemperature and the HTP orogeny probably had a joint influence in the diversification of Parnassiinae. Mountain building is known to be responsible for regional climate change (Sepulchre et al. 2006; Armijo et al. 2015), and for changes in biodiversity patterns, both directly as a geographic barrier and indirectly through climate change (Hoorn et al. 2013). The HTP acted as a barrier to the influence of the Asian monsoons (Zhisheng et al. 2001) and induced considerable drying of Central Asia (Quade et al. 1989). In Parnassiinae, the HTP rise and subsequent periods of mountain building might have promoted diversification through the provision of “climate refugia”: locations where taxa survived periods of regionally adverse climate. Although usually associated with maintaining biodiversity through glacial–interglacial climate changes (Gavin et al. 2014), mountains can provide climate refugia during warming-aridification events (Migliore et al. 2013; Pokorny et al. 2015). Drastic range contractions within the non-mountain adapted Luehdorfiini and Zerynthiini (Fig. 4) could have been driven by aridification events induced by HTP uplift and mid-Miocene climate warming, with Hypermnestra as a relict group that survived in the Zagros Mountains. In Parnassius, colonization of the HTP coincided with a boost of speciation, followed by dispersal to adjacent areas in the late Miocene-Pliocene. The long stem branches leading to extant Parnassiinae genera are likely the result of climate-driven extinction events associated with Cenozoic warming periods, such as the LOWE and the MMCO. During these periods, Parnassiinae species probably migrated toward cooler temperatures by colonizing the newly uplifted mountain ranges of the Zagros and HTP. This agrees with the climate refugia hypothesis, a pattern currently observed in Himalayan plants, which are shifting their distribution upward along with global warming (Padma 2014). We argue that the HTP acted as climate refugia during global warming events, in which pre-adapted parnassiines survived and diversified before re-colonizing ancestral geographic ranges. It would be interesting to test this hypothesis in other extra-Himalayan groups that diversified during the HTP orogeny (Favre et al. 2015), as well as in other mountains around the world for which paleo-elevation data is available (Fjeldså et al. 2012; Hoorn et al. 2013; Hughes and Atchison 2015). Ecological Interactions via Insect-Plant Diversification and Host-Plant Shifts Biological interactions among distantly related lineages probably played a major role in the diversification of clades over geological time (Van Valen 1973; Liow et al. 2015; Silvestro et al. 2015; Voje et al. 2015). Our analyses identified significant differences in speciation rates associated with different host-plants. Several models (BAMM, MuSSE, and Bayesian implementations of MuSSE and HiSSE in RevBayes) indicated that Parnassius subgenera associated with Crassulaceae-Saxifragaceae and Papaveraceae have significantly higher rates of speciation than their relatives feeding on other host plants (Figs. 5 and 6). Interestingly, the lowest speciation rates are associated with butterflies feeding on Aristolochiaceae, the inferred ancestral host plant of the Parnassiinae (Condamine et al. 2012). Our results concur with the “escape and radiate” hypothesis of Ehrlich and Raven (1964), which predicts that the evolution of a herbivore insect trait (i.e., new tolerance of plant defenses) that allows colonizing/feeding on a novel host plant lineage would lead to a burst of diversification. The highly unbalanced extant species richness between Parnassiini and the clade Luehdorfiini $$+$$ Zerynthiini, explained above by geographic and climate drivers, might also be explained by differential speciation rates related to their feeding habit, as well as the overall low ability of Parnassiinae for shifting between host plants (i.e., transition rates estimated MuSSE are close to zero). Testing this hypothesis would require reconstruction of the diversification history of the host-plant clades; this would allow us, for example, to identify shifts that led to reciprocal changes in diversification in the host plants and to test whether these shifts were accompanied by matching ancestral geographic distributions between herbivore insects and plants at key events (Cruaud et al. 2012). Unfortunately, so far there is no complete phylogeny for any of the large host families for Parnassiinae, including Aristolochiaceae ($$\sim$$500–600 spp.), Papaveraceae ($$\sim$$700–800 spp.), Crassulaceae ($$\sim$$1300–1500 spp.), and Saxifragaceae ($$\sim$$600–700 spp.). However, we found a positive correlation between species richness of these plant lineages and the speciation rates of their insect feeders (MuSSE: R$$^{\mathrm{2}} = 0.96$$, BAMM: R$$^{\mathrm{2}} = 0.87$$; Supplementary Appendix S22 available on Dryad). This suggests that fast-diversifying clades are associated with families offering a higher number of potential hosts (akin to niche availability). Janz et al. (2006) found a similar correlation between host-plant diversity and speciation rates in the Nymphalidae; they posited that recurring oscillations between host-plant expansions (i.e., incorporation of new plants into the feeding repertoire) and host specialization acted as a major driving force behind the diversification of plant-feeding insects. In adaptive radiations, the expectation is an initial rapid burst of diversification, followed by a slowdown in speciation rates as niches are filled up; this is attributed to a diversity-dependence effect (Phillimore and Price 2008; Etienne et al. 2012). A shift to a new ecological resource (e.g., a new host plant) can be an adaptive breakthrough (Ehrlich and Raven 1964), providing higher opportunities for speciation at the start of the radiation, which significantly decrease as species accumulate. Though we did not detect this diversity-dependence effect in Parnassiinae, a pattern of speciation decreasing with standing diversity (and constant extinction rates) was found in Parnassius, supporting the hypothesis of this genus as an example of an adaptive radiation (Rebourg et al. 2006), driven by the evolution of a new host-plant association. The inference of similar extinction rates in clades of Parnassiinae feeding on different host plants does agree with Van Valen’s (1973) RQ hypothesis, which assumes constant extinction probabilities shared by all members of any given higher taxon. Interplay Between the RQ and the CJ Both abiotic and biotic factors drive species diversification, and biological radiations require both extrinsic conditions and intrinsic traits—acting in unison or sequentially over time—for success (Donoghue and Sanderson 2015). A lack of suitable methods and data has resulted in few studies attempting to merge both types of factors to explain evolutionary radiations (Bouchenak-Khelladi et al. 2015; Liow et al. 2015; Lagomarsino et al. 2016). Herein, we argue that RQ and CJ-type factors are intimately linked, and they both promote and prevent species diversification. Multiple factors explain the diversification of Apollo butterflies, and the effect of these factors differed across clades. Past changes in climate and elevation during active orogenic periods jointly mediated the rapid radiation of mountain-dwelling Parnassius, but also triggered extinction events in non-mountain adapted clades Luehdorfiini and Zerynthiini. The ancestor of Luehdorfiini and Zerynthiini was reconstructed in the Oligocene as occupying a broad distribution range extending from the Caucasus region, the Iranian Plateau and the Zagros Mountains, through Central Asia to India (red box in Fig. 4). Extinction in Central Asia and India around the Oligocene-Miocene, followed by independent eastward/northward colonizations by Luehdorfiini and Zerynthiini, could explain the current disjunct distribution pattern in these 2 tribes. Our study also demonstrates that ecological traits and biotic interactions such as host-plant associations and clade-specific diversity-dependence may confer differential diversification capacity in closely related species co-occurring in the same rapidly changing environment. In particular, the radiation of Parnassius in the HTP seems to have been fostered by the joint effect of climate and geological changes associated with new biotic interactions. Mountain uplift promoted speciation by allopatric speciation, but also increased habitat heterogeneity, and led ultimately to the formation of new niches (ecological divergence). The colonization of these niches led to new associations with mountain host species, and potentially to the evolution of new morphological adaptive traits, for example, denser setae on the body or altitudinal melanism. Although it is difficult to tease apart the role of key innovations (new traits that evolved allowing the invasion of a new niche) on ecological opportunity (the formation of a new environment that confers selective advantage fitness for the traits), a combination of extrinsic and intrinsic factors seems to have been necessary for the success of the Parnassius radiation. Parnassiinae species feed mostly on plant lineages of small size whose leaves grow close to the ground and which are characteristic elements of modern temperate vegetation: Aristolochia and Asarum (Aristolochiaceae) for Luehdorfiini and Zerynthiini, Saxifraga and Sedum (Saxifragaceae) and Rhodiola (Crassulaceae) for the subgenus Parnassius, or Corydalis (Papaveraceae) for the other subgenera of Parnassius. Zhang et al. (2014) inferred an origin of Rhodiola (70 spp., mid-Miocene; 12.1 Ma, 6.3–20.2 Ma) and Sedum (420 spp., late Oligocene) in the HTP, later spreading to the adjacent regions. Similarly, Ebersbach et al. (2017) placed the colonization of the HTP by genus Saxifraga (370 spp.) in the late Oligocene (20–34 Ma). Therefore, there was a synchronous temporal and geographic origin of Parnassius and its associated host plants in the HTP region. Conversely, Luehdorfiini and Zerynthiini were unable to evolve new traits or establish novel host-plant associations to invade the mountain and/or host-plant niches. This calls for further studies to decipher the genomic baseline for the adaptations to elevation and host plants by comparing Parnassiini and Luehdorfiini/Zerynthiini (Edger et al. 2015). Underlying Model and Inference Framework in Diversification Analyses Models of diversification have traditionally relied on time-continuous (independent) or autocorrelated rates in the way they model rate-shifts, with few studies applying both types for macroevolutionary inferences. Herein, we used both, in particular to estimate speciation and extinction rates depending on an environmental variable that itself varies through time. The first model is an ML implementation with a time-continuous birth–death model (Condamine et al. 2013). For the purpose of this study we implemented Bayesian paleoenvironment-dependent models but with autocorrelated (RevBayes) and time-continuous (TESS) rates of diversification. Our comparison across methods has revealed new aspects on the behavior of the models. The three methods are congruent on the role of orogeny (no effect of orogeny on diversification per se), but disagree on the effect of temperature (Fig. 8). The time-continuous models (RPANDA and TESS) support a positive dependence on speciation and extinction, while the autocorrelated model (RevBayes) shows no dependence. The results suggest that the lack of correlation found by the RevBayes paleoenvironmental model is directly a consequence of the assumption of autocorrelation in rates across time intervals. Thus, the difference lies in the underlying diversification model (autocorrelated vs. time-continuous rates) and not due to the inference framework (Bayesian vs. ML). Our study shows that multiple approaches must be combined to fully address the diversification of clades in relation to abiotic and biotic factors. Rather than a single factor, the joint effect of multiple factors (biogeography, species traits, environmental drivers, and mass extinction) is responsible for current diversity patterns, and we recommend a correspondingly multifaceted strategy for the study of these patterns. SUPPLEMENTARY MATERIAL Data available from the Dryad Digital Repository: http://dx.doi.org/10.5061/dryad.32bp4. FUNDING This work was supported by the Marie Curie Action (EU 7th Framework Programme) [BIOMME project, IOF-627684 to F.L.C. (jointly supervised by F.A.H.S. and I.S.)], by a Natural Sciences and Engineering Research Council of Canada (NSERC), Discovery Grant to F.A.H.S., and by MINECO/FEDER [CGL2015-67489-P to I.S]. Acknowledgements We thank Brian Wiegmann and two anonymous reviewers who provided excellent and constructive comments to improve the study. We also thank Sylvain Piry for help on the script for GenBank, Mark Miller for assistance on the CIPRES cluster. References Alfaro M.E., Santini F., Brock C., Alamillo H., Dornburg A., Rabosky D.L., Carnevale G., Harmon L.J. 2009 . Nine exceptional radiations plus high turnover explain species diversity in jawed vertebrates. Proc. Natl. Acad. Sci. USA 106 : 13410 – 13414 . Google Scholar Crossref Search ADS Antonelli A., Sanmartín I. 2011 . Mass extinction, gradual cooling, or rapid radiation? Reconstructing the spatiotemporal evolution of the ancient angiosperm genus Hedyosmum (Chloranthaceae) using empirical and simulated approaches. Syst. Biol. 60 : 596 – 615 . Google Scholar Crossref Search ADS PubMed Ackery P.R. 1975 . A guide to the genera and species of Parnassiinae (Lepidoptera: Papilionidae). Bull. British Mus. (Nat. Hist.), Entomol. 31 : 71 – 105 . Google Scholar Crossref Search ADS Armijo R., Lacassin R., Coudurier-Curveur A., Carrizo D. 2015 . Coupled tectonic evolution of Andean orogeny and global climate. Earth Sci. Rev. 143 : 1 – 35 . Google Scholar Crossref Search ADS Ayres D.L., Darling A., Zwickl D.J., Beerli P., Holder M.T., Lewis P.O., Huelsenbeck J.P., Ronquist F., Swofford D.L., Cummings M.P., Rambaut A., Suchard M.A. 2012 . BEAGLE: an application programming interface and high-performance computing library for statistical phylogenetics. Syst. Biol. 61 : 170 – 173 . Google Scholar Crossref Search ADS PubMed Baele G., Li W.L.S., Drummond A.J., Suchard M.A., Lemey P. 2013 . Accurate model selection of relaxed clocks in Bayesian phylogenetics. Mol. Biol. Evol. 30 : 239 – 243 . Google Scholar Crossref Search ADS PubMed Barnosky A.D. 2001 . Distinguishing the effects of the Red Queen and Court Jester on Miocene mammal evolution in the northern Rocky Mountains. J. Verteb. Paleontol. 21 : 172 – 185 . Google Scholar Crossref Search ADS Beaulieu J.M., O’Meara B.C. 2016 . Detecting hidden diversification shifts in models of trait- dependent speciation and extinction. Syst. Biol. 65 : 583 – 601 . Google Scholar Crossref Search ADS PubMed Benton M.J. 2009 . The red queen and the court jester: species diversity and the role of biotic and abiotic factors through time. Science 323 : 728 – 732 . Google Scholar Crossref Search ADS PubMed Benton M.J. 2015 . Exploring macroevolution using modern and fossil data. Proc. Biol. Sci. 282 : 20150569 Google Scholar Crossref Search ADS PubMed Böhme M. 2003 . The Miocene Climatic Optimum: evidence from ectothermic vertebrates of Central Europe. Palaeogeogr. Palaeoclim. Palaeoecol. 195 : 389 – 401 . Bollino M., Racheli T. 2012 . Parnassiinae (Partim). Parnassiini (Partim), Luehdorfiini, Zerynthiini. In: Bauer E., Frankenbach T., editors. Butterflies of the World, Part 36, Supplement 20. Keltern , Germany : Goecke und Evers . p. 64 . Bouchenak-Khelladi Y., Onstein R.E., Xing Y., Schwery O., Linder H.P. 2015 . On the complexity of triggering evolutionary radiations. New Phytol. 207 : 313 – 326 . Google Scholar Crossref Search ADS PubMed Bouilhol P., Jagoutz O., Hanchar J.M., Dudas F.O. 2013 . Dating the India–Eurasia collision through arc magmatic records. Earth Planet. Sci. Lett. 366 : 163 – 175 . Brikiatis L. 2014 . The De Geer, Thulean and Beringia routes: key concepts for understanding early Cenozoic biogeography. J Biogeogr. 41 : 1036 – 1054 . Google Scholar Crossref Search ADS Britton T., Anderson C.L., Jacquet D., Lundqvist S., Bremer K. 2007 . Estimating divergence times in large phylogenetic trees. Syst. Biol. 56 : 741 – 752 . Google Scholar Crossref Search ADS PubMed Brock C.D., Harmon L.J., Alfaro M.E. 2011 . Testing for temporal variation in diversification rates when sampling is incomplete and nonrandom. Syst. Biol. 60 : 410 – 419 . Google Scholar Crossref Search ADS PubMed Carpenter F.M. 1992 . Treatise on invertebrate paleontology, Part R, Arthropoda 3–4. Boulder (CO): Geological Society of America. Churkin S. 2006 . A new species of Parnassius Latreille, 1804 from Kyrgyzstan (Lepidoptera, Papilionidae). Helios 7 : 142 – 158 . Collins N.M., Morris M.G. 1985 . Threatened Swallowtail butterflies of the World. The Cambridge: IUCN Red Data Book . Condamine F.L., Nagalingum N.S., Marshall C.R., Morlon H. 2015a . Origin and diversification of living cycads: a cautionary tale on the impact of the branching process prior on Bayesian molecular dating. BMC Evol. Biol. 15 : 65 Google Scholar Crossref Search ADS Condamine F.L., Rolland J., Morlon H. 2013 . Macroevolutionary perspectives to environmental change. Ecol. Lett. 16 : 72 – 85 . Google Scholar Crossref Search ADS PubMed Condamine F.L., Sperling F.A., Wahlberg N., Rasplus J.Y., Kergoat G.J. 2012 . What causes latitudinal gradients in species diversity? Evolutionary processes and ecological constraints on swallowtail biodiversity. Ecol. Lett. 15 : 267 – 277 . Google Scholar Crossref Search ADS PubMed Condamine F.L., Toussaint E.F.A., Clamens A.-L., Genson G., Sperling F.A.H., Kergoat G.J. 2015b . Deciphering the evolution of birdwing butterflies 150 years after Alfred Russell Wallace. Sci. Rep. 5 : 11860 Google Scholar Crossref Search ADS Cruaud A., Rønsted N., Chantarasuwan B., Chou L.S., Clement W., Couloux A., Cousins B., Forest F., Genson G., Harrison R.D., Hossaert-McKey M., Jabbour-Zahab R., Jousselin E., Kerdelhué C., Kjellberg F., Lopez-Vaamonde C., Peebles J., Pereira R.A.S, Schramm T., Ubaidillah R., van Noort S., Weiblen G.D., Yang D.R, Yan-Qiong P., Yodpinyanee A., Libeskind-Hadas R., Cook J.M., Rasplus J.Y., Savolainen V. 2012 . An extreme case of plant-insect co-diversification: figs and fig-pollinating wasps. Syst. Biol. 61 : 1029 – 1047 . Google Scholar Crossref Search ADS PubMed Cusimano N., Renner S.S. 2010 . Slowdowns in diversification rates from real phylogenies may be not real. Syst. Biol. 59 : 458 – 464 . Google Scholar Crossref Search ADS PubMed Dapporto L. 2010 . Speciation in Mediterranean refugia and post-glacial expansion of Zerynthia polyxena (Lepidoptera, Papilionidae). J. Zoolog. Syst. Evol. Res. 48 : 229 – 237 . Davis M.P., Midford P.E., Maddison W.P. 2013 . Exploring power and parameter estimation of the BiSSE method for analysing species diversification. BMC Evol. Biol. 13 : 38 Google Scholar Crossref Search ADS PubMed de Jong R. 2007 . Estimating time and space in the evolution of the Lepidoptera. Tijdschrift v. Entomol. 150 : 319 – 346 . Google Scholar Crossref Search ADS DeChaine E.G., Martin A.P. 2004 . Historic cycles of fragmentation and expansion in Parnassius smintheus (Papilionidae) inferred using mitochondrial DNA. Evolution 58 : 113 – 127 . Google Scholar Crossref Search ADS PubMed Donoghue M.J., Doyle J.A., Gauthier J., Kluge A.G., Rowe T. 1989 . The importance of fossils in phylogeny reconstruction. Annu. Rev. Ecol. Syst. 20 : 431 – 460 . Google Scholar Crossref Search ADS Donoghue MJ, Sanderson MJ. 2015 . Confluence, synnovation, and depauperons in plant diversification. New Phytol. 207 : 260 – 274 . Google Scholar Crossref Search ADS PubMed Drummond A., Suchard M.A., Xie D., Rambaut A. 2012b. Bayesian phylogenetics with BEAUti and the BEAST 1.7. Mol. Biol. Evol. 29 : 1969 – 1973 . Google Scholar Crossref Search ADS Drummond A.J., Ho S.Y.W., Phillips M.J., Rambaut A. 2006 . Relaxed phylogenetics and dating with confidence. PLoS Biol. 4 : e88 . Google Scholar Crossref Search ADS PubMed Drummond C.S., Eastwood R.J., Miotto S.T., Hughes C.E. 2012a . Multiple continental radiations and correlates of diversification in Lupinus (Leguminosae): testing for key innovation with incomplete taxon sampling. Syst. Biol. 61 : 443 – 460 . Google Scholar Crossref Search ADS Durden C.J., Rose H. 1978 . Butterflies from the middle Eocene: the earliest occurrence of fossil Papilionidae. Prarce-Sellards Ser. Tax. Mem. Mus. 29 : 1 – 25 . Ebersbach J., Muellner-Riehl A.N., Michalak I., Tkach N., Hoffmann M.H., Röser M., Sun H., Favre A. 2017 . In and out of the Qinghai-Tibet Plateau: divergence time estimation and historical biogeography of the large arctic-alpine genus Saxifraga L. J. Biogeogr. 44 : 900 – 910 . Edger P.P., Heidel-Fischer H.M., Bekaert M., Rota J., Glöckner G., Platts A.E., Heckel D. G., Der J., Wafula E., Tang M., Hofberger J.A., Smithson A., Hall J.C., Blanchette M., Bureau T.E., Wright S.I., dePamphilis C.W., Schranz M.E., Barker M.S., Conant G.C., Wahlberg N., Vogel H., Pires J.C., Wheat C.W. 2015 . The butterfly plant arms-race escalated by gene and genome duplications. Proc. Natl. Acad. Sci. USA 112 : 8362 – 8366 . Google Scholar Crossref Search ADS Edwards E.J., Osborne C.O., Stromberg C.A.E., Smith S.A., the C$$_{\mathrm{4}}$$ Grasses Consortium. 2010 . The origins of C4 grasslands: integrating evolutionary and ecosystem science. Science 328 : 587 – 591 . Ehrlich P.R., Raven P.H. 1964 . Butterflies and plants: a study in coevolution. Evolution 18 : 586 – 608 . Google Scholar Crossref Search ADS Erwin D.H. 2009 . Climate as a driver of evolutionary change. Curr. Biol. 19 : R575 – R583 . Google Scholar Crossref Search ADS PubMed Etienne R.S., Haegeman B., Stadler T., Aze T., Pearson P.N., Purvis A., Phillimore A.B. 2012 . Diversity-dependence brings molecular phylogenies closer to agreement with the fossil record. Proc. Biol. Sci. 279 : 1300 – 1309 . Google Scholar Crossref Search ADS PubMed Ezard T.H.G., Aze T., Pearson P.N., Purvis A. 2011 . Interplay between changing climate and species’ ecology drives macroevolutionary dynamics. Science 332 : 349 – 351 . Google Scholar Crossref Search ADS PubMed Ezard T.H.G., Quental T.B., Benton M.J. 2016 . The challenges to inferring the regulators of biodiversity in deep time. Philos. Trans. R. Soc. B 371 : 20150216 Google Scholar Crossref Search ADS Favre A., Päckert M., Pauls S.U., Jähnig S.C., Uhl D., Michalak I., Muellner-Riehl A.N. 2015. The role of the uplift of the Qinghai-Tibetan Plateau for the evolution of Tibetan biotas. Biol. Rev. 90 : 236 – 253 . Crossref Search ADS FitzJohn R.G. 2012 . Diversitree: comparative phylogenetic analyses of diversification in R. Methods Ecol. Evol. 3 : 1084 – 1092 . Google Scholar Crossref Search ADS FitzJohn R.G., Maddison W.P., Otto S.P. 2009 . Estimating trait-dependent speciation and extinction rates from incompletely resolved phylogenies. Syst. Biol. 58 : 595 – 611 . Google Scholar Crossref Search ADS PubMed Fjeldså J., Bowie R.C.K., Rahbek C. 2012 . The role of mountain ranges in the diversification of birds. Annu. Rev. Ecol. Evol. Syst. 43 : 249 – 265 . Google Scholar Crossref Search ADS Frankenbach T., Bollino M., Racheli T. 2012 . Papilionidae XIV: Hypermnestra, Luehdorfiini, Zerynthiini. In: Bauer E., Frankenbach T., editors. Butterflies of the World, Part 36. Keltern , Germany : Goecke und Evers . p. 26 . Gavin D.G., Fitzpatrick M.C., Gugger P.F., Heath K.D., Rodríguez-Sánchez F., Dobrowski S.Z., Hampe A., Hu F.S., Ashcroft M.B., Bartlein P.J., Blois J.L., Carstens B.C., Davis E.B., de Lafontaine G., Edwards M.E., Fernandez M., Henne P.D., Herring E.M., Holden Z.A., Kong W.S., Liu J., Magri D., Matzke N.J., McGlone M.S., Saltré F., Stigall A.L., Tsai Y.H., Williams J.W. 2014 . Climate refugia: joint inference from fossil records, species distribution models and phylogeography. New Phytol. 204 : 37 – 54 . Google Scholar Crossref Search ADS PubMed Gernhard T. 2008 . The conditioned reconstructed process. J. Theor. Biol. 253 : 769 – 778 . Google Scholar Crossref Search ADS PubMed Givnish T.J. Spalink D., Ames M., Lyon S.P., Hunter S.J., Zuluaga A., Iles W.J.D., Clements M.A., Arroyo M.T.K., Leebens-Mack J., Endara L., Kriebel R., Neubig K.M., Whitten W.M., Williams N.H., Cameron K.M. 2015 . Orchid phylogenomics and multiple drivers of their extraordinary diversification. Proc. Biol. Sci. 282 : 20151553 Google Scholar Crossref Search ADS Goldberg E.E., Lancaster L.T., Ree R.H. 2011 . Phylogenetic inference of reciprocal effects between geographic range evolution and diversification. Syst. Biol. 60 : 451 – 465 . Google Scholar Crossref Search ADS PubMed Gradstein F.M, Ogg J.G., Schmitz M., Ogg G. 2012 . The geologic time scale 2012. Boston : Elsevier . p. 1176 . Gratton P., Konopiński M.K., Sbordoni V. 2008 . Pleistocene evolutionary history of the clouded Apollo (Parnassius mnemosyne): genetic signatures of climate cycles and a ‘time-dependent’ mitochondrial substitution rate. Mol. Ecol. 17 : 4248 – 4262 . Google Scholar Crossref Search ADS PubMed Hiura L. 1980 . A phylogeny of the genera of Parnassiinae based on analysis of wing pattern, with description of a new genus (Lepidoptera: Papilionidae). Bull. Osaka Mus. Natural Hist. 33 : 71 – 85 . Ho S.Y.W., Phillips M.J. 2009 . Accounting for calibration uncertainties in phylogenetic estimation of evolutionary divergence times. Syst. Biol. 58 : 367 – 380 . Google Scholar Crossref Search ADS PubMed Höhna S. 2015 . The time-dependent reconstructed evolutionary process with a key-role for mass-extinction events. J. Theoret. Biol. 380 : 321 – 331 . Google Scholar Crossref Search ADS Höhna S., Landis M.J., Heath T.A., Boussau B., Lartillot N., Moore B.R., Huelsenbeck J.P., Ronquist F. 2016a . RevBayes: Bayesian phylogenetic inference using graphical models and an interactive model-specification language. Syst. Biol. 65 : 726 – 736 . Google Scholar Crossref Search ADS Höhna S., May M.R., Moore B.R. 2016b . TESS: an R package for efficiently simulating phylogenetic trees and performing Bayesian inference of lineage diversification rates. Bioinformatics 32 : 789 – 791 . Google Scholar Crossref Search ADS Höhna S., Stadler T., Ronquist F., Britton T. 2011 . Inferring speciation and extinction rates under different sampling schemes. Mol. Biol. Evol. 28 : 2577 – 2589 . Google Scholar Crossref Search ADS PubMed Hoorn C., Mosbrugger V., Mulch A., Antonelli A. 2013 . Biodiversity from mountain building. Nature Geosci . 6 : 154 Huelsenbeck J.P., Larget B., Alfaro M.E. 2004 . Bayesian phylogenetic model selection using reversible jump Markov chain Monte Carlo. Mol. Biol. Evol. 21 : 1123 – 1133 . Google Scholar Crossref Search ADS PubMed Hughes C.E., Atchison G.W. 2015 . The ubiquity of alpine plant radiations: from the Andes to the Hengduan Mountains. New Phytol. 207 : 275 – 282 . Google Scholar Crossref Search ADS PubMed Hutter C.R., Guayasamin J.M., Wiens J.J. 2013 . Explaining Andean megadiversity: the evolutionary and ecological causes of glassfrog elevational richness patterns. Ecol. Lett. 16 : 1135 – 1144 . Google Scholar Crossref Search ADS PubMed Janz N., Nylin S., Wahlberg N. 2006 . Diversity begets diversity: host expansions and the diversification of plant-feeding insects. BMC Evol. Biol. 6 : 4 Google Scholar Crossref Search ADS PubMed Jønsson K.A., Fabre P.H., Fritz S.A., Etienne R.S., Ricklefs R.E., Jørgensen T.B., Fjeldså J., Rahbek C., Ericson P.G.P., Woogg F., Pasquet E., Irestedt M. 2012 . Ecological and evolutionary determinants for the adaptive radiation of the Madagascan vangas. Proc. Natl. Acad. Sci. USA 109 : 6620 – 6625 . Google Scholar Crossref Search ADS Kass R.E., Raftery A.E. 1995 . Bayes factors. J. Am. Stat. Assoc. 90 : 773 – 795 . Google Scholar Crossref Search ADS Katoh K., Standley D.M. 2013 . MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30 : 772 – 780 . Google Scholar Crossref Search ADS PubMed Katoh T., Chichvarkin A., Yagi T., Omoto K. 2005 . Phylogeny and evolution of butterflies of the genus Parnassius: inferences from mitochondrial 16S and ND1 sequences. Zool. Sci. 22 : 343 – 351 . Google Scholar Crossref Search ADS PubMed Kergoat G.J., Bouchard P., Clamens A.-L., Abbate J.L., Jourdan H., Jabbour-Zahab R., Genson G., Soldati L., Condamine F.L. 2014 . Cretaceous environmental changes led to high extinction rates in a hyperdiverse beetle family. BMC Evol. Biol. 14 : 220 Google Scholar Crossref Search ADS PubMed Keyghobadi N, Roland J, Strobeck C. 2005 . Genetic differentiation and gene flow among populations of the alpine butterfly, Parnassius smintheus, vary with landscape connectivity. Mol. Ecol. 14 : 1897 – 1909 . Google Scholar Crossref Search ADS PubMed Kocman S. 2009 . Parnassius of Tibet and the adjacent areas. Pardubice (Czech Republic) : Vadim Tshikolovets . Lagomarsino L.P., Condamine F.L., Antonelli A., Mulch A., Davis C.C. 2016 . The abiotic and biotic drivers of rapid diversification in Andean bellflowers (Campanulaceae). New Phytol. 210 : 1430 – 1442 . Google Scholar Crossref Search ADS PubMed Lanfear R., Calcott B., SYW, Ho Guindon S. 2012 . PartitionFinder: Combined selection of partitioning schemes and substitution models for phylogenetic analyses. Mol. Biol. Evol. 29 : 1695 – 1701 . Google Scholar Crossref Search ADS PubMed Lewis P.O. 2001 . A likelihood approach to estimating phylogeny from discrete morphological character data. Syst. Biol. 50 : 913 – 925 . Google Scholar Crossref Search ADS PubMed Liow L.H., Reitan T., Harnik P.G. 2015 . Ecological interactions on macroevolutionary time scales: clams and brachiopods are more than ships that pass in the night. Ecol. Lett. 18 : 1030 – 1039 . Google Scholar Crossref Search ADS PubMed Liu Z., Pagani M., Zinniker D., DeConto R., Huber M., Brinkhuis H., Shah S.R., Leckie R.M., Pearson A. 2009. Global cooling during the Eocene-Oligocene climate transition. Science 323 : 1187 – 1190 . Crossref Search ADS PubMed Maddison W.P., FitzJohn R.G. 2015 . The unsolved challenge to phylogenetic correlation tests for categorical characters. Syst. Biol. 64 : 127 – 136 . Google Scholar Crossref Search ADS PubMed Maddison W.P., Midford P.E., Otto S.P. 2007 . Estimating a binary character’s effect on speciation and extinction. Syst. Biol. 56 : 701 – 710 . Google Scholar Crossref Search ADS PubMed Magallón S., Gómez-Acevedo S., Sánchez-Reyes L.L., Hernández-Hernández T. 2015 . A metacalibrated time-tree documents the early rise of flowering plant phylogenetic diversity. New Phytol. 207 : 437 – 453 . Google Scholar Crossref Search ADS PubMed Matzke N.J. 2014 . Model selection in historical biogeography reveals that founder-event speciation is a crucial process in island clades. Syst. Biol. 63 : 951 – 970 . Google Scholar Crossref Search ADS PubMed May M.R., Höhna S., Moore B.R. 2016 . A Bayesian approach for detecting the impact of mass-extinction events on molecular phylogenies when rates of lineage diversification may vary. Methods Ecol. Evol. 7 : 947 – 959 . Google Scholar Crossref Search ADS Mayhew P.J., Jenkins G.N., Benton T.G. 2008 . A long-term association between global temperature and biodiversity, origination and extinction in the fossil record. Proc. Biol. Sci. 275 : 47 – 53 . Google Scholar Crossref Search ADS PubMed Meredith R.W., Janečka J.E., Gatesy J., Ryder O.A., Fisher C.A., Teeling E.C., Goodbla A., Eizirik E., Simão T.L.L, Stadler T., Rabosky D.L., Honeycutt R.L., Flynn J.J., Ingram C.M., Steiner C., Williams T.L., Robinson T.J., Burk-Herrick A., Westerman M., Ayoub N.A., Springer M.S., Murphy W.J. 2011 . Impacts of the Cretaceous Terrestrial Revolution and KPg extinction on mammal diversification. Science 334 : 521 – 524 . Google Scholar Crossref Search ADS PubMed Meseguer A. S., Lobo J.M., Ree R.H., Beerling D.J., Sanmartín I. 2014 . Integrating fossils, phylogenies, and niche models into biogeography to reveal ancient evolutionary history: the case of Hypericum (Hypericaceae). Syst. Biol. 64 : 215 – 232 . Google Scholar Crossref Search ADS PubMed Michel F., Rebourg C., Cosson E., Descimon H. 2008 . Molecular phylogeny of Parnassiinae butterflies (Lepidoptera: Papilionidae) based on the sequences of four mitochondrial DNA segments. Ann. Soc. Entomol. Fr. 44 : 1 – 36 . Google Scholar Crossref Search ADS Migliore J., Baumel A., Juin M., Fady B., Roig A., Duong N., Médail F. 2013 . Surviving in Mountain climate refugia: New insights from the genetic diversity and structure of the relict shrub Myrtus nivellei (Myrtaceae) in the Sahara Desert. PLoS One 8 : e73795 . Google Scholar Crossref Search ADS PubMed Miller M.A., Schwartz T., Pickett B.E., He S., Klem E.B., Scheuermann R.H., Passarotti M., Kaufman S., O’Leary M.A. 2015 . A RESTful API for access to phylogenetic tools via the CIPRES Science Gateway. Evol. Bioinform. 11 : 43 – 48 . Google Scholar Crossref Search ADS Moen D., Morlon H. 2014 . Why does diversification slow down? Trends Ecol. Evol. 29 : 190 – 197 . Möhn E. 2005 . Papilionidae XII: Parnassius apollo III. In: Bauer E., Frankenbach T., editors. Butterflies of the World, Part 23. Keltern , Germany : Goecke und Evers . p. 33 . Moore B.R., Höhna S., May M.R., Rannala B., Huelsenbeck J.P. 2016 . Critically evaluating the theory and performance of Bayesian analysis of macroevolutionary mixtures. Proc. Natl. Acad. Sci. USA 113 : 9569 – 9574 . Google Scholar Crossref Search ADS Morlon H. 2014 . Phylogenetic approaches for studying diversification Ecol. Lett. 17 : 508 – 525 . Morlon H., Parsons T.L., Plotkin J. 2011 . Reconciling molecular phylogenies with the fossil record. Proc. Natl. Acad. Sci. USA 108 : 16327 – 16332 . Google Scholar Crossref Search ADS Nazari V., Sperling F.A.H. 2007 . Mitochondrial DNA divergence and phylogeography in western Palaearctic Parnassiinae (Lepidoptera: Papilionidae): how many species are there? Insect. Syst. Evol. 38 : 121 – 138 . Nazari V., Zakharov E.V., Sperling F.A.H. 2007 . Phylogeny, historical biogeography, and taxonomic ranking of Parnassiinae (Lepidoptera: Papilionidae) based on morphology and seven genes. Mol. Phylogenet. Evol. 42 : 131 – 156 . Google Scholar Crossref Search ADS PubMed Nee S., May R.M., Harvey P.H. 1994 . The reconstructed evolutionary process. Phil. Trans. R. Soc. B 344 : 305 – 311 . Google Scholar Crossref Search ADS Ng J., Smith S.D. 2014 . How traits shape trees: new approaches for detecting character state-dependent lineage diversification. J. Evol. Biol. 27 : 2035 – 2045 . Google Scholar Crossref Search ADS PubMed Omoto K., Katoh T., Chichvarkin A., Yagi T. 2004 . Molecular evolution of the ‘Apollo’ butterflies of the genus Parnassius (Lepidoptera: Papilionidae). Gene 326 : 141 – 147 . Google Scholar Crossref Search ADS PubMed Omoto K., Yonezawa T., Shinkawa T. 2009 . Molecular systematics and evolution of the recently discovered “Parnassian” butterfly (Parnassius davydovi Churkin, 2006) and its allied species (Lepidoptera, Papilionidae). Gene 441 : 80 – 88 . Google Scholar Crossref Search ADS PubMed Padma T.V. 2014 . Himalayan plants seek cooler climes. Nature 512 : 359 Google Scholar Crossref Search ADS PubMed Pérez-Escobar O.A., Chomicki G., Condamine F.L., Karremans A.P., Bogarín D., Matzke N.J., Silvestro D., Antonelli A. 2017 . Recent origin and rapid speciation of Neotropical orchids in the world’s richest plant biodiversity hotspot. New Phytol. 215 : 891 – 905 . Google Scholar Crossref Search ADS PubMed Phillimore A.B., Price T.D. 2008 . Density-dependent cladogenesis in birds. PLoS Biol. 6 : e71 . Google Scholar Crossref Search ADS PubMed Pokorny L., Riina R., Mairal M., Meseguer A.S., Culshaw V., Cendoya J., Serrano M., Carbajal R., Ortiz S., Heuertz M., Sanmartín I. 2015 . Living on the edge: timing of Rand Flora disjunctions congruent with ongoing aridification in Africa. Frontiers Genet. 6 : 154 Google Scholar Crossref Search ADS Pound M.J., Haywood A.M., Salzmann U., Riding J.B. 2012 . Global vegetation dynamics and latitudinal temperature gradients during the Mid to Late Miocene (15.97–5.33 Ma). Earth Sci. Rev. 112 : 1 – 22 . Google Scholar Crossref Search ADS Price T.D., Hooper D.M., Buchanan C.D., Johansson U.S., Tietze D.T., Alström P., Olsson U., Ghosh-Harihar M., Ishtiaq F., Gupta S.K., Martens J., Harr B., Singh P., Mohan D. 2014 . Niche filling slows the diversification of Himalayan songbirds. Nature 509 : 222 – 225 . Google Scholar Crossref Search ADS PubMed Quade J., Cerling T., Bowman J. 1989 . Development of Asian monsoon revealed by marked ecological shift during the latest Miocene in Northern Pakistan. Nature 342 : 163 – 166 . Google Scholar Crossref Search ADS Rabosky D.L., Donnellan S.C., Grundler M., Lovette I.J. 2014a . Analysis and visualization of complex macroevolutionary dynamics: an example from Australian scincid lizards. Syst. Biol. 63 : 610 – 627 . Google Scholar Crossref Search ADS Rabosky D.L., Goldberg E.E. 2015 . Model inadequacy and mistaken inferences of trait-dependent speciation. Syst. Biol. 64 : 340 – 355 . Google Scholar Crossref Search ADS PubMed Rabosky D.L., Grundler M., Anderson C., Shi J.J., Brown J.W., Huang H., Larson J.G. 2014b . BAMMtools: an R package for the analysis of evolutionary dynamics on phylogenetic trees. Methods Ecol. Evol. 5 : 701 – 707 . Google Scholar Crossref Search ADS Rabosky D.L., Lovette, I.J. 2008 . Explosive evolutionary radiations: decreasing speciation or increasing extinction through time? Evolution 62 : 1866 – 1875 . Google Scholar Crossref Search ADS PubMed Rabosky D.L., Mitchell J.S., Chang J. 2017 . Is BAMM flawed? Theoretical and practical concerns in the analysis of multi-rate diversification models. Syst. Biol. 66 : 477 – 498 . Google Scholar Crossref Search ADS PubMed Rabosky D.L., Santini F., Eastman J.M., Smith S.A., Sidlauskas B., Chang J., Alfaro M.E. 2013 . Rates of speciation and morphological evolution are correlated across the largest vertebrate radiation. Nat. Commun. 4 : 1958 Google Scholar Crossref Search ADS PubMed Rambaut A., Drummond A.J. 2016 . Tracer v1.6. Available from: URL http://beast.bio.ed.ac.uk/Tracer. Rasnitsyn A.P., Zherikhin V.V. 2002 . Appendix: alphabetic list of selected insect fossil sites. Impression fossils. In: Rasnitsyn A.P., Quicke D.L.J., editors. History of insects. Dordrecht, Boston & London : Kluwer Academic Publishers . p. 437 – 444 . Rebel H. 1898 . Fossile Lepidopteren aus der Miocänformation von Gabbro. Sitzungsberichte der Kaiserlichen Akademie der Wissenschaften. Mathematisch-Naturwissenschaftliche Classe 107 : 731 – 745 . Rebourg C., Péténian F., Cosson E., Faure E. 2006 . Patterns of speciation and adaptive radiation in Parnassius butterflies. J. Entomol. 3 : 204 – 215 . Google Scholar Crossref Search ADS Ree R.H., Sanmartín I. 2018 . Conceptual and statistical problems with the DEC+J model of founder-event speciation and its comparison with DEC via model selection. J. Biogeogr. doi.org/10.1111/jbi.13173. Ree R.H., Smith S.A. 2008 . Maximum-likelihood inference of geographic range evolution by dispersal, local extinction, and cladogenesis. Syst. Biol. 57 : 4 – 14 . Google Scholar Crossref Search ADS PubMed Renner S.S. 2016 . Available data point to a 4-km-high Tibetan Plateau by 40 Ma, but 100 molecular-clock papers have linked supposed recent uplift to young node ages. J. Biogeogr. 43 : 1479 – 1487 . Google Scholar Crossref Search ADS Revell, L.J. 2012 . phytools: an R package for phylogenetic comparative biology (and other things). Methods Ecol. Evol. 3 : 217 – 223 . Google Scholar Crossref Search ADS Roland J., Keyghobadi N., Fownes S. 2000 . Alpine Parnassius butterfly dispersal: effects of landscape and population size. Ecology 81 : 1642 – 1653 . Google Scholar Crossref Search ADS Rolland J., Condamine F.L., Jiguet F., Morlon H. 2014 . Faster speciation and reduced extinction in the tropics contribute to the mammalian latitudinal diversity gradient. PLoS Biol. 12 : e1001775 . Google Scholar Crossref Search ADS PubMed Ronquist F., Klopfstein S., Vilhelmsen L., Schulmeister S., Murray D.L., Rasnitsyn A.P. 2012a . A total-evidence approach to dating with fossils, applied to the early radiation of the Hymenoptera. Syst. Biol. 61 : 973 – 999 . Google Scholar Crossref Search ADS Ronquist F., Sanmartín I. 2011 . Phylogenetic methods in biogeography. Annu. Rev. Ecol. Evol. Syst. 42 : 441 – 464 . Google Scholar Crossref Search ADS Ronquist F., Teslenko M., van der Mark P., Ayres D.L., Darling A., Höhna S., Larget B., Liu L., Suchard M.A., Huelsenbeck J.P. 2012b . MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space. Syst. Biol. 61 : 539 – 542 . Google Scholar Crossref Search ADS Ruane S., Bryson R.W. Jr ., Pyron R.A., Burbrink F.T. 2014 . Coalescent species delimitation in milksnakes (genus Lampropeltis) and impacts on phylogenetic comparative analyses. Syst. Biol. 63 : 231 – 250 . Google Scholar Crossref Search ADS PubMed Sanmartín I., Enghoff H., Ronquist F. 2001 . Patterns of animal dispersal, vicariance and diversification in the Holarctic. Biol. J. Linn. Soc. 73 : 345 – 390 . Google Scholar Crossref Search ADS Sanmartín I., Meseguer A.S. 2016 . Extinction in phylogenetics and biogeography: From timetrees to patterns of biotic assemblage. Front. Genet. 7 : 35 Google Scholar Crossref Search ADS PubMed Sauquet H., Ho S.Y.W., Gandolfo M.A., Jordan G.J., Wilf P., Cantrill D.J., Bayly M.J., Bromham L., Brown G.K., Carpenter R.J., Lee D.M., Murphy D.J., Sniderman J.M.K., Udovicic F. 2012 . Testing the impact of calibration on molecular divergence times using a fossil-rich group: the case of Nothofagus (Fagales). Syst. Biol. 61 : 289 – 313 . Google Scholar Crossref Search ADS PubMed Schnitzler J., Barraclough T.G., Boatwright J.S., Goldblatt P., Manning J.C., Powell M.P., Rebelo T., Savolainen V. 2011 . Causes of plant diversification in the Cape biodiversity hotspot of South Africa. Syst. Biol. 60 : 343 – 357 . Google Scholar Crossref Search ADS PubMed Schoville S.D., Roderick G.K. 2009 . Alpine biogeography of Parnassian butterflies during Quaternary climate cycles in North America. Mol. Ecol. 18 : 3471 – 3485 . Google Scholar Crossref Search ADS PubMed Scriber J. M. 1984 . Larval foodplant utilization by the world Papilionidae (Lep.): latitudinal gradients reappraised. Acta Rhopaloc. 6 : 1 – 50 . Scriber J.M., Tsubaki Y., Lederhouse R.C. 1995 . Swallowtail butterflies: their ecology and evolutionary biology. Gainesville (FL) : Scientific Publishers . Scudder S.H. 1875 . Fossil butterflies. Mem. Am. Assoc. Advanc. Sci. 1 : 1 – 99 . Sepulchre P., Ramstein G., Fluteau F., Schuster M., Tiercelin J.-J., Brunet M. 2006 . Tectonic uplift and Eastern Africa aridification. Science 313 : 1419 – 1423 . Google Scholar Crossref Search ADS PubMed Silvestro D., Antonelli A., Salamin N., Quental T.B. 2015 . The role of clade competition in the diversification of North American canids. Proc. Natl. Acad. Sci. USA 112 : 8684 – 8689 . Google Scholar Crossref Search ADS Smith B.T., McCormack J.E., Cuervo A.M., Hickerson M.J., Aleixo A., Cadena C.D., Pérez-Emán J., Burney C.W., Xie X., Harvey M.G., Faircloth B.C., Glenn T.C., Derryberry E.P., Prejean J., Fields S., Brumfield R.T. 2014 . The drivers of tropical speciation. Nature 515 : 406 – 409 . Google Scholar Crossref Search ADS PubMed Smith M.E., Singer B., Carroll A. 2003 . $$^{\mathrm{40}}$$Ar/$$^{\mathrm{39}}$$Ar geochronology of the Eocene Green River Formation, Wyoming. Geol. Soc. Am. Bull. 115 : 549 – 565 . Google Scholar Crossref Search ADS Sohn J.-C., Labandeira C., Davis D., Mitter C. 2012 . An annotated catalog of fossil and subfossil Lepidoptera (Insecta: Holometabola) of the world. Zootaxa 3286 : 1 – 132 . Sohn J.-C., Labandeira C.C., Davis D.R. 2015 . The fossil record and taphonomy of butterflies and moths (Insecta, Lepidoptera): implications for evolutionary diversity and divergence-time estimates. BMC Evol. Biol. 15 : 12 Google Scholar Crossref Search ADS PubMed Stadler T. 2011 . Mammalian phylogeny reveals recent diversification rate shifts. Proc. Natl. Acad. Sci. USA 108 : 6187 – 6192 . Google Scholar Crossref Search ADS Stadler T. 2013 . Recovering speciation and extinction dynamics based on phylogenies. J. Evol. Biol. 26 : 1203 – 1219 . Google Scholar Crossref Search ADS PubMed Strimmer K., Pybus O.G. 2001 . Exploring the demographic history of DNA sequences using the generalized skyline plot. Mol. Biol. Evol. 18 : 2298 – 2305 . Google Scholar Crossref Search ADS PubMed Tan J., Kelly C.K., Jiang L. 2013 . Temporal niche promotes biodiversity during adaptive radiation. Nat. Commun. 4 : 2102 Google Scholar Crossref Search ADS PubMed Todisco V., Gratton P., Cesaroni D., Sbordoni V. 2010 . Phylogeography of Parnassius apollo: hints on taxonomy and conservation of a vulnerable glacial butterfly invader. Biol. J. Linn. Soc. 101 : 169 – 183 . Google Scholar Crossref Search ADS Todisco V., Gratton P., Zakharov E.V., Wheat C.W., Sbordoni V., Sperling F.A.H. 2012 . Mitochondrial phylogeography of the Holarctic Parnassius phoebus complex supports a recent refugial model for alpine butterflies. J. Biogeogr. 39 : 1058 – 1072 . Google Scholar Crossref Search ADS Tyler H.A., Brown K.S., Wilson K. 1994 . Swallowtail butterflies of the Americas: a study in biological dynamics, ecological diversity, biosystematics and conservation. Gainesville (FL) : Scientific Publishers . Van Valen L.M. 1973 . A new evolutionary law. Evol. Theory 1 : 1 – 30 . Voje K.L., Holen Ø.H., Liow L.H., Stenseth N.C. 2015 . The role of biotic forces in driving macroevolution: beyond the Red Queen. Proc. Biol. Sci. 282 : 20150186 Google Scholar Crossref Search ADS PubMed Wagner C.E., Harmon L.J., Seehausen O. 2012 . Ecological opportunity and sexual selection together predict adaptive radiation. Nature 487 : 366 – 369 . Google Scholar Crossref Search ADS PubMed Wallace A.R. 1878 . Tropical Nature and Other Essays. London : Macmillan and Co . Weiss J.C. 1991–2005 . The Parnassiinae of the World. Part I 1991, II 1992, III 1999, IV 2005. Venette : Sciences Nat. p. 400 . Wen J., Zhang J.-C., Nie Z.-L., Zhong Y., Sun H. 2014 . Evolutionary diversifications of plants on the Qinghai-Tibetan Plateau. Front. Genet. 5 : 4 Google Scholar PubMed Xie W., Lewis P.O., Fan Y., Kuo L., Chen M.H. 2011 . Improving marginal likelihood estimation for Bayesian phylogenetic model selection. Syst. Biol. 60 : 150 – 160 . Google Scholar Crossref Search ADS PubMed Xing Y., Ree, R.H. 2017 . Uplift-driven diversification in the Hengduan Mountains, a temperate biodiversity hotspot. Proc. Natl Acad. Sci. USA 114 : E3444 – E3451 . Google Scholar Crossref Search ADS Xu Q., Ding L., Spicer R.A., Liu X., Li S., Wang H. 2018 . Stable isotopes reveal southward growth of the Himalayan-Tibetan Plateau since the Paleocene. Gond. Res. 54 : 50 – 61 . Google Scholar Crossref Search ADS Yang Z., Rannala B. 2006 . Bayesian estimation of species divergence times under a molecular clock using multiple calibrations with soft bounds. Mol. Biol. Evol. 23 : 212 – 226 . Google Scholar Crossref Search ADS PubMed Zachos J.C., Dickens G.R., Zeebe R.E. 2008 . An early Cenozoic perspective on greenhouse warming and carbon-cycle dynamics. Nature 451 : 279 – 283 . Google Scholar Crossref Search ADS PubMed Zhang J.-Q., Meng S.-Y., Allen G.A., Wen J., Rao G.-Y. 2014 . A rapid radiation and dispersal out of the Qinghai-Tibetan Plateau of an alpine plant lineage Rhodiola (Crassulaceae). Mol. Phylogenet. Evol. 23 : 147 – 158 . Google Scholar Crossref Search ADS Zhisheng A., Kutzbach J.E., Prell W.L., Porter S.C. 2001 . Evolution of Asian monsoons and phased uplift of the Himalaya Tibetan plateau since Late Miocene times. Nature 411 : 62 – 66 . Google Scholar Crossref Search ADS PubMed Zinetti F., Dapporto L., Vovlas A., Chelazzi G., Bonelli S., Balletto E., Ciofi C. 2013 . When the rule becomes the exception. No evidence of gene flow between two Zerynthia cryptic butterflies suggests the emergence of a new model group. PLoS One 8 : e65746 . Google Scholar Crossref Search ADS PubMed © The Author(s) 2018. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For Permissions, please email: journals.permissions@oup.com This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices) http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Systematic Biology Oxford University Press

# Testing the Role of the Red Queen and Court Jester as Drivers of the Macroevolution of Apollo Butterflies

, Volume 67 (6) – Nov 1, 2018
25 pages

/lp/ou_press/testing-the-role-of-the-red-queen-and-court-jester-as-drivers-of-the-Nbh0AR8NPQ
Publisher
Oxford University Press
ISSN
1063-5157
eISSN
1076-836X
DOI
10.1093/sysbio/syy009
Publisher site
See Article on Publisher Site

### Abstract

Abstract In macroevolution, the Red Queen (RQ) model posits that biodiversity dynamics depend mainly on species-intrinsic biotic factors such as interactions among species or life-history traits, while the Court Jester (CJ) model states that extrinsic environmental abiotic factors have a stronger role. Until recently, a lack of relevant methodological approaches has prevented the unraveling of contributions from these 2 types of factors to the evolutionary history of a lineage. Herein, we take advantage of the rapid development of new macroevolution models that tie diversification rates to changes in paleoenvironmental (extrinsic) and/or biotic (intrinsic) factors. We inferred a robust and fully-sampled species-level phylogeny, as well as divergence times and ancestral geographic ranges, and related these to the radiation of Apollo butterflies (Parnassiinae) using both extant (molecular) and extinct (fossil/morphological) evidence. We tested whether their diversification dynamics are better explained by an RQ or CJ hypothesis, by assessing whether speciation and extinction were mediated by diversity-dependence (niche filling) and clade-dependent host-plant association (RQ) or by large-scale continuous changes in extrinsic factors such as climate or geology (CJ). For the RQ hypothesis, we found significant differences in speciation rates associated with different host-plants but detected no sign of diversity-dependence. For CJ, the role of Himalayan–Tibetan building was substantial for biogeography but not a driver of high speciation, while positive dependence between warm climate and speciation/extinction was supported by continuously varying maximum-likelihood models. We find that rather than a single factor, the joint effect of multiple factors (biogeography, species traits, environmental drivers, and mass extinction) is responsible for current diversity patterns and that the same factor might act differently across clades, emphasizing the notion of opportunity. This study confirms the importance of the confluence of several factors rather than single explanations in modeling diversification within lineages. Diversification, extinction, Himalayan orogeny, historical biogeography, host-plant shifts, integrative study, mountain building, Papilionidae, past climate change, speciation Evolutionary biologists have long endeavored to determine which factors govern biodiversity dynamics, aiming to answer questions such as why some clades have diversified more than others, or why some lineages are widely distributed whereas others survive in restricted ranges (Ezard et al. 2016). Two different mechanistic macroevolutionary models have been proposed to explain the generation and maintenance of diversity. The Red Queen (RQ) model (Van Valen 1973), which stems from Darwin and Wallace, posits that diversification is driven by species-intrinsic, biotic factors such as interactions among species, species ecology, or life-history traits. The Court Jester (CJ) model, which builds on paleontological evidence (Barnosky 2001), argues that diversification dynamics result from historical abiotic forces such as abrupt changes in climate or geological tectonic events that drive speciation and extinction rates, usually acting clade-wide across lineages. The CJ and RQ models are generally considered the 2 extremes of a continuum, operating over different geographic and temporal scales. Biotic factors such as species interactions shape ecosystems locally over short time spans, whereas abiotic factors such as climate and tectonic events shape large-scale patterns regionally and globally over millions of years (Benton 2009). However, biotic interactions can also be observed at large spatial and temporal scales (Liow et al. 2015; Silvestro et al. 2015), while Van Valen’s (1973) original RQ hypothesis is now interpreted as accepting the role of a changing environment in shaping species evolution (Voje et al. 2015). Although both abiotic (environmental) and biotic (species-intrinsic) drivers are recognized as fundamental for regulating biodiversity (Ezard et al. 2011), these 2 types of factors are often studied in isolation (Drummond et al. 2012a; Bouchenak-Khelladi et al. 2015; Lagomarsino et al. 2016), searching for correlations between shifts in diversification rates and the evolution of key innovations or the appearance of key opportunities (Maddison et al. 2007; Alfaro et al. 2009; Rabosky et al. 2014a; Givnish et al. 2015). However, often no single factor but a confluence of biotic and abiotic factors is responsible for the diversification rate shift (Donoghue and Sanderson 2015), there could be an interaction effect. For example, C$$_{\mathrm{4}}$$ grasses appeared in the Eocene but their expansion and explosive diversification started only after mid-Miocene aridification in Africa and Central Asia (Edwards et al. 2010). Statistical assessment of the relative contributions of abiotic and biotic factors underlying diversity patterns has been made possible by the development of new probabilistic models in the field of diversification dynamics (Stadler 2013; Morlon 2014; Höhna 2015). One type of model estimates diversification rates that are clade-dependent and identifies differences in diversification rates among clades that can be explained by key innovations (Alfaro et al. 2009; Morlon et al. 2011; Rabosky et al. 2013), or by diversity-dependence and niche filling (i.e., diversification decreasing as the number of species increases, Rabosky and Lovette 2008; Etienne et al. 2012). A second type of model aims to detect statistical associations between diversification rates and changes in species traits (trait-dependent diversification models, Maddison et al. 2007; Ng and Smith 2014), or between geographic evolution and diversification, such as a change in continental connectivity allowing the colonization of a new region and a subsequent increase in allopatric speciation (Goldberg et al. 2011). A third type of model assumes continuous variation in diversification rates over time that depends on a paleoenvironmental variable and investigates whether diversification rates can be affected by abiotic environmental changes (e.g., paleotemperature, Condamine et al. 2013). Finally, episodic birth–death models search for tree-wide rate shifts that act concurrently across all lineages in a tree, for example, a mass extinction event removing a fraction of lineages at a certain time in the past (Stadler 2011; Höhna 2015; May et al. 2016). The first 2 types of models have been used to test RQ-like hypotheses on the effect of biotic interactions, while the other models are often used in the context of the CJ hypothesis (environmental change). Studies using a subset of these models to address both abiotic and biotic factors are becoming more frequent, but are often used at a local or regional geographic scale (Schnitzler et al. 2011; Drummond et al. 2012a; Jønsson et al. 2012; Hutter et al. 2013; Bouchenak-Khelladi et al. 2015; Lagomarsino et al. 2016), and are limited temporally. No study has yet addressed the full set of models using a large (geographic) and long (temporal) scale within the same lineage. Such a study would have substantial power to provide information on questions such as why some lineages diversify and others do not, and the extent that diversification is attributable to trait evolution and/or ecological opportunity (Wagner et al. 2012). It might also shed light on the current debate about whether biodiversity is at an equilibrium bounded by ecological limits (Rabosky et al. 2013), although it is unclear how relevant such limits are compared with other biological explanations (Moen and Morlon 2014). Alternatively, diversity might be unbounded and controlled by rare but drastic environmental changes such as climatic mass extinction events (Antonelli and Sanmartín 2011; Meredith et al. 2011; Kergoat et al. 2014; Condamine et al. 2015b; May et al. 2016). A key to such a study would be to find a group that: 1) has experienced both biotic and abiotic pressures at different temporal and spatial scales; 2) shows large species diversity and high specialization; and 3) has good information available about its taxonomic and evolutionary history. Nearly complete taxon sampling is crucial for accurate estimation of diversification rates (Cusimano and Renner 2010; Brock et al. 2011; Höhna et al. 2011; Davis et al. 2013), whereas rich fossil evidence is important for divergence time estimation and accurate reconstruction of biogeographic history (Meseguer et al., 2014). Phytophagous insects are especially good models because they present complex biotic (trophic) interactions with their host plants (Ehrlich and Raven 1964), and many of these lineages are old enough to have experienced dramatic past climatic changes. Herein, we address the role of biotic (RQ) and abiotic (CJ) factors that drive patterns of species richness in Apollo butterflies (Papilionidae: Parnassiinae). The Parnassiinae comprise 8 genera with at least 70 species currently recognized (Ackery 1975; Weiss 1991–2005; Kocman 2009), grouped into 3 tribes—Luehdorfiini, Zerynthiini, and Parnassiini (Nazari et al. 2007). Apollo butterflies occur from the Western Palearctic to the Western Nearctic (Weiss 1991–2005). Most diversity is concentrated in the Palearctic, where we find 2 non-monophyletic lowland-flying communities separated by the Himalaya and Tibetan Plateau (HTP, Fig. 1): an Eastern Palearctic group formed by Luehdorfia (Luehdorfiini) and Bhutanitis and Sericinus (Zerynthiini), and a Western Palearctic community including Archon (Luehdorfiini), Allancastria and Zerynthia (Zerynthiini), and Hypermnestra (Parnassiini). The largest genus, Parnassius, is mainly distributed in mountainous regions across the Holarctic, with its highest diversity in the HTP (Weiss 1991–2005; Kocman 2009; Nazari et al. 2007; Michel et al. 2008). When compared with the species-poor genera Allancastria, Archon, Bhutanitis, Hypermnestra, Luehdorfia, and Sericinus (each with 5 species maximum), the 50$$+$$ species of Parnassius have been suggested as an example of rapid diversification. This unbalanced species richness, together with their disjunct pattern of distribution in the Himalayan region, suggests that the HTP orogeny might have been an important driver in the biogeographic and diversification history of Parnassiinae. Figure 1. View largeDownload slide Distribution and species richness patterns of Apollo butterflies. Map shows the current separation of west and east communities of Parnassiinae, divided by the Himalayan–Tibetan Plateau. Lower portion of figure compares the current species diversity of each biogeographical unit (which may be non-monophyletic) or genus. Images of Parnassiinae (also in subsequent figures) are created by Fabien Condamine. Figure 1. View largeDownload slide Distribution and species richness patterns of Apollo butterflies. Map shows the current separation of west and east communities of Parnassiinae, divided by the Himalayan–Tibetan Plateau. Lower portion of figure compares the current species diversity of each biogeographical unit (which may be non-monophyletic) or genus. Images of Parnassiinae (also in subsequent figures) are created by Fabien Condamine. Parnassiinae are also remarkable for their (nearly) strict but diverse host-plant specialization (Ehrlich and Raven 1964). Tribes Luehdorfiini and Zerynthiini feed on Aristolochiaceae, the ancestral host of Parnassiinae (Condamine et al. 2012). Within Parnassiini, Hypermnestra feeds on Zygophyllaceae, while Parnassius has experienced 2 host-plant shifts, toward Crassulaceae $$+$$ Saxifragaceae and Papaveraceae (Michel et al. 2008; Condamine et al. 2012), and is regarded as an adaptive radiation (Rebourg et al. 2006). Finally, Apollo butterflies constitute a model to understand the role of past environmental change in lineage diversification. Phylogeographic studies identified an effect of Pleistocene glaciations on population-level dynamics (e.g., Keyghobadi et al. 2005; Gratton et al. 2008; Schoville and Roderick 2009; Dapporto 2010; Todisco et al. 2010, 2012; Zinetti et al. 2013). Time-calibrated phylogenies support an origin of the subfamily in the Early Cenozoic (Condamine et al. 2012), which means that the lineage has experienced the dramatic cooling and warming events of the last 50 Ma, including a drop in global temperatures at the end of the Eocene (Eocene-Oligocene climate transition [EOCT], Liu et al. 2009) that led to the demise of Boreotropical Holarctic vegetation (Pound et al. 2012). In this study, we reconstruct a completely sampled phylogeny of Parnassiinae to explore the role played by abiotic (CJ) and biotic (RQ) factors, isolated or in unison, in the origin and fate of biological radiations, including paleotemperature, mountain orogeny, host specialization, range expansion, and diversity dependence with niche filling. If interactions among species are the dominant drivers of evolution, diversification rates are expected to show diversity-dependent dynamics or be dependent on ecological traits. For example, one might expect diversification rates to decrease as a function of diversity or increase with ecological opportunity. Conversely, if evolution is driven mainly by changes in the physical environment, clade-wide responses due to abrupt abiotic perturbations should dominate macroevolutionary dynamics, for example, diversification rates may shift following major climatic changes that extirpated certain lineages while favoring the radiation of others. We test these hypotheses using trait-dependent, time-dependent, environment-dependent and episodic birth–death models, and compare the fit (explanatory power) of these models using maximum likelihood (ML) and Bayesian inference (BI). Our study is the first to address the CJ and RQ hypotheses at such a large scale, and to compare the performance of these models within a single inference framework. Materials and Methods Taxon Sampling and Molecular Dataset The latest effort to resolve phylogenetic relationships in Parnassiinae was Condamine et al. (2012), but that study did not include all species. There are currently 70 recognized species within Parnassiinae according to morphology (Weiss 1991–2005; Bollino and Racheli 2012; Frankenbach et al. 2012), but some authors have found a higher level of cryptic diversity, based on molecular data (Nazari and Sperling 2007; Gratton et al. 2008; Schoville and Roderick 2009; Dapporto 2010; Todisco et al. 2010, 2012; Zinetti et al. 2013). To address this, we assembled a large molecular dataset comprising all sequences generated by previous studies on Parnassiinae, especially by F.L.C. and F.A.H.S. (Omoto et al. 2004, 2009; Katoh et al. 2005; Nazari and Sperling 2007; Nazari et al. 2007; Michel et al. 2008; Condamine et al. 2012), and includes all currently described species within the family, except for Bhutanitis ludlowi (Möhn, 2005; Weiss 1991–2005; Churkin 2006). The dataset also includes the majority of described subspecies or geographic races proposed in molecular studies (Nazari and Sperling 2007; Gratton et al. 2008; Schoville and Roderick 2009; Dapporto 2010; Todisco et al. 2010, 2012; Zinetti et al. 2013). Additionally, sequences of 9 species of swallowtails were added to the dataset as outgroups, with Baronia brevicornis considered as the sister group to all remaining Papilionidae, and 2 species for each of the 4 tribes included within Papilioninae, the sister subfamily of Parnassiinae (Condamine et al. 2012). DNA sequences were obtained from GenBank using Taxonomy browser searches (full dataset is provided in the Supplementary Material that accompanies this article available on Dryad at http://dx.doi.org/10.5061/dryad.32bp4, Appendix S1). Sequences of less than 250 nucleotides (possibly microsatellites) and genes sequenced in very few specimens were not included. Overall the dataset comprises $$>$$1800 sequences, representing 69 of the traditionally recognized species and 5 genes (4 mitochondrial and 1 nuclear), divided into 730 sequences for cytochrome oxidase I (COI), 214 for NADH dehydrogenase 1 (ND1), 147 for NADH dehydrogenase 5 (ND5), 202 for rRNA 16S (16S), and 115 for elongation factor 1 alpha (EF-1$$\alpha )$$. Sequences were aligned using MAFFT 7.110 (Katoh and Standley 2013) with default settings (E-INS-i algorithm). Reading frames of coding genes were checked in Mesquite 3.03 (www.mesquite.org). Datasets (species/subspecies names and GenBank accession numbers) are available on Dryad (Supplementary Appendices S1 and S2). We first analyzed this specimen-level dataset to generate a gene tree for each of the 5 markers with every species represented by several specimens, each representing different populations. We then examined genetic differences based on branch lengths among specimens/species. For some species, we found strong genetic differences among specimens/populations supporting the recognition of these taxa as putative new species. For example, we identified 4 Zerynthia species instead of 2, 4 Archon species instead of 3, and 2 Hypermnestra species instead of one. For the final analyses below, we used the species-level dataset, with each species represented by a single specimen, comprising the 70 recognized species (1 represented by only morphological characters, see below) plus 15 putative new species. Assessment of Fossil Positions and Calibrations Constraints on clade ages were enforced through fossil calibrations, whose systematic position was assessed using phylogenetic analyses (Donoghue et al. 1989; Sauquet et al. 2012). Phylogenetic analyses of the dataset included both living and fossil taxa and incorporated both morphological and molecular data using a total-evidence approach (Ronquist et al. 2012a). Swallowtail fossils are scarce, but 2 unambiguously belong to Parnassiinae (Nazari et al. 2007). The first is $$\dagger$$Thaites ruminiana (Scudder 1875), a compression fossil from limestone in the Niveau du gypse d’Aix Formation of France (Bouches-du-Rhone, Aix-en-Provence) within the Chattian (23.03–28.1 Ma) of the late Oligocene (Rasnitsyn and Zherikhin 2002; Sohn et al. 2012). The second is $$\dagger$$Doritites bosniaskii (Rebel 1898), an exoskeleton and compression fossil from Italy (Tuscany) from the Messinian (5.33–7.25 Ma, late Miocene, Sohn et al. 2012). Absolute ages of geological formations were taken from Gradstein et al. (2012). To assess the phylogenetic positions of fossils, our morphological dataset comprised 236 characters (Hiura 1980; Nazari et al. 2007) including adult external (160) and internal (54) morphology, egg (2) and larval (10) morphology, pupal morphology (5), and characters for pigment color and chromosome number (5). Morphological characters were coded for all extant species in each genus except Parnassius (only 8 species: 1 per subgenus). Yet, Parnassius had little impact on our analyses since there are no fossils related to this genus (Nazari et al. 2007). Fossils were only coded for adult external morphology. Morphological and molecular data were combined to construct a total-evidence phylogeny using MrBayes 3.2.6 (Ronquist et al. 2012b). Morphological data were modeled using the Mk model (Lewis 2001). The best-fitting partitioning scheme for the molecular data was selected with PartitionFinder 1.1.1 (Lanfear et al. 2012), using the greedy search algorithm and the Bayesian Information Criterion. Rather than using a single substitution model per molecular partition, we sampled across the entire substitution rate model space (Huelsenbeck et al. 2004) using the reversible-jump Markov Chain Monte Carlo (rj-MCMC) option. Two independent analyses with 1 cold chain and 7 heated chains, each was run for 20 million generations, sampled every 2000th. A 50% majority rule consensus tree was built after discarding 25% samples as burn-in. The dataset used for this analysis is available on Dryad (Supplementary Appendix S3). Phylogeny and Estimates of Divergence Times Using the 85-species level dataset, we examined whether the rate of molecular evolution evolved in a clock-like pattern using PATHd8 (Britton et al. 2007). Since the hypothesis of a strict molecular clock was rejected for 68 of the 92 nodes ($$P < 0.05$$), we estimated divergence times using Bayesian relaxed-clock methods accounting for rate variation across lineages (Drummond et al. 2006). MCMC analyses implemented in BEAST 1.8.2 (Drummond et al. 2012b) were employed to approximate the posterior distribution of rates and divergences times and infer their credibility intervals. We set the following settings and priors: a partitioned dataset (after the best-fitting PartitionFinder scheme) was analyzed using the uncorrelated lognormal distribution clock model, with the mean set to a uniform prior between 0 and 1, and an exponential prior (lambda $$=$$ 0.333) for the standard deviation. The branching process prior was set to either a Yule or a birth–death (Gernhard 2008) process (since tree priors may impact estimates of molecular dating, Condamine et al. 2015a, using the following uniform priors: the Yule birth rate and birth–death mean growth rate ranged between 0 and 10 with a starting value at 0.1, and the birth–death relative death rate ranged between 0 and 1 (starting value $$=$$ 0.5). Calibration priors employed the fossil constraints indicated above (see Results section for the phylogenetic positions), using a uniform prior with the minimum age equal to the youngest age of the geological formation where the fossil was found. Additionally, we constrained the crown of Papilionidae with a uniform distribution bounded by a minimum age of 47.8 Ma (Smith et al. 2003; Gradstein et al. 2012) based on 2 fossils $$\dagger$$Praepapilio colorado and †Praepapilio gracilis (Durden and Rose 1978), both from the early Lutetian (mid-Eocene) of the Green River Formation (CO, USA). We could not place these fossils phylogenetically because of limited taxon sampling for Papilioninae. Instead, we placed these unambiguous fossils conservatively at the crown of the family as they share synapomorphies with all extant subfamilies (de Jong 2007), and have proven to be reliable calibration points for the crown group (Condamine et al. 2012). All uniform calibration priors were set with an upper bound equal to the estimated age of angiosperms ($$\sim$$140 Ma, sensuMagallón et al. 2015), which is 6 times older than the oldest Parnassiinae fossil. This upper age is intentionally set as ancient to allow exploration of potentially old ages for the clade. Since the fossil record of Lepidoptera is highly incomplete and biased (Sohn et al. 2015), caution is needed in using the few recent fossil calibrations. To explore the impact of sampled diversity, tree prior, and morphological data on BEAST inferences, we ran 4 analyses as follows: 1) molecular data only and Yule model; 2) molecular data only and birth–death process; 3) molecular and morphological data (total-evidence analysis) and Yule model; and 4) molecular and morphological data (total-evidence analysis) and birth–death process, with all other parameters set equal. The total-evidence analyses provide a completely sampled phylogeny of Parnassiinae due to the inclusion of B. ludlowi (only available with morphological data). Each MCMC analysis was run for 100 million generations, sampled every 10,000th, resulting in 10,000 samples in the posterior distribution of which the first 2500 samples were discarded as burn-in. All analyses were performed on the computer cluster CIPRES Science Gateway (Miller et al. 2015), using BEAGLE (Ayres et al. 2012). Convergence and performance of MCMC runs were evaluated using Tracer 1.6 (Rambaut and Drummond 2016) and the effective sample size (ESS) criterion for each parameter. A maximum-clade credibility (MCC) tree was reconstructed, with median age and 95% height posterior density (HPD). Bayes factor (BF) comparisons using stepping-stone sampling (Xie et al. 2011), which allows unbiased approximation of the marginal likelihood of Bayesian analyses (Baele et al. 2013), was used to select among competing models. We considered 2lnBF values $$>$$10 to significantly favor one model over another (Kass and Raftery 1995). The dataset and BEAST xml files generated for this study are available on Dryad (Supplementary Appendix S4). Historical Biogeography For the phylogenetic tree, we used the BEAST MCC tree with 85 species. Definition of biogeographic units was based on the present-day distribution of species but also informed by plate tectonics (Sanmartín et al. 2001) and alpine orogeny reconstructions (Bouilhol et al. 2013). Ten areas were defined: WN $$=$$ Western North America (including the Rocky Mountains); WP $$=$$ Europe (France, Spain, Pyrenees, Alps, Italy, Greece, Crete, Balkans, Scandinavia, Western Russian, and Ural Mountains); SI $$=$$ Eastern Russia, Siberia, and Kamchatka; CA $$=$$ Central Asia (Turkmenistan, Uzbekistan and Kazakhstan); MO $$=$$ Mongolian steppes and Altai Mountains; TU $$=$$ Turkey, Caucasian region, Syria, Iraq, Iranian Plateau, and Zagros Mountains; HTP $$=$$ Himalaya, Tibetan Plateau, and Pamir region; IN $$=$$ India, Bhutan, Yunnan, and Southern China; CJ $$=$$ Northern China, Korea, and Japan; AF $$=$$ North Africa and Arabia. Species ranges were defined by presence–absence in each region (Supplementary Appendix S5 available on Dryad). Biogeographic analyses were performed on the MCC tree (outgroups removed), using the Dispersal–Extinction–Cladogenesis model of range evolution (DEC; Ree and Smith 2008) implemented in the R-package BioGeoBEARS 0.2.1 (Matzke 2014). We did not use the DEC$$+$$J model (Matzke 2014) because of concerns with statistical validity of model choice among DEC-derived models and in particular under the DEC$$+$$J model (Ree and Sanmartín 2018). Also, DEC$$+$$J often leads to inferences that are decoupled from time, with null or extremely low extinction rates (Sanmartín and Meseguer 2016), an effect of the model favoring cladogenetic events over anagenetic events (Ree and Sanmartín 2018), which makes it inadequate for reconstructing the history of ancient groups with widespread distributions. Biogeographic ranges larger than 4 areas in size were disallowed as valid biogeographic states if they were not subsets of the terminal species ranges; widespread ranges comprising areas that have never been geographically connected (Sanmartín et al. 2001) were also removed. We constructed a time-stratified model (Supplementary Appendix S6 available on Dryad) with 4 time slices that specified constraints on area connections and spanned the Parnassiinae evolutionary history; each interval represents a geological period bounded by major changes in tectonic and climatic conditions thought to have affected the distribution of these butterflies. Testing the RQ and CJ hypotheses To understand the relative contributions of RQ versus CJ-type factors, we ran a series of macroevolutionary models using both ML and BI. Rather than on a single tree, analyses were performed on a sample of trees from the BEAST post-burnin posterior distribution (with outgroups removed); an exception was the diversity-dependent diversification models that, due to time constraints, were run only on the MCC tree. This approach allowed us to assess the robustness of results to phylogenetic uncertainty and estimate confidence intervals for the parameter estimates. We compared the fit of each model to the data using the corrected Akaike Information Criterion (AICc) for the ML analyses, and Bayes Factor comparisons (2lnBF) for BI. Diversification analyses in a RQ model. We first tested the hypothesis that diversity is bounded or at equilibrium, meaning that diversity expanded rapidly in early diversification, filled most niches and saturated toward the present. We explored the effect of diversity-dependence on speciation and extinction rates using the method of Etienne et al. (2012) implemented in the R-package DDD 2.7. We applied 5 different models: 1) speciation depends linearly on diversity without extinction, 2) speciation depends linearly on diversity with extinction, 3) speciation depends exponentially on diversity with extinction, 4) speciation does not depend on diversity and extinction depends linearly on diversity, and 5) speciation does not depend on diversity and extinction depends exponentially on diversity. For each model, the initial carrying capacity was set to the current number of described species. We also tested the effect of host-plant association on diversification rates, since previous study has pointed to a potential correlation between shifts in diversification and host-plant switches (Ehrlich and Raven 1964). We used a diversification model of the state-dependent speciation and extinction (SSE) family of models, in which extinction and speciation rates are associated with phenotypic evolution of a trait along a phylogeny (Maddison et al. 2007). In particular, we used the Multiple State Speciation Extinction model (MuSSE; FitzJohn et al. 2009) implemented in the R-package diversitree 0.9–7 (FitzJohn 2012), which allows for multiple character states. Larval host plant data were taken from previous work (Ehrlich and Raven 1964; Scriber 1984; Collins and Morris 1985; Tyler et al. 1994; Scriber et al. 1995). The following 4 character states were used: 1) Aristolochiaceae; 2) Zygophyllaceae; 3) Papaveraceae; and 4) Crassulaceae $$+$$ Saxifragaceae. Data at a lower taxonomic level than plant family were not used because of the great number of multiple associations exhibited by genera that could alter the phylogenetic signal. Thirty-six different models were run to test whether speciation, extinction, or transition rates were dependent on trait evolution. Models were built with increasing complexity, starting from a model with no difference in speciation, extinction and transition rates (3 parameters) to the most complex model with different speciation, extinction, and transition rates for each character state (20 parameters). We estimated posterior density distribution with Bayesian MCMC analyses (10,000 steps) performed with the best-fitting models and resulting speciation, extinction and transition rates. There have been concerns about the power of SSE models to infer diversification dynamics from a distribution of species traits (Davis et al. 2013; Maddison and FitzJohn 2015; Rabosky and Goldberg 2015). First, SSE models tend to have a high type I error bias related to the shape (topology) of the inferred tree (Rabosky and Goldberg 2015). To test whether our diversification results were potentially biased, we estimated the difference of fit ($$\Delta$$AIC) between the best model from MuSSE and a null model (corresponding to the model with no difference in speciation, extinction and transition rates between the states of a character) and compared this with the difference between the same models as estimated from simulated datasets. 1000 sets of traits were generated on the Parnassiinae phylogeny using the sim.history function from the R-package phytools (Revell 2012). To keep the phylogenetic signal in the trait, transition rates between states were simulated using values obtained from the null model fitted on the observed phylogeny. Second, there is concern whether SSE models are uncovering actual drivers of diversification, or whether they are simply pointing to more complex patterns involving unmeasured and co-distributed factors in the phylogeny (Beaulieu and O’Meara 2016). A model with an additional hidden character may alleviate this issue. We applied the Hidden SSE (HiSSE) model to specifically account for the presence of unmeasured factors that could impact diversification rates estimated for the states of any observed trait. The analyses were performed in the Bayesian software RevBayes (Höhna et al. 2016a). To provide an independent assessment of the relationship between diversification rates and host specificity, we used models that allow diversification rates to vary among clades. BAMM 2.5 (Rabosky et al. 2013, 2014a) was used to explore for differential diversification dynamic regimes among clades differing in their host-plant feeding. BAMM analyses were run for 10 million generations, sampling every 10,000th and with 4 different values of the compound Poisson prior (CPP) to ensure the posterior is independent of the prior (Moore et al. 2016). Mixing and convergence among runs (ESS $$>$$200 after 15% burn-in) were assessed with the R-package BAMMtools 2.1 (Rabosky et al. 2014b). BAMM has been criticized for incorrectly modeling rate-shifts on extinct lineages, that is, unobserved (extinct or unsampled) lineages inherit the ancestral diversification process and cannot experience subsequent diversification-rate shifts (Moore et al. 2016, but see Rabosky et al. 2017). To solve this, we used a novel approach implemented in RevBayes that models rate shifts consistently on extinct lineages by using the SSE framework (Moore et al. 2016; Höhna S., Moore B.R., Rannala B., May M.R., Huelsenbeck J.P. in preparation). Although there is no information of rate shifts for unobserved/extinct lineages in a phylogeny including extant species only, these types of events must be accounted for in computing the likelihood. The number of rate categories is fixed in the analysis but RevBayes allows any number to be specified, thus allowing direct comparison of different macroevolutionary regimes. Diversification analyses in a CJ model. We evaluated the impact of abiotic factors such as abrupt changes in tectonic or climatic settings or mass extinction events (e.g., the EOCT at 33.9 Ma) using episodic birth–death models implemented in the R-packages TreePar 3.3 (Stadler 2011) and TESS 2.1 (CoMET model, Höhna et al. 2016b; May et al. 2016). These models allow detection of discrete changes in speciation and extinction rates concurrently affecting all lineages in a tree. Both approaches estimate changes in diversification rates at discrete points in time, but can also infer mass extinction events, modeled as sampling events in the past in which the extant diversity is reduced by a fraction. Speciation and extinction rates can change at those points but remain constant within time intervals. The underlying likelihood function is identical in the 2 methods (Stadler 2011; Höhna 2015), but TreePar and TESS differ in the inference framework (ML vs. BI) and the method used for model comparison (AICc vs. BF). In addition, TESS uses independent CPPs to simultaneously detect mass extinction events and discrete changes in speciation and extinction rates, while TreePar estimates the magnitude and timing of speciation and extinction changes independently to the occurrence of mass extinctions (i.e., the 3 parameters cannot be estimated simultaneously due to parameter identifiability issues, Stadler 2011). To compare inferences between CoMET and TreePar, we performed 2 independent TreePar analyses allowing and disallowing mass extinction events (argument ME $$=$$ TRUE/FALSE). We compared models with 0, 1, or 2 rate shifts/mass extinction events in TreePar using the AICc, while BF comparisons were used to assess model fit between models with varying number and time of changes in speciation/extinction rates and mass extinctions in TESS. Finally, we used an implementation of CoMET in RevBayes (Höhna et al. 2016a), an episodic birth–death model in which speciation and extinction are allowed to vary at discrete points in time, but are autocorrelated (instead of independent/uncorrelated as in CoMET) across time intervals; a Brownian model with rates in the next time interval centered around the rates in the current time interval was used. Also, the number of diversification rate shifts was set a priori as in population skyline models (Strimmer and Pybus 2001), while in CoMET this is modeled through a CPP prior. To examine the influence of elevational distribution (lowland, mountain, or both) on speciation and extinction rates, we used the trait-dependent diversification model GeoSSE (Geographic State Speciation and Extinction, Goldberg et al. 2011). Geographic characters require a third widespread state, because, unlike morphological traits, ancestors can be present in more than 1 state (area). This requires the modeling of cladogenetic state change, the probability associated with the division of an ancestral range between the 2 descendants, alongside anagenetic state change (i.e., the probability of character change along branches). Species were coded by their elevation zone, using data from literature, museum records, and field observations. We evaluated 12 models of increasing complexity to test the relationship between elevational distributions and diversification. As in MuSSE above, we evaluated the performance of GeoSSE using simulation tests. Trait simulation was more complex here because there is no direct transition between states A and B. Instead, these transitions involve range expansion to an intermediate widespread state AB (dAB, dBA) followed by local extinction (xA, xB). Following Goldberg et al. (2011), we considered transition rates between A and B as null because they involve more than one instantaneous event; the transition rate from A to AB was coded as range expansion dA, from B to AB as dB; transitions from AB to A and from AB to B (“extirpation rates”) were coded as xB and xA, respectively. For our 1000 simulations, we used the transition rates of the model with equal speciation rates (sA $$=$$ sB $$=$$ sAB), but different extinction (xA$$\ne$$xB) and transition rates (dA$$\ne$$dB). Finally, we tested the impact of paleoenvironmental variables on diversification rates, using a birth–death likelihood method in which rates may change continuously with an environmental variable, itself varying through time (Condamine et al. 2013). We tested 4 models: 1) a constant-rate birth–death model in which diversification is not associated to the environmental variable; 2) speciation rate varies according to environment and extinction rate does not vary; 3) speciation rate does not vary and extinction rate varies according to environment; and 4) both speciation and extinction rates vary according to environment. We also tested the corresponding models in which speciation and/or extinction are allowed to vary with time, but independently from the environmental variable (time-dependent birth–death models, Morlon et al. 2011). When rates varied with the environment (E), we assumed exponential variation, such that $$\lambda (E)=\lambda_{0}\thinspace \times \thinspace e^{\alpha E}$$ and $$\mu (E)=\mu_{0}\times \thinspace e^{\beta E}$$, in which $$\lambda_{0}$$ and $$\mu _{0}$$ are the speciation and extinction rates for a given environmental variable for which the value is 0, and $$\alpha$$ and $$\beta$$ are the rates of change according to the environment. Positive values for $$\alpha$$ or $$\beta$$ mean a positive effect of the environment on speciation or extinction (and conversely). As environmental variables (Fig. 2), we used paleotemperature (data retrieved from Zachos et al. 2008) and paleo-elevation of the HTP (a proxy for mountain building), with data compiled from existing data in the literature available on Dryad (Supplementary Appendix S7). The R-package pspline was used to build environmental vectors from the data as input for the birth–death models. Figure 2. View largeDownload slide Environmental changes over the last 40 million years. a) Trends in global climate change estimated from relative proportions of different oxygen isotopes (Zachos et al. 2008). b) Trends in HTP elevation for the last 40 Myrs, compiled from data from a literature survey; our compilation shows that the HTP was as high as 3000 m in the Eocene, congruent with recent reviews (Renner 2016; Xu et al. 2018; Supplementary Appendix S7). c) Map of the current elevation of the Himalayan–Tibetan Plateau with the species diversity of Parnassius in the region. Figure 2. View largeDownload slide Environmental changes over the last 40 million years. a) Trends in global climate change estimated from relative proportions of different oxygen isotopes (Zachos et al. 2008). b) Trends in HTP elevation for the last 40 Myrs, compiled from data from a literature survey; our compilation shows that the HTP was as high as 3000 m in the Eocene, congruent with recent reviews (Renner 2016; Xu et al. 2018; Supplementary Appendix S7). c) Map of the current elevation of the Himalayan–Tibetan Plateau with the species diversity of Parnassius in the region. Alternatively, we used RevBayes and the CoMET framework (episodic birth–death models) to implement an analysis in which diversification rates are allowed to co-vary with, or evolve independently from, paleotemperatures and paleo-elevation across discrete time intervals (Palazzesi L., Hidalgo O., Barreda V.D, Forest F., Höhna S. in review). A correlation coefficient, denoted $$\beta$$, is co-estimated with diversification rates in the model and describes the magnitude and direction of the relationship between the variable and changes in diversification rates: $$\beta$$$$<$$ 0 (negative correlation), $$\beta$$$$>$$0 (positive correlation), and $$\beta = 0$$ (no correlation, in which case the model collapses to the episodic birth–death models; see Supplementary Appendix S8 available on Dryad). RESULTS Phylogeny and Fossil Placements The molecular matrix comprised 4535 nucleotides and 85 extant species (70 formally recognized and 15 putative species) of Parnassiinae plus 2 fossil species, following recent taxonomic reviews (Frankenbach et al. 2012; Bollino and Racheli 2012) and our own results from the species delimitation analyses (see Supplementary Appendices S1 and S2 available on Dryad). PartitionFinder selected 8 partitions as the best scheme for substitution models (Supplementary Appendix S9 available on Dryad). Convergence among runs was supported by the low average standard deviation of split frequencies, PSRF values $$\approx$$1.0, and ESS $$>>$$ 1000 for many parameters. The resulting Bayesian trees were well resolved and robust: 74.4% of the nodes were recovered with posterior probability (PP) $$>$$0.95 with MrBayes (Fig. 3), and 83% in BEAST (70% with PP $$=$$ 1) when fossil taxa were removed. Phylogenetic relationships agree with previous studies (Nazari et al. 2007; Michel et al. 2008), showing Luehdorfiini $$+$$ Zerynthiini as sister-tribes and sister to Parnassiini; within the latter, Hypermnestra is sister to the genus Parnassius. Most of the weak support is found within the radiation of species complexes, such as Parnassius apollo or in the subgenus Kailasius, but 2 deep nodes in Parnassius have PP $$<$$ 0.9. Figure 3. View largeDownload slide Phylogenetic relationships inferred with a Bayesian total-evidence approach combining molecular and morphological data of extant and extinct Parnassiinae. This integrative approach allowed assessment of the phylogenetic placement of 2 fossils, which were then used to calibrate the molecular dating analysis. Backbone nodes show posterior probability values. Nodal support values within genera or subclades are shown in Supplementary Appendix S10 for alternative dating analyses including, or not, morphological data. Grey circles indicate the crown-node of each tribe. Figure 3. View largeDownload slide Phylogenetic relationships inferred with a Bayesian total-evidence approach combining molecular and morphological data of extant and extinct Parnassiinae. This integrative approach allowed assessment of the phylogenetic placement of 2 fossils, which were then used to calibrate the molecular dating analysis. Backbone nodes show posterior probability values. Nodal support values within genera or subclades are shown in Supplementary Appendix S10 for alternative dating analyses including, or not, morphological data. Grey circles indicate the crown-node of each tribe. The total-evidence analysis allowed us to test the phylogenetic placement of the Parnassiinae fossils. $$\dagger$$Thaites was often recovered as sister to Parnassiini (PP $$=$$ 0.6, Fig. 3), and occasionally as sister to Luehdorfiini $$+$$ Zerynthiini. This fossil was used to provide a minimum age for crown Parnassiinae, calibrated with a uniform prior bounded between 23.03 Ma (minimum) and 47.8 Ma (maximum). $$\dagger$$Doritites was reconstructed as sister to Archon (Luehdorfiini, PP $$=$$ 0.96), in agreement with Carpenter (1992), who tentatively synonymized $$\dagger$$Doritites with Luehdorfia. The crown of Luehdorfiini was thus constrained for divergence time estimation using a uniform distribution bounded between 5.33 Ma (minimum) and 47.8 Ma (maximum). Divergence Time and Biogeographic Estimates The BEAST analyses (Yule vs. birth–death and molecular vs. total-evidence) yielded almost identical estimates of divergence times with less than 1 million years of difference for most nodes (Supplementary Appendix S10 available on Dryad). BF comparisons were also not conclusive, with little difference in the stepping-stone marginal likelihood estimate among analyses (Supplementary Appendix S11 available on Dryad). We present the results for the analysis with the birth–death prior and the total-evidence dataset (Fig. 4) because it offered slightly better convergence and incorporating extinction into the tree prior seemed more realistic for such an old lineage (results from the other analyses are available on Dryad, Supplementary Appendix S10). Figure 4. View largeDownload slide Time-calibrated phylogeny and biogeography of Apollo butterfly radiation and paleo-tectonic evolution at important periods. Colored squares on nodes indicate the most likely biogeographical estimate, as in the lower left inset (uncertainties in range estimates are in Supplementary Appendix S12). Triangles indicate range expansions and X’s denote local extinctions. A timescale spans the full evolutionary history of the group. Earth maps represent paleogeography at specific time periods. The red box highlights a notable period of local extinctions in the clade Luehdorfiini $$+$$ Zerynthiini. Panels at bottom represent the main Cenozoic environmental changes occurring with the onset and rise of the HTP or floristic turnovers. Sensitivity analyses were conducted to test the effect of adding or excluding morphological data; results are presented in Supplementary Appendix S10. Plio. $$=$$ Pliocene; Ple. $$=$$ Pleistocene. Figure 4. View largeDownload slide Time-calibrated phylogeny and biogeography of Apollo butterfly radiation and paleo-tectonic evolution at important periods. Colored squares on nodes indicate the most likely biogeographical estimate, as in the lower left inset (uncertainties in range estimates are in Supplementary Appendix S12). Triangles indicate range expansions and X’s denote local extinctions. A timescale spans the full evolutionary history of the group. Earth maps represent paleogeography at specific time periods. The red box highlights a notable period of local extinctions in the clade Luehdorfiini $$+$$ Zerynthiini. Panels at bottom represent the main Cenozoic environmental changes occurring with the onset and rise of the HTP or floristic turnovers. Sensitivity analyses were conducted to test the effect of adding or excluding morphological data; results are presented in Supplementary Appendix S10. Plio. $$=$$ Pliocene; Ple. $$=$$ Pleistocene. Based on this analysis and results from the DEC inference, Parnassiinae originated in the late Eocene (38.6 Ma, 95% HPD 31.4–46.6 Ma) in Central Asia (Fig. 4). The ancestor of sister-tribes Luehdorfiini and Zerynthiini (36.7 Ma, 29.9–44.5 Ma) is also reconstructed in Central Asia, followed by independent dispersal events to India $$+$$ Caucasus-Turkey and the HTP. Luehdorfiini diversified in the early Miocene at 23.3 Ma (17.4–30.2 Ma) within a broad geographic range including India, Turkey-Caucasus, and Central Asia; this was followed by extinction in Central Asia and India, and a later dispersal to China–Japan along the stem of Luehdorfia. Zerynthiini originated in the Oligocene (30.3 Ma, 24.1–37.2 Ma) within a region comprising India, Turkey-Caucasus, Central Asia, and the HTP, and also underwent extinction events in Central Asia and India during the Miocene. In contrast, Parnassiini first diversified within this region in the early Miocene (22.6 Ma, 17.2–28.5 Ma), followed by dispersal to Turkey-Caucasus in Hypermnestra. The ancestor of Parnassius dispersed from Central Asia to HTP around the mid-Miocene (13.4 Ma, 10.5–16.6 Ma), followed by vicariance between subgenus Parnassius and the ancestor of the other subgenera. Dispersal events to adjacent geographic regions and subsequent allopatric speciation are reconstructed within each subgenus, especially in Driopa. A more detailed account of the biogeographic history and alternative nodal reconstructions are available on Dryad (Supplementary Appendix S12). Macroevolutionary Dynamics A summary of diversification rate analysis results is presented in Table 1. Table 1. Summary of diversification analyses Evolutionary Type of birth–death $$\hspace{8pt}$$model Methods used References Data used Settings Main results Time-dependence (rates vary continuously as a function of time and across clades) Red Queen BAMM RevBayes Rabosky et al. (2013), Höhna et al. (2016a) MCC tree of Parnassiinae Bayesian model to test whether speciation rates have varied over time and/or across lineages (exponential variation) A single shift: increase of diversification near the crown of the genus Parnassius Diversity-dependence (rates vary as a function of the number of species) Red Queen DDD (dd_ML) Etienne et al. (2012) MCC tree of the clade and one of Parnassius (to test whether they reach their carrying capacity) 5 models to test whether speciation declines with diversity and/or extinction increases with diversity Both Parnassiinae and Parnassius have not reached their carrying capacity, but show some degree of diversity dependence Trait-dependence (rates vary as a function of a character state for a trait) Red Queen MuSSE (make.musse) RevBayes Maddison et al. 2007, FitzJohn et al. 2009, Höhna et al. (2016a) 200 posterior trees $$+$$ host plant preferences (four states) 36 models to test whether a character state impacted speciation and/or extinction and/or transition rates between host plants Support for independent speciation rates for each host-specialist clade; no significant differences for extinction and transition rates among clades Episodic birth–death model (rates vary discretely as a function of time) Court Jester TreePar (bd.shifts.optim) CoMET Stadler 2011, May et al. 2016 200 posterior trees MCC tree 4 models to test whether diversification rates and turnover episodically changed over time Significant diversification rate and turnover changes in the late Pliocene (3.2 Ma) and a possible mass extinction in the mid-Miocene (15 Ma) Trait-dependence (rates vary as a function of a character state for a trait) Court Jester GeoSSE (make.geosse) RevBayes Maddison et al. 2007, Goldberg et al. 2011, Höhna et al. (2016a) 200 posterior trees $$+$$ geographical occurrence (highland, lowland, or widespread) 12 models to test whether a character state impacted speciation and/or extinction and/or transition rates between mountains and lowlands High speciation in mountain, medium allopatric speciation and low speciation in lowland (no effect of the trait on extinction and transition rates) Environmental-dependence (rates vary as a function of a time-variable environment) Court Jester RPANDA (fit_env) TESS RevBayes Condamine et al. 2013, Höhna et al. (2016b) 200 posterior trees $$+$$ past global temperatures $$+$$ past altitude of the HTP 8 models for each variable to test whether rates vary with the paleoenvironment Speciation and extinction are positively associated with warm climate. Speciation or extinction is not significantly associated with past altitude Evolutionary Type of birth–death $$\hspace{8pt}$$model Methods used References Data used Settings Main results Time-dependence (rates vary continuously as a function of time and across clades) Red Queen BAMM RevBayes Rabosky et al. (2013), Höhna et al. (2016a) MCC tree of Parnassiinae Bayesian model to test whether speciation rates have varied over time and/or across lineages (exponential variation) A single shift: increase of diversification near the crown of the genus Parnassius Diversity-dependence (rates vary as a function of the number of species) Red Queen DDD (dd_ML) Etienne et al. (2012) MCC tree of the clade and one of Parnassius (to test whether they reach their carrying capacity) 5 models to test whether speciation declines with diversity and/or extinction increases with diversity Both Parnassiinae and Parnassius have not reached their carrying capacity, but show some degree of diversity dependence Trait-dependence (rates vary as a function of a character state for a trait) Red Queen MuSSE (make.musse) RevBayes Maddison et al. 2007, FitzJohn et al. 2009, Höhna et al. (2016a) 200 posterior trees $$+$$ host plant preferences (four states) 36 models to test whether a character state impacted speciation and/or extinction and/or transition rates between host plants Support for independent speciation rates for each host-specialist clade; no significant differences for extinction and transition rates among clades Episodic birth–death model (rates vary discretely as a function of time) Court Jester TreePar (bd.shifts.optim) CoMET Stadler 2011, May et al. 2016 200 posterior trees MCC tree 4 models to test whether diversification rates and turnover episodically changed over time Significant diversification rate and turnover changes in the late Pliocene (3.2 Ma) and a possible mass extinction in the mid-Miocene (15 Ma) Trait-dependence (rates vary as a function of a character state for a trait) Court Jester GeoSSE (make.geosse) RevBayes Maddison et al. 2007, Goldberg et al. 2011, Höhna et al. (2016a) 200 posterior trees $$+$$ geographical occurrence (highland, lowland, or widespread) 12 models to test whether a character state impacted speciation and/or extinction and/or transition rates between mountains and lowlands High speciation in mountain, medium allopatric speciation and low speciation in lowland (no effect of the trait on extinction and transition rates) Environmental-dependence (rates vary as a function of a time-variable environment) Court Jester RPANDA (fit_env) TESS RevBayes Condamine et al. 2013, Höhna et al. (2016b) 200 posterior trees $$+$$ past global temperatures $$+$$ past altitude of the HTP 8 models for each variable to test whether rates vary with the paleoenvironment Speciation and extinction are positively associated with warm climate. Speciation or extinction is not significantly associated with past altitude Notes: Evolutionary models are divided by whether they are used to test hypotheses in favor of the Red Queen or to support the Court Jester. MCC $$=$$ maximum clade credibility; HTP $$=$$ Himalaya and Tibetan Plateau. Table 1. Summary of diversification analyses Evolutionary Type of birth–death $$\hspace{8pt}$$model Methods used References Data used Settings Main results Time-dependence (rates vary continuously as a function of time and across clades) Red Queen BAMM RevBayes Rabosky et al. (2013), Höhna et al. (2016a) MCC tree of Parnassiinae Bayesian model to test whether speciation rates have varied over time and/or across lineages (exponential variation) A single shift: increase of diversification near the crown of the genus Parnassius Diversity-dependence (rates vary as a function of the number of species) Red Queen DDD (dd_ML) Etienne et al. (2012) MCC tree of the clade and one of Parnassius (to test whether they reach their carrying capacity) 5 models to test whether speciation declines with diversity and/or extinction increases with diversity Both Parnassiinae and Parnassius have not reached their carrying capacity, but show some degree of diversity dependence Trait-dependence (rates vary as a function of a character state for a trait) Red Queen MuSSE (make.musse) RevBayes Maddison et al. 2007, FitzJohn et al. 2009, Höhna et al. (2016a) 200 posterior trees $$+$$ host plant preferences (four states) 36 models to test whether a character state impacted speciation and/or extinction and/or transition rates between host plants Support for independent speciation rates for each host-specialist clade; no significant differences for extinction and transition rates among clades Episodic birth–death model (rates vary discretely as a function of time) Court Jester TreePar (bd.shifts.optim) CoMET Stadler 2011, May et al. 2016 200 posterior trees MCC tree 4 models to test whether diversification rates and turnover episodically changed over time Significant diversification rate and turnover changes in the late Pliocene (3.2 Ma) and a possible mass extinction in the mid-Miocene (15 Ma) Trait-dependence (rates vary as a function of a character state for a trait) Court Jester GeoSSE (make.geosse) RevBayes Maddison et al. 2007, Goldberg et al. 2011, Höhna et al. (2016a) 200 posterior trees $$+$$ geographical occurrence (highland, lowland, or widespread) 12 models to test whether a character state impacted speciation and/or extinction and/or transition rates between mountains and lowlands High speciation in mountain, medium allopatric speciation and low speciation in lowland (no effect of the trait on extinction and transition rates) Environmental-dependence (rates vary as a function of a time-variable environment) Court Jester RPANDA (fit_env) TESS RevBayes Condamine et al. 2013, Höhna et al. (2016b) 200 posterior trees $$+$$ past global temperatures $$+$$ past altitude of the HTP 8 models for each variable to test whether rates vary with the paleoenvironment Speciation and extinction are positively associated with warm climate. Speciation or extinction is not significantly associated with past altitude Evolutionary Type of birth–death $$\hspace{8pt}$$model Methods used References Data used Settings Main results Time-dependence (rates vary continuously as a function of time and across clades) Red Queen BAMM RevBayes Rabosky et al. (2013), Höhna et al. (2016a) MCC tree of Parnassiinae Bayesian model to test whether speciation rates have varied over time and/or across lineages (exponential variation) A single shift: increase of diversification near the crown of the genus Parnassius Diversity-dependence (rates vary as a function of the number of species) Red Queen DDD (dd_ML) Etienne et al. (2012) MCC tree of the clade and one of Parnassius (to test whether they reach their carrying capacity) 5 models to test whether speciation declines with diversity and/or extinction increases with diversity Both Parnassiinae and Parnassius have not reached their carrying capacity, but show some degree of diversity dependence Trait-dependence (rates vary as a function of a character state for a trait) Red Queen MuSSE (make.musse) RevBayes Maddison et al. 2007, FitzJohn et al. 2009, Höhna et al. (2016a) 200 posterior trees $$+$$ host plant preferences (four states) 36 models to test whether a character state impacted speciation and/or extinction and/or transition rates between host plants Support for independent speciation rates for each host-specialist clade; no significant differences for extinction and transition rates among clades Episodic birth–death model (rates vary discretely as a function of time) Court Jester TreePar (bd.shifts.optim) CoMET Stadler 2011, May et al. 2016 200 posterior trees MCC tree 4 models to test whether diversification rates and turnover episodically changed over time Significant diversification rate and turnover changes in the late Pliocene (3.2 Ma) and a possible mass extinction in the mid-Miocene (15 Ma) Trait-dependence (rates vary as a function of a character state for a trait) Court Jester GeoSSE (make.geosse) RevBayes Maddison et al. 2007, Goldberg et al. 2011, Höhna et al. (2016a) 200 posterior trees $$+$$ geographical occurrence (highland, lowland, or widespread) 12 models to test whether a character state impacted speciation and/or extinction and/or transition rates between mountains and lowlands High speciation in mountain, medium allopatric speciation and low speciation in lowland (no effect of the trait on extinction and transition rates) Environmental-dependence (rates vary as a function of a time-variable environment) Court Jester RPANDA (fit_env) TESS RevBayes Condamine et al. 2013, Höhna et al. (2016b) 200 posterior trees $$+$$ past global temperatures $$+$$ past altitude of the HTP 8 models for each variable to test whether rates vary with the paleoenvironment Speciation and extinction are positively associated with warm climate. Speciation or extinction is not significantly associated with past altitude Notes: Evolutionary models are divided by whether they are used to test hypotheses in favor of the Red Queen or to support the Court Jester. MCC $$=$$ maximum clade credibility; HTP $$=$$ Himalaya and Tibetan Plateau. Diversification in a RQ model. Diversity-dependent diversification models estimated an unlimited carrying capacity for the subfamily (Supplementary Appendix S13 available on Dryad), implying that accumulating diversity did not influence diversification rates. The best-fitting DDD model was one in which extinction rate increased with diversity. For Parnassius the best model was a diversity-dependence process with speciation decreasing with increasing diversity. However, none of these models received significant support when compared with other models ($$\Delta$$AICc $$<$$ 2). MuSSE analyses on host-plant preferences selected a model in which speciation rates varied among clades feeding on different host plants; extinction and transition rates were estimated as equal (Supplementary Appendix S14 available on Dryad). Zygophyllaceae feeders (Hypermnestra) had the lowest speciation rate, followed by Aristolochiaceae feeders (Luehdorfiini and Zerynthiini), with the highest rates for clades feeding on Papaveraceae (all Parnassius except subgenus Parnassius) and Crassulaceae $$+$$ Saxifragaceae (subgenus Parnassius) (Fig. 5). MCMC credibility intervals showed significant differences between the speciation rates of Papaveraceae feeders and Crassulaceae $$+$$ Saxifragaceae feeders and those feeding on other host plants (Supplementary Appendix S15 available on Dryad). Transition rates for host-plant switches were often estimated close to zero with strong niche conservatism in the phylogeny; once a lineage shifted to a new host it did not switch back and only rarely colonized another host plant (Supplementary Appendix S14 available on Dryad). We assessed the robustness of this pattern with a simulation procedure, and Bayesian implementations of HiSSE and MuSSE in RevBayes (Supplementary Appendix S16 available on Dryad). All analyses supported the same diversification pattern with speciation rates being different among lineages feeding on different host plants. Our estimates were robust to known biases of SSE models, suggesting a strong effect of host plant on diversification rates of Parnassiinae. Figure 5. View largeDownload slide Diversification of Parnassiinae linked to their host plant. a) Phylogenetic distribution of host-plant traits (four states) analyzed with MuSSE diversification models. b) Bayesian inferences made with the best-fitting MuSSE model (see Supplementary Appendix S14) showed that speciation rates vary according to the host-plant trait: Parnassius clades feeding on Papaveraceae and Crassulaceae $$+$$ Saxifragaceae have higher speciation rates than their relatives feeding on other families; extinction and transition rates were estimated as equal across clades. Figure 5. View largeDownload slide Diversification of Parnassiinae linked to their host plant. a) Phylogenetic distribution of host-plant traits (four states) analyzed with MuSSE diversification models. b) Bayesian inferences made with the best-fitting MuSSE model (see Supplementary Appendix S14) showed that speciation rates vary according to the host-plant trait: Parnassius clades feeding on Papaveraceae and Crassulaceae $$+$$ Saxifragaceae have higher speciation rates than their relatives feeding on other families; extinction and transition rates were estimated as equal across clades. Bayesian analyses that model rate-heterogeneity across clades provided further corroboration of the SSE inferences (Fig. 6a). Both BAMM and an alternative implementation of lineage-specific birth–death model in RevBayes supported a diversification regime in which there is at least an increase in diversification rates along the stem of Parnassius (and possibly a second upshift at the stem of subgenus Parnassius) and found higher speciation rates for host-plant feeders on Crassulaceae $$+$$ Saxifragaceae and Papaveraceae than those on Aristolochiaceae or Zygophyllaceae (Supplementary Appendices S17–S19 available on Dryad). BAMM was insensitive to the CPP prior. Figure 6. View largeDownload slide Diversification dynamics of Parnassiinae inferred with time-dependent models. a) Rate-heterogeneity-across-clades models implemented in BAMM and RevBayes identified one significant shift in diversification rates along the stem leading to Parnassius (represented by a red circle; see also Supplementary Appendices S17–S19), with a higher speciation rate in this clade than in the rest of the tree. Episodic birth–death models implemented in CoMET (b) and TreePar (c) identified a possible mass extinction event in the mid-Miocene ($$\sim$$15 Ma). The blue histogram shows the uncertainty in the timing of the mass extinction event based on the estimates over 200 trees with TreePar (similar estimates are obtained for CoMET; Supplementary Appendix S20). The 2 models also estimated an increase in diversity around the early-mid Pliocene (b and c), but this was followed by a decrease in diversity toward the present time in CoMET which was not detected in TreePar. The red line shows the lineages-through-time plot. The green lines represent the reconstructed past diversity before and after the mass extinction, based on the net diversification rates and 7.5% of survival probability estimated in TreePar. Plio. $$=$$ Pliocene; Ple. $$=$$ Pleistocene. Figure 6. View largeDownload slide Diversification dynamics of Parnassiinae inferred with time-dependent models. a) Rate-heterogeneity-across-clades models implemented in BAMM and RevBayes identified one significant shift in diversification rates along the stem leading to Parnassius (represented by a red circle; see also Supplementary Appendices S17–S19), with a higher speciation rate in this clade than in the rest of the tree. Episodic birth–death models implemented in CoMET (b) and TreePar (c) identified a possible mass extinction event in the mid-Miocene ($$\sim$$15 Ma). The blue histogram shows the uncertainty in the timing of the mass extinction event based on the estimates over 200 trees with TreePar (similar estimates are obtained for CoMET; Supplementary Appendix S20). The 2 models also estimated an increase in diversity around the early-mid Pliocene (b and c), but this was followed by a decrease in diversity toward the present time in CoMET which was not detected in TreePar. The red line shows the lineages-through-time plot. The green lines represent the reconstructed past diversity before and after the mass extinction, based on the net diversification rates and 7.5% of survival probability estimated in TreePar. Plio. $$=$$ Pliocene; Ple. $$=$$ Pleistocene. Diversification in a CJ model. When mass extinctions are disallowed, TreePar supported a model with a single shift at 3.2 Ma (Table 2), with a low initial rate of diversification (r$$_{\mathrm{2}} = 0.031$$) and high turnover ($$\varepsilon _{\mathrm{2}} = 0.91$$), followed by 4-fold increase in diversification rate (r$$_{\mathrm{1}} = 0.119$$) and a joint decrease of turnover ($$\varepsilon_{\mathrm{1}} = 0.17$$). The second best model identified a second shift in the mid-Miocene (15.3 Ma) marked by an extinction period surrounded by low and negative values for the net diversification rate, and values $$>$$1 for turnover. When mass extinctions are allowed to occur, TreePar favored a significant mass extinction at 14.8 Ma with an estimated survival probability after the mass extinction of 7.5% (Table 2; Fig. 6c). The CoMET model implemented in RevBayes with the autocorrelated model also identified a mass extinction event in the mid-Miocene $$\sim$$15 Ma (2lnBF $$=$$ 6, Supplementary Appendix S20 on Dryad). There was also a significant increase in speciation rates around the Pliocene, followed by a decrease in speciation toward the present (Fig. 6b); similar results were obtained with the TESS implementation of CoMET (Supplementary Appendix S20 available on Dryad). Table 2. Results from episodic diversification analyses in TreePar with or without mass extinction Mass extinction events disallowed Mass extinction events allowed BD with BD with BD with BD with BD with BD with BD with Models no shift time 1 shift time 2 shift times 3 shift times Models no ME 1 ME 2 MEs NP 2 5 8 11 NP 2 4 6 logL –182.840,0.5640 –175.099,0.5727 –172.910,0.5716 –170.504,0.5778 logL –240.789,0.5634 –235.792,0.5667 –234.082,0.5760 AICc 369.827,1.1281 360.959,1.1454 363.714,1.1432 366.624,1.1555 AICc 485.726,1.1267 482.345,1.1334 486.058,1.1520 $$\Delta$$AIC 8.868 0 2.755 5.665 $$\Delta$$AIC r1 0.0934,0.0008 0.1188,0.0057 0.0915,0.0067 0.0711,0.0105 r 0.0930,0.0008 0.1672,0.0012 0.1859,0.0014 $$\varepsilon1$$ 0.5374,0.0047 0.1699,0.0355 0.3109,0.0411 0.5288,0.1224 $$\varepsilon$$ 0.5363,0.0042 0.0279,0.0038 0.0025,0.0010 st1 — 3.21,0.0573 2.81,0.0693 2.56,0.0788 st1 — 14.82,0.12 7.37,0.17 r2 — 0.0312,0.0007 0.0254,0.0166 0.0031,0.0295 sp1 — 0.0749,0.0021 0.466,0.0144 $$\varepsilon2$$ — 0.9142,0.0052 0.8319,0.0529 0.9179,0.0795 st2 — — 16.08,0.29 st2 — — 15.29,0.8932 10.47,0.6951 sp2 — — 0.0759,0.0028 r3 — — –0.3498,0.1026 0.0302,0.0190 $$\varepsilon3$$ — — 0.7616,0.0380 0.8461,0.0506 st3 — — — 20.29,0.9277 r4 — — — –0.9929,0.1753 $$\varepsilon4$$ — — — 0.8571,0.0403 Mass extinction events disallowed Mass extinction events allowed BD with BD with BD with BD with BD with BD with BD with Models no shift time 1 shift time 2 shift times 3 shift times Models no ME 1 ME 2 MEs NP 2 5 8 11 NP 2 4 6 logL –182.840,0.5640 –175.099,0.5727 –172.910,0.5716 –170.504,0.5778 logL –240.789,0.5634 –235.792,0.5667 –234.082,0.5760 AICc 369.827,1.1281 360.959,1.1454 363.714,1.1432 366.624,1.1555 AICc 485.726,1.1267 482.345,1.1334 486.058,1.1520 $$\Delta$$AIC 8.868 0 2.755 5.665 $$\Delta$$AIC r1 0.0934,0.0008 0.1188,0.0057 0.0915,0.0067 0.0711,0.0105 r 0.0930,0.0008 0.1672,0.0012 0.1859,0.0014 $$\varepsilon1$$ 0.5374,0.0047 0.1699,0.0355 0.3109,0.0411 0.5288,0.1224 $$\varepsilon$$ 0.5363,0.0042 0.0279,0.0038 0.0025,0.0010 st1 — 3.21,0.0573 2.81,0.0693 2.56,0.0788 st1 — 14.82,0.12 7.37,0.17 r2 — 0.0312,0.0007 0.0254,0.0166 0.0031,0.0295 sp1 — 0.0749,0.0021 0.466,0.0144 $$\varepsilon2$$ — 0.9142,0.0052 0.8319,0.0529 0.9179,0.0795 st2 — — 16.08,0.29 st2 — — 15.29,0.8932 10.47,0.6951 sp2 — — 0.0759,0.0028 r3 — — –0.3498,0.1026 0.0302,0.0190 $$\varepsilon3$$ — — 0.7616,0.0380 0.8461,0.0506 st3 — — — 20.29,0.9277 r4 — — — –0.9929,0.1753 $$\varepsilon4$$ — — — 0.8571,0.0403 Notes: Bold columns represent the best model. When mass extinction is disallowed, the best-supported model has a single rate-shift, as determined by the lowest AICc and $$\Delta$$AICc. When mass extinction is allowed, the best model also has a single mass-extinction event. Adding more diversification shifts or mass extinctions did not significantly improve the likelihood of the models. “r1” denotes the diversification rate and “$$\varepsilon$$1” is the turnover, both inferred between Present and the shift time 1 (“st1”). BD $$=$$ birth–death; ME $$=$$ mass extinction; NP $$=$$ number of parameters; logL $$=$$ log-likelihood; AICc $$=$$ corrected Akaike Information Criterion; $$\Delta$$AICc $$=$$ the difference in AICc between the model with the lowest AICc and the others. Parameter estimates: r $$=$$ net diversification rate (speciation minus extinction); $$\varepsilon =$$ turnover (extinction over speciation); st $$=$$ shift time; sp $$=$$ survival probability at a mass extinction event. Table 2. Results from episodic diversification analyses in TreePar with or without mass extinction Mass extinction events disallowed Mass extinction events allowed BD with BD with BD with BD with BD with BD with BD with Models no shift time 1 shift time 2 shift times 3 shift times Models no ME 1 ME 2 MEs NP 2 5 8 11 NP 2 4 6 logL –182.840,0.5640 –175.099,0.5727 –172.910,0.5716 –170.504,0.5778 logL –240.789,0.5634 –235.792,0.5667 –234.082,0.5760 AICc 369.827,1.1281 360.959,1.1454 363.714,1.1432 366.624,1.1555 AICc 485.726,1.1267 482.345,1.1334 486.058,1.1520 $$\Delta$$AIC 8.868 0 2.755 5.665 $$\Delta$$AIC r1 0.0934,0.0008 0.1188,0.0057 0.0915,0.0067 0.0711,0.0105 r 0.0930,0.0008 0.1672,0.0012 0.1859,0.0014 $$\varepsilon1$$ 0.5374,0.0047 0.1699,0.0355 0.3109,0.0411 0.5288,0.1224 $$\varepsilon$$ 0.5363,0.0042 0.0279,0.0038 0.0025,0.0010 st1 — 3.21,0.0573 2.81,0.0693 2.56,0.0788 st1 — 14.82,0.12 7.37,0.17 r2 — 0.0312,0.0007 0.0254,0.0166 0.0031,0.0295 sp1 — 0.0749,0.0021 0.466,0.0144 $$\varepsilon2$$ — 0.9142,0.0052 0.8319,0.0529 0.9179,0.0795 st2 — — 16.08,0.29 st2 — — 15.29,0.8932 10.47,0.6951 sp2 — — 0.0759,0.0028 r3 — — –0.3498,0.1026 0.0302,0.0190 $$\varepsilon3$$ — — 0.7616,0.0380 0.8461,0.0506 st3 — — — 20.29,0.9277 r4 — — — –0.9929,0.1753 $$\varepsilon4$$ — — — 0.8571,0.0403 Mass extinction events disallowed Mass extinction events allowed BD with BD with BD with BD with BD with BD with BD with Models no shift time 1 shift time 2 shift times 3 shift times Models no ME 1 ME 2 MEs NP 2 5 8 11 NP 2 4 6 logL –182.840,0.5640 –175.099,0.5727 –172.910,0.5716 –170.504,0.5778 logL –240.789,0.5634 –235.792,0.5667 –234.082,0.5760 AICc 369.827,1.1281 360.959,1.1454 363.714,1.1432 366.624,1.1555 AICc 485.726,1.1267 482.345,1.1334 486.058,1.1520 $$\Delta$$AIC 8.868 0 2.755 5.665 $$\Delta$$AIC r1 0.0934,0.0008 0.1188,0.0057 0.0915,0.0067 0.0711,0.0105 r 0.0930,0.0008 0.1672,0.0012 0.1859,0.0014 $$\varepsilon1$$ 0.5374,0.0047 0.1699,0.0355 0.3109,0.0411 0.5288,0.1224 $$\varepsilon$$ 0.5363,0.0042 0.0279,0.0038 0.0025,0.0010 st1 — 3.21,0.0573 2.81,0.0693 2.56,0.0788 st1 — 14.82,0.12 7.37,0.17 r2 — 0.0312,0.0007 0.0254,0.0166 0.0031,0.0295 sp1 — 0.0749,0.0021 0.466,0.0144 $$\varepsilon2$$ — 0.9142,0.0052 0.8319,0.0529 0.9179,0.0795 st2 — — 16.08,0.29 st2 — — 15.29,0.8932 10.47,0.6951 sp2 — — 0.0759,0.0028 r3 — — –0.3498,0.1026 0.0302,0.0190 $$\varepsilon3$$ — — 0.7616,0.0380 0.8461,0.0506 st3 — — — 20.29,0.9277 r4 — — — –0.9929,0.1753 $$\varepsilon4$$ — — — 0.8571,0.0403 Notes: Bold columns represent the best model. When mass extinction is disallowed, the best-supported model has a single rate-shift, as determined by the lowest AICc and $$\Delta$$AICc. When mass extinction is allowed, the best model also has a single mass-extinction event. Adding more diversification shifts or mass extinctions did not significantly improve the likelihood of the models. “r1” denotes the diversification rate and “$$\varepsilon$$1” is the turnover, both inferred between Present and the shift time 1 (“st1”). BD $$=$$ birth–death; ME $$=$$ mass extinction; NP $$=$$ number of parameters; logL $$=$$ log-likelihood; AICc $$=$$ corrected Akaike Information Criterion; $$\Delta$$AICc $$=$$ the difference in AICc between the model with the lowest AICc and the others. Parameter estimates: r $$=$$ net diversification rate (speciation minus extinction); $$\varepsilon =$$ turnover (extinction over speciation); st $$=$$ shift time; sp $$=$$ survival probability at a mass extinction event. GeoSSE analyses selected a best-fitting model with equal dispersal and extinction rates between altitudinal regions (habitats), but higher speciation rates for highland regions when compared with lowland regions (Supplementary Appendix S14 available on Dryad). Allopatric speciation (for lowland/highland ancestors) was inferred to occur at a higher rate than within-lowland speciation, but at a lower rate than within-highland speciation; these differences remained significant in the Bayesian MCMC analyses, as the credibility interval did not overlap with zero (Fig. 7). Nevertheless, the difference in speciation rates between higher and lower altitudes was not larger than expected by chance in our simulation tests (Supplementary Appendix S16 available on Dryad), so we cannot reject the observed effect of altitude on speciation being affected by the shape of the phylogeny. Figure 7. View largeDownload slide Inference of geographic mode of speciation between mountain and lowland species. a) The phylogenetic distribution of geographic traits (lowland, mountain, or “widespread”) analyzed with GeoSSE models of diversification. b) Bayesian inferences made with the best-fitting GeoSSE model (selected over a series of 12 diversification models) showing that only speciation varies according to the geographic trait. The results indicate that the mountain-dwelling Parnassiinae species have significantly higher speciation rates than their counterparts. The allopatric (or biome divergence) speciation rate is also estimated to be elevated, congruent with our biogeographic estimates. Figure 7. View largeDownload slide Inference of geographic mode of speciation between mountain and lowland species. a) The phylogenetic distribution of geographic traits (lowland, mountain, or “widespread”) analyzed with GeoSSE models of diversification. b) Bayesian inferences made with the best-fitting GeoSSE model (selected over a series of 12 diversification models) showing that only speciation varies according to the geographic trait. The results indicate that the mountain-dwelling Parnassiinae species have significantly higher speciation rates than their counterparts. The allopatric (or biome divergence) speciation rate is also estimated to be elevated, congruent with our biogeographic estimates. The ML-based paleoenvironment-dependent model implemented in RPANDA indicated positive dependence between diversification rates and past temperature (Table 3). In the best model, both speciation and extinction rates increased exponentially with temperature, with a general decrease in diversification rates toward the present, combined with periods of high turnover rates in the late Oligocene and mid-Miocene (Fig. 8). The best paleo-elevation model had speciation increasing with increasing altitude, whereas extinction remained constant (and not affected by elevation fluctuations), but this model does not fit better than constant-rate or temperature-dependent models (Table 3). In contrast, the Bayesian paleoenvironment-dependent model implemented in RevBayes indicated that both past temperature and elevation did not impact the diversification of Parnassiinae, since the correlation parameters $$\alpha$$ and $$\beta$$ had credibility intervals that were not significantly different from zero (Supplementary Appendix S21 available on Dryad). Moreover, estimates with this model (Fig. 8) showed diversification dynamics that were similar to those obtained with CoMET and TreePar, but unlike those from RPANDA. These differences might be attributed to the inference framework (ML in RPANDA vs. BI in RevBayes) or to the different underlying birth–death models (continuous in RPANDA vs. episodic in RevBayes). To test this, we implemented the continuous-paleoenvironmental models (Condamine et al. 2013) in TESS. We then compared these models against a constant birth–death model, episodic (piecewise constant) birth–death models (1–5 shifts), and time-variable models. This procedure ensured that a single statistical framework (ML and AIC) implemented in TESS was used to compare all models. Results showed that the best-fitting model was one in which speciation and extinction rates vary positively with paleotemperature, in agreement with RPANDA (Tables 4 and 5). We, therefore, concluded that the different results are related not to the inference framework but to the underlying model: the use of an episodic birth–death model with autocorrelated rates across time intervals in the RevBayes model rather than the uncorrelated episodic model implemented in TESS. Figure 8. View largeDownload slide Paleoenvironment-dependent diversification processes in Apollo butterflies. ML continuous paleoenvironment-dependent models implemented in RPANDA (a–d) showed positive dependence between paleotemperatures and speciation/extinction; the paleo-elevation-dependent diversification models did not fit the data better than the constant-rate or time-variable models. Bayesian inference of environment-dependent diversification with autocorrelated rates in RevBayes (e–h) supports a model in which both speciation and extinction are not dependent on temperature and elevation changes (represented by dotted lines, temperature and elevation values are given in Figure 2). Colored areas in speciation and extinction plots indicate the credibility interval, with the continuous line representing the median rate. Pli $$=$$ Pliocene; P $$=$$ Pleistocene. Figure 8. View largeDownload slide Paleoenvironment-dependent diversification processes in Apollo butterflies. ML continuous paleoenvironment-dependent models implemented in RPANDA (a–d) showed positive dependence between paleotemperatures and speciation/extinction; the paleo-elevation-dependent diversification models did not fit the data better than the constant-rate or time-variable models. Bayesian inference of environment-dependent diversification with autocorrelated rates in RevBayes (e–h) supports a model in which both speciation and extinction are not dependent on temperature and elevation changes (represented by dotted lines, temperature and elevation values are given in Figure 2). Colored areas in speciation and extinction plots indicate the credibility interval, with the continuous line representing the median rate. Pli $$=$$ Pliocene; P $$=$$ Pleistocene. Table 3. Results from RPANDA analyses Models NP logL AICc $$\Delta$$AIC AI$$\omega$$ $$\lambda_{\mathrm{0}}$$ $$\alpha$$ $$\mu_{\mathrm{0}}$$ $$\beta$$ Constant birth–death 2 $$-240.336 \pm 0.620$$ 484.819 $${\pm}$$ 1.240 5.186 0.044 0.2048 $${\pm}$$ 0.0018 — 0.1123,0.0017 — $$\lambda_{\mathrm{Time}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-238.427 \pm 0.624$$ 483.151 $${\pm}$$ 1.249 3.518 0.101 0.1962 $${\pm}$$ 0.0016 –0.0407,0.0004 5.00E-08,5.00E-08 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Time}}$$ 3 $$-239.089 \pm 0.621$$ 484.474 $${\pm}$$ 1.242 4.841 0.052 0.1914 $${\pm}$$ 0.0016 — 0.0549,0.0010 0.0433,0.0006 $$\lambda_{\mathrm{Time}}$$ and $$\mu_{\mathrm{Time}}$$ 4 $$-238.427 \pm 0.624$$ 485.354 $${\pm}$$ 1.249 5.721 0.034 0.1962 $${\pm}$$ 0.0016 –0.0405,0.00004 8.535E-05,3.24E-05 0.0347,0.0024 $$\lambda_{\mathrm{Alti.}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-240.247 \pm 0.572$$ 486.790 $${\pm}$$ 1.144 7.157 0.016 0.0310 $${\pm}$$ 0.0012 0.0004,6.35E-06 0.0794,0.0021 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Alti.}}$$ 3 $$-241.542 \pm 0.577$$ 489.379 $${\pm}$$ 1.155 9.746 0.004 0.1991 $${\pm}$$ 0.0017 — 0.5116,0.0676 –0.00024,1.35E-05 $$\lambda_{\mathrm{Alti.}}$$ and $$\mu_{\mathrm{Alti.}}$$ 4 $$-240.253 \pm 0.582$$ 489.005 $${\pm}$$ 1.164 9.372 0.005 0.0454 $${\pm}$$ 0.0021 0.0003,9.30E-06 0.1016,0.0058 –2.67E-05,6.29E-05 $$\lambda_{\mathrm{Temp.}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-240.605 \pm 0.632$$ 487.505 $${\pm}$$ 1.264 7.872 0.011 0.2121 $${\pm}$$ 0.0021 –0.0242,0.0012 0.0834,0.0011 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Temp.}}$$ 3 $$-238.058 \pm 0.649$$ 482.412 $${\pm}$$ 1.298 2.779 0.146 0.1817 $${\pm}$$ 0.0013 — 0.0071,0.0005 0.4707,0.0138 $$\lambda_{\mathrm{\mathbf{Temp.}}}$$and$$\mu_{\mathrm{\mathbf{Temp.}}}$$ 4 $$-235.566 \pm 0.595$$ 479.633 $${\pm}$$ 1.189 0 0.586 0.0839 $${\pm}$$ 0.0017 0.3811,0.0073 0.0797,0.0019 0.3988,0.0041 Models NP logL AICc $$\Delta$$AIC AI$$\omega$$ $$\lambda_{\mathrm{0}}$$ $$\alpha$$ $$\mu_{\mathrm{0}}$$ $$\beta$$ Constant birth–death 2 $$-240.336 \pm 0.620$$ 484.819 $${\pm}$$ 1.240 5.186 0.044 0.2048 $${\pm}$$ 0.0018 — 0.1123,0.0017 — $$\lambda_{\mathrm{Time}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-238.427 \pm 0.624$$ 483.151 $${\pm}$$ 1.249 3.518 0.101 0.1962 $${\pm}$$ 0.0016 –0.0407,0.0004 5.00E-08,5.00E-08 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Time}}$$ 3 $$-239.089 \pm 0.621$$ 484.474 $${\pm}$$ 1.242 4.841 0.052 0.1914 $${\pm}$$ 0.0016 — 0.0549,0.0010 0.0433,0.0006 $$\lambda_{\mathrm{Time}}$$ and $$\mu_{\mathrm{Time}}$$ 4 $$-238.427 \pm 0.624$$ 485.354 $${\pm}$$ 1.249 5.721 0.034 0.1962 $${\pm}$$ 0.0016 –0.0405,0.00004 8.535E-05,3.24E-05 0.0347,0.0024 $$\lambda_{\mathrm{Alti.}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-240.247 \pm 0.572$$ 486.790 $${\pm}$$ 1.144 7.157 0.016 0.0310 $${\pm}$$ 0.0012 0.0004,6.35E-06 0.0794,0.0021 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Alti.}}$$ 3 $$-241.542 \pm 0.577$$ 489.379 $${\pm}$$ 1.155 9.746 0.004 0.1991 $${\pm}$$ 0.0017 — 0.5116,0.0676 –0.00024,1.35E-05 $$\lambda_{\mathrm{Alti.}}$$ and $$\mu_{\mathrm{Alti.}}$$ 4 $$-240.253 \pm 0.582$$ 489.005 $${\pm}$$ 1.164 9.372 0.005 0.0454 $${\pm}$$ 0.0021 0.0003,9.30E-06 0.1016,0.0058 –2.67E-05,6.29E-05 $$\lambda_{\mathrm{Temp.}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-240.605 \pm 0.632$$ 487.505 $${\pm}$$ 1.264 7.872 0.011 0.2121 $${\pm}$$ 0.0021 –0.0242,0.0012 0.0834,0.0011 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Temp.}}$$ 3 $$-238.058 \pm 0.649$$ 482.412 $${\pm}$$ 1.298 2.779 0.146 0.1817 $${\pm}$$ 0.0013 — 0.0071,0.0005 0.4707,0.0138 $$\lambda_{\mathrm{\mathbf{Temp.}}}$$and$$\mu_{\mathrm{\mathbf{Temp.}}}$$ 4 $$-235.566 \pm 0.595$$ 479.633 $${\pm}$$ 1.189 0 0.586 0.0839 $${\pm}$$ 0.0017 0.3811,0.0073 0.0797,0.0019 0.3988,0.0041 Notes: Bold columns represent the best model. The model with positive dependence of speciation and extinction on temperature has the lowest AICc and highest AIC$$\omega$$. Macroevolutionary scenarios were compared in which speciation and extinction vary through time (Time), or by paleo-elevation changes of the Himalaya and Tibetan Plateau (Alti.), and global temperature changes (Temp.). Abbreviations: NP, number of parameters; logL, log-likelihood; AICc, corrected Akaike Information Criterion; $$\Delta$$AICc, the difference in AICc between the model with the lowest AICc and the others; AIC$$\omega$$, Akaike weight. Parameter estimates: $$\lambda_{\mathrm{0}}$$ and $$\mu _{\mathrm{0}}$$, speciation and extinction for environmental value at present; $$\alpha$$ and $$\beta$$, parameter controlling variation of speciation ($$\alpha )$$ and extinction ($$\beta )$$ with paleoenvironment. Table 3. Results from RPANDA analyses Models NP logL AICc $$\Delta$$AIC AI$$\omega$$ $$\lambda_{\mathrm{0}}$$ $$\alpha$$ $$\mu_{\mathrm{0}}$$ $$\beta$$ Constant birth–death 2 $$-240.336 \pm 0.620$$ 484.819 $${\pm}$$ 1.240 5.186 0.044 0.2048 $${\pm}$$ 0.0018 — 0.1123,0.0017 — $$\lambda_{\mathrm{Time}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-238.427 \pm 0.624$$ 483.151 $${\pm}$$ 1.249 3.518 0.101 0.1962 $${\pm}$$ 0.0016 –0.0407,0.0004 5.00E-08,5.00E-08 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Time}}$$ 3 $$-239.089 \pm 0.621$$ 484.474 $${\pm}$$ 1.242 4.841 0.052 0.1914 $${\pm}$$ 0.0016 — 0.0549,0.0010 0.0433,0.0006 $$\lambda_{\mathrm{Time}}$$ and $$\mu_{\mathrm{Time}}$$ 4 $$-238.427 \pm 0.624$$ 485.354 $${\pm}$$ 1.249 5.721 0.034 0.1962 $${\pm}$$ 0.0016 –0.0405,0.00004 8.535E-05,3.24E-05 0.0347,0.0024 $$\lambda_{\mathrm{Alti.}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-240.247 \pm 0.572$$ 486.790 $${\pm}$$ 1.144 7.157 0.016 0.0310 $${\pm}$$ 0.0012 0.0004,6.35E-06 0.0794,0.0021 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Alti.}}$$ 3 $$-241.542 \pm 0.577$$ 489.379 $${\pm}$$ 1.155 9.746 0.004 0.1991 $${\pm}$$ 0.0017 — 0.5116,0.0676 –0.00024,1.35E-05 $$\lambda_{\mathrm{Alti.}}$$ and $$\mu_{\mathrm{Alti.}}$$ 4 $$-240.253 \pm 0.582$$ 489.005 $${\pm}$$ 1.164 9.372 0.005 0.0454 $${\pm}$$ 0.0021 0.0003,9.30E-06 0.1016,0.0058 –2.67E-05,6.29E-05 $$\lambda_{\mathrm{Temp.}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-240.605 \pm 0.632$$ 487.505 $${\pm}$$ 1.264 7.872 0.011 0.2121 $${\pm}$$ 0.0021 –0.0242,0.0012 0.0834,0.0011 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Temp.}}$$ 3 $$-238.058 \pm 0.649$$ 482.412 $${\pm}$$ 1.298 2.779 0.146 0.1817 $${\pm}$$ 0.0013 — 0.0071,0.0005 0.4707,0.0138 $$\lambda_{\mathrm{\mathbf{Temp.}}}$$and$$\mu_{\mathrm{\mathbf{Temp.}}}$$ 4 $$-235.566 \pm 0.595$$ 479.633 $${\pm}$$ 1.189 0 0.586 0.0839 $${\pm}$$ 0.0017 0.3811,0.0073 0.0797,0.0019 0.3988,0.0041 Models NP logL AICc $$\Delta$$AIC AI$$\omega$$ $$\lambda_{\mathrm{0}}$$ $$\alpha$$ $$\mu_{\mathrm{0}}$$ $$\beta$$ Constant birth–death 2 $$-240.336 \pm 0.620$$ 484.819 $${\pm}$$ 1.240 5.186 0.044 0.2048 $${\pm}$$ 0.0018 — 0.1123,0.0017 — $$\lambda_{\mathrm{Time}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-238.427 \pm 0.624$$ 483.151 $${\pm}$$ 1.249 3.518 0.101 0.1962 $${\pm}$$ 0.0016 –0.0407,0.0004 5.00E-08,5.00E-08 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Time}}$$ 3 $$-239.089 \pm 0.621$$ 484.474 $${\pm}$$ 1.242 4.841 0.052 0.1914 $${\pm}$$ 0.0016 — 0.0549,0.0010 0.0433,0.0006 $$\lambda_{\mathrm{Time}}$$ and $$\mu_{\mathrm{Time}}$$ 4 $$-238.427 \pm 0.624$$ 485.354 $${\pm}$$ 1.249 5.721 0.034 0.1962 $${\pm}$$ 0.0016 –0.0405,0.00004 8.535E-05,3.24E-05 0.0347,0.0024 $$\lambda_{\mathrm{Alti.}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-240.247 \pm 0.572$$ 486.790 $${\pm}$$ 1.144 7.157 0.016 0.0310 $${\pm}$$ 0.0012 0.0004,6.35E-06 0.0794,0.0021 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Alti.}}$$ 3 $$-241.542 \pm 0.577$$ 489.379 $${\pm}$$ 1.155 9.746 0.004 0.1991 $${\pm}$$ 0.0017 — 0.5116,0.0676 –0.00024,1.35E-05 $$\lambda_{\mathrm{Alti.}}$$ and $$\mu_{\mathrm{Alti.}}$$ 4 $$-240.253 \pm 0.582$$ 489.005 $${\pm}$$ 1.164 9.372 0.005 0.0454 $${\pm}$$ 0.0021 0.0003,9.30E-06 0.1016,0.0058 –2.67E-05,6.29E-05 $$\lambda_{\mathrm{Temp.}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-240.605 \pm 0.632$$ 487.505 $${\pm}$$ 1.264 7.872 0.011 0.2121 $${\pm}$$ 0.0021 –0.0242,0.0012 0.0834,0.0011 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Temp.}}$$ 3 $$-238.058 \pm 0.649$$ 482.412 $${\pm}$$ 1.298 2.779 0.146 0.1817 $${\pm}$$ 0.0013 — 0.0071,0.0005 0.4707,0.0138 $$\lambda_{\mathrm{\mathbf{Temp.}}}$$and$$\mu_{\mathrm{\mathbf{Temp.}}}$$ 4 $$-235.566 \pm 0.595$$ 479.633 $${\pm}$$ 1.189 0 0.586 0.0839 $${\pm}$$ 0.0017 0.3811,0.0073 0.0797,0.0019 0.3988,0.0041 Notes: Bold columns represent the best model. The model with positive dependence of speciation and extinction on temperature has the lowest AICc and highest AIC$$\omega$$. Macroevolutionary scenarios were compared in which speciation and extinction vary through time (Time), or by paleo-elevation changes of the Himalaya and Tibetan Plateau (Alti.), and global temperature changes (Temp.). Abbreviations: NP, number of parameters; logL, log-likelihood; AICc, corrected Akaike Information Criterion; $$\Delta$$AICc, the difference in AICc between the model with the lowest AICc and the others; AIC$$\omega$$, Akaike weight. Parameter estimates: $$\lambda_{\mathrm{0}}$$ and $$\mu _{\mathrm{0}}$$, speciation and extinction for environmental value at present; $$\alpha$$ and $$\beta$$, parameter controlling variation of speciation ($$\alpha )$$ and extinction ($$\beta )$$ with paleoenvironment. Table 4. Results from the TESS analyses using paleoenvironment-dependent diversification models Models NP logL AICc $$\Delta$$AIC AIC$$\omega$$ $$\lambda$$ $$\alpha$$ $$\mu$$ $$\beta$$ Constant birth–death 2 $$-240.687$$ 485.425 3.668 0.044 0.2037 — 0.1113 — $$\lambda_{\mathrm{Time}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-238.811$$ 483.775 2.018 0.101 0.1951 $$-0.0402$$ 2.01E-06 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Time}}$$ 3 $$-239.331$$ 484.816 3.059 0.060 0.1898 — 0.0515 0.0467 $$\lambda_{\mathrm{Time}}$$ and $$\mu_{\mathrm{Time}}$$ 4 $$-238.802$$ 485.916 4.159 0.035 0.1951 $$-0.0398$$ 0.0002 0.0330 $$\lambda_{\mathrm{Alti.}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-239.570$$ 485.294 3.537 0.047 0.0483 0.0003 0.09450 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Alti.}}$$ 3 $$-240.416$$ 486.985 5.228 0.020 0.1999 — 0.3730 $$-0.00014$$ $$\lambda_{\mathrm{Alti.}}$$ and $$\mu_{\mathrm{Alti.}}$$ 4 $$-239.378$$ 487.067 5.310 0.019 0.0472 0.0004 0.1169 0.0002 $$\lambda_{\mathrm{Temp.}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-239.570$$ 485.294 3.537 0.047 0.0483 0.0003 0.0945 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Temp.}}$$ 3 $$-237.826$$ 481.806 0.049 0.271 0.1800 — 0.0064 0.5007 $$\lambda$$$$_{\mathrm{\mathbf{Temp.}}}$$and$$\mu$$$$_{\mathrm{\mathbf{Temp.}}}$$ 4 $$-236.723$$ 481.757 0 0.278 0.1383 0.1463 0.0458 0.3483 Models NP logL AICc $$\Delta$$AIC AIC$$\omega$$ $$\lambda$$ $$\alpha$$ $$\mu$$ $$\beta$$ Constant birth–death 2 $$-240.687$$ 485.425 3.668 0.044 0.2037 — 0.1113 — $$\lambda_{\mathrm{Time}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-238.811$$ 483.775 2.018 0.101 0.1951 $$-0.0402$$ 2.01E-06 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Time}}$$ 3 $$-239.331$$ 484.816 3.059 0.060 0.1898 — 0.0515 0.0467 $$\lambda_{\mathrm{Time}}$$ and $$\mu_{\mathrm{Time}}$$ 4 $$-238.802$$ 485.916 4.159 0.035 0.1951 $$-0.0398$$ 0.0002 0.0330 $$\lambda_{\mathrm{Alti.}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-239.570$$ 485.294 3.537 0.047 0.0483 0.0003 0.09450 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Alti.}}$$ 3 $$-240.416$$ 486.985 5.228 0.020 0.1999 — 0.3730 $$-0.00014$$ $$\lambda_{\mathrm{Alti.}}$$ and $$\mu_{\mathrm{Alti.}}$$ 4 $$-239.378$$ 487.067 5.310 0.019 0.0472 0.0004 0.1169 0.0002 $$\lambda_{\mathrm{Temp.}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-239.570$$ 485.294 3.537 0.047 0.0483 0.0003 0.0945 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Temp.}}$$ 3 $$-237.826$$ 481.806 0.049 0.271 0.1800 — 0.0064 0.5007 $$\lambda$$$$_{\mathrm{\mathbf{Temp.}}}$$and$$\mu$$$$_{\mathrm{\mathbf{Temp.}}}$$ 4 $$-236.723$$ 481.757 0 0.278 0.1383 0.1463 0.0458 0.3483 Notes: Bold columns represent the best model. Table 4. Results from the TESS analyses using paleoenvironment-dependent diversification models Models NP logL AICc $$\Delta$$AIC AIC$$\omega$$ $$\lambda$$ $$\alpha$$ $$\mu$$ $$\beta$$ Constant birth–death 2 $$-240.687$$ 485.425 3.668 0.044 0.2037 — 0.1113 — $$\lambda_{\mathrm{Time}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-238.811$$ 483.775 2.018 0.101 0.1951 $$-0.0402$$ 2.01E-06 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Time}}$$ 3 $$-239.331$$ 484.816 3.059 0.060 0.1898 — 0.0515 0.0467 $$\lambda_{\mathrm{Time}}$$ and $$\mu_{\mathrm{Time}}$$ 4 $$-238.802$$ 485.916 4.159 0.035 0.1951 $$-0.0398$$ 0.0002 0.0330 $$\lambda_{\mathrm{Alti.}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-239.570$$ 485.294 3.537 0.047 0.0483 0.0003 0.09450 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Alti.}}$$ 3 $$-240.416$$ 486.985 5.228 0.020 0.1999 — 0.3730 $$-0.00014$$ $$\lambda_{\mathrm{Alti.}}$$ and $$\mu_{\mathrm{Alti.}}$$ 4 $$-239.378$$ 487.067 5.310 0.019 0.0472 0.0004 0.1169 0.0002 $$\lambda_{\mathrm{Temp.}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-239.570$$ 485.294 3.537 0.047 0.0483 0.0003 0.0945 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Temp.}}$$ 3 $$-237.826$$ 481.806 0.049 0.271 0.1800 — 0.0064 0.5007 $$\lambda$$$$_{\mathrm{\mathbf{Temp.}}}$$and$$\mu$$$$_{\mathrm{\mathbf{Temp.}}}$$ 4 $$-236.723$$ 481.757 0 0.278 0.1383 0.1463 0.0458 0.3483 Models NP logL AICc $$\Delta$$AIC AIC$$\omega$$ $$\lambda$$ $$\alpha$$ $$\mu$$ $$\beta$$ Constant birth–death 2 $$-240.687$$ 485.425 3.668 0.044 0.2037 — 0.1113 — $$\lambda_{\mathrm{Time}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-238.811$$ 483.775 2.018 0.101 0.1951 $$-0.0402$$ 2.01E-06 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Time}}$$ 3 $$-239.331$$ 484.816 3.059 0.060 0.1898 — 0.0515 0.0467 $$\lambda_{\mathrm{Time}}$$ and $$\mu_{\mathrm{Time}}$$ 4 $$-238.802$$ 485.916 4.159 0.035 0.1951 $$-0.0398$$ 0.0002 0.0330 $$\lambda_{\mathrm{Alti.}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-239.570$$ 485.294 3.537 0.047 0.0483 0.0003 0.09450 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Alti.}}$$ 3 $$-240.416$$ 486.985 5.228 0.020 0.1999 — 0.3730 $$-0.00014$$ $$\lambda_{\mathrm{Alti.}}$$ and $$\mu_{\mathrm{Alti.}}$$ 4 $$-239.378$$ 487.067 5.310 0.019 0.0472 0.0004 0.1169 0.0002 $$\lambda_{\mathrm{Temp.}}$$ and $$\mu_{\mathrm{constant}}$$ 3 $$-239.570$$ 485.294 3.537 0.047 0.0483 0.0003 0.0945 — $$\lambda_{\mathrm{constant}}$$ and $$\mu_{\mathrm{Temp.}}$$ 3 $$-237.826$$ 481.806 0.049 0.271 0.1800 — 0.0064 0.5007 $$\lambda$$$$_{\mathrm{\mathbf{Temp.}}}$$and$$\mu$$$$_{\mathrm{\mathbf{Temp.}}}$$ 4 $$-236.723$$ 481.757 0 0.278 0.1383 0.1463 0.0458 0.3483 Notes: Bold columns represent the best model. Table 5. Results from the TESS analyses using episodic birth–death models Models NP logL AICc $$\Delta$$AIC AIC$$\omega$$ $$\lambda_{\mathrm{1}}$$ $$\lambda_{\mathrm{2}}$$ $$\lambda_{\mathrm{3}}$$ $$\lambda_{\mathrm{4}}$$ $$\lambda_{\mathrm{5}}$$ $$\lambda_{\mathrm{6}}$$ $$\mu_{\mathrm{1}}$$ $$\mu_{\mathrm{2}}$$ $$\mu_{\mathrm{3}}$$ $$\mu_{\mathrm{4}}$$ $$\mu_{\mathrm{5}}$$ $$\mu_{\mathrm{6}}$$ Shift 1 5 $$-237.008$$ 484.542 2.785 0.069 0.052 0.176 — — — — 0.004 0.018 — — — — Shift 2 8 $$-236.086$$ 489.706 7.949 0.005 0.195 0.084 0.180 — — — 0.112 0.097 0.004 — — — Shift 3 11 $$-236.118$$ 497.379 15.622 0.0001 0.180 0.127 0.155 0.181 — — 0.111 0.051 0.159 0.002 — — Shift 4 14 $$-234.251$$ 501.935 20.178 0 0.164 0.159 0.049 0.161 0.190 — 0.136 0.062 0.074 0.058 0.008 — Models NP logL AICc $$\Delta$$AIC AIC$$\omega$$ $$\lambda_{\mathrm{1}}$$ $$\lambda_{\mathrm{2}}$$ $$\lambda_{\mathrm{3}}$$ $$\lambda_{\mathrm{4}}$$ $$\lambda_{\mathrm{5}}$$ $$\lambda_{\mathrm{6}}$$ $$\mu_{\mathrm{1}}$$ $$\mu_{\mathrm{2}}$$ $$\mu_{\mathrm{3}}$$ $$\mu_{\mathrm{4}}$$ $$\mu_{\mathrm{5}}$$ $$\mu_{\mathrm{6}}$$ Shift 1 5 $$-237.008$$ 484.542 2.785 0.069 0.052 0.176 — — — — 0.004 0.018 — — — — Shift 2 8 $$-236.086$$ 489.706 7.949 0.005 0.195 0.084 0.180 — — — 0.112 0.097 0.004 — — — Shift 3 11 $$-236.118$$ 497.379 15.622 0.0001 0.180 0.127 0.155 0.181 — — 0.111 0.051 0.159 0.002 — — Shift 4 14 $$-234.251$$ 501.935 20.178 0 0.164 0.159 0.049 0.161 0.190 — 0.136 0.062 0.074 0.058 0.008 — Notes: The model with positive dependence of speciation and extinction with temperature has the lowest AICc and highest AIC$$\omega$$. We compared macroevolutionary scenarios in which speciation and extinction vary through time (Time), or according to paleo-elevation changes of the Himalaya and Tibetan Plateau (Alti.), according to global temperature changes (Temp.), and episodic birth–death models (Shift). Parameter estimates: $$\lambda _{\mathrm{0}}$$ and $$\mu _{\mathrm{0}}$$, speciation and extinction for the environmental value at present; $$\alpha$$ and $$\beta$$, parameter controlling the variation of the speciation ($$\alpha )$$ and extinction ($$\beta )$$ with the paleoenvironment. $$\lambda_{\mathrm{1}}$$ and $$\mu _{\mathrm{1}}$$, speciation and extinction for the first period of time (present to first shift). NP $$=$$ number of parameters; logL $$=$$ log-likelihood; AICc $$=$$ corrected Akaike Information Criterion; $$\Delta$$AICc $$=$$ the difference in AICc between the model with the lowest AICc and the others; AIC$$\omega =$$ Akaike weight. Table 5. Results from the TESS analyses using episodic birth–death models Models NP logL AICc $$\Delta$$AIC AIC$$\omega$$ $$\lambda_{\mathrm{1}}$$ $$\lambda_{\mathrm{2}}$$ $$\lambda_{\mathrm{3}}$$ $$\lambda_{\mathrm{4}}$$ $$\lambda_{\mathrm{5}}$$ $$\lambda_{\mathrm{6}}$$ $$\mu_{\mathrm{1}}$$ $$\mu_{\mathrm{2}}$$ $$\mu_{\mathrm{3}}$$ $$\mu_{\mathrm{4}}$$ $$\mu_{\mathrm{5}}$$ $$\mu_{\mathrm{6}}$$ Shift 1 5 $$-237.008$$ 484.542 2.785 0.069 0.052 0.176 — — — — 0.004 0.018 — — — — Shift 2 8 $$-236.086$$ 489.706 7.949 0.005 0.195 0.084 0.180 — — — 0.112 0.097 0.004 — — — Shift 3 11 $$-236.118$$ 497.379 15.622 0.0001 0.180 0.127 0.155 0.181 — — 0.111 0.051 0.159 0.002 — — Shift 4 14 $$-234.251$$ 501.935 20.178 0 0.164 0.159 0.049 0.161 0.190 — 0.136 0.062 0.074 0.058 0.008 — Models NP logL AICc $$\Delta$$AIC AIC$$\omega$$ $$\lambda_{\mathrm{1}}$$ $$\lambda_{\mathrm{2}}$$ $$\lambda_{\mathrm{3}}$$ $$\lambda_{\mathrm{4}}$$ $$\lambda_{\mathrm{5}}$$ $$\lambda_{\mathrm{6}}$$ $$\mu_{\mathrm{1}}$$ $$\mu_{\mathrm{2}}$$ $$\mu_{\mathrm{3}}$$ $$\mu_{\mathrm{4}}$$ $$\mu_{\mathrm{5}}$$ $$\mu_{\mathrm{6}}$$ Shift 1 5 $$-237.008$$ 484.542 2.785 0.069 0.052 0.176 — — — — 0.004 0.018 — — — — Shift 2 8 $$-236.086$$ 489.706 7.949 0.005 0.195 0.084 0.180 — — — 0.112 0.097 0.004 — — — Shift 3 11 $$-236.118$$ 497.379 15.622 0.0001 0.180 0.127 0.155 0.181 — — 0.111 0.051 0.159 0.002 — — Shift 4 14 $$-234.251$$ 501.935 20.178 0 0.164 0.159 0.049 0.161 0.190 — 0.136 0.062 0.074 0.058 0.008 — Notes: The model with positive dependence of speciation and extinction with temperature has the lowest AICc and highest AIC$$\omega$$. We compared macroevolutionary scenarios in which speciation and extinction vary through time (Time), or according to paleo-elevation changes of the Himalaya and Tibetan Plateau (Alti.), according to global temperature changes (Temp.), and episodic birth–death models (Shift). Parameter estimates: $$\lambda _{\mathrm{0}}$$ and $$\mu _{\mathrm{0}}$$, speciation and extinction for the environmental value at present; $$\alpha$$ and $$\beta$$, parameter controlling the variation of the speciation ($$\alpha )$$ and extinction ($$\beta )$$ with the paleoenvironment. $$\lambda_{\mathrm{1}}$$ and $$\mu _{\mathrm{1}}$$, speciation and extinction for the first period of time (present to first shift). NP $$=$$ number of parameters; logL $$=$$ log-likelihood; AICc $$=$$ corrected Akaike Information Criterion; $$\Delta$$AICc $$=$$ the difference in AICc between the model with the lowest AICc and the others; AIC$$\omega =$$ Akaike weight. DISCUSSION Seeking the determinants of species richness is a fundamental aspect of biology (Morlon 2014; Benton 2015). However, studies have generally focused on a single factor to explain a given diversity pattern, whether by historical events promoting changes in diversification and extinction (Barnosky 2001; Erwin 2009; Mayhew et al. 2008), or species ecology and clade competitions as the dominant drivers of diversification (Rabosky et al., 2013; Silvestro et al. 2015). The combined influence of various factors has been tested less often (Drummond et al. 2012a; Bouchenak-Khelladi et al. 2015; Lagomarsino et al. 2016), and at a lower geographic and temporal scale or with a reduced set of methods. Further, these methods have mainly been implemented within a ML framework. In this study, we used existing and some novel methods under both ML and BI to evaluate potential factors driving diversification in Parnassiinae: some are related to a RQ mechanistic model of evolution, others to a CJ hypothesis. Coupled Effect of Mountain Building and Climate Change A long-held tenet in biology is that environmental stability leads to greater species richness than a changing environment, suggesting that species interactions are a dominant force (Wallace 1878). A recent debate has emerged between those arguing that physically dynamic ecosystems play a major role in diversification (Hoorn et al. 2013; Condamine et al. 2015b) and even promote adaptive radiation (Tan et al. 2013), and those maintaining that landscape changes have not been important in diversification history, instead advocating the role of large-scale dispersal events as drivers of speciation (Smith et al. 2014). In the Northern Hemisphere, tectonic events such as the closure of Turgai Sea or the building of Alpine and Himalayan orogenies have caused large-scale landscape and climatic changes (Sanmartín et al. 2001). As a result many new habitats were formed that created potential “new ecological spaces” and favored speciation (Xing and Ree 2017). Also, the newly created corridors stimulated species exchange between communities, while also acting as barriers that separated formerly continuous populations (Brikiatis 2014). Alternatively, landscape dynamics (i.e., unlike long-term stable environments) may have shattered former habitats, leading organisms to extinction. The rise of the HTP has often been invoked as providing important opportunities to diversify (Fjeldså et al. 2012; Price et al. 2014; Wen et al. 2014; Favre et al. 2015; Hughes and Atchison 2015; Renner 2016), but its role as a driver has rarely been formally tested with diversification models (Xing and Ree 2017). Our analyses detected higher in situ speciation rates for mountain species than their lowland counterparts (with similar extinction rates), resulting in higher net diversification rates for mountain species. Yet, this pattern was not robust to phylogenetic shape artifacts (Rabosky and Goldberg 2015). Moreover, ML and BI-based paleo-elevation models found that an explicit link between diversification rates and periods of HTP uplift could be rejected in favor of a constant birth–death. These findings contrast with those of high-elevation taxa in the Andes reported to have diversified faster due to the rise of the Andes when compared with their low-elevation relatives (Lagomarsino et al. 2016; Pérez- Escobar et al. 2017). In contrast, clade rate-heterogeneous models (BAMM, RevBayes) and biogeographic analyses supported a link between diversification rates and the HTP orogeny within Parnassiini, particularly in Parnassius. A significant upshift in speciation rates was detected along the stem of Parnassius, coincident with the colonization of the HTP, although DEC inferred several speciation/vicariance events associated with the HTP orogeny (Fig. 4). The latter is corroborated by the relatively high rate of allopatric (lowland/highland) speciation estimated by GeoSSE. These different results can partially be explained by the HTP rise providing novel, high-altitude habitats for colonization in Parnassius, thereby promoting ecological divergence and initial rapid diversification (followed later by allopatric speciation). Yet, diversification was not strictly parallel to periods of mountain building. Conversely, the rise of the HTP—and concomitant climatic changes such as regional cooling and aridification—led to higher extinction rates in non-mountain-adapted Parnassiinae lineages, such as Luehdorfiini and Zerynthiini that survived west and east of the Himalayan range; several events of extinction are inferred within these lineages around the mid-Miocene. This mixed effect is probably responsible for the lack of a significant association between orogeny and speciation in tree-wide models, which is otherwise detected by clade-heterogeneous models. Paleoclimate change is another abiotic factor postulated to be a major trigger of diversification in terrestrial organisms (Erwin 2009). CoMET, TreePar, and the ML-based paleotemperature-dependent model inferred a pattern of increased turnover (background extinction) rates associated with periods of global climate warming, such as the mid-Miocene Climate Optimum (MMCO, 15–17 Ma, Böhme, 2003) or the late Oligocene Warming Event (LOWE, $$\sim$$25 Ma, Zachos et al. 2008, Fig. 2). DEC also reconstructed several events of (geographic) extinction in Central Asia and India along the stem branches subtending the crown diversification of extant genera in Luehdorfiini and Zerynthiini (Fig. 4). These long branches are consistent with periods of high extinction rates related to the LOWE, which were eventually followed by mid-Miocene radiations, especially in Parnassius (the latter probably linked to the HTP rise). Paleotemperature and the HTP orogeny probably had a joint influence in the diversification of Parnassiinae. Mountain building is known to be responsible for regional climate change (Sepulchre et al. 2006; Armijo et al. 2015), and for changes in biodiversity patterns, both directly as a geographic barrier and indirectly through climate change (Hoorn et al. 2013). The HTP acted as a barrier to the influence of the Asian monsoons (Zhisheng et al. 2001) and induced considerable drying of Central Asia (Quade et al. 1989). In Parnassiinae, the HTP rise and subsequent periods of mountain building might have promoted diversification through the provision of “climate refugia”: locations where taxa survived periods of regionally adverse climate. Although usually associated with maintaining biodiversity through glacial–interglacial climate changes (Gavin et al. 2014), mountains can provide climate refugia during warming-aridification events (Migliore et al. 2013; Pokorny et al. 2015). Drastic range contractions within the non-mountain adapted Luehdorfiini and Zerynthiini (Fig. 4) could have been driven by aridification events induced by HTP uplift and mid-Miocene climate warming, with Hypermnestra as a relict group that survived in the Zagros Mountains. In Parnassius, colonization of the HTP coincided with a boost of speciation, followed by dispersal to adjacent areas in the late Miocene-Pliocene. The long stem branches leading to extant Parnassiinae genera are likely the result of climate-driven extinction events associated with Cenozoic warming periods, such as the LOWE and the MMCO. During these periods, Parnassiinae species probably migrated toward cooler temperatures by colonizing the newly uplifted mountain ranges of the Zagros and HTP. This agrees with the climate refugia hypothesis, a pattern currently observed in Himalayan plants, which are shifting their distribution upward along with global warming (Padma 2014). We argue that the HTP acted as climate refugia during global warming events, in which pre-adapted parnassiines survived and diversified before re-colonizing ancestral geographic ranges. It would be interesting to test this hypothesis in other extra-Himalayan groups that diversified during the HTP orogeny (Favre et al. 2015), as well as in other mountains around the world for which paleo-elevation data is available (Fjeldså et al. 2012; Hoorn et al. 2013; Hughes and Atchison 2015). Ecological Interactions via Insect-Plant Diversification and Host-Plant Shifts Biological interactions among distantly related lineages probably played a major role in the diversification of clades over geological time (Van Valen 1973; Liow et al. 2015; Silvestro et al. 2015; Voje et al. 2015). Our analyses identified significant differences in speciation rates associated with different host-plants. Several models (BAMM, MuSSE, and Bayesian implementations of MuSSE and HiSSE in RevBayes) indicated that Parnassius subgenera associated with Crassulaceae-Saxifragaceae and Papaveraceae have significantly higher rates of speciation than their relatives feeding on other host plants (Figs. 5 and 6). Interestingly, the lowest speciation rates are associated with butterflies feeding on Aristolochiaceae, the inferred ancestral host plant of the Parnassiinae (Condamine et al. 2012). Our results concur with the “escape and radiate” hypothesis of Ehrlich and Raven (1964), which predicts that the evolution of a herbivore insect trait (i.e., new tolerance of plant defenses) that allows colonizing/feeding on a novel host plant lineage would lead to a burst of diversification. The highly unbalanced extant species richness between Parnassiini and the clade Luehdorfiini $$+$$ Zerynthiini, explained above by geographic and climate drivers, might also be explained by differential speciation rates related to their feeding habit, as well as the overall low ability of Parnassiinae for shifting between host plants (i.e., transition rates estimated MuSSE are close to zero). Testing this hypothesis would require reconstruction of the diversification history of the host-plant clades; this would allow us, for example, to identify shifts that led to reciprocal changes in diversification in the host plants and to test whether these shifts were accompanied by matching ancestral geographic distributions between herbivore insects and plants at key events (Cruaud et al. 2012). Unfortunately, so far there is no complete phylogeny for any of the large host families for Parnassiinae, including Aristolochiaceae ($$\sim$$500–600 spp.), Papaveraceae ($$\sim$$700–800 spp.), Crassulaceae ($$\sim$$1300–1500 spp.), and Saxifragaceae ($$\sim$$600–700 spp.). However, we found a positive correlation between species richness of these plant lineages and the speciation rates of their insect feeders (MuSSE: R$$^{\mathrm{2}} = 0.96$$, BAMM: R$$^{\mathrm{2}} = 0.87$$; Supplementary Appendix S22 available on Dryad). This suggests that fast-diversifying clades are associated with families offering a higher number of potential hosts (akin to niche availability). Janz et al. (2006) found a similar correlation between host-plant diversity and speciation rates in the Nymphalidae; they posited that recurring oscillations between host-plant expansions (i.e., incorporation of new plants into the feeding repertoire) and host specialization acted as a major driving force behind the diversification of plant-feeding insects. In adaptive radiations, the expectation is an initial rapid burst of diversification, followed by a slowdown in speciation rates as niches are filled up; this is attributed to a diversity-dependence effect (Phillimore and Price 2008; Etienne et al. 2012). A shift to a new ecological resource (e.g., a new host plant) can be an adaptive breakthrough (Ehrlich and Raven 1964), providing higher opportunities for speciation at the start of the radiation, which significantly decrease as species accumulate. Though we did not detect this diversity-dependence effect in Parnassiinae, a pattern of speciation decreasing with standing diversity (and constant extinction rates) was found in Parnassius, supporting the hypothesis of this genus as an example of an adaptive radiation (Rebourg et al. 2006), driven by the evolution of a new host-plant association. The inference of similar extinction rates in clades of Parnassiinae feeding on different host plants does agree with Van Valen’s (1973) RQ hypothesis, which assumes constant extinction probabilities shared by all members of any given higher taxon. Interplay Between the RQ and the CJ Both abiotic and biotic factors drive species diversification, and biological radiations require both extrinsic conditions and intrinsic traits—acting in unison or sequentially over time—for success (Donoghue and Sanderson 2015). A lack of suitable methods and data has resulted in few studies attempting to merge both types of factors to explain evolutionary radiations (Bouchenak-Khelladi et al. 2015; Liow et al. 2015; Lagomarsino et al. 2016). Herein, we argue that RQ and CJ-type factors are intimately linked, and they both promote and prevent species diversification. Multiple factors explain the diversification of Apollo butterflies, and the effect of these factors differed across clades. Past changes in climate and elevation during active orogenic periods jointly mediated the rapid radiation of mountain-dwelling Parnassius, but also triggered extinction events in non-mountain adapted clades Luehdorfiini and Zerynthiini. The ancestor of Luehdorfiini and Zerynthiini was reconstructed in the Oligocene as occupying a broad distribution range extending from the Caucasus region, the Iranian Plateau and the Zagros Mountains, through Central Asia to India (red box in Fig. 4). Extinction in Central Asia and India around the Oligocene-Miocene, followed by independent eastward/northward colonizations by Luehdorfiini and Zerynthiini, could explain the current disjunct distribution pattern in these 2 tribes. Our study also demonstrates that ecological traits and biotic interactions such as host-plant associations and clade-specific diversity-dependence may confer differential diversification capacity in closely related species co-occurring in the same rapidly changing environment. In particular, the radiation of Parnassius in the HTP seems to have been fostered by the joint effect of climate and geological changes associated with new biotic interactions. Mountain uplift promoted speciation by allopatric speciation, but also increased habitat heterogeneity, and led ultimately to the formation of new niches (ecological divergence). The colonization of these niches led to new associations with mountain host species, and potentially to the evolution of new morphological adaptive traits, for example, denser setae on the body or altitudinal melanism. Although it is difficult to tease apart the role of key innovations (new traits that evolved allowing the invasion of a new niche) on ecological opportunity (the formation of a new environment that confers selective advantage fitness for the traits), a combination of extrinsic and intrinsic factors seems to have been necessary for the success of the Parnassius radiation. Parnassiinae species feed mostly on plant lineages of small size whose leaves grow close to the ground and which are characteristic elements of modern temperate vegetation: Aristolochia and Asarum (Aristolochiaceae) for Luehdorfiini and Zerynthiini, Saxifraga and Sedum (Saxifragaceae) and Rhodiola (Crassulaceae) for the subgenus Parnassius, or Corydalis (Papaveraceae) for the other subgenera of Parnassius. Zhang et al. (2014) inferred an origin of Rhodiola (70 spp., mid-Miocene; 12.1 Ma, 6.3–20.2 Ma) and Sedum (420 spp., late Oligocene) in the HTP, later spreading to the adjacent regions. Similarly, Ebersbach et al. (2017) placed the colonization of the HTP by genus Saxifraga (370 spp.) in the late Oligocene (20–34 Ma). Therefore, there was a synchronous temporal and geographic origin of Parnassius and its associated host plants in the HTP region. Conversely, Luehdorfiini and Zerynthiini were unable to evolve new traits or establish novel host-plant associations to invade the mountain and/or host-plant niches. This calls for further studies to decipher the genomic baseline for the adaptations to elevation and host plants by comparing Parnassiini and Luehdorfiini/Zerynthiini (Edger et al. 2015). Underlying Model and Inference Framework in Diversification Analyses Models of diversification have traditionally relied on time-continuous (independent) or autocorrelated rates in the way they model rate-shifts, with few studies applying both types for macroevolutionary inferences. Herein, we used both, in particular to estimate speciation and extinction rates depending on an environmental variable that itself varies through time. The first model is an ML implementation with a time-continuous birth–death model (Condamine et al. 2013). For the purpose of this study we implemented Bayesian paleoenvironment-dependent models but with autocorrelated (RevBayes) and time-continuous (TESS) rates of diversification. Our comparison across methods has revealed new aspects on the behavior of the models. The three methods are congruent on the role of orogeny (no effect of orogeny on diversification per se), but disagree on the effect of temperature (Fig. 8). The time-continuous models (RPANDA and TESS) support a positive dependence on speciation and extinction, while the autocorrelated model (RevBayes) shows no dependence. The results suggest that the lack of correlation found by the RevBayes paleoenvironmental model is directly a consequence of the assumption of autocorrelation in rates across time intervals. Thus, the difference lies in the underlying diversification model (autocorrelated vs. time-continuous rates) and not due to the inference framework (Bayesian vs. ML). Our study shows that multiple approaches must be combined to fully address the diversification of clades in relation to abiotic and biotic factors. Rather than a single factor, the joint effect of multiple factors (biogeography, species traits, environmental drivers, and mass extinction) is responsible for current diversity patterns, and we recommend a correspondingly multifaceted strategy for the study of these patterns. SUPPLEMENTARY MATERIAL Data available from the Dryad Digital Repository: http://dx.doi.org/10.5061/dryad.32bp4. FUNDING This work was supported by the Marie Curie Action (EU 7th Framework Programme) [BIOMME project, IOF-627684 to F.L.C. (jointly supervised by F.A.H.S. and I.S.)], by a Natural Sciences and Engineering Research Council of Canada (NSERC), Discovery Grant to F.A.H.S., and by MINECO/FEDER [CGL2015-67489-P to I.S]. Acknowledgements We thank Brian Wiegmann and two anonymous reviewers who provided excellent and constructive comments to improve the study. We also thank Sylvain Piry for help on the script for GenBank, Mark Miller for assistance on the CIPRES cluster. References Alfaro M.E., Santini F., Brock C., Alamillo H., Dornburg A., Rabosky D.L., Carnevale G., Harmon L.J. 2009 . Nine exceptional radiations plus high turnover explain species diversity in jawed vertebrates. Proc. Natl. Acad. Sci. USA 106 : 13410 – 13414 . Google Scholar Crossref Search ADS Antonelli A., Sanmartín I. 2011 . Mass extinction, gradual cooling, or rapid radiation? Reconstructing the spatiotemporal evolution of the ancient angiosperm genus Hedyosmum (Chloranthaceae) using empirical and simulated approaches. Syst. Biol. 60 : 596 – 615 . Google Scholar Crossref Search ADS PubMed Ackery P.R. 1975 . A guide to the genera and species of Parnassiinae (Lepidoptera: Papilionidae). Bull. British Mus. (Nat. Hist.), Entomol. 31 : 71 – 105 . Google Scholar Crossref Search ADS Armijo R., Lacassin R., Coudurier-Curveur A., Carrizo D. 2015 . Coupled tectonic evolution of Andean orogeny and global climate. Earth Sci. Rev. 143 : 1 – 35 . Google Scholar Crossref Search ADS Ayres D.L., Darling A., Zwickl D.J., Beerli P., Holder M.T., Lewis P.O., Huelsenbeck J.P., Ronquist F., Swofford D.L., Cummings M.P., Rambaut A., Suchard M.A. 2012 . BEAGLE: an application programming interface and high-performance computing library for statistical phylogenetics. Syst. Biol. 61 : 170 – 173 . Google Scholar Crossref Search ADS PubMed Baele G., Li W.L.S., Drummond A.J., Suchard M.A., Lemey P. 2013 . Accurate model selection of relaxed clocks in Bayesian phylogenetics. Mol. Biol. Evol. 30 : 239 – 243 . Google Scholar Crossref Search ADS PubMed Barnosky A.D. 2001 . Distinguishing the effects of the Red Queen and Court Jester on Miocene mammal evolution in the northern Rocky Mountains. J. Verteb. Paleontol. 21 : 172 – 185 . Google Scholar Crossref Search ADS Beaulieu J.M., O’Meara B.C. 2016 . Detecting hidden diversification shifts in models of trait- dependent speciation and extinction. Syst. Biol. 65 : 583 – 601 . Google Scholar Crossref Search ADS PubMed Benton M.J. 2009 . The red queen and the court jester: species diversity and the role of biotic and abiotic factors through time. Science 323 : 728 – 732 . Google Scholar Crossref Search ADS PubMed Benton M.J. 2015 . Exploring macroevolution using modern and fossil data. Proc. Biol. Sci. 282 : 20150569 Google Scholar Crossref Search ADS PubMed Böhme M. 2003 . The Miocene Climatic Optimum: evidence from ectothermic vertebrates of Central Europe. Palaeogeogr. Palaeoclim. Palaeoecol. 195 : 389 – 401 . Bollino M., Racheli T. 2012 . Parnassiinae (Partim). Parnassiini (Partim), Luehdorfiini, Zerynthiini. In: Bauer E., Frankenbach T., editors. Butterflies of the World, Part 36, Supplement 20. Keltern , Germany : Goecke und Evers . p. 64 . Bouchenak-Khelladi Y., Onstein R.E., Xing Y., Schwery O., Linder H.P. 2015 . On the complexity of triggering evolutionary radiations. New Phytol. 207 : 313 – 326 . Google Scholar Crossref Search ADS PubMed Bouilhol P., Jagoutz O., Hanchar J.M., Dudas F.O. 2013 . Dating the India–Eurasia collision through arc magmatic records. Earth Planet. Sci. Lett. 366 : 163 – 175 . Brikiatis L. 2014 . The De Geer, Thulean and Beringia routes: key concepts for understanding early Cenozoic biogeography. J Biogeogr. 41 : 1036 – 1054 . Google Scholar Crossref Search ADS Britton T., Anderson C.L., Jacquet D., Lundqvist S., Bremer K. 2007 . Estimating divergence times in large phylogenetic trees. Syst. Biol. 56 : 741 – 752 . Google Scholar Crossref Search ADS PubMed Brock C.D., Harmon L.J., Alfaro M.E. 2011 . Testing for temporal variation in diversification rates when sampling is incomplete and nonrandom. Syst. Biol. 60 : 410 – 419 . Google Scholar Crossref Search ADS PubMed Carpenter F.M. 1992 . Treatise on invertebrate paleontology, Part R, Arthropoda 3–4. Boulder (CO): Geological Society of America. Churkin S. 2006 . A new species of Parnassius Latreille, 1804 from Kyrgyzstan (Lepidoptera, Papilionidae). Helios 7 : 142 – 158 . Collins N.M., Morris M.G. 1985 . Threatened Swallowtail butterflies of the World. The Cambridge: IUCN Red Data Book . Condamine F.L., Nagalingum N.S., Marshall C.R., Morlon H. 2015a . Origin and diversification of living cycads: a cautionary tale on the impact of the branching process prior on Bayesian molecular dating. BMC Evol. Biol. 15 : 65 Google Scholar Crossref Search ADS Condamine F.L., Rolland J., Morlon H. 2013 . Macroevolutionary perspectives to environmental change. Ecol. Lett. 16 : 72 – 85 . Google Scholar Crossref Search ADS PubMed Condamine F.L., Sperling F.A., Wahlberg N., Rasplus J.Y., Kergoat G.J. 2012 . What causes latitudinal gradients in species diversity? Evolutionary processes and ecological constraints on swallowtail biodiversity. Ecol. Lett. 15 : 267 – 277 . Google Scholar Crossref Search ADS PubMed Condamine F.L., Toussaint E.F.A., Clamens A.-L., Genson G., Sperling F.A.H., Kergoat G.J. 2015b . Deciphering the evolution of birdwing butterflies 150 years after Alfred Russell Wallace. Sci. Rep. 5 : 11860 Google Scholar Crossref Search ADS Cruaud A., Rønsted N., Chantarasuwan B., Chou L.S., Clement W., Couloux A., Cousins B., Forest F., Genson G., Harrison R.D., Hossaert-McKey M., Jabbour-Zahab R., Jousselin E., Kerdelhué C., Kjellberg F., Lopez-Vaamonde C., Peebles J., Pereira R.A.S, Schramm T., Ubaidillah R., van Noort S., Weiblen G.D., Yang D.R, Yan-Qiong P., Yodpinyanee A., Libeskind-Hadas R., Cook J.M., Rasplus J.Y., Savolainen V. 2012 . An extreme case of plant-insect co-diversification: figs and fig-pollinating wasps. Syst. Biol. 61 : 1029 – 1047 . Google Scholar Crossref Search ADS PubMed Cusimano N., Renner S.S. 2010 . Slowdowns in diversification rates from real phylogenies may be not real. Syst. Biol. 59 : 458 – 464 . Google Scholar Crossref Search ADS PubMed Dapporto L. 2010 . Speciation in Mediterranean refugia and post-glacial expansion of Zerynthia polyxena (Lepidoptera, Papilionidae). J. Zoolog. Syst. Evol. Res. 48 : 229 – 237 . Davis M.P., Midford P.E., Maddison W.P. 2013 . Exploring power and parameter estimation of the BiSSE method for analysing species diversification. BMC Evol. Biol. 13 : 38 Google Scholar Crossref Search ADS PubMed de Jong R. 2007 . Estimating time and space in the evolution of the Lepidoptera. Tijdschrift v. Entomol. 150 : 319 – 346 . Google Scholar Crossref Search ADS DeChaine E.G., Martin A.P. 2004 . Historic cycles of fragmentation and expansion in Parnassius smintheus (Papilionidae) inferred using mitochondrial DNA. Evolution 58 : 113 – 127 . Google Scholar Crossref Search ADS PubMed Donoghue M.J., Doyle J.A., Gauthier J., Kluge A.G., Rowe T. 1989 . The importance of fossils in phylogeny reconstruction. Annu. Rev. Ecol. Syst. 20 : 431 – 460 . Google Scholar Crossref Search ADS Donoghue MJ, Sanderson MJ. 2015 . Confluence, synnovation, and depauperons in plant diversification. New Phytol. 207 : 260 – 274 . Google Scholar Crossref Search ADS PubMed Drummond A., Suchard M.A., Xie D., Rambaut A. 2012b. Bayesian phylogenetics with BEAUti and the BEAST 1.7. Mol. Biol. Evol. 29 : 1969 – 1973 . Google Scholar Crossref Search ADS Drummond A.J., Ho S.Y.W., Phillips M.J., Rambaut A. 2006 . Relaxed phylogenetics and dating with confidence. PLoS Biol. 4 : e88 . Google Scholar Crossref Search ADS PubMed Drummond C.S., Eastwood R.J., Miotto S.T., Hughes C.E. 2012a . Multiple continental radiations and correlates of diversification in Lupinus (Leguminosae): testing for key innovation with incomplete taxon sampling. Syst. Biol. 61 : 443 – 460 . Google Scholar Crossref Search ADS Durden C.J., Rose H. 1978 . Butterflies from the middle Eocene: the earliest occurrence of fossil Papilionidae. Prarce-Sellards Ser. Tax. Mem. Mus. 29 : 1 – 25 . Ebersbach J., Muellner-Riehl A.N., Michalak I., Tkach N., Hoffmann M.H., Röser M., Sun H., Favre A. 2017 . In and out of the Qinghai-Tibet Plateau: divergence time estimation and historical biogeography of the large arctic-alpine genus Saxifraga L. J. Biogeogr. 44 : 900 – 910 . Edger P.P., Heidel-Fischer H.M., Bekaert M., Rota J., Glöckner G., Platts A.E., Heckel D. G., Der J., Wafula E., Tang M., Hofberger J.A., Smithson A., Hall J.C., Blanchette M., Bureau T.E., Wright S.I., dePamphilis C.W., Schranz M.E., Barker M.S., Conant G.C., Wahlberg N., Vogel H., Pires J.C., Wheat C.W. 2015 . The butterfly plant arms-race escalated by gene and genome duplications. Proc. Natl. Acad. Sci. USA 112 : 8362 – 8366 . Google Scholar Crossref Search ADS Edwards E.J., Osborne C.O., Stromberg C.A.E., Smith S.A., the C$$_{\mathrm{4}}$$ Grasses Consortium. 2010 . The origins of C4 grasslands: integrating evolutionary and ecosystem science. Science 328 : 587 – 591 . Ehrlich P.R., Raven P.H. 1964 . Butterflies and plants: a study in coevolution. Evolution 18 : 586 – 608 . Google Scholar Crossref Search ADS Erwin D.H. 2009 . Climate as a driver of evolutionary change. Curr. Biol. 19 : R575 – R583 . Google Scholar Crossref Search ADS PubMed Etienne R.S., Haegeman B., Stadler T., Aze T., Pearson P.N., Purvis A., Phillimore A.B. 2012 . Diversity-dependence brings molecular phylogenies closer to agreement with the fossil record. Proc. Biol. Sci. 279 : 1300 – 1309 . Google Scholar Crossref Search ADS PubMed Ezard T.H.G., Aze T., Pearson P.N., Purvis A. 2011 . Interplay between changing climate and species’ ecology drives macroevolutionary dynamics. Science 332 : 349 – 351 . Google Scholar Crossref Search ADS PubMed Ezard T.H.G., Quental T.B., Benton M.J. 2016 . The challenges to inferring the regulators of biodiversity in deep time. Philos. Trans. R. Soc. B 371 : 20150216 Google Scholar Crossref Search ADS Favre A., Päckert M., Pauls S.U., Jähnig S.C., Uhl D., Michalak I., Muellner-Riehl A.N. 2015. The role of the uplift of the Qinghai-Tibetan Plateau for the evolution of Tibetan biotas. Biol. Rev. 90 : 236 – 253 . Crossref Search ADS FitzJohn R.G. 2012 . Diversitree: comparative phylogenetic analyses of diversification in R. Methods Ecol. Evol. 3 : 1084 – 1092 . Google Scholar Crossref Search ADS FitzJohn R.G., Maddison W.P., Otto S.P. 2009 . Estimating trait-dependent speciation and extinction rates from incompletely resolved phylogenies. Syst. Biol. 58 : 595 – 611 . Google Scholar Crossref Search ADS PubMed Fjeldså J., Bowie R.C.K., Rahbek C. 2012 . The role of mountain ranges in the diversification of birds. Annu. Rev. Ecol. Evol. Syst. 43 : 249 – 265 . Google Scholar Crossref Search ADS Frankenbach T., Bollino M., Racheli T. 2012 . Papilionidae XIV: Hypermnestra, Luehdorfiini, Zerynthiini. In: Bauer E., Frankenbach T., editors. Butterflies of the World, Part 36. Keltern , Germany : Goecke und Evers . p. 26 . Gavin D.G., Fitzpatrick M.C., Gugger P.F., Heath K.D., Rodríguez-Sánchez F., Dobrowski S.Z., Hampe A., Hu F.S., Ashcroft M.B., Bartlein P.J., Blois J.L., Carstens B.C., Davis E.B., de Lafontaine G., Edwards M.E., Fernandez M., Henne P.D., Herring E.M., Holden Z.A., Kong W.S., Liu J., Magri D., Matzke N.J., McGlone M.S., Saltré F., Stigall A.L., Tsai Y.H., Williams J.W. 2014 . Climate refugia: joint inference from fossil records, species distribution models and phylogeography. New Phytol. 204 : 37 – 54 . Google Scholar Crossref Search ADS PubMed Gernhard T. 2008 . The conditioned reconstructed process. J. Theor. Biol. 253 : 769 – 778 . Google Scholar Crossref Search ADS PubMed Givnish T.J. Spalink D., Ames M., Lyon S.P., Hunter S.J., Zuluaga A., Iles W.J.D., Clements M.A., Arroyo M.T.K., Leebens-Mack J., Endara L., Kriebel R., Neubig K.M., Whitten W.M., Williams N.H., Cameron K.M. 2015 . Orchid phylogenomics and multiple drivers of their extraordinary diversification. Proc. Biol. Sci. 282 : 20151553 Google Scholar Crossref Search ADS Goldberg E.E., Lancaster L.T., Ree R.H. 2011 . Phylogenetic inference of reciprocal effects between geographic range evolution and diversification. Syst. Biol. 60 : 451 – 465 . Google Scholar Crossref Search ADS PubMed Gradstein F.M, Ogg J.G., Schmitz M., Ogg G. 2012 . The geologic time scale 2012. Boston : Elsevier . p. 1176 . Gratton P., Konopiński M.K., Sbordoni V. 2008 . Pleistocene evolutionary history of the clouded Apollo (Parnassius mnemosyne): genetic signatures of climate cycles and a ‘time-dependent’ mitochondrial substitution rate. Mol. Ecol. 17 : 4248 – 4262 . Google Scholar Crossref Search ADS PubMed Hiura L. 1980 . A phylogeny of the genera of Parnassiinae based on analysis of wing pattern, with description of a new genus (Lepidoptera: Papilionidae). Bull. Osaka Mus. Natural Hist. 33 : 71 – 85 . Ho S.Y.W., Phillips M.J. 2009 . Accounting for calibration uncertainties in phylogenetic estimation of evolutionary divergence times. Syst. Biol. 58 : 367 – 380 . Google Scholar Crossref Search ADS PubMed Höhna S. 2015 . The time-dependent reconstructed evolutionary process with a key-role for mass-extinction events. J. Theoret. Biol. 380 : 321 – 331 . Google Scholar Crossref Search ADS Höhna S., Landis M.J., Heath T.A., Boussau B., Lartillot N., Moore B.R., Huelsenbeck J.P., Ronquist F. 2016a . RevBayes: Bayesian phylogenetic inference using graphical models and an interactive model-specification language. Syst. Biol. 65 : 726 – 736 . Google Scholar Crossref Search ADS Höhna S., May M.R., Moore B.R. 2016b . TESS: an R package for efficiently simulating phylogenetic trees and performing Bayesian inference of lineage diversification rates. Bioinformatics 32 : 789 – 791 . Google Scholar Crossref Search ADS Höhna S., Stadler T., Ronquist F., Britton T. 2011 . Inferring speciation and extinction rates under different sampling schemes. Mol. Biol. Evol. 28 : 2577 – 2589 . Google Scholar Crossref Search ADS PubMed Hoorn C., Mosbrugger V., Mulch A., Antonelli A. 2013 . Biodiversity from mountain building. Nature Geosci . 6 : 154 Huelsenbeck J.P., Larget B., Alfaro M.E. 2004 . Bayesian phylogenetic model selection using reversible jump Markov chain Monte Carlo. Mol. Biol. Evol. 21 : 1123 – 1133 . Google Scholar Crossref Search ADS PubMed Hughes C.E., Atchison G.W. 2015 . The ubiquity of alpine plant radiations: from the Andes to the Hengduan Mountains. New Phytol. 207 : 275 – 282 . Google Scholar Crossref Search ADS PubMed Hutter C.R., Guayasamin J.M., Wiens J.J. 2013 . Explaining Andean megadiversity: the evolutionary and ecological causes of glassfrog elevational richness patterns. Ecol. Lett. 16 : 1135 – 1144 . Google Scholar Crossref Search ADS PubMed Janz N., Nylin S., Wahlberg N. 2006 . Diversity begets diversity: host expansions and the diversification of plant-feeding insects. BMC Evol. Biol. 6 : 4 Google Scholar Crossref Search ADS PubMed Jønsson K.A., Fabre P.H., Fritz S.A., Etienne R.S., Ricklefs R.E., Jørgensen T.B., Fjeldså J., Rahbek C., Ericson P.G.P., Woogg F., Pasquet E., Irestedt M. 2012 . Ecological and evolutionary determinants for the adaptive radiation of the Madagascan vangas. Proc. Natl. Acad. Sci. USA 109 : 6620 – 6625 . Google Scholar Crossref Search ADS Kass R.E., Raftery A.E. 1995 . Bayes factors. J. Am. Stat. Assoc. 90 : 773 – 795 . Google Scholar Crossref Search ADS Katoh K., Standley D.M. 2013 . MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30 : 772 – 780 . Google Scholar Crossref Search ADS PubMed Katoh T., Chichvarkin A., Yagi T., Omoto K. 2005 . Phylogeny and evolution of butterflies of the genus Parnassius: inferences from mitochondrial 16S and ND1 sequences. Zool. Sci. 22 : 343 – 351 . Google Scholar Crossref Search ADS PubMed Kergoat G.J., Bouchard P., Clamens A.-L., Abbate J.L., Jourdan H., Jabbour-Zahab R., Genson G., Soldati L., Condamine F.L. 2014 . Cretaceous environmental changes led to high extinction rates in a hyperdiverse beetle family. BMC Evol. Biol. 14 : 220 Google Scholar Crossref Search ADS PubMed Keyghobadi N, Roland J, Strobeck C. 2005 . Genetic differentiation and gene flow among populations of the alpine butterfly, Parnassius smintheus, vary with landscape connectivity. Mol. Ecol. 14 : 1897 – 1909 . Google Scholar Crossref Search ADS PubMed Kocman S. 2009 . Parnassius of Tibet and the adjacent areas. Pardubice (Czech Republic) : Vadim Tshikolovets . Lagomarsino L.P., Condamine F.L., Antonelli A., Mulch A., Davis C.C. 2016 . The abiotic and biotic drivers of rapid diversification in Andean bellflowers (Campanulaceae). New Phytol. 210 : 1430 – 1442 . Google Scholar Crossref Search ADS PubMed Lanfear R., Calcott B., SYW, Ho Guindon S. 2012 . PartitionFinder: Combined selection of partitioning schemes and substitution models for phylogenetic analyses. Mol. Biol. Evol. 29 : 1695 – 1701 . Google Scholar Crossref Search ADS PubMed Lewis P.O. 2001 . A likelihood approach to estimating phylogeny from discrete morphological character data. Syst. Biol. 50 : 913 – 925 . Google Scholar Crossref Search ADS PubMed Liow L.H., Reitan T., Harnik P.G. 2015 . Ecological interactions on macroevolutionary time scales: clams and brachiopods are more than ships that pass in the night. Ecol. Lett. 18 : 1030 – 1039 . Google Scholar Crossref Search ADS PubMed Liu Z., Pagani M., Zinniker D., DeConto R., Huber M., Brinkhuis H., Shah S.R., Leckie R.M., Pearson A. 2009. Global cooling during the Eocene-Oligocene climate transition. Science 323 : 1187 – 1190 . Crossref Search ADS PubMed Maddison W.P., FitzJohn R.G. 2015 . The unsolved challenge to phylogenetic correlation tests for categorical characters. Syst. Biol. 64 : 127 – 136 . Google Scholar Crossref Search ADS PubMed Maddison W.P., Midford P.E., Otto S.P. 2007 . Estimating a binary character’s effect on speciation and extinction. Syst. Biol. 56 : 701 – 710 . Google Scholar Crossref Search ADS PubMed Magallón S., Gómez-Acevedo S., Sánchez-Reyes L.L., Hernández-Hernández T. 2015 . A metacalibrated time-tree documents the early rise of flowering plant phylogenetic diversity. New Phytol. 207 : 437 – 453 . Google Scholar Crossref Search ADS PubMed Matzke N.J. 2014 . Model selection in historical biogeography reveals that founder-event speciation is a crucial process in island clades. Syst. Biol. 63 : 951 – 970 . Google Scholar Crossref Search ADS PubMed May M.R., Höhna S., Moore B.R. 2016 . A Bayesian approach for detecting the impact of mass-extinction events on molecular phylogenies when rates of lineage diversification may vary. Methods Ecol. Evol. 7 : 947 – 959 . Google Scholar Crossref Search ADS Mayhew P.J., Jenkins G.N., Benton T.G. 2008 . A long-term association between global temperature and biodiversity, origination and extinction in the fossil record. Proc. Biol. Sci. 275 : 47 – 53 . Google Scholar Crossref Search ADS PubMed Meredith R.W., Janečka J.E., Gatesy J., Ryder O.A., Fisher C.A., Teeling E.C., Goodbla A., Eizirik E., Simão T.L.L, Stadler T., Rabosky D.L., Honeycutt R.L., Flynn J.J., Ingram C.M., Steiner C., Williams T.L., Robinson T.J., Burk-Herrick A., Westerman M., Ayoub N.A., Springer M.S., Murphy W.J. 2011 . Impacts of the Cretaceous Terrestrial Revolution and KPg extinction on mammal diversification. Science 334 : 521 – 524 . Google Scholar Crossref Search ADS PubMed Meseguer A. S., Lobo J.M., Ree R.H., Beerling D.J., Sanmartín I. 2014 . Integrating fossils, phylogenies, and niche models into biogeography to reveal ancient evolutionary history: the case of Hypericum (Hypericaceae). Syst. Biol. 64 : 215 – 232 . Google Scholar Crossref Search ADS PubMed Michel F., Rebourg C., Cosson E., Descimon H. 2008 . Molecular phylogeny of Parnassiinae butterflies (Lepidoptera: Papilionidae) based on the sequences of four mitochondrial DNA segments. Ann. Soc. Entomol. Fr. 44 : 1 – 36 . Google Scholar Crossref Search ADS Migliore J., Baumel A., Juin M., Fady B., Roig A., Duong N., Médail F. 2013 . Surviving in Mountain climate refugia: New insights from the genetic diversity and structure of the relict shrub Myrtus nivellei (Myrtaceae) in the Sahara Desert. PLoS One 8 : e73795 . Google Scholar Crossref Search ADS PubMed Miller M.A., Schwartz T., Pickett B.E., He S., Klem E.B., Scheuermann R.H., Passarotti M., Kaufman S., O’Leary M.A. 2015 . A RESTful API for access to phylogenetic tools via the CIPRES Science Gateway. Evol. Bioinform. 11 : 43 – 48 . Google Scholar Crossref Search ADS Moen D., Morlon H. 2014 . Why does diversification slow down? Trends Ecol. Evol. 29 : 190 – 197 . Möhn E. 2005 . Papilionidae XII: Parnassius apollo III. In: Bauer E., Frankenbach T., editors. Butterflies of the World, Part 23. Keltern , Germany : Goecke und Evers . p. 33 . Moore B.R., Höhna S., May M.R., Rannala B., Huelsenbeck J.P. 2016 . Critically evaluating the theory and performance of Bayesian analysis of macroevolutionary mixtures. Proc. Natl. Acad. Sci. USA 113 : 9569 – 9574 . Google Scholar Crossref Search ADS Morlon H. 2014 . Phylogenetic approaches for studying diversification Ecol. Lett. 17 : 508 – 525 . Morlon H., Parsons T.L., Plotkin J. 2011 . Reconciling molecular phylogenies with the fossil record. Proc. Natl. Acad. Sci. USA 108 : 16327 – 16332 . Google Scholar Crossref Search ADS Nazari V., Sperling F.A.H. 2007 . Mitochondrial DNA divergence and phylogeography in western Palaearctic Parnassiinae (Lepidoptera: Papilionidae): how many species are there? Insect. Syst. Evol. 38 : 121 – 138 . Nazari V., Zakharov E.V., Sperling F.A.H. 2007 . Phylogeny, historical biogeography, and taxonomic ranking of Parnassiinae (Lepidoptera: Papilionidae) based on morphology and seven genes. Mol. Phylogenet. Evol. 42 : 131 – 156 . Google Scholar Crossref Search ADS PubMed Nee S., May R.M., Harvey P.H. 1994 . The reconstructed evolutionary process. Phil. Trans. R. Soc. B 344 : 305 – 311 . Google Scholar Crossref Search ADS Ng J., Smith S.D. 2014 . How traits shape trees: new approaches for detecting character state-dependent lineage diversification. J. Evol. Biol. 27 : 2035 – 2045 . Google Scholar Crossref Search ADS PubMed Omoto K., Katoh T., Chichvarkin A., Yagi T. 2004 . Molecular evolution of the ‘Apollo’ butterflies of the genus Parnassius (Lepidoptera: Papilionidae). Gene 326 : 141 – 147 . Google Scholar Crossref Search ADS PubMed Omoto K., Yonezawa T., Shinkawa T. 2009 . Molecular systematics and evolution of the recently discovered “Parnassian” butterfly (Parnassius davydovi Churkin, 2006) and its allied species (Lepidoptera, Papilionidae). Gene 441 : 80 – 88 . Google Scholar Crossref Search ADS PubMed Padma T.V. 2014 . Himalayan plants seek cooler climes. Nature 512 : 359 Google Scholar Crossref Search ADS PubMed Pérez-Escobar O.A., Chomicki G., Condamine F.L., Karremans A.P., Bogarín D., Matzke N.J., Silvestro D., Antonelli A. 2017 . Recent origin and rapid speciation of Neotropical orchids in the world’s richest plant biodiversity hotspot. New Phytol. 215 : 891 – 905 . Google Scholar Crossref Search ADS PubMed Phillimore A.B., Price T.D. 2008 . Density-dependent cladogenesis in birds. PLoS Biol. 6 : e71 . Google Scholar Crossref Search ADS PubMed Pokorny L., Riina R., Mairal M., Meseguer A.S., Culshaw V., Cendoya J., Serrano M., Carbajal R., Ortiz S., Heuertz M., Sanmartín I. 2015 . Living on the edge: timing of Rand Flora disjunctions congruent with ongoing aridification in Africa. Frontiers Genet. 6 : 154 Google Scholar Crossref Search ADS Pound M.J., Haywood A.M., Salzmann U., Riding J.B. 2012 . Global vegetation dynamics and latitudinal temperature gradients during the Mid to Late Miocene (15.97–5.33 Ma). Earth Sci. Rev. 112 : 1 – 22 . Google Scholar Crossref Search ADS Price T.D., Hooper D.M., Buchanan C.D., Johansson U.S., Tietze D.T., Alström P., Olsson U., Ghosh-Harihar M., Ishtiaq F., Gupta S.K., Martens J., Harr B., Singh P., Mohan D. 2014 . Niche filling slows the diversification of Himalayan songbirds. Nature 509 : 222 – 225 . Google Scholar Crossref Search ADS PubMed Quade J., Cerling T., Bowman J. 1989 . Development of Asian monsoon revealed by marked ecological shift during the latest Miocene in Northern Pakistan. Nature 342 : 163 – 166 . Google Scholar Crossref Search ADS Rabosky D.L., Donnellan S.C., Grundler M., Lovette I.J. 2014a . Analysis and visualization of complex macroevolutionary dynamics: an example from Australian scincid lizards. Syst. Biol. 63 : 610 – 627 . Google Scholar Crossref Search ADS Rabosky D.L., Goldberg E.E. 2015 . Model inadequacy and mistaken inferences of trait-dependent speciation. Syst. Biol. 64 : 340 – 355 . Google Scholar Crossref Search ADS PubMed Rabosky D.L., Grundler M., Anderson C., Shi J.J., Brown J.W., Huang H., Larson J.G. 2014b . BAMMtools: an R package for the analysis of evolutionary dynamics on phylogenetic trees. Methods Ecol. Evol. 5 : 701 – 707 . Google Scholar Crossref Search ADS Rabosky D.L., Lovette, I.J. 2008 . Explosive evolutionary radiations: decreasing speciation or increasing extinction through time? Evolution 62 : 1866 – 1875 . Google Scholar Crossref Search ADS PubMed Rabosky D.L., Mitchell J.S., Chang J. 2017 . Is BAMM flawed? Theoretical and practical concerns in the analysis of multi-rate diversification models. Syst. Biol. 66 : 477 – 498 . Google Scholar Crossref Search ADS PubMed Rabosky D.L., Santini F., Eastman J.M., Smith S.A., Sidlauskas B., Chang J., Alfaro M.E. 2013 . Rates of speciation and morphological evolution are correlated across the largest vertebrate radiation. Nat. Commun. 4 : 1958 Google Scholar Crossref Search ADS PubMed Rambaut A., Drummond A.J. 2016 . Tracer v1.6. Available from: URL http://beast.bio.ed.ac.uk/Tracer. Rasnitsyn A.P., Zherikhin V.V. 2002 . Appendix: alphabetic list of selected insect fossil sites. Impression fossils. In: Rasnitsyn A.P., Quicke D.L.J., editors. History of insects. Dordrecht, Boston & London : Kluwer Academic Publishers . p. 437 – 444 . Rebel H. 1898 . Fossile Lepidopteren aus der Miocänformation von Gabbro. Sitzungsberichte der Kaiserlichen Akademie der Wissenschaften. Mathematisch-Naturwissenschaftliche Classe 107 : 731 – 745 . Rebourg C., Péténian F., Cosson E., Faure E. 2006 . Patterns of speciation and adaptive radiation in Parnassius butterflies. J. Entomol. 3 : 204 – 215 . Google Scholar Crossref Search ADS Ree R.H., Sanmartín I. 2018 . Conceptual and statistical problems with the DEC+J model of founder-event speciation and its comparison with DEC via model selection. J. Biogeogr. doi.org/10.1111/jbi.13173. Ree R.H., Smith S.A. 2008 . Maximum-likelihood inference of geographic range evolution by dispersal, local extinction, and cladogenesis. Syst. Biol. 57 : 4 – 14 . Google Scholar Crossref Search ADS PubMed Renner S.S. 2016 . Available data point to a 4-km-high Tibetan Plateau by 40 Ma, but 100 molecular-clock papers have linked supposed recent uplift to young node ages. J. Biogeogr. 43 : 1479 – 1487 . Google Scholar Crossref Search ADS Revell, L.J. 2012 . phytools: an R package for phylogenetic comparative biology (and other things). Methods Ecol. Evol. 3 : 217 – 223 . Google Scholar Crossref Search ADS Roland J., Keyghobadi N., Fownes S. 2000 . Alpine Parnassius butterfly dispersal: effects of landscape and population size. Ecology 81 : 1642 – 1653 . Google Scholar Crossref Search ADS Rolland J., Condamine F.L., Jiguet F., Morlon H. 2014 . Faster speciation and reduced extinction in the tropics contribute to the mammalian latitudinal diversity gradient. PLoS Biol. 12 : e1001775 . Google Scholar Crossref Search ADS PubMed Ronquist F., Klopfstein S., Vilhelmsen L., Schulmeister S., Murray D.L., Rasnitsyn A.P. 2012a . A total-evidence approach to dating with fossils, applied to the early radiation of the Hymenoptera. Syst. Biol. 61 : 973 – 999 . Google Scholar Crossref Search ADS Ronquist F., Sanmartín I. 2011 . Phylogenetic methods in biogeography. Annu. Rev. Ecol. Evol. Syst. 42 : 441 – 464 . Google Scholar Crossref Search ADS Ronquist F., Teslenko M., van der Mark P., Ayres D.L., Darling A., Höhna S., Larget B., Liu L., Suchard M.A., Huelsenbeck J.P. 2012b . MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space. Syst. Biol. 61 : 539 – 542 . Google Scholar Crossref Search ADS Ruane S., Bryson R.W. Jr ., Pyron R.A., Burbrink F.T. 2014 . Coalescent species delimitation in milksnakes (genus Lampropeltis) and impacts on phylogenetic comparative analyses. Syst. Biol. 63 : 231 – 250 . Google Scholar Crossref Search ADS PubMed Sanmartín I., Enghoff H., Ronquist F. 2001 . Patterns of animal dispersal, vicariance and diversification in the Holarctic. Biol. J. Linn. Soc. 73 : 345 – 390 . Google Scholar Crossref Search ADS Sanmartín I., Meseguer A.S. 2016 . Extinction in phylogenetics and biogeography: From timetrees to patterns of biotic assemblage. Front. Genet. 7 : 35 Google Scholar Crossref Search ADS PubMed Sauquet H., Ho S.Y.W., Gandolfo M.A., Jordan G.J., Wilf P., Cantrill D.J., Bayly M.J., Bromham L., Brown G.K., Carpenter R.J., Lee D.M., Murphy D.J., Sniderman J.M.K., Udovicic F. 2012 . Testing the impact of calibration on molecular divergence times using a fossil-rich group: the case of Nothofagus (Fagales). Syst. Biol. 61 : 289 – 313 . Google Scholar Crossref Search ADS PubMed Schnitzler J., Barraclough T.G., Boatwright J.S., Goldblatt P., Manning J.C., Powell M.P., Rebelo T., Savolainen V. 2011 . Causes of plant diversification in the Cape biodiversity hotspot of South Africa. Syst. Biol. 60 : 343 – 357 . Google Scholar Crossref Search ADS PubMed Schoville S.D., Roderick G.K. 2009 . Alpine biogeography of Parnassian butterflies during Quaternary climate cycles in North America. Mol. Ecol. 18 : 3471 – 3485 . Google Scholar Crossref Search ADS PubMed Scriber J. M. 1984 . Larval foodplant utilization by the world Papilionidae (Lep.): latitudinal gradients reappraised. Acta Rhopaloc. 6 : 1 – 50 . Scriber J.M., Tsubaki Y., Lederhouse R.C. 1995 . Swallowtail butterflies: their ecology and evolutionary biology. Gainesville (FL) : Scientific Publishers . Scudder S.H. 1875 . Fossil butterflies. Mem. Am. Assoc. Advanc. Sci. 1 : 1 – 99 . Sepulchre P., Ramstein G., Fluteau F., Schuster M., Tiercelin J.-J., Brunet M. 2006 . Tectonic uplift and Eastern Africa aridification. Science 313 : 1419 – 1423 . Google Scholar Crossref Search ADS PubMed Silvestro D., Antonelli A., Salamin N., Quental T.B. 2015 . The role of clade competition in the diversification of North American canids. Proc. Natl. Acad. Sci. USA 112 : 8684 – 8689 . Google Scholar Crossref Search ADS Smith B.T., McCormack J.E., Cuervo A.M., Hickerson M.J., Aleixo A., Cadena C.D., Pérez-Emán J., Burney C.W., Xie X., Harvey M.G., Faircloth B.C., Glenn T.C., Derryberry E.P., Prejean J., Fields S., Brumfield R.T. 2014 . The drivers of tropical speciation. Nature 515 : 406 – 409 . Google Scholar Crossref Search ADS PubMed Smith M.E., Singer B., Carroll A. 2003 . $$^{\mathrm{40}}$$Ar/$$^{\mathrm{39}}$$Ar geochronology of the Eocene Green River Formation, Wyoming. Geol. Soc. Am. Bull. 115 : 549 – 565 . Google Scholar Crossref Search ADS Sohn J.-C., Labandeira C., Davis D., Mitter C. 2012 . An annotated catalog of fossil and subfossil Lepidoptera (Insecta: Holometabola) of the world. Zootaxa 3286 : 1 – 132 . Sohn J.-C., Labandeira C.C., Davis D.R. 2015 . The fossil record and taphonomy of butterflies and moths (Insecta, Lepidoptera): implications for evolutionary diversity and divergence-time estimates. BMC Evol. Biol. 15 : 12 Google Scholar Crossref Search ADS PubMed Stadler T. 2011 . Mammalian phylogeny reveals recent diversification rate shifts. Proc. Natl. Acad. Sci. USA 108 : 6187 – 6192 . Google Scholar Crossref Search ADS Stadler T. 2013 . Recovering speciation and extinction dynamics based on phylogenies. J. Evol. Biol. 26 : 1203 – 1219 . Google Scholar Crossref Search ADS PubMed Strimmer K., Pybus O.G. 2001 . Exploring the demographic history of DNA sequences using the generalized skyline plot. Mol. Biol. Evol. 18 : 2298 – 2305 . Google Scholar Crossref Search ADS PubMed Tan J., Kelly C.K., Jiang L. 2013 . Temporal niche promotes biodiversity during adaptive radiation. Nat. Commun. 4 : 2102 Google Scholar Crossref Search ADS PubMed Todisco V., Gratton P., Cesaroni D., Sbordoni V. 2010 . Phylogeography of Parnassius apollo: hints on taxonomy and conservation of a vulnerable glacial butterfly invader. Biol. J. Linn. Soc. 101 : 169 – 183 . Google Scholar Crossref Search ADS Todisco V., Gratton P., Zakharov E.V., Wheat C.W., Sbordoni V., Sperling F.A.H. 2012 . Mitochondrial phylogeography of the Holarctic Parnassius phoebus complex supports a recent refugial model for alpine butterflies. J. Biogeogr. 39 : 1058 – 1072 . Google Scholar Crossref Search ADS Tyler H.A., Brown K.S., Wilson K. 1994 . Swallowtail butterflies of the Americas: a study in biological dynamics, ecological diversity, biosystematics and conservation. Gainesville (FL) : Scientific Publishers . Van Valen L.M. 1973 . A new evolutionary law. Evol. Theory 1 : 1 – 30 . Voje K.L., Holen Ø.H., Liow L.H., Stenseth N.C. 2015 . The role of biotic forces in driving macroevolution: beyond the Red Queen. Proc. Biol. Sci. 282 : 20150186 Google Scholar Crossref Search ADS PubMed Wagner C.E., Harmon L.J., Seehausen O. 2012 . Ecological opportunity and sexual selection together predict adaptive radiation. Nature 487 : 366 – 369 . Google Scholar Crossref Search ADS PubMed Wallace A.R. 1878 . Tropical Nature and Other Essays. London : Macmillan and Co . Weiss J.C. 1991–2005 . The Parnassiinae of the World. Part I 1991, II 1992, III 1999, IV 2005. Venette : Sciences Nat. p. 400 . Wen J., Zhang J.-C., Nie Z.-L., Zhong Y., Sun H. 2014 . Evolutionary diversifications of plants on the Qinghai-Tibetan Plateau. Front. Genet. 5 : 4 Google Scholar PubMed Xie W., Lewis P.O., Fan Y., Kuo L., Chen M.H. 2011 . Improving marginal likelihood estimation for Bayesian phylogenetic model selection. Syst. Biol. 60 : 150 – 160 . Google Scholar Crossref Search ADS PubMed Xing Y., Ree, R.H. 2017 . Uplift-driven diversification in the Hengduan Mountains, a temperate biodiversity hotspot. Proc. Natl Acad. Sci. USA 114 : E3444 – E3451 . Google Scholar Crossref Search ADS Xu Q., Ding L., Spicer R.A., Liu X., Li S., Wang H. 2018 . Stable isotopes reveal southward growth of the Himalayan-Tibetan Plateau since the Paleocene. Gond. Res. 54 : 50 – 61 . Google Scholar Crossref Search ADS Yang Z., Rannala B. 2006 . Bayesian estimation of species divergence times under a molecular clock using multiple calibrations with soft bounds. Mol. Biol. Evol. 23 : 212 – 226 . Google Scholar Crossref Search ADS PubMed Zachos J.C., Dickens G.R., Zeebe R.E. 2008 . An early Cenozoic perspective on greenhouse warming and carbon-cycle dynamics. Nature 451 : 279 – 283 . Google Scholar Crossref Search ADS PubMed Zhang J.-Q., Meng S.-Y., Allen G.A., Wen J., Rao G.-Y. 2014 . A rapid radiation and dispersal out of the Qinghai-Tibetan Plateau of an alpine plant lineage Rhodiola (Crassulaceae). Mol. Phylogenet. Evol. 23 : 147 – 158 . Google Scholar Crossref Search ADS Zhisheng A., Kutzbach J.E., Prell W.L., Porter S.C. 2001 . Evolution of Asian monsoons and phased uplift of the Himalaya Tibetan plateau since Late Miocene times. Nature 411 : 62 – 66 . Google Scholar Crossref Search ADS PubMed Zinetti F., Dapporto L., Vovlas A., Chelazzi G., Bonelli S., Balletto E., Ciofi C. 2013 . When the rule becomes the exception. No evidence of gene flow between two Zerynthia cryptic butterflies suggests the emergence of a new model group. PLoS One 8 : e65746 . Google Scholar Crossref Search ADS PubMed © The Author(s) 2018. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For Permissions, please email: journals.permissions@oup.com This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices)

### Journal

Systematic BiologyOxford University Press

Published: Nov 1, 2018