Abstract In macroevolution, the Red Queen (RQ) model posits that biodiversity dynamics depend mainly on species-intrinsic biotic factors such as interactions among species or life-history traits, while the Court Jester (CJ) model states that extrinsic environmental abiotic factors have a stronger role. Until recently, a lack of relevant methodological approaches has prevented the unraveling of contributions from these 2 types of factors to the evolutionary history of a lineage. Herein, we take advantage of the rapid development of new macroevolution models that tie diversification rates to changes in paleoenvironmental (extrinsic) and/or biotic (intrinsic) factors. We inferred a robust and fully-sampled species-level phylogeny, as well as divergence times and ancestral geographic ranges, and related these to the radiation of Apollo butterflies (Parnassiinae) using both extant (molecular) and extinct (fossil/morphological) evidence. We tested whether their diversification dynamics are better explained by an RQ or CJ hypothesis, by assessing whether speciation and extinction were mediated by diversity-dependence (niche filling) and clade-dependent host-plant association (RQ) or by large-scale continuous changes in extrinsic factors such as climate or geology (CJ). For the RQ hypothesis, we found significant differences in speciation rates associated with different host-plants but detected no sign of diversity-dependence. For CJ, the role of Himalayan–Tibetan building was substantial for biogeography but not a driver of high speciation, while positive dependence between warm climate and speciation/extinction was supported by continuously varying maximum-likelihood models. We find that rather than a single factor, the joint effect of multiple factors (biogeography, species traits, environmental drivers, and mass extinction) is responsible for current diversity patterns and that the same factor might act differently across clades, emphasizing the notion of opportunity. This study confirms the importance of the confluence of several factors rather than single explanations in modeling diversification within lineages. Diversification, extinction, Himalayan orogeny, historical biogeography, host-plant shifts, integrative study, mountain building, Papilionidae, past climate change, speciation Evolutionary biologists have long endeavored to determine which factors govern biodiversity dynamics, aiming to answer questions such as why some clades have diversified more than others, or why some lineages are widely distributed whereas others survive in restricted ranges (Ezard et al. 2016). Two different mechanistic macroevolutionary models have been proposed to explain the generation and maintenance of diversity. The Red Queen (RQ) model (Van Valen 1973), which stems from Darwin and Wallace, posits that diversification is driven by species-intrinsic, biotic factors such as interactions among species, species ecology, or life-history traits. The Court Jester (CJ) model, which builds on paleontological evidence (Barnosky 2001), argues that diversification dynamics result from historical abiotic forces such as abrupt changes in climate or geological tectonic events that drive speciation and extinction rates, usually acting clade-wide across lineages. The CJ and RQ models are generally considered the 2 extremes of a continuum, operating over different geographic and temporal scales. Biotic factors such as species interactions shape ecosystems locally over short time spans, whereas abiotic factors such as climate and tectonic events shape large-scale patterns regionally and globally over millions of years (Benton 2009). However, biotic interactions can also be observed at large spatial and temporal scales (Liow et al. 2015; Silvestro et al. 2015), while Van Valen’s (1973) original RQ hypothesis is now interpreted as accepting the role of a changing environment in shaping species evolution (Voje et al. 2015). Although both abiotic (environmental) and biotic (species-intrinsic) drivers are recognized as fundamental for regulating biodiversity (Ezard et al. 2011), these 2 types of factors are often studied in isolation (Drummond et al. 2012a; Bouchenak-Khelladi et al. 2015; Lagomarsino et al. 2016), searching for correlations between shifts in diversification rates and the evolution of key innovations or the appearance of key opportunities (Maddison et al. 2007; Alfaro et al. 2009; Rabosky et al. 2014a; Givnish et al. 2015). However, often no single factor but a confluence of biotic and abiotic factors is responsible for the diversification rate shift (Donoghue and Sanderson 2015), there could be an interaction effect. For example, C$$_{\mathrm{4}}$$ grasses appeared in the Eocene but their expansion and explosive diversification started only after mid-Miocene aridification in Africa and Central Asia (Edwards et al. 2010). Statistical assessment of the relative contributions of abiotic and biotic factors underlying diversity patterns has been made possible by the development of new probabilistic models in the field of diversification dynamics (Stadler 2013; Morlon 2014; Höhna 2015). One type of model estimates diversification rates that are clade-dependent and identifies differences in diversification rates among clades that can be explained by key innovations (Alfaro et al. 2009; Morlon et al. 2011; Rabosky et al. 2013), or by diversity-dependence and niche filling (i.e., diversification decreasing as the number of species increases, Rabosky and Lovette 2008; Etienne et al. 2012). A second type of model aims to detect statistical associations between diversification rates and changes in species traits (trait-dependent diversification models, Maddison et al. 2007; Ng and Smith 2014), or between geographic evolution and diversification, such as a change in continental connectivity allowing the colonization of a new region and a subsequent increase in allopatric speciation (Goldberg et al. 2011). A third type of model assumes continuous variation in diversification rates over time that depends on a paleoenvironmental variable and investigates whether diversification rates can be affected by abiotic environmental changes (e.g., paleotemperature, Condamine et al. 2013). Finally, episodic birth–death models search for tree-wide rate shifts that act concurrently across all lineages in a tree, for example, a mass extinction event removing a fraction of lineages at a certain time in the past (Stadler 2011; Höhna 2015; May et al. 2016). The first 2 types of models have been used to test RQ-like hypotheses on the effect of biotic interactions, while the other models are often used in the context of the CJ hypothesis (environmental change). Studies using a subset of these models to address both abiotic and biotic factors are becoming more frequent, but are often used at a local or regional geographic scale (Schnitzler et al. 2011; Drummond et al. 2012a; Jønsson et al. 2012; Hutter et al. 2013; Bouchenak-Khelladi et al. 2015; Lagomarsino et al. 2016), and are limited temporally. No study has yet addressed the full set of models using a large (geographic) and long (temporal) scale within the same lineage. Such a study would have substantial power to provide information on questions such as why some lineages diversify and others do not, and the extent that diversification is attributable to trait evolution and/or ecological opportunity (Wagner et al. 2012). It might also shed light on the current debate about whether biodiversity is at an equilibrium bounded by ecological limits (Rabosky et al. 2013), although it is unclear how relevant such limits are compared with other biological explanations (Moen and Morlon 2014). Alternatively, diversity might be unbounded and controlled by rare but drastic environmental changes such as climatic mass extinction events (Antonelli and Sanmartín 2011; Meredith et al. 2011; Kergoat et al. 2014; Condamine et al. 2015b; May et al. 2016). A key to such a study would be to find a group that: 1) has experienced both biotic and abiotic pressures at different temporal and spatial scales; 2) shows large species diversity and high specialization; and 3) has good information available about its taxonomic and evolutionary history. Nearly complete taxon sampling is crucial for accurate estimation of diversification rates (Cusimano and Renner 2010; Brock et al. 2011; Höhna et al. 2011; Davis et al. 2013), whereas rich fossil evidence is important for divergence time estimation and accurate reconstruction of biogeographic history (Meseguer et al., 2014). Phytophagous insects are especially good models because they present complex biotic (trophic) interactions with their host plants (Ehrlich and Raven 1964), and many of these lineages are old enough to have experienced dramatic past climatic changes. Herein, we address the role of biotic (RQ) and abiotic (CJ) factors that drive patterns of species richness in Apollo butterflies (Papilionidae: Parnassiinae). The Parnassiinae comprise 8 genera with at least 70 species currently recognized (Ackery 1975; Weiss 1991–2005; Kocman 2009), grouped into 3 tribes—Luehdorfiini, Zerynthiini, and Parnassiini (Nazari et al. 2007). Apollo butterflies occur from the Western Palearctic to the Western Nearctic (Weiss 1991–2005). Most diversity is concentrated in the Palearctic, where we find 2 non-monophyletic lowland-flying communities separated by the Himalaya and Tibetan Plateau (HTP, Fig. 1): an Eastern Palearctic group formed by Luehdorfia (Luehdorfiini) and Bhutanitis and Sericinus (Zerynthiini), and a Western Palearctic community including Archon (Luehdorfiini), Allancastria and Zerynthia (Zerynthiini), and Hypermnestra (Parnassiini). The largest genus, Parnassius, is mainly distributed in mountainous regions across the Holarctic, with its highest diversity in the HTP (Weiss 1991–2005; Kocman 2009; Nazari et al. 2007; Michel et al. 2008). When compared with the species-poor genera Allancastria, Archon, Bhutanitis, Hypermnestra, Luehdorfia, and Sericinus (each with 5 species maximum), the 50$$+$$ species of Parnassius have been suggested as an example of rapid diversification. This unbalanced species richness, together with their disjunct pattern of distribution in the Himalayan region, suggests that the HTP orogeny might have been an important driver in the biogeographic and diversification history of Parnassiinae. Figure 1. View largeDownload slide Distribution and species richness patterns of Apollo butterflies. Map shows the current separation of west and east communities of Parnassiinae, divided by the Himalayan–Tibetan Plateau. Lower portion of figure compares the current species diversity of each biogeographical unit (which may be non-monophyletic) or genus. Images of Parnassiinae (also in subsequent figures) are created by Fabien Condamine. Figure 1. View largeDownload slide Distribution and species richness patterns of Apollo butterflies. Map shows the current separation of west and east communities of Parnassiinae, divided by the Himalayan–Tibetan Plateau. Lower portion of figure compares the current species diversity of each biogeographical unit (which may be non-monophyletic) or genus. Images of Parnassiinae (also in subsequent figures) are created by Fabien Condamine. Parnassiinae are also remarkable for their (nearly) strict but diverse host-plant specialization (Ehrlich and Raven 1964). Tribes Luehdorfiini and Zerynthiini feed on Aristolochiaceae, the ancestral host of Parnassiinae (Condamine et al. 2012). Within Parnassiini, Hypermnestra feeds on Zygophyllaceae, while Parnassius has experienced 2 host-plant shifts, toward Crassulaceae $$+$$ Saxifragaceae and Papaveraceae (Michel et al. 2008; Condamine et al. 2012), and is regarded as an adaptive radiation (Rebourg et al. 2006). Finally, Apollo butterflies constitute a model to understand the role of past environmental change in lineage diversification. Phylogeographic studies identified an effect of Pleistocene glaciations on population-level dynamics (e.g., Keyghobadi et al. 2005; Gratton et al. 2008; Schoville and Roderick 2009; Dapporto 2010; Todisco et al. 2010, 2012; Zinetti et al. 2013). Time-calibrated phylogenies support an origin of the subfamily in the Early Cenozoic (Condamine et al. 2012), which means that the lineage has experienced the dramatic cooling and warming events of the last 50 Ma, including a drop in global temperatures at the end of the Eocene (Eocene-Oligocene climate transition [EOCT], Liu et al. 2009) that led to the demise of Boreotropical Holarctic vegetation (Pound et al. 2012). In this study, we reconstruct a completely sampled phylogeny of Parnassiinae to explore the role played by abiotic (CJ) and biotic (RQ) factors, isolated or in unison, in the origin and fate of biological radiations, including paleotemperature, mountain orogeny, host specialization, range expansion, and diversity dependence with niche filling. If interactions among species are the dominant drivers of evolution, diversification rates are expected to show diversity-dependent dynamics or be dependent on ecological traits. For example, one might expect diversification rates to decrease as a function of diversity or increase with ecological opportunity. Conversely, if evolution is driven mainly by changes in the physical environment, clade-wide responses due to abrupt abiotic perturbations should dominate macroevolutionary dynamics, for example, diversification rates may shift following major climatic changes that extirpated certain lineages while favoring the radiation of others. We test these hypotheses using trait-dependent, time-dependent, environment-dependent and episodic birth–death models, and compare the fit (explanatory power) of these models using maximum likelihood (ML) and Bayesian inference (BI). Our study is the first to address the CJ and RQ hypotheses at such a large scale, and to compare the performance of these models within a single inference framework. Materials and Methods Taxon Sampling and Molecular Dataset The latest effort to resolve phylogenetic relationships in Parnassiinae was Condamine et al. (2012), but that study did not include all species. There are currently 70 recognized species within Parnassiinae according to morphology (Weiss 1991–2005; Bollino and Racheli 2012; Frankenbach et al. 2012), but some authors have found a higher level of cryptic diversity, based on molecular data (Nazari and Sperling 2007; Gratton et al. 2008; Schoville and Roderick 2009; Dapporto 2010; Todisco et al. 2010, 2012; Zinetti et al. 2013). To address this, we assembled a large molecular dataset comprising all sequences generated by previous studies on Parnassiinae, especially by F.L.C. and F.A.H.S. (Omoto et al. 2004, 2009; Katoh et al. 2005; Nazari and Sperling 2007; Nazari et al. 2007; Michel et al. 2008; Condamine et al. 2012), and includes all currently described species within the family, except for Bhutanitis ludlowi (Möhn, 2005; Weiss 1991–2005; Churkin 2006). The dataset also includes the majority of described subspecies or geographic races proposed in molecular studies (Nazari and Sperling 2007; Gratton et al. 2008; Schoville and Roderick 2009; Dapporto 2010; Todisco et al. 2010, 2012; Zinetti et al. 2013). Additionally, sequences of 9 species of swallowtails were added to the dataset as outgroups, with Baronia brevicornis considered as the sister group to all remaining Papilionidae, and 2 species for each of the 4 tribes included within Papilioninae, the sister subfamily of Parnassiinae (Condamine et al. 2012). DNA sequences were obtained from GenBank using Taxonomy browser searches (full dataset is provided in the Supplementary Material that accompanies this article available on Dryad at http://dx.doi.org/10.5061/dryad.32bp4, Appendix S1). Sequences of less than 250 nucleotides (possibly microsatellites) and genes sequenced in very few specimens were not included. Overall the dataset comprises $$>$$1800 sequences, representing 69 of the traditionally recognized species and 5 genes (4 mitochondrial and 1 nuclear), divided into 730 sequences for cytochrome oxidase I (COI), 214 for NADH dehydrogenase 1 (ND1), 147 for NADH dehydrogenase 5 (ND5), 202 for rRNA 16S (16S), and 115 for elongation factor 1 alpha (EF-1$$\alpha )$$. Sequences were aligned using MAFFT 7.110 (Katoh and Standley 2013) with default settings (E-INS-i algorithm). Reading frames of coding genes were checked in Mesquite 3.03 (www.mesquite.org). Datasets (species/subspecies names and GenBank accession numbers) are available on Dryad (Supplementary Appendices S1 and S2). We first analyzed this specimen-level dataset to generate a gene tree for each of the 5 markers with every species represented by several specimens, each representing different populations. We then examined genetic differences based on branch lengths among specimens/species. For some species, we found strong genetic differences among specimens/populations supporting the recognition of these taxa as putative new species. For example, we identified 4 Zerynthia species instead of 2, 4 Archon species instead of 3, and 2 Hypermnestra species instead of one. For the final analyses below, we used the species-level dataset, with each species represented by a single specimen, comprising the 70 recognized species (1 represented by only morphological characters, see below) plus 15 putative new species. Assessment of Fossil Positions and Calibrations Constraints on clade ages were enforced through fossil calibrations, whose systematic position was assessed using phylogenetic analyses (Donoghue et al. 1989; Sauquet et al. 2012). Phylogenetic analyses of the dataset included both living and fossil taxa and incorporated both morphological and molecular data using a total-evidence approach (Ronquist et al. 2012a). Swallowtail fossils are scarce, but 2 unambiguously belong to Parnassiinae (Nazari et al. 2007). The first is $$\dagger$$Thaites ruminiana (Scudder 1875), a compression fossil from limestone in the Niveau du gypse d’Aix Formation of France (Bouches-du-Rhone, Aix-en-Provence) within the Chattian (23.03–28.1 Ma) of the late Oligocene (Rasnitsyn and Zherikhin 2002; Sohn et al. 2012). The second is $$\dagger$$Doritites bosniaskii (Rebel 1898), an exoskeleton and compression fossil from Italy (Tuscany) from the Messinian (5.33–7.25 Ma, late Miocene, Sohn et al. 2012). Absolute ages of geological formations were taken from Gradstein et al. (2012). To assess the phylogenetic positions of fossils, our morphological dataset comprised 236 characters (Hiura 1980; Nazari et al. 2007) including adult external (160) and internal (54) morphology, egg (2) and larval (10) morphology, pupal morphology (5), and characters for pigment color and chromosome number (5). Morphological characters were coded for all extant species in each genus except Parnassius (only 8 species: 1 per subgenus). Yet, Parnassius had little impact on our analyses since there are no fossils related to this genus (Nazari et al. 2007). Fossils were only coded for adult external morphology. Morphological and molecular data were combined to construct a total-evidence phylogeny using MrBayes 3.2.6 (Ronquist et al. 2012b). Morphological data were modeled using the Mk model (Lewis 2001). The best-fitting partitioning scheme for the molecular data was selected with PartitionFinder 1.1.1 (Lanfear et al. 2012), using the greedy search algorithm and the Bayesian Information Criterion. Rather than using a single substitution model per molecular partition, we sampled across the entire substitution rate model space (Huelsenbeck et al. 2004) using the reversible-jump Markov Chain Monte Carlo (rj-MCMC) option. Two independent analyses with 1 cold chain and 7 heated chains, each was run for 20 million generations, sampled every 2000th. A 50% majority rule consensus tree was built after discarding 25% samples as burn-in. The dataset used for this analysis is available on Dryad (Supplementary Appendix S3). Phylogeny and Estimates of Divergence Times Using the 85-species level dataset, we examined whether the rate of molecular evolution evolved in a clock-like pattern using PATHd8 (Britton et al. 2007). Since the hypothesis of a strict molecular clock was rejected for 68 of the 92 nodes ($$P < 0.05$$), we estimated divergence times using Bayesian relaxed-clock methods accounting for rate variation across lineages (Drummond et al. 2006). MCMC analyses implemented in BEAST 1.8.2 (Drummond et al. 2012b) were employed to approximate the posterior distribution of rates and divergences times and infer their credibility intervals. We set the following settings and priors: a partitioned dataset (after the best-fitting PartitionFinder scheme) was analyzed using the uncorrelated lognormal distribution clock model, with the mean set to a uniform prior between 0 and 1, and an exponential prior (lambda $$=$$ 0.333) for the standard deviation. The branching process prior was set to either a Yule or a birth–death (Gernhard 2008) process (since tree priors may impact estimates of molecular dating, Condamine et al. 2015a, using the following uniform priors: the Yule birth rate and birth–death mean growth rate ranged between 0 and 10 with a starting value at 0.1, and the birth–death relative death rate ranged between 0 and 1 (starting value $$=$$ 0.5). Calibration priors employed the fossil constraints indicated above (see Results section for the phylogenetic positions), using a uniform prior with the minimum age equal to the youngest age of the geological formation where the fossil was found. Additionally, we constrained the crown of Papilionidae with a uniform distribution bounded by a minimum age of 47.8 Ma (Smith et al. 2003; Gradstein et al. 2012) based on 2 fossils $$\dagger$$Praepapilio colorado and †Praepapilio gracilis (Durden and Rose 1978), both from the early Lutetian (mid-Eocene) of the Green River Formation (CO, USA). We could not place these fossils phylogenetically because of limited taxon sampling for Papilioninae. Instead, we placed these unambiguous fossils conservatively at the crown of the family as they share synapomorphies with all extant subfamilies (de Jong 2007), and have proven to be reliable calibration points for the crown group (Condamine et al. 2012). All uniform calibration priors were set with an upper bound equal to the estimated age of angiosperms ($$\sim$$140 Ma, sensuMagallón et al. 2015), which is 6 times older than the oldest Parnassiinae fossil. This upper age is intentionally set as ancient to allow exploration of potentially old ages for the clade. Since the fossil record of Lepidoptera is highly incomplete and biased (Sohn et al. 2015), caution is needed in using the few recent fossil calibrations. To explore the impact of sampled diversity, tree prior, and morphological data on BEAST inferences, we ran 4 analyses as follows: 1) molecular data only and Yule model; 2) molecular data only and birth–death process; 3) molecular and morphological data (total-evidence analysis) and Yule model; and 4) molecular and morphological data (total-evidence analysis) and birth–death process, with all other parameters set equal. The total-evidence analyses provide a completely sampled phylogeny of Parnassiinae due to the inclusion of B. ludlowi (only available with morphological data). Each MCMC analysis was run for 100 million generations, sampled every 10,000th, resulting in 10,000 samples in the posterior distribution of which the first 2500 samples were discarded as burn-in. All analyses were performed on the computer cluster CIPRES Science Gateway (Miller et al. 2015), using BEAGLE (Ayres et al. 2012). Convergence and performance of MCMC runs were evaluated using Tracer 1.6 (Rambaut and Drummond 2016) and the effective sample size (ESS) criterion for each parameter. A maximum-clade credibility (MCC) tree was reconstructed, with median age and 95% height posterior density (HPD). Bayes factor (BF) comparisons using stepping-stone sampling (Xie et al. 2011), which allows unbiased approximation of the marginal likelihood of Bayesian analyses (Baele et al. 2013), was used to select among competing models. We considered 2lnBF values $$>$$10 to significantly favor one model over another (Kass and Raftery 1995). The dataset and BEAST xml files generated for this study are available on Dryad (Supplementary Appendix S4). Historical Biogeography For the phylogenetic tree, we used the BEAST MCC tree with 85 species. Definition of biogeographic units was based on the present-day distribution of species but also informed by plate tectonics (Sanmartín et al. 2001) and alpine orogeny reconstructions (Bouilhol et al. 2013). Ten areas were defined: WN $$=$$ Western North America (including the Rocky Mountains); WP $$=$$ Europe (France, Spain, Pyrenees, Alps, Italy, Greece, Crete, Balkans, Scandinavia, Western Russian, and Ural Mountains); SI $$=$$ Eastern Russia, Siberia, and Kamchatka; CA $$=$$ Central Asia (Turkmenistan, Uzbekistan and Kazakhstan); MO $$=$$ Mongolian steppes and Altai Mountains; TU $$=$$ Turkey, Caucasian region, Syria, Iraq, Iranian Plateau, and Zagros Mountains; HTP $$=$$ Himalaya, Tibetan Plateau, and Pamir region; IN $$=$$ India, Bhutan, Yunnan, and Southern China; CJ $$=$$ Northern China, Korea, and Japan; AF $$=$$ North Africa and Arabia. Species ranges were defined by presence–absence in each region (Supplementary Appendix S5 available on Dryad). Biogeographic analyses were performed on the MCC tree (outgroups removed), using the Dispersal–Extinction–Cladogenesis model of range evolution (DEC; Ree and Smith 2008) implemented in the R-package BioGeoBEARS 0.2.1 (Matzke 2014). We did not use the DEC$$+$$J model (Matzke 2014) because of concerns with statistical validity of model choice among DEC-derived models and in particular under the DEC$$+$$J model (Ree and Sanmartín 2018). Also, DEC$$+$$J often leads to inferences that are decoupled from time, with null or extremely low extinction rates (Sanmartín and Meseguer 2016), an effect of the model favoring cladogenetic events over anagenetic events (Ree and Sanmartín 2018), which makes it inadequate for reconstructing the history of ancient groups with widespread distributions. Biogeographic ranges larger than 4 areas in size were disallowed as valid biogeographic states if they were not subsets of the terminal species ranges; widespread ranges comprising areas that have never been geographically connected (Sanmartín et al. 2001) were also removed. We constructed a time-stratified model (Supplementary Appendix S6 available on Dryad) with 4 time slices that specified constraints on area connections and spanned the Parnassiinae evolutionary history; each interval represents a geological period bounded by major changes in tectonic and climatic conditions thought to have affected the distribution of these butterflies. Testing the RQ and CJ hypotheses To understand the relative contributions of RQ versus CJ-type factors, we ran a series of macroevolutionary models using both ML and BI. Rather than on a single tree, analyses were performed on a sample of trees from the BEAST post-burnin posterior distribution (with outgroups removed); an exception was the diversity-dependent diversification models that, due to time constraints, were run only on the MCC tree. This approach allowed us to assess the robustness of results to phylogenetic uncertainty and estimate confidence intervals for the parameter estimates. We compared the fit of each model to the data using the corrected Akaike Information Criterion (AICc) for the ML analyses, and Bayes Factor comparisons (2lnBF) for BI. Diversification analyses in a RQ model. We first tested the hypothesis that diversity is bounded or at equilibrium, meaning that diversity expanded rapidly in early diversification, filled most niches and saturated toward the present. We explored the effect of diversity-dependence on speciation and extinction rates using the method of Etienne et al. (2012) implemented in the R-package DDD 2.7. We applied 5 different models: 1) speciation depends linearly on diversity without extinction, 2) speciation depends linearly on diversity with extinction, 3) speciation depends exponentially on diversity with extinction, 4) speciation does not depend on diversity and extinction depends linearly on diversity, and 5) speciation does not depend on diversity and extinction depends exponentially on diversity. For each model, the initial carrying capacity was set to the current number of described species. We also tested the effect of host-plant association on diversification rates, since previous study has pointed to a potential correlation between shifts in diversification and host-plant switches (Ehrlich and Raven 1964). We used a diversification model of the state-dependent speciation and extinction (SSE) family of models, in which extinction and speciation rates are associated with phenotypic evolution of a trait along a phylogeny (Maddison et al. 2007). In particular, we used the Multiple State Speciation Extinction model (MuSSE; FitzJohn et al. 2009) implemented in the R-package diversitree 0.9–7 (FitzJohn 2012), which allows for multiple character states. Larval host plant data were taken from previous work (Ehrlich and Raven 1964; Scriber 1984; Collins and Morris 1985; Tyler et al. 1994; Scriber et al. 1995). The following 4 character states were used: 1) Aristolochiaceae; 2) Zygophyllaceae; 3) Papaveraceae; and 4) Crassulaceae $$+$$ Saxifragaceae. Data at a lower taxonomic level than plant family were not used because of the great number of multiple associations exhibited by genera that could alter the phylogenetic signal. Thirty-six different models were run to test whether speciation, extinction, or transition rates were dependent on trait evolution. Models were built with increasing complexity, starting from a model with no difference in speciation, extinction and transition rates (3 parameters) to the most complex model with different speciation, extinction, and transition rates for each character state (20 parameters). We estimated posterior density distribution with Bayesian MCMC analyses (10,000 steps) performed with the best-fitting models and resulting speciation, extinction and transition rates. There have been concerns about the power of SSE models to infer diversification dynamics from a distribution of species traits (Davis et al. 2013; Maddison and FitzJohn 2015; Rabosky and Goldberg 2015). First, SSE models tend to have a high type I error bias related to the shape (topology) of the inferred tree (Rabosky and Goldberg 2015). To test whether our diversification results were potentially biased, we estimated the difference of fit ($$\Delta $$AIC) between the best model from MuSSE and a null model (corresponding to the model with no difference in speciation, extinction and transition rates between the states of a character) and compared this with the difference between the same models as estimated from simulated datasets. 1000 sets of traits were generated on the Parnassiinae phylogeny using the sim.history function from the R-package phytools (Revell 2012). To keep the phylogenetic signal in the trait, transition rates between states were simulated using values obtained from the null model fitted on the observed phylogeny. Second, there is concern whether SSE models are uncovering actual drivers of diversification, or whether they are simply pointing to more complex patterns involving unmeasured and co-distributed factors in the phylogeny (Beaulieu and O’Meara 2016). A model with an additional hidden character may alleviate this issue. We applied the Hidden SSE (HiSSE) model to specifically account for the presence of unmeasured factors that could impact diversification rates estimated for the states of any observed trait. The analyses were performed in the Bayesian software RevBayes (Höhna et al. 2016a). To provide an independent assessment of the relationship between diversification rates and host specificity, we used models that allow diversification rates to vary among clades. BAMM 2.5 (Rabosky et al. 2013, 2014a) was used to explore for differential diversification dynamic regimes among clades differing in their host-plant feeding. BAMM analyses were run for 10 million generations, sampling every 10,000th and with 4 different values of the compound Poisson prior (CPP) to ensure the posterior is independent of the prior (Moore et al. 2016). Mixing and convergence among runs (ESS $$>$$200 after 15% burn-in) were assessed with the R-package BAMMtools 2.1 (Rabosky et al. 2014b). BAMM has been criticized for incorrectly modeling rate-shifts on extinct lineages, that is, unobserved (extinct or unsampled) lineages inherit the ancestral diversification process and cannot experience subsequent diversification-rate shifts (Moore et al. 2016, but see Rabosky et al. 2017). To solve this, we used a novel approach implemented in RevBayes that models rate shifts consistently on extinct lineages by using the SSE framework (Moore et al. 2016; Höhna S., Moore B.R., Rannala B., May M.R., Huelsenbeck J.P. in preparation). Although there is no information of rate shifts for unobserved/extinct lineages in a phylogeny including extant species only, these types of events must be accounted for in computing the likelihood. The number of rate categories is fixed in the analysis but RevBayes allows any number to be specified, thus allowing direct comparison of different macroevolutionary regimes. Diversification analyses in a CJ model. We evaluated the impact of abiotic factors such as abrupt changes in tectonic or climatic settings or mass extinction events (e.g., the EOCT at 33.9 Ma) using episodic birth–death models implemented in the R-packages TreePar 3.3 (Stadler 2011) and TESS 2.1 (CoMET model, Höhna et al. 2016b; May et al. 2016). These models allow detection of discrete changes in speciation and extinction rates concurrently affecting all lineages in a tree. Both approaches estimate changes in diversification rates at discrete points in time, but can also infer mass extinction events, modeled as sampling events in the past in which the extant diversity is reduced by a fraction. Speciation and extinction rates can change at those points but remain constant within time intervals. The underlying likelihood function is identical in the 2 methods (Stadler 2011; Höhna 2015), but TreePar and TESS differ in the inference framework (ML vs. BI) and the method used for model comparison (AICc vs. BF). In addition, TESS uses independent CPPs to simultaneously detect mass extinction events and discrete changes in speciation and extinction rates, while TreePar estimates the magnitude and timing of speciation and extinction changes independently to the occurrence of mass extinctions (i.e., the 3 parameters cannot be estimated simultaneously due to parameter identifiability issues, Stadler 2011). To compare inferences between CoMET and TreePar, we performed 2 independent TreePar analyses allowing and disallowing mass extinction events (argument ME $$=$$ TRUE/FALSE). We compared models with 0, 1, or 2 rate shifts/mass extinction events in TreePar using the AICc, while BF comparisons were used to assess model fit between models with varying number and time of changes in speciation/extinction rates and mass extinctions in TESS. Finally, we used an implementation of CoMET in RevBayes (Höhna et al. 2016a), an episodic birth–death model in which speciation and extinction are allowed to vary at discrete points in time, but are autocorrelated (instead of independent/uncorrelated as in CoMET) across time intervals; a Brownian model with rates in the next time interval centered around the rates in the current time interval was used. Also, the number of diversification rate shifts was set a priori as in population skyline models (Strimmer and Pybus 2001), while in CoMET this is modeled through a CPP prior. To examine the influence of elevational distribution (lowland, mountain, or both) on speciation and extinction rates, we used the trait-dependent diversification model GeoSSE (Geographic State Speciation and Extinction, Goldberg et al. 2011). Geographic characters require a third widespread state, because, unlike morphological traits, ancestors can be present in more than 1 state (area). This requires the modeling of cladogenetic state change, the probability associated with the division of an ancestral range between the 2 descendants, alongside anagenetic state change (i.e., the probability of character change along branches). Species were coded by their elevation zone, using data from literature, museum records, and field observations. We evaluated 12 models of increasing complexity to test the relationship between elevational distributions and diversification. As in MuSSE above, we evaluated the performance of GeoSSE using simulation tests. Trait simulation was more complex here because there is no direct transition between states A and B. Instead, these transitions involve range expansion to an intermediate widespread state AB (dAB, dBA) followed by local extinction (xA, xB). Following Goldberg et al. (2011), we considered transition rates between A and B as null because they involve more than one instantaneous event; the transition rate from A to AB was coded as range expansion dA, from B to AB as dB; transitions from AB to A and from AB to B (“extirpation rates”) were coded as xB and xA, respectively. For our 1000 simulations, we used the transition rates of the model with equal speciation rates (sA $$=$$ sB $$=$$ sAB), but different extinction (xA$$\ne $$xB) and transition rates (dA$$\ne $$dB). Finally, we tested the impact of paleoenvironmental variables on diversification rates, using a birth–death likelihood method in which rates may change continuously with an environmental variable, itself varying through time (Condamine et al. 2013). We tested 4 models: 1) a constant-rate birth–death model in which diversification is not associated to the environmental variable; 2) speciation rate varies according to environment and extinction rate does not vary; 3) speciation rate does not vary and extinction rate varies according to environment; and 4) both speciation and extinction rates vary according to environment. We also tested the corresponding models in which speciation and/or extinction are allowed to vary with time, but independently from the environmental variable (time-dependent birth–death models, Morlon et al. 2011). When rates varied with the environment (E), we assumed exponential variation, such that $$\lambda (E)=\lambda_{0}\thinspace \times \thinspace e^{\alpha E}$$ and $$\mu (E)=\mu_{0}\times \thinspace e^{\beta E}$$, in which $$\lambda_{0}$$ and $$\mu _{0}$$ are the speciation and extinction rates for a given environmental variable for which the value is 0, and $$\alpha $$ and $$\beta $$ are the rates of change according to the environment. Positive values for $$\alpha $$ or $$\beta $$ mean a positive effect of the environment on speciation or extinction (and conversely). As environmental variables (Fig. 2), we used paleotemperature (data retrieved from Zachos et al. 2008) and paleo-elevation of the HTP (a proxy for mountain building), with data compiled from existing data in the literature available on Dryad (Supplementary Appendix S7). The R-package pspline was used to build environmental vectors from the data as input for the birth–death models. Figure 2. View largeDownload slide Environmental changes over the last 40 million years. a) Trends in global climate change estimated from relative proportions of different oxygen isotopes (Zachos et al. 2008). b) Trends in HTP elevation for the last 40 Myrs, compiled from data from a literature survey; our compilation shows that the HTP was as high as 3000 m in the Eocene, congruent with recent reviews (Renner 2016; Xu et al. 2018; Supplementary Appendix S7). c) Map of the current elevation of the Himalayan–Tibetan Plateau with the species diversity of Parnassius in the region. Figure 2. View largeDownload slide Environmental changes over the last 40 million years. a) Trends in global climate change estimated from relative proportions of different oxygen isotopes (Zachos et al. 2008). b) Trends in HTP elevation for the last 40 Myrs, compiled from data from a literature survey; our compilation shows that the HTP was as high as 3000 m in the Eocene, congruent with recent reviews (Renner 2016; Xu et al. 2018; Supplementary Appendix S7). c) Map of the current elevation of the Himalayan–Tibetan Plateau with the species diversity of Parnassius in the region. Alternatively, we used RevBayes and the CoMET framework (episodic birth–death models) to implement an analysis in which diversification rates are allowed to co-vary with, or evolve independently from, paleotemperatures and paleo-elevation across discrete time intervals (Palazzesi L., Hidalgo O., Barreda V.D, Forest F., Höhna S. in review). A correlation coefficient, denoted $$\beta $$, is co-estimated with diversification rates in the model and describes the magnitude and direction of the relationship between the variable and changes in diversification rates: $$\beta $$$$<$$ 0 (negative correlation), $$\beta $$$$>$$0 (positive correlation), and $$\beta = 0$$ (no correlation, in which case the model collapses to the episodic birth–death models; see Supplementary Appendix S8 available on Dryad). RESULTS Phylogeny and Fossil Placements The molecular matrix comprised 4535 nucleotides and 85 extant species (70 formally recognized and 15 putative species) of Parnassiinae plus 2 fossil species, following recent taxonomic reviews (Frankenbach et al. 2012; Bollino and Racheli 2012) and our own results from the species delimitation analyses (see Supplementary Appendices S1 and S2 available on Dryad). PartitionFinder selected 8 partitions as the best scheme for substitution models (Supplementary Appendix S9 available on Dryad). Convergence among runs was supported by the low average standard deviation of split frequencies, PSRF values $$\approx $$1.0, and ESS $$>>$$ 1000 for many parameters. The resulting Bayesian trees were well resolved and robust: 74.4% of the nodes were recovered with posterior probability (PP) $$>$$0.95 with MrBayes (Fig. 3), and 83% in BEAST (70% with PP $$=$$ 1) when fossil taxa were removed. Phylogenetic relationships agree with previous studies (Nazari et al. 2007; Michel et al. 2008), showing Luehdorfiini $$+$$ Zerynthiini as sister-tribes and sister to Parnassiini; within the latter, Hypermnestra is sister to the genus Parnassius. Most of the weak support is found within the radiation of species complexes, such as Parnassius apollo or in the subgenus Kailasius, but 2 deep nodes in Parnassius have PP $$<$$ 0.9. Figure 3. View largeDownload slide Phylogenetic relationships inferred with a Bayesian total-evidence approach combining molecular and morphological data of extant and extinct Parnassiinae. This integrative approach allowed assessment of the phylogenetic placement of 2 fossils, which were then used to calibrate the molecular dating analysis. Backbone nodes show posterior probability values. Nodal support values within genera or subclades are shown in Supplementary Appendix S10 for alternative dating analyses including, or not, morphological data. Grey circles indicate the crown-node of each tribe. Figure 3. View largeDownload slide Phylogenetic relationships inferred with a Bayesian total-evidence approach combining molecular and morphological data of extant and extinct Parnassiinae. This integrative approach allowed assessment of the phylogenetic placement of 2 fossils, which were then used to calibrate the molecular dating analysis. Backbone nodes show posterior probability