TY - JOUR
AU1 - Rönkä, K
AU2 - Mappes, J
AU3 - Michalis, C
AU4 - Kiviö, R
AU5 - Salokannas, J
AU6 - Rojas, B
AB - Multiple-model mimicry, whereby different morphs of an aposematic species each resemble another defended species sharing the costs of predator education, has been proposed as a mechanism allowing colour polymorphisms in aposematic species. Male wood tiger moths, Arctia plantaginis (Linnaeus, 1758), are chemically defended and polymorphic (yellow, white) for hindwing coloration. We selected four potentially aposematic moth species and studied whether Müllerian mimicry exists between them and A. plantaginis morphs. We tested the moths’ relative palatability to natural predators with and without visual cues, their phenotypic similarity under a bird visual system, and whether trials with a potential moth model influence a predator’s willingness to attack A. plantaginis. Our results show that (1) three of the four tested species were not sufficiently unpalatable and thus not potential models for A. plantaginis, and (2) birds confused the unpalatable yellow model Arichanna melanaria with yellow A. plantaginis, although their overall appearance is distinguishable. This indicates imperfect mimicry based on shared colour cues. Multiple-model mimicry is thus a potential contributor to the maintenance of multiple morphs, although no unpalatable model was found for the white morph. Our findings highlight the importance of accounting for both prey coloration and palatability, which in concert affect predator behaviour, the ultimate driver of mimicry evolution.

INTRODUCTION

Local variation in warning signals is evolutionarily puzzling because prey that have warning colours are expected to be under positive frequency-dependent selection by local predators, leading to signal monomorphism (Müller, 1879; Fisher, 1958; Ruxton, Sherratt & Speed, 2004).
To avoid the costs of unnecessary pursuit or toxic load, predators learn to avoid unprofitable prey by associating unprofitability with prey coloration. Learned avoidance is expected to be generalized to other prey sharing a similar warning signal. This allows predators to optimize their fitness by attacking only profitable prey items in the prey community, while prey individuals with a similar appearance share the costs of predator education. Hence, local predators provide strong selection for qualities that make them associate warning signals with prey defence, for example signal conspicuousness, colour, pattern or uniformity. Signal sharing can occur between species when two or more defended species resemble each other in a Müllerian mimicry ring (e.g. Müller, 1879; Benson, 1972; Kapan, 2001; Marek & Bond, 2009; Stuckert et al., 2014). However, the system is prone to cheating. Less defended intra- or interspecific individuals with a similar warning signal can parasitize the defended model species, obtaining the benefits of predator avoidance without incurring the costs associated with the defence (e.g. Bates, 1862; Kunte, 2009; Kraemer, Serb & Adams, 2015; Jones et al., 2017; Katoh, Tatsuta & Tsuji, 2017). Thus, mimetic systems between species varying in their level of palatability can be thought of as a continuum from equally defended Müllerian mimics benefiting each other to Batesian mimicry, in which a palatable mimetic species gains protection from a defended model, while the model suffers from increased predation as the proportion of non-defended prey increases (Huheey, 1976; Rowland et al., 2010). Predators trade off the costs of attacking a defended model against the costs of not attacking a palatable mimic (Speed, 1999; Johnstone, 2002; Skelhorn & Rowe, 2007).
The strength of selection for signal similarity in a given predator–prey community thus depends on the predator’s tendency to generalize and the rates of discrimination error (Lindström, Alatalo & Mappes, 1997; Mappes & Alatalo, 1997; MacDougall & Dawkins, 1998; Ihalainen, Lindström & Mappes, 2007; Aronsson & Gamberale-Stille, 2012; Ihalainen et al., 2012). This is, in turn, influenced by predators’ hunger (i.e. motivation to attack; Sandre et al., 2010), which is affected by prey availability (Kokko, Mappes & Lindström, 2003; Lindström et al., 2004), as well as by the relative palatability of the prey (Ihalainen et al., 2007). Although the concept of Müllerian mimicry was proposed in 1879, and several theoretical as well as experimental approaches have contributed to a better understanding of its underpinnings, it is still debated how mimetic relationships affect selection on warning coloration and how polymorphism among defended co-mimics is maintained (Joron & Mallet, 1998; Speed, 1999; Rowland et al., 2007). Lepidoptera include some of the best-known examples of mimetic systems (e.g. Bates, 1862; Müller, 1879; Benson, 1972; Mallet & Barton, 1989; Kapan, 2001; Katoh et al., 2017), but, in general, empirical approaches with real prey and their relevant predators are scarce (Ruxton et al., 2004). Moreover, surprisingly little is known about the palatability of species, for example what kind of chemical defences the species possess, and how they affect different predators (Marsh & Rothschild, 1974; but see Arias et al., 2016a for a study where inter-species palatability was addressed as a possible explanation for polymorphism in a mimetic species). Another factor hindering mimicry studies has been the lack of an objective way of measuring colour and pattern similarity. However, the development of image analysis methods (e.g. 
Endler & Mielke, 2005; Endler, 2012; Le Poul et al., 2014; Kemp et al., 2015; Troscianko & Stevens, 2015; Taylor, Reader & Gilbert, 2016; Van Belleghem et al., 2018) and increasing knowledge of predator visual systems (e.g. Vorobyev & Osorio, 1998; Kelber, Vorobyev & Osorio, 2003; Renoult, Kelber & Schaefer, 2017) are beginning to overcome this issue. In addition to detailed knowledge of predator vision, however, it is necessary to understand the cognitive processes involved in prey recognition and predator attack decisions, more specifically, how predators use the information they gather from prey palatability and associated cues (Skelhorn, Halpin & Rowe, 2016). This is because predator behaviour, whether or not it sees the difference between co-mimics or confuses them, and correspondingly attacks (or not) a particular prey, is the ultimate selective force on both the warning signal and the chemical defence(s) variation. Local predator communities consist of both inexperienced and experienced individuals (Mappes et al., 2014), the latter of which may choose to consume defended prey (Ihalainen et al., 2008b) according to their physiological state, i.e. toxic burden (Johnstone, 2002; Barnett, Bateson & Rowe, 2007; Skelhorn & Rowe, 2007). Studying the behaviour of relevant natural predators is thus key in assessing selection on aposematic species (see also Merilaita, 2016). Although many previous studies have provided insight into how a single aposematic prey species can exhibit multiple morphs within a given population (e.g. Ueno, Sato & Tsuchida, 1998; Nokelainen et al., 2012, 2014; Hegna et al., 2013; Rojas, Devillechabrolle & Endler, 2014), the relative importance of different mechanisms remains poorly understood. A possible explanation for this counterintuitive phenomenon is multiple-model mimicry. 
Examples of multiple-model mimicry are described for both Batesian (Papilio dardanus: Nijhout, 2003; Papilio memnon: Clarke, Sheppard & Thornton, 1968) and Müllerian mimics (Heliconius numata: Brown & Benson, 1974; Joron et al., 1999; Appalachian millipedes: Marek & Bond, 2009; Ranitomeya imitator: Symula, Schulte & Summers, 2001; but see Chouteau et al., 2011 for a study that challenges the multiple-model hypothesis in R. imitator). The benefits of polymorphism are easily explained for Batesian mimics, which gain most selective advantage when they are rare compared to their models. Mimicking several sympatric models instead of one can thus sustain larger population sizes of the Batesian mimic (Edmunds, 1974). The same could apply to quasi-Batesian systems, where a defended model is mimicked by a less-defended co-mimic (Speed, 1999; Rowland et al., 2010). Müllerian polymorphism, however, is somewhat more complex: although each morph can benefit from sharing the signal with a defended model, frequency-dependent selection should still favour local monomorphism. One well-known example of multiple-model Müllerian mimicry is the remarkable case of Heliconius numata, where polymorphism is thought to be maintained via spatial differences in local selection and dispersal (Joron et al., 1999) and selection against intermediate phenotypes (Arias et al., 2016b). The wood tiger moth, Arctia plantaginis (formerly Parasemia plantaginis; Rönkä et al., 2016), is an aposematic species with remarkable variation in hindwing coloration across its Holarctic range. Males in some populations are polymorphic, with co-occurring yellow and white morphs. Here, we hypothesize that this local polymorphism could be maintained because each morph gains protection from a different defended species (i.e. a quasi-Batesian or Müllerian co-mimic, referred to hereafter as a model).
For multiple-model mimicry to explain the maintenance of local polymorphism, the putative models need to (1) be sufficiently unpalatable to facilitate the avoidance of the co-mimic (A. plantaginis) and (2) share their warning colours with the corresponding morphs in the eyes of would-be predators. Two black-and-yellow and two black-and-white diurnal sympatric moths were selected as potential models for the yellow and white wood tiger moth morphs, respectively (Table 1). To test whether mimetic relationships could exist between the species, we carried out bioassays with relevant wild-caught predators aiming to test (1) the relative degree of palatability of each putatively mimetic pair in both the presence and the absence of visual cues, and (2) if experience with a putative model changes bird reactions towards the putative mimic (yellow or white A. plantaginis morph) and vice versa. In other words, we tested whether there is potential for generalized avoidance between unpalatable prey sharing a similar warning colour. We complemented these bioassays with detailed image analyses comparing both the overall appearance and hindwing warning colour of A. plantaginis and its putative co-mimics. The image analyses were used to provide an objective measure of similarity among species, and to determine whether the birds could use hindwing colour as a cue for unpalatability.

Table 1. Selected moth species and comparative information on size, timing of flight, abundance and collection of samples. Photos taken by KR.

MATERIAL AND METHODS

Study species

Adult A. plantaginis (Erebidae: Arctiinae) are diurnal (Rojas, Gordon & Mappes, 2015), and aposematic, as they are chemically defended (Rojas et al., 2017) and conspicuously coloured (Nokelainen et al., 2012).
Their chemical defence contains pyrazine compounds, which are deterrent to avian predators (Rojas et al., 2017) and synthesized de novo (Burdfield-Steel et al., 2018). In Europe, the coloration of adult males consists of a contrasting black-and-white forewing pattern and either white or yellow hindwing warning colour combined with a variable degree of black patterning (Hegna, Galarza & Mappes, 2015). Male hindwing warning coloration is determined by one autosomal locus with at least three alleles (J. Galarza et al., unpubl. data), resulting in distinct white and yellow hindwing morphs, whereas female hindwing colours vary continuously from orange to red. Local polymorphism is common across the Holarctic distribution range, and in Finland both white and yellow males co-occur. Morph frequencies in A. plantaginis are monitored yearly using pheromone traps and netting. The peak flight season is at the beginning of July, when males fly in search of females. Based on species phenology, co-occurrence with A. plantaginis and coloration, four geometrid moth species were selected as potential mimetic models (Table 1). The black-and-yellow Arichanna melanaria and Pseudopanthera macularia resemble the yellow morph, while the black-and-white Lomaspilis marginata and Rheumaptera hastata resemble the white morph of A. plantaginis. Species occurrence was inferred from distribution, habitat and timing of flight data from updated databases (FinBIF), books (Silvonen, Top-Jensen & Fibiger, 2014), and 30 transect counts in wood tiger moth habitats during its flight season in 2014 (K. Rönkä, unpubl. data). In addition to having a similar appearance to A. plantaginis, Arichanna melanaria is known to be capable of sequestering low quantities of grayanotoxins, which are known to deter at least lizards (Nishida, 1994). To our knowledge, none of the adult putative model species in this study has ever been directly tested for palatability.
Wild-caught and freeze-killed specimens of Autographa gamma and Zygaena sp. were used as positive and negative palatability controls, respectively, while Tenebrio molitor larvae (hereafter referred to as mealworms) were used to control for bird motivation to attack and consume insect prey. The silver Y moth, Autographa gamma, is common, polyphagous and cryptically coloured, and thus is likely to be palatable to most predators. Burnet moths, Zygaena spp., by contrast, are known to possess hydrocyanic acid (Jones, Parsons & Rothschild, 1962; Davis & Nahrstedt, 1982), and to be unpalatable to birds (Turner, 1970, and references therein). Moth samples were collected by netting (netting and light for Arichanna melanaria) from their natural habitats (Table 1). To obtain enough samples for all the experiments, field-collected L. marginata, R. hastata and P. macularia were mated and the F1 generation was reared in a glasshouse in Central Finland, using natural food plants, and overwintered as pupae. Arctia plantaginis were obtained from a laboratory stock originating from, and reinforced with, field-collected individuals from Finland. Only male A. plantaginis were used in experiments. All the other species are sexually monomorphic in coloration, and thus a random selection of both sexes was used in experiments. As we were unable to rear Arichanna melanaria in sufficient numbers, all the samples of this species were collected from the wild in Central Finland during their flight season in August 2015. To reduce the effect of intensive sampling on natural populations, only male Arichanna melanaria were collected and used in experiments. All samples were killed by freezing after collection or eclosion, and stored at −20 °C for a maximum of 18 months. Most samples were stored for less than 6 months before they were fed to birds.
Palatability experiments without visual cues

A first step to address the question of whether a species is a potential model for a mimic in a presumed Müllerian complex is to test how palatable it is in relation to its putative mimic. We thus tested the relative palatability between potential co-mimics with two feeding assays (Table 2), where moths were offered to a relevant predator (great tit, Parus major) without any visual cues. The proportion eaten was used as a proxy for palatability. To transform the frozen moths into a homogeneous paste, we first dried them in a freeze-dryer at −20 °C. The dried samples were stored at room temperature for a maximum of 8 weeks before the experiments. Three individuals of each species were pooled in one Eppendorf tube to minimize the effect of potential inter-individual variation in palatability, leaving the left side fore- and hindwings of every third individual aside for subsequent image analysis. The tube contents were crushed into powder using a TissueLyser II for 5 s (at 30 rounds per second) with a 5-mm steel ball in the tube. We weighed the resulting powder to the nearest 0.01 mg and added water to each sample in a ratio of 6:1 for the A. plantaginis, Autographa gamma and Zygaena sp., and 9:1 for the other moths. These amounts of water were used to create a smooth paste of uniform consistency for all samples. Once water was added, the paste was used immediately or stored in a fridge (+3 °C) or freezer (−20 °C) between bird assays, to prevent spoilage and microbial growth.

Table 2.
Synthesis of all experiments, study questions, statistical tests, sample sizes, main results and conclusions

Experiment: Assay 1, palatability with no visual cues
Experimental procedure: Figure 1; 2 trials with mealworm controls
Study questions: (1) Are the putative models Am, Pm, Rh & Lm less palatable than the putative mimics Apy/Apw? (2) Are the putative models less palatable than good/bad mealworm controls? (3) Is Apw less palatable than Apy?
Measured variables and tested comparisons: Figure 2, Figure 4A, Table 2A. Proportions eaten in two trials of (a) putative models vs. Ap morph, (b) putative models vs. controls, (c) white vs. yellow Ap
Statistical test method: GLMM with a beta distribution and a logit link function
Sample size: Figure 2, Assay 1 [number of moth paste samples not spilled (used in Figure 4A)/number of all moth samples (used in analysis)]
Main results and conclusions: (1) No significant difference Am vs. Apy and Pm vs. Apy; Rh and Lm are eaten more than Apw. (2) Putative models are eaten less than both controls. (3) No significant difference between Apy and Apw → Pm and Am are putative Müllerian models; Rh and Lm are putative quasi-Batesian mimics

Experiment: Assay 2, palatability with no visual cues
Experimental procedure: Figure 1; 1 trial with Autographa gamma and Zygaena sp. controls
Study questions: (1) and (3) same as above. (2) Are the putative models less palatable than a putatively palatable moth (Autographa gamma) or an unpalatable moth (Zygaena sp.)?
Measured variables and tested comparisons: Figure 2, Figure 4B, Table 2B. Proportions eaten of (a) putative models vs. Ap morph, (b) putative models vs. controls, (c) white vs. yellow Ap
Statistical test method: GLMM with a beta distribution and a logit link function
Sample size: Figure 2, Assay 2 [number of moth paste samples not spilled (used in Figure 4B)/number of all moth samples (used in analysis)]
Main results and conclusions: (1) No significant difference Am vs. Apy and Pm vs. Apy; Rh and Lm are eaten more than Apw. (2) Putative models are eaten less than Autographa gamma but more than Zygaena. (3) No significant difference between Apy and Apw → Pm and Am are putative Müllerian models; Rh and Lm are putative quasi-Batesian mimics

Experiment: Assay 3, palatability with visual cues: proportions eaten
Experimental procedure: Figure 3; data from all trials (see Table S4 for a version with data from first trials only)
Study questions: (1) Are the putative models Am, Pm, Rh & Lm less palatable than the putative mimics Apy/Apw? (2) Does the presence of visual cues change moth acceptability to birds? (3) Is Apw less palatable than Apy?
Measured variables and tested comparisons: Figure 2, Figure 4C, Table 3C. Proportions eaten of (a) putative models vs. Ap morph, (b) white vs. yellow Ap
Statistical test method: GLMM with a beta distribution and a logit link function
Sample size: Figure 2, Assay 3. Great tits: Pm = 20, Am = 20, Apy = 39, Apw = 18, Rh = 10, Lm = 8
Main results and conclusions: Proportion eaten: (1) Pm eaten more than Apy, other differences non-significant. (2) All moths eaten more than without visual cues → Once attacked, most species eaten at similar levels (different from without visual cues, indicating that birds can handle their prey)

Experiment: Assay 3, palatability with visual cues: beak cleaning
Experimental procedure: Figure 3; data from all trials
Study questions: (1) Do the putative models Am, Pm, Rh & Lm induce more disgust behaviour than the putative mimics Apy/Apw? (2) Is there a difference in bird disgust behaviour between Apw and Apy?
Measured variables and tested comparisons: Figure 5, Table 4. Amount of beak cleaning of (a) putative models vs. Ap morph, (b) white vs. yellow Ap
Statistical test method: GLMM with a negative binomial distribution and a logit link function
Sample size: Figure 2, Assay 3. Great tits: Pm = 20, Am = 20, Apy = 39, Apw = 18, Rh = 10, Lm = 8
Main results and conclusions: Beak cleaning (BC): (1) Less BC to Pm than Apy, no difference Am vs. Apy, less BC to Rh and Lm than Apw → Pm, Rh and Lm seem to be less defended than Ap. (2) Less BC to Apy than Apw → difference between morph defences

Experiment: Assay 3, palatability with visual cues: learning (in Supporting Information)
Experimental procedure: Figure 3; 4 trials with species X as a ‘model’ (last trial not used)
Study questions: Palatability in sequential presentations: are there signs of increased or decreased avoidance?
Measured variables and tested comparisons: Table S1, Figure S1. The effects of (a) moth species, (b) bird species and (c) their interaction on changes in survival (cumulative risk of being attacked during a trial) during four sequential presentations
Statistical test method: Cox proportional hazards model for survival
Sample size: Figure S1. Blue tits: Apy = 11, Am = 11. Great tits: Apy = 20, Am = 10
Main results and conclusions: The proportional risk of being attacked by blue and great tits changes differently depending on moth species → great tits showed increasing avoidance towards Am but blue tits did not

Experiment: Image analysis, visual cues only
Experimental procedure: Pictures of all species → colour and pattern value extraction using blue tit vision model → model to assess discriminability
Study questions: (1) How easy is it for the predators to tell apart the putative models and Apy/Apw? (2) Could they be using hindwing colour only as a common cue?
Measured variables and tested comparisons: Modelled discriminability of a putative model vs. a putatively mimetic Ap morph and the other Ap morph based on (a) overall appearance (colour and texture), (b) hindwing colour (Figure 6)
Statistical test method: Logistic regression model → AUC (a measure from signal detection theory)
Sample size: Apy vs. Am = 14, Pm = 12, Rh = 9, Lm = 16. Apw vs. Am = 14, Pm = 12, Rh = 9, Lm = 17
Main results and conclusions: (1) All species distinguishable based on overall appearance. (2) Am vs. Apy and Rh vs. Apw are less easy to discriminate based on hindwing colour → hindwing colour could be used as a common cue

Experiment: Assay 3, mimicry
Experimental procedure: Figure 3; 1st trial of species X as a ‘model’ vs. 5th trial of species X as a ‘mimic’ (from another group of birds)
Study questions: (1) Does experience with the putative model increase hesitation towards the putative mimic? (2) Does experience with the putative mimic decrease hesitation towards the putative model?
Measured variables and tested comparisons: Figure 7. Attack latency without recent experience (1st trial) with a putative model vs. after recent experience with a putative model (5th trial) (a) towards a putative mimic (Apy), (b) towards a putative model (Am), with both bird species
Statistical test method: Unpaired two-sample Wilcoxon test
Sample size: Blue tits: Apy first = 11, Apy last = 10, Am first = 11, Am last = 11. Great tits: Apy first = 20, Apy last = 9, Am first = 10, Am last = 10
Main results and conclusions: (1) Great tits show a non-significant tendency to hesitate more towards Apy after experience with Am, but blue tits do not. (2) Experience with Apy decreased hesitation towards Am, although the effect is only significant in great tits → the compared species affect each other’s survival, suggesting that birds confuse the putatively mimetic moth species with each other

Relevant figures and tables to each experiment are referred to in the table. The arrows are used to indicate conclusions made based on the main results. Species names are abbreviated: Ap = Arctia plantaginis, Apy = yellow morph, Apw = white morph are used for the putative mimics, and Am = Arichanna melanaria, Pm = Pseudopanthera macularia, Rh = Rheumaptera hastata and Lm = Lomaspilis marginata for the putative models.
controls  (1) and (3) same as above (2) Are the putative models less palatable than a putatively palatable moth (Autographa gamma) or an unpalatable moth (Zygaena sp.)?  Figure 2, Figure 4B, Table 2B Proportions eaten of (a) putative models vs. Ap morph (b) putative models vs. controls (c) white vs. yellow Ap  GLMM with a beta distribution and a logit link function  Figure 2, Assay 2 [number of of moth paste samples not spilled (used in Figure 4B)/ number of of all moth samples (used in analysis])  (1) No significant difference Am vs. Apy and Pm vs. Apy; Rh and Lm are eaten more than Apw (2) Putative models are eaten less than Autographa gamma but more than Zygaena (3) No significant difference. between Apy and Apw → Pm and Am are putative Müllerian models; Rh and Lm are putative quasi-Batesian mimics  Assay 3 palatability with visual cues: proportions eaten  Figure 3 Data from all trials (see Table S4 for a version with data from first trials only)  (1) Are the putative models Am, Pm, Rh & Lm less palatable than the putative mimics Apy/Apw? (2) Does the presence of visual cues change moth acceptability to birds? (3) Is Apw less palatable than Apy?  Figure 2, Figure 4C, Table 3C Proportions eaten of (a) putative models vs. Ap morph (b) white vs. yellow Ap  GLMM with a beta distribution and a logit link function  Figure 2, Assay 3 Great tits: Pm = 20 Am = 20 Apy = 39 Apw = 18 Rh = 10 Lm = 8  Proportion eaten: (1) Pm eaten more than Apy, other differences non-significant (2) All moths eaten more than without visual cues → Once attacked, most species eaten at similar levels (different from without visual cues, indicating that birds can handle their prey)  Assay 3 palatability with visual cues: beak cleaning  Figure 3Data from all trials  (1) Do the putative models Am, Pm, Rh & Lm induce more disgust behaviour than the putative mimics Apy/Apw? (2) Is there a difference in bird disgust behaviour between Apw and Apy?  
Figure 5, Table 4 Amount of beak cleaning of (a) putative models vs. Ap morph (b) white vs. yellow Ap  GLMM with a negative binomial distribution and a logit link function  Figure 2, Assay 3 Great tits: Pm = 20 Am = 20 Apy = 39 Apw = 18 Rh = 10 Lm = 8  Beak cleaning (BC): (1) Less BC to Pm than Apy, no diff. Am vs. Apy, less BC to Rh and Lm than Apw → Pm, Rh and Lm seem to be less defended than Ap (2) Less BC to Apy than Apw → difference between morph defences  Assay 3 palatability with visual cues: learning in Supporting Information  Figure 3 4 trials with species X as a ‘model’ (last trial not used)  Palatability in sequential presentations – are there signs of increased or decreased avoidance?  Table S1, Figure S1 The effects of (a) moth species, (b) bird species and (c) their interaction on changes in survival (cumulative risk of being attacked during a trial) during four sequential presentations  Cox proportional hazards model for survival  Figure S1 Blue tits: Apy = 11 Am = 11 Great tits: Apy = 20 Am = 10  The proportional risk of being attacked by blue and great tits changes differently depending on moth species → great tits showed increasing avoidance towards Am but blue tits did not  Image analysis visual cues only  pictures of all species → colour and pattern value extraction using blue tit vision model → model to assess discriminability  (1) How easy it is for the predators to tell apart the putative models and Apy/ Apw? (2) Could they be using hindwing colour only as a common cue?  Modelled discriminability of a putative model vs. a putatively mimetic Ap morph and the other Ap morph based on (a) overall appearance (colour and texture) (b) hindwing colour (Figure 6)  Logistic regression model → AUC (a measure from signal detection theory)  Apy vs. Am = 14 Pm = 12 Rh = 9 Lm = 16 Apw vs. Am = 14 Pm = 12 Rh = 9 Lm = 17  (1) All species distinguishable based on overall appearance (2) Am vs. Apy and Rh vs. 
Apw are less easy to discriminate based on hindwing colour → hindwing colour could be used as a common cue  Assay 3 mimicry  Figure 3 1st trial of species X as a ‘model’ vs. 5th trial of species X as a ‘mimic’ (from another group of birds)  (1) Does experience with the putative model increase hesitation towards the putative mimic? (2) Does experience with the putative mimic decrease hesitation towards the putative model?  Figure 7 Attack latency without recent experience (1st trial) with a putative model vs. after recent experience with a putative model (5th trial) (a) towards a putative mimic (Apy) (b) towards a putative model (Am) with both bird species  unpaired two-sample Wilcoxon test  Blue tits: Apy first = 11 Apy last = 10 Am first = 11 Am last = 11 Great tits: Apy first = 20 Apy last = 9 Am first = 10 Am last = 10  (1) Great tits show a non-significant tendency to hesitate more towards Apy after experience with Am, but blue tits do not (2) Experience with Apy decreased hesitation towards Am, although the effect is only significant in great tits → the compared species affect each other’s survival, suggesting that birds confuse the putatively mimetic moth species with each other  Relevant figures and tables to each experiment are referred to in the table. The arrows are used to indicate conclusions made based on the main results. Species names are abbreviated: Ap = Arctia plantaginis, Apy = yellow morph, Apw = white morph are used for the putative mimics, and Am = Arichanna melanaria, Pm = Pseudopanthera macularia, Rh = Rheumaptera hastata and Lm = Lomaspilis marginata for the putative models. View Large In the first assay, we used dead T. molitor larvae mixed with 397 µL/g of either water (positive control, palatable) or a 10% quinine solution (negative control, unpalatable). Mealworms were killed by freezing them and subsequently crushed in a mortar. 
Mealworms were not freeze-dried because their high body fat content gave the dried samples a yellowish colour, unlike the moth samples. Instead, we used non-dried samples, whose natural brown colour was similar to that of the moth samples. On average 40 mg (14–70 mg) of each species/control was taken for the first trial. The uneaten proportion was then reused in a subsequent second trial, supplemented with fresh samples to reach a starting weight between 14.7 and 61.7 mg. Numbered cups (lids of 2-mL Eppendorf tubes) and small pieces of parafilm (to seal the cup) were weighed to the nearest 0.0001 g, and a well-mixed paste of each species and controls was added to the cups. Each cup was then placed in a randomized position in an eight-spot platform made of a Styrofoam bottom and a plastic carpet with 10-mm holes that kept the sealed cups in place during bird assays (Fig. 1). A parafilm seal was used before the beginning of each trial to avoid both water evaporation and the spread of any potential odours of the different species before the bird had the chance to taste them. Birds were pre-trained to open the parafilm-sealed cups using an edible paste made of shell-less sunflower seeds crushed with a mortar and coloured brown (1 mL of yellow/red/blue mixture of Dr. Oetker food dyes added to 6.359 mg of seeds). Every bird had to complete five pre-training steps before starting the trial, to ensure that it had mastered the technique of opening the parafilm seals and to motivate it to open all eight cups in search of food during the experiment. In the first pre-training step, the birds were given sunflower seeds from open cups. Once a bird had consumed all seeds, it was offered the pre-training paste in open cups. After the birds had consumed the paste and emptied all cups, they were given the next round with sealed cups, but with pre-made holes in the parafilm. Finally, the birds were presented with two rounds of platforms with sealed cups with no holes in them.
The latter round was given immediately before the experiment, and the experiment was started 1 h after the bird had finished eating the last pre-training round. The second assay followed the same protocol, except that Autographa gamma was used as the positive and Zygaena sp. as the negative control, instead of mealworms (Table 2). To ensure that we had enough moth and moth control samples for 15 birds in the second experiment, the sample weight was reduced to 20.5–29.6 mg (on average 24.7 mg), and only one trial was run per bird.

Figure 1. Schematic diagram of the platform and a cup made of an Eppendorf lid containing moth paste used in Assays 1 and 2 without visual cues. Each cup contained paste made of one moth species or the palatable (C+) or unpalatable (C-) control positioned in a randomized order on the platform. See text for a description of the positive (C+) and negative (C-) controls used in each assay.

In both assays, the opened cups were weighed again immediately after the trial ended, and the proportion of paste eaten was calculated as the difference in cup weights before and after the trial, divided by the cup weight before the trial. To ensure accurate measures of the proportions eaten, we recorded cases where the bird spilled cup contents out of the cup during the trials to be taken into account in the analysis. In the second experiment, we also recorded the order in which the cups were opened, to account for potential effects of bird saturation or learning during the trial.
Furthermore, we measured the time taken by the bird from opening the cup until the end of the trial. This was done to account for water evaporation, which was assumed to reduce sample weight in an approximately linear manner after opening the cup lid. Each trial ended 2 min after the bird had opened the last cup to provide time for the bird to finish eating all the cup contents at will. If the trial continued for longer than 1 h, seed crumbs were added on top of all unopened cups, to prevent bird starvation and to motivate the bird to continue the trial.

To test the relative palatability of the potential model species, we built generalized linear mixed models (GLMMs) with proportion eaten as the dependent variable, modelled with a beta distribution and a logit link function. Proportion eaten varies between 0 and 1, and is thus best fit with a beta distribution, which allows for heteroskedasticity and asymmetry of the dependent variable distribution. Proportions eaten recorded as exactly 0 or 1 due to measurement inaccuracy were set to 0.001 and 0.999, respectively, to match the model assumptions. The models were fitted using the R package glmmADMB (v.0.8.3.3; Fournier et al., 2012; Skaug et al., 2013) to account for random structures in the experimental setup. In the first assay with two trials (Assay 1), Bird ID was added as a random factor, and in the second assay (Assay 2), which consisted of only one trial, we added cup position on the tray as a random variable to account for any spatial bias in bird feeding behaviour caused by the experimental setup. In both models (Assays 1 and 2), prey species (including the palatable and unpalatable controls) was used as the explanatory variable, and whether cup contents were spilled on the floor was included as a fixed factor to account for cases where some of the weight loss may not have been eaten.
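The proportion-eaten calculation and the boundary adjustment described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names and the example weights are hypothetical, and the weights are assumed to be paste weights with the empty-cup tare already subtracted. Corrections for spillage and evaporation, which the models handle through covariates, are left out.

```python
import math

LOWER, UPPER = 0.001, 0.999  # boundary values stated in the text


def proportion_eaten(weight_before_mg, weight_after_mg):
    """Weight lost during the trial divided by the starting weight.

    Assumption: both weights are paste weights (empty-cup tare subtracted).
    """
    if weight_before_mg <= 0:
        raise ValueError("starting weight must be positive")
    return (weight_before_mg - weight_after_mg) / weight_before_mg


def adjust_for_beta(p):
    """Move boundary values into the open interval (0, 1) needed by a beta model."""
    if p <= 0.0:
        return LOWER
    if p >= 1.0:
        return UPPER
    return p


def logit(p):
    """Logit link: maps a proportion in (0, 1) onto the real line."""
    return math.log(p / (1.0 - p))


# Hypothetical (before, after) paste weights in mg, for illustration only.
trials = [(40.0, 40.0), (40.0, 10.0), (24.7, 0.0)]
adjusted = [adjust_for_beta(proportion_eaten(b, a)) for b, a in trials]
links = [logit(p) for p in adjusted]
```

The adjustment matters because the logit of exactly 0 or 1 is undefined, so untouched or fully emptied cups could not otherwise enter a beta regression.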
In addition, opening order and the time that each cup was open within the trial were used as covariates in the analysis of the second experiment (Assay 2). Planned contrasts were used to test for all relevant differences in palatability of the putative models against A. plantaginis morphs and the controls, and between A. plantaginis morphs, to avoid multiple testing. This was done using a design matrix, where the average amount eaten was set as the intercept and contrasts were set according to the study questions (see Fig. 2, Table 2).

Figure 2. Planned contrasts used in statistical analyses to compare bird responses between (A) Pseudopanthera macularia and yellow Arctia plantaginis, (B) Arichanna melanaria and yellow A. plantaginis, (C) Rheumaptera hastata and white A. plantaginis, (D) Lomaspilis marginata and white A. plantaginis, (E) putative models and the unpalatable control, (F) putative models and the palatable control, and (G) A. plantaginis morphs. The numbers given in each column under the moth species represent the number of samples not spilled on the floor/all samples tested in Assay 1 (first row), the number of samples not spilled on the floor/all samples tested in Assay 2 (second row) and the number of moths attacked during a trial/number of moths tested during Assay 3 (third row). All samples were used in analysing the proportions eaten in Assays 1 and 2, whereas in Assay 3 only the samples attacked and thus tasted were included in palatability analyses (proportions eaten, beak cleaning). See Table 2 for a comprehensive presentation of sample sizes in each analysis.

Image analysis

While coloration is commonly measured using spectrometry, analysing images makes it possible to consider the whole animal and its patterns instead of point measures only (Stevens et al., 2007). Here image analysis was used to assess the discriminability of the wood tiger moth morphs (yellow and white) from their putative models (yellow: Arichanna melanaria, P. macularia; white: R. hastata, L. marginata) based on (a) their overall appearance (colour and pattern on both wings) and (b) hindwing colour only, modelled with an avian visual system (Table 2). A total of 94 dry moth samples, together with a scale and a 93% white and 7% grey calibration standard, were photographed with a Samsung NX1000 camera customized to full spectrum length with a Nikon EL-80mm lens. Images were taken in raw format with a fixed aperture setting, using filters for UV (Baader U 300–400 nm) and visible light (Baader UV/IR cut filter 400–700 nm) imaging. The grey standards were essential for the calibration to ensure an accurate representation of the moths’ natural colours.
Calibration was done using the Image Calibration and Analysis Toolbox (Troscianko & Stevens, 2015) in ImageJ (Schneider, Rasband & Eliceiri, 2012), where raw pictures of each moth were aligned manually, normalized and assembled into a multispectral image stack, converted to 32 bits per channel and saved as .tif for further analysis. The area of interest containing the moth in these calibrated and normalized pictures (including also information from the UV-channels when saved as 32-bit per channel) was then specified using a custom-made Matlab program. The RGB values of each pixel within the area of interest (either both of the left side wings or the hindwing only) were then transformed to relative photon catches of the blue tit’s (Cyanistes caeruleus) cones under a D65 standard daylight illuminant (Hart, 2001), to describe the moth wings as seen by their avian predators (similar to Stevens & Cuthill, 2006). For the extraction of wing pattern values, 8000 pixels were extracted from each area of interest and those pixels were convolved with log-Gabor (Gabor, 1946) filters of six orientations (0 to 150° in 30° increments) and four spatial frequencies (the pattern analysis is similar to those of Xiao & Cuthill, 2016; Michalis et al., 2017). Log-Gabor filters can quantify a pattern by detecting changes in luminance at specified spatial scales and orientations. Using the extracted values, we compared the putative models (Arichanna melanaria, P. macularia, R. hastata, L. marginata) against each of the A. plantaginis morphs, first based on pattern and colour of both wings (left fore- and hindwing), and then based on hindwing colour only (i.e. the warning signal). To achieve this we used Signal Detection Theory measures, which were calculated based on logistic regression analyses. 
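To make the filter bank concrete, the sketch below evaluates a log-Gabor filter in the frequency domain using the common textbook form: a Gaussian on the logarithm of radial frequency multiplied by a Gaussian on orientation. Only the six orientations come from the text. The four centre frequencies, both bandwidth parameters and all names are hypothetical placeholders, and this is not the authors' Matlab implementation.

```python
import math

# Orientations stated in the text: 0 to 150 degrees in 30-degree increments.
ORIENTATIONS_DEG = list(range(0, 180, 30))
# Hypothetical centre frequencies (cycles/pixel); the text gives only their number.
CENTRE_FREQS = [0.05, 0.1, 0.2, 0.4]


def log_gabor(f, theta, f0, theta0, sigma_ratio=0.65, sigma_theta=math.radians(15)):
    """Frequency-domain log-Gabor response at radial frequency f and angle theta.

    sigma_ratio and sigma_theta are assumed bandwidth settings, not values
    reported in the study.
    """
    if f <= 0:
        return 0.0  # a log-Gabor filter has no DC component
    radial = math.exp(-(math.log(f / f0) ** 2) / (2 * math.log(sigma_ratio) ** 2))
    # Wrap the angular difference into (-pi, pi] before applying the Gaussian.
    dtheta = math.atan2(math.sin(theta - theta0), math.cos(theta - theta0))
    angular = math.exp(-(dtheta ** 2) / (2 * sigma_theta ** 2))
    return radial * angular


# One (centre frequency, orientation) pair per filter: 4 x 6 = 24 filters.
bank = [(f0, math.radians(deg)) for f0 in CENTRE_FREQS for deg in ORIENTATIONS_DEG]
```

Each filter responds most strongly to luminance changes at its own scale and orientation, which is why a bank of them can summarize a wing pattern.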
Comparisons were done via a logistic regression model, which evaluated how easy it is to determine if a random pixel is part of species A or species B in terms of colour and pattern using the R package lme4 (v.1.1.13; Bates et al., 2015). To avoid overfitting, we used a method called ‘leave-one-out cross-validation’ (Lantz, 2013). This method fits (or ‘trains’) the model created by the logistic regression analysis on all of the individuals of the two species to be compared, except one individual of species A and one individual of species B. Next, we tested (validated) this model on those individuals of species A and B that were left out. In each comparison the individuals of one of the putative models were compared with the individuals of one of the A. plantaginis morphs. This process was repeated so that each individual of the two species was left out once (Lantz, 2013). To determine how easily two species are discriminated from each other, the widely used statistical methods based on null hypothesis testing (statistical significance) cannot be used. The question is how different the two species are from each other rather than whether the two species are different. Thus, we employed several measures used in Signal Detection Theory to determine how well or badly classification works for each species (Wickens, 2002; the analysis is similar to that of Michalis, 2017). 
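The cross-validation loop described above, together with a rank-based AUC of the kind used to score held-out predictions, might be sketched as follows. This is an illustration under stated assumptions rather than the authors' R code: individuals are paired by list position (the text says only that one individual per species is left out each time), and the identifiers and scores are hypothetical.

```python
def leave_one_pair_out(individuals_a, individuals_b):
    """Yield (train_a, train_b, held_a, held_b), holding out one individual per species.

    Assumption: the i-th individual of A is held out together with the i-th of B.
    """
    if len(individuals_a) != len(individuals_b):
        raise ValueError("expects equal numbers of individuals per species")
    for i in range(len(individuals_a)):
        yield (
            individuals_a[:i] + individuals_a[i + 1:],
            individuals_b[:i] + individuals_b[i + 1:],
            individuals_a[i],
            individuals_b[i],
        )


def auc(scores_a, scores_b):
    """Probability that a random A score exceeds a random B score (ties count 0.5)."""
    wins = 0.0
    for a in scores_a:
        for b in scores_b:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(scores_a) * len(scores_b))


# Hypothetical individual identifiers and held-out classifier scores.
folds = list(leave_one_pair_out(["A1", "A2", "A3"], ["B1", "B2", "B3"]))
example_auc = auc([0.9, 0.8, 0.4], [0.3, 0.5, 0.2])
```

Because every individual is scored by a model that never saw it, the resulting AUC reflects how well the two species can be told apart in general rather than how well the model memorized the training images.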
These measures were as follows: ‘sensitivity’, which is the proportion of individuals of species A correctly classified; ‘specificity’, which is the proportion of individuals of species B correctly classified; ‘precision’, which is the proportion of pixels classified as species A that were actually individuals of species A; ‘prevalence’, which is the proportion of all the pixels of species A (0.5 in our case as the same number of individuals from each species was used); the ‘Area Under the Curve’ (AUC), which is the probability that a randomly selected individual of species A can be differentiated from a randomly selected individual of species B; and, finally, the ‘Receiver Operating Characteristic’ (ROC) curve, a plot of sensitivity against specificity (Robin et al., 2011). The AUC values range from 0.5 (performance is no better than a random allocation of individuals as species A or B) to 1 (all individuals of species A can be perfectly discriminated from those of species B), and were interpreted in accordance with a standard scale where values of 0.9–1 imply excellent discrimination, 0.8–0.9 imply good discrimination, 0.7–0.8 imply fair discrimination, 0.6–0.7 imply poor discrimination and 0.5–0.6 imply that the putative co-mimics are indistinguishable from each other, i.e. the model is no better than chance (Lantz, 2013).

Experiment with visual cues

Visual cues can be associated with prey unpalatability in subsequent encounters with individuals of either the same species or a different species with similar appearance. Based on this, we ran a third assay, this time with freeze-killed moths, to test bird responses to taste and visual cues combined (Table 2).
To measure the consistency of bird behaviour, and determine whether they would learn to associate the moth’s appearance with its taste, we offered four individuals of the potential model species in four subsequent trials to each bird (see Supporting Information for methods and results concerning changes in bird behaviour during the four trials). To further test if recent experience with a potential model would affect a bird’s reactions towards another species, the four trials were followed by a fifth trial with the potential mimic (Fig. 3). As the Japanese populations of Arichanna melanaria were known to possess an intermediate level of defence (Nishida, 1994), we wanted to test its palatability in relation to yellow A. plantaginis in more detail, using two different relevant predators, great and blue tits. Tests for differences between bird species are reported in the Supporting Information. The other species combinations were offered to great tits only, as in the assays without visual cues.

Figure 3. Examples of sequences of trials in Assay 3 with visual cues: to a bird that received Arichanna melanaria as a putative model and yellow Arctia plantaginis as a mimic (above) and to a bird that received yellow A. plantaginis as a putative model and Arichanna melanaria as a mimic (below). Data from the first and fifth trials (in colour) were used to compare attack latencies with and without experience with a putative model (Fig. 7).
Birds were pre-trained to fetch food from a Petri dish attached to a green cardboard platform, to provide a controlled green background for all moths. Each bird was given three sunflower seeds on the Petri dish, and once it had consumed the seeds, it was starved for 1 h. The bird was then offered a small live mealworm and, if it attacked the mealworm within 2 min, it was considered ready to begin the experiment. Pieces of freshly killed mealworm (one piece ~40 mg, one-third the weight of a final instar larva) were given before, after and between the moth trials to control for bird motivation to forage on insect prey, and rule out satiation where the bird refused to attack the moth. If the bird did not attack the mealworm within 2 min, it was given a 10-min break and tested again. If the bird did not attack a moth and the subsequent mealworm, the moth trial was dismissed and the moth was offered again after the 10-min break. However, if the bird did not attack the moth but did attack the mealworm, the experiment was continued to the next trial. Mealworm trials were continued for 3 min from when the bird first saw the prey. Frozen moths were thawed in airtight boxes lined with moist paper at +3 °C to keep them fresh and easy to manipulate. Before the experiment, the flight muscles of each thawed moth were broken with tweezers to enable manipulating their posture and the moth was then placed on a Petri dish in a natural resting position, with wing pattern and colours of the dorsal side visible. Moths were presented to birds for 5 min (starting from when the bird first saw the prey) in each trial. This was done to allow the birds plenty of time to decide whether to attack the moth, and to show potential disgust behaviours (such as beak cleaning) after tasting the moth. 
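The motivation-control rules above amount to a small decision table, sketched here in Python. The names are hypothetical, and the case in which a bird attacks the moth but then refuses the mealworm is not spelled out in the text, so its handling (keep the trial, rest the bird) is an assumption based on the general mealworm rule.

```python
from typing import NamedTuple

BREAK_MINUTES = 10  # break length stated in the text


class NextStep(NamedTuple):
    keep_trial: bool    # does the moth trial count as data?
    break_minutes: int  # rest before the next presentation


def after_moth_trial(attacked_moth: bool, attacked_mealworm: bool) -> NextStep:
    """Decide what follows a moth trial and its subsequent mealworm check."""
    if not attacked_moth and not attacked_mealworm:
        # Unmotivated bird: dismiss the trial, re-offer the moth after a break.
        return NextStep(keep_trial=False, break_minutes=BREAK_MINUTES)
    if not attacked_mealworm:
        # Assumption: the moth trial stands, but the bird rests and is re-tested.
        return NextStep(keep_trial=True, break_minutes=BREAK_MINUTES)
    # Mealworm attacked: the bird was motivated, so a moth refusal is informative.
    return NextStep(keep_trial=True, break_minutes=0)
```

The mealworm check is what separates avoidance of the moth from simple satiation: a refusal is kept as data only when the bird demonstrably still wants insect prey.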
In each trial, we recorded whether the bird attacked the moth (an attack was a peck or grabbing the moth from the Petri dish), the latency to attack, the proportion of the moth eaten at the end of the trial (%) and the number of times the bird cleaned its beak (a known sign of disgust; Evans & Waldbauer, 1982; Rowland et al., 2015) within 1 min of the attack. To test for palatability in the assay with visual cues (Assay 3), we used data from all trials in which each bird attacked and thus tasted the moth (see Supporting Information for another approach using data from first encounters only). Similar to the assays without visual cues, the proportion of the attacked moths eaten was used as the dependent variable, modelled with a beta distribution and logit link function using the R package glmmADMB (v 0.8.3.3). Values of 0 and 1 were again modified to 0.001 and 0.999, respectively, to match model assumptions. Moth species was used as the explanatory variable and Bird ID as a random factor to account for repeated measures in the four subsequent trials. Furthermore, the frequency with which birds cleaned their beak during 1 min after attack (number of beak wipes) was included as the response variable modelled with a negative binomial distribution and a logit link function in a GLM using the R package lme4 (v.1.1.13), where prey species was included as the explanatory variable. Planned contrasts were again used to compare the palatability of each putative model and its mimic (Fig. 2A–D), and between the A. plantaginis morphs (Fig. 2G). Finally, we tested whether the birds’ behaviour towards the A. plantaginis morphs was affected by recent experience with a potential model species (here only Arichanna melanaria compared with the yellow morph, see results below). This was done by comparing attack latencies of birds that received A. plantaginis in the first trial to the attack latencies of birds that received A. 
plantaginis after having four trials with the potential model Arichanna melanaria with an unpaired two-sample Wilcoxon test (Table 2). To further test whether experience with A. plantaginis would affect bird reactions towards the putative co-mimic Arichanna melanaria, we compared attack latencies of birds that received Arichanna melanaria in the first trial to the attack latencies of birds that received Arichanna melanaria after having four trials with A. plantaginis (Fig. 3) with unpaired two-sample Wilcoxon tests. All analyses were done in RStudio (RStudio, 2015) with R v.3.3.3 (R Core Team, 2013).

Animal welfare

All the assays were carried out in custom-built plywood aviaries equipped with a perch, fresh water available ad libitum and lit with a daylight lamp (Exo Terra Repti Glo 10.0 UVB). Great tits were observed through a one-way plexiglass wall in front of a 50 × 50 × 70 cm cage whereas the blue tits, which are more sensitive to the observer’s presence, were observed through a small mesh-covered window at the side of a 50 × 45 × 65 cm cage. The cages for both species were placed in a dark room to avoid any disturbance caused by the observer. All experiments were filmed to record the observed behaviours in more detail. Birds were wild-caught at Konnevesi research station using peanut-filled feeding traps, and at Jyväskylä using traps and mist-netting at winter-feeding sites. Each bird was weighed upon capture and kept singly in a plywood cage with food and water available ad libitum and a 12:12-h light/dark cycle. After the experiment each bird was sexed, weighed, ringed and released at the original capture site. Permits for the capture and use of birds in experiments were granted by the Central Finland Centre for Economic Development, Transport and Environment and licensed from the National Animal Experiment Board (ESAVI/9114/04.10.07/2014) and the Central Finland Regional Environment Centre (VARELY/294/2015).
All procedures complied with the Association for the Study of Animal Behaviour’s Guidelines for the treatment of animals in behavioural research and teaching. The experiments were conducted at Konnevesi research station in Central Finland in February–March 2016, October–November 2016 and February–March 2017.

RESULTS

Palatability with and without visual cues

All yellow and white putative models were found to be less palatable than the positive controls (mealworm and Autographa gamma), but more palatable than Zygaena sp. when offered without visual cues (Table 3, Fig. 4A, B). When provided with visual cues birds ate higher proportions of the attacked moths (Fig. 4C). The proportions eaten of the yellow putative models did not differ significantly from the proportions eaten of the yellow A. plantaginis, with one exception: P. macularia was eaten significantly more than yellow A. plantaginis by great tits in the assay with visual cues (Table 3C, Fig. 4C). The yellow A. plantaginis also elicited significantly more beak cleaning behaviour than P. macularia (Table 4, Fig. 5).

Table 3. GLMM estimates for palatability in three assays: (A) with mealworm controls, (B) with moth controls and (C) with visual cues

(A) Assay 1 with mealworm controls
Random effects  Variance  SD
Bird ID  0.3699  0.6082
Fixed effects  Estimate  SE  Z-value  P-value
(Intercept): [mean proportion eaten]  −0.918  0.155  −5.93  < 0.001*
P. macularia vs. yellow A. plantaginis  0.315  0.241  1.31  0.1915
Arichanna melanaria vs. yellow A. plantaginis  0.395  0.240  1.64  0.1001
R. hastata vs. white A. plantaginis  0.427  0.242  1.77  0.0775
L. marginata vs. white A. plantaginis  0.613  0.240  2.55  0.0108*
C- vs. putative models  0.353  0.194  1.82  0.0684
C+ vs. putative models  1.522  0.193  7.88  < 0.001*
yellow vs. white A. plantaginis  −0.017  0.242  −0.07  0.9426
spilled on floor  0.462  0.160  2.88  0.0039*

(B) Assay 2 with moth controls
Random effects  Variance  SD
Bird ID  0.627  0.792
Cup position  0.000000455  0.00067
Fixed effects  Estimate  SE  Z-value  P-value
(Intercept): [mean proportion eaten]  −0.405  0.343  −1.18  0.23745
P. macularia vs. yellow A. plantaginis  0.423  0.340  1.24  0.21315
Arichanna melanaria vs. yellow A. plantaginis  0.327  0.342  0.96  0.33875
R. hastata vs. white A. plantaginis  0.903  0.357  2.53  0.01153*
L. marginata vs. white A. plantaginis  0.651  0.361  1.80  0.07144
Zygaena vs. putative models  −1.08  0.293  −3.69  0.00022*
Autographa gamma vs. putative models  0.673  0.293  2.30  0.02164*
yellow vs. white A. plantaginis  0.333  0.344  0.97  0.33309
time open  < 0.001  < 0.001  0.88  0.37941
eating order  −0.162  0.046  −3.51  0.00044*
spilled on floor  0.801  0.284  2.82  0.00480*

(C) Assay 3 with visual cues
Random effects  Variance  SD
Bird ID  0.851  0.923
Fixed effects  Estimate  SE  Z-value  P-value
(Intercept): [mean proportion eaten]  0.597  0.161  3.70  < 0.001*
P. macularia vs. yellow A. plantaginis  0.665  0.275  2.41  0.0158*
Arichanna melanaria vs. yellow A. plantaginis  −0.146  0.297  −0.49  0.622
R. hastata vs. white A. plantaginis  −0.203  0.419  −0.48  0.628
L. marginata vs. white A. plantaginis  0.287  0.446  0.64  0.520
yellow vs. white A. plantaginis  0.326  0.364  0.89  0.371

Statistical significance of the planned contrasts at P < 0.05 is marked with an asterisk (*). The planned contrasts are shown in Figure 3 for Assays 1 and 2. In Assay 3 no controls were used, and thus contrasts were only made between putative models and their mimics and between A. plantaginis morphs.

Table 4. GLM estimates of the amounts of beak cleaning for the planned comparisons between putative models and A. plantaginis morphs in Assay 3; the amounts of beak cleaning are shown in Figure 5

Fixed effects  Estimate  SE  Z-value  P-value
(Intercept): [mean amount of BC]  1.037  0.129  8.045  < 0.001*
P. macularia vs. yellow A. plantaginis  −0.956  0.211  −4.534  < 0.001*
Arichanna melanaria vs. yellow A. plantaginis  −0.303  0.215  −1.407  0.1596
R. hastata vs. white A. plantaginis  −1.429  0.306  −4.667  < 0.001*
L. marginata vs. white A. plantaginis  −1.509  0.319  −4.725  < 0.001*
yellow vs. white A. plantaginis  −0.538  0.263  −2.047  0.0406*

Statistical significance of the planned contrasts at P < 0.05 is marked with an asterisk (*).

Figure 4. Proportions of the potentially aposematic moth species eaten by great tits in the absence of visual cues (A) compared to mealworm controls (Assay 1) and (B) compared to moth controls (Assay 2). (C) The proportions of moths eaten when visual cues were presented (Assay 3). Only samples not spilled outside of the cups and those birds that did attack the moths offered are considered (see Fig. 3 for sample sizes). The statistical significance of each planned contrast between the moth species is given in Table 3. The box plots represent the middle 50% of the original data: the upper (lower) limit of the box is the third (first) quartile and the thick line is the median. Whiskers extend up to 1.5 times the interquartile range from the top (bottom) of the box to the furthest data point within that distance. ‘Outlier’ data points beyond the whiskers are illustrated as open circles.

Figure 5. Beak cleaning induced by different moth species over 1 min, starting from when the bird (great tit) first pecked or grabbed the prey in its beak in Assay 3. The statistical significance of planned contrasts between the putative models and Arctia plantaginis morphs for the amount of beak cleaning at the P < 0.05 level is marked with an asterisk (*) and model estimates are given in Table 4. The box plots represent the middle 50% of the original data: the upper (lower) limit of the box is the third (first) quartile and the thick line is the median. Whiskers extend up to 1.5 times the interquartile range from the top (bottom) of the box to the furthest data point within that distance. ‘Outlier’ data points beyond the whiskers are illustrated as open circles.

Without visual cues, L. marginata was eaten significantly more and R. hastata marginally significantly more than white A. plantaginis in the first assay (Table 3A), and R. hastata was eaten significantly more and L. marginata marginally significantly more than white A. plantaginis in the second assay (Table 3B), indicating that both species were relatively more palatable than their putative mimic. Correspondingly, with visual cues, white A. plantaginis elicited significantly more beak cleaning than its putative mimics (Table 4, Fig. 5). Despite apparent differences in proportions eaten of the putative white model species and white A. plantaginis morphs with visual cues (Fig. 4C), there was no significant difference in proportions eaten of R. hastata and the white A. plantaginis (Table 3C), perhaps due to relatively small sample sizes and thus low power of the test (Table 2, Fig. 3). Great tits, however, ate all R.
hastata offered with visual cues after very short attack latencies (Supporting Information, Figs S1, S2). White A. plantaginis were eaten significantly less when tasted for the first time (Table S3) and elicited significantly more beak cleaning (Table 4) than the yellow A. plantaginis, although no significant differences between morphs were found in the proportions eaten without visual cues or taking into account all trials with visual cues (Table 3). Changes in bird responses during the four subsequent trials with visual cues are examined in more detail in the Supporting Information.

Discriminability of the putative models from the wood tiger moth morphs

The putative models are easily distinguished from A. plantaginis when considering both colour and pattern of the whole moth (AUC values for all comparisons > 0.93). However, when considering only the hindwing warning colour, the putatively mimetic pairs Arichanna melanaria (Am) – yellow A. plantaginis (Apy) and R. hastata (Rh) – white A. plantaginis (Apw) become more difficult to discriminate (Am-Apy: AUC = 0.79, Rh-Apw: AUC = 0.67; Fig. 6), whereas L. marginata (Lm) and P. macularia (Pm) are easily distinguished from both A. plantaginis morphs based on hindwing colour too (all comparisons AUC = 1; Fig. 6).

Figure 6.
Discrimination of putative models from both Arctia plantaginis morphs (Apy and Apw) based on hindwing colour according to a bird visual system. The putative models are (A) Arichanna melanaria (Am), (B) Pseudopanthera macularia (Pm), (C) Rheumaptera hastata (Rh) and (D) Lomaspilis marginata (Lm). Birds can discriminate P. macularia and L. marginata from both A. plantaginis morphs, but the discrimination is more difficult between Arichanna melanaria and the yellow A. plantaginis and between R. hastata and the white A. plantaginis.

Bird responses towards the most likely model after experience with the other

As all putative models except Arichanna melanaria were found to be more palatable than A. plantaginis, and L. marginata and P. macularia were found to be easily distinguished from their putative co-mimics based on the image analysis, they are unlikely to act as Müllerian models for the wood tiger moth morphs. We therefore tested changes in bird responses towards Arichanna melanaria and the yellow wood tiger moth morph only. Recent experience with the putative model Arichanna melanaria did not significantly change the attack latency of blue or great tits towards the yellow A. plantaginis, although great tits did on average hesitate longer before attacking the yellow A. plantaginis after having experience with Arichanna melanaria (unpaired two-sample Wilcoxon tests, W = 76, P = 0.145 for blue tits, and W = 54.5, P = 0.097 for great tits; Fig. 7A, B). Experience with yellow A. plantaginis, however, significantly decreased the great tits’ attack latencies towards Arichanna melanaria (W = 80, P = 0.026; Fig. 7D), and the trend was similar but not significant in the blue tits (W = 78.5, P = 0.245; Fig. 7C).

Figure 7.
Attack latencies of (A) blue and (B) great tits towards the yellow Arctia plantaginis before and after experience with Arichanna melanaria (Am), and attack latencies of (C) blue and (D) great tits towards Arichanna melanaria after experience with yellow A. plantaginis (Apy). Statistically significant differences at P < 0.05 are marked with an asterisk (Wilcoxon test). The box plots represent the middle 50% of the original data: the upper (lower) limit of the box is the third (first) quartile and the thick line is the median. Whiskers extend up to 1.5 times the interquartile range from the top (bottom) of the box to the furthest data point within that distance. ‘Outlier’ data points beyond the whiskers are illustrated as open circles. Sample size is illustrated with orange dots on top of the box plots using R package beeswarm (v.0.2.3; Eklund, 2016).

DISCUSSION

The existence and maintenance of warning colour polymorphisms have raised interest among evolutionary biologists for decades because of their paradoxical nature.
Here, we studied whether multiple-model mimicry could contribute to the maintenance of a warning signal polymorphism in the aposematic moth A. plantaginis. Our results provide partial support for this hypothesis, as Arichanna melanaria was found to be a potential Müllerian model for the yellow A. plantaginis, whereas R. hastata was a quasi-Batesian mimic rather than a Müllerian model for the white A. plantaginis. However, it is possible that a Müllerian model for the white morph exists among the species not considered in this study or that even less defended co-mimics benefit the white morph under some circumstances (see, for example, Rowland et al., 2007, 2010). Based on the relative palatability of the moths with and without visual cues, Arichanna melanaria is as (un)palatable to the birds as the yellow A. plantaginis. Although the image analysis with an avian vision model indicates that birds are able to distinguish the species based on the moths’ overall appearance, the model suggests that the hindwing colours of Arichanna melanaria and the yellow A. plantaginis, and of R. hastata and the white A. plantaginis, are relatively similar. Knowing that birds pay more attention to coloration than to pattern (Exnerová et al., 2006; Ham et al., 2006; Aronsson & Gamberale-Stille, 2012; Rönkä et al., 2018), and that moth hindwing coloration indeed is an important signal for birds (e.g. Nokelainen et al., 2012, 2014), we can safely assume that signal sharing between these species is possible. All other putative co-mimics tested here were found to be visually distinct and/or more palatable than the wood tiger moths, and could therefore not be Müllerian models for them. We discuss the implications of these findings below.

Palatable or unpalatable: it is all relative

Studies on mimicry have predominantly focused on the appearance of the mimetic complex, i.e. their shared visual signal (e.g. Benson, 1972; Mallet & Barton, 1989; Yeager et al., 2012).
However, chemical defences are also prone to variation (Nishida, 1994; Speed et al., 2012), both within and between species, adding another important dimension to mimetic relationships (Skelhorn & Rowe, 2006; Ihalainen et al., 2007; Stuckert et al., 2014; Arias et al., 2016a). Of all the species tested in this study, Autographa gamma was the most palatable, with birds eating on average about half of each sample of this species presented. In contrast, all the species tested as models/co-mimics exhibited a certain degree of unpalatability, although not to the extent of Zygaena sp., a species in a group well known for their chemical defences containing cyanide compounds (Davis & Nahrstedt, 1982). These results suggest that there is potential for mimetic relationships between A. plantaginis and the candidate mimics considered here, and that there is no such thing as a fully palatable or fully unpalatable prey species (Brower et al., 1968; Speed, 1999). Moth toxicity can vary in relation to sex, diet and drying of the specimen (Marsh & Rothschild, 1974). Here, we tested only male moths of Arichanna melanaria, which may bias the results towards higher palatability (Marsh & Rothschild, 1974). All specimens were either wild-caught or reared on natural food plants, except for P. macularia, which was reared on Lamium album, potentially affecting its relative palatability. We are not aware of how freezing or freeze-drying affects the palatability of these moths, but our observations of tit responses towards living A. plantaginis and P. macularia are largely similar to those observed towards the freeze-killed individuals (K. Rönkä, pers. observ.). However, it is important to note that the differences reported by Marsh & Rothschild (1974) are based on injections of moth extracts into mice, which are likely to confound reactions to the chemical defences with those to foreign insect protein, present especially in non-dried samples (Ley & Watt, 1989). 
Thus, injections trigger different responses than ingestion, which is the way in which predators come into contact with a prey animal’s chemical defences in the wild. In fact, it is known that predators can learn about the degree of toxicity of different prey based on prey unpalatability (Skelhorn & Rowe, 2010).

Taste and visual cues combined

Interestingly, in the third assay, in which birds were offered real moths, the proportion eaten of the attacked moths was considerably higher than when the same moth species were offered as a paste in the absence of visual cues. This could mean that the visual cues were essential for prey recognition (e.g. Veselý et al., 2013). Additionally, when moths were offered whole, birds could handle them and selectively consume only the most nutritious, palatable parts, leaving parts such as the wings and the prothoracic glands containing the defensive fluid of A. plantaginis uneaten. By contrast, when the moths were mixed in the paste, birds would need to consume all moth parts including defensive compounds, which may also have been stronger, as they were pooled from three individuals. Moreover, encountering several unpalatable prey in a row may have reduced the birds’ motivation to forage out of the cups in the assays without visual cues, which is reflected in the significant negative effect of eating order on the proportion eaten in Assay 2 (Table 3B). In the first two assays, only one of the eight samples (the positive control) was known to be palatable. In contrast, the birds’ motivation to attack and consume insect prey was ensured in the third assay by offering them pieces of palatable freshly killed mealworms between every trial. Previous studies have demonstrated that prey palatability can influence predator learning of a particular warning signal, with signals associated with higher unpalatability being easier to learn and more memorable (Duncan & Sheppard, 1965; Skelhorn & Rowe, 2006; Rowland et al., 2007).
Our third assay shows that birds do not increase their avoidance towards the yellow morph of A. plantaginis over time, as opposed to Arichanna melanaria, which is attacked less by great tits as trials proceed (Fig. S1). In fact, the attack risk towards Arichanna melanaria was overall lower than towards A. plantaginis. The latency to attack A. plantaginis by great tits after being exposed to Arichanna melanaria showed a non-significant increase compared to when the birds were presented A. plantaginis first (Fig. 7B). However, the experience with yellow A. plantaginis reduced the attack latency towards Arichanna melanaria (Fig. 7C, D), indicating that birds generalized their response towards A. plantaginis to Arichanna melanaria based on their similar hindwing colour. The reduction in attack latency may have been caused by birds becoming more experienced in handling yellow A. plantaginis during the four trials, leaving unpalatable parts uneaten (see Supporting Information). Altogether, our findings suggest that Arichanna melanaria could indeed be a Müllerian model for yellow A. plantaginis. One aspect to be cautious about with regard to this possible mimetic relationship is that A. plantaginis usually starts its flight season earlier than Arichanna melanaria. However, their flight seasons largely overlap and, if birds can remember their experiences over the winter, the yellow morph of A. plantaginis can still benefit from signal sharing with Arichanna melanaria. In addition, predators in the wild are likely to make their decision to attack in a few seconds. Thus, live A. plantaginis may be better protected in the eyes of natural predators in the wild than in the lab, rendering the relationship between yellow A. plantaginis and Arichanna melanaria beneficial for both species, i.e. truly Müllerian. This, however, remains to be explicitly tested. Live wood tiger moths are capable of surviving a bird attack (Rönkä et al., unpubl. data). 
Thus, moth palatability might not directly translate to differences in survival, and the different morphs may be relying on different lines of defence. Nokelainen et al. (2012) found that blue tits hesitated significantly longer before attacking live yellow versus white A. plantaginis. Here, we found no significant difference in the first trial hesitation times by great tits, although most of the longer hesitation times were towards the yellow morph. Black-and-yellow is a common warning colour combination, and might thus induce wariness in both inexperienced and experienced predators (Lindström, Alatalo & Mappes, 1999; Exnerová et al., 2006). However, after attacking, the great tits found the white A. plantaginis to be less palatable than the yellow ones, as per the higher frequency of beak cleaning (Fig. 5), lower proportions eaten (Table S3, Fig. S2) and a decreasing probability of attacks in the third assay compared to the yellow morph (Table S1). We thus hypothesize that the yellow morph relies on not being attacked and hence benefits from signal sharing with other similarly coloured aposematic prey, whereas the white morph relies more heavily on taste-rejection by birds once attacked. Although we found a potential Müllerian model for the yellow morph only, we cannot exclude the possibility that the white morph benefits from mimicry too. According to the image analysis, birds have trouble distinguishing between the more palatable black-and-white R. hastata and the white A. plantaginis morph based on hindwing colour. The colours and patterns of the moths blur when on the move, which further reduces the need for perfection in pattern similarity (Edmunds, 2000). If birds generalize between the two species, this might increase the chances of the white A. plantaginis being taste-rejected. 
This is because variation in prey chemical defences can be aversive to avian predators (Barnett, Bateson & Rowe, 2014) and, furthermore, because variation in co-mimic defences could reinforce the surprise effect of the A. plantaginis defence: the moths release their aversive defensive fluid when grabbed by the bird in its beak, potentially causing the bird to release the prey relatively unharmed.

Predator strategies

In the present study, we report signs of differences in the responses of blue and great tits to the same type of prey, which have also been addressed in previous studies (Exnerová et al., 2003, 2006; Turini, Veselý & Fuchs, 2016). These differences are expressed, for example, in the way in which prey are handled: the great tits decreased their attacks towards Arichanna melanaria, whereas the blue tits did not (Fig. S1). However, the blue tits did decrease the amounts they ate of the attacked Arichanna melanaria during the four trials, suggesting that they were using taste-rejection (Skelhorn & Rowe, 2006; Halpin & Rowe, 2010, 2016), whereas the more cautious behaviour of great tits, which are known to discriminate prey both before and during handling (Exnerová et al., 2003, 2006), better matches a ‘go-slow’ strategy (Guilford, 1994). This is in agreement with previous research showing how the composition of predator communities can have an effect on the success of one morph over the other (Nokelainen et al., 2014). Although beyond the scope of the present study, we highlight these potential differences in predator behaviour as a promising research avenue.

Imperfect mimicry and generalized avoidance

Inaccurate mimicry can arise when predators generalize widely among signals (Rowe, Lindström & Lyytinen, 2004; Ihalainen et al., 2012; Mappes et al., 2014) or use one cue and discard the rest (Bain et al., 2007; Chittka & Osorio, 2007; Kikuchi & Pfennig, 2010, 2013; Kikuchi et al., 2016).
Here, we found support for the latter: (1) according to our image analysis, the birds’ visual system is capable of discerning between Arichanna melanaria and yellow A. plantaginis, and between R. hastata and the white A. plantaginis, but the hindwing colours within each of the two species pairs are not easily distinguishable for the birds, and could thus be used as a cue that is generalized between them; and (2) the experience with yellow A. plantaginis changed great tits’ reaction towards Arichanna melanaria, indicating that the birds used the hindwing warning colour as their primary cue for moth palatability instead of wing pattern, size or shape. Indeed, previous experimental evidence has demonstrated that warning colour might be of foremost importance in recognizing aposematic prey (Morrell & Turner, 1970; Terhune, 1977; Exnerová et al., 2006; Ham et al., 2006; Aronsson & Gamberale-Stille, 2012; Cibulková, Veselý & Fuchs, 2014; Rönkä et al., 2018). For example, the area of warning colour on the wings has been shown to be a more important cue for predators than prey body size (Remmel & Tammaru, 2011; Hegna et al., 2013), and colour to be more important than pattern (Aronsson & Gamberale-Stille, 2008; Finkbeiner, Briscoe & Reed, 2014). Overall, our results imply a complex interplay between warning signals and chemical defences influencing predator attacks on prey that look alike (see also Ihalainen et al., 2008a), reinforcing the idea that both taste and visual appearance are important in predator avoidance learning (Lindström et al., 2006). We suggest that imperfect mimics can benefit from their coexistence because birds generalize between similar colours. However, concluding that multiple-model mimicry explains warning signal polymorphism in A. plantaginis would require one more step of inquiry, namely testing how predation pressure towards A. 
plantaginis morphs is affected by the frequency of models and mimics of the yellow and white morphs in the field, in natural communities of predators and prey. Furthermore, experiments on mimicry are customarily done using dead or stationary prey, ignoring the fact that birds may use flying behaviour as a cue for unpalatability (Chai & Srygley, 1990), and that the resemblance between moving prey does not need to be as accurate as that between non-moving prey (Edmunds, 2000). Our results highlight the importance of taking the predator’s perspective into account when studying the evolution of mimicry. Visual resemblance alone, even if investigated using objective image analyses and vision models, is not sufficient to prove a mimetic relationship, let alone mutualism. Indeed, predator behaviour – in particular, which cues predators pay attention to and whether they use this information on prey characteristics in decision-making – ultimately determines whether two focal species will benefit each other. SUPPORTING INFORMATION Additional Supporting Information may be found in the online version of this article at the publisher’s website. Table S1. Cox survival regression estimates for Arichanna melanaria and yellow A. plantaginis (Apy) in the four trials with blue and great tits (C. caeruleus and P. major, respectively). Hazard ratio is calculated as the risk of event (attack) for groups in the numerator (given in parentheses) as compared to the risk of event (attack) for groups in the denominator. Table S2. Cox survival regression estimates for R. hastata and white A. plantaginis in the four trials with great tits. Hazard ratio is calculated as the risk of event (attack) for R. hastata as compared to the risk of event (attack) for A. plantaginis. Table S3. Cox survival regression estimates for white and yellow A. plantaginis (Apy) in the four trials with great tits. Hazard ratio is calculated as the risk of event (attack) for white A. 
plantaginis as compared to the risk of event (attack) for yellow A. plantaginis. Table S4. Proportions eaten in Assay 3 considering only the first presentations with each species (i.e. in the first trial or the fifth trial). Figure S1. Cox regression survival model estimates for moths attacked by blue (A, B) and great tits (C–F) in the four trials (line colours darken towards the later trials 1–4) where yellow A. plantaginis (A, C), Arichanna melanaria (B, D), white A. plantaginis (E) or R. hastata (F) was offered as a potential model; n refers to the number of birds tested in each case. Figure S2. Proportions eaten in the four subsequent trials in Assay 3 of the moths that were attacked: (A) of yellow A. plantaginis and (B) of Arichanna melanaria by blue tits compared to (C) of yellow A. plantaginis and (D) of Arichanna melanaria by great tits, and (E) of white A. plantaginis and (F) of R. hastata by great tits. ACKNOWLEDGEMENTS We are grateful to all the people working on this project: Kari Kulmala caught and reared moths with us from Central Finland; Helinä Nisu took care of the birds, pre-trained them for the palatability experiment and caught birds from feeders; Heikki Helle helped with bird catching; and Johannes Braunisch, Chiara de Pasqual and Tuuli Salmi helped with mimicry trials and watching the video-recorded behaviours. We are grateful to Janne Valkonen, Sebastiano de Bona and Andrés López-Sepulcre for statistical advice; Ossi Nokelainen for lending his customized camera and introducing ImageJ to us; Innes Cuthill, who wrote the original signal detection analysis program; and Emily Burdfield-Steel for helpful discussion and language checking that improved the manuscript. Brice Noonan, Mathieu Joron and Chris Jiggins gave thoughtful comments on an earlier version of this paper as part of KR’s PhD thesis. We also thank three anonymous reviewers for their dedication and valuable feedback. 
Funding for this study was provided by the Centre of Excellence in Biological Interactions (Academy of Finland, project no. 284666 to JM). REFERENCES Arias M, Mappes J, Théry M, Llaurens V. 2016a. Inter-species variation in unpalatability does not explain polymorphism in a mimetic species. Evolutionary Ecology 30: 419–433. Arias M, le Poul Y, Chouteau M, Boisseau R, Rosser N, Théry M, Llaurens V. 2016b. Crossing fitness valleys: empirical estimation of a fitness landscape associated with polymorphic mimicry. Proceedings of the Royal Society B 283: 20160391. Aronsson M, Gamberale-Stille G. 2008. Domestic chicks primarily attend to colour, not pattern, when learning an aposematic coloration. Animal Behaviour 75: 417–423. Aronsson M, Gamberale-Stille G. 2012. Colour and pattern similarity in mimicry: evidence for a hierarchical discriminative learning of different components. Animal Behaviour 84: 881–887. Bain RS, Rashed A, Cowper VJ, Gilbert FS, Sherratt TN. 2007. The key mimetic features of hoverflies through avian eyes. Proceedings of the Royal Society of London B 274: 1949–1954. Barnett CA, Bateson M, Rowe C. 2007. State-dependent decision making: educated predators strategically trade off the costs and benefits of consuming aposematic prey. Behavioral Ecology 18: 645–651. Barnett CA, Bateson M, Rowe C. 2014. Better the devil you know: avian predators find variation in prey toxicity aversive. Biology Letters 10: 20140533. Bates HW. 1862. Contributions to an insect fauna of the Amazon Valley. Lepidoptera: Heliconidae. 23. Transactions of the Linnean Society of London 23: 495–566. Bates D, Maechler M, Bolker B, Walker S. 2015. 
Fitting linear mixed-effects models using lme4. Journal of Statistical Software 67: 1–48. Benson WW. 1972. Natural selection for Mullerian mimicry in Heliconius erato in Costa Rica. Science 176: 936–939. Brower LP, Ryerson WN, Coppinger LL, Glazier SC. 1968. Ecological chemistry and the palatability spectrum. Science 161: 1349–1350. Brown KS Jr, Benson WW. 1974. Adaptive polymorphism associated with multiple Müllerian mimicry in Heliconius numata (Lepid. Nymph.). Biotropica 6: 205–228. Burdfield-Steel E, Pakkanen H, Rojas B, Galarza JA, Mappes J. 2018. De novo synthesis of chemical defenses in an aposematic moth. Journal of Insect Science 18: 1–4. Chai P, Srygley RB. 1990. Predation and the flight, morphology, and temperature of Neotropical rainforest butterflies. The American Naturalist 135: 748–765. Chittka L, Osorio D. 2007. Cognitive dimensions of predator responses to imperfect mimicry. PLoS Biology 5: e339. Chouteau M, Summers K, Morales V, Angers B. 2011. Advergence in Müllerian mimicry: the case of the poison dart frogs of Northern Peru revisited. Biology Letters 7: 796–800. Cibulková A, Veselý P, Fuchs R. 2014. Importance of conspicuous colours in warning signals: the great tit’s (Parus major) point of view. Evolutionary Ecology 28: 427–439. Clarke CA, Sheppard PM, Thornton IW. 1968. The genetics of the mimetic butterfly Papilio memnon L. Philosophical Transactions of the Royal Society of London B 254: 37–89. Davis RH, Nahrstedt A. 1982. Occurrence and variation of the cyanogenic glucosides linamarin and lotaustralin in species of the Zygaenidae (Insecta: Lepidoptera). 
Comparative Biochemistry and Physiology Part B: Comparative Biochemistry 71: 329–332. Duncan C, Sheppard P. 1965. Sensory discrimination and its role in the evolution of Batesian mimicry. Behaviour 24: 269–282. Edmunds M. 1974. Defence in animals: a survey of antipredator defences. New York: Longman. Edmunds M. 2000. Why are there good and poor mimics? Biological Journal of the Linnean Society 70: 459–466. Eklund A. 2016. Beeswarm: the bee swarm plot, an alternative to stripchart. R package version 0.2.3. Available at: https://CRAN.R-project.org/package=beeswarm (accessed 27 March 2018). Endler JA. 2012. A framework for analysing colour pattern geometry: adjacent colours. Biological Journal of the Linnean Society 107: 233–253. Endler JA, Mielke PW. 2005. Comparing entire colour patterns as birds see them. Biological Journal of the Linnean Society 86: 405–431. Evans DL, Waldbauer GP. 1982. Behavior of adult and naive birds when presented with a bumblebee and its mimic. Zeitschrift für Tierpsychologie 59: 247–259. Exnerová A, Landová E, Štys P, Fuchs R, Prokopová M, Cehláriková P. 2003. Reactions of passerine birds to aposematic and non-aposematic firebugs (Pyrrhocoris apterus; Heteroptera). Biological Journal of the Linnean Society 78: 517–525. Exnerová A, Svádová K, Štys P, Barcalová S, Landová E, Prokopová M, Fuchs R, Socha R. 2006. Importance of colour in the reaction of passerine predators to aposematic prey: experiments with mutants of Pyrrhocoris apterus (Heteroptera). Biological Journal of the Linnean Society 88: 143–153. Finkbeiner SD, Briscoe AD, Reed RD. 2014. 
Warning signals are seductive: relative contributions of color and pattern to predator avoidance and mate attraction in Heliconius butterflies. Evolution 68: 3410–3420. Fisher RA. 1958. Polymorphism and natural selection. Journal of Ecology 46: 289–293. Fournier DA, Skaug HJ, Ancheta J, Ianelli J, Magnusson A, Maunder MN, Nielsen A, Sibert J. 2012. AD model builder: using automatic differentiation for statistical inference of highly parameterized complex nonlinear models. Optimization Methods and Software 27: 233–249. Gabor D. 1946. Theory of communication. Part 1: the analysis of information. Journal of the Institution of Electrical Engineers-Part III: Radio and Communication Engineering 93: 429–441. Guilford T. 1994. ‘Go-slow’ signaling and the problem of automimicry. Journal of Theoretical Biology 170: 311–316. Halpin CG, Rowe C. 2010. Taste-rejection behaviour by predators can promote variability in prey defences. Biology Letters 6: 617–619. Halpin CG, Rowe C. 2016. The effect of distastefulness and conspicuous coloration on the post-attack rejection behaviour of predators and survival of prey. Biological Journal of the Linnean Society 120: 236–244. Ham AD, Ihalainen E, Lindström L, Mappes J. 2006. Does colour matter? The importance of colour in avoidance learning, memorability and generalisation. Behavioral Ecology and Sociobiology 60: 482–491. Hart NS. 2001. The visual ecology of avian photoreceptors. Progress in Retinal and Eye Research 20: 675–703. Hegna RH, Galarza JA, Mappes J. 2015. Global phylogeography and geographical variation in warning coloration of the wood tiger moth (Parasemia plantaginis). Journal of Biogeography 42: 1469–1481. 
Hegna RH, Nokelainen O, Hegna JR, Mappes J. 2013. To quiver or to shiver: increased melanization benefits thermoregulation, but reduces warning signal efficacy in the wood tiger moth. Proceedings of the Royal Society B 280: 20122812. Huheey JE. 1976. Studies in warning coloration and mimicry. VII. Evolutionary consequences of a Batesian–Müllerian spectrum: a model for Müllerian mimicry. Evolution 30: 86–93. Ihalainen E, Lindström L, Mappes J. 2007. Investigating Müllerian mimicry: predator learning and variation in prey defences. Journal of Evolutionary Biology 20: 780–791. Ihalainen E, Lindström L, Mappes J, Puolakkainen S. 2008a. Butterfly effects in mimicry? Combining signal and taste can twist the relationship of Müllerian co-mimics. Behavioral Ecology and Sociobiology 62: 1267–1276. Ihalainen E, Lindström L, Mappes J, Puolakkainen S. 2008b. Can experienced birds select for Müllerian mimicry? Behavioral Ecology 19: 362–368. Ihalainen E, Rowland HM, Speed MP, Ruxton GD, Mappes J. 2012. Prey community structure affects how predators select for Müllerian mimicry. Proceedings of the Royal Society B 279: 2099. Johnstone RA. 2002. The evolution of inaccurate mimics. Nature 418: 524–526. Jones DA, Parsons J, Rothschild M. 1962. Release of hydrocyanic acid from crushed tissues of all stages in the life-cycle of species of the Zygaeninae (Lepidoptera). Nature 193: 52–53. Jones RS, Fenton A, Speed MP, Mappes J. 2017. Investment in multiple defences protects a nematode–bacterium symbiosis from predation. Animal Behaviour 129: 1–8. 
Joron M, Frezal L, Jones RT, Chamberlain NL, Lee SF, Haag CR, Whibley A, Becuwe M, Baxter SW, Ferguson L. 2011. Chromosomal rearrangements maintain a polymorphic supergene controlling butterfly mimicry. Nature 477: 203–206. Joron M, Mallet JL. 1998. Diversity in mimicry: paradox or paradigm? Trends in Ecology & Evolution 13: 461–466. Joron M, Wynne IR, Lamas G, Mallet J. 1999. Variable selection and the coexistence of multiple mimetic forms of the butterfly Heliconius numata. Evolutionary Ecology 13: 721–754. Kapan DD. 2001. Three-butterfly system provides a field test of Müllerian mimicry. Nature 409: 338–340. Katoh M, Tatsuta H, Tsuji K. 2017. Rapid evolution of a Batesian mimicry trait in a butterfly responding to arrival of a new model. Scientific Reports 7: 6369. Kelber A, Vorobyev M, Osorio D. 2003. Animal colour vision–behavioural tests and physiological concepts. Biological Reviews of the Cambridge Philosophical Society 78: 81–118. Kemp DJ, Herberstein ME, Fleishman LJ, Endler JA, Bennett AT, Dyer AG, Hart NS, Marshall J, Whiting MJ. 2015. An integrative framework for the appraisal of coloration in nature. The American Naturalist 185: 705–724. Kikuchi DW, Mappes J, Sherratt TN, Valkonen JK. 2016. Selection for multicomponent mimicry: equal feature salience and variation in preferred traits. Behavioral Ecology 27: 1515–1521. Kikuchi DW, Pfennig DW. 2010. Predator cognition permits imperfect coral snake mimicry. The American Naturalist 176: 830–834. Kikuchi DW, Pfennig DW. 2013. Imperfect mimicry and the limits of natural selection. The Quarterly Review of Biology 88: 297–315. 
Kokko H, Mappes J, Lindström L. 2003. Alternative prey can change model-mimic dynamics between parasitism and mutualism. Ecology Letters 6: 1068–1076. Kraemer AC, Serb JM, Adams DC. 2015. Batesian mimics influence the evolution of conspicuousness in an aposematic salamander. Journal of Evolutionary Biology 28: 1016–1023. Kunte K. 2009. The diversity and evolution of Batesian mimicry in Papilio swallowtail butterflies. Evolution 63: 2707–2716. Lantz B. 2013. Machine learning with R. Birmingham: Packt Publishing Ltd. Le Poul Y, Whibley A, Chouteau M, Prunier F, Llaurens V, Joron M. 2014. Evolution of dominance mechanisms at a butterfly mimicry supergene. Nature Communications 5: 5644. Ley C, Watt W. 1989. Testing the ‘mimicry’ explanation for the Colias ‘alba’ polymorphism: palatability of Colias and other butterflies to wild bird predators. Functional Ecology 3: 183–192. Lindström L, Alatalo RV, Lyytinen A, Mappes J. 2004. The effect of alternative prey on the dynamics of imperfect Batesian and Müllerian mimicries. Evolution 58: 1294–1302. Lindström L, Alatalo RV, Mappes J. 1997. Imperfect Batesian mimicry – the effects of the frequency and the distastefulness of the model. Proceedings of the Royal Society of London. Series B 264: 149. Lindström L, Alatalo RV, Mappes J. 1999. Reactions of hand-reared and wild-caught predators toward warningly colored, gregarious, and conspicuous prey. Behavioral Ecology 10: 317–322. Lindström L, Lyytinen A, Mappes J, Ojala K. 2006. Relative importance of taste and visual appearance for predator education in Müllerian mimicry. Animal Behaviour 72: 323–333. 
MacDougall A, Dawkins MS. 1998. Predator discrimination error and the benefits of Müllerian mimicry. Animal Behaviour 55: 1281–1288. Mallet J, Barton NH. 1989. Strong natural selection in a warning-color hybrid zone. Evolution 43: 421–431. Mappes J, Alatalo RV. 1997. Effects of novelty and gregariousness in survival of aposematic prey. Behavioral Ecology 8: 174–177. Mappes J, Kokko H, Ojala K, Lindström L. 2014. Seasonal changes in predator community switch the direction of selection for prey defences. Nature Communications 5: 5016. Marek PE, Bond JE. 2009. A Mullerian mimicry ring in Appalachian millipedes. Proceedings of the National Academy of Sciences of the United States of America 106: 9755–9760. Marsh N, Rothschild M. 1974. Aposematic and cryptic Lepidoptera tested on the mouse. Journal of Zoology 174: 89–122. Merilaita S. 2016. Broadening the angle of view on aposematism: a comment on Skelhorn et al. Behavioral Ecology 27: 966–967. Michalis K. 2017. Background matching camouflage. Bristol: University of Bristol. Michalis C, Scott-Samuel NE, Gibson DP, Cuthill IC. 2017. Optimal background matching camouflage. Proceedings of the Royal Society of London B 284: 20170709. Morrell GM, Turner JR. 1970. Experiments on mimicry: I. The response of wild birds to artificial prey. Behaviour 36: 116–130. Müller F. 1879. Ituna and Thyridia: a remarkable case of mimicry in butterflies. Transactions of the Entomological Society of London 1879: 20–29. Nijhout HF. 2003. Polymorphic mimicry in Papilio dardanus: mosaic dominance, big effects, and origins. Evolution & Development 5: 579–592. 
Nishida R. 1994. Sequestration of plant secondary compounds by butterflies and moths. Chemoecology 5: 127–138. Nokelainen O, Hegna RH, Reudler JH, Lindstedt C, Mappes J. 2012. Trade-off between warning signal efficacy and mating success in the wood tiger moth. Proceedings of the Royal Society B 279: 257–265. Nokelainen O, Valkonen J, Lindstedt C, Mappes J. 2014. Changes in predator community structure shifts the efficacy of two warning signals in Arctiid moths. The Journal of Animal Ecology 83: 598–605. R Core Team. 2013. R: a language and environment for statistical computing. Vienna: R Foundation for Statistical Computing. Remmel T, Tammaru T. 2011. Evidence for the higher importance of signal size over body size in aposematic signaling in insects. Journal of Insect Science 11: 1–11. Renoult JP, Kelber A, Schaefer HM. 2017. Colour spaces in ecology and evolutionary biology. Biological Reviews of the Cambridge Philosophical Society 92: 292–315. Robin X, Turck N, Hainard A, Tiberti N, Lisacek F, Sanchez JC, Müller M. 2011. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics 12: 77. Rojas B, Devillechabrolle J, Endler JA. 2014. Paradox lost: variable colour-pattern geometry is associated with differences in movement in aposematic frogs. Biology Letters 10: 20140193. Rojas B, Gordon SP, Mappes J. 2015. Frequency-dependent flight activity in the colour polymorphic wood tiger moth. Current Zoology 61: 765–772. Rojas B, Burdfield-Steel E, Pakkanen H, Suisto K, Maczka M, Schulz S, Mappes J. 2017. How to fight multiple enemies: target-specific chemical defences in an aposematic moth. 
Proceedings of the Royal Society B 284: 20171424. Rönkä K, De Pasqual C, Mappes J, Gordon S, Rojas B. 2018. Colour alone matters: no predator generalization among morphs of an aposematic moth. Animal Behaviour 135: 153–163. Rönkä K, Mappes J, Kaila L, Wahlberg N. 2016. Putting Parasemia in its phylogenetic place: a molecular analysis of the subtribe Arctiina (Lepidoptera). Systematic Entomology 41: 844–853. Rowe C, Lindström L, Lyytinen A. 2004. The importance of pattern similarity between Müllerian mimics in predator avoidance learning. Proceedings of the Royal Society B 271: 407–413. Rowland HM, Ihalainen E, Lindström L, Mappes J, Speed MP. 2007. Co-mimics have a mutualistic relationship despite unequal defences. Nature 448: 64–67. Rowland HM, Mappes J, Ruxton GD, Speed MP. 2010. Mimicry between unequally defended prey can be parasitic: evidence for quasi-Batesian mimicry. Ecology Letters 13: 1494–1502. Rowland HM, Parker MR, Jiang P, Reed DR, Beauchamp GK. 2015. Comparative taste biology with special focus on birds and reptiles. In: Doty RL, ed. Handbook of olfaction and gustation, 3rd edn. Oxford: Wiley-Blackwell, 957–982. RStudio. 2015. RStudio: integrated development environment for R (Version 0.99.441) [Computer software]. Available at: http://www.rstudio.org/ (accessed 27 March 2018). Ruxton GD, Sherratt TN, Speed MP. 2004. Avoiding attack: the evolutionary ecology of crypsis, warning signals and mimicry. Oxford: Oxford University Press. Sandre S-L, Stevens M, Mappes J. 2010. The effect of predator appetite, prey warning coloration and luminance on predator foraging decisions. Behaviour 147: 1121–1143. 
Silvonen K, Top-Jensen M, Fibiger M. 2014. Suomen päivä- ja yöperhoset – maastokäsikirja (A field guide to the butterflies and moths of Finland). Østermarie: Bugbook Publishing. Skaug H, Fournier D, Nielsen A, Magnusson A, Bolker B. 2013. Generalized linear mixed models using AD model builder. R package version 0.7.7. Available at: http://glmmadmb.r-forge.r-project.org (accessed 27 March 2018). Skelhorn J, Halpin CG, Rowe C. 2016. Learning about aposematic prey. Behavioral Ecology 27: 955–964. Skelhorn J, Rowe C. 2006. Prey palatability influences predator learning and memory. Animal Behaviour 71: 1111–1118. Skelhorn J, Rowe C. 2007. Predators’ toxin burdens influence their strategic decisions to eat toxic prey. Current Biology 17: 1479–1483. Skelhorn J, Rowe C. 2010. Birds learn to use distastefulness as a signal of toxicity. Proceedings of the Royal Society B 277: 1729–1734. Schneider CA, Rasband WS, Eliceiri KW. 2012. NIH Image to ImageJ: 25 years of image analysis. Nature Methods 9: 671–675. Speed MP. 1999. Batesian, quasi-Batesian or Müllerian mimicry? Theory and data in mimicry research. Evolutionary Ecology 13: 755–776. Speed MP, Ruxton GD, Mappes J, Sherratt TN. 2012. Why are defensive toxins so variable? An evolutionary perspective. Biological Reviews of the Cambridge Philosophical Society 87: 874–884. Stevens M, Cuthill IC. 2006. Disruptive coloration, crypsis and edge detection in early visual processing. Proceedings of the Royal Society of London B 273: 2141–2147. Stevens M, Parraga CA, Cuthill IC, Partridge JC, Troscianko TS. 2007. Using digital photography to study animal coloration. 
Biological Journal of the Linnean Society 90: 211–237. Stuckert AMM, Saporito RA, Venegas PJ, Summers K. 2014. Alkaloid defenses of co-mimics in a putative Mullerian mimetic radiation. BMC Evolutionary Biology 14: 76. Symula R, Schulte R, Summers K. 2001. Molecular phylogenetic evidence for a mimetic radiation in Peruvian poison frogs supports a Mullerian mimicry hypothesis. Proceedings of the Royal Society of London Series B 268: 2415–2421. Taylor CH, Reader T, Gilbert F. 2016. Why many Batesian mimics are inaccurate: evidence from hoverfly colour patterns. Proceedings of the Royal Society of London Series B 283: 20161585. Terhune EC. 1977. Components of a visual stimulus used by scrub jays to discriminate a Batesian model. The American Naturalist 111: 435–451. Therneau T. 2015. coxme: mixed effects Cox models. R package version 2.2-5. Available at: https://CRAN.R-project.org/package=coxme (accessed 27 March 2018). Troscianko J, Stevens M. 2015. Image calibration and analysis toolbox - a free software suite for objectively measuring reflectance, colour and pattern. Methods in Ecology and Evolution 6: 1320–1331. Turini A, Veselý P, Fuchs R. 2016. Five species of passerine bird differ in their ability to detect Batesian mimics. Biological Journal of the Linnean Society 117: 832–841. Turner JRG. 1970. Studies of Müllerian mimicry and its evolution in Burnet moths and Heliconid butterflies. In: Creed R, ed. Ecological genetics and evolution. Boston: Springer, 224–260. Ueno H, Sato Y, Tsuchida K. 1998. Colour-associated mating success in a polymorphic Ladybird Beetle, Harmonia axyridis. Functional Ecology 12: 757–761. 
Van Belleghem SM, Papa R, Ortiz-Zuazaga H, Hendrickx F, Jiggins CD, Owen McMillan W, Counterman BA. 2018. patternize: an R package for quantifying colour pattern variation. Methods in Ecology and Evolution 9: 390–398. Veselý P, Luhanová D, Prášková M, Fuchs R. 2013. Generalization of mimics imperfect in colour patterns: the point of view of wild avian predators. Ethology 119: 138–145. Vorobyev M, Osorio D. 1998. Receptor noise as a determinant of colour thresholds. Proceedings of the Royal Society B 265: 351–358. Wickens TD. 2002. Elementary signal detection theory. Oxford: Oxford University Press. Xiao F, Cuthill IC. 2016. Background complexity and the detectability of camouflaged targets by birds and humans. Proceedings of the Royal Society of London B 283: 20161527. Yeager J, Brown JL, Morales V, Cummings M, Summers K. 2012. Testing for selection on color and pattern in a mimetic radiation. Current Zoology 58: 668–676. © 2018 The Linnean Society of London, Biological Journal of the Linnean Society This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices) TI - Can multiple-model mimicry explain warning signal polymorphism in the wood tiger moth, Arctia plantaginis (Lepidoptera: Erebidae)? JF - Biological Journal of the Linnean Society DO - 10.1093/biolinnean/bly042 DA - 2018-04-30 UR - https://www.deepdyve.com/lp/oxford-university-press/can-multiple-model-mimicry-explain-warning-signal-polymorphism-in-the-T7aLdY5trT SP - 1 VL - Advance Article IS - DP - DeepDyve ER -