TY - JOUR AU - Johnson, Scott, A AB - Abstract

Accurate age determination is a fundamental prerequisite for demographic studies as well as population monitoring efforts that provide information for management and conservation. Yet, common age determination methods suffer from low accuracy rates, impose additional handling and time costs on animals and biologists, or rely on invasive techniques such as tooth extraction. We introduce an alternative, mixture modeling approach for age determination that exploits mammalian growth patterns to classify newly encountered animals as juveniles or adults, and present an example analysis that classifies Allegheny woodrats based solely on their capture dates and mass at capture, in combination with data from known adults. We also introduce and validate a simulation-based heuristic to evaluate potential classification accuracy when no known-age test cases are available. In the Allegheny woodrat example, the mixture model achieved a 90–92% accuracy rate (heuristic range: 89–94%), far better than the 36–43% achieved with a fixed mass criterion, and comparable to accuracies reported for other species using more data-intensive, multivariate classification techniques. The model can be extended to classify multiple age groups, estimate chronological age, or further improve accuracy by including additional morphometric measures.

Keywords: age determination, Allegheny woodrat, growth model, mixture model, Neotoma magister, receiver operating characteristic curve, ROC

The accurate description of age structure is a foundational element of population ecology (Lyons et al. 2012). Correct age classifications for individual study animals are necessary to describe age-specific patterns of survival, to quantify recruitment rates, and to differentiate parental and offspring cohorts for parentage analyses.
To identify age in mammals, biologists may focus trapping efforts during seasons when juveniles can be easily differentiated based on body size or pelage, collect precise measurements of body size (Karels et al. 2004), assess the progression of tooth eruption or wear (Spinage 1973), or visualize cementum annuli in the cross section of extracted teeth (Hamlin et al. 2000). However, the applicability of these and other methods for age identification may vary considerably among species and study systems. Additionally, many aging methods can impose undesirable costs on study animals (e.g., stress associated with prolonged handling or need for chemical immobilization to collect requisite data) or researchers (e.g., increased handling time and reduction in the number of individuals that can be processed). Given that the feasibility of collecting data needed for age identification may vary among study systems, we sought to develop a generalized method for aging unknown individuals. Our solution exploits the predictable patterns of juvenile growth in mammals and the resulting differences in the distributions of body sizes among age classes of interest to probabilistically assign newly encountered individuals to discrete age classes. Although we illustrate the model with live-trapping data collected as part of population monitoring efforts for the Allegheny woodrat (Neotoma magister), we deliberately sought to create a flexible solution that can readily be extended to the unique biology of a variety of study systems. Allegheny woodrats are endemic to forested ecosystems throughout the Appalachian Mountains in the eastern United States (Castleberry et al. 2006). As habitat specialists, the local distribution of woodrats is restricted by the presence of complex rock structures such as caves, fissures, talus fields, and boulder piles (Castleberry et al. 2006). 
Woodrats establish den sites within the interstices of complex rock features, which they depend upon for protection from predators and for a thermally moderated environment in which to cache food items for overwinter subsistence and rearing young. Mengak (2002) observed a pulse in woodrat reproduction in the spring with additional litters produced opportunistically throughout the growing season, depending on resource availability. With small litter sizes generally numbering only 2 to 3 pups, woodrats have high maternal investment with young reaching sexual maturity in their 2nd year of life (Alligood et al. 2008; Wood 2008; Smyser and Swihart 2014). Once common, Allegheny woodrats have declined dramatically over the last 40 years as a result of synergistic effects of habitat fragmentation and increasing edge effects, reduced hard mast availability, and increased mortality associated with infection by Baylisascaris procyonis (LoGiudice 2006, 2008; Smyser et al. 2012). Given widespread declines, effective field methods are needed for monitoring woodrat populations and assessing the efficacy of recovery efforts. Allegheny woodrats are highly trappable (Castleberry et al. 2014; Smyser et al. 2016a, 2016b) yet trapping and handling is accompanied by some innate risk to target and nontarget species (Powell and Proulx 2003). To minimize the risks to woodrat populations introduced by livetrapping, standard monitoring protocols have been established for the species (Mengak 2002; Mengak et al. 2008). In addition to restricting livetrapping to 2 consecutive nights with traps checked daily, trapping efforts are scheduled to avoid the spring pulse in woodrat reproduction so as to minimize the risk of introducing undue stress on pregnant females or separating dependent young from lactating dams. However, delaying data collection until the summer or fall months complicates the differentiation of young-of-the-year (hereafter, juveniles) from previously unmarked adults. 
With livetrapping restricted to summer or fall, individuals born early in the reproductive season will have begun to approach adult body sizes and may possess adult coloration (brown as opposed to gray) by the time that monitoring efforts are initiated. As a result, follow-up analyses that rely on accurate age classifications (e.g., assessments of recruitment rates, juvenile versus adult survival, or individual reproductive success) may be biased. In addition, because the level of uncertainty associated with individual age determinations is unknown, no good method exists to assess the quality of age classification data. Given the challenges of differentiating older juveniles from reproductively mature adults, annual live-trap monitoring data for Allegheny woodrats collected from 2005 to 2013 provide an ideal test for our statistical age classification model. Briefly, we develop a finite mixture approach that estimates an age classification for each unknown-age encounter with an animal based on a growth curve whose parameters are estimated as part of the classification model. In the Allegheny woodrat example developed here, we base the model on body mass and capture date. However, body mass may not be the best metric for all species; any morphometric variable that exhibits a predictable growth pattern may be substituted for body mass without a change in the model structure. In addition, the mixture modeling approach can utilize data on multiple morphometric variables simultaneously (we present a simplified example of such a multivariate mixture model, without the growth component, in the Supplementary Data). Unlike the approach typically used for age determination of Allegheny woodrats (Mengak et al. 
2008), which uses a fixed body mass criterion to differentiate juveniles from adults, our model-based classification provides an estimate of the uncertainty in each individual classification by assigning each animal (or each encounter) a probability of belonging to each of the represented age classes. In addition, we present a simulation-based technique to evaluate accuracy of age classification. Although our example analysis classifies unmarked individuals from a species with discrete breeding periods and only 2 age classes, our approach can be generalized to take advantage of repeated measures in mark–recapture studies, to use multiple morphometric criteria, and to classify multiple age classes. Materials and Methods Study site. We used data collected from live-trap monitoring of the Allegheny woodrat metapopulation occurring within Harrison and Crawford counties in extreme southern Indiana. Extensive efforts have been conducted previously to identify the extent of woodrat habitat within this landscape (Cudmore 1983; Johnson 2002; Smyser et al. 2016b). Locally, woodrats are restricted to discrete habitat patches that function as subpopulations within a metapopulation context with all habitat patches associated with cliff formations immediately adjacent to the Ohio River. Complex rock features (i.e., caves, fissures, and rock jumbles) within cliff lines satisfy the denning requirements of woodrats, whereas surrounding forests, dominated by sugar maple (Acer saccharum), white oak (Quercus alba), northern red oak (Q. rubra), black oak (Q. velutina), chinkapin oak (Q. muehlenbergii), pignut hickory (Carya glabra), and shagbark hickory (C. ovata), provide foraging resources. Annually from 2005 to 2013, we used standard trapping protocols to monitor all known Allegheny woodrat subpopulations within the study area (Mengak et al. 2008; Smyser et al. 2016b). 
Although scheduled to minimize disturbance to woodrat breeding behavior, the timing in which subpopulations were livetrapped varied both within and among years with the earliest trapping date of 12 June and latest trapping date of 5 November. During trapping efforts, we saturated areas of high woodrat activity (i.e., den sites, thoroughfares, and latrines) with live traps (Model #102; Tomahawk Live Trap, Hazelhurst, Wisconsin) baited with fresh, sliced apples. Upon initial capture during annual 2-night trapping sessions, we transferred woodrats from live traps to a handling cone, effectively immobilizing animals. We then recorded sex and weighed each individual to the nearest gram (estimated between 5-g increments) with a spring scale (Model 40600, 600 g; Pesola, Schindellegi, Switzerland) tared to correct for the mass of the handling cone. For newly encountered individuals, we applied a uniquely numbered ear tag to each ear (Monel #1; National Band and Tag Company, Newport, Kentucky) and collected a 2-mm circular biopsy punch (1538; National Band and Tag Company) from each ear for genetic analysis. Loss of both tags between annual monitoring efforts was rare; however, collection of a biopsy punch left an indelible mark on the ear that allowed us to unmistakably differentiate individuals that had lost tags from newly encountered individuals. Employing these methods, we encountered an estimated 76% of males and 94% of females that were present within subpopulations at the time of trapping (Smyser et al. 2016b). In addition to annual monitoring, study areas were part of ongoing woodrat recovery and research efforts associated with reinforcement of extant populations through translocation and monitoring of survival via radiotelemetry (Smyser et al. 2013; Blythe et al. 2015). 
Prior to the release of translocated woodrats, we conducted limited trapping to ensure that woodrats were released into vacant den sites, applying the same methods as described above at localized scales. Similarly, we implemented targeted trapping of radiocollared individuals at the conclusion of the monitoring period to remove collars. These auxiliary trapping efforts added data on several individuals from opportunistic encounters that fell outside of our typical monitoring period. All trapping and handling methods conformed to Purdue University Animal Care and Use Committee policies (Protocol 1201000596) and guidelines provided by the American Society of Mammalogists (Sikes et al. 2016). Data analysis. We sought to develop a statistical classification model that would allow us to estimate the probability that each of n newly encountered woodrats of unknown age was an adult (born > 1 year prior to capture) versus a juvenile (born in the same year as capture) at the time of capture. To do this, we related body mass (m) to the ordinal date of capture (t) for the unknown-age woodrats and compared the resulting patterns with the equivalent relationship between body mass (mA) and capture date (tA) in a set of nA additional woodrats that were known to be adults because they had been encountered in a previous year. In addition, because Allegheny woodrats are sexually dimorphic, we fit separate parameters for males and females and included sex (s and sA for new and known-adult woodrats, respectively) as a stratifying variable in our model. Although the woodrats in our study were marked with individually labeled ear tags, our analysis does not use this information except to assign animals to the known-adult group. Recaptures of the same unknown-age woodrat within a single trapping season are treated as separate, independent events, and age classifications are based on the 1st encounter with a given individual. 
This approach allows us to classify individuals that were encountered only once. While we do not use individual encounter histories directly for classification, we did take advantage of these histories to test model accuracy. Model description. Our modeling approach can incorporate either general or sex-specific parameters as needed. Allegheny woodrats are sexually dimorphic, and males generally attain a larger body size than females. Accordingly, our fitted model incorporates sex-specific parameters. However, to reduce the complexity of model notation, we describe a generalized model with only 1 set of parameters. For convenience, we assumed that body masses of adults are normally distributed with mean $\mu_A$ and standard deviation $\sigma_A$. Normality is not required, but is usually a reasonable assumption for body mass. Additionally, we assumed that on average, body masses of juveniles increase over the course of a single field season (for the woodrats, 12 June to 5 November) according to a growth function, $g(\delta_0, t, \beta)$, where $\mu_A - \delta_0$ is the expected body mass at $t = 0$ (i.e., of the youngest individuals vulnerable to capture), $t$ is an ordinal capture date ≥ 0, and $\beta$ is a set of parameters determined by the form of the growth function. Assuming that juveniles were all born at approximately the same time, the distribution of body masses of juveniles in the population on any given capture date, $t$, is normal with mean $g(\delta_0, t, \beta)$ and standard deviation $\sigma_J$, which may differ from $\sigma_A$. For age classification of Allegheny woodrats, we used a simple asymptotic growth model to describe changes in body masses of juveniles over time,

$$g(\delta_0, t, \beta) = \mu_A - \delta_0 \exp(-\beta t),$$

where $\beta \ge 0$. However, our approach does not depend on a specific type of growth model. For example, we include a Weibull growth model in Supplementary Data SD1 that is flexible enough to fit the growth patterns of a variety of species and includes the asymptotic model used here as a special case.
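As a quick illustration of the asymptotic growth function, the following Python sketch evaluates the expected juvenile mass on a few capture dates. It is not the authors' code (their R and JAGS code is in Supplementary Data SD1). The function name `juvenile_mean_mass` is hypothetical, and plugging in the female point estimates reported in Table 2 as if they were fixed, known values is a simplifying assumption made only for this example.

```python
import math


def juvenile_mean_mass(t, mu_a, delta0, beta):
    """Asymptotic growth model: g(delta0, t, beta) = mu_A - delta0 * exp(-beta * t).

    t      : ordinal capture date (>= 0)
    mu_a   : mean adult body mass (g), the asymptote of the curve
    delta0 : gap between mean adult mass and expected mass at t = 0 (g)
    beta   : growth rate parameter (>= 0)
    """
    if beta < 0:
        raise ValueError("beta must be non-negative")
    return mu_a - delta0 * math.exp(-beta * t)


# ASSUMPTION: female point estimates from Table 2, treated as fixed values.
MU_A_FEMALE = 290.5
DELTA0 = 457.0
BETA_FEMALE = 0.00665

# Ordinal dates 163 and 309 are the earliest and latest capture dates
# (12 June and 5 November); 236 is simply a date in between.
for day in (163, 236, 309):
    mass = juvenile_mean_mass(day, MU_A_FEMALE, DELTA0, BETA_FEMALE)
    print(f"day {day}: expected juvenile mass = {mass:.1f} g")
```

Under these assumed values, the expected mass of a juvenile female rises from about 136 g at the start of the trapping window to about 232 g at its end, which is why late-season juveniles become hard to separate from adults on mass alone.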
Given the distributions for body masses of adults and juveniles over time, we define the likelihood function for age classification of an unknown-age individual, $i$, given its body mass, $m_i$, and capture date, $t_i$, as a finite mixture,

$$L(p_i \mid m_i, t_i) = p_i\, N(m_i \mid \mu_A, \sigma_A) + (1 - p_i)\, N[m_i \mid g(\delta_0, t_i, \beta), \sigma_J] \tag{1}$$

where $p_i$ is the probability that individual $i$ is an adult, and $N(m_i \mid \mu_A, \sigma_A)$ represents the density of a normal distribution with mean $\mu_A$ and standard deviation $\sigma_A$ at $m_i$. When the distributions for adults and juveniles are strongly separated, and $m_i$ clearly falls near the mode of 1 of the distributions, setting $p_i$ near 1 for adults or 0 for juveniles will maximize equation 1. In contrast, if $m_i$ falls between the 2 distributions so that it is in the upper tail of the distribution of juveniles and lower tail of the distribution of adults, equation 1 will be maximized by setting $p_i$ nearer to 0.5. Given $N = n + n_A$ captures that are ordered so that all known adults appear in the data set after the unknown-age individuals, we can write the log-likelihood of the full parameter set given the data as,

$$\log L(\theta \mid \mathrm{data}) = \sum_{i=1}^{n} \log\left\{p_i\, N(m_i \mid \mu_A, \sigma_A) + (1 - p_i)\, N[m_i \mid g(\delta_0, t_i, \beta), \sigma_J]\right\} + \sum_{i=n+1}^{N} \log N(m_i \mid \mu_A, \sigma_A) \tag{2}$$

where $\theta = (\mu_A, \sigma_A, \delta_0, \beta, \sigma_J, p_1, \ldots, p_n)$, and the data include measured body masses and the capture times for both unknown-age individuals and known adults. Equation 2 assumes that no known juveniles exist. If any are available, they would be included in a 3rd summation term that uses the body mass distribution for juveniles rather than the body mass distribution for adults. Fitting the model. Equation 2 can be maximized directly. Alternatively, the model can be fit in a Bayesian context by assigning prior distributions to each of the parameters in $\theta$ and then sampling from the (unnormalized) log-posterior for $\theta$, which is equal to equation 2 plus the sum of the logs of the priors for the parameters.
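To make equations 1 and 2 concrete, the sketch below evaluates the mixture density for a single capture and the log-likelihood for a small data set. It is a minimal Python illustration rather than the authors' implementation. The function names, the toy masses and dates, and the trial values of the adult probabilities are all hypothetical, and the parameter values are the female point estimates from Table 2, used here only as plausible inputs.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and SD at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def growth(t, mu_a, delta0, beta):
    """Asymptotic juvenile growth curve, g = mu_A - delta0 * exp(-beta * t)."""
    return mu_a - delta0 * math.exp(-beta * t)


def mixture_density(m, t, p, mu_a, sd_a, delta0, beta, sd_j):
    """Equation 1: adult and juvenile densities weighted by p and 1 - p."""
    adult = normal_pdf(m, mu_a, sd_a)
    juvenile = normal_pdf(m, growth(t, mu_a, delta0, beta), sd_j)
    return p * adult + (1.0 - p) * juvenile


def log_likelihood(unknown, known_adults, p, mu_a, sd_a, delta0, beta, sd_j):
    """Equation 2.

    unknown      : list of (mass, ordinal date) for unknown-age captures
    known_adults : list of masses for known-adult captures
    p            : list of adult probabilities, one per unknown-age capture
    """
    total = 0.0
    for (m, t), p_i in zip(unknown, p):
        total += math.log(mixture_density(m, t, p_i, mu_a, sd_a, delta0, beta, sd_j))
    for m in known_adults:
        total += math.log(normal_pdf(m, mu_a, sd_a))
    return total


# ASSUMPTION: female point estimates from Table 2, treated as fixed values.
params = dict(mu_a=290.5, sd_a=40.9, delta0=457.0, beta=0.00665, sd_j=35.5)

# HYPOTHETICAL toy data: (mass in g, ordinal capture date).
unknown = [(120.0, 170), (300.0, 200), (240.0, 300)]
known_adults = [310.0, 280.0]

# Trial adult probabilities for the three unknown-age captures.
print(log_likelihood(unknown, known_adults, [0.05, 0.95, 0.5], **params))
```

With these toy inputs, a 120 g capture in mid-June is far more probable under the juvenile component, so the likelihood favors a value of the adult probability near 0, whereas a 300 g capture in mid-July favors a value near 1, as described for equation 1.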
One advantage of the Bayesian approach is that it can be more easily generalized to different life history patterns (see “Discussion”); accordingly, we have adopted the Bayesian approach here. We used a Gibbs sampling approach to fit the model to our data (Gelman et al. 2004). With the exception of $\delta_0$, we allowed all population-level parameters to vary by sex. In addition, we used zero-truncated normal distributions for mass; this had little effect as all of the body masses in our data set fell well above zero. With each Gibbs iteration, we drew a random variable, $\alpha_i \sim \mathrm{Bernoulli}(p_i)$, to represent the true age category of each unknown-age individual $i = 1, \ldots, n$, and then substituted $\alpha_i$ for $p_i$ in equation 2, where $\alpha_i = 1$ indicates that the individual is currently imputed to be an adult, and $\alpha_i = 0$ indicates that it is a juvenile. After sampling, we took the proportion of draws in which $\alpha_i = 1$ as an estimate of the probability that individual $i$ is an adult. Using $\alpha_i$ to perform the categorization is preferable to using $p_i$ directly because $\alpha_i$ is less sensitive to the prior distribution of $p_i$. We fit the model in the software package JAGS (Plummer 2003), implemented through the jagsUI package in R (R Development Core Team 2012; Kellner 2015). Posterior distributions were based on 4 parallel Markov chains, initialized with random starting values and each with 12,000 iterations, a burn-in of 10,000 iterations, and a thinning rate of 5. We assessed convergence of the chains to stationary distributions using the Gelman–Rubin convergence diagnostic (Brooks and Gelman 1998). Both JAGS model code and R functions to fit the model are provided in Supplementary Data SD1. Classification accuracy. Ideally, predictive accuracy for binary classification models can be evaluated by analyzing a receiver operating characteristic (ROC) curve (Hanley and McNeil 1982).
ROC curves plot the true positive rate for a classification (i.e., the proportion of unknown-age captures that actually are adults and are correctly classified as adults) against the false positive rate (the proportion of juveniles incorrectly classified as adults, or equivalently 1 minus the proportion of juveniles correctly classified as juveniles). Because both rates can only decrease as the threshold used to classify individuals increases, the curve never declines from left to right. For example, if the threshold is 0.5, any individual with $P(\alpha_i = 1) \ge 0.5$ is classified as an adult. At a threshold of 0, the true positive rate = 1, but the false positive rate also = 1. In contrast, if a threshold of 1 is used, both rates drop to 0. One consequence of this construction is that the area under the curve (AUC) provides a direct measure of overall classification accuracy. For a perfect model, AUC = 1. For a model that performs no better than a coin-flip, AUC = 0.5. A major limitation of ROC curve analysis is that it requires knowledge of the true classifications. This information generally is not available for field data. Accordingly, we used 2 approaches to evaluate our model’s classification accuracy. First, we developed a heuristic technique to test model accuracy using simulated woodrat growth data. While it cannot directly measure the classification accuracy for a real data set, this approach allowed us to measure classification accuracy in artificial data sets that have a similar structure to the observed data. The method can be used in any situation where true classifications are unavailable. Second, we took advantage of the fact that our woodrat data included individual encounter histories to estimate classification accuracy using the observed data. To do this, we identified a subset of individuals whose histories allowed us to make unambiguous age determinations for the encounters that were included in the main data set. We used this secondary check to verify the results of the heuristic approach.
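The ROC construction described above can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the code used in the study. The function names `roc_points` and `auc` are hypothetical, the example scores and labels are invented, and the area is computed with the trapezoidal rule.

```python
def roc_points(scores, is_adult):
    """Return (false positive rate, true positive rate) pairs.

    scores   : estimated probability of adult status for each capture
    is_adult : true status for each capture (True/1 = adult, False/0 = juvenile)
    A capture is classified as an adult when its score >= threshold, and the
    threshold is swept from the highest observed score to the lowest.
    """
    n_adults = sum(1 for a in is_adult if a)
    n_juveniles = len(is_adult) - n_adults
    if n_adults == 0 or n_juveniles == 0:
        raise ValueError("need at least one adult and one juvenile")

    # Start with a threshold above every score: nothing is classified as adult.
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        true_pos = sum(1 for s, a in zip(scores, is_adult) if s >= threshold and a)
        false_pos = sum(1 for s, a in zip(scores, is_adult) if s >= threshold and not a)
        points.append((false_pos / n_juveniles, true_pos / n_adults))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# HYPOTHETICAL example: six captures with known true status.
scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
truth = [1, 1, 0, 1, 0, 0]
print(auc(roc_points(scores, truth)))
```

In this toy example, 8 of the 9 possible adult-juvenile pairs are ranked correctly (the adult has the higher score), so the area under the curve is 8/9, or about 0.89.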
For the 1st approach, we simulated data sets that were similar to the woodrat data, fit the classification model to the simulated data sets, and then analyzed the resulting ROC curves given the true values in the simulated data. Rather than simulating data through a parametric bootstrap procedure based on the classification model that we had previously fit to the observed data, we simulated data sets from a stochastic Gompertz process (see below). This function differed from the asymptotic growth function used in the classification model but nevertheless produced patterns similar to the actual data. This procedure avoids any biases that might arise by implicitly assuming that the functional form used for growth in the classification model is a true representation of growth in the study population. Following McClure and Randolph (1980), we simulated woodrat growth as a Gompertz process,

$$m_{it} = \varepsilon_{it}\, A_i \exp\{-\exp[-K_i (t - j_i)]\},$$

where $m_{it}$ is the body mass of individual $i$ on day $t$, $A_i$ and $K_i$ are the individual’s asymptotic body mass and average growth rate constant, respectively, and $j_i$ is the ordinal date of the inflection point in the growth curve for $i$. The term $\varepsilon_{it} \sim N(1, \tau)$ allows body mass to fluctuate randomly over time as a percentage of the expected mass. For each simulation, we randomly generated data for n = 894 captures, corresponding to the number of woodrat encounters in our true data set (unlike the true data, the simulation assumed that each individual was captured only once). Ordinal capture dates (from 1 January) and demographic data were taken directly from field observations, yielding 494 females and 400 males, of which 294 (= 182 female + 112 male) were known adults. Capture dates ranged from 163 to 309. For each simulation, separate values of $A$, $K$, and $j$ were randomly generated for each individual by drawing a value from a normal or lognormal distribution with hyperparameters generated from a uniform distribution (Table 1).
We set the standard deviation for random mass fluctuations on the percentage scale, τ = 0.02, for all simulations.

Table 1. Parameterization ranges and distributions for asymptotic body mass of adults (A), growth rate constants (K), and the Julian date at curve inflection (j) for simulations of Allegheny woodrat (Neotoma magister) growth.

Parameter   Distribution   Female mean     Male mean       SD^a
A           Normal         285–315         300–359         35–39
K           Log-normal     0.0095–0.010    0.0091–0.011    0.57–0.63
j           Normal         152–168         146–176         14–16

^a The same SDs were used for both sexes in a given simulation.

We selected the ranges for the hyperparameters in Table 1 by visually overlaying the point clouds for the observed data and simulated data sets to ensure that they produced similar overall patterns of body mass relative to capture date (Fig. 1). Based on the observed distribution of woodrat body masses, we designated unknown-age individuals as true adults with a probability drawn from Uniform(0.25, 0.35).
Thus, approximately one-third of the unknown-age individuals were actually adults, corresponding to the approximate proportion of woodrats of unknown age with a body mass > 225 g, the weight threshold recommended by Mengak et al. (2008) for identification of adult woodrats.

Fig. 1. Relationship between Julian capture date (x-axis) and body mass (y-axis) for known-adult (triangles) and unknown-age (circles) Allegheny woodrats (Neotoma magister) in a real data set (A) and in 1 replicate of a simulated data set (B). Lines show the results for a fitted model assuming stable mean adult size (solid lines) and asymptotic juvenile growth (dashed lines) for females (black lines, open symbols) and males (gray lines, filled symbols). Simulations followed a Gompertz model (McClure and Randolph 1980) and were used to evaluate the accuracy of the fitted model.

In total, we generated 500 simulated data sets, each of which was subjected to model fitting (using the same approach as the observed data; see “Fitting the model”) and ROC curve analysis. In addition to the model-based classification, we classified a 2nd set of 500 simulations using Mengak et al.’s (2008) recommended threshold of 225 g for classification as adults.
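The Gompertz simulation step described above can be sketched in Python as follows. This is a minimal illustration, not the authors' code (their R functions are in Supplementary Data SD1). The function and variable names are hypothetical, only the female ranges from Table 1 are shown, and only juvenile growth trajectories are simulated. One further assumption is made explicit in the code: the source does not say how the log-normal distribution for K is parameterized, so the tabulated mean is treated here as the median of K and the tabulated SD as the SD on the log scale.

```python
import math
import random

# Female hyperparameter ranges from Table 1:
# name -> ((low, high) for the mean, (low, high) for the SD).
TABLE1_FEMALE = {
    "A": ((285.0, 315.0), (35.0, 39.0)),
    "K": ((0.0095, 0.010), (0.57, 0.63)),
    "j": ((152.0, 168.0), (14.0, 16.0)),
}
TAU = 0.02  # SD of the multiplicative mass fluctuation


def gompertz_mass(t, a, k, j, eps=1.0):
    """Body mass on day t: eps * A * exp(-exp(-K * (t - j)))."""
    return eps * a * math.exp(-math.exp(-k * (t - j)))


def draw_hyperparameters(rng, table=TABLE1_FEMALE):
    """For one simulated data set, draw each mean and SD uniformly from its range."""
    return {
        name: (rng.uniform(*mean_range), rng.uniform(*sd_range))
        for name, (mean_range, sd_range) in table.items()
    }


def simulate_juvenile_mass(t, hyper, rng):
    """Simulate the mass of one juvenile captured once, on ordinal day t."""
    a = rng.gauss(*hyper["A"])
    # ASSUMPTION: tabulated mean = median of K; tabulated SD = log-scale SD.
    k_median, k_log_sd = hyper["K"]
    k = rng.lognormvariate(math.log(k_median), k_log_sd)
    j = rng.gauss(*hyper["j"])
    eps = rng.gauss(1.0, TAU)
    return gompertz_mass(t, a, k, j, eps)


rng = random.Random(1)  # fixed seed so the example is repeatable
hyper = draw_hyperparameters(rng)
for day in (163, 236, 309):
    print(day, round(simulate_juvenile_mass(day, hyper, rng), 1))
```

Drawing the hyperparameters once per data set and the individual values once per capture mirrors the two-level structure described in the text. A full replicate would repeat `simulate_juvenile_mass` for every juvenile capture date in the field data.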
For the observed data and each of the modeled simulations, we also calculated the total proportional volume of overlap ($V$) between the fitted adult and juvenile body mass distributions over time,

$$V = \frac{1}{T} \int_{t_{\min}}^{t_{\max}} \int_{0}^{\infty} \min\left\{N(m \mid \mu_A, \sigma_A),\; N[m \mid g(\delta_0, t, \beta), \sigma_J]\right\}\, dm\, dt \tag{3}$$

where $T = t_{\max} - t_{\min}$, and $t_{\min}$ and $t_{\max}$ are the earliest and latest sampling days in the data set. Note that this equation assumes both distributions fall several standard deviations above 0. If this is not the case, a different distributional model should be considered. If the distributions of body masses of adults and juveniles are well separated throughout the sample period, $V$ will be small and overall classification accuracy should be high. Conversely, if most of the sampling occurs after juveniles have grown to near-adult size, $V$ will be large and classification accuracy will be lower. By regressing AUC against $V$ for the simulated data and then determining where in the regression the model for the actual data falls, we can estimate the classification accuracy for the observed data set. To empirically validate our heuristic evaluation of classification accuracy, we identified a subset (n = 25) of the unknown-age woodrats for which we had ≥ 3 capture records that included measurements of body mass. We included data from auxiliary trapping efforts to increase the number of individuals that met these criteria. By separately plotting the change in mass over time, we were able to confidently assign age classes for each of these individuals at the time of their 1st encounter in the main data set. Specifically, all but 1 of the adults weighed > 210 g, and their masses varied randomly over time within a relatively narrow range. In contrast, woodrats captured as juveniles were smaller (typically < 200 g) and showed substantial growth over subsequent encounters (Supplementary Data SD1).
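The overlap statistic V in equation 3 has no convenient closed form, but it can be approximated numerically. The Python sketch below uses a simple midpoint rule over a grid of dates and masses. It is an illustration under stated assumptions rather than the authors' procedure. The function name `overlap_volume`, the grid sizes, and the upper mass limit of 600 g (standing in for the infinite upper bound) are all hypothetical choices, and the parameter values are the female point estimates from Table 2.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and SD at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def overlap_volume(mu_a, sd_a, delta0, beta, sd_j, t_min, t_max,
                   n_t=60, n_m=600, m_max=600.0):
    """Midpoint-rule approximation of V in equation 3.

    ASSUMPTION: the integral over mass is truncated at m_max grams, which
    must lie several SDs above the adult mean for the result to be accurate.
    """
    dt = (t_max - t_min) / n_t
    dm = m_max / n_m
    total = 0.0
    for i in range(n_t):
        t = t_min + (i + 0.5) * dt
        mean_j = mu_a - delta0 * math.exp(-beta * t)  # asymptotic growth curve
        inner = 0.0
        for k in range(n_m):
            m = (k + 0.5) * dm
            inner += min(normal_pdf(m, mu_a, sd_a), normal_pdf(m, mean_j, sd_j))
        total += inner * dm * dt
    return total / (t_max - t_min)


# ASSUMPTION: female point estimates from Table 2, treated as fixed values.
female = dict(mu_a=290.5, sd_a=40.9, delta0=457.0, beta=0.00665, sd_j=35.5)

print("whole season:", overlap_volume(t_min=163, t_max=309, **female))
print("early only:  ", overlap_volume(t_min=163, t_max=200, **female))
print("late only:   ", overlap_volume(t_min=270, t_max=309, **female))
```

Comparing the early and late windows shows the behavior described in the text: the overlap is smaller when sampling is concentrated early in the season and larger when it is concentrated late, after juveniles have approached adult size.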
Finally, we estimated the model’s classification accuracy by comparing the age classes of these woodrats as determined by inspecting the pattern of changes in their body masses over successive encounters with their most likely age classes as assigned by the model.

Results

Woodrat classification. We captured Allegheny woodrats on 894 occasions during the regular monitoring period; 294 of those captures were of known adults (i.e., recapture of individuals marked in a previous year) and 600 represented encounters with individuals of unknown age (Fig. 1A). Known adults had a mean body mass (± standard deviation) of 311 ± 40 g and unknown individuals had a mean body mass of 221 ± 66 g. Based on a fixed criterion of 225 g for the classification as an adult (Mengak et al. 2008), 267 of the 600 unknown individuals (45%) were identified as adults and the remaining 333 (55%) as juveniles. Model-derived probabilities of adult status for each unknown woodrat ranged from 0 to 1 as a function of mass, sex, and date of capture (Fig. 2; Table 2). Of the 600 unknown-age woodrats, the model classified 524 (87%) as either adults or juveniles with a probability ≥ 0.7, 401 (67%) with probability ≥ 0.9, and 331 (55%) with probability ≥ 0.95 based on the posterior distribution for $\alpha_i$. When all woodrats of unknown age were assigned to their most likely age status based on the model-derived probabilities, 181 woodrats (30%) were classified as adults, whereas 419 (70%) were classified as juveniles.

Fig. 2. Model-derived probability of adult status (y-axis) for 600 Allegheny woodrats (Neotoma magister) of unknown age, but known sex, body mass (x-axis), and Julian date of capture. Females are shown as circles and males as triangles; darker symbol shading indicates a later capture date. Individuals to the right of the vertical line at 225 g would be classified as adults by a standard rule of thumb.

Table 2. Estimated parameter values and 95% credible intervals from the fitted model predicting Allegheny woodrat (Neotoma magister) age status.

Parameter                Estimate    95% CI
$\delta_0$               457         (390, 498)
$\mu_{A,female}$         290.5       (284.0, 296.0)
$\beta_{female}$         0.00665     (0.00584, 0.00720)
$\sigma_{A,female}$      40.9        (37.0, 45.3)
$\sigma_{J,female}$      35.5        (31.7, 39.6)
$\mu_{A,male}$           318.8       (311.0, 326.3)
$\beta_{male}$           0.00614     (0.00529, 0.00674)
$\sigma_{A,male}$        43.7        (38.4, 49.7)
$\sigma_{J,male}$        38.7        (34.8, 43.5)

Classification differed between the model-based and fixed-criterion approach for 86 captures with body masses that ranged from 226 to 278 g. Among the captures with conflicting classification, the model-derived probability of being an adult ranged from 0.085 to 0.49, with a mean of 0.30. No probabilities > 0.50 were present in this group because conflicting classifications only occurred when an individual was identified as an adult with the fixed criterion but identified as a juvenile with the model. Classification accuracy. Our simulations captured the major features of the observed data, including the shape of the scatterplot of body mass versus capture date, and the tendency for males to be larger than females (Fig. 1). Figure 3 shows the average ROC curves (± 95% quantile envelopes) for both females and males over 500 simulated data sets. Across all simulations, accuracy of age classification was > 84% for females and > 86% for males; overall accuracies (mean AUC values) were 93% and 95%, respectively. In contrast, the fixed-criterion approach gave mean true positive rates of 93% and 98% for females and males, respectively, but also produced high rates of false positives, resulting in mean AUC values ≤ 37% for females and ≤ 43% for males. There was no significant difference between males and females for the regression of AUC on $V$ (Fig. 3B). After pooling across both sexes (AUC = 1.01 − 0.545$V$, P < 0.001, $R^2$ = 0.36), we infer that the overall classification accuracy for the observed data is likely in the range of 89–94%.
By comparison, the model correctly classified 92% (23/25) of the woodrats for which we had multiple encounters that allowed confident age determinations to be made based on changes in mass over time. Of the 2 misclassified individuals, 1 was an unusually small adult with a mass of only 202 g, and the other was a large juvenile classified by the model as an adult with a probability of 0.68.

Fig. 3. A) Receiver operating characteristic (ROC) curves for age classifications of female and male Allegheny woodrats (Neotoma magister). Solid lines show the mean ROC curve from analyses of 500 simulated data sets and dashed lines show the corresponding 95% quantile envelopes. The diagonal dotted line corresponds to a coin-flip model which is correct 50% of the time. Curves above this line indicate better predictive performance, up to a maximum possible area under the curve (AUC) of 1 for a perfect predictor. B) Relationship between AUC and the volume of overlap between adult and juvenile size-date distributions in fitted models for the 500 simulations. The vertical error bars show the predicted AUC range for the volume of overlap in the model fit to the true data.

Discussion

Accurate descriptions of age structure are essential for understanding population dynamics and are crucial for estimating population trends and guiding recovery efforts for imperiled species such as Allegheny woodrats. In growth simulations, the average accuracy of age assignments for unmarked Allegheny woodrats using our model-based classification was ≈ 94%, and accuracy was > 89% for the vast majority of simulated data sets. Moreover, we found an accuracy of 92% in a small subsample of the real data for which ages could be confidently determined from encounter histories. In contrast, use of an established fixed criterion for classification of unmarked woodrats as adults (Mengak et al. 2008) had an overall accuracy of 37% and 43% for females and males, respectively. Thus, our model-based approach can greatly improve the accuracy of age classification over the fixed-criterion approach without requiring additional handling of captured individuals in the field. Our model estimates the rate of juvenile growth, which may independently be of interest to biologists, but also can be used to help plan future data collection. For species such as woodrats, which have distinct breeding seasons and attain near-adult body mass within their 1st year, the overlap between size distributions (V) of juveniles and adults is directly related to the end date for annual data collections (tmax). At some point in time, young-of-the-year will have attained sufficient body size to be indistinguishable from adults. Thus, individual ages cannot be determined for data collected after this point.
The mean trendline from the regression of AUC against V (or more conservatively at one of its lower quantiles) can be used to identify a target end date for field data collection that provides a desired level of classification accuracy or that balances accuracy against other criteria (e.g., disruption during the peak of the breeding season). For example, this relationship allows biologists to plan the timing of field seasons so that the number of unique individuals encountered is balanced against the information content that can be gleaned from each capture. Based on changes in body mass of Allegheny woodrats observed over the course of our monitoring efforts, AUC would increase marginally to approximately 97% if monitoring were to end on 12 July, 30 days after the earliest start date in our data set, and would only decline to 89% if monitoring were extended through 31 December (see Supplementary Data SD1). Therefore, little would be gained by altering the current duration of monitoring efforts. Any such analysis that aims to identify periods when data collections are most likely to yield high power for age discriminations necessarily assumes that breeding dates and growth patterns are relatively stable across years, or must also include environmental drivers for reproduction and growth. Other model-based techniques for age classification have also been developed. For instance, Karels et al. (2004) estimated age classification accuracies for marmots (Marmota vancouverensis and M. caligata) between 69% and 86% (mean = 81%) using 2 classification techniques: discriminant function analysis (DFA) and classification and regression trees (CART).
Although their results cannot be contrasted directly with ours due to our use of a different study species, age classification schemes, and methods to evaluate accuracy, our model appears to compare favorably with these techniques, especially considering that we only used data on mass, sex, and capture date, as opposed to a suite of 4–8 morphometric variables (see table 1 in Karels et al. 2004). While our application in Allegheny woodrats produced accurate classifications using only body mass and sex, additional morphometric variables could easily be added to the analysis, either as a series of univariate models (each of which would depend on a common imputed value for αi) or by replacing equation 1 with a multivariate mixture distribution. To illustrate, we have reanalyzed the marmot data from Karels et al. (2004) in Supplementary Data SD2. DFA, CART, and most other age classification techniques construct rules to categorize unknown-age individuals based on their relative similarity to individuals in different known-age classes. The mixture approach taken in our analysis can also be applied in this manner, as illustrated in our supplementary analysis of the Karels et al. (2004) marmot data. In contrast, our analysis for the Allegheny woodrats directly models the biological growth process. As a population, juvenile woodrats gain noticeable mass over the course of a single field season. As a result, the distribution of body masses of juveniles is non-stationary with respect to time and does not follow a parametric distribution if viewed over the entire season. By including a growth model, we break the marginal body mass distribution of juveniles down into a series of time-conditional distributions, each of which is approximately normal. Consequently, classification accuracy is improved. The model also does not require training data from known-age juveniles, although this information could be used if available. 
Another consequence is that the model can be used for inference on growth as well as for classification, and could be extended to estimate actual ages, rather than age categories. As indicated above, other morphometric measures (e.g., hind-foot length) can be used in addition to (or in place of) body mass. This could be particularly effective if growth in multiple morphometrics is modeled jointly, and would be required in order to discriminate age classes in species for which a single morphometric is not diagnostic (e.g., if senescent, older adults routinely lose mass and return to a body mass characteristic of juveniles). It is also straightforward to classify individuals to one of several age classes by adding more terms to the mixture model in equation 1 and then drawing α from a multinomial distribution instead of a Bernoulli (p in this case would become a vector with a Dirichlet or similarly distributed prior; see Supplementary Data SD2). However, extending the model to classify C age classes does require training data from at least C − 1 of those classes. Additionally, if the unknown-age individuals are uniquely marked and are recaptured on multiple occasions, a hierarchical version of the model can use the repeated measurements of individuals to better estimate the growth function and reduce the uncertainty in age classifications. Finally, the Bayesian approach used here can be extended by linking it directly to subsequent analyses that depend on age classifications. For example, parentage analyses require animals to be classified into cohorts of offspring and (older) potential parents. Each unique realization of α in our results leads to a different cohort structure and therefore a potentially different outcome for the parentage analysis.
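As a rough illustration of the time-conditional classification at the core of the woodrat model, the following Python sketch computes the probability that a female of a given mass and capture date is an adult. It is not the authors' code (their R and JAGS implementation is in Supplementary Data SD1). It plugs in the female point estimates from Table 2 rather than using the full posterior, takes the juvenile mean from the asymptotic growth form μA − δ0 exp(−βt), and makes two assumptions that are not stated in this section: that t is the Julian date of capture, and that the prior proportion of adults (the hypothetical argument p_adult) is 0.5.

```python
import math

# Female point estimates taken from Table 2 (plug-in values, not full posteriors).
MU_A = 290.5     # mean adult mass (g)
DELTA_0 = 457.0  # growth-curve offset (g)
BETA = 0.00665   # growth rate (per day)
SD_A = 40.9      # SD of adult mass (g)
SD_J = 35.5      # SD of juvenile mass around the growth curve (g)


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def juvenile_mean(t):
    """Expected juvenile mass at time t (assumed here to be Julian date)."""
    return MU_A - DELTA_0 * math.exp(-BETA * t)


def prob_adult(mass, t, p_adult=0.5):
    """Probability of adult status from a 2-component normal mixture.

    p_adult is an assumed mixing proportion; the fitted model estimates it.
    """
    adult = p_adult * normal_pdf(mass, MU_A, SD_A)
    juvenile = (1.0 - p_adult) * normal_pdf(mass, juvenile_mean(t), SD_J)
    return adult / (adult + juvenile)


if __name__ == "__main__":
    # Same mass, two capture dates: early season versus late season.
    for day in (180, 330):
        print(day, round(prob_adult(250.0, day), 3))
```

Under these assumptions, a 250 g female is assigned a higher adult probability early in the season than late in the season, because the juvenile mean approaches adult mass as the year progresses. In the full Bayesian analysis this calculation would be repeated for each posterior draw of the parameters, so that uncertainty in growth carries through to the classification.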
Although we have focused on a species that has a discrete, relatively short breeding season, the general approach developed here can also be used to classify species that have more extended breeding periods, as long as there is a sufficient number of recaptures to model individual growth. Without the assumption of pulsed breeding, the model as we have currently defined it becomes unidentifiable. An individual captured on a particular date with a given mass might be equally likely to have been born early in the season and to have grown slowly (i.e., μA−δ0 is large and β is small) or to have been born later in the season and grown more quickly (small μA−δ0 and large β). Fitting a hierarchical model in which each individual has its own initial size and growth rate would allow us to define distributions for both parameters. Alternatively, continuous-breeding species could also be classified without recaptures if strong informative priors can be placed on δ0 and β. A potential additional benefit in either case would be an ability to estimate individual birthdates and intraseasonal reproductive patterns. In addition to breeding within a discrete season, Allegheny woodrats typically achieve their adult body mass within approximately 1 year. In pulse-breeding species that mature over > 1 field season, young-of-the-year, 2nd-year subadults, and any older subadult cohorts would each fall into separate size classes. Thus, species with longer times-to-maturity would necessarily require a model designed for multiple age categories as described above. In addition, the time component of the model would need to be modified to account for the fact that subadult classes had been growing for > 1 year at capture.
Using our simple asymptotic model as an example, we could define $g(\delta_0, t, \beta, \alpha_i) = \mu_A - \delta_0 \exp[-\beta(t + 365\alpha_i)]$, where $\alpha_i$ is the currently imputed age class for the ith unknown-age individual, and is equal to 0 for young-of-the-year, 1 for 2nd-year subadults, and so on up to C − 1 for adults. This particular modification assumes that breeding happens at the same time each year, but more flexible parameterizations are also possible. In addition to the age classification model, our analysis introduces a heuristic, simulation-based technique to evaluate the accuracy of age classification models when known true classifications are not available. The goal of this technique is to analyze the relationship between AUC and 1 or more characteristics of the fitted model (in our case, the overlap between fitted distributions for adult and juvenile body masses over time), and then to use this relationship to predict the range of likely AUC values for the real data. We are comfortable suggesting this method to build confidence in classification results. However, we also caution users to treat the technique as a diagnostic of model accuracy as opposed to a true prediction. To be reliable, simulated data should match the real growth process and data patterns as closely as possible. Even in the best case, however, high accuracy in simulations only provides insight into model behavior and does not guarantee similar performance in a real data set. On the other hand, poor performance with simulated data should be taken as grounds for skepticism where real data are concerned.

Supplementary Data

Supplementary data are available at Journal of Mammalogy online.
Supplementary Data SD1: R code to fit a Bayesian age classification model and run associated analyses, with code documentation and examples.
Supplementary Data SD2: Reanalysis of hoary marmot age classification data (Karels et al. 2004) using a finite mixture model.
Supplementary Data SD3: Allegheny woodrat monitoring data.
Supplementary Data SD4: Plain text formatted R code for the analyses in Supplementary Data SD1.

Acknowledgments

R. Blythe collected a portion of the Allegheny woodrat data, and early discussions with J. Berl, M. Sundaram, D. Nelson, and R. K. Swihart helped us frame the problem and formulate the initial model. T. Karels, A. Bryant, and D. Hik generously contributed morphological data collected on hoary marmots (Karels et al. 2004) that allowed us to illustrate the extension of the Bayesian age classification model developed here to multivariate morphological data. We also thank T. Karels and an anonymous reviewer for comments that helped to improve the initial manuscript. This work was supported in part by the Indiana Department of Natural Resources (State Wildlife Grant T7R2 and T7R12), and by a USDA National Institute of Food and Agriculture postdoctoral fellowship to NL (USDA NIFA 2015-67012-22791).

Literature Cited

Alligood, C. A., et al. 2008. Pup development and maternal behavior in captive Key Largo woodrats (Neotoma floridana smalli). Zoo Biology 27:394–405.
Blythe, R. M., T. J. Smyser, S. A. Johnson, and R. K. Swihart. 2015. Post-release survival of captive-reared Allegheny woodrats. Animal Conservation 18:186–195.
Brooks, S. P., and A. Gelman. 1998. General methods for monitoring convergence of iterative simulations. Journal of Computational and Graphical Statistics 7:434–455.
Castleberry, S. B., M. T. Mengak, and W. M. Ford. 2006. Neotoma magister. Mammalian Species 789:1–5.
Castleberry, S. B., M. T. Mengak, and T. E. Menken. 2014. Comparison of trapping and camera survey methods for determining presence of Allegheny woodrats. Wildlife Society Bulletin 38:414–418.
Cudmore, W. W. 1983. The distribution and ecology of the eastern woodrat, Neotoma floridana, in Indiana. Ph.D. dissertation, Indiana State University, Terre Haute.
Gelman, A., J. B. Carlin, H. S. Stern, and D. B. Rubin. 2004. Bayesian data analysis. 2nd ed. CRC Press, Boca Raton, Florida.
Hamlin, K. L., D. F. Pac, C. A. Sime, R. M. DeSimone, and G. L. Dusek. 2000. Evaluating the accuracy of ages obtained by two methods for Montana ungulates. The Journal of Wildlife Management 64:441–449.
Hanley, J. A., and B. J. McNeil. 1982. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143:29–36.
Johnson, S. A. 2002. Reassessment of the Allegheny woodrat (Neotoma magister) in Indiana. Proceedings of the Indiana Academy of Science 111:56–66.
Karels, T. J., A. A. Bryant, and D. S. Hik. 2004. Comparison of discriminant function and classification tree analyses for age classification of marmots. Oikos 105:575–587.
Kellner, K. F. 2015. jagsUI: a wrapper around rjags to streamline JAGS analyses. R package version 1.3.7.
LoGiudice, K. 2006. Toward a synthetic view of extinction: a history lesson from a North American rodent. Bioscience 56:687–693.
LoGiudice, K. 2008. Multiple causes of the Allegheny woodrat decline: a historical-ecological examination. Pp. 23–41 in The Allegheny woodrat: ecology, conservation and management of a declining species (J. D. Peles and J. Wright, eds.). Springer, New York.
Lyons, E. K., M. A. Schroeder, and L. A. Robb. 2012. Criteria for determining sex and age of birds and mammals. Pp. 207–229 in The wildlife techniques manual (N. J. Silvy, ed.). 7th ed. Johns Hopkins University Press, Baltimore, Maryland.
McClure, P. A., and J. Randolph. 1980. Relative allocation of energy to growth and development of homeothermy in the eastern wood rat (Neotoma floridana) and hispid cotton rat (Sigmodon hispidus). Ecological Monographs 50:199–219.
Mengak, M. T. 2002. Reproduction, juvenile growth and recapture rates of Allegheny woodrats (Neotoma magister) in Virginia. American Midland Naturalist 148:155–162.
Mengak, M. T., C. M. Butchkoski, D. J. Feller, and S. A. Johnson. 2008. Lessons from long-term monitoring of woodrat populations. Pp. 109–132 in The Allegheny woodrat: ecology, conservation and management of a declining species (J. D. Peles and J. Wright, eds.). Springer, New York.
Plummer, M. 2003. JAGS: a program for analysis of Bayesian graphical models using Gibbs sampling. Proceedings of the 3rd International Workshop on Distributed Statistical Computing, March 20–22, 2003, Vienna, Austria. Technische Universität Wien, Vienna, Austria.
Powell, R. A., and G. Proulx. 2003. Trapping and marking terrestrial mammals for research: integrating ethics, performance criteria, techniques, and common sense. ILAR Journal 44:259–276.
R Development Core Team. 2012. R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. www.R-project.org/.
Sikes, R. S., and The Animal Care and Use Committee of the American Society of Mammalogists. 2016. 2016 Guidelines of the American Society of Mammalogists for the use of wild mammals in research and education. Journal of Mammalogy 97:663–688.
Smyser, T. J., R. M. Blythe, S. A. Johnson, N. I. Lichti, and R. K. Swihart. 2016a. Implications of American chestnut restoration for an imperiled rodent, the Allegheny woodrat. Journal of Wildlife Management 80:275–283.
Smyser, T. J., S. A. Johnson, L. K. Page, C. M. Hudson, and O. E. Rhodes, Jr. 2013. Use of experimental translocations of Allegheny woodrat to decipher causal agents of decline. Conservation Biology 27:752–762.
Smyser, T. J., S. A. Johnson, L. K. Page, and O. E. Rhodes, Jr. 2012. Synergistic stressors and the dilemma of conservation in a multivariate world: a case study in Allegheny woodrats. Animal Conservation 15:205–213.
Smyser, T. J., G. E. Stauffer, S. A. Johnson, O. E. Rhodes, Jr., and R. K. Swihart. 2016b. Annual survival of Allegheny woodrats in a nonequilibrium metapopulation. Journal of Mammalogy 97:1699–1708.
Smyser, T. J., and R. K. Swihart. 2014. Allegheny woodrat (Neotoma magister) captive propagation to promote recovery of declining populations. Zoo Biology 33:29–35.
Spinage, C. 1973. A review of the age determination of mammals by means of teeth, with especial reference to Africa. African Journal of Ecology 11:165–187.
Wood, P. 2008. Woodrat population dynamics and movement patterns. Pp. 45–62 in The Allegheny woodrat: ecology, conservation and management of a declining species (J. D. Peles and J. Wright, eds.). Springer, New York.

Author notes

Associate Editor was Jessica Light.
© 2017 American Society of Mammalogists, www.mammalogy.org TI - Bayesian model-based age classification using small mammal body mass and capture dates JO - Journal of Mammalogy DO - 10.1093/jmammal/gyx057 DA - 2017-10-03 UR - https://www.deepdyve.com/lp/oxford-university-press/bayesian-model-based-age-classification-using-small-mammal-body-mass-SGChP2Kyq3 SP - 1379 VL - 98 IS - 5 DP - DeepDyve ER -