A High Frequency of Beneficial Mutations Across Multiple Fitness Components in Saccharomyces cerevisiae
Hall, David W; Joseph, Sarah B
doi: 10.1534/genetics.110.118307; pmid: 20516495
Mutation-accumulation experiments are widely used to estimate parameters of spontaneous mutations affecting fitness. In many experiments only one component of fitness is measured. In a previous study involving the diploid yeast Saccharomyces cerevisiae, we measured the growth rate of 151 mutation-accumulation lines to estimate parameters of mutation. We found that an unexpectedly high frequency of fitness-altering mutations was beneficial. Here, we build upon our previous work by examining sporulation efficiency, spore viability, and haploid growth rate and find that these components of fitness also show a high frequency of beneficial mutations. We also examine whether mutation-accumulation (MA) lines show any evidence of pleiotropy among accumulated mutations and find that, for most, there is none. However, MA lines that have zero fitness (i.e., lethality) for any one fitness component do show evidence for pleiotropy among accumulated mutations. We also report estimates of other parameters of mutation based on each component of fitness.
Using Environmental Correlations to Identify Loci Underlying Local Adaptation
Coop, Graham; Witonsky, David; Di Rienzo, Anna; Pritchard, Jonathan K
doi: 10.1534/genetics.110.114819; pmid: 20516501
Loci involved in local adaptation can potentially be identified by an unusual correlation between allele frequencies and important ecological variables or by extreme allele frequency differences between geographic regions. However, such comparisons are complicated by differences in sample sizes and the neutral correlation of allele frequencies across populations due to shared history and gene flow. To overcome these difficulties, we have developed a Bayesian method that estimates the empirical pattern of covariance in allele frequencies between populations from a set of markers and then uses this as a null model for a test at individual SNPs. In our model the sample frequencies of an allele across populations are drawn from a set of underlying population frequencies; a transform of these population frequencies is assumed to follow a multivariate normal distribution. We first estimate the covariance matrix of this multivariate normal across loci using a Monte Carlo Markov chain. At each SNP, we then provide a measure of the support, a Bayes factor, for a model where an environmental variable has a linear effect on the transformed allele frequencies compared to a model given by the covariance matrix alone. This test is shown through power simulations to outperform existing correlation tests. We also demonstrate that our method can be used to identify SNPs with unusually large allele frequency differentiation and offers a powerful alternative to tests based on pairwise or global FST. Software is available at http://www.eve.ucdavis.edu/gmcoop/.
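The first step described above, estimating how allele frequencies covary between populations across a set of markers, can be illustrated with a minimal sketch. This is not the authors' Bayesian method: the abstract does not specify the transform, so the standardization used here (centering each locus on its mean frequency and scaling by the binomial standard deviation) is an assumption, and the function name `population_covariance` and the toy frequencies are hypothetical. No MCMC or Bayes factor is computed.

```python
import math

def population_covariance(freqs):
    """Covariance of standardized allele frequencies between populations.

    freqs[l][p] is the sample frequency of one allele at locus l in
    population p. The standardization below is an assumed stand-in for
    the transform mentioned in the abstract.
    """
    n_pops = len(freqs[0])
    cov = [[0.0] * n_pops for _ in range(n_pops)]
    used = 0
    for locus in freqs:
        mean = sum(locus) / n_pops
        if mean <= 0.0 or mean >= 1.0:
            continue  # monomorphic loci carry no information
        scale = math.sqrt(mean * (1.0 - mean))
        x = [(p - mean) / scale for p in locus]
        for i in range(n_pops):
            for j in range(n_pops):
                cov[i][j] += x[i] * x[j]
        used += 1
    if used == 0:
        raise ValueError("no polymorphic loci")
    # Average the per-locus outer products over the loci that were used.
    return [[v / used for v in row] for row in cov]

# Toy data (hypothetical): populations 0 and 1 share history, 2 is distinct.
toy_freqs = [
    [0.10, 0.12, 0.40],
    [0.80, 0.78, 0.50],
    [0.30, 0.33, 0.10],
    [0.60, 0.58, 0.90],
]
cov = population_covariance(toy_freqs)
```

In this toy example the two related populations covary positively, which is the kind of neutral correlation a null model has to absorb before an environmental effect at a single SNP can be assessed.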
A Genetic Model for the Female Sterility Barrier Between Asian and African Cultivated Rice Species
Garavito, Andrea; Guyot, Romain; Lozano, Jaime; Gavory, Frédérick; Samain, Sylvie; Panaud, Olivier; Tohme, Joe; Ghesquière, Alain; Lorieux, Mathias
doi: 10.1534/genetics.110.116772; pmid: 20457876
S1 is the most important locus acting as a reproductive barrier between Oryza sativa and O. glaberrima. It is a complex locus, with factors that may affect male and female fertility separately. Recently, the component causing the allelic elimination of pollen was fine mapped. However, the position and nature of the component causing female sterility remain unknown. To fine map the factor of the S1 locus affecting female fertility, we developed a mapping approach based on the evaluation of the degree of female transmission ratio distortion (fTRD) of markers. Through implementing this methodology in four O. sativa × O. glaberrima crosses, the female component of the S1 locus was mapped into a 27.8-kb (O. sativa) and 50.3-kb (O. glaberrima) region included within the interval bearing the male component of the locus. Moreover, evidence of additional factors interacting with S1 was also found. In light of the available data, a model where incompatibilities in epistatic interactions between S1 and the additional factors are the cause of the female sterility barrier between O. sativa and O. glaberrima was developed to explain the female sterility and the TRD mediated by S1. According to our model, the recombination ratio and allelic combinations between these factors would determine the final allelic frequencies observed for a given cross.
The Use of Family Relationships and Linkage Disequilibrium to Impute Phase and Missing Genotypes in Up to Whole-Genome Sequence Density Genotypic Data
Meuwissen, Theo; Goddard, Mike
doi: 10.1534/genetics.110.113936; pmid: 20479147
A novel method, called linkage disequilibrium multilocus iterative peeling (LDMIP), for the imputation of phase and missing genotypes is developed. LDMIP performs an iterative peeling step for every locus, which accounts for the family data, and uses a forward–backward algorithm to accumulate information across loci. Marker similarity between haplotype pairs is used to impute possible missing genotypes and phases, which relies on the linkage disequilibrium between closely linked markers. After this imputation step, the combined iterative peeling/forward–backward algorithm is applied again, until convergence. The calculations per iteration scale linearly with number of markers and number of individuals in the pedigree, which makes LDMIP well suited to large numbers of markers and/or large numbers of individuals. Per iteration calculations scale quadratically with the number of alleles, which implies biallelic markers are preferred. In a situation with up to 15% randomly missing genotypes, the error rate of the imputed genotypes was <1% and ∼99% of the missing genotypes were imputed. In another example, LDMIP was used to impute whole-genome sequence data consisting of 17,321 SNPs on a chromosome. Imputation of the sequence was based on the information of 20 (re)sequenced founder individuals and genotyping their descendants for a panel of 3000 SNPs. The error rate of the imputed SNP genotypes was 10%. However, if the parents of these 20 founders are also sequenced, >99% of missing genotypes are imputed correctly.
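The forward–backward step that accumulates information across loci can be illustrated with a generic hidden Markov model sketch. This is not LDMIP itself: the peeling step and the haplotype-similarity imputation are omitted, and the function name `forward_backward`, the two-state setup, and all probabilities are hypothetical. The sketch does show the scaling the abstract mentions: one pass over the loci (linear in markers) with a nested loop over state pairs (quadratic in the number of states).

```python
def forward_backward(init, trans, emis):
    """Posterior state probabilities at each locus of a simple HMM.

    init[s]     prior probability of state s at the first locus
    trans[r][s] probability of moving from state r to state s
    emis[t][s]  likelihood of the data at locus t given state s
                (use 1.0 for every state when the genotype is missing)
    """
    n, k = len(emis), len(init)

    def normalize(v):
        total = sum(v)
        return [x / total for x in v]

    # Forward pass: information from loci to the left.
    fwd = [normalize([init[s] * emis[0][s] for s in range(k)])]
    for t in range(1, n):
        prev = fwd[-1]
        fwd.append(normalize([
            emis[t][s] * sum(prev[r] * trans[r][s] for r in range(k))
            for s in range(k)
        ]))

    # Backward pass: information from loci to the right.
    bwd = [[1.0] * k for _ in range(n)]
    for t in range(n - 2, -1, -1):
        bwd[t] = normalize([
            sum(trans[r][s] * emis[t + 1][s] * bwd[t + 1][s] for s in range(k))
            for r in range(k)
        ])

    # Combine both directions and renormalize at every locus.
    return [normalize([fwd[t][s] * bwd[t][s] for s in range(k)])
            for t in range(n)]

# Toy example (hypothetical values): the middle locus is missing, its
# neighbours strongly support state 0, and states rarely switch.
posterior = forward_backward(
    init=[0.5, 0.5],
    trans=[[0.9, 0.1], [0.1, 0.9]],
    emis=[[0.99, 0.01], [1.0, 1.0], [0.99, 0.01]],
)
```

Here the missing middle locus inherits strong support for state 0 from its flanking loci, which is the intuition behind imputing from closely linked markers.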
Identification of Selection Signatures in Cattle Breeds Selected for Dairy Production
Stella, Alessandra; Ajmone-Marsan, Paolo; Lazzari, Barbara; Boettcher, Paul
doi: 10.1534/genetics.110.116111; pmid: 20479146
The genomics revolution has spurred the undertaking of HapMap studies of numerous species, allowing for population genomics to increase the understanding of how selection has created genetic differences between subspecies populations. The objectives of this study were to (1) develop an approach to detect signatures of selection in subsets of phenotypically similar breeds of livestock by comparing single nucleotide polymorphism (SNP) diversity between the subset and a larger population, (2) verify this method in breeds selected for simply inherited traits, and (3) apply this method to the dairy breeds in the International Bovine HapMap (IBHM) study. The data consisted of genotypes for 32,689 SNPs of 497 animals from 19 breeds. For a given subset of breeds, the test statistic was the parametric composite log likelihood (CLL) of the differences in allelic frequencies between the subset and the IBHM for a sliding window of SNPs. The null distribution was obtained by calculating CLL for 50,000 random subsets (per chromosome) of individuals. The validity of this approach was confirmed by obtaining extremely large CLLs at the sites of causative variation for polled (BTA1) and black-coat-color (BTA18) phenotypes. Across the 30 bovine chromosomes, 699 putative selection signatures were detected. The largest CLL was on BTA6 and corresponded to KIT, which is responsible for the piebald phenotype present in four of the five dairy breeds. Potassium channel-related genes were at the site of the largest CLL on three chromosomes (BTA14, -16, and -25) whereas integrins (BTA18 and -19) and serine/arginine rich splicing factors (BTA20 and -23) each had the largest CLL on two chromosomes. On the basis of the results of this study, the application of population genomics to farm animals seems quite promising. Comparisons between breed groups have the potential to identify genomic regions influencing complex traits with no need for complex equipment and the collection of extensive phenotypic records and can contribute to the identification of candidate genes and to the understanding of the biological mechanisms controlling complex traits.
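The general shape of the test in this abstract (a sliding-window statistic on allele frequency differences, compared against random subsets of individuals) can be sketched as follows. The abstract does not give the form of the composite log likelihood, so the window statistic here is a plain sum of squared frequency differences, used only as a stand-in. The function names, window width, number of random subsets, and toy genotypes are all hypothetical.

```python
import random

def allele_freqs(genos, idx):
    """Per-SNP allele frequency among individuals idx (genotypes coded 0/1/2)."""
    n_snps = len(genos[0])
    return [sum(genos[i][j] for i in idx) / (2 * len(idx)) for j in range(n_snps)]

def window_stats(genos, subset, width):
    """Stand-in statistic: summed squared frequency difference per window."""
    p_sub = allele_freqs(genos, subset)
    p_all = allele_freqs(genos, range(len(genos)))
    d2 = [(a - b) ** 2 for a, b in zip(p_sub, p_all)]
    return [sum(d2[s:s + width]) for s in range(len(d2) - width + 1)]

def empirical_p(genos, subset, width, n_random=200, seed=1):
    """Share of random subsets whose largest window matches or beats the observed one."""
    rng = random.Random(seed)
    observed = max(window_stats(genos, subset, width))
    hits = 0
    for _ in range(n_random):
        rand_subset = rng.sample(range(len(genos)), len(subset))
        if max(window_stats(genos, rand_subset, width)) >= observed:
            hits += 1
    return (hits + 1) / (n_random + 1)

# Toy genotypes (hypothetical): 12 individuals, 6 SNPs. Individuals 0-3 are
# fixed for the alternate allele at SNPs 2 and 3; the other SNPs are unrelated.
genos = [
    [i % 3, (i + 1) % 3, 2 if i < 4 else 0, 2 if i < 4 else 0, (2 * i) % 3, i % 3]
    for i in range(12)
]
stats = window_stats(genos, [0, 1, 2, 3], width=2)
p_value = empirical_p(genos, [0, 1, 2, 3], width=2)
```

The largest window falls on the two SNPs where the subset differs from the full sample, and random subsets of the same size rarely reproduce a value that large.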
Graph-Based Data Selection for the Construction of Genomic Prediction Models
Maenhout, Steven; De Baets, Bernard; Haesaert, Geert
doi: 10.1534/genetics.110.116426; pmid: 20479144
Efficient genomic selection in animals or crops requires the accurate prediction of the agronomic performance of individuals from their high-density molecular marker profiles. Using a training data set that contains the genotypic and phenotypic information of a large number of individuals, each marker or marker allele is associated with an estimated effect on the trait under study. These estimated marker effects are subsequently used for making predictions on individuals for which no phenotypic records are available. As most plant and animal breeding programs are currently still phenotype driven, the continuously expanding collection of phenotypic records can only be used to construct a genomic prediction model if a dense molecular marker fingerprint is available for each phenotyped individual. However, as the genotyping budget is generally limited, the genomic prediction model can only be constructed using a subset of the tested individuals and possibly a genome-covering subset of the molecular markers. In this article, we demonstrate how an optimal selection of individuals can be made with respect to the quality of their available phenotypic data. We also demonstrate how the total number of molecular markers can be reduced while a maximum genome coverage is ensured. The third selection problem we tackle is specific to the construction of a genomic prediction model for a hybrid breeding program where only molecular marker fingerprints of the homozygous parents are available. We show how to identify the set of parental inbred lines of a predefined size that has produced the highest number of progeny. These three selection approaches are put into practice in a simulation study where we demonstrate how the trade-off between sample size and sample quality affects the prediction accuracy of genomic prediction models for hybrid maize.
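The third selection problem, choosing a fixed number of parental inbred lines that together have produced the most hybrids, can be viewed as picking a set of nodes in a graph whose edges are the tested crosses. The sketch below uses a simple greedy peeling heuristic (repeatedly dropping the parent with the fewest remaining hybrids). The abstract does not describe the authors' algorithm, so this heuristic is an assumption and is not guaranteed to be optimal. The function name `select_parents` and the cross list are hypothetical.

```python
def select_parents(hybrids, k):
    """Greedy choice of k parents that retain many hybrids among themselves.

    hybrids is a list of (parent_a, parent_b) pairs, one per tested hybrid.
    Returns the kept parents (sorted) and the number of hybrids whose two
    parents were both kept.
    """
    adj = {}
    for a, b in hybrids:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    while len(adj) > k:
        # Drop the parent with the fewest remaining hybrids (ties by name).
        worst = min(adj, key=lambda node: (len(adj[node]), node))
        for neighbour in adj.pop(worst):
            adj[neighbour].discard(worst)

    kept = sorted(adj)
    n_hybrids = sum(len(partners) for partners in adj.values()) // 2
    return kept, n_hybrids

# Toy crossing scheme (hypothetical): A, B and C were all crossed with each other.
crosses = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("E", "F")]
kept, n_hybrids = select_parents(crosses, k=3)
```

With a genotyping budget of three parents, the heuristic keeps A, B and C, whose three mutual hybrids all remain usable for model training.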
Allelic Variation in Cell Wall Candidate Genes Affecting Solid Wood Properties in Natural Populations and Land Races of Pinus radiata
Dillon, S K; Nolan, M; Li, W; Bell, C; Wu, H X; Southerton, S G
doi: 10.1534/genetics.110.116582; pmid: 20498299
Forest trees are ideally suited to association mapping due to their high levels of diversity and low genomic linkage disequilibrium. Using an association mapping approach, single-nucleotide polymorphism (SNP) markers influencing quantitative variation in wood quality were identified in a natural population of Pinus radiata. Of 149 sites examined, 10 demonstrated significant associations (P < 0.05, q < 0.1) with one or more traits after accounting for population structure and experimentwise error. Without accounting for marker interactions, phenotypic variation attributed to individual SNPs ranged from 2 to 6.5%. Undesirable negative correlations between wood quality and growth were not observed, indicating potential to break negative correlations by selecting for individual SNPs in breeding programs. Markers that yielded significant associations were reexamined in an Australian land race. SNPs from three genes (PAL1, PCBER, and SUSY) yielded significant associations. Importantly, associations with two of these genes validated associations with density previously observed in the discovery population. In both cases, decreased wood density was associated with the minor allele, suggesting that these SNPs may be under weak negative purifying selection for density in the natural populations. These results demonstrate the utility of LD mapping to detect associations, even when the power to detect SNPs with small effect is anticipated to be low.
Directionality of Epistasis in a Murine Intercross Population
Pavlicev, Mihaela; Le Rouzic, Arnaud; Cheverud, James M; Wagner, Günter P; Hansen, Thomas F
doi: 10.1534/genetics.110.118356; pmid: 20516493
Directional epistasis describes a situation in which epistasis consistently increases or decreases the effect of allele substitutions, thereby affecting the amount of additive genetic variance available for selection in a given direction. This study applies a recent parameterization of directionality of epistasis to empirical data. The data stem from a QTL mapping study on an intercross between inbred mouse (Mus musculus) strains LG/J and SM/J, originally selected for large and small body mass, respectively. Results show a negative average directionality of epistasis for body-composition traits, predicting a reduction in additive allelic effects and in the response to selection for increased size. Focusing on average modification of additive effect of single loci, we find a more complex picture, whereby the effects of some loci are enhanced consistently across backgrounds, while effects of other loci are decreased, potentially contributing to either enhancement or reduction of allelic effects when selection acts at single loci. We demonstrate and discuss how the interpretation of the overall measurement of directionality depends on the complexity of the genotype–phenotype map. The measure of directionality changes with the power of scale in a predictable way; however, its expected effect with respect to the modification of additive genetic effects remains constant.
Phantom, a New Subclass of Mutator DNA Transposons Found in Insect Viruses and Widely Distributed in Animals
Marquez, Claudia P; Pritham, Ellen J
doi: 10.1534/genetics.110.116673; pmid: 20457878
Transposons of the Mutator (Mu) superfamily have been shown to play a critical role in the evolution of plant genomes. However, the identification of Mutator transposons in other eukaryotes has been quite limited. Here we describe a previously uncharacterized group of DNA transposons, designated Phantom, identified in the genomes of a wide range of eukaryotic taxa, including many animals, and provide evidence for its inclusion within the Mutator superfamily. Interestingly, three Phantom proteins were also identified in two insect viruses, and phylogenetic analysis suggests horizontal movement from insect to virus, providing a new line of evidence for the role of viruses in the horizontal transfer of DNA transposons in animals. Many of the Phantom transposases are predicted to harbor a FLYWCH domain in the amino terminus, which displays a WRKY–GCM1 fold characteristic of the DNA binding domain (DBD) of Mutator transposases and of several transcription factors. While some Phantom elements have terminal inverted repeats similar in length and structure to Mutator elements, some display subterminal inverted repeats (sub-TIRs) and others have more complex termini reminiscent of so-called Foldback (FB) transposons. The structural plasticity of Phantom and the distant relationship of its encoded protein to known transposases may have impeded the discovery of this group of transposons, and this suggests that structure in itself is not a reliable character for transposon classification.