Summary: Many genome-wide association studies and genome-wide screenings for gene–environment (GxE) interactions have been performed to elucidate the underlying mechanisms of human traits and diseases. When the analyzed outcome is quantitative, the overall contribution of identified genetic variants to the outcome is often expressed as the percentage of phenotypic variance explained. This is commonly done using individual-level genotype data but it is challenging when results are derived through meta-analyses. Here, we present an R package, 'VarExp', that allows for the estimation of the percentage of phenotypic variance explained using summary statistics only. It allows for a range of models to be evaluated, including marginal genetic effects, GxE interaction effects and both effects jointly. Its implementation integrates all recent methodological developments and does not need external data to be uploaded by users.

Availability and implementation: The R package is available at https://gitlab.pasteur.fr/statistical-genetics/VarExp.git.

Contact: vincent.laville@pasteur.fr

Supplementary information: Supplementary data are available at Bioinformatics online.

© The Author(s) 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

1 Introduction

Many genome-wide association studies (GWAS) or genome-wide screenings incorporating gene–environment (GxE) interactions (Aschard et al., 2012) have been performed to better understand underlying mechanisms of human traits and diseases. When the analyzed outcome is continuous, a commonly used measure to judge the overall impact of the significant associations is the percentage of phenotypic variance explained. A standard way of estimating this percentage is to compare the coefficients of determination between the models including or not the significantly associated variants and/or interactions. This requires individual genotype and phenotype data, which can be challenging in meta-analyses performed in big consortia as pooling data from multiple cohorts raises practical and ethical issues. However, an alternative is to use only GWAS or genome-wide GxE summary statistics. Recently, several methods (Pare et al., 2016; Shi et al., 2016) have been developed to estimate the variance explained by marginal genetic effects while taking into account linkage disequilibrium between variants, and addressing statistical issues related to finite sample size and single nucleotide polymorphism (SNP) correlation matrices. Yet, these works only focused on marginal genetic effects, while genome-wide GxE and joint effect GWAS are now commonly performed and face the same need. In this work, we address this gap by extending the methodology to GxE screening and implementing the R package VarExp to rapidly and easily estimate the percentage explained by variants and/or interactions of interest using only meta-analysis summary statistics from GWAS.

2 Materials and methods

2.1 Percentage of variance explained

Consider a set of K SNPs $(G_k)_{k=1 \ldots K}$, coded additively as {0, 1, 2}, and a quantitative outcome Y. The marginal genetic effect $\alpha_{G.k}$ of SNP $G_k$ is estimated in the marginal model:

$$Y = \alpha_0 + \alpha_{G.k} G_k + \varepsilon.$$

Shi et al. (2016) proposed a first naïve estimator to derive the variance explained by genetic effects, $f_G$, using summary statistics:

$$f_G = \alpha'^{T}_{G} R^{+} \alpha'_{G} / \mathrm{var}(Y)$$

where $\alpha'_{G} = (\alpha_{G.1}\sigma_1, \ldots, \alpha_{G.k}\sigma_k, \ldots, \alpha_{G.K}\sigma_K)$, $\sigma_k$ denotes the standard deviation of SNP $G_k$ and $R^{+}$ is the Moore–Penrose generalized inverse of the genotype correlation matrix. However, finite sample size implies statistical noise in both the effect sizes and the correlation matrix estimations, which can induce bias in the estimation of $f_G$. Shi et al. (2016) derived a general formula that addresses this issue:

$$f_G = \frac{N \alpha'^{T}_{G} R^{+} \alpha'_{G} - q}{(N - q)\,\mathrm{var}(Y)}$$

where N and q denote respectively the sample size and the rank of the correlation matrix.

Now consider an exposure E (either binary or quantitative). The main effect $\alpha_{G.k}$ of the SNP $G_k$ and the interaction effect $\alpha_{INT.k}$ can be estimated using a single-SNP model with an interaction term:

$$Y = \alpha_0 + \alpha_{G.k} G_k + \alpha_E E + \alpha_{INT.k} G_k E + \varepsilon.$$

We show in Supplementary Material that, when re-parameterizing the effect estimates of the above model to obtain parameters from a fully standardized model, the percentage of variance explained by interaction effects $f_I$ or jointly by genetic and interaction effects $f_{G+I}$ can also be derived using summary statistics only:

$$f_I = \alpha'^{T}_{INT} R^{+} \alpha'_{INT} / \mathrm{var}(Y)$$

$$f_{G+I} = f_G + f_I$$

where $\alpha'_{INT} = (\alpha_{INT.1}\sigma_1\sigma_E, \ldots, \alpha_{INT.k}\sigma_k\sigma_E, \ldots, \alpha_{INT.K}\sigma_K\sigma_E)$ and $\sigma_E$ is the standard deviation of E. In this model, $f_G$ is computed using effect sizes from the interaction model. However, for the reasons discussed above, we define our final estimators, $f_I$ and $f_{G+I}$, by applying the same corrections as proposed for the $f_G$ estimator by Shi et al. (see the corrected $f_G$ equation).

2.2 Estimating the genotype correlation matrix

When the genotype correlation is not available from the data, it can be estimated using genotype data from a reference panel such as the 1000 Genomes (Abecasis et al., 2012). We implemented a transparent function that derives this correlation matrix from 1000 Genomes Phase 3 data either through a web access for a small number of SNPs or from local data files for a larger number of SNPs, as computational time is dramatically reduced when querying local files (see Supplementary Material and Supplementary Fig. S3). To avoid matrix inversion issues, we also implemented an option to prune SNPs with perfect correlation of 1 with another SNP in the matrix.

3 Application example

In practice, application is performed in three main steps (see Supplementary Material and Supplementary Fig. S4): (i) estimating the SNP correlation matrix, (ii) computing mean and variance for both the outcome and the exposure in the pooled sample and (iii) finally, estimating the percentage of phenotypic variance explained by main genetic effects and/or interaction effects. To illustrate the performance of our package, we performed a simulation study (see Supplementary Material) comparing the adjusted coefficients of determination from regressions and the estimates obtained using VarExp across 1000 replicates. Figure 1 and Supplementary Figures S1 and S2 demonstrate the high accuracy of our estimator, with an intraclass correlation coefficient between the coefficients of determination and their estimations equal to 0.99, 0.98 and 0.99 for the marginal genetic effects, interaction effects and joint effects, respectively.

Fig. 1. Percentage of phenotypic variance explained using summary statistics (estimated) and individual-level data (observed) for (a) main genetic, (b) interaction and (c) joint effects. The line corresponds to y = x and ICC is the intraclass correlation coefficient.

4 Concluding remarks

In this work, we provide the R package VarExp to easily estimate the percentage of phenotypic variance explained by genetic effects, GxE interaction effects or their joint contribution using summary statistics only, making it straightforward in large-scale consortia. Importantly, several limitations of GxE screenings have previously been discussed (Aschard, 2016; Robinson et al., 2017; see also Supplementary Material) and have to be taken into account by users before applying our approach.

Acknowledgements

We gratefully acknowledge all contributors to the CHARGE Gene-Lifestyle Interactions Working Group.

Funding

This work was supported by the R01HL118305 grant from the NHLBI. H.A. was also supported by R21HG007687 from NHGRI. A.R.B. was supported by the Intramural Research Program of the National Human Genome Research Institute in the Center for Research in Genomics and Global Health (CRGGH, Z01HG200362).

Conflict of Interest: none declared.

References

Abecasis,G.R. et al. (2012) An integrated map of genetic variation from 1,092 human genomes. Nature, 491, 56–65.
Aschard,H. (2016) A perspective on interaction effects in genetic association studies. Genet. Epidemiol., 40, 678–688.
Aschard,H. et al. (2012) Challenges and opportunities in genome-wide environmental interaction (GWEI) studies. Hum. Genet., 131, 1591–1613.
Pare,G. et al. (2016) A method to estimate the contribution of regional genetic associations to complex traits from summary association statistics. Sci. Rep., 6, 27644.
Robinson,M.R. et al. (2017) Genotype-covariate interaction effects and the heritability of adult body mass index. Nat. Genet., 49, 1174–1181.
Shi,H. et al. (2016) Contrasting the genetic architecture of 30 complex traits from summary association data. Am. J. Hum. Genet., 99, 139–153.
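Illustrative sketch of the Section 2.1 estimators. The Python code below is a minimal sketch of how the naïve and finite-sample-corrected fractions could be computed from summary statistics. It is not the VarExp R interface: the function name `variance_explained`, its arguments and all numbers are hypothetical. As an assumption, the correction is applied to the variance-standardized quadratic form, (N f - q) / (N - q), which coincides with the corrected expression in Section 2.1 when var(Y) = 1. Effect sizes are assumed to be already re-parameterized as described in the Supplementary Material.

```python
import numpy as np


def variance_explained(alpha, sd_snp, R, var_y, n, sd_e=1.0):
    """Return (naive, corrected) fractions of var(Y) explained.

    Hypothetical helper, not part of VarExp.
    alpha  : per-SNP effect sizes (marginal or interaction)
    sd_snp : per-SNP genotype standard deviations (sigma_k)
    R      : SNP correlation matrix
    var_y  : variance of the outcome Y
    n      : sample size N
    sd_e   : standard deviation of the exposure E (keep 1.0 for marginal effects)
    """
    # Scale effects: alpha_k * sigma_k (times sigma_E for interaction effects).
    scaled = np.asarray(alpha, dtype=float) * np.asarray(sd_snp, dtype=float) * sd_e
    R = np.asarray(R, dtype=float)
    # The Moore-Penrose generalized inverse tolerates a singular R.
    R_pinv = np.linalg.pinv(R)
    q = int(np.linalg.matrix_rank(R))
    # Naive estimator: quadratic form divided by var(Y).
    naive = float(scaled @ R_pinv @ scaled) / var_y
    # Assumed form of the finite-sample correction (see lead-in).
    corrected = (n * naive - q) / (n - q)
    return naive, corrected


# Toy example: two SNPs with correlation 0.5 and a standardized outcome.
R_toy = [[1.0, 0.5], [0.5, 1.0]]
f_g_naive, f_g_corrected = variance_explained(
    alpha=[0.10, 0.08], sd_snp=[0.6, 0.7], R=R_toy, var_y=1.0, n=10_000
)
# Interaction effects use the same quadratic form with the extra sigma_E factor.
f_i_naive, f_i_corrected = variance_explained(
    alpha=[0.05, 0.02], sd_snp=[0.6, 0.7], R=R_toy, var_y=1.0, n=10_000, sd_e=0.5
)
print(f"f_G naive={f_g_naive:.5f} corrected={f_g_corrected:.5f}")
print(f"f_I naive={f_i_naive:.5f} corrected={f_i_corrected:.5f}")
```

Using the pseudo-inverse rather than a plain inverse means that two perfectly correlated SNPs contribute as a single SNP instead of causing an inversion failure, which is the situation the pruning option in Section 2.2 is meant to avoid.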
Bioinformatics – Oxford University Press
Published: May 3, 2018