Genomic atlas of the human plasma proteome
Benjamin B. Sun, Joseph C. Maranville, James E. Peters, David Stacey, James R. Staley, James Blackshaw,
Tao Jiang, Ellie Paige, Praveen Surendran, Clare Oliver-Williams, Mihir A. Kamat, Bram P. Prins,
Sheri K. Wilcox, Erik S. Zimmerman, An Chi, Narinder Bansal, Sarah L. Spain, Angela M. Wood,
Nicholas W. Morrell, John R. Bradley, Nebojsa Janjic, David J. Roberts, Willem H. Ouwehand, John A. Todd,
Nicole Soranzo, Dirk S. Paul, Caroline S. Fox, Robert M. Plenge, John Danesh*, Heiko Runz
& Adam S. Butterworth
Although plasma proteins have important roles in biological processes and are the direct targets of many drugs, the genetic
factors that control inter-individual variation in plasma protein levels are not well understood. Here we characterize the
genetic architecture of the human plasma proteome in healthy blood donors from the INTERVAL study. We identify 1,927
genetic associations with 1,478 proteins, a fourfold increase on existing knowledge, including trans associations for 1,104
proteins. To understand the consequences of perturbations in plasma protein levels, we apply an integrated approach
that links genetic variation with biological pathway, disease, and drug databases. We show that protein quantitative trait
loci overlap with gene expression quantitative trait loci, as well as with disease-associated loci, and find evidence that
protein biomarkers have causal roles in disease using Mendelian randomization analysis. By linking genetic factors to
diseases via specific proteins, our analyses highlight potential therapeutic targets, opportunities for matching existing
drugs with new disease indications, and potential safety concerns for drugs under development.
Plasma proteins have key roles in various biological processes, including
signalling, transport, growth, repair, and defence against infection.
These proteins are frequently dysregulated in disease and are important
drug targets. Identifying factors that determine inter-individual protein
variability should, therefore, furnish biological and medical insights.
Despite evidence for the heritability of plasma protein abundance,
however, systematic assessment of how genetic variation influences
plasma protein levels has been limited. Studies have examined
intracellular protein quantitative trait loci (pQTLs), but these studies have
tended to be small and involved cell lines rather than primary human tissues.
Here we create and interrogate a genetic atlas of the human plasma
proteome, using an expanded version of an aptamer-based multiplex
protein assay (SOMAscan) to quantify 3,622 plasma proteins in 3,301
healthy participants from the INTERVAL study, a genomic bioresource
of 50,000 blood donors from 25 centres across England recruited into
a randomized trial of blood donation frequency. We identify 1,927
genotype–protein associations (pQTLs), including trans-associated
loci for 1,104 proteins, providing new understanding of the genetic
control of protein regulation. Eighty-eight pQTLs overlap with disease
susceptibility loci, suggesting the molecular effects of disease-associated
variants. Using the principle of Mendelian randomization, we find
evidence to support causal roles in disease for several protein pathways,
and cross-reference our data with disease and drug databases to
highlight potential therapeutic targets.
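To illustrate the Mendelian randomization principle, the following is a minimal sketch of a single-instrument Wald ratio estimate, in which the effect of a pQTL on a disease outcome is divided by its effect on the protein level. It is not the analysis pipeline used in this study; the function name `wald_ratio`, its parameters, and the input values are hypothetical, and the standard error uses a first-order approximation that ignores uncertainty in the variant–protein effect.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Single-instrument MR estimate (Wald ratio).

    beta_exposure: per-allele effect of the variant on the protein level
    beta_outcome:  per-allele effect of the variant on the disease outcome
    se_outcome:    standard error of beta_outcome

    Returns (estimate, standard error, two-sided P value). The standard
    error is a first-order approximation that treats beta_exposure as
    known without error.
    """
    if beta_exposure == 0:
        raise ValueError("variant has no effect on the exposure")
    estimate = beta_outcome / beta_exposure
    se = abs(se_outcome / beta_exposure)
    z = estimate / se
    # Two-sided P value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return estimate, se, p_value


# Hypothetical summary statistics, for illustration only.
estimate, se, p_value = wald_ratio(
    beta_exposure=0.5, beta_outcome=0.1, se_outcome=0.02
)
print(f"effect per unit protein = {estimate:.3f} (s.e. {se:.3f}), P = {p_value:.2e}")
```

The estimate is interpretable as causal only under the usual instrumental-variable assumptions, in particular that the variant affects the outcome solely through the protein.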
Genetic architecture of the plasma proteome
We performed genome-wide testing of 10.6 million imputed autosomal
variants against levels of 2,994 plasma proteins in 3,301 individuals
of European descent (Methods, Extended Data Fig. 1). We
demonstrated the robustness of protein measurements in several
ways (Supplementary Note, Extended Data Fig. 2), including: highly
consistent measurements in replicate samples; temporal consistency
of protein levels within individuals over two years (Extended Data
Fig. 3b); and replication of known associations with non-genetic factors
(Supplementary Tables 1, 2). To assess potential off-target
cross-reactivity, we tested 920 aptamers (SOMAmers) for detection of proteins
with at least 40% sequence homology to the target protein (Methods).
Although 126 (14%) SOMAmers showed comparable binding
with a homologous protein (Supplementary Table 3), nearly half of
these were binding to alternative forms of the same protein.
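As an illustration of what testing a single variant against a single protein involves, the sketch below fits a simple linear regression of protein level on genotype dosage (0, 1 or 2 copies of the effect allele) and reports the per-allele effect with its standard error. This is a simplified assumption-laden example on simulated data, not the model specified in Methods: the function name `variant_association`, the allele frequency, the simulated effect size, and the absence of covariate adjustment are all choices made for illustration.

```python
import math
import random


def variant_association(dosage, protein):
    """Ordinary least squares of protein level on genotype dosage.

    Returns (beta, se): the per-allele effect and its standard error.
    """
    n = len(dosage)
    if n < 3 or n != len(protein):
        raise ValueError("need at least 3 paired observations")
    mean_x = sum(dosage) / n
    mean_y = sum(protein) / n
    x = [d - mean_x for d in dosage]
    y = [p - mean_y for p in protein]
    sxx = sum(v * v for v in x)
    if sxx == 0:
        raise ValueError("variant is monomorphic in this sample")
    beta = sum(a * b for a, b in zip(x, y)) / sxx
    # Residual variance with n - 2 degrees of freedom (intercept + slope).
    rss = sum((b - beta * a) ** 2 for a, b in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return beta, se


# Simulated data for illustration: 3,301 individuals, a variant with
# allele frequency 0.3, and a true per-allele effect of 0.4 on a
# standardized protein level.
rng = random.Random(0)
n = 3301
dosage = [(rng.random() < 0.3) + (rng.random() < 0.3) for _ in range(n)]
protein = [0.4 * d + rng.gauss(0, 1) for d in dosage]

beta, se = variant_association(dosage, protein)
z = beta / se
# Normal approximation to the two-sided P value (reasonable at this n).
p_value = math.erfc(abs(z) / math.sqrt(2))
print(f"beta = {beta:.3f}, s.e. = {se:.3f}, P = {p_value:.2e}")
```

A genome-wide analysis repeats this kind of test for every variant–protein pair, which is why a stringent significance threshold is needed.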
We found 1,927 significant (P < 1.5 × 10⁻¹¹) associations between
1,478 proteins and 764 genomic regions (Fig. 1a, Supplementary
Table 4, Supplementary Fig. 1, Supplementary Note Table 1), with
MRC/BHF Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK.
MRL, Merck & Co., Inc., Kenilworth, NJ, USA.
Heart Foundation Cambridge Centre of Excellence, Division of Cardiovascular Medicine, Addenbrooke’s Hospital, Cambridge, UK.
MRC Biostatistics Unit, University of Cambridge, Cambridge, UK.
National Centre for Epidemiology and Population Health, The Australian National University, Canberra, Australian Capital Territory, Australia.
Homerton College, Cambridge, UK.
Boulder, CO, USA.
Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge,
Division of Respiratory Medicine, Department of Medicine, University of Cambridge, Cambridge, UK.
NIHR Cambridge Biomedical Research Centre/BioResource, Cambridge University Hospitals, Cambridge, UK.
National Health Service (NHS) Blood and Transplant and Radcliffe Department of Medicine, NIHR Oxford Biomedical Research Centre, University of Oxford, John Radcliffe Hospital, Oxford, UK.
BRC Haematology Theme and Department of Haematology, Churchill Hospital, Oxford, UK.
Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK.
National Health Service (NHS) Blood and Transplant, Cambridge Biomedical Campus, Cambridge, UK.
Department of Human Genetics, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.
NIHR Blood and Transplant Research Unit in Donor Health and Genomics, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK.
JDRF/Wellcome Trust Diabetes and Inflammation Laboratory, Wellcome Trust Centre for Human Genetics, Nuffield Department of Medicine, NIHR Oxford Biomedical Research Centre, University of Oxford, Oxford, UK.
Department of Physiology and Biophysics, Weill Cornell Medicine–Qatar, Doha, Qatar.
Present address: Celgene Inc., Cambridge, MA, USA.
Present address: Biogen Inc., Cambridge, MA, USA.
These authors contributed equally: Benjamin B. Sun, Joseph C. Maranville, James E. Peters.
These authors jointly supervised this work: Heiko Runz, Adam S. Butterworth.
*e-mail: email@example.com; firstname.lastname@example.org
7 JUNE 2018 | VOL 558 | NATURE | 73
© 2018 Macmillan Publishers Limited, part of Springer Nature. All rights reserved.