Access the full text.
Sign up today, get DeepDyve free for 14 days.
S. Schuster (2008)
Next-generation sequencing transforms today's biologyNature Methods, 5
C. Sabatti, Gareth James (2005)
Bayesian sparse hidden components analysis for transcription regulation networksBioinformatics, 22 6
F. Spitz, E. Furlong (2012)
Transcription factors: from enhancer binding to developmental controlNature Reviews Genetics, 13
D. Marbach, T. Schaffter, C. Mattiussi, D. Floreano (2009)
Generating Realistic In Silico Gene Networks for Performance Assessment of Reverse Engineering MethodsJournal of computational biology : a journal of computational molecular cell biology, 16 2
Nynke Berkum, Erez Lieberman-Aiden, Louise Williams, Maxim Imakaev, A. Gnirke, L. Mirny, J. Dekker, E. Lander (2010)
Hi-C: A Method to Study the Three-dimensional Architecture of Genomes.Journal of Visualized Experiments : JoVE
Amartya Sanyal, B. Lajoie, Gaurav Jain, J. Dekker (2012)
The long-range interaction landscape of gene promotersNature, 489
Spitz (2012)
613Nat. Rev. Genet, 13
Chen (2016)
e65Nucleic Acids Res, 44
Lange (1989)
881J. Am. Stat. Assoc, 84
Xi Chen, J. Jung, Ayesha Shajahan-Haq, R. Clarke, I. Shih, Yue Wang, L. Magnani, Tian-Li Wang, J. Xuan (2015)
ChIP-BIT: Bayesian inference of target genes using a novel joint probabilistic model of ChIP-seq profilesNucleic Acids Research, 44
C. Angelini, Valerio Costa (2014)
Understanding gene regulatory mechanisms by integrating ChIP-seq and RNA-seq data: statistical solutions to biological problemsFrontiers in Cell and Developmental Biology, 2
Su-wen Wang, Hanfei Sun, Jian Ma, C. Zang, Chenfei Wang, Juan Wang, Q. Tang, Clifford Meyer, Yong Zhang, X. Liu (2013)
Target analysis by integration of transcriptome and ChIP-seq data with BETANature Protocols, 8
D. Venet, V. Detours, H. Bersini (2012)
A Measure of the Signal-to-Noise Ratio of Microarray Samples and Studies Using Gene CorrelationsPLoS ONE, 7
Qin (2014)
294Methods, 67
Chen (2013)
1347IEEE/ACM Trans. Comput. Biol. Bioinform, 10
Wang (2013)
2502Nat. Protoc, 8
X. Chen, J. Xuan, Chen Wang, A. Shajahan, R. Riggins, R. Clarke (2012)
Reconstruction of Transcriptional Regulatory Networks by Stability-Based Network Component AnalysisIEEE/ACM Transactions on Computational Biology and Bioinformatics, 10
K. Lange, R. Little, Jeremy Taylor (1989)
Robust Statistical Modeling Using the t DistributionJournal of the American Statistical Association, 84
Marbach (2009)
229J. Comput. Biol, 16
Gu (2012)
1990Bioinformatics, 28
V. Huynh-Thu, A. Irrthum, L. Wehenkel, P. Geurts (2010)
Inferring Regulatory Networks from Expression Data Using Tree-Based MethodsPLoS ONE, 5
Xiujun Zhang, Keqin Liu, Zhiping Liu, B. Duval, Jean-Michel Richer, Xingming Zhao, Jin-Kao Hao, Luonan Chen (2013)
NARROMI: a noise and redundancy reduction technique improves accuracy of gene regulatory network inferenceBioinformatics, 29 1
Jinghua Gu, J. Xuan, R. Riggins, Li Chen, Y. Wang, R. Clarke (2012)
Robust identification of transcriptional regulatory networks using a Gibbs sampler on outlier sum statisticBioinformatics, 28 15
A.I. Ramos, S. Barolo (2013)
Low-affinity transcription factor binding sites shape morphogen responses and enhancer evolutionPhilos. Trans. R. Soc. Lond. B Biol. Sci, 368
Karlebach (2008)
770Nat. Rev. Mol. Cell. Biol, 9
D. Phanstiel, A. Boyle, Nastaran Heidari, M. Snyder (2015)
Mango: a bias-correcting ChIA-PET analysis pipelineBioinformatics, 31 19
Bo Li, Colin Dewey (2011)
RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genomeBMC Bioinformatics, 12
Chen (2007)
R4Genome Biol, 8
M. Love, W. Huber, S. Anders (2014)
Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2Genome Biology, 15
Zhao (2005)
D103Nucleic Acids Res, 33
Venet (2012)
e51013.PLoS One, 7
Guang Chen, S. Jensen, C. Stoeckert (2007)
Clustering of genes into regulons using integrated modeling-COGRIMGenome Biology, 8
Fang Zhao, Z. Xuan, Lihua Liu, Michael Zhang (2004)
TRED: a Transcriptional Regulatory Element Database and a platform for in silico gene regulation studiesNucleic Acids Research, 33
Gelman (1992)
457Stat. Sci, 7
A. Gelman, D. Rubin (1992)
Inference from Iterative Simulation Using Multiple SequencesStatistical Science, 7
Angelini (2014)
51.Front. Cell. Dev. Biol, 2
BIOINFORMATICS APPLICATIONS NOTE doi:10.1093/bioinformatics/btm311 Genome analysis beadarray: R classes and methods for Illumina bead-based data
Liao (2003)
15522Proc. Natl. Acad. Sci. USA, 100
Weaver (1999)
112Pac. Symp. Biocomput, 4
Huynh-Thu (2010)
e12776.PLoS One, 5
Guoliang Li, Liuyang Cai, Huidan Chang, Ping Hong, Qiangwei Zhou, E. Kulakova, N. Kolchanov, Y. Ruan (2014)
Chromatin Interaction Analysis with Paired-End Tag (ChIA-PET) sequencing technology and applicationBMC Genomics, 15
M.J. Dunning (2007)
beadarray: R classes and methods for Illumina bead-based dataBioinformatics, 23
D. Weaver, C. Workman, G. Stormo (1998)
Modeling Regulatory Networks with Weight MatricesPacific Symposium on Biocomputing. Pacific Symposium on Biocomputing
Love (2014)
550.Genome Biol, 15
Sanyal (2012)
109Nature, 489
Phanstiel (2015)
3092Bioinformatics, 31
Li (2011)
323.BMC Bioinformatics, 12
Li (2014)
S11BMC Genomics, 15 (Suppl. 12)
Sabatti (2006)
739Bioinformatics, 22
Ramos (2013)
20130018.Philos. Trans. R. Soc. Lond. B Biol. Sci, 368
J. Qin, Yaohua Hu, Feng Xu, H. Yalamanchili, Junwen Wang (2014)
Inferring gene regulatory networks by integrating ChIP-seq/chip and transcriptome data via LASSO-type regularization methods.Methods, 67 3
Zhang (2013)
106Bioinformatics, 29
Schuster (2008)
16Nat Methods, 5
Servant (2015)
259Genome Biol, 16
Dunning (2007)
2183Bioinformatics, 23
D. Cusanovich, Bryan Pavlovic, J. Pritchard, Y. Gilad (2013)
The Functional Consequences of Variation in Transcription Factor BindingPLoS Genetics, 10
Andrea Ramos, S. Barolo (2013)
Low-affinity transcription factor binding sites shape morphogen responses and enhancer evolutionPhilosophical Transactions of the Royal Society B: Biological Sciences, 368
J.C. Liao (2003)
Network component analysis: reconstruction of regulatory signals in biological systemsProc. Natl. Acad. Sci. USA, 100
N. Servant, N. Varoquaux, B. Lajoie, E. Viara, Chong-Jian Chen, Jean-Philippe Vert, E. Heard, J. Dekker, E. Barillot (2015)
HiC-Pro: an optimized and flexible pipeline for Hi-C data processingGenome Biology, 16
Guy Karlebach, R. Shamir (2008)
Modelling and analysis of gene regulatory networksNature Reviews Molecular Cell Biology, 9
Cusanovich (2014)
e1004226PLoS Genet, 10
Zhang (2008)
R137Genome Biol, 9
Yong Zhang, Tao Liu, Clifford Meyer, J. Eeckhoute, David Johnson, B. Bernstein, C. Nusbaum, R. Myers, Myles Brown, Wei Li, X. Liu (2008)
Model-based Analysis of ChIP-Seq (MACS)Genome Biology, 9
J. Liao, Riccardo Boscolo, Young-Lyeol Yang, L. Tran, C. Sabatti, V. Roychowdhury (2003)
Network component analysis: Reconstruction of regulatory signals in biological systemsProceedings of the National Academy of Sciences of the United States of America, 100
Motivation: NGS techniques have been widely applied in genetic and epigenetic studies. Multiple ChIP-seq and RNA-seq profiles can now be jointly used to infer functional regulatory networks (FRNs). However, existing methods suffer from either oversimplified assumption on transcription factor (TF) regulation or slow convergence of sampling for FRN inference from large-scale ChIP- seq and time-course RNA-seq data. Results: We developed an efficient Bayesian integration method (CRNET) for FRN inference using a two-stage Gibbs sampler to estimate iteratively hidden TF activities and the posterior probabil- ities of binding events. A novel statistic measure that jointly considers regulation strength and regression error enables the sampling process of CRNET to converge quickly, thus making CRNET very efficient for large-scale FRN inference. Experiments on synthetic and benchmark data showed a significantly improved performance of CRNET when compared with existing methods. CRNET was applied to breast cancer data to identify FRNs functional at promoter or enhancer regions in breast cancer MCF-7 cells. Transcription factor MYC is predicted as a key functional factor in both promoter and enhancer FRNs. We experimentally validated the regulation effects of MYC on CRNET-predicted target genes using appropriate RNAi approaches in MCF-7 cells. Availability and implementation: R scripts of CRNET are available at http://www.cbil.ece.vt.edu/ software.htm. Contact: xuan@vt.edu Supplementary information: Supplementary data are available at Bioinformatics online. 1 Introduction 2008). In genetic and epigenetic studies, gene transcription is regu- Next generation sequencing (NGS) technology continues to become lated through the integrated action of many cis-regulatory elements, ever more cost-effective. The era of ‘big data’, with large data sets of including promoter-proximal bindings as well as various distal cis- high quality and higher resolution, has clearly arrived (Schuster, regulatory modules functioning at enhancers (Spitz and Furlong, V The Author(s) 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com 1733 1734 X.Chen et al. 2012). Promoter focused studies have shown that on average only regulatory NETwork inference. We model functional bindings as 15% of target genes predicted from ChIP-seq data of a single TF are Bernoulli random variables with prior knowledge of physical bind- significantly differentially expressed when this TF is knocked down ings predicted from ChIP-seq data. CRNET uses a hybrid model of (Chen et al., 2016; Cusanovich et al., 2014). A large proportion of Gibbs sampling and linear regression to predict functional bindings physical bindings are either not functional or do not act alone. with a significantly improved sampling efficiency. Several desirable Given a large number of TFs, it is not practical to knock down TFs properties of the proposed CRNET method are as follows: (i) using individually. Therefore, computational efforts for inference of func- a Gibbs sampling framework, in each round CRNET only samples a tional regulatory networks (FRNs) play an important role in large- subset of physical bindings as potentially functional; (ii) using linear scale TF-gene regulation analysis (Angelini and Costa, 2014). Prior regression model, regulation strength of each sampled binding and binding information can be obtained from static ChIP-seq data, the the regression error are jointly modeled as a variable following a condition of which is similar to the ‘0’ time point when we start to Student’s t-distribution and a logistic function is then used to trans- generate time-course gene expression data with a certain stimulus. fer the Student’s t-distributed variable to a probability; (iii) unlike Integrating prior binding information with time-course gene expres- conventional methods using known TF expression as the activated sion data, we can infer which binding sites are functional during the form of TFs (TFAs), CRNET models each TFA as a variable and use cell response to the specific stimulus. FRN inference is a cost- a two-stage Gibbs sampling procedure to sample functional bindings effective and comprehensive approach to study the joint regulatory and TFAs iteratively. As an extension, if cell type-specific 3D chro- effects of multiple TFs. A variety of omics data types must be inte- matin interaction data are available [i.e. ChIA-PET (Li et al., 2014) grated, analyzed and interpreted (Angelini and Costa, 2014; Chen or Hi-C (van Berkum et al., 2010)], CRNET can also integrate prior et al., 2013; Karlebach and Shamir, 2008). bindings observed from enhancer regions with target gene expres- Early attempts for FRN inference solely used gene expression sion data to infer distal and functional bindings. data (Huynh-Thu et al., 2010; Zhang et al., 2013). With the accu- We show the advantages of CRNET on synthetic datasets with mulation of binding information like binding motifs or ChIP-chip/ respect to different experimental settings of noise level and false con- seq data, integrative approaches are now being developed. For nection rate. The performance of CRNET is further benchmarked example, Sabatti et al. proposed a Bayesian network component on DREAM4 in silico regulatory networks with time-course gene analysis framework (BNCA) approach to simultaneously infer pro- expression data by varying the proportions of false positive/false tein activities of TFs (TFAs) and functional bindings by integrating negative perturbations in the prior networks. CRNET achieves a ChIP-chip and microarray gene expression (Sabatti and James, significant improvement on functional bindings prediction over 2006). Chen et al. developed a similar Bayesian hierarchical model existing methods. To demonstrate the capability of CRNET on namely COGRIM, to infer regulatory gene clusters (Chen et al., large-scale FRN inference, we apply CRNET to K562 cell line 2007). Large-scale ChIP-seq data in ENCODE database makes it data and GM12878 data, respectively. In terms of sampling possible to predict FRNs on many different cell types. Wang et al. speed, CRNET is five times faster than the conventional Bayesian developed a BETA package for functional gene prediction by inte- approaches. Specific for the K562 study, we validate functional grating single TF ChIP-seq data with target gene RNA-seq data bindings of three selected TFs as ATF3, EGR1 and SRF. Compared (Wang et al., 2013). Qin et al. developed an LASSO based integra- to competing methods, a higher proportion of functional genes pre- tive approach (Qin et al., 2014). dicted by CRNET are validated. Finally, we apply CRNET to breast The importance of statistical integration of binding signals and cancer MCF-7 data for FRN inference at promoter or enhancer gene expression data in understanding gene regulatory mechanisms regions. MYC is predicted as the most dominant TF and also a posi- was discussed by Angelina & Costa (Angelini and Costa, 2014). tive regulator in both FRNs. We transfect MCF-7 cells with siMYC They proposed Bayesian integration models for causal inference of for 24 h and successfully validate the positive regulatory effects of genome-wide FRNs. BNCA and COGRIM are two existing MYC on a significant set of target genes. Bayesian approaches using Gibbs sampling to infer functional bind- ings. Both tools require sampling distributions for binding occur- rence and non-occurrence (each is a Gaussian distribution). This 2 Materials and methods step increases their computational cost greatly, since in each round CRNET is designed to use time-course RNA-seq data for the refine- the tools must sample two different distributions under two hypoth- ment of FRNs from initial candidate networks that can be constructed eses. From our perspective, the key parameter of interest is the prob- from ChIP-seq data. Specific for FRN inference at enhancer regions, ability for binding occurrence. When the number of TFs is small, a additional prior information of enhancer-promoter interactions is linear regression model can be used to infer a FRN very efficiently needed, which can be obtained from cell type-specific ChIA-PET or (Liao et al., 2003). Correlation coefficients of individual bindings Hi-C data. In Figure 1, using Gibbs sampling, CRNET iteratively and the regression error for each gene are reported and then jointly samples hidden TFAs by assuming Gaussian random process, calcu- modeled as a variable following a Student’s t distribution (Lange lates the significance of regulatory strength for each binding based on et al., 1989). The false discovery rate (FDR) of each binding can be Student’s t statistics, and samples each functional binding as a calculated from the Student’s t distribution. Only if the FDR is lower Bernoulli random variable according to the conditional probability. than a threshold will the binding be accepted. This evaluation is After sufficient rounds of sampling, CRNET reports a posterior prob- very efficient because only one hypothesis (binding occurrence) is ability (sample frequency) for each binding that indicates the possibil- tested. With the accumulation of ChIP-seq data, the number of bind- ity that this connection is functional. A more detailed workflow of ings on a gene is usually larger than the number of gene expression CRNET is shown in Supplementary Figure S1. observations. Such a linear regression model cannot be directly used to infer large-scale FRNs. Efficient integrative methods are in need to overcome above limitations. 2.1 Prior binding network construction We have developed a novel Bayesian method, namely CRNET, Given promoter or enhancer annotation files and multiple TFs to integrate ChIP-seq and time-course RNA-seq data for functional ChIP-seq data, a prior binding matrix can be constructed using a CRNET: an efficient sampling approach for FRN inference 1735 same number of rows as B and each row y ¼ y ; .. . ; y ; .. . ; y j;1 j;m j;M represents target gene expression under M conditions for the j-th gene or enhancer-gene loop in B. For simplicity, we treat each row of Y or B as a ‘gene’ and infer functional bindings for it. 2.3 Integrative modeling in the CRNET approach In general, gene transcription is regulated by a set of TFs, whose activation via post-translational modification controls gene expres- sion. The activated form of a TF [modeled as TF activity (TFA)], rather than its expression level, controls promoters or enhancers and dictates the physiological state of the cell (Liao et al., 2003). Correspondingly, how each promoter or enhancer receives the signal reflects the relative contribution of each TF to the expression of the Fig. 1. Flowchart of CRNET for FRN inference. CRNET is built on a twostage target gene, which can be quantified as regulation strength. For the Gibbs sampling procedure: (1) sampling hidden transcription factor activities j-th candidate target gene, we model gene expression y using a log- (TFAs) and (2) sampling binding connections linear model (Liao et al., 2003) as follows: probabilistic method, ChIP-BIT2 (an extended version of ChIP-BIT (Chen et al.,2016) to detect binding sites at enhancer and promoter y ¼ z a x þ g þ n ; (1) j;t j;t t j j j t¼1 regions, respectively). An advantage of using ChIP-BIT2 is that we can detect both strong and weak bindings. A weak binding refers to a where x ¼ x ; .. . ; x ; .. . ; x is a TFA vector and each x t t;1 t;m t;M t;m binding with a relatively low read intensity in the sample ChIP-seq represents the activity of t-th TF under m-th sample. We define a data but that is still significantly higher than that of the matched input TFA matrix X ¼½ x ; ... ; x ; .. . ; x . z and a represent the bind- 1 t T j;t j;t data. As demonstrated in (Chen et al.,2016)and (Ramos and Barolo, ing state and associated regulation strength for the connection 2013), weak bindings at promoter and enhancer regions could both between t-th TF and j-th gene. Regulation strength a is unknown j;t result in functional regulation of target genes. More details about and needs to be estimated for z ¼ 1(a is 0 when z ¼ 0). j;t j;t j;t ChIP-BIT2 can be found in Supplementary Material S2. Accordingly, we define a binding matrix Z ¼ z ; .. . ; z ; .. . ; z and 1 j J For promoter study, a prior binding matrix between TFs and tar- a regulation strength matrix A ¼ a ; ... ; a ; .. . ; a . g represents 1 j J get genes can be directly constructed since each promoter region can the basal expression at ‘0’ time point as y and n ¼ j;0 j be uniquely mapped to each gene. In this matrix, each row repre- n ; .. . ; n ; .. . ; n is a Gaussian additive noise vector (with j;1 j;m j;M sents a unique gene and each column represents a TF. For enhancer mean zero and variance r ). study, an enhancer-promoter (gene) loop map (provided by 3D chro- matin interactions) is needed to associate distal TF bindings at 2.3.1 Sampling hidden TFAs enhancer regions with target genes (Sanyal et al., 2012). 3D chroma- We model each TFA vector x as a Gaussian random process where tin interactions can be extracted from Hi-C data (Servant et al., the prior distribution of each variable x is a Gaussian distribution 2015) or ChIA-PET data (Phanstiel et al., 2015). More details t;m with mean zero and variance r (Sabatti and James, 2006). The joint about how to construct the enhancer-gene loop map can be found probability of TFA matrix X, given gene expression matrix Y, regula- in Supplementary Material S3.1. Using the map and distal bindings at enhancer regions, we can build a prior binding matrix, tory strength matrix A, and binding matrix Z, is defined as follows: too, whereas each row represents a unique enhancer-gene loop PðÞ XjY;A;Z/ PðÞ YjX;A;ZPðÞ X (enhancer: gene). Note that the number of rows may be larger 0 1 than the number of actual target genes or enhancers because one Q Q Q 1 1 1 1 @ A / exp y a z x g x : j;m j;t j;t t;m t j m j t;m 2 2 enhancer may regulate multiple genes through different loops and rr 2r 2r t x one gene may also be regulated by more than one enhancer. We (2) define a prior binding matrix B ¼ b ; .. . ; b ; .. . ; b , where each 1 j J row b ¼ b ; .. . ; b ; .. . ; b represents prior binding probabil- j j;1 j;t j;T We use Gibbs sampling to sample x according to its conditional t;m ities of in total T TFs. If ChIP-seq data are not available, B can be probability as defined in Equation (3), which is also a Gaussian constructed as a binary matrix from other regulatory network data- distribution: bases [e.g. TRED (Zhao et al., 2005) and RegNetwork (http://www. 1 1 1 regnetworkweb.org/)]. 0 pffiffiffiffiffiffi Px jy ; a ; x / exp x l 0 ; (3) t;m j;m j t 6¼t;m t;m x r 0 2r 2p x 0 2.2 Time-course RNA-seq data processing where the mean and variance are defined in following equations: Considering gene expression data quality, we infer FRNs by inte- "# ! X X grating prior binding matrix B with time-course RNA-seq data. For r 1 0 x 0 0 0 l ¼ y a z x g a z ; (4) x j;m j;t j;t t ;m j j;t j;t each RNA-seq sample, we use RSEM (v1.3.0) (Li and Dewey, 2011) r J j t 6¼t to estimate the transcripts per million (TPM) value of each gene. We then take a log2 transform of the TPM value. Candidate target genes 2 2 r Jr 0 x r ¼ : (5) should have a significant dynamic expression change (i.e. at least 2 2 2 2 r a z þ r J x j;t j;t at one time point it is differentially expressed as compared to the basal expression (‘0’ time point)). Considering computational cost, 0 0 pre-selection of biologically meaningful genes is preferred. We More details about the derivation of l and r can be found from hi x define a gene expression matrix Y ¼ y ; .. . ; y ; .. . ; y . Y has the Supplementary Material S3.2. If there are two or more gene 1 j J 1736 X.Chen et al. expression replicates under the same time point, we provide an a (t 6¼ t). In practice, we iteratively sample a new z ðÞ i þ 1 in the j;t j;t option to assume TFA the same under the same time point and use (i þ 1)-th round of sampling as follows: the mean value of gene expression replicates to estimate l . Pz ðÞ i þ 1jY; X; z ðÞ i þ 1 ; ... ; z ðÞ i þ 1 ; 1; z ðÞ i ; z ðÞ i : (9) j;t j;1 j;t1 j;tþ1 j;T 2.3.2 Sampling binding connections To ensure that the results from Equation (9) are not dependent of It is difficult to model directly the distribution of regulation strength. the TF order, we shuffle the TF order (columns of matrix B) in each In both BNCA and COGRIM, regulation strength is assumed to fol- round of sampling. After accumulating enough samples, we select low Gaussian distribution but the veracity of this assumption is not functional bindings according to the sampling frequency of each fully justified. For target gene identification, our final goal is to binding, which represents the posterior probability of the binding. determine functional bindings rather than sample individual regula- We also run multiple times of CRNET with different initial states tion strength. Therefore, we calculate the value of regulation and check algorithm convergence on variable estimation (Gelman and Rubin, 1992). More details about convergence check can be strength using linear regression directly and determine whether a found in Supplementary Material S3.4. binding is likely to be functional (a true event). The basic idea is to use hypothesis testing: H (null hypothesis)—no binding (regulation strength ¼ 0); H (alternative hypothesis)—functional binding (regu- 3 Results lation strength 6¼ 0). For some genes, the number of prior bindings may be larger than 3.1 Benchmarking robustness using simulated and the number of gene expression samples (M), especially when given DREAM regulatory networks tens or hundreds of TFs’ ChIP-seq data. Binding signal differences as We first simulated a weighted regulatory network with 200 genes reflected by their prior probabilities are considered during the sam- and 20 TFs and multiple time-course gene expression data sets for performance evaluation (Supplementary Figs S3 and S4). We aimed pling process. To meet the requirement of linear regression, in each to evaluate the effects of false connections in the prior binding round of sampling, for each of such genes, we randomly select at matrix, the noise power of gene expression data, and the number of most M bindings according to their prior probabilities in B and gene expression samples on final FRN inference. We mainly simu- then, evaluate their functional effects (regression performance). lated two different cases as: in Case 1, we simulated TFAs for indi- Those bindings with higher prior probabilities will be selected more vidual TFs using Gaussian random process with zero mean and unit frequently. If binary prior is given, for each gene uniform prior will variance under 20 time points; gene expression was then simulated be assigned to all candidate bindings. Whether to accept each based on the log-linear model introduced in Equation (1) with simu- selected binding is determined by its regression performance on the lated TFA and regulation strength; in Case 2, we only generated 10 target gene. To evaluate the functional effects of selected bindings time-course samples. Case 2 is more challenging because the number on j-th gene, we estimate a directly using a least-squares method as: of TFs is larger than the number of gene expression samples. In each case, we varied the false-positive rate (defined by false-positive inter- a ¼ XX Xy g : (6) j j actions/true interactions) from 5% to 25% in the prior binding net- work with a random prior probability between 0.5 and 1. We also We define a t-scorefa to help evaluate if z is functional.fa j;t j;t j;t varied the signal-to-noise ratio (SNR) of the gene expression data (as defined in the following equation) takes into account both regu- from 6 to 3 dB. More details about data simulation can be found lation strength and regression error, which follows a Student’s t dis- from Supplementary Material S4.1 and S4.2. tribution under H with a degree-of-freedom of M 1 z (Gu 0 j;t For performance comparison, besides existing Bayesian models et al., 2012): such as BNCA and COGRIM, we also included a LASSO-based pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi a M 1 z j;t j;t integrative approach (Qin et al., 2014) and two expression-based fa ¼ rffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi : (7) j;t P P P methods [i.e. NARROMI (Zhang et al., 2013) and GENIE3 y a z x g = x j;m j;t j;t t;m j m t m t;m (Huynh-Thu et al., 2010)]. F-measure (2/(1/precision þ 1/recall)) of the competing methods was presented in Figure 2. In Case 1, the With a significance level of a (e.g. 0.05), we can make a decision on total number of expression samples is larger than the number of can- any fa t with a a (¼5%) risk to be a true binding (where t is j;t a a didate TFs. Most integrative approaches work robustly against the t-statistic value determined by the significance level a). We fur- false-positive connections in the initial network, as shown in ther transform fa to a probability with which to generate Gibbs j;t Figure 2A. CRNET has an improved performance over traditional samples. We propose to use a logistic function as defined in the fol- Bayesian approaches. When the SNR of gene expression data lowing equation as a mapping function to calculate the probability decreases, performance of the competing methods degrades dramati- of a t-score (Weaver et al., 1999). cally, as shown in Figure 2B. Robustness of BNCA and COGRIM against expression data noise is lower than CRNET. CRNET uses a Pz ¼ 1jy ; X ¼ ; (8) j;t joint measurement of regulation strength and regression errors so 1 þ exp b jfa jþ b 1 j;t 0 that overfitting can be effectively avoided as compared with existing where b and b are trained following a procedure as described in 1 0 Bayesian methods. CRNET also estimates hidden TFAs instead of Supplementary Material S3.3. A logistic function curve with trained using the noisy TF expression (as in COGRIM and LASSO). In Case b and b is shown in Supplementary Figure S2. 2, CRNET achieves the best performance among competing meth- 1 0 According to the conditional probability in Equation (8), we can ods, as shown in Figure 2C and D. In this case, CRNET’s robustness sample z by either accepting or rejecting z ¼ 1. From Equation to network false connections over the other competing methods is j;t j;t (7) it can be seen thatfa calculation for z ¼ 1 depends on other more obvious when the number of gene expression samples is small. j;t j;t CRNET: an efficient sampling approach for FRN inference 1737 AB which are used in DREAM4 challenge (Marbach et al., 2009). We 0.9 0.85 first tested the robustness of CRNET on FRN inference by adding 0.8 false-positive or -negative edges to the benchmark networks. 0.8 0.75 A false-negative edge is a ‘true’ binding in the benchmark network 0.7 0.7 but ‘missed’ by the prior binding matrix. We varied the false- 0.65 CRNET CRNET BNCA BNCA 0.6 COGRIM positive rate from 5% to 100% or the false-negative rate (defined by COGRIM 0.6 LASSO LASSO 0.55 NARROMI NARROMI false-negative interactions/true interactions) from 5% to 50% in the GENIE3 GENIE3 0.5 0.5 prior binding matrices. Box plots of F-measures of CRNET under 0.05 0.10 0.15 0.20 0.25 63 0 -3 False positive rate SNR(dB) different rates of false positive/false negative perturbations were pre- CD sented in Figure 3A or B. It can be found from Figure 3A that when 0.85 0.85 the false positive rate of the prior matrix is under 40%, CRNET can 0.8 0.8 0.75 0.75 achieve an average Fmeasure higher than 0.8. To achieve a similar 0.7 0.7 performance using a prior network with false negative interactions, 0.65 0.65 CRNET CRNET as shown in Figure 3B, the rate should be below 30%. BNCA BNCA 0.6 COGRIM 0.6 COGRIM We further tested competing methods using the same benchmark LASSO LASSO NARROMI 0.55 0.55 NARROMI GENIE3 GENIE3 networks and time-course gene expression data, too. Here, we only 0.5 0.5 63 0 -3 0.05 0.10 0.15 0.20 0.25 add false positive interactions (30%) to the prior networks. GENIE3 False positive rate SNR(dB) is the winner of DREAM4 challenge; similar to NARROMI, it uses Fig. 2. F-measure performance comparison of competing methods using syn- gene expression data as the only input. For a fair comparison, we thetic data with varying false positive connections and noise levels. (A) Case only examined the performance of NARROMI and GENIE3 by 1 with different FPRs; (B) Case 1 with different SNRs; (C) Case 2 with different focusing on observed interactions in the prior binding network. As FPRs; (D) Case 2 with different SNRs shown in Figure 3C, F-measures are presented in box plots for all competing methods. CRNET provides the best performance and compared with BNCA or COGRIM, it converges much faster (as shown in Fig. 3D, E and F). It can be found from Figure 3C that integrative methods using prior binding information with a reason- able false connection rate and gene expression data can provide improved performance over traditional methods using gene expres- sion data only. 3.2 Benchmarking performance using hundreds of TFs in K562 and GM12878 cells from ENCODE We continued examination of CRNET efficiency on large-scale FRN prediction with hundreds of TFs. ChIP-seq data of K562 and GM12878 cells were downloaded from the ENCODE database. Matched time-course gene expression datasets were downloaded from the GEO database (with access numbers: GSE1036 and GSE51709) for K562 and GM12878 cells, respectively. We esti- mated the SNR (signal-to-noise ratio) of each gene expression data- set using SNAGEE (Venet et al., 2012). As shown in Supplementary Tables S1 or S3, using baseline expression at ‘0’ time point as con- trol (0 dB), the average SNR is 2.82 dB (or 2.04 dB) for K562 (or GM12878) data. Prior binding matrix construction and candidate gene selection can be found from Supplementary Material S5.1. Prior binding matrixes and gene expression data can be found in Supplementary Tables S2 and S4. Heatmaps of gene expression are shown in Supplementary Figure S7. In total, we selected 1351 candi- date genes and 228 TFs to infer FRNs in K562 cells; 925 genes and 122 TFs for FRN inference in GM12878 cells. Fig. 3. Performance comparison using DREAM 4 in silico benchmark net- We also applied competing methods to each prior binding net- works. (A) F-measure of CRNET by adding 10–100% false positive edges; (B) work and the matched gene expression data. Due to the high density F-measure of CRNET by deleting 10–50% true edges (false negative); (C)A of the prior networks (Supplementary Fig. S6), BNCA, a method box plot of F-measure for competing methods (after 1000 rounds of Gibbs sampling for Bayesian methods); (D), (E) and (F) box plots of CRNET, BNCA exhaustively searching and testing TF combinations, does not work. and COGRIM F-measure performance during the sampling process COGRIM and LASSO require sparse binary binding events as input. For a fair comparison, we set the prior probability threshold as 0.85 Furthermore, as shown in Supplementary Figure S5, CRNET con- and selected binary binding events from 1348 genes and 173 TFs in verges much faster than competing Bayesian methods. K562 cells and 877 genes and 80 TFs in GM12878 cells for further For further evaluation of CRNET’s performance with non-ChIP- analysis. GENIE3 uses gene expression data only and theoretically seq prior binary information, five benchmark regulatory networks has no limitations on the number of candidate TFs. After running and the matched time-course gene expression data were downloaded 1000 rounds of Gibbs sampling, the average speed of CRNET is 498 from (https://www.synapse.org/#! Synapse: syn3049712/files/), or 58 sec/round for K562 or GM12878 FRN inference (R 3.3.1, F-measure F-measure F-measure F-measure 1738 X.Chen et al. MAC OS X, CPU 2.8 GHz, RAM 16GB). While the sampling speed MCF-7 time-course RNA-seq dataset was downloaded from the of COGRIM is 2611 or 322 sec/round under the same condition. GEO database (accession number: GSE62789) and further proc- CRNET is five times faster than COGRIM. Learned distributions of essed using RSEM. (Supplementary Table S6). We estimated the sampling frequency of CRNET and CORGIM are depicted in SNR of each RNA-seq sample using SNAGEE. As shown in Supplementary Figure S8. It can be found that the sample frequency Supplementary Table S7, the average SNR is 2.83 dB. Candidate tar- of CRNET follows a bimodal distribution. It is easy to define the get gene selection can be found from Supplementary Material S6.1. cut-off threshold (0.6 in this case) and further extract confident A heatmap of gene expression is shown in Supplementary Figure S12. 39 TFs ChIP-seq data were downloaded from ENCODE or edges. COGRIM tested two hypotheses (functional binding vs. non- GEO database (Supplementary Table S8) and processed using ChIP- functional binding) and only recorded samples meeting the require- BIT2. More details about data processing can be found from ment of ‘functional binding’ hypothesis. Its sampling frequency dis- Supplementary Material S6.2 To associate prior bindings at the tribution is a one-component Gaussian like distribution. Similarity enhancer regions with target genes, 3 D chromatin interactions were of CRNET-estimated TFAs and original mRNA expression for each extracted from a set of ECNODE MCF-7 ChIA-PET data using TF was checked using Pearson correlation coefficient. As shown in Mango. In total, we selected 464 genes for promoter FRN inference; Supplementary Figure S9, there is no clear correlation between them 1050 enhancers and 318 genes (1122 loops) for enhancer FRN infer- and for a number of TFs, the correlation is negative. ence (Supplementary Material S6.2). A majority of enhancers have Specific for the FRN inferred using K562 data we validated only one target gene but on average, each gene is regulated by four functional target genes for three selected TFs: ATF3, EGR1 and enhancers. Prior binding matrixes, enhancer-gene loops and candi- SRF. For each TF, a ‘true’ target gene is defined as: (i) there should date gene expression data were provided in Supplementary Tables be at least one functional binding site at promoter region; (ii) this S9 and S10. gene should be significantly differentially expressed when the TF is For comparison, we also used MACS2 (v2.1.0) (Zhang et al., knocked down. We downloaded RNA-seq data with shRNA TF 2008) to call genome-wide peaks using the same ChIP-seq dataset knockdown for each specific TF (GEO access number: GSE33816). and then, mapped peaks to candidate gene promoter or enhancer For each TF, two RNA-seq replicates were generated under control regions. As shown in Supplementary Figure S13, most (94%) or treatment conditions (Vehicle vs. shRNA). We applied RSEM to MACS2 bindings at gene promoter regions were captured by ChIP- individual RNA-seq samples and estimated read counts and TPM BIT2 and 78% of them had a ChIP-BIT2-estimated probability values of each gene across all samples. The knockdown efficiency larger than 0.85 (the default threshold of ChIP-BIT2 for peak pre- (differential expression of each TF) is shown in Figure 4A. We then diction). For those bindings predicted by ChIP-BIT2 only, 58% used DeSeq2 (Love et al., 2014) to identify differentially expressed of them still had a ChIP-BIT2-estimated probability over 0.85. target genes with a q value cutoff as 0.05. In total, we identified Obviously, there were a number of weak bindings missed by 1133 differential genes for ATF3, 893 genes for EGR1 and 1011 MACS2. Prior bindings detected from enhancer regions were shown genes for SRF, whose expression patterns are shown in in Supplementary Figure S14. About 60% of MACS2 bindings were Supplementary Figure S10. Using ChIP-BIT2 weighted prior bind- still captured by ChIP-BIT2 and the latter provided additional bind- ing information, CRNET predicted 141, 249, and 62 functional ing events for functional exploration. target genes for each of the three TFs; the validation success rates We predicted FRNs at promoter regions using CRNET and are 14.9%, 11.15% and 12.5%, respectively. While using confi- COGRIM, respectively. The average speed of CRNET is 8.8 s/round dent binding events only, the validation success rates are 14.8%, as compared with 47.6 s/round for COGRIM. Convergence check 10.44% and 11.3%. Venn diagrams of gene validation are shown (Gelman and Rubin, 1992) was carried out using results of five repli- in Supplementary Figure S11 and validation success rates of com- cated Markov chains generated independently using each method. peting methods are shown in Figure 4B.In Supplementary Table As shown in Supplementary Figure S15, after 100 rounds of sam- S5, we further listed the detailed number of genes being predicted pling, CRNET starts to converge, while for COGRIM the number is by each method containing the same number of validated genes as 500. Similarity between TFA estimated by CRNET and mRNA CRNET with binary input (achieving the same recall performance). expression for each TF was shown in Supplementary Figure S17.It It can be found that the FRN predicted by CRNET has the highest can be found that there is no clear dependency between TFA and TF validation rate for each of the three TFs. expression. The pattern of TFA is, however, more consistent with that of candidate gene expression. Analyzing the top 500 edges in 3.3 Real application using breast cancer MCF-7 cells the CRNET-predicted FRN, we identified a key module including We finally applied CRNET to breast cancer MCF-7 cell ChIP-seq TFs MYC, TDRD3, E2F1, MBD3, SIN3A and MAX. Using and time-course RNA-seq data and inferred two FRNs at promoter COGRIM, a slightly different module was identified including and enhancer regions, respectively. A 17 b-estradiol (E2) treated MAX, FOXM1, RAD21, SRF and E2F1. We also checked the top 500 edges in the FRN predicted by CRNET, but with the MACS2 prior binding matrix, a module including MAX, SIN3A, MYC and E2F1 was identified. TDRD3 and MBD3 were missed because they had much fewer prior bindings predicted by MACS2, as discussed in Supplementary Material S6.2. We then predicted another FRN at enhancer regions using CRNET. The convergence curve in Supplementary Figure S16 shows that CRNET starts to converge after 100 rounds. The average sam- pling speed is 16 s/round. A key TF module including MYC, TDRD3, ER-a, GABPA, GATA3 and E2F1 was identified using Fig. 4. True positive rate of differentially expressed genes in inferred FRN top ranked edges. Here, three well-known breast cancer specific by competing methods. (A) TF knockdown efficiency for ATF3, EGR1 or SRF; (B) validation success rate for each TF enhancer activators: ER-a, GABPA and GATA3 were specifically CRNET: an efficient sampling approach for FRN inference 1739 enriched in this enhancer FRN. Again, we further examined the CRNET has a faster convergence speed and an improved perform- enhancer FRN predicted by CRNET but using prior bindings from ance in identifying FRNs from noisy binding and gene expression MACS2. Still using the top 500 edges in the inferred FRN, we identi- data. A summary of data and tools used in this paper can be found in Supplementary Material S7. Note that CRNET is developed fied a TF module, whereas TFs TDRD3 and GABPA in the previous module were missed. In this MCF-7 cell study, for the 39 selected based on an assumption that TFs work parallel on gene regulation, TFs, prior bindings predicted by MACS2 may not provide a ‘com- thus inferring FRNs, respectively, for promoter regions and plete’ candidate space for functional binding exploration. enhancer regions. However, evidence starts to emerge that TFs bind- Since in both promoter and enhancer FRN analyses MYC was ing at enhancers and at promoters could work collaboratively or identified as the ‘top’ TF, we used siMYC to knock down MYC in hierarchically. Enhancer TFs may activate TFs at promoter regions and then the latter will regulate gene expression, or enhancer TFs MCF-7 cells for 24 h, followed by Western blotting to confirm the may directly regulate genes through enhancer-promoter interactions. knockdown efficiency (Fig. 5A). The estimated TFA of MYC goes up after E2 treatment (Supplementary Figs S17 and S18), suggesting If such prior information is given, the proposed CRNET can be fur- a positive regulation relationship of MYC with those over-expressed ther extended and used to infer a joint FRN by properly merging prior binding observations from promoter and enhancer regions. target genes under E2 condition. If this active regulation relationship Another potential extension of current CRNET framework is to pre- is true, it can be expected that MYC target genes will be down- dict cis-regulatory modules (TF-associations), especially for regulated when MYC is knocked down. We therefore performed enhancer studies. Prior knowledge of TF-associations may be needed microarray mRNA profiling using Illumina HumanHT-12 v4 to define the candidate search space for module prediction. CRNET Expression BeadChip. Two replicates were generated under wild can predict functional TF modules by sampling candidate TF- type (siSCR) or siMYC condition and processed and normalized associations instead of individual TF bindings. using beadarray (v2.26.0) (Dunning et al., 2007). As shown in Figure 5B, MYC has been efficiently knocked down (differential mRNA expression P value 2.5e-2). Setting fold change threshold as Funding 0.5, in total we selected 2720 differentially expressed genes. A heat- This work was supported in part by the National Institutes of Health map of selected gene expression is shown in Figure 5C. 2271 (CA149653 to J.X., CA149147 & CA184902 to R.C., CA164384 to L, H.- (83.5%) MYC target genes are significantly down-regulated after C., CA148826 & CA187512 to T.-L.W.). MYC knockdown. Among 101 predicted genes in the promoter FRN, there are 40 genes significantly down-regulated (‘true’ MYC Conflict of Interest: none declared. targets). Among 92 enhancer target genes, the number of ‘true’ MYC targets is 44. There are 49 common predicted targets and 16 References common validated targets between two studies. We calculated the hypergeometric P value (in –log10 format) of the enrichment of Angelini,C., and Costa,V. (2014) Understanding gene regulatory mechanisms MYC valid target genes in the FRN predicted by each competing by integrating ChIP-seq and RNA-seq data: statistical solutions to biological problems. Front. Cell. Dev. Biol., 2, 51. method. As shown in Supplementary Table S11, using the same Chen,G. et al. (2007) Clustering of genes into regulons using integrated prior from ChIP-BIT2, CRNET performs better than COGRIM. modeling-COGRIM. Genome Biol., 8,R4. Since ChIP-BIT2 can capture more weak but still functional bind- Chen,X. et al. (2016) ChIP-BIT: Bayesian inference of target genes using ings, even for a strong TF like MYC, CRNET performs better if a novel joint probabilistic model of ChIP-seq profiles. Nucleic Acids Res., combined with ChIP-BIT2 than MACS2. Numbers of validated or 44, e65. non-validated MYC target and their gene expression heatmap were Chen,X. et al. (2013) Reconstruction of transcriptional regulatory networks shown in Supplementary Figure S19. by stability-based network component analysis. IEEE/ACM Trans. Comput. Biol. Bioinform., 10, 1347–1358. Cusanovich,D.A. et al. (2014) The functional consequences of variation in 4 Discussion transcription factor binding. PLoS Genet., 10, e1004226. Dunning,M.J. et al. (2007) beadarray: R classes and methods for Illumina CRNET is designed as an efficient sampling approach to integrate bead-based data. Bioinformatics, 23, 2183–2184. ChIP-seq and time-course RNA-seq data for large-scale FRN infer- Gelman,A., and Rubin,D.B. (1992) Inference from iterative simulation using ence. It aims to identify functional bindings among observed physi- multiple sequences. Stat. Sci., 7, 457–472. cal TF-gene interactions. Compared with other peer methods, Gu,J. et al. (2012) Robust identification of transcriptional regulatory networks using a Gibbs sampler on outlier sum statistic. Bioinformatics, 28, 1990–1997. Huynh-Thu,V.A. et al. (2010) Inferring regulatory networks from expression data using tree-based methods. PLoS One, 5, e12776. Karlebach,G., and Shamir,R. (2008) Modelling and analysis of gene regula- tory networks. Nat. Rev. Mol. Cell. Biol., 9, 770–780. Lange,K.L. et al. (1989) Robust statistical modeling using the t distribution. J. Am. Stat. Assoc., 84, 881–896. Li,B., and Dewey,C.N. (2011) RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics, 12, 323. Li,G. et al. (2014) Chromatin interaction analysis with paired-end tag (ChIA-PET) sequencing technology and application. BMC Genomics, 15 Fig. 5. Experimental validation of MYC target genes. (A) Western blot of MYC (Suppl. 12), S11. protein expression after transfecting MCF-7 breast cancer cells with siRNA of MYC and scramble (SCR) for 24 h; (B) mRNA expression levels of MYC across Liao,J.C. et al. (2003) Network component analysis: reconstruction of regulatory siRNA samples; (C) the heatmap of differentially expressed MYC target genes signals in biological systems. Proc. Natl. Acad. Sci. USA, 100, 15522–15527. 1740 X.Chen et al. Love,M.I. et al. (2014) Moderated estimation of fold change and dispersion Servant,N. et al. (2015) HiC-Pro: an optimized and flexible pipeline for Hi-C for RNA-seq data with DESeq2. Genome Biol., 15, 550. data processing. Genome Biol., 16, 259. Marbach,D. et al. (2009) Generating realistic in silico gene networks for per- Spitz,F., and Furlong,E.E. (2012) Transcription factors: from enhancer bind- formance assessment of reverse engineering methods. J. Comput. Biol., 16, ing to developmental control. Nat. Rev. Genet., 13, 613–626. 229–239. van Berkum,N.L. et al. (2010) Hi-C: a method to study the three-dimensional Phanstiel,D.H. et al. (2015) Mango: a bias-correcting ChIA-PET analysis pipe- architecture of genomes. J Vis Exp., 39, e1869. line. Bioinformatics, 31, 3092–3098. Venet,D. et al. (2012) A measure of the signal-to-noise ratio of microarray Qin,J. et al. (2014) Inferring gene regulatory networks by integrating samples and studies using gene correlations. PLoS One, 7, e51013. ChIP-seq/chip and transcriptome data via LASSO-type regularization meth- Wang,S. et al. (2013) Target analysis by integration of transcriptome and ods. Methods, 67, 294–303. ChIP-seq data with BETA. Nat. Protoc., 8, 2502–2515. Ramos,A.I., and Barolo,S. (2013) Low-affinity transcription factor binding Weaver,D.C. et al. (1999) Modeling regulatory networks with weight matri- sites shape morphogen responses and enhancer evolution. Philos. Trans. R. ces. Pac. Symp. Biocomput, 4, 112–123. Soc. Lond. B Biol. Sci., 368, 20130018. Zhang,X. et al. (2013) NARROMI: a noise and redundancy reduction Sabatti,C., and James,G.M. (2006) Bayesian sparse hidden components analy- technique improves accuracy of gene regulatory network inference. sis for transcription regulation networks. Bioinformatics, 22, 739–746. Bioinformatics, 29, 106–113. Sanyal,A. et al. (2012) The long-range interaction landscape of gene pro- Zhang,Y. et al. (2008) Model-based analysis of ChIP-Seq (MACS). Genome moters. Nature, 489, 109–113. Biol., 9, R137. Schuster,S.C. (2008) Next-generation sequencing transforms today’s biology. Zhao,F. et al. (2005) TRED: a Transcriptional Regulatory Element Database and a Nat Methods, 5, 16–18. platform for in silico gene regulation studies. Nucleic Acids Res., 33, D103–D107.
Bioinformatics – Oxford University Press
Published: Dec 21, 2017
You can share this free article with as many people as you like with the url below! We hope you enjoy this feature!
Read and print from thousands of top scholarly journals.
Already have an account? Log in
Bookmark this article. You can see your Bookmarks on your DeepDyve Library.
To save an article, log in first, or sign up for a DeepDyve account if you don’t already have one.
Copy and paste the desired citation format or use the link below to download a file formatted for EndNote
Access the full text.
Sign up today, get DeepDyve free for 14 days.
All DeepDyve websites use cookies to improve your online experience. They were placed on your computer when you launched this website. You can change your cookie settings through your browser.