Genome and epigenome analysis of monozygotic twins discordant for congenital heart disease

Background: Congenital heart disease (CHD) is the leading non-infectious cause of death in infants. Monozygotic (MZ) twins share nearly all of their genetic variants before and after birth. Nevertheless, MZ twins are sometimes discordant for common complex diseases. The goal of this study is to identify genomic and epigenomic differences between a pair of twins discordant for a form of congenital heart disease, double outlet right ventricle (DORV).

Results: A monoamniotic monozygotic (MZ) twin pair discordant for DORV were subjected to genome-wide sequencing and methylation analysis. We identified few genomic differences but 1566 differentially methylated regions (DMRs) between the MZ twins. Twenty percent (312/1566) of the DMRs are located within 2 kb upstream of transcription start sites (TSS), containing 121 binding sites of transcription factors. Particularly, ZIC3 and NR2F2 are found to have hypermethylated promoters in both the diseased twin and additional patients suffering from DORV.

Conclusions: The results showed a high correlation between hypermethylated promoters at ZIC3 and NR2F2 and down-regulated gene expression levels of these two genes in patients with DORV compared to normal controls, providing new insight into the potential mechanism of this rare form of CHD.

Keywords: CHD, MZ twins, ZIC3, NR2F2, DNA methylation, RRBS, WGS

* Correspondence: cheng_li@pku.edu.cn; nieyuniverse@126.com; weitao@pku.edu.cn
Guoliang Lyu, Chao Zhang and Te Ling contributed equally to this work.
Center for Bioinformatics, School of Life Sciences, Peking University, Beijing 100871, China
Department of Cardiovascular Surgery, Center for Cardiovascular Regenerative Medicine, Fuwai Hospital, Peking Union Medical College, Chinese Academy of Medical Sciences, Beijing 100871, China
Key Laboratory of Cell Proliferation and Differentiation, School of Life Sciences, Peking University, Beijing 100871, China

© The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Lyu et al. BMC Genomics (2018) 19:428

Background

Congenital heart disease (CHD) is the leading non-infectious cause of death in infants. In Asia, CHD occurs in 9.3 per 1000 live births [1]. Double outlet right ventricle (DORV), defined when both great arteries originate from the morphological right ventricle in a heart, is a rare form of congenital heart disease, accounting for 1–3% of all CHD cases [2–4]. Multiple factors have been identified as contributing to the disease, of which both genetic and epigenetic changes and the interplay between them and the related environment play a key role in the pathogenesis [5, 6].

During the normal development of the heart, the outflow tract initially connects exclusively with the primitive right ventricle and must remodel to divide into a separate pulmonary artery and aorta; subsequently, there is continued remodeling to establish direct continuity from the left ventricle to the aorta [4]. In double outlet right ventricle, drainage of the left ventricle is commonly achieved through a ventricular septal defect (VSD) at different locations and with varying relations to the pulmonary and aortic outflow tract. An insufficient mitral valve and an atrial septal defect (ASD) can be found in DORV cases [3]. Since temporal and spatial expression of transcription factors (TFs) is a major determinant of cell lineage specification and patterning of the heart [7–10], mutations or expression dysregulation in them can impair cardiac development and lead to congenital heart malformations and dysfunction [7]. Changes of epigenetic modifications also affect the expression patterns of these TFs [11]. Environmental alterations or stresses including drugs may perturb the TFs-related transcriptional and epigenetic programs in the process of cardiac development and give rise to CHD [12–14].

Cytosine methylation, an epigenetic modification, is essential in mammalian development and particularly in cell-lineage specification [15, 16]. Although each individual's genome is fixed throughout life and across cell types, epigenetic modifications are plastic and influence the temporal and spatial pattern of gene expression [14]. Cell type-specific DNA methylation patterns emerge during development and play a role in gene expression by directing chromatin activation or interfering with the binding of TFs [17]. Emerging evidence suggests that DNA methylation is responsive to both physical and social environments during pregnancy and early life [14, 18].

Monozygotic (MZ) twins share nearly all of their genetic variants before and after birth. Nevertheless, MZ twins are often discordant for common complex diseases, such as type 1 diabetes (T1D; 61%), type 2 diabetes (41%) [19], autism (58 to 60%) [20, 21], schizophrenia (58%), and different types of cancer (up to 16%) [22, 23]. These observations support the model that for many complex traits, genotype alone may not fully determine phenotypic variation, and the interplay between genes and environment needs to be considered; epigenetics has been proposed to be one of the main mediators of this interaction [24–26]. Therefore, disease-discordant MZ twin pairs provide an ideal model for examining epigenetic functions in diseases due to their shared genetic and environmental factors during the pregnancy [24]. Analysis of discordant MZ twins has been successfully used to study epigenetic mechanisms in aging, cancer, autoimmune disease, and psychiatric, neurological and other traits [20, 21, 23]. However, comprehensive analysis of genome-wide DNA methylation in an MZ twin pair discordant for double outlet right ventricle (DORV) is lacking.

In this study, we dissected the contributions of DNA methylation pattern to the pathogenesis of DORV through DNA methylation profiling at single nucleotide level by utilizing the whole-blood DNA derived from an MZ twin pair discordant for DORV. The affected twin, a Chinese girl aged two years and two months who was diagnosed with DORV at birth, has a healthy MZ twin sister. Their parents have no past or family history of heart diseases and are not consanguineous. Thus, we hypothesize that genome-wide analysis in this phenotype-discordant twin pair will provide insights into genetic and epigenetic factors affecting normal heart development.

Results

Genome sequence differences in MZ twins discordant for DORV

We first tried to identify de novo genomic sequence variation through whole genome sequencing. DNA extracted from whole blood samples of the MZ twins discordant for DORV (septal-defect heart (D3) and normal heart (D4)) was sequenced by Illumina HiSeq X-Ten (150 bp paired-end reads) and aligned to the hg19 reference human genome. The average sequencing depths of D3 and D4 were 31.9 and 28.5 respectively, and 4-fold coverage represented more than 91% of the hg19 human reference genome (Additional file 1: Table S1). Single-nucleotide variations (SNVs) and short insertions/deletions (InDels) were identified using SAMtools [27] and filtered by Varscan [28]. More than 99.9% of SNVs were shared between the MZ twins (Fig. 1a). Three hundred and sixteen SNVs and 114 InDels were identified specifically in D3. After filtering by the dbSNP (Version 138) database, forty-one SNVs and 71 InDels in D3 remained (Additional file 2: Table S2). There was one deletion located in non-coding RNA (ANKRD30BP2, pseudogene) and one synonymous SNV in the exon of the DSPP gene, but none of the SNVs or InDels altered proteins (Fig. 1b & Additional file 2: Table S2). No pathogenic variation specific to D3 was detected when filtered by the Online Mendelian Inheritance in Man (OMIM) and Human Genetic Mutation Database (HGMD). Sequence variants in the promoters of GATA4 and TBX1, which were closely related to VSD [29, 30], were found in both of our samples, indicating that the sequence variants in the promoters of these two transcription factors may not contribute to the pathogenesis of DORV in D3 (Additional file 3: Table S3).

Copy number variants (CNVs) are deletions or amplifications of DNA segments that arise from inappropriate chromatid recombination or segregation during cell division. As CNVs alter the gene expression dosage of contiguous genes, they may result in syndromic CHDs [31]. Using Control-FREEC [32], we detected 15 focal CNVs; however, these CNVs did not pass CNVnator confirmation criteria [33], suggesting the absence of bona fide CNVs (Fig. 1c; Additional file 4: Figure S1 & Additional file 5: Table S4). Structural variation (SV), typically a large insertion/deletion, inversion or translocation affecting a sequence length from 1 kb to 3 Mb, was not detected to be different between the two samples using the CREST software (Additional file 6: Figure S2). Together, the genomic analyses suggest that the DORV in D3 is not likely caused by genome alterations.

Fig. 1 Overview of whole genome sequencing. a The numbers of SNVs and InDels detected in D3 and D4, 99.9% loci shared between the MZ twins. b There are 1736 SNVs or InDels specific in the disease sample D3, of which only 430 loci showing high confidence confirmed by Varscan are regarded as potential de novo SNVs/InDels. All loci are classified into 7 categories (upstream, downstream, intergenic, exonic, intronic, UTR5, UTR3) according to the relative position of nearby genes. The pie plot shows the number of loci and proportion of each category. c Normalized copy number profiles of D3 and D4. Each point shows a 5 kb window (chromosome 1) of sequencing reads normalized by GC-content and mappability using Control-FREEC

DNA methylation differences between the MZ twin pair

DNA methylation is involved in multiple processes, including regulation of gene expression, silencing of retrotransposons, genomic imprinting, X-chromosome inactivation and occurrence of various diseases [34, 35]. We next sought to compare genome-scale DNA methylation profiles between the MZ twins at nucleotide resolution. DNA samples were extracted from whole blood acquired before heart surgery, and gel-free reduced representation bisulfite sequencing (RRBS) libraries were constructed as previously reported [36] and then sequenced on the Illumina HiSeq2000 platform (100 bp paired-end reads). After aligning the reads to the hg19 reference genome, we obtained 56 million and 48 million high-quality, 100-nucleotide, uniquely mapped reads from the D3 and D4 samples, respectively. The two samples had very similar sequencing-depth patterns of cytosine sites and covered more than 34 million C sites with sequencing depth of at least 5-fold (Additional file 7: Table S5). More than 6 million high-quality CpG dinucleotides were supported by at least 5 reads (depth ≥ 5×), which covered 10% of all CpG sites (Fig. 2a). In addition, we checked the methylation status of samples from blood cell and heart tissue in the ENCODE project (Accession ID: ENCFF388NTJ, ENCFF107RMQ). We found a high correlation between B cell and heart with Pearson's correlation coefficient equal to 0.9238, and visualizing the genome-wide methylation between them in IGV showed similar patterns (Additional file 8: Figure S3), suggesting that the DNA methylation pattern in blood correlates with that in heart.

The methylation levels of CpG dinucleotides in both samples showed a bimodal distribution with most CpG sites being unmethylated or extensively methylated (Fig. 2b). As expected, genome-wide patterns of DNA methylation (CpG and non-CpG) were highly correlated between the MZ twin pair (Figs. 2c-e & Additional file 9: Figure S4), suggesting that DORV is not associated with a global reprogramming of methylation.

Fig. 2 Comparison of CpG methylomes between two samples. a Overall reads coverage of reduced representation bisulfite sequencing (RRBS) at CpGs between the twins. The RRBS strategy shows a high covered rate of CpGs in two samples. 10.88% and 10.72% of all human CpGs are sequenced by more than 5 reads (5×) in D3 and D4, respectively. b Overall distribution of CpGs' methylated levels in D3 and D4. This violin plot shows that most CpGs are 100% methylated or 0% methylated, and the global methylation level is unchanged between the pair (Wilcoxon test non-significant). c Average CpGs' methylation profile in gene body, upstream (−2 kb of TSS) and downstream (+2 kb of TTS). d Circos plot shows the distribution of the DMRs along the genome. The tracks from outer to inner: CpG islands downloaded from UCSC (green); the −log10 p-value of detected DMRs (blue); the average methylation levels per 1 kb in D3 (red); the average methylation levels per 1 kb in D4 (green). e Scatter diagram of all CpGs' methylation levels calculated by RRBS data between two samples, D3 and D4, showing similar overall CpG methylation levels with Pearson correlation coefficient equal to 0.9613

Next, we sought to identify differentially methylated regions (DMRs) between the twin pair using a window sliding strategy to reduce the sampling variation of individual CpG sites (see Methods). We identified 1566 significant DMRs between the MZ twin pair (Fig. 3a). Hypermethylation and hypomethylation DMRs in D3 showed similar genomic distributions, with 80% (1254/1566) of the DMRs distributed in gene bodies or intergenic regions and 20% (312/1566) of the DMRs located within 2 kb upstream of TSS (Fig. 3b and Additional file 10: Table S6). We concluded that DMRs between the MZ twins may be involved in the pathogenesis of DORV.

Fig. 3 Distribution of D3 specific DMRs. a The scatter plot shows the found DMRs, the red color indicates the p-value, and non-significant regions are shown in gray. b The distribution of significant DMRs. All DMRs are classified into 2 main categories (hypermethylation and hypomethylation), and each category is further classified into 7 classes (upstream, downstream, intergenic, exonic, intronic, UTR5, UTR3) according to the relative position of nearby genes. The bar plot shows the proportion of each class. There are 294 genes that have a differentially methylated region in upstream (2 kb) of transcription start site

Gene ontology (GO) and transcription factor binding sites enriched in DMRs

We analyzed the genes that contained DMRs within 2 kb upstream of their TSS. In total, three hundred and twelve DMRs were located in the upstream of 621 TSSs (belonging to 294 genes annotated in Refseq) (Fig. 3b & Additional file 11: Table S7). These DMR-associated genes were then subjected to KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway and Gene Ontology (GO) analysis using the Enrichr and DAVID Web servers [37–39]. GO analysis revealed that genes associated with cardiolipin acyl-chain remodeling, cardiac muscle tissue regeneration, vascular function and organ growth are enriched in DMR-associated genes (Fig. 4a & Additional file 11: Table S7), suggesting that DMRs may be associated with the abnormal heart development of D3.

We next analyzed the 851 genes containing DMRs in their gene bodies. These genes were also enriched in several biological processes that are involved in various stages of cardiovascular development (Fig. 4b and Additional file 12: Table S8). Furthermore, multiple signaling pathways involved in heart development were also enriched, including regulation of BMP signaling and TGF beta receptor signaling (Fig. 4b and Additional file 12: Table S8). These results suggest that gene regulatory networks during the heart development of D3 may be affected by the DMRs.

Moreover, DMRs can influence gene expression through directly altering the binding of transcription factors to their targets. We analyzed the TF binding motifs in DMRs by using ANNOVAR (using the tfbsConsSites database downloaded from UCSC). We identified 138 TFs (Additional file 10: Table S6) that belong to GATA family and NKX family, which are essential for cardiac development (Additional file 13: Table S9). GO analysis showed that many of these TFs were involved in heart development, muscle organ development, regulation of muscle cell differentiation, and blood vessel development (Fig. 4c and Additional file 13: Table S9). These TFs were also enriched in KEGG signaling pathways including MAPK signaling, Wnt signaling, TGF beta signaling, and VEGF signaling pathway (Fig. 4d and Additional file 13: Table S9), which were all involved in heart development [40–45]. Taken together, these results imply that DMRs in TF binding sites may contribute to the DORV pathogenesis.

Fig. 4 Gene Ontology (GO) and KEGG analysis of D3 specific DMRs. Gene Ontology (GO) and pathway enrichment analysis of differentially methylated regions using Enrichr and DAVID under default parameters. The bar charts show the most relevant and significantly enriched terms. Terms that are highly related to CHD are marked in blue. The x-axis represents the −log10 of the enrichment p-value. The y-axis represents the enriched terms in GO or KEGG databases. a GO enrichment analysis of genes associated with differentially methylated regions in 2 kb upstream of genes. b GO enrichment analysis of genes associated with differentially methylated regions in gene body. c GO enrichment analysis of TFs whose binding sites are differentially methylated. d KEGG pathway enrichment analysis of TFs whose binding sites are differentially methylated

Aberrant promoter methylation of ZIC3 and NR2F2 in the diseased twin

Since promoter hypermethylation is typically associated with the repression of gene transcription [44], we then focused on promoter DMRs for further analysis. We found that DMRs existed in the upstream of multiple genes whose family members have been reported to be involved in pathogenesis of CHDs, and their normal functions are critical in morphogenesis and establishment of the cardiovascular system during cardiac development [46–48]. These genes include CITED1 (member of CREB-binding protein/p300-interacting transactivator with Asp/Glu-rich C-terminal domain (CITED) family of proteins, hypomethylated), GATA2 (member of the GATA family of zinc-finger TFs, hypermethylated), SOX3 (member of the SOX (SRY-related HMG-box) family of transcription factors, hypomethylated) (Additional file 14: Figure S5, Additional file 15: Figure S6 and Additional file 16: Figure S7).

We also found that genes encoding important epigenetic factors contained DMRs in their upstream of TSS, implying that these factors may play roles in causing CHDs through regulating expression of genes implicated in heart development. The genes include NSD1 (Nuclear Receptor Binding SET Domain Protein 1, which preferentially methylates 'Lys-36' of histone H3 and 'Lys-20' of histone H4, hypomethylated), MTA2 (Metastasis Associated 1 Family, Member 2, a component of NuRD, hypomethylated), MECP2 (Methyl CpG Binding Protein 2, hypermethylated) and SUV39H1 (Suppressor of Variegation 3–9 Homolog 1, a histone methyltransferase that trimethylates lysine 9 of histone H3, hypomethylated) (Additional file 17: Figure S8, Additional file 18: Figure S9, Additional file 19: Figure S10 and Additional file 20: Figure S11). We also utilized the public expression profiling data of embryo and adult heart from ENCODE (Accession ID: ENCFF704AHC, ENCFF199GQY, ENCFF987YOV) to analyze the possible contribution of these genes to DORV. We found that these genes were expressed in both embryo and heart (Additional file 21: Table S10), suggesting that these genes may play important roles in embryonic and cardiac development and dysregulated expression of them may contribute to CHD such as DORV.

Moreover, by scrutinizing all the DMRs located in gene promoter regions, we noticed that two genes, ZIC3 and NR2F2, encode TFs annotated with CHD in OMIM (300265 and 107773) [49–51]. In the DORV diseased twin D3, the upstream of ZIC3 (harboring P300 and HNF1 binding sites) was hypermethylated, corresponding to a region with high Pol II binding density in hESC cells (ENCODE data) (Fig. 5a and c). Similarly, hypermethylation of NR2F2 was detected in the upstream of the TSS (harboring an IRF2 binding site) of the shortest NR2F2 transcript variant, corresponding to a region with high intensity of TBP and Pol II binding in K562 cells (ENCODE data) (Fig. 5b and d). These results suggested a possible association of hypermethylated promoters of ZIC3 and NR2F2 and their functions during the heart development of DORV patients.

Fig. 5 Aberrant methylation in the upstream regions of ZIC3 and NR2F2, visualized in UCSC genome browser. DMRs are indicated by light blue bar; methylated levels in the twins are shown in blue (D3) and red (D4) bars. Transcription factor binding sites are also shown in zooming-in panels, which are indicated by black bars with names marked in front. An arrow gives TSS and transcriptional orientation. Two genes, ZIC3 (a) and NR2F2 (b), are differentially methylated in the upstream of TSS, and these two genes are known to be associated with CHD. Transcription factor binding sites analyses are performed by ChIP-seq data of RNA Pol II and TBP derived from ENCODE. Bisulfite sequencing data (c and d) show methylation status of the same region of ZIC3 and NR2F2 as in (a) and (b). Methylated and unmethylated CpG sites are shown as black and white circles, respectively

Hypermethylation and dysregulation of ZIC3 and NR2F2 in additional DORV patients

In order to further confirm the dysregulation of ZIC3 and NR2F2 in DORV pathogenesis, we collected twenty DNA samples of whole blood from normal individuals and clinical DORV patients. In order to guarantee the similarity in age and in gender ratio between the normal group and the patient cases, the samples included five controls with normal heart development (aged 0.8–3.8 years; 3 males, 2 females) and fifteen cases with DORV diagnosis (aged 1–3.5 years; 9 males, 6 females). Using bisulfite sequencing, we confirmed hypermethylated promoters of both ZIC3 (Fig. 6a and b) and NR2F2 (Fig. 6c and d) in twelve of fifteen DORV patients compared to normal subjects (Additional file 22: Figure S12, Additional file 23: Figure S13). The association of DORV and hypermethylated ZIC3 and NR2F2 promoters showed a significant correlation by Fisher's exact test (p values are both 0.0036) (Fig. 6a and c). Correspondingly, samples harboring hypermethylated promoters of the two genes have comparatively lower gene expression levels (Fig. 6e and f), and the methylation and gene expression levels of these two genes were negatively correlated (Fig. 6g and h). These results confirm that lower gene expression levels of ZIC3 and NR2F2 are associated with promoter hypermethylation of these two genes in normal individuals and DORV patients.

Fig. 6 DNA methylation and gene expression detection of ZIC3 and NR2F2 from clinical cases. (a and c) Statistical summaries about DNA methylation status of DMRs in ZIC3 and NR2F2 in 20 clinical samples, consisting of five normal providers and fifteen DORV patients. (b and d) Diagrams exhibiting average methylated levels of individual CpG sites in DMRs of ZIC3 and NR2F2 from the indicated groups, respectively. (e and f) Histograms showing relative gene expression levels of ZIC3 and NR2F2 in different groups of specimens. (g and h) Scatterplots showing the gene expression levels of ZIC3 (g) and NR2F2 (h) are negatively correlated with their promoter methylation status. Pearson's correlation coefficient and p-values were listed above the plot

Discussion

This study represents the first analysis of genome and epigenome profiling of MZ twins discordant for DORV, and provides evidence for the presence of epigenetic differences between the twin pair. Genetic variations between twins can affect protein coding, gene transcription and epigenetic modifications [52]. We therefore first performed genomic sequence variation detection and revealed some differences between the twin pair, but stringent filtering analyses of CNVs, SNVs and short InDels failed to identify genomic differences that may contribute to pathogenic DORV. Even though sequence variants within the promoter regions of GATA4 and TBX1 were reported to contribute to congenital heart disease by altering their gene expression [10, 29, 30], our results showed the sequence variants existed in both the normal and diseased twins, indicating that the pathogenesis of DORV may not be due to these differences. It is possible that the shared sequence variants showed different penetrance, which may cause the DORV in one of the MZ twins. Notably, sequence alignment in our study covered 91% of the hg19 human reference genome, leaving the rest of the genome not assessed. Therefore, we cannot rule out the possibility that genome sequence differences are involved in DORV of this twin pair.

Epigenetic variation at specific genomic regions has high heritability, and MZ twins typically share similar epigenetic profiles [25]. However, a set of factors including dietary components, physical changes, psychological states and environmental changes could affect epigenomes of the twins, aged two years and two months, after birth [53]. In this study, many DMR-related genes (within upstream or gene body) are enriched in pathways that contribute to cardiac development. Most importantly, we found that ZIC3 and NR2F2, which are annotated with CHD in the OMIM database, were hypermethylated at the TSS upstream region of the diseased twin compared to the normal heart sample. ZIC3 is a member of the ZIC family of C2H2-type zinc finger proteins, which functions as a TF in early stages of left-right body axis formation and heart development [53]. ZIC3 acts in organizer formation by inhibiting the canonical Wnt signaling pathway, and its expression is regulated by determinants of the early neural fate specification and dorsal-ventral (D-V) axis formation, including BMP, FGF, and Nodal signaling pathways [54]. Mutations in ZIC3 result in heterotaxy or isolated CHD (phenotypes including DORV, ASD and VSD) [49, 50]. In addition, the ZIC3 gene is located on the X chromosome, so it may contribute to the pathogenesis of CHD in males and females differently. NR2F2 was identified to be a member of the steroid thyroid hormone superfamily of nuclear receptors, involved in the regulation of many different genes in development [55, 56]. In human cases and mouse models, NR2F2 has been crucially implicated in angiogenesis and heart development, and abnormal expression or depletion of NR2F2 leads to AVSD (atrioventricular septal defect) and VSD (ventricular septal defects) [51]. Consistently, we also found DMRs in BMP and Wnt signaling pathway-related genes, indicating that the regulation of these signaling pathways by ZIC3 or NR2F2 may be critical for normal heart development. Using additional normal and clinical DORV samples, we confirmed promoter hypermethylation of ZIC3 and NR2F2 in DORV patients and the anti-correlation between their methylation and gene expression. Taken together, aberrant methylation at promoter regions of ZIC3 and NR2F2 and their dysregulated gene transcription levels may contribute to DORV in human heart development. However, a decisive conclusion needs further investigation, since epigenetic changes in blood may not be able to fully reflect the causative basis of the disease due to lack of more DORV cases in twins and difficulty in obtaining heart samples.

DMRs were also found in the upstream of CITED1, GATA2, SOX3 and some important epigenetic genes, including MTA2, NSD1, MECP2 and SUV39H1, indicating that they might also contribute to DORV. The MZ twins shared a similar but not exactly the same environment, especially the postnatal environment. Thus, we suggest that the non-shared environmental and stochastic factors, including physical changes, chemical pollutants, dietary components, temperature changes and other external stresses during pregnancy [6, 57, 58], may contribute to the pathogenesis of DORV through the mediation of epigenetic changes.

Conclusions

In conclusion, disease-discordant MZ twin pairs are outstanding subjects to study epigenetic mechanisms driving a number of pathologies. Here, using DNA methylation profiling technology to analyze genome-wide DNA methylation, we described differentially methylated regions in a DORV-discordant MZ twin pair. A limitation to our study is that we only obtained one MZ twin pair discordant for DORV, and the present results call for more DORV discordant twins and extensive tests for the generalizability of our findings. Nevertheless, our results provide new insights into the mechanism of DORV, a rare disease that has been less studied by genomic and epigenomic approaches.

Methods

Patients and materials

Genomic DNA was extracted from the donated whole blood samples of DORV patients and normal people by using the DNeasy Blood & Tissue Kit (Qiagen, Cat no. 69504). This study was conducted in accordance with the principles of the Declaration of Helsinki and has been reviewed and approved by the Medical Ethics Committee of Fuwai Hospital. Written informed consent was obtained from the twins' parents and mentioned samples' providers.

Reduced representation bisulfite sequencing (RRBS)

The MspI-digested RRBS library was prepared as published [36], and 100 bp paired-end reads were generated on the Illumina Hiseq2000 platform (BIOPIC, Peking University, Beijing). Adapters and low-quality bases were trimmed from raw reads using trim_galore in RRBS mode. The human genome (hg19) was indexed with bismark_genome_preparation (a script from the bismark mapping package), and then all clean reads were aligned to the indexed human genome using bismark (--bowtie2). To extract the methylation information for individual cytosines, bismark_methylation_extractor (-p --cytosine_report --CX --no_overlap) in paired-end mode was applied, and the output CX_report file was sorted by chromosomes using linux shell commands (awk). The sorted CX_report files were then used for downstream analysis.

DNA methylation detection and quantitative RT-PCR

To monitor CpG methylation of screened DMRs in promoters, genomic DNA was treated with sodium bisulfite using EpiTect Bisulfite Kit (Qiagen, USA). The converted DNA was then amplified by PCR with specific primers (ZIC3: Forward: 5′-GAGTGATTGATTTTATTAGTTTAAGGATAT-3′ Reverse: 5′-AACCAAAAAACTCCCTAAATACC-3′; NR2F2: Forward: 5′-GAAGTAGGAAAGGGTGGG-3′ Reverse: 5′-CGAACCCAAACTATTATCTAAC-3′). PCR products were purified, ligated into pEasy-T5 vector (Transgene, China) and then transformed into competent Escherichia coli. When bacterial colonies appeared on the plate, at least 20 independent clones were selected and sequenced. The sequenced results were analyzed by BiQ analyzer (Max Planck Institute, Germany). Total RNA was extracted with RNAliquid Kit (Aidlab, China) and mRNA expression levels were detected with one-step RT-PCR kit (Takara, China) on lightcycler (Roche, USA). RT-qPCR primers were as follows: ZIC3: Forward: 5′-GGCGCTCAGTTTCCTAACTAC-3′ Reverse: 5′-CTGCCGCATATAACGGAAGAA-3′; NR2F2: Forward: 5′-AACCAGCCGACGAGATTCG-3′ Reverse: 5′-CCCGGATGAGGGTTTCGATG-3′.

DMRs detection

Differentially methylated regions (DMRs) were detected based on a window-sliding method. We used a 100 bp window sliding on the genome at a 50 bp step to find differentially methylated windows (DMWs) between two samples, and the neighboring windows were joined together as a DMR. Only CpG sites with >10× coverage were used to calculate DMWs. To test for different average methylation in a window between two samples, the Wilcoxon test was applied, and windows with p-value < 0.01 were considered DMWs. A significant DMW was discarded if there were fewer than five CpG sites in the window and the difference in average methylation levels between the two samples was less than 10%. Finally, adjacent DMWs were joined together as DMRs using BEDtools (intersectBed). DMRs were annotated by BEDtools (intersectBed) and ANNOVAR [59].

Gene ontology (GO) and pathway enrichment

Annotated DMRs were separated into 3 subsets: gene upstreams (2 kb in front of TSS), gene bodies and transcription factor binding sites. Genes which had DMRs in upstreams and gene bodies were submitted to Enrichr and the Database for Annotation, Visualization and Integrated Discovery (DAVID), respectively. GO enrichments used the GOTERM_BP_FAT database under default parameters, and pathway enrichments used the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. DMRs which were annotated as transcription factor binding sites by ANNOVAR (using the tfbsConsSites database downloaded from UCSC [60]) were considered to influence the TFs' function, so we analyzed those TFs using DAVID. GO (biological process) and pathway enrichments were obtained to understand those DMRs' biological meanings.

Additional files

Additional file 1: Table S1. Overview of WGS data. (XLSX 39 kb)
Additional file 2: Table S2. D3 specific SNVs and short InDels (XLSX 82 kb)
Additional file 3: Table S3. Sequence variations reported related to VSD. (XLSX 30 kb)
Additional file 4: Figure S1. Comparison of copy number profiles of 22 pairs of autochromosomes and X chromosome between twin pair. Normalized copy number profiles of D3 and D4. Each point shows a 5 kb
Additional file 11: Table S7. GO and KEGG analysis of DMRs located in upstream region. (XLSX 43 kb)
Additional file 12: Table S8. GO and KEGG analysis of DMRs located in gene body. (XLSX 99 kb)
Additional file 13: Table S9. GO and KEGG analysis of DMRs located in TF binding sites. (XLSX 80 kb)
Additional file 14: Figure S5. Aberrant methylation in the upstream regions of CITED1. Visualizing the methylation levels of DMRs near CITED1 with UCSC genome browser. Methylated levels in the twins are shown in blue (D3) and red (D4). Transcription factor binding sites are also shown in zooming-in panels, which are indicated by black bars with names marked in front. Arrows give TSSs and transcriptional orientation. Transcription factor binding sites, Pol II ChIP-seq and TBP ChIP-seq data from ENCODE. (PDF 910 kb)
Additional file 15: Figure S6. Aberrant methylation in the upstream regions of GATA2. Visualizing the methylation levels of DMRs near GATA2 with UCSC genome browser. Methylated levels in the twins are shown in blue (D3) and red (D4). An arrow gives TSS and transcriptional orientation. Transcription factor binding sites, Pol II ChIP-seq and TBP ChIP-seq data from ENCODE. (PDF 767 kb)
Additional file 16: Figure S7. Aberrant methylation in the upstream regions of SOX3. Visualizing the methylation levels of DMRs near SOX3 with UCSC genome browser. Methylated levels in the twins are shown in blue (D3) and red (D4). Transcription factor binding sites are also shown in zooming-in panels, which are indicated by black bars with names marked in front. An arrow gives TSS and transcriptional orientation. Transcription factor binding sites, Pol II ChIP-seq and TBP ChIP-seq data from ENCODE. (PDF 590 kb)
Additional file 17: Figure S8. Aberrant methylation in the upstream regions of NSD1. Visualizing the methylation levels of DMRs near NSD1 with UCSC genome browser. Methylated levels in the twins are shown in blue (D3) and red (D4).
Transcription factor binding sites are also windows (all chromosomes) of sequencing reads normalized by GC-content showed in zooming-in panels, which indicated by black bars with names and map-ability using Control-FREEC. (PDF 1756 kb) marked in front. An arrow gives TSS and transcriptional orientation. Tran- Additional file 5: Table S4. Analysis of CNVs two samples. (XLSX 46 kb) scription factor binding sites, Pol II ChIP-seq and TBP ChIP-seq data from ENCODE. (PDF 781 kb) Additional file 6: Figure S2. D3 specific structure variations (SVs) analysis. Alignments of reads in both D3 (top panel) and D4 (bottom panel) Additional file 18: Figure S9. Aberrant methylation in the upstream at each break points detected by CREST. The header lines in blue given the regions of MTA2. Visualizing the methylation levels of DMRs near MTA2 detail information of each structure variation (columns meanings: left_chr, with UCSC genome browser. Methylated levels in the twins are showed left_pos, left_strand, number of left soft-clipped reads, right_chr, right_pos, in blue (D3) and red (D4). Transcription factor binding sites are also right_strand, number right soft-clipped reads, SV type, coverage at left_- showed in zooming-in panels, which indicated by black bars with names pos, coverage at right_pos, assembled length at left_pos, assembled marked in front. An arrow gives TSS and transcriptional orientation. Tran- length at right_pos, average percent identity at left_pos, percent of non- scription factor binding sites, Pol II ChIP-seq and TBP ChIP-seq data from unique mapping reads at left_pos, average percent identity at right_pos, per- ENCODE. (PDF 633 kb) cent of non-unique mapping reads at right_pos, start position of consensus Additional file 19: Figure S10. Aberrant methylationinthe upstream mapping to genome, starting chromosome of consensus mapping, regions of MECP2. 
Visualizing the methylation levels of DMRs near MECP2 with position of the genomic mapping of consensus starting position, end UCSC genome browser. Methylated levels in the twins are showed in blue position of consensus mapping to genome, ending chromosome of (D3) and red (D4). Transcription factor binding sites are also showed in consensus mapping, position of genomic mapping of consensus end- ing position, and consensus sequences). (PDF 626 kb) zooming-in panels, which indicated by black bars with names marked in front. Arrows give TSSs and transcriptional orientation. Transcription factor binding Additional file 7: Table S5. Overview of RRBS data. (XLS 26 kb) sites, Pol II ChIP-seq and TBP ChIP-seq data from ENCODE. (PDF 969 kb) Additional file 8: Figure S3. Analysis of WGBS data of B cell and heart tissue from the ENCODE project. (A) Scatter plot and Pearson’s Additional file 20: Figure S11. Aberrant methylation in the upstream correlation analysis of DNA methylation of B cell and heart. Pearson’s regions of SUV39H1. Visualizing the methylation levels of DMRs near correlation coefficient was listed above the plot. (B) Visualization of SUV39H1 with UCSC genome browser. Methylated levels in the twins are the DNA methylation status of B cell and heart in chr1:1-3 M by IGV. showed in blue (D3) and red (D4). Transcription factor binding sites are (JPG 500 kb) also showed in zooming-in panels, which indicated by black bars with names marked in front. An arrow gives TSS and transcriptional orienta- Additional file 9: Figure S4. Comparison of systemic changes of tion. Transcription factor binding sites, Pol II ChIP-seq and TBP ChIP-seq methylome between two samples. (A) Cumulative depth distribution of data from ENCODE. (PDF 748 kb) RRBS data. The x-axis represents the depth of cytosine, and the y-axis represents the fraction of cytosine ≤ depth. (B) Methylation levels Additional file 21: Table S10. 
Expression profiling data of embryo and distribution in two samples of three different kinds of cytosine (CG, CHG, adult heart. (XLSX 9 kb) CHH). The x-axis is the methylation level; y-axis shows the log2 counts of Additional file 22: Figure S12. DNA methylation detection of ZIC3 the cytosine under a methylated level. (C) CHH methylation profile in the from clinical samples. Bisulfite sequencing tested DNA methylation status gene body, upstream and downstream. (D) CHG methylation profile in the of DMRs in ZIC3 in 20 clinical samples, five normal providers (1–5) and gene body, upstream and downstream. (PDF 891 kb) fifteen DORV patients (6–20). Methylated and unmethylated CpG sites are Additional file 10: Table S6. DMRs between two samples. (XLSX 580 kb) indicated as respective black and white circles. (PDF 5226 kb) Lyu et al. BMC Genomics (2018) 19:428 Page 12 of 13 Publisher’sNote Additional file 23: Figure S13. DNA methylation detection of NR2F2 Springer Nature remains neutral with regard to jurisdictional claims in from clinical samples. Bisulfite sequencing detected DNA published maps and institutional affiliations. methylation status of DMRs in NR2F2 in 20 clinical samples, five normal providers (1–5) and fifteen DORV patients (6–20). Received: 23 June 2017 Accepted: 22 May 2018 Methylated and unmethylated CpG sites are indicated as black and white circles, respectively. (PDF 5463 kb) Abbreviations References ASD: Atrial septal defect; CHD: Congenital heart disease; CITED1: Cbp/p300 1. Fahed AC, Gelb BD, Seidman J, Seidman CE. Genetics of congenital heart interacting transactivator with Glu/Asp rich carboxy-terminal domain 1; disease: the glass half empty. Circ Res. 2013;112(4):707–20. CNVs: Copy number variations; DAVID: Database for Annotation, Visualization 2. McMahon CJ, Breathnach C, Betts DR, Sharkey FH, Greally MT. De novo and Integrated Discovery; DMRs: Differentially methylated regions; interstitial deletion 13q33. 
3q34 in a male patient with double outlet right DMWs: Differentially Methylated Windows; DORV: Double outlet right ventricle, microcephaly, dysmorphic craniofacial findings, and motor and ventricle; GATA2: GATA-binding protein 2; GATA4: GATA-binding factor 4; developmental delay. Am J Med Genet A. 2015;167(5):1134–41. HGMD: Human Genetic Mutation Database; HNF1: Hepatocyte nuclear factor 3. Hartge DR, Niemeyer L, Axt-Fliedner R, Krapp M, Gembruch U, Germer U, 1; InDels: Insertions/deletions; IRF2: Interferon regulatory factor 2; Weichert J. Prenatal detection and postnatal management of double outlet KEGG: Kyoto Encyclopedia of Genes and Genomes; MAPK: Mitogen activated right ventricle (DORV) in 21 singleton pregnancies. J Matern Fetal Neonatal protein kinase; MECP2: Methyl CpG binding protein 2; MTA2: Metastasis Med. 2012;25(1):58–63. associated 1 family member 2; MZ: Monozygotic; NR2F2: Nuclear receptor 4. Obler D, Juraszek A, Smoot LB, Natowicz MR. Double outlet right ventricle: subfamily 2 group F member 2; NSD1: Nuclear receptor binding SET domain aetiologies and associations. J Med Genet. 2008;45(8):481–97. protein 1; OMIM: Online Mendelian Inheritance in Man; P300: Histone 5. Ordovás JM, Smith CE. Epigenetics and cardiovascular disease. Nat Rev acetyltransferase p300; Pol II: RNA polymerase II; RRBS: Reduced Cardiol. 2010;7(9):510. representation bisulfite sequencing; SNVs: Single nucleotide variations; 6. Vecoli C, Pulignani S, Foffa I, Grazia Andreassi M. Congenital heart disease: SOX3: SRY-box 3; SUV39H1: Suppressor of variegation 3–9 homolog 1; the crossroads of genetics, epigenetics and environment. Curr Genomics. T1D: Type 1 diabetes; TBP: TATA-box binding protein; TBX1: T-box 2014;15(5):390–9. transcription factor 1; TFs: Transcription factors; TGF: Transforming growth 7. Bruneau BG. Signaling and transcriptional networks in heart development factor; TSS: Transcription start sites; VEGF: Vascular endothelial growth factor; and regeneration. 
Cold Spring Harb Perspect Biol. 2013;5(3):a008292. VSD: Ventricular septal defect; ZIC3: Zic family member 3 8. Hatcher CJ, Basson CT. Specification of the cardiac conduction system by transcription factors. Circ Res. 2009;105(7):620–30. Acknowledgements 9. Kathiriya IS, Nora EP, Bruneau BG. Investigating the transcriptional control of We are grateful to participants for kindly providing us with clinical samples. cardiovascular development. Circ Res. 2015;116(4):700–14. We are thankful to Fuwai Hospital for taking care of the participants. We 10. Stefanovic S, Christoffels VM. GATA-dependent transcriptional and thank the Bioinformatics Core, School of Life Sciences, Peking University for epigenetic control of cardiac lineage specification and differentiation. Cell providing bioinformatics analysis, and the reviewers for critical comments. Mol Life Sci. 2015;72(20):3871–81. 11. Liu L, Jin G, Zhou X. Modeling the relationship of epigenetic modifications Funding to transcription factor binding. Nucleic Acids Res. 2015;43(8):3873–85. This work was supported by 2014 China Postdoctoral Science Foundation 12. Feil R, Fraga MF. Epigenetics and the environment: emerging patterns and 55th General Financial Grant (No. 2014 M550007), 2013 Postdoctoral implications. Nat Rev Genet. 2012;13(2):97. Fellowship of Peking University-Tshinghua University Center for Life Sciences, 13. Handy DE, Castro R, Loscalzo J. Epigenetic modifications: basic mechanisms the National Natural Science Foundation of China (NSFC, Grant No. and role in cardiovascular disease. Circulation. 2011;123(19):2145–56. 31471205, 31671426 and 91219101) and the National Basic Research Program 14. Szyf M. The early life social environment and DNA methylation: DNA of China (973 Program, Grant No. 2010CB529500 and 2013CB530700). These methylation mediating the long-term impact of social environments early in funding bodies had no role in the design of the study, collection, analysis, life. Epigenetics. 
2011;6(8):971–8. and interpretation of data, nor in writing the manuscript. 15. Smith ZD, Meissner A. DNA methylation: roles in mammalian development. Nat Rev Genet. 2013;14(3):204. Availability of data and materials 16. Hodges E, Molaro A, Dos Santos CO, Thekkat P, Song Q, Uren PJ, Park J, The datasets used and/or analysed during the current study available from Butler J, Rafii S, McCombie WR. Directional DNA methylation changes and the corresponding author on reasonable request. complex intermediate states accompany lineage specificity in the adult hematopoietic compartment. Mol Cell. 2011;44(1):17–28. Authors’ contributions 17. Brunner AL, Johnson DS, Kim SW, Valouev A, Reddy TE, Neff NF, Anton E, GLL and TL designed and performed molecular experiments, CZ contributed Medina C, Nguyen L, Chiao E. Distinct DNA methylation patterns to bioinformatics analysis, they contributed equally to this work; RL helped characterize differentiated human embryonic stem cells and developing with collections of clinical blood samples, CL, LZ, YTG, XKH, LS, LH and LJZ human fetal liver. Genome Res. 2009;19(6):1044–56. contributed to data analysis; WT, CL and YN conceived the project and 18. Kanherkar RR, Bhatia-Dey N, Csoka AB. Epigenetics across the human designed experiments; TL, GLL, CZ, CL and WT wrote the manuscript. All lifespan. Frontiers in cell and developmental biology. 2014;2:49. authors read and approved the final version of the manuscript. 19. Yuan W, Xia Y, Bell CG, Yet I, Ferreira T, Ward KJ, Gao F, Loomis AK, Hyde CL, Wu H. An integrated epigenomic analysis for type 2 diabetes susceptibility Ethics approval and consent to participate loci in monozygotic twins. Nat Commun. 2014;5:5719. This study was conducted in accordance with the principles of the 20. Wong C, Meaburn EL, Ronald A, Price T, Jeffries AR, Schalkwyk L, Plomin R, Declaration of Helsinki and has been reviewed and approved by the Medical Mill J. 
Methylomic analysis of monozygotic twins discordant for autism Ethics Committee of Fuwai Hospital (Approval no. 229). We confirm the spectrum disorder and related behavioural traits. Mol Psychiatry. 2014; informed written consent was obtained from all participants. 19(4):495. 21. Arora M, Reichenberg A, Willfors C, Austin C, Gennings C, Berggren S, Consent for publication Lichtenstein P, Anckarsäter H, Tammimies K, Bölte S. Fetal and postnatal For investigations involving clinical subjects, informed written consent has metal dysregulation in autism. Nat Commun. 2017;8:15493. been obtained from the participants involved. The participants consented to 22. Caramori ML, Kim Y, Moore JH, Rich SS, Mychaleckyj JC, Kikyo N, Mauer M. publish all of the sequencing data. Gene expression differences in skin fibroblasts in identical twins discordant for type 1 diabetes. Diabetes. 2012;61(3):739–44. Competing interests 23. Castillo-Fernandez JE, Spector TD, Bell JT. Epigenetics of discordant The authors declare that they have no competing interests. monozygotic twins: implications for disease. Genome Med. 2014;6(7):60. Lyu et al. BMC Genomics (2018) 19:428 Page 13 of 13 24. Bell JT, Saffery R. The value of twins in epigenetic epidemiology. Int J 47. Connelly JJ, Wang T, Cox JE, Haynes C, Wang L, Shah SH, Crosslin DR, Hale Epidemiol. 2012;41(1):140–50. AB, Nelson S, Crossman DC. GATA2 is associated with familial early-onset 25. Bell JT, Spector TD. A twin approach to unraveling epigenetics. Trends coronary artery disease. PLoS Genet. 2006;2(8):e139. Genet. 2011;27(3):116–25. 48. Paul MH, Harvey RP, Wegner M, Sock E. Cardiac outflow tract development relies on the complex function of Sox4 and Sox11 in multiple cell types. 26. Papadopoulos GK, Wijmenga C, Koning F. Interplay between genetics and Cell Mol Life Sci. 2014;71(15):2931–45. the environment in the development of celiac disease: perspectives for a 49. 
Ware SM, Peng J, Zhu L, Fernbach S, Colicos S, Casey B, Towbin J, Belmont healthy life. J Clin Invest. 2001;108(9):1261–6. JW. Identification and functional analysis of ZIC3 mutations in heterotaxy 27. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis and related congenital heart defects. Am J Hum Genet. 2004;74(1):93–105. G, Durbin R. The sequence alignment/map format and SAMtools. 50. Cowan J, Tariq M, Ware SM. Genetic and functional analyses of ZIC3 variants Bioinformatics. 2009;25(16):2078–9. in congenital heart disease. Hum Mutat. 2014;35(1):66–75. 28. Koboldt DC, Chen K, Wylie T, Larson DE, McLellan MD, Mardis ER, Weinstock GM, 51. Al Turki S, Manickaraj AK, Mercer CL, Gerety SS, Hitz M-P, Lindsay S, Wilson RK, Ding L. VarScan: variant detectioninmassively parallel sequencing of D’Alessandro LC, Swaminathan GJ, Bentham J, Arndt A-K. Rare variants in individual and pooled samples. Bioinformatics. 2009;25(17):2283–5. NR2F2 cause congenital heart defects in humans. Am J Hum Genet. 2014; 29. Tomita-Mitchell A, Maslen C, Morris C, Garg V, Goldmuntz E. GATA4 94(4):574–85. sequence variants in patients with congenital heart disease. J Med Genet. 52. Chen X, Kuja-Halkola R, Rahman I, Arpegård J, Viktorin A, Karlsson R, Hägg S, 2007;44(12):779–83. Svensson P, Pedersen NL, Magnusson PK. Dominant genetic variation and 30. Wang H, Chen D, Ma L, Meng H, Liu Y, Xie W, Pang S, Yan B. Genetic missing heritability for human complex traits: insights from twin versus analysis of the TBX1 gene promoter in ventricular septal defects. Mol Cell genome-wide common SNP models. Am J Hum Genet. 2015;97(5):708–14. Biochem. 2012;370(1–2):53–8. 53. Fraga MF, Ballestar E, Paz MF, Ropero S, Setien F, Ballestar ML, Heine-Suñer 31. Zhang F, Gu W, Hurles ME, Lupski JR. Copy number variation in human D, Cigudosa JC, Urioste M, Benitez J. Epigenetic differences arise during the health, disease, and evolution. Annu Rev Genomics Hum Genet. 
2009;10: lifetime of monozygotic twins. Proc Natl Acad Sci U S A. 2005;102(30): 451–81. 10604–9. 32. Boeva V, Popova T, Bleakley K, Chiche P, Cappo J, Schleiermacher G, 54. Fujimi TJ, Hatayama M, Aruga J. Xenopus Zic3 controls notochord and Janoueix-Lerosey I, Delattre O, Barillot E. Control-FREEC: a tool for assessing organizer development through suppression of the Wnt/β-catenin signaling copy number and allelic content using next-generation sequencing data. pathway. Dev Biol. 2012;361(2):220–31. Bioinformatics. 2011;28(3):423–5. 55. Mendoza-Villarroel RE, Robert NM, Martin LJ, Brousseau C, Tremblay JJ. The 33. Abyzov A, Urban AE, Snyder M, Gerstein M. CNVnator: an approach to nuclear receptor NR2F2 activates star expression and steroidogenesis in discover, genotype, and characterize typical and atypical CNVs from family mouse MA-10 and MLTC-1 Leydig cells. Biol Reprod. 2014;91(1) and population genome sequencing. Genome Res. 2011;21(6):974–84. 56. Hubert MA, Sherritt SL, Bachurski CJ, Handwerger S. Involvement of 34. Oda M, Yamagiwa A, Yamamoto S, Nakayama T, Tsumura A, Sasaki H, Nakao transcription factor NR2F2 in human trophoblast differentiation. PLoS One. K, Li E, Okano M. DNA methylation regulates long-range gene silencing of 2010;5(2):e9417. an X-linked homeobox gene cluster in a lineage-specific manner. Genes 57. Cortessis VK, Thomas DC, Levine AJ, Breton CV, Mack TM, Siegmund KD, Dev. 2006;20(24):3382–94. Haile RW, Laird PW. Environmental epigenetics: prospects for studying 35. Jaenisch R, Bird A. Epigenetic regulation of gene expression: how the epigenetic mediation of exposure–response relationships. Hum Genet. 2012; genome integrates intrinsic and environmental signals. Nat Genet. 2003; 131(10):1565–89. 33:245. 58. Hogenson TL. Epigenetics as the underlying mechanism for monozygotic 36. Gu H, Smith ZD, Bock C, Boyle P, Gnirke A, Meissner A. Preparation of twin discordance. Medical Epigenetics. 2013;1(1):3–18. 
reduced representation bisulfite sequencing libraries for genome-scale DNA 59. Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing methylation profiling. Nat Protoc. 2011;6(4):468. genomic features. Bioinformatics. 2010;26(6):841–2. 37. Kanehisa M, Furumichi M, Tanabe M, Sato Y, Morishima K. KEGG: new 60. Karolchik D, Hinrichs AS, Kent WJ: The UCSC genome browser. Current perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Res. protocols in bioinformatics 2012, 40(1):1.4. 1–1.4. 33. 2016;45(D1):D353–61. 38. Huang DW, Sherman BT, Tan Q, Kir J, Liu D, Bryant D, Guo Y, Stephens R, Baseler MW, Lane HC. DAVID bioinformatics resources: expanded annotation database and novel algorithms to better extract biology from large gene lists. Nucleic Acids Res. 2007;35(suppl_2):W169–75. 39. Kuleshov MV, Jones MR, Rouillard AD, Fernandez NF, Duan Q, Wang Z, Koplev S, Jenkins SL, Jagodnik KM, Lachmann A. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res. 2016;44(W1):W90–7. 40. Zhang Y, Rath N, Hannenhalli S, Wang Z, Cappola T, Kimura S, Atochina- Vasserman E, Lu MM, Beers MF, Morrisey EE. GATA and Nkx factors synergistically regulate tissue-specific gene expression and development in vivo. Development. 2007;134(1):189–98. 41. Wang Y. Mitogen-activated protein kinases in heart development and diseases. Circulation. 2007;116(12):1413–23. 42. Rose BA, Force T, Wang Y. Mitogen-activated protein kinase signaling in the heart: angels versus demons in a heart-breaking tale. Physiol Rev. 2010;90(4): 1507–46. 43. Madonna R, De Caterina R. VEGF receptor switching in heart development and disease. Cardiovasc Res. 2009;84(1):4–6. 44. Dor Y, Camenisch TD, Itin A, Fishman GI, McDonald JA, Carmeliet P, Keshet E. A novel role for VEGF in endocardial cushion formation and its potential contribution to congenital heart defects. Development. 2001;128(9):1531–8. 45. 
Sridurongrit S, Larsson J, Schwartz R, Ruiz-Lozano P, Kaartinen V. Signaling via the Tgf-β type I receptor Alk5 in heart development. Dev Biol. 2008; 322(1):208–18. 46. Bamforth SD, Bragança J, Farthing CR, Schneider JE, Broadbent C, Michell AC, Clarke K, Neubauer S, Norris D, Brown NA. Cited2 controls left-right patterning and heart development through a nodal-Pitx2c pathway. Nat Genet. 2004;36(11):1189. http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png BMC Genomics Springer Journals
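The DMR detection procedure described in Methods (100 bp windows, 50 bp step, a per-window Wilcoxon test, then merging of adjacent windows) can be illustrated with a minimal sketch. This is not the authors' code: the input layout, the function names `rank_sum_p` and `find_dmrs`, and the choice of a rank-sum test with a normal approximation are all assumptions made for illustration. The Methods only say "Wilcoxon test" and use BEDtools for the merging step.

```python
import math

WINDOW = 100      # window size in bp, as stated in Methods
STEP = 50         # sliding step in bp, as stated in Methods
MIN_DEPTH = 10    # CpG sites need more than 10x coverage in both samples
P_CUTOFF = 0.01   # significance threshold for a window
MIN_CPGS = 5      # filter threshold on the number of CpG sites
MIN_DIFF = 0.10   # filter threshold on the mean methylation difference


def rank_sum_p(x, y):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation, no tie correction).

    Assumption: the exact Wilcoxon variant used in the paper is not specified.
    """
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Extend j over a run of tied values and give them their average rank.
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    if sd_u == 0:
        return 1.0
    z = (u - mean_u) / sd_u
    return math.erfc(abs(z) / math.sqrt(2))


def find_dmrs(sites, chrom_length):
    """Return merged DMRs as (start, end) tuples for one chromosome.

    sites: iterable of (position, level_a, depth_a, level_b, depth_b),
    a hypothetical layout with methylation levels between 0 and 1.
    """
    usable = [(p, a, b) for p, a, da, b, db in sites
              if da > MIN_DEPTH and db > MIN_DEPTH]
    windows = []
    for start in range(0, max(chrom_length - WINDOW, 0) + 1, STEP):
        end = start + WINDOW
        levels = [(a, b) for p, a, b in usable if start <= p < end]
        if not levels:
            continue
        xs = [a for a, _ in levels]
        ys = [b for _, b in levels]
        if rank_sum_p(xs, ys) >= P_CUTOFF:
            continue
        diff = abs(sum(xs) / len(xs) - sum(ys) / len(ys))
        # Filter as worded in Methods: few CpG sites AND a small difference.
        if len(levels) < MIN_CPGS and diff < MIN_DIFF:
            continue
        windows.append((start, end))
    dmrs = []
    for start, end in windows:
        # Overlapping or touching windows are joined into one region.
        if dmrs and start <= dmrs[-1][1]:
            dmrs[-1] = (dmrs[-1][0], max(dmrs[-1][1], end))
        else:
            dmrs.append((start, end))
    return dmrs
```

The window scan here is quadratic in the worst case and is meant only to make the logic readable; a genome-scale run would index CpG sites by position first.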
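The clone-based bisulfite sequencing step in Methods relies on a simple reading rule: after bisulfite conversion, a cytosine that still reads as C at a CpG site was methylated, and one that reads as T was unmethylated. The sketch below illustrates that rule only. It assumes each clone read is already aligned to the reference without gaps and on the same strand, which BiQ Analyzer handles internally, and the function names are hypothetical.

```python
def cpg_positions(reference):
    """Return 0-based indices of the C in every CpG of the unconverted reference."""
    reference = reference.upper()
    return [i for i in range(len(reference) - 1) if reference[i:i + 2] == "CG"]


def call_clone(reference, clone):
    """Call each CpG in one clone: True (methylated), False (unmethylated), None (unreadable)."""
    if len(reference) != len(clone):
        raise ValueError("clone must be aligned to the reference without gaps")
    calls = []
    for i in cpg_positions(reference):
        base = clone[i].upper()
        if base == "C":
            calls.append(True)    # protected from conversion, so methylated
        elif base == "T":
            calls.append(False)   # converted, so unmethylated
        else:
            calls.append(None)    # sequencing error or ambiguous base
    return calls


def methylation_levels(reference, clones):
    """Fraction of informative clones that are methylated at each CpG site."""
    per_clone = [call_clone(reference, clone) for clone in clones]
    levels = []
    for site in range(len(cpg_positions(reference))):
        informative = [calls[site] for calls in per_clone if calls[site] is not None]
        levels.append(sum(informative) / len(informative) if informative else None)
    return levels
```

With at least 20 clones per sample, as in the protocol above, each per-site fraction is resolved in steps of 5% or finer.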

Genome and epigenome analysis of monozygotic twins discordant for congenital heart disease

Publisher: BioMed Central
Copyright: Copyright © 2018 by The Author(s).
Subject: Life Sciences; Life Sciences, general; Microarrays; Proteomics; Animal Genetics and Genomics; Microbial Genetics and Genomics; Plant Genetics and Genomics
eISSN: 1471-2164
DOI: 10.1186/s12864-018-4814-7

Abstract

Background: Congenital heart disease (CHD) is the leading non-infectious cause of death in infants. Monozygotic (MZ) twins share nearly all of their genetic variants before and after birth. Nevertheless, MZ twins are sometimes discordant for common complex diseases. The goal of this study is to identify genomic and epigenomic differences between a pair of twins discordant for a form of congenital heart disease, double outlet right ventricle (DORV). Results: A monoamniotic monozygotic (MZ) twin pair discordant for DORV were subjected to genome-wide sequencing and methylation analysis. We identified few genomic differences but 1566 differentially methylated regions (DMRs) between the MZ twins. Twenty percent (312/1566) of the DMRs are located within 2 kb upstream of transcription start sites (TSS), containing 121 binding sites of transcription factors. Particularly, ZIC3 and NR2F2 are found to have hypermethylated promoters in both the diseased twin and additional patients suffering from DORV. Conclusions: The results showed a high correlation between hypermethylated promoters at ZIC3 and NR2F2 and down- regulated gene expression levels of these two genes in patients with DORV compared to normal controls, providing new insight into the potential mechanism of this rare form of CHD. Keywords: CHD, MZ twins, ZIC3, NR2F2,DNA methylation,RRBS,WGS Background During the normal development of the heart, the out- Congenital heart disease (CHD) is the leading flow tract initially connects exclusively with the primitive non-infectious cause of death in infants. In Asia, CHD oc- right ventricle and must remodel to divide into a separ- curs in 9.3 per 1000 live births [1]. Double outlet right ven- ate pulmonary artery and aorta; subsequently, there is tricle (DORV), defined when both great arteries originate continued remodeling to establish direct continuity from from the morphological right ventricle in a heart, is a rare the left ventricle to the aorta [4]. 
In double outlet right form of congenital heart disease, accounting for 1–3% of all ventricle, drainage of the left ventricle is commonly CHD cases [2–4]. Multiple factors have been identified in achieved through a ventricular septal defect (VSD) at contributing to the disease, of which both genetic and epi- different locations and with varying relations to the pul- genetic changes and the interplay between them and the re- monary and aortic outflow tract. An insufficient mitral latedenvironment playakey role inthepathogenesis[5, 6]. valve and an atrial septal defect (ASD) can be found in DORV cases [3]. Since temporal and spatial expression of transcription factors (TFs) is a major determinant of * Correspondence: cheng_li@pku.edu.cn; nieyuniverse@126.com; weitao@pku.edu.cn cell lineage specification and patterning of the heart Guoliang Lyu, Chao Zhang and Te Ling contributed equally to this work. [7–10], mutations or expression dysregulation in them Center for Bioinformatics, School of Life Sciences, Peking University, Beijing can impair cardiac development and lead to congeni- 100871, China Department of Cardiovascular Surgery, Center for Cardiovascular tal heart malformations and dysfunction [7]. Changes Regenerative Medicine, Fuwai Hospital, Peking Union Medical College, of epigenetic modifications also affect the expression Chinese Academy of Medical Sciences, Beijing 100871, China patterns of these TFs [11]. Environmental alterations or Key Laboratory of Cell Proliferation and Differentiation, School of Life Sciences, Peking University, Beijing 100871, China © The Author(s). 
2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. Lyu et al. BMC Genomics (2018) 19:428 Page 2 of 13 stresses including drugs may perturb the TFs-related tran- extracted from whole blood samples of the MZ twins scriptional and epigenetic programs in the process of car- discordant for DORV (septal-defect heart (D3) and nor- diac development and give rise to CHD [12–14]. mal heart (D4) was sequenced by Illumina HiSeq X-Ten Cytosine methylation, an epigenetic modification, is (150 bp paired-end reads) and aligned to the hg19 refer- essential in mammalian development and particularly in ence human genome. The average sequencing depth of cell-lineage specification [15, 16]. Although each individ- D3 and D4 were 31.9 and 28.5 respectively, and 4-fold ual’s genome is fixed throughout life and across cell types, coverage represented more than 91% of the hg19 human epigenetic modifications are plastic and influence the reference genome (Additional file 1: Table S1). temporal and spatial pattern of gene expression [14]. Cell Single-nucleotide variations (SNVs) and short insertions/ type-specific DNA methylation patterns emerge during deletions (InDels) were identified using SAMtools [27] development and play a role in gene expression by direct- and filtered by Varscan [28]. More than 99.9% SNVs ing chromatin activation or interfering with the binding of shared between the MZ twins (Fig. 1a). Three hundred TFs [17]. 
Emerging evidence suggests that DNA methylation is responsive to both physical and social environments during pregnancy and early life [14, 18].

Monozygotic (MZ) twins share nearly all of their genetic variants before and after birth. Nevertheless, MZ twins are often discordant for common complex diseases, such as type 1 diabetes (T1D; 61%), type 2 diabetes (41%) [19], autism (58 to 60%) [20, 21], schizophrenia (58%), and different types of cancer (up to 16%) [22, 23]. These observations support the model that, for many complex traits, genotype alone may not fully determine phenotypic variation and the interplay between genes and environment needs to be considered; epigenetics has been proposed to be one of the main mediators of this interaction [24–26]. Therefore, disease-discordant MZ twin pairs provide an ideal model for examining epigenetic functions in diseases due to their shared genetic and environmental factors during the pregnancy [24]. Analysis of discordant MZ twins has been successfully used to study epigenetic mechanisms in aging, cancer, autoimmune disease, and psychiatric, neurological and other traits [20, 21, 23]. However, comprehensive analysis of genome-wide DNA methylation in an MZ twin pair discordant for double outlet right ventricle (DORV) is lacking.

In this study, we dissected the contribution of DNA methylation patterns to the pathogenesis of DORV through DNA methylation profiling at single-nucleotide level, using whole-blood DNA derived from an MZ twin pair discordant for DORV. The affected twin, a Chinese girl aged two years and two months who was diagnosed with DORV at birth, has a healthy MZ twin sister. Their parents have no past or family history of heart diseases and are not consanguineous. Thus, we hypothesize that genome-wide analysis in these phenotype-discordant twins will provide insights into genetic and epigenetic factors affecting normal heart development.

Results
Genome sequence differences in MZ twins discordant for DORV
We first tried to identify de novo genomic sequence variation through whole genome sequencing. DNA

and sixteen SNVs and 114 InDels were identified specifically in D3. After filtering by the dbSNP (Version 138) database, forty-one SNVs and 71 InDels in D3 remained (Additional file 2: Table S2). There was one deletion located in a non-coding RNA (ANKRD30BP2, pseudogene) and one synonymous SNV in an exon of the DSPP gene, but none of the SNVs or InDels altered proteins (Fig. 1b & Additional file 2: Table S2). No pathogenic variation specific to D3 was detected when filtered by the Online Mendelian Inheritance in Man (OMIM) and Human Genetic Mutation Database (HGMD). Sequence variants in the promoters of GATA4 and TBX1, which were closely related to VSD [29, 30], were found in both of our samples, indicating that the sequence variants in the promoters of these two transcription factors may not contribute to the pathogenesis of DORV in D3 (Additional file 3: Table S3).

Copy number variants (CNVs) are deletions or amplifications of DNA segments that arise from inappropriate chromatid recombination or segregation during cell division. As CNVs alter the gene expression dosage of contiguous genes, they may result in syndromic CHDs [31]. Using Control-FREEC [32], we detected 15 focal CNVs; however, these CNVs did not pass CNVnator confirmation criteria [33], suggesting the absence of bona fide CNVs (Fig. 1c; Additional file 4: Figure S1 & Additional file 5: Table S4). Structural variation (SV), typically a large insertion/deletion, inversion or translocation affecting a sequence length from 1 kb to 3 Mb, was not detected to be different between the two samples using the CREST software (Additional file 6: Figure S2). Together, the genomic analyses suggest that the DORV in D3 is not likely caused by genome alterations.

Lyu et al. BMC Genomics (2018) 19:428

Fig. 1 Overview of whole genome sequencing. a The numbers of SNVs and InDels detected in D3 and D4; 99.9% of loci are shared between the MZ twins. b There are 1736 SNVs or InDels specific to the disease sample D3, of which only 430 loci showing high confidence confirmed by Varscan are regarded as potential de novo SNVs/InDels. All loci are classified into 7 categories (upstream, downstream, intergenic, exonic, intronic, UTR5, UTR3) according to the relative position of nearby genes. The pie plot shows the number of loci and proportion of each category. c Normalized copy number profiles of D3 and D4. Each point shows a 5 kb window (chromosome 1) of sequencing reads normalized by GC-content and mappability using Control-FREEC

DNA methylation differences between the MZ twin pair
DNA methylation is involved in multiple processes, including regulation of gene expression, silencing of retrotransposons, genomic imprinting, X-chromosome inactivation and occurrence of various diseases [34, 35]. We next sought to compare genome-scale DNA methylation profiles between the MZ twins at nucleotide resolution. DNA samples were extracted from whole blood acquired before heart surgery, and gel-free reduced representation bisulfite sequencing (RRBS) libraries were constructed as previously reported [36] and then sequenced on the Illumina HiSeq2000 platform (100 bp paired-end reads). After aligning the reads to the hg19 reference genome, we obtained 56 million and 48 million high-quality, 100-nucleotide, uniquely mapped reads from the D3 and D4 samples, respectively. The two samples had very similar sequencing-depth patterns of cytosine sites and covered more than 34 million C sites with a sequencing depth of at least 5-fold (Additional file 7: Table S5). More than 6 million high-quality CpG dinucleotides were supported by at least 5 reads (depth ≥ 5×), which covered 10% of all CpG sites (Fig. 2a). In addition, we checked the methylation status of samples from blood cell and heart tissue in the ENCODE project (Accession ID: ENCFF388NTJ, ENCFF107RMQ). We found a high correlation between B cell and heart with Pearson's correlation coefficient equal to 0.9238, and visualizing the genome-wide methylation between them in IGV showed similar patterns (Additional file 8: Figure S3), suggesting that the DNA methylation pattern in blood correlates with that in heart.

The methylation levels of CpG dinucleotides in both samples showed a bimodal distribution with most CpG sites being unmethylated or extensively methylated (Fig. 2b). As expected, genome-wide patterns of DNA methylation (CpG and non-CpG) were highly correlated between the MZ twin pair (Figs. 2c-e & Additional file 9: Figure S4), suggesting that DORV is not associated with a global reprogramming of methylation.

Next, we sought to identify differentially methylated regions (DMRs) between the twin pair using a window sliding strategy to reduce the sampling variation of

Fig. 2 Comparison of CpG methylomes between two samples. a Overall read coverage of reduced representation bisulfite sequencing (RRBS) at CpGs between the twins. The RRBS strategy shows a high coverage rate of CpGs in the two samples. 10.88% and 10.72% of all human CpGs are sequenced by more than 5 reads (5×) in D3 and D4, respectively. b Overall distribution of CpG methylation levels in D3 and D4. This violin plot shows that most CpGs are 100% methylated or 0% methylated, and the global methylation level is unchanged between the pair (Wilcoxon test non-significant). c Average CpG methylation profile in gene body, upstream (− 2 kb of TSS) and downstream (+ 2 kb of TTS). d Circos plot shows the distribution of the DMRs along the genome.
Fig. 2 (continued) The tracks of the Circos plot (d), from outer to inner: CpG islands downloaded from UCSC (green); the –log10 p-value of detected DMRs (blue); the average methylation levels per 1 kb in D3 (red); the average methylation levels per 1 kb in D4 (green). e Scatter diagram of all CpG methylation levels calculated from RRBS data between the two samples, D3 and D4, showing similar overall CpG methylation levels with Pearson correlation coefficient equal to 0.9613

individual CpG sites (see Methods). We identified 1566 significant DMRs between the MZ twin pair (Fig. 3a). Hypermethylation and hypomethylation DMRs in D3 showed similar genomic distributions, with 80% (1254/1566) of the DMRs distributed in gene bodies or intergenic regions and 20% (312/1566) of the DMRs located within 2 kb upstream of TSS (Fig. 3b and Additional file 10: Table S6). We concluded that DMRs between the MZ twins may be involved in the pathogenesis of DORV.

Fig. 3 Distribution of D3 specific DMRs. a The scatter plot shows the found DMRs; the red color indicates the p-value, and non-significant regions are shown in gray. b The distribution of significant DMRs. All DMRs are classified into 2 main categories (hypermethylation and hypomethylation), and each category is further classified into 7 classes (upstream, downstream, intergenic, exonic, intronic, UTR5, UTR3) according to the relative position of nearby genes. The bar plot shows the proportion of each class. There are 294 genes that have a differentially methylated region in upstream (2 kb) of transcription start site

Gene ontology (GO) and transcription factor binding sites enriched in DMRs
We analyzed the genes that contained DMRs within 2 kb upstream of their TSS. In total, three hundred and twelve DMRs were located in the upstream of 621 TSSs (belonging to 294 genes annotated in Refseq) (Fig. 3b & Additional file 11: Table S7). These DMR-associated genes were then subjected to KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway and Gene Ontology (GO) analysis using the Enrichr and DAVID Web servers [37–39]. GO analysis revealed that genes associated with cardiolipin acyl-chain remodeling, cardiac muscle tissue regeneration, vascular function and organ growth are enriched in DMR-associated genes (Fig. 4a & Additional file 11: Table S7), suggesting that DMRs may be associated with the abnormal heart development of D3.

We next analyzed the 851 genes containing DMRs in their gene bodies. These genes were also enriched in several biological processes that are involved in various stages of cardiovascular development (Fig. 4b and Additional file 12: Table S8). Furthermore, multiple signaling pathways involved in heart development were also enriched, including regulation of BMP signaling and TGF beta receptor signaling (Fig. 4b and Additional file 12: Table S8). These results suggest that gene regulatory networks during the heart development of D3 may be affected by the DMRs.

Moreover, DMRs can influence gene expression through directly altering the binding of transcription factors to their targets. We analyzed the TF binding motifs in DMRs by using ANNOVAR (using tfbsConsSites database downloaded from UCSC). We identified 138 TFs (Additional file 10: Table S6) that belong to GATA family and NKX family, which are essential for cardiac development (Additional file 13: Table S9). GO analysis showed that many of these TFs were involved in heart development, muscle organ development, regulation of muscle cell differentiation, and blood vessel development (Fig. 4c and Additional file 13: Table S9). These TFs were also enriched in KEGG signaling pathways including MAPK signaling, Wnt signaling, TGF beta signaling, and VEGF signaling pathway (Fig. 4d and Additional file 13: Table S9), which were all involved in heart development [40–45]. Taken together, these results imply that DMRs in TF binding sites may contribute to the DORV pathogenesis.

Aberrant promoter methylation of ZIC3 and NR2F2 in the diseased twin
Since promoter hypermethylation is typically associated with the repression of gene transcription [44], we then focused on promoter DMRs for further analysis. We found that DMRs existed in the upstream of multiple genes whose family members have been reported to be involved in pathogenesis of CHDs, and their normal functions are critical in morphogenesis and establishment of the cardiovascular system during cardiac development [46–48]. These genes include CITED1 (member of CREB-binding protein/p300-interacting transactivator with Asp/Glu-rich C-terminal domain (CITED) family of proteins, hypomethylated), GATA2 (member of the GATA family of zinc-finger TFs, hypermethylated), SOX3 (member of the SOX (SRY-related HMG-box) family of transcription factors, hypomethylated) (Additional file 14: Figure S5, Additional file 15: Figure S6 and Additional file 16: Figure S7).

Fig. 4 Gene Ontology (GO) and KEGG analysis of D3 specific DMRs. Gene Ontology (GO) and pathway enrichment analysis of differentially methylated regions using Enrichr and DAVID under default parameters. The bar charts show the most relevant and significantly enriched terms. Terms that are highly related to CHD are marked in blue. The x-axis represents the –log10 of the enrichment p-value.

We also found that genes encoding important epigenetic factors contained DMRs in their upstream of TSS, implying that these factors may play roles in causing CHDs through regulating expression of genes implicated in heart development. The genes include NSD1 (Nuclear Receptor Binding SET Domain Protein 1, which preferentially methylates 'Lys-36' of histone H3 and 'Lys-20' of histone H4, hypomethylated), MTA2 (Metastasis Associated 1 Family, Member 2, a component of NuRD,
hypomethylated), MECP2 (Methyl CpG Binding Protein 2, hypermethylated) and SUV39H1 (Suppressor of Variegation 3–9 Homolog 1, a histone methyltransferase that trimethylates lysine 9 of histone H3, hypomethylated) (Additional file 17: Figure S8, Additional file 18: Figure S9, Additional file 19: Figure S10 and Additional file 20: Figure S11). We also utilized the public expression profiling data of embryo and adult heart from ENCODE (Accession ID: ENCFF704AHC, ENCFF199GQY, ENCFF987YOV) to analyze the possible contribution of these genes to DORV. We found that these genes were expressed in both embryo and heart (Additional file 21: Table S10), suggesting that these genes may play important roles in embryonic and cardiac development and that their dysregulated expression may contribute to CHD such as DORV.

Fig. 4 (continued) The y-axis represents the enriched terms in GO or KEGG databases. a GO enrichment analysis of genes associated with differentially methylated regions in 2 kb upstream of genes. b GO enrichment analysis of genes associated with differentially methylated regions in gene body. c GO enrichment analysis of TFs whose binding sites are differentially methylated. d KEGG pathway enrichment analysis of TFs whose binding sites are differentially methylated

Moreover, by scrutinizing all the DMRs located in gene promoter regions, we noticed that two genes, ZIC3 and NR2F2, encode TFs annotated with CHD in OMIM (300265 and 107773) [49–51]. In the DORV diseased twin D3, the upstream of ZIC3 (harboring P300 and HNF1 binding sites) was hypermethylated, corresponding to a region with high Pol II binding density in hESC cells (ENCODE data) (Fig. 5a and c). Similarly, hypermethylation of NR2F2 was detected in the upstream of the TSS (harboring an IRF2 binding site) of the shortest NR2F2 transcript variant, corresponding to a region with high intensity of TBP and Pol II binding in K562 cells (ENCODE data) (Fig. 5b and d). These results suggested a possible association of hypermethylated promoters of ZIC3 and NR2F2 and their functions during the heart development of DORV patients.

Hypermethylation and dysregulation of ZIC3 and NR2F2 in additional DORV patients
In order to further confirm the dysregulation of ZIC3 and NR2F2 in DORV pathogenesis, we collected twenty DNA samples of whole blood from normal individuals and clinical DORV patients. In order to guarantee the similarity in age and in gender ratio between the normal group and the patient cases, the samples included five controls with normal heart development (aged 0.8–3.8 years; 3 males, 2 females) and fifteen cases with DORV diagnosis (aged 1–3.5 years; 9 males, 6 females). Using bisulfite sequencing, we confirmed hypermethylated promoters of both ZIC3 (Fig. 6a and b) and NR2F2 (Fig. 6c and d) in twelve of fifteen DORV patients compared to normal subjects (Additional file 22: Figure S12, Additional file 23: Figure S13). The association of DORV and hypermethylated ZIC3 and NR2F2 promoters showed a significant correlation by Fisher's exact test (p values are both 0.0036) (Fig. 6a and c). Correspondingly, samples harboring hypermethylated promoters of the two genes have comparatively lower

Fig. 5 Aberrant methylation in the upstream regions of ZIC3 and NR2F2, visualized in UCSC genome browser. DMRs are indicated by light blue bars; methylation levels in the twins are shown in blue (D3) and red (D4) bars. Transcription factor binding sites are also shown in zooming-in panels, indicated by black bars with names marked in front. An arrow gives TSS and transcriptional orientation. Two genes, ZIC3 (a) and NR2F2 (b), are differentially methylated in the upstream of TSS, and these two genes are known to be associated with CHD.
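The Fisher's exact test reported above for the association between DORV and promoter hypermethylation can be checked with a short, self-contained sketch. It assumes a 2x2 table in which 12 of 15 DORV cases and 0 of 5 controls carry a hypermethylated promoter. The count for controls is not stated explicitly in the text and is an assumption, chosen because it reproduces the reported p-value. The function name fisher_exact_two_sided is illustrative and is not taken from the authors' analysis.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_observed * (1 + 1e-9))


# Rows: DORV cases, controls. Columns: hypermethylated, not hypermethylated.
# 12/15 cases is from the text; 0/5 controls is an assumption (see lead-in).
p_value = fisher_exact_two_sided(12, 3, 0, 5)
print(f"p = {p_value:.4f}")  # p = 0.0036
```

Under this assumed table only the observed configuration is as extreme as itself, so the two-sided p-value equals the single hypergeometric term 455/125970, which rounds to the 0.0036 quoted for both genes.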
Fig. 5 (continued) Transcription factor binding site analyses are performed with ChIP-seq data of RNA Pol II and TBP derived from ENCODE. Bisulfite sequencing data (c and d) show methylation status of the same region of ZIC3 and NR2F2 as in (a) and (b). Methylated and unmethylated CpG sites are shown as black and white circles, respectively

gene expression levels (Fig. 6e and f), and the methylation and gene expression levels of these two genes were negatively correlated (Fig. 6g and h). These results confirm that lower gene expression levels of ZIC3 and NR2F2 are associated with promoter hypermethylation of these two genes in normal individuals and DORV patients.

Discussion
This study represents the first analysis of genome and epigenome profiling of MZ twins discordant for DORV, and provides evidence for the presence of epigenetic differences between the twin pair. Genetic variations between twins can affect protein coding, gene transcription and epigenetic modifications [52]. We therefore first performed genomic sequence variation detection and revealed some differences between the twin pair, but stringent filtering analyses of CNVs, SNVs and short InDels failed to identify genomic differences that may contribute to pathogenic DORV. Even though sequence variants within the promoter regions of GATA4 and TBX1 were reported to contribute to congenital heart disease by altering their gene expression [10, 29, 30], our results showed that the sequence variants existed in both the normal and diseased twins, indicating that the pathogenesis of DORV may not be due to these differences. It is possible that the shared sequence variants showed different penetrance, which may cause the DORV in one of the MZ twins. Notably, sequence alignment in our study covered 91% of the hg19 human reference genome, leaving the rest of the genome not assessed. Therefore, we cannot rule out the possibility that genome sequence differences are involved in DORV of this twin pair.

Epigenetic variation at specific genomic regions has high heritability, and MZ twins typically share similar epigenetic profiles [25]. However, a set of factors including dietary components, physical changes, psychological states and environmental changes could affect epigenomes of the two years and two months old twins after birth [53]. In this study, many DMR-related genes (within upstream or gene body) are enriched in pathways that contribute to cardiac development. Most importantly, we found that ZIC3 and NR2F2, which are annotated with CHD in the OMIM database, were hypermethylated at the TSS upstream region of the diseased twin compared to the normal heart sample. ZIC3 is a member of the ZIC family of C2H2-type zinc finger proteins, which functions as a TF in early stages of left-right body axis formation and heart development [53]. ZIC3 acts in organizer formation by inhibiting the canonical Wnt signaling pathway, and its expression is regulated by determinants of the early neural fate specification and dorsal-ventral (D-V) axis formation, including BMP, FGF, and Nodal signaling pathways [54]. Mutations in ZIC3 result in heterotaxy or isolated CHD (phenotypes including DORV, ASD and VSD) [49, 50]. In addition, the ZIC3 gene is located on the X chromosome, so it may contribute to the pathogenesis of CHD differently in males and females. NR2F2 was identified to be a member of the steroid thyroid hormone superfamily of nuclear receptors, involved in the regulation of many different genes in development [55, 56]. In human cases and mouse models, NR2F2 has been crucially implicated in angiogenesis and heart development, and abnormal expression or depletion of NR2F2 leads to AVSD (atrioventricular septal defect) and VSD (ventricular septal defect) [51]. Consistently, we also found DMRs in BMP and Wnt signaling pathway-related genes, indicating that the regulation of these signaling pathways by ZIC3 or NR2F2 may be critical for normal heart development. Using additional normal and clinical DORV samples, we confirmed promoter hypermethylation of ZIC3 and NR2F2 in DORV patients and the anti-correlation between their methylation and gene expression. Taken together, aberrant methylation at promoter regions of ZIC3 and NR2F2 and their dysregulated gene transcription levels may contribute to DORV in human heart development. However, a decisive conclusion needs further investigation, since epigenetic changes in blood may not be able to fully reflect the causative basis of the disease, due to the lack of more DORV cases in twins and the difficulty in obtaining heart samples.

Fig. 6 DNA methylation and gene expression detection of ZIC3 and NR2F2 from clinical cases. (a and c) Statistical summaries about DNA methylation status of DMRs in ZIC3 and NR2F2 in 20 clinical samples, consisting of five normal providers and fifteen DORV patients. (b and d) Diagrams exhibiting average methylation levels of individual CpG sites in DMRs of ZIC3 and NR2F2 from the indicated groups, respectively. (e and f) Histograms showing relative gene expression levels of ZIC3 and NR2F2 in different groups of specimens.

DMRs were also found in the upstream of CITED1, GATA2, SOX3 and some important epigenetic genes, including MTA2, NSD1, MECP2 and SUV39H1, indicating that they might also contribute to DORV. The MZ twins shared a similar but not exactly the same environment, especially the postnatal environment. Thus, we suggest that the non-shared environmental and stochastic
factors, including physical changes, chemical pollutants, dietary components, temperature changes and other external stresses during pregnancy [6, 57, 58], may contribute to the pathogenesis of DORV through the mediation of epigenetic changes.

Fig. 6 (continued) (g and h) Scatterplots showing that the gene expression levels of ZIC3 (g) and NR2F2 (h) are negatively correlated with their promoter methylation status. Pearson's correlation coefficient and p-values are listed above the plot

Conclusions
In conclusion, disease-discordant MZ twin pairs are outstanding subjects to study epigenetic mechanisms driving a number of pathologies. Here, using DNA methylation profiling technology to analyze genome-wide DNA methylation, we described differentially methylated regions in a DORV-discordant MZ twin pair. A limitation of our study is that we only obtained one MZ twin pair discordant for DORV, and the present results call for more DORV discordant twins and extensive tests for the generalizability of our findings. Nevertheless, our results provide new insights into the mechanism of DORV, a rare disease that has been less studied by genomic and epigenomic approaches.

Methods
Patients and materials
Genomic DNA was extracted from the donated whole blood samples of DORV patients and normal people by using the DNeasy Blood & Tissue Kit (Qiagen, Cat no. 69504). This study was conducted in accordance with the principles of the Declaration of Helsinki and has been reviewed and approved by the Medical Ethics Committee of Fuwai Hospital. Written informed consent was obtained from the twins' parents and mentioned samples' providers.

Reduced representation bisulfite sequencing (RRBS)
MspI-digested RRBS library was prepared as published [36], and one hundred bp paired-end reads were generated on the Illumina Hiseq2000 platform (BIOPIC, Peking University, Beijing). Adapters and low quality bases were trimmed from raw reads using trim_galore in RRBS mode. Human genome (hg19) was indexed with bismark_genome_preparation (a script from bismark mapping package), and then all clean reads were aligned to the indexed human genome using bismark (--bowtie2). To extract the methylation information for individual cytosines, bismark_methylation_extractor (-p --cytosine_report --CX --no_overlap) in paired-end mode was applied, and the output CX_report file was sorted by chromosomes using linux shell commands (awk). The sorted CX_report files were then used for downstream analysis.

DNA methylation detection and quantitative RT-PCR
To monitor CpG methylation of screened DMRs in promoters, genomic DNA was treated with sodium bisulfite using EpiTect Bisulfite Kit (Qiagen, USA). The converted DNA was then amplified by PCR with specific primers (ZIC3: Forward: 5′-GAGTGATTGATTTTATTAGTTTAAGGATAT-3′, Reverse: 5′-AACCAAAAAACTCCCTAAATACC-3′; NR2F2: Forward: 5′-GAAGTAGGAAAGGGTGGG-3′, Reverse: 5′-CGAACCCAAACTATTATCTAAC-3′). PCR products were purified, ligated into pEasy-T5 vector (Transgene, China) and then transformed into competent Escherichia coli. When bacterial colonies appeared on the plate, at least 20 independent clones were selected and sequenced. The sequencing results were analyzed by BiQ analyzer (Max Planck Institute, Germany). Total RNA was extracted with RNAliquid Kit (Aidlab, China) and mRNA expression levels were detected with one-step RT-PCR kit (Takara, China) on lightcycler (Roche, USA). RT-qPCR primers were as follows: ZIC3: Forward: 5′-GGCGCTCAGTTTCCTAACTAC-3′, Reverse: 5′-CTGCCGCATATAACGGAAGAA-3′; NR2F2: Forward: 5′-AACCAGCCGACGAGATTCG-3′, Reverse: 5′-CCCGGATGAGGGTTTCGATG-3′.

DMRs detection
Differentially methylated regions (DMRs) were detected based on a window sliding method. We used a 100 bp window sliding on the genome at a 50 bp step to find differentially methylated windows (DMWs) between two samples, and the neighboring windows were joined together as a DMR. Only >10X CpG sites were used to calculate DMWs. To test the different average methylations in a window between two samples, Wilcoxon test was applied, and a window with p-value < 0.01 was then considered as a DMW. A significant DMW was discarded if there were fewer than five CpG sites in the window and the difference in average methylation levels between the two samples was less than 10%. Finally, adjacent DMWs were joined together as DMRs using BEDtools (intersectBed). DMRs were annotated by BEDtools (intersectBed) and ANNOVAR [59].

Gene ontology (GO) and pathway enrichment
Annotated DMRs were separated into 3 subsets: gene upstreams (2 kb in front of TSS), gene bodies and transcription factor binding sites. Genes which had DMRs in upstreams and gene bodies were submitted to Enrichr and Database for Annotation, Visualization and Integrated Discovery (DAVID), respectively. GO enrichments used the GOTERM_BP_FAT database under default parameters, and pathway enrichments used the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. DMRs which were annotated as transcription factor binding sites by ANNOVAR (using tfbsConsSites database downloaded from UCSC [60]) were considered to influence the TFs' function, so we analyzed those TFs using DAVID. GO (biological process) and pathway enrichments were obtained to understand those DMRs' biological meanings.

Additional files
Additional file 1: Table S1. Overview of WGS data. (XLSX 39 kb)
Additional file 2: Table S2. D3 specific SNVs and short InDels (XLSX 82 kb)
Additional file 3: Table S3. Sequence variations reported related to VSD. (XLSX 30 kb)
Additional file 4: Figure S1. Comparison of copy number profiles of 22 pairs of autochromosomes and X chromosome between twin pair. Normalized copy number profiles of D3 and D4. Each point shows a 5 kb window (all chromosomes) of sequencing reads normalized by GC-content and mappability using Control-FREEC. (PDF 1756 kb)
Additional file 5: Table S4. Analysis of CNVs in two samples. (XLSX 46 kb)
Additional file 6: Figure S2. D3 specific structure variations (SVs) analysis. Alignments of reads in both D3 (top panel) and D4 (bottom panel) at each break point detected by CREST. The header lines in blue give the detailed information of each structure variation (column meanings: left_chr, left_pos, left_strand, number of left soft-clipped reads, right_chr, right_pos, right_strand, number right soft-clipped reads, SV type, coverage at left_pos, coverage at right_pos, assembled length at left_pos, assembled length at right_pos, average percent identity at left_pos, percent of non-unique mapping reads at left_pos, average percent identity at right_pos, percent of non-unique mapping reads at right_pos, start position of consensus mapping to genome, starting chromosome of consensus mapping, position of the genomic mapping of consensus starting position, end position of consensus mapping to genome, ending chromosome of consensus mapping, position of genomic mapping of consensus ending position, and consensus sequences). (PDF 626 kb)
Additional file 7: Table S5. Overview of RRBS data. (XLS 26 kb)
Additional file 8: Figure S3. Analysis of WGBS data of B cell and heart tissue from the ENCODE project. (A) Scatter plot and Pearson's correlation analysis of DNA methylation of B cell and heart. Pearson's correlation coefficient was listed above the plot. (B) Visualization of the DNA methylation status of B cell and heart in chr1:1-3 M by IGV. (JPG 500 kb)
Additional file 9: Figure S4. Comparison of systemic changes of methylome between two samples. (A) Cumulative depth distribution of RRBS data. The x-axis represents the depth of cytosine, and the y-axis represents the fraction of cytosine ≤ depth. (B) Methylation levels distribution in two samples of three different kinds of cytosine (CG, CHG, CHH). The x-axis is the methylation level; y-axis shows the log2 counts of the cytosine under a methylated level. (C) CHH methylation profile in the gene body, upstream and downstream. (D) CHG methylation profile in the gene body, upstream and downstream. (PDF 891 kb)
Additional file 10: Table S6. DMRs between two samples. (XLSX 580 kb)
Additional file 11: Table S7. GO and KEGG analysis of DMRs located in upstream region. (XLSX 43 kb)
Additional file 12: Table S8. GO and KEGG analysis of DMRs located in gene body. (XLSX 99 kb)
Additional file 13: Table S9. GO and KEGG analysis of DMRs located in TF binding sites. (XLSX 80 kb)
Additional file 14: Figure S5. Aberrant methylation in the upstream regions of CITED1. Visualizing the methylation levels of DMRs near CITED1 with UCSC genome browser. Methylation levels in the twins are shown in blue (D3) and red (D4). Transcription factor binding sites are also shown in zooming-in panels, indicated by black bars with names marked in front. Arrows give TSSs and transcriptional orientation. Transcription factor binding sites, Pol II ChIP-seq and TBP ChIP-seq data from ENCODE. (PDF 910 kb)
Additional file 15: Figure S6. Aberrant methylation in the upstream regions of GATA2. Visualizing the methylation levels of DMRs near GATA2 with UCSC genome browser. Methylation levels in the twins are shown in blue (D3) and red (D4). An arrow gives TSS and transcriptional orientation. Transcription factor binding sites, Pol II ChIP-seq and TBP ChIP-seq data from ENCODE. (PDF 767 kb)
Additional file 16: Figure S7. Aberrant methylation in the upstream regions of SOX3. Visualizing the methylation levels of DMRs near SOX3 with UCSC genome browser. Methylation levels in the twins are shown in blue (D3) and red (D4). Transcription factor binding sites are also shown in zooming-in panels, indicated by black bars with names marked in front. An arrow gives TSS and transcriptional orientation. Transcription factor binding sites, Pol II ChIP-seq and TBP ChIP-seq data from ENCODE. (PDF 590 kb)
Additional file 17: Figure S8. Aberrant methylation in the upstream regions of NSD1. Visualizing the methylation levels of DMRs near NSD1 with UCSC genome browser. Methylation levels in the twins are shown in blue (D3) and red (D4). Transcription factor binding sites are also shown in zooming-in panels, indicated by black bars with names marked in front. An arrow gives TSS and transcriptional orientation. Transcription factor binding sites, Pol II ChIP-seq and TBP ChIP-seq data from ENCODE. (PDF 781 kb)
Additional file 18: Figure S9. Aberrant methylation in the upstream regions of MTA2. Visualizing the methylation levels of DMRs near MTA2 with UCSC genome browser. Methylation levels in the twins are shown in blue (D3) and red (D4). Transcription factor binding sites are also shown in zooming-in panels, indicated by black bars with names marked in front. An arrow gives TSS and transcriptional orientation. Transcription factor binding sites, Pol II ChIP-seq and TBP ChIP-seq data from ENCODE. (PDF 633 kb)
Additional file 19: Figure S10. Aberrant methylation in the upstream regions of MECP2. Visualizing the methylation levels of DMRs near MECP2 with UCSC genome browser. Methylation levels in the twins are shown in blue (D3) and red (D4). Transcription factor binding sites are also shown in zooming-in panels, indicated by black bars with names marked in front. Arrows give TSSs and transcriptional orientation. Transcription factor binding sites, Pol II ChIP-seq and TBP ChIP-seq data from ENCODE. (PDF 969 kb)
Additional file 20: Figure S11. Aberrant methylation in the upstream regions of SUV39H1. Visualizing the methylation levels of DMRs near SUV39H1 with UCSC genome browser. Methylation levels in the twins are shown in blue (D3) and red (D4). Transcription factor binding sites are also shown in zooming-in panels, indicated by black bars with names marked in front. An arrow gives TSS and transcriptional orientation. Transcription factor binding sites, Pol II ChIP-seq and TBP ChIP-seq data from ENCODE. (PDF 748 kb)
Additional file 21: Table S10. Expression profiling data of embryo and adult heart. (XLSX 9 kb)
Additional file 22: Figure S12. DNA methylation detection of ZIC3 from clinical samples. Bisulfite sequencing tested DNA methylation status of DMRs in ZIC3 in 20 clinical samples, five normal providers (1–5) and fifteen DORV patients (6–20). Methylated and unmethylated CpG sites are indicated as black and white circles, respectively. (PDF 5226 kb)
Additional file 23: Figure S13. DNA methylation detection of NR2F2 from clinical samples. Bisulfite sequencing detected DNA methylation status of DMRs in NR2F2 in 20 clinical samples, five normal providers (1–5) and fifteen DORV patients (6–20). Methylated and unmethylated CpG sites are indicated as black and white circles, respectively. (PDF 5463 kb)

Abbreviations
ASD: Atrial septal defect; CHD: Congenital heart disease; CITED1: Cbp/p300 interacting transactivator with Glu/Asp rich carboxy-terminal domain 1; CNVs: Copy number variations; DAVID: Database for Annotation, Visualization and Integrated Discovery; DMRs: Differentially methylated regions; DMWs: Differentially Methylated Windows; DORV: Double outlet right ventricle; GATA2: GATA-binding protein 2; GATA4: GATA-binding factor 4; HGMD: Human Genetic Mutation Database; HNF1: Hepatocyte nuclear factor 1; InDels: Insertions/deletions; IRF2: Interferon regulatory factor 2; KEGG: Kyoto Encyclopedia of Genes and Genomes; MAPK: Mitogen activated protein kinase; MECP2: Methyl CpG binding protein 2; MTA2: Metastasis associated 1 family member 2; MZ: Monozygotic; NR2F2: Nuclear receptor subfamily 2 group F member 2; NSD1: Nuclear receptor binding SET domain protein 1; OMIM: Online Mendelian Inheritance in Man; P300: Histone acetyltransferase p300; Pol II: RNA polymerase II; RRBS: Reduced representation bisulfite sequencing; SNVs: Single nucleotide variations; SOX3: SRY-box 3; SUV39H1: Suppressor of variegation 3–9 homolog 1; T1D: Type 1 diabetes; TBP: TATA-box binding protein; TBX1: T-box transcription factor 1; TFs: Transcription factors; TGF: Transforming growth factor; TSS: Transcription start sites; VEGF: Vascular endothelial growth factor; VSD: Ventricular septal defect; ZIC3: Zic family member 3

Acknowledgements
We are grateful to participants for kindly providing us with clinical samples. We are thankful to Fuwai Hospital for taking care of the participants. We thank the Bioinformatics Core, School of Life Sciences, Peking University for providing bioinformatics analysis, and the reviewers for critical comments.

Funding
This work was supported by 2014 China Postdoctoral Science Foundation 55th General Financial Grant (No. 2014M550007), 2013 Postdoctoral Fellowship of Peking University-Tsinghua University Center for Life Sciences, the National Natural Science Foundation of China (NSFC, Grant No. 31471205, 31671426 and 91219101) and the National Basic Research Program of China (973 Program, Grant No. 2010CB529500 and 2013CB530700). These funding bodies had no role in the design of the study, collection, analysis,

Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Received: 23 June 2017 Accepted: 22 May 2018

References
1. Fahed AC, Gelb BD, Seidman J, Seidman CE. Genetics of congenital heart disease: the glass half empty. Circ Res. 2013;112(4):707–20.
2. McMahon CJ, Breathnach C, Betts DR, Sharkey FH, Greally MT. De novo interstitial deletion 13q33.3q34 in a male patient with double outlet right ventricle, microcephaly, dysmorphic craniofacial findings, and motor and developmental delay. Am J Med Genet A. 2015;167(5):1134–41.
3. Hartge DR, Niemeyer L, Axt-Fliedner R, Krapp M, Gembruch U, Germer U, Weichert J. Prenatal detection and postnatal management of double outlet right ventricle (DORV) in 21 singleton pregnancies. J Matern Fetal Neonatal Med. 2012;25(1):58–63.
4. Obler D, Juraszek A, Smoot LB, Natowicz MR. Double outlet right ventricle: aetiologies and associations. J Med Genet. 2008;45(8):481–97.
5. Ordovás JM, Smith CE. Epigenetics and cardiovascular disease. Nat Rev Cardiol. 2010;7(9):510.
6. Vecoli C, Pulignani S, Foffa I, Grazia Andreassi M. Congenital heart disease: the crossroads of genetics, epigenetics and environment. Curr Genomics. 2014;15(5):390–9.
7. Bruneau BG. Signaling and transcriptional networks in heart development and regeneration. Cold Spring Harb Perspect Biol. 2013;5(3):a008292.
8. Hatcher CJ, Basson CT. Specification of the cardiac conduction system by transcription factors. Circ Res. 2009;105(7):620–30.
9. Kathiriya IS, Nora EP, Bruneau BG. Investigating the transcriptional control of cardiovascular development. Circ Res. 2015;116(4):700–14.
10. Stefanovic S, Christoffels VM. GATA-dependent transcriptional and epigenetic control of cardiac lineage specification and differentiation. Cell Mol Life Sci. 2015;72(20):3871–81.
11. Liu L, Jin G, Zhou X. Modeling the relationship of epigenetic modifications to transcription factor binding. Nucleic Acids Res. 2015;43(8):3873–85.
12. Feil R, Fraga MF. Epigenetics and the environment: emerging patterns and implications. Nat Rev Genet. 2012;13(2):97.
13. Handy DE, Castro R, Loscalzo J. Epigenetic modifications: basic mechanisms and role in cardiovascular disease. Circulation. 2011;123(19):2145–56.
14. Szyf M. The early life social environment and DNA methylation: DNA methylation mediating the long-term impact of social environments early in life. Epigenetics.
2011;6(8):971–8. and interpretation of data, nor in writing the manuscript. 15. Smith ZD, Meissner A. DNA methylation: roles in mammalian development. Nat Rev Genet. 2013;14(3):204. Availability of data and materials 16. Hodges E, Molaro A, Dos Santos CO, Thekkat P, Song Q, Uren PJ, Park J, The datasets used and/or analysed during the current study available from Butler J, Rafii S, McCombie WR. Directional DNA methylation changes and the corresponding author on reasonable request. complex intermediate states accompany lineage specificity in the adult hematopoietic compartment. Mol Cell. 2011;44(1):17–28. Authors’ contributions 17. Brunner AL, Johnson DS, Kim SW, Valouev A, Reddy TE, Neff NF, Anton E, GLL and TL designed and performed molecular experiments, CZ contributed Medina C, Nguyen L, Chiao E. Distinct DNA methylation patterns to bioinformatics analysis, they contributed equally to this work; RL helped characterize differentiated human embryonic stem cells and developing with collections of clinical blood samples, CL, LZ, YTG, XKH, LS, LH and LJZ human fetal liver. Genome Res. 2009;19(6):1044–56. contributed to data analysis; WT, CL and YN conceived the project and 18. Kanherkar RR, Bhatia-Dey N, Csoka AB. Epigenetics across the human designed experiments; TL, GLL, CZ, CL and WT wrote the manuscript. All lifespan. Frontiers in cell and developmental biology. 2014;2:49. authors read and approved the final version of the manuscript. 19. Yuan W, Xia Y, Bell CG, Yet I, Ferreira T, Ward KJ, Gao F, Loomis AK, Hyde CL, Wu H. An integrated epigenomic analysis for type 2 diabetes susceptibility Ethics approval and consent to participate loci in monozygotic twins. Nat Commun. 2014;5:5719. This study was conducted in accordance with the principles of the 20. Wong C, Meaburn EL, Ronald A, Price T, Jeffries AR, Schalkwyk L, Plomin R, Declaration of Helsinki and has been reviewed and approved by the Medical Mill J. 
Methylomic analysis of monozygotic twins discordant for autism Ethics Committee of Fuwai Hospital (Approval no. 229). We confirm the spectrum disorder and related behavioural traits. Mol Psychiatry. 2014; informed written consent was obtained from all participants. 19(4):495. 21. Arora M, Reichenberg A, Willfors C, Austin C, Gennings C, Berggren S, Consent for publication Lichtenstein P, Anckarsäter H, Tammimies K, Bölte S. Fetal and postnatal For investigations involving clinical subjects, informed written consent has metal dysregulation in autism. Nat Commun. 2017;8:15493. been obtained from the participants involved. The participants consented to 22. Caramori ML, Kim Y, Moore JH, Rich SS, Mychaleckyj JC, Kikyo N, Mauer M. publish all of the sequencing data. Gene expression differences in skin fibroblasts in identical twins discordant for type 1 diabetes. Diabetes. 2012;61(3):739–44. Competing interests 23. Castillo-Fernandez JE, Spector TD, Bell JT. Epigenetics of discordant The authors declare that they have no competing interests. monozygotic twins: implications for disease. Genome Med. 2014;6(7):60. Lyu et al. BMC Genomics (2018) 19:428 Page 13 of 13 24. Bell JT, Saffery R. The value of twins in epigenetic epidemiology. Int J 47. Connelly JJ, Wang T, Cox JE, Haynes C, Wang L, Shah SH, Crosslin DR, Hale Epidemiol. 2012;41(1):140–50. AB, Nelson S, Crossman DC. GATA2 is associated with familial early-onset 25. Bell JT, Spector TD. A twin approach to unraveling epigenetics. Trends coronary artery disease. PLoS Genet. 2006;2(8):e139. Genet. 2011;27(3):116–25. 48. Paul MH, Harvey RP, Wegner M, Sock E. Cardiac outflow tract development relies on the complex function of Sox4 and Sox11 in multiple cell types. 26. Papadopoulos GK, Wijmenga C, Koning F. Interplay between genetics and Cell Mol Life Sci. 2014;71(15):2931–45. the environment in the development of celiac disease: perspectives for a 49. 
Ware SM, Peng J, Zhu L, Fernbach S, Colicos S, Casey B, Towbin J, Belmont healthy life. J Clin Invest. 2001;108(9):1261–6. JW. Identification and functional analysis of ZIC3 mutations in heterotaxy 27. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis and related congenital heart defects. Am J Hum Genet. 2004;74(1):93–105. G, Durbin R. The sequence alignment/map format and SAMtools. 50. Cowan J, Tariq M, Ware SM. Genetic and functional analyses of ZIC3 variants Bioinformatics. 2009;25(16):2078–9. in congenital heart disease. Hum Mutat. 2014;35(1):66–75. 28. Koboldt DC, Chen K, Wylie T, Larson DE, McLellan MD, Mardis ER, Weinstock GM, 51. Al Turki S, Manickaraj AK, Mercer CL, Gerety SS, Hitz M-P, Lindsay S, Wilson RK, Ding L. VarScan: variant detectioninmassively parallel sequencing of D’Alessandro LC, Swaminathan GJ, Bentham J, Arndt A-K. Rare variants in individual and pooled samples. Bioinformatics. 2009;25(17):2283–5. NR2F2 cause congenital heart defects in humans. Am J Hum Genet. 2014; 29. Tomita-Mitchell A, Maslen C, Morris C, Garg V, Goldmuntz E. GATA4 94(4):574–85. sequence variants in patients with congenital heart disease. J Med Genet. 52. Chen X, Kuja-Halkola R, Rahman I, Arpegård J, Viktorin A, Karlsson R, Hägg S, 2007;44(12):779–83. Svensson P, Pedersen NL, Magnusson PK. Dominant genetic variation and 30. Wang H, Chen D, Ma L, Meng H, Liu Y, Xie W, Pang S, Yan B. Genetic missing heritability for human complex traits: insights from twin versus analysis of the TBX1 gene promoter in ventricular septal defects. Mol Cell genome-wide common SNP models. Am J Hum Genet. 2015;97(5):708–14. Biochem. 2012;370(1–2):53–8. 53. Fraga MF, Ballestar E, Paz MF, Ropero S, Setien F, Ballestar ML, Heine-Suñer 31. Zhang F, Gu W, Hurles ME, Lupski JR. Copy number variation in human D, Cigudosa JC, Urioste M, Benitez J. Epigenetic differences arise during the health, disease, and evolution. Annu Rev Genomics Hum Genet. 
2009;10: lifetime of monozygotic twins. Proc Natl Acad Sci U S A. 2005;102(30): 451–81. 10604–9. 32. Boeva V, Popova T, Bleakley K, Chiche P, Cappo J, Schleiermacher G, 54. Fujimi TJ, Hatayama M, Aruga J. Xenopus Zic3 controls notochord and Janoueix-Lerosey I, Delattre O, Barillot E. Control-FREEC: a tool for assessing organizer development through suppression of the Wnt/β-catenin signaling copy number and allelic content using next-generation sequencing data. pathway. Dev Biol. 2012;361(2):220–31. Bioinformatics. 2011;28(3):423–5. 55. Mendoza-Villarroel RE, Robert NM, Martin LJ, Brousseau C, Tremblay JJ. The 33. Abyzov A, Urban AE, Snyder M, Gerstein M. CNVnator: an approach to nuclear receptor NR2F2 activates star expression and steroidogenesis in discover, genotype, and characterize typical and atypical CNVs from family mouse MA-10 and MLTC-1 Leydig cells. Biol Reprod. 2014;91(1) and population genome sequencing. Genome Res. 2011;21(6):974–84. 56. Hubert MA, Sherritt SL, Bachurski CJ, Handwerger S. Involvement of 34. Oda M, Yamagiwa A, Yamamoto S, Nakayama T, Tsumura A, Sasaki H, Nakao transcription factor NR2F2 in human trophoblast differentiation. PLoS One. K, Li E, Okano M. DNA methylation regulates long-range gene silencing of 2010;5(2):e9417. an X-linked homeobox gene cluster in a lineage-specific manner. Genes 57. Cortessis VK, Thomas DC, Levine AJ, Breton CV, Mack TM, Siegmund KD, Dev. 2006;20(24):3382–94. Haile RW, Laird PW. Environmental epigenetics: prospects for studying 35. Jaenisch R, Bird A. Epigenetic regulation of gene expression: how the epigenetic mediation of exposure–response relationships. Hum Genet. 2012; genome integrates intrinsic and environmental signals. Nat Genet. 2003; 131(10):1565–89. 33:245. 58. Hogenson TL. Epigenetics as the underlying mechanism for monozygotic 36. Gu H, Smith ZD, Bock C, Boyle P, Gnirke A, Meissner A. Preparation of twin discordance. Medical Epigenetics. 2013;1(1):3–18. 
reduced representation bisulfite sequencing libraries for genome-scale DNA 59. Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing methylation profiling. Nat Protoc. 2011;6(4):468. genomic features. Bioinformatics. 2010;26(6):841–2. 37. Kanehisa M, Furumichi M, Tanabe M, Sato Y, Morishima K. KEGG: new 60. Karolchik D, Hinrichs AS, Kent WJ: The UCSC genome browser. Current perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Res. protocols in bioinformatics 2012, 40(1):1.4. 1–1.4. 33. 2016;45(D1):D353–61. 38. Huang DW, Sherman BT, Tan Q, Kir J, Liu D, Bryant D, Guo Y, Stephens R, Baseler MW, Lane HC. DAVID bioinformatics resources: expanded annotation database and novel algorithms to better extract biology from large gene lists. Nucleic Acids Res. 2007;35(suppl_2):W169–75. 39. Kuleshov MV, Jones MR, Rouillard AD, Fernandez NF, Duan Q, Wang Z, Koplev S, Jenkins SL, Jagodnik KM, Lachmann A. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res. 2016;44(W1):W90–7. 40. Zhang Y, Rath N, Hannenhalli S, Wang Z, Cappola T, Kimura S, Atochina- Vasserman E, Lu MM, Beers MF, Morrisey EE. GATA and Nkx factors synergistically regulate tissue-specific gene expression and development in vivo. Development. 2007;134(1):189–98. 41. Wang Y. Mitogen-activated protein kinases in heart development and diseases. Circulation. 2007;116(12):1413–23. 42. Rose BA, Force T, Wang Y. Mitogen-activated protein kinase signaling in the heart: angels versus demons in a heart-breaking tale. Physiol Rev. 2010;90(4): 1507–46. 43. Madonna R, De Caterina R. VEGF receptor switching in heart development and disease. Cardiovasc Res. 2009;84(1):4–6. 44. Dor Y, Camenisch TD, Itin A, Fishman GI, McDonald JA, Carmeliet P, Keshet E. A novel role for VEGF in endocardial cushion formation and its potential contribution to congenital heart defects. Development. 2001;128(9):1531–8. 45. 
Sridurongrit S, Larsson J, Schwartz R, Ruiz-Lozano P, Kaartinen V. Signaling via the Tgf-β type I receptor Alk5 in heart development. Dev Biol. 2008; 322(1):208–18. 46. Bamforth SD, Bragança J, Farthing CR, Schneider JE, Broadbent C, Michell AC, Clarke K, Neubauer S, Norris D, Brown NA. Cited2 controls left-right patterning and heart development through a nodal-Pitx2c pathway. Nat Genet. 2004;36(11):1189.

Journal

BMC Genomics, Springer Journals

Published: Jun 4, 2018
