Origin of new genes after zygotic genome activation in vertebrate

Abstract

New genes are drivers of evolutionary innovation and phenotypic evolution. Expression of new genes in early development raises the possibility that new genes could originate and be recruited for functions in embryonic development, but this remains undocumented. Here, based on temporal gene expression at different developmental stages in Xenopus tropicalis, we found that young protein-coding genes were significantly enriched for expression in developmental stages occurring after the midblastula transition (MBT), and displayed a decreasing trend in abundance in the subsequent stages after MBT. To complement the finding, we demonstrate essential functional attributes of a young orphan gene, named Fog2, in morphological development. Our data indicate that new genes could originate after MBT and be recruited for functions in embryonic development, and thus provide insights for better understanding of the origin, evolution, and function of new genes.

Keywords: young gene evolution, zygotic genome activation, new gene origin

Introduction

New genes, as fundamental materials for evolutionary innovation, have been investigated for many years (Long et al., 2003, 2013). Many new genes rapidly acquire important and even essential functions driven by positive selection (Chen et al., 2010, 2013), with the most pronounced roles being in reproduction, development, and the brain (Kaessmann, 2010; Tautz and Domazet-Loso, 2011; Chen et al., 2013). For instance, testis-biased expression of new genes is recapitulated in different animals (Kaessmann, 2010). In humans, many new genes exhibit brain-biased, and particularly neocortex-biased, expression, suggesting that they may have been recruited during the accelerated evolution of the human brain and may be involved in the acquisition of high cognitive ability (Li et al., 2010; Wu et al., 2011; Zhang et al., 2011). New genes also play integral roles in development (Chen et al., 2010, 2012).
For instance, in a pioneering study by Chen et al. (2010), knockdown of many new genes by RNA interference (RNAi) in Drosophila led to either lethality at diverse developmental stages or tissue-specific morphological defects. In humans, new genes display substantially higher expression levels in fetal brain than in adult brain (Zhang et al., 2011; Wu et al., 2015). These investigations raise the possibility that new genes could originate during the process of embryonic development. In particular, during early development after the midblastula transition (MBT), zygotic genome activation is initiated, where remarkable epigenetic modifications occur to induce the transcription of the zygotic genome, which gradually takes control of development. These dynamic changes in epigenetic modifications enhance transcriptional activity and promiscuous transcription (Ostrup et al., 2013). These events are mirrored in the testis, where widespread demethylation of CpG dinucleotide-enriched promoter sequences occurs, resulting in a transcriptionally active chromatin state that facilitates the access of transcriptional machinery and promiscuous transcription of genes (Kleene, 2001; Kaessmann, 2010). Studies have observed that new genes are frequently recruited for new functions in the testis owing to this promiscuous transcription (Kaessmann, 2010; Tautz and Domazet-Loso, 2011; Chen et al., 2013). Therefore, we hypothesized that new genes might be recruited during early development, particularly after MBT. To test the hypothesis that new genes could frequently be recruited for key roles in early animal development, we examined the temporal expression of new genes during the early development of a frog, and found that expression of new genes was significantly enriched in developmental stages after embryonic genome activation.
Furthermore, we examined the roles of two young orphan genes, named Fog1 and Fog2, and demonstrated that they have important functions in frog development. After MBT, zygotic genome activation (ZGA) is initiated, where remarkable epigenetic modifications occur that induce the transcription of the zygotic genome (Ostrup et al., 2013), with a similar pattern observed in the testis (Kleene, 2001; Kaessmann, 2010). We propose that the induced transcription of the zygotic genome might facilitate the origination of new genes.

Results

Temporal expression profiling reveals high expression of young protein-coding genes after MBT

In the present study, we categorized young protein-coding genes in the frog (Xenopus tropicalis) as duplicate genes or orphan genes, according to their mechanisms of origin (Supplementary Tables S1 and S2). Duplicate genes are generated by gene duplication from existing old genes, while orphan genes have no detectable homology with genes in other species, and likely originated de novo or from rapidly evolving genes that have lost similarity with their ancestral sequences (Tautz and Domazet-Loso, 2011). Genes that were specifically duplicated in X. tropicalis were retrieved by BioMart (Smedley et al., 2009) from Ensembl (http://www.ensembl.org/, version 72). The ages of X. tropicalis genes were obtained from ProteinHistorian (Capra et al., 2012). Genes with an age of zero were taken as new genes that originated during the evolution of frogs. Recently duplicated genes were defined as X. tropicalis-specific duplicated genes with an age of 0. Orphan genes, which are X. tropicalis specific, were identified from homology searches of protein databases of other animal species using BLASTP with an E-value cutoff of 1e−10 (Altschul et al., 1997).
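The orphan-gene criterion described above (no BLASTP hit in other species at the stated E-value cutoff) can be illustrated with a minimal sketch. It assumes BLAST tabular output (`-outfmt 6`, where the E-value is the 11th column) and treats a query as having a homolog when any hit has an E-value at or below the cutoff. The function name, the toy gene IDs, and the use of `<=` rather than `<` are assumptions made for illustration, not details taken from the study.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-10  # cutoff stated in the text; 1e-4 is the alternative used in Figure 1


def find_orphans(query_ids, blast_tabular, cutoff=E_VALUE_CUTOFF):
    """Return the query IDs that have no hit with E-value <= cutoff.

    blast_tabular is any iterable of lines in BLAST tabular format
    (12 tab-separated columns; column 11 is the E-value).
    """
    with_hit = set()
    for row in csv.reader(blast_tabular, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, evalue = row[0], float(row[10])
        if evalue <= cutoff:
            with_hit.add(query)
    return sorted(set(query_ids) - with_hit)


# Hypothetical toy data: geneA has a strong hit, geneB only a weak one,
# and geneC has no hit at all.
toy_hits = io.StringIO(
    "geneA\tspX_1\t85.0\t200\t30\t0\t1\t200\t1\t200\t1e-50\t180\n"
    "geneB\tspY_7\t30.0\t60\t42\t0\t1\t60\t5\t64\t0.002\t35\n"
)
orphans = find_orphans(["geneA", "geneB", "geneC"], toy_hits)
print(orphans)
```

In this toy case only geneA is excluded; geneB and geneC would be kept as orphan candidates. A real analysis would run the search against the protein sets of each comparison species and combine the results before applying the filter.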
Profiling of temporal gene expression from the 2-cell stage to stage 44–45 (Tan et al., 2013) revealed that the expression of young protein-coding genes peaked after MBT, particularly from gastrulation to neurulation (Figure 1, Supplementary Figure S1A). Expression levels of the new genes displayed a decreasing trend with developmental stages after MBT. When gene expression levels were normalized to the whole genome level, this pattern still held for new genes (Figure 1, Supplementary Figure S1B). Although expression quantification from RNA-sequencing data may not be accurate for young duplicate genes, we reasoned that this would not change our conclusion, as orphan genes also exhibited the signature peak expression after MBT (Figure 1).

Figure 1 Expression level of young protein-coding genes at different developmental stages in X. tropicalis. log2(FPKM+1) was used to calculate the expression value of each gene. Two groups of orphan genes were identified by BLASTP with cutoff E = 1e−10 and 1e−4, respectively.

Weighted gene co-expression network analysis

Genes function by interacting with other genes, generating gene–gene interaction networks. To gain insight into the involvement of new genes in networks, we performed weighted gene co-expression network analysis (WGCNA), an unsupervised and unbiased analysis (Langfelder and Horvath, 2008), to identify modules of coexpressed genes with shared functionality. A total of 30 distinct modules, representing clusters of genes with correlated expression, were identified (Figure 2A).
Many of the modules exhibit expression patterns significantly correlated with specific developmental stages (Figure 2A, Supplementary Figure S3). For example, the M2 module is a stage 9-dominant module, with the significance of its association having a P-value of 1.0 × 10⁻⁸ (Figure 2B).

Figure 2 WGCNA of young protein-coding genes. (A) Heatmaps displaying the correlations (and corresponding P-values) between modules and developmental stages or adult tissues. Color legend at the top indicates the level of correlation. eld0−4 represent five adult tissues: brain, liver, kidney, heart, and skeletal muscle, respectively. (B) Heatmap of gene expression in module M2. (C) Enrichment level of young duplicated genes for the different modules. Values are the proportion of young duplicated genes in each module divided by the proportion of other genes in the module. Asterisks (*) represent significant enrichments (P < 0.01 after Bonferroni correction by χ² test). (D) Level of enrichment of young duplicated genes at each developmental stage shows a decreasing trend after MBT.

We then examined the enrichment of new genes within these modules.
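The enrichment value defined in the Figure 2 legend, together with a χ² test and Bonferroni correction, can be sketched as follows. This is one plausible reading of the legend (the fraction of all young duplicated genes that fall in a module, divided by the corresponding fraction for all other genes), and it uses a plain 2×2 χ² test without continuity correction; the function names and the toy counts are hypothetical and do not come from the study.

```python
import math


def module_enrichment(young_in_module, young_total, other_in_module, other_total):
    """Return (enrichment ratio, chi-square statistic, P-value) for one module.

    The P-value is the upper tail of a chi-square distribution with
    1 degree of freedom, which equals erfc(sqrt(chi2 / 2)).
    """
    ratio = (young_in_module / young_total) / (other_in_module / other_total)
    # 2x2 contingency table: rows = young / other, columns = in / out of module
    a, b = young_in_module, young_total - young_in_module
    c, d = other_in_module, other_total - other_in_module
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return ratio, chi2, p_value


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P-value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Hypothetical counts for a single module, tested against 30 modules in total.
ratio, chi2, p = module_enrichment(40, 400, 500, 15000)
p_adj = bonferroni(p, n_tests=30)
print(f"enrichment = {ratio:.2f}, chi2 = {chi2:.1f}, significant = {p_adj < 0.01}")
```

With these toy counts, 10% of young genes but only about 3.3% of other genes fall in the module, giving a roughly threefold enrichment that remains significant after correction.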
Young duplicated genes of X. tropicalis were significantly over-represented in 13 of the 30 modules (Figure 2C, P < 0.01), with 12 modules harboring significant correlation with a developmental stage after MBT (Figure 2A, P < 0.01 after Bonferroni correction by χ2 test). The 12 modules cover the gastrulation processes (stages 9, 10, and 11.12, with 119 young duplicated genes involved), neurulation (stages 16.18, 19, and 20.21, with 76 young duplicated genes involved), and part of the organogenesis period (stages 24.26, 38.39, and 44.45, with 66 young duplicated genes involved). This enrichment indicates that these new genes have evolved important functions in critical gene–gene interaction networks. Interestingly, the level of enrichment for new genes showed a decreasing trend with developmental stage after MBT (P = 0.028, Figure 2D; P = 0.048, Supplementary Figure S4), corroborating the expression trajectory of new genes (Figure 1). The decreasing pattern after MBT could not be attributed to changes in global gene expression levels or changes in the proportions of expressed genes among developmental stages, as correlation was not observed between enrichment levels of new genes and transcriptional levels of developmental stages (Supplementary Figure S5). Young orphan genes were significantly enriched within two modules (P < 0.01 after Bonferroni correction by χ2 test), with modules M5 (29 genes) corresponding to developmental stage 10, and M24 (15 genes) (Supplementary Figure S6). It is notable that another 47 young duplicated genes are over-represented within module M5. Gene enrichment analysis performed by the DAVID program (Huang et al., 2008) found that genes in M5 were significantly enriched in categories associated with development such as ‘pattern specification process’ (GO:0007389, P = 0.01, 10 genes: T, HHEX, NOG, GSC, LHX1, DYNC2LI1, GATA4, ZIC1, TCF7L1, and ZIC3) and ‘regionalization’ (GO:0003002, P = 0.04) (Supplementary Table S3). 
These results indicate that the young genes might have evolved developmental roles through interactions with other genes.

Expression patterns of two young orphan genes reveal potential functions in development

High expression levels of young protein-coding genes after MBT, particularly from gastrulation to neurulation, raise the possibility that some of these new genes might have acquired important functions in these developmental stages. To complement this finding, we chose some young protein-coding genes for further functional assessments. Two genes (ENSXETG00000027093, i.e. LOC100170590, which we named Fog1, frog orphan gene 1, and ENSXETG00000030468, i.e. LOC100158459, which we named Fog2, frog orphan gene 2) stand out, as they exhibit high expression levels after MBT, particularly from gastrulation to neurulation (Supplementary Figure S7). To interrogate the epigenetic aspects associated with expression, we examined the levels of H3K4me3 modification, which is associated with active transcription of nearby genes (Guenther et al., 2007), across the two genes at different developmental stages. As expected, the changes in H3K4me3 modification levels displayed a pattern paralleling the changes in the expression of these two genes (Supplementary Figure S8). Among the gene–gene expression networks, the two genes are located within module M7, which exhibited high expression levels at stages 10−14 after MBT (Figure 2A). Gene enrichment analysis showed that genes in M7 are significantly enriched in many categories associated with development, such as ‘somite development’, ‘pattern specification process’, ‘regionalization’, ‘segmentation’, and ‘anterior/posterior pattern specification’ (Supplementary Table S4). These findings suggest that Fog1/Fog2 might be functionally linked with development.
To explore potential functions of these two genes, we first performed an expression correlation analysis to identify genes whose expression correlates significantly with that of these two genes. A total of 190 genes showed correlation with Fog1 (R² > 0.9, by Pearson correlation), and gene enrichment analysis revealed that these genes were enriched in development categories. For example, 31 genes are enriched in the category ‘anatomical structure development’ (GO:0048856, P = 0.029), five genes in ‘mesenchyme development’ (GO:0060485, P = 0.02), four genes in ‘neural crest cell differentiation’ (GO:0014033, P = 0.016), and four genes in ‘somite development’ (GO:0061053, P = 0.0379) (Supplementary Table S5). On the other hand, 12 genes are correlated with Fog2 (R² > 0.9, by Pearson correlation), with 7 of the genes involved in ‘anatomical structure development’ (GO:0048856, P = 0.0002) (Supplementary Table S6). Among these genes, Msgn1 (mesogenin 1) is a master regulator of paraxial presomitic mesoderm differentiation (Chalamalasetty et al., 2014), and Szl encodes the secreted frizzled-related protein Sizzled, which negatively regulates Tolloid-like activity to control deposition of a fibronectin (FN) matrix between the mesoderm and endoderm (Kenny et al., 2012). The above expression correlation analyses suggest that Fog1 and Fog2 might be coupled to embryonic development. We further investigated the potential functions of the two genes in another model frog, Xenopus laevis. Real-time PCR confirmed their increased expression levels at the gastrula and neurula stages, and lower levels at the tailbud stage of development (Supplementary Figure S7). We then used in situ hybridization to examine the temporal expression patterns during embryonic development, and found high expression of Fog1 and Fog2 in somites (muscle) (Figure 3).
This is consistent with the above expression correlation analysis, where many genes involved in somite development displayed correlated expression with Fog1 and Fog2. In addition, Fog1 is also expressed in the branchial arches, optic vesicle, and tail end, while Fog2 shows expression in the nervous system and optic vesicle (Figure 3).

Figure 3 Expression patterns of Fog1 and Fog2 in X. laevis embryos. (A−H) Expression patterns of Fog1 mRNA. (A and B) Fog1 is weakly expressed in the animal pole at the blastula stage. (C−E) Fog1 is weakly expressed in the neural system at the neurula stage. (F−H) Fog1 is mainly expressed in the muscle, brain, optic vesicle, branchial arches, and tail end. (A’−H’) Expression patterns of Fog2 mRNA. (A’−C’) Fog2 is weakly expressed in the animal pole at the blastula stage. (D’ and E’) Fog2 is expressed in the neural system at the neurula stage. (F’−H’) Fog2 is mainly expressed in the brain, optic vesicle, and muscle. n, neural system; ov, optic vesicle; mu, muscle; ba, branchial arches; te, tail end.

Functional study of Fog1 and Fog2

The above expression analysis suggested that Fog1 and Fog2 might have functions in development.
To further interrogate the functions of Fog1 and Fog2 in development, we knocked down expression of these two genes by injecting gene-specific morpholinos (MO) into X. laevis embryos. By visual inspection, no noticeable phenotypic change was observed in development after injecting Fog1 MO (Supplementary Figure S9). While this does not rule out the potential functional importance of this gene, it highlights the need for more systematic phenotyping assays to capture the phenotypic changes after knocking down Fog1. In stark contrast, a severe change in body axis curvature was observed in embryos after knocking down Fog2 compared to wild-type embryos (Figure 4A−D, Supplementary Figure S11). This malformation could be partially rescued when Fog2 mRNA was co-injected (Figure 4E and F, Supplementary Figure S11). Considering the expression of Fog2 in somites (muscle) and mesoderm (Figure 3, Supplementary Figure S12), and its expression correlation with other genes involved in mesoderm development, we further examined the expression of MyoD, a crucial gene in somite formation that acts as an early response to mesoderm induction in Xenopus embryos. Fog2 MO injection caused downregulation of MyoD (Figure 5). This phenotype could be partially rescued by co-injection with Fog2 mRNA (Figure 5). Since the rescue experiment was performed by co-injecting MO and Fog2 mRNA, it is important to note that the MO can block not only the endogenous mRNA but also the injected mRNA. Hence, the rescued phenotype might be due to an overall increase in Fog2 mRNA. Altogether, these results suggest that Fog2 is tightly coupled to the morphological development of X. laevis. BLAST analysis did not find a homolog of Fog2 in the genome of the Tibetan frog (Nanorana parkeri) (Sun et al., 2015), supporting Fog2 as a newly evolved gene. The experiment reiterates that some new genes could be recruited in embryonic development.
Figure 4 Knockdown of Fog2 causes defects or malformation in embryonic development of Xenopus at late stages of development. Embryos were injected on the left side. (A) Wild-type (WT) embryos. (C) Embryos injected with Fog2-MO. (E) Embryos injected with both Fog2-MO and mRNA. (B, D, F) The same embryo for A, C, E, respectively. Numbers at the bottom corners represent the number of embryos with the corresponding effect and the number of embryos used in this experiment. The side of the injection is indicated by lacZ (red dots).

Figure 5 Knockdown of Fog2 causes downregulation of MyoD. (A) Wild-type embryos. (C) Embryos injected with Fog2-MO. (E) Embryos injected with both Fog2-MO and mRNA. (B, D, F) The opposite side for A, C, E, respectively. Numbers at the bottom corners represent the number of embryos with the corresponding effect and the number of embryos used in this experiment. The side of the injection is indicated by lacZ (red dots).

Discussion

In the present study, we show that young protein-coding genes are significantly enriched in embryonic developmental stages after MBT. A decreasing trend in the level of enrichment and expression of new genes was seen in the developmental stages after MBT, a pattern that had not been previously reported. For example, knockdown of new genes in Drosophila (Chen et al., 2010) generally led to pupal rather than embryonic lethality. A repeated analysis of time-course developmental data in zebrafish showed no peak expression in the early developmental stages (Domazet-Loso and Tautz, 2010; Zhong et al., 2016). Several factors, such as different bioinformatics practices, different data profiling strategies, and species-specific variation, might explain the conflicts among these observations. In particular, we studied the functions of two genes, Fog1 and Fog2, that exhibit high expression levels after MBT, and found that Fog2 might have an important role in development. When comparing orthologous gene coding sequences between X. laevis and X. tropicalis, we found that the Ka/Ks values (the ratio of the number of nonsynonymous substitutions per nonsynonymous site, Ka, to the number of synonymous substitutions per synonymous site, Ks) of Fog1 and Fog2 are 0.283 and 0.744, respectively (Supplementary Figure S13). It seems that Fog1 evolves under purifying selection, suggesting that it may still harbor an important function, while Fog2 evolves more rapidly, as is common for recently evolved new genes. After MBT, zygotic genome activation is initiated, where remarkable epigenetic modifications occur to induce the transcription of the zygotic genome, which gradually takes over the control of development (Ostrup et al., 2013).
An example of the epigenetic changes that occur with this transition is the genomic enrichment of H3K4me3, an indicator of active promoters, which is observed at the time of ZGA in zebrafish (Vastenhouw et al., 2010) and Xenopus (Akkers et al., 2009). Another example is DNA methylation, which is considered to be a negative regulator of gene expression. Methylated DNA is efficiently transcriptionally repressed in Xenopus oocytes, but DNA methylation-dependent transcriptional repression is greatly reduced after ZGA, and the repressive effect of DNA methylation is only restored during organogenesis and terminal differentiation (Bogdanovic et al., 2011; Ostrup et al., 2013). These dynamic changes in epigenetic modifications enhance transcriptional activity after ZGA. The resulting promiscuous transcription enables proto-genes to be selected for beneficial functions that might be gained by chance, and thus allows them to evolve into bona fide genes (Kaessmann, 2010). This epigenetic aspect is analogous to that observed in the testis, where widespread demethylation of CpG dinucleotide-enriched promoter sequences occurs, resulting in a transcriptionally active chromatin state that facilitates the access of transcriptional machinery and promiscuous transcription of genes (Kleene, 2001; Kaessmann, 2010). Previous studies have documented the recruitment of new genes for new functions in the testis (Kaessmann, 2010; Tautz and Domazet-Loso, 2011; Chen et al., 2013). In conclusion, our study identifies a genomic hotbed for the origination of new genes, and illustrates likely developmental functions of new genes besides the traditional roles such as in the testis and brain.

Materials and methods

Animal ethics

The handling of animals used in this study followed the guidelines and regulations of Kunming Institute of Zoology on animal experimentation and was approved by the Institutional Animal Care and Use Committee of the Kunming Institute of Zoology.
Analysis of transcriptome data

Transcriptome data from 23 developmental stages from a previous study (Tan et al., 2013) were downloaded from the NCBI SRA (http://www.ncbi.nlm.nih.gov/sra/). Detailed information on each developmental stage is available elsewhere (Tan et al., 2013). All reads were processed using Btrim to remove low-quality reads with parameters -l 30 -q 20 (Kong, 2011), and the resulting paired and unpaired short reads were then aligned to the X. tropicalis genome (JGI_4.2) (Ensembl release 72) using the gapped read alignment program Tophat 2.0.9 (Trapnell et al., 2009; Kim et al., 2011). After merging paired and unpaired alignments from the same sample, the program Cufflinks 2.0.2 was used to assemble the transcriptomes (Trapnell et al., 2010) (Supplementary Figure S14). All isoforms assembled by Cufflinks from the 45 samples were sent to the Cuffcompare utility, along with the Ensembl annotation file, to generate an integrated combined gtf annotation file. To minimize annotation artifacts, we processed the Cuffcompare output file through the following steps. First, we excluded all new single-exon transcripts and sequences with lengths < 200 bp. Transcripts with FPKM values of at least 1 in any one sample, or that could be assembled in at least two samples, were considered credible. Credible transcripts were added to the original Ensembl annotation gtf file as new isoforms of known X. tropicalis genes, and labeled with the class code ‘j’. We used this new annotation as a reference and re-ran Cuffcompare with transcripts labeled with the class codes ‘i’, ‘x’, and ‘u’. We selected the loci of transcripts labeled with class codes ‘i’, ‘x’, or ‘u’, which represent sense intronic, antisense exonic, or intergenic transcripts, respectively. Finally, we combined the known annotated Ensembl genes and their newly assembled isoforms with genes from the newly identified loci to obtain a new set of annotations.
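The credibility filter applied to newly assembled transcripts in the steps above can be sketched as below. The thresholds (more than one exon, length of at least 200 bp, and either FPKM ≥ 1 in at least one sample or assembly in at least two samples) are taken from the text; the `Transcript` record, its field names, and the toy transcripts are hypothetical, and the sketch does not reproduce the actual Cuffcompare-based pipeline.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Transcript:
    transcript_id: str
    exon_count: int
    length_bp: int
    fpkm_by_sample: Dict[str, float]  # sample name -> FPKM
    assembled_in: int                 # number of samples in which it was assembled


def is_credible(t: Transcript, min_fpkm: float = 1.0,
                min_samples: int = 2, min_length: int = 200) -> bool:
    """Apply the exclusion and credibility rules described in the text."""
    # Exclude single-exon transcripts and sequences shorter than 200 bp.
    if t.exon_count < 2 or t.length_bp < min_length:
        return False
    # Keep if expressed at FPKM >= 1 in any sample, or assembled in >= 2 samples.
    expressed = any(v >= min_fpkm for v in t.fpkm_by_sample.values())
    return expressed or t.assembled_in >= min_samples


def filter_credible(transcripts: List[Transcript]) -> List[str]:
    """Return the IDs of transcripts that pass the filter, in input order."""
    return [t.transcript_id for t in transcripts if is_credible(t)]


# Hypothetical toy transcripts covering each rule.
toy = [
    Transcript("t1", 3, 500, {"s1": 0.2, "s2": 1.5}, 1),  # expressed in s2
    Transcript("t2", 1, 900, {"s1": 5.0}, 3),             # single exon
    Transcript("t3", 2, 150, {"s1": 5.0}, 3),             # too short
    Transcript("t4", 2, 300, {"s1": 0.1, "s2": 0.3}, 2),  # assembled twice
    Transcript("t5", 2, 300, {"s1": 0.1}, 1),             # fails both criteria
]
kept = filter_credible(toy)
print(kept)
```

Only t1 and t4 pass in this toy set: t1 through expression in one sample and t4 through being assembled in two samples.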
This final annotation was processed by Cuffdiff together with the original alignment file to calculate the FPKM values of each gene in each sample. Based on our robust transcript reconstruction, we analyzed the coding potential of the transcripts encoded by the newly identified loci using Coding Potential Calculator (CPC) (Kong et al., 2007). Transcript sequences were extracted by gffread, a utility of the Cufflinks package (Trapnell et al., 2010). Transcripts with a score ≥0 were deemed coding, those with scores <−1 noncoding, and all others weak noncoding. If all transcripts within a new locus were classified as noncoding, we defined it as a lncRNA locus, with those in intergenic regions having no overlap with any known gene locus considered as lincRNA loci.

Weighted gene co-expression network analysis

Weighted gene networks were constructed using the WGCNA package implemented in R (Langfelder and Horvath, 2008). A power of 7, at which the scale-free topology fit index curve flattens out at roughly 0.9, was chosen as the soft threshold for the adjacency matrix. In total, 30 modules were identified. Module membership is defined by the Pearson correlation between the expression level of a given gene and a given module eigengene. The module eigengene, which is defined as the first principal component of a given module, is considered representative of the gene expression profiles for this module. We then correlated the eigengenes with a binary indicator for each time point/sample. The statistical significance of the correlations was estimated using Student's t-test. In short, a module that has the highest association with a given time point/sample was inferred to probably have a biological function that underlies the specific traits of this time point/sample.

Identification of X. tropicalis-specific new genes

The ages of X. tropicalis genes were obtained from ProteinHistorian (Capra et al., 2012).
Genes with an age of zero were taken as new genes that originated during the evolution of frogs. Genes that were specifically duplicated in X. tropicalis were retrieved by BioMart (Smedley et al., 2009) from Ensembl (http://www.ensembl.org/). Recently duplicated genes were defined as X. tropicalis-specific duplicated genes with an age of 0. Orphan genes, which are X. tropicalis specific, were identified from homology searches of protein databases of other animal species using BLASTP with an E-value cutoff of 1e−10 (Altschul et al., 1997).

Gene cloning and in vitro transcription

Approximately 570 bp of the coding sequences of the Fog1 and Fog2 genes from X. laevis were amplified by PCR and cloned into the pGEM-T vector to generate probes to detect RNA transcription. RNA from embryos of 16 stages was used as a template. The PCR primers were as follows: Fog1, forward: 5′-CCTGACTGGACTGGAGGCAAAT-3′ and reverse: 5′-GACGAGGAAGATGAGGAGATGGAA-3′. Fog2, forward: 5′-TTTGTGCGTCCTACCCTATGC-3′ and reverse: 5′-CCAGTTCAGCTAACCAGTCCCT-3′. Similarly, the full-length ORFs of Fog1 and Fog2 were cloned into the pCS2+-C-FLAG vector for mRNA transcription. Primer sequences were as follows: Fog1, forward: 5′-CCGCTCGAGATGGGCCCTGTCCCCCCAACC-3′ and reverse: 5′-CGCTCCGGAATGAAGCTTATTCAACCCTTTTTG-3′. Fog2, forward: 5′-CCGCTCGAGATGGAAGCTCCACCTGGAATATAC-3′ and reverse: 5′-CGCAGATCTGGTAACCCCAGTAACAAGTGGAC-3′. The cloned plasmids were used as templates for the in vitro transcription of RNA probes and mRNA using the SP6 mMessage Machine kit (Ambion).

Real-time PCR assay

Embryos of different stages (including 0, 3, 8, 10, 11, 14, 16, 18, 24, 25, 28, 32, and 39) were collected, with total RNA acquired from each sample separately. RNA was then reverse transcribed using the Fermentas RevertAid First Strand cDNA Synthesis Kit to prepare templates for real-time PCR on a LightCycler (Roche Diagnostics).
Expression of the two genes was examined by SYBR green qPCR using the following primers: Fog1, forward: 5′-CCAAGCCCAGGACATTCACC-3′; reverse: 5′-CCGTCTCAGGGATTAGTTCAGC-3′. Fog2, forward: 5′-GCTACAACACCTTTGTGGGTGA-3′; reverse: 5′-GCTAACCAGTCCCTCCTTTCCT-3′. Primers for GAPDH were used as described in Nichane et al. (2008). The following cycling conditions were used: denaturation at 95°C (10 sec), annealing at 60°C (10 sec), and extension at 72°C (10 sec).

Western blot of embryos

Embryos were lysed in lysis buffer (50 mM Tris-HCl, pH 7.4, 150 mM NaCl, 5 mM EDTA, pH 8.0, and 1% Triton X-100) with a protease inhibitor mixture (Roche) for 30 min on ice. The lysates were cleared of debris by centrifugation at 4°C for 10 min at 14000 rpm. SDS loading buffer was added to the supernatant, which was then heated at 95°C for 10 min. Total lysates were then subjected to SDS-PAGE and western blot analyses. Antibodies used in this experiment were as follows: anti-FLAG (Sigma, 1:1000) and anti-β-actin (Abcam, 1:5000), with HRP-conjugated anti-rabbit or anti-mouse IgG (Pierce, 1:5000) used as secondary antibodies.

Embryo culture, microinjection, whole-mount in situ hybridization

Embryos were staged as previously outlined (Nieuwkoop and Faber, 1967). At the 4-cell stage, 12 ng morpholino (MO) and/or 0.6 ng mRNA for the genes were injected into the embryos. In all experiments, embryos were co-injected with mRNA for LacZ to identify the manipulated side of the embryo. Sequences of the MO used were: Fog1 MO: CCATCGGAACACTAATTCTGAACCT; Fog2 MO: TCCAGGTGGAGCTTCCATGATGCAG. RNA probes for Fog1 and Fog2 were used to examine levels of mRNA expression. Embryos at the appropriate stages were fixed in MEMFA (Harland, 1991). Whole-mount in situ hybridization was performed as described previously (Harland, 1991). The injected areas were identified by staining for LacZ using red-gal. Some stained embryos were embedded in paraffin and sectioned at 15 μm.
Supplementary material

Supplementary material is available at Journal of Molecular Cell Biology online.

Funding

This work was supported by the National Natural Science Foundation of China (31671325 and 31271339). N.O.O. thanks the CAS-TWAS President’s Fellowship Program for Doctoral Candidates for support.

Conflict of interest: none declared.

References

Akkers, R.C., van Heeringen, S.J., Jacobi, U.G., et al. (2009). A hierarchy of H3K4me3 and H3K27me3 acquisition in spatial gene regulation in Xenopus embryos. Dev. Cell 17, 425–434.
Altschul, S.F., Madden, T.L., Schaffer, A.A., et al. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402.
Bogdanovic, O., Long, S.W., van Heeringen, S.J., et al. (2011). Temporal uncoupling of the DNA methylome and transcriptional repression during embryogenesis. Genome Res. 21, 1313–1327.
Capra, J.A., Williams, A.G., and Pollard, K.S. (2012). ProteinHistorian: tools for the comparative analysis of eukaryote protein origin. PLoS Comput. Biol. 8, e1002567.
Chalamalasetty, R.B., Garriock, R.J., Dunty, W.C., et al. (2014). Mesogenin 1 is a master regulator of paraxial presomitic mesoderm differentiation. Development 141, 4285–4297.
Chen, S., Krinsky, B.H., and Long, M. (2013). New genes as drivers of phenotypic evolution. Nat. Rev. Genet. 14, 645–660.
Chen, S., Spletter, M., Ni, X., et al. (2012). Frequent recent origination of brain genes shaped the evolution of foraging behavior in Drosophila. Cell Rep. 1, 118–132.
Chen, S., Zhang, Y.E., and Long, M. (2010). New genes in Drosophila quickly become essential. Science 330, 1682–1685.
Domazet-Loso, T., and Tautz, D. (2010). A phylogenetically based transcriptome age index mirrors ontogenetic divergence patterns. Nature 468, 815–818.
Guenther, M.G., Levine, S.S., Boyer, L.A., et al. (2007). A chromatin landmark and transcription initiation at most promoters in human cells. Cell 130, 77–88.
Harland, R.M. (1991). In situ hybridization: an improved whole-mount method for Xenopus embryos. Methods Cell Biol. 36, 685–695.
Huang, D.W., Sherman, B.T., and Lempicki, R.A. (2008). Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 4, 44–57.
Kaessmann, H. (2010). Origins, evolution, and phenotypic impact of new genes. Genome Res. 20, 1313–1326.
Kenny, A.P., Rankin, S.A., Allbee, A.W., et al. (2012). Sizzled-tolloid interactions maintain foregut progenitors by regulating fibronectin-dependent BMP signaling. Dev. Cell 23, 292–304.
Kim, D., Pertea, G., Trapnell, C., et al. (2011). TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 14, R36.
Kleene, K.C. (2001). A possible meiotic function of the peculiar patterns of gene expression in mammalian spermatogenic cells. Mech. Dev. 106, 3–23.
Kong, Y. (2011). Btrim: a fast, lightweight adapter and quality trimming program for next-generation sequencing technologies. Genomics 98, 152–153.
Kong, L., Zhang, Y., Ye, Z.-Q., et al. (2007). CPC: assess the protein-coding potential of transcripts using sequence features and support vector machine. Nucleic Acids Res. 35, W345–W349.
Langfelder, P., and Horvath, S. (2008). WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9, 559.
Li, C.-Y., Zhang, Y., Wang, Z., et al. (2010). A human-specific de novo protein-coding gene associated with human brain functions. PLoS Comput. Biol. 6, e1000734.
Long, M., Betrán, E., Thornton, K., et al. (2003). The origin of new genes: glimpses from the young and old. Nat. Rev. Genet. 4, 865–875.
Long, M., VanKuren, N.W., Chen, S., et al. (2013). New gene evolution: little did we know. Annu. Rev. Genet. 47, 307–333.
Nichane, M., de Crozé, N., Ren, X., et al. (2008). Hairy2–Id3 interactions play an essential role in Xenopus neural crest progenitor specification. Dev. Biol. 322, 355–367.
Nieuwkoop, P.D., and Faber, J. (1967). Normal Table of Xenopus Laevis (Daudin). Amsterdam, The Netherlands: North-Holland Publishing Company.
Ostrup, O., Andersen, I.S., and Collas, P. (2013). Chromatin-linked determinants of zygotic genome activation. Cell. Mol. Life Sci. 70, 1–13.
Smedley, D., Haider, S., Ballester, B., et al. (2009). BioMart – biological queries made easy. BMC Genomics 10, 22.
Sun, Y.-B., Xiong, Z.-J., Xiang, X.-Y., et al. (2015). Whole-genome sequence of the Tibetan frog Nanorana parkeri and the comparative evolution of tetrapod genomes. Proc. Natl Acad. Sci. USA 112, E1257–E1262.
Tan, M.H., Au, K.F., Yablonovitch, A.L., et al. (2013). RNA sequencing reveals a diverse and dynamic repertoire of the Xenopus tropicalis transcriptome over development. Genome Res. 23, 201–216.
Tautz, D., and Domazet-Loso, T. (2011). The evolutionary origin of orphan genes. Nat. Rev. Genet. 12, 692–702.
Trapnell, C., Pachter, L., and Salzberg, S.L. (2009). TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25, 1105–1111.
Trapnell, C., Williams, B.A., Pertea, G., et al. (2010). Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat. Biotech. 28, 511–515.
Vastenhouw, N.L., Zhang, Y., Woods, I.G., et al. (2010). Chromatin signature of embryonic pluripotency is established during genome activation. Nature 464, 922–926.
Wu, D.-D., Irwin, D.M., and Zhang, Y.-P. (2011). De novo origin of human protein-coding genes. PLoS Genet. 7, e1002379.
Wu, D.-D., Ye, L.-Q., Li, Y., et al. (2015). Integrative analyses of RNA editing, alternative splicing, and expression of young genes in human brain transcriptome by deep RNA sequencing. J. Mol. Cell Biol. 7, 314–325.
Zhang, Y.E., Landback, P., Vibranovski, M.D., et al. (2011). Accelerated recruitment of new brain development genes into the human genome. PLoS Biol. 9, e1001179.
Zhong, Z., Yang, L., Zhang, Y.E., et al. (2016). Correlated expression of retrocopies and parental genes in zebrafish. Mol. Genet. Genomics 291, 723–737.

© The Author (2018). Published by Oxford University Press on behalf of Journal of Molecular Cell Biology, IBCB, SIBS, CAS. All rights reserved. This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices).

Publisher: Oxford University Press
ISSN: 1674-2788
eISSN: 1759-4685
DOI: 10.1093/jmcb/mjx057
Abstract New genes are drivers of evolutionary innovation and phenotypic evolution. Expression of new genes in early development raises the possibility that new genes could originate and be recruited for functions in embryonic development, but this remains undocumented. Here, based on temporal gene expression at different developmental stages in Xenopus tropicalis, we found that young protein-coding genes were significantly enriched for expression in developmental stages occurring after the midblastula transition (MBT), and displayed a decreasing trend in abundance in the subsequent stages after MBT. To complement the finding, we demonstrate essential functional attributes of a young orphan gene, named as Fog2, in morphological development. Our data indicate that new genes could originate after MBT and be recruited for functions in embryonic development, and thus provide insights for better understanding of the origin, evolution, and function of new genes. young gene evolution, zygotic genome activation, new gene origin Introduction New genes, as fundamental materials for evolutionary innovation, have been investigated for many years (Long et al., 2003, 2013). Many new genes rapidly acquire important and even essential functions driven by positive selection (Chen et al., 2010, 2013), with most pronounced roles being in reproduction, development, and brain (Kaessmann, 2010; Tautz and Domazet-Loso, 2011; Chen et al., 2013). For instance, a testis-bias expression of new genes is recapitulated in different animals (Kaessmann, 2010). As seen in human, many new genes exhibit a brain, particularly a neocortex-biased expression, suggesting that they may be recruited for the accelerated evolution of the human brain and may be involved in the acquisition of high cognitive ability (Li et al., 2010; Wu et al., 2011; Zhang et al., 2011). New genes also play integral roles in development (Chen et al., 2010, 2012). For instance, in a pioneering study by Chen et al. 
(2010), knockdown of many new genes by RNA interference (RNAi) in Drosophila lead to either lethality at diverse development stages, or tissue-specific morphological defects (Chen et al., 2010). In human, new genes display a substantially high expression levels in fetal brain compared with adult brain (Zhang et al., 2011; Wu et al., 2015). The investigations raise the possibility that new genes could originate during the process of embryonic development. Particularly, during early development after the midblastula transition (MBT), zygotic genome activation is initiated, where remarkable epigenetic modifications occur to induce the transcription of the zygotic genome to gradually take control of development. These dynamic changes in epigenetic modifications enhance transcriptional activity and promiscuous transcription (Ostrup et al., 2013). These events are mirrored in testis, where widespread demethylation of CpG dinucleotide-enriched promoter sequences occurs, resulting in a transcriptionally active chromatin state that facilitates the access of transcriptional machinery and promiscuous transcription of genes (Kleene, 2001; Kaessmann, 2010). Studies have observed that new genes are frequently recruited for new function in the testis due to the promiscuous transcription (Kaessmann, 2010; Tautz and Domazet-Loso, 2011; Chen et al., 2013). Therefore, we hypothesized that new genes might be recruited during early development, particularly after MBT. To test the hypothesis that new gene could frequently be recruited for key roles in early animal development, we examined the temporal expression of new genes during the early development of a frog, and found that expression of new genes was significantly enriched in developmental stages after embryonic genome activation. Furthermore, we examined the roles of two young orphan genes, named Fog1 and Fog2, and demonstrated that they have important functions in frog development. 
After MBT, zygotic genome activation (ZGA) is initiated, where remarkable epigenetic modifications occur that induce the transcription of the zygotic genome (Ostrup et al., 2013), with a similar pattern observed in the testis (Kleene, 2001; Kaessmann, 2010). We propose that the induced transcription of the zygotic genome might facilitate the origination of new genes. Results Temporal expression profiling reveals high expression of young protein-coding genes after MBT In the present study, we categorized young protein-coding genes in the frog (Xenopus tropicalis), i.e. duplicate genes and orphan genes, according to their mechanisms of origin (Supplementary Tables S1 and S2). Duplicate genes are generated by gene duplication from existing old genes, while orphan genes do not have homology with genes in other species, and are likely originate de novo or from rapidly evolved genes, thereby losing similarity with their ancestral sequences (Tautz and Domazet-Loso, 2011). Genes that were specifically duplicated in X. tropicalis were retrieved by BioMart (Smedley et al., 2009) from Ensembl (http://www.ensembl.org/, version 72). The ages of X. tropicalis genes were obtained from ProteinHistorian (Capra et al., 2012). Genes with an age of zero were taken as new genes that newly originated during the evolution of frogs. Recently duplicated genes were defined as X. tropicalis-specific duplicated genes with an age of 0. Orphan genes, which are X. tropicalis specific, were identified from homologous searches of protein databases of other animal species using BLASTP with an E-value of 1e−10 (Altschul et al., 1997). Profiling of temporal gene expression from the 2-cell stage to stage 44–45 (Tan et al., 2013) revealed that the expression of young protein-coding genes peak after MBT, particularly from gastrulation to neurulation (Figure 1, Supplementary Figure S1A). Expression levels of the new genes displayed a decreasing trend with developmental stages after MBT. 
When gene expression levels were normalized to the whole genome level, this pattern still held for new genes (Figure 1, Supplementary Figure S1B). Although expression quantification may not be accurate for young duplicate genes using RNA-sequencing data, we reasoned that it would not change our conclusion as orphan genes also exhibited the signature peak expression after MBT (Figure 1). Figure 1 View largeDownload slide Expression level of young protein-coding genes at different developmental stages in X. tropicalis. log2(FPKM+1) was used to calculate the expression value of each gene. Two groups of orphan genes were identified by BLASTP with cutoff E = 1e−10 and 1e−4, respectively. Figure 1 View largeDownload slide Expression level of young protein-coding genes at different developmental stages in X. tropicalis. log2(FPKM+1) was used to calculate the expression value of each gene. Two groups of orphan genes were identified by BLASTP with cutoff E = 1e−10 and 1e−4, respectively. Weighted gene co-expression network analysis Genes function by interacting with other genes, generating gene–gene interaction networks. To gain insight into involvement of new genes in networks, we performed weighted gene co-expression network analysis (WGCNA), an unsupervised and unbiased analysis (Langfelder and Horvath, 2008) to identify modules of coexpressed genes with shared functionality. A total of 30 distinct modules, representing clusters of genes with correlated expression, were identified (Figure 2A). Many of the modules exhibit expression patterns significantly correlated with specific developmental stages (Figure 2A, Supplementary Figure S3). For example, the M2 module is a stage 9-dominant module, with the significance of its association having a P-value of 1.0 × 10−8 (Figure 2B). Figure 2 View largeDownload slide WGCNA of young protein-coding genes. 
(A) Heatmaps displaying the correlations (and corresponding P-values) between modules and developmental stages or adult tissues. Color legend at the top indicates the level of correlation. eld0−4 represent five adult tissues: brain, liver, kidney, heart, and skeletal muscle, respectively. (B) Heatmap of gene expression in module M2. (C) Enrichment level of young duplicated genes for the different modules. Values are the proportion of young duplicated genes in each module divided by the proportion of other genes in the module. Asterisks (*) represent significant enrichments (P < 0.01 after Bonferroni correction by χ2 test). (D) Level of enrichment of young duplicated genes at each developmental stage shows a decreasing trend after MBT. Figure 2 View largeDownload slide WGCNA of young protein-coding genes. (A) Heatmaps displaying the correlations (and corresponding P-values) between modules and developmental stages or adult tissues. Color legend at the top indicates the level of correlation. eld0−4 represent five adult tissues: brain, liver, kidney, heart, and skeletal muscle, respectively. (B) Heatmap of gene expression in module M2. (C) Enrichment level of young duplicated genes for the different modules. Values are the proportion of young duplicated genes in each module divided by the proportion of other genes in the module. Asterisks (*) represent significant enrichments (P < 0.01 after Bonferroni correction by χ2 test). (D) Level of enrichment of young duplicated genes at each developmental stage shows a decreasing trend after MBT. We then examined the enrichment of new genes within these modules. Young duplicated genes of X. tropicalis were significantly over-represented in 13 of the 30 modules (Figure 2C, P < 0.01), with 12 modules harboring significant correlation with a developmental stage after MBT (Figure 2A, P < 0.01 after Bonferroni correction by χ2 test). 
The 12 modules cover the gastrulation processes (stages 9, 10, and 11.12, with 119 young duplicated genes involved), neurulation (stages 16.18, 19, and 20.21, with 76 young duplicated genes involved), and part of the organogenesis period (stages 24.26, 38.39, and 44.45, with 66 young duplicated genes involved). This enrichment indicates that these new genes have evolved important functions in critical gene–gene interaction networks. Interestingly, the level of enrichment for new genes showed a decreasing trend with developmental stage after MBT (P = 0.028, Figure 2D; P = 0.048, Supplementary Figure S4), corroborating the expression trajectory of new genes (Figure 1). The decreasing pattern after MBT could not be attributed to changes in global gene expression levels or changes in the proportions of expressed genes among developmental stages, as correlation was not observed between enrichment levels of new genes and transcriptional levels of developmental stages (Supplementary Figure S5). Young orphan genes were significantly enriched within two modules (P < 0.01 after Bonferroni correction by χ2 test), with modules M5 (29 genes) corresponding to developmental stage 10, and M24 (15 genes) (Supplementary Figure S6). It is notable that another 47 young duplicated genes are over-represented within module M5. Gene enrichment analysis performed by the DAVID program (Huang et al., 2008) found that genes in M5 were significantly enriched in categories associated with development such as ‘pattern specification process’ (GO:0007389, P = 0.01, 10 genes: T, HHEX, NOG, GSC, LHX1, DYNC2LI1, GATA4, ZIC1, TCF7L1, and ZIC3) and ‘regionalization’ (GO:0003002, P = 0.04) (Supplementary Table S3). These results indicate that the young genes might have evolved developmental roles through interactions with other genes. 
Expression patterns of two young orphan genes reveal potential functions in development High-expression levels of young protein-coding genes after MBT, particularly from gastrulation to neurulation, raise the possibility that some of these new genes might have acquired important functions in these developmental stages. To complement this finding, we chose some young protein-coding genes for further functional assessments. Two genes (ENSXETG00000027093, i.e. LOC100170590, which we named Fog1, frog orphan gene 1, and ENSXETG00000030468, i.e. LOC100158459, which we named Fog2, frog orphan gene 2) stand out, as they exhibit high expression levels after MBT, particularly from gastrulation to neurulation (Supplementary Figure S7). To interrogate the epigenetic aspects associated with expression, we examined the levels of H3K4me3 modification, which is associated with active transcription of nearby genes (Guenther et al., 2007), across the two genes at different developmental stages. As expected, the changes of H3K4me3 modification levels displayed a pattern paralleling the changes in the expression of these two genes (Supplementary Figure S8). Among the gene–gene expression networks, the two genes are located within module M7, which exhibited high expression level at the stages 10−14 after MBT (Figure 2A). Gene enrichment analysis showed that genes in M7 are significantly enriched in many categories associated with development such as: ‘somite development’, ‘pattern specification process’, ‘regionalization’, ‘segmentation’ and ‘anterior/posterior pattern specification’ (Supplementary Table S4). These findings suggest that Fog1/Fog2 might be functionally linked with development. To explore potential functions of these two genes, we firstly performed an expression correlation analysis to identify genes whose expression correlate significantly with these two genes. 
A total of 190 genes showed correlation with Fog1 (R2 > 0.9, by Pearson correlation), and gene enrichment analysis revealed that these genes were enriched in development categories. For example, 31 genes are enriched in the category ‘anatomical structure development’ (GO:0048856, P = 0.029), five genes in ‘mesenchyme development’ (GO:0060485, P = 0.02), four genes in ‘neural crest cell differentiation’ (GO:0014033, P = 0.016), and four genes in ‘somite development’ (GO:0061053, P = 0.0379) (Supplementary Table S5). On the other hand, 12 genes are correlated with Fog2 (R2 > 0.9, by Pearson correlation), with 7 of the genes involved in ‘anatomical structure development’ (GO:0048856, P = 0.0002) (Supplementary Table S6). Among these genes, Msgn1 (mesogenin 1) is a master regulator of paraxial presomitic mesoderm differentiation (Chalamalasetty et al., 2014), and Szl, encodes the secreted frizzled related protein Sizzled that negatively regulates Tolloid-like activity to control deposition of a fibronectin (FN) matrix between the mesoderm and endoderm (Kenny et al., 2012). The above expression correlation analyses suggest that Fog1 and Fog2 genes might be coupled to embryonic development. We further investigated the potential functions of the two genes in another model frog, Xenopus laevis. Real-time PCR confirmed their increased expression levels at the gastrula and the neurula stages, but lower levels in the tailbud stage of development (Supplementary Figure S7). After that, we used in situ hybridization to examine the temporal expression patterns during embryonic development, and found high expression of Fog1 and Fog2 in somites (muscle) (Figure 3). This is consistent with the above expression correlation analysis, where many genes involved in somite development displayed correlated expression with Fog1 and Fog2. 
In addition, Fog1 is also expressed in the branchial arches, optic vesicle, and tail end, while Fog2 shows expression in the nervous system and optic vesicle (Figure 3). Figure 3 View largeDownload slide Expression patterns of Fog1 and Fog2 in X. laevis embryos. (A−H) Expression patterns of Fog1 mRNA. (A and B) Fog1 is weakly expressed in the animal pole at the blastula stage. (C−E) Fog1 is weekly expressed in the neural system at the neurula stage. (F−H) Fog1 is mainly expressed in the muscle, brain, optic vesicle, branchial arches, and tail end. (A’−H’) Expression patterns of Fog2 mRNA. (A’−C’) Fog2 is weakly expressed in the animal pole at the blastula stage. (D’ and E’) Fog2 is expressed in the neural system at the neurula stage. (F’−H’) Fog2 is mainly expressed in the brain, optic vesicle, and muscle. n, neural system; ov, optic vesicle; mu, muscle; ba, branchial arches; te, tail end. Figure 3 View largeDownload slide Expression patterns of Fog1 and Fog2 in X. laevis embryos. (A−H) Expression patterns of Fog1 mRNA. (A and B) Fog1 is weakly expressed in the animal pole at the blastula stage. (C−E) Fog1 is weekly expressed in the neural system at the neurula stage. (F−H) Fog1 is mainly expressed in the muscle, brain, optic vesicle, branchial arches, and tail end. (A’−H’) Expression patterns of Fog2 mRNA. (A’−C’) Fog2 is weakly expressed in the animal pole at the blastula stage. (D’ and E’) Fog2 is expressed in the neural system at the neurula stage. (F’−H’) Fog2 is mainly expressed in the brain, optic vesicle, and muscle. n, neural system; ov, optic vesicle; mu, muscle; ba, branchial arches; te, tail end. Functional study of Fog1 and Fog2 The above expression analysis suggested that Fog1 and Fog2 might have functions in development. To further interrogate the functions of Fog1 and Fog2 in development, we knocked down mRNA expression of these two genes by injecting gene-specific MO (morpholino) into X. laevis embryos. 
By visual inspection, no noticeable phenotypic change was observed in development after injecting Fog1 MO (Supplementary Figure S9). While this does not annul the potential functional importance of this gene, it highlights the need for more systematic phenotyping assays to capture the phenotypic changes after knocking down Fog1. In stark contrast, a serious change in body axis curvature was observed in embryos after knocking down Fog2 compared to wild-type embryos (Figure 4A−D, Supplementary Figure S11). This malformation could be partially rescued when Fog2 mRNA was co-injected (Figure 4E and F, Supplementary Figure S11). In consideration of expression of Fog2 in somites (muscle) and mesoderm (Figure 3, Supplementary Figure S12), and expression correlation with other genes involved in mesoderm development, we further examined the expression of MyoD, a crucial gene in somites formation as an early response to mesoderm induction in Xenopus embryos. Fog2 MO injection caused downregulation of MyoD (Figure 5). This phenotype could be partially rescued by co-injection with Fog2 mRNA (Figure 5). Since the rescue experiment was performed by co-injecting MO and Fog2 mRNA, it is important to note that MO can block not only the endogenous mRNA but also the injected mRNA. Hence, the rescued phenotype might be due an overall increase in Fog2 mRNA. Altogether, these results suggest that Fog2 is tightly coupled to the morphological development of X. laevis. BLAST analysis did not find the homologous gene of Fog2 in the genome of Tibetan frog (Nanorana parkeri) (Sun et al., 2015), supporting Fog2 as a newly evolved gene. The experiment reiterates that some new genes could be recruited in embryonic development. Figure 4 View largeDownload slide Knockdown of Fog2 causes defects or malformation in embryonic development of Xenopus at late stages of development. Embryos were injected on the left side. (A) Wild-type (WT) embryos. (C) Embryos injected with Fog2-MO. 
(E) Embryos injected with both Fog2-MO and mRNA. (B, D, F) The same embryo for A, C, E, respectively. Numbers at the bottom corners represent the number of embryos with the corresponding effect and the number of embryos used in this experiment. The side of the injection is indicated by lacZ (red dots). Figure 4 View largeDownload slide Knockdown of Fog2 causes defects or malformation in embryonic development of Xenopus at late stages of development. Embryos were injected on the left side. (A) Wild-type (WT) embryos. (C) Embryos injected with Fog2-MO. (E) Embryos injected with both Fog2-MO and mRNA. (B, D, F) The same embryo for A, C, E, respectively. Numbers at the bottom corners represent the number of embryos with the corresponding effect and the number of embryos used in this experiment. The side of the injection is indicated by lacZ (red dots). Figure 5 View largeDownload slide Knockdown of Fog2 causes downregulation of MyoD. (A) Wild-type embryos. (C) Embryos injected with Fog2-MO. (E) Embryos injected with both Fog2-MO and mRNA. (B, D, F) The opposite side for A, C, E, respectively. Numbers at the bottom corners represent the number of embryos with the corresponding effect and the number of embryos used in this experiment. The side of the injection is indicated by lacZ (red dots). Figure 5 View largeDownload slide Knockdown of Fog2 causes downregulation of MyoD. (A) Wild-type embryos. (C) Embryos injected with Fog2-MO. (E) Embryos injected with both Fog2-MO and mRNA. (B, D, F) The opposite side for A, C, E, respectively. Numbers at the bottom corners represent the number of embryos with the corresponding effect and the number of embryos used in this experiment. The side of the injection is indicated by lacZ (red dots). Discussion In the present study, we show that young protein-coding genes are significantly enriched in embryonic developmental stages after MBT. 
A decreasing trend in the level of enrichment and expression of new genes was seen in the developmental stages after MBT, a pattern that had not been previously reported. For example, knockdown of new genes in Drosophila (Chen et al., 2010) generally lead to pupal rather than embryonic lethality. A repetitive analysis of time-course developmental data in zebrafish showed no peak expression in the early developmental stages (Domazet-Loso and Tautz, 2010; Zhong et al., 2016). Numerous possibilities such as different bioinformatics practices, different data profiling strategies, as well as specie-specific variations might explain the conflicts in these observations. In particular, we studied functions of two genes Fog1 and Fog2 exhibiting high expression levels after MBT, and found that Fog2 might have an important role in development. When comparing orthologous gene coding sequences between X. laevis and X. tropicalis, we found that Ka/Ks (the ratio of the number of nonsynonymous substitutions per nonsynonymous site (Ka), to the number of synonymous substitutions per synonymous site (Ks)) of Fog1 and Fog2 is 0.283 and 0.744, respectively (Supplementary Figure S13). It seems that Fog1 evolves under purifying selection, suggesting that it still may harbor an important function, while Fog2 evolves more rapidly which occurs commonly for recently evolved new genes. After MBT, zygotic genome activation is initiated, where remarkable epigenetic modifications occur to induce the transcription of the zygotic genome to gradually take over the control of development (Ostrup et al., 2013). An example of the epigenetic changes that occur with this transition is the genomic enrichment of H3K4me3, an indicator of active promoters, which is observed at the time of ZGA in zebrafish (Vastenhouw et al., 2010) and Xenopus (Akkers et al., 2009). Another example is DNA methylation, which is considered to be a negative regulator of gene expression. 
Methylated DNA is efficiently transcriptionally repressed in Xenopus oocytes, but DNA methylation-dependent transcriptional repression is greatly reduced after ZGA, and the repressive effect of DNA methylation is only restored during organogenesis and terminal differentiation (Bogdanovic et al., 2011; Ostrup et al., 2013). These dynamic changes in epigenetic modifications enhance transcriptional activity after ZGA. The promiscuous transcription then enables proto-genes to be selected due to beneficial functions that might accidentally be gained, and thus allow them to evolve into bona fide genes (Kaessmann, 2010). This epigenetic aspect is synonymous with that observed in testis, where widespread demethylation of CpG dinucleotide-enriched promoter sequences occurs, resulting in a transcriptionally active chromatin state that facilitates the access of transcriptional machinery and promiscuous transcription of genes (Kleene, 2001; Kaessmann, 2010). Previous studies have evidenced the recruitment of new genes for new functions in the testis (Kaessmann, 2010; Tautz and Domazet-Loso, 2011; Chen et al., 2013). In conclusion, our study identifies a genomic hotbed for the origination of new genes, and illustrates likely developmental functions of new genes besides the traditional roles such as in the testis and brain. Materials and methods Animal ethics The handling of animals used in this study followed the guidelines and regulations of Kunming Institute of Zoology on animal experimentation and was approved by the Institutional Animal Care and Use Committee of the Kunming Institute of Zoology. Analysis of transcriptome data Transcriptome data from 23 developmental stages from a previous study (Tan et al., 2013) were downloaded from the NCBI SRA (http://www.ncbi.nlm.nih.gov/sra/). Detailed information on each developmental stage is available elsewhere (Tan et al., 2013). 
All reads were processed using Btrim to remove low-quality reads with parameters -l 30 -q 20 (Kong, 2011). The resulting paired and unpaired short reads were then aligned to the X. tropicalis genome (JGI_4.2; Ensembl release 72) using the gapped read alignment program TopHat 2.0.9 (Trapnell et al., 2009; Kim et al., 2011). After merging paired and unpaired alignments from the same sample, the program Cufflinks 2.0.2 was used to assemble the transcriptomes (Trapnell et al., 2010) (Supplementary Figure S14). All isoforms assembled by Cufflinks from the 45 samples were sent to the Cuffcompare utility, along with the Ensembl annotation file, to generate an integrated combined GTF annotation file. To minimize annotation artifacts, we processed the Cuffcompare output file through the following steps. First, we excluded all new single-exon transcripts and sequences with lengths < 200 bp. Next, transcripts that had FPKM values of at least 1 in any one sample, or that could be assembled in at least two samples, were considered credible. Credible transcripts labeled with the class code ‘j’ were added to the original Ensembl annotation GTF file as new isoforms of known X. tropicalis genes. We then used this new annotation as a reference and re-ran Cuffcompare to identify transcripts labeled with the class codes ‘i’, ‘x’, and ‘u’, which represent sense intronic, antisense exonic, and intergenic transcripts, respectively; these transcripts were taken to define new loci. Finally, we combined the known annotated Ensembl genes and their newly assembled isoforms with the genes from the newly identified loci to obtain a new set of annotations. This final annotation was processed by Cuffdiff together with the original alignment file to calculate the FPKM values of each gene in each sample. 
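The credibility filter described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the scripts used in the study: the `Transcript` record, its field names, and the example values are hypothetical, and the sketch assumes that per-transcript exon counts, lengths, FPKM values, and the list of samples in which each transcript was assembled have already been extracted from the Cufflinks/Cuffcompare output. Only the three cutoffs (length of 200 bp, FPKM of 1, assembly in two samples) come from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Cutoffs taken from the text; everything else in this sketch is hypothetical.
MIN_LENGTH_BP = 200          # transcripts shorter than this are excluded
MIN_FPKM = 1.0               # "FPKM values of at least 1 in any one sample"
MIN_SAMPLES_ASSEMBLED = 2    # "assembled in at least two samples"


@dataclass
class Transcript:
    """Hypothetical record for one newly assembled transcript."""
    transcript_id: str
    exon_count: int
    length_bp: int
    fpkm_by_sample: Dict[str, float] = field(default_factory=dict)
    assembled_in: List[str] = field(default_factory=list)


def is_credible(t: Transcript) -> bool:
    """Apply the two-step filter: structural exclusion, then support."""
    # Step 1: drop single-exon transcripts and sequences shorter than 200 bp.
    if t.exon_count < 2 or t.length_bp < MIN_LENGTH_BP:
        return False
    # Step 2: keep transcripts supported by expression OR by recurrence.
    expressed = any(v >= MIN_FPKM for v in t.fpkm_by_sample.values())
    recurrent = len(set(t.assembled_in)) >= MIN_SAMPLES_ASSEMBLED
    return expressed or recurrent


def filter_credible(transcripts: List[Transcript]) -> List[Transcript]:
    """Return only the transcripts that pass is_credible, in input order."""
    return [t for t in transcripts if is_credible(t)]


if __name__ == "__main__":
    # Made-up candidates, for illustration only.
    candidates = [
        Transcript("t1", 3, 850, {"s1": 2.5}, ["s1"]),
        Transcript("t2", 1, 900, {"s1": 40.0}, ["s1", "s2"]),
        Transcript("t3", 2, 400, {"s1": 0.3, "s2": 0.4}, ["s1", "s2"]),
    ]
    print([t.transcript_id for t in filter_credible(candidates)])
```

In this sketch the two support criteria are combined with a logical OR, which follows the wording of the text; a stricter pipeline could require both.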
Based on this transcript reconstruction, we analyzed the coding potential of the transcripts encoded by the newly identified loci using the Coding Potential Calculator (CPC) (Kong et al., 2007). Transcript sequences were extracted by gffread, a utility of the Cufflinks package (Trapnell et al., 2010). Transcripts with a score ≥0 were deemed coding, those with scores <−1 noncoding, and all others weak noncoding. If all transcripts within a new locus were classified as noncoding, we defined it as a lncRNA locus; lncRNA loci in intergenic regions that did not overlap any known gene locus were considered lincRNA loci.

Weighted gene co-expression network analysis

Weighted gene networks were constructed using the WGCNA package implemented in R (Langfelder and Horvath, 2008). A power of 7, at which the scale-free topology fit index curve flattens out at roughly 0.9, was used as the soft threshold for the adjacency matrix. In total, 30 modules were identified. Module membership is defined as the Pearson correlation between the expression level of a given gene and a given module eigengene. The module eigengene, defined as the first principal component of a given module, is considered representative of the gene expression profiles of that module. We then correlated the module eigengenes with a binary indicator of each time point/sample. The statistical significance of the correlations was estimated using the Student t-test. In short, the module with the highest association with a given time point/sample was inferred to probably have a biological function that underlies the specific traits of that time point/sample.

Identification of X. tropicalis-specific new genes

The ages of X. tropicalis genes were obtained from ProteinHistorian (Capra et al., 2012). Genes with an age of zero were taken as new genes that originated during the evolution of frogs. Genes that were specifically duplicated in X. 
tropicalis were retrieved by BioMart (Smedley et al., 2009) from Ensembl (http://www.ensembl.org/). Recently duplicated genes were defined as X. tropicalis-specific duplicated genes with an age of 0. Orphan genes, which are X. tropicalis specific, were identified by homology searches against protein databases of other animal species using BLASTP with an E-value cutoff of 1e−10 (Altschul et al., 1997).

Gene cloning and in vitro transcription

Approximately 570 bp of the coding sequences of the Fog1 and Fog2 genes from X. laevis were amplified by PCR and cloned into the pGEM-T vector to generate probes to detect RNA transcription. RNA from embryos of 16 stages was used as a template. The PCR primers were as follows: Fog1, forward: 5′-CCTGACTGGACTGGAGGCAAAT-3′ and reverse: 5′-GACGAGGAAGATGAGGAGATGGAA-3′. Fog2, forward: 5′-TTTGTGCGTCCTACCCTATGC-3′ and reverse: 5′-CCAGTTCAGCTAACCAGTCCCT-3′. Similarly, the full-length ORFs of Fog1 and Fog2 were cloned into the pCS2+-C-FLAG vector for mRNA transcription. Primer sequences were as follows: Fog1, forward: 5′-CCGCTCGAGATGGGCCCTGTCCCCCCAACC-3′ and reverse: 5′-CGCTCCGGAATGAAGCTTATTCAACCCTTTTTG-3′. Fog2, forward: 5′-CCGCTCGAGATGGAAGCTCCACCTGGAATATAC-3′ and reverse: 5′-CGCAGATCTGGTAACCCCAGTAACAAGTGGAC-3′. The cloned plasmids were used as templates for the in vitro transcription of RNA probes and mRNA using the SP6 mMessage Machine kit (Ambion).

Real-time PCR assay

Embryos of different stages (including 0, 3, 8, 10, 11, 14, 16, 18, 24, 25, 28, 32, and 39) were collected, and total RNA was extracted from each sample separately. RNA was then reverse transcribed using the Fermentas RevertAid First Strand cDNA Synthesis Kit to prepare templates for real-time PCR on a LightCycler (Roche Diagnostics). Expression of the two genes was examined by SYBR green qPCR using the following primers: Fog1, forward: 5′-CCAAGCCCAGGACATTCACC-3′; reverse: 5′-CCGTCTCAGGGATTAGTTCAGC-3′. Fog2, forward: 5′-GCTACAACACCTTTGTGGGTGA-3′; reverse: 5′-GCTAACCAGTCCCTCCTTTCCT-3′. 
Primers for GAPDH were used as described in Nichane et al. (2008). The following cycling conditions were used: denaturation at 95°C (10 sec), annealing at 60°C (10 sec), and extension at 72°C (10 sec).

Western blot of embryos

Embryos were lysed for 30 min on ice in lysis buffer (50 mM Tris-HCl, pH 7.4, 150 mM NaCl, 5 mM EDTA, pH 8.0, and 1% Triton X-100) with a protease inhibitor mixture (Roche). The lysates were cleared of debris by centrifugation at 4°C for 10 min at 14000 rpm. SDS loading buffer was added to the supernatant, which was then heated at 95°C for 10 min. Total lysates were then subjected to SDS-PAGE and western blot analyses. Antibodies used in this experiment were as follows: anti-FLAG (Sigma, 1:1000) and anti-β-actin (Abcam, 1:5000), with HRP-conjugated anti-rabbit or anti-mouse IgG (Pierce, 1:5000) used as secondary antibodies.

Embryo culture, microinjection, whole-mount in situ hybridization

Embryos were staged as previously outlined (Nieuwkoop and Faber, 1967). At the 4-cell stage, 12 ng morpholino (MO) and/or 0.6 ng mRNA for the genes were injected into the embryos. In all experiments, embryos were co-injected with mRNA for LacZ to identify the manipulated side of the embryo. Sequences of the MO used were: Fog1 MO: CCATCGGAACACTAATTCTGAACCT; Fog2 MO: TCCAGGTGGAGCTTCCATGATGCAG. RNA probes for Fog1 and Fog2 were used to examine levels of mRNA expression. Embryos at the appropriate stages were fixed in MEMFA (Harland, 1991). Whole-mount in situ hybridization was performed as described previously (Harland, 1991). The injected areas were identified by staining for LacZ using red-gal. Some stained embryos were embedded in paraffin and sectioned at 15 μm.

Supplementary material

Supplementary material is available at Journal of Molecular Cell Biology online.

Funding

This work was supported by the National Natural Science Foundation of China (31671325 and 31271339). N.O.O. 
thanks the CAS-TWAS President’s Fellowship Program for Doctoral Candidates for support.

Conflict of interest: none declared.

References

Akkers, R.C., van Heeringen, S.J., Jacobi, U.G., et al. (2009). A hierarchy of H3K4me3 and H3K27me3 acquisition in spatial gene regulation in Xenopus embryos. Dev. Cell 17, 425–434.
Altschul, S.F., Madden, T.L., Schaffer, A.A., et al. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402.
Bogdanovic, O., Long, S.W., van Heeringen, S.J., et al. (2011). Temporal uncoupling of the DNA methylome and transcriptional repression during embryogenesis. Genome Res. 21, 1313–1327.
Capra, J.A., Williams, A.G., and Pollard, K.S. (2012). ProteinHistorian: tools for the comparative analysis of eukaryote protein origin. PLoS Comput. Biol. 8, e1002567.
Chalamalasetty, R.B., Garriock, R.J., Dunty, W.C., et al. (2014). Mesogenin 1 is a master regulator of paraxial presomitic mesoderm differentiation. Development 141, 4285–4297.
Chen, S., Krinsky, B.H., and Long, M. (2013). New genes as drivers of phenotypic evolution. Nat. Rev. Genet. 14, 645–660.
Chen, S., Spletter, M., Ni, X., et al. (2012). Frequent recent origination of brain genes shaped the evolution of foraging behavior in Drosophila. Cell Rep. 1, 118–132.
Chen, S., Zhang, Y.E., and Long, M. (2010). New genes in Drosophila quickly become essential. Science 330, 1682–1685.
Domazet-Loso, T., and Tautz, D. (2010). A phylogenetically based transcriptome age index mirrors ontogenetic divergence patterns. Nature 468, 815–818.
Guenther, M.G., Levine, S.S., Boyer, L.A., et al. (2007). A chromatin landmark and transcription initiation at most promoters in human cells. Cell 130, 77–88.
Harland, R.M. (1991). In situ hybridization: an improved whole-mount method for Xenopus embryos. Methods Cell Biol. 36, 685–695.
Huang, D.W., Sherman, B.T., and Lempicki, R.A. (2008). Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 4, 44–57.
Kaessmann, H. (2010). Origins, evolution, and phenotypic impact of new genes. Genome Res. 20, 1313–1326.
Kenny, A.P., Rankin, S.A., Allbee, A.W., et al. (2012). Sizzled-tolloid interactions maintain foregut progenitors by regulating fibronectin-dependent BMP signaling. Dev. Cell 23, 292–304.
Kim, D., Pertea, G., Trapnell, C., et al. (2011). TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 14, R36.
Kleene, K.C. (2001). A possible meiotic function of the peculiar patterns of gene expression in mammalian spermatogenic cells. Mech. Dev. 106, 3–23.
Kong, Y. (2011). Btrim: a fast, lightweight adapter and quality trimming program for next-generation sequencing technologies. Genomics 98, 152–153.
Kong, L., Zhang, Y., Ye, Z.-Q., et al. (2007). CPC: assess the protein-coding potential of transcripts using sequence features and support vector machine. Nucleic Acids Res. 35, W345–W349.
Langfelder, P., and Horvath, S. (2008). WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9, 559.
Li, C.-Y., Zhang, Y., Wang, Z., et al. (2010). A human-specific de novo protein-coding gene associated with human brain functions. PLoS Comput. Biol. 6, e1000734.
Long, M., Betrán, E., Thornton, K., et al. (2003). The origin of new genes: glimpses from the young and old. Nat. Rev. Genet. 4, 865–875.
Long, M., VanKuren, N.W., Chen, S., et al. (2013). New gene evolution: little did we know. Annu. Rev. Genet. 47, 307–333.
Nichane, M., de Crozé, N., Ren, X., et al. (2008). Hairy2–Id3 interactions play an essential role in Xenopus neural crest progenitor specification. Dev. Biol. 322, 355–367.
Nieuwkoop, P.D., and Faber, J. (1967). Normal Table of Xenopus Laevis (Daudin). Amsterdam, The Netherlands: North-Holland Publishing Company.
Ostrup, O., Andersen, I.S., and Collas, P. (2013). Chromatin-linked determinants of zygotic genome activation. Cell. Mol. Life Sci. 70, 1–13.
Smedley, D., Haider, S., Ballester, B., et al. (2009). BioMart—biological queries made easy. BMC Genomics 10, 22.
Sun, Y.-B., Xiong, Z.-J., Xiang, X.-Y., et al. (2015). Whole-genome sequence of the Tibetan frog Nanorana parkeri and the comparative evolution of tetrapod genomes. Proc. Natl Acad. Sci. USA 112, E1257–E1262.
Tan, M.H., Au, K.F., Yablonovitch, A.L., et al. (2013). RNA sequencing reveals a diverse and dynamic repertoire of the Xenopus tropicalis transcriptome over development. Genome Res. 23, 201–216.
Tautz, D., and Domazet-Loso, T. (2011). The evolutionary origin of orphan genes. Nat. Rev. Genet. 12, 692–702.
Trapnell, C., Pachter, L., and Salzberg, S.L. (2009). TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25, 1105–1111.
Trapnell, C., Williams, B.A., Pertea, G., et al. (2010). Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat. Biotech. 28, 511–515.
Vastenhouw, N.L., Zhang, Y., Woods, I.G., et al. (2010). Chromatin signature of embryonic pluripotency is established during genome activation. Nature 464, 922–926.
Wu, D.-D., Irwin, D.M., and Zhang, Y.-P. (2011). De novo origin of human protein-coding genes. PLoS Genet. 7, e1002379.
Wu, D.-D., Ye, L.-Q., Li, Y., et al. (2015). Integrative analyses of RNA editing, alternative splicing, and expression of young genes in human brain transcriptome by deep RNA sequencing. J. Mol. Cell Biol. 7, 314–325.
Zhang, Y.E., Landback, P., Vibranovski, M.D., et al. (2011). Accelerated recruitment of new brain development genes into the human genome. PLoS Biol. 9, e1001179.
Zhong, Z., Yang, L., Zhang, Y.E., et al. (2016). Correlated expression of retrocopies and parental genes in zebrafish. Mol. Genet. Genomics 291, 723–737.

© The Author (2018). Published by Oxford University Press on behalf of Journal of Molecular Cell Biology, IBCB, SIBS, CAS. All rights reserved. 
This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices)

Journal

Journal of Molecular Cell Biology, Oxford University Press

Published: Feb 14, 2018
