Full-Length Transcriptome Sequencing and Modular Organization Analysis of the Naringin/Neoeriocitrin-Related Gene Expression Pattern in Drynaria roosii

Full-Length Transcriptome Sequencing and Modular Organization Analysis of the... Abstract Drynaria roosii (Nakaike) is a traditional Chinese medicinal fern, known as ‘GuSuiBu’. The effective components, naringin and neoeriocitrin, share a highly similar chemical structure and medicinal function. Our HPLC–tandem mass spectrometry (MS/MS) results showed that the accumulation of naringin/neoeriocitrin depended on specific tissues or ages. However, little was known about the expression patterns of naringin/neoeriocitrin-related genes involved in their regulatory pathways. Due to a lack of basic genetic information, we applied a combination of single molecule real-time (SMRT) sequencing and second-generation sequencing (SGS) to generate the complete and full-length transcriptome of D. roosii. According to the SGS data, the differentially expressed gene (DEG)-based heat map analysis revealed that naringin/neoeriocitrin-related gene expression exhibited obvious tissue- and time-specific transcriptomic differences. Using the systems biology method of modular organization analysis, we clustered 16,472 DEGs into 17 gene modules and studied the relationships between modules and tissue/time point samples, as well as modules and naringin/neoeriocitrin contents. We found that naringin/neoeriocitrin-related DEGs distributed in nine distinct modules, and DEGs in these modules showed significantly different patterns of transcript abundance to be linked to specific tissues or ages. Moreover, weighted gene co-expression network analysis (WGCNA) results further identified that PAL, 4CL and C4H, and C3H and HCT acted as the major hub genes involved in naringin and neoeriocitrin synthesis, respectively, and exhibited high co-expression with MYB- and basic helix–leucine-helix (bHLH)-regulated genes. In this work, modular organization and co-expression networks elucidated the tissue and time specificity of the gene expression pattern, as well as hub genes associated with naringin/neoeriocitrin synthesis in D. roosii. Simultaneously, the comprehensive transcriptome data set provided important genetic information for further research on D. roosii. Introduction Drynaria roosii (Nakaike), a large epiphytic fern of the family Polypodiaceae, is a source of traditional Chinese medicine (TCM) commonly known as ‘GuSuiBu’ and by the pharmaceutical name Drynaria Rhizoma (Yang et al. 2010). ‘Qianggu Capsules’, which is a medicine made using ‘GuSuiBu’ as the main material, have a role in treating osteoporosis as a Category II New Drug for TCM in China (Xie et al. 2004). ‘GuSuiBu’ contains the effective medicinal flavonoid ingredients naringin and neoeriocitrin. This TCM has been extensively used to treat bone-related diseases such as bone fracture, osteoporosis and arthritis, and has been demonstrated to have therapeutic effects on bone healing (Li et al. 2011). According to the Pharmacopoeia of the People’s Republic of China, the drug formulation of ‘GuSuiBu’ should contain no less than 0.50% naringin, which is calculated in reference to the dried drug (Chinese Pharmacopoeia Commission 2015). Naringin can improve bone properties, stimulate osteogenic differentiation and ameliorate bone loss (Ma et al. 2016). Neoeriocitrin, which displays a similar structure and function to naringin, has been reported to exert a better stimulating effects on cell proliferation and osteogenic differentiation, and to have a rescue effect on the inhibition of osteogenic differentiation (Li et al. 2011). Flavonoids are one of the largest groups of secondary metabolites and are widely distributed in plants (W.J. Xu et al. 2015). The composition of flavonoids is largely dependent on the plant species, developmental stage, tissue type and key ecological influences of biotic and abiotic origins (Mouradov and Spangenberg 2014). The environmental or developmental regulation of flavonoids mostly relies on the co-ordinated expression of early and late biosynthetic genes (W.J. Xu et al. 2015). The early biosyntheic genes, which include chalcone synthase (CHS), chalcone isomerase (CHI), flavanone 3-hydroxylase (F3H) and flavanone 3'-hydroxylase (F3'H), are involved in the production of common precursors. In contrast, the late biosynthetic genes are usually downstream genes, comprising dihydroflavonol-4-reductase (DFR), leucoanthocyanidin reductase (LAR), anthocyanidin synthase (ANS) and anthocyanidin reductase (ANR) (Xu et al. 2014). Transcriptional regulation also plays an important role in the control of flavonoid biosynthesis, and six transparent testa (tt) genes [TT2 (MYB123), TT8 (bHLH042), TTG1 (WD40 or WDR), TT16/ABS (MADS box, AGL32), TT1 (WIP1/Zn finger) and TTG2 (DSL1/WRKY44)], a R3-MYB gene (MYBL2), R2R3-MYB genes (MYB12, MYB11, MYB111) and MYB-bHLH-WDR complex genes (i.e. TT2-TT8-TTG1, MYB5-TT8-TTG1, TT2-EGL3-TTG1 and TT2-GL3-TTG1) have been confirmed to be involved in encoding transcriptional regulators in the control of flavonoid biosynthesis (Xu et al. 2014). The advent of second-generation sequencing (SGS) technologies, such as Illumina technology, has stimulated the construction of genome and transcriptome resources for medicinal mushroom and plant species, such as Ganoderma lucidum and Salvia miltiorrhiza (Chen et al. 2012, Xu et al. 2016). However, the short reads obtained from SGS result in incompletely assembled transcripts and loss of some important information. Recently, single-molecule real-time (SMRT) sequencing conducted using the PacBio RS system has provided a third-generation sequencing platform that offers four major advantages: long read lengths, high consensus accuracy, a low degree of bias and simultaneous capability for epigenetic characterization (Huddleston et al. 2014). Full-length transcript sequences permit efficient analysis of alternative splicing, alternative polyadenylation, novel genes, non-coding RNAs and fusion transcripts, facilitating a complete understanding of the transcriptional behavior of plants with available reference genomes, such as S. miltiorrhiza and sorghum (Z.C. Xu et al. 2015, Abdel-Ghany et al. 2016). Additionally, full-length transcript sequences provide information on the total length of transcripts that do not need to be assembled because PacBio reads cover most of the transcripts of plants without an available reference genome, such as Agave deserti, Agave tequilana, Fritillaria and Astragalus membranaceus (Gross et al. 2013, B. Li et al. 2014, Li et al. 2017). One concern regarding PacBio sequencing is its relatively high error rate (up to 15%), which can be addressed through correction with SGS reads and/or self-correction using circular consensus sequencing (CCS) reads (Q.S. Li et al. 2014). There has been burgeoning interest in using a network approach to discover gene–gene dependencies in the field of systems biology, and high-throughput sequencing has contributed to an increase in data sets obtained from omics analyses, allowing the complex regulatory networks associated with biological processes to be understood (Ruan et al. 2010). In transcript expression data, interactions may be determined based on the correlation between the expression of paired genes, and a gene co-expression network is constructed through the discovery of non-random gene–gene expression dependencies derived from a collection of transcriptome data sets (Ficklin and Feltus 2011). Various software tools for the construction of co-expression networks have been developed with the aim of analyzing data sets to predict hub genes that could regulate a specific pathway or be involved in a particular biological process (Moschen et al. 2016). Weighted gene co-expression network analysis (WGCNA) is a typical R package algorithm method designed to identify clusters (modules) of highly correlated genes or metabolites based on high-throughput transcriptomic sequences (Langfelder and Horvath 2008), which has been widely used in plant species such as Arabidopsis thaliana (Xie et al. 2015), maize (Tan et al. 2017), Populus (Gerttula et al. 2015) and cotton (Naqvi et al. 2017). The wild D. roosii resources have been overexploited in China, and artificial cultivation is therefore important. In this work, we found 8 years or longer was necessary for artificially cultivated D. roosii to accumulate a level of naringin/neoeriocitrin similar to wild plants, and the accumulation of both components depended on specific tissues or ages. To date, the molecular regulatory mechanisms of naringin/neoeriocitrin accumulation remain largely unclear in D. roosii. Here, the different tissues (old rhizome, new rhizome, veins and leaves) and rhizomes from plants of different ages (2, 3, 5, 6, 8 and 10 years old) were used as the experimental materials. We combined SMRT sequencing and SGS to generate the complete and full-length transcriptome of the fern D. roosii. Due to the similar chemical structure and medicinal function of naringin and neoeriocitrin, we further analyzed RNA sequencing (RNA-seq) data to understand the naringin/neoeriocitrin-related gene expression pattern (NNRGEP) in D. roosii. Our method can detect gene expression levels at the transcriptome scale, which is not dependent on the reference genome. Moreover, the systems biology method of WGCNA was used to cluster gene modules and identify hub genes associated with naringin/neoeriocitrin, and the results provided significant insight into the modular organization and co-expression networks underlying the NNRGEP of D. roosii. Our data set will serve as a valuable resource for novel gene discovery and analyses of genomics and functional genomics in fern. Understanding the regulation of flavonoid metabolism is important to generate plants enriched with flavonoid compounds to obtain desirable dietary and beneficial health properties. Studying the NNRGEP will be valuable in efforts to improve the medicinal contents of naringin/neoeriocitrin more rapidly in D. roosii. Results HPLC-MS/MS analysis of naringin/neoeriocitrin accumulation in D. roosii First, we determined the naringin/neoeriocitrin contents of four different tissues [leaves (CLs), veins (CVs), new rhizomes (CNRs) and old rhizomes (CORs)] from 3-year-old plants using HPLC, and three replicates were carried out for each sample (Fig. 1A). As shown in Figs. 1C and 2A, the contents of naringin in CLs, CVs, CNRs and CORs were 0.0473, 0.0171, 1.093 and 0.5559%, respectively, suggesting that naringin mainly accumulates in rhizomes, and especially in CNRs. The neoeriocitrin contents of CLs, CVs, CNRs and CORs were 0.0709, 0.0023, 0.1095 and 0.2109%, respectively, indicating that neoeriocitrin mainly accumulates in rhizomes and is present at a much higher level in CORs. To obtain more insights into the accumulation of naringin/neoeriocitrin with increasing age in D. roosii, we detected the contents of these two compounds in the rhizomes of wild-type plants (WT), 2-year-old plants (R2Y), 3-year-old plants (R3Y), 5-year-old plants (R5Y), 6-year-old plants (R6Y), 8-year-old plants (R8Y) and 10-year-old plants (R10Y) using HPLC, and three replicates were carried out for each sample (Fig. 1B). As shown in Fig. 1B and E, we found that the fresh and dry weight of rhizomes accumulated steadily over the years. The fresh and dry weight were only 2.86 and 0.22 g for R2Y, 14.76 and 1.73 g for R3Y, and reached up to 100.76 and 16.15 g for R10Y. Moreover, the new rhizomes accumulated in the early years with a higher proportion than the old rhizomes, while the old rhizomes accumulated in the later years with a higher proportion than the new rhizomes. Figs. 1D and 2B show that the naringin contents of WT, R2Y, R3Y, R5Y, R6Y, R8Y and R10Y rhizomes were 0.7840, 0.7031, 0.5689, 0.5955, 0.6623, 0.6960 and 0.6420%, respectively, suggesting that naringin acts as an intermediate product and accumulates stably over time, with some fluctuation. The neoeriocitrin contents of WT, R2Y, R3Y, R5Y, R6Y, R8Y and R10Y rhizomes were 0.774, 0.1775, 0.2019, 0.3253, 0.4083, 0.9855 and 1.2064%, respectively, which revealed that neoeriocitrin accumulates and increases gradually over time (Figs. 1D, 2B). Additionally, we found that ≥8 years was necessary for D. roosii to accumulate a similar level of naringin/neoeriocitrin in comparison with wild plants. (Figs. 1D, 2B). Fig. 1 View largeDownload slide Photographs and HPLC analysis of naringin/neoeriocitrin contents of D. roosii. (A) Whole D. roosii plantlets and rhizomes. The area within the green circles contains new rhizomes, and the central part of the rhizome is old. Scale bar = 5 cm. (B) Photographs of plants of different ages, 2-year-old plants (R2Y), 3-year-old plants (R3Y), 5-year-old plants (R5Y), 6-year-old plants (R6Y), 8-year-old plants (R8Y) and 10-year-old plants (R10Y). Scale bar = 5 cm. (C) The histograms represent different naringin and neoeriocitrin levels in the four different tissues. (D) The histograms represent different naringin and neoeriocitrin levels in the rhizomes of wild plants and plants of different ages. (E) The fresh and dry weight of rhizomes from plants of different ages. Vertical bars indicate the SEM for three replicates of each sample. Fig. 1 View largeDownload slide Photographs and HPLC analysis of naringin/neoeriocitrin contents of D. roosii. (A) Whole D. roosii plantlets and rhizomes. The area within the green circles contains new rhizomes, and the central part of the rhizome is old. Scale bar = 5 cm. (B) Photographs of plants of different ages, 2-year-old plants (R2Y), 3-year-old plants (R3Y), 5-year-old plants (R5Y), 6-year-old plants (R6Y), 8-year-old plants (R8Y) and 10-year-old plants (R10Y). Scale bar = 5 cm. (C) The histograms represent different naringin and neoeriocitrin levels in the four different tissues. (D) The histograms represent different naringin and neoeriocitrin levels in the rhizomes of wild plants and plants of different ages. (E) The fresh and dry weight of rhizomes from plants of different ages. Vertical bars indicate the SEM for three replicates of each sample. Fig. 2 View largeDownload slide HPLC-MS/MS analyses of naringin/neoeriocitrin contents from biological samples in D. roosii. (A) The chromatogram indicates the differences in naringin and neoeriocitrin levels from the four different tissues [leaves (CL), veins (CV), young rhizomes (CNR) and old rhizomes (COR) of 3-year-old plants]. The red arrow presents the target peak of neoeriocitrin and the blue arrow presents the target peak of naringin. Three replicates were carried out for each sample. (B) The chromatogram indicates the differences in naringin and neoeriocitrin levels from the rhizomes of plants of different ages (R2Y, R3Y, R5Y, R6Y, R8Y and R10Y). The red arrow indicates the target peak of neoeriocitrin and the blue arrow indicates the target peak of naringin. Three replicates were carried out for each sample. (C–F) HPLC-MS/MS analyses of naringin/neoeriocitrin. The product ion mass spectra of naringin from reference standards (C) and biological samples (E) in the negative ion mode. The product ion mass spectra of neoeriocitrin from reference standards (D) and biological samples (F) in the negative ion mode. Conditions: ESI interface; mass spectra were applied using electrospray ionization in negative ionization mode with the m/z range of 100–600 and 100–650; desolvation temperature, 400°C; nitrogen drying gas flow, 800 l h–1; nebulizer pressure, 45 p.s.i.; capillary voltage, 3,000 V; argon was used as the collision gas with a collision energy of 30 V. Fig. 2 View largeDownload slide HPLC-MS/MS analyses of naringin/neoeriocitrin contents from biological samples in D. roosii. (A) The chromatogram indicates the differences in naringin and neoeriocitrin levels from the four different tissues [leaves (CL), veins (CV), young rhizomes (CNR) and old rhizomes (COR) of 3-year-old plants]. The red arrow presents the target peak of neoeriocitrin and the blue arrow presents the target peak of naringin. Three replicates were carried out for each sample. (B) The chromatogram indicates the differences in naringin and neoeriocitrin levels from the rhizomes of plants of different ages (R2Y, R3Y, R5Y, R6Y, R8Y and R10Y). The red arrow indicates the target peak of neoeriocitrin and the blue arrow indicates the target peak of naringin. Three replicates were carried out for each sample. (C–F) HPLC-MS/MS analyses of naringin/neoeriocitrin. The product ion mass spectra of naringin from reference standards (C) and biological samples (E) in the negative ion mode. The product ion mass spectra of neoeriocitrin from reference standards (D) and biological samples (F) in the negative ion mode. Conditions: ESI interface; mass spectra were applied using electrospray ionization in negative ionization mode with the m/z range of 100–600 and 100–650; desolvation temperature, 400°C; nitrogen drying gas flow, 800 l h–1; nebulizer pressure, 45 p.s.i.; capillary voltage, 3,000 V; argon was used as the collision gas with a collision energy of 30 V. One concern about the determination of the components was to ensure reliable characterization of the target compounds (Li et al. 2004). Here, we compared the mass spectra and retention times of target compounds with those of corresponding reference standards from HPLC–tandem mass spectrometry (MS/MS). HPLC-MS/MS analysis confirmed that the target compounds of our biological samples with retention times of 19.50 and 24.75 min were naringin (with a mass spectrum of m/z 579 and major ion fragments of m/z 151, 271 and 459; Fig. 2C, E) and neoeriocitrin (with a mass spectrum of m/z 595 and major ion fragments of m/z 151, 287 and 459; Fig. 2D, F). Notably, the MS/MS data showed one different major ion fragment between naringin (m/z 271) and neoeriocitrin (m/z 287), which suggested a highly similar chemical structure with the only difference of one hydroxyl group. Taken together, our results confirmed that the accumulation of naringin/neoeriocitrin obviously showed tissue and time specificity in D. roosii. Single-molecule real-time sequencing of D. roosii tissues captures full-length transcripts The lack of comprehensive D. roosii gene sequence data sets limits the scope of investigations into the molecular genetic basis of the metabolism of the medicinal ingredients. Here, we used SMRT technology to generate the complete and full-length D. roosii transcriptome. We generated 4,364,566 raw reads (12.2 billion bp) using the PacBio RS platform. After performing filtering using RS_Subreads.1 of PacBio RS, approximately 12 G of subread data were obtained. We applied RS_IsoSeq.1 protocols (https://github.com/PacificBiosciences/), including classify, cluster and quiver, to generate CCS data, which provides much more accurate sequence information from reads that pass through the insert at least three times (Sharon et al. 2013). The numbers of reads of insert (ROI) with high accuracy (read quality >0.85) from the different libraries (<1, 1–2, 2–3, 3–4, 4–5 and >5 kb) were 103,983, 76,059, 108,381, 93,698, 101,228, 109,469 and 68,839, respectively (Supplementary Table S1; Supplementary Fig. S1), and most of the read quality values were distributed above 0.95 (Supplementary Fig. S2). As shown in Supplementary Fig. S3, we found that the length of ROI sequences in different libraries depended on the library size. Statistically, the numbers of full-length reads obtained from the different libraries (<1, 1–2, 2–3, 3–4, 4–5 and >5 kb) were 406,214, 1,328,291, 1,147,936, 715,286, 394,386 and 372,453, respectively (Table 1). A total of 272,121 full-length non-chimeric reads with an average size of 3,073 bp were obtained (Supplementary Fig. S4; Supplementary Table S2), and the number of high-quality full-length transcripts was condensed to 67,147, with an N50 of 2,977 bp (Fig. 3A;Supplementary Table S3). In addition, 59,294 of these full-length transcripts were predicted to be long CDS sequences using Transcoder (Fig. 3B) and relevant protein sequences (Fig. 3C). Table 1 Length distribution of raw reads from SMRT sequencing data Cell  Library size  <1k  1–2k  2–3k  3–4k  4–5k  >5k  Total reads  Total bases (bp)  m*_233855  1–2k  84,255  566,292  324,790  33,269  15,301  21,799  1,045,706  2,104,717,056  m*_151316  1–2k  71,028  360,947  213,161  32,570  17,783  33,880  729,369  1,601,915,820  m*_043401  2–3k  52,076  74,545  245,179  244,188  31,461  43,344  690,793  2,088,777,249  m*_105045  2–3k  44,911  65,733  188,850  175,074  22,733  30,060  527,361  1,545,726,502  m*_062914  3–5k  67,503  108,774  69,300  95,200  121,731  57,947  520,455  1,688,769,580  m*_092811  3–5k  56,733  102,693  70,479  108,246  161,461  61,667  561,279  1,918,188,854  m*_231835  >5k  29,708  49,307  36,177  26,739  23,916  123,756  289,603  1,227,156,669  Total  –  406,214  1,328,291  1,147,936  715,286  394,386  372,453  4,364,566  12,175,251,730  Cell  Library size  <1k  1–2k  2–3k  3–4k  4–5k  >5k  Total reads  Total bases (bp)  m*_233855  1–2k  84,255  566,292  324,790  33,269  15,301  21,799  1,045,706  2,104,717,056  m*_151316  1–2k  71,028  360,947  213,161  32,570  17,783  33,880  729,369  1,601,915,820  m*_043401  2–3k  52,076  74,545  245,179  244,188  31,461  43,344  690,793  2,088,777,249  m*_105045  2–3k  44,911  65,733  188,850  175,074  22,733  30,060  527,361  1,545,726,502  m*_062914  3–5k  67,503  108,774  69,300  95,200  121,731  57,947  520,455  1,688,769,580  m*_092811  3–5k  56,733  102,693  70,479  108,246  161,461  61,667  561,279  1,918,188,854  m*_231835  >5k  29,708  49,307  36,177  26,739  23,916  123,756  289,603  1,227,156,669  Total  –  406,214  1,328,291  1,147,936  715,286  394,386  372,453  4,364,566  12,175,251,730  Table 1 Length distribution of raw reads from SMRT sequencing data Cell  Library size  <1k  1–2k  2–3k  3–4k  4–5k  >5k  Total reads  Total bases (bp)  m*_233855  1–2k  84,255  566,292  324,790  33,269  15,301  21,799  1,045,706  2,104,717,056  m*_151316  1–2k  71,028  360,947  213,161  32,570  17,783  33,880  729,369  1,601,915,820  m*_043401  2–3k  52,076  74,545  245,179  244,188  31,461  43,344  690,793  2,088,777,249  m*_105045  2–3k  44,911  65,733  188,850  175,074  22,733  30,060  527,361  1,545,726,502  m*_062914  3–5k  67,503  108,774  69,300  95,200  121,731  57,947  520,455  1,688,769,580  m*_092811  3–5k  56,733  102,693  70,479  108,246  161,461  61,667  561,279  1,918,188,854  m*_231835  >5k  29,708  49,307  36,177  26,739  23,916  123,756  289,603  1,227,156,669  Total  –  406,214  1,328,291  1,147,936  715,286  394,386  372,453  4,364,566  12,175,251,730  Cell  Library size  <1k  1–2k  2–3k  3–4k  4–5k  >5k  Total reads  Total bases (bp)  m*_233855  1–2k  84,255  566,292  324,790  33,269  15,301  21,799  1,045,706  2,104,717,056  m*_151316  1–2k  71,028  360,947  213,161  32,570  17,783  33,880  729,369  1,601,915,820  m*_043401  2–3k  52,076  74,545  245,179  244,188  31,461  43,344  690,793  2,088,777,249  m*_105045  2–3k  44,911  65,733  188,850  175,074  22,733  30,060  527,361  1,545,726,502  m*_062914  3–5k  67,503  108,774  69,300  95,200  121,731  57,947  520,455  1,688,769,580  m*_092811  3–5k  56,733  102,693  70,479  108,246  161,461  61,667  561,279  1,918,188,854  m*_231835  >5k  29,708  49,307  36,177  26,739  23,916  123,756  289,603  1,227,156,669  Total  –  406,214  1,328,291  1,147,936  715,286  394,386  372,453  4,364,566  12,175,251,730  Fig. 3 View largeDownload slide Full-length transcriptome sequencing and assembly of D. roosii using the SMRT method. (A) Length distribution of full-length transcripts. (B) Length distribution of CDS sequences. (C) Length distribution of protein sequences. (D) NR classification diagram. (E) Functional annotation of unigenes based on GO categorization. The main functional categories within the biological process, cellular component and molecular function categories are relevant to plant physiology. The right y-axis indicates the number of unigenes. The left y-axis indicates the percentage of unigenes. (F) COG functional classification of unigenes. A total of 24,876 unigenes were grouped according to 24 COG terms. The y-axis indicates the number of unigenes. Fig. 3 View largeDownload slide Full-length transcriptome sequencing and assembly of D. roosii using the SMRT method. (A) Length distribution of full-length transcripts. (B) Length distribution of CDS sequences. (C) Length distribution of protein sequences. (D) NR classification diagram. (E) Functional annotation of unigenes based on GO categorization. The main functional categories within the biological process, cellular component and molecular function categories are relevant to plant physiology. The right y-axis indicates the number of unigenes. The left y-axis indicates the percentage of unigenes. (F) COG functional classification of unigenes. A total of 24,876 unigenes were grouped according to 24 COG terms. The y-axis indicates the number of unigenes. The annotation, pathway and functional classification of these SMRT data were analyzed based on the NR (non-redundant), GO (Gene Ontology), COG (clusters of orthologous groups), Swiss-Prot, KEGG (Kyoto Encyclopedia of Genes and Genomes) and NT (nucleotide) databases. We annotated 59,100 transcripts (88.02%) for function, among which 52,439 transcripts (78.10%) were assigned to NR databases. There were three major species (Physcomitrella patens, 18.3%; Selaginella moellendorffii, 12.8%; and Picea sitchensis, 12.6%) with high similarity to D. roosii (Fig. 3D). We assigned 29,012 transcripts (43.2%) to GO databases, which were classified into three main categories [(i) biological process, (ii) cellular component and (iii) molecular function clusters] and further distributed across 53 subcategories (Fig. 3E). Among the biological processes, candidate genes involved in metabolic and cellular processes were highly represented. In addition, 24,876 transcripts (37.1%) were allocated to COG databases (Fig. 3F). Among the 24 COG categories, ‘general function prediction only’ represented the largest group (7,213; 17.9%), followed by ‘signal transduction mechanisms’ (4,033; 10.0%), ‘replication, recombination and repair’ (3,913; 9.7%) and ‘transcription’ (3,818; 9.5%). Moreover, 43,863 transcripts (65.3%), 37,028 transcripts (55.1%) and 41,844 transcripts (62.3%) were allocated to the Swiss-Prot, KEGG and NT databases, respectively (Supplementary Table S4). Based on the SMRT data, we suggest that D. roosii has an extremely large genome. In this context, SMRT technology is an effective approach for further biological analysis of plants without a molecular genetic basis and provides a comprehensive transcriptome resource for D. roosii. Transcriptomic insights into the naringin/neoeriocitrin-related gene expression pattern in D. roosii Plant secondary metabolites are important resources of high-value chemicals with pharmacological properties. Phytochemical investigations of D. roosii have revealed that naringin and neoeriocitrin exhibit a highly similar flavonoid structure (L.L. Liu et al. 2012). For assessment of the tissue- and time-specific expression of core genes associated with naringin/neoeriocitrin and the order of accumulation, 30 RNA samples from four different tissues (CLs, CVs, CNRs and CORs) and six time points (2, 3, 5, 6, 8 and 10 years old) were subjected to SGS. Each tissue and time point sample had three biological replications. We obtained 243 billion bases of raw data, with a Q20 >95% and an approximate GC content of 48.9% (Supplementary Table S5). After filtration, there were 215 billion bases of clean data (Supplementary Table S6). A total of 118,060 transcripts and 66,499 unigenes were assembled, with N50 values of 2,382 and 2,261 bp, respectively (Supplementary Figs. S5, S6). We allocated 52,334, 39,570, 38,766, 34,129, 23,568 and 30,109 unigenes to the NR (Supplementary Fig. S7), NT, Swiss-Prot, KEGG, GO (Supplementary Fig. S8) and COG (Supplementary Fig. S9) databases, respectively (Supplementary Table S7). Based on the NR database analysis, D. roosii shared the highest similarity with three major species (P. patens, 12.6%; S. moellendorffii, 10.2%; and P. sitchensis, 8.2%), which was consistent with the results of the SMRT analysis (Supplementary Fig. S7). According to the Venn diagram, we detected 278 and 144 differentially expressed genes (DEGs) that were shared in pairwise comparisons of different tissues and rhizomes of different ages, respectively (Supplementary Fig. S10). As the basic biosynthetic pathways leading to the production of the core flavonoid skeletons have been well characterized in terms of the molecular, genetic and biochemical mechanisms of the enzymes involved (Dixon and Pasinetti 2010), we suggested the pathway of naringin and neoeriocitrin synthesis in D. roosii. Initially, the enzymes of the phenylpropanoid pathway encoded by PAL (phenylalanine ammonia lyase), C4H (cinnamic acid 4-hydroxylase) and 4CL (4-coumarate-CoA ligase) catalyze a series of reactions to form 4-coumaroyl-CoA as the substrate of flavonoid biosynthetic pathways. The evaluation of one core subpathway showed that the products of CHS and CHI are involved in two-step condensation to produce the basic skeleton of naringenin, after which F3H is involved in the production of eriodictyol. In another paratactic subpathway, caffeoyl-CoA is produced by the products of HCT (hydroxycinnamoyltransferase), C3H and C3'H from 4-coumaroyl CoA. 14 C-Labeled CHS enzyme assay results showed that the product of CHS1 prefers to convert 4-coumaroyl-CoA to naringenin chalcone, whereas the product of CHS2 prefers to convert caffeoyl-CoA to eriodictyol chalcone in barley, which provided an attractive hypothesis that the eriodictyol CHS eliminates the need for the complex P450-dependent hydroxylation of naringenin to eriodictyol (Christensen et al. 1998). Additionally, the chromatographic and spectroscopic data showed that the isomerization product of eriodictyol chalcone (also known as 2',3,4,4',6'-pentahydroxy chalcone) is eriodictyol in Tulipa (Wiermann 1970). Thus we suggested that eriodictyol chalcone is generated by CHS from caffeoyl-CoA, thereafter eriodictyol is produced via the catalytic action of CHI in D. roosii. Later, UDP-glucosyltransferase (UGT) and rhamnosyltransferase catalyze position-specific transfer of glucose and rhamnose to C-7-O- of naringenin to produce naringin (McIntosh and Mansell 1990, Bar-Peled et al. 1991). B-ring hydroxylation at position 3 of flavonoids is catalyzed by the well-studied Cyt P450 monooxygenase gene F3'H (Hutabarat et al. 2016). In the present work, we proposed that some UGT genes contribute to the final production of naringin and neoeriocitrin, and F3'H is the key candidate gene for the bioconversion of naringin to neoeriocitrin over the years in D. roosii, which requires further research (Fig. 4B). Fig. 4 View largeDownload slide Heat map depicting the expression profiles of naringin/neoeriocitrin synthesis-related genes in D. roosii. (A) Transcript abundance profiles of genes encoding enzymes involved in naringin/neoeriocitrin synthesis in different tissues. (B) Putative schematic representation of the naringin/neoeriocitrin synthetic pathways in D. roosii. PAL, phenylalanine ammonia lyase; C4H, cinnamate 4-hydroxylase; 4CL, 4-coumarate-CoA ligase; HCT, hydroxycinnamoyl transferase; C3H, p-coumarate 3-hydroxylase; C3'H, p-coumarate 3'-hydroxylase; CHS, chalcone synthase; CHI, chalcone isomerase; F3'H, flavonoid-3' hydroxylase; UGT, UDP-glucosyltransferases. (C) Transcript abundance profiles of genes encoding enzymes involved in naringin/neoeriocitrin synthesis in rhizomes of different ages. (D) Numbers of naringin/neoeriocitrin synthesis-related genes (NNSRGs) expressed at different levels (based on the value of fold change) in CVs, CNRs and CORs compared with CLs. (E) Numbers of the NNSRGs expressed at different levels (based on the value of FPKM) in different tissues. (F) Numbers of the NNSRGs expressed at different levels (based on the value of fold change) in R3Y, R5Y, R6Y, R8Y and R10Y compared with R2Y. (G) Numbers of the NNSRGs expressed at different levels (based on the value of FPKM) in rhizomes of different ages. Fig. 4 View largeDownload slide Heat map depicting the expression profiles of naringin/neoeriocitrin synthesis-related genes in D. roosii. (A) Transcript abundance profiles of genes encoding enzymes involved in naringin/neoeriocitrin synthesis in different tissues. (B) Putative schematic representation of the naringin/neoeriocitrin synthetic pathways in D. roosii. PAL, phenylalanine ammonia lyase; C4H, cinnamate 4-hydroxylase; 4CL, 4-coumarate-CoA ligase; HCT, hydroxycinnamoyl transferase; C3H, p-coumarate 3-hydroxylase; C3'H, p-coumarate 3'-hydroxylase; CHS, chalcone synthase; CHI, chalcone isomerase; F3'H, flavonoid-3' hydroxylase; UGT, UDP-glucosyltransferases. (C) Transcript abundance profiles of genes encoding enzymes involved in naringin/neoeriocitrin synthesis in rhizomes of different ages. (D) Numbers of naringin/neoeriocitrin synthesis-related genes (NNSRGs) expressed at different levels (based on the value of fold change) in CVs, CNRs and CORs compared with CLs. (E) Numbers of the NNSRGs expressed at different levels (based on the value of FPKM) in different tissues. (F) Numbers of the NNSRGs expressed at different levels (based on the value of fold change) in R3Y, R5Y, R6Y, R8Y and R10Y compared with R2Y. (G) Numbers of the NNSRGs expressed at different levels (based on the value of FPKM) in rhizomes of different ages. On the one hand, we investigated the tissue-specific expression of DEGs encoding enzymes involved in the pathways shown in Fig. 4B yielding naringin/neoeriocitrin. The DEGs were subjected to the filters of a fold change >2 or <0.5 and a false discovery rate (FDR) <0.01 in the present study. Analysis of our tissue-specific transcriptome data set revealed that 108 candidate DEGs across nine gene families related to naringin/neoeriocitrin showed significant variation among different tissues (Fig. 4A). The number of up-regulated genes (fold change >2, 5 and 10) was higher in rhizomes compared with leaves (CL samples), in particular new rhizomes (Fig. 4A, D). As shown in Fig. 4E, we found that genes in rhizomes showed higher expression [fragments per kilobase of transcript per million mapped reads (FPKM) >10 and 100] than CLs and CVs, especially in CNRs. Moreover, we identified 68 DEGs (63.0%) which exhibited up-regulated expression in rhizomes, comprising three PAL genes, three C4H genes, four 4CL genes, 17 HCT genes, four C3'H genes, 12 C3H genes, seven CHS genes, two CHI genes and 16 F3'H genes. Also, 66.2% of these DEGs appeared to be expressed at higher transcriptional levels in CNRs than in CORs. Notably, two F3'H genes and one gene each of 4CL, C3H, PAL, HCT and CHS were most highly (and quite specifically) expressed in CNRs (fold change >100; Fig. 4A;Supplementary Table S8). On the other hand, we analyzed the transcriptome data from samples of different ages and identified 92 candidate DEGs across nine gene families related to naringin/neoeriocitrin (Fig. 4C). Fig. 4F shows that the number of up-regulated genes in R3Y, R5Y, R6Y, R8Y and R10Y decreased compared with R2Y (fold change >1, 2, 5 and 10), and Fig. 4G reveals that genes presented down-regulated expression with a decreasing FPKM value over time. Moreover, 70.7% of the DEGs exhibited down-regulated expression over time, and the numbers of down-regulated DEGs (fold change <0.5) in R3Y, R5Y, R6Y, R8Y and R10Y compared with R2Y were 22, 14, 30, 33 and 47, respectively (Fig. 4C;Supplementary Table S8). Three PAL genes, three 4CL genes, one HCT gene and one C4H gene exhibited the lowest transcriptional levels in R10Y (fold change <0.1). Overall, the results confirmed the tissue and temporal expression specificity of the naringin/neoeriocitrin-related genes (NNRGs), especially for PAL, 4CL and HCT, suggesting their crucial roles in the specialized naringin/neoeriocitrin accumulation in D. roosii. RNA-seq data reveal l-phenylalanine/malonyl-CoA formation and transcription factors associated with naringin/neoeriocitrin in D. roosii Flavonoids are synthesized in plants via the flavonoid branch of the phenylpropanoid and acetate–malonate metabolic pathway, in which l-phenylalanine and malonyl-CoA are the important precursor metabolites for the phenylpropanoid metabolic pathway (Buer et al. 2010). Beyond the NNRGs displayed in Fig. 4B, we studied genes involved in the formation of l-phenylalanine via the shikimate pathway and malonyl-CoA, which were related to naringin/neoeriocitrin accumulation. For l-phenylalanine, among 25 DEGs, 14 DEGs showed a high expression level in CNRs (Fig. 5A;Supplementary Table S9). Of these, 3-deoxy-7-phosphoheptulonate synthase (CL7383) showed the highest expression level in CNRs (fold change = 354). The numbers of down-regulated DEGs (fold change <0.5) identified in R3Y, R5Y, R6Y, R8Y and R10Y compared with R2Y were 9, 2, 8, 10 and 13, respectively. For malonyl-CoA, we analyzed 112 DEGs and identified 77 DEGs (68.8%) that were specifically expressed in rhizomes. Among these DEGs, 49 (63.6%) were expressed at a high level in CNRs. Of these, alcohol dehydrogenase 1/7 (CL2849, fold change = 440) and 3-ketoacyl-CoA synthase (CL7598, fold change = 445) were most specifically expressed in CNRs. The number of down-regulated DEGs (fold change <0.5) identified in R3Y, R5Y, R6Y, R8Y and R10Y was 24, 12, 47, 33 and 46, respectively. ACCase is a biotin-containing enzyme that catalyzes the carboxylation of acetyl-CoA to malonyl-CoA. In our work, two acetyl-CoA/propionyl-CoA carboxylase genes (CL18632 and Unigene9710) encoding ACCase were expressed at a high level in CNRs, and two acetyl-CoA/propionyl-CoA carboxylase genes (CL18632 and CL10663) exhibited a down-regulated expression pattern over time. The DEGs involved in l-phenylalanine formation (phenylalanine, tyrosine and tryptophan biosynthesis) and malonyl-CoA formation (fatty acid biosynthesis, fatty acid elongation and fatty acid metabolism) exhibited tissue-specific expression profiles, with the highest expression being observed in CNRs, and displayed time point expression specificity with down-regulated expression over time. Fig. 5 View largeDownload slide Heat map depicting the expression profiles of precursor metabolism and transcription factor-regulated genes in D. roosii. (A) Profiles of the transcriptional abundance of enzymatic genes involved in malonyl-CoA/l-phenylalanine formation in different tissues and plants of different ages. (B) Profiles of the transcriptional abundance of transcription factor-regulated (MYB- and bHLH-regulated) genes in different tissues and plants of different ages. Fig. 5 View largeDownload slide Heat map depicting the expression profiles of precursor metabolism and transcription factor-regulated genes in D. roosii. (A) Profiles of the transcriptional abundance of enzymatic genes involved in malonyl-CoA/l-phenylalanine formation in different tissues and plants of different ages. (B) Profiles of the transcriptional abundance of transcription factor-regulated (MYB- and bHLH-regulated) genes in different tissues and plants of different ages. To understand the regulatory role of transcription factors (TFs) on gene expression in D. roosii, we analyzed the expression profiles of MYB- and basic helix–loop–helix (bHLH)-regulated DEGs in different tissues and plants of different ages, and discovered 84 MYB- and 118 bHLH-regulated DEGs (Fig. 5B;Supplementary Table S10). Among these MYB-regulated DEGs, 57 (67.9%) were highly expressed in rhizomes, and 41 (71.9%) were more highly expressed in CNRs. Two MYB-regulated DEGs (CL17931, fold change = 433; CL8191, fold change = 581) were most highly expressed in CNRs. The numbers of down-regulated DEGs (fold change <0.5) identified in R3Y, R5Y, R6Y, R8Y and R10Y were 18, 19, 31, 35 and 40, respectively. Among the bHLH-regulated DEGs, 88 (74.6%) were highly expressed in rhizomes and 63 (71.6%) were more highly expressed in CNRs. The bHLH-regulated DEGs (CL4885, fold change = 468; Unigene731, fold change = 476) were most specifically expressed in CNRs. The number of down-regulated DEGs (fold change <0.5) identified in R3Y, R5Y, R6Y, R8Y and R10Y was 19, 13, 52, 33 and 62, respectively. The results revealed that TF-regulated genes share a similar expression pattern, with higher transcription levels in CNRs, and displayed down-regulation along with increasing age, suggesting that the regulation of gene expression by bHLH and MYB depended on different tissues and ages in D. roosii. Weighted gene co-expression network analysis in D. roosii To identify the relationship between modules and samples, a gene co-expression network was constructed using 16,472 DEGs (Fig. 6A). Principal component analysis data showed that principal components 1, 2 and 3 explained 74.83, 14.90 and 4.42% of the observed variation, respectively (Supplementary Fig. S11). In Fig. 6A, a network heat map plot (interconnectivity plot) of the gene network is shown, with the corresponding hierarchical clustering dendrograms and resulting modules. Using a set of parameters that produced refined clusters, we detected 17 modules containing from 41 to –6,137 DEGs each (fold change >2 or <0.5; FDR <0.01). The expression profile of each module is summarized by a module eigengene, which is analogous to the first principal component of the module expression data (Langfelder and Horvath 2008). The modules were defined using color codes, as shown in Fig. 6B. We also calculated the similarity and clustered these modules into different groups depending on their eigenvalues. A comparison of the module eigengenes (Fig. 6B, C) revealed four related module sets that displayed similar expression profiles across different samples: (i) the royalblue, saddlebrown, brown, orangered4 and purple modules; (ii) the violet, darkmagenta, salmon and turquoise modules; (iii) the floralwhite, darkorange and pink modules; and (iv) the darkorange2, tan, plum1, bisque4 and lightsteelblue1 modules. The 17 module eigengenes for the 17 distinct modules showed some cross-talk with distinct sample types, due to the tissue-specific expression profiles of the eigengenes. Notably, seven of the 17 co-expression modules were composed of genes that were highly expressed in specific samples (R > 0.45; P < 10–3). Therefore, each of these seven modules identified (or was correlated with) genes associated with a specific tissue or time point. For example, the salmon module identified 2,413 CV-specific genes; the darkorange module identified 1,633 R2Y-specific genes; the darkorange4 module identified 669 R10Y-specific genes; and the saddlebrown module identified 807 R8Y-specific genes (Fig. 6D). Fig. 6 View largeDownload slide Identification of co-expression network modules in D. roosii. (A) Network heat map plot. Branches in the hierarchical clustering dendrograms correspond to modules. Color-coded module membership is indicated by the colored bars below and to the right of the dendrograms. In the heat map, high co-expression interconnectedness is indicated by progressively more saturated yellow and red colors. Modules correspond to blocks of highly interconnected genes. Genes with high intramodular connectivity are located at the tip of the module branches because they display the highest interconnectedness with the rest of the genes in the module. (B) Module eigenvector clustering. The network consists of 17 modules. The eigenvectors for each module are prefixed with module eigengenes (MEs) in the dendrogram and are calculated prior to thresholding the network. Adjacent modules are more highly similar in expression. (C) Module eigengene adjacency heat map. The heat map shows the relatedness of the 17 co-expression modules identified in WGCNA, with red indicating highly related and blue indicating not related. (D) Module–sample association. Each row corresponds to a module. The color of each cell at the row–column intersection indicates the correlation coefficient between the module and the sample type. A high degree of correlation between a specific module and the sample type is indicated with dark red. Fig. 6 View largeDownload slide Identification of co-expression network modules in D. roosii. (A) Network heat map plot. Branches in the hierarchical clustering dendrograms correspond to modules. Color-coded module membership is indicated by the colored bars below and to the right of the dendrograms. In the heat map, high co-expression interconnectedness is indicated by progressively more saturated yellow and red colors. Modules correspond to blocks of highly interconnected genes. Genes with high intramodular connectivity are located at the tip of the module branches because they display the highest interconnectedness with the rest of the genes in the module. (B) Module eigenvector clustering. The network consists of 17 modules. The eigenvectors for each module are prefixed with module eigengenes (MEs) in the dendrogram and are calculated prior to thresholding the network. Adjacent modules are more highly similar in expression. (C) Module eigengene adjacency heat map. The heat map shows the relatedness of the 17 co-expression modules identified in WGCNA, with red indicating highly related and blue indicating not related. (D) Module–sample association. Each row corresponds to a module. The color of each cell at the row–column intersection indicates the correlation coefficient between the module and the sample type. A high degree of correlation between a specific module and the sample type is indicated with dark red. Identification of naringin/neoeriocitrin-related gene expression pattern using modular organization analysis The investigation of relationships between module eigengenes and naringin/neoeriocitrin contents revealed that the correlation coefficients varied widely, from –0.79 to 0.7 for naringin and from –0.58 to 0.83 for neoeriocitrin (Fig. 7A). The eigengenes of the darkorange2, tan, turquoise and salmon modules showed significant correlations with naringin (|R| > 0.5, P < 0.001), although the turquoise and salmon modules were negatively correlated (Fig. 7A). The saddlebrown, brown, orangered4, purple and darkmagenta modules showed significant correlations with neoeriocitrin (|R| >0.5, P < 0.001), although darkmagenta was negatively correlated. These nine modules of interest comprised 13,704 DEGs, among which the turquoise module was the largest, with 6,137 DEGs. A heat map of the relative expression of each gene from the nine modules revealed that many of the darkorange2 and tan module genes (positive correction with naringin) were most highly expressed in CNRs, followed by R3Y, with a high ratio of young tissue. On the other hand, a group of genes in the turquoise and salmon modules (negative correction with naringin) were more highly expressed in CLs and CVs, respectively. Moreover, most genes in the brown, orangered4, purple and saddlebrown modules (positive correction with neoeriocitrin) were specifically expressed in R10Y and R8Y, with a high ratio of old tissue, and many genes in the darkmagenta module (negative correction with neoeriocitrin) were specifically expressed in CNRs (Fig. 7B). Genes positively associated with naringin were specifically expressed in new rhizomes (CNRs/R3Y), while genes positively associated with neoeriocitrin were specifically expressed in old rhizomes (R8Y/R10Y), suggesting the tissue- and time point-specific pattern of the NNRGs in D. roosii. Furthermore, we performed an enrichment analysis of GO annotations of naringin/neoeriocitrin-related module genes, and the results showed that the greatest number of DGEs fell in ‘metabolic process’ of the biological process category (Supplementary Table S11). Fig. 7 View largeDownload slide Modular organization analysis of the naringin/neoeriocitrin-related gene expression pattern in D. roosii. (A) Module–naringin/neoeriocitrin correlations and corresponding P-values (in parentheses). The left panel shows 17 modules (royalblue, saddlebrown, brown, orangered4, purple, violet, darkmagenta, salmon, turquoise, floralwhite, darkorange, pink, darkorange2, tan, plum1, bisque4 and lightsteelblue) and their member genes. The right panel is a color scale for module–trait correlation, from −1 to 1. The number of genes in each module is indicated on the left. Each column corresponds to a specific sample. (B) Expanded view of the expression of naringin/neoeriocitrin-related DEGs in the nine modules across all samples. Heat map showing the relative expression of each gene from the saddlebrown, brown, orangered4, purple, darkorange2 and tan modules(positive correlation), and darkmagenta, salmon and turquoise modules (negative correlation). Fig. 7 View largeDownload slide Modular organization analysis of the naringin/neoeriocitrin-related gene expression pattern in D. roosii. (A) Module–naringin/neoeriocitrin correlations and corresponding P-values (in parentheses). The left panel shows 17 modules (royalblue, saddlebrown, brown, orangered4, purple, violet, darkmagenta, salmon, turquoise, floralwhite, darkorange, pink, darkorange2, tan, plum1, bisque4 and lightsteelblue) and their member genes. The right panel is a color scale for module–trait correlation, from −1 to 1. The number of genes in each module is indicated on the left. Each column corresponds to a specific sample. (B) Expanded view of the expression of naringin/neoeriocitrin-related DEGs in the nine modules across all samples. Heat map showing the relative expression of each gene from the saddlebrown, brown, orangered4, purple, darkorange2 and tan modules(positive correlation), and darkmagenta, salmon and turquoise modules (negative correlation). WGCNA reveals gene co-expression networks associated with naringin/neoeriocitrin Among these 17 modules, the turquoise and tan modules had the largest number of MYB- and bHLH-regulated genes compared with other modules, which indicated the involvement of MYB and bHLH TFs in regulating gene expression in the turquoise and tan modules (Fig. 8A). The hub gene network analysis revealed a hierarchical organization of highly connected genes in each module, through which core controlling (hub) genes in the modular network could be identified. To evaluate the co-expression pattern of NNRGs, we compared the networks using naringin/neoeriocitrin synthesis-related genes (NNSRGs) and TF-regulated DEGs in the turquoise and tan modules. The connections of the naringin-related and TF-regulated DEGs produced a network of 60 nodes and 440 edges in the turquoise module, showing that eight metabolic pathway hub genes (three C4H gene, two 4CL genes, two CHS genes and one PAL gene) were highly co-expressed with 52 TF-regulated genes (16 MYB- and 36 bHLH-regulated genes). The WGCNA between 15 pathway hub genes (five CHS genes, three 4CL genes, three CHI genes, three PAL genes and one C4H gene) and the list of 57 TF-regulated genes (22 MYB- and 35 bHLH-regulated genes) was conducted in the tan module associated with naringin, and the resulting network contained 72 nodes and 948 edges. Regarding the significant neoeriocitrin synthesis-related and relevant TF-regulated genes, the co-expression networks contained 92 nodes and 2,802 edges in the turquoise module, and 91 nodes and 2,447 edges in the tan module. The turquoise module included 40 hub pathway genes (14 HCT genes, 11 F3'H genes, five C3H genes, three C4H genes, two C3'H genes, two CHS genes, two 4CL genes and one PAL gene) and 52 TF-regulated genes (16 MYB- and 36 bHLH-regulated genes). A total of 34 hub pathway genes in the tan module comprised eight F3'H genes, seven C3H genes, five CHS genes, four HCT genes, three PAL genes, three 4CL genes, three CHI genes and one C4H gene, together with 22 MYB-regulated genes and 35 bHLH-regulated genes (Fig. 8B;Supplementary Tables S12, S13). Fig. 8 View largeDownload slide Networks enriched in naringin/neoeriocitrin-related genes revealed via WGCNA. (A) Distribution of transcription factor-regulated genes in D. roosii. The white bar represents total TF-regulated DEGs; the red bar represents MYB-regulated DEGs; the green bar represents bHLH-regulated DEGs. (B) The naringin/neoeriocitrin-related networks are constructed using synthesis-related genes and relevant transcription factor-regulated genes in turquoise and tan modules. The naringin/neoeriocitrin synthesis-related genes are highlighted in turquoise and tan modules, respectively. Additionally, see Supplementary Tables S12 and S13. (C) Relative gene expression values are based on log10 RPKM according to the networks of turquoise and tan modules. Fig. 8 View largeDownload slide Networks enriched in naringin/neoeriocitrin-related genes revealed via WGCNA. (A) Distribution of transcription factor-regulated genes in D. roosii. The white bar represents total TF-regulated DEGs; the red bar represents MYB-regulated DEGs; the green bar represents bHLH-regulated DEGs. (B) The naringin/neoeriocitrin-related networks are constructed using synthesis-related genes and relevant transcription factor-regulated genes in turquoise and tan modules. The naringin/neoeriocitrin synthesis-related genes are highlighted in turquoise and tan modules, respectively. Additionally, see Supplementary Tables S12 and S13. (C) Relative gene expression values are based on log10 RPKM according to the networks of turquoise and tan modules. Among naringin-related modules, 10 hub genes in the tan module showing the highest edge number (71 edges) comprised three PAL genes, three 4CL genes, two CHI genes, one C4H gene and one CHS gene, while seven hub genes in the turquoise module exhibiting the highest edge number (59 edges) comprised three C4H genes two 4CL genes, one PAL gene and one CHS gene (Supplementary Table S12). These results indicated that the 4CL, PAL and C4H gene families might play the central roles in naringin synthesis. For neoeriocitrin-related modules, 19 hub genes in the tan module presenting the highest edge number (90 edges) consisted of six C3H genes, three HCT genes, three 4CL genes, two F3'H genes, two PAL genes and one each of C4H, CHS and CHI. The 23 hub genes in the turquoise module exhibiting the highest edge number (91 edges) were nine HCT genes, three C3H genes, two C4H genes, two 4CL genes, two C3'H genes, one PAL gene and one CHS gene (Supplementary Table S12). These results suggested that the HCT and C3H gene families might play the most significant roles in neoeriocitrin synthesis. In total, the above results showed that the 4CL, PAL and C4H, and HCT and C3H gene families acted as the most important hub genes in the naringin- and neoeriocitrin-related co-expression networks, respectively, and played vital roles in naringin and neoeriocitrin synthesis. In addition, the majority of naringin/neoeriocitrin synthesis-related hub genes exhibited high co-expression with TF-regulated genes, which suggested that these TF-regulated genes were candidate genes involved in regulating naringin/neoeriocitrin synthesis, and MYB/bHLH TFs might take part in regulating the NNSRGs at the transcriptional level in D. roosii. Fig. 8C shows the patterns of transcript abundance of co-expressed hub enzymes and MYB/bHLH TF-regulated genes based on Fig. 8B. Genes in the turquoise and tan modules exhibited a highly co-expressed pattern and showed a negative and positive association with naringin content, respectively. Furthermore, the DEGs in the turquoise and tan modules were mapped onto KEGG pathways and presented the top 20 enriched pathways (Supplementary Fig. S12A, C). Of these, ‘Flavonoid biosynthesis’ (ko00941), ‘Flavone and flavonol biosynthesis’ (ko00944) and ‘Biosynthesis of secondary metabolites’ (ko01110) were enriched observably for turquoise and tan modules. ‘Phenylpropanoid biosynthesis’ (ko00940), ‘Phenylalanine metabolism’ (ko00360) and ‘Fatty acid elongation’ (ko00062) were particularly enriched for the tan module. Supplementary Fig. S12B and D presents the number of DEGs in the turquoise and tan modules involved in ‘Biosynthesis of secondary metabolites’ (ko01110), ‘Fatty acid metabolism’ (ko00071), ‘Tyrosine metabolism’ (ko00350) and ‘Tryptophan metabolism’ (ko00380), which were related to naringin/neoeriocitrin accumulation in D. roosii. The number of genes involved in ‘Flavonoid biosynthesis’ and ‘Phenylpropanoid biosynthesis’ (ko00940) was the highest when compared with other pathways. Indeed, biosynthesis of secondary metabolites, especially of flavonoid, was enriched in these naringin/neoeriocitrin-related modules depending on tissues or ages in D. roosii. Based on the co-expression network results, we detected the expression profiles of several hub genes associated with naringin/neoeriocitrin synthesis using quantiative real-time reverse transcription–PCR (qRT–PCR). The present results showed that the transcripts of PAL (CL9563), 4CL (CL5245) and C4H (CL4574) genes associated with naringin were higher in rhizomes, especially in new rhizomes, when compared with leaves and veins. Also, the expression of these genes remained unregulated over the years (Fig. 9A–C). Similarly, the transcripts of HCT (CL11832; CL4397) and C3H (CL6561) genes associated with neoeriocitrin were higher in the new rhizome than in other tissues. Nevertheless, the expression of these genes was clearly down-regulated over the years (Fig. 9D–F). The qRT–PCR results were consistent with our transcriptome data and supported the tissue- and age-dependent naringin/neoeriocitrin accumulation profile as shown in Figs. 1C, D and 2A, B. Fig. 9 View largeDownload slide Quantitative real-time PCR analysis of the PAL (CL9563), 4CL (CL5245), C4H (CL4574), HCT (CL11832 and CL4397) and C3H (CL6561) transcript levels in different tissues and D. roosii plants of different ages, respectively. Amplification of the ACTIN (CL186) gene was used as an internal control. Bars represent the mean and SD of values obtained from three biological replicates. Fig. 9 View largeDownload slide Quantitative real-time PCR analysis of the PAL (CL9563), 4CL (CL5245), C4H (CL4574), HCT (CL11832 and CL4397) and C3H (CL6561) transcript levels in different tissues and D. roosii plants of different ages, respectively. Amplification of the ACTIN (CL186) gene was used as an internal control. Bars represent the mean and SD of values obtained from three biological replicates. Discussion Medicinal aspects of naringin and neoeriocitrin in the fern D. roosii Osteoporosis is the most common bone-related health problem, affecting millions of people around the world, and especially the aging population. Naringin and neoeriocitrin are considered the main effective compounds of ‘GuSuiBu’, which has been extensively studied in TCM regarding its protective role in the maintenance of bone mass in various models of osteoporosis (Zhang et al. 2017). A reversed-phase HPLC with photodiode array detection method was previously developed for the accurate quantitative determination of naringin/neoeriocitrin and applied for fingerprint analysis of D. fortunei rhizomes (H.T. Liu et al. 2012). In the present study, HPLC-MS/MS analysis confirmed the highly similar chemical structure (Fig. 2C, D) and revealed the tissue- and time-specific accumulation pattern of naringin/neoeriocitrin in D. roosii. The naringin content presented particular high accumulation in CNRs and remained relatively stable over the years, whereas neoeriocitrin content was maintained at a much higher level in CORs and increased gradually over the years (Figs. 1C, D, 2A, B). Moreover, we found that at least 8 years was required for D. roosii to accumulate similar naringin and neoeriocitrin contents to wild plants, which are commonly harvested for medicinal use (Fig. 1D). However, 3 years of growth time was adequate for biomass accumulation of D. roosii (Fig. 1E). To balance the time cost and biomass demand for industrial development, we suggest that 3 years of growth is reasonable for further studies aimed at improving medicinal components for D. roosii. Combined sequencing approach to achieve a comprehensive transcriptome resource of D. roosii Structural and functional genomics studies are fundamental to the understanding of plant metabolism, necessitating further cloning efforts to obtain comprehensive sequences for the investigation of potential mechanisms underlying the specialized naringin/neoeriocitrin accumulation pattern in D. roosii. To perform such studies, access to high-quality genome and transcriptome sequences is essential. The transcripts derived from SGS may suffer from misassembly of the reads transcribed from highly repetitive regions or very similar members of multigene families, which may be even more severe for polyploid plants with many nearly identical homoeologous gene sets (B. Li et al. 2014). Based on RNA-seq data, we found that D. roosii harbored a mass of nearly identical homoeologous sequences, indicating its extremely large genome. SMRT sequencing with a read length up to 20 kb renders PacBio RSII very effective in the sequencing of full-length cDNAs that exhibit long transcript isoforms (Sharon et al. 2013). In this work, we presented comprehensive, high-quality, full-length transcriptome sequences for D. roosii using SMRT technology. A recent study reported that high-accuracy de novo transcriptome assemblies of A. deserti and A. tequilana in this way offered new resources for studying stress adaption (Gross et al. 2013). Based on our SMRT and SGS data, NR analysis showed that D. roosii shared high similarity with three major species (P. patens, S. moellendorffii and P. sitchensis), which provided us with the reference plants for study of D. roosii (Fig. 3D;Supplementary Fig. S7). The combination of SMRT sequencing and SGS technology first provided such abundant genetic information resources for D. roosii, vastly contributing to further study of its medicinal value. Transcriptomic analysis of the tissue- and time-specific naringin/neoeriocitrin-related gene expression pattern Numerous studies have indicated that the expression of the genes involved in the flavonoid biosynthetic pathway is controlled by distinct mechanisms in a tissue- or species-specific manner (Schaart et al. 2013). This regulation involves different sets of transcriptional regulators that are specific to each class of flavonoids (Xu et al. 2014). Using a non-targeted metabolomics approach, statistical analyses revealed that both development time and tissue type influenced metabolic changes, and development time seemed to produce more effects than tissue type on the fruit metabolome in cucumber. Most of the metabolite contents decreased gradually, while those of some flavonoids, amino acids and carbohydrates increased throughout development (Hu et al. 2018). The expression pattern in blackberry tissue showed that 6,604 and 6,599 genes were particularly overexpressed in fruits and leaves, respectively. Moreover, bioactives characterization indicated that total phenolics and flavonols were higher in leaves, while anthocyanins were higher in fruits (Gutierrez et al. 2017). The expression patterns of ParPAL1 and ParPAL2 genes were investigated in various vegetative tissues of apricot, and ParPAL1 transcripts were 3-fold more abundant in callus tissue than in leaves and bark tissue; nevertheless, ParPAL2 was expressed at the same level in all tissues (Irisarri et al. 2016). A tissue-specific expression analysis indicated that Ma4CL3 was strongly expressed in root bark, stem bark and old leaves, and its expression pattern was similar to the trend of the total flavonoid content throughout fruit development in mulberry (Wang et al. 2016). In the present study, the transcriptome analysis confirmed that rhizomes served as the main tissue in which naringin and neoeriocitrin were generated and stored, and revealed that the NNRGs exhibited tissue- or time-specific expression patterns in D. roosii. Fig. 4 and Supplementary Table S8 showed that 68 DEGs associated with naringin/neoeriocitrin synthesis (63.0%) exhibited higher expression in rhizomes, with 66.2% of these DEGs appearing to be most highly expressed in CNRs, and the number of down-regulated DEGs (fold change <0.5) increased over time. Notably, the obviously unique expression pattern of PAL, 4CL and HCT observed in different tissues and plants of different ages suggested the pivotal roles of these genes in naringin/neoeriocitrin synthesis. Combined with the similar tissue- and time-specific expression patterns of other NNSRGs (F3'H, C3H, CHS and C4H), our study illustrated the molecular regulatory mechanisms underlying naringin/neoeriocitrin synthesis in D. roosii. Flavonoids are derived from phenylalanine, which is a precursor of the phenylpropanoid metabolic pathway, via shikimate biosynthesis. Shikimate biosynthesis provides carbon skeletons for aromatic amino acids (Tohge et al. 2013). Accordingly, we found that the DEGs involved in l-phenylalanine formation via the shikimate pathway exhibited a tissue- and time point-specific expression pattern, in which the corresponding genes were most highly expressed in CNRs and displayed a down-regulated expression pattern over time (Fig. 5A). Malonyl-CoA is an important precursor metabolite for the biosynthesis of flavonoids and is naturally consumed for fatty acid biosynthesis, elongation and metabolism, leaving a reduced amount of malonyl-CoA available for the production of other metabolic targets (Chen et al. 2011). In the present study, the DEGs involved in malonyl-CoA formation exhibited rhizome-specific expression profiles, and were more highly expressed in CNRs (especially genes in the fatty acid elongation pathway); these DEGs displayed a down-regulated expression pattern over time. Recent studies have indicated that the expression of PAL, 4CL and F3H in the phenolic biosynthesis pathway may be positively or negatively regulated by MYB and bHLH TFs (Sun et al. 2017). In the case of grapevine, VviMYBC2-L1 and VviMYBC2-L3 have been characterized as negative regulators to fine-tune flavonoid levels (Cavallini et al. 2015). In the present study, we discovered that 67.9% of MYB-regulated DEGs were highly expressed in rhizomes, and 71.9% of these DEGs were more highly expressed in CNRs. Among the bHLH-regulated DEGs, 74.6% displayed higher transcriptional levels in rhizomes, and 71.6% of these DEGs were more highly expressed in CNRs. Moreover, the number of down-regulated (fold change <0.5) MYB- and bHLH-regulated DEGs clearly increased over time (Fig. 5B). All these findings indicated that MYB- and bHLH-regulated genes exhibited a similar expression pattern to NNSRGs, and MYB- or bHLH-mediated naringin/neoeriocitrin synthesis might also depend on different tissues and ages. WGCNA of naringin/neoeriocitrin-related gene expression The systems approach for data mining via WGCNA was particularly fruitfull in identifying information about modules that were shared by subsets of samples and linked to phenotypic traits, and this method appears to be quite effective in revealing common features or trait-specific modules and important hub genes for systems biology research (Hollender et al. 2014). The present study was a novel application of WGCNA in systems biology research on the naringin/neoeriocitrin-related gene expression in D. roosii. Our work elucidated 17 clustered modules of a total of 16,472 DEGs depending on the gene expression, and analyzed the relationships of modules/modules, modules/samples and modules/traits (Figs. 6, 7). Moreover, the salmon, darkorange, darkorange4 and saddlebrown modules identified 2,413 CV-specific, 1,633 R2Y-specific, 669 R10Y-specific and 807 R8Y-specific genes, respectively (Fig. 6D). Nine modules exhibited significant association with naringin/neoeriocitrin, and DEGs in each module displayed higher expression in a specific tissue or at a specific time point (Fig. 7). These results contributed a lot to understanding the expression profiles of the NNRGs in different tissues and plants of different ages, and confirmed the tissue and time specificity of the NNRGEP in D. roosii. Hub gene analyses were powerful in revealing important regulatory factors for specific biological features. In the present study, the transcriptome data analyses and WGCNA revealed information on hub genes, which presented high co-expression with MYB- and bHLH-regulated genes in networks. For naringin-related genes, three PAL genes, three 4CL genes and three C4H genes exhibited the highest edge number, while for neoeriocitrin-related genes, six C3H genes and nine HCT genes displayed the highest edge number in tan and turquoise module networks (Fig. 8B). The gene families of 4CL, PAL and C4H, and HCT and C3H played important roles in naringin and neoeriocitrin synthesis, respectively, which was consistent with the heat map analysis of PAL, 4CL and HCT genes in D. roosii. Different branches of the flavonoid pathway are highly regulated at the transcriptional level, especially by MYB and bHLH TFs, with some genes being co-expressed to stimulate or inhibit flavonoid accumulation in blackberry (Gutierrez et al. 2017). Considering the highly co-expressed relationship between naringin/neoeriocitrin synthesis-related hub genes and TF-regulated genes, we suggested that these MYB- and bHLH-regulated genes were candidates involved in regulating naringin/neoeriocitrin synthesis, and MYB/ bHLH TFs were indirectly linked to the hub genes of 4CL, PAL, C4H, HCT and C3H. The assessment of co-expression groups might enable basic discoveries related to genome function, regulatory networks, biomarkers and candidate genes, and support new evidence for the reorganization of plant metabolism. Materials and Methods Plant materials and RNA sample preparation Drynaria roosii was planted in the greenhouse at the Institute of Botany of the Chinese Academy of Sciences. Rhizomes from plants of different ages were collected. Three-year-old plants were also collected and divided into four parts (old rhizomes, new rhizomes, veins and leaves). Wild plants were harvested from Mount Emei in the Sichuan Province of China. Three plant replicates per experimental treatment and 10 plants per replicate were harvested. All of the rhizome samples from wild plants, plants of different ages (2-, 3-, 5-, 6-, 8- and 10-year-old rhizomes) and different tissues (old rhizomes, new rhizomes, veins and leaves) were immediately frozen in liquid nitrogen and stored at −80°C until RNA isolation. All rhizome samples were cut into pieces, mixed evenly and ground into flour with a grinding machine (IKA A11 basic, Staufen). Total RNA was obtained from each sample using an RNAprep Pure Plant kit (TIANGEN Biotech) according to the manufacturer’s instructions. The quality of RNA was determined using a NanoDrop 2000 spectrophotometer and an Agilent 2100 bioanalyzer (Agilent Technologies). Quantitative analyses and validation of naringin/neoeriocitrin metabolites Samples were analyzed using HPLC according to methods described previously (Cui et al. 2016). The powdered D. roosii samples were dried using vacuum freeze-drying equipment and passed through a 40 mesh sieve, after which 0.02 g of the sample powder was extracted with 2 ml of 80% ethanol in an ultrasonic bath for 30 min. The extracted solution was stored in a refrigerator at 4°C for 16 h and then placed in an ultrasonic bath for 30 min. The solution was filtered through a 0.45 µm filter membrane before the injection of 10 µl for HPLC analysis. A Dionex Ultimate 3000 HPLC system (Thermo Fisher Scientific) was used for the analysis. In addition, analysis was carried out on a ANPEL Athena C18-WP chromatographic column (4.6×150 mm, 5 μm, internal diameter, Tosoh) with a gradient elution program. The mobile phase was composed of 0.1% formic acid in ultrapure water (A) and acetonitrile (B). The elution program was optimized and conducted as follows: a linear gradient of 10–20% B (0–10 min), a linear gradient of 20–26% B (10–31 min), a linear gradient of 26–10% B (31–33 min) and a linear gradient of 10% B (33–35 min), at a flow rate of 0.8 ml min–1. The column temperature was maintained at 35°C, and the detection wavelength at 280 nm. Three replicates were carried out for each sample. The reference standards of naringin and neoeriocitrin used in HPLC analysis were purchased from Sigma-Aldrich. HPLC grade acetonitrile was obtained from Thermo Fisher Scientific. All other regents were of analytical grade and obtained from Beijing Chemical Works. Target compounds were identified using HPLC-MS/MS as described by Cui et al. (2016). Mass spectra were applied using electrospray ionization in negative ionization mode with the m/z range of 100–600 and 100–650. A nitrogen drying gas flow of 800 l h–1, a desolvation temperature at 400°C, a nebulizer pressure of 45 p.s.i. and a capillary voltage of 3,000 V were used. Argon was used as collision gas with collision energy of 30 V. PacBio library construction and sequencing The purified RNA samples from the four different tissues were used for PacBio library construction and sequencing. Normalized cDNAs of different sizes were constructed separately for seven SMRT cell libraries using the Clontech SMARTer PCR cDNA Synthesis Kit. The templates were bound to SA-DNA polymerase and V2 primers using the DNA/Polymerase Binding Kit 2.0. The complexes of templates and polymerase were bound to magnetic beads and then transferred to a 96-well PCR plate for processing using a PacBio RS system with C2 sequencing reagents. The subreads were filtered and subjected to CCS through SMRT Analysis_2.3.0. (http://www.pacb.com/support/software-downloads/). Illumina library construction and sequencing Thirty RNA samples from different tissues and rhizomes of different ages were used for Illumina library construction and sequencing. The cDNA libraries for HiSeq X Ten sequencing were constructed as follows. The total RNA from each sample was treated with DNase I, and poly(A) mRNA was isolated with oligo(dT) beads, followed by reverse transcription into first-strand cDNA using reverse transcriptase. Second-strand cDNA was synthesized with DNA polymerase I and RNaseH and then ligated with an adaptor or index adaptor using T4 DNA ligase. The adaptor-ligated fragments were separated and excised through agarose gel electrophoresis. PCR was performed selectively to enrich and amplify the cDNA fragments. Finally, the cDNA libraries were sequenced using HiSeq X Ten at Hangzhou 1 gene Technology Co., Ltd. Gene co-expression network analysis and network visualization Co-expression and module analyses were performed using functions of the R package WGCNA V1.41-1 (Langfelder and Horvath 2008). A soft power threshold of 6 was employed to transform the correlation matrix into a signed weighted adjacency matrix, leading to an approximate scale-free topology for the obtained network. The resulting adjacency matrix was used to calculate the topological overlap matrix (TOM). Next, a gene expression adjacency matrix was constructed using the soft-threshold Pearson correlation (power = 6) between all pairs of probe expression data. Such soft thresholding yields a scale-free network (scale-free topology model fit R2 = 0.9). A topological overlap map was calculated by comparing the similarity of the connectivity of each pair of probes. A dynamic hybrid tree-cut algorithm (R package dynamicTreeCut, v. 1.60) was employed for module detection. The minimum size of the module was set to 30 to avoid small modules, and DeepSplit was set to 4 to produce fine clusters. Modules exhibiting high similarity of eigengenes (dissimilarity, 0.25) were merged to avoid oversplitting clusters. The networks were visualized using Cytoscape _v.3.0.0. Validation of the expression of naringin/neoeriocitrin-related hub genes The expression profiles of the PAL (CL9563), 4CL (CL5245), C4H (CL4574), HCT (CL11832; CL4397) and C3H (CL6561) genes were characterized by qRT–PCR. Amplification of the ACTIN (CL186) gene was used as an internal control. The corresponding primer sequences are provided in Supplementary Table S14. The protocol for qRT–PCR and data processing were as previously described by Rieu and Powers (2009). Three biological replications were carried out. Accession codes The SMRT sequencing and HiSeq X Ten data were submitted to the Sequence Read Archive (SRA) of the National Center for Biotechnology Information (NCBI) under accession numbers SRP108926 and SRP108652, respectively. Supplementary Data Supplementary data are available at PCP Online. Funding This work was supported by the National Natural Science Foundation of China [grant Nos. 81573519 and 31600264] and the Key Projects of the Chinese Academy of Sciences [grant no. KZCC-EW-103-3]. Acknowledgments We thank Professor Yongzhen Pang from the Institute of Animal Sciences, Chinese Academy of Agricultural Sciences for help with research into the naringin and neoeriocitrin biosynthetic pathways. We thank Dr. Yan Zhu from the Plant Science Facility of the Institute of Botany, Chinese Academy of Sciences for HPLC-MS/MS analysis. Disclosures The authors have no conflicts of interest to declare. References Abdel-Ghany S.E., Hamilton M., Jacobi J.L., Ngam P., Devitt N., Schilkey F., et al.   ( 2016) A survey of the sorghum transcriptome using single-molecule long reads. Nat. Commun.  7: 11706. Google Scholar CrossRef Search ADS PubMed  Bar-Peled M., Lewinsohn E., Fluhr R., Gressel J. ( 1991) UDP-rhamnose:flavanone-7-O-glucoside-2''-O-rhamnosyltransferase. J. Biol. Chem.  266: 20953– 20959. Google Scholar PubMed  Buer C.S., Imin N., Djordjevic M.A. ( 2010) Flavonoids: new roles for old molecules. J. Integr. Plant Biol.  52: 98– 111. Google Scholar CrossRef Search ADS PubMed  Cavallini E., Matus J.T., Finezzo L., Zenoni S., Loyola R., Guzzo F., et al.   ( 2015) The phenylpropanoid pathway is controlled at different branches by a set of R2R3-MYB C2 repressors in grapevine. Plant Physiol.  167: 1448– 1470. Google Scholar CrossRef Search ADS PubMed  Chen H., Kim H.U., Weng H., Browse J. ( 2011) Malonyl-CoA synthetase, encoded by ACYL ACTIVATING ENZYME13, is essential for growth and development of Arabidopsis. Plant Cell  23: 2247– 2262. Google Scholar CrossRef Search ADS PubMed  Chen S.L., Xu J., Liu C., Zhu Y.J., Nelson D.R., Zhou S.G., et al.   ( 2012) Genome sequence of the model medicinal mushroom Ganoderma lucidum. Nat. Commun.  3: 913. Google Scholar CrossRef Search ADS PubMed  Chinese Pharmacopoeia Commission. ( 2015) Pharmacopoeia of the People’s Republic of China . China Medical Science Press, Beijing. Christensen A.B., Gregersen P.L., Schroder J., Collinge D.B. ( 1998) A chalcone synthase with an unusual substrate preference is expressed in barley leaves in response to UV light and pathogen attack. Plant Mol. Biol . 37: 849– 857. Google Scholar CrossRef Search ADS PubMed  Cui L.L., Yao S.B., Dai X.L., Yin Q.G., Liu Y.J., Jiang X.L., et al.   ( 2016) Identification of UDP-glycosyltransferases involved in the biosynthesis of astringent taste compounds in tea (Camellia sinensis). J. Exp. Bot . 67: 2285– 2297. Google Scholar CrossRef Search ADS PubMed  Dixon R.A., Pasinetti G.M. ( 2010) Flavonoids and isoflavonoids: from plant biology to agriculture and neuroscience. Plant Physiol.  154: 453– 457. Google Scholar CrossRef Search ADS PubMed  Ficklin S.P., Feltus F.A. ( 2011) Gene coexpression network alignment and conservation of gene modules between two grass species: maize and rice. Plant Physiol . 156: 1244– 1256. Google Scholar CrossRef Search ADS PubMed  Gerttula S., Zinkgraf M., Muday G.K., Lewis D.R., Ibatullin F.M., Brumer H., et al.   ( 2015) Transcriptional and hormonal regulation of gravitropism of woody stems in Populus. Plant Cell  27: 2800– 2813. Google Scholar PubMed  Gross S.M., Martin J.A., Simpson J., Abraham-Juarez M.J., Wang Z., Visel A. ( 2013) De novo transcriptome assembly of drought tolerant CAM plants, Agave deserti and Agave tequilana. BMC Genomics  14: 563. Google Scholar CrossRef Search ADS PubMed  Gutierrez E., García-Villaraco A., Lucas J.A., Gradillas A., Gutierrez-Mañero F.J., Ramos-Solano B. ( 2017) Transcriptomics, targeted metabolomics and gene expression of blackberry leaves and fruits indicate flavonoid metabolic flux from leaf to red fruit. Front. Plant Sci.  8: 472. Google Scholar CrossRef Search ADS PubMed  Hollender C.A., Kang C., Darwish O., Geretz A., Matthews B.F., Slovin J., et al.   ( 2014) Floral transcriptomes in woodland strawberry uncover developing receptacle and anther gene networks. Plant Physiol . 165: 1062– 1075. Google Scholar CrossRef Search ADS PubMed  Hu C.Y., Zhao H.Y., Wang W., Xu M.F., Shi J.X., Nie X.B., et al.   ( 2018) Identification of conserved and diverse metabolic shift of the stylar, intermediate and peduncular segments of cucumber fruit during development. Int. J. Mol. Sci.  19: 135. Google Scholar CrossRef Search ADS   Huddleston J., Ranade S., Malig M., Antonacci F., Chaisson M., Hon L., et al.   ( 2014) Reconstructing complex regions of genomes using long-read sequencing technology. Genome Res . 24: 688– 696. Google Scholar CrossRef Search ADS PubMed  Hutabarat O.S., Flachowsky H., Regos I., Miosic S., Kaufmann C., Faramarzi S., et al.   ( 2016) Transgenic apple plants overexpressing the chalcone-3-hydroxylase gene of Cosmos sulphureus show increased levels of 3-hydroxyphloridzin and reduced susceptibility to apple scab and fire blight. Planta  243: 1213– 1224. Google Scholar CrossRef Search ADS PubMed  Irisarri P., Zhebentyayeva T., Errea P., Pina A. ( 2016) Differential expression of phenylalanine ammonia lyase (PAL) genes implies distinct roles in development of graft incompatibility symptoms in Prunus. Sci. Hort . 204: 16– 24. Google Scholar CrossRef Search ADS   Langfelder P., Horvath S. ( 2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics  9: 559. Google Scholar CrossRef Search ADS PubMed  Li B., Fillmore N., Bai Y., Collins M., Thomson J.A., Stewart R., et al.   ( 2014) Evaluation of de novo transcriptome assemblies from RNA-Seq data. Genome Biol.  15: 553. Google Scholar CrossRef Search ADS PubMed  Li J., Harata-Lee Y., Denton M.D., Feng Q.J., Rathjen J.R., Qu Z.P., et al.   ( 2017) Long read reference genome-free reconstruction of a full-length transcriptome from Astragalus membranaceus reveals transcript variants involved in bioactive compound biosynthesis. Cell Discov.  3: 17031. Google Scholar CrossRef Search ADS PubMed  Li Q.S., Li Y., Song J.Y., Xu H.B., Xu J., Zhu Y.J., et al.   ( 2014) High-accuracy de novo assembly and SNP detection of chloroplast genomes using a SMRT circular consensus sequencing strategy. New Phytol . 204: 1041– 1049. Google Scholar CrossRef Search ADS PubMed  Li X.L., Xiao H.B., Liang X.M., Shi D.Z., Liu J.G. ( 2004) LC–MS/MS determination of naringin, hesperidin and neohesperidin in rat serum after orally administrating the decoction of Bulpleurum falcatum L. and Fractus aurantii. J. Pharmaceut. Biomed . 34: 159– 166. Google Scholar CrossRef Search ADS   Li L.N., Zeng Z., Cai G.P. ( 2011) Comparison of neoeriocitrin and naringin on proliferation and osteogenic differentiation in MC3T3-E1. Phytomedicine  18: 985– 989. Google Scholar CrossRef Search ADS PubMed  Liu L.L., Qu W., Liang J.Y. ( 2012) Progress on chemical constituents and biological activities of Drynaria fortunei. Strait Pharm. J . 1: 4– 9. Liu H.T., Zou S.S., Qi Y.D., Zhu Y.X., Li X.B., Zhang B.G. ( 2012) Quantitative determination of four compounds and fingerprint analysis in the rhizomes of Drynaria fortunei (Kunze) J. Sm. J. Nat. Med.  66: 413– 419. Google Scholar CrossRef Search ADS PubMed  Ma X.L., Lv J.W., Sun X.L., Ma J.X., Xing G.S., Wang Y., et al.   ( 2016) Naringin ameliorates bone loss induced by sciatic neurectomy and increases Semaphorin 3A expression in denervated bone. Sci. Rep.  6: 24562. Google Scholar CrossRef Search ADS PubMed  McIntosh C.A., Mansell R.L. ( 1990) Biosynthesis of naringin in Citrus paradisi: UDP-glucosyl-transferase activity in grapefruit seedlings. Phytochemistry  29: 1533– 1538. Google Scholar CrossRef Search ADS   Moschen S., Higgins J., Rienzo J.A.D., Heinz R.A., Paniego N., Fernandez P. ( 2016) Network and biosignature analysis for the integration of transcriptomic and metabolomic data to characterize leaf senescence process in sunflower. BMC Bioinformatics  17: 389– 398. Google Scholar CrossRef Search ADS PubMed  Mouradov A., Spangenberg G. ( 2014) Flavonoids: a metabolic network mediating plants adaptation to their real estate. Front. Plant Sci.  5: 1– 16. Google Scholar CrossRef Search ADS   Naqvi R.Z., Zaidi S.S., Akhtar K.P., Strickler S., Woldemariam M., Mishra B., et al.   ( 2017) Transcriptomics reveals multiple resistance mechanisms against cotton leaf curl disease in a naturally immune cotton species, Gossypium arboreum. Sci. Rep.  7: 15880. Google Scholar CrossRef Search ADS PubMed  Rieu I., Powers S.J. ( 2009) Real-time quantitative RT–PCR: design, calculations, and statistics. Plant Cell  21: 1031– 1033. Google Scholar CrossRef Search ADS PubMed  Ruan J., Dean A.K., Zhang W. ( 2010) A general co-expression network-based approach to gene expression analysis: comparison and applications. BMC Syst. Biol.  4: 8. Google Scholar CrossRef Search ADS PubMed  Schaart J.G., Dubos C., Romero De La Fuente I., van Houwelingen A.M., de Vos R.C., Jonker H.H., et al.   ( 2013) Identification and characterization of MYB–bHLH–WD40 regulatory complexes controlling proanthocyanidin biosynthesis in strawberry (Fragaria × ananassa) fruits. New Phytol.  197: 454– 467. Google Scholar CrossRef Search ADS PubMed  Sharon D., Tilgner H., Grubert F., Snyder M. ( 2013) A single-molecule long-read survey of the human transcriptome. Nat. Biotechnol.  31: 1009– 1014. Google Scholar CrossRef Search ADS PubMed  Sun R.Z., Cheng G., Li Q., He Y.N., Wang Y., Lan Y.B., et al.   ( 2017) Light-induced variation in phenolic compounds in cabernet sauvignon grapes (Vitis vinifera L.) involves extensive transcriptome reprogramming of biosynthetic enzymes, transcription factors, and phytohormonal regulators. Front. Plant Sci.  8: 547. Google Scholar CrossRef Search ADS PubMed  Tan M.P., Cheng D., Yang Y.N., Zhang G.Q., Qin M.J., Chen J., et al.   ( 2017) Co-expression network analysis of the transcriptomes of rice roots exposed to various cadmium stresses reveals universal cadmium-responsive genes. BMC Plant Biol.  17: 194. Google Scholar CrossRef Search ADS PubMed  Tohge T., Watanabe M., Hoefgen R., Fernie A.R. ( 2013) Shikimate and phenylalanine biosynthesis in the green lineage. Front. Plant Sci.  4: 1– 13. Google Scholar CrossRef Search ADS PubMed  Wang C.H., Yu J., Cai Y.X., Zhu P.P., Liu C.Y., Zhao A.C., et al.   ( 2016) Characterization and functional analysis of 4-coumarate: CoA ligase genes in Mulberry. PLoS One  11: e0155814. Google Scholar CrossRef Search ADS PubMed  Wiermann R. ( 1970) Die synthese von phenylpropanen während der Pollenentwicklung. Planta  95: 133– 145. Google Scholar CrossRef Search ADS PubMed  Xie W., Huang J., Liu Y., Rao J., Luo D., He M. ( 2015) Exploring potential new floral organ morphogenesis genes of Arabidopsis thaliana using systems biology approach. Front. Plant Sci.  6: 829. Google Scholar PubMed  Xie Y.M., Wang H.M., Shen L., Cui T.H., Ge J.R., Kou Q.A., et al.   ( 2004) Clinical study on effect of Qianggu capsule on primary osteoporosis (decrease in bone content). Chin. J. Inf. Tradit. Chin. Med . 11: 482– 488. Xu H.B., Song J.Y., Luo H.M., Zhang Y.J., Li Q.S., Zhu Y.J., et al.   ( 2016) Analysis of the genome sequence of the medicinal plant Salvia miltiorrhiza. Mol. Plant.  9: 949– 952. Google Scholar CrossRef Search ADS PubMed  Xu W.J., Dubos C., Lepiniec L. ( 2015) Transcriptional control of flavonoid biosynthesis by MYB–bHLH–WDR complexes. Trends Plant Sci . 20: 176– 185. Google Scholar CrossRef Search ADS PubMed  Xu W.J., Grain D., Bobet S., Gourrierec J.L., Thévenin J., Kelemen Z., et al.   ( 2014) Complexity and robustness of the flavonoid transcriptional regulatory network revealed by comprehensive analyses of MYB–bHLH–WDR complexes and their targets in Arabidopsis seed. New Phytol.  202: 132– 144. Google Scholar CrossRef Search ADS PubMed  Xu Z.C., Peters R.J., Weirather J., Luo H.M., Liao B.S., Zhang X., et al.   ( 2015) Full-length transcriptome sequences and splice variants obtained by a combination of sequencing platforms applied to different root tissues of Salvia miltiorrhiza and tanshinone biosynthesis. Plant J.  82: 951– 961. Google Scholar CrossRef Search ADS PubMed  Yang B., Chen G.X., Jiang D.S., Liu Z. ( 2010) The research progress of medicinal plant of Drynaria in China. Chin. Wild Plant Res . 29: 1– 6. Zhang Y.L., Jiang J.J., Shen H., Chai Y., Wei X., Xie Y.M. ( 2017) Total flavonoids from Rhizoma Drynariae (Gusuibu) for treating osteoporotic fractures: implication in clinical practice. Drug Des. Dev. Ther.  11: 1881– 1890. Google Scholar CrossRef Search ADS   Abbreviations Abbreviations bHLH basic helix–loop–helix CCS circular consensus sequencing C3H p-coumarate 3-hydroxylase C3'H p-coumarate 3'-hydroxylase C4H cinnamate 4-hydroxylase CHI chalcone isomerase CHS chalcone synthase CL leaf 4CL 4-coumarate CoA ligase CNR new rhizome COR old rhizome CV vein DEG differentially expressed gene FDR false discovery rate F3'H flavonoid-3'-hydroxylase GO Gene Ontology HCT hydroxycinnamoyl transferase KEGG Kyoto Encyclopedia of Genes and Genomes MS mass spectroscopy NNRG naringin/neoeriocitrin-related gene NNRGEP naringin/neoeriocitrin-related gene expression pattern NNSRG naringin/neoeriocitrin synthesis-related gene PAL phenylalanine ammonia lyase qRT–PCR quantitative real-time reverse transcription–PCR RNA-seq RNA sequencing R2Y 2-year-old plants R3Y 3-year-old plants R5Y 5-year-old plants R6Y 6-year-old plants R8Y 8-year-old plants R10Y 10-year-old plants SGS second-generation sequencing SMRT single-molecule real-time TCM traditional Chinese medicinal TF transcription factor WGCNA weighted gene co-expression network analysis © The Author(s) 2018. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists. All rights reserved. For permissions, please email: journals.permissions@oup.com This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices) http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Plant and Cell Physiology Oxford University Press

Full-Length Transcriptome Sequencing and Modular Organization Analysis of the Naringin/Neoeriocitrin-Related Gene Expression Pattern in Drynaria roosii

Loading next page...
 
/lp/ou_press/full-length-transcriptome-sequencing-and-modular-organization-analysis-Pb6WQn0rte
Publisher
Oxford University Press
Copyright
© The Author(s) 2018. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists. All rights reserved. For permissions, please email: journals.permissions@oup.com
ISSN
0032-0781
eISSN
1471-9053
D.O.I.
10.1093/pcp/pcy072
Publisher site
See Article on Publisher Site

Abstract

Abstract Drynaria roosii (Nakaike) is a traditional Chinese medicinal fern, known as ‘GuSuiBu’. The effective components, naringin and neoeriocitrin, share a highly similar chemical structure and medicinal function. Our HPLC–tandem mass spectrometry (MS/MS) results showed that the accumulation of naringin/neoeriocitrin depended on specific tissues or ages. However, little was known about the expression patterns of naringin/neoeriocitrin-related genes involved in their regulatory pathways. Due to a lack of basic genetic information, we applied a combination of single molecule real-time (SMRT) sequencing and second-generation sequencing (SGS) to generate the complete and full-length transcriptome of D. roosii. According to the SGS data, the differentially expressed gene (DEG)-based heat map analysis revealed that naringin/neoeriocitrin-related gene expression exhibited obvious tissue- and time-specific transcriptomic differences. Using the systems biology method of modular organization analysis, we clustered 16,472 DEGs into 17 gene modules and studied the relationships between modules and tissue/time point samples, as well as modules and naringin/neoeriocitrin contents. We found that naringin/neoeriocitrin-related DEGs distributed in nine distinct modules, and DEGs in these modules showed significantly different patterns of transcript abundance to be linked to specific tissues or ages. Moreover, weighted gene co-expression network analysis (WGCNA) results further identified that PAL, 4CL and C4H, and C3H and HCT acted as the major hub genes involved in naringin and neoeriocitrin synthesis, respectively, and exhibited high co-expression with MYB- and basic helix–leucine-helix (bHLH)-regulated genes. In this work, modular organization and co-expression networks elucidated the tissue and time specificity of the gene expression pattern, as well as hub genes associated with naringin/neoeriocitrin synthesis in D. roosii. Simultaneously, the comprehensive transcriptome data set provided important genetic information for further research on D. roosii. Introduction Drynaria roosii (Nakaike), a large epiphytic fern of the family Polypodiaceae, is a source of traditional Chinese medicine (TCM) commonly known as ‘GuSuiBu’ and by the pharmaceutical name Drynaria Rhizoma (Yang et al. 2010). ‘Qianggu Capsules’, which is a medicine made using ‘GuSuiBu’ as the main material, have a role in treating osteoporosis as a Category II New Drug for TCM in China (Xie et al. 2004). ‘GuSuiBu’ contains the effective medicinal flavonoid ingredients naringin and neoeriocitrin. This TCM has been extensively used to treat bone-related diseases such as bone fracture, osteoporosis and arthritis, and has been demonstrated to have therapeutic effects on bone healing (Li et al. 2011). According to the Pharmacopoeia of the People’s Republic of China, the drug formulation of ‘GuSuiBu’ should contain no less than 0.50% naringin, which is calculated in reference to the dried drug (Chinese Pharmacopoeia Commission 2015). Naringin can improve bone properties, stimulate osteogenic differentiation and ameliorate bone loss (Ma et al. 2016). Neoeriocitrin, which displays a similar structure and function to naringin, has been reported to exert a better stimulating effects on cell proliferation and osteogenic differentiation, and to have a rescue effect on the inhibition of osteogenic differentiation (Li et al. 2011). Flavonoids are one of the largest groups of secondary metabolites and are widely distributed in plants (W.J. Xu et al. 2015). The composition of flavonoids is largely dependent on the plant species, developmental stage, tissue type and key ecological influences of biotic and abiotic origins (Mouradov and Spangenberg 2014). The environmental or developmental regulation of flavonoids mostly relies on the co-ordinated expression of early and late biosynthetic genes (W.J. Xu et al. 2015). The early biosyntheic genes, which include chalcone synthase (CHS), chalcone isomerase (CHI), flavanone 3-hydroxylase (F3H) and flavanone 3'-hydroxylase (F3'H), are involved in the production of common precursors. In contrast, the late biosynthetic genes are usually downstream genes, comprising dihydroflavonol-4-reductase (DFR), leucoanthocyanidin reductase (LAR), anthocyanidin synthase (ANS) and anthocyanidin reductase (ANR) (Xu et al. 2014). Transcriptional regulation also plays an important role in the control of flavonoid biosynthesis, and six transparent testa (tt) genes [TT2 (MYB123), TT8 (bHLH042), TTG1 (WD40 or WDR), TT16/ABS (MADS box, AGL32), TT1 (WIP1/Zn finger) and TTG2 (DSL1/WRKY44)], a R3-MYB gene (MYBL2), R2R3-MYB genes (MYB12, MYB11, MYB111) and MYB-bHLH-WDR complex genes (i.e. TT2-TT8-TTG1, MYB5-TT8-TTG1, TT2-EGL3-TTG1 and TT2-GL3-TTG1) have been confirmed to be involved in encoding transcriptional regulators in the control of flavonoid biosynthesis (Xu et al. 2014). The advent of second-generation sequencing (SGS) technologies, such as Illumina technology, has stimulated the construction of genome and transcriptome resources for medicinal mushroom and plant species, such as Ganoderma lucidum and Salvia miltiorrhiza (Chen et al. 2012, Xu et al. 2016). However, the short reads obtained from SGS result in incompletely assembled transcripts and loss of some important information. Recently, single-molecule real-time (SMRT) sequencing conducted using the PacBio RS system has provided a third-generation sequencing platform that offers four major advantages: long read lengths, high consensus accuracy, a low degree of bias and simultaneous capability for epigenetic characterization (Huddleston et al. 2014). Full-length transcript sequences permit efficient analysis of alternative splicing, alternative polyadenylation, novel genes, non-coding RNAs and fusion transcripts, facilitating a complete understanding of the transcriptional behavior of plants with available reference genomes, such as S. miltiorrhiza and sorghum (Z.C. Xu et al. 2015, Abdel-Ghany et al. 2016). Additionally, full-length transcript sequences provide information on the total length of transcripts that do not need to be assembled because PacBio reads cover most of the transcripts of plants without an available reference genome, such as Agave deserti, Agave tequilana, Fritillaria and Astragalus membranaceus (Gross et al. 2013, B. Li et al. 2014, Li et al. 2017). One concern regarding PacBio sequencing is its relatively high error rate (up to 15%), which can be addressed through correction with SGS reads and/or self-correction using circular consensus sequencing (CCS) reads (Q.S. Li et al. 2014). There has been burgeoning interest in using a network approach to discover gene–gene dependencies in the field of systems biology, and high-throughput sequencing has contributed to an increase in data sets obtained from omics analyses, allowing the complex regulatory networks associated with biological processes to be understood (Ruan et al. 2010). In transcript expression data, interactions may be determined based on the correlation between the expression of paired genes, and a gene co-expression network is constructed through the discovery of non-random gene–gene expression dependencies derived from a collection of transcriptome data sets (Ficklin and Feltus 2011). Various software tools for the construction of co-expression networks have been developed with the aim of analyzing data sets to predict hub genes that could regulate a specific pathway or be involved in a particular biological process (Moschen et al. 2016). Weighted gene co-expression network analysis (WGCNA) is a typical R package algorithm method designed to identify clusters (modules) of highly correlated genes or metabolites based on high-throughput transcriptomic sequences (Langfelder and Horvath 2008), which has been widely used in plant species such as Arabidopsis thaliana (Xie et al. 2015), maize (Tan et al. 2017), Populus (Gerttula et al. 2015) and cotton (Naqvi et al. 2017). The wild D. roosii resources have been overexploited in China, and artificial cultivation is therefore important. In this work, we found 8 years or longer was necessary for artificially cultivated D. roosii to accumulate a level of naringin/neoeriocitrin similar to wild plants, and the accumulation of both components depended on specific tissues or ages. To date, the molecular regulatory mechanisms of naringin/neoeriocitrin accumulation remain largely unclear in D. roosii. Here, the different tissues (old rhizome, new rhizome, veins and leaves) and rhizomes from plants of different ages (2, 3, 5, 6, 8 and 10 years old) were used as the experimental materials. We combined SMRT sequencing and SGS to generate the complete and full-length transcriptome of the fern D. roosii. Due to the similar chemical structure and medicinal function of naringin and neoeriocitrin, we further analyzed RNA sequencing (RNA-seq) data to understand the naringin/neoeriocitrin-related gene expression pattern (NNRGEP) in D. roosii. Our method can detect gene expression levels at the transcriptome scale, which is not dependent on the reference genome. Moreover, the systems biology method of WGCNA was used to cluster gene modules and identify hub genes associated with naringin/neoeriocitrin, and the results provided significant insight into the modular organization and co-expression networks underlying the NNRGEP of D. roosii. Our data set will serve as a valuable resource for novel gene discovery and analyses of genomics and functional genomics in fern. Understanding the regulation of flavonoid metabolism is important to generate plants enriched with flavonoid compounds to obtain desirable dietary and beneficial health properties. Studying the NNRGEP will be valuable in efforts to improve the medicinal contents of naringin/neoeriocitrin more rapidly in D. roosii. Results HPLC-MS/MS analysis of naringin/neoeriocitrin accumulation in D. roosii First, we determined the naringin/neoeriocitrin contents of four different tissues [leaves (CLs), veins (CVs), new rhizomes (CNRs) and old rhizomes (CORs)] from 3-year-old plants using HPLC, and three replicates were carried out for each sample (Fig. 1A). As shown in Figs. 1C and 2A, the contents of naringin in CLs, CVs, CNRs and CORs were 0.0473, 0.0171, 1.093 and 0.5559%, respectively, suggesting that naringin mainly accumulates in rhizomes, and especially in CNRs. The neoeriocitrin contents of CLs, CVs, CNRs and CORs were 0.0709, 0.0023, 0.1095 and 0.2109%, respectively, indicating that neoeriocitrin mainly accumulates in rhizomes and is present at a much higher level in CORs. To obtain more insights into the accumulation of naringin/neoeriocitrin with increasing age in D. roosii, we detected the contents of these two compounds in the rhizomes of wild-type plants (WT), 2-year-old plants (R2Y), 3-year-old plants (R3Y), 5-year-old plants (R5Y), 6-year-old plants (R6Y), 8-year-old plants (R8Y) and 10-year-old plants (R10Y) using HPLC, and three replicates were carried out for each sample (Fig. 1B). As shown in Fig. 1B and E, we found that the fresh and dry weight of rhizomes accumulated steadily over the years. The fresh and dry weight were only 2.86 and 0.22 g for R2Y, 14.76 and 1.73 g for R3Y, and reached up to 100.76 and 16.15 g for R10Y. Moreover, the new rhizomes accumulated in the early years with a higher proportion than the old rhizomes, while the old rhizomes accumulated in the later years with a higher proportion than the new rhizomes. Figs. 1D and 2B show that the naringin contents of WT, R2Y, R3Y, R5Y, R6Y, R8Y and R10Y rhizomes were 0.7840, 0.7031, 0.5689, 0.5955, 0.6623, 0.6960 and 0.6420%, respectively, suggesting that naringin acts as an intermediate product and accumulates stably over time, with some fluctuation. The neoeriocitrin contents of WT, R2Y, R3Y, R5Y, R6Y, R8Y and R10Y rhizomes were 0.774, 0.1775, 0.2019, 0.3253, 0.4083, 0.9855 and 1.2064%, respectively, which revealed that neoeriocitrin accumulates and increases gradually over time (Figs. 1D, 2B). Additionally, we found that ≥8 years was necessary for D. roosii to accumulate a similar level of naringin/neoeriocitrin in comparison with wild plants. (Figs. 1D, 2B). Fig. 1 View largeDownload slide Photographs and HPLC analysis of naringin/neoeriocitrin contents of D. roosii. (A) Whole D. roosii plantlets and rhizomes. The area within the green circles contains new rhizomes, and the central part of the rhizome is old. Scale bar = 5 cm. (B) Photographs of plants of different ages, 2-year-old plants (R2Y), 3-year-old plants (R3Y), 5-year-old plants (R5Y), 6-year-old plants (R6Y), 8-year-old plants (R8Y) and 10-year-old plants (R10Y). Scale bar = 5 cm. (C) The histograms represent different naringin and neoeriocitrin levels in the four different tissues. (D) The histograms represent different naringin and neoeriocitrin levels in the rhizomes of wild plants and plants of different ages. (E) The fresh and dry weight of rhizomes from plants of different ages. Vertical bars indicate the SEM for three replicates of each sample. Fig. 1 View largeDownload slide Photographs and HPLC analysis of naringin/neoeriocitrin contents of D. roosii. (A) Whole D. roosii plantlets and rhizomes. The area within the green circles contains new rhizomes, and the central part of the rhizome is old. Scale bar = 5 cm. (B) Photographs of plants of different ages, 2-year-old plants (R2Y), 3-year-old plants (R3Y), 5-year-old plants (R5Y), 6-year-old plants (R6Y), 8-year-old plants (R8Y) and 10-year-old plants (R10Y). Scale bar = 5 cm. (C) The histograms represent different naringin and neoeriocitrin levels in the four different tissues. (D) The histograms represent different naringin and neoeriocitrin levels in the rhizomes of wild plants and plants of different ages. (E) The fresh and dry weight of rhizomes from plants of different ages. Vertical bars indicate the SEM for three replicates of each sample. Fig. 2 View largeDownload slide HPLC-MS/MS analyses of naringin/neoeriocitrin contents from biological samples in D. roosii. (A) The chromatogram indicates the differences in naringin and neoeriocitrin levels from the four different tissues [leaves (CL), veins (CV), young rhizomes (CNR) and old rhizomes (COR) of 3-year-old plants]. The red arrow presents the target peak of neoeriocitrin and the blue arrow presents the target peak of naringin. Three replicates were carried out for each sample. (B) The chromatogram indicates the differences in naringin and neoeriocitrin levels from the rhizomes of plants of different ages (R2Y, R3Y, R5Y, R6Y, R8Y and R10Y). The red arrow indicates the target peak of neoeriocitrin and the blue arrow indicates the target peak of naringin. Three replicates were carried out for each sample. (C–F) HPLC-MS/MS analyses of naringin/neoeriocitrin. The product ion mass spectra of naringin from reference standards (C) and biological samples (E) in the negative ion mode. The product ion mass spectra of neoeriocitrin from reference standards (D) and biological samples (F) in the negative ion mode. Conditions: ESI interface; mass spectra were applied using electrospray ionization in negative ionization mode with the m/z range of 100–600 and 100–650; desolvation temperature, 400°C; nitrogen drying gas flow, 800 l h–1; nebulizer pressure, 45 p.s.i.; capillary voltage, 3,000 V; argon was used as the collision gas with a collision energy of 30 V. Fig. 2 View largeDownload slide HPLC-MS/MS analyses of naringin/neoeriocitrin contents from biological samples in D. roosii. (A) The chromatogram indicates the differences in naringin and neoeriocitrin levels from the four different tissues [leaves (CL), veins (CV), young rhizomes (CNR) and old rhizomes (COR) of 3-year-old plants]. The red arrow presents the target peak of neoeriocitrin and the blue arrow presents the target peak of naringin. Three replicates were carried out for each sample. (B) The chromatogram indicates the differences in naringin and neoeriocitrin levels from the rhizomes of plants of different ages (R2Y, R3Y, R5Y, R6Y, R8Y and R10Y). The red arrow indicates the target peak of neoeriocitrin and the blue arrow indicates the target peak of naringin. Three replicates were carried out for each sample. (C–F) HPLC-MS/MS analyses of naringin/neoeriocitrin. The product ion mass spectra of naringin from reference standards (C) and biological samples (E) in the negative ion mode. The product ion mass spectra of neoeriocitrin from reference standards (D) and biological samples (F) in the negative ion mode. Conditions: ESI interface; mass spectra were applied using electrospray ionization in negative ionization mode with the m/z range of 100–600 and 100–650; desolvation temperature, 400°C; nitrogen drying gas flow, 800 l h–1; nebulizer pressure, 45 p.s.i.; capillary voltage, 3,000 V; argon was used as the collision gas with a collision energy of 30 V. One concern about the determination of the components was to ensure reliable characterization of the target compounds (Li et al. 2004). Here, we compared the mass spectra and retention times of target compounds with those of corresponding reference standards from HPLC–tandem mass spectrometry (MS/MS). HPLC-MS/MS analysis confirmed that the target compounds of our biological samples with retention times of 19.50 and 24.75 min were naringin (with a mass spectrum of m/z 579 and major ion fragments of m/z 151, 271 and 459; Fig. 2C, E) and neoeriocitrin (with a mass spectrum of m/z 595 and major ion fragments of m/z 151, 287 and 459; Fig. 2D, F). Notably, the MS/MS data showed one different major ion fragment between naringin (m/z 271) and neoeriocitrin (m/z 287), which suggested a highly similar chemical structure with the only difference of one hydroxyl group. Taken together, our results confirmed that the accumulation of naringin/neoeriocitrin obviously showed tissue and time specificity in D. roosii. Single-molecule real-time sequencing of D. roosii tissues captures full-length transcripts The lack of comprehensive D. roosii gene sequence data sets limits the scope of investigations into the molecular genetic basis of the metabolism of the medicinal ingredients. Here, we used SMRT technology to generate the complete and full-length D. roosii transcriptome. We generated 4,364,566 raw reads (12.2 billion bp) using the PacBio RS platform. After performing filtering using RS_Subreads.1 of PacBio RS, approximately 12 G of subread data were obtained. We applied RS_IsoSeq.1 protocols (https://github.com/PacificBiosciences/), including classify, cluster and quiver, to generate CCS data, which provides much more accurate sequence information from reads that pass through the insert at least three times (Sharon et al. 2013). The numbers of reads of insert (ROI) with high accuracy (read quality >0.85) from the different libraries (<1, 1–2, 2–3, 3–4, 4–5 and >5 kb) were 103,983, 76,059, 108,381, 93,698, 101,228, 109,469 and 68,839, respectively (Supplementary Table S1; Supplementary Fig. S1), and most of the read quality values were distributed above 0.95 (Supplementary Fig. S2). As shown in Supplementary Fig. S3, we found that the length of ROI sequences in different libraries depended on the library size. Statistically, the numbers of full-length reads obtained from the different libraries (<1, 1–2, 2–3, 3–4, 4–5 and >5 kb) were 406,214, 1,328,291, 1,147,936, 715,286, 394,386 and 372,453, respectively (Table 1). A total of 272,121 full-length non-chimeric reads with an average size of 3,073 bp were obtained (Supplementary Fig. S4; Supplementary Table S2), and the number of high-quality full-length transcripts was condensed to 67,147, with an N50 of 2,977 bp (Fig. 3A;Supplementary Table S3). In addition, 59,294 of these full-length transcripts were predicted to be long CDS sequences using Transcoder (Fig. 3B) and relevant protein sequences (Fig. 3C). Table 1 Length distribution of raw reads from SMRT sequencing data Cell  Library size  <1k  1–2k  2–3k  3–4k  4–5k  >5k  Total reads  Total bases (bp)  m*_233855  1–2k  84,255  566,292  324,790  33,269  15,301  21,799  1,045,706  2,104,717,056  m*_151316  1–2k  71,028  360,947  213,161  32,570  17,783  33,880  729,369  1,601,915,820  m*_043401  2–3k  52,076  74,545  245,179  244,188  31,461  43,344  690,793  2,088,777,249  m*_105045  2–3k  44,911  65,733  188,850  175,074  22,733  30,060  527,361  1,545,726,502  m*_062914  3–5k  67,503  108,774  69,300  95,200  121,731  57,947  520,455  1,688,769,580  m*_092811  3–5k  56,733  102,693  70,479  108,246  161,461  61,667  561,279  1,918,188,854  m*_231835  >5k  29,708  49,307  36,177  26,739  23,916  123,756  289,603  1,227,156,669  Total  –  406,214  1,328,291  1,147,936  715,286  394,386  372,453  4,364,566  12,175,251,730  Cell  Library size  <1k  1–2k  2–3k  3–4k  4–5k  >5k  Total reads  Total bases (bp)  m*_233855  1–2k  84,255  566,292  324,790  33,269  15,301  21,799  1,045,706  2,104,717,056  m*_151316  1–2k  71,028  360,947  213,161  32,570  17,783  33,880  729,369  1,601,915,820  m*_043401  2–3k  52,076  74,545  245,179  244,188  31,461  43,344  690,793  2,088,777,249  m*_105045  2–3k  44,911  65,733  188,850  175,074  22,733  30,060  527,361  1,545,726,502  m*_062914  3–5k  67,503  108,774  69,300  95,200  121,731  57,947  520,455  1,688,769,580  m*_092811  3–5k  56,733  102,693  70,479  108,246  161,461  61,667  561,279  1,918,188,854  m*_231835  >5k  29,708  49,307  36,177  26,739  23,916  123,756  289,603  1,227,156,669  Total  –  406,214  1,328,291  1,147,936  715,286  394,386  372,453  4,364,566  12,175,251,730  Table 1 Length distribution of raw reads from SMRT sequencing data Cell  Library size  <1k  1–2k  2–3k  3–4k  4–5k  >5k  Total reads  Total bases (bp)  m*_233855  1–2k  84,255  566,292  324,790  33,269  15,301  21,799  1,045,706  2,104,717,056  m*_151316  1–2k  71,028  360,947  213,161  32,570  17,783  33,880  729,369  1,601,915,820  m*_043401  2–3k  52,076  74,545  245,179  244,188  31,461  43,344  690,793  2,088,777,249  m*_105045  2–3k  44,911  65,733  188,850  175,074  22,733  30,060  527,361  1,545,726,502  m*_062914  3–5k  67,503  108,774  69,300  95,200  121,731  57,947  520,455  1,688,769,580  m*_092811  3–5k  56,733  102,693  70,479  108,246  161,461  61,667  561,279  1,918,188,854  m*_231835  >5k  29,708  49,307  36,177  26,739  23,916  123,756  289,603  1,227,156,669  Total  –  406,214  1,328,291  1,147,936  715,286  394,386  372,453  4,364,566  12,175,251,730  Cell  Library size  <1k  1–2k  2–3k  3–4k  4–5k  >5k  Total reads  Total bases (bp)  m*_233855  1–2k  84,255  566,292  324,790  33,269  15,301  21,799  1,045,706  2,104,717,056  m*_151316  1–2k  71,028  360,947  213,161  32,570  17,783  33,880  729,369  1,601,915,820  m*_043401  2–3k  52,076  74,545  245,179  244,188  31,461  43,344  690,793  2,088,777,249  m*_105045  2–3k  44,911  65,733  188,850  175,074  22,733  30,060  527,361  1,545,726,502  m*_062914  3–5k  67,503  108,774  69,300  95,200  121,731  57,947  520,455  1,688,769,580  m*_092811  3–5k  56,733  102,693  70,479  108,246  161,461  61,667  561,279  1,918,188,854  m*_231835  >5k  29,708  49,307  36,177  26,739  23,916  123,756  289,603  1,227,156,669  Total  –  406,214  1,328,291  1,147,936  715,286  394,386  372,453  4,364,566  12,175,251,730  Fig. 3 View largeDownload slide Full-length transcriptome sequencing and assembly of D. roosii using the SMRT method. (A) Length distribution of full-length transcripts. (B) Length distribution of CDS sequences. (C) Length distribution of protein sequences. (D) NR classification diagram. (E) Functional annotation of unigenes based on GO categorization. The main functional categories within the biological process, cellular component and molecular function categories are relevant to plant physiology. The right y-axis indicates the number of unigenes. The left y-axis indicates the percentage of unigenes. (F) COG functional classification of unigenes. A total of 24,876 unigenes were grouped according to 24 COG terms. The y-axis indicates the number of unigenes. Fig. 3 View largeDownload slide Full-length transcriptome sequencing and assembly of D. roosii using the SMRT method. (A) Length distribution of full-length transcripts. (B) Length distribution of CDS sequences. (C) Length distribution of protein sequences. (D) NR classification diagram. (E) Functional annotation of unigenes based on GO categorization. The main functional categories within the biological process, cellular component and molecular function categories are relevant to plant physiology. The right y-axis indicates the number of unigenes. The left y-axis indicates the percentage of unigenes. (F) COG functional classification of unigenes. A total of 24,876 unigenes were grouped according to 24 COG terms. The y-axis indicates the number of unigenes. The annotation, pathway and functional classification of these SMRT data were analyzed based on the NR (non-redundant), GO (Gene Ontology), COG (clusters of orthologous groups), Swiss-Prot, KEGG (Kyoto Encyclopedia of Genes and Genomes) and NT (nucleotide) databases. We annotated 59,100 transcripts (88.02%) for function, among which 52,439 transcripts (78.10%) were assigned to NR databases. There were three major species (Physcomitrella patens, 18.3%; Selaginella moellendorffii, 12.8%; and Picea sitchensis, 12.6%) with high similarity to D. roosii (Fig. 3D). We assigned 29,012 transcripts (43.2%) to GO databases, which were classified into three main categories [(i) biological process, (ii) cellular component and (iii) molecular function clusters] and further distributed across 53 subcategories (Fig. 3E). Among the biological processes, candidate genes involved in metabolic and cellular processes were highly represented. In addition, 24,876 transcripts (37.1%) were allocated to COG databases (Fig. 3F). Among the 24 COG categories, ‘general function prediction only’ represented the largest group (7,213; 17.9%), followed by ‘signal transduction mechanisms’ (4,033; 10.0%), ‘replication, recombination and repair’ (3,913; 9.7%) and ‘transcription’ (3,818; 9.5%). Moreover, 43,863 transcripts (65.3%), 37,028 transcripts (55.1%) and 41,844 transcripts (62.3%) were allocated to the Swiss-Prot, KEGG and NT databases, respectively (Supplementary Table S4). Based on the SMRT data, we suggest that D. roosii has an extremely large genome. In this context, SMRT technology is an effective approach for further biological analysis of plants without a molecular genetic basis and provides a comprehensive transcriptome resource for D. roosii. Transcriptomic insights into the naringin/neoeriocitrin-related gene expression pattern in D. roosii Plant secondary metabolites are important resources of high-value chemicals with pharmacological properties. Phytochemical investigations of D. roosii have revealed that naringin and neoeriocitrin exhibit a highly similar flavonoid structure (L.L. Liu et al. 2012). For assessment of the tissue- and time-specific expression of core genes associated with naringin/neoeriocitrin and the order of accumulation, 30 RNA samples from four different tissues (CLs, CVs, CNRs and CORs) and six time points (2, 3, 5, 6, 8 and 10 years old) were subjected to SGS. Each tissue and time point sample had three biological replications. We obtained 243 billion bases of raw data, with a Q20 >95% and an approximate GC content of 48.9% (Supplementary Table S5). After filtration, there were 215 billion bases of clean data (Supplementary Table S6). A total of 118,060 transcripts and 66,499 unigenes were assembled, with N50 values of 2,382 and 2,261 bp, respectively (Supplementary Figs. S5, S6). We allocated 52,334, 39,570, 38,766, 34,129, 23,568 and 30,109 unigenes to the NR (Supplementary Fig. S7), NT, Swiss-Prot, KEGG, GO (Supplementary Fig. S8) and COG (Supplementary Fig. S9) databases, respectively (Supplementary Table S7). Based on the NR database analysis, D. roosii shared the highest similarity with three major species (P. patens, 12.6%; S. moellendorffii, 10.2%; and P. sitchensis, 8.2%), which was consistent with the results of the SMRT analysis (Supplementary Fig. S7). According to the Venn diagram, we detected 278 and 144 differentially expressed genes (DEGs) that were shared in pairwise comparisons of different tissues and rhizomes of different ages, respectively (Supplementary Fig. S10). As the basic biosynthetic pathways leading to the production of the core flavonoid skeletons have been well characterized in terms of the molecular, genetic and biochemical mechanisms of the enzymes involved (Dixon and Pasinetti 2010), we suggested the pathway of naringin and neoeriocitrin synthesis in D. roosii. Initially, the enzymes of the phenylpropanoid pathway encoded by PAL (phenylalanine ammonia lyase), C4H (cinnamic acid 4-hydroxylase) and 4CL (4-coumarate-CoA ligase) catalyze a series of reactions to form 4-coumaroyl-CoA as the substrate of flavonoid biosynthetic pathways. The evaluation of one core subpathway showed that the products of CHS and CHI are involved in two-step condensation to produce the basic skeleton of naringenin, after which F3H is involved in the production of eriodictyol. In another paratactic subpathway, caffeoyl-CoA is produced by the products of HCT (hydroxycinnamoyltransferase), C3H and C3'H from 4-coumaroyl CoA. 14 C-Labeled CHS enzyme assay results showed that the product of CHS1 prefers to convert 4-coumaroyl-CoA to naringenin chalcone, whereas the product of CHS2 prefers to convert caffeoyl-CoA to eriodictyol chalcone in barley, which provided an attractive hypothesis that the eriodictyol CHS eliminates the need for the complex P450-dependent hydroxylation of naringenin to eriodictyol (Christensen et al. 1998). Additionally, the chromatographic and spectroscopic data showed that the isomerization product of eriodictyol chalcone (also known as 2',3,4,4',6'-pentahydroxy chalcone) is eriodictyol in Tulipa (Wiermann 1970). Thus we suggested that eriodictyol chalcone is generated by CHS from caffeoyl-CoA, thereafter eriodictyol is produced via the catalytic action of CHI in D. roosii. Later, UDP-glucosyltransferase (UGT) and rhamnosyltransferase catalyze position-specific transfer of glucose and rhamnose to C-7-O- of naringenin to produce naringin (McIntosh and Mansell 1990, Bar-Peled et al. 1991). B-ring hydroxylation at position 3 of flavonoids is catalyzed by the well-studied Cyt P450 monooxygenase gene F3'H (Hutabarat et al. 2016). In the present work, we proposed that some UGT genes contribute to the final production of naringin and neoeriocitrin, and F3'H is the key candidate gene for the bioconversion of naringin to neoeriocitrin over the years in D. roosii, which requires further research (Fig. 4B). Fig. 4 View largeDownload slide Heat map depicting the expression profiles of naringin/neoeriocitrin synthesis-related genes in D. roosii. (A) Transcript abundance profiles of genes encoding enzymes involved in naringin/neoeriocitrin synthesis in different tissues. (B) Putative schematic representation of the naringin/neoeriocitrin synthetic pathways in D. roosii. PAL, phenylalanine ammonia lyase; C4H, cinnamate 4-hydroxylase; 4CL, 4-coumarate-CoA ligase; HCT, hydroxycinnamoyl transferase; C3H, p-coumarate 3-hydroxylase; C3'H, p-coumarate 3'-hydroxylase; CHS, chalcone synthase; CHI, chalcone isomerase; F3'H, flavonoid-3' hydroxylase; UGT, UDP-glucosyltransferases. (C) Transcript abundance profiles of genes encoding enzymes involved in naringin/neoeriocitrin synthesis in rhizomes of different ages. (D) Numbers of naringin/neoeriocitrin synthesis-related genes (NNSRGs) expressed at different levels (based on the value of fold change) in CVs, CNRs and CORs compared with CLs. (E) Numbers of the NNSRGs expressed at different levels (based on the value of FPKM) in different tissues. (F) Numbers of the NNSRGs expressed at different levels (based on the value of fold change) in R3Y, R5Y, R6Y, R8Y and R10Y compared with R2Y. (G) Numbers of the NNSRGs expressed at different levels (based on the value of FPKM) in rhizomes of different ages. Fig. 4 View largeDownload slide Heat map depicting the expression profiles of naringin/neoeriocitrin synthesis-related genes in D. roosii. (A) Transcript abundance profiles of genes encoding enzymes involved in naringin/neoeriocitrin synthesis in different tissues. (B) Putative schematic representation of the naringin/neoeriocitrin synthetic pathways in D. roosii. PAL, phenylalanine ammonia lyase; C4H, cinnamate 4-hydroxylase; 4CL, 4-coumarate-CoA ligase; HCT, hydroxycinnamoyl transferase; C3H, p-coumarate 3-hydroxylase; C3'H, p-coumarate 3'-hydroxylase; CHS, chalcone synthase; CHI, chalcone isomerase; F3'H, flavonoid-3' hydroxylase; UGT, UDP-glucosyltransferases. (C) Transcript abundance profiles of genes encoding enzymes involved in naringin/neoeriocitrin synthesis in rhizomes of different ages. (D) Numbers of naringin/neoeriocitrin synthesis-related genes (NNSRGs) expressed at different levels (based on the value of fold change) in CVs, CNRs and CORs compared with CLs. (E) Numbers of the NNSRGs expressed at different levels (based on the value of FPKM) in different tissues. (F) Numbers of the NNSRGs expressed at different levels (based on the value of fold change) in R3Y, R5Y, R6Y, R8Y and R10Y compared with R2Y. (G) Numbers of the NNSRGs expressed at different levels (based on the value of FPKM) in rhizomes of different ages. On the one hand, we investigated the tissue-specific expression of DEGs encoding enzymes involved in the pathways shown in Fig. 4B yielding naringin/neoeriocitrin. The DEGs were subjected to the filters of a fold change >2 or <0.5 and a false discovery rate (FDR) <0.01 in the present study. Analysis of our tissue-specific transcriptome data set revealed that 108 candidate DEGs across nine gene families related to naringin/neoeriocitrin showed significant variation among different tissues (Fig. 4A). The number of up-regulated genes (fold change >2, 5 and 10) was higher in rhizomes compared with leaves (CL samples), in particular new rhizomes (Fig. 4A, D). As shown in Fig. 4E, we found that genes in rhizomes showed higher expression [fragments per kilobase of transcript per million mapped reads (FPKM) >10 and 100] than CLs and CVs, especially in CNRs. Moreover, we identified 68 DEGs (63.0%) which exhibited up-regulated expression in rhizomes, comprising three PAL genes, three C4H genes, four 4CL genes, 17 HCT genes, four C3'H genes, 12 C3H genes, seven CHS genes, two CHI genes and 16 F3'H genes. Also, 66.2% of these DEGs appeared to be expressed at higher transcriptional levels in CNRs than in CORs. Notably, two F3'H genes and one gene each of 4CL, C3H, PAL, HCT and CHS were most highly (and quite specifically) expressed in CNRs (fold change >100; Fig. 4A;Supplementary Table S8). On the other hand, we analyzed the transcriptome data from samples of different ages and identified 92 candidate DEGs across nine gene families related to naringin/neoeriocitrin (Fig. 4C). Fig. 4F shows that the number of up-regulated genes in R3Y, R5Y, R6Y, R8Y and R10Y decreased compared with R2Y (fold change >1, 2, 5 and 10), and Fig. 4G reveals that genes presented down-regulated expression with a decreasing FPKM value over time. Moreover, 70.7% of the DEGs exhibited down-regulated expression over time, and the numbers of down-regulated DEGs (fold change <0.5) in R3Y, R5Y, R6Y, R8Y and R10Y compared with R2Y were 22, 14, 30, 33 and 47, respectively (Fig. 4C;Supplementary Table S8). Three PAL genes, three 4CL genes, one HCT gene and one C4H gene exhibited the lowest transcriptional levels in R10Y (fold change <0.1). Overall, the results confirmed the tissue and temporal expression specificity of the naringin/neoeriocitrin-related genes (NNRGs), especially for PAL, 4CL and HCT, suggesting their crucial roles in the specialized naringin/neoeriocitrin accumulation in D. roosii. RNA-seq data reveal l-phenylalanine/malonyl-CoA formation and transcription factors associated with naringin/neoeriocitrin in D. roosii Flavonoids are synthesized in plants via the flavonoid branch of the phenylpropanoid and acetate–malonate metabolic pathway, in which l-phenylalanine and malonyl-CoA are the important precursor metabolites for the phenylpropanoid metabolic pathway (Buer et al. 2010). Beyond the NNRGs displayed in Fig. 4B, we studied genes involved in the formation of l-phenylalanine via the shikimate pathway and malonyl-CoA, which were related to naringin/neoeriocitrin accumulation. For l-phenylalanine, among 25 DEGs, 14 DEGs showed a high expression level in CNRs (Fig. 5A;Supplementary Table S9). Of these, 3-deoxy-7-phosphoheptulonate synthase (CL7383) showed the highest expression level in CNRs (fold change = 354). The numbers of down-regulated DEGs (fold change <0.5) identified in R3Y, R5Y, R6Y, R8Y and R10Y compared with R2Y were 9, 2, 8, 10 and 13, respectively. For malonyl-CoA, we analyzed 112 DEGs and identified 77 DEGs (68.8%) that were specifically expressed in rhizomes. Among these DEGs, 49 (63.6%) were expressed at a high level in CNRs. Of these, alcohol dehydrogenase 1/7 (CL2849, fold change = 440) and 3-ketoacyl-CoA synthase (CL7598, fold change = 445) were most specifically expressed in CNRs. The number of down-regulated DEGs (fold change <0.5) identified in R3Y, R5Y, R6Y, R8Y and R10Y was 24, 12, 47, 33 and 46, respectively. ACCase is a biotin-containing enzyme that catalyzes the carboxylation of acetyl-CoA to malonyl-CoA. In our work, two acetyl-CoA/propionyl-CoA carboxylase genes (CL18632 and Unigene9710) encoding ACCase were expressed at a high level in CNRs, and two acetyl-CoA/propionyl-CoA carboxylase genes (CL18632 and CL10663) exhibited a down-regulated expression pattern over time. The DEGs involved in l-phenylalanine formation (phenylalanine, tyrosine and tryptophan biosynthesis) and malonyl-CoA formation (fatty acid biosynthesis, fatty acid elongation and fatty acid metabolism) exhibited tissue-specific expression profiles, with the highest expression being observed in CNRs, and displayed time point expression specificity with down-regulated expression over time. Fig. 5 View largeDownload slide Heat map depicting the expression profiles of precursor metabolism and transcription factor-regulated genes in D. roosii. (A) Profiles of the transcriptional abundance of enzymatic genes involved in malonyl-CoA/l-phenylalanine formation in different tissues and plants of different ages. (B) Profiles of the transcriptional abundance of transcription factor-regulated (MYB- and bHLH-regulated) genes in different tissues and plants of different ages. Fig. 5 View largeDownload slide Heat map depicting the expression profiles of precursor metabolism and transcription factor-regulated genes in D. roosii. (A) Profiles of the transcriptional abundance of enzymatic genes involved in malonyl-CoA/l-phenylalanine formation in different tissues and plants of different ages. (B) Profiles of the transcriptional abundance of transcription factor-regulated (MYB- and bHLH-regulated) genes in different tissues and plants of different ages. To understand the regulatory role of transcription factors (TFs) on gene expression in D. roosii, we analyzed the expression profiles of MYB- and basic helix–loop–helix (bHLH)-regulated DEGs in different tissues and plants of different ages, and discovered 84 MYB- and 118 bHLH-regulated DEGs (Fig. 5B;Supplementary Table S10). Among these MYB-regulated DEGs, 57 (67.9%) were highly expressed in rhizomes, and 41 (71.9%) were more highly expressed in CNRs. Two MYB-regulated DEGs (CL17931, fold change = 433; CL8191, fold change = 581) were most highly expressed in CNRs. The numbers of down-regulated DEGs (fold change <0.5) identified in R3Y, R5Y, R6Y, R8Y and R10Y were 18, 19, 31, 35 and 40, respectively. Among the bHLH-regulated DEGs, 88 (74.6%) were highly expressed in rhizomes and 63 (71.6%) were more highly expressed in CNRs. The bHLH-regulated DEGs (CL4885, fold change = 468; Unigene731, fold change = 476) were most specifically expressed in CNRs. The number of down-regulated DEGs (fold change <0.5) identified in R3Y, R5Y, R6Y, R8Y and R10Y was 19, 13, 52, 33 and 62, respectively. The results revealed that TF-regulated genes share a similar expression pattern, with higher transcription levels in CNRs, and displayed down-regulation along with increasing age, suggesting that the regulation of gene expression by bHLH and MYB depended on different tissues and ages in D. roosii. Weighted gene co-expression network analysis in D. roosii To identify the relationship between modules and samples, a gene co-expression network was constructed using 16,472 DEGs (Fig. 6A). Principal component analysis data showed that principal components 1, 2 and 3 explained 74.83, 14.90 and 4.42% of the observed variation, respectively (Supplementary Fig. S11). In Fig. 6A, a network heat map plot (interconnectivity plot) of the gene network is shown, with the corresponding hierarchical clustering dendrograms and resulting modules. Using a set of parameters that produced refined clusters, we detected 17 modules containing from 41 to –6,137 DEGs each (fold change >2 or <0.5; FDR <0.01). The expression profile of each module is summarized by a module eigengene, which is analogous to the first principal component of the module expression data (Langfelder and Horvath 2008). The modules were defined using color codes, as shown in Fig. 6B. We also calculated the similarity and clustered these modules into different groups depending on their eigenvalues. A comparison of the module eigengenes (Fig. 6B, C) revealed four related module sets that displayed similar expression profiles across different samples: (i) the royalblue, saddlebrown, brown, orangered4 and purple modules; (ii) the violet, darkmagenta, salmon and turquoise modules; (iii) the floralwhite, darkorange and pink modules; and (iv) the darkorange2, tan, plum1, bisque4 and lightsteelblue1 modules. The 17 module eigengenes for the 17 distinct modules showed some cross-talk with distinct sample types, due to the tissue-specific expression profiles of the eigengenes. Notably, seven of the 17 co-expression modules were composed of genes that were highly expressed in specific samples (R > 0.45; P < 10–3). Therefore, each of these seven modules identified (or was correlated with) genes associated with a specific tissue or time point. For example, the salmon module identified 2,413 CV-specific genes; the darkorange module identified 1,633 R2Y-specific genes; the darkorange4 module identified 669 R10Y-specific genes; and the saddlebrown module identified 807 R8Y-specific genes (Fig. 6D). Fig. 6 View largeDownload slide Identification of co-expression network modules in D. roosii. (A) Network heat map plot. Branches in the hierarchical clustering dendrograms correspond to modules. Color-coded module membership is indicated by the colored bars below and to the right of the dendrograms. In the heat map, high co-expression interconnectedness is indicated by progressively more saturated yellow and red colors. Modules correspond to blocks of highly interconnected genes. Genes with high intramodular connectivity are located at the tip of the module branches because they display the highest interconnectedness with the rest of the genes in the module. (B) Module eigenvector clustering. The network consists of 17 modules. The eigenvectors for each module are prefixed with module eigengenes (MEs) in the dendrogram and are calculated prior to thresholding the network. Adjacent modules are more highly similar in expression. (C) Module eigengene adjacency heat map. The heat map shows the relatedness of the 17 co-expression modules identified in WGCNA, with red indicating highly related and blue indicating not related. (D) Module–sample association. Each row corresponds to a module. The color of each cell at the row–column intersection indicates the correlation coefficient between the module and the sample type. A high degree of correlation between a specific module and the sample type is indicated with dark red. Fig. 6 View largeDownload slide Identification of co-expression network modules in D. roosii. (A) Network heat map plot. Branches in the hierarchical clustering dendrograms correspond to modules. Color-coded module membership is indicated by the colored bars below and to the right of the dendrograms. In the heat map, high co-expression interconnectedness is indicated by progressively more saturated yellow and red colors. Modules correspond to blocks of highly interconnected genes. Genes with high intramodular connectivity are located at the tip of the module branches because they display the highest interconnectedness with the rest of the genes in the module. (B) Module eigenvector clustering. The network consists of 17 modules. The eigenvectors for each module are prefixed with module eigengenes (MEs) in the dendrogram and are calculated prior to thresholding the network. Adjacent modules are more highly similar in expression. (C) Module eigengene adjacency heat map. The heat map shows the relatedness of the 17 co-expression modules identified in WGCNA, with red indicating highly related and blue indicating not related. (D) Module–sample association. Each row corresponds to a module. The color of each cell at the row–column intersection indicates the correlation coefficient between the module and the sample type. A high degree of correlation between a specific module and the sample type is indicated with dark red. Identification of naringin/neoeriocitrin-related gene expression pattern using modular organization analysis The investigation of relationships between module eigengenes and naringin/neoeriocitrin contents revealed that the correlation coefficients varied widely, from –0.79 to 0.7 for naringin and from –0.58 to 0.83 for neoeriocitrin (Fig. 7A). The eigengenes of the darkorange2, tan, turquoise and salmon modules showed significant correlations with naringin (|R| > 0.5, P < 0.001), although the turquoise and salmon modules were negatively correlated (Fig. 7A). The saddlebrown, brown, orangered4, purple and darkmagenta modules showed significant correlations with neoeriocitrin (|R| >0.5, P < 0.001), although darkmagenta was negatively correlated. These nine modules of interest comprised 13,704 DEGs, among which the turquoise module was the largest, with 6,137 DEGs. A heat map of the relative expression of each gene from the nine modules revealed that many of the darkorange2 and tan module genes (positive correction with naringin) were most highly expressed in CNRs, followed by R3Y, with a high ratio of young tissue. On the other hand, a group of genes in the turquoise and salmon modules (negative correction with naringin) were more highly expressed in CLs and CVs, respectively. Moreover, most genes in the brown, orangered4, purple and saddlebrown modules (positive correction with neoeriocitrin) were specifically expressed in R10Y and R8Y, with a high ratio of old tissue, and many genes in the darkmagenta module (negative correction with neoeriocitrin) were specifically expressed in CNRs (Fig. 7B). Genes positively associated with naringin were specifically expressed in new rhizomes (CNRs/R3Y), while genes positively associated with neoeriocitrin were specifically expressed in old rhizomes (R8Y/R10Y), suggesting the tissue- and time point-specific pattern of the NNRGs in D. roosii. Furthermore, we performed an enrichment analysis of GO annotations of naringin/neoeriocitrin-related module genes, and the results showed that the greatest number of DGEs fell in ‘metabolic process’ of the biological process category (Supplementary Table S11). Fig. 7 View largeDownload slide Modular organization analysis of the naringin/neoeriocitrin-related gene expression pattern in D. roosii. (A) Module–naringin/neoeriocitrin correlations and corresponding P-values (in parentheses). The left panel shows 17 modules (royalblue, saddlebrown, brown, orangered4, purple, violet, darkmagenta, salmon, turquoise, floralwhite, darkorange, pink, darkorange2, tan, plum1, bisque4 and lightsteelblue) and their member genes. The right panel is a color scale for module–trait correlation, from −1 to 1. The number of genes in each module is indicated on the left. Each column corresponds to a specific sample. (B) Expanded view of the expression of naringin/neoeriocitrin-related DEGs in the nine modules across all samples. Heat map showing the relative expression of each gene from the saddlebrown, brown, orangered4, purple, darkorange2 and tan modules(positive correlation), and darkmagenta, salmon and turquoise modules (negative correlation). Fig. 7 View largeDownload slide Modular organization analysis of the naringin/neoeriocitrin-related gene expression pattern in D. roosii. (A) Module–naringin/neoeriocitrin correlations and corresponding P-values (in parentheses). The left panel shows 17 modules (royalblue, saddlebrown, brown, orangered4, purple, violet, darkmagenta, salmon, turquoise, floralwhite, darkorange, pink, darkorange2, tan, plum1, bisque4 and lightsteelblue) and their member genes. The right panel is a color scale for module–trait correlation, from −1 to 1. The number of genes in each module is indicated on the left. Each column corresponds to a specific sample. (B) Expanded view of the expression of naringin/neoeriocitrin-related DEGs in the nine modules across all samples. Heat map showing the relative expression of each gene from the saddlebrown, brown, orangered4, purple, darkorange2 and tan modules(positive correlation), and darkmagenta, salmon and turquoise modules (negative correlation). WGCNA reveals gene co-expression networks associated with naringin/neoeriocitrin Among these 17 modules, the turquoise and tan modules had the largest number of MYB- and bHLH-regulated genes compared with other modules, which indicated the involvement of MYB and bHLH TFs in regulating gene expression in the turquoise and tan modules (Fig. 8A). The hub gene network analysis revealed a hierarchical organization of highly connected genes in each module, through which core controlling (hub) genes in the modular network could be identified. To evaluate the co-expression pattern of NNRGs, we compared the networks using naringin/neoeriocitrin synthesis-related genes (NNSRGs) and TF-regulated DEGs in the turquoise and tan modules. The connections of the naringin-related and TF-regulated DEGs produced a network of 60 nodes and 440 edges in the turquoise module, showing that eight metabolic pathway hub genes (three C4H gene, two 4CL genes, two CHS genes and one PAL gene) were highly co-expressed with 52 TF-regulated genes (16 MYB- and 36 bHLH-regulated genes). The WGCNA between 15 pathway hub genes (five CHS genes, three 4CL genes, three CHI genes, three PAL genes and one C4H gene) and the list of 57 TF-regulated genes (22 MYB- and 35 bHLH-regulated genes) was conducted in the tan module associated with naringin, and the resulting network contained 72 nodes and 948 edges. Regarding the significant neoeriocitrin synthesis-related and relevant TF-regulated genes, the co-expression networks contained 92 nodes and 2,802 edges in the turquoise module, and 91 nodes and 2,447 edges in the tan module. The turquoise module included 40 hub pathway genes (14 HCT genes, 11 F3'H genes, five C3H genes, three C4H genes, two C3'H genes, two CHS genes, two 4CL genes and one PAL gene) and 52 TF-regulated genes (16 MYB- and 36 bHLH-regulated genes). A total of 34 hub pathway genes in the tan module comprised eight F3'H genes, seven C3H genes, five CHS genes, four HCT genes, three PAL genes, three 4CL genes, three CHI genes and one C4H gene, together with 22 MYB-regulated genes and 35 bHLH-regulated genes (Fig. 8B;Supplementary Tables S12, S13). Fig. 8 View largeDownload slide Networks enriched in naringin/neoeriocitrin-related genes revealed via WGCNA. (A) Distribution of transcription factor-regulated genes in D. roosii. The white bar represents total TF-regulated DEGs; the red bar represents MYB-regulated DEGs; the green bar represents bHLH-regulated DEGs. (B) The naringin/neoeriocitrin-related networks are constructed using synthesis-related genes and relevant transcription factor-regulated genes in turquoise and tan modules. The naringin/neoeriocitrin synthesis-related genes are highlighted in turquoise and tan modules, respectively. Additionally, see Supplementary Tables S12 and S13. (C) Relative gene expression values are based on log10 RPKM according to the networks of turquoise and tan modules. Fig. 8 View largeDownload slide Networks enriched in naringin/neoeriocitrin-related genes revealed via WGCNA. (A) Distribution of transcription factor-regulated genes in D. roosii. The white bar represents total TF-regulated DEGs; the red bar represents MYB-regulated DEGs; the green bar represents bHLH-regulated DEGs. (B) The naringin/neoeriocitrin-related networks are constructed using synthesis-related genes and relevant transcription factor-regulated genes in turquoise and tan modules. The naringin/neoeriocitrin synthesis-related genes are highlighted in turquoise and tan modules, respectively. Additionally, see Supplementary Tables S12 and S13. (C) Relative gene expression values are based on log10 RPKM according to the networks of turquoise and tan modules. Among naringin-related modules, 10 hub genes in the tan module showing the highest edge number (71 edges) comprised three PAL genes, three 4CL genes, two CHI genes, one C4H gene and one CHS gene, while seven hub genes in the turquoise module exhibiting the highest edge number (59 edges) comprised three C4H genes two 4CL genes, one PAL gene and one CHS gene (Supplementary Table S12). These results indicated that the 4CL, PAL and C4H gene families might play the central roles in naringin synthesis. For neoeriocitrin-related modules, 19 hub genes in the tan module presenting the highest edge number (90 edges) consisted of six C3H genes, three HCT genes, three 4CL genes, two F3'H genes, two PAL genes and one each of C4H, CHS and CHI. The 23 hub genes in the turquoise module exhibiting the highest edge number (91 edges) were nine HCT genes, three C3H genes, two C4H genes, two 4CL genes, two C3'H genes, one PAL gene and one CHS gene (Supplementary Table S12). These results suggested that the HCT and C3H gene families might play the most significant roles in neoeriocitrin synthesis. In total, the above results showed that the 4CL, PAL and C4H, and HCT and C3H gene families acted as the most important hub genes in the naringin- and neoeriocitrin-related co-expression networks, respectively, and played vital roles in naringin and neoeriocitrin synthesis. In addition, the majority of naringin/neoeriocitrin synthesis-related hub genes exhibited high co-expression with TF-regulated genes, which suggested that these TF-regulated genes were candidate genes involved in regulating naringin/neoeriocitrin synthesis, and MYB/bHLH TFs might take part in regulating the NNSRGs at the transcriptional level in D. roosii. Fig. 8C shows the patterns of transcript abundance of co-expressed hub enzymes and MYB/bHLH TF-regulated genes based on Fig. 8B. Genes in the turquoise and tan modules exhibited a highly co-expressed pattern and showed a negative and positive association with naringin content, respectively. Furthermore, the DEGs in the turquoise and tan modules were mapped onto KEGG pathways and presented the top 20 enriched pathways (Supplementary Fig. S12A, C). Of these, ‘Flavonoid biosynthesis’ (ko00941), ‘Flavone and flavonol biosynthesis’ (ko00944) and ‘Biosynthesis of secondary metabolites’ (ko01110) were enriched observably for turquoise and tan modules. ‘Phenylpropanoid biosynthesis’ (ko00940), ‘Phenylalanine metabolism’ (ko00360) and ‘Fatty acid elongation’ (ko00062) were particularly enriched for the tan module. Supplementary Fig. S12B and D presents the number of DEGs in the turquoise and tan modules involved in ‘Biosynthesis of secondary metabolites’ (ko01110), ‘Fatty acid metabolism’ (ko00071), ‘Tyrosine metabolism’ (ko00350) and ‘Tryptophan metabolism’ (ko00380), which were related to naringin/neoeriocitrin accumulation in D. roosii. The number of genes involved in ‘Flavonoid biosynthesis’ and ‘Phenylpropanoid biosynthesis’ (ko00940) was the highest when compared with other pathways. Indeed, biosynthesis of secondary metabolites, especially of flavonoid, was enriched in these naringin/neoeriocitrin-related modules depending on tissues or ages in D. roosii. Based on the co-expression network results, we detected the expression profiles of several hub genes associated with naringin/neoeriocitrin synthesis using quantiative real-time reverse transcription–PCR (qRT–PCR). The present results showed that the transcripts of PAL (CL9563), 4CL (CL5245) and C4H (CL4574) genes associated with naringin were higher in rhizomes, especially in new rhizomes, when compared with leaves and veins. Also, the expression of these genes remained unregulated over the years (Fig. 9A–C). Similarly, the transcripts of HCT (CL11832; CL4397) and C3H (CL6561) genes associated with neoeriocitrin were higher in the new rhizome than in other tissues. Nevertheless, the expression of these genes was clearly down-regulated over the years (Fig. 9D–F). The qRT–PCR results were consistent with our transcriptome data and supported the tissue- and age-dependent naringin/neoeriocitrin accumulation profile as shown in Figs. 1C, D and 2A, B. Fig. 9 View largeDownload slide Quantitative real-time PCR analysis of the PAL (CL9563), 4CL (CL5245), C4H (CL4574), HCT (CL11832 and CL4397) and C3H (CL6561) transcript levels in different tissues and D. roosii plants of different ages, respectively. Amplification of the ACTIN (CL186) gene was used as an internal control. Bars represent the mean and SD of values obtained from three biological replicates. Fig. 9 View largeDownload slide Quantitative real-time PCR analysis of the PAL (CL9563), 4CL (CL5245), C4H (CL4574), HCT (CL11832 and CL4397) and C3H (CL6561) transcript levels in different tissues and D. roosii plants of different ages, respectively. Amplification of the ACTIN (CL186) gene was used as an internal control. Bars represent the mean and SD of values obtained from three biological replicates. Discussion Medicinal aspects of naringin and neoeriocitrin in the fern D. roosii Osteoporosis is the most common bone-related health problem, affecting millions of people around the world, and especially the aging population. Naringin and neoeriocitrin are considered the main effective compounds of ‘GuSuiBu’, which has been extensively studied in TCM regarding its protective role in the maintenance of bone mass in various models of osteoporosis (Zhang et al. 2017). A reversed-phase HPLC with photodiode array detection method was previously developed for the accurate quantitative determination of naringin/neoeriocitrin and applied for fingerprint analysis of D. fortunei rhizomes (H.T. Liu et al. 2012). In the present study, HPLC-MS/MS analysis confirmed the highly similar chemical structure (Fig. 2C, D) and revealed the tissue- and time-specific accumulation pattern of naringin/neoeriocitrin in D. roosii. The naringin content presented particular high accumulation in CNRs and remained relatively stable over the years, whereas neoeriocitrin content was maintained at a much higher level in CORs and increased gradually over the years (Figs. 1C, D, 2A, B). Moreover, we found that at least 8 years was required for D. roosii to accumulate similar naringin and neoeriocitrin contents to wild plants, which are commonly harvested for medicinal use (Fig. 1D). However, 3 years of growth time was adequate for biomass accumulation of D. roosii (Fig. 1E). To balance the time cost and biomass demand for industrial development, we suggest that 3 years of growth is reasonable for further studies aimed at improving medicinal components for D. roosii. Combined sequencing approach to achieve a comprehensive transcriptome resource of D. roosii Structural and functional genomics studies are fundamental to the understanding of plant metabolism, necessitating further cloning efforts to obtain comprehensive sequences for the investigation of potential mechanisms underlying the specialized naringin/neoeriocitrin accumulation pattern in D. roosii. To perform such studies, access to high-quality genome and transcriptome sequences is essential. The transcripts derived from SGS may suffer from misassembly of the reads transcribed from highly repetitive regions or very similar members of multigene families, which may be even more severe for polyploid plants with many nearly identical homoeologous gene sets (B. Li et al. 2014). Based on RNA-seq data, we found that D. roosii harbored a mass of nearly identical homoeologous sequences, indicating its extremely large genome. SMRT sequencing with a read length up to 20 kb renders PacBio RSII very effective in the sequencing of full-length cDNAs that exhibit long transcript isoforms (Sharon et al. 2013). In this work, we presented comprehensive, high-quality, full-length transcriptome sequences for D. roosii using SMRT technology. A recent study reported that high-accuracy de novo transcriptome assemblies of A. deserti and A. tequilana in this way offered new resources for studying stress adaption (Gross et al. 2013). Based on our SMRT and SGS data, NR analysis showed that D. roosii shared high similarity with three major species (P. patens, S. moellendorffii and P. sitchensis), which provided us with the reference plants for study of D. roosii (Fig. 3D;Supplementary Fig. S7). The combination of SMRT sequencing and SGS technology first provided such abundant genetic information resources for D. roosii, vastly contributing to further study of its medicinal value. Transcriptomic analysis of the tissue- and time-specific naringin/neoeriocitrin-related gene expression pattern Numerous studies have indicated that the expression of the genes involved in the flavonoid biosynthetic pathway is controlled by distinct mechanisms in a tissue- or species-specific manner (Schaart et al. 2013). This regulation involves different sets of transcriptional regulators that are specific to each class of flavonoids (Xu et al. 2014). Using a non-targeted metabolomics approach, statistical analyses revealed that both development time and tissue type influenced metabolic changes, and development time seemed to produce more effects than tissue type on the fruit metabolome in cucumber. Most of the metabolite contents decreased gradually, while those of some flavonoids, amino acids and carbohydrates increased throughout development (Hu et al. 2018). The expression pattern in blackberry tissue showed that 6,604 and 6,599 genes were particularly overexpressed in fruits and leaves, respectively. Moreover, bioactives characterization indicated that total phenolics and flavonols were higher in leaves, while anthocyanins were higher in fruits (Gutierrez et al. 2017). The expression patterns of ParPAL1 and ParPAL2 genes were investigated in various vegetative tissues of apricot, and ParPAL1 transcripts were 3-fold more abundant in callus tissue than in leaves and bark tissue; nevertheless, ParPAL2 was expressed at the same level in all tissues (Irisarri et al. 2016). A tissue-specific expression analysis indicated that Ma4CL3 was strongly expressed in root bark, stem bark and old leaves, and its expression pattern was similar to the trend of the total flavonoid content throughout fruit development in mulberry (Wang et al. 2016). In the present study, the transcriptome analysis confirmed that rhizomes served as the main tissue in which naringin and neoeriocitrin were generated and stored, and revealed that the NNRGs exhibited tissue- or time-specific expression patterns in D. roosii. Fig. 4 and Supplementary Table S8 showed that 68 DEGs associated with naringin/neoeriocitrin synthesis (63.0%) exhibited higher expression in rhizomes, with 66.2% of these DEGs appearing to be most highly expressed in CNRs, and the number of down-regulated DEGs (fold change <0.5) increased over time. Notably, the obviously unique expression pattern of PAL, 4CL and HCT observed in different tissues and plants of different ages suggested the pivotal roles of these genes in naringin/neoeriocitrin synthesis. Combined with the similar tissue- and time-specific expression patterns of other NNSRGs (F3'H, C3H, CHS and C4H), our study illustrated the molecular regulatory mechanisms underlying naringin/neoeriocitrin synthesis in D. roosii. Flavonoids are derived from phenylalanine, which is a precursor of the phenylpropanoid metabolic pathway, via shikimate biosynthesis. Shikimate biosynthesis provides carbon skeletons for aromatic amino acids (Tohge et al. 2013). Accordingly, we found that the DEGs involved in l-phenylalanine formation via the shikimate pathway exhibited a tissue- and time point-specific expression pattern, in which the corresponding genes were most highly expressed in CNRs and displayed a down-regulated expression pattern over time (Fig. 5A). Malonyl-CoA is an important precursor metabolite for the biosynthesis of flavonoids and is naturally consumed for fatty acid biosynthesis, elongation and metabolism, leaving a reduced amount of malonyl-CoA available for the production of other metabolic targets (Chen et al. 2011). In the present study, the DEGs involved in malonyl-CoA formation exhibited rhizome-specific expression profiles, and were more highly expressed in CNRs (especially genes in the fatty acid elongation pathway); these DEGs displayed a down-regulated expression pattern over time. Recent studies have indicated that the expression of PAL, 4CL and F3H in the phenolic biosynthesis pathway may be positively or negatively regulated by MYB and bHLH TFs (Sun et al. 2017). In the case of grapevine, VviMYBC2-L1 and VviMYBC2-L3 have been characterized as negative regulators to fine-tune flavonoid levels (Cavallini et al. 2015). In the present study, we discovered that 67.9% of MYB-regulated DEGs were highly expressed in rhizomes, and 71.9% of these DEGs were more highly expressed in CNRs. Among the bHLH-regulated DEGs, 74.6% displayed higher transcriptional levels in rhizomes, and 71.6% of these DEGs were more highly expressed in CNRs. Moreover, the number of down-regulated (fold change <0.5) MYB- and bHLH-regulated DEGs clearly increased over time (Fig. 5B). All these findings indicated that MYB- and bHLH-regulated genes exhibited a similar expression pattern to NNSRGs, and MYB- or bHLH-mediated naringin/neoeriocitrin synthesis might also depend on different tissues and ages. WGCNA of naringin/neoeriocitrin-related gene expression The systems approach for data mining via WGCNA was particularly fruitfull in identifying information about modules that were shared by subsets of samples and linked to phenotypic traits, and this method appears to be quite effective in revealing common features or trait-specific modules and important hub genes for systems biology research (Hollender et al. 2014). The present study was a novel application of WGCNA in systems biology research on the naringin/neoeriocitrin-related gene expression in D. roosii. Our work elucidated 17 clustered modules of a total of 16,472 DEGs depending on the gene expression, and analyzed the relationships of modules/modules, modules/samples and modules/traits (Figs. 6, 7). Moreover, the salmon, darkorange, darkorange4 and saddlebrown modules identified 2,413 CV-specific, 1,633 R2Y-specific, 669 R10Y-specific and 807 R8Y-specific genes, respectively (Fig. 6D). Nine modules exhibited significant association with naringin/neoeriocitrin, and DEGs in each module displayed higher expression in a specific tissue or at a specific time point (Fig. 7). These results contributed a lot to understanding the expression profiles of the NNRGs in different tissues and plants of different ages, and confirmed the tissue and time specificity of the NNRGEP in D. roosii. Hub gene analyses were powerful in revealing important regulatory factors for specific biological features. In the present study, the transcriptome data analyses and WGCNA revealed information on hub genes, which presented high co-expression with MYB- and bHLH-regulated genes in networks. For naringin-related genes, three PAL genes, three 4CL genes and three C4H genes exhibited the highest edge number, while for neoeriocitrin-related genes, six C3H genes and nine HCT genes displayed the highest edge number in tan and turquoise module networks (Fig. 8B). The gene families of 4CL, PAL and C4H, and HCT and C3H played important roles in naringin and neoeriocitrin synthesis, respectively, which was consistent with the heat map analysis of PAL, 4CL and HCT genes in D. roosii. Different branches of the flavonoid pathway are highly regulated at the transcriptional level, especially by MYB and bHLH TFs, with some genes being co-expressed to stimulate or inhibit flavonoid accumulation in blackberry (Gutierrez et al. 2017). Considering the highly co-expressed relationship between naringin/neoeriocitrin synthesis-related hub genes and TF-regulated genes, we suggested that these MYB- and bHLH-regulated genes were candidates involved in regulating naringin/neoeriocitrin synthesis, and MYB/ bHLH TFs were indirectly linked to the hub genes of 4CL, PAL, C4H, HCT and C3H. The assessment of co-expression groups might enable basic discoveries related to genome function, regulatory networks, biomarkers and candidate genes, and support new evidence for the reorganization of plant metabolism. Materials and Methods Plant materials and RNA sample preparation Drynaria roosii was planted in the greenhouse at the Institute of Botany of the Chinese Academy of Sciences. Rhizomes from plants of different ages were collected. Three-year-old plants were also collected and divided into four parts (old rhizomes, new rhizomes, veins and leaves). Wild plants were harvested from Mount Emei in the Sichuan Province of China. Three plant replicates per experimental treatment and 10 plants per replicate were harvested. All of the rhizome samples from wild plants, plants of different ages (2-, 3-, 5-, 6-, 8- and 10-year-old rhizomes) and different tissues (old rhizomes, new rhizomes, veins and leaves) were immediately frozen in liquid nitrogen and stored at −80°C until RNA isolation. All rhizome samples were cut into pieces, mixed evenly and ground into flour with a grinding machine (IKA A11 basic, Staufen). Total RNA was obtained from each sample using an RNAprep Pure Plant kit (TIANGEN Biotech) according to the manufacturer’s instructions. The quality of RNA was determined using a NanoDrop 2000 spectrophotometer and an Agilent 2100 bioanalyzer (Agilent Technologies). Quantitative analyses and validation of naringin/neoeriocitrin metabolites Samples were analyzed using HPLC according to methods described previously (Cui et al. 2016). The powdered D. roosii samples were dried using vacuum freeze-drying equipment and passed through a 40 mesh sieve, after which 0.02 g of the sample powder was extracted with 2 ml of 80% ethanol in an ultrasonic bath for 30 min. The extracted solution was stored in a refrigerator at 4°C for 16 h and then placed in an ultrasonic bath for 30 min. The solution was filtered through a 0.45 µm filter membrane before the injection of 10 µl for HPLC analysis. A Dionex Ultimate 3000 HPLC system (Thermo Fisher Scientific) was used for the analysis. In addition, analysis was carried out on a ANPEL Athena C18-WP chromatographic column (4.6×150 mm, 5 μm, internal diameter, Tosoh) with a gradient elution program. The mobile phase was composed of 0.1% formic acid in ultrapure water (A) and acetonitrile (B). The elution program was optimized and conducted as follows: a linear gradient of 10–20% B (0–10 min), a linear gradient of 20–26% B (10–31 min), a linear gradient of 26–10% B (31–33 min) and a linear gradient of 10% B (33–35 min), at a flow rate of 0.8 ml min–1. The column temperature was maintained at 35°C, and the detection wavelength at 280 nm. Three replicates were carried out for each sample. The reference standards of naringin and neoeriocitrin used in HPLC analysis were purchased from Sigma-Aldrich. HPLC grade acetonitrile was obtained from Thermo Fisher Scientific. All other regents were of analytical grade and obtained from Beijing Chemical Works. Target compounds were identified using HPLC-MS/MS as described by Cui et al. (2016). Mass spectra were applied using electrospray ionization in negative ionization mode with the m/z range of 100–600 and 100–650. A nitrogen drying gas flow of 800 l h–1, a desolvation temperature at 400°C, a nebulizer pressure of 45 p.s.i. and a capillary voltage of 3,000 V were used. Argon was used as collision gas with collision energy of 30 V. PacBio library construction and sequencing The purified RNA samples from the four different tissues were used for PacBio library construction and sequencing. Normalized cDNAs of different sizes were constructed separately for seven SMRT cell libraries using the Clontech SMARTer PCR cDNA Synthesis Kit. The templates were bound to SA-DNA polymerase and V2 primers using the DNA/Polymerase Binding Kit 2.0. The complexes of templates and polymerase were bound to magnetic beads and then transferred to a 96-well PCR plate for processing using a PacBio RS system with C2 sequencing reagents. The subreads were filtered and subjected to CCS through SMRT Analysis_2.3.0. (http://www.pacb.com/support/software-downloads/). Illumina library construction and sequencing Thirty RNA samples from different tissues and rhizomes of different ages were used for Illumina library construction and sequencing. The cDNA libraries for HiSeq X Ten sequencing were constructed as follows. The total RNA from each sample was treated with DNase I, and poly(A) mRNA was isolated with oligo(dT) beads, followed by reverse transcription into first-strand cDNA using reverse transcriptase. Second-strand cDNA was synthesized with DNA polymerase I and RNaseH and then ligated with an adaptor or index adaptor using T4 DNA ligase. The adaptor-ligated fragments were separated and excised through agarose gel electrophoresis. PCR was performed selectively to enrich and amplify the cDNA fragments. Finally, the cDNA libraries were sequenced using HiSeq X Ten at Hangzhou 1 gene Technology Co., Ltd. Gene co-expression network analysis and network visualization Co-expression and module analyses were performed using functions of the R package WGCNA V1.41-1 (Langfelder and Horvath 2008). A soft power threshold of 6 was employed to transform the correlation matrix into a signed weighted adjacency matrix, leading to an approximate scale-free topology for the obtained network. The resulting adjacency matrix was used to calculate the topological overlap matrix (TOM). Next, a gene expression adjacency matrix was constructed using the soft-threshold Pearson correlation (power = 6) between all pairs of probe expression data. Such soft thresholding yields a scale-free network (scale-free topology model fit R2 = 0.9). A topological overlap map was calculated by comparing the similarity of the connectivity of each pair of probes. A dynamic hybrid tree-cut algorithm (R package dynamicTreeCut, v. 1.60) was employed for module detection. The minimum size of the module was set to 30 to avoid small modules, and DeepSplit was set to 4 to produce fine clusters. Modules exhibiting high similarity of eigengenes (dissimilarity, 0.25) were merged to avoid oversplitting clusters. The networks were visualized using Cytoscape _v.3.0.0. Validation of the expression of naringin/neoeriocitrin-related hub genes The expression profiles of the PAL (CL9563), 4CL (CL5245), C4H (CL4574), HCT (CL11832; CL4397) and C3H (CL6561) genes were characterized by qRT–PCR. Amplification of the ACTIN (CL186) gene was used as an internal control. The corresponding primer sequences are provided in Supplementary Table S14. The protocol for qRT–PCR and data processing were as previously described by Rieu and Powers (2009). Three biological replications were carried out. Accession codes The SMRT sequencing and HiSeq X Ten data were submitted to the Sequence Read Archive (SRA) of the National Center for Biotechnology Information (NCBI) under accession numbers SRP108926 and SRP108652, respectively. Supplementary Data Supplementary data are available at PCP Online. Funding This work was supported by the National Natural Science Foundation of China [grant Nos. 81573519 and 31600264] and the Key Projects of the Chinese Academy of Sciences [grant no. KZCC-EW-103-3]. Acknowledgments We thank Professor Yongzhen Pang from the Institute of Animal Sciences, Chinese Academy of Agricultural Sciences for help with research into the naringin and neoeriocitrin biosynthetic pathways. We thank Dr. Yan Zhu from the Plant Science Facility of the Institute of Botany, Chinese Academy of Sciences for HPLC-MS/MS analysis. Disclosures The authors have no conflicts of interest to declare. References Abdel-Ghany S.E., Hamilton M., Jacobi J.L., Ngam P., Devitt N., Schilkey F., et al.   ( 2016) A survey of the sorghum transcriptome using single-molecule long reads. Nat. Commun.  7: 11706. Google Scholar CrossRef Search ADS PubMed  Bar-Peled M., Lewinsohn E., Fluhr R., Gressel J. ( 1991) UDP-rhamnose:flavanone-7-O-glucoside-2''-O-rhamnosyltransferase. J. Biol. Chem.  266: 20953– 20959. Google Scholar PubMed  Buer C.S., Imin N., Djordjevic M.A. ( 2010) Flavonoids: new roles for old molecules. J. Integr. Plant Biol.  52: 98– 111. Google Scholar CrossRef Search ADS PubMed  Cavallini E., Matus J.T., Finezzo L., Zenoni S., Loyola R., Guzzo F., et al.   ( 2015) The phenylpropanoid pathway is controlled at different branches by a set of R2R3-MYB C2 repressors in grapevine. Plant Physiol.  167: 1448– 1470. Google Scholar CrossRef Search ADS PubMed  Chen H., Kim H.U., Weng H., Browse J. ( 2011) Malonyl-CoA synthetase, encoded by ACYL ACTIVATING ENZYME13, is essential for growth and development of Arabidopsis. Plant Cell  23: 2247– 2262. Google Scholar CrossRef Search ADS PubMed  Chen S.L., Xu J., Liu C., Zhu Y.J., Nelson D.R., Zhou S.G., et al.   ( 2012) Genome sequence of the model medicinal mushroom Ganoderma lucidum. Nat. Commun.  3: 913. Google Scholar CrossRef Search ADS PubMed  Chinese Pharmacopoeia Commission. ( 2015) Pharmacopoeia of the People’s Republic of China . China Medical Science Press, Beijing. Christensen A.B., Gregersen P.L., Schroder J., Collinge D.B. ( 1998) A chalcone synthase with an unusual substrate preference is expressed in barley leaves in response to UV light and pathogen attack. Plant Mol. Biol . 37: 849– 857. Google Scholar CrossRef Search ADS PubMed  Cui L.L., Yao S.B., Dai X.L., Yin Q.G., Liu Y.J., Jiang X.L., et al.   ( 2016) Identification of UDP-glycosyltransferases involved in the biosynthesis of astringent taste compounds in tea (Camellia sinensis). J. Exp. Bot . 67: 2285– 2297. Google Scholar CrossRef Search ADS PubMed  Dixon R.A., Pasinetti G.M. ( 2010) Flavonoids and isoflavonoids: from plant biology to agriculture and neuroscience. Plant Physiol.  154: 453– 457. Google Scholar CrossRef Search ADS PubMed  Ficklin S.P., Feltus F.A. ( 2011) Gene coexpression network alignment and conservation of gene modules between two grass species: maize and rice. Plant Physiol . 156: 1244– 1256. Google Scholar CrossRef Search ADS PubMed  Gerttula S., Zinkgraf M., Muday G.K., Lewis D.R., Ibatullin F.M., Brumer H., et al.   ( 2015) Transcriptional and hormonal regulation of gravitropism of woody stems in Populus. Plant Cell  27: 2800– 2813. Google Scholar PubMed  Gross S.M., Martin J.A., Simpson J., Abraham-Juarez M.J., Wang Z., Visel A. ( 2013) De novo transcriptome assembly of drought tolerant CAM plants, Agave deserti and Agave tequilana. BMC Genomics  14: 563. Google Scholar CrossRef Search ADS PubMed  Gutierrez E., García-Villaraco A., Lucas J.A., Gradillas A., Gutierrez-Mañero F.J., Ramos-Solano B. ( 2017) Transcriptomics, targeted metabolomics and gene expression of blackberry leaves and fruits indicate flavonoid metabolic flux from leaf to red fruit. Front. Plant Sci.  8: 472. Google Scholar CrossRef Search ADS PubMed  Hollender C.A., Kang C., Darwish O., Geretz A., Matthews B.F., Slovin J., et al.   ( 2014) Floral transcriptomes in woodland strawberry uncover developing receptacle and anther gene networks. Plant Physiol . 165: 1062– 1075. Google Scholar CrossRef Search ADS PubMed  Hu C.Y., Zhao H.Y., Wang W., Xu M.F., Shi J.X., Nie X.B., et al.   ( 2018) Identification of conserved and diverse metabolic shift of the stylar, intermediate and peduncular segments of cucumber fruit during development. Int. J. Mol. Sci.  19: 135. Google Scholar CrossRef Search ADS   Huddleston J., Ranade S., Malig M., Antonacci F., Chaisson M., Hon L., et al.   ( 2014) Reconstructing complex regions of genomes using long-read sequencing technology. Genome Res . 24: 688– 696. Google Scholar CrossRef Search ADS PubMed  Hutabarat O.S., Flachowsky H., Regos I., Miosic S., Kaufmann C., Faramarzi S., et al.   ( 2016) Transgenic apple plants overexpressing the chalcone-3-hydroxylase gene of Cosmos sulphureus show increased levels of 3-hydroxyphloridzin and reduced susceptibility to apple scab and fire blight. Planta  243: 1213– 1224. Google Scholar CrossRef Search ADS PubMed  Irisarri P., Zhebentyayeva T., Errea P., Pina A. ( 2016) Differential expression of phenylalanine ammonia lyase (PAL) genes implies distinct roles in development of graft incompatibility symptoms in Prunus. Sci. Hort . 204: 16– 24. Google Scholar CrossRef Search ADS   Langfelder P., Horvath S. ( 2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics  9: 559. Google Scholar CrossRef Search ADS PubMed  Li B., Fillmore N., Bai Y., Collins M., Thomson J.A., Stewart R., et al.   ( 2014) Evaluation of de novo transcriptome assemblies from RNA-Seq data. Genome Biol.  15: 553. Google Scholar CrossRef Search ADS PubMed  Li J., Harata-Lee Y., Denton M.D., Feng Q.J., Rathjen J.R., Qu Z.P., et al.   ( 2017) Long read reference genome-free reconstruction of a full-length transcriptome from Astragalus membranaceus reveals transcript variants involved in bioactive compound biosynthesis. Cell Discov.  3: 17031. Google Scholar CrossRef Search ADS PubMed  Li Q.S., Li Y., Song J.Y., Xu H.B., Xu J., Zhu Y.J., et al.   ( 2014) High-accuracy de novo assembly and SNP detection of chloroplast genomes using a SMRT circular consensus sequencing strategy. New Phytol . 204: 1041– 1049. Google Scholar CrossRef Search ADS PubMed  Li X.L., Xiao H.B., Liang X.M., Shi D.Z., Liu J.G. ( 2004) LC–MS/MS determination of naringin, hesperidin and neohesperidin in rat serum after orally administrating the decoction of Bulpleurum falcatum L. and Fractus aurantii. J. Pharmaceut. Biomed . 34: 159– 166. Google Scholar CrossRef Search ADS   Li L.N., Zeng Z., Cai G.P. ( 2011) Comparison of neoeriocitrin and naringin on proliferation and osteogenic differentiation in MC3T3-E1. Phytomedicine  18: 985– 989. Google Scholar CrossRef Search ADS PubMed  Liu L.L., Qu W., Liang J.Y. ( 2012) Progress on chemical constituents and biological activities of Drynaria fortunei. Strait Pharm. J . 1: 4– 9. Liu H.T., Zou S.S., Qi Y.D., Zhu Y.X., Li X.B., Zhang B.G. ( 2012) Quantitative determination of four compounds and fingerprint analysis in the rhizomes of Drynaria fortunei (Kunze) J. Sm. J. Nat. Med.  66: 413– 419. Google Scholar CrossRef Search ADS PubMed  Ma X.L., Lv J.W., Sun X.L., Ma J.X., Xing G.S., Wang Y., et al.   ( 2016) Naringin ameliorates bone loss induced by sciatic neurectomy and increases Semaphorin 3A expression in denervated bone. Sci. Rep.  6: 24562. Google Scholar CrossRef Search ADS PubMed  McIntosh C.A., Mansell R.L. ( 1990) Biosynthesis of naringin in Citrus paradisi: UDP-glucosyl-transferase activity in grapefruit seedlings. Phytochemistry  29: 1533– 1538. Google Scholar CrossRef Search ADS   Moschen S., Higgins J., Rienzo J.A.D., Heinz R.A., Paniego N., Fernandez P. ( 2016) Network and biosignature analysis for the integration of transcriptomic and metabolomic data to characterize leaf senescence process in sunflower. BMC Bioinformatics  17: 389– 398. Google Scholar CrossRef Search ADS PubMed  Mouradov A., Spangenberg G. ( 2014) Flavonoids: a metabolic network mediating plants adaptation to their real estate. Front. Plant Sci.  5: 1– 16. Google Scholar CrossRef Search ADS   Naqvi R.Z., Zaidi S.S., Akhtar K.P., Strickler S., Woldemariam M., Mishra B., et al.   ( 2017) Transcriptomics reveals multiple resistance mechanisms against cotton leaf curl disease in a naturally immune cotton species, Gossypium arboreum. Sci. Rep.  7: 15880. Google Scholar CrossRef Search ADS PubMed  Rieu I., Powers S.J. ( 2009) Real-time quantitative RT–PCR: design, calculations, and statistics. Plant Cell  21: 1031– 1033. Google Scholar CrossRef Search ADS PubMed  Ruan J., Dean A.K., Zhang W. ( 2010) A general co-expression network-based approach to gene expression analysis: comparison and applications. BMC Syst. Biol.  4: 8. Google Scholar CrossRef Search ADS PubMed  Schaart J.G., Dubos C., Romero De La Fuente I., van Houwelingen A.M., de Vos R.C., Jonker H.H., et al.   ( 2013) Identification and characterization of MYB–bHLH–WD40 regulatory complexes controlling proanthocyanidin biosynthesis in strawberry (Fragaria × ananassa) fruits. New Phytol.  197: 454– 467. Google Scholar CrossRef Search ADS PubMed  Sharon D., Tilgner H., Grubert F., Snyder M. ( 2013) A single-molecule long-read survey of the human transcriptome. Nat. Biotechnol.  31: 1009– 1014. Google Scholar CrossRef Search ADS PubMed  Sun R.Z., Cheng G., Li Q., He Y.N., Wang Y., Lan Y.B., et al.   ( 2017) Light-induced variation in phenolic compounds in cabernet sauvignon grapes (Vitis vinifera L.) involves extensive transcriptome reprogramming of biosynthetic enzymes, transcription factors, and phytohormonal regulators. Front. Plant Sci.  8: 547. Google Scholar CrossRef Search ADS PubMed  Tan M.P., Cheng D., Yang Y.N., Zhang G.Q., Qin M.J., Chen J., et al.   ( 2017) Co-expression network analysis of the transcriptomes of rice roots exposed to various cadmium stresses reveals universal cadmium-responsive genes. BMC Plant Biol.  17: 194. Google Scholar CrossRef Search ADS PubMed  Tohge T., Watanabe M., Hoefgen R., Fernie A.R. ( 2013) Shikimate and phenylalanine biosynthesis in the green lineage. Front. Plant Sci.  4: 1– 13. Google Scholar CrossRef Search ADS PubMed  Wang C.H., Yu J., Cai Y.X., Zhu P.P., Liu C.Y., Zhao A.C., et al.   ( 2016) Characterization and functional analysis of 4-coumarate: CoA ligase genes in Mulberry. PLoS One  11: e0155814. Google Scholar CrossRef Search ADS PubMed  Wiermann R. ( 1970) Die synthese von phenylpropanen während der Pollenentwicklung. Planta  95: 133– 145. Google Scholar CrossRef Search ADS PubMed  Xie W., Huang J., Liu Y., Rao J., Luo D., He M. ( 2015) Exploring potential new floral organ morphogenesis genes of Arabidopsis thaliana using systems biology approach. Front. Plant Sci.  6: 829. Google Scholar PubMed  Xie Y.M., Wang H.M., Shen L., Cui T.H., Ge J.R., Kou Q.A., et al.   ( 2004) Clinical study on effect of Qianggu capsule on primary osteoporosis (decrease in bone content). Chin. J. Inf. Tradit. Chin. Med . 11: 482– 488. Xu H.B., Song J.Y., Luo H.M., Zhang Y.J., Li Q.S., Zhu Y.J., et al.   ( 2016) Analysis of the genome sequence of the medicinal plant Salvia miltiorrhiza. Mol. Plant.  9: 949– 952. Google Scholar CrossRef Search ADS PubMed  Xu W.J., Dubos C., Lepiniec L. ( 2015) Transcriptional control of flavonoid biosynthesis by MYB–bHLH–WDR complexes. Trends Plant Sci . 20: 176– 185. Google Scholar CrossRef Search ADS PubMed  Xu W.J., Grain D., Bobet S., Gourrierec J.L., Thévenin J., Kelemen Z., et al.   ( 2014) Complexity and robustness of the flavonoid transcriptional regulatory network revealed by comprehensive analyses of MYB–bHLH–WDR complexes and their targets in Arabidopsis seed. New Phytol.  202: 132– 144. Google Scholar CrossRef Search ADS PubMed  Xu Z.C., Peters R.J., Weirather J., Luo H.M., Liao B.S., Zhang X., et al.   ( 2015) Full-length transcriptome sequences and splice variants obtained by a combination of sequencing platforms applied to different root tissues of Salvia miltiorrhiza and tanshinone biosynthesis. Plant J.  82: 951– 961. Google Scholar CrossRef Search ADS PubMed  Yang B., Chen G.X., Jiang D.S., Liu Z. ( 2010) The research progress of medicinal plant of Drynaria in China. Chin. Wild Plant Res . 29: 1– 6. Zhang Y.L., Jiang J.J., Shen H., Chai Y., Wei X., Xie Y.M. ( 2017) Total flavonoids from Rhizoma Drynariae (Gusuibu) for treating osteoporotic fractures: implication in clinical practice. Drug Des. Dev. Ther.  11: 1881– 1890. Google Scholar CrossRef Search ADS   Abbreviations Abbreviations bHLH basic helix–loop–helix CCS circular consensus sequencing C3H p-coumarate 3-hydroxylase C3'H p-coumarate 3'-hydroxylase C4H cinnamate 4-hydroxylase CHI chalcone isomerase CHS chalcone synthase CL leaf 4CL 4-coumarate CoA ligase CNR new rhizome COR old rhizome CV vein DEG differentially expressed gene FDR false discovery rate F3'H flavonoid-3'-hydroxylase GO Gene Ontology HCT hydroxycinnamoyl transferase KEGG Kyoto Encyclopedia of Genes and Genomes MS mass spectroscopy NNRG naringin/neoeriocitrin-related gene NNRGEP naringin/neoeriocitrin-related gene expression pattern NNSRG naringin/neoeriocitrin synthesis-related gene PAL phenylalanine ammonia lyase qRT–PCR quantitative real-time reverse transcription–PCR RNA-seq RNA sequencing R2Y 2-year-old plants R3Y 3-year-old plants R5Y 5-year-old plants R6Y 6-year-old plants R8Y 8-year-old plants R10Y 10-year-old plants SGS second-generation sequencing SMRT single-molecule real-time TCM traditional Chinese medicinal TF transcription factor WGCNA weighted gene co-expression network analysis © The Author(s) 2018. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists. All rights reserved. For permissions, please email: journals.permissions@oup.com This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices)

Journal

Plant and Cell PhysiologyOxford University Press

Published: Apr 12, 2018

There are no references for this article.

You’re reading a free preview. Subscribe to read the entire article.


DeepDyve is your
personal research library

It’s your single place to instantly
discover and read the research
that matters to you.

Enjoy affordable access to
over 18 million articles from more than
15,000 peer-reviewed journals.

All for just $49/month

Explore the DeepDyve Library

Search

Query the DeepDyve database, plus search all of PubMed and Google Scholar seamlessly

Organize

Save any article or search result from DeepDyve, PubMed, and Google Scholar... all in one place.

Access

Get unlimited, online access to over 18 million full-text articles from more than 15,000 scientific journals.

Your journals are on DeepDyve

Read from thousands of the leading scholarly journals from SpringerNature, Elsevier, Wiley-Blackwell, Oxford University Press and more.

All the latest content is available, no embargo periods.

See the journals in your area

DeepDyve

Freelancer

DeepDyve

Pro

Price

FREE

$49/month
$360/year

Save searches from
Google Scholar,
PubMed

Create lists to
organize your research

Export lists, citations

Read DeepDyve articles

Abstract access only

Unlimited access to over
18 million full-text articles

Print

20 pages / month

PDF Discount

20% off