Evolution and diversity of pre-mRNA splicing in highly reduced nucleomorph genomes

Eukaryotic genes are interrupted by introns that are removed in a conserved process known as pre-mRNA splicing. Though this process is well studied in select model organisms, we are only beginning to understand its variation and diversity across the tree of eukaryotes. We explored pre-mRNA splicing and other features of transcription in nucleomorphs, the highly reduced remnant nuclei of secondary endosymbionts. Strand-specific transcriptomes were sequenced from the cryptophyte Guillardia theta and the chlorarachniophyte Bigelowiella natans, whose plastids are derived from red and green algae, respectively. Both organisms exhibited elevated nucleomorph antisense transcription and gene expression relative to their respective nuclei, suggesting unique properties of gene regulation and transcriptional control in nucleomorphs. Marked differences in splicing were observed between the two nucleomorphs: the few introns of the G. theta nucleomorph were largely retained in mature transcripts, whereas the many short introns of the B. natans nucleomorph were spliced at typical eukaryotic levels (>90%). These differences in splicing levels could reflect the ancestries of the respective plastids, the different intron densities due to independent genome reduction events, or a combination of both. In addition to extending our understanding of the diversity of pre-mRNA splicing across eukaryotes, our study also indicates potential links between splicing, antisense transcription, and gene regulation in reduced genomes.

Key words: RNA-Seq, transcriptome, intron retention, cryptophyte, cryptomonad, chlorarachniophyte.

© The Author(s) 2018. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com

Wong et al. Genome Biol. Evol. 10(6):1573–1583. doi:10.1093/gbe/evy111. Advance Access publication June 1, 2018.

Introduction

The regulation and flow of information within a cell are vital processes. Many genes in eukaryotes are interrupted with intervening sequences known as introns, which are removed from transcripts via a ubiquitous process known as pre-mRNA splicing. A large complex of proteins and small nuclear RNAs (snRNAs) known as the spliceosome mediates this process. The proper assembly of the spliceosome and subsequent intron removal require conserved intronic sequence signals such as the 5′ splice site (most often “GU”), the 3′ splice site (most often “AG”), and a biochemically important branch point adenosine residue (Will and Lührmann 2011).

The presence of introns and conserved spliceosomal components across eukaryotes suggests that splicing is a mechanistically conserved process present in the last common ancestor of eukaryotes. Even organisms that have highly reduced or highly derived genomes can still have introns. Typically, these organisms have introns that are few in number, are short (30 bp or less), or are both. In organisms with intron-sparse genomes, the spliceosomes often possess a reduced set of components (Katinka et al. 2001; Grisdale et al. 2013; Stark et al. 2015), and studying such reduced systems could provide insight into the core mechanisms of splicing. Although rare, there are examples of genomes that have lost all introns (Lane et al. 2007; Cuomo et al. 2012).

Although splicing has been studied extensively in budding yeast and humans, it is assumed that this process occurs with little variation across eukaryotes. However, splicing has recently been analyzed in detail in two very different organisms with reduced genomes, extending our understanding of the diversity of pre-mRNA splicing. The microsporidian intracellular parasite Encephalitozoon cuniculi has an extremely tiny genome of 2.9 Mbp, with only 37 annotated introns (Katinka et al. 2001; Lee et al. 2010). The extremophilic red alga Cyanidioschyzon merolae also has a reduced genome (16.5 Mbp) and only 27 annotated introns (Matsuzaki et al. 2004; Stark et al. 2015). Transcriptomic studies of E. cuniculi have revealed a high frequency of intron retention; that is, a large proportion of mature mRNAs still retain introns (Grisdale et al. 2013). Also, both of these species have only a limited complement of spliceosomal proteins. Indeed, in C. merolae, one of the five major subcomplexes of the spliceosome is missing (Stark et al. 2015).

Antisense transcription is a relatively rare process in eukaryotes in which transcripts complementary to a “sense” gene are generated (Pelechano and Steinmetz 2013). Whereas most antisense transcripts are noncoding, some “antisense” transcripts are mRNAs of oppositely oriented, neighboring genes. Such transcripts are inherently more common when genes are especially close to each other, as in reduced genomes (Williams et al. 2005; Pelechano and Steinmetz 2013). Long thought to be only errant transcriptional noise, antisense transcription is now known to be another source of gene regulation (Wagner and Simons 1994; Vanhee-Brossollet and Vaquero 1998; Pelechano and Steinmetz 2013). This property could be particularly relevant for reduced genomes, especially if antisense transcription could compensate for a reduced set of transcription factors and regulatory elements in the genome. Antisense transcripts bound to complementary mRNA can target it for degradation, for example by microRNAs (miRNA) or small interfering RNAs (siRNA; Ghildiyal and Zamore 2009; Moazed 2009; Pelechano and Steinmetz 2013). A bound antisense transcript can also mask splicing signals, leading to intron retention and other alternative splicing events (Morrissy et al. 2011).

The most reduced eukaryotic genomes in nature are the nucleomorphs, relict nuclei found within secondary plastids of two distant lineages of algae (Keeling 2004). While a number of algal lineages have acquired photosynthesis by taking up an already-photosynthetic eukaryote as an endosymbiont, in only two independent lineages does the remnant endosymbiont nucleus (with an associated tiny genome) persist. The cryptophytes bear a plastid derived from a red alga, while the chlorarachniophytes acquired photosynthesis from a green alga. As the nucleomorph is derived from a eukaryotic nucleus, the genome has typical eukaryotic features, such as linear chromosomes, genes containing introns and pre-mRNA splicing. Nucleomorphs provide insight into the reduction of a eukaryotic genome through endosymbiosis (as opposed to parasitism, for example). The first two nucleomorphs sequenced were those of the cryptophyte Guillardia theta and the chlorarachniophyte Bigelowiella natans (Douglas et al. 2001; Gilson et al. 2006). Since then, additional nucleomorphs from both lineages have been sequenced (Lane et al. 2007; Tanifuji et al. 2011; Moore et al. 2012; Tanifuji, Onodera, Brown, et al. 2014; Suzuki et al. 2015). Nucleomorph genomes vary in size, but none has been found to be >1 Mbp. Interestingly, all nucleomorphs observed so far (from both cryptophytes and chlorarachniophytes) carry their tiny genomes on three short linear chromosomes (Douglas et al. 2001; Gilson et al. 2006; Lane et al. 2007; Tanifuji et al. 2011; Moore et al. 2012; Tanifuji, Onodera, Brown, et al. 2014; Suzuki et al. 2015). Whether this convergence has functional significance or is mere coincidence remains to be seen. As with any organellar genome, very few genes remain; many have been transferred to the host nucleus or lost. The vast majority of remaining nucleomorph genes are housekeeping genes, such as those for chaperone proteins, ribosomal proteins and proteins involved in DNA replication, along with the genes for rRNAs and an incomplete set of tRNAs (Douglas et al. 2001; Gilson et al. 2006; Lane et al. 2007; Tanifuji et al. 2011; Moore et al. 2012; Tanifuji, Onodera, Brown, et al. 2014; Suzuki et al. 2015).

Introns have been found in all but one of the nucleomorph genomes sequenced to date (Douglas et al. 2001; Gilson et al. 2006; Lane et al. 2007; Tanifuji et al. 2011; Moore et al. 2012; Tanifuji, Onodera, Brown, et al. 2014; Suzuki et al. 2015). However, the density of introns in the nucleomorph genomes of cryptophytes and chlorarachniophytes is quite different. For example, the cryptophyte G. theta has 485 tightly packed protein-coding genes in its 550 kbp nucleomorph genome, and only 17 of these genes are interrupted by introns, which range from 42 bp to 52 bp in length (Douglas et al. 2001). In contrast, the smaller (370 kbp) nucleomorph genome of B. natans has almost 900 extremely short (18–21 bp) introns that interrupt a majority of the 283 protein-coding nucleomorph genes (Gilson et al. 2006; Tanifuji, Onodera, Brown, et al. 2014). Whereas canonical 5′ and 3′ splice sites are present in these tiny introns, other commonly conserved splicing motifs such as the branch donor adenosine are not discernible (Gilson et al. 2006; Tanifuji, Onodera, Brown, et al. 2014).

There have been a number of studies on the peculiarities of transcription in nucleomorphs (Williams et al. 2005; Hirakawa et al. 2011; Hirakawa et al. 2014; Tanifuji, Onodera, Moore, et al. 2014; Suzuki et al. 2016; Sanita Lima and Smith 2017), although studies of the unique introns and pre-mRNA splicing of nucleomorphs are lacking. Here, we seek to understand pre-mRNA splicing in the highly reduced nucleomorphs by performing strand-specific RNA-Seq on mRNA extracted from both the cryptophyte G. theta and the chlorarachniophyte B. natans, and conducting an in-depth analysis and comparison of pre-mRNA splicing and antisense transcription. Our data reveal that although there are similarities between the nucleomorphs of G. theta and B. natans with respect to gene expression and antisense transcription, there are marked differences in the proportions of transcripts that remain unspliced. Whereas the intron-sparse nucleomorph of G. theta exhibits much intron retention, the many tiny introns of the B. natans nucleomorph are spliced at high levels, highlighting contrasting and possibly lineage-specific differences in the evolutionary outcomes of pre-mRNA splicing due to genome reduction. Furthermore, our study provides insight into the diversity of evolutionary trajectories due to genome reduction, and into the diverse nature of what is considered a ubiquitous and conserved eukaryotic process.

Materials and Methods

Culturing of G. theta and B. natans

Monoclonal cultures of G. theta strain CCMP 2712 and B. natans strain CCMP 621 were obtained from the National Center for Marine Algae and Microbiota (NCMA, formerly CCMP). Both organisms were cultured in 250 ml Erlenmeyer flasks using 50 ml of f/2-Si media for B. natans, and the same volume of media supplemented with 50 mM of NH4Cl for G. theta, as it is unusual in its requirement of ammonium as its nitrogen source (Hill and Wetherbee 1990). Cultures were agitated on a shaking platform rotating at 120 rpm, and were exposed to 30 µmol photons/m²/s of light for 12 h per day.

RNA Extraction

Guillardia theta cells were pelleted from 10 ml of culture spun down 6 h, or halfway, into the “daylight” phase of the light cycle. Total RNA was extracted from these pellets using the Ambion RNAqueous Kit (Life Technologies) with the manufacturer’s recommended protocol. The same volume of B. natans culture was spun down at the same time-point as the G. theta cultures, and total RNA extracts were prepared from the cell pellets using the TRIzol reagent (Ambion) under the manufacturer’s recommended conditions. Eluted total RNA samples were quantified using a NanoDrop spectrophotometer (Thermo Scientific).

RNA Cleanup and Poly-A Purification

Total RNA extracts were cleaned of gross DNA contamination using the Invitrogen DNA-free DNA Removal Kit (Life Technologies), and both RNA and DNA were further quantified using Invitrogen Qubit fluorometry (Life Technologies). Poly-A purification was performed using NEXTflex Poly(A) Beads (BioO Scientific) to enrich samples for mRNA and reduce the relative proportion of rRNA.

Strand-Specific Library Preparation and Second-Generation Sequencing

Two strand-specific libraries of G. theta were prepared as replicates using the NEXTflex Directional RNA-Seq Kit (BioO Scientific), which uses the dUTP method to maintain strand specificity. The libraries were prepared without modification to the manufacturer’s protocol. This method employs the addition of deoxyuridine triphosphate (dUTP) in place of deoxythymidine triphosphate (dTTP) during second-strand synthesis of reverse transcription, and subsequent digestion of uracil using uracil-DNA glycosylase (UDG) introduces breaks in the second strand, ensuring strand-specificity of the resultant sequenced fragments (Parkhomchuk et al. 2009; Wang et al. 2011). These two libraries were sequenced on an in-house Illumina HiSeq 2500, generating a total of 17,056,967 and 31,687,031 paired-end reads. The same process was repeated to generate two replicate libraries of B. natans, resulting in 65,432,233 and 62,505,229 paired-end reads.

Bioinformatics Analysis of Sequence Data

The resulting sets of reads were mapped using TopHat2 (Kim et al. 2013) to concatenated reference genomes of G. theta (GenBank accession AEIE00000000) and B. natans (GenBank accession ADNK00000000) that include all nucleomorph chromosomes, plastid and nuclear genomic sequences. Mapped read pairs in SAM format alignment files were then processed with custom Python scripts to sort them into sense or antisense read pairs based on existing gene annotations of G. theta and B. natans. Raw read counts were summed for each gene for use in downstream calculations of gene expression and antisense transcription levels. These counts were then normalized to determine relative expression levels using the FPKM (fragments per kilobase of exon per million reads mapped) method (Mortazavi et al. 2008).

For each annotated junction in G. theta and B. natans, further custom Python scripts were employed to enumerate the mapped read pairs in that vicinity according to the type of splicing event they represent: spliced transcripts, intron retention and other alternative splicing events. To ensure that the splicing events are real and not represented by spuriously mapped reads, a junction with <25 reads mapped to its vicinity was excluded from further analysis. Using read counts for spliced transcripts and intron-retained transcripts, we calculated the percent spliced reads for each annotated junction by dividing the number of canonically spliced reads by the total number of reads. We also performed a similar calculation for each annotated junction by dividing the number of intron-retained reads by total reads to generate percent intron retention, which totals 100% when summed with percent spliced reads. Because the introns in both the G. theta and B. natans nucleomorphs are much shorter than the read length, our calculated percent intron retention is a proxy for the percent spliced in (PSI) value used in alternative splicing studies. Percent intron retention values for nucleomorph junctions are provided as supplementary figures S1 and S2, Supplementary Material online.

A discrepancy exists between the number of introns we analyzed and the latest number of introns found in the nucleomorph of B. natans. Whereas 99 additional introns were presented in a recent analysis of the B. natans nucleomorph genome (Tanifuji, Onodera, Brown, et al. 2014), the available annotation files contain only the 852 introns annotated from the original genome sequencing project (Gilson et al. 2006), and these are the 852 B. natans nucleomorph introns analyzed here.

Results and Discussion

Increased Levels of Transcription in Both Cryptophyte and Chlorarachniophyte Nucleomorphs

The nucleomorphs of cryptophytes and chlorarachniophytes represent evolutionarily convergent structures, and our comparison of pre-mRNA splicing between nucleomorphs of both lineages offers insight into the evolution of a conserved eukaryotic process in two independent cases of genome reduction. To examine the process of pre-mRNA splicing in nucleomorphs, we performed strand-specific RNA-Seq on the cryptophyte G. theta and the chlorarachniophyte B. natans, and mapped the reads to their respective nucleomorph genomes. Because both G. theta and B. natans also have sequenced nuclear genomes (Curtis et al. 2012), we simultaneously mapped our RNA-Seq data to their respective nuclei to be used as examples of typical eukaryotic genomes. Using available genome annotations, we totaled the number of mapped reads for each gene representing expression, antisense transcription, intron excision or retention, alternative splicing, and so forth. We determined with these counts that the replicate RNA-Seq libraries were statistically similar (Pearson’s r = 0.95 for G. theta; r = 0.99 for B. natans), and all mapped reads from each species were pooled for our final transcriptome analyses. In total, 26,561,411 pooled reads were mapped to all G. theta genomic sequences, with 1,091,066 of those (4.1%) mapping to the nucleomorph. For B. natans, 62,768,143 pooled reads were mapped, and 6,690,653 of those (10.7%) mapped to the nucleomorph.

Previous transcriptomic studies on G. theta and B. natans have focused on gene expression. In those studies, researchers noted the high gene expression levels in nucleomorphs (Hirakawa et al. 2014; Tanifuji, Onodera, Moore, et al. 2014), and observed that virtually the entire nucleomorph genome is transcribed (Williams et al. 2005; Sanita Lima and Smith 2017). We assessed relative gene expression levels and compared them with those studies to ensure that our data and methodology were sound. We determined relative gene expression levels from our RNA-Seq data by normalizing counts of mapped reads across all (nuclear and nucleomorph) annotated genes in G. theta and B. natans using FPKM (fragments per kilobase of exon per million reads mapped). Tanifuji, Onodera, Moore, et al. (2014) showed that gene expression in G. theta nucleomorph genes was on average 2.6 times higher than in its nuclear genes. The nucleomorph genes of B. natans were shown to be expressed at levels almost 15 times higher than its nuclear genes (Tanifuji, Onodera, Moore, et al. 2014). Our RNA-Seq data showed very similar increases in nucleomorph gene expression (fig. 1): G. theta nuclear genes have an average FPKM of 24.9, whereas G. theta nucleomorph genes have a 3-fold higher relative expression with an average FPKM of 75.4. Likewise, an average FPKM of 317 in nucleomorph genes of B. natans versus 17.1 in nuclear genes suggests a 19-fold increase in relative gene expression (fig. 1). Taken together with results from G. theta, our expression data correspond well with previous studies, and we proceeded to examine further aspects of transcription that are accessible only through our use of strand-specific methodologies.

FIG. 1. Increased gene expression and antisense transcription are observed in nucleomorph genes of both Guillardia theta and Bigelowiella natans relative to their respective nuclear genes. (A) Gene Expression in G. theta: box plot of gene expression levels (sense FPKM) in nuclear and nucleomorph genes of G. theta. (B) Gene Expression in B. natans: box plot of gene expression levels (sense FPKM) in nuclear and nucleomorph genes of B. natans. (C) Antisense Transcription in G. theta: box plot of antisense transcription levels (antisense FPKM) in nuclear and nucleomorph genes of G. theta. (D) Antisense Transcription in B. natans: box plot of antisense transcription levels (antisense FPKM) in nuclear and nucleomorph genes of B. natans. (E) Summary table of mean and median FPKM and antisense FPKM of both nuclear and nucleomorph genes in G. theta and B. natans:
G. theta nuclear genes: sense FPKM mean 24.9, median 2.93; antisense FPKM mean 1.00, median 0.0181.
G. theta nucleomorph genes: sense FPKM mean 75.4, median 17.8; antisense FPKM mean 11.5, median 2.87.
B. natans nuclear genes: sense FPKM mean 17.1, median 5.24; antisense FPKM mean 0.618, median 0.
B. natans nucleomorph genes: sense FPKM mean 317, median 207; antisense FPKM mean 19.1, median 3.88.

Antisense transcription has not been previously analyzed in nucleomorphs, and we used our strand-specific RNA-Seq data to investigate the extent of antisense transcription in both the nucleomorph and nuclear genomes of G. theta and B. natans. Reduced genomes are thought to exhibit more antisense transcription, as intergenic spaces are small and transcripts of neighboring and oppositely oriented genes are likely to overlap (Williams et al. 2005; Pelechano and Steinmetz 2013). Therefore, we predicted that the nucleomorphs of G. theta and B. natans would exhibit more antisense transcription than their respective nuclei. To determine relative antisense transcription levels, FPKM was calculated from mapped antisense reads for all annotated genes. Indeed, more antisense transcription occurs in the nucleomorphs of both G. theta and B. natans than in their respective nuclei (fig. 1).

The increased levels of antisense transcription in nucleomorphs compared with their host nuclei suggest the provocative possibility that these transcripts could be playing a functional role in gene regulation in the nucleomorph. While Tanifuji, Onodera, Moore, et al. (2014) suggest that increased nucleomorph gene expression could be a compensatory mechanism against high levels of errant antisense transcription, the converse could also be true: antisense transcripts could down-regulate the massively increased levels of gene expression. Given the extremely reduced set of genes within the nucleomorph, the nucleomorph genome could have dispensed with many of the factors necessary for finer transcriptional regulation. Indeed, based on sequence similarity to known transcription-related proteins, the nucleomorph genome of G. theta is predicted to harbor only a small complement of 30 such proteins, while the B. natans nucleomorph genome has even fewer at 22 (Douglas et al. 2001; Gilson et al. 2006). Although some putative nucleomorph-targeted transcription factors have been identified for both G. theta and B. natans, the full contribution of nuclear-encoded transcription-associated proteins is unknown (Curtis et al. 2012). Regardless, the extremely short intergenic regions of the G. theta and B. natans nucleomorphs present considerable limitations on the position and sequences of regulatory motifs, especially if those motifs are involved in suppressing transcription, resulting in poorly controlled and uniformly high expression of nucleomorph genes and a possible reliance on antisense transcription as an alternate mechanism of gene regulation. This is supported by a recent study revealing that only two nucleomorph genes of B. natans are differentially expressed between light and dark cycles, suggesting a lack of transcriptional regulation (Suzuki et al. 2016). Others have suggested that overexpression of nucleomorph genes could be compensating for missplicing, or increased intron retention, which may be more common in a reduced genome (Grisdale et al. 2013; Tanifuji, Onodera, Moore, et al. 2014). As discussed later, our splicing analyses partially support this idea, raising questions about the
potential interplay of overexpression, antisense transcription and pre-mRNA splicing within reduced genomes.

Increased Intron Retention in G. theta Nucleomorph Genes

Our study provides the first analysis of pre-mRNA splicing in a cryptophyte nucleomorph, and the first comparison of splicing between the independently reduced cryptophyte and chlorarachniophyte nucleomorphs. While our RNA-Seq data showed that high expression is a feature common to both the G. theta and B. natans nucleomorphs, we observe significant differences in the patterns of pre-mRNA splicing between the two. Only 17 introns have been annotated in the nucleomorph genome of G. theta, each one located in a different gene, all with a noticeable bias towards the 5′ end of the gene (Douglas et al. 2001). Previous transcriptomic studies on another reduced genome have shown that relatively few introns remain in the genome, and these introns are often retained in mature transcripts (Grisdale et al. 2013). Considering the low intron density and overall level of genome reduction in the nucleomorph of G. theta, we expected that high levels of intron retention would also be observed in these 17 introns. We examined the splicing levels for 16 of the 17 annotated nucleomorph introns; the remaining intron was located on a gene with poor read coverage and did not meet our cut-off criterion. As shown in figure 2, intron retention is prevalent: on average, only around 60% of the reads mapped to the vicinity of introns are spliced. This is significant, as the only other report of unusual intron retention is from an RNA-Seq study on the highly reduced microsporidian E. cuniculi, where intron retention was extensive (Grisdale et al. 2013). To rule out the possibility that the increased intron retention seen in the G. theta nucleomorph reflected technical issues, we also calculated the percent of spliced reads at G. theta nuclear junctions (supplementary fig. S3, Supplementary Material online). As expected for a typical eukaryotic genome, nuclear gene transcripts of G. theta are spliced at high levels with little intron retention. In the G. theta nucleomorph, the percent of spliced reads varies between the introns, from the intron for the replication factor rfC being spliced at <20%, to a number of other genes where intron retention is not extensive and nearly 90% of reads are spliced, such as the ribosomal proteins rps16, rpl19, and rps13 (fig. 2). This range of splicing levels might suggest that introns of certain genes are spliced at higher levels than others; however, there does not appear to be any correlation between the extent of intron retention and the function of the gene in which the intron lies.

FIG. 2. Pre-mRNA splicing levels in Guillardia theta nucleomorph genes (bar chart; x axis: nucleomorph gene; y axis: percent spliced reads). For each intron in the nucleomorph, the percentage of spliced reads was calculated in two different ways: one including reads mapped to the antisense strand to simulate a traditional nonstranded RNA-Seq (light bars), and one excluding these antisense reads (dark bars). The genes are arranged left to right in order of increasing effect of antisense reads on the calculated percent spliced reads, highlighted by the difference in heights of the bars. Nucleomorph gene orf183 had <50 reads mapped across its junction, while the junction from orf263 was excluded from this analysis due to poor coverage.

The low splicing levels seen in the highly reduced G. theta nucleomorph are similar to patterns seen in a previous transcriptomic study of the microsporidian parasite E. cuniculi (Grisdale et al. 2013), consistent with the hypothesis that one of the effects of genome and spliceosome reduction is a reduction in splicing levels of the few remaining introns. It is possible that increased intron retention represents missplicing due to poor recognition of reduced introns (with reduced, divergent, or missing splicing motifs) by the spliceosome. Jaillon et al. (2008) found that the short introns (20–34 bp) of the ciliate Paramecium tetraurelia possess reduced splicing signals, and suggested with some support that these introns are not removed efficiently from transcripts. However, the 17 annotated introns in the G. theta nucleomorph have clearly defined splice sites and a well-defined branch point adenosine (Douglas et al. 2001), and we did not find any discernible correlations between features of the intron sequence and the propensity for its removal. In fact, the 5′ splice site is well-conserved across all 17 G. theta nucleomorph introns (Douglas et al. 2001), in line with other studies showing that genomes with low intron density tend towards stronger, rather than weaker, splicing motifs (Irimia et al. 2007, 2009; Irimia and Roy 2008; Lee et al. 2010). It has also been suggested that an incomplete set of spliceosomal components in reduced genomes results in increased missplicing and intron retention (Grisdale et al. 2013). Although a small number of spliceosomal components and one of the snRNAs have been identified within the nucleomorph genome (Douglas et al. 2001; Lopez et al. 2008), the full contribution of spliceosomal components from nuclear-encoded genes is unclear (Curtis et al. 2012). Regardless, because our data show that not all of the 17 introns exhibit extensive intron retention, a functional spliceosome must be present. With so few, short introns remaining, one wonders why they have not been dispensed with altogether given the high level of genome reduction. Although it is possible that their positional bias within the gene prevents their loss, it is also possible that they play a regulatory role that contributes to their persistence. For example, retained introns in mature transcripts, especially if splicing levels are actively regulated by the organism, could act as a form of post-transcriptional regulation, preventing overly abundant transcripts from being translated (Lewis et al. 2003; Lareau et al. 2007). Another possibility could be interplay between antisense transcription and splicing, as has been previously documented (Morrissy et al. 2011; Pelechano and Steinmetz 2013).

Our strand-specific methodology allows us to explore these potential links between antisense transcription and pre-mRNA splicing. We have shown that transcription occurs at high levels in nucleomorphs, in line with previous studies exploring nucleomorph gene expression (Hirakawa et al. 2014; Tanifuji, Onodera, Moore, et al. 2014). However, in those previous studies, there was no way to reliably discern how many of the transcripts were antisense. Conventional methods of reverse transcription do not preserve the strandedness of the original transcript, and the lack of strand specificity of the sequence data renders antisense transcripts indistinguishable from “sense” mRNA to read mapping programs; antisense reads mapping to an annotated intron would thus appear as intron retention. Our strand-specific RNA-Seq data allow us to compare methodologies to determine if this is a legitimate concern, in addition to providing opportunities to assess any potential effects of antisense transcription on pre-mRNA splicing. We were able to make two calculations of percent transcripts spliced: one simulated results from a conventional RNA-Seq experiment by calculating the percentage of spliced transcripts from all sense and antisense reads, and a second calculation excluded the antisense reads. The differences between the calculations of percent spliced transcripts for each junction are represented in figure 2. A number of introns (on the right side of fig. 2), such as those in the genes for ribosomal protein rps27 or growth factor ebi, have high levels of antisense transcripts that would confound calculations of percent spliced transcripts. The intron in ebi is most affected; antisense reads comprise nearly half of the mapped reads in the vicinity of its intron. Without strand data for the reads, the percent of spliced reads would be 31% as opposed to 60%, severely overestimating the occurrence of intron retention for this gene. However, most other introns do not appear to have enough antisense reads to cause an overestimation of the extent of intron retention; the proportion of antisense reads mapping to regions within the intron is <10% for more than half of the G. theta nucleomorph introns. In fact, no antisense reads were mapped to the vicinity of the introns in rps16, rpl19, or rps17 (fig. 2). However, because reduced genomes tend to have very few introns, even a few overestimations can skew the overall impression of the extent of intron retention. Thus, it is worthwhile to consider strand-specific libraries for any RNA-Seq experiment for organisms with reduced genomes.

Our transcriptomic data from the G. theta nucleomorph revealed that despite high levels of antisense transcription across the entire nucleomorph genome (see above), antisense transcription occurs significantly less frequently (Welch’s unequal variances t-test, P < 0.001) in the 17 intron-containing genes than in all the other single-exon genes. Although there are limitations to drawing robust statistical conclusions from only 17 introns, this observation could suggest that in the G. theta nucleomorph, antisense transcripts complementary to multi-exon genes are actively down-regulated to allow for proper splicing to occur. This could highlight a potential link between antisense transcription and pre-mRNA splicing, requiring further investigation. It has previously been suggested that antisense transcripts occlude splicing signals, leading to intron retention, and indeed a positive correlation has been observed in animals between antisense transcription and alternatively spliced genes (Morrissy et al. 2011; Pelechano and Steinmetz 2013). Although untangling the links between gene expression, antisense transcription and pre-mRNA splicing in such a reduced genome presents many technical challenges, this interplay within the G. theta nucleomorph could not only represent an unusual network of regulatory mechanisms shared with other reduced genomes, but also provide an evolutionary explanation for the persistence of introns and their associated spliceosome.

Near-typical Eukaryotic Splicing Levels in B. natans Nucleomorph Genes

The independently reduced nucleomorph of the chlorarachniophyte B. natans stands in remarkable contrast to that of the cryptophyte G. theta. To investigate pre-mRNA splicing patterns in B. natans further, we calculated the percent of spliced
reads at all annotated introns of both the nuclear and nucleomorph genomes of B. natans using the same methods and criteria as G. theta. As with G. theta nuclear genes, reads mapped to the B. natans nucleus are spliced with a median percent spliced reads higher than 90%, indicating little intron retention (supplementary fig. S4, Supplementary Material online).

With only 283 annotated genes and nearly 900 introns (see Materials and Methods for clarification) in the B. natans nucleomorph genome, the genes are densely populated with extremely short (18–21 bp) introns (Gilson et al. 2006; Tanifuji, Onodera, Brown, et al. 2014). The average intron density is 3.1 introns per gene, and only 43 of the 283 annotated genes lack introns (Gilson et al. 2006). However, based on our current understanding of conserved mechanisms of splicing, these short intron lengths place constraints on this process. The introns of B. natans nucleomorph genes do not have a discernible branch point adenosine (Gilson et al. 2006), and considering the intron lengths, it is difficult to imagine the formation of lariat structures typical of spliceosomal intron removal. Also, aside from the GU-AG boundaries, no other motifs are apparent (Gilson et al. 2006). Finally, the typical eukaryotic spliceosome (or even a reduced version) is a very large conglomerate of proteins that likely dwarfs these tiny introns. On the basis of these unusual aspects, and extensive intron retention in other reduced genomes, we expected to observe low levels of splicing in the B. natans nucleomorph.

Using our minimum read coverage criterion as described in the Materials and Methods, we analyzed splicing levels of 829 introns from the B. natans nucleomorph. Surprisingly, the calculated splicing levels of B. natans nucleomorph introns follow a pattern more like its respective nucleus than other reduced genomes (fig. 3). More than half of all annotated nucleomorph introns have 90–100% of junction-mapped reads spliced, and only 12% of all annotated introns have <80% of junction-mapped reads spliced. Furthermore, for the vast majority of these introns, antisense reads comprise <10% of the mapped reads, and do not contribute to any significant overestimation of intron retention (supplementary File S5, Supplementary Material online). As with the G. theta nucleomorph, we were unable to correlate splicing levels of a particular gene's introns and that gene's proposed function. Furthermore, given the high levels of splicing across all introns, it is less clear if these tiny introns could play a functional role. However, the possibility remains that under certain conditions, such as stress, nucleomorph splicing could be regulated.

FIG. 3. Bigelowiella natans nucleomorph introns are spliced at high levels. Histogram showing the proportion of junctions at different levels of percent spliced reads. The vast majority of junctions have >80% spliced reads. Mean percent spliced: 87.6%; median percent spliced: 92.8%.

This "proficiency" in pre-mRNA splicing we observed means that the relatively high nucleomorph gene expression in B. natans is not compensating for excessive missplicing, as suggested by Tanifuji, Onodera, Moore, et al. (2014). Instead, near-typical eukaryotic splicing levels in the highly reduced B. natans nucleomorph genome could be a consequence of having a relatively high intron density similar to other free-living green algae (Slamovits and Keeling 2009), where extensive intron retention would likely be deleterious. While there are 24 predicted spliceosomal components in the B. natans nucleomorph (Gilson et al. 2006; Curtis et al. 2012), this is a small fraction of the number of components required to form the familiar spliceosome that is conserved amongst other eukaryotes, even with contributions from nuclear-encoded products. Taken together with the unusual sequence features of its introns, our data suggest that the B. natans nucleomorph must use a novel or highly divergent mechanism to effectively remove its extremely short introns with great accuracy. It has been suggested that the length and sequence of the B. natans nucleomorph introns may play a role in efficient identification and removal, as the range of lengths is very narrow (18–21 bp), and intronic sequences have a marked AU bias (Gilson et al. 2006). Whereas an EST-based study of the chlorarachniophyte Gymnochlora stellata suggested that splicing levels of nucleomorph introns are correlated with their lengths (Slamovits and Keeling 2009), we find no significant differences in splicing levels across B. natans nucleomorph introns (supplementary File S5, Supplementary Material online). Furthermore, the only identifiable splicing sequence signals in B. natans nucleomorph introns (the 5′ and 3′ splice sites) are essentially invariable (Gilson et al. 2006). Indeed, Gilson et al. (2006) propose that the B. natans nucleomorph spliceosome could operate as a molecular "caliper," excising 18–21 base-pair-long AU-rich regions bound by canonical splice sites. Given its very typical eukaryotic splicing levels in a highly reduced genome with such short introns, pre-mRNA splicing in the B. natans nucleomorph merits further biochemical and proteomic studies to elucidate this process and allow comparison with canonical splicing.

Evolution of Pre-mRNA Splicing in Reduced Eukaryotes

Our analyses of splicing in the nucleomorphs of G. theta and B. natans highlight major differences in the patterns of pre-mRNA splicing in highly reduced genomes. Although the trend of increased intron retention in reduced genomes was seen in the highly reduced (but intron-sparse) nucleomorph of G. theta, transcripts from the even tinier (but intron-dense) B. natans nucleomorph were spliced at levels seen in most other eukaryotes. This stark contrast between the two nucleomorphs could simply indicate that splicing levels are influenced by intron density. However, the difference in splicing levels could also reflect the evolutionary ancestries of the secondary plastids. These two possibilities are not mutually exclusive, and their resolution would be useful for generalizing the patterns of pre-mRNA splicing across cryptophyte and chlorarachniophyte nucleomorphs and other reduced eukaryotic genomes. As discussed previously, it would be deleterious for an intron-dense organism to exhibit extensive intron retention, even under the evolutionary pressures of genome reduction. This is supported not only by our splicing analysis of the B. natans nucleomorph, but also by existing EST analyses on nucleomorph transcripts from another chlorarachniophyte, G. stellata (Slamovits and Keeling 2009). In that study, a number of G. stellata nucleomorph genes were found to have densities of very short introns similar to those of the B. natans nucleomorph, and most of those introns were removed in 80–100% of transcripts (Slamovits and Keeling 2009). However, there are genomes where intron density is rather low, yet intron retention is not widespread. The most notable of these is the budding yeast Saccharomyces cerevisiae, whose genome is relatively small, and in which most research on the biochemistry of pre-mRNA splicing has been conducted. No reports exist of widespread intron retention in S. cerevisiae, and Grisdale et al. (2013) showed in a splicing analysis using existing S. cerevisiae RNA-Seq data that the vast majority of the 253 introns are spliced at high levels. Thus, while intron density could be a very strong determinant of intron retention in reduced genomes, other biological or evolutionary factors are also involved.

Annotations from other sequenced nucleomorph genomes show that all cryptophyte nucleomorphs are intron-sparse, while chlorarachniophyte nucleomorphs have very many tiny introns (Lane et al. 2007; Tanifuji et al. 2011; Moore et al. 2012; Tanifuji, Onodera, Brown, et al. 2014; Suzuki et al. 2015). The chlorarachniophyte plastid was derived from a green alga, and its nucleomorph's intron density is similar to those of Arabidopsis thaliana and Chlamydomonas reinhardtii (Gilson et al. 2006; Slamovits and Keeling 2009). There are also clear indications that some of the annotated chlorarachniophyte introns bear positional homology to those in extant green algae (Gilson et al. 2006; Slamovits and Keeling 2009). The cryptophyte plastid, on the other hand, is derived from a red alga. Current genomic data repeatedly suggest that free-living red algae are generally gene- and intron-poor, hinting at a possible ancient genome reduction event before the radiation of rhodophytes (Qiu et al. 2015). Should this be true, the red algal endosymbiont ancestor of the cryptophyte plastid would have already been reduced with respect to intron density and the number of spliceosomal components. Consequently, extant red algae may also exhibit pre-mRNA splicing patterns similar to what we observed in the G. theta nucleomorph. Indeed, in the extremophilic red alga C. merolae, extensive intron retention is observed in its 27 annotated introns (Grisdale CJ, unpublished data), and only a relatively small complement of spliceosomal components remains: the U1 snRNP, the subunit of the spliceosome responsible for the recognition of the 5′ splice site, is entirely missing from the genome (Matsuzaki et al. 2004; Stark et al. 2015). However, C. merolae could represent a highly derived lineage of red algae, and pre-mRNA splicing remains to be studied in detail in mesophilic rhodophytes. Further sampling of red algae and cryptophyte nucleomorphs will be required to determine if evolutionary ancestry is responsible for the differences in pre-mRNA splicing in the highly reduced nucleomorphs of cryptophytes and chlorarachniophytes.

Conclusions

The nucleomorphs of cryptophytes and chlorarachniophytes provide unique opportunities to compare the evolution and diversity of conserved eukaryotic processes within the context of genome reduction. Our data reveal similar patterns of high gene expression and high antisense transcription in both nucleomorphs. We also observed differences in levels of antisense transcription around junctions in G. theta, suggesting potential links between antisense transcription and pre-mRNA splicing. The marked differences observed in pre-mRNA splicing between the nucleomorphs highlight the diversity of what is considered to be a conserved process across eukaryotes, and raise awareness of the value of investigating splicing in the lesser-studied branches of the eukaryotic tree. Further investigations of the nature and mechanisms of pre-mRNA splicing in reduced genomes will provide valuable insight and improve our understanding of this key eukaryotic process.

Supplementary Material

Supplementary data are available at Genome Biology and Evolution online.

Acknowledgments

This work was supported by a Natural Sciences and Engineering Research Council (NSERC) of Canada Discovery Grant [262988 to N.M.F.] and a grant to the Centre for Microbial Diversity and Evolution from the Tula Foundation. The authors also acknowledge D. Tack (formerly University of British Columbia, currently Penn State) for his custom scripts used in the analysis of our RNA-Seq data, and S. Rader (University of Northern British Columbia) and T. Whelan (University of British Columbia) for helpful discussions and comments.

Literature Cited

Cuomo CA, et al. 2012. Microsporidian genome analysis reveals evolutionary strategies for obligate intracellular growth. Genome Res. 22(12):2478–2488.
Curtis BA, et al. 2012. Algal genomes reveal evolutionary mosaicism and the fate of nucleomorphs. Nature 492(7427):59–65.
Douglas S, et al. 2001. The highly reduced genome of an enslaved algal nucleus. Nature 410(6832):1091–1096.
Gilson PR, et al. 2006. Complete nucleotide sequence of the chlorarachniophyte nucleomorph: nature's smallest nucleus. Proc Natl Acad Sci USA. 103(25):9566–9571.
Grisdale CJ, Bowers LC, Didier ES, Fast NM. 2013. Transcriptome analysis of the parasite Encephalitozoon cuniculi: an in-depth examination of the pre-mRNA splicing in a reduced eukaryote. BMC Genomics. 14(1):207–215.
Ghildiyal M, Zamore PD. 2009. Small silencing RNAs: an expanding universe. Nat Rev Genet. 10(2):94–108.
Hill DRA, Wetherbee R. 1990. Guillardia theta gen. et sp. nov. (Cryptophyceae). Can J Bot. 68(9):1873–1876.
Hirakawa Y, Burki F, Keeling PJ. 2011. Nucleus- and nucleomorph-targeted histone proteins in a chlorarachniophyte alga. Mol Microbiol. 80(6):1439–1449.
Hirakawa Y, Suzuki S, Archibald JM, Keeling PJ, Ishida K-I. 2014. Overexpression of molecular chaperone genes in nucleomorph genomes. Mol Biol Evol. 31(6):1437–1443.
Irimia M, Penny D, Roy SW. 2007. Coevolution of genomic intron number and splice sites. Trends Genet. 23(7):321–325.
Irimia M, Roy SW. 2008. Evolutionary convergence on highly-conserved 3′ intron structures in intron-poor eukaryotes and insights into the ancestral eukaryotic genome. PLoS Genet. 4(8):e1000148.
Irimia M, et al. 2009. Complex selection on 5′ splice sites in intron-rich organisms. Genome Res. 19(11):2021–2027.
Jaillon O, et al. 2008. Translational control of intron splicing in eukaryotes. Nature 451(7176):359–362.
Katinka MD, et al. 2001. Genome sequence and gene compaction of the eukaryote parasite Encephalitozoon cuniculi. Nature 414(6862):450–453.
Keeling PJ. 2004. Diversity and evolutionary history of plastids and their hosts. Am J Bot. 91(10):1481–1493.
Kim D, et al. 2013. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 14(4):R36.
Lane CE, et al. 2007. Nucleomorph genome of Hemiselmis andersenii reveals complete intron loss and compaction as a driver of protein structure and function. Proc Natl Acad Sci USA. 104(50):19908–19913.
Lareau LF, Inada M, Green RE, Wengrod JC, Brenner SE. 2007. Unproductive splicing of SR genes associated with highly conserved and ultraconserved DNA elements. Nature 446(7138):926–929.
Lee RCH, Gill EE, Roy SW, Fast NM. 2010. Constrained intron structures in a microsporidian. Mol Biol Evol. 27(9):1979–1982.
Lewis BP, Green RE, Brenner SE. 2003. Evidence for the widespread coupling of alternative splicing and nonsense-mediated mRNA decay in humans. Proc Natl Acad Sci USA. 100(1):189–192.
Lopez MD, Alm Rosenblad M, Samuelsson T. 2008. Computational screen for spliceosomal RNA genes aids in defining the phylogenetic distribution of major and minor spliceosomal components. Nucleic Acids Res. 36(9):3001–3010.
Matsuzaki M, et al. 2004. Genome sequence of the ultrasmall unicellular red alga Cyanidioschyzon merolae 10D. Nature 428(6983):653–657.
Moazed D. 2009. Small RNAs in transcriptional gene silencing and genome defence. Nature 457(7228):413–420.
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B. 2008. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods. 5(7):621–628.
Moore CE, Curtis B, Mills T, Tanifuji G, Archibald JM. 2012. Nucleomorph genome sequence of the cryptophyte alga Chroomonas mesostigmatica CCMP1168 reveals lineage-specific gene loss and genome complexity. Genome Biol Evol. 4(11):1162–1175.
Morrissy AS, Griffith M, Marra MA. 2011. Extensive relationship between antisense transcription and alternative splicing in the human genome. Genome Res. 21(8):1203–1212.
Parkhomchuk D, et al. 2009. Transcriptome analysis by strand-specific sequencing of complementary DNA. Nucleic Acids Res. 37(18):e123.
Pelechano V, Steinmetz LM. 2013. Gene regulation by antisense transcription. Nat Rev Genet. 14(12):880–893.
Qiu H, Price DC, Yang EC, Yoon HS, Bhattacharya D. 2015. Evidence of ancient genome reduction in red algae (Rhodophyta). J Phycol. 51(4):624–636.
Sanitá Lima M, Smith DR. 2017. Pervasive transcription of mitochondrial, plastid, and nucleomorph genomes across diverse plastid-bearing species. Genome Biol Evol. 9(10):2650–2657.
Slamovits CH, Keeling PJ. 2009. Evolution of ultrasmall spliceosomal introns in highly reduced nuclear genomes. Mol Biol Evol. 26(8):1699–1705.
Stark MR, et al. 2015. Dramatically reduced spliceosome in Cyanidioschyzon merolae. Proc Natl Acad Sci USA. 112(11):E1191–E1200.
Suzuki S, Shirato S, Hirakawa Y, Ishida K-I. 2015. Nucleomorph genome sequences of two chlorarachniophytes, Amorphochlora amoebiformis and Lotharella vacuolata. Genome Biol Evol. 7(6):1533–1545.
Suzuki S, Ishida K-I, Hirakawa Y. 2016. Diurnal transcriptional regulation of endosymbiotically derived genes in the chlorarachniophyte Bigelowiella natans. Genome Biol Evol. 8(9):2672–2682.
Tanifuji G, et al. 2011. Complete nucleomorph genome sequence of the nonphotosynthetic alga Cryptomonas paramecium reveals a core nucleomorph gene set. Genome Biol Evol. 3:44–54.
Tanifuji G, Onodera NT, Moore CE, et al. 2014. Reduced nuclear genomes maintain high gene transcription levels. Mol Biol Evol. 31(3):625–635.
Tanifuji G, Onodera NT, Brown MW, et al. 2014. Nucleomorph and plastid genome sequences of the chlorarachniophyte Lotharella oceanica: convergent reductive evolution and frequent recombination in nucleomorph-bearing algae. BMC Genomics 15(1):374.
Vanhee-Brossollet C, Vaquero C. 1998. Do natural antisense transcripts make sense in eukaryotes? Gene 211(1):1–9.
Wagner EG, Simons RW. 1994. Antisense RNA control in bacteria, phages, and plasmids. Annu Rev Microbiol. 48:713–742.
Wang L, et al. 2011. A low-cost library construction protocol and data analysis pipeline for Illumina-based strand-specific multiplex RNA-seq. PLoS One 6(10):e26426.
Will CL, Lührmann R, et al. 2011. Spliceosome structure and function. Acc Chem Res. 44(12):1292–1301.
Williams BA, Slamovits CH, Patron NJ, Fast NM, Keeling PJ. 2005. A high frequency of overlapping gene expression in compacted eukaryotic genomes. Proc Natl Acad Sci USA. 102(31):10936–10941.

Associate editor: Michelle Meyer

Genome Biology and Evolution, Oxford University Press
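The two percent-spliced calculations discussed for the G. theta nucleomorph (one pooling sense and antisense reads, as a conventional unstranded library would, and one excluding antisense reads) can be illustrated with a minimal Python sketch. This is not the authors' custom script: the `JunctionCounts` layout, the function names, and the example counts are all hypothetical, and the sketch assumes that every antisense read near an intron would be scored as an unspliced read when strand information is missing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JunctionCounts:
    """Read counts near one annotated intron (hypothetical layout)."""
    spliced_sense: int    # sense reads that span the exon-exon junction
    unspliced_sense: int  # sense reads that cover the retained intron
    antisense: int        # reads from the opposite strand in the same region


def percent_spliced_stranded(c: JunctionCounts) -> float:
    """Percent spliced from sense reads only (antisense reads excluded)."""
    sense_total = c.spliced_sense + c.unspliced_sense
    if sense_total == 0:
        raise ValueError("no sense reads at this junction")
    return 100.0 * c.spliced_sense / sense_total


def percent_spliced_unstranded(c: JunctionCounts) -> float:
    """Percent spliced when strand is unknown.

    Assumption: each antisense read is indistinguishable from an
    unspliced sense read, so it is added to the denominator.
    """
    total = c.spliced_sense + c.unspliced_sense + c.antisense
    if total == 0:
        raise ValueError("no reads at this junction")
    return 100.0 * c.spliced_sense / total


# Invented counts, chosen only so that antisense reads make up nearly
# half of all reads, as described for the ebi intron; not real data.
example = JunctionCounts(spliced_sense=60, unspliced_sense=40, antisense=94)
print(f"stranded:   {percent_spliced_stranded(example):.1f}%")
print(f"unstranded: {percent_spliced_unstranded(example):.1f}%")
```

With these invented counts the stranded estimate is 60% and the unstranded estimate is about 31%, which mirrors the size of the discrepancy reported for ebi and shows how pooled antisense reads inflate apparent intron retention.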

Evolution and diversity of pre-mRNA splicing in highly reduced nucleomorph genomes

Publisher
Oxford University Press
Copyright
© The Author(s) 2018. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
ISSN
1759-6653
eISSN
1759-6653
D.O.I.
10.1093/gbe/evy111

Abstract

Eukaryotic genes are interrupted by introns that are removed in a conserved process known as pre-mRNA splicing. Though well- studied in select model organisms, we are only beginning to understand the variation and diversity of this process across the tree of eukaryotes. We explored pre-mRNA splicing and other features of transcription in nucleomorphs, the highly reduced remnant nuclei of secondary endosymbionts. Strand-specific transcriptomes were sequenced from the cryptophyte Guillardia theta and the chlor- arachniophyte Bigelowiella natans, whose plastids are derived from red and green algae, respectively. Both organisms exhibited elevated nucleomorph antisense transcription and gene expression relative to their respective nuclei, suggesting unique properties of gene regulation and transcriptional control in nucleomorphs. Marked differences in splicing were observed between the two nucleomorphs: the few introns of the G. theta nucleomorph were largely retained in mature transcripts, whereas the many short introns of the B. natans nucleomorph are spliced at typical eukaryotic levels (>90%). These differences in splicing levels could be reflecting the ancestries of the respective plastids, the different intron densities due to independent genome reduction events, or a combination of both. In addition to extending our understanding of the diversity of pre-mRNA splicing across eukaryotes, our study also indicates potential links between splicing, antisense transcription, and gene regulation in reduced genomes. Key words: RNA-Seq, transcriptome, intron retention, cryptophyte, cryptomonad, chlorarachniophyte. Introduction Typically, these organisms have introns that are few in num- The regulation and flow of information within a cell are vital ber, are short (30 bp or less), or are both. In organisms with processes. 
Many genes in eukaryotes are interrupted with in- intron-sparse genomes, their spliceosomes often possess a tervening sequences known as introns, which are removed reduced set of components (Katinka et al. 2001; Grisdale from transcripts via a ubiquitous process known as pre-mRNA et al. 2013; Stark et al. 2015), and studying such reduced splicing. A large complex of proteins and small nuclear RNAs systems could provide insight into the core mechanisms of (snRNAs) known as the spliceosome mediates this process. splicing. Although rare, there are examples of genomes that The proper assembly of the spliceosome and subsequent in- have lost all introns (Lane et al. 2007; Cuomo et al. 2012). tron removal require conserved intronic sequence signals such Although splicing has been studied extensively in budding 0 0 as the 5 splice site (most often “GU”), the 3 splice site (most yeast and humans, it is assumed that this process occurs with often “AG”), and a biochemically important branch point little variation across eukaryotes. However, splicing has re- adenosine residue (Will and Lu ¨ hrmann 2011). cently been analyzed in detail in two very different organisms The presence of introns and conserved spliceosomal com- with reduced genomes, extending our understanding of the ponents across eukaryotes suggests that splicing is a mecha- diversity of pre-mRNA splicing. The microsporidian intracellu- nistically conserved process present in the last common lar parasite Encephalitozoon cuniculi has an extremely tiny ancestor of eukaryotes. Even organisms that have highly re- genome of 2.9 Mbp, with only 37 annotated introns duced or highly derived genomes can still have introns. (Katinka et al. 2001; Lee et al. 2010). The extremophilic red The Author(s) 2018. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. 
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non- commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com Genome Biol. Evol. 10(6):1573–1583. doi:10.1093/gbe/evy111 Advance Access publication June 1, 2018 1573 Downloaded from https://academic.oup.com/gbe/article-abstract/10/6/1573/5026597 by Ed 'DeepDyve' Gillespie user on 20 June 2018 Wong et al. GBE alga Cyanidioschyzon merolae also has a reduced genome observed so far (from both cryptophytes and chlorarachnio- (16.5 Mbp) and only 27 annotated introns (Matsuzaki et al. phytes) carry their tiny genomes on three short linear chro- 2004; Stark et al. 2015). Transcriptomic studies of E. cuniculi mosomes (Douglas et al. 2001; Gilson et al. 2006; Lane et al. have revealed a high frequency of intron retention—that is, a 2007; Tanifuji et al. 2011; Moore et al. 2012; Tanifuji, large proportion of mature mRNA still retain introns (Grisdale Onodera, Brown, et al. 2014; Suzuki et al. 2015). Whether et al. 2013). Also, both of these species have only a limited or not there is a functional significance to this convergence, complement of spliceosomal proteins. Indeed, in C. merolae, rather than a mere coincidence, remains to be seen. As with one of the five major subcomplexes of the spliceosome is any organellar genome, very few genes remain; many have missing (Stark et al. 2015). been transferred to the host nucleus or lost. The vast majority Antisense transcription is a relatively rare process in eukar- of remaining nucleomorph genes is housekeeping genes, yotes where transcripts complementary to a “sense” gene are such as chaperone proteins, ribosomal proteins and those in- generated (Pelechano and Steinmetz 2013). 
Whereas most volved in DNA replication, along with the genes for rRNAs and antisense transcripts are noncoding, some “antisense” tran- an incomplete set of tRNAs (Douglas et al. 2001; Gilson et al. scripts are mRNA of oppositely oriented, neighboring genes. 2006; Lane et al. 2007; Tanifuji et al. 2011; Moore et al. 2012; Such transcripts are inherently more common when genes are Tanifuji, Onodera, Brown, et al. 2014; Suzuki et al. 2015). especially close to each other, as in reduced genomes Introns have been found in all but one of the nucleomorph (Williams et al. 2005; Pelechano and Steinmetz 2013). Long genomes sequenced to date (Douglas et al. 2001; Gilson et al. thought to only be errant transcriptional noise, it is now clear 2006; Lane et al. 2007; Tanifuji et al. 2011; Moore et al. 2012; that antisense transcription can be another source of gene Tanifuji, Onodera, Brown, et al. 2014; Suzuki et al. 2015). regulation (Wagner and Simons 1994; Vanhee-Brossollet However, the density of introns in the nucleomorph genomes and Vaquero 1998; Pelechano and Steinmetz 2013). This of cryptophytes and chlorarachniophytes is quite different. For property could be especially relevant for reduced genomes, example, the cryptophyte G. theta has 485 tightly packed especially if antisense transcription could compensate for a protein-coding genes in its 550 kbp nucleomorph genome, reduced set of transcription factors and regulatory elements and only 17 of these genes are interrupted by introns that in the genome. Antisense transcripts bound to complemen- range from 42 bp to 52 bp in length (Douglas et al. 2001). In tary mRNA can target it for degradation, for example by contrast, the smaller (370 kbp) nucleomorph genome of microRNAs (miRNA) or small interfering RNAs (siRNA; B. natans has almost 900 extremely short (18–21 bp) introns Ghildiyal and Zamore 2009; Moazed 2009; Pelechano and that interrupt a majority of the 283 protein-coding nucleo- Steinmetz 2013). 
A bound antisense transcript can also morph genes (Gilson et al. 2006; Tanifuji, Onodera, Brown, 0 0 mask splicing signals, leading to intron retention and other et al. 2014). Whereas canonical 5 and 3 splice sites are pre- alternative splicing events (Morrissy et al. 2011). sent in these tiny introns, other commonly conserved splicing The most reduced eukaryotic genomes in nature are the motifs such as the branch donor adenosine are not discernible nucleomorphs, relict nuclei found within secondary plastids of (Gilson et al. 2006; Tanifuji, Onodera, Brown, et al. 2014). two distant lineages of algae (Keeling 2004). While there are a There have been a number of studies on the peculiarities of number of algal lineages that have acquired photosynthesis transcription in nucleomorphs (Williams et al. 2005; Hirakawa by taking up an already-photosynthetic eukaryote as an en- et al. 2011; Hirakawa et al. 2014; Tanifuji, Onodera, Moore, dosymbiont, in only two independent lineages does the rem- et al. 2014; Suzuki et al. 2016; Sanita Lima and Smith 2017), nant endosymbiont nucleus (with an associated tiny genome) although studies about the unique introns and pre-mRNA persist. The cryptophytes bear a plastid derived from a red splicing of nucleomorphs are lacking. Here, we seek to un- alga, while the chlorarachniophytes acquired photosynthesis derstand pre-mRNA splicing in the highly reduced nucleo- from a green alga. As the nucleomorph is derived from a morphs through strand-specific RNA-Seq on mRNA eukaryotic nucleus, the genome has typical eukaryotic fea- extracted from both the cryptophyte G. theta and the chlor- tures, such as linear chromosomes, genes containing introns arachniophyte B. natans, and conducting an in-depth analysis and pre-mRNA splicing. Nucleomorphs provide insight into and comparison of pre-mRNA splicing and antisense tran- the reduction of a eukaryotic genome through endosymbiosis scription. 
(as opposed to parasitism, for example). The first two nucleomorphs sequenced were those of the cryptophyte Guillardia theta and the chlorarachniophyte Bigelowiella natans (Douglas et al. 2001; Gilson et al. 2006). Since then, additional nucleomorphs from both lineages have been sequenced (Lane et al. 2007; Tanifuji et al. 2011; Moore et al. 2012; Tanifuji, Onodera, Brown, et al. 2014; Suzuki et al. 2015). Nucleomorph genomes vary in size, but none has been found to be >1 Mbp. Interestingly, all nucleomorphs

Genome Biol. Evol. 10(6):1573–1583 doi:10.1093/gbe/evy111 Advance Access publication June 1, 2018

Our data reveal that although there are similarities between the nucleomorphs of G. theta and B. natans with respect to gene expression and antisense transcription, there are marked differences in proportions of transcripts that remain unspliced. Whereas the intron-sparse nucleomorph of G. theta exhibits much intron retention, the many tiny introns of the B. natans nucleomorph are spliced at high levels, highlighting contrasting and possibly lineage-specific differences in the evolutionary outcomes of pre-mRNA splicing due to genome reduction. Furthermore, our study provides insight into the diversity of evolutionary trajectories due to genome reduction, and into the diverse nature of what is considered a ubiquitous and conserved eukaryotic process.

Materials and Methods

Culturing of G. theta and B. natans

Monoclonal cultures of G. theta strain CCMP 2712 and B. natans strain CCMP 621 were obtained from the National Center for Marine Algae and Microbiota (NCMA, formerly CCMP). Both organisms were cultured in 250 ml Erlenmeyer flasks using 50 ml of f/2-Si media for B. natans, and the same volume of media supplemented with 50 mM of NH4Cl for G. theta, as it is unusual in its requirement of ammonium as its nitrogen source (Hill and Wetherbee 1990). Cultures were agitated on a shaking platform rotating at 120 rpm, and were exposed to 30 µmol photons/m²/s of light for 12 h per day.

RNA Extraction

Guillardia theta cells were pelleted from 10 ml of culture spun down 6 h, or halfway, into the “daylight” phase of the light cycle. Total RNA was extracted from these pellets using the Ambion RNaqueous Kit (Life Technologies) with the manufacturer’s recommended protocol. The same volume of B. natans culture was spun down at the same time-point as G.

second strand, ensuring strand-specificity of the resultant sequenced fragments (Parkhomchuk et al. 2009; Wang et al. 2011). These two libraries were sequenced on an in-house Illumina HiSeq 2500, generating a total of 17,056,967 and 31,687,031 paired-end reads. The same process was repeated to generate two replicate libraries of B. natans, resulting in 65,432,233 and 62,505,229 paired-end reads.

Bioinformatics Analysis of Sequence Data

The resulting sets of reads were mapped using TopHat2 (Kim et al. 2013) to concatenated reference genomes of G. theta (GenBank accession AEIE00000000) and B. natans (GenBank accession ADNK00000000) that include all nucleomorph chromosomes, plastid and nuclear genomic sequences. Mapped read pairs in SAM format alignment files were then processed with custom Python scripts to sort them into sense or antisense read pairs based on existing gene annotations of G. theta and B. natans. Raw read counts were summed for each gene for use in downstream calculations of gene expression and antisense transcription levels. These counts were then normalized to determine relative expression levels using the FPKM (fragments per kilobase of exon per million reads mapped) method (Mortazavi et al. 2008).

For each annotated junction in G. theta and B. natans,
further custom Python scripts were employed to enumerate the mapped read pairs in that vicinity according to the type of splicing event each represents: spliced transcripts, intron retention and other alternative splicing events. To ensure that the splicing events are real and not represented by spuriously mapped reads, a junction with <25 reads mapped to its vicinity was excluded from further analysis. Using read counts for spliced transcripts and intron-retained transcripts, we calculated the percent spliced reads for each annotated junction by dividing the number of canonically spliced reads by the total number of reads. We also performed a similar calculation for each annotated junction by dividing the number of intron-retained reads by total reads to generate percent intron retention, which totals to 100% when summed with percent spliced reads. Because the introns in both the G. theta and B. natans nucleomorphs are much shorter than the read length, our calculated percent intron retention is a proxy for the percent spliced in (PSI) value used in alternative splicing studies. Percent intron retention for nucleomorph junctions

theta cultures, and total RNA extracts were prepared from the cell pellets using the TRIzol reagent (Ambion) under the manufacturer’s recommended conditions. Eluted total RNA samples were quantified using a NanoDrop spectrophotometer (Thermo Scientific).

RNA Cleanup and Poly-A Purification

Total RNA extracts were cleaned of gross DNA contamination using the Invitrogen DNA-free DNA Removal Kit (Life Technologies) and further quantified for RNA and DNA using Invitrogen Qubit fluorometry (Life Technologies) for both macromolecules. Poly-A purification was performed using NEXTflex Poly(A) Beads (BioO Scientific) to enrich samples for mRNA and reduce the relative proportion of rRNA.

Strand-Specific Library Preparation and Second-Generation Sequencing

Two strand-specific libraries of G.
theta were prepared as replicates using the NEXTflex Directional RNA-Seq Kit (BioO Scientific), which uses the dUTP method to maintain strand specificity. The libraries were prepared without modification to the manufacturer’s protocol. This method employs the addition of deoxyuridine triphosphate (dUTP) in place of deoxythymidine triphosphate (dTTP) during second-strand synthesis of reverse transcription, and subsequent digestion of uracil using uracil-DNA glycosylase (UDG) introduces breaks in the

are provided as supplementary figures S1 and S2, Supplementary Material online.

A discrepancy exists between the number of introns we analyzed and the latest number of introns found in the nucleomorph of B. natans. Whereas 99 additional introns were presented in a recent analysis of the B. natans nucleomorph genome (Tanifuji, Onodera, Brown, et al. 2014), the available annotation files contain only the 852 introns annotated from the original genome sequencing project (Gilson et al. 2006), and these are the 852 B. natans nucleomorph introns analyzed here.

Results and Discussion

Increased Levels of Transcription in Both Cryptophyte and Chlorarachniophyte Nucleomorphs

The nucleomorphs of cryptophytes and chlorarachniophytes

higher relative expression with an average FPKM of 75.4. Likewise, an average FPKM of 317 in nucleomorph genes of B. natans versus 17.1 in nuclear genes suggests a 19-fold increase in relative gene expression (fig. 1). Taken together with results from G. theta, our expression data correspond well with previous studies, and we proceeded to examine further aspects of transcription only allowed by our use of strand-specific methodologies.
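The FPKM normalization and the fold changes quoted in this section can be illustrated with a short Python sketch. This is not the authors' script: the function names `fpkm` and `fold_change` and the example read count, gene length and library size are hypothetical, and only the mean FPKM values (24.9, 75.4, 17.1 and 317) are taken from the text.

```python
def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon per million mapped fragments."""
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("exon length and library size must be positive")
    kilobases = exon_length_bp / 1_000
    millions = total_mapped_fragments / 1_000_000
    return fragments / (kilobases * millions)


def fold_change(nucleomorph_mean, nuclear_mean):
    """Ratio of mean nucleomorph FPKM to mean nuclear FPKM."""
    return nucleomorph_mean / nuclear_mean


# Hypothetical gene: 500 fragments over 2,000 bp of exon in a library of
# 10 million mapped fragments.
example_fpkm = fpkm(500, 2_000, 10_000_000)

# Mean FPKM values reported in the text (nucleomorph vs. nuclear genes).
g_theta_fold = fold_change(75.4, 24.9)
b_natans_fold = fold_change(317, 17.1)

print(f"example FPKM: {example_fpkm:.1f}")
print(f"G. theta fold change: {g_theta_fold:.1f}")
print(f"B. natans fold change: {b_natans_fold:.1f}")
```

Rounded to whole numbers, the two ratios agree with the roughly 3-fold and 19-fold increases described for G. theta and B. natans.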
Antisense transcription has not been previously analyzed in nucleomorphs, and we used our strand-specific RNA-Seq data to investigate the extent of antisense transcription in both the nucleomorph and nuclear genomes of G. theta and B. natans. Reduced genomes are thought to exhibit more antisense transcription, as intergenic spaces are small and transcripts of neighboring and oppositely oriented genes are likely to overlap (Williams et al. 2005; Pelechano and Steinmetz 2013). Therefore, we predicted that the nucleomorphs of G. theta and B. natans would have more antisense transcription taking place than in their respective nuclei. To determine relative antisense transcription levels, FPKM was calculated from mapped antisense reads for all annotated genes. Indeed, more antisense transcription occurs in the nucleomorphs of both G. theta and B. natans than in their respective nuclei (fig. 1).

The increased levels of antisense transcription in nucleomorphs compared with their host nuclei suggest the provocative possibility that these transcripts could be playing a functional role in gene regulation in the nucleomorph. While Tanifuji, Onodera, Moore, et al. (2014) suggest that increased nucleomorph gene expression could be a compensatory mechanism against high levels of errant antisense transcription, the converse could also be true: antisense transcripts could down-regulate the massively increased levels of gene expression. Given the extremely reduced set of genes within the nucleomorph, the nucleomorph genome could have dispensed with many of the factors necessary for finer transcriptional regulation. Indeed, based on sequence similarity to known transcription-related proteins, the nucleomorph genome of G. theta is predicted to harbor only a small complement of 30 such proteins, while the B. natans nucleomorph genome has even fewer at 22 (Douglas et al. 2001; Gilson et al. 2006). Although some putative nucleomorph-targeted transcription factors have been identified for both G. theta and B. natans, the full contribution of nuclear-encoded transcription-associated proteins is unknown (Curtis et al. 2012). Regardless, the extremely short intergenic regions of the G. theta and B. natans nucleomorphs present considerable limitations on the position and sequences of regulatory motifs, especially if those motifs are involved in suppressing transcription, resulting in poorly controlled and uniformly high expression of nucleomorph genes and a possible reliance on antisense transcription as an alternate mechanism of gene regulation. This is supported by a recent study revealing that only two nucleomorph genes of B. natans are differentially expressed between light and dark cycles, suggesting a lack of transcriptional regulation (Suzuki et al. 2016). Others have suggested that overexpression of nucleomorph genes could be compensating for missplicing, or increased intron retention, which may be more common in a reduced genome (Grisdale et al. 2013; Tanifuji, Onodera, Moore, et al. 2014). As discussed later, our splicing analyses partially support this idea, raising questions about the

represent evolutionarily convergent structures, and our comparison of pre-mRNA splicing between nucleomorphs of both lineages offers insight into the evolution of a conserved eukaryotic process in two independent cases of genome reduction. To examine the process of pre-mRNA splicing in nucleomorphs, we performed strand-specific RNA-Seq on the cryptophyte G. theta and the chlorarachniophyte B. natans, and mapped the reads to their respective nucleomorph genomes. Because both G. theta and B. natans also have sequenced nuclear genomes (Curtis et al. 2012), we simultaneously mapped our RNA-Seq data to their respective nuclei to be used as examples of typical eukaryotic genomes. Using available genome annotations, we totaled the number of mapped reads for each gene representing expression, antisense transcription, intron excision or retention, alternative splicing, and so forth. We determined with these counts that the replicate RNA-Seq libraries were statistically similar (Pearson’s r = 0.95 for G. theta; r = 0.99 for B. natans), and all mapped reads from each species were pooled for our final transcriptome analyses. In total, 26,561,411 pooled reads were mapped to all G. theta genomic sequences, with 1,091,066 of those (4.1%) mapping to the nucleomorph. For B. natans, 62,768,143 pooled reads were mapped, and 6,690,653 of those (10.7%) mapped to the nucleomorph.

Previous transcriptomic studies on G. theta and B. natans have focused on gene expression. In those studies, researchers noted the high gene expression levels in nucleomorphs (Hirakawa et al. 2014; Tanifuji, Onodera, Moore, et al. 2014), and observed that virtually the entire nucleomorph genome is transcribed (Williams et al. 2005; Sanita Lima and Smith 2017). An assessment of relative gene expression levels was performed to compare our data to those studies to ensure that our data and methodology were sound. We determined relative gene expression levels from our RNA-Seq data by normalizing counts of mapped reads across all (nuclear and nucleomorph) annotated genes in G. theta and B. natans using FPKM (fragments per kilobase of exon per million reads mapped). Tanifuji, Onodera, Moore, et al. (2014) showed that gene expression in G. theta nucleomorph genes was on average 2.6 times higher than in its nuclear genes. The nucleomorph genes of B. natans were shown to be expressed almost 15 times higher than its nuclear genes (Tanifuji, Onodera, Moore, et al. 2014). Our RNA-Seq data showed very similar increases in nucleomorph gene expression (fig. 1): G. theta nuclear genes have an average FPKM of 24.9, whereas G. theta nucleomorph genes have a 3-fold

FIG. 1. Increased gene expression and antisense transcription is observed in nucleomorph genes of both Guillardia theta and Bigelowiella natans relative to their respective nuclear genes. (A) Box plot representing gene expression level (normalized using the FPKM formula) differences between nuclear and nucleomorph genes of G. theta. (B) Box plot representing gene expression level differences between nuclear and nucleomorph genes of B. natans. (C) Box plot representing levels of antisense transcription (normalized using the FPKM formula) differences between nuclear and nucleomorph genes of G. theta. (D) Box plot representing antisense transcription level differences between nuclear and nucleomorph genes of B. natans. (E) Summary table of mean and median FPKM and antisense FPKM of both nuclear and nucleomorph genes in G. theta and B. natans.

Summary table (panel E):
Measure | G. theta nuclear genes | G. theta nucleomorph genes | B. natans nuclear genes | B. natans nucleomorph genes
Sense FPKM, mean | 24.9 | 75.4 | 17.1 | 317
Sense FPKM, median | 2.93 | 17.8 | 5.24 | 207
Antisense FPKM, mean | 1.00 | 11.5 | 0.618 | 19.1
Antisense FPKM, median | 0.0181 | 2.87 | 0 | 3.88
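Sorting reads into sense and antisense underlies the antisense FPKM values summarized for figure 1. A minimal Python sketch of that step follows. It assumes the usual convention for dUTP libraries, namely that read 1 aligns to the strand opposite the originating transcript and read 2 to the same strand. The function names and the boolean inputs (which in practice would be derived from SAM flags) are hypothetical and do not reproduce the custom scripts used in the study.

```python
def transcript_strand(is_read1, is_reverse):
    """Infer the strand of the originating RNA from one mate of a pair.

    Assumption (dUTP / first-strand convention): read 1 is antisense to the
    RNA and read 2 is sense to it.
    """
    if is_read1:
        return "+" if is_reverse else "-"
    return "-" if is_reverse else "+"


def classify_read(gene_strand, is_read1, is_reverse):
    """Label a mate as 'sense' or 'antisense' relative to an annotated gene."""
    if gene_strand not in ("+", "-"):
        raise ValueError("gene_strand must be '+' or '-'")
    same = transcript_strand(is_read1, is_reverse) == gene_strand
    return "sense" if same else "antisense"


def count_by_orientation(gene_strand, mates):
    """Tally mates overlapping one gene; `mates` holds (is_read1, is_reverse).

    This counts individual mates for simplicity; a real pipeline would count
    each read pair once.
    """
    counts = {"sense": 0, "antisense": 0}
    for is_read1, is_reverse in mates:
        counts[classify_read(gene_strand, is_read1, is_reverse)] += 1
    return counts


# Hypothetical mates overlapping a gene annotated on the '+' strand.
example = count_by_orientation("+", [(True, True), (False, False), (True, False)])
print(example)
```

The per-gene antisense tally produced this way could then be normalized with the same FPKM formula used for sense reads.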
FIG. 2. Pre-mRNA splicing levels in Guillardia theta nucleomorph genes. For each intron in the nucleomorph, the percentage of spliced reads was calculated in two different ways: one including reads mapped to the antisense strand to simulate a traditional nonstranded RNA-Seq (light bars), and one excluding these antisense reads (dark bars). The genes are arranged left to right in order of increasing effect of antisense reads on the calculated percent spliced reads, highlighted by the difference in heights of the bars. Nucleomorph gene orf183 had <50 reads mapped across its junction, while the junction from orf263 was excluded from this analysis due to poor coverage.

potential interplay of overexpression, antisense transcription and pre-mRNA splicing within reduced genomes.

Increased Intron Retention in G. theta Nucleomorph Genes

Our study provides the first analysis of pre-mRNA splicing in a cryptophyte nucleomorph, and the first comparison of splicing between the independently reduced cryptophyte and chlorarachniophyte nucleomorphs. While our RNA-Seq data showed that high expression is a common feature to both the G. theta and B.

out the possibility that increased intron retention seen in the G. theta nucleomorph was reflecting technical issues, we also calculated the percent of spliced reads at G. theta nuclear junctions (supplementary fig. S3, Supplementary Material online). As expected for a typical eukaryotic genome, nuclear gene transcripts of G. theta are spliced at high levels with little intron retention. In the G. theta nucleomorph, the percent of spliced reads varies between the introns, from the intron for the replication factor rfC being spliced at <20%, to a number of other genes where intron retention is not extensive and
nearly 90% of reads are spliced, such as the ribosomal proteins rps16, rpl19, and rps13 (fig. 2). This range of splicing levels might suggest that introns of certain genes are spliced at higher levels than others; however, there does not appear to be any correlation between the extent of intron retention and the function of the gene in which the intron lies.

The low splicing levels seen in the highly reduced G. theta nucleomorph are similar to patterns seen in a previous transcriptomic study of the microsporidian parasite E. cuniculi (Grisdale et al. 2013), consistent with the hypothesis that one of the effects of genome and spliceosome reduction is a reduction in splicing levels of the few remaining introns. It is possible that increased intron retention represents missplicing due to poor recognition of reduced introns (with reduced, divergent, or missing splicing motifs) by the spliceosome. In a study by Jaillon et al.

natans nucleomorphs, we observe significant differences in the patterns of pre-mRNA splicing between the two. Only 17 introns have been annotated in the nucleomorph genome of G. theta, each one located in a different gene, all with a noticeable bias towards the 5′ end of the gene (Douglas et al. 2001). Previous transcriptomic studies on another reduced genome have shown that relatively few introns remain in the genome, and these introns are often retained in mature transcripts (Grisdale et al. 2013). Considering the low intron density and overall level of genome reduction in the nucleomorph of G. theta, we expected that high levels of intron retention would also be observed in these 17 introns. We examined the splicing levels for 16 of the 17 annotated nucleomorph introns; the remaining intron was located on a gene with poor read coverage and did not meet our cut-off criterion. As shown in figure 2, intron retention is prevalent:
on average, only around 60% of the reads mapped to the vicinity of introns are spliced. This is significant, as the only other report of unusual intron retention is from an RNA-Seq study on the highly reduced microsporidian E. cuniculi where intron retention was extensive (Grisdale et al. 2013). To rule

(2008), they found that the short introns (20–34 bp) of the ciliate Paramecium tetraurelia possess reduced splicing signals, and suggested with some support that these introns are not removed efficiently from transcripts. However, the 17 annotated introns in the G. theta nucleomorph have clearly defined splice sites and a well-defined branch point adenosine (Douglas et al. 2001), and we did not find any discernible correlations between features of the intron sequence and the propensity for its removal. In fact, the 5′ splice site is well-conserved across all 17 G. theta nucleomorph introns (Douglas et al. 2001), in line with other studies showing that genomes with low intron density tend towards stronger, rather than weaker, splicing motifs (Irimia et al. 2007, 2009; Irimia and Roy 2008; Lee et al. 2010).

transcripts for each junction are represented in figure 2. A number of introns (on the right side of fig. 2), such as those in the genes for ribosomal protein rps27 or growth factor ebi, have high levels of antisense transcripts that would confound calculations of percent spliced transcripts. The intron in ebi is most affected; antisense reads comprise nearly half of the mapped reads in the vicinity of its intron. Without strand data for the reads, the percent of spliced reads would be
31% as opposed to 60%, severely overestimating the occurrence of intron retention for this gene. However, most other introns do not appear to have enough antisense reads to cause an overestimation of the extent of intron retention: the proportion of antisense reads mapping to regions within the intron is <10% for more than half of the G. theta nucleomorph introns. In fact, no antisense reads were mapped to the vicinity of the introns in rps16, rpl19, or rps17 (fig. 2). However, because reduced genomes tend to have very few introns, even a few overestimations can skew the overall impression of the extent of intron retention. Thus, it is worthwhile to consider strand-specific libraries for any RNA-Seq experiment for organisms with reduced genomes.

Our transcriptomic data from the G. theta nucleomorph revealed that despite high levels of antisense transcription

It has also been suggested that an incomplete set of spliceosomal components in reduced genomes results in increased missplicing and intron retention (Grisdale et al. 2013). Although a small number of spliceosomal components and one of the snRNAs have been identified within the nucleomorph genome (Douglas et al. 2001; Lopez et al. 2008), the full contribution of spliceosomal components from nuclear encoded genes is unclear (Curtis et al. 2012). Regardless, because our data show that not all of the 17 introns exhibit extensive intron retention, a functional spliceosome must be present. With so few, short introns remaining, one wonders why they have not been dispensed with altogether given the high level of genome reduction. Although it is possible that their positional bias within the gene prevents their loss, it is also possible that they play a regulatory role that contributes to their persistence.
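The effect of strand information on the percent-spliced calculation can be sketched in Python. The junction counts below are hypothetical, chosen only so that the two results land near the 60% and 31% quoted for ebi. The function name, its parameters, and the decision to apply the 25-read cut-off to whichever reads are being counted are assumptions rather than details of the study's scripts.

```python
MIN_READS = 25  # junctions with fewer mapped reads were excluded in the study


def percent_spliced(spliced, retained, antisense=0, strand_aware=True):
    """Percent of junction reads that are canonically spliced.

    Without strand information, antisense reads covering the intron cannot be
    told apart from intron-retaining sense reads, so they are added to the
    unspliced count. Returns None for junctions below the read cut-off.
    """
    unspliced = retained if strand_aware else retained + antisense
    total = spliced + unspliced
    if total < MIN_READS:
        return None
    return 100.0 * spliced / total


# Hypothetical counts for one junction: antisense reads make up nearly half
# of all reads mapped to its vicinity (94 of 194).
counts = dict(spliced=60, retained=40, antisense=94)

stranded = percent_spliced(**counts)
unstranded = percent_spliced(**counts, strand_aware=False)

print(f"strand-aware: {stranded:.0f}% spliced, {100 - stranded:.0f}% retained")
print(f"strand-unaware: {unstranded:.0f}% spliced, {100 - unstranded:.0f}% retained")
```

Percent intron retention is simply 100 minus percent spliced, so the same function covers both quantities described in the Materials and Methods.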
For example, retained introns in mature transcripts, especially if splicing levels are actively regulated by the organism, could act as a form of post-transcriptional regulation, preventing overly abundant transcripts from being translated (Lewis et al. 2003; Lareau et al. 2007). Another possibility could be interplay between antisense transcription and splicing, as has been previously documented (Morrissy et al. 2011; Pelechano and Steinmetz 2013).

Our strand-specific methodology allows us to explore these potential links between antisense transcription and pre-mRNA splicing. We have shown that transcription occurs at high levels in nucleomorphs, in line with previous studies exploring nucleomorph gene expression (Hirakawa et al. 2014; Tanifuji, Onodera, Moore, et al. 2014). However, in those previous studies, there was no way to reliably discern how many of the transcripts were antisense. Conventional methods of reverse transcription do not preserve the strandedness of the original transcript, and the lack of strand specificity of the sequence data renders antisense transcripts indistinguishable from “sense” mRNA to read mapping programs; antisense reads mapping to an annotated intron would thus appear as intron retention. Our strand-specific RNA-Seq data allow us to compare methodologies to determine if this is a legitimate concern, in addition to providing opportunities to assess any potential effects of antisense transcription on pre-mRNA splicing. We were able to make two calculations of percent transcripts spliced: one simulated results from a conventional RNA-Seq experiment by calculating the percentage of spliced transcripts from all sense and antisense reads, and a second calculation excluded the antisense reads.

The differences between the calculations of percent spliced

across the entire nucleomorph genome (see above), antisense transcription occurs significantly less frequently (Welch’s unequal variances t-test, P < 0.001) in the 17 intron-containing genes than in all the other single-exon genes. Although there are limitations to drawing robust statistical conclusions from only 17 introns, this observation could suggest that in the G. theta nucleomorph, antisense transcripts complementary to multi-exon genes are actively down-regulated to allow for proper splicing to occur. This could highlight a potential link between antisense transcription and pre-mRNA splicing, requiring further investigation. It has previously been suggested that antisense transcripts occlude splicing signals, leading to intron retention, and indeed a positive correlation has been observed in animals between antisense transcription and alternatively spliced genes (Morrissy et al. 2011; Pelechano and Steinmetz 2013). Although untangling the links between gene expression, antisense transcription and pre-mRNA splicing in such a reduced genome presents many technical challenges, this interplay within the G. theta nucleomorph could not only represent an unusual network of regulatory mechanisms shared with other reduced genomes, but also provide an evolutionary explanation for the persistence of introns and their associated spliceosome.

Near-typical Eukaryotic Splicing Levels in B. natans Nucleomorph Genes

The independently reduced nucleomorph of the chlorarachniophyte B. natans bears remarkable contrasts to that of the cryptophyte G. theta. To investigate pre-mRNA splicing patterns in B. natans further, we calculated the percent of spliced
reads at all annotated introns of both the nuclear and nucleomorph genomes of B. natans using the same methods and criteria as G. theta. As with G. theta nuclear genes, reads mapped to the B. natans nucleus are spliced with a median percent spliced reads higher than 90%, indicating little intron retention (supplementary fig. S4, Supplementary Material online).

With only 283 annotated genes and nearly 900 introns (see Materials and Methods for clarification) in the B. natans nucleomorph genome, the genes are densely populated with extremely short (18–21 bp) introns (Gilson et al. 2006; Tanifuji, Onodera, Brown, et al. 2014). The average intron density is 3.1 introns per gene, and only 43 of the 283 annotated genes lack introns (Gilson et al. 2006). However, based on our current understanding of conserved mechanisms of splicing, these short intron lengths place constraints on this process. The introns of B. natans nucleomorph genes do not have a discernible branch point adenosine (Gilson et al. 2006), and considering the intron lengths, it is difficult to imagine the formation of lariat structures typical of spliceosomal intron removal. Also, aside from the GU-AG boundaries, no other motifs are apparent (Gilson et al. 2006). Finally, the typical eukaryotic spliceosome (or even a reduced version) is a very large conglomerate of proteins that likely dwarfs these tiny introns. On the basis of these unusual aspects, and extensive intron retention in other reduced genomes, we expected to observe low levels of splicing in the B. natans nucleomorph. Using our minimum read coverage criterion as described in the Materials and Methods, we analyzed splicing levels of 829 introns from the B. natans nucleomorph. Surprisingly, the calculated splicing levels of B.

FIG. 3. Bigelowiella natans nucleomorph introns are spliced at high levels. Histogram showing the proportion of junctions at different levels of percent spliced reads. The vast majority of junctions have >80% spliced reads. Mean percent spliced: 87.6%; median percent spliced: 92.8%.

genomes (fig. 3). More than half of all annotated nucleomorph introns have 90–100% of junction-mapped reads spliced, and only 12% of all annotated introns have <80% of junction-mapped reads spliced. Furthermore, for the vast majority of these introns, antisense reads comprise <10% of the mapped reads, and do not contribute to any significant overestimation of intron retention (supplementary File S5, Supplementary Material online). As with the G. theta nucleomorph, we were unable to correlate splicing levels of a particular gene’s introns and that gene’s proposed function. Furthermore, given the high levels of splicing across all introns, it is less clear if these tiny introns could play a functional role. However, the possibility remains that under certain conditions, such as stress, nucleomorph splicing could be regulated.

This “proficiency” in pre-mRNA splicing we observed means that the relatively high nucleomorph gene expression in B. natans is not compensating for excessive missplicing, as suggested by Tanifuji, Onodera, Moore, et al. (2014). Instead, near-typical eukaryotic splicing levels in the highly reduced B. natans nucleomorph genome could be a consequence of having a relatively high intron density similar to other free-living green algae (Slamovits and Keeling 2009), where extensive intron retention would likely be deleterious. While there are 24 predicted spliceosomal components in the B. natans nucleomorph (Gilson et al. 2006; Curtis et al. 2012), this is a small fraction of the number of components required to form the familiar spliceosome that is conserved amongst other eukaryotes, even with contributions from nuclear-encoded products. Taken together with the unusual sequence features of their introns, our data suggest that the B. natans
nucleomorph must use a novel or highly divergent mechanism to effectively remove its extremely short introns with great accuracy. It has been suggested that the length and sequence of the B. natans nucleomorph introns may play a role in efficient identification and removal, as the range of lengths is very narrow (18–21 bp), and intronic sequences have a marked AU bias (Gilson et al. 2006). Whereas an EST-based study of the chlorarachniophyte Gymnochlora stellata suggested that splicing levels of nucleomorph introns are correlated with their lengths (Slamovits and Keeling 2009), we find no significant differences in splicing levels across B. natans nucleomorph introns (supplementary File S5, Supplementary Material online). Furthermore, the only identifiable splicing sequence signals in B. natans nucleomorph introns (the 5′ and 3′ splice sites) are essentially invariable (Gilson et al. 2006). Indeed,

natans nucleomorph introns follow a pattern more like its respective nucleus than other reduced

biochemistry of pre-mRNA splicing has been conducted. No reports exist of widespread intron retention in S. cerevisiae, and Grisdale et al. (2013) showed in a splicing analysis using existing S. cerevisiae RNA-Seq data that the vast majority of the 253 introns is spliced at high levels. Thus, while intron density could be a very strong determinant of intron retention in reduced genomes, other biological or evolutionary factors are also involved.

Annotations from other sequenced nucleomorph genomes show that all cryptophyte nucleomorphs are intron-sparse, while chlorarachniophyte nucleomorphs have very many tiny introns (Lane et al. 2007; Tanifuji et al. 2011; Moore et al.
2012; Tanifuji, Onodera, Brown, et al. 2014; Suzuki Gilson et al. (2006) propose that the B. natans nucleomorph et al. 2015). The chlorarachniophyte plastid was derived spliceosome could operate as a molecular “caliper,” excising from a green alga, and its nucleomorph’s intron density is 18–21 base-pair-long AU-rich regions bound by canonical similar to those of Arabidopsis thaliana and splice sites. Having very typical eukaryotic splicing levels in a Chlamydomonas reinhardtii (Gilson et al. 2006; Slamovits highly reduced genome with such short introns, pre-mRNA and Keeling 2009). There are also clear indications that splicing in the B. natans nucleomorph merits further biochem- some of the annotated chlorarachniophyte introns bear posi- ical and proteomic studies to elucidate this process and allow tional homology to those in extant green algae (Gilson et al. comparison with canonical splicing. 2006; Slamovits and Keeling 2009). The cryptophyte plastid, on the other hand, is derived from a red alga. Current geno- mic data repeatedly suggest that free-living red algae are gen- Evolution of Pre-mRNA Splicing in Reduced Eukaryotes erally gene- and intron-poor, hinting at a possible ancient Our analyses of splicing in the nucleomorphs of G. theta and genome reduction event before the radiation of rhodophytes B. natans highlight major differences in the patterns of pre- (Qiu et al. 2015). Should this be true, the red algal endosym- mRNA splicing in highly reduced genomes. Although the biont ancestor of the cryptophyte plastid would have already trend of increased intron retention in reduced genomes was been reduced with respect to intron density and the number seen in the highly reduced (but intron-sparse) nucleomorph of of spliceosomal components. Consequently, extant red algae G. theta, transcripts from the even tinier (but intron-dense) may also exhibit pre-mRNA splicing patterns similar to what B. 
natans nucleomorph were spliced at levels seen in most we observed in the G. theta nucleomorph. Indeed, in the other eukaryotes. This stark contrast between the two nucle- extremophilic red alga C. merolae, extensive intron retention omorphs could simply indicate that splicing levels are influ- is observed in its 27 annotated introns (Grisdale CJ, unpub- enced by intron density. However, the difference in splicing lished data), and only a relatively small complement of spli- levels could also reflect the evolutionary ancestries of the sec- ceosomal components remain: the U1 snRNP, the subunit of ondary plastids. These two possibilities are not mutually ex- the spliceosome responsible for the recognition of the 5 splice clusive, and their resolution would be useful for generalizing site, is entirely missing from the genome (Matsuzaki et al. the patterns of pre-mRNA splicing across cryptophyte and 2004; Stark et al. 2015). However, C. merolae could represent chlorarachniophyte nucleomorphs and other reduced eukary- a highly derived lineage of red algae, and pre-mRNA splicing otic genomes. As discussed previously, it would be deleterious remains to be studied in detail in mesophilic rhodophytes. for an intron-dense organism to exhibit extensive intron re- Further sampling of red algae and cryptophyte nucleomorphs tention, even under the evolutionary pressures of genome will be required to determine if evolutionary ancestry is re- reduction. This is supported not only by our splicing analysis sponsible for the differences in pre-mRNA splicing in the of the B. natans nucleomorph, but also by existing EST anal- highly reduced nucleomorphs of cryptophytes and yses on nucleomorph transcripts from another chlorarachnio- chlorarachniophytes. phyte G. stellata (Slamovits and Keeling 2009). In that study, a number of G. stellata nucleomorph genes were found to have Conclusions similar densities of very short introns as the B. 
natans nucleo- The nucleomorphs of cryptophytes and chlorarachniophytes morph, and most of those introns were removed in 80–100% provide unique opportunities to compare the evolution and of transcripts (Slamovits and Keeling 2009). However, there diversity of conserved eukaryotic processes within the context are genomes where intron density is rather low, yet intron of genome reduction. Our data reveal similar patterns of high retention is not widespread. The most notable of these is in gene expression and high antisense transcription in both the budding yeast Saccharomyces cerevisiae, whose genome nucleomorphs. We also observed differences in levels of is relatively small, and from where most research on the Genome Biol. Evol. 10(6):1573–1583 doi:10.1093/gbe/evy111 Advance Access publication June 1, 2018 1581 Downloaded from https://academic.oup.com/gbe/article-abstract/10/6/1573/5026597 by Ed 'DeepDyve' Gillespie user on 20 June 2018 Wong et al. GBE Irimia M, Roy SW. 2008. Evolutionary convergence on highly-conserved 3 antisense transcription around junctions in G. theta, suggest- intron structures in intron-poor eukaryotes and insights into the an- ing potential links between antisense transcription and pre- cestral eukaryotic genome. PLoS Genet. 4(8):e1000148. mRNA splicing. The marked differences observed in pre- Irimia M, et al. 2009. Complex selection on 5 splice sites in intron-rich mRNA splicing between the nucleomorphs highlights the di- organisms. Genome Res. 19(11):2021–2027. versity of what is considered to be a conserved process across Jaillon O, et al. 2008. Translational control of intron splicing in eukaryotes. Nature 451(7176):359–362. eukaryotes, and raises awareness of the value of investigating Katinka MD, et al. 2001. Genome sequence and gene compaction of the splicing in the lesser-studied branches of the eukaryotic tree. 
eukaryote parasite Encephalitozoon cuniculi.Nature Further investigations of the nature and mechanisms of pre- 414(6862):450–453. mRNA splicing in reduced genomes will provide valuable in- Keeling PJ. 2004. Diversity and evolutionary history of plastids and their sight and improve our understanding of this key eukaryotic hosts. Am J Bot. 91(10):1481–1493. Kim D, et al. 2013. TopHat2: accurate alignment of transcriptomes in the process. presence of insertions, deletions and gene fusions. Genome Biol. 14(4):R36. Supplementary Material Lane CE, et al. 2007. Nucleomorph genome of Hemiselmis andersenii reveals complete intron loss and compaction as a driver of protein Supplementary data areavailableat Genome Biology and structure and function. Proc Natl Acad Sci USA. Evolution online. 104(50):19908–19913. Lareau LF, Inada M, Green RE, Wengrod JC, Brenner SE. 2007. Unproductive splicing of SR genes associated with highly conserved and ultraconserved DNA elements. Nature 446(7138):926–929. Acknowledgments Lee RCH, Gill EE, Roy SW, Fast NM. 2010. Constrained intron structures in This work was supported by a Natural Sciences and a microsporidian. Mol Biol Evol. 27(9):1979–1982. Lewis BP, Green RE, Brenner SE. 2003. Evidence for the widespread cou- Engineering Research Council (NSERC) of Canada Discovery pling of alternative splicing and nonsense-mediated mRNA decay in Grant [262988 to N.M.F.] and a grant to the Centre for humans. Proc Natl Acad Sci USA. 100(1):189–192. Microbial Diversity and Evolution from the Tula Foundation. Lopez MD, Alm Rosenblad M, Samuelsson T. 2008. Computational screen The authors also acknowledge D. Tack (formerly University of for spliceosomal RNA genes aids in defining the phylogenetic distribu- British Columbia, currently Penn State) for his custom scripts tion of major and minor spliceosomal components. Nucleic Acids Res. 36(9):3001–3010. used in the analysis of our RNA-Seq data, and S. Rader Matsuzaki M, et al. 2004. 
Genome sequence of the ultrasmall unicellular (University of Northern British Columbia) and T. Whelan red alga Cyanidioschyzon merolae 10D. Nature 428(6983):653–657. (University of British Columbia) for helpful discussions and Moazed D. 2009. Small RNAs in transcriptional gene silencing and genome comments. defence. Nature 457(7228):413–420. Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B. 2008. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Literature Cited Methods. 5(7):621–628. Cuomo CA, et al. 2012. Microsporidian genome analysis reveals evolu- Moore CE, Curtis B, Mills T, Tanifuji G, Archibald JM. 2012. Nucleomorph genome sequence of the cryptophyte alga Chroomonas mesostigma- tionary strategies for obligate intracellular growth. Genome Res. 22(12):2478–2488. tica CCMP1168 reveals lineage-specific gene loss and genome com- plexity. Genome Biol Evol. 4(11):1162–1175. Curtis BA, et al. 2012. Algal genomes reveal evolutionary mosaicism and Morrissy AS, Griffith M, Marra MA. 2011. Extensive relationship between the fate of nucleomorphs. Nature 492(7427):59–65. Douglas S, et al. 2001. The highly reduced genome of an enslaved algal antisense transcription and alternative splicing in the human genome. Genome Res. 21(8):1203–1212. nucleus. Nature 410(6832):1091–1096. Gilson PR, et al. 2006. Complete nucleotide sequence of the chlorarach- Parkhomchuk D, et al. 2009. Transcriptome analysis by strand-specific niophyte nucleomorph: nature’s smallest nucleus. Proc Natl Acad Sci sequencing of complementary DNA. Nucleic Acids Res. 37(18):e123. Pelechano V, Steinmetz LM. 2013. Gene regulation by antisense transcrip- USA. 103(25):9566–9571. Grisdale CJ, Bowers LC, Didier ES, Fast NM. 2013. Transcriptome analysis tion. Nat Rev Genet. 14(12):880–893. Qiu H, Price DC, Yang EC, Yoon HS, Bhattacharya D. 2015. Evidence of of the parasite Encephalitozoon cuniculi: an in-depth examination of the pre-mRNA splicing in a reduced eukaryote. BMC Genomics. 
ancient genome reduction in red algae (Rhodophyta). J Phycol. 14(1):207–215. 51(4):624–636. Sanit a Lima M, Smith DR. 2017. Pervasive transcription of mitochondrial, Ghildiyal M, Zamore PD. 2009. Small silencing RNAs: an expanding uni- verse. Nat Rev Genet. 10(2):94–108. plastid, and nucleomorph genomes across diverse plastid-bearing spe- cies. Genome Biol Evol. 9(10):2650–2657. Hill DRA, Wetherbee R. 1990. Guillardia theta gen. et sp.nov. Slamovits CH, Keeling PJ. 2009. Evolution of ultrasmall spliceosomal (Cryptophyceae). Can J Bot. 68(9):1873–1876. Hirakawa Y, Burki F, Keeling PJ. 2011. Nucleus- and nucleomorph- introns in highly reduced nuclear genomes. Mol Biol Evol. 26(8):1699–1705. targeted histone proteins in a chlorarachniophyte alga. Mol Microbiol. 80(6):1439–1449. Stark MR, et al. 2015. Dramatically reduced spliceosome in Cyanidioschyzon merolae. Proc Natl Acad Sci USA. Hirakawa Y, Suzuki S, Archibald JM, Keeling PJ, Ishida K-I. 2014. 112(11):E1191–E1200. Overexpression of molecular chaperone genes in nucleomorph genomes. Mol Biol Evol. 31(6):1437–1443. Suzuki S, Shirato S, Hirakawa Y, Ishida K-I. 2015. Nucleomorph genome sequences of two chlorarachniophytes, Amorphochlora amoebiformis Irimia M, Penny D, Roy SW. 2007. Coevolution of genomic intron number and splice sites. Trends Genet. 23(7):321–325. and Lotharella vacuolata. Genome Biol Evol. 7(6):1533–1545. 1582 Genome Biol. Evol. 10(6):1573–1583 doi:10.1093/gbe/evy111 Advance Access publication June 1, 2018 Downloaded from https://academic.oup.com/gbe/article-abstract/10/6/1573/5026597 by Ed 'DeepDyve' Gillespie user on 20 June 2018 Evolution of Pre-mRNA Splicing in Nucleomorphs GBE Suzuki S, Ishida K-I, Hirakawa Y. 2016. Diurnal transcriptional regulation of Wagner EG, Simons RW. 1994. Antisense RNA control in bacteria, phages, endosymbiotically derived genes in the chlorarachniophyte and plasmids. Annu Rev Microbiol. 48:713–742. Bigelowiella natans. Genome Biol Evol. 8(9):2672–2682. Wang L, et al. 
2011. A low-cost library construction protocol and data Tanifuji G, et al. 2011. Complete nucleomorph genome sequence of the analysis pipeline for Illumina-based strand-specific multiplex RNA-seq. nonphotosynthetic alga Cryptomonas paramecium reveals a core PLoS One 6(10):e26426. nucleomorph gene set. Genome Biol Evol. 3:44–54. Will CL, Lu ¨ hrmann R, et al. 2011. Spliceosome structure and function. Acc Tanifuji G, Onodera NT, Moore CE, et al. 2014. Reduced nuclear genomes Chem Res. 44(12):1292–1301. maintain high gene transcription levels. Mol Biol Evol. 31(3):625–635. Williams BA, Slamovits CH, Patron NJ, Fast NM, Keeling PJ. 2005. Tanifuji G, Onodera NT, Brown MW, et al. 2014. Nucleomorph and plastid A high frequency of overlapping gene expression in compacted genome sequences of the chlorarachniophyte Lotharella oceanica: eukaryotic genomes. Proc Natl Acad Sci USA. 102(31): convergent reductive evolution and frequent recombination in 10936–10941. nucleomorph-bearing algae. BMC Genomics 15(1):374. Vanhee-Brossollet C, Vaquero C. 1998. Do natural antisense transcripts make sense in eukaryotes? Gene 211(1):1–9. Associate editor: Michelle Meyer Genome Biol. Evol. 10(6):1573–1583 doi:10.1093/gbe/evy111 Advance Access publication June 1, 2018 1583 Downloaded from https://academic.oup.com/gbe/article-abstract/10/6/1573/5026597 by Ed 'DeepDyve' Gillespie user on 20 June 2018

Journal

Genome Biology and Evolution, Oxford University Press

Published: Jun 1, 2018
