Abstract In this study, we applied short- and long-read RNA sequencing techniques, as well as PCR analysis to investigate the transcriptome of the porcine endogenous retrovirus (PERV) expressed from cultured porcine kidney cell line PK-15. This analysis has revealed six novel transcripts and eight transcript isoforms, including five length and three splice variants. We were able to establish whether a deletion in a transcript is the result of the splicing of mRNAs or of genomic deletion in one of the PERV clones. Additionally, we re-annotated the formerly identified RNA molecules. Our analysis revealed a higher complexity of PERV transcriptome than it was earlier believed. RNA-Seq, full-length sequencing, transcriptome, Pacific Biosciences, Oxford Nanopore Technologies, endogenous retrovirus INTRODUCTION The porcine endogenous retrovirus (PERV) is a C-type gammaretrovirus of swine (Todaro et al. 1974). The two polytropic subtypes PERV-A and PERV-B infect cells of several species, including humans, while the ecotropic PERV-C infects only pig cells (Czauderna et al. 2000). The PK-15 cell line harbors 10–20 copies of at least two different clones of the PERV genome (Akiyoshi et al. 1998). The DNA homologous to PERV-PK (GeneBank accession: AJ293656) produces a full length 8.3 kb long transcript and a subgenomic 3.1 kb long spliced transcript. This clone expresses all of the retroviral genes (gag, pro/pol and env), which enable the virus to productively infect the cells. The genome homologous to PK15-ERV (GeneBank accession: AF038601) has two deleted genomic regions affecting the RNA polymerase and envelope production of the virus; however, this viral clone produces RNA molecules (Czauderna et al. 2000; Krach et al. 2001). The pig has been under consideration to possibly be utilized in the future as an organ donor organism (Wu et al. 2017). Therefore, PERV may represent a possible health hazard in xenotransplantations, as it is present in relatively high copy number in the genome of pigs. The elimination of PERV from the pig cell by specific pathogen-free breeding has been reported to be impossible (Denner and Tönjes 2012), although some progress has been achieved recently by using CRISPR/Cas9-mediated excision of the provirus in cultured porcine kidney (PK-15) cells (Feng et al. 2015). Whole transcriptome studies have become indispensable for understanding the complexity of genetic regulation. Short-read sequencing has become a commonly applied approach for the structural and functional annotation of transcriptomes (Mortazavi et al. 2008; Wang, Gerstein and Snyder 2009; Djebali et al. 2012). However, this technique is not optimal for the de novo characterization of the transcriptome, since it is unable to identify alternatively transcribed and processed transcripts and to distinguish between transcript isoforms, including splice and length variants. Both the Pacific Biosciences (PacBio) RS II platform and Oxford Nanopore Technologies (ONT) MinION platform are capable of reading long DNA stretches in a single sequencing run (Sharon et al. 2013; Laver et al. 2015). Although, short-read technologies produce higher coverage than those of long-read sequencing, longer reads are much easier to process computationally and to interpret analytically since they map more unambiguously to a reference sequence and the larger the sequencing reads, the easier is to put the sequences together. Using longer reads, more overlaps can be identified between reads. In our former publications, we have demonstrated that PacBio long-read sequencing is able to reveal a hidden complexity of the transcriptional landscape of pseudorabies virus (PRV; Tombácz et al. 2016), although the PRV transcripts have been formerly determined by using an Illumina short-read platform (Oláh et al. 2015). The PacBio long-read technique has also been successfully applied for the quantitation of dynamic PRV transcriptome (Tombácz et al. 2017). Both the PacBio isoform sequencing (Iso-Seq) library preparation and the ONT (1D strand switching cDNA by ligation) methods are able to determine the 5’-ends of the transcripts with base pair precision. PacBio sequencing has the important advantage over other methods in that it does not produce systematic errors and any that arise are therefore easily corrected thanks to its high consensus accuracy (Miyamoto et al. 2014). On the other hand, the ONT sequencing technique does not have a skew towards a specific read length, thus it is more likely to sequence shorter and longer full-length cDNAs than the PacBio platform (Weirather et al. 2017). However, the PacBio sequencing technique is more accurate than that of ONT, because if any random errors occur in the raw reads, they are easily corrected thanks to the exceptionally high consensus accuracy of this platform. In this study, our aim was to re-evaluate the PERV transcriptome with a multiplatform approach that included Illumina, PacBio and ONT cDNA sequencing, as well as regular PCR for validation purposes. We used oligo(dT)20 primer-based sequencing for ONT, oligo(dT)20 and random primer-based sequencing for both the PacBio and Illumina platforms. Additionally, we also applied both amplified and non-amplified PacBio Iso-Seq techniques. MATERIALS AND METHODS Cells and viruses Immortalized porcine kidney cells of the cell line PK-15 (ATCC CCL-33, Manassas, VA, USA) hosting PERV isolate Szeged were maintained in Dulbecco's modified Eagle's medium (Gibco, Invitrogen, Carlsbad, CA, USA) supplemented with 5% fetal bovine serum (Gibco, Invitrogen) and 80 μg mL–1 gentamycin (Gibco, Invitrogen) at 37°C, under 5% CO2. Three freeze/thaw cycles were used, and the sample was then centrifuged at 10 000× g for 15 min. RNA purification Total RNA isolation Total RNA was isolated from the PK-15 cells with the Nucleospin RNA kit (Macherey-Nagel, Düren, Germany), according to the kit manual. Contaminating DNA was removed by on-column RNase free rDNase treatment (supplied by the kit), and the purified sample was further treated by TURBO DNA-free™ Kit (Life Technologies, Carlsbad, CA, USA) to eliminate potential residual DNA contamination. RNA concentration was measured using the Qubit 2.0 and the RNA BR Assay Kit (Life Technologies). The RNA integrity was checked with Agilent 2100 Bioanalyzer. The RNA sample was stored at –80°C until use. Isolation of polyadenylated RNAs Isolation of the poly(A)+ RNA fraction was carried out by the Spin-Column Protocol of the Oligotex mRNA Mini Kit (Qiagen, Hilden, Germany). The RNA quantity was measured by Qubit RNA HS Assay Kit (Life Technologies). Illumina HiScanSQ sequencing Two library preparation methods were carried out for Illumina sequencing. Libraries were constructed for paired-end 100 bp sequencing using the Illumina ScriptSeq v2 RNA-Seq Library Preparation Kit (Epicentre, Madison, WY, USA), which includes a random-primed cDNA synthesis reaction step. For the poly(A)-sequencing, a single-end library was prepared by using anchored adaptor-primer oligonucleotides with a (VN)T20 primer sequence. FastQC v0.10.1. was used for the quality assessment of raw read files. PacBio RSII isoform sequencing Three different library preparation approaches were used for SMRTbell library preparation. The non-amplified method Polyadenylated RNA samples were converted to cDNAs with SuperScript Double-Stranded cDNA Synthesis Kit (Life Technologies; the included SuperScript II was changed to SuperScript III enzyme). Anchored Oligo(dT)20 primers (Life Technologies) were used for the reverse transcription reactions. Double-stranded cDNAs were quantified with the Qubit HS dsDNA Assay Kit (Life Technologies). SMRTbell libraries were prepared from cDNAs with PacBio DNA Template Prep Kit 1.0, following the ‘Very Low (10 ng) Input 2 kb Template Preparation and Sequencing with Carrier DNA’ protocol. Agilent Technologies 2100 Bioanalyzer (Santa Clara, CA, USA) was used to determine the quality of the SMRTbell templates. DNA polymerase binding kit XL 1.0 and v2 sequencing primers were used for annealing and polymerase binding. The polymerase-template complexes were bound to MagBeads using the PacBio's MagBead Binding Kit. Sequencing was performed on the PacBio RSII platform (Menlo Park, CA, USA) with P5-C3 reagents. Movie lengths were 180 min (one movie was recorded for each SMRT cell). Library preparation approaches from amplified cDNAs The following PacBio protocols were used for the cDNA production: Isoform Sequencing (Iso-Seq) using the SMARTer PCR cDNA Synthesis Kit (Clontech, Mountain View, CA, USA, as recommended by the manufacturer's) and No Size Selection or the Manual Agarose-gel Size Selection. Single-stranded cDNAs were synthesized from the polyA + RNAs by using 3’ SMART® CDS Primer II A (included in the Clontech kit) or adapter-linked GC-rich random primers. The first-strand cDNAs were amplified by PCR, using the SMARTer PCR cDNA Synthesis Kit (Clontech) and the KAPA HiFi Enzyme (Kapa Biosystems, Basel, Switzerland) following the PacBio protocol. cDNA sample (500 ng) was used for the SMRTbell library preparation, using the PacBio DNA Template Prep Kit 1.0. DNA/Polymerase Binding Kit P6 kit was used for the production of the polymerase/template complexes. Sequencing was carried out on an RS II sequencer with DNA Sequencing Reagent 4.0 (P/N 100-356-200). The movie lengths were 240 min (one movie was recorded for each SMRT cell). Oxford nanopore MinION sequencing Libraries were prepared using the SQK-LSK108 Ligation Sequencing kit (Oxford Nanopore Technologies, Oxford, UK) applying the 1D strand switching cDNA by ligation protocol. Briefly, (ss)cDNA synthesis was carried out using SuperScript IV Reverse Transcriptase (Invitrogen/Thermo Fisher Scientific) and an anchored adapter-primer with (VN)T20 nucleotides. A 5’ adapter sequence with three O-methyl-guanine RNA bases (Integrated DNA Technologies, Coralville, IA, USA) was added for strand switching. PCR was carried out using the primers supplied in the kit and KapaHiFi high fidelity DNA polymerase (Kapa Biosystems). End repair was conducted using NEBNext End repair/dA-tailing Module (New England Biolabs, Ipswich, MA, USA) followed by adapter ligation using adapters supplied in the kit and NEB Blunt/TA Ligase Master Mix (New England Biolabs). cDNA was purified between each step using Agencourt AMPure XP magnetic beads (Beckman Coulter, Brea, CA, USA) and the sample concentration was determined using a Qubit 2.0 Fluorometer through the use of the Qubit (ds)DNA HS Assay Kit (Life Technologies/Thermo Fisher Scientific). Libraries were loaded on R9.5 SpotON Flow Cells, base calling was performed using Albacore v1.2.6. PCR analysis The putative novel and the previously described transcripts isoforms were validated by PCR analysis. cDNAs were created by reverse transcription with SuperScript IV Reverse Transcriptase (Life Technologies) following the manufacturer's recommendations. Samples were amplified by using the Applied Biosystem's Veriti Thermal Cycler with KAPA HiFi PCR Kit (KAPA Biosystems) according to the manufacturer's recommendations. The running conditions were as follows: 3 min at 95°C for initial denaturation, followed by 35 cycles at 98°C for 20 s (denaturation), 63°C for 20 s (annealing) and at 72°C for 2 min (extension). Final elongation was set at 72°C for 5 min. The primers used in this study are listed in Table 1. Table 1. Novel transcripts. Tr. name TATA-box TSS Exon end Exon start PA signal TES gpe0-d – 1 255* 8098* 8596–8601 8625 inltr – 1 – – 456–461 485 env-AT 371–385 399 623 5807 6198–6203 6217/6225 gpe-d 371–385 399 6029* 6109* 6198–6203 6217/6225 env-d-AT 371–385 399 623/6029* 5807/6109* 6198–6203 6217/6225 gpe-d2-1 371–385 399 3931*/6029*/6509 4741*/6109*/8328 8596–8601 8625 env-d-1 371–385 399 623/6029*/6509 5807/6109*/8205 8596–8601 8625 env-d-2 371–385 399 623/6029*/6509 5807/6109*/8358 8596–8601 8625 gpe-AT 371–385 399 – – 6198–6203 6217/6225 env 371–385 399 623 5807 8596–8601 8625 gpe-d-AT 371–385 399 6029* 6109* 8596–8601 8625 env-d 371–385 399 623/6029* 5807/6109* 8596–8601 8625 gpe-d2 371–385 399 3931*/6029 4741*/6109 8596–8601 8625 env1.5 – 7248 – – 8596–8601 8625 env1.5 – 7563 – – 8596–8601 8625 env1.5 – 7623 – – 8596–8601 8625 Tr. name TATA-box TSS Exon end Exon start PA signal TES gpe0-d – 1 255* 8098* 8596–8601 8625 inltr – 1 – – 456–461 485 env-AT 371–385 399 623 5807 6198–6203 6217/6225 gpe-d 371–385 399 6029* 6109* 6198–6203 6217/6225 env-d-AT 371–385 399 623/6029* 5807/6109* 6198–6203 6217/6225 gpe-d2-1 371–385 399 3931*/6029*/6509 4741*/6109*/8328 8596–8601 8625 env-d-1 371–385 399 623/6029*/6509 5807/6109*/8205 8596–8601 8625 env-d-2 371–385 399 623/6029*/6509 5807/6109*/8358 8596–8601 8625 gpe-AT 371–385 399 – – 6198–6203 6217/6225 env 371–385 399 623 5807 8596–8601 8625 gpe-d-AT 371–385 399 6029* 6109* 8596–8601 8625 env-d 371–385 399 623/6029* 5807/6109* 8596–8601 8625 gpe-d2 371–385 399 3931*/6029 4741*/6109 8596–8601 8625 env1.5 – 7248 – – 8596–8601 8625 env1.5 – 7563 – – 8596–8601 8625 env1.5 – 7623 – – 8596–8601 8625 TSS: transcript start site, PA: poly(A), TES: transcript end site; positions separated with a slash represent separate entries; asterisks (*) mark a deletion. View Large Data analysis and visualization Reads from the Illumina sequencing were aligned with Bowtie 2 (Langmead and Salzberg 2012), while reads from PacBio and ONT sequencing with GMAP mapper (Wu and Watanabe 2005) to the host genome (Sus scrofa assembly: Sscrofa10.2) and to the genome of PERV isolate Szeged (GeneBank accession number: KY484771), which had been previously sequenced and aligned by our group. For visualization of mapped reads, we used IGV (Thorvaldsdóttir, Robinson and Mesirov 2013). Poly(A) signals were predicted in silico by querying putative signal motifs to the 3’ ends of the transcripts. TATA boxes were predicted using JASPAR POLII database (Mathelier et al. 2016) and FIMO (Find Individual Motif Occurrences) software (Grant, Bailey and Noble 2011). RESULTS Analysis of PERV transcriptome using multiplatform techniques The Illumina HiScanSQ sequencing with random hexamer primers yielded 494 638 of 100-bp long reads with an average genome coverage of 5703, while the polyadenylation sequencing (PA-Seq) method resulted in 99 499 of 50 nucleotide (nt)-long reads with an average coverage of 573, and a Reads Per Kilobase Million (RPKM) value of 0.099499. The PacBio long read sequencing yielded a total of 17 544 reads, and an average genome coverage of 3238. The average length of the ROIs was 1555 nts. The Oxford Nanopore Technologies MinION sequencing yielded 7370 reads with an average read length of 1512 nts and average genome coverage of 1285. Determination of the 5’- and 3’-ends of PERV transcripts The 5’- and the 3’-ends of the already described transcripts map to nucleotide 399 and nucleotide 8625 respectively on the PERV-Szeged genome (Fig. 1, white arrow-rectangles). The upstream TATA box was predicted in silico at genomic position 371–385, while the Poly(A) signal (PAS) of these transcripts was predicted to nt 8596–8601 (Table 2). Akiyoshi and colleagues described a 7.5 kb-long transcript using northern blot analysis (Akiyoshi et al. 1998). We identified this as a 7339 bp-long transcript mapping to 399 and 8625 and have named it gpe-d2. Four of the six novel transcripts are located on the genome as follows: inltr mapped from nt 1 to 485, gpe0-d mapped from nt 1 to nt 8625, env-d and gpe-d mapped from nt 399 to nt 8625. The remaining two transcripts are truncated versions of env, and are located as follows: env1.5 mapped from nt 7248 to nt 8625, env1.3 mapped from nt 7623 to nt 8625 (Fig. 1, dark blue arrow-rectangles). Figure 1. View largeDownload slide Location of the already characterized and the novel transcripts on the PERV-Szeged genome. Arrow-rectangles in orange: ORFs, arrow-rectangles in gray: already known transcripts, arrow-rectangles in light blue: length variants, arrow-rectangles in dark blue: novel transcripts, novel splice isoforms. Dashed lines represent deletions of the genome while continuous lines represent introns. Figure 1. View largeDownload slide Location of the already characterized and the novel transcripts on the PERV-Szeged genome. Arrow-rectangles in orange: ORFs, arrow-rectangles in gray: already known transcripts, arrow-rectangles in light blue: length variants, arrow-rectangles in dark blue: novel transcripts, novel splice isoforms. Dashed lines represent deletions of the genome while continuous lines represent introns. Table 2. Primers used for PCR analysis. Name Sequence Start position GPE234_fw TGGCAGCCAGCAGGGTCTGG 8167 GPE234_rev GGACCTCCGGAGCTATTTTA 234 GPE532_fw AACATAGACTGAATCTCCAA 5935 GPE532_rev ACGAGGGGGATTGTTCTTTT 532 GPE3857_fw TCCCTCTAGATATGAGATCT 4883 GPE3857_rev GCTGGATTTTGCAGACTGTG 3857 GPE5964_fw GTTTATGGGAGTTCGGGCTG 6204 GPE5964_rev CTCGGTGGAAGGGACCTTAT 5964 GPE6391_fw ATGGGGTTCACAACAAAGCC 8514 GPE6391_rev CAGGACCCCCAAATAATGAA 6391 Name Sequence Start position GPE234_fw TGGCAGCCAGCAGGGTCTGG 8167 GPE234_rev GGACCTCCGGAGCTATTTTA 234 GPE532_fw AACATAGACTGAATCTCCAA 5935 GPE532_rev ACGAGGGGGATTGTTCTTTT 532 GPE3857_fw TCCCTCTAGATATGAGATCT 4883 GPE3857_rev GCTGGATTTTGCAGACTGTG 3857 GPE5964_fw GTTTATGGGAGTTCGGGCTG 6204 GPE5964_rev CTCGGTGGAAGGGACCTTAT 5964 GPE6391_fw ATGGGGTTCACAACAAAGCC 8514 GPE6391_rev CAGGACCCCCAAATAATGAA 6391 The start positions mark the first nucleotide of the primer on the genome of PERV-Szeged. View Large Identification of novel length isoforms with PacBio sequencing We have found two alternative transcript end sites (TES) one at positions 6217 and the other at position 6225, with the former being three times more abundant than the latter. PASs were predicted in silico to nt 6198–6203 (Table 2). Both of the previously described transcripts were found to be terminated alternatively in these positions. We designated these transcripts gpe-AT and env-AT. Two more transcripts from the deleted genomic copies named gpe-d-AT and env-d-AT were also found to be terminated at these positions. A 5’ UTR length isoform of the truncated env1.3 transcript designated env1.3-L is located between nts 7563 and 8625. Determination of novel splice variants We were able to identify three novel splice variants. Env-d-1 has a 1696 nt-long intron spanning from 6509 to 8205 (Fig. 2, lane E band 3). Env-d-2 has an 1849 nt-long intron having the same splice donor site as env-d-1, but a different acceptor site at position 8358 (Fig. 2, lane E band 1). Gpe-d2-1 has a 1719-nt long intron with a donor site at 6509 nt and an acceptor site at 8329 nt (Fig. 2, lane E band 2). Figure 2. View largeDownload slide Novel splice variants and transcripts of the PERV-Szeged clones. One percent agarose gel electrophoresis of the novel splice variants and new transcripts of the PERV-Szeged clones. Lanes missing an apostrophe were loaded with cDNA products from RT-PCR while lanes with the same letter and an apostrophe were loaded with PCR products of genomic DNA. Lane M was loaded with GeneRuler 1 kb Plus DNA Ladder (Thermo Fisher Scientific). Staining was performed with GelRed (Biotium). Lane no-RT was loaded with no-RT control. On lanes A, A’, C, C’ and D, D’, bands marked with a star represent new transcripts of the PERV-Szeged clones with an amplicon length of 90, 219 and 216 bp, respectively. On lane B, B’ bands marked with a star represent the already known env transcript. On lane E, E’, bands marked represent three new splice variants with an amplicon length of 274, 304 and 427 bp. The size of each band of the 1 kb Plus DNA Ladder is presented to the right of its lane. Green arrows represent the location of the PCR primers. Figure 2. View largeDownload slide Novel splice variants and transcripts of the PERV-Szeged clones. One percent agarose gel electrophoresis of the novel splice variants and new transcripts of the PERV-Szeged clones. Lanes missing an apostrophe were loaded with cDNA products from RT-PCR while lanes with the same letter and an apostrophe were loaded with PCR products of genomic DNA. Lane M was loaded with GeneRuler 1 kb Plus DNA Ladder (Thermo Fisher Scientific). Staining was performed with GelRed (Biotium). Lane no-RT was loaded with no-RT control. On lanes A, A’, C, C’ and D, D’, bands marked with a star represent new transcripts of the PERV-Szeged clones with an amplicon length of 90, 219 and 216 bp, respectively. On lane B, B’ bands marked with a star represent the already known env transcript. On lane E, E’, bands marked represent three new splice variants with an amplicon length of 274, 304 and 427 bp. The size of each band of the 1 kb Plus DNA Ladder is presented to the right of its lane. Green arrows represent the location of the PCR primers. Identification of novel PERV transcripts We identified six novel transcripts and termed these inltr, gpe0-d, env-d, gpe-d, env1.5 and env1.3 (Fig. 1, dark blue arrow-rectangles). Inltr was detected in relatively low abundance using PacBio (0.6% of the reads), ONT (0.96% of the reads) and Illumina sequencing. The PAS of this transcript is predicted to be nt 456–461. In silico analysis predicted a 90-bp long ORF encompassing 256 to 345 nt, which might encode a short, 30 amino acid long peptide. Gpe0-d is transcribed from a clone with a large deletion encompassing the region between nts 256 and 8098, while env-d and gpe-d are transcribed from a genomic region with a deletion between nts 6030 and 6108. The two truncated versions of env incorporate two shorter in-frame ORFs. Env1.5 harbors a 684 nt long, while env1.3 a 276 nt ORF which may result in an N-terminally truncated env protein. Determination of the deletions of PERV Mapping the sequencing reads to the genome of PERV-Szeged uncovered several intronic regions. These could be the results of either splicing or deleted genomic segments. PCR analysis revealed multiple deleted regions on the viral genome (Fig. 1, dashed lines). Based on these data, we suggest the existence of at least four different PERV clones in the genome of PK-15 cells. The first clone harbors no deletion; the second one is deleted between 256 and 8098 nt (Fig. 2, lane A and A’); the third is deleted between nt 6030 and 6108 (Fig. 2, lane D and D’), while the fourth clone has a deletion between nt 3932 and 4740 (Fig. 2, lane C and C’) and 6030 and 6108 (Fig. 2, lane D and D’). DISCUSSION Short-read sequencing techniques—despite their widespread use for structural annotation of transcriptomes—are suboptimal for example for the identification of alternatively transcribed and spliced transcripts (Mortazavi et al. 2008) and transcript isoforms. Long-read sequencing however has proven to be an excellent platform for the identification of splice and length transcript variants in human (Sharon et al. 2013) and herpesviruses (Tombácz et al. 2016, 2017). In this study, we applied a massively parallel long-read sequencing platform to characterize the transcriptome of PERV in PK-15 cells. We identified six novel transcripts, five length isoforms and three splice variants. The relative low amounts of some of these transcripts and their overlapping nature makes their identification difficult in gel-based assays and PCR. Additionally, we were also able to pinpoint four deleterious clones of PERV-Szeged using PCR. Our investigation has succeeded in uncovering a complex transcriptional landscape in PERV, and to characterize the deletions of the provirus clones in the host genome. Acknowledgements We would like to thank Marianna Ábrahám and Csilla Magyarné Papdi (University of Szeged) for technical assistance. Authors’ contributions NM carried out the PCR experiments, analyzed the data, participated in the sequence alignment and drafted the manuscript. ZBa analyzed the data and participated in the sequence alignment. DT carried out the PacBio sequencing, participated in the design of the study and took part in drafting the manuscript. ZC propagated the cells, and prepared the RNA, DNA and cDNA samples. AS participated in the sequence alignment. MS participated in the coordination of the study. ZBo conceived, designed and coordinated the study and wrote the manuscript. All authors have read and approved the final version of the manuscript. FUNDING This work was supported by the following: National Institutes of Health (NIH) Centers of Excellence in Genomic Science (CEGS)—Center for Personal Dynamic Regulomes: [grant number 5P50HG00773502 to MS]; TAMOP-Social Renewal Operational Programme: [grant number TAMOP-4.2.6-14/1 -288 to ZBo]; Bolyai Janos Scholarship of the Hungarian Academy of Sciences: [grant number 2015-18 to DT], and Swiss-Hungarian Cooperation Programme [grant number SH/7/2/8 to ZBo]. Conflict of interest. None declared. DECLARATIONS Ethics approval and consent to participate: Not applicable. Consent for publication: Not applicable. Availability of data and material: The datasets generated and/or analyzed during the current study are available in the Sequence Read Archive database accessible under accession: PRJNA381012. REFERENCES Akiyoshi DE, Denaro M, Zhu Het al. Identification of a full-length cDNA for an endogenous retrovirus of miniature swine. J Virol 1998; 72: 4503– 7. Google Scholar PubMed Czauderna F, Fischer N, Boller Ket al. Establishment and characterization of molecular clones of porcine endogenous retroviruses replicating on human cells. J Virol 2000; 74: 4028– 38. Google Scholar CrossRef Search ADS PubMed Denner J, Tönjes RR. Infection barriers to successful xenotransplantation focusing on porcine endogenous retroviruses. Clin Microbiol Rev 2012; 25: 318– 43. Google Scholar CrossRef Search ADS PubMed Djebali S, Davis CA, Merkel Aet al. Landscape of transcription in human cells. Nature 2012; 489: 101– 8. Google Scholar CrossRef Search ADS PubMed Feng W, Dai Y, Mou Let al. The potential of the combination of CRISPR/Cas9 and pluripotent stem cells to provide human organs from chimaeric pigs. Indian J Mar Sci 2015; 16: 6545– 56. Grant CE, Bailey TL, Noble WS. FIMO: scanning for occurrences of a given motif. Bioinformatics 2011; 27: 1017– 8. Google Scholar CrossRef Search ADS PubMed Krach U, Fischer N, Czauderna Fet al. Comparison of replication-competent molecular clones of porcine endogenous retrovirus class A and class B derived from pig and human cells. J Virol 2001; 75: 5465– 72. Google Scholar CrossRef Search ADS PubMed Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Meth 2012; 9: 357– 9. Google Scholar CrossRef Search ADS Laver T, Harrison J, O’Neill PAet al. Assessing the performance of the Oxford Nanopore Technologies MinION. Biomol Detection Quant 2015; 3: 1– 8. Google Scholar CrossRef Search ADS Mathelier A, Fornes O, Arenillas DJet al. JASPAR 2016: a major expansion and update of the open-access database of transcription factor binding profiles. Nucleic Acids Res 2016; 44: D110– 5. Google Scholar CrossRef Search ADS PubMed Miyamoto M, Motooka D, Gotoh Ket al. Performance comparison of second- and third-generation sequencers using a bacterial genome with two chromosomes. BMC Genomics 2014; 15: 699. Google Scholar CrossRef Search ADS PubMed Mortazavi A, Williams BA, McCue Ket al. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 2008; 5: 621– 8. Google Scholar CrossRef Search ADS PubMed Oláh P, Tombácz D, Póka Net al. Characterization of pseudorabies virus transcriptome by Illumina sequencing. BMC Microbiol 2015; 15: 130. Google Scholar CrossRef Search ADS PubMed Sharon D, Tilgner H, Grubert Fet al. A single-molecule long-read survey of the human transcriptome. Nat Biotechnol 2013; 31: 1009– 14. Google Scholar CrossRef Search ADS PubMed Thorvaldsdóttir H, Robinson JT, Mesirov JP. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Brief Bioinform 2013; 14: 178– 92. Google Scholar CrossRef Search ADS PubMed Todaro GJ, Benveniste RE, Lieber MMet al. Characterization of a type C virus released from the porcine cell line PK(15). Virology 1974; 58: 65– 74. Google Scholar CrossRef Search ADS PubMed Tombácz D, Balázs Z, Csabai Zet al. Characterization of the dynamic transcriptome of a herpesvirus with long-read single molecule real-time sequencing. Sci Rep 2017; 7: 43751. Google Scholar CrossRef Search ADS PubMed Tombácz D, Csabai Z, Oláh Pet al. Full-length isoform sequencing reveals novel transcripts and substantial transcriptional overlaps in a herpesvirus. PLoS One 2016; 11: e0162868. Google Scholar CrossRef Search ADS PubMed Wang Z, Gerstein M, Snyder M. RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 2009; 10: 57– 63. Google Scholar CrossRef Search ADS PubMed Weirather JL, de Cesare M, Wang Yet al. Comprehensive comparison of Pacific Biosciences and Oxford Nanopore Technologies and their applications to transcriptome analysis. F1000Res 2017; 6: 100. Google Scholar CrossRef Search ADS PubMed Wu J, Platero-Luengo A, Sakurai Met al. Interspecies chimerism with mammalian pluripotent stem cells. Cell 2017; 168: 473– 486.e15. Google Scholar CrossRef Search ADS PubMed Wu TD, Watanabe CK. GMAP: a genomic mapping and alignment program for mRNA and EST sequences. Bioinformatics 2005; 21: 1859– 75. Google Scholar CrossRef Search ADS PubMed © FEMS 2018. All rights reserved. For permissions, please e-mail: email@example.com
FEMS Microbiology Letters – Oxford University Press
Published: Mar 1, 2018
It’s your single place to instantly
discover and read the research
that matters to you.
Enjoy affordable access to
over 12 million articles from more than
10,000 peer-reviewed journals.
All for just $49/month
It’s easy to organize your research with our built-in tools.
All the latest content is available, no embargo periods.
“Whoa! It’s like Spotify but for academic articles.”@Phil_Robichaud