Background: Viral infection causes multiple forms of human cancer, and HPV infection is the primary factor in cervical carcinomas. Recent single-cell RNA-seq studies highlight the tumor heterogeneity present in most cancers, but virally induced tumors have not been studied. HeLa is a well characterized HPV+ cervical cancer cell line. Result: We developed a new high-throughput platform to prepare single-cell RNA on a nanoliter scale based on a customized microwell chip. Using this method, we successfully amplified full-length transcripts of 669 single HeLa S3 cells, 40 of which were randomly selected for single-cell RNA sequencing. Based on these data, we obtained a comprehensive understanding of the heterogeneity of HeLa S3 cells in gene expression, alternative splicing and fusions. Furthermore, we identified a high diversity of HPV-18 expression and splicing at the single-cell level. By co-expression analysis we identified 283 E6, E7 co-regulated genes, including CDC25, PCNA, PLK4, BUB1B and IRF1, which are known to interact with HPV viral proteins. Conclusion: Our results reveal the heterogeneity of a virus-infected cell line. They not only provide a transcriptome characterization of HeLa S3 cells at the single-cell level, but also demonstrate the power of single-cell RNA-seq analysis of virally infected cells and cancers.

journal article

Open Access Collection

SMAP: a streamlined methylation analysis pipeline for bisulfite sequencing

Gao, Shengjie; Zou, Dan; Mao, Likai; Zhou, Quan; Jia, Wenlong; Huang, Yi; Zhao, Shancen; Chen, Gang; Wu, Song; Li, Dongdong; Xia, Fei; Chen, Huafeng; Chen, Maoshan; Ørntoft, Torben F; Bolund, Lars; Sørensen, Karina D

2015 GigaScience

doi: 10.1186/s13742-015-0070-9 pmid: 26140213

journal article

Open Access Collection

High-coverage sequencing and annotated assembly of the genome of the Australian dragon lizard Pogona vitticeps

Georges, Arthur; Li, Qiye; Lian, Jinmin; O’Meally, Denis; Deakin, Janine; Wang, Zongji; Zhang, Pei; Fujita, Matthew; Patel, Hardip; Holleley, Clare; Zhou, Yang; Zhang, Xiuwen; Matsubara, Kazumi; Waters, Paul; Graves, Jennifer; Sarre, Stephen; Zhang, Guojie

2015 GigaScience

doi: 10.1186/s13742-015-0085-2

journal article

Open Access Collection

GenomeTester4: a toolkit for performing basic set operations - union, intersection and complement on k-mer lists

Kaplinski, Lauris; Lepamets, Maarja; Remm, Maido

2015 GigaScience

doi: 10.1186/s13742-015-0097-y pmid: 26640690

journal article

Open Access Collection

Investigation into the annotation of protocol sequencing steps in the sequence read archive

Alnasir, Jamie; Shanahan, Hugh P

2015 GigaScience

doi: 10.1186/s13742-015-0064-7 pmid: 25960871

Background: The workflow for the production of high-throughput sequencing data from nucleic acid samples is complex. There are a series of protocol steps to be followed in the preparation of samples for next-generation sequencing. The quantification of bias in a number of protocol steps, namely DNA fractionation, blunting, phosphorylation, adapter ligation and library enrichment, remains to be determined. Results: We examined the experimental metadata of the public repository Sequence Read Archive (SRA) in order to ascertain the level of annotation of important sequencing steps in submissions to the database. Using SQL relational database queries (using the SRAdb SQLite database generated by the Bioconductor consortium) to search for keywords commonly occurring in key preparatory protocol steps partitioned over studies, we found that 7.10%, 5.84% and 7.57% of all records (fragmentation, ligation and enrichment, respectively), had at least one keyword corresponding to one of the three protocol steps. Only 4.06% of all records, partitioned over studies, had keywords for all three steps in the protocol (5.58% of all SRA records). Conclusions: The current level of annotation in the SRA inhibits systematic studies of bias due to these protocol steps. Downstream from this, meta-analyses and comparative studies based on these data will have a source of bias that cannot be quantified at present.
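The keyword search described in this abstract can be illustrated with a minimal sketch. It assumes a hypothetical toy table `records(study, protocol)` in an in-memory SQLite database; the real SRAdb schema, the authors' actual keyword lists and their per-study partitioning are not given here, so the table, the column names and the `KEYWORDS` stems are all illustrative assumptions.

```python
import sqlite3

# Hypothetical keyword stems per protocol step (not the authors' lists).
KEYWORDS = {
    "fragmentation": ["fragment", "sonicat", "shear"],
    "ligation": ["ligat"],
    "enrichment": ["enrich", "pcr", "amplif"],
}


def fraction_with(conn, steps):
    """Fraction of records whose protocol text mentions every step in `steps`.

    A step counts as mentioned if at least one of its keyword stems occurs
    in the (lower-cased) protocol text. NULL protocols never match.
    """
    clauses, params = [], []
    for step in steps:
        words = KEYWORDS[step]
        # One OR-group per step; groups are ANDed together below.
        clauses.append(
            "(" + " OR ".join("LOWER(protocol) LIKE ?" for _ in words) + ")"
        )
        params.extend(f"%{w}%" for w in words)
    sql = "SELECT COUNT(*) FROM records WHERE " + " AND ".join(clauses)
    matched = conn.execute(sql, params).fetchone()[0]
    total = conn.execute("SELECT COUNT(*) FROM records").fetchone()[0]
    return matched / total if total else 0.0


# Toy data standing in for SRA experiment metadata (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (study TEXT, protocol TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?)",
    [
        ("S1", "DNA was sheared, adapters ligated, library enriched by PCR"),
        ("S1", "Fragmented by sonication"),
        ("S2", None),
        ("S2", "standard protocol"),
    ],
)

for step in KEYWORDS:
    print(step, fraction_with(conn, [step]))
print("all three", fraction_with(conn, list(KEYWORDS)))
```

The sketch counts over all records at once; it does not attempt to reproduce the per-study partitioning mentioned in the abstract.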

journal article

Open Access Collection

An image database of Drosophila melanogaster wings for phenomic and biometric analysis

Sonnenschein, Anne; VanderZee, David; Pitchers, William R; Chari, Sudarshan; Dworkin, Ian

2015 GigaScience

doi: 10.1186/s13742-015-0065-6 pmid: 27390931

Background: Extracting important descriptors and features from images of biological specimens is an ongoing challenge. Features are often defined using landmarks and semi-landmarks that are determined a priori based on criteria such as homology or some other measure of biological significance. An alternative, widely used strategy uses computational pattern recognition, in which features are acquired from the image de novo. Subsets of these features are then selected based on objective criteria. Computational pattern recognition has been extensively developed primarily for the classification of samples into groups, whereas landmark methods have been broadly applied to biological inference. Results: To compare these approaches and to provide a general community resource, we have constructed an image database of Drosophila melanogaster wings - individually identifiable and organized by sex, genotype and replicate imaging system - for the development and testing of measurement and classification tools for biological images. We have used this database to evaluate the relative performance of current classification strategies. Several supervised parametric and nonparametric machine learning algorithms were used on principal components extracted from geometric morphometric shape data (landmarks and semi-landmarks). For comparison, we also classified phenotypes based on de novo features extracted from wing images using several computer vision and pattern recognition methods as implemented in the Bioimage Classification and Annotation Tool (BioCAT). Conclusions: Because we were able to thoroughly evaluate these strategies using the publicly available Drosophila wing database, we believe that this resource will facilitate the development and testing of new tools for the measurement and classification of complex biological phenotypes.

journal article

Open Access Collection

Benchmark datasets for 3D MALDI- and DESI-imaging mass spectrometry

Oetjen, Janina; Veselkov, Kirill; Watrous, Jeramie; McKenzie, James; Becker, Michael; Hauberg-Lotte, Lena; Kobarg, Jan; Strittmatter, Nicole; Mróz, Anna; Hoffmann, Franziska; Trede, Dennis; Palmer, Andrew; Schiffler, Stefan; Steinhorst, Klaus; Aichler, Michaela; Goldin, Robert; Guntinas-Lichius, Orlando; Eggeling, Ferdinand; Thiele, Herbert;

journal article

Open Access Collection

Bacterial and viral identification and differentiation by amplicon sequencing on the MinION nanopore sequencer

Kilianski, Andy; Haas, Jamie; Corriveau, Elizabeth; Liem, Alvin; Willis, Kristen; Kadavy, Dana; Rosenzweig, C; Minot, Samuel

2015 GigaScience

doi: 10.1186/s13742-015-0051-z pmid: 25815165

journal article

Open Access Collection

A spectrum of sharing: maximization of information content for brain imaging data

Calhoun, Vince

2015 GigaScience

doi: 10.1186/s13742-014-0042-5 pmid: 25653850

Efforts to expand sharing of neuroimaging data have been growing exponentially in recent years. There are several different types of data sharing which can be considered to fall along a spectrum, ranging from simpler and less informative to more complex and more informative. In this paper we consider this spectrum for three domains: data capture, data density, and data analysis. Here the focus is on the right end of the spectrum, that is, how to maximize the information content while addressing the challenges. A summary of the associated challenges and possible solutions is presented in this review and includes: 1) a discussion of tools to monitor the quality of data as it is collected and to encourage adoption of data mapping standards; 2) sharing of time-series data (not just summary maps or regions); and 3) the use of analytic approaches which maximize sharing potential as much as possible. Examples of existing solutions for each of these points, which we developed in our lab, are also discussed, including the use of a comprehensive beginning-to-end neuroinformatics platform and the use of flexible analytic approaches, such as independent component analysis and multivariate classification approaches, such as deep learning.

journal article

Open Access Collection

Improving functional magnetic resonance imaging reproducibility

Pernet, Cyril; Poline, Jean-Baptiste

2015 GigaScience

doi: 10.1186/s13742-015-0055-8 pmid: 25830019

Background: The ability to replicate an entire experiment is crucial to the scientific method. With the development of more and more complex paradigms, and the variety of analysis techniques available, fMRI studies are becoming harder to reproduce. Results: In this article, we aim to provide practical advice to fMRI researchers not versed in computing, in order to make studies more reproducible. All of these steps require researchers to move towards a more open science, in which all aspects of the experimental method are documented and shared. Conclusion: Only by sharing experiments, data, metadata, derived data and analysis workflows will neuroimaging establish itself as a true data science.


Background: DNA methylation has important roles in the regulation of gene expression and cellular specification. Reduced representation bisulfite sequencing (RRBS) has prevailed in methylation studies due to its cost-effectiveness and single-base resolution. The rapid accumulation of RRBS data demands well designed analytical tools. Findings: To streamline the data processing of DNA methylation from multiple RRBS samples, we present a flexible pipeline named SMAP, whose features include: (i) handling of single- and/or paired-end diverse bisulfite sequencing data with reduced false-positive rates in differentially methylated regions; (ii) detection of allele-specific methylation events with improved algorithms; (iii) a built-in pipeline for detection of novel single nucleotide polymorphisms (SNPs); (iv) support of multiple user-defined restriction enzymes; (v) conduction of all methylation analyses in a single-step operation when well configured. Conclusions: Simulation and experimental data validated the high accuracy of SMAP for SNP detection and methylation identification. Most analyses required in methylation studies (such as estimation of methylation levels, differentially methylated cytosine groups, and allele-specific methylation regions) can be executed readily with SMAP. All raw data from diverse samples could be processed in parallel and 'packetized' streams. A simple user guide to the methylation applications is also provided.

2015 GigaScience

doi: 10.1186/s13742-015-0059-4 pmid: 25941567