Abstract

Summary
The availability of cancer genomic data makes it possible to analyze genes related to cancer. Cancer usually results from alterations in a set of genes, and the signal of a single gene can be masked by background noise. Here, we present a web server named Gene Set Cancer Analysis (GSCALite) to analyze a set of genes in cancers with the following functional modules: (i) differential expression in tumor versus normal, and survival analysis; (ii) genomic variations and their survival analysis; (iii) gene expression associated cancer pathway activity; (iv) miRNA regulatory network for genes; (v) drug sensitivity for genes; (vi) normal tissue expression and eQTL for genes. GSCALite is a user-friendly web server for dynamic analysis and visualization of gene sets in cancer and of drug sensitivity correlation, which will be of broad utility to cancer researchers.

Availability and implementation
GSCALite is available at http://bioinfo.life.hust.edu.cn/web/GSCALite/.

Supplementary information
Supplementary data are available at Bioinformatics online.

1 Introduction
Next generation sequencing (NGS) technology has emerged as a powerful method for cancer genomics analysis (Ding et al., 2014). The Cancer Genome Atlas (TCGA) (Weinstein et al., 2013), Genotype-Tissue Expression (GTEx) (GTEx Consortium, 2015) and other projects have generated a large amount of complex, multi-omics data for cancer and normal samples. These publicly available datasets provide unprecedented opportunities to understand cancer causal genes and mechanisms, find candidate drug targets, and screen genes associated with phenotypes. Recently, a few excellent web servers have become available, such as cBioPortal, which focuses on genomic variations based on multi-omics data (Cerami et al., 2012), and GEPIA (Tang et al., 2017) and Oncomine (Rhodes et al., 2007), which provide expression and survival analysis for single genes.
However, cancer initiation, progression and metastasis tend to result from mutations and/or expression alterations of a set of genes or pathways (Harvey et al., 2013). Thus, gene set association analysis using large-scale cancer multi-omics and drug sensitivity data is highly valuable for cancer research. Therefore, we developed an interactive web-based application named GSCALite (Gene Set Cancer Analysis) to analyze and visualize the expression, variation and correlation of a gene set in cancers in a flexible manner. GSCALite offers analyses including gene differential expression, overall survival, single nucleotide variation, copy number variation, methylation, pathway activity, miRNA regulation, normal tissue expression and drug sensitivity. GSCALite provides various publication-ready figures and tables for users, and the workflow was used in our recent paper (Gong et al., 2017). In brief, we integrated large-scale multi-omics and drug data to provide all-in-one analysis for a set of genes in cancers.

2 Methods and functions
The user interface and back-end of GSCALite were written in Shiny. GSCALite consists of analytic modules for data from three major sources: multi-omics data from 11 160 TCGA samples across 33 cancer types (TCGA Cancer); data on 746 drugs from Genomics of Drug Sensitivity in Cancer (GDSC) (Yang et al., 2013) and Cancer Therapeutics Response Portal (CTRP) (Basu et al., 2013) (Drug Sensitivity); and normal tissue expression data of 11 688 samples from GTEx (GTEx Normal Tissue). We used R scripts and packages (ggplot2, visNetwork, survival and maftools) to generate figures and tables (for details, refer to the help pages of the web site). Analysis results are returned to the web page and can be downloaded in PDF, PNG, EPS, TXT and HTML formats. The workflow and typical output schema are shown in Supplementary Figure S1. Detailed functions and operations for each module are described below.

2.1 Gene set based multi-omics cancer analysis
GSCALite provides the following six analysis modules for a gene set based on TCGA multi-omics cancer data:

The mRNA Expression module calculates the differential expression of the gene set between tumor and paired normal samples, the impact of gene expression on overall survival, and the expression difference between subtypes in each selected cancer type.

The Single Nucleotide Variation module uses maftools (Mayakonda and Koeffler, 2016) to present the SNV frequency and variant types of the gene set in selected cancer types. The effects of mutations on overall survival are assessed with the log-rank test, which helps to evaluate the prognostic value of mutations in the gene set.

In the Copy Number Variation module, the statistics of heterozygous and homozygous CNV of each cancer type are displayed as pie charts for the gene set, and Pearson correlation is computed between the expression and CNV of each gene in each cancer to help identify genes whose expression is significantly affected by CNV.

The Methylation module explores the differential methylation between tumor and paired normal samples, the correlation between methylation and expression, and the effect of methylation level on survival for selected cancer types.

The Pathway Activity module presents the correlation of gene expression with pathway activity groups (activation and inhibition) defined by pathway scores (Akbani et al., 2014).

For miRNA regulation, the miRNA Network module combines miRNA targeting data from verified target databases and prediction methods, as in our previous studies (Zhang et al., 2015, 2016), with negative correlation with gene expression to explore the miRNA-gene regulatory network for the gene set in all cancer types.

2.2 The analysis of drug sensitivity and resistance to genes
Genomic aberrations influence clinical responses to treatment and are potential biomarkers for drug screening. Drug sensitivity and gene expression profiling data of cancer cell lines in GDSC and CTRP were integrated into GSCALite. The expression of each gene in the gene set was correlated with small molecule/drug sensitivity (IC50) using Spearman correlation analysis. Correlations with a false discovery rate (FDR) below 0.05 were considered significant.

2.3 Expression profiling and eQTL in normal tissues
GSCALite provides the GTEx Normal Tissue module for tissue specificity analysis of a gene set. This analysis offers a comprehensive display of expression profiling and eQTL information for the gene set in selected normal tissues. For this module, GSCALite produces a heatmap of the selected tissues in which the expression value of each gene is normalized by the median.

3 Discussion
GSCALite provides foundational tools and workflows in an all-in-one platform for cancer genomics analysis of a set of genes. It is a time-saving and intuitive tool for extracting value from large-scale cancer genomics data, and it enables experimental biologists without programming skills to test hypotheses. It is based on gene set analysis with multi-omics data, which complements analyses that rely on mRNA expression alone or on a single gene. We will maintain the GSCALite web server for at least 5 years and update it as cancer genomics data grow and new methods are developed. We anticipate that GSCALite will help the cancer research community and aid the discovery of cancer pathways and drugs.

Funding
This work has been supported by The National Key Research and Development Program of China (2017YFA0700403) and National Natural Science Foundation of China (Nos. 31471247 and 31771458).

Conflict of Interest: none declared.

References
Akbani R. et al. (2014) A pan-cancer proteomic perspective on The Cancer Genome Atlas. Nat. Commun., 5, 3887.
Basu A. et al. (2013) An interactive resource to identify cancer genetic and lineage dependencies targeted by small molecules. Cell, 154, 1151–1161.
Cerami E. et al. (2012) The cBio Cancer Genomics Portal: an open platform for exploring multidimensional cancer genomics data. Cancer Discov., 2, 401–404.
Ding L. et al. (2014) Expanding the computational toolbox for mining cancer genomes. Nat. Rev. Genet., 15, 556–570.
Gong J. et al. (2017) A pan-cancer analysis of the expression and clinical relevance of small nucleolar RNAs in human cancer. Cell Rep., 21, 1968–1981.
GTEx Consortium (2015) The Genotype-Tissue Expression (GTEx) pilot analysis: multitissue gene regulation in humans. Science, 348, 648–660.
Harvey K.F. et al. (2013) The Hippo pathway and human cancer. Nat. Rev. Cancer, 13, 246–257.
Mayakonda A., Koeffler H.P. (2016) Maftools: efficient analysis, visualization and summarization of MAF files from large-scale cohort based cancer studies. bioRxiv, 052662.
Rhodes D.R. et al. (2007) Oncomine 3.0: genes, pathways, and networks in a collection of 18,000 cancer gene expression profiles. Neoplasia, 9, 166–180.
Tang Z. et al. (2017) GEPIA: a web server for cancer and normal gene expression profiling and interactive analyses. Nucleic Acids Res., 45, W98–W102.
Weinstein J.N. et al. (2013) The cancer genome atlas pan-cancer analysis project. Nat. Genet., 45, 1113–1120.
Yang W. et al. (2013) Genomics of Drug Sensitivity in Cancer (GDSC): a resource for therapeutic biomarker discovery in cancer cells. Nucleic Acids Res., 41, D955–D961.
Zhang H.-M. et al. (2016) miR-146b-5p within BCR-ABL1–positive microvesicles promotes leukemic transformation of hematopoietic cells. Cancer Res., 76, 2901–2911.
Zhang H.-M. et al. (2015) Transcription factor and microRNA co-regulatory loops: important regulatory motifs in biological processes and diseases. Brief. Bioinform., 16, 45–58.

© The Author(s) 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: email@example.com
This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)
Bioinformatics – Oxford University Press
Published: Nov 1, 2018