circlncRNAnet: an integrated web-based resource for mapping functional networks of long or circular forms of noncoding RNAs

Background: Despite their lack of protein-coding potential, long noncoding RNAs (lncRNAs) and circular RNAs (circRNAs) have emerged as key determinants in gene regulation, acting to fine-tune transcriptional and signaling output. These noncoding RNA transcripts are known to affect expression of messenger RNAs (mRNAs) via epigenetic and post-transcriptional regulation. Given their widespread target spectrum, as well as extensive modes of action, a complete understanding of their biological relevance will depend on integrative analyses of systems data at various levels.

Findings: While a handful of publicly available databases have been reported, existing tools do not fully capture, from a network perspective, the functional implications of lncRNAs or circRNAs of interest. Through an integrated and streamlined design, circlncRNAnet aims to broaden the understanding of ncRNA candidates by testing in silico several hypotheses of ncRNA-based functions, on the basis of large-scale RNA-seq data. This web server is implemented with several features that represent advances in the bioinformatics of ncRNAs: (1) a flexible framework that accepts and processes user-defined next-generation sequencing–based expression data; (2) multiple analytic modules that assign and productively assess the regulatory networks of user-selected ncRNAs by cross-referencing extensively curated databases; (3) an all-purpose, information-rich workflow design that is tailored to all types of ncRNAs. Outputs on expression profiles, co-expression networks and pathways, and molecular interactomes are dynamically and interactively displayed according to user-defined criteria.

Conclusions: In short, users may apply circlncRNAnet to obtain, in real time, multiple lines of functionally relevant information on circRNAs/lncRNAs of their interest. In summary, circlncRNAnet provides a “one-stop” resource for in-depth analyses of ncRNA biology. circlncRNAnet is freely available at http://app.cgu.edu.tw/circlnc/.

Keywords: lncRNAs; circRNAs; co-expression network; molecular interactome

Received: 4 June 2017; Revised: 2 October 2017; Accepted: 22 November 2017

The Author(s) 2017. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Downloaded from https://academic.oup.com/gigascience/article-abstract/7/1/1/4670937 by Ed 'DeepDyve' Gillespie user on 16 March 2018

Introduction

Only 1% of the human genome encodes proteins. In contrast, 70% to 90% of the genome can actually be transcribed at some point during development, generating a large transcriptome of noncoding RNAs (ncRNA), part of which ultimately yields definite short or long RNAs with limited protein-coding capacity [1]. In recent years, deep sequencing technologies have unraveled the noncoding constituents of the transcriptome, most notably long noncoding RNAs (lncRNAs) and circular RNAs (circRNAs). Despite the lack of protein-coding potential, these once uncharted parts have emerged as a key determinant in gene regulation, acting as critical switches that fine-tune transcriptional and signaling output [2, 3].

Distinct from small noncoding RNAs such as microRNAs and snRNAs, lncRNAs are RNA molecules with a length of more than 200 nucleotides that lack a detectable open reading frame [4]. lncRNAs are usually transcribed by RNA polymerase II and exhibit known attributes of messenger RNAs, such as post-transcriptional processing. Circular RNAs are a more recently discovered class of noncoding RNAs that are defined not by length but rather by the unique structure of covalently closed circularity [5, 6]. Despite their differences in structure and biosynthesis steps, lncRNAs and circRNAs have much in common in terms of their roles and mechanisms in gene regulation, and in fact circRNAs are considered to be a class of lncRNAs by many researchers [3]. Even in the absence of protein products, these RNA molecules have been found to associate with distinct cellular compartments or components, and may act in cis or trans in target gene regulation [7–10]. At the epigenetic and transcriptional levels, lncRNAs are known to interact with transcriptional activators or repressors and consequently impact transcriptional efficiency. By binding with chromatin-modifying factors, lncRNAs could also serve as a guide or scaffold that controls the epigenetic status. At the post-transcriptional level, lncRNAs may bind to target RNAs and alter transcript structure, splicing pattern, and stability. Both lncRNAs and circRNAs have been found to harbor microRNA response elements (MREs) and potentially act as “miRNA sponges” that sequester these endogenous small RNAs [8, 11, 12], although the evidence for lncRNA miRNA sponges is much stronger than for circRNA sponges [13, 14]. These ncRNAs are therefore part of the competing endogenous RNA (ceRNA) network with the potential to alter miRNA-targeted mRNA expression. Another mode of regulation exerted by lncRNAs is their association with RNA-binding proteins. Similar to the ceRNA scenario, this molecular interaction may impact the localization, and thus activity, of these gene regulators. Finally, in line with their critical roles as gene regulators, both circRNAs and lncRNAs exhibit unique expression profiles in various human cancers, suggestive of a correlation with disease progression and possibly of their value as predictors of patient outcome [15–19]. Delineation of these transcriptomic networks is therefore of importance in understanding ncRNAs and associated biological processes, and may shed new light on diseases and possibly new avenues of therapeutic intervention [20–22].

Despite the enormous number of lncRNAs (∼15 000) annotated by GENCODE [23], our functional understanding of lncRNAs remains largely limited. While large-scale sequencing studies have become a standard approach for identifying candidate circRNAs/lncRNAs with significant expression alteration in certain cellular states, there may not be sufficient information in the literature to warrant further functional interrogation. Moreover, given the potentially widespread target spectrum of these ncRNAs as well as their extensive modes of action, a complete understanding of their biological relevance will depend on integrative analyses of systems data at various levels [24]. While a handful of publicly available databases have been reported (Table 1), they are quite limited in the scope of reference data and analytic modules, relying on existing datasets in public archives and annotating preselected regulatory features of ncRNAs. Thus, existing tools do not fully capture, from a network perspective, the functional implications of lncRNAs or circRNAs of interest. To solve this problem, we have implemented an integrative bioinformatics approach to examine in silico the functional networks of ncRNAs. The overall design and analytic workflow of this first “one-stop” web server tool for exploring ncRNA biology are depicted in Fig. 1.

Table 1: Comparative functionalities of available web tools of ncRNAs.

Column headings: Tool name; Interface; Both lncRNAs and circRNAs; Expression pattern; Co-expression: gene network; Co-expression: annotation/pathway; RBP binding site prediction; miRNA target prediction; Regulatory network; Ref.

circlncRNAnet Web server Yes Yes Yes Yes Yes Yes Yes This article
NONCODE Web database Yes [53]
LNCipedia Web database Yes [30]
ncFANs Web server Yes Yes [54]
lncRNAdb Web database Yes Yes Yes [55]
LINC R package Yes Yes [56]
cogena R package Yes Yes [57]
WGCNA R package Yes [37]
QUBIC R package Yes [58]
circNet Web database Yes Yes Yes [59]
CIRCpedia Web database Yes [60]
Circ2Traits Web database Yes Yes Yes Yes [61]
CircInteractome Web database Yes Yes Yes [62]
DeepBase V2.0 Web database Yes Yes [24]
starBase V2.0 Web database Yes Yes Yes Yes Yes [63]

Figure 1: The overall design and the analytic workflow of circlncRNAnet.

Results and Methods

Data input

To start, there are 2 separate upload pages for “lncRNA” and “circRNA” to meet the distinct analytic requirements of these 2 types of molecules (Fig. 2A). Users may upload tab-delimited text files that contain (1) expression matrix data of RNA-seq raw read counts, which are generated by using featureCounts (Fig. 2B) [25] and (2) sample/condition categories (Fig. 2C) into “Gene Expression Profile” and “Demographic Information,” respectively, on the webpage. For circRNA analyses, circRNA read counts, as quantified by KNIFE [26], should be additionally provided in a separate file. Procedures for processing the datasets into the appropriate format are outlined in the tutorial page on the web server [27]. For demonstration of use, 2 test datasets derived from publicly available RNA-seq data are included in the web server: The Cancer Genome Atlas (TCGA) data on colon and rectal adenocarcinoma (COAD and READ; for lncRNA) and the Encyclopedia of DNA Elements (ENCODE) data on the esophagus and sigmoid colon (for circRNA) [28, 29].

Output summary

After the successful submission of a job, the processing statuses of file format conversion, co-expression analysis, interactome networking, and report generation are displayed using a dynamic progress indicator. Computational tools and databases employed in this study are listed in Tables 2 and 3, respectively, which also outline the parameters used to carry out the corresponding analyses. The output section of the tutorial page [27] shows the standard output of circlncRNAnet based on the demonstration datasets. The standard output is represented by dynamic tables and charts, including bar and box plots, scatter plot, circos plot, heatmap, and network plots. Also included in the table is annotation information of the coding and noncoding genes, such as genome location, distance from query lncRNA or circRNA, lncRNA ID (ENCODE), coding potential [30], circRNA ID according to circBase [31], and circRNA (or host gene) splicing structure.

Analytic module #1: coding–noncoding co-expression network profiling

After the upload, the server will first execute the differential expression analysis by using the R package DESeq2 [32]. The interactive interface allows users to define the candidate gene list by fold changes and P-value. Moreover, to inspect the expression distance between samples, principal component analysis (PCA) was implemented in our analysis pipeline.
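The candidate-selection step described above (filtering differential expression results by fold change and P-value) can be illustrated with a minimal sketch. This is not the server's code: circlncRNAnet performs the differential expression analysis itself in R with DESeq2, whereas the Python snippet below only mimics the subsequent user-defined filtering. The record layout, the function name `select_candidates`, the default cut-offs, and the example rows are all hypothetical and made up for illustration.

```python
from typing import List, NamedTuple


class DEResult(NamedTuple):
    """One row of a differential expression table (hypothetical layout)."""
    gene: str
    log2_fold_change: float
    p_value: float


def select_candidates(results: List[DEResult],
                      min_abs_log2fc: float = 1.0,
                      max_p: float = 0.05) -> List[DEResult]:
    """Keep genes that pass both cut-offs, most significant first.

    The thresholds are illustrative defaults, not values taken from the
    web server; there they are chosen interactively by the user.
    """
    hits = [
        r for r in results
        if abs(r.log2_fold_change) >= min_abs_log2fc and r.p_value <= max_p
    ]
    return sorted(hits, key=lambda r: r.p_value)


# Made-up example rows; these are not results from any real dataset.
example = [
    DEResult("LNC_1", 2.3, 1e-8),    # passes both cut-offs
    DEResult("GENE_A", 0.4, 1e-4),   # fold change too small
    DEResult("GENE_B", -1.7, 0.003), # passes (downregulated)
    DEResult("GENE_C", 3.0, 0.2),    # not significant
]

candidates = select_candidates(example)
candidate_names = [r.gene for r in candidates]
print(candidate_names)
```

Filtering on the absolute log2 fold change keeps both up- and downregulated genes, which matches the idea of selecting candidates with significant expression alteration in either direction.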
Figure 2: Input file formats for circlncRNAnet. Interface on the web server for data upload (A). Two files are uploaded prior to data analysis: a gene matrix table (B), which is generated using featureCounts, and a condition file describing the sample status (C).

Table 2: Analytic and visualization R packages incorporated in circlncRNAnet

Analytic software | Version | Description | Ref.
circlize | 0.4.1 | Circos plot | [64]
clusterProfiler | 3.2.14 | Gene enrichment analysis | [65]
DESeq2 | 1.14.1 | Differential expression analysis | [32]
factoextra | 1.0.4 | Principal component analysis | [66]
ggplot2 | 2.2.1.9000 | Data visualization | [67]
plotly | 4.7.1 | Interactive data visualization | [68]
visNetwork | 2.0.1 | Network visualization | [69]
WGCNA | 1.51 | Correlation calculation | [37]

Table 3: List of databases and analytic tools employed by circlncRNAnet

Database | Version | Description | Parameters | Ref.
cisBP-RNA and Ray, 2013 (Homo sapiens) | 2013 | RNA binding protein motifs for FIMO to discover potential RNA binding sites | Downloaded from MEME motif database | [70]
dbNSFP (Homo sapiens) | 3.2 | Gene annotation | NA | [71]
ENCODE ChIP-Seq (Homo sapiens) | Feb 2017 | Experimental transcription factor and protein binding sites | Regions from -3000∼1000 bp of TSS were considered as the promoter; in-house scripts were then used to collect peaks with >2 score and annotate as binding sites | [29]
ENCODE eCLIP (Homo sapiens) | Mar 2017 | Experimental RNA binding protein binding sites | In-house scripts were used to collect all the peaks corresponding to binding sites; binding score for each target gene was represented by the lowest peak score | [29]
FIMO | 4.11.2 | Computational RNA binding protein binding sites discovering | Default | [44]
GENCODE (Homo sapiens) | Release 25 | lncRNA annotation | NA | [23]
LNCipedia (Homo sapiens) | 4 | High-confidence lncRNA annotation | NA | [30]
miRanda | 3.3a | miRNA binding sites detection | -m 10 000 000 -p 0.05 | [72]
MSigDB | v5.2 | Computational transcription factor and protein binding sites | The transcription factor targets dataset was used for TF enrichment analysis | [73]
RNAhybrid | 2.1.2 | miRNA binding sites detection | -sc 140, with cutoff seed similarity ≥85% and wobble pair similarity ≥85% | [48]
TarPmiR | Mar 2016 | miRNA binding sites detection | -p 0.1 | [74]

Several known functional attributes of circRNAs/lncRNAs were taken into account when constructing this web server. First, we adopted the gene co-expression analysis, which is based on the concept of “guilt by association”: assuming that genes exhibiting analogous expression patterns may be involved in similar biological pathways, functions of unknown genes may be inferred a priori from the co-expressed, functionally known genes [33]. To this end, Wolfe et al. developed a method to demonstrate that co-expression with biologically defined modules may serve as a basis for characterizing the function of unknown genes [34]. Ricano-Ponce et al. also used co-expression analysis to deduce the function of lncRNAs with expression quantitative trait loci (eQTLs) effects [35]. The combined use of co-expression analysis and Gene Set Enrichment Analysis (GSEA) has been demonstrated to identify lncRNAs putatively involved in neuronal development [36]. To implement this co-expression analysis in circlncRNAnet, we used the R package WGCNA [37] to calculate the Pearson correlation coefficients of selected differentially expressed circRNA/lncRNA expression against all genes in the user-uploaded samples (Fig. 3A). For an overview of the sequenced transcriptomes, the extent of the coordinated expression (Fig. 3B) and overall distribution of noncoding and coding RNA abundance (Fig. 3C) can be displayed as summary graphs. To provide users with a guide in the selection of relevant criteria for expression correlation, the server displays a composite histogram showing the overall distribution of correlation coefficients calculated for all the ncRNA-mRNA pairs, superimposed with the results from randomized correlation tests (500 iterations of randomized Pearson correlations between target ncRNAs and 5000 randomly selected mRNAs). The highly correlated genes (based on user-defined Pearson’s correlation) will also be subjected to pathway enrichment analysis (Fig. 4). The identity and enriched terms of the co-expression networks will be provided to facilitate further functional deduction of ncRNA candidates.

Figure 3: Schematic showing example outputs of circlncRNAnet analyses of lncRNA-based networks in colorectal cancer. After dataset upload, the server executes differential expression and expression correlation analyses. The web server allows the user to select query genes and correlation criteria (A). For an overview of the sequenced transcriptomes, the extent of the coordinated expression (B) and overall distribution of noncoding and coding RNA abundance (C) are displayed as summary graphs. As examples of use, co-expression network analysis of a known lncRNA, ELFN1-AS1, and a novel lncRNA, XXbac-B476C20.9, was performed using circlncRNAnet. (D) Scatter plot showing the extent of expression correlation between ELFN1-AS1 and 1 target, MYC. (E) Histogram displaying the distributions of the Pearson correlation coefficients of all ncRNA-mRNA pairs (Obs) and of a randomized correlation test (Rand).

As a proof of principle, we applied our analytic pipeline to a known example of cancer-associated lncRNAs, ELFN1-AS1. Kim et al. recently reported that MYC-regulated lncRNA MYCLo-2 (also known as ELFN1-AS1) represses CDKN2B transcription coordinately with hnRNPK [38]. To demonstrate the utility of circlncRNAnet, we queried the functional network of ELFN1-AS1. We used TCGA data on COAD and READ and paired normal samples as the reference expression datasets. Co-expression gene network analysis for ELFN1-AS1 may be done on the basis of the differentially expressed gene list and outputted according to user-defined criteria (Fig. 4, middle panel). To further visualize overall expression profiles of ELFN1-AS1 co-expressed genes, “heatmap” may be used to display up to 500 of the most correlated genes (ranked by absolute r value) (Fig. 4, upper left panel). Pair-wise expression correlation between the ncRNA and co-expressed mRNA genes is also possible. For instance, as ELFN1-AS1 is a known transcriptional target of MYC, users may compare the expression patterns between ELFN1-AS1 and MYC in the TCGA data. This is done by selecting “Scatter plot” and entering “MYC” in the “Co-expressed gene” box (Fig. 3D). Next, for pathway analysis of genes co-expressed with ELFN1-AS1, the “GO & KEGG Enrichment” functionality is available, in which the “Enriched pathway (MSigDB)” will output top enriched pathways, together with a network representation of the components. In the case of ELFN1-AS1, MYC TARGETS V1 and MYC TARGETS V2 are shown as 2 of the top pathways, consistent with the previous findings (Fig. 4, lower panels).

In addition, we used another novel lncRNA as an example of our analytic approach. XXbac-B476C20.9 was downregulated in colorectal cancer, and higher expression of XXbac-B476C20.9 was associated with better survival expectancy, hinting at a tumor-suppressive role (data not shown). By using Pearson correlation analysis, we identified hundreds of genes that exhibit significant co-expression with this lncRNA (data not shown). By analyzing the chromosome distribution of XXbac-B476C20.9 co-expressed genes, we did not see particular enrichment in chromosome 22 (where XXbac-B476C20.9 is located) (Fig. 4, upper right panel), indicating that this lncRNA may not exert expression regulation in a cis manner.

Correlated expression may also be attributed to the functional interaction of the circRNAs/lncRNAs with particular transcription factor (TF) networks. Indeed, previous studies have reported that lncRNA could regulate TF activity through reciprocal interaction [39]. To address this possibility, our web server is equipped to determine whether the co-expression gene set is enriched in targets of specific TFs. Extensive TF-target pairs were first built by annotating 2 sources of data: (1) computational motif scan of TF binding sites and (2) experimental TF binding sites as archived by the ENCODE chromatin immunoprecipitation sequencing (ChIP-Seq) data. For the latter, we retrieved ENCODE ChIP-seq data and defined the promoter region as a window from -3000 bp to +1000 bp of the transcription start site to establish putative TF occupancy. The output of this type of analysis can be accessed via the gene enrichment module.

Analytic module #2: RBP interactome mapping

Second, the lncRNAs reported thus far have mostly been implicated in several aspects of gene expression, such as RNA stability, miRNA sponging, regulation of transcription factors, and epigenetic and chromosomal architecture [4, 7, 20, 21, 40]. Interestingly, behind these regulatory actions, molecular interactions are the most crucial determinant in lncRNAs’ roles. In this context, lncRNAs are known to associate with various proteins (i.e., RNA-binding proteins and chromatin modifiers). For example, lncRNA ELFN1-AS1 interacts with hnRNPK to transcriptionally suppress the expression of CDKN2B, a tumor suppressor gene [38]. LncRNA NORAD sequesters PUM2 to maintain genomic stability [41]. A colorectal cancer (CRC)-associated lncRNA, MYU, binds hnRNPK and consequently stabilizes CDK6, which is critical for colon cancer cells’ growth [42]. These findings thus suggest that delineating the lncRNA-interacting protein network may effectively prompt the functional exploration of lncRNA candidates. In our efforts to map the protein interactome of lncRNAs, we have extensively curated and integrated 2 types of public data into reference annotations for the analytic workflow: computational RNA binding protein (RBP) motif scan and experimental RBP databases.

For this purpose, we first collected RBP binding motifs from MEME, which is a motif-discovering software, in addition to several RBP motifs from published data [43]. Next, we generated all lncRNA sequences from GENCODE, Release 25, and used FIMO to scan computationally for the presence of possible RBP binding sites [44]. For the empirical RBP sites, we retrieved the RBP binding sequences from ENCODE eCLIP [45]. To complement the repertoire of RBP included in the analysis, we also integrated protein interaction profile sequencing (PIP-seq) [46]. Although the footprints of protein binding do not readily reveal the identity of the associated factors, PIP-seq data may serve as evidence for molecular interaction.

Given that our exemplary lncRNA ELFN1-AS1 reportedly mediates its function through interacting with hnRNPK, we next tested whether this attribute could be recapitulated by circlncRNAnet. To interrogate the ELFN1-AS1-associated proteins, the “Retrieve lncRNA-binding protein” module can be selected to display an ELFN1-AS1-associated RNA-binding protein network (Fig. 5A). An RBP is considered a hit (i.e., potential interactor of the given lncRNA/circRNA) if its annotated motifs from at least 2 database sources are detected in the transcript sequence, and will be labeled with a gene symbol and a larger node size. The output of this demo analysis illustrates a number of putative interacting RBPs, one of which is HNRNPK, as reported (Fig. 5A).

Analytic module #3: ceRNA networking

Third, aside from protein interactors, the role of circRNAs/lncRNAs in microRNA (miRNA)-mediated post-transcriptional regulation has emerged. By virtue of the distinct distribution of recurring miRNA target sequences in lncRNA transcripts, certain lncRNAs are known to compete with mRNA transcripts for complementary binding by the cognate miRNAs. This regulatory process, referred to as miRNA sponge or competing endogenous RNAs (ceRNAs) [47], alters the endogenous silencing activity of miRNAs, thereby impacting the expression of targeted mRNAs. Some lncRNAs have even been demonstrated as miRNA sponges in certain oncogenic processes [11, 12]. Thus, to complete this bioinformatics package, we installed in this web server an analytic module for sequence-based delineation of potential lncRNA-miRNA sponge pairs. Given that existing databases of miRNA targeting sites annotate target sequences only in 3’ UTRs, information regarding miRNA:ncRNA complementarity is not readily available. To resolve this issue, we generated a reference database that catalogs putative miRNA binding sites within lncRNAs/circRNAs as computationally predicted by 3 different miRNA target prediction tools (RNAhybrid, miRanda, and TarPmiR) [48–50]. Analogous to the RBP module, an miRNA target is considered a positive hit if at least 2 of the 3 software tools uncover its existence, and will be denoted as a larger node and shown with a gene symbol in the network diagram.

For the RNA components of the ELFN1-AS1 interactomes, circlncRNAnet provides information on the putative miRNA targeting sites within the RNA sequences. To explore, the “miRNA targeting sites network” may be selected to show the corresponding network (Fig. 5B). Analogous to the RBP network, any miRNA target sequences predicted by at least 2 of the miRNA targeting site–discovering software tools (miRanda, RNAhybrid, and TarPmiR) will be labeled with gene symbols and a larger node size in the network (Fig. 5B).

Analytic module #4: multitier regulatory hierarchy

mRNAs harboring the same miRNA binding sites as ncRNAs are likely to be subject to expression alteration in the miRNA sponge scenario, in which an inverse correlation in expression between miRNA and mRNAs/lncRNAs is expected [47]. Thus, to substantiate the putative miRNA sponge activity and also to delineate likely downstream mRNA targets, the web server is further designed to construct the ncRNA-miRNA-mRNAs regulatory hierarchy. For this purpose, 3’ UTRs with presumptive miRNA targeting, as revealed by the aforementioned prediction tools, will be cross-referenced with the gene set that shows correlated expression profiles with the candidate ncRNA. As a result, this intersected gene list presumably represents the targets of ncRNA-miRNA axis-mediated regulation, and will be depicted in a 2-tier network configuration (Fig. 5B).

Similar network analyses are available for decoding the ncRNA-RBP-mRNA network. To this end, a reference RBP-mRNA database was first established, in which all GENCODE mRNA genes were scanned and annotated for experimental and computational RBP binding using the above approaches. For a particular RBP in the ncRNA interactome that is selected by the user, all ncRNA-co-expressed mRNAs with mutual RBP binding will be assembled based on the RBP-mRNA database. These lines of information will then be integrated and subsequently outputted as the multitier molecular network (Fig. 5A).

Benchmarking

circlncRNAnet is constructed on the Nginx 1.6.3 and Shiny 1.0.3 servers, which run on CentOS 6.2 with 2 Intel Xeon E5–2620 CPUs and 200 GB RAM.
To optimize CPU utilization for multiple users, we assign 2 threads for an analysis task. We tested the web service with 20 normal/tumor paired samples, for which the DESeq2 analysis required 130 seconds to produce differentially expressed genes. For calculating a co-expressed gene list, circlncRNAnet took 50 seconds for 1 query gene and 270 seconds for 10 query genes.

Figure 4: Additional examples of circlncRNAnet output of lncRNA-based networks in colorectal cancer. In addition to the analyses shown in Fig. 3, more options for network interrogation of ncRNA-based regulation can be accessed on the webpage (middle). For instance, heatmap representation of the genes co-expressed with ELFN1-AS1 (Pearson’s |r| > 0.5) can be outputted (upper left). Pathway analysis of the co-expressed genes on the basis of MSigDB Hallmark pathways (bottom left), and its network depiction of the top 3 enriched pathways and the corresponding co-expressed components (bottom right). Circos plot can also be used to illustrate the genome-wide distribution of the top 100 co-expressed genes relative to the location of XXbac-B476C20.9 (upper right).

Figure 5: Examples of lncRNA-associated molecular components uncovered by circlncRNAnet. circlncRNAnet may be used to extensively profile the molecular interactome of candidate circRNAs/lncRNAs based on the compiled databases, with ELFN1-AS1 shown as an example in this figure. (A) For the RBP components, the interactome will be outputted in both the table format (top) and network configuration (bottom). (B) Similarly, for the putative miRNA sponge network, predicted ELFN1-AS1-targeting miRNAs are shown in table (top) and network (bottom) formats. The web server is also designed to construct the ncRNA-RBP-mRNAs or ncRNA-miRNA-mRNAs regulatory hierarchy. circlncRNAnet delineates co-expressed mRNA genes with mutually shared RBP binding or miRNA targeting sites. Consequently, an intersected gene list is compiled (top) and may be depicted in a 2-tier network configuration (bottom).

Conclusions

With the expansion of transcriptome sequencing datasets, focusing on a select set of publicly available, but potentially irrelevant, sequencing data does not sufficiently address users’ research needs. This prompted us to build a completely new system with the flexibility of accepting private or public data. To further support efficient analyses and presentation, we have extensively curated public data into reference annotations for the circlncRNAnet workflow. Multilayer modules and algorithms then provide outputs on expression profiles, co-expression networks and pathways, and molecular interactomes, which are dynamically and interactively displayed according to user-defined criteria. In short, users may apply circlncRNAnet to obtain, in real time, multiple lines of functionally relevant information on the circRNAs/lncRNAs of their interest. The overall workflow takes only a few minutes, as compared with hours of manual effort of independent database searches and analyses. In summary, circlncRNAnet is the first of its kind in the regulatory RNA research field, providing a “one-stop” resource for in-depth analyses of ncRNA biology. A tutorial with demo datasets is available under “Tutorial,” in which the functional network of a known lncRNA is illustrated in silico as an example.

Availability of supporting source code and requirements

Project name: circlncRNAnet
Project home page: http://app.cgu.edu.tw/circlnc/ [27], https://github.com/smw1414/circlncRNAnet [51]
Operating system(s): platform independent
Programming language: PHP, JavaScript, R, R shiny and Shell script
Other requirements: JavaScript supporting web browser
License: GPLv3
Research Resource Identifier: circlncRNAnet, RRID:SCR_015794

Availability of supporting data

The analytic modules and test datasets (from TCGA and ENCODE) are available in the GitHub repository [51]. An archival copy of the modules and test datasets is also available via the GigaScience repository, GigaDB [52]. For the convenience of prospective users, we also provided on GitHub instructions on running our pipeline in local mode.

Abbreviations

ceRNA: competing endogenous RNA; ChIP-Seq: chromatin immunoprecipitation sequencing; circRNA: circular RNA; COAD: colon adenocarcinoma; CRC: colorectal cancer; GSEA: Gene Set Enrichment Analysis; lncRNAs: long noncoding RNA; miRNA: microRNA; mRNAs: messenger RNA; ncRNA: noncoding RNA; PIP-seq: Protein Interaction Profile sequencing; RBP: RNA-binding protein; READ: rectal adenocarcinoma; TCGA: The Cancer Genome Atlas.

Funding

This work was supported by grants from the Ministry of Science and Technology of Taiwan (MOST104–2321-B-182–007-MY3 to P.J.H.; MOST106–2320-B-182–035-MY3 to H.L.; MOST104–2320-B-182–029-MY3 and MOST105–2314-B-182–061-MY4 to B.C.M.T.; MOST103–2632-B-182–001, MOST104–2632-B-182–001, and MOST105–2632-B-182–001), Chang Gung Memorial Hospital (CMRPD1G0321 and CMRPD1G0322 to P.J.H.; CMRPD1F0571 to H.L.; CMRPG3D1513 and CMRPG3D1514 to W.S.T.; CMRPD3E0153, CMRPD1F0442, and BMRP960 to B.C.M.T.), the National Health Research Institute of Taiwan (NHRI-EX105–10321SI), the Ministry of Education of Taiwan, and Biosignature Research Grant CIRPD3B0013 for supporting bioinformatics and computing resources.

Competing interests

The authors declare that they have no competing interests.

Author contributions

H.L. and B.C.T. conceived the original idea of the web server. S.W., P.H., Y.C., C.L., W.T., and H.L. designed and implemented the web server. S.W., P.J., and Y.C. conducted the benchmarks. C.L., C.Y., W.T., and B.C.T. tested the system and provided feedback on features and functionality. S.W., H.L., and B.C.T. wrote the manuscript. All authors read and approved the final manuscript.

Acknowledgements

We are grateful to members of the BC-MT laboratory for critical reading of the article and important discussions.

References

1. Ponting CP, Oliver PL, Reik W. Evolution and functions of long noncoding RNAs. Cell 2009;136(4):629–41.
2. Liu J, Liu T, Wang X, He A. Circles reshaping the RNA world: from waste to treasure. Mol Cancer 2017;16(1):58.
3. Quinn JJ, Chang HY. Unique features of long non-coding RNA biogenesis and function. Nat Rev Genet 2016;17(1):47–62.
4. Wang KC, Chang HY. Molecular mechanisms of long noncoding RNAs. Mol Cell 2011;43(6):904–14.
5. Jeck WR, Sharpless NE. Detecting and characterizing circular RNAs. Nat Biotechnol 2014;32(5):453–61.
11. … biomarker for gastric cancer. Oncotarget 2016;7(25):37812–24.
12. Du Z, Sun T, Hacisuleyman E et al. Integrative analyses reveal a long noncoding RNA-mediated sponge regulatory network in prostate cancer. Nat Commun 2016;7:10982.
13. Boeckel JN, Jae N, Heumuller AW et al. Identification and characterization of hypoxia-regulated endothelial circular RNA. Circ Res 2015;117(10):884–90.
14. Militello G, Weirick T, John D et al. Screening and validation of lncRNAs and circRNAs as miRNA sponges. Brief Bioinform 2017;18(5):780–8.
15. Chen G, Wang Z, Wang D et al. LncRNADisease: a database for long-non-coding RNA-associated diseases. Nucleic Acids Res 2013;41(Database issue):D983–6.
16. Huarte M. A lncRNA links genomic variation with celiac disease. Science 2016;352(6281):43–44.
17. Li P, Chen S, Chen H et al. Using circular RNA as a novel type of biomarker in the screening of gastric cancer. Clin Chim Acta 2015;444:132–6.
18. Qian Y, Lu Y, Rui C et al. Potential significance of circular RNA in human placental tissue for patients with preeclampsia. Cell Physiol Biochem 2016;39(4):1380–90.
19. Yang G, Lu X, Yuan L. LncRNA: a link between RNA and cancer. Biochim Biophys Acta 2014;1839(11):1097–109.
20. Xie X, Tang B, Xiao YF et al. Long non-coding RNAs in colorectal cancer. Oncotarget 2016;7(5):5226–39.
21. Han D, Wang M, Ma N et al. Long noncoding RNAs: novel players in colorectal cancer. Cancer Lett 2015;361(1):13–21.
22. Park NJ, Zhou H, Elashoff D et al. Salivary microRNA: discovery, characterization, and clinical utility for oral cancer detection. Clin Cancer Res 2009;15(17):5473–7.
23. Harrow J, Frankish A, Gonzalez JM et al. GENCODE: the reference human genome annotation for The ENCODE Project. Genome Res 2012;22(9):1760–74.
24. Zheng LL, Li JH, Wu J et al. deepBase v2.0: identification, expression, evolution and function of small RNAs, LncRNAs and circular RNAs from deep-sequencing data. Nucleic Acids Res 2016;44(D1):D196–202.
25. Liao Y, Smyth GK, Shi W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 2014;30(7):923–30.
26. Szabo L, Morey R, Palpant NJ et al. Statistically based splicing detection reveals neural enrichment and tissue-specific induction of circular RNA during human fetal development. Genome Biol 2015;16(1):126.
27. Wu S-M, Liu H, Huang P-J et al. circlncRNAnet: an integrated web-based resource for mapping functional networks of long or circular forms of non-coding RNAs. 2017, http://app.cgu.edu.tw/circlnc/. Accessed November 2017.
28. Cancer Genome Atlas Network. Comprehensive molecular
6.
Petkovic S, Muller S. RNA circularization strategies in vivo and in vitro. Nucleic Acids Res 2015;43(4):2454–65. characterization of human colon and rectal cancer. Nature 7. Rinn JL, Chang HY. Genome regulation by long noncoding 2012;487(7407):330–7. RNAs. Annu Rev Biochem 2012;81(1):145–66. 29. Consortium EP. An integrated encyclopedia of DNA elements 8. Wang X, Arai S, Song X et al. Induced ncRNAs allosterically in the human genome. Nature 2012;489(7414):57–74. modify RNA-binding proteins in cis to inhibit transcription. 30. Volders PJ, Verheggen K, Menschaert G et al. An up- Nature 2008;454(7200):126–30. date on LNCipedia: a database for annotated hu- 9. Guttman M, Rinn JL. Modular regulatory principles of large man lncRNA sequences. Nucleic Acids Res 2015;43(8): non-coding RNAs. Nature 2012;482 (7385):339–46. 4363–4. 10. Rinn JL, Kertesz M, Wang JK et al. Functional demarcation of 31. Glazar P, Papavasileiou P, Rajewsky N. circBase: a database active and silent chromatin domains in human HOX loci by for circular RNAs. RNA 2014;20(11):1666–70. 32. Love MI, Huber W, Anders S. Moderated estimation of noncoding RNAs. Cell 2007;129(7):1311–23. 11. Shao Y, Ye M, Li Q et al. LncRNA-RMRP promotes carcino- fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 2014;15(12):550. genesis by acting as a miR-206 sponge and is used as a novel Downloaded from https://academic.oup.com/gigascience/article-abstract/7/1/1/4670937 by Ed 'DeepDyve' Gillespie user on 16 March 2018 10 Wu et al. 33. De Smet R, Marchal K. Advantages and limitations of 54. Liao Q, Xiao H, Bu D et al. ncFANs: a web server for func- current network inference methods. Nat Rev Microbiol tional annotation of long non-coding RNAs. Nucleic Acids 2010;8(10):717–29. Res 2011;39(suppl):W118–24. 34. Wolfe CJ, Kohane IS, Butte AJ. Systematic survey reveals gen- 55. Quek XC, Thomson DW, Maag JL et al. 
lncRNAdb v2.0: ex- eral applicability of “guilt-by-association” within gene coex- panding the reference database for functional long noncod- pression networks. BMC Bioinformatics 2005;6 1:227. ing RNAs. Nucleic Acids Res 2015;43(D1):D168–73. 35. Ricano-Ponce I, Zhernakova DV, Deelen P et al. Refined map- 56. Goepferich M, Herrmann C. LINC: co-expression of lin- ping of autoimmune disease associated genetic variants cRNAs and protein-coding genes. 2017. https://doi.org/ with gene expression suggests an important role for non- doi:10.18129/B9.bioc.LINC. Accessed December 2017. coding RNAs. J Autoimmun 2016;68:62–74. 57. Jia Z, Liu Y, Guan N et al. Cogena, a novel tool for co- 36. D’Haene E, Jacobs EZ, Volders PJ et al. Identification of long expressed gene-set enrichment analysis, applied to drug non-coding RNAs involved in neuronal development and in- repositioning and drug mode of action discovery. BMC Ge- tellectual disability. Sci Rep 2016;6 1:28396. nomics 2016;17:414. doi:10.1186/s12864-016-2737-8. 37. Langfelder P, Horvath S. WGCNA: an R package for weighted 58. Zhang Y, Xie J, Yang J et al. QUBIC: a bioconductor package for correlation network analysis. BMC Bioinformatics 2008;9 qualitative biclustering analysis of gene co-expression data. 1:559. Bioinformatics 2017;33(3):450–2. 38. Kim T, Jeon YJ, Cui R et al. Role of MYC-regulated long non- 59. Liu YC, Li JR, Sun CH et al. CircNet: a database of circular coding RNAs in cell cycle regulation and tumorigenesis. J RNAs derived from transcriptome sequencing data. Nucleic Natl Cancer Inst 2015;107(4): doi:10.1093/jnci/dju505. Acids Res 2016;44(D1):D209–15. 39. Fu M, Huang G, Zhang Z et al. Expression profile of long non- 60. Zhang XO, Dong R, Zhang Y et al. Diverse alternative back- coding RNAs in cartilage from knee osteoarthritis patients. splicing and alternative splicing landscape of circular RNAs. Osteoarthritis Cartilage 2015;23(3):423–32. Genome Res 2016;26(9):1277–87. 40. Chen X, Liu B, Yang R et al. 
Integrated analysis of long 61. Ghosal S, Das S, Sen R et al. Circ2Traits: a comprehensive non-coding RNAs in human colorectal cancer. Oncotarget database for circular RNA potentially associated with disease 2016;7(17):23897–908. and traits. Front Genet 2013;4:283. 41. Lee S, Kopp F, Chang TC et al. Noncoding RNA NORAD reg- 62. Dudekula DB, Panda AC, Grammatikakis I et al. CircInter- ulates genomic stability by sequestering PUMILIO proteins. actome: a web tool for exploring circular RNAs and their Cell 2016;164(1–2):69–80. interacting proteins and microRNAs. RNA Biol 2016;13(1): 42. Kawasaki Y, Komiya M, Matsumura K et al. MYU, a 34–42. target lncRNA for Wnt/c-Myc signaling, mediates induc- 63. Li JH, Liu S, Zhou H et al. starBase v2.0: decoding miRNA- tion of CDK6 to promote cell cycle progression. Cell Rep ceRNA, miRNA-ncRNA and protein–RNA interaction net- 2016;16(10):2554–64. works from large-scale CLIP-Seq data. Nucl Acids Res 43. Bailey TL, Boden M, Buske FA et al. MEME SUITE: tools for mo- 2014;42(D1):D92–7. tif discovery and searching. Nucleic Acids Res 2009;37(Web 64. Gu Z, Gu L, Eils R et al. Circlize implements and enhances Server):W202–8. circular visualization in R. Bioinformatics 2014;30(19): 44. Grant CE, Bailey TL, Noble WS. FIMO: scanning for occur- 2811–2. rences of a given motif. Bioinformatics 2011;27(7):1017–8. 65. Yu G, Wang LG, Han Y et al. clusterProfiler: an R package for 45. Hong EL, Sloan CA, Chan ET et al. Principles of meta- comparing biological themes among gene clusters. OMICS data organization at the ENCODE data coordination center. 2012;16(5):284–7. Database 2016;2016: doi:10.1093/database/baw001. 66. Mundt AKF. Factoextra: extract and visualize the re- 46. Silverman IM, Li F, Alexander A et al. RNase-mediated sults of multivariate data analyses. 2017. https://cran.r- protein footprint sequencing reveals protein-binding project.org/package=factoextra. Accessed December 2017. sites throughout the human transcriptome. 
Genome Biol 67 Wickham H. ggplot2: Elegant Graphics for Data Analysis. 2014;15(1):R3. New York: Springer-Verlag; 2009. 47. Thomson DW, Dinger ME. Endogenous microRNA sponges: 68. Sievert C, Parmer C, Hocking T et al. Plotly: create interactive evidence and controversy. Nat Rev Genet 2016;17(5):272–83. web graphics via ‘plotly.js.’ 2017. https://plot.ly/r. Accessed 48. Kruger J, Rehmsmeier M. RNAhybrid: microRNA target pre- December 2017. diction easy, fast and flexible. Nucleic Acids Res 2006; 34(Web 69. Almende BV, Thieurmel B, Robert T. visNetwork: network Server):W451–4. visualization using ‘vis.js’ Library. 2017. https://CRAN.R- 49. Lewis BP, Burge CB, Bartel DP. Conserved seed pairing, often project.org/package=visNetwork. Accessed December 2017. flanked by adenosines, indicates that thousands of human 70. Ray D, Kazan H, Cook KB et al. A compendium of genes are microRNA targets. Cell 2005;120(1):15–20. RNA-binding motifs for decoding gene regulation. Nature 50. Enright AJ, John B, Gaul U et al. MicroRNA targets in 2013;499(7457):172–7. Drosophila. Genome Biol 2003;5(1):R1. 71. Liu X, Jian X, Boerwinkle E. dbNSFP: a lightweight database 51. Wu S-M, Liu H, Huang P-J et al. circlncRNAnet GitHub repos- of human nonsynonymous SNPs and their functional pre- itory. 2017. https://github.com/smw1414/circlncRNAnet. Ac- dictions. Hum Mutat 2011;32(8):894–9. cessed November 2017. 72. John B, Enright AJ, Aravin A et al. Human microRNA targets. 52. Wu S-M, Liu H, Huang P-J et al. Supporting data for PLoS Biol 2004;2(11):e363. “circlncRNAnet: an integrated web-based resource 73. Subramanian A, Tamayo P, Mootha VK et al. Gene set enrich- for mapping functional networks of long or circular ment analysis: a knowledge-based approach for interpreting forms of noncoding RNAs.” GigaScience Database 2017. genome-wide expression profiles. Proc Natl Acad Sci U S A http://dx.doi.org/10.5524/100378. Accessed December 2017. 2005;102(43):15545–50. 53. Zhao Y, Li H, Fang S et al. 
NONCODE 2016: an informative 74. Ding J, Li X, Hu H. TarPmiR: a new approach for mi- and valuable data source of long non-coding RNAs. Nucleic croRNA target site prediction. Bioinformatics 2016;32(18): Acids Res 2016;44(D1):D203–8. 2768–75. Downloaded from https://academic.oup.com/gigascience/article-abstract/7/1/1/4670937 by Ed 'DeepDyve' Gillespie user on 16 March 2018 http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png GigaScience Oxford University Press

circlncRNAnet: an integrated web-based resource for mapping functional networks of long or circular forms of noncoding RNAs

Publisher: BGI
Copyright: © The Author 2017. Published by Oxford University Press.
eISSN: 2047-217X
DOI: 10.1093/gigascience/gix118

Abstract

Background: Despite their lack of protein-coding potential, long noncoding RNAs (lncRNAs) and circular RNAs (circRNAs) have emerged as key determinants in gene regulation, acting to fine-tune transcriptional and signaling output. These noncoding RNA transcripts are known to affect expression of messenger RNAs (mRNAs) via epigenetic and post-transcriptional regulation. Given their widespread target spectrum, as well as extensive modes of action, a complete understanding of their biological relevance will depend on integrative analyses of systems data at various levels.

Findings: While a handful of publicly available databases have been reported, existing tools do not fully capture, from a network perspective, the functional implications of lncRNAs or circRNAs of interest. Through an integrated and streamlined design, circlncRNAnet aims to broaden the understanding of ncRNA candidates by testing in silico several hypotheses of ncRNA-based functions, on the basis of large-scale RNA-seq data. This web server is implemented with several features that represent advances in the bioinformatics of ncRNAs: (1) a flexible framework that accepts and processes user-defined next-generation sequencing–based expression data; (2) multiple analytic modules that assign and productively assess the regulatory networks of user-selected ncRNAs by cross-referencing extensively curated databases; (3) an all-purpose, information-rich workflow design that is tailored to all types of ncRNAs. Outputs on expression profiles, co-expression networks and pathways, and molecular interactomes, are dynamically and interactively displayed according to user-defined criteria.

Conclusions: In short, users may apply circlncRNAnet to obtain, in real time, multiple lines of functionally relevant information on circRNAs/lncRNAs of their interest. In summary, circlncRNAnet provides a “one-stop” resource for in-depth analyses of ncRNA biology. circlncRNAnet is freely available at http://app.cgu.edu.tw/circlnc/.

Keywords: lncRNAs; circRNAs; co-expression network; molecular interactome

Introduction

Only 1% of the human genome encodes proteins. In contrast, 70% to 90% of the genome can actually be transcribed at some point during development, generating a large transcriptome of noncoding RNAs (ncRNA), part of which ultimately yield definite short or long RNAs with limited protein-coding capacity [1]. In recent years, deep sequencing technologies have unraveled the noncoding constituents of the transcriptome, most notably long noncoding RNAs (lncRNAs) and circular RNAs (circRNAs). Despite the lack of protein-coding potential, these once uncharted parts have emerged as a key determinant in gene regulation, acting as critical switches that fine-tune transcriptional and signaling output [2, 3].

Distinct from small noncoding RNAs such as microRNAs and snRNAs, lncRNAs are RNA molecules with a length of more than 200 nucleotides that lack a detectable open reading frame [4]. lncRNAs are usually transcribed by RNA polymerase II and exhibit known attributes of messenger RNAs, such as post-transcriptional processing. Circular RNAs are a more recently discovered class of noncoding RNAs that are defined not by length but rather the unique structure of covalently closed circularity [5, 6]. Despite their differences in structure and biosynthesis steps, lncRNAs and circRNAs are much more common in terms of their roles and mechanisms in gene regulation, and in fact circRNAs are considered to be a class of lncRNAs by many researchers [3]. Even in the absence of protein products, these RNA molecules have been found to associate with distinct cellular compartments or components, and may act in cis or trans in target gene regulation [7–10]. At the epigenetic and transcriptional levels, lncRNAs are known to interact with transcriptional activators or repressors and consequently impact transcriptional efficiency. By binding with chromatin-modifying factors, lncRNAs could also serve as a guide or scaffold that controls the epigenetic status. At the post-transcription level, lncRNAs may bind to target RNAs and alter transcript structure, splicing pattern, and stability. Both lncRNAs and circRNAs have been found to harbor microRNA response elements (MREs) and potentially act as “miRNA sponges” that sequester these endogenous small RNAs [8, 11, 12], although the evidence for lncRNA miRNA sponges is much stronger than for circRNA sponges [13, 14]. These ncRNAs are therefore part of the competing endogenous RNA (ceRNA) network with the potential to alter miRNA-targeted mRNA expression. Another mode of regulation exerted by lncRNAs is their association with RNA-binding proteins. Similar to the ceRNA scenario, this molecular interaction may impact the localization, and thus activity, of these gene regulators. Finally, in line with their critical roles as gene regulators, both circRNAs and lncRNAs exhibit unique expression profiles in various human cancers, suggestive of a correlation with disease progression and possibly its value as a predictor of patient outcome [15–19]. Delineation of these transcriptomic networks therefore is of importance in understanding ncRNAs, and associated biological processes and may shed new light on diseases and possibly new avenues of therapeutic interventions [20–22].

Despite the enormous number of lncRNAs (∼15 000) annotated by GENCODE [23], our functional understanding of lncRNAs remains largely limited. While large-scale sequencing studies have become a standard approach for identifying candidate circRNAs/lncRNAs with significant expression alteration in certain cellular states, there may not be sufficient information in the literature to warrant further functional interrogation. Moreover, given the potentially widespread target spectrum of these ncRNAs as well as their extensive modes of action, a complete understanding of their biological relevance will depend on integrative analyses of systems data at various levels [24]. While a handful of publicly available databases have been reported (Table 1), they are quite limited in the scope of reference data and analytic modules, relying on existing datasets in public archives and annotating preselected regulatory features of ncRNAs. Thus, existing tools do not fully capture, from a network perspective, the functional implications of lncRNAs or circRNAs of interest. To solve this problem, we have implemented an integrative bioinformatics approach to examine in silico the functional networks of ncRNAs. The overall design and analytic workflow of this first “one-stop” web server tool for exploring the ncRNA biology are depicted in Fig. 1.

Table 1: Comparative functionalities of available web tools of ncRNAs.
Column headings: Tool name; Interface; Both lncRNAs and circRNAs; Expression pattern; Co-expression: gene network; Co-expression: annotation/pathway; RBP binding site prediction; miRNA target prediction; Regulatory network; Ref.
circlncRNAnet Web server Yes Yes Yes Yes Yes Yes Yes This article
NONCODE Web database Yes [53]
LNCipedia Web database Yes [30]
ncFANs Web server Yes Yes [54]
lncRNAdb Web database Yes Yes Yes [55]
LINC R package Yes Yes [56]
cogena R package Yes Yes [57]
WGCNA R package Yes [37]
QUBIC R package Yes [58]
circNet Web database Yes Yes Yes [59]
CIRCpedia Web database Yes [60]
Circ2Traits Web database Yes Yes Yes Yes [61]
CircInteractome Web database Yes Yes Yes [62]
DeepBase V2.0 Web database Yes Yes [24]
starBase V2.0 Web database Yes Yes Yes Yes Yes [63]

Figure 1: The overall design and the analytic workflow of circlncRNAnet.

Results and Methods

Data input

To start, there are 2 separate upload pages for “lncRNA” and “circRNA” to meet the distinct analytic requirements of these 2 types of molecules (Fig. 2A). Users may upload tab-delimited text files that contain (1) expression matrix data of RNA-seq raw read counts, which are generated by using featureCounts (Fig. 2B) [25] and (2) sample/condition categories (Fig. 2C) into “Gene Expression Profile” and “Demographic Information,” respectively, on the webpage. For circRNA analyses, circRNA read counts, as quantified by KNIFE [26], should be additionally provided in a separate file. Procedures for processing the datasets into the appropriate format are outlined in the tutorial page on the web server [27]. For demonstration of use, 2 test datasets derived from publicly available RNA-seq data are included in the web server: The Cancer Genome Atlas (TCGA) data on colon and rectal adenocarcinoma (COAD and READ; for lncRNA) and the Encyclopedia of DNA Elements (ENCODE) data on the esophagus and sigmoid colon (for circRNA) [28, 29].

Output summary

After the successful submission of a job, processing statuses, file format conversion, co-expression analysis, interactome networking, and report generation are displayed using a dynamic progress indicator. Computational tools and databases employed in this study are listed in Tables 2 and 3, respectively, which also outline the parameters used to carry out the corresponding analyses. The output section of the tutorial page [27] shows the standard output of circlncRNAnet based on the demonstration datasets. The standard output is represented by dynamic tables and charts, including bar and box plots, scatter plot, circos plot, heatmap, and network plots. Also included in the table is annotation information of the coding and noncoding genes, such as genome location, distance from query lncRNA or circRNA, lncRNA ID (ENCODE), coding potential [30], circRNA ID according to circBase [31], and circRNA (or host gene) splicing structure.

Analytic module #1: coding–noncoding co-expression network profiling

After the upload, the server will first execute the differential expression analysis by using the R package DESeq2 [32]. The interactive interface allows users to define the candidate gene list by fold changes and P-value. Moreover, to inspect the expression distance between samples, principal component analysis (PCA) was implemented in our analysis pipeline.
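The candidate-selection step described above, in which users narrow the differential expression results by fold change and P-value, can be sketched roughly as follows. This is only an illustration in Python, not the server's own R code: the function name `select_candidates`, the default thresholds, the gene labels, and the input layout (gene ID mapped to a log2 fold change and a P-value) are all assumptions made for the example.

```python
import math


def select_candidates(de_results, min_abs_log2fc=1.0, max_p_value=0.05):
    """Return gene IDs passing both cutoffs, ordered by ascending P-value.

    de_results maps a gene ID to a (log2 fold change, P-value) tuple.
    The thresholds are hypothetical defaults, not values from the paper.
    """
    hits = []
    for gene, (log2fc, p_value) in de_results.items():
        # Skip genes without a usable P-value.
        if p_value is None or math.isnan(p_value):
            continue
        if abs(log2fc) >= min_abs_log2fc and p_value <= max_p_value:
            hits.append((p_value, gene))
    return [gene for _, gene in sorted(hits)]


# Made-up values for illustration only.
example = {
    "lncRNA_A": (2.3, 0.001),
    "lncRNA_B": (-1.8, 0.02),
    "gene_C": (0.4, 0.0001),        # significant but small change
    "gene_D": (3.1, float("nan")),  # no P-value available
}

print(select_candidates(example))
```

Taking the absolute value of the log2 fold change keeps both up- and downregulated genes; a user interested in only one direction would compare the signed value instead.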
Figure 2: Input file formats for circlncRNAnet. Interface on the web server for data upload (A). Two files are uploaded prior to data analysis: a gene matrix table (B), which is generated using featureCounts, and a condition file describing the sample status (C).

Table 2: Analytic and visualization R packages incorporated in circlncRNAnet
Analytic software | Version | Description | Ref.
circlize | 0.4.1 | Circos plot | [64]
clusterProfiler | 3.2.14 | Gene enrichment analysis | [65]
DESeq2 | 1.14.1 | Differential expression analysis | [32]
factoextra | 1.0.4 | Principal component analysis | [66]
ggplot2 | 2.2.1.9000 | Data visualization | [67]
plotly | 4.7.1 | Interactive data visualization | [68]
visNetwork | 2.0.1 | Network visualization | [69]
WGCNA | 1.51 | Correlation calculation | [37]

Several known functional attributes of circRNAs/lncRNAs were taken into account when constructing this web server: First, we adopted the gene co-expression analysis, which is based on the concept of “guilt by association”: assuming that genes exhibiting analogous expression patterns may be involved in similar biological pathways, functions of unknown genes may be inferred a priori from the co-expressed, functionally known genes [33]. To this end, Wolfe et al. developed a method to demonstrate that co-expression with biologically defined modules may serve as a basis for characterizing the function of unknown genes [34]. Ricano-Ponce et al. also used co-expression analysis to deduce the function of lncRNAs with expression quantitative trait loci (eQTLs) effects [35]. The combined use of co-expression analysis and Gene Set Enrichment Analysis (GSEA) has been demonstrated to identify lncRNAs putatively involved in neuronal development [36]. To implement this co-expression analysis in circlncRNAnet, we used the R package WGCNA [37] to calculate the Pearson correlation coefficients of selected differentially expressed circRNA/lncRNA expression against all genes in the user-uploaded samples (Fig. 3A). For an overview of the sequenced transcriptomes, the extent of the coordinated expression (Fig. 3B) and overall distribution of noncoding and coding RNA abundance (Fig. 3C) can be displayed as summary graphs. To provide users with a guide in the selection of relevant criteria for expression correlation, the server displays a composite histogram showing the overall distribution of correlation coefficients calculated for all the ncRNA-mRNA pairs, superimposed with the results from randomized correlation tests (500 iterations of randomized Pearson correlations between target ncRNAs and 5000 randomly selected mRNAs). The highly correlated genes (based on user-defined Pearson’s correlation) will also be subjected to pathway enrichment analysis (Fig. 4). The identity and enriched terms of the co-expression networks will be provided to facilitate further functional deduction of ncRNA candidates.

As a proof of principle, we applied our analytic pipeline to a known example of cancer-associated lncRNAs, ELFN1-AS1. Kim et al. recently reported that MYC-regulated lncRNA MYCLo-2 (also known as ELFN1-AS1) represses CDKN2B transcription coordinately with hnRNPK [38]. To demonstrate the utility of circlncRNAnet, we queried the functional network of ELFN1-AS1. We used TCGA data on COAD and READ and paired normal samples as the reference expression datasets. Co-expression gene network analysis for ELFN1-AS1 may be done on the basis of the differentially expressed gene list and outputted according to user-defined criteria (Fig. 4, middle panel). To further visualize overall expression profiles of ELFN1-AS1 co-expressed genes, “heatmap” may be used to display up to 500 of the most correlated genes (ranked by absolute r value) (Fig. 4, upper left panel). Pair-wise expression correlation between the ncRNA and co-expressed mRNA genes is also possible. For instance, as ELFN1-AS1 is a known transcriptional target of MYC, users may compare the expression patterns between ELFN1-AS1 and MYC in the TCGA data. This is done by selecting “Scatter plot” and entering “MYC” in the “Co-expressed gene” box (Fig. 3D). Next, for pathway analysis of genes co-expressed with ELFN1-AS1, the “GO & KEGG Enrichment” functionality is available, in which the “Enriched pathway (MSigDB)” will output top enriched pathways, together with a network representation of the components. In the case of ELFN1-AS1, MYC TARGETS V1 and MYC TARGETS V2 are shown as 2 of the top pathways, consistent with the previous findings (Fig. 4, lower panels).

In addition, we used another novel lncRNA as an example of our analytic approach. XXbac-B476C20.9 was downregulated in colorectal cancer, and higher expression of XXbac-B476C20.9 was associated with better survival expectancy, hinting at a tumor-suppressive role (data not shown). By using Pearson correlation analysis, we identified hundreds of genes that exhibit significant co-expression with this lncRNA (data not shown). By analyzing the chromosome distribution of XXbac-B476C20.9 co-expressed genes, we did not see particular enrichment in chromosome 22 (where XXbac-B476C20.9 locates) (Fig. 4, upper right panel), indicating that this lncRNA may not exert expression regulation in a cis manner.

Correlated expression may also be attributed to the functional interaction of the circRNAs/lncRNAs with particular transcription factor (TF) networks. Indeed, previous studies have reported that lncRNA could regulate TF activity through reciprocal interaction [39]. To address this possibility, our web server is equipped to determine whether the co-expression gene set is enriched in targets of specific TFs. Extensive TF-target pairs were first built by annotating 2 sources of data: (1) computational motif scan of TF binding sites and (2) experimental TF binding sites as archived by the ENCODE Chromatin immunoprecipitation sequencing (ChIP-Seq) data. For the latter, we retrieved ENCODE ChIP-seq data and defined the promoter region as a window from -3000 bp to +1000 bp of the transcription start site to establish putative TF occupancy. The output of this type of analysis can be accessed via gene enrichment module.

Table 3: List of databases and analytic tools employed by circlncRNAnet
Database | Version | Description | Parameters | Ref.
cisBP-RNA and Ray, 2013 (Homo sapiens) | 2013 | RNA binding protein motifs for FIMO to discover potential RNA binding sites | Downloaded from MEME motif database | [70]
dbNSFP (Homo sapiens) | 3.2 | Gene annotation | NA | [71]
ENCODE ChIP-Seq (Homo sapiens) | Feb 2017 | Experimental transcription factor and protein binding sites | Regions from -3000∼1000 bp of TSS were considered as the promoter; in-house scripts were then used to collect peaks with >2 score and annotate as binding sites | [29]
ENCODE eCLIP (Homo sapiens) | Mar 2017 | Experimental RNA binding protein binding sites | In-house scripts were used to collect all the peaks corresponding to binding sites; binding score for each target gene was represented by the lowest peak score | [29]
FIMO | 4.11.2 | Computational RNA binding protein binding sites discovering | Default | [44]
GENCODE (Homo sapiens) | Release 25 | lncRNA annotation | NA | [23]
LNCipedia (Homo sapiens) | 4 | High-confidence lncRNA annotation | NA | [30]
miRanda | 3.3a | miRNA binding sites detection | -m 10 000 000 -p 0.05 | [72]
MSigDB | v5.2 | Computational transcription factor and protein binding sites | The transcription factor targets dataset was used for TF enrichment analysis | [73]
RNAhybrid | 2.1.2 | miRNA binding sites detection | -sc 140, with cutoff seed similarity ≥85% and wobble pair similarity ≥85% | [48]
TarPmiR | Mar 2016 | miRNA binding sites detection | -p 0.1 | [74]

Analytic module #2: RBP interactome mapping

Second, based on the lncRNAs that have been reported thus far, they have been mostly implicated in several aspects of gene expression, such as RNA stability, miRNA sponging, regulation of transcription factor, and epigenetic and chromosomal architecture [4, 7, 20, 21, 40]. Interestingly, behind these regulatory actions, molecular interactions are the most crucial determinant in lncRNAs’ roles. In this context, lncRNAs are known to associate with various proteins (i.e., RNA-binding proteins and chromatin modifiers). For example, lncRNA ELFN1-AS1 interacts with hnRNPK to transcriptionally suppress the expression of CDKN2B, a tumor suppressor gene [38]. LncRNA NORAD acts as sequester of PUM2 to maintain genomic stability [41]. A colorectal cancer (CRC) associated lncRNA MYU binds hnRNPK and consequently stabilizes CDK6, which is critical for colon cancer cells’ growth [42]. These findings thus suggest that delineating the lncRNA-interacting protein network may effectively prompt the functional exploration of lncRNA candidates. In our efforts of mapping the protein interactome of lncRNAs, we have extensively curated and integrated 2 types of public data into reference annotations for the analytic workflow: computational RNA binding protein (RBP) motif scan and experimental RBP databases.

For this purpose, we first collected RBP binding motifs from MEME, which is a motif-discovering software, in addition to several RBP motifs from published data [43]. Next, we generated all lncRNA sequences from GENCODE, Release 25, and used FIMO to scan computationally for the presence of possible RBP binding sites [44]. For the empirical RBP sites, we retrieved the RBP binding sequences from ENCODE eCLIP [45]. To complement the repertoire of RBP included in the analysis, we also integrated protein interaction profile sequencing (PIP-seq) [46]. Although the footprints of protein binding do not readily reveal the identity of the associated factors, PIP-seq data may serve as evidence for molecular interaction.

Given that our exemplary lncRNA ELFN1-AS1 reportedly mediates its function through interacting with hnRNPK, we next tested whether this attribute could be recapitulated by circlncRNAnet. To interrogate the ELFN1-AS1-associated proteins, the

regulation has emerged. By virtue of the distinct distribution of recurring miRNA target sequences in lncRNA transcripts, certain lncRNAs are known to compete with mRNA transcripts for complementary binding by the cognate miRNAs. This regulatory process, referred to as miRNA sponge or competing endogenous RNAs (ceRNAs) [47], alters the endogenous silencing activity of miRNAs, thereby impacting the expression of targeted mRNAs. Some lncRNAs have even been demonstrated as miRNA sponges in certain oncogenic processes [11, 12]. Thus, to complete this bioinformatics package, we installed in this web server an analytic module for sequence-based delineation of potential lncRNA-miRNA sponge pairs. Given that existing miRNA targeting sites databases annotate target sequences only in 3’ UTR, information regarding miRNA:ncRNA complementarity is not readily available. To resolve this issue, we generated a reference database that catalogs putative miRNA binding sites within lncRNAs/circRNAs as computationally predicted by 3 different miRNA target prediction tools (RNAhybrid, miRanda, and TarPmiR) [48–50]. Analogous to the RBP module, an miRNA target is considered a positive hit if 2 of the 3 software tools uncover its existence, and will be denoted as a larger node and shown with a gene symbol in the network diagram.
For the RNA components of the ELFN1-AS1 interactomes, circlncRNAnet provides information on the putative miRNA targeting sites within the RNA sequences. To explore, the “miRNA targeting sites network” may be selected to show the corresponding network (Fig. 5B). Analogous to the RBP network, any miRNA target sequences predicted by at least 2 miRNA targeting site–discovering softwares (miRanda, RNAhybrid, and TarPmiR) will be labeled with gene symbols and a larger node size in the network (Fig. 5B). Analytic module #4: multitier regulatory hierarchy mRNAs harboring the same miRNA binding sites as ncRNAs are likely to be subject to expression alteration in the miRNA sponge scenario—the inverse correlation in expression between miRNA and mRNAs/lncRNAs is expected [47]. Thus, to substan- Figure 3: Schematic showing example outputs of circlncRNAnet analyses of tiate the putative miRNA sponge activity and also to delineate lncRNA-based networks in colorectal cancer. After dataset upload, the server likely downstream mRNA targets, the web server is further de- executes differential expression and expression correlation analyses. The web signed to construct the ncRNA-miRNA-mRNAs regulatory hier- server allows the user to select query genes and correlation criteria (A). For an archy. For this purpose, 3’ UTRs with presumptive miRNA tar- overview of the sequenced transcriptomes, the extent of the coordinated expres- geting, as revealed by the aforementioned prediction tools, will sion (B) and overall distribution of noncoding and coding RNA abundance (C) are be cross-referenced with the gene set that shows correlated ex- displayed as summary graphs. As examples of use, co-expression network anal- ysis of a known lncRNA, ELFN1-AS1, and a novel lncRNA, XXbac-B476C20.9, was pression profiles with the candidate ncRNA. As a result, this in- performed using circlncRNAnet. 
(D) Scatter plot showing the extent of expres- tersected gene list presumably represents the targets of ncRNA- sion correlation between ELFNA-AS1 and 1 target, MYC. (E) Histogram displaying miRNA axis-mediated regulation, and will be depicted in a 2-tier the distributions of the Pearson correlation coefficients of all ncRNA-mRNA pairs network configuration (Fig. 5B). (Obs) and of a randomized correlation test (Rand). Similar network analyses are available for decoding the ncRNA-RBP-mRNA network. To this end, a reference RBP-mRNA database was first established, in which all GENCODE mRNA “Retrieve lncRNA-binding protein” module can be selected to genes were scanned and annotated for experimental and com- display a ELFN1-AS1-associated RNA-binding protein network putational RBP binding using the above approaches. For a partic- (Fig. 5A). An RBP is considered a hit (i.e., potential interactor ular RBP in the ncRNA interactome that is selected by the user, of the given lncRNA/circRNA) if its annotated motifs from at all ncRNA-co-expressed mRNAs with mutual RBP binding will least 2 database sources are detected in the transcript sequence, be assembled based on the RPB-mRNA database. These lines of and will be labeled with a gene symbol and a larger node size. information will then be integrated and subsequently outputted The output of this demo analysis illustrates a number of pu- as the multitier molecular network (Fig. 5A). tative interacting RBPs, one of which is HNRNPK, as reported (Fig. 5A). Benchmarking Analytic module #3: ceRNA networking circlncRNAnet is constructed on the Nginx 1.6.3 and Shiny 1.0.3 Third, aside from protein interactors, the role of circRNAs/ servers, which run on a CentOS 6.2 with 2 Intel XEON E5–2620 lncRNAs in microRNA (miRNA)-mediated post-transcriptional CPU and 200GB RAM. 
To optimize the CPU utilities for multiple Downloaded from https://academic.oup.com/gigascience/article-abstract/7/1/1/4670937 by Ed 'DeepDyve' Gillespie user on 16 March 2018 circlncRNAnet for the bioinformatics of non-coding RNA 7 Figure 4: Additional examples of circlncRNAnet output of lncRNA-based networks in colorectal cancer. In addition to the analyses shown in Fig. 3, more options for network interrogation of ncRNA-based regulation can be accessed on the webpage (middle). For instance, heatmap representation of the genes co-expressed with ELFN1-AS1 (Pearson’s |r| > 0.5) can be outputted (upper left). Pathway analysis of the co-expressed genes on the basis of MSigdb Hallmark pathways (bottom left), and its network depiction of the top 3 enriched pathways and the corresponding co-expressed components (bottom right). Circos plot can also be used to illustrate the genome-wide distribution of the top 100 co-expressed genes relative to the location of XXbac-B476C20.9 (upper right). Downloaded from https://academic.oup.com/gigascience/article-abstract/7/1/1/4670937 by Ed 'DeepDyve' Gillespie user on 16 March 2018 8 Wu et al. Figure 5: Examples of lncRNA-associated molecular components uncovered by circlncRNAnet. circlncRNAnet may be used to extensively profile the molecular in- teractome of candidate circRNAs/lncRNAs based on the compiled databases, with ELFN1-AS1 shown as an example in this figure. (A) For the RBP components ,the interactome will be outputted in both the table format (top) and network configuration (bottom). (B) Similarly, for the putative miRNA sponge network , predicted ELFN1-AS1-targeting miRNAs are shown in table (top) and network (bottom) formats. The web server is also designed to construct the ncRNA-RBP-mRNAs or ncRNA- miRNA-mRNAs regulatory hierarchy. circlncRNAnet delineates co-expressed mRNA genes with mutually shared RBP binding or miRNA targeting sites. 
Consequently, an intersected gene list is compiled (top) and may be depicted in a 2-tier network configuration (bottom). users, we assign 2 threads for an analysis task. We tested the Availability of supporting source code and web service with 20 normal/tumor paired samples, for which the requirements DESeq2 analysis required 130 seconds to produce differentially Project name: circlncRNAnet expressed genes. For calculating a co-expressed gene list, cir- Project home page: http://app.cgu.edu.tw/circlnc/[27], clncRNAnet took 50 seconds for 1 query gene and 270 seconds https://github.com/smw1414/circlncRNAnet [51] for 10 query genes. Operating system(s): platform independent Programming language: PHP, JavaScript, R, R shiny and Shell script Conclusions Other requirements: JavaScript supporting web browser With the expansion of transcriptome sequencing datasets, fo- License: GPLv3 cusing on a select set of publicly available, but potentially Research Resource Identifier: circlncRNAnet, irrelevant, sequencing data does not sufficiently address users’ RRID:SCR 015794 research needs. This prompted us to build a completely new system with the flexibility of accepting private or public data. Availability of supporting data To further support efficient analyses and presentation, we have extensively curated public data into reference annotations for The analytic modules and test datasets (from TCGA and EN- the circlncRNAnet workflow. Multilayer modules and algorithms CODE) are available in the GitHub repository [51]. An archival then provide outputs on expression profiles, co-expression net- copy of the modules and test datasets is also available via the Gi- works and pathways, and molecular interactomes, which are dy- gaScience repository, GigaDB [52]. For the convenience of prospec- namically and interactively displayed according to user-defined tive users, we also provided on GitHub instructions on running criteria. 
In short, users may apply circlncRNAnet to obtain, in our pipeline in local mode. real time, multiple lines of functionally relevant information on the circRNAs/lncRNAs of their interest. The overall workflow Abbreviations takes only a few minutes, as compared with hours of manual effort of independent database searches and analyses. In sum- ceRNA: competing endogenous RNA; ChIP-Seq: chromatin mary, circlncRNAnet is the first of its kind in the regulatory RNA immunoprecipitation sequencing; circRNA: circular RNA; research field, providing a “one-stop” resource for in-depth anal- COAD: colon adenocarcinoma; CRC: colorectal cancer; GSEA: yses of ncRNA biology. A tutorial with demo datasets is avail- Gene Set Enrichment Analysis; lncRNAs: long noncoding RNA; able under “Tutorial,” in which the functional network of known miRNA: microRNA; mRNAs: messenger RNA; ncRNA: noncoding lncRNA was illustrated in silico as an example. RNA; PIP-seq: Protein Interaction Profile sequencing; RBP: Downloaded from https://academic.oup.com/gigascience/article-abstract/7/1/1/4670937 by Ed 'DeepDyve' Gillespie user on 16 March 2018 circlncRNAnet for the bioinformatics of non-coding RNA 9 RNA-binding protein; READ: rectal adenocarcinoma; TCGA: The biomarker for gastric cancer. Oncotarget 2016;7(25):37812– Cancer Genome Atlas. 24. 12. Du Z, Sun T, Hacisuleyman E et al. Integrative analyses reveal a long noncoding RNA-mediated sponge regulatory network Funding in prostate cancer. Nat Commun 2016;7:10982. 13. Boeckel JN, Jae N, Heumuller AW et al. Identification and This work was supported by grants from the Ministry of Sci- characterization of hypoxia-regulated endothelial circular ence and Technology of Taiwan (MOST104–2321-B-182–007-MY3 RNANovelty and significance. Circ Res 2015; 117(10):884–90. to P.J.H.; MOST106–2320-B-182–035-MY3 to H.L.; MOST104– 14. Militello G, Weirick T, John D et al. 
Screening and validation 2320-B-182–029-MY3 and MOST105–2314-B-182–061-MY4 to of lncRNAs and circRNAs as miRNA sponges. Brief Bioinform B.C.M.T.; MOST103–2632-B-182–001, MOST104–2632-B-182–001, 2017;18(5):780–8. and MOST105–2632-B-182–001), Chang Gung Memorial Hospi- 15. Chen G, Wang Z, Wang D et al. LncRNADisease: a database tal (CMRPD1G0321 and CMRPD1G0322 to P.J.H.; CMRPD1F0571 for long-non-coding RNA-associated diseases. Nucleic Acids to H.L.; CMRPG3D1513 and CMRPG3D1514 to W.S.T.; CM- Res 2013;41(Database issue):D983–6. RPD3E0153, CMRPD1F0442, and BMRP960 to B.C.M.T.), the 16. Huarte M. A lncRNA links genomic variation with celiac dis- National Health Research Institute of Taiwan (NHRI-EX105– ease. Science 2016;352(6281):43–44. 10321SI), the Ministry of Education of Taiwan, and Biosignature 17. Li P, Chen S, Chen H et al. Using circular RNA as a novel type Research Grant CIRPD3B0013 for supporting bioinformatics and of biomarker in the screening of gastric cancer. Clin Chim computing resources. Acta 2015;444:132–6. 18. Qian Y, Lu Y, Rui C et al. Potential significance of circular RNA Competing interests in human placental tissue for patients with preeclampsia. Cell Physiol Biochem 2016;39(4):1380–90. The authors declare that they have no competing interests. 19. Yang G, Lu X, Yuan L. LncRNA: a link between RNA and can- cer. Biochim Biophys Acta 2014;1839(11):1097–109. 20. Xie X, Tang B, Xiao YF et al. Long non-coding RNAs in col- Author contributions orectal cancer. Oncotarget 2016;7(5):5226–39. H.L. and B.C.T. conceived the original idea of the web server. 21. Han D, Wang M, Ma N et al. Long noncoding RNAs: novel S.W., P.H., Y.C., C.L., W.T., and H.L. designed and implemented the players in colorectal cancer. Cancer Lett 2015;361(1):13–21. web server. S.W., P.J., and Y.C. conducted the benchmarks. C.L., 22. Park NJ, Zhou H, Elashoff D et al. Salivary microRNA: discov- C.Y., W.T., and B.C.T. 
tested the system and provided feedback ery, characterization, and clinical utility for oral cancer de- on features and functionality. S.W., H.L., and B.C.T. wrote the tection. Clin Cancer Res 2009;15(17):5473–7. manuscript. All authors read and approved the final manuscript. 23. Harrow J, Frankish A, Gonzalez JM et al. GENCODE: the ref- erence human genome annotation for The ENCODE Project. Genome Res 2012;22(9):1760–74. Acknowledgements 24. Zheng LL, Li JH, Wu J et al. deepBase v2.0: identification, ex- We are grateful to members of the BC-MT laboratory for critical pression, evolution and function of small RNAs, LncRNAs reading of the article and important discussions. and circular RNAs from deep-sequencing data. Nucleic Acids Res 2016;44(D1):D196–202. 25. Liao Y, Smyth GK, Shi W. featureCounts: an efficient general References purpose program for assigning sequence reads to genomic 1. Ponting CP, Oliver PL, Reik W. Evolution and functions of long features. Bioinformatics 2014;30(7):923–30. noncoding RNAs. Cell 2009;136(4):629–41. 26. Szabo L, Morey R, Palpant NJ et al. Statistically based splic- 2. Liu J, Liu T, Wang X, He A. Circles reshaping the RNA world: ing detection reveals neural enrichment and tissue-specific from waste to treasure. Mol Cancer 2017;16(1):58. induction of circular RNA during human fetal development. 3. Quinn JJ, Chang HY. Unique features of long non-coding RNA Genome Biol 2015;16(1):126. biogenesis and function. Nat Rev Genet 2016;17(1):47–62. 27. Wu S-M, Liu H, Huang P-J et al. circlncRNAnet: an inte- 4. Wang KC, Chang HY. Molecular mechanisms of long noncod- grated web-based resource for mapping functional net- ing RNAs. Mol Cell 2011;43(6):904–14. works of long or circular forms of non-coding RNAs. 2017, http://http://app.cgu.edu.tw/circlnc/. Accessed November 5. Jeck WR, Sharpless NE. Detecting and characterizing circular RNAs. Nat Biotechnol 2014;32(5):453–61. 2017. 28. Cancer Genome Atlas Network. Comprehensive molecular 6. 
Petkovic S, Muller S. RNA circularization strategies in vivo and in vitro. Nucleic Acids Res 2015;43(4):2454–65. characterization of human colon and rectal cancer. Nature 7. Rinn JL, Chang HY. Genome regulation by long noncoding 2012;487(7407):330–7. RNAs. Annu Rev Biochem 2012;81(1):145–66. 29. Consortium EP. An integrated encyclopedia of DNA elements 8. Wang X, Arai S, Song X et al. Induced ncRNAs allosterically in the human genome. Nature 2012;489(7414):57–74. modify RNA-binding proteins in cis to inhibit transcription. 30. Volders PJ, Verheggen K, Menschaert G et al. An up- Nature 2008;454(7200):126–30. date on LNCipedia: a database for annotated hu- 9. Guttman M, Rinn JL. Modular regulatory principles of large man lncRNA sequences. Nucleic Acids Res 2015;43(8): non-coding RNAs. Nature 2012;482 (7385):339–46. 4363–4. 10. Rinn JL, Kertesz M, Wang JK et al. Functional demarcation of 31. Glazar P, Papavasileiou P, Rajewsky N. circBase: a database active and silent chromatin domains in human HOX loci by for circular RNAs. RNA 2014;20(11):1666–70. 32. Love MI, Huber W, Anders S. Moderated estimation of noncoding RNAs. Cell 2007;129(7):1311–23. 11. Shao Y, Ye M, Li Q et al. LncRNA-RMRP promotes carcino- fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 2014;15(12):550. genesis by acting as a miR-206 sponge and is used as a novel Downloaded from https://academic.oup.com/gigascience/article-abstract/7/1/1/4670937 by Ed 'DeepDyve' Gillespie user on 16 March 2018 10 Wu et al. 33. De Smet R, Marchal K. Advantages and limitations of 54. Liao Q, Xiao H, Bu D et al. ncFANs: a web server for func- current network inference methods. Nat Rev Microbiol tional annotation of long non-coding RNAs. Nucleic Acids 2010;8(10):717–29. Res 2011;39(suppl):W118–24. 34. Wolfe CJ, Kohane IS, Butte AJ. Systematic survey reveals gen- 55. Quek XC, Thomson DW, Maag JL et al. 
lncRNAdb v2.0: ex- eral applicability of “guilt-by-association” within gene coex- panding the reference database for functional long noncod- pression networks. BMC Bioinformatics 2005;6 1:227. ing RNAs. Nucleic Acids Res 2015;43(D1):D168–73. 35. Ricano-Ponce I, Zhernakova DV, Deelen P et al. Refined map- 56. Goepferich M, Herrmann C. LINC: co-expression of lin- ping of autoimmune disease associated genetic variants cRNAs and protein-coding genes. 2017. https://doi.org/ with gene expression suggests an important role for non- doi:10.18129/B9.bioc.LINC. Accessed December 2017. coding RNAs. J Autoimmun 2016;68:62–74. 57. Jia Z, Liu Y, Guan N et al. Cogena, a novel tool for co- 36. D’Haene E, Jacobs EZ, Volders PJ et al. Identification of long expressed gene-set enrichment analysis, applied to drug non-coding RNAs involved in neuronal development and in- repositioning and drug mode of action discovery. BMC Ge- tellectual disability. Sci Rep 2016;6 1:28396. nomics 2016;17:414. doi:10.1186/s12864-016-2737-8. 37. Langfelder P, Horvath S. WGCNA: an R package for weighted 58. Zhang Y, Xie J, Yang J et al. QUBIC: a bioconductor package for correlation network analysis. BMC Bioinformatics 2008;9 qualitative biclustering analysis of gene co-expression data. 1:559. Bioinformatics 2017;33(3):450–2. 38. Kim T, Jeon YJ, Cui R et al. Role of MYC-regulated long non- 59. Liu YC, Li JR, Sun CH et al. CircNet: a database of circular coding RNAs in cell cycle regulation and tumorigenesis. J RNAs derived from transcriptome sequencing data. Nucleic Natl Cancer Inst 2015;107(4): doi:10.1093/jnci/dju505. Acids Res 2016;44(D1):D209–15. 39. Fu M, Huang G, Zhang Z et al. Expression profile of long non- 60. Zhang XO, Dong R, Zhang Y et al. Diverse alternative back- coding RNAs in cartilage from knee osteoarthritis patients. splicing and alternative splicing landscape of circular RNAs. Osteoarthritis Cartilage 2015;23(3):423–32. Genome Res 2016;26(9):1277–87. 40. Chen X, Liu B, Yang R et al. 
Integrated analysis of long 61. Ghosal S, Das S, Sen R et al. Circ2Traits: a comprehensive non-coding RNAs in human colorectal cancer. Oncotarget database for circular RNA potentially associated with disease 2016;7(17):23897–908. and traits. Front Genet 2013;4:283. 41. Lee S, Kopp F, Chang TC et al. Noncoding RNA NORAD reg- 62. Dudekula DB, Panda AC, Grammatikakis I et al. CircInter- ulates genomic stability by sequestering PUMILIO proteins. actome: a web tool for exploring circular RNAs and their Cell 2016;164(1–2):69–80. interacting proteins and microRNAs. RNA Biol 2016;13(1): 42. Kawasaki Y, Komiya M, Matsumura K et al. MYU, a 34–42. target lncRNA for Wnt/c-Myc signaling, mediates induc- 63. Li JH, Liu S, Zhou H et al. starBase v2.0: decoding miRNA- tion of CDK6 to promote cell cycle progression. Cell Rep ceRNA, miRNA-ncRNA and protein–RNA interaction net- 2016;16(10):2554–64. works from large-scale CLIP-Seq data. Nucl Acids Res 43. Bailey TL, Boden M, Buske FA et al. MEME SUITE: tools for mo- 2014;42(D1):D92–7. tif discovery and searching. Nucleic Acids Res 2009;37(Web 64. Gu Z, Gu L, Eils R et al. Circlize implements and enhances Server):W202–8. circular visualization in R. Bioinformatics 2014;30(19): 44. Grant CE, Bailey TL, Noble WS. FIMO: scanning for occur- 2811–2. rences of a given motif. Bioinformatics 2011;27(7):1017–8. 65. Yu G, Wang LG, Han Y et al. clusterProfiler: an R package for 45. Hong EL, Sloan CA, Chan ET et al. Principles of meta- comparing biological themes among gene clusters. OMICS data organization at the ENCODE data coordination center. 2012;16(5):284–7. Database 2016;2016: doi:10.1093/database/baw001. 66. Mundt AKF. Factoextra: extract and visualize the re- 46. Silverman IM, Li F, Alexander A et al. RNase-mediated sults of multivariate data analyses. 2017. https://cran.r- protein footprint sequencing reveals protein-binding project.org/package=factoextra. Accessed December 2017. sites throughout the human transcriptome. 
Genome Biol 67 Wickham H. ggplot2: Elegant Graphics for Data Analysis. 2014;15(1):R3. New York: Springer-Verlag; 2009. 47. Thomson DW, Dinger ME. Endogenous microRNA sponges: 68. Sievert C, Parmer C, Hocking T et al. Plotly: create interactive evidence and controversy. Nat Rev Genet 2016;17(5):272–83. web graphics via ‘plotly.js.’ 2017. https://plot.ly/r. Accessed 48. Kruger J, Rehmsmeier M. RNAhybrid: microRNA target pre- December 2017. diction easy, fast and flexible. Nucleic Acids Res 2006; 34(Web 69. Almende BV, Thieurmel B, Robert T. visNetwork: network Server):W451–4. visualization using ‘vis.js’ Library. 2017. https://CRAN.R- 49. Lewis BP, Burge CB, Bartel DP. Conserved seed pairing, often project.org/package=visNetwork. Accessed December 2017. flanked by adenosines, indicates that thousands of human 70. Ray D, Kazan H, Cook KB et al. A compendium of genes are microRNA targets. Cell 2005;120(1):15–20. RNA-binding motifs for decoding gene regulation. Nature 50. Enright AJ, John B, Gaul U et al. MicroRNA targets in 2013;499(7457):172–7. Drosophila. Genome Biol 2003;5(1):R1. 71. Liu X, Jian X, Boerwinkle E. dbNSFP: a lightweight database 51. Wu S-M, Liu H, Huang P-J et al. circlncRNAnet GitHub repos- of human nonsynonymous SNPs and their functional pre- itory. 2017. https://github.com/smw1414/circlncRNAnet. Ac- dictions. Hum Mutat 2011;32(8):894–9. cessed November 2017. 72. John B, Enright AJ, Aravin A et al. Human microRNA targets. 52. Wu S-M, Liu H, Huang P-J et al. Supporting data for PLoS Biol 2004;2(11):e363. “circlncRNAnet: an integrated web-based resource 73. Subramanian A, Tamayo P, Mootha VK et al. Gene set enrich- for mapping functional networks of long or circular ment analysis: a knowledge-based approach for interpreting forms of noncoding RNAs.” GigaScience Database 2017. genome-wide expression profiles. Proc Natl Acad Sci U S A http://dx.doi.org/10.5524/100378. Accessed December 2017. 2005;102(43):15545–50. 53. Zhao Y, Li H, Fang S et al. 
NONCODE 2016: an informative 74. Ding J, Li X, Hu H. TarPmiR: a new approach for mi- and valuable data source of long non-coding RNAs. Nucleic croRNA target site prediction. Bioinformatics 2016;32(18): Acids Res 2016;44(D1):D203–8. 2768–75. Downloaded from https://academic.oup.com/gigascience/article-abstract/7/1/1/4670937 by Ed 'DeepDyve' Gillespie user on 16 March 2018
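Illustrative note on the consensus and intersection rules. The text describes two simple set operations: an miRNA is called a positive hit when at least 2 of the 3 prediction tools (RNAhybrid, miRanda, TarPmiR) report it, and the 2-tier ncRNA-miRNA-mRNA network keeps only mRNAs that are both predicted miRNA targets and co-expressed with the candidate ncRNA. The following is a minimal Python sketch of that logic under stated assumptions; it is not the server's implementation (which is reported to use R, PHP, JavaScript and shell scripts). The function names, the miRNA labels (miR-A to miR-D), the gene labels (GENE1 to GENE4) and the ncRNA label are all hypothetical toy values.

```python
from collections import defaultdict


def consensus_mirnas(predictions, min_tools=2):
    """Return miRNAs reported by at least `min_tools` prediction tools.

    `predictions` maps a tool name to the set of miRNAs that tool
    predicts to bind the ncRNA of interest.
    """
    votes = defaultdict(set)
    for tool, mirnas in predictions.items():
        for mirna in mirnas:
            votes[mirna].add(tool)
    return {m for m, tools in votes.items() if len(tools) >= min_tools}


def two_tier_network(ncrna, consensus, mirna_targets, coexpressed):
    """Build ncRNA -> miRNA -> mRNA edges for the intersected gene list.

    An mRNA is kept only if it is a predicted target of a consensus miRNA
    AND belongs to the set of genes co-expressed with the ncRNA.
    """
    edges = []
    for mirna in sorted(consensus):
        shared = mirna_targets.get(mirna, set()) & coexpressed
        if shared:
            edges.append((ncrna, mirna))
            edges.extend((mirna, gene) for gene in sorted(shared))
    return edges


# Hypothetical toy inputs; these are not real predictions.
predictions = {
    "RNAhybrid": {"miR-A", "miR-B"},
    "miRanda": {"miR-A", "miR-C"},
    "TarPmiR": {"miR-A", "miR-B", "miR-D"},
}
mirna_targets = {"miR-A": {"GENE1", "GENE2"}, "miR-B": {"GENE3"}}
coexpressed = {"GENE2", "GENE4"}

hits = consensus_mirnas(predictions)
network = two_tier_network("ncRNA-X", hits, mirna_targets, coexpressed)
```

The same two functions would apply to the ncRNA-RBP-mRNA hierarchy by substituting RBP database sources for prediction tools and RBP-bound mRNAs for miRNA targets; a stricter call (all 3 tools) is obtained with min_tools=3.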

Journal

GigaScience, Oxford University Press

Published: Jan 1, 2018
