High-risk, Expression-Based Prognostic Long Noncoding RNA Signature in Neuroblastoma

High-risk, Expression-Based Prognostic Long Noncoding RNA Signature in Neuroblastoma Background: Current clinical risk factors stratify patients with neuroblastoma (NB) for appropriate treatments, yet patients with similar clinical behaviors evoke variable responses. MYCN amplification is one of the established drivers of NB and, when combined with high-risk displays, worsens outcomes. Growing high-throughput transcriptomics studies suggest long noncoding RNA (lncRNA) dysregulation in cancers, including NB. However, expression-based lncRNA signatures are altered by MYCN amplification, which is associated with high-risk, and patient prognosis remains limited. Methods: We investigated RNA-seq-based expression profiles of lncRNAs in MYCN status and risk status in a discovery cohort (n ¼ 493) and validated them in three independent cohorts. In the discovery cohort, a prognostic association of lncRNAs was determined by univariate Cox regression and integrated into a signature using the risk score method. A novel risk score threshold selection criterion was developed to stratify patients into risk groups. Outcomes by risk group and clinical subgroup were assessed using Kaplan-Meier survival curves and multivariable Cox regression. The performance of lncRNA signatures was evaluated by receiver operating characteristic curve. All statistical tests were two-sided. Results: In the discovery cohort, 16 lncRNAs that were differentially expressed (fold change  2 and adjusted P  0.01) integrated into a prognostic signature. A high risk score group of lncRNA signature had poor event-free survival (EFS; P < 1E-16). Notably, lncRNA signature was independent of other clinical risk factors when predicting EFS (hazard ratio ¼ 3.21, P ¼ 5.95E–07). The findings were confirmed in independent cohorts (P ¼ 2.86E-02, P ¼ 6.18E-03, P ¼ 9.39E-03, respectively). Finally, the lncRNA signature had higher accuracy for EFS prediction (area under the curve ¼ 0.788, 95% confidence interval ¼ 0.746 to 0.831). Conclusions: Here, we report the first (to our knowledge) RNA-seq 16-lncRNA prognostic signature for NB that may contribute to precise clinical stratification and EFS prediction. Neuroblastoma (NB) is the most common childhood cancer of approximately 25% of NB cases and correlates with the high- undifferentiated sympathetic neuroblasts, accounting for ap- risk tumor subtype (9,10). proximately 15% of deaths in children worldwide (1–3). Currently, the clinical risk factors of NB, including patient According to the Surveillance, Epidemiology, and End Results age at diagnosis, MYCN status, tumor stage, chromosomal (SEER) Cancer Statistics report, every year, more than 650 cases aberrations, and tumor histology are still in use for risk assess- are diagnosed in North America (4,5), with an incidence rate of ment and determination of appropriate treatments (11–13). about 10.54 cases per million per year in children younger than Nevertheless, risk assessments based on these risk factors have 15 years (6,7). The clinical hallmark of NB is its tumor heteroge- limited success, as patients with similar clinical behaviors neity, represented by disparate clinical behaviors (1,2). MYCN evoke variable responses (14). Identifying tumor-specific molec- oncogene amplification is one of the established drivers of ular markers can provide better risk estimation and determina- NB, indicating worsened outcomes (8,9). It contributes to tion of effective protocols for treating patients at the time of Received: January 8, 2018; Revised: March 13, 2018; Accepted: March 29, 2018 © The Author(s) 2018. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/ licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com Downloaded from https://academic.oup.com/jncics/article-abstract/2/2/pky015/5026709 1of 12 by Ed 'DeepDyve' Gillespie user on 21 June 2018 2of 12 | JNCI J Natl Cancer Inst, 2018, Vol. 0, No. 0 diagnosis. Genomics and experimental studies conducted for Detection of Prognostic lncRNA Signature From the protein-coding genes (PCGs) or microRNAs can discriminate Discovery Cohort between patients with an unfavorable or favorable outcome Univariate Cox proportional hazard regression was applied to (14–24). However, the five-year survival rate of event-free sur- examine the association between 16 differentially expressed vival (EFS) in the high-risk tumor subtype is still approximately lncRNAs and patient EFS and overall survival (OS). LncRNAs sta- 50% (25). With advancements in RNA-sequencing (RNA-seq) tistically significantly associated (P < 0.001) with patient EFS technology, there have been numerous efforts to correlate the were integrated into a signature using a risk score formula. The expression of long noncoding RNA (lncRNA) with tumor progno- risk score for each patient was calculated by a linear combination sis (26–32). of expression and univariate coefficient of lncRNAs as follows: LncRNAs (longer than 200 nucleotides) are major noncoding transcriptomes (33), transcribed by RNA polymerase II, and ex- hibit both low and tissue-specific expression (34–37). LncRNAs Risk score ¼ W  exp ; ij j¼1 are highly stable and easily detected in various body fluids, including urine (38), plasma (39), and blood (40), urging noninva- where W is the univariate coefficient for lncRNA j, Exp is the j ij sive diagnosis. They exhibit significant interindividual expres- expression value of lncRNA j in patient i, and n is the number of sion variations in the same cell type compared with PCGs (41). testing lncRNAs. Herein, n is 16. LncRNAs have also been reported as a tumor suppressor or on- cogene in several cancers, including NB (29,42–47). Our group identified that the lncRNA SNHG1 is regulated by N-MYC and in- Stepwise Risk Score Threshold Selection dependently predicts patient outcomes for EFS in NB (48). LncRNA risk scores were arranged in increasing order and di- However, identifying an expression-based lncRNA signature that is altered by MYCN amplification is associated with high- vided into quartiles. Next, the patient was classified into a fa- vorable or unfavorable risk score group using the first risk, and patient prognosis for EFS is largely unknown. Here, we analyzed RNA-seq expression profiles of lncRNAs in MYCN sta- quartile, median, or third quartile risk score cutoff. We then built Kaplan-Meier plots for quartiles. To select the risk score tus and risk status subtypes in NB. Using the risk score formula, we developed a 16-lncRNA signature that robustly discriminates threshold, we checked which quartile statistically signifi- cantly separated patients into two groups and had a survival patients at greater risk for relapse. Our results suggest that the lncRNA signature can serve as a potential prognostic biomarker probability of EFS of less than 50% in the unfavorable risk score group. The risk scores and the threshold for each co- for EFS in NB. hort were calculated separately. Methods Statistical Analysis for the Discovery Cohort NB Patient Data Sets Kaplan-Meier survival analysis (eg, favorable vs unfavorable NB expression data sets and corresponding clinical information risk score groups) with Mantel log-rank test was performed for the difference between survival curves. Multivariable Cox were downloaded from the Gene Expression Omnibus (GEO), the Genomic Data Commons (GDC), and the Therapeutically proportional hazard regression was performed to determine the prognostic independence of lncRNA risk scores and clini- Applicable Research to Generate Effective Treatments (TARGET) database. The following four cohorts were included in our cal risk factors. ROC curve and area under curve (AUC) analy- ses were used to evaluate the sensitivity and specificity of study. For training, the RNA-seq cohort with GEO accession number GSE62564 (49) was used and termed as the discovery co- the lncRNA risk score for EFS prediction. All statistical tests were two-sided. Survival analysis was performed using the hort. For validation, the RNA-seq cohort from GDC, microarray cohort from TARGET, and another with GEO accession number survival R package (52). TheAUC and confidenceinterval for the AUC were calculated using the pROC Rpackage (53). GSE16476 (50) were used and termed as independent cohorts 1–3, respectively. Details of data preprocessing and statistical analysis for inde- pendent cohorts are in the Supplementary Methods (available online). lncRNA Profiling in the Discovery Cohort The log2RPM normalized NB RNA-seq data set consisted of 498 Gene Set Enrichment Analysis patients, of which five with unknown MYCN status were re- moved. The clinical characteristics are shown (Supplementary Spearman correlation coefficients (SCC) were calculated Table 1, available online). To avoid negative log2 expression between the 16 lncRNAs and whole-genome PCGs (19 199 PCGs) values, the intensities were converted back to their original from the discovery cohort. To assess the function of each raw expression, increased by 1, then log2 transformed. PCG lncRNA, gene set enrichment analysis (GSEA v2.2.3, and lncRNA expression was extracted based on their RefSeq ID Broad Institute) (54) was performed using MSigDB annotation, which identified 34 255 and 6260 PCGs and (C5.bp.v5.2.symbols.gmt gene set collection, 4653 gene sets lncRNAs transcripts, respectively. LncRNAs were differentially available), with a ranked list as lncRNA-correlated PCGs and expressed (P  .01 and a fold-change  2) in MYCN status their corresponding SCCs, maximum gene set size of 5000, min- (MYCN amplified vs MYCN nonamplified), and risk status imum gene set size of 15, 1000 permutations, and weighted en- (high-risk vs low-risk) was identified using the limma Rpackage richment statistics. Over-represented gene sets (false discovery (51). In the case of multiple transcripts representing the same rate [FDR] q value ¼ 0.001, overlap coefficient value ¼ 0.5) were gene, high standard deviations were taken for further filtered and visualized using the Enrichment map-Cytoscape analyses. plug-in (55). Downloaded from https://academic.oup.com/jncics/article-abstract/2/2/pky015/5026709 by Ed 'DeepDyve' Gillespie user on 21 June 2018 D. Sahu et al. |3of12 Figure 1. Identification of a subgroup in the MYCN nonamplified tumor samples. A) The heat map shows expression values of the long noncoding RNAs (lncRNAs) dif- ferentially expressed in MYCN status (MYCN amplified vs MYCN nonamplified) samples and risk status (high-risk vs low-risk) samples. Each column indicates a pa- tient annotated according to their clinical information related to MYCN status, age, risk status, and event-free survival status. Each row represents lncRNAs ordered by average linkage hierarchical clustering. The expression value of each lncRNA was z-normalized and is shown with a gradient color scale. The unsupervised hierarchi- cal clustering identified three clusters, as demonstrated by boxes. B) Kaplan-Meier plot of event-free survival and overall survival of cluster 2 and cluster 3 low-risk neuroblastoma patients. The P values were obtained using a Mantel log-rank test (two-sided). C) Venn diagram shows overlapping lncRNAs of the discovery cohort, in- dependent cohort 1, independent cohort 2, and independent cohort 3. lncRNA ¼ long noncoding RNA; NB ¼ neuroblastoma. comparable number of high- and low-risk patients. Kaplan- Results Meier analysis showed that patients in cluster 1 and cluster 3 had poor EFS and OS (Supplementary Figure 3A, available on- Identification of lncRNA Signature From the Discovery line). Next, we extracted low-risk patients from cluster 2 and Cohort cluster 3 and found that patients in cluster 3 had poorer EFS (P We first performed a differential expression analysis of the dis- ¼ 2.81E-05) and OS (P ¼ 6.37E-07) than patients in cluster 2 covery cohort—a total of 493 patients—of which 92 were MYCN (Figure 1B). Moreover, we also checked the expression of amplified and 401 were MYCN nonamplified NB tumor sam- lncRNAs: SNHG1 and CASC15 were reported to be highly ples. We identified 90 lncRNAs to be differentially expressed expressed in high-risk and low-risk NB tumors, respectively (fold change  2 and adjusted P  .01) in the MYCN status con- (Supplementary Figure 3B, available online) (48,56). dition. We then performed a differential expression analysis of Furthermore, for our downstream analyses, we considered 16 risk status, of which 175 were high-risk and 318 were low-risk of the 20 lncRNAs because four lncRNAs were not detected in tumor samples. With the same filter criteria as described the independent cohorts (Figure 1C). above, we identified 35 lncRNAs that were differentially expressed. We retained only those lncRNAs that were also an- notated in Gencode v.24 (35). The 20 lncRNAs were shared be- Building the 16-lncRNA Signature Risk Score tween MYCN status, risk status, and Gencode v.24 (Supplementary Figures 1 and 2 and Supplementary Table 2, The 16 lncRNAs with univariate Cox analysis were statistically available online). The unsupervised clustering analysis significantly (P < 0.001) associated with EFS and OS revealed that there were three clusters identified (Figure 1A). (Figure 2A). The eight lncRNAs with a negative coefficient Patients in cluster 1 and cluster 2 belonged to high- and low- were defined as “good survival lncRNAs,” whose high expres- risk groups, respectively. However, cluster 3 contains a sion is associated with good survival. The remaining eight Downloaded from https://academic.oup.com/jncics/article-abstract/2/2/pky015/5026709 by Ed 'DeepDyve' Gillespie user on 21 June 2018 4of 12 | JNCI J Natl Cancer Inst, 2018, Vol. 0, No. 0 Figure 2. Univariate Cox regression and risk score analysis of prognostic long noncoding RNAs (lncRNAs) from the discovery cohort. A) Bar graph shows 16 prognostic lncRNAs ordered by their univariate z-score for event-free survival (EFS) and overall survival (OS). Positive scores are associated with shorter survival, and negative scores are associated with longer survival. Red and blue bars represent bad survival lncRNAs and good survival lncRNAs, respectively. The dashed line (colored in green) represents an absolute univariate z-score value of 61.96. B) Point plot of risk scores show risk score groups represented by color. Black represents favorable risk score group of patient samples, and red represents unfavorable risk score group of patient samples classified on median risk score of the discovery cohort. C) Waterfall plot of ordered risk scores shows disease relapse status of the patient. Red and gray bars represent patients with disease relapse and those who have not relapsed, re- spectively. D) Heat map shows the expression profile of the lncRNA signature. Each column indicates a patient in the favorable risk score group (black) and unfavorable risk score group (red). Each row represents lncRNAs associated with shorter survival (red) and longer survival (blue). The lncRNAs were ordered by hierarchical cluster- ing. The expression value of each lncRNA is scaled across rows and shown with a blue-red color scale. EFS ¼ event-free survival; lncRNA ¼ long noncoding RNA; OS ¼ overall survival. with a positive coefficient were defined as “bad survival two groups, with an EFS lower than 50% in the unfavorable risk lncRNAs,” whose high expression is associated with poor sur- score group (Supplementary Figure 5, available online). The 16- vival (Supplementary Table 3, available online). Permutation lncRNA signature risk scores range from –5.6 to 21.44 (median ¼ testing indicated that 16 lncRNAs had a more statistically signifi- 0.296) (Figure 2B). The clinical characteristics of patients based cant association with EFS prediction than expected by chance on risk groups from the discovery cohort are shown in (P < 1E-16) (Supplementary Methods and Supplementary Figure Supplementary Table 4 (available online). The waterfall plot 4, available online). A risk score was constructed with the regres- shows that most of the patients with a high risk score relapsed sion coefficient for EFS and was arranged in increasing order, (Figure 2C). The heat map (Figure 2D) shows that patients in the and then divided into quartiles. After creating Kaplan-Meier unfavorable risk score group tend to express bad survival plots for the quartiles, it was evident that the median risk score lncRNAs and patients in the favorable risk score group tend to threshold statistically significantly separated the patients into express good survival lncRNAs. Downloaded from https://academic.oup.com/jncics/article-abstract/2/2/pky015/5026709 by Ed 'DeepDyve' Gillespie user on 21 June 2018 D. Sahu et al. |5of12 Figure 3. Survival estimates of event-free survival in neuroblastoma patients. Kaplan-Meier plots of favorable and unfavorable risk score groups based on the (A) me- dian risk score of the discovery cohort, (B) first quartile risk score of the independent cohort 1, (C) first quartile risk score of the independent cohort 2, (D) median risk score of the independent cohort 3. The P values were obtained using a Mantel log-rank test (two-sided). lncRNA ¼ long noncoding RNA. Prognostic Association of 16-lncRNA Signature Risk Survival Prediction by the 16-lncRNA Signature Is Score With Patient Survival Independent of Clinical Risk Factors Compared with patients in the favorable risk score group, those To evaluate the prognostic independence of the 16-lncRNA with an unfavorable risk score had a poor EFS and OS (Figure 3A; signature against known clinical risk factors, multivariable Supplementary Figure 6A, available online). Only 39% of NB patients Cox analysis showed that in the discovery cohort, in the unfavorable risk score group were disease free at five years, lncRNA signature (HR ¼ 3.21, P ¼ 5.95E-07), MYCN status compared with 86% of patients in the favorable risk score group. (HR ¼ 1.41, P ¼ 4.39E-02), stage (HR ¼ 1.51, P ¼ 3.06E-02), and The OS probability at five years was 57% in the unfavorable risk age (HR ¼ 1.6, P ¼ 8.4E-03) were predicted EFS score group, compared with 99% in the favorable risk score group independently (Figure 4A). In independent cohort 1, only (Supplementary Table 5, available online). The 16-lncRNA signature lncRNA signature (HR ¼ 2.32, P ¼ 2.86E-02) was indepen- was tested in three independent cohorts for validation. Using the dently associated with EFS (Figure 4B). In independent co- same risk score formula and stepwise risk score threshold selection hort 2, lncRNA signature (HR ¼ 2.61, P ¼ 6.18E-03) and age criteria (Supplementary Figure 5, available online), the 16-lncRNA (HR ¼ 23.56, P ¼ 1.9E-03) were independently associated signature statistically significantly stratified patients into two risk with EFS (Figure 4C). In independent cohort 3, only score groups for EFS and OS (Figures 3,B–D; Supplementary Figure lncRNA signature (HR ¼ 3.91, P ¼ 9.39E-03) was indepen- 6, B–D, available online), respectively. The clinical characteristics of dently associated with EFS (Figure 4D). Similar results were patients based on risk groups from the independent cohorts are also obtained for OS (Supplementary Figure 7, available shown (Supplementary Table 4, available online). online). Downloaded from https://academic.oup.com/jncics/article-abstract/2/2/pky015/5026709 by Ed 'DeepDyve' Gillespie user on 21 June 2018 6of 12 | JNCI J Natl Cancer Inst, 2018, Vol. 0, No. 0 Figure 4. Multivariable Cox analysis of the 16-long noncoding RNA (lncRNA) signature for event-free survival in neuroblastoma patients of (A) discovery cohort, (B) in- dependent cohort 1, (C) independent cohort 2, and (D) independent cohort 3. Forest plot of the 16-lncRNA signature shows that patients in the unfavorable risk score groups had poor outcomes and an independent predictor of event-free survival after adjusting for the clinical risk factors. The hazard ratio and confidence interval for independent cohort 2 are represented on a log10 scale. All statistical tests were two-sided. CI ¼ confidence interval, HR ¼ hazard ratio; lncRNA ¼ long noncoding RNA. were considered continuous variables, and MYCN status, risk Survival Prediction by 16-lncRNA Signature Within status, and stage were considered categorical variables. The Clinical Risk Factor Subgroups results of the discovery and independent cohorts are shown The above analysis revealed that MYCN status, age, and stage (Figure 6A; Supplementary Figure 12, available online). were also statistically significantly associated with EFS. Moreover, studies have shown that several lncRNAs that were Therefore, in order to corroborate whether lncRNA signature identified in our study are prognostic biomarkers for survival in can stratify these risk factors into risk score groups, we first per- NB (48,56,57). ROC analysis showed higher prediction accuracy formed data stratification according to age, risk status, stage, of the 16-lncRNA signature against individual lncRNAs and MYCN status. Within each subgroup of risk factors, patients (Figure 6B). Furthermore, we compared the prediction accuracy were stratified into favorable and unfavorable risk score groups of the 16-lncRNA signature with other published NB prognostic using the median risk score of the discovery cohort. In all sub- gene signatures (14,22–24). We extracted their gene list, built the groups except the high-risk subgroup (n ¼ 175, P ¼ 0.518), the risk score from the discovery cohort, and calculated the AUC of 16-lncRNA signature statistically significantly stratified patients the ROC curve. As expected, the lncRNA signature showed simi- into two risk groups (Figure 5, A–G). The survival comparison for lar performance for EFS compared with other prognostic gene the MYCN amplified subgroup is not shown, as all the patients signatures (Supplementary Figure 13, available online). The were stratified under the unfavorable risk score group. Similar AUC and 95% confidence intervals of the AUC for the 16-lncRNA results were also obtained for OS (Supplementary Figure 8, A–G, signature (AUC ¼ 0.788, 95% CI ¼ 0.746 to 0.831), individual available online). In addition, the 16-lncRNA signature statisti- lncRNAs, and clinical risk factors are shown (Supplementary cally significantly stratified low-stage (stage 1 or stage 2) tumor Table 6, available online). patients (Figure 5H; Supplementary Figure 8H, available online). The results for independent cohorts are shown (Supplementary Figures 9–11, available online). Pairwise Correlation of the 16-lncRNA Signature in the Discovery Cohort To understand the regulatory roles of the lncRNA signature, we The 16-lncRNA Signature Predicts Patient Survival With calculated the SCC between the expression values of the 16 High Sensitivity and Specificity lncRNAs from the discovery cohort. We found that bad survival To confirm the prediction accuracy for EFS, we examined the lncRNAs and good survival lncRNAs were highly correlated ROC curve of lncRNA signature, age, MYCN status, risk status, within their respective groups (Figure 7A). We next investigated and stage. For better performance, lncRNA signature and age the FPKM-normalized RNA-seq expression of the 16 lncRNAs Downloaded from https://academic.oup.com/jncics/article-abstract/2/2/pky015/5026709 by Ed 'DeepDyve' Gillespie user on 21 June 2018 D. Sahu et al. |7of12 Figure 5. Survival estimates of event-free survival within the clinical risk factors subgroups from the discovery cohort. Kaplan-Meier plots of the favorable and unfavor- able risk score groups based on the median risk score threshold value from the discovery cohort. The patients were stratified according to (A and B) age, (C and D) risk status, (E and F) stage, (G) MYCN nonamplified, (H) stage 1 and 2. P values were obtained using a Mantel log-rank test (two-sided). lncRNA ¼ long noncoding RNA. across 16 normal human tissues obtained from the Illumina their relationship with the MYCN gene in the MYCN amplified Human Body Map project and observed that most of the and MYCN nonamplified conditions, we identified a positive lncRNAs were expressed abundantly in the brain, adrenal correlation between bad survival lncRNAs and MYCN in both glands, and lymph nodes (Figure 7B). Subsequently, to evaluate conditions (Figure 7C; Supplementary Figure 14, available Downloaded from https://academic.oup.com/jncics/article-abstract/2/2/pky015/5026709 by Ed 'DeepDyve' Gillespie user on 21 June 2018 8of 12 | JNCI J Natl Cancer Inst, 2018, Vol. 0, No. 0 Figure 6. Receiver operating characteristic (ROC) curve analysis of event-free survival prediction by the 16–long noncoding RNA (lncRNA) signature from the discovery cohort. ROC curve shows high sensitivity and specificity for predicting event-free survival. A) 16-lncRNA signature compared with all clinical risk factors. B) 16-lncRNA signature compared with individual lncRNAs. lncRNA ¼ long noncoding RNA; ROC ¼ receiver operating characteristic. online). Moreover, our ChIP-seq data analysis observed a MYCN risk score as a threshold from the discovery cohort. Kaplan-Meier binding site in the promoter of bad survival lncRNAs (SNHG1 analysis with Mantel log-rank test (two-sided) estimated the 16- and LINC00839) (58). lncRNA signature prognostic association for EFS and OS. Multivariable Cox analysis determined the independence of the lncRNA signature against the established clinical risk factors. Biological Functions of the 16-lncRNA Signature in NB There were platform and statistically significant clinical dif- ferences between the discovery and independent cohorts. LncRNAs have little or no protein-coding capacity; thus we applied Therefore, we calculated risk scores for each of the independent a guilt-by-association strategy to investigate the potential biologi- cohorts separately. To make the threshold selection indepen- cal functions of the lncRNA signature (59). We found biological dent of the cohort under investigation, we explored a novel functions related to translational initiation, establishment of pro- stepwise risk score threshold selection approach for stratifica- tein localization to the ER, and ribosome biogenesis enriched for tion of patients. The 16-lncRNA signature was validated as a bad survival lncRNAs. In contrast, biological functions related to statistically significant independent predictor for EFS in all the homophilic cell adhesion via plasma membrane molecules, den- independent cohorts. Additionally, the 16-lncRNA signature has drite morphogenesis, and adaptive immune response were the ability to discriminate patients into two risk score groups enriched for good survival lncRNAs (Figure 8A). The highest within the clinical risk factors subgroups. The results were also enriched biological function for the bad survival lncRNA, DANCR, reproduced in the independent cohorts. This important finding and the good survival lncRNA, CASC15, are shown (Figure 8,B suggests the clinical applicability of the 16-lncRNA signature to and C). The highest enriched biological functions for the rest of the identify patients who can benefit from appropriate treatments lncRNAs are shown (Supplementary Figure 15, available online). according to their risk of relapse. Data stratification for stage 1 and 2 tumors highlight the potential of the lncRNA signature to predict the risk of relapse for low-stage tumors. However, we Discussion were not able to validate this hypothesis in the independent Our study is the first to report (to our knowledge) the RNA-seq cohorts. For independent cohort 1, only one patient was in stage prognostic lncRNA signature in NB. Using the expression profiles 2b. This patient’s tumor relapsed, and they eventually died. For of a large sample of 493 patients from an RNA-seq cohort, we independent cohort 2, only 30 patients were in stage 1, and all identified 20 lncRNAs dysregulated in MYCN amplification and were censored. Thus, using the first quartile risk score threshold high-risk NB tumors. Identifying dysregulated lncRNAs from such for this cohort, no patients were stratified under the unfavorable a large high-throughput study increases the robustness and statis- risk score group. For independent cohort 3, we did not find a sta- tical power. However, only 16 lncRNAs were found to be in com- tistically significant difference in the two risk groups for stage 1 mon with independent cohorts. Application of a univariate Cox and 2 patients. The possible reasons for this were the limited model on this subset of lncRNAs identified their expression to sample size and the fact that most of the patients were censored. have a statistically significant association with patient EFS and OS Along with genomic amplification of the MYCN oncogene, from the discovery cohort. These 16 lncRNAs were integrated into genetic aberration, including chromosomal segmental aberra- a signature through a risk score formula built from their expres- tion or tumor DNA ploidy status, also contributes to advanced sion and respective survival contributions. Patients were divided disease stage and aggressive phenotype (1,2). We show that ex- into favorable and unfavorable risk score groups using the median pression of several of the 16 lncRNAs is differentially expressed Downloaded from https://academic.oup.com/jncics/article-abstract/2/2/pky015/5026709 by Ed 'DeepDyve' Gillespie user on 21 June 2018 D. Sahu et al. |9of12 Figure 7. Spearman correlation coefficient (SCC) analysis of the 16–long noncoding RNA (lncRNA) signature from the discovery cohort. A) SCC matrix shows correlations within bad survival lncRNAs and good survival lncRNAs. The color scale bar denotes correlation strength, with 1 indicating a positive correlation (red) and –1 indicating a negative correlation (blue). B) Heat map shows the fragments per kilobase of transcript per million mapped reads (FPKM) normalized expression value of 16 lncRNAs across 16 normal human tissues from the Illumina Body Map project. The expression value of each lncRNA was z-normalized and is shown with a green-purple color scale. C) The co-expression network of 16 lncRNAs and MYCN in MYCN-amplified neuroblastoma. Nodes represent lncRNA and MYCN coding gene, whereas edges rep- resent the SCC of expression profiles between lncRNAs and the MYCN coding gene. Red edges represent positive correlations, and blue edges represent negative corre- lations. Edge width is proportional to the strength of the correlation. Dashed edges indicate that the correlation between lncRNA and the MYCN coding gene is nonsignificant. FPKM ¼ fragments per kilobase of transcript per million mapped reads; lncRNA ¼ long noncoding RNA; SCC ¼ spearman correlation coefficient. in chromosomal segmental aberration (Supplementary Figures and have significant clinical differences. Therefore, the findings 16–18, available online) and has a prognostic impact in predict- have to be validated separately. Second, independent cohort 1 ing the outcome of patients in tumor DNA ploidy subgroups and independent cohort 2 represent overly sensitive cohorts (Supplementary Figures 19 and 20 and Supplementary Tables 7– confounded by patient age and tumor stage. Third, in 8, available online). independent cohort 2, stage was not included as a covariate in Studies reported that several of the 16 lncRNAs identified in multivariable analysis as no relapse was observed in patients in our study were likely to have roles in NB, as well as in other can- stage 1 or 3. In independent cohort 3, age was not included as a cers. The lncRNA MYCNOS interacts with CTCF at the promoter covariate in multivariable analysis because clinical information and enhances MYCN expression (60). The lncRNA DANCR is as- about patient age was not available for this cohort. Despite sociated with tumorigenesis and prognosis in hepatocellular these drawbacks, independent confirmation and similarity be- carcinoma (HCC) (61). The lncRNA SNHG16, mapped to chromo- tween findings from the discovery and independent cohorts some 17q, is an independent predictor for patient survival in NB provide a high level of confidence in the overall analysis. (57). The lncRNA FIRRE, mapped to chromosome X, is involved In conclusion, we developed a signature consisting of 16 in the dosage compensation process (62). The lncRNA lncRNAs whose expression is associated with high risk and is LINC01234 is associated with patient survival in breast cancer regulated by MYCN amplification in NB. In addition, the lncRNA (63). The lncRNA DBH-AS1 is induced by hepatitis B virus x pro- signature can be incorporated into different clinical platforms, tein and is involved in hepatitis B virus–mediated HCC (64). The including RNA-seq and microarray. Our previous study also val- lncRNA EPB41L4A-AS2 is downregulated and associated with idated the expression of some of the identified lncRNAs using poor patient survival in breast cancer (65). real-time quantitative polymerase chain reaction (48). The There are also limitations to our study. First, the NB cohorts expressions of the lncRNA signature have better ability for pre- included in our study were profiled from different platforms diction of clinical response compared with the risk, based on Downloaded from https://academic.oup.com/jncics/article-abstract/2/2/pky015/5026709 by Ed 'DeepDyve' Gillespie user on 21 June 2018 10 of 12 | JNCI J Natl Cancer Inst, 2018, Vol. 0, No. 0 Figure 8. Gene set enrichment analysis of the 16–long noncoding RNA (lncRNA) signature in neuroblastoma patients. A) Network shows overrepresented gene sets for the lncRNA signature. Red nodes represent bad survival lncRNA signature gene sets, and blue nodes represent good survival lncRNA signature gene sets. Node size is proportional to the normalized enrichment score. Biologically related gene sets tend to form clusters; these were manually identified and labeled with appropriate gene ontology terms. The network was generated using an enrichment map-cytoscape plug-in. B and C) Enrichment plot shows highest enriched function for the bad survival lncRNA (DANCR) and the good survival lncRNA (CASC15). FDR ¼ false discovery rate; lncRNA ¼ long noncoding RNA; NES ¼ normalized enrichment score. pathological and genetic markers. The lncRNA signature is inde- Technology (SYH), National Chiao Tung University, Hsinchu, pendent in predicting NB patient disease relapse. Thus, our Taiwan; Bioinformatics Program, Taiwan International results suggest that the 16-lncRNA prognostic signature may Graduate Program, Institute of Information Science, Academia have clinical application in NB. Sinica, Taipei, Taiwan (DS, SYH, HCH); Institute of Biomedical Informatics, Center for Systems and Synthetic Biology, National Yang-Ming University, Taipei, Taiwan (DS, HCH); Department of Funding Life Science, Institute of Molecular and Cellular Biology, Graduate Institute of Biomedical Electronics and Bioinformatics, This work was supported by the Ministry of Science and National Taiwan University, Taipei, Taiwan (HFJ). Technology (MOST 103-2320-B-010-031-MY3, MOST 104- The study funders had no role in the design of the study; the 2628-E-010-001-MY3, MOST 105-2320-B-002-057-MY3, and collection, analysis, or interpretation of the data; the writing of MOST 105-2634-E-002-002) and the National Health the manuscript; or the decision to submit the manuscript for Research Institutes (NHRI-EX106-10530PI). publication. DS and HCH conceived and designed the study. DS per- formed bioinformatics analyses. SYH helped with data analysis. Notes DS and HCH interpreted the results and wrote the manuscript. Affiliations of authors: Institute of Bioinformatics and Systems HFJ and HCH supervised the study. All authors read and ap- Biology (DS, SYH) and Department of Biological Science and proved the final manuscript. Downloaded from https://academic.oup.com/jncics/article-abstract/2/2/pky015/5026709 by Ed 'DeepDyve' Gillespie user on 21 June 2018 D. Sahu et al. |11of12 29. Gupta RA, Shah N, Wang KC, et al. Long non-coding RNA HOTAIR reprograms The authors declare no competing financial interests. chromatin state to promote cancer metastasis. Nature. 2010;464(7291):1071–1076. The authors wish to thank Chen-Ching Lin and Chia-Lang 30. Kim K, Jutooru I, Chadalapaka G, et al. HOTAIR is a negative prognostic factor Hsu for helpful discussions. and exhibits pro-oncogenic activity in pancreatic cancer. Oncogene. 2013; 32(13):1616–1625. 31. Kim HJ, Lee DW, Yim GW, et al. Long non-coding RNA HOTAIR is associated References with human cervical cancer progression. Int J Oncol. 2015;46(2):521–530. 32. Cai B, Wu Z, Liao K, et al. Long noncoding RNA HOTAIR can serve as a com- 1. Brodeur GM. Neuroblastoma: Biological insights into a clinical enigma. Nat mon molecular marker for lymph node metastasis: A meta-analysis. Tumour Rev Cancer. 2003;3(3):203–216. Biol. 2014;35(9):8445–8450. 2. Maris JM, Hogarty MD, Bagatell R, et al. Neuroblastoma. Lancet. 2007; 33. Iyer MK, Niknafs YS, Malik R, et al. The landscape of long noncoding RNAs in 369(9579):2106–2120. the human transcriptome. Nat Genet. 2015;47(3):199–208. 3. Maris JM. Recent advances in neuroblastoma. N Engl J Med. 2010;362(23): 34. Cabili MN, Trapnell C, Goff L, et al. Integrative annotation of human large 2202–2211. intergenic noncoding RNAs reveals global properties and specific subclasses. 4. Speleman F, Park JR, Henderson TO. Neuroblastoma: A tough nut to crack. Genes Dev. 2011;25(18):1915–1927. Am Soc Clin Oncol Educ Book. 2016;35:e548–e557. 35. Derrien T, Johnson R, Bussotti G, et al. The GENCODE v7 catalog of human 5. Howlader N, Noone AM, Krapcho M, et al. SEER Cancer Statistics Review, long noncoding RNAs: Analysis of their gene structure, evolution, and ex- 1975–2013. Bethesda, MD: National Cancer Institute; 2016. https://seer. pression. Genome Res. 2012;22(9):1775–1789. cancer.gov/csr/1975_2013/. Accessed September 28, 2017. 36. Guttman M, Amit I, Garber M, et al. Chromatin signature reveals over a thou- 6. London WB, Castleberry RP, Matthay KK, et al. Evidence for an age cutoff sand highly conserved large non-coding RNAs in mammals. Nature. 2009; greater than 365 days for neuroblastoma risk group stratification in the 458(7235):223–227. Children’s Oncology Group. J Clin Oncol. 2005;23(27):6459–6465. 37. Bussemakers MJ, van Bokhoven A, Verhaegh GW, et al. DD3: A new prostate- 7. Gurney JG, Ross JA, Wall DA, et al. Infant cancer in the U.S.: Histology-specific specific gene, highly overexpressed in prostate cancer. Cancer Res. 1999; incidence and trends, 1973 to 1992. J Pediatr Hematol Oncol. 1997;19(5):428–432. 59(23):5975–5979. 8. Seeger RC, Brodeur GM, Sather H, et al. Association of multiple copies of the 38. Hessels D, Klein Gunnewiek JM, van Oort I, et al. DD3(PCA3)-based molecular N-myc oncogene with rapid progression of neuroblastomas. N Engl J Med. urine analysis for the diagnosis of prostate cancer. Eur Urol. 2003;44(1):8–15; 1985;313(18):1111–1116. discussion 15–16. 9. Huang M, Weiss WA. Neuroblastoma and MYCN. Cold Spring Harb Perspect 39. Arita T, Ichikawa D, Konishi H, et al. Circulating long non-coding RNAs in Med. 2013;3(10):a014415. plasma of patients with gastric cancer. Anticancer Res. 2013;33(8):3185–3193. 10. Brodeur GM, Seeger RC, Schwab M, et al. Amplification of N-myc in untreated 40. Panzitt K, Tschernatsch MM, Guelly C, et al. Characterization of HULC, a human neuroblastomas correlates with advanced disease stage. Science. novel gene with striking up-regulation in hepatocellular carcinoma, as non- 1984;224(4653):1121–1124. coding RNA. Gastroenterology. 2007;132(1):330–342. 11. Brodeur GM, Pritchard J, Berthold F, et al. Revisions of the international crite- 41. Kornienko AE, Dotter CP, Guenzl PM, et al. Long non-coding RNAs display ria for neuroblastoma diagnosis, staging, and response to treatment. J Clin higher natural expression variation than protein-coding genes in healthy Oncol. 1993;11(8):1466–1477. humans. Genome Biol. 2016;17. 12. Maris JM, Matthay KK. Molecular biology of neuroblastoma. J Clin Oncol. 1999; 42. Kotake Y, Nakagawa T, Kitagawa K, et al. Long non-coding RNA ANRIL is re- 17(7):2264–2279. quired for the PRC2 recruitment to and silencing of p15(INK4B) tumor sup- 13. Brodeur GM, Maris JM. Neuroblastoma. In: Pizzo P, Poplack D, eds. Principles pressor gene. Oncogene. 2011;30(16):1956–1962. and Practice of Pediatric Oncology. Philadelphia: Lippincott Williams & Wilkins; 43. Gutschner T, Hammerle M, Eissmann M, et al. The noncoding RNA MALAT1 2002:895–937. is a critical regulator of the metastasis phenotype of lung cancer cells. Cancer 14. Vermeulen J, De Preter K, Naranjo A, et al. Predicting outcomes for children Res. 2013;73(3):1180–1189. with neuroblastoma using a multigene-expression signature: A retrospective 44. Taniue K, Kurimoto A, Sugimasa H, et al. Long noncoding RNA UPAT pro- SIOPEN/COG/GPOH study. Lancet Oncol. 2009;10(7):663–671. motes colon tumorigenesis by inhibiting degradation of UHRF1. Proc Natl 15. Asgharzadeh S, Pique-Regi R, Sposto R, et al. Prognostic significance of gene Acad Sci U S A. 2016;113(5):1273–1278. expression profiles of metastatic neuroblastomas lacking MYCN gene ampli- 45. Pandey GK, Kanduri C. Long noncoding RNAs and neuroblastoma. Oncotarget. fication. J Natl Cancer Inst. 2006;98(17):1193–1203. 2015;6(21):18265–18275. 16. Wei JS, Johansson P, Chen QR, et al. microRNA profiling identifies cancer- 46. Pandey GK, Mitra S, Subhash S, et al. The risk-associated long noncoding specific and prognostic signatures in pediatric malignancies. Clin Cancer Res. RNA NBAT-1 controls neuroblastoma progression by regulating cell prolifera- 2009;15(17):5560–5568. tion and neuronal differentiation. Cancer Cell. 2014;26(5):722–737. 17. De Preter K, Vermeulen J, Brors B, et al. Accurate outcome prediction in neu- 47. Liu PY, Erriquez D, Marshall GM, et al. Effects of a novel long noncoding RNA, roblastoma across independent data sets using a multigene signature. Clin lncUSMycN, on N-Myc expression and neuroblastoma progression. J Natl Cancer Res. 2010;16(5):1532–1541. Cancer Inst. 2014;106(7):dju113. 18. Schulte JH, Schowe B, Mestdagh P, et al. Accurate prediction of neuroblastoma 48. Sahu D, Hsu CL, Lin CC, et al. Co-expression analysis identifies long noncod- outcome based on miRNA expression profiles. Int J Cancer. 2010;127(10):2374–2385. ing RNA SNHG1 as a novel predictor for event-free survival in neuroblas- 19. De Preter K, Mestdagh P, Vermeulen J, et al. miRNA expression profiling ena- toma. Oncotarget. 2016;7(36):58022–58037. bles risk stratification in archived and fresh neuroblastoma tumor samples. 49. Su Z, Fang H, Hong H, et al. An investigation of biomarkers derived from legacy Clin Cancer Res. 2011;17(24):7684–7692. microarray data for their utility in the RNA-seq era. Genome Biol. 2014;15(12):523. 20. Abel F, Dalevi D, Nethander M, et al. A 6-gene signature identifies four molec- 50. Molenaar JJ, Domingo-Fernandez R, Ebus ME, et al. LIN28B induces neuroblas- ular subgroups of neuroblastoma. Cancer Cell Int. 2011;11:9. toma and enhances MYCN levels via let-7 suppression. Nat Genet. 2012;44(11): 21. Valentijn LJ, Koster J, Haneveld F, et al. Functional MYCN signature predicts 1199–1206. outcome of neuroblastoma irrespective of MYCN amplification. Proc Natl 51. Ritchie ME, Phipson B, Wu D, et al. limma powers differential expression Acad Sci U S A. 2012;109(47):19190–19195. analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015; 22. Formicola D, Petrosino G, Lasorsa VA, et al. An 18 gene expression-based 43(7):e47. score classifier predicts the clinical outcome in stage 4 neuroblastoma. J 52. Therneau T. A Package for Survival Analysis in S [computer program]. Version 2.38. Transl Med. 2016;14. 2015. https://CRAN.R-project.org/package¼survival. Accessed June 16, 2015. 23. Asgharzadeh S, Salo JA, Ji L, et al. Clinical significance of tumor-associated in- 53. Robin X, Turck N, Hainard A, et al. pROC: An open-source package for R and flammatory cells in metastatic neuroblastoma. J Clin Oncol. 2012;30(28): Sþ to analyze and compare ROC curves. BMC Bioinformatics. 2011;12:77. 3525–3532. 54. Subramanian A, Tamayo P, Mootha VK, et al. Gene set enrichment analysis: 24. Fardin P, Barla A, Mosci S, et al. A biology-driven approach identifies the hyp- A knowledge-based approach for interpreting genome-wide expression pro- oxia gene signature as a predictor of the outcome of neuroblastoma patients. files. Proc Natl Acad Sci U S A. 2005;102(43):15545–15550. Mol Cancer. 2010;9:185. 55. Isserlin R, Merico D, Voisin V, et al. Enrichment Map – a Cytoscape app to vi- 25. Shao J, Lu Z, Huang W, et al. A single center clinical analysis of children with sualize and explore OMICs pathway enrichment results. F1000Res. 2014;3. neuroblastoma. Oncol Lett. 2015;10(4):2311–2318. 56. Russell MR, Penikis A, Oldridge DA, et al. CASC15-S is a tumor suppressor 26. Svoboda M, Slyskova J, Schneiderova M, et al. HOTAIR long non-coding lncRNA at the 6p22 neuroblastoma susceptibility locus. Cancer Res. 2015; RNA is a negative prognostic factor not only in primary tumors, but also 75(15):3155–3166. in the blood of colorectal cancer patients. Carcinogenesis. 2014;35(7): 57. Yu M, Ohira M, Li Y, et al. High expression of ncRAN, a novel non-coding RNA 1510–1515. mapped to chromosome 17q25.1, is associated with poor prognosis in neuro- 27. Xu ZY, Yu QM, Du YA, et al. Knockdown of long non-coding RNA HOTAIR blastoma. Int J Oncol. 2009;34(4):931–938. suppresses tumor invasion and reverses epithelial-mesenchymal transition 58. Hsu CL, Chang HY, Chang JY, et al. Unveiling MYCN regulatory networks in in gastric cancer. Int J Biol Sci. 2013;9(6):587–597. neuroblastoma via integrative analysis of heterogeneous genomics data. 28. Li H, An J, Wu M, et al. LncRNA HOTAIR promotes human liver cancer stem Oncotarget. 2016;7(24):36293–36310. cell malignant growth through downregulation of SETD2. Oncotarget. 2015; 59. Rinn JL, Chang HY. Genome regulation by long noncoding RNAs. Annu Rev 6(29):27847–27864. Biochem. 2012;81:145–166. Downloaded from https://academic.oup.com/jncics/article-abstract/2/2/pky015/5026709 by Ed 'DeepDyve' Gillespie user on 21 June 2018 12 of 12 | JNCI J Natl Cancer Inst, 2018, Vol. 0, No. 0 60. Zhao X, Li D, Pu J, et al. CTCF cooperates with noncoding RNA MYCNOS to 63. Guo W, Wang Q, Zhan Y, et al. Transcriptome sequencing uncovers a three- promote neuroblastoma progression through facilitating MYCN expression. long noncoding RNA signature in predicting breast cancer survival. Sci Rep. Oncogene. 2016;35(27):3565–3576. 2016;6:27931. 61. Yuan SX, Wang J, Yang F, et al. Long noncoding RNA DANCR increases stem- 64. Huang J, Ren T, Cao S, et al. HBx-related long non-coding RNA DBH-AS1 pro- ness features of hepatocellular carcinoma by derepression of CTNNB1. motes cell proliferation and survival by activating MAPK signaling in hepato- Hepatology. 2016;63(2):499–511. cellular carcinoma. Oncotarget. 2015;6(32):33791–33804. 62. Yang F, Deng X, Ma W, et al. The lncRNA Firre anchors the inactive X chromo- 65. Xu S, Wang P, You Z, et al. The long non-coding RNA EPB41L4A-AS2 inhibits some to the nucleolus by binding CTCF and maintains H3K27me3 methyla- tumor proliferation and is associated with favorable prognoses in breast can- tion. Genome Biol. 2015;16:52. cer and other solid tumors. Oncotarget. 2016;7(15):20704–20717. Downloaded from https://academic.oup.com/jncics/article-abstract/2/2/pky015/5026709 by Ed 'DeepDyve' Gillespie user on 21 June 2018 http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png JNCI Cancer Spectrum Oxford University Press

High-risk, Expression-Based Prognostic Long Noncoding RNA Signature in Neuroblastoma

Free
12 pages

Loading next page...
 
/lp/ou_press/high-risk-expression-based-prognostic-long-noncoding-rna-signature-in-e1smuVCJDl
Publisher
Oxford University Press
Copyright
© The Author(s) 2018. Published by Oxford University Press.
eISSN
2515-5091
D.O.I.
10.1093/jncics/pky015
Publisher site
See Article on Publisher Site

Abstract

Background: Current clinical risk factors stratify patients with neuroblastoma (NB) for appropriate treatments, yet patients with similar clinical behaviors evoke variable responses. MYCN amplification is one of the established drivers of NB and, when combined with high-risk displays, worsens outcomes. Growing high-throughput transcriptomics studies suggest long noncoding RNA (lncRNA) dysregulation in cancers, including NB. However, expression-based lncRNA signatures are altered by MYCN amplification, which is associated with high-risk, and patient prognosis remains limited. Methods: We investigated RNA-seq-based expression profiles of lncRNAs in MYCN status and risk status in a discovery cohort (n ¼ 493) and validated them in three independent cohorts. In the discovery cohort, a prognostic association of lncRNAs was determined by univariate Cox regression and integrated into a signature using the risk score method. A novel risk score threshold selection criterion was developed to stratify patients into risk groups. Outcomes by risk group and clinical subgroup were assessed using Kaplan-Meier survival curves and multivariable Cox regression. The performance of lncRNA signatures was evaluated by receiver operating characteristic curve. All statistical tests were two-sided. Results: In the discovery cohort, 16 lncRNAs that were differentially expressed (fold change  2 and adjusted P  0.01) integrated into a prognostic signature. A high risk score group of lncRNA signature had poor event-free survival (EFS; P < 1E-16). Notably, lncRNA signature was independent of other clinical risk factors when predicting EFS (hazard ratio ¼ 3.21, P ¼ 5.95E–07). The findings were confirmed in independent cohorts (P ¼ 2.86E-02, P ¼ 6.18E-03, P ¼ 9.39E-03, respectively). Finally, the lncRNA signature had higher accuracy for EFS prediction (area under the curve ¼ 0.788, 95% confidence interval ¼ 0.746 to 0.831). Conclusions: Here, we report the first (to our knowledge) RNA-seq 16-lncRNA prognostic signature for NB that may contribute to precise clinical stratification and EFS prediction. Neuroblastoma (NB) is the most common childhood cancer of approximately 25% of NB cases and correlates with the high- undifferentiated sympathetic neuroblasts, accounting for ap- risk tumor subtype (9,10). proximately 15% of deaths in children worldwide (1–3). Currently, the clinical risk factors of NB, including patient According to the Surveillance, Epidemiology, and End Results age at diagnosis, MYCN status, tumor stage, chromosomal (SEER) Cancer Statistics report, every year, more than 650 cases aberrations, and tumor histology are still in use for risk assess- are diagnosed in North America (4,5), with an incidence rate of ment and determination of appropriate treatments (11–13). about 10.54 cases per million per year in children younger than Nevertheless, risk assessments based on these risk factors have 15 years (6,7). The clinical hallmark of NB is its tumor heteroge- limited success, as patients with similar clinical behaviors neity, represented by disparate clinical behaviors (1,2). MYCN evoke variable responses (14). Identifying tumor-specific molec- oncogene amplification is one of the established drivers of ular markers can provide better risk estimation and determina- NB, indicating worsened outcomes (8,9). It contributes to tion of effective protocols for treating patients at the time of Received: January 8, 2018; Revised: March 13, 2018; Accepted: March 29, 2018 © The Author(s) 2018. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/ licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com Downloaded from https://academic.oup.com/jncics/article-abstract/2/2/pky015/5026709 1of 12 by Ed 'DeepDyve' Gillespie user on 21 June 2018 2of 12 | JNCI J Natl Cancer Inst, 2018, Vol. 0, No. 0 diagnosis. Genomics and experimental studies conducted for Detection of Prognostic lncRNA Signature From the protein-coding genes (PCGs) or microRNAs can discriminate Discovery Cohort between patients with an unfavorable or favorable outcome Univariate Cox proportional hazard regression was applied to (14–24). However, the five-year survival rate of event-free sur- examine the association between 16 differentially expressed vival (EFS) in the high-risk tumor subtype is still approximately lncRNAs and patient EFS and overall survival (OS). LncRNAs sta- 50% (25). With advancements in RNA-sequencing (RNA-seq) tistically significantly associated (P < 0.001) with patient EFS technology, there have been numerous efforts to correlate the were integrated into a signature using a risk score formula. The expression of long noncoding RNA (lncRNA) with tumor progno- risk score for each patient was calculated by a linear combination sis (26–32). of expression and univariate coefficient of lncRNAs as follows: LncRNAs (longer than 200 nucleotides) are major noncoding transcriptomes (33), transcribed by RNA polymerase II, and ex- hibit both low and tissue-specific expression (34–37). LncRNAs Risk score ¼ W  exp ; ij j¼1 are highly stable and easily detected in various body fluids, including urine (38), plasma (39), and blood (40), urging noninva- where W is the univariate coefficient for lncRNA j, Exp is the j ij sive diagnosis. They exhibit significant interindividual expres- expression value of lncRNA j in patient i, and n is the number of sion variations in the same cell type compared with PCGs (41). testing lncRNAs. Herein, n is 16. LncRNAs have also been reported as a tumor suppressor or on- cogene in several cancers, including NB (29,42–47). Our group identified that the lncRNA SNHG1 is regulated by N-MYC and in- Stepwise Risk Score Threshold Selection dependently predicts patient outcomes for EFS in NB (48). LncRNA risk scores were arranged in increasing order and di- However, identifying an expression-based lncRNA signature that is altered by MYCN amplification is associated with high- vided into quartiles. Next, the patient was classified into a fa- vorable or unfavorable risk score group using the first risk, and patient prognosis for EFS is largely unknown. Here, we analyzed RNA-seq expression profiles of lncRNAs in MYCN sta- quartile, median, or third quartile risk score cutoff. We then built Kaplan-Meier plots for quartiles. To select the risk score tus and risk status subtypes in NB. Using the risk score formula, we developed a 16-lncRNA signature that robustly discriminates threshold, we checked which quartile statistically signifi- cantly separated patients into two groups and had a survival patients at greater risk for relapse. Our results suggest that the lncRNA signature can serve as a potential prognostic biomarker probability of EFS of less than 50% in the unfavorable risk score group. The risk scores and the threshold for each co- for EFS in NB. hort were calculated separately. Methods Statistical Analysis for the Discovery Cohort NB Patient Data Sets Kaplan-Meier survival analysis (eg, favorable vs unfavorable NB expression data sets and corresponding clinical information risk score groups) with Mantel log-rank test was performed for the difference between survival curves. Multivariable Cox were downloaded from the Gene Expression Omnibus (GEO), the Genomic Data Commons (GDC), and the Therapeutically proportional hazard regression was performed to determine the prognostic independence of lncRNA risk scores and clini- Applicable Research to Generate Effective Treatments (TARGET) database. The following four cohorts were included in our cal risk factors. ROC curve and area under curve (AUC) analy- ses were used to evaluate the sensitivity and specificity of study. For training, the RNA-seq cohort with GEO accession number GSE62564 (49) was used and termed as the discovery co- the lncRNA risk score for EFS prediction. All statistical tests were two-sided. Survival analysis was performed using the hort. For validation, the RNA-seq cohort from GDC, microarray cohort from TARGET, and another with GEO accession number survival R package (52). TheAUC and confidenceinterval for the AUC were calculated using the pROC Rpackage (53). GSE16476 (50) were used and termed as independent cohorts 1–3, respectively. Details of data preprocessing and statistical analysis for inde- pendent cohorts are in the Supplementary Methods (available online). lncRNA Profiling in the Discovery Cohort The log2RPM normalized NB RNA-seq data set consisted of 498 Gene Set Enrichment Analysis patients, of which five with unknown MYCN status were re- moved. The clinical characteristics are shown (Supplementary Spearman correlation coefficients (SCC) were calculated Table 1, available online). To avoid negative log2 expression between the 16 lncRNAs and whole-genome PCGs (19 199 PCGs) values, the intensities were converted back to their original from the discovery cohort. To assess the function of each raw expression, increased by 1, then log2 transformed. PCG lncRNA, gene set enrichment analysis (GSEA v2.2.3, and lncRNA expression was extracted based on their RefSeq ID Broad Institute) (54) was performed using MSigDB annotation, which identified 34 255 and 6260 PCGs and (C5.bp.v5.2.symbols.gmt gene set collection, 4653 gene sets lncRNAs transcripts, respectively. LncRNAs were differentially available), with a ranked list as lncRNA-correlated PCGs and expressed (P  .01 and a fold-change  2) in MYCN status their corresponding SCCs, maximum gene set size of 5000, min- (MYCN amplified vs MYCN nonamplified), and risk status imum gene set size of 15, 1000 permutations, and weighted en- (high-risk vs low-risk) was identified using the limma Rpackage richment statistics. Over-represented gene sets (false discovery (51). In the case of multiple transcripts representing the same rate [FDR] q value ¼ 0.001, overlap coefficient value ¼ 0.5) were gene, high standard deviations were taken for further filtered and visualized using the Enrichment map-Cytoscape analyses. plug-in (55). Downloaded from https://academic.oup.com/jncics/article-abstract/2/2/pky015/5026709 by Ed 'DeepDyve' Gillespie user on 21 June 2018 D. Sahu et al. |3of12 Figure 1. Identification of a subgroup in the MYCN nonamplified tumor samples. A) The heat map shows expression values of the long noncoding RNAs (lncRNAs) dif- ferentially expressed in MYCN status (MYCN amplified vs MYCN nonamplified) samples and risk status (high-risk vs low-risk) samples. Each column indicates a pa- tient annotated according to their clinical information related to MYCN status, age, risk status, and event-free survival status. Each row represents lncRNAs ordered by average linkage hierarchical clustering. The expression value of each lncRNA was z-normalized and is shown with a gradient color scale. The unsupervised hierarchi- cal clustering identified three clusters, as demonstrated by boxes. B) Kaplan-Meier plot of event-free survival and overall survival of cluster 2 and cluster 3 low-risk neuroblastoma patients. The P values were obtained using a Mantel log-rank test (two-sided). C) Venn diagram shows overlapping lncRNAs of the discovery cohort, in- dependent cohort 1, independent cohort 2, and independent cohort 3. lncRNA ¼ long noncoding RNA; NB ¼ neuroblastoma. comparable number of high- and low-risk patients. Kaplan- Results Meier analysis showed that patients in cluster 1 and cluster 3 had poor EFS and OS (Supplementary Figure 3A, available on- Identification of lncRNA Signature From the Discovery line). Next, we extracted low-risk patients from cluster 2 and Cohort cluster 3 and found that patients in cluster 3 had poorer EFS (P We first performed a differential expression analysis of the dis- ¼ 2.81E-05) and OS (P ¼ 6.37E-07) than patients in cluster 2 covery cohort—a total of 493 patients—of which 92 were MYCN (Figure 1B). Moreover, we also checked the expression of amplified and 401 were MYCN nonamplified NB tumor sam- lncRNAs: SNHG1 and CASC15 were reported to be highly ples. We identified 90 lncRNAs to be differentially expressed expressed in high-risk and low-risk NB tumors, respectively (fold change  2 and adjusted P  .01) in the MYCN status con- (Supplementary Figure 3B, available online) (48,56). dition. We then performed a differential expression analysis of Furthermore, for our downstream analyses, we considered 16 risk status, of which 175 were high-risk and 318 were low-risk of the 20 lncRNAs because four lncRNAs were not detected in tumor samples. With the same filter criteria as described the independent cohorts (Figure 1C). above, we identified 35 lncRNAs that were differentially expressed. We retained only those lncRNAs that were also an- notated in Gencode v.24 (35). The 20 lncRNAs were shared be- Building the 16-lncRNA Signature Risk Score tween MYCN status, risk status, and Gencode v.24 (Supplementary Figures 1 and 2 and Supplementary Table 2, The 16 lncRNAs with univariate Cox analysis were statistically available online). The unsupervised clustering analysis significantly (P < 0.001) associated with EFS and OS revealed that there were three clusters identified (Figure 1A). (Figure 2A). The eight lncRNAs with a negative coefficient Patients in cluster 1 and cluster 2 belonged to high- and low- were defined as “good survival lncRNAs,” whose high expres- risk groups, respectively. However, cluster 3 contains a sion is associated with good survival. The remaining eight Downloaded from https://academic.oup.com/jncics/article-abstract/2/2/pky015/5026709 by Ed 'DeepDyve' Gillespie user on 21 June 2018 4of 12 | JNCI J Natl Cancer Inst, 2018, Vol. 0, No. 0 Figure 2. Univariate Cox regression and risk score analysis of prognostic long noncoding RNAs (lncRNAs) from the discovery cohort. A) Bar graph shows 16 prognostic lncRNAs ordered by their univariate z-score for event-free survival (EFS) and overall survival (OS). Positive scores are associated with shorter survival, and negative scores are associated with longer survival. Red and blue bars represent bad survival lncRNAs and good survival lncRNAs, respectively. The dashed line (colored in green) represents an absolute univariate z-score value of 61.96. B) Point plot of risk scores show risk score groups represented by color. Black represents favorable risk score group of patient samples, and red represents unfavorable risk score group of patient samples classified on median risk score of the discovery cohort. C) Waterfall plot of ordered risk scores shows disease relapse status of the patient. Red and gray bars represent patients with disease relapse and those who have not relapsed, re- spectively. D) Heat map shows the expression profile of the lncRNA signature. Each column indicates a patient in the favorable risk score group (black) and unfavorable risk score group (red). Each row represents lncRNAs associated with shorter survival (red) and longer survival (blue). The lncRNAs were ordered by hierarchical cluster- ing. The expression value of each lncRNA is scaled across rows and shown with a blue-red color scale. EFS ¼ event-free survival; lncRNA ¼ long noncoding RNA; OS ¼ overall survival. with a positive coefficient were defined as “bad survival two groups, with an EFS lower than 50% in the unfavorable risk lncRNAs,” whose high expression is associated with poor sur- score group (Supplementary Figure 5, available online). The 16- vival (Supplementary Table 3, available online). Permutation lncRNA signature risk scores range from –5.6 to 21.44 (median ¼ testing indicated that 16 lncRNAs had a more statistically signifi- 0.296) (Figure 2B). The clinical characteristics of patients based cant association with EFS prediction than expected by chance on risk groups from the discovery cohort are shown in (P < 1E-16) (Supplementary Methods and Supplementary Figure Supplementary Table 4 (available online). The waterfall plot 4, available online). A risk score was constructed with the regres- shows that most of the patients with a high risk score relapsed sion coefficient for EFS and was arranged in increasing order, (Figure 2C). The heat map (Figure 2D) shows that patients in the and then divided into quartiles. After creating Kaplan-Meier unfavorable risk score group tend to express bad survival plots for the quartiles, it was evident that the median risk score lncRNAs and patients in the favorable risk score group tend to threshold statistically significantly separated the patients into express good survival lncRNAs. Downloaded from https://academic.oup.com/jncics/article-abstract/2/2/pky015/5026709 by Ed 'DeepDyve' Gillespie user on 21 June 2018 D. Sahu et al. |5of12 Figure 3. Survival estimates of event-free survival in neuroblastoma patients. Kaplan-Meier plots of favorable and unfavorable risk score groups based on the (A) me- dian risk score of the discovery cohort, (B) first quartile risk score of the independent cohort 1, (C) first quartile risk score of the independent cohort 2, (D) median risk score of the independent cohort 3. The P values were obtained using a Mantel log-rank test (two-sided). lncRNA ¼ long noncoding RNA. Prognostic Association of 16-lncRNA Signature Risk Survival Prediction by the 16-lncRNA Signature Is Score With Patient Survival Independent of Clinical Risk Factors Compared with patients in the favorable risk score group, those To evaluate the prognostic independence of the 16-lncRNA with an unfavorable risk score had a poor EFS and OS (Figure 3A; signature against known clinical risk factors, multivariable Supplementary Figure 6A, available online). Only 39% of NB patients Cox analysis showed that in the discovery cohort, in the unfavorable risk score group were disease free at five years, lncRNA signature (HR ¼ 3.21, P ¼ 5.95E-07), MYCN status compared with 86% of patients in the favorable risk score group. (HR ¼ 1.41, P ¼ 4.39E-02), stage (HR ¼ 1.51, P ¼ 3.06E-02), and The OS probability at five years was 57% in the unfavorable risk age (HR ¼ 1.6, P ¼ 8.4E-03) were predicted EFS score group, compared with 99% in the favorable risk score group independently (Figure 4A). In independent cohort 1, only (Supplementary Table 5, available online). The 16-lncRNA signature lncRNA signature (HR ¼ 2.32, P ¼ 2.86E-02) was indepen- was tested in three independent cohorts for validation. Using the dently associated with EFS (Figure 4B). In independent co- same risk score formula and stepwise risk score threshold selection hort 2, lncRNA signature (HR ¼ 2.61, P ¼ 6.18E-03) and age criteria (Supplementary Figure 5, available online), the 16-lncRNA (HR ¼ 23.56, P ¼ 1.9E-03) were independently associated signature statistically significantly stratified patients into two risk with EFS (Figure 4C). In independent cohort 3, only score groups for EFS and OS (Figures 3,B–D; Supplementary Figure lncRNA signature (HR ¼ 3.91, P ¼ 9.39E-03) was indepen- 6, B–D, available online), respectively. The clinical characteristics of dently associated with EFS (Figure 4D). Similar results were patients based on risk groups from the independent cohorts are also obtained for OS (Supplementary Figure 7, available shown (Supplementary Table 4, available online). online). Downloaded from https://academic.oup.com/jncics/article-abstract/2/2/pky015/5026709 by Ed 'DeepDyve' Gillespie user on 21 June 2018 6of 12 | JNCI J Natl Cancer Inst, 2018, Vol. 0, No. 0 Figure 4. Multivariable Cox analysis of the 16-long noncoding RNA (lncRNA) signature for event-free survival in neuroblastoma patients of (A) discovery cohort, (B) in- dependent cohort 1, (C) independent cohort 2, and (D) independent cohort 3. Forest plot of the 16-lncRNA signature shows that patients in the unfavorable risk score groups had poor outcomes and an independent predictor of event-free survival after adjusting for the clinical risk factors. The hazard ratio and confidence interval for independent cohort 2 are represented on a log10 scale. All statistical tests were two-sided. CI ¼ confidence interval, HR ¼ hazard ratio; lncRNA ¼ long noncoding RNA. were considered continuous variables, and MYCN status, risk Survival Prediction by 16-lncRNA Signature Within status, and stage were considered categorical variables. The Clinical Risk Factor Subgroups results of the discovery and independent cohorts are shown The above analysis revealed that MYCN status, age, and stage (Figure 6A; Supplementary Figure 12, available online). were also statistically significantly associated with EFS. Moreover, studies have shown that several lncRNAs that were Therefore, in order to corroborate whether lncRNA signature identified in our study are prognostic biomarkers for survival in can stratify these risk factors into risk score groups, we first per- NB (48,56,57). ROC analysis showed higher prediction accuracy formed data stratification according to age, risk status, stage, of the 16-lncRNA signature against individual lncRNAs and MYCN status. Within each subgroup of risk factors, patients (Figure 6B). Furthermore, we compared the prediction accuracy were stratified into favorable and unfavorable risk score groups of the 16-lncRNA signature with other published NB prognostic using the median risk score of the discovery cohort. In all sub- gene signatures (14,22–24). We extracted their gene list, built the groups except the high-risk subgroup (n ¼ 175, P ¼ 0.518), the risk score from the discovery cohort, and calculated the AUC of 16-lncRNA signature statistically significantly stratified patients the ROC curve. As expected, the lncRNA signature showed simi- into two risk groups (Figure 5, A–G). The survival comparison for lar performance for EFS compared with other prognostic gene the MYCN amplified subgroup is not shown, as all the patients signatures (Supplementary Figure 13, available online). The were stratified under the unfavorable risk score group. Similar AUC and 95% confidence intervals of the AUC for the 16-lncRNA results were also obtained for OS (Supplementary Figure 8, A–G, signature (AUC ¼ 0.788, 95% CI ¼ 0.746 to 0.831), individual available online). In addition, the 16-lncRNA signature statisti- lncRNAs, and clinical risk factors are shown (Supplementary cally significantly stratified low-stage (stage 1 or stage 2) tumor Table 6, available online). patients (Figure 5H; Supplementary Figure 8H, available online). The results for independent cohorts are shown (Supplementary Figures 9–11, available online). Pairwise Correlation of the 16-lncRNA Signature in the Discovery Cohort To understand the regulatory roles of the lncRNA signature, we The 16-lncRNA Signature Predicts Patient Survival With calculated the SCC between the expression values of the 16 High Sensitivity and Specificity lncRNAs from the discovery cohort. We found that bad survival To confirm the prediction accuracy for EFS, we examined the lncRNAs and good survival lncRNAs were highly correlated ROC curve of lncRNA signature, age, MYCN status, risk status, within their respective groups (Figure 7A). We next investigated and stage. For better performance, lncRNA signature and age the FPKM-normalized RNA-seq expression of the 16 lncRNAs Downloaded from https://academic.oup.com/jncics/article-abstract/2/2/pky015/5026709 by Ed 'DeepDyve' Gillespie user on 21 June 2018 D. Sahu et al. |7of12 Figure 5. Survival estimates of event-free survival within the clinical risk factors subgroups from the discovery cohort. Kaplan-Meier plots of the favorable and unfavor- able risk score groups based on the median risk score threshold value from the discovery cohort. The patients were stratified according to (A and B) age, (C and D) risk status, (E and F) stage, (G) MYCN nonamplified, (H) stage 1 and 2. P values were obtained using a Mantel log-rank test (two-sided). lncRNA ¼ long noncoding RNA. across 16 normal human tissues obtained from the Illumina their relationship with the MYCN gene in the MYCN amplified Human Body Map project and observed that most of the and MYCN nonamplified conditions, we identified a positive lncRNAs were expressed abundantly in the brain, adrenal correlation between bad survival lncRNAs and MYCN in both glands, and lymph nodes (Figure 7B). Subsequently, to evaluate conditions (Figure 7C; Supplementary Figure 14, available Downloaded from https://academic.oup.com/jncics/article-abstract/2/2/pky015/5026709 by Ed 'DeepDyve' Gillespie user on 21 June 2018 8of 12 | JNCI J Natl Cancer Inst, 2018, Vol. 0, No. 0 Figure 6. Receiver operating characteristic (ROC) curve analysis of event-free survival prediction by the 16–long noncoding RNA (lncRNA) signature from the discovery cohort. ROC curve shows high sensitivity and specificity for predicting event-free survival. A) 16-lncRNA signature compared with all clinical risk factors. B) 16-lncRNA signature compared with individual lncRNAs. lncRNA ¼ long noncoding RNA; ROC ¼ receiver operating characteristic. online). Moreover, our ChIP-seq data analysis observed a MYCN risk score as a threshold from the discovery cohort. Kaplan-Meier binding site in the promoter of bad survival lncRNAs (SNHG1 analysis with Mantel log-rank test (two-sided) estimated the 16- and LINC00839) (58). lncRNA signature prognostic association for EFS and OS. Multivariable Cox analysis determined the independence of the lncRNA signature against the established clinical risk factors. Biological Functions of the 16-lncRNA Signature in NB There were platform and statistically significant clinical dif- ferences between the discovery and independent cohorts. LncRNAs have little or no protein-coding capacity; thus we applied Therefore, we calculated risk scores for each of the independent a guilt-by-association strategy to investigate the potential biologi- cohorts separately. To make the threshold selection indepen- cal functions of the lncRNA signature (59). We found biological dent of the cohort under investigation, we explored a novel functions related to translational initiation, establishment of pro- stepwise risk score threshold selection approach for stratifica- tein localization to the ER, and ribosome biogenesis enriched for tion of patients. The 16-lncRNA signature was validated as a bad survival lncRNAs. In contrast, biological functions related to statistically significant independent predictor for EFS in all the homophilic cell adhesion via plasma membrane molecules, den- independent cohorts. Additionally, the 16-lncRNA signature has drite morphogenesis, and adaptive immune response were the ability to discriminate patients into two risk score groups enriched for good survival lncRNAs (Figure 8A). The highest within the clinical risk factors subgroups. The results were also enriched biological function for the bad survival lncRNA, DANCR, reproduced in the independent cohorts. This important finding and the good survival lncRNA, CASC15, are shown (Figure 8,B suggests the clinical applicability of the 16-lncRNA signature to and C). The highest enriched biological functions for the rest of the identify patients who can benefit from appropriate treatments lncRNAs are shown (Supplementary Figure 15, available online). according to their risk of relapse. Data stratification for stage 1 and 2 tumors highlight the potential of the lncRNA signature to predict the risk of relapse for low-stage tumors. However, we Discussion were not able to validate this hypothesis in the independent Our study is the first to report (to our knowledge) the RNA-seq cohorts. For independent cohort 1, only one patient was in stage prognostic lncRNA signature in NB. Using the expression profiles 2b. This patient’s tumor relapsed, and they eventually died. For of a large sample of 493 patients from an RNA-seq cohort, we independent cohort 2, only 30 patients were in stage 1, and all identified 20 lncRNAs dysregulated in MYCN amplification and were censored. Thus, using the first quartile risk score threshold high-risk NB tumors. Identifying dysregulated lncRNAs from such for this cohort, no patients were stratified under the unfavorable a large high-throughput study increases the robustness and statis- risk score group. For independent cohort 3, we did not find a sta- tical power. However, only 16 lncRNAs were found to be in com- tistically significant difference in the two risk groups for stage 1 mon with independent cohorts. Application of a univariate Cox and 2 patients. The possible reasons for this were the limited model on this subset of lncRNAs identified their expression to sample size and the fact that most of the patients were censored. have a statistically significant association with patient EFS and OS Along with genomic amplification of the MYCN oncogene, from the discovery cohort. These 16 lncRNAs were integrated into genetic aberration, including chromosomal segmental aberra- a signature through a risk score formula built from their expres- tion or tumor DNA ploidy status, also contributes to advanced sion and respective survival contributions. Patients were divided disease stage and aggressive phenotype (1,2). We show that ex- into favorable and unfavorable risk score groups using the median pression of several of the 16 lncRNAs is differentially expressed Downloaded from https://academic.oup.com/jncics/article-abstract/2/2/pky015/5026709 by Ed 'DeepDyve' Gillespie user on 21 June 2018 D. Sahu et al. |9of12 Figure 7. Spearman correlation coefficient (SCC) analysis of the 16–long noncoding RNA (lncRNA) signature from the discovery cohort. A) SCC matrix shows correlations within bad survival lncRNAs and good survival lncRNAs. The color scale bar denotes correlation strength, with 1 indicating a positive correlation (red) and –1 indicating a negative correlation (blue). B) Heat map shows the fragments per kilobase of transcript per million mapped reads (FPKM) normalized expression value of 16 lncRNAs across 16 normal human tissues from the Illumina Body Map project. The expression value of each lncRNA was z-normalized and is shown with a green-purple color scale. C) The co-expression network of 16 lncRNAs and MYCN in MYCN-amplified neuroblastoma. Nodes represent lncRNA and MYCN coding gene, whereas edges rep- resent the SCC of expression profiles between lncRNAs and the MYCN coding gene. Red edges represent positive correlations, and blue edges represent negative corre- lations. Edge width is proportional to the strength of the correlation. Dashed edges indicate that the correlation between lncRNA and the MYCN coding gene is nonsignificant. FPKM ¼ fragments per kilobase of transcript per million mapped reads; lncRNA ¼ long noncoding RNA; SCC ¼ spearman correlation coefficient. in chromosomal segmental aberration (Supplementary Figures and have significant clinical differences. Therefore, the findings 16–18, available online) and has a prognostic impact in predict- have to be validated separately. Second, independent cohort 1 ing the outcome of patients in tumor DNA ploidy subgroups and independent cohort 2 represent overly sensitive cohorts (Supplementary Figures 19 and 20 and Supplementary Tables 7– confounded by patient age and tumor stage. Third, in 8, available online). independent cohort 2, stage was not included as a covariate in Studies reported that several of the 16 lncRNAs identified in multivariable analysis as no relapse was observed in patients in our study were likely to have roles in NB, as well as in other can- stage 1 or 3. In independent cohort 3, age was not included as a cers. The lncRNA MYCNOS interacts with CTCF at the promoter covariate in multivariable analysis because clinical information and enhances MYCN expression (60). The lncRNA DANCR is as- about patient age was not available for this cohort. Despite sociated with tumorigenesis and prognosis in hepatocellular these drawbacks, independent confirmation and similarity be- carcinoma (HCC) (61). The lncRNA SNHG16, mapped to chromo- tween findings from the discovery and independent cohorts some 17q, is an independent predictor for patient survival in NB provide a high level of confidence in the overall analysis. (57). The lncRNA FIRRE, mapped to chromosome X, is involved In conclusion, we developed a signature consisting of 16 in the dosage compensation process (62). The lncRNA lncRNAs whose expression is associated with high risk and is LINC01234 is associated with patient survival in breast cancer regulated by MYCN amplification in NB. In addition, the lncRNA (63). The lncRNA DBH-AS1 is induced by hepatitis B virus x pro- signature can be incorporated into different clinical platforms, tein and is involved in hepatitis B virus–mediated HCC (64). The including RNA-seq and microarray. Our previous study also val- lncRNA EPB41L4A-AS2 is downregulated and associated with idated the expression of some of the identified lncRNAs using poor patient survival in breast cancer (65). real-time quantitative polymerase chain reaction (48). The There are also limitations to our study. First, the NB cohorts expressions of the lncRNA signature have better ability for pre- included in our study were profiled from different platforms diction of clinical response compared with the risk, based on Downloaded from https://academic.oup.com/jncics/article-abstract/2/2/pky015/5026709 by Ed 'DeepDyve' Gillespie user on 21 June 2018 10 of 12 | JNCI J Natl Cancer Inst, 2018, Vol. 0, No. 0 Figure 8. Gene set enrichment analysis of the 16–long noncoding RNA (lncRNA) signature in neuroblastoma patients. A) Network shows overrepresented gene sets for the lncRNA signature. Red nodes represent bad survival lncRNA signature gene sets, and blue nodes represent good survival lncRNA signature gene sets. Node size is proportional to the normalized enrichment score. Biologically related gene sets tend to form clusters; these were manually identified and labeled with appropriate gene ontology terms. The network was generated using an enrichment map-cytoscape plug-in. B and C) Enrichment plot shows highest enriched function for the bad survival lncRNA (DANCR) and the good survival lncRNA (CASC15). FDR ¼ false discovery rate; lncRNA ¼ long noncoding RNA; NES ¼ normalized enrichment score. pathological and genetic markers. The lncRNA signature is inde- Technology (SYH), National Chiao Tung University, Hsinchu, pendent in predicting NB patient disease relapse. Thus, our Taiwan; Bioinformatics Program, Taiwan International results suggest that the 16-lncRNA prognostic signature may Graduate Program, Institute of Information Science, Academia have clinical application in NB. Sinica, Taipei, Taiwan (DS, SYH, HCH); Institute of Biomedical Informatics, Center for Systems and Synthetic Biology, National Yang-Ming University, Taipei, Taiwan (DS, HCH); Department of Funding Life Science, Institute of Molecular and Cellular Biology, Graduate Institute of Biomedical Electronics and Bioinformatics, This work was supported by the Ministry of Science and National Taiwan University, Taipei, Taiwan (HFJ). Technology (MOST 103-2320-B-010-031-MY3, MOST 104- The study funders had no role in the design of the study; the 2628-E-010-001-MY3, MOST 105-2320-B-002-057-MY3, and collection, analysis, or interpretation of the data; the writing of MOST 105-2634-E-002-002) and the National Health the manuscript; or the decision to submit the manuscript for Research Institutes (NHRI-EX106-10530PI). publication. DS and HCH conceived and designed the study. DS per- formed bioinformatics analyses. SYH helped with data analysis. Notes DS and HCH interpreted the results and wrote the manuscript. Affiliations of authors: Institute of Bioinformatics and Systems HFJ and HCH supervised the study. All authors read and ap- Biology (DS, SYH) and Department of Biological Science and proved the final manuscript. Downloaded from https://academic.oup.com/jncics/article-abstract/2/2/pky015/5026709 by Ed 'DeepDyve' Gillespie user on 21 June 2018 D. Sahu et al. |11of12 29. Gupta RA, Shah N, Wang KC, et al. Long non-coding RNA HOTAIR reprograms The authors declare no competing financial interests. chromatin state to promote cancer metastasis. Nature. 2010;464(7291):1071–1076. The authors wish to thank Chen-Ching Lin and Chia-Lang 30. Kim K, Jutooru I, Chadalapaka G, et al. HOTAIR is a negative prognostic factor Hsu for helpful discussions. and exhibits pro-oncogenic activity in pancreatic cancer. Oncogene. 2013; 32(13):1616–1625. 31. Kim HJ, Lee DW, Yim GW, et al. Long non-coding RNA HOTAIR is associated References with human cervical cancer progression. Int J Oncol. 2015;46(2):521–530. 32. Cai B, Wu Z, Liao K, et al. Long noncoding RNA HOTAIR can serve as a com- 1. Brodeur GM. Neuroblastoma: Biological insights into a clinical enigma. Nat mon molecular marker for lymph node metastasis: A meta-analysis. Tumour Rev Cancer. 2003;3(3):203–216. Biol. 2014;35(9):8445–8450. 2. Maris JM, Hogarty MD, Bagatell R, et al. Neuroblastoma. Lancet. 2007; 33. Iyer MK, Niknafs YS, Malik R, et al. The landscape of long noncoding RNAs in 369(9579):2106–2120. the human transcriptome. Nat Genet. 2015;47(3):199–208. 3. Maris JM. Recent advances in neuroblastoma. N Engl J Med. 2010;362(23): 34. Cabili MN, Trapnell C, Goff L, et al. Integrative annotation of human large 2202–2211. intergenic noncoding RNAs reveals global properties and specific subclasses. 4. Speleman F, Park JR, Henderson TO. Neuroblastoma: A tough nut to crack. Genes Dev. 2011;25(18):1915–1927. Am Soc Clin Oncol Educ Book. 2016;35:e548–e557. 35. Derrien T, Johnson R, Bussotti G, et al. The GENCODE v7 catalog of human 5. Howlader N, Noone AM, Krapcho M, et al. SEER Cancer Statistics Review, long noncoding RNAs: Analysis of their gene structure, evolution, and ex- 1975–2013. Bethesda, MD: National Cancer Institute; 2016. https://seer. pression. Genome Res. 2012;22(9):1775–1789. cancer.gov/csr/1975_2013/. Accessed September 28, 2017. 36. Guttman M, Amit I, Garber M, et al. Chromatin signature reveals over a thou- 6. London WB, Castleberry RP, Matthay KK, et al. Evidence for an age cutoff sand highly conserved large non-coding RNAs in mammals. Nature. 2009; greater than 365 days for neuroblastoma risk group stratification in the 458(7235):223–227. Children’s Oncology Group. J Clin Oncol. 2005;23(27):6459–6465. 37. Bussemakers MJ, van Bokhoven A, Verhaegh GW, et al. DD3: A new prostate- 7. Gurney JG, Ross JA, Wall DA, et al. Infant cancer in the U.S.: Histology-specific specific gene, highly overexpressed in prostate cancer. Cancer Res. 1999; incidence and trends, 1973 to 1992. J Pediatr Hematol Oncol. 1997;19(5):428–432. 59(23):5975–5979. 8. Seeger RC, Brodeur GM, Sather H, et al. Association of multiple copies of the 38. Hessels D, Klein Gunnewiek JM, van Oort I, et al. DD3(PCA3)-based molecular N-myc oncogene with rapid progression of neuroblastomas. N Engl J Med. urine analysis for the diagnosis of prostate cancer. Eur Urol. 2003;44(1):8–15; 1985;313(18):1111–1116. discussion 15–16. 9. Huang M, Weiss WA. Neuroblastoma and MYCN. Cold Spring Harb Perspect 39. Arita T, Ichikawa D, Konishi H, et al. Circulating long non-coding RNAs in Med. 2013;3(10):a014415. plasma of patients with gastric cancer. Anticancer Res. 2013;33(8):3185–3193. 10. Brodeur GM, Seeger RC, Schwab M, et al. Amplification of N-myc in untreated 40. Panzitt K, Tschernatsch MM, Guelly C, et al. Characterization of HULC, a human neuroblastomas correlates with advanced disease stage. Science. novel gene with striking up-regulation in hepatocellular carcinoma, as non- 1984;224(4653):1121–1124. coding RNA. Gastroenterology. 2007;132(1):330–342. 11. Brodeur GM, Pritchard J, Berthold F, et al. Revisions of the international crite- 41. Kornienko AE, Dotter CP, Guenzl PM, et al. Long non-coding RNAs display ria for neuroblastoma diagnosis, staging, and response to treatment. J Clin higher natural expression variation than protein-coding genes in healthy Oncol. 1993;11(8):1466–1477. humans. Genome Biol. 2016;17. 12. Maris JM, Matthay KK. Molecular biology of neuroblastoma. J Clin Oncol. 1999; 42. Kotake Y, Nakagawa T, Kitagawa K, et al. Long non-coding RNA ANRIL is re- 17(7):2264–2279. quired for the PRC2 recruitment to and silencing of p15(INK4B) tumor sup- 13. Brodeur GM, Maris JM. Neuroblastoma. In: Pizzo P, Poplack D, eds. Principles pressor gene. Oncogene. 2011;30(16):1956–1962. and Practice of Pediatric Oncology. Philadelphia: Lippincott Williams & Wilkins; 43. Gutschner T, Hammerle M, Eissmann M, et al. The noncoding RNA MALAT1 2002:895–937. is a critical regulator of the metastasis phenotype of lung cancer cells. Cancer 14. Vermeulen J, De Preter K, Naranjo A, et al. Predicting outcomes for children Res. 2013;73(3):1180–1189. with neuroblastoma using a multigene-expression signature: A retrospective 44. Taniue K, Kurimoto A, Sugimasa H, et al. Long noncoding RNA UPAT pro- SIOPEN/COG/GPOH study. Lancet Oncol. 2009;10(7):663–671. motes colon tumorigenesis by inhibiting degradation of UHRF1. Proc Natl 15. Asgharzadeh S, Pique-Regi R, Sposto R, et al. Prognostic significance of gene Acad Sci U S A. 2016;113(5):1273–1278. expression profiles of metastatic neuroblastomas lacking MYCN gene ampli- 45. Pandey GK, Kanduri C. Long noncoding RNAs and neuroblastoma. Oncotarget. fication. J Natl Cancer Inst. 2006;98(17):1193–1203. 2015;6(21):18265–18275. 16. Wei JS, Johansson P, Chen QR, et al. microRNA profiling identifies cancer- 46. Pandey GK, Mitra S, Subhash S, et al. The risk-associated long noncoding specific and prognostic signatures in pediatric malignancies. Clin Cancer Res. RNA NBAT-1 controls neuroblastoma progression by regulating cell prolifera- 2009;15(17):5560–5568. tion and neuronal differentiation. Cancer Cell. 2014;26(5):722–737. 17. De Preter K, Vermeulen J, Brors B, et al. Accurate outcome prediction in neu- 47. Liu PY, Erriquez D, Marshall GM, et al. Effects of a novel long noncoding RNA, roblastoma across independent data sets using a multigene signature. Clin lncUSMycN, on N-Myc expression and neuroblastoma progression. J Natl Cancer Res. 2010;16(5):1532–1541. Cancer Inst. 2014;106(7):dju113. 18. Schulte JH, Schowe B, Mestdagh P, et al. Accurate prediction of neuroblastoma 48. Sahu D, Hsu CL, Lin CC, et al. Co-expression analysis identifies long noncod- outcome based on miRNA expression profiles. Int J Cancer. 2010;127(10):2374–2385. ing RNA SNHG1 as a novel predictor for event-free survival in neuroblas- 19. De Preter K, Mestdagh P, Vermeulen J, et al. miRNA expression profiling ena- toma. Oncotarget. 2016;7(36):58022–58037. bles risk stratification in archived and fresh neuroblastoma tumor samples. 49. Su Z, Fang H, Hong H, et al. An investigation of biomarkers derived from legacy Clin Cancer Res. 2011;17(24):7684–7692. microarray data for their utility in the RNA-seq era. Genome Biol. 2014;15(12):523. 20. Abel F, Dalevi D, Nethander M, et al. A 6-gene signature identifies four molec- 50. Molenaar JJ, Domingo-Fernandez R, Ebus ME, et al. LIN28B induces neuroblas- ular subgroups of neuroblastoma. Cancer Cell Int. 2011;11:9. toma and enhances MYCN levels via let-7 suppression. Nat Genet. 2012;44(11): 21. Valentijn LJ, Koster J, Haneveld F, et al. Functional MYCN signature predicts 1199–1206. outcome of neuroblastoma irrespective of MYCN amplification. Proc Natl 51. Ritchie ME, Phipson B, Wu D, et al. limma powers differential expression Acad Sci U S A. 2012;109(47):19190–19195. analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015; 22. Formicola D, Petrosino G, Lasorsa VA, et al. An 18 gene expression-based 43(7):e47. score classifier predicts the clinical outcome in stage 4 neuroblastoma. J 52. Therneau T. A Package for Survival Analysis in S [computer program]. Version 2.38. Transl Med. 2016;14. 2015. https://CRAN.R-project.org/package¼survival. Accessed June 16, 2015. 23. Asgharzadeh S, Salo JA, Ji L, et al. Clinical significance of tumor-associated in- 53. Robin X, Turck N, Hainard A, et al. pROC: An open-source package for R and flammatory cells in metastatic neuroblastoma. J Clin Oncol. 2012;30(28): Sþ to analyze and compare ROC curves. BMC Bioinformatics. 2011;12:77. 3525–3532. 54. Subramanian A, Tamayo P, Mootha VK, et al. Gene set enrichment analysis: 24. Fardin P, Barla A, Mosci S, et al. A biology-driven approach identifies the hyp- A knowledge-based approach for interpreting genome-wide expression pro- oxia gene signature as a predictor of the outcome of neuroblastoma patients. files. Proc Natl Acad Sci U S A. 2005;102(43):15545–15550. Mol Cancer. 2010;9:185. 55. Isserlin R, Merico D, Voisin V, et al. Enrichment Map – a Cytoscape app to vi- 25. Shao J, Lu Z, Huang W, et al. A single center clinical analysis of children with sualize and explore OMICs pathway enrichment results. F1000Res. 2014;3. neuroblastoma. Oncol Lett. 2015;10(4):2311–2318. 56. Russell MR, Penikis A, Oldridge DA, et al. CASC15-S is a tumor suppressor 26. Svoboda M, Slyskova J, Schneiderova M, et al. HOTAIR long non-coding lncRNA at the 6p22 neuroblastoma susceptibility locus. Cancer Res. 2015; RNA is a negative prognostic factor not only in primary tumors, but also 75(15):3155–3166. in the blood of colorectal cancer patients. Carcinogenesis. 2014;35(7): 57. Yu M, Ohira M, Li Y, et al. High expression of ncRAN, a novel non-coding RNA 1510–1515. mapped to chromosome 17q25.1, is associated with poor prognosis in neuro- 27. Xu ZY, Yu QM, Du YA, et al. Knockdown of long non-coding RNA HOTAIR blastoma. Int J Oncol. 2009;34(4):931–938. suppresses tumor invasion and reverses epithelial-mesenchymal transition 58. Hsu CL, Chang HY, Chang JY, et al. Unveiling MYCN regulatory networks in in gastric cancer. Int J Biol Sci. 2013;9(6):587–597. neuroblastoma via integrative analysis of heterogeneous genomics data. 28. Li H, An J, Wu M, et al. LncRNA HOTAIR promotes human liver cancer stem Oncotarget. 2016;7(24):36293–36310. cell malignant growth through downregulation of SETD2. Oncotarget. 2015; 59. Rinn JL, Chang HY. Genome regulation by long noncoding RNAs. Annu Rev 6(29):27847–27864. Biochem. 2012;81:145–166. Downloaded from https://academic.oup.com/jncics/article-abstract/2/2/pky015/5026709 by Ed 'DeepDyve' Gillespie user on 21 June 2018 12 of 12 | JNCI J Natl Cancer Inst, 2018, Vol. 0, No. 0 60. Zhao X, Li D, Pu J, et al. CTCF cooperates with noncoding RNA MYCNOS to 63. Guo W, Wang Q, Zhan Y, et al. Transcriptome sequencing uncovers a three- promote neuroblastoma progression through facilitating MYCN expression. long noncoding RNA signature in predicting breast cancer survival. Sci Rep. Oncogene. 2016;35(27):3565–3576. 2016;6:27931. 61. Yuan SX, Wang J, Yang F, et al. Long noncoding RNA DANCR increases stem- 64. Huang J, Ren T, Cao S, et al. HBx-related long non-coding RNA DBH-AS1 pro- ness features of hepatocellular carcinoma by derepression of CTNNB1. motes cell proliferation and survival by activating MAPK signaling in hepato- Hepatology. 2016;63(2):499–511. cellular carcinoma. Oncotarget. 2015;6(32):33791–33804. 62. Yang F, Deng X, Ma W, et al. The lncRNA Firre anchors the inactive X chromo- 65. Xu S, Wang P, You Z, et al. The long non-coding RNA EPB41L4A-AS2 inhibits some to the nucleolus by binding CTCF and maintains H3K27me3 methyla- tumor proliferation and is associated with favorable prognoses in breast can- tion. Genome Biol. 2015;16:52. cer and other solid tumors. Oncotarget. 2016;7(15):20704–20717. Downloaded from https://academic.oup.com/jncics/article-abstract/2/2/pky015/5026709 by Ed 'DeepDyve' Gillespie user on 21 June 2018

Journal

JNCI Cancer SpectrumOxford University Press

Published: Jun 1, 2018

There are no references for this article.

You’re reading a free preview. Subscribe to read the entire article.


DeepDyve is your
personal research library

It’s your single place to instantly
discover and read the research
that matters to you.

Enjoy affordable access to
over 18 million articles from more than
15,000 peer-reviewed journals.

All for just $49/month

Explore the DeepDyve Library

Search

Query the DeepDyve database, plus search all of PubMed and Google Scholar seamlessly

Organize

Save any article or search result from DeepDyve, PubMed, and Google Scholar... all in one place.

Access

Get unlimited, online access to over 18 million full-text articles from more than 15,000 scientific journals.

Your journals are on DeepDyve

Read from thousands of the leading scholarly journals from SpringerNature, Elsevier, Wiley-Blackwell, Oxford University Press and more.

All the latest content is available, no embargo periods.

See the journals in your area

DeepDyve

Freelancer

DeepDyve

Pro

Price

FREE

$49/month
$360/year

Save searches from
Google Scholar,
PubMed

Create lists to
organize your research

Export lists, citations

Read DeepDyve articles

Abstract access only

Unlimited access to over
18 million full-text articles

Print

20 pages / month

PDF Discount

20% off