Whole genome exon arrays identify differential expression of alternatively spliced, cancer-related genes in lung cancer

Published online 16 October 2008
Nucleic Acids Research, 2008, Vol. 36, No. 20 6535–6547
doi:10.1093/nar/gkn697

Liqiang Xi (1), Andrew Feber (1), Vanita Gupta (1), Maoxin Wu (1), Andrew D. Bergemann (1), Rodney J. Landreneau (3), Virginia R. Litle (2), Arjun Pennathur (3), James D. Luketich (3) and Tony E. Godfrey (1,2,*)

(1) Department of Pathology, (2) Department of Cardiothoracic Surgery, Mount Sinai School of Medicine, New York, NY 10029 and (3) Heart, Lung and Esophageal Surgery Institute, and Pittsburgh Cancer Institute, University of Pittsburgh, Pittsburgh, PA 15260, USA

Received August 25, 2008; Revised September 24, 2008; Accepted September 25, 2008

*To whom correspondence should be addressed. Tel: 585 273 3112; Fax: 585 276 2576; Email: tony_godfrey@urmc.rochester.edu

© 2008 The Author(s). This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

ABSTRACT

Alternative processing of pre-mRNA transcripts is a major source of protein diversity in eukaryotes and has been implicated in several disease processes including cancer. In this study we have performed a genome wide analysis of alternative splicing events in lung adenocarcinoma. We found that 2369 of the 17 800 core RefSeq genes appear to have alternative transcripts that are differentially expressed in lung adenocarcinoma versus normal. According to their known functions the largest subset of these genes (30.8%) is believed to be cancer related. Detailed analysis was performed for several genes using PCR, quantitative RT-PCR and DNA sequencing. We found overexpression of ERG variant 2 but not variant 1 in lung tumors and overexpression of CEACAM1 variant 1 but not variant 2 in lung tumors but not in breast or colon tumors. We also identified a novel, overexpressed variant of CDH3 and verified the existence and overexpression of a novel variant of P16 transcribed from the CDKN2A locus. These findings demonstrate how analysis of alternative pre-mRNA processing can shed additional light on differences between tumors and normal tissues as well as between different tumor types. Such studies may lead to the development of additional tools for tumor diagnosis, prognosis and therapy.

INTRODUCTION

Non-small cell lung cancer (NSCLC) is the most common cause of cancer-related death in the USA (1). While early detection may improve outcomes, overall 5-year survival rates for NSCLC are currently only 16% (2). New molecular diagnostic tests and novel therapeutic strategies are needed for this terrible disease. NSCLC is one of the most studied tumor types in the scientific literature, as demonstrated by the number of excellent studies on global gene expression (3–6) and genome-wide DNA copy number changes (7,8) that have been conducted in NSCLC. These studies have enhanced our knowledge of lung cancer biology, led to proposals for multicenter trials of primary tumor gene expression for prognosis and treatment and may identify avenues for novel therapeutic development. A promising area that remains relatively unexplored, however, is alternative splicing (AS) of mRNA to produce functionally different proteins. Such studies may lead to improved diagnostic and prognostic tools and may identify additional therapeutic targets for NSCLC.

Alternative splicing of pre-mRNA is an important process in normal metazoan development (9,10). Furthermore, recent bioinformatics analysis suggests that 65% of human genes are alternatively spliced (11–14); a large increase over prior estimates as low as 5% (15). AS is not only involved in normal development, but is also associated with human diseases including cancer (16–27). For some genes, alternative transcripts are differentially expressed between tumor and normal tissue and in a few cases, the expression of AS variants has been associated with tumor progression (28–32). However, most studies of AS in human disease have used a targeted approach and focused on individual genes. There is a great deal of potential for novel discovery from genome-wide studies of alternative splicing. Until recently such large scale studies have been a considerable technical and bioinformatic challenge but the introduction of new technology and powerful data analysis software now makes them more feasible.

In this study we have used the GeneChip Human Exon 1.0 ST Array from Affymetrix to explore genome-wide AS events in the most predominant histologic type of NSCLC, lung adenocarcinoma. The study was designed to identify cancer-associated alternative splicing events, verify splice variants and to validate differential expression of selected splice variants in independent tissue sets. Our results demonstrate that a large number of known genes, including well known oncogenes and tumor suppressors, are alternatively spliced and differentially expressed between normal lung and lung adenocarcinoma. These findings may provide a new resource for diagnosis and treatment of NSCLC.

MATERIALS AND METHODS

Specimens and RNA isolation

Snap frozen lung tissue specimens were obtained from tissue banks at the Heart, Lung and Esophageal Surgery Institute, University of Pittsburgh Medical Center. This study involving human tissue was approved by the Institutional Review Boards from both the University of Pittsburgh and Mount Sinai School of Medicine. In total, 36 pairs of lung adenocarcinoma and adjacent normal lung tissue plus 43 additional adenocarcinoma and squamous cell carcinoma specimens were analyzed (see clinical information for all patients in Supplementary Table S1). All tumor specimens were determined to comprise >70% tumor and adjacent normal specimens contained no histologically evident tumor or contaminating tissues. Forty, 5-micron sections from each tissue block were cut and placed immediately in Qiagen RNA lysis buffer. RNA was isolated using Qiagen kits with on-column DNAse treatment to remove genomic DNA followed by precipitation. Purified RNA was then quantified using a NanoDrop spectrophotometer and RNA integrity was determined by running aliquots on an Agilent Bioanalyzer. RNA integrity numbers were >6 in all cases.

RNA labeling, hybridization, data processing and quality assessment

A total of 2 µg of RNA from each of 20 tumor/normal paired specimens (n = 40) was labeled with reagents from Affymetrix according to the manufacturer's instructions. Hybridization cocktails containing 5–5.5 µg of fragmented, end-labeled single-stranded cDNA were prepared and hybridized to GeneChip Human Exon 1.0 ST arrays. These arrays survey both gene expression and alternative splicing patterns on a whole-genome scale on a single array. The array contains 5.4 million, 5-µm features (probes) grouped into 1.4 million probesets, interrogating over 1 million exon clusters (33). Processed arrays were scanned using the GeneChip Scanner 3000 7G. Affymetrix Expression Console™ Software (version 1.0) was used to perform quality assessment.

Data analysis

All exon array data was analyzed using tools in Partek Genomic Suite 6.4 software (Partek Inc., St. Louis, MO). The Robust Multi-array Analysis (RMA) (34) algorithm was used for probeset (exon-level) intensity analysis. Exon-level data was then filtered to include only those probesets that are in the 'core' meta-probe list, which represents 17 800 RefSeq genes and full-length GenBank mRNAs. Within this gene set, the Analysis of Variance (ANOVA) and multiple-test correction for P-values in Partek Genomic Suite were used to identify alternative splicing events. Tissue type (tumor versus normal in this case) was chosen as the candidate variable in the ANOVA model to obtain tumor-related splicing events. ANOVA P-values were corrected using the Bonferroni method. A list of genes with significant alternative spliced events was generated by using a 0.05 FDR criterion as the significance cutoff. Then the genes were sorted based on gene function using Ingenuity Pathway Analysis software (Ingenuity Systems, www.ingenuity.com). Subsequent verification and validation of splicing events was restricted to those genes with functions identified as being associated with cancer such as invasion, cell movement, apoptosis, cell death, tumorigenesis, and differentiation, etc. The flow chart for data analyses was as follows:

(0) QC with Affymetrix Expression Console.
(1) Probe-level analysis with GC-RMA in Partek GS.
(2) Alternative splice analysis of exon data with ANOVA in Partek GS.
(3) Function analysis of genes with alternative splicing in Ingenuity Pathway Analysis and generation of a cancer-related gene list.
(4) Manual review of Partek gene view plots to identify alternate splicing forms and determine the frequency of changes observed in the patient set.
(5) Detailed manual analysis focusing on genes with simple forms of alternative splicing and high frequency of changes. Reviewed Affymetrix probeset sequences and the RefSeq database, and aligned probe sequences with BLAT in the UCSC Genome Browser.
(6) Identification of genes with alternative splicing at high frequency in this patient set (>50% patients with same change).
(7) Verification and validation.

Exon array data verification, splice event validation, and variant transcript quantification

Reverse transcription of 2 µg RNA was performed in 100 µl reaction volumes with random hexamer priming and MMLV reverse transcriptase (Epicentre, Madison, WI) (35). Diluted cDNA was used in all following PCR or qPCR reactions. Standard PCR was carried out for 35 cycles using Titanium™ Taq DNA Polymerase (ClonTech, Mountain View, CA) starting with 25 ng of cDNA as template in a 50 µl reaction. PCR products were separated on 10% Criterion™ Precast TBE Gels (Bio-Rad, Hercules, CA) for visualization or 1% agarose gels for DNA extractions using the MinElute™ Gel Extraction Kit (Qiagen, Inc., Valencia, CA). Quantitative real-time PCR was performed using the Brilliant SYBR Green QPCR kit (Stratagene, La Jolla, CA) with 10 ng of cDNA as template in a 5 µl reaction on an ABI PRISM 7900HT instrument (Applied Biosystems, Foster City, CA). PCR was run in triplicate for each sample. Relative expression was calculated using the delta-Ct methods previously described (36) and with B2M as the endogenous control gene. B2M was chosen as it showed very little variability in the exon array data among all samples. Bidirectional DNA sequencing of novel transcripts was performed by the DNA Core Facility at the Mount Sinai School of Medicine using the forward and reverse PCR primers. Detailed methods for analysis of alternative splicing and primer design for PCR are included in the Supplementary Methods.

RESULTS

Data quality assessment identified no outlier arrays using Expression Console Software. Spearman Rank correlation of the hybridization control signal values between any two chips was high (r ≥ 0.92). Thus, data from all arrays was included in the alternative splicing ANOVA (Alt-splice ANOVA). Analyzing the 17 800 genes represented in the 'Core' Probeset list, we identified a total of 2369 genes that appear to have differential expression of alternate transcripts between normal and tumor tissue (FDR correction to establish a cutoff P-value of 3.94e-6, corresponding to the 0.05 FDR level). Gene function analysis indicated that the largest subset (729/2369, 30.8%) of these genes were cancer related (a full list of genes is included in Supplementary Table S2) followed by other functional categories such as tissue development, cellular growth and proliferation, tissue morphology, and immune response. Of the 729 cancer-related genes, 47 showed the same alternate splicing event in 50% or more (10 of 20) of patients. In addition, one gene (CDH3) showed alternative splicing in only 8 (40%) patients, but all of these were female (8 of 14; 57%) indicating the possibility of gender-specific, cancer related alternative splicing. Of the 48 genes, 20 have reported alternate splice variants in the Entrez Gene database and the remaining 28 genes have only one known transcript. The 48 genes can also be further divided into categories as follows and listed in Table 1: (i) six genes with known splice variants where relevance of splice variants has been identified in cancer. These include ADAM12, CEACAM1, and FGFR4 which have been associated specifically with lung cancer; (ii) 14 genes with known splice variants but where relevance of the splice variants to cancer has not been determined; (iii) 28 potentially novel splice variants differentially expressed between normal lung and lung adenocarcinoma.

Table 1. Alternative spliced genes between tumor and normal in high frequency from Exon Array analysis

Symbol | Entrez ID | Gene name | Frequency (%)

Genes with known splice variants where relevance of splice variants has been identified in cancer
ADAM12 | 8038 | ADAM metallopeptidase domain 12 (meltrin alpha) | 70
BCL6 | 604 | B-cell CLL/lymphoma 6 (zinc finger protein 51) | 55
CDKN2A | 1029 | Cyclin-dependent kinase inhibitor 2A (melanoma, p16, inhibits CDK4) | 65
CEACAM1 | 634 | Carcinoembryonic antigen-related cell adhesion molecule 1 (biliary glycoprotein) | 70
DSCR1 | 1827 | Down syndrome critical region gene 1 | 85
FGFR4 | 2264 | Fibroblast growth factor receptor 4 | 90

Genes with known variants but where relevance of splice variants to cancer has not been identified
ADRA1A | 148 | Adrenergic, alpha-1A-, receptor | 90
AKAP12 | 9590 | A kinase (PRKA) anchor protein (gravin) 12 | 90
CCNE1 | 898 | Cyclin E1 | 75
CDCA1 | 83540 | Cell division cycle associated 1 | 90
DKK3 | 27122 | Dickkopf homolog 3 (Xenopus laevis) | 75
ERG | 2078 | v-ets erythroblastosis virus E26 oncogene like (avian) | 100
FEZ1 | 9638 | Fasciculation and elongation protein zeta 1 (zygin I) | 90
FPRL1 | 2358 | Formyl peptide receptor-like 1 | 70
GSN | 2934 | Gelsolin (amyloidosis, Finnish type) | 75
KL | 9365 | Klotho | 80
RASGRF1 | 5923 | Ras protein-specific guanine nucleotide-releasing factor 1 | 95
TBX3 | 6926 | T-box 3 (ulnar mammary syndrome) | 50
TNFSF11 | 8600 | Tumor necrosis factor (ligand) superfamily, member 11 | 55
ZBTB16 | 7704 | Zinc finger and BTB domain containing 16 | 80

Genes with novel splice variants
ANKRD1 | 27063 | Ankyrin repeat domain 1 (cardiac muscle) | 75
APCDD1 | 147495 | Adenomatosis polyposis coli down-regulated 1 | 65
ARHGEF3 | 50650 | Rho guanine nucleotide exchange factor (GEF) 3 | 85
ARMET | 7873 | Arginine-rich, mutated in early stage tumors | 85
CDH1 | 999 | Cadherin 1, type 1, E-cadherin (epithelial) | 60
CDH3 | 1001 | Cadherin 3, type 1, P-cadherin (placental) | 40
CXCL5 | 6374 | Chemokine (C-X-C motif) ligand 5 | 60
EMP2 | 2013 | Epithelial membrane protein 2 | 65
FSTL3 | 10272 | Follistatin-like 3 (secreted glycoprotein) | 70
GATA2 | 2624 | GATA binding protein 2 | 95
GATA6 | 2627 | GATA binding protein 6 | 90
HCK | 3055 | Hemopoietic cell kinase | 80
IGFBP6 | 3489 | Insulin-like growth factor binding protein 6 | 75
ITGA5 | 3678 | Integrin, alpha 5 (fibronectin receptor, alpha polypeptide) | 55
KLF2 | 10365 | Kruppel-like factor 2 (lung) | 85
NFIL3 | 4783 | Nuclear factor, interleukin 3 regulated | 90
PGR | 5241 | Progesterone receptor | 70
PIAS1 | 8554 | Protein inhibitor of activated STAT, 1 | 60
PTHR1 | 5745 | Parathyroid hormone receptor 1 | 100
RAMP2 | 10266 | Receptor (calcitonin) activity modifying protein 2 | 100
RASIP1 | 54922 | Ras interacting protein 1 | 100
SMAD6 | 4091 | SMAD, mothers against DPP homolog 6 (Drosophila) | 90
SPARC | 6678 | Secreted protein, acidic, cysteine-rich (osteonectin) | 80
SRPX | 8406 | Sushi-repeat-containing protein, X-linked | 100
SSBP2 | 23635 | Single-stranded DNA binding protein 2 | 80
TBX2 | 6909 | T-box 2 | 95
TGFBR3 | 7049 | Transforming growth factor, beta receptor III (betaglycan, 300kDa) | 55
VEGFC | 7424 | Vascular endothelial growth factor C | 85

Known or Novel splice variants were based on Entrez Gene database and relevance to cancer was based on PubMed search. Both performed on 31 December 2007.

Verification of exon array data

The first step in verification of alternative splicing was to verify the exon array expression data for regions of genes (exons) that appeared to have differential expression in tumor and normal samples. Eleven genes (ARMET, CDKN1A, CDKN2A, CDKN2B, CEACAM1, ERG, FOXP1, KLF2, RASIP1, VEGFC and CDH3) were selected for qRT-PCR verification based on the frequency of alternative splicing and interest level from a review of the literature. Expression of two exons from differentially expressed portions of each gene was quantified using qRT-PCR on the same samples used in the array experiments (Supplementary Figures). Array data was considered verified if qRT-PCR demonstrated the same, directional difference in expression between the different regions of the gene, or if novel alternative transcripts were specifically identified based on PCR product size. For 4 of the 11 genes (CEACAM1, ERG, RASIP1 and VEGFC), the qPCR data was in clear agreement with the array data. One gene (CDKN2A) demonstrated differential expression that was statistically borderline (Supplementary Figures). In addition, while CDH3 exon-specific qPCR did not appear to validate differential expression, a novel alternative transcript was identified that matched the array data (described in detail below). Thus, 6 of the 11 selected genes were considered validated while the remaining five genes (ARMET, CDKN1A, FOXP1, KLF2 and CDKN2B) were not verifiable by PCR. In three of five cases, we noted that the exon array probes that did not show differential expression were located in very G-C rich regions, raising the possibility of cross hybridization as the reason for erroneous exon array data. Of the six validated genes, four (ERG, CDH3, CDKN2A, and CEACAM1) were explored in more detail.

Differential expression of ERG splice variants in lung adenocarcinoma

Exon array analysis of ERG demonstrated differential expression of splice variants in 20 of 20 tumor/normal tissue pairs. Furthermore, the expression pattern observed was consistent with differential expression of the two known transcripts of ERG curated in the RefSeq database (Figure 1a and b). Variant 1 (NM_182918.2) encodes a short form with a unique 5′ first exon and an additional exon towards the 3′ end while variant 2 (NM_004449.3) encodes a long form with three unique exons on the 5′ end. In order to validate the array data, we first designed two sets of qRT-PCR primers to quantify the differential expression of unique versus shared exons. The results demonstrate that the qRT-PCR data correlated with the array data in the same individuals (Supplementary Figures). Secondly, to prove that two transcripts exist in this set of lung tumor and matched normal samples, we designed a common primer set that spanned exon 7-8 of variant 2 of ERG and exon 5-6-7 of variant 1. These data demonstrate the existence of both ERG variants (Figure 1c) in both tumor and normal tissue. Two additional transcript variants, identified in the UCSC genome browser (uc002ywz.1 and uc002yxc.1), were also evaluated but no expression was found in either tumor or normal tissues. Finally, to specifically demonstrate differential expression of variants 1 and 2 we designed primer sets unique to the two transcripts and quantified expression of each. This was performed in 35 tumor/normal pairs (which included the 20 pairs used in the original array analysis) (Figure 1c). We observed that ERG variant 2 was significantly overexpressed (mean tumor/normal ratio 4.73; paired t-test P-value = 0.0005) in tumor samples compared to paired normal while ERG variant 1 was slightly under-expressed in tumor samples compared to normal (ratio = 0.74; P = 0.0186).

Figure 1. Exon array analysis and PCR analysis for alternative transcript variants of ERG. (a) Exon structure for ERG variant 1 (NM_182918.2) and variant 2 (NM_004449.3) showing the location of PCR primers used for expression analyses. (b) Partek GS alternative splice analysis of exon expression data in 20 patients. The graph shows mean expression value and standard error for each probe set in tumour (blue) and normal (normal) groups. (c) Verification of PCR product size difference for two variants using primers 1 and 3. (d) Quantification of the two variants using primer set 1 and 2 for variant 1 and primers 4 and 5 for variant 2 in tumour/normal paired samples (tissue pairs are joined by solid lines) from 35 patients.
Panel annotations: (c) 201 bp product, variant 1; 129 bp product, variant 2. (d) ERG V1: mean T/N ratio = 0.74, P = 0.0186; ERG V2: mean T/N ratio = 4.73, P = 0.0005; T/N ratio V1 versus V2: P = 1.8478e-15.

Identification of a novel CDH3 transcript variant in lung adenocarcinoma

CDH3 is encoded by 16 exons and has only one known transcript in the RefSeq database. Exon array data suggested alternative splicing of exon 2 since all other exons were expressed considerably higher in tumor than in normal tissues. Interestingly, this was observed only in female patients (8 of 14 cases) and not in any of six male patients (Supplementary Figures). Given the established gender differences in lung adenocarcinoma incidence and survival (37), this finding was evaluated further. qRT-PCR, with primers in exon 2 and in exon 3, verified that exon 3 was expressed at much higher levels (mean 24-fold tumor/normal) in tumor compared with normal in these eight patients while exon 2 was expressed only 3-fold higher in tumor versus normal. However, exon 2 and exon 3 were expressed similarly with a 2- to 3-fold difference in tumor versus normal in the other six female patients and in all male patients, thus validating the array data. Next, in order to further characterize the area of exon 1-3, we designed a forward primer in exon 1 and reverse primer across the junction of exons 3 and 4. RT-PCR products, visualized on a gel, showed the expected 283 bp band for the normal transcript and a shorter (168 bp) band (Figure 2a). Sequencing of these two bands (Figure 2b) confirmed the presence of the normal transcript (283 bp band) and the presence of a shorter transcript (168 bp) missing exon 2. To evaluate expression of this novel alternative transcript (referred to as CDH3 E2 skipping transcript, or variant transcript, from this point forward), qRT-PCR was performed with primers unique to each CDH3 transcript. This was performed in paired tumor/normal tissue from 35 patients and the results are shown in Figure 2. Based on a threshold of 40 cycles, four patients were negative for the E2 skipping transcript in both tumor and normal tissue, one patient was negative in tumor and positive in normal, 19 patients showed expression in tumor but not in normal, and 11 patients showed expression in both tissue types (Figure 2c). In all 11 patients, expression of the E2 skipping transcript was higher in tumor than in normal tissues. Therefore the E2 skipping transcript was overexpressed in 86% (30 of 35) of tumors (Figure 2d). However, in this expanded patient cohort there was no statistically significant gender difference in the frequency or expression level of the E2 skipping transcript. Finally, the normal CDH3 transcript was also found to be significantly overexpressed in tumor compared with normal (Figure 2d).

Figure 2. Identification, validation, and quantification of novel CDH3 transcript variants. (a) Identification of an alternative CDH3 transcript using PCR primers located in exon 1 and across the exon 3-4 boundary. (b) DNA sequencing results for the two PCR products demonstrating skipping of exon 2. (c) Frequency of the E2 skipping transcript expressed in 35 patients. (d) Specific quantification of the two variants in tumour/normal paired samples (tissue pairs are joined by solid lines) from 35 patients.
Panel annotations: (a) 283 bp, normal transcript; 168 bp, novel transcript. (d) CDH3 normal transcript: mean T/N ratio = 20.46, P = 6.4786e-10; CDH3 E2 skipping variant: mean T/N ratio = 42.86, P = 1.9347e-10.

Panel (c), exon 2 skipping variant, number of patients:
Tumor | Normal | Male | Female | Total | Frequency (%)
− | − | 2 | 2 | 4 | 11.4
+ | − | 6 | 13 | 19 | 54.4
− | + | 0 | 1 | 1 | 2.8
+ | + | 6 | 5 | 11 | 31.4
Total | | 14 | 21 | 35 | 100.0

Identification of CDKN2A transcript variant 2 in lung adenocarcinoma

RefSeq databases indicate several CDKN2A transcript variants which differ in their first exons. At least three alternatively spliced variants, each encoding distinct proteins, have been reported with variants 1, 3, and 4 encoding P16, Isoform 3, and ARF respectively (Figure 3a). The data from the exon arrays (Figure 3b) and qRT-PCR verification (Supplementary Figures) suggested the presence of a further splice variant of this gene. Analysis of the Affymetrix probe design and NCBI GenBank databases identified a cloned sequence originating from testis tissue (Accession #: BG717152, GenBank ID: 13996339) and an mRNA sequence, CDKN2A transcript variant 2 (Accession #: NM_058196, GenBank ID: 17738295). However, the record was temporarily removed by NCBI staff since the variant has not been confirmed. Four unique PCR primer sets were designed to assess all four possible transcript variants in this set of lung tumor/normal tissue. Variant 3 was not expressed in lung tissues but the three other variants were all expressed (Figure 3c). Variant 2 was also expressed in lung cancer cell lines H1299 and H2DB10, and Universal Reference (UR) RNA which contains 10 combined cancer cell line RNAs, but not in the lung cancer cell line A549. Furthermore, DNA sequencing of the 377 bp variant 2 PCR product confirmed that the transcript encodes a portion of intron 1, consistent with the reported variant 2 mRNA (Figure 3d). Finally, quantification of three variants in 33 paired tumor and normal tissues showed that variant 1 (p16) was not expressed significantly differently between tumor and normal (mean tumor/normal ratio = 1.75; P = 0.1439). Both variant 2 (Isoform 2, mean tumor/normal ratio = 20.1; P = 0.0186) and variant 4 (ARF, mean tumor/normal ratio = 103.01; P = 0.0035) were significantly overexpressed in tumor compared to normal (Figure 3e).

Figure 3. Exon array analysis and PCR analysis for alternative transcript variants of CDKN2A. (a) Exon structure for CDKN2A transcript variants 1 (NM_000077), 2 (NM_058196), 3 (NM_058197) and 4 (NM_058195) indicating the location of PCR primers used for expression analyses. (b) Partek GS alternative splice analysis of exon expression data in 20 patients. The graph shows mean expression value and standard error for each probe set in tumour (blue) and normal (normal) groups. (c) Verification of PCR product size difference for three variants using primers 1 and 3 (variant 2/isoform2), 2 and 4 (variant 1/p16) and 2 and 6 (variant 4/Arf). Variant 3 was not detected using primers 2 and 5. (d) DNA sequencing results of isoform 2. Sequencing shows intronic sequence upstream of exon 2 but also reads directly into exon 3 eliminating the possibility of genomic DNA as the source of the PCR product. (e) Quantification of variant P16 (using primers 2 and 4), Arf (2 and 6), and Isoform 2 (2 and 3) in tumour/normal paired samples (tissue pairs are joined by solid lines) from 33 patients.
Panel annotations: (c) bands marked at 377 bp, 190 bp and 148 bp. (e) P16: mean T/N ratio = 1.75, P = 0.1439; Arf: mean T/N ratio = 103.01, P = 0.0035; Isoform2: mean T/N ratio = 20.10, P = 0.0186.

Quantification of CEACAM1 transcript variants in NSCLC

CEACAM1 is encoded by nine exons with two known variants in the RefSeq database. Variant 1 uses all nine exons while variant 2 is missing exon 7 (Figure 4c). Our array data demonstrated higher expression of exons 1 through 6 as well as 8 and 9 in tumors versus normal while expression of exon 7 was essentially equal (Supplementary Figures). This was observed in 14 of 20 (70%) tissue pairs and suggested differential expression of the two known CEACAM1 variants in lung tumor and normal tissues as reported previously (38). Furthermore, the exon array data was verified by qRT-PCR using PCR primer sets designed to amplify exon 7 and exon 8 specifically in the same samples (Supplementary Figures). Expression of CEACAM1 was reduced in malignant tissues as compared with corresponding normal tissues deriving from breast (39), prostate (40), colon (41), and endometrium (42). These findings indicated that CEACAM1 might suppress carcinogenesis. However, in contrast, high expression of CEACAM1 protein was seen in lung tumor tissues and also correlated with poor survival in lung cancer (43–45). Since these studies did not examine expression of the two CEACAM1 variants specifically, we designed qRT-PCR assays unique to variants 1 and 2 of CEACAM1 and analyzed expression of each in lung, breast and colon tumors plus normal tissues from each organ site. In 35 lung adenocarcinoma patients we found that CEACAM1 variant 2 was highly and significantly overexpressed in paired tumor versus normal (tumor/normal ratio = 36.42; P = 0.000002) while variant 1 was only slightly overexpressed in tumor (tumor/normal ratio = 3.53; P = 0.0014). Furthermore, analysis of an additional 43 lung tumors, including squamous cell cancers, revealed expression levels indistinguishable from the original 35 adenocarcinoma samples (Figure 4a). In the breast and colon tissues however, we found no significant differences in expression of either variant 1 or variant 2 in tumor versus normal (unmatched) samples (Figure 4b). Given this data we postulated that overexpression of CEACAM1 variant 2 in lung cancer may be responsible for the survival differences observed between tumor types. However, an analysis of disease-free survival in our cohort of 78 lung cancer patients including 48 stage I and 30 higher stages (median follow up of 24 months) showed no association of CEACAM1 variant 1 or 2 expression with patient survival (Cox regression P-values 0.715 and 0.536, respectively).

Figure 4. Quantification of CEACAM1 variants in NSCLC, colon and breast cancer patients. (a) Expression of CEACAM1 variant 1 and variant 2 in 35 tumour/normal paired samples from lung adenocarcinoma patients (tissue pairs are joined by solid lines) and in 43 lung adenocarcinoma (n = 11) and squamous cell carcinoma (n = 32) samples only (without matched normals). The third graph shows the tumour/normal expression ratio for variant 1 and 2 in the 35 matched tissue pairs. (b) Expression of variant 1 and variant 2 in non-paired tumour and normal samples from colon and breast cancer patients (10 patients each). (c) Exon location and PCR primers locations for qPCR of variant 1 (NM_001712; primers 1 and 2), variant 2 (NM_001024912; primers 3 and 4).

DISCUSSION

In this study we have performed an extensive identification and verification of alternative splice variant gene expression in NSCLC. To our knowledge, this study is the first such genome wide analysis of alternative splicing events in NSCLC or any other tumor type. Our results indicate that approximately 13% of the 17 800 core RefSeq genes appear to have alternative transcripts that are differentially expressed between lung adenocarcinoma and adjacent normal lung tissue. Furthermore, the largest subsets of these alternatively spliced genes appear to be cancer related and/or involved in cellular processes such as growth and proliferation. For some genes, alternative transcripts have already been identified but we now demonstrate their differential expression in cancer. In many cases however, our microarray data indicates the presence of differentially expressed alternative transcripts that are currently unidentified. Thus it appears that differential expression of alternative transcripts is frequent in NSCLC and that this may be a valuable resource for the development of novel diagnostic, prognostic and therapeutic tools.

While the ability to analyze alternative transcript expression on a genome wide scale is very powerful, verification and validation of this data is labor intensive. For this reason, we chose to focus on genes that have previously been associated with cancer, and where differential expression occurred in >50% of tumor/normal tissue pairs. In total, 11 genes were examined and we were able to validate the array data for six of them.

The ERG (V-ETS avian erythroblastosis virus E26 oncogene homolog) protein shares significant homology with both 5′ and 3′ regions of the viral ETS1 oncogene, ETS1, suggesting that it belongs to the ETS oncogene family. ERG is located at chromosome band 21q22 and has been identified as the target of genomic rearrangement events in acute myeloid leukemia (46), Ewing's sarcoma (47) and prostate cancer (48–50). In acute myeloid leukemia (AML) and Ewing's sarcoma ERG has been associated with several translocation fusion partners including ELF4, FUS1 and EWSR1 (46,51,52). Furthermore, high expression of ERG in the absence of karyotypic rearrangement or amplification was demonstrated to be an adverse prognostic factor in patients with AML (53). In prostate cancer, ERG is frequently fused to a nearby gene, TMPRSS2, resulting in androgen regulation of ERG and several reports now indicate that the presence of this fusion is a poor prognostic indicator in prostate cancer (54,55). Interestingly, none of the reports cited above discuss the existence of two ERG variants and how these are related to the fusion product. Our analysis of variant-specific expression clearly showed that variant 2 has much higher expression in tumor compared to the paired normal lung tissue while variant 1 has similar or lower expression in tumors. Thus it seems that the oncogenic effect of ERG may be exerted through functions encoded by variant 2 and expression of fusion gene products should also be closely evaluated for which ERG variant is present. Further investigations are required to identify the functional differences between the two variants and could lead to more targeted drug discovery.

The cadherins are a family of transmembrane proteins that mediate calcium-dependent cell-cell adhesion at adherens junctions. The cytoplasmic domain of cadherins binds to α and γ catenins and is linked to the actin cytoskeleton via α catenin (56). These interactions are vital for stable cell-cell interactions and maintenance of normal cell physiology. In cancer, disruption of the adherens junctions, for example by downregulation or inactivating mutation of cadherins, can result in epithelial-to-mesenchymal transition, increased proliferation, invasion and metastasis (56,57). In part, this may be mediated by the release and accumulation of β catenin which, when translocated to the nucleus, induces transcription of genes such as cyclin D1 and c-myc.

While the role of the prototypic cadherin, E-cadherin (CDH1), as a classic tumor suppressor gene in cancer is well established, the role of P-cadherin remains unclear as it behaves differently depending on the tumor type being studied. For example, in melanoma, the loss of P-cadherin (and E-cadherin) allows invasion and migration of cells and thus P-cadherin appears to be acting as a pro-adhesion tumor suppressor (58,59). In breast cancer however, high expression of P-cadherin strongly correlated with high histologic grade, increased proliferation and poor patient survival (60,61). Furthermore, in pan-
creatic cancer cell lines, overexpression of P-cadherin Thus, verification and validation of data from the exon resulted in increased cell motility, cytoplasmic accumula- arrays is clearly required. Alternative transcript expression tion of catenins and activation of the Rho GTPases, for four of these genes was studied in more detail. Rac1 and Cdc42 (62). 6544 Nucleic Acids Research, 2008, Vol. 36, No. 20 In our study we found overexpression of P-cadherin in status in hepatocellular carcinoma (67) and worse out- lung tumors compared to normal lung but also identified come in B-cell lymphomas (66). Similarly, overexpression overexpression of an alternative splice variant in which of p16 has also been observed and has been associated exon 2 is missing. Analysis of the resulting mRNA indi- with progression and poor survival in ovarian cancer cates that the normal ATG initiation codon is placed out (69), prostate cancer (70) and breast cancer (71). While of frame and would result in a truncated protein after only overexpression of p16 and ARF appears to contradict 27 amino acids. This would clearly result in an inactive their known cellular functions as tumor suppressors, protein and would fit with a tumor suppressor function for mechanisms have been proposed whereby this event may P-cadherin in lung cancer if it were not for the fact that be explained through activating mutations in Rb or induc- full length P-cadherin mRNA is actually overexpressed in tion of myc and ras (68,72,73). However, our data suggests our tumors. However, upon further analysis we identified an alternative: that the variant 2 transcript may account several alternative in frame ATG codons downstream for the previously observed overexpression. Variant 2 was of the known translation start site. 
Furthermore, at least originally believed to give rise to a new isoform (Isoform two of these putative alternative start sites have kozak 2) of P16 with the first amino acid encoded by an in frame sequences that are believed to be active in other genes. ATG that is present in the original exon 2. However, we Protein translation initiated at either of these sites would also identified an alternative ATG codon that is in the result in a P-cadherin protein lacking the signal peptide extended exon 2 and is in frame with ARF. This alterna- and most of the extracellular domain, while retaining the tive ATG has a reasonably good Kozak sequence transmembrane domain, juxtamembrane domain and the (CCGTCATGC) and, being upstream of the putative catenin binding domain. If such a protein were to be over- p16 isoform 2 start site, would presumably dominate expressed in tumors one can easily envision disruption of translation initiation. This putative ARF isoform would the adherens junctions in a dominant manner leading to lack the amino terminal portion of ARF and would there- catenin accumulation and tumorigenesis. fore be unable to bind TBP-1, E2F, Myc, FoxM1, CTBP1 Its is well known that multiple transcripts are or mdm2 (74) and may be unable to block cell cycle transcribed from the CDKN2A (cyclin-dependent kinase progression. However others have shown that the carboxy inhibitor 2A) locus. CDKN2A is an extensively studied terminus of artificially truncated ARF still accumulates tumor suppressor locus that is frequently mutated or in the nucleolus (75–77) and thus this putative ARF iso- deleted in a wide variety of tumor types. Exploration of form could theoretically act as a dominant negative, thus the RefSeq database identified four transcript variants explaining how overexpression of ARF may be pro- potentially transcribed from this locus of which three are tumorigenic. considered verified. 
Variant 1 gives rise to the p16 protein CEACAM1 [carcinoembryonic antigen-related cell and variant 4 gives rise to the alternative reading frame adhesion molecule 1 (biliary glycoprotein)] is a cell–cell p14/ARF protein. Variant 4 also gives rise to a shorter adhesion molecule that also plays a role in signal transduc- protein product (p19smARF) which results from an alter- tion. Two common variants are known for CEACAM1; native translation start site (63). Variant 3 gives rise to one with a long cytoplasmic domain (L form or variant 1) a longer protein that shares the same reading frame as and one with a short cytoplasmic domain (S form or p16 and appears to be specifically expressed in the variant 2). The expression of CEACAM1 in cancer has pancreas. In addition, another transcript variant (p16g) been extensively studied but early reports appeared to be was recently identified (64) but has yet to be curated in contradictory. For example, reduced expression of the RefSeq databases. Finally, variant 2 lacks exons 1a CEACAM1 was reported in breast, colon, prostate and and 1b, and exon 2 is slightly longer due to inclusion of endometrial cancer (39,41,78) and CEACAM1 was there- an additional 100 bases of intronic sequence. Variant 2 fore considered to be a negative regulator of tumor cell may also have a shorter 3 UTR than p16 or Arf. growth. However, in melanoma (79) and lung cancer Variant 2 was originally cloned from testis tissue but has (43,44), several reports indicated that CEACAM1 was been temporarily removed by RefSeq staff for further overexpressed in tumors and that this was associated evaluation. with disease progression and poor outcome. In 1997 In cancer, inactivation of the p16INK4a/ARF tumor Turbide et al. (80) found that the L form of CEACAM1 suppressor genes is frequently mediated through genomic exhibited a tumor suppressive phenotype and that this was deletion, promoter methylation or inactivating mutation dominant over expression of the S form. 
Furthermore, leading to loss of p53 and Rb dependent cell cycle regula- using semi-quantitative RT-PCR Wang et al. (38) found tion (65). In NSCLC, loss of heterozygosity and/or homo- that the L form of CEACAM1 predominated in normal zygous deletion of the CDKN2A locus on chromosome lung while the S form appeared more abundant in tumors. 9p21 has been reported at frequencies up to 40% (8). Thus they proposed that isoform switching rather than In our study, 30% of tumors demonstrated reduced CEACAM1 downregulation occurs in NSCLC as opposed expression of all three measured transcripts (p16, ARF to other tumor types. Our quantitative analysis clearly and variant 2) and this is likely a result of genomic demonstrates a switch in abundance from the L form deletions. However, in the remaining tumors expression (variant 1) to the S form (variant 2) in NSCLC and we of ARF and variant 2 (but not p16) were significantly also demonstrate that no such switch appears to occur in higher than in paired normal tissue. Overexpression breast cancer or colon cancer. Furthermore, we also ana- of ARF in cancer has now been reported several times lyzed a publicly available GeneChip Human Exon 1.0 ST (66–68) and has been associated with poor differentiation array data set from colon (33) and found no significant Nucleic Acids Research, 2008, Vol. 36, No. 20 6545 8. Weir,B.A., Woo,M.S., Getz,G., Perner,S., Ding,L., Beroukhim,R., differential expression of CEACAM1 variants in Lin,W.M., Province,M.A., Kraja,A., Johnson,L.A. et al. (2007) those 10 pairs of colon tumor/normal samples (data not Characterizing the cancer genome in lung adenocarcinoma. Nature, shown). Thus our findings support the hypothesis that the 450, 893–898. tumor suppressive or oncogenic effects of CEACAM1 are 9. Maniatis,T. and Tasic,B. (2002) Alternative pre-mRNA splicing and splice variant dependent and that expression of the two proteome expansion in metazoans. Nature, 418, 236–243. 10. Black,D.L. 
(2003) Mechanisms of alternative pre-messenger RNA variants is differentially regulated in different tissue types. splicing. Annu. Rev. Biochem., 72, 291–336. In conclusion, our data demonstrates that differential 11. Mironov,A.A., Fickett,J.W. and Gelfand,M.S. (1999) Frequent expression of alternative splice variants is a common alternative splicing of human genes. Genome Res., 9, 1288–1293. event in NSCLC. It also shows that in addition to identi- 12. Brett,D., Hanke,J., Lehmann,G., Haase,S., Delbruck,S., Krueger,S., Reich,J. and Bork,P. (2000) EST comparison indicates 38% of fication of novel, cancer-related splice variants, additional human mRNAs contain possible alternative splice forms. FEBS information can be gained even with regard to extensively Lett., 474, 83–86. studied, cancer-related genes. Splice variant expression 13. Kan,Z., States,D. and Gish,W. (2002) Selecting for functional should be considered in future genome-wide expression alternative splices in ESTs. Genome Res., 12, 1837–1845. studies and may lead to novel diagnostic, prognostic or 14. Modrek,B., Resch,A., Grasso,C. and Lee,C. (2001) Genome-wide detection of alternative splicing in expressed sequences of human therapeutic strategies in the fight against cancer. genes. Nucleic Acids Res., 29, 2850–2859. (GeneChip Human Exon 1.0 ST array cell files along 15. Sharp,P.A. (1994) Split genes and RNA splicing. Cell, 77, 805–815. with GC-RMA data from core gene probsets and patient 16. Collesi,C., Santoro,M.M., Gaudino,G. and Comoglio,P.M. (1996) information have been submitted to GEO databases and A splicing variant of the RON transcript induces constitutive tyrosine kinase activity and an invasive phenotype. Mol. Cell Biol., GEO Accession # is GSE12236). 16, 5518–5526. 17. Gayther,S.A., Barski,P., Batley,S.J., Li,L., de Foy,K.A., Cohen,S.N., Ponder,B.A. and Caldas,C. 
(1997) Aberrant splicing SUPPLEMENTARY DATA of the TSG101 and FHIT genes occurs frequently in multiple Supplementary Data are available at NAR Online. malignancies and in normal tissues and mimics alterations previously described in tumours. Oncogene, 15, 2119–2126. 18. Scotet,E. and Houssaint,E. (1998) Exon III splicing switch of ACKNOWLEDGEMENT fibroblast growth factor (FGF) receptor-2 and -3 can be induced by FGF-1 or FGF-2. Oncogene, 17, 67–76. Lung cancer cell lines were kindly provided by Dr Stuart 19. Ge,K., DuHadaway,J., Du,W., Herlyn,M., Rodeck,U. and Aaronson at the Mount Sinai School of Medicine. Prendergast,G.C. (1999) Mechanism for elimination of a tumor suppressor: aberrant splicing of a brain-specific exon causes loss of function of Bin1 in melanoma. Proc. Natl Acad. Sci. USA, 96, 9689–9694. FUNDING 20. Baudry,D., Hamelin,M., Cabanis,M.O., Fournet,J.C., Funding for open access charge: The work was funded in Tournade,M.F., Sarnacki,S., Junien,C. and Jeanpierre,C. (2000) WT1 splicing alterations in Wilms’ tumors. Clin. Cancer Res., 6, part by NIH/NCI grant R01 CA94059. 3957–3965. 21. Slawin,K.M., Shariat,S.F., Nguyen,C., Leventis,A.K., Song,W., Conflict of interest statement. None declared. Kattan,M.W., Young,C.Y., Tindall,D.J. and Wheeler,T.M. (2000) Detection of metastatic prostate cancer using a splice variant- specific reverse transcriptase-polymerase chain reaction assay for REFERENCES human glandular kallikrein. Cancer Res., 60, 7142–7148. 22. Liu,H.X., Cartegni,L., Zhang,M.Q. and Krainer,A.R. (2001) 1. Jemal,A., Murray,T., Ward,E., Samuels,A., Tiwari,R.C., A mechanism for exon skipping caused by nonsense or missense Ghafoor,A., Feuer,E.J. and Thun,M.J. (2005) Cancer statistics. mutations in BRCA1 and other genes. Nat. Genet., 27, 55–58. CA Cancer J. Clin., 55, 10–30. 23. Lukas,J., Gao,D.Q., Keshmeshian,M., Wen,W.H., Tsao-Wei,D., 2. Ries,L.A.G., Melbert,D., Krapcho,M., Mariotto,A., Miller,B.A., Rosenberg,S. and Press,M.F. 
(2001) Alternative and aberrant Feuer,E.J., Clegg,L., Horner,M.J., Howlader,N., Eisner,M.P. et al. messenger RNA splicing of the mdm2 oncogene in invasive breast (2006) SEER Cancer Statistics Review, 1975–2004. National Cancer cancer. Cancer Res., 61, 3212–3219. Institute, Bethesda, MD. 24. Kwabi-Addo,B., Ropiquet,F., Giri,D. and Ittmann,M. (2001) 3. Bhattacharjee,A., Richards,W.G., Staunton,J., Li,C., Monti,S., Alternative splicing of fibroblast growth factor receptors in human Vasa,P., Ladd,C., Beheshti,J., Bueno,R., Gillette,M. et al. (2001) prostate cancer. Prostate, 46, 163–172. Classification of human lung carcinomas by mRNA expression 25. Bartel,F., Taubert,H. and Harris,L.C. (2002) Alternative and profiling reveals distinct adenocarcinoma subclasses. Proc. Natl aberrant splicing of MDM2 mRNA in human cancer. Cancer Cell, Acad. Sci. USA, 98, 13790–13795. 2, 9–15. 4. Beer,D.G., Kardia,S.L., Huang,C.C., Giordano,T.J., Levin,A.M., 26. Barbour,A.P., Reeder,J.A., Walsh,M.D., Fawcett,J., Antalis,T.M. Misek,D.E., Lin,L., Chen,G., Gharib,T.G., Thomas,D.G. et al. and Gotley,D.C. (2003) Expression of the CD44v2-10 isoform (2002) Gene-expression profiles predict survival of patients with confers a metastatic phenotype: importance of the heparan sulfate lung adenocarcinoma. Nat. Med., 8, 816–824. 5. Potti,A., Mukherjee,S., Petersen,R., Dressman,H.K., Bild,A., attachment site CD44v3. Cancer Res., 63, 887–892. Koontz,J., Kratzke,R., Watson,M.A., Kelley,M., Ginsburg,G.S. 27. Steinman,H.A., Burstein,E., Lengner,C., Gosselin,J., Pihan,G., et al. (2006) A genomic strategy to refine prognosis in early-stage Duckett,C.S. and Jones,S.N. (2004) An alternative splice form of non-small-cell lung cancer. N. Engl. J. Med., 355, 570–580. Mdm2 induces p53-independent cell growth and tumorigenesis. 6. Chen,H.Y., Yu,S.L., Chen,C.H., Chang,G.C., Chen,C.Y., Yuan,A., J. Biol. Chem., 279, 4877–4886. Cheng,C.L., Wang,C.H., Terng,H.J., Kao,S.F. et al. (2007) A five- 28. Brinkman,B.M. 
(2004) Splice variants as cancer biomarkers. gene signature and clinical outcome in non-small-cell lung cancer. Clin. Biochem., 37, 584–594. N. Engl. J. Med., 356, 11–20. 29. Venables,J.P. (2004) Aberrant and alternative splicing in cancer. 7. Tonon,G., Wong,K.K., Maulik,G., Brennan,C., Feng,B., Zhang,Y., Cancer Res., 64, 7647–7654. Khatry,D.B., Protopopov,A., You,M.J., Aguirre,A.J. et al. (2005) 30. Kalnina,Z., Zayakin,P., Silina,K. and Line,A. (2005) Alterations High-resolution genomic profiles of human lung cancer. Proc. of pre-mRNA splicing in cancer. Genes Chromosomes. Cancer, 42, Natl Acad. Sci. USA, 102, 9625–9630. 342–357. 6546 Nucleic Acids Research, 2008, Vol. 36, No. 20 31. Venables,J.P. (2006) Unbalanced alternative splicing and its cancer with implications for molecular diagnosis. Mod. Pathol., 20, 467–473. significance in cancer. Bioessays, 28, 378–386. 50. Liu,W., Ewing,C.M., Chang,B.L., Li,T., Sun,J., Turner,A.R., 32. Skotheim,R.I. and Nees,M. (2007) Alternative splicing in cancer: Dimitrov,L., Zhu,Y., Sun,J., Kim,J.W. et al. (2007) Multiple noise, functional, or systematic? Int. J. Biochem. Cell Biol., 39, genomic alterations on 21q22 predict various TMPRSS2/ERG 1432–1449. 33. Gardina,P.J., Clark,T.A., Shimada,B., Staples,M.K., Yang,Q., fusion transcripts in human prostate cancers. Genes Chromosomes. Veitch,J., Schweitzer,A., Awad,T., Sugnet,C., Dee,S. et al. (2006) Cancer, 46, 972–980. Alternative splicing and differential gene expression in colon cancer 51. Panagopoulos,I., Mandahl,N., Mitelman,F. and Aman,P. (1995) Two distinct FUS breakpoint clusters in myxoid liposarcoma and detected by a whole genome exon array. BMC Genomics, 7, 325. acute myeloid leukemia with the translocations t(12;16) and 34. Irizarry,R.A., Hobbs,B., Collin,F., Beazer-Barclay,Y.D., t(16;21). Oncogene, 11, 1133–1137. Antonellis,K.J., Scherf,U. and Speed,T.P. (2003) Exploration, 52. 
Desmaze,C., Brizard,F., Turc-Carel,C., Melot,T., Delattre,O., normalization, and summaries of high density oligonucleotide array Thomas,G. and Aurias,A. (1997) Multiple chromosomal mecha- probe level data. Biostatistics, 4, 249–264. nisms generate an EWS/FLI1 or an EWS/ERG fusion gene in 35. Godfrey,T.E., Kim,S.-H., Chavira,M., Ruff,D.W., Warren,R.S., Ewing tumors. Cancer Genet. Cytogenet., 97, 12–19. Gray,J.W. and Jensen,R.H. (2000) Quantitative mRNA expression 53. Marcucci,G., Baldus,C.D., Ruppert,A.S., Radmacher,M.D., analysis from formalin-fixed, paraffin-embedded tissues using 5 Mrozek,K., Whitman,S.P., Kolitz,J.E., Edwards,C.G., nuclease quantitative RT-PCR. J. Mol. Diagn., 2, 84–91. Vardiman,J.W., Powell,B.L. et al. (2005) Overexpression of the 36. Tassone,F., Hagerman,R.J., Taylor,A.K., Gane,L.W., Godfrey,T.E. ETS-related gene, ERG, predicts a worse outcome in acute myeloid and Hagerman,P.J. (2000) Elevated levels of FMR1 mRNA in leukemia with normal karyotype: a Cancer and Leukemia Group B carrier males: a new mechanism of involvement in the fragile-X study. J. Clin. Oncol., 23, 9234–9242. syndrome. Am. J. Hum. Genet., 66, 6–15. 54. Nam,R.K., Sugar,L., Yang,W., Srivastava,S., Klotz,L.H., 37. Wisnivesky,J.P. and Halm,E.A. (2007) Sex differences in lung cancer Yang,L.Y., Stanimirovic,A., Encioiu,E., Neill,M., Loblaw,D.A. survival: do tumors behave differently in elderly women? et al. (2007) Expression of the TMPRSS2:ERG fusion gene predicts J. Clin. Oncol., 25, 1705–1712. cancer recurrence after surgery for localised prostate cancer. 38. Wang,L., Lin,S.H., Wu,W.G., Kemp,B.L., Walsh,G.L., Hong,W.K. Br. J. Cancer, 97, 1690–1695. and Mao,L. (2000) C-CAM1, a candidate tumor suppressor gene, is 55. Demichelis,F., Fall,K., Perner,S., Andren,O., Schmidt,F., abnormally expressed in primary lung cancers. Clin. Cancer Res., 6, Setlur,S.R., Hoshida,Y., Mosquera,J.M., Pawitan,Y., Lee,C. et al. 2988–2993. (2007) TMPRSS2:ERG gene fusion associated with lethal prostate 39. 
Riethdorf,L., Lisboa,B.W., Henkel,U., Naumann,M., Wagener,C. cancer in a watchful waiting cohort. Oncogene, 26, 4596–4599. and Loning,T. (1997) Differential expression of CD66a (BGP), 56. Conacci-Sorrell,M., Zhurinsky,J. and Ben-Ze’ev,A. (2002) a cell adhesion molecule of the carcinoembryonic antigen family, The cadherin-catenin adhesion system in signaling and cancer. in benign, premalignant, and malignant lesions of the human J. Clin. Invest., 109, 987–991. mammary gland. J. Histochem. Cytochem., 45, 957–963. 57. Paredes,J., Correia,A.L., Ribeiro,A.S., Albergaria,A., Milanezi,F. 40. Luo,W., Tapolsky,M., Earley,K., Wood,C.G., Wilson,D.R., and Schmitt,F.C. (2007) P-cadherin expression in breast cancer: Logothetis,C.J. and Lin,S.H. (1999) Tumor-suppressive activity of a review. Breast Cancer Res., 9, 214. CD66a in prostate cancer. Cancer Gene Ther., 6, 313–321. 58. Sanders,D.S., Blessing,K., Hassan,G.A., Bruton,R., Marsden,J.R. 41. Neumaier,M., Paululat,S., Chan,A., Matthaes,P. and Wagener,C. and Jankowski,J. (1999) Alterations in cadherin and catenin (1993) Biliary glycoprotein, a potential human cell adhesion expression during the biological progression of melanocytic molecule, is down-regulated in colorectal carcinomas. Proc. Natl tumours. Mol. Pathol., 52, 151–157. Acad. Sci. USA, 90, 10744–10748. 59. Hsu,M.Y., Wheelock,M.J., Johnson,K.R. and Herlyn,M. (1996) 42. Bamberger,A.M., Riethdorf,L., Nollau,P., Naumann,M., Shifts in cadherin profiles between human normal melanocytes and Erdmann,I., Gotze,J., Brummer,J., Schulte,H.M., Wagener,C. and melanomas. J. Investig. Dermatol. Symp. Proc., 1, 188–194. Loning,T. (1998) Dysregulated expression of CD66a (BGP, 60. Paredes,J., Albergaria,A., Oliveira,J.T., Jeronimo,C., Milanezi,F. C-CAM), an adhesion molecule of the CEA family, in endometrial and Schmitt,F.C. (2005) P-cadherin overexpression is an indicator cancer. Am.. J. Pathol., 152, 1401–1406. of clinical outcome in invasive breast carcinomas and is associated 43. 
Laack,E., Nikbakht,H., Peters,A., Kugler,C., Jasiewicz,Y., Edler,L., with CDH3 promoter hypomethylation. Clin. Cancer Res., 11, Brummer,J., Schumacher,U. and Hossfeld,D.K. (2002) Expression 5869–5877. of CEACAM1 in adenocarcinoma of the lung: a factor of inde- 61. Peralta,S.A., Knudsen,K.A., Salazar,H., Han,A.C. and pendent prognostic significance. J. Clin. Oncol., 20, 4279–4284. Keshgegian,A.A. (1999) P-cadherin expression in breast carcinoma 44. Sienel,W., Dango,S., Woelfle,U., Morresi-Hauf,A., Wagener,C., indicates poor survival. Cancer, 86, 1263–1272. Brummer,J., Mutschler,W., Passlick,B. and Pantel,K. (2003) 62. Taniuchi,K., Nakagawa,H., Hosokawa,M., Nakamura,T., Elevated expression of carcinoembryonic antigen-related cell adhe- Eguchi,H., Ohigashi,H., Ishikawa,O., Katagiri,T. and Nakamura,Y. sion molecule 1 promotes progression of non-small cell lung cancer. (2005) Overexpressed P-cadherin/CDH3 promotes motility of Clin. Cancer Res., 9, 2260–2266. pancreatic cancer cells by interacting with p120ctn and activating 45. Dango,S., Sienel,W., Schreiber,M., Stremmel,C., Kirschbaum,A., rho-family GTPases. Cancer Res., 65, 3092–3099. Pantel,K. and Passlick,B. (2008) Elevated expression of carcinoem- 63. Reef,S. and Kimchi,A. (2006) A smARF way to die: a novel short bryonic antigen-related cell adhesion molecule 1 (CEACAM-1) is isoform of p19ARF is linked to autophagic cell death. Autophagy, associated with increased angiogenic potential in non-small-cell lung 2, 328–330. cancer. Lung Cancer, 60, 426–433. 64. Lin,Y.C., Diccianni,M.B., Kim,Y., Lin,H.H., Lee,C.H., Lin,R.J., 46. Moore,S.D., Offor,O., Ferry,J.A., Amrein,P.C., Morton,C.C. and Joo,S.H., Li,J., Chuang,T.J., Yang,A.S. et al. (2007) Human Dal,C.P. (2006) ELF4 is fused to ERG in a case of acute myeloid p16gamma, a novel transcriptional variant of p16(INK4A), leukemia with a t(X;21)(q25-26;q22). Leuk. Res., 30, 1037–1042. coexpresses with p16(INK4A) in cancer cells and inhibits cell-cycle 47. 
Sorensen,P.H., Lessnick,S.L., Lopez-Terrada,D., Liu,X.F., progression. Oncogene, 26, 7017–7027. Triche,T.J. and Denny,C.T. (1994) A second Ewing’s sarcoma 65. Gil,J. and Peters,G. (2006) Regulation of the INK4b-ARF-INK4a translocation, t(21;22), fuses the EWS gene to another ETS-family tumour suppressor locus: all for one or one for all. Nat. Rev. Mol. transcription factor, ERG. Nat. Genet., 6, 146–151. Cell Biol., 7, 667–677. 48. Tomlins,S.A., Rhodes,D.R., Perner,S., Dhanasekaran,S.M., 66. Sanchez-Aguilera,A., Sanchez-Beato,M., Garcia,J.F., Prieto,I., Mehra,R., Sun,X.W., Varambally,S., Cao,X., Tchinda,J., Kuefer,R. Pollan,M. and Piris,M.A. (2002) p14(ARF) nuclear overexpression et al. (2005) Recurrent fusion of TMPRSS2 and ETS transcription in aggressive B-cell lymphomas is a sensor of malfunction of the factor genes in prostate cancer. Science, 310, 644–648. common tumor suppressor pathways. Blood, 99, 1411–1418. 49. Lapointe,J., Kim,Y.H., Miller,M.A., Li,C., Kaygusuz,G., 67. Ito,T., Nishida,N., Fukuda,Y., Nishimura,T., Komeda,T. and van de,R.M., Huntsman,D.G., Brooks,J.D. and Pollack,J.R. (2007) Nakao,K. (2004) Alteration of the p14(ARF) gene and p53 status in A variant TMPRSS2 isoform and ERG fusion product in prostate human hepatocellular carcinomas. J. Gastroenterol., 39, 355–361. Nucleic Acids Research, 2008, Vol. 36, No. 20 6547 68. Ferru,A., Fromont,G., Gibelin,H., Guilhot,J., Savagner,F., 74. Chen,Y.W., Paliwal,S., Draheim,K., Grossman,S.R. and Lewis,B.C. Tourani,J.M., Kraimps,J.L., Larsen,C.J. and Karayan-Tapon,L. (2008) p19Arf inhibits the invasion of hepatocellular carcinoma (2006) The status of CDKN2A alpha (p16INK4A) and beta cells by binding to C-terminal binding protein. Cancer Res., 68, (p14ARF) transcripts in thyroid tumour progression. Br. J. Cancer, 476–482. 95, 1670–1677. 75. Rizos,H., Darmanian,A.P., Mann,G.J. and Kefford,R.F. (2000) 69. 
Dong,Y., Walsh,M.D., McGuckin,M.A., Gabrielli,B.G., Two arginine rich domains in the p14ARF tumour suppressor Cummings,M.C., Wright,R.G., Hurst,T., Khoo,S.K. and mediate nucleolar localization. Oncogene, 19, 2978–2985. Parsons,P.G. (1997) Increased expression of cyclin-dependent 76. Ayrault,O., Karayan,L., Riou,J.F., Larsen,C.J. and Seite,P. (2003) kinase inhibitor 2 (CDKN2A) gene product P16INK4A in ovarian Delineation of the domains required for physical and functional cancer is associated with progression and unfavourable prognosis. interaction of p14ARF with human topoisomerase I. Oncogene, 22, Int. J. Cancer, 74, 57–63. 1945–1954. 70. Henshall,S.M., Quinn,D.I., Lee,C.S., Head,D.R., Golovsky,D., 77. Moulin,S., Llanos,S., Kim,S.H. and Peters,G. (2008) Binding to Brenner,P.C., Delprado,W., Stricker,P.D., Grygiel,J.J. and nucleophosmin determines the localization of human and chicken Sutherland,R.L. (2001) Overexpression of the cell cycle inhibitor ARF but not its impact on p53. Oncogene, 27, 2382–2389. p16INK4A in high-grade prostatic intraepithelial neoplasia predicts 78. Prall,F., Nollau,P., Neumaier,M., Haubeck,H.D., Drzeniek,Z., early relapse in prostate cancer patients. Clin. Cancer Res., 7, Helmchen,U., Loning,T. and Wagener,C. (1996) CD66a (BGP), 544–550. an adhesion molecule of the carcinoembryonic antigen family, 71. Hui,R., Macmillan,R.D., Kenny,F.S., Musgrove,E.A., is expressed in epithelium, endothelium, and myeloid cells in a Blamey,R.W., Nicholson,R.I., Robertson,J.F. and Sutherland,R.L. wide range of normal human tissues. J. Histochem. Cytochem., 44, (2000) INK4a gene expression and methylation in primary breast 35–41. cancer: overexpression of p16INK4a messenger RNA is a marker of 79. Thies,A., Moll,I., Berger,J., Wagener,C., Brummer,J., Schulze,H.J., poor prognosis. Clin. Cancer Res., 6, 2777–2787. Brunner,G. and Schumacher,U. (2002) CEACAM1 expression in 72. Palmero,I., Pantoja,C. and Serrano,M. 
(1998) p19ARF links the cutaneous malignant melanoma predicts the development of meta- tumour suppressor p53 to Ras. Nature, 395, 125–126. static disease. J. Clin. Oncol., 20, 2530–2536. 73. Bates,S., Phillips,A.C., Clark,P.A., Stott,F., Peters,G., Ludwig,R.L. 80. Turbide,C., Kunath,T., Daniels,E. and Beauchemin,N. (1997) and Vousden,K.H. (1998) p14ARF links the tumour suppressors Optimal ratios of biliary glycoprotein isoforms required for inhibi- RB and p53. Nature, 395, 124–125. tion of colonic tumor cell growth. Cancer Res., 57, 2781–2788. http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Nucleic Acids Research Oxford University Press

Whole genome exon arrays identify differential expression of alternatively spliced, cancer-related genes in lung cancer


Publisher: Oxford University Press
Copyright: © 2008 The Author(s)
ISSN: 0305-1048
eISSN: 1362-4962
DOI: 10.1093/nar/gkn697
PMID: 18927117

Abstract

Published online 16 October 2008
Nucleic Acids Research, 2008, Vol. 36, No. 20 6535–6547
doi:10.1093/nar/gkn697

Whole genome exon arrays identify differential expression of alternatively spliced, cancer-related genes in lung cancer

Liqiang Xi¹, Andrew Feber¹, Vanita Gupta¹, Maoxin Wu¹, Andrew D. Bergemann¹, Rodney J. Landreneau³, Virginia R. Litle², Arjun Pennathur³, James D. Luketich³ and Tony E. Godfrey¹,²,*

¹Department of Pathology, ²Department of Cardiothoracic Surgery, Mount Sinai School of Medicine, New York, NY 10029 and ³Heart, Lung and Esophageal Surgery Institute, and Pittsburgh Cancer Institute, University of Pittsburgh, Pittsburgh, PA 15260, USA

Received August 25, 2008; Revised September 24, 2008; Accepted September 25, 2008

*To whom correspondence should be addressed. Tel: 585 273 3112; Fax: 585 276 2576; Email: tony_godfrey@urmc.rochester.edu

© 2008 The Author(s). This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

ABSTRACT

Alternative processing of pre-mRNA transcripts is a major source of protein diversity in eukaryotes and has been implicated in several disease processes including cancer. In this study we have performed a genome-wide analysis of alternative splicing events in lung adenocarcinoma. We found that 2369 of the 17 800 core RefSeq genes appear to have alternative transcripts that are differentially expressed in lung adenocarcinoma versus normal. According to their known functions the largest subset of these genes (30.8%) is believed to be cancer related. Detailed analysis was performed for several genes using PCR, quantitative RT-PCR and DNA sequencing. We found overexpression of ERG variant 2 but not variant 1 in lung tumors and overexpression of CEACAM1 variant 1 but not variant 2 in lung tumors but not in breast or colon tumors. We also identified a novel, overexpressed variant of CDH3 and verified the existence and overexpression of a novel variant of P16 transcribed from the CDKN2A locus. These findings demonstrate how analysis of alternative pre-mRNA processing can shed additional light on differences between tumors and normal tissues as well as between different tumor types. Such studies may lead to the development of additional tools for tumor diagnosis, prognosis and therapy.

INTRODUCTION

Non-small cell lung cancer (NSCLC) is the most common cause of cancer-related death in the USA (1). While early detection may improve outcomes, overall 5-year survival rates for NSCLC are currently only 16% (2). New molecular diagnostic tests and novel therapeutic strategies are needed for this terrible disease. NSCLC is one of the most studied tumor types in the scientific literature, as demonstrated by the number of excellent studies on global gene expression (3–6) and genome-wide DNA copy number changes (7,8) that have been conducted in NSCLC. These studies have enhanced our knowledge of lung cancer biology, led to proposals for multicenter trials of primary tumor gene expression for prognosis and treatment and may identify avenues for novel therapeutic development. A promising area that remains relatively unexplored, however, is alternative splicing (AS) of mRNA to produce functionally different proteins. Such studies may lead to improved diagnostic and prognostic tools and may identify additional therapeutic targets for NSCLC.

Alternative splicing of pre-mRNA is an important process in normal metazoan development (9,10). Furthermore, recent bioinformatics analysis suggests that 65% of human genes are alternatively spliced (11–14); a large increase over prior estimates as low as 5% (15). AS is not only involved in normal development, but is also associated with human diseases including cancer (16–27). For some genes, alternative transcripts are differentially expressed between tumor and normal tissue and in a few cases, the expression of AS variants has been associated with tumor progression (28–32). However, most studies of AS in human disease have used a targeted approach and focused on individual genes. There is a great deal of potential for novel discovery from genome-wide studies of alternative splicing. Until recently such large scale studies have been a considerable technical and bioinformatic challenge but the introduction of new technology and powerful data analysis software now makes them more feasible.

In this study we have used the GeneChip Human Exon 1.0 ST Array from Affymetrix to explore genome-wide AS events in the most predominant histologic type of NSCLC, lung adenocarcinoma. The study was designed to identify cancer-associated alternative splicing events, verify splice variants and to validate differential expression of selected splice variants in independent tissue sets. Our results demonstrate that a large number of known genes, including well known oncogenes and tumor suppressors, are alternatively spliced and differentially expressed between normal lung and lung adenocarcinoma. These findings may provide a new resource for diagnosis and treatment of NSCLC.

MATERIALS AND METHODS

Specimens and RNA isolation

Snap frozen lung tissue specimens were obtained from tissue banks at the Heart, Lung and Esophageal Surgery Institute, University of Pittsburgh Medical Center. This study involving human tissue was approved by the Institutional Review Boards from both the University of Pittsburgh and Mount Sinai School of Medicine. In total, 36 pairs of lung adenocarcinoma and adjacent normal lung tissue plus 43 additional adenocarcinoma and squamous cell carcinoma specimens were analyzed (see clinical information for all patients in Supplementary Table S1). All tumor specimens were determined to comprise >70% tumor and adjacent normal specimens contained no histologically evident tumor or contaminating tissues. Forty 5-micron sections from each tissue block were cut and placed immediately in Qiagen RNA lysis buffer. RNA was isolated using Qiagen kits with on-column DNase treatment to remove genomic DNA followed by precipitation. Purified RNA was then quantified using a NanoDrop spectrophotometer and RNA integrity was determined by running aliquots on an Agilent Bioanalyzer. RNA integrity numbers were >6 in all cases.

RNA labeling, hybridization, data processing and quality assessment

A total of 2 μg of RNA from each of 20 tumor/normal paired specimens (n = 40) was labeled with reagents from Affymetrix according to the manufacturer's instructions. Hybridization cocktails containing 5–5.5 μg of fragmented, end-labeled single-stranded cDNA were prepared and hybridized to GeneChip Human Exon 1.0 ST arrays. These arrays survey both gene expression and alternative splicing patterns on a whole-genome scale on a single array. The array contains 5.4 million, 5-μm features (probes) grouped into 1.4 million probesets, interrogating over 1 million exon clusters (33). Processed arrays were scanned using the GeneChip Scanner 3000 7G. Affymetrix Expression Console Software (version 1.0) was used to perform quality assessment.

Data analysis

All exon array data was analyzed using tools in Partek Genomic Suite 6.4 software (Partek Inc., St. Louis, MO). The Robust Multi-array Analysis (RMA) (34) algorithm was used for probeset (exon-level) intensity analysis. Exon-level data was then filtered to include only those probesets that are in the 'core' meta-probe list, which represents 17 800 RefSeq genes and full-length GenBank mRNAs. Within this gene set, the Analysis of Variance (ANOVA) and multi test correction for P-values in Partek Genomic Suite were used to identify alternative splicing events. Tissue type (tumor versus normal in this case) was chosen as the candidate variable in the ANOVA model to obtain tumor-related splicing events. ANOVA P-values were corrected using the Bonferroni method. A list of genes with significant alternative splicing events was generated by using a 0.05 FDR criterion as the significance cutoff. Then the genes were sorted based on gene function using Ingenuity Pathway Analysis software (Ingenuity Systems, www.ingenuity.com). Subsequent verification and validation of splicing events was restricted to those genes with functions identified as being associated with cancer such as invasion, cell movement, apoptosis, cell death, tumorigenesis, and differentiation, etc. The flow chart for data analyses was as follows:

(0) QC with Affymetrix Expression Console.
(1) Probe-level analysis with GC-RMA in Partek GS.
(2) Alternative splice analysis of exon data with ANOVA in Partek GS.
(3) Function analysis of genes with alternative splicing in Ingenuity Pathway Analysis and generation of a cancer-related gene list.
(4) Manual review of Partek gene view plots to identify alternate splicing forms and determine the frequency of changes observed in the patient set.
(5) Detailed manual analysis focusing on genes with simple forms of alternative splicing and high frequency of changes. Reviewed Affymetrix probeset sequences, RefSeq database, Blatted probe sequences in UCSC Genome Browser.
(6) Identification of genes with alternative splicing at high frequency in this patient set (>50% patients with same change).
(7) Verification and validation.

Exon array data verification, splice event validation, and variant transcript quantification

Reverse transcription of 2 μg RNA was performed in 100 μl reaction volumes with random hexamer priming and MMLV reverse transcriptase (Epicentre, Madison, WI) (35). Diluted cDNA was used in all following PCR or qPCR reactions. Standard PCR was carried out for 35 cycles using Titanium Taq DNA Polymerase (ClonTech, Mountain View, CA) starting with 25 ng of cDNA as template in a 50 μl reaction. PCR products were separated on 10% Criterion Precast TBE Gels (Bio-Rad, Hercules, CA) for visualization or 1% agarose gels for DNA extractions using the MinElute Gel Extraction Kit (Qiagen, Inc., Valencia, CA). Quantitative real-time PCR was performed using the Brilliant SYBR Green QPCR kit (Stratagene, La Jolla, CA) with 10 ng of cDNA as template in a 5 μl reaction on an ABI PRISM 7900HT instrument (Applied Biosystems, Foster City, CA). PCR was run in triplicate for each sample. Relative expression was calculated using the delta-Ct methods previously described (36) and with B2M as the endogenous control gene. B2M was chosen as it showed very little variability in the exon array data among all samples. Bidirectional DNA sequencing of novel transcripts was performed by the DNA Core Facility at the Mount Sinai School of Medicine using forward and reverse PCR primers.

Detailed methods for analysis of alternative splicing and primer design for PCR are included in the Supplementary Methods.

RESULTS

Data quality assessment identified no outlier arrays using Expression Console Software. Spearman Rank correlation of the hybridization control signal values between any two chips was high (r ≥ 0.92). Thus, data from all arrays was included in the alternative splicing ANOVA (Alt-splice ANOVA). Analyzing the 17 800 genes represented in the 'Core' Probeset list, we identified a total of 2369 genes that appear to have differential expression of alternate transcripts between normal and tumor tissue (FDR correction to establish a cutoff P-value of 3.94e-6, corresponding to the 0.05 FDR level). Gene function analysis indicated that the largest subset (729/2369, 30.8%) of these genes were cancer related (a full list of genes is included in Supplementary Table S2) followed by other functional categories such as tissue development, cellular growth and proliferation, tissue morphology, and immune response. Of the 729 cancer-related genes, 47 showed the same alternate splicing event in 50% or more (10 of 20) of patients. In addition, one gene (CDH3) showed alternative splicing in only 8 (40%) patients, but all of these were female (8 of 14; 57%) indicating the possibility of gender-specific, cancer related alternative splicing. Of the 48 genes, 20 have reported alternate splice variants in the Entrez Gene database and the remaining 28 genes have only one known transcript. The 48 genes can also be further divided into categories as follows and listed in Table 1: (i) six genes with known splice variants where relevance of splice variants has been identified in cancer. These include ADAM12, CEACAM1, and FGFR4 which have been associated specifically with lung cancer; (ii) 14 genes with known splice variants but where relevance of the splice variants to cancer has not been determined; (iii) 28 potentially novel splice variants differentially expressed between normal lung and lung adenocarcinoma.

Table 1. Alternatively spliced genes between tumor and normal at high frequency from exon array analysis

Symbol | Entrez ID | Gene name | Frequency (%)

Genes with known splice variants where relevance of splice variants has been identified in cancer
ADAM12 | 8038 | ADAM metallopeptidase domain 12 (meltrin alpha) | 70
BCL6 | 604 | B-cell CLL/lymphoma 6 (zinc finger protein 51) | 55
CDKN2A | 1029 | Cyclin-dependent kinase inhibitor 2A (melanoma, p16, inhibits CDK4) | 65
CEACAM1 | 634 | Carcinoembryonic antigen-related cell adhesion molecule 1 (biliary glycoprotein) | 70
DSCR1 | 1827 | Down syndrome critical region gene 1 | 85
FGFR4 | 2264 | Fibroblast growth factor receptor 4 | 90

Genes with known variants but where relevance of splice variants to cancer has not been identified
ADRA1A | 148 | Adrenergic, alpha-1A-, receptor | 90
AKAP12 | 9590 | A kinase (PRKA) anchor protein (gravin) 12 | 90
CCNE1 | 898 | Cyclin E1 | 75
CDCA1 | 83540 | Cell division cycle associated 1 | 90
DKK3 | 27122 | Dickkopf homolog 3 (Xenopus laevis) | 75
ERG | 2078 | v-ets erythroblastosis virus E26 oncogene like (avian) | 100
FEZ1 | 9638 | Fasciculation and elongation protein zeta 1 (zygin I) | 90
FPRL1 | 2358 | Formyl peptide receptor-like 1 | 70
GSN | 2934 | Gelsolin (amyloidosis, Finnish type) | 75
KL | 9365 | Klotho | 80
RASGRF1 | 5923 | Ras protein-specific guanine nucleotide-releasing factor 1 | 95
TBX3 | 6926 | T-box 3 (ulnar mammary syndrome) | 50
TNFSF11 | 8600 | Tumor necrosis factor (ligand) superfamily, member 11 | 55
ZBTB16 | 7704 | Zinc finger and BTB domain containing 16 | 80

Genes with novel splice variants
ANKRD1 | 27063 | Ankyrin repeat domain 1 (cardiac muscle) | 75
APCDD1 | 147495 | Adenomatosis polyposis coli down-regulated 1 | 65
ARHGEF3 | 50650 | Rho guanine nucleotide exchange factor (GEF) 3 | 85
ARMET | 7873 | Arginine-rich, mutated in early stage tumors | 85
CDH1 | 999 | Cadherin 1, type 1, E-cadherin (epithelial) | 60
CDH3 | 1001 | Cadherin 3, type 1, P-cadherin (placental) | 40
CXCL5 | 6374 | Chemokine (C-X-C motif) ligand 5 | 60
EMP2 | 2013 | Epithelial membrane protein 2 | 65
FSTL3 | 10272 | Follistatin-like 3 (secreted glycoprotein) | 70
GATA2 | 2624 | GATA binding protein 2 | 95
GATA6 | 2627 | GATA binding protein 6 | 90
HCK | 3055 | Hemopoietic cell kinase | 80
IGFBP6 | 3489 | Insulin-like growth factor binding protein 6 | 75
ITGA5 | 3678 | Integrin, alpha 5 (fibronectin receptor, alpha polypeptide) | 55
KLF2 | 10365 | Kruppel-like factor 2 (lung) | 85
NFIL3 | 4783 | Nuclear factor, interleukin 3 regulated | 90
PGR | 5241 | Progesterone receptor | 70
PIAS1 | 8554 | Protein inhibitor of activated STAT, 1 | 60
PTHR1 | 5745 | Parathyroid hormone receptor 1 | 100
RAMP2 | 10266 | Receptor (calcitonin) activity modifying protein 2 | 100
RASIP1 | 54922 | Ras interacting protein 1 | 100
SMAD6 | 4091 | SMAD, mothers against DPP homolog 6 (Drosophila) | 90
SPARC | 6678 | Secreted protein, acidic, cysteine-rich (osteonectin) | 80
SRPX | 8406 | Sushi-repeat-containing protein, X-linked | 100
SSBP2 | 23635 | Single-stranded DNA binding protein 2 | 80
TBX2 | 6909 | T-box 2 | 95
TGFBR3 | 7049 | Transforming growth factor, beta receptor III (betaglycan, 300kDa) | 55
VEGFC | 7424 | Vascular endothelial growth factor C | 85

Known or novel splice variants were based on the Entrez Gene database and relevance to cancer was based on a PubMed search. Both were performed on 31 December 2007.

Verification of exon array data

The first step in verification of alternative splicing was to verify the exon array expression data for regions of genes (exons) that appeared to have differential expression in tumor and normal samples. Eleven genes (ARMET, CDKN1A, CDKN2A, CDKN2B, CEACAM1, ERG, FOXP1, KLF2, RASIP1, VEGFC and CDH3) were selected for qRT-PCR verification based on the frequency of alternative splicing and interest level from a review of the literature. Expression of two exons from differentially expressed portions of each gene was quantified using qRT-PCR on the same samples used in the array experiments (Supplementary Figures). Array data was considered verified if qRT-PCR demonstrated the same, directional difference in expression between the different regions of the gene, or if novel alternative transcripts were specifically identified based on PCR product size. For 4 of the 11 genes (CEACAM1, ERG, RASIP1 and VEGFC), the qPCR data was in clear agreement with the array data. One gene (CDKN2A) demonstrated differential expression that was statistically borderline (Supplementary Figures). In addition, while CDH3 exon-specific qPCR did not appear to validate differential expression, a novel alternative transcript was identified that matched the array data (described in detail below). Thus, 6 of the 11 selected genes were considered validated while the remaining five genes (ARMET, CDKN1A, FOXP1, KLF2 and CDKN2B) were not verifiable by PCR. In three of five cases, we noted that the exon array probes that did not show differential expression were located in very G-C rich regions, raising the possibility of cross hybridization as the reason for erroneous exon array data. Of the six validated genes, four (ERG, CDH3, CDKN2A, and CEACAM1) were explored in more detail.

Differential expression of ERG splice variants in lung adenocarcinoma

Exon array analysis of ERG demonstrated differential expression of splice variants in 20 of 20 tumor/normal tissue pairs. Furthermore, the expression pattern observed was consistent with differential expression of the two known transcripts of ERG curated in the RefSeq database (Figure 1a and b). Variant 1 (NM_182918.2) encodes a short form with a unique 5′ first exon and an additional exon towards the 3′ end while variant 2 (NM_004449.3) encodes a long form with three unique exons on the 5′ end. In order to validate the array data, we first designed two sets of qRT-PCR primers to quantify the differential expression of unique versus shared exons. The results demonstrate that the qRT-PCR data correlated with the array data in the same individuals (Supplementary Figures). Secondly, to prove that two transcripts exist in this set of lung tumor and matched normal samples, we designed a common primer set that spanned exon 7-8 of variant 2 of ERG and exon 5-6-7 of variant 1. These data demonstrate the existence of both ERG variants (Figure 1c) in both tumor and normal tissue. Two additional transcript variants, identified in the UCSC genome browser (uc002ywz.1 and uc002yxc.1), were also evaluated but no expression was found in either tumor or normal tissues. Finally, to specifically demonstrate differential expression of variants 1 and 2 we designed primer sets unique to the two transcripts and quantified expression of each. This was performed in 35 tumor/normal pairs (which included the 20 pairs used in the original array analysis) (Figure 1c). We observed that ERG variant 2 was significantly overexpressed (mean tumor/normal ratio 4.73; paired t-test P-value = 0.0005) in tumor samples compared to paired normal while ERG variant 1 was slightly under-expressed in tumor samples compared to normal (ratio = 0.74; P = 0.0186).

[Figure 1 panels: (a) exon structure of variant 1 and variant 2 with common, V1 and V2 forward and reverse primer locations; (c) gel showing 201 bp (variant 1) and 129 bp (variant 2) products in tumor (T) and normal (N) lanes for patients 236, 260, 416, 419, 432, 459, 519 and 522; (d) ERG V1 (mean T/N ratio=0.74, P=0.0186), ERG V2 (mean T/N ratio=4.73, P=0.0005), T/N ratio V1 vs V2 (P=1.8478e-15).]

Figure 1. Exon array analysis and PCR analysis for alternative transcript variants of ERG. (a) Exon structure for ERG variant 1 (NM_182918.2) and variant 2 (NM_004449.3) showing the location of PCR primers used for expression analyses. (b) Partek GS alternative splice analysis of exon expression data in 20 patients. The graph shows mean expression value and standard error for each probe set in tumour (blue) and normal (normal) groups. (c) Verification of PCR product size difference for two variants using primers 1 and 3. (d) Quantification of the two variants using primer set 1 and 2 for variant 1 and primers 4 and 5 for variant 2 in tumour/normal paired samples (tissue pairs are joined by solid lines) from 35 patients.

Identification of a novel CDH3 transcript variant in lung adenocarcinoma

CDH3 is encoded by 16 exons and has only one known transcript in the RefSeq database. Exon array data suggested alternative splicing of exon 2 since all other exons were expressed considerably higher in tumor than in normal tissues. Interestingly, this was observed only in female patients (8 of 14 cases) and not in any of six male patients (Supplementary Figures). Given the established gender differences in lung adenocarcinoma incidence and survival (37), this finding was evaluated further. qRT-PCR, with primers in exon 2 and in exon 3, verified that exon 3 was expressed at much higher levels (mean 24-fold tumor/normal) in tumor compared with normal in these eight patients while exon 2 was expressed only 3-fold higher in tumor versus normal. However, exon 2 and exon 3 were expressed similarly with a 2- to 3-fold difference in tumor versus normal in the other six female patients and in all male patients, thus validating the array data. Next, in order to further characterize the area of exon 1-3, we designed a forward primer in exon 1 and reverse primer across the junction of exons 3 and 4. RT-PCR products, visualized on a gel, showed the expected 283 bp band for the normal transcript and a shorter (168 bp) band (Figure 2a). Sequencing of these two bands (Figure 2b) confirmed the presence of the normal transcript (283 bp band) and the presence of a shorter transcript missing exon 2 (168 bp). To evaluate expression of this novel alternative transcript (referred to as CDH3 E2 skipping transcript, or variant transcript, from this point forward), qRT-PCR was performed with primers unique to each CDH3 transcript. This was performed in paired tumor/normal tissue from 35 patients and the results are shown in Figure 2. Based on a threshold of 40 cycles, four patients were negative for the E2 skipping transcript in both tumor and normal tissue, one patient was negative in tumor and positive in normal, 19 patients showed expression in tumor but not in normal, and 11 patients showed expression in both tissue types (Figure 2c). In all 11 patients, expression of the E2 skipping transcript was higher in tumor than in normal tissues. Therefore the E2 skipping transcript was overexpressed in 86% (30 of 35) of tumors (Figure 2d). However, in this expanded patient cohort there was no statistically significant association between gender and the frequency or expression level of the E2 skipping transcript. Finally, the normal CDH3 transcript was also found to be significantly overexpressed in tumor compared with normal (Figure 2d).

[Figure 2 panels: (a) gel showing the 283 bp normal transcript and the 168 bp novel transcript in tumor (T) and normal (N) lanes for patients 236, 260, 416, 419, 432, 459, 519 and 522; (b) sequence traces across the exon 1/exon 2/exon 3 and exon 1/exon 3 junctions; (d) CDH3 normal transcript (mean T/N ratio=20.46, p=6.4786e-10) and CDH3 E2 skipping variant (mean T/N ratio=42.86, p=1.9347e-10). Panel (c) is reproduced below.]

E2 skipping variant in tumor | E2 skipping variant in normal | Male | Female | Total | Frequency (%)
− | − | 2 | 2 | 4 | 11.4
+ | − | 6 | 13 | 19 | 54.4
− | + | 0 | 1 | 1 | 2.8
+ | + | 6 | 5 | 11 | 31.4
Total | | 14 | 21 | 35 | 100.0

Figure 2. Identification, validation, and quantification of novel CDH3 transcript variants. (a) Identification of an alternative CDH3 transcript using PCR primers located in exon 1 and across the exon 3-4 boundary. (b) DNA sequencing results for the two PCR products demonstrating skipping of exon 2. (c) Frequency of the E2 skipping transcript expressed in 35 patients. (d) Specific quantification of the two variants in tumour/normal paired samples (tissue pairs are joined by solid lines) from 35 patients.

Identification of CDKN2A transcript variant 2 in lung adenocarcinoma

RefSeq databases indicate several CDKN2A transcript variants which differ in their first exons. At least three alternatively spliced variants, each encoding distinct proteins, have been reported with variants 1, 3, and 4 encoding P16, Isoform 3, and ARF respectively (Figure 3a). The data from the exon arrays (Figure 3b) and qRT-PCR verification (Supplementary Figures) suggested the presence of a further splice variant of this gene. Analysis of the Affymetrix probe design and NCBI GenBank databases identified a cloned sequence originating from testis tissue (Accession #: BG717152, GenBank ID: 13996339) and a mRNA sequence, CDKN2A transcript variant 2 (Accession #: NM_058196, GenBank ID: 17738295). However, the record was temporarily removed by NCBI staff since the variant has not been confirmed. Four unique PCR primer sets were designed to assess all four possible transcript variants in this set of lung tumor/normal tissue. Variant 3 was not expressed in lung tissues but the three other variants were all expressed (Figure 3c). Variant 2 was also expressed in lung cancer cell lines
H1299 and H2DB10, and Universal Reference (UR) RNA which contains 10 combined cancer cell line RNAs, but not in the lung cancer cell line A549. Furthermore, DNA sequencing of the 377 bp variant 2 PCR product confirmed that the transcript encodes a portion of intron 1, consistent with the reported variant 2 mRNA (Figure 3d). Finally, quantification of three variants in 33 paired tumor and normal tissues showed that variant 1 (p16) was not expressed significantly differently between tumor and normal (mean tumor/normal ratio = 1.75; P = 0.1439). Both variant 2 (Isoform 2, mean tumor/normal ratio = 20.1; P = 0.0186) and variant 4 (ARF, mean tumor/normal ratio = 103.01; P = 0.0035) were significantly overexpressed in tumor compared to normal (Figure 3e).

[Figure 3 panels: (a) exon structure (E1, E2, E3) of V1 (p16), V2 (Isoform2), V3 (Isoform3) and V4 (Arf) with forward and reverse primer locations; (c) gel showing p16, Isoform3, Arf and Isoform2 products with bands at 377, 190 and 148 bp in tumor (T) and normal (N) lanes and in cell lines H2DB10, A549, H1299 and UR; (d) sequence trace reading from intron 1-2 into exon 2 and from exon 2 into exon 3; (e) P16 (mean T/N ratio=1.75, P=0.1439), Arf (mean T/N ratio=103.01, P=0.0035), Isoform2 (mean T/N ratio=20.10, P=0.0186).]

Figure 3. Exon array analysis and PCR analysis for alternative transcript variants of CDKN2A. (a) Exon structure for CDKN2A transcript variants 1 (NM_000077), 2 (NM_058196), 3 (NM_058197) and 4 (NM_058195) indicating the location of PCR primers used for expression analyses. (b) Partek GS alternative splice analysis of exon expression data in 20 patients. The graph shows mean expression value and standard error for each probe set in tumour (blue) and normal (normal) groups. (c) Verification of PCR product size difference for three variants using primers 1 and 3 (variant 2/isoform2), 2 and 4 (variant 1/p16) and 2 and 6 (variant 4/Arf). Variant 3 was not detected using primers 2 and 5. (d) DNA sequencing results of isoform 2. Sequencing shows intronic sequence upstream of exon 2 but also reads directly into exon 3 eliminating the possibility of genomic DNA as the source of the PCR product. (e) Quantification of variant P16 (using primers 2 and 4), Arf (2 and 6), and Isoform 2 (2 and 3) in tumour/normal paired samples (tissue pairs are joined by solid lines) from 33 patients.

Quantification of CEACAM1 transcript variants in NSCLC

CEACAM1 is encoded by nine exons with two known variants in the RefSeq database. Variant 1 uses all nine exons while variant 2 is missing exon 7 (Figure 4c). Our array data demonstrated higher expression of exons 1 through 6 as well as 8 and 9 in tumors versus normal while expression of exon 7 was essentially equal (Supplementary Figures). This was observed in 14 of 20 (70%) tissue pairs and suggested differential expression of the two known CEACAM1 variants in lung tumor and normal tissues as reported previously (38). Furthermore, the exon array data was verified by qRT-PCR using PCR primer sets designed to amplify exon 7 and exon 8 specifically in the same samples (Supplementary Figures). Expression of CEACAM1 was reduced in malignant tissues as compared with corresponding normal tissues deriving from breast (39), prostate (40), colon (41), and endometrium (42). These findings indicated that CEACAM1 might suppress carcinogenesis. However, in contrast, high expression of CEACAM1 protein was seen in lung tumor tissues and also correlated with poor survival in lung cancer (43–45). Since these studies did not examine expression of the two CEACAM1 variants specifically, we designed qRT-PCR assays unique to variants 1 and 2 of CEACAM1 and analyzed expression of each in lung, breast and colon tumors plus normal tissues from each organ site. In 35 lung adenocarcinoma patients we found that CEACAM1 variant 2 was highly and significantly overexpressed in paired tumor versus normal (tumor/normal ratio = 36.42; P = 0.000002) while variant 1 was only slightly overexpressed in tumor (tumor/normal ratio = 3.53; P = 0.0014). Furthermore, analysis of an additional 43 lung tumors, including squamous cell cancers, revealed expression levels indistinguishable from the original 35 adenocarcinoma samples (Figure 4a). In the breast and colon tissues however, we found no significant differences in expression of either variant 1 or variant 2 in tumor versus normal (unmatched) samples (Figure 4b). Given this data we postulated that overexpression of CEACAM1 variant 2 in lung cancer may be responsible for the survival differences observed between tumor types. However, an analysis of disease-free survival in our cohort of 78 lung cancer patients including 48 stage I and 30 higher stages (median follow up of 24 months) showed no association of CEACAM1 variant 1 or 2 expression with patient survival (Cox regression P-values 0.715 and 0.536, respectively).

[Figure 4 panels: (a) CEACAM1 V1 and CEACAM1 V2 expression in tumor (N=43), paired tumor (N=35) and paired normal (N=35) samples, and paired tumor/normal ratio, V1 vs V2 (P=6.0221e-10); (b) CEACAM1 V1 and V2 expression in tumor and normal samples from colon and breast; (c) exon structure with V1 and V2 forward and reverse primer locations.]

Figure 4. Quantification of CEACAM1 variants in NSCLC, colon and breast cancer patients. (a) Expression of CEACAM1 variant 1 and variant 2 in 35 tumour/normal paired samples from lung adenocarcinoma patients (tissue pairs are joined by solid lines) and in 43 lung adenocarcinoma (n = 11) and squamous cell carcinoma (n = 32) samples only (without matched normals). The third graph shows the tumour/normal expression ratio for variant 1 and 2 in the 35 matched tissue pairs. (b) Expression of variant 1 and variant 2 in non-paired tumour and normal samples from colon and breast cancer patients (10 patients each). (c) Exon location and PCR primer locations for qPCR of variant 1 (NM_001712; primers 1 and 2), variant 2 (NM_001024912; primers 3 and 4).

DISCUSSION

In this study we have performed an extensive identification and verification of alternative splice variant gene expression in NSCLC. To our knowledge, this study is the first such genome-wide analysis of alternative splicing events in NSCLC or any other tumor type. Our results indicate that approximately 13% of the 17 800 core RefSeq genes appear to have alternative transcripts that are differentially expressed between lung adenocarcinoma and adjacent normal lung tissue. Furthermore, the largest subsets of these alternatively spliced genes appear to be cancer related and/or involved in cellular processes such as growth and proliferation. For some genes, alternative transcripts have already been identified but we now demonstrate their differential expression in cancer. In many cases however, our microarray data indicates the presence of differentially expressed alternative transcripts that are currently unidentified. Thus it appears that differential expression of alternative transcripts is frequent in NSCLC and that this may be a valuable resource for the development of novel diagnostic, prognostic and therapeutic tools.

While the ability to analyze alternative transcript expression on a genome-wide scale is very powerful, verification and validation of this data is labor intensive. For this reason, we chose to focus on genes that have previously been associated with cancer, and where differential expression occurred in >50% of tumor/normal tissue pairs. In total, 11 genes were examined and we were able to validate the array data for six of them. Thus, verification and validation of data from the exon arrays is clearly required. Alternative transcript expression for four of these genes was studied in more detail.

The ERG (V-ETS avian erythroblastosis virus E26 oncogene homolog) protein shares significant homology with both 5′ and 3′ regions of the viral ETS1 oncogene, ETS1, suggesting that it belongs to the ETS oncogene family. ERG is located at chromosome band 21q22 and has been identified as the target of genomic rearrangement events in acute myeloid leukemia (46), Ewing's sarcoma (47) and prostate cancer (48–50). In acute myeloid leukemia (AML) and Ewing's sarcoma ERG has been associated with several translocation fusion partners including ELF4, FUS1 and EWSR1 (46,51,52). Furthermore, high expression of ERG in the absence of karyotypic rearrangement or amplification was demonstrated to be an adverse prognostic factor in patients with AML (53). In prostate cancer, ERG is frequently fused to a nearby gene, TMPRSS2, resulting in androgen regulation of ERG, and several reports now indicate that the presence of this fusion is a poor prognostic indicator in prostate cancer (54,55). Interestingly, none of the reports cited above discuss the existence of two ERG variants and how these are related to the fusion product. Our analysis of variant-specific expression clearly showed that variant 2 has much higher expression in tumor compared to the paired normal lung tissue while variant 1 has similar or lower expression in tumors. Thus it seems that the oncogenic effect of ERG may be exerted through functions encoded by variant 2 and expression of fusion gene products should also be closely evaluated for which ERG variant is present. Further investigations are required to identify the functional differences between the two variants and could lead to more targeted drug discovery.

The cadherins are a family of transmembrane proteins that mediate calcium-dependent cell-cell adhesion at adherens junctions. The cytoplasmic domain of cadherins binds to α and γ catenins and is linked to the actin cytoskeleton via α catenin (56). These interactions are vital for stable cell-cell interactions and maintenance of normal cell physiology. In cancer, disruption of the adherens junctions, for example by downregulation or inactivating mutation of cadherins, can result in epithelial-to-mesenchymal transition, increased proliferation, invasion and metastasis (56,57). In part, this may be mediated by the release and accumulation of β catenin which, when translocated to the nucleus, induces transcription of genes such as cyclin D1 and c-myc.

While the role of the prototypic cadherin, E-cadherin (CDH1), as a classic tumor suppressor gene in cancer is well established, the role of P-cadherin remains unclear as it behaves differently depending on the tumor type being studied. For example, in melanoma, the loss of P-cadherin (and E-cadherin) allows invasion and migration of cells and thus P-cadherin appears to be acting as a pro-adhesion tumor suppressor (58,59). In breast cancer however, high expression of P-cadherin strongly correlated with high histologic grade, increased proliferation and poor patient survival (60,61). Furthermore, in pancreatic cancer cell lines, overexpression of P-cadherin resulted in increased cell motility, cytoplasmic accumulation of catenins and activation of the Rho GTPases, Rac1 and Cdc42 (62).

status in hepatocellular carcinoma (67) and worse outcome in B-cell lymphomas (66). Similarly, overexpression of p16 has also been observed and has been associated with progression and poor survival in ovarian cancer (69), prostate cancer (70) and breast cancer (71). While overexpression of p16 and ARF appears to contradict their known cellular functions as tumor suppressors, mechanisms have been proposed whereby this event may be explained through activating mutations in Rb or induction of myc and ras (68,72,73). However, our data suggests an alternative: that the variant 2 transcript may account for the previously observed overexpression. Variant 2 was

In our study we found overexpression of P-cadherin in lung tumors compared to normal lung but also identified overexpression of an alternative splice variant in which exon 2 is missing. Analysis of the resulting mRNA indicates that the normal ATG initiation codon is placed out of frame and would result in a truncated protein after only 27 amino acids. This would clearly result in an inactive protein and would fit with a tumor suppressor function for P-cadherin in lung cancer if it were not for the fact that full length P-cadherin mRNA is actually overexpressed in our tumors. However, upon further analysis we identified several alternative in frame ATG codons downstream of the known translation start site.
Furthermore, at least originally believed to give rise to a new isoform (Isoform two of these putative alternative start sites have kozak 2) of P16 with the first amino acid encoded by an in frame sequences that are believed to be active in other genes. ATG that is present in the original exon 2. However, we Protein translation initiated at either of these sites would also identified an alternative ATG codon that is in the result in a P-cadherin protein lacking the signal peptide extended exon 2 and is in frame with ARF. This alterna- and most of the extracellular domain, while retaining the tive ATG has a reasonably good Kozak sequence transmembrane domain, juxtamembrane domain and the (CCGTCATGC) and, being upstream of the putative catenin binding domain. If such a protein were to be over- p16 isoform 2 start site, would presumably dominate expressed in tumors one can easily envision disruption of translation initiation. This putative ARF isoform would the adherens junctions in a dominant manner leading to lack the amino terminal portion of ARF and would there- catenin accumulation and tumorigenesis. fore be unable to bind TBP-1, E2F, Myc, FoxM1, CTBP1 Its is well known that multiple transcripts are or mdm2 (74) and may be unable to block cell cycle transcribed from the CDKN2A (cyclin-dependent kinase progression. However others have shown that the carboxy inhibitor 2A) locus. CDKN2A is an extensively studied terminus of artificially truncated ARF still accumulates tumor suppressor locus that is frequently mutated or in the nucleolus (75–77) and thus this putative ARF iso- deleted in a wide variety of tumor types. Exploration of form could theoretically act as a dominant negative, thus the RefSeq database identified four transcript variants explaining how overexpression of ARF may be pro- potentially transcribed from this locus of which three are tumorigenic. considered verified. 
Variant 1 gives rise to the p16 protein CEACAM1 [carcinoembryonic antigen-related cell and variant 4 gives rise to the alternative reading frame adhesion molecule 1 (biliary glycoprotein)] is a cell–cell p14/ARF protein. Variant 4 also gives rise to a shorter adhesion molecule that also plays a role in signal transduc- protein product (p19smARF) which results from an alter- tion. Two common variants are known for CEACAM1; native translation start site (63). Variant 3 gives rise to one with a long cytoplasmic domain (L form or variant 1) a longer protein that shares the same reading frame as and one with a short cytoplasmic domain (S form or p16 and appears to be specifically expressed in the variant 2). The expression of CEACAM1 in cancer has pancreas. In addition, another transcript variant (p16g) been extensively studied but early reports appeared to be was recently identified (64) but has yet to be curated in contradictory. For example, reduced expression of the RefSeq databases. Finally, variant 2 lacks exons 1a CEACAM1 was reported in breast, colon, prostate and and 1b, and exon 2 is slightly longer due to inclusion of endometrial cancer (39,41,78) and CEACAM1 was there- an additional 100 bases of intronic sequence. Variant 2 fore considered to be a negative regulator of tumor cell may also have a shorter 3 UTR than p16 or Arf. growth. However, in melanoma (79) and lung cancer Variant 2 was originally cloned from testis tissue but has (43,44), several reports indicated that CEACAM1 was been temporarily removed by RefSeq staff for further overexpressed in tumors and that this was associated evaluation. with disease progression and poor outcome. In 1997 In cancer, inactivation of the p16INK4a/ARF tumor Turbide et al. (80) found that the L form of CEACAM1 suppressor genes is frequently mediated through genomic exhibited a tumor suppressive phenotype and that this was deletion, promoter methylation or inactivating mutation dominant over expression of the S form. 
Furthermore, leading to loss of p53 and Rb dependent cell cycle regula- using semi-quantitative RT-PCR Wang et al. (38) found tion (65). In NSCLC, loss of heterozygosity and/or homo- that the L form of CEACAM1 predominated in normal zygous deletion of the CDKN2A locus on chromosome lung while the S form appeared more abundant in tumors. 9p21 has been reported at frequencies up to 40% (8). Thus they proposed that isoform switching rather than In our study, 30% of tumors demonstrated reduced CEACAM1 downregulation occurs in NSCLC as opposed expression of all three measured transcripts (p16, ARF to other tumor types. Our quantitative analysis clearly and variant 2) and this is likely a result of genomic demonstrates a switch in abundance from the L form deletions. However, in the remaining tumors expression (variant 1) to the S form (variant 2) in NSCLC and we of ARF and variant 2 (but not p16) were significantly also demonstrate that no such switch appears to occur in higher than in paired normal tissue. Overexpression breast cancer or colon cancer. Furthermore, we also ana- of ARF in cancer has now been reported several times lyzed a publicly available GeneChip Human Exon 1.0 ST (66–68) and has been associated with poor differentiation array data set from colon (33) and found no significant Nucleic Acids Research, 2008, Vol. 36, No. 20 6545 8. Weir,B.A., Woo,M.S., Getz,G., Perner,S., Ding,L., Beroukhim,R., differential expression of CEACAM1 variants in Lin,W.M., Province,M.A., Kraja,A., Johnson,L.A. et al. (2007) those 10 pairs of colon tumor/normal samples (data not Characterizing the cancer genome in lung adenocarcinoma. Nature, shown). Thus our findings support the hypothesis that the 450, 893–898. tumor suppressive or oncogenic effects of CEACAM1 are 9. Maniatis,T. and Tasic,B. (2002) Alternative pre-mRNA splicing and splice variant dependent and that expression of the two proteome expansion in metazoans. Nature, 418, 236–243. 10. Black,D.L. 
(2003) Mechanisms of alternative pre-messenger RNA variants is differentially regulated in different tissue types. splicing. Annu. Rev. Biochem., 72, 291–336. In conclusion, our data demonstrates that differential 11. Mironov,A.A., Fickett,J.W. and Gelfand,M.S. (1999) Frequent expression of alternative splice variants is a common alternative splicing of human genes. Genome Res., 9, 1288–1293. event in NSCLC. It also shows that in addition to identi- 12. Brett,D., Hanke,J., Lehmann,G., Haase,S., Delbruck,S., Krueger,S., Reich,J. and Bork,P. (2000) EST comparison indicates 38% of fication of novel, cancer-related splice variants, additional human mRNAs contain possible alternative splice forms. FEBS information can be gained even with regard to extensively Lett., 474, 83–86. studied, cancer-related genes. Splice variant expression 13. Kan,Z., States,D. and Gish,W. (2002) Selecting for functional should be considered in future genome-wide expression alternative splices in ESTs. Genome Res., 12, 1837–1845. studies and may lead to novel diagnostic, prognostic or 14. Modrek,B., Resch,A., Grasso,C. and Lee,C. (2001) Genome-wide detection of alternative splicing in expressed sequences of human therapeutic strategies in the fight against cancer. genes. Nucleic Acids Res., 29, 2850–2859. (GeneChip Human Exon 1.0 ST array cell files along 15. Sharp,P.A. (1994) Split genes and RNA splicing. Cell, 77, 805–815. with GC-RMA data from core gene probsets and patient 16. Collesi,C., Santoro,M.M., Gaudino,G. and Comoglio,P.M. (1996) information have been submitted to GEO databases and A splicing variant of the RON transcript induces constitutive tyrosine kinase activity and an invasive phenotype. Mol. Cell Biol., GEO Accession # is GSE12236). 16, 5518–5526. 17. Gayther,S.A., Barski,P., Batley,S.J., Li,L., de Foy,K.A., Cohen,S.N., Ponder,B.A. and Caldas,C. 
(1997) Aberrant splicing SUPPLEMENTARY DATA of the TSG101 and FHIT genes occurs frequently in multiple Supplementary Data are available at NAR Online. malignancies and in normal tissues and mimics alterations previously described in tumours. Oncogene, 15, 2119–2126. 18. Scotet,E. and Houssaint,E. (1998) Exon III splicing switch of ACKNOWLEDGEMENT fibroblast growth factor (FGF) receptor-2 and -3 can be induced by FGF-1 or FGF-2. Oncogene, 17, 67–76. Lung cancer cell lines were kindly provided by Dr Stuart 19. Ge,K., DuHadaway,J., Du,W., Herlyn,M., Rodeck,U. and Aaronson at the Mount Sinai School of Medicine. Prendergast,G.C. (1999) Mechanism for elimination of a tumor suppressor: aberrant splicing of a brain-specific exon causes loss of function of Bin1 in melanoma. Proc. Natl Acad. Sci. USA, 96, 9689–9694. FUNDING 20. Baudry,D., Hamelin,M., Cabanis,M.O., Fournet,J.C., Funding for open access charge: The work was funded in Tournade,M.F., Sarnacki,S., Junien,C. and Jeanpierre,C. (2000) WT1 splicing alterations in Wilms’ tumors. Clin. Cancer Res., 6, part by NIH/NCI grant R01 CA94059. 3957–3965. 21. Slawin,K.M., Shariat,S.F., Nguyen,C., Leventis,A.K., Song,W., Conflict of interest statement. None declared. Kattan,M.W., Young,C.Y., Tindall,D.J. and Wheeler,T.M. (2000) Detection of metastatic prostate cancer using a splice variant- specific reverse transcriptase-polymerase chain reaction assay for REFERENCES human glandular kallikrein. Cancer Res., 60, 7142–7148. 22. Liu,H.X., Cartegni,L., Zhang,M.Q. and Krainer,A.R. (2001) 1. Jemal,A., Murray,T., Ward,E., Samuels,A., Tiwari,R.C., A mechanism for exon skipping caused by nonsense or missense Ghafoor,A., Feuer,E.J. and Thun,M.J. (2005) Cancer statistics. mutations in BRCA1 and other genes. Nat. Genet., 27, 55–58. CA Cancer J. Clin., 55, 10–30. 23. Lukas,J., Gao,D.Q., Keshmeshian,M., Wen,W.H., Tsao-Wei,D., 2. Ries,L.A.G., Melbert,D., Krapcho,M., Mariotto,A., Miller,B.A., Rosenberg,S. and Press,M.F. 
(2001) Alternative and aberrant Feuer,E.J., Clegg,L., Horner,M.J., Howlader,N., Eisner,M.P. et al. messenger RNA splicing of the mdm2 oncogene in invasive breast (2006) SEER Cancer Statistics Review, 1975–2004. National Cancer cancer. Cancer Res., 61, 3212–3219. Institute, Bethesda, MD. 24. Kwabi-Addo,B., Ropiquet,F., Giri,D. and Ittmann,M. (2001) 3. Bhattacharjee,A., Richards,W.G., Staunton,J., Li,C., Monti,S., Alternative splicing of fibroblast growth factor receptors in human Vasa,P., Ladd,C., Beheshti,J., Bueno,R., Gillette,M. et al. (2001) prostate cancer. Prostate, 46, 163–172. Classification of human lung carcinomas by mRNA expression 25. Bartel,F., Taubert,H. and Harris,L.C. (2002) Alternative and profiling reveals distinct adenocarcinoma subclasses. Proc. Natl aberrant splicing of MDM2 mRNA in human cancer. Cancer Cell, Acad. Sci. USA, 98, 13790–13795. 2, 9–15. 4. Beer,D.G., Kardia,S.L., Huang,C.C., Giordano,T.J., Levin,A.M., 26. Barbour,A.P., Reeder,J.A., Walsh,M.D., Fawcett,J., Antalis,T.M. Misek,D.E., Lin,L., Chen,G., Gharib,T.G., Thomas,D.G. et al. and Gotley,D.C. (2003) Expression of the CD44v2-10 isoform (2002) Gene-expression profiles predict survival of patients with confers a metastatic phenotype: importance of the heparan sulfate lung adenocarcinoma. Nat. Med., 8, 816–824. 5. Potti,A., Mukherjee,S., Petersen,R., Dressman,H.K., Bild,A., attachment site CD44v3. Cancer Res., 63, 887–892. Koontz,J., Kratzke,R., Watson,M.A., Kelley,M., Ginsburg,G.S. 27. Steinman,H.A., Burstein,E., Lengner,C., Gosselin,J., Pihan,G., et al. (2006) A genomic strategy to refine prognosis in early-stage Duckett,C.S. and Jones,S.N. (2004) An alternative splice form of non-small-cell lung cancer. N. Engl. J. Med., 355, 570–580. Mdm2 induces p53-independent cell growth and tumorigenesis. 6. Chen,H.Y., Yu,S.L., Chen,C.H., Chang,G.C., Chen,C.Y., Yuan,A., J. Biol. Chem., 279, 4877–4886. Cheng,C.L., Wang,C.H., Terng,H.J., Kao,S.F. et al. (2007) A five- 28. Brinkman,B.M. 
(2004) Splice variants as cancer biomarkers. gene signature and clinical outcome in non-small-cell lung cancer. Clin. Biochem., 37, 584–594. N. Engl. J. Med., 356, 11–20. 29. Venables,J.P. (2004) Aberrant and alternative splicing in cancer. 7. Tonon,G., Wong,K.K., Maulik,G., Brennan,C., Feng,B., Zhang,Y., Cancer Res., 64, 7647–7654. Khatry,D.B., Protopopov,A., You,M.J., Aguirre,A.J. et al. (2005) 30. Kalnina,Z., Zayakin,P., Silina,K. and Line,A. (2005) Alterations High-resolution genomic profiles of human lung cancer. Proc. of pre-mRNA splicing in cancer. Genes Chromosomes. Cancer, 42, Natl Acad. Sci. USA, 102, 9625–9630. 342–357. 6546 Nucleic Acids Research, 2008, Vol. 36, No. 20 31. Venables,J.P. (2006) Unbalanced alternative splicing and its cancer with implications for molecular diagnosis. Mod. Pathol., 20, 467–473. significance in cancer. Bioessays, 28, 378–386. 50. Liu,W., Ewing,C.M., Chang,B.L., Li,T., Sun,J., Turner,A.R., 32. Skotheim,R.I. and Nees,M. (2007) Alternative splicing in cancer: Dimitrov,L., Zhu,Y., Sun,J., Kim,J.W. et al. (2007) Multiple noise, functional, or systematic? Int. J. Biochem. Cell Biol., 39, genomic alterations on 21q22 predict various TMPRSS2/ERG 1432–1449. 33. Gardina,P.J., Clark,T.A., Shimada,B., Staples,M.K., Yang,Q., fusion transcripts in human prostate cancers. Genes Chromosomes. Veitch,J., Schweitzer,A., Awad,T., Sugnet,C., Dee,S. et al. (2006) Cancer, 46, 972–980. Alternative splicing and differential gene expression in colon cancer 51. Panagopoulos,I., Mandahl,N., Mitelman,F. and Aman,P. (1995) Two distinct FUS breakpoint clusters in myxoid liposarcoma and detected by a whole genome exon array. BMC Genomics, 7, 325. acute myeloid leukemia with the translocations t(12;16) and 34. Irizarry,R.A., Hobbs,B., Collin,F., Beazer-Barclay,Y.D., t(16;21). Oncogene, 11, 1133–1137. Antonellis,K.J., Scherf,U. and Speed,T.P. (2003) Exploration, 52. 
Desmaze,C., Brizard,F., Turc-Carel,C., Melot,T., Delattre,O., normalization, and summaries of high density oligonucleotide array Thomas,G. and Aurias,A. (1997) Multiple chromosomal mecha- probe level data. Biostatistics, 4, 249–264. nisms generate an EWS/FLI1 or an EWS/ERG fusion gene in 35. Godfrey,T.E., Kim,S.-H., Chavira,M., Ruff,D.W., Warren,R.S., Ewing tumors. Cancer Genet. Cytogenet., 97, 12–19. Gray,J.W. and Jensen,R.H. (2000) Quantitative mRNA expression 53. Marcucci,G., Baldus,C.D., Ruppert,A.S., Radmacher,M.D., analysis from formalin-fixed, paraffin-embedded tissues using 5 Mrozek,K., Whitman,S.P., Kolitz,J.E., Edwards,C.G., nuclease quantitative RT-PCR. J. Mol. Diagn., 2, 84–91. Vardiman,J.W., Powell,B.L. et al. (2005) Overexpression of the 36. Tassone,F., Hagerman,R.J., Taylor,A.K., Gane,L.W., Godfrey,T.E. ETS-related gene, ERG, predicts a worse outcome in acute myeloid and Hagerman,P.J. (2000) Elevated levels of FMR1 mRNA in leukemia with normal karyotype: a Cancer and Leukemia Group B carrier males: a new mechanism of involvement in the fragile-X study. J. Clin. Oncol., 23, 9234–9242. syndrome. Am. J. Hum. Genet., 66, 6–15. 54. Nam,R.K., Sugar,L., Yang,W., Srivastava,S., Klotz,L.H., 37. Wisnivesky,J.P. and Halm,E.A. (2007) Sex differences in lung cancer Yang,L.Y., Stanimirovic,A., Encioiu,E., Neill,M., Loblaw,D.A. survival: do tumors behave differently in elderly women? et al. (2007) Expression of the TMPRSS2:ERG fusion gene predicts J. Clin. Oncol., 25, 1705–1712. cancer recurrence after surgery for localised prostate cancer. 38. Wang,L., Lin,S.H., Wu,W.G., Kemp,B.L., Walsh,G.L., Hong,W.K. Br. J. Cancer, 97, 1690–1695. and Mao,L. (2000) C-CAM1, a candidate tumor suppressor gene, is 55. Demichelis,F., Fall,K., Perner,S., Andren,O., Schmidt,F., abnormally expressed in primary lung cancers. Clin. Cancer Res., 6, Setlur,S.R., Hoshida,Y., Mosquera,J.M., Pawitan,Y., Lee,C. et al. 2988–2993. (2007) TMPRSS2:ERG gene fusion associated with lethal prostate 39. 
Riethdorf,L., Lisboa,B.W., Henkel,U., Naumann,M., Wagener,C. cancer in a watchful waiting cohort. Oncogene, 26, 4596–4599. and Loning,T. (1997) Differential expression of CD66a (BGP), 56. Conacci-Sorrell,M., Zhurinsky,J. and Ben-Ze’ev,A. (2002) a cell adhesion molecule of the carcinoembryonic antigen family, The cadherin-catenin adhesion system in signaling and cancer. in benign, premalignant, and malignant lesions of the human J. Clin. Invest., 109, 987–991. mammary gland. J. Histochem. Cytochem., 45, 957–963. 57. Paredes,J., Correia,A.L., Ribeiro,A.S., Albergaria,A., Milanezi,F. 40. Luo,W., Tapolsky,M., Earley,K., Wood,C.G., Wilson,D.R., and Schmitt,F.C. (2007) P-cadherin expression in breast cancer: Logothetis,C.J. and Lin,S.H. (1999) Tumor-suppressive activity of a review. Breast Cancer Res., 9, 214. CD66a in prostate cancer. Cancer Gene Ther., 6, 313–321. 58. Sanders,D.S., Blessing,K., Hassan,G.A., Bruton,R., Marsden,J.R. 41. Neumaier,M., Paululat,S., Chan,A., Matthaes,P. and Wagener,C. and Jankowski,J. (1999) Alterations in cadherin and catenin (1993) Biliary glycoprotein, a potential human cell adhesion expression during the biological progression of melanocytic molecule, is down-regulated in colorectal carcinomas. Proc. Natl tumours. Mol. Pathol., 52, 151–157. Acad. Sci. USA, 90, 10744–10748. 59. Hsu,M.Y., Wheelock,M.J., Johnson,K.R. and Herlyn,M. (1996) 42. Bamberger,A.M., Riethdorf,L., Nollau,P., Naumann,M., Shifts in cadherin profiles between human normal melanocytes and Erdmann,I., Gotze,J., Brummer,J., Schulte,H.M., Wagener,C. and melanomas. J. Investig. Dermatol. Symp. Proc., 1, 188–194. Loning,T. (1998) Dysregulated expression of CD66a (BGP, 60. Paredes,J., Albergaria,A., Oliveira,J.T., Jeronimo,C., Milanezi,F. C-CAM), an adhesion molecule of the CEA family, in endometrial and Schmitt,F.C. (2005) P-cadherin overexpression is an indicator cancer. Am.. J. Pathol., 152, 1401–1406. of clinical outcome in invasive breast carcinomas and is associated 43. 
Laack,E., Nikbakht,H., Peters,A., Kugler,C., Jasiewicz,Y., Edler,L., with CDH3 promoter hypomethylation. Clin. Cancer Res., 11, Brummer,J., Schumacher,U. and Hossfeld,D.K. (2002) Expression 5869–5877. of CEACAM1 in adenocarcinoma of the lung: a factor of inde- 61. Peralta,S.A., Knudsen,K.A., Salazar,H., Han,A.C. and pendent prognostic significance. J. Clin. Oncol., 20, 4279–4284. Keshgegian,A.A. (1999) P-cadherin expression in breast carcinoma 44. Sienel,W., Dango,S., Woelfle,U., Morresi-Hauf,A., Wagener,C., indicates poor survival. Cancer, 86, 1263–1272. Brummer,J., Mutschler,W., Passlick,B. and Pantel,K. (2003) 62. Taniuchi,K., Nakagawa,H., Hosokawa,M., Nakamura,T., Elevated expression of carcinoembryonic antigen-related cell adhe- Eguchi,H., Ohigashi,H., Ishikawa,O., Katagiri,T. and Nakamura,Y. sion molecule 1 promotes progression of non-small cell lung cancer. (2005) Overexpressed P-cadherin/CDH3 promotes motility of Clin. Cancer Res., 9, 2260–2266. pancreatic cancer cells by interacting with p120ctn and activating 45. Dango,S., Sienel,W., Schreiber,M., Stremmel,C., Kirschbaum,A., rho-family GTPases. Cancer Res., 65, 3092–3099. Pantel,K. and Passlick,B. (2008) Elevated expression of carcinoem- 63. Reef,S. and Kimchi,A. (2006) A smARF way to die: a novel short bryonic antigen-related cell adhesion molecule 1 (CEACAM-1) is isoform of p19ARF is linked to autophagic cell death. Autophagy, associated with increased angiogenic potential in non-small-cell lung 2, 328–330. cancer. Lung Cancer, 60, 426–433. 64. Lin,Y.C., Diccianni,M.B., Kim,Y., Lin,H.H., Lee,C.H., Lin,R.J., 46. Moore,S.D., Offor,O., Ferry,J.A., Amrein,P.C., Morton,C.C. and Joo,S.H., Li,J., Chuang,T.J., Yang,A.S. et al. (2007) Human Dal,C.P. (2006) ELF4 is fused to ERG in a case of acute myeloid p16gamma, a novel transcriptional variant of p16(INK4A), leukemia with a t(X;21)(q25-26;q22). Leuk. Res., 30, 1037–1042. coexpresses with p16(INK4A) in cancer cells and inhibits cell-cycle 47. 
Sorensen,P.H., Lessnick,S.L., Lopez-Terrada,D., Liu,X.F., progression. Oncogene, 26, 7017–7027. Triche,T.J. and Denny,C.T. (1994) A second Ewing’s sarcoma 65. Gil,J. and Peters,G. (2006) Regulation of the INK4b-ARF-INK4a translocation, t(21;22), fuses the EWS gene to another ETS-family tumour suppressor locus: all for one or one for all. Nat. Rev. Mol. transcription factor, ERG. Nat. Genet., 6, 146–151. Cell Biol., 7, 667–677. 48. Tomlins,S.A., Rhodes,D.R., Perner,S., Dhanasekaran,S.M., 66. Sanchez-Aguilera,A., Sanchez-Beato,M., Garcia,J.F., Prieto,I., Mehra,R., Sun,X.W., Varambally,S., Cao,X., Tchinda,J., Kuefer,R. Pollan,M. and Piris,M.A. (2002) p14(ARF) nuclear overexpression et al. (2005) Recurrent fusion of TMPRSS2 and ETS transcription in aggressive B-cell lymphomas is a sensor of malfunction of the factor genes in prostate cancer. Science, 310, 644–648. common tumor suppressor pathways. Blood, 99, 1411–1418. 49. Lapointe,J., Kim,Y.H., Miller,M.A., Li,C., Kaygusuz,G., 67. Ito,T., Nishida,N., Fukuda,Y., Nishimura,T., Komeda,T. and van de,R.M., Huntsman,D.G., Brooks,J.D. and Pollack,J.R. (2007) Nakao,K. (2004) Alteration of the p14(ARF) gene and p53 status in A variant TMPRSS2 isoform and ERG fusion product in prostate human hepatocellular carcinomas. J. Gastroenterol., 39, 355–361. Nucleic Acids Research, 2008, Vol. 36, No. 20 6547 68. Ferru,A., Fromont,G., Gibelin,H., Guilhot,J., Savagner,F., 74. Chen,Y.W., Paliwal,S., Draheim,K., Grossman,S.R. and Lewis,B.C. Tourani,J.M., Kraimps,J.L., Larsen,C.J. and Karayan-Tapon,L. (2008) p19Arf inhibits the invasion of hepatocellular carcinoma (2006) The status of CDKN2A alpha (p16INK4A) and beta cells by binding to C-terminal binding protein. Cancer Res., 68, (p14ARF) transcripts in thyroid tumour progression. Br. J. Cancer, 476–482. 95, 1670–1677. 75. Rizos,H., Darmanian,A.P., Mann,G.J. and Kefford,R.F. (2000) 69. 
Dong,Y., Walsh,M.D., McGuckin,M.A., Gabrielli,B.G., Two arginine rich domains in the p14ARF tumour suppressor Cummings,M.C., Wright,R.G., Hurst,T., Khoo,S.K. and mediate nucleolar localization. Oncogene, 19, 2978–2985. Parsons,P.G. (1997) Increased expression of cyclin-dependent 76. Ayrault,O., Karayan,L., Riou,J.F., Larsen,C.J. and Seite,P. (2003) kinase inhibitor 2 (CDKN2A) gene product P16INK4A in ovarian Delineation of the domains required for physical and functional cancer is associated with progression and unfavourable prognosis. interaction of p14ARF with human topoisomerase I. Oncogene, 22, Int. J. Cancer, 74, 57–63. 1945–1954. 70. Henshall,S.M., Quinn,D.I., Lee,C.S., Head,D.R., Golovsky,D., 77. Moulin,S., Llanos,S., Kim,S.H. and Peters,G. (2008) Binding to Brenner,P.C., Delprado,W., Stricker,P.D., Grygiel,J.J. and nucleophosmin determines the localization of human and chicken Sutherland,R.L. (2001) Overexpression of the cell cycle inhibitor ARF but not its impact on p53. Oncogene, 27, 2382–2389. p16INK4A in high-grade prostatic intraepithelial neoplasia predicts 78. Prall,F., Nollau,P., Neumaier,M., Haubeck,H.D., Drzeniek,Z., early relapse in prostate cancer patients. Clin. Cancer Res., 7, Helmchen,U., Loning,T. and Wagener,C. (1996) CD66a (BGP), 544–550. an adhesion molecule of the carcinoembryonic antigen family, 71. Hui,R., Macmillan,R.D., Kenny,F.S., Musgrove,E.A., is expressed in epithelium, endothelium, and myeloid cells in a Blamey,R.W., Nicholson,R.I., Robertson,J.F. and Sutherland,R.L. wide range of normal human tissues. J. Histochem. Cytochem., 44, (2000) INK4a gene expression and methylation in primary breast 35–41. cancer: overexpression of p16INK4a messenger RNA is a marker of 79. Thies,A., Moll,I., Berger,J., Wagener,C., Brummer,J., Schulze,H.J., poor prognosis. Clin. Cancer Res., 6, 2777–2787. Brunner,G. and Schumacher,U. (2002) CEACAM1 expression in 72. Palmero,I., Pantoja,C. and Serrano,M. 
(1998) p19ARF links the cutaneous malignant melanoma predicts the development of meta- tumour suppressor p53 to Ras. Nature, 395, 125–126. static disease. J. Clin. Oncol., 20, 2530–2536. 73. Bates,S., Phillips,A.C., Clark,P.A., Stott,F., Peters,G., Ludwig,R.L. 80. Turbide,C., Kunath,T., Daniels,E. and Beauchemin,N. (1997) and Vousden,K.H. (1998) p14ARF links the tumour suppressors Optimal ratios of biliary glycoprotein isoforms required for inhibi- RB and p53. Nature, 395, 124–125. tion of colonic tumor cell growth. Cancer Res., 57, 2781–2788.

Journal

Nucleic Acids Research, Oxford University Press

Published: Nov 16, 2008
