TY - JOUR AU - Chen, Li AB - Abstract BACKGROUND Targeted next-generation sequencing is a powerful method to comprehensively identify biomarkers for cancer. Starting material is currently either DNA or RNA, depending on the variation type, but splitting the work into 2 assays is burdensome and sometimes impractical, causing delayed or missed detection of critical events, in particular potent and targetable fusion events. An assay that analyzes both templates in a streamlined process is urgently needed. METHODS We developed a single-tube, dual-template assay and an integrated bioinformatics pipeline for relevant variant calling. RNA was used for fusion detection, whereas DNA was used for single-nucleotide variations (SNVs) and insertions and deletions (indels). The reaction chemistry featured barcoded adaptor ligation, multiplexed linear amplification, and multiplexed PCR for noise reduction and novel fusion detection. An auxiliary quality control assay was also developed. RESULTS In a 1000-sample lung tumor cohort, we identified all major SNV/indel hotspots and fusions, as well as MET exon 14 skipping and several novel or rare fusions. The occurrence frequencies were in line with previous reports and were verified by Sanger sequencing. One noteworthy fusion event was HLA-DRB1-MET, which constituted the second intergenic MET fusion ever detected in lung cancer. CONCLUSIONS This method should benefit not only the majority of patients carrying core actionable targets but also those with rare variations. Future extension of this assay to RNA expression and DNA copy number profiling of target genes such as programmed death-ligand 1 may provide additional biomarkers for immune checkpoint therapies.
Tumor genome variations are diverse and include localized genomic mutations such as single-nucleotide variations (SNVs)7 and insertion or deletions (indels), copy number variations/amplifications, and transcriptional variations such as rearrangement/fusion, expression variation, or aberrant splicing, whose genomic causes may not be explicit (1–5). Targeted therapies and immune checkpoint therapies greatly improve patient survival rates, and effective application of new therapies relies on accurate genetic variant profiling. In lung cancer, mutations in EGFR8, BRAF, KRAS, and PIK3CA constitute important biomarkers to guide clinical care, and gene fusion events involving ALK, ROS1, and RET are also frequently identified. Importantly, recent studies have shown that rare fusion events such as those involving NTRK1, NTRK2, and NTRK3 can potently drive tumorigenesis and are likely targetable across diverse tumor types (6–10). High-throughput next-generation sequencing (NGS) provides a clear advantage over single-target testing (11, 12). Previous targeted NGS assays interrogated either DNA or RNA but not both. Large DNA panels encompassing hundreds of oncogenes are powerful in exploring mutation landscape (13–15), but it remains challenging to identify rearrangements/fusions with these assays (16–18). Clinical utility of large DNA panels is also debatable because actionable genetic variations in patients with lung cancer concern only a few well-defined genes (4, 19, 20). Indeed, patient insurance policies are not covering the cost of extensive panels, and systematic application of broad genetic profiling panels is limited to institutions that can otherwise subsidize these assays. Identifying fusions, particularly novel ones, can be challenging, and it is often necessary to identify the partner gene and specific breakpoint. 
In addition, the chimeric transcript needs to be evaluated to establish the integrity of the coding frame and oncogenic domains, and a reasonable expression level of the fusion product. Therefore, most DNA panels prioritize only a few well-characterized fusion subtypes and are by design poorly suited to identifying novel fusion events. Targeted RNA panels are effective at identifying fusions (21–25), particularly if the assay, such as anchored multiplex PCR, does not require a priori knowledge of the fusion partner or breakpoint (26, 27). However, RNA resequencing is unsuitable for SNV and indel identification because the template is unstable and lacks double-stranded context. It is therefore generally adopted as a complementary approach after mainstream DNA panels fail to yield optimal outcomes, potentially resulting in delayed or missing molecular information. Thus, to provide direct benefit to patients through treatment guidance or disease monitoring, an NGS assay that comprehensively and accurately covers multiple variation types rather than large gene sets would be extremely valuable. Recently, assays that interrogate both RNA and DNA were deployed clinically (28), but they rely on parallel assay streams rather than a single reaction chemistry and, therefore, can be costly, labor-intensive, and demanding in input material. Here we present a unified NGS assay that converts total nucleic acid (TNA) extracted from tissue samples into calls for multiple variant types, allowing us to streamline the variant detection process. Materials and Methods CLINICAL SAMPLE COLLECTION Fresh clinical samples were surgically obtained, and tumor contents were confirmed by pathologists. Tissue specimens were stored at −20 °C, then formalin-fixed and paraffin-embedded (FFPE), sectioned to between 5 and 10 μm, and stored at room temperature. Additionally, archived FFPE samples were obtained from hospital sample biobanks and similarly sectioned.
All samples were collected with written consent from patients, and research was approved by the Institutional Review Board of each participating hospital. NUCLEIC ACID PREPARATION Nucleic acids were extracted from specimens using either Formapure FFPE Nucleic Acid Extraction kit (Beckman Coulter) or PANO-Pure FFPE TNA Extraction kit (HeliTec Biotechnologies) as a single-tube nucleic acid mixture without splitting RNA and DNA. Nucleic acids were eluted in 50 μL of Tris buffer, pH 7.5, and quantified by Qubit Fluorometer 2.0 and Quant-iT HS dsDNA Assay Kit and RNA Assay Kits (Thermo Fisher), and sample qualities were determined as PASS if DNA was >10 ng/μL and RNA was >25 ng/μL. PRE-NGS SAMPLE QUALITY ASSAY A quantitative reverse transcription PCR (qRT-PCR) assay targeting the CHMP2A gene was developed to evaluate input nucleic acid quality. One microliter of nucleic acid was used in a reaction system consisting of 2 PCR primers located in 2 adjacent exons, 2 fluorescent probes [fluorescein (FAM) and 2-chloro-7-phenyl-1,4-dichloro-6-carboxy-fluorescein (VIC)] targeting the intron and one of the exons (see Table 1 in the Data Supplement that accompanies the online version of this article), M-MLV Reverse Transcriptase RNase H− (Enzymatics), and hot-start Taq polymerase (Thermo Fisher). After reverse transcription at 42 °C for 30 min, qPCR was initiated at 95 °C for 5 min and 45 thermocycles of 95 °C for 10 s and 60 °C for 30 s. The intronic and exonic probes detected genomic DNA (gDNA) and TNA, respectively, and the difference between CT values represented RNA contribution. After summarizing clinical samples for relative RNA and DNA stability, the pre-NGS RNA quality control (QC) score was empirically set as PASS if CT_VIC <33 and CT_VIC − CT_FAM < −0.5, reflecting TNA level and RNA-to-DNA difference, respectively, and DNA QC score was set as PASS when CT_VIC <35 and CT_FAM <35, reflecting DNA template signal from both probes. 
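The empirical PASS rules above can be written as a small decision function. This is a minimal sketch that only restates the thresholds quoted in the text; the function name, the `QcResult` type, and the argument names are hypothetical and are not taken from the authors' pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QcResult:
    rna_pass: bool
    dna_pass: bool


def pre_ngs_qc(ct_vic: float, ct_fam: float) -> QcResult:
    """Apply the pre-NGS qRT-PCR thresholds quoted in the text.

    ct_vic: CT of the exonic (VIC) probe, which reads gDNA plus cDNA (TNA).
    ct_fam: CT of the intronic (FAM) probe, which reads gDNA only.
    """
    # RNA QC: enough total template (CT_VIC < 33) and an exonic signal that
    # leads the intronic signal by more than 0.5 cycles.
    rna_pass = ct_vic < 33 and (ct_vic - ct_fam) < -0.5
    # DNA QC: both probes must cross threshold before cycle 35.
    dna_pass = ct_vic < 35 and ct_fam < 35
    return QcResult(rna_pass=rna_pass, dna_pass=dna_pass)


if __name__ == "__main__":
    # Illustrative CT values only, not measured data.
    print(pre_ngs_qc(ct_vic=28.0, ct_fam=30.0))
    print(pre_ngs_qc(ct_vic=31.0, ct_fam=31.2))
```

Keeping the two scores separate mirrors the text, where a sample can pass DNA QC while failing RNA QC.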
Quality filters can be fine-tuned after more samples are gathered. NGS ASSAY Between 10 and 500 ng of TNA was used as an input template for NGS library construction, beginning with an RT reaction that converted RNA to cDNA without altering carryover gDNA. Input template was mixed with random hexamers, heated to 65 °C for 5 min, and immediately chilled in ice water. Reverse transcriptase, RNase inhibitor, and dNTPs (Enzymatics) were added and incubated at 42 °C for 2 h, followed by 70 °C for 15 min. The second strand was synthesized with Escherichia coli DNA polymerase I and RNase H (Enzymatics) at 16 °C for 1 h. After SPRI bead cleanup (Beckman), DNA was enzymatically fragmented by DNA fragmentation enzyme mix (New England Biolabs or Kapa Biosystems) and end-polished by T4 DNA polymerase, Taq polymerase, and T4 polynucleotide kinase (Enzymatics). The polished fragments were ligated with a set of adaptors. Each adaptor consisted of a designated P5 index for sample deconvolution and a degenerate unique molecular identifier (UMI) for amplicon binning. Ligated templates were SPRI-cleaned, mixed with the first pool of target-specific primers and thermostable DNA polymerase (Thermo Fisher), and subjected to multiple linear amplification cycles, each producing a complementary strand copy of the input template. All primers shared a common 5′ tag that suppressed amplification between gene-specific primers owing to hairpin formation in such amplicons. After cleanup, a second pool of target-specific primers nested to the first pool, a common adaptor primer, and an indexed P7 primer were applied for PCR amplification. Resulting libraries were pooled and sequenced using HiSeq X Ten or NextSeq 500/550 sequencers (Illumina) with paired-end 150 cycles and an 8-bp P7 index run setting. A set of 96 P5 adaptor indexes and matched P7 primer indexes were designed, allowing library pooling for sequencing.
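The role of the UMI in amplicon binning can be illustrated with a short sketch. It assumes that reads sharing a UMI, chromosome, and fragment start derive from one original molecule and that a majority vote gives the consolidated base. That rule, the `Read` fields, and the function names are assumptions for illustration; the authors' actual binning logic is not described in the text.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple


class Read(NamedTuple):
    umi: str    # UMI sequence parsed from the adaptor (hypothetical field)
    chrom: str  # reference sequence name
    start: int  # fragment start created by random fragmentation
    base: str   # base observed at the position of interest


BinKey = Tuple[str, str, int]


def consolidate(reads: Iterable[Read]) -> Dict[BinKey, str]:
    """Collapse PCR duplicates into one consensus base per UMI bin."""
    bins: Dict[BinKey, Counter] = defaultdict(Counter)
    for read in reads:
        bins[(read.umi, read.chrom, read.start)][read.base] += 1
    # Majority vote within each bin; errors seen in a minority of the
    # duplicates of one molecule are discarded.
    return {key: counts.most_common(1)[0][0] for key, counts in bins.items()}


def allele_counts(consensus: Dict[BinKey, str]) -> Counter:
    """Count consolidated molecules supporting each base."""
    return Counter(consensus.values())


if __name__ == "__main__":
    reads = [
        Read("AAC", "chr7", 100, "T"),
        Read("AAC", "chr7", 100, "T"),
        Read("AAC", "chr7", 100, "G"),  # likely amplification error
        Read("GGT", "chr7", 100, "G"),
        Read("AAC", "chr7", 140, "T"),
    ]
    consensus = consolidate(reads)
    print(len(consensus), dict(allele_counts(consensus)))
```

Counting consolidated molecules rather than raw reads is what makes a depth figure "consolidated": five reads here reduce to three molecules.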
NGS DATA PROCESSING Raw data were demultiplexed using Illumina bcl2fastq version 2.19 for the P7 index, followed by a custom script for matched P5 index demultiplexing, BBDuk for adaptor trimming, and UMI parsing. Fastq sequences were aligned to the human reference genome (hg19) using BWA MEM with the default settings (29). On-target alignments were extracted using BedTools 2.27 (30) supplied with specific panel bed files. SNVs were called using a UMI-aware custom script that includes samtools and bcftools. SNVs with an occurrence frequency higher than 1% in the dbSNP database were regarded as polymorphisms and filtered out. Gene fusions were called based on BWA MEM supplementary alignments, taking into account different mapping start positions, and the breakpoint frame status was inferred based on RefSeq open reading frames. We favored BWA MEM over common RNA-seq aligners such as STAR, TopHat, or MapSplice (31) because its algorithm was advantageous for fusions involving 2 different genes. BWA MEM also allowed simultaneous alignment of gDNA reads (the resulting bam file is compatible with the mutation calling module) and cDNA reads, a requirement specific to our wet laboratory method. The read number cutoff for calling a fusion was usually 3, except for some canonical fusions such as EML4-ALK, whose threshold might be lowered if the exon joint/breakpoint was detected by both paired-end reads of the amplicon. Read depths of the housekeeping gene CHMP2A cDNA and gDNA were used to calculate the RNA/DNA ratio, which was used as a QC for RNA quality.
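Fusion calling from supplementary alignments can be sketched as follows. The `SA` tag layout (`rname,pos,strand,CIGAR,mapQ,NM;`) follows the SAM specification, and the default cutoff of 3 reads comes from the text. Everything else is hypothetical: the gene intervals are placeholder coordinates, not real annotation, and the handling of mapping start positions and reading frame described in the text is omitted.

```python
from collections import Counter
from typing import Dict, Iterable, List, Optional, Tuple

# Placeholder target intervals (chrom, start, end, gene) for illustration only.
GENES = [("chr2", 1000, 2000, "ALK"), ("chr2", 5000, 6000, "EML4")]


def gene_at(chrom: str, pos: int) -> Optional[str]:
    """Return the target gene covering a position, if any."""
    for g_chrom, start, end, gene in GENES:
        if chrom == g_chrom and start <= pos <= end:
            return gene
    return None


def parse_sa(sa_value: str) -> List[Tuple[str, int]]:
    """Parse the value of a SAM SA:Z tag into (chrom, pos) pairs."""
    hits = []
    for entry in sa_value.strip(";").split(";"):
        if entry:
            rname, pos, *_ = entry.split(",")
            hits.append((rname, int(pos)))
    return hits


def call_fusions(
    records: Iterable[Tuple[str, int, str]], min_reads: int = 3
) -> Dict[Tuple[str, str], int]:
    """Count reads whose primary and supplementary hits lie in two genes.

    Each record is (chrom, pos, sa_value) for one chimeric read.
    """
    support: Counter = Counter()
    for chrom, pos, sa_value in records:
        primary = gene_at(chrom, pos)
        for sa_chrom, sa_pos in parse_sa(sa_value):
            partner = gene_at(sa_chrom, sa_pos)
            if primary and partner and primary != partner:
                support[tuple(sorted((primary, partner)))] += 1
    # Keep only gene pairs that reach the read cutoff.
    return {pair: n for pair, n in support.items() if n >= min_reads}


if __name__ == "__main__":
    chimeric = [("chr2", 1500, "chr2,5500,+,60M90S,60,0;")] * 3
    print(call_fusions(chimeric))
```

A lower `min_reads` could be passed for canonical fusions, in the spirit of the relaxed threshold the text mentions for EML4-ALK.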
PCR, CLONING, AND SANGER SEQUENCING To verify fusion results, 25 to 300 ng of TNA was reverse-transcribed with 150 pmol of random hexamers, 12.5 nmol of dNTPs, 20 U of RNase inhibitor, and 100 U of reverse transcriptase (Enzymatics) in a 20-μL reaction, at 25 °C for 10 min, 42 °C for 30 min, and 70 °C for 15 min, and then amplified by PCR with 2 mmol/L MgCl2, 1 U of Platinum Taq polymerase (Thermo Fisher), 10 pmol each of P1F and P1R forward and reverse primers, for 15 cycles of 95 °C for 15 s, 50 to 60 °C for 20 s, and 72 °C for 20 s, followed by nested PCR with the second primer pair P2F and P2R. PCR products were subcloned for Sanger sequencing. To verify SNV and indel, TNAs were PCR-amplified with primers PF and PR and directly sequenced. Primer sequences are listed in Table 1 in the online Data Supplement. ADDITIONAL DATA ANALYSIS AND STATISTICS Processed NGS data were also analyzed or charted on an Integrative Genomics Viewer (IGV version 2.4.15, Broad Institute) (32, 33). Statistics were performed using Prism 5 for Windows (GraphPad Software) and Excel (Microsoft). Diagrams were drawn using PowerPoint (Microsoft). Results UNIFIED LIBRARY CONSTRUCTION ASSAY USING BOTH RNA AND DNA The parallel amplification numerically optimized sequencing (PANO-Seq) assay was developed so that TNAs extracted from tissue samples were converted to one NGS library without splitting gDNA or RNA during laboratory processing. TNAs were subjected to reverse transcription that converted RNA to double-stranded cDNA while gDNA remained unaffected. The follow-up library conversion consisted of adaptor ligation and gene-specific multisite linear amplification followed by nested PCR (Fig. 1A). For gDNA, primers were placed in one or both sides of the targeted whole exons or hotspots, incurring strand-informed detection for improved accuracy. 
For cDNA, primers were placed on target exons, toward the direction of potential fusion partners, and fusion events were captured from amplicons containing both a partner exon and a target exon. For fusion detection from gDNA, tiled primers covered an entire intron, toward the direction of potential fusion partners. Reads were traced to either RNA or DNA templates according to primer and amplicon composition (Fig. 1B) by the analysis pipeline and assigned to corresponding caller modules.
Fig. 1. A unified approach sequenced DNA and RNA in a single-tube format. Assay work flow with samples processed in a single tube without physical separation. TNAs were subjected to reverse transcription (rev. tsp.) with random hexamers (denoted as N6). Double-stranded DNA was then processed to an NGS library through adaptor ligation, multiplexed linear amplification, and nested PCR with gene-specific primer pools (GSP1s and GSP2s) (A). Diagram of primer design: X represents fusion partner (B). frag., fragmentation; CNV, copy number variation.
PROOF-OF-CONCEPT STUDY IDENTIFIED BOTH FUSIONS AND MUTATIONS From a cohort of 66 freshly sectioned lung tissue samples (see Table 2 in the online Data Supplement), mutations were found in EGFR (27 cases), KRAS (4 cases), PIK3CA (4 cases), BRAF, and CDKN2A (1 case each).
Additionally, 3 fusions (2 EML4-ALK and 1 KIF5B-RET) were identified, demonstrating that the intended unified multivariant type and bitemplate detection was indeed feasible. Mutant allele frequencies (MAFs) were then calibrated using reference standards. Of the 35 mutations in the reference material, all 22 mutations with MAFs ≥2% were detected, plus 8 of 13 mutations at around 0.5%, without false-positive calls, giving a limit of detection (LOD) of 2%. Detected MAFs were also highly correlated with nominal MAFs (Fig. 2A) and reproducible at both typical and reduced input amounts (50 ng and 10 ng) (Fig. 2B), demonstrating that the template input requirement and LOD were sufficient for clinical tissue samples.
Fig. 2. Assay parameters and quality. Correlation between detected MAFs and nominal MAFs of reference materials. Data were analyzed using linear regression; P < 0.005 (A). Correlations were reproducible with regular and low DNA input; P < 0.005 (B). On-target ratio (C), uniformities (D), and amplicon size distribution (E) for cell line (n = 11) and FFPE (n = 78) samples were compared. Data were analyzed using 2-tailed unpaired t-test for means and F-test for variances. Error bars represent SE. Significances were estimated using unpaired 2-tailed t-tests; P < 0.005 (***) or P < 0.0001 (****). Target base coverage can be adjusted by altering primer concentration. Shown are effects of EGFR GSP1/GSP2 primer set concentrations on 4 exons (F).
LARGE-SCALE COHORT STUDY IN OVER 1000 CLINICAL SAMPLES Although specimens from newly enrolled patients are seldom stored for long, many patients under ongoing care have samples sectioned in past years that have not yet been tested by NGS. To test the extended usability of this assay, FFPE samples from 1095 patients, banked for various amounts of time, ranging from <1 year to extensive archiving of >5 years, were used (see Table 3 in the online Data Supplement). The cohort was representative of the known distribution of lung cancer subtypes, with 815 adenocarcinomas (74.4%), 183 squamous cell carcinomas (16.7%), and 97 other or unclassified (8.9%). The targeted panel covered 179-amplicon length for 32 exons and a promoter of 10 genes (DNA-based SNV/indel detection), and 14 genes for RNA-based fusion detection, as well as 12 introns across 3 genes for DNA-based fusion detection for samples with poor RNA (see Table 4 in the online Data Supplement). Of all libraries successfully sequenced and analyzed, the on-target ratio was 60.1%, the mean consolidated depth was 12 622x, and the fraction of bases over 0.2x mean consolidated depth, a measurement of uniformity, was 70.0%. Amplicons shorter than 50 bp, an indicator of template integrity, accounted for 48.9% (see Table 5 in the online Data Supplement). Cultured cell samples performed significantly better than FFPE samples (Fig. 2, C–E), indicating that these real-case performances were largely affected by the quality of the clinical samples. Importantly, uniformity could be conveniently optimized by adjusting the concentration of relevant primers in the pool (Fig. 2F), reflecting the flexibility of the assay.
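The uniformity measure used above, the fraction of target positions whose depth exceeds 0.2x the mean consolidated depth, is simple to compute. This is a minimal sketch; the function name and the per-position depth list are hypothetical inputs, not the authors' implementation.

```python
from typing import Sequence


def uniformity(depths: Sequence[float], fraction: float = 0.2) -> float:
    """Fraction of positions with depth above `fraction` x mean depth."""
    if not depths:
        raise ValueError("depths must not be empty")
    mean_depth = sum(depths) / len(depths)
    cutoff = fraction * mean_depth
    return sum(1 for depth in depths if depth > cutoff) / len(depths)


if __name__ == "__main__":
    # Illustrative depths only: one dropout position among five.
    print(uniformity([100, 100, 100, 100, 0]))
```

Because the cutoff scales with the mean, the metric compares libraries sequenced to different depths on the same footing.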
Uneven RNA level also affects overall performance, with degradation being a general concern for RNA-based assays. To monitor RNA reads quality, the CHMP2A gene was used as an internal control, with its expression compared with the level of its own gDNA moiety in the analyzed mixture (Fig. 3A).
Fig. 3. RNA quality analysis and pre-NGS evaluation assay. A housekeeping gene (CHMP2A) served as internal RNA QC. Shown are a snapshot of the Integrative Genomics Viewer graph and a schematic diagram of the tested region. The y axes were set to equal scale. RNA quality was determined by counting exon and intron reads immediately neighboring the targeted exon. A horizontal black arrow denotes primer direction, and vertical boxed arrows denote base positions (A). RNA and DNA sequencing qualities are affected by year of sample collection (B). Internal control gene RNA/gDNA reads ratios are affected by year (C). Diagram of pre-NGS dual RNA/DNA qRT-PCR assay (D). Presence of cDNA would increase the VIC probe signal. P1 and P2 denote PCR primers.
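The internal control in Fig. 3A reduces to a ratio of two read counts. The sketch below assumes that reads continuing from the targeted exon into the neighboring exon derive from spliced cDNA, and that reads continuing into the neighboring intron derive from gDNA. That interpretation and the function name are assumptions, since the exact formula is not given in the text.

```python
def rna_to_gdna_ratio(neighbor_exon_reads: int, neighbor_intron_reads: int) -> float:
    """Ratio of cDNA-derived to gDNA-derived reads at the control locus.

    neighbor_exon_reads: reads extending into the adjacent exon (assumed cDNA).
    neighbor_intron_reads: reads extending into the adjacent intron (assumed gDNA).
    """
    if neighbor_exon_reads < 0 or neighbor_intron_reads < 0:
        raise ValueError("read counts must be non-negative")
    if neighbor_intron_reads == 0:
        raise ValueError("no gDNA reads; ratio is undefined")
    return neighbor_exon_reads / neighbor_intron_reads


if __name__ == "__main__":
    # Illustrative counts only.
    print(rna_to_gdna_ratio(300, 100))
```

Normalizing to the gene's own gDNA signal makes the ratio independent of total input, so samples of different yield can be compared.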
Nearly all samples collected within 2 years, reflecting newly enrolled patients, and most samples collected within 5 years, in line with the duration of disease monitoring, were suitable for this assay (see Fig. 3B here and Table 5 in the online Data Supplement). Overall, sequencing parameters and particularly RNA quality were clearly associated with sample storage duration (see Fig. 3C here and Fig. 1 in the online Data Supplement). RNA was less stable than DNA, and sample quality evaluation before lengthy and costly NGS assays would be valuable. However, conventional nucleic acid quantification assays may not measure template quality accurately or distinguish RNA from DNA. We therefore developed a pre-NGS qRT-PCR assay for dual RNA/DNA template quality evaluation (Fig. 3D) and analyzed all available samples, demonstrating that it predicted suitability for downstream NGS processing better than conventional assays (see Table 6 in the online Data Supplement). MUTATION PROFILING AND CLINICAL SIGNIFICANCE With a cutoff at 5% MAF to compensate for FFPE sample baseline, across all patients, 834 were found to harbor at least one mutation (see Table 1 here and Table 7 in the online Data Supplement). Overall, 79% (739 of 935) of patients with acceptable DNA reads quality had such an event, and 45% (421 of 935) were located in hotspots (Fig. 4A).
Table 1. Summary of genetic variations detected in clinical cohort.
Category                                            Number of patients
Total samples tested                                1095
    Samples with SNV/indel MAF > 5%                 834
    Samples with fusion                             69
Samples with acceptable DNA reads                   935
    With SNV/indel MAF > 5%                         739
        With SNV/indel hotspots MAF > 5%            421
Samples with acceptable RNA reads                   754
    With fusion                                     58
        ALK                                         27
        ROS1                                        12
        RET                                         8
        MET                                         7
        Other                                       4
Samples with both acceptable DNA and RNA reads      678
    With SNV/indel (MAF > 5%) or fusion             555
        With hotspots (MAF > 5%) or fusion          373
Fig. 4. Summary of fusions and hotspot mutations. Array of all fusions and mutations with MAF >5% in hotspots of 4 oncogenes identified in lung adenocarcinomas, squamous cell carcinomas, or other types (A). Solid lines separate tumor histology types, and dashed lines separate cases with or without fusion. HLA-DRB1-MET fusion is shown as a cyan bar in the MET subclass. Smoking history is marked as red (active), orange (quit), cyan (never), or white (unknown). MET exon 14 skipping as an example of intragenic fusion event (B). HLA-DRB1-MET as an example of intergenic fusion event (C). Each panel was shown with reference sequences, NGS reads, and Sanger sequencing confirmation. A vertical black line marks the exon breakpoint.
Among these, EGFR was the most mutated oncogene, with hotspot mutations found in 375 total patients or 37.6% (352 of 935) of patients with acceptable DNA reads quality, with the majority being L858R (18.8%; 176 of 935) and exon 19 deletion (15.4%; 144 of 935). The resistance mutation T790M was found in 17 cases, including 2 with MAF just below 5% but still above baseline, and, notably, was associated more often with exon 19 deletion than with the L858R mutation (11 vs 3 cases). KRAS mutations (G12/G13/Q61; 54 cases) were also frequently found, preferentially in smokers (Fig. 4A), and other mutations in PIK3CA (E542/E545/H1047; 22 cases), BRAF (G469/V600; 6 cases), and NRAS (G12/G13/Q61; 2 cases) were found in small fractions. At least one case for each mutation type was independently tested by Sanger sequencing, confirming the validity of this assay.
To further evaluate assay accuracy, 32 EGFR_L858R cases, representing the most abundant mutation, and 20 negative samples were further validated, demonstrating complete concordance (see Fig. 2 in the online Data Supplement). FUSION PROFILING IDENTIFIED NOVEL AND DIVERSE SUBTYPES Sixty-nine patients were fusion-positive overall, or 7.7% (58 of 754) in the subgroup with acceptable RNA reads quality (see Table 1 here and Table 8 in the online Data Supplement). ALK (3.6%; 27 of 754) (see Fig. 3A in the online Data Supplement), ROS1 (1.6%; 12 of 754) (see Fig. 3B in the online Data Supplement), and RET (1.1%; 8 of 754) (see Fig. 3C in the online Data Supplement) represented the vast majority of events. Fusions were found more often in adenocarcinoma than in squamous cell carcinoma (Fig. 4A). No patients harbored multiple fusions, and just 3 fusions were accompanied by hotspot mutations (CCDC93-ALK with EGFR_S768I, FREM2-ROS1 with KRAS_Q61H, and AGK-BRAF with EGFR_exon19del and T790M), consistent with mutual exclusivity among oncogenic drivers. Of note, the CCDC93-ALK (E16:E25) fusion resulted in a truncated tyrosine kinase domain and is likely nonfunctional. MET exon 14 skipping was detected in 6 cases, constituting 0.8% of the tested population (Fig. 4, A and B); additionally, we identified an HLA-DRB1-MET fusion that was reported only once previously (34) (Fig. 4C). Four other fusions were found involving BRAF [BBS9-BRAF (see Fig. 3D in the online Data Supplement) and AGK-BRAF] and NRG1 [CD74-NRG1 (see Fig. 3E in the online Data Supplement) and MRTFB-NRG1]. Again, 34 fusion-positive cases, representing at least one case for each fusion type, as well as the most common intergenic ALK fusion and the MET exon 14 skipping, were validated by Sanger sequencing (see Table 8 in the online Data Supplement). Twenty negative cases were also checked for ALK fusion, and all were negative.
Thus, the assay allows for accurate and effective detection of known and novel fusion events in parallel with detection of local genetic variants. Discussion Lung cancer has been the subject of extensive genomic studies, but the routine detection of clinically actionable genetic variants and further discovery of rare driving fusion events remain difficult. A primary operational reason is the lack of a comprehensive and economical platform to perform these studies. Clinical testing differs from exploratory research in that well-defined gene sets are assayed in a timely and cost-effective manner so that patients can promptly benefit. At present, a small set of well-characterized genetic events constitute key selection markers for the application of approved or emerging therapeutics. This leaves a large fraction of patients without clear genetic drivers in their tumors; for a subset of these patients, unrecognized and likely rare genetic variants, in particular fusions that could be targeted, go unexploited. Our study represents an effort toward the development of a broadly encompassing assay with the potential to affect the care of a large number of patients with lung (and other) cancers by simultaneously addressing core actionable targets and rare but potent fusion events. Even with the relatively small panel currently tested, 82% (555 of 678) of patients with desirable sample quality were found to have at least one mutation or fusion, and 55% (373 of 678) had either hotspots or fusions that are likely actionable targets. All well-annotated genetic variations were found in this study, and the occurrences were largely in keeping with other reports among Eastern Asian populations, considering that a large proportion of the analyzed cohort consisted of squamous cell carcinomas, which typically do not contain the most canonical oncogenic variations. A few rare fusions were particularly intriguing and likely actionable.
In part because of the availability of approved drugs, MET has been the subject of numerous previous inquiries. Although MET exon 14 skipping concerns a substantial number of patients, the HLA-DRB1-MET fusion deserves particular notice. Indeed, the first actionable MET fusion in lung cancer, an identical HLA-DRB1-MET, was recently reported in a white female patient responsive to crizotinib (34). Based on fusion position, it is likely that, as with exon 14 skipping, the fusion results in loss of recognition by CBL and impaired ubiquitination and degradation, which would indicate clinical relevance (35). Other fusion cases also merit further investigation; for example, although BRAF_V600D/E mutations were frequently reported, fusions involving BRAF in lung cancer are poorly studied. Additionally, NTRK1/2/3 are increasingly being investigated as novel targets and represent a paradigm shift toward tissue-agnostic therapies. Several NTRK fusions detected in this cohort, for example, a frameshifted MSN-NTRK2 (see Fig. 3F in the online Data Supplement), are being investigated for validity or functional implication and, therefore, are not included in the statistics of this report. Assay performance on clinical samples may differ greatly from that on artificial reference standards (Fig. 2, C–E), affected not only by assay chemistry and environmental disruption but also by tissue heterogeneity. Applying the 2% LOD set forth using the reference standards, 7702 mutations (SNVs or indels) would have been called from this cohort, many at low frequencies (MAF, 2%–5% for 72.5%; 5582 of 7702) or atypical [68.8%; 5299 of 7702, as SNVs unannotated by COSMIC (Catalogue Of Somatic Mutations In Cancer) (36)]; thus, it is critical to set an appropriate threshold to distinguish meaningful mutations that drive tumorigenesis or incur resistance.
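The effect of the two thresholds discussed above can be illustrated by tiering calls on MAF. This is a sketch using the 2% LOD and 5% reporting cutoff from the text; the `Call` record, the tier names, and the example calls are hypothetical and are not data from the cohort.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Call(NamedTuple):
    gene: str
    change: str
    maf: float       # mutant allele frequency, in percent
    in_cosmic: bool  # hypothetical flag: variant annotated in COSMIC


def tier(call: Call, lod: float = 2.0, cutoff: float = 5.0) -> str:
    """Assign a call to a tier based on its MAF."""
    if call.maf < lod:
        return "below_lod"
    if call.maf <= cutoff:
        return "low_frequency"  # detectable, but within the FFPE baseline range
    return "reportable"


def summarize(calls: Iterable[Call]) -> Counter:
    """Count calls per tier."""
    return Counter(tier(call) for call in calls)


if __name__ == "__main__":
    # Made-up example calls for illustration.
    calls = [
        Call("EGFR", "L858R", 28.4, True),
        Call("KRAS", "G12C", 3.1, True),
        Call("TP53", "unannotated SNV", 2.2, False),
        Call("BRAF", "V600E", 0.8, True),
    ]
    print(dict(summarize(calls)))
```

Separating "detectable" from "reportable" keeps low-frequency calls available for review without letting them inflate the mutation count.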
We postulated that clonal expansion enriches driver mutations and, therefore, examined several well-characterized oncogenic hotspots and found that hotspot MAFs were indeed seen at higher levels than non-hotspots (mean ± SE, 28.43 ± 0.87 vs 4.35 ± 0.05; P < 0.0001) (Fig. 5A), and 84% (506 of 611) of hotspots were associated with MAF >5% (Fig. 5B). Likewise, strong driver mutations based on MAF were more often reported in COSMIC (Fig. 5C).
Fig. 5. DNA mutation profiling for hotspot and non-hotspot mutations. Mutations >2% MAFs were charted, and top 1 percentile was excluded to adjust for outliers owing to hereditary mutations. + denotes means, and whiskers denote 10 to 90 percentiles. Scattered dots represent outliers. Numbers under x axis represent mutation count for each group; ex., except (A). Profiling for each hotspot (B), and MAFs were grouped by COSMIC cases (C).
The ability to detect multiple genetic variations (SNV/indel/copy number variation through DNA and fusion/expression through RNA) in a single-tube NGS assay holds further clinical and investigational potential. For example, immune checkpoint therapies have seen a remarkable expansion in prescription; however, NGS panels have not proved very useful other than their limited capability for tumor mutation burden evaluation, while programmed death-ligand 1 protein assay by immunohistochemistry lacks complex genetic information that may affect therapeutic outcome (37).
Such obstacles could be overcome by a consolidated NGS approach measuring PDCD1/CD274 amplification and/or transcription, preferably in the context of other molecular biomarkers for targeted therapies.

Footnotes

7 Nonstandard abbreviations: SNV, single nucleotide variation; indel, insertion or deletion; NGS, next-generation sequencing; TNA, total nucleic acid; FFPE, formalin-fixed and paraffin-embedded; qRT-PCR, quantitative reverse transcription PCR; FAM, fluorescein; VIC, 2-chloro-7-phenyl-1,4-dichloro-6-carboxy-fluorescein; QC, quality control; gDNA, genomic DNA; UMI, unique molecular identifier; MAF, mutant allele frequency; LOD, limit of detection.

8 Human genes: EGFR, epidermal growth factor receptor; BRAF, B-Raf proto-oncogene, serine/threonine kinase; KRAS, KRAS proto-oncogene, GTPase; PIK3CA, phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit α; ALK, ALK receptor tyrosine kinase; ROS1, ROS proto-oncogene 1, receptor tyrosine kinase; RET, ret proto-oncogene; NTRK1, neurotrophic receptor tyrosine kinase 1; NTRK2, neurotrophic receptor tyrosine kinase 2; NTRK3, neurotrophic receptor tyrosine kinase 3; CDKN2A, cyclin-dependent kinase inhibitor 2A; CHMP2A, charged multivesicular body protein 2A; EML4, EMAP like 4; KIF5B, kinesin family member 5B; NRAS, NRAS proto-oncogene, GTPase; CCDC93, coiled-coil domain-containing protein 93; FREM2, FRAS1 related extracellular matrix 2; AGK, acylglycerol kinase; MET, MET proto-oncogene, receptor tyrosine kinase; HLA-DRB1, major histocompatibility complex, class II, DR β 1; BBS9, Bardet-Biedl syndrome 9; NRG1, neuregulin 1; CD74, CD74 molecule; MRTFB, myocardin related transcription factor B; CBL, Cbl proto-oncogene; MSN, moesin; PDCD1, programmed cell death 1; CD274, CD274 molecule.
Author Contributions: All authors confirmed they have contributed to the intellectual content of this paper and have met the following 4 requirements: (a) significant contributions to the conception and design, acquisition of data, or analysis and interpretation of data; (b) drafting or revising the article for intellectual content; (c) final approval of the published article; and (d) agreement to be accountable for all aspects of the article thus ensuring that questions related to the accuracy or integrity of any part of the article are appropriately investigated and resolved. C. Xu, financial support, statistical analysis, administrative support, provision of study material or patients; Y. He, statistical analysis, administrative support; Y. Zhu, provision of study material or patients; Y. Gao, statistical analysis, administrative support; M. Ji, provision of study material or patients; M. Chen, statistical analysis, administrative support; L. Chen, statistical analysis, administrative support.

Authors' Disclosures or Potential Conflicts of Interest: Upon manuscript submission, all authors completed the author disclosure form. Disclosures and/or potential conflicts of interest: Employment or Leadership: Y. He, HeliTec Biotechnologies; Y. Gao, HeliTec Biotechnologies; M. Chen, HeliTec Biotechnologies; J. Lai, HeliTec Biotechnologies; L. Chen, HeliTec Biotechnologies. Consultant or Advisory Role: None declared. Stock Ownership: Y. Gao, HeliTec Biotechnologies; L. Chen, HeliTec Biotechnologies. Honoraria: None declared. Research Funding: None declared. Expert Testimony: None declared. Patents: J. Lai, CN 201910047187.3; L. Chen, CN 201910047187.3.

Role of Sponsor: No sponsor was declared.

Acknowledgments: The authors thank Dr. Zongli Zheng for advice on project design and data processing, and Dr. Adam Lacy-Hulbert for valuable comments on manuscript preparation.
The authors also thank all the patients, for without their trust and support this work would not be possible.

References
1. Baselga J. Treatment of HER2-overexpressing breast cancer. Ann Oncol 2010;21:vii36–vii40.
2. Zheng D, Wang R, Ye T, Yu S, Hu H, Shen X, et al. MET exon 14 skipping defines a unique molecular class of non-small cell lung cancer. Oncotarget 2016;7:41691–702.
3. McDermott U, Sharma SV, Dowell L, Greninger P, Montagut C, Lamb J, et al. Identification of genotype-correlated sensitivity to selective kinase inhibitors by using high-throughput tumor cell line profiling. Proc Natl Acad Sci U S A 2007;104:19936–41.
4. Hintzsche J, Kim J, Yadav V, Amato C, Robinson SE, Seelenfreund E, et al. Impact: a whole-exome sequencing analysis pipeline for integrating molecular profiles with actionable therapeutics in clinical samples. J Am Med Inform Assoc 2016;23:721–30.
5. Kwak EL, Bang YJ, Camidge DR, Shaw AT, Solomon B, Maki RG, et al. Anaplastic lymphoma kinase inhibition in non-small-cell lung cancer. N Engl J Med 2010;363:1693–703.
6. Larotrectinib OK'd for cancers with TRK fusions. Cancer Discov 2019;9:8–9.
7. Entrectinib effective across NTRK fusion-positive cancers. Cancer Discov 2019;9:OF4.
8. Cocco E, Scaltriti M, Drilon A. NTRK fusion-positive cancers and TRK inhibitor therapy. Nat Rev Clin Oncol 2018;15:731–47.
9. Drilon A, Laetsch TW, Kummar S, DuBois SG, Lassen UN, Demetri GD, et al. Efficacy of larotrectinib in TRK fusion-positive cancers in adults and children. N Engl J Med 2018;378:731–9.
10. Farago AF, Le LP, Zheng Z, Muzikansky A, Drilon A, Patel M, et al. Durable clinical response to entrectinib in NTRK1-rearranged non-small cell lung cancer. J Thorac Oncol 2015;10:1670–4.
11. Mardis ER. Next-generation DNA sequencing methods. Annu Rev Genomics Hum Genet 2008;9:387–402.
12. Pettersson E, Lundeberg J, Ahmadian A. Generations of sequencing technologies. Genomics 2009;93:105–11.
13. Chalmers ZR, Connelly CF, Fabrizio D, Gay L, Ali SM, Ennis R, et al. Analysis of 100,000 human cancer genomes reveals the landscape of tumor mutational burden. Genome Med 2017;9:1–14.
14. Devarakonda S, Rotolo F, Tsao MS, Lanc I, Brambilla E, Masood A, et al. Tumor mutation burden as a biomarker in resected non-small-cell lung cancer. J Clin Oncol 2018;36:2995–3006.
15. Yu H, Chen Z, Ballman KV, Watson MA, Govindan R, Lanc I, et al. Correlation of PD-L1 expression with tumor mutation burden and gene signatures for prognosis in early-stage squamous cell lung carcinoma. J Thorac Oncol 2019;14:25–36.
16. Zehir A, Benayed R, Shah RH, Syed A, Middha S, Kim HR, et al. Mutational landscape of metastatic cancer revealed from prospective clinical sequencing of 10,000 patients. Nat Med 2017;23:703–13.
17. Rohland N, Reich D. Cost-effective, high-throughput DNA sequencing libraries for multiplexed target capture. Genome Res 2011:939–46.
18. Kinde I, Wu J, Papadopoulos N, Kinzler KW, Vogelstein B. Detection and quantification of rare mutations with massively parallel sequencing.
Proc Natl Acad Sci U S A 2011;108:9530–5.
19. Tomasetti C, Marchionni L, Nowak MA, Parmigiani G, Vogelstein B. Only three driver gene mutations are required for the development of lung and colorectal cancers. Proc Natl Acad Sci U S A 2015;112:118–23.
20. Reiter JG, Makohon-Moore AP, Gerold JM, Heyde A, Attiyeh MA, Kohutek ZA, et al. Minimal functional driver gene heterogeneity among untreated metastases. Science 2018;361:1033–7.
21. Ritterhouse LL. Targeted RNA sequencing in non-small cell lung cancer. J Mol Diagn 2019;21:183–5.
22. Blidner RA, Haynes BC, Hyter S, Schmitt S, Pessetto ZY, Godwin AK, et al. Design, optimization, and multisite evaluation of a targeted next-generation sequencing assay system for chimeric RNAs from gene fusions and exon-skipping events in non-small cell lung cancer. J Mol Diagn 2019;21:352–65.
23. McLeer-Florin A, Duruisseaux M, Pinsolle J, Dubourd S, Mondet J, Phillips Houlbracq M, et al. ALK fusion variants detection by targeted RNA-next generation sequencing and clinical responses to crizotinib in ALK-positive non-small cell lung cancer. Lung Cancer 2018;116:15–24.
24. Letovanec I, Finn S, Zygoura P, Smyth P, Soltermann A, Bubendorf L, et al. Evaluation of NGS and RT-PCR methods for ALK rearrangement in European NSCLC patients: results from the European Thoracic Oncology Platform Lungscape project. J Thorac Oncol 2018;13:413–25.
25. Scolnick JA, Dimon M, Wang IC, Huelga SC, Amorese DA. An efficient method for identifying gene fusions by targeted RNA sequencing from fresh frozen and FFPE samples. PLoS One 2015;10:e0128916.
26. Zheng Z, Liebers M, Zhelyazkova B, Cao Y, Panditi D, Lynch KD, et al. Anchored multiplex PCR for targeted next-generation sequencing. Nat Med 2014;20:1479–84.
27. Vendrell JA, Taviaux S, Béganton B, Godreuil S, Audran P, Grand D, et al. Detection of known and novel ALK fusion transcripts in lung cancer patients using next-generation sequencing approaches. Sci Rep 2017;7:1–11.
28. Volckmar AL, Leichsenring J, Kirchner M, Christopoulos P, Neumann O, Budczies J, et al. Combined targeted DNA and RNA sequencing of advanced NSCLC in routine molecular diagnostics: analysis of the first 3,000 Heidelberg cases. Int J Cancer 2019;145:649–61.
29. Li H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv 2013;1303.3997v2.
30. Quinlan AR, Hall IM. Bedtools: a flexible suite of utilities for comparing genomic features. Bioinformatics 2010;26:841–2.
31. Engstrom PG, Steijger T, Sipos B, Grant GR, Kahles A, Ratsch G, et al. Systematic evaluation of spliced alignment programs for RNA-seq data. Nat Methods 2013;10:1185–91.
32. Robinson JT, Thorvaldsdottir H, Winckler W, Guttman M, Lander ES, Getz G, Mesirov JP. Integrative genomics viewer. Nat Biotechnol 2011;29:24–6.
33. Thorvaldsdottir H, Robinson JT, Mesirov JP. Integrative genomics viewer (IGV): high-performance genomics data visualization and exploration. Brief Bioinform 2013;14:178–92.
34. Davies KD, Ng TL, Estrada-Bernal A, Le AT, Ennever PR, Camidge DR, et al.
Dramatic response to crizotinib in a patient with lung cancer positive for an HLA-DRB1-MET gene fusion. JCO Precis Oncol 2017;1:1–6.
35. Awad MM. Impaired c-Met receptor degradation mediated by MET exon 14 mutations in non-small-cell lung cancer. J Clin Oncol 2016;34:879–81.
36. Tate JG, Bamford S, Jubb HC, Sondka Z, Beare DM, Bindal N, et al. COSMIC: the Catalogue Of Somatic Mutations In Cancer. Nucleic Acids Res 2019;47:D941–D7.
37. Gong B, Kiyotani K, Sakata S, Nagano S, Kumehara S, Baba S, et al. Secreted PD-L1 variants mediate resistance to PD-L1 blockade therapy in non-small cell lung cancer. J Exp Med 2019;216:982–1000.

Author notes: Z. Song, C. Xu, and Y. He contributed equally to this work.

© 2019 American Association for Clinical Chemistry. This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)

TI - Simultaneous Detection of Gene Fusions and Base Mutations in Cancer Tissue Biopsies by Sequencing Dual Nucleic Acid Templates in Unified Reaction JF - Clinical Chemistry DO - 10.1373/clinchem.2019.308833 DA - 2020-01-01 UR - https://www.deepdyve.com/lp/oxford-university-press/simultaneous-detection-of-gene-fusions-and-base-mutations-in-cancer-Ged0W8I68g SP - 178 VL - 66 IS - 1 DP - DeepDyve ER -