Abstract

Background: Next-generation sequencing (NGS) has revolutionized almost all fields of biology, agriculture and medicine, and is widely utilized to analyse genetic variation. Over the past decade, the NGS pipeline has been steadily improved, and the entire process is currently relatively straightforward. However, NGS instrumentation still requires upfront library preparation, which can be a laborious process requiring significant hands-on time. Herein, we present a simple but robust approach to streamline library preparation by utilizing surface bound transposases to construct DNA libraries directly on a flowcell surface.

Results: The surface bound transposases directly fragment genomic DNA while simultaneously attaching the library molecules to the flowcell. We sequenced and analysed a Drosophila genome library generated by this surface tagmentation approach, and we showed that the quality of our surface bound library was comparable to that of a library from a commercial kit. In addition to the time and cost savings, our approach does not require PCR amplification of the library, which eliminates potential problems associated with PCR duplicates.

Conclusions: We describe the first study to construct libraries directly on a flowcell. We believe our technique could be incorporated into the existing Illumina sequencing pipeline to simplify the workflow, reduce costs, and improve data quality.

Keywords: Next generation sequencing, Transposases, Surface reaction

* Correspondence: email@example.com
1 Chemistry and Chemical Biology, University of New Mexico, Albuquerque, NM 87131, USA
Internal Medicine, Chemical and Biological Engineering, University of New Mexico, Albuquerque, NM 87131, USA
Full list of author information is available at the end of the article

© The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Feng et al. BMC Genomics (2018) 19:416

Background

The Human Genome Project is having a remarkable impact on the biomedical community, primarily due to the amazing reduction in sequencing costs, from $10 to less than $0.000001 per finished base in less than thirty years. Exome sequencing is now routinely used in both research and clinical settings for the detection of inherited or acquired mutations related to disease, and the FDA has already listed over 100 drugs that have genotype information on their labels. In addition, the use of whole genome sequencing is becoming more widespread. However, the simplicity of the Next-Generation Sequencing (NGS) pipeline could still be improved to further decrease the overall cost and increase the impact of NGS technology. Herein, we describe an approach to generate libraries directly on a flowcell surface, which will ultimately improve the efficiency of the NGS pipeline.

The traditional NGS library preparation protocol consists of three primary steps: fragmentation, adaptor ligation, and amplification. DNA molecules are first mechanically or enzymatically fragmented into pieces of 200 to 400 bp, and then sequencing adaptors are ligated to the fragments. Finally, after several cycles of PCR, the DNA library goes through several quality control steps and is loaded into the NGS instrument. These steps typically take 8 to 10 h of hands-on work and require expensive equipment (e.g. Covaris). The Nextera kit improves this process by combining genome fragmentation and adaptor ligation into a single step, which is called tagmentation. Transposases used in the Nextera kit carry adaptors, and when mixed with genomic DNA they shear the DNA and attach the adaptors to both ends of the DNA fragments. This process is very efficient and only takes a few minutes. Although library preparation has been simplified by tagmentation, PCR is still required prior to loading the library into an NGS instrument. We believe that the NGS pipeline would be significantly improved if the library could be directly prepared on a flowcell surface, thus eliminating the need for upfront library construction. Ideally, genomic DNA could be loaded directly into the sequencing machine with no additional hands-on work. With our method, NGS users could simply insert genomic DNA into a sequencing instrument and prepare a library directly on the flowcell surface.

Herein, we demonstrate that sequencing libraries can be successfully generated on a flowcell surface with surface bound transposases. In this approach, DNA molecules are fragmented and linked to the flowcell surface by the Tn5 transposase. The linked DNA molecules are then ready for cluster generation. We believe our approach would simplify the NGS pipeline significantly and contribute to the goal of sequencing a genome for $100 or less.

Results

Overall process of tagmentation on polyacrylamide gel

In order to perform the tagmentation on a solid surface rather than in solution, we first attached the Tn5 transposases to a surface. A thin (~10 μm) polyacrylamide hydrogel was used for the solid surface, as has been described previously for cluster generation [4–6]; Illumina flowcells also carry a very thin hydrogel layer. We designed two oligonucleotides which contain the Illumina adapter sequences and a 19 bp mosaic end (ME) sequence, which the Tn5 will recognize. The 5′ end of the oligonucleotides was modified with an acrydite group to allow incorporation into the polyacrylamide gel as it polymerized (Fig. 1a), thereby linking the Tn5 transposases to the surface when the Tn5 binds the 19 bp ME (Fig. 1b). We also included a 20 base oligo dT spacer at the 5′ end of the oligonucleotides to make the ME more accessible to the transposases. Once the transposases were assembled on the polyacrylamide gel, genomic DNA and reaction buffer were applied to the surface for simultaneous fragmentation and attachment (or tagmentation) to the surface. The tagmentation reaction occurred at 55 °C, after which the DNA was fragmented and attached to the surface (Fig. 1c). In our experiments, the fragment size ranged from 200 bp to 1 kb. As shown in Fig. 1d and e, the DNA fragments are attached to the surface at both ends.

Fig. 1 Tagmentation on surface. a Oligonucleotides are attached to a ~10 μm thick polyacrylamide gel. The 5′ end of the oligos has an acrydite modification for attachment to the acrylamide matrix. b The Tn5 transposases are assembled on the dsDNA oligos on the polyacrylamide gel surface. c Genomic DNA is linked to the surface through tagmentation via incubation at 55 °C for 20 min with TAPS buffer. d, e Side view and vertical view of the linked DNA on surface

Generating sequencing library with surface tagmentation

After the tagmentation one could directly perform cluster generation on the surface, as previously described in the literature [8–10]. However, herein, to confirm the library that we generated, following tagmentation we performed a polymerase extension step to fill the 9 bp gap created by Tn5, as illustrated in Fig. 1c. To evaluate the quality of the surface bound library molecules, we transferred the hydrogel to a standard 200 μl PCR tube and recovered the library molecules by PCR. Finally, we sequenced the library on a MiSeq sequencer.

One would expect that the library quantity and the size of the fragments would be related to the density of immobilized oligonucleotides, which controls the density of immobilized transposases on the surface. Therefore, we generated sequencing libraries on multiple surfaces with oligonucleotide concentrations of 0.1 μM, 0.5 μM, 1 μM, 5 μM, 10 μM, and 20 μM (Fig. 2a and b), and the library molecules shifted to smaller sizes when we increased the oligonucleotide concentration (Fig. 2b). We were not able to generate a sequencing library with oligonucleotide concentrations below 1 μM (Fig. 2a). Finally, we sequenced a Drosophila genomic DNA library generated on a surface with an oligonucleotide concentration of 1 μM. As shown in Fig. 2c, the fragment size ranges from 200 bp to around 1 kb for this library, which is adequate for sequencing on an Illumina MiSeq.

Fig. 2 Drosophila sequencing library. a and b Agarose gel (2%) demonstrating the PCR products from surface tagmentation; various amounts of acrydite oligonucleotides were used when casting the poly-acrylamide gel. c Agarose gel (2%) demonstrating the sequencing library after surface tagmentation and 16 cycles of PCR. The sequencing library was size selected with Agencourt Ampure XP beads

After sequencing, the resulting data was mapped to the reference genome with BWA. 97.3% of the reads were aligned to the reference genome with 1.8% PCR duplicates. The reads mapped uniformly across the Drosophila genome (Fig. 3). To further address sequencing bias, we downsampled the data to 1.2× depth of coverage and compared the actual breadth of coverage to the theoretical expected value (Table 1). We then compared our genome sequencing library to a library made from a commercial Illumina Nextera Kit, which used an in-solution tagmentation technique (Table 1). When the two libraries were analysed at a similar sequencing depth, the coverage was comparable, with the surface bound library being slightly better (58.5% vs. 52.0%), although the expected coverage at this sequencing depth is around 71%.

Fig. 3 Sequencing reads distribution. All reads were mapped to the reference genome with BWA while only reads with mapping quality higher than 30 were used. Read counts were normalized to chromosome length

Table 1 Comparison of NGS libraries made by surface tagmentation and Nextera kit. Nextera data was downloaded from NCBI Sequence Read Archive (SRA, ERR481289). Reads from our surface tagmentation and the Nextera data were downsampled with Picardtools (PROBABILITY = 0.5) to have approximately 1.2× coverage. The expected breadth of coverage was calculated according to the formula $C_b = 1 - 1/e^{C_d}$, where $C_d$ stands for the depth of coverage while $C_b$ stands for the breadth of coverage

Method                  Depth of Coverage   Breadth of Coverage   Expected Breadth of Coverage
Surface tagmentation    1.249               58.5%                 71.3%
Nextera Kit             1.272               52.0%                 72.0%

Surface tagmentation with combed DNA

Tagmentation has several advantages over other NGS library preparation methods, and requires less starting material. As little as 50 ng of DNA is enough to generate a library using Tn5 tagmentation in solution. However, our approach requires 300 ng since the surface bound Tn5 reduces the DNA collision frequency. In order to solve this potential problem, we used combed DNA on a surface rather than DNA in solution. DNA combing is a technique to stretch DNA on a hydrophobic surface by a receding air-water meniscus. We first combed DNA on polydimethylsiloxane (PDMS) and then transferred it to a polyacrylamide surface (Fig. 4a, b, c and d). In this way, DNA molecules are more likely to be captured by the transposases, so a much lower concentration of DNA can be used. Another benefit of using combed DNA is that DNA molecules are kept in their original shape with minimal shearing. As can be visualized in Fig. 4, these DNA fibers can be as long as 100 μm, which corresponds to approximately 200 kb. Once combed DNAs were transferred to the polyacrylamide gel, they were tagmented by the transposases and linked to the surface (Fig. 4e and f). By using combed DNA, we could generate a library with as little as 50 ng of DNA.

Fig. 4 Combed DNA on a surface. a and b YoYo-1 stained DNA stretched on PDMS. c and d Combed DNA on poly-acrylamide gel before tagmentation. e and f Combed DNA on poly-acrylamide gel after tagmentation

Discussion

Library preparation is the first step in the NGS pipeline; the process has been standardised and several kits are commercially available. With these kits, a standard sequencing library can be prepared in around 10 h. The Nextera kit from Illumina, which utilizes Tn5 transposases to generate the sequencing library, dramatically reduces the hands-on time to 2 h by tagmentation, and this kit can generate a reasonable library for NGS. However, the kit itself is expensive and it still requires the sequencing library to be produced prior to injection into the sequencing instrument.

In order to make this technique more accessible to small labs, Picelli and colleagues cloned a Tn5 transposase, which generates sequencing libraries comparable to those from the Nextera kit. Using this enzyme, we have developed a method to generate DNA libraries directly on a flowcell surface, and ultimately within the sequencing instrument. In our approach, the Tn5 transposases are first attached to a polyacrylamide gel surface, where they fragment genomic DNA molecules and simultaneously link them to the surface. Once the DNA molecules are linked to the surface, one could generate clusters on the same surface and directly sequence the clusters. This approach could significantly simplify the whole library preparation and sequencing procedure. However, in this study, to confirm the quality of the libraries generated by surface tagmentation, we constructed a Drosophila genome library by surface tagmentation and then extracted the library for sequencing to compare the overall performance to solution-based tagmentation.

Generally, NGS library preparation requires 100 ng to 1 μg of genomic DNA as starting material, while the Nextera kit only uses 50 ng. With our method, 300 ng of genomic DNA was used. This is acceptable but not ideal if the DNA material is precious or difficult to acquire. To overcome this shortcoming, we made use of the DNA combing strategy and successfully reduced the starting material to 50 ng. DNA molecules were first combed on PDMS and then transferred to a flowcell surface. We made the PDMS surface in advance, and it only took several minutes to transfer the DNA from the PDMS to the flowcell surface. Ideally, we could comb DNA directly on the flowcell surface; however, the DNA combing process was not as efficient on an acrylamide surface since it is hydrophilic. Two possible solutions would be replacing the polyacrylamide gel with a hydrophobic surface or optimizing the combing conditions, such as pH or salt concentration.

Overall, our surface tagmentation strategy produced sequencing results that are comparable to those prepared in solution, while significantly simplifying the pre-sequencing library construction procedure. With DNA fragmented and attached to the flowcell surface, the DNA molecules are ready for cluster generation with no additional PCR amplification. To go one step further, the surface tagmentation step could be automated in a sequencing instrument, and the original DNA material could be loaded directly onto a sequencing flowcell without any hands-on work.

Conclusion

In this study, we developed an approach to generate DNA libraries directly on a flowcell surface. With Tn5 transposase, we successfully generated a Drosophila sequencing library by surface tagmentation and the performance was comparable to the Nextera kit. Ultimately, our approach would allow cluster generation right after the tagmentation without any PCR, which drastically simplifies the whole NGS pipeline.

Methods

Genomic DNA isolation

Genomic DNA was extracted from Drosophila melanogaster (female) using phenol chloroform. DNA was quantified with the Qubit dsDNA BR Assay Kit (Life Technologies) and stored at −20 °C. The pTXB1-Tn5 plasmid was acquired from Addgene (plasmid #60240).

Preparation of poly-acrylamide gel surface

Clean glass slides were silanized with bind-silane (GE Healthcare) overnight at room temperature. 5 μl of poly-acrylamide gel mix (4% acrylamide/bis, 1 μM pre-annealed acrydite modified double-stranded oligonucleotides (equal molar of Tn5A/Tn5R and Tn5B/Tn5R), 0.005% TEMED and 0.005% ammonium persulfate) was loaded between the silanized slide and a 22 × 22 mm coverslip. The acrylamide polymerized for 1 h at room temperature. To wash off the excess acrylamide monomer and unbound primers, the slide was incubated in 40 ml of 0.5× SSC on a shaker for 30 min. The acrydite modified oligonucleotide sequences are as follows:

Tn5A, 5′-[Acrydite]TTTTTTTTTTTTTTTTTTTTTCGTCGGCAGCGTCAGATGTGTATAAGAGACAG-3′
Tn5B, 5′-[Acrydite]TTTTTTTTTTTTTTTTTTTTTGTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG-3′
Tn5R, 5′-[phos]CTGTCTCTTATACACATCT-3′

DNA combing

Genomic DNA was stained with YoYo-1 (ThermoFisher) in phosphate buffered saline (pH 8.3) overnight at 4 °C. Polydimethylsiloxane (PDMS) coated coverslips were immersed into a reservoir containing YoYo stained DNA and incubated for 1 h at room temperature. Each PDMS coated coverslip was pulled up and out of the reservoir at 300 μm/s. Stretched DNA was visualized using an inverted fluorescence microscope (Keyence).

Surface tagmentation and library preparation

Tn5 transposases, purified as previously described, were loaded onto the poly-acrylamide gel slide and incubated at 37 °C for 1 h. The slide was then washed several times with Tn5 wash buffer (100 mM Tris-HCl, 200 mM NaCl, 1 mM EDTA and 0.2% Triton-X100), and the surface was allowed to dry while preparing the tagmentation mix (1 μl 10× TAPS-MgCl2, 2 μl 40% PEG8000, 4 μl H2O and 3 μl 200 ng/μl DNA). When combed DNA was used, it was transferred from PDMS by pressing the PDMS onto the polyacrylamide gel surface preloaded with Tn5 transposases for 5 min. Tagmentation and library PCR were performed as previously described. Briefly, the tagmentation was performed at 55 °C for 20 min and followed by several washes with nuclease free water. To confirm the library, the poly-acrylamide gel was then scraped off the slide and transferred to a 0.2 ml tube for library PCR using standard Taq polymerase (to make a 50 μl reaction: 5 μl of 10× standard buffer, 0.4 μM primer mix, 0.25 U Taq polymerase and nuclease free water). The PCR was performed as follows: 68 °C for 3 min, 95 °C for 30 s, then 16 cycles of 95 °C for 15 s, 63 °C for 30 s and 68 °C for 3 min. The PCR primer sequences are as follows:

forward primer: 5′-AATGATACGGCGACCACCGAGATCTACACTCGTCGGCAGCGTC-3′
reverse primer: 5′-CAAGCAGAAGACGGCATACGAGATGTCTCGTGGGCTCGG-3′

The PCR products were purified and size selected by Agencourt AMPure XP beads (Beckman Coulter). The library was quantified by qPCR.

MiSeq sequencing

The library was diluted to a final concentration of 2 nM and 5% PhiX was spiked in as control. Sequencing was performed on a MiSeq instrument with the Nano Kit V2 in paired 150 bp mode (Illumina).

Data analysis

All sequencing reads were mapped to the reference genome using BWA. The resulting bam file was sorted and indexed using samtools. Reads with mapping quality lower than 30 were removed. PCR duplicates were removed using Picardtools. Coverage analysis was performed by homemade shell scripts. Nextera sequencing data was downloaded from NCBI Sequence Read Archive (SRA; http://www.ncbi.nlm.nih.gov/sra) under accession number ERR481289.

Abbreviation

PDMS: Polydimethylsiloxane

Funding

This work was supported by the National Institutes of Health [R01HG006876 and P50GM085273], and the UNM Comprehensive Cancer Center Support Grant [P30CA118100]. R01HG006876 (JSE PI) supported the generation of the data and provided funding to KF. This project also used the biocomputing shared resources of the UNM Comprehensive Cancer Center (P30CA118100) and the UNM Spatial Temporal Modeling Center (P50GM085273).

Availability of data and materials

All Illumina sequencing data has been deposited in the SRA database (SAMN09002962; BioProject PRJNA453995). Please contact JE for any material requests.

Authors' contributions

JSE and JC conceived of the study. KF performed all the experiments and data analysis. All authors contributed to drafting the manuscript. All authors have read and approve of the manuscript.

Ethics approval and consent to participate

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Author details

Chemistry and Chemical Biology, University of New Mexico, Albuquerque, NM 87131, USA. Special Projects, Centrillion Technologies, Palo Alto, CA 94303, USA. Internal Medicine, Chemical and Biological Engineering, University of New Mexico, Albuquerque, NM 87131, USA. University of New Mexico Comprehensive Cancer Center, Albuquerque, NM 87131, USA. University of New Mexico Health Sciences Center, Albuquerque, NM 87131, USA.

Received: 13 December 2017 Accepted: 14 May 2018

References

1. van Dijk EL, Auger H, Jaszczyszyn Y, Thermes C. Ten years of next-generation sequencing technology. Trends Genet. 2014;30(9):418–26.
2. Schuck RN, Grillo JA. Pharmacogenomic biomarkers: an FDA perspective on utilization in biological product labeling. AAPS J. 2016;18(3):573–7.
3. Head SR, Komori HK, LaMere SA, Whisenant T, Van Nieuwerburgh F, Salomon DR, Ordoukhanian P. Library construction for next-generation sequencing: overviews and challenges. BioTechniques. 2014;56(2):61–4. 66, 68, passim
4. Mitra RD, Church GM. In situ localized amplification and contact replication of many individual DNA molecules. Nucleic Acids Res. 1999;27(24):e34.
5. Butz JA, Yan H, Mikkilineni V, Edwards JS. Detection of allelic variations of human gene expression by polymerase colonies. BMC Genet. 2004;5:3.
6. Merritt J, DiTonno JR, Mitra RD, Church GM, Edwards JS. Parallel competition analysis of Saccharomyces cerevisiae strains differing by a single base using polymerase colonies. Nucleic Acids Res. 2003;31(15):e84.
7. Bentley DR, Balasubramanian S, Swerdlow HP, Smith GP, Milton J, Brown CG, Hall KP, Evers DJ, Barnes CL, Bignell HR, et al. Accurate whole human genome sequencing using reversible terminator chemistry. Nature. 2008;456(7218):53–9.
8. Merritt J, Butz JA, Ogunnaike BA, Edwards JS. Parallel analysis of mutant human glucose 6-phosphate dehydrogenase in yeast using PCR colonies. Biotechnol Bioeng. 2005;92(5):519–31.
9. Merritt J, Roberts KG, Butz JA, Edwards JS. Parallel analysis of tetramerization domain mutants of the human p53 protein using PCR colonies. Gen Med. 2007;1(3–4):113–24.
10. Mikkilineni V, Mitra RD, Merritt J, DiTonno JR, Church GM, Ogunnaike B, Edwards JS. Digital quantitative measurements of gene expression. Biotechnol Bioeng. 2004;86(2):117–24.
11. Steiniger M, Adams CD, Marko JF, Reznikoff WS. Defining characteristics of Tn5 transposase non-specific DNA binding. Nucleic Acids Res. 2006;34(9):2820–32.
12. Adams MD, Celniker SE, Holt RA, Evans CA, Gocayne JD, Amanatides PG, Scherer SE, Li PW, Hoskins RA, Galle RF, et al. The genome sequence of Drosophila melanogaster. Science. 2000;287(5461):2185–95.
13. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25(14):1754–60.
14. Adey A, Morrison HG, Asan, Xun X, Kitzman JO, Turner EH, Stackhouse B, MacKenzie AP, Caruccio NC, Zhang X, et al. Rapid, low-input, low-bias construction of shotgun fragment libraries by high-density in vitro transposition. Genome Biol. 2010;11(12):R119.
15. Bensimon A, Simon A, Chiffaudel A, Croquette V, Heslot F, Bensimon D. Alignment and sensitive detection of DNA by a moving interface. Science. 1994;265(5181):2096–8.
16. Caruccio N. Preparation of next-generation sequencing libraries using Nextera technology: simultaneous DNA fragmentation and adaptor tagging by in vitro transposition. Methods Mol Biol. 2011;733:241–55.
17. Picelli S, Bjorklund AK, Reinius B, Sagasser S, Winberg G, Sandberg R. Tn5 transposase and tagmentation procedures for massively scaled sequencing projects. Genome Res. 2014;24(12):2033–40.
18. Schwartz JJ, Lee C, Hiatt JB, Adey A, Shendure J. Capturing native long-range contiguity by in situ library construction and optical sequencing. Proc Natl Acad Sci U S A. 2012;109(46):18749–54.
19. Benke A, Mertig M, Pompe W. pH- and salt-dependent molecular combing of DNA: experiments and phenomenological model. Nanotechnology. 2011;22(3):035304.
20. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. The sequence alignment/map format and SAMtools. Bioinformatics. 2009;25(16):2078–9.
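
Supplementary note: expected breadth of coverage. The expected values reported in Table 1 follow from the formula given in its caption, Cb = 1 - 1/e^Cd, which is the usual result when reads are assumed to land uniformly at random on the genome. The short Python sketch below is only an illustration of that arithmetic and is not the authors' coverage script (the article states that coverage analysis was done with homemade shell scripts). The function name `expected_breadth` and the `TABLE_1` dictionary are hypothetical names introduced here; the depth and observed breadth values are copied from Table 1.

```python
import math


def expected_breadth(depth: float) -> float:
    """Expected fraction of the genome covered at least once.

    Implements C_b = 1 - 1/e^{C_d} from the Table 1 caption, where
    `depth` is the mean depth of coverage C_d. The name of this helper
    is an assumption made for illustration.
    """
    return 1.0 - math.exp(-depth)


# Depth of coverage and observed breadth of coverage, as reported in Table 1.
TABLE_1 = {
    "Surface tagmentation": (1.249, 0.585),
    "Nextera Kit": (1.272, 0.520),
}


def summarize(table):
    """Return (method, depth, observed breadth, expected breadth) rows."""
    return [
        (method, depth, observed, expected_breadth(depth))
        for method, (depth, observed) in table.items()
    ]


if __name__ == "__main__":
    for method, depth, observed, expected in summarize(TABLE_1):
        print(
            f"{method}: depth={depth:.3f} "
            f"observed={observed:.1%} expected={expected:.1%}"
        )
```

Run as a script, this gives expected breadths of 71.3% and 72.0% for the two depths, matching the last column of Table 1. Both libraries fall short of that value, which is consistent with the statement in the Results that the expected coverage at this depth is around 71%.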
– Springer Journals
Published: May 30, 2018