Efficient generation of complete sequences of MDR-encoding plasmids by rapid assembly of MinION barcoding sequencing data

Abstract

Background: Multidrug resistance (MDR)-encoding plasmids are considered major molecular vehicles responsible for transmission of antibiotic resistance genes among bacteria of the same or different species. Delineating the complete sequences of such plasmids could provide valuable insight into the evolution and transmission mechanisms underlying bacterial antibiotic resistance development. However, due to the presence of multiple repeats of mobile elements, complete sequencing of MDR plasmids remains technically complicated, expensive, and time-consuming.

Results: Here, we demonstrate a rapid and efficient approach to obtaining multiple MDR plasmid sequences through the use of the MinION nanopore sequencing platform, which is incorporated in a portable device. By assembling the long sequencing reads generated by a single MinION run according to a rapid barcoding sequencing protocol, we obtained the complete sequences of 20 plasmids harbored by multiple bacterial strains. Importantly, single long reads covering a plasmid end-to-end were recorded, indicating that de novo assembly may be unnecessary if the single reads exhibit high accuracy.

Conclusions: This workflow represents a convenient and cost-effective approach for systematic assessment of MDR plasmids responsible for treatment failure of bacterial infections, offering the opportunity to perform detailed molecular epidemiological studies to probe the evolutionary and transmission mechanisms of MDR-encoding elements.

Keywords: multidrug resistance (MDR) plasmids; de novo assembly; nanopore sequencing; long reads

Received: 22 June 2017; Revised: 17 August 2017; Accepted: 13 December 2017
© The Author(s) 2018. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
Li et al., GigaScience, Oxford University Press. Downloaded from https://academic.oup.com/gigascience/article-abstract/7/3/1/4794946 by Ed 'DeepDyve' Gillespie user on 16 March 2018

Introduction

The emergence and increasing prevalence of antimicrobial resistance (AMR) among bacterial pathogens pose increasing public health challenges worldwide by drastically reducing the number of antimicrobials that can be effectively used in treatment of bacterial infections [1, 2]. Identification of the key mechanisms responsible for AMR transmission is crucial to combat the threats imposed by AMR. Plasmids, especially MDR-encoding plasmids, are now considered a major vector that facilitates AMR transmission among bacteria via horizontal transfer [3, 4]. Delineating the full length of plasmids and genetic structures of other MDR mobile elements is vital for understanding how such elements undergo evolutionary changes and horizontal transmission and adapt to new hosts [4]. However, due to the presence of numerous insertion sequences and other repetitive elements in MDR plasmids, it is often difficult and time-consuming to obtain the complete plasmid sequences by next-generation sequencing with short reads and polymerase chain reaction (PCR) mapping by Sanger sequencing. With the development of long read sequencing technology, tracking plasmid diversity by full assembly of plasmids has become possible [5]. To date, single-molecule, real-time sequencing (SMRT) can generate full-sequence plasmids. However, the high cost and laborious library preparation procedure of this technology render it inaccessible for most laboratories.

Recently, another long read sequencing technology based on the use of a portable MinION device has become available from Oxford Nanopore Technologies (ONT). Although the accuracy of reads generated by this technique is generally lower than that of short reads, it exhibits a promising capability to generate complete chromosome and plasmid sequences [6, 7]. With the advance of library preparation techniques and data analysis tools, we found that this technology is feasible for MDR plasmid sequencing. Here, we evaluated the feasibility of decoding the complete sequences of multiple MDR plasmids using MinION nanopore sequencing technology through a run with a reusable flow cell within a short time frame. This workflow should enable laboratories equipped with only basic molecular biology techniques to perform detailed MDR plasmid analysis.

Data Description

Raw long sequencing data collected after a MinION run were de-multiplexed by Albacore basecalling software (v1.0.3) to generate fast5 files allocated into 12 samples. The Poretools tool suite was used to extract reads in fasta format, which then proceeded to de novo assembly and hybrid assembly with Canu (v1.3) and Unicycler (v0.3). The end result was 20 complete plasmids and 1 near-complete plasmid that were efficiently obtained with the data from a single MinION run. The detailed procedures for data analysis are described in the Methods.

Results

MinION workflow overview

Twelve MDR plasmid-harboring samples were prepared according to the MinION library construction protocols, followed by library sequencing. After 8 hours of sequencing, a total of 287 725 reads ranging from dozens to tens of thousands of bases in length were obtained, covering a total of 493 Mbp (Fig. 1A). It was estimated that the data should be enough for de novo assembly; hence the run was stopped manually to save active nanopores for future use. The raw data were subjected to several stages of processing, including basecalling, de-multiplexing, fasta sequence extraction, and de novo assembly, as stated in the Methods section. Upon de-multiplexing, a total of 121 584 reads were allocated into the 12 samples, which ranged from 5273 to 22 319 in read number and 18 to 93 Mbp in total length (Fig. 1B). Reads that were unsuccessfully basecalled and unclassified reads generated during the de-multiplexing process were excluded from the assembly analysis. By optimizing the parameters of the de novo assembly tool, we obtained the complete sequences of the MDR plasmids recovered from 11 samples; the exception was RB08, which was severely contaminated by chromosomal DNA.

Figure 1: Statistics of an 8-hour MinION nanopore sequencing run using the Rapid Barcoding Sequencing Kit. (A), Distribution of read length and data volume generated by the MinION run in 8 hours. (B), Total base length and read number of the 12 samples after de-multiplexing.

Evaluation of plasmid assembly efficiency

Apart from sample RB08, de novo assembly was successfully performed by Canu on the 11 MDR plasmid-harboring samples. High-quality assembled sequences were obtained using Unicycler by combining the long reads with short read data. One to 5 plasmids, which ranged from 46 to 238 kb in length, were found in each sample, with a total of 20 complete plasmids and 1 near-complete plasmid being obtained from 11 samples (Table 1). To evaluate the accuracy of de novo assembly of rapid 1D sequencing data generated by the MinION platform, the RB01 sample was selected for comparison between paired-end Illumina sequencing data and nanopore sequencing data. The sequences of 2 plasmids, RB01-LZ135-CTX-128976 and RB01-LZ135-NDM-90845, were selected for evaluation of the nanopore reads' quality (Fig. 2). Without size selection during library preparation, the read lengths ranged from 18 to 97 206 bp, and the N50 was 6473 bp. Based on the alignment of reads to the 2 reference plasmids, the accuracy of the MinION nanopore long reads was about 87%.

Table 1: Basic data of 12 MDR plasmid-harboring samples used in the single multiplexed MinION run

Sample | Marker genes | Species | Plasmid profile, kb (a) | Input DNA in 7.5 μL, ng (b) | Volume, μL (c) | Quantity, ng (d)
RB01 | blaNDM-5 | Escherichia coli | 150, 100 | 750 | 0.8 | 60
RB02 | blaNDM-5 | Escherichia coli | 160, 135, 100, 60, 40 | 2010 | 0.4 | 160.8
RB03 | blaNDM-1 | Escherichia coli | 330, 60 | 259.5 | 1.1 | 20.76
RB04 | blaNDM-1 | Escherichia coli | 110, 130, 230 | 937.5 | 0.7 | 75
RB05 | blaCTX-M-15 | Escherichia coli | 150 | 484.5 | 0.8 | 38.76
RB06 | blaCTX-M-15 | Escherichia coli | 250 | 270 | 1 | 21.6
RB07 | blaCTX-M-15 | Vibrio parahaemolyticus | 120 | 654 | 0.8 | 52.32
RB08 | blaCTX-M-3, blaTEM-1 | Salmonella typhimurium | 340 | 885 | 0.8 | 70.8
RB09 | blaKPC-2 | Escherichia coli | 70 | 639 | 0.8 | 51.12
RB10 | blaKPC-2 | Escherichia coli | 100, 130 | 346.5 | 1.1 | 27.72
RB11 | blaKPC-2 | Klebsiella pneumoniae | 240 | 1125 | 0.8 | 90
RB12 | blaKPC-2 | Escherichia coli | 120, 100 | 495 | 0.9 | 39.6

(a) Plasmid profile was determined by S1 nuclease PFGE; the sizes of the plasmids were roughly estimated based on S1-PFGE.
(b) The input quantities of plasmid DNA in 7.5 μL during library preparation.
(c) The volume of each sample in the 10-μL pooled library.
(d) The actual quantity of DNA of each sample used in MinION sequencing.

Figure 2: Evaluation of MinION nanopore sequencing long reads quality with nanonet. (A), Read counts along with read length for sample RB01. All the raw reads can be retrieved from the Supplementary Data. (B), Nanopore read coverage with RB01-LZ135-CTX-128976 as reference. (C), Nanopore read coverage with RB01-LZ135-NDM-90845 as reference. (D–F), Alignment identity and GC distribution for reads aligned with RB01-LZ135-CTX-128976. (G–I), Alignment identity and GC distribution for reads aligned with RB01-LZ135-NDM-90845.

Complete plasmid sequences obtained from de novo assembly by Canu based on long reads were compared with the reference plasmids (assembled by Unicycler) by BLASTN. The plasmids completed by Canu were 97% identical to the reference plasmids; the difference was mainly due to artifactual deletions in the plasmids assembled by Canu, resulting in overall sequences 3043 bp and 1949 bp shorter than RB01-LZ135-CTX-128976 and RB01-LZ135-NDM-90845, respectively. No major structural variations were observed between the 2 different de novo assembly methods (Fig. 3), indicating that nanopore long reads can be used to accurately resolve the mosaic structures frequently found in plasmids.

Figure 3: Linear alignment of reference plasmids and corresponding plasmids assembled by Canu based only on MinION nanopore long reads of sample RB01. (A), Alignment of RB01-LZ135-CTX-128976 and RB01-LZ135-CTX-125934 (Canu). A large deletion in the plasmid RB01-LZ135-CTX-125934 is marked by a red vertical line. (B), Alignment of RB01-LZ135-NDM-90845 and RB01-LZ135-NDM-88896 (Canu). The crossed alignment region indicates that the blaNDM-5 region was duplicated. The 2 plasmid sequences assembled by Canu can be retrieved from the Supplementary Data. The 2 reference plasmids were deposited in the NCBI database.

Using only nanopore data to complete plasmids of interest is recommended when no short read data are available. Hybrid assembly with Unicycler is the more accurate way to obtain complete genome sequences. For sequences that cannot be resolved by Unicycler, Canu can be an option. Detailed comparison results using hybrid assembly by Unicycler and nanopore data-based assembly by Canu are described in Table S1. The advantages of assembly using Canu include high efficiency, real-time monitoring, and cost-effectiveness, while the disadvantage is its lower accuracy when compared with the hybrid assembly approach using Unicycler. With the development of ONT technologies that significantly improve the accuracy of single reads, assembly using Canu is likely the best choice going forward.

Characterization of MDR plasmids

The number of resistance genes detectable among the 20 complete plasmids and 1 near-complete plasmid sequenced in this study ranged from 0 to 12, insertion sequences from 1 to 10, and replicon genes from 1 to 4 (Table 2, Fig. 4). This implied that the plasmids tested in this study had complex structures, the complete sequences of which were usually difficult to obtain by short read sequencing technology due to the presence of numerous repetitive sequences.

Table 2: Overview of structures and genetic characteristics of 21 MDR plasmids recovered from 11 samples

Plasmid | Size, bp | Structural status | No. of resistance genes | No. of insertion sequences | No. of replicon genes
RB01-LZ135-CTX-128976 | 128976 | Circular | 8 | 5 | 2
RB01-LZ135-NDM-90845 | 90845 | Circular | 5 | 2 | 1
RB02-JN105-IncF-TET-116277-N | 116277 | Circular | 6 | 6 | 2
RB02-JN105-IncN-CTX-139496-N | 142307 | Circular | 9 | 2 | 2
RB02-JN105-IncN-NDM6-55342 | 55342 | Circular | 3 | 3 | 1
RB02-JN105-IncX-NDM5-45823 | 45823 | Circular | 1 | 4 | 1
RB02-JN105-IncY-CTX-98443 | 98443 | Circular | 0 | 1 | 1
RB03-WH96T-IncF-OXA-153088 | 153088 | Circular | 3 | 9 | 4
RB03-WH96T-IncN-NDM1-56215 | 56215 | Circular | 2 | 4 | 1
RB04-SZ584-1T-IncF-TET-114056 | 114065 | Circular | 7 | 6 | 2
RB04-SZ584-1T-IncX3-NDM1-56K-NC | 55919 | Linear | 2 | 4 | 1
RB04-SZ584-1T-IncY-130821 | 130821 | Circular | 0 | 9 | 1
RB05-C267-IncA/C-CTX-166467 | 166467 | Circular | 10 | 3 | 1
RB06-C499-IncA/C-CTX-192739 | 192739 | Circular | 11 | 3 | 1
RB07-vb0506-IncA/C-CTX-133742 | 133742 | Circular | 6 | 2 | 1
RB09-IncN-KPC-68571 | 68571 | Circular | 7 | 6 | 1
RB10-29KPC-IncF-TET-136532 | 136532 | Circular | 12 | 6 | 3
RB10-29KPC-IncY-KPC-98K-N | 95908 | Circular | 1 | 2 | 1
RB11-IncF-IncHI-KPC-238153 | 238153 | Circular | 2 | 10 | 2
RB12-74T-KPC-IncF-115K-N | 115689 | Circular | 0 | 6 | 4
RB12-74T-KPC-IncN-IncX1-KPC-108K-N | 107969 | Circular | 5 | 4 | 3

Plasmid names ending with the letter N indicate that the plasmids could be assembled by Canu, based on MinION nanopore reads, but could not be assembled using the hybrid assembly strategy with Unicycler. Plasmid names ending with NC indicate incomplete assembly due to low coverage of reads resulting from the low copy number of large plasmids. Although this plasmid was not fully completed, it was still included in further analysis together with the other plasmids.

Figure 4: Distribution of resistance genes, replicon genes, and insertion sequences among 21 plasmids. Red boxes indicate the presence of corresponding genes, and blue boxes indicate absence of the corresponding genes. The 21 plasmid sequences can be retrieved from the Supplementary Data. It should be noted that 1 plasmid, RB04-SZ584-1T-IncX3-NDM1-56K-NC, was not fully completed.

To demonstrate the ability of nanopore long reads to resolve the complex structures of MDR plasmids, sample RB01 was investigated in detail. Upon de novo assembly, 2 complete plasmids were obtained and designated as RB01-LZ135-CTX-128976 and RB01-LZ135-NDM-90845, respectively. This sample originated from a clinical carbapenem-resistant E. coli strain harboring the blaCTX-M-15 and blaNDM-5 genes, which was reported previously [8].

In the IncFII type plasmid RB01-LZ135-NDM-90845, which was 90 845 bp in length, there was an MDR mosaic region composed of a Tn3 transposon containing the blaTEM-1 and rmtB genes, and IS26-ISAba125-blaNDM-5-bleMBL-traF-tat-ISCR1-sul1-qacEdelta1-aadA2-dfrA12-intI1-IS26. Intriguingly, the latter fragment was duplicated in a tandem repeat format (Fig. 5A). Online BLASTN of this blaNDM-5-bearing plasmid against the NCBI database showed that it was highly similar to the plasmid pMC-NDM, which was recovered from a metallo-beta-lactamase-producing E. coli strain in Poland (GenBank no. HG003695), with 99% identity at 97% coverage. The 2 major differences were the existence of tandem repeats and a region replacement (Fig. 5A). The blaCTX-M-15-bearing plasmid RB01-LZ135-CTX-128976 was 128 976 bp in length and had a conserved structure similar to that found in plasmid pECY55, which was harbored by a previously reported E. coli strain (GenBank no. KU043115), with 99% identity at 97% coverage. The MDR region harboring tetA, aac(6')-Ib-cr, blaOXA-1, blaCTX-M-15, dfrA17, aadA5, sul1, chrA, and mph(A) was shared by these 2 plasmids, and 2 group II introns were found inserted in the backbone compared with pECY55 (Fig. 5B). Detailed analysis of the longest reads after BWA MEM alignment showed that 2 long reads spanned the plasmid RB01-LZ135-NDM-90845 end to end, and another 2 long reads could be aligned to generate plasmid RB01-LZ135-CTX-128976 (Fig. 5C and D). This is the first case in which the whole plasmid sequence could be generated by only 1 single read.

Figure 5: Alignment of plasmids in RB01 with similar structures in the NCBI database and MinION nanopore long read alignment with complete plasmids. (A), Alignment between pMC-NDM and RB01-LZ135-NDM-90845. The resistance genes are highlighted in red, transposase genes in yellow, IS26 in cyan, group II intron gene in pink, and other CDS in light green. The sequence contained a large duplication region (ca. 10 kbp) designated as D1 and D2, each harboring a class 1 integron and a blaNDM-1 cluster. (B), Alignment between pECY55 and RB01-LZ135-CTX-128976. The CDS were labeled according to the labeling scheme in Figure 5A. The same group II intron genes were inserted and duplicated in RB01-LZ135-CTX-128976 compared with pECY55. (C), BLASTN of 2 MinION long reads against RB01-LZ135-NDM-90845. The results indicated that the whole plasmid could be sequenced end to end. (D), BLASTN of 2 MinION long reads against RB01-LZ135-CTX-128976. The results indicated that 2 MinION long reads could cover the entire plasmid. The 4 long reads can be retrieved from the Supplementary Data.

Discussion

The advent of next-generation sequencing technologies has revolutionized the way genomic research is conducted [9]. Specifically, it has tremendously facilitated molecular epidemiology studies and research on the diversity and evolution of MDR-encoding elements from both clinical and basic research perspectives [4, 10]. Although it is feasible to assess the distribution of resistance genes among single bacterial or metagenomic samples with traditional short read data, constructing the entire plasmid and chromosome maps that depict the specific location of resistance genes is of vital importance in investigating the evolutionary features of such genes and tracking the evolution and transmission routes of MDR plasmids [4, 11]. The availability of long read sequencing technologies such as SMRT and MinION nanopore sequencing has shed light on the development of efficient approaches to assemble complete genomes with numerous repetitive elements [12, 13]. Owing to the high cost and complex library preparation of SMRT technology, it cannot be commonly utilized in clinical settings and basic molecular laboratories, although this technology has been commercially available for more than 5 years. In contrast, the recently available portable MinION nanopore sequencing technology can be used anywhere as long as a laptop computer is available. In this study, we evaluated the ability of MinION nanopore sequencing technology to resolve mosaic MDR plasmids with the latest R9.4 chemistry.

With the rapid barcoding sequencing kit, the sequences of 20 complete plasmids (and 1 near-complete plasmid) harbored by 11 samples could be successfully generated within a few days (Fig. 6). Although de novo assembly of only nanopore long reads by Canu exhibited a relatively low quality of only 97% identity to the reference sequences, the assembled plasmids were found to possess high-quality structural skeletons with correct arrangements of various mobile elements. With Illumina short read data, accurate complete sequences of plasmids could be obtained by Unicycler, which involved 3 steps: contig construction with short reads, scaffolding of contigs with long reads, and polishing with short reads [14]. Importantly, analysis of the 2 MDR plasmids in sample RB01 indicated that single long reads could cover a complete plasmid; this finding implied that the entire plasmid can be sequenced without interruption. In this case, de novo assembly was not necessary since several long reads may cover the whole plasmid. The first antibiotic resistance island that was resolved by MinION nanopore sequencing was reported in 2015 [7]. To the best of our knowledge, this is the first report of complete MDR plasmid sequencing without the need to assemble sheared fragments. It should be noted that although only a few long reads were found to cover the entire plasmid, they were sufficient to cover all the repetitive sequences in the MDR plasmids. With further improvement in MinION sequencing, a plasmid being sequenced end-to-end as a single molecule will become possible in the near future.

Figure 6: Workflow and time span overview of the MinION nanopore sequencing and assembly process. This workflow was based on the rapid barcoding sequencing kit, which could pool 12 samples in a single run. The time for basecalling and de novo assembly depended on the computational performance of the computer utilized, and Illumina short reads were needed if Unicycler was used to obtain high-quality assembled plasmids.

Another advantage of MinION sequencing is that it allows halting of an ongoing sequencing run when sufficient data have been obtained, saving time and, most importantly, the flow cell, which accounts for a significant portion of the cost of MinION sequencing. As a result, the flow cell can be reused several times until most of the nanopores have lost activity. In this work, we finished the run in 8 hours, during which the MinION generated sufficient data for assembling the complete plasmid sequences. Furthermore, the same flow cell was reused in another run, and the data generated were of similar quality to those of the first run. The standard MinKNOW protocol involves running the flow cells for 48 hours. If 1 flow cell can accommodate 3 runs lasting 8, 10, and 12 hours, respectively, then 36 MDR plasmid-harboring samples (3 runs of 12 barcoded samples each) can be sequenced in 1 flow cell using the rapid barcoding kit, leading to a significant reduction in the cost of producing complete plasmid sequences. Furthermore, a real-time hybrid genome assembly approach was reported with the npScarf tool, which can overcome oversequencing issues and shorten the analysis timeline [13]. This real-time analysis workflow has the potential to be combined with the plasmid assembly workflow described in this study.

As extrachromosomal elements, plasmids play a dominant role in the dissemination of antibiotic resistance genes, virulence genes, and other functional genes [15, 16]. Obtaining complete plasmid sequences in a wide range of clinical isolates collected over a prolonged period enables in-depth studies of plasmid evolution and adaptation, the underlying mechanisms of transmission of resistance genes, as well as tracking major antibiotic-resistant pathogenic bacterial strains [5, 16, 17]. The workflow presented in this work offers for the first time the opportunity to perform these studies in a rapid, cost-effective, and user-friendly manner.

Methods

Bacterial MDR plasmids extraction

To evaluate the efficiency of MDR plasmid sequencing by the MinION platform, we selected 12 MDR plasmid-bearing strains including E. coli, S. typhimurium, V. parahaemolyticus, and K. pneumoniae for plasmid extraction (Table 1). Overnight cultures (100 mL) were harvested and subjected to plasmid extraction using the QIAGEN Plasmid Midi Kit. The extracted plasmids were dissolved in ultrapure distilled water, and concentrations were measured by Qubit 3.0 Fluorometer with a dsDNA BR Assay Kit. The plasmids were stored at –20 °C until library preparation.

MinION library preparation and sequencing

Library preparation was performed using the Rapid Barcoding Sequencing Kit (SQK-RBK001) according to the standard protocol provided by the manufacturer (Oxford Nanopore). Briefly, 7.5-μL plasmid templates were combined with a 2.5-μL Fragmentation Mix Barcode (1 barcode for each sample). The mixtures were incubated at 30 °C for 1 minute and at 75 °C for 1 minute. The barcoded libraries were pooled together with designated ratios in 10 μL (Table 1); 1 μL of RAD (Rapid 1D Adapter) was added to the pooled library and mixed gently; 0.2 μL of Blunt/TA Ligase Master Mix was added and incubated for 5 minutes at room temperature. The constructed library was loaded into the Flow Cell R9.4 (FLO-MIN106) on a MinION device and run with the SQK-RBK001 plus Basecaller script of MinKNOW 1.5.12 software. The run was stopped after 8 hours, and the flow cell was washed with a Wash Kit (EXP-WSH002) and stored at 4 °C for later use.

Illumina sequencing

To obtain high-quality short read data, paired-end (2 × 150 bp) libraries were prepared by the focused acoustic shearing method with the NEBNext Ultra DNA Library Prep Kit and the Multiplex Oligos Kit for Illumina (NEB). The libraries were quantified by quantitative PCR with P5-P7 primers, and they were pooled together and sequenced on the NextSeq 500 platform according to the manufacturer's protocol (Illumina).

Basecalling, de-multiplexing, assembly of complete plasmid sequences, and data analysis

Although a local basecaller script was used during the run, a small number of reads were still not basecalled due to the generation of raw data in a rapid mode. Albacore basecalling software (v1.0.3) was used to generate fast5 files harboring the 1D DNA sequence from fast5 files with only raw data in the tmp folder. Also, the read_fast5_basecaller.py script in Albacore was used to de-multiplex the 12 samples from basecalled fast5 files (except the files in the fail folder) based on the 12 barcodes in SQK-RBK001. The Poretools toolkit was utilized to extract all the DNA sequences from fast5 to fasta format for each of the 12 samples (Poretools, RRID:SCR_015879) [18]. The Canu assembly tool (v1.3; Canu, RRID:SCR_015880) [14] was used to perform de novo assembly of complete plasmid sequences based on nanopore 1D long reads in 3 consecutive stages including correction, trimming, and assembly [14]. Due to the possibility of contamination by bacterial chromosomal DNA in the plasmid samples and the large variation in the size of the plasmids, the genomeSize parameter was set at 0.5, 1, 2, and 4 m, respectively, to optimize the assembly results and obtain circular plasmid sequences of interest. The sizes and the numbers of plasmids determined by S1-pulsed-field gel electrophoresis (PFGE) were used to confirm the assembled results. High-quality complete plasmids were constructed by hybrid de novo assembly of Illumina short reads and nanopore long reads using the Unicycler v0.3 tool [19]. NanoOK was adopted to evaluate the quality of nanopore long reads [20]. BWA MEM was used to align long reads against reference plasmids (BWA, RRID:SCR_010910) [21].

To assess the distribution of resistance genes, mobile elements, and replicon genes, the corresponding databases were downloaded [21–23] and BLASTN was performed among the finished plasmids (BLASTN, RRID:SCR_001598). The result was visualized by the tool Genesis [24]. Easyfig was utilized to compare the detailed structures of the MDR plasmids (Easyfig, RRID:SCR_013169) [25].

Availability of supporting data

Raw MinION and Illumina sequencing data are available in the NCBI database via the BioProject number PRJNA398365. The 20 complete and 1 near-complete plasmid sequences of the 12 samples were included as Supplementary Data in the GigaScience database, GigaDB. The 2 plasmids in sample RB01 were deposited in NCBI with the accession numbers MF353155 and MF353156. The 2 plasmids assembled by only MinION nanopore long reads in sample RB01 are also included as Supplementary Data for reference in GigaDB.

Abbreviations

AMR: antimicrobial resistance; BLAST: Basic Local Alignment Search Tool; MDR: multidrug resistance; NCBI: National Center for Biotechnology Information; ONT: Oxford Nanopore Technologies; PCR: polymerase chain reaction; PFGE: pulsed-field gel electrophoresis; SMRT: single-molecule, real-time sequencing.

Additional files

Supplementary Table S1. Comparison between plasmid sequences of first 4 samples assembled by Unicycler using both Illumina and Nanopore data and by Canu using only Nanopore data.
Supplementary data 1-RB01 plasmids by Canu.fa
Supplementary data 2-twenty one plasmids.fa
Supplementary data 3-four long reads.fa

Author contributions

R.L. conceived and initiated the study. M.X., N.D., and D.L. performed bacterial isolation and plasmid extraction. R.L., X.Y., and M.H.W. performed MinION and Illumina sequencing and data analysis. R.L. wrote the first draft of the manuscript. E.W.C. revised the manuscript. S.C. supervised the whole project and edited the manuscript.

Competing interests

The authors declare no competing financial interests.

Acknowledgements

This research was supported by the Chinese National Key Basic Research and Development (973) Program (2013CB127200) and the Collaborative Research Fund of the Hong Kong Research Grant Council (C7038-15G and C5026-16G).

References

1. Holmes AH, Moore LS, Sundsfjord A et al. Understanding the mechanisms and drivers of antimicrobial resistance. Lancet 2016;387(10014):176–87.
2. Marston HD, Dixon DM, Knisely JM et al. Antimicrobial resistance. JAMA 2016;316(11):1193–204.
3. Smillie C, Garcillan-Barcia MP, Francia MV et al. Mobility of plasmids. Microbiol Mol Biol Rev 2010;74(3):434–52.
4. Beatson SA, Walker MJ. Tracking antibiotic resistance. Science 2014;345(6203):1454–5.
5. Conlan S, Thomas PJ, Deming C et al. Single-molecule sequencing to track plasmid diversity of hospital-associated carbapenemase-producing Enterobacteriaceae. Sci Transl Med 2014;6(254):254ra126. doi:10.1126/scitranslmed.3009845.
6. Bayliss SC, Hunt VL, Yokoyama M et al. The use of Oxford Nanopore native barcoding for complete genome assembly. Gigascience 2017; doi:10.1093/gigascience/gix001.
7. Ashton PM, Nair S, Dallman T et al. MinION nanopore sequencing identifies the position and structure of a bacterial antibiotic resistance island. Nat Biotechnol 2015;33(3):296–300.
8. Huang Y, Yu X, Xie M et al. Widespread dissemination of carbapenem-resistant Escherichia coli sequence type 167 strains harboring blaNDM-5 in clinical settings in China. Antimicrob Agents Chemother 2016;60(7):4364–8.
9. Goodwin S, McPherson JD, McCombie WR. Coming of age: ten years of next-generation sequencing technologies. Nat Rev Genet 2016;17(6):333–51.
10. Punina NV, Makridakis NM, Remnev MA et al. Whole-genome sequencing targets drug-resistant bacterial infections. Hum Genomics 2015;9(1):19. doi:10.1186/s40246-015-0037-z.
11. Ashton PM, Nair S, Dallman T et al. MinION nanopore sequencing identifies the position and structure of a bacterial antibiotic resistance island. Nat Biotechnol 2015;33(3):296–300. doi:10.1038/nbt.3103.
12. Chin CS, Alexander DH, Marks P et al. Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Nat Methods 2013;10(6):563–9.
13. Cao MD, Nguyen SH, Ganesamoorthy D et al. Scaffolding and completing genome assemblies in real-time with nanopore sequencing. Nat Commun 2017;8:14515.
14. Koren S, Walenz BP, Berlin K et al. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res 2017;27(5):722–36.
15. Johnson TJ, Nolan LK. Pathogenomics of the virulence plasmids of Escherichia coli. Microbiol Mol Biol Rev 2009;73(4):750–
16. Conlan S, Park M, Deming C et al. Plasmid dynamics in KPC-positive Klebsiella pneumoniae during long-term patient colonization. mBio 2016; doi:10.1128/mBio.00742-16.
17. Porse A, Schonning K, Munck C et al. Survival and evolution of a large multidrug resistance plasmid in new clinical bacterial hosts. Mol Biol Evol 2016; doi:10.1093/molbev/msw163.
18. Loman NJ, Quinlan AR. Poretools: a toolkit for analyzing nanopore sequence data. Bioinformatics 2014;30(23):3399–
19. Wick RR, Judd LM, Gorrie CL et al. Unicycler: resolving bacterial genome assemblies from short and long sequencing reads. PLoS Comput Biol 2017;13(6):e1005595. doi:10.1371/journal.pcbi.1005595.
20. Leggett RM, Heavens D, Caccamo M et al. NanoOK: multi-reference alignment analysis of nanopore sequencing data, quality and error profiles. Bioinformatics 2016;32(1):142–4.
21. Carattoli A, Zankari E, Garcia-Fernandez A et al. In silico detection and typing of plasmids using PlasmidFinder and plasmid multilocus sequence typing. Antimicrob Agents Chemother 2014;58(7):3895–903.
22. Zankari E, Hasman H, Cosentino S et al. Identification of acquired antimicrobial resistance genes. J Antimicrob Chemother 2012;67(11):2640–4.
23. Siguier P, Perochon J, Lestrade L et al. ISfinder: the reference centre for bacterial insertion sequences. Nucleic Acids Res 2006;34(90001):D32–6.
24. Sturn A, Quackenbush J, Trajanoski Z. Genesis: cluster analysis of microarray data. Bioinformatics 2002;18(1):207–8.
25. Sullivan MJ, Petty NK, Beatson SA. Easyfig: a genome comparison visualizer. Bioinformatics 2011;27(7):1009–10.

Publisher: BGI
Copyright: © The Author(s) 2018. Published by Oxford University Press.
eISSN: 2047-217X
DOI: 10.1093/gigascience/gix132


Journal

GigaScience, Oxford University Press

Published: Mar 1, 2018
