Bengtsson B., Greko C. (2014) Antibiotic resistance–consequences for animal health, welfare, and food production. Ups. J. Med. Sci., 119, 96–102. PMID: 24678738
Olekhnovich E.I. et al. (2018) MetaCherchant: analyzing genomic context of antibiotic resistance genes in gut microbiota. Bioinformatics, 34, 434–444. PMID: 29092015
Konstantin Berlin, S. Koren, C. Chin, James Drake, J. Landolin, A. Phillippy (2014)
Assembling large genomes with single-molecule sequencing and locality-sensitive hashing. Nature Biotechnology, 33
R. Adams (1947)
Proceedings. MRS Bulletin, 7
Van Der Helm E. et al. (2017) Rapid resistome mapping using nanopore sequencing. Nucleic Acids Res., 45, gkw1328.
Korbinian Schneeberger, J. Hagmann, S. Ossowski, N. Warthmann, S. Gesing, O. Kohlbacher, D. Weigel (2009)
Simultaneous alignment of short reads against multiple genomes. Genome Biology, 10
Quedenfeld J., Rahmann S. (2017) Variant tolerant read mapping using min-hashing. 1–19.
Ying Yang, Xiaotao Jiang, Benli Chai, Liping Ma, Bing Li, A. Zhang, J. Cole, J. Tiedje, T. Zhang (2016)
ARGs-OAP: online analysis pipeline for antibiotic resistance genes detection from metagenomic data using an integrated structured ARG-database. Bioinformatics, 32(15)
Jenkins B. SpookyHash: a 128-bit noncryptographic hash. http://burtleburtle.net/bob/hash/spooky.html.
World Health Organization. (2015) Global Antimicrobial Resistance Surveillance System Manual for Early Implementation Global Antimicrobial Resistance Surveillance System.
Paten B. et al. (2017) Genome graphs and the evolution of genome inference. Genome Res., 27, 665–676. PMID: 28360232
Brian Bushnell (2014)
BBMap: A Fast, Accurate, Splice-Aware Aligner
E. Olekhnovich, Artem Vasilyev, V. Ulyantsev, E. Kostryukova, Alexander Tyakht (2018)
MetaCherchant: analyzing genomic context of antibiotic resistance genes in gut microbiota. Bioinformatics, 34
R. Wick, Mark Schultz, J. Zobel, K. Holt (2015)
Bandage: interactive visualization of de novo genome assemblies. Bioinformatics, 31
G. Rose, A. Shaw, K. Sim, D. Wooldridge, Ming-Shi Li, S. Gharbia, Raju Misra, J. Kroll (2017)
Antibiotic resistance potential of the healthy preterm infant gut microbiome. PeerJ, 5
Schneeberger K. et al. (2009) Simultaneous alignment of short reads against multiple genomes. Genome Biol., 10, R98. PMID: 19761611
Bo Liu, Mihai Pop (2008)
ARDB—Antibiotic Resistance Genes Database. Nucleic Acids Research, 37
Gionis A. et al. (1999) Similarity search in high dimensions via hashing. VLDB ’99 Proceedings of the 25th International Conference Very Large Data Bases, 99, 518–529.
D. Morrison (1968)
PATRICIA—Practical Algorithm To Retrieve Information Coded in Alphanumeric. Journal of the ACM (JACM), 15
R. Guérillot, Lucy Li, S. Baines, B. Howden, Mark Schultz, T. Seemann, I. Monk, Sacha Pidot, Wei Gao, S. Giulieri, A. Silva, Anthony D’Agata, T. Tomita, A. Peleg, T. Stinear, B. Howden (2018)
Comprehensive antibiotic-linked mutation assessment by resistance mutation sequencing (RM-seq). Genome Medicine, 10
Martin Hunt, Alison Mather, Leonor Sánchez-Busó, Andrew Page, Julian Parkhill, Jacqueline Keane, Simon Harris (2017)
ARIBA: rapid antimicrobial resistance genotyping directly from sequencing reads. Microbial Genomics, 3
Gupta S.K. et al. (2014) ARG-ANNOT, a new bioinformatic tool to discover antibiotic resistance genes in bacterial genomes. Antimicrob. Agents Chemother., 58, 212–220. PMID: 24145532
Sushim Gupta, Babu Padmanabhan, S. Diene, R. López-Rojas, M. Kempf, L. Landraud, J. Rolain (2013)
ARG-ANNOT, a New Bioinformatic Tool To Discover Antibiotic Resistance Genes in Bacterial Genomes. Antimicrobial Agents and Chemotherapy, 58
Garrison E. et al. (2017) Sequence variation aware references and read mapping with vg: the variation graph toolkit. bioRxiv, 1–27.
Rose G. et al. (2017) Antibiotic resistance potential of the healthy preterm infant gut microbiome. PeerJ, 5, e2928. PMID: 28149696
Baofeng Jia, A. Raphenya, Brian Alcock, Nicholas Waglechner, Peiyao Guo, Kara Tsang, B. Lago, Biren Dave, Sheldon Pereira, Arjun Sharma, Sachin Doshi, Mélanie Courtot, Raymond Lo, Laura Williams, J. Frye, Tariq Elsayegh, D. Sardar, E. Westman, Andrew Pawlowski, Timothy Johnson, F. Brinkman, Gerard Wright, A. McArthur (2016)
CARD 2017: expansion and model-centric curation of the comprehensive antibiotic resistance database. Nucleic Acids Research, 45
Torbjørn Rognes, T. Flouri, B. Nichols, C. Quince, F. Mahé (2016)
VSEARCH: a versatile open source tool for metagenomics. PeerJ, 4
Tao W. et al. (2016) High levels of antibiotic resistance genes and their correlations with bacterial community and mobile genetic elements in pharmaceutical wastewater treatment bioreactors. PLoS One, 11, e0156854. PMID: 27294780
van der Walt A.J. et al. (2017) Assembling metagenomes, one community at a time. BMC Genom., 18, 521.
Wenda Tao, Xu-xiang Zhang, F. Zhao, Kailong Huang, Haijun Ma, Zhu Wang, Lin Ye, H. Ren (2016)
High Levels of Antibiotic Resistance Genes and Their Correlations with Bacterial Community and Mobile Genetic Elements in Pharmaceutical Wastewater Treatment Bioreactors. PLoS ONE, 11
Berlin K. et al. (2015) Assembling large genomes with single-molecule sequencing and locality-sensitive hashing. Nat. Biotechnol., 33, 623–630. PMID: 26006009
C. Brown, L. Irber (2016)
sourmash: a library for MinHash sketching of DNA. J. Open Source Softw., 1
Ryan Dale, B. Grüning, Andreas Sjödin, Jillian Rowe, Brad Chapman, C. Tomkins-Tinch, Renan Valieris, Bérénice Batut, Adam Caprez, Thomas Cokelaer, Dilmurat Yusuf, Kyle Beauchamp, Karel Brinda, Thomas, Wollmann, Gildas Corguillé, D. Ryan, A. Bretaudeau, Youri, Hoogstrate, Brent Pedersen, S. Heeringen, M. Raden, Sebastian, Luna-Valero, Nicola Soranzo, M. Smet, G. Kuster, Rory Kirchner, Lorena Pantano, Z. Charlop-Powers, Kevin Thornton, Marcel Martin, Marius, van Beek, Daniel Maticzka, M. Miladi, Sebastian Will, Kévin Gravouil, Per, Unneberg, C. Brueffer, Clemens Blank, V. Piro, Joachim Wolff, Tiago Antao, Simon Gladman, I. Shlyakhter, M. Hollander, Philip, Mabon, Wei Shen, J. Boekel, M. Holtgrewe, Dave Bouvier, Julian de, Ruiter, Jennifer Cabral, S. Choudhary, N. Harding, Robert Kleinkauf, E. Enns, Florian Eggenhofer, Joseph Brown, P. Cock, Henning Timm, Cristel Thomas, Xiao-ou Zhang, Matt Chambers, Nitesh Turaga, Enrico Seiler, Colin Brislawn, E. Pruesse, Jörg Fallmann, J. Kelleher, Hai Nguyen, Lance Parsons, Zhuoqing Fang, E. Stovner, Nicholas Stoler, Simon Ye, Inken Wohlers, Rick Farouni, M. Freeberg, James Johnson, Marcel, Bargull, P. Kensche, T. Webster, J. Eppley, Christoph, Stahl, Alexander Rose, Alex Reynolds, Liang-Bo Wang, Xavier Garnier, S. Dirmeier, Michael Knudsen, James Taylor, Avi Srivastava, Vivek Rai, Rasmus Ågren, Alexander Junge, R. Guimera, Aziz Khan, Schmeier, Guowei He, Luca Pinello, E. Hägglund, Alexander Mikheyev, J. Preussner, Nicholas Waters, Wei Li, Jordi Capellades, A. Chande, Yuri Pirola, Saskia Hiltemann, M. Bendall, Sourav Singh, W., Augustine Dunn, Alexandre Drouin, T. Domenico, I. Bruijn, David, E. Larson, Davide Chicco, Elena Grassi, Giorgio Gonnella, Liya, Wang, F. Giacomoni, Erik Clarke, Daniel Blankenberg, Camy Tran, Rob, Patro, Sacha Laurent, Matthew Gopez, B. Sennblad, J. Baaijens, Philip Ewels, Patrick Wright, Oana Enache, Pierrick Roger, Will Dampier, David Koppstein, Upendra Devisetty, T. 
Rausch, MacIntosh, Cornwell, Adrian Salatino, Julien Seiler, Matthieu Jung, Etienne, Kornobis, Fabio Cumbo, Bianca Stöcker, Oleksandr Moskalenko, D. Bogema, M. Workentine, Stephen Newhouse, Felipe da, Veiga Leprevost, Kevin Arvai, Johannes Köster (2017)
Bioconda: A sustainable and comprehensive software distribution for the life sciences. bioRxiv
Jia B. et al. (2017) CARD 2017: expansion and model-centric curation of the comprehensive antibiotic resistance database. Nucleic Acids Res., 45, D566–D573. PMID: 27789705
Steven Lakin, C. Dean, N. Noyes, Adam Dettenwanger, A. Ross, E. Doster, P. Rovira, Z. Abdo, Kenneth Jones, Jaime Ruiz, K. Belk, P. Morley, C. Boucher (2016)
MEGARes: an antimicrobial resistance database for high throughput sequencing. Nucleic Acids Research, 45
Popic V., Batzoglou S. (2017) A hybrid cloud read aligner based on MinHash and kmer voting that preserves privacy. Nat. Commun., 8, 15311. PMID: 28508884
Eric Helm, Lejla Imamovic, Mostafa Ellabaan, W. Schaik, A. Koza, M. Sommer (2016)
Rapid resistome mapping using nanopore sequencing. Nucleic Acids Research, 45
Jun Li, Cui Tai, Z. Deng, Weihong Zhong, Y. He, Hong-Yu Ou (2017)
VRprofile: gene-cluster-detection-based profiling of virulence and antibiotic resistance traits encoded within genome sequences of pathogenic bacteria. Briefings in Bioinformatics, 19
Auffret M.D. et al. (2017) The rumen microbiome as a reservoir of antimicrobial resistance and pathogenicity genes is directly affected by diet in beef cattle. Microbiome, 5, 159. PMID: 29228991
Mayank Bawa, Tyson Condie, Prasanna Ganesan (2005)
LSH forest: self-tuning indexes for similarity search
Zankari E. et al. (2012) Identification of acquired antimicrobial resistance genes. J. Antimicrob. Chemother., 67, 2640–2644. PMID: 22782487
Bushnell B. (2014) BBMap: a fast, accurate, splice-aware aligner. In: 9th Annual Genomics of Energy & Environment Meeting, Walnut Creek, CA, March 17-20, 2014.
Ondov B.D. et al. (2016) Mash: fast genome and metagenome distance estimation using MinHash. Genome Biol., 17, 132. PMID: 27323842
Bawa M. et al. (2005) LSH forest: self-tuning indexes for similarity search. Proceedings of the 14th International Conference on World Wide Web – WWW ’05, p.651.
Phelim Bradley, N. Gordon, T. Walker, L. Dunn, S. Heys, B. Huang, S. Earle, L. Pankhurst, Luke Anson, M. Cesare, P. Piazza, A. Votintseva, Tanya Golubchik, Daniel Wilson, D. Wyllie, R. Diel, S. Niemann, S. Feuerriegel, T. Kohl, N. Ismail, S. Omar, E. Smith, D. Buck, G. McVean, A. Walker, Tim Peto, D. Crook, Z. Iqbal (2015)
Rapid antibiotic-resistance predictions from genome sequence data for Staphylococcus aureus and Mycobacterium tuberculosis. Nature Communications, 6
Ying Yang, Bing Li, F. Ju, Tong Zhang (2013)
Exploring variation of antibiotic resistance genes in activated sludge over a four-year period through a metagenomic approach. Environmental Science & Technology, 47(18)
Patrick Munk, V. Andersen, L. Knegt, M. Jensen, B. Knudsen, O. Lukjancenko, H. Mordhorst, Julie Clasen, Y. Agersø, A. Folkesson, S. Pamp, H. Vigre, F. Aarestrup (2017)
A sampling and metagenomic sequencing-based methodology for monitoring antimicrobial resistance in swine herds. Journal of Antimicrobial Chemotherapy, 72
Munk P. et al. (2017) A sampling and metagenomic sequencing-based methodology for monitoring antimicrobial resistance in swine herds. J. Antimicrob. Chemother., 72, 385–392. PMID: 28115502
A. Sczyrba, P. Hofmann, Peter Belmann, D. Koslicki, Stefan Janssen, J. Dröge, Ivan Gregor, Stephan Majda, Jessika Fiedler, Eik Dahms, A. Bremges, A. Fritz, R. Garrido-Oter, T. Jørgensen, N. Shapiro, Philip Blood, A. Gurevich, Yang Bai, Dmitrij Turaev, Matthew DeMaere, R. Chikhi, N. Nagarajan, C. Quince, F. Meyer, Monika Balvociute, L. Hansen, S. Sørensen, Burton Chia, Bertrand Denis, J. Froula, Zhong Wang, R. Egan, Dongwan Kang, Jeffrey Cook, C. Deltel, M. Beckstette, C. Lemaitre, P. Peterlongo, Guillaume Rizk, D. Lavenier, Yu-Wei Wu, S. Singer, Chirag Jain, M. Strous, Heiner Klingenberg, P. Meinicke, M. Barton, T. Lingner, Hsin-Hung Lin, Yu-Chieh Liao, G. Silva, Daniel Cuevas, R. Edwards, S. Saha, V. Piro, B. Renard, Mihai Pop, H. Klenk, M. Göker, N. Kyrpides, T. Woyke, J. Vorholt, P. Schulze-Lefert, E. Rubin, A. Darling, T. Rattei, A. Mchardy (2017)
Critical Assessment of Metagenome Interpretation—a benchmark of metagenomics software. Nature Methods, 14
Broder A.Z. (2000) Identifying and filtering near-duplicate documents. In Annual Symposium on Combinatorial Pattern Matching, pp. 1–10.
W. Rowe, D. Verner-Jeffreys, C. Baker-Austin, Jim Ryan, D. Maskell, G. Pearce (2016)
Comparative metagenomics reveals a diverse range of antimicrobial resistance genes in effluents entering a river catchment. Water Science and Technology: a journal of the International Association on Water Pollution Research, 73(7)
Sczyrba A. et al. (2017) Critical assessment of metagenome interpretation—a benchmark of metagenomics software. Nat. Methods, 14, 1063–1071. PMID: 28967888
Baquero F. (2012) Metagenomic epidemiology: a public health need for the control of antimicrobial resistance. Clin. Microbiol. Infect., 18, 67–73.
Victoria Popic, S. Batzoglou (2017)
A hybrid cloud read aligner based on MinHash and kmer voting that preserves privacy. Nature Communications, 8
Ma L. et al. (2017) Catalogue of antibiotic resistome and host-tracking in drinking water deciphered by a large scale survey. Microbiome, 5, 154. PMID: 29179769
A. Broder (1997)
On the resemblance and containment of documents. Proceedings. Compression and Complexity of SEQUENCES 1997 (Cat. No.97TB100171)
E. Zankari, H. Hasman, Salvatore Cosentino, M. Vestergaard, Simon Rasmussen, O. Lund, F. Aarestrup, M. Larsen (2012)
Identification of acquired antimicrobial resistance genes. Journal of Antimicrobial Chemotherapy, 67
Bush K., Jacoby G.A. (2010) Updated functional classification of beta-lactamases. Antimicrob. Agents Chemother., 54, 969–976. PMID: 19995920
Liping Ma, Bing Li, Xiao-Tao Jiang, Yulin Wang, Yu Xia, An-dong Li, Tong Zhang (2017)
Catalogue of antibiotic resistome and host-tracking in drinking water deciphered by a large scale survey. Microbiome, 5
W. Rowe, C. Baker-Austin, D. Verner-Jeffreys, Jim Ryan, C. Micallef, D. Maskell, G. Pearce (2017)
Overexpression of antibiotic resistance genes in hospital effluents over time. Journal of Antimicrobial Chemotherapy, 72
B. Paten, Adam Novak, Jordan Eizenga, Erik Garrison (2017)
Genome graphs and the evolution of genome inference. Genome Research, 27
Public Health Agency of Canada. (2016) Canadian antimicrobial resistance surveillance system – Report 2016. Guelph, Canada.
K. Winglee, A. Howard, W. Sha, R. Gharaibeh, Jiawu Liu, D. Jin, A. Fodor, P. Gordon-Larsen (2017)
Recent urbanization in China is correlated with a Westernized microbiome encoding increased virulence and antibiotic resistance genes. Microbiome, 5
(2011)
Public Health Agency of Canada. Journal of Consumer Health on the Internet, 15
Jalali S. et al. (2015) Screening currency notes for microbial pathogens and antibiotic resistance genes using a shotgun metagenomic approach. PLoS One, 10, e0128711. PMID: 26035208
Heng Li (2013)
Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv: Genomics
Ryan Dale, B. Grüning, A. Sjödin, Jillian Rowe, Brad Chapman, C. Tomkins-Tinch, Renan Valieris, Bérénice Batut, A. Caprez, T. Cokelaer, Dilmurat Yusuf, Kyle Beauchamp, K. Břinda, Thomas Wollmann, Gildas Corguillé, D. Ryan, A. Bretaudeau, Y. Hoogstrate, Brent Pedersen, S. Heeringen, M. Raden, Sebastian Luna-Valero, N. Soranzo, Matthias Smet, G. Kuster, Rory Kirchner, L. Pantano, Z. Charlop-Powers, Kevin Thornton, Marcel Martin, Marius Beek, Daniel Maticzka, M. Miladi, S. Will, Kévin Gravouil, P. Unneberg, C. Brueffer, Clemens Blank, V. Piro, Joachim Wolff, T. Antão, Simon Gladman, I. Shlyakhter, M. Hollander, P. Mabon, Wei Shen, J. Boekel, M. Holtgrewe, Dave Bouvier, J. Ruiter, Jennifer Cabral, S. Choudhary, N. Harding, Robert Kleinkauf, E. Enns, Florian Eggenhofer, Joseph Brown, P. Cock, H. Timm, Cristel Thomas, Xiao-ou Zhang, Matt Chambers, Nitesh Turaga, E. Seiler, Colin Brislawn, E. Pruesse, Jörg Fallmann, J. Kelleher, Hai Nguyen, Lance Parsons, Zhuoqing Fang, E. Stovner, Nicholas Stoler, Simon Ye, Inken Wohlers, Rick Farouni, M. Freeberg, James Johnson, Marcel Bargull, P. Kensche, T. Webster, J. Eppley, C. Stahl, Alexander Rose, Alex Reynolds, Liang Wang, Xavier Garnier, S. Dirmeier, Michael Knudsen, James Taylor, Avi Srivastava, Vivek Rai, Rasmus Agren, Alexander Junge, R. Guimera, Aziz Khan, S. Schmeier, Guowei He, Luca Pinello, E. Hägglund, A. Mikheyev, J. Preussner, Nicholas Waters, Wei Li, Jordi Capellades, A. Chande, Yuri Pirola, Saskia Hiltemann, M. Bendall, Sourav Singh, W. Dunn, Alexandre Drouin, Tomás Domenico, Ino Bruijn, D. Larson, D. Chicco, Elena Grassi, Giorgio Gonnella, B. Jaivarsan, Liya Wang, F. Giacomoni, Erik Clarke, Daniel Blankenberg, Camy Tran, Robert Patro, S. Laurent, Matthew Gopez, B. Sennblad, J. Baaijens, Philip Ewels, Patrick Wright, Oana Enache, Pierrick Roger, W. Dampier, David Koppstein, Upendra Devisetty, T. Rausch, MacIntosh Cornwell, A. Salatino, Julien Seiler, Matthieu Jung, E. 
Kornobis, Fabio Cumbo, David Lähnemann, Bianca Stöcker, O. Moskalenko, D. Bogema, M. Workentine, S. Newhouse, Felipe Leprevost, Kevin Arvai, Johannes Köster (2018)
Bioconda: sustainable and comprehensive software distribution for the life sciences. Nature Methods, 15
Rowe W. et al. (2015) Search engine for antimicrobial resistance: a cloud compatible pipeline and web interface for rapidly detecting antimicrobial resistance genes directly from sequence data. PLoS One, 10, e0133492. PMID: 26197475
SpookyHash: a 128-bit noncryptographic hash
Inouye M. et al. (2014) SRST2: rapid genomic surveillance for public health and hospital microbiology labs. Genome Med., 6, 90. PMID: 25422674
Clausen P.T.L.C. et al. (2016) Benchmarking of methods for identification of antimicrobial resistance genes in bacterial whole genome data. J. Antimicrob. Chemother., 71, 2484–2488. PMID: 27365186
W. Rowe, K. Baker, D. Verner-Jeffreys, C. Baker-Austin, Jim Ryan, D. Maskell, G. Pearce (2015)
Search Engine for Antimicrobial Resistance: A Cloud Compatible Pipeline and Web Interface for Rapidly Detecting Antimicrobial Resistance Genes Directly from Sequence Data. PLoS ONE, 10
Erik Garrison, Jouni Sirén, Adam Novak, G. Hickey, Jordan Eizenga, Eric Dawson, William Jones, Michael Lin, B. Paten, R. Durbin (2017)
Sequence variation aware genome references and read mapping with the variation graph toolkit. bioRxiv
Jens Quedenfeld, S. Rahmann (2017)
Variant tolerant read mapping using min-hashing. ArXiv, abs/1702.01703
B. Bengtsson, C. Greko (2014)
Antibiotic resistance–consequences for animal health, welfare, and food production. Upsala Journal of Medical Sciences, 119
Gryski D. (2014) go-spooky. https://github.com/dgryski/go-spooky.
Yang Y. et al (2013) Exploring variation of antibiotic resistance genes in activated sludge over a four-year period through a metagenomic approach. Environ. Sci. Technol., 47, 10197–10205.
Hunt M. et al (2017) ARIBA: rapid antimicrobial resistance genotyping directly from sequencing reads. Microb. Genom., 3, e000131.
Ruth Miller, V. Montoya, J. Gardy, D. Patrick, P. Tang (2013)
Metagenomics for pathogen detection in public health. Genome Medicine, 5
K. Bush, G. Jacoby (2009)
Updated Functional Classification of β-Lactamases. Antimicrobial Agents and Chemotherapy, 54
M. Auffret, R. Dewhurst, C. Duthie, J. Rooke, R. Wallace, T. Freeman, Robert Stewart, M. Watson, R. Roehe (2017)
The rumen microbiome as a reservoir of antimicrobial resistance and pathogenicity genes is directly affected by diet in beef cattle. Microbiome, 5
A. Broder (2000)
Identifying and Filtering Near-Duplicate Documents
Guérillot R. et al (2018) Comprehensive antibiotic-linked mutation assessment by Resistance Mutation Sequencing (RM-seq). bioRxiv, 257915.
Lakin S.M. et al (2017) MEGARes: an antimicrobial resistance database for high throughput sequencing. Nucleic Acids Res., 45, D574–D580.
A. Walt, Marc Goethem, J. Ramond, T. Makhalanyane, O. Reva, D. Cowan (2017)
Assembling metagenomes, one community at a time. BMC Genomics, 18
Y. Xie, Yiqing Wei, Yue Shen, Xiaobin Li, Hao Zhou, Cui Tai, Z. Deng, Hong-Yu Ou (2017)
TADB 2.0: an updated database of bacterial type II toxin–antitoxin loci. Nucleic Acids Research, 46
Wick R.R. et al (2015) Bandage: interactive visualization of de novo genome assemblies. Bioinformatics, 31, 3350–3352.
Petersen T.N. et al (2017) MGmapper: reference based mapping and taxonomy annotation of metagenomics sequence reads. PLoS One, 12, e0176469.
Saakshi Jalali, Samantha Kohli, Chitra Latka, Sugandha Bhatia, Shamsudheen Vellarikal, S. Sivasubbu, V. Scaria, S. Ramachandran (2015)
Screening Currency Notes for Microbial Pathogens and Antibiotic Resistance Genes Using a Shotgun Metagenomic Approach. PLoS ONE, 10
T. Petersen, O. Lukjancenko, M. Thomsen, Maria Sperotto, O. Lund, Frank Aarestrup, Thomas Sicheritz-Pontén (2017)
MGmapper: Reference based mapping and taxonomy annotation of metagenomics sequence reads. PLoS ONE, 12
Li H. (2013) Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM.
Li J. et al (2017) VRprofile: gene-cluster-detection-based profiling of virulence and antibiotic resistance traits encoded within genome sequences of pathogenic bacteria. Brief Bioinform., bbw141.
A. Gionis, P. Indyk, R. Motwani (1999)
Similarity Search in High Dimensions via Hashing
Brown C.T., Irber L. (2016) sourmash: a library for MinHash sketching of DNA. J. Open Source Softw., 1, 27.
J. O'Neill (2016)
Tackling drug-resistant infections globally: final report and recommendations
Bradley P. et al (2015) Rapid antibiotic-resistance predictions from genome sequence data for Staphylococcus aureus and Mycobacterium tuberculosis. Nat. Commun., 6, 10063.
Broder A.Z. (1997) On the resemblance and containment of documents. In Proceedings. Compression and Complexity of SEQUENCES (Cat. No.97TB100171) IEEE Comput. Soc., pp. 21–29.
Brian Ondov, T. Treangen, Páll Melsted, Adam Mallonee, N. Bergman, S. Koren, A. Phillippy (2015)
Mash: fast genome and metagenome distance estimation using MinHash. Genome Biology, 17
M. Inouye, Harriet Dashnow, Lesley-Ann Raven, Mark Schultz, B. Pope, T. Tomita, J. Zobel, K. Holt (2014)
SRST2: Rapid genomic surveillance for public health and hospital microbiology labs. Genome Medicine, 6
Xie Y. et al (2018) TADB 2.0: an updated database of bacterial type II toxin-antitoxin loci. Nucleic Acids Res., 46, D749–D753.
Rowe W.P.M. et al (2017) Overexpression of antibiotic resistance genes in hospital effluents over time. J. Antimicrob. Chemother., 72, 1617–1623.
F. Baquero (2012)
Metagenomic epidemiology: a public health need for the control of antimicrobial resistance. Clinical Microbiology and Infection: the official publication of the European Society of Clinical Microbiology and Infectious Diseases, 18 Suppl 4
Winglee K. et al (2017) Recent urbanization in China is correlated with a Westernized microbiome encoding increased virulence and antibiotic resistance genes. Microbiome, 5, 121.
P. Clausen, E. Zankari, F. Aarestrup, O. Lund (2016)
Benchmarking of methods for identification of antimicrobial resistance genes in bacterial whole genome data. The Journal of Antimicrobial Chemotherapy, 71(9)
Morrison D.R. (1968) PATRICIA–practical algorithm to retrieve information coded in alphanumeric. J. ACM, 15, 514–534.
Google. FarmHash. https://github.com/google/farmhash.
Miller R.R. et al (2013) Metagenomics for pathogen detection in public health. Genome Med., 5, 81.
Jouni Sirén (2016)
Indexing Variation Graphs. ArXiv, abs/1604.06605
Sirén J. (2016) Indexing variation graphs. arXiv.1604.06605.
Rognes T. et al (2016) VSEARCH: a versatile open source tool for metagenomics. PeerJ, 4, e2584.
O’Neill J. (2016) Tackling Drug-Resistant Infections Globally: Final Report and Recommendations. The Review on Antimicrobial Resistance. London: HM Government and the Wellcome Trust; 2016.
Global Antimicrobial Resistance Surveillance System Manual for Early Implementation Global Antimicrobial Resistance Surveillance System
Liu B., Pop M. (2009) ARDB–Antibiotic Resistance Genes Database. Nucleic Acids Res., 37, D443–D447.
Dale R. et al (2017) Bioconda: a sustainable and comprehensive software distribution for the life sciences. bioRxiv, 207092.
Motivation: Antimicrobial resistance (AMR) remains a major threat to global health. Profiling the collective AMR genes within a metagenome (the ‘resistome’) facilitates greater understanding of AMR gene diversity and dynamics. In turn, this can allow for gene surveillance, individualized treatment of bacterial infections and more sustainable use of antimicrobials. However, resistome profiling can be complicated by high similarity between reference genes, as well as the sheer volume of sequencing data and the complexity of analysis workflows. We have developed an efficient and accurate method for resistome profiling that addresses these complications and improves upon currently available tools.

Results: Our method combines a variation graph representation of gene sets with a locality-sensitive hashing Forest indexing scheme to allow for fast classification of metagenomic sequence reads using similarity-search queries. Subsequent hierarchical local alignment of classified reads against graph traversals enables accurate reconstruction of full-length gene sequences using a scoring scheme. We provide our implementation, Graphing Resistance Out Of meTagenomes (GROOT), and show it to be both faster and more accurate than a current reference-dependent tool for resistome profiling. GROOT runs on a laptop and can process a typical 2 gigabyte metagenome in 2 min using a single CPU. Our method is not restricted to resistome profiling and has the potential to improve current metagenomic workflows.

Availability and implementation: GROOT is written in Go and is available at https://github.com/will-rowe/groot (MIT license).

Contact: [email protected]

Supplementary information: Supplementary data are available at Bioinformatics online.

W.P.M. Rowe and M.D. Winn
© The Author(s) 2018. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

1 Introduction

Antimicrobial resistance (AMR) remains a significant global threat to public health, with directly attributable deaths per annum predicted to rise from 700 000 to 10 000 000 by the year 2050 (O’Neill, 2016). This threat is not restricted to humans, with consequences also for animal health, welfare and food production (Bengtsson and Greko, 2014). In an effort to inform policy that can mitigate the spread of AMR, there has been a recent drive to establish surveillance programmes to monitor the prevalence of AMR (Public Health Agency of Canada, 2016; World Health Organization, 2015). These programmes have traditionally used culture or polymerase chain reaction–based surveillance techniques, restricting monitoring to a few key genes or organisms. However, the use of metagenomics for surveilling AMR is gaining traction as it offers a much greater breadth of testing over these traditional techniques (Baquero, 2012; Miller et al., 2013).

The use of metagenomics to determine the antibiotic resistance gene (ARG) content of a microbial community, hereafter referred to as resistome profiling, has been applied in studies of a wide variety of biomes (Auffret et al., 2017; Jalali et al., 2015; Ma et al., 2017; Munk et al., 2017; Rose et al., 2017; Rowe et al., 2017, 2016; Tao et al., 2016; Yang et al., 2013). Studies such as these utilize several well-maintained ARG databases (Gupta et al., 2014; Jia et al., 2017; Zankari et al., 2012) and ARG-annotation tools (Hunt et al., 2017; Inouye et al., 2014; Rowe et al., 2015; Yang et al., 2016), of which only a few are designed for resistome profiling (Fig. 1). These databases and tools are used with assembled metagenomic contigs, or alternatively with metagenomic sequence reads. In the case of the latter, some tools and studies opt to align reads to reference ARGs and report only fully covered references (Rowe et al., 2017), whereas some bin reads using BLAST and similarity/length thresholds (Tao et al., 2016). All of these strategies offer the user a compromise between ease of use, analysis speed and the accurate typing of ARGs (ability to detect ARG type/subtype).

One of the main limitations of existing tools for resistome profiling (or profiling of other gene sets) in metagenomes is that high similarity shared between reference sequences can result in ambiguous alignments, unaligned reads (false negatives), or mis-annotated reads (false positives), thus reducing the accuracy when typing ARGs (Petersen et al., 2017). A solution to this has been to cluster the reference sequences and then use the cluster representative sequences as a target for read alignment (Rowe et al., 2015); however, this results in a loss of information as ARG subtypes will be masked. Likewise, a related approach has been to collapse final annotations of ARG variants into a common ARG annotation, also masking subtypes (Munk et al., 2017). The ability to accurately detect ARG type/subtype, here termed profiling resolution, is important considering that the variation between subtypes of ARGs can result in different phenotypic activity (Bush and Jacoby, 2010).

Recently, non-linear data structures, such as variation graphs, have been used to encode reference sequences for applications such as variant calling (Garrison et al., 2017; Paten et al., 2017). In terms of ARG annotation, the Mykrobe predictor tool applies non-linear reference representation in the form of a de Bruijn graph to identify resistance profiles in Staphylococcus aureus and Mycobacterium tuberculosis isolates (Bradley et al., 2015). Non-linear reference representation reduces redundancy whilst maintaining information that facilitates classification. Variation graphs, which are directed acyclic graphs (DAGs), are offered as a solution to reference bias in population genomics as they represent sequence variation within a population (Garrison et al., 2017). To align a sequence against a variation graph, the traversals within a graph are indexed. Variation graph indexing approaches to date have included hash-map and Burrows-Wheeler transform encoding (Schneeberger et al., 2009; Sirén, 2016). Index design must balance query length, performance and index size in order to deal with the complexity of variation graphs. Efficient indexing is further complicated in the case of multiple graphs.

We propose that variation graph traversals can be indexed using locality-sensitive hashing (LSH), allowing for fast and approximate assignment of a sequence to a specific region of a traversal within one or more graphs. MinHash is a form of LSH that can be used to compress sets into smaller representations, called MinHash signatures. MinHash signatures can be used to estimate the similarity of the original sets independently of the original set size; the technique was first used for duplicate webpage detection (Broder, 1997, 2000). The MinHash technique is increasingly being used in bioinformatics applications for clustering and searching large sequencing datasets (Brown and Irber, 2016; Ondov et al., 2016) and has also recently been applied to read alignment algorithms (Berlin et al., 2015; Popic and Batzoglou, 2017; Quedenfeld and Rahmann, 2017). In applications such as these, the ability to efficiently compare signatures is essential. Additional LSH indexing techniques can therefore be applied to reduce dimensionality and facilitate querying of signatures. One such technique is the LSH Forest indexing scheme that uses multiple prefix trees to store hash tables containing portions of the MinHash signatures and facilitates self-tuning of index parameters, offering a performance improvement over traditional LSH indexes (Bawa et al., 2005).

In this article, we present a method for resistome profiling that utilizes a variation graph representation of ARG databases to reduce ambiguous alignment of metagenomic sequence reads. We store sets of similar ARG reference sequences in variation graphs, collapsing identical sequences whilst retaining unique nodes that allow for accurate typing. By applying an LSH Forest indexing strategy to the variation graph collection that allows for fast and approximate search queries, we show that metagenomic reads can be seeded against candidate graph traversals. Subsequent hierarchical local alignment of reads and scoring enables accurate and efficient resistome profiling. We also provide our implementation, Graphing Resistance Out Of meTagenomes (GROOT), and compare it against ARGS-OAP and AMRPlusPlus, the most recently published tools for resistome profiling from metagenomic data (Lakin et al., 2017; Yang et al., 2016) (Fig. 1).
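To make the MinHash idea above concrete, the following is a minimal Go sketch of building a signature from a k-mer set and estimating Jaccard similarity from two signatures. It is illustrative only and is not taken from GROOT: the function names (`kmers`, `signature`, `estimateJaccard`, `mix64`) are hypothetical, and the family of hash functions is an assumption (an FNV-1a base hash offset per position and passed through a SplitMix64-style mixer), chosen only because it needs nothing beyond the standard library.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// mix64 is a SplitMix64-style finalizer. It is used here to derive many
// hash values from one base hash (an illustrative stand-in, not GROOT's scheme).
func mix64(x uint64) uint64 {
	x ^= x >> 30
	x *= 0xbf58476d1ce4e5b9
	x ^= x >> 27
	x *= 0x94d049bb133111eb
	x ^= x >> 31
	return x
}

// kmers returns the set of k-mers in seq (duplicates collapse in the map).
func kmers(seq string, k int) map[string]struct{} {
	set := make(map[string]struct{})
	for i := 0; i+k <= len(seq); i++ {
		set[seq[i:i+k]] = struct{}{}
	}
	return set
}

// signature builds a MinHash signature of length s: position i holds the
// minimum value of the i-th hash function over all k-mers in the set.
func signature(set map[string]struct{}, s int) []uint64 {
	sig := make([]uint64, s)
	for i := range sig {
		sig[i] = ^uint64(0) // start at the maximum so any hash replaces it
	}
	for kmer := range set {
		h := fnv.New64a()
		h.Write([]byte(kmer))
		base := h.Sum64()
		for i := 0; i < s; i++ {
			v := mix64(base + uint64(i+1)*0x9e3779b97f4a7c15)
			if sig[i] > v {
				sig[i] = v // keep the smaller value at this position
			}
		}
	}
	return sig
}

// estimateJaccard is the fraction of signature positions that agree.
func estimateJaccard(a, b []uint64) float64 {
	match := 0
	for i := range a {
		if a[i] == b[i] {
			match++
		}
	}
	return float64(match) / float64(len(a))
}

func main() {
	// Two short sequences that differ by a two-base substitution.
	a := kmers("ATGGCGTACGTTAGCCGATAGCTAGGCTTAACGGATCCGTA", 7)
	b := kmers("ATGGCGTACGTTAGCCGATAGCTAGGCTTAACGGTTCCGTA", 7)
	sa, sb := signature(a, 128), signature(b, 128)
	fmt.Printf("estimated Jaccard similarity: %.2f\n", estimateJaccard(sa, sb))
}
```

The estimate depends only on the signature length, not on how many k-mers each set contains, which is the property that makes signatures cheap to compare.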
[Figure 1: timeline (2009–2017) of antibiotic resistance gene resources. Annotation tools shown: ARGS-OAP*, kmerfinder / kmer resistance, CARD RGI, VRprofile, mykrobe predictor, SEAR*, ARIBA, ResFinder and AMRPlusPlus*. Databases and resources shown: CARD, Resfams*, ARDB, GEAR, ResFinder, ARG-annot, MEGARes* and FARME db*.]

Fig. 1. Summary of ARG annotation tools and databases. This figure shows several published tools and databases that are used to detect ARGs (in sequencing data or assembled contigs) (Bradley et al., 2015; Clausen et al., 2016; Hunt et al., 2017; Jia et al., 2017; Lakin et al., 2017; Li et al., 2017; Liu and Pop, 2009; Rowe et al., 2015; Yang et al., 2016; Zankari et al., 2012). The list is not exhaustive and the tools that are designed specifically for the resistome profiling of metagenomic datasets are highlighted with an asterisk

2 Materials and methods

Here we describe our method to index a collection of ARG reference sequences, align sequence reads using the index and then apply this for resistome profiling of metagenomics samples. We then document our implementation, GROOT, and describe how we evaluated its performance.

2.1 Indexing

The first stage in indexing a collection of ARG reference sequences is to remove redundancy, thus reducing the number of potential alignments to multiple references by collapsing identical sequence regions into a single reference. To do this we employ a DAG (variation graph) representation of ARG reference sequences. By fingerprinting variation graph traversals and using this as an index, we can then perform approximate seeding of sequence reads (Fig. 2) (Supplementary Material, Algorithm 1).

Indexed variation graphs for efficient and accurate resistome profiling

[Figure 2: three panels. INDEX: reference sequences are clustered, converted to variation graphs, traversed, windowed and MinHashed into graph window signatures, which are sharded and stored in the LSH index. SEED: a query sequence read is quality-checked and hashed to a query signature, the index is queried and candidate signatures are ranked by Jaccard similarity, J(A, B) = |A ∩ B| / |A ∪ B|, giving a seeded graph/query pair. ALIGN: hierarchical local alignment of the query read against the variation graph, followed by scoring and reporting (SAM/BAM; annotation, type, relative abundance, etc.).]

Fig. 2. Overview of our method to index and query ARG variation graphs. This figure shows the indexing (index) and alignment (seed and align) steps of the method. In the alignment step, once a query read has been seeded against a variation graph, a hierarchical local alignment process is performed, and the alignment is scored before being reported

2.1.1 Create variation graphs

To convert a collection of FASTA-formatted ARG reference sequences to a set of variation graphs, all sequences are first clustered based on sequence identity (90%) in order to group sequences into distinct sets of similar sequences. Each resulting set of similar sequences can be viewed as a multiple sequence alignment (MSA) that describes the insertions, deletions and polymorphisms in each member of the set, relative to a set representative sequence. The MSA is converted to a variation graph by first using the representative sequence as the graph backbone; each base of the sequence is a node with edges connecting them in series. The other sequences of the MSA are then incorporated; common bases are consolidated as single nodes and all deletions and insertions are added via edges and bubbles.

Each variation graph node contains information such as parent sequence, reference position and encoded base, allowing restriction of graph traversals to known references and the translation of traversals into reference name and sequence location. In the case of sets containing a single sequence, they are still converted to variation graphs but will be a linear series of nodes and only contain one possible traversal. All variation graphs are topologically sorted so that node ordering reflects MSA position prior to fingerprinting graph traversals.

2.1.2 Fingerprint graph traversals

The second indexing stage is to fingerprint the traversals in each variation graph, allowing for fast and approximate matching of query reads to region(s) of a variation graph. To fingerprint a graph, a sliding window of length w is moved across all graph traversals and a MinHash signature is created for each window (Broder, 1997). The windows are typically the same length as the expected query reads. The starting nodes of consecutive windows are spaced by an offset, o, where o ≤ w.

For each window, the contained nodes are joined to form a sequence (encoded as an array of bytes), which is then decomposed into a set of k-mers of length k, where k < w. To then create a MinHash signature (an array of unsigned 64-bit integers) of length s, each k-mer in the set is evaluated in series. A k-mer is hashed to an unsigned 64-bit integer s times, using s distinct hash functions. Each time a k-mer is hashed, the index position in the signature is advanced from 0 and the new hash value is evaluated against the value stored at the current index position; if the stored value is greater than the new value, the stored value is replaced by the new value. The signature index is reset to 0 prior to hashing the next k-mer in the series and the same s hash functions are used. Once all k-mers in a set for a given window have been hashed, the signature for that window is complete; the signature is linked to the graph ID and the window start node before it is stored. A single graph node can have multiple linked signatures if multiple traversals are possible.

2.1.3 Store window signatures

The final indexing stage is to store the window signatures for each variation graph in a data structure that allows for fast and approximate nearest-neighbour queries. To do this, we enlist the LSH Forest self-tuning indexing scheme (Bawa et al., 2005). This indexing scheme, given a query, will give a subset of nearest-neighbour candidates to which the query can be compared, based on the number of hash collisions between query and candidates. In our case, querying the index with a MinHash signature from a sequencing read will return candidate window signatures.

As in a traditional LSH index, two parameters must be tuned in order to maximize the occurrence of collisions between a query and its nearest neighbour: (i) the number of hash functions to encode an item (K) and (ii) the number of hash tables (buckets) to split an item into (L). As we have already hashed the signature during MinHashing, we are essentially splitting the signature uint64 values over a series of buckets. Explicitly, for each of the L buckets we take K uint64 values from the signature (where K ≤ s) and hash the values again to a single binary string. The original signature of s uint64 values is replaced by a more compact representation of L binary strings.

The challenge with traditional LSH indexing schemes was setting appropriate L and K values. Too small a value for K results in an increased false positive rate due to increased collisions of dissimilar query-neighbour pairs, and a large value for K lowers the collision chances of similar query-neighbour pairs. Therefore, setting L > 1 is needed to maximize the occurrence of collisions between similar query-neighbour pairs. To set appropriate K and L values, the distance between query and nearest neighbour is needed, but tuning for one query-neighbour pair can reduce performance for other query-neighbour pairs (Gionis et al., 1999). The LSH Forest indexing scheme addresses the limitations of the traditional LSH indexing by using a unique label of variable length to assign data points to buckets and storing these in a data structure that combines multiple prefix trees, each constructed from hash functions (derived from the MinHash signature).

The LSH Forest data structure is first initialized and tuned using the MinHash signature length and the Jaccard Similarity threshold for reporting query-neighbour matches. To tune the index, multiple combinations of the number of buckets (L) and the number of hash functions (K) are tested (within the bounds of the signature length) for false positive and false negative rates at the specified Jaccard Similarity. L and K are then set according to the lowest error rate possible and a set of L initial hash tables are then created (Supplementary Material, Algorithm 2).

Once the LSH Forest data structure has been initialized, each signature from the variation graph(s) is added to the initial hash tables. Each signature is split into L equally sized chunks of K hash functions. The chunks are hashed to a binary string (little-endian ordering) and stored in the corresponding hash table, with each chunk pointing to the graph and window from which they derive. Once all signatures have been added, the initial hash tables are transferred into a set of arrays (buckets) and sorted.

2.2 Alignment

To align sequence reads to a variation graph, our method uses an approximate nearest-neighbour search to seed a read against a region of a graph and then employs a hierarchical local alignment to fully align the read.

2.2.1 Nearest-neighbour search

For a given sequence read, a MinHash signature is generated as during indexing, using the same values for k and s, etc. The signature is then queried against the LSH Forest data structure containing the indexed variation graphs (Supplementary Material, Algorithm 3). To run a query, the signature is split into L equally sized chunks and each chunk is hashed to a binary string using the same hash function as during the LSH Forest population (Bawa et al., 2005). Each hashed chunk of the query signature is then queried against the corresponding bucket in the LSH Forest. To improve querying efficiency, each bucket is compressed by constructing a prefix tree (PATRICIA trie) (Morrison, 1968). Binary search is used to search the bucket (in ascending order) and return the smallest index of the bucket where the prefix matches the query chunk; the bucket is then iterated over from this index position, returning graph windows held until the prefix no longer matches the query chunk. Once all chunks of the query sequence have been searched, all the windows (graph IDs and start nodes) are collated and stored as seeds for each read. The nearest-neighbour search is then repeated using the reverse complement of the query read.

they align and then reference information is extracted from the corresponding graph(s). The length of the reference is used to perform a simple pileup of classified reads. Length and coverage thresholds are applied to determine if a gene can be reported as present in the resistome profile.

2.4 Our implementation

We have implemented our method as an easy to use program called GROOT (Fig. 2). GROOT is written in Go (version 1.9) and compiles for a variety of operating systems and architectures.

To create the supplied reference data (used for indexing), ARG sequences were downloaded from the CARD, ResFinder and ARG-annot databases (Gupta et al., 2014; Jia et al., 2017; Zankari et al., 2012) (accessed June 2016). Each database was clustered using the VSEARCH cluster_size command and stored as MSA files (Rognes et al., 2016). The GROOT get command will fetch a specified pre-clustered database, check it and unpack it ready for indexing. Alternatively, a user can provide their own clustered database.

The index command checks MSA files for formatting and discards sequences shorter than the expected read query length. Variation graphs are then built from the MSAs, pruned and topologically sorted. For each graph, a sliding window is moved along each traversal (default length = 100, offset = 1), decomposed to k-mer sets (default k = 7) and MinHash signatures are created using the Go implementations of Spooky and Farm hash functions (default signature length = 128, based on XORing the Spooky and Farm hash functions) (Jenkins; Google; Gryski https://github.com/dgryski/go-spooky). An LSH Forest is then initialized, tuned and populated, before being serialized using the Go gob package and written to disk.
The align command loads the index, sets all hashing parameters 2.2.2 Hierarchical local alignment to match the index and then streams the FASTQ read file(s) (mul- For a read that has been seeded one or more times, a hierarchical tiple FASTQ files or paired reads can be read but paired-end infor- local alignment process is used to align the read to a graph traversal. mation is not utilized). MinHash signatures are created for each We assume that most reads do not feature novel variation, so we read and its reverse complement; signatures are then queried against first try an exact match alignment. We then try an exact match the index. Once seeded, reads are optionally quality trimmed prior alignment after shuffling the seed n nodes along the graph, followed to hierarchical local alignment and reporting. All variation graphs by a clipped alignment. As soon as the read aligns, no further align- that had reads align are saved to disk [in graphical fragment assem- ments are tried for that seed. To perform an alignment from a given bly (GFA) format] and can be viewed using Bandage (Wick et al., seed, a recursive depth-first search (DFS) of the seeded graph is per- 2015). All aligned reads are also reported in relation to a linear ref- formed, beginning at the start node identified during the nearest- erence sequence (in BAM format). neighbour search (Supplementary Material, Algorithm 4). If a match The index, align and report subcommands of GROOT all utilize between node and read base is encountered, the position in the read a concurrent pipeline pattern that is driven by the flow of data be- is incremented and the next node in the DFS is checked. When no tween structs. This pattern also facilitates the streaming of data match is found or the whole read has been iterated over, the DFS of from STDIN, as well as from disk, and allows the GROOT com- the current traversal is ended. mands to be piped together. 
2.3 Reporting 2.5 Evaluating performance Reads are classified as ARG-derived if they have successfully aligned The full commands and code used to evaluate the performance of to a variation graph. To apply our method to resistome profiling, the classified reads must be evaluated to annotate what gene they de- our implementation can be found in the GROOT repository (https:// rive from and if that entire gene is present in a given sample. github.com/will-rowe/groot/tree/master/paper). GROOT version To classify a read alignment, a score is calculated according to 0.7 was used in all experiments (release 0.7, commit b43c32c). For the parent information of the nodes of the alignment. That is, if a running the accuracy benchmark, simulated FASTQ reads (150 bp read length) were generated from the ARG-annot database using node was derived from gene X and gene Y, the read would receive a BBMap (Bushnell, 2014; Gupta et al., 2014). An index of the ARG- point for each parent. Points are tallied, the top parent(s) in the tally annot database was created using the GROOT index command is used to classify the read. If the top parent was not present for every node traversed, the read is ambiguous. If multiple parents (length¼ 150, all other settings default). Reads were aligned using score the highest then the read has multiple valid classifications (a the GROOT align command with default settings, running on a multi-mapper). Linux laptop using 1, 4, and 8 CPUs for each test, respectively. Once reads have been classified, gene annotations are then For running the comparison benchmark, the genomes from the made. Classified reads are binned according to the graphs to which Critical Assessment of Metagenome Interpretation (CAMI) project Indexed variation graphs for efficient and accurate resistome profiling 3605 were downloaded in FASTA format (Sczyrba et al., 2017). The genomes were screened against the CARD database to mask any ARGs already present in the genomes (Jia et al., 2017). 
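As an aside on the indexing scheme of Sections 2.1.2 and 2.1.3, the fingerprinting and chunking steps can be illustrated with a minimal Python sketch. This is not the GROOT implementation (which is written in Go and XORs Spooky and Farm hashes): here seeded BLAKE2b stands in for the s hash functions, each signature position is assumed to keep the minimum hash value seen (the standard MinHash rule), and each chunk of K values is simply packed as little-endian bytes rather than re-hashed. All function names are hypothetical.

```python
import hashlib
import struct

MAX_U64 = 2**64 - 1


def kmers(seq, k):
    """Decompose a sequence into its set of overlapping k-mers."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hash64(kmer, seed):
    """Hash a k-mer to an unsigned 64-bit integer.

    The seed selects one of the s hash functions (assumption: seeded
    BLAKE2b is used here purely for illustration).
    """
    salt = struct.pack("<Q", seed)
    digest = hashlib.blake2b(kmer.encode(), digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(seq, k, s):
    """Build a MinHash signature of length s for one window sequence."""
    sig = [MAX_U64] * s
    for kmer in kmers(seq, k):
        for i in range(s):
            h = hash64(kmer, i)
            if h < sig[i]:  # keep the minimum per signature position
                sig[i] = h
    return sig


def window_signatures(traversal, w, o, k, s):
    """Slide a window of length w (offset o) along a traversal sequence."""
    return {
        start: minhash_signature(traversal[start:start + w], k, s)
        for start in range(0, len(traversal) - w + 1, o)
    }


def lsh_keys(sig, num_buckets):
    """Split a signature into L equal chunks of K values.

    Each chunk becomes one little-endian byte string (one key per bucket).
    """
    per_chunk = len(sig) // num_buckets
    return [
        struct.pack("<%dQ" % per_chunk, *sig[b * per_chunk:(b + 1) * per_chunk])
        for b in range(num_buckets)
    ]


def jaccard_estimate(sig_a, sig_b):
    """Estimate Jaccard similarity as the fraction of matching positions."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)
```

A read would be fingerprinted with the same k and s as the index, and a seed is found whenever one of its L keys matches a key stored for a graph window.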
A subset of ARGs was then randomly sampled from the CARD database (v1.1.2) using Bioawk and combined with the CAMI genomes in a single FASTA file (Jia et al., 2017). The CARD database was selected as ARGs-OAP uses this and it is one of the default databases provided by GROOT (Yang et al., 2016). Sets of metagenomic FASTQ reads (150 bp read length, errors allowed) at varying coverage levels were then generated from the FASTA file using BBMap (Bushnell, 2014). For each set of metagenomic reads, GROOT and ARGs-OAP [stage 1, version 1 (release 1, commit: ed19cd1)] were both run with default parameters (recommended settings) on a Linux laptop using 1 core (Yang et al., 2016). ARGs-OAP (stage 2, version 2) was completed on the dedicated Galaxy webserver, allowing only full-length, 100% identity read matches (Yang et al., 2016).

For comparing GROOT against AMRPlusPlus, reads were simulated (5× coverage) from the CAMI database as above, this time spiked with 10 ARGs from the MegaRes database (version 1.01) (Lakin et al., 2017). Simulated reads were analysed using AMRPlusPlus (default settings, commit: 3086a1f) and GROOT on a Linux cluster, both using eight cores and a reporting threshold of 100%, 99% and 80% gene coverage for each test.

For performing the resistome analysis, we reanalysed the metagenomic data from a recent microbiome study (Winglee et al., 2017). FASTQ reads for all 40 microbiomes were downloaded from the SRA (BioProject: PRJNA349463) and were classified by GROOT using the CARD database (Jia et al., 2017). GROOT was set to report full-length ARG annotations and the variation graph alignments (GFA format) for each annotation were validated by manual inspection using Bandage (Wick et al., 2015). Toxin/antitoxin genes were identified in microbiome samples using GROOT and the TADB database (clustered at 90% sequence identity) (Xie et al., 2018). The full commands used are available in the online GROOT tutorial (https://groot-documentation.readthedocs.io/en/latest/tutorial.html).

3 Results
The results presented here evaluate our implementation of indexed variation graphs for resistome profiling, in terms of both the accuracy of the tool and its performance compared to a recently published resistome profiling tool.

3.1 Accuracy assessment
To assess the accuracy of our implementation, we generated sets of random FASTQ reads from the ARG-annot database, classified the reads with GROOT and then compared the classification to the ARG from which the read derived (Gupta et al., 2014). For six sets of simulated reads, ranging from 100 to 10 000 000 reads, GROOT classified the ARG-derived reads with no recorded false negatives. GROOT took an average of 216.44 wall clock seconds to classify 1 000 000 ARG-derived reads using 8 CPUs. At <1000 reads, runtimes were equivalent regardless of the number of CPUs used; however, as the datasets approached real-world metagenome size (>1 000 000 reads), increasing the number of CPUs resulted in decreased runtime of approximately 2.5-fold per 4 CPUs (Fig. 3). The average RAM occupation recorded during the accuracy assessment was 2.23 GB.

Fig. 3. Runtime performance. This figure shows the runtime performance of our implementation (GROOT) on 1, 4, and 8 CPUs. The data were collected during the accuracy assessment of GROOT, where it classified ARG-derived reads (no false negatives were recorded)

3.2 Performance comparison
To compare the accuracy and speed of our implementation against a recently published resistome profiling tool [ARGs-OAP (Yang et al., 2016)], we used a set of publicly available, simulated metagenomes and spiked in full-length ARG sequences. The spiked data was sampled at varying coverage levels and the performance of GROOT and ARGs-OAP was evaluated in terms of the number of ARGs correctly annotated from each sample, as well as the time taken for each tool to process the samples.

On average GROOT was 6.3 times faster at processing the samples compared to stage 1 of ARGs-OAP. The mean time to process a metagenome with 1× coverage (approximately 1.1 million reads) was 157.87 s for GROOT, compared to 989.15 s for ARGs-OAP (stage 1 only) (Fig. 4A). ARGs-OAP stage 2 took an average of 12 h to run on a dedicated server (not included in Fig. 4A).

In terms of accuracy, GROOT recorded only one false negative across all samples (present in the lowest coverage metagenome), whereas ARGs-OAP (Stages 1 + 2) consistently recorded three false negatives per sample. GROOT had a mean rate of 1.6 false positives per sample, whereas ARGs-OAP had a mean rate of 99.6 false positives per sample. The number of false positives found by GROOT did not increase beyond 10× coverage, compared to the false positive increase observed at every coverage increase by ARGs-OAP (Fig. 4B).

We additionally compared GROOT to AMRPlusPlus. Although GROOT outperformed this software on the basis of runtime (117 s versus 3366 s average run time) and accuracy (Supplementary Results 1), the comparison of GROOT and AMRPlusPlus is not ideal. This is due to AMRPlusPlus being unable to run on a laptop using a single core (as in the above benchmark test) and the fact that AMRPlusPlus is a pipeline that involves a lot of additional data processing, leading to a much longer runtime.

Fig. 4. Benchmarking GROOT and ARGs-OAP. (A) The runtime comparison of GROOT and ARGs-OAP when processing metagenomes at varying coverage levels. (B) The number of errors (false positives and false negatives) recorded by each program at varying metagenome coverage levels

3.3 Resistome analysis
To show the application of our method on real-world data, we reanalysed 40 recently published metagenomes that were derived from the microbiome of rural and urban subjects from the Hunan province of China (Winglee et al., 2017). GROOT made 84 777 ARG read classifications across all the samples; the mean number of ARG-classified reads was higher in rural microbiome samples than urban (Fig. 5A). The resistome profiles generated by GROOT identified 11 and 20 ARGs that could be accurately subtyped in the rural and urban microbiome samples respectively (Fig. 5B). The ARG subtypes were also confirmed by inspection of variation graphs; uniform read coverage along graph traversals corresponded to the identified ARGs (see example variation graph for bla-cfxA, Fig. 5C).

Finally, to show the utility of GROOT in classifying non-ARG sequences, we used GROOT to classify reads that were derived from toxin genes (Supplementary Fig. S1). We found 11 different full-length toxin sequences in the microbiomes of urban subjects, whereas no full-length sequences could be identified in the rural samples.

Fig. 5. Resistome analysis using GROOT on 40 human microbiome samples. (A) A boxplot comparing ARG-classified reads derived from all rural versus urban subject microbiomes [y-axis: Log10(proportion ARG classified reads); x-axis: rural, urban]. (B) The full length ARGs detected by GROOT during resistome profiling and in which microbiome class they were present. (C) A subgraph of the bla-cfxA variation graph used by GROOT in this analysis (the full graph encodes 4 sequences using 10 nodes and 13 edges). Grey shading corresponds to nodes with aligned reads. The bla-cfxA3 traversal is highlighted by the black arrows

Panel B of Fig. 5:
ARG                        rural   urban
AAC(6')-Ie-APH(2'')-Ia     +       -
ANT(6)-Ib                  -       +
bacA                       -       +
CfxA2                      +       +
CfxA3                      +       +
CfxA4                      -       +
CfxA5                      +       -
dfrA17                     -       +
dfrF                       -       +
emrR                       +       -
ErmB (X82819)              +       -
ErmB (Y00116)              +       +
ErmF                       +       -
Escherichia_coli_EF-Tu     +       +
mdtP                       -       +
ompF                       -       +
OXA-347                    -       +
QnrS1                      -       +
QnrS4                      -       +
QnrS7                      -       +
QnrS9                      -       +
sul1                       -       +
sul2                       -       +
tetD                       -       +
tetQ                       +       +
tetW                       +       -

4 Discussion
AMR remains a significant challenge to human and animal health. With the increasing drive for AMR surveillance in order to inform policy and mitigate the spread of resistance, metagenomic studies to identify and monitor ARGs in a wide variety of biomes are becoming commonplace. Sampled biomes range from the human gut, through to water supplies, agricultural land and marine environments. Despite this scale and variety, the question being asked remains simple: what ARGs are present in a given sample? Although simple, resistome profiling still presents a challenge, as illustrated by the many tools available (Fig. 1). Our method improves upon two main issues with these existing tools: the resolution offered and the speed at which a sample can be profiled. Crucially also, as a novel algorithm, it is translatable to identifying other highly related gene sets within metagenomes. Variation graphs can be generated, indexed and searched using any reference set of similar sequences. For example, we have shown that GROOT can be used to annotate toxin genes in microbiome data (see Results: resistome analysis). We should note, however, that our implementation currently restricts the number of graphs that can be generated to around 2000.

In terms of profiling resolution, we are referring to the accuracy of the annotations in a resistome profile. That is, is it possible to annotate the subtype of ARG in a sample (e.g. bla-SHV-1) or just the type (e.g. bla-SHV)? This resolution is important as different ARG subtypes can provide selective resistance to different antibiotics; for example, genes within a single beta lactamase gene class (structural classes, based on amino acid sequence) can confer resistance to different beta lactam antibiotics (Bush and Jacoby, 2010). Therefore, the greater the resolution, the more information can be extracted from a sample. To gain this resolution, we need to both cover an entire reference sequence and be confident in the reads placed. We address this by allowing only exact matches to the whole reference sequence before annotating the gene as present. The counter to this argument for greater resolution is that sequence quality will impact read placement and also, novel ARGs will be missed. Whilst we believe that the main utility of GROOT is its ability to confidently resolve ARGs, we have also added features to allow for these points. Firstly, our implementation has an optional quality trimming algorithm to remove low quality bases prior to read alignment. Secondly, to allow potentially novel ARGs or accommodate low coverage samples, there is an option to relax the scoring system in order to allow non-exact matches or partially covered genes.

Our method offers an improvement in resolution over those that consolidate variant ARGs to a representative sequence (Munk et al., 2017; Rowe et al., 2015), or that have high false positive rates due to allowing partial or inexact matches (Yang et al., 2016). Despite a marked improvement on ARGs-OAP in our benchmark, our method did record a false negative and some false positives, likely as a result of introduced sequencing error and the low sample coverage (in the case of the single false negative). These errors could be considered a limitation of the exact local alignment utilized in this method; a more relaxed alignment could be allowed but this would be at the expense of confidence in the ARG annotations.

In terms of speed, our method offers several advantages over other resistome profiling tools. Our method does not require metagenome assembly or the upload of data to remote servers, both of which add significant time to a resistome profiling analysis. The latter requires good bandwidth and the former can require large compute resources; a complex metagenome can take up to 10 h and 500 GB of RAM to assemble (van der Walt et al., 2017). Our implementation can run a typical 2 GB metagenome in 2 min using a single CPU, and scales when run on higher-performance systems. Our benchmark was restricted to a single CPU as ARGs-OAP is limited in the number of CPUs it can use. The benchmark also ignored the time taken during Stage 2 of ARGs-OAP (on the remote server) as this could be variable depending on server load and available bandwidth for upload. Despite this, GROOT still offers a much faster runtime than ARGs-OAP. In addition to runtime, analysis times are also reduced with GROOT due to its ease of use. It runs as a self-contained binary, is packaged with bioconda (Dale et al., 2017) and requires only two commands to run a resistome profiling analysis, offering significant advantage over more complex workflows or those that require upload to remote servers halfway through the analysis (Yang et al., 2016). Our implementation is targeted towards researchers who may not have access to high performance computing and wish to run metagenomics workflows on a laptop; for this reason we elected to compare our tool against ARGs-OAP as it was one of three published resistome profilers that is targeted for metagenomic workflows (Fig. 1).
We also tried using AMRPlusPlus and SEAR in this benchmark but we could not get them to run using our 1-core laptop configuration (Lakin et al., 2017; Rowe et al., 2015). However, as GROOT also scales for use on larger servers, we were able to include a comparison of GROOT and AMRPlusPlus. Although this comparison is still slightly unfair as GROOT is a self-contained resistome profiler and AMRPlusPlus is a pipeline that involves many additional steps, this comparison is useful as it shows GROOT is comparable to an approach using linear reference-based read mapping [as AMRPlusPlus uses BWA to align reads (Li, 2013)].

With the advent of long-read and barcoding protocols for resistome profiling, it is tempting to think that tools for short read resistome profiling may prove less relevant in the near future (Guérillot et al., 2018; Van Der Helm et al., 2017). However, Illumina sequencing currently remains the standard technology for metagenomics applications and historic Illumina datasets will also need to be reanalysed where new sampling is not feasible. In addition, our method can be used in conjunction with other new methods that elucidate gene context in metagenomics samples, allowing for novel insights to be gained from analysis and re-analysis of metagenomic data collections (Olekhnovich et al., 2018).

Our implementation offers efficient and accurate resistome profiling, enabling the identification of full-length ARGs in complex samples. It should be noted that this method is database dependent and is not intended for the identification of novel ARGs. The implementation is easy and quick to run as there is no complicated installation or dependencies, no split local/remote processing, it facilitates streaming of input and is built using Go’s concurrency patterns.

5 Conclusions
We present a method for resistome profiling that utilizes a novel index and search strategy to accurately type resistance genes in metagenomic samples. The use of variation graphs yields several advantages over other methods using linear reference sequences. GROOT performed much more quickly than a recent resistome profiling tool, and also recorded few false positives and negatives. Our method is not restricted to resistome profiling and has the potential to improve current metagenomic workflows.

Acknowledgements
Availability and implementation.
The source code for our implementation, as well as the code used to evaluate its performance and plot the manuscript figures, can be found in the GROOT repository (https://github.com/will-rowe/groot) (MIT License. DOI: https://doi.org/10.5281/zenodo.1217889).

Funding
This work was supported in part by the STFC Hartree Centre’s Innovation Return on Research programme, funded by the Department for Business, Energy & Industrial Strategy.

Authors’ contributions
W.P.M.R. conceived and implemented the method. All authors wrote, read and approved the final manuscript.

Conflict of Interest: none declared.

References
Auffret,M.D. et al. (2017) The rumen microbiome as a reservoir of antimicrobial resistance and pathogenicity genes is directly affected by diet in beef cattle. Microbiome, 5, 159.
Baquero,F. (2012) Metagenomic epidemiology: a public health need for the control of antimicrobial resistance. Clin. Microbiol. Infect., 18, 67–73.
Bawa,M. et al. (2005) LSH forest: self-tuning indexes for similarity search. Proceedings of the 14th International Conference on World Wide Web – WWW ’05, p.651.
Bengtsson,B., and Greko,C. (2014) Antibiotic resistance–consequences for animal health, welfare, and food production. Ups. J. Med. Sci., 119, 96–102.
Berlin,K. et al. (2015) Assembling large genomes with single-molecule sequencing and locality-sensitive hashing. Nat. Biotechnol., 33, 623–630.
Bradley,P. et al. (2015) Rapid antibiotic-resistance predictions from genome sequence data for Staphylococcus aureus and Mycobacterium tuberculosis. Nat. Commun., 6, 10063.
Broder,A.Z. (2000) Identifying and filtering near-duplicate documents. In Annual Symposium on Combinatorial Pattern Matching, pp. 1–10.
Broder,A.Z. (1997) On the resemblance and containment of documents. In Proceedings. Compression and Complexity of SEQUENCES (Cat. No.97TB100171). IEEE Comput. Soc., pp. 21–29.
Brown,C.T., and Irber,L. (2016) sourmash: a library for MinHash sketching of DNA. J. Open Source Softw., 1, 27.
Bush,K., and Jacoby,G.A. (2010) Updated functional classification of beta-lactamases. Antimicrob. Agents Chemother., 54, 969–976.
Bushnell,B. (2014) BBMap: a fast, accurate, splice-aware aligner. In: 9th Annual Genomics of Energy & Environment Meeting, Walnut Creek, CA, March 17-20, 2014.
Clausen,P.T.L.C. et al. (2016) Benchmarking of methods for identification of antimicrobial resistance genes in bacterial whole genome data. J. Antimicrob. Chemother., 71, 2484–2488.
Dale,R. et al. (2017) Bioconda: a sustainable and comprehensive software distribution for the life sciences. bioRxiv, 207092.
Garrison,E. et al. (2017) Sequence variation aware references and read mapping with vg: the variation graph toolkit. bioRxiv, 1–27.
Gionis,A. et al. (1999) Similarity search in high dimensions via hashing. VLDB ’99 Proceedings of the 25th International Conference Very Large Data Bases, 99, 518–529.
Google. FarmHash. https://github.com/google/farmhash.
Gryski,D. (2014) go-spooky. https://github.com/dgryski/go-spooky.
Guérillot,R. et al. (2018) Comprehensive antibiotic-linked mutation assessment by Resistance Mutation Sequencing (RM-seq). bioRxiv, 257915.
Gupta,S.K. et al. (2014) ARG-ANNOT, a new bioinformatic tool to discover antibiotic resistance genes in bacterial genomes. Antimicrob. Agents Chemother., 58, 212–220.
Van Der Helm,E. et al. (2017) Rapid resistome mapping using nanopore sequencing. Nucleic Acids Res., 45, gkw1328.
Hunt,M. et al. (2017) ARIBA: rapid antimicrobial resistance genotyping directly from sequencing reads. Microb. Genom., 3, e000131.
Inouye,M. et al. (2014) SRST2: rapid genomic surveillance for public health and hospital microbiology labs. Genome Med., 6, 90.
Jalali,S. et al. (2015) Screening currency notes for microbial pathogens and antibiotic resistance genes using a shotgun metagenomic approach. PLoS One, 10, e0128711.
Jenkins,B. SpookyHash: a 128-bit noncryptographic hash. http://burtleburtle.net/bob/hash/spooky.html.
Jia,B. et al. (2017) CARD 2017: expansion and model-centric curation of the comprehensive antibiotic resistance database. Nucleic Acids Res., 45, D566–D573.
Lakin,S.M. et al. (2017) MEGARes: an antimicrobial resistance database for high throughput sequencing. Nucleic Acids Res., 45, D574–D580.
Li,H. (2013) Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM.
Li,J. et al. (2017) VRprofile: gene-cluster-detection-based profiling of virulence and antibiotic resistance traits encoded within genome sequences of pathogenic bacteria. Brief Bioinform., bbw141.
Liu,B., and Pop,M. (2009) ARDB–Antibiotic Resistance Genes Database. Nucleic Acids Res., 37, D443–D447.
Ma,L. et al. (2017) Catalogue of antibiotic resistome and host-tracking in drinking water deciphered by a large scale survey. Microbiome, 5, 154.
Miller,R.R. et al. (2013) Metagenomics for pathogen detection in public health. Genome Med., 5, 81.
Morrison,D.R. (1968) PATRICIA–practical algorithm to retrieve information coded in alphanumeric. J. ACM, 15, 514–534.
Munk,P. et al. (2017) A sampling and metagenomic sequencing-based methodology for monitoring antimicrobial resistance in swine herds. J. Antimicrob. Chemother., 72, 385–392.
O’Neill,J. (2016) Tackling Drug-Resistant Infections Globally: Final Report and Recommendations. The Review on Antimicrobial Resistance. London: HM Government and the Wellcome Trust; 2016.
Olekhnovich,E.I. et al. (2018) MetaCherchant: analyzing genomic context of antibiotic resistance genes in gut microbiota. Bioinformatics, 34, 434–444.
Ondov,B.D. et al. (2016) Mash: fast genome and metagenome distance estimation using MinHash. Genome Biol, 17, 132.
Paten,B. et al. (2017) Genome graphs and the evolution of genome inference. Genome Res., 27, 665–676.
Petersen,T.N. et al. (2017) MGmapper: reference based mapping and taxonomy annotation of metagenomics sequence reads. PLoS One, 12, e0176469.
Popic,V., and Batzoglou,S. (2017) A hybrid cloud read aligner based on MinHash and kmer voting that preserves privacy. Nat. Commun., 8, 15311.
Public Health Agency of Canada. (2016) Canadian antimicrobial resistance surveillance system – Report 2016. Guelph, Canada.
Quedenfeld,J., and Rahmann,S. (2017) Variant tolerant read mapping using min-hashing. 1–19.
Rognes,T. et al. (2016) VSEARCH: a versatile open source tool for metagenomics. PeerJ, 4, e2584.
Rose,G. et al. (2017) Antibiotic resistance potential of the healthy preterm infant gut microbiome. PeerJ, 5, e2928.
Rowe,W. et al. (2016) Comparative metagenomics reveals a diverse range of antimicrobial resistance genes in effluents entering a river catchment. Water Sci. Technol., 73, 1541–1549.
Rowe,W. et al. (2015) Search engine for antimicrobial resistance: a cloud compatible pipeline and web interface for rapidly detecting antimicrobial resistance genes directly from sequence data. PLoS One, 10, e0133492.
Rowe,W.P.M. et al. (2017) Overexpression of antibiotic resistance genes in hospital effluents over time. J. Antimicrob. Chemother., 72, 1617–1623.
Schneeberger,K. et al. (2009) Simultaneous alignment of short reads against multiple genomes. Genome Biol., 10, R98.
Sczyrba,A. et al. (2017) Critical assessment of metagenome interpretation–a benchmark of metagenomics software. Nat. Methods, 14, 1063–1071.
Sirén,J. (2016) Indexing variation graphs. arXiv.1604.06605.
Tao,W. et al. (2016) High levels of antibiotic resistance genes and their correlations with bacterial community and mobile genetic elements in pharmaceutical wastewater treatment bioreactors. PLoS One, 11, e0156854.
van der Walt,A.J. et al. (2017) Assembling metagenomes, one community at a time. BMC Genom., 18, 521.
Wick,R.R. et al. (2015) Bandage: interactive visualization of de novo genome assemblies. Bioinformatics, 31, 3350–3352.
Winglee,K. et al. (2017) Recent urbanization in China is correlated with a Westernized microbiome encoding increased virulence and antibiotic resistance genes. Microbiome, 5, 121.
World Health Organization. (2015) Global Antimicrobial Resistance Surveillance System Manual for Early Implementation Global Antimicrobial Resistance Surveillance System.
Xie,Y. et al. (2018) TADB 2.0: an updated database of bacterial type II toxin-antitoxin loci. Nucleic Acids Res., 46, D749–D753.
Yang,Y. et al. (2016) ARGs-OAP: online analysis pipeline for antibiotic resistance genes detection from metagenomic data using an integrated structured ARG-database. Bioinformatics, 32, 2346–2351.
Yang,Y. et al. (2013) Exploring variation of antibiotic resistance genes in activated sludge over a four-year period through a metagenomic approach. Environ. Sci. Technol., 47, 10197–10205.
Zankari,E. et al. (2012) Identification of acquired antimicrobial resistance genes. J. Antimicrob. Chemother., 67, 2640–2644.
Bioinformatics – Oxford University Press
Published: May 14, 2018