Microbes are the oldest, most widespread, and most phylogenetically and metabolically diverse life forms on Earth. However, they were discovered only 334 years ago, and serious investigation of their diversity began even later. For these reasons, microbial studies that unveil novel microbial lineages and processes affecting or involving microbes deeply (and repeatedly) transform knowledge in biology. Considering the quantitative prevalence of taxonomically and functionally unassigned sequences in environmental genomics data sets, and that of uncultured microbes on the planet, we propose that unraveling the microbial dark matter should be identified as a central priority for biologists. Based on former empirical findings of microbial studies, we sketch a logic of discovery with the potential to further highlight the microbial unknowns.

Key words: metagenomics, eukaryogenesis, microbial evolution, tree of life, web of life, CPR bacteria.

© The Author(s) 2018. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact email@example.com
Genome Biol. Evol. 10(3):707–715. doi:10.1093/gbe/evy031 Advance Access publication February 5, 2018

Introduction

Microbial studies are fascinating. Not only can their findings deeply transform knowledge in a broad range of scientific fields (from evolutionary biology to zoology and medical and environmental sciences) but also, while philosophers of science debate whether there is such a thing as a logic of scientific discovery (Schickore 2014), microbial studies provide biologists with a set of empirical rules to enhance one's chances of discovering novel and unexpected life forms. This unique potential of microbial studies to reshape knowledge has been recognized relatively recently, even though there is a long-standing history of studies of microbial pathogens, involving famous early researchers such as Robert Koch, Louis Pasteur, or Martinus Beijerinck. Although laypeople nowadays appreciate that microbes impact our everyday life (e.g., via their fermentative roles in food production), and know that microbes also impacted recent human history (e.g., via their contribution to major pandemics; Diamond 1997), from a scientific perspective, microbes are nonetheless rather novel objects of study. There are both technical and conceptual reasons for this late yet broad recognition of microbes, as we will highlight below, while providing an empirical recipe for further insights into the microbial dark matter.

In 1619, the famous astronomer Galileo, whose observations of the moons of Jupiter had threatened the geocentric theory, modified a telescope to magnify nearby terrestrial objects. Although he clearly was a revolutionary thinker, he found these observations of the minute world of limited interest, and only 6 years later did his friends give the name microscopio to the strange inverted telescope Galileo had invented (Falkowski 2015). By contrast, Robert Hooke, an English polymath scientist, and, later, Anton van Leeuwenhoek, who did not belong to the academic world, were much more excited about describing their microscopic observations. In 1671, van Leeuwenhoek, who had substantially changed the design of the microscope to enhance its magnifying power, initiated a series of striking findings: microscopic lifeforms are abundant and everywhere to be seen. Microbes, which had populated Earth for over 3.5 billion years, were for the first time exposed to the human eye (Falkowski 2015). Both technical progress and an uncommon ability to delve into an unseen world were critical components of that advance. However, since biological theory at the time considered that the living world was distributed into two major groups, plants and animals, van Leeuwenhoek naturally assumed that, when the microbes were motile, he was observing populations of minute animals (with tiny organs), rather than a new kind of living being. In that sense, the unveiled microbiological world was first rationalized in ways that fit within preexisting theoretical categories derived from the known living world. Importantly, neither Hooke nor van Leeuwenhoek had immediate scientific successors. Arguably, it took another 200 years (Falkowski 2015), and several novel conceptual and technological developments, to formulate an issue currently at the forefront of microbial studies: "Is it possible that unknown microorganisms, with different properties than those currently associated with the known living world, are thriving in nature?"

The potential theoretical importance of such "known unknowns" and even "unknown unknowns" of the microbial world (e.g., unknown genes, genomes, functions, organisms, processes, and communities associated with uncultured microbes and viruses), which have often been popularized under the catch-phrase "microbial dark matter," should not be underestimated. Interestingly, the relevance of this phrase is debated in microbiology. Many scientists find the metaphor misleading or inaccurate, because the "microbial dark matter" does not correspond to the dark matter studied by astronomers and physicists. The latter represents a hypothetical, still unobserved, although widely accepted, kind of matter, which does not interact with light but interacts through gravity. Taking the mass of this unseen astronomic dark matter into account would explain the incorrect predictions of the movement of galaxies by classic astronomy theories. This astronomic dark matter is thus unquestionably different from the microbial dark matter. However, other microbiologists have endorsed the analogy (Rinke et al. 2013; Lobb et al. 2015; Lok 2015; Saw et al. 2015; Bruno et al. 2017; Krishnamurthy and Wang 2017; Lewis 2017), since the phrase nonetheless conveniently stresses that, to some extent, newly discovered microbes can harbor a different biology from those that had been cultured. Although we agree that microbial and astronomic dark matter are very different notions, we also find the phrase "microbial dark matter," popularized by Rinke et al. (2013), to be more useful than detrimental. First, it is a convenient shorthand for the idea that unknown microbial life may be playing an important and even dominant role in ecosystem processes. Second, it has some editorial and educational virtues, as it effectively helps raise interest in microbiology studies beyond the field of microbiology (in which no one would really conflate astronomic and microbial dark matter), surely enhancing the general interest in the unexplored diversity of microbes and their genes. We recommend, however, a careful rather than sensationalistic use of the term, to describe the (overwhelming) amount of microbes, microbial genes, and microbial contributions to processes that were unknown at the time at which scientists performed their analyses.

Indeed, much of the extant knowledge in biology, that is, about biological entities and biological processes, heavily relies on analyses conducted on macro-organisms and on cultured microbes. Yet, 60–99% of the microbial diversity is not easily culturable, or is not culturable using standard techniques (Staley and Konopka 1985; Barer and Harwood 1999). Unraveling the microbial dark matter could thus lead to two (nonexclusive) types of observations. Either the discovery of hidden microbes will show that microbes unveiled from the microbial dark matter are comparable to cultured microbes in terms of genetic diversity, ecological roles, abundance, and evolutionary history, and are affected by similar processes, in which case our current knowledge of microbes is representative of what is really going on in nature (we will simply find more of what we already knew by mining the microbial world); or the microbial dark matter will prove to host entities and processes that differ from those already described, with the major consequence that scientific knowledge will need not only to be completed but also to be corrected as microbiologists gain access to this still hidden microbial world, in order to account for new phenomena that are poorly explained by extant theories. Such significant theoretical transformations have arguably occurred when microbiologists 1) looked for life in extreme environments, 2) detected life under unexpected (i.e., very diverged) forms, and 3) unveiled new processes involving microbes, which allows us to stress some key features for the success of scientific research oriented toward the discovery of microbiological novelty.

Searching for Life in Extreme Environments: A Few Lessons

The development of molecular markers and sequencing techniques was instrumental for the discovery of extremophiles. By unveiling the archaea, a novel early branching Domain of life, possibly sister-group to eukaryotes, Carl Woese's phylogenetic studies of the 16S rRNA revolutionized the views on the entire biological world (Woese and Fox 1977; Woese et al. 1990). Woese argued that, rather than being partitioned into two major groups, the eukaryotes and the prokaryotes, the living world encompassed a much broader microbial diversity, justifying its classification into three Domains of life. Subsequently, Woese and his colleagues (referred to as "the Woese army" by Lynn Margulis; Doolittle 2013) actively promoted this position, bringing the newly termed "archaea" into full light, while intending to ban the use of the "older" term "prokaryotes" (Pace 2006).

Importantly, this comparative approach of molecular phylogenetics was later coupled to a phase of exploratory science (Waters 2007). Exploratory science is in essence a strategy of data mining. It goes from the data to the hypotheses (Burian 2013), seeking (robust) patterns in the data or unraveling new phenomena. Although microbiology has a long history of exploratory research (O'Malley 2014), this mode of science stands in strong contrast with the more classic hypothetico-deductive strategy, heralded by Karl Popper. This deductive approach has inspired much of microbiology and biochemistry, since these studies largely operated from the hypotheses to the data, that is, using data to reject preexisting hypotheses, or possibly to corroborate them. Since exploratory science is not primarily aimed at rejecting (or confirming) preestablished hypotheses (thus deepening current knowledge), it can potentially produce novel, unexpected knowledge, or simply fail, making the financial and scientific investment in exploratory studies especially risky.

Fortunately, the pioneering approach, first largely based on the development of 16S rRNA gene sequencing (Schmidt et al. 1991; Barns et al. 1996; Hugenholtz et al. 1998), then on the sequencing of other markers (Beja et al. 2000), and later on the development of metagenomics (Breitbart et al. 2002; Tyson et al. 2004; Tringe et al. 2005) and single-cell genomics, bypassed the need for culture studies, thereby lifting a blind spot that culture-based investigation had imposed on comparative analyses. These studies returned a diversity of exciting findings. By the beginning of the 2000s, microbial ecologists had started characterizing the gene content, diversity, and relative abundance of environmental microbes (Venter et al. 2004). They had identified new functions of major importance in the ocean (e.g., ammonia oxidation by archaea; Francis et al. 2005), possibly affecting the global nitrogen cycle, as well as unexpected photosynthesis (and other) genes in viruses (Sullivan et al. 2005). They had also gained unprecedented insights into the survival strategies of microbes (Tyson et al. 2004), into their community structures (Tyson et al. 2004; DeLong et al. 2006), and into their niche-specific adaptations (Tringe et al. 2005), for example, by unraveling unknown iron-oxidizing and free-living diazotrophs in acid mine drainage biofilms (Ram et al. 2005; Tyson et al. 2005).

Environmental genomics in particular produced remarkable results when microbiologists turned their eyes to extreme regions (in terms of temperature, pH, pressure, mineralization, radiation) that many considered a priori devoid of life (Pikuta et al. 2007). The seemingly counter-intuitive idea of sampling lifeforms in environments hostile to life unveiled a diversity of extremophiles in the three Domains. Granted, finding DNA in extreme environments does not in itself constitute ultimate proof that the life forms bearing this DNA existed there, but analyses of environmental DNA (be they nonassembly based, assembly based, or even genome-resolved metagenomics) are nonetheless an important step in the discovery of new microbes in extreme environments. Cultivation of microbes from these extreme locations offers much stronger evidence; for example, Karl-Otto Stetter used this cultivation approach to discover life at the extreme temperature limits, pushing the boundaries of life as it was then known (Stetter 2013).

Using these strategies, microbiologists realized that life was possible at a temperature of 122 °C, at negative pH (!), at pH > 11, and at pressures exceeding 1,200 atmospheres; that microbes could be resurrected after 20–40 million years of dormancy, survive 2.5 years of travel in space, and thrive within rocks as well as in the terrestrial stratosphere (at > 44 km of altitude) (de los Rios et al. 2003; Pikuta et al. 2007) (see, e.g., https://www.slideshare.net/AnjaliMalik3/extremophiles-imp-1). Some of these statistics were so unexpected that Pikuta et al. (2007), summarizing the then-current knowledge on extremophiles, drew axes that were too short for temperature, pH, and salinity on plots showing the physico-chemical conditions compatible with life. Some environmental microbes were definitely outliers with respect to the majority of known creatures. This counter-intuitive search for extremophiles likely reaches its summit in astrobiological studies, which search for life beyond Earth, seeking to define biomarkers in exoplanetary analogs and to learn to detect these biomarkers in regions of the universe that currently fit the minimal requirements for life in C, H, N, O, P, S, liquid water, and energy (Olsson-Francis and Cockell 2010). No one knows whether extraterrestrial microbes will ultimately be discovered this way, but, ironically, terrestrial microbes, which can grow in the International Space Station and Spacecraft Assembly Facilities (Checinska et al. 2015), at least have potentially increased chances of spreading in space, a problem known as the issue of planetary protection (McKay and Davis 1989).

Searching for Very Divergent Homologs: A Few Lessons

Inasmuch as environmental genomics enhances microbial dark matter studies, for example, by unraveling extremophiles, it also raises issues, since environmental genomics has its own blind spots. The selection of samples and of genes of interest (e.g., in metabarcoding projects, or more generally in targeted environmental genomics), the many filtering decisions and heuristics in the subsequent bioinformatic treatments imposed by the wealth of environmental sequences (i.e., reads and contigs), as well as the increased standardization of the methods and questions of environmental genomics studies (a logical scientific development for a broad comparative science; Vigliotti et al. 2017), raise the risk that the most unexpected life forms, even if already sequenced, remain drowned under this deluge of data. This risk has well-known roots: our observations are strongly constrained by what our theory makes us prone to expect, and therefore by former perspectives informing various criteria in the sampling process.

This limit is obvious in the process of size fractionation associated with metagenomic analyses, such as the one conducted in the Tara expedition, which a priori optimized the mesh sizes of its filters to capture different taxa of marine microbes (Karsenti et al. 2011). This procedure entails the inherent risk that important players of the microbial world may be overlooked if their sizes do not satisfy these filtering conditions. For example, 10 years ago, few (if any) microbiologists or virologists would have assumed that bacteria in the range of 0.2 microns and viruses > 0.2 microns existed (Council 1999). This view radically changed with the discovery of ultrasmall bacteria, aka nanoorganisms, such as the CPR in 2015 (Brown et al. 2015; Luef et al. 2015) or some DPANN in 2010 (Baker et al. 2010), and with the discovery of giant viruses, such as Mimiviridae, in 2003 (La Scola et al. 2003). These taxa are now found in diverse environments, albeit at low abundance (Brown et al. 2015). CPR are remarkably phylogenetically diverse (Hug et al. 2016), representing up to 50% of the bacterial domain (Anantharaman et al. 2016), and present an unusual biology (e.g., 16S rRNA with insertions, lack of metabolic genes usually considered essential), which suggests that CPR depend on other life forms (Kantor et al. 2013; Gong et al. 2014; Brown et al. 2015; Nelson and Stegen 2015; Danczak et al. 2017). CPR cells occupy an extremely tiny average volume of 0.009 ± 0.002 µm³, for a spherical diameter of 253 ± 25 nm (Luef et al. 2015). Mimivirus biology is no less striking. In particular, mimiviruses are hosts to yet another new kind of virus: virophages, that is, viruses of giant viruses (Boyer et al. 2011). The phylogenetic position of these relative newcomers, especially regarding how deep CPR and giant viruses branch (if they do) with respect to the other Domains of life, is heavily debated (Colson et al. 2012; Moreira and Lopez 2015; Hug et al. 2016), even though, regarding the phylogenetic position of CPR, Hug et al. did not commit themselves strongly, stressing instead that their method did not result in a well-resolved phylogeny (Hug et al. 2016). Such debates illustrate that attempts to establish novel groups inevitably (and logically) meet resistance, but no one disputes any longer that an accurate picture of the microbial world and its evolution cannot be satisfactorily achieved without including nanoorganisms and viruses, be they giant or not.

Environmental genomics has not merely unraveled new microbial lineages; it has also reported new gene families (Riesenfeld et al. 2004; Lok 2015), new CRISPR-Cas systems (Burstein et al. 2017), and unusual gene forms (i.e., very divergent homologs of known genes). In principle, newly sequenced environmental genes could fall into one of four groups (fig. 1). The in silico functional and taxonomical annotation of environmental genes using existing ontologies (here, applied to 339 metagenomes from Fondi et al. 2016, sampling a diversity of environments, that is, soil, seawater, inland-water, wastewater, host, air, bioremediation, biotransformation, and sludge waste) indicates that most environmental genes have unknown functions and belong to uncharacterized microbial lineages (fig. 2). In fact, at the minimum %ID threshold of 95%, > 50% of these genes are neither functionally nor taxonomically annotated, and at the minimum %ID threshold of 50%, > 30% of these genes are neither functionally nor taxonomically annotated, which stresses the genuine abundance of microbial dark matter in metagenomic data.

FIG. 1. Four types of environmental sequences. Environmental sequences can be classified based on their taxonomical annotation (horizontal line) and their functional annotation (vertical column), which defines four categories. The cells in purple and black correspond to categories that are not readily explained based on current biological knowledge.

FIG. 2. Microbial dark matter across a diversity of environmental samples. Proteins inferred (with FragGeneScan; Rho et al. 2010) from the metagenomic sequences of Fondi et al. (2016), clustered based on their taxonomic (using MEGAN 6; Huson et al. 2016) and functional (using EggNOG-mapper; Huerta-Cepas et al. 2017) annotation. The pie charts represent the proportion of proteins from each type of environment. The taxonomy annotation was performed using three minimum percentages of identity: 50% (panels A and B), 85% (panels C and D), and 95% (panels E and F). In panels A, C, and E, the proteins were clustered based on their functional annotation including the category S ("Function unknown"). Panels B, D, and F were clustered with the exclusion of the category S.

Bioinformatic developments are currently designed to associate these unknown genes with reference gene families. For example, the search for highly divergent homologs using sequence similarity networks (Lopez et al. 2015) highlighted that a large majority of the ancient gene families that are well conserved in cultured microbes have extremely divergent homologs in nature. Lopez et al. (2015) proposed that at least some of these very divergent homologs might signal the existence of deep-branching, yet unseen, major divisions of life. Discovering deeper environmental lineages, branching below the currently recognized prokaryotic domains, could reopen the debate on the number of Domains of life, questioning our fundamental knowledge in terms of biological classification and regarding early life evolution. Bioinformatic studies of random environmental sequences, however, need to be complemented by another type of evidence, that is, individual genome sequences from putative very early branching microbes or even isolation of these organisms. The former type of evidence is typically obtained by genome-resolved metagenomics, that is, genome binning from metagenomic data sets. Genome binning consists in grouping assembled metagenomic contigs into putative genomes using relative abundance and/or tetranucleotide frequencies (Sedlar et al. 2017). This protocol makes it possible to recover synteny and to identify conserved or unusual/unexpected genes for related microorganisms. This approach is invaluable for recovering genomes of uncultured organisms and for studying their metabolic capabilities.
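To make the idea of genome binning described above more concrete, the following is a minimal sketch, not the method of Sedlar et al. (2017) or of any published binning tool. It assumes that each contig comes with a sequence and a mean coverage value (a proxy for relative abundance), and it groups contigs greedily when both their tetranucleotide profiles and their coverages are similar. The function names, the greedy seed-based strategy, and the two thresholds (`max_tnf_dist`, `max_cov_ratio`) are hypothetical choices made only for illustration.

```python
from itertools import product
from math import sqrt

# All 256 possible tetranucleotides, mapped to a vector index.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]
INDEX = {kmer: i for i, kmer in enumerate(KMERS)}


def tetranucleotide_profile(seq):
    """Return the normalized 256-dimensional tetranucleotide frequency vector."""
    counts = [0] * len(KMERS)
    seq = seq.upper()
    for i in range(len(seq) - 3):
        idx = INDEX.get(seq[i:i + 4])
        if idx is not None:  # windows containing ambiguous bases (e.g., N) are skipped
            counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(KMERS)


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def bin_contigs(contigs, max_tnf_dist=0.1, max_cov_ratio=2.0):
    """Greedy, illustrative binning (hypothetical thresholds).

    contigs: iterable of (name, sequence, mean_coverage).
    A contig joins the first bin whose seed contig has a similar
    tetranucleotide profile AND a similar coverage; otherwise it seeds a new bin.
    """
    bins = []  # each bin: {"profile": ..., "coverage": ..., "members": [...]}
    for name, seq, cov in contigs:
        profile = tetranucleotide_profile(seq)
        for b in bins:
            low, high = sorted((cov, b["coverage"]))
            similar_cov = low > 0 and high / low <= max_cov_ratio
            if similar_cov and euclidean(profile, b["profile"]) <= max_tnf_dist:
                b["members"].append(name)
                break
        else:
            bins.append({"profile": profile, "coverage": cov, "members": [name]})
    return [b["members"] for b in bins]


# Toy data: c1 and c2 share composition and abundance; c3 differs in
# composition; c4 shares the composition of c1 but is far less abundant.
toy_contigs = [
    ("c1", "ACGT" * 50, 30.0),
    ("c2", "ACGT" * 40, 28.0),
    ("c3", "AATT" * 50, 29.0),
    ("c4", "ACGT" * 45, 5.0),
]
print(bin_contigs(toy_contigs))  # [['c1', 'c2'], ['c3'], ['c4']]
```

Real binning tools rely on more robust statistics (for instance, coverage across several samples and probabilistic or graph-based clustering), so this sketch should only be read as an illustration of why sequence composition and relative abundance are complementary signals: c3 and c4 are each separated from c1 by only one of the two criteria.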
Moreover, within the field of environmental genomics, single-cell genomics offers an additional approach to produce environmental data sets, identifying genes from the same genomes. Even though these approaches are gaining popularity and data are starting to accumulate, so far, despite the high number of environmental "known unknowns," no scientists (i.e., peer-reviewers) working with major scientific journals have yet been convinced that enough evidence for new candidate Domains of life is available. For example, the remarkable work by Parks et al. (2017) did not use universally shared ribosomal proteins to build a tree of life simultaneously including novel environmental lineages as well as known archaeal and bacterial lineages, although this strategy could have identified deep-branching environmental groups.

Microbial Processes as a Yet Unexhausted Source of Knowledge

At the same time that new microbes were discovered, our knowledge of processes involving or affecting microbes evolved substantially. The focus on interactions and the use of networks rather than trees to frame microbial studies is emerging as a major trend. It is becoming obvious that simple tree-based models, aiming at reconstructing the divergence of lineages from a last common ancestor, do not fully do justice to the diversity and complexity of the processes explaining microbial evolution. For example, in nature, diversity-generating retroelements contribute to rapid, targeted sequence diversification in Archaea and their viruses (Paul et al. 2015), and in CPR (Paul et al. 2017). Introgressive processes such as lateral gene transfer stress the collective dimension of microbial evolution (Doolittle 1999; Ochman et al. 2000; Bapteste et al. 2012). Likewise, the discovery of environmental microbes with genuinely incomplete genomes (i.e., lacking genes considered essential) and of syntrophic consortia underlines the importance of metabolic, ecological, and evolutionary scaffolding in the microbial world (DeLong 2007; Morris et al. 2012; Sachs and Hollowell 2012; Caporael et al. 2013; Brown et al. 2015; Ereshefsky and Pedroso 2015). The claim that in nature microbes depend on other microbes to survive contrasts strongly with the notion that natural selection ultimately favors individual optimized lineages via the success of the fittest cells among large and phylogenetically homogeneous microbial populations. It matches well, however, with the empirical observation that pure culture fails for most microbes (Staley and Konopka 1985), and in fact provides an explanation for this great plate count anomaly. Microbes belong to collectives rather than living alone.

Other striking interactions are also unveiled as scientists dig further into the microbial world. For example, previously unheard-of forms of communication impact microbial and viral population dynamics (Erez et al. 2017). Microbiomes and their hosts coconstruct a broad range of animal and plant phenotypes (Gill et al. 2006; Gilbert et al. 2015), to the point that some propose to introduce holobionts (the emergent associations of hosts and microbes) as a novel kind of central evolutionary player (Bordenstein and Theis 2015; Moran and Sloan 2015; Theis et al. 2016). At an even broader scale, in the environment, microbes, most of which are unknown, are now assumed to affect the geochemical processes that shape our planet (Guidi et al. 2016) and, by a process called niche construction (Laland et al. 2016), these microbes are considered likely to impact ecosystems and the future of life. All these processes (lateral gene transfer, scaffolding, communication, microbial coconstruction, and niche construction), while widespread in the microbial world, are still rather peripheral in biological explanations. Introducing the processes to which microbial dark matter contributes into biological theory thus requires revising the relative priority currently attributed to concepts in scientific explanations, which is likely to be a slow and tedious epistemic process. For example, prokaryotic biology, especially when considering microbiomes, appears in fact so different from the biology of model eukaryotic organisms that several evolutionary biologists and theoreticians have independently suggested that key aspects of the classic Darwinian theory and of the Modern Synthesis would have been very different had microbial studies been more central during the early development of evolutionary theory. Others, however, disagree that the structure and content of evolutionary theory need to be reshaped, even in the light of this new knowledge in microbiology (Wray 2014). Yet, debates around the gene content, nature, and phylogenetic position of Asgard archaea (Saw et al. 2015; Da Cunha et al. 2017; Zaremba-Niedzwiedzka et al. 2017) powerfully illustrate that an enhanced knowledge of the microbial dark matter unquestionably has the potential to transform central elements of evolutionary theory. If Asgard archaea, currently only known via assemblies of environmental reads, prove to be the sister-group of eukaryotes, this should (at least) impact the very notion of a tree of life and bring further evidence regarding the number of Domains of life (since a convincing argument that the 2-domain tree is better supported than the 3-domain tree predates the discovery of Asgard; Williams et al. 2013); and, depending on the detailed structural biology and metabolism of these Asgard, it will also help to test competing hypotheses for the origin of eukaryotes (Koonin 2015; Sousa et al. 2016).

On a different level, newly discovered microbial genes have also impacted, and could further impact, critical societal needs. Discovering enzymes, such as lipases (Rogalska et al. 1997) or organophosphorus-degrading enzymes (Singh 2009), with greater activity, specificity, or stability, or new antibiotics in the environment (Lok 2015), such as teixobactin (Ling et al. 2015), is central to the development of the industrial enzymes market, which is expected to represent up to 6.20 billion dollars in 2020. Scientific research, as acknowledged by several Nobel Prizes, has also greatly benefited from the discovery of microbial enzymes, including restriction enzymes, such as HindII (Smith and Wilcox 1970), or DNA polymerases (Brock and Freeze 1969), which allowed the development of the polymerase chain reaction (Saiki et al. 1988). More recently, the discovery of CRISPR-Cas9 systems (Jinek et al. 2012), now used for genome editing, also highlights the significant potential of microbial gene discovery to enhance the evolution of drugs, biotechnologies, and research tools.

Conclusion

The discovery of an increasing number of types of microbes has consistently shown that our planet hosts microbes with properties that were not simply identical to the ones formerly described. Studies of the microbial dark matter have brought forward the existence of novel entities (e.g., nanoorganisms, giant viruses, and virophages) and novel relationships within the microbial world (e.g., viral languages, high divergence, and scaffolding). This formerly dark microbial matter has not been unraveled randomly. To sum up its logic of discovery, it has required: to think outside the box (e.g., Woese's definition of a novel Domain), to take scientifically and financially risky decisions (e.g., sampling sites where life was unlikely), to develop novel methods pushing back the limits of detection (e.g., better microscopes, inclusive networks), to prepare one's mind to detect unknown and unexpected forms (e.g., biomarkers), to identify and seek to explain anomalies (e.g., the great plate count anomaly), to change perspectives (e.g., embracing the notion of nanoorganisms, or of multiple prokaryotic domains), to use analogies to uncover new microbial systems (e.g., for the study of extremophiles in space), to purposely depart from normal scientific practices and background knowledge (e.g., network studies of divergent gene forms, exploration of increasingly extreme environments), to be willing to create novel groups (e.g., Archaea, CPR, Mimiviridae, ...), and finally to convince (e.g., by banning competing notions, or by establishing new attractive fields, such as environmental genomics). Indeed, many of the discoveries presented in this work generated resistance. This resistance is perfectly explainable. Unraveling the unknown is especially difficult because, although we could empirically sketch a logic of scientific discovery, at the time each novel finding was made, its discoverers could not yet rely on a standard method; essentially, they had to convince the rest of the community that both their unusual approaches and their findings were relevant. Convincing one's own peers is ultimately essential, and possibly one of the largest and most common challenges for microbial dark matter studies, and this seems especially difficult, even for creative outsiders. Van Leeuwenhoek's pioneering example is indeed a great reminder that extraordinary results can easily be forgotten.

Acknowledgments

R.L., G.B., J.S.P., and E.B. are funded by the European Research Council (FP7/2017-2013 Grant Agreement #615274). We thank Dr Karen Olsson-Francis, Dr Yan Boucher, and Dr Lucie Bittner for stimulating discussion.

Literature Cited

Anantharaman K, et al. 2016. Thousands of microbial genomes shed light on interconnected biogeochemical processes in an aquifer system. Nat Commun. 7:13219.
Baker BJ, et al. 2010. Enigmatic, ultrasmall, uncultivated Archaea. Proc Natl Acad Sci U S A. 107(19):8806–8811.
Bapteste E, et al. 2012. Evolutionary analyses of non-genealogical bonds produced by introgressive descent. Proc Natl Acad Sci U S A. 109(45):18266–18272.
Barer MR, Harwood CR. 1999. Bacterial viability and culturability. Adv Microb Physiol. 41:93–137.
Barns SM, Delwiche CF, Palmer JD, Pace NR. 1996. Perspectives on archaeal diversity, thermophily and monophyly from environmental rRNA sequences. Proc Natl Acad Sci U S A. 93(17):9188–9193.
Beja O, et al. 2000. Bacterial rhodopsin: evidence for a new type of phototrophy in the sea. Science 289(5486):1902–1906.
Bordenstein SR, Theis KR. 2015. Host biology in light of the microbiome: ten principles of holobionts and hologenomes. PLoS Biol. 13(8):e1002226.
Boyer M, et al. 2011. Mimivirus shows dramatic genome reduction after intraamoebal culture. Proc Natl Acad Sci U S A. 108(25):10296–10301.
Breitbart M, et al. 2002. Genomic analysis of uncultured marine viral communities. Proc Natl Acad Sci U S A. 99(22):14250–14255.
Brock TD, Freeze H. 1969. Thermus aquaticus gen. n. and sp. n., a nonsporulating extreme thermophile. J Bacteriol. 98(1):289–297.
Brown CT, et al. 2015. Unusual biology across a group comprising more than 15% of domain bacteria. Nature 523(7559):208–211.
Bruno A, et al. 2017. Exploring the under-investigated "microbial dark matter" of drinking water treatment plants. Sci Rep. 7:44350.
Burian RM. 2013. Exploratory experimentation. New York: Springer. p. 720–723.
Burstein D, et al. 2017. New CRISPR-Cas systems from uncultivated microbes. Nature 542(7640):237–241.
Caporael L, Griesemer J, Wimsatt W. 2013. Scaffolding in evolution, culture, and cognition. Massachusetts: MIT Press.
Checinska A, et al. 2015. Microbiomes of the dust particles collected from the International Space Station and Spacecraft Assembly Facilities. Microbiome 3:50.
Colson P, de Lamballerie X, Fournous G, Raoult D. 2012. Reclassification of giant viruses composing a fourth domain of life in the new order Megavirales. Intervirology 55(5):321–332.
Council NR editor. 1999. Report from the National Research Council. Washington (DC).
Da Cunha V, Gaia M, Gadelle D, Nasir A, Forterre P. 2017. Lokiarchaea are close relatives of Euryarchaeota, not bridging the gap between prokaryotes and eukaryotes. PLoS Genet. 13:e1006810.
Danczak RE, et al. 2017. Members of the Candidate Phyla Radiation are functionally differentiated by carbon- and nitrogen-cycling capabilities. Microbiome 5(1):112.
de los Rios A, Wierzchos J, Sancho LG, Ascaso C. 2003. Acid microenvironments in microbial biofilms of antarctic endolithic microecosystems. Environ Microbiol. 5(4):231–237.
DeLong EF. 2007. Microbiology. Life on the thermodynamic edge. Science 317(5836):327–328.
DeLong EF, et al. 2006. Community genomics among stratified microbial assemblages in the ocean's interior. Science 311(5760):496–503.
Diamond J. 1997. Guns, germs, and steel: the fates of human societies. New York city: W. W. Norton.
Doolittle WF. 1999. Phylogenetic classification and the universal tree. Science 284(5423):2124–2129.
Doolittle WF. 2013. Carl R. Woese (1928–2012). Curr Biol. 23(5):R183–R185.
Ereshefsky M, Pedroso M. 2015.
Lok C. 2015. Mining the microbial dark matter. Nature 522(7556):270–273.
Lopez P, Halary S, Bapteste E. 2015. Highly divergent ancient gene families in metagenomic samples are compatible with additional divisions of life. Biol Direct. 10:64.
Luef B, et al. 2015. Diverse uncultivated ultra-small bacterial cells in groundwater. Nat Commun. 6:6372.
McKay CP, Davis WL. 1989. Planetary protection issues in advance of
Rethinking evolutionary individuality. Proc human exploration of Mars. Adv Space Res. 9(6):197–202. Natl Acad Sci U S A. 112(33):10126–10132. Moran NA, Sloan DB. 2015. The hologenome concept: helpful or hollow? Erez Z, et al. 2017. Communication between viruses guides lysis-lysogeny PLoS Biol. 13(12):e1002311. decisions. Nature 541(7638):488–493. Moreira D, Lopez GP. 2015. Evolution of viruses and cells: do we need a Falkowski P. 2015. Leeuwenhoek’s lucky break. Discover 1–5. fourth domain of life to explain the origin of eukaryotes? Philos Trans R Fondi M, et al. 2016. “Every Gene Is Everywhere but the Environment Soc Lond B Biol Sci. 370(1678):20140327. Selects”: global geolocalization of gene sharing in environmental Morris JJ, Lenski RE, Zinser ER. 2012. The Black Queen Hypothesis: evolu- samples through network analysis. Genome Biol Evol. 8(5): tion of dependencies through adaptive gene loss. MBio 3(2):e00036- 1388–1400. 12. Francis CA, Roberts KJ, Beman JM, Santoro AE, Oakley BB. 2005. Ubiquity Nelson WC, Stegen JC. 2015. The reduced genomes of Parcubacteria and diversity of ammonia-oxidizing archaea in water columns and (OD1) contain signatures of a symbiotic lifestyle. Front Microbiol. sediments of the ocean. Proc Natl Acad Sci U S A. 6:713. 102(41):14683–14688. Ochman H, Lawrence JG, Groisman EA. 2000. Lateral gene transfer and Gilbert SF, Bosch TC, Ledon-Rettig C. 2015. Eco-Evo-Devo: developmental the nature of bacterial innovation. Nature 405(6784):299–304. symbiosis and developmental plasticity as evolutionary agents. Nat Rev Olsson-Francis K, Cockell CS. 2010. Experimental methods for studying Genet. 16(10):611–622. microbial survival in extraterrestrial environments. J Microbiol Methods Gill SR, et al. 2006. Metagenomic analysis of the human distal gut micro- 80(1):1–13. biome. Science 312(5778):1355–1359. O’Malley MA. 2014. Philosophy of microbiology. Cambridge: Cambridge Gong J, Qing Y, Guo X, Warren A. 2014. 
“Candidatus Sonnebornia University Press. yantaiensis”, a member of candidate division OD1, as intracellular Pace NR. 2006. Time for a change. Nature 441(7091):289. bacteria of the ciliated protist Paramecium bursaria (Ciliophora, Parks DH, Rinke C, Chuvochina M, Chaumeil PA, Woodcroft BJ, Evans PN, Oligohymenophorea). Syst Appl Microbiol. 37(1):35–41. Hugenholtz P, Tyson GW. 2017. Recovery of nearly 8, 000 Guidi L, et al. 2016. Plankton networks driving carbon export in the oli- metagenome-assembled genomes substantially expands the tree of gotrophic ocean. Nature 532(7600):465–470. life. Nat Microbiol. 2:1533–1542. Huerta-Cepas J, Forslund K, Pedro Coelho L, Szklarczyk D, Juhl Jensen L, Paul BG, et al. 2015. Targeted diversity generation by intraterrestrial ar- von Mering C, Bork P. 2017. Fast genome-wide functional annotation chaea and archaeal viruses. Nat Commun. 6:6585. through orthology assignment by eggNOG-mapper. Mol Biol Evol. Paul BG, et al. 2017. Retroelement-guided protein diversiﬁcation abounds 34(8):2115–2122. in vast lineages of Bacteria and Archaea. Nat Microbiol. 2:17045. Hug LA, et al. 2016. A new view of the tree of life. Nat Microbiol. 1:16048. Pikuta EV, Hoover RB, Tang J. 2007. Microbial extremophiles at the limits Hugenholtz P, Goebel BM, Pace NR. 1998. Impact of culture-independent of life. Crit Rev Microbiol. 33(3):183–209. studies on the emerging phylogenetic view of bacterial diversity. J Ram RJ, et al. 2005. Community proteomics of a natural microbial bioﬁlm. Bacteriol. 180(18):4765–4774. Science 308(5730):1915–1920. Huson DH, et al. 2016. MEGAN Community Edition – interactive explora- Rho M, Tang H, Ye Y. 2010. FragGeneScan: predicting genes in short and tion and analysis of large-scale microbiome sequencing data. PLoS error-prone reads. Nucleic Acids Res. 38(20):e191. Comput Biol. 12(6):e1004957. Riesenfeld CS, Goodman RM, Handelsman J. 2004. Uncultured soil bac- Jinek M, et al. 2012. 
A programmable dual-RNA-guided DNA endonucle- teria are a reservoir of new antibiotic resistance genes. Environ ase in adaptive bacterial immunity. Science 337(6096):816–821. Microbiol. 6(9):981–989. Kantor RS, et al. 2013. Small genomes and sparse metabolisms of Rinke C, et al. 2013. Insights into the phylogeny and coding potential of sediment-associated bacteria from four candidate phyla. MBio microbial dark matter. Nature 499(7459):431–437. 4(5):e00708–e00713. Rogalska E, Douchet I, Verger R. 1997. Microbial lipases: structures, func- Karsenti E, et al. 2011. A holistic approach to marine eco-systems biology. tion and industrial applications. Biochem Soc Trans. 25(1):161–164. PLoS Biol. 9(10):e1001177. Sachs JL, Hollowell AC. 2012. The origins of cooperative bacterial com- Koonin EV. 2015. Archaeal ancestors of eukaryotes: not so elusive any munities. MBio 3(3):e00099-12. more. BMC Biol. 13:84. Saiki RK, et al. 1988. Primer-directed enzymatic ampliﬁcation of DNA with Krishnamurthy SR, Wang D. 2017. Origins and challenges of viral dark a thermostable DNA polymerase. Science 239(4839):487–491. matter. Virus Res. 239:136–142. Saw JH, et al. 2015. Exploring microbial dark matter to resolve the deep La Scola B, et al. 2003. A giant virus in amoebae. Science 299(5615):2033. archaeal ancestry of eukaryotes. Philos Trans R Soc Lond B Biol Sci. Laland K, Matthews B, Feldman MW. 2016. An introduction to niche 370(1678):20140328. construction theory. Evol Ecol. 30:191–202. Schickore J. 2014. Scientiﬁc discovery. Stanford: The Stanford Lewis K. 2017. Antibiotics from the microbial dark matter. FASEB J. Encyclopedia of Philosophy. 31(Suppl 257):252. Schmidt TM, DeLong EF, Pace NR. 1991. Analysis of a marine picoplankton Ling LL, et al. 2015. A new antibiotic kills pathogens without detectable community by 16S rRNA gene cloning and sequencing. J Bacteriol. resistance. Nature 517(7535):455–459. 173(14):4371–4378. Lobb B, Kurtz DA, Moreno-Hagelsieb G, Doxey AC. 2015. 
Remote homol- Sedlar K, Kupkova K, Provaznik I. 2017. Bioinformatics strategies for tax- ogy and the functions of metagenomic dark matter. Front Genet. onomy independent binning and visualization of sequences in shotgun 6:234. metagenomics. Comput Struct Biotechnol J. 15:48–55. 714 Genome Biol. Evol. 10(3):707–715 doi:10.1093/gbe/evy031 Advance Access publication February 5, 2018 Downloaded from https://academic.oup.com/gbe/article-abstract/10/3/707/4840377 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Microbial Dark Matter Investigations GBE Singh BK. 2009. Organophosphorus-degrading bacteria: ecology and in- acidophilic microbial community. Appl Environ Microbiol. dustrial applications. Nat Rev Microbiol. 7(2):156–164. 71(10):6319–6324. Smith HO, Wilcox KW. 1970. A restriction enzyme from Hemophilus Venter JC, et al. 2004. Environmental genome shotgun sequencing of the inﬂuenzae. I. Puriﬁcation and general properties. J Mol Biol. Sargasso Sea. Science 304(5667):66–74. 51(2):379–391. Vigliotti C, Lopez P, Bapteste E. 2017. Microbial diversity studies: the (par- Sousa FL, Neukirchen S, Allen JF, Lane N, Martin WF. 2016. Lokiarchaeon adoxical) challenge to have a broad view with metagenomics. In: is hydrogen dependent. Nat Microbiol. 1(5):1–3. Maurel PGMC, editor. Evolution and biodiversity. ISTE Editions. Staley JT, Konopka A. 1985. Measurement of in situ activities of non- Amsterdam: Elsevier. photosynthetic microorganisms in aquatic and terrestrial habitats. Waters CK. 2007. The nature and context of exploratory experimentation: Annu Rev Microbiol. 39:321–346. an introduction to three case studies of exploratory research. Hist Stetter KO. 2013. A brief history of the discovery of hyperthermophilic life. Philos Life Sci. 29(3):275–284. Biochem Soc Trans. 41(1):416–420. Williams TA, Foster PG, Cox CJ, Embley TM. 2013. An archaeal origin of Sullivan MB, Coleman ML, Weigele P, Rohwer F, Chisholm SW. 2005. eukaryotes supports only two primary domains of life. 
Nature Three Prochlorococcus cyanophage genomes: signature features and 504(7479):231–236. ecological interpretations. PLoS Biol. 3(5):e144. Woese CR, Fox GE. 1977. Phylogenetic structure of the prokaryotic do- Theis KR, Dheilly NM, Klassen JL, Brucker RM, Baines JF, Bosch TC, Cryan main: the primary kingdoms. Proc Natl Acad Sci U S A. JF, Gilbert SF, Goodnight CJ, Lloyd EA, et al. 2016. Getting the hol- 74(11):5088–5090. ogenome concept right: an eco-evolutionary framework for hosts and Woese CR, Kandler O, Wheelis ML. 1990. Towards a natural system of their microbiomes. mSystems 1 (2): DOI: 10.1128/mSystems.00028- organisms: proposal for the domains Archaea, Bacteria, and Eucarya. 16. Proc Natl Acad Sci U S A. 87(12):4576–4579. Tringe SG, et al. 2005. Comparative metagenomics of microbial commu- Zaremba-Niedzwiedzka K, et al. 2017. Asgard archaea illuminate the nities. Science 308(5721):554–557. origin of eukaryotic cellular complexity. Nature Tyson GW, et al. 2004. Community structure and metabolism through 541(7637):353–358. reconstruction of microbial genomes from the environment. Nature 428(6978):37–43. Tyson GW, et al. 2005. Genome-directed isolation of the key nitro- gen ﬁxer Leptospirillum ferrodiazotrophum sp.nov.from an Associate editor: Martin Embley Genome Biol. Evol. 10(3):707–715 doi:10.1093/gbe/evy031 Advance Access publication February 5, 2018 715 Downloaded from https://academic.oup.com/gbe/article-abstract/10/3/707/4840377 by Ed 'DeepDyve' Gillespie user on 16 March 2018
Genome Biology and Evolution – Oxford University Press
Published: Mar 1, 2018