Resources and tools for the high-throughput, multi-omic study of intestinal microbiota

Abstract
The human gut microbiome impacts several aspects of human health and disease, including digestion, drug metabolism and the propensity to develop various inflammatory, autoimmune and metabolic diseases. Many of the molecular processes that play a role in the activity and dynamics of the microbiota go beyond species and gene composition; understanding them therefore requires advanced bioinformatics support. This article aims to provide an up-to-date view of the resources and software tools that are being developed and used in human gut microbiome research, in particular data integration and systems-level analysis efforts. These efforts demonstrate the power of standardized and reproducible computational workflows for integrating and analysing varied omics data and gaining deeper insights into microbe community structure and function as well as host–microbe interactions.
Keywords: human gut microbiome, data repositories, large-scale and integrative computational tools, modelling, immunomodulation, drug screening
Background
The human gastrointestinal tract is a complex ecosystem in which eukaryotic cells continuously interact with nutrients and with the complex microbial population of the gut microbiota [1]. Gut microorganisms are the source of many bioactive products that play key functions in human host pathways and microbe–microbe interactions [2]. Processes such as host–microbe crosstalk, immune activation and inflammation, microbe–microbe signalling, microbial metabolism and antimicrobial activity are all active in the human gut [3]. Therefore, the ability to modulate the gut microbiome and the associated host–microbe interactions holds great promise for developing new therapeutic strategies for many chronic diseases and antibiotic-resistant infections [4, 5].
Colonization of the gut starts just after birth, when pioneering species interact with gut cells through surface receptors to promote the expression of a specific set of host genes and favour the colonization of commensal microorganisms [6]. Epithelial function and the mucosa-associated immune system are influenced by direct host–microbiota interactions and through modulation of the microbial metabolism [7]. The immune system is trained to ensure a fine balance between the response given to commensal gut microbiota (i.e. homeostatic and healthy situations) and to pathogens (i.e. gastrointestinal disorders) [8]. Several non-infectious human diseases, such as autoimmune disorders, inflammatory bowel disease (IBD) and some gut-associated cancers, are related to immunological imbalance and compositional perturbations of the gut microbiota, also known as dysbiosis. Gut dysbiosis is a major contributor to diet-related obesity and type 2 diabetes mellitus [9, 10]. For example, alterations in the relative abundances of the class Gammaproteobacteria and the phylum Verrucomicrobia, as well as in the ratio of Firmicutes to Bacteroidetes, are associated with overweight, and alterations in butyrate-producing bacteria, such as Faecalibacterium prausnitzii, are often related to diabetes mellitus [11, 12]. Moreover, genetic and simple obesity share similar structural and functional features of dysbiosis, such as higher production of toxins with known potential to induce metabolic deterioration (e.g. trimethylamine-N-oxide and indoxyl sulphate), higher abundance of genomes containing genes encoding enzymes involved in the production of these toxic co-metabolites and higher abundance of pathways for the biosynthesis of bacterial antigens (such as endotoxin) [13–15].
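As a minimal illustration of the compositional summaries mentioned above, the following sketch converts phylum-level read counts into relative abundances and a Firmicutes-to-Bacteroidetes ratio. The counts and the helper names are hypothetical and chosen only for the example; they are not taken from any of the cited studies.

```python
from typing import Dict


def relative_abundances(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert raw per-taxon read counts into fractions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads assigned to any taxon")
    return {taxon: n / total for taxon, n in counts.items()}


def firmicutes_bacteroidetes_ratio(counts: Dict[str, int]) -> float:
    """Ratio of Firmicutes to Bacteroidetes reads (hypothetical helper)."""
    bacteroidetes = counts.get("Bacteroidetes", 0)
    if bacteroidetes == 0:
        raise ValueError("ratio undefined: no Bacteroidetes reads")
    return counts.get("Firmicutes", 0) / bacteroidetes


# Hypothetical phylum-level read counts for one faecal sample.
sample_counts = {
    "Firmicutes": 600,
    "Bacteroidetes": 300,
    "Proteobacteria": 60,
    "Verrucomicrobia": 40,
}

abundances = relative_abundances(sample_counts)
fb_ratio = firmicutes_bacteroidetes_ratio(sample_counts)
print(abundances["Firmicutes"], fb_ratio)
```

Because relative abundances are constrained to sum to one, a change in one taxon shifts all the others, so a ratio of this kind is best read as a coarse descriptor of community composition rather than as a diagnostic.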
Although the precise cause remains unknown, profiling studies of the gut microbiome associate the pathogenesis of IBD, a chronic and relapsing inflammatory disorder of the gut, with the under-representation of certain species in the faecal microbiota [16–19]. For example, F. prausnitzii has been postulated as a biomarker or a potential therapeutic agent for IBD [20, 21]. Gastric cancer and colorectal cancer are also connected to alterations in gut microbiota. For example, some dietary factors may alter gut microbiota interactions and affect cancer development and response to cancer treatment [22]. In fact, the term ‘oncobiome’ has been recently adopted to refer to the emerging field of research devoted to the study of the interplay between the human microbiome and cancer development [23]. Moreover, it is well recognized that the excessive use of broad-spectrum antibiotics can affect the relative proportions of gut microbial populations and foster bacterial resistance [24]. Considering recent technological advancements and community initiatives towards large-scale compilation of data on the human microbiome, integrative data analysis may be the key to better understanding the mechanisms of action of the gut microbiome and their involvement in the development and chronicity of the above-mentioned diseases [25]. For example, the study of colon cancer has relied on the combination of microbiome and metabolome data [26], proteome and metagenome data supported the investigation of Crohn's disease (CD) [27], and metabolome, metagenome and metatranscriptome data provided a basis for investigating the relationship between the gut microbiome and the xenobiotic metabolism of digoxin [28]. Furthermore, systems-level approaches, namely metabolic modelling approaches [5, 29] and microbiome-based predictive tools [30, 31], are showing great potential in delivering non-obvious and biologically meaningful knowledge.
Previous reviews presented attempts at computational systems biology and in silico modelling of the human microbiome [32] and introduced computational methods for understanding the human gut microbiota and developing therapeutic strategies [33]. The focus of the present review is human gut microbiome research, and our aim is to provide up-to-date information on bioinformatics resources and tools specialized in or useful for the multifaceted investigation of this microbiome (Figure 1). This review is accompanied by a small website (available at http://sing-group.org/humangut) that tracks the public availability of the projects, resources and tools mentioned here, while welcoming further input from the community.
Figure 1. Unravelling the mechanism of action of the human microbiome: resources and tools for the study of the intestinal microbiota.
Attention is focused on two main application areas: (1) the characterization of gut microbiota composition and the functional interplay related to dysbiosis, for example in disease and under antibiotic therapy; and (2) the screening of the proteome of human gut species for products holding immunomodulatory, anti-inflammatory or other bioactivity of therapeutic interest. This work should therefore be of interest to those investigating the human gut microbiome and, in particular, to those who are developing in silico software to pursue and consolidate emerging paths in such research.
Data repositories
According to the journal Science, the discovery of the microbiome was one of the 10 milestones of the first decade of the 21st century (http://www.sciencemag.org/site/special/insights2010/).
In the late 2000s, two large-scale initiatives were launched with the aim of documenting the role of human-associated microbial communities in human health and disease, i.e. the NIH’s Human Microbiome Project (HMP) [34] and the European Metagenomics of the Human Intestinal Tract (MetaHIT) project [35]. Despite the enormous volume of data generated by these initiatives, general data usage is challenged by a number of design, technical and access decisions [36]. For example, many analyses still depend on a catalogue of reference genes. Existing catalogues for the human gut microbiome are based on samples from single cohorts or reference genomes (or protein sequences), which limits coverage of global microbiome diversity. Therefore, efforts have been invested in implementing integrated catalogues of reference genes [37] and developing approaches to conduct population-level analysis [38]. New resources such as the 1000 Genomes Project [39], the AmericanGut (http://americangut.org/) and the BritishGut (http://britishgut.org/) are supporting these analyses. Although it is hardly possible to enumerate all existing data resources that may be helpful for this area of research, it is important to have a comprehensive view of data availability, so that the usefulness of lesser-known repositories is uncovered and potential information gaps can be tackled. In this sense, Figure 2 presents relevant data repositories for human gut microbiome research, and Table 1 provides a list of the data sets available for the human gut (e.g. from single cohort studies). Additional details on the explored databases can be found in Supplementary Material S1 and in the Web pages supporting this review.
Table 1. Data sets available for the study of the human gut microbiome and its interplay with the host in health and disease scenarios
Data set | URL | Target
Manipulation of the gut microbiota reveals role in colon tumorigenesis [56] | http://www.ncbi.nlm.nih.gov/sra/?term=SRP056144 | Colon tumorigenesis
Disease-specific alterations in the enteric virome in inflammatory bowel disease [57] | http://metagenomics.anl.gov/mgmain.html?mgpage=project&project=mgp11446 | CD and ulcerative colitis (UC)
Integrated metagenomics/metaproteomics reveals human host-microbiota signatures of Crohn's disease [27] | http://compbio.ornl.gov/crohns_disease_metagenomics_metaproteomics/ | CD
Gut microbiome in Down syndrome [58] | http://metagenomics.anl.gov/mgmain.html?mgpage=project&project=mgp10557 | Down syndrome
Metabolome of human gut microbiome is predictive of host dysbiosis [59] | http://gigadb.org/dataset/100163 | Dysbiosis
Helicobacter pylori eradication causes perturbation of the human gut microbiome in young adults [60] | http://metagenomics.anl.gov/mgmain.html?mgpage=project&project=mgp8960 | Dysbiosis
Interactions between the intestinal microbiota and bile acids in gallstones patients [61] | http://metagenomics.anl.gov/mgmain.html?mgpage=project&project=mgp11209 | Gallstone patients
An integrated catalog of reference genes in the human gut microbiome [37] | http://gigadb.org/dataset/100064 | General
An iterative workflow for mining the human intestinal metaproteome [62] | ftp://ftp.ncbi.nih.gov/pub/TraceDB/human_gut_metagenome/ | General
Fecal microbial composition of ulcerative colitis and Crohn’s disease patients in remission and subsequent exacerbation [63] | http://metagenomics.anl.gov/mgmain.html?mgpage=project&project=mgp4728 | IBD, CD and UC
Inference of network dynamics and metabolic interactions in the gut microbiome [64] | https://bitbucket.org/gutmicrobiomepaper/microbiomenetworkmodelpaper/src | Model construction
Development of the preterm gut microbiome in twins at risk of necrotising enterocolitis and sepsis [65] | http://metagenomics.anl.gov/mgmain.html?mgpage=project&project=mgp3781 | Necrotising enterocolitis and sepsis
Patterned progression of bacterial populations in the premature infant gut [66] | https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000247.v4.p3 | Necrotizing enterocolitis
Dietary modulation of gut microbiota contributes to alleviation of both genetic and simple obesity in children [12] | https://www.ncbi.nlm.nih.gov/sra/?term=SRP045211 | Obesity
A core gut microbiome in obese and lean twins [67] | http://metagenomics.anl.gov/linkin.cgi?project=mgp10 | Obesity
Moving pictures of the human microbiome [68] | http://metagenomics.anl.gov/linkin.cgi?project=mgp93 | Obesity, CD, IBD and malnutrition
Temporal dynamics of the gut microbiota in people sharing a confined environment, a 520-day ground-based space simulation, MARS500 [69] | http://metagenomics.anl.gov/mgmain.html?mgpage=project&project=mgp79314 | Population study
Gut microbiome of the Hadza hunter-gatherers [70] | http://metagenomics.anl.gov/mgmain.html?mgpage=project&project=mgp7058 | Population study
A phylo-functional core of gut microbiota in healthy young Chinese cohorts across lifestyles, geography and ethnicities [71] | http://metagenomics.anl.gov/mgmain.html?mgpage=project&project=mgp1538 | Population study
Gut Microbiota and Extreme Longevity [72] | http://metagenomics.anl.gov/mgmain.html?mgpage=project&project=mgp17761 | Population study
Variation in rural African gut microbiota is strongly correlated with colonization by Entamoeba and subsistence [73] | http://metagenomics.anl.gov/mgmain.html?mgpage=project&project=mgp15238 | Population study
Gut microbiome of coexisting BaAka Pygmies and Bantu Reflects Gradients of Traditional Subsistence Patterns [74] | http://metagenomics.anl.gov/mgmain.html?mgpage=project&project=mgp16608 | Population study
Gut microbiota of type 1 diabetes patients with good glycaemic control and high physical fitness is similar to people without diabetes: an observational study [75] | http://metagenomics.anl.gov/mgmain.html?mgpage=project&project=mgp11616 | T1D
A metagenome-wide association study of gut microbiota in type 2 diabetes [35] | https://www.ncbi.nlm.nih.gov/sra/?term=SRA045646; https://www.ncbi.nlm.nih.gov/sra/?term=SRA050230 | T2D
Gut metagenome in European women with normal, impaired and diabetic glucose control [76] | https://www.ncbi.nlm.nih.gov/sra?term=ERP002469 | T2D
Modulation of gut microbiota dysbioses in type 2 diabetic patients by macrobiotic Ma-Pi 2 diet [77] | http://metagenomics.anl.gov/mgmain.html?mgpage=project&project=mgp17675 | T2D
Figure 2. Mindmap of bioinformatics databases and data repositories commonly used in gut-related research.
Functional profiling using reference information can be based either on reference genome read mapping (at the nucleotide level) or on translated protein database searches [30]. That is, the assignment may be based on full protein-coding genes (CDSs) by means of orthology relations with sequences in well-characterized functional databases, such as NCBI nr [40], KEGG Orthology [41] and COGs [42], or by identifying specific PFAM [43] or SMART [44] peptide domains within CDSs. Broader biological functions are then built on these low-level functional annotations [45] using hierarchical ontologies that group functionally related proteins, such as in KEGG [41], MetaCyc [46] and SEED [47]. Data processing and integration pipelines are also available, for instance, MG-RAST [48], IMG/M [49], MEGAN [50], HUMAnN [51], MALINA [52], MOCAT2 [53] and COGNIZER [54]. These pipelines typically include some combination of quality control and inference steps subsequent to homology search, such as selection of pathways by maximum parsimony, taxonomic limitation or statistical smoothing. However, as whole-community functional profiling is not yet well established, neither gene annotations within reference genomes nor those in protein databases are well tuned to whole-community metabolism. Indeed, both MetaCyc [46] and SEED [47] have ongoing efforts to develop microbiome-specific functional annotations, and gene family catalogues, such as eggNOG [55], are looking for a better way to represent uncultured communities.
Bioinformatics tools
A number of different bioinformatics tools are useful for the study of the human gut. In particular, considering the huge volume of data being generated by high-throughput technologies, tools are needed for the processing and analysis of individual omics data as well as for gaining a multi-level, integrated understanding of the role of the gut microbiome in different aspects of health and disease conditions.
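As a concrete illustration of the kind of processing such tools perform, the roll-up from low-level annotations to broader functions described for functional profiling can be sketched in a few lines. The example is a simplified illustration and not the procedure of any of the cited pipelines: the gene-family identifiers, pathway names and abundances are hypothetical, and a greedy set-cover heuristic stands in for maximum parsimony pathway selection.

```python
from typing import Dict, List, Set

# Hypothetical gene-family abundances (e.g. normalised read counts per family).
family_abundance: Dict[str, float] = {"K1": 10.0, "K2": 5.0, "K3": 2.0, "K4": 8.0}

# Hypothetical pathway definitions: pathway name -> member gene families.
pathways: Dict[str, Set[str]] = {
    "P_a": {"K1", "K2", "K3"},
    "P_b": {"K3", "K4"},
    "P_c": {"K2"},
}


def parsimonious_pathways(observed: Set[str], defs: Dict[str, Set[str]]) -> List[str]:
    """Greedy set cover: pick few pathways that explain all observed families."""
    # Only families that belong to at least one pathway can ever be explained.
    uncovered = {f for f in observed if any(f in members for members in defs.values())}
    chosen: List[str] = []
    while uncovered:
        # Pick the pathway explaining the most still-unexplained families;
        # iterating in sorted order breaks ties alphabetically.
        best = max(sorted(defs), key=lambda p: len(defs[p] & uncovered))
        chosen.append(best)
        uncovered -= defs[best]
    return chosen


def pathway_abundance(selected: List[str], defs: Dict[str, Set[str]],
                      abundance: Dict[str, float]) -> Dict[str, float]:
    """Summarise each selected pathway as the mean abundance of its families.

    Simplification: a family shared by several pathways counts fully in each.
    """
    result: Dict[str, float] = {}
    for p in selected:
        values = [abundance.get(f, 0.0) for f in sorted(defs[p])]
        result[p] = sum(values) / len(values)
    return result


selected = parsimonious_pathways(set(family_abundance), pathways)
profile = pathway_abundance(selected, pathways, family_abundance)
print(selected, profile)
```

In this toy input, P_c is not selected because its only gene family is already explained by P_a, which is the intuition behind parsimony-based pathway selection: the smallest set of pathways that accounts for the observed gene families is preferred.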
The level of complexity and specialization of these tools varies significantly. Figure 3 summarizes commonly used bioinformatics tools, which are described in the following sections. A more detailed description of these tools is found within the Supplementary Material S1 and in the Web pages supporting this review, and the objectives, pros and cons of the different OMICS are available on Supplementary Material S2. Figure 3. View largeDownload slide Mindmap of bioinformatics tools commonly used in gut-related research. Tool names ended in an asterisk are public but require login and those ended in a number sign are private. The rest of tools are publicly available. Figure 3. View largeDownload slide Mindmap of bioinformatics tools commonly used in gut-related research. Tool names ended in an asterisk are public but require login and those ended in a number sign are private. The rest of tools are publicly available. Metagenomics: composition, abundance and variation Metagenome is the collective genome of a given microbial community. Metagenome sequencing presents the first, perhaps the greatest, opportunity to identify novel and biologically interesting microbial products in the human gut microbiome [78, 79]. Thousands of human gut-associated metagenomes have already been sequenced, representing an extensive database for mining biologically active microbial products, and studying intestinal microbiome diversity and dysbiosis, as well as relations to health and disease [80]. General and detailed descriptions of metagenomics technologies and computational support can be found in recent reviews on the field [81–84]. Table 2 shows common and publicly available metagenomics tools. Table 2. 
Publicly available metagenomics tools
Tool  Purpose
AlFree [85]  Phylogeny reconstruction using alignment-free sequence comparison methods
AMOS [86]  Assembling DNA reads
AmphoraNet [87]  Phylogenetic analysis of metagenomic shotgun sequencing data and genomic data
BAGEL3 [88]  Mining for bacteriocins in single or multiple DNA sequences, e.g. (un)finished genomes, scaffold files and metagenomics data
BLAST [89]  Identification of regions of similarity between biological sequences
CAFE [90]  Integration of 28 measures and downstream visualized analysis
CAMERA [91]  Creating a rich, distinctive data repository and a bioinformatics tools resource
Cd-hit [92]  Clustering and comparison of protein or nucleotide sequences
Chimera Slayer [93]  Detection of sequences falsely interpreted as organisms (contributing to false perceptions of sample diversity and the false identification of novel taxa)
CloVR [94]  Automated sequence analysis able to use cloud computing resources
COGNIZER [54]  Functional annotation of metagenomic data sets
CONCOCT [95]  Unsupervised binning of metagenomic contigs
CVTree server [96]  Construction of whole-genome-based phylogenetic trees
DESeq2 [97]  Estimation of variance–mean dependence in count data from high-throughput sequencing assays and differential expression analysis
DESMAN [98]  Contig-based strain inference across multiple samples
DIAMOND [99]  High-throughput alignment of DNA reads and protein sequences
EMPANADA  Evidence-based assignment of genes to pathways in metagenomic data
FishTaco [100]  Decomposition of functional shifts into individual taxon-level contributions
FLASh [101]  Merging paired-end reads from next-generation sequencing experiments
FMAP [102]  Functional analysis of metagenomic/metatranscriptomic sequencing data, i.e. sequence alignment, gene family abundance calculations and differential feature statistical analysis
FragGeneScan [111]  Predicting protein-coding regions in short reads
Genboree Microbiome Toolset [103]  Multi-omic data analysis
Glimmer-MG [104]  Detecting genes in environmental shotgun DNA sequences
GroopM [105]  Using differential coverage to obtain high-fidelity population genomes from related metagenomes
HUMAnN [51]  Determination of relative abundances of the gut microbial functional pathways in a community from metagenomic data
IDBA-UD [106]  Iterative de Bruijn graph de novo assembler for short-read sequencing data
IMG/M [49]  Analysis and annotation of genome and metagenome datasets in a comprehensive comparative context
KOBAS [107]  Gene/protein functional annotation and functional enrichment of gene sets
MALINA [52]  Analysis of whole-genome gut-related metagenomic data
MaxBin [108]  Binning assembled metagenomic sequences based on an expectation–maximization algorithm
MEGAHIT [109]  Assembly of large and complex metagenomics sequencing reads
MEGAN [50]  Interactive exploration and analysis of large-scale microbiome sequencing data
MetaBAT [110]  Integrating empirical probabilistic distances of genome abundance and tetranucleotide frequency for metagenome binning
MetaGeneAnnotator [111]  Predicting prokaryotic genes from genomic sequences
MetaMIS [112]  Analysing time series data of microbial community profiles
MetAMOS [113]  An integrated assembly and analysis pipeline for metagenomic data
MetaPhlAn [114]  Estimation of species abundance
metaSPAdes [115]  Assembling reads from single cells and highly polymorphic diploid genomes
MetaVelvet [116]  De novo sequence assembler from short sequence reads
MG-RAST [117]  Metagenome analysis
MIRA [118]  Whole-genome shotgun (WGS) and EST sequence assembler
MOCAT2 [53]  Metagenomic sequence assembly and gene prediction
Mothur [119]  Analysing sequencing data
MUSiCC [120]  Normalizing and correcting gene abundance measurements derived from metagenomic shotgun sequencing
MyCC [121]  Combining genomic signatures, marker genes and optional contig coverages for automated metagenome binning
NBC Classifier [122]  Naïve Bayes classification Web server for taxonomic classification of metagenomic reads
Orphelia [123]  Predicting protein-coding genes in short DNA sequences from metagenomics sequencing projects
PAUDA [124]  High-performance algorithms to compute BLASTX-like alignments
PhyloSift [125]  Phylogenetic analysis of metagenomic samples and comparison of community structure among multiple related samples
PICRUSt [126]  Predicting metagenomes from 16S data and a reference genome database
PRIAM [127]  Automated enzyme detection in a fully sequenced genome
Prodigal [128]  Gene prediction for microbial genomes
QIIME [129]  Performing microbiome analysis from raw DNA sequencing data
RAPSearch [130]  Fast protein similarity search for short reads
RAST [47]  Fully automated service for annotating bacterial and archaeal genomes
RPS-BLAST  Searching in profile databases
WebCARMA [131]  Taxonomic and functional classification of unassembled (ultra-)short reads from metagenomic communities

A key preliminary step in metagenomic analysis is to characterize the taxonomic diversity of the metagenome, i.e. to categorize the various microbes and quantify their diversity in terms of species abundance. Here, it is important to distinguish between methodologies in which specific regions of the 16S ribosomal DNA are polymerase chain reaction (PCR)-amplified and sequenced (metataxonomics), and those in which the whole genetic material is isolated, fragmented and sequenced, i.e. shotgun metagenomics (metagenomics) [132]. Metataxonomics data can test whether there is a population split in complex communities. However, they rarely reveal the mechanisms underlying the population split because of inter-individual variability and/or coverage. On the other hand, metagenomics offers an effective but imperfect method to profile the structure and the potential functions encoded in microbial communities. Many gut metagenomics studies still perform 16S ribosomal RNA (rRNA) sequencing, typically with pipelines such as QIIME or MEGAN. However, whole-genome sequencing is becoming the technology of choice for sequence analysis and community comparison, so we consider it more appropriate to focus this section on the latter option.

The assembly of overlapping reads into continuous or semi-continuous genome fragments allows an in-depth view of different aspects within a genomic context. Numerous metagenome assemblers have been developed, most of which assemble sequences in de novo fashion, i.e. they do not rely on a closely related reference sequence. MIRA [118] and AMOS [86] are examples of reference-based assemblers, while IDBA-UD [106], MetaVelvet [116], MetAMOS [113], MEGAHIT [109] and metaSPAdes [115] are examples of de novo assemblers.
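As a rough illustration of the de novo strategy described above, the following Python sketch builds a toy de Bruijn graph from overlapping reads and spells out a contig along an unbranched path. It is a minimal sketch under strong simplifying assumptions (error-free reads, a single strand, no repeats); the function names `de_bruijn_graph` and `spell_linear_contig` and the example reads are hypothetical and do not correspond to the interface of any assembler cited here.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Map each (k-1)-mer prefix to the set of (k-1)-mer suffixes that follow it."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


def spell_linear_contig(graph):
    """Follow the single unbranched path; stop at a branch, dead end or cycle."""
    targets = {node for successors in graph.values() for node in successors}
    starts = [node for node in graph if node not in targets]
    if len(starts) != 1:
        raise ValueError("this toy walker expects exactly one start node")
    node = starts[0]
    contig = node
    visited = {node}
    while node in graph and len(graph[node]) == 1:
        (next_node,) = graph[node]
        if next_node in visited:
            break
        contig += next_node[-1]  # each step extends the contig by one base
        visited.add(next_node)
        node = next_node
    return contig


# Hypothetical reads that overlap by four bases.
reads = ["ATGGCG", "GGCGTA"]
graph = de_bruijn_graph(reads, k=4)
print(spell_linear_contig(graph))  # ATGGCGTA
```

Real metagenome assemblers must additionally handle sequencing errors, uneven coverage across taxa and repeats shared between genomes, which is where much of their complexity and memory demand arises.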
Furthermore, the need to assemble increasingly large sequencing data sets is motivating serious investment in improving computational performance. Assembler developers are now looking for more time- and memory-efficient ways to handle massive data volumes (hundreds of gigabase pairs) on a single server.

Binning approaches, i.e. the classification and/or clustering of reads into specific bins, can further help elucidate the broader genomic context of interesting features [133]. Some binning methods are taxonomy-dependent (supervised learning procedures), i.e. they estimate the profile/abundance of ‘known’ taxonomic groups from a reference database [134]. CAMERA [91], the MG-RAST server [48], the NBC classifier [122] and WebCARMA [131] are some well-known taxonomy-dependent Web applications. On the other hand, taxonomy-independent methods (unsupervised learning procedures) group reads based on their mutual similarity and do not involve a database comparison step [82]. CONCOCT [95], GroopM [105], MaxBin [108], MetaBAT [110] and MyCC [121] are some prominent examples.

The prediction and annotation of gene-coding sequences is the last, fundamental step of analysis. In terms of software commonly used in 16S rRNA gene analyses: the Genboree Microbiome Toolset supports community profiling (i.e. determination of the abundance of each type of microbe) [103]; QIIME [129] and Mothur [119] (also part of Genboree) can be used to obtain quantitative insights into microbial relative abundances and ecosystems; BLAST [89] and Cd-hit [92] facilitate the comparison of large sets of proteins; and Chimera Slayer is used to detect sequences falsely interpreted as organisms (aiming to correct false perceptions of sample diversity and false identification of novel taxa) [93]. Furthermore, the Ribosomal Database Project classifiers may help in the assignment of rRNA gene sequences to bacterial taxonomy [135].
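To make the taxonomy-independent binning idea above more concrete, the sketch below computes tetranucleotide frequency profiles, one of the composition signals listed for MetaBAT in Table 2, and compares contigs by Euclidean distance. This is a minimal illustration, not the algorithm of any cited tool: the function names, the toy contigs and the choice of Euclidean distance are assumptions, and real binners typically also use coverage information and merge reverse-complement k-mers.

```python
from itertools import product
from math import sqrt


def kmer_profile(sequence, k=4):
    """Return the relative frequency of every possible k-mer over A, C, G, T."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    sequence = sequence.upper()
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        if kmer in counts:  # skip windows containing N or other symbols
            counts[kmer] += 1
    total = sum(counts.values())
    return [counts[kmer] / total if total else 0.0 for kmer in kmers]


def euclidean(p, q):
    """Euclidean distance between two frequency profiles of equal length."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Hypothetical contigs: two AT-rich fragments and one GC-rich fragment.
contig_a = kmer_profile("ATATATATATAT")
contig_b = kmer_profile("ATATATATTATA")
contig_c = kmer_profile("GCGCGCGCGCGC")

# Contigs with similar composition lie closer together and would tend to share a bin.
print(euclidean(contig_a, contig_b) < euclidean(contig_a, contig_c))  # True
```

An unsupervised method would then cluster such profiles without consulting a reference database, which is what distinguishes it from the taxonomy-dependent tools listed above.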
Tools such as Glimmer-MG [104], FragGeneScan [136], MetaGeneAnnotator [111], PICRUSt [126], Orphelia [123] and the PROkaryotic DYnamic programming Gene-finding ALgorithm (Prodigal) [128] are good examples of how gene prediction approaches have adapted to the challenges posed by shotgun sequencing data. Nowadays, MG-RAST [48], IMG/M [49], MEGAN [50], HUMAnN [51], MALINA [52], MOCAT2 [53] and COGNIZER [54] are among the best-known automated computational pipelines based on comparative genomics, and are subject to multiple ongoing developments. MG-RAST provides an easy-to-use Web interface for metagenomics analysis, including alignment, but limits the size of uploaded data files. In turn, HUMAnN and MEGAN both lack an integrated alignment tool and are notably unable to perform comprehensive downstream processes, such as operon-level analysis [137]. Databases such as Pfam [43] and Clusters of Orthologous Groups (COGs) [42] enable comparison with sequence-diverse protein families or recurring sequence motifs, and the KEGG Orthology (KO) and KEGG pathways databases [41] are often used to predict the composition ratio of microbial gene families and pathways from the HMP [138, 139]. Tools such as RAPSearch [130] and PAUDA [124] offer faster alternatives to BLAST for the alignment of environmental sequencing reads. More recently, FishTaco, an analytical and computational framework, has enabled integrated taxonomic and functional comparative analyses. In particular, FishTaco is equipped to accurately quantify taxon-level contributions to disease-associated functional shifts, i.e. to trace back shifts in the microbiome’s functional capacity to specific taxa [100].

Besides comparative genomics, gut studies encompass structure-based approaches, functional prediction methods based on evolutionary conservation and phylogeny, and network context-based approaches (e.g. co-expression and metabolic networks) [139–141]. Approximately 50% of the genes in the gut microbiome could not be characterized using standard annotation methods [142]. Therefore, conventional methods for putative gene characterization and functional prediction, based on alignment to homologous genes with existing annotations (e.g. BLAST), proved ineffective for these genes [43]. Alternative computational methods approached the problem by integrating standard homology information with additional information, namely, sequence features, co-expression data, binding sites and subcellular localisation data [143–146]. The study of discrepancies between taxonomic and functional variations led to a proposal to revise some of the main metagenomic processing procedures to uncover hidden functional variation across samples [147]. This revision relies on the Metagenomic Universal Single-Copy Correction (MUSiCC) method [120] and the Evidence-based Metagenomic Pathway Assignment using geNe Abundance DAta (EMPANADA) schema.

Phylogenetic analysis is often supported by tools such as: CAFE, a stand-alone software tool that integrates 28 measures and downstream visualized analysis [90]; AlFree, a Web server that integrates 38 measures and supports the visualization of phylogeny [85]; the CVTree server, which implements a whole-genome-based and alignment-free composition vector method [96] and is also included in the CAFE tool; AmphoraNet [87], the Web server implementation of the AMPHORA2 method, which incorporates probability-based sequence alignment masks to improve phylogenetic accuracy; MetaPhlAn, which estimates the abundance of species in each sample according to the number of reads mapped to its markers [114]; and PhyloSift [125], which statistically tests lineages of interest directly from an uncultured DNA sample and allows for comparison of community structure among many samples.
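The marker-based abundance estimation mentioned above for MetaPhlAn can be sketched in a few lines. The example below converts per-marker read counts into relative species abundances; it is only an illustration of the general idea, not MetaPhlAn's actual procedure. In particular, normalizing by total marker length is an assumption made here, and the function name, species labels, marker identifiers and counts are all hypothetical.

```python
def relative_abundance(marker_hits, marker_lengths):
    """Estimate relative species abundance from reads mapped to clade markers.

    marker_hits:    {species: {marker_id: number of mapped reads}}
    marker_lengths: {marker_id: marker length in base pairs}
    """
    coverage = {}
    for species, hits in marker_hits.items():
        mapped_reads = sum(hits.values())
        marker_bp = sum(marker_lengths[marker] for marker in hits)
        # Assumption: reads per marker base pair approximates genome coverage.
        coverage[species] = mapped_reads / marker_bp if marker_bp else 0.0
    total = sum(coverage.values())
    return {s: (c / total if total else 0.0) for s, c in coverage.items()}


# Hypothetical input: two species with different numbers of markers.
hits = {
    "species_A": {"m1": 30, "m2": 30},
    "species_B": {"m3": 20},
}
lengths = {"m1": 1000, "m2": 1000, "m3": 1000}

for species, fraction in relative_abundance(hits, lengths).items():
    print(species, round(fraction, 2))
```

Dividing by marker length keeps a species with many or long markers from appearing more abundant simply because it offers more sequence for reads to map to.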
An immediate application of phylogenetic approaches is the study of how species within the same genome interaction groups decrease or increase in abundance during dietary interventions [12].

The generation of community-level metabolic networks of the microbiome is also an interesting avenue. For example, these networks can be used to explore gene-level and network-level topological differences associated with obesity and IBD [148]. By placing variations in gene abundance in the context of these networks, researchers can examine the genes associated with these host states; namely, they can inspect gene location and generate hypotheses about how the microbiome interacts with host metabolism. Additionally, network analysis can bring to light associations between topological variations and community species composition.

Genome mining approaches are increasingly valuable for identifying antimicrobial-producing microorganisms as well as for screening for and harnessing putative gene clusters. For example, genome mining using Rapid Annotation using Subsystem Technology (RAST) was applied to the comparative pathogenomic analysis of Nesterenkonia jeotgali [149]. Likewise, the bacteriocin genome mining tool BAGEL3 [88] helped in the identification of potential bacteriocin producers in the genomes of the gut microbiome subset of the HMP's reference genome database [150].

Arguably, metagenomics should be the basis of most (if not all) microbiome studies. Despite the huge technological development in this field, methods are often limited in resolution and may fail to resolve relevant details concerning the composition of species and genes in the microbiome.
Accumulating evidence shows that important functions of the gut microbiota may be species or even strain-specific; yet, many studies in metagenomics are still conducted at genus or higher taxonomic levels because of limited ability to assemble individual bacterial genomes directly from metagenomic data [151]. Metatranscriptomics: gene expression profiling Metatranscriptomics encompasses the functional characterization of microbiomes based on mRNA sequencing data to gain a better understanding about the taxonomic composition and active biochemical functions of microbiomes. Metatranscriptomics captures gene expression patterns from microbial communities without previous assumptions as to the ongoing activities or dominant taxa, and provides a catalogue of those genes being transcribed under given experimental conditions. Here, bioinformatics analysis methods can be broadly classified into those based on reference-dependent methods and those that are reference-independent. Reference-dependent methods are based on sequence alignment onto functionally well-characterized databases or datas ets, whereas reference-independent methods resort to de novo assemblies. Table 3 presents available metatranscriptomics tools. Table 3. Publicly available metatranscriptomics tools Tool  Purpose  BLAST [89]  Identification of regions of similarity between biological sequences  COMAN [152]  Functional analysis of metatranscriptomics data  DESeq2 [97]  Estimation of variance–mean dependence in count data from high-throughput sequencing assays and differential expression analysis  DIAMOND [99]  High-throughput alignment of DNA reads and protein sequences  FLASh [101]  Merging paired-end reads from next-generation sequencing experiments  FMAP [102]  Functional analysis of metagenomic/metatranscriptomic sequencing data, i.e. 
Table 3. Publicly available metatranscriptomics tools

| Tool | Purpose |
| --- | --- |
| BLAST [89] | Identification of regions of similarity between biological sequences |
| COMAN [152] | Functional analysis of metatranscriptomics data |
| DESeq2 [97] | Estimation of variance–mean dependence in count data from high-throughput sequencing assays and differential expression analysis |
| DIAMOND [99] | High-throughput alignment of DNA reads and protein sequences |
| FLASh [101] | Merging paired-end reads from next-generation sequencing experiments |
| FMAP [102] | Functional analysis of metagenomic/metatranscriptomic sequencing data, i.e. sequence alignment, gene family abundance calculations and differential feature statistical analysis |
| Genboree Microbiome Toolset [103] | Multi-omic data analysis |
| IMP [153] | Large-scale standardized integrated analysis of coupled metagenomic and metatranscriptomic data |
| KOBAS [107] | Gene/protein functional annotation and functional enrichment of gene sets |
| NCBI’s Best Match Tagger [154] | Filtering human reads from metagenomics data sets |
| PRIAM [127] | Automated enzyme detection in a fully sequenced genome |
| RPS-BLAST | Search in profile databases |
| SAMSA [155] | Comprehensive metatranscriptome analysis |
| USEARCH [156] | Sequence analysis, including search and clustering algorithms |

Most metatranscriptomics analyses involve reference-based or metagenomics-dependent analysis workflows. For example, the Functional Mapping and Analysis Pipeline (FMAP) aims to identify differentially abundant features in microbiome data sets [102]. FMAP supports data preprocessing and performs sequence alignment, gene family abundance calculations and differential statistical analysis. To this end, the pipeline integrates various tools: NCBI’s Best Match Tagger [154] for data preprocessing; USEARCH [156] and DIAMOND [99] for the alignment of reads to a reference database, namely a KEGG-filtered UniProt data collection; and statistical testing methods, such as metagenomeSeq and the Kruskal–Wallis rank-sum test, for the analysis of differentially abundant genes and the enrichment analysis of pathways and operons.

The comprehensive metatranscriptomics analysis (COMAN) tool is an integrated Web server dedicated to the functional analysis of metatranscriptomic data [152]. After an initial quality control step, reads are mapped to the RefSeq database using DIAMOND [99]. Functional annotation of genes and reads is performed with COG [42] and KO [41].
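The Kruskal–Wallis rank-sum test named above for FMAP's differential analysis can be illustrated in a few lines. The following is a minimal Python sketch, not FMAP's implementation: the function names and the abundance values are hypothetical, and only the H statistic (with the usual tie correction) is computed. The p-value would then be read from a chi-squared distribution with k − 1 degrees of freedom for k groups.

```python
from collections import Counter


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of 0-based positions i..j, made 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (tie-corrected) for a list of sample groups."""
    pooled = [x for group in groups for x in group]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    # Tie correction: 1 - sum(t^3 - t) / (n^3 - n) over runs of tied values.
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - ties / (n ** 3 - n)
    # If every value is identical the statistic is undefined; report 0.0 here.
    return h / correction if correction > 0 else 0.0


# Hypothetical relative abundances of one gene family in three sample groups.
abundances = [[0.8, 1.1, 0.9, 1.0], [1.9, 2.4, 2.1, 2.2], [1.2, 1.3, 1.6, 1.4]]
print(round(kruskal_wallis_h(abundances), 3))
```

In a real analysis the test would be repeated for every gene family and followed by multiple-testing correction; the sketch assumes non-empty groups and at least two observations overall.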
In COMAN, COG-based annotation is conducted using RPS-BLAST against the CDD database [157], whereas DIAMOND [99] and KOBAS [107] support KO annotation. In addition, PRIAM [127] supports the annotation of genes to enzymes (Enzyme Commission numbers, ECs) and enables further profiling against MetaCyc pathways [46]. Notably, the Simple Annotation of Metatranscriptomes by Sequence Analysis (SAMSA) pipeline has been specifically designed for the analysis of gut microbiome data [155]. FLASh is used in the preprocessing step to merge paired-end reads [101]. The annotation step resorts to MG-RAST tools to generate annotations for the best matches to organisms and individual transcripts (RefSeq database), and the SEED database is used to obtain additional ontology annotations. The DESeq2 package in R supports the comparison of metatranscriptome samples and the identification of significantly differentially expressed transcripts [97].

A major drawback of reference-based methods is the large number of sequencing reads from uncultured species and divergent strains that are discarded during data analysis, i.e. the loss of potentially useful information. For example, in a recent study of 252 human faecal samples, 43% of the reads could not be mapped to available isolate genomes [158]. To address this gap, reference-independent methods aim to recover the actual genomes and potentially novel genes present in the samples, maximizing the amount of data exploited for analysis. To this end, reference-independent metatranscriptomics approaches use dedicated assembly methods, namely metatranscriptome assemblers, metagenomics assemblers or single-species transcriptome assemblers [159]. Moreover, these approaches aim to leverage the advantages of integrating metatranscriptomics and metagenomics data for the large-scale analysis of microbial community structure and function.
For example, the open-source Integrated Meta-omic Pipeline (IMP) is a self-contained and standardized de novo assembly-based pipeline, which allows automated and large-scale integrated analyses of combined metagenomics and metatranscriptomics data sets [153]. Notably, IMP incorporates iterative co-assembly of metagenomic and metatranscriptomic data, analyses of microbial community structure and function, and genomic signature-based visualization. Despite all these advances, metatranscriptomics approaches continue to struggle with the quality of mRNA recovered from microbial samples.

Metaproteomics studies: spectral search and protein profile

Metaproteomics studies aim to perform the large-scale characterization of proteins extracted from the human gut microbiota [160]. Metaproteomics allows for the characterization of the dynamic proteome in complex communities, revealing its impact on microbial metabolism and providing information about which taxonomic groups perform different metabolic roles. Compared with metagenomics and metatranscriptomics, the added value of metaproteomics lies in providing functional detail, i.e. metaproteomics supports the identification of proteins, their assignment to specific taxa and the description of how these proteins interact with the human host. The publicly available metaproteomics tools are listed in Table 4.

Table 4. Publicly available metaproteomics tools

| Tool | Purpose |
| --- | --- |
| AACompIdent [161] | Identifying a protein from its amino acid composition |
| AACompSim [161] | Comparing the amino acid composition between UniProtKB/Swiss-Prot entries |
| Blazmass+ComPIL [162] | A comprehensive and scalable database search system for metaproteomics |
| ClustalO [163] | Aligning two or more protein sequences |
| COILS [164] | Comparing a sequence to a database of known parallel two-stranded coiled coils and deriving a similarity score |
| Compute pI/Mw [161] | Computing the theoretical isoelectric point and molecular weight for a list of sequences |
| FindPept [165] | Identifying peptides that result from unspecific cleavage of proteins from their experimental masses |
| Galaxy-P [166] | Integrative analysis of MS-based proteomics and genomic and transcriptomic data |
| HAMAP [167] | Classifying and annotating protein sequences |
| ISMARA [168] | Modelling genome-wide expression or ChIP-seq data in terms of computationally predicted regulatory sites for transcription factors and microRNAs |
| Mascot [169] | Protein identification using MS data |
| MyriMatch [170] | Comparison of mass spectra from shotgun proteomics against a reference database |
| MZJava [171] | Analysis of MS data from large-scale proteomics and glycomics experiments |
| OMSSA [172] | MS/MS search algorithm |
| PDBePisa [173] | Exploring macromolecular interfaces |
| pICarver [174] | Visualizing theoretical distributions of peptide pI on a given pH range |
| PredictProtein [175] | Predicting protein structural and functional features |
| QMEAN [176] | Estimating the quality of protein structure models |
| QuickMod [177] | Identifying modified peptides |
| Scaffold [178] | Protein identification and analysis |
| ScanProsite [179] | Scanning proteins for matches against the PROSITE collection of motifs or user-defined patterns |
| SEQUEST [180] | Correlating uninterpreted tandem mass spectra of peptides with amino acid sequences from protein and nucleotide databases |
| T-coffee [181] | Computing, evaluating and manipulating multiple alignments of DNA, RNA, protein sequences and structures |
| Unique Peptide Finder [182] | Characterization of taxon-specific peptidomes and peptidome-based clustering |
| X! Tandem [183] | Protein identification by matching tandem mass spectra against peptide sequences |

ChIP-seq: Chromatin Immunoprecipitation Sequencing.

The ExPASy Web portal has a worldwide reputation as one of the main bioinformatics resources for proteomics [184]. ExPASy databases include Swiss-Prot [185], STRING [186], SWISS-MODEL [187], PROSITE [188], ViralZone [189] and neXtProt [190].
Analysis tools are available for specific tasks, such as protein sequence and identification [191] (tools such as AACompIdent [161] or FindPept [165]), proteomics experiments [192] (tools such as MZJava [171] or pICarver [174]), function analysis [193] (tools such as AACompSim [161] or Compute pI/Mw [161]), sequence sites, features and motifs [194] (tools such as HAMAP [167] or ScanProsite [179]), protein modification [195] (tools such as ISMARA [168] or QuickMod [177]), protein structure [196] (tools such as COILS [164] or QMEAN [176]), protein interactions [197] (tools such as PDBePisa [173] or PredictProtein [175]) and similarity search/alignment [198] (tools such as ClustalO [163] or T-coffee [181]).

Analysis of mass spectra (MS), i.e. the decoding of peptide sequences, is typically facilitated by database searching algorithms, namely SEQUEST [180], Mascot [169], MyriMatch [170], OMSSA [172] and X! Tandem [183]. The development of cross-species protein identification approaches is desirable, but challenging, given the complexity of the gut microbial proteome and the dynamic distribution of species between individuals [199, 200]. New approaches aim to increase the sensitivity of peptide spectrum matching. The combination of the ComPIL metaproteomic analysis method and the Blazmass search engine allows larger-scale database searches, including peptide masses, protein information and peptide sequences [162]. Other possible approaches are the integration of synthetic metaproteome information with metagenomic information [62], and de novo sequencing [201, 202]. The Galaxy bioinformatics framework offers a sophisticated proteogenomic workflow, named Galaxy for Proteomics or Galaxy-P (usegalaxyp.org), in support of broad metaproteomics data analysis [203]. This is a complex workflow, which includes ∼140 steps, and can be shared using built-in Galaxy functions [166].
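The first filtering step of the database searching described above can be sketched as follows: the theoretical mass of each candidate peptide is compared with an observed precursor mass within a tolerance. This is a simplified illustration in Python, not the method of SEQUEST, Mascot or any other engine named here. Real engines go on to score fragment (MS/MS) spectra. The function names, the 10 ppm tolerance and the tiny candidate list are assumptions made for the example; the residue masses are standard monoisotopic values.

```python
# Monoisotopic residue masses in daltons (standard values, rounded to 5 decimals).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.01056  # one water is added for the free N- and C-termini


def peptide_mass(sequence):
    """Monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS


def match_precursor(observed_mass, candidates, tol_ppm=10.0):
    """Return (peptide, theoretical mass, ppm error) for candidates within tolerance."""
    hits = []
    for peptide in candidates:
        theoretical = peptide_mass(peptide)
        ppm_error = (observed_mass - theoretical) / theoretical * 1e6
        if abs(ppm_error) <= tol_ppm:
            hits.append((peptide, theoretical, ppm_error))
    # Smallest absolute error first; ties keep the input order.
    return sorted(hits, key=lambda hit: abs(hit[2]))


# Hypothetical search space; real engines digest whole protein databases.
database = ["PEPTIDE", "PEPTLDE", "PEPTIDK", "GASPVR"]
for peptide, mass, ppm in match_precursor(peptide_mass("PEPTIDE"), database):
    print(peptide, round(mass, 4), round(ppm, 2))
```

Both PEPTIDE and PEPTLDE pass the filter because isoleucine and leucine have the same mass, which is one reason why precursor mass alone cannot identify a peptide, and why the search space grows so quickly for a multi-species gut proteome.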
The MetaPro-IQ workflow, which has been specifically developed for gut metaproteome identification and quantification, offers an alternative: it uses almost complete human or mouse gut microbial gene catalogues as the reference database, together with an iterative database search strategy [204]. Unipept offers programmatic access to metaproteomics analysis features and has the advantage of being supported by a fast index built from UniProtKB and NCBI Taxonomy [205]. It facilitates interactive data visualization, and the Unique Peptide Finder enables the discovery of tryptic peptides that are taxon-specific, i.e. peptides that can be used as biomarkers to reliably detect the presence of the targeted taxa [182]. Scaffold is designed to identify and analyse proteins in biological samples [178]. By using output files from MS/MS search engines, Scaffold validates, organizes and interprets MS data, allowing the user to more easily manage large amounts of data, compare samples and search for protein modifications. Moreover, it attempts to increase the confidence in protein identification reports through the use of several statistical methods.

In terms of applications, the study of CD is a meaningful example of the added value of integrating metagenomics and metaproteomics analyses [27]. Such analysis led to a better understanding of the CD phenotype (i.e. the genes, proteins and pathways that primarily differentiated patients from healthy subjects) and enabled the association of the phenotype with alterations in bacterial carbohydrate metabolism, bacterial–host interactions and human host-secreted enzymes. The investigation of colonic metaproteomic bacterial signatures in obesity represents another application [206]. The goal was to detect differences between obese and non-obese samples at a functional level. The combination of metaproteomics and phylogenetic data exposed significant metabolic activity of the phylum Bacteroidetes in obese subjects.
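Returning to the taxon-specific tryptic peptides described above for the Unique Peptide Finder, the underlying idea can be illustrated with a toy example. The sketch below is written in Python under stated assumptions: it does not use the Unipept API, the taxon names and sequences are invented, and digestion follows the common simplified trypsin rule (cleave after K or R unless the next residue is P, with no missed cleavages).

```python
def tryptic_digest(protein, min_len=5):
    """Cleave after K or R unless followed by P; keep peptides of min_len or more."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        at_end = i == len(protein) - 1
        if at_end or (aa in "KR" and protein[i + 1] != "P"):
            peptide = protein[start:i + 1]
            if len(peptide) >= min_len:
                peptides.append(peptide)
            start = i + 1
    return peptides


def taxon_specific_peptides(proteomes, min_len=5):
    """Map each taxon to the tryptic peptides found in that taxon only.

    `proteomes` maps a taxon name to a list of protein sequences.
    """
    owners = {}
    for taxon, proteins in proteomes.items():
        for protein in proteins:
            for peptide in tryptic_digest(protein, min_len):
                owners.setdefault(peptide, set()).add(taxon)
    unique = {taxon: set() for taxon in proteomes}
    for peptide, taxa in owners.items():
        if len(taxa) == 1:
            unique[next(iter(taxa))].add(peptide)
    return unique


# Invented sequences: the first peptide is shared, the second differs per taxon.
toy_proteomes = {
    "taxon_A": ["AAAAAKCCCCCR"],
    "taxon_B": ["AAAAAKDDDDDR"],
}
print(taxon_specific_peptides(toy_proteomes))
```

A production index would work at the scale of UniProtKB and would also need to decide how to treat isobaric residues and missed cleavages, neither of which is handled here.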
Faecal metaproteomics analysis has also been applied in a probiotic intervention trial to identify individually different human intestinal proteomes (i.e. personalized host–microbiota interactions) and examine the activity of the main phyla as well as key species, namely F. prausnitzii [207]. Finally, in the context of type 1 diabetes mellitus (T1DM), a large-scale analysis of intra- and inter-individual variation using metagenomics, metatranscriptomics and metaproteomics inputs showed that community structures are reflected across all ‘-omics’ levels. In particular, differences in the relative abundances of certain human pancreatic enzymes were correlated with the expression of microbial genes involved in T1DM-relevant metabolic transformations, such as thiamine synthesis and glycolysis [208].

Metabolomics studies: metabolite identification and concentration

Gut metabolome studies aim to identify and quantify the set of metabolites (or specific metabolites) in biological samples and, therefore, to look into differences in signature metabolites and their relation to changes in the activity of metabolic pathways [209–211]. Metabolomics allows for the characterization of the dynamic metabolome in complex communities, revealing its impact on microbial metabolism. Besides being the most immediate indicator of dysbiosis [59, 212], metabolome profiling is able to show dependences on environmental factors (e.g. diet [213, 214] and antibiotic exposure [215, 216]) as well as provide valuable information about the interactions of the microbial community with the host environment (e.g. quorum sensing [217]). Metabolite profiling is typically carried out using a combination of chromatographic techniques (e.g. liquid chromatography or gas chromatography) and detection methods, such as mass spectrometry (MS) and nuclear magnetic resonance [218, 219].
Computationally, data processing and analysis can be challenging because of the huge number of different metabolites potentially detected in this kind of sample. Moreover, a combination of statistical and machine learning methods is usually applied to identify discriminative features [220–222]. For example, classical statistical tests (e.g. Student’s t-test, multivariate linear regression and Mann–Whitney test) are combined with multivariate analysis such as principal component analysis, hierarchical cluster analysis, discriminant analysis and classification models (e.g. k-nearest neighbour). Currently available metabolomics computational tools are listed in Table 5.

Table 5. Publicly available metabolomics tools

| Tool | Purpose |
| --- | --- |
| BNICE | Discovery of novel biochemical pathways |
| MassTRIX [223] | Annotation of metabolites in high-precision MS data |
| MIDAS [224] | Database search algorithm for metabolite identification |
| MetFrag [225] | In silico fragmentation for computer-assisted identification of metabolite mass spectra |
| MimoSA [226] | A pipeline for joint metabolic model-based analysis of metabolomics measurements and taxonomic composition from microbial communities |
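As an illustration of the classification models mentioned above, a k-nearest neighbour classifier over metabolite profiles can be sketched in a few lines of Python. This is a minimal example under stated assumptions: the two-metabolite profiles, the "control" and "case" labels and the function name are all invented, and no feature scaling or cross-validation is shown although both would normally be needed.

```python
import math
from collections import Counter


def knn_predict(training, query, k=3):
    """Predict a label by majority vote among the k nearest training profiles.

    `training` is a list of (profile, label) pairs; profiles are equal-length
    sequences of metabolite intensities.
    """
    nearest = sorted(training, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    # On a tied vote, the label seen first (i.e. the closer neighbour) wins.
    return votes.most_common(1)[0][0]


# Invented intensities (arbitrary units) for two metabolites per sample.
training_set = [
    ((1.0, 1.0), "control"), ((1.2, 0.9), "control"), ((0.9, 1.1), "control"),
    ((3.0, 3.2), "case"), ((3.1, 2.9), "case"), ((2.9, 3.0), "case"),
]
print(knn_predict(training_set, (1.1, 1.0)))
print(knn_predict(training_set, (3.0, 3.0)))
```

Because the vote depends on Euclidean distance, metabolites measured on larger scales would dominate unless the profiles are normalized first.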
Identifying and quantifying the metabolites in a sample requires comparison against spectral databases, namely the Human Metabolome Database [227], BioMagResBank [228], Madison-Qingdao Metabolomics Consortium Database [229], MassBank [230], Golm Metabolome Database [231], METLIN [232] and ChemSpider [233]. Alternatively, the in silico fragmenter MetFrag [225] combines compound database searching (via ChemSpider and PubChem [234] Web services) and fragmentation prediction, and the Metabolite Identification via Database Searching (MIDAS) approach [224] matches measured tandem mass spectra against the predicted fragments of metabolites in the MetaCyc database. Untargeted metabolomics approaches are being developed as a means to minimize the challenges in matching metabolites to their spectral features [222]. For example, the Metabolic In silico Network Expansions (MINEs) databases record molecules that have not been observed yet, but are likely to occur based on known metabolites and common biochemical reactions [235].
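The matching of measured spectra against predicted fragments, described above for MetFrag and MIDAS, can be illustrated with a deliberately simple score: the fraction of measured peaks explained by a candidate's predicted fragments. The Python sketch below is an assumption-laden toy, not the scoring function of either tool. The m/z values, candidate names, tolerance and function names are all hypothetical.

```python
def matched_fraction(measured, predicted, tol=0.01):
    """Fraction of measured m/z peaks lying within tol of a predicted fragment."""
    if not measured:
        raise ValueError("measured spectrum is empty")
    matched = sum(
        1 for mz in measured if any(abs(mz - frag) <= tol for frag in predicted)
    )
    return matched / len(measured)


def rank_candidates(measured, candidates, tol=0.01):
    """Sort candidate names by the fraction of the spectrum they explain."""
    scores = {
        name: matched_fraction(measured, fragments, tol)
        for name, fragments in candidates.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented spectrum and predicted fragment lists for two candidate compounds.
measured_peaks = [85.03, 127.04, 145.05]
candidate_fragments = {
    "candidate_A": [85.03, 127.04, 145.05, 163.06],
    "candidate_B": [85.03, 99.04],
}
for name, score in rank_candidates(measured_peaks, candidate_fragments):
    print(name, round(score, 2))
```

Practical scoring schemes also weight peaks by intensity and account for how plausible each predicted fragmentation is, which this count ignores.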
MINE predictions are based on the Biochemical Network Integrated Computational Explorer (BNICE) algorithm and expert-curated reaction rules derived from the Enzyme Commission classification system. Details on a broader range of Web-accessible databases covering the properties, enzymatic reactions and metabolism of small molecules, together with their search options, have recently been reported [236, 237].

Within the scope of human gut research, IBD is one of the main focuses of metabolomics studies. The most discriminative metabolites for IBD, mainly derived from nuclear magnetic resonance spectroscopy studies, were alanine, isoleucine, leucine, lysine, valine, phenylalanine and butyrate [209, 238–240]. Also, MS studies have shown that long-chain fatty acids could play an important role in the disease. Researchers are now exploring certain metabolic patterns, discussing whether they are a cause of IBD or rather a consequence of inflammation or altered gut microbiota. For example, an increase of amino acids in faecal samples of IBD patients is explained by the low capacity of the inflamed intestinal tissue to absorb nutrients [241]. Obesity and T2D have also been investigated through studies of co-metabolites [14].

Fluxomics studies: high-throughput analysis of metabolic fluxes

Fluxomics refers to the group of techniques focused on the high-throughput analysis of metabolic fluxes, and is a clear complement to transcriptomics, proteomics and metabolomics. By integrating in vivo metabolic data with stoichiometric network models, absolute fluxes in the central metabolism of a biological system can be determined. Applications can be broadly divided into two approaches: constraint-based methods, which examine the relative contributions of different pathways to a given phenotype, and fluxomics based on the incorporation and monitoring of 13C-labelled compounds [242].
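For orientation, the constraint-based family mentioned above is usually stated as an optimization problem over a stoichiometric model. Its best-known member, flux balance analysis, is commonly written as the linear programme below. The notation is the standard textbook one and is introduced here for illustration; none of the symbols appear in the source text.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& v^{\mathrm{lb}}_{j} \le v_{j} \le v^{\mathrm{ub}}_{j}, \qquad j = 1, \dots, n
\end{aligned}
```

Here S is the m × n stoichiometric matrix (m metabolites, n reactions), v is the vector of reaction fluxes, c selects the objective (for instance, a biomass reaction), and the bounds encode reaction reversibility and uptake limits. The equality S v = 0 expresses the steady-state assumption that internal metabolites do not accumulate.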
Different algorithms, desktop or Web applications and resources have been published in recent years to facilitate the work of fluxomics researchers [243]. Table 6 presents the publicly available fluxomics tools.

Table 6. Publicly available fluxomics tools

| Tool | Purpose |
| --- | --- |
| B-DMFA [244] | A fast heuristic algorithm developed for knot placement |
| COBRA Toolbox [245] | Quantitative prediction of cellular behaviour |
| COMETS [246] | Performing computer simulations of metabolism in spatially structured microbial communities |
| CycSim [247] | Simulating with constraint-based models of metabolism |
| Fastcore [248] | Reconstruction of context-specific metabolic network models from global genome-wide metabolic network models |
| Fast-SL [249] | Identification of synthetic lethal gene/reaction sets in genome-scale metabolic models |
| Fast-SNP [250] | Efficient formulation of loop-law constraints for loopless flux optimization |
| FBA-SimVis [251] | Constraint-based analysis of metabolic models |
| FluxModeCalculator [252] | Flux mode analysis in stoichiometric models |
| GEMSiRV [253] | Performing metabolic network drafting and editing, network visualization and flux balance analysis |
| GlobalFit [254] | Finding globally optimal networks |
| Influx [255] | Optimized flux estimation |
| iReMet-flux [256] | Flux prediction |
| jQMM [257] | Flux calculation for genome-scale models |
| ll-ACHRB [258] | Sampling the feasible solution space of metabolic networks |
| MFF [259] | Flux distribution and impact prediction, selection of key network reactions and prioritization of measurements |
| Mflux [260] | Prediction of the bacterial central metabolism via machine learning |
| MicrobesFlux [261] | Generation and reconstruction of metabolic models for annotated microorganisms |
| ModelSEED [262] | Reconstruction, exploration, comparison and analysis of metabolic models |
| OptFlux [263] | Flux balance analysis, allowing user-manipulation of the nodes composing a metabolic network and the overlay of phenotype results |
| ROOM [264] | Constraint-based prediction of metabolic steady state |
| Sumoflux [265] | A toolbox for targeted 13C metabolic flux ratio analysis |
| SurreyFBA [266] | Providing constraint-based simulations and network map visualization |
| Sybil [267] | Constraint-based analyses of metabolic networks |
| Sysmetab [268] | Metabolic flux analysis |
| VisualCNA [269] | Constraint network analysis and molecular graphics representations |

Constraint-based approaches include applications of varying specificity built around flux balance analysis (FBA). FBA has traditionally been used in the characterization of cellular metabolism and in metabolic engineering [270]. Many algorithms have been developed for the high-throughput characterization of metabolic fluxes. Regulatory On/Off Minimization (ROOM) works on metabolic steady states and focuses on the changes induced by gene knockouts, mostly providing rerouting options in response to the absence of an enzymatic step [264]. Fastcore reconstructs metabolic sub-networks extracted from wider models. Starting from a set of reactions empirically known to be active (the core), the algorithm returns a metabolic network containing all the core reactions and the minimum number of additional reactions needed to keep the network flux-consistent [248]. Fast-SL identifies, in the context of a genome-scale metabolic model, sets of synthetic lethal reactions, which is useful for the combinatorial discovery of drug targets [249]. ll-ACHRB (Artificially Centered Hit-and-Run on a Box) is a scalable algorithm for sampling the feasible flux space of metabolic networks [258]. Fast-SNP improves computational efficiency by reducing the original network to a smaller matrix. Overall, this algorithm is efficient for the formulation of loop-law constraints, allowing loopless flux optimization [250]. B-spline fitting Dynamic Metabolic Flux Analysis (B-DMFA) is a heuristic algorithm focused on knot placement, a time-consuming task in dynamic metabolic flux analysis, which it addresses by exploiting the local support property of B-splines [244]. Finally, Influx_s is a deterministic algorithm with improved accuracy for flux estimation.
Influx_s uses few computational resources; indeed, estimating the central carbon metabolism network of Escherichia coli takes from several seconds to a few minutes on a standard personal computer (PC) architecture [255].

Of the hundreds of applications available in the literature, many have been programmed in the Matlab environment. Perhaps the most widely used is the Constraint-Based Reconstruction and Analysis (COBRA) Toolbox. This software package has been used for the quantitative prediction of cellular metabolism through the computation of optimal growth (steady-state or dynamic), and allows the modelling of gene deletions [245]. The COBRA Toolbox also supports methodologies such as regulatory on/off minimization and flux variability analysis [271]. It has also been implemented as a Python package (COBRApy) and in Julia, where associated packages such as distributedFBA.jl can be used to solve multiple flux balance analyses on concise pathways or on the whole central metabolism. The Julia implementation provides scalability and integration with the high-level interface MathProgBase.jl, making efficient use of computational resources [272]. Mackinac has been designed to combine the COBRA metabolic analysis capabilities with ModelSEED in order to infer the metabolic potential of a biological system and to optimize genome-scale metabolic models [273]. ModelSEED is a Web-based resource for the analysis of genome-scale metabolic models [262], and Cytoscape plugins such as CytoSEED allow visualization and manipulation of the created models [270]. ORCA is another COBRA-based package, which extends the scope of COBRA metabolic models [274]. Another Matlab-based desktop application is Coordinate Hit-and-Run with Rounding (CHRR), which allows genome-scale sampling in biochemical networks [275].
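Sampling methods such as CHRR draw many feasible flux vectors instead of a single optimum. As a rough illustration only, the Python sketch below applies plain coordinate hit-and-run to a hypothetical two-branch toy network. It omits the rounding preprocessing that CHRR relies on, and every name in it (the function names, the toy bounds) is an assumption made for this example and not part of any published tool.

```python
import random


def feasible_interval(point, axis, constraints, lower, upper):
    """Range of values coordinate `axis` may take while the others stay fixed."""
    lo, hi = lower[axis], upper[axis]
    for coeffs, bound in constraints:  # each constraint: sum(coeffs * x) <= bound
        a = coeffs[axis]
        rest = sum(c * x for i, (c, x) in enumerate(zip(coeffs, point)) if i != axis)
        if a > 0:
            hi = min(hi, (bound - rest) / a)
        elif a < 0:
            lo = max(lo, (bound - rest) / a)
    return lo, hi


def coordinate_hit_and_run(start, constraints, lower, upper, n_samples, seed=0):
    """Draw points from a bounded convex polytope, one coordinate move at a time."""
    rng = random.Random(seed)
    point = list(start)  # the start point must already be feasible
    samples = []
    for _ in range(n_samples):
        axis = rng.randrange(len(point))
        lo, hi = feasible_interval(point, axis, constraints, lower, upper)
        point[axis] = rng.uniform(lo, hi)
        samples.append(tuple(point))
    return samples


# Hypothetical toy network: one uptake flux feeds two branches, so at steady
# state uptake = v_b + v_c. With uptake capped at 10, the free fluxes obey
# 0 <= v_b, v_c <= 10 and v_b + v_c <= 10.
constraints = [((1.0, 1.0), 10.0)]
samples = coordinate_hit_and_run(
    start=(1.0, 1.0),
    constraints=constraints,
    lower=(0.0, 0.0),
    upper=(10.0, 10.0),
    n_samples=1000,
)
mean_v_b = sum(s[0] for s in samples) / len(samples)
print(f"mean flux through branch b: {mean_v_b:.2f}")
```

In a genome-scale setting the steady-state equalities would first be removed by working in the null space of the stoichiometric matrix; here the toy network is small enough for that step to be done by hand.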
Many fluxomics applications have been developed in the R environment (sometimes simply as libraries), such as Sybil, an R-based library for the analysis of metabolic networks that is part of the COBRA Toolbox implementation in R [267]. GlobalFit is another R package, designed for metabolic network refinement. It builds candidate models in which properties of the reactions are changed (e.g. reversibility, presence/absence of an enzymatic step) and evaluates how well each one fits the experimental data. GlobalFit then finds the metabolic model that best fits the experimental data with the minimum number of metabolic changes [254]. OptFlux is a platform for metabolic engineering that allows users to manipulate the nodes composing a metabolic network and to overlay phenotype results or flux modes [263]. It has been extended with a visualization plugin that enables graphical editing of the network and visualization of the results [276]. Integration of Relative Metabolite Levels for Flux prediction (iReMet-flux) is an interesting tool, as it integrates data from other omics. More precisely, iReMet-flux combines metabolomics data with the assumption that metabolism minimizes flux changes between two different scenarios. This allows biological interpretation of the changes in metabolite levels between experimental conditions [256]. Sysmetab integrates high-throughput data from MS and/or nuclear magnetic resonance measurements to solve metabolic fluxes in experiments involving carbon-labelled metabolites [268]. Other remarkable applications are FluxModeCalculator, which allows large-scale elementary flux mode computations using multiple cores [252], and VisualCNA, a PyMOL plugin implementing many graphical visualizations of constraint network analysis [269].
A number of fluxomics tools incorporate data from 13C experiments, in which the rates of the metabolic reactions within carbon metabolism are monitored through a 13C-labelled metabolite, enabling, among other applications, queries on the reversibility of some reactions [277]. This is the principle underlying the Central Carbon Metabolic Flux (CeCaF) Database, which has been manually curated and allows comparative analysis of the central carbon metabolism in many organisms. The resources from which the empirical data were retrieved are linked, and interactive visualization is supported through the Cytoscape Web API [277]. Sumoflux combines measurements of surrogates included in the experiments with machine learning algorithms, helping to optimize experimental designs by selecting which metabolite levels are most worth measuring; it can also merge data from different experiments [265]. An interesting application for identifying which metabolites are, a priori, most worth tracking is Maximum Metabolic Flexibility (MMF). This application estimates the influence of the flux of different metabolic pathways on other reactions, which helps to prioritize the reactions to measure first and thereby to optimize resources [259]. JBEI Quantitative Metabolic Modeling (jQMM) calculates flux models at the genome scale. Prediction of internal metabolic fluxes is available not only through 13C metabolic experiments but also through FBA. This application also accepts omics data, which makes it suitable for flux studies in microbial communities [257].

Finally, few Web-based fluxomics applications are available in comparison with desktop applications. For instance, MicrobesFlux uses annotated microorganism genomes (KEGG) to generate and reconstruct metabolic models [261]. CycSim is another Web application in which genome-scale metabolic models can be simulated and integrated with KEGG data [247].
MFlux is the third Web-based platform covered in this review. It incorporates machine learning algorithms (support vector machine and k-nearest neighbours, among others) to predict bacterial metabolism, drawing on experimental data from about a hundred papers in which heterotrophic bacterial metabolism was characterized by 13C experiments. MFlux also uses quadratic programming to adjust flux models to given stoichiometric constraints [260].

Knowledge representation

Model reconstruction and network analysis are mainstream approaches to system-level analysis, namely for the study of microbe systems as well as host–microbe and microbial community interplay. These efforts are equally relevant for gaining a better understanding of the gut ecosystem and for disclosing the impact of the social dynamics of these communities on dysbiosis and disease. Figure 4 illustrates the different aspects of knowledge representation that are detailed in the next subsections.

Figure 4. Mindmap of gut-related modelling and system-level analysis efforts.

Metabolic modelling

The reconstruction of genome-scale metabolic models can be viewed as a framework for converting large amounts of varied data, e.g. genetic, metabolic and biochemical, into phenotype and interaction observations [25, 278]. Typically, such reconstruction requires extensive manual curation and validation, and is based on the genome sequence, biochemistry and physiology of the organism [279]. The resulting model describes individual chemical reactions governed by the fundamental laws of mass conservation and thermodynamics, and can be used to simulate microbial growth or to predict the production rate of a particular metabolite.
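To make the mass-conservation idea concrete, the following minimal Python sketch turns a handful of reaction equations into a stoichiometric matrix and checks whether a flux vector leaves every internal metabolite balanced. The four-reaction network, the metabolite abbreviations and the function names are all hypothetical and chosen only for illustration; real reconstructions contain hundreds to thousands of reactions and are usually handled with dedicated tools.

```python
def parse_reaction(equation):
    """Split 'A + 2 B -> C' into {metabolite: coefficient}; substrates are negative."""
    left, right = equation.split("->")
    coefficients = {}
    for side, sign in ((left, -1.0), (right, 1.0)):
        for term in side.split("+"):
            parts = term.split()
            if not parts:  # an empty side stands for exchange with the environment
                continue
            amount = float(parts[0]) if len(parts) == 2 else 1.0
            name = parts[-1]
            coefficients[name] = coefficients.get(name, 0.0) + sign * amount
    return coefficients


def stoichiometric_matrix(reactions):
    """Return (metabolites, matrix) with one row per metabolite, one column per reaction."""
    parsed = {rid: parse_reaction(eq) for rid, eq in reactions.items()}
    metabolites = sorted({m for coeffs in parsed.values() for m in coeffs})
    matrix = [[parsed[rid].get(m, 0.0) for rid in reactions] for m in metabolites]
    return metabolites, matrix


def mass_balance(matrix, fluxes):
    """Net production rate of each metabolite; all zeros means steady state."""
    return [sum(s * v for s, v in zip(row, fluxes)) for row in matrix]


# Hypothetical toy network (names are placeholders, not taken from any model).
reactions = {
    "uptake": "-> glc",
    "glycolysis": "glc -> 2 pyr",
    "fermentation": "pyr -> ac",
    "secretion": "ac ->",
}
metabolites, matrix = stoichiometric_matrix(reactions)
balanced = mass_balance(matrix, [1.0, 1.0, 2.0, 2.0])
unbalanced = mass_balance(matrix, [1.0, 1.0, 1.0, 2.0])
for name, ok, bad in zip(metabolites, balanced, unbalanced):
    print(name, ok, bad)
```

The first flux vector satisfies the steady-state condition for every metabolite, whereas the second leaves pyruvate accumulating and acetate being drained, which is exactly the kind of inconsistency that constraint-based simulations rule out.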
The value of metabolic modelling for understanding the complex environment of the gut microbiome lies in resolving biochemical relationships within and between microbial species and potentially predicting the effect of ecosystem-wide perturbations, such as antibiotic application or pathogen invasion. Microbial communities can be seen not only as groups of individual microbes but also as collections of biochemical functions affecting and responding to an environment or host organism [280, 281].

Gut microbe biochemical models

There are a number of available reconstructions for human gut microbes (Table 7). Notably, a recent work has presented draft metabolic reconstructions for 301 gastrointestinal microbes [282].

Table 7. Genome-scale metabolic models and networks reconstructed for gut microbiota species

| Model | Species | Total constituents | Application |
| --- | --- | --- | --- |
| Extended/revised iAH991 [145] | Bacteroides thetaiotaomicron | 308 genes, 82 enzymes, 22 transporters, 32 transcription factors and 37 proteins of undefined functions | Suggest and refine specific functional assignments for sugar catabolic enzymes and transporters |
| iAH991 [284] | Bacteroides thetaiotaomicron | 1488 reactions, 1152 metabolites and 991 genes | Characterization of host–microbe metabolic symbiosis |
| iAH991 [284] | Bacteroides thetaiotaomicron | 1488 reactions, 1152 metabolites and 991 genes | Growth under diets varying in fat, carbohydrate and protein content |
| iBif452 [283] | Bifidobacterium adolescentis L2-32 | | Study of the anti-inflammatory role |
| iMLTC806cdf [285] | Clostridium difficile pathogenic strain 630 | 806 genes, 703 metabolites and 769 metabolic, 117 exchange and 145 transport reactions | Prediction of essential targets and inhibitors |
| iNV213 [286] | Cryptosporidium hominis | 3884/213 genes (genome/reconstruction) and 540 reactions | Cryptosporidiosis |
| iJO1366 [287] | Escherichia coli strain K-12 MG1655 | 4405/1366 genes (genome/reconstruction), 1136 unique metabolites and 2251 reactions | Comprehensive genome-scale reconstruction |
| iCA1273 [288] | Escherichia coli strain W (ATCC 9637) | 4764/1273 genes (genome/reconstruction), 1111 unique metabolites and 2477 reactions | Comprehensive genome-scale reconstruction |
| iFpraus_v1.0 [289] | Faecalibacterium prausnitzii A2-165 | | Carbon source utilization capabilities |
| iFap484 [283] | Faecalibacterium prausnitzii A2-165 | | Study of the anti-inflammatory role |
| iIT341 [290] | Helicobacter pylori strain 26695 | 1632/341 genes (genome/reconstruction), 411 unique metabolites and 476 reactions | Gastritis, gastric ulcers, gastric cancer |
| iYL1228 [291] | Klebsiella pneumoniae strain MGH 78578 | 5186/1228 genes (genome/reconstruction), 1055 unique metabolites and 1970 reactions | Infection in various tissues |
| iLca12A_640 [292] | Lactobacillus casei ATCC 12A | 1076 reactions, 979 metabolites and 640 genes | Identification of functional differences |
| iLca334_548 [292] | Lactobacillus casei ATCC 334 | 1040 reactions, 959 metabolites and 548 genes | Identification of functional differences |
| iJL846 [293] | Lactobacillus casei LC2W | 846 genes, 969 metabolic reactions and 785 metabolites | Understanding and engineering the metabolism of the strain |
| Metabolic network [294] | Lactobacillus plantarum WCFS1 | 3009/721 genes (genome/reconstruction), 554 unique metabolites and 761 reactions; 643 reactions and 531 metabolites | Analysis of the physiology of growth on a complex medium |
| Pan-metabolic map [295] | Lactobacillus reuteri ATCC 55730 and ATCC PTA 6475 | The metabolic model of 6475 includes 563 genes, similar to the metabolic model of L. reuteri JCM 1112. The metabolic model of 55730 includes 623 genes | Define functional probiotic features |
| Metabolic network [296] | Lactococcus lactis ssp. lactis IL1403 | 2310/358 genes (genome/reconstruction), 422 unique metabolites and 621 reactions; a total of 621 reactions and 509 metabolites | Understanding of lactococcal metabolic capabilities |
| Metabolic network [297] | Lactococcus lactis subsp. cremoris MG1363 | 518 genes, 754 reactions and 650 metabolites | Analysis of flavour formation |
| iMA945 [298] | Salmonella typhimurium strain LT2 | 4489/619 genes (genome/reconstruction), 1036 unique metabolites and 1964 reactions | Salmonellosis food poisoning |
| STM_v1.0 [299] | Salmonella typhimurium strain LT2 | 4489/1270 genes (genome/reconstruction), 1119 unique metabolites and 2201 reactions | Salmonellosis food poisoning |
| Genome-scale model [300] | Streptococcus thermophilus LMG18311 | 1889/429 genes (genome/reconstruction) and 522 reactions | Metabolic comparison of lactic acid bacteria |
| VvuMBEL943 [301] | Vibrio vulnificus strain CMCP6 | 2896/673 genes (genome/reconstruction), 765 unique metabolites and 943 reactions | Gastroenteritis |

These models describe the metabolism of each species, and their integrated analysis allows the exploration of interactions between predominant bacteria in gut ecosystems. For example, El-Semman and colleagues [283] reconstructed two metabolic models, for Bifidobacterium adolescentis L2-32 (iBif452) and F. prausnitzii A2-165 (iFap484), which enabled the study of the anti-inflammatory role that these microorganisms play in the gut ecosystem. A genome-scale metabolic model for Lactobacillus casei LC2W enabled the identification of essential amino acids and vitamins and the exploration of the biosynthetic potential of some metabolites [26]. Another reconstruction of B. adolescentis L2-32 and F. prausnitzii A2-165 models enabled in silico simulation of the metabolic crosstalk between the two species and evidenced the importance of acetate supply for butyrate production [27].
Likewise, the characterization of carbohydrate utilization in Bacteroides thetaiotaomicron, supported by genome-scale metabolic and regulatory reconstructions, prompted and refined specific functional assignments for sugar catabolic enzymes and transporters [144].

Many of the above-described models were obtained using similar reconstruction pipelines and therefore share some data resources and simulation tools. Often, genome sequence data (from the NCBI Genome database [40]) are the starting point, and draft reconstructions are obtained with the ModelSEED comparative genome annotation and analysis software [262]. The KEGG database is a useful resource for functional annotation [41], and the BiGG database is further used to assign reaction directionality [302, 303]. Tools like GEMSiRV [253], Acorn [304], YANAsquare [305] and VANTED [306] are commonly used for this purpose. Finally, constraint-based computational techniques are used in varied model simulations. For example, the OptKnock algorithm [307] and the COBRA Toolbox [245] are frequently used in flux balance analysis, which enables the prediction of the phenotypic responses triggered by environmental factors (i.e. manipulation of cellular growth in silico) and additional metabolic profiling. A comprehensive description of the available genome-scale metabolic reconstruction procedures and pipelines can be found in recent reviews [29, 308].

Along this line of research, but using different approaches, Bayesian inference of metabolic networks has been used to reveal a metabolic system with greater prevalence among IBD patients [309], and the construction and functional analysis of proteome interaction networks enabled the analysis of nutrient-affected pathways in human pathologies [310].
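The flux balance analysis referred to above is conventionally posed as a linear programming problem. The formulation below is the standard textbook form and is not specific to any of the tools cited here; the symbols S, v, c, l and u are introduced for this illustration and do not appear in the original text.

```latex
\begin{aligned}
\max_{v \in \mathbb{R}^{n}} \quad & c^{\top} v \\
\text{subject to} \quad & S\,v = 0, \\
& l_j \le v_j \le u_j, \qquad j = 1, \dots, n .
\end{aligned}
```

Here S is the m-by-n stoichiometric matrix (m metabolites, n reactions), v is the vector of reaction fluxes, and c selects the objective, typically a biomass reaction so that the optimum approximates maximal growth. Environmental factors enter through the bounds l and u on exchange reactions (for instance, setting the upper uptake bound of a nutrient to zero removes it from the medium), and a knockout of reaction j can be modelled by setting l_j = u_j = 0.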
Gut microbiome community models

As more metabolic reconstructions of gut microbes become available, bioinformatics efforts are being directed towards the development of modelling frameworks for the systematic investigation of metabolic crosstalk in gut microbiome communities [311, 312]. Although existing single-species quantitative and computational approaches can be applied to microbial communities, extended community-centred approaches are being proposed to consider the impact that social traits (e.g. bacteriocin production, quorum sensing and other cell-to-cell interactions) may have in specific scenarios [313]. Such modelling of microbe communities should capture community structure, i.e. the interactions among microbes over time (community states). Specifically, each community state is described by measurements of community-level fluxes, abundances of species and knowledge of the metabolism of these organisms. In complex ecosystems, such as the human gut, this may imply millions of reactions, many of which are carried out in different species. As the mathematical and computational modelling involved is too costly, alternatives based on coarse-grained models have been proposed, and recent reviews have described their rationale, pointing out their main strengths and drawbacks [314, 315].

The so-called ‘supra-organism’ approach combines all metabolic reactions into a single network to study the metabolic capacities of the community in terms of product and substrate variation [316]. Such an approach ignores the impact of species abundances and the interactions between community members while enabling the optimization of community-level objectives (i.e. prediction of important environmental conditions). The steady-state compartmentalized approach models each organism in the microbial community as a single constraint-based model (i.e. with its own objective function), nested within a global ecosystem model that represents the exchange of metabolites between the species.
The aim is to maximize the objective function of the ecosystem and thus enable the study of host–microbe and microbe–microbe interactions [315]. Although initially neglected, biomass concentrations of individual species are now also taken into account, which allows for the determination of accurate quantitative transfer rates [317]. The dynamic compartmentalized approach goes one step further and uses the kinetics of substrate uptake and metabolite exchange between species to grasp ecosystem structure and functionality [318, 319]. Specifically, this approach accounts for changes in the biomass concentrations of individual species over time, which allows the simulation of interactions that may alter the community state. Furthermore, by complementing in silico metabolic network models with metagenomics-based compositional data, it is possible to predict levels of competition and complementarity among microbiome species and compare predicted interaction measures to species co-occurrence, specifically to study microbiome assembly according to habitat filtering [320]. Several gut microbiome community models already exist. Constraint-based multi-species modelling has been used to predict the effects of environmental constraints, namely, different dietary regimes as well as anoxic and oxic conditions, in the human gut ecosystem [321]. On the other hand, integer linear programming has been used to seek ways to shift target communities towards preferred states, i.e. to find minimal sets of microbial species that collectively provide the enzymatic capacity required to synthesize a set of desired target products from a predefined set of available substrates [322]. One example is the in silico design of faecal microbiota transplants, where synthetic communities are engineered to mimic a healthy gut and thus ameliorate the condition of patients with dysbiotic guts.
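The minimal-community idea described above can be illustrated with a small sketch. It is not the integer linear programming formulation of [322]: for readability, it replaces the optimization with an exhaustive search over species subsets of increasing size, which is feasible only for a handful of species. The species names, reactions and metabolites are hypothetical.

```python
from itertools import combinations


def reachable(seed, reactions):
    """Return all metabolites producible from `seed` by firing reactions
    whose substrates are already available (network expansion)."""
    pool = set(seed)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions:
            if substrates <= pool and not products <= pool:
                pool |= products
                changed = True
    return pool


def minimal_communities(species, seed, targets):
    """Return every smallest set of species whose pooled reactions can
    synthesize all `targets` from the `seed` substrates."""
    names = sorted(species)
    for size in range(1, len(names) + 1):
        hits = []
        for combo in combinations(names, size):
            pooled = [rxn for name in combo for rxn in species[name]]
            if targets <= reachable(seed, pooled):
                hits.append(combo)
        if hits:
            return hits
    return []


# Hypothetical species, each with a list of (substrates, products) reactions.
species = {
    "sp_A": [({"polysaccharide"}, {"acetate"})],
    "sp_B": [({"acetate"}, {"butyrate"})],
    "sp_C": [({"polysaccharide"}, {"acetate", "propionate"})],
    "sp_D": [({"lactate"}, {"propionate"})],
}
seed = {"polysaccharide"}
targets = {"butyrate", "propionate"}

print(minimal_communities(species, seed, targets))
```

In this toy case no single species reaches both targets, and the only two-member community that does relies on cross-feeding: one member releases acetate and propionate from the polysaccharide, and the other converts the acetate to butyrate.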
Faecal microbiota transplants have shown promising results for addressing recurrent Clostridium difficile infections and other gut disorders, including IBD [323]. Computational tools such as PathPred [324] and Computation of Microbial Ecosystems in Time and Space (COMETS) [325] are being used in the study of community-level biotransformation. PathPred uses the KEGG RPAIR database, a collection of biochemical structure transformation patterns and chemical structure alignments of substrate–product pairs, to predict plausible pathways for multistep reactions. COMETS enables computer simulations of metabolism in spatially structured microbial communities using dynamic flux balance analysis. To facilitate the visual exploration of the metabolic interactions between microbes in a community, e.g. as predicted by COMETS, tools like VisANT 5.0 [326], MetDraw [327], Cellular Overview [328], FBA-SimVis [251] and SurreyFBA [266] have been developed.

Host–microbe models

Typically, the characterization of host–microbe interactions entails the integration of a human metabolic reconstruction (or a mouse reconstruction) with one or various microbe metabolic reconstructions. These models are useful for gaining a deeper understanding of host–microbe symbiosis in the context of metabolic disorders and thus may offer valuable insights into diet modulation and the benefits of probiotics. For example, a ‘meta-metabolome’ network describing the interactions between the human host and three predominant phyla of gut bacteria, namely, Firmicutes, Bacteroidetes and Actinobacteria, shed light on cross-feeding relationships between some gut microbe enzymes and host carbohydrate metabolism enzymes [329]. The genome-scale metabolic reconstruction of B. thetaiotaomicron iAH991 was integrated with the mouse metabolic reconstruction iMM1415, in an effort to characterize intestinal transport and absorption reactions.
The resulting model (iexGFMM_BΘ) comprises 7239 reactions, 5164 metabolites and 2769 genes, and was used to simulate the effect of different dietary regimes in both the host and the microbe [284]. Similarly, a metabolic reconstruction of human small intestinal epithelial cells, named hs_sIEC611, supported the study of microbe–microbe interactions in the presence/absence of the human host [330]. The first constraint-based host–microbial community model was recently published [311]. This model encompasses the most comprehensive model of human metabolism (Recon2) and 11 manually curated and validated metabolic models of commensals, probiotics, pathogens and opportunistic pathogens, with over 2000 exchanges representing metabolic functions in humans. It was used to predict potential metabolic host–microbe interactions under four in silico dietary regimes, which varied in carbohydrate, fat and protein intake.

Network mining

The construction of microbial function networks is often sought as a means of identifying co-occurrence of microbial species in humans. For example, a protein–protein interaction network supported the study of potential dietary interventions targeting short-chain fatty acid metabolism. Namely, the analysis of topological metrics enabled the identification of the most vulnerable protein targets of the butyrate and propionate metabolic pathways, i.e. protein targets that are more likely to change gene expression activity [331]. Another approach, based on the mapping of microbial genes to functional units, i.e. KEGG orthologous groups (KO) [41] or evolutionary genealogy of genes: Non-supervised Orthologous Groups (eggNOG) [55], has been applied to the study of the human gut microbiome associated with T2D [332]. This analysis used Pearson’s correlation coefficient to characterize the strength of the associations between functional units and included the prediction of the abundance of functional units using machine learning (i.e. random forest algorithm).
Associations deemed weak were eliminated, and the final network of functional units was described in terms of global and local properties (e.g. number of nodes and edges, density, diameter and clustering coefficient) as well as functional modules, namely, T2D-specific functional networks, and network motifs. A network-based approach also helped in the characterization of microbial co-occurrence in IBD patients [333]. Besides classical topology metrics (e.g. path length and clustering coefficient), this study looked into three- and four-species network motifs to gain a better understanding of local patterns of species co-occurrence. A correlation network analysis identified significant associations between abundances of microbial taxa and diet-induced shifts in several metabolic health parameters [334]. Namely, it identified diet-induced changes in Bacteroides levels related to changes in carbohydrate oxidation rates, whereas changes in Firmicutes were correlated with changes in fat oxidation. Among network-based studies, Cytoscape is the most widely used platform [335]. Besides providing generic means of visualization and topological exploration, this platform offers a number of different applications that enable multi-scale data integration, data clustering, enrichment analysis (Gene Ontology (GO) functional annotation) and network comparison, among others. Alternatively, some works use ConsensusPathDB (which is also available as a plugin for Cytoscape) [336]. ConsensusPathDB-human integrates interaction networks in Homo sapiens including binary and complex protein–protein, genetic, metabolic, signalling, gene regulatory and drug–target interactions, as well as biochemical pathways.

Other system-level analyses

Although not so common, there are emerging tools that complement the knowledge provided by metabolic models and networks, namely, in terms of system dynamics and evolution over time.
For example, the Metagenomic Microbial Interaction Simulator (MetaMIS) supports time series analysis of microbial community profiles [112]. The central purpose of this tool is to provide insights into microbe interactions in general and into specific microbes in the community. To this end, MetaMIS infers underlying microbial interactions from the abundance tables of operational taxonomic units and then uses them to construct interaction networks based on the Lotka–Volterra model. For each interaction network, it systematically examines interaction patterns (such as mutualism or competition) and refines the biotic roles of the microbes. Dynamic Bayesian networks are another approach used to capture complex interactions and dynamic change within the microbiome over time. For example, a dynamic Bayesian network was used to model the progression of microbiota while colonizing the infant gut [337]. The aim was to develop a predictive model based on prior composition. Accordingly, the model accounted for relationships between multiple bacterial taxa, the compositional changes bacterial taxa exert on other community members over time and the influence of environmental stresses (e.g. the use of antibiotics) on gut microbiome progression. A hidden Markov model was used to gain a better understanding of the distribution of butyrate production pathways in commensals and pathogens inhabiting different environments, namely, the human gut [338]. Boolean network modelling and dynamic analysis have enabled the inference of important relationships within gut microbiota composition. More specifically, this approach was used to explore the dynamics of clindamycin antibiotic treatment in C. difficile infection and to predict therapeutic probiotic interventions to suppress C. difficile infection [64]. Agent-based modelling has also supported the study of gut microbiome population dynamics.
For example, one model represented two bacterial species, metabolites and the gut (host), considering behavioural rules for both microbe–microbe interactions and host–microbe interactions. Notably, this model encompasses the reactions governing fermentation of polysaccharides to acetate and propionate and fermentation of acetate to butyrate, and antibiotic treatment was chosen as the disturbance factor and used to investigate the stability of the system [339]. Another model described clostridia, Desulfovibrio sp., and bifidobacteria population interactions in the gut and supported the study of risks for developing autism [340]. Simulation results suggested that clostridia growth rate is a key determinant of the risk of autism development and that treatment of high-risk infants with supraphysiological levels of lysozymes may reduce the risk of developing autism. Finally, an agent-based model of virulence regulation in Pseudomonas aeruginosa was developed to represent the host–microbe interface in the gut and to study its spatial–temporal dynamics [341]. Gut microbiome studies have also taken advantage of machine learning and data mining methods. An approach using random forests helped explore the role of the gut microbiota in colon tumorigenesis, namely, by modelling the number of tumours developed based on the initial composition of the microbiota and different combinations of antibiotics [56]. In another study, text mining and naive Bayes classification were combined towards disclosing diet–gut microbiome interactions at the molecular level [331, 342]. Notably, this resulted in the development of NutriChem, a public Web-based database on associations between chronic diseases and plant-based foods [343].
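As an illustration of the Lotka–Volterra dynamics mentioned above in connection with MetaMIS, the sketch below integrates a generalized Lotka–Volterra system, dx_i/dt = x_i (r_i + Σ_j a_ij x_j), and labels a pairwise interaction from the signs of its two coefficients. It does not reproduce the inference procedure of MetaMIS, which estimates the coefficients from abundance tables; here the growth rates and interaction matrix are hypothetical values supplied by hand, and a simple forward-Euler scheme is assumed to be accurate enough for this small example.

```python
def simulate_glv(x0, growth, interactions, dt=0.01, steps=5000):
    """Forward-Euler integration of dx_i/dt = x_i * (r_i + sum_j a_ij * x_j)."""
    x = list(x0)
    n = len(x)
    for _ in range(steps):
        rates = [
            x[i] * (growth[i] + sum(interactions[i][j] * x[j] for j in range(n)))
            for i in range(n)
        ]
        # Abundances cannot become negative.
        x = [max(0.0, xi + dt * ri) for xi, ri in zip(x, rates)]
    return x


def classify_pair(a_ij, a_ji):
    """Name the interaction pattern from the signs of the effect of j on i
    (a_ij) and of i on j (a_ji)."""
    if a_ij > 0 and a_ji > 0:
        return "mutualism"
    if a_ij < 0 and a_ji < 0:
        return "competition"
    if a_ij * a_ji < 0:
        return "parasitism or predation"
    if a_ij == 0 and a_ji == 0:
        return "neutralism"
    # Exactly one coefficient is zero at this point.
    return "commensalism" if max(a_ij, a_ji) > 0 else "amensalism"


# Hypothetical two-taxon community with self-limitation and mutual inhibition.
growth = [1.0, 0.8]
interactions = [
    [-1.0, -0.5],
    [-0.4, -1.0],
]

final = simulate_glv([0.1, 0.1], growth, interactions)
pattern = classify_pair(interactions[0][1], interactions[1][0])
print([round(value, 3) for value in final], pattern)
```

With these parameters both taxa persist, because self-limitation outweighs mutual inhibition; the simulated abundances settle at the coexistence point obtained by solving r + A x = 0.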
Future challenges and opportunities

After a period in which omics technologies were progressively introduced into the study of the human gut microbiome, several achievements have been made, which as a whole have provided deeper insights into the complex microbial ecology and physiology operating on the intestinal mucosa. Omics have been crucial for the elucidation of some of the microbial and metabolic signatures characterizing the link between dysbiosis and the disease/health status of the host. For instance, in IBD, a chronic inflammatory condition of the human gut, omics have led to a better understanding of the disease phenotype (i.e. genes, proteins and pathways that primarily differentiated patients from healthy subjects) and enabled the association of the phenotype with alterations in bacterial carbohydrate and protein metabolism, bacterial–host interactions, as well as human host-secreted enzymes [27]. Future challenges, which also represent new opportunities, lie in the integration of different techniques to recover a holistic view of the microbiomes. Both the efficiency and the quality of microbiome research heavily depend on the integration of the outputs of next-generation sequencing methods with those of other omics technologies, namely, data on transcript and protein variation, metabolite concentrations and spatial distribution. Within this context, microbiome bioinformatics has the mission to provide computational methods and techniques that complement experimental approaches and enrich our understanding of complex microbial communities, their internal interactions and their interactions with the host and the environment. Together with integration, curation of data repositories will be crucial for the proper identification of microorganisms, genes and proteins, notably those to which only a putative status or a hypothetical function, or not even that, has been assigned.
For this reason, studies focused on the function of single molecules are as important to metatranscriptomics, metaproteomics or metabolomics as culturomics is to metagenomics. It will make no sense to discover new microbial biomarkers of disease if we cannot culture the corresponding microorganisms. Last but not least, there is a general lack of graphical user interfaces for many of the tools, most of which are currently operated via the command line. This lack limits the access of the scientific community to interesting resources and therefore represents an opportunity for current developments in the field of the human microbiota.

Concluding remarks

The human microbiome plays a key role in human health and is associated with numerous diseases. Understanding the importance of the gut microbiome in the modulation of host health has thus become a subject of great interest for researchers across biomedical disciplines. Integrated and high-throughput analyses are providing new insights into microbial community structure and function in the human gut. The continuous emergence of new knowledge, the heterogeneity of the data involved and the need for integrative, advanced and often application-based (customized workflows) analysis make a comprehensive description of possible resources, tools and pipelines hardly feasible. This review presents a collection of updated tools for the application of omics data to the field of gut microbiota, but it can be extended to the study of any microbiota, regardless of the ecological niche. The main tools for metagenomics, metatranscriptomics, metaproteomics, metabolomics and fluxomics have been discussed and organised in a comprehensive way as an online resource for the research community.
We have found that further efforts need to be made in the integration of data from different omics, in the optimization of informatics resources, in the development of novel tools and, notably, in the curation of the information contained in databases and other public biomolecule repositories. We sincerely believe that cooperation between researchers working in different fields, from microbiology to bioinformatics, will help in achieving these milestones. In the end, this will extend our view of the complex interaction between microbiota and the human host, and identify meaningful microbial targets for interventional studies in the framework of different diseases.

Key Points

Gut microbiota composition is related to human health, and alterations in the relative proportions of its components are linked to several human diseases.
Omics techniques have made it possible to study the composition, functionality and metabolic activity of the human gut microbiota and its impact on host physiology.
This review provides an overview of many bioinformatics resources that can be applied to human gut microbiome research.

Competing interests

B.S. is on the scientific board and is co-founder of Microviable Therapeutics SL. The other authors have no competing interests.

Supplementary Data

Supplementary data are available online at http://bib.oxfordjournals.org/.

Funding

This work was supported by the Spanish ‘Programa Estatal de Investigación, Desarrollo e Innovación Orientada a los Retos de la Sociedad’ (grant number AGL2013-44039 R); the Asociación Española Contra el Cáncer (‘Obtención de péptidos bioactivos contra el Cáncer Colo-Rectal a partir de secuencias genéticas de microbiomas intestinales’, grant number PS-2016).
This study was also supported by the Portuguese Foundation for Science and Technology (FCT) under the scope of the strategic funding of UID/BIO/04469/2013 unit and COMPETE 2020 (POCI-01-0145-FEDER-006684); and the INOU16-05 project from the University of Vigo. SING group thanks CITI (Centro de Investigación, Transferencia e Innovación) from University of Vigo for hosting its IT infrastructure.

Aitor Blanco-Míguez is a PhD student of the Computer Science Doctoral programme of the University of Vigo. He is currently developing advanced computational methods for modelling the interaction of commensal bacteria with host epithelial/immune cells.

Florentino Fdez-Riverola is a faculty member of the Department of Computer Science and a researcher affiliated to the Biomedical Research Centre (CINBIO), at the University of Vigo. He leads the Next Generation Computer System Group (SING), which is dedicated to the research and development of cutting-edge computational methodologies and applications.

Borja Sánchez is a senior researcher at the IPLA-CSIC. His main research line is devoted to the understanding of the molecular mechanisms of host–bacteria interaction through extracellular and surface-associated proteins/peptides.

Anália Lourenço is a faculty member of the Department of Computer Science and a researcher affiliated to the Biomedical Research Centre (CINBIO), at the University of Vigo and the Centre of Biological Engineering, at the University of Minho. Her main research interests include computational intelligence, bioinformatics and systems biology.

References

1 Rakoff-Nahoum S, Foster KR, Comstock LE. The evolution of cooperation within the gut microbiota. Nature 2016; 533: 255–9.
2 Francino MP. Antibiotics and the human gut microbiome: dysbioses and accumulation of resistances. Front Microbiol 2016; 6: 1–11.
3 Walsh CJ, Guinane CM, O'Toole PW, et al.
Beneficial modulation of the gut microbiota. FEBS Lett 2014; 588(22): 4120–30. http://dx.doi.org/10.1016/j.febslet.2014.03.035
4 Morgan XC, Huttenhower C, Lewitter F, Kann M. Chapter 12: human microbiome analysis. PLoS Comput Biol 2012; 8(12): e1002808.
5 van den Elsen LW, Poyntz HC, Weyrich LS, et al. Embracing the gut microbiota: the new frontier for inflammatory and infectious diseases. Clin Transl Immunol 2017; 6(1): e125.
6 Hooper LV, Gordon JI. Commensal host-bacterial relationships in the gut. Science 2001; 292(5519): 1115–8. http://dx.doi.org/10.1126/science.1058709
7 Sánchez B, Urdaci MC, Margolles A. Extracellular proteins secreted by probiotic bacteria as mediators of effects that promote mucosa-bacteria interactions. Microbiology 2010; 156(Pt 11): 3232–42.
8 Sansonetti PJ. War and peace at mucosal surfaces. Nat Rev Immunol 2004; 4(12): 953–64. http://dx.doi.org/10.1038/nri1499
9 Patterson E, Ryan PM, Cryan JF, et al. Gut microbiota, obesity and diabetes. Postgrad Med J 2016; 92(1087): 286–300. http://dx.doi.org/10.1136/postgradmedj-2015-133285
10 He X, Ji G, Jia W, et al. Gut microbiota and nonalcoholic fatty liver disease: insights on mechanism and application of metabolomics. Int J Mol Sci 2016; 17(3): 300. http://dx.doi.org/10.3390/ijms17030300
11 Barlow GM, Yu A, Mathur R. Role of the gut microbiome in obesity and diabetes mellitus. Nutr Clin Pract 2015; 30(6): 787–97. http://dx.doi.org/10.1177/0884533615609896
12 Zhang C, Yin A, Li H, et al.
Dietary modulation of gut microbiota contributes to alleviation of both genetic and simple obesity in children. EBioMedicine 2015; 2: 966–82.
13 Trøseid M, Hov JR, Nestvold TK, et al. Major increase in microbiota-dependent proatherogenic metabolite TMAO one year after bariatric surgery. Metab Syndr Relat Disord 2016; 14: 197–201.
14 Palau-Rodriguez M, Tulipani S, Isabel Queipo-Ortuño M, et al. Metabolomic insights into the intricate gut microbial-host interaction in the development of obesity and type 2 diabetes. Front Microbiol 2015; 6: 1151.
15 Arora T, Singh S, Sharma RK. Probiotics: Interaction with gut microbiome and antiobesity potential. Nutrition 2013; 29(4): 591–6. http://dx.doi.org/10.1016/j.nut.2012.07.017
16 Buttó LF, Haller D. Dysbiosis in intestinal inflammation: cause or consequence. Int J Med Microbiol 2016; 306: 302–9.
17 Kataoka K. The intestinal microbiota and its role in human health and disease. J Med Invest 2016; 63(1-2): 27–37. http://dx.doi.org/10.2152/jmi.63.27
18 Matsuoka K, Kanai T. The gut microbiota and inflammatory bowel disease. Semin Immunopathol 2015; 37(1): 47–55. http://dx.doi.org/10.1007/s00281-014-0454-4
19 Kostic AD, Xavier RJ, Gevers D. The microbiome in inflammatory bowel disease: current status and the future ahead. Gastroenterology 2014; 146(6): 1489–99. http://dx.doi.org/10.1053/j.gastro.2014.02.009
20 Forbes JD, Van Domselaar G, Bernstein CN. Microbiome survey of the inflamed and noninflamed gut at different compartments within the gastrointestinal tract of inflammatory bowel disease patients. Inflamm Bowel Dis 2016; 22: 817–25.
http://dx.doi.org/10.1097/MIB.0000000000000684
21 Cao Y, Shen J, Ran ZH. Association between Faecalibacterium prausnitzii reduction and inflammatory bowel disease: a meta-analysis and systematic review of the literature. Gastroenterol Res Pract 2014; 2014: 872725.
22 Paul B, Barnes S, Demark-Wahnefried W, et al. Influences of diet and the gut microbiome on epigenetic modulation in cancer and other diseases. Clin Epigenetics 2015; 7: 112. http://dx.doi.org/10.1186/s13148-015-0144-7
23 Thomas RM, Jobin C. The Microbiome and Cancer: Is the ‘Oncobiome’ Mirage Real? Trends in Cancer 2015; 1(1): 24–35.
24 Belizario JE, Napolitano M. Human microbiomes and their roles in dysbiosis, common diseases, and novel therapeutic approaches. Front Microbiol 2015; 6: 1–16.
25 Sung J, Hale V, Merkel AC, et al. Metabolic modeling with Big Data and the gut microbiome. Appl Transl genomics 2016; 10: 10–5. http://dx.doi.org/10.1016/j.atg.2016.02.001
26 Weir TL, Manter DK, Sheflin AM, et al. Stool microbiome and metabolome differences between colorectal cancer patients and healthy adults. PLoS One 2013; 8(8): e70803.
27 Erickson AR, Cantarel BL, Lamendella R, et al. Integrated metagenomics/metaproteomics reveals human host-microbiota signatures of Crohn’s disease. PLoS One 2012; 7(11): e49138.
28 Haiser HJ, Gootenberg DB, Chatman K, et al. Predicting and manipulating cardiac drug inactivation by the human gut bacterium Eggerthella lenta. Science 2013; 341: 295–8. http://dx.doi.org/10.1126/science.1235872
29 Cuevas DA, Edirisinghe J, Henry CS, et al.
From DNA to FBA: how to build your own genome-scale metabolic model. Front Microbiol 2016; 7: 907.
30 Segata N, Boernigen D, Tickle TL, et al. Computational meta’omics for microbial community studies. Mol Syst Biol 2014; 9: 666.
31 Morgan XC, Huttenhower C. Meta’omic analytic techniques for studying the intestinal microbiome. Gastroenterology 2014; 146(6): 1437–48.e1.
32 Borenstein E. Computational systems biology and in silico modeling of the human microbiome. Brief Bioinform 2012; 13(6): 769–80. http://dx.doi.org/10.1093/bib/bbs022
33 Collison M, Hirt RP, Wipat A, et al. Data mining the human gut microbiota for therapeutic targets. Brief Bioinform 2012; 13(6): 751–68. http://dx.doi.org/10.1093/bib/bbs002
34 Human Microbiome Project Consortium. A framework for human microbiome research. Nature 2012; 486: 215–21. http://dx.doi.org/10.1038/nature11209
35 Qin J, Li Y, Cai Z, et al. A metagenome-wide association study of gut microbiota in type 2 diabetes. Nature 2012; 490(7418): 55–60. http://dx.doi.org/10.1038/nature11450
36 McDonald D, Birmingham A, Knight R. Context and the human microbiome. Microbiome 2015; 3: 52. http://dx.doi.org/10.1186/s40168-015-0117-2
37 Li J, Jia H, Cai X, et al. An integrated catalog of reference genes in the human gut microbiome. Nat Biotechnol 2014; 32(8): 834–41. http://dx.doi.org/10.1038/nbt.2942
38 Falony G, Joossens M, Vieira-Silva S, et al. Population-level analysis of gut microbiome variation. Science 2016; 352(6285): 560–4.
http://dx.doi.org/10.1126/science.aad3503
39 Auton A, Brooks LD, Durbin RM, et al. A global reference for human genetic variation. Nature 2015; 526(7571): 68–74. http://dx.doi.org/10.1038/nature15393
40 Pruitt KD, Tatusova T, Brown GR, et al. NCBI Reference Sequences (RefSeq): current status, new features and genome annotation policy. Nucleic Acids Res 2012; 40: D130–5.
41 Kanehisa M, Furumichi M, Tanabe M, et al. KEGG: new perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Res 2017; 45(D1): D353–61.
42 Tatusov RL, Fedorova ND, Jackson JD, et al. The COG database: an updated version includes eukaryotes. BMC Bioinformatics 2003; 4: 41. http://dx.doi.org/10.1186/1471-2105-4-41
43 Finn RD, Coggill P, Eberhardt RY, et al. The Pfam protein families database: towards a more sustainable future. Nucleic Acids Res 2016; 44(D1): D279–85.
44 Letunic I, Doerks T, Bork P. SMART 7: recent updates to the protein domain annotation resource. Nucleic Acids Res 2012; 40: D302–5.
45 Mitra S, Rupek P, Richter DC, et al. Functional analysis of metagenomes and metatranscriptomes using SEED and KEGG. BMC Bioinformatics 2011; 12(Suppl 1): S21.
46 Caspi R, Altman T, Dreher K, et al. The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases. Nucleic Acids Res 2012; 44: 471–80.
47 Overbeek R, Olson R, Pusch GD, et al. The SEED and the rapid annotation of microbial genomes using subsystems technology (RAST). Nucleic Acids Res 2014; 42: D206–14.
48 Wilke A, Bischof J, Gerlach W, et al. The MG-RAST metagenomics database and portal in 2015. Nucleic Acids Res 2016; 44: D590–4.
49 Markowitz VM, Chen I-MA, Palaniappan K, et al. IMG 4 version of the integrated microbial genomes comparative analysis system. Nucleic Acids Res 2014; 42: 560–7. http://dx.doi.org/10.1093/nar/gkt963
50 Huson DH, Beier S, Flade I, et al. MEGAN community edition - interactive exploration and analysis of large-scale microbiome sequencing data. PLoS Comput Biol 2016; 12(6): e1004957.
51 Abubucker S, Segata N, Goll J, et al. Metabolic reconstruction for metagenomic data and its application to the human microbiome. PLoS Comput Biol 2012; 8(6): e1002358.
52 Tyakht AV, Popenko AS, Belenikin MS, et al. MALINA: a web service for visual analytics of human gut microbiota whole-genome metagenomic reads. Source Code Biol Med 2012; 7: 13. http://dx.doi.org/10.1186/1751-0473-7-13
53 Kultima JR, Coelho LP, Forslund K, et al. MOCAT2: a metagenomic assembly, annotation and profiling framework. Bioinformatics 2016; 32: 2520–3. http://dx.doi.org/10.1093/bioinformatics/btw183
54 Bose T, Haque MM, Reddy C, et al. COGNIZER: a framework for functional annotation of metagenomic datasets. PLoS One 2015; 10(11): e0142102.
55 Huerta-Cepas J, Szklarczyk D, Forslund K, et al. eggNOG 4.5: a hierarchical orthology framework with improved functional annotations for eukaryotic, prokaryotic and viral sequences. Nucleic Acids Res 2016; 44: D286–93.
56 Zackular JP, Baxter NT, Chen GY, et al. Manipulation of the gut microbiota reveals role in colon tumorigenesis.
mSphere 2015; 1: e00001-15.
57 Norman JM, Handley SA, Baldridge MT, et al. Disease-specific alterations in the enteric virome in inflammatory bowel disease. Cell 2015; 160: 447–60. http://dx.doi.org/10.1016/j.cell.2015.01.002
58 Biagi E, Candela M, Centanni M, et al. Gut microbiome in Down syndrome. PLoS One 2014; 9(11): e112023.
59 Larsen PE, Dai Y. Metabolome of human gut microbiome is predictive of host dysbiosis. Gigascience 2015; 4: 42. http://dx.doi.org/10.1186/s13742-015-0084-3
60 Yap TW-C, Gan H-M, Lee Y-P, et al. Helicobacter pylori eradication causes perturbation of the human gut microbiome in young adults. PLoS One 2016; 11(3): e0151893.
61 Keren N, Konikoff FM, Paitan Y, et al. Interactions between the intestinal microbiota and bile acids in gallstones patients. Environ Microbiol Rep 2015; 7: 874–80. http://dx.doi.org/10.1111/1758-2229.12319
62 Rooijers K, Kolmeder C, Juste C, et al. An iterative workflow for mining the human intestinal metaproteome. BMC Genomics 2011; 12: 6. http://dx.doi.org/10.1186/1471-2164-12-6
63 Wills ES, Jonkers DMAE, Savelkoul PH, et al. Fecal microbial composition of ulcerative colitis and Crohn’s disease patients in remission and subsequent exacerbation. PLoS One 2014; 9(3): e90981.
64 Steinway SN, Biggs MB, Loughran TP, et al. Inference of network dynamics and metabolic interactions in the gut microbiome. PLoS Comput Biol 2015; 11(5): e1004338.
65 Stewart CJ, Marrs ECL, Nelson A, et al. Development of the preterm gut microbiome in twins at risk of necrotising enterocolitis and sepsis. PLoS One 2013; 8(8): e73465.
Google Scholar CrossRef Search ADS PubMed  66 La Rosa PS, Warner BB, Zhou Y, et al.   Patterned progression of bacterial populations in the premature infant gut. Proc Natl Acad Sci USA  2014; 111: 12522– 7. http://dx.doi.org/10.1073/pnas.1409497111 Google Scholar CrossRef Search ADS PubMed  67 Turnbaugh PJ, Hamady M, Yatsunenko T, et al.   A core gut microbiome in obese and lean twins. Nature  2009; 457( 7228): 480– 4. http://dx.doi.org/10.1038/nature07540 Google Scholar CrossRef Search ADS PubMed  68 Caporaso JG, Lauber CL, Costello EK, et al.   Moving pictures of the human microbiome. Genome Biol  2011; 12( 5): R50. Google Scholar CrossRef Search ADS PubMed  69 Turroni S, Rampelli S, Biagi E, et al.   Temporal dynamics of the gut microbiota in people sharing a confined environment, a 520-day ground-based space simulation, MARS500. Microbiome  2017; 5( 1): 39. http://dx.doi.org/10.1186/s40168-017-0256-8 Google Scholar CrossRef Search ADS PubMed  70 Schnorr SL, Candela M, Rampelli S, et al.   Gut microbiome of the Hadza hunter-gatherers. Nat Commun  2014; 5: 3654. Google Scholar CrossRef Search ADS PubMed  71 Zhang J, Guo Z, Xue Z, et al.   A phylo-functional core of gut microbiota in healthy young Chinese cohorts across lifestyles, geography and ethnicities. ISME J  2015; 9( 9): 1979– 90. http://dx.doi.org/10.1038/ismej.2015.11 Google Scholar CrossRef Search ADS PubMed  72 Biagi E, Franceschi C, Rampelli S, et al.   Gut microbiota and extreme longevity. Curr Biol  2016; 26( 11): 1480– 5. http://dx.doi.org/10.1016/j.cub.2016.04.016 Google Scholar CrossRef Search ADS PubMed  73 Morton ER, Lynch J, Froment A, et al.   Variation in rural African gut microbiota is strongly correlated with colonization by entamoeba and subsistence. PLoS Genet  2015; 11( 11): e1005658. Google Scholar CrossRef Search ADS PubMed  74 Gomez A, Petrzelkova KJ, Burns MB, et al.   Gut microbiome of coexisting BaAka pygmies and bantu reflects gradients of traditional subsistence patterns. 
Cell Rep  2016; 14( 9): 2142– 53. http://dx.doi.org/10.1016/j.celrep.2016.02.013 Google Scholar CrossRef Search ADS PubMed  75 Stewart CJ, Nelson A, Campbell MD, et al.   Gut microbiota of Type 1 diabetes patients with good glycaemic control and high physical fitness is similar to people without diabetes: an observational study. Diabet Med  2017; 34: 127– 34. http://dx.doi.org/10.1111/dme.13140 Google Scholar CrossRef Search ADS PubMed  76 Karlsson FH, Tremaroli V, Nookaew I, et al.   Gut metagenome in European women with normal, impaired and diabetic glucose control. Nature  2013; 498( 7452): 99– 103. http://dx.doi.org/10.1038/nature12198 Google Scholar CrossRef Search ADS PubMed  77 Candela M, Biagi E, Soverini M, et al.   Modulation of gut microbiota dysbioses in type 2 diabetic patients by macrobiotic Ma-Pi 2 diet. Br J Nutr  2016; 116( 1): 80– 93. http://dx.doi.org/10.1017/S0007114516001045 Google Scholar CrossRef Search ADS PubMed  78 Wang W-L, Xu S-Y, Ren Z-G, et al.   Application of metagenomics in the human gut microbiome. World J Gastroenterol  2015; 21( 3): 803– 14. http://dx.doi.org/10.3748/wjg.v21.i3.803 Google Scholar CrossRef Search ADS PubMed  79 Fabijanić M, Vlahoviček K. Big data, evolution, and metagenomes: predicting disease from gut microbiota codon usage profiles. Methods Mol Biol  2016; 1415: 509– 31. Google Scholar CrossRef Search ADS PubMed  80 Mulcahy-O’Grady H, Workentine ML. The challenge and potential of metagenomics in the clinic. Front Immunol  2016; 7: 1– 8. Google Scholar CrossRef Search ADS PubMed  81 Noecker C, McNally CP, Eng A, et al.   High-resolution characterization of the human microbiome. Transl Res  2017; 179: 7– 23. http://dx.doi.org/10.1016/j.trsl.2016.07.012 Google Scholar CrossRef Search ADS PubMed  82 Sedlar K, Kupkova K, Provaznik I. Bioinformatics strategies for taxonomy independent binning and visualization of sequences in shotgun metagenomics. Comput Struct Biotechnol J  2017; 15: 48– 55. 
http://dx.doi.org/10.1016/j.csbj.2016.11.005 Google Scholar CrossRef Search ADS PubMed  83 Ghurye JS, Cepeda-Espinoza V, Pop M. Metagenomic assembly: overview, challenges and applications. Yale J Biol Med  2016; 89: 353– 62. Google Scholar PubMed  84 Coit P, Sawalha AH. The human microbiome in rheumatic autoimmune diseases: a comprehensive review. Clin Immunol  2016; 170: 70– 9. http://dx.doi.org/10.1016/j.clim.2016.07.026 Google Scholar CrossRef Search ADS PubMed  85 Zielezinski A, Vinga S, Almeida J, et al.   Alignment-free sequence comparison: benefits, applications, and tools. Genome Biol  2017; 18( 1): 186. http://dx.doi.org/10.1186/s13059-017-1319-7 Google Scholar CrossRef Search ADS PubMed  86 Treangen TJ, Sommer DD, Angly FE, et al.   Next generation sequence assembly with AMOS. Curr Protoc Bioinform  2011; Chapter 11: Unit 11.8. 87 Kerepesi C, Bánky D, Grolmusz V. AmphoraNet: the webserver implementation of the AMPHORA2 metagenomic workflow suite. Gene  2014; 533( 2): 538– 40. Google Scholar CrossRef Search ADS PubMed  88 van Heel AJ, de Jong A, Montalbán-López M, et al.   BAGEL3: Automated identification of genes encoding bacteriocins and (non-)bactericidal posttranslationally modified peptides. Nucleic Acids Res  2013; 41: W448– 53. Google Scholar CrossRef Search ADS PubMed  89 Altschul SF, Gish W, Miller W, et al.   Basic local alignment search tool. J Mol Biol  1990; 215( 3): 403– 10. http://dx.doi.org/10.1016/S0022-2836(05)80360-2 Google Scholar CrossRef Search ADS PubMed  90 Lu YY, Tang K, Ren J, et al.   CAFE: a Ccelerated Alignment-FrEe sequence analysis. Nucleic Acids Res  2017; 45( W1): W554– 9. Google Scholar CrossRef Search ADS   91 Sun S, Chen J, Li W, et al.   Community cyberinfrastructure for advanced microbial ecology research and analysis: the CAMERA resource. Nucleic Acids Res  2011; 39: D546– 51. Google Scholar CrossRef Search ADS PubMed  92 Li W, Godzik A. 
Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics  2006; 22: 1658– 9. http://dx.doi.org/10.1093/bioinformatics/btl158 Google Scholar CrossRef Search ADS PubMed  93 Haas BJ, Gevers D, Earl AM, et al.   Chimeric 16S rRNA sequence formation and detection in Sanger and 454-pyrosequenced PCR amplicons. Genome Res  2011; 21( 3): 494– 504. http://dx.doi.org/10.1101/gr.112730.110 Google Scholar CrossRef Search ADS PubMed  94 Angiuoli SV, Matalka M, Gussman A, et al.   CloVR: A virtual machine for automated and portable sequence analysis from the desktop using cloud computing. BMC Bioinformatics  2011; 12( 1): 356. http://dx.doi.org/10.1186/1471-2105-12-356 Google Scholar CrossRef Search ADS PubMed  95 Alneberg J, Bjarnason BS, de Bruijn I, et al.   Binning metagenomic contigs by coverage and composition. Nat Methods  2014; 11( 11): 1144– 6. http://dx.doi.org/10.1038/nmeth.3103 Google Scholar CrossRef Search ADS PubMed  96 Xu Z, Hao B. CVTree update: a newly designed phylogenetic study platform using composition vectors and whole genomes. Nucleic Acids Res  2009; 37: W174– 8. Google Scholar CrossRef Search ADS PubMed  97 Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol  2014; 15( 12): 550. http://dx.doi.org/10.1186/s13059-014-0550-8 Google Scholar CrossRef Search ADS PubMed  98 Quince C, Connellly S, Raguideau S, et al.   De novo extraction of microbial strains from metagenomes reveals intra-species niche partitioning. bioRxiv  2016, doi: https://doi.org/10.1101/073825. 99 Buchfink B, Xie C, Huson DH. Fast and sensitive protein alignment using DIAMOND. Nat Methods  2015; 12( 1): 59– 60. Google Scholar CrossRef Search ADS PubMed  100 Manor O, Borenstein E. Systematic characterization and analysis of the taxonomic drivers of functional shifts in the human microbiome. Cell Host Microbe  2017; 21( 2): 254– 67. 
http://dx.doi.org/10.1016/j.chom.2016.12.014 Google Scholar CrossRef Search ADS PubMed  101 Magoč T, Salzberg SL. FLASH: fast length adjustment of short reads to improve genome assemblies. Bioinformatics  2011; 27( 21): 2957– 63. Google Scholar CrossRef Search ADS PubMed  102 Kim J, Kim MS, Koh AY, et al.   FMAP: functional mapping and analysis pipeline for metagenomics and metatranscriptomics studies. BMC Bioinformatics  2016; 17( 1): 420. http://dx.doi.org/10.1186/s12859-016-1278-0 Google Scholar CrossRef Search ADS PubMed  103 Riehle K, Coarfa C, Jackson A, et al.   The genboree microbiome toolset and the analysis of 16S rRNA microbial sequences. BMC Bioinformatics  2012; 13(Suppl 13): S11. Google Scholar CrossRef Search ADS PubMed  104 Kelley DR, Liu B, Delcher AL, et al.   Gene prediction with Glimmer for metagenomic sequences augmented by classification and clustering. Nucleic Acids Res  2012; 40: e9. Google Scholar CrossRef Search ADS PubMed  105 Imelfort M, Parks D, Woodcroft BJ, et al.   GroopM: an automated tool for the recovery of population genomes from related metagenomes. PeerJ  2014; 2: e603. Google Scholar CrossRef Search ADS PubMed  106 Peng Y, Leung HCM, Yiu SM, et al.   IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth. Bioinformatics  2012; 28: 1420– 8. http://dx.doi.org/10.1093/bioinformatics/bts174 Google Scholar CrossRef Search ADS PubMed  107 Xie C, Mao X, Huang J, et al.   KOBAS 2.0: a web server for annotation and identification of enriched pathways and diseases. Nucleic Acids Res  2011; 39: W316– 22. Google Scholar CrossRef Search ADS PubMed  108 Wu Y-W, Tang Y-H, Tringe SG, et al.   MaxBin: an automated binning method to recover individual genomes from metagenomes using an expectation-maximization algorithm. Microbiome  2014; 2: 26. http://dx.doi.org/10.1186/2049-2618-2-26 Google Scholar CrossRef Search ADS PubMed  109 Li D, Luo R, Liu C-M, et al.   
MEGAHIT v1.0: A fast and scalable metagenome assembler driven by advanced methodologies and community practices. Methods  2016; 102: 3– 11. http://dx.doi.org/10.1016/j.ymeth.2016.02.020 Google Scholar CrossRef Search ADS PubMed  110 Kang DD, Froula J, Egan R, et al.   MetaBAT, an efficient tool for accurately reconstructing single genomes from complex microbial communities. PeerJ  2015; 3: e1165. Google Scholar CrossRef Search ADS PubMed  111 Noguchi H, Taniguchi T, Itoh T. MetaGeneAnnotator: detecting species-specific patterns of ribosomal binding site for precise gene prediction in anonymous prokaryotic and phage genomes. DNA Res  2008; 15( 6): 387– 96. http://dx.doi.org/10.1093/dnares/dsn027 Google Scholar CrossRef Search ADS PubMed  112 Shaw GT-W, Pao Y-Y, Wang D. MetaMIS: a metagenomic microbial interaction simulator based on microbial community profiles. BMC Bioinformatics  2016; 17: 488. http://dx.doi.org/10.1186/s12859-016-1359-0 Google Scholar CrossRef Search ADS PubMed  113 Treangen TJ, Koren S, Sommer DD, et al.   MetAMOS: a modular and open source metagenomic assembly and analysis pipeline. Genome Biol  2013; 14( 1): R2. Google Scholar CrossRef Search ADS PubMed  114 Segata N, Waldron L, Ballarini A, et al.   Metagenomic microbial community profiling using unique clade-specific marker genes. Nat Methods  2012; 9: 811– 4. http://dx.doi.org/10.1038/nmeth.2066 Google Scholar CrossRef Search ADS PubMed  115 Nurk S, Meleshko D, Korobeynikov A, et al.   metaSPAdes: a new versatile metagenomic assembler. Genome Res  2017; 27( 5): 824– 34. http://dx.doi.org/10.1101/gr.213959.116 Google Scholar CrossRef Search ADS PubMed  116 Namiki T, Hachiya T, Tanaka H, et al.   MetaVelvet: an extension of Velvet assembler to de novo metagenome assembly from short sequence reads. Nucleic Acids Res  2012; 40: e155. Google Scholar CrossRef Search ADS PubMed  117 Keegan KP, Glass EM, Meyer F. 
MG-RAST, a metagenomics service for analysis of microbial community structure and function. Methods Mol Biol  2016; 1399: 207– 33. Google Scholar CrossRef Search ADS PubMed  118 Chevreux B, Pfisterer T, Drescher B, et al.   Using the miraEST assembler for reliable and automated mRNA transcript assembly and SNP detection in sequenced ESTs. Genome Res  2004; 14( 6): 1147– 59. http://dx.doi.org/10.1101/gr.1917404 Google Scholar CrossRef Search ADS PubMed  119 Schloss PD, Westcott SL, Ryabin T, et al.   Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl Environ Microbiol  2009; 75: 7537– 41. http://dx.doi.org/10.1128/AEM.01541-09 Google Scholar CrossRef Search ADS PubMed  120 Manor O, Borenstein E. MUSiCC: a marker genes based framework for metagenomic normalization and accurate profiling of gene abundances in the microbiome. Genome Biol  2015; 16: 53. http://dx.doi.org/10.1186/s13059-015-0610-8 Google Scholar CrossRef Search ADS PubMed  121 Lin H-H, Liao Y-C. Accurate binning of metagenomic contigs via automated clustering sequences using information of genomic signatures and marker genes. Sci Rep  2016; 6: 24175. http://dx.doi.org/10.1038/srep24175 Google Scholar CrossRef Search ADS PubMed  122 Rosen GL, Reichenberger ER, Rosenfeld AM. NBC: the Naive Bayes Classification tool webserver for taxonomic classification of metagenomic reads. Bioinformatics  2011; 27( 1): 127– 9. http://dx.doi.org/10.1093/bioinformatics/btq619 Google Scholar CrossRef Search ADS PubMed  123 Hoff KJ, Lingner T, Meinicke P, et al.   Orphelia: predicting genes in metagenomic sequencing reads. Nucleic Acids Res  2009; 37: W101– 5. Google Scholar CrossRef Search ADS PubMed  124 Huson DH, Xie CA. poor man’s BLASTX–high-throughput metagenomic protein database search using PAUDA. Bioinformatics  2014; 30: 38– 9. Google Scholar CrossRef Search ADS PubMed  125 Darling AE, Jospin G, Lowe E, et al.   
PhyloSift: phylogenetic analysis of genomes and metagenomes. PeerJ  2014; 2: e243. Google Scholar CrossRef Search ADS PubMed  126 Langille MGI, Zaneveld J, Caporaso JG, et al.   Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences. Nat Biotechnol  2013; 31: 814– 21. http://dx.doi.org/10.1038/nbt.2676 Google Scholar CrossRef Search ADS PubMed  127 Claudel-Renard C, Chevalet C, Faraut T, et al.   Enzyme-specific profiles for genome annotation: PRIAM. Nucleic Acids Res  2003; 31: 6633– 9. http://dx.doi.org/10.1093/nar/gkg847 Google Scholar CrossRef Search ADS PubMed  128 Hyatt D, Chen G-L, LoCascio PF, et al.   Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics  2010; 11: 119. http://dx.doi.org/10.1186/1471-2105-11-119 Google Scholar CrossRef Search ADS PubMed  129 Caporaso JG, Kuczynski J, Stombaugh J, et al.   QIIME allows analysis of high-throughput community sequencing data. Nat Methods  2010; 7( 5): 335– 6. http://dx.doi.org/10.1038/nmeth.f.303 Google Scholar CrossRef Search ADS PubMed  130 Ye Y, Choi J-H, Tang H. RAPSearch: a fast protein similarity search tool for short reads. BMC Bioinformatics  2011; 12: 159. http://dx.doi.org/10.1186/1471-2105-12-159 Google Scholar CrossRef Search ADS PubMed  131 Gerlach W, Jünemann S, Tille F, et al.   WebCARMA: a web application for the functional and taxonomic classification of unassembled metagenomic reads. BMC Bioinformatics  2009; 10( 1): 430. Google Scholar CrossRef Search ADS PubMed  132 Marchesi JR, Ravel J. The vocabulary of microbiome research: a proposal. Microbiome  2015; 3: 31. http://dx.doi.org/10.1186/s40168-015-0094-5 Google Scholar CrossRef Search ADS PubMed  133 Mande SS, Mohammed MH, Ghosh TS. Classification of metagenomic sequences: methods and challenges. Brief Bioinform  2012; 13: 669– 81. http://dx.doi.org/10.1093/bib/bbs054 Google Scholar CrossRef Search ADS PubMed  134 Dröge J, McHardy AC. 
Taxonomic binning of metagenome samples generated by next-generation sequencing technologies. Brief Bioinform  2012; 13: 646– 55. Google Scholar CrossRef Search ADS PubMed  135 Wang Q, Garrity GM, Tiedje JM, et al.   Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Appl Environ Microbiol  2007; 73: 5261– 7. http://dx.doi.org/10.1128/AEM.00062-07 Google Scholar CrossRef Search ADS PubMed  136 Rho M, Tang H, Ye Y. FragGeneScan: predicting genes in short and error-prone reads. Nucleic Acids Res  2010; 38: e191. Google Scholar CrossRef Search ADS PubMed  137 Yi G, Sze S-H, Thon MR. Identifying clusters of functionally related genes in genomes. Bioinformatics  2007; 23( 9): 1053– 60. http://dx.doi.org/10.1093/bioinformatics/btl673 Google Scholar CrossRef Search ADS PubMed  138 Manor O, Levy R, Borenstein E. Mapping the inner workings of the microbiome: genomic- and metagenomic-based study of metabolism and metabolic interactions in the human microbiome. Cell Metab  2014; 20: 742– 52. http://dx.doi.org/10.1016/j.cmet.2014.07.021 Google Scholar CrossRef Search ADS PubMed  139 Joice R, Yasuda K, Shafquat A, et al.   Determining microbial products and identifying molecular targets in the human microbiome. Cell Metab  2014; 20: 731– 41. http://dx.doi.org/10.1016/j.cmet.2014.10.003 Google Scholar CrossRef Search ADS PubMed  140 Dudhagara P, Bhavsar S, Bhagat C, et al.   Web resources for metagenomics studies. Genomics Proteomics Bioinformatics  2015; 13( 5): 296– 303. http://dx.doi.org/10.1016/j.gpb.2015.10.003 Google Scholar CrossRef Search ADS PubMed  141 Kim Y, Koh I, Rho M. Deciphering the human microbiome using next-generation sequencing data and bioinformatics approaches. Methods  2015; 79-80: 52– 9. Google Scholar CrossRef Search ADS PubMed  142 Human Microbiome Project Consortium. Structure, function and diversity of the healthy human microbiome. Nature  2012; 486: 207– 14. 
http://dx.doi.org/10.1038/nature11234 CrossRef Search ADS PubMed  143 Dehoux P, Marvaud JC, Abouelleil A, et al.   Comparative genomics of Clostridium bolteae and Clostridium clostridioforme reveals species-specific genomic properties and numerous putative antibiotic resistance determinants. BMC Genomics  2016; 17: 819. http://dx.doi.org/10.1186/s12864-016-3152-x Google Scholar CrossRef Search ADS PubMed  144 Milani C, Turroni F, Duranti S, et al.   Genomics of the genus bifidobacterium reveals species-specific adaptation to the glycan-rich gut environment. Appl Environ Microbiol  2015; 82: 980– 91. Google Scholar CrossRef Search ADS PubMed  145 Ravcheev DA, Godzik A, Osterman AL, et al.   Polysaccharides utilization in human gut bacterium Bacteroides thetaiotaomicron: comparative genomics reconstruction of metabolic and regulatory networks. BMC Genomics  2013; 14( 1): 873. http://dx.doi.org/10.1186/1471-2164-14-873 Google Scholar CrossRef Search ADS PubMed  146 Neville BA, Sheridan PO, Harris HMB, et al.   Pro-inflammatory flagellin proteins of prevalent motile commensal bacteria are variably abundant in the intestinal microbiome of elderly humans. PLoS One  2013; 8( 7): e68919. Google Scholar CrossRef Search ADS PubMed  147 Manor O, Borenstein E. Revised computational metagenomic processing uncovers hidden and biologically meaningful functional variation in the human microbiome. Microbiome  2017; 5: 19. http://dx.doi.org/10.1186/s40168-017-0231-4 Google Scholar CrossRef Search ADS PubMed  148 Greenblum S, Turnbaugh PJ, Borenstein E. Metagenomic systems biology of the human gut microbiome reveals topological shifts associated with obesity and inflammatory bowel disease. Proc Natl Acad Sci USA  2012; 109: 594– 9. http://dx.doi.org/10.1073/pnas.1116053109 Google Scholar CrossRef Search ADS PubMed  149 Chander AM, Nair RG, Kaur G, et al.   
Genome insight and comparative pathogenomic analysis of Nesterenkonia jeotgali Strain CD08_7 isolated from duodenal mucosa of celiac disease patient. Front Microbiol  2017; 8: 129. Google Scholar CrossRef Search ADS PubMed  150 Walsh CJ, Guinane CM, Hill C, et al.   In silico identification of bacteriocin gene clusters in the gastrointestinal tract, based on the Human Microbiome Project’s reference genome database. BMC Microbiol  2015; 15: 183. Google Scholar CrossRef Search ADS PubMed  151 Zhao L. The gut microbiota and obesity: from correlation to causality. Nat Rev Microbiol  2013; 11( 9): 639– 47. http://dx.doi.org/10.1038/nrmicro3089 Google Scholar CrossRef Search ADS PubMed  152 Ni Y, Li J, Panagiotou GCOMAN. a web server for comprehensive metatranscriptomics analysis. BMC Genomics  2016; 17: 622. http://dx.doi.org/10.1186/s12864-016-2964-z Google Scholar CrossRef Search ADS PubMed  153 Narayanasamy S, Jarosz Y, Muller EEL, et al.   IMP: a pipeline for reproducible reference-independent integrated metagenomic and metatranscriptomic analyses. Genome Biol  2016; 17: 260. http://dx.doi.org/10.1186/s13059-016-1116-8 Google Scholar CrossRef Search ADS PubMed  154 Rotmistrovsky K, Agarwala R. BMTagger: Best Match Tagger for Removing Human Reads from Metagenomics Datasets. 2011. 155 Westreich ST, Korf I, Mills DA, et al.   SAMSA: a comprehensive metatranscriptome analysis pipeline. BMC Bioinformatics  2016; 17: 399. http://dx.doi.org/10.1186/s12859-016-1270-8 Google Scholar CrossRef Search ADS PubMed  156 Edgar RC. Search and clustering orders of magnitude faster than BLAST. Bioinformatics  2010; 26: 2460– 1. http://dx.doi.org/10.1093/bioinformatics/btq461 Google Scholar CrossRef Search ADS PubMed  157 Marchler-Bauer A, Lu S, Anderson JB, et al.   CDD: a conserved domain database for the functional annotation of proteins. Nucleic Acids Res  2011; 39: D2259. Google Scholar CrossRef Search ADS   158 Sunagawa S, Mende DR, Zeller G, et al.   
Metagenomic species profiling using universal phylogenetic marker genes. Nat Methods  2013; 10: 1196– 9. http://dx.doi.org/10.1038/nmeth.2693 Google Scholar CrossRef Search ADS PubMed  159 Celaj A, Markle J, Danska J, et al.   Comparison of assembly algorithms for improving rate of metatranscriptomic functional annotation. Microbiome  2014; 2: 39. http://dx.doi.org/10.1186/2049-2618-2-39 Google Scholar CrossRef Search ADS PubMed  160 Petriz BA, Franco OL. Metaproteomics as a complementary approach to gut microbiota in health and disease. Front Chem  2017; 5: 4. Google Scholar CrossRef Search ADS PubMed  161 Wilkins MR, Gasteiger E, Bairoch A, et al.   Protein identification and analysis tools in the ExPASy server. Methods Mol Biol  1999; 112: 531– 52. Google Scholar PubMed  162 Chatterjee S, Stupp GS, Park SKR, et al.   A comprehensive and scalable database search system for metaproteomics. BMC Genomics  2016; 17( 1): 642. http://dx.doi.org/10.1186/s12864-016-2855-3 Google Scholar CrossRef Search ADS PubMed  163 Sievers F, Wilm A, Dineen D, et al.   Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol  2014; 7: 539– 9. http://dx.doi.org/10.1038/msb.2011.75 Google Scholar CrossRef Search ADS   164 Lupas A, Van Dyke M, Stock J. Predicting coiled coils from protein sequences. Science  1991; 252( 5009): 1162– 4. http://dx.doi.org/10.1126/science.252.5009.1162 Google Scholar CrossRef Search ADS PubMed  165 Gattiker A, Bienvenut WV, Bairoch A, et al.   FindPept, a tool to identify unmatched masses in peptide mass fingerprinting protein identification. Proteomics  2002; 2( 10): 1435– 44. http://dx.doi.org/10.1002/1615-9861(200210)2:10<1435::AID-PROT1435>3.0.CO;2-9 Google Scholar CrossRef Search ADS PubMed  166 Jagtap PD, Blakely A, Murray K, et al.   Metaproteomic analysis using the Galaxy framework. Proteomics  2015; 15( 20): 3553– 65. 
http://dx.doi.org/10.1002/pmic.201500074 Google Scholar CrossRef Search ADS PubMed  167 Pedruzzi I, Rivoire C, Auchincloss AH, et al.   HAMAP in 2015: updates to the protein family classification and annotation system. Nucleic Acids Res  2015; 43: D1064– 70. [WorldCat] Google Scholar CrossRef Search ADS PubMed  168 Balwierz PJ, Pachkov M, Arnold P, et al.   ISMARA: automated modeling of genomic signals as a democracy of regulatory motifs. Genome Res  2014; 24( 5): 869– 84. http://dx.doi.org/10.1101/gr.169508.113 Google Scholar CrossRef Search ADS PubMed  169 Perkins DN, Pappin DJ, Creasy DM, et al.   Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis  1999; 20( 18): 3551– 67. http://dx.doi.org/10.1002/(SICI)1522-2683(19991201)20:18<3551::AID-ELPS3551>3.0.CO;2-2 Google Scholar CrossRef Search ADS PubMed  170 Tabb DL, Fernando CG, Chambers MC. MyriMatch: highly accurate tandem mass spectral peptide identification by multivariate hypergeometric analysis. J Proteome Res  2007; 6: 654– 61. http://dx.doi.org/10.1021/pr0604054 Google Scholar CrossRef Search ADS PubMed  171 Horlacher O, Nikitin F, Alocci D, et al.   MzJava: an open source library for mass spectrometry data processing. J Proteomics  2015; 129: 63– 70. http://dx.doi.org/10.1016/j.jprot.2015.06.013 Google Scholar CrossRef Search ADS PubMed  172 Geer LY, Markey SP, Kowalak JA, et al.   Open mass spectrometry search algorithm. J Proteome Res  2004; 3( 5): 958– 64. http://dx.doi.org/10.1021/pr0499491 Google Scholar CrossRef Search ADS PubMed  173 Krissinel E, Henrick K. Inference of macromolecular assemblies from crystalline state. J Mol Biol  2007; 372( 3): 774– 97. http://dx.doi.org/10.1016/j.jmb.2007.05.022 Google Scholar CrossRef Search ADS PubMed  174 Vaezzadeh AR, Hernandez C, Vadas O, et al.   pICarver: a software tool and strategy for peptides isoelectric focusing. J Proteome Res  2008; 7( 10): 4336– 45. 
http://dx.doi.org/10.1021/pr8002672 Google Scholar CrossRef Search ADS PubMed  175 Yachdav G, Kloppmann E, Kajan L, et al.   PredictProtein–an open resource for online prediction of protein structural and functional features. Nucleic Acids Res  2014; 42: W337– 43. Google Scholar CrossRef Search ADS PubMed  176 Benkert P, Tosatto SCE, Schomburg D. QMEAN: A comprehensive scoring function for model quality assessment. Proteins Struct Funct Bioinforma  2008; 71: 261– 77. http://dx.doi.org/10.1002/prot.21715 Google Scholar CrossRef Search ADS   177 Ahrné E, Nikitin F, Lisacek F, et al.   QuickMod: a tool for open modification spectrum library searches. J Proteome Res  2011; 10( 7): 2913– 21. Google Scholar CrossRef Search ADS PubMed  178 Searle BC. Scaffold: a bioinformatic tool for validating MS/MS-based proteomic studies. Proteomics  2010; 10( 6): 1265– 9. http://dx.doi.org/10.1002/pmic.200900437 Google Scholar CrossRef Search ADS PubMed  179 de Castro E, Sigrist CJA, Gattiker A, et al.   ScanProsite: detection of PROSITE signature matches and ProRule-associated functional and structural residues in proteins. Nucleic Acids Res  2006; 34: W362– 5. Google Scholar CrossRef Search ADS PubMed  180 Eng JK, McCormack AL, Yates JR. An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database. J Am Soc Mass Spectrom  1994; 5: 976– 89. http://dx.doi.org/10.1016/1044-0305(94)80016-2 Google Scholar CrossRef Search ADS PubMed  181 Notredame C, Higgins DG, Heringa J. T-coffee: a novel method for fast and accurate multiple sequence alignment 1 1. J Thornton J Mol Biol  2000; 302: 205– 17. http://dx.doi.org/10.1006/jmbi.2000.4042 Google Scholar CrossRef Search ADS   182 Mesuere B, Van der Jeugt F, Devreese B, et al.   The unique peptidome: Taxon-specific tryptic peptides as biomarkers for targeted metaproteomics. Proteomics  2016; 16: 2313– 8. 
http://dx.doi.org/10.1002/pmic.201600023 Google Scholar CrossRef Search ADS PubMed  183 Craig R, Beavis RC. TANDEM: matching proteins with tandem mass spectra. Bioinformatics  2004; 20( 9): 1466– 7. http://dx.doi.org/10.1093/bioinformatics/bth092 Google Scholar CrossRef Search ADS PubMed  184 Artimo P, Jonnalagedda M, Arnold K, et al.   ExPASy: SIB bioinformatics resource portal. Nucleic Acids Res  2012; 40: W597– 603. Google Scholar CrossRef Search ADS PubMed  185 Boutet E, Lieberherr D, Tognolli M, et al.   UniProtKB/Swiss-prot, the manually annotated section of the UniProt knowledgebase: how to use the entry view. Methods Mol Biol  2016; 1374: 23– 54. Google Scholar CrossRef Search ADS PubMed  186 Szklarczyk D, Morris JH, Cook H, et al.   The STRING database in 2017: quality-controlled protein–protein association networks, made broadly accessible. Nucleic Acids Res  2017; 45( D1): D362– 8. Google Scholar CrossRef Search ADS PubMed  187 Biasini M, Bienert S, Waterhouse A, et al.   SWISS-MODEL: modelling protein tertiary and quaternary structure using evolutionary information. Nucleic Acids Res  2014; 42( W1): W252– 8. Google Scholar CrossRef Search ADS PubMed  188 Sigrist CJA, de Castro E, Cerutti L, et al.   New and continuing developments at PROSITE. Nucleic Acids Res  2013; 41: D344– 7. Google Scholar CrossRef Search ADS PubMed  189 Hulo C, de Castro E, Masson P, et al.   ViralZone: a knowledge resource to understand virus diversity. Nucleic Acids Res  2011; 39(Suppl 1): D576– 82. Google Scholar CrossRef Search ADS   190 Gaudet P, Michel P-A, Zahn-Zabal M, et al.   The neXtProt knowledgebase on human proteins: 2017 update. Nucleic Acids Res  2017; 45( D1): D177– 82. Google Scholar CrossRef Search ADS PubMed  191 ExPASy proteomics toolset: Protein sequences and identification. 192 ExPASy proteomics toolset: Proteomics experiment. 193 ExPASy proteomics toolset: Function analysis. 194 ExPASy proteomics toolset: Sequences sites, features and motifs. 
Briefings in Bioinformatics, Oxford University Press
