Phylogenomic Analysis of Lactobacillus curvatus Reveals Two Lineages Distinguished by Genes for Fermenting Plant-Derived Carbohydrates

Lactobacillus curvatus is a lactic acid bacterium encountered in many different types of fermented food (meat, seafood, vegetables, and cereals). Although this species plays an important role in the preservation of these foods, few attempts have been made to assess its genomic diversity. This study uses comparative analyses of 13 published genomes (complete or draft) to better understand the evolutionary processes acting on the genome of this species. Phylogenomic analysis, based on a coalescent model of evolution, revealed that the 6,742 sites of single nucleotide polymorphism within the L. curvatus core genome delineate two major groups, with lineage 1 represented by the newly sequenced strain FLEC03, and lineage 2 represented by the type-strain DSM20019. The two lineages could also be distinguished by the content of their accessory genome, which sheds light on a long-term evolutionary process of lineage-dependent genetic acquisition and the possibility of population structure. Interestingly, one clade from lineage 2 shared more accessory genes with strains of lineage 1 than with other strains of lineage 2, indicating recent convergence in carbohydrate catabolism. Both lineages had a wide repertoire of accessory genes involved in the fermentation of plant-derived carbohydrates that are released from polymers of α/β-glucans, α/β-fructans, and N-acetylglucosan. Other gene clusters were distributed among strains according to the type of food from which the strains were isolated. These results give new insight into the ecological niches in which L. curvatus may naturally thrive (such as silage or compost heaps) in addition to fermented food.

Key words: pangenome, Lactobacillus curvatus, plant fermentation, food, lactic acid bacteria, phylogenomic.
© The Author(s) 2018. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
Teran et al. Genome Biol. Evol. 10(6):1516–1525. doi:10.1093/gbe/evy106. Advance Access publication May 29, 2018.

Introduction

Lactobacillus curvatus is a facultative heterofermentative lactic acid bacterium, commonly associated with fermented and vacuum-packaged refrigerated meat and fish products (Hammes et al. 1990; Hammes and Knauf 1994; Hammes and Hertel 1998; Lyhs et al. 2002; Lyhs and Björkroth 2008; Lucquin et al. 2012; Chaillou et al. 2015). In addition, this species has also been isolated from dairy products such as milk and cheese (Kask et al. 2003). More recently, many studies have identified L. curvatus in fermented plant products like sauerkraut (Vogel et al. 1993), sourdough (Michel et al. 2016), radish pickles (Nakano et al. 2016), and kimchi (Jung et al. 2011); in other plant-derived materials like honey (Bulgasem et al. 2016); or from the environmental fermentation process of corn or grass silage (Tohno et al. 2012; Zhou et al. 2016). These observations suggest that L. curvatus is ubiquitous in lactic acid fermentation and that foods of vegetable origins are a common environment for this species. Based on this, it is perhaps not surprising that L. curvatus has also been identified in the feces or gut of many animal species that feed on plants or cereals, including snails (Koleva et al. 2014), chickens (Zommiti et al. 2017), and humans (Dal Bello et al. 2003).

Lactobacillus curvatus belongs to the Lactobacillus sakei clade of psychrotrophic Lactobacillus, which comprises four species: Lactobacillus sakei, Lactobacillus fuchuensis, Lactobacillus graminis, and L. curvatus (Sun et al. 2015; Zheng et al. 2015). To date, thirteen L. curvatus genomes are available, of which five are complete and eight are draft (Hebert et al. 2012; Cousin 2015; Nakano et al. 2016; Inglin et al. 2017; Jans et al. 2017; Lee et al. 2017; Teran et al. 2017). Some of these strains were sequenced to highlight their ability to produce multiple bacteriocins, like strain CRL705, which was isolated from an Argentinean artisanal dry sausage (Hebert et al. 2012), or because of surprising features of flagellum-mediated motility, like the Japanese strain NRIC0822, which was isolated from sushi (Cousin 2015). Besides these few examples, very little is known about the intraspecies genomic repertoire of L. curvatus strains. In particular, improved knowledge of this species' genome could help to distinguish it from the closely related species L. sakei, which is known to be separated into three phylogenetic lineages (Chaillou et al. 2013).

Several recent publications have reported the genome sequencing of strains of L. curvatus isolated from various food products. We took advantage of these resources to perform a detailed phylogenomic and pangenomic analysis of L. curvatus as a species. In order to improve our understanding of the evolution and population structure of the 13 strains studied, we performed multiple, complementary analyses. These included an evolutionary analysis of the core genome, which revealed the existence of two lineages, and an in-depth comparison of the biological functions encoded in the accessory genome, which highlighted the strong relationship of this species with different plant-based environments.

Materials and Methods

Genome Data and Curation of Annotations

Our data set consisted of 13 genomes of L. curvatus strains, of which five were complete and eight were draft versions, all available from the Genbank/EMBL databases (table 1). All genomes were downloaded to the MAGE annotation platform (Vallenet et al. 2013) and strain-specific genes were all manually curated in order to standardize the comparative analysis. Metabolic pathways were reconstructed using the METACYC database (Caspi et al. 2016) or from the literature when indicated.

Table 1
List of Lactobacillus curvatus Genome Sequences Used in This Study
Strain | Origin | Chromosome (Mb) | Sequencing Status (Sequencing Technology) | Number of Contigs | Number of CDS | Accession Number | Reference
FBA2 | Radish/Carrot pickles | 1.849 | Complete (PacBio RS II platform) | 1 | 1,718 | CP016028.1 | Nakano et al. (2016)
Wikim38 | Baechu (Chinese Kimchi) | 1.940 | Complete (PacBio RS II platform) | 1 | 1,810 | CP017124.1 | Lee et al. (2017)
Wikim52 | Kimchi | 1.987 | Complete (PacBio RS II platform) | 1 | 1,875 | CP016602.1 | NP
MRS6 | Fermented meat | 2.114 | Complete (PacBio RS II platform) | 1 | 1,935 | CP022474.1 | Jans et al. (2017)
KG6 | Fermented meat | 2.002 | Complete (PacBio RS II platform) | 2 | 1,884 | CP022475.1-76.1 | Jans et al. (2017)
CRL705 | Argentinean dry-sausage | 1.838 | Draft (454 GS Titanium pyrosequencing) | 145 | 1,708 | GCA_000235705.2 | Hebert et al. (2012)
FLEC03 | Vacuum-packed beef carpaccio | 1.902 | Draft (Illumina MiSeq pair-end) | 47 | 1,944 | GCA_900178545.1 | Teran et al. (2017)
DSM20019 | Milk | 1.815 | Draft (Ion Torrent PGM) | 72 | 1,828 | GCA_001311645.1 | NP
NRIC0822 | Kabura zushi | 1.945 | Draft (Illumina HiSeq pair-end) | 144 | 1,831 | GCA_000805355.1 | Cousin (2015)
RI-406 | Meat | 2.001 | Draft (Illumina MiSeq pair-end) | 52 | 1,873 | GCA_001981905.1 | Inglin et al. (2017)
RI-198 | Meat | 1.804 | Draft (Illumina MiSeq pair-end) | 77 | 1,727 | GCA_001981925.1 | Inglin et al. (2017)
RI-193 | Meat | 1.805 | Draft (Illumina MiSeq pair-end) | 82 | 1,727 | GCA_001982045.1 | Inglin et al. (2017)
RI-124 | Meat | 1.810 | Draft (Illumina MiSeq pair-end) | 77 | 1,722 | GCA_001982025.1 | Inglin et al. (2017)
NP, no publication available.

Pangenome Analysis and Clusters of Orthologous Genes

The composition of the core and variable genomes was calculated using a pairwise estimation of orthologous proteins in CDhit (Li and Godzik 2006) at a threshold of 80% identity on 80% of the protein's total length. We then modeled the progression of pangenome size with respect to the number of genomes included by randomly picking genomes and iterating the process 13 times, as described on the MAGE annotation platform (Vallenet et al. 2013). R statistical software (R Development Core Team 2010) and the HEATMAP.2 function of the GPLOTS package were used to construct a heatmap based on the variable components of the pangenome. The Euclidean distance based on presence/absence was used to calculate the distance matrix between the variable genomes, and clustering was performed by unsupervised complete linkage.

Evolutionary and Phylogenomic Analysis

The alignment of the nucleotide genome sequences was performed using PROGRESSIVEMAUVE (Darling et al. 2010); from this, we extracted the core alignment by keeping only the regions where all genomes aligned over at least 500 bp. This core alignment was submitted to five independent runs of CLONALFRAME software (Didelot and Falush 2006; Vos and Didelot 2009), which consisted of a burn-in cycle of the MCMC (Markov Chain Monte Carlo) algorithm fixed to 50,000 iterations and a posterior sampling of 50,000 iterations. The burn-in iterations were discarded and model parameters were sampled in the second period of the run every 50 iterations, resulting in 1,000 samples from the posterior. Satisfactory convergence of the MCMC algorithm in the different runs was estimated based on the Gelman-Rubin statistic calculated in CLONALFRAME. The genealogy of the population was summarized and the robustness of the tree topology was evaluated by concatenating the posterior samples of the five runs to build a 50% majority rule consensus tree using the CLONALFRAME GUI and MEGA6 software (Tamura et al. 2013). From these runs, several measurements were also taken, such as ρ/θ (relative frequencies of occurrence of recombination and mutation) and r/m (relative impact of recombination and mutation in the diversification of the lineages). A Bayesian approach, implemented in STRUCTURE software version 2.3 (Pritchard et al. 2000; Falush et al. 2003), was used to infer the lineage ancestry of the core genome by assuming that each strain derived all of its SNPs from one of the K ancestral subpopulations. The number of populations, K, was determined under the linkage model. Five individual runs per value of K (chosen to be 2 or 3) were performed using 50,000 burn-in iterations and 50,000 sampling iterations.

Results and Discussion

Evolutionary Analysis Reveals the Existence of Two Lineages Within L. curvatus

The phylogeny of L. curvatus was inferred using two different approaches. The first strategy was based on Bayesian inference with the coalescent model implemented in CLONALFRAME software (Didelot and Falush 2006) whereas the second strategy was to statistically estimate the probable number of ancestral subpopulations (K) within the genetic population of strains; this was performed using STRUCTURE with the linkage model (Pritchard et al. 2000; Falush et al. 2003). The initial step of these two analyses consisted of the selection of the high-quality core genome using PROGRESSIVEMAUVE software (Darling et al. 2010), which took into account only the coding sequences (CDS) and the aligned regions with no frameshift between the 13 chromosomes. Therefore, all CDS of potential low-sequencing quality (i.e., due to the draft genome sequencing status) were discarded from this analysis. A total of 131 alignment blocks longer than 500 bp were selected. The core aligned genome consisted of 199,762 bp, which contained 6,742 Single Nucleotide Polymorphic loci (SNPs) among the 13 chromosomes. Results are shown in figure 1. From the coalescent tree, we were able to define two separate lineages: lineage 1 comprised the strains FLEC03, MRS6, and RI-406, and lineage 2 comprised the ten other strains. The results of the STRUCTURE analysis confirmed the presence of two populations. Furthermore, this method enabled the characterization of the allele frequencies at each locus, then it probabilistically assigned individuals to K (unknown) ancestral populations. For both lineages (when K was set to two), >75% of the genetic contribution to each strain came from its own group. However, we observed less genetic homogeneity in one clade of lineage 2 (named clade 2B), which contained strains RI-124, RI-193, RI-198, and KG6. We therefore investigated if inference to K = 3 ancestral populations would cause this clade to be assigned to a third lineage, a population structure which would be similar to that observed in the closely related species L. sakei (Chaillou et al. 2013). However, there was no statistical support for this hypothesis, suggesting that clade 2B does not originate from a third ancestral population. Therefore, at this stage of the analysis it could only be concluded that strains RI-124, RI-193, RI-198, and KG6 are most likely affiliated to the broader lineage 2 but have some degree of admixture (from 35% to 45%) with lineage 1. The CLONALFRAME analysis estimated statistically the ρ/θ ratio, a measure of how often recombination events occur relative to neutral genetic drift (mutation). This value was 0.137 (0.127–0.147 at 95% credibility interval), which indicated that the recombination rate is significantly lower than the mutation rate and therefore, recombination has played only a minor role in the evolution of the two lineages. Nevertheless, the admixture status of strains RI-124, RI-193, RI-198, and KG6 clearly indicated that recombination events between the two lineages may occur, perhaps when strains from both lineages are in physical proximity such as in solid food.

The Accessory Genome Corroborates the Existence of Two Lineages

Another way to assess population structure is to perform a comparative analysis of the variable or accessory genome. Indeed, the speciation of a species into several lineages may arise from positive selection for a given ecological niche and thus for the acquisition of lineage-specific metabolic traits (Doolittle and Papke 2006). The general genome features of the L. curvatus strains, shown in table 1, clearly highlighted some genetic variability (from 1,708 CDS in strain CRL705 to 1,944 CDS in strain FLEC03). We thus compared the 13 strains in terms of gene content by estimating the core shared genome and then evaluating how the accessory genome could contribute to the differentiation of the two lineages. The results, shown in figure 2, first indicated that the core genome is composed of 1,215 clusters of orthologous genes (orthologs) whereas the pangenome contains 3,435 orthologs.
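As an illustration of how core genome and pangenome sizes of this kind can be derived from ortholog presence/absence data, the following Python sketch computes both sets and their progression as genomes are added in random order. It is a minimal sketch under stated assumptions, not the MAGE platform procedure: the strain names, gene labels, function names, and the seed are all hypothetical toy values, and only the idea of random genome addition is taken from the text.

```python
import random


def core_and_pan(presence, strains):
    """Return (core, pan) ortholog sets for the chosen strains.

    presence maps a strain name to the set of ortholog clusters it carries.
    Core = clusters present in every chosen strain; pan = clusters present in any.
    """
    gene_sets = [presence[s] for s in strains]
    core = set.intersection(*gene_sets)
    pan = set.union(*gene_sets)
    return core, pan


def progression(presence, n_orders=13, seed=0):
    """Record (core size, pan size) while genomes are added in random order.

    n_orders random addition orders are drawn; each yields one curve with
    one point per number of genomes included.
    """
    rng = random.Random(seed)  # fixed seed (assumption) for reproducibility
    strains = sorted(presence)
    curves = []
    for _ in range(n_orders):
        order = strains[:]
        rng.shuffle(order)
        curve = []
        for k in range(1, len(order) + 1):
            core, pan = core_and_pan(presence, order[:k])
            curve.append((len(core), len(pan)))
        curves.append(curve)
    return curves


# Hypothetical toy data: three strains, five ortholog clusters.
presence = {
    "strainA": {"g1", "g2", "g3"},
    "strainB": {"g1", "g2", "g4"},
    "strainC": {"g1", "g3", "g5"},
}

core, pan = core_and_pan(presence, list(presence))
curves = progression(presence, n_orders=5)
print(len(core), len(pan))
```

Whatever the addition order, the core size can only shrink and the pangenome size can only grow as genomes are added, which is the shape of progression curve that the random-picking procedure is meant to summarize.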
FIG. 1. Phylogenomic clonal genealogy and population structure of Lactobacillus curvatus strains. On the left: Fifty-percent majority rule consensus tree inferred with CLONALFRAME software using the coalescent model (branch lengths are given in coalescent units). All branches are supported by a posterior probability of >95%. Strains are colored according to their lineage affiliation. On the right: Proportion of genetic material derived from each of K subpopulations as inferred by STRUCTURE (linkage model) and assuming K = 2 or 3 populations. Ancestral subpopulations are colored in red (lineage 1), blue (lineage 2), and green (an unlikely lineage 3), respectively. Clade 2B of lineage 2 is colored in dark cyan to highlight its divergence from clade 2A and a strong degree of admixture between the two lineages.
(Axes: time in coalescent units for the tree; Q ancestry proportion for K = 2 and K = 3. Strain order in the figure: NRIC0822, Wikim38, FBA2, DSM20019, Wikim52, CRL705, RI-193, RI-198, RI-124, KG6 in lineage 2; FLEC03, RI-406, MRS6 in lineage 1. Clade labels: 1A, 2A, 2B.)

It should be noted that although it has been shown that draft assemblies provide highly relevant insights into pangenomic studies (Sun et al. 2015), the use of these types of assemblies may lead to underestimation of the core genome, the mobile genome and the strain-specific genes. For this reason, we then relaxed the threshold used to estimate the core genome by including the possibility that a given gene might be absent in one of the thirteen strains, a possibility that could occur due to the draft sequencing of eight out of the 13 strains studied. With these settings, the core genome was estimated to contain 1,407 orthologs. Of the remaining orthologs, 414 formed the mobile genome (IS elements and prophages), with 55 different putative transposase families represented overall. The cloud genome (genes present in only one strain and not from the mobile genome) contained 648 orthologs which were mainly distributed into three major groups: proteins of unknown functions (50.1%), cell-surface or exported proteins (15%), and proteins involved in the production of surface or exported polysaccharides or teichoic acid (20%). The remaining 901 orthologs (when the relaxed threshold was used to estimate the core genome) form part of the accessory genome, in which genes are shared by at least two strains. This group of 901 orthologs from the accessory genome was used for cluster analysis of the strains (fig. 3). We observed that strains grouped according to their lineage, with the exception of strains from clade 2B; these shared more accessory genes with strains of lineage 1 than with the other strains of their own putative lineage. It is important to remember that the core genome analysis is based on the mutation and recombination rates among SNPs in housekeeping genes and thus reflects a rather long-term evolutionary process. Instead, the accessory genome analysis is affected to a large degree by horizontal gene transfer, which might be influenced by the lifestyle of the strains and represents a recent and ongoing process of fitness acquisition by the strains. Therefore, the striking finding of a discrepancy between core and accessory genome clustering suggests that strains from clade 2B have recently evolved from lineage 2 through the acquisition of functional traits from lineage 1. It should be noted that strains FLEC03, RI-406, RI-124, RI-193, RI-198, and MRS6 share a common source of isolation (meat), whereas the other strains were isolated from fermented nonmeat products (except for CRL705). Furthermore, strains isolated from Asian-type food products (sushi and kimchi) formed a closely related subgroup of strains in lineage 2. Therefore, patterns in the accessory genome of L. curvatus suggest that certain traits that affect environmental fitness have been recently acquired. However, it should be acknowledged that a bias on the origin of strains might still exist since eight out of thirteen of the strains being sequenced and publicly available are from fresh or fermented meat. It would therefore be valuable to sequence more strains from other sources (vegetables and silage) to validate this conclusion. Based on figure 3, several groups of accessory genes were defined (A to E) according to their frequency in the different strains and these will be addressed later in the discussion.

FIG. 2. Progression of the core genome and pangenome of Lactobacillus curvatus. Each boxplot represents the pairwise evolution of the core genome (blue) and pangenome (yellow) of clusters of orthologous proteins calculated iteratively as genomes were added to the analysis, for a total of 2–13 genomes. Dashed lines represent the values obtained for the progression of the core genome (using a stringent or relaxed estimation; see text), the pangenome, and for another important step in the estimation of the accessory genome, the shell genome. The shell genome is a more realistic functional estimation of the pangenome that excludes mobile selfish DNA (mobile genome) and unique gene clusters found in only one strain (cloud genome) from the sum of accessory genes.
(Axes: number of genomes; number of orthologous CDS. Values marked in the figure: pangenome = 3,435; mobile = 414; cloud = 648; shell genome = 2,308; accessory (relaxed) = 901; core genome (relaxed) = 1,407; core genome (stringent) = 1,215.)

Cell-Surface Complexes as a Major Difference between the Two Lineages

Cell-surface complexes (Cscs) are conserved among Firmicutes and, in particular, in species belonging to the order Lactobacillales. Cscs are multicomponent complexes composed of four types of protein families which differ according to their domains: CscA has a DUF916 domain of unknown function, CscB and CscC contain a WxL1 and WxL2 domain which binds noncovalently to the murein polymer of the cell wall, and CscD contains an LPxTG motif for covalent anchoring to the cell wall (Siezen et al. 2006). Csc components can vary in number and position among clusters and not all of them are necessarily present in one complex. In particular, CscCs are large secreted proteins with putative carbohydrate polymer binding domains that are involved in adhesion and/or carbohydrate scavenging. They are often highly variable between gene clusters and were shown to cross-react with CscA and CscB proteins of noncognate gene clusters (Brinster et al. 2007). Of the accessory orthologous genes investigated here, we found Csc-encoding regions in both groups A and D (see fig. 3). Strains FLEC03, RI-406, and MRS6 of lineage 1 share eight putative Csc clusters, two of which are also shared with strains RI-124, RI-193, and RI-198 from the admixed clade 2B. These gene clusters are very similar to those previously identified in L. sakei strain 23K (Chaillou et al. 2005), which were hypothesized to be important for growth fitness in meat. It is therefore interesting to observe that such cell surface complexes are almost absent from strains in lineage 2, which were not isolated from fresh meat.

A Wide Repertoire of Phosphotransferase Transport Systems for Sugar and Polyol Uptake

The accessory genome of the thirteen L. curvatus strains contains >30 phosphotransferase systems (PTSs) and 10 other systems dedicated to carbohydrate uptake. Altogether, these account for 240 genes (30.8% of the accessory genome shown in fig. 3). Interestingly, it seems that the repertoire of PTS gene clusters enables the utilization of a wide range of carbohydrates and polyols from plants (vegetables, fruits, cereals) and insect/microbe-derived sugar polymers. In particular, a significant proportion of these gene clusters encode uptake systems for the utilization of α/β-glucan, α/β-fructan, and N-acetylglucosan. An overview of these systems with respect to the plant carbohydrates that they transport is shown in figure 4 and details about the gene content of these clusters can be found in supplementary table S1, Supplementary Material online.

Systems for α and β-Glucans

We found at least three different systems for maltose utilization. Two of these use starch and maltodextrins via the intracellular α-amylase pathway and the maltose phosphorylase pathway, both of which are linked to an ABC transporter (map gene clusters N° 1 and N° 2). Interestingly, these two gene clusters have a slightly different gene synteny (gene mapG encoding a hypothetical protein is absent from cluster N° 1 and gene mapL1 encoding an oligo-α-1,6-glucosidase is absent from cluster N° 2). Furthermore, the encoded proteins were not considered to be orthologs at a threshold of 80% similarity, which indicates that they have different phylogenetic origins. One gene cluster was present in all strains of lineage 1 (strains FLEC03, RI-406, and MRS6), but was also found in strains KG6 and FBA2 of lineage 2; it shares a high level of identity (75%) with homologs in Lactobacillus alimentarius. Conversely, the second gene cluster was found only in strains of lineage 2, with a high level of identity (70%) to homologs in Pediococcus pentosaceus. The third system for maltose utilization (cluster N° 3), present only in strains NRIC0822 and MRS6, was the mal PTS, which was coupled with the malA gene that encodes 6-phospho-α-glucosidase.

FIG. 3. Heatmap showing the clustering analysis of Lactobacillus curvatus strains based on the content of their accessory genomes. Unsupervised complete linkage clustering of L. curvatus strains based on the presence (orange) or absence (blue) of 901 orthologs that constitute the L. curvatus accessory genome (without mobile and cloud genomes). Names of strains are colored according to their phylogenomic clade and lineage as shown in figure 1. The five main groups of orthologs prevalent among the strains are indicated above the heatmap and inside the clustering tree (from groups A to E). Similarly, gene prevalence groups are colored based on their specificity to each of the phylogenomic clades. Eight gene clusters representative of each group are boxed with dashed lines: CSC clusters (Cell Surface Complexes); pdu cluster (propanediol catabolic pathway); map1 and map2 clusters (lineage-specific maltose phosphorylase pathways); srl cluster (sorbitol phosphotransferase system); fli, mot, and che clusters (motility operons); ula cluster (ascorbate catabolic pathway).

Ten of the thirteen strains also contained genes coding for the tre PTS (cluster N° 4), which would enable them to use the α-glucan-derived disaccharide trehalose; the three strains that
lacked this system were RI-124, RI-193, and RI-198 from the admixed clade 2B. There was much more redundancy in PTSs for the use of β-glucans, in which there was also extensive variation among strains. These systems included a lic PTS (cluster N° 5) for catabolizing lichenan (barley glucan) and several bgl-like PTSs (up to three systems within strains of lineage 1 from cluster N° 06a to N° 06c, although some of these may not be complete). However, one peculiar bgl system (cluster N° 6c) in strain RI-406 also encoded an α-xylosidase (xylQ), indicating that this cluster might be involved in the degradation of xyloglucan (plant hemicellulose) (Chaillou et al. 1998). We found evidence that all strains are able to take up glucose and fructose with the manXYZ and fruKRI PTSs, respectively, but strains FLEC03 and RI-406 from lineage 1 also had an additional copy of the manXYZ PTS (cluster N° 7).

Systems for α and β-Fructans

Similarly, it appeared that some strains from lineage 2 are able to use sucrose through two different pathways, one involving the sucrose src PTS and sucrose-6-phosphate hydrolase pathway (clusters N° 08a and 08b), and the other involving a symport system coupled with the catabolism of bacterial levan (β-fructan) by the lev PTS (identified in strains FBA2 and Wikim52) (cluster N° 12). Another PTS for fructose utilization was identified in strains FLEC03 and RI-406 from lineage 1 and strains RI-193, RI-198, and NRIC0822 from lineage 2; specifically, we detected the frl gene cluster (cluster N° 10), which encodes a fructose-lysine deglycation pathway (Wiame et al. 2005). This molecule can be abundant in plant fluids, arising spontaneously via condensation of the sugar and the amino acid when both are present in high concentrations (Bilova et al. 2016). The frl PTS cluster is associated with a fructose 6P-lysine deglycase (frlF), which releases fructose 6P and lysine. The expression of this gene cluster might be controlled by an accessory σ54-like transcriptional factor, which would indicate that the bacteria might be able to sense this compound in the environment (Francke et al. 2011). The genomes of strains NRIC0822 and Wikim52 contain a second putative glycation PTS (grl gene cluster N° 09). It should be noted that two pathways for the catabolism of lysine into cadaverine were found (cluster N° 11b): a pyridoxal-dependent lysine decarboxylase (tdcA) was present in most strains (except strains DSM20019 and Wikim52), while five strains also contained the lysine decarboxylase complex encoded by the cad gene cluster (cluster N° 11a).

Systems for Polyols

Plant fluids are rich in polyols and vitamin C (ascorbic acid). Lactobacillus curvatus strains, in particular those in lineage 2, have acquired multiple PTSs that are specific for these compounds, including a catabolic ula PTS pathway for ascorbic acid (Yew and Gerlt 2002) (cluster N° 21), a catabolic srl PTS pathway for sorbitol (Alcantara 2008) (cluster N° 22), and a catabolic gat PTS pathway for glucitol/galactitol (clusters N° 19 and N° 20). Again, strains from lineage 1 and those of the admixed clade 2B had gat gene clusters (Nobelmann and Lengeler 1996) that differ in sequence homology and origin from those of strains from lineage 2. Similarly, strains of lineage 1 could also be distinguished from those of lineage 2 by the presence of a pathway that is quite uncommon in lactic acid bacteria (cluster N° 18): a coenzyme B(12)-dependent catabolic pathway (pdu gene cluster) for the utilization of 1,2-propanediol (Bobik et al. 1999). This compound is produced by the fermentation of the common plant sugars rhamnose and fucose, and its catabolism creates propionate and propanol as end products.

FIG. 4. Overview of Lactobacillus curvatus accessory gene repertoire involved in fermentation of plant-derived carbohydrates. Carbohydrates are grouped (external ellipses) according to their type (glucan and fructans) or origin (plants, cereals, and insects). Each uptake and catabolic system is represented by a circle whose size depicts the degree of conservation among the L. curvatus strains. The clusters are numbered (small yellow circles) to facilitate their identification using supplementary file S1, Supplementary Material online. The inner circles illustrate the fate of these carbohydrates: into glucose 6P, fructose 6P, N-acetyl glucosamine 6P, or dihydroxy-acetone P (DHAP).

Other Systems

The differences between the two lineages of L. curvatus or between some groups of strains go beyond catabolic pathways for plant-derived carbohydrates. For instance, some strains in lineage 2 (CRL705, DSM20019, and Wikim38) have an rbsUDKR gene cluster for ribose catabolism that encodes a ribose transporter rbsU and is similar to that of strains of L. sakei (Stentz and Zagorec 1999). Instead, all other strains have an rbsABCDKR gene cluster which encodes an ABC transporter rbsABC. Another example of divergence between the two lineages was also found in the catabolism of N-acetylglucosamine and N-acetylmuramic acid. Whereas most strains (with the exception of FLEC03) harbored the nagC PTS and the nagA gene (cluster N° 14), which encodes the N-acetylglucosamine 6-phosphate deacetylase (Vogler and Lengeler 1989), strains CRL705, DSM20019, and KG6 had an additional mur PTS together with the murQ gene (cluster N° 16), which encodes the D-lactyl ether N-acetylmuramic acid 6-phosphate etherase (Dahl et al. 2004) for the catabolism of N-acetylmurein. Strains from lineage 1 and clade 2B had another unique variation, with what might be a chi gene cluster together with chiK (cluster N° 15); the chi cluster is involved in chitobiose uptake and catabolism, and chiK encodes an N-acetylglucosamine kinase (Plumbridge and Pellegrini 2004). More surprisingly, strains DSM20019 and MRS6 harbored a mng gene cluster (cluster N° 13) similar to that of E. coli, which has been shown to be involved in 2-O-α-mannosyl-D-glycerate PTS-dependent uptake and catabolism (mngB encodes an α-mannosidase; Sampaio et al. 2004). This unusual carbohydrate is known to be abundant in hyperthermophilic prokaryotes, in which it acts as an osmoprotectant. Finally, three strains (FBA2, Wikim38, and RI-406) possessed the cit gene cluster (cluster N° 17), which is involved in the decarboxylation of citrate to acetate and pyruvate, a common catabolic pathway in Leuconostoc (Marty-Teysset et al. 1996). Together, these observations suggest that L. curvatus strains may also

FIG. 5. Barplots showing the number of phage-related genes and transposase families in the Lactobacillus curvatus strains. The barplots indicate the numbers of phage genes (green) or transposase families (red) identified in each strain. The red shaded area depicts strains from Asian-type foods, in which a higher number of transposase families was found.

mobile elements predominate among the genes that are not captured in fragmented Illumina-based genome assemblies. The set of four Asian L. curvatus genomes (FBA2, Wikim38, Wikim52, NRIC0822; table 1) is enriched in complete genomes (3 out of 4) and this observation might explain the higher number of transposase families in these strains. Between one and three putative prophages were identified in each genome in this group. All prophages were predicted to be noncontractile tail phages of the family Siphoviridae of order Caudovirales, and their size was between 31 and 42 kb, both characteristics frequently found among prophages of genus Lactobacillus (Mahony and van Sinderen 2014). There was little sequence similarity between the different phage genomes, each phage being unique to one strain, and this explained the large contribution of phages to the size of the mobile genome. It is interesting to note that strain DSM20019 was the richest in prophage content; this strain was isolated from milk, where the concentration of phages is high (from 10^1 to 10^4 phages per milliliter) (Marco et al. 2012).
Instead, strain CRL705, which might have been asso- thrive in natural environments where microbial and in- ciated with a milk environment in the past because of the sect cell-wall polymeres can be scavenged as well as presence of a lactose PTS cluster in its genome, only contains those derived from plants, environments such as silage remnants of prophages. However, strain CRL705 was unique or compost heaps. in possessing two CRISPR/cas systems (Clustered Regulatory Short Palindromic Repeats; Deveau et al. 2010); in addition to Mobile Genome Shows Important Variations between the type II system (cas9 gene) present in all strains, CRL705 Strains also had a type I system (cas3). These two clusters might The mobile genome differs greatly between strains of the two provide strain CRL705 with a stronger immunity against lineages (fig. 5). Strains from clade 2A, and in particular those phages. Finally, as it could be expected from the weak as- isolated from Asian types of food, had a broader range of sembly performance of repetitive regions using short-read transposase families (up to 20 in strain FAB2, see fig. 5), in- sequencing, the CRISPR spacer regions were largely incom- dicating that gene transfer may occur more frequently in plete in draft genomes and no conclusive information could these types of foods than in meat or in environmental silage. be extracted on the possible history of the strains versus However, Schmid et al. (2018) pointed out recently that phages encounter. Genome Biol. Evol. 10(6):1516–1525 doi:10.1093/gbe/evy106 Advance Access publication May 29, 2018 1523 Downloaded from https://academic.oup.com/gbe/article-abstract/10/6/1516/5020733 by Ed 'DeepDyve' Gillespie user on 20 June 2018 Number of orthologous CDS Teran et al. GBE Motility Literature Cited Alcantara C. 2008. 
Regulation of Lactobacillus casei sorbitol utilization The mobility operon, which comprises the fli (flagellar structral genes requires DNA-binding transcriptional activator GutR and the complex), mot (flagellar motor complex), and che (chemotaxis conserved protein GutM. Appl Environ Microbiol. 74(18):5731–5740. regulatory complex) gene clusters, was initially characterized Bilova T, et al. 2016. A snapshot of the plant glycated proteome: structural, in strain NRIC0822 (Cousin 2015). However, we also identi- functional, and mechanistic aspects. J Biol Chem. fied it in the closely related strain Wikim52, suggesting 291(14):7621–7636. Bobik TA, Havemann GD, Busch RJ, Williams DS, Aldrich HC. 1999. The that this feature might not be so unusual among strains of propanediol utilization (pdu) operon of Salmonella enterica serovar L. curvatus. Typhimurium LT2 includes genes necessary for formation of polyhedral organelles involved in coenzyme B(12)-dependent 1,2-propanediol degradation. J Bacteriol. 181(19):5967–5975. Conclusion Brinster S, Furlan S, Serror P. 2007. C-terminal WxL domain mediates cell Our results showed that, as a species, L. curvatus is divided wall binding in Enterococcus faecalis and other gram-positive bacteria. J Bacteriol. 189(4):1244–1253. into two ancestral phylogenetic lineages. The traces of this Bulgasem BY, Lani MN, Hassan Z, Wan Yusoff WM, Fnaish SG. 2016. evolutionary path are not only present in the allele fre- Antifungal activity of lactic acid bacteria strains isolated from natural quencies of the core genes but also in the origin and struc- honey against pathogenicCandida species. Mycobiology 44(4):302–309. ture of some conserved metabolic gene clusters (i.e., Caspi R, et al. 2016. The MetaCyc database of metabolic pathways and ribose, maltose, galactitol). The degree of variation pre- enzymes and the BioCyc collection of pathway/genome databases. Nucleic Acids Res. 44(D1):D471–D480. 
sent in these systems suggests that the two lineages result Chaillou S, et al. 1998. Cloning, sequence analysis, and characterization of from different evolution mechanisms ending up to this the genes involved in isoprimeverose metabolism in Lactobacillus pen- repertoire. Furthermore, our work demonstrates that the tosus. J Bacteriol. 180(9):2312–2320. lifestyle and the ecological niche of the strains has a strong Chaillou S, et al. 2005. The complete genome sequence of the meat-borne influence on the gene content of the accessory genome, lactic acid bacterium Lactobacillus sakei 23K. Nat Biotechnol. 23(12):1527–1533. which has led to convergence between strains from line- Chaillou S, et al. 2015. Origin and ecological selection of core and food- age 1 and those of clade 2B from lineage 2. Lactobacillus specific bacterial communities associated with meat and seafood spoil- curvatus pangenome has revealed a wide repertoire of age. ISME J. 9(5):1105–1118. genes for catabolizing plant-derived carbohydrates, and Chaillou S, Lucquin I, Najjari A, Zagorec M, Champomier-Verge`s M-C. this capacity is representing a major difference with the 2013. Population genetics of Lactobacillus sakei reveals three lineages with distinct evolutionary histories. PLoS One 8(9):e73253. closely related species L. sakei. Finally, an in-depth analysis Cousin FJ. 2015. Detection and genomic characterization of motility in of the L. curvatus accessory genome has led us to con- Lactobacillus curvatus: confirmation of motility in a species outside clude that, in addition to living in fermented foods made the Lactobacillus salivarius clade. Appl Environ Microbiol. 81(4): of vegetables or meat, the species must also thrive in an 1297–1308. ecological niche where decaying plants, insects, and bac- Dahl U, Jaeger T, Nguyen BT, Sattler JM, Mayer C. 2004. 
Identification of a phosphotransferase system of Escherichia coli required for growth on teria are present in large amounts, for example, in silage N-acetylmuramic acid. J Bacteriol. 186(8):2385–2392. or compost heaps. Dal Bello F, Walter J, Hammes WP, Hertel C. 2003. Increased complexity of the species composition of lactic acid bacteria in human feces revealed by alternative incubation condition. Microb Ecol. 45(4):455–463. Supplementary Material Darling AE, Mau B, Perna NT. 2010. progressiveMauve: multiple genome Supplementary data areavailableat Genome Biology and alignment with gene gain, loss and rearrangement. PLoS One 5(6):e11147. Evolution online. Deveau H, Garneau JE, Moineau S. 2010. CRISPR/Cas system and its role in phage-bacteria interactions. Annu Rev Microbiol. 64(1):475–493. Didelot X, Falush D. 2006. Inference of bacterial microevolution using multilocus sequence data. Genetics 175(3):1251–1266. Acknowledgments Doolittle WF, Papke RT. 2006. Genomics and the bacterial species prob- lem. Genome Biol. 7(9):116. L.C.T. was the recipient of a PhD fellowship from the Falush D, Stephens M, Pritchard JK. 2003. Inference of population struc- BecAR Program and Campus France Argentine. The ture using multilocus genotype data: linked loci and correlated allele LABGeM (CEA/IG/Genoscope and CNRS UMR 8030) and frequencies. Genetics 164(4):1567–1587. theFranceGenomique National infrastructure (funded as Francke C, et al. 2011. Comparative analyses imply that the enigmatic part of Investissement d’avenir program managed by Sigma factor 54 is a central controller of the bacterial exterior. BMC Genomics 12(1):385. Agence Nationale pour la Recherche contract no. ANR- Hammes WP, Bantleon A, Min S. 1990. Lactic acid bacteria in meat fer- 10-INBS-09) are acknowledged for support within the mentation. FEMS Microbiol Rev. 87(1–2):165–174. MicroScope annotation platform. We are grateful to the Hammes WP, Hertel C. 1998. New developments in meat starter cultures. 
INRA MIGALE bioinformatics platform (http://migale.jouy. Meat Sci. 49:S125–S138. inra.fr) for providing computational resources and data Hammes WP, Knauf HJ. 1994. Starters in the processing of meat products. Meat Sci. 36(1–2):155–168. storage. 1524 Genome Biol. Evol. 10(6):1516–1525 doi:10.1093/gbe/evy106 Advance Access publication May 29, 2018 Downloaded from https://academic.oup.com/gbe/article-abstract/10/6/1516/5020733 by Ed 'DeepDyve' Gillespie user on 20 June 2018 Phylogenomic Analysis of Lactobacillus curvatus Reveals Two Lineages GBE Hebert EM, et al. 2012. Genome sequence of the bacteriocin-producing Sampaio M-M, et al. 2004. Phosphotransferase-mediated transport of the Lactobacillus curvatus strain CRL705. J Bacteriol. 194(2):538–539. osmolyte 2-O-alpha-mannosyl-D-glycerate in Escherichia coli occurs by Inglin RC, Meile L, Stevens MJA. 2017. Draft genome sequences of 43 the product of the mngA (hrsA) gene and is regulated by the mngR Lactobacillus strains from the species L. curvatus, L. fermentum, L. (farR) gene product acting as repressor. J Biol Chem. paracasei, L. plantarum, L. rhamnosus,and L. sakei, isolated from 279(7):5537–5548. food products. Genome Announc. 5(30):e00632-17–e00617. Schmid M, et al. 2018. Comparative genomics of completely sequenced Jans C, Lagler S, Lacroix C, Meile L, Stevens MJA. 2017. Complete genome Lactobacillus helveticus genomes provides insights into strain-specific sequences of Lactobacillus curvatus KG6, L.curvatus MRS6, and genes and resolves metagenomics data down to the strain level. Front Lactobacillus sakei FAM18311, isolated from fermented meat prod- Microbiol. 9:63. ucts. Genome Announc. 5(38):e00915-17. Siezen R, et al. 2006. Lactobacillus plantarum gene clusters encoding pu- Jung JY, et al. 2011. Metagenomic analysis of kimchi, a traditional Korean tative cell-surface protein complexes for carbohydrate utilization are fermented food. Appl Environ Microbiol. 77:2264–2274. 
conserved in specific gram-positive bacteria. BMC Genomics 7:126. Koleva Z, et al. 2014. Lactic acid microflora of the gut of snail Cornu Stentz R, Zagorec M. 1999. Ribose utilization in Lactobacillus sakei:analysis aspersum. Biotechnol Biotechnol Equip. 28(4):627–634. of the regulation of the rbs operon and putative involvement of a new Kask S, et al. 2003. Physiological properties of Lactobacillus paracasei, L. transporter. J Mol Microbiol Biotechnol. 1:165–173. danicus and L. curvatus strains isolated from Estonian semi-hard Sun Z, et al. 2015. Expanding the biotechnology potential of Lactobacilli cheese. Food Res Int. 36(9–10):1037–1046. through comparative genomics of 213 strains and associated genera. Lee SH, Jung MY, Song J-H, Lee M, Chang JY. 2017. Complete genome Nat Commun. 6(1):8322. sequence of Lactobacillus curvatus strain WiKim38 isolated from Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. 2013. MEGA6: Kimchi. Genome Announc. 5(18):e00273-17. molecular evolutionary genetics analysis version 6.0. Mol Biol Evol. Li W, Godzik A. 2006. Cd-hit: a fast program for clustering and comparing 30(12):2725–2729. large sets of protein or nucleotide sequences. Bioinformatics Ter an LC, Coeuret G, Raya R, Champomier-Verge ` s M-C, Chaillou S. 2017. 22(13):1658–1659. Draft genome sequence of Lactobacillus curvatus FLEC03, a meat- Lucquin I, Zagorec M, Champomier-Verges M, Chaillou S. 2012. borne isolate from beef carpaccio packaged in a modified atmo- Fingerprint of lactic acid bacteria population in beef carpaccio is influ- sphere. Genome Announc. 5(26):e00584-17. enced by storage process and seasonal changes. Food Microbiol. Tohno M, Kobayashi H, Nomura M, Uegaki R, Cai Y. 2012. Identification 29(2):187–196. and characterization of lactic acid bacteria isolated from mixed pasture Lyhs U, Bjo ¨ rkroth JK. 2008. Lactobacillus sakei/curvatus is the prevailing of timothy and orchardgrass, and its badly preserved silage. Anim Sci J. 
lactic acid bacterium group in spoiled maatjes herring. Food Microbiol. 83(4):318–330. 25(3):529–533. Vallenet D, et al. 2013. MicroScope–an integrated microbial resource for Lyhs U, Korkeala H, Bjo ¨ rkroth J. 2002. Identification of lactic acid bacteria the curation and comparative analysis of genomic and metabolic data. from spoiled, vacuum-packaged ‘gravad’ rainbow trout using ribotyp- Nucleic Acids Res. 41(D1):D636–D647. ing. Int J Food Microbiol. 72(1–2):147–153. Vogel RF, Lohmann M, Nguyen M, Weller AN, Hammes WP. 1993. Mahony J, van Sinderen D. 2014. Current taxonomy of phages infecting Molecular characterization of Lactobacillus curvatus and Lact. sake lactic acid bacteria. Front Microbiol. 5:7. isolated from sauerkraut and their application in sausage fermenta- Marco MB, Moineau S, Quiberoni A. 2012. Bacteriophages and dairy tions. J Appl Bacteriol. 74(3):295–300. fermentations. Bacteriophage 2(3):149–158. Vogler AP, Lengeler JW. 1989. Analysis of the nag regulon from Marty-Teysset C, et al. 1996. Proton motive force generation by citrolactic Escherichia coli K12 and Klebsiella pneumoniae and of its regulation. fermentation in Leuconostoc mesenteroides. J Bacteriol. Mol Gen Genet. 219(1–2):97–105. 178(8):2178–2185. Vos M, Didelot X. 2009. A comparison of homologous recombination Michel E, et al. 2016. Characterization of relative abundance of lactic acid rates in bacteria and archaea. ISME J. 3(2):199–208. bacteria species in French organic sourdough by cultural, qPCR and Wiame E, Lamosa P, Santos H, Van Schaftingen E. 2005. Identification of MiSeq high-throughput sequencing methods. Int J Food Microbiol. glucoselysine-6-phosphate deglycase, an enzyme involved in the me- 239:35–43. tabolism of the fructation product glucoselysine. Biochem J. Nakano K, et al. 2016. First complete genome sequence of the skin- 392(2):263–269. improving Lactobacillus curvatus strain FBA2, isolated from fermented Yew WS, Gerlt JA. 2002. 
Utilization of L-ascorbate by Escherichia coli K-12: vegetables, determined by PacBio single-molecule real-time technol- assignments of functions to products of the yjf-sga and yia-sgb oper- ogy. Genome Announc. 4(5):e00884-16. ons. J Bacteriol. 184(1):302–306. Nobelmann B, Lengeler JW. 1996. Molecular analysis of the gat genes Zheng J, Ruan L, Sun M, Ganzle M. 2015. A genomic view of Lactobacilli from Escherichia coli and of their roles in galactitol transport and me- and Pediococci demonstrates that phylogeny matches ecology and tabolism. J Bacteriol. 178(23):6790–6795. physiology. Appl Environ Microbiol. 81(20):7233–7243. Plumbridge J, Pellegrini O. 2004. Expression of the chitobiose operon of Zhou Y, Drouin P, Lafrenie ` re C. 2016. Effect of temperature (5-25 C) on Escherichia coli is regulated by three transcription factors: nagC, ChbR epiphytic lactic acid bacteria populations and fermentation of whole- and CAP. Mol Microbiol. 52(2):437–449. plant corn silage. J Appl Microbiol. 121(3):657–671. Pritchard JK, Stephens M, Donnelly P. 2000. Inference of population struc- Zommiti M, Connil N, Hamida JB, Ferchichi M. 2017. Probiotic character- ture using multilocus genotype data. Genetics 155(2):945–959. istics of Lactobacillus curvatus DN317, a strain isolated from chicken R Development Core Team. 2010. R: a language and environment for ceca. Probiotics Antimicrob Proteins. 9(4):415–424. statistical computing. Vienna (Austria): R Foundation for Statistical Computing. Associate editor:Esther Angert Genome Biol. Evol. 10(6):1516–1525 doi:10.1093/gbe/evy106 Advance Access publication May 29, 2018 1525 Downloaded from https://academic.oup.com/gbe/article-abstract/10/6/1516/5020733 by Ed 'DeepDyve' Gillespie user on 20 June 2018 http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Genome Biology and Evolution Oxford University Press

Phylogenomic Analysis of Lactobacillus curvatus Reveals Two Lineages Distinguished by Genes for Fermenting Plant-Derived Carbohydrates

Publisher
Oxford University Press
Copyright
© The Author(s) 2018. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
ISSN
1759-6653
eISSN
1759-6653
D.O.I.
10.1093/gbe/evy106

Abstract

Lactobacillus curvatus is a lactic acid bacterium encountered in many different types of fermented food (meat, seafood, vegetables, and cereals). Although this species plays an important role in the preservation of these foods, few attempts have been made to assess its genomic diversity. This study uses comparative analyses of 13 published genomes (complete or draft) to better understand the evolutionary processes acting on the genome of this species. Phylogenomic analysis, based on a coalescent model of evolution, revealed that the 6,742 sites of single nucleotide polymorphism within the L. curvatus core genome delineate two major groups, with lineage 1 represented by the newly sequenced strain FLEC03, and lineage 2 represented by the type-strain DSM20019. The two lineages could also be distinguished by the content of their accessory genome, which sheds light on a long-term evolutionary process of lineage-dependent genetic acquisition and the possibility of population structure. Interestingly, one clade from lineage 2 shared more accessory genes with strains of lineage 1 than with other strains of lineage 2, indicating recent convergence in carbohydrate catabolism. Both lineages had a wide repertoire of accessory genes involved in the fermentation of plant-derived carbohydrates that are released from polymers of a/b-glucans, a/b-fructans, and N-acetylglucosan. Other gene clusters were distributed among strains according to the type of food from which the strains were isolated. These results give new insight into the ecological niches in which L. curvatus may naturally thrive (such as silage or compost heaps) in addition to fermented food. Key words: pangenome, Lactobacillus curvatus, plant fermentation, food, lactic acid bacteria, phylogenomic. Introduction or from the environmental fermentation process of corn or Lactobacillus curvatus is a facultative heterofermentative lactic grass silage (Tohno et al. 2012; Zhou et al. 2016). 
These obser- acid bacterium, commonly associated with fermented and vations suggest that L. curvatus is ubiquitous in lactic acid fer- vacuum-packaged refrigerated meat and fish products mentation and that foods of vegetable origins are a common (Hammes et al. 1990; Hammes and Knauf 1994; Hammes environment for this species. Based on this, it is perhaps not and Hertel 1998; Lyhs et al. 2002; Lyhs and Bjo ¨ rkroth 2008; surprising that L. curvatus has also been identified in the feces Lucquin et al. 2012; Chaillou et al. 2015). In addition, this spe- or gut of many animal species that feed on plants or cereals, cies has also been isolated from dairy products such as milk and including snails (Koleva et al. 2014), chickens (Zommiti et al. cheese (Kask et al. 2003). More recently, many studies have 2017), and humans (Dal Bello et al. 2003). identified L. curvatus in fermented plant products like sauer- Lactobacillus curvatus belongs to the Lactobacillus sakei kraut (Vogel et al. 1993), sourdough (Michel et al. 2016), radish clade of psychrotrophic Lactobacillus, which comprises pickles (Nakano et al. 2016), and kimchi (Jung et al. 2011); in four species: Lactobacillus sakei, Lactobacillus fuchuensis, other plant-derived materials like honey (Bulgasem et al. 2016); Lactobacillus graminis,and L. curvatus (Sun et al. 2015; The Author(s) 2018. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com 1516 Genome Biol. Evol. 10(6):1516–1525. 
doi:10.1093/gbe/evy106 Advance Access publication May 29, 2018 Downloaded from https://academic.oup.com/gbe/article-abstract/10/6/1516/5020733 by Ed 'DeepDyve' Gillespie user on 20 June 2018 Phylogenomic Analysis of Lactobacillus curvatus Reveals Two Lineages GBE Zheng et al. 2015). To date, thirteen L. curvatus genomes are The Euclidean distance based on presence/absence was available, of which five are complete and eight are draft used to calculate the distance matrix between the variable (Hebert et al. 2012; Cousin 2015; Nakano et al. 2016; Inglin genomes, and clustering was performed by unsupervised et al. 2017; Jans et al. 2017; Lee et al. 2017; Teran et al. 2017). complete linkage. Some of these strains were sequenced to highlight their ability to produce multiple bacteriocins, like strain CRL705, which was Evolutionary and Phylogenomic Analysis isolated from an Argentinean artisanal dry sausage (Hebert The alignment of the nucleotide genome sequences was per- et al. 2012), or because of surprising features of flagellum- formed using PROGRESSIVEMAUVE (Darling et al. 2010); from mediated motility, like the Japanese strain NRIC0822, which this, we extracted the core alignment by keeping only the was isolated from sushi (Cousin 2015). Besides these few regions where all genomes aligned over at least 500 bp. This examples, very little is known about the intraspecies genomic core alignment was submitted to five independent runs of repertoire of L. curvatus strains. In particular, improved knowl- CLONALFRAME software (Didelot and Falush 2006; Vos and edge of this species’ genome could help to distinguish it from Didelot 2009), which consisted of a burn-in cycle of the MCMC the closely related species L. sakei, which is knowntobe sep- (Markov Chain Monte Carlo) algorithm fixed to 50,000 itera- arated into three phylogenic lineages (Chaillou et al. 2013). tions and a posterior sampling of 50,000 iterations. 
The prior Several recent publications have reported the genome se- iterations were discarded and model parameters were sampled quencing of strains of L. curvatus isolated from various food in the second period of the run every 50 iterations, resulting in products. We took advantage of these resources to perform a 1,000 samples from the posterior. Satisfactory convergence of detailed phylogenomic and pangenomic analysis of L. curva- the MCMC algorithm in the different runs was estimated based tus as a species. In order to improve our understanding of the on the Gelman-Rubin statistic calculated in CLONALFRAME. evolution and population structure of the 13 strains studied, The genealogy of the population was summarized and the ro- we performed multiple, complementary analyses. These in- bustness of the tree topology was evaluated by concatenating cluded an evolutionary analysis of the core genome, which the posterior samples of the five runs to build a 50% majority revealed the existence of two lineages, and an in-depth com- rule consensus tree using the CLONALFRAME GUI and MEGA6 parison of the biological functions encoded in the accessory software (Tamura et al. 2013). From these runs, several meas- genome, which highlighted the strong relationship of this urements were also taken, such as q/h (relative frequencies of species with different plant-based environments. occurrence of recombination and mutation) and r/m (relative impact of recombination and mutation in the diversification of Materials and Methods the lineages). A Bayesian approach, implemented in STRUCTURE software version 2.3 (Pritchard et al. 2000; Genome Data and Curation of Annotations Falush et al. 2003), wasusedtoinfer the lineage ancestry of Our data set consisted of 13 genomes of L. curvatus strains, of the core genome by assuming that each strain derived all of its which five were complete and eight were draft versions, all SNPs from one of the K ancestral subpopulations. 
The number available from the Genbank/EMBL databases (table 1). All of populations, K, was determined under the linkage model. genomes were downloaded to the MAGE annotation plat- Five individual runs per value of K (chosentobe 2 or 3) were form (Vallenet et al. 2013) and strain-specific genes were all performed using 50,000 burn-in iterations and 50,000 sam- manually curated in order to standardize the comparative pling iterations. analysis. Metabolic pathways were reconstructed using the METACYC database (Caspi et al. 2016) or from the literature Results and Discussion when indicated. Evolutionary Analysis Reveals the Existence of Two Pangenome Analysis and Clusters of Orthologous Genes Lineages Within L. curvatus The composition of the core and variable genomes was cal- The phylogeny of L. curvatus was inferred using two different culated using a pairwise estimation of orthologous proteins in approaches. The first strategy was based on Bayesian infer- CDhit (Li and Godzik 2006) at a threshold of 80% identity on ence with the coalescent model implemented in 80% of the protein’s total length. We then modeled the pro- CLONALFRAME software (Didelot and Falush 2006) whereas gression of pangenome size with respect to the number of the second strategy was to statistically estimate the probable genomes included by randomly picking genomes and iterat- number of ancestral subpopulations (K) within the genetic ing the process 13 times, as described on the MAGE annota- population of strains; this was performed using STRUCTURE tion platform (Vallenet et al. 2013). R statistical software with the linkage model (Pritchard et al. 2000; Falush et al. (R Development Core Team 2010) and the HEATMAP.2 func- 2003). The initial step of these two analysis consisted of the tion of the GPLOT package were used to construct a heatmap selection of the high-quality core genome using based on the variable components of the pangenome. PROGRESSIVEMAUVE software (Darling et al. 
2010), which took into account only the coding sequences (CDS) and aligned regions with no frameshift between the 13 chromosomes. Therefore, all CDS of potential low-sequencing quality (i.e., due to the draft genome sequencing status) were discarded from this analysis. A total of 131 alignment blocks longer than 500 bp were selected. The core aligned genome consisted of 199,762 bp, which contained 6,742 single nucleotide polymorphic loci (SNPs) among the 13 chromosomes. Results are shown in figure 1. From the coalescent tree, we were able to define two separate lineages: lineage 1 comprised the strains FLEC03, MRS6, and RI-406, and lineage 2 comprised the ten other strains. The results of the STRUCTURE analysis confirmed the presence of two populations. Furthermore, this method enabled the characterization of the allele frequencies at each locus and then probabilistically assigned individuals to K (unknown) ancestral populations. For both lineages (when K was set to two), >75% of the genetic contribution to each strain came from its own group. However, we observed less genetic homogeneity in one clade of lineage 2 (named clade 2B), which contained strains RI-124, RI-193, RI-198, and KG6. We therefore investigated if inference to K = 3 ancestral populations would cause this clade to be assigned to a third lineage, a population structure which would be similar to that observed in the closely related species L. sakei (Chaillou et al. 2013). However, there was no statistical support for this hypothesis, suggesting that clade 2B does not originate from a third ancestral population. Therefore, at this stage of the analysis it could only be concluded that strains RI-124, RI-193, RI-198, and KG6 are most likely affiliated to the broader lineage 2 but have some degree of admixture (from 35% to 45%) with lineage 1.

The CLONALFRAME analysis estimated statistically the ρ/θ ratio, a measure of how often recombination events occur relative to neutral genetic drift (mutation). This value was 0.137 (0.127–0.147 at 95% credibility interval), which indicated that the recombination rate is significantly lower than the mutation rate and, therefore, that recombination has played only a minor role in the evolution of the two lineages. Nevertheless, the admixture status of strains RI-124, RI-193, RI-198, and KG6 clearly indicated that recombination events between the two lineages may occur, perhaps when strains from both lineages are in physical proximity such as in solid food.

Genome Biol. Evol. 10(6):1516–1525 doi:10.1093/gbe/evy106 Advance Access publication May 29, 2018

Table 1. List of Lactobacillus curvatus Genome Sequences Used in This Study
Strain | Origin | Chromosome (Mb) | Sequencing Status (Sequencing Technology) | Number of Contigs | Number of CDS | Accession Number | Reference
FBA2 | Radish/Carrot pickles | 1.849 | Complete (PacBio RS II platform) | 1 | 1,718 | CP016028.1 | Nakano et al. (2016)
Wikim38 | Baechu (Chinese Kimchi) | 1.940 | Complete (PacBio RS II platform) | 1 | 1,810 | CP017124.1 | Lee et al. (2017)
Wikim52 | Kimchi | 1.987 | Complete (PacBio RS II platform) | 1 | 1,875 | CP016602.1 | NP
MRS6 | Fermented meat | 2.114 | Complete (PacBio RS II platform) | 1 | 1,935 | CP022474.1 | Jans et al. (2017)
KG6 | Fermented meat | 2.002 | Complete (PacBio RS II platform) | 2 | 1,884 | CP022475.1-76.1 | Jans et al. (2017)
CRL705 | Argentinean dry-sausage | 1.838 | Draft (454 GS Titanium pyrosequencing) | 145 | 1,708 | GCA_000235705.2 | Hebert et al. (2012)
FLEC03 | Vacuum-packed beef carpaccio | 1.902 | Draft (Illumina MiSeq pair-end) | 47 | 1,944 | GCA_900178545.1 | Teran et al. (2017)
DSM20019 | Milk | 1.815 | Draft (Ion Torrent PGM) | 72 | 1,828 | GCA_001311645.1 | NP
NRIC0822 | Kabura zushi | 1.945 | Draft (Illumina HiSeq pair-end) | 144 | 1,831 | GCA_000805355.1 | Cousin (2015)
RI-406 | Meat | 2.001 | Draft (Illumina MiSeq pair-end) | 52 | 1,873 | GCA_001981905.1 | Inglin et al. (2017)
RI-198 | Meat | 1.804 | Draft (Illumina MiSeq pair-end) | 77 | 1,727 | GCA_001981925.1 | Inglin et al. (2017)
RI-193 | Meat | 1.805 | Draft (Illumina MiSeq pair-end) | 82 | 1,727 | GCA_001982045.1 | Inglin et al. (2017)
RI-124 | Meat | 1.810 | Draft (Illumina MiSeq pair-end) | 77 | 1,722 | GCA_001982025.1 | Inglin et al. (2017)
NP, no publication available.

The Accessory Genome Corroborates the Existence of Two Lineages

Another way to assess population structure is to perform a comparative analysis of the variable or accessory genome. Indeed, the splitting of a species into several lineages may arise from positive selection for a given ecological niche and thus for the acquisition of lineage-specific metabolic traits (Doolittle and Papke 2006). The general genome features of the L. curvatus strains, shown in table 1, clearly highlighted some genetic variability (from 1,708 CDS in strain CRL705 to 1,944 CDS in strain FLEC03). We thus compared the 13 strains in terms of gene content by estimating the core shared genome and then evaluating how the accessory genome could contribute to the differentiation of the two lineages. The results, shown in figure 2, first indicated that the core genome is composed of 1,215 clusters of orthologous genes (orthologs) whereas the pangenome contains 3,435 orthologs. It
should be noted that although it has been shown that draft assemblies provide highly relevant insights into pangenomic studies (Sun et al. 2015), the use of these types of assemblies may lead to underestimation of the core genome, the mobile genome, and the strain-specific genes. For this reason, we then relaxed the threshold used to estimate the core genome by including the possibility that a given gene might be absent in one of the thirteen strains, a possibility that could occur due to the draft sequencing of eight out of the 13 strains studied. With these settings, the core genome was estimated to contain 1,407 orthologs. Beyond this core, 414 orthologs formed the mobile genome (IS elements and prophages), with 55 different putative transposase families represented overall. The cloud genome (genes present in only one strain and not from the mobile genome) contained 648 orthologs, which were mainly distributed into three major groups: proteins of unknown function (50.1%), cell-surface or exported proteins (15%), and proteins involved in the production of surface or exported polysaccharides or teichoic acid (20%). The remaining 901 orthologs (when the relaxed threshold was used to estimate the core genome) form part of the accessory genome, in which genes are shared by at least two strains. This group of 901 orthologs from the accessory genome was used for cluster analysis of the strains (fig. 3).

FIG. 1. Phylogenomic clonal genealogy and population structure of Lactobacillus curvatus strains. On the left: Fifty-percent majority rule consensus tree inferred with CLONALFRAME software using the coalescent model (branch lengths are given in coalescent units). All branches are supported by a posterior probability of >95%. Strains are colored according to their lineage affiliation. On the right: Proportion of genetic material derived from each of K subpopulations as inferred by STRUCTURE (linkage model) and assuming K = 2 or 3 populations. Ancestral subpopulations are colored in red (lineage 1), blue (lineage 2), and green (an unlikely lineage 3), respectively. Clade 2B of lineage 2 is colored in dark cyan to highlight its divergence from clade 2A and a strong degree of admixture between the two lineages.

We observed that strains grouped according to their lineage, with the exception of strains from clade 2B; these shared more accessory genes with strains of lineage 1 than with the other strains of their own putative lineage. It is important to remember that the core genome analysis is based on the mutation and recombination rates among SNPs in housekeeping genes and thus reflects a rather long-term evolutionary process. In contrast, the accessory genome analysis is affected to a large degree by horizontal gene transfer, which might be influenced by the lifestyle of the strains and represents a recent and ongoing process of fitness acquisition by the strains. Therefore, the striking finding of a discrepancy between core and accessory genome clustering suggests that strains from clade 2B have recently evolved from lineage 2 through the acquisition of functional traits from lineage 1.

It should be noted that strains FLEC03, RI-406, RI-124, RI-193, RI-198, and MRS6 share a common source of isolation (meat), whereas the other strains were isolated from fermented nonmeat products (except for CRL705). Furthermore, strains isolated from Asian-type food products (sushi and kimchi) formed a closely related subgroup of strains in lineage 2. Therefore, patterns in the accessory genome of L. curvatus suggest that certain traits that affect environmental fitness have been recently acquired. However, it should be acknowledged that a bias in the origin of strains might still exist, since eight out of the thirteen strains sequenced and publicly available are from fresh or fermented meat. It would therefore be valuable to sequence more strains from other sources (vegetables and silage) to validate this conclusion. Based on figure 3, several groups of accessory genes were defined (A to E) according to their frequency in the different strains, and these will be addressed later in the discussion.

FIG. 2. Progression of the core genome and pangenome of Lactobacillus curvatus. Each boxplot represents the pairwise evolution of the core genome (blue) and pangenome (yellow) of clusters of orthologous proteins calculated iteratively as genomes were added to the analysis, for a total of 2–13 genomes. Dashed lines represent the values obtained for the progression of the core genome (using a stringent or relaxed estimation; see text), the pangenome, and for another important step in the estimation of the accessory genome, the shell genome. The shell genome is a more realistic functional estimation of the pangenome that excludes mobile selfish DNA (mobile genome) and unique gene clusters found in only one strain (cloud genome) from the sum of accessory genes. Values marked in the figure: pangenome = 3,435; mobile = 414; cloud = 648; shell genome = 2,308; accessory (relaxed) = 901; core genome (relaxed) = 1,407; core genome (stringent) = 1,215. Axes: number of genomes (x) and number of orthologous CDS (y).

Cell-Surface Complexes as a Major Difference between the Two Lineages

Cell-surface complexes (Cscs) are conserved among Firmicutes and, in particular, in species belonging to the order Lactobacillales. Cscs are multicomponent complexes composed of four types of protein families which differ according to their domains: CscA has a DUF916 domain of unknown function, CscB and CscC contain a WxL1 and WxL2 domain which binds noncovalently to the murein polymer of the cell wall, and CscD contains an LPxTG motif for covalent anchoring to the cell wall (Siezen et al. 2006). Csc components can vary in number and position among clusters and not all of them are necessarily present in one complex. In particular, CscCs are large secreted proteins with putative carbohydrate polymer binding domains that are involved in adhesion and/or carbohydrate scavenging. They are often highly variable between gene clusters and were shown to cross-react with CscA and CscB proteins of noncognate gene clusters (Brinster et al. 2007). Of the accessory orthologous genes investigated here, we found Csc-encoding regions in both groups A and D (see fig. 3). Strains FLEC03, RI-406, and MRS6 of lineage 1 share eight putative Csc clusters, two of which are also shared with strains RI-124, RI-193, and RI-198 from the admixed clade 2B. These gene clusters are very similar to those previously identified in L. sakei strain 23K (Chaillou et al. 2005), which were hypothesized to be important for growth fitness in meat. It is therefore interesting to observe that such cell surface complexes are almost absent from strains in lineage 2, which were not isolated from fresh meat.

FIG. 3. Heatmap showing the clustering analysis of Lactobacillus curvatus strains based on the content of their accessory genomes. Unsupervised complete linkage clustering of L. curvatus strains based on the presence (orange) or absence (blue) of 901 orthologs that constitute the L. curvatus accessory genome (without mobile and cloud genomes). Names of strains are colored according to their phylogenomic clade and lineage as shown in figure 1. The five main groups of orthologs prevalent among the strains are indicated above the heatmap and inside the clustering tree (from groups A to E). Similarly, gene prevalence groups are colored based on their specificity to each of the phylogenomic clades. Eight gene clusters representative of each group are boxed with dashed lines: CSC clusters (Cell Surface Complexes); pdu cluster (propanediol catabolic pathway); map1 and map2 clusters (lineage-specific maltose phosphorylase pathways); srl cluster (sorbitol phosphotransferase system); fli, mot, and che clusters (motility operons); ula cluster (ascorbate catabolic pathway).

A Wide Repertoire of Phosphotransferase Transport Systems for Sugar and Polyol Uptake

The accessory genome of the thirteen L. curvatus strains contains >30 phosphotransferase systems (PTSs) and 10 other systems dedicated to carbohydrate uptake. Altogether, these account for 240 genes (30.8% of the accessory genome shown in fig. 3). Interestingly, it seems that the repertoire of PTS gene clusters enables the utilization of a wide range of carbohydrates and polyols from plants (vegetables, fruits, cereals) and insect/microbe-derived sugar polymers. In particular, a significant proportion of these gene clusters encode uptake systems for the utilization of α/β-glucan, α/β-fructan, and N-acetylglucosan. An overview of these systems with respect to the plant carbohydrates that they transport is shown in figure 4 and details about the gene content of these clusters can be found in supplementary table S1, Supplementary Material online.

Systems for α- and β-Glucans

We found at least three different systems for maltose utilization. Two of these use starch and maltodextrins via the intracellular α-amylase pathway and the maltose phosphorylase pathway, both of which are linked to an ABC transporter (map gene clusters N°1 and N°2). Interestingly, these two gene clusters have a slightly different gene synteny (gene mapG, encoding a hypothetical protein, is absent from cluster N°1, and gene mapL1, encoding an oligo alpha-1,6-glucosidase, is absent from cluster N°2). Furthermore, the encoded proteins were not considered to be orthologs at a threshold of 80% similarity, which indicates that they have different phylogenetic origins. One gene cluster was present in all strains of lineage 1 (strains FLEC03, RI-406, and MRS6), but was also found in strains KG6 and FBA2 of lineage 2; it shares a high level of identity (75%) with homologs in Lactobacillus alimentarius. Conversely, the second gene cluster was found only in strains of lineage 2, with a high level of identity (70%) to homologs in Pediococcus pentosaceus. The third system for maltose utilization (cluster N°3), present only in strains NRIC0822 and MRS6, was the mal PTS, which was coupled with the malA gene that encodes 6-phospho-α-glucosidase.

Ten of the thirteen strains also contained genes coding for the tre PTS (cluster N°4), which would enable them to use the α-glucan-derived disaccharide trehalose; the three strains that
lacked this system were RI-124, RI-193, and RI-198 from the admixed clade 2B. There was much more redundancy in PTSs for the use of β-glucans, in which there was also extensive variation among strains. These systems included a lic PTS (cluster N°5) for catabolizing lichenan (barley glucan) and several bgl-like PTSs (up to three systems within strains of lineage 1 from cluster N°06a to N°06c, although some of these may not be complete). However, one peculiar bgl system (cluster N°6c) in strain RI-406 also encoded an α-xylosidase (xylQ), indicating that this cluster might be involved in the degradation of xyloglucan (plant hemicellulose) (Chaillou et al. 1998). We found evidence that all strains are able to take up glucose and fructose with the manXYZ and fruKRI PTSs, respectively, but strains FLEC03 and RI-406 from lineage 1 also had an additional copy of the manXYZ PTS (cluster N°7).

Systems for α- and β-Fructans

Similarly, it appeared that some strains from lineage 2 are able to use sucrose through two different pathways, one involving the sucrose src PTS and sucrose-6-phosphate hydrolase pathway (cluster N°08a and 08b), and the other involving a symport system coupled with the catabolism of bacterial levan (β-fructan) by the lev PTS (identified in strains FBA2 and Wikim52) (cluster N°12). Another PTS for fructose utilization was identified in strains FLEC03 and RI-406 from lineage 1 and strains RI-193, RI-198, and NRIC0822 from lineage 2; specifically, we detected the frl gene cluster (cluster N°10), which encodes a fructose-lysine deglycation pathway (Wiame et al. 2005). This molecule can be abundant in plant fluids, arising spontaneously via condensation of the sugar and the amino acid when both are present in high concentrations (Bilova et al. 2016). The frl PTS cluster is associated with a fructose 6P-lysine deglycase (frlF), which releases fructose 6P and lysine. The expression of this gene cluster might be controlled by an accessory σ54-like transcriptional factor, which would indicate that the bacteria might be able to sense this compound in the environment (Francke et al. 2011). The genomes of strains NRIC0822 and Wikim52 contain a second putative glycation PTS (grl gene cluster N°09). It should be noted that two pathways for the catabolism of lysine into cadaverine were found (cluster N°11b): a pyridoxal-dependent lysine decarboxylase (tdcA) was present in most strains (except strains DSM20019 and Wikim52), while five strains also contained the lysine decarboxylase complex encoded by the cad gene cluster (cluster N°11a).

FIG. 4. Overview of Lactobacillus curvatus accessory gene repertoire involved in fermentation of plant-derived carbohydrates. Carbohydrates are grouped (external ellipses) according to their type (glucan and fructans) or origin (plants, cereals, and insects). Each uptake and catabolic system is represented by a circle whose size depicts the degree of conservation among the L. curvatus strains. The clusters are numbered (small yellow circles) to facilitate their identification using supplementary file S1, Supplementary Material online. The inner circles illustrate the fate of these carbohydrates: into glucose 6P, fructose 6P, N-acetyl glucosamine 6P, or dihydroxy-acetone P (DHAP).

Systems for Polyols

Plant fluids are rich in polyols and vitamin C (ascorbic acid). Lactobacillus curvatus strains, in particular those in lineage 2, have acquired multiple PTSs that are specific for these compounds, including a catabolic ula PTS pathway for ascorbic acid (Yew and Gerlt 2002) (cluster N°21), a catabolic srl PTS pathway for sorbitol (Alcantara 2008) (cluster N°22), and a catabolic gat PTS pathway for glucitol/galactitol (clusters N°19 and N°20). Again, strains from lineage 1 and those of the admixed clade 2B had gat gene clusters (Nobelmann and Lengeler 1996) that differ in sequence homology and origin from that of strains from lineage 2. Similarly, strains of lineage 1 could also be distinguished from those of lineage 2 by the presence of a pathway that is quite uncommon in lactic acid bacteria (cluster N°18): a coenzyme B(12)-dependent catabolic pathway (pdu gene cluster) for the utilization of 1,2-propanediol (Bobik et al. 1999). This compound is produced by the fermentation of the common plant sugars rhamnose and fucose, and its catabolism creates propionate and propanol as end products.
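The strain-by-strain distribution of the uptake systems described in the sections on glucans and fructans can be tabulated per lineage. The sketch below is a minimal illustration in Python and not the authors' pipeline. It encodes only strain lists that are stated explicitly in the text (the tre PTS, the mal PTS, the frl cluster, and the additional manXYZ copy) and counts, for each group, how many strains carry each cluster. The dictionary layout, the cluster labels, and the function name prevalence_by_group are assumptions introduced for illustration; clade 2A is assumed here to be the six lineage 2 strains that are not in clade 2B.

```python
# Lineage and clade membership. Lineage 1 and clade 2B are listed in the text;
# clade 2A is ASSUMED to be the remaining lineage 2 strains.
GROUPS = {
    "lineage 1": ["FLEC03", "MRS6", "RI-406"],
    "clade 2A": ["NRIC0822", "Wikim38", "FBA2", "DSM20019", "Wikim52", "CRL705"],
    "clade 2B": ["RI-124", "RI-193", "RI-198", "KG6"],
}
ALL_STRAINS = [strain for members in GROUPS.values() for strain in members]

# Strains carrying each cluster, transcribed from the text.
# The labels are hypothetical shorthand, not identifiers used by the authors.
CARRIERS = {
    "tre PTS (cluster 4)": set(ALL_STRAINS) - {"RI-124", "RI-193", "RI-198"},
    "mal PTS (cluster 3)": {"NRIC0822", "MRS6"},
    "frl PTS (cluster 10)": {"FLEC03", "RI-406", "RI-193", "RI-198", "NRIC0822"},
    "extra manXYZ PTS (cluster 7)": {"FLEC03", "RI-406"},
}


def prevalence_by_group(carriers, groups):
    """Return {cluster: {group: (number of carrier strains, group size)}}."""
    return {
        cluster: {
            group: (sum(strain in strains for strain in members), len(members))
            for group, members in groups.items()
        }
        for cluster, strains in carriers.items()
    }


if __name__ == "__main__":
    for cluster, per_group in prevalence_by_group(CARRIERS, GROUPS).items():
        summary = ", ".join(
            f"{group}: {n}/{size}" for group, (n, size) in per_group.items()
        )
        print(f"{cluster:30s} {summary}")
```

With these inputs, the tre PTS is counted in all strains of lineage 1 and clade 2A but in only one of the four clade 2B strains (KG6), which matches the statement that RI-124, RI-193, and RI-198 lack it. Extending the CARRIERS dictionary with further clusters would require the per-strain content given in supplementary table S1.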
FBA2. Wikim38. NRIC0822. Wikim52. RI-124. RI-193. RI-198. FLEC03. RI-406. MRS6. KG6. Phylogenomic Analysis of Lactobacillus curvatus Reveals Two Lineages GBE Other Systems PHAGES The differences between the two lineages of L. curvatus or between some groups of strains go beyond catabolic pathways for plant-derived carbohydrates. For instance, some strains in lineage 2 (CRL705, DSM20019, and Wikim38) have an rbsUDKR gene cluster for ribose catab- TRANSPOSASES olism that encodes a ribose transporter rbsU and is similar to that of strains of L. sakei (Stentz and Zagorec 1999). Instead, all other strains have an rbsABCDKR gene cluster which encodes an ABC transporter rbsABC. Another ex- ample of divergence between the two lineages was also found in the catabolism of N-actetylglucosamine and N- acetylmuramic acid. Whereas most strains (with the ex- FIG.5.—Barplots showing the number of phage-related genes and ception of FLEC03) harbored the nagC PTC and the nagA transposase families in the Lactobacillus curvatus strains. The barplots in- gene (cluster N 14), which encodes the N-acetylglucos- dicate the numbers of phage genes (green) or transposase families (red) amine 6-phosphate deacetylase (Vogler and Lengeler identified in each strain. The red shaded area depicts strains from Asian- 1989), strains CRL705, DSM20019, and KG6 had an ad- type foods, in which a higher number of transposase families was found. ditional mur PTS together with the murQ gene (cluster N 16), which encodes the D-lactyl ether N-acetylmuramic 6-phosphate acid etherase (Dahl et al. 2004)for the ca- mobile elements predominate among the genes that are tabolism of N-acetylmurein. Strains from lineage 1 and not captured in fragmented Illumina-based genome assem- clade 2B had another unique variation, with what might blies. The set of four Asian L. 
curvatus genomes (FBA2, be a chi gene cluster together with chiK (cluster N 15); Wikim38, Wikim52, NRCI0822; table 1) are enriched in com- the chi cluster is involved in chitobiose uptake and catab- plete genomes (3 out of 4) and this observation might explain olism, and chiK encodes an N-acetylglucosamine kinase the higher number of transposase families in these strains. (Plumbridge and Pellegrini 2004). More surprisingly, Between one and three putative prophages were identified strains DSM20019 and MRS6 harbored a mng gene clus- in each genome in this group. All prophages were predicted ter (cluster N 13) similar to that of E. coli, which has been to be noncontratile tail phages of the family Siphoviridae of shownto be involvedin2-O-a-Mannosyl-D-glycerate order Caudoviridae, and their size was between 31 and 42 kb, PTS-dependent uptake and catabolism (mngB encodes both characteristics frequently found among prophages of an a-mannosidase; Sampaio et al. 2004). This unusual car- genus Lactobacillus (Mahony and van Sinderen 2014). There bohydrate is known to be abundant in hyperthermophilic was little sequence similarities between the different phages prokaryotes, in which it acts as an osmoprotectant. genomes, each phage being unique to one strain and this was Finally, three strains—FAB2, Wikim38, and RI-406— explained the large contribution of phages to the important possessed the cit gene cluster (cluster N 17), which is size of the mobile genome. It is interesting to note that strain involved in the decarboxylation of citrate to acetate DSM20019 was the richest in prophage content; this strain and pyruvate, a common catabolic pathway in was isolated from milk, where the concentration of phages is Leuconostoc (Marty-Teysset et al. 1996). Together, these 1 4 high (from 10 to 10 phages per milliliter) (Marcoet al. observations suggest that L. curvatus strains may also 2012). 
Instead, strain CRL705, which might have been asso- thrive in natural environments where microbial and in- ciated with a milk environment in the past because of the sect cell-wall polymeres can be scavenged as well as presence of a lactose PTS cluster in its genome, only contains those derived from plants, environments such as silage remnants of prophages. However, strain CRL705 was unique or compost heaps. in possessing two CRISPR/cas systems (Clustered Regulatory Short Palindromic Repeats; Deveau et al. 2010); in addition to Mobile Genome Shows Important Variations between the type II system (cas9 gene) present in all strains, CRL705 Strains also had a type I system (cas3). These two clusters might The mobile genome differs greatly between strains of the two provide strain CRL705 with a stronger immunity against lineages (fig. 5). Strains from clade 2A, and in particular those phages. Finally, as it could be expected from the weak as- isolated from Asian types of food, had a broader range of sembly performance of repetitive regions using short-read transposase families (up to 20 in strain FAB2, see fig. 5), in- sequencing, the CRISPR spacer regions were largely incom- dicating that gene transfer may occur more frequently in plete in draft genomes and no conclusive information could these types of foods than in meat or in environmental silage. be extracted on the possible history of the strains versus However, Schmid et al. (2018) pointed out recently that phages encounter. Genome Biol. Evol. 10(6):1516–1525 doi:10.1093/gbe/evy106 Advance Access publication May 29, 2018 1523 Downloaded from https://academic.oup.com/gbe/article-abstract/10/6/1516/5020733 by Ed 'DeepDyve' Gillespie user on 20 June 2018 Number of orthologous CDS Teran et al. GBE Motility Literature Cited Alcantara C. 2008. 
Regulation of Lactobacillus casei sorbitol utilization The mobility operon, which comprises the fli (flagellar structral genes requires DNA-binding transcriptional activator GutR and the complex), mot (flagellar motor complex), and che (chemotaxis conserved protein GutM. Appl Environ Microbiol. 74(18):5731–5740. regulatory complex) gene clusters, was initially characterized Bilova T, et al. 2016. A snapshot of the plant glycated proteome: structural, in strain NRIC0822 (Cousin 2015). However, we also identi- functional, and mechanistic aspects. J Biol Chem. fied it in the closely related strain Wikim52, suggesting 291(14):7621–7636. Bobik TA, Havemann GD, Busch RJ, Williams DS, Aldrich HC. 1999. The that this feature might not be so unusual among strains of propanediol utilization (pdu) operon of Salmonella enterica serovar L. curvatus. Typhimurium LT2 includes genes necessary for formation of polyhedral organelles involved in coenzyme B(12)-dependent 1,2-propanediol degradation. J Bacteriol. 181(19):5967–5975. Conclusion Brinster S, Furlan S, Serror P. 2007. C-terminal WxL domain mediates cell Our results showed that, as a species, L. curvatus is divided wall binding in Enterococcus faecalis and other gram-positive bacteria. J Bacteriol. 189(4):1244–1253. into two ancestral phylogenetic lineages. The traces of this Bulgasem BY, Lani MN, Hassan Z, Wan Yusoff WM, Fnaish SG. 2016. evolutionary path are not only present in the allele fre- Antifungal activity of lactic acid bacteria strains isolated from natural quencies of the core genes but also in the origin and struc- honey against pathogenicCandida species. Mycobiology 44(4):302–309. ture of some conserved metabolic gene clusters (i.e., Caspi R, et al. 2016. The MetaCyc database of metabolic pathways and ribose, maltose, galactitol). The degree of variation pre- enzymes and the BioCyc collection of pathway/genome databases. Nucleic Acids Res. 44(D1):D471–D480. 
sent in these systems suggests that the two lineages result Chaillou S, et al. 1998. Cloning, sequence analysis, and characterization of from different evolution mechanisms ending up to this the genes involved in isoprimeverose metabolism in Lactobacillus pen- repertoire. Furthermore, our work demonstrates that the tosus. J Bacteriol. 180(9):2312–2320. lifestyle and the ecological niche of the strains has a strong Chaillou S, et al. 2005. The complete genome sequence of the meat-borne influence on the gene content of the accessory genome, lactic acid bacterium Lactobacillus sakei 23K. Nat Biotechnol. 23(12):1527–1533. which has led to convergence between strains from line- Chaillou S, et al. 2015. Origin and ecological selection of core and food- age 1 and those of clade 2B from lineage 2. Lactobacillus specific bacterial communities associated with meat and seafood spoil- curvatus pangenome has revealed a wide repertoire of age. ISME J. 9(5):1105–1118. genes for catabolizing plant-derived carbohydrates, and Chaillou S, Lucquin I, Najjari A, Zagorec M, Champomier-Verge`s M-C. this capacity is representing a major difference with the 2013. Population genetics of Lactobacillus sakei reveals three lineages with distinct evolutionary histories. PLoS One 8(9):e73253. closely related species L. sakei. Finally, an in-depth analysis Cousin FJ. 2015. Detection and genomic characterization of motility in of the L. curvatus accessory genome has led us to con- Lactobacillus curvatus: confirmation of motility in a species outside clude that, in addition to living in fermented foods made the Lactobacillus salivarius clade. Appl Environ Microbiol. 81(4): of vegetables or meat, the species must also thrive in an 1297–1308. ecological niche where decaying plants, insects, and bac- Dahl U, Jaeger T, Nguyen BT, Sattler JM, Mayer C. 2004. 
Identification of a phosphotransferase system of Escherichia coli required for growth on teria are present in large amounts, for example, in silage N-acetylmuramic acid. J Bacteriol. 186(8):2385–2392. or compost heaps. Dal Bello F, Walter J, Hammes WP, Hertel C. 2003. Increased complexity of the species composition of lactic acid bacteria in human feces revealed by alternative incubation condition. Microb Ecol. 45(4):455–463. Supplementary Material Darling AE, Mau B, Perna NT. 2010. progressiveMauve: multiple genome Supplementary data areavailableat Genome Biology and alignment with gene gain, loss and rearrangement. PLoS One 5(6):e11147. Evolution online. Deveau H, Garneau JE, Moineau S. 2010. CRISPR/Cas system and its role in phage-bacteria interactions. Annu Rev Microbiol. 64(1):475–493. Didelot X, Falush D. 2006. Inference of bacterial microevolution using multilocus sequence data. Genetics 175(3):1251–1266. Acknowledgments Doolittle WF, Papke RT. 2006. Genomics and the bacterial species prob- lem. Genome Biol. 7(9):116. L.C.T. was the recipient of a PhD fellowship from the Falush D, Stephens M, Pritchard JK. 2003. Inference of population struc- BecAR Program and Campus France Argentine. The ture using multilocus genotype data: linked loci and correlated allele LABGeM (CEA/IG/Genoscope and CNRS UMR 8030) and frequencies. Genetics 164(4):1567–1587. theFranceGenomique National infrastructure (funded as Francke C, et al. 2011. Comparative analyses imply that the enigmatic part of Investissement d’avenir program managed by Sigma factor 54 is a central controller of the bacterial exterior. BMC Genomics 12(1):385. Agence Nationale pour la Recherche contract no. ANR- Hammes WP, Bantleon A, Min S. 1990. Lactic acid bacteria in meat fer- 10-INBS-09) are acknowledged for support within the mentation. FEMS Microbiol Rev. 87(1–2):165–174. MicroScope annotation platform. We are grateful to the Hammes WP, Hertel C. 1998. New developments in meat starter cultures. 
INRA MIGALE bioinformatics platform (http://migale.jouy. Meat Sci. 49:S125–S138. inra.fr) for providing computational resources and data Hammes WP, Knauf HJ. 1994. Starters in the processing of meat products. Meat Sci. 36(1–2):155–168. storage. 1524 Genome Biol. Evol. 10(6):1516–1525 doi:10.1093/gbe/evy106 Advance Access publication May 29, 2018 Downloaded from https://academic.oup.com/gbe/article-abstract/10/6/1516/5020733 by Ed 'DeepDyve' Gillespie user on 20 June 2018 Phylogenomic Analysis of Lactobacillus curvatus Reveals Two Lineages GBE Hebert EM, et al. 2012. Genome sequence of the bacteriocin-producing Sampaio M-M, et al. 2004. Phosphotransferase-mediated transport of the Lactobacillus curvatus strain CRL705. J Bacteriol. 194(2):538–539. osmolyte 2-O-alpha-mannosyl-D-glycerate in Escherichia coli occurs by Inglin RC, Meile L, Stevens MJA. 2017. Draft genome sequences of 43 the product of the mngA (hrsA) gene and is regulated by the mngR Lactobacillus strains from the species L. curvatus, L. fermentum, L. (farR) gene product acting as repressor. J Biol Chem. paracasei, L. plantarum, L. rhamnosus,and L. sakei, isolated from 279(7):5537–5548. food products. Genome Announc. 5(30):e00632-17–e00617. Schmid M, et al. 2018. Comparative genomics of completely sequenced Jans C, Lagler S, Lacroix C, Meile L, Stevens MJA. 2017. Complete genome Lactobacillus helveticus genomes provides insights into strain-specific sequences of Lactobacillus curvatus KG6, L.curvatus MRS6, and genes and resolves metagenomics data down to the strain level. Front Lactobacillus sakei FAM18311, isolated from fermented meat prod- Microbiol. 9:63. ucts. Genome Announc. 5(38):e00915-17. Siezen R, et al. 2006. Lactobacillus plantarum gene clusters encoding pu- Jung JY, et al. 2011. Metagenomic analysis of kimchi, a traditional Korean tative cell-surface protein complexes for carbohydrate utilization are fermented food. Appl Environ Microbiol. 77:2264–2274. 
Associate editor: Esther Angert

Journal

Genome Biology and Evolution, Oxford University Press

Published: May 29, 2018
