Isolation in small populations of Wayampi Amerindians promotes endemicity and homogenisation of their faecal virome, but its distribution is not entirely random

Abstract

The isolated community of the Wayampi Amerindians has been extensively studied for the presence of beta-lactamase-producing enterobacteria and for their gut microbiota. However, no information about their virome was available. This study seeks to establish potential associations between the virome and diverse epidemiological data through a metagenomic study of the faecal prophages and DNA viruses from 31 samples collected in 2010. Taxonomic assignments and composition, abundance and diversity analyses were obtained to characterise the virome and were compared between groups according to several demographic, environmental and medical data. Prophages outnumbered viruses. The composition and abundance of the virome indicated relatively low variability. Diversity within samples showed no significant differences, regardless of the group comparison. Significant differences were observed in the beta diversity among samples according to hospitalisation and gender, but not by extended spectrum β-lactamase carriage, antibiotic intake or possession of pets, although some viruses differed in some cases (e.g. immunodeficiency-associated stool virus was associated with antibiotic intake). The faecal virome of adult Wayampi is more homogeneous than that of western populations. No single factor analysed can alone explain the observed distribution of the virome, but differences by gender (less variability in females than in males) may reflect differences in life habits and work.

Keywords: virome, next-generation sequencing, isolated human population, low exposure to antibiotics

INTRODUCTION

One of the major public health concerns is the emergence of bacterial antibiotic resistance, not only in developed countries but also in developing ones (D’Andrea et al. 2013).
One such resistance is that of gram-negative bacteria encoding extended spectrum beta-lactamases (ESBLs) carried on plasmids that are readily transferred among bacterial species via horizontal transfer. However, monitoring the dynamics of ESBL dissemination is difficult in mainstream communities because of the mobility of populations and other factors, such as exposure to multiple sources of antibiotics (medical care, food chain, etc.) associated with the modern lifestyle, which hinder the follow-up of stable cohorts of individuals over time. To address this issue, faecal ESBL-producing enterobacteria (E-ESBL) have been studied for over a decade in the isolated Amerindian Wayampi community living in a very remote village of French Guiana, with three campaigns in 2001, 2006 and 2010 (Grenet et al. 2004; Woerther et al. 2010, 2013). In this context, metagenomic and metatranscriptomic approaches were applied to a subset of samples collected during the last campaign (Gosalbes et al. 2016) to determine whether intestinal carriage of E-ESBL is associated with significant changes in the composition of the rest of the microbiota. However, this picture remained incomplete, as the association with another component of the gut microbiota, the viral community, had not yet been explored. In this work, we have examined the viral fraction from the set of samples analysed by Gosalbes et al. (2016) in order to describe the virome of a genetically and environmentally homogeneous population. We aim to establish potential associations not only between the virome and the intestinal carriage of E-ESBL, but also with other epidemiological data (demographic, environmental and medical).

MATERIALS AND METHODS

Subjects

Detailed information about the study subjects and sampling has been described previously (Gosalbes et al. 2016).
Briefly, available frozen faecal samples from 31 healthy adult Wayampi Amerindians from the village of Trois-Sauts in French Guiana were used for this study. These subjects represented a subcohort of the 151 individuals who voluntarily participated in a campaign in October 2010 to test for E-ESBL intestinal carriage. They included eight E-ESBL carriers, four of them carrying a CTX-M-1 type ESBL, two carrying a CTX-M-8 ESBL and one carrying a CTX-M-2 ESBL. The remaining individuals were non-carrier controls chosen among the remaining 143 villagers. Demographic, lifestyle, environment and medical history data had been previously collected from each volunteer (Woerther et al. 2013). Antibiotic treatments prescribed to all villagers were also recorded, as well as the familial status and household location of each villager. Signed informed consent had been previously obtained from all subjects, and the study protocol was approved by the Regional Ethics Committee (Comité de Protection des Personnes Sud-Ouest et Outre Mer III, 2010-A00682-37).

Sample collection and DNA purification

For each volunteer, about 5 g of fresh faecal sample was diluted 1:10 in RNAlater (Applied Biosystems, Villebon-sur-Yvette, France), thoroughly mixed, frozen at −20°C and taken to the laboratory for storage at −80°C. Stool samples were defrosted and homogenised, and 5 mL of each was diluted with 5 mL of phosphate buffered saline (containing, per litre, 8 g of NaCl, 0.2 g of KCl, 1.44 g of Na2HPO4 and 0.24 g of KH2PO4 [pH 7.2]). They were then centrifuged at 1250 × g at 4°C for 2 min to remove faecal debris. The supernatants were transferred into 2 mL microcentrifuge tubes and stored at −70°C. One aliquot per sample was used for the viral metagenome isolation. The suspensions were first centrifuged twice at 13,400 × g for 5 min at 4°C, and the supernatants were recovered and transferred into sterile tubes.
The supernatants were next filtered through 0.45 and 0.22 μm pore size Acrodisc® Syringe filters (Pall España, Madrid, Spain). The resulting filtrates were digested with a cocktail of DNases/RNases consisting of 14 U of Turbo DNase (Ambion Inc., Austin, TX, USA), 25 U of Benzonase (Novagen Inc., Madison, WI, USA) and 2.8 U of RNase A (Invitrogen, Carlsbad, CA, USA) in DNase buffer (Ambion) for 120 min at 37°C. Viral nucleic acids were extracted using the QIAamp viral RNA extraction kit (Qiagen, Valencia, CA, USA), following the manufacturer's instructions. Each resulting viral DNA was eluted in 40 μL of AVE buffer. DNAs were quantified with the Qubit fluorometric quantification method (Thermo Fisher Scientific, Carlsbad, CA, USA), and shotgun libraries were constructed using the Nextera XT DNA Library Prep kit (Illumina, CA, USA) according to the manufacturer's instructions. Sequencing was performed with MiSeq paired-end Illumina technology, using the Reagent Kit V3 for 300 bp paired-end reads.

Sequence bioinformatics analysis

Raw files containing all samples, already demultiplexed, were filtered by length and quality, trimmed and cleaned of low-complexity sequences and Ns using Prinseq-lite (v0.20.4) (Schmieder and Edwards 2011). For each sample, forward and reverse sequences were joined using the fastq-join tool from the ea-utils suite (Aronesty 2011), with unjoined reads remaining as separate files. Next, human and bacterial reads were filtered out of the files using the end-to-end and very sensitive options implemented in Bowtie2 (Langmead and Salzberg 2012) against the GRCh38/hg38 reference human genome (December 2013) and a bacterial database consisting of the reference bacterial genomes, updated to June 2016, downloaded from the NCBI FTP site.
Next, a BLASTn search strategy (e-value < 10⁻³, with identities ≥70%, 80% and 90%, along ≥75% of the read length) was applied to the unmapped reads against a customised viral database (March 2016) consisting of 99% identity clusters of complete viral genomes from the EBI and NCBI sites, plus all available viral sequences from the International Nucleotide Sequence Database Collaboration. It also included prophages from the PHAge Search Tool (Zhou et al. 2011). Additionally, a BLASTn search against the viral database was launched for those reads that had previously mapped against the bacterial database, in order to identify potential prophage sequences in those reads. In all cases, tBLASTx searches were also conducted (e-value < 10⁻⁵, with identities ≥50%, 60% and 70%, along ≥65% of the read length), obtaining similar results. A taxonomic assignment was subsequently obtained for the resulting hits using in-house scripts based on the lowest common ancestor strategy. Finally, contingency tables for reads matching both viral and bacterial hits, as well as for those matching viral but not bacterial hits, were built for the analyses with QIIME version 1.9.0 (Caporaso et al. 2010). Ecological analyses of the viral communities, such as sample composition, abundance and diversity within and among samples, were carried out. For analyses comprising groups of samples, reads were sub-sampled to the depth of the sample containing the fewest sequences, HE002 (2127 reads). Sample HE022, containing only 11 viral reads, was excluded from the analysis. To compare the distribution of abundances of prophages and viruses at family and species level between groups of samples according to factors such as antibiotic intake, ESBL carriage, gender, hospitalisation and possession of pets, non-parametric Wilcoxon tests were carried out with the free statistical package R 3.1.0 (R Core Team 2013), and P-values were corrected using the false discovery rate (FDR).
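The read-level filtering and lowest-common-ancestor assignment described in this section can be illustrated with a minimal Python sketch. The in-house scripts used in the study are not available, so everything here is an assumption made for illustration: the `Hit` record, the function names, the default thresholds (set to the least stringent BLASTn cutoffs quoted above) and the example lineages are all hypothetical.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple

class Hit(NamedTuple):
    """One alignment of a read against the viral database (hypothetical record)."""
    read_id: str
    identity: float            # percent identity of the alignment
    align_len: int             # alignment length in bases
    read_len: int              # full length of the query read
    evalue: float
    lineage: Tuple[str, ...]   # root-to-leaf taxonomy of the matched sequence

def passes(hit: Hit, min_identity: float = 70.0,
           min_coverage: float = 0.75, max_evalue: float = 1e-3) -> bool:
    """Keep a hit only if it meets the e-value, identity and read-coverage cutoffs."""
    coverage = hit.align_len / hit.read_len
    return (hit.evalue < max_evalue
            and hit.identity >= min_identity
            and coverage >= min_coverage)

def lowest_common_ancestor(lineages: List[Tuple[str, ...]]) -> Tuple[str, ...]:
    """Return the longest root-to-leaf prefix shared by all lineages."""
    shared = []
    for ranks in zip(*lineages):
        if len(set(ranks)) != 1:
            break
        shared.append(ranks[0])
    return tuple(shared)

def assign_reads(hits: Iterable[Hit], **thresholds) -> Dict[str, Tuple[str, ...]]:
    """Filter hits, then assign each read to the LCA of its surviving hits."""
    kept: Dict[str, List[Tuple[str, ...]]] = {}
    for hit in hits:
        if passes(hit, **thresholds):
            kept.setdefault(hit.read_id, []).append(hit.lineage)
    return {read: lowest_common_ancestor(lins) for read, lins in kept.items()}

# Made-up hits, for illustration only.
example_hits = [
    Hit("read1", 92.0, 280, 300, 1e-30, ("Viruses", "Caudovirales", "Siphoviridae")),
    Hit("read1", 88.0, 270, 300, 1e-25, ("Viruses", "Caudovirales", "Myoviridae")),
    Hit("read2", 65.0, 290, 300, 1e-10, ("Viruses", "Herpesviridae")),  # below 70% identity
]
assignments = assign_reads(example_hits)
```

In this toy input, the two surviving hits of `read1` disagree at family level, so the read is assigned only as far as the order they share; raising `min_identity` to 90 leaves a single hit and therefore a family-level assignment.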
For these analyses, only hits with at least three appearances (10% of the samples) were considered, in order to exclude viruses appearing rarely (in only one or two samples). In addition, Linear Discriminant Analysis Effect Size (LEfSe) (Segata et al. 2011) was used to find differentially distributed markers in those groups. The threshold for considering a feature discriminative was set to >3.0. Additionally, 1000 rarefactions were carried out and the alpha diversity was calculated with the Shannon diversity index. Boxplots were created using an R script, and the diversity was compared by groups using the script compare_alpha_diversity.py, with significance assessed by a non-parametric two-tailed t-test using 2127 Monte Carlo permutations with FDR correction. In order to assess the homogeneity of the viral communities, the beta diversity was calculated with the pipeline beta_diversity_through_plots.py, using the Bray–Curtis dissimilarity index. To compare distances between categories, boxplots were created using the same R script as used for alpha diversity. The script make_distance_boxplots.py was used to assess the significance of differential distributions by means of Monte Carlo (non-parametric) tests with Bonferroni correction. The homogeneity of the virome of the Wayampi individuals was also compared with that of samples from 20 healthy individuals from Spain, previously sequenced by us (Pérez-Brocal et al. 2015), as well as with samples from 11 patients infected by Clostridium difficile, processed and sequenced in parallel to the Wayampi samples using the same procedure. To make the groups comparable, qualitative statistics were used.
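The rarefaction step and the two abundance-based diversity measures named above can be written out in a short Python sketch. It is not the QIIME implementation used in the study: the function names and toy counts are hypothetical, and the base-2 logarithm for the Shannon index is an assumption.

```python
import math
import random
from collections import Counter
from typing import Mapping

def rarefy(counts: Mapping[str, int], depth: int, rng: random.Random) -> Counter:
    """Draw `depth` reads without replacement from a taxon -> read-count table."""
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("sample has fewer reads than the requested depth")
    return Counter(rng.sample(pool, depth))

def shannon(counts: Mapping[str, int], base: float = 2.0) -> float:
    """Shannon diversity index H = -sum(p_i * log(p_i)); base 2 is assumed here."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total, base)
                for n in counts.values() if n > 0)

def bray_curtis(a: Mapping[str, int], b: Mapping[str, int]) -> float:
    """Bray-Curtis dissimilarity: 1 - 2 * (sum of shared minima) / (total counts)."""
    taxa = set(a) | set(b)
    shared = sum(min(a.get(t, 0), b.get(t, 0)) for t in taxa)
    return 1.0 - 2.0 * shared / (sum(a.values()) + sum(b.values()))

# Made-up family-level counts for two samples, for illustration only.
sample_a = {"Siphoviridae": 50, "Myoviridae": 50}
sample_b = {"Siphoviridae": 50, "Herpesviridae": 50}

# Mean Shannon index over repeated rarefactions to a common depth.
rng = random.Random(0)
mean_shannon = sum(shannon(rarefy(sample_a, 40, rng)) for _ in range(100)) / 100
distance_ab = bray_curtis(sample_a, sample_b)
```

Averaging the index over many rarefied draws, as in the last lines, mirrors the idea of the 1000 rarefactions described in the text: every sample is compared at the same sequencing depth, so differences in diversity are not driven by differences in read number.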
Thus, dissimilarity among samples, calculated using Binary Jaccard distance matrices and richness measures, was plotted for the three sets using R, and statistical differences were estimated between pairs of sets and among the three groups using the non-parametric Mann–Whitney U and Kruskal–Wallis tests, respectively. Finally, the clustering based on the Bray–Curtis dissimilarity measurements and the heat maps of taxon abundance and composition were also generated using R.

Virome data submission

The DNA virus metagenome data sets from this study are available in the EBI Short Read Archive under the study accession number PRJEB21741, with accession numbers ERS1820590-ERS1820620. The Western population viromes used for comparison with the Wayampi ones can be accessed under the accession numbers ERS540192-ERS540312 (healthy individuals) and ERS1941693-ERS1941704 (patients suffering from diarrhoea).

RESULTS

The virome contained in 31 previously collected faecal samples from adult volunteers belonging to the Wayampi tribe was analysed. The distribution of the samples according to the five characteristics selected for the analyses (antibiotic intake, ESBL carriage, gender, hospitalisation and pet possession) is shown in Table 1. Thirty of those samples were successfully sequenced, adding up to 14 344 007 pairs of raw reads in total. Sample HE022 was removed from the analysis due to its low number of reads. Table 1 (Supporting Information) summarises the evolution in the number of reads, in total and per sample, after the sequential processing steps, as well as the number of viral hits identified using three different BLASTn identity cutoffs (above 70%, 80% and 90%).

Table 1. Epidemiological parameters of the 31 individuals surveyed among the Wayampi Amerindians of Trois-Sauts.
Sample ID   Antibiotic intake(a)   ESBL carriage   Sex      Hospitalisation   Pets
HE001       Yes                    Carrier         Male     No                Yes
HE002       Yes                    Control         Female   No                Yes
HE012       No                     Control         Male     No                Yes
HE014       Yes                    Control         Female   Yes               Yes
HE022(b)    No                     Control         Female   No                No
HE025       No                     Control         Male     Yes               Yes
HE050       Yes                    Control         Male     Yes               No
HE053       No                     Carrier         Male     No                No
HE054       No                     Control         Female   No                No
HE055       Yes                    Carrier         Female   Yes               Yes
HE059       No                     Control         Female   No                Yes
HE071       No                     Carrier         Female   No                Yes
HE077       No                     Control         Female   Yes               Yes
HE080       Yes                    Carrier         Male     No                Yes
HE084       No                     Control         Male     No                Yes
HE089       No                     Control         Male     No                Yes
HE090       No                     Carrier         Female   No                Yes
HE093       No                     Control         Female   Yes               Yes
HE097       No                     Control         Female   Yes               Yes
HE103       No                     Control         Male     Yes               No
HE104       No                     Carrier         Female   N/A               No
HE106       No                     Control         Female   No                No
HE108       No                     Control         Female   Yes               Yes
HE109       No                     Control         Male     No                No
HE113       No                     Carrier         Female   No                Yes
HE135       No                     Control         Female   No                No
HE137       No                     Control         Female   Yes               Yes
HE147       No                     Control         Female   No                Yes
HE157       No                     Control         Male     No                No
HE161       No                     Control         Male     No                Yes
HE163       No                     Control         Male     No                Yes

(a) Antibiotics received were metronidazole (HE001, HE050), augmentin (HE002), rodogyl (HE014), cotrimoxazole (HE055) and amoxicillin (HE080).
(b) This sample was excluded from the analyses due to its low number of reads.

Despite the protocol used to enrich the samples in viruses, viral hits accounted for a minority of the reads, averaging from 5.6% to 2.8% of all reads at identity cutoffs of 70% and 90%, respectively. If only viral hits with no match in the bacterial database are considered, those reads averaged from 3.1% to 0.4% of all reads. Considering only viral reads, hits matching viruses but not bacteria represented, on average, 27.2% of the total viral hits (i.e. including those matching both viruses and bacteria), ranging from 17.4% to 54.1% among samples at 70% identity. That average was 20.2% (from 9.2% to 63.4%) at 80% identity, and 15.7% (from 5.9% to 62.6%) at 90% identity. This implies that, at higher identity cutoffs, viral-only hits with no bacterial match represented a lower proportion of the viral hits.

Viruses in Wayampi stool samples are relatively homogeneous in distribution

The relative abundance of the viruses at different taxonomic levels was analysed (see Fig. 1 for the distribution at the family level), not only for each sample but also for groups of samples according to the different variables: antibiotic intake, E-ESBL carriage, gender, hospitalisation during the last year and possession of pets.
In the surveyed individuals, reads were globally dominated by those matching prophages (87.4%), especially those identified in bacteria from the families Lachnospiraceae, Enterobacteriaceae, Bacteroidaceae, Ruminococcaceae and Streptococcaceae (together accounting for 36.5% of all reads). Among characterised viruses (12.6%), bacteriophages of the families Myoviridae (4.1%) and Siphoviridae (3.8%), a Streptococcus phage (0.5%) and Caudovirales that could not be unambiguously determined (0.4%) occupied the top positions among bacteriophages. The most abundantly identified eukaryotic viral family was Herpesviridae (0.3%). In addition, the unclassified immunodeficiency-associated stool virus (IAS virus) was identified at a similar percentage. Table 2 (Supporting Information) shows details of those bar plots at family and species level. Even when only reads matching characterised viruses alone were considered, prophage hits still outnumbered those of bacteriophages and eukaryotic viruses (81.1% vs. 18.9%).

Figure 1. Barplots showing the relative abundance of different viral hits at family taxonomic level. Bars display the relative abundance of all samples (A), by antibiotic intake (B), by carriage of E-ESBL (C), by gender (D), by hospitalisation in the last year (E), by possession of pets (F), and by individual samples (G). Colour scale and legends are displayed in Table 2 (Supporting Information).
Non-parametric Wilcoxon tests with FDR correction and LEfSe analyses indicated no significant differences between groups of samples according to any of the parameters analysed, suggesting a high degree of homogeneity in the viral composition and relative abundance across samples. Detailed results of the Wilcoxon tests at family and species level, as well as LEfSe plots and up to four examples of potential biomarkers, can be seen in Table 3 (Supporting Information). Even though some prophages and, to a much lower extent, some viruses could be significant, the abundance distributions in the selected plots of LEfSe markers show that differences in mean and median values between groups do not necessarily reflect two distinct distributions, but rather inter-individual variability within each group, with abundances that overlap for many of the samples of each class. One possible exception is the immunodeficiency-associated stool virus (IAS virus), which is significantly more abundant in individuals who had received antibiotics (0.66% ± 0.53%) than in those with no antibiotic treatment during the last months (0.21% ± 0.49%) (see its distribution by samples and the Wilcoxon and LEfSe results in Tables 2 and 3, Supporting Information). This difference was observed even under the most restrictive conditions, i.e. using only hits that were viral and not bacterial, and more than 90% identity (data not shown), suggesting that the presence of reads similar to those from this virus was not a mere artefact. However, this finding rests on a small and uneven number of samples (6 vs 24). In fact, the sample with the highest number of reads attributable to this virus belonged to a non-treated individual (HE137, 2.49%), although the remaining 23 non-treated individuals showed a much lower abundance (mean = 0.11% ± 0.08%).
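The FDR correction applied to the Wilcoxon P-values above can be illustrated with a minimal Python sketch. The text does not state which FDR procedure was used; the Benjamini–Hochberg step-up adjustment is assumed here because it is the usual meaning of an FDR correction in R. The function name and the P-values are hypothetical.

```python
from typing import List, Sequence

def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Made-up raw P-values for four taxa, for illustration only.
raw = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(raw)
significant = [p <= 0.05 for p in adjusted]
```

With these toy values, three of the four raw P-values fall below 0.05 but only the smallest survives the adjustment, which is the kind of attrition to expect when many taxa are tested across small and uneven groups.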
Similar diversity in samples and groups of samples

Alpha diversity, measured by the Shannon diversity index, had a mean value of 5.71 ± 0.38, ranging from 4.72 ± 0.073 in HE109 to 6.10 ± 0.064 in HE055. However, when samples were grouped by categories (i.e. antibiotic intake, ESBL carriage, gender, hospitalisation and pets), comparisons of the diversity within samples did not show significant differences, as displayed in Fig. 2. This implies that, even though some degree of individual variation is found across samples, it cannot be associated with any particular factor among those analysed. For example, the higher average diversity observed in individuals who had taken antibiotics or in pet owners, compared to those without antibiotic intake and those without pets, as well as the apparently higher diversity observed in females with respect to males, were not significant, since the variability within groups was higher than the observed differences. The smallest differences were observed for ESBL carriage and hospitalisation, with virtually indistinguishable values. This observation reinforces the idea of a relatively homogeneous virome in faecal samples, not only in composition and abundance (see above), but also in alpha diversity.

Figure 2. Boxplots showing the alpha diversity, measured as the Shannon diversity index, of the samples and groups of samples according to several variables of study. Panels show all samples (A), as well as comparisons by antibiotic intake (B), by carriage of E-ESBL (C), by gender (D), by hospitalisation in the last year (E), by possession of pets (F), and by individual samples (G). Boxes are delimited by the first and third quartiles; the extent of the upper and lower whiskers follows the Spear definition; the median (solid horizontal line) and mean (cross) are displayed.
The P-values for the different non-parametric two-tailed t-tests (using 2127 Monte Carlo permutations with false discovery rate (FDR) correction) are shown in group comparisons (B-F). Width of boxes is proportional to the number of points.

Differences in beta diversity are observed by gender and hospitalisation

In contrast, significant differences were found in some comparisons of the beta diversity analyses (Fig. 3). In particular, all differences by hospitalisation were significant, as were those by gender involving females. Thus, the faecal virome from female samples displayed more homogeneity than that of males.

Figure 3. Boxplots showing the Bray–Curtis dissimilarity distribution within and between groups of samples according to several variables of study. Panels show the global values for all samples (A), as well as according to the antibiotic intake (B), carriage of E-ESBL (C), gender (D), hospitalisation in the last year (E), and possession of pets (F).
Boxes are delimited by the first and third quartiles; the extent of the upper and lower whiskers follows the Spear definition; the median (solid horizontal line) and mean (cross) are displayed. The P-values for the non-parametric Monte Carlo method with Bonferroni correction are shown. Width of boxes is proportional to the number of points. n.s.: not significant.

Comparison with sets of samples from western populations

The Binary Jaccard distance among the Wayampi set of samples, compared to that of faeces from healthy individuals and from patients suffering from diarrhoea, both from a western country (Spain) (Fig. 4), revealed that compositional dissimilarity in the virome of the Wayampi was significantly lower than in healthy Westerners (0.62 ± 0.03 vs. 0.86 ± 0.06; P < 0.05) and in patients with diarrhoea (0.84 ± 0.06; P < 0.05), whereas the viromes of these last two groups did not differ significantly (P = 0.275). In addition, the virome of the tribe members showed less variability than those from the general population of Spain. These two characteristics reinforce the observations pointing to a more homogeneous virome in this isolated community.

Figure 4.
Boxplots showing the global Binary Jaccard distance distribution in three sets of faecal samples. The left box corresponds to the 30 samples from adults belonging to the Wayampi tribe; the central box corresponds to 20 healthy controls from Spain, and the right box to 11 patients suffering from diarrhoea. Width of boxes is proportional to the number of points. n.s.: not significant, *: significant.

Correspondence between the clustering of the virome and the epidemiological parameters

The clustering based on composition and abundance (Fig. 5), using the Bray–Curtis dissimilarity matrix for viruses and prophages, showed two main clusters, one of them (cluster A) consisting of 11 samples and the other (cluster B) including 17 samples, with two more samples clustered apart (cluster C). According to the selected categories, the 30 samples fell within 16 different combinations (see Table 4, Supporting Information) that could be related to the three clusters, and no pattern was observed that could be assigned to any particular cluster. This association was also analysed separately for each parameter. Unsurprisingly, none of them could alone explain the distribution of the samples based on the composition and abundance of the virome. Nevertheless, the small and uneven number of samples for most categories, especially for antibiotic intake (24 no vs 6 yes), ESBL carriage (22 no vs 8 yes) and possession of pets (9 no vs 21 yes), may be responsible for distorting the results of this sample clustering.
Despite this, it is remarkable that in the case where samples were more evenly distributed, namely by gender (13 males vs 17 females), most females clustered together (12 in cluster B), whereas males were distributed across the three clusters (6, 5 and 2 in clusters A, B and C, respectively). To explore the division of roles by gender further, a list of activities and other details, and their correlation with gender and sample clustering, is shown in Table 5 (Supporting Information).

Figure 5. Faecal virome clustering and heatmap based on family composition and abundance (Bray–Curtis dissimilarity). Only families with abundance of >1% in at least one sample are plotted.

DISCUSSION

In a subset of 31 out of 151 samples from the October 2010 campaign for the identification of individuals with E-ESBL intestinal carriage, Gosalbes and collaborators analysed the total and active gut microbiota. In their comparison of general features, they found that the gut microbiota of eight carriers and 23 noncarriers did not differ significantly with respect to any of the epidemiological characteristics they tested. They also analysed the biodiversity and composition of the total and active microbiota. In this study, we have used the remaining faecal matter from that subset of samples, which had been stored at −80°C, for the analysis of the DNA virome, including prophages. Some of the epidemiological parameters recorded during the campaign were used to group the samples in order to compare virus and prophage composition and abundance, as well as biodiversity within and between samples.
Other analyses have been carried out on uncontacted Amerindians (Clemente et al. 2015), focused on the bacterial microbiome and resistome of members of an isolated Yanomami Amerindian village. Nevertheless, as far as we know, this is the first attempt to characterise the virome of a human population from an isolated location, whose main interest lies in its scarce contact with the mainstream population and the consequent environmental and genetic homogeneity of the community. This narrows the potential sources of variability that are normally present in modern societies. The different analyses carried out pointed to a relative evenness of the virome across individuals compared to western populations, either healthy or diseased. This is probably due to a high degree of familial connectivity and/or to living in an environment where the flow of microorganisms among hosts is facilitated, resulting in homogenisation. However, it is extremely difficult to separate the influence of the genetic and environmental components in the relative homogeneity observed in the Wayampi. Moreover, despite this relative uniformity, variation in diversity among individuals, based on composition and abundance, is still observed. We propose that it may be related to some kind of specialisation in functions and practices, particularly regarding the gender distinction, with higher variation in men than in women. Other gender-related differences in human viruses have been reported for the oral virome in California (USA) (Abeles et al. 2014), but with a smaller cohort of eight Western individuals. Our sample size is greater, and the characteristics of the population and environment surveyed are also different, but our results also hint that sex-based differences may apply to communities of human viruses.
However, in our study, the analysis of recorded activities such as hunting, fishing, water access, food preparation, health activity and so on did not show any correlation with the clustering of the samples. Even though a certain division of roles exists, with some activities mostly masculine (hunting, canoeing) or feminine (preparation of food or cachiri), other activities were shared by both sexes (fishing, slaughter, care of children). This makes it difficult to associate any single activity with the clustering, all the more so with such a low number of samples.

As for the composition and abundance of the virome in members of this tribe, hits similar to potential prophages outnumbered those strictly matching bacteriophages and eukaryotic viruses (almost 9 to 1; 87.4% vs 12.6% of reads), even when the hits were restricted to those not matching bacteria (8 to 2; 81.1% vs 18.9%). Prophages from Lachnospiraceae, Enterobacteriaceae, Ruminococcaceae, Bacteroidaceae and Streptococcaceae were the most abundant, whereas bacteriophages of the families Siphoviridae and Myoviridae accounted for the highest number of reads among viruses, and herpesviruses among eukaryotic viruses. The unclassified IAS virus has not been assigned to either category. This virus has been reported in a study on diarrhoea in humans with advanced-stage HIV infection (Oude Munnink et al. 2014) and was also recently detected in a study of healthy and diarrhoeic neonatal piglets (Karlsson et al. 2016); in neither case was an association with diarrhoea found. Nevertheless, this dsDNA virus, which displays no significant similarity to known viruses, may deserve further attention, since it showed a significant association with antibiotic intake in our samples, despite the limitations of our data set in size and evenness.
The relatively small number of samples and the uneven representation of each group make it difficult to draw firm conclusions. Nevertheless, these findings indicate that even in a small population with low gene exchange, such as the Wayampi Amerindians, homogeneity in viral composition, abundance and within-sample diversity coexists with differences in the degree of variability (e.g. females vs males) and in particular viruses (e.g. IAS virus). Even though these differences cannot by themselves explain the clustering of the samples, they allow us to distinguish, at least to some extent, patterns in samples that would otherwise seem indistinguishable. Including further criteria to group the samples might improve the chances of finding traits that help to characterise the virome in this or similar populations.

SUPPLEMENTARY DATA

Supplementary data are available at FEMSEC online.

Acknowledgements

Illumina sequencing was carried out by Nuria Jiménez in the Sequencing Service of FISABIO-Salud Pública (Valencia, Spain).

FUNDING

This work was supported by grants to AM from the Spanish Ministry of Economy and Competitiveness [grant numbers SAF 2012-31187, SAF2013-49788-EXP, SAF2015-65878-R]; Carlos III Institute of Health [grant numbers PIE14/00045, AC15/00022]; Generalitat Valenciana [grant number PrometeoII/2014/065] and co-financed by FEDER.

Conflict of interest. None declared.

REFERENCES

Abeles SR, Robles-Sikisaka R, Ly M et al. Human oral viruses are personal, persistent and gender-consistent. ISME J 2014; 8: 1753–67.
Aronesty E. ea-utils: Command-line tools for processing biological sequencing data. ea-utils: FASTQ processing utilities. 2011. http://code.google.com/p/ea-utils.
Caporaso JG, Kuczynski J, Stombaugh J et al. QIIME allows analysis of high-throughput community sequencing data. Nat Methods 2010; 7: 335–6.
Clemente J, Pehrsson E, Blaser M et al. The microbiome of uncontacted Amerindians. Sci Adv 2015; 1: e1500183.
D’Andrea MM, Arena F, Pallecchi L et al. CTX-M-type β-lactamases: a successful story of antibiotic resistance. Int J Med Microbiol 2013; 303: 305–17.
Gosalbes MJ, Vázquez-Castellanos JF, Angebault C et al. Carriage of enterobacteria producing extended-spectrum β-lactamases and composition of the gut microbiota in an Amerindian community. Antimicrob Agents Chemother 2016; 60: 507–14.
Grenet K, Guillemot D, Jarlier V et al. Antibacterial resistance, Wayampis Amerindians, French Guyana. Emerg Infect Dis 2004; 10: 1150–3.
Karlsson OE, Larsson J, Hayer J et al. The intestinal eukaryotic virome in healthy and diarrhoeic neonatal piglets. PLoS One 2016; 11: e0151481.
Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods 2012; 9: 357–9.
Oude Munnink BB, Canuti M, Deijs M et al. Unexplained diarrhoea in HIV-1 infected individuals. BMC Infect Dis 2014; 14: 22.
Pérez-Brocal V, García-López R, Nos P et al. Metagenomic analysis of Crohn's disease patients identifies changes in the virome and microbiome related to disease status and therapy, and detects potential interactions and biomarkers. Inflamm Bowel Dis 2015; 21: 2515–32.
R Development Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing, 2013. ISBN 3-900051-07-0. http://www.R-project.org.
Schmieder R, Edwards R. Quality control and preprocessing of metagenomic datasets. Bioinformatics 2011; 27: 863–4.
Segata N, Izard J, Waldron L et al. Metagenomic biomarker discovery and explanation. Genome Biol 2011; 12: R60.
Woerther PL, Angebault C, Jacquier H et al. Characterization of fecal extended-spectrum-β-lactamase-producing Escherichia coli in a remote community during a long time period. Antimicrob Agents Chemother 2013; 57: 5060–6.
Woerther PL, Angebault C, Lescat M et al. Emergence and dissemination of extended-spectrum beta-lactamase-producing Escherichia coli in the community: lessons from the study of a remote and controlled population. J Infect Dis 2010; 202: 515–23.
Zhou Y, Liang Y, Lynch KH et al. PHAST: a fast phage search tool. Nucleic Acids Res 2011; 39: W347–52.

© FEMS 2017. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com
FEMS Microbiology Ecology, Oxford University Press

Publisher: Blackwell
Copyright: © FEMS 2017. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com
ISSN: 0168-6496
eISSN: 1574-6941
DOI: 10.1093/femsec/fix184
Abstract The isolated community of the Wayampi Amerindians has been extensively studied for the presence of beta lactamase-producing enterobacteria and their gut microbiota. However, no information about their virome was available. This study tries to establish potential associations between the virome and diverse epidemiological data, through the metagenomic study of the faecal prophages and DNA viruses from 31 samples collected in 2010. Taxonomic assignments, composition, abundance and diversity analyses were obtained to characterise the virome and were compared between groups according to several demographic, environmental and medical data. Prophages outnumbered viruses. Composition and abundance of virome indicated relatively low variability. Diversity within samples showed no significant differences, regardless of the group comparison. Significant differences were observed in the beta diversity among samples according to hospitalisation and gender, but not by extended spectrum β-lactamase carriage, antibiotic intake or possession of pets, although some viruses differed in some cases (e.g. immunodeficiency-associated stool virus associated with antibiotic intake). The faecal virome of adult Wayampi is more homogeneous than that from western populations. Not a single factor analysed can explain alone the observed distribution of the virome, but differences by gender (fewer variability in females than males) may reflect differences in life habits and work. virome, next-generation sequencing, isolated human population, low exposure to antibiotics INTRODUCTION One of the major concerns on public health is the emergence of bacterial antibiotic resistance, not only in developed countries, but also in developing ones (D’Andrea et al.2013). One of these resistances is that of gram-negative bacteria encoding extended spectrum beta-lactamases (ESBLs) carried in plasmids that are readily transferred among bacterial species via horizontal transfer. 
However, monitoring the dynamics of dissemination of ESBL in populations is difficult in mainstream communities due to the mobility of populations and other factors, such as exposure to multiple sources of antibiotics (medical care, food chain, etc.) associated with the modern lifestyle that burden the follow-up of stable cohorts of individuals over time. To address this issue, the faecal enterobacteria-producing ESBL (E-ESBL) has been studied for over a decade, on the isolated Amerindian Wayampi community living in a very remote village of French Guiana, with three campaigns in 2001, 2006 and 2010 (Grenet et al.2004; Woerther et al.2010, 2013). In this context, metagenomic and metatranscriptomic approaches were conducted on a subset of samples collected during the last campaign (Gosalbes et al.2016) in order to answer the question regarding the potential association of intestinal carriage of E-ESBL with significant changes in the composition of the rest of the microbiota. However, this picture remained incomplete, as the association with other members of the gut microbiota, the viral community, has not been explored yet. In this work, we have examined the viral fraction from the set of samples analysed by Gosalbes et al.(2016), in order to describe the virome of a genetically and environmentally homogeneous population. We aim at establishing potential associations, not only between the virome and the intestinal carriage of E-ESBL, but also extending the scope to other epidemiological data (demographic, environmental and medical data). MATERIALS AND METHODS Subjects Detailed information about the study subjects and sampling was extensively described previously (Gosalbes et al.2016). Briefly, available frozen faecal samples from 31 healthy adult Wayampi Amerindians from the village of Trois-Sauts in French Guiana were used for this study. 
These subjects represented a subcohort of the 151 individuals voluntarily participating in a campaign in October 2010 to test for E-ESBL intestinal carriage. They included eight E-ESBL carriers, four of them carrying a CTX-M-1 type ESBL, two carrying a CTX-M-8 ESBL and one was carrying a CTX-M-2 ESBL. The remaining individuals were non-carrier controls chosen among the remaining 143 villagers. Demographic, lifestyle, environment, and medical history data had been previously collected (Woerther et al.2013) from each volunteer. Antibiotic treatments prescribed to all villagers were also recorded, as well as familial status and the location of the household of each villager. Signed informed consent had been previously obtained from all subjects and the study protocol was approved by the Regional Ethics Committee (Comité de Protection des Personnes Sud-Ouest et Outre Mer III, 2010-A00682-37). Sample collection and DNA purification For each volunteer, about 5 g of fresh faecal sample was diluted 1:10 in RNAlater (Applied Biosystems, Villebon-sur-Yvette, France), thoroughly mixed and frozen at −20°C and taken to the laboratory for storage at −80°C. Stool samples were defrosted and homogenised and 5 mL of each were diluted with 5 mL of phosphate buffered saline (containing, per litre, 8 g of NaCl, 0.2 g of KCl, 1.44 g of 96 Na2HPO4, and 0.24 g of KH2PO4 [pH 7.2]). Then, they were centrifuged at 1250 g at 4°C for 2 min to remove faecal debris. The supernatants were transferred into 2 mL microcentrifuge tubes and stored at −70°C. One aliquot per sample was used for the viral metagenome isolation. The suspensions were first centrifuged twice at 13,400 × g, 5 min at 4°C and the supernatants recovered and transferred into sterile tubes. 
The supernatants were next filtered through 045 and 0.22 μm pore size Acrodisc® Syringe filters (Pall España, Madrid, Spain) The resulting filtrates were digested with a cocktail of DNases/RNases consisting of 14 U of Turbo DNase (Ambion Inc, Austin, TX, USA), 25 U of Benzonase (Novagen Inc., Madison, WI, USA) and 2.8 U of RNAse A (Invitrogen, Carlsbad, CA, USA) in DNAse buffer (Ambion), for 120 min at 37°C. Viral nucleic acids were extracted using the QiAamp viral RNA extraction kit (Qiagen, Valencia, CA, USA), following the manufacturer's instructions. Each resulting viral DNAs was eluted in 40 μL of AVE buffer. DNAs were quantified with the Qubit fluorometric quantification method (Thermo Fisher Scientific, Carlsbad, CA, USA), and shot gun libraries were constructed using the Nextera XT DNA Library Prep kit (Illumina, CA, USA) according to the manufacturers’ instructions. Sequencing was performed by MiSeq Paired End Illumina Technology, using the Reagent Kit V3 for 300 bp paired-end reads. Sequence bioinformatics analysis Raw files containing all samples already demultiplexed were filtered by length and quality, trimmed and removed of low-complexity sequences and Ns using Prinseq-lite (v0.20.4) (Schmieder and Edwards 2011). For each sample, forward and reverse sequences were joined using the fastq-join tool from ea-tools suite (Aronesty 2011), with unjoined reads remaining as separate files. Next, files were filtered from human and bacterial reads using the end-to-end and very sensitive options implemented by Bowtie2 (Langmead and Salzberg 2012) against the GRCh38/hg38 reference human genome (Dec ember 2013) and a bacterial database consisting of the reference bacterial genomes updated to June 2016 downloaded from the NCBI FTP site. 
Next, a BLASTn search strategy (e-value <10−3, with identities ≥70%, 80% and 90%, along ≥75% of the read length) was used for the unmapped reads against a customised viral database (March 2016), consisting of 99% identity clusters of complete viral genomes from the EBI and NCBI sites, plus all available viral sequences from the International Nucleotide Sequence Database Collaboration. It also included prophages from PHAge Search Tool (Zhou et al.2011). Additionally, a BLASTn against the viral database was launched for those reads that had previously mapped against the bacterial database in order to identify potential prophage sequences in those reads. In all cases, tBLASTx searches were also conducted (e-value < 10−5, with identities ≥50%, 60% and 70%, along ≥65% of the read length), obtaining similar results. A taxonomic assignment was subsequently obtained for resulting hits using in-house scripts based on the lowest common ancestor strategy. Finally, contingency tables for reads matching both viral and bacterial hits, as well as for those matching viral but not bacterial hits, were built for the analyses with QIIME version 1.9.0 (Caporasso et al.2010). Ecological analyses of the viral communities such as sample composition, abundance and diversity within and among samples and were calculated. For analyses comprising groups of samples, a sub-sampling of reads based on the sample containing the fewest sequences, HE002 (2127 reads) was carried out. Sample HE022, containing only 11 viral reads was excluded from the analysis. For comparisons of differential distribution of abundances of prophages and viruses at family and species level between groups of samples according to factor such as antibiotic intake, ESBL carriage, gender, hospitalisation and possession of pets, non-parametric Wilcox tests were carried out with the free statistical package R 3.1.0 (R Core Team 2013) and P-values were corrected by the false discovery rate (FDR) corrections. 
For these analyses, only hits with at least three appearances (10% of the samples) were considered, in order to avoid viruses appearing rarely (in only one or two samples). In addition, Linear Discriminant Analysis Effect Size (LEfSe) (Segata et al.2011) was used to find differentially distributed markers in those groups. The threshold used to consider a discriminative feature was set to >3.0. Additionally, 1000 rarefactions were carried out and the alpha diversity was calculated with the Shannon diversity index. Boxplots were created using a n R script, and the diversity was compared by groups using the script compare_alpha_diversity.py and estimated if they were significantly different using a non-parametric two-tailed t-test using 2127 Monte Carlo permutations with FDR correction. In order to assess the homogeneity of the viral communities, the beta diversity was calculated with the pipeline beta_diversity_through_plots.py, using the Bray-Curtis dissimilarity index. To compare distances between categories, boxplots were created using the same R script as used for alpha diversity. The script make_distance_boxplots.py was used to assess the significance of differential distributions, carrying out Monte Carlo (nonparametric) tests, including Bonferroni corrections. The homogeneity of the virome from samples from the Wayampi individuals was also compared with that of samples from 20 healthy individuals from Spain, previously sequenced by us (Pérez-Brocal et al.2015), as well as with samples from 11 patients infected by Clostridium difficile processed and sequenced in parallel to the Wayampi samples, using the same procedure. To make groups comparable, qualitative statistics were used. 
Thus, dissimilarity among samples, calculated using the Binary Jaccard distance matrices and richness measures, was plotted for the three sets, using R package, and statistical differences were estimated between pairs of sets, as well as for the three groups using the non-parametric Mann–Whitney U and Kruskal–Wallis tests, respectively. Finally, the clustering based on the Bray–Curtis dissimilarity measurements and the heat maps of taxon abundance and composition were also generated using R package. Virome data submission The DNA virus metagenome data sets from this study are available in the EBI Short Read Archive under the study accession number PRJEB21741, with accession numbers [ERS1820590-ERS1820620]. As for the Western population viromes used for comparisons with the Wayampi ones, data sets can be accessed under the accession numbers ERS540192-ERS540312 (healthy individuals) and ERS1941693 -ERS1941704 (patients suffering from diarrhoea). RESULTS The virome contained in 31 faecal samples from adult volunteers belonging to the Wayampi tribe previously collected was analysed. The distribution of the samples according to the five characteristics selected to conduct the analyses (antibiotic intake, ESBL carriage, gender, hospitalisation and pet possession) is shown in Table 1. Thirty out of those samples resulted in a successful sequencing, adding up 14 344 007 pairs of raw reads in total. Sample HE022 was removed from the analysis due to its low number of reads. Table 1 (Supporting Information) summarises the evolution in the number of reads, total and per sample, after sequential processing steps, as well as the number of identified viral hits using three different BLASTn identity cut offs (above 70%, 80% and 90%). Table 1. Epidemiological parameters of the 31 individuals surveyed among the Wayampi Amerindians of Trois-Sauts. 
Sample ID  Antibiotic intakea  ESBL carriage  Sex  Hospitalisation  Pets  HE001  Yes  Carrier  Male  No  Yes  HE002  Yes  Control  Female  No  Yes  HE012  No  Control  Male  No  Yes  HE014  Yes  Control  Female  Yes  Yes  HE022b  No  Control  Female  No  No  HE025  No  Control  Male  Yes  Yes  HE050  Yes  Control  Male  Yes  No  HE053  No  Carrier  Male  No  No  HE054  No  Control  Female  No  No  HE055  Yes  Carrier  Female  Yes  Yes  HE059  No  Control  Female  No  Yes  HE071  No  Carrier  Female  No  Yes  HE077  No  Control  Female  Yes  Yes  HE080  Yes  Carrier  Male  No  Yes  HE084  No  Control  Male  No  Yes  HE089  No  Control  Male  No  Yes  HE090  No  Carrier  Female  No  Yes  HE093  No  Control  Female  Yes  Yes  HE097  No  Control  Female  Yes  Yes  HE103  No  Control  Male  Yes  No  HE104  No  Carrier  Female  N/A  No  HE106  No  Control  Female  No  No  HE108  No  Control  Female  Yes  Yes  HE109  No  Control  Male  No  No  HE113  No  Carrier  Female  No  Yes  HE135  No  Control  Female  No  No  HE137  No  Control  Female  Yes  Yes  HE147  No  Control  Female  No  Yes  HE157  No  Control  Male  No  No  HE161  No  Control  Male  No  Yes  HE163  No  Control  Male  No  Yes  Sample ID  Antibiotic intakea  ESBL carriage  Sex  Hospitalisation  Pets  HE001  Yes  Carrier  Male  No  Yes  HE002  Yes  Control  Female  No  Yes  HE012  No  Control  Male  No  Yes  HE014  Yes  Control  Female  Yes  Yes  HE022b  No  Control  Female  No  No  HE025  No  Control  Male  Yes  Yes  HE050  Yes  Control  Male  Yes  No  HE053  No  Carrier  Male  No  No  HE054  No  Control  Female  No  No  HE055  Yes  Carrier  Female  Yes  Yes  HE059  No  Control  Female  No  Yes  HE071  No  Carrier  Female  No  Yes  HE077  No  Control  Female  Yes  Yes  HE080  Yes  Carrier  Male  No  Yes  HE084  No  Control  Male  No  Yes  HE089  No  Control  Male  No  Yes  HE090  No  Carrier  Female  No  Yes  HE093  No  Control  Female  Yes  Yes  HE097  No  Control  Female  Yes  Yes  HE103  No  Control  Male  
Yes  No  HE104  No  Carrier  Female  N/A  No  HE106  No  Control  Female  No  No  HE108  No  Control  Female  Yes  Yes  HE109  No  Control  Male  No  No  HE113  No  Carrier  Female  No  Yes  HE135  No  Control  Female  No  No  HE137  No  Control  Female  Yes  Yes  HE147  No  Control  Female  No  Yes  HE157  No  Control  Male  No  No  HE161  No  Control  Male  No  Yes  HE163  No  Control  Male  No  Yes  aAntibiotics received were metronidazole (HE001, HE050), augmentin (HE002), rodogyl (HE014), cotrimoxazole (HE055) and amoxicillin (HE080) bThis sample was excluded from the analyses due to its low number of reads. View Large Despite the protocol used to enrich the samples with viruses, viral hits stood for a minority on the reads, with an average ranging from 5.6% to 2.8% of all reads, using cutoffs of 70% and 90%. If only viral hits with no match in bacterial database are considered, those reads averaged from 3.1% to 0.4% of all reads. Considering only viral reads, hits matching viruses but not bacteria represented, on average, 27.2% of the total viral hits (i.e. including those matching both viruses and bacteria), ranging from 17.4% to 54.1% among samples at 70% identity. That average was 20.2% (from 9.2% to 63.4%) at 80% identity, and 15.7% (from 5.9% to 62.6%) at 90% identity. That implies that, for higher identity cutoffs, non-bacterial viral-only hits represented lower proportion of viral hits. Viruses in Wayampi stool samples are relatively homogeneous in distribution Relative abundance of the viruses at different taxonomic range levels was analysed (see Fig. 1 for distribution at the family level) not only for each sample, but also for groups of samples, according to the different variables: antibiotic intake, being or not carrier of E-ESBL, gender, hospitalisation during the last year, and possession of pets. 
In the surveyed individuals, reads were globally dominated by those matching prophages (87.4%), especially from those identified in bacteria from families Lachnospiraceae, Enterobacteriaceae, Bacteroidaceae, Ruminococcaceae and Streptococcaceae (summing up 36.5% of all reads). Among characterised viruses (12.6%), bacteriophages of families Myoviridae (4.1%) and Siphoviridae (3.8%), a Streptococcus phage (0.5%) and Caudovirales that could not be unambiguously determined (0.4%) occupied the top positions within bacteriophage families. The most abundantly identified eukaryotic viral family was Herpesviridae (0.3%). In addition, the unclassified immunodeficiency-associated stool virus (IAS virus) was also identified in a similar percentage. Table 2 (Supporting Information) shows details of those bar plots at family and species level. Even if only those reads matching only characterised viruses were considered, prophages hits still outnumbered those of bacteriophages and eukaryotic viruses (81.1% vs. 18.9%). Figure. 1. View largeDownload slide Barplots showing the relative abundance of different viral hits at family taxonomic level. Bars display the relative abundance of all samples (A), by antibiotic intake (B), by carriage of E-ESBL (C), by gender (D), by hospitalisation in the last year (E), by possession of pets (F), and by individual samples (G). Colours scale and legends are displayed in Table 2 (Supporting Information). Figure. 1. View largeDownload slide Barplots showing the relative abundance of different viral hits at family taxonomic level. Bars display the relative abundance of all samples (A), by antibiotic intake (B), by carriage of E-ESBL (C), by gender (D), by hospitalisation in the last year (E), by possession of pets (F), and by individual samples (G). Colours scale and legends are displayed in Table 2 (Supporting Information). 
Non-parametric Wilcox tests with FDR correction and LEfSe analyses indicated no significant differences between groups of samples according to any of the parameters analysed, suggesting a high degree of homogeneity in the viral composition and relative abundance across samples. Detailed results of the Wilcox tests, at family and species level, as well as LEfSe plots and up to four examples of potential biomarkers can be seen in Table 3 (Supporting Information). Even though some prophages and, to much lower extent, some viruses could be significant, the abundance distribution exemplified in the case of the selected plots of LEfSe markers shows that differences in mean and median values between groups do not necessarily reflect two distinct distributions, but rather an inter individual variability within each group, with abundances for many of the samples of each class that could otherwise overlap. One possible exception is represented by the immunodefiency-associated stool virus (IAS virus), which is significantly more abundant in those individuals having received antibiotics (0.66% ± 0.53%) than in those with no antibiotic treatment during the last months (0.21% ± 0.49%) (see its distribution by samples and Wilcox and LEfSE results in Tables 2 and 3, Supporting Information). This difference was observed even in the most restrictive conditions, i.e. using only viral and no bacterial hits, and more than 90% identity (data not shown), suggesting that the presence reads similar to those from this virus was not a mere artefact. However, this assumption is supported by a small and uneven (6 vs 24) number of samples. In fact, the sample with the highest number of reads attributable to this virus belonged to a non-treated individual (H137, 2.49%), although the remaining 23 individuals showed a much lower abundance (mean = 0.11% ± 0.08%). 
Similar diversity in samples and groups of samples Alpha diversity, measured by the Shannon diversity index, had a mean value of 5.71 ± 0.38, ranging from 4.72 ± 0.073 in HE109 to 6.10 ± 0.064 in HE055. However, when grouped by categories (i.e. antibiotic intake, ESBL carriage, gender, hospitalisation and pets), comparisons of the diversity within samples did not show significant differences, as displayed in Fig. 2. That implies that even though some degree of individual variation is found across samples, it cannot be associated with any particular factor of those analysed. For example, the higher average diversity observed in individuals having taken antibiotics or in pet owners compared to those without antibiotic intake and to those without pets, as well as the apparently higher diversity observed in females with respect to males were not significant, since the variability within groups was higher than the observed differences. The lowest differences were, however, observed by ESBL carriage and hospitalisation, with virtually undistinguishable values. This observation reinforces the idea of a relatively homogeneous virome in faecal samples, not only in composition and abundances (see above), but also in their alpha diversity. Figure. 2. View largeDownload slide Boxplots showing the alpha diversity, measured as Shannon diversity index, of the samples and groups of samples according to several variables of study. Panels show all samples (A), as well as comparisons by antibiotic intake (B), by carriage of E-ESBL (C), by gender (D), by hospitalisation in the last year (E), by possession of pets (F), and by individual samples (G). Boxes are delimited by the first and third quartiles, the upper and lower whiskers extent follow Spear definition; the median (solid horizontal line) and mean (cross) are displayed. 
The P-values for the different non-parametric two-tailed t-tests (using 2127 Monte Carlo permutations with false discovery rate (FDR) correction) are shown in group comparisons (B-F). Width of boxes is proportional to the number of points. Figure. 2. View largeDownload slide Boxplots showing the alpha diversity, measured as Shannon diversity index, of the samples and groups of samples according to several variables of study. Panels show all samples (A), as well as comparisons by antibiotic intake (B), by carriage of E-ESBL (C), by gender (D), by hospitalisation in the last year (E), by possession of pets (F), and by individual samples (G). Boxes are delimited by the first and third quartiles, the upper and lower whiskers extent follow Spear definition; the median (solid horizontal line) and mean (cross) are displayed. The P-values for the different non-parametric two-tailed t-tests (using 2127 Monte Carlo permutations with false discovery rate (FDR) correction) are shown in group comparisons (B-F). Width of boxes is proportional to the number of points. Differences in beta diversity are observed by gender and hospitalisation On the other hand, significant differences were found in some comparisons of the beta diversity analyses (Fig. 3). In particular, all differences by hospitalisation were significant, as well as those relative to gender involving females. Thus, the faecal virome from female samples displayed more homogeneity than that of males. Figure. 3. View largeDownload slide Boxplots showing the Bray–Curtis dissimilarity distribution within and between groups of samples according to several variables of study. Panels show the global values for all samples (A), as well as according to the antibiotic intake (B), carriage of E-ESBL (C), gender (D), hospitalisation in the last year (E), and possession of pets (F). 
Boxes are delimited by the first and third quartiles, the upper and lower whiskers extent follow Spear definition; the median (solid horizontal line), and mean (cross) are displayed. The P-values for the non-parametric Monte Carlo method with Bonferroni correction are shown. Width of boxes is proportional to the number of points. n.s.: not significant. Figure. 3. View largeDownload slide Boxplots showing the Bray–Curtis dissimilarity distribution within and between groups of samples according to several variables of study. Panels show the global values for all samples (A), as well as according to the antibiotic intake (B), carriage of E-ESBL (C), gender (D), hospitalisation in the last year (E), and possession of pets (F). Boxes are delimited by the first and third quartiles, the upper and lower whiskers extent follow Spear definition; the median (solid horizontal line), and mean (cross) are displayed. The P-values for the non-parametric Monte Carlo method with Bonferroni correction are shown. Width of boxes is proportional to the number of points. n.s.: not significant. Comparison with sets of samples from western populations The Binary Jaccard distance among the Wayampi set of samples, compared to that of faeces from healthy individuals and patients suffering from diarrhoea, both from a western country (Spain) (Fig. 4), revealed that statistically significant differences based on composition in the virome of the Wayampi were lower than those from healthy Westerners (0.62 ± 0.03 vs. 0.86 ± 0.06; P < 0.05), and also from patients with diarrhoea (0.84 ± 0.06; P < 0.05), whereas viromes from these last two did not differ significantly (P = 0.275). In addition, the virome from the tribe members had fewer variability than those from the general population of Spain. These two characteristics reinforce the observations pointing to a more homogenous community in the case of the virome in this isolated community. Figure. 4. 
Boxplots showing the global Binary Jaccard distance distribution in three sets of faecal samples. The left box corresponds to the 30 samples from adults belonging to the Wayampi tribe; the central box corresponds to 20 healthy controls from Spain, and the right box to 11 patients suffering from diarrhoea. The width of each box is proportional to the number of points. n.s.: not significant; *: significant.

Correspondence between the clustering of the virome and the epidemiological parameters

The clustering based on composition and abundance (Fig. 5), using the Bray–Curtis dissimilarity matrix for viruses and prophages, showed two main clusters, one (cluster A) consisting of 11 samples and the other (cluster B) including 17 samples, with two more samples clustered apart (cluster C). According to the selected categories, the 30 samples fell within 16 different combinations (see Table 4, Supporting Information); no pattern among these combinations could be assigned to any particular cluster. The association was also analysed separately for each parameter. Unsurprisingly, none of them alone was able to explain the distribution of the samples based on the composition and abundance of the virome. Nevertheless, the small and uneven number of samples for most categories, especially for antibiotic intake (24 no vs 6 yes), ESBL carriage (22 no vs 8 yes) and possession of pets (9 no vs 22 yes), may have distorted the results of this sample clustering.
Despite this, it is remarkable that for gender, the category in which samples were most evenly distributed (13 males vs 17 females), most females clustered together (12 in cluster B), whereas males were distributed across the three clusters (6, 5 and 2 in clusters A, B and C, respectively). To explore the division of roles by gender in more depth, a list of activities and other details, together with their correlation with gender and sample clustering, is shown in Table 5 (Supporting Information).

Figure 5. Faecal virome clustering and heatmap based on family composition and abundance (Bray–Curtis dissimilarity). Only families with abundance of >1% in at least one sample are plotted.

DISCUSSION

In a subset of 31 out of 151 samples from the October 2010 campaign for the identification of individuals with E-ESBL intestinal carriage, Gosalbes and collaborators analysed the total and active gut microbiota. In their comparison of general features, they found that the gut microbiota of eight carriers and 23 noncarriers did not differ significantly with respect to any of the epidemiological characteristics they tested. They also analysed the biodiversity and composition of the total and active microbiota. In this study, we used the remaining faecal matter from that subset of samples, which had been stored at −80°C, to analyse the DNA virome, including prophages. Some of the epidemiological parameters recorded during the campaign were used to group the samples for comparisons of virus and prophage composition and abundance, and of biodiversity within and between samples.
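The within-sample and between-sample comparisons mentioned above rest on three standard indices named in this paper: the Shannon index, the Bray–Curtis dissimilarity and the Binary Jaccard distance. The following is a minimal sketch of their textbook definitions, not the authors' pipeline. The function names, the use of the natural logarithm (the text does not state the base) and the read counts are assumptions made purely for illustration.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors.

    0 means identical abundances; 1 means no taxon is shared.
    """
    denom = sum(a) + sum(b)
    if denom == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / denom


def binary_jaccard(a, b):
    """Binary Jaccard distance: 1 - shared taxa / taxa present in either sample."""
    present_a = {i for i, x in enumerate(a) if x > 0}
    present_b = {i for i, x in enumerate(b) if x > 0}
    union = present_a | present_b
    if not union:
        return 0.0
    return 1.0 - len(present_a & present_b) / len(union)


# Hypothetical read counts for four viral families in two samples (made-up numbers).
sample_1 = [50, 30, 20, 0]
sample_2 = [40, 40, 0, 20]

print("Shannon (sample 1):", shannon_index(sample_1))
print("Bray-Curtis:", bray_curtis(sample_1, sample_2))
print("Binary Jaccard:", binary_jaccard(sample_1, sample_2))
```

Bray–Curtis weighs abundances, whereas the Binary Jaccard distance only considers presence or absence, which is why the two can rank the same pair of samples differently.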
Other analyses have been carried out on uncontacted Amerindians (Clemente et al. 2015), focused on the bacterial microbiome and resistome of members of an isolated Yanomami Amerindian village. Nevertheless, as far as we know, this is the first attempt to characterise the virome of an isolated human population, whose main interest lies in its scarce contact with the mainstream population and the consequent environmental and genetic homogeneity of the community. This narrows the potential sources of variability that normally affect modern societies. The different analyses pointed to a relative evenness of the virome across individuals compared to western populations, either healthy or diseased. This is probably due to a high degree of family connectivity and/or to living in an environment where the flow of microorganisms among hosts is facilitated, resulting in homogenisation. However, it is extremely difficult to disentangle the influence of the genetic and environmental components on this relative homogeneity observed in the Wayampi. Moreover, despite this relative uniformity, variation in diversity based on composition and abundance among individuals is still observed. We propose that it may be related to some kind of specialisation in functions and practices, particularly regarding the gender distinction, with greater variation in men than in women. Other gender-related differences in human viruses have been reported for the oral virome in California (USA) (Abeles et al. 2014), but in a smaller cohort of eight Western individuals. Our sample size is greater, and the characteristics of the population and environment surveyed are also different, but our results also hint that sex-based differences may apply to communities of human viruses.
However, in our study, the analysis of different recorded activities such as hunting, fishing, water access, preparation of nourishment, health activity, and so on did not show any correlation with the clustering of the samples. Even though a certain division of roles exists for some activities, with some mostly masculine (hunting, canoeing) or feminine (preparation of nourishment or cachiri), other activities were shared by both sexes (fishing, slaughter, care of children), making it difficult to associate a single activity with the clustering, even more so with such a low number of samples.

As for the analysis of composition and abundance of the virome from members of this tribe, hits similar to potential prophages outnumbered those strictly matching bacteriophages and eukaryotic viruses (almost 9 to 1), even when the hits were restricted to those not matching bacteria (8 to 2). Prophages from Lachnospiraceae, Enterobacteriaceae, Ruminococcaceae, Bacteroidaceae and Streptococcaceae were the most abundant, whereas bacteriophages from the Siphoviridae and Myoviridae families accounted for the highest number of reads among all viruses, and herpesviruses among eukaryotic viruses. The unclassified IAS virus has not been assigned to either category. This virus has been reported in one study on diarrhoea in humans with advanced-stage HIV infection (Oude Munnink et al. 2014). It was also recently detected in a study of healthy and diarrhoeic neonatal piglets (Karlsson et al. 2016). In both cases, no association with diarrhoea was found. Nevertheless, this dsDNA virus, which displays no significant similarity to known viruses, may deserve further interest, since it shows a significant association with antibiotic intake in our samples, despite their limitations in size and evenness.
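Group comparisons such as the one for antibiotic intake involve small, uneven groups (24 no vs 6 yes), and the figure legends of this paper refer to Monte Carlo permutation tests with FDR or Bonferroni correction. As a hedged illustration of that general idea, and not the authors' actual procedure or software, a two-sided permutation test on a difference of means with both adjustments could be sketched as follows. All function names, the 999 permutations, the choice of Benjamini–Hochberg as the FDR method and the abundance values are assumptions.

```python
import random


def permutation_test(group_a, group_b, n_permutations=999, seed=0):
    """Two-sided Monte Carlo permutation test on the difference of group means."""
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    n_b = len(pooled) - n_a
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random relabelling of the samples
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
    # Adding one to both terms keeps the estimated P-value above zero.
    return (hits + 1) / (n_permutations + 1)


def bonferroni(p_values):
    """Bonferroni adjustment: multiply each P-value by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg FDR adjustment (one common FDR method)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical relative abundances of one virus (made-up numbers):
# 6 samples with antibiotic intake vs 24 without.
with_antibiotics = [0.8, 1.2, 0.9, 1.5, 1.1, 1.0]
without_antibiotics = [0.1 * (i % 5) for i in range(24)]

p = permutation_test(with_antibiotics, without_antibiotics)
print("permutation P-value:", p)
print("Bonferroni:", bonferroni([p, 0.2, 0.04]))
print("Benjamini-Hochberg:", benjamini_hochberg([p, 0.2, 0.04]))
```

With groups as uneven as 6 vs 24, the smaller group dominates the variance of the permuted differences, so a significant result from such a test is better read as a lead for follow-up than as a firm conclusion.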
The relatively small number of samples and the uneven representation of each group make it difficult to draw conclusive remarks. Nevertheless, these findings indicate that even in a small population with low gene exchange, such as the Wayampi Amerindians, homogeneity in viral composition, abundance and within-sample diversity coexists with certain differences in the degree of variability (e.g. females/males) and in particular viruses (e.g. IAS virus). Even though these differences cannot explain per se the clustering of the samples, they allow us to differentiate, at least to some extent, patterns in samples that would otherwise seem indistinguishable. The inclusion of further criteria to group the samples might improve the chances of finding traits that help us characterise the virome in this or similar populations.

SUPPLEMENTARY DATA

Supplementary data are available at FEMSEC online.

Acknowledgements

Illumina sequencing was carried out by Nuria Jiménez in the Sequencing Service of FISABIO-Salud Pública (Valencia, Spain).

FUNDING

This work was supported by grants to AM from the Spanish Ministry of Economy and Competitiveness [grant numbers SAF 2012-31187, SAF2013-49788-EXP, SAF2015-65878-R]; Carlos III Institute of Health [grant numbers PIE14/00045, AC15/00022]; Generalitat Valenciana [grant number PrometeoII/2014/065] and co-financed by FEDER.

Conflict of interest. None declared.

REFERENCES

Abeles SR, Robles-Sikisaka R, Ly M et al. Human oral viruses are personal, persistent and gender-consistent. ISME J 2014; 8: 1753–67.
Aronesty E. ea-utils: Command-line tools for processing biological sequencing data. ea-utils: FASTQ processing utilities. 2011. http://code.google.com/p/ea-utils.
Caporaso JG, Kuczynski J, Stombaugh J et al. QIIME allows analysis of high-throughput community sequencing data. Nat Methods 2010; 7: 335–6.
Clemente J, Pehrsson E, Blaser M et al. The microbiome of uncontacted Amerindians. Sci Adv 2015; 1: e1500183.
D’Andrea MM, Arena F, Pallecchi L et al. CTX-M-type β-lactamases: a successful story of antibiotic resistance. Int J Med Microbiol 2013; 303: 305–17.
Gosalbes MJ, Vázquez-Castellanos JF, Angebault C et al. Carriage of enterobacteria producing extended-spectrum β-lactamases and composition of the gut microbiota in an Amerindian community. Antimicrob Agents Chemother 2016; 60: 507–14.
Grenet K, Guillemot D, Jarlier V et al. Antibacterial resistance, Wayampis Amerindians, French Guyana. Emerg Infect Dis 2004; 10: 1150–3.
Karlsson OE, Larsson J, Hayer J et al. The intestinal eukaryotic virome in healthy and diarrhoeic neonatal piglets. PLoS One 2016; 11: e0151481.
Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods 2012; 9: 357–9.
Oude Munnink BB, Canuti M, Deijs M et al. Unexplained diarrhoea in HIV-1 infected individuals. BMC Infect Dis 2014; 14: 22.
Pérez-Brocal V, García-López R, Nos P et al. Metagenomic analysis of Crohn's disease patients identifies changes in the virome and microbiome related to disease status and therapy, and detects potential interactions and biomarkers. Inflamm Bowel Dis 2015; 21: 2515–32.
R Development Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing, 2013. ISBN 3-900051-07-0. http://www.R-project.org.
Schmieder R, Edwards R. Quality control and preprocessing of metagenomic datasets.
Bioinformatics 2011; 27: 863–4.
Segata N, Izard J, Waldron L et al. Metagenomic biomarker discovery and explanation. Genome Biol 2011; 12: R60.
Woerther PL, Angebault C, Jacquier H et al. Characterization of fecal extended-spectrum-β-lactamase-producing Escherichia coli in a remote community during a long time period. Antimicrob Agents Chemother 2013; 57: 5060–6.
Woerther PL, Angebault C, Lescat M et al. Emergence and dissemination of extended-spectrum beta-lactamase-producing Escherichia coli in the community: lessons from the study of a remote and controlled population. J Infect Dis 2010; 202: 515–23.
Zhou Y, Liang Y, Lynch KH et al. PHAST: a fast phage search tool. Nucleic Acids Res 2011; 39: W347–52.

© FEMS 2017. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

Journal

FEMS Microbiology Ecology, Oxford University Press

Published: Feb 1, 2018
