Convergent Amino Acid Signatures in Polyphyletic Campylobacter jejuni Subpopulations Suggest Human Niche Tropism

Human infection with the gastrointestinal pathogen Campylobacter jejuni is dependent upon the opportunity for zoonotic transmission and the ability of strains to colonize the human host. Certain lineages of this diverse organism are more common in human infection but the factors underlying this overrepresentation are not fully understood. We analyzed 601 isolate genomes from agricultural animals and human clinical cases, including isolates from the multihost (ecological generalist) ST-21 and ST-45 clonal complexes (CCs). Combined nucleotide and amino acid sequence analysis identified 12 human-only amino acid KPAX clusters among polyphyletic lineages within the common disease-causing CC21 group isolates, with no such clusters among CC45 isolates. Isolate sequence types within human-only CC21 group KPAX clusters have been sampled from other hosts, including poultry, so rather than representing unsampled reservoir hosts, the increase in relative frequency in human infection potentially reflects a genetic bottleneck at the point of human infection. Consistent with this, sequence enrichment analysis identified nucleotide variation in genes with putative functions related to human colonization and pathogenesis, in human-only clusters. Furthermore, the tight clustering and polyphyly of human-only lineage clusters within a single CC suggest the repeated evolution of human association through acquisition of genetic elements within this complex. Taken together, combined nucleotide and amino acid analysis of large isolate collections may provide clues about human niche tropism and the nature of the forces that promote the emergence of clinically important C. jejuni lineages.

Key words: Campylobacter, phylogenetics, adaptation, pathogenesis, human niche.
© The Author(s) 2018. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. Genome Biol. Evol. 10(3):763–774. doi:10.1093/gbe/evy026. Advance Access publication February 14, 2018.

Introduction

Many bacterial species that are known as causes of gastroenteritis are common commensal organisms causing little or no harm to the host species. For pathogenic strains of these species, the pathway to disease can involve a series of population bottlenecks. Therefore, clinical isolates sampled from patients are a subset of the bacterial population, representing strains that had the opportunity to infect and survive new selective pressures associated with a pathogenic lifestyle.

The common gastrointestinal pathogen Campylobacter jejuni is widely distributed among wild and domesticated animal species/reservoirs (Sheppard et al. 2011), and the majority of the human infections are the result of consumption of contaminated food (Kapperud et al. 2003; Friedman et al. 2004; Skarp et al. 2016). Campylobacter jejuni populations are generally structured by host source (Sheppard et al. 2010, 2011), and this has allowed the attribution of the source of human infection based upon comparative multilocus sequence typing (MLST) and whole-genome characterization of host and clinical isolates (Sheppard, Dallas, MacRae, et al. 2009; Sheppard, Dallas, Strachan, et al. 2009; Pascoe et al. 2015; Dearlove et al. 2016; Thepault et al. 2017). These studies revealed chickens as a major source of human campylobacteriosis (EFSA 2015). On the assumption that all strains are equally able to infect humans, the abundance of C. jejuni in farmed chickens (Vidal et al. 2016) and contamination of retail poultry (Wimalarathna et al. 2013) would be enough to explain the importance of chickens as a pathogen reservoir. However, recent studies of C. jejuni in poultry have shown that some common chicken-associated strains are rare among clinical isolates while others increase in relative frequency (Yahara et al. 2017). This suggests that factors other than simple opportunity for transmission are involved in human infection.

In some species, such as Escherichia coli, the emergence of pathogenic strains can be associated with the acquisition of specific attributes which confer increased ability to cause disease or evade treatment. Examples include genetic elements that encode virulence and persistence in humans, such as those carried by phages and plasmids in E. coli, or the acquisition of antibiotic resistance in Staphylococcus (as reviewed in Kaper et al. 2004; Pantosti et al. 2007). In some cases the acquisition of a small amount of genetic material increases virulence, as seen in the large-scale outbreak of the Shiga-like-toxin producing E. coli O104:H4 (Frank et al. 2011). Where specific pathogenicity elements can be identified, it is relatively simple to identify the agent causing an outbreak and its molecular cause. However, in C. jejuni, traits associated with clinical isolates reflect not only virulence but also traits that confer a fitness advantage against the various selective pressures encountered in the poultry processing chain, such as survival in the nonhost environment (Yahara et al. 2017).

The increasing availability of whole-genome data provides opportunities to investigate the genomic differences underlying variation in proteins and their motifs that may promote the proliferation of particular pathogenic strains. Epidemiological studies of C. jejuni from clinical samples and animal reservoirs typically reveal genetically diverse populations. However, isolates belonging to CC21 and CC45 are regularly the most common lineages isolated from human disease (Karenlampi et al. 2007; Levesque et al. 2008; Mullner et al. 2009; Sheppard, Dallas, MacRae, et al. 2009; Sheppard, Dallas, Strachan, et al. 2009; Sanad et al. 2011; Mughini Gras et al. 2012; Sahin et al. 2012; Guyard-Nicodeme et al. 2015). Both of these lineages have been isolated from a variety of sources, including ruminants, poultry, wild birds, domesticated companion animals, as well as environmental samples (Sopwith et al. 2008; Sheppard et al. 2011, 2014). This ecological generalism may reflect a degree of genotypic and phenotypic plasticity that facilitates rapid host adaptation in a multihost environment (Read et al. 2013; Woodcock et al. 2017; Pascoe et al. 2017) but little is known about the specific genomic variations that promote proliferation of particular STs, within generalist lineages, in different niches such as human hosts.

Here we combine nucleotide-based phylogenetic analysis with amino acid sequence-based clustering to characterize populations of C. jejuni from humans and agricultural animals, and identify candidate genes involved in these possible host associations. Our hypothesis was that a combined methodological approach would identify subtle host-associated differences between isolates from major generalist groups. These analyses identified sublineages of the ST-21 complex that were overrepresented among isolates sampled from human disease. The putative functions of genes within human-only amino acid clusters included those important in human pathogenesis, such as flagella and capsule synthesis. Our study provides a new way of interrogating genomic data sets to identify candidate genes in a subset of strains that may indicate a population bottleneck associated with human colonization.

Materials and Methods

Bacterial Genomes

A total of 601 C. jejuni genomes were used in this analysis, previously published in various studies (Cody et al. 2013; Sheppard, Didelot, Jolley, et al. 2013; Sheppard, Didelot, Meric, et al. 2013; Pascoe et al. 2017; Yahara et al. 2017) (supplementary table S1, Supplementary Material online). The majority of these came from clinical isolates (n = 481) and the rest from agricultural sources, either poultry (n = 88) or cattle (n = 32). Most isolates were from the United Kingdom (n = 546/601, 90.1%). A total of 134/601 (22.3%) were from CC-45 and 467/601 (77.7%) were from CC-21-48-206 (supplementary table S1, Supplementary Material online), which have been shown to form a single sequence cluster in previous studies (Sheppard, Didelot, Meric, et al. 2013). These constituted all the sequenced genomes available to us when this study was initiated. CC21-48-206 is henceforth collectively referred to as CC21 group in this study. Sequencing was performed on Illumina platforms, and assemblies were performed with either Velvet (Zerbino and Birney 2008) or SPAdes (Bankevich et al. 2012). Assembled DNA sequences from various sources (supplementary table S1, Supplementary Material online) were uploaded to a web-based database based on the BIGSdb platform (Jolley and Maiden 2010) which allowed archiving, whole-genome gene-by-gene sequence alignments and prevalence analyses. In addition, the isolation source of all available CC21 group and CC45 isolate records (n = 17,107) from the pubMLST database (https://pubmlst.org/campylobacter/; last accessed February 07, 2018) were obtained (October 21, 2016) and analyzed to quantify the numbers of different STs isolated from humans and agricultural animals and contextualize this study.

Phylogenetic Tree Inference

Sequence alignments were obtained using a gene-by-gene approach (Sheppard et al. 2012). Briefly, the presence of 1,668 coding sequences (CDS) from the reference C. jejuni NCTC11168 genome (NCBI accession: NC_002163.1) in all 601 genomes of this study was inferred using BLAST with the following parameters: a gene was considered present when a local alignment match with the reference was obtained on >50% of the sequence length with >70% sequence identity. Using these criteria, 1,058 genes were shared by all 601 genomes from our data set, constituting the "core genome." Gene-by-gene alignments using MAFFT (Katoh and Standley 2013) were concatenated to create a core genome gene-by-gene alignment that was used subsequently. For protein trees, in-frame translation was performed using custom scripts (supplementary file 1, Supplementary Material online) for each individual gene alignment, which were then concatenated. The resulting concatenations were used as an input for the reconstruction of phylogenetic trees, either using an approximation of the maximum-likelihood algorithm implemented in FastTree2 (Price et al. 2010) (fig. 2) or RAxML (Stamatakis 2014) (supplementary fig. S1, Supplementary Material online). For the comparison of nucleotide and in-frame translated phylogenetic trees, we used RAxML (Stamatakis 2014) with GTRGAMMA and PROTGAMMAGTR models, respectively. For amino acid trees, the analysis used a simple search under the GAMMA model of rate heterogeneity on the protein data set using empirical base frequencies and estimating a general time reversible model of amino acid substitution.

KPAX2 Method: Bayesian Clustering Based on Amino Acid Sequence

KPAX2 is a new Bayesian method for identifying evolutionary signals in amino acid sequences that relate to differential evolution of lineages that may be either monophyletic or polyphyletic, for example, resulting from the horizontal distribution of relevant genomic elements through recombination (Pessia et al. 2015). Earlier analysis of a database of thousands of influenza A virus H3N2 subtypes demonstrated that the method could accurately identify antigenic clusters determined by amino acid variation and the sequence positions relevant for the antigenic differences (Pessia et al. 2015). The concatenated set of 601 core genome sequences corresponded to 153,911 amino acid positions, harboring 17,405 polymorphic sites. KPAX2 was used with the default prior settings, and inference was initialized with a proposal partition of the samples obtained using the k-medoids algorithm based on Tajima and Nei (1984) pairwise distances of protein sequences together with the Tamura and Kumar (2002) correction for heterogeneous patterns. The initial number of clusters was chosen by selecting the k associated with the highest log posterior probability under the KPAX2 model. In total, 100 partitions were then created by applying random modifications to the initial partition obtained by the k-medoids solution to the proposal partition. Split, merge, and transfer operators were as previously described (Pessia et al. 2015). Each of the 100 partitions was then independently used as a starting state for the KPAX2 posterior maximization algorithm to ensure that the final estimate was as close to the global posterior mode as possible. The 100 KPAX2 runs were done in parallel on a cluster computer, where the individual runs took approximately 1–2 weeks until convergence. The clustering solution with the highest log posterior probability among the 100 independent runs was chosen as the final estimate. The source of isolates belonging to different KPAX clusters was indicated for isolates from: human clinical only (clinical); chicken and human clinical sources (chicken + clinical); cattle and human clinical sources (cattle + clinical); and chicken, cattle and human clinical sources (chicken + cattle + clinical) (supplementary table S2, Supplementary Material online). For each KPAX cluster, characteristic amino acids were determined (Pessia et al. 2015), as well as corresponding proteins and genes in the C. jejuni NCTC11168 reference genome (supplementary table S3, Supplementary Material online). This allowed for a comparison of KPAX clustering results with genome-wide association study (GWAS) results to identify the genes associated with clinical-only C. jejuni KPAX groups.

Prevalence of STs from Human-Only KPAX Clusters among Isolates from Human and Nonhuman Sources

Total prevalence of C. jejuni STs observed to belong to human-only KPAX clusters was quantified among samples isolated from human and nonhuman sources (mainly poultry and cattle) and was inferred using isolation source information specified in a total of 17,107 CC21, CC48, CC206, and CC45 isolate records, taken from a total of 49,598 archived isolate records from every CC publicly available in the pubMLST database (https://pubmlst.org/campylobacter/; accessed October 21, 2016).

SEER Method: Genome-Wide Association Mapping

We used a k-mer enrichment method to identify, from the nucleotide sequence data, which genomic elements were significantly more prevalent in two groups of isolates: the human-only KPAX clusters (group 1, n = 103) compared to the remainder of the C. jejuni population (group 2, n = 498) (Weinert et al. 2015; Lees et al. 2016). This binary trait analysis was performed to ensure that eventual gene regulatory elements or accessory genes associated with the clusters would not remain unidentified, because the KPAX2 method is based only on core protein sequence variation. The input assemblies contained approximately 31 M unique k-mers with lengths between 10 and 99 nucleotides. The following filtering steps were applied to reduce the original k-mer input set by including only k-mers that: 1) had >75% frequency in group 1 and <25% frequency in group 2; 2) had a chi-square association test P-value < 10 ; and 3) had association P-value < 10 in a logistic regression model with the three first multidimensional scaling coordinates representing the population structure correction. The multidimensional scaling coordinates were calculated from a distance matrix based on 10,000 randomly selected k-mers from the initial set. The final set of genome-wide significant k-mers contained 347 k-mers, which were mapped to an annotated reference genome to identify their contexts.

Results

STs Vary in Frequency in Human Clinical and Agricultural Environments

Direct comparison of the relative prevalence of sequence types was performed using the entire Campylobacter PubMLST database. This contained a total of 49,598 entries on October 21, 2016. Of these, 13,095 belonged to the CCs 21, 48, and 206, previously shown to form a single sequence cluster based upon whole-genome analysis, and 4,012 belonged to CC45 complex. Within the CC21 group there were 8,382 human clinical isolates and 3,869 originating from agricultural animal sources, while in CC45 there were 1,674 human clinical isolates and 1,685 agricultural isolates. The relative abundance of isolate STs belonging to CC21-48-206 and CC45 was determined (fig. 1). In both CCs, there was variation in the relative frequency of STs isolated from human clinical and agricultural animal samples.

FIG. 1. Prevalence of clinical and agricultural C. jejuni within ST-21 and ST-45 CCs in a public archive repository. The prevalence of clinical (black) and poultry/livestock (gray) isolation sources in pubMLST for each ST in our data set with more than ten isolate records in the pubMLST database (https://pubmlst.org/campylobacter/; last accessed February 07, 2018). There were a total of 17,107 archived public isolate records. [Axes: sequence type (ST) from this study (with n > 10 entries in pubMLST), grouped into CC-21/48/206 and CC-45; prevalence in pubMLST (%). Legend: clinical; agricultural animals.]

Amino Acid Sequence-Based Analysis Reveals Human-Only Subclusters

The Bayesian model-based method KPAX2 was used to classify aligned proteins into functionally divergent groups, based upon amino acid residues of a collection of 601 genomes representing 66 STs belonging to the CC21 group and CC45. A total of 1,058 core CDS used in the nucleotide phylogeny were in silico translated and a concatenated amino acid alignment produced for each genome-sequenced strain. We then performed Bayesian clustering using the KPAX2 algorithm, and the tree was annotated with the 36 KPAX clusters identified (fig. 2). KPAX groups could be classified into four categories depending on sources of isolates: human only (12 KPAX groups, 112 isolates from 20 STs), human and chicken only (10 KPAX groups, 150 isolates from 20 STs), human and cattle only (4 KPAX groups, 33 isolates from 13 STs), and human, chicken and cattle (10 KPAX groups, 306 isolates from 24 STs). The isolate source within each KPAX group is shown in the supplementary table S2, Supplementary Material online.

KPAX and nucleotide sequence clusters showed incomplete congruence. Amino acid clustering was polyphyletic when superimposed on the nucleotide phylogeny (fig. 2, supplementary fig. S1, Supplementary Material online) and in some cases, divergent lineages shared the same KPAX cluster. For example, the 138 isolates belonging to ST-21 were found in 7 different KPAX groups containing isolates from various sources. However, particular STs (ST-21, ST-50, ST-47, ST-44, ST-861, and ST-190) were assigned KPAX groups encompassing only isolates from humans. Examination of isolate records in the entire pubMLST database revealed that most isolates from STs assigned to human-only KPAX groups (276/283 isolates, in 15/20 STs) have also been isolated from humans and other host species, with only ST-6601, ST-6137, ST-5727, and ST-2355 having been isolated solely from humans (table 1). Obviously, KPAX clusters were not defined using the whole genomes of the pubMLST-archived comparative data set; however, it is useful to contextualize KPAX-ST correlation within a wider data set. It should be noted that the ST designation can have poor specificity in contrast to the lineages determined from whole genomes and therefore an isolate from a nonhuman host present in the pubMLST database may lack the genetic elements identified in our present analysis.

FIG. 2. Population structure of 601 C. jejuni ST-21 and ST-45 complex isolates. Isolates are labeled by KPAX group labels (integers) and colored by their source distribution within KPAX groups: isolates from chicken and clinical sources (yellow), cattle and clinical sources (blue), chicken, cattle and clinical sources (pink), or clinical only (red). Polyphyletic KPAX groups, reflecting isolates in the same KPAX group but in multiple lineages on the tree, are indicated with an asterisk. The phylogenetic tree was reconstructed from a whole-genome gene-by-gene amino acid alignment, translated in-frame, using an approximation of the maximum-likelihood algorithm implemented in FastTree2, and using a general time reversible model. [Tree annotations: ST-45 complex; ST-21-48-206 complex; scale bar 0.001.]

Identification of Genes with Human-Associated Amino Acid Signatures within the CC21 Group

We sought to identify the discriminatory amino acids that resulted in clustering of human clinical-only CC21 group isolates. We identified a total of 1,213 amino acid sites which mapped to 265 genes (supplementary table S4, Supplementary Material online). Mapping the physical location of these against the reference CC21 genome NCTC11168 suggested that these loci were distributed across the genome and not under strong linkage disequilibrium resulting from physical proximity (fig. 3A). Interestingly, a total of 24/265 (9.0%) genes were found to be associated with previous GWASs (supplementary table S4, Supplementary Material online). More specifically, 3 genes were predicted to have a role in survival from farm to clinical disease (Yahara et al. 2017), 8 genes to have a role in in vitro colonization of surfaces and aggregation (Pascoe et al. 2015), and 14 genes to have a role in nonhuman host adaptation (Sheppard, Didelot, Meric, et al. 2013) (supplementary table S4, Supplementary Material online). Although some of these associations were sometimes weak in the corresponding studies, they were nonetheless highlighted and are consistent with a general role in transmission and host colonization.

To confirm whether these loci were associated with a human clinical-only sublineage we also performed sequence element enrichment analysis, using SEER (Lees et al. 2016), to identify the genetic basis of human clinical-only sublineage strains compared with those from other host sources (fig. 3, supplementary tables S5 and S6, Supplementary Material online). A total of 181 genes (supplementary table S5, Supplementary Material online), containing 547 enriched k-mers, were obtained (supplementary table S6, Supplementary Material online). These included genes that have been identified in previous association studies (supplementary table S5, Supplementary Material online), in particular genes with putative roles in in vitro colonization of surfaces and aggregation, host adaptation and clinical disease (Sheppard, Didelot, Meric, et al. 2013; Pascoe et al. 2015; Yahara et al. 2017).

A total of 26 genes were significantly associated with human-only lineages in both KPAX clustering and SEER association analyses (fig. 3, table 2). Half of these genes have been described as important for host colonization or pathogenesis, nine in humans or human cell studies, and four in chicken colonization studies (table 2), consistent with a broad role for
these genes in host adaptation and/or in multihost fitness. Of particular note within these genes were the flagellar gene flgH highlighted in a previous GWAS on nonchicken host adaptation (Sheppard, Didelot, Meric, et al. 2013), two genes (ceuC and ceuE) involved in the enterochelin iron uptake system in C. jejuni, a gene (aspB) involved in aspartate metabolism, and a gene (fdhD) encoding a formate dehydrogenase, a function that has been highlighted as important for survival from farm to clinical disease (Yahara et al. 2017). All five of these genes are known to be important in the invasion of mammalian cells and/or human colonization (Palyada et al. 2004; Guerry 2007; Novik et al. 2010; Sheppard, Didelot, Meric, et al. 2013; Yahara et al. 2017).

Table 1. Prevalence of isolates from STs found in human-only KPAX groups in human and nonhuman sources

| KPAX Group | ST | Total Number of Isolates in Our Study | Associated Hosts | Prevalence in Human Hosts in pubMLST (%)a | Prevalence in Nonhuman Hosts in pubMLST (%)a |
|---|---|---|---|---|---|
| KPAX-8 | ST-21* | 138 | Human, chicken, cattle | 66.5 | 22.4 |
| KPAX-9 | ST-475 | 5 | Human | 75.0 | 19.4 |
| | ST-6601# | 1 | Human | 100.0 | 0.0 |
| KPAX-19 | ST-50* | 100 | Human, chicken | 62.8 | 31.4 |
| | ST-5727# | 2 | Human | 100.0 | 0.0 |
| | ST-2355# | 1 | Human | 100.0 | 0.0 |
| KPAX-20 | ST-47* | 3 | Human | 79.2 | 9.4 |
| | ST-5242# | 1 | Human | 100.0 | 0.0 |
| KPAX-21 | ST-572 | 4 | Human | 82.7 | 11.8 |
| | ST-5138 | 1 | Human | 66.7 | 33.3 |
| KPAX-26 | ST-44* | 6 | Human | 73.2 | 22.3 |
| KPAX-27 | ST-50* | 100 | Human, chicken | 62.8 | 31.4 |
| KPAX-28 | ST-21* | 138 | Human, chicken, cattle | 66.5 | 22.4 |
| | ST-861* | 4 | Human | 86.2 | 10.3 |
| | ST-5018 | 3 | Human | 90.5 | 4.8 |
| | ST-190* | 2 | Human | 54.7 | 43.1 |
| | ST-141 | 1 | Human | 72.0 | 24.0 |
| KPAX-30 | ST-222 | 3 | Human | 78.9 | 21.1 |
| KPAX-32 | ST-122 | 4 | Human | 78.2 | 13.9 |
| KPAX-34 | ST-21* | 138 | Human, chicken, cattle | 66.5 | 22.4 |
| | ST-50* | 100 | Human, chicken | 62.8 | 31.4 |
| | ST-3769 | 1 | Human | 83.3 | 16.7 |
| | ST-520 | 1 | Human | 46.1 | 51.3 |
| KPAX-35 | ST-6137# | 2 | Human | 100.0 | 0.0 |

NOTE. Asterisks indicate STs that are also found in other KPAX groups that are not human-only. Hash marks (#) indicate STs that have never been isolated from nonhuman sources in our data set or pubMLST.
a pubMLST (https://pubmlst.org/campylobacter/) as accessed on October 21, 2016.

Discussion

An important aim in zoonotic pathogen research is to identify genetic and functional variations associated with lineages or sublineages that cause human infection. Comparative analysis of nucleotide sequence variation across the genome has improved understanding of the epidemiology and evolution of Campylobacter (Sheppard, Didelot, Jolley, et al. 2013; Gilbert et al. 2016; Llarena et al. 2016). Although this has provided a basis for identifying candidate genes with potential functional significance (Morley et al. 2015; Pascoe et al. 2015; Yahara et al. 2017), straightforward genome analysis often ignores factors relating to translation and the production of specific amino acid chains and proteins that may be important in host adaptation or pathogenicity. For example, although the four nucleotides can form 64 different triplets they only encode 20 amino acids. This means that the same amino acid can be encoded by different triplets, typically with variation at the third base, and divergent genomes may have convergent amino acid sequences that are potentially functionally important in host adaptation or pathogenesis. Analysis of encoded amino acid sequences in this study identified polyphyletic nucleotide sequence clusters within the CC21 group that clustered together within the same amino acid sequence clusters. These convergent human-only amino acid KPAX clusters, in divergent genomic backgrounds, may have been overlooked using conventional nucleotide sequence-based approaches.

Comparative analysis of the nucleotide sequence of the 601 C. jejuni genomes in this study identified STs belonging to the CC21 group and CC45 that were reported to have
been isolated at different frequencies from agricultural animal and human sources. This is consistent with other population genomic studies, where the variation in relative abundance has been explained by the different capacity of certain strains to survive through the poultry production chain at atmospheric oxygen concentrations (Yahara et al. 2017). Asymptomatic carriage of C. jejuni is not thought to be common in humans in industrialized countries (Lee et al. 2013). Therefore, under a simple transmission model, amino acid clusters would be expected to be present in both reservoir animal and infected human hosts. For this reason, the existence of strongly human-only amino acid KPAX clusters is unexpected. There are two possible explanations. First, isolates assigned to human-only KPAX clusters are derived from a source that is not represented in our isolate collection, which has not been captured by the sampling of isolates used in this study. Second, there are isolates that share amino acid clusters within CC21 group C. jejuni in our data set that increase in relative frequency in humans, compared with the isolates from other hosts. Additionally, it is possible that asymptomatic carriage of Campylobacter may be underestimated and underreported (Calva et al. 1988; Louwen et al. 2012; Lee et al. 2013; Islam et al. 2017). These factors could influence the evolution and population structure of symptomatic bacteria.

FIG. 3. Genes associated with clinical-only C. jejuni KPAX groups. (A) GWAS results visualized on a circular reference genome. The outer circle indicates genes from the C. jejuni NCTC11168 reference genome, with core genes shared by all isolates in our data set (black) and accessory genes (gray) indicated. Genes found to contain characteristic amino acid sites defining KPAX groups are represented (red ticks) along with a quantitative visualization of the number of these sites per gene (red dots; scale of the quantification from 0 to 420). Genes found to contain k-mers associated with clinical-only KPAX groups using SEER are represented (blue ticks) along with a quantitative visualization of the number of these k-mers mapped per gene (blue dots; scale of the quantification from 0 to 25). Black ticks indicate genes containing both KPAX group characteristic sites and associated k-mers using SEER. (B) Difference in COGs prevalence (%) among genes containing KPAX characteristic sites (red) and genes containing associated k-mers inferred by SEER (blue) with COGs prevalence in the C. jejuni NCTC11168 reference genome annotation. [Panel B axis: prevalence difference from reference genome annotation (%), from less to more prevalent than in the reference genome. COG annotation categories: not assigned to COGs; function unknown; intracellular trafficking and secretion; cell wall/membrane biogenesis; transcription; signal transduction mechanisms; cell motility; general function prediction only; replication, recombination and repair; secondary metabolites biosynthesis, transport and catabolism; defense mechanisms; translation; nucleotide transport and metabolism; posttranslational modification, protein turnover, chaperones; coenzyme transport and metabolism; amino acid transport and metabolism; carbohydrate transport and metabolism; lipid transport and metabolism; energy production and conversion; inorganic ion transport and metabolism. Legend: genes containing KPAX characteristic sites (n = 265); genes containing associated k-mers (SEER) (n = 181); overlap (n = 26).]

Examination of isolate records in the entire pubMLST database revealed that 97% of the isolates assigned to human-only amino acid KPAX clusters are of STs that have been isolated from other host species as well as humans (table 1). Notably, only five STs from human-only KPAX groups (corresponding to 7/276 isolates in our data set) have never been reported in nonhuman hosts, either in our data set or from isolate records in pubMLST. On the basis of the known sources of C. jejuni in human infection, including CC21 group isolates (Sheppard, Dallas, MacRae, et al. 2009; Sheppard, Dallas, Strachan, et al. 2009), the close similarity between C. jejuni populations on food and those from clinical samples (Kittl et al. 2013), and the presence of STs belonging to human-only amino acid KPAX clusters among agricultural hosts in pubMLST, it is unlikely that they indicate an unknown
host source population, although this cannot be ruled out in this study.

Our results are therefore consistent with the increase in relative frequency of particular amino acid sequence subclusters that are uncommon in animal hosts, among isolates from humans. Host colonization potential is influenced by the adaptive genomic variations that exist before and after transmission to the new host species (Geoghegan et al. 2016). In both cases, population bottlenecks reduce the genetic variance in the population at interhost transmission, which would account for the increased relative frequency of human-only amino acid KPAX clusters. It remains difficult to differentiate genetic changes associated with bottlenecking and drift from adaptive physiological changes that directly impact pathogenesis, such as human tissue tropism and virulence.

Furthermore, human passage can induce genetic variation in contingency genes encoding surface structures through frameshifts and phase variation (Bayliss et al. 2012; Revez et al. 2013; Thomas et al. 2014). However, the sharing of amino acid sequence clusters by polyphyletic lineages is evidence of homoplasy, and investigating the putative function of these genes may provide clues about their potential role in human colonization. Human-only KPAX clusters are present in every major lineage within the CC21 group (fig. 2) and are notably absent among CC45 isolates. This asymmetry cannot be explained by an insufficient sample size from the CC45 population in our data set and may suggest that, despite being an efficient human colonizer, CC45 strains may lack the suitable genetic background for acquisition of genomic elements that are associated with elevated human colonization that we observe in the CC21 group. Further analysis of larger sample sets, potentially including phenotypic analyses, is needed to confirm this.

Genome-wide association methods that have recently been applied to bacteria (Sheppard, Didelot, Meric, et al. 2013) allow the investigation of genetic variation that underlies important phenotypes. By quantifying the nucleotide sequence that was enriched in isolates from humans (Lees et al. 2016) across the genomes, we were able to investigate the putative function of genes with human-only amino acid KPAX clusters. A total of 26 genes were identified (table 2), half of which have been previously linked to host colonization or pathogenesis, nine in humans or human cells, four in chicken. For example, flgH is associated with flagellar assembly (table 2) and has otherwise been associated with adaptation in a mammalian host (Sheppard, Didelot, Meric, et al. 2013). Flagellar motility has been shown to be important for human and chicken colonization, and possibly for the secretion of virulence factors into host cells (Guerry 2007). Genes directly involved in host colonization also included ceuCE, involved in enterochelin uptake (table 2). The uptake of siderophores has been described as a virulence/host colonization trait in Campylobacter (Richardson and Park 1995), and a ceuE mutant has been shown to be altered in chicken colonization abilities (Palyada et al. 2004). Additionally, the cdsA gene is located in the genomic region of known maf adhesins, involved in survival and host colonization (Karlyshev et al. 2002). Knockout mutants of cj0005c, an uncharacterized oxidoreductase, have been shown to be strongly impaired in infection abilities and adherence to human Caco2 cells in vitro (Tareen et al. 2011), whereas a neighboring gene, cj0006, encoding a putative transporter, has been shown in global transcriptomic studies to be overexpressed in vivo when C. jejuni infects chicken (Hu et al. 2014). Finally, the tyrS gene, predicted to encode a tyrosyl-tRNA synthetase, has been observed to be overexpressed in a poor chicken colonizer strain of C. jejuni (Seal et al. 2007). Additionally, it has been associated with mammalian (cattle) adaptation in a previous GWAS from our laboratory (Sheppard, Didelot, Meric, et al. 2013).

Genes predicted to have a role in metabolism were also highlighted. The ackA and aspB genes are involved in acetate and aspartate metabolism, respectively, and have been shown in mutagenesis studies to be important for entry into human epithelial cells in vitro (Novik et al. 2010). The fdhD gene, encoding a formate dehydrogenase accessory protein, was also associated with isolates belonging to human-only amino acid clusters. Formate metabolism has been previously implicated in host association and survival in the food production chain from farm to human disease (Yahara et al. 2017). The racR gene, which regulates fumarate utilization in a low-oxygen environment, also displayed human-associated variation, and racR-deficient mutants have shown reduced chicken colonization in vivo (Bras et al. 1999; van der Stel et al. 2015). Other genes with variation associated with the CC21 human amino acid clusters included the dnaX gene, which encodes DNA polymerase III subunits and has been highlighted as a putative marker for the campylobacteriosis sequela Guillain–Barre syndrome (Godschalk et al. 2006), and trpC, which encodes an indole-3-glycerol-phosphate synthase in a genomic region important for human cell hyperinvasiveness (Javed et al. 2010).

Genomic variation associated with clinical C. jejuni isolates includes elements associated with the primary host (Sheppard, Didelot, Meric, et al. 2013) and the food production chain (Yahara et al. 2017), as well as variation which confers an adaptive advantage to human colonization and may directly impact pathogenesis (Thompson and Gaynor 2008). Evidence of genetic bottlenecks and selection fostered by this complex fitness landscape will not only be reflected in nucleotide sequence variation but also in features such as gene order, distribution of CDS on leading and lagging strands, GC skew, and codon usage (Bentley and Parkhill 2004; Rocha 2004). By combining analysis of nucleotide sequence and amino acid variation we were able to identify a subset of human-associated C. jejuni. As these isolates are found in nonhuman hosts, we interpret this as evidence of a genetic bottleneck that increases the relative frequency of certain strains in the infected individuals. Although larger scale studies are necessary to confirm a potential adaptive role for the human-associated variation, our analysis has identified a group of human-pathogenic C. jejuni that do not exhibit typical source-sink epidemiology, potentially reflecting human tissue tropism or virulence.

Table 2
List of Genes Associated with Clinical-Only Campylobacter jejuni KPAX Groups

Name | Alias | Operon | Predicted Product (COG) | COG Code | COG Description | Number of Characteristic Sites (KPAX) | Number of Mapping k-mers (SEER) | Notes | References
cj1346c | dxr | 500 | 1-Deoxy-D-xylulose 5-phosphate reductoisomerase | I | Lipid transport and metabolism genes | 52 | 8 | |
cj1347c | cdsA | 500 | Phosphatidate cytidylyltransferase | I | Lipid transport and metabolism genes | 8 | 1 | maf adhesins are included in the maf6-Cj1347 genomic region | (46)
cj1253 | pnp | 472 | Polynucleotide phosphorylase/polyadenylase | J | Translation | 7 | 5 | |
cj0762c | aspB | 285 | Aspartate aminotransferase | E | Amino acid transport and metabolism genes | 6 | 1 | An aspB mutant is defective for entry into cultured human epithelial cells | (38)
cj0810 | nadE | 301 | NAD synthetase | H | Coenzyme transport and metabolism genes | 6 | 1 | |
cj0006 | - | 4 | Putative Na+/H+ antiporter family protein | R | General function prediction only | 5 | 4 | Cj0006 is expressed in vivo when C. jejuni infects chicken | (48)
cj0389 | serS | 149 | Seryl-tRNA synthetase | J | Translation | 5 | 1 | |
cj0542 | hemA | 213 | Glutamyl-tRNA reductase | H | Coenzyme transport and metabolism genes | 3 | 3 | |
cj0767c | coaD | 286 | Phosphopantetheine adenylyltransferase | H | Coenzyme transport and metabolism genes | 3 | 1 | |
cj1620c | mutY | 593 | A/G-specific adenine glycosylase | L | Replication, recombination and repair | 3 | 2 | An SNP in mutY is associated with increase of antibiotic resistance | Dai et al. (2015)
cj0005c | - | 3 | Molybdopterin containing oxidoreductase | R | General function prediction only | 2 | 2 | Infection of and adherence to human Caco2 cells in vitro was strongly reduced in a cj0005c mutant | (47)
cj0069 | - | 38 | Hypothetical protein Cj0069 | J | Translation | 2 | 1 | Involved in the proximal response to cell adhesion and biofilm formation | Asakura et al. (2007)
cj0598 | - | 231 | Hypothetical protein Cj0598 | S | Function unknown genes | 2 | 5 | |
cj0689 | ackA | 259 | Acetate kinase | C | Energy production and conversion genes | 2 | 2 | Involved in nutrient acquisition, acetate metabolism |
cj1076 | proC | 404 | Pyrroline-5-carboxylate reductase | E | Amino acid transport and metabolism genes | 2 | 1 | |
cj1157 | dnaX | 426 | DNA polymerase III subunits gamma and tau | L | Replication, recombination and repair | 2 | 2 | Highlighted in a study as a putative Guillain–Barre syndrome marker | (52)
cj1508c | fdhD | 555 | Formate dehydrogenase accessory protein | C | Energy production and conversion genes | 2 | 3 | Formate metabolism is involved in host association and survival in the food chain from farm to human disease | (12)
cj0498 | trpC | 200 | Indole-3-glycerol-phosphate synthase | E | Amino acid transport and metabolism genes | 1 | 2 | In a genomic region identified as important for cell hyperinvasiveness in a transposon assay | (53)
cj0518 | htpG | 206 | Heat shock protein 90 | O | Posttranslational modification, protein turnover, chaperones genes | 1 | 1 | Associated in GWAS on biofilm formation (heat shock protein); Pascoe et al. (2017) |
cj0543 | proS | 213 | Prolyl-tRNA synthetase | J | Translation | 1 | 3 | |
cj0687c | flgH | 258 | Flagellar basal body L-ring protein | N | Cell motility genes | 1 | 3 | Flagellar assembly cluster; flagellar motility is important for human and chicken colonization, and possible secretion of virulence factors/Associated with cattle adaptation in GWAS | (23, 37)
cj1056c | - | 398 | Putative carbon–nitrogen hydrolase family protein | R | General function prediction only | 1 | 1 | Expression of cj1056c is modulated at low pH in vitro | Reid et al. (2008)
cj1261 | racR | 477 | Two-component regulator | K | Transcription | 1 | 6 | The Campylobacter RacRS system regulates fumarate utilization in a low oxygen environment, and racR mutants show reduced colonization of chicken | (50, 51)
cj1271c | tyrS | 479 | Tyrosyl-tRNA synthetase | J | Translation | 1 | 1 | TyrS was overexpressed in a poor colonizer of chicken/Associated with cattle adaptation in GWAS | (23, 49)
cj1353 | ceuC | 502 | Enterochelin uptake permease | P | Inorganic ion transport and metabolism genes | 1 | 5 | Uptake of siderophores is a described virulence/host colonization trait | (45)
cj1355 | ceuE | 502 | Enterochelin uptake periplasmic binding protein | P | Inorganic ion transport and metabolism genes | 1 | 5 | ceuE mutant shows decreased chicken colonization | (39)

NOTE: Genes are overlapping between the two analyses (KPAX and SEER). Operons are as predicted by OperonPredictor (http://biocomputo2.ibt.unam.mx/OperonPredictor/; last accessed February 07, 2018).

Supplementary Material

Supplementary data are available at Genome Biology and Evolution online.

Acknowledgments

G.M. was supported by a National Institute for Social Care and Health Research Fellowship (HF-14-13). Research and computation in S.K.S.'s laboratory was supported by a grant from the Medical Research Council (MR/L015080/1), the Wellcome Trust (088786/C/09/Z), the Food Standards Agency (FS246004), and the Biotechnology and Biological Sciences Research Council (BB/I02464X/1). J.C. was supported by ERC grant no. 742158. We acknowledge MRC CLIMB (S.K.S., Co-PI) that supported computationally intensive analysis.

Literature Cited

Asakura H, Yamasaki M, Yamamoto S, Igimi S. 2007. Deletion of peb4 gene impairs cell adhesion and biofilm formation in Campylobacter jejuni. FEMS Microbiol Lett. 275(2):278–285.
Bankevich A, et al. 2012. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol. 19(5):455–477.
Bayliss CD, et al. 2012. Phase variable genes of Campylobacter jejuni exhibit high mutation rates and specific mutational patterns but mutability is not the major determinant of population structure during host colonization. Nucleic Acids Res. 40(13):5876–5889.
Bentley SD, Parkhill J. 2004. Comparative genomic structure of prokaryotes. Annu Rev Genet. 38:771–792.
Bras AM, Chatterjee S, Wren BW, Newell DG, Ketley JM. 1999. A novel Campylobacter jejuni two-component regulatory system important for temperature-dependent growth and colonization. J Bacteriol. 181(10):3298–3302.
Calva JJ, Ruiz-Palacios GM, Lopez-Vidal AB, Ramos A, Bojalil R. 1988. Cohort study of intestinal infection with campylobacter in Mexican children. Lancet 1(8584):503–506.
Cody AJ, et al. 2013. Real-time genomic epidemiological evaluation of human Campylobacter isolates by use of whole-genome multilocus sequence typing. J Clin Microbiol. 51(8):2526–2534.
Dai L, Muraoka WT, Wu Z, Sahin O, Zhang Q. 2015. A single nucleotide change in mutY increases the emergence of antibiotic-resistant Campylobacter jejuni mutants. J Antimicrob Chemother. 70(10):2739–2748.
Dearlove BL, et al. 2016. Rapid host switching in generalist Campylobacter strains erodes the signal for tracing human infections. ISME J. 10(3):721–729.
EFSA 2015. The European Union summary report on trends and sources of zoonoses, zoonotic agents and food-borne outbreaks in 2013. EFSA J. 13(1):3991.
Frank C, et al. 2011. Epidemic profile of Shiga-toxin-producing Escherichia coli O104:H4 outbreak in Germany. N Engl J Med. 365(19):1771–1780.
Friedman CR, et al.; Emerging Infections Program FoodNet Working Group. 2004. Risk factors for sporadic Campylobacter infection in the United States: a case-control study in FoodNet sites. Clin Infect Dis. 38(s3):S285–S296.
Geoghegan JL, Senior AM, Holmes EC. 2016. Pathogen population bottlenecks and adaptive landscapes: overcoming the barriers to disease emergence. Proc Biol Sci. 283(1837):20160727.
Gilbert MJ, et al. 2016. Comparative genomics of Campylobacter fetus from reptiles and mammals reveals divergent evolution in host-associated lineages. Genome Biol Evol. 8(6):2006–2019.
Godschalk PC, et al. 2006. Identification of DNA sequence variation in Campylobacter jejuni strains associated with the Guillain-Barre syndrome by high-throughput AFLP analysis. BMC Microbiol. 6:32.
Guerry P. 2007. Campylobacter flagella: not just for motility. Trends Microbiol. 15(10):456–461.
Guyard-Nicodeme M, et al. 2015. Prevalence and characterization of Campylobacter jejuni from chicken meat sold in French retail outlets. Int J Food Microbiol. 203:8–14.
Hu Y, Huang J, Jiao XA. 2014. Screening of genes expressed in vivo during interaction between chicken and Campylobacter jejuni. J Microbiol Biotechnol. 24(2):217–224.
Islam Z, et al. 2017. Capsular genotype and lipooligosaccharide locus class distribution in Campylobacter jejuni from young children with diarrhea and asymptomatic carriers in Bangladesh. Eur J Clin Microbiol Infect Dis. doi:10.1007/s10096-017-3165-7.
Javed MA, et al. 2010. Transposon mutagenesis in a hyper-invasive clinical isolate of Campylobacter jejuni reveals a number of genes with potential roles in invasion. Microbiology 156(4):1134–1143.
Jolley KA, Maiden MC. 2010. BIGSdb: scalable analysis of bacterial genome variation at the population level. BMC Bioinformatics 11:595.
Kaper JB, Nataro JP, Mobley HL. 2004. Pathogenic Escherichia coli. Nat Rev Microbiol. 2(2):123–140.
Kapperud G, et al. 2003. Factors associated with increased and decreased risk of Campylobacter infection: a prospective case-control study in Norway. Am J Epidemiol. 158(3):234–242.
Kärenlampi R, Rautelin H, Schönberg-Norio D, Paulin L, Hänninen M-L. 2007. Longitudinal study of Finnish Campylobacter jejuni and C. coli isolates from humans, using multilocus sequence typing, including comparison with epidemiological data and isolates from poultry and cattle. Appl Environ Microbiol. 73(1):148–155.
Karlyshev AV, Linton D, Gregson NA, Wren BW. 2002. A novel paralogous gene family involved in phase-variable flagella-mediated motility in Campylobacter jejuni. Microbiology 148(Pt 2):473–480.
Katoh K, Standley DM. 2013. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 30(4):772–780.
Kittl S, et al. 2013. Comparison of genotypes and antibiotic resistances of Campylobacter jejuni and Campylobacter coli on chicken retail meat and at slaughter. Appl Environ Microbiol. 79(12):3875–3878.
Lee G, et al. 2013. Symptomatic and asymptomatic Campylobacter infections associated with reduced growth in Peruvian children. PLoS Negl Trop Dis. 7(1):e2036.
Lees JA, et al. 2016. Sequence element enrichment analysis to determine the genetic basis of bacterial phenotypes. Nat Commun. 7:12797.
Levesque S, Frost E, Arbeit RD, Michaud S. 2008. Multilocus sequence typing of Campylobacter jejuni isolates from humans, chickens, raw milk, and environmental water in Quebec, Canada. J Clin Microbiol. 46(10):3404–3411.
Llarena AK, et al. 2016. Monomorphic genotypes within a generalist lineage of Campylobacter jejuni show signs of global dispersion. Microb Genom. 2(10):e000088.
Louwen R, et al. 2012. Campylobacter bacteremia: a rare and under-reported event? Eur J Microbiol Immunol (Bp). 2(1):76–87.
Morley L, et al. 2015. Gene loss and lineage-specific restriction-modification systems associated with niche differentiation in the Campylobacter jejuni sequence type 403 clonal complex. Appl Environ Microbiol. 81(11):3641–3647.
Mughini Gras L, et al. 2012. Risk factors for campylobacteriosis of chicken, ruminant, and environmental origin: a combined case-control and source attribution analysis. PLoS One 7(8):e42599.
Mullner P, et al. 2009. Assigning the source of human campylobacteriosis in New Zealand: a comparative genetic and epidemiological approach. Infect Genet Evol. 9(6):1311–1319.
Novik V, Hofreuter D, Galan JE. 2010. Identification of Campylobacter jejuni genes involved in its interaction with epithelial cells. Infect Immun. 78(8):3540–3553.
Palyada K, Threadgill D, Stintzi A. 2004. Iron acquisition and regulation in Campylobacter jejuni. J Bacteriol. 186(14):4714–4729.
Pantosti A, Sanchini A, Monaco M. 2007. Mechanisms of antibiotic resistance in Staphylococcus aureus. Future Microbiol. 2(3):323–334.
Pascoe B, et al. 2015. Enhanced biofilm formation and multi-host transmission evolve from divergent genetic backgrounds in Campylobacter jejuni. Environ Microbiol. 17(11):4779–4789.
Pascoe B, et al. 2017. Local genes for local bacteria: evidence of allopatry in the genomes of transatlantic Campylobacter populations. Mol Ecol. 26(17):4497–4508.
Pessia A, Grad Y, Cobey S, Puranen JS, Corander J. 2015. K-Pax2: Bayesian identification of cluster-defining amino acid positions in large sequence datasets. Microb Genom. 1(1):e000025.
Price MN, Dehal PS, Arkin AP. 2010. FastTree 2–approximately maximum-likelihood trees for large alignments. PLoS One 5(3):e9490.
Read DS, et al. 2013. Evidence for phenotypic plasticity among multihost Campylobacter jejuni and C. coli lineages, obtained using ribosomal multilocus sequence typing and Raman spectroscopy. Appl Environ Microbiol. 79(3):965–973.
Reid AN, et al. 2008. Identification of Campylobacter jejuni genes contributing to acid adaptation by transcriptional profiling and genome-wide mutagenesis. Appl Environ Microbiol. 74(5):1598–1612.
Revez J, Schott T, Llarena A-K, Rossi M, Hänninen M-L. 2013. Genetic heterogeneity of Campylobacter jejuni NCTC 11168 upon human infection. Infect Genet Evol. 16:305–309.
Richardson PT, Park SF. 1995. Enterochelin acquisition in Campylobacter coli: characterization of components of a binding-protein-dependent transport system. Microbiology 141(12):3181–3191.
Rocha EP. 2004. Codon usage bias from tRNA's point of view: redundancy, specialization, and efficient decoding for translation optimization. Genome Res. 14(11):2279–2286.
Sahin O, et al. 2012. Molecular evidence for zoonotic transmission of an emergent, highly pathogenic Campylobacter jejuni clone in the United States. J Clin Microbiol. 50(3):680–687.
Sanad YM, et al. 2011. Genotypic and phenotypic properties of cattle-associated Campylobacter and their implications to public health in the USA. PLoS One 6(10):e25778.
Seal BS, et al. 2007. Proteomic analyses of a robust versus a poor chicken gastrointestinal colonizing isolate of Campylobacter jejuni. J Proteome Res. 6(12):4582–4591.
Sheppard SK, Dallas JF, MacRae M, et al. 2009. Campylobacter genotypes from food animals, environmental sources and clinical disease in Scotland 2005/6. Int J Food Microbiol. 134(1–2):96–103.
Sheppard SK, Dallas JF, Strachan NJ, et al. 2009. Campylobacter genotyping to determine the source of human infection. Clin Infect Dis. 48(8):1072–1078.
Sheppard SK, Didelot X, Jolley KA, et al. 2013. Progressive genome-wide introgression in agricultural Campylobacter coli. Mol Ecol. 22(4):1051–1064.
Sheppard SK, Didelot X, Meric G, et al. 2013. Genome-wide association study identifies vitamin B5 biosynthesis as a host specificity factor in Campylobacter. Proc Natl Acad Sci U S A. 110(29):11923–11927.
Sheppard SK, et al. 2010. Host association of Campylobacter genotypes transcends geographic variation. Appl Environ Microbiol. 76(15):5269–5277.
Sheppard SK, et al. 2011. Niche segregation and genetic structure of Campylobacter jejuni populations from wild and agricultural host species. Mol Ecol. 20(16):3484–3490.
Sheppard SK, et al. 2014. Cryptic ecology among host generalist Campylobacter jejuni in domestic animals. Mol Ecol. 23(10):2442–2451.
Sheppard SK, Jolley KA, Maiden MCJ. 2012. A gene-by-gene approach to bacterial population genomics: whole genome MLST of Campylobacter. Genes 3(4):261–277.
Skarp CPA, Hanninen ML, Rautelin HIK. 2016. Campylobacteriosis: the role of poultry meat. Clin Microbiol Infect. 22(2):103–109.
Sopwith W, et al. 2008. Identification of potential environmentally adapted Campylobacter jejuni strain, United Kingdom. Emerg Infect Dis. 14(11):1769–1773.
Stamatakis A. 2014. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30(9):1312–1313.
Tajima F, Nei M. 1984. Estimation of evolutionary distance between nucleotide sequences. Mol Biol Evol. 1(3):269–285.
Tamura K, Kumar S. 2002. Evolutionary distance estimation under heterogeneous substitution pattern among lineages. Mol Biol Evol. 19(10):1727–1736.
Tareen AM, Dasti JI, Zautner AE, Gross U, Lugert R. 2011. Sulphite: cytochrome c oxidoreductase deficiency in Campylobacter jejuni reduces motility, host cell adherence and invasion. Microbiology 157(Pt 6):1776–1785.
Thepault A, et al. 2017. Genome-wide identification of host-segregating epidemiological markers for source attribution in Campylobacter jejuni. Appl Environ Microbiol. 83(7):e03085–16.
Thomas DK, et al. 2014. Comparative variation within the genome of Campylobacter jejuni NCTC 11168 in human and murine hosts. PLoS One 9(2):e88229.
Thompson SA, Gaynor EC. 2008. Campylobacter jejuni host tissue tropism: a consequence of its low-carb lifestyle? Cell Host Microbe 4(5):409–410.
van der Stel AX, et al. 2015. The Campylobacter jejuni RacRS system regulates fumarate utilization in a low oxygen environment. Environ Microbiol. 17(4):1049–1064.
Vidal AB, et al. 2016. Genetic diversity of Campylobacter jejuni and Campylobacter coli isolates from conventional broiler flocks and the impacts of sampling strategy and laboratory method. Appl Environ Microbiol. 82(8):2347–2355.
Weinert LA, et al. 2015. Genomic signatures of human and animal disease in the zoonotic pathogen Streptococcus suis. Nat Commun. 6(1):6740.
Wimalarathna HM, et al. 2013. Widespread acquisition of antimicrobial resistance among Campylobacter isolates from UK retail poultry and evidence for clonal expansion of resistant lineages. BMC Microbiol. 13:160.
Woodcock DJ, et al. 2017. Genomic plasticity and rapid host switching can promote the evolution of generalism: a case study in the zoonotic pathogen Campylobacter. Sci Rep. 7(1):9650.
Yahara K, et al. 2017. Genome-wide association of functional traits linked with Campylobacter jejuni survival from farm to fork. Environ Microbiol. 19(1):361–380.
Zerbino DR, Birney E. 2008. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18(5):821–829.

Associate editor: Josefa Gonzalez

Publisher
Oxford University Press
Copyright
© The Author(s) 2018. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
ISSN
1759-6653
eISSN
1759-6653
D.O.I.
10.1093/gbe/evy026

2018) were obtained (October 21, 2016) and analyzed to quantify the numbers of different STs isolated from humans and agricultural animals and contextualize this study.

Phylogenetic Tree Inference

Sequence alignments were obtained using a gene-by-gene approach (Sheppard et al. 2012). Briefly, the presence of 1,668 coding sequences (CDS) from the reference C. jejuni NCTC11168 genome (NCBI accession: NC_002163.1) in all 601 genomes of this study was inferred using BLAST with the following parameters: a gene was considered present when a local alignment match with the reference was obtained on >50% of the sequence length with >70% sequence identity. Using these criteria, 1,058 genes were shared by all 601 genomes from our data set, constituting the “core genome.” Gene-by-gene alignments using MAFFT (Katoh and Standley 2013) were concatenated to create a core genome gene-by-gene alignment that was used subsequently. For protein trees, in-frame translation was performed using custom scripts (supplementary file 1, Supplementary Material online) for each individual gene alignment, which were then concatenated. The resulting concatenations were used as an input for the reconstruction of phylogenetic trees, either using an approximation of the maximum-likelihood algorithm implemented in FastTree2 (Price et al. 2010) (fig. 2) or RAxML (Stamatakis 2014) (supplementary fig. S1, Supplementary Material online). For the comparison of nucleotide and in-frame translated phylogenetic trees, we used RAxML (Stamatakis 2014) with GTRGAMMA and PROTGAMMAGTR models, respectively. For amino acid trees, the analysis used a simple search under the GAMMA model of rate heterogeneity on the protein data set using empirical base frequencies and estimating a general time reversible model of amino acid substitution.

KPAX2 Method: Bayesian Clustering Based on Amino Acid Sequence

KPAX2 is a new Bayesian method for identifying evolutionary signals in amino acid sequences that relate to differential evolution of lineages that may be either monophyletic or polyphyletic, for example, resulting from the horizontal distribution of relevant genomic elements through recombination (Pessia et al. 2015). Earlier analysis of a database of thousands of influenza A virus H3N2 subtypes demonstrated that the method could accurately identify antigenic clusters determined by amino acid variation and the sequence positions relevant for the antigenic differences (Pessia et al. 2015). The concatenated set of 601 core genome sequences corresponded to 153,911 amino acid positions, harboring 17,405 polymorphic sites. KPAX2 was used with the default prior settings, and inference was initialized with a proposal partition of the samples obtained using the k-medoids algorithm based on Tajima and Nei (1984) pairwise distances of protein sequences together with the Tamura and Kumar (2002) correction for heterogeneous patterns. The initial number of clusters was chosen by selecting the k associated with the highest log posterior probability under the KPAX2 model. In total, 100 partitions were then created by applying random modifications to the initial proposal partition obtained from the k-medoids solution. Split, merge, and transfer operators were as previously described (Pessia et al. 2015). Each of the 100 partitions was then independently used as a starting state for the KPAX2 posterior maximization algorithm to ensure that the final estimate was as close to the global posterior mode as possible. The 100 KPAX2 runs were done in parallel on a cluster computer, where the individual runs took approximately 1–2 weeks until convergence. The clustering solution with the highest log posterior probability among the 100 independent runs was chosen as the final estimate. The source of isolates belonging to different KPAX clusters was indicated for isolates from: human clinical only (clinical); chicken and human clinical sources (chicken + clinical); cattle and human clinical sources (cattle + clinical); and chicken, cattle and human clinical sources (chicken + cattle + clinical) (supplementary table S2, Supplementary Material online). For each KPAX cluster, characteristic amino acids were determined (Pessia et al. 2015), as well as corresponding proteins and genes in the C. jejuni NCTC11168 reference genome (supplementary table S3, Supplementary Material online). This allowed for a comparison of KPAX clustering results with genome-wide association study (GWAS) results to identify the genes associated with clinical-only C. jejuni KPAX groups.

Prevalence of STs from Human-Only KPAX Clusters among Isolates from Human and Nonhuman Sources

Total prevalence of C. jejuni STs observed to belong to human-only KPAX clusters was quantified among samples isolated from human and nonhuman sources (mainly poultry and cattle) and was inferred using isolation source information specified in a total of 17,107 CC21, CC48, CC206, and CC45 isolate records, taken from a total of 49,598 archived isolate records from every CC publicly available in the pubMLST database (https://pubmlst.org/campylobacter/; accessed October 21, 2016).

SEER Method: Genome-Wide Association Mapping

We used a k-mer enrichment method to identify, from the nucleotide sequence data, which genomic elements were significantly more prevalent in one of two groups of isolates: the human-only KPAX clusters (group 1, n = 103) compared to the remainder of the C. jejuni population (group 2, n = 498) (Weinert et al. 2015; Lees et al. 2016). This binary trait analysis was performed to ensure that eventual gene regulatory elements or accessory genes associated with the clusters would
not remain unidentified, because the KPAX2 method is based only on core protein sequence variation. The input assemblies contained approximately 31 M unique k-mers with lengths between 10 and 99 nucleotides. The following filtering steps were applied to reduce the original k-mer input set by including only k-mers that: 1) had >75% frequency in group 1 and <25% frequency in group 2; 2) had a chi-square association test P-value < 10^(…); and 3) had association P-value < 10^(…) in a logistic regression model with the first three multidimensional scaling coordinates included as the population structure correction. The multidimensional scaling coordinates were calculated from a distance matrix based on 10,000 randomly selected k-mers from the initial set. The final set of genome-wide significant k-mers contained 347 k-mers, which were mapped to an annotated reference genome to identify their contexts.

Results

STs Vary in Frequency in Human Clinical and Agricultural Environments

Direct comparison of the relative prevalence of sequence types was performed using the entire Campylobacter PubMLST database. This contained a total of 49,598 entries on October 21, 2016. Of these, 13,095 belonged to the CCs 21, 48, and 206, previously shown to form a single sequence cluster based upon whole-genome analysis, and 4,012 belonged to the CC45 complex. Within the CC21 group there were 8,382 human clinical isolates and 3,869 originating from agricultural animal sources, while in CC45 there were 1,674 human clinical isolates and 1,685 agricultural isolates. The relative abundance of isolate STs belonging to CC21-48-206 and CC45 was determined (fig. 1). In both CCs, there was variation in the relative frequency of STs isolated from human clinical and agricultural animal samples.

[Figure 1: bar chart. x-axis: sequence type (ST) from this study (with n > 10 entries in pubMLST), grouped into CC-21/48/206 and CC-45; y-axis: prevalence in pubMLST (%); legend: clinical, agricultural animals.]

FIG. 1. Prevalence of clinical and agricultural C. jejuni within ST-21 and ST-45 CCs in a public archive repository. The prevalence of clinical (black) and poultry/livestock (gray) isolation sources in pubMLST for each ST in our data set with more than ten isolate records in the pubMLST database (https://pubmlst.org/campylobacter/; last accessed February 07, 2018). There were a total of 17,107 archived public isolate records.

Amino Acid Sequence-Based Analysis Reveals Human-Only Subclusters

The Bayesian model-based method KPAX2 was used to classify aligned proteins into functionally divergent groups, based upon amino acid residues of a collection of 601 genomes representing 66 STs belonging to the CC21 group and CC45. A total of 1,058 core CDS used in the nucleotide phylogeny were in silico translated and a concatenated amino acid alignment produced for each genome-sequenced strain. We then performed Bayesian clustering using the KPAX2 algorithm, and the tree was annotated with the 36 KPAX clusters identified (fig. 2). KPAX groups could be classified into four categories depending on sources of isolates: human only (12 KPAX groups, 112 isolates from 20 STs), human and chicken only (10 KPAX groups, 150 isolates from 20 STs), human and cattle only (4 KPAX groups, 33 isolates from 13 STs), and human, chicken and cattle (10 KPAX groups, 306 isolates from 24 STs). The isolate source within each KPAX group is shown in supplementary table S2, Supplementary Material online.

KPAX and nucleotide sequence clusters showed incomplete congruence. Amino acid clustering was polyphyletic when superimposed on the nucleotide phylogeny (fig. 2, supplementary fig. S1, Supplementary Material online) and in some cases, divergent lineages shared the same KPAX cluster. For example, the 138 isolates belonging to ST-21 were found in 7 different KPAX groups containing isolates from various sources. However, particular STs (ST-21, ST-50, ST-47, ST-44, ST-861, and ST-190) were assigned KPAX groups encompassing only isolates from humans. Examination of isolate records in the entire pubMLST database revealed that most isolates from STs assigned to human-only KPAX groups (276/283 isolates, in 15/20 STs) have also been isolated from humans and other host species, with only ST-6601, ST-6137, ST-5727, and ST-2355 having been isolated solely from humans (table 1). Obviously, KPAX clusters were not defined using the whole genomes of the pubMLST-archived comparative data set; however, it is useful to contextualize KPAX-ST correlation within a wider data set. It should be noted that the ST designation can have poor specificity in contrast to the lineages determined from whole genomes and therefore an isolate from a nonhuman host present in the pubMLST database may lack the genetic elements identified in our present analysis.

Identification of Genes with Human-Associated Amino Acid Signatures within the CC21 Group

We sought to identify the discriminatory amino acids that resulted in clustering of human clinical-only CC21 group isolates. We identified a total of 1,213 amino acid sites which mapped to 265 genes (supplementary table S4, Supplementary Material online). Mapping the physical location of these against the reference CC21 genome NCTC11168 suggested that these loci were distributed across the genome and not under strong linkage disequilibrium resulting from physical proximity (fig. 3A). Interestingly, a total of 24/265 (9.0%) genes were found to be associated with previous GWASs (supplementary table S4, Supplementary Material online). More specifically, 3 genes were predicted to have a role in survival from farm to clinical disease (Yahara et al. 2017), 8 genes to have a role in in vitro colonization of surfaces and aggregation (Pascoe et al. 2015), and 14 genes to have a role in nonhuman host adaptation (Sheppard, Didelot, Meric, et al. 2013) (supplementary table S4, Supplementary Material online). Although some of these associations were sometimes weak in the corresponding studies, they were nonetheless highlighted and are consistent with a general role in transmission and host colonization.

To confirm whether these loci were associated with a human clinical-only sublineage we also performed sequence element enrichment analysis, using SEER (Lees et al. 2016), to identify the genetic basis of human clinical-only sublineage strains compared with those from other host sources (fig. 3, supplementary tables S5 and S6, Supplementary Material online). A total of 181 genes (supplementary table S5, Supplementary Material online), containing 547 enriched k-mers, were obtained (supplementary table S6, Supplementary Material online). These included genes that have been identified in previous association studies (supplementary table S5, Supplementary Material online), in particular genes with putative roles in in vitro colonization of surfaces and aggregation, host adaptation and clinical disease (Sheppard, Didelot, Meric, et al. 2013; Pascoe et al. 2015; Yahara et al. 2017).

A total of 26 genes were significantly associated with human-only lineages in both KPAX clustering and SEER association analyses (fig. 3, table 2). Half of these genes have been described as important for host colonization or pathogenesis, nine in humans or human cell studies, and four in chicken colonization studies (table 2), consistent with a broad role for

[Figure 2: phylogenetic tree with clades labeled ST-45 complex and ST-21-48-206 complex; tips labeled with KPAX group numbers; legend "KPAX group sources": clinical + chicken, clinical + cattle, clinical + chicken + cattle, clinical only; asterisk: polyphyletic KPAX group; scale bar 0.001.]

FIG. 2. Population structure of 601 C. jejuni ST-21 and ST-45 complex isolates. Isolates are labeled by KPAX group labels (integers) and colored by their source distribution within KPAX groups: isolates from chicken and clinical sources (yellow), cattle and clinical sources (blue), chicken, cattle and clinical sources (pink), or clinical only (red). Polyphyletic KPAX groups, reflecting isolates in the same KPAX group but in multiple lineages on the tree, are indicated with an asterisk. The phylogenetic tree was reconstructed from a whole-genome gene-by-gene amino acid alignment, translated in-frame, using an approximation of the maximum-likelihood algorithm implemented in FastTree2, and using a general time reversible model.
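The two annotations used for the KPAX groups in figure 2, the source category of a group and whether it is polyphyletic, can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The isolate records, the group names (K1 to K3) and the lineage labels are hypothetical, and "polyphyletic" is approximated here as "members fall in more than one labeled lineage", which is a simplification of reading clades off the nucleotide tree.

```python
from collections import defaultdict

# Host sources in the order used by the figure legend.
SOURCE_ORDER = ["clinical", "chicken", "cattle"]

# Hypothetical records: (isolate, KPAX group, source, lineage on the nucleotide tree).
EXAMPLE_ISOLATES = [
    ("iso1", "K1", "clinical", "lineage_A"),
    ("iso2", "K1", "clinical", "lineage_C"),
    ("iso3", "K2", "clinical", "lineage_A"),
    ("iso4", "K2", "chicken", "lineage_A"),
    ("iso5", "K3", "cattle", "lineage_B"),
    ("iso6", "K3", "clinical", "lineage_B"),
    ("iso7", "K3", "chicken", "lineage_D"),
]


def summarize_kpax_groups(isolates):
    """Map each KPAX group to (source category, polyphyletic flag)."""
    sources = defaultdict(set)
    lineages = defaultdict(set)
    for _isolate, group, source, lineage in isolates:
        sources[group].add(source)
        lineages[group].add(lineage)

    summary = {}
    for group, seen in sources.items():
        if seen == {"clinical"}:
            category = "clinical only"
        else:
            # Sources outside SOURCE_ORDER are ignored in this sketch.
            category = " + ".join(s for s in SOURCE_ORDER if s in seen)
        # Assumption: a group spanning more than one lineage label is flagged.
        summary[group] = (category, len(lineages[group]) > 1)
    return summary


if __name__ == "__main__":
    for group, (category, flagged) in sorted(summarize_kpax_groups(EXAMPLE_ISOLATES).items()):
        print(group, category, "*" if flagged else "")
```

In this toy input, K1 is a clinical-only group found in two lineages, which is the pattern the study highlights for human-only KPAX clusters within the CC21 group.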
these genes in host adaptation and/or in multihost fitness. Of particular note within these genes were the flagellar gene flgH highlighted in a previous GWAS on nonchicken host adaptation (Sheppard, Didelot, Meric, et al. 2013), two genes (ceuC and ceuE) involved in the enterochelin iron uptake system in C. jejuni, a gene (aspB) involved in aspartate metabolism, and a gene (fdhD) encoding a formate dehydrogenase, a function that has been highlighted as important for survival from farm to clinical disease (Yahara et al. 2017). All five of these genes are known to be important in the invasion of mammalian cells and/or human colonization (Palyada et al. 2004; Guerry 2007; Novik et al. 2010; Sheppard, Didelot, Meric, et al. 2013; Yahara et al. 2017).

Discussion

An important aim in zoonotic pathogen research is to identify genetic and functional variations associated with lineages or sublineages that cause human infection. Comparative analysis of nucleotide sequence variation across the genome has improved understanding of the epidemiology and evolution of Campylobacter (Sheppard, Didelot, Jolley, et al. 2013; Gilbert et al. 2016; Llarena et al. 2016). Although this has provided a basis for identifying candidate genes with potential functional significance (Morley et al. 2015; Pascoe et al. 2015; Yahara et al. 2017), straightforward genome analysis often ignores factors relating to translation and the production of specific amino acid chains and proteins that may be important in host adaptation or pathogenicity. For example, although the four nucleotides can form 64 different triplets they only encode 20 amino acids. This means that the same amino acid can be encoded by different triplets, typically with variation at the third base, and divergent genomes may have convergent amino acid sequences that are potentially functionally important in host adaptation or pathogenesis. Analysis of encoded amino acid sequences in this study identified polyphyletic nucleotide sequence clusters within the CC21 group that clustered together within the same amino acid sequence clusters. These convergent human-only amino acid KPAX clusters, in divergent genomic backgrounds, may have been overlooked using conventional nucleotide sequence-based approaches.

Comparative analysis of the nucleotide sequence of the 601 C. jejuni genomes in this study identified STs belonging to the CC21 group and CC45 that were reported to have

Table 1
Prevalence of isolates from STs found in human-only KPAX groups in human and nonhuman sources

KPAX group | ST | Total number of isolates in our study | Associated hosts | Prevalence in human hosts in pubMLST (%) (a) | Prevalence in nonhuman hosts in pubMLST (%) (a)
KPAX-8 | ST-21* | 138 | Human, chicken, cattle | 66.5 | 22.4
KPAX-9 | ST-475 | 5 | Human | 75.0 | 19.4
KPAX-9 | ST-6601# | 1 | Human | 100.0 | 0.0
KPAX-19 | ST-50* | 100 | Human, chicken | 62.8 | 31.4
KPAX-19 | ST-5727# | 2 | Human | 100.0 | 0.0
KPAX-19 | ST-2355# | 1 | Human | 100.0 | 0.0
KPAX-20 | ST-47* | 3 | Human | 79.2 | 9.4
KPAX-20 | ST-5242# | 1 | Human | 100.0 | 0.0
KPAX-21 | ST-572 | 4 | Human | 82.7 | 11.8
KPAX-21 | ST-5138 | 1 | Human | 66.7 | 33.3
KPAX-26 | ST-44* | 6 | Human | 73.2 | 22.3
KPAX-27 | ST-50* | 100 | Human, chicken | 62.8 | 31.4
KPAX-28 | ST-21* | 138 | Human, chicken, cattle | 66.5 | 22.4
KPAX-28 | ST-861* | 4 | Human | 86.2 | 10.3
KPAX-28 | ST-5018 | 3 | Human | 90.5 | 4.8
KPAX-28 | ST-190* | 2 | Human | 54.7 | 43.1
KPAX-28 | ST-141 | 1 | Human | 72.0 | 24.0
KPAX-30 | ST-222 | 3 | Human | 78.9 | 21.1
KPAX-32 | ST-122 | 4 | Human | 78.2 | 13.9
KPAX-34 | ST-21* | 138 | Human, chicken, cattle | 66.5 | 22.4
KPAX-34 | ST-50* | 100 | Human, chicken | 62.8 | 31.4
KPAX-34 | ST-3769 | 1 | Human | 83.3 | 16.7
KPAX-34 | ST-520 | 1 | Human | 46.1 | 51.3
KPAX-35 | ST-6137# | 2 | Human | 100.0 | 0.0

NOTE. Asterisks indicate STs that are also found in other KPAX groups that are not human-only. Hash symbols (#) indicate STs that have never been isolated from nonhuman sources in our data set or pubMLST.
(a) pubMLST (https://pubmlst.org/campylobacter/) as accessed on October 21, 2016.
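The two prevalence columns of table 1 are percentages of the archived isolate records of each ST. A minimal Python sketch of that calculation follows. The records, the source labels and the ST names (ST-X, ST-Y) are hypothetical and are not pubMLST data. The sketch also assumes that the denominator is every record of the ST, including records with other or unspecified sources, which would be one way for the two percentages in a row to sum to less than 100.

```python
from collections import Counter

# Assumed labels for nonhuman host sources; real pubMLST source fields differ.
NONHUMAN_HOSTS = {"chicken", "cattle"}


def source_prevalence(records, sequence_type):
    """Return (% human, % nonhuman) among all records of one ST."""
    counts = Counter(src for st, src in records if st == sequence_type)
    total = sum(counts.values())
    if total == 0:
        raise ValueError(f"no records for {sequence_type}")
    human = 100.0 * counts["human"] / total
    nonhuman = 100.0 * sum(counts[host] for host in NONHUMAN_HOSTS) / total
    return round(human, 1), round(nonhuman, 1)


# Hypothetical (ST, source) records; not taken from pubMLST.
EXAMPLE_RECORDS = (
    [("ST-X", "human")] * 6
    + [("ST-X", "chicken")] * 2
    + [("ST-X", "cattle")] * 1
    + [("ST-X", "unknown")] * 1
    + [("ST-Y", "human")] * 3
)

if __name__ == "__main__":
    for st in ("ST-X", "ST-Y"):
        print(st, source_prevalence(EXAMPLE_RECORDS, st))
```

An ST such as the hypothetical ST-Y, with records from humans only, corresponds to the rows of table 1 marked with a hash symbol.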
been isolated at different frequencies from agricultural animal and human sources. This is consistent with other population genomic studies, where the variation in relative abundance has been explained by the different capacity of certain strains to survive through the poultry production chain at atmospheric oxygen concentrations (Yahara et al. 2017). Asymptomatic carriage of C. jejuni is not thought to be common in humans in industrialized countries (Lee et al. 2013). Therefore, under a simple transmission model, amino acid clusters would be expected to be present in both reservoir animal and infected human hosts. For this reason, the existence of strongly human-only amino acid KPAX clusters is unexpected. There are two possible explanations. First, isolates assigned to human-only KPAX clusters are derived from a source that is not represented in our isolate collection, which has not been captured by the sampling of isolates used in this study. Second, there are isolates that share amino acid clusters within CC21 group C. jejuni in our data set that increase in relative frequency in humans, compared with the isolates from other hosts. Additionally, it is possible that asymptomatic carriage of Campylobacter may be underestimated and underreported (Calva et al. 1988; Louwen et al. 2012; Lee et al. 2013; Islam et al. 2017). These factors could influence the evolution and population structure of symptomatic bacteria.

Examination of isolate records in the entire pubMLST database revealed that 97% of the isolates assigned to human-only amino acid KPAX clusters are of STs that have been isolated from other host species as well as humans (table 1). Notably, only five STs from human-only KPAX groups (corresponding to 7/276 isolates in our data set) have never been reported in nonhuman hosts, either in our data set or from isolate records in pubMLST. On the basis of the known sources of C. jejuni in human infection, including CC21 group isolates (Sheppard, Dallas, MacRae, et al. 2009; Sheppard, Dallas, Strachan, et al. 2009), the close similarity between C. jejuni populations on food and those from clinical samples (Kittl et al. 2013), and the presence of STs belonging to human-only amino acid KPAX clusters among agricultural hosts in pubMLST, it is unlikely that they indicate an unknown

[Figure 3: (A) circular map of the C. jejuni NCTC11168 reference genome; (B) bar chart by COG annotation category; x-axis: prevalence difference from reference genome annotation (%), from -6 (less prevalent) to 6 (more prevalent than in reference genome); legend: genes containing KPAX characteristic sites (n = 265), genes containing associated k-mers (SEER) (n = 181), overlap (n = 26).]

FIG. 3. Genes associated with clinical-only C. jejuni KPAX groups. (A) GWAS results visualized on a circular reference genome. The outer circle indicates genes from the C. jejuni NCTC11168 reference genome, with core genes shared by all isolates in our data set (black) and accessory genes (gray) indicated. Genes found to contain characteristic amino acid sites defining KPAX groups are represented (red ticks) along with a quantitative visualization of the number of these sites per gene (red dots; scale of the quantification from 0 to 420). Genes found to contain k-mers associated with clinical-only KPAX groups using SEER are represented (blue ticks) along with a quantitative visualization of the number of these k-mers mapped per gene (blue dots; scale of the quantification from 0 to 25). Black ticks indicate genes containing both KPAX group characteristic sites and associated k-mers using SEER. (B) Difference in COGs prevalence (%) among genes containing KPAX characteristic sites (red) and genes containing associated k-mers inferred by SEER (blue) with COGs prevalence in the C. jejuni NCTC11168 reference genome annotation.
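The quantity plotted in figure 3B, the difference between the COG category prevalence in a gene set and in the reference genome annotation, can be written as a short Python sketch. This is only an illustration of the arithmetic under stated assumptions: the gene lists below are hypothetical toy inputs, each gene is assumed to carry exactly one COG code, and the function names are not taken from the study.

```python
from collections import Counter


def cog_prevalence(cog_codes):
    """Percentage of genes assigned to each COG category."""
    if not cog_codes:
        raise ValueError("empty gene list")
    counts = Counter(cog_codes)
    return {code: 100.0 * n / len(cog_codes) for code, n in counts.items()}


def cog_prevalence_difference(gene_set_cogs, reference_cogs):
    """Prevalence in the gene set minus prevalence in the reference (percentage points)."""
    subset = cog_prevalence(gene_set_cogs)
    reference = cog_prevalence(reference_cogs)
    categories = sorted(set(subset) | set(reference))
    return {c: subset.get(c, 0.0) - reference.get(c, 0.0) for c in categories}


# Hypothetical COG codes, one per gene (J: translation, E: amino acid transport
# and metabolism, R: general function prediction only, S: function unknown).
REFERENCE_COGS = ["J"] * 2 + ["E"] * 2 + ["R"] * 4 + ["S"] * 2
GENE_SET_COGS = ["J", "J", "E", "R"]

if __name__ == "__main__":
    for code, diff in cog_prevalence_difference(GENE_SET_COGS, REFERENCE_COGS).items():
        print(f"{code}: {diff:+.1f}")
```

A positive value means the category is more prevalent in the gene set than in the reference annotation, and a negative value means it is less prevalent, matching the "less" and "more" directions of the figure axis.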
host source population, although this cannot be ruled out in this study.

Our results are therefore consistent with the increase in relative frequency of particular amino acid sequence subclusters that are uncommon in animal hosts, among isolates from humans. Host colonization potential is influenced by the adaptive genomic variations that exist before and after transmission to the new host species (Geoghegan et al. 2016). In both cases, population bottlenecks reduce the genetic variance in the population at interhost transmission which would account for the increased relative frequency of human-only amino acid KPAX clusters. It remains difficult to differentiate genetic changes associated with bottlenecking and drift from adaptive physiological changes that directly impact pathogenesis, such as human tissue tropism and virulence. Furthermore, human passage can induce genetic variation in contingency genes coding surface structure through frame shifts and phase variation (Bayliss et al. 2012; Revez et al. 2013; Thomas et al. 2014). However, the sharing of amino acid sequence clusters by polyphyletic lineages is evidence of homoplasy and investigating the putative function of these genes may provide clues about their potential role in human colonization. Human-only KPAX clusters are present in every major lineage within the CC21 group (fig. 2) and are notably absent among CC45 isolates. This asymmetry cannot be explained by an insufficient sample size from the CC45 population in our data set and may suggest that, despite being an efficient human colonizer, CC45 strains may lack the suitable genetic background for acquisition of genomic elements that are associated with elevated human colonization that we observe in the CC21 group. Further analysis of larger sample sets, potentially including phenotypic analyses, is needed to confirm this.

Genome-wide association methods that have recently been applied to bacteria (Sheppard, Didelot, Meric, et al. 2013) allow the investigation of genetic variation that underlies important phenotypes. By quantifying the nucleotide sequence that was enriched in isolates from humans (Lees et al. 2016) across the genomes, we were able to investigate the putative function of genes with human-only amino acid KPAX clusters. A total of 26 genes were identified (table 2), half of

Table 2
List of Genes Associated with Clinical-Only Campylobacter jejuni KPAX Groups

Name | Alias | Operon (a) | Predicted product (COG) | COG code | COG description | Number of characteristic sites (KPAX) | Number of mapping k-mers (SEER) | Notes | References
cj1346c | dxr | 500 | 1-Deoxy-D-xylulose 5-phosphate reductoisomerase | I | Lipid transport and metabolism genes | 52 | 8 | |
cj1347c | cdsA | 500 | Phosphatidate cytidylyltransferase | I | Lipid transport and metabolism genes | 8 | 1 | maf adhesins are included in the maf6-Cj1347 genomic region | (46)
cj1253 | pnp | 472 | Polynucleotide phosphorylase/polyadenylase | J | Translation | 7 | 5 | |
cj0762c | aspB | 285 | Aspartate aminotransferase | E | Amino acid transport and metabolism genes | 6 | 1 | A aspB mutant is defective for entry into cultured human epithelial cells | (38)
cj0810 | nadE | 301 | NAD synthetase | H | Coenzyme transport and metabolism genes | 6 | 1 | |
cj0006 | - | 4 | Putative Na+/H+ antiporter family protein | R | General function prediction only | 5 | 4 | Cj0006 is expressed in vivo when C. jejuni infects chicken | (48)
cj0389 | serS | 149 | Seryl-tRNA synthetase | J | Translation | 5 | 1 | |
cj0542 | hemA | 213 | Glutamyl-tRNA reductase | H | Coenzyme transport and metabolism genes | 3 | 3 | |
cj0767c | coaD | 286 | Phosphopantetheine adenylyltransferase | H | Coenzyme transport and metabolism genes | 3 | 1 | |
cj1620c | mutY | 593 | A/G-specific adenine glycosylase | L | Replication, recombination and repair | 3 | 2 | An SNP in mutY is associated with increase of antibiotic resistance | Dai et al. (2015)
cj0005c | - | 3 | Molybdopterin containing oxidoreductase | R | General function prediction only | 2 | 2 | Infection of and adherence to human Caco2 cells in vitro was strongly reduced in a cj0005c mutant | (47)
cj0069 | - | 38 | Hypothetical protein Cj0069 | J | Translation | 2 | 1 | Involved in the proximal response to cell adhesion and biofilm formation | Asakura et al. (2007)
cj0598 | - | 231 | Hypothetical protein Cj0598 | S | Function unknown genes | 2 | 5 | |
cj0689 | ackA | 259 | Acetate kinase | C | Energy production and conversion genes | 2 | 2 | Involved in nutrient acquisition, acetate metabolism |
cj1076 | proC | 404 | Pyrroline-5-carboxylate reductase | E | Amino acid transport and metabolism genes | 2 | 1 | |
cj1157 | dnaX | 426 | DNA polymerase III subunits gamma and tau | L | Replication, recombination and repair | 2 | 2 | Highlighted in a study as a putative Guillain–Barre syndrome marker | (52)
cj1508c | fdhD | 555 | Formate dehydrogenase accessory protein | C | Energy production and conversion genes | 2 | 3 | Formate metabolism is involved in host association and survival in the food chain from farm to human disease | (12)
cj0498 | trpC | 200 | Indole-3-glycerol-phosphate synthase | E | Amino acid transport and metabolism genes | 1 | 2 | In a genomic region identified as important for cell hyperinvasiveness in a transposon assay | (53)
cj0518 | htpG | 206 | Heat shock protein 90 | O | Posttranslational modification, protein turnover, chaperones genes | 1 | 1 | Associated in GWAS on biofilm formation (heat shock protein) | Pascoe et al. (2017)
cj0543 | proS | 213 | Prolyl-tRNA synthetase | J | Translation | 1 | 3 | |
cj0687c | flgH | 258 | Flagellar basal body L-ring protein | N | Cell motility genes | 1 | 3 | Flagellar assembly cluster; flagellar motility is important for human and chicken colonization, and possible secretion of virulence factors/Associated with cattle adaptation in GWAS | (23, 37)
cj1056c | - | 398 | Putative carbon–nitrogen hydrolase family protein | R | General function prediction only | 1 | 1 | Expression of cj1056c is modulated at low pH in vitro | Reid et al. (2008)
cj1261 | racR | 477 | Two-component regulator | K | Transcription | 1 | 6 | The Campylobacter RacRS system regulates fumarate utilization in a low oxygen environment, and racR mutants show reduced colonization of chicken | (50, 51)
cj1271c | tyrS | 479 | Tyrosyl-tRNA synthetase | J | Translation | 1 | 1 | TyrS was overexpressed in a poor colonizer of chicken/Associated with cattle adaptation in GWAS | (23, 49)
cj1353 | ceuC | 502 | Enterochelin uptake permease | P | Inorganic ion transport and metabolism genes | 1 | 5 | Uptake of siderophores is a described virulence/host colonization trait | (45)
cj1355 | ceuE | 502 | Enterochelin uptake periplasmic binding protein | P | Inorganic ion transport and metabolism genes | 1 | 5 | ceuE mutant shows decreased chicken colonization | (39)

NOTE. Genes are overlapping between the two analyses (KPAX and SEER).
(a) As predicted by OperonPredictor (http://biocomputo2.ibt.unam.mx/OperonPredictor/; last accessed February 07, 2018).

abilities (Palyada et al. 2004). Additionally, the cdsA gene is located in the genomic region of known maf adhesins, involved in survival and host colonization (Karlyshev et al. 2002). Knockout mutants of cj0005c, an uncharacterized oxidoreductase, have been shown to be strongly impaired in infection abilities and adherence to human Caco2 cells in vitro (Tareen et al. 2011), whereas a neighboring gene, cj0006, encoding a putative transporter, has been shown in global transcriptomic studies to be overexpressed in vivo when C. jejuni infects chicken (Hu et al. 2014). Finally, the tyrS gene, predicted to encode a tyrosyl-tRNA synthetase, has been observed to be overexpressed in a poor chicken colonizer strain of C. jejuni (Seal et al. 2007). Additionally, it has been associated with mammalian (cattle) adaptation in a previous GWAS from our laboratory (Sheppard, Didelot, Meric, et al. 2013).

Genes predicted to have a role in metabolism were also highlighted. The ackA and aspB genes are involved in acetate and aspartate metabolism, respectively, and have been shown in mutagenesis studies to be important for entry into human epithelial cells in vitro (Novik et al. 2010). The fdhD gene encoding a formate dehydrogenase was also associated with isolates belonging to human-only amino acid clusters. Formate metabolism has been previously implicated in host association and survival in the food production chain from farm to human disease (Yahara et al. 2017). The racR gene which regulates fumarate utilization in a low-oxygen environment also displayed human-associated variation and racR-deficient mutants have shown reduced chicken colonization in vivo (Bras et al. 1999; van der Stel et al. 2015). Other genes with variation associated with the CC21 human amino acid clusters included the dnaX gene that encodes a DNA polymerase and is a marker for the campylobacteriosis sequelae Guillain–Barre syndrome (Godschalk et al. 2006) and trpC that encodes an indole-3-glycerol-phosphate synthase in a genomic region important for human cell hyperinvasiveness (Javed et al. 2010).

Genomic variation associated with clinical C. jejuni isolates includes elements associated with the primary host (Sheppard, Didelot, Meric, et al. 2013) and the food production chain (Yahara et al.
2017), as well as variation which which have been previously linked to host colonization or confers an adaptive advantage to human colonization and pathogenesis, nine in humans or human cells, four in chicken. may directly impact pathogenesis (Thompson and Gaynor For example, flgH, a gene associated with flagellar assembly 2008). Evidence of genetic bottlenecks and selection fostered (table 2) and otherwise associated with adaptation in a mam- by this complex fitness landscape will not only be reflected in malian host (Sheppard, Didelot, Meric, et al. 2013). Flagellar nucleotide sequence variation but also in features, such as motility has been shown to be important for human and gene order, distribution of CDS on leading and lagging chicken colonization, and possibly for the secretion of viru- strands, GC skew, and codon usage (Bentley and Parkhill lence factors into host cells (Guerry 2007). Genes directly in- 2004; Rocha 2004). By combining analysis of nucleotide se- volved in host colonization also included ceuCE, involved in quence and amino acid variation we were able to identify a enterochelin uptake (table 2). The uptake of siderophore has subset of human-associated C. jejuni. As these isolates are been described as a virulence/host colonization trait in found in nonhuman hosts, we interpret this as evidence of Campylobacter (Richardson and Park 1995), and a ceuE mu- a genetic bottleneck that increases the relative frequency of tant has been shown to be altered in chicken colonization certain strains in the infected individuals. Although larger scale 772 Genome Biol. Evol. 10(3):763–774 doi:10.1093/gbe/evy026 Advance Access publication February 14, 2018 Downloaded from https://academic.oup.com/gbe/article-abstract/10/3/763/4857209 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Convergent Amino Acid Signatures in C. jejuni GBE Frank C, et al. 2011. 
Epidemic profile of Shiga-toxin-producing Escherichia studies are necessary to confirm a potential adaptive role for coli O104:H4 outbreak in Germany. N Engl J Med. the human-associated variation, our analysis has identified a 365(19):1771–1780. group of human-pathogenic C. jejuni that do not exhibit typ- Friedman CR, et al.; Emerging Infections Program FoodNet Working ical source-sink epidemiology, potentially reflecting human Group. 2004. Risk factors for sporadic Campylobacter infection in tissue tropism or virulence. the United States: a case-control study in FoodNet sites. Clin Infect Dis. 38(s3):S285–S296. Geoghegan JL, Senior AM, Holmes EC. 2016. Pathogen population bottle- Supplementary Material necks and adaptive landscapes: overcoming the barriers to disease emergence. Proc Biol Sci. 283(1837):20160727. Supplementary data areavailableat Genome Biology and Gilbert MJ, et al. 2016. Comparative genomics of Campylobacter fetus Evolution online. from reptiles and mammals reveals divergent evolution in host- associated lineages. Genome Biol Evol. 8(6):2006–2019. Godschalk PC, et al. 2006. Identification of DNA sequence variation in Campylobacter jejuni strains associated with the Guillain-Barre syn- drome by high-throughput AFLP analysis. BMC Microbiol. 6:32. Acknowledgments Guerry P. 2007. Campylobacter flagella: not just for motility. Trends Microbiol. 15(10):456–461. G.M. was supported by an National Institute for Social Care Guyard-Nicodeme M, et al. 2015. Prevalence and characterization of andHealthResearchFellowship(HF-14-13). Researchand Campylobacter jejuni from chicken meat sold in French retail outlets. computation in S.K.S.’s laboratory was supported by a grant Int J Food Microbiol. 203:8–14. from the Medical Research Council (MR/L015080/1), the Hu Y, Huang J, Jiao XA. 2014. Screening of genes expressed in vivo during Wellcome Trust (088786/C/09/Z), the Food Standards interaction between chicken and Campylobacter jejuni.J Microbiol Biotechnol. 
24(2):217–224. Agency (FS246004), and the Biotechnology and Biological Islam Z, et al. 2017. Capsular genotype and lipooligosaccharide locus class Sciences Research Council (BB/I02464X/1). J.C. was sup- distribution in Campylobacter jejuni from young children with diarrhea ported by ERC grant no. 742158. We acknowledge MRC and asymptomatic carriers in Bangladesh. Eur J Clin Microbiol Infect CLIMB (S.K.S., Co-PI) that supported computationally inten- Dis. doi:10.1007/s10096-017-3165-7. sive analysis. Javed MA, et al. 2010. Transposon mutagenesis in a hyper-invasive clinical isolate of Campylobacter jejuni reveals a number of genes with poten- tial roles in invasion. Microbiology 156(4):1134–1143. Literature Cited Jolley KA, Maiden MC. 2010. BIGSdb: scalable analysis of bacterial genome variation at the population level. BMC Bioinformatics 11:595. Asakura H, Yamasaki M, Yamamoto S, Igimi S. 2007. Deletion of peb4 Kaper JB, Nataro JP, Mobley HL. 2004. Pathogenic Escherichia coli.Nat Rev gene impairs cell adhesion and biofilm formation in Campylobacter Microbiol. 2(2):123–140. jejuni. FEMS Microbiol Lett. 275(2):278–285. Kapperud G, et al. 2003. Factors associated with increased and decreased Bankevich A, et al. 2012. SPAdes: a new genome assembly algorithm and risk of Campylobacter infection: a prospective case-control study in its applications to single-cell sequencing. J Comput Biol. Norway. Am J Epidemiol. 158(3):234–242. 19(5):455–477. € € Karenlampi R, Rautelin H, Scho ¨ nberg-Norio D, Paulin L, Hanninen M-L. Bayliss CD, et al. 2012. Phase variable genes of Campylobacter jejuni ex- 2007. Longitudinal study of Finnish Campylobacter jejuni and C. coli hibit high mutation rates and specific mutational patterns but muta- isolates from humans, using multilocus sequence typing, including bility is not the major determinant of population structure during host comparison with epidemiological data and isolates from poultry and colonization. Nucleic Acids Res. 
40(13):5876–5889. cattle. Appl Environ Microbiol. 73(1):148–155. Bentley SD, Parkhill J. 2004. Comparative genomic structure of prokar- Karlyshev AV, Linton D, Gregson NA, Wren BW. 2002. A novel paralogous yotes. Annu Rev Genet. 38:771–792. gene family involved in phase-variable flagella-mediated motility in Bras AM, Chatterjee S, Wren BW, Newell DG, Ketley JM. 1999. A novel Campylobacter jejuni. Microbiology 148(Pt 2):473–480. Campylobacter jejuni two-component regulatory system important for Katoh K, Standley DM. 2013. MAFFT multiple sequence alignment soft- temperature-dependent growth and colonization. J Bacteriol. ware version 7: improvements in performance and usability. Mol Biol 181(10):3298–3302. Evol. 30(4):772–780. Calva JJ, Ruiz-Palacios GM, Lopez-Vidal AB, Ramos A, Bojalil R. 1988. Kittl S, et al. 2013. Comparison of genotypes and antibiotic resistances of Cohort study of intestinal infection with campylobacter in Mexican Campylobacter jejuni and Campylobacter coli on chicken retail meat children. Lancet 1(8584):503–506. and at slaughter. Appl Environ Microbiol. 79(12):3875–3878. Cody AJ, et al. 2013. Real-time genomic epidemiological evaluation of Lee G, et al. 2013. Symptomatic and asymptomatic Campylobacter infec- human Campylobacter isolates by use of whole-genome multilocus tions associated with reduced growth in Peruvian children. PLoS Negl sequence typing. J Clin Microbiol. 51(8):2526–2534. Trop Dis. 7(1):e2036. Dearlove BL, et al. 2016. Rapid host switching in generalist Campylobacter Lees JA, et al. 2016. Sequence element enrichment analysis to determine the strains erodes the signal for tracing human infections. ISME J. genetic basis of bacterial phenotypes. Nat Commun. 7:12797. 10(3):721–729. Levesque S, Frost E, Arbeit RD, Michaud S. 2008. Multilocus sequence Dai L, Muraoka WT, Wu Z, Sahin O, Zhang Q. 2015. 
A single nucleotide typing of Campylobacter jejuni isolates from humans, chickens, raw change in mutY increases the emergence of antibiotic-resistant milk, and environmental water in Quebec, Canada. J Clin Microbiol. Campylobacter jejuni mutants. J Antimicrob Chemother. 46(10):3404–3411. 70(10):2739–2748. Llarena AK, et al. 2016. Monomorphic genotypes within a generalist lin- EFSA 2015. The European Union summary report on trends and sources of eage of Campylobacter jejuni show signs of global dispersion. Microb zoonoses, zoonotic agents and food-borne outbreaks in 2013. Genom. 2(10):e000088. EFSA J. 13(1):3991. Genome Biol. Evol. 10(3):763–774 doi:10.1093/gbe/evy026 Advance Access publication February 14, 2018 773 Downloaded from https://academic.oup.com/gbe/article-abstract/10/3/763/4857209 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Meric et al. GBE Louwen R, et al. 2012. Campylobacter bacteremia: a rare and under- Sheppard SK, Didelot X, Meric G, et al. 2013. Genome-wide association reported event? Eur J Microbiol Immunol (Bp). 2(1):76–87. study identifies vitamin B5 biosynthesis as a host specificity factor in Morley L, et al. 2015. Gene loss and lineage-specific restriction-modifica- Campylobacter. Proc Natl Acad Sci U S A. 110(29):11923–11927. tion systems associated with niche differentiation in the Sheppard SK, et al. 2010. Host association of Campylobacter genotypes Campylobacter jejuni sequence type 403 clonal complex. Appl transcends geographic variation. Appl Environ Microbiol. Environ Microbiol. 81(11):3641–3647. 76(15):5269–5277. Mughini Gras L, et al. 2012. Risk factors for campylobacteriosis of chicken, Sheppard SK, et al. 2011. Niche segregation and genetic structure of ruminant, and environmental origin: a combined case-control and Campylobacter jejuni populations from wild and agricultural host spe- source attribution analysis. PLoS One 7(8):e42599. cies. Mol Ecol. 20(16):3484–3490. Mullner P, et al. 2009. 
Assigning the source of human campylobacteriosis Sheppard SK, et al. 2014. Cryptic ecology among host generalist in New Zealand: a comparative genetic and epidemiological approach. Campylobacter jejuni in domestic animals. Mol Ecol. Infect Genet Evol. 9(6):1311–1319. 23(10):2442–2451. Novik V, Hofreuter D, Galan JE. 2010. Identification of Campylobacter Sheppard SK, Jolley KA, Maiden MCJ. 2012. A gene-by-gene approach to jejuni genes involved in its interaction with epithelial cells. Infect bacterial population genomics: whole genome MLST of Immun. 78(8):3540–3553. Campylobacter. Genes 3(4):261–277. Palyada K, Threadgill D, Stintzi A. 2004. Iron acquisition and regulation in Skarp CPA, Hanninen ML, Rautelin HIK. 2016. Campylobacteriosis: the Campylobacter jejuni. J Bacteriol. 186(14):4714–4729. role of poultry meat. Clin Microbiol Infect. 22(2):103–109. Pantosti A, Sanchini A, Monaco M. 2007. Mechanisms of antibiotic resis- Sopwith W, et al. 2008. Identification of potential environmentally tance in Staphylococcus aureus. Future Microbiol. 2(3):323–334. adapted Campylobacter jejuni strain, United Kingdom. Emerg Infect Pascoe B, et al. 2015. Enhanced biofilm formation and multi-host trans- Dis. 14(11):1769–1773. mission evolve from divergent genetic backgrounds in Campylobacter Stamatakis A. 2014. RAxML version 8: a tool for phylogenetic anal- jejuni. Environ Microbiol. 17(11):4779–4789. ysis and post-analysis of large phylogenies. Bioinformatics Pascoe B, et al. 2017. Local genes for local bacteria: evidence of allopatry in 30(9):1312–1313. the genomes of transatlantic Campylobacter populations. Mol Ecol. Tajima F, Nei M. 1984. Estimation of evolutionary distance between nu- 26(17):4497–4508. cleotide sequences. Mol Biol Evol. 1(3):269–285. Pessia A, Grad Y, Cobey S, Puranen JS, Corander J. 2015. K-Pax2: Bayesian Tamura K, Kumar S. 2002. 
Evolutionary distance estimation under hetero- identification of cluster-defining amino acid positions in large se- geneous substitution pattern among lineages. Mol Biol Evol. quence datasets. Microb Genom. 1(1):e000025. 19(10):1727–1736. Price MN, Dehal PS, Arkin AP. 2010. FastTree 2–approximately maximum- Tareen AM, Dasti JI, Zautner AE, Gross U, Lugert R. 2011. Sulphite: cyto- likelihood trees for large alignments. PLoS One 5(3):e9490. chrome c oxidoreductase deficiency in Campylobacter jejuni reduces mo- Read DS, et al. 2013. Evidence for phenotypic plasticity among multihost tility, host cell adherence and invasion. Microbiology 157(Pt Campylobacter jejuni and C. coli lineages, obtained using ribosomal 6):1776–1785. multilocus sequence typing and Raman spectroscopy. Appl Environ Thepault A, et al. 2017. Genome-wide identification of host-segregating Microbiol. 79(3):965–973. epidemiological markers for source attribution in Campylobacter Reid AN, et al. 2008. Identification of Campylobacter jejuni genes contrib- jejuni. Appl Environ Microbiol. 17;83(7):e03085–16. uting to acid adaptation by transcriptional profiling and genome-wide Thomas DK, et al. 2014. Comparative variation within the genome of mutagenesis. Appl Environ Microbiol. 74(5):1598–1612. Campylobacter jejuni NCTC 11168 in human and murine hosts. Revez J, Schott T, Llarena A-K, Rossi M, H€ anninen M-L. 2013. Genetic PLoS One 9(2):e88229. heterogeneity of Campylobacter jejuni NCTC 11168 upon human in- Thompson SA, Gaynor EC. 2008. Campylobacter jejuni host tissue tropism: fection. Infect Genet Evol. 16:305–309. a consequence of its low-carb lifestyle? Cell Host Microbe Richardson PT, Park SF. 1995. Enterochelin acquisition in Campylobacter 4(5):409–410. coli: characterization of components of a binding-protein-dependent van der Stel AX, et al. 2015. The Campylobacter jejuni RacRS system transport system. Microbiology 141(12):3181–3191. regulates fumarate utilization in a low oxygen environment. 
Environ Rocha EP. 2004. Codon usage bias from tRNA’s point of view: redun- Microbiol. 17(4):1049–1064. dancy, specialization, and efficient decoding for translation optimiza- Vidal AB, et al. 2016. Genetic diversity of Campylobacter jejuni and tion. Genome Res. 14(11):2279–2286. Campylobacter coli isolates from conventional broiler flocks and the Sahin O, et al. 2012. Molecular evidence for zoonotic transmission of an impacts of sampling strategy and laboratory method. Appl Environ emergent, highly pathogenic Campylobacter jejuni clone in the United Microbiol. 82(8):2347–2355. States. J Clin Microbiol. 50(3):680–687. Weinert LA, et al. 2015. Genomic signatures of human and animal disease Sanad YM, et al. 2011. Genotypic and phenotypic properties of cattle- in the zoonotic pathogen Streptococcus suis. Nat Commun. associated Campylobacter and their implications to public health in the 6(1):6740. USA. PLoS One 6(10):e25778. Wimalarathna HM, et al. 2013. Widespread acquisition of antimicrobial re- Seal BS, et al. 2007. Proteomic analyses of a robust versus a poor chicken sistance among Campylobacter isolates from UK retail poultry and evi- gastrointestinal colonizing isolate of Campylobacter jejuni. J Proteome dence for clonal expansion of resistant lineages. BMC Microbiol. 13:160. Res. 6(12):4582–4591. Woodcock DJ, et al. 2017. Genomic plasticity and rapid host switching can Sheppard SK, Dallas JF, MacRae M, et al. 2009. Campylobacter genotypes promote the evolution of generalism: a case study in the zoonotic from food animals, environmental sources and clinical disease in pathogen Campylobacter. Sci Rep. 29;7(1):9650. Scotland 2005/6. Int J Food Microbiol. 134(1–2):96–103. Yahara K, et al. 2017. Genome-wide association of functional traits linked Sheppard SK, Dallas JF, Strachan NJ, et al. 2009. Campylobacter genotyp- with Campylobacter jejuni survival from farm to fork. Environ ing to determine the source of human infection. Clin Infect Dis. Microbiol. 
19(1):361–380. 48(8):1072–1078. Zerbino DR, Birney E. 2008. Velvet: algorithms for de novo short read Sheppard SK, Didelot X, Jolley KA, et al. 2013. Progressive genome-wide assembly using de Bruijn graphs. Genome Res. 18(5):821–829. introgression in agricultural Campylobacter coli. Mol Ecol. 22(4):1051–1064. Associate editor: Josefa Gonzalez 774 Genome Biol. Evol. 10(3):763–774 doi:10.1093/gbe/evy026 Advance Access publication February 14, 2018 Downloaded from https://academic.oup.com/gbe/article-abstract/10/3/763/4857209 by Ed 'DeepDyve' Gillespie user on 16 March 2018

Journal

Genome Biology and Evolution, Oxford University Press

Published: Mar 1, 2018
