Evolutionary Genetics of Cytoplasmic Incompatibility Genes cifA and cifB in Prophage WO of Wolbachia

Evolutionary Genetics of Cytoplasmic Incompatibility Genes cifA and cifB in Prophage WO of Wolbachia The bacterial endosymbiont Wolbachia manipulates arthropod reproduction to facilitate its maternal spread through host populations. The most common manipulation is cytoplasmic incompatibility (CI): Wolbachia-infected males produce mod- ified sperm that cause embryonic mortality, unless rescued by embryos harboring the same Wolbachia. The genes underlying CI, cifA and cifB, were recently identified in the eukaryotic association module of Wolbachia’s prophage WO. Here, we use transcriptomic and genomic approaches to address three important evolutionary facets of the cif genes. First, we assess whetherornot cifA and cifB comprise a classic toxin–antitoxin operon in wMel and show that the two genes exhibit striking, transcriptional differences across host development. They can produce a bicistronic message despite a predicted hairpin termination element in their intergenic region. Second, cifA and cifB strongly coevolve across the diversity of phage WO. Third, we provide new domain and functional predictions across homologs within Wolbachia, and show that amino acid sequences vary substantially across the genus. Finally, we investigate conservation of cifA and cifB and find frequent deg- radation and loss of the genes in strains that no longer induce CI. Taken together, we demonstrate that cifA and cifB exhibit complex transcriptional regulation in wMel, provide functional annotations that broaden the potential mechanisms of CI induction, and report recurrent erosion of cifA and cifB in non-CI strains, thus expanding our understanding of the most widespread form of reproductive parasitism. Key words: symbiosis, reproductive manipulation, gene loss, bacteriophage, prophage. Introduction produce modified sperm that can only be rescued by eggs The genus Wolbachia is the most widespread group of ma- infected with the same Wolbachia strain (Yen and Barr 1971). ternally transmitted endosymbiotic bacteria (Zug et al. 2012). If the modified sperm fertilize eggs infected with no They occur worldwide in numerous arthropods and nemato- Wolbachia (unidirectional CI) or a genetically incompatible des and can selfishly manipulate reproduction (Werren et al. Wolbachia strain (bidirectional CI), then delayed histone de- 2008), confer antiviral defense (Teixeira et al. 2008; Bian et al. position, improper chromosome condensation, and cell divi- 2010), and assist reproduction and development of their hosts sion abnormalities result in embryonic arrest and death (Lassy (Hoerauf et al. 1999; Dedeine et al. 2001; Hosokawa et al. and Karr 1996; Tram and Sullivan 2002; Serbus et al. 2008; 2010). The most common parasitic manipulation is cytoplas- Landmann et al. 2009). Other described reproductive manip- mic incompatibility (CI), whereby Wolbachia-infected males ulations include parthenogenesis (Stouthamer et al. 1990), The Author(s) 2018. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non- commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com 434 Genome Biol. Evol. 10(2):434–451. doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Evolutionary Genetics of cifA and cifB GBE male-killing (Hurst et al. 1999), and feminization (Rousset vertically transmitted from mothers to offspring. The genes et al. 1992), all of which enhance the fitness of Wolbachia- were proposed as candidate CI effectors due to the presence infected females and assist the spread of the infected matri- of one of the protein products in the spermathecae of line through a population. These manipulations, once sus- infected female mosquitoes (Beckmann and Fallon 2013) tained, can also impact host evolution including speciation and their absence in the wAu Wolbachia strain that lost CI (Bordenstein et al. 2001; Jaenike et al. 2006; Brucker and function (Sutton et al. 2014). Bordenstein 2013) and mating behaviors (Randerson et al. The wMel homologs of these genes are designated CI 2000; Moreau et al. 2001; Miller et al. 2010; Shropshire factors cifA (locus WD0631) and cifB (locus WD0632), with and Bordenstein 2016). cifA always encoded directly upstream of cifB (LePage et al. In addition to the aforementioned reproductive manipula- 2017). The gene set occurs in varying copy number across 11 tions, Wolbachia strains affect host biology by provisioning total CI-inducing strains, and the copy number tentatively nutrients (Hosokawa et al. 2010), altering host survivorship correlates with CI levels. Core sequence changes of the two (Min and Benzer 1997) and fecundity (Stouthamer and Luck genes exhibit a pattern of codivergence and in turn closely 1993; Dedeine et al. 2001), and importantly, protecting the match bidirectional incompatibility patterns between host against pathogens (Teixeira et al. 2008; Kambris et al. Wolbachia strains. Homologs of CifA and CifB protein 2009; Moreira et al. 2009; Bian et al. 2010; Hughes et al. sequences belong to four distinct phylogenetic Types (desig- 2011; Walker et al. 2011). The combination of reproductive nated Types I–IV) that do not correlate with various phylog- manipulations that enable Wolbachia to spread in a pop- enies of Wolbachia housekeeping genes or gpW (locus ulation and the ability to reduce vector competence through WD0640) in phage WO (LePage et al. 2017). The homolo- pathogen protection have placed Wolbachia in the forefront gous sequences in wPip also cluster in Type I, though they of efforts to control disease carrying arthropod populations are 66% and 76% different from wMel’s, respectively (Turelli and Hoffmann 1991; Zabalou et al. 2004; Hoffmann (Beckmann et al. 2017). Hereinafter we use cifA and cifB to et al. 2011; Walker et al. 2011; LePage and Bordenstein 2013; refer to these genes, unless specifically referring to analyses of Bourtzis et al. 2014). Despite these important applications, the the wPip homologs, cidA and cidB. In vitro functional analyses widespread prevalence of Wolbachia across arthropod taxa revealed that cidB encodes deubiquitylase activity, and cidA (Werren and Windsor 2000; Hilgenboecker et al. 2008; Zug encodes a protein that binds CidB (Beckmann et al. 2017). et al. 2012), and decades of research, only recently have the Mutating the predicted catalytic residue in the deubiquitylat- genes underlying CI been determined (Beckmann et al. 2017; ing domain of CidB results in a loss of the CI-like function in LePage et al. 2017). transgenic flies (Beckmann et al. 2017). Whether these genes Two studies converged on the same central finding: or other alleles have additional enzymatic or regulatory roles Coexpression of a pair of syntenic genes recapitulates the CI and which other residues are important for function remain phenotype (Beckmann et al. 2017; LePage et al. 2017). open questions. Uninfected Drosophila melanogaster males transgenically There are important considerations for the location, or- expressing the two genes from wMel Wolbachia caused CI- ganization, and characterization of these genes. Whether or like embryonic lethality when crossed with uninfected females not cifA and cifB form a strict, toxin–antitoxin operon is de- that was notably rescued by wMel-infected females (LePage batable, and likewise has important implications for how et al. 2017). Additionally, the two wMel genes separately gene expression is regulated by Wolbachia during host in- enhanced wMel-induced CI in a dose-dependent manner fection. Support for the operon hypothesis is based on weak when expressed in infected males, and the CI was again res- transcription across the junction between cidA and cidB,in- cued by wMel-infected females (LePage et al. 2017). In the ferred to be due to the presence of bicistronic mRNA other study, CI-like embryonic lethality was also recapitulated (Beckmann and Fallon 2013; Beckmann et al. 2017); an al- in D. melanogaster males through transgenic coexpression of ternative explanation is transcriptional slippage. Quantitative homologous transgenes cidA and cidB, encoded by the wPip transcription analyses and various computational predictions strain of Wolbachia that naturally infect Culex mosquitoes of operon structure do not support the operon hypothesis (Beckmann et al. 2017). These two genes are located in the (LePage et al. 2017). Moreover and importantly, transgenic recently discovered eukaryotic association module of temper- studies show that both cifA and cifB are required for induc- ate phage WO (Bordenstein SR and Bordenstein SR 2016), tion of CI and thus cannot form a strict toxin (cifB)–antitoxin which was previously implicated in influencing CI (Masui (cifA) system as both genes positively contribute to CI and et al. 2000; Sinkins et al. 2005; Bordenstein et al. 2006; can individually enhance Wolbachia-induced CI (LePage Duron et al. 2006). The presence of these genes within pro- et al. 2017). However, like toxin–antitoxin systems, CidA phage WO has implications for the transmission of these binds CidB in vitro and expression of cidA rescues genes because temperate phage WO exhibits frequent lateral temperature-sensitive growth inhibition induced by cidB ex- transfers between Wolbachia (Bordenstein and Wernegreen pression in Saccharomyces, via an as-yet-unknown mecha- 2004; Chafee et al. 2010) while Wolbachia are mainly nism (Beckmann et al. 2017). Genome Biol. Evol. 10(2):434–451 doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 435 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Lindsey et al. GBE As it stands now, the genes remain largely unannotated to individual flies and samples homogenized using a pestle. with the exception of a few small domains. If other predicted After a 5-min incubation at room temperature, a 12,000 rcf protein domains occur in CifA and CifB, they could allow for centrifugation (at 4 C for 10 min) was followed by a chloro- new hypotheses for the mechanism of CI. Finally, the se- form extraction. Aqueous phase containing RNA was quence diversity and/or loss of cif genes across the extracted a second time with phenol: Chloroform before iso- Wolbachia tree may give insights into the selective conditions propanol precipitation of RNA. This RNA pellet was washed that maintain the cif genes versus those that do not. and resuspended in THE RNA Storage Solution (Ambion). RNA Exploration of cif gene regulation, expression, and function used in subsequent analyses was subjected to a short DNAse thus can provide a framework for more targeted investiga- treatment (10 min at 37 C then 10 min at 75 Cto inactivate tions of Wolbachia–host interactions, and potentially inform the enzyme). To detect the number of cifA and cifB transcripts the deployment of Wolbachia-based arthropod control. as well as RNA levels across the junction between cifA and cifB, we utilized the RNA extracted from these flies and the SensiFAST SYBER Hi-ROX One-step RT mix (Bioline) and the Materials and Methods Applied Biosystems StepOne Real-time polymerase chain re- Expression action (PCR) system. Quantitative reverse transcription poly- merase chain reaction (qRT-PCR) was performed with the For analysis of RNAseq data, we used our published approach following primer sets: cifAF: ATAAAGGCGTTTCAGCAGGA, (Gutzwiller et al. 2015). Briefly, fastq sequences for 1-day-old cifAR: TCAATGAGGCGCTTCTAGGT; cifBF: TACGGGAAG male and female flies were mapped against the Wolbachia TTTCATGCACA, cifBR: TTGCCAGCCATCATTCATAA; cifA_ wMel reference genome (GenBank AE017196) using bwa endF: TCTGGTTCTCATAAGAAAAGAAGAATC, cifB_begR: mem v. 0.7.5a with default parameters in paired-end mode. AACCATCAAGATCTCCATCCA. As a reference for transcrip- Mapped reads were sorted and converted to BAM format tion activity of the core Wolbachia genome, we utilized the using samtools v0.1.19 after which BAM files were used as Wolbachia ftsZ gene (forward: TTTTGTTGTCGCAAATACCG; input to Bedtools (bedcov) to generate pileups and count cov- reverse: AGCAAAGCGTTCACATTTCC). We designed primers erage at each position. For expression correlations between to ftsZ because as a core protein involved in cell division, the genes, the raw RNAseq counts were divided by (gene length- quantities of ftsZ would better correlate with bacterial num- þ 99), where 99 corresponds to read length (100)  1. Within bers and activity. Reactions were performed in duplicate or a growth stage these values were multiplied by 1e /(sum of triplicate in a 96-well plate and CT values generated by the values in stage) (Li and Dewey 2011). A pairwise distance machinewereused to calculate therelative amounts of between all genes was defined as (1  R), where the R is Wolbachia using the DDCt (Livak) method. the Pearson correlation coefficient between the normalized To identify a bicistronic message encompassing cifA and expression values of two genes. Possible negative correlations cifB, we designed primers based on the 5 -region of cifA and would be “penalized” here, resulting in a larger distance. the 3 -region of cifB (WD0631F: ATAAAGGCGTTTC Distances were clustered using the Kitsch program of AGCAGGA; WD0632R: TTGCCAGCCATCATTCATAA). We PHYLIP (Felsenstein 1989). extracted RNA from whole animals and performed a DNAse treatment (as described above) before using the Operon Prediction In Silico iScript first strand synthesis kit (Biorad) to generate cDNA. Negative controls included RT minus reactions. Resulting We used the dynamic profile of the transcriptome above to cDNA and negative controls were used in PCR reactions identify operons within the wMel genome using two different with the primers above and the following cycling condi- approaches. We used the program Rockhopper (McClure tions: 95 C 5 min then 35 cycles of 95 C for 1 min, 64 et al. 2013), with default parameters, in conjunction with C for 1 min, 72 C for 2.5 min followed by a final extension the BAM files generated above to delineate likely operons of 72 C for 10 min and using the HF Phusion enzyme mix across the entire genome. The Arnold web server (http:// (NEB). As a positive control, to confirm that we could am- rna.igmors.u-psud.fr/toolbox/arnold/) was used to predict plify long mRNAs from these samples, we used the 16S hairpin transcription termination elements (Gautheret and rRNA gene primers 27F (AGAGTTTGATCCTGGCTCAG) Lambert 2001; Macke et al. 2001). and 1492R (GGTTACCTTGTTACGACTT) with the same cycling conditions as above except that the annealing tem- Nucleic Acid Extractions and Quantitative Reverse perature was 55 C. Transcription Polymerase Chain Reaction To identify Wolbachia gene expression in adult male and fe- Correlated Cif Trees and Distance Matrices male D. melanogaster, RNA was extracted from individual, age-matched flies (1–3 days old, stock 145) using a modified Quantifying congruence scores between the CifA and CifB Trizol extraction protocol. Briefly, 500 mlof Trizol was added trees was carried out with Matching Cluster (MC) and 436 Genome Biol. Evol. 10(2):434–451 doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Evolutionary Genetics of cifA and cifB GBE Robinson–Foulds (RF) metrics using a custom python script Cif Phylogenetics previously described (Brooks et al. 2016) and the TreeCmp CifA and CifB protein sequences were identified using BlastP program (Bogdanowicz et al. 2012). MC weights topological searches of WOMelB WD0631 (NCBI accession number congruency of trees, similar to the widely used RF metric. AAS14330.1) and WD0632 (AAS14331.1), respectively. However, MC takes into account sections of subtree congru- 30 Homologs were selected based on: 1) E¼ 10 ,2)query ence and therefore is a more refined evaluation of small to- coverage greater than 70%, and 3) presence in fully se- pological changes that affect incongruence. Significance in quenced Wolbachia genomes. All sequences were intact the MC and RF analyses was determined by the probability with the exception of a partial WOSuziC CifA of 100,000 randomized bifurcating dendrogram topologies (WP_044471252.1) protein. The missing N-terminus was yielding equivalent or more congruent trees than the actual translated from the end of contig accession number tree. Normalized scores were calculated as the MC and RF CAOU02000024.1 and concatenated with partial protein congruency score of the two topologies divided by the max- WP_044471252.1 for analyses, resulting in 100% amino imum congruency score obtained from random topologies. acid identity to WORiC CifA (WP_012673228.1). In addition, The number of trees that had an equivalent or better score two previously identified sequences (LePage et al. 2017), than the actual tree was used to calculate the significance of WORecB CifB and WORiB CifB, were not available in NCBI’s observing that topology. Mantel tests were also performed on database and translated from nucleotide accession numbers the CifA and CifB patristic distance matrices calculated in JQAM01000018.1 and CP001391.1, respectively. The previ- Geneious v8.1.9 (Kearse et al. 2012). A custom Jupyter note- ously identified WOSol homologs (CifA: AGK87106 and CifB: book (Perez and Granger 2007) running python v3.5.2 (http:// AGK87078) (LePage et al. 2017) were also included in our python.org) was written in the QIIME2 (Caporaso et al. 2010) analyses. All protein sequences were aligned with the anaconda environment. The Mantel test (Mantel 1967)uti- MUSCLE (Edgar 2004) plugin in Geneious Pro version 8.1.7 lized the scikit-bio v0.5.1 (scikit-bio.org) Mantel function run, (Kearse et al. 2012); the best models of evolution, according using scikit-bio distance matrix objects for each gene. The to corrected Akaike (Hurvich and Tsai 1993) information cri- Mantel test was run with 100,000 permutations to calculate teria, were estimated to be JTT-G using the ProtTest server significance of the Pearson correlation coefficient between (Abascal et al. 2005); and phylogenetic trees were built using the two matrices using a two-sided correlation hypothesis. the MrBayes (Ronquist et al. 2012) plugin in Geneious. Protein Structure Genomes Used in Comparative Analyses All candidate CI gene protein sequences were individually In order to identify cif homologs across the Wolbachia assessed for the presence of domain structure using HHpred genomes, we defined orthologs across existing, sequenced (https://toolkit.tuebingen.mpg.de/hhpred/; So¨ding et al. genomes using reciprocal best BlastP. We included 2005)) with default parameters and the following databases: Wolbachia genomes across five supergroups: Monophyletic SCOPe70 (v.2.06), Pfam (v.31.0), SMART (v6.0), and COG/ clades of Wolbachia based on housekeeping genes, denoted KOG (v1.0). Schematics were created in inkscape (https://ink- by uppercase letters (O’Neill et al. 1992; Werren et al. 1995). scape.org/), to show regions with significant structural hits, as Supergroups A and B are the major arthropod infecting line- determined by probabilities greater than 50%, or greater ages, whereas C and D infect nematodes (Bandi et al. 1998). than 20% and in the top five hits. Supergroup F Wolbachia infect a variety of hosts (Lo et al. 2002). Included in this analysis were nine type A strains Protein Conservation (wRi, wSuzi, wHa, wMel, wMelPop, wAu, wRec, wUni, and wVitA), seven type B strains (wPipJHB, wPipPel, wBol1-b, Protein conservation was determined with the Protein Residue wNo, wTpre, wAlbB, and wDi), two type C strains (wOv Conservation Prediction tool (http://compbio.cs.princeton. and wOo), and one each type D (wBm) and type F (wCle). edu/conservation/index.html; Capra and Singh 2007), using We included all genomic data available for each strain such aligned amino acid sequences, Shannon entropy scores, a that if multiple assemblies existed for each Wolbachia variant window size of zero, and sequence weighting set to “false.” (such as in the case of wUni) we included the union of all Conservation was subsequently plotted in R version 3.3.2, and available contigs for that strain. Wolbachia orthologs were module regions were delineated according to the coordinates defined based on reciprocal best blast hits between amino of the WOMelB modules within the alignment. CI gene con- acid sequences in Wolbachia genomes. An orthologous group servation scores were calculated separately for Type I sequen- of genes was defined by complete linkage such that all mem- ces, and for all Types together. For CifB Type I sequences, the bers of the group had to be the reciprocal best hit of all other WOVitA4 ortholog was left out, due to the extended C-ter- members of the group. Information on strain phenotypes, minus of that protein. Conservation scores were also calcu- hosts, and accession numbers can be found in table 1. lated for “control proteins”: Wsp (Wolbachia surface protein), Genome Biol. Evol. 10(2):434–451 doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 437 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Lindsey et al. GBE Table 1 Genomes Used in Comparative Analyses of cifA and cifB Supergroup Strain Host Reproductive Phenotypes Accession Number A wMel Drosophila melanogaster CI NC_002978.6 wMelPop Drosophila melanogaster CI AQQE00000000.1 wRec Drosophila recens CI NZ_JQAM00000000.1 wAu Drosophila simulans None LK055284.1 wHa Drosophila simulans CI NC_021089.1 wRi Drosophila simulans CI NC_012416.1 wSuzi Drosophila suzukii None NZ_CAOU00000000.2 wUni Muscidifurax uniraptor PI NZ_ACFP00000000.1 wVitA Nasonia vitripennis CI NZ_MUJM00000000.1 B wAlbB Aedes albopictus CI CAGB00000000.1 wNo Drosophila simulans CI NC_021084.1 wDi Diaphorina citri Undetermined NZ_KB223540.1 wTpre Trichogramma pretiosum PI CM003641.1 wVitB Nasonia vitripennis CI AERW00000000.1 wBol1-b Hypolimnas bolina CI, MK NZ_CAOH00000000.1 wPipJHB Culex quinquefasciatus CI ABZA00000000.1 wPipPel Culex pipiens CI NC_010981.1 C wOo Onchocerca ochengi OM NC_018267.1 wOv Onchocerca volvulus OM NZ_HG810405.1 D wBm Brugia malayi OM NC_006833.1 F wCle Cimex lectularius OM NZ_AP013028.1 Note.—Reproductive phenotypes include: CI, parthenogenesis-inducing (PI), male-killing (MK), obligate mutualism (OM), no phenotype discovered after assessment (None), and phenotype was not assayed (Undetermined). known to be affected by frequent recombination events unrelated endosymbiont that can also cause CI in arthropods (Baldo et al. 2005), and FtsZ, which is relatively unaffected (Penz et al. 2012). Here, searches were performed with by recombination (Baldo, Dunning Hotopp, et al. 2006; Ros TBlastN and restricted to all available Cardinium sequence in et al. 2009). Variation in amino acid conservation between NCBI GenBank (taxid: 273135). modules and nonmodule regions was assessed in R version 3.3.2 with a one-way ANOVA including “region” (either the Hidden Markov Model Searches unique module number, or “nonmodule”) as a fixed effect, To identify cif homologs in draft Wolbachia genome assem- and followed by Tukey Honest Significant Difference for post blies we used the program suite HMMER (Eddy 2011). We hoc testing. defined cif Types based on our phylogenetic trees (fig. 4)and used aligned amino acids from these Types as input to Cif Modules HMMBUILD, using default parameters. We then searched six Wolbachia WGS assemblies (NCBI project numbers The WOMelB structural regions delineated by HHpred were PRJNA310358, PRJNA279175, PRJNA322628) using used to search for the presence of Cifs or remnants of Cifs HMMSEARCH with –F3 1e-20 –cut_nc and –domE 1e-10. across the Wolbachia phylogeny. Amino acid sequences of Regardless of thresholds used, or cif type of HMM, resulting the WOMelB modules were queried against complete ge- hits did not differ. nome sequences (table 1) using TBlastN. Any hit that was at least 40% of the length and 40% identity, or at least 90% of the length and 30% identity of the WOMelB module was Results considered a positive match. Module presence was plotted cifA and cifB Are Cotranscribed but Differentially across a Wolbachia phylogeny constructed using the five mul- Regulated in wMel tilocus sequence typing (MLST) genes defined by Baldo, Dunning Hotopp, et al. (2006). Nucleotide sequences were In bacteria, genes are commonly grouped into a single tran- aligned with MAFFT version 7.271 (Katoh and Standley scriptional unit under the control of one promoter, referred to 2013), and concatenated prior to phylogenetic reconstruction as an operon. Because cifA and cifB are syntenic across pro- with RAxML version 8.2.8 (Stamatakis 2014), the phage WO of Wolbachia andbothinvolvedinCI, we aimedto GTRGAMMA substitution model, and 1,000 bootstrap repli- assess whether cifA and cifB are cotranscribed. We performed cates. We also searched for cif-like regions in Cardinium:An RT-PCR using primers that amplify the entire region from the 438 Genome Biol. Evol. 10(2):434–451 doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Evolutionary Genetics of cifA and cifB GBE (A) 1kb ladder (B) females WD0630 cifA cifB WD0633 males forward reverse FIG.1—Expression of cifA and cifB in adult flies. (A)Amplification of cifA–cifB bicistronic message from cDNA generated from adult flies. Positive amplification occurred in both male and female adult flies. RT minus controls included. (B) RNAseq expression from 1-day-old female and male Drosophila melanogaster flies. Raw reads were mapped to the wMel assembly (using bwa) and coverage visualized using the Integrated Genomics Viewer (v2.3.77). The startofthe cifB open reading frame is denoted by a vertical, dotted line. start of cifA to the end of cifB (2.5 kbintotal). cDNA Operons are often comprised of loci encoding related amplification of the region from wMel-infected male and processes that can therefore be coregulated conveniently female flies was successful (fig. 1A), and the transcript through the control of transcription from one promoter. To wasconfirmedtobe cifA–cifB using Sanger sequencing assess whether cifA and cifB are coregulated, we reasoned from the forward and reverse ends of the cDNA amplicon that strictly coregulated loci will have correlated gene expres- (supplementary file S2, Supplementary Material online). sion across host development and similar total expression lev- We could not amplify a larger transcript from the loci els in whole animals. We therefore utilized an existing RNAseq flanking cifA and cifB, suggesting that the cifA–cifB tran- data set for Wolbachia in Drosophila melanogaster,covering script is adiscreteunit. 24 life cycle stages and 3 time samplings each for adult males Genome Biol. Evol. 10(2):434–451 doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 439 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Lindsey et al. GBE and females (Gutzwiller et al. 2015). We mapped reads to the (A) Junction existing wMel assembly (see Materials and Methods) and cal- cifA cifB culated Pearson correlation coefficients of normalized expres- (B) sion values across host development for between all gene pairs. cifA is expressed at much higher absolute levels than 1.2 cifB (fig. 1B; 8-fold higher based on RPKM values across both genes), and cifA and cifB expression is weakly, negatively cor- 1.0 related (Pearson r: 0.40; P-value: 0.014), suggesting that the expression level of one could have a negative influence on the 0.8 level of the other. To confirm differential expression of cif genes in wMel, we performed a quantitative RT-PCR analysis 0.6 of gene expression from 3-day-old male and female flies (fig. 2). We observed transcripts covering the junction 0.4 between cifA and cifB. However, transcripts covering this junction were more similar to the expression levels in cifA, 0.2 whereas expression of cifB was 9-fold less, supporting results from the RNAseq analysis. As a possible explanation for the large absolute differences cifA Junction cifB in transcription, we examined the intergenic sequence be- FIG.2—Relative expression ratio of cifA, the junction between cifA/ tween cifA and cifB and identified a Rho-independent tran- cifB,and cifB to ftsZ. Expression of both genes and their junction was scription terminator at nucleotides 618649–618668. This quantified using qRT-PCR, and normalized to Wolbachia ftsZ gene expres- terminator region is predicted to form a GC-rich hairpin sion. cifB gene expression is significantly less than that of the junction (50% GC compared with the Wolbachia wMel genome- (t¼ 3.220, df¼ 16, P¼ 0.005) and less than cifA (t ¼3.840, df¼ 17, wide 35%) in newly synthesized mRNA message proximal P¼ 0.001). to the RNA polymerase. There are two explanations for how the terminator might explain the transcript abundance BAM files as input to Rockhopper (McClure et al. 2013). The differences between cifA and cifB, and both have an impact program was able to correctly identify known operons in on the operon hypothesis. First, cifA and cifB have their own wMel (such as the T4SS WD0004–WD0008 and the ribo- promoters, but occasionally the genes are cotranscribed as a somal protein operon), but it did not identify cifA and cifB bicistronic message due to an imperfect hairpin terminator at as an operon. the end of cifA. In this model, cifA and cifB do not form an In summary, although a bicistronic message of cifA and cifB operon. Alternatively, the cifA and cifB operon has a single was detected by qPCR, their absolute and relative expression promoter upstream of cifA, and the imperfect terminator levels are drastically different. A termination signal in their provides a mechanism to control transcriptional differences intergenic sequence may limit expression of the bicistronic between cifA and cifB in the operon. Functionally resolving message and could explain the much higher absolute level whether cifA and cifB have the same or different promoters of cifA. Given the negative correlation across growth stages, will be the ultimate arbiter of the two models. some entity that activates cifA transcription, or a cifA product In order to identify loci with similar expression patterns itself, could repress cifB transcription. Clearly, cifA and cifB are during host development, we clustered all wMel genes based not a traditional operon and functionally resolving their pro- on their similarity in expression across Drosophila develop- moter(s) will reveal much about the regulation of the repro- ment (supplementary fig. S1, Supplementary Material online). ductive manipulations induced by Wolbachia. cifA did not group with cifB in wMel (fig. 3), suggesting that these two genes are not similarly expressed. Indeed, the pat- New Protein Domain Predictions Are Variable across the tern of cifA expression differs strikingly from that of cifB.For Cif Phylogeny example, cifA is relatively highly expressed during late em- bryogenesis and in adults, whereas cifB is relatively highly We recovered the four previously identified phylogenetic expressed during the first two-thirds of embryogenesis, and Types (LePage et al. 2017). Here, our analyses include addi- during larval stages (fig. 3A). Curiously, the expression profile tional strains that cause reproductive parasitism beyond CI of cifA in flies during development is most closely correlated (parthenogenesis and male-killing, table 1), and the more di- with the wsp locus WD1063 (fig. 3B). vergent Type IV paralogs for cifA, sofar identified inB- Because of the dramatic absolute difference in cifA and cifB Supergroup Wolbachia. We recover a set of Type III alleles transcript levels, computational methods for operon predic- from wUni, a strain that induces parthenogenesis in the par- tion do not support their cotranscription. For example, after asitoid wasp, Muscidifurax uniraptor (Stouthamer et al. 1993). mapping reads to the wMel assembly, we used the resulting The wBol1-b strain, a male-killer that has retained CI 440 Genome Biol. Evol. 10(2):434–451 doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Ratio of Expression (locus/ftsZ ) Evolutionary Genetics of cifA and cifB GBE FIG.3—Gene expression of cifA and cifB during Drosophila melanogaster development. (A) Heatmap representation of normalized transcripts per base pair per million (TPM) for both cifA and cifB during Drosophila melanogaster development. cifB is highly expressed during embryogenesis and downregulated after pupation, whereas cifA is more highly expressed in adults and pupae. Clustering of Wolbachia loci based on expression across fly development illustrates correlated expression profiles between wMel loci and cifA (B)or cifB (C). Mobile elements and loci involved in host interaction (wsp)are indicated with vertical lines on the right side of the figure. capabilities (Hornettetal. 2008), has alleles belonging to both Wolbachia strains that cause CI, parthenogenesis, male- Type I and Type IV. killing, or no reproductive phenotype were characterized by Homologs and predicted protein domains of CifA and CifB HHpred homology and domain structure prediction software for all four phylogenetic Types (LePage et al. 2017)from (So ¨ ding et al. 2005). Search parameters are described in the Genome Biol. Evol. 10(2):434–451 doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 441 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Lindsey et al. GBE Table 2 methods. Several new prominent protein domains, herein re- Predicted Structural Modules of Cif Proteins ferred to as “modules,” were identified for each CifA and CifB protein sequence (table 2). Protein Module Size Range Homology (AA) For CifA, three modules were annotated (fig. 4A, table 2). First, the most N-terminal module (ModA-1) is only recovered CifA ModA-1 21–22 • Catalase-rel, decomposes hy- drogen peroxide into water in Type I variants, with distant homology (22% amino acid and oxygen identity) to Catalase-rel that is predicted to catalyze the break- ModA-2 65–264 • DUF3243 domain of unknown down of hydrogen peroxide (Chelikani et al. 2004). The prob- function ability of the module being homologous to Catalase-rel is low • Puf family RNA binding (prob¼ 21–24), but the consistent recovery of structure in this • Globin-like protein region across Type I alleles is notable. The second CifA module ModA-3 47–74 • STE-like transcription factor in the central region (ModA-2) has homology to a domain of CifB ModB-1 103–133 • PDDEXK, PD-(D/E)XK nuclease unknown function (Types I, II, and IV, prob¼ 27–64), globin- superfamily like domains (Types II and IV, prob¼ 21–30), and Puf family • Endonuclease NucS RNA-binding domains (Types III and IV, prob¼ 25–49). The • Restriction endonuclease-like last CifA module in the C-terminal region (ModA-3) has hits to family • HSDR_N, type I restriction en- an STE-like transcription factor in all Types (prob¼ 27–42). In zyme R protein N terminus general, the CifA proteins showed distant homology to ModB-2 122–205 • PDDEXK, PD-(D/E)XK nuclease known domains, but we consistently recovered the same superfamily regions of structure within CifA protein Types. • MmcB-like DNA repair protein For CifB, three modules were also defined (fig. 4B, table 2). • COG5321, uncharacterized The first (ModB-1) and second (ModB-2) most N-terminal protein regions of all Types both have matches to the PDDEXK nucle- • HSDR_N, type I restriction en- ase family (prob¼ 57–98) and various other restriction endo- zyme R protein N-terminus nucleases such as NucS, HSDR_N, and MmcB (prob¼ 50–91). • Endonuclease NucS The third module, found only in the Type I C-terminus (ModB- ModB-3  95–147 • Ulp-1, ubiquitin-like proteases 3), has very strong homology to a number of ubiquitin- • Various proteases and pepti- dases (C5, C57, Sentrin-specific modification and protease-like domains (prob¼ 71–96). This protease) was expected, as ModB-3 contains the predicted catalytic res- idue associated with CI function in CidB, known to have deu- biquitylase activity (Beckmann et al. 2017). WOVitA4 (Type 1) Colors next to modules are used throughout the Figures. Only present in Type I. has an extended C-terminus not present in any other alleles, and within that extended C-terminus is an additional struc- tural domain, with homology to a Herpesvirus tegument pro- tein (prob¼ 53), and a phosphohydrolase-associated domain (prob¼ 57). CifB Type IV alleles (WOAlbB, WOPip2, and (LePage et al. 2017). Here, we statistically ground the in- wBol1-b) were not included in the phylogenetic reconstruc- ference of codivergence using the largest set of tion, as they are highly divergent and not reciprocal blasts of Wolbachia homologs to date. We quantified congruence WOMelB cifB. Despite their divergence, these Type IV CifB between the CifA and CifB phylogenetic trees for Types I– alleles have similar structures to Type II and III alleles: Two III (supplementary file S1, Supplementary Material online) PDDEXK-like modules, and no Ulp-1-like module 3 (supple- using MC and RF tree metrics (Robinson and Foulds 1981; mentary fig. S3, Supplementary Material online). Full struc- Bogdanowicz et al. 2012; Bogdanowicz and Giaro 2013), tural schematics with exact coordinates and homology with normalized distances ranging from 0.0 (complete regions for each allele are available in the Supplementary congruence) to 1.0 (complete incongruence). Results Material online (supplementary figs. S2 and S3, show strong levels of congruence between CifA and Supplementary Material online), as are all significant domain CifB (P< 0.00001 for both, normalized MC ¼ 0.06 and hits with associated probabilities and extended descriptions normalized RF ¼ 0.125). To further statistically validate (supplementary tables S1 and S2, Supplementary Material the inference of codivergence, we measured the correla- online). tion between patristic distance matrices for CifA and CifB using the Mantel test (Mantel 1967). Results demonstrate a high degree of correlation between patristic distance CifA and CifB Codiverge matrices, and through permutation show that indepen- Initial phylogenetic trees based on core amino acid sequences dent evolution of CifA and CifB is highly unlikely of Type I–III variants of CifA and CifB exhibited similar trees (Pearson correlation coefficient ¼ 0.905, P ¼ 0.00001). 442 Genome Biol. Evol. 10(2):434–451 doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 100 WOBol1-b WP_019236549.1 WOPip2 WP_007302980.1 WOPipJHB WP_007302988.1 WOPipJHB WP_007302980.1 wVitA WP_077188282.1 wUni WP_077188282.1 WOSuziC, WORiC wNo WP_015587805.1 wAlbB WP_006014164.1 WOPip1 (CidA) WP_012481787.1 WOAlbB WP_006012794.1 WORiB wRi_p05380 WORiC WP_012673227.1 WOSol AGK87106 WOSuziC WP_044471252.1* WOMelB WP_010962722.1 Evolutionary Genetics of cifA and cifB GBE (A) CifA Type III wNo 208-366 427-475 490aa Type I STE Puf familiy RNA binding WOMelB, WOMelPop 189-211 285-350 389-457 474aa Catalase-rel DUF3243 STE Type IV WOPip2, WOPipJHB 188-409 382-429 445aa WOHa1 WP_015588933.1 DUF3243/Puf family RNA binding/ STE globin-like WOVitA4 ONI58213.1 0.2 WORiC WP_012673228.1 Type II WOMelPop WP_010962721.1 456aa 264-423 390-443 DUF3243/globin-like STE (B) CifB Type I WOMelB, WOMelPop, WOSuziB 1166aa 277-381 591-796 929-1057 WOVitA4 PDDEXK PDDEXK/MmcB-like/ Ulp1/Proteases HD_assoc/ COG5321/NucS Herpes_teg_N WOVitA4 ONI58212.1 Type III WOPip1 (CidB) WP_012481788.1 wAlbB, wVitA, wUni 191-322 551-676 696aa PDDEXK/NucS/ PDDEXK/MmcB-like/ restriction endonuclease COG5321/HSDR_N/NucS restriction endonuclease 0.2 Type II WOSuziC, WORiC 754aa 204-337 579-718 PDDEXK/NucS/ PDDEXK/MmcB-like/ HSDR_N-containing COG5321/HSDR_N/NucS FIG.4—Phylogenetic relationships and representative predicted protein structure of Cif protein Types. (A) CifA and (B) CifB. Alleles are in bold next to their corresponding accession number, and pink shapes around branches designate monophyletic “Types.” Representative structures are shown for each type, with the length of the protein indicated at the C-terminus. If genes differed by only a few amino acids a single representative is shown. Allele names use the previously described naming convention with a WO prefix referring to particular phage WO haplotype, and the w prefix indicating a phage WO-like island (LePage et al. 2017). The N-terminus of CifA WOSuziC (* in figure) was translated from the end of another contig and concatenated to get the full- length protein (see Materials and Methods). WOMelB and WOMelPop are identical at the amino acid level, as are WOPipJHB and WOPip2. Genome Biol. Evol. 10(2):434–451 doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 443 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 wAlbB WP_006014162.1 WOMelPop WP_038228284.1 WOSuziC WP_044471251.1 wNo WP_015587806.1 WOBol1-b2 WP_019236479.1 WOBol1-b WP_019236548.1 WOHa1 WP_015588932.1 WOSuziB WP_044471243.1 WOMelB WP_010962721.1 WOSuziB WP_044471237.1 WORecB WP_038198916.1 wVitA WP_077188281.1 WOSol AGK87078 wUni WP_077190377.1 WORiB WP_012673191.1 WORecB wRec0567 84 84 100 Lindsey et al. GBE FIG.5—Protein conservation, as determined by Shannon entropy scores. (A)Wsp,(B) Cell division protein FtsZ, (C)Type I CifA,(D)All CifA,(E)Type I CifB alleles except for WOVitA4, (F) All CifB alleles. Red dots in (E)and (F) indicate the ModB-3 catalytic residue (Beckmann et al. 2017), unique to and completely conserved for Type I alleles. Blue dots in (E)and (F) represent the (P)D-(D/E)XK motif (Kosinski et al. 2005)present in wMel. Table 3 Cif Proteins Evolve Rapidly Average Amino Acid Conservation of Cifs and Modules Amino acid sequence conservation across the full length of Protein Region Type I All the Cif proteins was determined and compared with CifA ModA-1 0.93 0.67 Wolbachia amino acid sequences of genes that either have ModA-2 0.84 0.53 signatures of recombination and directional section (Wsp) or ModA-3 0.77 0.53 have not undergone extensive recombination and directional CifA 0.83 0.58 selection (FtsZ, cell division protein). Wsp protein sequences CifB ModB-1 0.89 0.70 exhibit considerable divergence (mean conservation¼ 0.85), ModB-2 0.87 0.60 with very few sites in a row being completely conserved b ModB-3 0.79 0.40 (fig. 5A). In contrast, FtsZ is relatively conserved (mean con- CifB 0.82 0.43 servation¼ 0.94), and most of the divergence is clustered at Module number is defined in table 2. the C-terminus (fig. 5B). Mean conservation for the Cif pro- Only annotated in Type I. tein sequences was lower than Wsp—0.83 for Type I CifA alleles (fig. 5C) and 0.82 for Type I CifB alleles (fig. 5E, table 3). When all Cif alleles were considered, mean conservation was cause CI remains to be determined. Although the CifB pro- even further reduced—0.58 for CifA (fig. 5D) and 0.43 for teins are highly divergent, the catalytic residue (red dot in CifB (fig. 5F). The lower average conservation of CifB genes is fig. 5E and F) in the deubiquitylating module of CifB is unique in part due to the many insertions and deletions in the align- to and completely conserved for the Type I alleles. ment, and the missing C-terminal deubiquitylase region, The Cif proteins have extensive amounts of diversity, with ModB-3, of the Type II and III alleles. Thus, several CifB pro- completely conserved amino acids distributed across the teins apparently lack this activity, and whether these variants length of the protein, and not confined to any particular 444 Genome Biol. Evol. 10(2):434–451 doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Evolutionary Genetics of cifA and cifB GBE regions (fig. 5C–F, supplementary tables S3–S6, more modules are recovered from the CI-inducing strains Supplementary Material online). There were significant differ- than the non-CI inducing strains within the same Type. ences in the level of conservation between modules and non- The high number of modules in wSuzi and wRi is due to the module regions for the Type I alignments of both CifA (F presence of a duplicated set of Type I variants. We recovered 3, 490 ¼ 4.276, P¼ 0.0054) and CifB (F ¼ 9.703 P¼ 1.5e-06) many modules in wSuzi, which is a strain not known to induce 3, 1195 (table 3). The only modules that had significantly higher con- CI, but is sister to wRi, which can induce CI (Hamm et al. servation than the nonmodule regions of the alignment were 2014; Cattel et al. 2016). This discrepancy between cif pres- ModB-1 (P¼ 0.0173) and ModB-2 (P¼ 0.0011). The wMel ence and absence of a reproductive phenotype might be strain contains the (P)D-(D/E)XK motif in ModB-1 (blue dots explained by the disrupted Type II cifA in wSuzi. The split in fig. 5E and F)(Kosinski et al. 2005), but it is less than 80% WOSuziC sequenced was concatenated to allow for a more conserved across strains despite the higher average conserva- robust phylogenetic reconstruction (fig. 4), but it is in fact tion of this module. In contrast, wMel does not contain the disrupted by a transposase (Conner et al. 2017). However, catalytic motif in ModB-2, also a PDDEXK nuclease-like do- having a functional set of Type I cif alleles appears to be suf- main. The ModB-2 (P)D-(D/E)XK motif is present in Type IV ficient for CI-induction in other strains (Beckmann et al. 2017; alleles such as WOPip2 that when mutated no longer induces LePage et al. 2017), so it is not clear how inactivation of the growth defects in yeast (Beckmann et al. 2017). ModA-3 is Type II alleles here may affect the final CI phenotype in wSuzi. significantly less conserved than the nonmodule regions of Strain wDi, infecting the Asian citrus psyllid Diaphorina citri, Type I CifA (P¼ 0.0300). has no identified reproductive phenotype, but only contains two modules: ModB-1 and ModB-3. For wHa, we recovered duplicates of all the modules. These represent a highly dis- Cif Module Presence Generally Predicts Reproductive rupted copy of the gene set harboring frameshifts that were Phenotype annotated as pseudogenes. We used the wMel-predicted Cif modules as a seed to search The lack of evidence for homologous cif genes in the C, D, for the presence of homologous modules across Wolbachia and F Supergroup Wolbachia agrees with previous findings genome sequences using TBlastN (fig. 6), with the intent of (LePage et al. 2017) that CI-function is restricted to the Aþ B- discovering cif-like regions or remnants in strains with other Supergroup clade (likely due to WO phage activity), and the phenotypes. In strains with more divergent Cif Types, we absence of WO phages for the nematode-infecting strains report modules that were expected based on the HHpred (Gavotte et al. 2007). The loss of CI within the A and B results, but not recovered with TBlastN due to sequence di- Supergroups is likely a derived trait due to the rapid evolution vergence from WOMelB. Additionally, we recover homolo- of prophage WO (Ishmael et al. 2009; Kent, Salichos, et al. gous modules outside of the annotated cif open reading 2011) and relaxed selection after transition to a new repro- frames, such as the chromosomal region with a ModB-3 ductive phenotype. The low number of modules identified in (Ulp-1-like) region in wNo. The ModB-3 wNo module is genic, such strains is consistent with gene degradation and loss. found within a hypothetical protein (WP_041581315.1). Additionally, we recover no cif-like regions in Cardinium,a Whether or not these cif-like regions outside of prophage member of the Bacteroidetes, and an independent transition WO contribute to CI remains to be determined. All to a CI phenotype (Penz et al. 2012). Supergroup A and B strains, with the exception of wAu and To further explore the conservation of the cif genes across wTpre (non-CI inducing strains), contained at least one recov- the sequenced Wolbachia, and to uncover diversity that may ered module. be present in other genomes, we searched the WGS data- Importantly, all strains that are known to be capable of bases for recently sequenced genomic scaffolds from inducing or rescuing CI have two or more recovered modules, Wolbachia infecting the Nomada bees (wNleu, wNla, wNpa, though they do not necessarily have ModB-3, which contains and wNfe) (Gerth and Bleidorn 2016), Drosophila inocompta the catalytic residue implicated in CI function (Beckmann et al. (wInc_Cu) (Wallau et al. 2016), and Laodelphax striatellus 2017). The non-CI strains have fewer recovered modules: One (wStri) (GenBank accession number NZ_LRUH00000000.1) module (ModB-2) in wUni, and no modules in wCle, wAu, using HMMER. Only for wStri do we have direct evidence of wTpre and the nematode-infecting strains (Supergroups C CI induction (Noda et al. 2001)yet the wStri WGS projects and D). wUni is a unique case, where we identified cif alleles contain only one cif locus (CifB) with an unusual structure not in the genome, but recovered only one module. Most wUni found in any of the other Types. On the basis of HHpred modules are either missing (fig. 4A) or divergent enough from analyses, the wStri homolog contains a deubiquitylase region WOMelB that they were not considered a positive match. in the middle of the protein, with two downstream regions wAlbB and wNo, both CI-inducing strains with Type III and that have homology to glucosyl transferases and lipases, re- IV alleles, have fewer recovered modules, but this is congruent spectively (supplementary fig. S4, Supplementary Material on- with the more divergent nature of those Cif Types. It is nota- line). The wInc_Cu WGS project contained one each of CifA ble that despite the phylogenetic distance from WOMelB, and CifB alleles. The CifA allele from wInc_Cu is a typical Type Genome Biol. Evol. 10(2):434–451 doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 445 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Lindsey et al. GBE FIG.6—Presence of wMel-like Cif modules across the Wolbachia phylogeny. The WOMelB module sequences were used to query available Wolbachia genomes to look for the presence of Cif-like regions beyond those within the annotated Cifs (fig. 5). Colored dots correspond to the structural regions delimited by HHpred, shown in figure 4, and listed in table 2. A “C” within a dot indicates the presence of a module outside of annotated cif open reading frames (fig. 4 and supplementary figs. S2 and S3, Supplementary Material online). The black dot indicates a module annotated by HHpred, but not identified by TBlastN due to divergence from the WOMelB module. Black boxes labeled with uppercase letters indicate branches leading to Wolbachia Supergroups. Dotted lines on the phylogeny lead to taxon names and are not included in the branch length. I protein containing three modules: An N-terminal Catalase- Discussion rel domain and an internal DUF3242 domain, followed by the We explored three key features of cif evolution: 1) The operon STE-like transcriptional factor domain. Because these are in- hypothesis, 2) potential novel functions across the cifA and complete genome projects, it is possible that other cif homo- cifB phylogenies, and 3) the conservation and diversity of cif logs have been missed due to the current sequencing alleles across strains with different host-manipulation pheno- coverage. Alternatively, it is possible that other, as yet undis- types. We provide multiple lines of evidence that although covered, mechanisms of reproductive manipulation exist in cifA and cifB can be cotranscribed, they are divergently tran- these strains. In contrast, the Nomada-associated Wolbachia scribed in wMel during host development, suggesting a more contain a large repertoire of cif homologs, including Types I, II, complex regulation of gene expression than found in classical IV, and several homologs with variations on the Type IV do- operons. Indeed, cifB transcription has a significant negative main architecture for CifA (supplementary fig. S4, correlation with cifA.In Escherichia coli, operons frequently Supplementary Material online). Many of the CifB homologs have internal promoters and terminators that result in differ- are disrupted Type I variants that contain the deubiquitylase- ent units of transcription, which are preferentially used during like domain, but not the nuclease-like domains (supplemen- certain conditions (Conway et al. 2014). Although cifB is tary fig. S4, Supplementary Material online). 446 Genome Biol. Evol. 10(2):434–451 doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Evolutionary Genetics of cifA and cifB GBE expressed at about 1/10 the level of cifA across all life cycle yetto beexplored in vivo, but is a promising direction for stages, the significant negative correlation in their levels sug- understanding the evolution of Wolbachia–host associations. gests that the same factor(s) could upregulate cifA while Based on what is known about Wolbachia biology, some downregulating cifB,or a cifA encoded RNA or protein may of the protein domains may be especially good candidates for inhibit cifB expression or vice versa. The new annotations for further study and in vivo functional characterization. Predicted cifA alleles, including a Puf family RNA-binding-like domain PDDEXK-like nuclease domains are present in all four CifB and STE transcription factor, could theoretically play a role in Types. Given the predicted interaction of these domains inhibiting cifB expression. More detailed analyses from a vari- with DNA (Kosinski et al. 2005), and the presence of these ety of strains, cif Types, and conditions would help develop a domains across CifB proteins, determining whether and how comprehensive understanding of the factors regulating ex- these regions interact with host (Wolbachia or insect) DNA, pression of these genes and the CI phenotype. and whether or not they contribute to CI function would be It is especially interesting that cifA and cifB synteny is main- useful in understanding the consistent presence of this mod- tained across prophage WO regions, despite the high level of ule. Mutating the predicted catalytic site of the nuclease re- recombination and rearrangements in prophage WO and gion in wPip’s Type IV CifB (aka CinB) reduces toxicity in yeast Wolbachia genomes (Baldo, Bordenstein, et al. 2006; Kent, (Beckmann et al. 2017). However, this catalytic residue is not Funkhouser, et al. 2011; Ellegaard et al. 2013). Although it is conserved, so further exploration of nuclease function across not yet clear why cifA and cifB homologs maintain their syn- more divergent alleles will be useful. As aforementioned, tenic orientation, given that they have very different absolute many of the CifA alleles encode Puf family RNA-binding-like and relative expression levels, we hypothesize that this feature domains, which have previously been implicated in mRNA can be attributed to 1) their location within prophage WO localization and transcriptional regulation (Quenault et al. and/or 2) functions associated with the ability of cifA and cifB 2011). This RNA binding-like domain is found upstream of to act synergistically to induce CI (LePage et al. 2017), or 3) an STE transcription factor-like domain and could provide a with the potential antagonism of cifA on cifB transcripts or promising direction for understanding the complicated tran- transcription. For example, they could share the same pro- scriptional dynamics of the cif genes. moter, but in addition to the Rho-independent terminator, Wolbachia strains that have lost CI have a strong signature cifA may further inhibit cifB expression by binding to the inter- of cif gene degradation and loss, consistent with their role in genic region between them, causing the polymerase to ter- CI. The two parthenogenesis-inducing strains (wTpre and minate. Such a model would be consistent with both the wUni) appear to be at different places in this process of absolute expression differences and the negative correlation. gene loss, with divergent Cif amino acid sequences recovered Alternatively, cifA and cifB may have different promoters, but for wUni, but no modules identified in wTpre. There are sev- a bicistronic message occurs because of an imperfect hairpin eral explanations for this. wUni is likely a more recent transi- terminator between the genes. We conclude that cifA and tion to parthenogenesis, as it is closely related to a CI strain cifB are cotranscribed but not coregulated as in a classical (wVitA) (Baldo, Dunning Hotopp, et al. 2006; Newton et al. operon, and do not act strictly as a toxin–antitoxin system 2016). In comparison, wTpre is part of a unique clade of due to the requirement of both Cif proteins for the induction Wolbachia that all induce parthenogenesis in Trichogramma of CI in arthropods. Determining how cifA and cifB expression wasps (Rousset et al. 1992; Werren et al. 1995; Schilthuizen is regulated in the insect host will advance an understanding and Stouthamer 1997). This strain has lost its WO phage as- of both the basic biology of CI and vector control programs sociation and only has relics of WO phage genes (Gavotte that deploy CI to control disease transmission. et al. 2007; Lindsey et al. 2016). Additionally, the two strains Despite the conservation of gene order, Cif proteins that independently transitioned to the parthenogenesis phe- showed extensive amounts of divergence and differences in notype have evolved separate mechanisms for doing so domain structure as previously reported (LePage et al. 2017). (Stouthamer and Kazmer 1994; Gottlieb et al. 2002). Here, the levels of amino acid conservation in the Cifs are Differences in time since transition to the parthenogenesis lower than FtsZ and Wsp, the latter of which is known to phenotype, phage WO associations, and mechanisms of par- recombine and be subject to directional selection. The con- thenogenesis induction likely all play a role in the rate of cif servation of the predicted catalytic residue in the C-terminal gene degradation. deubiquitylase domain is an important feature of CidB Although cifA and cifB are prophage WO genes, not all CI- (Beckmann et al. 2017). Although this residue is required inducing strains have a complete prophage. Indeed, the wRec for CI induction in CidB and other Type I alleles (Beckmann strain of Wolbachia in Drosophila recens is one such example et al. 2017), only Type I alleles (of the four identified Types) where approximately three quarters of prophage WO genes have this domain. Strains known to induce CI, such as wAlbB were eliminated (Metcalfetal. 2014), previously resulting in and wNo do not have Type I alleles, implying that the deubi- failed detection of WO presence (Bordenstein and quitylase domain is not essential for inducing CI across other Wernegreen 2004). However, genomic analyses of phage Cif Types. The complete, functional capacity of all Types has WO particles from wasps and moths revealed that several Genome Biol. Evol. 10(2):434–451 doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 447 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Lindsey et al. GBE genes packed in the genome of phage WO particles trees. Importantly, the locus, presumably expressed in the fe- (Bordenstein SR and Bordenstein SR 2016) are in fact retained male insect infected with a compatible Wolbachia,and mech- in prophage WO of wRec (Metcalf et al. 2014), including cifA anism behind rescuing CI are still unknown, as is the exact and cifB. Genes in prophage WO relics are apparently a source mechanism by which all Cif proteins induce CI. Therefore, the of host-manipulation across Wolbachia genomes. recent discovery of these CI genes and their sequence char- Additionally, there is considerable variation in the strength acterization described here pave the way for investigating key of CI across different Wolbachia strains (Veneti et al. 2003). mechanisms of the Wolbachia–host symbiosis. CifA andCifBhave anadditive effect on the strengthof CI (LePage et al. 2017), so it is possible that the level of cif ex- Supplementary Material pression, or the ratio of cifA and cifB transcripts across devel- Supplementary data are available at Genome Biology and opment, are ways in which CI strength is adjusted. The rapidly Evolution online. evolving nature of the Cif proteins may affect other ways in which they function in the host. For example, in Type I CifB proteins that have the essential deubiquitylase residue, other Acknowledgments sequence variation may affect the ability to bind with CifA, We thank J. Dylan Shropshire, Jeremy Brownlie, and anony- locations of posttranslational modifications, or the ability to be mous reviewers for feedback on earlier drafts of the manu- efficiently localized to the host nucleus. Additionally, the level script. This work was supported by the National Science of CI is often host-dependent (Bordenstein and Werren 1998; Foundation (DEB 1501227 to A.R.I.L., IOS 1456545 to McGraw et al. 2001), possibly a result of how well Wolbachia I.L.G.N., and IOS 1456778 to S.R.B.); the United States replicate in the host, and/or the specificity of Cif proteins with Department of Agriculture (NIFA 2016-67011-24778 to the host target, which is currently unknown. There are also A.R.I.L.); the National Institutes of Health (R21 HD086833 environmental conditions that affect the strength of CI, and and R01 AI132581 to S.R.B.); the Vanderbilt Microbiome they likely do so by affecting Wolbachia titers and resulting Cif Initiative; and Robert and Peggy van den Bosch Memorial expression in the host (Clancy and Hoffmann 1998; Yamada Scholarships to A.R.I.L. et al. 2007). On the basis of our analyses, we propose three avenues of research on the function of the Cif proteins. First, functional Literature Cited confirmation of the newly annotated modules will be impor- Abascal F, Zardoya R, Posada D. 2005. ProtTest: selection of best-fit mod- tant in understanding how these genes function enzymati- els of protein evolution. Bioinformatics 21(9):2104–2105. Archuleta TL. 2011. The Chlamydia effector chlamydial outer protein N cally. In total, we predict six modules in the Cif protein (CopN) sequesters tubulin and prevents microtubule assembly. J Biol sequence homologs, with varying degrees of confidence (sup- Chem. 286(39):33992–33998. plementary tables S1 and S2, Supplementary Material online). Baldo L, Bordenstein S, Wernegreen JJ, Werren JH. 2006. Widespread For some of these modules, straightforward experiments can recombination throughout Wolbachia genomes. MolBiolEvol. be designed in model systems (such as Saccharomyces)to 23:437–449. Baldo L, Dunning Hotopp JC, et al. 2006. Multilocus sequence typing determine whether their predicted function is correct, as system for the endosymbiont Wolbachia pipientis. Appl Environ has been done for the deubiquitylase domain of CidB Microbiol. 72(11):7098–7110. (Beckmann et al. 2017) and countless other bacterial effectors Baldo L, Lo N, Werren JH. 2005. Mosaic nature of the Wolbachia surface (Kramer et al. 2007; Siggers and Lesser 2008; Archuleta protein. J Bacteriol. 187(15):5406–5418. 2011). The necessity and importance of these modules to Bandi C, Anderson TJC, Genchi C, Blaxter ML. 1998. Phylogeny of Wolbachia in filarial nematodes. Proc R Soc Lond B. the CI phenotype can be assessed in the Drosophila model, 265(1413):2407–2413. where the induction of the phenotype and rescue is straight- Beckmann JF, Fallon AM. 2013. Detection of the Wolbachia protein forward (LePage et al. 2017). Second, detailed characteriza- WPIP0282 in mosquito spermathecae: implications for cytoplasmic tion of cif gene regulation will be important for understanding incompatibility. Insect Biochem Mol Biol. 43(9):867–878. CI expression and penetrance, thus informing vector control Beckmann JF, Ronau JA, Hochstrasser M. 2017. A Wolbachia deubiquity- lating enzyme induces cytoplasmic incompatibility. Nat Microbiol. programs that rely on proper expression of the CI phenotype, 2:17007. often in a transfected host. Finally, we suggest that although Bian G, et al. 2010. The endosymbiotic bacterium Wolbachia induces re- the discovery of these genes is fundamental, it is clear from sistance to dengue virus in Aedes aegypti. PLoS Pathog. this analysis that we have not comprehensively evaluated or 6(4):e1000833. identified the mechanisms behind CI and other reproductive Bogdanowicz D, Giaro K. 2013. On a matching distance between rooted phylogenetic trees. Int J Appl Math Comp Sci. 23:669–684. manipulations. The gene characterization analyses described Bogdanowicz D, Giaro K, Wrobel B. 2012. TreeCmp: comparison of trees here reveal new and relevant annotations, but with many in polynomial time. Evol Bioinform Online. 8:EBO.S9657. regions of unknown function across all of the phylogenetic Bordenstein SR, Bordenstein SR. 2016. Eukaryotic association module in Types, missing deubiquitylase domains in particular CI strains, phage WO genomes from Wolbachia.Nat Commun.7.doi:10.1038/ and a coevolving, phylogenetic relationship across the Cif ncomms13155. 448 Genome Biol. Evol. 10(2):434–451 doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Evolutionary Genetics of cifA and cifB GBE Bordenstein SR, Marshall ML, Fry AJ, Kim U, Wernegreen JJ. 2006. The Gottlieb Y, Zchori-Fein E, Werren JH, Karr TL. 2002. Diploidy restoration in tripartite associations between bacteriophage, Wolbachia, and arthro- Wolbachia-infected Muscidifurax uniraptor (Hymenoptera: pteromali- pods. PLoS Pathog. 2(5):e43. dae). J Invertebr Pathol. 81(3):166–174. Bordenstein SR, O’Hara FP, Werren JH. 2001. Wolbachia-induced incom- Gutzwiller F, et al. 2015. Dynamics of Wolbachia pipientis gene expression patibility precedes other hybrid incompatibilities in Nasonia.Nature across the Drosophila melanogaster life cycle. G3 5:2843–2856. 409(6821):707–710. Hamm CA, et al. 2014. Wolbachia do not live by reproductive manipula- Bordenstein SR, Wernegreen JJ. 2004. Bacteriophage flux in endosym- tion alone: infection polymorphism in Drosophila suzukii and D. sub- bionts (Wolbachia): infection frequency, lateral transfer, and recombi- pulchrella. Mol Ecol. 23(19):4871–4885. nation rates. Mol Biol Evol. 21(10):1981–1991. Hilgenboecker K, Hammerstein P, Schlattmann P, Telschow A, Werren JH. Bordenstein SR, Werren JH. 1998. Effects of A and B Wolbachia and host 2008. How many species are infected with Wolbachia? - A statistical genotype on interspecies cytoplasmic incompatibility in Nasonia. analysis of current data. FEMS Microbiol Lett. 281(2):215–220. Genetics 148:1833–1844. Hoerauf A, et al. 1999. Tetracycline therapy targets intracellular bacteria in Bourtzis K, et al. 2014. Harnessing mosquito–Wolbachia symbiosis for the filarial nematode Litomosoides sigmodontis and results in filarial vector and disease control. Acta Trop. 132:S150–S163. infertility. J Clin Invest. 103(1):11–18. Brooks AW, Kohl KD, Brucker RM, van Opstal EJ, Bordenstein SR. 2016. Hoffmann AA, et al. 2011. Successful establishment of Wolbachia in Phylosymbiosis: relationships and functional effects of microbial com- Aedes populations to suppress dengue transmission. Nature munities across host evolutionary history. PLoS Biol. 14(11):e2000225. 476(7361):454–U107. Brucker RM, Bordenstein SR. 2013. The hologenomic basis of speciation: Hornett EA, et al. 2008. You can’t keep a good parasite down: evolution gut bacteria cause hybrid lethality in the genus Nasonia.Science of a male-killer suppressor uncovers cytoplasmic incompatibility. 341(6146):667–669. Evolution 62(5):1258–1263. Caporaso JG, et al. 2010. QIIME allows analysis of high-throughput com- Hosokawa T, Koga R, Kikuchi Y, Meng XY, Fukatsu T. 2010. Wolbachia as munity sequencing data. Nat Methods. 7(5):335. a bacteriocyte-associated nutritional mutualist. Proc Natl Acad Sci U S Capra JA, Singh M. 2007. Predicting functionally important residues from A. 107(2):769–774. sequence conservation. Bioinformatics 23(15):1875–1882. http://python.org. Python Language Reference, version 2.7. Cattel J, et al. 2016. Wolbachia in European populations of the invasive Hughes GL, et al. 2011. Wolbachia infections are virulent and inhibit the pest Drosophila suzukii: regional variation in infection frequencies. human malaria parasite Plasmodium falciparum in Anopheles gam- PLoS One 11(1):e0147766. biae. PLoS Pathog. 7(5):e1002043. Chafee ME, Funk DJ, Harrison RG, Bordenstein SR. 2010. Lateral phage Hurst GD, et al. 1999. Male–killing Wolbachia in two species of insect. Proc transfer in obligate intracellular bacteria (Wolbachia): verification from R Soc Lond B. 266(1420):735–740. natural populations. Mol Biol Evol. 27(3):501–505. Hurvich CM, Tsai CL. 1993. A corrected Akaike information criterion for Chelikani P, Fita I, Loewen PC. 2004. Diversity of structures and properties vector autoregressive model selection. J Time Ser Anal. 14(3):271–279. among catalases. Cell Mol Life Sci. 61(2):192–208. Ishmael N, et al. 2009. Extensive genomic diversity of closely related Clancy DJ, Hoffmann AA. 1998. Environmental effects on cytoplasmic Wolbachia strains. Microbiology 155(Pt 7):2211–2222. incompatibility and bacterial load in Wolbachia-infected Drosophila Jaenike J, Dyer KA, Cornish C, Minhas MS. 2006. Asymmetrical reinforce- simulans. Entomol Exp Appl. 86(1):13–24. ment and Wolbachia infection in Drosophila. PLoS Biol 4(10):e325. Conner WR, et al. 2017. Genome comparisons indicate recent transfer of Kambris Z, Cook PE, Phuc HK, Sinkins SP. 2009. Immune activation by life- wRi-like Wolbachia between sister species Drosophila suzukii and D. shortening Wolbachia and reduced filarial competence in mosquitoes. subpulchrella. Ecol Evol 7(22):9391. Science 326(5949):134–136. Conway T, et al. 2014. Unprecedented high-resolution view of bacterial Katoh K, Standley DM. 2013. MAFFT multiple sequence alignment soft- operon architecture revealed by RNA sequencing. MBio ware version 7: improvements in performance and usability. Mol Biol 5(4):e01442–e01414. Evol. 30(4):772–780. Dedeine F, et al. 2001. Removing symbiotic Wolbachia bacteria specifically Kearse M, et al. 2012. Geneious Basic: an integrated and extendable inhibits oogenesis in a parasitic wasp. Proc Natl Acad Sci U S A. desktop software platform for the organization and analysis of se- 98(11):6247–6252. quence data. Bioinformatics 28(12):1647–1649. Duron O, Fort P, Weill M. 2006. Hypervariable prophage WO sequences Kent BN, Funkhouser LJ, Setia S, Bordenstein SR. 2011. Evolutionary ge- describe an unexpected high number of Wolbachia variants in the nomics of a temperate bacteriophage in an obligate intracellular bac- mosquito Culex pipiens. Proc R Soc Lond B. 273(1585):495–502. teria (Wolbachia). PLoS One 6:e24984. Eddy SR. 2011. Accelerated profile HMM searches. PLoS Comput Biol. Kent BN, Salichos L, et al. 2011. Complete bacteriophage transfer in a 7(10):e1002195. bacterial endosymbiont (Wolbachia) determined by targeted genome Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accu- capture. Genome Biol Evol. 3(0):209–218. racy and high throughput. Nucleic Acids Res. 32(5):1792–1797. Kosinski J, Feder M, Bujnicki JM. 2005. The PD-(D/E) XK superfamily revis- Ellegaard KM, Klasson L, Naslund K, Bourtzis K, Andersson SGE. 2013. ited: identification of new members among proteins involved in DNA Comparative genomics of Wolbachia and the bacterial species con- metabolism and functional predictions for domains of (hitherto) un- cept. PLoS Genet 9(4):e1003381. known function. BMC Bioinformatics 6:172. Felsenstein J. 1989. PHYLIP-phylogeny inference package (version 3.2). Kramer RW, et al. 2007. Yeast functional genomic screens lead to iden- Cladistics 5:6. tification of a role for a bacterial effector in innate immunity regula- Gautheret D, Lambert A. 2001. Direct RNA motif definition and identifi- tion. PLoS Pathog. 3(2):e21. cation from multiple sequence alignments using secondary structure Landmann F, Orsi GA, Loppin B, Sullivan W, Schneider DS. 2009. profiles. J Mol Biol. 313(5):1003–1011. Wolbachia-mediated cytoplasmic incompatibility is associated with im- paired histone deposition in the male pronucleus. PLoS Pathog. Gavotte L, et al. 2007. A survey of the bacteriophage WO in the endo- symbiotic bacteria Wolbachia. Mol Biol Evol. 24(2):427–435. 5(3):e1000343. Gerth M, Bleidorn C. 2016. Comparative genomics provides a timeframe Lassy CW, Karr TL. 1996. Cytological analysis of fertilization and early for Wolbachia evolution and exposes a recent biotin synthesis operon embryonic development in incompatible crosses of Drosophila simu- transfer. Nat Microbiol. 2:16241. lans. Mech Dev. 57(1):47–58. Genome Biol. Evol. 10(2):434–451 doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 449 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Lindsey et al. GBE LePage D, Bordenstein SR. 2013. Wolbachia: can we save lives with a great Robinson DF, Foulds LR. 1981. Comparison of phylogenetic trees. Math pandemic? Trends Parasitol. 29(8):385–393. Biosci. 53(1–2):131–147. LePage DP, et al. 2017. Prophage WO genes recapitulate and enhance Ronquist F, et al. 2012. MrBayes 3.2: efficient Bayesian phylogenetic in- Wolbachia-induced cytoplasmic incompatibility. Nature ference and model choice across a large model space. Syst Biol. 543(7644):243–247. 61(3):539–542. Li B, Dewey CN. 2011. RSEM: accurate transcript quantification from RNA- Ros VID, Fleming VM, Feil EJ, Breeuwer JAJ. 2009. How diverse is the Seq data with or without a reference genome. BMC Bioinformatics genus Wolbachia? Multiple-gene sequencing reveals a putatively 12:323. new Wolbachia supergroup recovered from spider mites (Acari: tetra- Lindsey ARI, Werren JH, Richards S, Stouthamer R. 2016. Comparative nychidae). Appl Environ Microbiol. 75(4):1036–1043. genomics of a parthenogenesis-inducing Wolbachia symbiont. G3; Rousset F, Bouchon D, Pintureau B, Juchault P, Solignac M. 1992. 6(7):2113–2123. Wolbachia endosymbionts responsible for various alterations of sexu- Lo N, Casiraghi M, Salati E, Bazzocchi C, Bandi C. 2002. How many ality in arthropods. Proc R Soc Lond B. 250(1328):91–98. Wolbachia supergroups exist? Mol Biol Evol. 19(3):341–346. Schilthuizen M, Stouthamer R. 1997. Horizontal transmission of Macke TJ, et al. 2001. RNAMotif, an RNA secondary structure defi- parthenogenesis-inducing microbes in Trichogramma wasps. Proc R nition and search algorithm. Nucleic Acids Res. Soc Lond B. 264(1380):361–366. 29(22):4724–4735. scikit-bio.org. scikit-bio: core bioinformatics data structures and algorithms Mantel N. 1967. The detection of disease clustering and a generalized in Python. caporasolab.us: Northern Arizona University. regression approach. Cancer Res. 27:209–220. Serbus LR, Casper-Lindley C, Landmann F, Sullivan W. 2008. The genetics Masui S, Kamoda S, Sasaki T, Ishikawa H. 2000. Distribution and evolution and cell biology of Wolbachia-host interactions. Annu Rev Genet. of bacteriophage WO in Wolbachia, the endosymbiont causing sexual 42:683–707. alterations in arthropods. J Mol Evol. 51(5):491–497. Shropshire JD, Bordenstein SR. 2016. Speciation by symbiosis: the micro- McClure R, et al. 2013. Computational analysis of bacterial RNA-Seq biome and behavior. MBio 7(2):e01785-15. data. Nucleic Acids Res. 41(14): e140, https://doi.org/10.1093/nar/ Siggers KA, Lesser CF. 2008. The yeast Saccharomyces cerevisiae:a versa- gkt444. tile model system for the identification and characterization of bacte- McGraw E, Merritt D, Droller J, O’Neill S. 2001. Wolbachia-mediated rial virulence proteins. Cell Host Microbe 4(1):8–15. sperm modification is dependent on the host genotype in Sinkins SP, et al. 2005. Wolbachia variability and host effects on crossing Drosophila. Proc R Soc Lond B. 268(1485):2565–2570. type in Culex mosquitoes. Nature 436(7048):257–260. Metcalf JA, Jo M, Bordenstein SR, Jaenike J, Bordenstein SR. 2014. Recent So ¨ ding J, Biegert A, Lupas AN. 2005. The HHpred interactive server for genome reduction of Wolbachia in Drosophila recens targets phage protein homology detection and structure prediction. Nucleic Acids WO and narrows candidates for reproductive parasitism. PeerJ 2:e529. Res. 33:W244–W248. Miller WJ, Ehrman L, Schneider D. 2010. Infectious speciation revisited: Stamatakis A. 2014. RAxML version 8: a tool for phylogenetic analysis and impact of symbiont-depletion on female fitness and mating behavior post-analysis of large phylogenies. Bioinformatics 30(9):1312–1313. of Drosophila paulistorum. PLoS Pathog. 6(12):e1001214. Stouthamer R, Breeuwer JAJ, Luck RF, Werren JH. 1993. Molecular-iden- Min K-T, Benzer S. 1997. Wolbachia, normally a symbiont of Drosophila, tification of microorganisms associated with parthenogenesis. Nature can be virulent, causing degeneration and early death. Proc Natl Acad 361(6407):66–68. Sci U S A. 94(20):10792–10796. Stouthamer R, Kazmer DJ. 1994. Cytogenetics of microbe-associated par- Moreau J, Bertin A, Caubet Y, Rigaud T. 2001. Sexual selection in an thenogenesis and its consequences for gene flow in Trichogramma isopod with Wolbachia-induced sex reversal: males prefer real females. wasps. Heredity 73(3):317–327. J Evol Biol. 14(3):388–394. Stouthamer R, Luck R. 1993. Influence of microbe-associated partheno- Moreira LA, et al. 2009. A Wolbachia symbiont in Aedes aegypti limits genesis on the fecundity of Trichogramma deion and T. pretiosum. infection with dengue, chikungunya, and Plasmodium.Cell Entomol Exp Appl. 67(2):183–192. 139(7):1268–1278. Stouthamer R, Luck RF, Hamilton WD. 1990. Antibiotics cause partheno- Newton IL, et al. 2016. Comparative genomics of two closely related genetic Trichogramma (Hymenoptera, Trichogrammatidae) to revert Wolbachia with different reproductive effects on hosts. Genome Biol to sex. Proc Natl Acad Sci U S A. 87(7):2424–2427. Evol. 8(5):1526–1542. Sutton E, Harris S, Parkhill J, Sinkins S. 2014. Comparative genome analysis Noda H, Koizumi Y, Zhang Q, Deng K. 2001. Infection density of of Wolbachia strain wAu. BMC Genomics 15:928. Wolbachia and incompatibility level in two planthopper species, Teixeira L, Ferreira A, Ashburner M, Keller L. 2008. The bacterial symbiont Laodelphax striatellus and Sogatella furcifera.Insect Biochem Mol Wolbachia induces resistance to RNA viral infections in Drosophila Biol. 31(6–7):727–737. melanogaster. PLoS Biol. 6(12):e1000002. O’Neill SL, Giordano R, Colbert AME, Karr TL, Robertson HM. 1992. 16S Tram U, Sullivan W. 2002. Role of delayed nuclear envelope breakdown ribosomal-RNA phylogenetic analysis of the bacterial endosymbionts and mitosis in Wolbachia-induced cytoplasmic incompatibility. Science associated with cytoplasmic incompatibility in insects. Proc Natl Acad 296(5570):1124–1126. Sci U S A. 89:2699–2702. Turelli M, Hoffmann AA. 1991. Rapid spread of an inherited incompati- Penz T, et al. 2012. Comparative genomics suggests an independent origin bility factor in California Drosophila. Nature 353(6343):440–442. of cytoplasmic incompatibility in Cardinium hertigii. PLoS Genet. Veneti Z, et al. 2003. Cytoplasmic incompatibility and sperm cyst infection 8(10):e1003012. in different Drosophila-Wolbachia associations. Genetics Perez F, Granger BE. 2007. IPython: a system for interactive scientific com- 164(2):545–552. puting. Comput Sci Eng. 9, doi: 10.1109/MCSE.2007.53. Walker T, et al. 2011. The wMel Wolbachia strain blocks dengue and Quenault T, Lithgow T, Traven A. 2011. PUF proteins: repression, activa- invades caged Aedes aegypti populations. Nature 476(7361):450–U101. tion and mRNA localization. Trends Cell Biol. 21(2):104–112. Randerson JP, Jiggins FM, Hurst LD. 2000. Male killing can select for male Wallau GL, da Rosa MT, De Re FC, Loreto EL. 2016. Wolbachia from mate choice: a novel solution to the paradox of the lek. Proc R Soc Drosophila incompta: just a hitchhiker shared by Drosophila in the Lond B. 267(1446):867–874. New and Old World? Insect Mol Biol. 25(4):487–499. 450 Genome Biol. Evol. 10(2):434–451 doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Evolutionary Genetics of cifA and cifB GBE Werren JH, Baldo L, Clark ME. 2008. Wolbachia: master manipulators of Yen JH, Barr AR. 1971. New hypothesis of the cause of cytoplasmic in- invertebrate biology. Nat Rev Microbiol. 6(10):741–751. compatibility in Culex pipiens.Nature Werren JH, Windsor DM. 2000. Wolbachia infection frequencies in insects: Zabalou S, et al. 2004. Wolbachia-induced cytoplasmic incompatibility as a evidence of a global equilibrium? Proc R Soc Lond B. means for insect pest population control. Proc Natl Acad Sci U S A. 267(1450):1277–1285. 101(42):15042–15045. Werren JH, Zhang W, Guo LR. 1995. Evolution and phylogeny of Zug R, Hammerstein P, Cordaux R. 2012. Still a host of hosts for Wolbachia—reproductive parasites of arthropods. Proc R Soc Lond Wolbachia: analysis of recent data suggests that 40% of terrestrial B. 261(1360):55–63. arthropod species are infected. PLoS One 7(6):e38544. Yamada R, Floate KD, Riegler M, O’Neill SL. 2007. Male development time influences the strength of Wolbachia-induced cytoplasmic incompati- bility expression in Drosophila melanogaster. Genetics 177(2):801–808. Associate editor: Daniel Sloan Genome Biol. Evol. 10(2):434–451 doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 451 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Genome Biology and Evolution Oxford University Press

Evolutionary Genetics of Cytoplasmic Incompatibility Genes cifA and cifB in Prophage WO of Wolbachia

Free
18 pages

Loading next page...
 
/lp/ou_press/evolutionary-genetics-of-cytoplasmic-incompatibility-genes-cifa-and-uZ5lv1McDC
Publisher
Oxford University Press
Copyright
© The Author(s) 2018. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
ISSN
1759-6653
eISSN
1759-6653
D.O.I.
10.1093/gbe/evy012
Publisher site
See Article on Publisher Site

Abstract

The bacterial endosymbiont Wolbachia manipulates arthropod reproduction to facilitate its maternal spread through host populations. The most common manipulation is cytoplasmic incompatibility (CI): Wolbachia-infected males produce mod- ified sperm that cause embryonic mortality, unless rescued by embryos harboring the same Wolbachia. The genes underlying CI, cifA and cifB, were recently identified in the eukaryotic association module of Wolbachia’s prophage WO. Here, we use transcriptomic and genomic approaches to address three important evolutionary facets of the cif genes. First, we assess whetherornot cifA and cifB comprise a classic toxin–antitoxin operon in wMel and show that the two genes exhibit striking, transcriptional differences across host development. They can produce a bicistronic message despite a predicted hairpin termination element in their intergenic region. Second, cifA and cifB strongly coevolve across the diversity of phage WO. Third, we provide new domain and functional predictions across homologs within Wolbachia, and show that amino acid sequences vary substantially across the genus. Finally, we investigate conservation of cifA and cifB and find frequent deg- radation and loss of the genes in strains that no longer induce CI. Taken together, we demonstrate that cifA and cifB exhibit complex transcriptional regulation in wMel, provide functional annotations that broaden the potential mechanisms of CI induction, and report recurrent erosion of cifA and cifB in non-CI strains, thus expanding our understanding of the most widespread form of reproductive parasitism. Key words: symbiosis, reproductive manipulation, gene loss, bacteriophage, prophage. Introduction produce modified sperm that can only be rescued by eggs The genus Wolbachia is the most widespread group of ma- infected with the same Wolbachia strain (Yen and Barr 1971). ternally transmitted endosymbiotic bacteria (Zug et al. 2012). If the modified sperm fertilize eggs infected with no They occur worldwide in numerous arthropods and nemato- Wolbachia (unidirectional CI) or a genetically incompatible des and can selfishly manipulate reproduction (Werren et al. Wolbachia strain (bidirectional CI), then delayed histone de- 2008), confer antiviral defense (Teixeira et al. 2008; Bian et al. position, improper chromosome condensation, and cell divi- 2010), and assist reproduction and development of their hosts sion abnormalities result in embryonic arrest and death (Lassy (Hoerauf et al. 1999; Dedeine et al. 2001; Hosokawa et al. and Karr 1996; Tram and Sullivan 2002; Serbus et al. 2008; 2010). The most common parasitic manipulation is cytoplas- Landmann et al. 2009). Other described reproductive manip- mic incompatibility (CI), whereby Wolbachia-infected males ulations include parthenogenesis (Stouthamer et al. 1990), The Author(s) 2018. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non- commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com 434 Genome Biol. Evol. 10(2):434–451. doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Evolutionary Genetics of cifA and cifB GBE male-killing (Hurst et al. 1999), and feminization (Rousset vertically transmitted from mothers to offspring. The genes et al. 1992), all of which enhance the fitness of Wolbachia- were proposed as candidate CI effectors due to the presence infected females and assist the spread of the infected matri- of one of the protein products in the spermathecae of line through a population. These manipulations, once sus- infected female mosquitoes (Beckmann and Fallon 2013) tained, can also impact host evolution including speciation and their absence in the wAu Wolbachia strain that lost CI (Bordenstein et al. 2001; Jaenike et al. 2006; Brucker and function (Sutton et al. 2014). Bordenstein 2013) and mating behaviors (Randerson et al. The wMel homologs of these genes are designated CI 2000; Moreau et al. 2001; Miller et al. 2010; Shropshire factors cifA (locus WD0631) and cifB (locus WD0632), with and Bordenstein 2016). cifA always encoded directly upstream of cifB (LePage et al. In addition to the aforementioned reproductive manipula- 2017). The gene set occurs in varying copy number across 11 tions, Wolbachia strains affect host biology by provisioning total CI-inducing strains, and the copy number tentatively nutrients (Hosokawa et al. 2010), altering host survivorship correlates with CI levels. Core sequence changes of the two (Min and Benzer 1997) and fecundity (Stouthamer and Luck genes exhibit a pattern of codivergence and in turn closely 1993; Dedeine et al. 2001), and importantly, protecting the match bidirectional incompatibility patterns between host against pathogens (Teixeira et al. 2008; Kambris et al. Wolbachia strains. Homologs of CifA and CifB protein 2009; Moreira et al. 2009; Bian et al. 2010; Hughes et al. sequences belong to four distinct phylogenetic Types (desig- 2011; Walker et al. 2011). The combination of reproductive nated Types I–IV) that do not correlate with various phylog- manipulations that enable Wolbachia to spread in a pop- enies of Wolbachia housekeeping genes or gpW (locus ulation and the ability to reduce vector competence through WD0640) in phage WO (LePage et al. 2017). The homolo- pathogen protection have placed Wolbachia in the forefront gous sequences in wPip also cluster in Type I, though they of efforts to control disease carrying arthropod populations are 66% and 76% different from wMel’s, respectively (Turelli and Hoffmann 1991; Zabalou et al. 2004; Hoffmann (Beckmann et al. 2017). Hereinafter we use cifA and cifB to et al. 2011; Walker et al. 2011; LePage and Bordenstein 2013; refer to these genes, unless specifically referring to analyses of Bourtzis et al. 2014). Despite these important applications, the the wPip homologs, cidA and cidB. In vitro functional analyses widespread prevalence of Wolbachia across arthropod taxa revealed that cidB encodes deubiquitylase activity, and cidA (Werren and Windsor 2000; Hilgenboecker et al. 2008; Zug encodes a protein that binds CidB (Beckmann et al. 2017). et al. 2012), and decades of research, only recently have the Mutating the predicted catalytic residue in the deubiquitylat- genes underlying CI been determined (Beckmann et al. 2017; ing domain of CidB results in a loss of the CI-like function in LePage et al. 2017). transgenic flies (Beckmann et al. 2017). Whether these genes Two studies converged on the same central finding: or other alleles have additional enzymatic or regulatory roles Coexpression of a pair of syntenic genes recapitulates the CI and which other residues are important for function remain phenotype (Beckmann et al. 2017; LePage et al. 2017). open questions. Uninfected Drosophila melanogaster males transgenically There are important considerations for the location, or- expressing the two genes from wMel Wolbachia caused CI- ganization, and characterization of these genes. Whether or like embryonic lethality when crossed with uninfected females not cifA and cifB form a strict, toxin–antitoxin operon is de- that was notably rescued by wMel-infected females (LePage batable, and likewise has important implications for how et al. 2017). Additionally, the two wMel genes separately gene expression is regulated by Wolbachia during host in- enhanced wMel-induced CI in a dose-dependent manner fection. Support for the operon hypothesis is based on weak when expressed in infected males, and the CI was again res- transcription across the junction between cidA and cidB,in- cued by wMel-infected females (LePage et al. 2017). In the ferred to be due to the presence of bicistronic mRNA other study, CI-like embryonic lethality was also recapitulated (Beckmann and Fallon 2013; Beckmann et al. 2017); an al- in D. melanogaster males through transgenic coexpression of ternative explanation is transcriptional slippage. Quantitative homologous transgenes cidA and cidB, encoded by the wPip transcription analyses and various computational predictions strain of Wolbachia that naturally infect Culex mosquitoes of operon structure do not support the operon hypothesis (Beckmann et al. 2017). These two genes are located in the (LePage et al. 2017). Moreover and importantly, transgenic recently discovered eukaryotic association module of temper- studies show that both cifA and cifB are required for induc- ate phage WO (Bordenstein SR and Bordenstein SR 2016), tion of CI and thus cannot form a strict toxin (cifB)–antitoxin which was previously implicated in influencing CI (Masui (cifA) system as both genes positively contribute to CI and et al. 2000; Sinkins et al. 2005; Bordenstein et al. 2006; can individually enhance Wolbachia-induced CI (LePage Duron et al. 2006). The presence of these genes within pro- et al. 2017). However, like toxin–antitoxin systems, CidA phage WO has implications for the transmission of these binds CidB in vitro and expression of cidA rescues genes because temperate phage WO exhibits frequent lateral temperature-sensitive growth inhibition induced by cidB ex- transfers between Wolbachia (Bordenstein and Wernegreen pression in Saccharomyces, via an as-yet-unknown mecha- 2004; Chafee et al. 2010) while Wolbachia are mainly nism (Beckmann et al. 2017). Genome Biol. Evol. 10(2):434–451 doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 435 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Lindsey et al. GBE As it stands now, the genes remain largely unannotated to individual flies and samples homogenized using a pestle. with the exception of a few small domains. If other predicted After a 5-min incubation at room temperature, a 12,000 rcf protein domains occur in CifA and CifB, they could allow for centrifugation (at 4 C for 10 min) was followed by a chloro- new hypotheses for the mechanism of CI. Finally, the se- form extraction. Aqueous phase containing RNA was quence diversity and/or loss of cif genes across the extracted a second time with phenol: Chloroform before iso- Wolbachia tree may give insights into the selective conditions propanol precipitation of RNA. This RNA pellet was washed that maintain the cif genes versus those that do not. and resuspended in THE RNA Storage Solution (Ambion). RNA Exploration of cif gene regulation, expression, and function used in subsequent analyses was subjected to a short DNAse thus can provide a framework for more targeted investiga- treatment (10 min at 37 C then 10 min at 75 Cto inactivate tions of Wolbachia–host interactions, and potentially inform the enzyme). To detect the number of cifA and cifB transcripts the deployment of Wolbachia-based arthropod control. as well as RNA levels across the junction between cifA and cifB, we utilized the RNA extracted from these flies and the SensiFAST SYBER Hi-ROX One-step RT mix (Bioline) and the Materials and Methods Applied Biosystems StepOne Real-time polymerase chain re- Expression action (PCR) system. Quantitative reverse transcription poly- merase chain reaction (qRT-PCR) was performed with the For analysis of RNAseq data, we used our published approach following primer sets: cifAF: ATAAAGGCGTTTCAGCAGGA, (Gutzwiller et al. 2015). Briefly, fastq sequences for 1-day-old cifAR: TCAATGAGGCGCTTCTAGGT; cifBF: TACGGGAAG male and female flies were mapped against the Wolbachia TTTCATGCACA, cifBR: TTGCCAGCCATCATTCATAA; cifA_ wMel reference genome (GenBank AE017196) using bwa endF: TCTGGTTCTCATAAGAAAAGAAGAATC, cifB_begR: mem v. 0.7.5a with default parameters in paired-end mode. AACCATCAAGATCTCCATCCA. As a reference for transcrip- Mapped reads were sorted and converted to BAM format tion activity of the core Wolbachia genome, we utilized the using samtools v0.1.19 after which BAM files were used as Wolbachia ftsZ gene (forward: TTTTGTTGTCGCAAATACCG; input to Bedtools (bedcov) to generate pileups and count cov- reverse: AGCAAAGCGTTCACATTTCC). We designed primers erage at each position. For expression correlations between to ftsZ because as a core protein involved in cell division, the genes, the raw RNAseq counts were divided by (gene length- quantities of ftsZ would better correlate with bacterial num- þ 99), where 99 corresponds to read length (100)  1. Within bers and activity. Reactions were performed in duplicate or a growth stage these values were multiplied by 1e /(sum of triplicate in a 96-well plate and CT values generated by the values in stage) (Li and Dewey 2011). A pairwise distance machinewereused to calculate therelative amounts of between all genes was defined as (1  R), where the R is Wolbachia using the DDCt (Livak) method. the Pearson correlation coefficient between the normalized To identify a bicistronic message encompassing cifA and expression values of two genes. Possible negative correlations cifB, we designed primers based on the 5 -region of cifA and would be “penalized” here, resulting in a larger distance. the 3 -region of cifB (WD0631F: ATAAAGGCGTTTC Distances were clustered using the Kitsch program of AGCAGGA; WD0632R: TTGCCAGCCATCATTCATAA). We PHYLIP (Felsenstein 1989). extracted RNA from whole animals and performed a DNAse treatment (as described above) before using the Operon Prediction In Silico iScript first strand synthesis kit (Biorad) to generate cDNA. Negative controls included RT minus reactions. Resulting We used the dynamic profile of the transcriptome above to cDNA and negative controls were used in PCR reactions identify operons within the wMel genome using two different with the primers above and the following cycling condi- approaches. We used the program Rockhopper (McClure tions: 95 C 5 min then 35 cycles of 95 C for 1 min, 64 et al. 2013), with default parameters, in conjunction with C for 1 min, 72 C for 2.5 min followed by a final extension the BAM files generated above to delineate likely operons of 72 C for 10 min and using the HF Phusion enzyme mix across the entire genome. The Arnold web server (http:// (NEB). As a positive control, to confirm that we could am- rna.igmors.u-psud.fr/toolbox/arnold/) was used to predict plify long mRNAs from these samples, we used the 16S hairpin transcription termination elements (Gautheret and rRNA gene primers 27F (AGAGTTTGATCCTGGCTCAG) Lambert 2001; Macke et al. 2001). and 1492R (GGTTACCTTGTTACGACTT) with the same cycling conditions as above except that the annealing tem- Nucleic Acid Extractions and Quantitative Reverse perature was 55 C. Transcription Polymerase Chain Reaction To identify Wolbachia gene expression in adult male and fe- Correlated Cif Trees and Distance Matrices male D. melanogaster, RNA was extracted from individual, age-matched flies (1–3 days old, stock 145) using a modified Quantifying congruence scores between the CifA and CifB Trizol extraction protocol. Briefly, 500 mlof Trizol was added trees was carried out with Matching Cluster (MC) and 436 Genome Biol. Evol. 10(2):434–451 doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Evolutionary Genetics of cifA and cifB GBE Robinson–Foulds (RF) metrics using a custom python script Cif Phylogenetics previously described (Brooks et al. 2016) and the TreeCmp CifA and CifB protein sequences were identified using BlastP program (Bogdanowicz et al. 2012). MC weights topological searches of WOMelB WD0631 (NCBI accession number congruency of trees, similar to the widely used RF metric. AAS14330.1) and WD0632 (AAS14331.1), respectively. However, MC takes into account sections of subtree congru- 30 Homologs were selected based on: 1) E¼ 10 ,2)query ence and therefore is a more refined evaluation of small to- coverage greater than 70%, and 3) presence in fully se- pological changes that affect incongruence. Significance in quenced Wolbachia genomes. All sequences were intact the MC and RF analyses was determined by the probability with the exception of a partial WOSuziC CifA of 100,000 randomized bifurcating dendrogram topologies (WP_044471252.1) protein. The missing N-terminus was yielding equivalent or more congruent trees than the actual translated from the end of contig accession number tree. Normalized scores were calculated as the MC and RF CAOU02000024.1 and concatenated with partial protein congruency score of the two topologies divided by the max- WP_044471252.1 for analyses, resulting in 100% amino imum congruency score obtained from random topologies. acid identity to WORiC CifA (WP_012673228.1). In addition, The number of trees that had an equivalent or better score two previously identified sequences (LePage et al. 2017), than the actual tree was used to calculate the significance of WORecB CifB and WORiB CifB, were not available in NCBI’s observing that topology. Mantel tests were also performed on database and translated from nucleotide accession numbers the CifA and CifB patristic distance matrices calculated in JQAM01000018.1 and CP001391.1, respectively. The previ- Geneious v8.1.9 (Kearse et al. 2012). A custom Jupyter note- ously identified WOSol homologs (CifA: AGK87106 and CifB: book (Perez and Granger 2007) running python v3.5.2 (http:// AGK87078) (LePage et al. 2017) were also included in our python.org) was written in the QIIME2 (Caporaso et al. 2010) analyses. All protein sequences were aligned with the anaconda environment. The Mantel test (Mantel 1967)uti- MUSCLE (Edgar 2004) plugin in Geneious Pro version 8.1.7 lized the scikit-bio v0.5.1 (scikit-bio.org) Mantel function run, (Kearse et al. 2012); the best models of evolution, according using scikit-bio distance matrix objects for each gene. The to corrected Akaike (Hurvich and Tsai 1993) information cri- Mantel test was run with 100,000 permutations to calculate teria, were estimated to be JTT-G using the ProtTest server significance of the Pearson correlation coefficient between (Abascal et al. 2005); and phylogenetic trees were built using the two matrices using a two-sided correlation hypothesis. the MrBayes (Ronquist et al. 2012) plugin in Geneious. Protein Structure Genomes Used in Comparative Analyses All candidate CI gene protein sequences were individually In order to identify cif homologs across the Wolbachia assessed for the presence of domain structure using HHpred genomes, we defined orthologs across existing, sequenced (https://toolkit.tuebingen.mpg.de/hhpred/; So¨ding et al. genomes using reciprocal best BlastP. We included 2005)) with default parameters and the following databases: Wolbachia genomes across five supergroups: Monophyletic SCOPe70 (v.2.06), Pfam (v.31.0), SMART (v6.0), and COG/ clades of Wolbachia based on housekeeping genes, denoted KOG (v1.0). Schematics were created in inkscape (https://ink- by uppercase letters (O’Neill et al. 1992; Werren et al. 1995). scape.org/), to show regions with significant structural hits, as Supergroups A and B are the major arthropod infecting line- determined by probabilities greater than 50%, or greater ages, whereas C and D infect nematodes (Bandi et al. 1998). than 20% and in the top five hits. Supergroup F Wolbachia infect a variety of hosts (Lo et al. 2002). Included in this analysis were nine type A strains Protein Conservation (wRi, wSuzi, wHa, wMel, wMelPop, wAu, wRec, wUni, and wVitA), seven type B strains (wPipJHB, wPipPel, wBol1-b, Protein conservation was determined with the Protein Residue wNo, wTpre, wAlbB, and wDi), two type C strains (wOv Conservation Prediction tool (http://compbio.cs.princeton. and wOo), and one each type D (wBm) and type F (wCle). edu/conservation/index.html; Capra and Singh 2007), using We included all genomic data available for each strain such aligned amino acid sequences, Shannon entropy scores, a that if multiple assemblies existed for each Wolbachia variant window size of zero, and sequence weighting set to “false.” (such as in the case of wUni) we included the union of all Conservation was subsequently plotted in R version 3.3.2, and available contigs for that strain. Wolbachia orthologs were module regions were delineated according to the coordinates defined based on reciprocal best blast hits between amino of the WOMelB modules within the alignment. CI gene con- acid sequences in Wolbachia genomes. An orthologous group servation scores were calculated separately for Type I sequen- of genes was defined by complete linkage such that all mem- ces, and for all Types together. For CifB Type I sequences, the bers of the group had to be the reciprocal best hit of all other WOVitA4 ortholog was left out, due to the extended C-ter- members of the group. Information on strain phenotypes, minus of that protein. Conservation scores were also calcu- hosts, and accession numbers can be found in table 1. lated for “control proteins”: Wsp (Wolbachia surface protein), Genome Biol. Evol. 10(2):434–451 doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 437 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Lindsey et al. GBE Table 1 Genomes Used in Comparative Analyses of cifA and cifB Supergroup Strain Host Reproductive Phenotypes Accession Number A wMel Drosophila melanogaster CI NC_002978.6 wMelPop Drosophila melanogaster CI AQQE00000000.1 wRec Drosophila recens CI NZ_JQAM00000000.1 wAu Drosophila simulans None LK055284.1 wHa Drosophila simulans CI NC_021089.1 wRi Drosophila simulans CI NC_012416.1 wSuzi Drosophila suzukii None NZ_CAOU00000000.2 wUni Muscidifurax uniraptor PI NZ_ACFP00000000.1 wVitA Nasonia vitripennis CI NZ_MUJM00000000.1 B wAlbB Aedes albopictus CI CAGB00000000.1 wNo Drosophila simulans CI NC_021084.1 wDi Diaphorina citri Undetermined NZ_KB223540.1 wTpre Trichogramma pretiosum PI CM003641.1 wVitB Nasonia vitripennis CI AERW00000000.1 wBol1-b Hypolimnas bolina CI, MK NZ_CAOH00000000.1 wPipJHB Culex quinquefasciatus CI ABZA00000000.1 wPipPel Culex pipiens CI NC_010981.1 C wOo Onchocerca ochengi OM NC_018267.1 wOv Onchocerca volvulus OM NZ_HG810405.1 D wBm Brugia malayi OM NC_006833.1 F wCle Cimex lectularius OM NZ_AP013028.1 Note.—Reproductive phenotypes include: CI, parthenogenesis-inducing (PI), male-killing (MK), obligate mutualism (OM), no phenotype discovered after assessment (None), and phenotype was not assayed (Undetermined). known to be affected by frequent recombination events unrelated endosymbiont that can also cause CI in arthropods (Baldo et al. 2005), and FtsZ, which is relatively unaffected (Penz et al. 2012). Here, searches were performed with by recombination (Baldo, Dunning Hotopp, et al. 2006; Ros TBlastN and restricted to all available Cardinium sequence in et al. 2009). Variation in amino acid conservation between NCBI GenBank (taxid: 273135). modules and nonmodule regions was assessed in R version 3.3.2 with a one-way ANOVA including “region” (either the Hidden Markov Model Searches unique module number, or “nonmodule”) as a fixed effect, To identify cif homologs in draft Wolbachia genome assem- and followed by Tukey Honest Significant Difference for post blies we used the program suite HMMER (Eddy 2011). We hoc testing. defined cif Types based on our phylogenetic trees (fig. 4)and used aligned amino acids from these Types as input to Cif Modules HMMBUILD, using default parameters. We then searched six Wolbachia WGS assemblies (NCBI project numbers The WOMelB structural regions delineated by HHpred were PRJNA310358, PRJNA279175, PRJNA322628) using used to search for the presence of Cifs or remnants of Cifs HMMSEARCH with –F3 1e-20 –cut_nc and –domE 1e-10. across the Wolbachia phylogeny. Amino acid sequences of Regardless of thresholds used, or cif type of HMM, resulting the WOMelB modules were queried against complete ge- hits did not differ. nome sequences (table 1) using TBlastN. Any hit that was at least 40% of the length and 40% identity, or at least 90% of the length and 30% identity of the WOMelB module was Results considered a positive match. Module presence was plotted cifA and cifB Are Cotranscribed but Differentially across a Wolbachia phylogeny constructed using the five mul- Regulated in wMel tilocus sequence typing (MLST) genes defined by Baldo, Dunning Hotopp, et al. (2006). Nucleotide sequences were In bacteria, genes are commonly grouped into a single tran- aligned with MAFFT version 7.271 (Katoh and Standley scriptional unit under the control of one promoter, referred to 2013), and concatenated prior to phylogenetic reconstruction as an operon. Because cifA and cifB are syntenic across pro- with RAxML version 8.2.8 (Stamatakis 2014), the phage WO of Wolbachia andbothinvolvedinCI, we aimedto GTRGAMMA substitution model, and 1,000 bootstrap repli- assess whether cifA and cifB are cotranscribed. We performed cates. We also searched for cif-like regions in Cardinium:An RT-PCR using primers that amplify the entire region from the 438 Genome Biol. Evol. 10(2):434–451 doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Evolutionary Genetics of cifA and cifB GBE (A) 1kb ladder (B) females WD0630 cifA cifB WD0633 males forward reverse FIG.1—Expression of cifA and cifB in adult flies. (A)Amplification of cifA–cifB bicistronic message from cDNA generated from adult flies. Positive amplification occurred in both male and female adult flies. RT minus controls included. (B) RNAseq expression from 1-day-old female and male Drosophila melanogaster flies. Raw reads were mapped to the wMel assembly (using bwa) and coverage visualized using the Integrated Genomics Viewer (v2.3.77). The startofthe cifB open reading frame is denoted by a vertical, dotted line. start of cifA to the end of cifB (2.5 kbintotal). cDNA Operons are often comprised of loci encoding related amplification of the region from wMel-infected male and processes that can therefore be coregulated conveniently female flies was successful (fig. 1A), and the transcript through the control of transcription from one promoter. To wasconfirmedtobe cifA–cifB using Sanger sequencing assess whether cifA and cifB are coregulated, we reasoned from the forward and reverse ends of the cDNA amplicon that strictly coregulated loci will have correlated gene expres- (supplementary file S2, Supplementary Material online). sion across host development and similar total expression lev- We could not amplify a larger transcript from the loci els in whole animals. We therefore utilized an existing RNAseq flanking cifA and cifB, suggesting that the cifA–cifB tran- data set for Wolbachia in Drosophila melanogaster,covering script is adiscreteunit. 24 life cycle stages and 3 time samplings each for adult males Genome Biol. Evol. 10(2):434–451 doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 439 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Lindsey et al. GBE and females (Gutzwiller et al. 2015). We mapped reads to the (A) Junction existing wMel assembly (see Materials and Methods) and cal- cifA cifB culated Pearson correlation coefficients of normalized expres- (B) sion values across host development for between all gene pairs. cifA is expressed at much higher absolute levels than 1.2 cifB (fig. 1B; 8-fold higher based on RPKM values across both genes), and cifA and cifB expression is weakly, negatively cor- 1.0 related (Pearson r: 0.40; P-value: 0.014), suggesting that the expression level of one could have a negative influence on the 0.8 level of the other. To confirm differential expression of cif genes in wMel, we performed a quantitative RT-PCR analysis 0.6 of gene expression from 3-day-old male and female flies (fig. 2). We observed transcripts covering the junction 0.4 between cifA and cifB. However, transcripts covering this junction were more similar to the expression levels in cifA, 0.2 whereas expression of cifB was 9-fold less, supporting results from the RNAseq analysis. As a possible explanation for the large absolute differences cifA Junction cifB in transcription, we examined the intergenic sequence be- FIG.2—Relative expression ratio of cifA, the junction between cifA/ tween cifA and cifB and identified a Rho-independent tran- cifB,and cifB to ftsZ. Expression of both genes and their junction was scription terminator at nucleotides 618649–618668. This quantified using qRT-PCR, and normalized to Wolbachia ftsZ gene expres- terminator region is predicted to form a GC-rich hairpin sion. cifB gene expression is significantly less than that of the junction (50% GC compared with the Wolbachia wMel genome- (t¼ 3.220, df¼ 16, P¼ 0.005) and less than cifA (t ¼3.840, df¼ 17, wide 35%) in newly synthesized mRNA message proximal P¼ 0.001). to the RNA polymerase. There are two explanations for how the terminator might explain the transcript abundance BAM files as input to Rockhopper (McClure et al. 2013). The differences between cifA and cifB, and both have an impact program was able to correctly identify known operons in on the operon hypothesis. First, cifA and cifB have their own wMel (such as the T4SS WD0004–WD0008 and the ribo- promoters, but occasionally the genes are cotranscribed as a somal protein operon), but it did not identify cifA and cifB bicistronic message due to an imperfect hairpin terminator at as an operon. the end of cifA. In this model, cifA and cifB do not form an In summary, although a bicistronic message of cifA and cifB operon. Alternatively, the cifA and cifB operon has a single was detected by qPCR, their absolute and relative expression promoter upstream of cifA, and the imperfect terminator levels are drastically different. A termination signal in their provides a mechanism to control transcriptional differences intergenic sequence may limit expression of the bicistronic between cifA and cifB in the operon. Functionally resolving message and could explain the much higher absolute level whether cifA and cifB have the same or different promoters of cifA. Given the negative correlation across growth stages, will be the ultimate arbiter of the two models. some entity that activates cifA transcription, or a cifA product In order to identify loci with similar expression patterns itself, could repress cifB transcription. Clearly, cifA and cifB are during host development, we clustered all wMel genes based not a traditional operon and functionally resolving their pro- on their similarity in expression across Drosophila develop- moter(s) will reveal much about the regulation of the repro- ment (supplementary fig. S1, Supplementary Material online). ductive manipulations induced by Wolbachia. cifA did not group with cifB in wMel (fig. 3), suggesting that these two genes are not similarly expressed. Indeed, the pat- New Protein Domain Predictions Are Variable across the tern of cifA expression differs strikingly from that of cifB.For Cif Phylogeny example, cifA is relatively highly expressed during late em- bryogenesis and in adults, whereas cifB is relatively highly We recovered the four previously identified phylogenetic expressed during the first two-thirds of embryogenesis, and Types (LePage et al. 2017). Here, our analyses include addi- during larval stages (fig. 3A). Curiously, the expression profile tional strains that cause reproductive parasitism beyond CI of cifA in flies during development is most closely correlated (parthenogenesis and male-killing, table 1), and the more di- with the wsp locus WD1063 (fig. 3B). vergent Type IV paralogs for cifA, sofar identified inB- Because of the dramatic absolute difference in cifA and cifB Supergroup Wolbachia. We recover a set of Type III alleles transcript levels, computational methods for operon predic- from wUni, a strain that induces parthenogenesis in the par- tion do not support their cotranscription. For example, after asitoid wasp, Muscidifurax uniraptor (Stouthamer et al. 1993). mapping reads to the wMel assembly, we used the resulting The wBol1-b strain, a male-killer that has retained CI 440 Genome Biol. Evol. 10(2):434–451 doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Ratio of Expression (locus/ftsZ ) Evolutionary Genetics of cifA and cifB GBE FIG.3—Gene expression of cifA and cifB during Drosophila melanogaster development. (A) Heatmap representation of normalized transcripts per base pair per million (TPM) for both cifA and cifB during Drosophila melanogaster development. cifB is highly expressed during embryogenesis and downregulated after pupation, whereas cifA is more highly expressed in adults and pupae. Clustering of Wolbachia loci based on expression across fly development illustrates correlated expression profiles between wMel loci and cifA (B)or cifB (C). Mobile elements and loci involved in host interaction (wsp)are indicated with vertical lines on the right side of the figure. capabilities (Hornettetal. 2008), has alleles belonging to both Wolbachia strains that cause CI, parthenogenesis, male- Type I and Type IV. killing, or no reproductive phenotype were characterized by Homologs and predicted protein domains of CifA and CifB HHpred homology and domain structure prediction software for all four phylogenetic Types (LePage et al. 2017)from (So ¨ ding et al. 2005). Search parameters are described in the Genome Biol. Evol. 10(2):434–451 doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 441 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Lindsey et al. GBE Table 2 methods. Several new prominent protein domains, herein re- Predicted Structural Modules of Cif Proteins ferred to as “modules,” were identified for each CifA and CifB protein sequence (table 2). Protein Module Size Range Homology (AA) For CifA, three modules were annotated (fig. 4A, table 2). First, the most N-terminal module (ModA-1) is only recovered CifA ModA-1 21–22 • Catalase-rel, decomposes hy- drogen peroxide into water in Type I variants, with distant homology (22% amino acid and oxygen identity) to Catalase-rel that is predicted to catalyze the break- ModA-2 65–264 • DUF3243 domain of unknown down of hydrogen peroxide (Chelikani et al. 2004). The prob- function ability of the module being homologous to Catalase-rel is low • Puf family RNA binding (prob¼ 21–24), but the consistent recovery of structure in this • Globin-like protein region across Type I alleles is notable. The second CifA module ModA-3 47–74 • STE-like transcription factor in the central region (ModA-2) has homology to a domain of CifB ModB-1 103–133 • PDDEXK, PD-(D/E)XK nuclease unknown function (Types I, II, and IV, prob¼ 27–64), globin- superfamily like domains (Types II and IV, prob¼ 21–30), and Puf family • Endonuclease NucS RNA-binding domains (Types III and IV, prob¼ 25–49). The • Restriction endonuclease-like last CifA module in the C-terminal region (ModA-3) has hits to family • HSDR_N, type I restriction en- an STE-like transcription factor in all Types (prob¼ 27–42). In zyme R protein N terminus general, the CifA proteins showed distant homology to ModB-2 122–205 • PDDEXK, PD-(D/E)XK nuclease known domains, but we consistently recovered the same superfamily regions of structure within CifA protein Types. • MmcB-like DNA repair protein For CifB, three modules were also defined (fig. 4B, table 2). • COG5321, uncharacterized The first (ModB-1) and second (ModB-2) most N-terminal protein regions of all Types both have matches to the PDDEXK nucle- • HSDR_N, type I restriction en- ase family (prob¼ 57–98) and various other restriction endo- zyme R protein N-terminus nucleases such as NucS, HSDR_N, and MmcB (prob¼ 50–91). • Endonuclease NucS The third module, found only in the Type I C-terminus (ModB- ModB-3  95–147 • Ulp-1, ubiquitin-like proteases 3), has very strong homology to a number of ubiquitin- • Various proteases and pepti- dases (C5, C57, Sentrin-specific modification and protease-like domains (prob¼ 71–96). This protease) was expected, as ModB-3 contains the predicted catalytic res- idue associated with CI function in CidB, known to have deu- biquitylase activity (Beckmann et al. 2017). WOVitA4 (Type 1) Colors next to modules are used throughout the Figures. Only present in Type I. has an extended C-terminus not present in any other alleles, and within that extended C-terminus is an additional struc- tural domain, with homology to a Herpesvirus tegument pro- tein (prob¼ 53), and a phosphohydrolase-associated domain (prob¼ 57). CifB Type IV alleles (WOAlbB, WOPip2, and (LePage et al. 2017). Here, we statistically ground the in- wBol1-b) were not included in the phylogenetic reconstruc- ference of codivergence using the largest set of tion, as they are highly divergent and not reciprocal blasts of Wolbachia homologs to date. We quantified congruence WOMelB cifB. Despite their divergence, these Type IV CifB between the CifA and CifB phylogenetic trees for Types I– alleles have similar structures to Type II and III alleles: Two III (supplementary file S1, Supplementary Material online) PDDEXK-like modules, and no Ulp-1-like module 3 (supple- using MC and RF tree metrics (Robinson and Foulds 1981; mentary fig. S3, Supplementary Material online). Full struc- Bogdanowicz et al. 2012; Bogdanowicz and Giaro 2013), tural schematics with exact coordinates and homology with normalized distances ranging from 0.0 (complete regions for each allele are available in the Supplementary congruence) to 1.0 (complete incongruence). Results Material online (supplementary figs. S2 and S3, show strong levels of congruence between CifA and Supplementary Material online), as are all significant domain CifB (P< 0.00001 for both, normalized MC ¼ 0.06 and hits with associated probabilities and extended descriptions normalized RF ¼ 0.125). To further statistically validate (supplementary tables S1 and S2, Supplementary Material the inference of codivergence, we measured the correla- online). tion between patristic distance matrices for CifA and CifB using the Mantel test (Mantel 1967). Results demonstrate a high degree of correlation between patristic distance CifA and CifB Codiverge matrices, and through permutation show that indepen- Initial phylogenetic trees based on core amino acid sequences dent evolution of CifA and CifB is highly unlikely of Type I–III variants of CifA and CifB exhibited similar trees (Pearson correlation coefficient ¼ 0.905, P ¼ 0.00001). 442 Genome Biol. Evol. 10(2):434–451 doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 100 WOBol1-b WP_019236549.1 WOPip2 WP_007302980.1 WOPipJHB WP_007302988.1 WOPipJHB WP_007302980.1 wVitA WP_077188282.1 wUni WP_077188282.1 WOSuziC, WORiC wNo WP_015587805.1 wAlbB WP_006014164.1 WOPip1 (CidA) WP_012481787.1 WOAlbB WP_006012794.1 WORiB wRi_p05380 WORiC WP_012673227.1 WOSol AGK87106 WOSuziC WP_044471252.1* WOMelB WP_010962722.1 Evolutionary Genetics of cifA and cifB GBE (A) CifA Type III wNo 208-366 427-475 490aa Type I STE Puf familiy RNA binding WOMelB, WOMelPop 189-211 285-350 389-457 474aa Catalase-rel DUF3243 STE Type IV WOPip2, WOPipJHB 188-409 382-429 445aa WOHa1 WP_015588933.1 DUF3243/Puf family RNA binding/ STE globin-like WOVitA4 ONI58213.1 0.2 WORiC WP_012673228.1 Type II WOMelPop WP_010962721.1 456aa 264-423 390-443 DUF3243/globin-like STE (B) CifB Type I WOMelB, WOMelPop, WOSuziB 1166aa 277-381 591-796 929-1057 WOVitA4 PDDEXK PDDEXK/MmcB-like/ Ulp1/Proteases HD_assoc/ COG5321/NucS Herpes_teg_N WOVitA4 ONI58212.1 Type III WOPip1 (CidB) WP_012481788.1 wAlbB, wVitA, wUni 191-322 551-676 696aa PDDEXK/NucS/ PDDEXK/MmcB-like/ restriction endonuclease COG5321/HSDR_N/NucS restriction endonuclease 0.2 Type II WOSuziC, WORiC 754aa 204-337 579-718 PDDEXK/NucS/ PDDEXK/MmcB-like/ HSDR_N-containing COG5321/HSDR_N/NucS FIG.4—Phylogenetic relationships and representative predicted protein structure of Cif protein Types. (A) CifA and (B) CifB. Alleles are in bold next to their corresponding accession number, and pink shapes around branches designate monophyletic “Types.” Representative structures are shown for each type, with the length of the protein indicated at the C-terminus. If genes differed by only a few amino acids a single representative is shown. Allele names use the previously described naming convention with a WO prefix referring to particular phage WO haplotype, and the w prefix indicating a phage WO-like island (LePage et al. 2017). The N-terminus of CifA WOSuziC (* in figure) was translated from the end of another contig and concatenated to get the full- length protein (see Materials and Methods). WOMelB and WOMelPop are identical at the amino acid level, as are WOPipJHB and WOPip2. Genome Biol. Evol. 10(2):434–451 doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 443 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 wAlbB WP_006014162.1 WOMelPop WP_038228284.1 WOSuziC WP_044471251.1 wNo WP_015587806.1 WOBol1-b2 WP_019236479.1 WOBol1-b WP_019236548.1 WOHa1 WP_015588932.1 WOSuziB WP_044471243.1 WOMelB WP_010962721.1 WOSuziB WP_044471237.1 WORecB WP_038198916.1 wVitA WP_077188281.1 WOSol AGK87078 wUni WP_077190377.1 WORiB WP_012673191.1 WORecB wRec0567 84 84 100 Lindsey et al. GBE FIG.5—Protein conservation, as determined by Shannon entropy scores. (A)Wsp,(B) Cell division protein FtsZ, (C)Type I CifA,(D)All CifA,(E)Type I CifB alleles except for WOVitA4, (F) All CifB alleles. Red dots in (E)and (F) indicate the ModB-3 catalytic residue (Beckmann et al. 2017), unique to and completely conserved for Type I alleles. Blue dots in (E)and (F) represent the (P)D-(D/E)XK motif (Kosinski et al. 2005)present in wMel. Table 3 Cif Proteins Evolve Rapidly Average Amino Acid Conservation of Cifs and Modules Amino acid sequence conservation across the full length of Protein Region Type I All the Cif proteins was determined and compared with CifA ModA-1 0.93 0.67 Wolbachia amino acid sequences of genes that either have ModA-2 0.84 0.53 signatures of recombination and directional section (Wsp) or ModA-3 0.77 0.53 have not undergone extensive recombination and directional CifA 0.83 0.58 selection (FtsZ, cell division protein). Wsp protein sequences CifB ModB-1 0.89 0.70 exhibit considerable divergence (mean conservation¼ 0.85), ModB-2 0.87 0.60 with very few sites in a row being completely conserved b ModB-3 0.79 0.40 (fig. 5A). In contrast, FtsZ is relatively conserved (mean con- CifB 0.82 0.43 servation¼ 0.94), and most of the divergence is clustered at Module number is defined in table 2. the C-terminus (fig. 5B). Mean conservation for the Cif pro- Only annotated in Type I. tein sequences was lower than Wsp—0.83 for Type I CifA alleles (fig. 5C) and 0.82 for Type I CifB alleles (fig. 5E, table 3). When all Cif alleles were considered, mean conservation was cause CI remains to be determined. Although the CifB pro- even further reduced—0.58 for CifA (fig. 5D) and 0.43 for teins are highly divergent, the catalytic residue (red dot in CifB (fig. 5F). The lower average conservation of CifB genes is fig. 5E and F) in the deubiquitylating module of CifB is unique in part due to the many insertions and deletions in the align- to and completely conserved for the Type I alleles. ment, and the missing C-terminal deubiquitylase region, The Cif proteins have extensive amounts of diversity, with ModB-3, of the Type II and III alleles. Thus, several CifB pro- completely conserved amino acids distributed across the teins apparently lack this activity, and whether these variants length of the protein, and not confined to any particular 444 Genome Biol. Evol. 10(2):434–451 doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Evolutionary Genetics of cifA and cifB GBE regions (fig. 5C–F, supplementary tables S3–S6, more modules are recovered from the CI-inducing strains Supplementary Material online). There were significant differ- than the non-CI inducing strains within the same Type. ences in the level of conservation between modules and non- The high number of modules in wSuzi and wRi is due to the module regions for the Type I alignments of both CifA (F presence of a duplicated set of Type I variants. We recovered 3, 490 ¼ 4.276, P¼ 0.0054) and CifB (F ¼ 9.703 P¼ 1.5e-06) many modules in wSuzi, which is a strain not known to induce 3, 1195 (table 3). The only modules that had significantly higher con- CI, but is sister to wRi, which can induce CI (Hamm et al. servation than the nonmodule regions of the alignment were 2014; Cattel et al. 2016). This discrepancy between cif pres- ModB-1 (P¼ 0.0173) and ModB-2 (P¼ 0.0011). The wMel ence and absence of a reproductive phenotype might be strain contains the (P)D-(D/E)XK motif in ModB-1 (blue dots explained by the disrupted Type II cifA in wSuzi. The split in fig. 5E and F)(Kosinski et al. 2005), but it is less than 80% WOSuziC sequenced was concatenated to allow for a more conserved across strains despite the higher average conserva- robust phylogenetic reconstruction (fig. 4), but it is in fact tion of this module. In contrast, wMel does not contain the disrupted by a transposase (Conner et al. 2017). However, catalytic motif in ModB-2, also a PDDEXK nuclease-like do- having a functional set of Type I cif alleles appears to be suf- main. The ModB-2 (P)D-(D/E)XK motif is present in Type IV ficient for CI-induction in other strains (Beckmann et al. 2017; alleles such as WOPip2 that when mutated no longer induces LePage et al. 2017), so it is not clear how inactivation of the growth defects in yeast (Beckmann et al. 2017). ModA-3 is Type II alleles here may affect the final CI phenotype in wSuzi. significantly less conserved than the nonmodule regions of Strain wDi, infecting the Asian citrus psyllid Diaphorina citri, Type I CifA (P¼ 0.0300). has no identified reproductive phenotype, but only contains two modules: ModB-1 and ModB-3. For wHa, we recovered duplicates of all the modules. These represent a highly dis- Cif Module Presence Generally Predicts Reproductive rupted copy of the gene set harboring frameshifts that were Phenotype annotated as pseudogenes. We used the wMel-predicted Cif modules as a seed to search The lack of evidence for homologous cif genes in the C, D, for the presence of homologous modules across Wolbachia and F Supergroup Wolbachia agrees with previous findings genome sequences using TBlastN (fig. 6), with the intent of (LePage et al. 2017) that CI-function is restricted to the Aþ B- discovering cif-like regions or remnants in strains with other Supergroup clade (likely due to WO phage activity), and the phenotypes. In strains with more divergent Cif Types, we absence of WO phages for the nematode-infecting strains report modules that were expected based on the HHpred (Gavotte et al. 2007). The loss of CI within the A and B results, but not recovered with TBlastN due to sequence di- Supergroups is likely a derived trait due to the rapid evolution vergence from WOMelB. Additionally, we recover homolo- of prophage WO (Ishmael et al. 2009; Kent, Salichos, et al. gous modules outside of the annotated cif open reading 2011) and relaxed selection after transition to a new repro- frames, such as the chromosomal region with a ModB-3 ductive phenotype. The low number of modules identified in (Ulp-1-like) region in wNo. The ModB-3 wNo module is genic, such strains is consistent with gene degradation and loss. found within a hypothetical protein (WP_041581315.1). Additionally, we recover no cif-like regions in Cardinium,a Whether or not these cif-like regions outside of prophage member of the Bacteroidetes, and an independent transition WO contribute to CI remains to be determined. All to a CI phenotype (Penz et al. 2012). Supergroup A and B strains, with the exception of wAu and To further explore the conservation of the cif genes across wTpre (non-CI inducing strains), contained at least one recov- the sequenced Wolbachia, and to uncover diversity that may ered module. be present in other genomes, we searched the WGS data- Importantly, all strains that are known to be capable of bases for recently sequenced genomic scaffolds from inducing or rescuing CI have two or more recovered modules, Wolbachia infecting the Nomada bees (wNleu, wNla, wNpa, though they do not necessarily have ModB-3, which contains and wNfe) (Gerth and Bleidorn 2016), Drosophila inocompta the catalytic residue implicated in CI function (Beckmann et al. (wInc_Cu) (Wallau et al. 2016), and Laodelphax striatellus 2017). The non-CI strains have fewer recovered modules: One (wStri) (GenBank accession number NZ_LRUH00000000.1) module (ModB-2) in wUni, and no modules in wCle, wAu, using HMMER. Only for wStri do we have direct evidence of wTpre and the nematode-infecting strains (Supergroups C CI induction (Noda et al. 2001)yet the wStri WGS projects and D). wUni is a unique case, where we identified cif alleles contain only one cif locus (CifB) with an unusual structure not in the genome, but recovered only one module. Most wUni found in any of the other Types. On the basis of HHpred modules are either missing (fig. 4A) or divergent enough from analyses, the wStri homolog contains a deubiquitylase region WOMelB that they were not considered a positive match. in the middle of the protein, with two downstream regions wAlbB and wNo, both CI-inducing strains with Type III and that have homology to glucosyl transferases and lipases, re- IV alleles, have fewer recovered modules, but this is congruent spectively (supplementary fig. S4, Supplementary Material on- with the more divergent nature of those Cif Types. It is nota- line). The wInc_Cu WGS project contained one each of CifA ble that despite the phylogenetic distance from WOMelB, and CifB alleles. The CifA allele from wInc_Cu is a typical Type Genome Biol. Evol. 10(2):434–451 doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 445 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Lindsey et al. GBE FIG.6—Presence of wMel-like Cif modules across the Wolbachia phylogeny. The WOMelB module sequences were used to query available Wolbachia genomes to look for the presence of Cif-like regions beyond those within the annotated Cifs (fig. 5). Colored dots correspond to the structural regions delimited by HHpred, shown in figure 4, and listed in table 2. A “C” within a dot indicates the presence of a module outside of annotated cif open reading frames (fig. 4 and supplementary figs. S2 and S3, Supplementary Material online). The black dot indicates a module annotated by HHpred, but not identified by TBlastN due to divergence from the WOMelB module. Black boxes labeled with uppercase letters indicate branches leading to Wolbachia Supergroups. Dotted lines on the phylogeny lead to taxon names and are not included in the branch length. I protein containing three modules: An N-terminal Catalase- Discussion rel domain and an internal DUF3242 domain, followed by the We explored three key features of cif evolution: 1) The operon STE-like transcriptional factor domain. Because these are in- hypothesis, 2) potential novel functions across the cifA and complete genome projects, it is possible that other cif homo- cifB phylogenies, and 3) the conservation and diversity of cif logs have been missed due to the current sequencing alleles across strains with different host-manipulation pheno- coverage. Alternatively, it is possible that other, as yet undis- types. We provide multiple lines of evidence that although covered, mechanisms of reproductive manipulation exist in cifA and cifB can be cotranscribed, they are divergently tran- these strains. In contrast, the Nomada-associated Wolbachia scribed in wMel during host development, suggesting a more contain a large repertoire of cif homologs, including Types I, II, complex regulation of gene expression than found in classical IV, and several homologs with variations on the Type IV do- operons. Indeed, cifB transcription has a significant negative main architecture for CifA (supplementary fig. S4, correlation with cifA.In Escherichia coli, operons frequently Supplementary Material online). Many of the CifB homologs have internal promoters and terminators that result in differ- are disrupted Type I variants that contain the deubiquitylase- ent units of transcription, which are preferentially used during like domain, but not the nuclease-like domains (supplemen- certain conditions (Conway et al. 2014). Although cifB is tary fig. S4, Supplementary Material online). 446 Genome Biol. Evol. 10(2):434–451 doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Evolutionary Genetics of cifA and cifB GBE expressed at about 1/10 the level of cifA across all life cycle yetto beexplored in vivo, but is a promising direction for stages, the significant negative correlation in their levels sug- understanding the evolution of Wolbachia–host associations. gests that the same factor(s) could upregulate cifA while Based on what is known about Wolbachia biology, some downregulating cifB,or a cifA encoded RNA or protein may of the protein domains may be especially good candidates for inhibit cifB expression or vice versa. The new annotations for further study and in vivo functional characterization. Predicted cifA alleles, including a Puf family RNA-binding-like domain PDDEXK-like nuclease domains are present in all four CifB and STE transcription factor, could theoretically play a role in Types. Given the predicted interaction of these domains inhibiting cifB expression. More detailed analyses from a vari- with DNA (Kosinski et al. 2005), and the presence of these ety of strains, cif Types, and conditions would help develop a domains across CifB proteins, determining whether and how comprehensive understanding of the factors regulating ex- these regions interact with host (Wolbachia or insect) DNA, pression of these genes and the CI phenotype. and whether or not they contribute to CI function would be It is especially interesting that cifA and cifB synteny is main- useful in understanding the consistent presence of this mod- tained across prophage WO regions, despite the high level of ule. Mutating the predicted catalytic site of the nuclease re- recombination and rearrangements in prophage WO and gion in wPip’s Type IV CifB (aka CinB) reduces toxicity in yeast Wolbachia genomes (Baldo, Bordenstein, et al. 2006; Kent, (Beckmann et al. 2017). However, this catalytic residue is not Funkhouser, et al. 2011; Ellegaard et al. 2013). Although it is conserved, so further exploration of nuclease function across not yet clear why cifA and cifB homologs maintain their syn- more divergent alleles will be useful. As aforementioned, tenic orientation, given that they have very different absolute many of the CifA alleles encode Puf family RNA-binding-like and relative expression levels, we hypothesize that this feature domains, which have previously been implicated in mRNA can be attributed to 1) their location within prophage WO localization and transcriptional regulation (Quenault et al. and/or 2) functions associated with the ability of cifA and cifB 2011). This RNA binding-like domain is found upstream of to act synergistically to induce CI (LePage et al. 2017), or 3) an STE transcription factor-like domain and could provide a with the potential antagonism of cifA on cifB transcripts or promising direction for understanding the complicated tran- transcription. For example, they could share the same pro- scriptional dynamics of the cif genes. moter, but in addition to the Rho-independent terminator, Wolbachia strains that have lost CI have a strong signature cifA may further inhibit cifB expression by binding to the inter- of cif gene degradation and loss, consistent with their role in genic region between them, causing the polymerase to ter- CI. The two parthenogenesis-inducing strains (wTpre and minate. Such a model would be consistent with both the wUni) appear to be at different places in this process of absolute expression differences and the negative correlation. gene loss, with divergent Cif amino acid sequences recovered Alternatively, cifA and cifB may have different promoters, but for wUni, but no modules identified in wTpre. There are sev- a bicistronic message occurs because of an imperfect hairpin eral explanations for this. wUni is likely a more recent transi- terminator between the genes. We conclude that cifA and tion to parthenogenesis, as it is closely related to a CI strain cifB are cotranscribed but not coregulated as in a classical (wVitA) (Baldo, Dunning Hotopp, et al. 2006; Newton et al. operon, and do not act strictly as a toxin–antitoxin system 2016). In comparison, wTpre is part of a unique clade of due to the requirement of both Cif proteins for the induction Wolbachia that all induce parthenogenesis in Trichogramma of CI in arthropods. Determining how cifA and cifB expression wasps (Rousset et al. 1992; Werren et al. 1995; Schilthuizen is regulated in the insect host will advance an understanding and Stouthamer 1997). This strain has lost its WO phage as- of both the basic biology of CI and vector control programs sociation and only has relics of WO phage genes (Gavotte that deploy CI to control disease transmission. et al. 2007; Lindsey et al. 2016). Additionally, the two strains Despite the conservation of gene order, Cif proteins that independently transitioned to the parthenogenesis phe- showed extensive amounts of divergence and differences in notype have evolved separate mechanisms for doing so domain structure as previously reported (LePage et al. 2017). (Stouthamer and Kazmer 1994; Gottlieb et al. 2002). Here, the levels of amino acid conservation in the Cifs are Differences in time since transition to the parthenogenesis lower than FtsZ and Wsp, the latter of which is known to phenotype, phage WO associations, and mechanisms of par- recombine and be subject to directional selection. The con- thenogenesis induction likely all play a role in the rate of cif servation of the predicted catalytic residue in the C-terminal gene degradation. deubiquitylase domain is an important feature of CidB Although cifA and cifB are prophage WO genes, not all CI- (Beckmann et al. 2017). Although this residue is required inducing strains have a complete prophage. Indeed, the wRec for CI induction in CidB and other Type I alleles (Beckmann strain of Wolbachia in Drosophila recens is one such example et al. 2017), only Type I alleles (of the four identified Types) where approximately three quarters of prophage WO genes have this domain. Strains known to induce CI, such as wAlbB were eliminated (Metcalfetal. 2014), previously resulting in and wNo do not have Type I alleles, implying that the deubi- failed detection of WO presence (Bordenstein and quitylase domain is not essential for inducing CI across other Wernegreen 2004). However, genomic analyses of phage Cif Types. The complete, functional capacity of all Types has WO particles from wasps and moths revealed that several Genome Biol. Evol. 10(2):434–451 doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 447 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Lindsey et al. GBE genes packed in the genome of phage WO particles trees. Importantly, the locus, presumably expressed in the fe- (Bordenstein SR and Bordenstein SR 2016) are in fact retained male insect infected with a compatible Wolbachia,and mech- in prophage WO of wRec (Metcalf et al. 2014), including cifA anism behind rescuing CI are still unknown, as is the exact and cifB. Genes in prophage WO relics are apparently a source mechanism by which all Cif proteins induce CI. Therefore, the of host-manipulation across Wolbachia genomes. recent discovery of these CI genes and their sequence char- Additionally, there is considerable variation in the strength acterization described here pave the way for investigating key of CI across different Wolbachia strains (Veneti et al. 2003). mechanisms of the Wolbachia–host symbiosis. CifA andCifBhave anadditive effect on the strengthof CI (LePage et al. 2017), so it is possible that the level of cif ex- Supplementary Material pression, or the ratio of cifA and cifB transcripts across devel- Supplementary data are available at Genome Biology and opment, are ways in which CI strength is adjusted. The rapidly Evolution online. evolving nature of the Cif proteins may affect other ways in which they function in the host. For example, in Type I CifB proteins that have the essential deubiquitylase residue, other Acknowledgments sequence variation may affect the ability to bind with CifA, We thank J. Dylan Shropshire, Jeremy Brownlie, and anony- locations of posttranslational modifications, or the ability to be mous reviewers for feedback on earlier drafts of the manu- efficiently localized to the host nucleus. Additionally, the level script. This work was supported by the National Science of CI is often host-dependent (Bordenstein and Werren 1998; Foundation (DEB 1501227 to A.R.I.L., IOS 1456545 to McGraw et al. 2001), possibly a result of how well Wolbachia I.L.G.N., and IOS 1456778 to S.R.B.); the United States replicate in the host, and/or the specificity of Cif proteins with Department of Agriculture (NIFA 2016-67011-24778 to the host target, which is currently unknown. There are also A.R.I.L.); the National Institutes of Health (R21 HD086833 environmental conditions that affect the strength of CI, and and R01 AI132581 to S.R.B.); the Vanderbilt Microbiome they likely do so by affecting Wolbachia titers and resulting Cif Initiative; and Robert and Peggy van den Bosch Memorial expression in the host (Clancy and Hoffmann 1998; Yamada Scholarships to A.R.I.L. et al. 2007). On the basis of our analyses, we propose three avenues of research on the function of the Cif proteins. First, functional Literature Cited confirmation of the newly annotated modules will be impor- Abascal F, Zardoya R, Posada D. 2005. ProtTest: selection of best-fit mod- tant in understanding how these genes function enzymati- els of protein evolution. Bioinformatics 21(9):2104–2105. Archuleta TL. 2011. The Chlamydia effector chlamydial outer protein N cally. In total, we predict six modules in the Cif protein (CopN) sequesters tubulin and prevents microtubule assembly. J Biol sequence homologs, with varying degrees of confidence (sup- Chem. 286(39):33992–33998. plementary tables S1 and S2, Supplementary Material online). Baldo L, Bordenstein S, Wernegreen JJ, Werren JH. 2006. Widespread For some of these modules, straightforward experiments can recombination throughout Wolbachia genomes. MolBiolEvol. be designed in model systems (such as Saccharomyces)to 23:437–449. Baldo L, Dunning Hotopp JC, et al. 2006. Multilocus sequence typing determine whether their predicted function is correct, as system for the endosymbiont Wolbachia pipientis. Appl Environ has been done for the deubiquitylase domain of CidB Microbiol. 72(11):7098–7110. (Beckmann et al. 2017) and countless other bacterial effectors Baldo L, Lo N, Werren JH. 2005. Mosaic nature of the Wolbachia surface (Kramer et al. 2007; Siggers and Lesser 2008; Archuleta protein. J Bacteriol. 187(15):5406–5418. 2011). The necessity and importance of these modules to Bandi C, Anderson TJC, Genchi C, Blaxter ML. 1998. Phylogeny of Wolbachia in filarial nematodes. Proc R Soc Lond B. the CI phenotype can be assessed in the Drosophila model, 265(1413):2407–2413. where the induction of the phenotype and rescue is straight- Beckmann JF, Fallon AM. 2013. Detection of the Wolbachia protein forward (LePage et al. 2017). Second, detailed characteriza- WPIP0282 in mosquito spermathecae: implications for cytoplasmic tion of cif gene regulation will be important for understanding incompatibility. Insect Biochem Mol Biol. 43(9):867–878. CI expression and penetrance, thus informing vector control Beckmann JF, Ronau JA, Hochstrasser M. 2017. A Wolbachia deubiquity- lating enzyme induces cytoplasmic incompatibility. Nat Microbiol. programs that rely on proper expression of the CI phenotype, 2:17007. often in a transfected host. Finally, we suggest that although Bian G, et al. 2010. The endosymbiotic bacterium Wolbachia induces re- the discovery of these genes is fundamental, it is clear from sistance to dengue virus in Aedes aegypti. PLoS Pathog. this analysis that we have not comprehensively evaluated or 6(4):e1000833. identified the mechanisms behind CI and other reproductive Bogdanowicz D, Giaro K. 2013. On a matching distance between rooted phylogenetic trees. Int J Appl Math Comp Sci. 23:669–684. manipulations. The gene characterization analyses described Bogdanowicz D, Giaro K, Wrobel B. 2012. TreeCmp: comparison of trees here reveal new and relevant annotations, but with many in polynomial time. Evol Bioinform Online. 8:EBO.S9657. regions of unknown function across all of the phylogenetic Bordenstein SR, Bordenstein SR. 2016. Eukaryotic association module in Types, missing deubiquitylase domains in particular CI strains, phage WO genomes from Wolbachia.Nat Commun.7.doi:10.1038/ and a coevolving, phylogenetic relationship across the Cif ncomms13155. 448 Genome Biol. Evol. 10(2):434–451 doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Evolutionary Genetics of cifA and cifB GBE Bordenstein SR, Marshall ML, Fry AJ, Kim U, Wernegreen JJ. 2006. The Gottlieb Y, Zchori-Fein E, Werren JH, Karr TL. 2002. Diploidy restoration in tripartite associations between bacteriophage, Wolbachia, and arthro- Wolbachia-infected Muscidifurax uniraptor (Hymenoptera: pteromali- pods. PLoS Pathog. 2(5):e43. dae). J Invertebr Pathol. 81(3):166–174. Bordenstein SR, O’Hara FP, Werren JH. 2001. Wolbachia-induced incom- Gutzwiller F, et al. 2015. Dynamics of Wolbachia pipientis gene expression patibility precedes other hybrid incompatibilities in Nasonia.Nature across the Drosophila melanogaster life cycle. G3 5:2843–2856. 409(6821):707–710. Hamm CA, et al. 2014. Wolbachia do not live by reproductive manipula- Bordenstein SR, Wernegreen JJ. 2004. Bacteriophage flux in endosym- tion alone: infection polymorphism in Drosophila suzukii and D. sub- bionts (Wolbachia): infection frequency, lateral transfer, and recombi- pulchrella. Mol Ecol. 23(19):4871–4885. nation rates. Mol Biol Evol. 21(10):1981–1991. Hilgenboecker K, Hammerstein P, Schlattmann P, Telschow A, Werren JH. Bordenstein SR, Werren JH. 1998. Effects of A and B Wolbachia and host 2008. How many species are infected with Wolbachia? - A statistical genotype on interspecies cytoplasmic incompatibility in Nasonia. analysis of current data. FEMS Microbiol Lett. 281(2):215–220. Genetics 148:1833–1844. Hoerauf A, et al. 1999. Tetracycline therapy targets intracellular bacteria in Bourtzis K, et al. 2014. Harnessing mosquito–Wolbachia symbiosis for the filarial nematode Litomosoides sigmodontis and results in filarial vector and disease control. Acta Trop. 132:S150–S163. infertility. J Clin Invest. 103(1):11–18. Brooks AW, Kohl KD, Brucker RM, van Opstal EJ, Bordenstein SR. 2016. Hoffmann AA, et al. 2011. Successful establishment of Wolbachia in Phylosymbiosis: relationships and functional effects of microbial com- Aedes populations to suppress dengue transmission. Nature munities across host evolutionary history. PLoS Biol. 14(11):e2000225. 476(7361):454–U107. Brucker RM, Bordenstein SR. 2013. The hologenomic basis of speciation: Hornett EA, et al. 2008. You can’t keep a good parasite down: evolution gut bacteria cause hybrid lethality in the genus Nasonia.Science of a male-killer suppressor uncovers cytoplasmic incompatibility. 341(6146):667–669. Evolution 62(5):1258–1263. Caporaso JG, et al. 2010. QIIME allows analysis of high-throughput com- Hosokawa T, Koga R, Kikuchi Y, Meng XY, Fukatsu T. 2010. Wolbachia as munity sequencing data. Nat Methods. 7(5):335. a bacteriocyte-associated nutritional mutualist. Proc Natl Acad Sci U S Capra JA, Singh M. 2007. Predicting functionally important residues from A. 107(2):769–774. sequence conservation. Bioinformatics 23(15):1875–1882. http://python.org. Python Language Reference, version 2.7. Cattel J, et al. 2016. Wolbachia in European populations of the invasive Hughes GL, et al. 2011. Wolbachia infections are virulent and inhibit the pest Drosophila suzukii: regional variation in infection frequencies. human malaria parasite Plasmodium falciparum in Anopheles gam- PLoS One 11(1):e0147766. biae. PLoS Pathog. 7(5):e1002043. Chafee ME, Funk DJ, Harrison RG, Bordenstein SR. 2010. Lateral phage Hurst GD, et al. 1999. Male–killing Wolbachia in two species of insect. Proc transfer in obligate intracellular bacteria (Wolbachia): verification from R Soc Lond B. 266(1420):735–740. natural populations. Mol Biol Evol. 27(3):501–505. Hurvich CM, Tsai CL. 1993. A corrected Akaike information criterion for Chelikani P, Fita I, Loewen PC. 2004. Diversity of structures and properties vector autoregressive model selection. J Time Ser Anal. 14(3):271–279. among catalases. Cell Mol Life Sci. 61(2):192–208. Ishmael N, et al. 2009. Extensive genomic diversity of closely related Clancy DJ, Hoffmann AA. 1998. Environmental effects on cytoplasmic Wolbachia strains. Microbiology 155(Pt 7):2211–2222. incompatibility and bacterial load in Wolbachia-infected Drosophila Jaenike J, Dyer KA, Cornish C, Minhas MS. 2006. Asymmetrical reinforce- simulans. Entomol Exp Appl. 86(1):13–24. ment and Wolbachia infection in Drosophila. PLoS Biol 4(10):e325. Conner WR, et al. 2017. Genome comparisons indicate recent transfer of Kambris Z, Cook PE, Phuc HK, Sinkins SP. 2009. Immune activation by life- wRi-like Wolbachia between sister species Drosophila suzukii and D. shortening Wolbachia and reduced filarial competence in mosquitoes. subpulchrella. Ecol Evol 7(22):9391. Science 326(5949):134–136. Conway T, et al. 2014. Unprecedented high-resolution view of bacterial Katoh K, Standley DM. 2013. MAFFT multiple sequence alignment soft- operon architecture revealed by RNA sequencing. MBio ware version 7: improvements in performance and usability. Mol Biol 5(4):e01442–e01414. Evol. 30(4):772–780. Dedeine F, et al. 2001. Removing symbiotic Wolbachia bacteria specifically Kearse M, et al. 2012. Geneious Basic: an integrated and extendable inhibits oogenesis in a parasitic wasp. Proc Natl Acad Sci U S A. desktop software platform for the organization and analysis of se- 98(11):6247–6252. quence data. Bioinformatics 28(12):1647–1649. Duron O, Fort P, Weill M. 2006. Hypervariable prophage WO sequences Kent BN, Funkhouser LJ, Setia S, Bordenstein SR. 2011. Evolutionary ge- describe an unexpected high number of Wolbachia variants in the nomics of a temperate bacteriophage in an obligate intracellular bac- mosquito Culex pipiens. Proc R Soc Lond B. 273(1585):495–502. teria (Wolbachia). PLoS One 6:e24984. Eddy SR. 2011. Accelerated profile HMM searches. PLoS Comput Biol. Kent BN, Salichos L, et al. 2011. Complete bacteriophage transfer in a 7(10):e1002195. bacterial endosymbiont (Wolbachia) determined by targeted genome Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accu- capture. Genome Biol Evol. 3(0):209–218. racy and high throughput. Nucleic Acids Res. 32(5):1792–1797. Kosinski J, Feder M, Bujnicki JM. 2005. The PD-(D/E) XK superfamily revis- Ellegaard KM, Klasson L, Naslund K, Bourtzis K, Andersson SGE. 2013. ited: identification of new members among proteins involved in DNA Comparative genomics of Wolbachia and the bacterial species con- metabolism and functional predictions for domains of (hitherto) un- cept. PLoS Genet 9(4):e1003381. known function. BMC Bioinformatics 6:172. Felsenstein J. 1989. PHYLIP-phylogeny inference package (version 3.2). Kramer RW, et al. 2007. Yeast functional genomic screens lead to iden- Cladistics 5:6. tification of a role for a bacterial effector in innate immunity regula- Gautheret D, Lambert A. 2001. Direct RNA motif definition and identifi- tion. PLoS Pathog. 3(2):e21. cation from multiple sequence alignments using secondary structure Landmann F, Orsi GA, Loppin B, Sullivan W, Schneider DS. 2009. profiles. J Mol Biol. 313(5):1003–1011. Wolbachia-mediated cytoplasmic incompatibility is associated with im- paired histone deposition in the male pronucleus. PLoS Pathog. Gavotte L, et al. 2007. A survey of the bacteriophage WO in the endo- symbiotic bacteria Wolbachia. Mol Biol Evol. 24(2):427–435. 5(3):e1000343. Gerth M, Bleidorn C. 2016. Comparative genomics provides a timeframe Lassy CW, Karr TL. 1996. Cytological analysis of fertilization and early for Wolbachia evolution and exposes a recent biotin synthesis operon embryonic development in incompatible crosses of Drosophila simu- transfer. Nat Microbiol. 2:16241. lans. Mech Dev. 57(1):47–58. Genome Biol. Evol. 10(2):434–451 doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 449 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Lindsey et al. GBE LePage D, Bordenstein SR. 2013. Wolbachia: can we save lives with a great Robinson DF, Foulds LR. 1981. Comparison of phylogenetic trees. Math pandemic? Trends Parasitol. 29(8):385–393. Biosci. 53(1–2):131–147. LePage DP, et al. 2017. Prophage WO genes recapitulate and enhance Ronquist F, et al. 2012. MrBayes 3.2: efficient Bayesian phylogenetic in- Wolbachia-induced cytoplasmic incompatibility. Nature ference and model choice across a large model space. Syst Biol. 543(7644):243–247. 61(3):539–542. Li B, Dewey CN. 2011. RSEM: accurate transcript quantification from RNA- Ros VID, Fleming VM, Feil EJ, Breeuwer JAJ. 2009. How diverse is the Seq data with or without a reference genome. BMC Bioinformatics genus Wolbachia? Multiple-gene sequencing reveals a putatively 12:323. new Wolbachia supergroup recovered from spider mites (Acari: tetra- Lindsey ARI, Werren JH, Richards S, Stouthamer R. 2016. Comparative nychidae). Appl Environ Microbiol. 75(4):1036–1043. genomics of a parthenogenesis-inducing Wolbachia symbiont. G3; Rousset F, Bouchon D, Pintureau B, Juchault P, Solignac M. 1992. 6(7):2113–2123. Wolbachia endosymbionts responsible for various alterations of sexu- Lo N, Casiraghi M, Salati E, Bazzocchi C, Bandi C. 2002. How many ality in arthropods. Proc R Soc Lond B. 250(1328):91–98. Wolbachia supergroups exist? Mol Biol Evol. 19(3):341–346. Schilthuizen M, Stouthamer R. 1997. Horizontal transmission of Macke TJ, et al. 2001. RNAMotif, an RNA secondary structure defi- parthenogenesis-inducing microbes in Trichogramma wasps. Proc R nition and search algorithm. Nucleic Acids Res. Soc Lond B. 264(1380):361–366. 29(22):4724–4735. scikit-bio.org. scikit-bio: core bioinformatics data structures and algorithms Mantel N. 1967. The detection of disease clustering and a generalized in Python. caporasolab.us: Northern Arizona University. regression approach. Cancer Res. 27:209–220. Serbus LR, Casper-Lindley C, Landmann F, Sullivan W. 2008. The genetics Masui S, Kamoda S, Sasaki T, Ishikawa H. 2000. Distribution and evolution and cell biology of Wolbachia-host interactions. Annu Rev Genet. of bacteriophage WO in Wolbachia, the endosymbiont causing sexual 42:683–707. alterations in arthropods. J Mol Evol. 51(5):491–497. Shropshire JD, Bordenstein SR. 2016. Speciation by symbiosis: the micro- McClure R, et al. 2013. Computational analysis of bacterial RNA-Seq biome and behavior. MBio 7(2):e01785-15. data. Nucleic Acids Res. 41(14): e140, https://doi.org/10.1093/nar/ Siggers KA, Lesser CF. 2008. The yeast Saccharomyces cerevisiae:a versa- gkt444. tile model system for the identification and characterization of bacte- McGraw E, Merritt D, Droller J, O’Neill S. 2001. Wolbachia-mediated rial virulence proteins. Cell Host Microbe 4(1):8–15. sperm modification is dependent on the host genotype in Sinkins SP, et al. 2005. Wolbachia variability and host effects on crossing Drosophila. Proc R Soc Lond B. 268(1485):2565–2570. type in Culex mosquitoes. Nature 436(7048):257–260. Metcalf JA, Jo M, Bordenstein SR, Jaenike J, Bordenstein SR. 2014. Recent So ¨ ding J, Biegert A, Lupas AN. 2005. The HHpred interactive server for genome reduction of Wolbachia in Drosophila recens targets phage protein homology detection and structure prediction. Nucleic Acids WO and narrows candidates for reproductive parasitism. PeerJ 2:e529. Res. 33:W244–W248. Miller WJ, Ehrman L, Schneider D. 2010. Infectious speciation revisited: Stamatakis A. 2014. RAxML version 8: a tool for phylogenetic analysis and impact of symbiont-depletion on female fitness and mating behavior post-analysis of large phylogenies. Bioinformatics 30(9):1312–1313. of Drosophila paulistorum. PLoS Pathog. 6(12):e1001214. Stouthamer R, Breeuwer JAJ, Luck RF, Werren JH. 1993. Molecular-iden- Min K-T, Benzer S. 1997. Wolbachia, normally a symbiont of Drosophila, tification of microorganisms associated with parthenogenesis. Nature can be virulent, causing degeneration and early death. Proc Natl Acad 361(6407):66–68. Sci U S A. 94(20):10792–10796. Stouthamer R, Kazmer DJ. 1994. Cytogenetics of microbe-associated par- Moreau J, Bertin A, Caubet Y, Rigaud T. 2001. Sexual selection in an thenogenesis and its consequences for gene flow in Trichogramma isopod with Wolbachia-induced sex reversal: males prefer real females. wasps. Heredity 73(3):317–327. J Evol Biol. 14(3):388–394. Stouthamer R, Luck R. 1993. Influence of microbe-associated partheno- Moreira LA, et al. 2009. A Wolbachia symbiont in Aedes aegypti limits genesis on the fecundity of Trichogramma deion and T. pretiosum. infection with dengue, chikungunya, and Plasmodium.Cell Entomol Exp Appl. 67(2):183–192. 139(7):1268–1278. Stouthamer R, Luck RF, Hamilton WD. 1990. Antibiotics cause partheno- Newton IL, et al. 2016. Comparative genomics of two closely related genetic Trichogramma (Hymenoptera, Trichogrammatidae) to revert Wolbachia with different reproductive effects on hosts. Genome Biol to sex. Proc Natl Acad Sci U S A. 87(7):2424–2427. Evol. 8(5):1526–1542. Sutton E, Harris S, Parkhill J, Sinkins S. 2014. Comparative genome analysis Noda H, Koizumi Y, Zhang Q, Deng K. 2001. Infection density of of Wolbachia strain wAu. BMC Genomics 15:928. Wolbachia and incompatibility level in two planthopper species, Teixeira L, Ferreira A, Ashburner M, Keller L. 2008. The bacterial symbiont Laodelphax striatellus and Sogatella furcifera.Insect Biochem Mol Wolbachia induces resistance to RNA viral infections in Drosophila Biol. 31(6–7):727–737. melanogaster. PLoS Biol. 6(12):e1000002. O’Neill SL, Giordano R, Colbert AME, Karr TL, Robertson HM. 1992. 16S Tram U, Sullivan W. 2002. Role of delayed nuclear envelope breakdown ribosomal-RNA phylogenetic analysis of the bacterial endosymbionts and mitosis in Wolbachia-induced cytoplasmic incompatibility. Science associated with cytoplasmic incompatibility in insects. Proc Natl Acad 296(5570):1124–1126. Sci U S A. 89:2699–2702. Turelli M, Hoffmann AA. 1991. Rapid spread of an inherited incompati- Penz T, et al. 2012. Comparative genomics suggests an independent origin bility factor in California Drosophila. Nature 353(6343):440–442. of cytoplasmic incompatibility in Cardinium hertigii. PLoS Genet. Veneti Z, et al. 2003. Cytoplasmic incompatibility and sperm cyst infection 8(10):e1003012. in different Drosophila-Wolbachia associations. Genetics Perez F, Granger BE. 2007. IPython: a system for interactive scientific com- 164(2):545–552. puting. Comput Sci Eng. 9, doi: 10.1109/MCSE.2007.53. Walker T, et al. 2011. The wMel Wolbachia strain blocks dengue and Quenault T, Lithgow T, Traven A. 2011. PUF proteins: repression, activa- invades caged Aedes aegypti populations. Nature 476(7361):450–U101. tion and mRNA localization. Trends Cell Biol. 21(2):104–112. Randerson JP, Jiggins FM, Hurst LD. 2000. Male killing can select for male Wallau GL, da Rosa MT, De Re FC, Loreto EL. 2016. Wolbachia from mate choice: a novel solution to the paradox of the lek. Proc R Soc Drosophila incompta: just a hitchhiker shared by Drosophila in the Lond B. 267(1446):867–874. New and Old World? Insect Mol Biol. 25(4):487–499. 450 Genome Biol. Evol. 10(2):434–451 doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Evolutionary Genetics of cifA and cifB GBE Werren JH, Baldo L, Clark ME. 2008. Wolbachia: master manipulators of Yen JH, Barr AR. 1971. New hypothesis of the cause of cytoplasmic in- invertebrate biology. Nat Rev Microbiol. 6(10):741–751. compatibility in Culex pipiens.Nature Werren JH, Windsor DM. 2000. Wolbachia infection frequencies in insects: Zabalou S, et al. 2004. Wolbachia-induced cytoplasmic incompatibility as a evidence of a global equilibrium? Proc R Soc Lond B. means for insect pest population control. Proc Natl Acad Sci U S A. 267(1450):1277–1285. 101(42):15042–15045. Werren JH, Zhang W, Guo LR. 1995. Evolution and phylogeny of Zug R, Hammerstein P, Cordaux R. 2012. Still a host of hosts for Wolbachia—reproductive parasites of arthropods. Proc R Soc Lond Wolbachia: analysis of recent data suggests that 40% of terrestrial B. 261(1360):55–63. arthropod species are infected. PLoS One 7(6):e38544. Yamada R, Floate KD, Riegler M, O’Neill SL. 2007. Male development time influences the strength of Wolbachia-induced cytoplasmic incompati- bility expression in Drosophila melanogaster. Genetics 177(2):801–808. Associate editor: Daniel Sloan Genome Biol. Evol. 10(2):434–451 doi:10.1093/gbe/evy012 Advance Access publication January 17, 2018 451 Downloaded from https://academic.oup.com/gbe/article-abstract/10/2/434/4815755 by Ed 'DeepDyve' Gillespie user on 16 March 2018

Journal

Genome Biology and EvolutionOxford University Press

Published: Feb 1, 2018

There are no references for this article.

You’re reading a free preview. Subscribe to read the entire article.


DeepDyve is your
personal research library

It’s your single place to instantly
discover and read the research
that matters to you.

Enjoy affordable access to
over 12 million articles from more than
10,000 peer-reviewed journals.

All for just $49/month

Explore the DeepDyve Library

Unlimited reading

Read as many articles as you need. Full articles with original layout, charts and figures. Read online, from anywhere.

Stay up to date

Keep up with your field with Personalized Recommendations and Follow Journals to get automatic updates.

Organize your research

It’s easy to organize your research with our built-in tools.

Your journals are on DeepDyve

Read from thousands of the leading scholarly journals from SpringerNature, Elsevier, Wiley-Blackwell, Oxford University Press and more.

All the latest content is available, no embargo periods.

See the journals in your area

Monthly Plan

  • Read unlimited articles
  • Personalized recommendations
  • No expiration
  • Print 20 pages per month
  • 20% off on PDF purchases
  • Organize your research
  • Get updates on your journals and topic searches

$49/month

Start Free Trial

14-day Free Trial

Best Deal — 39% off

Annual Plan

  • All the features of the Professional Plan, but for 39% off!
  • Billed annually
  • No expiration
  • For the normal price of 10 articles elsewhere, you get one full year of unlimited access to articles.

$588

$360/year

billed annually
Start Free Trial

14-day Free Trial