Genome sequencing and comparative genomics reveal the potential pathogenic mechanism of Cercospora sojina Hara on soybean

X. Luo et al.

Frogeye leaf spot, caused by Cercospora sojina Hara, is a common disease of soybean in most soybean-growing countries of the world. In this study, we report a high-quality genome sequence of C. sojina obtained by the Single Molecule Real-Time sequencing method. The 40.8-Mb genome encodes 11,655 predicted genes, of which 8,474 are supported by RNA sequencing. The Cercospora sojina genome contains large numbers of gene clusters that are involved in the synthesis of secondary metabolites, including mycotoxins and pigments. However, far fewer genes encoding carbohydrate-binding module proteins are identified in the C. sojina genome than in other phytopathogenic fungi. Bioinformatics analysis reveals that C. sojina harbours about 752 secreted proteins, and 233 of them are effectors. During early infection, the genes for metabolite biosynthesis and effectors are significantly enriched, suggesting that they may play essential roles in pathogenicity. We further identify 13 effectors that can inhibit BAX-induced cell death. Taken together, our results provide insights into the infection mechanisms of C. sojina on soybean.

Key words: Cercospora sojina, soybean, genome, pathogenicity

© The Author 2017. Published by Oxford University Press on behalf of Kazusa DNA Research Institute. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com

Downloaded from https://academic.oup.com/dnaresearch/article-abstract/25/1/25/4161430 by Ed 'DeepDyve' Gillespie user on 16 March 2018

1. Introduction

The causal agent of frogeye leaf spot (FLS), Cercospora sojina Hara, is a worldwide destructive pathogen on soybean. It was first reported in Japan in 1915. Since then, the disease has been reported in many soybean-growing countries, such as the USA, China, and Argentina. The main measures to control this disease are to grow resistant soybean varieties or to apply chemical fungicides, which usually lose their effect rapidly in the field due to race differentiation and gene mutations of the pathogen. FLS causes about 10–60% yield loss in soybean-growing regions, such as Argentina and Nigeria, and it has been reported to be the most expensive disease in the history of soybean production in Argentina.

Taxonomically, C. sojina belongs to the order Capnodiales in the class Dothideomycetes. Currently, 22 races of C. sojina are known in Brazil and 12 races in the USA. Later, using 93 isolates of C. sojina and 38 putative soybean differentials, Mian et al. proposed a core set of 11 races that represent the major diversity of the 93 isolates in the USA. We previously reported the 14 races in north China; of the identified races of C. sojina, race 1 has an occurrence frequency of 43.5%, making it the dominant race, and it causes yield loss of up to 38% in the field.

Despite the importance of C. sojina, the infection mechanism and the genetic information of this pathogen are not known. For example, most species in the Cercospora genus can produce a non-specific coloured mycotoxin, cercosporin, which is indispensable for their pathogenicity, but C. sojina may not produce cercosporin. Does the genome harbour the cercosporin biosynthesis genes? Nowadays, the main strategy to unravel the infection mechanisms of pathogens is to obtain their genome information. Magnaporthe grisea, one of the best studied fungi, was sequenced at genome level in 2005. The genome annotation shows that the pathogen may carry over 700 secreted proteins, and most of them are believed to be virulence effectors. Later, several effectors were implicated in suppressing immune responses in rice. Other genes, such as MoEnd3, MoSwi6, and MoHYR1, have been demonstrated to be essential for appressorium formation, melanin accumulation, and reactive oxygen species (ROS) scavenging. Similarly, the availability of the genome sequence and gene annotation has helped substantially to uncover the infection mechanism of Verticillium dahliae, a pathogen that shows a distinct infection structure compared with M. oryzae and other fungi.

Although breeding programmes and fungicide application have been very successful in controlling FLS in the last decades, their efficiency has recently been challenged. In fact, Cercospora species may undergo positive selection and rapid evolution. Soares et al. reported that more Cercospora species were able to infect soybean and caused disease symptoms similar to those of C. kikuchii, the closest species to C. sojina. Importantly, they detected interlineage recombination among Cercospora species, along with a high frequency of mutations linked to fungicide resistance. Moreover, it has been observed that C. sojina populations are genetically diverse and likely undergoing sexual reproduction. The above-mentioned reports imply that C. sojina could adapt flexibly to a changing environment.

In this study, we report the 40.8 Mb complete genome sequence of C. sojina race 1. The genome annotation and whole genome transcriptome assays reveal that C. sojina encodes a different set of proteins that distinguishes it from other fungal species in terms of infection strategy and disease development. We demonstrate that the secondary metabolites and effectors play essential roles in the soybean–C. sojina pathosystem. Moreover, the genome information can be further used for comparative genomic studies with other Cercospora species to unravel their evolution and infection mechanism on soybean.

2. Materials and methods

2.1. Fungal growth conditions and DNA preparation

Cercospora sojina (C. sojina) race 1 was isolated from soybean fields in Heilong Jiang province of China. Briefly, a single fungal spore was isolated from a soybean seedling infested by C. sojina race 1. The isolated spores were grown on potato dextrose agar (PDA) medium at 28 °C. Then, the mycelia were removed from the media and ground in liquid nitrogen. Genomic DNA was extracted using a modified cetyltrimethylammonium bromide (CTAB) method.

2.2. Genome sequencing and assembly

The Cercospora sojina race 1 genome was sequenced by the Single Molecule Real-Time (SMRT) method at Biomarker Technologies (Beijing, China). DNA libraries with 270 bp and 10 kb inserts were constructed. The 270-bp library was constructed following Illumina's standard protocol, including fragmentation of genomic DNA, end repair, adaptor ligation and PCR amplification. The 270-bp library was quantified using a 2100 Bioanalyzer (Agilent, USA) and subjected to paired-end 150 bp sequencing on an Illumina HiSeq4000. The sequencing data (filtered reads: 2.99G, sequencing depth: 90×) were used to estimate the genome size, repeat content, and heterozygosity. Then, the 10-kb library was constructed following PacBio's standard methods, including fragmentation of genomic DNA, end repair, adaptor ligation, and template purification. The 10-kb library was quantified using a 2100 Bioanalyzer (Agilent, USA) and sequenced by SMRT, and the sequencing data (filtered reads: 4.92G, sequencing depth: 123×) were assembled by CANU (Version-1.2) with default parameters. Finally, Illumina reads were used for error correction and gap filling with SOAPdenovo GAPCLOSER v1.12.

2.3. Genome annotations

Protein-encoding genes were annotated by a combination of three independent ab initio predictors: GeneMark (Version 4.30), SNAP, and Glimmer. Transcriptome data were incorporated into PASA to improve the quality of the C. sojina annotation. Briefly, the transcriptome assemblies were mapped to the genome using Trinity. PASA alignment assemblies, based on overlapping transcript alignments from Trinity, were then used with EVidenceModeler (EVM) to compute weighted consensus gene structure annotations. Finally, PASA was used to update the EVM consensus predictions.

Functional annotations for all predicted gene models were made using multiple databases, including Swiss-Prot, nr, KEGG, and COG, by blastP with E-values of ≤ 1e-5. Domain-calling analyses of protein-encoding genes were performed using the Pfam database and HMMER. Potential virulence-related proteins were identified by searching against the pathogen–host interaction database (PHI-base) by blastP with E-values of ≤ 1e-5. Blast2GO was used for GO enrichment analysis of genes that belong only to C. sojina and do not have homologues in the non-plant pathogens Aspergillus nidulans and Neurospora crassa.

2.4. Repetitive sequences and whole genome DNA methylation analysis

Tandem repeat sequences were identified using Tandem Repeats Finder (TrF). Transposable elements (TEs) were identified using three software tools: the de novo tool Repeat Modeler (http://repeatmasker.org/RepeatModeler/) and two database-based tools, Repeat Protein Masker (www.repeatmasker.org/cgi-bin/RepeatProteinMaskRequest) and Repeat Masker (www.repeatmasker.org/). All parameters were set to their defaults. Whole-genome DNA modification detection and motif analysis were performed according to Blow et al. using the PacBio SMRT software (version = 2.2.3, www.pacb.com).

2.5. Phylogenetic analysis and synteny analysis

The sequences of fungi were downloaded from the DOE Joint Genome Institute (JGI). A group of consistent phylogenetic 'backbone' genes (phylogenetically conserved) in the fungal genomes was used to construct the phylogenetic tree. Putative 'backbone' genes of the other 11 fungi were identified using stand-alone blast with E-values of ≤ 1e-20, and the 'backbone' genes were concatenated into one sequence. Sequence alignment was done using MUSCLE and the phylogenetic tree was generated by MEGA 7.0 using the UPGMA method. Ustilago maydis was used as the outgroup. Synteny of Cercospora zeae-maydis, C. sojina, and Pseudocercospora fijiensis was analysed using GATA.

2.6. Comparison analysis of carbohydrate-active enzymes and secondary metabolism genes

The proteomes of C. sojina and 14 other above-mentioned fungal species were downloaded from the DOE Joint Genome Institute (JGI). The HMMER 3.0 package was used for homology search. Family-specific HMM profiles were downloaded from the dbCAN database. The executable file hmmscan and the hmmscan-parser script provided by dbCAN were used to generate and extract the search results, respectively.

Putative polyketide synthase (PKS) and non-ribosomal peptide synthase (NRPS) genes were identified using the web-based software SMURF with default settings. The modules of different domains in individual NRPS and PKS proteins were identified by searching the antiSMASH database (antibiotics and Secondary Metabolite Analysis Shell). The core genes were annotated using stand-alone BLAST (E-values ≤ 1e-10) against the Swiss-Prot database.

2.7. Secondary metabolites extraction and quantification

Cercosporin extraction and quantification were performed according to the method described by Shim and Dunkle. The pigments of C. sojina were induced in complete medium (CM) with 20 mmol/l cyclic adenosine monophosphate (cAMP) or by starvation treatment for 4 days at 28 °C. The supernatants were collected and purified on a C18-SPE cartridge. The eluates in the 40% methanol fraction were further purified by HPLC on a C18 preparation column. The grey, light yellow, and dark grey fractions were further identified by reverse phase HPLC with a PDA detector (Shimadzu Corporation, Japan).

2.8. Transcriptome analysis and quantitative RT-PCR

For transcriptome analysis, the fungus was grown in minimal nutrient medium (3 g NaNO3, 1 g K2HPO4, 0.5 g MgSO4·7H2O, 0.5 g KCl, 0.01 g FeSO4, and 30 g sucrose per litre) at 28 °C for 6 days. For starvation treatment, the fungal mycelia were transferred to minimal nutrient medium lacking NaNO3 and then harvested at 24 and 48 h, respectively. Each treatment had three biological replicates. Library preparation and bioinformatics analysis were performed according to the method of Yang et al. One microgram of RNA per sample was subjected to RNA-seq library construction. The RNA-seq libraries were quantified using a 2100 Bioanalyzer (Agilent, USA) and sequenced (paired-end, 100 bp each) on the Illumina genome analyzer (Hiseq 2000; Illumina, USA). Gene expression levels were estimated as fragments per kilobase of transcript per million fragments mapped. Differential expression analysis was performed using the DESeq R package.

For quantitative RT-PCR, infected leaves were collected at the indicated time points. Total RNA was extracted by the Trizol method (Invitrogen). The cDNAs were synthesized using the HiScript II Q RT SuperMix kit with genomic DNA wiper (Vazyme Biotech). Reactions were performed on a CFX96™ Real-time System (Bio-RAD) with the SYBR qPCR Mix (Vazyme Biotech). The 2^-ΔΔCT method was used for calculating the relative gene expression levels. The actin gene of C. sojina was used as the internal control.

2.9. Functional study of putative effectors

The putative effector genes were cloned into the binary vector pMD-1 (T7 tag) driven by the 35S promoter using the ClonExpress II One Step Cloning Kit (Vazyme Biotech). Then, these constructs were transformed into Agrobacterium tumefaciens strain EHA105 by electroporation. Leaves of 4-week-old Nicotiana benthamiana were infiltrated with EHA105 strains harbouring the indicated effectors using needleless syringes. GFP and the Phytophthora sojae effector Avr1b served as the negative control and positive control, respectively. Twenty-four hours later, plants were infiltrated with EHA105 harbouring pVX-BAX (OD = 0.4). Cell death symptoms were evaluated and photographed at 72 h after pVX-BAX infiltration. Results are representative of six biological replicates.

3. Results

3.1. Assembly of C. sojina genome

Like most fungi, C. sojina shows a similar infection cycle, but it also demonstrates some distinctions (Supplementary Fig. S1). It does not form an appressorium, but infects the plants by branched hyphae through open stomata (Supplementary Fig. S1C and D). Compared with other hemibiotrophic fungi, FLS disease development is relatively slow (Supplementary Fig. S1C). In order to investigate the infection mechanism, we therefore extracted the genomic DNA from the mycelia and sequenced the pathogen at genome level.

The genome of C. sojina race 1 was assembled from the data generated by a recently developed Single Molecule Real-Time (SMRT) sequencing technique, an effective method for decoding important but difficult-to-detect regions, such as non-coding regions and repetitive elements, which can assist in obtaining a gapless eukaryotic genome sequence.

Subread distribution analyses confirm the high quality of the 10-kb library (Supplementary Fig. S2 and Table S1). The sequencing data (4,920,479,113 bp clean reads) were de novo assembled using CANU, leading to the generation of 62 contigs, with an N50 length of 1.59 Mb and a total assembly size of around 40.84 Mb (Table 1 and Supplementary Table S2). The 24 largest scaffolds are displayed by circos-plot (Fig. 1). A total of 11,655 protein-coding genes are predicted, giving a gene density of 285 genes per 1 Mb. In addition, 277 tRNA genes and 281 pseudogenes are predicted in the genome. Notably, 8,474 putative protein-coding genes were supported by the RNA-seq data. It is also worth noting that the genome size of C. sojina race 1 is much larger than that of C. sojina isolate S9, whose genome was assembled from a 124 bp library and whose genome size was estimated at around 30.8 Mb.

Table 1. Genome features of Cercospora sojina

Features                          C. sojina
Size (bp)                         40,835,411
Coverage                          120×
(G+C) percentage (%)              53.12
N50 (bp)                          1,594,385
Protein-coding genes              11,655
Average gene length (bp)          1,441
Gene density (no. gene per Mb)    285
tRNA genes                        277
Pseudogene                        281

Figure 1. Circos-plot of C. sojina. The largest 24 scaffolds of C. sojina are displayed by circos-plot (Mb scale). The circos from outside to inside are: (a) 24 largest scaffolds; (b) DNA methylations; (c) GC content; (d) carbohydrate enzymes; (e) putative effectors; (f) PHI-base genes; (g) duplicated genes. The DNA methylations and GC contents are statistical results of 20 kb non-overlapping windows. The inner lines link duplicated genes.

3.2. Repetitive elements and potential methylation sites

Repetitive DNA sequences and TEs play important roles in the evolution, the genome structure, and gene functions of fungi. A total of 11,138,239 bp (11 Mb) of repeat sequences were identified in the C. sojina genome, including DNA transposons, LTR retrotransposons, tandem repeat sequences and other unclassified transposons (Fig. 2A). The repeat sequences account for 25.56% of the genome. Interestingly, the majority of repetitive sequences (96.36%) are TEs, whereas the tandem repeat sequences account for just 0.93%. Notably, DNA transposons and LTR retrotransposons account for 28 and 25% of all TEs, respectively.

DNA methylation is involved in many important cell processes, such as genomic imprinting and gene transcription regulation. Although DNA methylation has been known in higher plants and animals for years, it has only recently been reported in some fungi. Using SMRT, we were able to detect m6A and m4C methylation in particular. In total, 1,015,733 m4C (4-methyl-cytosine) and 17,409 m6A (6-methyl-adenosine) sites were identified in the C. sojina genome (Fig. 2B). However, the majority of methylation sites (8,453,041) were uncategorized. Interestingly, most of the categorized DNA methylations are m4C, accounting for 98.3%, whereas m6A accounts for only 1.7%. Consistent with the identified methylation sites, we detect multiple motifs that may be specifically recognized by transmethylases (Supplementary Table S3). However, compared with m6A, m4C DNA methylations occur with low frequency in the regions of repetitive elements (Fig. 2C–E).

3.3. Comparative genomic analysis

The evolutionary relationship of C. sojina and other fungal species was analysed using a group of phylogenetic backbone genes of the fungi. Phylogenetic analysis reveals that C. sojina is evolutionarily close to Cercospora zeae-maydis, a plant pathogen that can cause leaf spot disease on maize (Fig. 3A). In addition, C. sojina is also close to three other Dothideomycetes pathogens, Pseudocercospora fijiensis, Sphaerulina musiva, and Dothistroma septosporum (Fig. 3A). The C. sojina homologous proteins show an average identity of 79.9, 65.9, 67.5, and 65.8% with those of C. zeae-maydis, P. fijiensis, S. musiva, and D. septosporum, respectively. Although C. sojina is evolutionarily distant from the non-plant pathogens Aspergillus nidulans and Neurospora crassa, 47.18 and 43.07% of the proteins of C. sojina have homologues in A.
nidulans and N. crassa, respectively. Therefore, we examined the potential gene family expansions in C. sojina. The results show that 5,652 genes exist exclusively in the C. sojina genome but not in A. nidulans and N. crassa. However, nearly 70% of these genes cannot be annotated from current databases (Supplementary Fig. S3A). GO enrichment analysis of the 1,675 annotated genes reveals that most of the genes are involved in metabolic process, biosynthetic process, and response to stresses or stimuli (Supplementary Fig. S3B). Notably, these genes are predicted to have binding function (ion binding or protein binding), hydrolase activity, transferase activity or oxidoreductase activity (Supplementary Fig. S3C). As C. sojina is evolutionarily distant from the non-plant pathogens, we speculate that gene family expansion, a common event in the evolution of phytopathogenic fungi, might have occurred in the C. sojina genome, eventually making C. sojina a plant pathogen.

Figure 2. Repeat elements and DNA methylation sites of C. sojina. (A) The percentage of different types of repetitive sequences in the C. sojina genome. (B) Statistical analysis of candidate DNA methylation sites from the primary sequence of the C. sojina genome. m4C, m6A, and the unidentified represent 4-methyl-cytosine, 6-methyl-adenosine, and the unidentified methylation sites, respectively. (C) Distribution of repetitive elements and different types of DNA methylations in scaffold 1 of C. sojina. Black histogram indicates the distribution of repetitive elements. All data are statistical results of 20 kb windows. Asterisks indicate regions with high and low frequency of DNA methylations, respectively. (D, E) The number of different types of methylations per 20 kb was calculated in the total genome and in repetitive elements of C. sojina.

In addition, synteny analysis of the C. sojina genome with three other genomes of Dothideomycetes spp., C. zeae-maydis, P. fijiensis, and Mycosphaerella graminicola, reveals that the C. sojina genome displays different synteny with those fungi (Fig. 3B and Supplementary Fig. S4). Of all the sequenced genomes, C. zeae-maydis shows the highest synteny with C. sojina. For example, scaffolds 2, 4, and 5 of C. zeae-maydis correspond well with scaffold 1 of C. sojina, and scaffolds 3 and 10 show good synteny with scaffold 3 of C. sojina (Supplementary Fig. S4A). Importantly, we observed that the 24 largest scaffolds, which account for 91.2% of the C. sojina genome, show very high synteny with the 13 core chromosomes of M. graminicola, but not with the remaining 8 dispersed chromosomes (Supplementary Fig. S4C), indicating that C. sojina shares the conserved and core genes of Dothideomycetes.

Figure 3. Phylogenetic and synteny analysis of C. sojina with other fungal species. (A) Phylogenetic tree of different fungal species. The UPGMA phylogenetic tree was constructed based on the consistent phylogenetic backbone genes of the fungi. The number represents branch lengths. (B) Synteny of Cercospora zeae-maydis, C. sojina, and Pseudocercospora fijiensis. Rectangle boxes represent order of gene models. Non-coding regions are not depicted.

3.4. The secretome and potential effectors

The genome of C. sojina contains 11,655 protein-coding genes, covering approximately 41% of the genome sequence (Table 1). Among them, a total of 9,506 genes were annotated using multiple databases (Supplementary Fig. S5 and Table S4). Domain calling analysis reveals that more than 7,500 types of domains exist in the C. sojina proteome. The arsenal of potentially secreted proteins was predicted: proteins containing a signal peptide but lacking a transmembrane domain and a glycosylphosphatidylinositol (GPI) modification site were considered secreted proteins. A combination of software tools for the prediction of transmembrane domains, signal peptide motifs, and GPI modification sites indicates that C. sojina has a similar number of potential secreted proteins (750) to other fungal species.

Pathogen-secreted effectors play critical roles in facilitating the proliferation of pathogens, often by suppressing the plant immune system. In total, 233 proteins are predicted as putative small (≤ 400 amino acids) cysteine-rich (≥ 4 cysteine residues) proteins. Through domain calling analysis, 205 functional motifs and domains were found in 141 putative effectors, including 60 effectors with multiple domains. Notably, the most abundant domain is PF14295.4 (n = 6), which mediates protein–protein interactions. Other common domains include the abhydrolase domain (PF12697.5, n = 5), the Hydrolase domain (PF12146.6, n = 5), and the PAN domain (PF00024.24, n = 5).

3.5. Up-regulation of pathogenicity-related genes by whole genome transcription assays

Because the infection progress of C. sojina on soybean is very slow, it is difficult to collect enough samples to examine the gene expression of the in planta hyphae. However, starvation treatments could mimic the physiology of the pathogen during infection. Therefore, we used mycelia that were grown in nutrient-limited culture for 24 and 48 h, and performed transcriptome analysis by RNA sequencing; 3,227 and 3,223 differentially expressed genes (DEGs) were identified at 24 and 48 h post-starvation treatment (hpt), respectively (Supplementary Fig. S6). A total of 4,051 DEGs were identified during starvation treatment, and 2,399 genes were differentially expressed at both 24 and 48 hpt. Of all the DEGs, 1,530 and 1,508 genes were upregulated, while 1,697 and 1,715 genes were downregulated at 24 and 48 hpt, respectively (Supplementary Fig. S6).

Notably, four classes of DEGs drew our attention. These genes are annotated as being involved in PHI, the secretome, putative carbohydrate-active enzymes (CAZymes), and secondary metabolic processes. First, 1,036 PHI genes are differentially expressed after starvation treatment (Supplementary Fig. S7A). A total of 591 PHI genes are significantly upregulated, demonstrating the important roles of these genes in responding to stimulus. Second, 260 secreted protein-coding genes, including 81 effectors, are differentially expressed (Supplementary Fig. S7B). Expression of 62.5% (50/80) of the effector-coding genes is significantly upregulated, and some effectors with conserved domains, such as the Wall Stress-responsive Component domain, glycoside hydrolase, fungal hydrophobin, cutinase, leucine rich repeat or peptidase domains, may play critical roles in fungal pathogenicity (Supplementary Table S5). Third, 198 CAZymes were differentially expressed. Interestingly, almost half of them are glycoside hydrolases (87/198), implying their essential roles in early infection (Supplementary Fig. S7C). Further, the secondary metabolism-related genes, including 5 PKS and 16 NRPS/NRPS-like genes, are significantly upregulated (Supplementary Fig. S7D). These genes are likely involved in mycotoxin biosynthesis in fungi. Therefore, the increased expression of PKS and NRPS/NRPS-like genes suggests that these genes may play essential roles in mycotoxin biosynthesis (Supplementary Table S6).

3.6. Gene clusters for secondary metabolites

The Cercospora sojina genome encodes 16 non-ribosomal peptide synthetases (NRPS), 20 PKS, 18 fatty acid synthases, 3 terpene synthases, 2 geranylgeranyl diphosphate synthases, and 1 terpenoid cyclase (Supplementary Tables S7 and S8). These enzymes are involved in the synthesis of secondary metabolites, including mycotoxins, pigments, and alkaloids. Annotation of secondary metabolite biosynthesis genes shows that C. sojina lacks PKS-NRPS hybrids, PKS-like proteins, and dimethylallyl tryptophan synthases (Supplementary Tables S7 and S8).

In the Cercospora genus, most of the species can produce a non-specific mycotoxin, cercosporin. However, whether C. sojina produces cercosporin has been disputed. Nevertheless, we identified a similar gene cluster with eight cercosporin biosynthesis genes in the C. sojina genome (Fig. 4A). These eight genes display high amino acid sequence similarity to the C. nicotianae cercosporin biosynthesis genes in the same tandem order (Fig. 4A). Furthermore, we observed increased transcription of the eight genes during infection (Fig. 4B). These data imply that C. sojina may produce cercosporin during infection. However, we were unable to detect cercosporin in either cultured mycelium or infected plant tissue using the method that was used in other Cercospora species (Supplementary Fig. S8).

Figure 4. Putative gene clusters for cercosporin biosynthesis in C. sojina. (A) The eight cercosporin toxin biosynthetic genes in the C. sojina genome. The genes involved in cercosporin biosynthesis of Cercospora nicotianae (C. nicotianae) were blasted against C. sojina using BlastP. The black arrow represents the direction of the sense strand. (B) Expression of candidate genes involved in cercosporin biosynthesis at 48 h after C. sojina infection. The mRNA levels were detected using qRT-PCR. Values are means ± SD (n = 3 biological replicates).

Pigments are the other important group of secondary metabolites for successful invasion of pathogens. Generally, pathogen-produced pigment is able to protect the pathogen from host oxidative stress during infection. We found that the C. sojina genome encodes multiple putative PKS that are responsible for pigment production (Fig. 5A and Supplementary Table S7). We also found that C. sojina can produce some grey pigments, and the pigment was significantly induced by both starvation and cAMP treatments (Fig. 5B), suggesting that the pigments may be related to pathogen virulence. Therefore, we further isolated and partially purified the pigments. Three major components, the grey, the light yellow, and the dark grey pigments, were obtained (Supplementary Fig. S9), and the dark grey pigment is the most abundant one.

3.7. Carbohydrate-active enzymes

Successful phytopathogenic fungi can break down and utilize plant cell wall polysaccharides using CAZymes. Cercospora sojina harbours 596 predicted CAZymes (Supplementary Table S9). Compared with other fungi in Dothideomycetes, C. sojina has a larger group of potential carbohydrate esterases, which can catalyze the O- or N-deacylation of substituted saccharides (Supplementary Table S9). In the C. sojina genome, around 23.5% of potential secreted proteins (177/752) were predicted as CAZymes, suggesting that C. sojina may employ a large group of CAZymes to digest host cell walls during invasion.

Interestingly, one of the families of CAZymes, the glycoside hydrolase GH109 family, is highly enriched (Fig. 6). The biochemical function of the GH109 family has been shown to be α-N-acetylgalactosaminidase (αNAGAL), which can cleave the terminal alpha-linked N-acetylgalactosamine epitope of blood group A. It has also been shown that soybean lectin can specifically bind N-acetylgalactosamine, a component of the fungal cell wall. These data suggest that αNAGAL may be able to compete with lectin to bind N-acetylgalactosamine. We find that C. sojina encodes 14 putative GH109 genes, which is more than most fungi, such as M. oryzae, B. cinerea, and N. crassa (Supplementary Table S10). These data suggest that the expansion of the GH109 family in C. sojina may contribute to overcoming lectin-mediated resistance in soybean.

Of the annotated carbohydrate esterase genes, families CE1 and CE10 are the two major subfamilies (Fig. 6). Both families encode proteins with the common activities of carboxylesterase and endo-1,4-β-xylanase. Besides, 34 potential carbohydrate-binding module (CBM) proteins were identified in the C. sojina genome (Supplementary Table S9), which can digest carbohydrate complexes extracellularly. Unexpectedly, only one CBM1 protein was found in the C. sojina genome (Supplementary Table S9). However, many phytopathogenic fungi carry plenty of CBM1 proteins. For example, Verticillium dahliae has an expansion of the CBM1-containing protein family (30 genes).

3.8. Functional analysis of putative effectors

Effectors are low molecular weight proteins that are secreted by bacteria, oomycetes or fungi to impair the host immune defence and to adapt to a specific environment. In the maize pathogen U. maydis, it was observed that most of the genes in secreted protein-coding gene clusters were induced simultaneously in infected tissue. Phylogenetic analysis of the 233 putative effectors reveals that 21 predicted effectors are grouped into clusters that contain 2 or 3 genes with high sequence similarity in C. sojina (Fig. 7A). Clusters of putative effectors also suggest that local duplications might be involved in the expansion of effectors in C. sojina. In addition, 40 putative effectors can be annotated by the PHI database (Supplementary Table S11), and most of the annotated effectors have been implicated in fungal pathogenesis.

Next, we attempted to investigate the effector function. Programmed cell death (PCD) induced on N. benthamiana by the pro-apoptotic mouse protein BAX physiologically resembles the defence-associated hypersensitive response caused by pathogens, providing a valuable screening approach for effectors that can suppress defence-related PCD. We randomly selected 50 effectors and transiently expressed them in N. benthamiana to screen for potential effectors that can suppress BAX-triggered PCD (BT-PCD). Our results show that about 1/4 (13/50) of the selected effectors strongly suppress BT-PCD (Fig. 7B). Moreover, qRT-PCR results show that most of these putative effectors are transcriptionally induced at 48 h after C. sojina infection (Fig. 7C), implying they probably contribute to early infection on soybean.

4. Discussion

Cercospora species cause severe leaf spot and blight diseases on many crops worldwide. In this study, we sequenced the genome of the economically important fungus C. sojina race 1. The genome size is 40.84 Mb, and 25.56% of the genome is composed of repeat sequences. Compared with a C. sojina isolate that was sequenced by an NGS method but not annotated (IMG genome id: 2506520004, https://img.jgi.doe.gov/cgi-bin/m/main.cgi), C. sojina race 1 has a larger genome size (Table 1 and Supplementary Fig. S12). In particular, the C. sojina genome assembled by SMRT contains 10 Mb of repetitive sequences that are not detected by other sequencing techniques.

DNA methylation is an important research area of epigenetics in both eukaryotes and prokaryotes. The most studied type of DNA methylation in fungi is m5C, while in prokaryotes it is m4C and m6A. The methylase and demethylase for m6A in eukaryotes were identified only recently. However, in fungi, this type of DNA modification is poorly studied. Using SMRT, we identified m6A and m4C in C. sojina for the first time, finding 17,407 m6A and 1,015,733 m4C sites in the C. sojina genome (Fig. 2B). The m6A frequency in this fungus is 435.2/Mb, which is slightly more than the 341.3/Mb in yeast. In bacteria, m6A is regarded as an epigenetic signal for DNA-protein interactions. Surprisingly, we also observe that C. sojina contains large numbers of m4C, which to our knowledge has been reported only in bacteria. Interestingly, the m4C modification occurs with lower frequency in the regions of repetitive elements in the C. sojina genome (Fig. 2C and D), indicating that m4C may be involved in the transposition of the transposons. Our results also demonstrate that SMRT is a powerful tool for studying epigenetic modification of fungal genomes.

It is worth noting that one of the enriched families is the glycoside hydrolase GH109 family (Fig. 6). This gene family encodes α-N-acetylgalactosaminidase (αNAGAL), an enzyme that can cleave N-acetylgalactosamine from conjugated proteins. Interestingly, it has been found that lectin can specifically bind N-acetylgalactosamine. It has also been found that soybean lectin can bind N-acetylgalactosamine, a component of the fungal cell wall. Therefore, αNAGAL may compete with soybean lectin to bind N-acetylgalactosamine. It is known that soybean lectin and other

Figure 5. Putative PKS gene clusters for pigment production in C. sojina. (A) Genes encoding methyltransferases, cytochrome P450s, oxidoreductases, dehydrogenases, acyltransferases, MFS transporters, and transcriptional factors were clustered with PKS genes. These clusters are responsible for the pigment synthesis. The black arrows represent the direction of sense chain. The orange arrows highlight PKS genes.
The blue and grey arrows represent putative pigment biosynthesis genes and other genes that are not involved in pigment biosynthesis. (B) Pigments produced by C. sojina in different media. The supernatants were collected at 4 days after treatments on mycelia. Czapek medium served as a negative control. Results are representatives of three biological replicates. PDB and CM represent potato-dextrose broth medium and complete medium, respectively. plant lectins can inhibit hyphae growth and spore germination by CBMs are the most common non-catalytic modules associated binding their cell wall components in several fungi, such as Penicillia with enzymes active in plant cell-wall hydrolysis. They can increase 71,72 and Aspergilli species. We hypothesize that the expansion of enzyme efficiency by anchoring the enzyme’s catalytic region to insol- GH109 family in C. sojina genome may contribute to overcome the uble cellulose. However, only one CBM1 protein was found in lectin-mediated disease resistance in soybean. C. sojina genome (Supplementary Table S9). In light of their relative Downloaded from https://academic.oup.com/dnaresearch/article-abstract/25/1/25/4161430 by Ed 'DeepDyve' Gillespie user on 16 March 2018 34 Cercospora sojina genome sequence reveals the potential infection mechanism Figure 6. Comparison of carbohydrate enzymes between C. sojina and nine other fungal species. Nine species were selected to compare with C. sojina.Pi, Phytophothora infestans;Ps, Phytophothora sojae;Nc, Neurospora crassa;Bc, Botrytis cinerea;Cm, Cercospora zeae-maydis;Cs, Cercospora sojina;An, Aspergillus nidulans;Vd, Verticillium dahliae;Mo, Magnaporthe oryzae;Fg, Fusarium graminearum. GH, glycoside hydrolase; GT, glycosyltransferase; PL, polysaccharide lyase; CE, carbohydrate esterase; AA, auxiliary activity family; CBM, carbohydrate-binding module family. The numbers of gene families were normalized by Z score. 
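The Z-score normalization mentioned in the Figure 6 caption can be illustrated with a short sketch. It assumes that the gene counts of each family are standardized across the compared species using the population standard deviation; the paper does not state these details, and the counts below are hypothetical placeholders rather than data from the study.

```python
from statistics import mean, pstdev

def z_scores(counts):
    """Standardize a list of gene counts: (x - mean) / SD.

    Returns zeros when all counts are identical (SD of 0).
    """
    mu = mean(counts)
    sd = pstdev(counts)  # population SD; an assumption, not stated in the paper
    if sd == 0:
        return [0.0 for _ in counts]
    return [(x - mu) / sd for x in counts]

# Hypothetical counts of one CAZyme family in three species (not from the paper)
example_counts = [2, 4, 6]
example_z = z_scores(example_counts)
```

After this transformation, a species with a positive score carries more members of the family than the average of the compared species, which is how an expansion such as GH109 or a deficiency such as CBM1 would stand out in a heat map.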
slower infection process, the deficiency of CBMs in the genome may undermine C. sojina infection in terms of digesting plant cell walls.

It is believed that mycotoxins play a critical role during pathogen infection. The mycotoxin cercosporin produced by Cercospora spp. is considered to be one of the key factors that enhance their virulence, as pathogenicity was remarkably impaired in cercosporin-deficient mutants. However, C. sojina is one of the few Cercospora spp. reported to be unable to produce cercosporin, although this is disputed. We attempted to detect cercosporin in both cultured mycelium and infected plant tissue. However, we were unable to detect cercosporin in any of the samples, although the complete gene cluster for cercosporin biosynthesis exists in the C. sojina genome (Fig. 4 and Supplementary Fig. S8). Therefore, our data imply that C. sojina may employ mycotoxins other than cercosporin to enhance virulence.

Fungi-derived pigments can act as virulence factors to facilitate infection in plants, and are required for pathogen fitness by serving as UV protectants and ROS scavengers. The best-studied fungal pigment is melanin, which contributes to fungal pathogenesis by altering cytokine responses, decreasing phagocytosis and scavenging ROS, as well as by reinforcing the fungal cell wall. Similar to melanin, some pigment production was induced by cAMP or starvation treatments (Fig. 5B). We also identified the key gene clusters that encode PKS in the C. sojina genome (Fig. 5A and Supplementary Table S7). Moreover, the whole-genome transcriptome assays demonstrate that the key genes involved in pigment biosynthesis are significantly up-regulated. These data further support the assumption that the pigments produced by C. sojina are involved in its virulence.

In addition to secondary metabolites, pathogens usually harbour various virulence effectors. These effectors interfere with host immune responses to enhance virulence. For example, the bacterial pathogen Pseudomonas syringae delivers over 30 effectors through the type III secretion system during infection. Our work showed that more than one third of the effectors were upregulated during starvation (Supplementary Fig. S7B), and many of them can suppress BAX-induced cell death (Fig. 7B), similar to the finding in Phytophthora sojae. These data suggest that C. sojina probably deploys effectors to promote infection.

In summary, we report a complete genome sequence of C. sojina obtained by the SMRT sequencing method. This sequencing method not only helps us to find repetitive elements, but also to discover DNA methylation in the fungus. Based on the genome assembly and annotation, we hypothesize that the specific CAZymes, secondary metabolites, and effectors help C. sojina to adapt to soybean successfully. Our work also lays the groundwork for future discoveries on this important soybean disease.

Figure 7. Functional analysis of putative C. sojina effectors. (A) Phylogenetic analysis of putative effectors. The phylogenetic tree of 233 putative effectors is categorized into six super clades. (B) Selected C. sojina effectors suppress BAX-induced programmed cell death in N. benthamiana. N. benthamiana leaves were infiltrated with Agrobacterium carrying C. sojina effector genes (OD600 = 0.4). GFP and the Phytophthora sojae effector Avr1b were used as the negative and the positive controls, respectively. Twenty-four hours later, plants were infiltrated with pVX-BAX (OD600 = 0.4), and the photos were taken 3 days later. Results are representative of six biological replicates. (C) The effectors that can suppress BAX-induced PCD are significantly up-regulated at 48 hpi. Soybean leaves were inoculated with C. sojina. Samples were collected at 0 and 48 hpi, respectively. Quantitative RT-PCR (qRT-PCR) was used to determine gene expression levels. Values are mean ± SD (n = 3 biological replicates).

Supplementary data

Supplementary data are available at DNARES online.

Funding

The study was supported by the Chinese Academy of Sciences (Strategic Priority Research Program Grant No. XDB11020300), the National Natural Science Foundation of China (Grant 31570252 and Grant 31500220), and by the grant from the State Key Laboratory of Plant Genomics (Grant No. O8KF021011).

Authors' contributions

J.L. and S.M. conceived the research plans; X.L., J.C., and J.H. performed the experiments; Z.W., Z.G., and Y.C. provided technical assistance; X.L., J.C., and J.L. wrote the article.

Conflict of interest

None declared.

Accession number

NFUF00000000

Data availability

This Whole Genome Shotgun project has been deposited at DDBJ/ENA/GenBank under the accession NFUF00000000. The version described in this paper is version NFUF01000000.

References

1. Mian, M.A.R., Missaoui, A.M., Walker, D.R., Phillips, D.V. and Boerma, H.R. 2008, Frogeye leaf spot of soybean: a review and proposed race designations for isolates of Cercospora sojina Hara, Crop Sci., 48, 14–24.
2. Gupta, D.K., Singh, M., Singh, G. and Srivastava, L.S. 1994, Sources of resistance in soybean (Glycine-max) to frog-eye leaf-spot caused by Cercosporidium-sojinum, Indian J. Agric. Sci., 64, 886–7.
3. Soares, A.P.G., Guillin, E.A., Borges, L.L., et al. 2015, More Cercospora species infect soybeans across the Americas than meets the eye, PLoS One, 10, e0133495.
4. Mian, R., Bond, J., Joobeur, T., et al. 2009, Identification of soybean genotypes resistant to Cercospora sojina by field screening and molecular markers, Plant Dis., 93, 408–11.
5. Galloway, J. 2008, Effective management of soybean rust and frogeye leaf spot using a mixture of flusilazole and carbendazim, Crop Prot., 27, 566–71.
6. Zhang, G.R., Pedersen, D.K., Phillips, D.V. and Bradley, C.A. 2012, Sensitivity of Cercospora sojina isolates to quinone outside inhibitor fungicides, Crop Prot., 40, 63–8.
7. Ma, S.M. and Li, B.Y. 1997, Primary report on the identification for physiological races of Cercospora sojina Hara in northeast China, Acta Phytopathol. Sin., 27, 180.
8. Daub, M.E. and Ehrenshaft, M. 2000, The photoactivated Cercospora toxin cercosporin: contributions to plant disease and fundamental biology, Annu. Rev. Phytopathol., 38, 461–90.
9. Goodwin, S.B., Dunkle, L.D. and Zismann, V.L. 2001, Phylogenetic analysis of cercospora and mycosphaerella based on the internal transcribed spacer region of ribosomal DNA, Phytopathology, 91, 648–58.
10. Dean, R.A., Talbot, N.J., Ebbole, D.J., et al. 2005, The genome sequence of the rice blast fungus Magnaporthe grisea, Nature, 434, 980–6.
11. Wang, R., Ning, Y., Shi, X., et al. 2016, Immunity to rice blast disease by suppression of effector-triggered necrosis, Curr. Biol., 26, 2399–411.
12. Park, C.H., Chen, S., Shirsekar, G., et al. 2012, The Magnaporthe oryzae effector AvrPiz-t targets the RING E3 Ubiquitin Ligase APIP6 to suppress pathogen-associated molecular pattern–triggered immunity in rice, Plant Cell, 24, 4748–62.
13. Li, X., Gao, C., Li, L., et al. 2017, MoEnd3 regulates appressorium formation and virulence through mediating endocytosis in rice blast fungus Magnaporthe oryzae, PLoS Pathog., 13, e1006449.
14. Qi, Z., Wang, Q., Dou, X., et al. 2012, MoSwi6, an APSES family transcription factor, interacts with MoMps1 and is required for hyphal and conidial morphogenesis, appressorial function and pathogenicity of Magnaporthe oryzae, Mol. Plant Pathol., 13, 677–89.
15. Huang, K., Czymmek, K.J., Caplan, J.L., et al. 2011, HYR1-mediated detoxification of reactive oxygen species is required for full virulence in the rice blast fungus, PLoS Pathog., 7, e1001335.
16. Zhou, T.T., Zhao, Y.L. and Guo, H.S. 2017, Secretory proteins are delivered to the septin-organized penetration interface during root infection by Verticillium dahliae, PLoS Pathog., 13, e1006275.
17. Zhao, Y.L., Zhou, T.T. and Guo, H.S. 2016, Hyphopodium-specific VdNoxB/VdPls1-dependent ROS-Ca2+ signaling is required for plant infection by Verticillium dahliae, PLoS Pathog., 12, e1005793.
18. Kim, H., Newell, A.D., Cota-Sieckmeyer, R.G., Rupe, J.C., Fakhoury, A.M. and Bluhm, B.H. 2013, Mating-type distribution and genetic diversity of Cercospora sojina populations on soybean from Arkansas: evidence for potential sexual reproduction, Phytopathology, 103, 1045–51.
19. Kim, J.S., Seo, S.G., Jun, B.K., Kim, J.W. and Kim, S.H. 2010, Simple and reliable DNA extraction method for the dark pigmented fungus, Cercospora sojina, Plant Pathol. J., 26, 289–92.
20. Chin, C.S., Alexander, D.H., Marks, P., et al. 2013, Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data, Nat. Methods, 10, 563–9.
21. Berlin, K., Koren, S., Chin, C.S., Drake, J.P., Landolin, J.M. and Phillippy, A.M. 2015, Assembling large genomes with single-molecule sequencing and locality-sensitive hashing, Nat. Biotechnol., 33, 623–30.
22. Luo, R., Liu, B., Xie, Y., et al. 2012, SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler, Gigascience, 1, 18.
23. Ter-Hovhannisyan, V., Lomsadze, A., Chernoff, Y.O. and Borodovsky, M. 2008, Gene prediction in novel fungal genomes using an ab initio algorithm with unsupervised training, Genome Res., 18, 1979–90.
24. Salamov, A.A. and Solovyev, V.V. 2000, Ab initio gene finding in Drosophila genomic DNA, Genome Res., 10, 516–22.
25. Delcher, A.L., Harmon, D., Kasif, S., White, O. and Salzberg, S.L. 1999, Improved microbial gene identification with GLIMMER, Nucleic Acids Res., 27, 4636–41.
26. Haas, B.J., Zeng, Q., Pearson, M.D., Cuomo, C.A. and Wortman, J.R. 2011, Approaches to fungal genome annotation, Mycology, 2, 118–41.
27. Haas, B.J., Papanicolaou, A., Yassour, M., et al. 2013, De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis, Nat. Protoc., 8, 1494–512.
28. Kanehisa, M., Goto, S., Kawashima, S., Okuno, Y. and Hattori, M. 2004, The KEGG resource for deciphering the genome, Nucleic Acids Res., 32, D277–80.
29. Tatusov, R.L., Galperin, M.Y., Natale, D.A. and Koonin, E.V. 2000, The COG database: a tool for genome-scale analysis of protein functions and evolution, Nucleic Acids Res., 28, 33–6.
30. Finn, R.D., Bateman, A., Clements, J., et al. 2014, Pfam: the protein families database, Nucleic Acids Res., 42, D222–30.
31. Eddy, S.R. 1998, Profile hidden Markov models, Bioinformatics, 14, 755–63.
32. Urban, M., Cuzick, A., Rutherford, K., et al. 2017, PHI-base: a new interface and further additions for the multi-species pathogen–host interactions database, Nucleic Acids Res., 45, D604–10.
33. Gotz, S., Arnold, R., Sebastian-Leon, P., et al. 2011, B2G-FAR, a species-centered GO annotation repository, Bioinformatics, 27, 919–24.
34. Benson, G. 1999, Tandem repeats finder: a program to analyze DNA sequences, Nucleic Acids Res., 27, 573–80.
35. Blow, M.J., Clark, T.A., Daum, C.G., et al. 2016, The epigenomic landscape of prokaryotes, PLoS Genet., 12, e1005854.
36. Nordberg, H., Cantor, M., Dusheyko, S., et al. 2014, The genome portal of the Department of Energy Joint Genome Institute: 2014 updates, Nucleic Acids Res., 42, D26–31.
37. Ebersberger, I., Simoes, R.D., Kupczok, A., et al. 2012, A consistent phylogenetic backbone for the fungi, Mol. Biol. Evol., 29, 1319–34.
38. Edgar, R.C. 2004, MUSCLE: multiple sequence alignment with high accuracy and high throughput, Nucleic Acids Res., 32, 1792–7.
39. Kumar, S., Stecher, G. and Tamura, K. 2016, MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets, Mol. Biol. Evol., 33, 1870–4.
40. Nix, D.A. and Eisen, M.B. 2005, GATA: a graphic alignment tool for comparative sequence analysis, BMC Bioinformatics, 6, 9.
41. Mistry, J., Finn, R.D., Eddy, S.R., Bateman, A. and Punta, M. 2013, Challenges in homology search: HMMER3 and convergent evolution of coiled-coil regions, Nucleic Acids Res., 41, e121.
42. Yin, Y., Mao, X., Yang, J., Chen, X., Mao, F. and Xu, Y. 2012, dbCAN: a web resource for automated carbohydrate-active enzyme annotation, Nucleic Acids Res., 40, W445–51.
43. Khaldi, N., Seifuddin, F.T., Turner, G., et al. 2010, SMURF: genomic mapping of fungal secondary metabolite clusters, Fungal Genet. Biol., 47, 736–41.
44. Medema, M.H., Blin, K., Cimermancic, P., et al. 2011, antiSMASH: rapid identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genome sequences, Nucleic Acids Res., 39, W339–46.
45. Shim, W.B. and Dunkle, L.D. 2002, Identification of genes expressed during cercosporin biosynthesis in Cercospora zeae-maydis, Physiol. Mol. Plant P., 61, 237–48.
46. Yang, C., Li, W., Cao, J., et al. 2017, Activation of ethylene signaling pathways enhances disease resistance by regulating ROS and phytoalexin production in rice, Plant J., 89, 338–53.
47. Zeng, F., Wang, C., Zhang, G., Wei, J., Bradley, C.A. and Ming, R. 2017, Draft genome sequence of Cercospora sojina isolate S9, a fungus causing frogeye leaf spot (FLS) disease of soybean, Genomics Data, 12, 79–80.
48. Bodega, B. and Orlando, V. 2014, Repetitive elements dynamics in cell identity programming, maintenance and disease, Curr. Opin. Cell Biol., 31, 67–73.
49. Jaenisch, R. and Bird, A. 2003, Epigenetic regulation of gene expression: how the genome integrates intrinsic and environmental signals, Nat. Genet., 33, 245–54.
50. Ohm, R.A., Feau, N., Henrissat, B., et al. 2012, Diverse lifestyles and strategies of plant pathogenesis encoded in the genomes of eighteen Dothideomycetes fungi, PLoS Pathog., 8, e1003037.
51. Flusberg, B.A., Webster, D.R., Lee, J.H., et al. 2010, Direct detection of DNA methylation during single-molecule, real-time sequencing, Nat. Methods, 7, 461–5.
52. Benson, J.M., Poland, J.A., Benson, B.M., Stromberg, E.L. and Nelson, R.J. 2015, Resistance to gray leaf spot of maize: genetic architecture and mechanisms elucidated through nested association mapping and near-isogenic line analysis, PLoS Genet., 11, e1005045.
53. Goodwin, S.B., Ben M'Barek, S., Dhillon, B., et al. 2011, Finished genome of the fungal wheat pathogen Mycosphaerella graminicola reveals dispensome structure, chromosome plasticity, and stealth pathogenesis, PLoS Genet., 7, e1002070.
54. Krogh, A., Larsson, B., von Heijne, G. and Sonnhammer, E.L.L. 2001, Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes, J. Mol. Biol., 305, 567–80.
55. Petersen, T.N., Brunak, S., von Heijne, G. and Nielsen, H. 2011, SignalP 4.0: discriminating signal peptides from transmembrane regions, Nat. Methods, 8, 785–6.
56. Soanes, D.M., Alam, I., Cornell, M., et al. 2008, Comparative genome analysis of filamentous fungi reveals gene family expansions associated with fungal pathogenesis, PLoS One, 3, e2300.
57. Koeck, M., Hardham, A.R. and Dodds, P.N. 2011, The role of effectors of biotrophic and hemibiotrophic fungi in infection, Cell Microbiol., 13, 1849–57.
58. Wang, Y., Wu, J., Park, Z.Y., et al. 2011, Comparative secretome investigation of magnaporthe oryzae proteins responsive to nitrogen starvation, J. Proteome Res., 10, 3136–48.
59. Liu, G.Y. and Nizet, V. 2009, Color me bad: microbial pigments as virulence factors, Trends Microbiol., 17, 406–13.
60. Biely, P. 2012, Microbial carbohydrate esterases deacetylating plant polysaccharides, Biotechnol. Adv., 30, 1575–88.
61. Aurilia, V., Parracino, A. and D'Auria, S. 2008, Microbial carbohydrate esterases in cold adapted environments, Gene, 410, 234–40.
62. Liu, Q.P., Sulzenbacher, G., Yuan, H., et al. 2007, Bacterial glycosidases for the production of universal red blood cells, Nat. Biotechnol., 25, 454–64.
63. Benhamou, N. and Ouellette, G.B. 1986, Ultrastructural-localization of glycoconjugates in the fungus ascocalyx-abietina, the scleroderris canker agent of conifers, using lectin gold complexes, J. Histochem. Cytochem., 34, 855–67.
64. Zhao, Z.T., Liu, H.Q., Wang, C.F. and Xu, J.R. 2013, Comparative analysis of fungal genomes reveals different plant cell wall degrading capacity in fungi, BMC Genomics, 14, 274.
65. Klosterman, S.J., Subbarao, K.V., Kang, S.C., et al. 2011, Comparative genomics yields insights into niche adaptation of plant vascular wilt pathogens, PLoS Pathog., 7, e1002137.
66. Kamper, J., Kahmann, R., Bolker, M., et al. 2006, Insights from the genome of the biotrophic fungal plant pathogen Ustilago maydis, Nature, 444, 97–101.
67. Wang, Q.Q., Han, C.Z., Ferreira, A.O., et al. 2011, Transcriptional programming and functional interactions within the Phytophthora sojae RXLR effector repertoire, Plant Cell, 23, 2064–86.
68. Dubey, A. and Jeon, J. 2016, Epigenetic regulation of development and pathogenesis in fungal plant pathogens, Mol. Plant Pathol., doi: 10.1111/mpp.12499.
69. Ye, P.H., Luan, Y.Z., Chen, K.N., Liu, Y.Z., Xiao, C.L. and Xie, Z. 2017, MethSMRT: an integrative database for DNA N6-methyladenine and N4-methylcytosine generated by single-molecular real-time sequencing, Nucleic Acids Res., 45, D85–9.
70. Wion, D. and Casadesus, J. 2006, N6-methyl-adenine: an epigenetic signal for DNA-protein interactions, Nat. Rev. Microbiol., 4, 183–92.
71. Barkai-golan, R., Mirelman, D. and Sharon, N. 1978, Studies on growth inhibition by lectins of Penicillia and Aspergilli, Arch. Microbiol., 116, 119–24.
72. Guo, P., Wang, Y., Zhou, X., et al. 2013, Expression of soybean lectin in transgenic tobacco results in enhanced resistance to pathogens and pests, Plant Science, 211, 17–22.
73. Boraston, A.B., Bolam, D.N., Gilbert, H.J. and Davies, G.J. 2004, Carbohydrate-binding modules: fine-tuning polysaccharide recognition, Biochem. J., 382, 769–81.
74. Receveur, V., Czjzek, M., Schulein, M., Panine, P. and Henrissat, B. 2002, Dimension, shape, and conformational flexibility of a two domain fungal cellulase in solution probed by small angle X-ray scattering, J. Biol. Chem., 277, 40887–92.
75. Yu, J.H. and Keller, N. 2005, Regulation of secondary metabolism in filamentous fungi, Annu. Rev. Phytopathol., 43, 437–58.
76. Nosanchuk, J.D., Stark, R.E. and Casadevall, A. 2015, Fungal melanin: what do we know about structure? Front Microbiol., 6, 1463.
77. Xu, N., Luo, X., Li, W., Wang, Z. and Liu, J. 2017, The bacterial effector AvrB-induced RIN4 hyperphosphorylation is mediated by a receptor-like cytoplasmic kinase complex in Arabidopsis, Mol. Plant-Microbe In., 30, 502–12.

DNA Research, Oxford University Press

Genome sequencing and comparative genomics reveal the potential pathogenic mechanism of Cercospora sojina Hara on soybean

Publisher: Oxford University Press
Copyright: © The Author 2017. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
ISSN: 1340-2838
eISSN: 1756-1663
DOI: 10.1093/dnares/dsx035

Abstract

Frogeye leaf spot, caused by Cercospora sojina Hara, is a common disease of soybean in most soybean-growing countries of the world. In this study, we report a high-quality genome sequence of C. sojina obtained by the Single Molecule Real-Time sequencing method. The 40.8-Mb genome encodes 11,655 predicted genes, and 8,474 genes are revealed by RNA sequencing. The Cercospora sojina genome contains large numbers of gene clusters that are involved in the synthesis of secondary metabolites, including mycotoxins and pigments. However, far fewer genes encoding carbohydrate-binding module proteins are identified in the C. sojina genome than in other phytopathogenic fungi. Bioinformatics analysis reveals that C. sojina harbours about 752 secreted proteins, and 233 of them are effectors. During early infection, the genes for metabolite biosynthesis and effectors are significantly enriched, suggesting that they may play essential roles in pathogenicity. We further identify 13 effectors that can inhibit BAX-induced cell death. Taken together, our results provide insights into the infection mechanisms of C. sojina on soybean.

Key words: Cercospora sojina, soybean, genome, pathogenicity

© The Author 2017. Published by Oxford University Press on behalf of Kazusa DNA Research Institute. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com

1. Introduction

The causal agent of frogeye leaf spot (FLS), Cercospora sojina Hara, is a destructive pathogen of soybean worldwide [1]. It was first reported in Japan in 1915. After that, the disease was reported in many soybean-growing countries, such as the USA, China, and Argentina [3]. The main measures to control this disease are to grow resistant soybean varieties or to apply chemical fungicides, which usually lose their effect rapidly in the field owing to race differentiation and gene mutations of the pathogen [4–6]. FLS causes about 10–60% yield loss in soybean-growing regions, such as Argentina and Nigeria, and it has been reported to be the most expensive disease in the history of soybean production in Argentina [3].

Taxonomically, C. sojina belongs to the order Capnodiales in the class Dothideomycetes. Currently, it is known that there are 22 races of C. sojina in Brazil and 12 races in the USA. Later, using 93 isolates of C. sojina and 38 putative soybean differentials, Mian et al. proposed a core set of 11 races that represent the major diversity of the 93 isolates in the USA. We previously reported 14 races in north China; of the identified races of C. sojina, race 1 has an occurrence frequency of 43.5%, emerging as the dominant race, and causes yield loss of up to 38% in the field.

Despite the importance of C. sojina, the infection mechanism and the genetic information of this pathogen are not known. For example, most species in the Cercospora genus can produce the non-specific coloured mycotoxin cercosporin, which is indispensable for their pathogenicity [8], but C. sojina may not produce cercosporin [9]. Does the genome harbour the cercosporin biosynthesis genes? Nowadays, the main strategy to unravel pathogen infection mechanisms is to obtain genome information. Magnaporthe grisea, one of the best-studied fungi, was sequenced at the genome level in 2005. The genome annotation shows that the pathogen may carry over 700 secreted proteins, and most of them are believed to be virulence effectors. Later, several effectors were implicated in suppressing immune responses in rice [11,12]. Other genes, such as MoEnd3, MoSwi6, and MoHYR1, have been demonstrated to be essential for appressorium formation, melanin accumulation, and reactive oxygen species (ROS) scavenging [13–15]. Similarly, the availability of the genome sequence and gene annotation has helped substantially to uncover the infection mechanism of Verticillium dahliae, a pathogen that shows a distinct infection structure compared with M. oryzae and other fungi [16,17].

Although breeding programmes and fungicide application have been very successful in controlling FLS in recent decades, their efficiency has recently been challenged [4,7]. In fact, Cercospora species may undergo positive selection and rapid evolution. Soares et al. reported that more Cercospora species were able to infect soybean and caused disease symptoms similar to those of C. kikuchii, the closest species to C. sojina [3]. Importantly, they detected interlineage recombination among Cercospora species, along with a high frequency of mutations linked to fungicide resistance [3]. Moreover, it has been observed that C. sojina populations are genetically diverse and likely undergoing sexual reproduction. The above-mentioned reports imply that C. sojina can adapt flexibly to a changing environment.

In this study, we report the 40.8 Mb complete genome sequence of C. sojina race 1. The genome annotation and whole genome transcriptome assays reveal that C. sojina encodes a different set of proteins that distinguishes it from other fungal species in terms of infection strategy and disease development. We demonstrate that the secondary metabolites and effectors play essential roles in the soybean–C. sojina pathosystem. Moreover, the genome information can be further used for comparative genomic studies with other Cercospora species to unravel their evolution and infection mechanism on soybean.

2. Materials and methods

2.1. Fungal growth conditions and DNA preparation

Cercospora sojina (C. sojina) race 1 was isolated from soybean fields in Heilong Jiang province of China. Briefly, a single fungal spore was isolated from a soybean seedling infested by C. sojina race 1. The isolated spores were grown on potato dextrose agar (PDA) medium at 28 °C. Then, the mycelia were removed from the media and ground in liquid nitrogen. Genomic DNA was extracted using a modified cetyltrimethylammonium bromide (CTAB) method.

2.2. Genome sequencing and assembly

The Cercospora sojina race 1 genome was sequenced by the Single Molecule Real-Time (SMRT) method at Biomarker Technologies (Beijing, China). DNA libraries with 270 bp and 10 kb inserts were constructed. The 270-bp library was constructed following Illumina's standard protocol, including fragmentation of genomic DNA, end repair, adaptor ligation and PCR amplification. The 270-bp library was quantified using a 2100 Bioanalyzer (Agilent, USA) and subjected to paired-end 150 bp sequencing on an Illumina HiSeq4000. The sequencing data (filtered reads: 2.99G, sequencing depth: 90×) were used to estimate the genome size, repeat content, and heterozygosity. Then, the 10-kb library was constructed following PacBio's standard methods, including fragmentation of genomic DNA, end repair, adaptor ligation, and template purification. The 10-kb library was quantified with a 2100 Bioanalyzer (Agilent, USA) and sequenced by SMRT, and the sequencing data (filtered reads: 4.92G, sequencing depth: 123×) were assembled by CANU (Version-1.2) with default parameters. Finally, Illumina reads were used for error correction and gap filling with SOAPdenovo GAPCLOSER v1.12.

2.3. Genome annotations

Protein-encoding genes were annotated by a combination of three independent ab initio predictors: GeneMark (Version 4.30), SNAP [24], and Glimmer [25]. Transcriptome data were incorporated into PASA [26] to improve the quality of the C. sojina annotation. Briefly, the transcriptome assemblies were mapped to the genome using Trinity. Then, PASA alignment assemblies based on overlapping transcript alignments from Trinity were used with EVidenceModeler (EVM) to compute weighted consensus gene structure annotations. Finally, PASA was used to update the EVM consensus predictions.

Functional annotations for all predicted gene models were made using multiple databases, including Swiss-Prot, nr, KEGG [28], and COG, by blastP with an E-value cutoff of 1e−5. Domain-calling analyses of protein-encoding genes were performed using the Pfam database [30] and HMMER. Potential virulence-related proteins were identified by searching against the pathogen–host interaction database (PHI-base) [32] by blastP with an E-value cutoff of 1e−5. Blast2GO [33] was used for GO enrichment analysis of genes that belong only to C. sojina and have no homologue in the non-plant pathogens Aspergillus nidulans and Neurospora crassa.

2.4. Repetitive sequences and whole genome DNA methylation analysis

Tandem repeat sequences were identified using Tandem Repeats Finder (TrF). Transposable elements (TEs) were identified using three software tools: the de novo tool Repeat Modeler (http://repeatmasker.org/RepeatModeler/) and the two database-based tools Repeat Protein Masker (www.repeatmasker.org/cgi-bin/RepeatProteinMaskRequest) and Repeat Masker (www.repeatmasker.org/). All parameters were set to default. Whole-genome DNA modification detection and motif analysis were performed according to Blow et al. using the PacBio SMRT software (version = 2.2.3, www.pacb.com).

2.5. Phylogenetic analysis and synteny analysis

The sequences of fungi were downloaded from the DOE Joint Genome Institute (JGI). A group of consistent phylogenetic 'backbone' genes (phylogenetically conserved) in the fungal genomes was used to construct the phylogenetic tree. Putative 'backbone' genes of the other 11 fungi were identified using stand-alone blast with an E-value cutoff of 1e−20, and the 'backbone' genes were concatenated into one sequence. Sequence alignment was done using MUSCLE, and the phylogenetic tree was generated by MEGA 7.0 using the UPGMA method. Ustilago maydis was used as the outgroup. Synteny of Cercospora zeae-maydis, C. sojina, and Pseudocercospora fijiensis was analysed using GATA.

2.6. Comparison analysis of carbohydrate-active enzymes and secondary metabolism genes

The proteomes of C. sojina and the 14 other above-mentioned fungal species were downloaded from the DOE Joint Genome Institute (JGI). HMMER 3.0 packages were used for homology search. Family-specific HMM profiles were downloaded from the dbCAN database [42]. The executable file hmmscan and the hmmscan-parser script provided by dbCAN were used to generate and extract the search results, respectively.

Putative polyketide synthase (PKS) and non-ribosomal peptide synthase (NRPS) genes were identified using the web-based software SMURF with default settings.

… SYBR qPCR Mix (Vazyme Biotech). The 2^(−ΔΔCT) method was used for calculating the relative gene expression levels. The actin gene of C. sojina was used as the internal control.

2.9. Functional study of putative effectors

The putative effector genes were cloned into binary vector pMD-1

Table 1. Genome features of Cercospora sojina

Features                          C. sojina
Size (bp)                         40,835,411
Coverage                          120×
(G+C) percentage (%)              53.12
N50 (bp)                          1,594,385
Protein-coding genes              11,655
Average gene length (bp)          1,441
Gene density (no. gene per Mb)    285
tRNA genes                        277
Pseudogene                        281
The modules of different domains in (T7 tag) driven by 35S promoter using ClonExpress II One Step individual NRPS and PKS proteins were identified via searching the Cloning Kit (Vazyme Biotech). Then, these constructs were trans- antiSMASH database (antibiotics and Secondary Metabolite formed into Agrobacterium tumefaciens strain EHA105 by electro- Analysis Shell). The core genes were annotated using stand-alone poration. Leaves of 4-week-old Nicotiana benthamiana were BLAST (E-values1e10) against Swiss-Prot database. infiltrated with EHA105 strains harbouring indicated effectors using needleless syringes. GFP and Phytophothora sojae effector Avr1b 2.7. Secondary metabolites extraction and served as the negative control and positive control, respectively. Twenty-four hours later, plants were infiltrated with EHA105 quantification harbouring pVX-BAX (OD ¼ 0.4). Cell death symptoms were Cercosporin extraction and quantification were performed according evaluated and photographed at 72 h after pVX-BAX infiltration. to the method described by Shim and Dunkle. The pigments of Results are representatives of six biological replicates. C. sojina were induced in the complete medium (CM) with 20 mmol/l cyclic adenosine monophosphate (cAMP) or starvation treatments for 4daysat 28 C. The supernatants were collected and purified by the C18-SPE cartridge. The elutes in 40% methanol fraction were further 3. Results purified by a HPLC on C18 preparation column. The fractions with 3.1. Assembly of C. sojina genome grey, light yellow, and dark grey were further identified by a reverse Like most of the fungi, C. sojina showed similar infection cycle, phase HPLC with a PDA detector (Shimadzu Corporation, Japan). but it also demonstrated some distinctions (Supplementary Fig. S1). It does not form appressorium, but infects the plants by branched hy- 2.8. Transcriptome analysis and quantitative RT-PCR phae through open stomata (Supplementary Fig. S1C and D). 
For transcriptome analysis, the fungus was grown in minimal nutri- Compared with other hemibiotrophic fungi, FLS disease develop- ent medium (3 g NaNO ,1gK HPO , 0.5 g MgSO 7H O, 0.5 g ment is relatively slower (Supplementary Fig. S1C). In order to 3 2 4 4 2 KCl, 0.01 g FeSO , and 30 g Sucrose per litre) at 28 C for 6 days. investigate the infection mechanism, we therefore extracted the geno- For starvation treatment, the fungal mycelia was transferred to mini- mic DNA from the mycelia and sequenced the pathogen at genome mal nutrient medium lacking NaNO , then was harvested at 24 and level. 48 h, respectively. Each treatment had three biological replicates. The genome of C. sojina race 1 was assembled from the data gen- Library preparation and bioinformatics analysis were performed erated by a recently developed Single Molecule Real-Time (SMRT) according to the method of Yang et al. One microgram RNA per sequencing technique, an effective method to decode the difficulty to sample was subjected to RNA-seq library construction. The RNA- detect but important regions, such as non-coding regions and repeti- seq libraries were quantified using 2100 Bioanalyzer (Agilent, USA), tive elements, which can assist in obtaining gapless eukaryotic ge- and sequenced (paired-end, 100 bp each) by the Illumina genome an- nome sequence. alyzer (Hiseq 2000; Illumina, USA). Quantification of gene expres- Subreads distribution analyses confirm the high quality of the sion levels were estimated by fragments per kilobase of transcript per 10-kb library (Supplementary Fig. S2 and Table S1). The sequencing million fragments mapped. Differential expression analysis was per- data (4,920,479,113 bp clean reads) were de novo assembled using formed using the DESeq R package. CANU, leading to the generation of 62 contigs, with an N50 For quantitative RT-PCR, infected leaves were collected at indi- length of 1.59 Mb and a total assembly size around 40.84 Mb cated time points. 
Total RNA was extracted by Trizol method (Table 1 and Supplementary Table S2). Twenty-four largest scaffolds (Invitrogen). The cDNAs were synthesized using HiScript II Q RT were displayed by circos-plot (Fig. 1). A total of 11,655 protein- SuperMix kit with genomic DNA wiper (Vazyme Biotech). Reactions coding genes are predicted, in which the gene density is 285 genes TM were performed on CFX96 Real-time System (Bio-RAD) with the per 1 Mb. However, 277 tRNA and 281 pseudogenes are predicted Downloaded from https://academic.oup.com/dnaresearch/article-abstract/25/1/25/4161430 by Ed 'DeepDyve' Gillespie user on 16 March 2018 28 Cercospora sojina genome sequence reveals the potential infection mechanism Figure 1. Circos-plot of C. sojina. The largest 24 scaffolds of C. sojina are displayed by circos-plot (Mb scale). The circos from outside to inside are: (a) 24 largest scaffolds; (b) DNA methylations; (c) GC content; (d) carbohydrate enzymes; (e). putative effectors; (f) PHI-base genes; (g) duplicated genes; The DNA methyla- tions and GC contents are statistical results of 20 kb non-overlapping windows. The inner lines link duplicated genes. in the genome. Notably, 8,474 putative protein-coding genes were (6-methyl-adenosine) were identified in C. sojina genome (Fig. 2B). supported by the RNA-seq data. It is also worth noting that the ge- However, majority of methylation sites (8,453,041) were uncatego- nome size of C. sojina race 1 is much larger than C. sojina isolate S9, rized. Interestingly, most of the categorized DNA methylations are where the genome was assembled by 124 bp library and genome size m4C, accounting for 98.3%, whereas m6A only accounts for 1.7%. was estimated around 30.8 M. In consistent with the identified methylation sites, we detect multiple motifs that may be recognized by transmethylase specifically (Supplementary Table S3). However, compared with m6A, m4C DNA methylations occur with low frequency in the regions of repeti- 3.2. 
Repetitive elements and potential tive elements (Fig. 2C–E). methylation sites Repetitive DNA sequence and TEs play important roles in the evolu- tion, the genome structure, and gene functions of fungi. A total of 3.3. Comparative genomic analysis 11,138,239 bp (11 M) repeat sequences were identified in C. sojina genome, including DNA transposon, LTR retrotransposon, tandem The evolutionary relationship of C. sojina and other fungi species repeat sequence and other unclassified transposons (Fig. 2A). The re- was analysed using a group of phylogenetic backbone genes of the peat sequence accounts for 25.56% of the genome. Interestingly, the fungi. Phylogenetic analysis reveals that C. sojina is evolutionally majority of repetitive sequences (96.36%) are TEs, whereas the tan- close to Cercospora zeae-maydis, a plant pathogen that can cause dem repeat sequences just account for 0.93%. Notably, DNA trans- leaf spot disease on maize (Fig. 3A). In addition, C. sojina is also poson and LTR retrotransposons account for 28 and 25% of all close to the other three Dothideomycetes pathogen TEs, respectively. Pseudocercospora fijiensis, Sphaerulina musiva, and Dothistroma DNA methylation is involved in many important cell processes, septosporum (Fig. 3A). The C. sojina homologous proteins show an such as genomic imprinting and gene transcription regulation. average identity of 79.9, 65.9, 67.5, and 65.8% with that of C. zeae- Although DNA methylation has been found in higher plants and ani- maydis, P. fijiensis, S. musiva, and D. septosporum, respectively. mals for years, it is just reported in some fungi recently. Using Although C. sojina is evolutionarily distant from the non-plant SMRT, we were able to detect m6A and m4C methylation in particu- pathogen Aspergillus nidulans and Neurospora crassa, 47.18 and lar. In total, 1,015,733 m4C (4-methyl-cytosine) and 17,409 m6A 43.07% proteins of C. sojina have homologues in A. 
nidulans and Downloaded from https://academic.oup.com/dnaresearch/article-abstract/25/1/25/4161430 by Ed 'DeepDyve' Gillespie user on 16 March 2018 X. Luo et al. 29 Figure 2. Repeat elements and DNA methylation sites of C. sojina. (A) The percentage of different types of repetitive sequences in the C. sojina genome. (B) Statistic analysis of candidate DNA methylation sites from primary sequence of C. sojina genome. m4C, m6A, and the unidentified represent 4-methyl-cytosine, 6-methyl-adenosine, and the unidentified methylation sites, respectively. (C) Distribution of repetitive elements and different types of DNA methylations in scaffold 1of C. sojina. Black histogram indicates the distribution of repetitive elements. All data are statistical results of 20kb windows. Asterisks indicate regions with high and low frequency of DNA methylations, respectively. (D, E) The number of different type of methylations per 20kb was calculated in total genome and repetitive el- ements of C. sojina. N. crassa, respectively. Therefore, we examined the potential gene from current databases (Supplementary Fig. S3A). Go enrichment family expansions in C. sojina. The results show that 5,652 genes ex- analysis of the 1,675 annotated genes reveals that most of the genes clusively exist in C. sojina genome but not in A. nidulans and are involved in metabolic process, biosynthetic process, and response N. crassa. However, near 70% of these genes cannot be annotated to stresses or stimuli (Supplementary Fig. S3B). Notably, these genes Downloaded from https://academic.oup.com/dnaresearch/article-abstract/25/1/25/4161430 by Ed 'DeepDyve' Gillespie user on 16 March 2018 30 Cercospora sojina genome sequence reveals the potential infection mechanism Figure 3. Phylogenetic and synteny analysis of C. sojina with other fungal species. (A) Phylogenetic tree of different fungal species. The UPGMA phylogenetic tree was constructed based on the consistent phylogenetic backbone genes of the fungi. 
The number represents branch lengths. (B) Synteny of Cercospora zeae-maydis, C. sojina, and Pseudocercospora fijiensis. Rectangle boxes represent order of gene models. Non-coding regions are not depicted. are predicted to have binding function (ion binding or protein bind- (Supplementary Fig. S5 and Table S4). Domain calling analysis reveals ing), hydrolase activity, transferase activity or oxidoreductase activ- that more than 7,500 types of domains exist in C. sojina proteome. ity (Supplementary Fig. S3C). As C. sojina is evolutionarily distant The arsenal of potentially secreted proteins were predicted, and proteins from the non-plant pathogens, we speculate that the gene family ex- containing a signal peptide, but lacking transmembrane domain and pansion, a common event in the evolution of phytopathogenic glycosylphosphatidylinositol (GPI) modification site were considered as 52,53 fungi, might have occurred in C. sojina genome, which eventu- secreted proteins. A combination of software tools for the prediction of 54 55 ally makes C. sojina be a plant pathogen. transmembrane domain, signal peptide motifs, and GPI modifica- In addition, synteny analysis of C. sojina genome with the other tion site indicate that C. sojina has similar number of potential secreted three genomes of Dothideomycetes spp., the C. zeae-maydis, proteins (750) when compared with other fungal species. P. fijiensis, and Mycosphaerella graminicola, reveals that C. sojina Pathogen-secreted effectors play critical roles in facilitating genome displays different synteny with those fungi (Fig. 3B and the proliferation of pathogens, often by suppressing plant im- Supplementary Fig. S4). Of all the sequenced genomes, C. zeae-may- mune system. In total, 233 proteins are predicted as the putative dis shows highest synteny with C. sojina. For example, scaffolds 2, 4, small (400 amino acids) cysteine-rich (4 cysteine residues) pro- and 5 of C. zeae-maydis correspond well with the scaffold 1 of teins. 
Through domain calling analysis, 205 functional motifs and C. sojina, and Scaffold 3 and 10 show well syntney to the scaffold 3 domains were found in 141 putative effectors, including 60 effectors of C. sojina (Supplementary Fig. S4A). Importantly, we observed with multiple domains. Notably, the most abundant domain is that the 24 largest scaffolds, which accounts for 91.2% of C. sojina PF14295.4 (n¼ 6), which mediates proteinprotein interactions. genome, show very high synteny with the 13 core chromosomes of Other common domains include abhydrolase domain (PF12697.5, n M. graminicola, but not with the rest 8 dispersed chromosomes 5), Hydrolase domain (PF12146.6, n¼ 5), and PAN domain (Supplementary Fig. S4C), indicating that C. sojina shares the con- (PF00024.24, n¼ 5). served and core genes of Dothideomycetes. 3.5. Up-regulation of pathogenicity-related genes by 3.4. The secretome and potential effectors whole genome transcription assays Thegenomeof C. sojina contains 11,655 protein-coding genes, cover- Because the infection progress of C. sojina on soybean is very slow, it ing approximately 41% sequence of the genome (Table 1). Among is difficult to collect enough samples to examine the gene expression them, a total of 9,506 genes were annotated using multiple databases of the in planta hyphae. However, starvation treatments could mimic Downloaded from https://academic.oup.com/dnaresearch/article-abstract/25/1/25/4161430 by Ed 'DeepDyve' Gillespie user on 16 March 2018 X. Luo et al. 31 the physiology of pathogen during infection. Therefore, we used putative PKS that are responsible for pigment production (Fig. 5A the mycelia that were grown in nutrient-limited culture for 24 and and Supplementary Table S7). We also found that C. 
sojina can pro- 48 h, and performed transcriptome analysis by RNA sequencing; duce some grey pigments, and the pigment was significantly induced 3,227 and 3,223 differentially expressed genes (DEGs) were identi- by both starvation and cAMP treatments (Fig. 5B), suggesting that fied at 24 and 48 h post-starvation treatment (hpt), respectively the pigments may be related to pathogen virulence. Therefore, we (Supplementary Fig. S6). A total of 4,051 DEGs were identified dur- further isolated and partially purified the pigments. Three major ing starvation treatment, and 2,399 genes were differentially ex- components, the grey, the light yellow, and the dark grey pigments pressed at both 24 and 48 hpt. Of all the DEGs, 1,530 and 1,508 were obtained (Supplementary Fig. S9), and the dark grey pigment is genes were upregulated, while 1,697 and 1,715 genes were downre- the most abundant one. gulated at 24 and 48 hpt, respectively (Supplementary Fig. S6). Notably, four classes of DEGs caused our attention. These genes 3.7. Carbohydrate-active enzymes are annotated to be involved in PHI, secretome, putative carbohydrate-active enzymes (CAZymes), and secondary metabolic Successful phytopathogenic fungi can break down and utilize the processes. First, 1,036 PHI genes are differentially expressed after plant cell wall polysaccharides by CAZymes. Cercospora sojina har- starvation treatment (Supplementary Fig. S7A). A total of 591 PHI bours 596 predicted CAZymes (Supplementary Table S9). genes are significantly upregulated, demonstrating the important Compared with other fungi in Dothideomycetes, C. sojina has a roles of these genes in responding to stimulus. Second, 260 secreted larger group of potential carbohydrate esterases, which can catalyze 60,61 protein-coding genes, including 81 effectors, are differentially ex- the O-de- or N-deacylation of substituted saccharides pressed (Supplementary Fig. S7B). There is 62.5% (50/80) effector- (Supplementary Table S9). In C. 
sojina genome, there are around coding gene expression being significantly upregulated, and some ef- 23.5% potential secreted proteins (177/752) that were predicted as fectors with conserved domains, such as Wall Stress-responsive CAZyme, demonstrating that C. sojina may employ a large group of Component domain, glycoside hydrolase, fungal hydrophobin, cuti- CAZymes to digest host cell walls during invasion. nase, leucine rich repeat or peptidase domains, may play critical roles Interestingly, one of the families of CAZymes, the glycoside hy- in fungal pathogenicity (Supplementary Table S5). Third, 198 drolase GH109 family, is highly enriched (Fig. 6). The biochemical CAZymes were differentially expressed. Interestingly, almost half of function of GH109 family is proved to be a-N-acetylgalactosamini- them are glycoside hydrolases (87/198), implying their essential roles dase (aNAGAL), which can cleave the terminal alpha-linked in early infection (Supplementary Fig. S7C). Further, the secondary N-acetylgalactosamine epitope of blood group A. However, it has metabolism-related genes, including 5 PKS and 16 NRPS/NRPS-like been shown that soybean lectin can specifically bind N-acetyl galac- genes are significantly upregulated (Supplementary Fig. S7D). These tosamine, a component of fungal cell wall. These data suggest that genes are likely involved in mycotoxin biosynthesis in fungi. aNAGAL may be able to compete with lectin to bind N-acetylgalac- Therefore, the increased expression of PKS and NRPS/NRPS-like tosamine. We find that C. sojina encodes 14 putative GH109 genes, genes suggests that these genes may play essential roles in myco- which is more than most of fungi, such as M. oryzae, B. cinerea, and toxins biosynthesis (Supplementary Table S6). N. crassa (Supplementary Table S10). These data suggest that the ex- pansion of GH109 family in C. sojina may contribute to overcome lectin-mediated resistance in soybean. 3.6. 
Gene clusters for secondary metabolites Of the annotated carbohydrate esterase genes, family CE1 and Cercospora sojina genome encodes 16 non-ribosomal peptide syn- CE10 are two major subfamilies (Fig. 6). Both families encode pro- thetases (NRPS), 20 PKS, 18 fatty acid synthases, 3 terpene syn- teins with the common activities of carboxylesterase and endo-1,4-b- thases, 2 geranylgeranyl diphosphate synthases, and 1 terpenoid xylanase. Besides, 34 potential carbohydrate-binding module cyclases (Supplementary Tables S7 and S8). These enzymes are in- (CBM) proteins were identified in C. sojina genome (Supplementary volved in synthesis of secondary metabolites, including mycotoxins, Table S9), which can digest carbohydrate complex extracellularly. pigments, and alkaloids. Annotation of secondary metabolite biosyn- Unexpectedly, only one CBM protein CBM1 was found in C. sojina thesis genes shows that C. sojina lacks PKS-NRPS hybrids, PKS-like genome (Supplementary Table S9). However, many phytopathogenic proteins, and dimethylallyl tryptophan synthases (Supplementary fungi carry plenty of CBM1. For example, Verticillium dahlia has an Tables S7 and S8). expansion of CBM1 containing protein family (30 genes). In the Cercospora genus, most of the species can produce a non- specific mycotoxin cercosporin. However, it has been disputed that if 3.8. Functional analysis of putative effectors C. sojina produces cercosporin. Nevertheless, we identified a similar gene cluster with eight cercosporin biosynthesis genes in C. sojina ge- Effectors are low molecular weight proteins that are secreted by bac- nome (Fig. 4A). These eight genes display high amino acid sequence teria, oomycetes or fungi to impair the host immune defence and to similarity to C. nicotianae cercosporin biosynthesis genes in the same adapt to specific environment. In the maize pathogen U. maydis,it tandem order (Fig. 4A). 
Furthermore, we observed the increased was observed that most of the genes in secreted protein-coding transcription of the eight genes during infection (Fig. 4B). These data gene clusters were induced simultaneously in infected tissue. imply that C. sojina may produce cercosporin during infection. Phylogenetic analysis for 233 putative effectors reveals that 21 pre- However, we were unable to detect the cercosporin in either cultured dicted effectors are grouped into clusters that contain 2 or 3 high se- mycelium or infected plant tissue according to the method that was quence similarity genes in C. sojina (Fig. 7A). Clusters of putative used in other Cercospora species (Supplementary Fig. S8). effectors also suggest that local duplications might be involved in ex- Pigments are the other important group of secondary metabolites pansion of effectors in C. sojina. In addition, 40 putative effectors for successful invasion of pathogens. Generally, pathogen-produced can be annotated by PHI database (Supplementary Table S11), and pigment is able to protect pathogen from host oxidative stress during most of the annotated effectors have been implicated in fungal infection. We found that C. sojina genome encodes multiple pathogenesis. Downloaded from https://academic.oup.com/dnaresearch/article-abstract/25/1/25/4161430 by Ed 'DeepDyve' Gillespie user on 16 March 2018 32 Cercospora sojina genome sequence reveals the potential infection mechanism Figure 4. Putative gene clusters for cercosporin biosynthesis in C. sojina. (A) The eight cercosporin toxin biosynthetic genes in C. sojina genome. The genes in- volved in cercosporin biosynthesis of Cercospora nicotianae (C. nicotianae) were blasted against C. sojina using BlastP. The black arrow represents the direc- tion of sense strand. (B) Expression of candidate genes involved in cercosporin biosynthesis at 48 h after C. sojina infection. The mRNA levels were detected using qRT-PCR. 
Values are means ± SD (n = 3 biological replicates).

Next, we investigated effector function. Programmed cell death (PCD) induced on N. benthamiana by the pro-apoptotic mouse protein BAX physiologically resembles the defence-associated hypersensitive response caused by pathogens, providing a valuable screening approach for effectors that can suppress defence-related PCD. We randomly selected 50 effectors and transiently expressed them in N. benthamiana to screen for effectors that can suppress BAX-triggered PCD (BT-PCD). Our results show that about 1/4 (13/50) of the selected effectors strongly suppress BT-PCD (Fig. 7B). Moreover, qRT-PCR results show that most of these putative effectors are transcriptionally induced at 48 h after C. sojina infection (Fig. 7C), implying that they probably contribute to early infection on soybean.

4. Discussion

Cercospora species cause severe leaf spot and blight diseases on many crops worldwide. In this study, we sequenced the genome of the economically important fungus C. sojina race 1. The genome size is 40.84 Mb, and 25.56% of the genome is composed of repeat sequences. However, compared with the sequenced but not annotated C. sojina isolate obtained by the NGS method (IMG genome id: 2506520004, https://img.jgi.doe.gov/cgi-bin/m/main.cgi), C. sojina race 1 has a larger genome size (Table 1 and Supplementary Fig. S12). In particular, the C. sojina genome assembled by SMRT contains 10 Mb of repetitive sequences that are not detected by other sequencing techniques.

DNA methylation is an important research area of epigenetics in both eukaryotes and prokaryotes. The most studied type of DNA methylation in fungi is m5C, while in prokaryotes it is m4C and m6A. Only recently were the methylase and demethylase for m6A in eukaryotes identified. However, in fungi, this type of DNA modification is poorly studied. Using SMRT, we identified m6A and m4C in C. sojina for the first time, finding 17,407 m6A and 1,015,733 m4C sites in the C. sojina genome (Fig. 2B). The m6A frequency in this fungus is 435.2/Mb, which is slightly more than the 341.3/Mb in yeast. In bacteria, m6A is regarded as an epigenetic signal for DNA–protein interactions. Surprisingly, we also observe that C. sojina contains large numbers of m4C, which to our knowledge has only been reported in bacteria. Interestingly, the m4C modification occurs with lower frequency in the regions of repetitive elements in the C. sojina genome (Fig. 2C and D), indicating that m4C may be involved in the transposition of the transposons. Our results also demonstrate that SMRT is a powerful tool for studying fungal genome epigenetic modification.

It is worth noting that one of the enriched families is the glycoside hydrolase GH109 family (Fig. 6). This gene family encodes α-N-acetylgalactosaminidase (αNAGAL), an enzyme that can cleave N-acetylgalactosamine from conjugated proteins. Interestingly, it has been found that lectin can specifically bind N-acetylgalactosamine. It has also been found that soybean lectin can bind N-acetylgalactosamine, a component of the fungal cell wall. Therefore, αNAGAL may compete with soybean lectin to bind N-acetylgalactosamine. It is known that soybean lectin and other plant lectins can inhibit hyphal growth and spore germination by binding cell wall components in several fungi, such as Penicillia and Aspergilli species [71,72]. We hypothesize that the expansion of the GH109 family in the C. sojina genome may contribute to overcoming the lectin-mediated disease resistance in soybean.

Figure 5. Putative PKS gene clusters for pigment production in C. sojina. (A) Genes encoding methyltransferases, cytochrome P450s, oxidoreductases, dehydrogenases, acyltransferases, MFS transporters, and transcriptional factors were clustered with PKS genes. These clusters are responsible for pigment synthesis. The black arrows represent the direction of the sense chain. The orange arrows highlight PKS genes. The blue and grey arrows represent putative pigment biosynthesis genes and other genes that are not involved in pigment biosynthesis. (B) Pigments produced by C. sojina in different media. The supernatants were collected at 4 days after treatments on mycelia. Czapek medium served as a negative control. Results are representative of three biological replicates. PDB and CM represent potato-dextrose broth medium and complete medium, respectively.

Figure 6. Comparison of carbohydrate enzymes between C. sojina and nine other fungal species. Nine species were selected to compare with C. sojina. Pi, Phytophthora infestans; Ps, Phytophthora sojae; Nc, Neurospora crassa; Bc, Botrytis cinerea; Cm, Cercospora zeae-maydis; Cs, Cercospora sojina; An, Aspergillus nidulans; Vd, Verticillium dahliae; Mo, Magnaporthe oryzae; Fg, Fusarium graminearum. GH, glycoside hydrolase; GT, glycosyltransferase; PL, polysaccharide lyase; CE, carbohydrate esterase; AA, auxiliary activity family; CBM, carbohydrate-binding module family. The numbers of gene families were normalized by Z score.

CBMs are the most common non-catalytic modules associated with enzymes active in plant cell-wall hydrolysis. They can increase enzyme efficiency by anchoring the enzyme's catalytic region to insoluble cellulose. However, only one CBM1 protein was found in the C. sojina genome (Supplementary Table S9). In light of their relative
slower infection process, the deficiency of CBMs in the genome may undermine C. sojina infection in terms of digesting plant cell walls.

It is believed that mycotoxins play a critical role during pathogen infection. The mycotoxin cercosporin produced by Cercospora spp. is considered to be one of the key factors that enhance their virulence, as pathogenicity was remarkably impaired in cercosporin-deficient mutants. However, C. sojina is one of the few Cercospora spp. reported not to produce cercosporin, although this is disputed. We made an effort to examine cercosporin in both cultured mycelium and infected plant tissue. However, we were unable to detect cercosporin in any of the samples, although the complete gene cluster for cercosporin biosynthesis exists in the C. sojina genome (Fig. 4 and Supplementary Fig. S8). Therefore, our data imply that C. sojina may employ not cercosporin but another mycotoxin to enhance virulence.

Fungi-derived pigments can act as virulence factors to facilitate infection in plants, and are required for pathogen fitness by serving as UV protectants and ROS scavengers. The best-studied fungal pigment is melanin, which contributes to fungal pathogenesis by altering cytokine responses, decreasing phagocytosis, and scavenging ROS, as well as playing an important role in reinforcing the fungal cell wall. As with melanin, we observed that some pigment production was induced by cAMP or starvation treatments (Fig. 5B). We also identified the key gene clusters that encode PKS in the C. sojina genome (Fig. 5A and Table S7). Moreover, the whole genome transcriptome assays also demonstrate that the key genes involved in pigment biosynthesis are significantly up-regulated. These data further support the assumption that the pigments produced by C. sojina are involved in its virulence.

In addition to secondary metabolites, pathogens usually harbour various virulence effectors. These effectors interfere with host immune responses to enhance virulence. For example, the bacterial pathogen Pseudomonas syringae delivers over 30 effectors by the type III secretion system during infection. Our work showed that more than one third of the effectors were upregulated during starvation (Supplementary Fig. S7B), and many of them can suppress BAX-induced cell death (Fig. 7B), similar to the finding in Phytophthora sojae. These data suggest that C. sojina can probably deploy effectors to promote infection.

In summary, we report a complete genome sequence of C. sojina obtained by the SMRT sequencing method. This sequencing method helps us not only to find the repetitive elements, but also to discover DNA methylation in the fungus. Based on the genome assembly and annotation, we hypothesize that the specific CAZymes, secondary metabolites, and effectors can help C. sojina to adapt to soybean successfully. Our work also lays the groundwork for future discoveries on this important soybean disease.

Figure 7. Functional analysis of putative C. sojina effectors. (A) Phylogenetic analysis of putative effectors. The phylogenetic tree of 233 putative effectors is categorized into six super clades. (B) Selected C. sojina effectors suppress BAX-induced programmed cell death in N. benthamiana. N. benthamiana leaves were infiltrated with agrobacterium carrying C. sojina effector genes (OD600 = 0.4). GFP and Phytophthora sojae effector Avr1b were used as the negative and the positive controls, respectively. Twenty-four hours later, plants were infiltrated with pVX-BAX (OD600 = 0.4), then the photos were taken 3 days later. Results are representative of six biological replicates. (C) The effectors that can suppress BAX-induced PCD are significantly up-regulated at 48 hpi. Soybean leaves were inoculated with C. sojina.
Samples were col- lected at 0 and 48 hpi, respectively. Quantitative RT-PCR (qRT–PCR) were used to determine gene expression levels. Values are mean6 SD (n¼ 3 biological replicates). Supplementary data Funding Supplementary data are available at DNARES online. The study was supported by Chinese Academy of Sciences (Strategic Priority Research Program Grant NO. XDB11020300), the National Natural Science Foundation of China (Grant 31570252 and Grant 31500220), and by the grant from the State Key Laboratory of Plant Genomics (Grant No. O8KF021011). Authors’ contributions J.L. and S.M. conceived the research plans; X.L., J.C., and J.H. per- Conflict of interest formed the experiments; Z.W., Z.G., and Y.C. provided technical as- sistance; X.L., J.C., and J.L. wrote the article. None declared. Downloaded from https://academic.oup.com/dnaresearch/article-abstract/25/1/25/4161430 by Ed 'DeepDyve' Gillespie user on 16 March 2018 36 Cercospora sojina genome sequence reveals the potential infection mechanism diversity of Cercospora sojina populations on soybean from Arkansas: Accession number evidence for potential sexual reproduction, Phytopathology, 103, NFUF00000000 1045–51. 19. Kim, J.S., Seo, S.G., Jun, B.K., Kim, J.W. and Kim, S.H. 2010, Simple and reliable DNA extraction method for the dark pigmented fungus, Cercospora sojina, Plant Pathol. J., 26, 289–92. Data availability 20. Chin, C.S., Alexander, D.H., Marks, P., et al. 2013, Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data, This Whole Genome Shotgun project has been deposited at DDBJ/ Nat. Methods, 10, 563–9. ENA/GenBank under the accession NFUF00000000. The version de- 21. Berlin, K., Koren, S., Chin, C.S., Drake, J.P., Landolin, J.M. and scribed in this paper is version NFUF01000000. Phillippy, A.M. 2015, Assembling large genomes with single-molecule se- quencing and locality-sensitive hashing, Nat. Biotechnol., 33, 623–30. 22. Luo, R., Liu, B., Xie, Y., et al. 
2012, SOAPdenovo2: an empirically References improved memory-efficient short-read de novo assembler, Gigascience, 1, 18. 23. Ter-Hovhannisyan, V., Lomsadze, A., Chernoff, Y.O. and Borodovsky, 1. Mian, M.A.R., Missaoui, A.M., Walker, D.R., Phillips, D.V. and Boerma, M. 2008, Gene prediction in novel fungal genomes using an ab initio algo- H.R. 2008, Frogeye leaf spot of soybean: a review and proposed race des- rithm with unsupervised training, Genome Res., 18, 1979–90. ignations for isolates of Cercospora sojina Hara, Crop Sci., 48, 14–24. 24. Salamov, A.A. and Solovyev, V.V. 2000, Ab initio gene finding in 2. Gupta, D.K., Singh, M., Singh, G. and Srivastava, L.S. 1994, Sources of Drosophila genomic DNA, Genome Res., 10, 516–22. resistance in soybean (Glycine-max) to frog-eye leaf-spot caused by 25. Delcher, A.L., Harmon, D., Kasif, S., White, O. and Salzberg, S.L. 1999, Cercosporidium-sojinum, Indian J. Agric. Sci., 64, 886–7. Improved microbial gene identification with GLIMMER. Nucleic Acids 3. Soares, A.P.G., Guillin, E.A., Borges, L.L., et al. 2015, More Cercospora Res, 27, 4636–41. species infect soybeans across the Americas than meets the eye, PloS One, 26. Haas, B.J., Zeng, Q., Pearson, M.D., Cuomo, C.A. and Wortman, J.R. 10, e0133495. 2011, Approaches to fungal genome annotation, Mycology, 2, 118–41. 4. Mian, R., Bond, J., Joobeur, T., et al. 2009, Identification of soybean ge- 27. Haas, B.J., Papanicolaou, A., Yassour, M., et al. 2013, De novo transcript notypes resistant to Cercospora sojina by field screening and molecular sequence reconstruction from RNA-seq using the Trinity platform for ref- markers, Plant Dis., 93, 408–11. erence generation and analysis, Nat. Protoc., 8, 1494–512. 5. Galloway, J. 2008, Effective management of soybean rust and frogeye leaf 28. Kanehisa, M., Goto, S., Kawashima, S., Okuno, Y. and Hattori, M. 
2004, spot using a mixture of flusilazole and carbendazim, Crop Prot., 27, The KEGG resource for deciphering the genome, Nucleic Acids Res., 32, 566–71. D277–80. 6. Zhang, G.R., Pedersen, D.K., Phillips, D.V. and Bradley, C.A. 2012, 29. Tatusov, R.L., Galperin, M.Y., Natale, D.A. and Koonin, E.V. 2000, The Sensitivity of Cercospora sojina isolates to quinone outside inhibitor fun- COG database: a tool for genome-scale analysis of protein functions and gicides, Crop Prot., 40, 63–8. evolution, Nucleic Acids Res., 28, 33–6. 7. Ma, S.M., and Li B.Y. 1997, Primary report on the identification for phys- 30. Finn, R.D., Bateman, A., Clements, J., et al. 2014, Pfam: the protein fami- iological races of Cercospora sojina Hara in northeast China, Acta lies database, Nucleic Acids Res., 42, D222–30. Phytopathol. Sin., 27, 180. 31. Eddy, S.R. 1998, Profile hidden Markov models, Bioinformatics, 14, 8. Daub, M.E. and Ehrenshaft, M. 2000, The photoactivated Cercospora 755–63. toxin cercosporin: contributions to plant disease and fundamental biol- 32. Urban, M., Cuzick, A., Rutherford, K., et al. 2017, PHI-base: a new inter- ogy, Annu. Rev. Phytopathol., 38, 461–90. face and further additions for the multi-species pathogen–host interactions 9. Goodwin, S.B., Dunkle, L.D. and Zismann, V.L. 2001, Phylogenetic anal- database, Nucleic Acids Res., 45, D604–10. ysis of cercospora and mycosphaerella based on the internal transcribed 33. Gotz, S., Arnold, R., Sebastian-Leon, P., et al. 2011, B2G-FAR, a spacer region of ribosomal DNA, Phytopathology, 91, 648–58. species-centered GO annotation repository, Bioinformatics, 27, 919–24. 10. Dean, R.A., Talbot, N.J., Ebbole, D.J., et al. 2005, The genome sequence 34. Benson, G. 1999, Tandem repeats finder: a program to analyze DNA se- of the rice blast fungus Magnaporthe grisea, Nature, 434, 980–6. quences, Nucleic Acids Res., 27, 573–80. 11. Wang, R., Ning, Y., Shi, X., et al. 2016, Immunity to rice blast disease by 35. 
Blow, M.J., Clark, T.A., Daum, C.G., et al. 2016, The epigenomic land- suppression of effector-triggered necrosis. Curr. Biol., 26, 2399–411. scape of prokaryotes, PloS Genet., 12, e1005854. 12. Park, C.H., Chen, S., Shirsekar, G., et al. 2012, The Magnaporthe oryzae 36. Nordberg, H., Cantor, M., Dusheyko, S., et al. 2014, The genome portal effector AvrPiz-t targets the RING E3 Ubiquitin Ligase APIP6 to suppress of the Department of Energy Joint Genome Institute: 2014 updates, pathogen-associated molecular pattern–triggered immunity in rice. Plant Nucleic Acids Res., 42, D26–31. Cell, 24, 4748–62. 37. Ebersberger, I., Simoes, R.D., Kupczok, A., et al. 2012, A consistent phy- 13. Li, X., Gao, C., Li, L., et al. 2017, MoEnd3 regulates appressorium for- logenetic backbone for the fungi, Mol. Biol. Evol., 29, 1319–34. mation and virulence through mediating endocytosis in rice blast fungus 38. Edgar, R.C. 2004, MUSCLE: multiple sequence alignment with high accu- Magnaporthe oryzae. PLoS Pathogens, 13, e1006449. racy and high throughput, Nucleic Acids Res., 32, 1792–7. 14. Qi, Z., Wang, Q., Dou, X., et al. 2012, MoSwi6, an APSES family tran- 39. Kumar, S., Stecher, G. and Tamura, K. 2016, MEGA7: molecular evolu- scription factor, interacts with MoMps1 and is required for hyphal and tionary genetics analysis version 7.0 for bigger datasets, Mol. Biol. Evol., conidial morphogenesis, appressorial function and pathogenicity of 33, 1870–4. Magnaporthe oryzae. Mol. Plant Pathol., 13, 677–89. 40. Nix, D.A. and Eisen, M.B. 2005, GATA: a graphic alignment tool for 15. Huang, K., Czymmek, K.J., Caplan, J.L., et al. 2011, HYR1-mediated de- comparative sequence analysis, BMC Bioinformatics, 6,9. toxification of reactive oxygen species is required for full virulence in the 41. Mistry, J., Finn, R.D., Eddy, S.R., Bateman, A. and Punta, M. 2013, rice blast fungus. PloS Pathogens, 7, e1001335. Challenges in homology search: HMMER3 and convergent evolution of 16. 
Zhou, T.T., Zhao, Y.L., and Guo, H.S. 2017, Secretory proteins are deliv- coiled-coil regions, Nucleic Acids Res., 41, e121. ered to the septin-organized penetration interface during root infection by 42. Yin, Y., Mao, X., Yang, J., Chen, X., Mao, F. and Xu, Y. 2012, dbCAN: Verticillium dahliae. PloS Pathol., 13, e1006275. a web resource for automated carbohydrate-active enzyme annotation, 17. Zhao, Y.L., Zhou, T.T., and Guo, H.S. 2016, Hyphopodium-specific 2þ Nucleic Acids Res., 40, W445–51. VdNoxB/VdPls1-dependent ROS-Ca signaling is required for plant in- 43. Khaldi, N., Seifuddin, F.T., Turner, G., et al. 2010, SMURF: genomic fection by Verticillium dahliae. PloS Pathol., 12, e1005793. mapping of fungal secondary metabolite clusters, Fungal Genet. Biol., 47, 18. Kim, H., Newell, A.D., Cota-Sieckmeyer, R.G., Rupe, J.C., Fakhoury, 736–41. A.M. and Bluhm, B.H. 2013, Mating-type distribution and genetic Downloaded from https://academic.oup.com/dnaresearch/article-abstract/25/1/25/4161430 by Ed 'DeepDyve' Gillespie user on 16 March 2018 X. Luo et al. 37 44. Medema, M.H., Blin, K., Cimermancic, P., et al. 2011, antiSMASH: rapid 61. Aurilia, V., Parracino, A. and D’Auria, S. 2008, Microbial carbohydrate identification, annotation and analysis of secondary metabolite biosynthe- esterases in cold adapted environments, Gene, 410, 234–40. sis gene clusters in bacterial and fungal genome sequences, Nucleic Acids 62. Liu, Q.P., Sulzenbacher, G., Yuan, H., et al. 2007, Bacterial glycosidases Res., 39, W339–46. for the production of universal red blood cells, Nat. Biotechnol., 25, 45. Shim, W.B. and Dunkle, L.D. 2002, Identification of genes expressed dur- 454–64. ing cercosporin biosynthesis in Cercospora zeae-maydis, Physiol. Mol. 63. Benhamou, N. and Ouellette, G.B. 1986, Ultrastructural-localization of Plant P., 61, 237–48. glycoconjugates in the fungus ascocalyx-abietina, the scleroderris canker 46. Yang, C., Li, W., Cao, J., et al. 
2017, Activation of ethylene signaling agent of conifers, using lectin gold complexes, J. Histochem. Cytochem., pathways enhances disease resistance by regulating ROS and phytoalexin 34, 855–67. production in rice, Plant J., 89, 338–53. 64. Zhao, Z.T., Liu, H.Q., Wang, C.F. and Xu, J.R. 2013, Comparative anal- 47. Zeng, F., Wang, C., Zhang, G., Wei, J., Bradley, C.A. and Ming, R. 2017, ysis of fungal genomes reveals different plant cell wall degrading capacity Draft genome sequence of Cercospora sojina isolate S9, a fungus causing in fungi, BMC Genomics, 14, 274. frogeye leaf spot (FLS) disease of soybean. Genomics Data, 12, 79–80. 65. Klosterman, S.J., Subbarao, K.V., Kang, S.C., et al. 2011, Comparative 48. Bodega, B. and Orlando, V. 2014, Repetitive elements dynamics in cell identity genomics yields insights into niche adaptation of plant vascular wilt path- programming, maintenance and disease, Curr. Opin. Cell Biol., 31, 67–73. ogens, PloS Pathog., 7, e1002137. 49. Jaenisch, R. and Bird, A. 2003, Epigenetic regulation of gene expression: 66. Kamper, J., Kahmann, R., Bolker, M., et al. 2006, Insights from the ge- how the genome integrates intrinsic and environmental signals, Nat. nome of the biotrophic fungal plant pathogen Ustilago maydis, Nature, Genet., 33, 245–54. 444, 97–101. 50. Ohm, R.A., Feau, N., Henrissat, B., et al. 2012, Diverse lifestyles and 67. Wang, Q.Q., Han, C.Z., Ferreira, A.O., et al. 2011, Transcriptional pro- strategies of plant pathogenesis encoded in the genomes of eighteen gramming and functional interactions within the Phytophthora sojae Dothideomycetes fungi, PloS Pathog., 8, e1003037. RXLR effector repertoire, Plant Cell, 23, 2064–86. 51. Flusberg B.A., Webster D.R., Lee J.H., et al. 2010, Direct detection of 68. Dubey, A. and Jeon, J. 2016, Epigenetic regulation of development and DNA methylation during single-molecule, real-time sequencing. Nature pathogenesis in fungal plant pathogens, Mol. Plant Pathol. doi: Methods, 7, 461–5. 
10.1111/mpp.12499. 52. Benson, J.M., Poland, J.A., Benson, B.M., Stromberg, E.L. and Nelson, 69. Ye, P.H., Luan, Y.Z., Chen, K.N., Liu, Y.Z., Xiao, C.L. and Xie, Z. 2017, R.J. 2015, Resistance to gray leaf spot of maize: genetic architecture and MethSMRT: an integrative database for DNA N6-methyladenine and mechanisms elucidated through nested association mapping and N4-methylcytosine generated by single-molecular real-time sequencing, near-isogenic line analysis, PloS Genet., 11, e1005045. Nucleic Acids Res., 45, D85–9. 53. Goodwin, S.B., Ben M’Barek, S., Dhillon, B., et al. 2011, Finished genome 70. Wion, D. and Casadesus, J. 2006, N6-methyl-adenine: an epigenetic of the fungal wheat pathogen Mycosphaerella graminicola reveals dispen- signal for DNA-protein interactions, Nat. Rev. Microbiol., 4, 183–92. some structure, chromosome plasticity, and stealth pathogenesis, PloS 71. Barkai-golan, R., Mirelman, D. and Sharon, N. 1978, Studies on growth inhi- Genet., 7, e1002070. bition by lectins of Penicillia and Aspergilli. Arch. Microbiol., 116, 119–24. 54. Krogh, A., Larsson, B., von Heijne, G. and Sonnhammer, E.L.L. 2001, 72. Guo, P., Wang, Y., Zhou, X., et al. 2013, Expression of soybean lectin in Predicting transmembrane protein topology with a hidden Markov model: transgenic tobacco results in enhanced resistance to pathogens and pests. application to complete genomes, J. Mol. Biol., 305, 567–80. Plant Science, 211, 17–22. 55. Petersen, T.N., Brunak, S., von Heijne, G. and Nielsen, H. 2011, SignalP 73. Boraston, A.B., Bolam, D.N., Gilbert, H.J. and Davies, G.J. 2004, 4.0: discriminating signal peptides from transmembrane regions, Nat. Carbohydrate-binding modules: fine-tuning polysaccharide recognition, Methods, 8, 785–6. Biochem. J., 382, 769–81. 56. Soanes, D.M., Alam, I., Cornell, M., et al. 2008, Comparative genome 74. Receveur, V., Czjzek, M., Schulein, M., Panine, P. and Henrissat, B. 
analysis of filamentous fungi reveals gene family expansions associated 2002, Dimension, shape, and conformational flexibility of a two domain with fungal pathogenesis, PloS One, 3, e2300. fungal cellulase in solution probed by small angle X-ray scattering, J. Biol. 57. Koeck, M., Hardham, A.R. and Dodds, P.N. 2011, The role of effectors of Chem., 277, 40887–92. biotrophic and hemibiotrophic fungi in infection, Cell Microbiol., 13, 1849–57. 75. Yu, J.H. and Keller, N. 2005, Regulation of secondary metabolism in fila- 58. Wang, Y., Wu, J., Park, Z.Y., et al. 2011, Comparative secretome investi- mentous fungi, Annu. Rev. Phytopathol., 43, 437–58. gation of magnaporthe oryzae proteins responsive to nitrogen starvation, 76. Nosanchuk, J.D., Stark, R.E. and Casadevall, A. 2015, Fungal melanin: J. Proteome Res., 10, 3136–48. what do we know about structure? Front Microbiol., 6, 1463. 59. Liu, G.Y. and Nizet, V. 2009, Color me bad: microbial pigments as viru- 77. Xu, N., Luo, X., Li, W., Wang, Z., and Liu, J. 2017, The bacterial effector lence factors, Trends Microbiol., 17, 406–13. AvrB-induced RIN4 hyperphosphorylation is mediated by a receptor-like 60. Biely, P. 2012 Microbial carbohydrate esterases deacetylating plant poly- cytoplasmic kinase complex in Arabidopsis. Mol. Plant-Microbe In., 30, saccharides. Biotechnol. Adv., 30, 1575–88. 502–12. Downloaded from https://academic.oup.com/dnaresearch/article-abstract/25/1/25/4161430 by Ed 'DeepDyve' Gillespie user on 16 March 2018

Journal

DNA Research, Oxford University Press

Published: Feb 1, 2018
