
Recombinational DSBs-intersected genes converge on specific disease- and adaptability-related pathways

Motivation: The budding yeast Saccharomyces cerevisiae is a model species powerful for studying the recombination of eukaryotes. Although many recombination studies have been performed for this species by experimental methods, the population genomic study based on bioinformatics analyses is urgently needed to greatly increase the range and accuracy of recombination detection. Here, we carry out the population genomic analysis of recombination in S. cerevisiae to reveal the potential rules between recombination and evolution in eukaryotes.

Results: By population genomic analysis, we discover significantly more and longer recombination events in clinical strains, which indicates that adverse environmental conditions create an obviously wider range of genetic combination in response to the selective pressure. Based on the analysis of recombinational double strand breaks (DSBs)-intersected genes (RDIGs), we find that RDIGs significantly converge on specific disease- and adaptability-related pathways, indicating that recombination plays a biologically key role in the repair of DSBs related to diseases and environmental adaptability, especially the human neurological disorders. By evolutionary analysis of RDIGs, we find that the RDIGs highly prevailing in populations of yeast tend to be more evolutionarily conserved, indicating the accurate repair of DSBs in these RDIGs is critical to ensure the eukaryotic survival or fitness.

Contact: fgao@tju.edu.cn

Supplementary information: Supplementary data are available at Bioinformatics online.
© The Author(s) 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

1 Introduction

The budding yeast Saccharomyces cerevisiae has long served as an important model organism for higher eukaryotes, because it allows classical genetics and biochemistry to be combined with recombinant genetics (Amberg et al., 2016), and its cells have striking similarities to mammalian cells at both the organelle and macromolecular level (Braconi et al., 2011). Recently, a systematic study has shown that most of the genes conserved between yeast and human play exactly the same roles in both organisms, and that human genes can completely substitute for yeast orthologous genes, suggesting that humanizing entire cellular processes or pathways in yeast should be feasible (Kachroo et al., 2015). Due to the high biological relevance between yeast and human, biological findings in the budding yeast can be efficiently translated into new insights about human biology (Skrzypek et al., 2018).

Genomic instability is known to be an important factor in human diseases. Though endogenous or exogenous genotoxic assaults constantly damage genome integrity and lead to DNA damage, cells usually repair it by various mechanisms (Jeggo et al., 2016). Here we focus on the most hazardous damage, namely double strand breaks (DSBs), which are generally repaired by two main mechanisms: the error-free homologous recombination (HR) and the fast but error-prone non-homologous end-joining (NHEJ). These mechanisms are shared by eukaryotes (Carroll et al., 2014), but HR is usually used less frequently than NHEJ (Suhane et al., 2015). Considering that HR is an accurate DNA repair mechanism, we supposed that recombinational DSBs-intersected genes (RDIGs) were associated with special biological significance.

As the most lethal form of DNA damage, DSBs pose a great potential risk to human health. For example, defects in DNA repair pathways are causal factors for atherosclerosis, and lead to higher amounts of DNA damage after exposure to reactive oxygen species (Mahmoudi et al., 2006). When HR repair was defective, smooth muscle cells became increasingly senescent in patients with Hutchinson–Gilford progeria syndrome (Zhang et al., 2014), indicating that DNA damage was related to aging (Gorbunova et al., 2007). Differentiation of osteoblasts was clearly inhibited by DNA damage, which contributed to the weakened bone structure in a mouse model (Schmidt et al., 2012). These examples suggest that DSBs are associated with human diseases, but it is still unclear which diseases significantly rely on the error-free HR repair mechanism. Hence, we undertake this study to explore the biological features and functions of RDIGs, which reflect the potential roles that recombination may play in human.

2 Materials and methods

2.1 Strain and data

Based on the phylogenetic relationship previously built by us (Yang et al., 2018), we selected a total of 42 strains with consideration of their geographic and environmental origins (Supplementary Table S1), and downloaded the corresponding genome sequences from the NCBI FTP site (ftp://ftp.ncbi.nih.gov/genomes/all/) in August 2016. These 42 strains represent an extensive geographic distribution, and all these genomes are well annotated and highly complete with more than 99% integrity. To facilitate the following analyses, each sequence header of every strain genome was unified according to the strain name, chromosome and corresponding identifier. In addition, the protein sequences and GFF3 annotations were regenerated for each strain from the gbff files from NCBI by a BioPerl script.

2.2 Identification of homologous sequences

First, all the genomes were split into individual chromosome sequence files. Then, all the sequences of each chromosome were aligned by Mugsy, a robust whole-genome alignment program suitable for accurately aligning closely related whole-genome sequences (Angiuoli et al., 2011). Next, we extracted all the local aligned blocks of every chromosome, and generated a multi-FASTA format file for each local aligned block. Blocks with a length < 1 kbp or with fewer than 3 homologous sequences were viewed as ineligible blocks.

2.3 Phylogenetic network analysis

We adopted the aligned blocks containing 42 homologous sequences to construct the phylogenetic network. First, eligible aligned blocks were realigned separately to produce more accurate alignments by using the program MAFFT v7.271 with default parameters (Katoh et al., 2013). Then, all these realigned blocks were concatenated into a single alignment. Next, all the single nucleotide polymorphism (SNP) loci were extracted to generate the resulting alignment. Finally, this alignment was used as the input of the NeighborNet program in SplitsTree v4.14.5 (Huson et al., 2006) to construct the phylogenetic network with default settings. In addition, a statistical analysis was carried out by the pair-wise homoplasy index (PHI) test (Bruen et al., 2006) as implemented in SplitsTree to assess recombination among these strains.

2.4 Detection and analysis of recombination events

After we constructed the phylogenetic network and carried out the PHI statistical test for recombination, we employed seven different methods to detect the recombination signal: RDP (Martin et al., 2000), GENECONV (Padidam et al., 1999), 3SEQ (Boni et al., 2007), BOOTSCAN (Salminen et al., 1995), MAXCHI (Smith, 1992), SISCAN (Gibbs et al., 2000) and CHIMAERA (Posada et al., 2001). All these methods were implemented in Recombination Detection Program (RDP4) Beta v. 4.91 (Martin et al., 2015). The breakpoint position of recombination events was determined by RDP4 using a hidden Markov model. In this process, we identified the recombination events separately for each eligible multiple alignment. Then, all the outputs of RDP4 were parsed to get simplified and reliable results, and we only considered those recombination events for which at least five out of seven methods showed a significant P-value. Finally, we performed the statistical analysis, as well as the association analysis of recombination with environmental factors.

2.5 Identification of recombinational DSBs-intersected genes (RDIGs)

In this study, RDIGs refer to the genes that include recombinational DSBs. The identification and functional analysis of RDIGs may be important for exploring the biological significance of recombination, because recombinational DSBs were previously reported to have an extraordinarily non-random distribution along the whole genome (Lam et al., 2015). We therefore identified all the RDIGs and calculated the average frequency of recombinational DSBs by a custom Perl script. In addition, we examined the distribution of recombinational DSBs in each chromosome by 2 kbp non-overlapping windows, and plotted them with the Circos program (Krzywinski et al., 2009).

2.6 Functional enrichment analysis

Here we used the S. cerevisiae strain S288c as reference, and reannotated all its protein-coding genes based on the latest functional data (please see the Supplementary Material for details). Then, a standard procedure, EnrichPipeline, was adopted to carry out the enrichment analysis of GO terms and KEGG pathways (Beissbarth et al., 2004; Huang et al., 2009). In this process, the procedure first identified all the significantly enriched results according to the P-values from the Chi-square test and the Fisher exact test. Then, the P-values of gene set enrichment were further adjusted by the default method fdr to filter out false-positive results. Finally, we took a cutoff of 0.05 for the adjusted P-values as statistical significance.

2.7 Estimation of dN/dS value

To estimate the correlation between the evolutionary rate and the frequency of recombinational DSBs, we conducted the dN/dS analysis of orthologous genes as follows. First, orthologous relationships of genes between strains were identified by the OrthoMCL software (Li et al., 2003) with an E-value cutoff of 1.0E-05. Then we adopted the best-performing program PRANK (Fletcher et al., 2010; Löytynoja et al., 2005) to generate the codon alignment for each orthologous group without paralogs or missing genes. Finally, dN, dS and dN/dS were estimated by maximum likelihood using the codeml program of the PAML software package (Yang, 2007).

3 Results

3.1 Phylogenetic network

Initially, we constructed a reticulate phylogenetic network based on the alignment consisting of SNP loci (358 kbp), which were extracted from 240 eligible aligned blocks by a custom Perl script. As shown in Supplementary Figure S1, a complex and reticulate tree topology was obtained, which indicated the presence of significantly conflicting phylogenetic signals. Generally, conflicting signals are caused by genetic recombination. By the PHI test analysis, we detected statistically significant evidence of recombination (P = 0.0) in the 42 budding yeast strains. Thus, a more comprehensive assessment of recombination events was necessary to gain insight into the underlying biological significance.

3.2 Overview of recombination events

There were 762 blocks in total after filtering out the ineligible blocks. By PHI test analysis, we detected statistically significant evidence of recombination in the 42 budding yeast strains.
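As a rough illustration of the eligibility rule in Section 2.2 that produced these blocks, the Python sketch below keeps a block only if it is at least 1 kbp long and contains at least three homologous sequences. The paper's own filtering was done with custom Perl scripts that are not shown, so the function names, the minimal FASTA parser, and the choice to measure block length in alignment columns (gaps included) are assumptions made purely for illustration.

```python
from typing import Dict

MIN_BLOCK_LENGTH = 1000  # 1 kbp threshold stated in Section 2.2
MIN_SEQUENCES = 3        # minimum number of homologous sequences (Section 2.2)


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse multi-FASTA text into a {header: sequence} dictionary."""
    parts: Dict[str, list] = {}
    header = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            parts[header] = []
        elif header is not None:
            parts[header].append(line)
    return {name: "".join(chunks) for name, chunks in parts.items()}


def is_eligible(block: Dict[str, str]) -> bool:
    """Return True if an aligned block passes both thresholds.

    Assumption: block length is the number of alignment columns,
    taken here as the longest sequence in the block (gaps included).
    """
    if len(block) < MIN_SEQUENCES:
        return False
    alignment_length = max(len(seq) for seq in block.values())
    return alignment_length >= MIN_BLOCK_LENGTH


if __name__ == "__main__":
    toy = parse_fasta(">s1\nACGT\n>s2\nAC-T\n")
    print(is_eligible(toy))  # too few sequences and too short
```

Measuring length on the gapped alignment is only one reasonable reading of "length"; counting ungapped bases per sequence would be an equally plausible alternative and could change which blocks pass.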
Then, we identified all possible recombination events (Supplementary Table S2) by the RDP4 program and compared the results predicted by the seven different methods implemented in this program, finding that the number of recombination events predicted by different methods ranged from 12 935 to 16 107 (Fig. 1a). According to the overlapping relationships shown in the Venn diagram, we found very few recombination events with significant signals detected by only one method (Fig. 1a), while the number of recombination events with significant signals detected by all seven methods was far greater than the others (Fig. 1b), indicating that these seven methods were highly consistent in the detection of recombination signals. Please note that the two sets of data presented in Figure 1b correspond to each other. For example, the case with number of methods = 5 indicated by the red dot includes three cases, namely 5, 6 and 7, indicated by the bar plot. Only recombination events with signals identified to be significant (P-value < 1.0E-05) by no fewer than 5 methods were retained for the following analyses, totaling 11 921 recombination events.

Fig. 1. Comparison of the results obtained from seven detection methods of recombination. (a) Venn diagram showing the overlap of recombination events predicted by seven different methods. (b) Bar plot showing the number of recombination events with significant signals detected and only detected by the corresponding number of methods, and red dots linked by red line showing the total number of recombination events with significant signals detected by no less than the corresponding number of methods

Subsequently, by statistical analysis we found that the average length of recombination events in chromosome 2 was clearly larger than that in the others, and the number of recombination events in chromosomes 2, 4 and 15 was significantly (P-value = 2.9E-03) greater than in the others (Supplementary Fig. S2). By linear regression analysis, we found that the number of recombination events was positively correlated with the length of chromosomes, with a correlation coefficient of 0.67 (Supplementary Fig. S3a), but the average length of recombination events had almost no correlation with the length of chromosomes, with an insignificant correlation coefficient of 0.16 (Supplementary Fig. S3b). Remarkably, in the smallest four chromosomes, the average length of recombination events had an obvious positive correlation with the length of chromosomes, but the number of recombination events had a strong negative correlation with the length of chromosomes, which was in accordance with previous experimental findings on the smallest four chromosomes that the number of recombination events decreased with the length of chromosomes (Gerton et al., 2000). In addition, we estimated the guanine-cytosine (GC) content of recombination sequences, finding that sequences from recombination events had slightly lower GC contents than genomic sequences (Supplementary Fig. S4).

3.3 Source analysis of recombination

To investigate the correlations between recombination and geographical or environmental factors, we constructed the network relationship of recombination events, which clearly showed the intricate relationship of recombination among strains from different sources. From this network diagram (Supplementary Fig. S5), we observed that recombination appeared to occur more frequently in clinical strains than in non-clinical strains, indicating that strains in clinical environments suffered from stronger selection pressures, because such environments always contain a large number of exogenous agents like pharmaceuticals and ionizing radiation. To assess whether the number and length of recombination events in clinical environments differed significantly from those in other environments, we carried out the Wilcoxon test to detect the statistical significance of the difference between any two environments. As a result, we found that the number of recombination events in clinical environments was clearly greater than that in the other two environments (wine and other, respectively) (Fig. 2a), and the total length of recombination sequences in clinical environments was significantly longer than that in either (Fig. 2b). These findings indicated that strains from harmful environments had significantly more genetic combination than those from benign environments, which was in accordance with what had been observed in the fungus Arthrobotrys oligospora (Zhang et al., 2013).

Fig. 2. Relationship between environmental factors and recombination. (a) Comparison of recombination number between any two environmental factors. (b) Comparison of recombination length between any two environmental factors. Significance of above two analysis is estimated by the pairwise Wilcoxon test. Clinic, wine and other refer to the clinical, wine and other environments respectively

Fig. 3. Distribution of DSBs along the genome of strain S288c. From outer to inner, the circle represents the genome coordinate (grey), all the genes located in forward and reverse strands (yellow, black and grey indicate the tRNA, rRNA and other genes, respectively), the percentage of DSBs (black), some important sequence elements (black, red, blue, yellow and green indicate the centromere, replication origin, telomere, mobile element and repeat region, respectively), GC content (blue/red) and GC skew (blue/red), respectively

3.4 DSBs and RDIGs

Based on all the recombination events detected above, we identified 23 356 recombinational DSBs from the genomes of 42 budding yeast strains, and plotted the distribution of DSBs along each chromosome. As a result, we found that DSBs were clearly distributed non-randomly along chromosomes (Fig. 3), which was consistent with previous findings (Székvölgyi et al., 2015).
Then, we mapped the positions of all these DSBs to the genome of strain S288c, and identified 3060 RDIGs in total. RDIGs are genes spliced together from different portions of two homologous genes, and therefore they also belong to the mosaic genes. We calculated the average frequency of recombinational DSBs occurring in RDIGs by a custom Perl script. We focused more on the RDIGs with a high frequency (0.9) of recombinational DSBs (referred to as HRDIGs here), because these genes were highly dependent on error-free HR. Here, we identified 60 HRDIGs in total. Considering that perfect complementary pairing of the invasive strand with the template is crucial for the initiation of HR (Maher et al., 2011), we speculated that HRDIGs were likely to be evolutionarily conserved and biologically significant genes.

Fig. 4. Scatterplot for significantly enriched KEGG pathways in HRDIGs. Rich factor is the ratio of the number of HRDIGs annotated in this pathway to the number of all genes annotated in this pathway. P-value is the corrected P-value ranging from 0 to 0.05, with a lower value meaning a more significant enrichment. Count represents the number of HRDIGs

3.5 Results of KEGG enrichment analysis in HRDIGs

By KEGG enrichment analysis, we found 13 pathways related to human diseases significantly enriched in HRDIGs (Fig. 4; Supplementary Table S3). Specifically, there were three neurological disorders (NDs): Parkinson's disease (PD), Huntington's disease (HD) and Alzheimer's disease (AD), which indicated that organisms relied on HR to thwart the occurrence of NDs. This was supported by previous findings that accumulation of DNA lesions was tightly related to NDs (Maynard et al., 2015), and that organisms may resist not only normal aging but also age-related NDs through reliable DNA repair mechanisms (Leandro et al., 2015). There were two cancer pathways (prostate cancer and chemical carcinogenesis), which was in accordance with previous findings that inaccurate repair of DSBs was related to the development of certain cancers (Dexheimer, 2013), and that defects of recombination predisposed to cancer (Martino et al., 2016). There were also three other disease pathways: non-alcoholic fatty liver disease; measles; and fluid shear stress and atherosclerosis. Indeed, growing evidence points to the association of DNA damage with metabolic syndromes, such as non-alcoholic fatty liver disease and atherosclerosis (Gray et al., 2011; Jackson et al., 2009). As to the enrichment of the measles pathway, a previous experimental study observed a similar result, namely that all the measles cases had a clearly increased frequency of DSBs (Koskull et al., 1977). Moreover, there were five further disease-related pathways, namely the IL-17 signaling pathway, oxidative phosphorylation, antigen processing and presentation, Th17 cell differentiation and cardiac muscle contraction. These results, together with the results of KOG and GO analysis (Supplementary Figs S6 and S7 and Table S4), suggested that HRDIGs were tightly associated with certain human diseases.

Moreover, there were also some pathways related to specific adaptability. For instance, the estrogen signaling pathway regulates a plethora of physiological processes in mammals (including cardiovascular protection, reproduction and behavior) (Rotroff et al., 2013), the drug metabolism and metabolism of xenobiotics by cytochrome P450 pathways are responsible for xenobiotics biodegradation and metabolism (Kirchmair et al., 2015), and the two-component system acts as a basic stimulus-response mechanism that permits organisms to sense and respond to changes in environmental conditions (Kreamer et al., 2015). These results indicated that functional impairment of HRDIGs was likely to result in an obvious decrease of environmental adaptability, hence DSBs occurring in HRDIGs were highly dependent on the error-free HR repair mechanism.

Fig. 5. Relationship between recombinational DSBs and genetic evolution. (a) Correlation between recombinational DSB frequency and dN/dS value. (b) Comparison of recombinational DSBs frequency between NSGs and PSGs. Statistical significance analysis is performed by the pairwise Wilcoxon test

3.6 DSBs frequency and dN/dS of RDIGs

By analyzing the relationship between recombinational DSB frequency and dN/dS value in RDIGs, we found that the RDIGs prevailing more widely in populations tended to keep a lower dN/dS value, and almost all positively selected genes (PSGs) held an extremely low frequency of DSBs (Fig. 5a). This was consistent with previous studies showing that high genetic similarity corresponded more significantly to the distribution of DSBs (Magiorkinis et al., 2003). In HR repair, a long stretch of single strand DNA is generated by the end resection of DSBs (Truong et al., 2013), and this single strand DNA might be the most critical factor for determining whether a homolog should be used as the recombination partner (Joshi et al., 2015). In this analysis, we found that the frequency of recombinational DSBs in PSGs was significantly lower than that in negatively selected genes (NSGs) (Fig. 5b), and none of the HRDIGs were PSGs. Hence, we speculated that HR preferred to repair DNA damage in highly homologous regions, which in turn contributed to higher evolutionary conservation in these regions.

4 Discussions

As discussed above, HR is a biologically significant mechanism of repairing DSBs that occur spontaneously or are induced by biological, chemical or physical agents. In the HR process, the 5′ end of DSBs is first resected to generate an ssDNA tail of 800 nt. Then, this ssDNA tail initiates the homologous pairing after invasion, and serves as the primer for DNA biosynthesis. Ultimately, an intact homologous duplex is used as template to repair the broken strands. However, HR is only responsible for a small proportion of DSBs (Sundararajan et al., 2016). Its operating frequency is usually regulated homeostatically, and does not change with alteration of DSB frequency (Modliszewski et al., 2017). Though HR is an error-free DNA repair mechanism, it is also a non-random and biased process. While HR was found to prefer to repair DSBs in evolutionarily conserved regions as observed above (Fig. 5), there may be some other factors influencing this priority of HR. We supposed that HR would be used preferentially to repair biologically significant DSBs.

In this study, we found that the number of recombination events had a clearly positive correlation with the length of chromosomes, and also found that the total length of recombination events in clinical strains was significantly longer than that in wine or other strains. A reasonable interpretation for this is that clinical strains are subjected to numerous DNA damages, and need to break up undesirable combinations and create favorable allelic ones over a wider range. Generally, recombination is essential for restoring damaged genes, but its DSBs-intersected genes are structured into mosaic genes. In this way, widely prevailing RDIGs in populations generally evolve only at a very low rate (Fig. 5a), so as to make sure that only highly advantageous mutations are accumulated. Hence, RDIGs should have special biological significance, especially HRDIGs that hold high prevalence in natural populations. As we presumed, HRDIGs are significantly enriched in some critical biological processes and pathways related to specific adaptability and human diseases, such as NDs, cancers and drug metabolism. All these findings indicate that the HR repair of DSBs may well be crucial for the fitness of eukaryotic organisms, and may also contribute to the development of new therapeutic strategies for specific human diseases.

Acknowledgements

The authors would like to thank Prof. Chun-Ting Zhang for the invaluable assistance and inspiring discussions.

Funding

This work was supported by the National Natural Science Foundation of China (Grant Nos. 31571358, 21621004, 31171238, 11626250 and 91746119).

Conflict of Interest: none declared.

References

Amberg,D.C. et al. (2016) Classical genetics with Saccharomyces cerevisiae. Cold Spring Harb. Protoc., 2016, 413–421.
Angiuoli,S.V. et al. (2011) Mugsy: fast multiple alignment of closely related whole genomes. Bioinformatics, 27, 334–342.
Beissbarth,T. et al. (2004) GOstat: find statistically overrepresented Gene Ontologies within a group of genes. Bioinformatics, 20, 1464–1465.
Boni,M.F. et al. (2007) An exact nonparametric method for inferring mosaic structure in sequence triplets. Genetics, 176, 1035–1047.
Braconi,D. et al. (2011) Saccharomyces Cerevisiae as a tool to evaluate the effects of herbicides on eukaryotic life. In: Kortekamp, A. (ed.) Herbicides and Environment. InTech, London, pp. 493–514.
Bruen,T.C. et al. (2006) A simple and robust statistical test for detecting the presence of recombination. Genetics, 172, 2665–2681.
Carroll,D. et al. (2014) Genome engineering with TALENs and ZFNs: repair pathways and donor design. Methods, 69, 137–141.
Dexheimer,T.S. (2013) DNA repair pathways and mechanisms. In: Mathews, L.A. et al. (eds) DNA Repair of Cancer Stem Cells. Springer, Dordrecht, pp. 19–32.
Fletcher,W. et al. (2010) The effect of insertions, deletions, and alignment errors on the branch-site test of positive selection. Mol. Biol. Evol., 27, 2257–2267.
Gerton,J.L. et al. (2000) Global mapping of meiotic recombination hotspots and coldspots in the yeast Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. USA, 97, 11383–11390.
Gibbs,M.J. et al. (2000) Sister-scanning: a Monte Carlo procedure for assessing signals in recombinant sequences. Bioinformatics, 16, 573–582.
Gorbunova,V. et al. (2007) Changes in DNA repair during aging. Nucleic Acids Res., 35, 7466–7474.
Gray,K. et al. (2011) Role of DNA damage in atherosclerosis–bystander or participant? Biochem. Pharmacol., 82, 693–700.
Huang,D.W. et al. (2009) Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res., 37, 1–13.
Huson,D.H. et al. (2006) Application of phylogenetic networks in evolutionary studies. Mol. Biol. Evol., 23, 254–267.
Jackson,S.P. et al. (2009) The DNA-damage response in human biology and disease. Nature, 461, 1071–1078.
Jeggo,P.A. et al. (2016) DNA repair, genome stability and cancer: a historical perspective. Nat. Rev. Cancer, 16, 35–42.
Joshi,N. et al. (2015) Gradual implementation of the meiotic recombination program via checkpoint pathways controlled by global DSB levels. Mol. Cell, 57, 797–811.
Kachroo,A.H. et al. (2015) Systematic humanization of yeast genes reveals conserved functions and genetic modularity. Science, 348, 921–925.
Katoh,K. et al. (2013) MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol., 30, 772–780.
Kirchmair,J. et al. (2015) Predicting drug metabolism: experiment and/or computation? Nat. Rev. Drug Discov., 14, 387–404.
Koskull,H.V. et al. (1977) Distribution of chromosome breaks in measles, Fanconi's anemia and controls. Hereditas, 87, 1–10.
Kreamer,N.N. et al. (2015) The ferrous iron-responsive BqsRS two-component system activates genes that promote cationic stress tolerance. MBio, 6, e02549-14.
Krzywinski,M. et al. (2009) Circos: an information aesthetic for comparative genomics. Genome Res., 19, 1639–1645.
Lam,I. et al. (2015) Nonparadoxical evolutionary stability of the recombination initiation landscape in yeast. Science, 350, 932–937.
Leandro,G.S. et al. (2015) The impact of base excision DNA repair in age-related neurodegenerative diseases. Mutat. Res., 776, 31–39.
Li,L. et al. (2003) OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res., 13, 2178–2189.
Löytynoja,A. et al. (2005) An algorithm for progressive multiple alignment of sequences with insertions. Proc. Natl. Acad. Sci. USA, 102, 10557–10562.
Magiorkinis,G. et al. (2003) In vivo characteristics of human immunodeficiency virus type 1 intersubtype recombination: determination of hot spots and correlation with sequence similarity. J. Gen. Virol., 84, 2715–2722.
Maher,R.L. et al. (2011) Coordination of DNA replication and recombination activities in the maintenance of genome stability. J. Cell. Biochem., 112, 2672–2682.
Mahmoudi,M. et al. (2006) DNA damage and repair in atherosclerosis. Cardiovasc. Res., 71, 259–268.
Martino,J. et al. (2016) The Shu complex is a conserved regulator of homologous recombination. FEMS Yeast Res., 16, fow073.
Martin,D. et al. (2000) RDP: detection of recombination amongst aligned sequences. Bioinformatics, 16, 562–563.
Martin,D.P. et al. (2015) RDP4: detection and analysis of recombination patterns in virus genomes. Virus Evol., 1, vev003.
Maynard,S. et al. (2015) DNA damage, DNA repair, aging, and neurodegeneration. Cold Spring Harb. Perspect. Med., 5, a025130.
Modliszewski,J.L. et al. (2017) Meiotic recombination gets stressed out: CO frequency is plastic under pressure. Curr. Opin. Plant Biol., 36, 95–102.
Padidam,M. et al. (1999) Possible emergence of new geminiviruses by frequent recombination. Virology, 265, 218–225.
Posada,D. et al. (2001) Evaluation of methods for detecting recombination from DNA sequences: computer simulations. Proc. Natl. Acad. Sci. USA, 98, 13757–13762.
Rotroff,D.M. et al. (2013) Real-time growth kinetics measuring hormone mimicry for ToxCast chemicals in T-47D human ductal carcinoma cells. Chem. Res. Toxicol., 26, 1097–1107.
Salminen,M.O. et al. (1995) Identification of breakpoints in intergenotypic recombinants of HIV type 1 by bootscanning. AIDS Res. Hum. Retroviruses, 11, 1423–1425.
Schmidt,E. et al. (2012) Expression of the Hutchinson-Gilford progeria mutation during osteoblast development results in loss of osteocytes, irregular mineralization, and poor biomechanical properties. J. Biol. Chem., 287, 33512–33522.
Skrzypek,M.S. et al. (2018) Saccharomyces genome database informs human biology. Nucleic Acids Res., 46, D736–D742.
Smith,J.M. (1992) Analyzing the mosaic structure of genes. J. Mol. Evol., 34, 126–129.
Suhane,T. et al. (2015) Both the charged linker region and ATPase domain of Hsp90 are essential for Rad51-dependent DNA repair. Eukaryot. Cell, 14, 64–77.
Sundararajan,A. et al. (2016) Gene evolutionary trajectories and GC patterns driven by recombination in Zea mays. Front. Plant Sci., 7, 1433.
Székvölgyi,L. et al. (2015) Initiation of meiotic homologous recombination: flexibility, impact of histone modifications, and chromatin remodeling. Cold Spring Harb. Perspect. Biol., 7, a016527.
Truong,L.N. et al. (2013) Microhomology-mediated End Joining and Homologous Recombination share the initial end resection step to repair DNA double-strand breaks in mammalian cells. Proc. Natl. Acad. Sci. USA, 110, 7720–7725.
Yang,Z. (2007) PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol., 24, 1586–1591.
Yang,Z.K. et al. (2018) The systematic analysis of ultraconserved genomic regions in the budding yeast. Bioinformatics, 34, 361–366.
Zhang,H. et al. (2014) Mechanisms controlling the smooth muscle cell death in progeria via down-regulation of poly(ADP-ribose) polymerase 1. Proc. Natl. Acad. Sci. USA, 111, E2261–E2270.
Zhang,Y. et al. (2013) Genetic diversity and recombination in natural populations of the nematode-trapping fungus Arthrobotrys oligospora from China. Ecol. Evol., 3, 312–325.

Bioinformatics, Oxford University Press

Recombinational DSBs-intersected genes converge on specific disease- and adaptability-related pathways



Publisher
Oxford University Press
Copyright
© The Author(s) 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com
ISSN
1367-4803
eISSN
1460-2059
DOI
10.1093/bioinformatics/bty376

Abstract

Motivation: The budding yeast Saccharomyces cerevisiae is a model species powerful for studying the recombination of eukaryotes. Although many recombination studies have been performed for this species by experimental methods, the population genomic study based on bioinformatics analyses is urgently needed to greatly increase the range and accuracy of recombination detection. Here, we carry out the population genomic analysis of recombination in S.cerevisiae to reveal the potential rules between recombination and evolution in eukaryotes.

Results: By population genomic analysis, we discover significantly more and longer recombination events in clinical strains, which indicates that adverse environmental conditions create an obviously wider range of genetic combination in response to the selective pressure. Based on the analysis of recombinational double strand breaks (DSBs)-intersected genes (RDIGs), we find that RDIGs significantly converge on specific disease- and adaptability-related pathways, indicating that recombination plays a biologically key role in the repair of DSBs related to diseases and environmental adaptability, especially the human neurological disorders. By evolutionary analysis of RDIGs, we find that the RDIGs highly prevailing in populations of yeast tend to be more evolutionarily conserved, indicating the accurate repair of DSBs in these RDIGs is critical to ensure the eukaryotic survival or fitness.

Contact: fgao@tju.edu.cn

Supplementary information: Supplementary data are available at Bioinformatics online.
1 Introduction

The budding yeast Saccharomyces cerevisiae has long served as an important model organism for higher eukaryotes, as it combines classical genetics and biochemistry with recombinant genetics (Amberg et al., 2016), and its cells have striking similarities to mammalian cells at both the organelle and macromolecular level (Braconi et al., 2011). Just recently, a systematic study has shown that most of the genes conserved between yeast and human play the same exact roles in both organisms, and that human genes can completely substitute for yeast orthologous genes, suggesting that humanizing entire cellular processes or pathways in yeast should be feasible (Kachroo et al., 2015). Due to the high biological relevance between yeast and human, biological findings in the budding yeast can be efficiently translated into new insights about human biology (Skrzypek et al., 2018).

It has been known that genomic instability is an important factor for human diseases. Though endogenous or exogenous genotoxic assaults constantly destroy the genome integrity and lead to DNA damage, cells usually repair it by various mechanisms (Jeggo et al., 2016). Here we focus on the most hazardous damage, namely the double strand breaks (DSBs), which are generally repaired by two main mechanisms: the error-free homologous recombination (HR) and the fast but error-prone non-homologous end-joining (NHEJ). These mechanisms are shared by eukaryotes (Carroll et al., 2014), but HR is usually less frequently used than NHEJ (Suhane et al., 2015). Considering that HR is an accurate DNA repair mechanism, we supposed that recombinational DSBs-intersected genes (RDIGs) were associated with special biological significance.

As the most lethal form of DNA damage, DSBs pose a great potential risk to human health. For example, defects in DNA repair pathways are causal factors for atherosclerosis, and led to higher amounts of DNA damage after exposure to reactive oxygen species (Mahmoudi et al., 2006). When HR repair was defective, smooth muscle cells became increasingly senescent in patients with Hutchinson–Gilford progeria syndrome (Zhang et al., 2014), indicating that DNA damage was related to aging (Gorbunova et al., 2007). Differentiation of osteoblasts was obviously inhibited by DNA damage, which contributed to the weakened bone structure in a mouse model (Schmidt et al., 2012). These examples suggest that DSBs are associated with human diseases, but it is still unclear which diseases significantly rely on the error-free HR repair mechanism. Hence, we undertake this study to explore the biological features and functions of RDIGs, which reflect the potential roles that recombination may play in human.

2 Materials and methods

2.1 Strain and data

Based on the phylogenetic relationship previously built by us (Yang et al., 2018), we selected a total of 42 strains with consideration of the geographic and environmental origins (Supplementary Table S1), and downloaded the corresponding genome sequences from the NCBI FTP site (ftp://ftp.ncbi.nih.gov/genomes/all/) in August 2016. These 42 strains represent an extensive geographic distribution, and all these genomes are well annotated and highly complete with more than 99% integrity. In order to facilitate the following analyses, each sequence header of every strain genome was unified according to the strain name, chromosome and corresponding identifier. In addition, the protein sequences and GFF3 annotations were also regenerated for each strain based on the gbff files from NCBI by a BioPerl script.

2.2 Identification of homologous sequences

First, all the genomes were split into individual chromosome sequence files. Then, all the sequences of each chromosome were aligned by Mugsy, a robust whole-genome alignment program suitable for accurately aligning closely related whole-genome sequences (Angiuoli et al., 2011). Next, we extracted all the local aligned blocks of every chromosome, and generated a multi-FASTA format file for each local aligned block. Blocks with a length <1 kbp or with <3 homologous sequences were viewed as ineligible blocks.

2.3 Phylogenetic network analysis

We adopted the aligned blocks containing 42 homologous sequences to construct the phylogenetic network. First, eligible aligned blocks were realigned separately to produce more accurate alignments by using the program MAFFT v7.271 with default parameters (Katoh et al., 2013). Then, all these realigned blocks were concatenated into a single alignment. Next, all the single nucleotide polymorphism (SNP) loci were extracted to generate the resulting alignment. Finally, this alignment was used as the input of the NeighborNet program in SplitsTree v4.14.5 (Huson et al., 2006) to construct the phylogenetic network with default settings. In addition, a statistical analysis was carried out by the pair-wise homoplasy index (PHI) test (Bruen et al., 2006), as implemented in the SplitsTree program, to test for recombination among these strains.

2.4 Detection and analysis of recombination events

After we constructed the phylogenetic network and carried out the PHI statistical test for recombination, we employed seven different methods to detect the recombination signal: RDP (Martin et al., 2000), GENECONV (Padidam et al., 1999), 3SEQ (Boni et al., 2007), BOOTSCAN (Salminen et al., 1995), MAXCHI (Smith, 1992), SISCAN (Gibbs et al., 2000) and CHIMAERA (Posada et al., 2001). All these methods were implemented in Recombination Detection Program (RDP4) Beta v. 4.91 (Martin et al., 2015). The breakpoint position of recombination events was determined by RDP4 using a hidden Markov model. In this process, we identified the recombination events separately for each of the eligible multiple alignments. Then, all the outputs of RDP4 were parsed to get simplified and reliable results, and we only considered those recombination events for which at least five out of seven methods showed a significant P-value. Finally, we performed the statistical analysis, as well as the association analysis of recombination with environmental factors.

2.5 Identification of recombinational DSBs-intersected genes (RDIGs)

In this study, RDIGs refer to the genes that include recombinational DSBs. The identification and functional analysis of RDIGs may be important for exploring the biological significance of recombination, because recombinational DSBs were previously reported to have an extraordinarily non-random distribution along the whole genome (Lam et al., 2015). So we identified all the RDIGs and calculated the average frequency of recombinational DSBs by a custom Perl script. In addition, we also examined the distribution of recombinational DSBs in each chromosome by 2 kbp non-overlapping windows, and plotted them with the Circos program (Krzywinski et al., 2009).

2.6 Functional enrichment analysis

Here we used the S.cerevisiae strain S288c as reference, and reannotated all its protein-coding genes based on the latest functional data (please see the Supplementary Material for details). Then, a standard procedure, EnrichPipeline, was adopted to carry out the enrichment analysis of GO term/KEGG pathway (Beissbarth et al., 2004; Huang et al., 2009). In this process, the procedure first identified the significantly enriched results according to the P-values by Chi-square test and Fisher's exact test. Then, the P-values of gene set enrichment were further adjusted by the default method fdr to filter out false-positive results. Finally, we took the cutoff of 0.05 for the adjusted P-values as statistical significance.

2.7 Estimation of dN/dS value

In order to estimate the correlation between the evolutionary rate and the frequency of recombinational DSBs, we conducted the dN/dS analysis of orthologous genes as follows. First, orthologous relationships of genes between strains were identified by OrthoMCL software (Li et al., 2003) with the E-value cutoff of 1.0E-05. Then we adopted the best-performing program PRANK (Fletcher et al., 2010; Löytynoja et al., 2005) to generate the codon alignment for each orthologous group without paralogs or missing genes. Finally, dN, dS and dN/dS were estimated by maximum likelihood using the codeml program of the PAML software package (Yang, 2007).

3 Results

3.1 Phylogenetic network

Originally, we constructed a reticulate phylogenetic network based on the alignment consisting of SNP loci (358 kbp), which were extracted from 240 eligible aligned blocks by a custom Perl script. As shown in Supplementary Figure S1, a complex and reticulate tree topology structure was presented, which indicated the presence of significantly conflicting phylogenetic signals. Generally, conflicting signals were caused by genetic recombination. By the PHI test analysis, we detected statistically significant evidence of recombination (P = 0.0) in the 42 budding yeast strains. Thus, a more comprehensive assessment of recombination events was necessary for insight into the underlying biological significance.

3.2 Overview of recombination events

There were a total of 762 blocks after filtering the ineligible blocks. By PHI test analysis, we detected statistically significant evidence of recombination in the 42 budding yeast strains.
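The eligibility rule from Section 2.2 (a block is discarded when its length is <1 kbp or when it holds <3 homologous sequences) can be illustrated with a minimal Python sketch. This is not the authors' script: the dictionary representation of a block, the function names and the choice to measure length as the number of alignment columns are all assumptions made for illustration.

```python
from typing import Dict, List

# Thresholds taken from Section 2.2 of the text.
MIN_BLOCK_LENGTH = 1000   # 1 kbp
MIN_SEQUENCES = 3

# Hypothetical representation: one aligned block maps strain name -> aligned sequence.
Block = Dict[str, str]


def is_eligible(block: Block) -> bool:
    """Return True if the block passes both thresholds.

    Assumption: block length is the number of alignment columns
    (gaps included), taken from the longest sequence in the block.
    """
    if len(block) < MIN_SEQUENCES:
        return False
    alignment_length = max(len(seq) for seq in block.values())
    return alignment_length >= MIN_BLOCK_LENGTH


def filter_blocks(blocks: List[Block]) -> List[Block]:
    """Keep only the eligible blocks, preserving their order."""
    return [block for block in blocks if is_eligible(block)]
```

A caller would parse each multi-FASTA block into such a dictionary and pass the list to `filter_blocks`; the 762 and 240 block counts reported in the text come from the authors' own pipeline and are not reproduced by this sketch.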
Then, we identified all possible recombination events (Supplementary Table S2) by the RDP4 program and compared the results predicted by the seven different methods implemented in this program, finding that the number of recombination events predicted by different methods ranged from 12 935 to 16 107 (Fig. 1a). According to the overlapping relationships shown by the Venn diagram, we found very few recombination events with significant signals detected by only one method (Fig. 1a), while the number of recombination events with significant signals detected by all seven methods was far greater than the others (Fig. 1b), indicating that these seven methods had a good consistency in the detection of recombination signals. Please note that the two sets of data presented in Figure 1b correspond to each other. For example, the case with number of methods = 5 indicated by the red dot includes three cases, namely 5, 6 and 7, indicated by the bar plot. Only the recombination events with signals identified as significant (P-value <1.0E-05) by no less than 5 methods were retained for the following analyses, totaling 11 921 recombination events.

Fig. 1. Comparison of the results obtained from seven detection methods of recombination. (a) Venn diagram showing the overlap of recombination events predicted by seven different methods. (b) Bar plot showing the number of recombination events with significant signals detected and only detected by the corresponding number of methods, and red dots linked by red line showing the total number of recombination events with significant signals detected by no less than the corresponding number of methods

Subsequently, by statistical analysis we found that the average length of recombination events in chromosome 2 was obviously larger than that in others, and the number of recombination events in chromosomes 2, 4 and 15 was significantly (P-value = 2.9E-03) higher than in others (Supplementary Fig. S2). By linear regression analysis, we found that the number of recombination events was positively correlated to the length of chromosomes with a correlation coefficient of 0.67 (Supplementary Fig. S3a), but the average length of recombination events had almost no correlation to the length of chromosomes, with an insignificant correlation coefficient of 0.16 (Supplementary Fig. S3b). Remarkably, in the smallest four chromosomes, the average length of recombination events had an obvious positive correlation to the length of chromosomes, but the number of recombination events had a strong negative correlation to the length of chromosomes, which was in accordance with previous experimental findings on the smallest four chromosomes that the number of recombination events decreased with the length of chromosomes (Gerton et al., 2000). In addition, we also estimated the guanine-cytosine (GC) content of recombination sequences, finding that sequences from recombination events had slightly lower GC contents than genomic sequences (Supplementary Fig. S4).

3.3 Source analysis of recombination

In order to investigate the correlations between recombination and geographical or environmental factors, we constructed the network relationship of recombination events, which clearly showed the intricate relationship of recombination among strains from different sources. By this network diagram (Supplementary Fig. S5), we observed that recombination appeared to occur more frequently in clinical strains than in non-clinical strains, indicating that strains in clinical environments suffered from stronger selection pressures, because there always existed a large number of exogenous agents like pharmaceuticals and ionizing radiation. In order to assess whether the number and length of recombination events in clinical environments differed significantly from those in other environments, we carried out the Wilcoxon test to detect the statistical significance of the difference between any two environments. As a result, we found that the number of recombination events in clinical environments was obviously higher than that in the other two environments (wine and other, respectively) (Fig. 2a), and the total length of recombination sequences in clinical environments was significantly longer than that in either (Fig. 2b). These findings indicated that strains from harmful environments had significantly more genetic combination than those from benign environments, which was in accordance with what had been observed in the fungus Arthrobotrys oligospora (Zhang et al., 2013).

Fig. 2. Relationship between environmental factors and recombination. (a) Comparison of recombination number between any two environmental factors. (b) Comparison of recombination length between any two environmental factors. Significance of above two analysis is estimated by the pairwise Wilcoxon test. Clinic, wine and other refer to the clinical, wine and other environments respectively

Fig. 3. Distribution of DSBs along the genome of strain S288c. From outer to inner, the circle represents the genome coordinate (grey), all the genes located in forward and reverse strands (yellow, black and grey indicate the tRNA, rRNA and other genes, respectively), the percentage of DSBs (black), some important sequence elements (black, red, blue, yellow and green indicate the centromere, replication origin, telomere, mobile element and repeat region, respectively), GC content (blue/red) and GC skew (blue/red), respectively

3.4 DSBs and RDIGs

Based on all the recombination events detected above, we identified 23 356 recombinational DSBs from genomes of 42 budding yeast strains, and plotted the distribution of DSBs along each chromosome. As a result, we found that DSBs were obviously distributed non-randomly along chromosomes (Fig. 3), which was consistent with previous findings (Székvölgyi et al., 2015).
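The windowed view of the DSB distribution (Section 2.5 examines recombinational DSBs per chromosome in 2 kbp non-overlapping windows) can be sketched in Python as below. This is only an illustration under stated assumptions: the authors used a custom Perl script and Circos, and the function name, the 1-based coordinate convention and the handling of a shorter final window are hypothetical choices, not details given in the text.

```python
from typing import Iterable, List

WINDOW_SIZE = 2000  # 2 kbp, as stated in Section 2.5


def count_dsbs_per_window(
    positions: Iterable[int],
    chrom_length: int,
    window_size: int = WINDOW_SIZE,
) -> List[int]:
    """Count DSB positions in consecutive non-overlapping windows.

    Assumptions: positions are 1-based chromosome coordinates, and the
    last window may be shorter than window_size when chrom_length is
    not an exact multiple of it.
    """
    if chrom_length <= 0 or window_size <= 0:
        raise ValueError("chrom_length and window_size must be positive")
    # Ceiling division gives the number of windows covering the chromosome.
    n_windows = (chrom_length + window_size - 1) // window_size
    counts = [0] * n_windows
    for pos in positions:
        if not 1 <= pos <= chrom_length:
            raise ValueError(f"position {pos} lies outside the chromosome")
        # Positions 1..window_size fall in window 0, and so on.
        counts[(pos - 1) // window_size] += 1
    return counts
```

The resulting per-window counts are the kind of track that could be written out for a plotting tool; converting them to the percentage of DSBs shown in Fig. 3 would need the per-chromosome or genome-wide total, which this sketch leaves to the caller.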
Then, we mapped the positions of all these DSBs to the genome of strain S288c, and identified a total of 3060 RDIGs. Actually, RDIGs were genes spliced together from different portions of two homologous genes, therefore they also belonged to the mosaic genes. We calculated the average frequency of recombinational DSBs occurring in RDIGs by a custom Perl script. We focused more on the RDIGs with high frequency (0.9) of recombinational DSBs (referred to as HRDIGs here), because these genes were highly dependent on error-free HR. Here, we identified a total of 60 HRDIGs. Considering that perfect complementary pairings of the invasive strand with the template were crucial for the initiation of HR (Maher et al., 2011), we speculated that HRDIGs were likely to be evolutionarily conserved and biologically significant genes.

Fig. 4. Scatterplot for significantly enriched KEGG pathways in HRDIGs. Rich factor is the ratio of the number of HRDIGs annotated in this pathway to the number of all genes annotated in this pathway. P-value is the corrected P-value ranging from 0 to 0.05, with a lower value meaning a more significant enrichment. Count represents the number of HRDIGs

3.5 Results of KEGG enrichment analysis in HRDIGs

By KEGG enrichment analysis, we found 13 pathways related to human diseases significantly enriched in HRDIGs (Fig. 4; Supplementary Table S3). Specifically, there were three neurological disorders (NDs): Parkinson's disease (PD), Huntington's disease (HD) and Alzheimer's disease (AD), which indicated that organisms relied on HR to thwart the occurrence of NDs. This was supported by previous findings that accumulation of DNA lesions was tightly related to NDs (Maynard et al., 2015), and that organisms may resist not only normal aging but also age-related NDs by a reliable DNA repair mechanism (Leandro et al., 2015). There were two cancer pathways (prostate cancer and chemical carcinogenesis), which was in accordance with previous findings that inaccurate repair of DSBs was related to the development of certain cancers (Dexheimer, 2013), and that defects of recombination predisposed to cancer (Martino et al., 2016). There were also three other disease pathways, namely non-alcoholic fatty liver disease, measles, and fluid shear stress and atherosclerosis. Indeed, growing evidence pointed to the association of DNA damage with metabolic syndromes, such as non-alcoholic fatty liver disease and atherosclerosis (Gray et al., 2011; Jackson et al., 2009). As to the enrichment of the measles pathway, a previous experimental study observed a similar result that all the measles cases had an obviously increased frequency of DSBs (Koskull et al., 1977). Moreover, there were also five disease-related pathways, namely the IL-17 signaling pathway, oxidative phosphorylation, antigen processing and presentation, Th17 cell differentiation and cardiac muscle contraction. These results, together with the results of KOG and GO analysis (Supplementary Figs S6 and S7 and Table S4), suggested that HRDIGs were tightly associated with certain human diseases.

Moreover, there were also some pathways related to specific adaptability. For instance, the estrogen signaling pathway regulated a plethora of physiological processes in mammals (including cardiovascular protection, reproduction and behavior) (Rotroff et al., 2013), the drug metabolism and metabolism of xenobiotics by cytochrome P450 pathways were responsible for xenobiotics biodegradation and metabolism (Kirchmair et al., 2015), and the two-component system acted as a basic stimulus-response mechanism to permit organisms to sense and respond to changes of environmental conditions (Kreamer et al., 2015). These results indicated that functional impairment of HRDIGs was likely to result in an obvious decrease of environmental adaptability, hence DSBs occurring in HRDIGs were highly dependent on the error-free HR repair mechanism.

3.6 DSBs frequency and dN/dS of RDIGs

By analyzing the relationship between recombinational DSBs frequency and dN/dS value in RDIGs, we found that the RDIGs prevailing more widely in populations tended to keep a lower dN/dS value, and almost all of the positively selective genes (PSGs) held an extremely low frequency of DSBs (Fig. 5a). This was consistent with previous studies showing that high genetic similarity corresponded more significantly to the distribution of DSBs (Magiorkinis et al., 2003). In HR repair, a long stretch of single strand DNA would be generated by the end resection of DSBs (Truong et al., 2013), and this single strand DNA might be the most critical factor for determining if a homolog should be used as the recombination partner (Joshi et al., 2015). In this analysis, we found that the frequency of recombinational DSBs in PSGs was significantly lower than that in negatively selective genes (NSGs) (Fig. 5b), and none of the HRDIGs were PSGs. Hence, we speculated that HR preferred to repair DNA damage in highly homologous regions, which in turn contributed to higher evolutionary conservation in these regions.

Fig. 5. Relationship between recombinational DSBs and genetic evolution. (a) Correlation between recombinational DSB frequency and dN/dS value. (b) Comparison of recombinational DSBs frequency between NSGs and PSGs. Statistical significance analysis is performed by the pairwise Wilcoxon test

4 Discussions

As discussed above, HR is a biologically significant mechanism of repairing DSBs that occur spontaneously or are induced by biological, chemical or physical agents. In the HR process, the 5′ end of DSBs is first resected to generate an ssDNA tail of 800 nt. Then, this ssDNA tail initiates the homologous pairing after invasion, and serves as the primer for DNA biosynthesis. Ultimately, an intact homologous duplex is used as template to repair the broken strands. However, HR is only responsible for a small proportion of DSBs (Sundararajan et al., 2016). Its operating frequency is usually regulated homeostatically, and does not change with alteration of DSB frequency (Modliszewski et al., 2017). Though HR is an error-free DNA repair mechanism, it is also a non-random and biased process. While HR was found to prefer to repair DSBs in evolutionarily conserved regions as observed above (Fig. 5), there may be some other factors influencing this priority of HR. We supposed that HR would be used preferentially to repair those biologically significant DSBs.

In this study, we found that the number of recombination events had an obviously positive correlation with the length of chromosomes, and also found that the total length of recombination events in clinical strains was significantly longer than that in wine or other strains. A reasonable interpretation for this is that clinical strains are subjected to numerous DNA damages, and need to break up the undesirable combinations and create favorable allelic ones in a wider range. Generally, recombination is essential for restoring the damaged genes, but its DSBs-intersected genes are structured into mosaic genes. In this way, widely prevailing RDIGs in populations generally evolve only at a very low rate (Fig. 5a), so as to make sure that only highly advantageous mutations are accumulated. Hence, RDIGs should have special biological significance, especially HRDIGs that hold high prevalence in the natural populations. As we presumed, HRDIGs are significantly enriched in some critical biological processes and pathways related to specific adaptability and human diseases, such as NDs, cancers and drug metabolism. All these findings indicate that the HR repair of DSBs may well be crucial for the fitness of eukaryotic organisms, and may also contribute to the development of new therapeutic strategies for specific human diseases.

Acknowledgements

The authors would like to thank Prof. Chun-Ting Zhang for the invaluable assistance and inspiring discussions.

Funding

This work was supported by the National Natural Science Foundation of China (Grant Nos. 31571358, 21621004, 31171238, 11626250 and 91746119).

Conflict of Interest: none declared.

References

Amberg,D.C. et al. (2016) Classical genetics with Saccharomyces cerevisiae. Cold Spring Harb. Protoc., 2016, 413–421.
Angiuoli,S.V. et al. (2011) Mugsy: fast multiple alignment of closely related whole genomes. Bioinformatics, 27, 334–342.
Beissbarth,T. et al. (2004) GOstat: find statistically overrepresented Gene Ontologies within a group of genes. Bioinformatics, 20, 1464–1465.
Boni,M.F. et al. (2007) An exact nonparametric method for inferring mosaic structure in sequence triplets. Genetics, 176, 1035–1047.
Braconi,D. et al. (2011) Saccharomyces Cerevisiae as a tool to evaluate the effects of herbicides on eukaryotic life. In: Kortekamp, A. (ed.) Herbicides and Environment. InTech, London, pp. 493–514.
Bruen,T.C. et al. (2006) A simple and robust statistical test for detecting the presence of recombination. Genetics, 172, 2665–2681.
Carroll,D. et al. (2014) Genome engineering with TALENs and ZFNs: repair pathways and donor design. Methods, 69, 137–141.
Dexheimer,T.S. (2013) DNA repair pathways and mechanisms. In: Mathews, L.A. et al. (eds) DNA Repair of Cancer Stem Cells. Springer, Dordrecht, pp. 19–32.
Fletcher,W. et al. (2010) The effect of insertions, deletions, and alignment errors on the branch-site test of positive selection. Mol. Biol. Evol., 27, 2257–2267.
Gerton,J.L. et al. (2000) Global mapping of meiotic recombination hotspots and coldspots in the yeast Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. USA, 97, 11383–11390.
Gibbs,M.J. et al. (2000) Sister-scanning: a Monte Carlo procedure for assessing signals in recombinant sequences. Bioinformatics, 16, 573–582.
Gorbunova,V. et al. (2007) Changes in DNA repair during aging. Nucleic Acids Res., 35, 7466–7474.
Gray,K. et al. (2011) Role of DNA damage in atherosclerosis–bystander or participant? Biochem. Pharmacol., 82, 693–700.
Huang,D.W. et al. (2009) Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res., 37, 1–13.
Huson,D.H. et al. (2006) Application of phylogenetic networks in evolutionary studies. Mol. Biol. Evol., 23, 254–267.
Jackson,S.P. et al. (2009) The DNA-damage response in human biology and disease. Nature, 461, 1071–1078.
Jeggo,P.A. et al. (2016) DNA repair, genome stability and cancer: a historical perspective. Nat. Rev. Cancer, 16, 35–42.
Joshi,N. et al. (2015) Gradual implementation of the meiotic recombination program via checkpoint pathways controlled by global DSB levels. Mol. Cell, 57, 797–811.
Kachroo,A.H. et al. (2015) Systematic humanization of yeast genes reveals conserved functions and genetic modularity. Science, 348, 921–925.
Katoh,K. et al. (2013) MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol., 30, 772–780.
Kirchmair,J. et al. (2015) Predicting drug metabolism: experiment and/or computation? Nat. Rev. Drug Discov., 14, 387–404.
Koskull,H.V. et al. (1977) Distribution of chromosome breaks in measles, Fanconi’s anemia and controls. Hereditas, 87, 1–10.
Kreamer,N.N. et al. (2015) The ferrous iron-responsive BqsRS two-component system activates genes that promote cationic stress tolerance. MBio, 6, e02549-14.
Krzywinski,M. et al. (2009) Circos: an information aesthetic for comparative genomics. Genome Res., 19, 1639–1645.
Lam,I. et al. (2015) Nonparadoxical evolutionary stability of the recombination initiation landscape in yeast. Science, 350, 932–937.
Leandro,G.S. et al. (2015) The impact of base excision DNA repair in age-related neurodegenerative diseases. Mutat. Res., 776, 31–39.
Li,L. et al. (2003) OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res., 13, 2178–2189.
Löytynoja,A. et al. (2005) An algorithm for progressive multiple alignment of sequences with insertions. Proc. Natl. Acad. Sci. USA, 102, 10557–10562.
Magiorkinis,G. et al. (2003) In vivo characteristics of human immunodeficiency virus type 1 intersubtype recombination: determination of hot spots and correlation with sequence similarity. J. Gen. Virol., 84, 2715–2722.
Maher,R.L. et al. (2011) Coordination of DNA replication and recombination activities in the maintenance of genome stability. J. Cell. Biochem., 112, 2672–2682.
Mahmoudi,M. et al. (2006) DNA damage and repair in atherosclerosis. Cardiovasc. Res., 71, 259–268.
Martino,J. et al. (2016) The Shu complex is a conserved regulator of homologous recombination. FEMS Yeast Res., 16, fow073.
Martin,D. et al. (2000) RDP: detection of recombination amongst aligned sequences. Bioinformatics, 16, 562–563.
Martin,D.P. et al. (2015) RDP4: detection and analysis of recombination patterns in virus genomes. Virus Evol., 1, vev003.
Maynard,S. et al. (2015) DNA damage, DNA repair, aging, and neurodegeneration. Cold Spring Harb. Perspect. Med., 5, a025130.
Modliszewski,J.L. et al. (2017) Meiotic recombination gets stressed out: CO frequency is plastic under pressure. Curr. Opin. Plant Biol., 36, 95–102.
Padidam,M. et al. (1999) Possible emergence of new geminiviruses by frequent recombination. Virology, 265, 218–225.
Posada,D. et al. (2001) Evaluation of methods for detecting recombination from DNA sequences: computer simulations. Proc. Natl. Acad. Sci. USA, 98, 13757–13762.
Rotroff,D.M. et al. (2013) Real-time growth kinetics measuring hormone mimicry for ToxCast chemicals in T-47D human ductal carcinoma cells. Chem. Res. Toxicol., 26, 1097–1107.
Salminen,M.O. et al. (1995) Identification of breakpoints in intergenotypic recombinants of HIV type 1 by bootscanning. AIDS Res. Hum. Retroviruses, 11, 1423–1425.
Schmidt,E. et al. (2012) Expression of the Hutchinson-Gilford progeria mutation during osteoblast development results in loss of osteocytes, irregular mineralization, and poor biomechanical properties. J. Biol. Chem., 287, 33512–33522.
Skrzypek,M.S. et al. (2018) Saccharomyces genome database informs human biology. Nucleic Acids Res., 46, D736–D742.
Smith,J.M. (1992) Analyzing the mosaic structure of genes. J. Mol. Evol., 34, 126–129.
Suhane,T. et al. (2015) Both the charged linker region and ATPase domain of Hsp90 are essential for Rad51-dependent DNA repair. Eukaryot. Cell, 14, 64–77.
Sundararajan,A. et al. (2016) Gene evolutionary trajectories and GC patterns driven by recombination in Zea mays. Front. Plant Sci., 7, 1433.
Székvölgyi,L. et al. (2015) Initiation of meiotic homologous recombination: flexibility, impact of histone modifications, and chromatin remodeling. Cold Spring Harb. Perspect. Biol., 7, a016527.
Truong,L.N. et al. (2013) Microhomology-mediated End Joining and Homologous Recombination share the initial end resection step to repair DNA double-strand breaks in mammalian cells. Proc. Natl. Acad. Sci. USA, 110, 7720–7725.
Yang,Z. (2007) PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol., 24, 1586–1591.
Yang,Z.K. et al. (2018) The systematic analysis of ultraconserved genomic regions in the budding yeast. Bioinformatics, 34, 361–366.
Zhang,H. et al. (2014) Mechanisms controlling the smooth muscle cell death in progeria via down-regulation of poly(ADP-ribose) polymerase 1. Proc. Natl. Acad. Sci. USA, 111, E2261–E2270.
Zhang,Y. et al. (2013) Genetic diversity and recombination in natural populations of the nematode-trapping fungus Arthrobotrys oligospora from China. Ecol. Evol., 3, 312–325.

Journal

Bioinformatics, Oxford University Press

Published: May 3, 2018
