BIOINFORMATICS APPLICATIONS NOTE
Vol. 21 no. 2 2005, pages 260–262
doi:10.1093/bioinformatics/bth490

RDP2: recombination detection and analysis from sequence alignments

D. P. Martin (1,*), C. Williamson (1) and D. Posada (2)

(1) Institute of Infectious Diseases and Molecular Medicine, University of Cape Town, Cape Town 7000, South Africa and (2) Department of Biochemistry, Genetics and Immunology, University of Vigo, 36200 Vigo, Spain

Received on April 20, 2004; revised on June 28, 2004; accepted on August 13, 2004
Advance Access publication September 17, 2004

(*) To whom correspondence should be addressed.

Bioinformatics vol. 21 issue 2 © Oxford University Press 2004; all rights reserved.

ABSTRACT

Summary: RDP2 is a Windows 95/XP program that examines nucleotide sequence alignments and attempts to identify recombinant sequences and recombination breakpoints using 10 published recombination detection methods, including geneconv, bootscan, maximum χ², chimaera and sister scanning. The program enables fast automated analysis of large alignments (up to 300 sequences containing 13 000 sites), and interactive exploration, management and verification of results with different recombination detection and tree drawing methods.

Availability: RDP2 is available free from the RDP2 website (http://darwin.uvigo.es/rdp/rdp.html)

Contact: darren@science.uct.ac.za

Supplementary information: Detailed descriptions of RDP2 and the methods it implements are included in the program manual, which can be downloaded from the RDP2 website.

A major problem encountered while using standard phylogenetic methods in studies involving recombining organisms is that the evolutionary history of a recombinant sequence cannot be described with a single phylogenetic tree. A single recombinant sequence in an alignment can seriously influence the branching order and branch lengths of the phylogenetic trees constructed using the alignment (Posada and Crandall, 2002). In addition, recombination compromises the validity of several phylogenetic inferences one can make by examining trees (Schierup and Hein, 2000a,b). A number of computational tools for detecting and quantifying various aspects of recombination have therefore been developed (for a list of available recombination detection programs see http://www.umber.embnet.org/~robertson/recombination/index.shtml). A comparison of the recombination detection power of 14 of these methods using simulated and real datasets indicated that while some always performed better than others, no single method can be adjudged to be best in detecting recombination under all conditions (Posada and Crandall, 2001; Posada, 2002).

Sharing major components of its user interface and the RDP recombination detection method with its predecessor, RDP, RDP2 implements a variety of additional non-parametric recombination detection methods (i.e. methods that do not make use of population genetic models and make no attempt to estimate the population recombination rate; Table 1). Among the new inclusions are many methods that have performed well in comparative tests (Drouin et al., 1999; Posada and Crandall, 2001; Posada, 2002). We have focused on published methods that can be used to (1) identify recombinant sequences, (2) identify recombination breakpoints and (3) identify parental sequences.

Table 1. A brief description of recombination detection methods implemented in RDP2

Method (a.k.a.)                      Sequence          Variable (V)/All (A)   Sliding   Automated   References
                                     comparisons (a)   sites scanned (b)      window    scans (c)
rdp (RDP method)                     T                 V                      +         +           Martin and Rybicki (2000)
geneconv (Sawyer's runs test)        T/D               V                      −         +           Padidam et al. (1999)
bootscan                             T                 A                      +         +           Salminen et al. (1995)
maximum χ² (MaxChi)                  T/D               V                      +         +           Maynard-Smith (1992)
chimaera                             T                 V                      +         +           Posada and Crandall (2001)
sister scanning (SiScan)             T/F               A                      +         +           Gibbs et al. (2000)
lard                                 T                 A                      −         −           Holmes et al. (1999)
distance plot (SimPlot)              T/D               A                      +         −           Lole et al. (1999)
topal                                T                 A                      +         −           McGuire and Wright (2000)
reticulate (compatibility matrix)    F                 V                      −         −           Jakobsen and Easteal (1996)

(a) T, every possible combination of three sequences in an alignment scanned; D, every possible combination of two sequences in an alignment scanned with variable sites inferred from full alignment; and F, full alignment or substantial part thereof (4+ sequences) scanned with variable sites inferred only from the sequences being scanned.
(b) The exact subset of sites scanned will differ between methods and can also differ for the same method with different program settings.
(c) Only six methods can be used to automatically identify recombinant sequences and breakpoints from an alignment. Methods can also be run in either a manual or a checking mode allowing users to test specific recombination hypotheses.

The program can use any combination of six methods (rdp, geneconv, maximum χ², bootscan, chimaera and sister scanning) to automatically identify recombinant and parental sequences, estimate breakpoint positions and calculate probability scores for potential recombination events. Once all potential recombination events are identified, RDP2 sorts analysis results and attempts to determine the number of unique recombination events identifiable in an alignment. RDP2 can be set to automatically (1) filter out unique events detected by fewer than a specified number of methods, (2) identify consensus daughter and parental sequences using all evidence for a single actual recombination event (often involving many potential parental and daughter sequence combinations detected using multiple methods) and (3) use all evidence for a single actual event to determine the most probable breakpoint positions using a modified maximum χ² approach (Maynard-Smith, 1992).

RDP2 permits exploration and checking of analysis results in a highly interactive and user-friendly way. For any detected recombination event, information such as the method used to detect the event, breakpoint positions, parental sequences, probability values, degrees of agreement with results obtained using other detection methods, raw plot data, informative sites in the alignment and phylogenetic trees can be displayed by simply clicking on a graphical representation of the event. Once an event is selected for more detailed study, checking the evidence for recombination using 10 different recombination detection methods (besides the six automated methods these also include lard, topal, reticulate and distance plots) is achieved by simply selecting the methods from a menu. To further aid in evaluating evidence for recombination, RDP2 can also use phylip components (Felsenstein, 1989; Olsen et al., 1994) to simultaneously display phylogenetic trees (UPGMA, bootstrapped neighbor-joining, least squares or maximum-likelihood) constructed from different portions of an alignment.

As the amount of detectable recombination in an alignment increases, the complexity of correctly inferring which sequences are parental and which are recombinant increases as well. RDP2 encourages user verification of its analysis results and permits user acceptance and rejection of potential recombination events (useful for tracking the progress of an analysis), and interactive 'correction' of apparent parental and daughter sequence misidentification.

We have not placed any restrictions on the size of alignments that can be examined using RDP2. For example, automated analyses using all the detection methods together on a PC with 256 MB RAM and a 1 GHz Celeron Processor can take 5 min for a 50 sequence alignment of 3 kb long sequences and less than 48 h for a 316 sequence alignment of 13 kb long sequences.

ACKNOWLEDGEMENTS

We would like to thank Stanley Sawyer, Andrew Rambaut, Ingrid Jakobsen, Joseph Felsenstein, Gary Olsen, Adrian Gibbs and John Armstrong for either agreeing to have their programs distributed using RDP2 or providing pieces of code in RDP2. We also thank The National Research Foundation of South Africa (D.P.M.), US National Institutes of Health (D.P.) and the 'Ramón y Cajal' programme of the Spanish government (D.P.) for partially funding the development and distribution of RDP2.

REFERENCES

Drouin,G., Prat,F., Ell,M. and Clarke,G.D.P. (1999) Detecting and characterizing gene conversions between multigene family members. Mol. Biol. Evol., 16, 1369–1390.
Felsenstein,J. (1989) PHYLIP - Phylogeny Inference Package (Version 3.2). Cladistics, 5, 164–166.
Gibbs,M.J., Armstrong,J.S. and Gibbs,A.J. (2000) Sister-scanning: a Monte Carlo procedure for assessing signals in recombinant sequences. Bioinformatics, 16, 573–582.
Holmes,E.C., Worobey,M. and Rambaut,A. (1999) Phylogenetic evidence for recombination in Dengue virus. Mol. Biol. Evol., 16, 405–409.
Jakobsen,I.B. and Easteal,S. (1996) A program for calculating and displaying compatibility matrices as an aid in determining reticulate evolution in molecular sequences. Comput. Appl. Biosci., 12, 291–295.
Lole,K.S., Bollinger,R.C., Paranjape,R.S., Gadkari,D., Kulkarni,S.S., Novak,N.G., Ingersoll,R., Sheppard,H.W. and Ray,S.C. (1999) Full-length human immunodeficiency virus type 1 genomes from subtype C-infected seroconverters in India, with evidence of intersubtype recombination. J. Virol., 73, 152–160.
Martin,D. and Rybicki,E. (2000) RDP: detection of recombination amongst aligned sequences. Bioinformatics, 16, 562–563.
Maynard-Smith,J. (1992) Analyzing the mosaic structure of genes. J. Mol. Evol., 34, 126–129.
McGuire,G. and Wright,F. (2000) TOPAL 2.0: improved detection of mosaic sequences within multiple alignments. Bioinformatics, 16, 130–134.
Olsen,G.J., Matsuda,H., Hagstrom,R. and Overbeek,R. (1994) fastDNAml: a tool for construction of phylogenetic trees of DNA sequences using maximum likelihood. Comput. Appl. Biosci., 10, 41–48.
Padidam,M., Sawyer,S. and Fauquet,C.M. (1999) Possible emergence of new geminiviruses by frequent recombination. Virology, 265, 218–225.
Posada,D. (2002) Evaluation of methods for detecting recombination from DNA sequences: empirical data. Mol. Biol. Evol., 19, 708–717.
Posada,D. and Crandall,K.A. (2001) Evaluation of methods for detecting recombination from DNA sequences: computer simulations. Proc. Natl Acad. Sci. USA, 98, 13757–13762.
Posada,D. and Crandall,K.A. (2002) The effect of recombination on the accuracy of phylogeny estimation. J. Mol. Evol., 54, 396–402.
Salminen,M.O., Carr,J.K., Burke,D.S. and McCutchan,F.E. (1995) Identification of breakpoints in intergenotypic recombinants of HIV type 1 by bootscanning. AIDS Res. Hum. Retroviruses, 11, 1423–1425.
Schierup,M.H. and Hein,J. (2000a) Consequences of recombination on traditional phylogenetic analysis. Genetics, 156, 879–891.
Schierup,M.H. and Hein,J. (2000b) Recombination and the molecular clock. Mol. Biol. Evol., 17, 1578–1579.
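
As a reader's illustration of two ideas from the text, the following minimal Python sketch shows (i) filtering out events detected by fewer than a specified number of methods and (ii) how the number of three-sequence combinations scanned by the 'T' methods in Table 1 grows with alignment size. It is not RDP2 code: the event records, the function names filter_events and triplets, and the example sequence names are all assumptions made for illustration only.

```python
from itertools import combinations
from math import comb


def filter_events(events, min_methods):
    """Keep events supported by at least `min_methods` distinct methods.

    `events` is a hypothetical list of dicts, not RDP2's own data format.
    """
    return [e for e in events if len(set(e["methods"])) >= min_methods]


def triplets(sequence_names):
    """Return every combination of three sequences (the 'T' scan in Table 1)."""
    return combinations(sequence_names, 3)


# Hypothetical potential recombination events, for illustration only.
events = [
    {"recombinant": "seqA", "methods": ["rdp", "geneconv", "bootscan"]},
    {"recombinant": "seqB", "methods": ["chimaera"]},
]

# Discard events detected by fewer than two methods.
kept = filter_events(events, min_methods=2)
print([e["recombinant"] for e in kept])

# Number of triplets for the two alignment sizes quoted in the text.
print(comb(50, 3), comb(316, 3))
```

With these inputs only seqA survives the filter, and the triplet counts are 19 600 for 50 sequences and 5 209 260 for 316 sequences. The roughly 266-fold growth in triplets, together with the longer sequences, is consistent with the reported increase in run time from minutes to hours, although the actual cost per triplet in RDP2 is not stated in the text.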
Bioinformatics – Oxford University Press
Published: Sep 17, 2004