APE: Analyses of Phylogenetics and Evolution in R language

BIOINFORMATICS APPLICATIONS NOTE
Vol. 20 no. 2 2004, pages 289–290
DOI: 10.1093/bioinformatics/btg412
Bioinformatics 20(2) © Oxford University Press 2004; all rights reserved.

Emmanuel Paradis(1,*), Julien Claude(1) and Korbinian Strimmer(2)

(1) Laboratoire de Paléontologie, Paléobiologie and Phylogénie, Institut des Sciences de l’Évolution, Université Montpellier II, F-34095 Montpellier cédex 05, France and (2) Department of Statistics, University of Munich, Ludwigstrasse 33, D-80539 Munich, Germany

(*) To whom correspondence should be addressed.

Received on April 11, 2003; revised on July 11, 2003; accepted on July 29, 2003

ABSTRACT

Summary: Analysis of Phylogenetics and Evolution (APE) is a package written in the R language for use in molecular evolution and phylogenetics. APE provides both utility functions for reading and writing data and manipulating phylogenetic trees, as well as several advanced methods for phylogenetic and evolutionary analysis (e.g. comparative and population genetic methods). APE takes advantage of the many R functions for statistics and graphics, and also provides a flexible framework for developing and implementing further statistical methods for the analysis of evolutionary processes.

Availability: The program is free and available from the official R package archive at http://cran.r-project.org/src/contrib/PACKAGES.html#ape. APE is licensed under the GNU General Public License.

Contact: [email protected]

Phylogenetic analysis, in its broad sense, covers a very wide range of methods from computing evolutionary distances, reconstructing gene trees, estimating divergence dates, to the analysis of comparative data, estimation of evolutionary rates and analysis of diversification. All these diverse tasks have one particular aspect in common: they rely heavily on computational statistics.

The R system, a free platform-independent open-source analysis environment, has recently emerged as the de facto standard for statistical computing and graphics (Ihaka and Gentleman, 1996). One advantage of R is that it can be easily tailored to a particular application area by writing specialized packages. In particular, the usefulness of R in bioinformatics has already been impressively demonstrated in the analysis of gene expression data (http://www.bioconductor.org).

Analysis of Phylogenetics and Evolution (APE) is the first joint effort to utilize the power of R also in the analysis of phylogenetic and evolutionary data. APE focuses on statistical analyses using phylogenetic and genealogical trees as input.

In Version 1.1, APE provides functions for reading, writing, plotting and manipulating phylogenetic trees, analyses of comparative data in a phylogenetic framework, analysis of diversification, computing distances from allelic and nucleotide data, reading nucleotide sequences and several other tools, such as Mantel’s test, computation of minimum spanning tree or estimation of population genetics parameters. Table 1 gives an overview of the functions currently available in APE. Note that some of the methods (e.g. comparative method, skyline plot, etc.) have previously been available only in specialized software. External tree reconstruction programs (such as PHYLIP) can be called from R through standard shell commands.

One strength of R is that it is straightforward to obtain publication-quality graphical output, particularly with its PostScript device. For instance, the plotting function of phylogenies in APE handles colors, line thickness, font, spacing of labels, which can be defined separately for each branch, so that three different variables can be represented on a single phylogeny plot. APE also produces complex population genetics plots, such as the generalized skyline plot (Strimmer and Pybus, 2001), with a single command.

APE, like any R package, is command-line driven. The functions are called by the user, possibly with arguments and options. Any session using APE in R starts with the command

    library(ape)

which makes the functions of APE available in the R environment. The list of these functions can be displayed with the command

    library(help = ape)

which displays their names with a brief description. An evolutionary tree saved on the disk in the text file tree1.txt in the standard Newick parenthetic format can then be read by

    tree1 <- read.tree('tree1.txt')

This stores the phylogenetic tree in an object named tree1 of class ‘phylo’. The information stored in this object (e.g. branch lengths) can be inspected by typing tree1, and graphical output in the form of a cladogram can be obtained by executing

    plot(tree1)

which actually calls the function plot.phylo of APE to draw the phylogenetic tree tree1 [due to the object-oriented nature of R the command plot(x) may give a completely different result depending on the class of x]. The tree is plotted, by default, on a graphical window, but can be exported in various file formats depending on the operating system.

In addition to this trivial example, the representation of a phylogenetic tree in an object-oriented structure results in straightforward manipulation of the phylogenetic data for various computations used in evolutionary analyses. Currently implemented in APE are approaches such as phylogenetically independent contrasts (Felsenstein, 1985; Harvey and Pagel, 1991), fitting birth–death models (Nee et al., 1994; Pybus and Harvey, 2000), population-genetic analysis (Nee et al., 1995; Strimmer and Pybus, 2001), non-parametric smoothing of evolutionary rates (Sanderson, 1997) and estimation of groups of genes in phylogenetic trees using Klastorin’s method (Misawa and Tajima, 2000). Furthermore, distance-based clustering methods as implemented in the R function hclust can be used by APE using functions converting to and from objects of class ‘phylo’ and ‘hclust’.

All R functions available in APE (Table 1) are documented in the R hypertext format and information regarding their use can be retrieved by applying the help command, e.g.

    help(read.tree)

The classes and methods in APE (like phylo) can also easily be further extended to include other functionalities, for instance to annotate phylogenetic trees. Thus, APE is not only a data analysis package, it is also an environment for developing and implementing new methods. Furthermore, programs written in C, C++ or Fortran77 can be linked and called from R. This is particularly useful for computer-intensive calculations.

Table 1. Special functions available in APE 1.1

    Application          Available commands
    Input/output         read.dna, write.dna, read.nexus, write.nexus, read.tree, write.tree, read.GenBank
    Graphics             add.scale.bar, plot.mst, plot.phylo, plot.skyline, lines.skyline, ltt.plot
    Tree manipulation    bind.tree, drop.tip, is.binary.tree, is.ultrametric
    Comparative method   compar.gee, compar.lynch, pic, vcv.phylo
    Diversification      birthdeath, cherry, diversi.gof, diversi.time, gamma.stat
    Population genetics  branching.times, coalescent.intervals, collapsed.intervals, find.skyline.epsilon, heterozygosity, skylineplot, skyline, theta.h, theta.k, theta.s
    Molecular dating     chronogram, ratogram, NPRS.criterion
    Miscellaneous        all.equal.phylo, balance, base.freq, dist.dna, dist.gene, dist.phylo, GC.content, klastorin, mantel.test, mst, summary.phylo
    Data sets            bird.families, bird.orders, hivtree, landplants, opsin, woodmouse, xenarthra

Detailed information about each function can be accessed with the online help [e.g. help(mantel.test)].

ACKNOWLEDGEMENTS

We thank two anonymous referees for their constructive comments on a previous version of this paper. This research was financially supported by the Programme inter-EPST ‘Bioinformatique’ (E.P. and J.C.) and by an Emmy-Noether research grant from the DFG (K.S.). This is publication 2003–053 of the Institut des Sciences de l’Évolution (Unité Mixte de Recherche 5554 du Centre National de la Recherche Scientifique).

REFERENCES

Felsenstein,J. (1985) Phylogenies and the comparative methods. Am. Nat., 125, 1–15.
Harvey,P.H. and Pagel,M.D. (1991) The Comparative Method in Evolutionary Biology. Oxford University Press, Oxford.
Ihaka,R. and Gentleman,R. (1996) R: a language for data analysis and graphics. J. Comput. Graph. Statist., 5, 299–314.
Misawa,K. and Tajima,F. (2000) A simple method for classifying genes and a bootstrap test for classifications. Mol. Biol. Evol., 17, 1879–1884.
Nee,S., Holmes,E.C., Rambaut,A. and Harvey,P.H. (1995) Inferring population history from molecular phylogenies. Phil. Trans. R. Soc. Lond. B, 349, 25–31.
Nee,S., May,R.M. and Harvey,P.H. (1994) The reconstructed evolutionary process. Phil. Trans. R. Soc. Lond. B, 344, 305–311.
Pybus,O.G. and Harvey,P.H. (2000) Testing macro-evolutionary models using incomplete molecular phylogenies. Proc. R. Soc. Lond. B, 267, 2267–2272.
Sanderson,M.J. (1997) A nonparametric approach to estimating divergence times in the absence of rate constancy. Mol. Biol. Evol., 14, 1218–1231.
Strimmer,K. and Pybus,O.G. (2001) Exploring the demographic history of a sample of DNA sequences using the generalized skyline plot. Mol. Biol. Evol., 18, 2298–2305.
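
The session walked through in the article (loading the package, reading a Newick tree, plotting it, then reusing the ‘phylo’ object for further analyses) can be sketched in R as follows. This is only an illustrative sketch under stated assumptions: the five-taxon tree, the trait values and the temporary file names are made up for the example and do not come from the paper, and the calls are written against current releases of the ape package, whose interface may differ in detail from the version 1.1 described above.

```r
library(ape)

# Hypothetical five-taxon ultrametric tree in Newick format (not from the paper).
newick <- "((A:1,B:1):2,(C:2,(D:1,E:1):1):1);"

# Write it to a temporary file that stands in for the article's tree1.txt.
tree_file <- tempfile(fileext = ".txt")
writeLines(newick, tree_file)

# Read the tree; the result is an object of class "phylo".
tree1 <- read.tree(tree_file)
class(tree1)

# Branch lengths are stored as a component of the object.
tree1$edge.length

# plot() dispatches to plot.phylo because of the object's class.
# Here the plot is exported to a PDF file instead of a screen window.
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file)
plot(tree1)
invisible(dev.off())

# Tree manipulation: remove one tip.
tree2 <- drop.tip(tree1, "E")

# Node ages, meaningful here because the tree is ultrametric.
stopifnot(is.ultrametric(tree1))
ages <- branching.times(tree1)

# Phylogenetically independent contrasts for a made-up trait,
# named by tip label so values are matched to the right tips.
x <- c(A = 1.2, B = 1.9, C = 3.1, D = 2.4, E = 2.8)
contrasts <- pic(x, tree1)

# Conversion to and from the 'hclust' class used by R's clustering functions.
hc <- as.hclust(tree1)
tree3 <- as.phylo(hc)
```

The conversion to ‘hclust’ is shown on an ultrametric, fully bifurcating tree because that is what a dendrogram represents; a tree without those properties would be expected to need preprocessing first.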

Bioinformatics, Volume 20 (2): 2 – Jan 22, 2004


Publisher
Oxford University Press
Copyright
© Oxford University Press 2004; all rights reserved.
ISSN
1367-4803
eISSN
1460-2059
DOI
10.1093/bioinformatics/btg412

Journal

Bioinformatics, Oxford University Press

Published: Jan 22, 2004
