DroPhEA: Drosophila phenotype enrichment analysis for insect functional genomics

BIOINFORMATICS APPLICATIONS NOTE
Vol. 27 no. 22 2011, pages 3218–3219
doi:10.1093/bioinformatics/btr530
Databases and ontologies
Advance Access publication October 5, 2011

Meng-Pin Weng and Ben-Yang Liao
Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan, Miaoli County 350, Taiwan, R.O.C.
Associate Editor: Martin Bishop

ABSTRACT

Summary: DroPhEA is a core module of a web application that facilitates research in insect functional genomics through enrichment analysis on mutant phenotypes of fruit fly (Drosophila melanogaster). The phenotypes investigated in the analyses can be predefined by FlyBase or customized by users. DroPhEA allows users to specify mutation or ortholog types, displays enriched term results in a hierarchical structure and supports analyses on gene sets of all insect species with a fully sequenced genome.
Availability: http://evol.nhri.org.tw/phenome/DroPhEA/
Contact: liaoby@nhri.org.tw
Supplementary Information: Supplementary data are available at Bioinformatics online.

Received on May 6, 2011; revised on September 9, 2011; accepted on September 16, 2011

1 INTRODUCTION

Enrichment analysis is a promising strategy for investigators to biologically interpret gene lists obtained from high-throughput studies in the post-genomic era. Among the backend annotations employed in different enrichment tools (see review in Huang et al., 2009), mutant phenotype annotation is unique because it represents the consequence of altering the information output of a gene. Therefore, annotations of phenotypes are an ideal means to aid in the understanding of how a set of genes function within the context of the whole organism.

Studies in fruit-fly biology were first documented in the early 1900s (Rubin and Lewis, 2000); over the last century, Drosophila has become one of the most widely used model organisms for animal genetics and developmental biology. As an insect species, in addition to basic genetics studies, Drosophila has served as a model to assist in agricultural and epidemiological investigations. The richness and comprehensiveness of fly omics data have been invaluable to many fields of biology. Currently, the proportion of mutated and phenotyped fly genes (36.9%; 5606 of 15 191 genes mapped to the genome) is approaching that of the house mouse (Mus musculus) [∼40% of the genes comprising the mouse genome have been phenotyped; Weng and Liao (2010)]. However, with the exception of the house mouse (Chen et al., 2009; Weng and Liao, 2010), an enrichment tool based on mutant phenotypes of another species has never been developed. Some bioinformatic tools for insect genomics, such as FlyMine (Lyne et al., 2007), do not implement mutant phenotype data. Insects and mammals diverged from each other >900 million years ago (Blair et al., 2005). Due to the substantial anatomical, physiological, developmental and behavioral differences between insects and mammals, it is challenging to utilize mammalian phenotypes to explore insect functional genomics.

We therefore developed DroPhEA (Drosophila Phenotype Enrichment Analysis). Similar to MamPhEA, a phenotype enrichment tool for analyses on mammalian genes (Weng and Liao, 2010), DroPhEA provides several useful features. First, DroPhEA allows enrichment analysis not only on phenotypes predefined by FlyBase (http://flybase.org/), but also on customized phenotypes to study complex traits. Second, different types of mutations exhibit distinct impacts on protein function; to remove potential biases caused by the use of data derived from differential mutagenesis approaches (Liao and Zhang, 2008), DroPhEA enables users to perform analyses that exclude phenotypes derived from gain-of-function mutations. Third, DroPhEA generates graphical and downloadable output displaying the enriched or depleted phenotypes according to the hierarchical structure of the phenotypic classification. Finally, due to the paucity of phenotypic data in insect species other than Drosophila melanogaster, DroPhEA supports analyses of genes orthologous to D.melanogaster for all insect species with a fully sequenced genome (17 to date). Through integration with MamPhEA, DroPhEA is also capable of analyzing mammalian genes.

2 IMPLEMENTATION

Mutant fly phenotypic data were retrieved from FlyBase (version FB2011_03) (http://flybase.org/). Phenotypic entries resulting from mutations affecting multiple loci were excluded. FlyBase applies two hierarchically structured and controlled lexicons to describe phenotypes: (i) ‘Phenotypic class’ (FBcv term) represents the pathology, or the effect of the mutation on the whole organism (e.g. lethal, sterile) and (ii) ‘Anatomy’ (FBbt term) describes the body part affected by the mutation (e.g. eye, antenna) (Drysdale, 2001). Phenotypic descriptions of a fly mutant allele are often expressed as a compound statement comprising multiple terms. In such cases, we treated each term in the compound statement individually. In fact, we found that each compound statement usually contained only one ‘phenotypic class’ FBcv term or ‘anatomy’ FBbt term, and the remaining terms were used as ‘qualifiers’ (FBcv:0000005; e.g. ‘dominant’) or ‘occurrent’ terms (FBdv:00007008; e.g. ‘pupal stage P5’), among others. Consequently, most of the compound phenotypic descriptions can simply be expressed as a ‘phenotypic class’ FBcv term, or ‘anatomy’ FBbt term for subsequent analyses. The analysis of gene sets of other insect species was supported by obtaining orthology information of each species to D.melanogaster from the InParanoid database (version 7.0, http://inparanoid.sbc.su.se/).

DroPhEA typically compares two gene sets. When one gene set is provided, it is compared with the rest of the genes in the genome. Genes without a ‘phenotypic class’ FBcv term or ‘anatomy’ FBbt term are excluded prior to the analysis. Significantly enriched or depleted phenotypes are detected by Fisher’s exact test. P-values can be Bonferroni corrected for multiple tests. DroPhEA allows users to customize phenotypes by combining existing FlyBase controlled lexicons by a keyword search, or by browsing FBcv or FBbt term ontologies. Enrichment analysis applying customized phenotypes gives DroPhEA the capability of exploring complex traits, such as gene essentiality (see Examples 1.1 and 1.2 in Section 3.1 below).

By default, DroPhEA generates graphical output displaying a hierarchical structure of enriched/depleted phenotypes. DroPhEA also produces classic output in a simple linear text format showing differentially enriched phenotypes. JavaServer Pages (JSP) and a MySQL database on an Apache Tomcat server were used to build the web interface.

3 EXAMPLES

Two examples provided on the DroPhEA website are used to illustrate the use of DroPhEA in hypothesis testing and knowledge discovery.

3.1 Essentiality of evolutionarily conserved (Example 1.1) and lineage-specific genes (Example 1.2)

Genes ubiquitous in genomes across a wide range of the phylogenetic spectrum are likely to perform fundamental biological functions throughout different taxa and taxonomic levels, and are therefore expected to be more critical to individual fitness (survival or reproduction). For the fruit fly, genes vital to fitness (essential genes) can be defined by association with the lethal or sterile mutant phenotype (Liao and Zhang, 2008). Therefore, we created the ‘essentiality’ phenotype by combining the controlled lexicons FBcv:0000351 (lethal) and FBcv:0000364 (sterile). Consistent with our expectation, the results indicated that the 3151 Drosophila genes with one human ortholog (retrieved from BioMart version 61) exhibited a strong enrichment in the customized phenotype ‘essentiality’ relative to the genomic background (P = 1.17E-4). Similarly, we examined 7852 fly lineage-specific genes, defined as genes without an ortholog in the human genome based on the BioMart annotation. In contrast to the results of the fly-human one-to-one orthologs, the fly lineage-specific genes were significantly depleted in the ‘essentiality’ phenotype (P = 1.91E-40). This example demonstrates that DroPhEA is a promising bioinformatic tool to explore complex traits.

3.2 Genes associated with mosquito blood-feeding behavior (Example 2)

Blood-feeding behavior is an important characteristic of mosquito species. To elucidate the genetic components in the insect genome that have been acquired or are associated with the adaptation for hematophagy, we downloaded the microarray data of the blood-feeding malaria mosquito (Anopheles gambiae) (Marinotti et al., 2005) from VectorBase (release December 2010) (http://www.vectorbase.org/). The 20% (2042/10 207) of mosquito genes shown to exhibit the highest increases in expression signals after a blood meal were analyzed with DroPhEA. Consistent with our current understanding that mosquito hematophagy is required for oocyte development (Dana et al., 2005; Marinotti et al., 2005), the results indicated enrichment of the input gene set with several reproduction-related or cell cycle-related phenotypes; furthermore, results showed the input gene set was depleted in development, nervous system, muscle, sensory organ and behavior phenotypes (Supplementary File 1). We also conducted enrichment analyses using Example 2 gene sets on GO terms and KEGG pathways for comparisons. Significantly enriched KEGG pathways were not detected; however, many enriched/depleted GO terms in Biological Process were reported (Supplementary File 2), and these were notably consistent with DroPhEA output. Despite the similarity in results, many enriched/depleted terms reported by DroPhEA describe traits at the organismal level (e.g. sterile, viable, circadian rhythm defective, among many others), which are not included in GO or KEGG annotations. Therefore, mutant phenotypes are clearly invaluable sources of complementary data to augment GO/KEGG in bioinformatics analyses in the study of functional genomics.

4 CONCLUSION AND PERSPECTIVE

DroPhEA is an online tool used to explore insect functional genomics through enrichment analysis of D.melanogaster phenotypes. Modules for enrichment analyses on phenotypes of model organisms other than mouse and fly will be added with increased availability and improved annotations of eukaryotic phenotypic data in the future. The backend databases of DroPhEA are automatically updated every 2 months. The tutorial of DroPhEA is available online at http://evol.nhri.org.tw/phenome/.

Funding: National Health Research Institutes Intramural Fund and National Science Council Grants (NSC 99-2311-B-400-003-MY2 and NSC 100-2319-B-400-001 to B.-Y.L.).

Conflict of Interest: none declared.

REFERENCES

Blair,J.E. et al. (2005) Evolutionary sequence analysis of complete eukaryote genomes. BMC Bioinformatics, 6, 53.
Chen,J. et al. (2009) ToppGene Suite for gene list enrichment analysis and candidate gene prioritization. Nucleic Acids Res., 37, W305–W311.
Dana,A.N. et al. (2005) Gene expression patterns associated with blood-feeding in the malaria mosquito Anopheles gambiae. BMC Genomics, 6, 5.
Drysdale,R. (2001) Phenotypic data in FlyBase. Brief. Bioinformatics, 2, 68–80.
Huang,D.W. et al. (2009) Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res., 37, 1–13.
Liao,B.Y. and Zhang,J. (2008) Null mutations in human and mouse orthologs frequently result in different phenotypes. Proc. Natl Acad. Sci. USA, 105, 6987–6992.
Lyne,R. et al. (2007) FlyMine: an integrated database for Drosophila and Anopheles genomics. Genome Biol., 8, R129.
Marinotti,O. et al. (2005) Microarray analysis of genes showing variable expression following a blood meal in Anopheles gambiae. Insect Mol. Biol., 14, 365–373.
Rubin,G.M. and Lewis,E.B. (2000) A brief history of Drosophila’s contributions to genome research. Science, 287, 2216–2218.
Weng,M.P. and Liao,B.Y. (2010) MamPhEA: a web tool for mammalian phenotype enrichment analysis. Bioinformatics, 26, 2212–2213.
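The enrichment test outlined in the Implementation section (exclude unannotated genes, compare an input gene set with the remaining genes, apply Fisher's exact test per phenotype, then Bonferroni correct) can be sketched as follows. This is a minimal illustration only, not the DroPhEA implementation, which the article describes as a JSP/MySQL web application. All function names, the gene identifiers and the toy annotations are hypothetical, and the two-sided test uses the common "sum of tables no more probable than the observed one" convention, which is an assumption about how the P-values might be computed.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a: input-set genes with the phenotype,  b: input-set genes without it,
    c: background genes with the phenotype, d: background genes without it.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x phenotype-positive genes in the input set.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more probable than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def bonferroni(p_values):
    """Bonferroni correction: multiply each P-value by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def combine_terms(gene_to_terms, name, members):
    """Add a customized phenotype `name` to every gene carrying any term in `members`."""
    return {g: (terms | {name}) if terms & members else set(terms)
            for g, terms in gene_to_terms.items()}


def phenotype_enrichment(input_genes, all_genes, gene_to_terms):
    """Return (term, direction, raw P, Bonferroni P) for every phenotype term."""
    # Genes without any phenotype term are excluded prior to the analysis.
    fg = {g for g in input_genes if gene_to_terms.get(g)}
    bg = {g for g in all_genes if gene_to_terms.get(g)} - fg
    terms = sorted({t for g in fg | bg for t in gene_to_terms[g]})
    rows = []
    for t in terms:
        a = sum(t in gene_to_terms[g] for g in fg)
        c = sum(t in gene_to_terms[g] for g in bg)
        b, d = len(fg) - a, len(bg) - c
        if a * d > b * c:
            direction = "enriched"
        elif a * d < b * c:
            direction = "depleted"
        else:
            direction = "neutral"
        rows.append((t, direction, fisher_exact_two_sided(a, b, c, d)))
    adjusted = bonferroni([p for _, _, p in rows])
    return [(t, direction, p, q) for (t, direction, p), q in zip(rows, adjusted)]


# Toy annotations (hypothetical gene IDs; g6 has no phenotype and is dropped).
toy = {"g1": {"lethal"}, "g2": {"lethal"}, "g3": {"sterile"},
       "g4": {"eye"}, "g5": {"eye"}, "g6": set()}
# A customized 'essentiality' phenotype built from lethal and sterile.
toy_custom = combine_terms(toy, "essentiality", {"lethal", "sterile"})
results = phenotype_enrichment(["g1", "g2"], list(toy), toy_custom)
for row in results:
    print(row)
```

Only the standard library is used here so that the arithmetic is visible; in practice a statistics package would normally replace the hand-written test. The `combine_terms` helper mirrors the way the ‘essentiality’ phenotype in Section 3.1 is built from the lethal and sterile terms.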

Bioinformatics, Volume 27 (22): 2 – Oct 5, 2011


Publisher
Oxford University Press
Copyright
© The Author 2011. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
ISSN
1367-4803
eISSN
1460-2059
DOI
10.1093/bioinformatics/btr530
pmid
21976423

