The MIPS mammalian protein–protein interaction database

Vol. 21 no. 6 2005, pages 832–834
BIOINFORMATICS APPLICATIONS NOTE
doi:10.1093/bioinformatics/bti115
Databases and ontologies

Philipp Pagel (1), Stefan Kovac (1), Matthias Oesterheld (1), Barbara Brauner (1), Irmtraud Dunger-Kaltenbach (1), Goar Frishman (1), Corinna Montrone (1), Pekka Mark (2), Volker Stümpflen (1), Hans-Werner Mewes (1,2), Andreas Ruepp (1) and Dmitrij Frishman (1,2,∗)

(1) Institute for Bioinformatics/MIPS, GSF-National Research Center for Environment and Health, Ingolstädter Landstraße 1, 85764 Neuherberg, Germany
(2) Department of Genome Oriented Bioinformatics, Technische Universität München, Wissenschaftszentrum Weihenstephan, 85350 Freising, Germany

∗ To whom correspondence should be addressed.

Received on September 3, 2004; revised and accepted on October 21, 2004
Advance Access publication November 5, 2004

ABSTRACT

Summary: The MIPS mammalian protein–protein interaction database (MPPI) is a new resource of high-quality experimental protein interaction data in mammals. The content is based on published experimental evidence that has been processed by human expert curators. We provide the full dataset for download and a flexible and powerful web interface for users with various requirements.

Availability: The MPPI database is located at http://mips.gsf.de/proj/ppi/

Contact: d.frishman@wzw.tum.de

1 INTRODUCTION

Protein–protein interactions (PPI) determine biological processes at many levels of cellular complexity, from basic metabolism to cell differentiation. Their importance is reflected by the number of protein interaction experiments described in the life science literature and the increasing interest in high-throughput techniques such as yeast two-hybrid (Ito et al., 2001; Uetz et al., 2000) and large scale mass spectroscopy of protein complexes (Ho et al., 2002; Gavin et al., 2002). Computational analyses of experimental data as well as in silico predictions are important tools in the effort to increase our understanding of cellular architecture. In addition to the necessity of a complete and in-depth knowledge of PPI networks for the understanding of cellular biology, they are highly interesting for target selection aimed at pharmaceutical applications.

Until recently, most of the databases and large scale experiments on PPI were derived from microorganisms, most prominently Saccharomyces cerevisiae. While yeast is the best established model organism, many open questions concerning higher eukaryotes involve features not present in this organism. Especially, work with potential medical implications often requires mammalian models. Despite its practical relevance, comparatively little PPI data from mammals has been available in public databases like BIND (Bader et al., 2003), DIP (Salwinski et al., 2004) and MINT (Zanzoni et al., 2002). Recent efforts by database maintainers and experimental researchers have started to greatly improve this situation. Scientific literature is rich with experiments demonstrating such interactions utilizing a large number of technical approaches. Our goal was to harvest this abundance of available literature and generate a systematic, manually curated database of mammalian PPI (MPPI) to serve both the bioinformatics community as well as the wet lab scientist who wants to quickly find relevant links between the protein of interest and known binding partners.

2 ANNOTATION STRATEGY

The first and foremost principle of our MPPI database is to favor quality over completeness. Therefore, we decided to include only published experimental evidence derived from individual experiments as opposed to large-scale surveys. High-throughput data may be integrated later, but will be marked to distinguish it from evidence derived from individual experiments.

Our next design decision was to choose an appropriate organism as the primary model organism for the database. Although both mouse and human immediately come to mind as the ideal choices, a human or mouse PPI database would unnecessarily limit the project and ignore common lab practice. Due to the great structural and sequence similarity among mammalian orthologous proteins, it is quite common to perform interaction experiments using, e.g., endogenous protein X in a human cell line together with recombinant protein Y derived from sheep sequence thus crossing species boundaries. Such cross-species experiments represent a large fraction of the available evidence in literature. Taking this into account, we decided not to restrict our database to a single species but rather allow any mammalian protein in our dataset. Nevertheless, for systematic analysis it can be desirable to map the data to one reference genome. We chose to use Mus musculus, the most widely used mammalian model, as our reference species and provide links to our PEDANT (Frishman et al., 2003) mouse genome database whenever possible.

Given the large number of genes in mammalian genomes and the high percentage of yet uncharacterized and putative proteins, the classical gene-by-gene strategy which has commonly been used in the annotation of small genomes is not a good solution. Instead of finding literature about each gene product we decided to reverse the approach and locate the gene for each literature reference at hand. Relevant publications were identified in PubMed searches using keywords such as ‘mammalian’, ‘mouse’, ‘two-hybrid’, ‘coimmunoprecipitation’, ‘binds to’, ... in various combinations. In addition to the proteins involved in an interaction we provide information on specific details such as the PubMed reference, experimental technique used, probable binding sites and functional role of the interaction. Links to external databases such as Swiss-Prot are provided for most proteins.

3 IMPLEMENTATION AND DATA

All data are stored in a MySQL database and are accessible through a web interface implemented as Perl CGI scripts. The interface was designed to be as intuitive as possible for the occasional user while allowing complex Boolean queries for advanced requirements. We provide three different search forms and two result formats. Another feature is the graphical representation of a protein with all its neighbors.

For detailed analysis, the entire dataset is available for download in the recently defined PSI-MI standard format (Hermjakob et al., 2004).

Currently, our dataset contains >1800 evidence entries for PPI among >900 proteins from 10 mammalian species. The data was extracted from >370 articles. On average, each protein in the database is involved in 1.92 interactions and each interaction is supported by 1.98 evidence entries. Figure 1 gives a graphical overview of the composition of our data.

[Figure 1: four bar charts. (A) Species Distribution: number of proteins for Homo sapiens, Mus musculus, Rattus norvegicus, Unspecified mammal, Bos taurus, Oryctolagus cuniculus, Canis familiaris, Cricetulus griseus, Cercopithecus aethiops, Sus scrofa domestica and Ovis aries. (B) Evidence Distribution: number of interactions for coimmunoprecipitation, two-hybrid, affinity, copurification, overlay, cross linking, centrifugation, surface plasmon resonance, ELISA, gel retardation, immunodiffusion, crystal structure, physical interaction and energy transfer. (C) Interactions per Protein: number of proteins against number of interactions (1 to 17). (D) Evidences per Interaction: number of interactions against number of evidences (1 to 15).]

Fig. 1. Statistics: (A) Three species account for >90% of the proteins in our data. (B) Co-IP, two-hybrid methods and co-purification clearly dominate the evidence entries. (C) While most proteins in our database have only one annotated interaction, up to 17 binding partners can be found for some. (D) For many interactions there is more than one experimental evidence in our dataset.

As the importance of protein-interaction data in higher eukaryotes, and especially mammals, has been recognized by many researchers, several efforts to improve the amount of available data have been undertaken. The human protein reference database (Peri et al., 2003) aims at a comprehensive annotation of the human proteome and includes information about a large number of protein interactions. While their dataset is significantly larger than ours we believe that our data is complementary to the HPRD set because the overlap is comparatively small (less than 30% of our PMIDs appear in HPRD at the time of writing) and especially because we provide much more detailed information on the interactions and do not limit our data to one species. Other efforts are underway in many of the well-known PPI databases. Large-scale interaction experiments have been performed for Caenorhabditis elegans (Li et al., 2004) and Drosophila (Giot et al., 2003) but little such data exist for mammals at this time.

ACKNOWLEDGEMENTS

We would like to thank Ulrich Güldener and Martin Münsterkötter from the MIPS yeast database group and Philip Wong for helpful comments. This work was funded by a grant from the German Federal Ministry of Education and Research (BMBF) within the BFAM framework (031U112C).

REFERENCES

Bader,G.D., Betel,D. and Hogue,C.W. (2003) BIND: the biomolecular interaction network database. Nucleic Acids Res., 31, 248–250.

Frishman,D., Mokrejs,M., Kosykh,D., Kastenmuller,G., Kolesov,G., Zubrzycki,I., Gruber,C., Geier,B., Kaps,A., Albermann,K. et al. (2003) The PEDANT genome database. Nucleic Acids Res., 31, 207–211.

Gavin,A.C., Bosche,M., Krause,R., Grandi,P., Marzioch,M., Bauer,A., Schultz,J., Rick,J.M., Michon,A.M., Cruciat,C.M. et al. (2002) Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature, 415, 141–147.

Giot,L., Bader,J.S., Brouwer,C., Chaudhuri,A., Kuang,B., Li,Y., Hao,Y.L., Ooi,C.E., Godwin,B., Vitols,E. et al. (2003) A protein interaction map of Drosophila melanogaster. Science, 302, 1727–1736.

Hermjakob,H., Montecchi-Palazzi,L., Bader,G., Wojcik,J., Salwinski,L., Ceol,A., Moore,S., Orchard,S., Sarkans,U., von Mering,C. et al. (2004) The HUPO PSI’s molecular interaction format – a community standard for the representation of protein interaction data. Nature Biotechnol., 22, 177–183.

Ho,Y., Gruhler,A., Heilbut,A., Bader,G.D., Moore,L., Adams,S.L., Millar,A., Taylor,P., Bennett,K., Boutilier,K. et al. (2002) Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature, 415, 180–183.

Ito,T., Chiba,T., Ozawa,R., Yoshida,M., Hattori,M. and Sakaki,Y. (2001) A comprehensive two-hybrid analysis to explore the yeast protein interactome. Proc. Natl Acad. Sci. USA, 98, 4569–4574.

Li,S., Armstrong,C.M., Bertin,N., Ge,H., Milstein,S., Boxem,M., Vidalain,P.O., Han,J.D., Chesneau,A., Hao,T. et al. (2004) A map of the interactome network of the metazoan C. elegans. Science, 303, 540–543.

Peri,S., Navarro,J.D., Amanchy,R., Kristiansen,T.Z., Jonnalagadda,C.K., Surendranath,V., Niranjan,V., Muthusamy,B., Gandhi,T.K., Gronborg,M. et al. (2003) Development of human protein reference database as an initial platform for approaching systems biology in humans. Genome Res., 13, 2363–2371.

Salwinski,L., Miller,C.S., Smith,A.J., Pettit,F.K., Bowie,J.U. and Eisenberg,D. (2004) The database of interacting proteins: 2004 update. Nucleic Acids Res., 32, D449–451.

Uetz,P., Giot,L., Cagney,G., Mansfield,T.A., Judson,R.S., Knight,J.R., Lockshon,D., Narayan,V., Srinivasan,M., Pochart,P. et al. (2000) A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature, 403, 623–627.

Zanzoni,A., Montecchi-Palazzi,L., Quondam,M., Ausiello,G., Helmer-Citterich,M. and Cesareni,G. (2002) MINT: a Molecular INTeraction database. FEBS Lett., 513, 135–140.

© The Author 2004. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oupjournals.org
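The two averages reported in Section 3 (interactions per protein and evidence entries per interaction) can be illustrated with a minimal sketch. The record layout, the field names, the function `summarize` and the toy entries below are assumptions made for illustration only; they are not the MPPI schema, the PSI-MI format or real curated data. The sketch treats an interaction as an unordered pair of proteins, so that several evidence entries for the same pair count as one interaction.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class Evidence(NamedTuple):
    """One curated evidence entry (hypothetical layout, not the MPPI schema)."""
    protein_a: str
    protein_b: str
    method: str
    pmid: int  # placeholder identifiers in the sample below, not real PMIDs


def summarize(evidences: Iterable[Evidence]) -> dict:
    """Count proteins and interactions and compute the two averages."""
    # Group evidence entries by unordered protein pair: (A, B) and (B, A)
    # describe the same interaction.
    evidence_count = defaultdict(int)
    for ev in evidences:
        evidence_count[frozenset((ev.protein_a, ev.protein_b))] += 1
    if not evidence_count:
        raise ValueError("no evidence entries given")

    # Collect the distinct binding partners of every protein.
    partners = defaultdict(set)
    for pair in evidence_count:
        members = sorted(pair)
        # A self-interaction yields a one-element set, so first == last.
        a, b = members[0], members[-1]
        partners[a].add(b)
        partners[b].add(a)

    n_interactions = len(evidence_count)
    n_proteins = len(partners)
    return {
        "proteins": n_proteins,
        "interactions": n_interactions,
        "interactions_per_protein":
            sum(len(p) for p in partners.values()) / n_proteins,
        "evidence_per_interaction":
            sum(evidence_count.values()) / n_interactions,
    }


# Toy data: protein names and identifiers are made up.
SAMPLE = [
    Evidence("A", "B", "coimmunoprecipitation", 1),
    Evidence("B", "A", "two-hybrid", 2),
    Evidence("A", "C", "coimmunoprecipitation", 3),
]
stats = summarize(SAMPLE)
```

In the toy data the pair A and B is supported by two evidence entries and the pair A and C by one, which is the situation panel (D) of Figure 1 describes for many interactions in the real dataset.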



Publisher
Oxford University Press
Copyright
© The Author 2004. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oupjournals.org
ISSN
1367-4803
eISSN
1460-2059
DOI
10.1093/bioinformatics/bti115
pmid
15531608


Journal

Bioinformatics, Oxford University Press

Published: Nov 5, 2004
