Published online 16 October 2007
Nucleic Acids Research, 2008, Vol. 36, Database issue D919–D922
doi:10.1093/nar/gkm862

SuperTarget and Matador: resources for exploring drug-target relationships

Stefan Günther [1], Michael Kuhn [2], Mathias Dunkel [1], Monica Campillos [2], Christian Senger [1], Evangelia Petsalaki [2], Jessica Ahmed [1], Eduardo Garcia Urdiales [2], Andreas Gewiess [3], Lars Juhl Jensen [2], Reinhard Schneider [2], Roman Skoblo [3], Robert B. Russell [2], Philip E. Bourne [4], Peer Bork [2,5] and Robert Preissner [1,*]

[1] Structural Bioinformatics Group, Institute of Molecular Biology and Bioinformatics, Charité - University Medicine Berlin, Arnimallee 22, 14195 Berlin, [2] EMBL - Biocomputing, Meyerhofstraße 1, 69117 Heidelberg, [3] Institute for Laboratory Medicine, Windscheidstr. 18, 10627 Berlin, Germany, [4] Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, 9500 Gilman Drive, La Jolla CA 92093, USA and [5] Max-Delbrück-Center for Molecular Medicine (MDC), 13092 Berlin-Buch, Germany

Received August 15, 2007; Revised September 26, 2007; Accepted September 27, 2007

[*] To whom correspondence should be addressed. Tel: +49 30 8445 1649; Fax: +49 30 8445 1551; Email: [email protected]

© 2007 The Author(s). This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

ABSTRACT

The molecular basis of drug action is often not well understood. This is partly because the very abundant and diverse information generated in the past decades on drugs is hidden in millions of medical articles or textbooks. Therefore, we developed a one-stop data warehouse, SuperTarget, that integrates drug-related information about medical indication areas, adverse drug effects, drug metabolization, pathways and Gene Ontology terms of the target proteins. An easy-to-use query interface enables the user to pose complex queries, for example to find drugs that target a certain pathway, interacting drugs that are metabolized by the same cytochrome P450 or drugs that target the same protein but are metabolized by different enzymes. Furthermore, we provide tools for 2D drug screening and sequence comparison of the targets. The database contains more than 2500 target proteins, which are annotated with about 7300 relations to 1500 drugs; the vast majority of entries have pointers to the respective literature source. A subset of these drugs has been annotated with additional binding information and indirect interactions and is available as a separate resource called Matador. SuperTarget and Matador are available at http://insilico.charite.de/supertarget and http://matador.embl.de

INTRODUCTION

Within the past two decades our knowledge about drugs, their mechanisms of action and target proteins has increased rapidly. Nevertheless, knowledge of their molecular effects is far from complete. For some drugs even the primary targets are still unknown; for example, Diloxanide, Niclosamide and Ambroxol are administered successfully although their effect on human metabolism has still not been clarified at a molecular level (1). Even if the medical effect has been explained by a certain molecular interaction, most drugs interact with several additional targets, which may either strengthen the therapeutic effect or cause unwanted adverse drug effects (2). Moreover, our knowledge of drugs and their targets is highly fragmented, most of it residing in millions of medical articles and textbooks, which precludes systematic studies.

Several databases exist that collect binding data on small molecules, in particular drugs and proteins. The largest such resource is DrugBank (3), which contains 2600 drug-target relations for 900 FDA-approved drugs and additional annotations for 3200 experimental drugs. Another notable database is the Therapeutic Target Database (TTD) (4), which holds target information on about 1000 small molecule drugs. Unfortunately, DrugBank provides references for the targets but generally not for the interactions, which makes it difficult to obtain information on the experimental context under which an interaction was observed. Moreover, the drugs in the TTD are not cross-linked with compound databases such as PubChem, ChemDB or the commercial CAS Registry, and the targets are not linked to protein databases such as UniProt or PDB. This makes it difficult to retrieve information such as the chemical structure of the drug, its physicochemical properties, the sequence or 3D structure of its target or the biological pathways that it affects.

In order to be able to derive further information about drug-target relations, we have developed a one-stop data warehouse, SuperTarget, which provides this functionality and integrates drug-target relations from different resources using heterogeneous retrieval methods. We consider a drug-target relation to be a specific interaction between a small chemical compound administered to treat or diagnose a disease and a macromolecule, namely protein, DNA or RNA. The first release of SuperTarget contains a core dataset of about 7300 drug-target relations, of which 4900 interactions have been subjected to a more extensive manual annotation effort to incorporate additional binding information as well as indirect interactions. The resulting data on 775 drugs is provided separately as Matador (Manually Annotated Targets And Drugs Online Resource).

DRUG-TARGET RELATIONSHIPS

Drug-target relationships described in SuperTarget were obtained in three different ways. Starting with 2400 drugs and their synonyms from the SuperDrug Database (5), the text mining tool EBIMed (6) was used to extract relevant text passages containing potential drug-target relations from about 15 million public abstracts listed in PubMed. Many thousands of false positive or irrelevant relations were eliminated by manual curation.

In parallel, potential drug-target relations were automatically extracted from Medline by searching for synonyms of drugs, proteins and Medical Subject Headings (MeSH terms) describing groups of proteins (7). MeSH terms were used to capture and down-weight interactions that are not explicitly described in the abstracts, e.g. for protein families or protein complexes. In the case of families, the specific interacting family member might not be known yet, whereas in the case of complexes, the drug might interact with more than one subunit. Proteins associated with MeSH terms were assigned by a semi-automated procedure relying on mappings provided by MeSH and synonyms of proteins that are aggregated in the STRING resource (8). Proteins that were often mentioned in abstracts, but could not be automatically assigned to families, were manually assigned. Depending on the size and nature of the families, the confidence of an interaction between drugs and individual proteins was decreased. More heterogeneous families are assigned a lower confidence. The most probable candidates were identified using a benchmarking scheme (8) and manually curated.

In a last step, relations from other databases, namely DrugBank (3), KEGG (9), PDB (10), SuperLigands (11) and TTD (4), were checked for drug-target interactions not identified in the preceding steps. If those interactions could be confirmed by literature listed in PubMed, the references were included in SuperTarget; otherwise the describing database is referenced. Given the large number of entries, we cannot rule out that some of the data is erroneous, changes over time or is too unspecific. In case of doubt we refer to the referenced relation source.

To be able to obtain more information on the drug-target relations, SuperTarget provides links to physicochemical properties and further structural information of drugs. Proven or potential target proteins are represented by sequences as stored in UniProt (12), by functional annotations extracted from GOA (13) and by related pathway information provided by KEGG (9) (compare Figure 1). Adverse drug reactions were extracted from the freely accessible Canadian Adverse Drug Reaction Monitoring Program (CADRMP, http://www.hc-sc.gc.ca/).

Figure 1. System architecture and number of database entries of SuperTarget. The database contains the complete UniProt with more than 3 million entries. Besides the targets, drugs and pathways, the database provides 23 000 different GO terms and 30 000 links to protein structures.

For a subset of the drug-target relations, namely those where our text-mining approach indicated a wealth of additional information, the type of binding was further analyzed and direct and indirect interactions were manually distinguished. Indirect interactions can, for example, be caused by active metabolites of the drug or by changes in the expression of a protein. The extensively annotated subset, which is contained in Matador, should be well-suited as a training set for various large-scale discovery approaches.

ANALYSIS OPTIONS IN SUPERTARGET

The integration of drug-related information provides numerous different query entry points as well as global surveys on complex topics using heterogeneous data types (for an illustration of query capabilities see Figure 2).

Figure 2. Example of a complex query: search for an alternative drug to Taxol, which addresses the same target Bcl-2, but is not metabolized by the cytochrome 2C8. Procedure: first, search for the targets and the metabolizing cytochromes of Taxol in the associated categories. The resulting lists contain, among others, the target Bcl-2 as well as the P450 cytochrome 2C8. Second, forward 2C8 and Bcl-2 into the query term box, combine them using the 'NOT' operator and submit the query. The resulting drug list contains two alternative drug candidates for Taxol (Docetaxel and Flupirtine).

Since compounds with high structural similarity frequently exhibit similar activities (14), all drugs with comparable structural fingerprints are made accessible. Fingerprints allow a fast identification of drugs with very similar physicochemical properties that may interact with the same target molecules. Structural similarity scores and fingerprints are computed with the open-source Chemistry Development Kit (15).

Analogously, similar proteins are quickly identifiable by precomputed sequence alignments provided by SIMAP (16). Proteins homologous to annotated target proteins are candidates for interacting with the drug and may explain adverse effects of drugs.

Drug metabolizing enzymes should also be considered when explaining adverse drug responses. Genetic polymorphism of cytochrome P450 genes or associated regulatory factors may lead to differing abilities to degrade drugs (17). The web interface provides an extra section to retrieve the cytochrome interaction data for most of the annotated drugs.

Drugs are classified in the Anatomical Therapeutic Chemical Classification System (ATC). This hierarchical classification system was introduced by the World Health Organization in 1990, classifying drugs according to their medical indications and chemical scaffold. Groups of drugs can be selected by their ATC code. A searchable, hierarchical ATC-tree enables a fast detection of drug classes and medical indications like, for example, 'anti-Parkinson drugs' by expanding the branch 'nervous system'.

The integration of Gene Ontology (GO) terms in SuperTarget enables complex queries like 'Are there anti-cancer drugs that induce apoptosis and are associated with transcription factors?' which can be answered by combined selection of apoptosis in the pathway tab and transcriptional activator activity in the ontology tree.

To facilitate the analysis, targets or drugs can be collected and stored in a clipboard. A session can be saved on the server and restored up to 30 days later. Each object of the clipboard links to a window with object-related information. Depending on the object type, the window contains additional information relating to drugs, targets, pathways, GO terms or metabolization. A further hyperlink leads to the search history.

CONCLUSIONS AND FUTURE DIRECTIONS

Although the first version of SuperTarget with all the search and discovery tools around drug-target relations is already an extensive resource for both large-scale research and in-depth analysis, the captured knowledge is still far from complete and we would like to invite the community to help in increasing quality and quantity of the records. SuperTarget offers an option to upload and incorporate drug-target relations into a working queue. Uploaded entries will be reviewed and approved by an annotation team of graduate scientists. Both SuperTarget and Matador can be used as knowledge sources, discovery tools or training sets for various applications in chemical biology and elsewhere.

AVAILABILITY

SuperTarget and Matador are available via their web sites, http://insilico.charite.de/supertarget and http://matador.embl.de. They can be obtained via a Creative Commons Attribution-Noncommercial-Share Alike 3.0 License.

ACKNOWLEDGEMENTS

The authors wish to thank B. Grüning, P. Pyl, D. Jansen, J. Tschörtner, M. Füllbeck, E. Michalsky, J. Hossbach and I. Jaeger for assistance during literature screening, software testing and improvement of the web interface. Furthermore, we thank M. Kanehisa for providing the KEGG web service as well as R. Arnold for the support using the SIMAP web service. This work was supported by BMBF (Quantpro), Deutsche Forschungsgemeinschaft (DFG: SFB 449), IRTG Berlin-Boston-Kyoto, Investitionsbank Berlin (IBB) and Deutsche Krebshilfe. Funding to pay the Open Access publication charges for this article was provided by EMBL.

Conflict of interest statement. None declared.

REFERENCES

1. Imming,P., Sinning,C. and Meyer,A. (2006) Drugs, their targets and the nature and number of drug targets. Nat. Rev. Drug Discov., 5, 821–834.
2. Frantz,S. (2005) Drug discovery: playing dirty. Nature, 437, 942–943.
3. Wishart,D.S., Knox,C., Guo,A.C., Shrivastava,S., Hassanali,M., Stothard,P., Chang,Z. and Woolsey,J. (2006) DrugBank: a comprehensive resource for in silico drug discovery and exploration. Nucleic Acids Res., 34, D668–D672.
4. Chen,X., Ji,Z.L. and Chen,Y.Z. (2002) TTD: Therapeutic Target Database. Nucleic Acids Res., 30, 412–415.
5. Goede,A., Dunkel,M., Mester,N., Frömmel,C. and Preissner,R. (2005) SuperDrug: a conformational drug database. Bioinformatics, 21, 1751–1753.
6. Rebholz-Schuhmann,D., Kirsch,H., Arregui,M., Gaudan,S., Riethoven,M. and Stoehr,P. (2007) EBIMed - text crunching to gather facts for proteins from Medline. Bioinformatics, 23, e237–e244.
7. Jensen,L.J., Saric,J. and Bork,P. (2006) Literature mining for the biologist: from information retrieval to biological discovery. Nat. Rev. Genet., 7, 119–129.
8. von Mering,C., Jensen,L.J., Snel,B., Hooper,S.D., Krupp,M., Foglierini,M., Jouffre,N., Huynen,M.A. and Bork,P. (2005) STRING: known and predicted protein-protein associations, integrated and transferred across organisms. Nucleic Acids Res., 33, D433–D437.
9. Kanehisa,M., Goto,S., Hattori,M., Aoki-Kinoshita,K.F., Itoh,M., Kawashima,S., Katayama,T., Araki,M. and Hirakawa,M. (2006) From genomics to chemical genomics: new developments in KEGG. Nucleic Acids Res., 34, D354–D357.
10. Deshpande,N., Addess,K.J., Bluhm,W.F., Merino-Ott,J.C., Townsend-Merino,W., Zhang,Q., Knezevich,C., Xie,L., Chen,L. et al. (2005) The RCSB Protein Data Bank: a redesigned query system and relational database based on the mmCIF schema. Nucleic Acids Res., 33, D233–D237.
11. Michalsky,E., Dunkel,M., Goede,A. and Preissner,R. (2005) SuperLigands - a database of ligand structures derived from the Protein Data Bank. BMC Bioinform., 6, 122.
12. UniProt Consortium (2007) The Universal Protein Resource (UniProt). Nucleic Acids Res., 35, D193–D197.
13. Camon,E., Magrane,M., Barrell,D., Lee,V., Dimmer,E., Maslen,J., Binns,D., Harte,N., Lopez,R. et al. (2004) The Gene Ontology Annotation (GOA) Database: sharing knowledge in Uniprot with Gene Ontology. Nucleic Acids Res., 32, D262–D266.
14. Martin,Y.C., Kofron,J.L. and Traphagen,L.M. (2002) Do structurally similar molecules have similar biological activity? J. Med. Chem., 45, 4350–4358.
15. Steinbeck,C., Hoppe,C., Kuhn,S., Floris,M., Guha,R. and Willighagen,E.L. (2006) Recent developments of the chemistry development kit (CDK) - an open-source java library for chemo- and bioinformatics. Curr. Pharm. Des., 12, 2111–2120.
16. Rattei,T., Arnold,R., Tischler,P., Lindner,D., Stümpflen,V. and Mewes,H.W. (2006) SIMAP: the similarity matrix of proteins. Nucleic Acids Res., 34, D252–D256.
17. Fujita,K. (2006) Cytochrome P450 and anticancer drugs. Curr. Drug Metab., 7, 23–37.
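
As an illustration only, the complex query of Figure 2 (drugs that address the target Bcl-2 but are NOT metabolized by cytochrome 2C8) can be read as a simple filter over drug records. The minimal Python sketch below assumes a hypothetical in-memory record type `DrugRecord` and a hypothetical function `alternatives`; neither reflects SuperTarget's actual schema, query syntax or web interface. The toy data encode only what Figure 2 states (Taxol addresses Bcl-2 and is metabolized by 2C8; Docetaxel and Flupirtine are returned as alternatives); `drug_X` and `target_Y` are invented placeholders.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class DrugRecord:
    """Hypothetical record type; not SuperTarget's actual schema."""
    name: str
    targets: FrozenSet[str]
    cytochromes: FrozenSet[str]  # metabolizing P450 enzymes


def alternatives(records: Iterable[DrugRecord], target: str,
                 excluded_cyp: str) -> List[str]:
    """Names of drugs that hit `target` and are NOT metabolized by `excluded_cyp`."""
    return sorted(
        r.name
        for r in records
        if target in r.targets and excluded_cyp not in r.cytochromes
    )


# Toy data. Only the Figure 2 facts are encoded; other enzymes and targets
# of the real drugs are deliberately left out, and drug_X is a placeholder.
RECORDS = [
    DrugRecord("Taxol", frozenset({"Bcl-2"}), frozenset({"2C8"})),
    DrugRecord("Docetaxel", frozenset({"Bcl-2"}), frozenset()),
    DrugRecord("Flupirtine", frozenset({"Bcl-2"}), frozenset()),
    DrugRecord("drug_X", frozenset({"target_Y"}), frozenset()),
]

if __name__ == "__main__":
    print(alternatives(RECORDS, target="Bcl-2", excluded_cyp="2C8"))
```

With these toy records the filter returns Docetaxel and Flupirtine, matching the result list described in Figure 2. Taxol is excluded by the 'NOT 2C8' condition and the placeholder drug by the target condition.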
Nucleic Acids Research – Oxford University Press
Published: Jan 16, 2008