RegPhos: a system to explore the protein kinase–substrate phosphorylation network in humans
Lee, Tzong-Yi; Bo-Kai Hsu, Justin; Chang, Wen-Chi; Huang, Hsien-Da
doi: 10.1093/nar/gkq970; pmid: 21037261
Protein phosphorylation catalyzed by kinases plays crucial regulatory roles in intracellular signal transduction. The growing number of experimental phosphorylation sites identified by mass spectrometry-based proteomics motivates the exploration of networks of protein kinases and substrates. Manning et al. have identified 518 human kinase genes, which provide a starting point for comprehensive analysis of protein phosphorylation networks. In this study, a knowledgebase is developed to integrate experimentally verified protein phosphorylation data and protein–protein interaction data for constructing the protein kinase–substrate phosphorylation networks in human. A total of 21110 experimentally verified phosphorylation sites within 5092 human proteins are collected. However, only 4138 phosphorylation sites (~20%) are annotated with catalytic kinases in public-domain resources. To investigate more fully how protein kinases regulate intracellular processes, a published kinase-specific phosphorylation site prediction tool, KinasePhos, is incorporated to assign potential kinases. The web-based system, RegPhos, lets users input a group of human proteins and then explore the phosphorylation network associated with protein subcellular localization. Additionally, time-course microarray expression data are used to represent the degree of similarity in the expression profiles of network members. A case study demonstrates that the proposed scheme not only identifies the correct network of insulin signaling but also detects a novel signaling pathway that may cross-talk with the insulin signaling network. The system is freely available at http://RegPhos.mbc.nctu.edu.tw.
ConsensusPathDB: toward a more complete picture of cell biology
Kamburov, Atanas; Pentchev, Konstantin; Galicka, Hanna; Wierling, Christoph; Lehrach, Hans; Herwig, Ralf
doi: 10.1093/nar/gkq1156; pmid: 21071422
ConsensusPathDB is a meta-database that integrates different types of functional interactions from heterogeneous interaction data resources. Physical protein interactions, metabolic and signaling reactions and gene regulatory interactions are integrated in a seamless functional association network that simultaneously describes multiple functional aspects of genes, proteins, complexes, metabolites, etc. With 155432 human, 194480 yeast and 13648 mouse complex functional interactions (originating from 18 databases on human and eight databases on yeast and mouse interactions each), ConsensusPathDB currently constitutes the most comprehensive publicly available interaction repository for these species. The Web interface at http://cpdb.molgen.mpg.de offers different ways of utilizing these integrated interaction data, in particular with tools for visualization, analysis and interpretation of high-throughput expression data in the light of functional interactions and biological pathways.
NGSmethDB: a database for next-generation sequencing single-cytosine-resolution DNA methylation data
Hackenberg, Michael; Barturen, Guillermo; Oliver, José L.
doi: 10.1093/nar/gkq942; pmid: 20965971
Next-generation sequencing (NGS) together with bisulphite conversion allows the generation of whole genome methylation maps at single-cytosine resolution. This makes it possible to study the absence of methylation in a particular genome region over a range of tissues, differential methylation between tissues or the changes occurring under pathological conditions. However, no existing database fully addresses such requirements. We propose here NGSmethDB (http://bioinfo2.ugr.es/NGSmethDB/gbrowse/) for the storage and retrieval of methylation data derived from NGS. Two cytosine methylation contexts (CpG and CAG/CTG) are considered. Through a browser interface coupled to a MySQL backend and several data mining tools, the user can search for methylation states in a set of tissues, retrieve methylation values for a set of tissues in a given chromosomal region, or display the methylation of promoters among different tissues. NGSmethDB is currently populated with human, mouse and Arabidopsis data, but other methylomes will be incorporated through an automatic pipeline as soon as new data become available. Dump downloads for three coverage levels (1, 5 or 10 reads) are available. NGSmethDB will be useful for experimental researchers, as well as for bioinformaticians, who might use the data as input for further research.
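The three coverage levels mentioned above can be illustrated with a minimal sketch. The record layout, field names and example values are assumptions made for illustration only; they are not the NGSmethDB dump format or schema.

```python
from typing import List, NamedTuple


class CytosineCall(NamedTuple):
    """One cytosine position with bisulphite read counts (hypothetical layout)."""
    chrom: str
    pos: int
    context: str           # e.g. "CpG" or "CWG" (CAG/CTG)
    methylated_reads: int  # reads supporting a methylated cytosine
    total_reads: int       # all reads covering the position


def filter_by_coverage(calls: List[CytosineCall], min_reads: int) -> List[CytosineCall]:
    """Keep only positions covered by at least `min_reads` reads."""
    return [c for c in calls if c.total_reads >= min_reads]


def methylation_level(call: CytosineCall) -> float:
    """Fraction of covering reads that support methylation (needs total_reads >= 1)."""
    return call.methylated_reads / call.total_reads


# Made-up example positions, not real data.
calls = [
    CytosineCall("chr1", 1000, "CpG", 1, 1),
    CytosineCall("chr1", 1050, "CpG", 3, 6),
    CytosineCall("chr1", 1100, "CWG", 0, 12),
]

# One filtered set per coverage level named in the abstract.
by_level = {n: filter_by_coverage(calls, n) for n in (1, 5, 10)}
```

A stricter coverage level trades the number of retained positions for more reliable per-position methylation estimates.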
The Gene Expression Barcode: leveraging public data repositories to begin cataloging the human and murine transcriptomes
McCall, Matthew N.; Uppal, Karan; Jaffee, Harris A.; Zilliox, Michael J.; Irizarry, Rafael A.
doi: 10.1093/nar/gkq1259; pmid: 21177656
Various databases have harnessed the wealth of publicly available microarray data to address biological questions ranging from across-tissue differential expression to homologous gene expression. Despite their practical value, these databases rely on relative measures of expression and are unable to address the most fundamental question: which genes are expressed in a given cell type. The Gene Expression Barcode is the first database to provide reliable absolute measures of expression for most annotated genes for 131 human and 89 mouse tissue types, including diseased tissue. This is made possible by a novel algorithm that leverages information from the GEO and ArrayExpress public repositories to build statistical models that permit converting data from a single microarray into expressed/unexpressed calls for each gene. For selected platforms, users may upload data and obtain results in a matter of seconds. The raw data, curated annotation, and code used to create our resource are also available at http://rafalab.jhsph.edu/barcode.
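The idea of turning a single array into expressed/unexpressed calls can be sketched as follows. This is a simplified illustration, not the authors' algorithm: it assumes that a per-gene mean and standard deviation of the unexpressed state have already been estimated from public data, and the function name, gene names and cutoff are hypothetical.

```python
from typing import Dict


def barcode_calls(
    sample: Dict[str, float],
    null_mean: Dict[str, float],
    null_sd: Dict[str, float],
    z_cutoff: float = 2.0,  # assumed cutoff, chosen only for illustration
) -> Dict[str, int]:
    """Return 1 (expressed) or 0 (unexpressed) per gene for one array.

    A gene is called expressed when its intensity lies more than
    `z_cutoff` standard deviations above its assumed unexpressed mean.
    """
    calls = {}
    for gene, value in sample.items():
        z = (value - null_mean[gene]) / null_sd[gene]
        calls[gene] = 1 if z > z_cutoff else 0
    return calls


# Made-up log-scale intensities and null parameters.
sample = {"GENE_A": 9.0, "GENE_B": 4.2}
null_mean = {"GENE_A": 4.0, "GENE_B": 4.0}
null_sd = {"GENE_A": 1.0, "GENE_B": 0.5}

calls = barcode_calls(sample, null_mean, null_sd)
```

Because each gene is compared with its own reference distribution rather than with other samples in the same experiment, the call is an absolute rather than a relative measure.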
Update of the FANTOM web resource: from mammalian transcriptional landscape to its dynamic regulation
Kawaji, Hideya; Severin, Jessica; Lizio, Marina; Forrest, Alistair R. R.; van Nimwegen, Erik; Rehli, Michael; Schroder, Kate; Irvine, Katharine; Suzuki, Harukazu; Carninci, Piero; Hayashizaki, Yoshihide; Daub, Carsten O.
doi: 10.1093/nar/gkq1112; pmid: 21075797
The international Functional Annotation Of the Mammalian Genomes 4 (FANTOM4) research collaboration set out to better understand the transcriptional network that regulates macrophage differentiation and to uncover novel components of the transcriptome employing a series of high-throughput experiments. The primary and unique technique is cap analysis of gene expression (CAGE), sequencing mRNA 5′-ends with a second-generation sequencer to quantify promoter activities even in the absence of gene annotation. Additional genome-wide experiments complement the setup, including short RNA sequencing, microarray gene expression profiling on large-scale perturbation experiments and ChIP-chip for epigenetic marks and transcription factors. All the experiments are performed in a differentiation time course of the THP-1 human leukemic cell line. Furthermore, we performed a large-scale mammalian two-hybrid (M2H) assay between transcription factors and monitored their expression profile across human and mouse tissues with qRT-PCR to address combinatorial effects of regulation by transcription factors. These interdependent data have been analyzed individually and in combination with each other and are published in related but distinct papers. We provide all data together with systematic annotation in an integrated view as a resource for the scientific community (http://fantom.gsc.riken.jp/4/). Additionally, we assembled a rich set of derived analysis results including published predicted and validated regulatory interactions. Here we introduce the resource and its update after the initial release.
miRGator v2.0: an integrated system for functional investigation of microRNAs
Cho, Sooyoung; Jun, Yukyung; Lee, Sanghyun; Choi, Hyung-Seok; Jung, Sungchul; Jang, Youngjun; Park, Charny; Kim, Sangok; Lee, Sanghyuk; Kim, Wankyu
doi: 10.1093/nar/gkq1094; pmid: 21062822
miRGator is an integrated database of microRNA (miRNA)-associated gene expression, target prediction, disease association and genomic annotation, which aims to facilitate functional investigation of miRNAs. The current version, miRGator v2.0, contains information about (i) human miRNA expression profiles under various experimental conditions, (ii) paired expression profiles of both mRNAs and miRNAs, (iii) gene expression profiles under miRNA-perturbation (e.g. miRNA knockout and overexpression), (iv) known/predicted miRNA targets and (v) miRNA-disease associations. In total, >8000 miRNA expression profiles, 300 miRNA-perturbed gene expression profiles and 2000 mRNA expression profiles are compiled with manually curated annotations on disease, tissue type and perturbation. By integrating these data sets, a series of novel associations (miRNA–miRNA, miRNA–disease and miRNA–target) is extracted via shared features. For example, differentially expressed genes (DEGs) after miRNA knockout were systematically compared against miRNA targets. Likewise, differentially expressed miRNAs (DEmiRs) were compared with disease-associated miRNAs. Additionally, miRNA expression and disease-phenotype profiles revealed miRNA pairs whose expression was regulated in parallel in various experimental and disease conditions. Complex associations are readily accessible using an interactive network visualization interface. miRGator v2.0 serves as a reference database to investigate miRNA expression and function (http://miRGator.kobic.re.kr).
IUPHAR-DB: new receptors and tools for easy searching and visualization of pharmacological data
Sharman, Joanna L.; Mpamhanga, Chidochangu P.; Spedding, Michael; Germain, Pierre; Staels, Bart; Dacquet, Catherine; Laudet, Vincent; Harmar, Anthony J.
doi: 10.1093/nar/gkq1062; pmid: 21087994
The IUPHAR database is an established online reference resource for several important classes of human drug targets and related proteins. As well as providing recommended nomenclature, the database integrates information on the chemical, genetic, functional and pathophysiological properties of receptors and ion channels, curated and peer-reviewed from the biomedical literature by a network of experts. The database now includes information on 616 gene products from four superfamilies in human and rodent model organisms: G protein-coupled receptors, voltage- and ligand-gated ion channels and, in a recent update, 49 nuclear hormone receptors (NHRs). New data types for NHRs include details on co-regulators, DNA binding motifs, target genes and 3D structures. Other recent developments include curation of the chemical structures of approximately 2000 ligand molecules, providing electronic descriptors, identifiers, link-outs and calculated molecular properties, all available via enhanced ligand pages. The interface now provides intelligent tools for the visualization and exploration of ligand structure-activity relationships and the structural diversity of compounds active at each target. The database is freely available at http://www.iuphar-db.org.
tRNADB-CE 2011: tRNA gene database curated manually by experts
Abe, Takashi; Ikemura, Toshimichi; Sugahara, Junichi; Kanai, Akio; Ohara, Yasuo; Uehara, Hiroshi; Kinouchi, Makoto; Kanaya, Shigehiko; Yamada, Yuko; Muto, Akira; Inokuchi, Hachiro
doi: 10.1093/nar/gkq1007; pmid: 21071414
We updated the tRNADB-CE by analyzing 939 complete and 1301 draft genomes of prokaryotes and eukaryotes, 171 complete virus genomes, 121 complete chloroplast genomes and approximately 230 million sequences obtained by metagenome analyses of 210 environmental samples. In total, 287102 tRNA genes are compiled, twice the number compiled previously; for each gene, sequence information, clover-leaf structure and the results of sequence similarity and oligonucleotide-pattern searches can be browsed. In order to pool collective knowledge with help from experts in the tRNA research field, we included a column to which comments can be added on each tRNA gene. By compiling tRNAs of known prokaryotes with identical sequences, we found high phylogenetic preservation of tRNA sequences, especially at the phylum level. Furthermore, a large number of tRNAs obtained by metagenome analyses of environmental samples had sequences identical to those found in known prokaryotes. The identical sequence groups can therefore be used as phylogenetic markers to clarify the microbial community structure of an ecosystem. The updated tRNADB-CE provides functions with which users can themselves obtain phylotype-specific markers (e.g. genus-specific markers) and clarify microbial community structures of ecosystems in detail. tRNADB-CE can be accessed freely at http://trna.nagahama-i-bio.ac.jp.
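The use of identical sequence groups as phylotype-specific markers can be sketched as below. This is an illustrative reading of the idea, not the tRNADB-CE implementation: the input format, the function name and the toy sequences are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def genus_specific_markers(records: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group tRNA sequences by identity and keep those seen in one genus only.

    `records` holds (genus, tRNA sequence) pairs, one per tRNA gene.
    """
    # Identical sequence groups: sequence -> set of genera carrying it.
    genera_by_seq = defaultdict(set)
    for genus, seq in records:
        genera_by_seq[seq].add(genus)

    # A sequence found in exactly one genus is a candidate marker for it.
    markers = defaultdict(list)
    for seq, genera in genera_by_seq.items():
        if len(genera) == 1:
            (genus,) = genera
            markers[genus].append(seq)
    return dict(markers)


# Toy sequences, far shorter than real tRNA genes.
records = [
    ("Escherichia", "GCGGA"),
    ("Escherichia", "GCGGA"),
    ("Salmonella", "GCGGA"),   # shared across genera, so not a marker
    ("Bacillus", "GGCTC"),
    ("Bacillus", "GGCTC"),
]

markers = genus_specific_markers(records)
```

A metagenomic tRNA whose sequence matches a candidate marker could then be assigned to the corresponding genus; the same grouping works at the phylum level by swapping the label.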
PCDDB: the protein circular dichroism data bank, a repository for circular dichroism spectral and metadata
Whitmore, Lee; Woollett, Benjamin; Miles, Andrew John; Klose, D. P.; Janes, Robert W.; Wallace, B. A.
doi: 10.1093/nar/gkq1026; pmid: 21071417
The Protein Circular Dichroism Data Bank (PCDDB) is a public repository that archives and freely distributes circular dichroism (CD) and synchrotron radiation CD (SRCD) spectral data and their associated experimental metadata. All entries undergo validation and curation procedures to ensure completeness, consistency and quality of the data included. A web-based interface enables users to browse and query sample types, sample conditions and experimental parameters, and provides spectra both as graphical displays and as downloadable text files. The entries are linked, when appropriate, to primary sequence (UniProt) and structural (PDB) databases, to secondary databases such as the Enzyme Commission functional classification database and the CATH fold classification database, and to literature citations. The PCDDB is available at: http://pcddb.cryst.bbk.ac.uk.