High-throughput biological screens are yielding ever-growing streams of information about multiple aspects of cellular activity. As more and more categories of datasets come online, there is a corresponding multitude of ways in which inferences can be chained across them, motivating the need for compositional data mining algorithms. In this article, we argue that such compositional data mining can be effectively realized by functionally cascading redescription mining and biclustering algorithms as primitives. Both these primitives mirror shifts of vocabulary that can be composed in arbitrary ways to create rich chains of inferences. Given a relational database and its schema, we show how the schema can be automatically compiled into a compositional data mining program, and how different domains in the schema can be related through logical sequences of biclustering and redescription invocations. This feature allows us to rapidly prototype new data mining applications, yielding greater understanding of scientific datasets. We describe two applications of compositional data mining: (i) matching terms across categories of the Gene Ontology and (ii) understanding the molecular mechanisms underlying stress response in human cells.
ACM Transactions on Knowledge Discovery from Data (TKDD) – Association for Computing Machinery
Published: Mar 1, 2008
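As a rough illustration of the cascade the abstract describes, the sketch below chains a toy biclustering step into a toy redescription step. It is not the authors' algorithm: the function names (`bicluster`, `redescribe`, `compose`), the set-based data layout, the Jaccard threshold, and the gene, condition, and annotation names are all assumptions made for this example. Here biclustering is taken to mean finding a group of rows that share a group of columns in one relation, and redescription is taken to mean finding another descriptor that covers nearly the same set of objects.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def bicluster(relation, seed_rows):
    """Toy biclustering primitive (hypothetical, for illustration only).

    relation maps each row object to the set of column objects it is
    related to. Starting from seed_rows, return (rows, cols) such that
    every returned row is related to every returned column. If the seeds
    share no column, cols is empty and every row is returned.
    """
    if not seed_rows:
        raise ValueError("seed_rows must be non-empty")
    cols = set.intersection(*(relation[r] for r in seed_rows))
    rows = {r for r, cs in relation.items() if cols <= cs}
    return rows, cols


def redescribe(target, descriptors, threshold):
    """Toy redescription primitive (hypothetical, for illustration only).

    descriptors maps a descriptor name to the set of objects it covers.
    Return the names whose object set overlaps target with Jaccard
    similarity of at least threshold.
    """
    return [name for name, objs in descriptors.items()
            if jaccard(target, objs) >= threshold]


def compose(expression, annotations, seed_genes, threshold):
    """Cascade the two primitives: conditions -> genes -> annotations."""
    genes, conditions = bicluster(expression, seed_genes)
    return genes, conditions, redescribe(genes, annotations, threshold)


# Made-up data: genes x stress conditions, and genes x annotation terms.
expression = {
    "g1": {"heat", "oxidative"},
    "g2": {"heat", "oxidative"},
    "g3": {"heat"},
    "g4": {"starvation"},
}
annotations = {
    "toy_chaperone": {"g1", "g2"},
    "toy_transport": {"g3", "g4"},
    "toy_stress": {"g1", "g2", "g3"},
}

genes, conditions, terms = compose(expression, annotations, {"g1"}, 0.6)
print(sorted(genes), sorted(conditions), terms)
```

Because each primitive takes a set of objects and returns a set of objects or descriptors, further steps could be appended in the same way, which is the sense in which the abstract calls the approach compositional.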