Composing, optimizing, and executing plans for bioinformatics web services

Snehal Thakkar; José Ambite; Craig Knoblock

doi:10.1007/s00778-005-0158-4



Publisher: Springer Journals
Copyright: Copyright © 2005 by Springer-Verlag
Subject: Computer Science; Database Management
ISSN: 1066-8888
eISSN: 0949-877X
DOI: 10.1007/s00778-005-0158-4

Abstract

The emergence of a large number of bioinformatics datasets on the Internet has created a need for flexible and efficient approaches to integrating information from multiple bioinformatics data sources and services. In this paper, we present our approach to automatically generating composition plans for web services, optimizing those plans, and executing them efficiently. While data integration techniques have been applied to the bioinformatics domain, the focus has been on answering specific user queries. In contrast, we focus on automatically generating parameterized integration plans that can be hosted as web services that respond to a range of inputs. In addition, we present two novel techniques that improve the execution time of the generated plans by reducing the number of requests to the existing data sources and by executing the generated plan more efficiently. The first optimization technique, called tuple-level filtering, analyzes the source/service descriptions in order to automatically insert filtering conditions into the composition plans, which results in fewer requests to the component web services. To ensure that the filtering conditions can be evaluated, this technique may include sensing operations in the integration plan. The savings due to filtering significantly exceed the cost of the sensing operations. The second optimization technique maps the integration plans into programs that can be executed by a dataflow-style, streaming execution engine. We use real-world bioinformatics web services to show experimentally that (1) our automatic composition techniques can efficiently generate parameterized plans that integrate data from large numbers of existing services and (2) our optimization techniques can significantly reduce the response time of the generated integration plans.
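
The idea behind tuple-level filtering can be illustrated with a small sketch. This is not the authors' implementation; the service names (`human_protein_info`, the organism lookup used as a sensing operation), the protein IDs, and the coverage constraint are all hypothetical and exist only for illustration. The sketch assumes a source description stating that one service covers only human proteins, and that a cheap sensing operation can report the organism of each input tuple, so a filter can skip requests that could not return data. Python generators are used so that tuples flow through the plan one at a time, which loosely echoes the streaming style the abstract mentions.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Optional, Tuple

# Hypothetical data: the organism each (made-up) protein ID belongs to.
ORGANISM_OF: Dict[str, str] = {
    "P1": "human", "P2": "mouse", "P3": "human", "P4": "yeast",
}


class CountingService:
    """Wraps a callable and counts the requests sent to it."""

    def __init__(self, fn: Callable[[str], object]) -> None:
        self.fn = fn
        self.calls = 0

    def __call__(self, key: str) -> object:
        self.calls += 1
        return self.fn(key)


def human_protein_info(pid: str) -> Optional[dict]:
    """Hypothetical expensive service; assumed to cover human proteins only."""
    if ORGANISM_OF.get(pid) != "human":
        return None
    return {"id": pid, "source": "human_protein_info"}


def unfiltered_plan(ids: Iterable[str], service: CountingService) -> Iterator[dict]:
    """Sends every input tuple to the service, even ones it cannot answer."""
    for pid in ids:
        record = service(pid)
        if record is not None:
            yield record


def filtered_plan(ids: Iterable[str], sense: CountingService,
                  service: CountingService) -> Iterator[dict]:
    """Adds a sensing step and a filter derived from the assumed coverage
    constraint (organism == "human") before calling the service."""
    for pid in ids:
        if sense(pid) == "human":
            record = service(pid)
            if record is not None:
                yield record


def run_demo() -> Tuple[List[dict], int, List[dict], int, int]:
    ids = ["P1", "P2", "P3", "P4"]
    naive_service = CountingService(human_protein_info)
    naive = list(unfiltered_plan(ids, naive_service))

    sense = CountingService(ORGANISM_OF.get)  # stand-in for a cheap sensing call
    filtered_service = CountingService(human_protein_info)
    filtered = list(filtered_plan(ids, sense, filtered_service))
    return naive, naive_service.calls, filtered, filtered_service.calls, sense.calls


if __name__ == "__main__":
    naive, naive_calls, filtered, filtered_calls, sense_calls = run_demo()
    print("same answers:", naive == filtered)
    print("service requests without filter:", naive_calls)
    print("service requests with filter:", filtered_calls)
    print("sensing requests:", sense_calls)
```

In this toy setting both plans return the same records, but the filtered plan sends fewer requests to the expensive service. The sensing step adds its own requests, so the filter only pays off under the assumption, stated in the abstract, that sensing is much cheaper than the requests it avoids.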

Journal

The VLDB Journal – Springer Journals

Published: Sep 1, 2005
