Pipelined data‐flow delegated orchestration for data‐intensive eScience workflows

Publisher
Emerald Publishing
Copyright
Copyright © 2013 Emerald Group Publishing Limited. All rights reserved.
ISSN
1744-0084
DOI
10.1108/IJWIS-05-2013-0012

Abstract

Purpose – eScience workflows use orchestration to integrate and coordinate distributed, heterogeneous scientific resources, which are increasingly exposed as web services. The rapid growth of scientific data makes eScience workflows data‐intensive, which challenges existing workflow solutions. Efficient methods of handling large data in scientific workflows based on web services are needed. The purpose of this paper is to address this issue.

Design/methodology/approach – In a previous paper the authors proposed Data‐Flow Delegation (DFD) as a means to optimize orchestrated workflow performance, focusing on SOAP web services. To improve performance further, this paper proposes Pipelined Data‐Flow Delegation (PDFD) for web service‐based eScience workflows, drawing on techniques from parallel programming. Briefly, PDFD allows large datasets to be partitioned into independent subsets that can be communicated between services in a pipelined manner.

Findings – The results show that PDFD improves the execution time of the workflow considerably and can handle much larger data than the non‐pipelined approach.

Practical implications – Execution of a web service‐based workflow that is hampered by the size of its data can be facilitated or improved by using services that support Pipelined Data‐Flow Delegation.

Originality/value – Contributions of this work include the proposed concept of combining pipelining and Data‐Flow Delegation, an XML Schema supporting PDFD communication between services, and a practical evaluation of the PDFD approach.
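The partitioning idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it involves no SOAP messages, no XML Schema, and no real network transfer. The names `partition`, `service_a`, `service_b`, and `run_pipelined` are hypothetical, and Python generators only mimic the shape of the data flow inside a single process. They interleave the stages one subset at a time rather than running them in parallel.

```python
from typing import Iterable, Iterator, List


def partition(dataset: List[int], chunk_size: int) -> Iterator[List[int]]:
    """Split a dataset into independent subsets of at most chunk_size items."""
    for start in range(0, len(dataset), chunk_size):
        yield dataset[start:start + chunk_size]


def service_a(chunks: Iterable[List[int]]) -> Iterator[List[int]]:
    """Hypothetical upstream service: transforms one subset at a time."""
    for chunk in chunks:
        yield [x * 2 for x in chunk]


def service_b(chunks: Iterable[List[int]]) -> Iterator[List[int]]:
    """Hypothetical downstream service: consumes each subset as it arrives."""
    for chunk in chunks:
        yield [x + 1 for x in chunk]


def run_pipelined(dataset: List[int], chunk_size: int) -> List[int]:
    """Chain the stages lazily so each stage holds only one subset at a time."""
    result: List[int] = []
    for chunk in service_b(service_a(partition(dataset, chunk_size))):
        result.extend(chunk)
    return result


if __name__ == "__main__":
    print(run_pipelined(list(range(10)), chunk_size=4))
```

Because each subset is handed to the next stage as soon as it is produced, no stage has to hold the entire dataset at once. In a real deployment the stages would be separate services exchanging subsets directly, which is where overlapping transfer and computation could reduce execution time.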

Journal

International Journal of Web Information Systems, Emerald Publishing

Published: Aug 23, 2013

Keywords: eScience; Scientific workflow; SOAP web services; Orchestration; Pipelining; Data management; Work flow
