Guest Editorial to the special issue on data stream processing

The VLDB Journal (2004) 13: 317 / Digital Object Identifier (DOI) 10.1007/s00778-004-0141-5

Johannes Gehrke (1), Joseph M. Hellerstein (2, 3)

(1) Department of Computer Science, Cornell University; Ithaca, NY 14853, USA; e-mail: johannes@cs.cornell.edu
(2) Computer Science Division, University of California, Berkeley; Berkeley, CA 94720, USA; e-mail: jmh@cs.berkeley.edu
(3) Intel Research, Berkeley; Berkeley, CA 94704, USA

Published online: November 12, 2004. © Springer-Verlag 2004

Data stream management techniques have been a hot research area in the database community for the last 5 years. To our call for papers for this special issue with a deadline of October 2003 we received 23 submissions that covered a wide range of ongoing data stream research. In two rounds of review, we selected five papers that represent the diversity and depth of this research. Early work in data streams concentrated on developing efficient algorithms for specific data stream queries such as sampling, join size estimation, and quantiles. This issue shows that current data stream research has matured and transcended pure algorithmic research to novel data types such as XML and to core systems issues.

The stream considered in the first paper consists of XML user queries rather than traditional data records. The paper considers how to efficiently mine frequent XML query patterns. As it is not feasible to keep all queries in main memory, the authors give efficient algorithms to incrementally maintain frequent user queries.

The second paper considers how a data stream management system can deal with load spikes by carefully scheduling operators in the system. The suggested scheduling method, chain scheduling, keeps the output latency within a given bound while minimizing queuing memory.

The third paper shows how to give approximate answers to aggregate queries over datasets undergoing constant change. In particular, this paper focuses on dealing with a stream that includes not only insertions of new data but also deletions of old data.

The fourth paper is an experience paper. It describes the latest lessons from the design and implementation of the Aurora stream processing engine, and it describes the authors’ vision for their next system.

The issue concludes with an article on data stream processing in sensor networks. Sensor nodes are different from traditional computers since energy is one of the limiting factors. The authors propose two methods for saving energy. First, they propose a group-aware network construction that minimizes network traffic. Second, they allow queries to specify that approximate query results (within user-specified bounds) are sufficient, a further opportunity to reduce traffic.

Overall, we believe that these papers are an excellent snapshot of the state of the data stream community as of early 2004, and we hope that you will enjoy reading the papers as much as we did.

Acknowledgements. We would like to thank Tamer Ozsu, our editor-in-chief, for his advice and support throughout the process, and we would like to thank Stacey Shirk for administrative support. Our biggest thanks go to the authors whose contributions created the issue that you are reading.

Johannes Gehrke
Joseph M. Hellerstein

Publisher: Springer Journals
Copyright: Copyright © 2004 by Springer-Verlag
Subject: Computer Science; Database Management
ISSN: 1066-8888
eISSN: 0949-877X
DOI: 10.1007/s00778-004-0141-5


Journal: The VLDB Journal (Springer Journals)

Published: Dec 1, 2004

