Predictable performance and high query concurrency for data analytics

Predictable performance and high query concurrency for data analytics Conventional data warehouses employ the query-at-a-time model, which maps each query to a distinct physical plan. When several queries execute concurrently, this model introduces contention and thrashing, because the physical plans—unaware of each other—compete for access to the underlying I/O and computation resources. As a result, while modern systems can efficiently optimize and evaluate a single complex data analysis query, their performance suffers significantly and can be highly erratic when multiple complex queries run at the same time. We present in this paper C join , a new design that substantially improves throughput in large-scale data analytics systems processing many concurrent join queries. In contrast to the conventional query-at-a-time model our approach employs a single physical plan that shares I/O, computation, and tuple storage across all in-flight join queries. We use an “always on” pipeline of non-blocking operators, managed by a controller that continuously examines the current query mix and optimizes the pipeline on the fly. Our design enables data analytics engines to scale gracefully to large data sets, provide predictable execution times, and reduce contention. We implemented C join as an extension to the PostgreSQL DBMS. This prototype outperforms conventional commercial systems by an order of magnitude for tens to hundreds of concurrent queries. The VLDB Journal Springer Journals

Predictable performance and high query concurrency for data analytics

Loading next page...
Copyright © 2011 by Springer-Verlag
Computer Science; Database Management
Publisher site
See Article on Publisher Site


You’re reading a free preview. Subscribe to read the entire article.

DeepDyve is your
personal research library

It’s your single place to instantly
discover and read the research
that matters to you.

Enjoy affordable access to
over 12 million articles from more than
10,000 peer-reviewed journals.

All for just $49/month

Explore the DeepDyve Library

Unlimited reading

Read as many articles as you need. Full articles with original layout, charts and figures. Read online, from anywhere.

Stay up to date

Keep up with your field with Personalized Recommendations and Follow Journals to get automatic updates.

Organize your research

It’s easy to organize your research with our built-in tools.

Your journals are on DeepDyve

Read from thousands of the leading scholarly journals from SpringerNature, Elsevier, Wiley-Blackwell, Oxford University Press and more.

All the latest content is available, no embargo periods.

See the journals in your area

Monthly Plan

  • Read unlimited articles
  • Personalized recommendations
  • No expiration
  • Print 20 pages per month
  • 20% off on PDF purchases
  • Organize your research
  • Get updates on your journals and topic searches


Start Free Trial

14-day Free Trial

Best Deal — 39% off

Annual Plan

  • All the features of the Professional Plan, but for 39% off!
  • Billed annually
  • No expiration
  • For the normal price of 10 articles elsewhere, you get one full year of unlimited access to articles.



billed annually
Start Free Trial

14-day Free Trial