Multiversion databases store both current and historical data. Rows are typically annotated with timestamps representing the period when the row is/was valid. We develop novel techniques to reduce index maintenance in multiversion databases, so that indexes can be used effectively for analytical queries over current data without being a heavy burden on transaction throughput. To achieve this end, we re-design persistent index data structures in the storage hierarchy to employ an extra level of indirection. The indirection level is stored on solid-state disks that can support very fast random I/Os, so that traversing the extra level of indirection incurs a relatively small overhead. The extra level of indirection dramatically reduces the number of magnetic disk I/Os that are needed for index updates and localizes maintenance to indexes on updated attributes. Additionally, we batch insertions within the indirection layer in order to reduce physical disk I/Os for indexing new records. In this work, we further exploit SSDs by introducing novel DeltaBlock techniques for storing the recent changes to data on SSDs. Using our DeltaBlock, we propose an efficient method to periodically flush the recently changed data from SSDs to HDDs such that, on the one hand, we keep track of every change (or delta) for every record, and, on the other hand, we avoid redundantly storing the unchanged portion of updated records. By reducing the index maintenance overhead on transactions, we enable operational data stores to create more indexes to support queries. We have developed a prototype of our indirection proposal by extending the widely used generalized search tree open-source project, which is also employed in PostgreSQL. Our working implementation demonstrates that we can significantly reduce index maintenance and/or query processing cost by a factor of 3. For the insertion of new records, our novel batching technique can save up to 90 % of the insertion time. For updates, our prototype demonstrates that we can significantly reduce the database size by up to 80 % even with a modest space allocated for DeltaBlocks on SSDs.
The VLDB Journal – Springer Journals
Published: Nov 17, 2015
It’s your single place to instantly
discover and read the research
that matters to you.
Enjoy affordable access to
over 18 million articles from more than
15,000 peer-reviewed journals.
All for just $49/month
Query the DeepDyve database, plus search all of PubMed and Google Scholar seamlessly
Save any article or search result from DeepDyve, PubMed, and Google Scholar... all in one place.
Get unlimited, online access to over 18 million full-text articles from more than 15,000 scientific journals.
Read from thousands of the leading scholarly journals from SpringerNature, Elsevier, Wiley-Blackwell, Oxford University Press and more.
All the latest content is available, no embargo periods.
“Hi guys, I cannot tell you how much I love this resource. Incredible. I really believe you've hit the nail on the head with this site in regards to solving the research-purchase issue.”Daniel C.
“Whoa! It’s like Spotify but for academic articles.”@Phil_Robichaud
“I must say, @deepdyve is a fabulous solution to the independent researcher's problem of #access to #information.”@deepthiw
“My last article couldn't be possible without the platform @deepdyve that makes journal papers cheaper.”@JoseServera