Get 20M+ Full-Text Papers For Less Than $1.50/day. Start a 14-Day Trial for You or Your Team.

Learn More →

Pushing similarity joins down to the storage layer in XML databases

Pushing similarity joins down to the storage layer in XML databases PurposeThis article aims to explore how to incorporate similarity joins into XML database management systems (XDBMSs). The authors aim to provide seamless and efficient integration of similarity joins on tree-structured data into an XDBMS architecture.Design/methodology/approachThe authors exploit XDBMS-specific features to efficiently generate XML tree representations for similarity matching. In particular, the authors push down a large part of the structural similarity evaluation close to the storage layer.FindingsEmpirical experiments were conducted to measure and compare accuracy, performance and scalability of the tree similarity join using different similarity functions and on the top of different storage models. The results show that the authors’ proposal delivers performance and scalability without hurting the accuracy.Originality/valueSimilarity join is a fundamental operation for data integration. Unfortunately, none of the XDBMS architectures proposed so far provides an efficient support for this operation. Evaluating similarity joins on XML is challenging, because it requires similarity matching on the text and structure. In this work, the authors integrate similarity joins into an XDBMS. To the best of the authors’ knowledge, this work is the first to leverage the storage scheme of an XDBMS to support XML similarity join processing. http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png International Journal of Web Information Systems Emerald Publishing

Pushing similarity joins down to the storage layer in XML databases

Loading next page...
 
/lp/emerald-publishing/pushing-similarity-joins-down-to-the-storage-layer-in-xml-databases-MgpyHGJuY0
Publisher
Emerald Publishing
Copyright
Copyright © Emerald Group Publishing Limited
ISSN
1744-0084
DOI
10.1108/IJWIS-04-2016-0019
Publisher site
See Article on Publisher Site

Abstract

PurposeThis article aims to explore how to incorporate similarity joins into XML database management systems (XDBMSs). The authors aim to provide seamless and efficient integration of similarity joins on tree-structured data into an XDBMS architecture.Design/methodology/approachThe authors exploit XDBMS-specific features to efficiently generate XML tree representations for similarity matching. In particular, the authors push down a large part of the structural similarity evaluation close to the storage layer.FindingsEmpirical experiments were conducted to measure and compare accuracy, performance and scalability of the tree similarity join using different similarity functions and on the top of different storage models. The results show that the authors’ proposal delivers performance and scalability without hurting the accuracy.Originality/valueSimilarity join is a fundamental operation for data integration. Unfortunately, none of the XDBMS architectures proposed so far provides an efficient support for this operation. Evaluating similarity joins on XML is challenging, because it requires similarity matching on the text and structure. In this work, the authors integrate similarity joins into an XDBMS. To the best of the authors’ knowledge, this work is the first to leverage the storage scheme of an XDBMS to support XML similarity join processing.

Journal

International Journal of Web Information SystemsEmerald Publishing

Published: Apr 18, 2017

References