
Unary and n-ary inclusion dependency discovery in relational databases




Publisher
Springer Journals
Copyright
Copyright © 2007 by Springer Science+Business Media, LLC
Subject
Computer Science; Business Information Systems; Document Preparation and Text Processing; Artificial Intelligence (incl. Robotics); Data Structures, Cryptology and Information Theory
ISSN
0925-9902
eISSN
1573-7675
DOI
10.1007/s10844-007-0048-x

Abstract

Foreign keys are among the most fundamental constraints in relational databases. Because they are not always declared in existing databases, discovering them is an important and challenging task. The underlying problem is known as the inclusion dependency (IND) inference problem. In this paper, data-mining algorithms are devised for IND inference in a given database. We propose a two-step approach. In the first step, unary INDs are discovered using a new preprocessing stage, which leads to a new algorithm and an efficient implementation. In the second step, n-ary INDs are inferred; this step fits the framework of levelwise algorithms used in many data-mining tasks. Since real-world databases can suffer from data inconsistencies, approximate INDs, i.e. INDs that almost hold, are also considered. We show how they can be safely integrated into our unary and n-ary discovery algorithms. These algorithms have been implemented and tested against both synthetic and real-life databases. To our knowledge, no other algorithm exists to solve this data-mining problem.
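
To make the first step concrete, here is a minimal Python sketch of unary IND discovery with an error threshold for approximate INDs. It is an illustration under stated assumptions, not the paper's algorithm: the abstract does not describe the preprocessing stage, so the value-to-attributes inverted index used here is one plausible reading. The function name `unary_inds`, the `max_error` parameter, the error measure (the fraction of distinct values of the left-hand attribute missing from the right-hand one), and the sample column names are all hypothetical.

```python
from collections import defaultdict
from itertools import permutations


def unary_inds(columns, max_error=0.0):
    """Return the set of pairs (A, B), A != B, such that A is included in B.

    columns:   dict mapping an attribute name to an iterable of its values.
    max_error: tolerated fraction of distinct values of A that are absent
               from B (0.0 means exact INDs only). This error measure is an
               assumption made for illustration.
    """
    # Assumed preprocessing: inverted index from each value to the set of
    # attributes in which it occurs. NULLs (None) are ignored.
    index = defaultdict(set)
    distinct = {}
    for attr, values in columns.items():
        vals = {v for v in values if v is not None}
        distinct[attr] = vals
        for v in vals:
            index[v].add(attr)

    # For each ordered pair (A, B), count the distinct values of A that
    # also occur in B, in a single pass over the index.
    shared = defaultdict(int)
    for attrs in index.values():
        for a in attrs:
            for b in attrs:
                if a != b:
                    shared[(a, b)] += 1

    result = set()
    for a, b in permutations(columns, 2):
        n = len(distinct[a])
        if n == 0:
            continue  # skip empty columns rather than report trivial INDs
        error = 1 - shared[(a, b)] / n
        if error <= max_error:
            result.add((a, b))
    return result


# Hypothetical sample data: column names and values are made up.
columns = {
    "orders.customer_id": [1, 2, 2, 3],
    "customers.id": [1, 2, 3, 4],
    "orders.ref": [1, 2, 9],
}
exact = unary_inds(columns)
approx = unary_inds(columns, max_error=0.34)
```

With `max_error=0.0` only exact INDs are returned, which are the natural foreign-key candidates. Raising the threshold also admits INDs that almost hold, for example a column with a few dirty values. The unary INDs found this way would then seed the candidate generation of a levelwise n-ary search, which this sketch does not cover.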

Journal

Journal of Intelligent Information Systems, Springer Journals

Published: Jan 26, 2008
