Purpose – The increasing popularity of XML has generated a lot of interest in query processing over graph‐structured data. To support efficient evaluation of path expressions, structured indexes have been proposed. Extending these indexes to work with large XML graphs and to support intra‐ or inter‐document links requires a lot of computing power for index creation and a lot of space for index storage. Moreover, the efficient evaluation of ancestors‐descendants queries over arbitrary graphs with long paths is a severe problem. This paper aims to propose a scalable path index based on the concept of 2‐hop covers as introduced by Cohen et al.
Design/methodology/approach – The problem of efficiently managing and querying XML documents poses interesting challenges for database research. The proposed index‐creation algorithm first scales down the original graph substantially, producing a directed acyclic graph with fewer nodes and edges. This reduces the number of computing steps required to build the index, and with it the computing time and storage space. The index also permits ancestors‐descendants relationships to be evaluated efficiently. Moreover, in contrast to most other work, the proposed index is optimized for descendants‐or‐self queries on arbitrary graphs with link relationships.
Findings – In this paper, a scalable path index is proposed. It can efficiently address the problem of querying large XML documents that contain links and therefore cycles, which stress path‐indexing algorithms. An overview of 2‐hop covers and of the algorithms used to build the index is given.
Research limitations/implications – Because constructing the index is quite complex, an index should remain usable for some time once it has been built. This raises the problem of maintaining the index when the underlying XML documents are updated, which is the subject of current work.
Originality/value – This paper presents an efficient path index that can test the reachability between two nodes and evaluate ancestors‐descendants queries over arbitrary graphs with long paths.
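To make the reachability test concrete, the following is a minimal sketch of a 2‐hop labelling in Python. It is not the paper's algorithm and not the greedy cover construction of Cohen et al.; it swaps in a simple pruned breadth‐first labelling, chosen only because it is short and works directly on graphs with cycles. Each node u receives a set l_out(u) of nodes it can reach and each node v a set l_in(v) of nodes that can reach it, such that u reaches v exactly when l_out(u) and l_in(v) share a node. The function names, the dictionary‐based graph format, and the degree‐based processing order are assumptions made for illustration.

```python
from collections import deque


def build_two_hop_labels(graph):
    """Build 2-hop labels (l_out, l_in) for a directed graph.

    graph maps each node to an iterable of its successors; cycles are allowed.
    Illustrative sketch only: a pruned BFS labelling, not the paper's method.
    """
    nodes = set(graph)
    for successors in graph.values():
        nodes.update(successors)
    forward = {v: list(graph.get(v, ())) for v in nodes}
    backward = {v: [] for v in nodes}
    for u in nodes:
        for v in forward[u]:
            backward[v].append(u)

    l_out = {v: set() for v in nodes}  # centers reachable from v
    l_in = {v: set() for v in nodes}   # centers that reach v

    def pruned_bfs(center, adjacency, labels, already_covered):
        queue = deque([center])
        seen = {center}
        while queue:
            node = queue.popleft()
            # Skip nodes whose connection to the center is already
            # answered by labels from earlier centers.
            if node != center and already_covered(node):
                continue
            labels[node].add(center)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)

    # Assumed heuristic: process well-connected nodes first.
    order = sorted(
        nodes,
        key=lambda v: (-(len(forward[v]) + len(backward[v])), repr(v)),
    )
    for center in order:
        pruned_bfs(center, forward, l_in,
                   lambda n, c=center: reachable(l_out, l_in, c, n))
        pruned_bfs(center, backward, l_out,
                   lambda n, c=center: reachable(l_out, l_in, n, c))
    return l_out, l_in


def reachable(l_out, l_in, u, v):
    """Descendants-or-self test: True if v is u itself or reachable from u."""
    return not l_out[u].isdisjoint(l_in[v])


# Hypothetical document graph: a, b, c form a cycle through a link edge.
graph = {"a": ["b"], "b": ["c"], "c": ["a", "d"], "d": [], "e": ["d"]}
l_out, l_in = build_two_hop_labels(graph)
print(reachable(l_out, l_in, "a", "d"))
print(reachable(l_out, l_in, "d", "a"))
```

In this sketch a query touches only the two label sets of the nodes involved, so its cost depends on label size rather than on path length, which is the property that makes 2‐hop covers attractive for ancestors‐descendants queries over long paths.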
International Journal of Web Information Systems – Emerald Publishing
Published: Apr 3, 2009
Keywords: Extensible markup language; Databases; Query Languages