Access the full text.
Sign up today, get DeepDyve free for 14 days.
RE Tarjan (1972)
Depth-first search and linear graph algorithmsSIAM J. Comput., 1
A strongly connected component ( $$\mathsf {SCC}$$ SCC ) is a maximal subgraph of a directed graph $$G$$ G in which every pair of nodes is reachable from each other in the $$\mathsf {SCC}$$ SCC . With such a property, a general directed graph can be represented by a directed acyclic graph ( DAG ) by contracting every $$\mathsf {SCC}$$ SCC of $$G$$ G to a node in DAG . In many real applications that need graph pattern matching, topological sorting, or reachability query processing, the best way to deal with a general directed graph is to deal with its DAG representation. Therefore, finding all $$\mathsf {SCC}$$ SCC s in a directed graph $$G$$ G is a critical operation. The existing in-memory algorithms based on depth first search ( DFS ) can find all $$\mathsf {SCC}$$ SCC s in linear time with respect to the size of a graph. However, when a graph cannot reside entirely in the main memory, the existing external or semi-external algorithms to find all $$\mathsf {SCC}$$ SCC s have limitation to achieve high I/O efficiency. In this paper, we study new I/O-efficient semi-external algorithms to find all $$\mathsf {SCC}$$ SCC s for a massive directed graph $$G$$ G that cannot reside in main memory entirely. To overcome the deficiency of the existing DFS -based semi-external algorithm that heavily relies on a total order, we explore a weak order based on which we investigate new algorithms. We propose a new two-phase algorithm, namely, tree construction and tree search. In the tree construction phase, a spanning tree of $$G$$ G can be constructed in bounded number of sequential scans of $$G$$ G . In the tree search phase, it needs to sequentially scan the graph once to find all $$\mathsf {SCC}$$ SCC s. In addition, we propose a new single-phase algorithm, which combines the tree construction and tree search phases into a single phase, with three new optimization techniques. They are early acceptance, early rejection, and batch processing. By the single-phase algorithm with the new optimization techniques, we can significantly reduce the number of I/Os and the CPU cost. We prove the correctness of the algorithms. We conduct extensive experimental studies using 4 real datasets including a massive real dataset and several synthetic datasets to confirm the I/O efficiency of our approaches.
The VLDB Journal – Springer Journals
Published: Apr 1, 2015
Read and print from thousands of top scholarly journals.
Already have an account? Log in
Bookmark this article. You can see your Bookmarks on your DeepDyve Library.
To save an article, log in first, or sign up for a DeepDyve account if you don’t already have one.
Copy and paste the desired citation format or use the link below to download a file formatted for EndNote
Access the full text.
Sign up today, get DeepDyve free for 14 days.
All DeepDyve websites use cookies to improve your online experience. They were placed on your computer when you launched this website. You can change your cookie settings through your browser.