The VLDB Journal (1997) 6: 224–240
Concurrency and recovery for index trees
, Betty Salzberg
Microsoft Corporation, One Microsoft Way, Bldg 9, Redmond, WA 98052-6399, USA; email@example.com
College of Computer Science, Northeastern University, Boston, MA 02115, USA; firstname.lastname@example.org
Edited by A. Reuter. Received August 1995 / accepted July 1996
Abstract. Although many suggestions have been made for
concurrency in B+-trees, few of these have considered
recovery as well. We describe an approach which provides
high concurrency while preserving well-formed trees across
system crashes. Our approach works for a class of index
trees that is a generalization of the B^link-tree. This class
includes some multi-attribute indexes and temporal indexes.
Structural changes in an index tree are decomposed into a
sequence of atomic actions, each one leaving the tree
well-formed and each working on a separate level of the tree.
All atomic actions on levels of the tree above the leaf level
are independent of database transactions, and so are of short
duration. Incomplete structural changes are detected in normal
operations and trigger their completion.
Key words: Concurrency – Recovery – Indexing – Access
methods – B-trees
In this paper, we describe a concurrency algorithm for a
class of trees which includes a type of B+-tree with
sibling links (called a B^link-tree) and some spatial and
temporal indexes. Data-node splitting is the only part of
index tree restructuring that sometimes takes place within
database transactions; all other parts are independent of
such transactions. The basic principles of this algorithm
were presented in (Lomet and Salzberg 1992). This paper gives
a step-by-step description of the algorithm details.
In order to use this algorithm, a search structure must
have several structural and behavioral properties.
– It must partition the search space at each level of the
tree. That is, the set of spaces associated with nodes at
a given level covers the search space and no two nodes’
spaces overlap.
– When a node is split, a pointer to the new node must be
inserted in the old node.
Other properties are similar to those of the standard B+-tree:
– The data is all in the leaves.
– Insertion must first search down the tree, inserting the
new record in a leaf if there is room and, if not, splitting
the leaf, creating a new leaf and posting the
information to the parent.
– Splitting and posting continue up the tree if needed.
– When nodes get sparse, two “adjacent” nodes may sometimes
be consolidated when they share the same parent.
We call such a tree a Π-tree, and we give a formal definition
of the Π-tree in Sect. 3.
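The two structural properties listed above (partitioning and a pointer left in the split node) can be illustrated with a minimal sketch. It is restricted to a single leaf level over integer keys, and it omits posting to a parent entirely. The names `Leaf`, `high`, `side`, `CAPACITY`, `split`, `find_leaf` and `insert` are hypothetical and do not come from the paper; the formal definition in Sect. 3 is more general.

```python
import bisect
from dataclasses import dataclass, field
from typing import List, Optional

CAPACITY = 4  # hypothetical node capacity, chosen only for illustration


@dataclass
class Leaf:
    keys: List[int] = field(default_factory=list)
    high: Optional[int] = None     # exclusive upper bound of this node's space (None = unbounded)
    side: Optional["Leaf"] = None  # pointer to the node created when this one was split


def split(leaf: Leaf) -> Leaf:
    """Move the upper half of the keys to a new node and leave a pointer to it."""
    mid = len(leaf.keys) // 2
    new = Leaf(keys=leaf.keys[mid:], high=leaf.high, side=leaf.side)
    leaf.keys = leaf.keys[:mid]
    leaf.high = new.keys[0]  # the old node now covers only keys below this bound
    leaf.side = new          # the pointer required by the second property
    return new


def find_leaf(start: Leaf, key: int) -> Leaf:
    """Follow side pointers until reaching the node whose space contains key."""
    node = start
    while node.high is not None and key >= node.high:
        node = node.side
    return node


def insert(start: Leaf, key: int) -> None:
    leaf = find_leaf(start, key)
    bisect.insort(leaf.keys, key)
    if len(leaf.keys) > CAPACITY:
        split(leaf)


first = Leaf()
for k in range(1, 11):
    insert(first, k)
```

Because every split leaves a side pointer behind, a search that starts at the leftmost node still reaches every key, even though no parent has been told about any of the splits. The spaces of the nodes along the chain are disjoint and together cover the whole key range.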
The B^link-tree (Lehman and Yao 1981), which is a B+-tree
with sibling links, is a Π-tree. A time-split B-tree (or
TSB-tree) (Lomet and Salzberg 1989) can be made into a
Π-tree by adding sibling links. The TSB-tree is a temporal
index, where the search space is a two-dimensional space
based on time and database key. The hB^Π-tree (Evangelidis
et al. 1995, 1997) is a spatial search structure on any
number of dimensions which is a Π-tree. The R-tree (Guttman
1984) is not a Π-tree (and could not easily be made into
one by adding sibling links), because the spaces associated
with nodes which are on the same level of the tree overlap.
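To make the distinction concrete, the following sketch checks the partitioning requirement for one level of a tree in the simplest case, a one-dimensional search space with half-open intervals `[lo, hi)`. The function name `is_partition` and the interval representation are assumptions made for this illustration only; real Π-trees may use multi-dimensional spaces.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open interval [lo, hi)


def is_partition(intervals: List[Interval], space: Interval) -> bool:
    """Return True if the intervals cover `space` exactly, with no gaps and no overlaps."""
    lo, hi = space
    cursor = lo
    for a, b in sorted(intervals):
        if a != cursor:  # a > cursor is a gap, a < cursor is an overlap
            return False
        cursor = b
    return cursor == hi


# A level whose node spaces partition [0, 100), as a Π-tree requires.
disjoint_level = [(0, 10), (10, 25), (25, 100)]
# A level whose node spaces overlap, as can happen in an R-tree.
overlapping_level = [(0, 15), (10, 25), (25, 100)]
```

Under these assumptions `is_partition(disjoint_level, (0, 100))` holds and `is_partition(overlapping_level, (0, 100))` does not, which is the property that excludes the R-tree from the class.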
The algorithm described here can be used on any Π-tree.
It provides a high degree of concurrency because it
breaks down structural changes into a series of short-term
atomic actions: splitting a node, posting split information
to a parent, or consolidating a node with one of its siblings.
Consolidation requires three nodes to be locked: the parent
and the two siblings being consolidated. Splitting index
nodes requires only the node being split to be locked.
Posting requires the node receiving the new split information
(the parent) to be locked, and also requires a short-term
lock on the child to verify that posting is still needed. The
details of this algorithm, with explicit directions for locking,
are in Sect. 6.
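The lock sets of the three atomic actions can be sketched as follows. This is a simplified illustration, not the algorithm of Sect. 6: it uses plain mutexes from Python's `threading` module instead of a database lock manager, it ignores node contents, logging and recovery, and the lock acquisition order (parent first, then children from left to right) is an assumption. The names `Node`, `split`, `post` and `consolidate` are hypothetical.

```python
import threading
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # eq=False: nodes are compared by identity
class Node:
    name: str
    side: Optional["Node"] = None                          # sibling created by a split
    children: List["Node"] = field(default_factory=list)   # index terms (used by parents only)
    lock: threading.Lock = field(default_factory=threading.Lock, repr=False)


def split(node: Node, new_name: str) -> Node:
    """Split: only the node being split is locked."""
    with node.lock:
        new = Node(new_name, side=node.side)
        node.side = new
        return new


def post(parent: Node, child: Node) -> bool:
    """Post: the parent is locked; the child is locked briefly to verify the need."""
    with parent.lock:
        with child.lock:  # short-term lock, released before the parent is updated
            sibling = child.side
            needed = (
                sibling is not None
                and child in parent.children
                and sibling not in parent.children
            )
        if needed:
            parent.children.insert(parent.children.index(child) + 1, sibling)
        return needed


def consolidate(parent: Node, left: Node, right: Node) -> bool:
    """Consolidate: the parent and both siblings are locked."""
    with parent.lock, left.lock, right.lock:
        if (left.side is not right
                or left not in parent.children
                or right not in parent.children):
            return False
        left.side = right.side           # absorb the right sibling
        parent.children.remove(right)    # drop its index term
        return True


parent = Node("P")
a = Node("A")
parent.children.append(a)
b = split(a, "B")                # the parent does not know about B yet
posted = post(parent, a)         # completes the structural change
posted_again = post(parent, a)   # posting is no longer needed
```

In this sketch a repeated `post` call finds that the index term is already present and does nothing, which mirrors the idea that posting must first verify that it is still needed.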
The subject of concurrency in B+-trees has a long history
(Bayer 1977; Lehman and Yao 1981; Mohan and Levine
1992; Sagiv 1986; Salzberg 1985; Shasha and Goodman
1988). Most work, with the exception of Mohan and Levine
(1992) and Gray and Reuter (1993), has not treated the
problem of system crashes during structural changes. In this
paper, we show how to manage both concurrency and
recovery for a wide class of index tree structures.