Target tracking with incomplete detection
Qian Yu
Isaac Cohen
Honeywell Labs, 1985 Douglas Drive North, Golden Valley, MN 55422, USA
Sarnoff Corporation, 201 Washington Rd, Princeton, NJ 08536, USA
Received 29 August 2006
Accepted 21 January 2009
Available online 30 January 2009
Multiple target tracking
Split and merge of detected regions
Maximum a posteriori
In this paper, we address the multiple target tracking problem as a maximum a posteriori problem.
We adopt a graph representation of all observations over time. To make full use of the visual observations
from the image sequence, we introduce both motion and appearance likelihoods. The multiple
target tracking problem is formulated as finding multiple optimal paths in the graph. Due to noisy
foreground segmentation, an object may be represented by several foreground regions and, similarly,
one foreground region may correspond to multiple objects. To deal with this problem, we propose
merge, split and mean shift operations that add new hypotheses to the measurement graph. The
proposed approach uses a sliding window framework that aggregates information across a fixed
number of frames. Experimental results on both indoor and outdoor data sets are reported. Furthermore,
we provide a comparison between the proposed approach and existing methods that do
not merge/split detected blobs.
© 2009 Elsevier Inc. All rights reserved.
Multiple target tracking is a key component in visual surveillance.
Tracking provides a spatio-temporal description of detected
moving regions in the scene; this low-level information is critical
for recognition of human actions in video surveillance. In the
visual tracking problem considered here, the observations are the detected
moving blobs. Incomplete observations due to occlusions,
stop-and-go motion or noisy foreground detections constitute the
main limitation of blob-based tracking methods. We propose a
tracking method that can split and merge detected moving regions
and re-acquire moving targets after a stop-and-go
motion or occlusion.
Several problems need to be addressed by a tracking algorithm.
A single moving object (e.g. one person) can be detected as multiple
moving blobs; in this case the tracking algorithm needs to
‘merge’ the detected blobs. Similarly, one detected blob can be
composed of multiple moving objects; in this case the tracking
algorithm needs to ‘split’ and segment the detected blob into the corresponding
moving objects. The split and merge of detected blobs
has to be robust to partial or total occlusions, and capable
of differentiating the detected moving regions of nearby objects.
Stop-and-go motion, or non-detection due to similarity of the object
to the background, may require the tracker to re-acquire the
target. Moreover, the detected blobs could be due to erroneous motion
detection. Here the tracking algorithm needs to filter these observations
in the presence of static or dynamic occlusions of the moving
objects in the scene. Finally, the number of moving objects in the
scene varies as moving objects enter or leave the field of view
of the camera.
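As an illustration of the merge case described above, the following minimal sketch augments the set of detected blobs in one frame with merged candidates formed from pairs of nearby blobs. It is not the merge operation of the proposed method, which is not specified in this section. The bounding-box representation `(x0, y0, x1, y1)`, the function names and the `max_gap` threshold are all assumptions made for the example.

```python
from itertools import combinations


def box_gap(a, b):
    """Gap in pixels between two axis-aligned boxes (x0, y0, x1, y1).

    Returns 0 when the boxes touch or overlap.
    """
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return max(dx, dy)


def union_box(a, b):
    """Smallest axis-aligned box containing both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]),
            max(a[2], b[2]), max(a[3], b[3]))


def merge_hypotheses(blobs, max_gap=10):
    """Return the original blobs plus one merged candidate per nearby pair.

    max_gap is a hypothetical distance threshold in pixels. The original
    blobs are kept, so a tracker can still choose the unmerged reading.
    """
    candidates = list(blobs)
    for a, b in combinations(blobs, 2):
        if box_gap(a, b) <= max_gap:
            candidates.append(union_box(a, b))
    return candidates


blobs = [(0, 0, 10, 10), (12, 0, 20, 10), (100, 100, 110, 110)]
hypotheses = merge_hypotheses(blobs)
```

Keeping both the original and the merged candidates defers the decision to the tracker, which can select whichever reading is most consistent over time.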
A large number of tracking algorithms have been developed in
the past decades. The interested reader can refer to  for a re-
cent comprehensive survey of the ﬁeld. Several data association
tracking algorithms have been proposed ranging from a simple
nearest neighbor association to the complex multiple hypothesis
tracker [9,10]. The Probabilistic Data Association (PDA) method
, which is considered a good compromise between perfor-
mance and complexity, uses a weighted average of all the mea-
surements within the tracks’ validation gate  to estimate the
target state. The PDA method treats multiple targets as independent
objects in terms of observations, and is therefore less suitable
for situations where multiple observations correspond
to a single target and vice versa. JPDAF [7,8] is an extension
of the PDA in which the measurement-to-target association probabilities
are evaluated jointly across the targets. The Multiple Hypothesis
Tracker (MHT) algorithm was first developed by Reid
 and propagated multiple hypotheses in time. The ranking 
of the hypotheses requires evaluating over all existing hypotheses
and thus pruning and merging  were used to reduce the set of
hypotheses to a manageable size. A class of maximum likelihood
methods seeks single or multiple best paths in an observation graph
[2,5]; however, these methods assume no missed detections and a
known number of objects.
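The weighted averaging used by PDA can be sketched as follows. This is a simplified illustration, not a full PDA filter: it assumes an isotropic innovation covariance `sigma**2`, omits the weight for the event that no gated measurement originates from the target, and does not update the state covariance. The function name and the `sigma` and `gate` parameters are hypothetical.

```python
import math


def gated_weighted_measurement(predicted, measurements, sigma=5.0, gate=9.0):
    """Combine the measurements inside the validation gate of one track.

    predicted    -- predicted measurement (x, y) of the track
    measurements -- list of detected positions (x, y)
    sigma        -- assumed isotropic standard deviation of the innovation
    gate         -- threshold on the squared normalized distance
    """
    px, py = predicted
    weighted = []
    for mx, my in measurements:
        # Squared normalized distance to the prediction.
        d2 = ((mx - px) ** 2 + (my - py) ** 2) / sigma ** 2
        if d2 <= gate:  # keep only measurements inside the gate
            weighted.append((math.exp(-0.5 * d2), mx, my))
    if not weighted:
        # No measurement in the gate: fall back to the prediction.
        return predicted
    total = sum(w for w, _, _ in weighted)
    x = sum(w * mx for w, mx, _ in weighted) / total
    y = sum(w * my for w, _, my in weighted) / total
    return (x, y)


combined = gated_weighted_measurement((0.0, 0.0),
                                      [(3.0, 0.0), (-3.0, 0.0), (100.0, 100.0)])
```

Because each track is processed independently, a detection that falls inside the gates of two tracks contributes to both, which is the limitation noted above for cases where observations and targets are not in one-to-one correspondence.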
1077-3142/$ - see front matter © 2009 Elsevier Inc. All rights reserved.
* Corresponding author.
E-mail addresses: firstname.lastname@example.org (Y. Ma), email@example.com (Q. Yu),
firstname.lastname@example.org (I. Cohen).
Computer Vision and Image Understanding 113 (2009) 580–587