Vision and RFID data fusion for tracking people in crowds by a mobile robot
T. Germa a,b,*, F. Lerasle a,b, N. Ouadah a,c, V. Cadenat a,b
a CNRS, LAAS, 7, Avenue du Colonel Roche, F-31077 Toulouse, France
b Université de Toulouse, UPS, INSA, INP, ISAE, LAAS-CNRS, F-31077 Toulouse, France
c CDTA/ENP, Cité 20 août 1956, Baba Hassen, Alger, Algeria
article info
Article history:
Received 13 January 2009
Accepted 5 January 2010
Available online 22 January 2010
Keywords:
Radio frequency ID
Multimodal data fusion
Particle filtering
Person tracking
Person following
Multi-sensor fusion
Human visual servoing
abstract
In this paper, we address the problem of realizing a human following task in a crowded environment. We
consider an active perception system, consisting of a camera mounted on a pan-tilt unit and a 360° RFID
detection system, both embedded on a mobile robot. To perform such a task, it is necessary to efficiently
track humans in crowds. In a first step, we have dealt with this problem using the particle filtering frame-
work because it enables the fusion of heterogeneous data, which improves the tracking robustness. In a
second step, we have considered the problem of controlling the robot motion to make the robot follow
the person of interest. To this aim, we have designed a multi-sensor-based control strategy based on
the tracker outputs and on the RFID data. Finally, we have implemented the tracker and the control strat-
egy on our robot. The obtained experimental results highlight the relevance of the developed perceptual
functions. Possible extensions of this work are discussed at the end of the article.
© 2010 Elsevier Inc. All rights reserved.
1. Introduction
Giving a mobile robot the ability to automatically follow a
person appears to be a key issue in making it interact efficiently with
humans. Numerous applications would benefit from such a capa-
bility. Service robotics is obviously one of these applications, as it
requires interactive robots [16] able to follow a person to provide
continual assistance in office buildings, museums, hospital envi-
ronments, or even in shopping centers. Service robots clearly need
to move in ways that are socially suitable for people. Such a robot
has to localize its user, to discriminate him/her from other
passers-by, and to be able to follow him/her across complex
human-centered environments. In this context, tracking a given person in
crowds from a mobile platform appears to be fundamental.
However, numerous difficulties arise: moving cameras with a limited
field of view, cluttered backgrounds, illumination variations, hard
real-time constraints, and so on.
The literature offers many tools to overcome these difficulties.
Our paper focuses on the particle filtering framework, as it makes it
easy to fuse heterogeneous data from embedded sensors. Although
their outputs are sporadic, dedicated person detectors and their
hardware counterparts are very discriminant when present.
The paper is organized as follows. Section 2 gives an overview
of related work in our robotic context and
introduces our contributions. Section 3 describes our omnidirectional
RFID prototype, a sensor that is very discriminant, whenever a reading
is present, for detecting the user wearing an RFID tag. Section 4 recalls some
PF basics and details our new importance function for multimodal
person tracking. The developed control strategy to achieve a person
following task in a crowded environment is detailed in Section 5,
while Section 6 presents the mobile robot which has been used for
our tests and the obtained results. Finally, Section 7 summarizes
our contributions and discusses future extensions.
2. Overview and related work
Particle filters (PF) [5], in various schemes, are currently
investigated for person tracking in both the robotics and vision
communities. Besides the well-known CONDENSATION scheme, the
rarely exploited ICONDENSATION [26] variant steers
sampling towards state space regions of high likelihood by
incorporating both the dynamics and the measurements in the importance
function. PF represent the posterior distribution by a set of
samples, or particles, with associated importance weights. This
weighted particle set is first drawn from an importance function
and the initial probability distribution of the state vector, and is then
updated over time taking into account the measurement models.
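
The sampling and weighting cycle described above can be sketched in a few lines. The following is a minimal illustration only, not the tracker developed in this paper: it assumes a 1-D state, a random-walk dynamics model, Gaussian detection and measurement models, and a mixture importance function in the spirit of ICONDENSATION. All names (`pf_step`, `estimate`, `alpha`, the standard deviations) are hypothetical and chosen for the example.

```python
import math
import random


def gauss_pdf(x, mean, std):
    """Density of a 1-D Gaussian N(mean, std^2) evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def pf_step(particles, weights, detection, measurement, rng,
            alpha=0.5, dyn_std=1.0, det_std=0.5, meas_std=1.0):
    """One resample / propose / reweight cycle for a 1-D state.

    detection:   position suggested by an intermittent detector, or None.
    measurement: observation used by the likelihood.
    alpha:       probability of proposing from the dynamics (assumed value).
    """
    n = len(particles)
    # Resample ancestors in proportion to the previous weights.
    ancestors = rng.choices(particles, weights=weights, k=n)
    dyn_only = detection is None
    new_particles, new_weights = [], []
    for x_prev in ancestors:
        # Draw from the mixture importance function q:
        # either the dynamics or a Gaussian centred on the detection.
        if dyn_only or rng.random() < alpha:
            x = rng.gauss(x_prev, dyn_std)
        else:
            x = rng.gauss(detection, det_std)
        prior = gauss_pdf(x, x_prev, dyn_std)
        if dyn_only:
            q = prior
        else:
            q = alpha * prior + (1.0 - alpha) * gauss_pdf(x, detection, det_std)
        likelihood = gauss_pdf(measurement, x, meas_std)
        # Importance weight: likelihood * dynamics / proposal.
        new_particles.append(x)
        new_weights.append(likelihood * prior / q)
    total = sum(new_weights)
    if total == 0.0:
        # Degenerate case (numerical underflow): fall back to uniform weights.
        new_weights = [1.0 / n] * n
    else:
        new_weights = [w / total for w in new_weights]
    return new_particles, new_weights


def estimate(particles, weights):
    """Weighted mean of the particle set (posterior mean estimate)."""
    return sum(p * w for p, w in zip(particles, weights))
```

When no detection is available, the sketch reduces to a CONDENSATION-style step in which particles are proposed from the dynamics alone and weighted by the likelihood. When a detection is present, part of the particle set is drawn around it, and the ratio of dynamics to proposal density in the weight compensates for that redirection.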
Some approaches, e.g. [34], show that intermittent and discriminant
cues based on person detection and recognition functionalities
1077-3142/$ - see front matter © 2010 Elsevier Inc. All rights reserved.
doi:10.1016/j.cviu.2010.01.008
* Corresponding author. Address: CNRS, LAAS, 7, Avenue du Colonel Roche,
F-31077 Toulouse, France.
E-mail addresses: tgerma@laas.fr (T. Germa), lerasle@laas.fr (F. Lerasle),
nouadah@laas.fr (N. Ouadah), cadenat@laas.fr (V. Cadenat).
Computer Vision and Image Understanding 114 (2010) 641–651