Vision and RFID data fusion for tracking people in crowds by a mobile robot
T. Germa, F. Lerasle, N. Ouadah, V. Cadenat
CNRS, LAAS, 7, Avenue du Colonel Roche, F-31077 Toulouse, France
Université de Toulouse, UPS, INSA, INP, ISAE, LAAS-CNRS, F-31077 Toulouse, France
CDTA/ENP, Cité 20 août 1956, Baba Hassen, Alger, Algeria
Received 13 January 2009
Accepted 5 January 2010
Available online 22 January 2010
Radio frequency ID
Multimodal data fusion
Human visual servoing
In this paper, we address the problem of realizing a human following task in a crowded
environment. We consider an active perception system consisting of a camera mounted on a
pan-tilt unit and a 360° RFID detection system, both embedded on a mobile robot. To perform
such a task, it is necessary to efficiently track humans in crowds. In a first step, we have
dealt with this problem using the particle filtering framework because it enables the fusion
of heterogeneous data, which improves the tracking robustness. In a second step, we have
considered the problem of controlling the robot motion to make the robot follow the person
of interest. To this end, we have designed a multi-sensor-based control strategy based on
the tracker outputs and on the RFID data. Finally, we have implemented the tracker and the
control strategy on our robot. The experimental results obtained highlight the relevance of
the developed perceptual functions. Possible extensions of this work are discussed at the
end of the article.
© 2010 Elsevier Inc. All rights reserved.
1. Introduction
Giving a mobile robot the ability to automatically follow a person appears to be a key
issue in making it interact efficiently with humans. Numerous applications would benefit
from such a capability. Service robotics is obviously one of these applications, as it
requires interactive robots able to follow a person to provide continual assistance in
office buildings, museums, hospital environments, or even shopping centers. Service robots
clearly need to move in ways that are socially suitable for people. Such a robot has to
localize its user, to discriminate him/her from other passers-by, and to be able to follow
him/her across complex human-centered environments. In this context, tracking a given
person in crowds from a mobile platform appears to be fundamental. However, numerous
difficulties arise: moving cameras with a limited field of view, cluttered backgrounds,
illumination variations, hard real-time constraints, and so on.
The literature offers many tools to overcome these difficulties. Our paper focuses on the
particle filtering framework, as it makes it easy to fuse heterogeneous data from embedded
sensors. Dedicated person detectors and their hardware counterparts provide measurements
that, although sporadic, are very discriminant when present.
The paper is organized as follows. Section 2 gives an overview of related work within our
robotic context and introduces our contributions. Section 3 describes our omnidirectional
RFID prototype; when a reading is available, this sensor is very discriminant for detecting
the user, who wears an RFID tag. Section 4 recalls some particle filter (PF) basics and
details our new importance function for multimodal person tracking. Section 5 details the
control strategy developed to achieve a person following task in a crowded environment,
while Section 6 presents the mobile robot used for our tests and the results obtained.
Finally, Section 7 summarizes our contributions and discusses future extensions.
2. Overview and related work
Particle filters (PF), through different schemes, are currently investigated for person
tracking in both the robotics and vision communities. Besides the well-known CONDENSATION
scheme, the fairly seldom exploited ICONDENSATION variant steers sampling towards state
space regions of high likelihood by incorporating both the dynamics and the measurements in
the importance function. PF represent the posterior distribution by a set of samples, or
particles, with associated importance weights. This weighted particle set is first drawn
from an importance function and the initial probability distribution of the state vector,
and is then updated over time by taking the measurement models into account.
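The sampling, weighting and resampling cycle described above can be illustrated with a minimal sketch. It is not the importance function developed in this paper (that one is detailed in Section 4); it is a simplified 1D illustration in the spirit of ICONDENSATION, under the following assumptions: the state is a scalar position, the dynamics are a Gaussian random walk, the likelihood is Gaussian, and an intermittent `detection` (a hypothetical stand-in for a person detector or an RFID reading) only shapes the proposal. All function names, parameters and default values (`pf_step`, `alpha`, `std_dyn`, `std_det`, `std_meas`) are illustrative choices, not taken from the paper.

```python
import math
import random


def gauss_pdf(x, mean, std):
    """Density of the 1D Gaussian N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def pf_step(particles, weights, measurement, detection, rng,
            alpha=0.5, std_dyn=0.5, std_det=0.3, std_meas=0.3):
    """One sampling / weighting / resampling step of a 1D particle filter.

    measurement: observation used in the likelihood p(z | x).
    detection:   intermittent position hint (None when absent); it only
                 shapes the importance function (assumed, illustrative cue).
    """
    # The detector component is switched off when no detection is available.
    a = alpha if detection is not None else 0.0
    new_particles, new_weights = [], []
    for x_prev, w_prev in zip(particles, weights):
        # Mixture importance function: with probability a sample around the
        # detection, otherwise follow the random-walk dynamics p(x | x_prev).
        if rng.random() < a:
            x = rng.gauss(detection, std_det)
        else:
            x = rng.gauss(x_prev, std_dyn)
        p_dyn = gauss_pdf(x, x_prev, std_dyn)
        # Density of the mixture proposal at the sampled point.
        q = (1.0 - a) * p_dyn
        if detection is not None:
            q += a * gauss_pdf(x, detection, std_det)
        likelihood = gauss_pdf(measurement, x, std_meas)
        # Importance weight: likelihood * dynamics / proposal density.
        new_particles.append(x)
        new_weights.append(w_prev * likelihood * p_dyn / q)
    total = sum(new_weights)
    n = len(new_particles)
    if total <= 0.0:
        # Degenerate case (all weights underflowed): keep particles, reset weights.
        return new_particles, [1.0 / n] * n
    new_weights = [w / total for w in new_weights]
    # Multinomial resampling; the resampled set carries uniform weights.
    resampled = rng.choices(new_particles, weights=new_weights, k=n)
    return resampled, [1.0 / n] * n


def estimate(particles, weights):
    """Weighted mean of the particle set (posterior mean estimate)."""
    return sum(w * x for x, w in zip(particles, weights))
```

When `detection` is `None`, the proposal reduces to the dynamics alone and the weight reduces to the likelihood, which corresponds to the CONDENSATION-style update. When a detection is present, part of the particle set is drawn near it, and the ratio of dynamics to proposal density corrects the weights so that the set still approximates the same posterior.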
Some approaches show that intermittent and discriminant
cues based on person detection and recognition functionalities
1077-3142/$ - see front matter © 2010 Elsevier Inc. All rights reserved.
* Corresponding author. Address: CNRS, LAAS, 7, Avenue du Colonel Roche,
F-31077 Toulouse, France.
E-mail addresses: email@example.com (T. Germa), firstname.lastname@example.org (F. Lerasle),
email@example.com (N. Ouadah), firstname.lastname@example.org (V. Cadenat).
Computer Vision and Image Understanding 114 (2010) 641–651