Received: 15 April 2016 Revised: 19 June 2017 Accepted: 18 September 2017
SPECIAL ISSUE PAPER
Resilient computing on ROS using adaptive fault tolerance
CNRS, LAAS, Toulouse, France
Univ de Toulouse, UPS, Toulouse, France
Univ de Toulouse, INP, Toulouse, France
Michael Lauer, LAAS-CNRS 7, avenue du
Colonel Roche BP 54200, 31031 Toulouse
cedex 4, France.
Computer-based systems are now expected to evolve during their service life to cope with
changes of various kinds, ranging from the evolution of user needs, eg, additional features requested
by users, to system configuration changes, eg, modifications in available hardware resources.
When considering resilient embedded systems that must comply with stringent dependability
requirements, the challenge is even greater, as evolution must not impair dependability attributes.
Maintaining dependability properties when facing changes is, indeed, the exact definition of resilient computing.
In this paper, we consider the evolution of systems with respect to their dependability mechanisms
and show how such mechanisms can evolve along with the system, in the case of
ROS, the robot operating system. We provide a synthesis of the concepts required for resilient
computing using a component-based approach. We particularly emphasize the process and the
techniques needed to implement an adaptation layer for fault tolerance mechanisms. In the light
of this analysis, we address the implementation of adaptive fault tolerance on ROS in 2 steps:
first, we provide an architecture to implement fault tolerance mechanisms in ROS, and second,
we describe the actual adaptation of fault tolerance mechanisms in ROS. Beyond the implementation
details given in the paper, we draw lessons from this work and discuss the limits
of this run-time support to implement adaptive fault tolerance features in embedded systems.
adaptive fault tolerance, resilience, ROS
Evolution during service life is very frequent in many systems nowadays, including dependable systems. Such an evolution leads to modifications of
the system software and hardware configuration. A challenge for the dependability community is to develop systems that remain dependable when
facing changes (new threats, changes in failure modes, and application updates). The persistence of dependability when facing changes, which defines the
resilience of the system, encompasses several aspects, among which evolvability is a key concept. Handling evolution involves new development
processes, such as agile development methods, but also run-time supports that enable modifications at run-time.
At run-time, dependability relies on fault-tolerant computing, ie, a collection of fault tolerance mechanisms (FTMs) attached to the application
according to its criticality level. In this context, one of the key challenges of resilient computing is the capacity to adapt the FTMs attached to an
application during its operational life.
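To make the notion of adapting an FTM during operational life concrete, the following is a minimal sketch under stated assumptions, not the architecture developed for ROS in this paper. All names (`AdaptiveService`, `adapt`, `retry_once`, `no_protection`) are hypothetical and introduced only for illustration. The application code is left untouched; only the mechanism wrapped around it is replaced at run-time.

```python
from typing import Callable

Service = Callable[[float], float]
Mechanism = Callable[[Service], Service]  # hypothetical FTM abstraction


def no_protection(service: Service) -> Service:
    """No fault tolerance: requests go straight to the application."""
    return service


def retry_once(service: Service) -> Service:
    """Re-execute the request once if the first attempt raises RuntimeError."""
    def wrapped(request: float) -> float:
        try:
            return service(request)
        except RuntimeError:
            return service(request)
    return wrapped


class AdaptiveService:
    """Application function plus a replaceable fault tolerance mechanism."""

    def __init__(self, service: Service, mechanism: Mechanism) -> None:
        self._service = service
        self._protected = mechanism(service)

    def adapt(self, mechanism: Mechanism) -> None:
        """Swap the attached mechanism without modifying the application."""
        self._protected = mechanism(self._service)

    def __call__(self, request: float) -> float:
        return self._protected(request)
```

In this sketch, a call such as `app.adapt(retry_once)` stands for the adaptation step: subsequent requests are handled by the new mechanism, while the application function itself is unchanged.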
In resilient systems, faults lead to failure modes that may violate dependability properties. The role of the safety analysis (eg, using fault tree
analysis or failure modes, effects, and criticality analysis) is to identify the failure modes and the fault model, and then to define the safety mechanisms that
prevent the violation of safety properties. Such safety mechanisms rely on basic error detection and recovery mechanisms, namely, fault tolerance
techniques, which are based on fault tolerance design patterns (FTDPs) that can be combined.
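The idea that design patterns can be combined may be illustrated as follows. This is a minimal sketch, assuming that each pattern can be modeled as a wrapper around a service function. The names `time_redundancy`, `assertion`, `compose`, and `DetectedError` are hypothetical and do not come from the paper or from ROS.

```python
from typing import Callable

Service = Callable[[int], int]
Pattern = Callable[[Service], Service]  # hypothetical FTDP abstraction


class DetectedError(Exception):
    """Raised when a mechanism detects an error it cannot mask."""


def time_redundancy(service: Service) -> Service:
    """Execute the request twice and compare the two results."""
    def wrapped(request: int) -> int:
        first = service(request)
        second = service(request)
        if first != second:
            raise DetectedError("results of the two executions differ")
        return first
    return wrapped


def assertion(check: Callable[[int, int], bool]) -> Pattern:
    """Validate each result against an application-level predicate."""
    def decorate(service: Service) -> Service:
        def wrapped(request: int) -> int:
            result = service(request)
            if not check(request, result):
                raise DetectedError("assertion on the result failed")
            return result
        return wrapped
    return decorate


def compose(service: Service, *patterns: Pattern) -> Service:
    """Apply patterns in order; the last one listed ends up outermost."""
    for pattern in patterns:
        service = pattern(service)
    return service


def double(x: int) -> int:
    """Stand-in for the application computation."""
    return 2 * x


# Time redundancy wraps the application; the assertion wraps both.
protected = compose(
    double,
    time_redundancy,
    assertion(lambda request, result: result == 2 * request),
)
```

Here each pattern addresses a different part of a fault model (duplicate execution for transient value faults, an assertion for results that violate an application-level property), and the composition order determines which check observes the result first.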
During the operational life of the system, several situations may occur. For example, new threats may require revising the fault model (electromagnetic
perturbations, obsolescence of hardware components, software aging, etc). A revision of the fault model naturally has an impact on the FTMs.
In other words, the validity of the FTMs or the safety mechanisms depends on the representativeness of the fault model. A poorly
identified fault model may lead, first, to paying for useless mechanisms in normal operation and, second, to observing a very low coverage of
J Softw Evol Proc. 2018;30:e1917. wileyonlinelibrary.com/journal/smr Copyright © 2017 John Wiley & Sons, Ltd. 1of14