Impact FD: An Unreliable Failure Detector Based on Process Relevance and Confidence in the System

Abstract

This paper presents a new unreliable failure detector, called the Impact failure detector (FD), that, contrary to the majority of traditional FDs, outputs a trust level value which expresses the degree of confidence in the system. An impact factor is assigned to each process and the trust level is equal to the sum of the impact factors of the processes not suspected of failure. Moreover, a threshold parameter defines a lower bound value for the trust level, at or above which the confidence in the system is ensured. In particular, we define a flexibility property that denotes the capacity of the Impact FD to tolerate a certain margin of failures or false suspicions, i.e. its capacity of considering different sets of responses that lead the system to trusted states. The Impact FD is suitable for systems that present node redundancy, heterogeneity of nodes, a clustering feature and allow a margin of failures which does not degrade the confidence in the system. The paper also includes a timer-based distributed algorithm which implements an Impact FD, as well as its proof of correctness, for systems whose links are lossy asynchronous or for those in which all (or some) links are eventually timely. Performance evaluation results, based on PlanetLab (http://www.planet-lab.org, accessed 16 September 2016) traces, confirm the degree of flexible applicability of our FD and that, due to the accepted margin of failure, both failures and false suspicions are more tolerated when compared to traditional unreliable FDs.

1. INTRODUCTION

In distributed systems, failures can occur, and detecting them is a crucial task in the design of fault-tolerant distributed systems or applications. On the other hand, in asynchronous systems (AS), there are no bounds on message transmission delays or on process speeds.
Therefore, detection of crashed processes is particularly difficult in those systems since it is impossible to know whether a process has really failed or whether it and/or the network communication are just slow. Due to this lack of delay bounds, it is well known that the consensus problem cannot be solved deterministically in an AS subject to even a single crash failure [1]. To circumvent such an impossibility and give support to the development of fault-tolerant distributed systems, Chandra and Toueg proposed in [2] the unreliable failure detector (FD) abstraction. An unreliable FD can be seen as an oracle that gives (not always correct) information about process failures. Many current FDs are based on a binary model, in which monitored processes are either ‘trusted’ or ‘suspected’. Thus, most existing FDs, such as those defined in [2, 3], output the set of processes that are currently suspected of having crashed. According to the type and the quality of this information, several FD classes have been proposed. This paper presents a new unreliable FD, denoted the Impact FD. A preliminary proposal of it was presented in [4]. Contrary to the majority of existing unreliable FDs, the Impact FD provides an output that expresses the trust of the FD with regard to the system (or set of processes) as a whole and not to each process individually. A system is considered ‘trusted’ if it behaves correctly for a specific purpose even in the face of failures, i.e. the system is able to maintain its normal functionality. The design of the Impact FD was inspired by systems that have the following features: (1) applications that execute on them are interested in information about the reliability of the system as a whole and can tolerate a certain margin of failures.
The latter may vary depending on the environment, situation or context, such as systems that provide redundancy of software/hardware; (2) systems that organize nodes with some common characteristic in groups; (3) systems where the nodes can have different importance (relevance) or roles and, thus, their failures may have distinct impact on the system. Systems that present node redundancy, heterogeneity of nodes, a clustering feature and allow a margin of failures which does not degrade the confidence in the system can, thus, benefit from the Impact FD and its configuration choices. They have motivated our work. Section 2 gives some examples of such systems and the advantages, in these cases, of using the Impact FD instead of traditional FDs. The Impact FD outputs a trust level related to a given set of processes S of the monitored system. We thus denote by FD ($I_p^S$) the Impact FD module of process p that monitors the processes of S. When invoked in p, the Impact FD ($I_p^S$) returns the trust_level value which expresses the confidence that p has in set S. To this end, an impact factor, defined by the user, is assigned to each process of S and the trust_level is equal to the sum of the impact factors of the trusted nodes, i.e. those not suspected of failure by p. Furthermore, a threshold parameter defines a lower bound for the trust level, at or above which the confidence in S is ensured. Hence, by comparing the trust_level with the threshold, it is possible to determine whether S is currently ‘trusted’ or ‘untrusted’ by p. The impact factor indicates the relative importance of the process in the set S, while the threshold offers a degree of flexibility for failures and false suspicions, thus allowing a higher tolerance in case of instability in the system. For instance, in an unstable network, although there might be many false suspicions, depending on the value assigned to the threshold, the system might remain trustworthy [5].
We should also point out that the Impact FD configuration allows nodes of S to be grouped into subsets and threshold values can be defined for each of these subsets. In addition, similarly to traditional FDs, several classes of Impact FDs can be defined depending on their capability of suspecting faulty processes (completeness property) and of not suspecting correct processes (accuracy property). Arguing that traditional approaches which assume a maximum number of failures f may lead to suboptimal solutions, such as in replication protocols where the number of replicas depends on f, Junqueira et al. proposed in [6] the survivor set approach, i.e. the unique collection of minimal sets of correct processes over all executions, each set containing all correct processes of some execution. The principle of the Impact FD also follows the authors’ argument: the threshold expresses a certain margin of failures or false suspicions, and the number of failures tolerated by the system is not necessarily fixed but depends on sets of correct processes, their respective impact factors and threshold values. Therefore, the Impact FD presents what we call the flexibility property. The latter expresses its capacity of considering different sets of responses that lead S to trusted states. In this context, we also define in this work two properties, $PR(IT)_p^S$ and $PR(\diamond IT)_p^S$, which characterize the minimum necessary stability condition of S that ensures p's confidence (or eventual confidence) in S. In other words, if $PR(IT)_p^S$ (resp., $PR(\diamond IT)_p^S$) holds, the system S is always (resp., eventually always) trusted by the monitor process p. Note that the Impact FD threshold/impact factor approach is strictly more powerful than the maximum number of failures f approach since the latter can be expressed with the former but not the other way around. We also present in this paper a timer-based distributed algorithm (and its proof of correctness) which implements an Impact FD.
It uses the algorithm proposed in [7] to estimate heartbeat message arrivals from monitored processes. The implementation can be applied to systems whose links are lossy asynchronous or to those in which all (or some) links eventually have a bounded synchronous behavior (♢-timely) [5]. Then, based on real trace files collected from nodes of PlanetLab [8], we conducted extensive experiments in order to evaluate the Impact FD. These trace files contain a large amount of data related to the sending and reception of heartbeat messages, including periods of link and message instability, and therefore characterize distributed systems that use heartbeat-based FDs. The testbed of the experiments comprises various configurations with different threshold values, impact factors of nodes and types of links. For the sake of evaluation, we used three of the QoS metrics proposed in [7]: detection time, average mistake rate, and query accuracy probability. The Impact FD implementation was also compared to a traditional timer-based FD that outputs information about failure suspicions of each monitored process. Performance evaluation results confirm the flexible applicability of the Impact FD, show that both failures and false suspicions are tolerated better than in traditional FDs, and show that the former presents better QoS than the latter if the application is interested in the degree of confidence in the system (trust level) as a whole. The rest of this paper is structured as follows. Section 2 describes some distributed systems for which the Impact FD is suitable. Section 3 outlines some basic concepts of unreliable FDs and Section 4 describes our system models. Section 5 presents the Impact FD, its characteristics and some of its properties while, in Section 6, we propose a timer-based algorithm that implements the Impact FD considering different systems, defined by the type of their links. The section also includes the proof of correctness of the algorithm.
Section 7 presents a set of evaluation results obtained from experiments conducted with real traces on PlanetLab [8]. Section 8 discusses some existing related work. Finally, Section 9 concludes the paper and outlines some of our future research directions. 2. MOTIVATION SCENARIOS Our proposed approach can be applied to different distributed scenarios and is flexible enough to meet different needs. It is quite suitable for environments where there is node redundancy or nodes with different capabilities. We should point out that both the impact factor and the threshold render the estimation of the confidence of S more flexible. Hence, there might be a situation where some processes in S are faulty or suspected of being faulty but S is still considered to be trusted. Furthermore, the Impact FD can easily be configured and adapted to the needs of the application or system requirements. For instance, the application may require a stricter monitoring of nodes during the night than during the day. For this kind of adaptation, it is only necessary to adjust the threshold. The following examples show some scenarios to which the Impact FD can be applied: Scenario 1: Ubiquitous Wireless Sensor Networks (WSNs) are usually deployed to monitor physical conditions in various places such as geographical regions, agriculture lands and battlefields. In WSNs, there is a wide range of sensor nodes with different battery resources and communication or computation capabilities [9]. However, these sensors are prone to failures (e.g. battery failure, process failure, transceiver failure, etc.) [10]. Hence, it is necessary to provide failure detection and adaptation strategies to ensure that the failure of sensor nodes does not affect the overall task of the network. The redundant use of sensor nodes, reorganization of the sensor network and overlapping sensing regions are some of the techniques used to increase the fault tolerance and reliability of the network [11]. 
As an example, consider a ubiquitous WSN that is used to collect environmental data from within a vineyard and is divided into management zones in accordance with different characteristics (e.g. soil properties). Each zone comprises sensors of different types (e.g. humidity control, temperature control, etc.) and the density of the sensors depends on the characteristics of each zone. That is, the number of sensors can be different for each type of sensor within a given zone. Furthermore, the redundancy of the sensors ensures both area coverage and connectivity in case of failure. Each management zone can thus be viewed as a single set which has sensors of the same type grouped into subsets. This grouping approach allows a threshold to be defined as being equal to the minimum number of sensors that each subset must have to keep the connectivity and the application functioning all the time. Moreover, in some situations, there might be a need to dynamically reconfigure the density of the zones. In this case, the threshold value would change.

Scenario 2: In large-scale WSN environments, grouping sensor nodes into clusters has been widely adopted with the aim of improving overall system scalability and reducing the consumption of resources such as battery power and bandwidth. Each $cluster_i$ is composed of a node, denoted cluster head (CH), which performs special tasks (e.g. routing, fusion, aggregation of messages, etc.) and several other sensor nodes (SN). The latter periodically transmit their data to the corresponding CH node, which aggregates and transmits them to the base station (BS) either directly or through intermediate communication with other CH nodes. In this scenario, the concept of Impact FD can be applied considering each $cluster_i$ as a subset of the system S whose size is initially $n_i$.
When defining the impact factor for the processes of $cluster_i$, two issues should be considered: (i) the failure of the CH, which implies that the cluster is inaccessible, therefore compromising the network connectivity and leading to untrusted states of S; (ii) when the number of alive SNs drops below a threshold, additional resources must be deployed to replenish the system and maintain its population density. Taking these constraints into account, we could assign an impact factor of 1 to each SN, an impact factor of $n_i$ to the CH of $cluster_i$, and a threshold for this cluster of $threshold_i = n_i + (n_i - f_i)$, where $f_i$ is the maximum number of SN failures in $cluster_i$. Thus, when either the CH fails or more than $f_i$ SNs fail, the trust level will be below the threshold and the BS must be warned so that it can take some decision.

Scenario 3: A third example might be a system consisting of a main server that offers a certain quality of service X (bandwidth, response time, etc.). If it fails, N backup servers can replace it, since each backup offers the same service but with a quality of service of X/N. In this scenario, both the impact factor of the main server and the threshold would have the value $N \cdot I_{back}$, where $I_{back}$ is the impact factor of each backup server, i.e. the system becomes unreliable whenever both the primary server and one or more of the N backup servers fail (or are suspected of being faulty).

The Impact FD can be applied to all the above scenarios, which have the following features: (i) the grouping of nodes that have some common characteristics into subgroups (subsets); (ii) the possibility of having nodes with different levels of relevance and (iii) the flexibility of some systems in being able to tolerate a margin of failure.

3. UNRELIABLE FDS

Proposed by Chandra and Toueg in [2], an unreliable FD can be seen as an oracle that gives (not always correct) information about process failures (either trusted or suspected). It usually provides a list of processes suspected of having crashed.
According to [12], unreliable FDs are so named because they can make mistakes (i) by erroneously suspecting a correct process (false suspicion) or (ii) by not suspecting a process that has actually crashed. If the FD detects its mistake later, it corrects it. For instance, an FD can stop suspecting, at time t + 1, a process that it suspected at time t. Although an unreliable FD cannot accurately determine the real state of processes, its use increases knowledge about them and encapsulates the uncertainty of the communication delay between two processes [2]. Unreliable FDs are usually characterized by two properties: completeness and accuracy, as defined in [2]. Completeness characterizes the FD’s capability of suspecting faulty processes, while accuracy characterizes the FD’s capability of not suspecting correct processes, i.e. it restricts the mistakes that the FD can make. FDs are then classified according to two completeness properties and four accuracy properties [2]. The combination of these properties yields eight classes of FDs. This approach allows the design of fault-tolerant applications and the proof of their correctness based only on these properties, without having to address, for example, low-level network parameters. In this work, we are particularly interested in the following completeness and accuracy properties:

Strong completeness: Eventually every process that crashes is permanently suspected by every correct process.

Weak completeness: Eventually every process that crashes is permanently suspected by some correct process.

Eventual strong accuracy: There is a time after which correct processes are not suspected by any correct process.

Eventual weak accuracy: There is a time after which some correct process is never suspected by any correct process.
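For reference, these four properties can also be written formally. The block below is only a sketch: it assumes the failure pattern F and the sets correct(F) and faulty(F) as defined in Section 4, and it writes H(p, t) for the set of processes suspected by p at time t. The symbol H is notation borrowed from the failure detector literature and introduced here for illustration; it is not used elsewhere in this paper.

```latex
\begin{align*}
\text{Strong completeness:} \quad
  & \exists t \in T,\ \forall q \in faulty(F),\ \forall p \in correct(F),\ \forall t' \ge t:\ q \in H(p, t') \\
\text{Weak completeness:} \quad
  & \exists t \in T,\ \forall q \in faulty(F),\ \exists p \in correct(F),\ \forall t' \ge t:\ q \in H(p, t') \\
\text{Eventual strong accuracy:} \quad
  & \exists t \in T,\ \forall t' \ge t,\ \forall p, q \in correct(F):\ q \notin H(p, t') \\
\text{Eventual weak accuracy:} \quad
  & \exists t \in T,\ \exists q \in correct(F),\ \forall t' \ge t,\ \forall p \in correct(F):\ q \notin H(p, t')
\end{align*}
```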
The class of the eventually perfect ♢P (resp., eventually strong ♢S) FDs satisfies the strong completeness and the eventual strong (resp., eventual weak) accuracy properties; the class of eventually weak FDs (♢W) satisfies the weak completeness and the eventual weak accuracy properties. ♢W is the weakest class allowing to solve consensus in an asynchronous distributed system with the additional assumption that a majority of processes are correct. Note that the type of accuracy depends on the synchrony or stability of the network. For instance, an algorithm that provides eventual accuracy (strong or weak) may rely on partially synchronous systems which eventually ensure a bound for message transmission delays and processes speed. From Chandra and Toueg’s work, numerous other FD implementations and classes have been proposed in the literature. They usually differ in the system assumptions such as synchronous model, type of node (identifiable, anonymous [13], homonymous [14]), type of link [5, 15, 16] (lossy asynchronous, reliable, timely, eventually timely, etc.), behavior properties [5, 17]; type of network (static [3, 15], dynamic [18, 19]), etc. They can also have different implementation choices (timer-based [7, 20], message pattern [17]) and performance or quality of service (QoS) requirements [7]. The type of problem can also define the properties of the FD (mutual exclusion [21], k-set agreement [22], register implementation [23], etc.). 3.1. Implementation of FDs The literature has several proposals for implementing unreliable FDs which usually exploit either a timer-based or a message-pattern approach. In the timer-based strategy, FD implementations make use of timers to detect failures in processes. There exist two mechanisms that can be used to implement the timer-based strategy: heartbeat and pinging. In the heartbeat mechanism every process q periodically sends a control message (‘I am alive’ message) to process p that is responsible for monitoring q. 
If p does not receive such a message from q after the expiration of a timer, it adds q to its list of suspected processes. If p later receives an ‘I am alive’ message from q, p then removes q from its list of suspected processes. An alternative approach uses the pinging mechanism, in which each process p periodically sends a query message ‘Are you alive?’ to another process q. Upon reception of such a message, the monitored process replies with an ‘I am alive’ message. If process p times out on process q, it adds q to its list of suspected processes. If p later receives an ‘I am alive’ message from q, p then removes q from its list of suspected processes. The heartbeat strategy has advantages over pinging since the former sends half of the messages that pinging detectors send to provide the same detection quality. Furthermore, a heartbeat detector estimates only the transmission delay of ‘I am alive’ messages, whereas the pinging detector must estimate the transmission delay of ‘Are you alive?’ messages, the reaction delay, and the transmission delay of ‘I am alive’ messages. The message-pattern strategy does not use any timeout mechanism. In [17], the authors propose an implementation that uses a request-response mechanism. A process p sends a QUERY message to the n nodes that it monitors and then waits for responses (RESPONSE messages) from α processes (α ≤ n, traditionally α = n − f, where f is the maximum number of failures). A query issued by p ends when it has received α responses. The other responses, if any, are discarded and the respective processes are suspected of having failed. A process sends QUERY messages repeatedly if it has not failed. If, on the next request-response, p receives a response from a suspected process q, then p removes q from its list of suspects. This approach considers the relative order in which messages are received, which always (or after a time) allows some nodes to communicate faster than the others. 4.
SYSTEM MODELS We consider a distributed system which consists of a finite set of processes Π = {q1,…,qn} with |Π| = n, (n ≥ 2) and that there is one process per node, site, or sensor. Therefore, the word process can mean a node, a sensor, or a site. Each process is uniquely identified (id | 1 ≤ id ≤ n) and identifiers are totally and consecutively ordered. Processes can fail by crashing and they do not recover. A process is considered correct if it does not fail during the whole execution. We consider the existence of some global time denoted T. A failure pattern is a function F:T → 2Π, where F(t) is the set of processes that have failed before or at time t. The function correct(F) denotes the set of correct processes, i.e. those that have never belonged to a failure pattern (F), while function faulty(F) denotes the set of faulty processes, i.e. the complement of correct(F) with respect to Π. A process p ∈ Π monitors a set S of processes of Π. We note correct(FS) = correct(F) ∩ S and faulty(FS) = faulty(F) ∩ S. Every process in S is connected to p by a communication link and sends messages to it through this link. Notice that other links among processes of S can exist. Process synchrony: We consider that each process has a local clock that can accurately measure intervals of time, but the clocks of the processes are not synchronized. Processes are synchronous, i.e. there is an upper bound on the time required to execute an instruction. For simplicity, and without loss of generality, we assume that local processing time is negligible with respect to message communication delays. Links and type of systems: For the current implementation, we consider that links are directed (either unidirectional or bidirectional) and there exists a link from q (∀q∈S) to p. Every link between p and q satisfies the following integrity property: p receives a message m from q at most once, only if q previously sent m to p. 
In other words, communication links cannot create or alter messages. Links are not assumed to be FIFO. Concerning loss property and link synchrony, we consider the following types of links as defined in [5]:

Lossy asynchronous: A link that satisfies the integrity property and for which there exists no bound on message delay. Note that, in this case, a message m sent over the link can be lost. However, if m is not lost, it is eventually received at its destination.

(Typed) fair lossy: Assuming that each message has a type, a link is fair lossy if, for every type of which infinitely many messages are sent, infinitely many messages of that type are received (if the receiver process is correct).

♢-timely: A link that satisfies the integrity property and the following ♢-timeliness property: there exists δ and a time t such that if q sends a message m to p at time t′ ≥ t and p is correct, then p receives m from q by time t′ + δ. The maximum message delay δ and the time t are not known. Note that messages sent before time t can be lost.

We then define the following types of system:

AS: denotes a lossy AS with lossy asynchronous links;

F-AS: denotes a fair lossy AS with fair lossy links;

W-ET: denotes a weak eventually timely system: a system where some links are ♢-timely while the others are lossy asynchronous;

S-ET: denotes a strong eventually timely system: a system where all links are ♢-timely;

S-ET-Π: A system which is an S-ET system such that p ∈ S, S = Π, every pair of processes in S is connected either by a pair of directed links (with opposite directions) or by bidirectional links, and all processes of Π execute the Impact FD algorithms.

W-ET-Π: A system which is a W-ET system such that p ∈ S, S = Π, every pair of processes in S is connected either by a pair of directed links (with opposite directions) or by bidirectional links, and all processes of Π execute the Impact FD algorithms.
Moreover, there exists a correct process q1 in Π such that, for every process q2 in Π, q1 ≠ q2, q1 is connected to q2 by a ♢-timely link (similarly to the definition of ♢-source of [16]). Note that an S-ET is also a W-ET, and an S-ET-Π (resp., W-ET-Π) is also an S-ET (resp., W-ET). Our Impact FD implementation can be applied to all of these systems. Figure 1 shows three types of system. The first one (i) is an AS system where all links are lossy asynchronous, while system (ii) shows a W-ET where some links are ♢-timely and others are lossy asynchronous. Finally, the last one (iii) is a W-ET-Π where site q1 is a ♢-source.

Figure 1. Examples of system types.

5. IMPACT FD

The Impact FD can be defined as an unreliable FD that provides an output related to the trust level with regard to a set of processes. If the trust level provided by the detector is equal to, or greater than, a given threshold value, defined by the user, the confidence in the set of processes is ensured. We can thus say that the system is trusted. We denote by FD ($I_p^S$) the Impact FD module of process p, where S is a set of processes of Π. When invoked in p, the Impact FD ($I_p^S$) returns the $trust\_level_p^S$ value which expresses the confidence that p has in set S.

5.1. Impact factor and subsets

Each process q ∈ S has an impact factor $I_q$ ($I_q \in \mathbb{R}$, $I_q > 0$). Furthermore, set S can be partitioned into m disjoint subsets ($S = \{S_1, S_2, \ldots, S_m\}$). Notice that the grouping feature of the Impact FD allows the processes of S to be partitioned into disjoint subsets, in accordance with a particular criterion. For instance, in a scenario where there are different types of sensors, those of the same type can be gathered in the same subset.
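As a minimal sketch of such a configuration, and not part of the algorithm proposed in this paper, the subsets and impact factors could be held in a list of dictionaries. The names `Subset` and `validate_partition` and the sensor identifiers are hypothetical; the check enforces only what is stated above: positive impact factors and disjoint subsets that together contain exactly the processes of S.

```python
from typing import Dict, List, Set

# Hypothetical representation: one dict per subset, mapping a process
# identifier to its impact factor.
Subset = Dict[str, float]


def validate_partition(subsets: List[Subset], s: Set[str]) -> None:
    """Raise ValueError unless `subsets` partitions `s` with positive impact factors."""
    seen: Set[str] = set()
    for subset in subsets:
        for pid, impact in subset.items():
            if impact <= 0:
                raise ValueError(f"impact factor of {pid} must be > 0")
            if pid in seen:
                raise ValueError(f"{pid} appears in more than one subset")
            seen.add(pid)
    if seen != s:
        raise ValueError("the subsets do not contain exactly the processes of S")


# Illustrative configuration (made-up identifiers): sensors of the same
# type are gathered in the same subset.
humidity = {"h1": 1.0, "h2": 1.0, "h3": 1.0}
temperature = {"t1": 2.0, "t2": 2.0}
validate_partition([humidity, temperature], {"h1", "h2", "h3", "t1", "t2"})
```

A list of dictionaries keeps the subset boundaries explicit, which matters later because a threshold is attached to each subset rather than to S as a whole.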
Let then $S^* = \{S_1^*, S_2^*, \ldots, S_m^*\}$ be the set S partitioned into m disjoint subsets, where each $S_i^*$ is a set whose elements are tuples of the form ⟨id, I⟩, where id is a process identifier and I is the value of the impact factor of the process in question. $S^*$ is a set such that

$$\forall i, j,\ i \neq j:\ S_i^* \cap S_j^* = \emptyset \quad \text{and} \quad \bigcup_{1 \le i \le m} \{q \mid \langle q, \_ \rangle \in S_i^*\} = S.$$

5.2. Trust level

We denote by $trusted_p^S(t)$ the set of processes of S that are not considered faulty by p at t ∈ T. The trust level at t ∈ T of process p ∉ F(t) in relation to S is denoted $trust\_level_p^{S^*}$. We then have $trust\_level_p^{S^*}(t) = Trust\_level(trusted_p^S(t), S^*)$, where the function $Trust\_level(trusted_p^S(t), S^*)$ returns, for each subset $S_i^*$, the sum of the impact factors of the elements $\langle id_q, I_q \rangle$ of $S_i^*$ such that $id_q \in trusted$:

$$Trust\_level(trusted, S^*) = \Big\{ trust\_level_i \;\Big|\; trust\_level_i = \sum_{j \in (trusted \cap S_i)} I_j,\ 1 \le i \le |S^*| \Big\}.$$

In other words, $trust\_level_p^{S^*}$ is a set that contains the trust level of each subset of $S^*$, expressing the confidence that p has in the processes of S. Note that if all processes of $S_i^*$ have failed, $trust\_level_i = 0$.

5.3. Margin of failures

An acceptable margin of failures, denoted $threshold^{S^*}$, characterizes the acceptable degree of failure flexibility in relation to set $S^*$. The $threshold^{S^*}$ is adjusted to the minimum trust level required for each subset, i.e. it is defined as a set which contains the respective threshold of each subset of $S^*$: $threshold^{S^*} = \{threshold_1, \ldots, threshold_m\}$. The $threshold^{S^*}$ is used by p to check the confidence in the processes of S. If, for each subset of $S^*$, $trust\_level_i(t) \ge threshold_i$, S is considered to be trusted at t by p, i.e. the confidence of p in S has not been jeopardized; otherwise S is considered untrusted by p at t. Three points should be highlighted: (i) both the impact factor and $threshold^{S^*}$ render the estimation of the confidence in S flexible.
For instance, it is possible that some processes in S might be faulty or suspected of being faulty but S is still trusted; (ii) the Impact FD can be easily configured to adapt to the needs of the environment; (3) the thresholdS* can be tuned to provide a more restricted or softer monitoring. Note that the Impact FD can also be applied when the application needs individual information about each process of S. In this case, each process must be defined as a different subset of S*. 5.4. Examples Table 1 shows several examples of sets and their respective thresholds. In the first example (i) there is just one subset with three processes. Each process has impact factor equal to 1 and the threshold defines that the sum of impact factor of nonfaulty processes must be at least equals to 2, i.e. the system is considered trusted whenever there are two or more correct processes. Example (ii) shows a configuration where processes must be monitored individually. Each process is the only element of a subset and the threshold defines that if any of the processes fails, the system is not trusted anymore. In the third example (iii), S has two sets with three processes each. The threshold requires at least two correct processes in each subset. The last example (iv) has a single subset with five processes with different impact factors. The threshold defines that the set is trusted whenever the sum of impact factor of correct processes is at least equal to seven. Table 1. Examples of sets and threshold. S* ThresholdS* a {{⟨q1,1⟩,⟨q2,1⟩,⟨q3,1⟩}} {2} b {{⟨q1,1⟩},{⟨q2,1⟩},{⟨q3,1⟩}} {1,1,1} c {{⟨q1,1⟩,⟨q2,1⟩,⟨q3,1⟩}, {⟨q4,2⟩,⟨q5,2⟩,⟨q6,2⟩}} {2,4} d {{⟨q1,1⟩,⟨q2,1⟩,⟨q3,1⟩, ⟨q4,5⟩,⟨q5,5⟩}} {7} S* ThresholdS* a {{⟨q1,1⟩,⟨q2,1⟩,⟨q3,1⟩}} {2} b {{⟨q1,1⟩},{⟨q2,1⟩},{⟨q3,1⟩}} {1,1,1} c {{⟨q1,1⟩,⟨q2,1⟩,⟨q3,1⟩}, {⟨q4,2⟩,⟨q5,2⟩,⟨q6,2⟩}} {2,4} d {{⟨q1,1⟩,⟨q2,1⟩,⟨q3,1⟩, ⟨q4,5⟩,⟨q5,5⟩}} {7} Table 1. Examples of sets and threshold. 
S* ThresholdS* a {{⟨q1,1⟩,⟨q2,1⟩,⟨q3,1⟩}} {2} b {{⟨q1,1⟩},{⟨q2,1⟩},{⟨q3,1⟩}} {1,1,1} c {{⟨q1,1⟩,⟨q2,1⟩,⟨q3,1⟩}, {⟨q4,2⟩,⟨q5,2⟩,⟨q6,2⟩}} {2,4} d {{⟨q1,1⟩,⟨q2,1⟩,⟨q3,1⟩, ⟨q4,5⟩,⟨q5,5⟩}} {7} S* ThresholdS* a {{⟨q1,1⟩,⟨q2,1⟩,⟨q3,1⟩}} {2} b {{⟨q1,1⟩},{⟨q2,1⟩},{⟨q3,1⟩}} {1,1,1} c {{⟨q1,1⟩,⟨q2,1⟩,⟨q3,1⟩}, {⟨q4,2⟩,⟨q5,2⟩,⟨q6,2⟩}} {2,4} d {{⟨q1,1⟩,⟨q2,1⟩,⟨q3,1⟩, ⟨q4,5⟩,⟨q5,5⟩}} {7} In Table 2, we consider a set S* composed by three subsets: S1*, S2*, and S3* (S* = {{⟨q1,1⟩,⟨q2,1⟩,⟨q3,1⟩}, {⟨q4,2⟩,⟨q5,2⟩,⟨q6,2⟩}, {⟨q7,3⟩,⟨q8,3⟩,⟨q9,3⟩}}). The values of thresholdS*={1,4,6} define that the subset S1* (resp., S2* and S3*) must have at least one (resp., two) correct process. The table shows several possible outputs for FD ( IpS) depending of process failures: the set S* is considered trusted at t if, for each subset Si*, trust_leveli(t) ≥ thresholdi. Table 2. Example of FD ( IpS) output: S* has three subsets. t F(t) trustedpS(t) trust_levelpS*(t) Status at t 1 {q2} {q1,q3,q4,q5,q6,q7,q8,q9} {2,6,9} Trusted 2 {q1,q2,q5} {q3,q4,q6,q7,q8,q9} {1,4,9} Trusted 3 {q1,q2,q5,q6} {q3,q4, q7,q8,q9} {1,2,9} Untrusted t F(t) trustedpS(t) trust_levelpS*(t) Status at t 1 {q2} {q1,q3,q4,q5,q6,q7,q8,q9} {2,6,9} Trusted 2 {q1,q2,q5} {q3,q4,q6,q7,q8,q9} {1,4,9} Trusted 3 {q1,q2,q5,q6} {q3,q4, q7,q8,q9} {1,2,9} Untrusted S* = {{⟨q1,1⟩,⟨q2,1⟩,⟨q3,1⟩},{⟨q4,2⟩,⟨q5,2⟩⟨q6,2⟩}, {⟨q7,3⟩,⟨q8,3⟩,⟨q9,3⟩}}. thresholdS*={1,4,6}. Table 2. Example of FD ( IpS) output: S* has three subsets. t F(t) trustedpS(t) trust_levelpS*(t) Status at t 1 {q2} {q1,q3,q4,q5,q6,q7,q8,q9} {2,6,9} Trusted 2 {q1,q2,q5} {q3,q4,q6,q7,q8,q9} {1,4,9} Trusted 3 {q1,q2,q5,q6} {q3,q4, q7,q8,q9} {1,2,9} Untrusted t F(t) trustedpS(t) trust_levelpS*(t) Status at t 1 {q2} {q1,q3,q4,q5,q6,q7,q8,q9} {2,6,9} Trusted 2 {q1,q2,q5} {q3,q4,q6,q7,q8,q9} {1,4,9} Trusted 3 {q1,q2,q5,q6} {q3,q4, q7,q8,q9} {1,2,9} Untrusted S* = {{⟨q1,1⟩,⟨q2,1⟩,⟨q3,1⟩},{⟨q4,2⟩,⟨q5,2⟩⟨q6,2⟩}, {⟨q7,3⟩,⟨q8,3⟩,⟨q9,3⟩}}. thresholdS*={1,4,6}. 5.5. 
Flexibility of the Impact FD

The flexibility of the Impact FD characterizes its capability of accepting different sets of responses that lead to a trusted state of S. We define P^S as the set that contains all possible subsets of processes which satisfy a defined threshold:

P^S = ×_{i=1..m} PowerSet(S_i*, threshold_i^{S*})

where × corresponds to the Cartesian product of several sets. Initially, the PowerSet function generates the power set of each subset (S_i*) of S*. Then, only the subsets of S_i* whose impact factors sum to at least threshold_i are selected. That is, the output is the collection of possible trusted sets that satisfy the threshold for each subset S_i*. Following this, the Cartesian product is applied to generate all possible combinations, i.e. all the generated subsets of processes satisfy threshold^{S*}.

Let's consider the following example:

S* = {{⟨q1,1⟩,⟨q2,1⟩},{⟨q3,1⟩,⟨q4,1⟩},{⟨q5,1⟩,⟨q6,1⟩}}
threshold^{S*} = {1,1,1}
P^S = PowerSet(S*, threshold^{S*})
PowerSet(S_1*, threshold_1) = {{q1},{q2},{q1,q2}}
PowerSet(S_2*, threshold_2) = {{q3},{q4},{q3,q4}}
PowerSet(S_3*, threshold_3) = {{q5},{q6},{q5,q6}}
P^S = PowerSet(S_1*, threshold_1) × PowerSet(S_2*, threshold_2) × PowerSet(S_3*, threshold_3)
P^S = {{q1,q3,q5},{q1,q3,q6},{q1,q3,q5,q6},{q1,q4,q5},{q1,q4,q6},{q1,q4,q5,q6},{q1,q3,q4,q5},{q1,q3,q4,q6},{q1,q3,q4,q5,q6},…}

For instance, if trusted_p^S(t1) = {q1,q3,q5} and trusted_p^S(t2) = {q1,q3,q4,q6}, then both trusted_p^S(t1) and trusted_p^S(t2) belong to P^S and, therefore, p considers that the system S is trusted at both t1 and t2.

We now define two properties, PR(IT)_p^S and PR(⋄IT)_p^S, that characterize the stability condition that ensures the confidence (or eventual confidence) of p in S.

Impact threshold property, PR(IT)_p^S: For a FD of a correct process p, the set trusted_p^S is always an element of P^S.

PR(IT)_p^S ≡ p ∈ correct(F), ∀t ≥ 0, trusted_p^S(t) ∈ P^S

Eventual impact threshold property, PR(⋄IT)_p^S: For a FD of a correct process p, there is a time after which the set trusted_p^S is always an element of P^S.
PR(⋄IT)_p^S ≡ ∃t ∈ T, p ∈ correct(F), ∀t′ ≥ t, trusted_p^S(t′) ∈ P^S

If PR(IT)_p^S (resp., PR(⋄IT)_p^S) holds, the system S is always (resp., eventually always) trusted by p.

5.6. Classes of Impact FD

Similarly to the completeness and accuracy properties defined in [2] (see Section 3), we define the following properties for the Impact FD:

Impact completeness_p^S: For a FD of a correct process p, there is a time after which p does not trust any crashed process of S;
∃t ∈ T, p ∈ correct(F), ∀q ∈ faulty(FS): ∀t′ ≥ t, q ∉ trusted_p^S(t′)

Impact weak completeness_p^S: There is a time after which some correct process p does not trust any crashed process of S;
∃t ∈ T, ∃p ∈ correct(F), ∀q ∈ faulty(FS): ∀t′ ≥ t, q ∉ trusted_p^S(t′)

Eventual impact strong accuracy_p^S: For a FD of a correct process p, there is a time after which all correct processes of S belong to trusted_p^S;
∃t ∈ T, ∀t′ ≥ t, p ∈ correct(F), ∀q ∈ correct(FS): q ∈ trusted_p^S(t′)

Eventual impact weak accuracy_p^S: There is a time after which some correct process of S is trusted by every correct process.
∃t ∈ T, ∀t′ ≥ t, ∀p ∈ correct(F), ∃q ∈ correct(FS): q ∈ trusted_p^S(t′)

Let us consider that p ∈ S and S = Π. We can then define some classes of Impact FD, similarly to those defined in [2] and [23]:

♢IP (eventually perfect impact class): For S = Π, ∀p ∈ correct(F), the impact completeness_p^S and eventual impact strong accuracy_p^S properties are satisfied;

♢IS (eventually strong impact class): For S = Π, ∀p ∈ correct(F), the impact completeness_p^S and eventual impact weak accuracy_p^S properties are satisfied.

We point out that the trust level output of the FDs of the above classes depends on S*, i.e. on the impact factors assigned to the processes as well as on how they are grouped in subsets.

6. IMPLEMENTATION OF IMPACT FD

The Impact FD can have different implementations according to the characteristics of the system: the synchronization model, whether or not the process p has knowledge about the composition of S (membership) and the type of nodes.
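Whatever the implementation, the quantities defined in Section 5 are simple to compute. The following Python sketch is illustrative only and is not part of the algorithms of this paper: it assumes that S* is represented as a list of dictionaries (one per subset, mapping a process to its impact factor), and the helper names trust_level, is_trusted and flexible_sets are hypothetical. It computes the per-subset trust level, the trusted/untrusted status and, by brute-force enumeration, the set P^S. The data are those of Table 2.

```python
from itertools import combinations, product

# S* and threshold^{S*} of Table 2 (one dict per subset: process -> impact factor).
S_STAR = [
    {"q1": 1, "q2": 1, "q3": 1},
    {"q4": 2, "q5": 2, "q6": 2},
    {"q7": 3, "q8": 3, "q9": 3},
]
THRESHOLD = [1, 4, 6]


def trust_level(trusted, s_star):
    """Per-subset sum of the impact factors of the trusted processes."""
    return [sum(f for q, f in subset.items() if q in trusted) for subset in s_star]


def is_trusted(trusted, s_star, threshold):
    """S is trusted iff every subset reaches its own threshold."""
    levels = trust_level(trusted, s_star)
    return all(level >= bound for level, bound in zip(levels, threshold))


def flexible_sets(s_star, threshold):
    """Enumerate P^S: every set of processes that satisfies all thresholds."""
    per_subset = []
    for subset, bound in zip(s_star, threshold):
        names = sorted(subset)
        # Keep only the members of the power set whose factors reach the bound.
        per_subset.append([
            frozenset(c)
            for r in range(len(names) + 1)
            for c in combinations(names, r)
            if sum(subset[q] for q in c) >= bound
        ])
    # Cartesian product across subsets, each combination flattened into one set.
    return {frozenset().union(*choice) for choice in product(*per_subset)}


everyone = {q for subset in S_STAR for q in subset}
print(trust_level(everyone - {"q2"}, S_STAR))                       # [2, 6, 9]
print(is_trusted(everyone - {"q1", "q2", "q5", "q6"}, S_STAR, THRESHOLD))  # False
```

The enumeration is exponential in the size of each subset, so it is only meant to make the definition of P^S tangible; the status check itself needs only the per-subset sums.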
In this section, we present a timer-based implementation of the Impact FD (Algorithms 2 and 3).

Algorithm 1 Timeout Function.
1: function Timeout(q, η, model)
2:   if model = *-AS then                ▻ AS or F-AS system
3:     τq = β + EAq
4:   else
5:     τq = β + EAq + η
6:   end if
7:   return τq
8: end function

Algorithm 2 Timer-based Impact FD Algorithm for p.
1: Begin
   Input
2:   S*, model, η
   Init
3:   trusted = S
4:   ∀q ≠ p : reset timer[q] = Timeout(q, 0, model); η[q] = 0
   Task T1 - Upon reception of ALIVE from q
5:   if q ∉ trusted then
6:     trusted = trusted ∪ {q}
7:     if model = *-ET then              ▻ W-ET or S-ET system
8:       η[q] = η[q] + η
9:     end if
10:  end if
11:  reset timer[q] = Timeout(q, η[q], model)
   Task T2 - When timer[q] expires
12:  trusted = trusted \ {q}
13:  reset timer[q] = Timeout(q, η[q], model)
   Task T3
14:  Upon invocation of Impact() do
15:    return Trust_level(trusted, S*)
16:  end
17: End

Algorithm 3 Timer-based Impact FD Algorithm for q in S.
1: Begin
   Input
2:   p, Δ
   Task T1 - Repeat forever every Δ time unit
3:   send(ALIVE) to p
4: End

The system S consists of n processes grouped in m subsets. The monitor process p∉S. Our implementation (Algorithms 2 and 3) uses timers to detect failures of processes in different system models. Process q periodically sends (heartbeat) messages to process p, that is responsible for monitoring process q.
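To make the control flow of Algorithms 2 and 3 concrete, the following Python sketch mimics the monitor side in an event-driven style. It is an illustration under simplifying assumptions and not the implementation evaluated in this paper: time is passed in explicitly, timers are stored as deadlines, and the Timeout function is replaced by a fixed duration base_timeout plus η[q]. The class name ImpactMonitor, its method names and its parameters are hypothetical.

```python
class ImpactMonitor:
    """Simplified, event-driven sketch of Algorithm 2 (monitor process p)."""

    def __init__(self, s_star, base_timeout, eta_step=0.0,
                 eventually_timely=False, now=0.0):
        self.s_star = s_star                      # list of {process: impact factor}
        self.base_timeout = base_timeout          # stands in for beta + EA_q (assumption)
        self.eta_step = eta_step                  # the input increment eta
        self.eventually_timely = eventually_timely  # True for W-ET / S-ET models
        self.trusted = {q for subset in s_star for q in subset}   # Init, line 3
        self.eta = {q: 0.0 for q in self.trusted}                 # Init, line 4
        self.deadline = {q: now + base_timeout for q in self.trusted}

    def _reset_timer(self, q, now):
        self.deadline[q] = now + self.base_timeout + self.eta[q]

    def on_alive(self, q, now):
        """Task T1: ALIVE received from q."""
        if q not in self.trusted:                 # q had been suspected
            self.trusted.add(q)
            if self.eventually_timely:
                self.eta[q] += self.eta_step      # enlarge q's future timeouts
        self._reset_timer(q, now)

    def on_tick(self, now):
        """Task T2: handle every timer that has expired by `now`."""
        for q in list(self.deadline):
            if self.deadline[q] <= now:
                self.trusted.discard(q)
                self._reset_timer(q, now)

    def impact(self):
        """Task T3: per-subset trust level of the currently trusted processes."""
        return [sum(f for q, f in subset.items() if q in self.trusted)
                for subset in self.s_star]


monitor = ImpactMonitor([{"q1": 1, "q2": 1, "q3": 1}], base_timeout=150.0)
for q in ("q1", "q2", "q3"):
    monitor.on_alive(q, now=100.0)
for q in ("q1", "q3"):                            # q2 stays silent after t = 100
    monitor.on_alive(q, now=200.0)
monitor.on_tick(now=260.0)
print(monitor.impact())                           # [2]
```

Keeping the clock outside the class makes the sketch deterministic and easy to replay against a recorded trace; a deployed monitor would drive on_tick from real timers instead.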
If p does not receive such a message from q before the timer expires, it removes q from its list of trusted processes.

Chen's heartbeat arrival estimation: Algorithm 2 uses the algorithm proposed in [7], denoted Chen's algorithm in this work, which computes the timeout value for waiting for a heartbeat message from each monitored process. Chen's algorithm uses arrival times sampled in the recent past to compute an estimation of the arrival time of the next heartbeat. Then, the timeout value is set according to this estimation and a safety margin (β). It is recomputed at each timer expiration. The estimation algorithm is the following: process p takes into account the z most recent heartbeat messages received from q, denoted by y1, y2, …, yz; A1, A2, …, Az are their actual reception times according to p's local clock. When at least z messages have been received, the theoretical arrival time EA(k + 1) for a heartbeat from q is estimated by:

$$EA(k+1) = \frac{1}{z}\sum_{i=k-z}^{k}\left(A_i - \Delta_i \cdot i\right) + (k+1)\,\Delta_i$$

where Δi is the interval between the sending of two of q's heartbeats. The next timeout value, which will be set in p's timer and will expire at the next freshness point τ(k + 1), is then composed of EA(k + 1) and the constant safety margin β:

$$\tau(k+1) = \beta + EA(k+1) \quad \text{(next freshness point)}$$

In Algorithm 2, Chen's algorithm is executed by the Timeout function (Algorithm 1), which calculates the arrival estimation of the next heartbeat for process q. Furthermore, if the system is eventually timely, an η value is added to q's timeout in order to ensure the accuracy of the Impact FD. η has an initial value of zero and is incremented whenever p falsely suspects q (line 8 of Algorithm 2). Such an increment ensures that, if the link is ♢-timely and stable, i.e.
the delay bound δ holds forever, the estimated heartbeat arrival time will always be greater than or equal to the actual arrival time of every heartbeat. Therefore, there will be no more estimation mistakes, hence no more false suspicions, and the accuracy property holds.

Algorithm 2 is executed by the monitor process p, while Algorithm 3 is executed by all processes of S. The following local variables are used by the algorithm:
• trusted: set of processes considered not faulty by p;
• η[]: keeps the timeout increment of each process in S;
• timer[]: is set to the timeout value at each timer expiration.

In Algorithm 2, p receives as input the set S*, the increment time η for the timeout estimation (used when false suspicions occur in W-ET or S-ET systems), and the model of the system (AS, F-AS, W-ET or S-ET). Note that by receiving S*, the algorithm knows S, the impact factor of all processes of S, the number of subsets m, and how processes are grouped. At initialization, trusted is equal to the set of processes S (line 3). Then, for each process q in S (q ≠ p), p initializes the timer that will control the arrival of heartbeat messages from q (line 4). Upon the reception of an ALIVE message from q (Task T1), q is added to the trusted set (line 6) and the timeout related to q is recomputed (line 11). In Task T2, q is considered faulty by p and, therefore, removed from trusted (line 12). The timeout related to q is then recomputed (line 13). Task T3 handles the invocation of the Impact() function, which computes the trust_level of each subset from the processes currently trusted by p and returns it.

In Algorithm 3, every monitored process q of S periodically sends, every Δ units of time, an ALIVE message to the observer p given as input, in order to inform the latter that it is alive (Task T1). Note that if p∈S, as in S-ET-Π or W-ET-Π, all processes of Π execute the two algorithms, thus behaving as both a monitor and a monitored process.
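The arrival estimation and the Timeout function of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration and not the code used in the experiments: it assumes a constant sending interval Δ, averages over the z most recent samples as the prose describes, and returns the freshness point as an absolute time on p's clock. The names ArrivalEstimator, record, expected_arrival and timeout are hypothetical.

```python
from collections import deque


class ArrivalEstimator:
    """Windowed estimate of the next heartbeat arrival, in the style of Chen's algorithm."""

    def __init__(self, delta, window):
        self.delta = delta                       # sending interval (assumed constant)
        self.samples = deque(maxlen=window)      # (sequence number i, arrival time A_i)

    def record(self, seq, arrival):
        """Store the arrival time of heartbeat number `seq`; old samples fall out."""
        self.samples.append((seq, arrival))

    def expected_arrival(self):
        """EA(k+1): mean of (A_i - delta * i) over the window, plus (k+1) * delta."""
        if not self.samples:
            raise ValueError("no heartbeat received yet")
        k = self.samples[-1][0]
        mean_offset = sum(a - self.delta * i for i, a in self.samples) / len(self.samples)
        return mean_offset + (k + 1) * self.delta


def timeout(estimator, beta, eta_q, model):
    """Algorithm 1: next freshness point; eta_q is added only outside *-AS models."""
    tau = beta + estimator.expected_arrival()
    if not model.endswith("AS"):                 # W-ET or S-ET system
        tau += eta_q
    return tau


est = ArrivalEstimator(delta=100.0, window=3)
for seq, arrival in [(1, 105.0), (2, 209.0), (3, 307.0)]:
    est.record(seq, arrival)
print(est.expected_arrival())                    # 407.0
print(timeout(est, beta=400.0, eta_q=3.0, model="W-ET"))  # 810.0
```

In the example the three offsets A_i − Δ·i are 5, 9 and 7 ms, so the mean offset is 7 ms and the fourth heartbeat is expected at 407 ms.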
When p ∈ S, the primitive send in line 3 of Algorithm 3 is replaced by the primitive broadcast, i.e. every process periodically sends a heartbeat to all processes of S.

6.1. Proof

In this section, we prove the correctness of some properties of Algorithms 2 and 3.

Lemma 6.1 If p is correct, Algorithms 2 and 3 satisfy the impact completeness property for p in relation to S.

Proof Let's consider that at t, Sf = faulty(FS) (i.e. all failures of processes in S have already taken place) and that all the ALIVE messages (heartbeats) sent by these faulty processes before they crashed have been delivered to p. Thus, after t, p will receive no more ALIVE messages from processes of Sf. Then, ∀q∈Sf, at the next expiration of timer[q] after t, q will be removed from trusted (line 12). Moreover, since p will receive no more ALIVE messages from q, line 6 will never be executed for q again and, thus, q will never again be included in trusted. Therefore, ∃t′ > t, ∀t′′ ≥ t′, ∀q ∈ faulty(FS): q ∉ trusted_p(t′′). □

Lemma 6.2 If S is a W-ET system and p is correct, Algorithms 2 and 3 satisfy the eventual impact weak accuracy property for p in relation to S.

Proof In a W-ET system S, there exists q∈correct(FS) linked to p by a ♢-timely link. Let Tq denote the stabilization time of the link from q to p, i.e. ∀t ≥ Tq, if q sends a message m to p at time t, then p receives m by time t + δ. Then, when q sends a message to p at t ≥ Tq, and p receives the message at t1 ≥ t, two cases may happen:
• the next timer of q expires after t1 (Task T1). In this case, q will be added to trusted (line 6). Then, the timeout value of q is incremented (line 8) and the timer of q is restarted (line 11);
• the current timer of q expires before t1: p removes q from trusted (line 12) and the timer is restarted (line 13).
Since q keeps on sending ALIVE messages to p and timer[q] increases at each false suspicion of q, there exists a time t2 > Tq such that timer[q] ≥ δ. From then on, Task T2 will never again be executed by p for q and, ∀t3 ≥ t2, upon every reception of a message from q by p, Task T1 will be executed for q. Therefore, q will remain forever in trusted_p and eventual impact weak accuracy_p^S is satisfied. □

Lemma 6.3 If S is a S-ET system and p is correct, Algorithms 2 and 3 satisfy the eventual impact strong accuracy property for p in relation to S.

Proof In a S-ET system S, every q∈correct(FS) is linked to p by a ♢-timely link. Then, following the same proof scheme as in Lemma 6.2, q will remain forever in trusted_p and eventual impact strong accuracy_p^S is satisfied. □

Theorem 6.1 In W-ET-Π systems, Algorithms 2 and 3 implement a FD of class ♢IS.

Proof If the system is W-ET-Π, S = Π. From Lemmas 6.1 and 6.2, ∀p∈correct(F), impact completeness_p^Π and eventual impact weak accuracy_p^Π are satisfied. Therefore, the algorithms implement a FD of class ♢IS. □

Theorem 6.2 In S-ET-Π systems, Algorithms 2 and 3 implement a FD of class ♢IP.

Proof If the system is S-ET-Π, S = Π. From Lemmas 6.1 and 6.3, ∀p∈correct(F), impact completeness_p^Π and eventual impact strong accuracy_p^Π are satisfied. Therefore, the algorithms implement a FD of class ♢IP. □

Theorem 6.3 If PR(IT)_p^S (resp., PR(⋄IT)_p^S) holds, the system S is always (resp., eventually always) trusted by p.

Proof If PR(IT)_p^S (resp., PR(⋄IT)_p^S) holds, ∀t ≥ 0 (resp., ∃t1, ∀t ≥ t1), trusted ∈ P^S and, therefore, S is trusted by p. □

7. PERFORMANCE EVALUATION

In this section, we first describe the environment in which the experiments were conducted and the QoS metrics used for evaluating the results. Then, we discuss some of the results in different systems and configurations of node sets with regard to both the impact factor and the threshold. Our goal is to evaluate the QoS of the Impact FD: how fast it detects failures and how well it avoids false suspicions.
To this end, we use a set of metrics proposed in [7] and we compare the results of the Impact FD with an approach that monitors processes individually using Chen's FD [7]. We conducted a set of experiments, considering two different systems: (i) AS: a system where all links are lossy asynchronous; (ii) W-ET: a system where some links are ♢-timely and the others are lossy asynchronous.

7.1. Environment

Our experiments are based on real trace files, collected from 10 nodes of PlanetLab [8], as summarized in Table 3. The PlanetLab experiment started on 16 July 2014 at 15:06 UTC, and ended exactly a week later. Each site sent heartbeat messages to other sites at a rate of one heartbeat every 100 ms (the sending interval). We should point out that these PlanetLab traces contain a large amount of data concerning the sending and reception of heartbeats, including unstable periods of links and message loss, which induce false suspicions. Thus, such traces are characteristic of distributed systems that use heartbeat-based FDs. Furthermore, since our experiments were conducted using the PlanetLab traces, all of them reproduce exactly the same scenarios of sending and receiving of heartbeats by the processes. Consequently, provided that the same trace is available, the test conditions and results are reproducible.

Table 3. Sites of experiments.
ID   Site                                       Location
0    planetlab1.jhu.edu                         USA, East Coast
1    ple4.ipv6.lip6.fr                          France
2    planetlab2.csuohio.edu                     USA, Ohio
3    75-130-96-12.static.oxfr.ma.charter.com    USA, Massachusetts
4    planetlab1.cnis.nyit.edu                   USA, New York
5    saturn.planetlab.carleton.ca               Canada, Ontario
6    PlanetLab-03.cs.princeton.edu              USA, New Jersey
7    prata.mimuw.edu.pl                         Poland
8    planetlab3.upc.es                          Spain
9    pl1.eng.monash.edu.au                      Australia

For the evaluation of the Impact FD, we defined S = {1,2,3,4,5,6,7,8,9} and Site 0 as the monitor node (p∉S). Table 4 gives some information about the heartbeat messages received by Site 0 (the monitor node). We observe that the mean inter-arrival time of received heartbeats is very close to 100 ms.
However, for some sites, the standard deviation is very high. For Site 5, for instance, the standard deviation was 310.958 ms, with a minimum inter-arrival time of 0.006 ms and a maximum of 657 900.226 ms. Such a deviation probably indicates that, for a certain time interval during execution, the site stopped sending heartbeats and started again afterwards. Also note that Site 2 stopped sending messages after ~48 hours and, therefore, there are just 1 759 990 received messages.

Table 4. Sites and heartbeat sampling.
Site   Messages     Min (ms)   Max (ms)       Mean (ms)   Standard deviation (ms)
1      5 424 326    0.025      26 494.168     100.058     19.525
2      1 759 989    0.031      509.093        100.415     9.275
3      5 426 843    0.027      1 227.349      100.012     1.709
4      5 414 122    0.003      1 193.276      100.247     18.595
5      5 413 542    0.006      657 900.226    100.258     310.958
6      5 426 700    0.003      3 787.643      100.015     2.557
7      5 424 117    0.006      59 603.188     100.062     31.229
8      5 424 560    0.027      11 443.359     100.054     100.714
9      5 422 043    0.004      30 600.076     100.100     18.798

The implementation of the Impact FD used in our evaluation experiments is based on Algorithms 2 and 3, presented in Section 6. For the estimation of the timeout value of Chen's estimation algorithm, the authors of [7] suggest that the safety margin β should range from 0 to 2500 ms. For all experiments, we set the window size to 100 samples, which means that the FD only relies on the last 100 heartbeat message samples for computing the estimation of the next heartbeat arrival time. Several works aim at improving the QoS of FDs which estimate the arrival time of the next heartbeat by varying some parameters such as the window size [24–26]. These works emphasize that Chen's FD performs better with smaller window sizes. Based on these studies and our experiments, we used a window size of 100 samples, which allows Chen's FD to take less time to adapt to the dynamics of the network.

7.1.1. Evaluation of sites' stability

We evaluated the stability of sites, considering that the traces could correspond to either an AS system or a W-ET system.
For the first case, the value AS was assigned to the model parameter of Algorithm 2, while for the second case the same parameter was set to W-ET. Each of the sites of S is considered individually and not as a whole system. The impact values of sites and the threshold values are not relevant to these experiments. The β value of Chen's algorithm was set to 400 ms. We chose such a value because it is an acceptable safety margin for detection time and is not too aggressive; otherwise the FD would be prone to too many mistakes. The stability of sites and the corresponding links to the monitor were evaluated during the whole trace period for the AS system and during just the first 24 hours of the trace period for the W-ET system.

AS system: Figure 2 shows the cumulative number of mistakes, i.e. false suspicions, made by the monitor Site 0 for each site of S. We can observe that periods of site or link instability entail late arrivals or loss of heartbeats and, therefore, mistakes by the monitor site. For example, Site 9 had a large number of cumulative mistakes at hour 48. After that, there is a stable period with regard to this site. On the other hand, around this time, Site 2 stopped sending messages since it crashed and, consequently, the monitor node made no more mistakes about it after this time. Finally, we can say that, considering the whole period, Sites 3 and 6 (resp., 8 and 9) are, on average, the most stable (resp., unstable) sites.

Figure 2. AS System: cumulative number of mistakes of each site.

W-ET system: In Algorithm 2 (Task T1), when the system is W-ET, Chen's heartbeat arrival estimation value is incremented by η whenever a false suspicion occurs.
However, in order to prevent this estimation from increasing too fast during a period of high instability, which could increase the detection time considerably, we considered that the value of the timer (line 8 of Algorithm 2) is incremented by η every μ heartbeat arrivals, provided that one or more false suspicions took place during the period of these μ heartbeat arrivals. For the experiment, we considered μ = 10 and η = 1 ms. Note that when the heartbeat arrival estimation reaches a value which is greater than the transmission delay limit for links with ♢-timely behavior, the monitor site does not make any more mistakes for the related sites. Moreover, for unstable sites, as the heartbeat arrival estimation value will also be incremented by η in case of false suspicions, such an increment will be responsible for decreasing the number of mistakes for these sites when compared to an AS system. However, such a reduction induces a higher false suspicion detection time.

Figure 3 shows the cumulative number of mistakes that the monitor process made for each site in the first 24 hours of the traces. We can observe that there are links which behave as ♢-timely while the others are lossy asynchronous. The FD did not make mistakes related to Site 4. For Sites 2 and 3, it made only 1 and 2 mistakes, respectively, while for Site 6, it made 99 mistakes during the first hour, and then no more mistakes. Although some sites (1, 5, 8 and 9) had some periods of stability, Site 0 made mistakes related to them until almost the end of the execution. On the other hand, it made no mistakes for Site 7 after hour 9. In summary, we can consider that Site 0, the monitor site, is connected by ♢-timely links to Sites 2, 3, 4 and 6, and by lossy asynchronous links to Sites 1, 5, 7, 8 and 9.

Figure 3. W-ET System: cumulative number of mistakes of each site.

7.1.2.
Evaluation of heartbeat arrival times

The goal of this section is to show the behavior of the arrival times when the timer expires and the FD does not receive the heartbeat message. For the first 24 hours, we evaluated the behavior of three arrival times at Site 0 related to heartbeat messages of Site 1, with two different values of β (100 and 400 ms). We chose Site 1 because it has many periods of instability. We consider that Sites 1 and 0 are alternately connected by lossy asynchronous or ♢-timely links. We evaluated three arrival times: (i) the arrival of the heartbeat; (ii) the estimated arrival time considering that the link is lossy asynchronous; (iii) the estimated arrival time considering that the link is ♢-timely. In order to compute the latter, we set η = 1 ms and the number of heartbeats before incrementing the heartbeat arrival estimation value, in case of false suspicions, to 100 (μ = 100).

Figures 4 and 5 show the time difference between the arrival time of the previous heartbeat and the above three arrival times for Site 1: (i) the difference in milliseconds between the arrival time of the last heartbeat and the previous one (Arrival); (ii) the difference in milliseconds between the estimated arrival time (τq = β + EAq) and the arrival time of the previous heartbeat, considering the link lossy asynchronous (Estimation LA); (iii) the number of milliseconds elapsed between the estimated arrival time (τq = β + EAq + η) and the arrival time of the previous heartbeat, considering the link ♢-timely (Estimation ET).

Figure 4. The behavior of the arrival times when the timeout expires in Site 1 for β = 100 ms and μ = 100 for 24 hours. The points correspond to the times where mistakes took place.

Figure 5. The behavior of the arrival times when the timeout expires in Site 1 for β = 400 ms and μ = 100 for 24 hours. The points correspond to the times where mistakes took place.

Figures 4 and 5 show the behavior of times when the timeout expires for β = 100 and β = 400 ms, respectively, up to hour 24. In order to simplify the figures, the points correspond only to the times where mistakes took place. Figure 5 has fewer points than Figure 4 because the number of mistakes drops considerably due to a higher β value.

Figure 4 summarizes the time differences for β = 100 ms. The monitor Site 0 made 807 (resp., 592) mistakes when the link is lossy asynchronous (resp., ♢-timely). Note that at several points, the estimated arrival time for the ET estimation is higher than the arrival time of the heartbeat while, in the LA estimation, the difference between them is very small (1 or 2 ms), especially from hour 6 to hour 21. Thus, both lines in the figure overlap, but the estimated arrival time is often below the actual one, which explains the high number of mistakes. At hours 1, 4, 6, 21 and 23, which correspond to periods of instability, the arrival time of the heartbeat is much higher than the estimated one for the LA estimation.

In contrast to Figure 4, the number of mistakes in Figure 5 drops to 168 and 166 for the ET and LA estimations, respectively. Since these numbers are almost equal, the estimated arrival times for the lossy asynchronous and ♢-timely links are also quite close. Similarly to Figure 4, the mistakes are concentrated in periods of great instability (hours 1, 4, 6, 21 and 23).

7.1.3.
Discussion about the choice of parameters

Below we describe the criteria used to set the parameter values β, μ and η:

β: for the estimation of the timeout value of Chen's estimation algorithm, the authors of [7] suggest that the safety margin should range from 0 to 2500 ms. The β value of Chen's algorithm was set to 400 ms in the experiments of Sections 7.1.1, 7.1.2 and 7.3.1. We chose such a value because it is an acceptable safety margin for detection time and is not too aggressive; otherwise the FD would be prone to too many mistakes. In the experiments of Section 7.4, we used β = 50 ms and β = 100 ms. These safety margin values are quite aggressive and, consequently, make the FD prone to mistakes. We chose these values because our aim was to check the behavior of the FD in a scenario more vulnerable to failures.

η: this is the increment time for the timeout estimation (used when false suspicions take place). We conducted experiments with two values: 500 μs and 1 ms. We defined these values taking into account that each site sends heartbeat messages to other sites at a rate of one heartbeat every 100 ms (the sending interval). A value greater than 1 ms greatly increases the detection time. On the other hand, a value smaller than 500 μs generates many mistakes. Thus, these values offer a better trade-off between detection time and accuracy of the Impact FD.

μ: we conducted experiments with different values for μ (1, 10 and 100). On the one hand, we observed that for μ = 1 a smaller number of errors occurred, but the detection time increased. On the other hand, when we used μ = 100, the number of errors increased. Considering this trade-off, we set μ = 10 in the experiment of Section 7.1.1 (Evaluation of sites' stability).

7.2. QoS metrics

First, let us recall that the goal of the Impact FD is to inform whether a system is 'trusted' or 'untrusted'. This information can be deduced by comparing the output trust_level of the Impact FD with the threshold.
Thus, we say that the output of the Impact FD of p is correct if either, for each subset of S* (1 ≤ i ≤ m), trust_level_i ≥ threshold_i and S is actually trusted, or ∃ i such that trust_level_i < threshold_i and S is actually untrusted. Otherwise, the FD made a mistake. For evaluating the Impact FD, we used three of the QoS metrics proposed in [7]: detection time, average mistake rate, and query accuracy probability. Considering that p monitors S, the QoS of the Impact FD at p must take into account the transitions between the 'trusted' and 'untrusted' states of S.

Detection time (TD): In [7], TD is defined as the time elapsed from the moment process q crashes until the FD at p starts suspecting q permanently. In the case of the Impact FD, the detection time (TD) of p in relation to S is the time elapsed until the monitor process reports a suspicion that leads to a status transition of S from trusted to untrusted. To this end, for each freshness point of a process q in S, it is necessary to check which process failures would lead to a state transition of S from trusted to untrusted and then compute the detection time TD for each of these processes. The latter is the time elapsed between the current freshness point (τ_{i+1}) and the last heartbeat arrival (A_i) with respect to the previous freshness point, i.e. τ_{i+1} − A_i, for each of these processes. If there is more than one process q∈S which could lead to the transition, i.e. Sf = {q ∈ trusted_i | (trust_level_i − Impact(q)) < threshold_i}, the TD in relation to S is the greatest of them: TD = max(τ_{i+1} − A_i), ∀q∈Sf.

Figure 6 shows an example where S* has just one subset with three processes whose impact factor is 1. The threshold^{S*} defines that at least two processes must be correct. Note that at τ_{i+3}, process p did not receive the heartbeat message from q1 and, therefore, p removes it from its trusted set (trusted_p = {q2, q3}). However, S remains trusted for p because the trust level satisfies the threshold.
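The set Sf and the rule TD = max(τ_{i+1} − A_i) can be sketched as follows for a single subset. This is a hedged illustration, not the measurement code used for the experiments: the function names critical_processes and detection_time are hypothetical, and the freshness points and arrival times in the example are made-up values chosen only to exercise the rule on the configuration of Figure 6 after q1 has been suspected.

```python
def critical_processes(trusted, subset, threshold):
    """S_f: trusted processes whose loss would push the subset below its threshold."""
    level = sum(subset[q] for q in trusted if q in subset)
    return {q for q in trusted if q in subset and level - subset[q] < threshold}


def detection_time(trusted, subset, threshold, freshness, last_arrival):
    """TD in relation to S: the largest (freshness point - last arrival) over S_f.

    Returns None when no single suspicion can make the subset untrusted.
    """
    s_f = critical_processes(trusted, subset, threshold)
    if not s_f:
        return None
    return max(freshness[q] - last_arrival[q] for q in s_f)


subset = {"q1": 1, "q2": 1, "q3": 1}          # one subset, impact factor 1 each
threshold = 2                                  # at least two correct processes

# With all three processes trusted, one suspicion cannot flip the status.
print(critical_processes({"q1", "q2", "q3"}, subset, threshold))   # set()

# After q1 is suspected, losing either q2 or q3 makes S untrusted.
trusted = {"q2", "q3"}
freshness = {"q2": 620.0, "q3": 640.0}         # hypothetical freshness points (ms)
last_arrival = {"q2": 505.0, "q3": 512.0}      # hypothetical last arrivals (ms)
print(detection_time(trusted, subset, threshold, freshness, last_arrival))  # 128.0
```

For several subsets, the same check would be repeated per subset, since a transition of S occurs as soon as any subset falls below its threshold.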
At freshness point $\tau_{i+5}$, the FD verifies whether the failure of any of the processes of $trusted_p$ (q2 and q3) can lead to a transition of S ($trust\_level_1 < threshold_1$). For this purpose, p computes the TD for each of the two processes. The TD in relation to S is the greatest among the TD of q2 and the TD of q3. Since p did not receive a heartbeat from q2, S becomes untrusted.

Figure 6. Transitions between ‘trusted’ and ‘untrusted’ states for three processes with impact factor 1 within a single subset. At least two processes must be correct.

Average mistake rate (λR): represents the number of mistakes that the FD makes per unit of time, i.e. the rate at which the FD makes mistakes.

Query accuracy probability (PA): the probability that the FD output is correct at a random time.

7.3. Asynchronous system

For this evaluation we consider an AS, i.e. links are lossy asynchronous. Table 5 shows five configurations with regard to impact factor values that have been considered for S* in the experiments. The sum of the impact factors of the processes is 90 for all configurations.

Table 5. Set configurations (S*).
Configuration | Impact factor of each site
S*0 | {{⟨q1,7⟩,⟨q2,3⟩,⟨q3,20⟩,⟨q4,20⟩,⟨q5,3⟩,⟨q6,20⟩,⟨q7,3⟩,⟨q8,7⟩,⟨q9,7⟩}}
S*1 | {{⟨q1,7⟩,⟨q2,20⟩,⟨q3,20⟩,⟨q4,3⟩,⟨q5,3⟩,⟨q6,20⟩,⟨q7,3⟩,⟨q8,7⟩,⟨q9,7⟩}}
S*2 | {{⟨q1,20⟩,⟨q2,7⟩,⟨q3,3⟩,⟨q4,3⟩,⟨q5,7⟩,⟨q6,3⟩,⟨q7,7⟩,⟨q8,20⟩,⟨q9,20⟩}}
S*3 | {{⟨q1,7⟩,⟨q2,3⟩,⟨q3,20⟩,⟨q4,3⟩,⟨q5,3⟩,⟨q6,20⟩,⟨q7,7⟩,⟨q8,20⟩,⟨q9,7⟩}}
S*4 | {{⟨q1,10⟩,⟨q2,10⟩,⟨q3,10⟩,⟨q4,10⟩,⟨q5,10⟩,⟨q6,10⟩,⟨q7,10⟩,⟨q8,10⟩,⟨q9,10⟩}}

7.3.1. Experiment 1: query accuracy probability

The aim of this experiment is to evaluate the Query Accuracy Probability (PA) with different threshold values (64, 70, 74, 80, and 83) and different impact factor configurations (Table 5).
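To make the interaction between the Table 5 configurations and the threshold values concrete, the following sketch computes the trust level as the sum of the impact factors of the non-suspected processes and compares it with a threshold. The helper names `trust_level` and `is_trusted` are hypothetical, and only the single-subset case of Table 5 is covered; the impact factors are copied from configurations S*0 and S*4.

```python
from typing import Dict, Set

# Impact factors copied from Table 5 (single subset, total impact 90).
S0: Dict[str, int] = {"q1": 7, "q2": 3, "q3": 20, "q4": 20, "q5": 3,
                      "q6": 20, "q7": 3, "q8": 7, "q9": 7}
S4: Dict[str, int] = {f"q{i}": 10 for i in range(1, 10)}


def trust_level(impact: Dict[str, int], suspected: Set[str]) -> int:
    """Sum of the impact factors of the processes not suspected of failure."""
    return sum(value for q, value in impact.items() if q not in suspected)


def is_trusted(impact: Dict[str, int], suspected: Set[str], threshold: int) -> bool:
    """The subset is trusted when its trust level reaches the threshold."""
    return trust_level(impact, suspected) >= threshold


# S*0 with q2 and q8 suspected: 90 - 3 - 7 = 80.
level = trust_level(S0, {"q2", "q8"})
trusted_at_80 = is_trusted(S0, {"q2", "q8"}, 80)
trusted_at_83 = is_trusted(S0, {"q2", "q8"}, 83)

# S*4: a single suspicion already drops the level from 90 to 80.
uniform_at_83 = is_trusted(S4, {"q5"}, 83)
```

In this sketch the same pair of suspicions leaves S*0 trusted at threshold 80 but untrusted at threshold 83, while in the uniform configuration S*4 any single suspicion falls below 83.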
The safety margin was set to 400 ms (β = 400 ms). Figure 7 shows that in most cases the PA decreases when the threshold increases. Recall that the threshold is a limit value defined by the user: if the trust level output by the FD is equal to, or greater than, the threshold, the confidence in the set of processes is ensured. Hence, the results confirm that when the threshold is lower, the Query Accuracy Probability is higher.

Figure 7. AS System: PA vs. threshold with different set configurations (S*).

On the one hand, except for threshold 83, the ‘S*0’ configuration has the highest PA for most of the thresholds due to the assignment of high (resp., low) impact factors to the most stable (resp., unstable) sites. On the other hand, ‘S*2’ and ‘S*4’ have the lowest PA since unstable sites are assigned high impact factor values. For instance, in ‘S*2’ the high impact factor value of the unstable sites 8 and 9, with standard deviations of 100 and 18 ms, respectively, degrades the PA of this set. ‘S*4’ shows a sharp decline of the PA curve when the threshold is 83. This behavior can be explained by the fact that, in this set configuration, all sites have the same impact factor (10), which implies that every false suspicion renders the trust_level smaller than the threshold (83), increasing the mistake duration. Therefore, the query accuracy probability decreases.

Notice that Site 2 failed after ~48 hours. Thus, after its crash, the FD output, which indicates a trust_level smaller than the threshold, is not a mistake, i.e. it is not a false suspicion. Hence, in ‘S*1’, where the impact factor of Site 2 is 20 (high), the PA is constant for a threshold greater than 70: after the crash of Site 2, the FD output is always smaller than the threshold and false suspicions related to other sites do not alter it.
The average mistake duration in the experiment is thus smaller after the crash, which improves the PA.

Finally, we compared the PA of the Impact FD with that of an FD approach that monitors processes individually by applying Chen’s algorithm, considering the 100 most recent heartbeats (WS = 100) and β = 400 ms. For the latter, the metric is the average of the PA values of all sites of S: $\overline{P_A} = \frac{\sum_{x=1}^{n} P_{A_x}}{n}$, for n = 9 and x equal to the index of each site in S. The obtained mean PA ($\overline{P_A}$) is equal to 0.979788. This result shows that, regardless of the set (S*) configuration, the Impact FD has a higher PA than Chen’s FD since the former has enough flexibility to tolerate failures, i.e. the mistake duration only starts to be computed when the trust_level provided by the Impact FD is smaller than the threshold, in contrast with individual monitoring, such as that by Chen’s FD, where every false suspicion increases the mistake duration.

The results of this experiment highlight the fact that the assignment of heterogeneous impact factors to nodes can degrade the performance of the FD, especially when unstable sites have a high impact factor.

7.3.2. Experiment 2: query accuracy probability vs. detection time

In the second experiment, we evaluated the average query accuracy probability (PA) with regard to the average detection time (TD) for different threshold values (64, 70, 80 and 83). In order to obtain different values for the detection time, we varied the safety margin (Chen’s estimation) in steps of 100 ms, starting at 100 ms. For this experiment, we chose the ‘S*0’ configuration since it presented the best PA in Experiment 1. We also evaluated the PA and TD for Chen’s algorithm, which outputs the set of suspected nodes. For the latter, the TD is computed as the average of the individual TD of all sites of S: $\overline{T_D} = \frac{\sum_{x=1}^{n} T_{D_x}}{n}$.
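As background for the safety-margin sweep, the sketch below shows one common way to write Chen’s arrival estimation [7]: the expected arrival of the next heartbeat is the windowed average of past arrival times, each normalized by its sequence number, shifted to the next sequence number, and the freshness point adds the safety margin β. The class and method names are hypothetical, and this is an illustrative reading of the estimator, not the code used in the experiments. The defaults reuse values stated in the text (100 ms sending interval, window of 100 heartbeats, β = 400 ms).

```python
from collections import deque
from typing import Deque, Tuple


class ChenEstimator:
    """Illustrative sliding-window heartbeat arrival estimator (assumed form)."""

    def __init__(self, interval_ms: float = 100.0, window_size: int = 100,
                 safety_margin_ms: float = 400.0) -> None:
        self.interval_ms = interval_ms            # heartbeat sending interval
        self.safety_margin_ms = safety_margin_ms  # beta
        # Keeps only the most recent (sequence number, arrival time) pairs.
        self.samples: Deque[Tuple[int, float]] = deque(maxlen=window_size)

    def record(self, seq: int, arrival_ms: float) -> None:
        self.samples.append((seq, arrival_ms))

    def expected_arrival(self) -> float:
        """Estimated arrival time of the heartbeat following the last one seen."""
        if not self.samples:
            raise ValueError("no heartbeat received yet")
        # Average arrival time once the sending schedule is subtracted.
        offset = sum(a - self.interval_ms * s for s, a in self.samples) / len(self.samples)
        last_seq = self.samples[-1][0]
        return offset + (last_seq + 1) * self.interval_ms

    def next_freshness_point(self) -> float:
        """Expected arrival plus the safety margin."""
        return self.expected_arrival() + self.safety_margin_ms


# Five heartbeats, each arriving 20 ms after its send time (hypothetical data).
estimator = ChenEstimator(safety_margin_ms=400.0)
for seq in range(1, 6):
    estimator.record(seq, 100.0 * seq + 20.0)

expected = estimator.expected_arrival()
freshness = estimator.next_freshness_point()
```

Lowering `safety_margin_ms` moves the freshness point earlier, which shortens the detection time but leaves less slack for late heartbeats; this is the trade-off the experiment explores.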
Figure 8 shows that for a high threshold and detection time close to 200 ms, the PA of the Impact FD is quite small, independently of the threshold, because the safety margin (used to compute the expected arrival times) is, in this case, equal to 100 ms, which increases both the number of false suspicions and the mistake duration. However, when TD is greater than 230 ms, the PA of the Impact FD is considerably higher than that of Chen. After a detection time of ~400 ms, the PA of the Impact FD becomes constant regardless of the detection time and threshold, and gets close to 1. This behavior can be explained by the fact that the higher the safety margin, the smaller the number of false suspicions and the shorter the mistake duration, which confirms that when the timeout is short, failures are detected faster but the probability of false detections increases [27].

Figure 8. AS System: PA vs. TD with different thresholds.

7.3.3. Experiment 3: average mistake rate

In this experiment, we evaluated the average detection time (TD) vs. the mistake rate (λR) (mistakes per second). For Chen’s algorithm, the λR is computed as the average of the individual λR of all sites of S: $\overline{\lambda_R} = \frac{\sum_{x=1}^{n} \lambda_{R_x}}{n}$. We considered the ‘S*0’ configuration, and the mistake rate is expressed on a logarithmic scale.

We can observe in Figure 9 that the mistake rate of the Impact FD is high when the detection time is low (i.e. smaller than 400 ms) and the threshold is high (i.e. from 80 to 83). Such a result is in accordance with Experiment 2: whenever the safety margin is small and the threshold tolerates fewer failures, the Impact FD makes mistakes more frequently. In other words, the mistake rate decreases when the threshold is low or the detection time increases.

Figure 9. AS System: λR vs.
TD with different thresholds.

7.3.4. Experiment 4: cumulative number of mistakes

Figure 10 shows the cumulative number of mistakes for ‘S*0’ during the whole trace period, considering β = 400 ms and a threshold value equal to either 80 or 83.

Figure 10. AS System: cumulative number of mistakes for ‘S*0’ configuration.

We can observe in the figure that the cumulative number of mistakes is greater when the threshold value is equal to 83 (2754 mistakes) than when it is equal to 80 (179 mistakes). With threshold 83, the FD makes few mistakes until approximately hour 48 (when Site 2 crashed). After that, the number of cumulative mistakes increases significantly because, since the threshold is high (83) and the failure of Site 2 was detected, false suspicions of any other site induce a trust_level value smaller than 83 in most cases. For instance, Site 8 is highly unstable and has an impact factor value of 7. Whenever there is a false suspicion about it, after the crash of Site 2, the trust_level value is 80.

On the other hand, for the threshold 80, there are fewer instability periods since the crash of Site 2 does not have much impact on the confidence in the system. At hour 48, there is an increase in the cumulative number of mistakes due to the unstable period of Site 9, as shown in Figure 2. From hour 50 to 100, the FD makes fewer mistakes. This behavior can be explained by the fact that, as observed in the same figure, all sites, with the exception of Site 8, are also stable during this period. After hour 108, there is a greater number of mistakes, which is related to the instability of Sites 1, 7 and 8 (see Figure 2).

7.3.5. Experiment 5: query accuracy probability vs. time

In this experiment, we divided the execution trace duration into fixed intervals of time and computed the average query accuracy probability (PA) for each of them.
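One possible way to compute a per-interval PA is sketched below, assuming that the mistakes of the FD are available as (start, end) periods during which its output was wrong, and taking the PA of an interval as the fraction of that interval not covered by mistakes. The function name, the input format and the sample values are hypothetical; the paper does not specify how the computation was implemented.

```python
from typing import List, Tuple


def pa_per_interval(mistakes: List[Tuple[float, float]],
                    total: float, width: float) -> List[float]:
    """Fraction of each fixed-width interval during which the FD output was correct.

    mistakes: (start, end) periods of wrong output, in the same unit as total.
    total:    length of the whole trace.
    width:    length of each interval (the last one may be shorter).
    """
    result: List[float] = []
    start = 0.0
    while start < total:
        end = min(start + width, total)
        # Accumulate the part of every mistake period that overlaps [start, end).
        wrong = sum(max(0.0, min(m_end, end) - max(m_start, start))
                    for m_start, m_end in mistakes)
        result.append(1.0 - wrong / (end - start))
        start = end
    return result


# Hypothetical trace of 20 time units split into intervals of 10 units.
# The second mistake period straddles the boundary between the two intervals.
pa = pa_per_interval([(5.0, 7.0), (8.0, 12.0)], total=20.0, width=10.0)
```

A mistake period that crosses an interval boundary contributes to both intervals in proportion to its overlap, so a long unstable period lowers the PA of every interval it touches.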
We chose the ‘S*0’ configuration, β = 400 ms, and the threshold values of 80 and 83. Similarly to the cumulative number of mistakes (Experiment 4), we observe in Figure 11 that instability periods have an impact on the PA. For instance, for threshold 80, from hour 108, the cumulative number of mistakes increases very fast. Consequently, the PA decreases. The period of instability of Site 9 is responsible for the significant reduction of the PA at hour 60 (i.e. from hour 48 to 60) when the threshold is 83. A new degradation of the PA happens at hour 120 (i.e. from hour 108 to 120), due to unstable periods of Sites 1, 7 and 8.

Figure 11. AS System: PA vs. time.

7.4. Weak ♢-timely System (W-ET)

In this section, we consider the W-ET system described in Section 7.1.1: Site 0, the monitor site, is connected by ♢-timely links to sites 2, 3, 4 and 6 and by lossy asynchronous links to sites 1, 5, 7, 8 and 9. We defined the set S* with three subsets, and all sites have the same impact factor (1):

S* = {{⟨q1,1⟩,⟨q3,1⟩,⟨q4,1⟩}, {⟨q2,1⟩,⟨q5,1⟩,⟨q6,1⟩}, {⟨q7,1⟩,⟨q8,1⟩,⟨q9,1⟩}}

The thresholdS was defined as follows:

thresholdS = {2,2,2}

The thresholdS defines that each of the subsets S1, S2 and S3 must have at least two correct processes. Since this experiment sets the model parameter to W-ET, it uses the η value: the heartbeat arrival estimation value is incremented by η every μ heartbeat arrivals, if false suspicions occurred during this period. The experiments were carried out just for the first 24 hours of the traces, because after this time the FD makes no further mistakes for the set S*.

7.4.1. Experiment 6: eventually timely links vs. asynchronous links

In this experiment, we compare the results obtained taking into account the above S* configuration and both systems, W-ET and AS. The evaluation metrics are shown in Table 6. We set the value of the safety margin β to 50 ms and η to 500 μs.
This safety margin value is quite aggressive and, consequently, makes the FD prone to mistakes. For the W-ET system, we also varied μ: 1, 10 and 100.

Table 6. W-ET vs AS - β = 50 ms, η = 500 μs.

μ | Mistakes | Mistake rate (mistakes/s) | PA | Avg mistake duration (ms) | Time of last mistake (min) | HB number | TD (ms)
1 | 152 | 0.0017 | 0.99992 | 43.36 | 64 (1 h) | 349 341 | 234.0
10 | 324 | 0.0037 | 0.99983 | 43.69 | 64 (1 h) | 349 341 | 182.0
100 | 383 | 0.0044 | 0.99979 | 45.18 | 64 (1 h) | 349 341 | 173.0
AS | 4689 | 0.0542 | 0.99849 | 27.70 | 1438 (24 h) | 7 749 909 | 151.0

The first three rows of the table show the results for the W-ET system and the last row those for the AS system. We can observe that the number of mistakes increases with μ in the W-ET, but it is much smaller than in the AS (4689 mistakes). As a consequence, in the AS, the mistake rate is higher and the PA is lower. In contrast, the average mistake duration in the AS (27.70 ms) is smaller than in the W-ET (around 43 ms).
Such a difference occurs because the AS system has a lower timeout, which induces false suspicions more often. Nevertheless, a heartbeat message may arrive immediately after the expiration of the timeout, generating a short mistake time. On the other hand, in the W-ET, the timeout value increases when there are false suspicions in periods of greater instability where messages take longer to arrive.

For the W-ET system, we can observe that the time of the last mistake was at 64 minutes (heartbeat number 349 341), whereas in the AS system mistakes are observable until the last hour (hour 24, heartbeat number 7 749 909). This happens because in the W-ET the heartbeat arrival estimation value is incremented by η when p falsely suspects the process within a period of μ heartbeats, which allows p to eventually get every heartbeat message from a site before the timeout expires. It is worth remarking that the number of mistakes reduces drastically, but the TD does not increase at the same rate.

Table 7 summarizes the results of the experiments considering β = 100 ms and η = 500 μs. When comparing the two tables, we observe that with a less aggressive safety margin β, the number of mistakes reduces, especially in the AS system (231). Accordingly, the mistake rate decreases and the PA increases in both systems. The last mistake is around 64 minutes in the W-ET, while the AS made mistakes until hour 24. The TD of the AS reduces because it has a higher safety margin and makes fewer mistakes. For instance, with β = 50 ms, the timeouts of two processes, whose maximum TD is 300 ms, expire, which leads the set S* to an untrusted state. However, with β = 100 ms only one of them is suspected, which does not cause a transition from trusted to untrusted.

Table 7. W-ET vs AS - β = 100 ms, η = 500 μs.
μ | Mistakes | Mistake rate (mistakes/s) | PA | Avg mistake duration (ms) | Time of last mistake (min) | HB number | TD (ms)
1 | 84 | 0.00097 | 0.99995 | 48.35 | 64 (1 h) | 349 341 | 273.8
10 | 121 | 0.00140 | 0.99993 | 48.28 | 64 (1 h) | 349 341 | 224.0
100 | 135 | 0.00156 | 0.99993 | 44.53 | 64 (1 h) | 349 341 | 219.6
AS | 231 | 0.00267 | 0.99989 | 37.56 | 1431 (24 h) | 7 708 057 | 208.0

We also conducted the same experiment with β = 100 ms and η = 1 ms for the W-ET system (Table 8). We can note that the number of mistakes is reduced. On the other hand, with few mistakes, especially with μ = 1, both the average mistake duration and the TD increase. Based on these results, we can conclude that setting μ to a value greater than 1 is more suitable for this scenario, since it achieves a better trade-off between detection time and accuracy of the Impact FD.

Table 8. W-ET vs AS - β = 100 ms, η = 1 ms.
μ | Mistakes | Mistake rate (mistakes/s) | PA | Avg mistake duration (ms) | Time of last mistake (min) | HB number | TD (ms)
1 | 6 | 0.000069 | 0.999990 | 140.00 | 64 (1 h) | 349 339 | 689.5
10 | 45 | 0.000520 | 0.999972 | 53.07 | 64 (1 h) | 349 341 | 383.9
100 | 98 | 0.001133 | 0.999945 | 47.99 | 64 (1 h) | 349 341 | 243.9
AS | 231 | 0.002672 | 0.999899 | 37.56 | 1431 (24 h) | 7 708 057 | 226.8

8. RELATED WORK

We can divide related work into two groups: (i) unreliable FDs and (ii) heartbeat arrival estimation strategies.

Unreliable FDs: Most of the unreliable FDs in the literature are based on a binary model and provide as output a set of process identifiers, which usually informs the set of processes currently suspected of having failed ([2, 3]). However, in some detectors, such as class Σ (resp., Ω) [23], the output is the set of processes (resp., one process) which are (resp., is) not suspected of being faulty, i.e. trusted.
The Accrual FD [24] proposes an approach where the output is a suspicion level on a continuous scale, rather than information of a binary nature (trusted or suspected). The suspicion level captures the degree of confidence with which a given process is believed to have crashed. If the process actually crashes, the value is guaranteed to accrue over time and tends toward infinity. Like the Accrual FD, the Impact FD provides a non-binary output; however, the latter is related to the system as a whole and not to each process individually. On the other hand, some important features advocated by the authors in [28] for the Accrual FD can also be extended to our proposal. The authors argue that the aim of Accrual FDs is to decouple monitoring from interpretation. Hence, Accrual FDs provide a lower-level abstraction that avoids having to interpret monitoring information. For instance, by setting an appropriate threshold, applications can trigger suspicions and take appropriate action, similarly to the Impact FD.

Starting from the premise that applications should have information about failures to take specific and suitable recovery actions, the work in [29] proposes a service to report faults to applications. The latter also encapsulates uncertainty, which allows applications to proceed safely in the presence of doubt. The service provides status reports related to fault detection with an abstraction that describes the degree of uncertainty.

Considering that each node has a probability of being Byzantine, a voting node redundancy approach is presented in [30] in order to improve the reliability of distributed systems. Based on such probability values, the authors estimate the minimum number of machines that the system should have in order to provide a degree of reliability which is equal to or greater than a threshold value.

In [31], the authors propose the use of a reputation mechanism to implement an FD for large and dynamic networks.
The reputation mechanism allows node cooperation through the sharing of views about other nodes. The proposed approach exploits information about the behavior of nodes to increase its quality in terms of detection. When classifying the behavior of the nodes, the FD includes a reputation service where the nodes periodically exchange heartbeat messages.

Heartbeat arrival estimation strategies: In the timer-based FD algorithms presented in Section 6, we used the heartbeat arrival estimation proposed by [7]. With the same aim as Chen’s algorithm, i.e. to minimize false suspicions and failure detection time, several other estimation approaches have been proposed in the literature. They dynamically predict new heartbeat arrivals based on the communication delays observed in the past heartbeat history.

Bertier et al. [3] introduced an FD that was mainly intended for LAN environments. Their heartbeat arrival estimation approach combines Chen’s estimation with a dynamic estimation based on Jacobson’s estimation [32]. The latter is used in the TCP protocol to estimate the delay after which a node retransmits its last message. Basically, the estimation of the next heartbeat arrival is calculated by adding Chen’s estimation to a safety margin given by Jacobson’s algorithm. Their approach provides a shorter detection time, but generates more false suspicions than Chen’s estimation, according to the authors’ measurements on a LAN.

The ϕ Accrual FD [24] is based on the estimation of inter-arrival times, assuming that the latter follow a normal distribution. The Accrual FD dynamically adapts to current network conditions based on the suspicion level. Similarly to the FDs of [3] and [7], the estimation protocol samples the arrival time of heartbeats and maintains a sliding window of the most recent samples. The distribution of past samples is then used as an approximation for the probabilistic distribution of future heartbeat messages.
With this information, it is possible to compute a value ϕ with a scale that changes dynamically to match recent network conditions.

In [27], the authors extended the Accrual FD by exploiting histogram density estimation. Taking into account a sampled inter-arrival time and the time of the last received heartbeat, the algorithm estimates the probability that no further heartbeat messages will arrive from a given process, i.e. that it has failed.

The ANNFD presented in [33] is an FD based on artificial neural networks. It uses as input parameters variables collected by the Simple Network Management Protocol (SNMP) that characterize the network traffic at each time instant. After training, the neural network computes the message arrival time estimation $EA_{k+1}$, which is used to define the freshness point.

By observing the changes in the computing environment and exploiting both feedback control theory and user-defined QoS constraints, the autonomic FD (AFD) proposed in [34] dynamically configures the monitoring period and the detection timeout value. A new metric, denoted FD availability (AV), is also defined. It suggests a safety margin (α) in such a way as to decrease FD mistakes and to achieve the desired detection availability. If the detection service is inaccurate (i.e. AV is low), then the safety margin is increased to improve detection accuracy; otherwise, if AV is high, then α is decreased to improve the detection speed.

Related work concerning FD implementations presents different approaches to estimate the timeout. The QoS of FDs depends on the choice of heartbeat arrival estimation strategy: a short timeout leads an FD to detect failures quickly, but may increase the number of false suspicions, consequently decreasing its accuracy. We propose a new unreliable FD whose focus is not on heartbeat arrival estimation strategies. However, implementations of the Impact FD may use different approaches to estimate the timeout.
In the case of the timer-based Impact FD implementation of Section 6 (Algorithm 2), we use the heartbeat arrival estimation proposed by Chen et al. [7]. The reason for choosing Chen’s algorithm is that it is a comparison reference for all FD performance studies. We should emphasize that, to use another one, it is only necessary to change the code of the function Timeout() (Algorithm 1) called by Algorithm 2. For Chen’s estimation algorithm, we consider the safety margin suggested by the authors, adding a dynamic increment for eventually timely links. Note that although the estimation solutions proposed by Chen’s and Accrual FDs [24, 27] have similar performance (mistake rate × detection time) over a wide-area network (the environment of our experiments), the Accrual FD estimation requires tuning of the threshold parameter for each process and depends on application characteristics. It is also important to point out that the Bertier, AFD, and ANNFD estimations were designed for local area networks, where messages are rarely lost, while the 2W-FD [25] has been tailored for unstable network scenarios such as latency jitter or switch contention.

9. CONCLUSION AND FUTURE WORK

This paper introduced the Impact FD, which provides an output that expresses the trust of the FD with regard to the system (or set of processes) as a whole. It is configured by the impact factor and the threshold, which enable the user to define the importance (e.g. degree of reliability) of each node and an acceptable margin of failures, respectively. It is thus suitable for environments where there is node redundancy or nodes with different capabilities. Both the impact factor and the threshold render the estimation of the confidence in the system (or a set of processes S) more flexible. In some scenarios, the failure of low impact or redundant nodes does not jeopardize the confidence in S, while the crash of a high impact factor one may seriously affect it.
Either a softer or a stricter monitoring is, therefore, possible. We have defined two properties, PR(IT)pS and PR(⋄IT)pS, which denote the capacity of the Impact FD to accept different sets of trusted processes that lead to the confidence in S. Then, we presented a timer-based implementation of the Impact FD, which can be applied to systems whose links are lossy asynchronous or to those in which all (or some) links are ♢-timely.

Performance evaluation results, based on real PlanetLab traces, showed that the assignment of a high (resp. low) impact factor to more stable (resp. unstable) nodes increases the Query Accuracy Probability of the FD. Furthermore, we observed that the Impact FD may reduce the rate of false suspicions when compared with the traditional Chen’s unreliable FD. Additionally, in the experiments carried out considering a W-ET system, it was observed that the number of mistakes reduces drastically when compared with the AS system, while the detection time does not increase at the same rate. Therefore, such results confirm the degree of flexible applicability of the Impact FD, that both failures and false suspicions are more tolerated than in traditional FDs, and that the former presents better QoS than the latter if the application is interested in the degree of confidence in the system (trust level) as a whole.

In the near future, we intend to generalize the trust level calculation as well as its comparison with the threshold. To this end, the Trust_level(trusted, S*) function can perform an operation over the impact factors of the trusted processes other than the sum (e.g. multiplication, average, etc.) and the threshold will not necessarily be a lower bound (e.g. upper bound, equality, etc.). For instance, suppose that the impact factor of a node corresponds to the probability that it behaves maliciously. The trust level, in this case, would express the probability that all nodes of the system behave maliciously.
Thus, the trust_level sum operation would be replaced by a multiplication operation, and the result should be smaller than a reliability threshold value. Another research direction is to render the impact factor dynamic, i.e. the impact factor of a node can vary during execution, depending on the current degree of reliability of the node or its current reputation, its past history of stable/unstable periods, etc. Finally, we also aim at extending the performance experiments to other networks, such as MANETs or LANs, and at comparing the performance of the Impact FD with that of other well-known FDs.

FUNDING

This work was partially supported by grant 012909/2013-00 from the National Council for Scientific and Technological Development (CNPq).

Footnotes

1 A process is denoted correct if it does not crash during the whole execution.
2 The power set of any set S is the set of all subsets of S, including the empty set and S itself.

REFERENCES

1 Fischer, M., Lynch, N. and Paterson, M. (1985) Impossibility of distributed consensus with one faulty process. J. ACM, 32, 374–382.
2 Chandra, T. D. and Toueg, S. (1996) Unreliable failure detectors for reliable distributed systems. J. ACM, 43, 225–267.
3 Bertier, M., Marin, O. and Sens, P. (2003) Performance analysis of a hierarchical failure detector. 2003 Int. Conf. Dependable Systems and Networks (DSN), San Francisco, CA, USA, 22–25 June, pp. 635–644. IEEE Computer Society.
4 Rossetto, A., Geyer, C., Arantes, L. and Sens, P. (2015) A failure detector that gives information on the degree of confidence in the system. Symposium on Computers and Communication, Larnaca, Cyprus, 6–9 July, pp. 532–537. IEEE Computer Society.
5 Aguilera, M., Delporte-Gallet, C., Fauconnier, H. and Toueg, S. (2004) Communication-efficient leader election and consensus with limited link synchrony. Proc. 23rd Annual ACM Symposium on Principles of Distributed Computing, PODC, St.
John’s, Newfoundland, Canada, 25–28 July, pp. 328–337. ACM. 6 Junqueira , J. , Marzullo , K. , Herlihy , M. and Penso , L. ( 2010 ) Threshold protocols in survivor set systems . Distrib. Comput. , 23 , 135 – 149 . Google Scholar Crossref Search ADS 7 Chen , W. , Toueg , S. and Aguilera , M. ( 2002 ) On the quality of service of failure detectors . IEEE Trans. Comput. , 51 , 561 – 580 . Google Scholar Crossref Search ADS 8 PlanetLab ( 2014 ). Planetlab. http://www.planet-lab.org. “Online. Access date: September 16, 2016”. 9 Ishibashi , K. and Yano , M. ( 2005 ) A proposal of forwarding method for urgent messages on an ubiquitous wireless sensor network. 6th Asia-Pacific Symposium on Information and Telecommunication Technologies, Yangon, Myanmar, 9–10 Nov, pp. 293–298. IEEE. 10 Geeta , D. , Nalini , N. and Biradar , R. ( 2013 ) Fault tolerance in wireless sensor network using hand-off and dynamic power adjustment approach . J. Netw. Computer Appl. , 36 , 1174 – 1185 . Google Scholar Crossref Search ADS 11 Rehman , A. , Abbasi , A. , Islam , N. and Shaikh , Z. ( 2014 ) A review of wireless sensors and networks’ applications in agriculture . Comput. Stand. Interfaces , 36 , 263 – 270 . Google Scholar Crossref Search ADS 12 Hayashibara , N. , Défago , X. and Katayama , T. ( 2003 ) Two-ways adaptive failure detection with the ϕ-failure detector. Workshop on Adaptive Distributed Systems (WADiS03), Sorrento, Italy, Oct, pp. 22–27. Citeseer. 13 Bonnet , F. and Raynal , M. ( 2013 ) Anonymous asynchronous systems: the case of failure detectors . Distributed Computing , 26 , 141 – 158 . Google Scholar Crossref Search ADS 14 Arévalo , S. , Fernández Anta , A. , Imbs , D. , Jiménez , E. and Raynal , M. ( 2012 ) Failure detectors in homonymous distributed systems (with an application to consensus). 2012 IEEE 32nd Int. Conf. Distributed Computing Systems, Macau, China, 18–21 June, pp. 275–284. IEEE Computer Society. 15 Larrea , M. , Anta , A. F. and Arévalo , S. 
( 2013 ) Implementing the weakest failure detector for solving the consensus problem . IJPEDS , 28 , 537 – 555 . 16 Aguilera , M. K. , Delporte-Gallet , C. , Fauconnier , H. and Toueg , S. ( 2003 ) On implementing omega with weak reliability and synchrony assumptions. Proc. 22nd ACM Symposium on Principles of Distributed Computing PODC, Boston, Massachusetts, USA, July 13–16, pp. 306–314. ACM. 17 Mostéfaoui , A. , Mourgaya , E. and Raynal , M. ( 2003 ) Asynchronous implementation of failure detectors. Int. Conf. Dependable Systems and Networks (DSN), San Francisco, CA, USA, 22–25 June, pp. 351–360. IEEE Computer Society. 18 Arantes , L. , Greve , F. , Sens , P. and Simon , V. ( 2013 ) Eventual leader election in evolving mobile networks. 17th Int. Conf. Principles of Distributed Systems, OPODIS, Nice, France, 16–18 December, pp. 23–37. Springer. 19 Gómez-Calzado , C. , Lafuente , A. , Larrea , M. and Raynal , M. ( 2013 ) Fault-tolerant leader election in mobile dynamic distributed systems. IEEE 19th Pacific Rim Int. Symposium on Dependable Computing, PRDC, Vancouver, BC, Canada, 2–4 December, pp. 78–87. IEEE Computer Society. 20 Larrea , M. , Fernández , A. and Arévalo , S. ( 2004 ) On the implementation of unreliable failure detectors in partially synchronous systems . IEEE Trans. Comput. , 53 , 815 – 828 . Google Scholar Crossref Search ADS 21 Delporte-Gallet , C. , Fauconnier , H. , Guerraoui , R. and Kouznetsov , P. ( 2005 ) Mutual exclusion in asynchronous systems with failure detectors . J. Parallel Distrib. Comput. , 65 , 492 – 505 . Google Scholar Crossref Search ADS 22 Bonnet , F. and Raynal , M. ( 2011 ) On the road to the weakest failure detector for k-set agreement in message-passing systems . Theor. Comput. Sci. , 412 , 4273 – 4284 . Google Scholar Crossref Search ADS 23 Delporte-Gallet , C. , Fauconnier , H. , Guerraoui , R. , Hadzilacos , V. , Kouznetsov , P. and Toueg , S. 
( 2004 ) The weakest failure detectors to solve certain fundamental problems in distributed computing. Proc. 23rd Annual ACM Symposium on Principles of Distributed Computing, PODC, St. John’s, Newfoundland, Canada, 25–28 July, pp. 338–346. ACM. 24 Hayashibara , N. , Defago , X. , Yared , R. and Katayama , T. ( 2004 ) The φ accrual failure detector. 23rd Int. Symposium on Reliable Distributed Systems SRDS, Florianopolis, Brazil, 18–20 October, pp. 66–78. IEEE Computer Society. 25 Tomsic , A. , Sens , P. , Garcia , J. , Arantes , L. and Sopena , J. ( 2015 ) 2w-fd: A failure detector algorithm with qos. IEEE Int. Parallel and Distributed Processing Symposium (IPDPS), Hyderabad, India, 25–29 May, pp. 885–893. IEEE. 26 Xiong , N. , Vasilakos , A. V. , Wu , J. , Yang , Y. R. , Rindos , A. , Zhou , Y. , Song , W.-Z. and Pan , Y. ( 2012 ) A self-tuning failure detection scheme for cloud computing service. 26th International Parallel & Distributed Processing Symposium (IPDPS), Shanghai, China, 21–25 May, pp. 668–679. IEEE. 27 Satzger , B. , Pietzowski , A. , Trumler , W. and Ungerer , T. ( 2007 ) A new adaptive accrual failure detector for dependable distributed systems. ACM Symposium on Applied Computing (SAC), Seoul, Korea, 11–15 March, pp. 551–555. ACM. 28 Défago , X. , Urbán , P. , Hayashibara , N. and Katayama , T. ( 2005 ) Definition and specification of accrual failure detectors. Int. Conf. Dependable Systems and Networks (DSN), Yokohama, Japan, 28 June–1 July, pp. 206–215. IEEE Computer Society. 29 Leners , J. B. , Gupta , T. , Aguilera , M. K. and Walfish , M. ( 2013 ) I mproving availability in distributed systems with failure informers. 10th USENIX Symposium on Networked Systems Design and Implementation, NSDI, Lombard, IL, USA, 2–5 April, pp. 427–441. USENIX Association. 30 Brun , Y. , Edwards , G. , Bang , J. Y. and Medvidovic , N. ( 2011 ) Smart redundancy for distributed computation. Int. Conf. 
Distributed Computing Systems, ICDCS, Minneapolis, Minnesota, USA, 20–24 June, pp. 665–676. IEEE Computer Society. 31 Véron , M. , Marin , O. , Monnet , S. and Sens , P. ( 2015 ) Repfd-using reputation systems to detect failures in large dynamic networks. 44th Int. Conf. Parallel Processing, ICPP, Beijing, China, 1–4 September, pp. 91–100. IEEE Computer Society. 32 Jacobson , V. ( 1988 ) Congestion avoidance and control. Symposium Proc. Communications Architectures and Protocols, SIGCOMM, Stanford, California, USA, 16–18 August, pp. 314–329. ACM. 33 Macêdo , R. A. and Lima , F. R. L. ( 2004 ) Improving the quality of service of failure detectors with snmp and artificial neural networks. Simpósio Brasileiro de Redes de Computadores, SBRC, Gramado - RS, Brazil, 10–14 May, pp. 583–586. SBC. 34 de Sá , A. S. and Macêdo , R. J. A. ( 2010 ) Qos self-configuring failure detectors for distributed systems. IFIP Int. Conf. Distributed Applications and Interoperable Systems, Amsterdam, The Netherlands, 7–9 June, pp. 126–140. Springer Berlin Heidelberg. © The British Computer Society 2018. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model) http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png The Computer Journal Oxford University Press

Publisher
Oxford University Press
Copyright
© The British Computer Society 2018. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com
ISSN
0010-4620
eISSN
1460-2067
D.O.I.
10.1093/comjnl/bxy041

Abstract

This paper presents a new unreliable failure detector, called the Impact failure detector (FD), that, contrary to the majority of traditional FDs, outputs a trust level value which expresses the degree of confidence in the system. An impact factor is assigned to each process and the trust level is equal to the sum of the impact factors of the processes not suspected of failure. Moreover, a threshold parameter defines a lower bound value for the trust level, at or above which the confidence in the system is ensured. In particular, we define a flexibility property that denotes the capacity of the Impact FD to tolerate a certain margin of failures or false suspicions, i.e. its capacity to consider different sets of responses that lead the system to trusted states. The Impact FD is suitable for systems that present node redundancy, heterogeneity of nodes and clustering features, and that allow a margin of failures which does not degrade the confidence in the system. The paper also includes a timer-based distributed algorithm which implements an Impact FD, as well as its proof of correctness, for systems whose links are lossy asynchronous or for those in which all (or some) links are eventually timely. Performance evaluation results, based on PlanetLab (Planetlab. http://www.planet-lab.org. ‘Online. Access date: 16 September 2016’) traces, confirm the degree of flexible applicability of our FD and that, due to the accepted margin of failure, both failures and false suspicions are more tolerated when compared to traditional unreliable FDs.

1. INTRODUCTION

In distributed systems, failures can occur and their detection is a crucial task in the design of fault tolerant distributed systems or applications. On the other hand, in asynchronous systems (AS), there are no bounds on message transmission delays or on process speed.
Therefore, detection of crashed processes is particularly difficult in those systems, since it is impossible to know whether a process has really failed or whether it and/or the network communication are just slow. Due to this lack of delay bounds, it is well known that the consensus problem cannot be solved deterministically in an AS subject to even a single crash failure [1]. To circumvent such an impossibility and give support to the development of fault tolerant distributed systems, Chandra and Toueg proposed in [2] the unreliable failure detector (FD) abstraction. An unreliable FD can be seen as an oracle that gives (not always correct) information about process failures. Many current FDs are based on a binary model, in which monitored processes are either ‘trusted’ or ‘suspected’. Thus, most existing FDs, such as those defined in [2, 3], output the set of processes that are currently suspected to have crashed. According to the type and the quality of this information, several FD classes have been proposed.

This paper presents a new unreliable FD, denoted the Impact FD. A preliminary proposal of it was presented in [4]. Contrary to the majority of existing unreliable FDs, the Impact FD provides an output that expresses the trust of the FD with regard to the system (or set of processes) as a whole and not to each process individually. A system is considered ‘trusted’ if it behaves correctly for a specific purpose even in the face of failures, i.e. the system is able to maintain its normal functionality. The conception of the Impact FD was inspired by systems that have the following features: (1) applications that execute on them are interested in information about the reliability of the system as a whole and can tolerate a certain margin of failures.
The latter may vary depending on the environment, situation or context, such as systems that provide redundancy of software/hardware; (2) systems that organize nodes with some common characteristic in groups; (3) systems where the nodes can have different importance (relevance) or roles and, thus, their failures may have distinct impact on the system. Systems that present node redundancy, heterogeneity of nodes and clustering features, and that allow a margin of failures which does not degrade the confidence in the system, can thus benefit from the Impact FD and its configuration choices. They have motivated our work. Section 2 gives some examples of such systems and of the advantages, in these cases, of using the Impact FD instead of traditional FDs.

The Impact FD outputs a trust level related to a given set of processes S of the monitored system. We thus denote by FD ($I_p^S$) the Impact FD module of process p that monitors the processes of S. When invoked in p, the Impact FD ($I_p^S$) returns the trust_level value which expresses the confidence that p has in set S. To this end, an impact value, defined by the user, is assigned to each process of S and the trust_level is equal to the sum of the impact factors of the trusted nodes, i.e. those not suspected of failure by p. Furthermore, a threshold parameter defines a lower bound for the trust level, at or above which the confidence in S is ensured. Hence, by comparing the trust_level with the threshold, it is possible to determine whether S is currently ‘trusted’ or ‘untrusted’ by p. The impact factor indicates the relative importance of the process in the set S, while the threshold offers a degree of flexibility for failures and false suspicions, thus allowing a higher tolerance in case of instability in the system. For instance, in an unstable network, although there might be many false suspicions, depending on the value assigned to the threshold, the system might remain trustworthy [5].
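As a rough illustration of the trust level and threshold just described, the following minimal Python sketch sums the impact factors of the processes that are not suspected and compares the result with the threshold. The function names, the process identifiers and the numeric values are hypothetical and chosen only for this example; they are not taken from the paper's algorithm.

```python
from typing import Dict, Set


def trust_level(impact: Dict[str, float], suspected: Set[str]) -> float:
    """Sum the impact factors of the processes of S that are not suspected."""
    return sum(factor for pid, factor in impact.items() if pid not in suspected)


def is_trusted(impact: Dict[str, float], suspected: Set[str], threshold: float) -> bool:
    """S is trusted when the trust level reaches the threshold (a lower bound)."""
    return trust_level(impact, suspected) >= threshold


# Hypothetical set S monitored by p: q1 is more relevant than q2 and q3.
impact = {"q1": 3.0, "q2": 1.0, "q3": 1.0}
threshold = 4.0

print(trust_level(impact, set()))             # trust level with no suspicion
print(is_trusted(impact, {"q3"}, threshold))  # suspecting q3 alone is tolerated
print(is_trusted(impact, {"q1"}, threshold))  # suspecting q1 alone is not
```

With these assumed values, the same threshold tolerates the loss of a low-impact process but not of the high-impact one, which is the kind of flexibility the impact factor and the threshold are meant to provide.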
We should also point out that the Impact FD configuration allows nodes of S to be grouped into subsets, and threshold values can be defined for each of these subsets. In addition, as with traditional FDs, several classes of Impact FDs can be defined depending on their capability of suspecting faulty processes (completeness property) and of not suspecting correct processes (accuracy property).

Arguing that traditional approaches which assume a maximum number of failures f may lead to suboptimal solutions, such as in replication protocols where the number of replicas depends on f, Junqueira et al. proposed in [6] the survivor set approach, i.e. the unique collection of minimal sets of correct processes over all executions, each set containing all correct processes of some execution. The principle of the Impact FD also follows the authors’ argument: the threshold expresses a certain margin of failures or false suspicions, and the number of failures tolerated by the system is not necessarily fixed but depends on sets of correct processes, their respective impact factors and threshold values. Therefore, the Impact FD presents what we call the flexibility property, which expresses its capacity to consider different sets of responses that lead S to trusted states. In this context, we also define in this work two properties, $PR(IT)_p^S$ and $PR(\Diamond IT)_p^S$, which characterize the minimum necessary stability condition of S that ensures p's confidence (or eventual confidence) in S. In other words, if $PR(IT)_p^S$ (resp., $PR(\Diamond IT)_p^S$) holds, the system S is always (resp., eventually always) trusted by the monitor process p. Note that the Impact FD threshold/impact factor approach is strictly more powerful than the maximum number of failures f approach, since the latter can be expressed with the former but not the other way around.

We also present in this paper a timer-based distributed algorithm (and its proof of correctness) which implements an Impact FD.
It uses the algorithm proposed in [7] to estimate heartbeat message arrivals from monitored processes. The implementation can be applied to systems whose links are lossy asynchronous or to those in which all (or some) links eventually exhibit a bounded synchronous behavior (♢-timely) [5]. Then, based on real trace files collected from nodes of PlanetLab [8], we conducted extensive experiments in order to evaluate the Impact FD. These trace files contain a large amount of data related to the sending and reception of heartbeat messages, including periods of link and message instability, and thus characterize distributed systems that use heartbeat-based FDs. The testbed of the experiments comprises various configurations with different threshold values, impact factors of nodes and types of links. For the sake of evaluation, we used three of the QoS metrics proposed in [7]: detection time, average mistake rate and query accuracy probability. The Impact FD implementation was also compared to a traditional timer-based FD that outputs information about failure suspicions of each monitored process. Performance evaluation results confirm the degree of flexible applicability of the Impact FD, that both failures and false suspicions are more tolerated than in traditional FDs, and that the former presents better QoS than the latter if the application is interested in the degree of confidence in the system (trust level) as a whole.

The rest of this paper is structured as follows. Section 2 describes some distributed systems for which the Impact FD is suitable. Section 3 outlines some basic concepts of unreliable FDs and Section 4 describes our system models. Section 5 presents the Impact FD, its characteristics and some of its properties, while in Section 6 we propose a timer-based algorithm that implements the Impact FD considering different systems, defined by the type of their links. The section also includes the proof of correctness of the algorithm.
Section 7 presents a set of evaluation results obtained from experiments conducted with real traces on PlanetLab [8]. Section 8 discusses some existing related work. Finally, Section 9 concludes the paper and outlines some of our future research directions.

2. MOTIVATION SCENARIOS

Our proposed approach can be applied to different distributed scenarios and is flexible enough to meet different needs. It is well suited to environments where there is node redundancy or where nodes have different capabilities. We should point out that both the impact factor and the threshold render the estimation of the confidence in S more flexible. Hence, there might be a situation where some processes in S are faulty or suspected of being faulty but S is still considered to be trusted. Furthermore, the Impact FD can easily be configured and adapted to the needs of the application or system requirements. For instance, the application may require a stricter monitoring of nodes during the night than during the day. For this kind of adaptation, it is only necessary to adjust the threshold. The following examples show some scenarios to which the Impact FD can be applied:

Scenario 1: Ubiquitous Wireless Sensor Networks (WSNs) are usually deployed to monitor physical conditions in various places such as geographical regions, agricultural land and battlefields. In WSNs, there is a wide range of sensor nodes with different battery resources and communication or computation capabilities [9]. However, these sensors are prone to failures (e.g. battery failure, process failure, transceiver failure, etc.) [10]. Hence, it is necessary to provide failure detection and adaptation strategies to ensure that the failure of sensor nodes does not affect the overall task of the network. The redundant use of sensor nodes, reorganization of the sensor network and overlapping sensing regions are some of the techniques used to increase the fault tolerance and reliability of the network [11].
Let us take as an example a ubiquitous WSN which is used to collect environmental data from within a vineyard and is divided into management zones in accordance with different characteristics (e.g. soil properties). Each zone comprises sensors of different types (e.g. humidity control, temperature control, etc.) and the density of the sensors depends on the characteristics of each zone. That is, the number of sensors can be different for each type of sensor within a given zone. Furthermore, the redundancy of the sensors ensures both area coverage and connectivity in case of failure. Each management zone can thus be viewed as a single set which has sensors of the same type grouped into subsets. This grouping approach allows a threshold to be defined as being equal to the minimum number of sensors that each subset must have to keep the connectivity and the application functioning all the time. Moreover, in some situations, there might be a need to dynamically reconfigure the density of the zones. In this case, the threshold value would change.

Scenario 2: In large-scale WSN environments, grouping sensor nodes into clusters has been widely adopted with the aim of improving overall system scalability and reducing the consumption of resources such as battery power and bandwidth. Each $cluster_i$ is composed of a node, denoted cluster head (CH), which performs special tasks (e.g. routing, fusion, aggregation of messages, etc.), and several other sensor nodes (SN). The latter periodically transmit their data to the corresponding CH node, which aggregates and transmits them to the base station (BS) either directly or through intermediate communication with other CH nodes. In this scenario, the concept of Impact FD can be applied by considering each $cluster_i$ as a subset of the system S whose size is initially $n_i$.
When defining the impact factor for the processes of $cluster_i$, two issues should be considered: (i) the failure of the CH implies that the cluster is inaccessible, which compromises the network connectivity and leads to untrusted states of S; (ii) when the number of alive SNs drops below a threshold, additional resources must be deployed to replenish the system and maintain its population density. Taking these constraints into account, we could assign an impact factor of 1 to each SN and an impact factor of $n_i$ to the CH of $cluster_i$, and set the threshold of this cluster to $threshold_i = n_i + (n_i - f_i)$, where $f_i$ is the maximum number of SN failures of $cluster_i$. Thus, when either the CH fails or more than $f_i$ SNs fail, the trust level will be below the threshold and the BS must be warned so that it can take some decision.

Scenario 3: A third example might be a system consisting of a main server that offers a certain quality of service X (bandwidth, response time, etc.). If it fails, N backup servers can replace it, since each backup offers the same service but with an X/N quality of service. In this scenario, both the impact factor of the main server and the threshold would have the value $N \times I_{back}$, where $I_{back}$ is the impact value of each backup server, i.e. the system becomes unreliable whenever both the primary server and one or more of the N backup servers fail (or are suspected of being faulty).

The Impact FD can be applied to all the above scenarios, which have the following features: (i) the grouping of nodes that have some common characteristics into subgroups (subsets); (ii) the possibility of having nodes with different levels of relevance and (iii) the flexibility of some systems in being able to tolerate a margin of failure.

3. UNRELIABLE FDS

Proposed by Chandra and Toueg in [2], an unreliable FD can be seen as an oracle that gives (not always correct) information about process failures (either trusted or suspected). It usually provides a list of processes suspected of having crashed.
According to [12], unreliable FDs are so named because they can make mistakes (i) by erroneously suspecting a correct process¹ (false suspicion) or (ii) by not suspecting a process that has actually crashed. If the FD detects its mistake later, it corrects it. For instance, an FD can stop suspecting, at time t + 1, a process that it suspected at time t. Although an unreliable FD cannot accurately determine the real state of processes, its use increases knowledge about them and encapsulates the uncertainty of the communication delay between two processes [2].

Unreliable FDs are usually characterized by two properties: completeness and accuracy, as defined in [2]. Completeness characterizes the FD’s capability of suspecting faulty processes, while accuracy characterizes the FD’s capability of not suspecting correct processes, i.e. it restricts the mistakes that the FD can make. FDs are then classified according to two completeness properties and four accuracy properties [2]. The combination of these properties yields eight classes of FDs. This approach allows fault tolerant applications to be designed, and their correctness to be proved, based only on these properties, without having to address, for example, low-level network parameters.

In this work, we are particularly interested in the following completeness and accuracy properties:
Strong completeness: Eventually every process that crashes is permanently suspected by every correct process.
Weak completeness: Eventually every process that crashes is permanently suspected by some correct process.
Eventual strong accuracy: There is a time after which correct processes are not suspected by any correct process.
Eventual weak accuracy: There is a time after which some correct process is never suspected by any correct process.
The class of the eventually perfect ♢P (resp., eventually strong ♢S) FDs satisfies the strong completeness and the eventual strong (resp., eventual weak) accuracy properties; the class of eventually weak FDs (♢W) satisfies the weak completeness and the eventual weak accuracy properties. ♢W is the weakest class that allows consensus to be solved in an asynchronous distributed system with the additional assumption that a majority of processes are correct. Note that the type of accuracy depends on the synchrony or stability of the network. For instance, an algorithm that provides eventual accuracy (strong or weak) may rely on partially synchronous systems which eventually ensure a bound on message transmission delays and process speed.

Since Chandra and Toueg’s work, numerous other FD implementations and classes have been proposed in the literature. They usually differ in the system assumptions, such as the synchrony model, the type of node (identifiable, anonymous [13], homonymous [14]), the type of link [5, 15, 16] (lossy asynchronous, reliable, timely, eventually timely, etc.), behavior properties [5, 17], the type of network (static [3, 15], dynamic [18, 19]), etc. They can also have different implementation choices (timer-based [7, 20], message pattern [17]) and performance or quality of service (QoS) requirements [7]. The type of problem can also define the properties of the FD (mutual exclusion [21], k-set agreement [22], register implementation [23], etc.).

3.1. Implementation of FDs

The literature has several proposals for implementing unreliable FDs, which usually exploit either a timer-based or a message-pattern approach. In the timer-based strategy, FD implementations make use of timers to detect failures in processes. There exist two mechanisms that can be used to implement the timer-based strategy: heartbeat and pinging. In the heartbeat mechanism, every process q periodically sends a control message (‘I am alive’ message) to process p that is responsible for monitoring q.
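As a rough illustration of the monitor side of this heartbeat mechanism, the following Python sketch keeps, for each monitored process, the time of its last ‘I am alive’ message and suspects it once a timer has expired. The class name, the method names and the fixed timeout are assumptions made for this example only; in particular, the implementation discussed in this paper estimates heartbeat arrivals with the algorithm of [7] rather than with a constant timeout. Time is passed in explicitly so that the example is deterministic.

```python
from typing import Dict, Set


class HeartbeatMonitor:
    """Monitor side (process p) of a heartbeat-based failure detector."""

    def __init__(self, monitored: Set[str], timeout: float, start: float = 0.0) -> None:
        self.timeout = timeout  # assumed constant here; real detectors usually adapt it
        self.last_heard: Dict[str, float] = {q: start for q in monitored}
        self.suspected: Set[str] = set()

    def on_heartbeat(self, q: str, now: float) -> None:
        """Record an 'I am alive' message from q and stop suspecting q."""
        self.last_heard[q] = now
        self.suspected.discard(q)

    def check(self, now: float) -> Set[str]:
        """Suspect every process whose timer has expired; return a copy of the suspects."""
        for q, heard in self.last_heard.items():
            if now - heard > self.timeout:
                self.suspected.add(q)
        return set(self.suspected)


monitor = HeartbeatMonitor({"q1", "q2"}, timeout=2.0)
monitor.on_heartbeat("q1", now=1.0)
print(monitor.check(now=2.5))        # q2 has been silent for longer than the timeout
monitor.on_heartbeat("q2", now=3.0)  # a late heartbeat removes the suspicion of q2
print(monitor.check(now=3.0))
```

The second call shows why such a detector is unreliable: q2 was suspected although it had not crashed, and the mistake is corrected as soon as a later heartbeat arrives.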
If p does not receive such a message from q after the expiration of a timer, it adds q to its list of suspected processes. If p later receives an ‘I am alive’ message from q, p then removes q from its list of suspected processes. An alternative approach uses the pinging mechanism, in which each process p periodically sends a query message ‘Are you alive?’ to another process q. Upon reception of such a message, the monitored process replies with an ‘I am alive’ message. If process p times out on process q, it adds q to its list of suspected processes. If p later receives an ‘I am alive’ message from q, p then removes q from its list of suspected processes. The heartbeat strategy has advantages over pinging, since the former sends half as many messages as pinging detectors do to provide the same detection quality. Furthermore, a heartbeat detector estimates only the transmission delay of ‘I am alive’ messages, whereas the pinging detector must estimate the transmission delay of ‘Are you alive?’ messages, the reaction delay, and the transmission delay of ‘I am alive’ messages.

The message-pattern strategy does not use any timeout mechanism. In [17], the authors propose an implementation that uses a request-response mechanism. A process p sends a QUERY message to the n nodes that it monitors and then waits for responses (RESPONSE messages) from α processes (α ≤ n, traditionally α = n − f, where f is the maximum number of failures). A query issued by p ends when it has received α responses. The other responses, if any, are discarded and the respective processes are suspected of having failed. A process sends QUERY messages repeatedly as long as it has not failed. If, on the next request-response, p receives a response from a suspected process q, then p removes q from its list of suspects. This approach relies on the relative order in which messages are received, which always (or after some time) allows some nodes to communicate faster than the others.

4. SYSTEM MODELS

We consider a distributed system which consists of a finite set of processes Π = {q1,…,qn} with |Π| = n (n ≥ 2), and we assume that there is one process per node, site, or sensor. Therefore, the word process can mean a node, a sensor, or a site. Each process is uniquely identified (id | 1 ≤ id ≤ n) and identifiers are totally and consecutively ordered. Processes can fail by crashing and they do not recover. A process is considered correct if it does not fail during the whole execution. We consider the existence of some global time denoted T. A failure pattern is a function $F: T \rightarrow 2^{\Pi}$, where F(t) is the set of processes that have failed before or at time t. The function correct(F) denotes the set of correct processes, i.e. those that have never belonged to a failure pattern (F), while the function faulty(F) denotes the set of faulty processes, i.e. the complement of correct(F) with respect to Π.

A process p ∈ Π monitors a set S of processes of Π. We write $correct(F_S) = correct(F) \cap S$ and $faulty(F_S) = faulty(F) \cap S$. Every process in S is connected to p by a communication link and sends messages to it through this link. Notice that other links among processes of S can exist.

Process synchrony: We consider that each process has a local clock that can accurately measure intervals of time, but the clocks of the processes are not synchronized. Processes are synchronous, i.e. there is an upper bound on the time required to execute an instruction. For simplicity, and without loss of generality, we assume that local processing time is negligible with respect to message communication delays.

Links and type of systems: For the current implementation, we consider that links are directed (either unidirectional or bidirectional) and that there exists a link from q (∀q ∈ S) to p. Every link between p and q satisfies the following integrity property: p receives a message m from q at most once, and only if q previously sent m to p.
In other words, communication links cannot create or alter messages. Links are not assumed to be FIFO. Concerning the loss property and link synchrony, we consider the following types of links, as defined in [5]:
lossy asynchronous: A link that satisfies the integrity property and for which there exists no bound on message delay. Note that, in this case, a message m sent over the link can be lost. However, if m is not lost, it is eventually received at its destination.
(Typed) fair lossy: Assuming that each message has a type, a link is fair lossy if, whenever infinitely many messages of every type are sent, infinitely many messages of each type are received (if the receiver process is correct).
♢-timely: A link that satisfies the integrity property and the following ♢-timeliness property: there exist δ and a time t such that if q sends a message m to p at time t′ ≥ t and p is correct, then p receives m from q by time t′ + δ. The maximum message delay δ and the time t are not known. Note that messages sent before time t can be lost.

We then define the following types of system:
AS: denotes a lossy AS, with lossy asynchronous links;
F-AS: denotes a fair lossy AS, with fair lossy links;
W-ET: denotes a weak eventually timely system, i.e. a system where some links are ♢-timely while the others are lossy asynchronous;
S-ET: denotes a strong eventually timely system, i.e. a system where all links are ♢-timely;
S-ET-Π: An S-ET system such that p ∈ S, S = Π, every pair of processes in S is connected either by a pair of directed links (with opposite directions) or by bidirectional links, and all processes of Π execute the Impact FD algorithms.
W-ET-Π: A W-ET system such that p ∈ S, S = Π, every pair of processes in S is connected either by a pair of directed links (with opposite directions) or by bidirectional links, and all processes of Π execute the Impact FD algorithms.
Moreover, there exists a correct process $q_1$ in $\Pi$ such that, for every process $q_2$ in $\Pi$ with $q_1 \neq q_2$, $q_1$ is connected to $q_2$ by a ♢-timely link (similarly to the definition of ♢-source in [16]).

Note that a S-ET is also a W-ET, and a S-ET-Π (resp., W-ET-Π) is also a S-ET (resp., W-ET). Our Impact FD implementation can be applied to all of these systems.

Figure 1 shows three types of system. The first one (i) is an AS system where all links are lossy asynchronous, while system (ii) is a W-ET where some links are ♢-timely and the others are lossy asynchronous. Finally, the last one (iii) is a W-ET-Π where site $q_1$ is a ♢-source.

Figure 1. Examples of system types.

5. IMPACT FD

The Impact FD can be defined as an unreliable FD that provides an output related to the trust level with regard to a set of processes. If the trust level provided by the detector is equal to, or greater than, a given threshold value defined by the user, the confidence in the set of processes is ensured. We can thus say that the system is trusted.

We denote by $FD(I_p^S)$ the Impact FD module of process p, where S is a set of processes of $\Pi$. When invoked in p, the Impact FD $(I_p^S)$ returns the $trust\_level_p^S$ value, which expresses the confidence that p has in set S.

5.1. Impact factor and subsets

Each process $q \in S$ has an impact factor $I_q$ ($I_q > 0$, $I_q \in \mathbb{R}$). Furthermore, set S can be partitioned into m disjoint subsets ($S = \{S_1, S_2, \ldots, S_m\}$). Notice that the grouping feature of the Impact FD allows the processes of S to be partitioned into disjoint subsets in accordance with a particular criterion. For instance, in a scenario where there are different types of sensors, those of the same type can be gathered in the same subset.
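As an illustration of the grouping feature, the following minimal Python sketch gathers processes of the same type in the same subset and checks that the result is a partition of S. The process names, the sensor types, the impact factor values, and the helper names `build_partition` and `is_partition` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def build_partition(processes):
    """Group processes by type.

    `processes` maps a process id to a (kind, impact_factor) pair.
    Returns a list of dicts {process id: impact factor}, one per kind.
    """
    groups = defaultdict(dict)
    for pid, (kind, impact) in processes.items():
        if impact <= 0:
            raise ValueError(f"impact factor of {pid} must be positive")
        groups[kind][pid] = impact
    # Sort by kind so that the order of the subsets is deterministic.
    return [groups[kind] for kind in sorted(groups)]


def is_partition(s_star, s):
    """True if the subsets are pairwise disjoint and cover S exactly."""
    seen = set()
    for subset in s_star:
        members = set(subset)
        if seen & members:
            return False  # a process appears in two subsets
        seen |= members
    return seen == set(s)


# Hypothetical scenario: two kinds of sensors with different relevance.
sensors = {
    "q1": ("humidity", 1.0),
    "q2": ("humidity", 1.0),
    "q3": ("temperature", 2.5),
    "q4": ("temperature", 2.5),
}
s_star = build_partition(sensors)
print(s_star)
print(is_partition(s_star, sensors.keys()))
```

Representing each subset as a mapping from identifier to impact factor keeps the ⟨id, I⟩ pairs together, which is one possible design choice among others.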
Let then $S^* = \{S_1^*, S_2^*, \ldots, S_m^*\}$ be the set S partitioned into m disjoint subsets, where each $S_i^*$ is a set whose elements are tuples of the form ⟨id, I⟩, where id is a process identifier and I is the value of the impact factor of the process in question. Formally, $S^*$ is a set such that

$$\forall i, j,\ i \neq j:\ S_i^* \cap S_j^* = \emptyset \quad \text{and} \quad \bigcup_{1 \leq i \leq m} \{ q \mid \langle q, \_ \rangle \in S_i^* \} = S.$$

5.2. Trust level

We denote by $trusted_p^S(t)$ the set of processes of S that are not considered faulty by p at $t \in T$. The trust level at $t \in T$ of process $p \notin F(t)$ in relation to S is denoted $trust\_level_p^{S^*}$. We then have $trust\_level_p^{S^*}(t) = Trust\_level(trusted_p^S(t), S^*)$, where the function $Trust\_level(trusted_p^S(t), S^*)$ returns, for each subset $S_i^*$, the sum of the impact factors of the elements $\langle id_q, I_q \rangle$ of $S_i^*$ such that $id_q \in trusted$:

$$Trust\_level(trusted, S^*) = \Big\{ trust\_level_i \ \Big|\ trust\_level_i = \sum_{j \in (trusted \cap S_i)} I_j,\ 1 \leq i \leq |S^*| \Big\}.$$

In other words, $trust\_level_p^{S^*}$ is a set that contains the trust level of each subset of $S^*$, expressing the confidence that p has in the processes of S. Note that if all processes of $S_i^*$ have failed, $trust\_level_i = 0$.

5.3. Margin of failures

An acceptable margin of failures, denoted $threshold^{S^*}$, characterizes the acceptable degree of failure flexibility in relation to set $S^*$. The $threshold^{S^*}$ is adjusted to the minimum trust level required for each subset, i.e. it is defined as a set which contains the respective threshold of each subset of $S^*$:

$$threshold^{S^*} = \{threshold_1, \ldots, threshold_m\}.$$

The $threshold^{S^*}$ is used by p to check the confidence in the processes of S. If, for each subset of $S^*$, $trust\_level_i(t) \geq threshold_i$, S is considered to be trusted at t by p, i.e. the confidence of p in S has not been jeopardized; otherwise S is considered untrusted by p at t.

Three points should be highlighted: (i) both the impact factor and $threshold^{S^*}$ render the estimation of the confidence in S flexible.
For instance, it is possible that some processes in S might be faulty or suspected of being faulty while S is still trusted; (ii) the Impact FD can be easily configured to adapt to the needs of the environment; (iii) the $threshold^{S^*}$ can be tuned to provide a stricter or softer monitoring.

Note that the Impact FD can also be applied when the application needs individual information about each process of S. In this case, each process must be defined as a different subset of $S^*$.

5.4. Examples

Table 1 shows several examples of sets and their respective thresholds. In the first example (a), there is just one subset with three processes. Each process has an impact factor equal to 1 and the threshold defines that the sum of the impact factors of nonfaulty processes must be at least 2, i.e. the system is considered trusted whenever there are two or more correct processes. Example (b) shows a configuration where processes must be monitored individually. Each process is the only element of a subset and the threshold defines that if any of the processes fails, the system is not trusted anymore. In the third example (c), S has two subsets with three processes each. The threshold requires at least two correct processes in each subset. The last example (d) has a single subset with five processes with different impact factors. The threshold defines that the set is trusted whenever the sum of the impact factors of correct processes is at least seven.

Table 1. Examples of sets and threshold.

| Example | S* | threshold S* |
|---|---|---|
| a | {{⟨q1,1⟩,⟨q2,1⟩,⟨q3,1⟩}} | {2} |
| b | {{⟨q1,1⟩},{⟨q2,1⟩},{⟨q3,1⟩}} | {1,1,1} |
| c | {{⟨q1,1⟩,⟨q2,1⟩,⟨q3,1⟩},{⟨q4,2⟩,⟨q5,2⟩,⟨q6,2⟩}} | {2,4} |
| d | {{⟨q1,1⟩,⟨q2,1⟩,⟨q3,1⟩,⟨q4,5⟩,⟨q5,5⟩}} | {7} |

In Table 2, we consider a set $S^*$ composed of three subsets, $S_1^*$, $S_2^*$, and $S_3^*$ (S* = {{⟨q1,1⟩,⟨q2,1⟩,⟨q3,1⟩},{⟨q4,2⟩,⟨q5,2⟩,⟨q6,2⟩},{⟨q7,3⟩,⟨q8,3⟩,⟨q9,3⟩}}). The values of $threshold^{S^*} = \{1, 4, 6\}$ define that the subset $S_1^*$ (resp., $S_2^*$ and $S_3^*$) must have at least one (resp., two) correct process. The table shows several possible outputs for $FD(I_p^S)$ depending on process failures: the set $S^*$ is considered trusted at t if, for each subset $S_i^*$, $trust\_level_i(t) \geq threshold_i$.

Table 2. Example of $FD(I_p^S)$ output: S* has three subsets, with S* = {{⟨q1,1⟩,⟨q2,1⟩,⟨q3,1⟩},{⟨q4,2⟩,⟨q5,2⟩,⟨q6,2⟩},{⟨q7,3⟩,⟨q8,3⟩,⟨q9,3⟩}} and threshold S* = {1,4,6}.

| t | F(t) | trusted(t) | trust_level(t) | Status at t |
|---|---|---|---|---|
| 1 | {q2} | {q1,q3,q4,q5,q6,q7,q8,q9} | {2,6,9} | Trusted |
| 2 | {q1,q2,q5} | {q3,q4,q6,q7,q8,q9} | {1,4,9} | Trusted |
| 3 | {q1,q2,q5,q6} | {q3,q4,q7,q8,q9} | {1,2,9} | Untrusted |

5.5.
Flexibility of the Impact FD

The flexibility of the Impact FD characterizes its capability of accepting different sets of responses that lead to a trusted state of S. We define $P^S$ as the set that contains all possible subsets of processes which satisfy a defined threshold:

$$P^S = \times_{i=1}^{m} PowerSet(S_i^*, threshold_i)$$

where × denotes the Cartesian product of several sets. Initially, the PowerSet function generates the power set of each subset $S_i^*$ of $S^*$. Then, only the subsets of $S_i^*$ whose impact factors sum to at least $threshold_i$ are selected. That is, the output is the set of possible trusted sets that satisfy the threshold for each subset $S_i^*$. Following this, the Cartesian product is applied to generate all possible combinations, i.e. all the generated subsets of processes satisfy $threshold^{S^*}$.

Let us consider the following example:

S* = {{⟨q1,1⟩,⟨q2,1⟩},{⟨q3,1⟩,⟨q4,1⟩},{⟨q5,1⟩,⟨q6,1⟩}}
threshold S* = {1,1,1}
PowerSet(S1*, threshold1) = {{q1},{q2},{q1,q2}}
PowerSet(S2*, threshold2) = {{q3},{q4},{q3,q4}}
PowerSet(S3*, threshold3) = {{q5},{q6},{q5,q6}}
PS = PowerSet(S1*, threshold1) × PowerSet(S2*, threshold2) × PowerSet(S3*, threshold3)
PS = {{q1,q3,q5},{q1,q3,q6},{q1,q3,q5,q6},{q1,q4,q5},{q1,q4,q6},{q1,q4,q5,q6},{q1,q3,q4,q5},{q1,q3,q4,q6},{q1,q3,q4,q5,q6},…}

For instance, if $trusted_p^S(t_1) = \{q_1, q_3, q_5\}$ and $trusted_p^S(t_2) = \{q_1, q_3, q_4, q_6\}$, then both $trusted_p^S(t_1)$ and $trusted_p^S(t_2)$ belong to $P^S$ and, therefore, p considers that the system S is trusted at both $t_1$ and $t_2$.

We now define two properties, $PR(IT)_p^S$ and $PR(\diamond IT)_p^S$, that characterize the stability condition that ensures the confidence (or eventual confidence) of p in S.

Impact threshold property $PR(IT)_p^S$: For a FD of a correct process p, the set $trusted_p^S$ always belongs to $P^S$.

$$PR(IT)_p^S \equiv p \in correct(F),\ \forall t \geq 0:\ trusted_p^S(t) \in P^S$$

Eventual impact threshold property $PR(\diamond IT)_p^S$: For a FD of a correct process p, there is a time after which the set $trusted_p^S$ always belongs to $P^S$.
$$PR(\diamond IT)_p^S \equiv \exists t \in T,\ p \in correct(F),\ \forall t' \geq t:\ trusted_p^S(t') \in P^S$$

If $PR(IT)_p^S$ (resp., $PR(\diamond IT)_p^S$) holds, the system S is always (resp., eventually always) trusted by p.

5.6. Classes of Impact FD

Similarly to the completeness and accuracy properties defined in [2] (see Section 3), we define the following properties for the Impact FD:

• $Impact\ completeness_p^S$: For a FD of a correct process p, there is a time after which p does not trust any crashed process of S:
$$\exists t \in T,\ p \in correct(F),\ \forall q \in faulty(F_S):\ \forall t' \geq t,\ q \notin trusted_p^S(t')$$
• $Impact\ weak\ completeness_p^S$: There is a time after which some correct process p does not trust any crashed process of S:
$$\exists t \in T,\ \exists p \in correct(F),\ \forall q \in faulty(F_S):\ \forall t' \geq t,\ q \notin trusted_p^S(t')$$
• $Eventual\ impact\ strong\ accuracy_p^S$: For a FD of a correct process p, there is a time after which all correct processes of S belong to $trusted_p^S$:
$$\exists t \in T,\ \forall t' \geq t,\ p \in correct(F),\ \forall q \in correct(F_S):\ q \in trusted_p^S(t')$$
• $Eventual\ impact\ weak\ accuracy_p^S$: There is a time after which some correct process of S is trusted by every correct process:
$$\exists t \in T,\ \forall t' \geq t,\ \forall p \in correct(F),\ \exists q \in correct(F_S):\ q \in trusted_p^S(t')$$

Let us consider that $p \in S$ and $S = \Pi$. We can then define some classes of Impact FD, similarly to those defined in [2] and [23]:

• ♢IP (eventually perfect impact class): For $S = \Pi$, $\forall p \in correct(F)$, the $impact\ completeness_p^S$ and $eventual\ impact\ strong\ accuracy_p^S$ properties are satisfied;
• ♢IS (eventually strong impact class): For $S = \Pi$, $\forall p \in correct(F)$, the $impact\ completeness_p^S$ and $eventual\ impact\ weak\ accuracy_p^S$ properties are satisfied.

We point out that the trust level output of the FDs of the above classes depends on $S^*$, i.e. on the impact factors assigned to the processes as well as on how they are grouped in subsets.

6. IMPLEMENTATION OF IMPACT FD

The Impact FD can have different implementations according to the characteristics of the system: the synchronization model, whether or not the process p has knowledge about the composition of S (membership), and the type of nodes.
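Before turning to a concrete implementation, the definitions of Section 5 can be made tangible with a minimal Python sketch. It computes the per-subset trust level, checks it against the threshold, and enumerates the set $P^S$ for the small example of Section 5.5. The function names (`trust_level`, `is_trusted`, `acceptable`, `flexibility_set`) are hypothetical, and the trust level is kept as an ordered list (one entry per subset) rather than a set, which is an assumption made here so that each entry can be matched with its threshold.

```python
from itertools import chain, combinations, product


def trust_level(trusted, s_star):
    """Return, for each subset of S*, the summed impact of its trusted members."""
    return [sum(impact for q, impact in subset.items() if q in trusted)
            for subset in s_star]


def is_trusted(trusted, s_star, threshold):
    """S is trusted when every subset reaches its own threshold."""
    return all(level >= bound
               for level, bound in zip(trust_level(trusted, s_star), threshold))


def acceptable(subset, bound):
    """Subsets of one S_i* whose impact factors sum to at least `bound`."""
    ids = sorted(subset)
    candidates = chain.from_iterable(
        combinations(ids, r) for r in range(len(ids) + 1))
    return [set(c) for c in candidates if sum(subset[q] for q in c) >= bound]


def flexibility_set(s_star, threshold):
    """P^S: one acceptable choice per subset, combined by Cartesian product."""
    choices = [acceptable(s, b) for s, b in zip(s_star, threshold)]
    return [set().union(*combo) for combo in product(*choices)]


# Configuration of Table 2.
S_STAR = [{"q1": 1, "q2": 1, "q3": 1},
          {"q4": 2, "q5": 2, "q6": 2},
          {"q7": 3, "q8": 3, "q9": 3}]
THRESHOLD = [1, 4, 6]
ALL = {q for subset in S_STAR for q in subset}

for failed in ({"q2"}, {"q1", "q2", "q5"}, {"q1", "q2", "q5", "q6"}):
    trusted = ALL - failed
    status = "Trusted" if is_trusted(trusted, S_STAR, THRESHOLD) else "Untrusted"
    print(sorted(failed), trust_level(trusted, S_STAR), status)

# Example of Section 5.5: three subsets of two processes, thresholds {1,1,1}.
EXAMPLE = [{"q1": 1, "q2": 1}, {"q3": 1, "q4": 1}, {"q5": 1, "q6": 1}]
ps = flexibility_set(EXAMPLE, [1, 1, 1])
print(len(ps), {"q1", "q3", "q4", "q6"} in ps)
```

Under these assumptions the three rows printed for the Table 2 configuration carry the same trust levels and statuses as the table. Enumerating $P^S$ grows exponentially with the subset sizes, so in practice the threshold check of `is_trusted` would be the natural way to test membership.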
In this section, we present a timer-based implementation of the Impact FD (Algorithms 2 and 3).

Algorithm 1 Timeout Function.
1: function Timeout(q, η, model)
2:   if model = *-AS then      ▷ AS or F-AS system
3:     τq = β + EAq
4:   else
5:     τq = β + EAq + η
6:   end if
7:   return τq
8: end function

Algorithm 2 Timer-based Impact FD Algorithm for p.
1: Begin
   Input
2:   S*, model, η
   Init
3:   trusted = S
4:   ∀q ≠ p: reset timer[q] = Timeout(q, 0, model); η[q] = 0
   Task T1 - Upon reception of ALIVE from q
5:   if q ∉ trusted then
6:     trusted = trusted ∪ {q}
7:     if model = *-ET then      ▷ W-ET or S-ET system
8:       η[q] = η[q] + η
9:     end if
10:  end if
11:  reset timer[q] = Timeout(q, η[q], model)
   Task T2 - When timer[q] expires
12:  trusted = trusted \ {q}
13:  reset timer[q] = Timeout(q, η[q], model)
   Task T3
14:  Upon invocation of Impact() do
15:    return Trust_level(trusted, S*)
16:  end
17: End

Algorithm 3 Timer-based Impact FD Algorithm for q in S.
1: Begin
   Input
2:   p, Δ
   Task T1 - Repeat forever every Δ time unit
3:   send(ALIVE) to p
4: End

The system S consists of n processes grouped in m subsets. The monitor process p ∉ S. Our implementation (Algorithms 2 and 3) uses timers to detect failures of processes in different system models. Process q periodically sends (heartbeat) messages to process p, which is responsible for monitoring process q.
If p does not receive such a message from q before the expiration of the timer, it removes q from its list of trusted processes.

Chen's heartbeat arrival estimation: Algorithm 2 uses the algorithm proposed by [7], denoted Chen's algorithm in this work, which computes the timeout value for waiting for a heartbeat message from each monitored process. Chen's algorithm uses arrival times sampled in the recent past to compute an estimation of the arrival time of the next heartbeat. Then, the timeout value is set according to this estimation and a safety margin (β). It is recomputed at each timer expiration.

The estimation algorithm is the following: process p takes into account the z most recent heartbeat messages received from q, denoted by $y_1, y_2, \ldots, y_z$; $A_1, A_2, \ldots, A_z$ are their actual reception times according to p's local clock. When at least z messages have been received, the theoretical arrival time $EA(k+1)$ for a heartbeat from q is estimated by:

$$EA(k+1) = \frac{1}{z} \sum_{i=k-z}^{k} (A_i - \Delta_i \cdot i) + (k+1)\,\Delta_i$$

where $\Delta_i$ is the interval between the sending of two of q's heartbeats. The next timeout value, which will be set in p's timer and will expire at the next freshness point $\tau(k+1)$, is then composed of $EA(k+1)$ and the constant safety margin β:

$$\tau(k+1) = \beta + EA(k+1) \quad \text{(next freshness point)}$$

In Algorithm 2, Chen's algorithm is executed by the Timeout function (Algorithm 1), which calculates the arrival estimation of the next heartbeat for process q. Furthermore, if the system is eventually timely, a value η is added to q's timeout in order to ensure the accuracy of the Impact FD. The value η[q] is initially zero and is incremented whenever p falsely suspects q (line 8 of Algorithm 2). Such an increment ensures that, if the link is ♢-timely and stable, i.e.
the delay bound δ holds forever, the estimated heartbeat arrival time will eventually always be equal to or greater than the actual arrival time of every heartbeat; therefore, there will be no more estimation mistakes, no more false suspicions, and the accuracy property holds.

Algorithm 2 is executed by the monitor process p, while Algorithm 3 is executed by all processes of S. The following local variables are used by the algorithm:

• trusted: set of processes considered not faulty by p;
• η[]: keeps the timeout increment of each process in S;
• timer[]: is set to the timeout value at each timer expiration.

In Algorithm 2, p receives as input the set S*, the increment time η for the timeout estimation (used when false suspicions occur in W-ET or S-ET systems), and the model of the system (AS, F-AS, W-ET or S-ET). Note that by receiving S*, the algorithm knows S, the impact factor of all processes of S, the number of subgroups m, and how processes are grouped.

At initialization, trusted is equal to the set of processes S (line 3). Then, for each process q in S (q ≠ p), p initializes the timer that will control the arrival of heartbeat messages from q (line 4).

Upon the reception of an ALIVE message from q (Task T1), q is added to the trusted set if it was suspected (line 6) and the timeout related to q is recomputed (line 11). In Task T2, q is considered faulty by p and is, therefore, removed from trusted (line 12). The timeout related to q is then recomputed (line 13). Task T3 handles the invocation of the Impact() function, which computes the trust level of each subset with respect to the processes currently trusted by p and returns it.

In Algorithm 3, every monitored process q of S periodically sends, every Δ units of time, an ALIVE message to its observer p, given as input, in order to inform the latter that it is alive (Task T1).

Note that if p ∈ S, as in S-ET-Π or W-ET-Π, all processes of Π execute the two algorithms, thus behaving as both a monitor and a monitored process.
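The interplay between the Timeout function and the three tasks of Algorithm 2 can be sketched in Python as follows. This is a simplified illustration under several assumptions that are not in the paper: time is passed in explicitly (in ms) instead of using real timers, each ALIVE message carries a sequence number `seq`, the estimator averages over the samples currently in its window (at most z), the first heartbeat is expected one interval after start, and an expired freshness point is moved forward by whole sending intervals. The class and method names are hypothetical.

```python
from collections import deque


class ArrivalEstimator:
    """Chen-style estimate of the next arrival from the last z heartbeats."""

    def __init__(self, delta, z):
        self.delta = delta             # sending interval (ms)
        self.window = deque(maxlen=z)  # (sequence number, arrival time)

    def record(self, seq, arrival):
        self.window.append((seq, arrival))

    def next_arrival(self):
        # EA(k+1): mean of (A_i - delta * i) over the window, plus (k+1) * delta.
        k = self.window[-1][0]
        shift = sum(a - self.delta * i for i, a in self.window) / len(self.window)
        return shift + (k + 1) * self.delta


class ImpactMonitor:
    """Monitor process p of Algorithm 2, driven by explicit timestamps."""

    def __init__(self, s_star, model, eta, beta, delta, z, now=0.0):
        self.s_star, self.beta, self.delta = s_star, beta, delta
        self.eta_step = eta
        self.eventually_timely = model.endswith("ET")  # W-ET or S-ET
        self.trusted = {q for subset in s_star for q in subset}
        self.eta = {q: 0.0 for q in self.trusted}
        self.estimator = {q: ArrivalEstimator(delta, z) for q in self.trusted}
        # Assumption: the first heartbeat is expected one interval after start.
        self.deadline = {q: now + delta + beta for q in self.trusted}

    def _timeout(self, q):
        # Algorithm 1: beta + EA, plus eta[q] only in eventually timely models.
        tau = self.beta + self.estimator[q].next_arrival()
        return tau + self.eta[q] if self.eventually_timely else tau

    def on_alive(self, q, seq, now):
        # Task T1: a heartbeat from a suspected process reveals a false suspicion.
        if q not in self.trusted:
            self.trusted.add(q)
            if self.eventually_timely:
                self.eta[q] += self.eta_step
        self.estimator[q].record(seq, now)
        self.deadline[q] = self._timeout(q)

    def on_tick(self, now):
        # Task T2: suspect every process whose freshness point has passed.
        for q in list(self.deadline):
            if self.deadline[q] <= now:
                self.trusted.discard(q)
                # Assumption: without a new sample, move the freshness point
                # forward by whole sending intervals until it is in the future.
                while self.deadline[q] <= now:
                    self.deadline[q] += self.delta

    def impact(self):
        # Task T3: per-subset sum of the impact factors of trusted processes.
        return [sum(i for q, i in subset.items() if q in self.trusted)
                for subset in self.s_star]


monitor = ImpactMonitor([{"q1": 1, "q2": 1}, {"q3": 2}], model="W-ET",
                        eta=10, beta=50, delta=100, z=3)
for q in ("q1", "q2", "q3"):
    monitor.on_alive(q, seq=1, now=100)
monitor.on_alive("q1", seq=2, now=200)
monitor.on_alive("q3", seq=2, now=200)
monitor.on_tick(now=260)             # q2 missed its freshness point (250)
print(monitor.impact())              # q2 is suspected
monitor.on_alive("q2", seq=2, now=270)
print(monitor.impact(), monitor.eta["q2"], monitor.deadline["q2"])
```

In this hypothetical run, the late heartbeat of q2 restores it to the trusted set and, because the model is W-ET, enlarges its future timeouts by η. With `model="AS"` the same trace would leave `eta["q2"]` at zero.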
In this case, the primitive send in line 3 of Algorithm 3 is replaced by the primitive broadcast, i.e. every process periodically sends a heartbeat to all processes of S.

6.1. Proof

In this section, we prove the correctness of some properties of Algorithms 2 and 3.

Lemma 6.1 If p is correct, Algorithms 2 and 3 satisfy the impact completeness property for p in relation to S.

Proof Let us consider that at t, $S_f = faulty(F_S)$ (i.e. all failures of processes in S have already taken place) and that all the ALIVE messages (heartbeats) sent by these faulty processes before they crashed have been delivered to p. Thus, after t, p will receive no more ALIVE messages from processes of $S_f$. Then, $\forall q \in S_f$, at the next expiration of timer[q] after t, q will be removed from trusted (line 12). Moreover, since p will receive no more ALIVE messages from q, line 6 will never be executed for q again and, thus, q will never again be included in trusted. Therefore, $\exists t' > t,\ \forall t'' \geq t',\ \forall q \in faulty(F_S): q \notin trusted_p(t'')$. □

Lemma 6.2 If S is a W-ET and p is correct, Algorithms 2 and 3 satisfy the eventual impact weak accuracy property for p in relation to S.

Proof In a W-ET system S, there exists $q \in correct(F_S)$ linked to p by a ♢-timely link. Let us denote by $T_q$ the stabilization time of the link from q to p, i.e. $\forall t \geq T_q$, if q sends a message m to p at t, then p receives m by time t + δ. Then, when q sends a message to p at $t \geq T_q$ and p receives the message at $t_1 \geq t$, two cases may happen: (i) the current timer of q expires after $t_1$ (Task T1). In this case, if q was suspected, it is added to trusted (line 6), the timeout value of q is incremented (line 8), and the timer of q is restarted; (ii) the current timer of q expires before $t_1$: p removes q from trusted (line 12) and the timer is restarted.
Since q keeps on sending ALIVE messages to p and timer[q] increases at each false suspicion of q, there exists a time $t_2 > T_q$ such that timer[q] ≥ δ. From then on, Task T2 will never again be executed by p for q and, $\forall t_3 \geq t_2$, upon every reception by p of a message from q, Task T1 will be executed for q. Therefore, q will remain forever in $trusted_p$ and $eventual\ impact\ weak\ accuracy_p^S$ is satisfied. □

Lemma 6.3 If S is a S-ET and p is correct, Algorithms 2 and 3 satisfy the eventual impact strong accuracy property for p in relation to S.

Proof In a S-ET system S, every $q \in correct(F_S)$ is linked to p by a ♢-timely link. Then, following the same proof scheme as in Lemma 6.2, every such q will remain forever in $trusted_p$ and $eventual\ impact\ strong\ accuracy_p^S$ is satisfied. □

Theorem 6.1 In W-ET-Π systems, Algorithms 2 and 3 implement a FD of class ♢IS.

Proof If the system is W-ET-Π, $S = \Pi$ and, from Lemmas 6.1 and 6.2, $\forall p \in correct(F)$, $impact\ completeness_p^{\Pi}$ and $eventual\ impact\ weak\ accuracy_p^{\Pi}$ are satisfied. Therefore, the algorithms implement a FD of class ♢IS. □

Theorem 6.2 In S-ET-Π systems, Algorithms 2 and 3 implement a FD of class ♢IP.

Proof If the system is S-ET-Π, $S = \Pi$ and, from Lemmas 6.1 and 6.3, $\forall p \in correct(F)$, $impact\ completeness_p^{\Pi}$ and $eventual\ impact\ strong\ accuracy_p^{\Pi}$ are satisfied. Therefore, the algorithms implement a FD of class ♢IP. □

Theorem 6.3 If $PR(IT)_p^S$ (resp., $PR(\diamond IT)_p^S$) holds, the system S is always (resp., eventually always) trusted by p.

Proof If $PR(IT)_p^S$ (resp., $PR(\diamond IT)_p^S$) holds, $\forall t \geq 0$ (resp., $\exists t_1, \forall t \geq t_1$), $trusted \in P^S$ and, therefore, S is trusted by p. □

7. PERFORMANCE EVALUATION

In this section, we first describe the environment in which the experiments were conducted and the QoS metrics used for evaluating the results. Then, we discuss some of the results in different systems and configurations of node sets with regard to both the impact factor and the threshold. Our goal is to evaluate the QoS of the Impact FD: how fast it detects failures and how well it avoids false suspicions.
For this purpose, we use a set of metrics that have been proposed by [7] and we compare the results of the Impact FD with an approach that monitors processes individually using Chen's FD [7]. We conducted a set of experiments, considering two different systems: (i) AS: a system where all links are lossy asynchronous; (ii) W-ET: a system where some links are ♢-timely and the others are lossy asynchronous.

7.1. Environment

Our experiments are based on real trace files, collected from 10 nodes of PlanetLab [8], as summarized in Table 3. The PlanetLab experiment started on 16 July 2014 at 15:06 UTC and ended exactly a week later. Each site sent heartbeat messages to other sites at a rate of one heartbeat every 100 ms (the sending interval). We should point out that these PlanetLab traces contain a large amount of data concerning the sending and reception of heartbeats, including unstable periods of links and message loss, which induce false suspicions. Thus, such traces are representative of distributed systems that use heartbeat-based FDs. Furthermore, since our experiments were conducted using the PlanetLab traces, all of them reproduce exactly the same scenarios of sending and receiving of heartbeats by the processes. Hence, provided that the same trace is available, the test conditions and results are reproducible.

Table 3. Sites of experiments.

| ID | Site | Location |
|---|---|---|
| 0 | planetlab1.jhu.edu | USA East Coast |
| 1 | ple4.ipv6.lip6.fr | France |
| 2 | planetlab2.csuohio.edu | USA, Ohio |
| 3 | 75-130-96-12.static.oxfr.ma.charter.com | USA, Massachusetts |
| 4 | planetlab1.cnis.nyit.edu | USA, New York |
| 5 | saturn.planetlab.carleton.ca | Canada, Ontario |
| 6 | PlanetLab-03.cs.princeton.edu | USA, New Jersey |
| 7 | prata.mimuw.edu.pl | Poland |
| 8 | planetlab3.upc.es | Spain |
| 9 | pl1.eng.monash.edu.au | Australia |

For the evaluation of the Impact FD, we defined S = {1,2,3,4,5,6,7,8,9} and Site 0 as the monitor node (p ∉ S). Table 4 gives some information about the heartbeat messages received by Site 0 (the monitor node). We observe that the mean inter-arrival time of received heartbeats is very close to 100 ms.
However, for some sites, the standard deviation is very high, as for Site 5, whose standard deviation was 310.958 ms with a minimum inter-arrival time of 0.006 ms and a maximum of 657 900.226 ms. Such a deviation probably indicates that, for a certain time interval during execution, the site stopped sending heartbeats and started again afterwards. Also note that Site 2 stopped sending messages after ~48 hours and, therefore, there are just 1 759 990 received messages.

Table 4. Sites and heartbeat sampling.

| Site | Messages | Min (ms) | Max (ms) | Mean (ms) | Standard deviation (ms) |
|---|---|---|---|---|---|
| 1 | 5 424 326 | 0.025 | 26 494.168 | 100.058 | 19.525 |
| 2 | 1 759 989 | 0.031 | 509.093 | 100.415 | 9.275 |
| 3 | 5 426 843 | 0.027 | 1 227.349 | 100.012 | 1.709 |
| 4 | 5 414 122 | 0.003 | 1 193.276 | 100.247 | 18.595 |
| 5 | 5 413 542 | 0.006 | 657 900.226 | 100.258 | 310.958 |
| 6 | 5 426 700 | 0.003 | 3 787.643 | 100.015 | 2.557 |
| 7 | 5 424 117 | 0.006 | 59 603.188 | 100.062 | 31.229 |
| 8 | 5 424 560 | 0.027 | 11 443.359 | 100.054 | 100.714 |
| 9 | 5 422 043 | 0.004 | 30 600.076 | 100.100 | 18.798 |

The implementation of the Impact FD used in our evaluation experiments is based on Algorithms 2 and 3, presented in Section 6. For the timeout value of Chen's estimation algorithm, the authors suggest that the safety margin β should range from 0 to 2500 ms. For all experiments, we set the window size to 100 samples, which means that the FD only relies on the last 100 heartbeat message samples for computing the estimation of the next heartbeat arrival time. Several works aim at improving the QoS of FDs which estimate the arrival time of the next heartbeat by varying some parameters such as the window size [24–26]. These works emphasize that Chen's FD has better performance with smaller window sizes. Based on these studies and our experiments, we used a window size of 100 samples, which allows Chen's FD to take less time to adapt to the dynamics of the network.

7.1.1. Evaluation of sites' stability

We evaluated the stability of sites, considering that the traces could correspond to either an AS system or a W-ET system.
For the first case, the value AS was assigned to the model parameter of Algorithm 2, while for the second case the same parameter was set to W-ET. Each of the sites of S is considered individually and not as a whole system. The impact factors of the sites and the threshold values are not relevant to these experiments. The β value of Chen's algorithm was set to 400 ms. We chose such a value because it is an acceptable safety margin for detection time and is not too aggressive; otherwise the FD would be prone to too many mistakes. The stability of sites and the corresponding links to the monitor were evaluated during the whole trace period for the AS system and during just the first 24 hours of the trace period for the W-ET system.

AS system: Figure 2 shows the cumulative number of mistakes, i.e. false suspicions, made by the monitor Site 0 for each site of S. We can observe that periods of site or link instability entail late arrivals or loss of heartbeats and, therefore, mistakes by the monitor site. For example, Site 9 had a large number of cumulative mistakes at hour 48. After that, there is a stable period with regard to this site. On the other hand, around this time, Site 2 stopped sending messages since it crashed and, consequently, the monitor node made no more mistakes about it after this time. Finally, we can say that, considering the whole period, Sites 3 and 6 (resp., 8 and 9) are, on average, the most stable (resp., unstable) sites.

Figure 2. AS system: cumulative number of mistakes of each site.

W-ET system: In Algorithm 2 (Task T1), when the system is W-ET, Chen's heartbeat arrival estimation value is incremented by η whenever a false suspicion occurs.
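The effect of this increment on the number of mistakes can be illustrated with a deliberately simplified Python sketch. It replaces Chen's estimation by a fixed base timeout applied to inter-arrival gaps, which is an assumption made only for illustration; the function name `count_mistakes` and the trace values are hypothetical.

```python
def count_mistakes(gaps, base_timeout, eta=0.0):
    """Count false suspicions over a sequence of inter-arrival gaps (ms).

    A mistake is counted when a heartbeat arrives later than the current
    timeout. With eta > 0 (eventually timely model) the timeout grows by
    eta after each mistake; with eta = 0 it stays fixed (AS model).
    """
    extra = 0.0
    mistakes = 0
    for gap in gaps:
        if gap > base_timeout + extra:
            mistakes += 1
            extra += eta
    return mistakes


# Hypothetical trace: every other heartbeat is delayed by 30 ms.
gaps = [100, 130] * 10
print(count_mistakes(gaps, base_timeout=120))         # AS model: 10 mistakes
print(count_mistakes(gaps, base_timeout=120, eta=5))  # W-ET model: 2 mistakes
```

In this toy trace the delay is bounded, so once the accumulated increment covers the bound the monitor stops making mistakes, whereas the fixed timeout keeps producing one mistake per delayed heartbeat.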
However, in order to prevent this estimation from increasing too fast when there is a period of high instability, which could increase the detection time considerably, we considered that the value of the timer (line 8) will be incremented by η at every μ heartbeat arrivals, provided that, during the period of these μ heartbeat arrivals, one or more false suspicions took place. For the experiment, we set μ = 10 and η = 1 ms. Note that when the heartbeat arrival estimation reaches a value which is greater than the transmission delay limit for links with ♢-timely behavior, the monitor site does not make any more mistakes for the related sites. Moreover, for unstable sites, as the heartbeat arrival estimation value will also be incremented by η in case of false suspicions, such an increment will be responsible for decreasing the number of mistakes for these sites when compared to an AS system. However, such a reduction induces a higher false suspicion detection time.

Figure 3 shows the cumulative number of mistakes that the monitor process made for each site in the first 24 hours of the traces. We can observe that there are links which behave as ♢-timely while the others are lossy asynchronous. The FD did not make mistakes related to Site 4. For Sites 2 and 3, it made only 1 and 2 mistakes, respectively, while for Site 6 it made 99 mistakes during the first hour, and then no more mistakes. Although some sites have had some periods of stability (1, 5, 8 and 9), Site 0 made mistakes related to them until almost the end of the execution. On the other hand, it made no mistakes for Site 7 after hour 9. In summary, we can consider that Site 0, the monitor site, is connected by ♢-timely links to Sites 2, 3, 4 and 6, and by lossy asynchronous links to Sites 1, 5, 7, 8 and 9.

Figure 3. W-ET system: cumulative number of mistakes of each site.

7.1.2.
Evaluation of heartbeat arrival times

The goal of this section is to show the behavior of the arrival times when the timer expires and the FD has not received the heartbeat message. For the first 24 hours, we evaluated the behavior of three arrival times at Site 0 related to heartbeat messages of Site 1, with two different values of β (100 and 400 ms). We chose Site 1 because it has many periods of instability. We consider that Sites 1 and 0 are alternately connected by lossy asynchronous or ♢-timely links. We evaluated three arrival times: (i) the actual arrival of the heartbeat; (ii) the estimated arrival time considering that the link is lossy asynchronous; (iii) the estimated arrival time considering that the link is ♢-timely. In order to compute the latter, we set η = 1 ms and the number of heartbeats before incrementing the heartbeat arrival estimation value, in case of false suspicions, to 100 (μ = 100).

Figures 4 and 5 show, for Site 1, the time difference between each of these three arrival times and the arrival time of the previous heartbeat: (i) the difference in milliseconds between the arrival time of the last heartbeat and the previous one (Arrival); (ii) the difference in milliseconds between the estimated arrival time ($\tau_q = \beta + EA_q$) and the arrival time of the previous heartbeat, considering the link lossy asynchronous (Estimation LA); (iii) the number of milliseconds elapsed between the estimated arrival time ($\tau_q = \beta + EA_q + \eta$) and the arrival time of the previous heartbeat, considering the link ♢-timely (Estimation ET).

Figure 4. The behavior of the arrival times when the timeout expires in Site 1 for β = 100 ms and μ = 100 for 24 hours. The points correspond to the times where mistakes took place.

Figure 5. The behavior of the arrival times when the timeout expires in Site 1 for β = 400 ms and μ = 100 for 24 hours. The points correspond to the times where mistakes took place.

Figures 4 and 5 show the behavior of the times when the timeout expires for β = 100 ms and β = 400 ms, respectively, up to hour 24. In order to simplify the figures, the points correspond only to the times where mistakes took place. Figure 5 has fewer points than Figure 4 because the number of mistakes drops considerably with a higher β value.

Figure 4 summarizes the time differences for β = 100 ms. The monitor Site 0 made 807 (resp., 592) mistakes when the link is lossy asynchronous (resp., ♢-timely). Note that at several points the estimated arrival time for the ET estimation is higher than the arrival time of the heartbeat while, in the LA estimation, the difference between them is very small (1 or 2 ms), especially from hour 6 to hour 21. Thus, both lines in the figure overlap, but the estimated arrival time is often below the actual arrival time, which explains the high number of mistakes. At hours 1, 4, 6, 21 and 23, which correspond to periods of instability, the arrival time of the heartbeat is much higher than the estimated one for the LA estimation.

In contrast to Figure 4, the number of mistakes drops to 168 and 166 for the ET and LA estimations, respectively, as shown in Figure 5. Since these numbers are almost equal, the estimated arrival times for the lossy asynchronous and ♢-timely cases are also quite close. As in Figure 4, the mistakes are concentrated in periods of great instability (hours 1, 4, 6, 21 and 23).

7.1.3.
Discussion about the choice of parameters

Below we describe the criteria used to set the parameter values β, μ and η:

β: for the estimation of the timeout value of Chen’s estimation algorithm, the authors [7] suggest that the safety margin should range from 0 to 2500 ms. The β value of Chen’s algorithm was set to 400 ms in the experiments of Sections 7.1.1, 7.1.2 and 7.3.1. We chose this value because it is an acceptable safety margin for detection time and is not too aggressive; a smaller margin would make the FD prone to too many mistakes. In the experiments of Section 7.4, we used β = 50 ms and β = 100 ms. These safety margin values are quite aggressive and, consequently, make the FD prone to mistakes. We chose these values because our aim was to check the behavior of the FD in a scenario more vulnerable to failures.

η: this is the time increment for the timeout estimation (used when false suspicions take place). We conducted experiments with two values: 500 μs and 1 ms. We defined these values taking into account that each site sends heartbeat messages to other sites at a rate of one heartbeat every 100 ms (the sending interval). A value greater than 1 ms greatly increases the detection time. On the other hand, a value smaller than 500 μs generates many mistakes. Thus, these values offer a better trade-off between detection time and accuracy of the Impact FD.

μ: we conducted experiments with different values for μ (1, 10 and 100). On the one hand, we observed that for μ = 1 a smaller number of errors occurred, but the detection time increased. On the other hand, when we used μ = 100, the number of errors increased. Considering this trade-off, we set μ = 10 in the experiment of Section 7.1.1 (Evaluation of sites’ stability).

7.2. QoS metrics

First, recall that the goal of the Impact FD is to report whether a system is ‘trusted’ or ‘untrusted’. This information can be deduced by comparing the output trust_level of the Impact FD with the threshold.
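The comparison between trust_level and the threshold can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `trust_level` and `is_trusted`, and the three-process set with impact factor 1 and threshold 2, are hypothetical and are not taken from Algorithm 2.

```python
from typing import Dict, Set


def trust_level(impact: Dict[str, int], trusted: Set[str]) -> int:
    """Sum of the impact factors of the processes not suspected of failure."""
    return sum(factor for q, factor in impact.items() if q in trusted)


def is_trusted(impact: Dict[str, int], trusted: Set[str], threshold: int) -> bool:
    """The set is 'trusted' when the trust level reaches the threshold."""
    return trust_level(impact, trusted) >= threshold


# Hypothetical set: three processes with impact factor 1, threshold 2.
impact = {"q1": 1, "q2": 1, "q3": 1}
threshold = 2

print(is_trusted(impact, {"q1", "q2", "q3"}, threshold))  # no suspicion
print(is_trusted(impact, {"q2", "q3"}, threshold))        # q1 suspected
print(is_trusted(impact, {"q3"}, threshold))              # q1 and q2 suspected
```

With these values, a single suspicion leaves the trust level at the threshold, so the set remains trusted; a second suspicion makes it untrusted.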
We say that the output of the Impact FD of p is correct if either, for each subset $S_i$ of S* (1 ≤ i ≤ m), $trust\_level_i \geq threshold_i$ and S is actually trusted, or ∃ i such that $trust\_level_i < threshold_i$ and S is actually untrusted. Otherwise, the FD made a mistake.

For evaluating the Impact FD, we used three of the QoS metrics proposed in [7]: detection time, average mistake rate, and query accuracy probability. Considering that p monitors S, the QoS of the Impact FD at p must take into account the transitions between the ‘trusted’ and ‘untrusted’ states of S.

Detection time (TD): In [7], the TD is defined as the time elapsed from the moment process q crashes until the FD at p starts suspecting q permanently. In the case of the Impact FD, the detection time (TD) of p in relation to S is the time elapsed until the monitor process reports a suspicion that leads to a status transition in S from trusted to untrusted. To this end, for each freshness point of a process q in S, it is necessary to check which process failures would lead to a state transition of S from trusted to untrusted and then compute the detection time TD for each of these processes. The latter is the time elapsed between the current freshness point ($\tau_{i+1}$) and the last heartbeat arrival ($A_i$) with respect to the previous freshness point, i.e. $\tau_{i+1} - A_i$, for each of these processes. If there is more than one process q ∈ S which could lead to the transition, i.e. $S_f = \{q \in trusted_i \mid (trust\_level_i - Impact(q)) < threshold_i\}$, the TD in relation to S is the greatest of them: $T_D = \max(\tau_{i+1} - A_i), \forall q \in S_f$.

Figure 6 shows an example where S* has just one subset with three processes whose impact factor is 1. The thresholdS defines that at least two processes must be correct. Note that at $\tau_{i+3}$, process p did not receive the heartbeat message from $q_1$ and, therefore, p removes it from its trusted set ($trusted_p = \{q_2, q_3\}$). However, S remains trusted for p because the trust level satisfies the threshold.
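The definitions of $S_f$ and TD can be sketched as follows for a single subset. The code is a minimal illustration, not the authors’ implementation: the function names and the millisecond timestamps in the usage example are hypothetical, chosen only to mirror the three-process setting of Figure 6 after $q_1$ has been suspected.

```python
from typing import Dict, Optional, Set


def critical_processes(impact: Dict[str, int], trusted: Set[str],
                       threshold: int) -> Set[str]:
    """S_f: trusted processes whose suspicion alone would make the set untrusted."""
    level = sum(impact[q] for q in trusted)
    return {q for q in trusted if level - impact[q] < threshold}


def detection_time(freshness: Dict[str, int], last_arrival: Dict[str, int],
                   impact: Dict[str, int], trusted: Set[str],
                   threshold: int) -> Optional[int]:
    """T_D = max(freshness - last arrival) over S_f, or None if S_f is empty."""
    s_f = critical_processes(impact, trusted, threshold)
    if not s_f:
        return None
    return max(freshness[q] - last_arrival[q] for q in s_f)


# Hypothetical values (ms): impact factor 1 for all, threshold 2, q1 suspected.
impact = {"q1": 1, "q2": 1, "q3": 1}
trusted = {"q2", "q3"}
freshness = {"q2": 1300, "q3": 1350}      # next freshness point of each process
last_arrival = {"q2": 1000, "q3": 1100}   # last heartbeat arrival of each process

print(critical_processes(impact, trusted, 2))
print(detection_time(freshness, last_arrival, impact, trusted, 2))
```

In this example both remaining processes are critical, so TD is the larger of the two differences (300 ms and 250 ms).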
At freshness point $\tau_{i+5}$ of the example in Figure 6, the FD verifies whether the failure of any of the processes of $trusted_p$ ($q_2$ and $q_3$) can lead to a transition of S ($trust\_level_1 < threshold_1$). For this purpose, p computes the TD for each of the two processes. The TD in relation to S is the greatest among the TD of $q_2$ and the TD of $q_3$. Since p did not receive a heartbeat from $q_2$, S becomes untrusted.

Average mistake rate (λR): represents the number of mistakes that the FD makes per unit of time, i.e. the rate at which the FD makes mistakes.

Query accuracy probability (PA): the probability that the FD output is correct at a random time.

Figure 6. Transitions between ‘trusted’ and ‘untrusted’ states for three processes with impact factor 1 within a single subset. At least two processes must be correct.

7.3. Asynchronous system

For this evaluation we consider an AS, i.e. links are lossy asynchronous. Table 5 shows the five configurations of impact factor values that have been considered for S* in the experiments. The sum of the impact factors of the processes is 90 for all configurations.

Table 5. Set configurations (S*).
| Configuration | Impact factor of each site |
|---|---|
| S*0 | {{⟨q1,7⟩,⟨q2,3⟩,⟨q3,20⟩,⟨q4,20⟩,⟨q5,3⟩,⟨q6,20⟩,⟨q7,3⟩,⟨q8,7⟩,⟨q9,7⟩}} |
| S*1 | {{⟨q1,7⟩,⟨q2,20⟩,⟨q3,20⟩,⟨q4,3⟩,⟨q5,3⟩,⟨q6,20⟩,⟨q7,3⟩,⟨q8,7⟩,⟨q9,7⟩}} |
| S*2 | {{⟨q1,20⟩,⟨q2,7⟩,⟨q3,3⟩,⟨q4,3⟩,⟨q5,7⟩,⟨q6,3⟩,⟨q7,7⟩,⟨q8,20⟩,⟨q9,20⟩}} |
| S*3 | {{⟨q1,7⟩,⟨q2,3⟩,⟨q3,20⟩,⟨q4,3⟩,⟨q5,3⟩,⟨q6,20⟩,⟨q7,7⟩,⟨q8,20⟩,⟨q9,7⟩}} |
| S*4 | {{⟨q1,10⟩,⟨q2,10⟩,⟨q3,10⟩,⟨q4,10⟩,⟨q5,10⟩,⟨q6,10⟩,⟨q7,10⟩,⟨q8,10⟩,⟨q9,10⟩}} |

7.3.1. Experiment 1: query accuracy probability

The aim of this experiment is to evaluate the Query Accuracy Probability (PA) with different threshold values (64, 70, 74, 80, and 83) and different impact factor configurations (Table 5).
The safety margin was set to 400 ms (β = 400 ms). Figure 7 shows that in most cases the PA decreases when the threshold increases. Recall that the threshold is a limit value defined by the user: if the trust level output by the FD is equal to, or greater than, the threshold, the confidence in the set of processes is ensured. Hence, the results confirm that when the threshold is lower, the Query Accuracy Probability is higher.

Figure 7. AS System: PA vs. threshold with different set configurations (S*).

On the one hand, except for threshold 83, the ‘S*0’ configuration has the highest PA for most of the thresholds, due to the assignment of high (resp., low) impact factors to the most stable (resp., unstable) sites. On the other hand, ‘S*2’ and ‘S*4’ have the lowest PA since unstable sites are assigned high impact factor values. For instance, in ‘S*2’ the high impact factor value of the unstable Sites 8 and 9, with standard deviations of 100 and 18 ms, respectively, degrades the PA of this set. ‘S*4’ shows a sharp decline of the PA curve when the threshold is 83. This behavior can be explained by the fact that, in this set configuration, all sites have the same impact factor (10), which implies that every false suspicion renders the trust_level smaller than the threshold (83), increasing the mistake duration. Therefore, the query accuracy probability decreases.

Notice that Site 2 failed after ~48 hours. Thus, after its crash, an FD output that indicates a trust_level smaller than the threshold is not a mistake, i.e. it is not a false suspicion. Hence, in ‘S*1’, where the impact factor of Site 2 is 20 (high), the PA is constant for a threshold greater than 70: after the crash of Site 2, the FD output is always smaller than the threshold and false suspicions related to other sites do not alter it.
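The arithmetic behind these observations can be checked directly against Table 5. The sketch below assumes that Site k corresponds to process q_k of Table 5 (consistent with the impact factors quoted in the text); the helper name `level_after` is hypothetical.

```python
# Impact factors from Table 5; list position k-1 holds the factor of Site k
# (assumption: Site k corresponds to q_k).
CONFIGS = {
    "S*0": [7, 3, 20, 20, 3, 20, 3, 7, 7],
    "S*1": [7, 20, 20, 3, 3, 20, 3, 7, 7],
    "S*2": [20, 7, 3, 3, 7, 3, 7, 20, 20],
    "S*3": [7, 3, 20, 3, 3, 20, 7, 20, 7],
    "S*4": [10] * 9,
}


def level_after(config: str, suspected: set) -> int:
    """Trust level of a configuration when the given sites are suspected."""
    factors = CONFIGS[config]
    return sum(f for site, f in enumerate(factors, start=1) if site not in suspected)


# 'S*4': any single suspicion gives 80, below the threshold 83.
print(level_after("S*4", {5}))
# 'S*0': a suspicion of the unstable Site 8 alone still gives 83.
print(level_after("S*0", {8}))
# 'S*1': after the crash of Site 2 the level is at most 70,
# hence below every threshold greater than 70 (74, 80, 83).
print(level_after("S*1", {2}))
```

The same helper shows why heterogeneous factors matter: in ‘S*0’ the unstable sites carry small factors, so a single false suspicion about one of them cannot push the level below 83.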
The average mistake duration for ‘S*1’ is thus smaller after the crash, which improves the PA.

Finally, we compared the PA of the Impact FD with that of an FD approach that monitors processes individually by applying Chen’s algorithm, considering the 100 most recent heartbeats (WS = 100) and β = 400 ms. For the latter, the metric is the average of the PA values of all sites of S: $\overline{P_A} = \frac{1}{n}\sum_{x=1}^{n} P_{A_x}$, for n = 9 and x equal to the index of each site in S. The obtained mean PA ($\overline{P_A}$) is equal to 0.979788. This result shows that, regardless of the set (S*) configuration, the Impact FD has a higher PA than Chen’s FD since the former has enough flexibility to tolerate failures: the mistake duration only starts to be computed when the trust_level provided by the Impact FD is smaller than the threshold, in contrast with individual monitoring, such as that of Chen’s FD, where every false suspicion increases the mistake duration.

The results of this experiment highlight the fact that the assignment of heterogeneous impact factors to nodes can degrade the performance of the FD, especially when unstable sites have a high impact factor.

7.3.2. Experiment 2: query accuracy probability vs. detection time

In the second experiment, we evaluated the average query accuracy probability (PA) with respect to the average detection time (TD) for different threshold values (64, 70, 80 and 83). In order to obtain different values for the detection time, we varied the safety margin (Chen’s estimation) in steps of 100 ms, starting at 100 ms. For this experiment, we chose the ‘S*0’ configuration since it presented the best PA in Experiment 1. We also evaluated the PA and TD for Chen’s algorithm, which outputs the set of suspected nodes. For the latter, the TD is computed as the average of the individual TD of all sites of S: $\overline{T_D} = \frac{1}{n}\sum_{x=1}^{n} T_{D_x}$.
Figure 8 shows that for a high threshold and detection time close to 200 ms, the PA of the Impact FD is quite small, independently of the threshold, because the safety margin (used to compute the expected arrival times) is, in this case, equal to 100 ms, which increases both the number of false suspicions and the mistake duration. However, when TD is greater than 230 ms, the PA of the Impact FD is considerably higher than that of Chen’s FD. After a detection time of ~400 ms, the PA of the Impact FD becomes constant regardless of the detection time and threshold, and gets close to 1. Such a behavior can be explained by the fact that the higher the safety margin, the smaller the number of false suspicions and the shorter the mistake duration. This confirms that when the timeout is short, failures are detected faster but the probability of having false detections increases [27].

Figure 8. AS System: PA vs. TD with different thresholds.

7.3.3. Experiment 3: average mistake rate

In this experiment, we evaluated the average detection time (TD) vs. the mistake rate (λR) (mistakes per second). For Chen’s algorithm, the λR is computed as the average of the individual λR of all sites of S: $\overline{\lambda_R} = \frac{1}{n}\sum_{x=1}^{n} \lambda_{R_x}$. We considered the ‘S*0’ configuration, and the mistake rate is expressed on a logarithmic scale.

We can observe in Figure 9 that the mistake rate of the Impact FD is high when the detection time is low (i.e. smaller than 400 ms) and the threshold is high (i.e. from 80 to 83). Such a result is in accordance with Experiment 2: whenever the safety margin is small and the threshold tolerates fewer failures, the Impact FD makes mistakes more frequently. In other words, the mistake rate decreases when the threshold is low or the detection time increases.

Figure 9. AS System: λR vs. TD with different thresholds.

7.3.4. Experiment 4: cumulative number of mistakes

Figure 10 shows the cumulative number of mistakes for ‘S*0’ during the whole trace period, considering β = 400 ms and a threshold value equal to either 80 or 83.

Figure 10. AS System: cumulative number of mistakes for ‘S*0’ configuration.

We can observe in the figure that the cumulative number of mistakes is greater when the threshold value is equal to 83 (2754 mistakes) than when it is equal to 80 (179 mistakes). With threshold 83, the FD makes few mistakes until approximately hour 48 (when Site 2 crashed). After that, the number of cumulative mistakes increases significantly because, since the threshold is high (83) and the failure of Site 2 was detected, false suspicions of any other site induce a trust_level value smaller than 83 in most cases. For instance, Site 8 is highly unstable and has an impact factor value of 7. Whenever there is a false suspicion about it after the crash of Site 2, the trust_level value is 80.

On the other hand, for the threshold 80, there are fewer instability periods since the crash of Site 2 does not have much impact on the confidence in the system. At hour 48, there is an increase in the cumulative number of mistakes due to the unstable period of Site 9, as shown in Figure 2. From hour 50 to 100, the FD makes fewer mistakes. Such a behavior can be explained by the fact that, as observed in the same figure, all sites, with the exception of Site 8, share this period of stability. After hour 108, there is a greater number of mistakes, which is related to the instability of Sites 1, 7 and 8 (see Figure 2).

7.3.5. Experiment 5: query accuracy probability vs. time

In this experiment, we divided the execution trace duration into fixed intervals of time and computed the average query accuracy probability (PA) for each of them.
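A per-interval PA can be sketched as below. This is only an illustration under the assumption that PA over an interval is estimated as the fraction of that interval during which the FD output was correct; the function name `windowed_pa`, the representation of mistakes as (start, end) pairs, and the numbers in the example are hypothetical.

```python
from typing import List, Tuple


def windowed_pa(mistakes: List[Tuple[int, int]], trace_end: int,
                window: int) -> List[float]:
    """PA per fixed-size window: 1 - (time spent in mistakes) / (window length).

    `mistakes` holds (start, end) intervals, in the same time unit as
    `trace_end` and `window`, during which the FD output was wrong.
    """
    result = []
    start = 0
    while start < trace_end:
        end = min(start + window, trace_end)
        # Length of the overlap between each mistake and the current window.
        wrong = sum(max(0, min(e, end) - max(s, start)) for s, e in mistakes)
        result.append(1 - wrong / (end - start))
        start = end
    return result


# Hypothetical trace of 200 time units, split into two windows of 100 units.
# The second mistake straddles the window boundary and is split between them.
pa = windowed_pa([(10, 20), (95, 105)], trace_end=200, window=100)
print(pa)
```

A mistake that spans two intervals contributes to both, which is why a long unstable period lowers the PA of several consecutive intervals.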
For this experiment, we chose the ‘S*0’ configuration, β = 400 ms, and the threshold values of 80 and 83. Similarly to the cumulative number of mistakes (Experiment 4), we observe in Figure 11 that instability periods have an impact on the PA. For instance, for the threshold 80, from hour 108 onwards the cumulative number of mistakes increases very fast. Consequently, the PA decreases. The period of instability of Site 9 is responsible for the significant reduction of the PA at hour 60 (i.e. from hour 48 to 60) when the threshold is 83. A new degradation of the PA happens at hour 120 (i.e. from hour 108 to 120), due to unstable periods of Sites 1, 7 and 8.

Figure 11. AS System: PA vs. time.

7.4. Weak ♢-timely System (W-ET)

In this section, we consider the W-ET system described in Section 7.1.1: Site 0, the monitor site, is connected by ♢-timely links to Sites 2, 3, 4 and 6 and by lossy asynchronous links to Sites 1, 5, 7, 8 and 9. We defined the set S* with three subsets in which all sites have the same impact factor (1):

S* = {{⟨q1,1⟩,⟨q3,1⟩,⟨q4,1⟩},{⟨q2,1⟩,⟨q5,1⟩,⟨q6,1⟩},{⟨q7,1⟩,⟨q8,1⟩,⟨q9,1⟩}}

The thresholdS was defined as follows:

thresholdS = {2,2,2}

The thresholdS defines that each of the subsets S1, S2 and S3 must have at least two correct processes. As this experiment assigns W-ET to the model parameter, it uses the η value: the heartbeat arrival estimation value is incremented by η at every μ heartbeat arrivals, if false suspicions occurred during this period. The experiments were carried out just for the first 24 hours of the traces, because after this time the FD does not make any more mistakes for the set S*.

7.4.1. Experiment 6: eventually timely links vs. asynchronous links

In this experiment, we compare the results obtained with the above S* configuration for both the W-ET and AS systems. The evaluation metrics are shown in Table 6. We set the value of the safety margin β to 50 ms and η to 500 μs.
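The η/μ increment rule used for the W-ET model can be sketched as follows. This is a minimal illustration of the rule as described in the text, not the code of Algorithm 1: the class name `EtaIncrement`, its methods, and the assumption that the accumulated increment is simply added to $EA_q + \beta$ are hypothetical.

```python
class EtaIncrement:
    """Adds eta to the timeout once per block of mu heartbeats, but only if
    at least one false suspicion occurred during that block (W-ET model)."""

    def __init__(self, eta_ms: float, mu: int) -> None:
        self.eta_ms = eta_ms
        self.mu = mu
        self.extra_ms = 0.0         # accumulated increment
        self._count = 0             # heartbeats seen in the current block
        self._mistake_seen = False  # false suspicion in the current block?

    def on_false_suspicion(self) -> None:
        self._mistake_seen = True

    def on_heartbeat(self) -> None:
        self._count += 1
        if self._count == self.mu:
            if self._mistake_seen:
                self.extra_ms += self.eta_ms
            self._count = 0
            self._mistake_seen = False

    def timeout(self, ea_ms: float, beta_ms: float) -> float:
        """Freshness point: estimated arrival + safety margin + increment."""
        return ea_ms + beta_ms + self.extra_ms


inc = EtaIncrement(eta_ms=0.5, mu=10)   # eta = 500 microseconds, mu = 10
for _ in range(10):                     # a block without false suspicions
    inc.on_heartbeat()
inc.on_false_suspicion()                # one mistake in the next block
inc.on_false_suspicion()                # a second one does not add more
for _ in range(10):
    inc.on_heartbeat()
print(inc.extra_ms, inc.timeout(ea_ms=100.0, beta_ms=50.0))
```

With μ = 1 the increment is applied after every heartbeat that follows a false suspicion, so the timeout grows fastest; larger μ values slow the growth, which matches the trade-off between mistakes and detection time discussed for this parameter.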
A safety margin of 50 ms is quite aggressive and, consequently, makes the FD prone to mistakes. For the W-ET system, we also varied μ: 1, 10 and 100.

Table 6. W-ET vs AS - β = 50 ms, η = 500 μs.

| μ | Mistakes | Mistake rate | PA | Avg mistake duration (ms) | Time of last mistake (min) | HB number | TD (ms) |
|---|---|---|---|---|---|---|---|
| 1 | 152 | 0.0017 | 0.99992 | 43.36 | 64 (1 h) | 349 341 | 234.0 |
| 10 | 324 | 0.0037 | 0.99983 | 43.69 | 64 (1 h) | 349 341 | 182.0 |
| 100 | 383 | 0.0044 | 0.99979 | 45.18 | 64 (1 h) | 349 341 | 173.0 |
| AS | 4689 | 0.0542 | 0.99849 | 27.70 | 1438 (24 h) | 7 749 909 | 151.0 |

The first three rows of the table show the results for the W-ET system and the last row those for the AS system. We can observe that the number of mistakes increases with μ in the W-ET system, but it remains much smaller than in the AS system (4689 mistakes). As a consequence, in the AS system, the mistake rate is higher and the PA is lower. In contrast, the average mistake duration in the AS system (27.70 ms) is smaller than in the W-ET system (about 43 to 45 ms).
Such a difference occurs because the AS system has a lower timeout, which induces false suspicions more often. Nevertheless, a heartbeat message may arrive immediately after the expiration of the timeout, generating a short mistake time. On the other hand, in the W-ET system, the timeout value increases when there are false suspicions in periods of greater instability, where messages take longer to arrive.

For the W-ET system, we can observe that the last mistake occurred at minute 64 (heartbeat number 349 341), whereas in the AS system mistakes are observable until the last hour (hour 24, heartbeat number 7 749 909). This happens because in the W-ET system the heartbeat arrival estimation value is incremented by η when p falsely suspects the process within a period of μ heartbeats, which allows p to eventually get every heartbeat message from a site before the timeout expires. It is worth remarking that the number of mistakes reduces drastically, but the TD does not increase at the same rate.

Table 7 summarizes the results of the experiments considering β = 100 ms and η = 500 μs. When comparing the two tables, we observe that with a less aggressive safety margin β, the number of mistakes reduces, especially in the AS system (231). Accordingly, the mistake rate decreases and the PA increases in both systems. The last mistake occurs at around minute 64 in the W-ET system, while the AS system made mistakes until hour 24. The TD of the AS system reduces because it has a higher safety margin and makes fewer mistakes. For instance, with β = 50 ms, the timeouts of two processes, whose maximum TD is 300 ms, expire, which leads the set S* to an untrusted state. However, with β = 100 ms only one of them is suspected, which does not cause a transition from trusted to untrusted.

Table 7. W-ET vs AS - β = 100 ms, η = 500 μs.
| μ | Mistakes | Mistake rate | PA | Avg mistake duration (ms) | Time of last mistake (min) | HB number | TD (ms) |
|---|---|---|---|---|---|---|---|
| 1 | 84 | 0.00097 | 0.99995 | 48.35 | 64 (1 h) | 349 341 | 273.8 |
| 10 | 121 | 0.00140 | 0.99993 | 48.28 | 64 (1 h) | 349 341 | 224.0 |
| 100 | 135 | 0.00156 | 0.99993 | 44.53 | 64 (1 h) | 349 341 | 219.6 |
| AS | 231 | 0.00267 | 0.99989 | 37.56 | 1431 (24 h) | 7 708 057 | 208.0 |

We also conducted the same experiment with β = 100 ms and η = 1 ms for the W-ET system (Table 8). We can note that the number of mistakes is reduced. On the other hand, with few mistakes, especially with μ = 1, both the average mistake duration and the TD increase. Based on these results, we can conclude that setting μ to a value greater than 1 is more suitable for this scenario, since it achieves a better trade-off between detection time and accuracy of the Impact FD.

Table 8. W-ET vs AS - β = 100 ms, η = 1 ms.
| μ | Mistakes | Mistake rate | PA | Avg mistake duration (ms) | Time of last mistake (min) | HB number | TD (ms) |
|---|---|---|---|---|---|---|---|
| 1 | 6 | 0.000069 | 0.999990 | 140.00 | 64 (1 h) | 349 339 | 689.5 |
| 10 | 45 | 0.000520 | 0.999972 | 53.07 | 64 (1 h) | 349 341 | 383.9 |
| 100 | 98 | 0.001133 | 0.999945 | 47.99 | 64 (1 h) | 349 341 | 243.9 |
| AS | 231 | 0.002672 | 0.999899 | 37.56 | 1431 (24 h) | 7 708 057 | 226.8 |

8. RELATED WORK

We can divide related work into two groups: (i) unreliable FDs and (ii) heartbeat arrival estimation strategies.

Unreliable FDs: Most of the unreliable FDs in the literature are based on a binary model and provide as output a set of process identifiers, which usually informs the set of processes currently suspected of having failed ([2, 3]). However, in some detectors, such as class Σ (resp., Ω) [23], the output is the set of processes (resp., one process) which are (resp., is) not suspected of being faulty, i.e. trusted.
The Accrual FD [24] proposes an approach where the output is a suspicion level on a continuous scale, rather than information of a binary nature (trusted or suspected). The suspicion level captures the degree of confidence with which a given process is believed to have crashed. If the process actually crashes, the value is guaranteed to accrue over time and tends toward infinity. Like the Accrual FD, the Impact FD provides a non-binary output; however, the latter is related to the system as a whole and not to each process individually. On the other hand, some important features advocated by the authors in [28] for the Accrual FD can also be extended to our proposal. The authors argue that the aim of Accrual FDs is to decouple monitoring from interpretation. Hence, Accrual FDs provide a lower-level abstraction that avoids having to interpret monitoring information. For instance, by setting an appropriate threshold, applications can trigger suspicions and take appropriate action, similarly to the Impact FD.

Starting from the premise that applications should have information about failures in order to take specific and suitable recovery actions, the work in [29] proposes a service to report faults to applications. The service also encapsulates uncertainty, which allows applications to proceed safely in the presence of doubt. It provides status reports related to fault detection with an abstraction that describes the degree of uncertainty.

Considering that each node has a probability of being Byzantine, a voting node redundancy approach is presented in [30] in order to improve the reliability of distributed systems. Based on such probability values, the authors estimate the minimum number of machines that the system should have in order to provide a degree of reliability which is equal to or greater than a threshold value.

In [31], the authors propose the use of a reputation mechanism to implement a FD for large and dynamic networks.
The reputation mechanism allows node cooperation through the sharing of views about other nodes. The proposed approach exploits information about the behavior of nodes to increase its quality in terms of detection. When classifying the behavior of the nodes, the FD includes a reputation service where the nodes periodically exchange heartbeat messages.

Heartbeat arrival estimation strategies: In the timer-based FD algorithms presented in Section 6, we used the heartbeat arrival estimation proposed in [7]. With the same aim as Chen’s algorithm, i.e. to minimize false suspicions and failure detection time, several other estimation approaches have been proposed in the literature. They dynamically predict new heartbeat arrivals based on the communication delays observed in the past heartbeat history.

Bertier et al. [3] introduced a FD that was mainly intended for LAN environments. Their heartbeat arrival estimation approach combines Chen’s estimation with a dynamic estimation based on Jacobson’s estimation [32]. The latter is used in the TCP protocol to estimate the delay after which a node retransmits its last message. Basically, the estimation of the next heartbeat arrival is calculated by adding Chen’s estimation to a safety margin given by Jacobson’s algorithm. Their approach provides a shorter detection time, but generates more false suspicions than Chen’s estimation, according to the authors’ measurements on a LAN.

The ϕ Accrual FD [24] is based on the estimation of inter-arrival times, assuming that the latter follow a normal distribution. The Accrual FD dynamically adapts to current network conditions based on the suspicion level. Similarly to the FDs of [3] and [7], the estimation protocol samples the arrival time of heartbeats and maintains a sliding window of the most recent samples. The distribution of past samples is then used as an approximation for the probabilistic distribution of future heartbeat messages.
With this information, it is possible to compute a value ϕ with a scale that changes dynamically to match recent network conditions. In [27], the authors extended the Accrual FD by exploiting histogram density estimation. Taking into account a sampled inter-arrival time and the time of the last received heartbeat, the algorithm estimates the probability that no further heartbeat messages will arrive from a given process, i.e. that it has failed.

The ANNFD presented in [33] is a FD based on artificial neural networks. It uses as input parameters variables collected by the Simple Network Management Protocol (SNMP) that characterize the network traffic at each time instant. After training, the neural network computes the message arrival time estimation $EA_{k+1}$, which is used to define the freshness point.

By observing the changes in the computing environment and exploiting both feedback control theory and user-defined QoS constraints, the autonomic FD (AFD) proposed in [34] dynamically configures the monitoring period and the detection timeout value. A new metric, denoted FD availability (AV), is also defined. It suggests a safety margin (α) chosen so as to decrease FD mistakes and to achieve the desired detection availability. If the detection service is inaccurate (i.e. AV is low), then the safety margin is increased to improve detection accuracy; otherwise, if AV is high, then α is decreased to improve the detection speed.

Related work on FD implementations presents different approaches to estimate the timeout. The QoS of a FD depends on the choice of heartbeat arrival estimation strategy: a short timeout leads a FD to detect failures quickly, but may increase the number of false suspicions and, consequently, decrease its accuracy. We propose a new unreliable FD whose focus is not on heartbeat arrival estimation strategies. However, implementations of the Impact FD may use different approaches to estimate the timeout.
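The trade-off between detection time and false suspicions can be made concrete with a small timeout estimator. The following is a minimal Python sketch, assuming the sliding-window form in which Chen's estimation is commonly described: the expected arrival of the next heartbeat is the average of past arrival times, each shifted back by its sequence number times the sending interval, plus the offset of the next sequence number, and the timeout adds a constant safety margin α. The class name, method names, and parameter values are hypothetical and are not taken from the paper's algorithms.

```python
from collections import deque


class ChenStyleEstimator:
    """Sliding-window estimate of the next heartbeat arrival (illustrative sketch)."""

    def __init__(self, interval, margin, window=1000):
        self.interval = interval                 # heartbeat sending period
        self.margin = margin                     # constant safety margin (alpha)
        self.samples = deque(maxlen=window)      # (sequence number, arrival time)

    def record(self, seq, arrival):
        """Store the arrival time of heartbeat number `seq`."""
        self.samples.append((seq, arrival))

    def next_timeout(self):
        """Return the time after which the next heartbeat is considered late."""
        if not self.samples:
            raise ValueError("no heartbeat received yet")
        # Average arrival time, with each sample normalized to sequence number 0.
        offset = sum(a - self.interval * s for s, a in self.samples) / len(self.samples)
        last_seq = self.samples[-1][0]
        expected_arrival = offset + (last_seq + 1) * self.interval
        return expected_arrival + self.margin


# Hypothetical trace: heartbeats sent every 1.0 s, arriving with small delays.
estimator = ChenStyleEstimator(interval=1.0, margin=0.5)
for seq, arrival in [(1, 1.2), (2, 2.1), (3, 3.3)]:
    estimator.record(seq, arrival)
print(estimator.next_timeout())
```

A larger margin in this sketch postpones suspicion and so tolerates more delay variation, at the cost of a longer detection time; a smaller margin does the opposite.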
In the case of the timer-based Impact FD implementation of Section 6 (Algorithm 2), we use the heartbeat arrival estimation proposed by Chen et al. [7]. The reason for choosing Chen’s algorithm is that it is a comparison reference for all FD performance studies. We should emphasize that, to use another one, it is only necessary to change the code of the function Timeout() (Algorithm 1) called by Algorithm 2. For Chen’s estimation algorithm, we consider the safety margin suggested by the authors, adding a dynamic increment for eventually timely links. Note that although the estimation solutions proposed by Chen’s and Accrual FDs [24, 27] have similar performance (mistake rate × detection time) over a wide-area network (the environment of our experiments), the Accrual FD estimation requires tuning of the threshold parameter for each process and depends on application characteristics. It is also important to point out that the Bertier, AFD, and ANNFD estimations were designed for local area networks, where messages are rarely lost, while the 2W-FD [25] has been tailored for unstable network scenarios such as latency jitter or switch contention.

9. CONCLUSION AND FUTURE WORK

This paper introduced the Impact FD, which provides an output that expresses the trust of the FD with regard to the system (or set of processes) as a whole. It is configured by the impact factor and the threshold, which enable the user to define the importance (e.g. degree of reliability) of each node and an acceptable margin of failures, respectively. It is thus suitable for environments where there is node redundancy or nodes with different capabilities. Both the impact factor and the threshold render the estimation of the confidence in the system (or a set of processes S) more flexible. In some scenarios, the failure of low-impact or redundant nodes does not jeopardize the confidence in S, while the crash of a node with a high impact factor may seriously affect it.
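To make the role of these two parameters concrete, the following minimal Python sketch computes a trust level as the sum of the impact factors of the processes that are not suspected and compares it with a threshold used as a lower bound. The function names, process identifiers, impact values, and the use of `>=` for the comparison are illustrative assumptions and do not reproduce the paper's algorithms.

```python
def trust_level(impact, suspected):
    """Sum the impact factors of the processes that are not suspected."""
    return sum(factor for proc, factor in impact.items() if proc not in suspected)


def is_trusted(impact, suspected, threshold):
    """Assumed rule: S is trusted while the trust level reaches the threshold."""
    return trust_level(impact, suspected) >= threshold


# Hypothetical set S: one high-impact node and three redundant low-impact nodes.
impact = {"p1": 5, "p2": 1, "p3": 1, "p4": 1}
threshold = 6  # hypothetical lower bound on the trust level

print(is_trusted(impact, {"p2", "p3"}, threshold))  # True: trust level is 6
print(is_trusted(impact, {"p1"}, threshold))        # False: trust level is 3
```

With these hypothetical values, suspecting the two low-impact nodes p2 and p3 keeps S trusted, whereas suspecting the single high-impact node p1 does not.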
Either a softer or a stricter monitoring is, therefore, possible. We have defined two properties, PR(IT)pS and PR(⋄IT)pS, which denote the capacity of the Impact FD to accept different sets of trusted processes that lead to confidence in S. Then, we presented a timer-based implementation of the Impact FD, which can be applied to systems whose links are lossy asynchronous or those in which all (or some) links are ⋄-timely.

Performance evaluation results, based on real PlanetLab traces, showed that the assignment of a high (resp. low) impact factor to more stable (resp. unstable) nodes increases the Query Accuracy Probability of the FD. Furthermore, we observed that the Impact FD might reduce the rate of false suspicions when compared with the traditional Chen’s unreliable FD. Additionally, in the experiments carried out considering a W-ET system, it was observed that the number of mistakes is drastically reduced when compared with the AS system, whereas the detection time does not increase at the same rate. Therefore, such results confirm the degree of flexible applicability of the Impact FD, that both failures and false suspicions are more tolerated than in traditional FDs, and that the former presents better QoS than the latter if the application is interested in the degree of confidence in the system (trust level) as a whole.

In the near future, we intend to generalize the trust level calculation as well as its comparison with the threshold. To this end, the Trust_level(trusted, S*) function can perform an operation over the impact factors of the trusted processes other than the sum (e.g. multiplication, average, etc.), and the threshold will not necessarily be a lower bound (e.g. upper bound, equality, etc.). For instance, suppose that the impact factor of a node corresponds to the probability that it behaves maliciously. The trust level, in this case, would express the probability that all nodes of the system behave maliciously.
Thus, the sum operation in the trust level calculation would be replaced by a multiplication, and the result should be smaller than a reliability threshold value. Another research direction is to render the impact factor dynamic, i.e. the impact factor of a node can vary during execution, depending on the current degree of reliability of the node or its current reputation, its past history of stable/unstable periods, etc. Finally, we also aim at extending the performance experiments to other networks such as MANETs or LANs, comparing the performance of the Impact FD with other well-known FDs.

FUNDING

This work was partially supported by grant 012909/2013-00 from the National Council for Scientific and Technological Development (CNPq).

Footnotes

1 A process is denoted correct if it does not crash during the whole execution.
2 The power set of any set S is the set of all subsets of S, including the empty set and S itself.

REFERENCES

1 Fischer, M., Lynch, N. and Paterson, M. (1985) Impossibility of distributed consensus with one faulty process. J. ACM, 32, 374–382.
2 Chandra, T. D. and Toueg, S. (1996) Unreliable failure detectors for reliable distributed systems. J. ACM, 43, 225–267.
3 Bertier, M., Marin, O. and Sens, P. (2003) Performance analysis of a hierarchical failure detector. 2003 Int. Conf. Dependable Systems and Networks (DSN), San Francisco, CA, USA, 22–25 June, pp. 635–644. IEEE Computer Society.
4 Rossetto, A., Geyer, C., Arantes, L. and Sens, P. (2015) A failure detector that gives information on the degree of confidence in the system. Symposium on Computers and Communication, Larnaca, Cyprus, 6–9 July, pp. 532–537. IEEE Computer Society.
5 Aguilera, M., Delporte-Gallet, C., Fauconnier, H. and Toueg, S. (2004) Communication-efficient leader election and consensus with limited link synchrony. Proc. 23rd Annual ACM Symposium on Principles of Distributed Computing, PODC, St. John’s, Newfoundland, Canada, 25–28 July, pp. 328–337. ACM.
6 Junqueira, J., Marzullo, K., Herlihy, M. and Penso, L. (2010) Threshold protocols in survivor set systems. Distrib. Comput., 23, 135–149.
7 Chen, W., Toueg, S. and Aguilera, M. (2002) On the quality of service of failure detectors. IEEE Trans. Comput., 51, 561–580.
8 PlanetLab (2014). Planetlab. http://www.planet-lab.org. “Online. Access date: September 16, 2016”.
9 Ishibashi, K. and Yano, M. (2005) A proposal of forwarding method for urgent messages on an ubiquitous wireless sensor network. 6th Asia-Pacific Symposium on Information and Telecommunication Technologies, Yangon, Myanmar, 9–10 Nov, pp. 293–298. IEEE.
10 Geeta, D., Nalini, N. and Biradar, R. (2013) Fault tolerance in wireless sensor network using hand-off and dynamic power adjustment approach. J. Netw. Computer Appl., 36, 1174–1185.
11 Rehman, A., Abbasi, A., Islam, N. and Shaikh, Z. (2014) A review of wireless sensors and networks’ applications in agriculture. Comput. Stand. Interfaces, 36, 263–270.
12 Hayashibara, N., Défago, X. and Katayama, T. (2003) Two-ways adaptive failure detection with the ϕ-failure detector. Workshop on Adaptive Distributed Systems (WADiS03), Sorrento, Italy, Oct, pp. 22–27. Citeseer.
13 Bonnet, F. and Raynal, M. (2013) Anonymous asynchronous systems: the case of failure detectors. Distributed Computing, 26, 141–158.
14 Arévalo, S., Fernández Anta, A., Imbs, D., Jiménez, E. and Raynal, M. (2012) Failure detectors in homonymous distributed systems (with an application to consensus). 2012 IEEE 32nd Int. Conf. Distributed Computing Systems, Macau, China, 18–21 June, pp. 275–284. IEEE Computer Society.
15 Larrea, M., Anta, A. F. and Arévalo, S. (2013) Implementing the weakest failure detector for solving the consensus problem. IJPEDS, 28, 537–555.
16 Aguilera, M. K., Delporte-Gallet, C., Fauconnier, H. and Toueg, S. (2003) On implementing omega with weak reliability and synchrony assumptions. Proc. 22nd ACM Symposium on Principles of Distributed Computing PODC, Boston, Massachusetts, USA, July 13–16, pp. 306–314. ACM.
17 Mostéfaoui, A., Mourgaya, E. and Raynal, M. (2003) Asynchronous implementation of failure detectors. Int. Conf. Dependable Systems and Networks (DSN), San Francisco, CA, USA, 22–25 June, pp. 351–360. IEEE Computer Society.
18 Arantes, L., Greve, F., Sens, P. and Simon, V. (2013) Eventual leader election in evolving mobile networks. 17th Int. Conf. Principles of Distributed Systems, OPODIS, Nice, France, 16–18 December, pp. 23–37. Springer.
19 Gómez-Calzado, C., Lafuente, A., Larrea, M. and Raynal, M. (2013) Fault-tolerant leader election in mobile dynamic distributed systems. IEEE 19th Pacific Rim Int. Symposium on Dependable Computing, PRDC, Vancouver, BC, Canada, 2–4 December, pp. 78–87. IEEE Computer Society.
20 Larrea, M., Fernández, A. and Arévalo, S. (2004) On the implementation of unreliable failure detectors in partially synchronous systems. IEEE Trans. Comput., 53, 815–828.
21 Delporte-Gallet, C., Fauconnier, H., Guerraoui, R. and Kouznetsov, P. (2005) Mutual exclusion in asynchronous systems with failure detectors. J. Parallel Distrib. Comput., 65, 492–505.
22 Bonnet, F. and Raynal, M. (2011) On the road to the weakest failure detector for k-set agreement in message-passing systems. Theor. Comput. Sci., 412, 4273–4284.
23 Delporte-Gallet, C., Fauconnier, H., Guerraoui, R., Hadzilacos, V., Kouznetsov, P. and Toueg, S. (2004) The weakest failure detectors to solve certain fundamental problems in distributed computing. Proc. 23rd Annual ACM Symposium on Principles of Distributed Computing, PODC, St. John’s, Newfoundland, Canada, 25–28 July, pp. 338–346. ACM.
24 Hayashibara, N., Defago, X., Yared, R. and Katayama, T. (2004) The φ accrual failure detector. 23rd Int. Symposium on Reliable Distributed Systems SRDS, Florianopolis, Brazil, 18–20 October, pp. 66–78. IEEE Computer Society.
25 Tomsic, A., Sens, P., Garcia, J., Arantes, L. and Sopena, J. (2015) 2w-fd: A failure detector algorithm with qos. IEEE Int. Parallel and Distributed Processing Symposium (IPDPS), Hyderabad, India, 25–29 May, pp. 885–893. IEEE.
26 Xiong, N., Vasilakos, A. V., Wu, J., Yang, Y. R., Rindos, A., Zhou, Y., Song, W.-Z. and Pan, Y. (2012) A self-tuning failure detection scheme for cloud computing service. 26th International Parallel & Distributed Processing Symposium (IPDPS), Shanghai, China, 21–25 May, pp. 668–679. IEEE.
27 Satzger, B., Pietzowski, A., Trumler, W. and Ungerer, T. (2007) A new adaptive accrual failure detector for dependable distributed systems. ACM Symposium on Applied Computing (SAC), Seoul, Korea, 11–15 March, pp. 551–555. ACM.
28 Défago, X., Urbán, P., Hayashibara, N. and Katayama, T. (2005) Definition and specification of accrual failure detectors. Int. Conf. Dependable Systems and Networks (DSN), Yokohama, Japan, 28 June–1 July, pp. 206–215. IEEE Computer Society.
29 Leners, J. B., Gupta, T., Aguilera, M. K. and Walfish, M. (2013) Improving availability in distributed systems with failure informers. 10th USENIX Symposium on Networked Systems Design and Implementation, NSDI, Lombard, IL, USA, 2–5 April, pp. 427–441. USENIX Association.
30 Brun, Y., Edwards, G., Bang, J. Y. and Medvidovic, N. (2011) Smart redundancy for distributed computation. Int. Conf. Distributed Computing Systems, ICDCS, Minneapolis, Minnesota, USA, 20–24 June, pp. 665–676. IEEE Computer Society.
31 Véron, M., Marin, O., Monnet, S. and Sens, P. (2015) Repfd-using reputation systems to detect failures in large dynamic networks. 44th Int. Conf. Parallel Processing, ICPP, Beijing, China, 1–4 September, pp. 91–100. IEEE Computer Society.
32 Jacobson, V. (1988) Congestion avoidance and control. Symposium Proc. Communications Architectures and Protocols, SIGCOMM, Stanford, California, USA, 16–18 August, pp. 314–329. ACM.
33 Macêdo, R. A. and Lima, F. R. L. (2004) Improving the quality of service of failure detectors with snmp and artificial neural networks. Simpósio Brasileiro de Redes de Computadores, SBRC, Gramado - RS, Brazil, 10–14 May, pp. 583–586. SBC.
34 de Sá, A. S. and Macêdo, R. J. A. (2010) Qos self-configuring failure detectors for distributed systems. IFIP Int. Conf. Distributed Applications and Interoperable Systems, Amsterdam, The Netherlands, 7–9 June, pp. 126–140. Springer Berlin Heidelberg.

© The British Computer Society 2018. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)

Journal

The Computer Journal, Oxford University Press

Published: Oct 1, 2018
