Failures in discrete-event systems and dealing with them by means of Petri nets

Abstract

An approach based on Petri nets that shows how to deal with failures in discrete-event systems (DES) is presented. It uses the reachability tree and/or reachability graph of the Petri net-based model of the real system, as well as the synthesis of a supervisor to remove possible deadlock(s). To illustrate the applicability of the approach to the detection and recovery of failures in DES modelled by Petri nets, a case study on a railroad crossing is introduced.

Keywords Detection · Discrete-event systems · Failure · Modelling · Petri nets · Reachability · Recovery

1 Introduction

A failure can be defined [2,3,10] as a deviation of a system from its intended (normal) behaviour. The process of detecting a potential failure in the system behaviour, followed by isolating the cause or the source of the failure, is called system diagnosis. In the diagnosis of discrete-event systems (DES), faults may correspond to any discrete event. Unfortunately, the predisposition of systems to fail increases with their complexity. Research effort has been spent on the development of diagnostic systems. It is necessary to distinguish (according to the manner in which faults are reset after they occur) [2,14] between permanent and intermittent faults. In the case of a permanent fault, the recovery event occurs only due to repairing the controllable and observable fault. In the case of an intermittent fault, the recovery event occurs either spontaneously or due to repairing, and such an event tends to be uncontrollable and unobservable. Fault diagnosability [27] concerns whether the system is diagnosable or not, i.e., whether the system can detect the occurrence of the fault in a finite number of steps.

Error recovery is [12,19,25] the set of actions that must be performed in order to return the system to its normal state. At least one sequence of actions should exist that brings the system back into its normal operation. When several sequences exist, the best one is chosen with respect to a prescribed criterion. Usually it is the sequence of actions which minimally disorganizes the system.

For systems without failures, Petri nets (PN) are very useful for modelling, analysis and control synthesis. However, practically no system is failure-free. Failures can emerge in any device or software. It is gratifying that PN can also be used [3,7,10,11,15–17,19,21–23,26] for systems where failures occur. These failures can be categorized into hardware failures and software failures. To minimize the hardware failures of devices, it is necessary to execute maintenance of the devices in time, to test and/or check them, and to replace their components in time. To decrease the occurrence of software failures, fault-tolerant software techniques are necessary. Error recovery is possible only for the so-called soft failures [6]. Hard (catastrophic) failures in systems are classified as functional and/or structural failures.

Strategies and forms for the detection and recovery of soft failures are based on the so-called error treatment and failure treatment. The error treatment contains error detection, damage assessment and error recovery. The failure treatment includes localization, identification, system repair and continued service. The hard failures, however, can be overcome in most systems by means of redundancy [1].

This paper is an expanded version of the paper [5] presented at the conference ACIIDS 2017, Kanazawa, Japan, April 3–5, 2017.

František Čapkovič, Frantisek.Capkovic@savba.sk
Institute of Informatics, Slovak Academy of Sciences, Dúbravská cesta 9, 845 07 Bratislava, Slovakia

Vietnam Journal of Computer Science (2018) 5:143–155

Here, in this paper, failures in DES and their recovery are examined by means of Petri nets (PN). DES are systems discrete by nature. They persist in a steady state until the occurrence of a discrete event which causes their transition into another state. Typical representatives of DES are discrete manufacturing systems, transport systems, communication systems, etc. PN are frequently used for DES modelling, analysis and control synthesis.

1.1 Preliminaries about Petri nets

Petri nets (PN) [8,18,20] are (as to their structure) bipartite directed graphs, i.e., graphs with two kinds of nodes (places and transitions) and two kinds of edges (arcs directed from places to transitions and arcs directed the opposite way): $\langle P, T, F, G \rangle$ with $P \cap T = \emptyset$ and $P \cup T \neq \emptyset$ ($\emptyset$ is the empty set), where $P$, $|P| = n$, is a finite set of places and $T$, $|T| = m$, is a finite set of transitions; $F \subseteq P \times T$ and $G \subseteq T \times P$ are the sets of directed arcs. The set $B = F \cup G$ contains all directed arcs. The so-called preset (the set of input places) of a transition $t$ is defined as ${}^{(p)}t = \{p \mid (p, t) \in B\}$, while the so-called postset (the set of output places) of $t$ is defined as $t^{(p)} = \{p \mid (t, p) \in B\}$. Conversely, the preset of a place $p$ (the set of input transitions) is defined as ${}^{(t)}p = \{t \mid (t, p) \in B\}$, while the postset (the set of output transitions) of $p$ is defined as $p^{(t)} = \{t \mid (p, t) \in B\}$. A P/T PN is said to be pure if no self-loops occur in it, i.e., if for $p \in P$, $t \in T$: $(p, t) \in B \Rightarrow (t, p) \notin B$.

Places model particular activities or operations of the modelled DES, which is a real object (plant). This is expressed by putting tokens inside the places. Such a marking $m$ is a vector $m : P \to \mathbb{Z}_{\geq 0}$ ($\mathbb{Z}_{\geq 0}$ denotes the non-negative integers). The marking enables a set of transitions $\tau \subseteq T$. Namely, $\forall p \in P$, $m(p) \geq |p^{(t)} \cap \tau|$, i.e., $m(p)$ is greater than or equal to the number of transitions in $\tau$ for which $p$ is an input place. The enabled transitions may be (but need not be) fired. After their firing the PN marking is changed.

As to the marking development (marking propagation can be understood to be the PN dynamics), the PN can be formally defined as $\langle X, U, \delta, \mathbf{x}_0 \rangle$, where $X$ is a set of PN states and $U$ is a set of discrete events; $\delta : X \times U \to X$ symbolizes the fact that the new state of marking depends on the existing state and on the discrete event that occurred; $\mathbf{x}_0 \in X$ is the initial state of marking. The state equation (the PN model of DES) is as follows:

\[ \mathbf{x}_{k+1} = \mathbf{x}_k + \mathbf{B} \cdot \mathbf{u}_k, \quad k = 0, 1, \ldots, N, \tag{1} \]
\[ \mathbf{F} \cdot \mathbf{u}_k \leq \mathbf{x}_k, \tag{2} \]

where $\mathbf{B} = \mathbf{G} - \mathbf{F}$. It expresses the PN dynamics. Here, $\mathbf{x}_k = (\sigma^k_{p_1}, \ldots, \sigma^k_{p_n})^T$ with entries $\sigma^k_{p_i} \in \{0, 1, \ldots, \infty\}$, representing the states of particular places, is the PN state vector in the $k$-th step of the dynamics development; $\mathbf{u}_k = (\gamma^k_{t_1}, \ldots, \gamma^k_{t_m})^T$ with entries $\gamma^k_{t_i} \in \{0, 1\}$, representing the states of particular transitions (either enabled, when 1, or disabled, when 0), is the control vector; $\mathbf{F}$, $\mathbf{G}$ are the incidence matrices of arcs corresponding, respectively, to the sets $F$, $G$ mentioned above.

A firing sequence from the initial state $\mathbf{x}_0$ (i.e., from the initial marking $m_0$) is a sequence of transition sets $\mathcal{T} = \{\tau_1 \tau_2 \ldots \tau_k\}$ such that $\mathbf{x}_0 [\tau_1 > \mathbf{x}_1 [\tau_2 > \mathbf{x}_2 \cdots \mathbf{x}_{k-1} [\tau_k > \mathbf{x}_k$. The set may, of course, also be empty. The notation $\mathbf{x}_0 [\mathcal{T}$ denotes that the sequence $\mathcal{T}$ can be fired at $\mathbf{x}_0$, and the notation $\mathbf{x}_0 [\mathcal{T} > \mathbf{x}_k$ denotes that the firing of $\mathcal{T}$ yields $\mathbf{x}_k$.

More than one transition can be fired at any instant. Thus there are two possibilities: (i) to fire more than one transition at any instant (concurrency assumption); (ii) to fire only one of them at any instant [no concurrency (NC) assumption]. Under the NC assumption, each $\tau_i$ is a singleton set and $\mathcal{T}$ is a sequence of transitions. In general, the state $\mathbf{x}_k$ is reachable from $\mathbf{x}_0$ if there exists a firing sequence $\mathcal{T}$ such that $\mathbf{x}_0 [\mathcal{T} > \mathbf{x}_k$. For a PN the set of reachable state vectors is $R(PN, \mathbf{x}_0)$. All these vectors form the columns of the matrix $\mathbf{X}_{reach}$.

The PN reachability tree (RT) expresses all states reachable from $\mathbf{x}_0$ as well as how (by firing which transitions) they can be reached. Thus, the nodes of the RT are labelled with the actual PN markings (state vectors) and the arcs are labelled with the transitions between the states. The RT root is represented by the initial state $\mathbf{x}_0$ and the RT leaves are the states reachable from $\mathbf{x}_0$. By connecting the leaves with the same name, the reachability graph (RG) arises.

The PN T-invariants and P-invariants [9,13,18] are important too, respectively, for diagnosability [16] and supervision [4] (and subsequently for the elimination of deadlocks). While T-invariants restore an initial state, P-invariants ensure token preservation. A T-invariant $\mathbf{v}$ is a solution of the equation $\mathbf{B}\mathbf{v} = \mathbf{0}$. A P-invariant $\mathbf{y}$ is a solution of the equation $\mathbf{B}^T \mathbf{y} = \mathbf{0}$. For any state $\mathbf{x}$ reachable from $\mathbf{x}_0$ the relation $\mathbf{y}^T \cdot \mathbf{x} = \mathbf{y}^T \cdot \mathbf{x}_0$ is valid. This fact was utilized in the supervisor synthesis [4] based on P-invariants.

To express time, we can use timed Petri nets (TPN), where time is assigned to the transitions as their duration function $D : T \to \mathbb{Q}_{\geq 0}$, where $\mathbb{Q}_{\geq 0}$ denotes the non-negative rational numbers.

To illustrate the PN-based approach to the detection and recovery of failures in DES modelled by PN, let us introduce the following case study.

This paper is an expanded version of the paper [5] presented at the conference ACIIDS 2017. In comparison with the conference paper, the part concerning the safety of technical systems in general was added.
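The state equation (1)–(2) and the construction of the reachability tree/graph from Sect. 1.1 can be sketched in a few lines of Python. This is only an illustrative sketch under the NC assumption (one transition fired per step), not the implementation used in the paper; the function names `enabled`, `fire` and `reachability_graph`, the `limit` guard and the two-place example net are assumptions introduced here.

```python
from collections import deque

def enabled(F, x, t):
    """Condition (2) for a single transition t: column t of F must not exceed x."""
    return all(F[p][t] <= x[p] for p in range(len(x)))

def fire(F, G, x, t):
    """State equation (1) with u_k equal to the t-th unit vector: x' = x + (G - F)[:, t]."""
    return tuple(x[p] + G[p][t] - F[p][t] for p in range(len(x)))

def reachability_graph(F, G, x0, limit=10_000):
    """Breadth-first enumeration of the reachable markings (RG nodes and labelled arcs)."""
    x0 = tuple(x0)
    nodes = [x0]                 # node i carries the marking nodes[i]
    index = {x0: 0}
    arcs = []                    # (source node, transition, target node)
    queue = deque([x0])
    while queue:
        x = queue.popleft()
        for t in range(len(F[0])):
            if not enabled(F, x, t):
                continue
            y = fire(F, G, x, t)
            if y not in index:
                if len(nodes) >= limit:
                    # Unbounded nets need a coverability construction (omega symbols) instead.
                    raise RuntimeError("state limit reached; the net may be unbounded")
                index[y] = len(nodes)
                nodes.append(y)
                queue.append(y)
            arcs.append((index[x], t, index[y]))
    return nodes, arcs

# Hypothetical net (not from the paper): p1 -t1-> p2 -t2-> p1.
# F[p][t] = 1 for an arc place -> transition, G[p][t] = 1 for an arc transition -> place.
F = [[1, 0],
     [0, 1]]
G = [[0, 1],
     [1, 0]]
nodes, arcs = reachability_graph(F, G, (1, 0))

# A deadlock is a reachable marking at which no transition is enabled.
deadlocks = [i for i, x in enumerate(nodes)
             if not any(enabled(F, x, t) for t in range(len(F[0])))]
```

In this toy net the vector y = (1, 1) is a P-invariant, so the token sum stays constant over all reachable markings, which is the relation $\mathbf{y}^T \cdot \mathbf{x} = \mathbf{y}^T \cdot \mathbf{x}_0$ stated above.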
Because the case study concerns an accident on a railroad crossing, certain illustrations of the formidable effects of such accidents were introduced. The passage concerning the supervisor synthesis was also modified, to be more comprehensible to readers.

2 Safety of technical systems

The safety of different kinds of technical systems is very important, especially in the case of systems where human life is endangered. From this point of view, transport systems belong to the systems where human life is often endangered. At present, people are directly endangered by contact with transport systems throughout the whole day. Mass transport is dangerous not only for road users who cross a road as pedestrians but also for car drivers and their travel companions. For example, car collisions occur very frequently. Likewise, collisions on railroad crossings are not unusual. Even in a small country like Slovakia, tragic collisions of cars and/or trucks with trains occur practically every month; see, e.g., Fig. 1. The train, having many times bigger mass, speed and consequently also dynamics, destroys not only human lives (inside the road vehicles and the train) but also the vehicles, and sometimes the train itself ends up completely destroyed.

Fig. 1 The illustrative examples of such accidents

During the last several years such collisions caused many casualties (130 human lives) and huge material damage. Consequently, it is necessary to be concerned with such problems and to find possibilities how to improve security in that area. PN can also help along this line. Of course, it is impossible to anticipate failures caused by people themselves. Failures due to human behaviour, like absent-mindedness, willful and wanton acts of law breaking, infringement of traffic regulations, etc., cannot be removed simply. To prevent bad habits, education or training, and in extreme cases punishment, are necessary. The only right way to improve the safety of systems is to increase the reliability of the software and equipment. The following simple case study on a railroad crossing offers an approach how to do this in such a case.

2.1 Case study on simple railroad crossing

Consider a simple railroad crossing where the railroad crossing gate prevents a direct contact of vehicles on the road with trains. The PN model of such a system consists of three cooperating sub-models expressing, in Fig. 2 (left), the behaviour of the train, the crossing gate and the control system. Here, the sense of the places in the failure-free case is the following: (i) the train has the states: $p_1$ = approaching the crossing, $p_2$ = being before the crossing, $p_3$ = being within the crossing, $p_4$ = being after the crossing; (ii) the barrier of the crossing gate has the states: $p_{11}$ = it is up, $p_{12}$ = it is down. The transitions $t_6$ and $t_7$ model, respectively, the events of raising and lowering the barrier; (iii) the control system has the states $p_5$, $p_6$, $p_7$, $p_8$, $p_9$, $p_{10}$; (iv) the place $p_{13}$ represents the interlock giving the warning signal for the train that the barrier is still up. The reachable states $\mathbf{x}_i$, $i = 0, \ldots, 7$ (RT/RG nodes $N_{i+1}$), of the failure-free system are expressed as the rows of the matrix

$\mathbf{X}_{reach}$ (3) [8 × 13 matrix; only one row, (0 0 0 1 0 1 0 0 0 1 0 1 0), is recoverable from the source]

The RT is displayed next to the failure-free PN model in Fig. 2. It is simple, without any branching.

Fig. 2 The PN model of the failure-free case together with its RT (left) and the PN model with three potential failures (right)

However, three potential failures can occur, one in each subsystem. They are expressed by means of the failure transitions $t_{f_2}$, $t_{f_5}$, $t_{f_6}$ given in Fig. 2 (right). The transition $t_{f_2}$ takes a token from $p_2$ and puts a token into $p_3$ out of the correct sequence, $t_{f_6}$ does the same for $p_{12}$ and $p_{11}$, and $t_{f_5}$ involves an erroneous generation of a token in $p_{10}$ which directly influences the position of the barrier. Thus, $t_{f_2}$ represents a human failure (when the engine-driver omits or ignores the warning signal), $t_{f_6}$ expresses the failure of the crossing gate (when a premature gate raising occurs or the gate is mechanically damaged), and $t_{f_5}$ represents a control system failure (when an illegitimate signal occurs).

It is practically impossible to recover the human failure of the engine-driver. Likewise, the technical problem in the crossing gate caused by a wrong function of the barrier raising/lowering can hardly be recovered. However, the erroneous function of the control system can be detected and recovered. Consequently, let us consider in Fig. 2 (right) only the failure represented by $t_{f_5}$ and neglect the failures represented by the transitions $t_{f_2}$ and $t_{f_6}$. Then the coverability tree and graph are given in Fig. 3. The reachable states of this model (nodes of the RT/RG) are given as the columns of the following matrix:

\[
\mathbf{X}_{reach} = \left(\begin{array}{*{22}{c}}
1&0&1&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0\\
0&1&0&1&1&1&1&0&1&1&0&0&0&1&0&0&0&0&0&0&0&0\\
0&0&0&0&0&0&0&1&0&0&0&1&1&0&0&0&0&1&0&0&0&0\\
0&0&0&0&0&0&0&0&0&0&1&0&0&0&1&1&1&0&1&1&1&1\\
0&1&0&0&1&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0\\
1&1&1&0&1&0&0&0&0&0&0&0&0&0&1&0&0&0&1&1&0&1\\
0&0&0&1&0&1&1&1&1&1&1&1&1&1&0&1&1&1&0&0&1&0\\
0&0&0&0&0&0&0&0&0&0&1&0&0&0&0&1&1&0&0&0&1&0\\
0&0&0&1&0&0&1&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0\\
1&1&\omega&1&\omega&1&\omega&1&0&\omega&1&0&\omega&\omega&2&0&\omega&\omega&1&\omega&\omega&\omega\\
1&1&1&1&1&0&1&0&1&0&0&1&0&1&0&1&0&1&1&0&1&1\\
0&0&0&0&0&1&0&1&0&1&1&0&1&0&1&0&1&0&0&1&0&0\\
0&0&0&0&0&1&0&0&1&1&0&0&0&1&0&0&0&0&0&0&0&0
\end{array}\right) \tag{4}
\]

It can be seen that, with an infinite number of occurrences of $t_{f_5}$, one half of the 22 states have self-loops (see Fig. 3, right), which are expressed by the symbol $\omega$.

Fig. 3 The coverability tree (left) and coverability graph (right) of the PN model with $t_{f_5}$ at the infinite number of possible occurrences of the failure

In order to generate only a finite number of occurrences of the failure $t_{f_5}$, the place $p_{14}$ was added to the previous 13 places; see Fig. 4. In general, the failure can occur several times. The more times the failure occurs, the more complicated the structure and dimensionality of the RT (RG) will be. Therefore, here we suppose that it occurs only once, as displayed in Fig. 4, in order to demonstrate how to deal with the failure. In the case of more failures the process will be more complicated. The model parameters are $\mathbf{F}$, $\mathbf{G}$ and $\mathbf{x}_0$. Of the 14 × 10 matrix $\mathbf{F}$ only three rows are recoverable from the source: (0 0 0 0 1 0 0 0 0 1), (0 0 0 0 1 0 0 0 0 0) and (0 0 0 0 0 0 1 0 0 0). The remaining parameters are

\[
\mathbf{G} = \left(\begin{array}{*{10}{c}}
0&0&0&0&0&0&0&0&0&0\\
1&0&0&0&0&0&0&0&0&0\\
0&1&0&0&0&0&0&0&0&0\\
0&0&1&0&0&0&0&0&0&0\\
1&0&0&0&0&0&0&0&0&0\\
0&0&0&0&1&0&0&0&0&0\\
0&0&0&1&0&0&0&0&0&1\\
0&0&1&0&0&0&0&0&0&0\\
0&0&0&1&0&0&0&0&0&0\\
0&0&0&0&1&0&0&1&0&0\\
0&0&0&0&0&1&0&0&1&0\\
0&0&0&0&0&0&1&0&0&1\\
0&0&0&0&0&0&1&0&0&0\\
0&0&0&0&0&0&0&0&0&0
\end{array}\right); \quad
\mathbf{x}_0 = (1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1)^T. \tag{5}
\]

Fig. 4 The PN model of the system with the failure represented by $t_{f_5}$

Then, the RT and RG of the failed system are given in Fig. 5. It can be seen that the number of states as well as the RT/RG structure are completely different in comparison with the RT of the failure-free system in Fig. 2. Namely, branching occurs here. The states (nodes of the RT/RG) are the columns of the matrix

$\mathbf{X}_{reach}$ (6) [14 × 19 matrix; only four rows are recoverable from the source: 0000000001000100010, 0001001000000000000, 0010101010101100201, 1111101000010011011]

Fig. 5 The RT of the system with the finite number (namely only once in this case) of possible occurrences of the failure represented by $t_{f_5}$ (left) and the corresponding RG (right)

The RT has 19 nodes. However, with a growing number of occurrences of the failure, the RT/RG dimensionality and complexity escalate. When $\sigma_{p_{14}} = 2$ the RT has 30 nodes, when $\sigma_{p_{14}} = 5$ the RT has 63 nodes, when $\sigma_{p_{14}} = 10$ the RT has 118 nodes, etc. Although the procedure of RT computation is the same, the computational time correspondingly increases.

To detect and recover the failure(s) we have to distinguish whether the barrier is down or up. When the train is approaching, in the standard situation (without any failure) the barrier is down. However, in the non-standard situation (when the failure $t_{f_5}$ occurs) the barrier is going up. This is a very dangerous situation, critical as to safety. To detect the failure it is necessary to have redundant information. It must be contained in the control system itself. Because $p_6$ and $p_7$ in the control system correspond to $p_{11}$ and $p_{12}$ in the real crossing gate, the failure is detected by checking whether $p_7$ and $p_{11}$ are active simultaneously. If yes, there is a contradiction between the real (i.e., faulty) situation and the standard one. After detecting the failure, a kind of recovery can be applied. It depends on which case is accepted as the true state. When it is supposed that the barrier is up and drops down, the recovery is realized by means of the recovery transition $t_r$ (its index is not legible in the source). The PN model of the recovered system is given in Fig. 6.

Fig. 6 The PN model of the final recovered system

A more detailed analysis is possible by means of the RT and/or RG in Fig. 7, using the information about the nodes given in the matrix (7). When the barrier is up and no train is approaching, the situation is considerably simpler. Namely, by virtue of $t_r$, the fail signal $p_{10}$ from the control system and the activity of a further place (its index is not legible in the source) guarantee that the fail signal can be practically ignored. The recovered system has 30 states, which are the columns of the following matrix:

$\mathbf{X}_{reach}$ (7) [14 × 30 matrix; only four rows are recoverable from the source: 000000000000001000001000011000, 000100011000110000100000000000, 111111010100010010000101010101, 110100101010001000010001000000]

The RT and RG of the recovered system are given in Fig. 7. But the deadlock $N_{19}$ ($\mathbf{x}_{18}$) occurs there.

Fig. 7 The RT of the recovered system (left) and the corresponding RG (right)

2.2 Supervisor synthesis for deadlock(s) elimination

A general approach to the supervisor synthesis based on P-invariants of PN was presented in [4]. Suitable linear combinations of entries of the state vector $\mathbf{x}$ (i.e., $\mathbf{L} \cdot \mathbf{x}$) are restricted by means of entries of the constant vector $\mathbf{b}$, i.e., $\mathbf{L} \cdot \mathbf{x} \leq \mathbf{b}$, $\mathbf{L} \in \mathbb{Z}^{n_s \times n}_{\geq 0}$, $\mathbf{b} \in \mathbb{Z}^{n_s \times 1}_{\geq 0}$. Then, in a nutshell, the supervisor synthesis is as follows. Remove the inequality by adding the vector $\mathbf{x}_s$ of the so-called slacks, i.e.,

\[ \mathbf{L} \cdot \mathbf{x} \leq \mathbf{b}, \tag{8} \]
\[ \mathbf{L} \cdot \mathbf{x} + \mathbf{I}_s \cdot \mathbf{x}_s = \mathbf{b}, \tag{9} \]

where $\mathbf{I}_s$ is the $(n_s \times n_s)$-dimensional identity matrix. Now suppose that $\mathbf{Y}$ is a matrix of invariants of the extended PN model (the model of the plant and the supervisor together). Then

\[ \mathbf{Y}^T \cdot (\mathbf{B}^T \; \mathbf{B}_s^T)^T = \mathbf{0}. \tag{10} \]

Let us define

\[ \mathbf{Y}^T = (\mathbf{L} \; \mathbf{I}_s). \tag{11} \]

After multiplying the matrices in (10), we obtain

\[ \mathbf{L} \cdot \mathbf{B} + \mathbf{I}_s \cdot \mathbf{B}_s = \mathbf{0}, \tag{12} \]
\[ \mathbf{B}_s = -\mathbf{L} \cdot \mathbf{B} = \mathbf{G}_s - \mathbf{F}_s. \tag{13} \]

The initial state vector of the supervisor follows from (9) in the form

\[ \mathbf{x}_{s,0} = \mathbf{b} - \mathbf{L} \cdot \mathbf{x}_0. \tag{14} \]

$\mathbf{B}_s$ is the supervisor structure and $\mathbf{x}_{s,0}$ is its initial state; $\mathbf{F}_s$, $\mathbf{G}_s$ are the incidence matrices of the supervisor. Consequently, $\mathbf{B}^s = (\mathbf{B}^T \; \mathbf{B}_s^T)^T$ is the structure of the supervised system (i.e., the original system plus the supervisor) and $\mathbf{x}^s_0 = (\mathbf{x}_0^T \; \mathbf{x}_{s,0}^T)^T$ is its initial state.
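The synthesis steps (8)–(14) can be sketched as follows. This is a minimal illustration in plain Python, not the author's implementation; the function name `supervisor` and the four-place mutual-exclusion net used as input are assumptions chosen only to make the arithmetic easy to follow.

```python
def supervisor(L, b, B, x0):
    """Compute B_s = -L*B (13), split it into F_s and G_s, and x_s0 = b - L*x0 (14)."""
    n_s, n, m = len(L), len(B), len(B[0])
    Bs = [[-sum(L[i][p] * B[p][t] for p in range(n)) for t in range(m)]
          for i in range(n_s)]
    # Negative entries of B_s are arcs supervisor place -> transition (F_s),
    # positive entries are arcs transition -> supervisor place (G_s).
    Fs = [[max(-v, 0) for v in row] for row in Bs]
    Gs = [[max(v, 0) for v in row] for row in Bs]
    xs0 = [b[i] - sum(L[i][p] * x0[p] for p in range(n)) for i in range(n_s)]
    return Bs, Fs, Gs, xs0

# Hypothetical plant (not from the paper): two independent cycles
# p1 -t1-> p2 -t2-> p1 and p3 -t3-> p4 -t4-> p3, incidence matrix B = G - F.
B = [[-1,  1,  0,  0],
     [ 1, -1,  0,  0],
     [ 0,  0, -1,  1],
     [ 0,  0,  1, -1]]
x0 = [1, 0, 1, 0]

# Constraint L.x <= b: p2 and p4 must never be marked at the same time.
L = [[0, 1, 0, 1]]
b = [1]

Bs, Fs, Gs, xs0 = supervisor(L, b, B, x0)
```

For this input the supervisor place takes a token whenever t1 or t3 fires and returns it on t2 or t4, i.e., it acts as a mutual-exclusion monitor holding one token initially.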
Let us deal with the deadlock state $\mathbf{x}_{18}$ by means of synthesizing a suitable supervisor. Because the deadlock state is $N_{19}$, i.e., $\mathbf{x}_{18} = (0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0)^T$, we have to avoid its activation. Consider $\mathbf{L} = (0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0)$ and $b = 3$, i.e., at most three of $p_2$, $p_7$, $p_9$, $p_{12}$ can be active together. Then, the supervisor structure is given as $\mathbf{B}_s = (-1, 1, 0, -2, 1, 1, 0, 0, 0, -1)$. After the break-up of $\mathbf{B}_s$ the incidence matrices of arcs are acquired: $\mathbf{F}_s = (1, 0, 0, 2, 0, 0, 0, 0, 0, 1)$ and $\mathbf{G}_s = (0, 1, 0, 0, 1, 1, 0, 0, 0, 0)$. The initial state of the supervisor is $x_{s,0} = 3 - 0 = 3$. The supervisor is incorporated into the PN model of the recovered system given in Fig. 6. Consequently, the PN model changes into the form given in Fig. 8. Its RT and RG are in Fig. 9. The reachable states of the deadlock-free recovered system are given as the columns of the matrix:

$\mathbf{X}_{reach}$ (15) [15 × 27 matrix; only seven rows are recoverable from the source: 101001000000000000000000000, 010110111011001100000000000, 111011001000000010001100111, 000100110111111101110011000, 000000000000100001000011000, 000100010001000000000000000, 111111011001001000101010101]

Fig. 8 The PN model of the system with the recovered failure and the deadlock removed by means of the supervisor

Fig. 9 The RT of the recovered system with removed deadlock by means of the supervisor (left) and the corresponding RG (right)

2.3 Time views on results

To give an image of the time relations, let us use a TPN with the time parameters of the transitions (delays in a time unit) defined by $D = 0.2 \times (1, 1, 1, 1, 1, 2, 2, 0.1, 0.05, 0.05)$, where the first 7 parameters concern the transitions $t_1$–$t_7$, the 8th parameter is assigned to $t_{f_5}$, the 9th parameter concerns $t_{r_2}$ and, finally, the 10th parameter concerns the remaining transition. Simulation in Matlab by means of the tool HYPENS [24] brings the results given in Figs. 10, 11 and 12. Till now the deterministic timing of all transitions was used, including $t_{f_5}$. To make sure that non-deterministic timing of $t_{f_5}$ does not affect the results, consider for $t_{f_5}$ the discrete uniform probability distribution of timing: $f^u_x = 1/(b - a)$ if $x \in (a, b)$, otherwise $f^u_x = 0$. Test two cases: (i) $a = 0.1$, $b = 1.2$; (ii) $a = 0.3$, $b = 0.7$. The results are introduced in Fig. 13.

As can be seen, only the time instant of the failure incidence represented by $t_{f_5}$ manifests itself in the marking of $p_{10}$; compare both pictures in Fig. 13 with each other and both of them with the corresponding part of Fig. 11 containing $p_{10}$. The courses of marking of all other places stay unchanged.

Fig. 10 The courses of marking the places $p_1$–$p_4$ wrt. (with respect to) time

Fig. 11 The courses of marking the places $p_5$–$p_{12}$ wrt. time. The marking of $p_{10}$ is directly influenced by $t_{f_5}$ (i.e., by a failure)

Fig. 12 The courses of marking the places $p_{13}$–$p_{15}$ wrt. time. The place $p_{15}$ expresses the state (marking) of the supervisor

Fig. 13 The courses of marking the place $p_{10}$ in the case (i), the left picture, and in the case (ii), the right picture

3 Conclusion

The PN-based approach to dealing with failures in DES was presented. It is based on utilizing the RT/RG of the PN-based model of DES. Moreover, the elimination of deadlock(s) by means of supervision (synthesizing a suitable supervisor) based on P-invariants of PN, introduced in [4], was utilized.

The presented approach consists of the following steps: (i) creating the PN model of the investigated kind of DES; (ii) finding its behaviour in the standard (failure-free) situation; (iii) analysing the model with respect to possible failures (in general, each system has its specificity and it is practically impossible to find a unified approach for all systems); (iv) selecting the failures which can be successfully recovered (because there are different kinds of failures and some of them cannot be recovered, e.g., human failures of the engine-driver or a mechanical problem in the crossing gate); (v) finding the structure of the recovered PN model; (vi) testing its behaviour with respect to deadlocks; (vii) removing deadlocks and finding the deadlock-free PN model.

PN were used in all of the steps. They make it possible to create a uniform model of a system and compute its RT/RG. However, in different systems different states can fail. Hence, the model recovering process is individual. As to the computational complexity of the approach, it corresponds especially to that of computing the RT, which depends on the structure of the PN model in question.

To illustrate the soundness of the procedure, the case study on the simple railroad crossing was introduced. Finally, the deadlock-free recovery model was found. It is necessary to emphasize that there are also failures in DES which cannot be recovered by means of the procedure. They depend on human failures, bad properties and mistakes and/or on the bad technical state of devices. They must be precluded either by better preparation of human operators and/or by better maintenance of devices, their routine testing and/or checking, early replacement of their components, etc.

In future, a possibility of generalizing the recovery process by means of PN will be investigated.

Acknowledgements The research was partially supported by the Slovak Grant Agency for Science VEGA under Grant # 2/0029/17.
The author thanks VEGA for the support.

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

References

1. Bernardi, S., et al.: Model-driven availability evaluation of railway control systems. In: Proceedings of 30th International Conference on Computer Safety, Reliability and Security (SAFECOMP 2011), Naples, Italy. LNCS, vol. 6894, pp. 15–28. Springer (2011)
2. Cabasino, M.P., Giua, A., Pocci, M., Seatzu, C.: Discrete event diagnosis using labeled Petri nets. An application to manufacturing systems. Control Eng. Pract. 19(9), 989–1001 (2011)
3. Cabasino, M.P., Giua, A., Lafortune, S., Seatzu, C.: New approach for diagnosability analysis of Petri nets using verifier nets. IEEE Trans. Autom. Control 57(12), 3104–3117 (2012)
4. Čapkovič, F.: Petri net-based synthesis of agent cooperation by means of modularity and supervision principles. In: Dimirovski, G.M. (ed.) Complex Systems. Relationships Between Control, Communications and Computing, Chapter 20, Springer Series: Studies in Systems, Decision and Control, pp. 429–450. Springer, Cham (2016)
5. Čapkovič, F.: Failures in discrete event systems and dealing with them by means of Petri nets. In: Nguyen, N.T., et al. (eds.) ACIIDS 2017, Part I, LNAI 10191, pp. 379–391. Springer, Cham (2017)
6. Chang, S.J., DiCesare, F., Goldbogen, G.: Failure propagation trees for diagnosis in manufacturing systems. IEEE Trans. SMC 21(4), 767–776 (1991)
7. Chung, S., Wu, C., Jeng, M.: Failure diagnosis: a case study on modeling and analysis by Petri nets. In: Proceedings of IEEE International Conference on Systems, Man & Cybernetics, Washington, DC, 5–8 October 2003, pp. 2727–2732 (2003)
8. Desel, J., Reisig, W.: Place/transition Petri nets. In: Reisig, W., Rozenberg, G. (eds.) Lectures on Petri Nets I: Basic Models. Advances in Petri Nets, LNCS, vol. 1491, pp. 122–173. Springer, Heidelberg (1998)
9. Desel, J., Esparza, J.: Free Choice Petri Nets. Cambridge Tracts in Theoretical Computer Science, vol. 40. Cambridge University Press, Cambridge (1995)
10. Fanni, A., Giua, A., Sanna, N.: Control and error recovery of Petri net models with event observers. In: Proceeding of Second International Workshop on Manufacturing and Petri Nets, Toulouse, France, pp. 53–68 (1997)
11. Giua, A.: State estimation and fault detection using Petri nets. In: Kristensen, L.M., Petrucci, L. (eds.) Proceedings of 32nd International Conference on Applications and Theory of Petri Nets 2011, Newcastle, UK, June 20–24, 2011. Lecture Notes in Computer Science, vol. 6709, pp. 419–428. Springer, New York (2011)
12. Guo, Z., et al.: Failure recovery: when the cure is worse than the disease. In: Proceedings of 14th Workshop on Hot Topics in Operating Systems, Santa Ana Pueblo, New Mexico, USA, May 13–15 2013, USENIX, Berkeley. https://www.usenix.org/conference/hotos13/failure-recovery-when-cure-worse-disease (2013)
13. Haar, S.: Types of Asynchronous Diagnosability and the Reveals-Relation in Occurrence Nets. Research Report RR-6902. INRIA, Rennes (2009)
14. Huang, Z., Chandra, V., Jiang, S., Kumar, R.: Modeling discrete event systems with faults using a rules based modeling formalism. Math. Comput. Model. Dyn. Syst. 9(3), 233–254 (2003)
15. Leveson, N.G., Stolzy, J.L.: Safety analysis using Petri nets. IEEE Trans. Softw. Eng. SE-13(3), 386–397 (1987)
16. Li, B., Khlif-Bouassida, M., Toguyéni, A.: On-the-Fly Diagnosability Analysis of Labeled Petri Nets Using T-invariants. IFAC-Papers OnLine 48-7, pp. 064–070. Elsevier, Amsterdam (2015)
17. Liu, B.: An Efficient Approach for Diagnosability and Diagnosis of DES Based on Labeled Petri Nets: Untimed and Timed Contexts. Ph.D. Thesis, Laboratoire d'Automatique, Génie Informatique et Signal, École Centrale de Lille, Lille (2014)
18. Murata, T.: Petri nets: properties, analysis and applications. Proc. IEEE 77, 541–580 (1989)
19. Odrey, N.G.: Error recovery in production systems: a Petri net based intelligent system approach. In: Kordic, V. (ed.) Petri Net, Theory and Applications, pp. 302–336. I-Tech Education and Publishing, Vienna (2008)
20. Peterson, J.L.: Petri Net Theory and the Modeling of Systems. Prentice-Hall Inc., Englewood Cliffs (1981)
21. Ramaswamy, S., Valavanis, K.P.: Modeling, analysis and simulation of failures in a materials handling system with extended Petri nets. IEEE Trans. Syst. Man Cybern. 24(9), 1358–1373 (1994)
22. Ramírez-Treviño, A., Ruiz-Beltrán, A.E., Rivera-Rangel, I., López-Mellado, E.: Online fault diagnosis of discrete event systems. A Petri net-based approach. IEEE Trans. Autom. Sci. Eng. 4(1), 31–39 (2007)
23. Ramírez-Treviño, A., Ruiz-Beltrán, A.E., Arámburo-Lizárraga, J., López-Mellado, E.: Structural diagnosability of DES and design of reduced Petri net diagnosers. IEEE Trans. Syst. Man Cybern. A 42(2), 416–429 (2012)
24. Sessego, F., Giua, A., Seatzu, C.: HYPENS: a matlab tool for timed discrete, continuous and hybrid petri nets. In: van Hee, K.M., Valk, R. (eds.) Applications and Theory of Petri Nets, LNCS, vol. 5062, pp. 419–428. Springer, New York (2008)
25. Urban, S.D., et al.: The assurance point model for consistency and recovery in service composition. In: Innovations, Standards and Practices of Web Services: Emerging Research Topics, Chapter 12, pp. 250–287, IGI Global (2012)
26. Wen, Y., Jeng, M.: Diagnosability analysis based on T-invariants of Petri nets. In: Proceedings of 2005 IEEE International Conference on Networking, Sensing and Control, March 2005, pp. 371–376 (2005)
27. Zaytoon, J., Lafortune, S.: Overview of fault diagnosis methods for discrete event systems. Annu. Rev. Control 37, 308–320 (2013)

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Failures in discrete-event systems and dealing with them by means of Petri nets

Publisher
Springer Berlin Heidelberg
Copyright
Copyright © 2018 by The Author(s)
Subject
Computer Science; Information Systems and Communication Service; Artificial Intelligence (incl. Robotics); Computer Applications; e-Commerce/e-business; Computer Systems Organization and Communication Networks; Computational Intelligence
ISSN
2196-8888
eISSN
2196-8896
D.O.I.
10.1007/s40595-018-0110-3

Abstract

An approach based on Petri nets for dealing with failures in discrete-event systems is presented. It uses the reachability tree and/or reachability graph of the Petri net-based model of the real system, as well as the synthesis of a supervisor to remove possible deadlocks. To illustrate the applicability of the approach to the detection and recovery of failures in DES modelled by Petri nets, a case study on a railroad crossing is introduced.

Keywords: Detection · Discrete-event systems · Failure · Modelling · Petri nets · Reachability · Recovery

1 Introduction

A failure can be defined [2,3,10] as a deviation of a system from its intended (normal) behaviour. The process of detecting a potential failure in the system behaviour, followed by isolating the cause or the source of the failure, is called system diagnosis. In the diagnosis of discrete-event systems (DES), faults may correspond to any discrete event. Unfortunately, the predisposition of systems to fail increases with their complexity. Research effort has therefore been spent on the development of diagnostic systems. It is necessary to distinguish [2,14], according to the manner in which faults are reset after they occur, between permanent and intermittent faults. In the case of a permanent fault, the recovery event occurs only due to repairing the controllable and observable fault. In the case of an intermittent fault, the recovery event occurs either spontaneously or due to repairing, and such an event tends to be uncontrollable and unobservable. Fault diagnosability [27] is concerned with whether the system is diagnosable or not, i.e., whether the system can detect the occurrence of the fault in a finite number of steps.

Error recovery is [12,19,25] the set of actions that must be performed in order to return the system to its normal state. At least one sequence of actions should exist that brings the system back into its normal operation. When more such sequences exist, the best one is chosen with respect to a prescribed criterion. Usually it is the sequence of actions which minimally disorganizes the system.

For systems without failures, Petri nets (PN) are very useful for modelling, analysis and control synthesis. However, practically no system is failure-free. Failures can emerge in any device or software. Fortunately, PN can also be used [3,7,10,11,15–17,19,21–23,26] for systems where failures occur. These failures can be categorized into hardware failures and software failures. To minimize hardware failures of devices, it is necessary to carry out maintenance of the devices in time, to test and/or check them, and to replace their components in time. To decrease the occurrence of software failures, fault-tolerant software techniques are necessary. Error recovery is possible only for the so-called soft failures [6]. Hard (catastrophic) failures in systems are classified as functional and/or structural failures.

Strategies and forms for the detection and recovery of soft failures of a system are based on the so-called error treatment and failure treatment. The error treatment comprises error detection, damage assessment and error recovery. The failure treatment includes localization, identification, system repair and continued service. Hard failures, however, can be overcome in most systems by means of redundancy [1].

This paper is an expanded version of the paper [5] presented at the conference ACIIDS 2017, Kanazawa, Japan, April 3–5, 2017.

František Čapkovič, Frantisek.Capkovic@savba.sk, Institute of Informatics, Slovak Academy of Sciences, Dúbravská cesta 9, 845 07 Bratislava, Slovakia

Vietnam Journal of Computer Science (2018) 5:143–155

In this paper, failures in DES and their recovery are examined by means of Petri nets (PN).
DES are systems that are discrete by nature. They persist in a steady state until the occurrence of a discrete event, which causes their transition into another state. Typical representatives of DES are discrete manufacturing systems, transport systems, communication systems, etc. PN are frequently used for DES modelling, analysis and control synthesis.

1.1 Preliminaries about Petri nets

Petri nets (PN) [8,18,20] are, as to their structure, bipartite directed graphs, i.e., graphs with two kinds of nodes (places and transitions) and two kinds of edges (arcs directed from places to transitions and arcs directed from transitions to places): $\langle P, T, F, G \rangle$ with $P \cap T = \emptyset$ and $P \cup T \neq \emptyset$ ($\emptyset$ is the empty set), where $P$, $|P| = n$, is a finite set of places and $T$, $|T| = m$, is a finite set of transitions; $F \subseteq P \times T$ and $G \subseteq T \times P$ are the sets of directed arcs. The set $B = F \cup G$ contains all directed arcs. The so-called preset (the set of input places) of a transition $t$ is defined as ${}^{(p)}t = \{p \mid (p, t) \in B\}$, while the so-called postset (the set of output places) of $t$ is defined as $t^{(p)} = \{p \mid (t, p) \in B\}$. Conversely, the preset of a place $p$ (the set of its input transitions) is defined as ${}^{(t)}p = \{t \mid (t, p) \in B\}$, while the postset of $p$ (the set of its output transitions) is defined as $p^{(t)} = \{t \mid (p, t) \in B\}$. A P/T PN is said to be pure if no self-loops occur in it, i.e., if for $p \in P$, $t \in T$: $(p, t) \in B \Rightarrow (t, p) \notin B$.

Places model particular activities or operations of the modelled DES, which is a real object (plant). This is expressed by putting tokens inside the places. Such a marking $m$ is a vector $m: P \to \mathbb{Z}_{\ge 0}$ ($\mathbb{Z}_{\ge 0}$ denotes the non-negative integers). The marking enables a set of transitions $\tau \subseteq T$. Namely, $\forall p \in P$: $m(p) \ge |p^{(t)} \cap \tau|$, i.e., $m(p)$ is greater than or equal to the number of transitions in $\tau$ for which $p$ is an input place. The enabled transitions may be (but need not be) fired. After their firing the PN marking is changed.

As to the marking development (the marking propagation can be understood as the PN dynamics), the PN can be formally defined as $\langle X, U, \delta, x_0 \rangle$, where $X$ is a set of PN states, $U$ is a set of discrete events, $\delta: X \times U \to X$ symbolizes the fact that the new state of marking depends on the existing state and on the discrete event that occurred, and $x_0 \in X$ is the initial state of marking. The state equation (PN model of DES) is as follows:

$$x_{k+1} = x_k + B \cdot u_k, \quad k = 0, 1, \ldots, N, \tag{1}$$

$$F \cdot u_k \le x_k, \tag{2}$$

where $B = G - F$. It expresses the PN dynamics. Here, $x_k = (\sigma^k_{p_1}, \ldots, \sigma^k_{p_n})^T$ with entries $\sigma^k_{p_i} \in \{0, 1, \ldots, \infty\}$, representing the states of particular places, is the PN state vector in the $k$-th step of the dynamics development; $u_k = (\gamma^k_{t_1}, \ldots, \gamma^k_{t_m})^T$ with entries $\gamma^k_{t_i} \in \{0, 1\}$, representing the states of particular transitions (enabled when 1, disabled when 0), is the control vector; $F$, $G$ are the incidence matrices of the arcs corresponding, respectively, to the sets $F$, $G$ mentioned above.

A firing sequence from the initial state $x_0$ (i.e., from the initial marking $m_0$) is a sequence of transition sets $\mathcal{T} = \{\tau_1 \tau_2 \ldots \tau_k\}$ such that $x_0 [\tau_1\rangle x_1 [\tau_2\rangle x_2 \cdots x_{k-1} [\tau_k\rangle x_k$. The set may, of course, also be empty. The notation $x_0 [\mathcal{T}\rangle$ denotes that the sequence $\mathcal{T}$ can be fired at $x_0$, and the notation $x_0 [\mathcal{T}\rangle x_k$ denotes that the firing of $\mathcal{T}$ yields $x_k$.

More than one transition may be enabled at any instant. Thus there are two possibilities: (i) to fire more than one transition at any instant (concurrency assumption); (ii) to fire only one of them at any instant [no concurrency (NC) assumption]. Under the NC assumption, each $\tau_i$ is a singleton set, and $\mathcal{T}$ is a sequence of transitions. In general, the state $x_k$ is reachable from $x_0$ if there exists a firing sequence $\mathcal{T}$ such that $x_0 [\mathcal{T}\rangle x_k$. For a PN, the set of reachable state vectors is $R(PN, x_0)$. All these vectors form the columns of the matrix $X_{reach}$.

The PN reachability tree (RT) expresses all states reachable from $x_0$ as well as how (by firing which transitions) they can be reached. Thus, the nodes of the RT are labelled with the actual PN markings (state vectors) and the arcs are labelled with the transitions between the states. The RT root is the initial state $x_0$ and the RT leaves are states reachable from $x_0$. Connecting the leaves with the same name yields the reachability graph (RG).

The PN T-invariants and P-invariants [9,13,18] are important too, respectively for diagnosability [16] and for supervision [4] (and subsequently for the elimination of deadlocks). While T-invariants restore an initial state, P-invariants ensure token preservation. A T-invariant $v$ is a solution of the equation $B v = 0$. A P-invariant $y$ is a solution of the equation $B^T y = 0$. For any state $x$ reachable from $x_0$ the relation $y^T \cdot x = y^T \cdot x_0$ is valid. This fact was utilized in the supervisor synthesis [4] based on P-invariants.

To express time, timed Petri nets (TPN) can be used, where time is assigned to the transitions by means of the duration function $D: T \to \mathbb{Q}_{\ge 0}$, where $\mathbb{Q}_{\ge 0}$ denotes the non-negative rational numbers.

To illustrate the PN-based approach to the detection and recovery of failures in DES modelled by PN, a case study is introduced below.

This paper is an expanded version of the paper [5] presented at the conference ACIIDS 2017. In comparison with the conference paper, the part concerning the safety of technical systems in general was added. Because the case study concerns accidents on a railroad crossing, illustrations of the formidable effects of such accidents were introduced. Also, the passage concerning the supervisor synthesis was modified to be more comprehensible to readers.

2 Safety of technical systems

The safety of different kinds of technical systems is very important, especially in the case of systems where human life is endangered. From this point of view, transport systems belong to the systems where human life is often endangered. At present, people are directly endangered by contact with transport systems throughout the whole day. Mass transport is dangerous not only for road users who are crossing a road as pedestrians but also for car drivers and their fellow passengers. For example, car collisions occur very frequently. Likewise, collisions on railroad crossings are not unusual. Even in a country as small as Slovakia, tragic collisions of cars and/or trucks with trains occur practically every month (see, e.g., Fig. 1). A train, having a many times bigger mass, speed and consequently also dynamics, destroys not only human lives (inside the road vehicles and the train) but also the vehicles, and sometimes the train itself ends up completely destroyed.

Fig. 1 Illustrative examples of such accidents

During the last several years such collisions caused many casualties (130 human lives) and huge material damage. Consequently, it is necessary to be concerned with such problems and to find possibilities for improving safety in this area. PN can also help along this line. Of course, it is impossible to anticipate failures caused by people themselves. Failures due to human behaviour, such as absent-mindedness, wilful and wanton acts of law breaking, infringement of traffic regulations, etc., cannot be removed simply. To prevent bad habits, education or training and, in extreme cases, punishment are necessary. The only right way to improve the safety of systems is to increase the reliability of the software and equipment. The following simple case study on a railroad crossing offers an approach to doing this in such a case.

2.1 Case study on simple railroad crossing

Consider a simple railroad crossing where the crossing gate prevents a direct contact of vehicles on the road with trains. The PN model of such a system consists of three cooperating sub-models expressing, in Fig. 2 (left), the behaviour of the train, the crossing gate and the control system. Here, the meaning of the places in the failure-free case is the following: (i) the train has the states $p_1$ = approaching the crossing, $p_2$ = being before the crossing, $p_3$ = being within the crossing, $p_4$ = being after the crossing; (ii) the barrier of the crossing gate has the states $p_{11}$ = it is up, $p_{12}$ = it is down, and the transitions $t_6$ and $t_7$ model, respectively, the events of raising and lowering the barrier; (iii) the control system has the states $p_5$, $p_6$, $p_7$, $p_8$, $p_9$, $p_{10}$; (iv) the place $p_{13}$ represents the interlock giving the warning signal for the train that the barrier is still up.

Fig. 2 The PN model of the failure-free case together with its RT (left) and the PN model with three potential failures (right)

The reachable states $x_i$, $i = 0, \ldots, 7$ (RT/RG nodes $N_{i+1}$), of the failure-free system are expressed as the rows of the matrix

$$X_{reach} = (\,\cdots\,) \tag{3}$$

[Matrix (3), 8 × 13: only its last row, $x_7^T = (0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0)$, is recoverable from the source.]

The RT is displayed next to the failure-free PN model in Fig. 2. It is simple, without any branching.

However, three potential failures can occur, one in each subsystem. They are expressed by means of the failure transitions $t_{f_2}$, $t_{f_5}$, $t_{f_6}$ given in Fig. 2 (right). The transition $t_{f_2}$ takes a token from $p_2$ and puts a token into $p_3$ out of the correct sequence, $t_{f_6}$ does the same for $p_{12}$ and $p_{11}$, and $t_{f_5}$ involves an erroneous generation of a token in $p_{10}$, which directly influences the position of the barrier. Thus, $t_{f_2}$ represents a human failure (when the engine-driver overlooks or ignores the warning signal), $t_{f_6}$ expresses the failure of the crossing gate (when a premature raising of the gate occurs or the gate is mechanically damaged), and $t_{f_5}$ represents a control system failure (when an illegitimate signal occurs).

It is practically impossible to recover from the human failure of the engine-driver. Likewise, the technical problem in the crossing gate caused by a wrong function of the barrier raising/lowering can hardly be recovered. However, the erroneous function of the control system can be detected and recovered. Consequently, let us consider in Fig. 2 (right) only the failure represented by $t_{f_5}$ and neglect the failures represented by the transitions $t_{f_2}$ and $t_{f_6}$. Then the coverability tree and graph are given in Fig. 3.

Fig. 3 The coverability tree (left) and coverability graph (right) of the PN model with $t_{f_5}$ at the infinite number of possible occurrences of the failure

The reachable states of this model (nodes of the RT/RG) are given as the columns of the following matrix:

$$X_{reach} = \left(\begin{array}{*{22}{c}}
1&0&1&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0\\
0&1&0&1&1&1&1&0&1&1&0&0&0&1&0&0&0&0&0&0&0&0\\
0&0&0&0&0&0&0&1&0&0&0&1&1&0&0&0&0&1&0&0&0&0\\
0&0&0&0&0&0&0&0&0&0&1&0&0&0&1&1&1&0&1&1&1&1\\
0&1&0&0&1&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0\\
1&1&1&0&1&0&0&0&0&0&0&0&0&0&1&0&0&0&1&1&0&1\\
0&0&0&1&0&1&1&1&1&1&1&1&1&1&0&1&1&1&0&0&1&0\\
0&0&0&0&0&0&0&0&0&0&1&0&0&0&0&1&1&0&0&0&1&0\\
0&0&0&1&0&0&1&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0\\
1&1&\omega&1&\omega&1&\omega&1&0&\omega&1&0&\omega&\omega&2&0&\omega&\omega&1&\omega&\omega&\omega\\
1&1&1&1&1&0&1&0&1&0&0&1&0&1&0&1&0&1&1&0&1&1\\
0&0&0&0&0&1&0&1&0&1&1&0&1&0&1&0&1&0&0&1&0&0\\
0&0&0&0&0&1&0&0&1&1&0&0&0&1&0&0&0&0&0&0&0&0
\end{array}\right) \tag{4}$$

It can be seen that with an infinite number of occurrences of $t_{f_5}$, one half of the 22 states have self-loops (see Fig. 3, right), which are expressed by the symbol $\omega$.

In order to generate only a finite number of occurrences of the failure $t_{f_5}$, the place $p_{14}$ was added to the previous 13 places (see Fig. 4).
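The $\omega$ entries in (4) mark a place whose token count can grow without bound. The following is a minimal sketch of how such entries arise in a coverability construction in the style of the classical Karp-Miller algorithm; the source does not state which algorithm produced Fig. 3, so this is an assumption. The two-place net, the matrices `F` and `G` and all function names are hypothetical and are not the railroad model.

```python
OMEGA = float("inf")  # stands for the symbol omega

# Hypothetical net (an assumption, not the railroad model):
# t0 keeps p0 marked and adds one token to p1 at every firing, mimicking a
# failure that keeps generating tokens; t1 consumes one token from p1.
# Rows = places, columns = transitions.
F = [[1, 0],
     [0, 1]]
G = [[1, 0],
     [1, 0]]
x0 = (1, 0)


def enabled(x, t):
    """Transition t is enabled when column t of F is covered by x."""
    return all(x[p] >= F[p][t] for p in range(len(x)))


def fire(x, t):
    """One step of the state equation: x + (G - F) e_t."""
    return tuple(x[p] + G[p][t] - F[p][t] for p in range(len(x)))


def accelerate(y, ancestors):
    """If y strictly covers an ancestor, set the growing entries to omega."""
    y = list(y)
    for a in ancestors:
        if all(yi >= ai for yi, ai in zip(y, a)) and tuple(y) != a:
            y = [OMEGA if yi > ai else yi for yi, ai in zip(y, a)]
    return tuple(y)


def coverability_nodes(x0):
    """Stack-based (depth-first) construction; returns distinct node labels."""
    labels = []
    stack = [(x0, [])]  # (marking, markings on the path from the root)
    while stack:
        x, path = stack.pop()
        if x in path:  # same label as an ancestor: do not expand again
            continue
        if x not in labels:
            labels.append(x)
        for t in range(len(F[0])):
            if enabled(x, t):
                y = accelerate(fire(x, t), path + [x])
                stack.append((y, path + [x]))
    return labels


nodes = coverability_nodes(x0)
```

In this toy net the second place becomes $\omega$ after a single firing of `t0`, so the construction stops with two labels although the set of reachable markings is infinite. Adding a place that limits how often the failing transition may fire, as $p_{14}$ does in the paper, makes the net bounded again.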
In general, the failure can occur several times. The more often the failure occurs, the more complicated the structure and the higher the dimensionality of the RT (RG). Therefore, only a single occurrence is assumed here, as displayed in Fig. 4, in order to demonstrate how to deal with the failure. In the case of more failures the process will be more complicated.

Fig. 4 The PN model of the system with the failure represented by $t_{f_5}$

The model parameters are the incidence matrix $F$ [14 × 10; its entries are not recoverable from the source] and

$$G = \left(\begin{array}{*{10}{c}}
0&0&0&0&0&0&0&0&0&0\\
1&0&0&0&0&0&0&0&0&0\\
0&1&0&0&0&0&0&0&0&0\\
0&0&1&0&0&0&0&0&0&0\\
1&0&0&0&0&0&0&0&0&0\\
0&0&0&0&1&0&0&0&0&0\\
0&0&0&1&0&0&0&0&0&1\\
0&0&1&0&0&0&0&0&0&0\\
0&0&0&1&0&0&0&0&0&0\\
0&0&0&0&1&0&0&1&0&0\\
0&0&0&0&0&1&0&0&1&0\\
0&0&0&0&0&0&1&0&0&1\\
0&0&0&0&0&0&1&0&0&0\\
0&0&0&0&0&0&0&0&0&0
\end{array}\right), \qquad x_0 = (1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1)^T. \tag{5}$$

Then, the RT and RG of the failed system are given in Fig. 5. It can be seen that the number of states as well as the RT/RG structure are completely different in comparison with the RT of the failure-free system in Fig. 2. Namely, branching occurs here. The states (nodes of the RT/RG) are the columns of the matrix

$$X_{reach} = (\,\cdots\,) \tag{6}$$

[Matrix (6), 14 × 19: its entries are not recoverable from the source.]

Fig. 5 The RT of the system with a finite number (namely only one in this case) of possible occurrences of the failure represented by $t_{f_5}$ (left) and the corresponding RG (right)

The RT has 19 nodes. However, with a growing number of occurrences of the failure, the RT/RG dimensionality and complexity escalate. When $\sigma_{p_{14}} = 2$ the RT has 30 nodes, when $\sigma_{p_{14}} = 5$ the RT has 63 nodes, when $\sigma_{p_{14}} = 10$ the RT has 118 nodes, etc.
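How the number of RT/RG nodes depends on the number of tokens in a place that limits the failure can be reproduced on a small example. The sketch below enumerates the reachability graph breadth-first under the NC assumption, using the enabling condition (2) and the state equation (1) with $B = G - F$. The four-place net is hypothetical (it is not the railroad model, whose matrix $F$ is not available here), and all names are illustrative.

```python
from collections import deque

# Hypothetical net (an assumption): places a, b, c, budget.
# Columns: t0 (a -> b), t1 (b -> a),
#          tf (failure: b + budget -> c), tr (recovery: c -> a).
F = [[1, 0, 0, 0],   # a
     [0, 1, 1, 0],   # b
     [0, 0, 0, 1],   # c
     [0, 0, 1, 0]]   # budget
G = [[0, 1, 0, 1],   # a
     [1, 0, 0, 0],   # b
     [0, 0, 1, 0],   # c
     [0, 0, 0, 0]]   # budget is never refilled


def enabled(x, t):
    """Condition (2) for a single transition: column t of F is covered by x."""
    return all(x[p] >= F[p][t] for p in range(len(F)))


def fire(x, t):
    """State equation (1) with u_k the unit vector of t and B = G - F."""
    return tuple(x[p] + G[p][t] - F[p][t] for p in range(len(F)))


def reachability_graph(x0):
    """Breadth-first enumeration; terminates only for bounded nets."""
    nodes, arcs, queue = [x0], [], deque([x0])
    while queue:
        x = queue.popleft()
        for t in range(len(F[0])):
            if enabled(x, t):
                y = fire(x, t)
                arcs.append((x, t, y))
                if y not in nodes:
                    nodes.append(y)
                    queue.append(y)
    return nodes, arcs


def deadlocks(nodes):
    """States in which no transition is enabled."""
    return [x for x in nodes
            if not any(enabled(x, t) for t in range(len(F[0])))]


# Number of reachable states for 1, 2 and 3 admitted failure occurrences.
sizes = {s: len(reachability_graph((1, 0, 0, s))[0]) for s in (1, 2, 3)}
```

For this toy net the number of states grows linearly with the budget (5, 8 and 11 states), which mirrors qualitatively the growth reported for $\sigma_{p_{14}}$. The same `deadlocks` helper can be used to look for states such as $N_{19}$ once a complete model is available.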
Although the procedure of RT computation remains the same, the computational time increases correspondingly.

To detect and recover the failure(s), we have to distinguish whether the barrier is down or up. When the train is approaching, in the standard situation (without any failure) the barrier is down. However, in the non-standard situation (when the failure $t_{f_5}$ occurs) the barrier is going up. This is a very dangerous situation, critical as to safety. To detect the failure it is necessary to have redundant information. It must be contained in the control system itself. Because $p_6$ and $p_7$ in the control system correspond to $p_{11}$ and $p_{12}$ in the real crossing gate, the failure is detected by checking whether $p_7$ and $p_{11}$ are active simultaneously. If so, there is a contradiction between the real (i.e., faulty) situation and the standard one. After detecting the failure, a kind of recovery can be applied. It depends on which case is accepted as the true state. When it is supposed that the barrier is up and it drops down, the recovery is realized by means of a recovery transition $t_r$. The PN model of the recovered system is given in Fig. 6.

Fig. 6 The PN model of the final recovered system

A more detailed analysis is possible by means of the RT and/or RG in Fig. 7, using the information about the nodes given by the matrix (7). When the barrier is up and no train is approaching, the situation is considerably simpler. Namely, by virtue of a recovery transition $t_r$, the fail signal $p_{10}$ from the control system and the activity of a further place (its index is not legible in the source) guarantee that the fail signal can be practically ignored. The model has 30 states, being the columns of the following matrix:

$$X_{reach} = (\,\cdots\,) \tag{7}$$

[Matrix (7), 14 × 30: its entries are not recoverable from the source.]

The RT and RG of the recovered system are given in Fig. 7. However, the deadlock $N_{19}$ ($x_{18}$) occurs there.

Fig. 7 The RT of the recovered system (left) and the corresponding RG (right)

2.2 Supervisor synthesis for deadlock(s) elimination

A general approach to the supervisor synthesis based on P-invariants of PN was presented in [4]. Suitable linear combinations of entries of the state vector $x$ (i.e., $L \cdot x$) are restricted by means of entries of the constant vector $b$, i.e., $L \cdot x \le b$, $L \in \mathbb{Z}_{\ge 0}^{n_s \times n}$, $b \in \mathbb{Z}_{\ge 0}^{n_s \times 1}$. In a nutshell, the supervisor synthesis is then as follows. Remove the inequality by adding the vector $x_s$ of the so-called slacks, i.e.,

$$L \cdot x \le b, \tag{8}$$

$$L \cdot x + I_s \cdot x_s = b, \tag{9}$$

where $I_s$ is the $(n_s \times n_s)$-dimensional identity matrix. Now suppose that $Y$ is a matrix of invariants of the extended PN model (the model of the plant and the supervisor together). Then

$$Y^T \cdot (B^T \; B_s^T)^T = 0. \tag{10}$$

Let us define

$$Y^T \triangleq (L \;\; I_s). \tag{11}$$

After multiplying the matrices in (10), we obtain

$$L \cdot B + I_s \cdot B_s = 0, \tag{12}$$

$$B_s = -L \cdot B = G_s - F_s. \tag{13}$$

The initial state vector of the supervisor follows from (9) in the form

$$x_{s,0} = b - L \cdot x_0. \tag{14}$$

$B_s$ is the supervisor structure and $x_{s,0}$ is its initial state; $F_s$, $G_s$ are the incidence matrices of the supervisor. Consequently, $B^{s} = (B^T \; B_s^T)^T$ is the structure of the supervised system (i.e., the original system plus the supervisor) and $x^{s}_0 = (x_0^T \; x_{s,0}^T)^T$ is its initial state.

Let us deal with the deadlock state $x_{18}$ by synthesizing a suitable supervisor. Because the deadlock state is $N_{19}$, i.e., $x_{18} = (0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0)^T$, we have to avoid its activation. Consider $L = (0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0)$ and $b = 3$, i.e., at most three of the places $p_2$, $p_7$, $p_9$, $p_{12}$ can be active together. Then the supervisor structure is $B_s = (-1, 1, 0, -2, 1, 1, 0, 0, 0, -1)$. After the break-up of $B_s$, the incidence matrices of the supervisor arcs are obtained: $F_s = (1, 0, 0, 2, 0, 0, 0, 0, 0, 1)$ and $G_s = (0, 1, 0, 0, 1, 1, 0, 0, 0, 0)$. The initial state of the supervisor is $x_{s,0} = 3 - 0 = 3$. The supervisor is incorporated into the PN model of the recovered system given in Fig. 6. Consequently, the PN model changes into the form given in Fig. 8. Its RT and RG are in Fig. 9.

Fig. 8 The PN model of the system with the recovered failure and the deadlock removed by means of the supervisor

Fig. 9 The RT of the recovered system with the deadlock removed by means of the supervisor (left) and the corresponding RG (right)

The reachable states of the deadlock-free recovered system are given as the columns of the matrix

$$X_{reach} = (\,\cdots\,) \tag{15}$$

[Matrix (15): its entries are not recoverable from the source.]
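The supervisor data quoted above can be re-derived in a few lines. The sketch below takes $L$, $b$, $B_s$, $x_{18}$ and the initial state $x_0$ from (5) as given in the text; the plant matrix $B$ itself is not available here, so $B_s$ is used as quoted rather than computed from (13). The helper `dot` and all variable names are illustrative, and the break-up of $B_s$ assumes that no transition is both an input and an output of the supervisor place.

```python
# Values quoted in the text (constraint L.x <= b, eqs. (8)-(14)).
L   = [0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
b   = 3
x0  = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1]   # initial state from (5)
x18 = [0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0]   # deadlock state N19
B_s = [-1, 1, 0, -2, 1, 1, 0, 0, 0, -1]            # supervisor structure


def dot(u, v):
    """Scalar product of two equally long vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


# Eq. (14): initial marking of the supervisor place.
x_s0 = b - dot(L, x0)

# Break-up of B_s = G_s - F_s (assumption: no self-loop on the supervisor
# place): negative entries are arcs consuming from the supervisor place,
# positive entries are arcs producing into it.
F_s = [max(-v, 0) for v in B_s]
G_s = [max(v, 0) for v in B_s]

# The deadlock state would need L.x18 tokens "in use", which exceeds b,
# so the slack b - L.x18 would be negative and the state is excluded.
deadlock_forbidden = dot(L, x18) > b
```

Here `dot(L, x18)` equals 4, which exceeds $b = 3$, so the deadlock marking violates (8) and cannot be reached in the supervised net, while the initial state satisfies the constraint with slack 3.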
2.3 Time views on results

To give an idea of the time relations, let us use a TPN with the time parameters of the transitions (delays in a time unit) defined by $D = 0.2 \times (1, 1, 1, 1, 1, 2, 2, 0.1, 0.05, 0.05)$, where the first 7 parameters concern the transitions $t_1$ to $t_7$, the 8th parameter is assigned to $t_{f_5}$, the 9th parameter concerns $t_{r_2}$ and finally the 10th parameter concerns $t_{r_1}$. Simulation in Matlab by means of the tool HYPENS [24] brings the results given in Figs. 10, 11 and 12.

Fig. 10 The courses of marking of the places $p_1$ to $p_4$ with respect to (wrt.) time

Fig. 11 The courses of marking of the places $p_5$ to $p_{12}$ wrt. time. The marking of $p_{10}$ is directly influenced by $t_{f_5}$ (i.e., by a failure)

Fig. 12 The courses of marking of the places $p_{13}$ to $p_{15}$ wrt. time. The place $p_{15}$ expresses the state (marking) of the supervisor

Up to now, deterministic timing of all transitions was used, including $t_{f_5}$. To make sure that non-deterministic timing of $t_{f_5}$ does not affect the results, consider for $t_{f_5}$ the discrete uniform probability distribution of timing: $f^{u}(x) = 1/(b - a)$ if $x \in (a, b)$, otherwise $f^{u}(x) = 0$. Two cases are tested: (i) $a = 0.1$, $b = 1.2$; (ii) $a = 0.3$, $b = 0.7$. The results are given in Fig. 13.

Fig. 13 The courses of marking of the place $p_{10}$ in case (i), the left picture, and in case (ii), the right picture

As can be seen, only the time instant of the failure incidence represented by $t_{f_5}$ manifests itself in the marking of $p_{10}$: compare both pictures in Fig. 13 with each other and both of them with the corresponding part of Fig. 11 containing $p_{10}$. The courses of marking of all other places stay unchanged.

3 Conclusion

The PN-based approach to dealing with failures in DES was presented. It is based on utilizing the RT/RG of the PN-based model of DES. Moreover, the elimination of deadlock(s) by means of supervision (synthesis of a suitable supervisor) based on P-invariants of PN, introduced in [4], was utilized.

The presented approach consists of the following steps: (i) creating the PN model of the investigated kind of DES; (ii) finding its behaviour in the standard (failure-free) situation; (iii) analysing the model with respect to possible failures (in general, each system has its specificity and it is practically impossible to find a unified approach for all systems); (iv) selecting the failures which can be successfully recovered (because there are different kinds of failures and some of them cannot be recovered, e.g., human failures of the engine-driver or a mechanical problem in the crossing gate); (v) finding the structure of the recovered PN model; (vi) testing its behaviour with respect to deadlocks; (vii) removing deadlocks and finding the deadlock-free PN model.

PN were used in all of the steps. They make it possible to create a uniform model of a system and to compute its RT/RG. However, in different systems different states can fail. Hence, the model recovering process is individual. As to the computational complexity of the approach, it corresponds especially to that of computing the RT, which depends on the structure of the PN model in question.

To illustrate the soundness of the procedure, the case study on the simple railroad crossing was introduced. Finally, the deadlock-free recovery model was found. It is necessary to emphasize that there are also failures in DES which cannot be recovered by means of the procedure. They depend on human failures, bad properties and mistakes and/or on a bad technical state of devices. They must be precluded either by better preparation of human operators and/or by better maintenance of devices, their routine testing and/or checking, early replacement of their components, etc.

In the future, a possibility of generalizing the recovery process by means of PN will be investigated.

Acknowledgements The research was partially supported by the Slovak Grant Agency for Science VEGA under Grant # 2/0029/17. The author thanks VEGA for the support.

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
References

1. Bernardi, S. et al.: Model-driven availability evaluation of railway control systems. In: Proceedings of 30th International Conference on Computer Safety, Reliability and Security - SAFECOMP 2011, Naples, Italy. LNCS vol. 6894, pp. 15–28, Springer (2011)
2. Cabasino, M.P., Giua, A., Pocci, M., Seatzu, C.: Discrete event diagnosis using labeled Petri nets. An application to manufacturing systems. Control Eng. Pract. 19(9), 989–1001 (2011)
3. Cabasino, M.P., Giua, A., Lafortune, S., Seatzu, C.: New approach for diagnosability analysis of Petri nets using verifier nets. IEEE Trans. Autom. Control 57(12), 3104–3117 (2012)
4. Čapkovič, F.: Petri net-based synthesis of agent cooperation by means of modularity and supervision principles. In: Dimirovski, G.M. (ed.) Complex Systems. Relationships Between Control, Communications and Computing, Chapter 20, Springer Series: Studies in Systems, Decision and Control, pp. 429–450. Springer, Cham (2016)
5. Čapkovič, F.: Failures in discrete event systems and dealing with them by means of Petri nets. In: Nguyen, N.T., et al. (eds.) ACIIDS 2017, Part I, LNAI 10191, pp. 379–391. Springer, Cham (2017)
6. Chang, S.J., DiCesare, F., Goldbogen, G.: Failure propagation trees for diagnosis in manufacturing systems. IEEE Trans. SMC 21(4), 767–776 (1991)
7. Chung, S., Wu, C., Jeng, M.: Failure diagnosis: a case study on modeling and analysis by Petri nets. In: Proceedings of IEEE International Conference on Systems, Man & Cybernetics, Washington, DC, 5–8 October 2003, pp. 2727–2732 (2003)
8. Desel, J., Reisig, W.: Place/transition Petri nets. In: Reisig, W., Rozenberg, G. (eds.) Lectures on Petri Nets I: Basic Models. Advances in Petri Nets, LNCS, vol. 1491, pp. 122–173. Springer, Heidelberg (1998)
9. Desel, J., Esparza, J.: Free Choice Petri Nets. Cambridge Tracts in Theoretical Computer Science, vol. 40. Cambridge University Press, Cambridge (1995)
10. Fanni, A., Giua, A., Sanna, N.: Control and error recovery of Petri net models with event observers. In: Proceeding of Second International Workshop on Manufacturing and Petri Nets, Toulouse, France, pp. 53–68 (1997)
11. Giua, A.: State estimation and fault detection using Petri nets. In: Kristensen, L.M., Petrucci, L. (eds.) Proceedings of 32nd International Conference on Applications and Theory of Petri Nets 2011, Newcastle, UK, June 20–24, 2011. Lecture Notes in Computer Science, vol. 6709, pp. 419–428, Springer, New York (2011)
12. Guo, Z. et al.: Failure recovery: when the cure is worse than the disease. In: Proceedings of 14th Workshop on Hot Topics in Operating Systems, Santa Ana Pueblo, New Mexico, USA, May 13–15 2013, USENIX, Berkeley. https://www.usenix.org/conference/hotos13/failure-recovery-when-cure-worse-disease (2013)
13. Haar, S.: Types of Asynchronous Diagnosability and the Reveals-Relation in Occurrence Nets. Research Report RR-6902. INRIA, Rennes (2009)
14. Huang, Z., Chandra, V., Jiang, S., Kumar, R.: Modeling discrete event systems with faults using a rules based modeling formalism. Math. Comput. Model. Dyn. Syst. 9(3), 233–254 (2003)
15. Leveson, N.G., Stolzy, J.L.: Safety analysis using Petri nets. IEEE Trans. Softw. Eng. SE–13(3), 386–397 (1987)
16. Li, B., Khlif-Bouassida, M., Toguyéni, A.: On-the-Fly Diagnosability Analysis of Labeled Petri Nets Using T-invariants. IFAC-Papers OnLine 48-7, pp. 064–070. Elsevier, Amsterdam (2015)
17. Liu, B.: An Efficient Approach for Diagnosability and Diagnosis of DES Based on Labeled Petri Nets, Untimed and Timed Contexts. Ph.D. Thesis, Laboratoire d'Automatique, Génie Informatique et Signal, École Centrale de Lille, Lille (2014)
18. Murata, T.: Petri nets: properties, analysis and applications. Proc. IEEE 77, 541–580 (1989)
19. Odrey, N.G.: Error recovery in production systems: a Petri net based intelligent system approach. In: Kordic, V. (ed.) Petri Net, Theory and Applications, pp. 302–336. I-Tech Education and Publishing, Vienna (2008)
20. Peterson, J.L.: Petri Net Theory and the Modeling of Systems. Prentice-Hall Inc., Englewood Cliffs (1981)
21. Ramaswamy, S., Valavanis, K.P.: Modeling, analysis and simulation of failures in a materials handling system with extended Petri nets. IEEE Trans. Syst. Man Cybern. 24(9), 1358–1373 (1994)
22. Ramírez-Treviño, A., Ruiz-Beltrán, A.E., Rivera-Rangel, I., López-Mellado, E.: Online fault diagnosis of discrete event systems. A Petri net-based approach. IEEE Trans. Autom. Sci. Eng. 4(1), 31–39 (2007)
23. Ramírez-Treviño, A., Ruiz-Beltrán, A.E., Arámburo-Lizárraga, J., López-Mellado, E.: Structural diagnosability of DES and design of reduced Petri net diagnosers. IEEE Trans. Syst. Man Cybern. A 42(2), 416–429 (2012)
24. Sessego, F., Giua, A., Seatzu, C.: HYPENS: a matlab tool for timed discrete, continuous and hybrid petri nets. In: van Hee, K.M., Valk, R. (eds.) Applications and Theory of Petri Nets, LNCS, vol. 5062, pp. 419–428. Springer, New York (2008)
25. Urban, S.D. et al.: The assurance point model for consistency and recovery in service composition. In: Innovations, Standards and Practices of Web Services: Emerging Research Topics, Chapter 12, pp. 250–287, IGI Global (2012)
26. Wen, Y., Jeng, M.: Diagnosability analysis based on T-invariants of Petri nets. In: Proceedings of 2005 IEEE International Conference on Networking, Sensing and Control, March 2005, pp. 371–376 (2005)
27. Zaytoon, J., Lafortune, S.: Overview of fault diagnosis methods for discrete event systems. Annu. Rev. Control 37, 308–320 (2013)

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Journal

Vietnam Journal of Computer Science, Springer Journals

Published: May 19, 2018
