Unifying Leakage Models: From Probing Attacks to Noisy Leakage

J Cryptol
https://doi.org/10.1007/s00145-018-9284-1

Alexandre Duc
University of Applied Sciences and Arts Western Switzerland (HES-SO/HEIG-VD), Yverdon-les-Bains, Switzerland

Stefan Dziembowski
University of Warsaw, Warsaw, Poland
S.Dziembowski@crypto.edu.pl

Sebastian Faust
TU Darmstadt, Darmstadt, Germany

Communicated by Kenneth G. Paterson.
Received 17 January 2015 / Revised 19 February 2018

Abstract. A recent trend in cryptography is to formally show the leakage resilience of cryptographic implementations in a given leakage model. One of the most prominent leakage models, the so-called bounded leakage model, assumes that the amount of leakage that an adversary receives is a priori bounded. Unfortunately, several works have pointed out that the assumption of bounded leakage is hard to verify in practice. A more realistic assumption is that leakages are sufficiently noisy, following the engineering observation that real-world physical leakages are inherently perturbed by physical noise. While the seminal work of Chari et al. (in: CRYPTO, pp 398–412, 1999) already studied the security of side-channel countermeasures in the noisy model, only recently did Prouff and Rivain (in: Johansson T, Nguyen PQ (eds) EUROCRYPT, volume 7881 of Lecture Notes in Computer Science, pp 142–159, Springer, 2013) offer a full formal analysis of the masking countermeasure in a physically motivated noise model. In particular, the authors show that a block-cipher implementation that uses the Boolean masking scheme is secure against a very general class of noisy leakage functions. While this is an important step toward better understanding the security of masking schemes, the analysis of Prouff and Rivain has several shortcomings, in particular the requirement of leak-free gates.
In this work, we provide an alternative security proof in the same noise model that overcomes these challenges. We achieve this goal by a new reduction from noisy leakage to the important model of probing adversaries (Ishai et al. in: CRYPTO, pp 463–481, 2003). This reduction is the main technical contribution of our work and significantly simplifies the formal security analysis of masking schemes against realistic side-channel leakages.

Keywords. Leakage-resilient cryptography, Noisy leakage, Probing attacks.

S. Dziembowski: Received funding from the European Research Council under the European Union’s Seventh Framework Programme (FP7/2007-2013)/ERC Grant Agreement Number 207908. Part of this work was done while Stefan Dziembowski was on leave from Sapienza University of Rome, Italy.
S. Faust: Partially funded by the Emmy Noether Program FA 1320/1-1 of the German Research Foundation (DFG).

© The Author(s) 2018
A. Duc et al.

1. Introduction

Physical side-channel attacks that exploit leakage emitted by devices are an important threat to cryptographic implementations. Prominent sources of such physical leakage include the running time of an implementation [20], its power consumption [21], or the electromagnetic radiation it emits [30]. A large body of recent applied and theoretical research attempts to incorporate the information an adversary obtains from the leakage into the security analysis and develops countermeasures to defeat common side-channel attacks [2,6,12,17,24,34,35]. While there is still a large gap between what theoretical models can achieve and what side-channel information is measured in practice, some recent important works propose models that are better aligned with the perspective of cryptographic engineering [28,33,34]. Our work follows this line of research by analyzing the security of a common countermeasure, the so-called masking countermeasure, in the model of Prouff and Rivain [28].
Our analysis works by showing that security in certain theoretical leakage models implies security in the model of [28], and hence may be seen as a first attempt to unify the large class of different leakage models used in recent results.

The Masking Countermeasure. A large body of work on cryptographic engineering has developed countermeasures to defeat side-channel attacks (see, e.g., [22] for an overview). While many countermeasures are specifically tailored to protect particular cryptographic implementations (e.g., key updates or shielded hardware), a method that generically works for most cryptographic schemes is masking [4,16,27,35]. The basic idea of a masking scheme is to secret-share all sensitive information, including the secret key and all intermediate values that depend on it, thereby making the leakage independent of the secret data. The most prominent masking scheme is Boolean masking: a bit $b$ is encoded by a random bit string $(b_1, \ldots, b_n)$ such that $b = b_1 \oplus \cdots \oplus b_n$. The main difficulty in designing masking schemes is to develop masked operations, which securely compute on encoded data and ensure that all intermediate values are protected.

Masking Against Noisy Leakages. Besides the fact that masking can be used to protect arbitrary computation, it has the advantage that it can be analyzed in formal security models. The first work that formally studies the soundness of masking in the presence of leakage is the seminal work of Chari et al. [6]. The authors consider a model where each share $b_i$ of an encoding is perturbed by Gaussian noise and show that the number of noisy samples needed to recover the encoded secret bit $b$ grows exponentially with the number of shares. As stated in [6], this model matches real-world physical leakages, which are inherently noisy. Moreover, many practical solutions exist to amplify leakage noise (see for instance the works of [7,8,22]).
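The Boolean masking encoding described above can be illustrated with a short Python sketch. This is only a minimal illustration under stated assumptions: the function names `encode_bit`, `decode_bit`, and `noisy_shares` are hypothetical and do not come from the paper, and `noisy_shares` merely mimics the flavor of the model of Chari et al. [6] by adding independent Gaussian noise with an assumed standard deviation `sigma` to each share.

```python
import random
import secrets
from functools import reduce
from operator import xor


def encode_bit(b: int, n: int) -> list[int]:
    """Encode bit b as n shares (b_1, ..., b_n) with b = b_1 XOR ... XOR b_n."""
    if b not in (0, 1) or n < 1:
        raise ValueError("b must be a bit and n must be at least 1")
    # The first n-1 shares are uniform; the last one fixes the XOR to b.
    shares = [secrets.randbits(1) for _ in range(n - 1)]
    shares.append(reduce(xor, shares, b))
    return shares


def decode_bit(shares: list[int]) -> int:
    """Recombine the shares by XOR-ing them together."""
    return reduce(xor, shares, 0)


def noisy_shares(shares: list[int], sigma: float) -> list[float]:
    """Hypothetical leakage: each share plus independent Gaussian noise."""
    return [s + random.gauss(0.0, sigma) for s in shares]
```

In this sketch any $n-1$ of the shares are uniformly distributed and independent of $b$, so an observer needs information about all $n$ shares to learn anything about the encoded bit.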
One limitation of the security analysis given in [6] is that it does not consider leakage emitted by masked computation. This shortcoming has been addressed in the recent important work of Prouff and Rivain [28], who at Eurocrypt 2013 extended the noisy leakage model of Chari et al. [6] to also include leakage from the masked operations. Specifically, they show that a variant of the construction of Ishai et al. [17] is secure even when there is noisy leakage from all the intermediate values that are produced during the computation. The authors of [28] also generalize the noisy leakage model of Chari et al. [6] to a wider range of leakage functions instead of considering only the Gaussian one. While noisy leakage is clearly closer to the physical leakage occurring in the real world, the security analysis of [28] has a number of shortcomings that strongly limit the settings in which the masking countermeasure can be used while still achieving the proved security statements. In particular, like earlier works on leakage-resilient cryptography [10,13], the security analysis of Prouff and Rivain relies on so-called leak-free gates. Moreover, security is shown in a restricted adversarial model that assumes that plaintexts are chosen uniformly during an attack and that the adversary does not exploit joint information from the leakages and, e.g., the ciphertext. We discuss these shortcomings in more detail in the next section.

1.1. The Work of Prouff and Rivain [28]

Prouff and Rivain [28] analyze the security of a block-cipher implementation that is masked with an additive masking scheme working over a finite field $\mathbb{F}$. More precisely, let $t$ be the security parameter; then a secret $s \in \mathbb{F}$ is represented by an encoding $(X_1, \ldots, X_t)$ such that each $X_i \leftarrow \mathbb{F}$ is uniformly random subject to $s = X_1 \oplus \cdots \oplus X_t$. As discussed above, the main difficulty in designing secure masking schemes is to devise masked operations that work on masked values. To this end, Prouff and Rivain use the original scheme of Ishai et al. [17] augmented with some techniques from [5,31] to work over larger fields and to obtain a more efficient implementation. The masked operations are built out of several smaller components. First, a leak-free operation that refreshes encodings, i.e., it takes as input an encoding $(X_1, \ldots, X_t)$ of a secret $s$ and outputs a freshly and independently chosen encoding of the same value. Second, a number of leaky elementary operations that work on a constant number of field elements. For each of these elementary operations the adversary is given leakage $f(X)$, where $X$ are the inputs of the operation and $f$ is a noisy function. Clearly, the noise level has to be high enough so that, given $f(X)$, the value of $X$ is not completely revealed. To this end, the authors introduce the notion of a bias, which informally says that the statistical distance between the distribution of $X$ and the conditional distribution $X \mid f(X)$ is bounded by some parameter. While noisy leakages are certainly a step in the right direction to model physical leakage, we detail below some of the limitations of the security analysis of Prouff and Rivain [28]:

1. Leak-Free Components. The assumption of leak-free computation has been used in earlier works on leakage-resilient computation [10,13]. It is a strong assumption on the physical hardware and, as stated in [28], an important limitation of the current proof approach. The leak-free component of [28] is a simple operation that takes as input an encoding and refreshes it. While the computation of this operation is supposed to be completely shielded against leakage, the inputs and the outputs of this computation may leak. Notice that the leak-free component of [28] depends on the computation that is carried out in the circuit by taking inputs.
In particular, this means that the computation of the leak-free component depends on secret information, which makes it harder to protect in practice and is different from earlier works that use leak-free components [10,13].

2. Random Message Attacks. The security analysis is given only for random (known) message attacks. This is in contrast to most works in cryptography, which usually consider at least a chosen message attack. Hence, the proof does not cover chosen plaintext or chosen ciphertext attacks. We notice, however, that it is not clear whether chosen message attacks can improve the adversary’s success probability in the context of DPA attacks [36].

3. Mutual-Information-Based Security Statement. The final statement of Theorem 4 in [28] only gives a bound on the mutual information of the key and the leakages from the cipher. While a mutual information analysis is a common method in side-channel analysis to evaluate the security of countermeasures [33], it has important shortcomings, such as not including information that an adversary may learn from exploiting joint information from the leakages and plaintext/ciphertext pairs. Notice that such use of mutual information becomes particularly problematic under continuous leakage attacks, since multiple plaintext/ciphertext pairs information-theoretically reveal the secret key completely. The more standard security notion used in cryptography and also for the analysis of masking schemes, e.g., in the work of Ishai et al., uses a simulation-based approach and does not have these drawbacks.

1.2. Our Contribution

We show in this work how to eliminate limitations 1–3 by a simple and elegant simulation-based argument and a reduction to the so-called t-probing adversarial setting [17] (which in this paper we call the t-threshold-probing model to emphasize the difference between this model and the random-probing model defined later).
The t-threshold-probing model considers an adversary that can learn the value of t intermediate values that are produced during the computation and is often considered a good approximation for modeling higher-order attacks. We notice that limitation 4 from above is what enables our security analysis. The fact that the noise is independent for each elementary operation allows us to formally prove security under the same noise model as [28], but using a simpler and improved analysis. In particular, we are able to show that the original construction of Ishai et al. satisfies the standard simulation-based security notion under noisy leakages without relying on any leak-free components. We emphasize that our techniques are very different from (and much simpler than) the recent breakthrough result of Goldwasser and Rothblum [15], who show how to eliminate leak-free gates in the bounded leakage model. We will further discuss related works in Sect. 1.3.

Our proof considers three different leakage models and shows connections between them. One may view our work as a first attempt to “reduce” the number of different leakage models, which is in contrast to many earlier works that introduced new leakage settings. Eventually, we are able to reduce security in the noisy leakage model to security in the t-threshold-probing model. This shows that, for the particular choice of parameters given in [28], security in the t-threshold-probing model implies security in the noisy leakage model. This is in line with the common approach of showing security against t-order attacks, which usually requires proving security in the t-threshold-probing model. Moreover, it shows that the original construction of Ishai et al.
that has been used in many works on masking (including the work of Prouff and Rivain) is indeed a sound approach for protecting against side-channel leakages when assuming that they are sufficiently noisy. We give some more details on our techniques below.

From Noisy Leakages to Random Probes. As a first step in our security proof we show that we can simulate any adversary in the noisy leakage model of Prouff and Rivain with an adversary in a simpler noise model that we name a random-probing adversary and that is similar to a model introduced in [17]. In this model, an adversary recovers an intermediate value with probability $\epsilon$ and obtains a special symbol $\bot$ with probability $1 - \epsilon$. This reduction shows that this model is worth studying, although from the engineering perspective it may seem unnatural.

From Random Probes to the t-Threshold-Probing Model. We show how to go from the random-probing adversary setting to the more standard t-threshold-probing adversary of Ishai et al. [17]. This step is rather easy: due to the independence of the noise we can apply Chernoff’s bound almost immediately. One technical difficulty is that the work of Prouff and Rivain considers joint noisy leakage from elementary operations, while the standard t-threshold-probing setting only talks about leakage from wires. Notice, however, that the elementary operations of [28] only depend on two inputs and, hence, it is not hard to extend the result of Ishai et al. to consider a “gate-probing adversary” by tolerating a loss in the parameters. Finally, our analysis enables us to show security of the masking-based countermeasure without the limitations 1–3 discussed above.

Leakage Resilient Circuits with Simulation-Based Security. In our security analysis we use the framework of leakage resilient circuits introduced in the seminal work of Ishai et al. [17].
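Returning briefly to the step from random probes to threshold probes: the following Python sketch simulates a random-probing adversary on a circuit with independent wires and compares the empirical probability of learning many wires with the multiplicative Chernoff bound $\exp(-\xi^2\,\mathbb{E}[Z]/3)$. It is only an illustration under simplifying assumptions; the names `num_wires`, `eps`, `xi`, and all function names are hypothetical and not part of the paper's construction.

```python
import math
import random


def leaked_count(num_wires: int, eps: float, rng: random.Random) -> int:
    """Number of wires revealed when each leaks independently with probability eps."""
    return sum(1 for _ in range(num_wires) if rng.random() < eps)


def chernoff_tail(num_wires: int, eps: float, xi: float) -> float:
    """Chernoff bound on P(Z >= (1 + xi) * E[Z]) for 0 <= xi <= 1, with E[Z] = num_wires * eps."""
    mean = num_wires * eps
    return math.exp(-(xi ** 2) / 3 * mean)


def empirical_tail(num_wires: int, eps: float, xi: float, trials: int, seed: int = 0) -> float:
    """Monte Carlo estimate of P(Z >= (1 + xi) * E[Z])."""
    rng = random.Random(seed)
    threshold = (1 + xi) * num_wires * eps
    hits = sum(leaked_count(num_wires, eps, rng) >= threshold for _ in range(trials))
    return hits / trials
```

For instance, with `num_wires = 1000`, `eps = 0.05`, and `xi = 0.5`, a threshold-probing adversary that is allowed 75 probes would cover all but a small fraction of the random-probing runs, which is the intuition behind the reduction.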
A circuit compiler takes as input the description of a cryptographic scheme $C$ with secret key $K$, e.g., a circuit that describes a block cipher, and outputs a transformed circuit $C'$ and a corresponding key $K'$. The circuit $C'[K']$ shall implement the same functionality as $C$ running with key $K$, but is additionally resilient to certain well-defined classes of leakage. Notice that while the framework of [17] talks about circuits, the same approach applies to software implementations, and we only follow this notation to abstract our description.

Moreover, our work uses the well-established simulation paradigm to state the security guarantees we achieve. Intuitively, simulation-based security says that whatever attack an adversary can carry out when knowing the leakage, he can also carry out (with similar success probability) by just having black-box access to $C$. In contrast to the approach based on Shannon information theory, our analysis includes attacks that exploit joint information from the leakage and plaintext/ciphertext pairs. It seems impossible to us to incorporate the plaintext/ciphertext pairs into an analysis based on Shannon information theory. To see this, consider a block-cipher execution, where, clearly, when given a couple of plaintext/ciphertext pairs, the secret key is information-theoretically revealed. The authors of [28] are well aware of this problem and explicitly exclude such joint information. A consequence of the simulation-based security analysis is that we require an additional mild assumption on the noise, namely, that it is efficiently computable (see Sect. 3.1 for more details). While this is a standard assumption made in most works on leakage-resilient cryptography, we emphasize that we can easily drop the assumption of efficiently computable noise (and hence consider the same noise model as [28]) when we only want to achieve the weaker security notion considered in [28]. Notice that in this case we are still able to eliminate the limitations 1 and 2 mentioned above.

Footnote (on the impossibility of a Shannon-style analysis). More concretely: imagine an adversary that attacks a block-cipher implementation $E_K$, where $K$ is the secret key. Then just by launching a known-plaintext attack he can obtain several pairs $V = (M_0, E_K(M_0)), (M_1, E_K(M_1)), \ldots$. Clearly a small number of such pairs is usually enough to determine $K$ information-theoretically. Hence it makes no sense to require that “$K$ is information-theoretically hidden given $V$ and the side-channel leakage.”

1.3. Related Work

Masking & Leakage Resilient Circuits. A large body of work has proposed various masking schemes and studied their security in different security models (see, e.g., [4,16,27,31,35]). The already mentioned t-threshold-probing model has been considered in the work of Rivain and Prouff [31], who show how to extend the work of Ishai et al. to larger fields and propose efficiency improvements. In [29] it was shown that techniques from multiparty computation can be used to show security in the t-threshold-probing model. The work of Standaert et al. [35] studies masking schemes using the information-theoretic framework of [33] by considering the Hamming weight model. Many other works analyze the security of the masking countermeasure, and we refer the reader to [28] for further details.

With the emergence of leakage-resilient cryptography [2,12,24], several works have proposed new security models and alternative masking schemes. The main difference between these new security models and the t-threshold-probing model is that they consider joint leakages from large parts of the computation. The work of Faust et al. [13] extends the security analysis of Ishai et al. beyond the t-threshold-probing model by considering leakages that can be described by low-depth circuits (so-called $\mathrm{AC}^0$ leakages). Faust et al.
use leak-free components, which have been eliminated by Rothblum in [32] using computational assumptions. The recent work of Miles and Viola [25] proposes a new circuit transformation using alternating groups and shows security with respect to $\mathrm{AC}^0$ and $\mathrm{TC}^0$ leakages.

Another line of work considers circuits that are provably secure in the so-called continuous bounded leakage model [10,14,15,18]. In this model, the adversary is allowed to learn arbitrary information from the computation of the circuit as long as the amount of information is bounded. The proposed schemes rely additionally on the assumption of “only computation leaks information” of Micali and Reyzin [24].

Noisy Leakage Models. The work of Faust et al. [13] also considers circuit compilers for noisy models. Specifically, they propose a construction with security in the binomial noise model, where each value on a wire is flipped independently with probability $p \in (0, 1/2)$. In contrast to the work of [28] and our work, the noise model is restricted to binomial noise, but the noise rate is significantly better (constant instead of linear noise). Similar to [28], the work of Faust et al. also uses leak-free components. Besides these works on masking schemes, several works consider noisy leakages for concrete cryptographic schemes [12,19,26]. Typically, the noise model considered in these works is significantly stronger than the noise model that is considered for masking schemes. In particular, no strong assumption about the independence of the noise is made.

2. Preliminaries

We start with some standard definitions and lemmas about the statistical distance. If $\mathcal{A}$ is a set, then $U \leftarrow \mathcal{A}$ denotes a random variable sampled uniformly from $\mathcal{A}$.
Recall that if $A$ and $B$ are random variables over the same set $\mathcal{A}$, then the statistical distance between $A$ and $B$ is denoted by $\Delta(A; B)$ and defined as

$$\Delta(A; B) = \frac{1}{2} \sum_{a \in \mathcal{A}} |P(A = a) - P(B = a)| = \sum_{a \in \mathcal{A}} \max\{0, P(A = a) - P(B = a)\}.$$

If $\mathcal{X}, \mathcal{Y}$ are some events, then by $\Delta((A|\mathcal{X}); (B|\mathcal{Y}))$ we will mean the distance between variables $A'$ and $B'$, distributed according to the conditional distributions $P_{A|\mathcal{X}}$ and $P_{B|\mathcal{Y}}$. If $\mathcal{X}$ is an event of probability 1, then we also write $\Delta(A; (B|\mathcal{Y}))$ instead of $\Delta((A|\mathcal{X}); (B|\mathcal{Y}))$. If $C$ is a random variable, then by $\Delta(A; (B|C))$ we mean

$$\sum_{c} P(C = c) \cdot \Delta(A; (B|(C = c))).$$

If $A$, $B$, and $C$ are random variables, then $\Delta((B; C) \mid A)$ denotes $\Delta((B, A); (C, A))$. It is easy to see that it is equal to $\sum_{a} P(A = a) \cdot \Delta((B|A = a); (C|A = a))$. If $\Delta(A; B) \le \epsilon$, then we say that $A$ and $B$ are $\epsilon$-close. The “$\stackrel{d}{=}$” symbol denotes the equality of distributions, i.e., $A \stackrel{d}{=} B$ if and only if $\Delta(A; B) = 0$.

2.1. Basic Probability-Theoretic Facts

Here, we state some basic lemmas that will be used later in some proofs.

Lemma 1. Let $A, B$ be two (possibly correlated) random variables. Let $B'$ be a variable distributed identically to $B$ but independent from $A$. We have

$$\Delta(A; (A|B)) = \Delta((B; B') \mid A). \tag{1}$$

Proof. We have

\begin{align}
\Delta(A; (A|B)) &= \sum_{b} P(B = b) \cdot \frac{1}{2} \sum_{a} |P(A = a) - P(A = a \mid B = b)| \notag \\
&= \frac{1}{2} \sum_{a,b} |P(B = b) \cdot P(A = a) - P(B = b) \cdot P(A = a \mid B = b)| \notag \\
&= \frac{1}{2} \sum_{a,b} |P(B' = b \wedge A = a) - P(B = b \wedge A = a)| \tag{2} \\
&= \Delta((B; B') \mid A), \notag
\end{align}

where in (2) we used the fact that $B'$ is a variable distributed identically to $B$ and it is independent from $A$.

Lemma 2. For any random variables $A$ and $B$ and an event $\mathcal{E}$ we have

$$\Delta((A \mid \neg\mathcal{E}); B) \le \Delta(A; B) + P(\mathcal{E}),$$

where $\neg\mathcal{E}$ denotes the negation of $\mathcal{E}$.

Proof. We have

\begin{align}
\Delta((A \mid \neg\mathcal{E}); B) &= \frac{1}{2} \sum_{a} |P(A = a \mid \neg\mathcal{E}) - P(B = a)| \notag \\
&\le \underbrace{\frac{1}{2} \sum_{a} |P(A = a) - P(A = a \mid \neg\mathcal{E})|}_{=:(*)} + \underbrace{\frac{1}{2} \sum_{a} |P(A = a) - P(B = a)|}_{=\Delta(A; B)}, \tag{3}
\end{align}

where (3) comes from the triangle inequality.
Now, $(*)$ is equal to

$$\sum_{a \in \mathcal{A}'} \bigl(P(A = a) - P(A = a \mid \neg\mathcal{E})\bigr), \tag{4}$$

where $\mathcal{A}'$ is the set of all $a$'s such that $P(A = a) \ge P(A = a \mid \neg\mathcal{E})$. Clearly we have that $P(A = a \mid \neg\mathcal{E}) \ge P(A = a \wedge \neg\mathcal{E})$, and hence (4) is at most equal to

$$\sum_{a \in \mathcal{A}'} \bigl(P(A = a) - P(A = a \wedge \neg\mathcal{E})\bigr) = P(A \in \mathcal{A}') - P(A \in \mathcal{A}' \wedge \neg\mathcal{E}) = P(A \in \mathcal{A}' \wedge \mathcal{E}) \le P(\mathcal{E}).$$

Thus, altogether, the right-hand side of (3) is at most equal to $P(\mathcal{E}) + \Delta(A; B)$. This finishes the proof.

We will also need the following standard fact.

Lemma 3. (Chernoff bound, see, e.g., [9], Theorem 1.1) Let $Z = \sum_{i=1}^{n} Z_i$, where the $Z_i$'s are random variables independently distributed over $[0, 1]$. Then for every $\xi \in [0, 1]$ we have

$$P\bigl(Z \ge (1 + \xi)\,\mathbb{E}(Z)\bigr) \le \exp\!\left(-\frac{\xi^2}{3}\,\mathbb{E}(Z)\right).$$

3. Noise from Set Elements

We start by describing the basic framework for reasoning about the noise from elements of a finite set $\mathcal{X}$. Later, in Sect. 4, we will consider the leakage from vectors over $\mathcal{X}$, and then, in Sect. 5, from the entire computation. The reason why we can smoothly use the analysis from Sect. 3.1 in the later sections is that, as in the work of Prouff and Rivain, we require that the noise is independent for all elementary operations. By elementary operations, [28] means the basic operations over the underlying field $\mathcal{X}$ used in a masked implementation. In this work, we consider the same setting and type of underlying operations (in fact, notice that our construction is identical to theirs, except that we eliminate the leak-free gates and prove a stronger statement). Notice that instead of talking about elementary operations, we use the more standard term “gates” that was used in the work of Ishai et al. [17].

3.1. Modeling Noise

Let us start with a discussion defining what it means that a randomized function $\mathrm{Noise} : \mathcal{X} \to \mathcal{Y}$ is “noisy”.
We will assume that $\mathcal{X}$ is finite and rather small: typical choices for $\mathcal{X}$ would be $GF(2)$ (the “Boolean case”), or $GF(2^8)$, if we want to deal with the AES circuit. The set $\mathcal{Y}$ corresponds to the set of all possible noise measurements and may be infinite, except when we require the “efficient simulation” (we discuss it further at the end of this section). As already informally described in Sect. 1.1, our basic definition is as follows: we say that the function $\mathrm{Noise}$ is $\delta$-noisy if

$$\delta = \Delta(X; (X \mid \mathrm{Noise}(X))). \tag{5}$$

Of course, for (5) to be well-defined we need to specify the distribution of $X$. The idea to define noisy functions by comparing the distributions of $X$ and “$X$ conditioned on $\mathrm{Noise}(X)$” comes from [28], where it is argued that the most natural choice for $X$ is a random variable distributed uniformly over $\mathcal{X}$. We also adopt this convention and assume that $X \leftarrow \mathcal{X}$. We would like to stress, however, that in our proofs we will apply $\mathrm{Noise}$ to inputs $\hat{X}$ that are not necessarily uniform, and in this case the value of $\Delta(\hat{X}; (\hat{X} \mid \mathrm{Noise}(\hat{X})))$ may obviously be some non-trivial function of $\delta$. Of course, if $X \leftarrow \mathcal{X}$ and $X' \leftarrow \mathcal{X}$, then $\mathrm{Noise}(X')$ is distributed identically to $\mathrm{Noise}(X)$, and hence, by Lemma 1, Eq. (5) is equivalent to:

$$\delta = \Delta((\mathrm{Noise}(X); \mathrm{Noise}(X')) \mid X), \tag{6}$$

where $X$ and $X'$ are uniform over $\mathcal{X}$. Note that at first this definition may be a bit counter-intuitive, as smaller $\delta$ means more noise: in particular we achieve “full noise” if $\delta = 0$, and “no noise” if $\delta \approx 1$.

Let us compare this definition with the definition of [28]. In a nutshell: the definition of [28] is similar to ours, the only difference being that instead of the statistical distance $\Delta$ the authors of [28] use a distance based on the Euclidean norm. More precisely, they start with defining $d$ as

$$d(X; Y) := \sqrt{\sum_{x \in \mathcal{X}} \bigl(P(X = x) - P(Y = x)\bigr)^2},$$

and using this notion they define $\beta$ as

$$\beta(X \mid \mathrm{Noise}(X)) := \sum_{y \in \mathcal{Y}} P(\mathrm{Noise}(X) = y) \cdot d(X; (X \mid \mathrm{Noise}(X) = y))$$

(where $X$ is uniform).
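To make the two equivalent forms (5) and (6) of the $\delta$-noisy definition concrete, the following Python sketch evaluates both for a finite noise function given as a table of probabilities. It is a minimal illustration under stated assumptions: the table representation `noise[x][y] = P(Noise(x) = y)`, the helper names, and the binary symmetric channel `bsc` used as an example are all hypothetical choices, not taken from the paper. Exact rational arithmetic is used so that the two expressions can be compared for equality.

```python
from fractions import Fraction


def marginal(noise):
    """P(Noise(X) = y) for X uniform over the inputs (the keys of `noise`)."""
    n = len(noise)
    out = {}
    for dist in noise.values():
        for y, p in dist.items():
            out[y] = out.get(y, Fraction(0)) + p / n
    return out


def delta_eq5(noise):
    """Eq. (5): average over y of the distance between prior and posterior of X."""
    n = len(noise)
    total = Fraction(0)
    for y, py in marginal(noise).items():
        if py == 0:
            continue  # outputs that never occur contribute nothing
        # Prior is 1/n; posterior P(X = x | Noise(X) = y) follows from Bayes' rule.
        dist = sum(abs(Fraction(1, n) - noise[x].get(y, Fraction(0)) / (n * py))
                   for x in noise) / 2
        total += py * dist
    return total


def delta_eq6(noise):
    """Eq. (6): average over x of the distance between Noise(x) and Noise(X')."""
    n = len(noise)
    marg = marginal(noise)
    total = Fraction(0)
    for x in noise:
        total += sum(abs(noise[x].get(y, Fraction(0)) - py)
                     for y, py in marg.items()) / 2 / n
    return total


def bsc(p):
    """Example channel on bits: the input is flipped with probability p."""
    return {0: {0: 1 - p, 1: p}, 1: {0: p, 1: 1 - p}}
```

For the example channel one obtains $\delta = |1/2 - p|$, so `bsc(Fraction(1, 2))` gives $\delta = 0$ (“full noise”), while `bsc(Fraction(0))` gives $\delta = 1/2$, which is the largest value possible when $|\mathcal{X}| = 2$.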
In the terminology of [28] a function $\mathrm{Noise}$ is “$\delta$-noisy” if $\delta = \beta(X \mid \mathrm{Noise}(X))$. Observe that the right-hand side of our noise definition in Eq. (5) can be rewritten as

$$\sum_{y \in \mathcal{Y}} P(\mathrm{Noise}(X) = y) \cdot \Delta(X; (X \mid \mathrm{Noise}(X) = y)),$$

hence the only difference between their approach and ours is that we use $\Delta$ where they use the distance $d$. The authors do not explain why they choose this particular measure. We believe that our choice to use the standard definition of statistical distance $\Delta$ is more natural in this setting, since, unlike the “$d$” distance, it has been used in hundreds of cryptographic papers in the past. The popularity of the $\Delta$ distance comes from the fact that it corresponds to an intuitive concept of the “indistinguishability of distributions”: it is well known, and simple to verify, that $\Delta(X; Y) \le \delta$ if and only if no adversary can distinguish between $X$ and $Y$ with advantage better than $\delta$. Hence, e.g., (6) can be interpreted as: $\delta$ is the maximum probability, over all adversaries $\mathcal{A}$, that $\mathcal{A}$ distinguishes between the noise from a uniform $X$ that is known to him, and a uniform $X'$ that is unknown to him. It is unclear to us if the $d$ distance has a similar interpretation.

We emphasize, however, that the choice whether to use $\Delta$ or $\beta$ is not too important, as the following inequalities between these measures hold for every $X$ and $Y$ distributed over $\mathcal{X}$ (cf. [28]):

$$\frac{1}{2} \cdot d(X; Y) \le \Delta(X; Y) \le \frac{\sqrt{|\mathcal{X}|}}{2} \cdot d(X; Y),$$

and consequently

$$\frac{1}{2} \cdot \beta(X \mid \mathrm{Noise}(X)) \le \Delta(X; (X \mid \mathrm{Noise}(X))) \le \frac{\sqrt{|\mathcal{X}|}}{2} \cdot \beta(X \mid \mathrm{Noise}(X)). \tag{7}$$

Hence, we decide to stick to the “$\Delta$ distance” in this paper. However, to allow for comparison between our work and that of [28], we will at the end of the paper present our results also in terms of the $\beta$ measure. (This translation will be straightforward, thanks to the inequalities in (7).)

In [28] (cf. Theorem 4) the result is stated in the form of Shannon information theory.
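The relation (7) between the $\Delta$-based and the $\beta$-based noise measures can be checked numerically. The Python sketch below computes both quantities for a finite noise table and is meant purely as an illustration: the table layout `noise[x][y] = P(Noise(x) = y)`, the function names, and the randomly generated channels are assumptions made for this example and do not come from [28] or from the paper.

```python
import math
import random


def posterior_terms(noise):
    """Yield (P(Noise(X) = y), prior of X, posterior of X given y) for uniform X."""
    xs = list(noise)
    n = len(xs)
    ys = {y for dist in noise.values() for y in dist}
    for y in ys:
        py = sum(noise[x].get(y, 0.0) for x in xs) / n
        if py > 0:
            post = [noise[x].get(y, 0.0) / (n * py) for x in xs]
            yield py, [1.0 / n] * n, post


def delta_noise(noise):
    """Statistical-distance measure: the right-hand side of Eq. (5)."""
    return sum(py * 0.5 * sum(abs(a - b) for a, b in zip(prior, post))
               for py, prior, post in posterior_terms(noise))


def beta_noise(noise):
    """Euclidean-norm measure beta(X | Noise(X)) as defined via d."""
    return sum(py * math.sqrt(sum((a - b) ** 2 for a, b in zip(prior, post)))
               for py, prior, post in posterior_terms(noise))


def random_channel(num_inputs, num_outputs, rng):
    """Hypothetical test channel with random output distributions."""
    table = {}
    for x in range(num_inputs):
        weights = [rng.random() for _ in range(num_outputs)]
        total = sum(weights)
        table[x] = {y: weights[y] / total for y in range(num_outputs)}
    return table
```

For a channel `ch` over `n` inputs, (7) states that `0.5 * beta_noise(ch) <= delta_noise(ch) <= math.sqrt(n) / 2 * beta_noise(ch)`; for `n = 2` the upper bound is attained with equality, so a numerical check should allow for a small floating-point tolerance.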
While such an information-theoretic approach may be useful in certain settings [33], we follow the more “traditional” approach and provide an efficient simulation argument. As discussed in the introduction, this also covers a setting where the adversary exploits joint information of the leakage and, e.g., the plaintext/ciphertext pairs. We emphasize, however, that our results can easily be expressed in the information-theoretic language, thanks to the following bound:

$$I(A; B) \le (2N / \ln 2) \cdot \Delta(A; (A|B)),$$

where $A$ is uniformly distributed over a set of cardinality $N$. This result comes from Proposition 1 in [28] combined with the inequality $\beta(A|B) \le 2\Delta(A; (A|B))$, cf. (7).

Footnote (on the indistinguishability interpretation of $\Delta$ in Sect. 3.1). That no adversary can distinguish between $X$ and $Y$ with advantage better than $\delta$ formally means that for every $\mathcal{A}$ we have $|P(\mathcal{A}(X) = 1) - P(\mathcal{A}(Y) = 1)| \le \delta$.

3.1.1. The Issue of “Efficient Simulation”

To achieve the strong simulation-based security notion, we need an additional requirement on the leakage, namely, that the leakage can efficiently be “simulated”, which typically requires that the noise function is efficiently computable. In fact, for our proofs to go through we actually need something slightly stronger, namely that $\mathrm{Noise}$ is efficiently decidable, by which we mean that (a) there exists a randomized poly-time algorithm that computes it, and (b) the set $\mathcal{Y}$ is finite and for every $x$ and $y$ the value of $P(\mathrm{Noise}(x) = y)$ is computable in polynomial time. While (b) may look like a strong assumption, we note that in practice it is easily satisfied for most “natural” noise functions (like Gaussian noise with a known parameter, measured with a very good, but finite, precision).

Recall that the results of [28] are stated without taking into consideration the issue of the “efficient simulation”. Hence, if one wants to compare our results with [28], then one can simply drop the efficient decidability assumption on the noise.
To keep our presentation concise and clean, also in this case the results will be presented in the form “for every adversary $\mathcal{A}$ there exists an (inefficient) simulator $\mathcal{S}$”. Here the “inefficient simulator” can be an arbitrary machine, capable, e.g., of sampling elements from any probability distribution.

3.2. Simulating Noise by $\epsilon$-Identity Functions

Lemma 4 below is our main technical tool. Informally, it states that every $\delta$-noisy function $\mathrm{Noise} : \mathcal{X} \to \mathcal{Y}$ can be represented as a composition $\mathrm{Noise}' \circ \varphi$ of efficiently computable randomized functions $\mathrm{Noise}'$ and $\varphi$, where $\varphi$ is a “$\delta \cdot |\mathcal{X}|$-identity function”, defined in Definition 1 below.

Definition 1. A randomized function $\varphi : \mathcal{X} \to \mathcal{X} \cup \{\bot\}$ is an $\epsilon$-identity if for every $x$ we have that either $\varphi(x) = x$ or $\varphi(x) = \bot$, and $P(\varphi(x) \ne \bot) = \epsilon$.

This will allow us to reduce the “noisy attacks” to the “random-probing attacks”, where the adversary learns each wire (or a gate, see Sect. 5.5) of the circuit with probability $\epsilon$. Observe also that, thanks to the assumed independence of noise, the events that the adversary learns each element are independent, which, in turn, will allow us to use the Chernoff bound to prove that with good probability the number of wires that the adversary learns is small.

Lemma 4. Let $\mathrm{Noise} : \mathcal{X} \to \mathcal{Y}$ be a $\delta$-noisy function. Then there exist $\epsilon \le \delta \cdot |\mathcal{X}|$ and a randomized function $\mathrm{Noise}' : \mathcal{X} \cup \{\bot\} \to \mathcal{Y}$ such that for every $x \in \mathcal{X}$ we have

$$\mathrm{Noise}(x) \stackrel{d}{=} \mathrm{Noise}'(\varphi(x)), \tag{8}$$

where $\varphi : \mathcal{X} \to \mathcal{X} \cup \{\bot\}$ is the $\epsilon$-identity function. Moreover, if $\mathrm{Noise}$ is efficiently decidable then $\mathrm{Noise}'(\varphi(x))$ is computable in time that is expected polynomial in $|\mathcal{X}|$.

Before we proceed to the proof let us remark that the “$|\mathcal{X}|$ factor” loss in this lemma (when going from $\delta$ to $\epsilon$) is in general unavoidable. More concretely, in the subsequent work [11] (Sect. 5) it is shown that there exist $\delta$-noisy functions that can be reduced (in the sense of Lemma 4) to an $\epsilon$-identity function only if $\epsilon$ is (at least) approximately equal to $\delta \cdot |\mathcal{X}| / 2$.

Proof of Lemma 4.
We consider only the case when Noise is efficiently decidable, and hence the $\mathsf{Noise}'$ function that we construct will be efficiently computable. The case when Noise is not efficiently decidable is handled in an analogous way (the proof is actually simpler, as the only difference is that we do not need to argue about the efficiency of the sampling algorithms).

Very informally speaking, our proof is based on an extension of the standard observation that for any two random variables $A$ and $B$ one can find two events $\mathcal{A}$ and $\mathcal{B}$ such that the distributions $P_{A \mid \mathcal{A}}$ and $P_{B \mid \mathcal{B}}$ are equal and $P(\overline{\mathcal{A}}) = P(\overline{\mathcal{B}}) = \Delta(A; B)$ (see, e.g., [23, Section 1.3]). Let $X$ and $X'$ be uniform over $\mathcal{X}$. For every $y \in \mathcal{Y}$ define

$$\pi(y) = \min_{x \in \mathcal{X}} P(\mathsf{Noise}(x) = y).$$

Clearly π is computable in time polynomial in $|\mathcal{X}|$. Obviously π is usually not a probability distribution since it does not sum up to 1. The good news is that it sums up “almost” to 1 provided δ is sufficiently small. This is shown below. Let $\epsilon := 1 - \sum_{y \in \mathcal{Y}} \pi(y)$. We now have

\begin{align}
\epsilon &= \underbrace{\sum_{y \in \mathcal{Y}} P(\mathsf{Noise}(X) = y)}_{=1} - \sum_{y \in \mathcal{Y}} \pi(y) \notag \\
&= \sum_{y \in \mathcal{Y}} \Big( P(\mathsf{Noise}(X) = y) - \min_{x \in \mathcal{X}} P(\mathsf{Noise}(x) = y) \Big) \notag \\
&= \sum_{y \in \mathcal{Y}} \max_{x \in \mathcal{X}} \big( P(\mathsf{Noise}(X) = y) - P(\mathsf{Noise}(x) = y) \big) \notag \\
&\le \sum_{y \in \mathcal{Y}} \sum_{x \in \mathcal{X}} \max\big(0,\ P(\mathsf{Noise}(X) = y) - P(\mathsf{Noise}(x) = y)\big) \tag{9} \\
&= \sum_{x \in \mathcal{X}} \Delta(\mathsf{Noise}(x); \mathsf{Noise}(X)) \notag \\
&= |\mathcal{X}| \cdot \Delta\big((\mathsf{Noise}(X'); \mathsf{Noise}(X)) \mid X'\big) \notag \\
&= \delta \cdot |\mathcal{X}|, \tag{10}
\end{align}

where (9) comes from the fact that the maximum of positive values cannot be larger than their sum, and (10) follows from the assumption that the Noise function is δ-noisy.

Footnote to (9): More precisely, for every $\{Z_x\}_{x \in \mathcal{X}}$ we have $\max_{x \in \mathcal{X}} Z_x \le \sum_{x : Z_x \ge 0} Z_x = \sum_{x \in \mathcal{X}} \max(0, Z_x)$, where in our case $Z_x := P(\mathsf{Noise}(X) = y) - P(\mathsf{Noise}(x) = y)$.

Our construction of the $\mathsf{Noise}'$ function is based on the standard technique of rejection sampling. Let $\mathsf{Noise}'(x)$ be a distribution defined as follows: for every $y \in \mathcal{Y}$ and every $x \neq \bot$ let

$$P(\mathsf{Noise}'(x) = y) = \big(P(\mathsf{Noise}(x) = y) - \pi(y)\big) / \epsilon, \tag{11}$$

and otherwise

$$P(\mathsf{Noise}'(\bot) = y) = \pi(y) / (1 - \epsilon). \tag{12}$$

We will later show how to sample $\mathsf{Noise}'$ efficiently. Obviously this will automatically imply that (11) and (12) define probability distributions over $\mathcal{Y}$ (which may not be obvious at first sight). First, however, let us show (8). To this end take any $x \in \mathcal{X}$ and $y \in \mathcal{Y}$ and observe that

\begin{align*}
P(\mathsf{Noise}'(\varphi(x)) = y) &= P(\varphi(x) = x) \cdot P(\mathsf{Noise}'(x) = y) + P(\varphi(x) = \bot) \cdot P(\mathsf{Noise}'(\bot) = y) \\
&= \epsilon \cdot \big(P(\mathsf{Noise}(x) = y) - \pi(y)\big)/\epsilon + (1 - \epsilon) \cdot \pi(y)/(1 - \epsilon) \\
&= P(\mathsf{Noise}(x) = y) - \pi(y) + \pi(y) \\
&= P(\mathsf{Noise}(x) = y),
\end{align*}

which implies (8). What remains is to show how to sample $\mathsf{Noise}'$ efficiently. Let us first show an efficient algorithm $\mathsf{Alg}_1(x)$ for computing $\mathsf{Noise}'(x)$ for $x \neq \bot$:

$\mathsf{Alg}_1(x)$:
1. Sample $y$ from $\mathsf{Noise}(x)$.
2. With probability $\pi(y) / P(\mathsf{Noise}(x) = y)$ resample $y$, i.e.: go back to Step 1.
3. Output $y$.

We now argue that $\mathsf{Alg}_1(x)$ indeed computes $\mathsf{Noise}'(x)$ efficiently. Let $R_1 \in \{1, 2, \ldots\}$ be a random variable denoting the number of times the algorithm $\mathsf{Alg}_1(x)$ performed Step 1. First observe that the probability of jumping back to Step 1 in Step 2 is equal to

\begin{align}
\sum_{y} P(\mathsf{Noise}(x) = y) \cdot \pi(y) / P(\mathsf{Noise}(x) = y) &= \sum_{y} \pi(y) \tag{13} \\
&= 1 - \epsilon. \tag{14}
\end{align}

Therefore the probability of not jumping back to Step 1 in Step 2 is ε, and hence the expected number $E(R_1)$ of the executions of Step 1 in $\mathsf{Alg}_1(x)$ is equal to $\sum_{i=1}^{\infty} i \cdot (1 - \epsilon)^{i-1} \cdot \epsilon = 1/\epsilon$. Moreover for every $i = 1, 2, \ldots$ we have:

\begin{align*}
P(\mathsf{Alg}_1(x) = y \wedge R_1 = i \mid R_1 \ge i) &= P(\mathsf{Noise}(x) = y) \cdot \big(1 - \pi(y)/P(\mathsf{Noise}(x) = y)\big) \\
&= P(\mathsf{Noise}(x) = y) - \pi(y).
\end{align*}

Hence

\begin{align*}
P(\mathsf{Alg}_1(x) = y) &= \sum_{i=1}^{\infty} P(\mathsf{Alg}_1(x) = y \wedge R_1 = i) \\
&= \sum_{i=1}^{\infty} P(\mathsf{Alg}_1(x) = y \wedge R_1 = i \mid R_1 \ge i) \cdot P(R_1 \ge i) \\
&= \big(P(\mathsf{Noise}(x) = y) - \pi(y)\big) \cdot \sum_{i=1}^{\infty} P(R_1 \ge i) \\
&= \big(P(\mathsf{Noise}(x) = y) - \pi(y)\big) \cdot E(R_1) \\
&= \big(P(\mathsf{Noise}(x) = y) - \pi(y)\big) / \epsilon,
\end{align*}

as required in (11). We now present an efficient algorithm $\mathsf{Alg}_2$ for computing $\mathsf{Noise}'(\bot)$. Fix an arbitrary element $x_0 \in \mathcal{X}$, and execute the following.

$\mathsf{Alg}_2$:
1. Sample $y$ from $\mathsf{Noise}(x_0)$.
2. With probability $1 - \pi(y) / P(\mathsf{Noise}(x_0) = y)$ resample $y$, i.e.: go back to Step 1.
3. Output $y$.

By a similar argument as in the case of $\mathsf{Alg}_1$ we obtain that the expected number of times the algorithm $\mathsf{Alg}_2$ performs Step 1, denoted $R_2$, satisfies $E(R_2) = 1/(1 - \epsilon)$. Moreover for every $i = 1, 2, \ldots$ we have

$$P(\mathsf{Alg}_2 = y \wedge R_2 = i \mid R_2 \ge i) = \pi(y),$$

which, in turn, implies that $P(\mathsf{Alg}_2 = y) = \pi(y)/(1 - \epsilon)$, and hence the output of $\mathsf{Alg}_2$ satisfies (12). Clearly, the expected running time of both algorithms is polynomial in $|\mathcal{X}|$ and $E(R)$, where $R$ is the number of executions of Step 1 in $\mathsf{Alg}_1$ or $\mathsf{Alg}_2$. We obviously have

\begin{align*}
E(R) &= E(R_1 \mid \varphi(x) \neq \bot) \cdot P(\varphi(x) \neq \bot) + E(R_2 \mid \varphi(x) = \bot) \cdot P(\varphi(x) = \bot) \\
&= (1/\epsilon) \cdot \epsilon + (1/(1 - \epsilon)) \cdot (1 - \epsilon) = 2.
\end{align*}

Hence, the expected running time of $\mathsf{Noise}'(\varphi(x))$ is polynomial in $|\mathcal{X}|$. □

4. Leakage from Vectors

In this section we describe the leakage models relevant to this paper. We start with describing the models abstractly, by considering leakage from an arbitrary sequence $(x_1, \ldots, x_\ell) \in \mathcal{X}^\ell$, where $\mathcal{X}$ is some finite set and $\ell$ is a parameter. The adversary $\mathcal{A}$ will be able to obtain some partial information about $(x_1, \ldots, x_\ell)$ via the games described below. Note that we do not specify the computational power of $\mathcal{A}$, as the definitions below make sense for both computationally bounded and infinitely powerful $\mathcal{A}$.

Noisy Model. For $\delta \ge 0$ a δ-noisy adversary on $\mathcal{X}^\ell$ is a machine $\mathcal{A}$ that plays the following game against an oracle that knows $(x_1, \ldots, x_\ell) \in \mathcal{X}^\ell$:
1. $\mathcal{A}$ specifies a sequence $\{\mathsf{Noise}_i : \mathcal{X} \to \mathcal{Y}\}_{i=1}^{\ell}$ of noisy functions such that every $\mathsf{Noise}_i$ is $\delta_i$-noisy, for some $\delta_i \le \delta$, and the noises are mutually independent.
2. $\mathcal{A}$ receives $\mathsf{Noise}_1(x_1), \ldots, \mathsf{Noise}_\ell(x_\ell)$ and outputs some value $\mathsf{out}_{\mathcal{A}}(x_1, \ldots, x_\ell)$.

If $\mathcal{A}$ works in polynomial time and the noise functions specified by $\mathcal{A}$ are efficiently decidable, then we say that $\mathcal{A}$ is poly-time-noisy.

Random-Probing Model.
For $\epsilon \ge 0$ an ε-random-probing adversary on $\mathcal{X}^\ell$ is a machine $\mathcal{A}$ that plays the following game against an oracle that knows $(x_1, \ldots, x_\ell) \in \mathcal{X}^\ell$:
1. $\mathcal{A}$ specifies a sequence $(\epsilon_1, \ldots, \epsilon_\ell)$ such that each $\epsilon_i \le \epsilon$.
2. $\mathcal{A}$ receives $\varphi_1(x_1), \ldots, \varphi_\ell(x_\ell)$ and outputs some value $\mathsf{out}_{\mathcal{A}}(x_1, \ldots, x_\ell)$, where each $\varphi_i$ is the $\epsilon_i$-identity function with mutually independent randomness.

A similar model was introduced in the work of Ishai, Sahai and Wagner [17] to obtain a circuit compiler that blows up the size of the circuit linearly in the security parameter $d$. Also, the work of Ajtai [1] considers the random-probing model and constructs a compiler that for sufficiently large security parameter $d$ achieves security in the random-probing model for a small (but constant) probability ε. However, [1] does not give concrete parameters for ε and $d$, and the compiler of [1] incurs a huge circuit size blow-up (polynomial in $d$, with large hidden constants).

Threshold-Probing Model. For $t = 0, \ldots, \ell$ a $t$-threshold-probing adversary on $\mathcal{X}^\ell$ is a machine $\mathcal{A}$ that plays the following game against an oracle that knows $(x_1, \ldots, x_\ell) \in \mathcal{X}^\ell$:
1. $\mathcal{A}$ specifies a set $I = \{i_1, \ldots, i_{|I|}\} \subseteq \{1, \ldots, \ell\}$ of cardinality at most $t$.
2. $\mathcal{A}$ receives $(x_{i_1}, \ldots, x_{i_{|I|}})$ and outputs some value $\mathsf{out}_{\mathcal{A}}(x_1, \ldots, x_\ell)$.

4.1. Simulating the Noisy Adversary by a Random-Probing Adversary

The following lemma shows that every δ-noisy adversary can be simulated by a $\delta \cdot |\mathcal{X}|$-random-probing adversary.

Lemma 5. Let $\mathcal{A}$ be a δ-noisy adversary on $\mathcal{X}^\ell$. Then there exists a $\delta \cdot |\mathcal{X}|$-random-probing adversary $\mathcal{S}$ on $\mathcal{X}^\ell$ such that for every $(x_1, \ldots, x_\ell)$ we have

$$\mathsf{out}_{\mathcal{A}}(x_1, \ldots, x_\ell) = \mathsf{out}_{\mathcal{S}}(x_1, \ldots, x_\ell). \tag{15}$$

Moreover, if $\mathcal{A}$ is poly-time-noisy, then $\mathcal{S}$ works in time polynomial in $|\mathcal{X}|$.

Proof. Without loss of generality assume that $\mathcal{A}$ simply outputs all the information that he gets. Thus (15) can be rewritten as:

$$(\mathsf{Noise}_1(x_1), \ldots, \mathsf{Noise}_\ell(x_\ell)) = \mathsf{out}_{\mathcal{S}}(x_1, \ldots, x_\ell), \tag{16}$$

where the $\mathsf{Noise}_i$'s are the $\delta_i$-noisy functions chosen by $\mathcal{A}$.
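The decomposition of Lemma 4, which the next step of the proof applies to each noise function, can be checked numerically on a small example. The sketch below is illustrative only: the noise table `NOISE`, the helper names, and the use of `None` for ⊥ are assumptions made for this example and do not come from the paper. It computes π, ε and the distributions (11) and (12) with exact rational arithmetic, and it assumes 0 < ε < 1.

```python
from fractions import Fraction as Fr

BOT = None  # stands for the symbol ⊥ (an assumption of this sketch)

# Hypothetical toy noise function on X = {0, 1, 2} with outputs in Y = {0, 1}:
# NOISE[x][y] = P(Noise(x) = y).
NOISE = {
    0: {0: Fr(1, 2), 1: Fr(1, 2)},
    1: {0: Fr(2, 5), 1: Fr(3, 5)},
    2: {0: Fr(3, 5), 1: Fr(2, 5)},
}

def decompose(noise):
    """Return (eps, noise_prime), with noise_prime built as in (11) and (12)."""
    ys = sorted({y for row in noise.values() for y in row})
    # pi(y) is the minimum over x of P(Noise(x) = y).
    pi = {y: min(row.get(y, Fr(0)) for row in noise.values()) for y in ys}
    eps = 1 - sum(pi.values())
    if not 0 < eps < 1:
        raise ValueError("this sketch only covers 0 < eps < 1")
    prime = {x: {y: (row.get(y, Fr(0)) - pi[y]) / eps for y in ys}
             for x, row in noise.items()}
    prime[BOT] = {y: pi[y] / (1 - eps) for y in ys}
    return eps, prime

def recompose(eps, prime, x, y):
    """P(Noise'(phi(x)) = y) when phi outputs x with probability eps, else ⊥."""
    return eps * prime[x][y] + (1 - eps) * prime[BOT][y]

def sum_of_distances(noise):
    """Sum over x of the statistical distance between Noise(x) and Noise(X),
    X uniform; the chain ending in (10) identifies this sum with delta * |X|."""
    n = len(noise)
    ys = {y for row in noise.values() for y in row}
    avg = {y: sum(row.get(y, Fr(0)) for row in noise.values()) / n for y in ys}
    return sum(sum(abs(row.get(y, Fr(0)) - avg[y]) for y in ys) / 2
               for row in noise.values())

eps, prime = decompose(NOISE)
print(eps, sum_of_distances(NOISE))
```

For this table both printed values coincide, so the bound ε ≤ δ · |X| happens to be tight here; `recompose` reproduces every entry of `NOISE`, which is exactly statement (8). In the proof of Lemma 5 the same decomposition is applied separately to each noise function chosen by the adversary.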
By Lemma 4, for each $i$ there exists $\epsilon_i \le \delta_i \cdot |\mathcal{X}| \le \delta \cdot |\mathcal{X}|$ and a randomized function $\mathsf{Noise}'_i : \mathcal{X} \cup \{\bot\} \to \mathcal{Y}$, such that for every $x \in \mathcal{X}$ we have

$$\mathsf{Noise}_i(x) = \mathsf{Noise}'_i(\varphi_i(x)), \tag{17}$$

where $\varphi_i : \mathcal{X} \to \mathcal{X} \cup \{\bot\}$ is the $\epsilon_i$-identity function and $\mathsf{Noise}'_i(\varphi_i(x))$ is computable in time polynomial in $|\mathcal{X}|$. We now describe the actions of $\mathcal{S}$. The sequence that he specifies is $(\epsilon_1, \ldots, \epsilon_\ell)$. After receiving $(y_1, \ldots, y_\ell)$ (equal to $(\varphi_1(x_1), \ldots, \varphi_\ell(x_\ell))$) he outputs

$$\mathsf{out}_{\mathcal{S}}(x_1, \ldots, x_\ell) := (\mathsf{Noise}'_1(y_1), \ldots, \mathsf{Noise}'_\ell(y_\ell))$$

(this clearly takes time that is expected polynomial in $\ell \cdot |\mathcal{X}|$). We now have

\begin{align}
(\mathsf{Noise}'_1(y_1), \ldots, \mathsf{Noise}'_\ell(y_\ell)) &= (\mathsf{Noise}'_1(\varphi_1(x_1)), \ldots, \mathsf{Noise}'_\ell(\varphi_\ell(x_\ell))) \notag \\
&= (\mathsf{Noise}_1(x_1), \ldots, \mathsf{Noise}_\ell(x_\ell)), \tag{18}
\end{align}

where (18) comes from (17). This implies (16) and hence it finishes the proof. □

Intuitively, this lemma easily follows from Lemma 4 applied independently to each element of $(x_1, \ldots, x_\ell)$.

4.2. Simulating the Random-Probing Adversary by a Threshold-Probing Adversary

In this section we show how to simulate every ε-random-probing adversary by a threshold adversary. This simulation, unlike the one in Sect. 4.1, will not be perfect, in the sense that the distribution output by the simulator will be identical to the distribution of the original adversary only when conditioned on some event that happens with a large probability. We start with the following lemma, whose proof is a straightforward application of the Chernoff bound.

Lemma 6. Let $\mathcal{A}$ be an ε-random-probing adversary on $\mathcal{X}^\ell$. Then there exists a $(2\epsilon\ell - 1)$-threshold-probing adversary $\mathcal{S}$ on $\mathcal{X}^\ell$ operating in time linear in the working time of $\mathcal{A}$ such that for every $(x_1, \ldots, x_\ell)$ we have

$$\Delta\big(\mathsf{out}_{\mathcal{A}}(x_1, \ldots, x_\ell)\ ;\ \mathsf{out}_{\mathcal{S}}(x_1, \ldots, x_\ell) \mid \mathsf{out}_{\mathcal{S}}(x_1, \ldots, x_\ell) \neq \bot\big) = 0, \tag{19}$$

where

$$P(\mathsf{out}_{\mathcal{S}}(x_1, \ldots, x_\ell) = \bot) \le \exp\left(-\frac{\epsilon\ell}{3}\right). \tag{20}$$

Proof. As in the proof of Lemma 5 we assume that the simulated adversary $\mathcal{A}$ outputs all the information that he received.
Moreover, since for $\epsilon' \le \epsilon$ every $\epsilon'$-identity function $\varphi'$ can be simulated by the ε-identity function $\varphi$, we can assume that each $\epsilon_i$ specified by $\mathcal{A}$ is equal to ε. (Footnote: Just set $\varphi'(x) := \varphi(x)$ with probability $\epsilon'/\epsilon$, and $\varphi'(x) = \bot$ otherwise. Then clearly $P(\varphi'(x) = x) = \epsilon'$.) Thus, we need to show a $2\epsilon\ell$-threshold-probing simulator $\mathcal{S}$ such that for every $(x_1, \ldots, x_\ell) \in \mathcal{X}^\ell$ we have

$$\Delta\big(\varphi_1(x_1), \ldots, \varphi_\ell(x_\ell)\ ;\ \mathsf{out}_{\mathcal{S}}(x_1, \ldots, x_\ell) \mid \mathsf{out}_{\mathcal{S}}(x_1, \ldots, x_\ell) \neq \bot\big) = 0 \tag{21}$$

(where each $\varphi_i$ is the ε-identity function) and (20) holds. The simulator $\mathcal{S}$ proceeds as follows. First he chooses a sequence $(Z_1, \ldots, Z_\ell)$ of independent random variables by setting, for each $i$:

$$Z_i := \begin{cases} 1 & \text{with probability } \epsilon, \\ 0 & \text{otherwise.} \end{cases}$$

Let $Z$ denote the number of $Z_i$'s equal to 1, i.e., $Z := \sum_{i=1}^{\ell} Z_i$. If $Z \ge 2\epsilon\ell$, then $\mathcal{S}$ outputs ⊥. Otherwise, he specifies the set $I$ as $I := \{i : Z_i = 1\}$. He receives $(x_{i_1}, \ldots, x_{i_{|I|}})$. For all the remaining $i$'s (i.e., those not in the set $I$) the simulator sets $x_i := \bot$. He outputs $(x_1, \ldots, x_\ell)$. It is straightforward to see that $\mathcal{S}$ is $(2\epsilon\ell - 1)$-threshold-probing and that (21) holds. What remains is to show (20). Since $E(Z) = \epsilon\ell$,

$$P(Z \ge 2\epsilon\ell) = P(Z \ge 2E(Z)) \le \exp\left(-\frac{\epsilon\ell}{3}\right), \tag{22}$$

where (22) comes from the Chernoff bound with $\xi = 1$ (cf. Lemma 3). This finishes the proof. □

The following corollary combines Lemmas 5 and 6, and will be useful in the sequel.

Corollary 1. Let $d, \ell \in \mathbb{N}$ with $\ell > d$ and let $\mathcal{A}$ be a $d/(4\ell \cdot |\mathcal{X}|)$-noisy adversary on $\mathcal{X}^\ell$. Then there exists a $(d/2 - 1)$-threshold-probing adversary $\mathcal{S}$ such that

$$\Delta\big(\mathsf{out}_{\mathcal{A}}(x_1, \ldots, x_\ell)\ ;\ \mathsf{out}_{\mathcal{S}}(x_1, \ldots, x_\ell) \mid \mathsf{out}_{\mathcal{S}}(x_1, \ldots, x_\ell) \neq \bot\big) = 0 \tag{23}$$

and $P(\mathsf{out}_{\mathcal{S}}(x_1, \ldots, x_\ell) = \bot) \le \exp(-d/12)$. Moreover, if $\mathcal{A}$ is poly-time-noisy then $\mathcal{S}$ works in time polynomial in $\ell \cdot |\mathcal{X}|$.

Proof. By Lemma 5 there exists a $d/(4\ell)$-random-probing adversary $\mathcal{A}'$ whose output is distributed identically to the output of $\mathcal{A}$.
In turn, by Lemma 6 for $t = 2 \cdot (d/(4\ell)) \cdot \ell = d/2$ there exists a $(t - 1)$-threshold-probing adversary $\mathcal{S}$ whose output, conditioned on not being equal to ⊥, is distributed identically to the output of $\mathcal{A}'$, and such that $P(\mathsf{out}_{\mathcal{S}}(x_1, \ldots, x_\ell) = \bot) \le \exp(-d/12)$. If $\mathcal{A}$ is poly-time-noisy then clearly the expected working time of $\mathcal{A}'$ is polynomial in $\ell \cdot |\mathcal{X}|$. Since the working time of $\mathcal{S}$ is linear in the working time of $\mathcal{A}'$, this finishes the proof. □

5. Leakage from Computation

In this section we address the main topic of this paper, which is the noise-resilience of cryptographic computations. Our main model will be the model of arithmetic circuits over a finite field. First, in Sect. 5.1 we present our security definitions, and then, in Sect. 5.2 we describe a secure “compiler” that transforms any cryptographic scheme secure in the “black-box” model into one secure against the noisy leakage (it is essentially identical to the transformation of [17] later extended in [31]). Finally, in the last section we present our security results.

5.1. Definitions

A (stateful arithmetic) circuit $\Gamma$ over a field $\mathbb{F}$ is a directed graph whose nodes are called gates. Each gate $\gamma$ can be of one of the following types: an input gate $\gamma^{\mathrm{inp}}$ of fan-in zero, an output gate $\gamma^{\mathrm{out}}$ of fan-out zero, a random gate $\gamma^{\mathrm{rand}}$ of fan-in zero, a multiplication gate $\gamma^{\times}$ of fan-in 2, an addition gate $\gamma^{+}$ of fan-in 2, a subtraction gate $\gamma^{-}$ of fan-in 2, a constant gate $\gamma^{\mathrm{const}}$, and a memory gate $\gamma^{\mathrm{mem}}$ of fan-in 1. Following [17] we assume that the fan-out of every gate is at most 3. The only cycles that are allowed in $\Gamma$ must contain exactly 1 memory gate. The size $|\Gamma|$ of the circuit $\Gamma$ is defined to be the total number of its gates. The numbers of input gates, output gates and memory gates will be denoted $|\Gamma.\mathrm{inp}|$, $|\Gamma.\mathrm{out}|$, and $|\Gamma.\mathrm{mem}|$, respectively.

The computation of $\Gamma$ is performed in several “rounds” numbered 1, 2, ....
In each of them the circuit will take some input, produce an output and update the memory state. Initially, the memory gates of $\Gamma$ are preloaded with some initial “state” $k_0 \in \mathbb{F}^{|\Gamma.\mathrm{mem}|}$. At the beginning of the $i$th round the input gates are loaded with elements of some vector $a_i \in \mathbb{F}^{|\Gamma.\mathrm{inp}|}$ called the input for the $i$th round. The computation of $\Gamma$ in the $i$th round depends on $a_i$ and on the memory state $k_{i-1}$. It proceeds in a straightforward way: if all the input wires of a given gate are known then the value on its output wire can be computed naturally: if $\gamma$ is a multiplication gate with input wires carrying values $a$ and $b$, then its output wire will carry the value $a \cdot b$ (where “·” is the multiplication operation in $\mathbb{F}$), and the addition and the subtraction gates are handled analogously. We assume that the random gates produce a fresh random field element in each round. The output of the $i$th round is read off from the output gates and denoted $b_i \in \mathbb{F}^{|\Gamma.\mathrm{out}|}$. The state after the $i$th round is contained in the memory gates and denoted $k_i$. For $k \in \mathbb{F}^{|\Gamma.\mathrm{mem}|}$ and a sequence of inputs $(a_1, \ldots, a_m)$ (where each $a_i \in \mathbb{F}^{|\Gamma.\mathrm{inp}|}$) let $\Gamma(k, a_1, \ldots, a_m)$ denote the sequence $(B_1, \ldots, B_m)$ where each $B_i$ is the output of $\Gamma$ with $k_0 = k$ and inputs $a_1, \ldots, a_m$ in rounds 1, 2, .... Observe that, since $\Gamma$ is randomized, $\Gamma(k, a_1, \ldots, a_m)$ is a random variable.

A black-box circuit adversary $\mathcal{A}$ is a machine that adaptively interacts with a circuit $\Gamma$ via the input and output interface. Then $\mathsf{out}\big(\mathcal{A} \overset{\mathrm{bb}}{\leftrightarrows} \Gamma(k)\big)$ denotes the output of $\mathcal{A}$ after interacting with $\Gamma$ whose initial memory state is $k_0 = k$. A δ-noisy circuit adversary $\mathcal{A}$ is an adversary that has the following additional ability: after each $i$th round $\mathcal{A}$ gets some partial information about the internal state of the computation via the noisy leakage functions. More precisely: let $(X_1, \ldots, X_\ell)$ be the random variable denoting the values on the wires of $\Gamma(k)$ in the $i$th round. Then $\mathcal{A}$ plays the role of a δ-noisy
Then A plays the role of a δ-noisy Unifying Leakage Models: From Probing Attacks to Noisy Leakage adversary in a game against ( X ,..., X ) (c.f. Sect. 4), namely: he chooses a sequence {Noise : F → Y} of functions such that every Noise is δ -noisy for some δ ≤ δ i i i i i =1 noisy and he receives Noise ( X ), ..., Noise ( X ).Let out A  Γ(k) denote the output 1 1 of such an A after interacting with Γ whose initial memory state is k = k. We can also replace, in the above definition, the “δ-noisy adversary” with the “- random probing adversary”. In this case, after each ith round A chooses a sequence ( ,..., ) such that each  ≤  and he learns ϕ ( X ),...,ϕ ( X ), where each ϕ 1 i 1 1 i rnd is the  -identity function. Let out A  Γ(k) denote the output of such A after interacting with Γ whose initial memory state is k = k. Analogously we can replace the “δ-noisy adversary” with the “t-threshold probing adversary” obtaining an adversary that after each ith round A learns t elements of thr X ,..., X .Let out A  Γ(k) denote the output of such A after interacting with Γ whose initial memory state is k = k. Definition 2. Consider two stateful circuits Γ and Γ (over some field F) and a random- ized encoding function Enc. We say that Γ is a (δ, ξ )-noise-resilient implementation of |Γ.inp| a circuit Γ w.r.t. Enc if the following holds for every k ∈ F : 1. the input-output behavior of Γ(k) and Γ (Enc(k)) is identical, i.e.: for every se- quence of inputs a ,..., a and outputs b ,..., b we have 1 m 1 m P (Γ(k, a ,..., a ) = (b ,..., b )) 1 m 1 m = P Γ (Enc(k), a ,..., a ) = (b ,..., b ) 1 m 1 m and 2. for every δ-noisy circuit adversary A there exists a black-box circuit adversary S such that noisy bb Δ out S  Γ(k) ; out A  Γ (Enc(k)) ≤ ξ. (24) The definition of Γ being a (, ξ )-random-probing resilient implementation of a circuit Γ is identical to the one above, except that Point 2 is replaced with: 2’. 
for every ε-random-probing circuit adversary $\mathcal{A}$ there exists a black-box circuit adversary $\mathcal{S}$ such that

$$\Delta\Big(\mathsf{out}\big(\mathcal{S} \overset{\mathrm{bb}}{\leftrightarrows} \Gamma(k)\big)\ ;\ \mathsf{out}\big(\mathcal{A} \overset{\mathrm{rnd}}{\leftrightarrows} \Gamma'(\mathsf{Enc}(k))\big)\Big) \le \xi.$$

The definition of $\Gamma'$ being a $(t, \xi)$-threshold-probing resilient implementation of a circuit $\Gamma$ is identical to the one above, except that Point 2 is replaced with:

2”. for every $t$-threshold-probing circuit adversary $\mathcal{A}$ there exists a black-box circuit adversary $\mathcal{S}$ such that

$$\Delta\Big(\mathsf{out}\big(\mathcal{S} \overset{\mathrm{bb}}{\leftrightarrows} \Gamma(k)\big)\ ;\ \mathsf{out}\big(\mathcal{A} \overset{\mathrm{thr}}{\leftrightarrows} \Gamma'(\mathsf{Enc}(k))\big)\Big) \le \xi.$$

In all cases above we will say that $\Gamma'$ is an implementation of $\Gamma$ with efficient simulation if the simulator $\mathcal{S}$ works in time polynomial in $|\Gamma'| \cdot |\mathbb{F}|$ as long as $\mathcal{A}$ is poly-time and the noise functions specified by $\mathcal{A}$ are efficiently decidable.

5.2. The Implementation

In this section we describe the circuit compiler of [17], generalized to larger fields in [31]. Let $\Gamma$ be a stateful arithmetic circuit and let $d \in \mathbb{N}$ be a parameter. The encoding function $\mathsf{Enc}_+$ that we use is also standard and is often called the “additive masking”. It is defined as: $\mathsf{Enc}_+(x) := (X_1, \ldots, X_d)$, where $X_1, \ldots, X_d$ are uniform such that $X_1 + \cdots + X_d = x$.

At a high level, each wire $w$ in the original circuit $\Gamma$ is represented by a wire bundle in $\Gamma'$, consisting of $d$ wires $\vec{w} = (w_1, \ldots, w_d)$, that carry an encoding of $w$. The gates in $\Gamma$ are replaced gate-by-gate with so-called gadgets, computing on encoded values. The main difficulty is to construct gadgets that remain “secure” even if their internals may leak. Because the transformed gadgets in $\Gamma'$ operate on encodings, $\Gamma'$ needs to have a subcircuit at the beginning that encodes the inputs and another subcircuit at the end that decodes the outputs. We will deal with the output decoding later. The input encoding is easy to implement for our encoding function $\mathsf{Enc}_+$: to encode an input $x$ one simply uses the random gates to generate $d - 1$ field elements $x_1, \ldots, x_{d-1}$ and then computes $x_d$ as $x - (x_1 + \cdots + x_{d-1})$. Clearly this can be done using $d$ addition and subtraction gates.
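The input encoding just described can be sketched in a few lines. This is only an illustration under stated assumptions: the prime modulus `P = 257` and the function names `encode` and `decode` are hypothetical choices for the example (the paper works over an arbitrary finite field), and Python's `secrets.randbelow` stands in for the random gates.

```python
import secrets

P = 257  # hypothetical prime modulus; stands in for the finite field F

def encode(x, d, p=P):
    """Additive masking: return d shares that sum to x modulo p."""
    # The first d-1 shares play the role of the outputs of d-1 random gates.
    shares = [secrets.randbelow(p) for _ in range(d - 1)]
    # The last share is fixed so that all d shares sum to x.
    shares.append((x - sum(shares)) % p)
    return shares

def decode(shares, p=P):
    """Recover the encoded value by summing all shares modulo p."""
    return sum(shares) % p

shares = encode(42, d=5)
print(len(shares), decode(shares))
```

Any $d - 1$ of the shares are uniformly distributed and independent of the encoded value, which is the property the gadgets of the compiler have to preserve.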
Recall that the memory gates of $\Gamma'$ are assumed to be preloaded with field elements that already encode $k$ using the encoding $\mathsf{Enc}_+$ [cf. (24)]; hence, there is no need to encode $k$.

Each constant gate $\gamma_c^{\mathrm{const}}$ in $\Gamma$ can be transformed into $d$ constant gates in $\Gamma'$, the first of them being $\gamma_c^{\mathrm{const}}$ and the remaining ones being $\gamma_0^{\mathrm{const}}$. This is trivially correct as $c = c + 0 + \cdots + 0$. Every random gate $\gamma^{\mathrm{rand}}$ in $\Gamma$ is transformed into $d$ random gates in $\Gamma'$. This works since, clearly, a uniformly random encoding $(X_1, \ldots, X_d)$ encodes a uniformly random element of $\mathbb{F}$.

What remains to show is how the operation (addition, subtraction, and multiplication) gates are handled. Consider a gate $\gamma$ in $\Gamma$. Let $a$ and $b$ be its input wires and let $\vec{a} = (a_1, \ldots, a_d)$ and $\vec{b} = (b_1, \ldots, b_d)$ be their corresponding wire bundles in $\Gamma'$. Let the output wire bundle in $\Gamma'$ be $(c_1, \ldots, c_d)$. The cases when $\gamma$ is an addition or subtraction gate are actually easy to deal with, thanks to the linearity of the encoding function. For example, if $\gamma$ is an addition gate $\gamma^{+}$ then each $c_i$ can be computed using an addition gate $\gamma^{+}$ in $\Gamma'$ with input wires $a_i$ and $b_i$ (this is obviously correct as $(a_1 + b_1) + \cdots + (a_d + b_d) = (a_1 + \cdots + a_d) + (b_1 + \cdots + b_d)$). The subtraction is handled analogously.

The only tricky case is when $\gamma$ is the multiplication gate. In this case the circuit $\Gamma'$ generates, for every $1 \le i < j \le d$, a random field element $z_{i,j}$ (this is done using the random gates in $\Gamma'$). Then, for every $1 \le j < i \le d$ it computes $z_{i,j} := a_i b_j + a_j b_i - z_{j,i}$, and finally it computes each $c_i$ (for $i = 1, \ldots, d$) as $c_i := a_i b_i + \sum_{j \neq i} z_{i,j}$. To see why this computation is correct consider the sum $c = c_1 + \cdots + c_d$ and observe that every $z_{i,j}$ in it appears exactly once with a plus sign and once with a minus sign, and hence it cancels out. Moreover each term $a_i b_j$ appears in the formula for $c$ exactly once.
Hence $c$ is equal to $\sum_{i,j \in \{1,\ldots,d\}} a_i b_j = \big(\sum_{i=1}^{d} a_i\big)\big(\sum_{j=1}^{d} b_j\big) = ab$. It is straightforward to verify that the total number of gates in this gadget is $3.5 \cdot d^2$. This finishes the description of the compiler.

The multiplication gadget above turns out to be useful as a building block for “refreshing” of the encoding. More concretely, suppose we have a wire bundle $\vec{a} = (a_1, \ldots, a_d)$ and we wish to obtain another bundle $\vec{b} = (b_1, \ldots, b_d)$ such that $\vec{b}$ is a fresh encoding of $\mathsf{Dec}_+(\vec{a})$. This can be achieved by a Refresh sub-gadget constructed as follows. First, create an encoding $(1, 0, \ldots, 0)$ of 1 (using $d$ constant gates), and multiply $(1, 0, \ldots, 0)$ and $\vec{a}$ together using the multiplication protocol above. Since $(1, 0, \ldots, 0)$ is an encoding of 1, the result will be an encoding of $1 \cdot a = a$. The multiplication can be done with $3.5 \cdot d^2$ gates, and hence altogether this gadget uses $3.5 \cdot d^2 + d$ gates.

We can now use the Refresh sub-gadget to construct the output gadgets in $\Gamma'$. Let $\gamma^{\mathrm{out}}$ be an output gate in $\Gamma$ with an input wire $a$. Then in $\Gamma'$ it is transformed into the following: let $\vec{a}$ be the wire bundle corresponding to $a$. First apply the Refresh sub-gadget, and then calculate the sum $b_1 + \cdots + b_d$ (where $(b_1, \ldots, b_d)$ is the output of Refresh) and output the result.

The refreshing gadget is also useful to provide security of the memory encoding in the multi-round scenario. More precisely, we assume that every memory state gets refreshed at the end of each round by the Refresh procedure. It is easy to see that without this “refreshing” the contents of the memory would eventually leak completely to the adversary even if he probes a very limited number (say: 1) of wires in each round. For more details see [17].

5.3. Security in the Probing Model [17]

In [17] it is shown that the compiler from the previous section is secure against probing attacks in which the adversary can probe at most $\lfloor (d - 1)/2 \rfloor$ wires in each round.
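To make the compiler of the previous section concrete, its multiplication gadget and the Refresh sub-gadget can be sketched as follows. The sketch checks functional correctness only and says nothing about leakage, which depends on the exact order of the intermediate operations in the circuit. The prime modulus `P = 257` and the names `mult_gadget` and `refresh` are assumptions made for this example.

```python
import secrets

P = 257  # hypothetical prime modulus; stands in for the finite field F

def mult_gadget(a, b, p=P):
    """Given share vectors a and b, return shares c with sum(c) = sum(a) * sum(b) mod p."""
    d = len(a)
    z = [[0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1, d):
            # For i < j the element z[i][j] is a fresh random field element.
            z[i][j] = secrets.randbelow(p)
            # The mirrored element is a_j*b_i + a_i*b_j - z[i][j].
            z[j][i] = (a[j] * b[i] + a[i] * b[j] - z[i][j]) % p
    # Output share c_i = a_i*b_i + sum over j != i of z[i][j].
    return [(a[i] * b[i] + sum(z[i][j] for j in range(d) if j != i)) % p
            for i in range(d)]

def refresh(a, p=P):
    """Re-randomize an encoding by multiplying it with the encoding (1, 0, ..., 0) of 1."""
    one = [1] + [0] * (len(a) - 1)
    return mult_gadget(one, a, p)

a = [3, 5, 7]  # shares of 15
b = [2, 4, 6]  # shares of 12
print(sum(mult_gadget(a, b)) % P, sum(refresh(a)) % P)
```

The printed values are the products and sums modulo `P` of the encoded values, independently of the random elements drawn. For $d$ shares, the bound quoted above allows the adversary at most $\lfloor (d - 1)/2 \rfloor$ probes per round.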
This parameter may be a bit disappointing as the number of probes that the adversary needs to break the security does not grow with the size of the circuit. This assumption may seem particularly unrealistic for large circuits $\Gamma$. Fortunately, [17] also shows a small modification of the construction from Sect. 5.2 that is resilient to a larger number of probes, provided that the number of probes from each gadget is bounded. Before we present it let us argue why the original construction is not secure against such attacks. To this end, assume that our circuit $\Gamma$ has a long sequence of wires $a_1, \ldots, a_m$, where each $a_i$ (for $i > 1$) is the result of adding to $a_{i-1}$ (using an addition gate) a 0 constant (that was generated using a $\gamma_0^{\mathrm{const}}$ gate). It is easy to see that in the circuit $\Gamma'$ all the wire bundles $\vec{a}_1, \ldots, \vec{a}_m$ (where each $\vec{a}_i$ corresponds to $a_i$) will be identical. Hence, the adversary that probes even a single wire in each addition gadget in $\Gamma'$ will learn the encoding of $a_1$ completely as long as $m \ge d$. Fortunately one can deal with this problem by “refreshing” the encoding after each subtraction and addition gate exactly in the same way as done before, i.e., by using the Refresh sub-gadget.

Footnote: Strictly speaking the proof of [17] considers only the case when $\mathbb{F} = GF(2)$. It was observed in [31] that it can be extended to any finite field, as the only properties of $GF(2)$ that are used in the proof are the field axioms. Moreover, addition (XOR) gates are not considered, as AND and NOT gates are sufficient to describe any type of Boolean circuit. A full proof of security for all linear operations (including the field addition) and their composability has recently been provided by Andrychowicz et al. [3]. Moreover, we emphasize that when we use a composable refreshing scheme that is placed between each gadget, then the proof for the addition gate is trivial and just follows from counting the intermediate operations.

Lemma 7.
([17]) Let $\Gamma$ be an arbitrary stateful arithmetic circuit over some field $\mathbb{F}$. Let $\Gamma'$ be the circuit that results from the procedure described above. Then $\Gamma'$ is a $(\lfloor (d - 1)/2 \rfloor \cdot |\Gamma|, 0)$-threshold-probing resilient implementation of a circuit $\Gamma$ (with efficient simulation), provided that the adversary does not probe each gadget more than $\lfloor (d - 1)/2 \rfloor$ times in each round.

We notice that [17] also contains a second transformation with blow-up $O(d\,|\Gamma|)$. It may be possible that this transformation can provide better noise parameters than those achieved by Theorem 2. However, due to the hidden parameters in the O-notation we do not get a straightforward improvement of our result. In particular, using this transformation the size of the transformed circuit depends also on an additional statistical security parameter, which will affect the tolerated noise level.

5.4. Resilience to Noisy Leakage from the Wires

We now show that the construction from Sect. 5.3 is secure against the noisy leakage. More precisely, we show the following.

Theorem 1. Let $\Gamma$ be an arbitrary stateful arithmetic circuit over some field $\mathbb{F}$. Let $\Gamma'$ be the circuit that results from the procedure described in Sect. 5.3. Then $\Gamma'$ is a $(\delta, |\Gamma| \cdot \exp(-d/12))$-noise-resilient implementation of $\Gamma$ (with efficient simulation), where

$$\delta := \big((28d + 8)\,|\mathbb{F}|\big)^{-1} = O(1/(d \cdot |\mathbb{F}|)).$$

Proof. Let $\mathcal{A}$ be a δ-noisy circuit adversary attacking $\Gamma'$. We construct an efficient black-box simulator $\mathcal{S}$ such that for every $k$ it holds that

$$\Delta\Big(\mathsf{out}\big(\mathcal{S} \overset{\mathrm{bb}}{\leftrightarrows} \Gamma(k)\big)\ ;\ \mathsf{out}\big(\mathcal{A} \overset{\mathrm{noisy}}{\leftrightarrows} \Gamma'(\mathsf{Enc}(k))\big)\Big) \le |\Gamma| \cdot \exp(-d/12). \tag{25}$$

Observe that in our construction every gate gets transformed into a gadget of at most $3.5 \cdot d^2 + d$ gates. Since each gate can have at most 2 inputs, the total number of wires in a gadget is $\ell := 7 \cdot d^2 + 2 \cdot d$. Let $\gamma^1, \ldots, \gamma^{|\Gamma|}$ be the gates of $\Gamma$. For each $i = 1, \ldots, |\Gamma|$ let the wires in the gadget in $\Gamma'$ that corresponds to $\gamma^i$ be denoted $(x^i_1, \ldots, x^i_\ell)$.
Since $\delta = d/(4\ell\,|\mathbb{F}|)$, we can use Corollary 1 and simulate the noise from each $(x^i_1, \ldots, x^i_\ell)$ by a $(d/2 - 1)$-threshold-probing adversary $\mathcal{S}_i$ working in time polynomial in $\ell \cdot |\mathcal{X}|$. The simulation is perfect, unless $\mathcal{S}_i$ outputs ⊥, which, by Corollary 1, happens with probability at most $\exp(-d/12)$. Hence, by the union bound the probability that some $\mathcal{S}_i$ outputs ⊥ is at most $|\Gamma| \cdot \exp(-d/12)$. Denote this event $\mathcal{E}$. From Lemma 7 we know that every probing adversary that attacks $\Gamma'$ by probing at most $\lfloor (d - 1)/2 \rfloor \ge d/2 - 1$ wires from each gadget can be perfectly simulated in polynomial time by an adversary $\mathcal{S}$ with black-box access to $\Gamma$. Hence, $\mathcal{A}$ can also be simulated perfectly with black-box access to $\Gamma$, conditioned on the fact that $\mathcal{E}$ did not occur. Hence we get

$$\Delta\Big(\mathsf{out}\big(\mathcal{S} \overset{\mathrm{bb}}{\leftrightarrows} \Gamma(k)\big) \mid \neg\mathcal{E}\ ;\ \mathsf{out}\big(\mathcal{A} \overset{\mathrm{noisy}}{\leftrightarrows} \Gamma'(\mathsf{Enc}(k))\big)\Big) = 0.$$

This, by Lemma 2 (Sect. 2.1), implies (25). Obviously $\mathcal{S}$ works in time polynomial in $|\Gamma| \cdot d \cdot |\mathbb{F}|$, which is polynomial in $|\Gamma'| \cdot |\mathbb{F}|$. This finishes the proof. □

In short, this theorem is proven by combining Corollary 1, which reduces the noisy adversary to the probing adversary, with Lemma 7, which shows that the construction from Sect. 5.3 is secure against probing.

5.5. Resilience to Noisy Leakage from the Gates

The model of Prouff and Rivain is actually slightly different from the one considered in the previous section. The difference is that they assume that the noise is generated by the gates, not by the wires. This can be formalized by assuming that each noise function Noise is applied to the “contents of a gate”. We do not need to specify exactly what we mean by this. It is enough to observe that the contents of each gate $\gamma$ can be described by at most 2 field elements: obviously if $\gamma$ is a random gate, output gate, or memory gate then its entire state in a given round can be described by one field element, and if $\gamma$ is an operation gate then it can be described by two field elements that correspond to $\gamma$'s input.
Hence, without loss of generality we can assume that the noise function is defined over the domain $\mathbb{F} \times \mathbb{F}$.

Formally, we define a δ-gate-noisy circuit adversary $\mathcal{A}$ as a machine that, besides having black-box access to a circuit $\Gamma(k)$, can, after each $i$th round, get some partial information about the internal state of the computation via the δ-noisy leakage functions applied to the gates (in a model described above). Let $\mathsf{out}\big(\mathcal{A} \overset{\mathrm{g\text{-}noisy}}{\leftrightarrows} \Gamma(k)\big)$ denote the output of such $\mathcal{A}$ after interacting with $\Gamma$ whose initial memory state is $k_0 = k$.

We can accordingly modify the definition of noise-resilient circuit implementations (cf. Definition 2). We say that $\Gamma'$ is a $(\delta, \xi)$-input-gate-noise resilient implementation of a circuit $\Gamma$ w.r.t. Enc if for every $k$ and every δ-noisy circuit adversary $\mathcal{A}$ described above there exists a black-box circuit adversary $\mathcal{S}$ working in time polynomial in $|\Gamma'| \cdot |\mathbb{F}|$ such that

$$\Delta\Big(\mathsf{out}\big(\mathcal{S} \overset{\mathrm{bb}}{\leftrightarrows} \Gamma(k)\big)\ ;\ \mathsf{out}\big(\mathcal{A} \overset{\mathrm{g\text{-}noisy}}{\leftrightarrows} \Gamma'(\mathsf{Enc}(k))\big)\Big) \le \xi. \tag{26}$$

It turns out that the transformation from Sect. 5.3 also works in this model, although with different parameters. More precisely we have the following theorem.

Theorem 2. Let $\Gamma$ be an arbitrary stateful arithmetic circuit over some field $\mathbb{F}$. Let $\Gamma'$ be the circuit that results from the procedure described in Sect. 5.3. Then $\Gamma'$ is a $(\delta, |\Gamma| \cdot \exp(-d/24))$-noise-resilient implementation of $\Gamma$ (with efficient simulation), where

$$\delta := \big((28d + 8) \cdot |\mathbb{F}|^2\big)^{-1} = O(1/(d \cdot |\mathbb{F}|^2)). \tag{27}$$

Proof. The proof is similar to the one of Theorem 1 so we only describe the key differences. Let $\mathcal{A}$ be a δ-noisy adversary. The number $\ell$ corresponds now to the number of gates in each gadget, and hence it is equal to $3.5 \cdot d^2 + d$. It is therefore straightforward to calculate that δ defined in (27) is equal to $(d/2)/(4\ell \cdot |\mathbb{F}|^2)$. Since the Noise function has domain of size $|\mathbb{F}|^2$, we can use Corollary 1, obtaining that $\mathcal{A}$ can be simulated by an adversary $\mathcal{S}$ that probes each gadget in fewer than $d/2$ positions.
Since now each “position” corresponds to a gate in the circuit, the adversary needs to probe up to two wires to determine its value. Therefore $\mathcal{S}$ probes fewer than $d$ wires in each gadget. Since $d$ is now 1/2 of what it was in the proof of Corollary 1, the error probability becomes $\exp(-(d/2)/12) = \exp(-d/24)$. □

Comparison with [28]. As described in the introduction, our main advantages over [28] are the removal of the assumption about the existence of leak-free gates, a stronger security model (chosen message attack instead of random message attack), and a more meaningful security statement. Still, it is interesting to compare our noise parameters with the parameters of [28]. Let us analyze how much noise is needed by [28] to ensure that the adversary obtains exponentially small information from leakage. The reader should keep in mind that both in our paper and in [28] “more noise” means that a certain quantity, δ in our case, is smaller. Hence, the larger δ is, the stronger the result becomes (as it means that less noise is required for the security to hold).

The main result of [28] is Theorem 4 on page 154. Unfortunately, the statement of this theorem is asymptotic, treating $|\mathbb{F}|$ as constant, and hence to get a precise bound on how much noise is required one needs to inspect the proof. The bound on the noise can be deduced from the part of the proof entitled “Security of Type 3 Subsequences”, where the required noise is inversely proportional to “$\lambda(d)$”, and this last value is linear in $d \cdot |\mathbb{F}|^3$ for a general $d$ and linear in $d \cdot |\mathbb{F}|^{3/2}$ for large $d$'s (note that $|\mathbb{F}|$ is denoted by $N$ in [28], and $d$ is a security parameter identical to ours). Hence, for a general $d$, their δ is $O(1/(d \cdot |\mathbb{F}|^3))$.

However, as explained in Sect. 3.1, the notion of distance in [28] is slightly different from the standard “statistical distance” that we use. Fortunately, one can use (7) to translate our bound into their language.
It turns out that in this case our and their bounds are asymptotically identical for general d's, i.e., $O(1/(d \cdot |F|^3))$. This is shown in Corollary 2 below. Note that this translation is unidirectional, in the sense that their “$O(1/(d \cdot |F|^3))$” bound does not imply a bound “$O(1/(d \cdot |F|^2))$” in our sense.

Note that our result holds only when the number of shares is large. For small values of d (e.g., d = 2, 3, 4) like those considered in [35], our result does not give meaningful bounds. This is similar to the work of Prouff and Rivain [28], and it is an interesting open research question to develop security models that work for small security parameters.

Corollary 2. Let Γ be an arbitrary stateful arithmetic circuit over some field F. Let $\widehat{\Gamma}$ be the circuit that results from the procedure described in Sect. 5.3. Then $\widehat{\Gamma}$ is a $(\delta', |\Gamma| \cdot \exp(-d/24))$-noise-resilient implementation of Γ (with efficient simulation) when the noise is defined using the β distance, where

$$\delta' = \big((14d + 4) \cdot |F|^3\big)^{-1} = O\big(1/(d \cdot |F|^3)\big).$$

Proof. From (7) with $\mathcal{X} = F \times F$, it follows that if Noise is δ'-noisy with respect to the β distance, then it is $(|F| \cdot \delta'/2)$-noisy in the standard sense. Since this last value is equal to δ defined in (27), we can use Theorem 2, obtaining that $\widehat{\Gamma}$ is a $(\delta', |\Gamma| \cdot \exp(-d/24))$-noise-resilient implementation of Γ when the noise is defined using the β distance.

Acknowledgements. We would like to thank the anonymous Eurocrypt and Journal of Cryptology reviewers for their careful reading of our manuscript and their many insightful comments.
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

References

[1] M. Ajtai, Secure computation with information leaking to an adversary. In Proceedings of the 43rd ACM Symposium on Theory of Computing, STOC 2011, San Jose, CA, USA, 6-8 June 2011, pages 715–724 (2011)
[2] A. Akavia, S. Goldwasser, V. Vaikuntanathan, Simultaneous Hardcore Bits and Cryptography against Memory Attacks. In TCC, pages 474–495 (2009)
[3] M. Andrychowicz, S. Dziembowski, S. Faust, Circuit compilers with O(1/log(n)) leakage rate. In Advances in Cryptology - EUROCRYPT 2016 - 35th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Proceedings, Part II, pages 586–615 (2016)
[4] J. Blömer, J. Guajardo, V. Krummel, Provably Secure Masking of AES. In Selected Areas in Cryptography, pages 69–83 (2004)
[5] C. Carlet, L. Goubin, E. Prouff, M. Quisquater, M. Rivain, Higher-Order Masking Schemes for S-Boxes. In FSE, pages 366–384 (2012)
[6] S. Chari, C.S. Jutla, J.R. Rao, P. Rohatgi, Towards Sound Approaches to Counteract Power-Analysis Attacks. In CRYPTO, pages 398–412 (1999)
[7] C. Clavier, J. Coron, N. Dabbous, Differential Power Analysis in the Presence of Hardware Countermeasures. In CHES, pages 252–263 (2000)
[8] J. Coron, I. Kizhvatov, Analysis and Improvement of the Random Delay Countermeasure of CHES 2009. In CHES, pages 95–109 (2010)
[9] D.P. Dubhashi, A. Panconesi, Concentration of Measure for the Analysis of Randomized Algorithms. Cambridge University Press (2009)
[10] S. Dziembowski, S. Faust, Leakage-Resilient Circuits without Computational Assumptions. In TCC, pages 230–247 (2012)
[11] S. Dziembowski, S. Faust, M. Skorski, Noisy leakage revisited. In Elisabeth Oswald and Marc Fischlin, editors, Advances in Cryptology - EUROCRYPT 2015 - 34th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Sofia, Bulgaria, April 26-30, 2015, Proceedings, Part II, volume 9057, pages 159–188. Springer (2015)
[12] S. Dziembowski, K. Pietrzak, Leakage-Resilient Cryptography. In FOCS, pages 293–302 (2008)
[13] S. Faust, T. Rabin, L. Reyzin, E. Tromer, V. Vaikuntanathan, Protecting Circuits from Leakage: the Computationally-Bounded and Noisy Cases. In EUROCRYPT, pages 135–156 (2010)
[14] S. Goldwasser, G.N. Rothblum, Securing computation against continuous leakage. In CRYPTO, pages 59–79 (2010)
[15] S. Goldwasser, G.N. Rothblum, How to Compute in the Presence of Leakage. In FOCS, pages 31–40 (2012)
[16] L. Goubin, J. Patarin, DES and Differential Power Analysis (The “Duplication” Method). In CHES, pages 158–172 (1999)
[17] Y. Ishai, A. Sahai, D. Wagner, Private Circuits: Securing Hardware against Probing Attacks. In CRYPTO, pages 463–481 (2003)
[18] A. Juma, Y. Vahlis, Protecting Cryptographic Keys against Continual Leakage. In CRYPTO, pages 41–58 (2010)
[19] J. Katz, V. Vaikuntanathan, Signature Schemes with Bounded Leakage Resilience. In ASIACRYPT, pages 703–720 (2009)
[20] P.C. Kocher, Timing attacks on implementations of Diffie-Hellman, RSA, DSS, and other systems. In CRYPTO’96, pages 104–113 (1996)
[21] P.C. Kocher, J. Jaffe, B. Jun, Differential Power Analysis. In CRYPTO’99, pages 388–397 (1999)
[22] S. Mangard, E. Oswald, T. Popp, Power Analysis Attacks: Revealing the Secrets of Smart Cards (Advances in Information Security). Springer-Verlag New York, Inc., Secaucus, NJ, USA (2007)
[23] U.M. Maurer, S. Tessaro, A hardcore lemma for computational indistinguishability: Security amplification for arbitrarily weak prgs with optimal stretch. In Daniele Micciancio, editor, TCC, volume 5978 of Lecture Notes in Computer Science, pages 237–254. Springer (2010)
[24] S. Micali, L. Reyzin, Physically Observable Cryptography (Extended Abstract). In TCC, pages 278–296 (2004)
[25] E. Miles, E. Viola, Shielding circuits with groups. In STOC, pages 251–260 (2013)
[26] M. Naor, G. Segev, Public-key cryptosystems resilient to key leakage. In CRYPTO, pages 18–35 (2009)
[27] E. Oswald, S. Mangard, N. Pramstaller, V. Rijmen, A Side-Channel Analysis Resistant Description of the AES S-Box. In FSE, pages 413–423 (2005)
[28] E. Prouff, M. Rivain, Masking against Side-Channel Attacks: A Formal Security Proof. In Thomas Johansson and Phong Q. Nguyen, editors, EUROCRYPT, volume 7881 of Lecture Notes in Computer Science, pages 142–159. Springer (2013)
[29] E. Prouff, T. Roche, Higher-Order Glitches Free Implementation of the AES Using Secure Multi-party Computation Protocols. In CHES, pages 63–78 (2011)
[30] J.-J. Quisquater, D. Samyde, ElectroMagnetic Analysis (EMA): Measures and Counter-Measures for Smart Cards. In E-smart, pages 200–210 (2001)
[31] M. Rivain, E. Prouff, Provably Secure Higher-Order Masking of AES. In CHES, pages 413–427 (2010)
[32] G.N. Rothblum, How to Compute under AC0 Leakage without Secure Hardware. In CRYPTO, pages 552–569 (2012)
[33] F.-X. Standaert, T. Malkin, M. Yung, A Unified Framework for the Analysis of Side-Channel Key Recovery Attacks. In EUROCRYPT, pages 443–461 (2009)
[34] F.-X. Standaert, O. Pereira, Y. Yu, Leakage-Resilient Symmetric Cryptography under Empirically Verifiable Assumptions. In CRYPTO (1), pages 335–352 (2013)
[35] F.-X. Standaert, N. Veyrat-Charvillon, E. Oswald, B. Gierlichs, M. Medwed, M. Kasper, S. Mangard, The World Is Not Enough: Another Look on Second-Order DPA. In ASIACRYPT, pages 112–129 (2010)
[36] N. Veyrat-Charvillon, F.-X. Standaert, Adaptive Chosen-Message Side-Channel Attacks. In Jianying Zhou and Moti Yung, editors, ACNS, volume 6123 of Lecture Notes in Computer Science, pages 186–199 (2010)

Publisher
Springer Journals
Copyright
Copyright © 2018 by The Author(s)
Subject
Computer Science; Coding and Information Theory; Computational Mathematics and Numerical Analysis; Combinatorics; Probability Theory and Stochastic Processes; Communications Engineering, Networks
ISSN
0933-2790
eISSN
1432-1378
D.O.I.
10.1007/s00145-018-9284-1
Our analysis works by showing that security in certain theoretical leakage models implies security in the model of [28] and hence may be seen as a first attempt to unify the large class of different leakage models used in recent results.

The Masking Countermeasure. A large body of work on cryptographic engineering has developed countermeasures to defeat side-channel attacks (see, e.g., [22] for an overview). While many countermeasures are specifically tailored to protect particular cryptographic implementations (e.g., key updates or shielded hardware), a method that generically works for most cryptographic schemes is masking [4,16,27,35]. The basic idea of a masking scheme is to secretly share all sensitive information, including the secret key and all intermediate values that depend on it, thereby making the leakage independent of the secret data. The most prominent masking scheme is Boolean masking: a bit b is encoded by a random bit string $(b_1, \ldots, b_n)$ such that $b = b_1 \oplus \cdots \oplus b_n$. The main difficulty in designing masking schemes is to develop masked operations, which securely compute on encoded data and ensure that all intermediate values are protected.

Masking Against Noisy Leakages. Besides the fact that masking can be used to protect arbitrary computation, it has the advantage that it can be analyzed in formal security models. The first work that formally studies the soundness of masking in the presence of leakage is the seminal work of Chari et al. [6]. The authors consider a model where each share $b_i$ of an encoding is perturbed by Gaussian noise and show that the number of noisy samples needed to recover the encoded secret bit b grows exponentially with the number of shares. As stated in [6], this model matches real-world physical leakages, which are inherently noisy. Moreover, many practical solutions exist to amplify leakage noise (see for instance the works of [7,8,22]).
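The Boolean encoding and the noisy per-share leakage described above can be sketched as follows. This is a minimal illustration, not the construction analyzed in [6] or [28]: the function names, the choice of leaking "share value plus Gaussian noise", and the parameter sigma are assumptions made for the example.

```python
import random
from functools import reduce
from operator import xor


def encode_bit(b, n, rng):
    # Pick n-1 uniform shares; the last share fixes the XOR of all shares to b.
    shares = [rng.randrange(2) for _ in range(n - 1)]
    shares.append(reduce(xor, shares, b))
    return shares


def decode_bit(shares):
    # The encoded bit is the XOR of all shares.
    return reduce(xor, shares, 0)


def noisy_leakage(shares, sigma, rng):
    # Assumed leakage model: each share leaks its value plus independent
    # Gaussian noise with standard deviation sigma.
    return [s + rng.gauss(0.0, sigma) for s in shares]


if __name__ == "__main__":
    rng = random.Random(0)
    shares = encode_bit(1, 5, rng)
    assert decode_bit(shares) == 1
    print(noisy_leakage(shares, sigma=2.0, rng=rng))
```

An attacker in this toy model must combine the noisy observations of all n shares to learn anything about b, which is the intuition behind the exponential growth in the number of samples mentioned above.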
One limitation of the security analysis given in [6] is the fact that it does not consider leakage emitting from masked computation. This shortcoming has been addressed in the recent important work of Prouff and Rivain [28], who, at Eurocrypt 2013, extend the noisy leakage model of Chari et al. [6] to also include leakage from the masked operations. Specifically, they show that a variant of the construction of Ishai et al. [17] is secure even when there is noisy leakage from all the intermediate values that are produced during the computation. The authors of [28] also generalize the noisy leakage model of Chari et al. [6] to a wider range of leakage functions instead of considering only the Gaussian one. While noisy leakage is clearly closer to the physical leakage occurring in the real world, the security analysis of [28] has a number of shortcomings that strongly limit the settings in which the masking countermeasure can be used while still achieving the proved security statements. In particular, like earlier works on leakage-resilient cryptography [10,13], the security analysis of Prouff and Rivain relies on so-called leak-free gates. Moreover, security is shown in a restricted adversarial model that assumes that plaintexts are chosen uniformly during an attack and that the adversary does not exploit joint information from the leakages and, e.g., the ciphertext. We discuss these shortcomings in more detail in the next section.

1.1. The Work of Prouff and Rivain [28]

Prouff and Rivain [28] analyze the security of a block-cipher implementation that is masked with an additive masking scheme working over a finite field F. More precisely, let t be the security parameter; then a secret s ∈ F is represented by an encoding $(X_1, \ldots, X_t)$ such that each $X_i \leftarrow F$ is uniformly random subject to $s = X_1 \oplus \cdots \oplus X_t$. As discussed above, the main difficulty in designing secure masking schemes is to devise masked operations that work on masked values. To this end, Prouff and Rivain use the original scheme of Ishai et al. [17], augmented with some techniques from [5,31] to work over larger fields and to obtain a more efficient implementation. The masked operations are built out of several smaller components. First, a leak-free operation that refreshes encodings, i.e., it takes as input an encoding $(X_1, \ldots, X_t)$ of a secret s and outputs a freshly and independently chosen encoding of the same value. Second, a number of leaky elementary operations that work on a constant number of field elements. For each of these elementary operations the adversary is given leakage f(X), where X are the inputs of the operation and f is a noisy function. Clearly, the noise level has to be high enough so that given f(X) the value of X is not completely revealed. To this end, the authors introduce the notion of a bias, which informally says that the statistical distance between the distribution of X and the conditional distribution X | f(X) is bounded by some parameter. While noisy leakages are certainly a step in the right direction to model physical leakage, we detail below some of the limitations of the security analysis of Prouff and Rivain [28]:

1. Leak-Free Components. The assumption of leak-free computation has been used in earlier works on leakage-resilient computation [10,13]. It is a strong assumption on the physical hardware and, as stated in [28], an important limitation of the current proof approach. The leak-free component of [28] is a simple operation that takes as input an encoding and refreshes it. While the computation of this operation is supposed to be completely shielded against leakage, the inputs and the outputs of this computation may leak. Notice that the leak-free component of [28] depends on the computation that is carried out in the circuit by taking inputs.
In particular, this means that the computation of the leak-free component depends on secret information, which makes it harder to protect in practice and is different from earlier works that use leak-free components [10,13].

2. Random Message Attacks. The security analysis is given only for random (known) message attacks. This is in contrast to most works in cryptography, which usually consider at least a chosen message attack. Hence, the proof does not cover chosen plaintext or chosen ciphertext attacks. We notice, however, that it is not clear whether chosen message attacks can improve the adversary’s success probability in the context of DPA attacks [36].

3. Mutual-Information-Based Security Statement. The final statement of Theorem 4 in [28] only gives a bound on the mutual information of the key and the leakages from the cipher. While a mutual information analysis is a common method in side-channel analysis to evaluate the security of countermeasures [33], it has important shortcomings such as not including information that an adversary may learn from exploiting joint information from the leakages and plaintext/ciphertext pairs. Notice that such use of mutual information gets particularly problematic under continuous leakage attacks, since multiple plaintext/ciphertext pairs information-theoretically completely reveal the secret key. The more standard security notion used in cryptography and also for the analysis of masking schemes, e.g., in the work of Ishai et al., uses a simulation-based approach and does not have these drawbacks.

1.2. Our Contribution

We show in this work how to eliminate limitations 1–3 by a simple and elegant simulation-based argument and a reduction to the so-called t-probing adversarial setting [17] (that in this paper we call the t-threshold-probing model to emphasize the difference between this model and the random-probing model defined later).
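To make the threshold-probing setting concrete, the sketch below exhaustively checks a standard property of XOR sharing: any t < n probed shares of an n-share encoding are distributed identically for both values of the secret bit, whereas probing all n shares reveals it. This only covers probes on the shares of a single encoding, not on intermediate values of a masked computation, and the helper names are hypothetical.

```python
from collections import Counter
from itertools import combinations, product


def encodings(secret, n):
    # All n-bit share vectors whose XOR (parity) equals the secret bit.
    return [s for s in product((0, 1), repeat=n) if sum(s) % 2 == secret]


def probe_distribution(secret, n, positions):
    # Histogram of the probed shares over a uniformly random encoding of secret.
    return Counter(tuple(s[i] for i in positions) for s in encodings(secret, n))


def leaks_nothing(n, t):
    # True if every choice of t probed positions yields the same distribution
    # for secret 0 and secret 1.
    return all(
        probe_distribution(0, n, pos) == probe_distribution(1, n, pos)
        for pos in combinations(range(n), t)
    )


if __name__ == "__main__":
    print(leaks_nothing(4, 3))  # probing 3 of 4 shares
    print(leaks_nothing(4, 4))  # probing all 4 shares
```

The exhaustive enumeration is only feasible for small n; it is meant as a sanity check of the intuition, not as a proof technique.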
The t-threshold-probing model considers an adversary that can learn the value of t intermediate values that are produced during the computation and is often considered a good approximation for modeling higher-order attacks. We notice that limitation 4 from above is what enables our security analysis. The fact that the noise is independent for each elementary operation allows us to formally prove security under the same noise model as [28], but using a simpler and improved analysis. In particular, we are able to show that the original construction of Ishai et al. satisfies the standard simulation-based security notion under noisy leakages without relying on any leak-free components. We emphasize that our techniques are very different from (and much simpler than) the recent breakthrough result of Goldwasser and Rothblum [15], who show how to eliminate leak-free gates in the bounded leakage model. We will further discuss related works in Sect. 1.3.

Our proof considers three different leakage models and shows connections between them. One may view our work as a first attempt to “reduce” the number of different leakage models, which is in contrast to many earlier works that introduced new leakage settings. Eventually, we are able to reduce the security in the noisy leakage model to the security in the t-threshold-probing model. This shows that, for the particular choice of parameters given in [28], security in the t-threshold-probing model implies security in the noisy leakage model. This goes in line with the common approach of showing security against t-order attacks, which usually requires proving security in the t-threshold-probing model. Moreover, it shows that the original construction of Ishai et al.
that has been used in many works on masking (including the work of Prouff and Rivain) is indeed a sound approach for protecting against side-channel leakages when assuming that they are sufficiently noisy. We give some more details on our techniques below.

From Noisy Leakages to Random Probes. As a first step in our security proof we show that we can simulate any adversary in the noisy leakage model of Prouff and Rivain with an adversary in a simpler noise model that we name a random-probing adversary and that is similar to a model introduced in [17]. In this model, an adversary recovers an intermediate value with probability ε and obtains a special symbol ⊥ with probability 1 − ε. This reduction shows that this model is worth studying, although from the engineering perspective it may seem unnatural.

From Random Probes to the t-Threshold-Probing Model. We show how to go from the random-probing adversary setting to the more standard t-threshold-probing adversary of Ishai et al. [17]. This step is rather easy: due to the independence of the noise, we can apply Chernoff’s bound almost immediately. One technical difficulty is that the work of Prouff and Rivain considers joint noisy leakage from elementary operations, while the standard t-threshold-probing setting only talks about leakage from wires. Notice, however, that the elementary operations of [28] only depend on two inputs and, hence, it is not hard to extend the result of Ishai et al. to consider a “gate probing adversary” by tolerating a loss in the parameters. Finally, our analysis enables us to show security of the masking-based countermeasure without the limitations 1–3 discussed above.

Leakage Resilient Circuits with Simulation-Based Security. In our security analysis we use the framework of leakage resilient circuits introduced in the seminal work of Ishai et al. [17].
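The step from random probes to threshold probes sketched above rests on a concentration argument: if each of n values is revealed independently with probability ε, the number of revealed values rarely exceeds a small multiple of its mean εn. The following illustration compares the exact binomial tail with a Chernoff-style bound of the form exp(−ξ²·E[Z]/3). The concrete numbers (n = 200, ε = 0.05, ξ = 1) and the function names are assumptions chosen for the example, not parameters from the paper.

```python
import math


def binomial_tail(n, eps, t):
    # Exact probability that at least t of n independent probes succeed,
    # where each probe succeeds with probability eps.
    return sum(
        math.comb(n, k) * eps ** k * (1 - eps) ** (n - k)
        for k in range(t, n + 1)
    )


def chernoff_bound(n, eps, xi):
    # Upper bound on P[Z >= (1 + xi) * E[Z]] with E[Z] = n * eps,
    # valid for 0 <= xi <= 1.
    return math.exp(-xi * xi * n * eps / 3)


if __name__ == "__main__":
    n, eps, xi = 200, 0.05, 1.0
    threshold = 20  # (1 + xi) * n * eps
    exact = binomial_tail(n, eps, threshold)
    bound = chernoff_bound(n, eps, xi)
    assert exact <= bound
    print(f"exact tail = {exact:.3e}, Chernoff bound = {bound:.3e}")
```

A random-probing adversary that exceeds the threshold only with this small probability can therefore be treated as a threshold-probing adversary, at the cost of adding the tail probability to the simulation error.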
A circuit compiler takes as input the description of a cryptographic scheme C with secret key K, e.g., a circuit that describes a block cipher, and outputs a transformed circuit $\widehat{C}$ and corresponding key $\widehat{K}$. The circuit $\widehat{C}[\widehat{K}]$ shall implement the same functionality as C running with key K, but additionally is resilient to certain well-defined classes of leakage. Notice that while the framework of [17] talks about circuits, the same approach applies to software implementations, and we only follow this notation to abstract our description.

Moreover, our work uses the well-established simulation paradigm to state the security guarantees we achieve. Intuitively, simulation-based security says that whatever attack an adversary can carry out when knowing the leakage, he can also run (with similar success probability) by just having black-box access to C. In contrast to the approach based on Shannon information theory, our analysis includes attacks that exploit joint information from the leakage and plaintext/ciphertext pairs. It seems impossible to us to incorporate the plaintext/ciphertext pairs into an analysis based on Shannon information theory. To see this, consider a block-cipher execution, where, clearly, when given a couple of plaintext/ciphertext pairs, the secret key is information-theoretically revealed. (More concretely: imagine an adversary that attacks a block-cipher implementation $E_K$, where K is the secret key. Then just by launching a known-plaintext attack he can obtain several pairs $V = (M_0, E_K(M_0)), (M_1, E_K(M_1)), \ldots$. Clearly a small number of such pairs is usually enough to determine K information-theoretically. Hence it makes no sense to require that “K is information-theoretically hidden given V and the side-channel leakage.”) The authors of [28] are well aware of this problem and explicitly exclude such joint information.

A consequence of the simulation-based security analysis is that we require an additional mild assumption on the noise, namely, that it is efficiently computable (see Sect. 3.1 for more details). While this is a standard assumption made in most works on leakage-resilient cryptography, we emphasize that we can easily drop the assumption of efficiently computable noise (and hence consider the same noise model as [28]) when we only want to achieve the weaker security notion considered in [28]. Notice that in this case we are still able to eliminate the limitations 1 and 2 mentioned above.

1.3. Related Work

Masking & Leakage Resilient Circuits. A large body of work has proposed various masking schemes and studied their security in different security models (see, e.g., [4,16,27,31,35]). The already mentioned t-threshold-probing model has been considered in the work of Rivain and Prouff [31], who show how to extend the work of Ishai et al. to larger fields and propose efficiency improvements. In [29] it was shown that techniques from multiparty computation can be used to show security in the t-threshold-probing model. The work of Standaert et al. [35] studies masking schemes using the information-theoretic framework of [33] by considering the Hamming weight model. Many other works analyze the security of the masking countermeasure, and we refer the reader to [28] for further details.

With the emergence of leakage-resilient cryptography [2,12,24] several works have proposed new security models and alternative masking schemes. The main difference between these new security models and the t-threshold-probing model is that they consider joint leakages from large parts of the computation. The work of Faust et al. [13] extends the security analysis of Ishai et al. beyond the t-threshold-probing model by considering leakages that can be described by low-depth circuits (so-called $\mathrm{AC}^0$ leakages). Faust et al.
use leak-free components that have been eliminated by Rothblum in [32] using computational assumptions. The recent work of Miles and Viola [25] proposes a new circuit transformation using alternating groups and shows security with respect to $\mathrm{AC}^0$ and $\mathrm{TC}^0$ leakages.

Another line of work considers circuits that are provably secure in the so-called continuous bounded leakage model [10,14,15,18]. In this model, the adversary is allowed to learn arbitrary information from the computation of the circuit as long as the amount of information is bounded. The proposed schemes rely additionally on the assumption of “only computation leaks information” of Micali and Reyzin [24].

Noisy Leakage Models. The work of Faust et al. [13] also considers circuit compilers for noisy models. Specifically, they propose a construction with security in the binomial noise model, where each value on a wire is flipped independently with probability p ∈ (0, 1/2). In contrast to the work of [28] and our work, the noise model is restricted to binomial noise, but the noise rate is significantly better (constant instead of linear noise). Similar to [28], the work of Faust et al. also uses leak-free components. Besides these works on masking schemes, several works consider noisy leakages for concrete cryptographic schemes [12,19,26]. Typically, the noise model considered in these works is significantly stronger than the noise model that is considered for masking schemes. In particular, no strong assumption about the independence of the noise is made.

2. Preliminaries

We start with some standard definitions and lemmas about the statistical distance. If $\mathcal{A}$ is a set, then $U \leftarrow \mathcal{A}$ denotes a random variable sampled uniformly from $\mathcal{A}$.
Recall that if A and B are random variables over the same set $\mathcal{A}$, then the statistical distance between A and B is denoted by Δ(A; B) and defined as

$$\Delta(A; B) = \frac{1}{2} \sum_{a \in \mathcal{A}} |P(A = a) - P(B = a)| = \sum_{a \in \mathcal{A}} \max\{0, P(A = a) - P(B = a)\}.$$

If $\mathcal{X}, \mathcal{Y}$ are some events, then by $\Delta((A|\mathcal{X}); (B|\mathcal{Y}))$ we mean the distance between variables $A'$ and $B'$ distributed according to the conditional distributions $P_{A|\mathcal{X}}$ and $P_{B|\mathcal{Y}}$. If $\mathcal{X}$ is an event of probability 1, then we also write $\Delta(A; (B|\mathcal{Y}))$ instead of $\Delta((A|\mathcal{X}); (B|\mathcal{Y}))$. If C is a random variable, then by Δ(A; (B|C)) we mean $\sum_{c} P(C = c) \cdot \Delta(A; (B|(C = c)))$. If A, B, and C are random variables, then Δ((B; C) | A) denotes Δ((B, A); (C, A)). It is easy to see that it is equal to $\sum_{a} P(A = a) \cdot \Delta((B|A = a); (C|A = a))$. If Δ(A; B) ≤ ε, then we say that A and B are ε-close. The “$\stackrel{d}{=}$” symbol denotes the equality of distributions, i.e., $A \stackrel{d}{=} B$ if and only if Δ(A; B) = 0.

2.1. Basic Probability-Theoretic Facts

Here, we state some basic lemmas that will be used later in some proofs.

Lemma 1. Let A, B be two (possibly correlated) random variables. Let $B'$ be a variable distributed identically to B but independent from A. We have

$$\Delta(A; (A|B)) = \Delta((B; B') \mid A). \tag{1}$$

Proof. We have

$$\begin{aligned}
\Delta(A; (A|B)) &= \sum_{b} P(B = b) \cdot \frac{1}{2} \sum_{a} |P(A = a) - P(A = a \mid B = b)| \\
&= \frac{1}{2} \sum_{a,b} |P(B = b) \cdot P(A = a) - P(B = b) \cdot P(A = a \mid B = b)| \\
&= \frac{1}{2} \sum_{a,b} |P(B' = b \wedge A = a) - P(B = b \wedge A = a)| \qquad (2) \\
&= \Delta((B; B') \mid A),
\end{aligned}$$

where in (2) we used the fact that $B'$ is a variable distributed identically to B and it is independent from A.

Lemma 2. For any random variables A and B and an event $\mathcal{E}$ we have

$$\Delta((A \mid \neg\mathcal{E}); B) \le \Delta(A; B) + P(\mathcal{E}),$$

where $\neg\mathcal{E}$ denotes the negation of $\mathcal{E}$.

Proof. We have

$$\begin{aligned}
\Delta((A \mid \neg\mathcal{E}); B) &= \frac{1}{2} \sum_{a} |P(A = a \mid \neg\mathcal{E}) - P(B = a)| \\
&\le \underbrace{\frac{1}{2} \sum_{a} |P(A = a) - P(A = a \mid \neg\mathcal{E})|}_{=: (*)} + \underbrace{\frac{1}{2} \sum_{a} |P(A = a) - P(B = a)|}_{= \Delta(A; B)}, \qquad (3)
\end{aligned}$$

where (3) comes from the triangle inequality.
Now, $(*)$ is equal to

$$\sum_{a \in \mathcal{A}'} P(A = a) - P(A = a \mid \neg\mathcal{E}), \tag{4}$$

where $\mathcal{A}'$ is the set of all a's such that $P(A = a) \ge P(A = a \mid \neg\mathcal{E})$. Clearly we have that $P(A = a \mid \neg\mathcal{E}) \ge P(A = a \wedge \neg\mathcal{E})$, and hence (4) is at most $\sum_{a \in \mathcal{A}'} P(A = a) - P(A = a \wedge \neg\mathcal{E})$. We therefore have

$$\sum_{a \in \mathcal{A}'} P(A = a) - P(A = a \wedge \neg\mathcal{E}) = P(A \in \mathcal{A}') - P(A \in \mathcal{A}' \wedge \neg\mathcal{E}) \le P(\mathcal{E}).$$

Thus, altogether, the right-hand side of (3) is at most $P(\mathcal{E}) + \Delta(A; B)$. This finishes the proof.

We will also need the following standard fact.

Lemma 3. (Chernoff bound, see, e.g., [9], Theorem 1.1) Let $Z = \sum_{i=1}^{n} Z_i$, where the $Z_i$'s are random variables independently distributed over [0, 1]. Then for every ξ ∈ [0, 1] we have

$$P\big(Z \ge (1 + \xi)\,E(Z)\big) \le \exp\Big(-\frac{\xi^2}{3}\, E(Z)\Big).$$

3. Noise from Set Elements

We start by describing the basic framework for reasoning about the noise from elements of a finite set $\mathcal{X}$. Later, in Sect. 4, we will consider the leakage from vectors over $\mathcal{X}$, and then, in Sect. 5, from the entire computation. The reason why we can smoothly use the analysis from Sect. 3.1 in the later sections is that, as in the work of Prouff and Rivain, we require that the noise is independent for all elementary operations. By elementary operations, [28] means the basic operations over the underlying field $\mathcal{X}$ used in a masked implementation. In this work, we consider the same setting and type of underlying operations (in fact, notice that our construction is identical to theirs, except that we eliminate the leak-free gates and prove a stronger statement). Notice that instead of talking about elementary operations, we use the more standard term “gates” that was used in the work of Ishai et al. [17].

3.1. Modeling Noise

Let us start with a discussion defining what it means that a randomized function $\mathrm{Noise} : \mathcal{X} \to \mathcal{Y}$ is “noisy”.
We will assume that $\mathcal{X}$ is finite and rather small: typical choices for $\mathcal{X}$ would be $GF(2)$ (the "Boolean case"), or $GF(2^8)$, if we want to deal with the AES circuit. The set $\mathcal{Y}$ corresponds to the set of all possible noise measurements and may be infinite, except when we require the "efficient simulation" (we discuss it further at the end of this section). As already informally described in Sect. 1.1, our basic definition is as follows: we say that the function $\mathrm{Noise}$ is $\delta$-noisy if
\[
\delta = \Delta(X;(X\,|\,\mathrm{Noise}(X))). \tag{5}
\]
Of course, for (5) to be well-defined we need to specify the distribution of $X$. The idea to define noisy functions by comparing the distributions of $X$ and "$X$ conditioned on $\mathrm{Noise}(X)$" comes from [28], where it is argued that the most natural choice for $X$ is a random variable distributed uniformly over $\mathcal{X}$. We also adopt this convention and assume that $X \leftarrow \mathcal{X}$. We would like to stress, however, that in our proofs we will apply $\mathrm{Noise}$ to inputs $\hat{X}$ that are not necessarily uniform, and in this case the value of $\Delta(\hat{X};(\hat{X}\,|\,\mathrm{Noise}(\hat{X})))$ may obviously be some non-trivial function of $\delta$. Of course, if $X \leftarrow \mathcal{X}$ and $X' \leftarrow \mathcal{X}$, then $\mathrm{Noise}(X')$ is distributed identically to $\mathrm{Noise}(X)$, and hence, by Lemma 1, Eq. (5) is equivalent to:
\[
\delta = \Delta((\mathrm{Noise}(X);\mathrm{Noise}(X'))\,|\,X), \tag{6}
\]
where $X$ and $X'$ are uniform over $\mathcal{X}$. Note that at first this definition may be a bit counter-intuitive, as smaller $\delta$ means more noise: in particular we achieve "full noise" if $\delta = 0$, and "no noise" if $\delta \approx 1$.

Let us compare this definition with the definition of [28]. In a nutshell: the definition of [28] is similar to ours, the only difference being that instead of the statistical distance $\Delta$ the authors of [28] use a distance based on the Euclidean norm. More precisely, they start with defining $d$ as
\[
d(X;Y) := \sqrt{\sum_{x\in\mathcal{X}} \bigl(\mathbb{P}(X=x) - \mathbb{P}(Y=x)\bigr)^2},
\]
and using this notion they define $\beta$ as
\[
\beta(X\,|\,\mathrm{Noise}(X)) := \sum_{y\in\mathcal{Y}} \mathbb{P}(\mathrm{Noise}(X)=y)\cdot d(X;(X\,|\,\mathrm{Noise}(X)=y))
\]
(where $X$ is uniform).
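As a sanity check on Eqs. (5) and (6), the following minimal sketch computes $\delta$ for a noise function over a small finite set, using exact rational arithmetic. The representation of Noise as a dictionary `channel[x][y]` holding $\mathbb{P}(\mathrm{Noise}(x)=y)$, the function names, and the Boolean bit-flip example are illustrative assumptions and do not come from [28] or from the definitions above.

```python
from fractions import Fraction

def statistical_distance(p, q):
    """Delta(A; B) = 1/2 * sum_a |P(A = a) - P(B = a)| for dicts value -> probability."""
    support = set(p) | set(q)
    return sum(abs(p.get(a, 0) - q.get(a, 0)) for a in support) / 2

def delta_eq5(channel):
    """delta = Delta(X ; (X | Noise(X))) for X uniform, following Eq. (5)."""
    xs = list(channel)
    prior = {x: Fraction(1, len(xs)) for x in xs}
    ys = set().union(*channel.values())
    delta = Fraction(0)
    for y in ys:
        p_y = sum(prior[x] * channel[x].get(y, 0) for x in xs)  # P(Noise(X) = y)
        if p_y == 0:
            continue
        posterior = {x: prior[x] * channel[x].get(y, 0) / p_y for x in xs}
        delta += p_y * statistical_distance(prior, posterior)
    return delta

def delta_eq6(channel):
    """delta = Delta((Noise(X); Noise(X')) | X) for uniform X, X', following Eq. (6)."""
    xs = list(channel)
    ys = set().union(*channel.values())
    # Distribution of Noise(X') for a fresh uniform X'.
    marginal = {y: sum(channel[x].get(y, 0) for x in xs) / Fraction(len(xs)) for y in ys}
    return sum(statistical_distance(channel[x], marginal) for x in xs) / len(xs)

def flip(p):
    """Hypothetical Boolean noise: outputs x with probability 1 - p and 1 - x with probability p."""
    return {x: {x: 1 - p, 1 - x: p} for x in (0, 1)}

quarter = flip(Fraction(1, 4))
print(delta_eq5(quarter), delta_eq6(quarter))  # 1/4 1/4
print(delta_eq5(flip(Fraction(1, 2))))         # 0   ("full noise")
print(delta_eq5(flip(Fraction(0))))            # 1/2 (noiseless leakage of one bit)
```

In this toy example the two formulas agree, as Lemma 1 predicts, and a noiseless function over a two-element set gives $\delta = 1/2$, which illustrates why "no noise" corresponds to $\delta \approx 1$ only when $\mathcal{X}$ is large.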
In the terminology of [28] a function $\mathrm{Noise}$ is "$\delta$-noisy" if $\delta = \beta(X\,|\,\mathrm{Noise}(X))$. Observe that the right-hand side of our noise definition in Eq. (5) can be rewritten as
\[
\sum_{y\in\mathcal{Y}} \mathbb{P}(\mathrm{Noise}(X)=y)\cdot\Delta(X;(X\,|\,\mathrm{Noise}(X)=y)),
\]
hence the only difference between their approach and ours is that we use $\Delta$ where they use the distance $d$. The authors do not explain why they choose this particular measure. We believe that our choice to use the standard definition of statistical distance $\Delta$ is more natural in this setting, since, unlike the "$d$" distance, it has been used in hundreds of cryptographic papers in the past. The popularity of the $\Delta$ distance comes from the fact that it corresponds to an intuitive concept of the "indistinguishability of distributions": it is well-known, and simple to verify, that $\Delta(X;Y) \le \delta$ if and only if no adversary can distinguish between $X$ and $Y$ with advantage better than $\delta$. Hence, e.g., (6) can be interpreted as: $\delta$ is the maximum probability, over all adversaries $\mathcal{A}$, that $\mathcal{A}$ distinguishes between the noise from a uniform $X$ that is known to him, and a uniform $X'$ that is unknown to him. It is unclear to us if the $d$ distance has a similar interpretation.

We emphasize, however, that the choice whether to use $\Delta$ or $\beta$ is not too important, as the following inequalities between these measures hold for every $X$ and $Y$ distributed over $\mathcal{X}$ (cf. [28]):
\[
\frac{1}{2}\cdot d(X;Y) \le \Delta(X;Y) \le \frac{\sqrt{|\mathcal{X}|}}{2}\cdot d(X;Y),
\]
and consequently
\[
\frac{1}{2}\cdot\beta(X\,|\,\mathrm{Noise}(X)) \le \Delta(X;(X\,|\,\mathrm{Noise}(X))) \le \frac{\sqrt{|\mathcal{X}|}}{2}\cdot\beta(X\,|\,\mathrm{Noise}(X)). \tag{7}
\]
Hence, we decide to stick to the "$\Delta$ distance" in this paper. However, to allow for comparison between our work and the one of [28], we will at the end of the paper present our results also in terms of the $\beta$ measure. (This translation will be straightforward, thanks to the inequalities in (7).)

In [28] (cf. Theorem 4) the result is stated in form of Shannon information theory.
While such an information-theoretic approach may be useful in certain settings [33], we follow the more "traditional" approach and provide an efficient simulation argument. As discussed in the introduction, this also covers a setting where the adversary exploits joint information of the leakage and, e.g., the plaintext/ciphertext pairs. We emphasize, however, that our results can easily be expressed in the information-theoretic language, thanks to the following bound:
\[
I(A;B) \le (2N/\ln 2)\cdot\Delta(A;(A|B)),
\]
where $A$ is uniformly distributed over a set of cardinality $N$. This result comes from Proposition 1 in [28] combined with the inequality $\beta(A|B) \le 2\Delta(A;(A|B))$, cf. (7).

(Footnote: the statement that no adversary can distinguish between $X$ and $Y$ with advantage better than $\delta$ formally means that for every $\mathcal{A}$ we have $|\mathbb{P}(\mathcal{A}(X)=1) - \mathbb{P}(\mathcal{A}(Y)=1)| \le \delta$.)

3.1.1. The Issue of "Efficient Simulation"

To achieve the strong simulation-based security notion, we need an additional requirement on the leakage, namely, that the leakage can efficiently be "simulated", which typically requires that the noise function is efficiently computable. In fact, for our proofs to go through we actually need something slightly stronger, namely that $\mathrm{Noise}$ is efficiently decidable, by which we mean that (a) there exists a randomized poly-time algorithm that computes it, and (b) the set $\mathcal{Y}$ is finite and for every $x$ and $y$ the value of $\mathbb{P}(\mathrm{Noise}(x)=y)$ is computable in polynomial time. While (b) may look like a strong assumption, we note that in practice for most "natural" noise functions (like Gaussian noise with a known parameter, measured with a very good, but finite, precision) it is easily satisfiable.

Recall that the results of [28] are stated without taking into consideration the issue of the "efficient simulation". Hence, if one wants to compare our results with [28], then one can simply drop the efficient decidability assumption on the noise.
To keep our presentation concise and clean, also in this case the results will be presented in the form "for every adversary $\mathcal{A}$ there exists an (inefficient) simulator $\mathcal{S}$". Here the "inefficient simulator" can be an arbitrary machine, capable, e.g., of sampling elements from any probability distribution.

3.2. Simulating Noise by $\epsilon$-Identity Functions

Lemma 4 below is our main technical tool. Informally, it states that every $\delta$-noisy function $\mathrm{Noise} : \mathcal{X} \to \mathcal{Y}$ can be represented as a composition $\mathrm{Noise}' \circ \varphi$ of efficiently computable randomized functions $\mathrm{Noise}'$ and $\varphi$, where $\varphi$ is a "$\delta\cdot|\mathcal{X}|$-identity function", defined in Definition 1 below.

Definition 1. A randomized function $\varphi : \mathcal{X} \to \mathcal{X}\cup\{\bot\}$ is an $\epsilon$-identity if for every $x$ we have that either $\varphi(x) = x$ or $\varphi(x) = \bot$, and $\mathbb{P}(\varphi(x) \ne \bot) = \epsilon$.

This will allow us to reduce the "noisy attacks" to the "random-probing attacks", where the adversary learns each wire (or a gate, see Sect. 5.5) of the circuit with probability $\epsilon$. Observe also that, thanks to the assumed independence of noise, the events that the adversary learns each element are independent, which, in turn, will allow us to use the Chernoff bound to prove that with a good probability the number of wires that the adversary learns is small.

Lemma 4. Let $\mathrm{Noise} : \mathcal{X} \to \mathcal{Y}$ be a $\delta$-noisy function. Then there exist $\epsilon \le \delta\cdot|\mathcal{X}|$ and a randomized function $\mathrm{Noise}' : \mathcal{X}\cup\{\bot\} \to \mathcal{Y}$ such that for every $x\in\mathcal{X}$ we have
\[
\mathrm{Noise}(x) \stackrel{d}{=} \mathrm{Noise}'(\varphi(x)), \tag{8}
\]
where $\varphi : \mathcal{X} \to \mathcal{X}\cup\{\bot\}$ is the $\epsilon$-identity function. Moreover, if $\mathrm{Noise}$ is efficiently decidable then $\mathrm{Noise}'(\varphi(x))$ is computable in time that is expected polynomial in $|\mathcal{X}|$.

Before we proceed to the proof, let us remark that the "$|\mathcal{X}|$ factor" loss in this lemma (when going from $\delta$ to $\epsilon$) is in general unavoidable. More concretely, in the subsequent work [11] (Sect. 5) it is shown that there exist $\delta$-noisy functions that can be reduced (in the sense of Lemma 4) to an $\epsilon$-identity function only if $\epsilon$ is (at least) approximately equal to $\delta\cdot|\mathcal{X}|/2$.

Proof of Lemma 4.
We consider only the case when $\mathrm{Noise}$ is efficiently decidable, and hence the $\mathrm{Noise}'$ function that we construct will be efficiently computable. The case when $\mathrm{Noise}$ is not efficiently decidable is handled in an analogous way (the proof is actually simpler, as the only difference is that we do not need to argue about the efficiency of the sampling algorithms).

Very informally speaking, our proof is based on an extension of the standard observation that for any two random variables $A$ and $B$ one can find two events $\mathcal{A}$ and $\mathcal{B}$ such that the distributions $P_{A|\mathcal{A}}$ and $P_{B|\mathcal{B}}$ are equal and $\mathbb{P}(\mathcal{A}) = \mathbb{P}(\mathcal{B}) = 1 - \Delta(A;B)$ (see, e.g., [23, Section 1.3]). Let $X$ and $X'$ be uniform over $\mathcal{X}$. For every $y\in\mathcal{Y}$ define
\[
\pi(y) = \min_{x\in\mathcal{X}} \mathbb{P}(\mathrm{Noise}(x)=y).
\]
Clearly $\pi$ is computable in time polynomial in $|\mathcal{X}|$. Obviously $\pi$ is usually not a probability distribution since it does not sum up to 1. The good news is that it sums up "almost" to 1 provided $\delta$ is sufficiently small. This is shown below. Let $\epsilon := 1 - \sum_{y\in\mathcal{Y}} \pi(y)$. We now have
\begin{align*}
\epsilon &= \underbrace{\sum_{y\in\mathcal{Y}} \mathbb{P}(\mathrm{Noise}(X')=y)}_{=1} - \sum_{y\in\mathcal{Y}} \pi(y) \\
&= \sum_{y\in\mathcal{Y}} \Bigl(\mathbb{P}(\mathrm{Noise}(X')=y) - \min_{x\in\mathcal{X}} \mathbb{P}(\mathrm{Noise}(x)=y)\Bigr) \\
&= \sum_{y\in\mathcal{Y}} \max_{x\in\mathcal{X}} \bigl(\mathbb{P}(\mathrm{Noise}(X')=y) - \mathbb{P}(\mathrm{Noise}(x)=y)\bigr) \\
&\le \sum_{y\in\mathcal{Y}} \sum_{x\in\mathcal{X}} \max\bigl(0,\ \mathbb{P}(\mathrm{Noise}(X')=y) - \mathbb{P}(\mathrm{Noise}(x)=y)\bigr) \tag{9} \\
&= \sum_{x\in\mathcal{X}} \Delta(\mathrm{Noise}(x);\mathrm{Noise}(X')) \\
&= |\mathcal{X}|\cdot\Delta((\mathrm{Noise}(X);\mathrm{Noise}(X'))\,|\,X) \\
&= \delta\cdot|\mathcal{X}|, \tag{10}
\end{align*}
where (9) comes from the fact that the maximum of positive values cannot be larger than their sum, and (10) follows from the assumption that the $\mathrm{Noise}$ function is $\delta$-noisy. (Footnote to (9): more precisely, for every $\{Z_x\}_{x\in\mathcal{X}}$ with at least one non-negative element we have $\max_{x}(Z_x) \le \sum_{x : Z_x \ge 0} Z_x = \sum_{x} \max(0, Z_x)$, where in our case $Z_x := \mathbb{P}(\mathrm{Noise}(X')=y) - \mathbb{P}(\mathrm{Noise}(x)=y)$.)

Our construction of the $\mathrm{Noise}'$ function is based on the standard technique of rejection sampling. Let $\mathrm{Noise}'(x)$ be a distribution defined as follows: for every $y\in\mathcal{Y}$ and every $x \ne \bot$ let
\[
\mathbb{P}(\mathrm{Noise}'(x)=y) = \bigl(\mathbb{P}(\mathrm{Noise}(x)=y) - \pi(y)\bigr)/\epsilon, \tag{11}
\]
and otherwise:
\[
\mathbb{P}(\mathrm{Noise}'(\bot)=y) = \pi(y)/(1-\epsilon).
\]
(12)

We will later show how to sample $\mathrm{Noise}'$ efficiently. Obviously this will automatically imply that (11) and (12) define probability distributions over $\mathcal{Y}$ (which may not be obvious at first sight). First, however, let us show (8). To this end take any $x\in\mathcal{X}$ and $y\in\mathcal{Y}$ and observe that
\begin{align*}
\mathbb{P}(\mathrm{Noise}'(\varphi(x))=y) &= \mathbb{P}(\varphi(x)=x)\cdot\mathbb{P}(\mathrm{Noise}'(x)=y) + \mathbb{P}(\varphi(x)=\bot)\cdot\mathbb{P}(\mathrm{Noise}'(\bot)=y) \\
&= \epsilon\cdot\bigl(\mathbb{P}(\mathrm{Noise}(x)=y) - \pi(y)\bigr)/\epsilon + (1-\epsilon)\cdot\pi(y)/(1-\epsilon) \\
&= \mathbb{P}(\mathrm{Noise}(x)=y) - \pi(y) + \pi(y) \\
&= \mathbb{P}(\mathrm{Noise}(x)=y),
\end{align*}
which implies (8). What remains is to show how to sample $\mathrm{Noise}'$ efficiently. Let us first show an efficient algorithm $\mathrm{Alg}_1(x)$ for computing $\mathrm{Noise}'(x)$ for $x \ne \bot$:

$\mathrm{Alg}_1(x)$:
1. Sample $y$ from $\mathrm{Noise}(x)$.
2. With probability $\pi(y)/\mathbb{P}(\mathrm{Noise}(x)=y)$ resample $y$, i.e.: go back to Step 1.
3. Output $y$.

We now argue that $\mathrm{Alg}_1(x)$ indeed computes $\mathrm{Noise}'(x)$ efficiently. Let $R_1\in\{1,2,\ldots\}$ be a random variable denoting the number of times the algorithm $\mathrm{Alg}_1(x)$ performed Step 1. First observe that the probability of jumping back to Step 1 in Step 2 is equal to
\begin{align*}
\sum_{y} \mathbb{P}(\mathrm{Noise}(x)=y)\cdot\pi(y)/\mathbb{P}(\mathrm{Noise}(x)=y) &= \sum_{y} \pi(y) \tag{13} \\
&= 1-\epsilon. \tag{14}
\end{align*}
Therefore the probability of not jumping back to Step 1 in Step 2 is $\epsilon$, and hence the expected number $\mathbb{E}(R_1)$ of the executions of Step 1 in $\mathrm{Alg}_1(x)$ is equal to $\sum_{i=1}^{\infty} i\cdot(1-\epsilon)^{i-1}\cdot\epsilon = 1/\epsilon$. Moreover, for every $i = 1, 2, \ldots$ we have:
\begin{align*}
\mathbb{P}(\mathrm{Alg}_1(x)=y \wedge R_1=i \,|\, R_1\ge i) &= \mathbb{P}(\mathrm{Noise}(x)=y)\cdot\bigl(1 - \pi(y)/\mathbb{P}(\mathrm{Noise}(x)=y)\bigr) \\
&= \mathbb{P}(\mathrm{Noise}(x)=y) - \pi(y).
\end{align*}
Hence
\begin{align*}
\mathbb{P}(\mathrm{Alg}_1(x)=y) &= \sum_{i=1}^{\infty} \mathbb{P}(\mathrm{Alg}_1(x)=y \wedge R_1=i) \\
&= \sum_{i=1}^{\infty} \mathbb{P}(\mathrm{Alg}_1(x)=y \wedge R_1=i \,|\, R_1\ge i)\cdot\mathbb{P}(R_1\ge i) \\
&= \bigl(\mathbb{P}(\mathrm{Noise}(x)=y) - \pi(y)\bigr)\cdot\sum_{i=1}^{\infty} \mathbb{P}(R_1\ge i) \\
&= \bigl(\mathbb{P}(\mathrm{Noise}(x)=y) - \pi(y)\bigr)\cdot\mathbb{E}(R_1) \\
&= \bigl(\mathbb{P}(\mathrm{Noise}(x)=y) - \pi(y)\bigr)/\epsilon,
\end{align*}
as required in (11). We now present an efficient algorithm $\mathrm{Alg}_2$ for computing $\mathrm{Noise}'(\bot)$. Fix an arbitrary element $x_0\in\mathcal{X}$, and execute the following.

$\mathrm{Alg}_2$:
1. Sample $y$ from $\mathrm{Noise}(x_0)$.
2. With probability $1 - \bigl(\pi(y)/\mathbb{P}(\mathrm{Noise}(x_0)=y)\bigr)$ resample $y$, i.e.: go back to Step 1.
3. Output $y$.

By a similar argument as in the case of $\mathrm{Alg}_1$ we obtain that the number $R_2$ of times the algorithm $\mathrm{Alg}_2$ performs Step 1 has expectation $\mathbb{E}(R_2) = 1/(1-\epsilon)$. Moreover, for every $i = 1, 2, \ldots$ we have
\[
\mathbb{P}(\mathrm{Alg}_2=y \wedge R_2=i \,|\, R_2\ge i) = \pi(y),
\]
which, in turn, implies that $\mathbb{P}(\mathrm{Alg}_2=y) = \pi(y)/(1-\epsilon)$, and hence the output of $\mathrm{Alg}_2$ satisfies (12). Clearly, the expected running time of both algorithms is polynomial in $|\mathcal{X}|$ and $\mathbb{E}(R)$, where $R$ is the number of executions of Step 1 in $\mathrm{Alg}_1$ or $\mathrm{Alg}_2$. We obviously have
\begin{align*}
\mathbb{E}(R) &= \mathbb{E}(R_1 \,|\, \varphi(x)\ne\bot)\cdot\mathbb{P}(\varphi(x)\ne\bot) + \mathbb{E}(R_2 \,|\, \varphi(x)=\bot)\cdot\mathbb{P}(\varphi(x)=\bot) \\
&= (1/\epsilon)\cdot\epsilon + (1/(1-\epsilon))\cdot(1-\epsilon) = 2.
\end{align*}
Hence, the expected running time of $\mathrm{Noise}'(\varphi(x))$ is polynomial in $|\mathcal{X}|$.

4. Leakage from Vectors

In this section we describe the leakage models relevant to this paper. We start with describing the models abstractly, by considering leakage from an arbitrary sequence $(x_1,\ldots,x_\ell)\in\mathcal{X}^\ell$, where $\mathcal{X}$ is some finite set and $\ell$ is a parameter. The adversary $\mathcal{A}$ will be able to obtain some partial information about $(x_1,\ldots,x_\ell)$ via the games described below. Note that we do not specify the computational power of $\mathcal{A}$, as the definitions below make sense for both computationally bounded and infinitely powerful $\mathcal{A}$.

Noisy Model. For $\delta \ge 0$ a $\delta$-noisy adversary on $\mathcal{X}^\ell$ is a machine $\mathcal{A}$ that plays the following game against an oracle that knows $(x_1,\ldots,x_\ell)\in\mathcal{X}^\ell$:
1. $\mathcal{A}$ specifies a sequence $\{\mathrm{Noise}_i : \mathcal{X}\to\mathcal{Y}\}_{i=1}^{\ell}$ of noisy functions such that every $\mathrm{Noise}_i$ is $\delta_i$-noisy, for some $\delta_i \le \delta$, and the noises are mutually independent.
2. $\mathcal{A}$ receives $\mathrm{Noise}_1(x_1),\ldots,\mathrm{Noise}_\ell(x_\ell)$ and outputs some value $\mathrm{out}_{\mathcal{A}}(x_1,\ldots,x_\ell)$.

If $\mathcal{A}$ works in polynomial time and the noise functions specified by $\mathcal{A}$ are efficiently decidable, then we say that $\mathcal{A}$ is poly-time-noisy.

Random-Probing Model.
For $\epsilon \ge 0$ an $\epsilon$-random-probing adversary on $\mathcal{X}^\ell$ is a machine $\mathcal{A}$ that plays the following game against an oracle that knows $(x_1,\ldots,x_\ell)\in\mathcal{X}^\ell$:
1. $\mathcal{A}$ specifies a sequence $(\epsilon_1,\ldots,\epsilon_\ell)$ such that each $\epsilon_i \le \epsilon$.
2. $\mathcal{A}$ receives $\varphi_1(x_1),\ldots,\varphi_\ell(x_\ell)$ and outputs some value $\mathrm{out}_{\mathcal{A}}(x_1,\ldots,x_\ell)$, where each $\varphi_i$ is the $\epsilon_i$-identity function with mutually independent randomness.

A similar model was introduced in the work of Ishai, Sahai and Wagner [17] to obtain a circuit compiler that blows up the size of the circuit linearly in the security parameter $d$. Also, the work of Ajtai [1] considers the random-probing model and constructs a compiler that for sufficiently large security parameter $d$ achieves security in the random-probing model for a small (but constant) probability $\epsilon$. [1], however, does not give concrete parameters for $\epsilon$ and $d$, and the compiler of [1] results in a huge circuit size blow-up (O(d ) with large hidden constants).

Threshold-Probing Model. For $t = 0,\ldots,\ell$ a $t$-threshold-probing adversary on $\mathcal{X}^\ell$ is a machine $\mathcal{A}$ that plays the following game against an oracle that knows $(x_1,\ldots,x_\ell)\in\mathcal{X}^\ell$:
1. $\mathcal{A}$ specifies a set $I = \{i_1,\ldots,i_{|I|}\} \subseteq \{1,\ldots,\ell\}$ of cardinality at most $t$.
2. $\mathcal{A}$ receives $(x_{i_1},\ldots,x_{i_{|I|}})$ and outputs some value $\mathrm{out}_{\mathcal{A}}(x_1,\ldots,x_\ell)$.

4.1. Simulating the Noisy Adversary by a Random-Probing Adversary

The following lemma shows that every $\delta$-noisy adversary can be simulated by a $\delta\cdot|\mathcal{X}|$-random-probing adversary.

Lemma 5. Let $\mathcal{A}$ be a $\delta$-noisy adversary on $\mathcal{X}^\ell$. Then there exists a $\delta\cdot|\mathcal{X}|$-random-probing adversary $\mathcal{S}$ on $\mathcal{X}^\ell$ such that for every $(x_1,\ldots,x_\ell)$ we have
\[
\mathrm{out}_{\mathcal{A}}(x_1,\ldots,x_\ell) \stackrel{d}{=} \mathrm{out}_{\mathcal{S}}(x_1,\ldots,x_\ell). \tag{15}
\]
Moreover, if $\mathcal{A}$ is poly-time-noisy, then $\mathcal{S}$ works in time polynomial in $|\mathcal{X}|$.

Proof. Without loss of generality assume that $\mathcal{A}$ simply outputs all the information that he gets. Thus (15) can be rewritten as:
\[
(\mathrm{Noise}_1(x_1),\ldots,\mathrm{Noise}_\ell(x_\ell)) \stackrel{d}{=} \mathrm{out}_{\mathcal{S}}(x_1,\ldots,x_\ell), \tag{16}
\]
where the $\mathrm{Noise}_i$'s are the $\delta_i$-noisy functions chosen by $\mathcal{A}$.
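The proof applies Lemma 4 to each coordinate separately. As a reminder of what that lemma provides, the following minimal sketch builds $\pi$, $\epsilon$ and $\mathrm{Noise}'$ from Eqs. (11) and (12) for a toy noise function and checks Eq. (8) exactly. It computes the distributions directly instead of running the rejection samplers $\mathrm{Alg}_1$ and $\mathrm{Alg}_2$; the dictionary representation `channel[x][y]`, the function names, and the three-element example are assumptions made for illustration only.

```python
from fractions import Fraction

BOT = None  # stands for the symbol "bottom" output by the eps-identity function

def decompose(channel):
    """Given channel[x][y] = P(Noise(x) = y), return (eps, noise_prime) following
    Eqs. (11) and (12)."""
    xs = list(channel)
    ys = sorted(set().union(*channel.values()))
    pi = {y: min(channel[x].get(y, Fraction(0)) for x in xs) for y in ys}
    eps = 1 - sum(pi.values())
    noise_prime = {}
    if eps > 0:  # Eq. (11): only needed when phi can output x itself
        for x in xs:
            noise_prime[x] = {y: (channel[x].get(y, 0) - pi[y]) / eps for y in ys}
    if eps < 1:  # Eq. (12): only needed when phi can output bottom
        noise_prime[BOT] = {y: pi[y] / (1 - eps) for y in ys}
    return eps, noise_prime

def recompose(eps, noise_prime, x):
    """Exact distribution of Noise'(phi(x)), where phi(x) = x with probability eps
    and phi(x) = bottom otherwise."""
    ys = next(iter(noise_prime.values())).keys()
    seen = noise_prime.get(x, {})
    hidden = noise_prime.get(BOT, {})
    return {y: eps * seen.get(y, 0) + (1 - eps) * hidden.get(y, 0) for y in ys}

# Hypothetical noise on {0, 1, 2}: output x with probability 2/3, each other value with 1/6.
channel = {x: {y: Fraction(2, 3) if y == x else Fraction(1, 6) for y in range(3)}
           for x in range(3)}

eps, noise_prime = decompose(channel)
print(eps)  # 1/2
print(all(recompose(eps, noise_prime, x) == channel[x] for x in channel))  # True
```

For this example $\delta = 1/3$, so the value $\epsilon = 1/2$ is consistent with the bound $\epsilon \le \delta\cdot|\mathcal{X}| = 1$ of Lemma 4.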
By Lemma 4, for each $i$ there exist $\epsilon_i \le \delta_i\cdot|\mathcal{X}| \le \delta\cdot|\mathcal{X}|$ and a randomized function $\mathrm{Noise}'_i : \mathcal{X}\cup\{\bot\} \to \mathcal{Y}$, such that for every $x\in\mathcal{X}$ we have
\[
\mathrm{Noise}_i(x) \stackrel{d}{=} \mathrm{Noise}'_i(\varphi_i(x)), \tag{17}
\]
where $\varphi_i : \mathcal{X}\to\mathcal{X}\cup\{\bot\}$ is the $\epsilon_i$-identity function and $\mathrm{Noise}'_i(\varphi_i(x))$ is computable in time polynomial in $|\mathcal{X}|$. We now describe the actions of $\mathcal{S}$. The sequence that he specifies is $(\epsilon_1,\ldots,\epsilon_\ell)$. After receiving $(y_1,\ldots,y_\ell)$ (equal to $(\varphi_1(x_1),\ldots,\varphi_\ell(x_\ell))$) he outputs
\[
\mathrm{out}_{\mathcal{S}}(x_1,\ldots,x_\ell) := (\mathrm{Noise}'_1(y_1),\ldots,\mathrm{Noise}'_\ell(y_\ell))
\]
(this clearly takes time that is expected polynomial in $\ell\cdot|\mathcal{X}|$). We now have
\begin{align*}
(\mathrm{Noise}'_1(y_1),\ldots,\mathrm{Noise}'_\ell(y_\ell)) &= (\mathrm{Noise}'_1(\varphi_1(x_1)),\ldots,\mathrm{Noise}'_\ell(\varphi_\ell(x_\ell))) \\
&\stackrel{d}{=} (\mathrm{Noise}_1(x_1),\ldots,\mathrm{Noise}_\ell(x_\ell)), \tag{18}
\end{align*}
where (18) comes from (17). This implies (16) and hence it finishes the proof.

Intuitively, this lemma easily follows from Lemma 4 applied independently to each element of $(x_1,\ldots,x_\ell)$.

4.2. Simulating the Random-Probing Adversary by a Threshold-Probing Adversary

In this section we show how to simulate every $\epsilon$-random-probing adversary by a threshold adversary. This simulation, unlike the one in Sect. 4.1, will not be perfect, in the sense that the distribution output by the simulator will be identical to the distribution of the original adversary only when conditioned on some event that happens with a large probability. We start with the following lemma, whose proof is a straightforward application of the Chernoff bound.

Lemma 6. Let $\mathcal{A}$ be an $\epsilon$-random-probing adversary on $\mathcal{X}^\ell$. Then there exists a $(2\epsilon\ell - 1)$-threshold-probing adversary $\mathcal{S}$ on $\mathcal{X}^\ell$ operating in time linear in the working time of $\mathcal{A}$ such that for every $(x_1,\ldots,x_\ell)$ we have
\[
\Delta\bigl(\mathrm{out}_{\mathcal{A}}(x_1,\ldots,x_\ell) \,;\, \mathrm{out}_{\mathcal{S}}(x_1,\ldots,x_\ell) \,\big|\, \mathrm{out}_{\mathcal{S}}(x_1,\ldots,x_\ell) \ne \bot\bigr) = 0, \tag{19}
\]
where
\[
\mathbb{P}\bigl(\mathrm{out}_{\mathcal{S}}(x_1,\ldots,x_\ell) = \bot\bigr) \le \exp\left(-\frac{\epsilon\ell}{3}\right). \tag{20}
\]

Proof. As in the proof of Lemma 5 we assume that the simulated adversary $\mathcal{A}$ outputs all the information that he received.
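Moreover, since for $\epsilon' \le \epsilon$ every $\epsilon'$-identity function $\varphi'$ can be simulated by the $\epsilon$-identity function $\varphi$, we can assume that each $\epsilon_i$ specified by $\mathcal{A}$ is equal to $\epsilon$. (Footnote: just set $\varphi'(x) := \varphi(x)$ with probability $\epsilon'/\epsilon$, and $\varphi'(x) = \bot$ otherwise. Then clearly $\mathbb{P}(\varphi'(x) = x) = \epsilon'$.) Thus, we need to show a $(2\epsilon\ell - 1)$-threshold-probing simulator $\mathcal{S}$ such that for every $(x_1,\ldots,x_\ell)\in\mathcal{X}^\ell$ we have
\[
\Delta\bigl(\varphi_1(x_1),\ldots,\varphi_\ell(x_\ell) \,;\, \mathrm{out}_{\mathcal{S}}(x_1,\ldots,x_\ell) \,\big|\, \mathrm{out}_{\mathcal{S}}(x_1,\ldots,x_\ell) \ne \bot\bigr) = 0, \tag{21}
\]
(where each $\varphi_i$ is the $\epsilon$-identity function) and (20) holds. The simulator $\mathcal{S}$ proceeds as follows. First he chooses a sequence $(Z_1,\ldots,Z_\ell)$ of independent random variables by setting, for each $i$:
\[
Z_i := \begin{cases} 1 & \text{with probability } \epsilon, \\ 0 & \text{otherwise.} \end{cases}
\]
Let $Z$ denote the number of $Z_i$'s equal to 1, i.e., $Z := \sum_{i=1}^{\ell} Z_i$. If $Z \ge 2\epsilon\ell$, then $\mathcal{S}$ outputs $\bot$. Otherwise, he specifies the set $I$ as $I := \{i : Z_i = 1\}$. He receives $(x_{i_1},\ldots,x_{i_{|I|}})$. For all the remaining $i$'s (i.e., those not in the set $I$) the simulator sets $x_i := \bot$. He outputs $(x_1,\ldots,x_\ell)$. It is straightforward to see that $\mathcal{S}$ is $(2\epsilon\ell - 1)$-threshold-probing and that (21) holds. What remains is to show (20). Since $\mathbb{E}(Z) = \epsilon\ell$,
\[
\mathbb{P}(Z \ge 2\epsilon\ell) = \mathbb{P}(Z \ge 2\,\mathbb{E}(Z)) \le \exp\left(-\frac{\epsilon\ell}{3}\right), \tag{22}
\]
where (22) comes from the Chernoff bound with $\xi = 1$ (cf. Lemma 3). This finishes the proof.

The following corollary combines Lemmas 5 and 6, and will be useful in the sequel.

Corollary 1. Let $d, \ell \in \mathbb{N}$ with $\ell > d$ and let $\mathcal{A}$ be a $d/(4\ell\cdot|\mathcal{X}|)$-noisy adversary on $\mathcal{X}^\ell$. Then there exists a $(d/2 - 1)$-threshold-probing adversary $\mathcal{S}$ such that
\[
\Delta\bigl(\mathrm{out}_{\mathcal{A}}(x_1,\ldots,x_\ell) \,;\, \mathrm{out}_{\mathcal{S}}(x_1,\ldots,x_\ell) \,\big|\, \mathrm{out}_{\mathcal{S}}(x_1,\ldots,x_\ell) \ne \bot\bigr) = 0 \tag{23}
\]
and $\mathbb{P}(\mathrm{out}_{\mathcal{S}}(x_1,\ldots,x_\ell) = \bot) \le \exp(-d/12)$. Moreover, if $\mathcal{A}$ is poly-time-noisy then $\mathcal{S}$ works in time polynomial in $\ell\cdot|\mathcal{X}|$.

Proof. By Lemma 5 there exists a $d/(4\ell)$-random-probing adversary $\mathcal{A}'$ whose output is distributed identically to the output of $\mathcal{A}$.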
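The simulator from the proof of Lemma 6 and the bound (20) can be illustrated with a short sketch. It is only a sketch under stated assumptions: the names `simulate_random_probing`, `abort_probability` and `chernoff_bound`, the sentinel values `ABORT` and `HIDDEN`, and the parameters $d = 16$, $\ell = 1824$ are chosen here for illustration and are not fixed by the lemma.

```python
import math
import random
from fractions import Fraction

ABORT = "abort"   # the simulator's failure output (the symbol "bottom" in (19) and (20))
HIDDEN = None     # a coordinate that the adversary does not learn

def simulate_random_probing(xs, eps, rng):
    """Sketch of the simulator S: pick each position independently with probability eps,
    give up if 2*eps*ell or more positions were picked, otherwise probe exactly those."""
    ell = len(xs)
    z = [rng.random() < eps for _ in range(ell)]   # Z_i = 1 with probability eps
    if sum(z) >= 2 * eps * ell:
        return ABORT
    probed = {i for i in range(ell) if z[i]}       # the set I handed to the oracle
    return [xs[i] if i in probed else HIDDEN for i in range(ell)]

def abort_probability(ell, eps):
    """Exact P(Z >= 2*eps*ell) for Z ~ Binomial(ell, eps), computed via the complement."""
    k = math.ceil(2 * eps * ell)
    below = sum(math.comb(ell, i) * eps**i * (1 - eps)**(ell - i) for i in range(k))
    return 1 - below

def chernoff_bound(ell, eps):
    """Right-hand side of (20): exp(-eps * ell / 3)."""
    return math.exp(-float(eps * ell) / 3)

d, ell = 16, 1824                  # illustrative values only
eps = Fraction(d, 4 * ell)         # the probing probability d / (4 * ell) from Corollary 1
print(float(abort_probability(ell, eps)) <= chernoff_bound(ell, eps))  # True

rng = random.Random(0)
out = simulate_random_probing(list(range(ell)), eps, rng)
print(out == ABORT or sum(v is not HIDDEN for v in out) <= d // 2 - 1)  # True
```

Using `Fraction` for $\epsilon$ keeps the threshold $2\epsilon\ell = d/2$ exact, which avoids an off-by-one in the abort test that floating-point rounding could otherwise introduce.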
In turn, by Lemma 6 for $t = 2\cdot(d/(4\ell))\cdot\ell = d/2$ there exists a $(t-1)$-threshold-probing adversary $\mathcal{S}$ whose output, conditioned on not being equal to $\bot$, is distributed identically to the output of $\mathcal{A}'$, and such that $\mathbb{P}(\mathrm{out}_{\mathcal{S}}(x_1,\ldots,x_\ell) = \bot) \le \exp(-d/12)$ (this is the bound (20) with $\epsilon\ell = d/4$). If $\mathcal{A}$ is poly-time-noisy then clearly the expected working time of $\mathcal{A}'$ is polynomial in $\ell\cdot|\mathcal{X}|$. Since the working time of $\mathcal{S}$ is linear in the working time of $\mathcal{A}'$, this finishes the proof.

5. Leakage from Computation

In this section we address the main topic of this paper, which is the noise-resilience of cryptographic computations. Our main model will be the model of arithmetic circuits over a finite field. First, in Sect. 5.1 we present our security definitions, and then, in Sect. 5.2 we describe a secure "compiler" that transforms any cryptographic scheme secure in the "black-box" model into one secure against the noisy leakage (it is essentially identical to the transformation of [17], later extended in [31]). Finally, in the last section we present our security results.

5.1. Definitions

A (stateful arithmetic) circuit $\Gamma$ over a field $\mathbb{F}$ is a directed graph whose nodes are called gates. Each gate $\gamma$ can be of one of the following types: an input gate $\gamma^{\mathrm{inp}}$ of fan-in zero, an output gate $\gamma^{\mathrm{out}}$ of fan-out zero, a random gate $\gamma^{\mathrm{rand}}$ of fan-in zero, a multiplication gate $\gamma^{\times}$ of fan-in 2, an addition gate $\gamma^{+}$ of fan-in 2, a subtraction gate $\gamma^{-}$ of fan-in 2, a constant gate $\gamma^{\mathrm{const}}$, and a memory gate $\gamma^{\mathrm{mem}}$ of fan-in 1. Following [17] we assume that the fan-out of every gate is at most 3. The only cycles that are allowed in $\Gamma$ must contain exactly 1 memory gate. The size $|\Gamma|$ of the circuit $\Gamma$ is defined to be the total number of its gates. The numbers of input gates, output gates and memory gates will be denoted $|\Gamma.\mathrm{inp}|$, $|\Gamma.\mathrm{out}|$, and $|\Gamma.\mathrm{mem}|$, respectively.

The computation of $\Gamma$ is performed in several "rounds" numbered $1, 2, \ldots$.
In each of them the circuit will take some input, produce an output and update the memory state. Initially, the memory gates of $\Gamma$ are preloaded with some initial "state" $k_0 \in \mathbb{F}^{|\Gamma.\mathrm{mem}|}$. At the beginning of the $i$th round the input gates are loaded with elements of some vector $a_i \in \mathbb{F}^{|\Gamma.\mathrm{inp}|}$ called the input for the $i$th round. The computation of $\Gamma$ in the $i$th round depends on $a_i$ and on the memory state $k_{i-1}$. It proceeds in a straightforward way: if all the input wires of a given gate are known then the value on its output wire can be computed naturally: if $\gamma$ is a multiplication gate with input wires carrying values $a$ and $b$, then its output wire will carry the value $a\cdot b$ (where "$\cdot$" is the multiplication operation in $\mathbb{F}$), and the addition and the subtraction gates are handled analogously. We assume that the random gates produce a fresh random field element in each round. The output of the $i$th round is read off from the output gates and denoted $b_i \in \mathbb{F}^{|\Gamma.\mathrm{out}|}$. The state after the $i$th round is contained in the memory gates and denoted $k_i$. For $k \in \mathbb{F}^{|\Gamma.\mathrm{mem}|}$ and a sequence of inputs $(a_1,\ldots,a_m)$ (where each $a_i \in \mathbb{F}^{|\Gamma.\mathrm{inp}|}$) let $\Gamma(k, a_1,\ldots,a_m)$ denote the sequence $(B_1,\ldots,B_m)$ where each $B_i$ is the output of $\Gamma$ with $k_0 = k$ and inputs $a_1,\ldots,a_m$ in rounds $1, 2, \ldots$. Observe that, since $\Gamma$ is randomized, $\Gamma(k, a_1,\ldots,a_m)$ is a random variable.

A black-box circuit adversary $\mathcal{A}$ is a machine that adaptively interacts with a circuit $\Gamma$ via the input and output interface. Then $\mathrm{out}\bigl(\mathcal{A} \overset{\mathrm{bb}}{\leftrightarrows} \Gamma(k)\bigr)$ denotes the output of $\mathcal{A}$ after interacting with $\Gamma$ whose initial memory state is $k_0 = k$. A $\delta$-noisy circuit adversary $\mathcal{A}$ is an adversary that has the following additional ability: after each $i$th round $\mathcal{A}$ gets some partial information about the internal state of the computation via the noisy leakage functions. More precisely: let $(X_1,\ldots,X_\ell)$ be the random variable denoting the values on the wires of $\Gamma(k)$ in the $i$th round.
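Then $\mathcal{A}$ plays the role of a $\delta$-noisy adversary in a game against $(X_1,\ldots,X_\ell)$ (cf. Sect. 4), namely: he chooses a sequence $\{\mathrm{Noise}_i : \mathbb{F}\to\mathcal{Y}\}_{i=1}^{\ell}$ of functions such that every $\mathrm{Noise}_i$ is $\delta_i$-noisy for some $\delta_i \le \delta$, and he receives $\mathrm{Noise}_1(X_1),\ldots,\mathrm{Noise}_\ell(X_\ell)$. Let $\mathrm{out}\bigl(\mathcal{A} \overset{\mathrm{noisy}}{\leftrightarrows} \Gamma(k)\bigr)$ denote the output of such an $\mathcal{A}$ after interacting with $\Gamma$ whose initial memory state is $k_0 = k$.

We can also replace, in the above definition, the "$\delta$-noisy adversary" with the "$\epsilon$-random-probing adversary". In this case, after each $i$th round $\mathcal{A}$ chooses a sequence $(\epsilon_1,\ldots,\epsilon_\ell)$ such that each $\epsilon_i \le \epsilon$ and he learns $\varphi_1(X_1),\ldots,\varphi_\ell(X_\ell)$, where each $\varphi_i$ is the $\epsilon_i$-identity function. Let $\mathrm{out}\bigl(\mathcal{A} \overset{\mathrm{rnd}}{\leftrightarrows} \Gamma(k)\bigr)$ denote the output of such an $\mathcal{A}$ after interacting with $\Gamma$ whose initial memory state is $k_0 = k$.

Analogously we can replace the "$\delta$-noisy adversary" with the "$t$-threshold-probing adversary", obtaining an adversary that after each $i$th round learns $t$ elements of $X_1,\ldots,X_\ell$. Let $\mathrm{out}\bigl(\mathcal{A} \overset{\mathrm{thr}}{\leftrightarrows} \Gamma(k)\bigr)$ denote the output of such an $\mathcal{A}$ after interacting with $\Gamma$ whose initial memory state is $k_0 = k$.

Definition 2. Consider two stateful circuits $\Gamma$ and $\widehat{\Gamma}$ (over some field $\mathbb{F}$) and a randomized encoding function $\mathrm{Enc}$. We say that $\widehat{\Gamma}$ is a $(\delta,\xi)$-noise-resilient implementation of a circuit $\Gamma$ w.r.t. $\mathrm{Enc}$ if the following holds for every $k \in \mathbb{F}^{|\Gamma.\mathrm{mem}|}$:
1. the input-output behavior of $\Gamma(k)$ and $\widehat{\Gamma}(\mathrm{Enc}(k))$ is identical, i.e.: for every sequence of inputs $a_1,\ldots,a_m$ and outputs $b_1,\ldots,b_m$ we have
\[
\mathbb{P}\bigl(\Gamma(k, a_1,\ldots,a_m) = (b_1,\ldots,b_m)\bigr) = \mathbb{P}\bigl(\widehat{\Gamma}(\mathrm{Enc}(k), a_1,\ldots,a_m) = (b_1,\ldots,b_m)\bigr),
\]
and
2. for every $\delta$-noisy circuit adversary $\mathcal{A}$ there exists a black-box circuit adversary $\mathcal{S}$ such that
\[
\Delta\Bigl(\mathrm{out}\bigl(\mathcal{S} \overset{\mathrm{bb}}{\leftrightarrows} \Gamma(k)\bigr) \,;\, \mathrm{out}\bigl(\mathcal{A} \overset{\mathrm{noisy}}{\leftrightarrows} \widehat{\Gamma}(\mathrm{Enc}(k))\bigr)\Bigr) \le \xi. \tag{24}
\]

The definition of $\widehat{\Gamma}$ being a $(\epsilon,\xi)$-random-probing resilient implementation of a circuit $\Gamma$ is identical to the one above, except that Point 2 is replaced with:
2'.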
for every $\epsilon$-random-probing circuit adversary $\mathcal{A}$ there exists a black-box circuit adversary $\mathcal{S}$ such that
\[
\Delta\Bigl(\mathrm{out}\bigl(\mathcal{S} \overset{\mathrm{bb}}{\leftrightarrows} \Gamma(k)\bigr) \,;\, \mathrm{out}\bigl(\mathcal{A} \overset{\mathrm{rnd}}{\leftrightarrows} \widehat{\Gamma}(\mathrm{Enc}(k))\bigr)\Bigr) \le \xi.
\]

The definition of $\widehat{\Gamma}$ being a $(t,\xi)$-threshold-probing resilient implementation of a circuit $\Gamma$ is identical to the one above, except that Point 2 is replaced with:
2''. for every $t$-threshold-probing circuit adversary $\mathcal{A}$ there exists a black-box circuit adversary $\mathcal{S}$ such that
\[
\Delta\Bigl(\mathrm{out}\bigl(\mathcal{S} \overset{\mathrm{bb}}{\leftrightarrows} \Gamma(k)\bigr) \,;\, \mathrm{out}\bigl(\mathcal{A} \overset{\mathrm{thr}}{\leftrightarrows} \widehat{\Gamma}(\mathrm{Enc}(k))\bigr)\Bigr) \le \xi.
\]

In all cases above we will say that $\widehat{\Gamma}$ is an implementation of $\Gamma$ with efficient simulation if the simulator $\mathcal{S}$ works in time polynomial in $|\widehat{\Gamma}|\cdot|\mathbb{F}|$ as long as $\mathcal{A}$ is poly-time and the noise functions specified by $\mathcal{A}$ are efficiently decidable.

5.2. The Implementation

In this section we describe the circuit compiler of [17], generalized to larger fields in [31]. Let $\Gamma$ be a stateful arithmetic circuit and let $d \in \mathbb{N}$ be a parameter. The encoding function $\mathrm{Enc}_+$ that we use is also standard and is often called the "additive masking". It is defined as: $\mathrm{Enc}_+(x) := (X_1,\ldots,X_d)$, where $X_1,\ldots,X_d$ are uniform such that $X_1 + \cdots + X_d = x$.

At a high level, each wire $w$ in the original circuit $\Gamma$ is represented by a wire bundle in $\widehat{\Gamma}$, consisting of $d$ wires $\vec{w} = (w_1,\ldots,w_d)$, that carry an encoding of $w$. The gates in $\Gamma$ are replaced gate-by-gate with so-called gadgets, computing on encoded values. The main difficulty is to construct gadgets that remain "secure" even if their internals may leak. Because the transformed gadgets in $\widehat{\Gamma}$ operate on encodings, $\widehat{\Gamma}$ needs to have a subcircuit at the beginning that encodes the inputs and another subcircuit at the end that decodes the outputs. We will deal with the output decoding later. The input encoding is easy to implement for our encoding function $\mathrm{Enc}_+$: to encode an input $x$ one simply uses the random gates to generate $d-1$ field elements $x_1,\ldots,x_{d-1}$ and then computes $x_d$ as $x - (x_1 + \cdots + x_{d-1})$, so that $x_1 + \cdots + x_d = x$. Clearly this can be done using $d$ addition and subtraction gates.
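The additive masking $\mathrm{Enc}_+$ and the input encoding just described can be sketched as follows. This is a minimal illustration under stated assumptions: the field $\mathbb{F}$ is modeled as the integers modulo a small prime `P` (a choice made here for brevity; an AES implementation would work over $GF(2^8)$ instead), and the names `encode` and `decode` are hypothetical.

```python
import secrets

P = 251  # hypothetical small prime; the field F is modeled as the integers mod P

def encode(x, d):
    """Additive masking Enc_+: d shares, uniform subject to summing to x in F."""
    shares = [secrets.randbelow(P) for _ in range(d - 1)]  # outputs of d - 1 random gates
    shares.append((x - sum(shares)) % P)                   # last share: x - (x_1 + ... + x_{d-1})
    return shares

def decode(shares):
    """Dec_+: the encoded value is the sum of all shares."""
    return sum(shares) % P

shares = encode(42, d=5)
print(len(shares), decode(shares))  # 5 42
```

Any $d-1$ of the shares are uniformly distributed and independent of $x$, which is the property the gadgets of the compiler are designed to preserve.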
1 d−1 Recall that the memory gates of Γ are assumed to be preloaded with field elements that already encode k using the encoding Enc [cf. (24)]; hence, there is no need to encode k. const Each constant gate γ in Γ can be transformed into d constant gates in Γ ,the const const first of them being γ and the remaining ones being γ . This is trivially correct as rand c = c + 0 + ··· + 0. Every random gate γ in Γ is transformed into d random gates in Γ . This works since, clearly, a uniformly random encoding ( X ,..., X ) encodes a 1 d uniformly random element of F. What remains to show is how the operation (addition, subtraction, and multiplication) − → gates are handled. Consider a gate γ in Γ .Let a and b be its input wires and let a = − → (a ,..., a ) and b = (b ,..., b ) be their corresponding wire bundles in Γ .Let the 1 d 1 d output wire bundle in Γ be (c ,..., c ). The cases when γ is an addition or subtraction 1 d gate are actually easy to deal with, thanks to the linearity of the encoding function. For example, if γ is an addition gate γ then each c can be computed using an addition gate γ in Γ with input wires a and b (this is obviously correct as (a + b ) + ···+ (a + i i 1 1 d b ) = (a +··· a ) + (b +···+ b )). The subtraction is handled analogously. The only d 1 d 1 d tricky case is when γ is the multiplication gate. In this case the circuit Γ generates, for Unifying Leakage Models: From Probing Attacks to Noisy Leakage every 1 ≤ i < j ≤ d, a random field element z (this is done using the random gates i, j in Γ ). Then, for every 1 ≤ j < i ≤ d it computes z := a b + a b − z , and i, j i j j i j,i finally he computes each c (for i = 1,..., d)as c := a b + z . To see why i i i i i, j i = j this computation is correct consider the sum c = c + ··· + c and observe that every 1 d z in it appears exactly once with plus sign and once with a minus sign, and hence it i, j cancels out. Moreover each term a b appears in the formula for c exactly once. 
Hence $c$ is equal to
\[
\sum_{i,j\in\{1,\ldots,d\}} a_i b_j = \Bigl(\sum_{i=1}^{d} a_i\Bigr)\Bigl(\sum_{j=1}^{d} b_j\Bigr) = ab.
\]
It is straightforward to verify that the total number of gates in this gadget is $3.5\cdot d^2$. This finishes the description of the compiler.

The multiplication gadget above turns out to be useful as a building block for "refreshing" of the encoding. More concretely, suppose we have a wire bundle $\vec{a} = (a_1,\ldots,a_d)$ and we wish to obtain another bundle $\vec{b} = (b_1,\ldots,b_d)$ such that $\vec{b}$ is a fresh encoding of $\mathrm{Dec}_+(\vec{a})$. This can be achieved by a Refresh sub-gadget constructed as follows. First, create an encoding $(1, 0, \ldots, 0)$ of 1 (using $d$ constant gates), and multiply $(1, 0, \ldots, 0)$ and $\vec{a}$ together using the multiplication protocol above. Since $(1, 0, \ldots, 0)$ is an encoding of 1, the result will be an encoding of $1\cdot a = a$. The multiplication can be done with $3.5\cdot d^2$ gates, and hence altogether this gadget uses $3.5\cdot d^2 + d$ gates.

We can now use the Refresh sub-gadget to construct the output gadgets in $\widehat{\Gamma}$. Let $\gamma^{\mathrm{out}}$ be an output gate in $\Gamma$ with an input wire $a$. Then in $\widehat{\Gamma}$ it is transformed into the following: let $\vec{a}$ be the wire bundle corresponding to $a$. First apply the Refresh sub-gadget, and then calculate the sum $b_1 + \cdots + b_d$ (where $(b_1,\ldots,b_d)$ is the output of Refresh) and output the result.

The refreshing gadget is also useful to provide security of the memory encoding in the multi-round scenario. More precisely, we assume that every memory state gets refreshed at the end of each round by the Refresh procedure. It is easy to see that without this "refreshing" the contents of the memory would eventually leak completely to the adversary even if he probes a very limited number (say: 1) of wires in each round. For more details see [17].

5.3. Security in the Probing Model [17]

In [17] it is shown that the compiler from the previous section is secure against probing attacks in which the adversary can probe at most $\lfloor (d-1)/2 \rfloor$ wires in each round.
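The multiplication gadget and the Refresh sub-gadget of Sect. 5.2 can be sketched as follows. This sketch only checks functional correctness (the output bundle encodes the product); it does not model wires, gate ordering, or leakage, and therefore says nothing about security. The prime modulus `P` standing in for $\mathbb{F}$ and the function names are assumptions made for illustration.

```python
import secrets

P = 251  # hypothetical prime modulus standing in for the field F

def encode(x, d):
    """Additive masking: d shares that sum to x modulo P."""
    shares = [secrets.randbelow(P) for _ in range(d - 1)]
    shares.append((x - sum(shares)) % P)
    return shares

def decode(shares):
    return sum(shares) % P

def multiply(a, b):
    """Multiplication gadget on two wire bundles of equal length d."""
    d = len(a)
    z = [[0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1, d):
            z[i][j] = secrets.randbelow(P)                        # fresh random z_{i,j} for i < j
            z[j][i] = (a[j] * b[i] + a[i] * b[j] - z[i][j]) % P   # z_{j,i} = a_j b_i + a_i b_j - z_{i,j}
    # c_i = a_i b_i + sum over j != i of z_{i,j}
    return [(a[i] * b[i] + sum(z[i][j] for j in range(d) if j != i)) % P
            for i in range(d)]

def refresh(a):
    """Refresh sub-gadget: multiply by the fixed encoding (1, 0, ..., 0) of 1."""
    one = [1] + [0] * (len(a) - 1)
    return multiply(one, a)

a, b = encode(12, 4), encode(34, 4)
print(decode(multiply(a, b)) == (12 * 34) % P)  # True
print(decode(refresh(a)) == 12)                 # True
```

The random values $z_{i,j}$ cancel in the sum of the output shares, which is why decoding the output bundle always yields the product regardless of the randomness drawn.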
This parameter may be a bit disappointing, as the number of probes that the adversary needs to break the security does not grow with the size of the circuit. This assumption may seem particularly unrealistic for large circuits $\Gamma$. Fortunately, [17] also shows a small modification of the construction from Sect. 5.2 that is resilient to a larger number of probes, provided that the number of probes from each gadget is bounded. Before we present it, let us argue why the original construction is not secure against such attacks. To this end, assume that our circuit $\Gamma$ has a long sequence of wires $a_1,\ldots,a_m$, where each $a_i$ (for $i > 1$) is the result of adding to $a_{i-1}$ (using an addition gate) a 0 constant (that was generated using a $\gamma_0^{\mathrm{const}}$ gate). It is easy to see that in the circuit $\widehat{\Gamma}$ all the wire bundles $\vec{a}_1,\ldots,\vec{a}_m$ (where each $\vec{a}_i$ corresponds to $a_i$) will be identical. Hence, the adversary that probes even a single wire in each addition gadget in $\widehat{\Gamma}$ will learn the encoding of $a_1$ completely as long as $m \ge d$. Fortunately, one can deal with this problem by "refreshing" the encoding after each subtraction and addition gate exactly in the same way as done before, i.e., by using the Refresh sub-gadget.

(Footnote on the result of [17]: strictly speaking, the proof of [17] considers only the case when $\mathbb{F} = GF(2)$. It was observed in [31] that it can be extended to any finite field, as the only properties of $GF(2)$ that are used in the proof are the field axioms. Moreover, addition (XOR) gates are not considered, as AND and NOT gates are sufficient to describe any type of Boolean circuit. A full proof of security for all linear operations (including the field addition) and their composability has recently been provided by Andrychowicz et al. [3]. Moreover, we emphasize that when we use a composable refreshing scheme that is placed between each gadget, then the proof for the addition gate is trivial and just follows from counting the intermediate operations.)

Lemma 7.
([17]) Let $\Gamma$ be an arbitrary stateful arithmetic circuit over some field $\mathbb{F}$. Let $\Gamma'$ be the circuit that results from the procedure described above. Then $\Gamma'$ is a $(\lfloor (d-1)/2 \rfloor \cdot |\Gamma|, 0)$-threshold-probing resilient implementation of the circuit $\Gamma$ (with efficient simulation), provided that the adversary does not probe each gadget more than $\lfloor (d-1)/2 \rfloor$ times in each round.

We notice that [17] also contains a second transformation with blow-up $O(d\,|\Gamma|)$. It may be possible that this transformation can provide better noise parameters than those achieved by Theorem 2. However, due to the hidden parameters in the $O$-notation we do not get a straightforward improvement of our result. In particular, using this transformation the size of the transformed circuit depends also on an additional statistical security parameter, which will affect the tolerated noise level.

5.4. Resilience to Noisy Leakage from the Wires

We now show that the construction from Sect. 5.3 is secure against the noisy leakage. More precisely, we show the following.

Theorem 1. Let $\Gamma$ be an arbitrary stateful arithmetic circuit over some field $\mathbb{F}$. Let $\Gamma'$ be the circuit that results from the procedure described in Sect. 5.3. Then $\Gamma'$ is a $(\delta, |\Gamma| \cdot \exp(-d/12))$-noise-resilient implementation of $\Gamma$ (with efficient simulation), where
$$\delta := \left((28d + 8)\,|\mathbb{F}|\right)^{-1} = O(1/(d \cdot |\mathbb{F}|)).$$

Proof. Let $\mathcal{A}$ be a $\delta$-noisy circuit adversary attacking $\Gamma'$. We construct an efficient black-box simulator $\mathcal{S}$ such that for every $k$ it holds that
$$\Delta\left(\mathsf{out}\left(\mathcal{S} \overset{\mathrm{bb}}{\rightleftarrows} \Gamma(k)\right) \,;\, \mathsf{out}\left(\mathcal{A} \overset{\mathrm{noisy}}{\rightleftarrows} \Gamma'(\mathrm{Enc}(k))\right)\right) \leq |\Gamma| \cdot \exp(-d/12). \tag{25}$$
Observe that in our construction every gate gets transformed into a gadget of at most $3.5 \cdot d^2 + d$ gates. Since each gate can have at most 2 inputs, the total number of wires in a gadget is $\ell := 7 \cdot d^2 + 2 \cdot d$. Let $\gamma_1, \ldots, \gamma_{|\Gamma|}$ be the gates of $\Gamma$. For each $i = 1, \ldots, |\Gamma|$ let the wires in the gadget in $\Gamma'$ that corresponds to $\gamma_i$ be denoted $(x^i_1, \ldots, x^i_\ell)$.
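The next step of the proof relies on rewriting $\delta$ in terms of the number of wires per gadget. As a quick worked check, using only the definition of $\delta$ in Theorem 1 and $\ell = 7 \cdot d^2 + 2 \cdot d$:

```latex
\delta \;=\; \frac{1}{(28d+8)\,|\mathbb{F}|}
       \;=\; \frac{d}{4\,(7d^2+2d)\,|\mathbb{F}|}
       \;=\; \frac{d}{4\,\ell\,|\mathbb{F}|},
\qquad \ell = 7d^2 + 2d .
```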
Since $\delta = d/(4\ell\,|\mathbb{F}|)$, we can use Corollary 1 and simulate the noise from each $(x^i_1, \ldots, x^i_\ell)$ by a $(d/2 - 1)$-threshold-probing adversary $\mathcal{S}_i$ working in time polynomial in $\ell \cdot |\mathcal{X}|$ (here $\mathcal{X} = \mathbb{F}$). The simulation is perfect, unless $\mathcal{S}_i$ outputs $\bot$, which, by Corollary 1, happens with probability at most $\exp(-d/12)$. Hence, by the union bound the probability that some $\mathcal{S}_i$ outputs $\bot$ is at most $|\Gamma| \cdot \exp(-d/12)$. Denote this event $\mathcal{E}$. From Lemma 7 we know that every probing adversary that attacks $\Gamma'$ by probing at most $\lfloor (d-1)/2 \rfloor \geq d/2 - 1$ wires from each gadget can be perfectly simulated in polynomial time by an adversary $\mathcal{S}$ with black-box access to $\Gamma$. Hence, $\mathcal{A}$ can also be simulated perfectly with black-box access to $\Gamma$, conditioned on the fact that $\mathcal{E}$ did not occur. Hence we get
$$\Delta\left(\mathsf{out}\left(\mathcal{S} \overset{\mathrm{bb}}{\rightleftarrows} \Gamma(k)\right) \,\Big|\, \neg\mathcal{E} \,;\, \mathsf{out}\left(\mathcal{A} \overset{\mathrm{noisy}}{\rightleftarrows} \Gamma'(\mathrm{Enc}(k))\right)\right) = 0.$$
This, by Lemma 2 (Sect. 2.1), implies (25). Obviously $\mathcal{S}$ works in time polynomial in $|\Gamma| \cdot d \cdot |\mathbb{F}|$, which is polynomial in $|\Gamma'| \cdot |\mathbb{F}|$. This finishes the proof.

In short, this theorem is proven by combining Corollary 1, which reduces the noisy adversary to the probing adversary, with Lemma 7, which shows that the construction from Sect. 5.3 is secure against probing.

5.5. Resilience to Noisy Leakage from the Gates

The model of Prouff and Rivain is actually slightly different from the one considered in the previous section. The difference is that they assume that the noise is generated by the gates, not by the wires. This can be formalized by assuming that each noise function $\mathsf{Noise}$ is applied to the “contents of a gate”. We do not need to specify exactly what we mean by this. It is enough to observe that the contents of each gate $\gamma$ can be described by at most 2 field elements: obviously if $\gamma$ is a random gate, output gate, or memory gate then its entire state in a given round can be described by one field element, and if $\gamma$ is an operation gate then it can be described by two field elements that correspond to $\gamma$'s input.
Hence, without loss of generality we can assume that the noise function is defined over the domain $\mathbb{F} \times \mathbb{F}$. Formally, we define a $\delta$-gate-noisy circuit adversary $\mathcal{A}$ as a machine that, besides having black-box access to a circuit $\Gamma(k)$, can, after each $i$th round, get some partial information about the internal state of the computation via the $\delta$-noisy leakage functions applied to the gates (in the model described above). Let $\mathsf{out}\left(\mathcal{A} \overset{\text{g-noisy}}{\rightleftarrows} \Gamma(k)\right)$ denote the output of such $\mathcal{A}$ after interacting with $\Gamma$ whose initial memory state is $k_0 = k$.

We can accordingly modify the definition of noise-resilient circuit implementations (cf. Definition 2). We say that $\Gamma'$ is a $(\delta, \xi)$-input-gate-noise resilient implementation of a circuit $\Gamma$ w.r.t. $\mathrm{Enc}$ if for every $k$ and every $\delta$-gate-noisy circuit adversary $\mathcal{A}$ described above there exists a black-box circuit adversary $\mathcal{S}$ working in time polynomial in $|\Gamma'| \cdot |\mathbb{F}|$ such that
$$\Delta\left(\mathsf{out}\left(\mathcal{S} \overset{\mathrm{bb}}{\rightleftarrows} \Gamma(k)\right) \,;\, \mathsf{out}\left(\mathcal{A} \overset{\text{g-noisy}}{\rightleftarrows} \Gamma'(\mathrm{Enc}(k))\right)\right) \leq \xi. \tag{26}$$

It turns out that the transformation from Sect. 5.3 also works in this model, although with different parameters. More precisely, we have the following theorem.

Theorem 2. Let $\Gamma$ be an arbitrary stateful arithmetic circuit over some field $\mathbb{F}$. Let $\Gamma'$ be the circuit that results from the procedure described in Sect. 5.3. Then $\Gamma'$ is a $(\delta, |\Gamma| \cdot \exp(-d/24))$-noise-resilient implementation of $\Gamma$ (with efficient simulation), where
$$\delta := \left((28d + 8) \cdot |\mathbb{F}|^2\right)^{-1} = O(1/(d \cdot |\mathbb{F}|^2)). \tag{27}$$

Proof. The proof is similar to the one of Theorem 1, so we only describe the key differences. Let $\mathcal{A}$ be a $\delta$-noisy adversary. The number $\ell$ now corresponds to the number of gates in each gadget, and hence it is equal to $3.5 \cdot d^2 + d$. It is therefore straightforward to calculate that $\delta$ defined in (27) is equal to $(d/2)/(4\ell \cdot |\mathbb{F}|^2)$. Since the $\mathsf{Noise}$ function has domain of size $|\mathbb{F}|^2$, we can use Corollary 1, obtaining that $\mathcal{A}$ can be simulated by an adversary $\mathcal{S}$ that probes each gadget in fewer than $d/2$ positions.
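The calculation of $\delta$ mentioned in the proof can be written out as follows (a worked check, using $\ell = 3.5 \cdot d^2 + d$ for the number of gates per gadget):

```latex
\frac{d/2}{4\,\ell\,|\mathbb{F}|^2}
  \;=\; \frac{d}{8\,(3.5\,d^2+d)\,|\mathbb{F}|^2}
  \;=\; \frac{d}{(28\,d^2+8\,d)\,|\mathbb{F}|^2}
  \;=\; \frac{1}{(28d+8)\,|\mathbb{F}|^2}
  \;=\; \delta .
```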
Since now each “position” corresponds to a gate in the circuit, the adversary needs to probe up to two wires to determine its value. Therefore $\mathcal{S}$ probes fewer than $d$ wires in each gadget. Since $d$ is now 1/2 of what it was in the proof of Corollary 1, the error probability becomes $\exp(-(d/2)/12) = \exp(-d/24)$.

Comparison with [28]. As described in the introduction, our main advantages over [28] are the removal of the assumption about the existence of leak-free gates, a stronger security model (chosen message attack instead of random message attack), and a more meaningful security statement. Still, it is interesting to compare our noise parameters with the parameters of [28]. Let us analyze how much noise is needed by [28] to ensure that the adversary obtains exponentially small information from leakage. The reader should keep in mind that both in our paper and in [28] “more noise” means that a certain quantity ($\delta$ in our case) is smaller. Hence, the larger $\delta$ is, the stronger the result becomes (as it means that less noise is required for the security to hold).

The main result of [28] is Theorem 4 on page 154. Unfortunately, the statement of this theorem is asymptotic, treating $|\mathbb{F}|$ as constant, and hence to get a precise bound on how much noise is required one needs to inspect the proof. The bound on the noise can be deduced from the part of the proof entitled “Security of Type 3 Subsequences”, where the required noise is inversely proportional to “$\lambda(d)$”, and this last value is linear in $d \cdot |\mathbb{F}|^3$ for a general $d$ and linear in $d \cdot |\mathbb{F}|^{3/2}$ for large $d$'s (note that $|\mathbb{F}|$ is denoted by $N$ in [28], and $d$ is a security parameter identical to ours). Hence, for a general $d$, their $\delta$ is $O(1/(d \cdot |\mathbb{F}|^3))$. However, as explained in Sect. 3.1, the notion of distance in [28] is slightly different from the standard “statistical distance” that we use. Fortunately, one can use (7) to translate our bound into their language.
It turns out that in this case our and their bounds are asymptotically identical for general $d$'s, i.e., $O(1/(d \cdot |\mathbb{F}|^3))$. This is shown in Corollary 2 below. Note that this translation is unidirectional, in the sense that their “$O(1/(d \cdot |\mathbb{F}|^3))$” bound does not imply a bound “$O(1/(d \cdot |\mathbb{F}|^2))$” in our sense.

(Footnote: Note that our result holds only when the number of shares is large. For small values of $d$ (e.g., $d = 2, 3, 4$) like those considered in [35], our result does not give meaningful bounds. This is similar to the work of Prouff and Rivain [28], and it is an interesting open research question to develop security models that work for small security parameters.)

Corollary 2. Let $\Gamma$ be an arbitrary stateful arithmetic circuit over some field $\mathbb{F}$. Let $\Gamma'$ be the circuit that results from the procedure described in Sect. 5.3. Then $\Gamma'$ is a $(\delta', |\Gamma| \cdot \exp(-d/24))$-noise-resilient implementation of $\Gamma$ (with efficient simulation) when the noise is defined using the $\beta$ distance, where
$$\delta' = \left((14d + 4) \cdot |\mathbb{F}|^3\right)^{-1} = O(1/(d \cdot |\mathbb{F}|^3)).$$

Proof. From (7) with $\mathcal{X} = \mathbb{F} \times \mathbb{F}$, it follows that if $\mathsf{Noise}$ is $\delta'$-noisy with respect to the $\beta$ distance, then it is $(|\mathbb{F}| \cdot \delta'/2)$-noisy in the standard sense. Since this last value, $|\mathbb{F}| \cdot \delta'/2 = \left((28d + 8) \cdot |\mathbb{F}|^2\right)^{-1}$, is equal to $\delta$ defined in (27), we can use Theorem 2, obtaining that $\Gamma'$ is a $(\delta', |\Gamma| \cdot \exp(-d/24))$-noise-resilient implementation of $\Gamma$ when the noise is defined using the $\beta$ distance.

Acknowledgements

We would like to thank the anonymous Eurocrypt and Journal of Cryptology reviewers for their careful reading of our manuscript and their many insightful comments.
Open Access

This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

References

[1] M. Ajtai, Secure computation with information leaking to an adversary. In Proceedings of the 43rd ACM Symposium on Theory of Computing, STOC 2011, San Jose, CA, USA, 6-8 June 2011, pages 715–724 (2011)
[2] A. Akavia, S. Goldwasser, V. Vaikuntanathan, Simultaneous Hardcore Bits and Cryptography against Memory Attacks. In TCC, pages 474–495 (2009)
[3] M. Andrychowicz, S. Dziembowski, S. Faust, Circuit compilers with O(1/log(n)) leakage rate. In Advances in Cryptology - EUROCRYPT 2016 - 35th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Proceedings, Part II, pages 586–615 (2016)
[4] J. Blömer, J. Guajardo, V. Krummel, Provably Secure Masking of AES. In Selected Areas in Cryptography, pages 69–83 (2004)
[5] C. Carlet, L. Goubin, E. Prouff, M. Quisquater, M. Rivain, Higher-Order Masking Schemes for S-Boxes. In FSE, pages 366–384 (2012)
[6] S. Chari, C.S. Jutla, J.R. Rao, P. Rohatgi, Towards Sound Approaches to Counteract Power-Analysis Attacks. In CRYPTO, pages 398–412 (1999)
[7] C. Clavier, J. Coron, N. Dabbous, Differential Power Analysis in the Presence of Hardware Countermeasures. In CHES, pages 252–263 (2000)
[8] J. Coron, I. Kizhvatov, Analysis and Improvement of the Random Delay Countermeasure of CHES 2009. In CHES, pages 95–109 (2010)
[9] D.P. Dubhashi, A. Panconesi, Concentration of Measure for the Analysis of Randomized Algorithms. Cambridge University Press (2009)
[10] S. Dziembowski, S. Faust, Leakage-Resilient Circuits without Computational Assumptions. In TCC, pages 230–247 (2012)
[11] S. Dziembowski, S. Faust, M. Skorski, Noisy leakage revisited. In Elisabeth Oswald and Marc Fischlin, editors, Advances in Cryptology - EUROCRYPT 2015 - 34th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Sofia, Bulgaria, April 26-30, 2015, Proceedings, Part II, volume 9057, pages 159–188. Springer (2015)
[12] S. Dziembowski, K. Pietrzak, Leakage-Resilient Cryptography. In FOCS, pages 293–302 (2008)
[13] S. Faust, T. Rabin, L. Reyzin, E. Tromer, V. Vaikuntanathan, Protecting Circuits from Leakage: the Computationally-Bounded and Noisy Cases. In EUROCRYPT, pages 135–156 (2010)
[14] S. Goldwasser, G.N. Rothblum, Securing computation against continuous leakage. In CRYPTO, pages 59–79 (2010)
[15] S. Goldwasser, G.N. Rothblum, How to Compute in the Presence of Leakage. In FOCS, pages 31–40 (2012)
[16] L. Goubin, J. Patarin, DES and Differential Power Analysis (The “Duplication” Method). In CHES, pages 158–172 (1999)
[17] Y. Ishai, A. Sahai, D. Wagner, Private Circuits: Securing Hardware against Probing Attacks. In CRYPTO, pages 463–481 (2003)
[18] A. Juma, Y. Vahlis, Protecting Cryptographic Keys against Continual Leakage. In CRYPTO, pages 41–58 (2010)
[19] J. Katz, V. Vaikuntanathan, Signature Schemes with Bounded Leakage Resilience. In ASIACRYPT, pages 703–720 (2009)
[20] P.C. Kocher, Timing attacks on implementations of Diffie-Hellman, RSA, DSS, and other systems. In CRYPTO’96, pages 104–113 (1996)
[21] P.C. Kocher, J. Jaffe, B. Jun, Differential Power Analysis. In CRYPTO’99, pages 388–397 (1999)
[22] S. Mangard, E. Oswald, T. Popp, Power Analysis Attacks: Revealing the Secrets of Smart Cards (Advances in Information Security). Springer-Verlag New York, Inc., Secaucus, NJ, USA (2007)
[23] U.M. Maurer, S. Tessaro, A hardcore lemma for computational indistinguishability: Security amplification for arbitrarily weak prgs with optimal stretch. In Daniele Micciancio, editor, TCC, volume 5978 of Lecture Notes in Computer Science, pages 237–254. Springer (2010)
[24] S. Micali, L. Reyzin, Physically Observable Cryptography (Extended Abstract). In TCC, pages 278–296 (2004)
[25] E. Miles, E. Viola, Shielding circuits with groups. In STOC, pages 251–260 (2013)
[26] M. Naor, G. Segev, Public-key cryptosystems resilient to key leakage. In CRYPTO, pages 18–35 (2009)
[27] E. Oswald, S. Mangard, N. Pramstaller, V. Rijmen, A Side-Channel Analysis Resistant Description of the AES S-Box. In FSE, pages 413–423 (2005)
[28] E. Prouff, M. Rivain, Masking against Side-Channel Attacks: A Formal Security Proof. In Thomas Johansson and Phong Q. Nguyen, editors, EUROCRYPT, volume 7881 of Lecture Notes in Computer Science, pages 142–159. Springer (2013)
[29] E. Prouff, T. Roche, Higher-Order Glitches Free Implementation of the AES Using Secure Multi-party Computation Protocols. In CHES, pages 63–78 (2011)
[30] J.-J. Quisquater, D. Samyde, ElectroMagnetic Analysis (EMA): Measures and Counter-Measures for Smart Cards. In E-smart, pages 200–210 (2001)
[31] M. Rivain, E. Prouff, Provably Secure Higher-Order Masking of AES. In CHES, pages 413–427 (2010)
[32] G.N. Rothblum, How to Compute under AC0 Leakage without Secure Hardware. In CRYPTO, pages 552–569 (2012)
[33] F.-X. Standaert, T. Malkin, M. Yung, A Unified Framework for the Analysis of Side-Channel Key Recovery Attacks. In EUROCRYPT, pages 443–461 (2009)
[34] F.-X. Standaert, O. Pereira, Y. Yu, Leakage-Resilient Symmetric Cryptography under Empirically Verifiable Assumptions. In CRYPTO (1), pages 335–352 (2013)
[35] F.-X. Standaert, N. Veyrat-Charvillon, E. Oswald, B. Gierlichs, M. Medwed, M. Kasper, S. Mangard, The World Is Not Enough: Another Look on Second-Order DPA. In ASIACRYPT, pages 112–129 (2010)
[36] N. Veyrat-Charvillon, F.-X. Standaert, Adaptive Chosen-Message Side-Channel Attacks. In Jianying Zhou and Moti Yung, editors, ACNS, volume 6123 of Lecture Notes in Computer Science, pages 186–199 (2010)

Journal

Journal of Cryptology, Springer Journals

Published: Jun 5, 2018
