Noisy rumor spreading and plurality consensus

Error-correcting codes are efficient methods for handling noisy communication channels in the context of technological networks. However, such elaborate methods differ a lot from the unsophisticated way biological entities are supposed to communicate. Yet, it has been recently shown by Feinerman et al. (PODC 2014) that complex coordination tasks such as rumor spreading and majority consensus can plausibly be achieved in biological systems subject to noisy communication channels, where every message transferred through a channel remains intact with small probability $1/2 + \epsilon$, without using coding techniques. This result is a considerable step towards a better understanding of the way biological entities may cooperate. It has nevertheless been established only in the case of 2-valued opinions: rumor spreading aims at broadcasting a single-bit opinion to all nodes, and majority consensus aims at leading all nodes to adopt the single-bit opinion that was initially present in the system with (relative) majority. In this paper, we extend this previous work to $k$-valued opinions, for any constant $k \ge 2$. Our extension requires addressing a series of important issues, some conceptual, others technical. We had to entirely revisit the notion of noise, for handling channels carrying $k$-valued messages. In fact, we precisely characterize the type of noise patterns for which plurality consensus is solvable. Also, a key result employed in the bivalued case by Feinerman et al. is an estimate of the probability of observing the most frequent opinion from observing the mode of a small sample. We generalize this result to the multivalued case by providing a new analytical proof for the bivalued case that is amenable to be extended, by induction, and that is of independent interest.
Keywords Noise · Rumor spreading · Plurality consensus · PUSH model · Biological distributed algorithms 1 Introduction 1.1 Context and objective To guarantee reliable communication over a network in the presence of noise is the main goal of Network Infor- mation Theory [23]. Thanks to the achievements of this An extended abstract of this work appeared in Proceedings of the theory, the impact of noise can often be drastically reduced 2016 ACM Symposium on Principles of Distributed Computing (PODC’16). to almost zero by employing error-correcting codes, which P. Fraigniaud: Additional support from the ANR project are practical methods whenever dealing with artificial enti- DISPLEXITY and from the INRIA project GANG. ties. However, as observed in [26], the situation is radically E. Natale: This work has been partly done while the author was a different for scenarios in which the computational entities fellow of the Simons Institute for the Theory of Computing. are biological. Indeed, from a biological perspective, a com- B Emanuele Natale putational process can be considered “simple” only if it emanuele.natale@mpi-inf.mpg.de consists of very basic primitive operations, and is extremely Pierre Fraigniaud lightweight. As a consequence, it is unlikely that biologi- pierref@irif.fr cal entities are employing techniques like error-correcting codes to reduce the impact of noise in communications Institut de Recherche en Informatique Fondamentale, CNRS and University Paris Diderot, Paris, France between them. Yet, biological signals are subject to noise, when generated, transmitted, and received. This rises the Max Planck Institute for Informatics, Saarbrücken 66123, Germany intriguing question of how entities in biological ensem- 123 P. Fraigniaud, E. Natale bles can cooperate in presence of noisy communications, for the case of rumor spreading, the algorithm exchanges but in absence of mechanisms such as error-correcting solely opinions between nodes. 
In fact, the latter algorithm codes. for majority consensus is used as a subroutine for solving the An important step toward understanding communica- rumor-spreading problem. Note that the majority consensus tions in biological ensembles has been achieved recently algorithm of [26] requires that the nodes are initially aware in [26], which showed how it is possible to cope with of the size of A. noisy communications in absence of coding mechanisms According to [26], both algorithms are optimal, since both for solving complex tasks such as rumor-spreading and rumor-spreading and majority consensus require ( log n) majority consensus. Such a result provides highly valuable rounds w.h.p. in n-node networks. hints on how complex tasks can be achieved in frameworks Our objective is to extend the work of [26] to the natural such as the immune system, bacteria populations, or super- case of an arbitrary number of opinions, to go beyond a proof organisms of social insects, despite the presence of noisy of concept. The problem that results from this extension is an communications. instance of the plurality consensus problem in the presence In the case of rumor-spreading, [26] assumes that a source- of noise, i.e., the problem of making the system converging to node initially handles a bit, set to some binary value, called the initially most frequent opinion (i.e., the plurality opinion). Indeed, the plurality consensus problem naturally arises in the correct opinion. This opinion has to be transmitted to all nodes, in a noisy environment, modeled as a complete several biological settings, typically for choosing between network with unreliable links. More precisely, messages are different directions for a flock of birds [11], different speeds transmitted in the network according to the classical uni- for a school of fish [42], or different nesting sites for ants form push model [19,32,38] where, at each round, every [27]. 
The computation of the most frequent value has also node can send one binary opinion to a neighbor chosen uni- been observed in biological cells [15]. formly and independently at random and, before reaching the receiver, that opinion is flipped with probability at most 1.2 Our contribution −  with > 0. It is proved that, even in this very noisy setting, the rumor-spreading problem can be solved quite effi- 1.2.1 Our results ciently. Specifically, [26] provides an algorithm that solves the noisy rumor-spreading problem in O( log n) commu- We generalize the results in [26] to the setting in which an arbitrary large number k of opinions is present in the sys- nication rounds, with high probability (w.h.p.) in n-node tem. In the context of rumor spreading, the correct opinion networks, using O(log log n + log(1/)) bits of memory is a value i ∈{1,..., k}, for any constant k ≥ 2. Initially, per node . Again, this algorithm exchanges solely opinions one node supports this opinion i, and the other nodes have between nodes. no opinions. The nodes must exchange opinions so that, In the case of majority consensus, [26] assumes that some eventually, all nodes support the correct opinion i.Inthe nodes are supporting opinion 0, some nodes are supporting context of (relative) majority consensus, also known as plu- opinion 1, and some other nodes are supporting no opin- rality consensus, each node u initially supports one opinion ion. The objective is that all nodes eventually support the i ∈{1,..., k}, or has no opinion. The objective is that all initially most frequent opinion (0 or 1). More precisely, let u nodes eventually adopt the plurality opinion (i.e., the opinion A be the set of nodes with opinion, and let b ∈{0, 1} initially held by more nodes than any other, but not necessar- be the majority opinion in A.The majority bias of A is ily by an overall majority of nodes). 
As in [26], we restrict defined as (|A |−|A |)/|A| where A is the set of nodes b ¯ i 2 b ourselves to “natural” algorithms [16], that is, algorithms in with opinion i ∈{0, 1}. In the very same noisy commu- which nodes only exchange opinions in a straightforward nication model as above, [26] provides an algorithm that manner (i.e., they do not use the opinions to encode, e.g., solves the noisy majority consensus problem for |A|= part of their internal state). For both problems, the difficulty ( log n) with majority-bias ( log n/|A|). The algo- 1 comes from the fact that every opinion can be modified during rithm runs in O( log n) rounds, w.h.p., in n-node networks, its traversal of any link, and switched at random to any other using O(log log n + log(1/)) bits of memory per node. As opinion. In short, we prove that there are algorithms solving the noisy rumor spreading problem and the noisy plurality 1 c Aseriesofevents E , n ≥ 1, hold w.h.p. if Pr(E ) ≥ 1 − O(1/n ) consensus problem for multiple opinions, with the same per- n n for some c > 0. formances and probabilistic guarantees as the algorithms for We remark that, while it would be more appropriate to measure the binary opinions in [26]. space complexity by the number of states here (in accordance with other work which is concerned with minimizing it, such as population protocols [7] or cellular automata [35]), we make use of the memory bits for consistency with the main related work [26]. 123 Noisy rumor spreading and plurality consensus 1.2.2 The technical challenges show that the execution of the given protocol, on the uniform push model, can be tightly approximated with the execution Generalizing noisy rumor spreading and noisy majority con- of the same protocol on a another suitable communication sensus to more than just two opinions requires to address a model, that is not affected by the stochastic correlation that series of issues, some conceptual, others technical. affects the uniform push model. 
Conceptually, one needs first to redefine the notion of noise. In the case of binary opinions, the noise can just flip an opinion to its complement. In the case of multiple opin- 1.3 Other related work ions, an opinion i subject to a modification is switched to another opinion i , but there are many ways of picking i . By extending the work of [26], we contribute to the theoret- For instance, i can be picked uniformly at random (u.a.r.) ical understanding of how communications and interactions among all opinions. Or, i could be picked as one of the function in biological distributed systems, from an algorith- “close opinions”, say, either i + 1or i − 1 modulo k.Or, i mic perspective [2,3,6–8,13,14,34]. We refer the reader to could be “reset” to, say, i = 1. In fact, there are very many [26] for a discussion on the computational aspects of biolog- alternatives, and not all enable rumor spreading and plural- ical distributed systems, an overview of the rumor spreading ity consensus to be solved. One of our contributions is to problem in distributed computing, as well as its biological characterize noise matrices P = (p ), where p is the significance in the presence of noise. In this section, we i , j i , j probability that opinion i is switched to opinion j,for which mainly discuss the previous technical contributions from the these two problems are efficiently solvable. Similar issues literature related to the novelty of our work, that is the exten- arise for, e.g., redefining the majority bias into a plurality sion to the case of several different opinions. bias. We remark that, in the following, we say that a protocol The technical difficulties are manifold. A key ingredient solves a problem within a given time if a correct solution is of the analysis in [26] is a fine estimate of how nodes can mit- achieved with high probability within said time. 
In the context igate the impact of noise by observing the opinions of many of population protocol, the problem of achieving majority other nodes, and then considering the mode of such sample. consensus in the binary case has been solved by employ- Their proof relies on the fact that for the binary opinion case, ing a simple protocol called undecided-state dynamic [7]. given a sample of size γ , the number of 1s and 0s in the In the uniform push model, the binary majority consensus sample sum up to γ . Even for the ternary opinion case, the problem can be solved very efficiently as a consequence of a additional degree of freedom in the sample radically changes more general result about computing the median of the initial the nature of the problem, and the impact of noise is statisti- opinions [20]. Still in the uniform push model, the undecided cally far more difficult to handle. state dynamic has been analyzed in the case of an arbitrar- Also, to address the multivalued case, we had to cope ily large number of opinions, which may even be a function with the fact that, in the uniform push model, the messages of the number of agents in the system [9]. A similar result received by nodes at every round are correlated. To see why, has been obtained for another elementary protocol, so-called consider an instance of the system in which a certain opin- 3-majority dynamics, in which, at each round, each node sam- ion b is held by one node only, and there is no noise at all. ples the opinion of three random nodes, and adopts the most In one round, only one other node can receive b. It follows frequent opinion among these three [10]. The 3-majority that if a certain node u has received b, no other nodes have dynamics has also been shown to be fault-tolerant against received it. Thus, the messages each node receives are not an adversary that can change up to O( n) agents at each independent. In [25] (conference version of [26]), probabil- round [10,20]. 
Other work has analyzed the undecided-state ity concentration results are claimed for random variables dynamics in asynchronous models with a constant number of (r.v.) that depend on such messages, using Chernoff bounds. opinions [21,31,37], and the h-majority dynamics (or slight However, Chernoff bounds have been proved to hold only for variations of it) on different graph classes in the uniform push random variables whose stochastic dependence belongs to a model [1,18]. The analysis of the undecided-state dynamics very limited class (see for example [22]). In [26], it is pointed in [9] has been followed by a series of work which have out that the binary random variables on which the Chernoff used it to design optimal plurality consensus algorithms in bound is applied satisfy the property of being negatively 1- the uniform pull model [29,30]. correlated (see Section 1.7 in [26] for a formal definition). In A general result by Kempe et al. [33] shows how to com- our analysis, we show instead how to obtain concentration of pute a large class of functions in the uniform push model. probability in this dependent setting by leveraging Poisson However, their protocol requires the nodes to send slightly approximation techniques. Our approach has the following more complex messages than their sole current opinion, and advantage: instead of showing that the Chernoff bound can be its effectiveness heavily relies on a potential function argu- directly applied to the specific involved random variables, we ment that does not hold in the presence of noise. 123 P. Fraigniaud, E. Natale To the best of our knowledge, we are the first considering opinions represented by an integer in [k]={1,..., k}. Addi- the plurality consensus problem in the presence of noise. 
tionally, there may be undecided nodes that do not support any opinion, which represents nodes that are not actively aware that the system has started to solve the problem; thus, 2 Model and formal statement of our results undecided nodes are not allowed to send any message before receiving any of them. In this section we formally define the communication model, the main definitions, the investigated problems and our con- • In rumor spreading, initially, one node, called the source, tribution to them. has an opinion m ∈{1,..., k}, called the correct opin- As discussed in Sect. 1.1, intuitively we look for proto- ion. All the other nodes have no opinion. The objective cols that are simple enough to be plausible communication is to design a protocol insuring that, after a certain num- strategies for primitive biological system. We believe that the ber of communication rounds, every node has the correct computational investigation regarding biologically-feasible opinion m. protocols is still too premature for a reasonable attempt to • In plurality consensus, initially, for every i ∈{1,..., k}, provide a general formal definition of what constitutes a bio- aset A of nodes have opinion i.The sets A , i = i i logically feasible computation. Hence, in the following we 1,..., k, are pairwise disjoint, and their union does not restrict our attention solely on the biological significance of need to cover all nodes, i.e., there may be some unde- the rumor-spreading and plurality consensus problems and cided nodes with no opinion initially. The objective is the corresponding protocols that we consider. 
to design a protocol insuring that, after a certain number Regarding the problems of multivalued rumor spreading of communication rounds, every node has the plurality and plurality consensus, while for practical reasons many opinion, that is, the opinion m with relative majority in experiments on collective behavior have been designed to the initial setting (i.e., |A | > |A | for any j = m). m j investigate the binary-decision setting, the considered natural phenomena usually involve a decision among a large number Observe that the rumor-spreading problem is a special case of of different options [17]: famous examples in the literature the plurality consensus problem with |A |= 1 and |A |= 0 m j include cockroaches aggregating in a common site [5], and for any j = m. the house-hunting process of ant colonies when seeking a Following the guidelines of [26], we work under two con- new site to relocate their nest [28] or of honeybee swarms straints: when a portion of a strong colony branches from it in order to start a new one [40,41]. Therefore, it is natural to ask what 1. We restrict ourselves to protocols in which each node can trade-offs and constraints are required by the extension of the only transmit opinions, i.e., every message is an integer results in [26] to the multivalued case. in {1,..., k}. Regarding the solution we consider, as illustrated in Sects. 2. Transmissions are subject to noise, that is, for every 2.3 and 3.1, we consider a natural generalization of the round, and for every node u, if an opinion i ∈{1,..., k} protocol given in [26], which is essentially an elementary is transmitted to node u during that round, then node combination of sampling and majority operations. These ele- u will receive message j ∈{1,..., k} with probability mentary operations have extensively been observed in the k p ≥ 0, where p = 1. i , j i , j j =1 aforementioned experimental settings [17]. 
The noisy push model is the uniform push model together 2.1 Communication model and definition of the with the previous two constraints. The probabilities problems { p } can be seen as a transition matrix, called the i , j i , j ∈[k] noise matrix, and denoted by P = (p ) . The noise i , j i , j ∈[k] The communication model we consider is essentially the uni- matrix in [26]issimply form push model [19,32,38], where in each (synchronous) round each agent can send (push) a message to another agent 1 1 +  − 2 2 chosen uniformly at random. This occurs without having the P = . (1) 1 1 −  + sender or the receiver learning about each other’s identity. 2 2 Note that it may happen that several agents push a message to the same node u at the same round. In the latter case we 2.1.1 The reception of simultaneous messages assume that the nodes receive them in a random order; we discuss this assumption in detail in Sect. 2.1.1. In the uniform push model, it may happen that several agents We study the problems of rumor-spreading and plurality push a message to the same node u at the same round. In such consensus. In both cases, we assume that nodes can support cases, the model should specify whether the node receives all 123 Noisy rumor spreading and plurality consensus such messages, only one of them or neither of them. Which given by [26]. Finally, we remark that, more generally, the choice is better depends on the biological setting that is being research on solving fundamental coordination problems such modeled: if the communication between the agents of the as plurality consensus in fully-asynchronous communication system is an auditory or tactile signal, it could be more real- models such as population protocols is an active research area istic to assume that simultaneous messages to the same node [4,24]. 
We believe that obtaining analogous results to those would “collide”, and the node would not be able to grasp any provided here in a noisy version of population protocols is of them. If, on the other hand, the messages represent visual an interesting direction for future research. or chemical signals (see e.g., [10,11,27,42]), then it may be unrealistic to assume that nodes cannot receive more than 2.2 Plurality bias, and majority preservation one of such messages at the same round and besides, by a standard balls-into-bins argument (e.g., by applying Lemma When time proceeds, our protocols will result in the propor- 3), it follows that in the uniform push model at each round tion of nodes with a given opinion to evolve. Note that there no node receives more than O(log n) messages w.h.p. In this might be nodes who do not support any opinion at time t. work we thus consider the model in which all messages are As mentioned in the previous section, we call such nodes (t ) received, also because such assumption allows us to obtain undecided. We denote by a the fraction of nodes support- simpler proofs than the other variants. We finally note that ing any opinion at time t and we call the nodes contributing (t ) our protocol does not strictly need such assumption, since to a opinionated. Consequently, the fraction of undecided (t ) (t ) it only requires the nodes to collect a small random sample nodes at time t is 1 − a .Let c be the fraction of opin- of the received messages. 
However, since we look at the lat- ionated nodes in the system that support opinion i ∈[k] at (t ) (t ) (t ) ter feature as a consequence of active choices of the nodes the beginning of round t, so that c = a .Let cˆ i ∈[k] i i rather than some inherent property of the environment, we be the fraction of opinionated nodes which receive at least avoid to weaken the model to the point that it matches the one message at time t − 1 and support opinion i ∈[k] at (t ) (t ) (t ) requirements of the protocol. the beginning of round t. We write c = (c ,..., c ) to 1 k denote the opinion distribution of the opinions at time t.Sim- (t ) (t ) (t ) 2.1.2 On the role of synchronicity in the result ilarly, let c ˆ = (cˆ ,..., cˆ ). In particular, if every node 1 k would simply switch to the last opinion it received, then An important aspect of many natural biological computations (t +1) (t ) is their tolerance with respect to a high level of asynchrony. E[ˆc | c ] Following [26], in this work we tackle the noisy rumor- = Pr[received i | original message is j ] spreading and plurality consensus problems by assuming that j ∈[k] agents are provided a shared clock, which they can employ · Pr[original message is j ] to synchronize their behavior across different phases of a (t ) protocol. In [26, Section 3], it is shown how substantially = c · p . j ,i relax the previous assumption by assuming, instead, that a j ∈[k] source agent can broadcast a starting signal to the rest of That is, the system to initiate the execution of the protocol. That is, asimple rumor-spreading procedure is employed to awake (t +1) (t ) (t ) the agents which join the system asynchronously, and it is E[c ˆ | c ]= c · P, (2) shown that with high probability the level of asynchrony (i.e. the largest difference among the agents’ estimates of the time where P is the noise matrix. 
In particular, in the absence of since the start of the execution of the protocol), is logarith- noise, we have P = I (the identity matrix), and if every node mically bounded with high probability. It thus follows that would simply copy the opinion that it just received, we had (t +1) (t ) (t ) their results can be generalized to the setting in which source E[c ˆ | c ]= c . So, given the opinion distribution at agents initiate the execution of the protocol by waking up the round t, from the definition of the model it follows that the rest of the system, with only a logarithmic overhead factor messages each node receives at round t + 1 can equivalently in the running time. We defer the reader to [26, Section 3] be seen as being sent from a system without noise, but whose (t ) for formal details regarding this synchronization procedure. opinion distribution at round t is c · P. Our generalization of the results in [26] is independent from Recall that m denotes the initially correct opinion, that any aspect which concerns the aforementioned procedure. is, the source’s opinion in the rumor-spreading problem, and Hence, their relaxation holds for our results as well, with the the initial plurality opinion in the plurality consensus prob- same log n additional factor in the running time. It is an open lem. The following definition naturally extends the concept problem to obtain a simple procedure to wake up the system of majority bias in [26]to plurality bias. while incurring a smaller overhead than the logarithmic one 123 P. Fraigniaud, E. Natale Definition 1 Let δ> 0. An opinion distribution c is said to Eq. (1)is -majority-biased. Note that, as in [26], the plurality be δ-biased toward opinion mif c − c ≥ δ for all i = m. consensus algorithm requires the nodes to known the size |S| m i of the set S of opinionated nodes. 
In [26], each binary opinion that is transmitted between two nodes is flipped with probability at most − , with  = − +η n for an arbitrarily small η> 0. Thus, the noise is 3 The analysis parametrized by . The smaller , the more noisy are the communications. We generalize the role of this parameter In this section we prove Theorems 1 and 2 by generalizing with the following definition. the analysis of Stage 1 given in [26] and by providing a new analysis of Stage 2. Note that the proof techniques required Definition 2 Let  = (n) and δ = δ(n) be positive. A noise for the generalization to arbitrary k significantly depart from matrix P is said to be (, δ)-majority-preserving ((, δ)-m.p.) thosein[26] for the case k = 2. In particular, our approach with respect to opinion m if, for every opinion distribution provides a general framework to rigorously deal with many c that is δ-biased toward opinion m, we have (c · P) − kind of stochastic dependences among messages in the uni- (c · P) > δ for all i = m. form push model (Fig. 1). In the rumor-spreading problem, as well as in the plurality 3.1 Definition of the protocol consensus problem, when we say that a noise matrix is (, δ)- m.p., we implicitly mean that it is (, δ)-m.p. with respect to the initially correct opinion. Because of the space constraints, We describe a rumor spreading protocol performing in two we defer the discussion on the class of (, δ)-m.p. noise matri- stages. Each stage is decomposed into a number of phases, ces in Sect. 4 (including its tightness w.r.t. Theorems 1 and each one decomposed into a number of rounds. During each 2). phase of the two stages, the nodes apply the simple rules given below. 
2.3 Our formal results 3.1.1 The rule during each phase of Stage 1 We show that a natural generalization of the protocol in [26] solves the rumor spreading problem and the plurality con- Nodes that already support some opinion at the beginning sensus problem for an arbitrary number of opinions k.More of the phase push their opinion at each round of the phase. precisely, using the protocol which we describe in Sect. 3.1, Nodes that do not support any opinion at the beginning of we can establish the following two results, whose proof can the phase but receive at least one opinion during the phase be found in Sect. 3. start supporting an opinion at the end of the phase, chosen u.a.r. (counting multiplicities) from the received opinions. Theorem 1 Assume that the noise matrix P is (, δ)-m.p. In other words, each node tries to acquire an opinion during − +η with  = (n ) for an arbitrarily small constant η> 0 each phase of Stage 1, and, as it eventually receives some and δ = ( log n/n). The noisy rumor-spreading problem opinions, it starts supporting one of them (chosen u.a.r.) from log n with k opinions can be solved in O( ) communication the beginning of the next phase. In particular, opinionated rounds, w.h.p., by a protocol using O(log log n + log ) bits nodes never change their opinion during the entire stage. of memory at each node. More formally, let φ, β, and s be three constants satisfying φ> β > s. The rounds of Stage 1 are grouped in T +2 phases Theorem 2 Let S with |S|= ( log n) be an initial set of 2 2 with T =log(n/(2s/ log n))/ log(β/ + 1). Phase 0 nodes with opinions in [k], the rest of the nodes having no 2 2 takes s/ log n rounds, phase T +1 takes φ/ log n rounds, opinions. Assume that the noise matrix P is (, δ)-m.p. for 2 and each phase j with 1 ≤ j ≤ T takes β/ rounds. We some > 0, and that S is ( log n/|S|)-majority-biased. denote with τ the end of the last round of phase j. 
The noisy plurality consensus problem with k opinions can Let t be thefirst timeinwhich u receives any opinion log n be solved in O( ) communication rounds, w.h.p., by a since the beginning of the protocol (with t = 0for the protocol using O(log log n + log ) bits of memory at each source). Let j be the phase of t , and let val(u) be an opinion u u node. chosen u.a.r. by u among those that it receives during phase For k = 2, we get the theorems in [26] from the above two Note that, in the protocol considered in [26], the choice of each node’s theorems. Indeed, the simple 2-dimensional noise matrix of new opinion in both stages is based on the first messages received. In [26], in order to relax the synchronicity assumption that nodes share a For a discussion on what happens for other values of , see “Appendix common clock, they adopt the same sample-based variant of the rule C”. that we adopt here. 123 Noisy rumor spreading and plurality consensus Fig. 1 Diagrams of dependencies among the different parts of the analysis. Each box represents a statement proven in the analysis, and an arrow between two boxes u and v signifies that the statement of box u is employed in the proof of box v j . During the first stage of the protocol each node applies {1,..., k}}. We then define maj(A) as the most frequent value the following rule. in A (breaking ties u.a.r.), i.e., maj(A) is the r.v. on {1,..., k} such that Pr(maj(A) = i ) = 1 /|mode(A)|. Let {i ∈mode(A)} Rule of Stage 1. Each opinionated node u pushes opin- R (u) be the multiset of messages received by node u during ion val(u) during each round of every phase j = j + phase j. During the second stage of the protocol each node 1,..., T + 1. applies the following rule. Rule of Stage 2. 
During each phase j of length 2L of 3.1.2 The rule during each phase of Stage 2 Stage 2 (L = or ), each node u pushes its current opinion at each round of the phase, and starts drawing a During each phase of Stage 2, every node pushes its opinion random uniform sample S(u) of size L from R (u).Pro- at each round of the phase. At the end of the phase, each node vided |R (u)|≥ L, at the end of the phase u changes its that received “enough” opinions takes a random sample of opinion to maj(S(u)). them, and starts supporting the most frequent opinion in that sample (breaking ties u.a.r.). Let us remark that the reason we require the use of sam- More formally, the rounds of stage 2 are divided in pling in the previous rule is that at a given round a node may T + 1 phases with T = log( n/ log n) . Each phase j, receive much more messages than 2L. Thus, if the nodes were 0 ≤ j ≤ T − 1, has length 2 with c/ for some to collect all the messages they receive, some of them would large-enough constant c > 0, and phase T has length 2 −2 need much more memory than our protocol does. Finally, with = O( log n). For any finite multiset A of ele- observe that overall both stages 1 and 2 take O( log n) ments in {1,..., k}, and any i ∈{1,..., k}, let occ(i , A) be 2 rounds. the number of occurrences of i in A, and let mode(A) = {i ∈{1,..., k}| occ(i , A) ≥ occ( j , A) for every j ∈ 3.2 Pushing colored balls-into-bins Note that, in order to sample u.a.r. one of them, u does not need to Before delving into the analysis of the protocol, we provide a collect all the opinions it receives. A natural sampling strategy such as reservoir sampling can be used. framework to rigorously deal with the stochastic dependence 123 P. Fraigniaud, E. Natale that arises between messages in the uniform push model. Let 2. each ball/message ends up in the same bin/node. process O be the process that results from the execution of the protocol of Sect. 3.1 in the uniform push model. 
In order to apply concentration of probability results that require the involved random variables to be independent, we view the messages as balls, and the nodes as bins, and employ Poisson approximation techniques. More specifically, during each phase j of the protocol, let $M_j$ be the set of messages that are sent to random nodes, and $N_j$ be the set of messages sent after the noise has acted on them. (In other words, $N_j = \bigcup_u R_j(u)$.) We prove that, at the end of phase j, we can equivalently assume that all the messages $M_j$ have been sent to the nodes according to the following process.

Definition 3 The balls-into-bins process B associated to phase j is the two-step process in which the nodes represent bins and all messages sent in the phase represent colored balls, with each color corresponding to some opinion. Initially, balls are colored according to $M_j$. At the first step, each ball of color $i \in \{1,\dots,k\}$ is re-colored with color $j \in \{1,\dots,k\}$ with probability $p_{i,j}$, independently of the other balls. At the second step all balls are thrown into the bins u.a.r. as in a balls-into-bins experiment.

Claim 1 Given the opinion distribution and the number of active nodes at the beginning of phase j, the probability distribution of the opinion distribution and the number of active nodes at the end of phase j in process O is the same as if the messages were sent according to process B.

It is not hard to see that Claim 1 holds in the case of a single round. For more than one round, it is crucial to observe that the way each node u acts in the protocol depends only on the received messages $R_j(u)$, regardless of the order in which these messages are received. As an example, consider the opinion distribution in which one node has opinion 1, one other node has opinion 2, and all other nodes have opinion 3. Suppose that each node pushes its opinion for two consecutive rounds. Since, at each round, exactly one opinion 1 and exactly one opinion 2 are pushed, no node can receive two 1s during the first round and then two 2s during the second round, i.e., no node can possibly receive the sequence of messages "1,1,2,2" in this exact order. Instead, in process B such a sequence is possible.

Proof of Claim 1 In both process B and process O, at each round, the noise acts independently on each ball/message of a given color/opinion, according to the same probability distribution for that color/opinion. Then, in both processes, each ball/message is sent to some bin/node chosen u.a.r. and independently of the other balls/messages. Indeed, we can couple process B and process O by requiring that:

1. each ball/message is changed by the noise to the same color/value, and
2. each ball/message ends up in the same bin/node.

Thus, the joint probability distribution of the sets $\{R_j(u)\}_{u\in[n]}$ in process O is the same as the one given by process B.

Observe also that, from the definition of the protocol (see the rule of Stage 1 and Stage 2 in Sect. 3.1), it follows that each node's action depends only on the set $R_j(u)$ of received messages at the end of each phase j, and does not depend on any further information such as the actual order in which the messages are received during the phase.

Summing up the two previous observations, we get that if, at the end of each phase j, we generate the $R_j(u)$s according to process B, and we let the protocol execute according to them, then we indeed get the same stochastic process as process O.

Now, one key ingredient in our proof is to approximate process B using the following process P.

Definition 4 Given $N_j$, process P associated to phase j is the one-shot process in which each node receives a number of opinions i that is a random variable with distribution $\mathrm{Poisson}(h_i/n)$, where $h_i$ is the number of messages in $N_j$ carrying opinion i, and each Poisson random variable is independent of the others.

Now we provide some results from the theory of Poisson approximation for balls-into-bins experiments that are used in Sect. 3.2. For a nice introduction to the topic, we refer to [36].

Lemma 1 Let $\{X_j\}_{j\in[\tilde n]}$ be independent r.v. such that $X_j \sim \mathrm{Poisson}(\lambda_j)$. The vector $(X_1,\dots,X_{\tilde n})$ conditional on $\sum_j X_j = \tilde m$ follows a multinomial distribution with $\tilde m$ trials and probabilities $\left(\lambda_1/\sum_j \lambda_j, \dots, \lambda_{\tilde n}/\sum_j \lambda_j\right)$.

Lemma 2 Consider a balls-into-bins experiment in which h colored balls are thrown in n bins, where $h_i$ balls have color i with $i \in \{1,\dots,k\}$ and $\sum_i h_i = h$. Let $\{X_{u,i}\}_{u\in\{1,\dots,n\},\, i\in\{1,\dots,k\}}$ be the number of i-colored balls that end up in bin u, let $f(x_{1,1},\dots,x_{n,1},x_{n,2},\dots,x_{n,k},z_1,\dots,z_n)$ be a non-negative function with positive integer arguments $x_{1,1},\dots,x_{n,1},x_{n,2},\dots,x_{n,k},z_1,\dots,z_n$, let $\{Y_{u,i}\}_{u\in\{1,\dots,n\},\, i\in\{1,\dots,k\}}$ be independent r.v. such that $Y_{u,i} \sim \mathrm{Poisson}(h_i/n)$ and let $Z_1,\dots,Z_n$ be integer valued r.v. independent from the $X_{u,i}$s and $Y_{u,i}$s. Then

\[ E\left[f(X_{1,1},\dots,X_{n,1},X_{n,2},\dots,X_{n,k},Z_1,\dots,Z_n)\right] \le e^k \sqrt{\textstyle\prod_i h_i}\; E\left[f(Y_{1,1},\dots,Y_{n,1},Y_{n,2},\dots,Y_{n,k},Z_1,\dots,Z_n)\right]. \]

Proof To simplify notation, let $\bar Z = (Z_1,\dots,Z_n)$, $\bar X = (X_{1,1},\dots,X_{n,1},X_{n,2},\dots,X_{n,k})$, $\bar Y = (Y_{1,1},\dots,Y_{n,1},Y_{n,2},\dots,Y_{n,k})$, $\tilde Y = \left(\sum_{u=1}^n Y_{u,1},\dots,\sum_{u=1}^n Y_{u,k}\right)$, $\lambda_i = h_i$, $\bar\lambda = (\lambda_1,\dots,\lambda_k)$ and finally $\bar x = (x_1,\dots,x_k)$ for any $x_1,\dots,x_k$. Observe that, while $X_{u,i}$ and $X_{v,i}$ are clearly dependent, $X_{u,i}$ and $X_{v,j}$ with $i \ne j$ are stochastically independent (even if u = v). Indeed, the distribution of the r.v. $\{X_{u,i}\}_{u\in\{1,\dots,n\}}$ for each fixed i is multinomial with $\lambda_i$ trials and uniform distribution on the bins. Thus, from Lemma 1 we have that $\{X_{u,i}\}_{u\in\{1,\dots,n\}}$ are distributed as $\{Y_{u,i}\}_{u\in\{1,\dots,n\}}$ conditional on $\sum_{u=1}^n Y_{u,i} = \lambda_i$, that is

\[ E\left[f(\bar Y,\bar Z) \,\middle|\, \sum_{u=1}^n Y_{u,1} = \lambda_1,\dots,\sum_{u=1}^n Y_{u,k} = \lambda_k\right] = E\left[f(\bar X,\bar Z)\right]. \]

Therefore, we have

\[
\begin{aligned}
E\left[f(\bar Y,\bar Z)\right]
&= \sum_{\bar x:\, x_1,\dots,x_k \ge 0} E\left[f(\bar Y,\bar Z) \mid \tilde Y = \bar x\right] \Pr(\tilde Y = \bar x) \\
&\ge E\left[f(\bar Y,\bar Z) \mid \tilde Y = \bar\lambda\right] \Pr(\tilde Y = \bar\lambda)
 = E\left[f(\bar X,\bar Z)\right] \Pr(\tilde Y = \bar\lambda) \\
&= E\left[f(\bar X,\bar Z)\right] \prod_i \frac{e^{-h_i} h_i^{h_i}}{h_i!}
 \ge \frac{E\left[f(\bar X,\bar Z)\right]}{e^k \sqrt{\prod_i h_i}},
\end{aligned}
\]

where, in the last inequality, we use that, by Stirling's approximation, $a! \le e\sqrt{a}\,(a/e)^a$ for any $a > 0$.

From Lemmas 1 and 2, we get the following general result which says that if a generic event E holds w.h.p. in process P, it also holds w.h.p. in process O.

Lemma 3 Given the opinion distribution and the number of active nodes at the beginning of a fixed phase j, let E be an event that, at the end of that phase, holds with probability at least $1 - n^{-b}$ in process P, for some $b > (k\log h)/(2\log n)$ with $h = \sum_i h_i$. Then, at the end of phase j, E holds w.h.p. also in process O.

(Footnote: Note that, if $N_j$ is not yet fixed, the parameters $h_i$ of process P associated to phase j are random variables. However, if the opinion distribution and the number of active nodes at the beginning of phase j are given, then $\sum_i h_i = h = |N_j| = |M_j|$ is fixed.)

Proof Thanks to Claim 1, it suffices to prove that, at the end of phase j, E holds w.h.p. in process B.

Let $\bar E$ be the complementary event of E. Let $h = |M_j|$ be the number of balls that are thrown in process B associated to phase j, where $h_i$ balls have color i with $i \in \{1,\dots,k\}$ and $\sum_i h_i = h$. Let $\{X_{u,i}\}_{u\in\{1,\dots,n\},\, i\in\{1,\dots,k\}}$ be the number of i-colored balls that end up in bin u, let $\{Y_{u,i}\}_{u\in\{1,\dots,n\},\, i\in\{1,\dots,k\}}$ be the independent r.v. of process P such that $Y_{u,i} \sim \mathrm{Poisson}(h_i/n)$ and let $Z_1,\dots,Z_n$ be integer valued r.v. independent from the $X_{u,i}$s and $Y_{u,i}$s.

Fix any realization of $N_j$, i.e. any re-coloring of the balls in the first step of process B. By choosing f in Lemma 2 as the binary r.v. indicating whether event $\bar E$ has occurred, where $\bar E$ is a function of the r.v. $X_{1,1},\dots,X_{n,1},X_{n,2},\dots,X_{n,k},Z_1,\dots,Z_n$, we get

\[ \Pr\left(\bar E(X_{1,1},\dots,X_{n,k},Z_1,\dots,Z_n) \mid N_j\right) \le e^k \sqrt{\textstyle\prod_i h_i}\; \Pr\left(\bar E(Y_{1,1},\dots,Y_{n,k},Z_1,\dots,Z_n) \mid N_j\right). \tag{3} \]

Thus, from Eq. (3) and the inequality of arithmetic and geometric means, we get

\[
\begin{aligned}
\Pr\left(\bar E(X_{1,1},\dots,X_{n,k},Z_1,\dots,Z_n) \mid N_j\right)
&\le e^k \sqrt{\textstyle\prod_i h_i}\; \Pr\left(\bar E(Y_{1,1},\dots,Y_{n,k},Z_1,\dots,Z_n) \mid N_j\right) \\
&\le e^k \left(\frac{h}{k}\right)^{k/2} \Pr\left(\bar E(Y_{1,1},\dots,Y_{n,k},Z_1,\dots,Z_n) \mid N_j\right).
\end{aligned}
\]

Finally, let $\mathcal N$ be the set of all possible realizations of $N_j$. By the law of total probability over $\mathcal N$, we get that

\[
\begin{aligned}
\Pr\left(\bar E(X_{1,1},\dots,X_{n,k},Z_1,\dots,Z_n)\right)
&= \sum_{s\in\mathcal N} \Pr\left(\bar E(X_{1,1},\dots,X_{n,k},Z_1,\dots,Z_n) \mid N_j = s\right) \Pr(N_j = s) \\
&\le e^k \left(\frac{h}{k}\right)^{k/2} \sum_{s\in\mathcal N} \Pr\left(\bar E(Y_{1,1},\dots,Y_{n,k},Z_1,\dots,Z_n) \mid N_j = s\right) \Pr(N_j = s) \\
&\le e^k \left(\frac{h}{k}\right)^{k/2} \Pr\left(\bar E(Y_{1,1},\dots,Y_{n,k},Z_1,\dots,Z_n)\right) \\
&\le \frac{e^k}{k^{k/2}}\, h^{k/2}\, n^{-b}
 \le \frac{e^k}{e^{\frac{k}{2}\log k}}\, e^{\frac{k}{2}\log h}\, e^{-b\log n} \\
&\le e^{\,k - \frac{k}{2}\log k + \frac{k}{2}\log h - b\log n} \le n^{-\Omega(1)},
\end{aligned}
\]

where in the last line we used the hypotheses on the probability of E.

We now analyze the two stages of our protocol, starting with Stage 1. Note that, in the following two sections, the statements about the evolution of the process refer to process O.

3.3 Stage 1

The rule of Stage 1 is aimed at guaranteeing that, w.h.p., the system reaches a target opinion distribution from which the rumor-spreading problem becomes an instance of the plurality consensus problem. More precisely, we have the following.

Lemma 4 Stage 1 takes $O(\varepsilon^{-2}\log n)$ rounds, after which w.h.p.

Lemma 7 W.h.p., at the end of each phase j of Stage 1, we have an $(\varepsilon/2)^j$-biased opinion distribution.

Proof We prove the lemma by induction on the phase number. The case j = 1 is a direct application of Lemma 16 to $c_m^{(\tau_1)} - c_i^{(\tau_1)}$ ($i \ne m$), where the number of opinionated nodes is given by Claim 2, and where the independence of the r.v. follows from the fact that each node that becomes opinionated in the first phase has necessarily received the messages from the source-node. Now, suppose that the lemma holds
all nodes are active and c is δ-biased toward up to phase j − 1 ≤ T.Let S = {u| j = j } be the set j u the correct opinion, with δ = ( log n/n). of nodes that become opinionated during phase j. Recall Proof The fact that an undecided node becomes opinionated the definition of M and N from Sect. 3.2, and observe j j (τ ) j −1 during a phase only depends on whether it gets a message that M = N = τ − τ n · a , and that the j j j j −1 (τ ) during that phase, regardless of the value of such messages. j −1 number of times opinion i occurs in M is M c .Let j j (τ ) T +1 Hence, the proof that, w.h.p., a = 1 is reduced to the us identify each message in M with a distinct number in analysis of the rule of Stage 1 as an information spreading 1,..., M , and let {X (i )} be the binary r.v. j w w∈ 1,..., M { | |} process. First, by carefully exploiting the Chernoff bound such that X (i ) = 1 if and only if w is i after the action of the and Lemma 3, we can establish Claims 2 and 3 below: |N | noise. The frequency of opinion i in N is X (i ). j w w=1 | j | Claim 2 W.h.p., at the end of phase 0, we have s/ log n/3n Thanks to Lemma 3, it suffices to prove the lemma for pro- (τ ) 2 ≤ a ≤ s/ log n/n. cess P. By definition, in process P, for each i, the number of messages with opinion i that each node receives conditional Claim 3 W.h.p., at the end of phase j,1 ≤ j ≤ T,wehave | j | on N follows a Poisson ( X (i )) distribution. Each j w n w=1 node u that becomes opinionated during phase j gets at least 2 j (τ ) (τ ) 2 j (τ ) 0 j 0 (β/ + 1) a /8 ≤ a ≤ (β/ + 1) a . one message during the phase. Thus, from Lemma 1,the probability that u gets opinion i conditional on N is Proof of Claims 2 and 3 The probability that, in the process O, an undecided node becomes opinionated at the end of |N | |N | j 1 h j phase j is 1−(1− ) where h is the number of messages sent X (i ) 1 n w=1 = X (i ) . n | | k j during that phase. In process P, this probability is 1 − e . 
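As a numerical illustration of the two probabilities in the proof of Claims 2 and 3, the following sketch compares the chance that a fixed undecided node receives at least one of h messages in the push model, read here as $1 - (1 - 1/n)^h$, with its counterpart $1 - e^{-h/n}$ in process P. The function names are hypothetical, and the upper bound $1 - e^{-h/(n-1)}$ used in the check is the elementary estimate $1 - 1/n \ge e^{-1/(n-1)}$, stated here as an assumption of the sketch.

```python
import math


def prob_opinionated_push(n: int, h: int) -> float:
    """Probability that a fixed node receives at least one of h messages sent u.a.r. to n nodes."""
    return 1.0 - (1.0 - 1.0 / n) ** h


def prob_opinionated_poisson(n: int, h: int) -> float:
    """Same event when the number of received messages is Poisson(h/n)."""
    return -math.expm1(-h / n)  # equals 1 - exp(-h/n), computed stably


def sandwich_holds(n: int, h: int, tol: float = 1e-12) -> bool:
    """Check 1 - e^{-h/n} <= 1 - (1 - 1/n)^h <= 1 - e^{-h/(n-1)} up to rounding."""
    lower = prob_opinionated_poisson(n, h)
    middle = prob_opinionated_push(n, h)
    upper = -math.expm1(-h / (n - 1))
    return lower <= middle + tol and middle <= upper + tol
```

For large n the two probabilities are very close, which is why the calculations for process P carry over with only minor changes.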
N X (i ) w=1 x w i =1 w=1 1+x By using that e ≤ 1 + x ≤ e for |x | < 1 we see that − h n n−1 Since opinionated nodes never change opinion during Stage 1 − e ≤ 1 − (1 − 1/n) ≤ 1 − e . Thus, we can prove (τ ) Claims 2 and 3 for process P by repeating essentially the same 1, the bias of c is at least the minimum between the bias (τ ) j −1 of c and the bias among the newly opinionated nodes calculations as in the proofs of Claims 2.2 and 2.4 in [26]. . Hence, we can apply the Chernoff bound to the nodes Since the Poisson distributions in process P are independent, in S we can apply the Chernoff bound as claimed in [26]. Finally, in S to prove that the bias at the end of phase j is, w.h.p. , we can prove that the statements hold also for process O τ τ ( j ) ( j ) thanks to Lemma 2. Pr c − c N m j ⎛ ⎞ |N | |N | From the previous two claims, and by the definition of T j j 1 1 ⎝ ⎠ ˜ we get the following. ≥ X (m) − X (i ) 1 − δ , w w j N N j j w=1 w=1 (τ ) T +1 Lemma 5 W.h.p., at the end of phase T , we have a = (4) 2 T (τ ) 2 ((β/ + 1) a ) = ( ). where δ = O( log n/|S |). j j Finally, from Lemma 5, an application of the Chernoff Moreover, note that bound gives us the following. ⎡  ⎤ | j | Lemma 6 W.h.p., at the end of Stage 1, all nodes are opin- τ τ τ ( j −1) ( j −1) ( j −1) ⎣  ⎦ E X (i ) c , a = c · P . ionated. w w=1 (τ +1) As for the fact that, w.h.p., c is a δ-biased opinion τ τ ( j ) ( j ) distribution with δ = ( log n/n), we can prove the fol- We remark that Eq. (4) concerns the value of Pr(c − c |N ), m j lowing. which is a random variable. 123 Noisy rumor spreading and plurality consensus (τ ) (τ ) j −1 j −1 Furthermore, (conditional on c and a )the r.v. where {X (i )} are independent. Thus, for each i = m, ⎧ w∈{1,...,|N |} −1 ⎨ 2 2 √ δ(1 − δ ) if δ< , from Claim 3, and by applying the Chernoff bound on g (δ, ) = |N | |N | √ √ −1 j j X (m), and on X (i ), we get that w.h.p. 2 √ 1/ (1 − 1/ ) if δ ≥ . w w w=1 w=1 |N | |N | j j First, we prove Eq. (6)for k = 2. 
We then obtain the 1 1 − j +1 j general case by induction. The proof for k = 2 is based on a X (m) − X (i ) ≥ 1 − δ 2  , w w j N N j j w=1 w=1 known relation between the cumulative distribution function (5) of the binomial distribution, and the cumulative distribution function of the beta distribution. This relation is given by the where δ = O( log n/|N |). following lemma. j j From Claims 2 and 3, it follows that δ ,δ ≤ w.h.p. j j Lemma 8 Given p ∈ (0, 1) and 0 ≤ j ≤ it holds Thus by putting together Eqs. (4) and (5) via the chain rule, we get that, w.h.p., −i p (1 − p) τ τ j <i ≤ ( j ) ( j ) − j +1 j c − c ≥ 1 − δ 1 − δ 2  ≥ . j j m $ − j −1 = ( j + 1) z (1 − z) dz. j + 1 T +2 Lemma 7 implies that, w.h.p., we get a bias  = Proof By integrating by parts, for j < − 1wehave ( log n/n) at the end of Stage 1, which completes the proof of Lemma 4. − j −1 ( j + 1) z (1 − z) dz j + 1 3.4 Stage 2 j +1 − j −1 = p (1 − p) j + 1 As proved in the previous section, w.h.p., all nodes are j +1 − j −2 opinionated at the end of Stage 1, and the final opinion − ( − j − 1) z (1 − z) dz j + 1 distribution is ( log n/n)-biased. Now, we have that the rumor-spreading problem is reduced to an instance of the j +1 − j −1 = p (1 − p) j + 1 plurality consensus problem. The purpose of Stage 2 is to progressively amplify the initial bias until all nodes support j +1 − j −2 − ( j + 2) z (1 − z) dz, (7) the plurality opinion, i.e., the opinion originally held by the j + 2 source node. where, in the last equality, we used the identity During the first T phases, it is not hard to see that, by taking c large enough, a fraction arbitrarily close to 1 of the − j ) = ( j + 1) . nodes receives at least messages, w.h.p. Each node u in j j + 1 such fraction changes its opinion at the end of the phase. With a slight abuse of notation, let maj (u) = maj (S(u)) be Note that when j = − 1, Eq. (7) becomes u’s new opinion based on the =|S(u)| randomly sampled −1 received messages. 
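The relation between the binomial tail and the beta-type integral stated in Lemma 8, read here as $\sum_{j < i \le \ell} \binom{\ell}{i} p^i (1-p)^{\ell-i} = \binom{\ell}{j+1}(j+1)\int_0^p z^j (1-z)^{\ell-j-1}\,dz$, can be sanity-checked numerically. The sketch below is only an illustration: the function names are hypothetical and the integral is approximated with a composite Simpson rule rather than evaluated in closed form.

```python
from math import comb


def binomial_upper_tail(l: int, j: int, p: float) -> float:
    """Sum over j < i <= l of C(l, i) p^i (1 - p)^(l - i)."""
    return sum(comb(l, i) * p**i * (1 - p) ** (l - i) for i in range(j + 1, l + 1))


def beta_integral_form(l: int, j: int, p: float, steps: int = 20000) -> float:
    """C(l, j+1) (j+1) * integral_0^p z^j (1 - z)^(l - j - 1) dz, via composite Simpson (steps must be even)."""
    def f(z: float) -> float:
        return z**j * (1 - z) ** (l - j - 1)

    width = p / steps
    total = f(0.0) + f(p)
    for s in range(1, steps):
        total += (4 if s % 2 else 2) * f(s * width)
    return comb(l, j + 1) * (j + 1) * total * width / 3
```

Calling both functions with the same arguments, for example `(9, 4, 0.6)`, should give values that agree up to the integration error.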
We show that, w.h.p., these new opin- p = z dz. ions increase the bias of the opinion distribution toward the 0 plurality opinion by a constant factor > 1. Hence, we can unroll the recurrence given by Eq. (7)to For the sake of simplicity, we assume that is odd (see obtain Appendix B for details on how to remove this assumption). − j −1 ( j + 1) z (1 − z) dz Proposition 1 Suppose that, at the beginning of phase j of j + 1 Stage 2 with 0 ≤ j ≤ T − 1, the opinion distribution is p −i −1 = p (1 − p) + z dz δ-biased toward m. In process P, if a node u changes its j <i ≤ −1 opinion at the end of the phase, then, for any i = m, we have −i = p (1 − p) g(δ, j <i ≤ Pr maj (u) = m − Pr maj (u) = i ≥ , (k−2) ln 4 π e (6) concluding the proof. 123 P. Fraigniaud, E. Natale Lemma 8 allows us to express the survival function of a Thus, for any y ∈ (−p + p , p − p ) we have 1 2 1 2 binomial sample as an integral. Thanks to it, we can prove + , + , p −p $ 1 2 Proposition 1 when k = 2. 2 1 1 − y − t dt ≥ y . (8) Lemma 9 Let c = (c , c ) be a δ-biased opinion distri- p −p 1 2 1 2 4 4 bution during Stage 2. In process P, for any node u, we have Pr maj (u) = m −Pr maj (u) = 3 − m ≥ 2 /π · The r.h.s. of Eq. (8) is maximized w.r.t. y ∈ (−p + p , p − 1 2 1 g (δ, ) . p ) when Proof Without loss of generality, let m = 1. Let X be a ⎧ ⎫ 3 4 ⎨ ⎬ ) ( 1 1 r.v. with distribution Bin( , p ), and let X = − X .By 2 1 y = min p − p , - = min p − p , √ . 1 2 1 2 . / using Lemma 8, we get ⎩ ⎭ 2 + 1 Pr maj (u) = 1 − Pr maj (u) = 2 Hence, for p − p < , we get 1 2 ) ( ) ( ) ( = Pr X > X − Pr X > X 1 2 2 1 + , p −p $ 1 2 −i −i i 2 2 1 = p p − p p 2 1 2 2 1 − t dt % & i % & i p −p 1 2 ≤i ≤ ≤i ≤ 2 2 + , 1 − (p − p ) −i 1 2 = p (1 − p ) ≥ (p − p ) 1 2 % & i ≤i ≤ −1 +1 2 = 2 (p − p ) 1 − (p − p ) −i i 1 2 1 2 − p (1 − p ) % & = 2 g (p − p , ) . ≤i ≤ 1 2 ) *$ + , + , 2 2 ' ( For p − p ≥ we get = z (1 − z) dz 1 2 + , + , + , p −p 1 2 2 2 − z (1 − z) dz . 
2 0 − t dt p −p 1 2 p −p 1 1 2 1 −1 By setting t = z − , and rewriting p = + and 1 − 2 2 2 2 1 p − p 2 1 1 ≥ √ 1 − = 2 g (p − p , ) . 1 2 p = + we obtain 2 2 Pr maj (u) = 1 − Pr maj (u) = 2 By using the fact that g is a non-decreasing function w.r.t. its + , + , ) *$ first argument, we obtain 2 2 ' ( = z (1 − z) dz $ + , + , p Pr maj (u) = 1 − Pr maj (u) = 2 2 2 + , − z (1 − z) dz p −p ) *$ 1 2 0 2 ⎛ + , ' ( = − t dt p −p ) * $ 1 2 p −p 1 2 2 4 2 − 1 2 ' (  ) * = − t dt 2 4 − − ' ( ≥ 2 g (p − p , 1 2 + , ⎞ 2 p −p 1 2 ) * ⎠ − − − t dt ' ( ≥ 2 g (δ, ) . − 2 + , ) * p −p $ 1 2 2r 1 2 2r 1 2 2 √ 9r Finally, by using the bounds ≥ e (see Lemma 13), ' ( = − t dt . r πr p −p 2 1 2 4 − x 8 and e ≥ 1 − x together with the identity y y p −p p −p  ) * 1 2 1 2 For any t ∈ (− , ) ⊆ (− , ), it holds + 1 − 1 2 2 2 2 ' ( = = −1 + , 2 2 + , 2 2 2 1 1 − y − t ≥ . Recall that we are assuming that is odd. 4 4 123 Noisy rumor spreading and plurality consensus we get Let Pr maj (u) = 1 − Pr maj (u) = 2 σ (x) = (x , ..., x , x , x , ..., x ) i i −1 1 i +1 k ) * ≥ ' ( 2 g δ, ( ) be the vector function that swaps the entries x and x in x. 1 i 2 (!) (=) (=) σ is clearly a bijection between the sets A ,A ,A and −1 1 1 1 (!) (=) (=) 9( −1) ≥ - e · 2 g (δ, ) A , A , A , respectively, namely i i i −1 (!) (!) (=) (=) (=) (=) σ : A → → A ,σ : A → → A ,σ : A → → A 1 1 1 i i i 2 1 ≥ 1 − 1 − · g (δ, π 9 ( − 1) where → → denotes a bijection. (=) Moreover, for all x ∈ A , it holds ≥ · g (δ, ) , ) ( ¯ ¯ Pr X = x = Pr X = σ (x) . concluding the proof. Next we show how to lower bound the above difference Therefore with a much simpler expression. ) ( ¯ ¯ Pr X = x = Pr X = σ x ( ) Lemma 10 In process P, during Stage 2, for any node u, (=) (=) Pr(maj (u) = m) − Pr(maj (u) = 3 − m) is at least x∈A σ (x)∈A 1 1 ) ( ) ( ) ( ) ( ) ( Pr(X > X , ..., X ) − Pr(X > X , ..., X , 1 2 k i 1 i −1 = Pr X = x . 
(10) ) ( ) ( ) ( X , ..., X ), where X = (X , ..., X ) follows a (=) i +1 k 1 k x∈A multinomial distribution with trials and probability dis- tribution c · P. (=) Furthermore, for all x ∈ A ,wehave Proof Without loss of generality, let m = 1. Let x = (x , ..., x ) denote a generic vector with positive integer 1 k x x x ) 1 i k k Pr X = x = p ... p ... p 1 i k entries such that x = ,let W (x) be the set of the j =1 x ... x 1 k greatest entries of x, and, for j ∈ {1, i },let x x x i 1 k > p ... p ... p 1 i k x ... x 1 k (!) { } • A = x | W (x) ={ j } , (=) = Pr X = σ (x) , (11) • A = {x | 1, i ∈ W (x)}, (=) • A = {x | 1 ∈ W (x) ∧ i ∈ / W (x) ∧ |W (x)| > 1} and 1 (=) where σ(x) ∈ A .FromEq. (11) we thus have that (=) • A = {x | i ∈ W (x) ∧ 1 ∈ / W (x) ∧ |W (x)| > 1}. ) ( ¯ ¯ Pr X = x > Pr X = σ (x) It holds (=) (=) x∈A σ (x)∈A 1 1 Pr maj (u) = j = Pr X = x . (12) ) ( ¯ ¯ = Pr X = x Pr maj (u) = j X = x ( ) x∈A (!) x∈A From Eq. (9), (10) and (12) we finally get ) ( ¯ ¯ + Pr X = x Pr maj (u) = j X = x (=) x∈A Pr maj (u) = 1 − Pr maj (u) = i ( ) ) ( ) Pr X = x ¯ ¯ + Pr X = x Pr maj (u) = j X = x ( = Pr X = x + (=) |W (x)| x∈A (!) (=) x∈A x∈A 1 1 ( ) (l) Pr X = x  ¯ Pr X = x ¯ ( = Pr X = x + ¯ + − Pr X = x |W (x)| |W (x)| (!) (=) (=) (!) x∈A x∈A j x∈A x∈A 1 i ) ( ¯  ¯ Pr X = x Pr X = x Pr X = x + (9) − − |W (x)| |W (x)| |W (x)| (=) (=) (=) x∈A x∈A x∈A i i 123 P. Fraigniaud, E. Natale ) ( ¯ ¯ and ≥ Pr X = x − Pr X = x (!) (!) x∈A x∈A ) ( ) ( ) ( ) ( )  ( 1 i Pr X > X ,..., X , X ,..., X X = h i 1 i −1 i +1 κ+1 κ+1 ) ( ) ( ¯ ¯ = Pr W (X ) ={X } − Pr W (X ) ={X } , 1 i ) ( ) ( ) ( )  ( = Pr X > X ,..., X , X ,..., X X =h . i 1 i −1 i +1 κ+1 concluding the proof of Lemma 10. Moreover, X follows a multinomial distribution with parameters p and . 
Thus X = h implies that the Intuitively, Lemma 10 says that the set of events in which ) ( remaining entries X ,..., X follow a multinomial dis- a tie occurs among the most frequent opinions in the node’s 1 k−1 p p 1 k−1 tribution with l −h trials, and distribution ( ,..., ). sample of observed messages does not favor the probability 1−p 1− p k k −h) ( −h) −h) that the node picks the wrong opinion. Thus, by avoid- Let Y = (Y ,..., Y ) be the distribution of 1 k−1 ) ( ) ( ing considering those events, we get a lower bound on X ,..., X conditional on X = h.FromEq. (14)we 1 k−1 k Pr(maj (u) = 1) − Pr(maj (u) = i ). get Thanks to Lemma 10, the proof of Eq. (6) reduces to prov- ing the following. ( ) ( ) ( Pr X > X ,..., X 1 2 κ+1 ) ( ) ( ) ( ) ( Lemma 11 For any fixed k, and with X defined as in − Pr X > X ,..., X , X ,..., X i 1 i −1 i +1 κ+1 Lemma 10, we have + , κ+1 ) ( ) ( (l) ) ( ) ( ≥ Pr X > X ,..., X  X = h Pr X > X ,..., X 1 2 κ κ+1 1 2 h=0 ) ( ) ( ) ( ) ( − Pr X > X ,..., X , X ,..., X ( i 1 i −1 i +1 k Pr X = h κ+1 g (δ, ) + , ≥ 2 /π . (13) κ+1 k−2 ) ( ) ( ) ( − Pr X > X ,..., X , X ,..., X i 1 i −1 i +1 h=0 Proof We prove Eq. (13) by induction. Lemma 9 provides us ) ( with the base case for k = 2. Let us assume that, for k ≤ κ, X = h Pr X = h κ+1 κ+1 Eq. (13) holds. For k = κ + 1, by using the law of total + , probability, we have κ+1 −h −h ( ) ( ) ( −h) ≥ Pr Y > Y ,..., Y − 1 2 ) ( ) ( h=0 Pr X > X ,..., X 1 2 κ+1 l−h l−h l−h l−h ( ) ( ) ( ) ( ) (l−h) − Pr Y >Y ,..., Y , Y ,..., Y ) ( ) ( ) ( ) ( ) κ i 1 i −1 i +1 − Pr X > X ,..., X , X ,..., X i 1 i −1 i +1 κ+1 (l) + , Pr X = h . (15) κ+1 κ+1 ) ( ) ( ) ( ≥ Pr X > X ,..., X  X = h 1 2 κ+1 κ+1 Now, using the inductive hypothesis on the r.h.s. of Eq. (15) h=0 we get Pr X = h + , κ+1 + , κ+1 −h) ( −h) κ+1 ( −h) Pr Y > Y ,..., Y 1 2 κ ) ( ) ( ) ( ) ( − Pr X > X ,..., X , X ,..., X i 1 i −1 i +1 κ+1 h=0 h=0 −h) ( −h) ( −h) ( −h) −h) − Pr Y >Y ,..., Y , Y ,..., Y 1 κ i i −1 i +1 ) ( X = h Pr X = h . 
(14) κ+1 κ+1 Pr X = h κ+1 + , + , ) ( ) ( Now, arg max {X }= X and X ≤ together j κ+1 j i κ+1 κ+1 − 2h g (δ, − h) ( ) ( ) ≥ Pr X = h imply X > X . Thus, in the r.h.s. of Eq. (14), we have κ+1 κ−2 i κ+1 π 4 h=0 + , ) ( ) ( )  ( κ+1 Pr X > X ,..., X X = h 1 2 κ+1 κ+1 2 g (δ, ) h ≥ · 1 − Pr X = h , κ+1 κ−2 (l) ( ) ( ) π 4 = Pr X > X ,..., X  X = h h=0 1 2 κ κ+1 123 Noisy rumor spreading and plurality consensus where, in the last inequality, we used the fact that g is c of the phase length large enough, in process P we get that a non-increasing function w.r.t. the second argument (see Pr maj (u) = m − Pr maj (u) = i ≥ αδ for some con- Lemma 15). stant α> 1 (provided that δ ≤ 1/2). Hence, by applying (τ ) It remains to show that Lemma 16 in Appendix A with θ = δ, we get Pr(c − (τ ) 2 −˜ α + , c ≤ αδ/2) ≤ exp(−(αδ) n/16) ≤ n for some con- κ+1 stant α ˜ that is large enough to apply Lemma 2. Therefore, h 1 1 − Pr X = h ≥ . (τ ) (τ ) j j κ+1 until δ ≥ 1/2, in process P we have that c −c ≥ αδ/2 l 4 m h=0 holds w.h.p. From the previous equation it follows that, after ) T phases, the protocol has reached an opinion distribution Let W be a r.v. with probability distribution Bin( , ). κ+1 κ+1 with a bias greater than 1/2. Thus, by a direct application of (τ ) (τ ) Since X ∼ Bin( , p ) with p ≤ , a stan- κ+1 κ+1 T T κ+1 κ+1 Lemma 16 and Lemma 2 to c −c , we get that, w.h.p., dard coupling argument (see for example [22, Exercise 1.1.]), (τ  ) (τ  ) T T c − c = 1, concluding the proof. enables to show that Finally, the time efficiency claimed in Theorems 1 and 2 ) ( Pr X ≤ h ≥ Pr W ≤ h . κ+1 κ+1 directly follows from Lemma 12, while the required memory follows from the fact that in each phase each node needs only Hence, we can apply the central limit theorem (Lemma 14) to count how many times it has received each opinion, i.e. to 2− 3 on W , and get that, for any  ˜ ≤ , there exists some κ+1 4 count up to at most O( log n) w.h.p. 
fixed constant such that, for ,wehave 0 0 ) ( 4 On the notion of Pr X ≤ ≥ Pr W ≤ ≥ −˜  . κ+1 κ+1 κ + 1 κ + 1 2 (, ı)-majority-preserving matrix (16) In this section we discuss the notion of (, δ)-m.p. noise By using Eq. (16), for we finally get that matrix introduced by Definition 2. Let us consider Eq. (2). The matrix P represents the “perturbation” introduced by the + , κ+1 noise, and so (c · P) − (c · P) measures how much infor- m i mation the system is losing about the correct opinion m,in 1 − Pr X = h κ+1 a single communication round. An (, δ)-m.p. noise matrix h=0 + , + , is a noise matrix that preserves at least an  fraction of bias, κ+1 provided the initial bias is at least δ.The (, δ)-m.p. property κ+1 ≥ 1 − · Pr X = h κ+1 essentially characterizes the amount of noise beyond which h=0 some coordination problems cannot be solved without fur- ) ther hypotheses on the nodes’ knowledge of the matrix P. ≥ 1 − · Pr X ≤ κ+1 κ + 1 κ + 1 To see why this is the case, consider an (, δ)-m.p. noise matrix for which there is a δ-biased opinion distribution c ˜ κ − 1 1 1 1 1 ≥ · −˜  ≥ · −˜  ≥ , such that (c ˜ · P) − (c ˜ · P) < 0 for some opinion i.Given κ + 1 2 3 2 4 opinion distribution c ˜, from each node’s perspective, opinion m does not appear to be the most frequent opinion. Indeed, concluding the proof that the messages that are received are more likely to be i than m. Thus, plurality consensus cannot be solved from opinion g (δ, Pr maj (u) = 1 − Pr maj (u) = i ≥ . (k−2) ln 4 distribution c ˜. π e Observe that verifying whether a given matrix P is (, δ)- m.p. with respect to opinion m consists in checking whether for each i = m the value of the following linear program is By using Proposition 1, we can then prove Lemma 12. at least δ: Lemma 12 W.h.p., at the end of Stage 2, all nodes support maximize (P · c) − (P · c) m i the initial plurality opinion. 
Proof Let $\delta = \Omega(\sqrt{\log n/n})$ be the bias of the opinion distribution at the beginning of a generic phase $j < T'$ of Stage 2. Thanks to Proposition 1, by choosing the constant

subject to $\sum_j c_j = 1$, and $\forall j$, $c_j \ge 0$, $c_m - c_j - \delta \ge 0$.

We now provide some negative and positive examples of $(\varepsilon, \delta)$-m.p. noise matrices. First, we note that a natural matrix property such as being diagonally dominant does not imply that the matrix is $(\varepsilon, \delta)$-m.p. For example, by multiplying the following diagonally dominant matrix by the δ-biased opinion distribution $c = (1/2 + \delta,\, 1/2 - \delta,\, 0)^\top$, we see that it does not even preserve the majority opinion at all when $\varepsilon, \delta < 1/6$:

\[
\begin{pmatrix}
\frac12 + \varepsilon & 0 & \frac12 - \varepsilon \\
\frac12 - \varepsilon & \frac12 + \varepsilon & 0 \\
0 & \frac12 - \varepsilon & \frac12 + \varepsilon
\end{pmatrix}.
\]

On the other hand, the following natural generalization of the noise matrix in [26] (see Eq. (1)), is $(\varepsilon, \delta)$-m.p.

number n of individuals. Nevertheless, it could be interesting, at least from a conceptual point of view, to address rumor spreading and plurality consensus in a scenario in which the number of opinions varies with n. This appears to be a technically challenging problem. Indeed, extending the results in the extended abstract of [26] from 2 opinions to any constant number k of opinions already required to use complex tools. Yet, several of these tools do not apply if k depends on n. This is typically the case of Proposition 1. We leave as an open problem the design of stochastic tools enabling to handle the scenario where k = k(n).

Acknowledgements Open access funding provided by Max Planck Society. We thank the anonymous reviewers of an earlier version of this work for their constructive criticisms and comments, which were of great help in improving the results and their presentation.
for every δ> 0 with respect to any opinion: Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecomm ons.org/licenses/by/4.0/), which permits unrestricted use, distribution, +  if i = j , and reproduction in any medium, provided you give appropriate credit (P) = p = i , j i , j − otherwise. to the original author(s) and the source, provide a link to the Creative k k−1 Commons license, and indicate if changes were made. More generally, let P be a noise matrix such that APPENDIX p if i = j , (P) = (17) i , j q ≤ q ≤ q otherwise, l i , j u A technical tools for some positive numbers p, q and q . Since u l Lemma 13 For any integer r ≥ 1 it holds (Pc) − (Pc) = pc + q c − pc − q c m i m j ,m j i j ,i j 2r 2r 2 1 2r 2 1 j =m j =i 9r 8r √ e ≤ ≤ √ e . πr πr ≥ p(c − c ) + q c − q c m i l j u j j =m j =i Proof By using Stirling’s approximation [39] ≥ p(c − c ) + q (1 − c ) − q (1 − c ) m i l m u i √ r √ r r 1 r 1 12r +1 12r ≥ p(c − c ) + q − q c − q + q c 2πr e ≤ r!≤ 2πr e , m i l l m u u i e e ≥ p(c − c ) − q (c − c ) − (q − q ) m i u m i u l we have ≥ (p − q )(c − c ) − (q − q ) u m i u l ≥ (p − q )δ − (q − q ). (18) √ 1 u u l 2r 2r 12r +1 2π2r e 2r (2r )! = ≥ 2 √ 2 r r (r !) r By defining  = (p − q )/2, we get that the last line in Eq. u 12r 2πr e (18) is greater than δ iff (p − q )δ/2 ≥ (q − q ), which u u l 2r 2r 12r +1 gives a sufficient condition for any matrix of the form given 4πr e 2r in Eq. (17) for being (, δ)-m.p. 24r 2πr e 2r 2r 2 2 1 2 1 12r +1 24r 9r = √ e ≥ √ e . πr πr 5 Conclusion 12r +1 In this paper, we solved the general version of rumor spread- The proof of the upper bound is analogous (swap e and ing and plurality consensus in biological systems. That is, we 12r e in the first inequality). have solved these problems for an arbitrarily large number k of opinions. 
We are not aware of realistic biological contexts in which the number of opinions might be a function of the

Lemma 14 Let $X_1, \dots, X_n$ be a random sample from a Bernoulli($p$) distribution with $p \in (0, 1)$ constant, and let $Z \sim N(0, 1)$. It holds

\[ \lim_{n \to \infty} \sup_{z \in \mathbb{R}} \left| \Pr\left( \frac{\sum_{i=1}^{n} X_i - pn}{\sqrt{n}} \le z \right) - \Pr\left( Z \le \frac{z}{\sqrt{p(1-p)}} \right) \right| = 0. \]

Lemma 15 The function

\[ g(x, y) = \begin{cases} x \left(1 - x^2\right)^{\frac{y-1}{2}} & \text{if } x < \frac{1}{\sqrt{y}}, \\ \frac{1}{\sqrt{y}} \left(1 - \frac{1}{y}\right)^{\frac{y-1}{2}} & \text{if } x \ge \frac{1}{\sqrt{y}}, \end{cases} \]

with $x \in [0, 1]$ and $y \in [1, +\infty)$ is non-decreasing w.r.t. $x$ and non-increasing w.r.t. $y$.

Proof To show that $g(x, y)$ is non-decreasing w.r.t. $x$, observe that

\[ \frac{\partial}{\partial x} g(x, y) = \left(1 - x^2\right)^{\frac{y-1}{2}} - 2x^2 \, \frac{y-1}{2} \left(1 - x^2\right)^{\frac{y-1}{2} - 1} \]

for $x < y^{-1/2} < 1$, and

\[ \left(1 - x^2\right)^{\frac{y-1}{2}} - 2x^2 \, \frac{y-1}{2} \left(1 - x^2\right)^{\frac{y-1}{2} - 1} \ge 0 \]

for $x < y^{-1/2}$.

To show that $g(x, y)$ is non-increasing w.r.t. $y$, observe that this is true for $x < y^{-1/2}$. For $x \ge y^{-1/2}$, since

\[ \frac{\partial}{\partial y} \left( \frac{1}{2} \log \frac{1}{y} + \frac{y-1}{2} \log\left(1 - \frac{1}{y}\right) \right) = \frac{\partial}{\partial y} \left( \frac{y-1}{2} \log(y-1) - \frac{y}{2} \log y \right) \le 0, \]

we have

\[ \frac{\partial}{\partial y} g(x, y) = \frac{\partial}{\partial y} \exp\left( \frac{1}{2} \log \frac{1}{y} + \frac{y-1}{2} \log\left(1 - \frac{1}{y}\right) \right) \le 0, \]

concluding the proof.

Lemma 16 Let $\{X_t\}_{t \in [n]}$ be $n$ i.i.d. random variables such that

\[ X_t = \begin{cases} 1 & \text{with probability } p, \\ 0 & \text{with probability } r, \\ -1 & \text{with probability } q, \end{cases} \]

with $p + r + q = 1$. It holds

\[ \Pr\left( \sum_t X_t \le (1 - \theta) \, \mathbb{E}\left[\sum_t X_t\right] - \theta n \right) \le \exp\left( -\frac{\theta^2}{4} \left( \mathbb{E}\left[\sum_t X_t\right] + n \right) \right). \]

Proof Let us define the r.v.

\[ Y_t = \frac{X_t + 1}{2}. \tag{19} \]

We can apply the Chernoff-Hoeffding bound to $\sum_t Y_t$ (see Theorem 1.1 in [22]), obtaining

\[ \Pr\left( \sum_t Y_t \le (1 - \theta) \, \mathbb{E}\left[\sum_t Y_t\right] \right) \le \exp\left( -\frac{\theta^2}{2} \, \mathbb{E}\left[\sum_t Y_t\right] \right) \]

for any $\theta \in (0, 1)$. Substituting Eq. (19) we have

\[ \Pr\left( \sum_t X_t + n \le (1 - \theta) \left( \mathbb{E}\left[\sum_t X_t\right] + n \right) \right) = \Pr\left( \sum_t X_t \le (1 - \theta) \, \mathbb{E}\left[\sum_t X_t\right] - \theta n \right) \le \exp\left( -\frac{\theta^2}{4} \left( \mathbb{E}\left[\sum_t X_t\right] + n \right) \right), \]

concluding the proof.

B. Removing the parity assumption on $\ell$

The next lemma shows that, for $k = 2$, the increment of bias at the end of each phase of Stage 2 in the process P is non-decreasing in the value of $\ell$, regardless of its parity. In particular, since Proposition 1 is proven by induction, and since the value of $\ell$ affects only the base case, the next lemma implies also the same kind of monotonicity for general $k$.

Lemma 17 Let $k = 2$, $a = 1$, let $\ell$ be odd, and let $(c \cdot P)_1 \ge (c \cdot P)_2$. The rule of Stage 2 of the protocol is such that

\[ \begin{aligned} \Pr(\mathrm{maj}_{\ell}(u) = 1) &= \Pr(\mathrm{maj}_{\ell+1}(u) = 1) \le \Pr(\mathrm{maj}_{\ell+2}(u) = 1), \\ \Pr(\mathrm{maj}_{\ell}(u) = 2) &= \Pr(\mathrm{maj}_{\ell+1}(u) = 2) \ge \Pr(\mathrm{maj}_{\ell+2}(u) = 2). \end{aligned} \tag{20} \]

Proof To simplify notation, let $p_1 = (c \cdot P)_1$ and $p_2 = (c \cdot P)_2$. By definition, we have

\[ \begin{aligned} \Pr(\mathrm{maj}_{\ell}(u) = 1) &= \Pr\left( X_1^{(\ell)} \ge \left\lceil \tfrac{\ell}{2} \right\rceil \right), \\ \Pr(\mathrm{maj}_{\ell+1}(u) = 1) &= \Pr\left( X_1^{(\ell+1)} > \tfrac{\ell+1}{2} \right) + \tfrac{1}{2} \Pr\left( X_1^{(\ell+1)} = \tfrac{\ell+1}{2} \right), \\ \Pr(\mathrm{maj}_{\ell+2}(u) = 1) &= \Pr\left( X_1^{(\ell+2)} \ge \left\lceil \tfrac{\ell+2}{2} \right\rceil \right), \end{aligned} \]

where $X_1^{(\ell)}$, $X_1^{(\ell+1)}$ and $X_1^{(\ell+2)}$ are binomial r.v. with probability $p_1$ and number of trials $\ell$, $\ell+1$ and $\ell+2$, respectively. We can view $X_1^{(\ell)}$, $X_1^{(\ell+1)}$ and $X_1^{(\ell+2)}$ as the sum of $\ell$, $\ell+1$ and $\ell+2$ Bernoulli($p_1$) r.v., respectively. In particular, let $Y$ and $Y'$ be independent r.v. with distribution Bernoulli($p_1$). We can couple $X_1^{(\ell)}$, $X_1^{(\ell+1)}$ and $X_1^{(\ell+2)}$ as follows:

\[ X_1^{(\ell+1)} = X_1^{(\ell)} + Y \quad \text{and} \quad X_1^{(\ell+2)} = X_1^{(\ell+1)} + Y'. \]

Since $\ell$ is odd, observe that if $X_1^{(\ell)} > \lceil \ell/2 \rceil$, then $\mathrm{maj}_{\ell+1}(u) = 1$ regardless of the value of $Y$, and similarly if $X_1^{(\ell)} < \lfloor \ell/2 \rfloor$ then $\mathrm{maj}_{\ell+1}(u) = 2$. Thus we have

\[ \begin{aligned} \Pr(\mathrm{maj}_{\ell+1}(u) = 1) &= \sum_{i=0}^{\ell} \Pr\left( \mathrm{maj}_{\ell+1}(u) = 1 \mid X_1^{(\ell)} = i \right) \Pr\left( X_1^{(\ell)} = i \right) \\ &= \sum_{i > \lceil \ell/2 \rceil} \Pr\left( X_1^{(\ell)} = i \right) + \Pr\left( \mathrm{maj}_{\ell+1}(u) = 1 \mid X_1^{(\ell)} = \left\lceil \tfrac{\ell}{2} \right\rceil \right) \Pr\left( X_1^{(\ell)} = \left\lceil \tfrac{\ell}{2} \right\rceil \right) \\ &\quad + \Pr\left( \mathrm{maj}_{\ell+1}(u) = 1 \mid X_1^{(\ell)} = \left\lfloor \tfrac{\ell}{2} \right\rfloor \right) \Pr\left( X_1^{(\ell)} = \left\lfloor \tfrac{\ell}{2} \right\rfloor \right). \end{aligned} \tag{21} \]

As for the last two terms in the previous equation, we have that

\[ \Pr\left( \mathrm{maj}_{\ell+1}(u) = 1 \mid X_1^{(\ell)} = \left\lceil \tfrac{\ell}{2} \right\rceil \right) = \Pr(Y = 1) + \tfrac{1}{2} \Pr(Y = 0), \tag{22} \]

and

\[ \Pr\left( \mathrm{maj}_{\ell+1}(u) = 1 \mid X_1^{(\ell)} = \left\lfloor \tfrac{\ell}{2} \right\rfloor \right) = \tfrac{1}{2} \Pr(Y = 1). \tag{23} \]

Moreover, by a direct calculation one can verify that

\[ \Pr\left( X_1^{(\ell)} = \left\lfloor \tfrac{\ell}{2} \right\rfloor \right) = \frac{\Pr(Y = 0)}{\Pr(Y = 1)} \Pr\left( X_1^{(\ell)} = \left\lceil \tfrac{\ell}{2} \right\rceil \right). \tag{24} \]

From Eqs. (22), (23) and (24) it follows that

\[ \begin{aligned} &\Pr\left( \mathrm{maj}_{\ell+1}(u) = 1 \mid X_1^{(\ell)} = \left\lceil \tfrac{\ell}{2} \right\rceil \right) \Pr\left( X_1^{(\ell)} = \left\lceil \tfrac{\ell}{2} \right\rceil \right) + \Pr\left( \mathrm{maj}_{\ell+1}(u) = 1 \mid X_1^{(\ell)} = \left\lfloor \tfrac{\ell}{2} \right\rfloor \right) \Pr\left( X_1^{(\ell)} = \left\lfloor \tfrac{\ell}{2} \right\rfloor \right) \\ &\quad = \left( \Pr(Y = 1) + \tfrac{1}{2} \Pr(Y = 0) \right) \Pr\left( X_1^{(\ell)} = \left\lceil \tfrac{\ell}{2} \right\rceil \right) + \tfrac{1}{2} \Pr(Y = 1) \Pr\left( X_1^{(\ell)} = \left\lfloor \tfrac{\ell}{2} \right\rfloor \right) \\ &\quad = \Pr\left( X_1^{(\ell)} = \left\lceil \tfrac{\ell}{2} \right\rceil \right) \left( \Pr(Y = 1) + \frac{\Pr(Y = 0)}{2} + \frac{\Pr(Y = 1)}{2} \cdot \frac{\Pr(Y = 0)}{\Pr(Y = 1)} \right) \\ &\quad = \Pr\left( X_1^{(\ell)} = \left\lceil \tfrac{\ell}{2} \right\rceil \right). \end{aligned} \tag{25} \]

By plugging Eq. (25) in Eq. (21) we get

\[ \Pr(\mathrm{maj}_{\ell+1}(u) = 1) = \Pr(\mathrm{maj}_{\ell}(u) = 1). \]

The proof that

\[ \Pr(\mathrm{maj}_{\ell+1}(u) = 2) = \Pr(\mathrm{maj}_{\ell}(u) = 2) \]

is analogous, proving the first part of Eq. (20).

As for the second part, observe that if $X_1^{(\ell+1)} > \frac{\ell+1}{2}$, then $\mathrm{maj}_{\ell+2}(u) = 1$ regardless of the value of $Y'$, and similarly if $X_1^{(\ell+1)} < \frac{\ell+1}{2}$ then $\mathrm{maj}_{\ell+2}(u) = 2$. Observe also that

\[ \Pr\left( \mathrm{maj}_{\ell+2}(u) = 1 \mid X_1^{(\ell+1)} = \tfrac{\ell+1}{2} \right) = \Pr(Y' = 1) = p_1. \]

Because of the previous observations and the hypothesis that $p_1 \ge \frac{1}{2}$, we have that

\[ \begin{aligned} \Pr(\mathrm{maj}_{\ell+2}(u) = 1) &= \sum_{i=0}^{\ell+1} \Pr\left( \mathrm{maj}_{\ell+2}(u) = 1 \mid X_1^{(\ell+1)} = i \right) \Pr\left( X_1^{(\ell+1)} = i \right) \\ &= \sum_{i > \frac{\ell+1}{2}} \Pr\left( X_1^{(\ell+1)} = i \right) + \Pr\left( \mathrm{maj}_{\ell+2}(u) = 1 \mid X_1^{(\ell+1)} = \tfrac{\ell+1}{2} \right) \Pr\left( X_1^{(\ell+1)} = \tfrac{\ell+1}{2} \right) \\ &= \sum_{i > \frac{\ell+1}{2}} \Pr\left( X_1^{(\ell+1)} = i \right) + p_1 \cdot \Pr\left( X_1^{(\ell+1)} = \tfrac{\ell+1}{2} \right) \\ &\ge \sum_{i > \frac{\ell+1}{2}} \Pr\left( X_1^{(\ell+1)} = i \right) + \tfrac{1}{2} \Pr\left( X_1^{(\ell+1)} = \tfrac{\ell+1}{2} \right) \\ &= \Pr(\mathrm{maj}_{\ell+1}(u) = 1). \end{aligned} \tag{26} \]

The proof of

\[ \Pr(\mathrm{maj}_{\ell+2}(u) = 2) \le \Pr(\mathrm{maj}_{\ell+1}(u) = 2) \]

is the same up to the inequality in (26), whose direction is reversed because $p_2 \le \frac{1}{2}$.

C. Rumor spreading with $\epsilon = \Theta(n^{-\frac{1}{4} - \eta})$

In [26] it is shown that at the end of Stage 1 the bias toward the correct opinion is at least $\epsilon^{T+2}/2$ and, at the beginning of Stage 2, they assume a bias toward the correct opinion of $\Omega(\sqrt{\log n / n})$. In this section, we show that, when $\epsilon = \Theta(n^{-\frac{1}{4} - \eta})$ for some $\eta \in (0, 1/4)$, the protocol considered by [26] and us cannot solve the rumor-spreading and the plurality consensus problem in time $\Theta(\log n / \epsilon^2)$.

First, observe that when $\epsilon = O(\sqrt{\log n / n})$ the length of the first phase of Stage 1 is $\log n / \epsilon^2 = \Omega(n \log n)$, which implies that, w.h.p., each node gets at least one message from the source during the first phase. Thus, thanks to our analysis of Stage 2 we have that when $\epsilon = O(\sqrt{\log n / n})$ the protocol effectively solves the rumor-spreading problem, w.h.p., in time $\Theta(\log n / \epsilon^2)$.

In general, for $\epsilon < n^{-1/2 - \eta}$ for some constant $\eta > 0$, if we adopt the second stage right from the beginning (which means that the source node sends $\epsilon^{-2}$ messages), we get that, w.h.p., all nodes receive at least $\log n / (\epsilon^2 n)$ messages. Thus, by a direct application of Lemma 16, after the first phase we get an $\Omega(\sqrt{\log n / n})$-biased opinion distribution, w.h.p., and Stage 2 correctly solves the problem according to Theorem 2.

However, when $\epsilon = \Theta(n^{-\frac{1}{4} - \eta})$ for some $\eta > 0$, from Claim 2 and Lemma 7 we have that, after phase 0 in opinion distribution $c$, at most $O(\log n / \epsilon^2) = O(n^{\frac{1}{2} + 2\eta} \log n)$ nodes are opinionated, and $c$ is $\epsilon$-biased. Each node that gets opinionated in phase 1 receives a message pushed from some node of $c$, and, because of the noise, the value of this message is distributed according to $c \cdot P$. It follows that the resulting opinion distribution is an $\epsilon^2/2$-biased opinion distribution with $\epsilon^2 = n^{-\frac{1}{2} - 2\eta}$, which is much smaller than the $\Omega(\sqrt{\log n / n})$ bound required for the second stage.

We believe that no minor modification of the protocol proposed here can correctly solve the noisy rumor-spreading problem when $\epsilon = \Theta(n^{-\frac{1}{4} - \eta})$ in time $O(\log n / \epsilon^2)$.

References

1. Abdullah, M.A., Draief, M.: Global majority consensus by local majority polling on graphs of a given degree sequence. Discrete Appl. Math. 180, 1–10 (2015)
2. Afek, Y., Alon, N., Barad, O., Barkai, N., Bar-Joseph, Z., Hornstein, E.: A biological solution to a fundamental distributed computing problem. Science 331(6014), 183–185 (2011)
3. Afek, Y., Alon, N., Bar-Joseph, Z., Cornejo, A., Haeupler, B., Kuhn, F.: Beeping a maximal independent set. Distrib. Comput. 26(4), 195–208 (2013)
4. Alistarh, D., Aspnes, J., Gelashvili, R.: Space-optimal majority in population protocols. In: Proceedings of the 19th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 2221–2239 (2018)
5. Ame, J.-M., Rivault, C., Deneubourg, J.-L.: Cockroach aggregation based on strain odour recognition. Anim. Behav. 68(4), 793–801 (2004)
6. Angluin, D., Aspnes, J., Eisenstat, D., Ruppert, E.: The computational power of population protocols. Distrib. Comput. 20(4), 279–304 (2007)
7. Angluin, D., Aspnes, J., Eisenstat, D.: A simple population protocol for fast robust approximate majority. Distrib. Comput. 21(2), 87–102 (2008)
8. Aspnes, J., Ruppert, E.: An introduction to population protocols. In: Middleware for Network Eccentric and Mobile Applications. Springer, pp. 97–120 (2009)
9. Becchetti, L., Clementi, A., Natale, E., Pasquale, F., Silvestri, R.: Plurality consensus in the gossip model. In: Proceedings of the 26th Annual ACM-SIAM Symposium on Discrete Algorithms, SIAM, pp. 371–390 (2015)
10. Becchetti, L., Clementi, A., Natale, E., Pasquale, F., Silvestri, R., Trevisan, L.: Simple dynamics for plurality consensus. Distrib. Comput. 30(4), 1–14 (2016)
11. Ben-Shahar, O., Dolev, S., Dolgin, A., Segal, M.: Direction election in flocking swarms. In: Proceedings of the 6th International Workshop on Foundations of Mobile Computing, ACM, pp. 73–80 (2010)
12. Berenbrink, P., Friedetzky, T., Kling, P., Mallmann-Trenn, F., Wastell, C.: Plurality consensus in arbitrary graphs: lessons learned from load balancing. In: Proceedings of the 24th Annual European Symposium on Algorithms, vol. 57, pp. 10:1–10:18 (2016)
13. Boczkowski, L., Korman, A., Natale, E.: Limits for rumor spreading in stochastic populations. In: Proceedings of the 9th Innovations in Theoretical Computer Science Conference, vol. 94, pp. 49:1–49:21 (2018)
14. Boczkowski, L., Korman, A., Natale, E.: Minimizing message size in stochastic communication patterns: fast self-stabilizing protocols with 3 bits. In: Proceedings of the 28th Annual ACM-SIAM Symposium on Discrete Algorithms, SIAM, pp. 2540–2559 (2017)
15. Cardelli, L., Csikász-Nagy, A.: The cell cycle switch computes approximate majority. Sci. Rep. 2, 656–656 (2011)
16. Chazelle, B.: Natural algorithms. In: Proceedings of the 20th Annual ACM-SIAM Symposium on Discrete Algorithms, SIAM, pp. 422–431 (2009)
17. Conradt, L., Roper, T.J.: Group decision-making in animals. Nature 421(6919), 155–158 (2003)
18. Cooper, C., Elsässer, R., Radzik, T.: The power of two choices in distributed voting. In: Automata, Languages, and Programming, vol. 8573 of Lecture Notes in Computer Science. Springer, pp. 435–446 (2014)
19. Demers, A., Greene, D., Hauser, C., Irish, W., Larson, J., Shenker, S., Sturgis, H., Swinehart, D., Terry, D.: Epidemic algorithms for replicated database maintenance. In: Proceedings of the 6th Annual ACM Symposium on Principles of Distributed Computing, ACM, pp. 1–12 (1987)
20. Doerr, B., Goldberg, L.A., Minder, L., Sauerwald, T., Scheideler, C.: Stabilizing consensus with the power of two choices. In: Proceedings of the 23rd Annual ACM Symposium on Parallelism in Algorithms and Architectures, ACM, pp. 149–158 (2011)
21. Draief, M., Vojnovic, M.: Convergence speed of binary interval consensus. SIAM J. Control Optim. 50(3), 1087–1109 (2012)
22. Dubhashi, D.P., Panconesi, A.: Concentration of Measure for the Analysis of Randomized Algorithms. Cambridge University Press, Cambridge (2009)
23. El Gamal, A., Kim, Y.-H.: Network Information Theory. Cambridge University Press, Cambridge (2011)
24. Elsässer, R., Friedetzky, T., Kaaser, D., Mallmann-Trenn, F., Trinker, H.: Brief announcement: rapid asynchronous plurality consensus. In: Proceedings of the 37th ACM Symposium on Principles of Distributed Computing, ACM, pp. 363–365 (2017)
25. Feinerman, O., Haeupler, B., Korman, A.: Breathe before speaking: efficient information dissemination despite noisy, limited and anonymous communication. In: Proceedings of the 34th ACM Symposium on Principles of Distributed Computing, ACM, pp. 114–123. Extended abstract of [26] (2014)
26. Feinerman, O., Haeupler, B., Korman, A.: Breathe before speaking: efficient information dissemination despite noisy, limited and anonymous communication. Distrib. Comput. 30(5), 1–17 (2015)
27. Franks, N.R., Pratt, S.C., Mallon, E.B., Britton, N.F., Sumpter, D.J.: Information flow, opinion polling and collective intelligence in house-hunting social insects. Philos. Trans. R. Soc. Lond. B Biol. Sci. 357(1427), 1567–1583 (2002)
28. Franks, N.R., Dornhaus, A., Fitzsimmons, J.P., Stevens, M.: Speed versus accuracy in collective decision making. Proc. Biol. Sci. 270(1532), 2457–2463 (2003)
29. Ghaffari, M., Parter, M.: A polylogarithmic gossip algorithm for plurality consensus. In: Proceedings of the 36th ACM Symposium on Principles of Distributed Computing, ACM, pp. 117–126 (2016)
30. Giakkoupis, G., Berenbrink, P., Friedetzky, T., Kling, P.: Efficient plurality consensus, or: the benefits of cleaning up from time to time. In: Proceedings of the 43rd International Colloquium on Automata, Languages, and Programming, vol. 55, pp. 136:1–136:14 (2016)
31. Jung, K., Kim, B.Y., Vojnović, M.: Distributed ranking in networks with limited memory and communication. In: Proceedings of the 2012 IEEE International Symposium on Information Theory, IEEE, pp. 980–984 (2012)
32. Karp, R., Schindelhauer, C., Shenker, S., Vocking, B.: Randomized rumor spreading. In: Proceedings of the 41st Annual Symposium on Foundations of Computer Science, IEEE, pp. 565–574 (2000)
33. Kempe, D., Dobra, A., Gehrke, J.: Gossip-based computation of aggregate information. In: Proceedings of the 44th Annual Symposium on Foundations of Computer Science, IEEE, pp. 482–491 (2003)
34. Korman, A., Greenwald, E., Feinerman, O.: Confidence sharing: an economic strategy for efficient information flows in animal groups. PLoS Comput. Biol. 10(10), e1003862–e1003862 (2014)
35. Land, M., Belew, R.: No perfect two-state cellular automata for density classification exists. Phys. Rev. Lett. 74(25), 5148–5150 (1995)
36. Mitzenmacher, M., Upfal, E.: Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, Cambridge (2005)
37. Perron, E., Vasudevan, D., Vojnovic, M.: Using three states for binary consensus on complete graphs. In: Proceedings of 28th IEEE INFOCOM (2009)
38. Pittel, B.: On spreading a rumor. SIAM J. Appl. Math. 47(1), 213–223 (1987)
39. Robbins, H.: A remark on Stirling's formula. Am. Math. Mon. 62, 26–29 (1955)
40. Seeley, T.D., Buhrman, S.C.: Group decision making in swarms of honey bees. Behav. Ecol. Sociobiol. 45(1), 19–31 (1999)
41. Seeley, T.D., Visscher, P.K.: Quorum sensing during nest-site selection by honeybee swarms. Behav. Ecol. Sociobiol. 56(6), 594–601 (2004)
42. Sumpter, D.J., Krause, J., James, R., Couzin, I.D., Ward, A.J.: Consensus decision making by fish. Curr. Biol. 18(22), 1773–1777 (2008)

Distributed Computing, Springer Journals

Noisy rumor spreading and plurality consensus

Publisher
Springer Journals
Copyright
Copyright © 2018 by The Author(s)
Subject
Computer Science; Computer Communication Networks; Computer Hardware; Computer Systems Organization and Communication Networks; Software Engineering/Programming and Operating Systems; Theory of Computation
ISSN
0178-2770
eISSN
1432-0452
D.O.I.
10.1007/s00446-018-0335-5

Abstract

Error-correcting codes are efficient methods for handling noisy communication channels in the context of technological networks. However, such elaborate methods differ a lot from the unsophisticated way biological entities are supposed to communicate. Yet, it has been recently shown by Feinerman et al. (PODC 2014) that complex coordination tasks such as rumor spreading and majority consensus can plausibly be achieved in biological systems subject to noisy communication channels, where every message transferred through a channel remains intact with small probability 1/2 + ε, without using coding techniques. This result is a considerable step towards a better understanding of the way biological entities may cooperate. It has nevertheless been established only in the case of 2-valued opinions: rumor spreading aims at broadcasting a single-bit opinion to all nodes, and majority consensus aims at leading all nodes to adopt the single-bit opinion that was initially present in the system with (relative) majority. In this paper, we extend this previous work to k-valued opinions, for any constant k ≥ 2. Our extension requires addressing a series of important issues, some conceptual, others technical. We had to entirely revisit the notion of noise, for handling channels carrying k-valued messages. In fact, we precisely characterize the type of noise patterns for which plurality consensus is solvable. Also, a key result employed in the bivalued case by Feinerman et al. is an estimate of the probability of observing the most frequent opinion from observing the mode of a small sample. We generalize this result to the multivalued case by providing a new analytical proof for the bivalued case that is amenable to extension by induction, and that is of independent interest.
Keywords Noise · Rumor spreading · Plurality consensus · PUSH model · Biological distributed algorithms 1 Introduction 1.1 Context and objective To guarantee reliable communication over a network in the presence of noise is the main goal of Network Infor- mation Theory [23]. Thanks to the achievements of this An extended abstract of this work appeared in Proceedings of the theory, the impact of noise can often be drastically reduced 2016 ACM Symposium on Principles of Distributed Computing (PODC’16). to almost zero by employing error-correcting codes, which P. Fraigniaud: Additional support from the ANR project are practical methods whenever dealing with artificial enti- DISPLEXITY and from the INRIA project GANG. ties. However, as observed in [26], the situation is radically E. Natale: This work has been partly done while the author was a different for scenarios in which the computational entities fellow of the Simons Institute for the Theory of Computing. are biological. Indeed, from a biological perspective, a com- B Emanuele Natale putational process can be considered “simple” only if it emanuele.natale@mpi-inf.mpg.de consists of very basic primitive operations, and is extremely Pierre Fraigniaud lightweight. As a consequence, it is unlikely that biologi- pierref@irif.fr cal entities are employing techniques like error-correcting codes to reduce the impact of noise in communications Institut de Recherche en Informatique Fondamentale, CNRS and University Paris Diderot, Paris, France between them. Yet, biological signals are subject to noise, when generated, transmitted, and received. This rises the Max Planck Institute for Informatics, Saarbrücken 66123, Germany intriguing question of how entities in biological ensem- 123 P. Fraigniaud, E. Natale bles can cooperate in presence of noisy communications, for the case of rumor spreading, the algorithm exchanges but in absence of mechanisms such as error-correcting solely opinions between nodes. 
In fact, the latter algorithm codes. for majority consensus is used as a subroutine for solving the An important step toward understanding communica- rumor-spreading problem. Note that the majority consensus tions in biological ensembles has been achieved recently algorithm of [26] requires that the nodes are initially aware in [26], which showed how it is possible to cope with of the size of A. noisy communications in absence of coding mechanisms According to [26], both algorithms are optimal, since both for solving complex tasks such as rumor-spreading and rumor-spreading and majority consensus require ( log n) majority consensus. Such a result provides highly valuable rounds w.h.p. in n-node networks. hints on how complex tasks can be achieved in frameworks Our objective is to extend the work of [26] to the natural such as the immune system, bacteria populations, or super- case of an arbitrary number of opinions, to go beyond a proof organisms of social insects, despite the presence of noisy of concept. The problem that results from this extension is an communications. instance of the plurality consensus problem in the presence In the case of rumor-spreading, [26] assumes that a source- of noise, i.e., the problem of making the system converging to node initially handles a bit, set to some binary value, called the initially most frequent opinion (i.e., the plurality opinion). Indeed, the plurality consensus problem naturally arises in the correct opinion. This opinion has to be transmitted to all nodes, in a noisy environment, modeled as a complete several biological settings, typically for choosing between network with unreliable links. More precisely, messages are different directions for a flock of birds [11], different speeds transmitted in the network according to the classical uni- for a school of fish [42], or different nesting sites for ants form push model [19,32,38] where, at each round, every [27]. 
The computation of the most frequent value has also node can send one binary opinion to a neighbor chosen uni- been observed in biological cells [15]. formly and independently at random and, before reaching the receiver, that opinion is flipped with probability at most 1.2 Our contribution −  with > 0. It is proved that, even in this very noisy setting, the rumor-spreading problem can be solved quite effi- 1.2.1 Our results ciently. Specifically, [26] provides an algorithm that solves the noisy rumor-spreading problem in O( log n) commu- We generalize the results in [26] to the setting in which an arbitrary large number k of opinions is present in the sys- nication rounds, with high probability (w.h.p.) in n-node tem. In the context of rumor spreading, the correct opinion networks, using O(log log n + log(1/)) bits of memory is a value i ∈{1,..., k}, for any constant k ≥ 2. Initially, per node . Again, this algorithm exchanges solely opinions one node supports this opinion i, and the other nodes have between nodes. no opinions. The nodes must exchange opinions so that, In the case of majority consensus, [26] assumes that some eventually, all nodes support the correct opinion i.Inthe nodes are supporting opinion 0, some nodes are supporting context of (relative) majority consensus, also known as plu- opinion 1, and some other nodes are supporting no opin- rality consensus, each node u initially supports one opinion ion. The objective is that all nodes eventually support the i ∈{1,..., k}, or has no opinion. The objective is that all initially most frequent opinion (0 or 1). More precisely, let u nodes eventually adopt the plurality opinion (i.e., the opinion A be the set of nodes with opinion, and let b ∈{0, 1} initially held by more nodes than any other, but not necessar- be the majority opinion in A.The majority bias of A is ily by an overall majority of nodes). 
As in [26], we restrict defined as (|A |−|A |)/|A| where A is the set of nodes b ¯ i 2 b ourselves to “natural” algorithms [16], that is, algorithms in with opinion i ∈{0, 1}. In the very same noisy commu- which nodes only exchange opinions in a straightforward nication model as above, [26] provides an algorithm that manner (i.e., they do not use the opinions to encode, e.g., solves the noisy majority consensus problem for |A|= part of their internal state). For both problems, the difficulty ( log n) with majority-bias ( log n/|A|). The algo- 1 comes from the fact that every opinion can be modified during rithm runs in O( log n) rounds, w.h.p., in n-node networks, its traversal of any link, and switched at random to any other using O(log log n + log(1/)) bits of memory per node. As opinion. In short, we prove that there are algorithms solving the noisy rumor spreading problem and the noisy plurality 1 c Aseriesofevents E , n ≥ 1, hold w.h.p. if Pr(E ) ≥ 1 − O(1/n ) consensus problem for multiple opinions, with the same per- n n for some c > 0. formances and probabilistic guarantees as the algorithms for We remark that, while it would be more appropriate to measure the binary opinions in [26]. space complexity by the number of states here (in accordance with other work which is concerned with minimizing it, such as population protocols [7] or cellular automata [35]), we make use of the memory bits for consistency with the main related work [26]. 123 Noisy rumor spreading and plurality consensus 1.2.2 The technical challenges show that the execution of the given protocol, on the uniform push model, can be tightly approximated with the execution Generalizing noisy rumor spreading and noisy majority con- of the same protocol on a another suitable communication sensus to more than just two opinions requires to address a model, that is not affected by the stochastic correlation that series of issues, some conceptual, others technical. affects the uniform push model. 
Conceptually, one needs first to redefine the notion of noise. In the case of binary opinions, the noise can just flip an opinion to its complement. In the case of multiple opin- 1.3 Other related work ions, an opinion i subject to a modification is switched to another opinion i , but there are many ways of picking i . By extending the work of [26], we contribute to the theoret- For instance, i can be picked uniformly at random (u.a.r.) ical understanding of how communications and interactions among all opinions. Or, i could be picked as one of the function in biological distributed systems, from an algorith- “close opinions”, say, either i + 1or i − 1 modulo k.Or, i mic perspective [2,3,6–8,13,14,34]. We refer the reader to could be “reset” to, say, i = 1. In fact, there are very many [26] for a discussion on the computational aspects of biolog- alternatives, and not all enable rumor spreading and plural- ical distributed systems, an overview of the rumor spreading ity consensus to be solved. One of our contributions is to problem in distributed computing, as well as its biological characterize noise matrices P = (p ), where p is the significance in the presence of noise. In this section, we i , j i , j probability that opinion i is switched to opinion j,for which mainly discuss the previous technical contributions from the these two problems are efficiently solvable. Similar issues literature related to the novelty of our work, that is the exten- arise for, e.g., redefining the majority bias into a plurality sion to the case of several different opinions. bias. We remark that, in the following, we say that a protocol The technical difficulties are manifold. A key ingredient solves a problem within a given time if a correct solution is of the analysis in [26] is a fine estimate of how nodes can mit- achieved with high probability within said time. 
In the context igate the impact of noise by observing the opinions of many of population protocol, the problem of achieving majority other nodes, and then considering the mode of such sample. consensus in the binary case has been solved by employ- Their proof relies on the fact that for the binary opinion case, ing a simple protocol called undecided-state dynamic [7]. given a sample of size γ , the number of 1s and 0s in the In the uniform push model, the binary majority consensus sample sum up to γ . Even for the ternary opinion case, the problem can be solved very efficiently as a consequence of a additional degree of freedom in the sample radically changes more general result about computing the median of the initial the nature of the problem, and the impact of noise is statisti- opinions [20]. Still in the uniform push model, the undecided cally far more difficult to handle. state dynamic has been analyzed in the case of an arbitrar- Also, to address the multivalued case, we had to cope ily large number of opinions, which may even be a function with the fact that, in the uniform push model, the messages of the number of agents in the system [9]. A similar result received by nodes at every round are correlated. To see why, has been obtained for another elementary protocol, so-called consider an instance of the system in which a certain opin- 3-majority dynamics, in which, at each round, each node sam- ion b is held by one node only, and there is no noise at all. ples the opinion of three random nodes, and adopts the most In one round, only one other node can receive b. It follows frequent opinion among these three [10]. The 3-majority that if a certain node u has received b, no other nodes have dynamics has also been shown to be fault-tolerant against received it. Thus, the messages each node receives are not an adversary that can change up to O( n) agents at each independent. In [25] (conference version of [26]), probabil- round [10,20]. 
Other work has analyzed the undecided-state ity concentration results are claimed for random variables dynamics in asynchronous models with a constant number of (r.v.) that depend on such messages, using Chernoff bounds. opinions [21,31,37], and the h-majority dynamics (or slight However, Chernoff bounds have been proved to hold only for variations of it) on different graph classes in the uniform push random variables whose stochastic dependence belongs to a model [1,18]. The analysis of the undecided-state dynamics very limited class (see for example [22]). In [26], it is pointed in [9] has been followed by a series of work which have out that the binary random variables on which the Chernoff used it to design optimal plurality consensus algorithms in bound is applied satisfy the property of being negatively 1- the uniform pull model [29,30]. correlated (see Section 1.7 in [26] for a formal definition). In A general result by Kempe et al. [33] shows how to com- our analysis, we show instead how to obtain concentration of pute a large class of functions in the uniform push model. probability in this dependent setting by leveraging Poisson However, their protocol requires the nodes to send slightly approximation techniques. Our approach has the following more complex messages than their sole current opinion, and advantage: instead of showing that the Chernoff bound can be its effectiveness heavily relies on a potential function argu- directly applied to the specific involved random variables, we ment that does not hold in the presence of noise. 123 P. Fraigniaud, E. Natale To the best of our knowledge, we are the first considering opinions represented by an integer in [k]={1,..., k}. Addi- the plurality consensus problem in the presence of noise. 
tionally, there may be undecided nodes that do not support any opinion, which represents nodes that are not actively aware that the system has started to solve the problem; thus, 2 Model and formal statement of our results undecided nodes are not allowed to send any message before receiving any of them. In this section we formally define the communication model, the main definitions, the investigated problems and our con- • In rumor spreading, initially, one node, called the source, tribution to them. has an opinion m ∈{1,..., k}, called the correct opin- As discussed in Sect. 1.1, intuitively we look for proto- ion. All the other nodes have no opinion. The objective cols that are simple enough to be plausible communication is to design a protocol insuring that, after a certain num- strategies for primitive biological system. We believe that the ber of communication rounds, every node has the correct computational investigation regarding biologically-feasible opinion m. protocols is still too premature for a reasonable attempt to • In plurality consensus, initially, for every i ∈{1,..., k}, provide a general formal definition of what constitutes a bio- aset A of nodes have opinion i.The sets A , i = i i logically feasible computation. Hence, in the following we 1,..., k, are pairwise disjoint, and their union does not restrict our attention solely on the biological significance of need to cover all nodes, i.e., there may be some unde- the rumor-spreading and plurality consensus problems and cided nodes with no opinion initially. The objective is the corresponding protocols that we consider. 
to design a protocol insuring that, after a certain number Regarding the problems of multivalued rumor spreading of communication rounds, every node has the plurality and plurality consensus, while for practical reasons many opinion, that is, the opinion m with relative majority in experiments on collective behavior have been designed to the initial setting (i.e., |A | > |A | for any j = m). m j investigate the binary-decision setting, the considered natural phenomena usually involve a decision among a large number Observe that the rumor-spreading problem is a special case of of different options [17]: famous examples in the literature the plurality consensus problem with |A |= 1 and |A |= 0 m j include cockroaches aggregating in a common site [5], and for any j = m. the house-hunting process of ant colonies when seeking a Following the guidelines of [26], we work under two con- new site to relocate their nest [28] or of honeybee swarms straints: when a portion of a strong colony branches from it in order to start a new one [40,41]. Therefore, it is natural to ask what 1. We restrict ourselves to protocols in which each node can trade-offs and constraints are required by the extension of the only transmit opinions, i.e., every message is an integer results in [26] to the multivalued case. in {1,..., k}. Regarding the solution we consider, as illustrated in Sects. 2. Transmissions are subject to noise, that is, for every 2.3 and 3.1, we consider a natural generalization of the round, and for every node u, if an opinion i ∈{1,..., k} protocol given in [26], which is essentially an elementary is transmitted to node u during that round, then node combination of sampling and majority operations. These ele- u will receive message j ∈{1,..., k} with probability mentary operations have extensively been observed in the k p ≥ 0, where p = 1. i , j i , j j =1 aforementioned experimental settings [17]. 
The noisy push model is the uniform push model together with the previous two constraints. The probabilities $\{p_{i,j}\}_{i,j \in [k]}$ can be seen as a transition matrix, called the noise matrix, and denoted by $P = (p_{i,j})_{i,j \in [k]}$. The noise matrix in [26] is simply

$$P = \begin{pmatrix} \frac{1}{2} + \varepsilon & \frac{1}{2} - \varepsilon \\ \frac{1}{2} - \varepsilon & \frac{1}{2} + \varepsilon \end{pmatrix}. \qquad (1)$$

2.1 Communication model and definition of the problems

The communication model we consider is essentially the uniform push model [19,32,38], where in each (synchronous) round each agent can send (push) a message to another agent chosen uniformly at random. This occurs without the sender or the receiver learning about each other's identity. Note that it may happen that several agents push a message to the same node $u$ at the same round. In the latter case we assume that the nodes receive them in a random order; we discuss this assumption in detail in Sect. 2.1.1.

We study the problems of rumor-spreading and plurality consensus. In both cases, we assume that nodes can support

2.1.1 The reception of simultaneous messages

In the uniform push model, it may happen that several agents push a message to the same node $u$ at the same round. In such cases, the model should specify whether the node receives all such messages, only one of them or none of them. Which choice is better depends on the biological setting that is being modeled: if the communication between the agents of the system is an auditory or tactile signal, it could be more realistic to assume that simultaneous messages to the same node would "collide", and the node would not be able to grasp any of them. If, on the other hand, the messages represent visual or chemical signals (see e.g., [10,11,27,42]), then it may be unrealistic to assume that nodes cannot receive more than one of such messages at the same round. Besides, by a standard balls-into-bins argument (e.g., by applying Lemma 3), it follows that in the uniform push model at each round no node receives more than $O(\log n)$ messages w.h.p. In this work we thus consider the model in which all messages are received, also because such an assumption allows us to obtain simpler proofs than the other variants. We finally note that our protocol does not strictly need such an assumption, since it only requires the nodes to collect a small random sample of the received messages. However, since we look at the latter feature as a consequence of active choices of the nodes rather than some inherent property of the environment, we avoid weakening the model to the point that it matches the requirements of the protocol.

2.1.2 On the role of synchronicity in the result

An important aspect of many natural biological computations is their tolerance with respect to a high level of asynchrony. Following [26], in this work we tackle the noisy rumor-spreading and plurality consensus problems by assuming that agents are provided a shared clock, which they can employ to synchronize their behavior across different phases of a protocol. In [26, Section 3], it is shown how to substantially relax the previous assumption by assuming, instead, that a source agent can broadcast a starting signal to the rest of the system to initiate the execution of the protocol. That is, a simple rumor-spreading procedure is employed to awake the agents which join the system asynchronously, and it is shown that, with high probability, the level of asynchrony (i.e. the largest difference among the agents' estimates of the time since the start of the execution of the protocol) is logarithmically bounded. It thus follows that their results can be generalized to the setting in which source agents initiate the execution of the protocol by waking up the rest of the system, with only a logarithmic overhead factor in the running time. We refer the reader to [26, Section 3] for formal details regarding this synchronization procedure. Our generalization of the results in [26] is independent from any aspect which concerns the aforementioned procedure. Hence, their relaxation holds for our results as well, with the same $\log n$ additional factor in the running time. It is an open problem to obtain a simple procedure to wake up the system while incurring a smaller overhead than the logarithmic one given by [26]. Finally, we remark that, more generally, the research on solving fundamental coordination problems such as plurality consensus in fully-asynchronous communication models such as population protocols is an active research area [4,24]. We believe that obtaining analogous results to those provided here in a noisy version of population protocols is an interesting direction for future research.

2.2 Plurality bias, and majority preservation

As time proceeds, our protocols cause the proportion of nodes with a given opinion to evolve. Note that there might be nodes who do not support any opinion at time $t$. As mentioned in the previous section, we call such nodes undecided. We denote by $a^{(t)}$ the fraction of nodes supporting any opinion at time $t$ and we call the nodes contributing to $a^{(t)}$ opinionated. Consequently, the fraction of undecided nodes at time $t$ is $1 - a^{(t)}$. Let $c_i^{(t)}$ be the fraction of opinionated nodes in the system that support opinion $i \in [k]$ at the beginning of round $t$, so that $\sum_{i \in [k]} c_i^{(t)} = a^{(t)}$. Let $\hat{c}_i^{(t)}$ be the fraction of opinionated nodes which receive at least one message at time $t - 1$ and support opinion $i \in [k]$ at the beginning of round $t$. We write $c^{(t)} = (c_1^{(t)}, \dots, c_k^{(t)})$ to denote the opinion distribution at time $t$. Similarly, let $\hat{c}^{(t)} = (\hat{c}_1^{(t)}, \dots, \hat{c}_k^{(t)})$. In particular, if every node simply switched to the last opinion it received, then

$$E[\hat{c}_i^{(t+1)} \mid c^{(t)}] = \sum_{j \in [k]} \Pr[\text{received } i \mid \text{original message is } j] \cdot \Pr[\text{original message is } j] = \sum_{j \in [k]} c_j^{(t)} \cdot p_{j,i}.$$

That is,

$$E[\hat{c}^{(t+1)} \mid c^{(t)}] = c^{(t)} \cdot P, \qquad (2)$$

where $P$ is the noise matrix. In particular, in the absence of noise, we have $P = I$ (the identity matrix), and if every node simply copied the opinion that it just received, we would have $E[\hat{c}^{(t+1)} \mid c^{(t)}] = c^{(t)}$. So, given the opinion distribution at round $t$, from the definition of the model it follows that the messages each node receives at round $t + 1$ can equivalently be seen as being sent from a system without noise, but whose opinion distribution at round $t$ is $c^{(t)} \cdot P$.

Recall that $m$ denotes the initially correct opinion, that is, the source's opinion in the rumor-spreading problem, and the initial plurality opinion in the plurality consensus problem. The following definition naturally extends the concept of majority bias in [26] to plurality bias.

Definition 1 Let $\delta > 0$. An opinion distribution $c$ is said to be $\delta$-biased toward opinion $m$ if $c_m - c_i \geq \delta$ for all $i \neq m$.

The noise matrix of Eq. (1) is $\varepsilon$-majority-biased. Note that, as in [26], the plurality consensus algorithm requires the nodes to know the size $|S|$ of the set $S$ of opinionated nodes.
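To make Eq. (2) and Definition 1 concrete, the following sketch computes the expected distribution $c \cdot P$ and the largest $\delta$ for which a distribution is $\delta$-biased. The helper names and the 0-based indexing of opinions are assumptions made for this illustration only.

```python
def apply_noise(c, P):
    """Eq. (2): row vector c times noise matrix P, i.e. (c . P)_i = sum_j c_j p_{j,i}."""
    k = len(c)
    return [sum(c[j] * P[j][i] for j in range(k)) for i in range(k)]


def bias_toward(c, m):
    """Largest delta such that c is delta-biased toward opinion m (0-based index)."""
    return min(c[m] - c[i] for i in range(len(c)) if i != m)


def binary_noise_matrix(eps):
    """The 2 x 2 noise matrix of Eq. (1)."""
    return [[0.5 + eps, 0.5 - eps], [0.5 - eps, 0.5 + eps]]


c = [0.6, 0.4]                      # illustrative opinion distribution
noisy = apply_noise(c, binary_noise_matrix(0.1))
print(noisy, bias_toward(c, 0), bias_toward(noisy, 0))
```

For the matrix of Eq. (1) one can check by hand that $(c \cdot P)_1 - (c \cdot P)_2 = 2\varepsilon (c_1 - c_2)$, so in this example the bias drops from about 0.2 to about 0.04 (up to floating-point rounding).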
In [26], each binary opinion that is transmitted between two nodes is flipped with probability at most $\frac{1}{2} - \varepsilon$, with $\varepsilon = n^{-\frac{1}{4}+\eta}$ for an arbitrarily small $\eta > 0$. Thus, the noise is parametrized by $\varepsilon$. The smaller $\varepsilon$, the noisier the communications. We generalize the role of this parameter with the following definition.

Definition 2 Let $\varepsilon = \varepsilon(n)$ and $\delta = \delta(n)$ be positive. A noise matrix $P$ is said to be $(\varepsilon, \delta)$-majority-preserving ($(\varepsilon, \delta)$-m.p.) with respect to opinion $m$ if, for every opinion distribution $c$ that is $\delta$-biased toward opinion $m$, we have $(c \cdot P)_m - (c \cdot P)_i > \varepsilon\delta$ for all $i \neq m$.

In the rumor-spreading problem, as well as in the plurality consensus problem, when we say that a noise matrix is $(\varepsilon, \delta)$-m.p., we implicitly mean that it is $(\varepsilon, \delta)$-m.p. with respect to the initially correct opinion. Because of space constraints, we defer the discussion on the class of $(\varepsilon, \delta)$-m.p. noise matrices to Sect. 4 (including its tightness w.r.t. Theorems 1 and 2).

2.3 Our formal results

We show that a natural generalization of the protocol in [26] solves the rumor spreading problem and the plurality consensus problem for an arbitrary number of opinions $k$. More precisely, using the protocol which we describe in Sect. 3.1, we can establish the following two results, whose proofs can be found in Sect. 3.

Theorem 1 Assume that the noise matrix $P$ is $(\varepsilon, \delta)$-m.p. with $\varepsilon = \Omega(n^{-\frac{1}{4}+\eta})$ for an arbitrarily small constant $\eta > 0$ and $\delta = \Omega(\sqrt{\log n / n})$. The noisy rumor-spreading problem with $k$ opinions can be solved in $O(\frac{\log n}{\varepsilon^2})$ communication rounds, w.h.p., by a protocol using $O(\log\log n + \log\frac{1}{\varepsilon})$ bits of memory at each node.

Theorem 2 Let $S$ with $|S| = \Omega(\frac{1}{\varepsilon^2}\log n)$ be an initial set of nodes with opinions in $[k]$, the rest of the nodes having no opinions. Assume that the noise matrix $P$ is $(\varepsilon, \delta)$-m.p. for some $\varepsilon > 0$, and that $S$ is $\Omega(\sqrt{\log n / |S|})$-majority-biased. The noisy plurality consensus problem with $k$ opinions can be solved in $O(\frac{\log n}{\varepsilon^2})$ communication rounds, w.h.p., by a protocol using $O(\log\log n + \log\frac{1}{\varepsilon})$ bits of memory at each node.

For $k = 2$, we get the theorems in [26] from the above two theorems. Indeed, the simple 2-dimensional noise matrix of Eq. (1) is $\varepsilon$-majority-biased.

Footnote: For a discussion on what happens for other values of $\varepsilon$, see Appendix C.

3 The analysis

In this section we prove Theorems 1 and 2 by generalizing the analysis of Stage 1 given in [26] and by providing a new analysis of Stage 2. Note that the proof techniques required for the generalization to arbitrary $k$ significantly depart from those in [26] for the case $k = 2$. In particular, our approach provides a general framework to rigorously deal with many kinds of stochastic dependence among messages in the uniform push model (Fig. 1).

Fig. 1 Diagram of dependencies among the different parts of the analysis. Each box represents a statement proven in the analysis, and an arrow between two boxes u and v signifies that the statement of box u is employed in the proof of box v

3.1 Definition of the protocol

We describe a rumor spreading protocol performing in two stages. Each stage is decomposed into a number of phases, each one decomposed into a number of rounds. During each phase of the two stages, the nodes apply the simple rules given below.

3.1.1 The rule during each phase of Stage 1

Nodes that already support some opinion at the beginning of the phase push their opinion at each round of the phase. Nodes that do not support any opinion at the beginning of the phase but receive at least one opinion during the phase start supporting an opinion at the end of the phase, chosen u.a.r. (counting multiplicities) from the received opinions. In other words, each node tries to acquire an opinion during each phase of Stage 1, and, as it eventually receives some opinions, it starts supporting one of them (chosen u.a.r.) from the beginning of the next phase. In particular, opinionated nodes never change their opinion during the entire stage.

Footnote: Note that, in the protocol considered in [26], the choice of each node's new opinion in both stages is based on the first messages received. In [26], in order to relax the synchronicity assumption that nodes share a common clock, they adopt the same sample-based variant of the rule that we adopt here.

More formally, let $\phi$, $\beta$, and $s$ be three constants satisfying $\phi > \beta > s$. The rounds of Stage 1 are grouped in $T + 2$ phases with $T = \lfloor \log(n / (2s/\varepsilon^2 \log n)) / \log(\beta/\varepsilon^2 + 1) \rfloor$. Phase 0 takes $s/\varepsilon^2 \log n$ rounds, phase $T + 1$ takes $\phi/\varepsilon^2 \log n$ rounds, and each phase $j$ with $1 \leq j \leq T$ takes $\beta/\varepsilon^2$ rounds. We denote with $\tau_j$ the end of the last round of phase $j$.

Let $t_u$ be the first time in which $u$ receives any opinion since the beginning of the protocol (with $t_u = 0$ for the source). Let $j_u$ be the phase of $t_u$, and let $val(u)$ be an opinion chosen u.a.r. by $u$ among those that it receives during phase $j_u$. During the first stage of the protocol each node applies the following rule.

Footnote: Note that, in order to sample u.a.r. one of them, $u$ does not need to collect all the opinions it receives. A natural sampling strategy such as reservoir sampling can be used.

Rule of Stage 1. Each opinionated node $u$ pushes opinion $val(u)$ during each round of every phase $j = j_u + 1, \dots, T + 1$.

3.1.2 The rule during each phase of Stage 2

During each phase of Stage 2, every node pushes its opinion at each round of the phase. At the end of the phase, each node that received "enough" opinions takes a random sample of them, and starts supporting the most frequent opinion in that sample (breaking ties u.a.r.).

More formally, the rounds of Stage 2 are divided in $T' + 1$ phases with $T' = \lceil \log(\sqrt{n / \log n}) \rceil$. Each phase $j$, $0 \leq j \leq T' - 1$, has length $2\ell$ with $\ell = \lceil c/\varepsilon^2 \rceil$ for some large-enough constant $c > 0$, and phase $T'$ has length $2\ell'$ with $\ell' = O(\varepsilon^{-2} \log n)$. For any finite multiset $A$ of elements in $\{1,\dots,k\}$, and any $i \in \{1,\dots,k\}$, let $occ(i, A)$ be the number of occurrences of $i$ in $A$, and let $mode(A) = \{i \in \{1,\dots,k\} \mid occ(i, A) \geq occ(j, A) \text{ for every } j \in \{1,\dots,k\}\}$. We then define $maj(A)$ as the most frequent value in $A$ (breaking ties u.a.r.), i.e., $maj(A)$ is the r.v. on $\{1,\dots,k\}$ such that $\Pr(maj(A) = i) = \mathbb{1}_{\{i \in mode(A)\}} / |mode(A)|$. Let $R_j(u)$ be the multiset of messages received by node $u$ during phase $j$. During the second stage of the protocol each node applies the following rule.

Rule of Stage 2. During each phase $j$ of length $2L$ of Stage 2 ($L = \ell$ or $\ell'$), each node $u$ pushes its current opinion at each round of the phase, and starts drawing a random uniform sample $S(u)$ of size $L$ from $R_j(u)$. Provided $|R_j(u)| \geq L$, at the end of the phase $u$ changes its opinion to $maj(S(u))$.

Let us remark that the reason we require the use of sampling in the previous rule is that at a given round a node may receive many more messages than $2L$. Thus, if the nodes were to collect all the messages they receive, some of them would need much more memory than our protocol does. Finally, observe that overall both stages 1 and 2 take $O(\frac{1}{\varepsilon^2} \log n)$ rounds.

3.2 Pushing colored balls-into-bins

Before delving into the analysis of the protocol, we provide a framework to rigorously deal with the stochastic dependence that arises between messages in the uniform push model. Let process O be the process that results from the execution of the protocol of Sect. 3.1 in the uniform push model.

2. each ball/message ends up in the same bin/node.
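As an illustration of process O restricted to a single Stage 2 phase, the sketch below simulates $2L$ rounds of noisy uniform push followed by the update to $maj(S(u))$. It assumes that every node is already opinionated, indexes nodes from 0, and uses the classical "Algorithm R" as one possible reservoir sampling strategy; the function names and the parameter values in the usage lines are hypothetical.

```python
import random
from collections import Counter


def maj(sample, rng):
    """Most frequent opinion in `sample`, ties broken uniformly at random."""
    counts = Counter(sample)
    top = max(counts.values())
    mode = sorted(op for op, cnt in counts.items() if cnt == top)
    return rng.choice(mode)


def stage2_phase(opinions, P, L, rng):
    """Run one Stage 2 phase of 2*L rounds and return the new opinions.

    opinions[u] in {1..k} is the opinion of node u at the start of the phase;
    P[i-1][j-1] is the probability that a pushed opinion i is received as j.
    """
    n, k = len(opinions), len(P)
    seen = [0] * n                     # number of messages received so far
    sample = [[] for _ in range(n)]    # reservoir S(u), at most L entries
    for _ in range(2 * L):
        for u in range(n):
            v = rng.randrange(n - 1)   # uniform target among the other nodes
            if v >= u:
                v += 1
            msg = rng.choices(range(1, k + 1), weights=P[opinions[u] - 1])[0]
            seen[v] += 1
            if len(sample[v]) < L:
                sample[v].append(msg)
            else:                      # keep each message with prob. L / seen[v]
                slot = rng.randrange(seen[v])
                if slot < L:
                    sample[v][slot] = msg
    # Nodes that received fewer than L messages keep their opinion.
    return [maj(sample[u], rng) if seen[u] >= L else opinions[u]
            for u in range(n)]


rng = random.Random(0)
eps = 0.2                              # illustrative noise parameter
P = [[0.5 + eps, 0.5 - eps], [0.5 - eps, 0.5 + eps]]
start = [1] * 70 + [2] * 30            # illustrative biased population
after = stage2_phase(start, P, L=15, rng=rng)
print(Counter(start), Counter(after))
```

Each node stores only $L$ sampled messages and a counter, which is the point of sampling in the rule: the memory used does not grow with the number of messages received.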
In order to apply concentration of probability results that require the involved random variables to be independent, we view the messages as balls, and the nodes as bins, and employ Poisson approximation techniques. More specifically, during each phase $j$ of the protocol, let $M_j$ be the set of messages that are sent to random nodes, and $N_j$ be the set of messages sent after the noise has acted on them. (In other words, $N_j = \bigcup_{u} R_j(u)$.) We prove that, at the end of phase $j$, we can equivalently assume that all the messages $M_j$ have been sent to the nodes according to the following process.

Definition 3 The balls-into-bins process B associated to phase $j$ is the two-step process in which the nodes represent bins and all messages sent in the phase represent colored balls, with each color corresponding to some opinion. Initially, balls are colored according to $M_j$. At the first step, each ball of color $i \in \{1,\dots,k\}$ is re-colored with color $j \in \{1,\dots,k\}$ with probability $p_{i,j}$, independently of the other balls. At the second step all balls are thrown into the bins u.a.r. as in a balls-into-bins experiment.

Claim 1 Given the opinion distribution and the number of active nodes at the beginning of phase $j$, the probability distribution of the opinion distribution and the number of active nodes at the end of phase $j$ in process O is the same as if the messages were sent according to process B.

It is not hard to see that Claim 1 holds in the case of a single round. For more than one round, it is crucial to observe that the way each node $u$ acts in the protocol depends only on the received messages $R_j(u)$, regardless of the order in which these messages are received. As an example, consider the opinion distribution in which one node has opinion 1, one other node has opinion 2, and all other nodes have opinion 3. Suppose that each node pushes its opinion for two consecutive rounds. Since, at each round, exactly one opinion 1 and exactly one opinion 2 are pushed, no node can receive two 1s during the first round and then two 2s during the second round, i.e., no node can possibly receive the sequence of messages "1,1,2,2" in this exact order. Instead, in process B such a sequence is possible.

Proof of Claim 1 In both process B and process O, at each round, the noise acts independently on each ball/message of a given color/opinion, according to the same probability distribution for that color/opinion. Then, in both processes, each ball/message is sent to some bin/node chosen u.a.r. and independently of the other balls/messages. Indeed, we can couple process B and process O by requiring that:

1. each ball/message is changed by the noise to the same color/value, and
2. each ball/message ends up in the same bin/node.

Thus, the joint probability distribution of the sets $\{R_j(u)\}_{u \in [n]}$ in process O is the same as the one given by process B.

Observe also that, from the definition of the protocol (see the rule of Stage 1 and Stage 2 in Sect. 3.1), it follows that each node's action depends only on the set $R_j(u)$ of received messages at the end of each phase $j$, and does not depend on any further information such as the actual order in which the messages are received during the phase.

Summing up the two previous observations, we get that if, at the end of each phase $j$, we generate the $R_j(u)$s according to process B, and we let the protocol execute according to them, then we indeed get the same stochastic process as process O.

Now, one key ingredient in our proof is to approximate process B using the following process P.

Definition 4 Given $N_j$, process P associated to phase $j$ is the one-shot process in which each node receives a number of opinions $i$ that is a random variable with distribution $\mathrm{Poisson}(h_i/n)$, where $h_i$ is the number of messages in $N_j$ carrying opinion $i$, and each Poisson random variable is independent of the others.

Now we provide some results from the theory of Poisson approximation for balls-into-bins experiments that are used in Sect. 3.2. For a nice introduction to the topic, we refer to [36].

Lemma 1 Let $\{X_j\}_{j \in [\tilde{n}]}$ be independent r.v. such that $X_j \sim \mathrm{Poisson}(\lambda_j)$. The vector $(X_1, \dots, X_{\tilde{n}})$ conditional on $\sum_j X_j = \tilde{m}$ follows a multinomial distribution with $\tilde{m}$ trials and probabilities $\left(\frac{\lambda_1}{\sum_j \lambda_j}, \dots, \frac{\lambda_{\tilde{n}}}{\sum_j \lambda_j}\right)$.

Lemma 2 Consider a balls-into-bins experiment in which $h$ colored balls are thrown in $n$ bins, where $h_i$ balls have color $i$ with $i \in \{1,\dots,k\}$ and $\sum_i h_i = h$. Let $\{X_{u,i}\}_{u \in \{1,\dots,n\}, i \in \{1,\dots,k\}}$ be the number of $i$-colored balls that end up in bin $u$, let $f(x_{1,1}, \dots, x_{n,1}, x_{n,2}, \dots, x_{n,k}, z_1, \dots, z_n)$ be a non-negative function with positive integer arguments $x_{1,1}, \dots, x_{n,1}, x_{n,2}, \dots, x_{n,k}, z_1, \dots, z_n$, let $\{Y_{u,i}\}_{u \in \{1,\dots,n\}, i \in \{1,\dots,k\}}$ be independent r.v. such that $Y_{u,i} \sim \mathrm{Poisson}(h_i/n)$ and let $Z_1, \dots, Z_n$ be integer valued r.v. independent from the $X_{u,i}$s and $Y_{u,i}$s. Then

$$E\left[f(X_{1,1}, \dots, X_{n,1}, X_{n,2}, \dots, X_{n,k}, Z_1, \dots, Z_n)\right] \leq e^k \sqrt{\textstyle\prod_i h_i}\; E\left[f(Y_{1,1}, \dots, Y_{n,1}, Y_{n,2}, \dots, Y_{n,k}, Z_1, \dots, Z_n)\right].$$

Proof To simplify notation, let $\bar{Z} = (Z_1, \dots, Z_n)$, $\bar{X} = (X_{1,1}, \dots, X_{n,1}, X_{n,2}, \dots, X_{n,k})$, $\bar{Y} = (Y_{1,1}, \dots, Y_{n,1}, Y_{n,2}, \dots, Y_{n,k})$, $\bar{Y}_\Sigma = (\sum_{u=1}^{n} Y_{u,1}, \dots, \sum_{u=1}^{n} Y_{u,k})$, $\lambda_i = \sum_{u=1}^{n} h_i/n = h_i$, $\bar{\lambda} = (\lambda_1, \dots, \lambda_k)$ and finally $\bar{x} = (x_1, \dots, x_k)$ for any $x_1, \dots, x_k$. Observe that, while $X_{u,i}$ and $X_{v,i}$ are clearly dependent, $X_{u,i}$ and $X_{v,j}$ with $i \neq j$ are stochastically independent (even if $u = v$). Indeed, the distribution of the r.v. $\{X_{u,i}\}_{u \in \{1,\dots,n\}}$ for each fixed $i$ is multinomial with $\lambda_i$ trials and uniform distribution on the bins. Thus, from Lemma 1 we have that $\{X_{u,i}\}_{u \in \{1,\dots,n\}}$ are distributed as $\{Y_{u,i}\}_{u \in \{1,\dots,n\}}$ conditional on $\sum_{u=1}^{n} Y_{u,i} = \lambda_i$, that is

$$E\left[f(\bar{Y}, \bar{Z}) \,\Big|\, \sum_{u=1}^{n} Y_{u,1} = \lambda_1, \dots, \sum_{u=1}^{n} Y_{u,k} = \lambda_k\right] = E\left[f(\bar{X}, \bar{Z})\right].$$

Therefore, we have

$$\begin{aligned}
E\left[f(\bar{Y}, \bar{Z})\right] &= \sum_{\bar{x} : x_1, \dots, x_k \geq 0} E\left[f(\bar{Y}, \bar{Z}) \mid \bar{Y}_\Sigma = \bar{x}\right] \Pr\left(\bar{Y}_\Sigma = \bar{x}\right) \\
&\geq E\left[f(\bar{Y}, \bar{Z}) \mid \bar{Y}_\Sigma = \bar{\lambda}\right] \Pr\left(\bar{Y}_\Sigma = \bar{\lambda}\right) \\
&= E\left[f(\bar{X}, \bar{Z})\right] \Pr\left(\bar{Y}_\Sigma = \bar{\lambda}\right) \\
&= E\left[f(\bar{X}, \bar{Z})\right] \prod_i \frac{e^{-h_i} h_i^{h_i}}{h_i!} \geq E\left[f(\bar{X}, \bar{Z})\right] \frac{e^{-k}}{\sqrt{\prod_i h_i}},
\end{aligned}$$

where, in the last inequality, we use that, by Stirling's approximation, $a! \leq e\sqrt{a}\left(\frac{a}{e}\right)^a$ for any $a > 0$.

From Lemmas 1 and 2, we get the following general result which says that if a generic event $E$ holds w.h.p. in process P, it also holds w.h.p. in process O.

Lemma 3 Given the opinion distribution and the number of active nodes at the beginning of a fixed phase $j$, let $E$ be an event that, at the end of that phase, holds with probability at least $1 - n^{-b}$ in process P, for some $b > (k \log h)/(2 \log n)$ with $h = \sum_i h_i$. Then, at the end of phase $j$, $E$ holds w.h.p. also in process O.

Footnote: Note that, if $N_j$ is not yet fixed, the parameters $h_i$ of process P associated to phase $j$ are random variables. However, if the opinion distribution and the number of active nodes at the beginning of phase $j$ are given, then $h = \sum_i h_i = |N_j| = |M_j|$ is fixed.

Proof Thanks to Claim 1, it suffices to prove that, at the end of phase $j$, $E$ holds w.h.p. in process B.

Let $\bar{E}$ be the complementary event of $E$. Let $h = |M_j|$ be the number of balls that are thrown in process B associated to phase $j$, where $h_i$ balls have color $i$ with $i \in \{1,\dots,k\}$ and $\sum_i h_i = h$. Let $\{X_{u,i}\}_{u \in \{1,\dots,n\}, i \in \{1,\dots,k\}}$ be the number of $i$-colored balls that end up in bin $u$, let $\{Y_{u,i}\}_{u \in \{1,\dots,n\}, i \in \{1,\dots,k\}}$ be the independent r.v. of process P such that $Y_{u,i} \sim \mathrm{Poisson}(h_i/n)$ and let $Z_1, \dots, Z_n$ be integer valued r.v. independent from the $X_{u,i}$s and $Y_{u,i}$s.

Fix any realization of $N_j$, i.e. any re-coloring of the balls in the first step of process B. By choosing $f$ in Lemma 2 as the binary r.v. indicating whether event $\bar{E}$ has occurred, where $\bar{E}$ is a function of the r.v. $X_{1,1}, \dots, X_{n,1}, X_{n,2}, \dots, X_{n,k}, Z_1, \dots, Z_n$, we get

$$\Pr\left(\bar{E}(X_{1,1}, \dots, X_{n,k}, Z_1, \dots, Z_n) \mid N_j\right) \leq e^k \sqrt{\textstyle\prod_i h_i}\; \Pr\left(\bar{E}(Y_{1,1}, \dots, Y_{n,k}, Z_1, \dots, Z_n) \mid N_j\right). \qquad (3)$$

Thus, from Eq. (3), the inequality of arithmetic and geometric means and the hypotheses on the probability of $E$, we get

$$\begin{aligned}
\Pr\left(\bar{E}(X_{1,1}, \dots, X_{n,k}, Z_1, \dots, Z_n) \mid N_j\right) &\leq e^k \sqrt{\textstyle\prod_i h_i}\; \Pr\left(\bar{E}(Y_{1,1}, \dots, Y_{n,k}, Z_1, \dots, Z_n) \mid N_j\right) \\
&\leq e^k \left(\frac{h}{k}\right)^{\frac{k}{2}} \Pr\left(\bar{E}(Y_{1,1}, \dots, Y_{n,k}, Z_1, \dots, Z_n) \mid N_j\right).
\end{aligned}$$

Finally, let $\mathcal{N}$ be the set of all possible realizations of $N_j$. By the law of total probability over $\mathcal{N}$, we get that

$$\begin{aligned}
\Pr\left(\bar{E}(X_{1,1}, \dots, X_{n,k}, Z_1, \dots, Z_n)\right) &= \sum_{s \in \mathcal{N}} \Pr\left(\bar{E}(X_{1,1}, \dots, X_{n,k}, Z_1, \dots, Z_n) \mid N_j = s\right) \Pr\left(N_j = s\right) \\
&\leq e^k \left(\frac{h}{k}\right)^{\frac{k}{2}} \sum_{s \in \mathcal{N}} \Pr\left(\bar{E}(Y_{1,1}, \dots, Y_{n,k}, Z_1, \dots, Z_n) \mid N_j = s\right) \Pr\left(N_j = s\right) \\
&\leq e^k \left(\frac{h}{k}\right)^{\frac{k}{2}} \Pr\left(\bar{E}(Y_{1,1}, \dots, Y_{n,k}, Z_1, \dots, Z_n)\right) \\
&\leq \frac{e^k}{k^{k/2}}\, h^{\frac{k}{2}}\, n^{-b} \leq \frac{e^k}{e^{\frac{k}{2}\log k}}\, e^{\frac{k}{2}\log h - b \log n} \\
&\leq e^{k - \frac{k}{2}\log k + \frac{k}{2}\log h - b \log n} \leq n^{-\Omega(1)},
\end{aligned}$$

where in the last line we used the hypotheses on the probability of $E$.

We now analyze the two stages of our protocol, starting with Stage 1. Note that, in the following two sections, the statements about the evolution of the process refer to process O.

3.3 Stage 1

The rule of Stage 1 is aimed at guaranteeing that, w.h.p., the system reaches a target opinion distribution from which the rumor-spreading problem becomes an instance of the plurality consensus problem. More precisely, we have the following.

Lemma 4 Stage 1 takes $O(\frac{1}{\varepsilon^2} \log n)$ rounds, after which w.h.p.

Lemma 7 W.h.p., at the end of each phase $j$ of Stage 1, we have an $(\varepsilon/2)^j$-biased opinion distribution.

Proof We prove the lemma by induction on the phase number. The case $j = 1$ is a direct application of Lemma 16 to $c_m^{(\tau_1)} - c_i^{(\tau_1)}$ ($i \neq m$), where the number of opinionated nodes is given by Claim 2, and where the independence of the r.v. follows from the fact that each node that becomes opinionated in the first phase has necessarily received the messages from the source-node. Now, suppose that the lemma holds
all nodes are active and c is δ-biased toward up to phase j − 1 ≤ T.Let S = {u| j = j } be the set j u the correct opinion, with δ = ( log n/n). of nodes that become opinionated during phase j. Recall Proof The fact that an undecided node becomes opinionated the definition of M and N from Sect. 3.2, and observe j j (τ ) j −1 during a phase only depends on whether it gets a message that M = N = τ − τ n · a , and that the j j j j −1 (τ ) during that phase, regardless of the value of such messages. j −1 number of times opinion i occurs in M is M c .Let j j (τ ) T +1 Hence, the proof that, w.h.p., a = 1 is reduced to the us identify each message in M with a distinct number in analysis of the rule of Stage 1 as an information spreading 1,..., M , and let {X (i )} be the binary r.v. j w w∈ 1,..., M { | |} process. First, by carefully exploiting the Chernoff bound such that X (i ) = 1 if and only if w is i after the action of the and Lemma 3, we can establish Claims 2 and 3 below: |N | noise. The frequency of opinion i in N is X (i ). j w w=1 | j | Claim 2 W.h.p., at the end of phase 0, we have s/ log n/3n Thanks to Lemma 3, it suffices to prove the lemma for pro- (τ ) 2 ≤ a ≤ s/ log n/n. cess P. By definition, in process P, for each i, the number of messages with opinion i that each node receives conditional Claim 3 W.h.p., at the end of phase j,1 ≤ j ≤ T,wehave | j | on N follows a Poisson ( X (i )) distribution. Each j w n w=1 node u that becomes opinionated during phase j gets at least 2 j (τ ) (τ ) 2 j (τ ) 0 j 0 (β/ + 1) a /8 ≤ a ≤ (β/ + 1) a . one message during the phase. Thus, from Lemma 1,the probability that u gets opinion i conditional on N is Proof of Claims 2 and 3 The probability that, in the process O, an undecided node becomes opinionated at the end of |N | |N | j 1 h j phase j is 1−(1− ) where h is the number of messages sent X (i ) 1 n w=1 = X (i ) . n | | k j during that phase. In process P, this probability is 1 − e . 
N X (i ) w=1 x w i =1 w=1 1+x By using that e ≤ 1 + x ≤ e for |x | < 1 we see that − h n n−1 Since opinionated nodes never change opinion during Stage 1 − e ≤ 1 − (1 − 1/n) ≤ 1 − e . Thus, we can prove (τ ) Claims 2 and 3 for process P by repeating essentially the same 1, the bias of c is at least the minimum between the bias (τ ) j −1 of c and the bias among the newly opinionated nodes calculations as in the proofs of Claims 2.2 and 2.4 in [26]. . Hence, we can apply the Chernoff bound to the nodes Since the Poisson distributions in process P are independent, in S we can apply the Chernoff bound as claimed in [26]. Finally, in S to prove that the bias at the end of phase j is, w.h.p. , we can prove that the statements hold also for process O τ τ ( j ) ( j ) thanks to Lemma 2. Pr c − c N m j ⎛ ⎞ |N | |N | From the previous two claims, and by the definition of T j j 1 1 ⎝ ⎠ ˜ we get the following. ≥ X (m) − X (i ) 1 − δ , w w j N N j j w=1 w=1 (τ ) T +1 Lemma 5 W.h.p., at the end of phase T , we have a = (4) 2 T (τ ) 2 ((β/ + 1) a ) = ( ). where δ = O( log n/|S |). j j Finally, from Lemma 5, an application of the Chernoff Moreover, note that bound gives us the following. ⎡  ⎤ | j | Lemma 6 W.h.p., at the end of Stage 1, all nodes are opin- τ τ τ ( j −1) ( j −1) ( j −1) ⎣  ⎦ E X (i ) c , a = c · P . ionated. w w=1 (τ +1) As for the fact that, w.h.p., c is a δ-biased opinion τ τ ( j ) ( j ) distribution with δ = ( log n/n), we can prove the fol- We remark that Eq. (4) concerns the value of Pr(c − c |N ), m j lowing. which is a random variable. 123 Noisy rumor spreading and plurality consensus (τ ) (τ ) j −1 j −1 Furthermore, (conditional on c and a )the r.v. where {X (i )} are independent. Thus, for each i = m, ⎧ w∈{1,...,|N |} −1 ⎨ 2 2 √ δ(1 − δ ) if δ< , from Claim 3, and by applying the Chernoff bound on g (δ, ) = |N | |N | √ √ −1 j j X (m), and on X (i ), we get that w.h.p. 2 √ 1/ (1 − 1/ ) if δ ≥ . w w w=1 w=1 |N | |N | j j First, we prove Eq. (6)for k = 2. 
We then obtain the 1 1 − j +1 j general case by induction. The proof for k = 2 is based on a X (m) − X (i ) ≥ 1 − δ 2  , w w j N N j j w=1 w=1 known relation between the cumulative distribution function (5) of the binomial distribution, and the cumulative distribution function of the beta distribution. This relation is given by the where δ = O( log n/|N |). following lemma. j j From Claims 2 and 3, it follows that δ ,δ ≤ w.h.p. j j Lemma 8 Given p ∈ (0, 1) and 0 ≤ j ≤ it holds Thus by putting together Eqs. (4) and (5) via the chain rule, we get that, w.h.p., −i p (1 − p) τ τ j <i ≤ ( j ) ( j ) − j +1 j c − c ≥ 1 − δ 1 − δ 2  ≥ . j j m $ − j −1 = ( j + 1) z (1 − z) dz. j + 1 T +2 Lemma 7 implies that, w.h.p., we get a bias  = Proof By integrating by parts, for j < − 1wehave ( log n/n) at the end of Stage 1, which completes the proof of Lemma 4. − j −1 ( j + 1) z (1 − z) dz j + 1 3.4 Stage 2 j +1 − j −1 = p (1 − p) j + 1 As proved in the previous section, w.h.p., all nodes are j +1 − j −2 opinionated at the end of Stage 1, and the final opinion − ( − j − 1) z (1 − z) dz j + 1 distribution is ( log n/n)-biased. Now, we have that the rumor-spreading problem is reduced to an instance of the j +1 − j −1 = p (1 − p) j + 1 plurality consensus problem. The purpose of Stage 2 is to progressively amplify the initial bias until all nodes support j +1 − j −2 − ( j + 2) z (1 − z) dz, (7) the plurality opinion, i.e., the opinion originally held by the j + 2 source node. where, in the last equality, we used the identity During the first T phases, it is not hard to see that, by taking c large enough, a fraction arbitrarily close to 1 of the − j ) = ( j + 1) . nodes receives at least messages, w.h.p. Each node u in j j + 1 such fraction changes its opinion at the end of the phase. With a slight abuse of notation, let maj (u) = maj (S(u)) be Note that when j = − 1, Eq. (7) becomes u’s new opinion based on the =|S(u)| randomly sampled −1 received messages. 
We show that, w.h.p., these new opin- p = z dz. ions increase the bias of the opinion distribution toward the 0 plurality opinion by a constant factor > 1. Hence, we can unroll the recurrence given by Eq. (7)to For the sake of simplicity, we assume that is odd (see obtain Appendix B for details on how to remove this assumption). − j −1 ( j + 1) z (1 − z) dz Proposition 1 Suppose that, at the beginning of phase j of j + 1 Stage 2 with 0 ≤ j ≤ T − 1, the opinion distribution is p −i −1 = p (1 − p) + z dz δ-biased toward m. In process P, if a node u changes its j <i ≤ −1 opinion at the end of the phase, then, for any i = m, we have −i = p (1 − p) g(δ, j <i ≤ Pr maj (u) = m − Pr maj (u) = i ≥ , (k−2) ln 4 π e (6) concluding the proof. 123 P. Fraigniaud, E. Natale Lemma 8 allows us to express the survival function of a Thus, for any y ∈ (−p + p , p − p ) we have 1 2 1 2 binomial sample as an integral. Thanks to it, we can prove + , + , p −p $ 1 2 Proposition 1 when k = 2. 2 1 1 − y − t dt ≥ y . (8) Lemma 9 Let c = (c , c ) be a δ-biased opinion distri- p −p 1 2 1 2 4 4 bution during Stage 2. In process P, for any node u, we have Pr maj (u) = m −Pr maj (u) = 3 − m ≥ 2 /π · The r.h.s. of Eq. (8) is maximized w.r.t. y ∈ (−p + p , p − 1 2 1 g (δ, ) . p ) when Proof Without loss of generality, let m = 1. Let X be a ⎧ ⎫ 3 4 ⎨ ⎬ ) ( 1 1 r.v. with distribution Bin( , p ), and let X = − X .By 2 1 y = min p − p , - = min p − p , √ . 1 2 1 2 . / using Lemma 8, we get ⎩ ⎭ 2 + 1 Pr maj (u) = 1 − Pr maj (u) = 2 Hence, for p − p < , we get 1 2 ) ( ) ( ) ( = Pr X > X − Pr X > X 1 2 2 1 + , p −p $ 1 2 −i −i i 2 2 1 = p p − p p 2 1 2 2 1 − t dt % & i % & i p −p 1 2 ≤i ≤ ≤i ≤ 2 2 + , 1 − (p − p ) −i 1 2 = p (1 − p ) ≥ (p − p ) 1 2 % & i ≤i ≤ −1 +1 2 = 2 (p − p ) 1 − (p − p ) −i i 1 2 1 2 − p (1 − p ) % & = 2 g (p − p , ) . ≤i ≤ 1 2 ) *$ + , + , 2 2 ' ( For p − p ≥ we get = z (1 − z) dz 1 2 + , + , + , p −p 1 2 2 2 − z (1 − z) dz . 
By setting $t = z - \frac{1}{2}$, and rewriting $p_1 = \frac{1}{2} + \frac{p_1 - p_2}{2}$ and $p_2 = \frac{1}{2} - \frac{p_1 - p_2}{2}$, we obtain

\[
\begin{aligned}
\Pr(\mathrm{maj}_\ell(u) = 1) - \Pr(\mathrm{maj}_\ell(u) = 2)
&= \left\lceil \tfrac{\ell}{2} \right\rceil \binom{\ell}{\lceil \ell/2 \rceil} \left( \int_0^{p_1} z^{\lfloor \ell/2 \rfloor} (1-z)^{\lfloor \ell/2 \rfloor}\, dz - \int_0^{p_2} z^{\lfloor \ell/2 \rfloor} (1-z)^{\lfloor \ell/2 \rfloor}\, dz \right) \\
&= \left\lceil \tfrac{\ell}{2} \right\rceil \binom{\ell}{\lceil \ell/2 \rceil} \left( \int_{-\frac{1}{2}}^{\frac{p_1 - p_2}{2}} \left( \tfrac{1}{4} - t^2 \right)^{\lfloor \ell/2 \rfloor} dt - \int_{-\frac{1}{2}}^{-\frac{p_1 - p_2}{2}} \left( \tfrac{1}{4} - t^2 \right)^{\lfloor \ell/2 \rfloor} dt \right) \\
&= \left\lceil \tfrac{\ell}{2} \right\rceil \binom{\ell}{\lceil \ell/2 \rceil} \int_{-\frac{p_1 - p_2}{2}}^{\frac{p_1 - p_2}{2}} \left( \tfrac{1}{4} - t^2 \right)^{\lfloor \ell/2 \rfloor} dt .
\end{aligned}
\]

(Recall that we are assuming that $\ell$ is odd.)

For any $t \in (-\frac{y}{2}, \frac{y}{2}) \subseteq (-\frac{p_1 - p_2}{2}, \frac{p_1 - p_2}{2})$, it holds $\frac{1}{4} - t^2 \geq \frac{1 - y^2}{4}$. Thus, restricting the integral to $(-\frac{y}{2}, \frac{y}{2})$ with $y = \min\{p_1 - p_2, 1/\sqrt{\ell}\}$, we get

\[
\int_{-\frac{p_1 - p_2}{2}}^{\frac{p_1 - p_2}{2}} \left( \tfrac{1}{4} - t^2 \right)^{\frac{\ell - 1}{2}} dt \;\geq\; y \left( \frac{1 - y^2}{4} \right)^{\frac{\ell - 1}{2}} = 2^{-(\ell - 1)}\, g(p_1 - p_2, \ell),
\]

where $g$ is the function defined in Lemma 15. By using the fact that $g$ is a non-decreasing function w.r.t. its first argument, we obtain

\[
\begin{aligned}
\Pr(\mathrm{maj}_\ell(u) = 1) - \Pr(\mathrm{maj}_\ell(u) = 2)
&\geq \left\lceil \tfrac{\ell}{2} \right\rceil \binom{\ell}{\lceil \ell/2 \rceil} 2^{-(\ell-1)}\, g(p_1 - p_2, \ell) \\
&\geq \left\lceil \tfrac{\ell}{2} \right\rceil \binom{\ell}{\lceil \ell/2 \rceil} 2^{-(\ell-1)}\, g(\delta, \ell) .
\end{aligned}
\]

Finally, by using the bound $\binom{2r}{r} \geq \frac{2^{2r}}{\sqrt{\pi r}}\, e^{-\frac{1}{7r}}$ (see Lemma 13) with $r = \frac{\ell - 1}{2}$, and $e^{-x} \geq 1 - x$, together with the identity

\[
\left\lceil \tfrac{\ell}{2} \right\rceil \binom{\ell}{\lceil \ell/2 \rceil} = \frac{\ell + 1}{2} \binom{\ell}{\frac{\ell+1}{2}} = \ell \binom{\ell - 1}{\frac{\ell - 1}{2}},
\]

we get

\[
\begin{aligned}
\Pr(\mathrm{maj}_\ell(u) = 1) - \Pr(\mathrm{maj}_\ell(u) = 2)
&\geq \ell \binom{\ell - 1}{\frac{\ell-1}{2}} 2^{-(\ell - 1)}\, g(\delta, \ell) \\
&\geq \ell \sqrt{\frac{2}{\pi(\ell - 1)}}\; e^{-\frac{2}{7(\ell-1)}} \cdot g(\delta, \ell) \\
&\geq \sqrt{\frac{2\ell}{\pi}} \cdot \frac{1}{\sqrt{1 - \frac{1}{\ell}}} \left( 1 - \frac{2}{7(\ell - 1)} \right) \cdot g(\delta, \ell) \\
&\geq \sqrt{\frac{2\ell}{\pi}} \cdot g(\delta, \ell),
\end{aligned}
\]

concluding the proof.

Next we show how to lower bound the above difference with a much simpler expression.

Lemma 10 In process P, during Stage 2, for any node $u$ and any opinion $i \neq m$, $\Pr(\mathrm{maj}_\ell(u) = m) - \Pr(\mathrm{maj}_\ell(u) = i)$ is at least $\Pr(X_1^{(\ell)} > X_2^{(\ell)}, \ldots, X_k^{(\ell)}) - \Pr(X_i^{(\ell)} > X_1^{(\ell)}, \ldots, X_{i-1}^{(\ell)}, X_{i+1}^{(\ell)}, \ldots, X_k^{(\ell)})$, where $X^{(\ell)} = (X_1^{(\ell)}, \ldots, X_k^{(\ell)})$ follows a multinomial distribution with $\ell$ trials and probability distribution $c \cdot P$.

Proof Without loss of generality, let $m = 1$. Let $x = (x_1, \ldots, x_k)$ denote a generic vector with non-negative integer entries such that $\sum_{j=1}^{k} x_j = \ell$, let $W(x)$ be the set of indices of the greatest entries of $x$, and, for $j \in \{1, i\}$, let

• $A_j^{(!)} = \{x \mid W(x) = \{j\}\}$,
• $A^{(=)} = \{x \mid 1, i \in W(x)\}$,
• $A_1^{(\neq)} = \{x \mid 1 \in W(x) \wedge i \notin W(x) \wedge |W(x)| > 1\}$ and
• $A_i^{(\neq)} = \{x \mid i \in W(x) \wedge 1 \notin W(x) \wedge |W(x)| > 1\}$.

It holds

\[
\begin{aligned}
\Pr(\mathrm{maj}_\ell(u) = j)
&= \sum_{x \in A_j^{(!)}} \Pr(X^{(\ell)} = x)\, \Pr(\mathrm{maj}_\ell(u) = j \mid X^{(\ell)} = x) \\
&\quad + \sum_{x \in A^{(=)}} \Pr(X^{(\ell)} = x)\, \Pr(\mathrm{maj}_\ell(u) = j \mid X^{(\ell)} = x) \\
&\quad + \sum_{x \in A_j^{(\neq)}} \Pr(X^{(\ell)} = x)\, \Pr(\mathrm{maj}_\ell(u) = j \mid X^{(\ell)} = x) \\
&= \sum_{x \in A_j^{(!)}} \Pr(X^{(\ell)} = x) + \sum_{x \in A^{(=)}} \frac{\Pr(X^{(\ell)} = x)}{|W(x)|} + \sum_{x \in A_j^{(\neq)}} \frac{\Pr(X^{(\ell)} = x)}{|W(x)|} . \qquad (9)
\end{aligned}
\]

Let

\[
\sigma(x) = (x_i, \ldots, x_{i-1}, x_1, x_{i+1}, \ldots, x_k)
\]

be the vector function that swaps the entries $x_1$ and $x_i$ in $x$. $\sigma$ is clearly a bijection between the sets $A_1^{(!)}, A^{(=)}, A_1^{(\neq)}$ and $A_i^{(!)}, A^{(=)}, A_i^{(\neq)}$, respectively, namely

\[
\sigma : A_1^{(!)} \leftrightarrow A_i^{(!)}, \qquad \sigma : A^{(=)} \leftrightarrow A^{(=)}, \qquad \sigma : A_1^{(\neq)} \leftrightarrow A_i^{(\neq)},
\]

where $\leftrightarrow$ denotes a bijection. Note also that $|W(\sigma(x))| = |W(x)|$. Moreover, for all $x \in A^{(=)}$, it holds

\[
\Pr(X^{(\ell)} = x) = \Pr(X^{(\ell)} = \sigma(x)) .
\]

Therefore

\[
\sum_{x \in A^{(=)}} \Pr(X^{(\ell)} = x) = \sum_{\sigma(x) \in A^{(=)}} \Pr(X^{(\ell)} = \sigma(x)) = \sum_{x \in A^{(=)}} \Pr(X^{(\ell)} = x) . \qquad (10)
\]

Furthermore, for all $x \in A_1^{(\neq)}$, we have

\[
\begin{aligned}
\Pr(X^{(\ell)} = x) &= \binom{\ell}{x_1 \ldots x_k}\, p_1^{x_1} \cdots p_i^{x_i} \cdots p_k^{x_k} \\
&> \binom{\ell}{x_1 \ldots x_k}\, p_1^{x_i} \cdots p_i^{x_1} \cdots p_k^{x_k} \\
&= \Pr(X^{(\ell)} = \sigma(x)), \qquad (11)
\end{aligned}
\]

where $\sigma(x) \in A_i^{(\neq)}$. From Eq. (11) we thus have that

\[
\sum_{x \in A_1^{(\neq)}} \Pr(X^{(\ell)} = x) > \sum_{x \in A_1^{(\neq)}} \Pr(X^{(\ell)} = \sigma(x)) = \sum_{x \in A_i^{(\neq)}} \Pr(X^{(\ell)} = x) . \qquad (12)
\]

From Eq. (9), (10) and (12), which also hold term by term with the weights $1/|W(x)|$, we finally get

\[
\begin{aligned}
\Pr(\mathrm{maj}_\ell(u) = 1) - \Pr(\mathrm{maj}_\ell(u) = i)
&= \sum_{x \in A_1^{(!)}} \Pr(X^{(\ell)} = x) + \sum_{x \in A^{(=)}} \frac{\Pr(X^{(\ell)} = x)}{|W(x)|} + \sum_{x \in A_1^{(\neq)}} \frac{\Pr(X^{(\ell)} = x)}{|W(x)|} \\
&\quad - \sum_{x \in A_i^{(!)}} \Pr(X^{(\ell)} = x) - \sum_{x \in A^{(=)}} \frac{\Pr(X^{(\ell)} = x)}{|W(x)|} - \sum_{x \in A_i^{(\neq)}} \frac{\Pr(X^{(\ell)} = x)}{|W(x)|} \\
&\geq \sum_{x \in A_1^{(!)}} \Pr(X^{(\ell)} = x) - \sum_{x \in A_i^{(!)}} \Pr(X^{(\ell)} = x) \\
&= \Pr(W(X^{(\ell)}) = \{1\}) - \Pr(W(X^{(\ell)}) = \{i\}),
\end{aligned}
\]

concluding the proof of Lemma 10.

Intuitively, Lemma 10 says that the set of events in which a tie occurs among the most frequent opinions in the node's sample of observed messages does not favor the probability that the node picks the wrong opinion. Thus, by avoiding considering those events, we get a lower bound on $\Pr(\mathrm{maj}_\ell(u) = 1) - \Pr(\mathrm{maj}_\ell(u) = i)$.

Thanks to Lemma 10, the proof of Eq. (6) reduces to proving the following.

Lemma 11 For any fixed $k$, and with $X^{(\ell)}$ defined as in Lemma 10, we have

\[
\Pr(X_1^{(\ell)} > X_2^{(\ell)}, \ldots, X_k^{(\ell)}) - \Pr(X_i^{(\ell)} > X_1^{(\ell)}, \ldots, X_{i-1}^{(\ell)}, X_{i+1}^{(\ell)}, \ldots, X_k^{(\ell)}) \geq \sqrt{2\ell/\pi} \cdot \frac{g(\delta, \ell)}{4^{k-2}} . \qquad (13)
\]

Proof We prove Eq. (13) by induction. Lemma 9 provides us with the base case for $k = 2$. Let us assume that, for $k \leq \kappa$, Eq. (13) holds. For $k = \kappa + 1$, by using the law of total probability, we have

\[
\begin{aligned}
&\Pr(X_1^{(\ell)} > X_2^{(\ell)}, \ldots, X_{\kappa+1}^{(\ell)}) - \Pr(X_i^{(\ell)} > X_1^{(\ell)}, \ldots, X_{i-1}^{(\ell)}, X_{i+1}^{(\ell)}, \ldots, X_{\kappa+1}^{(\ell)}) \\
&\quad \geq \sum_{h=0}^{\lfloor \ell/(\kappa+1) \rfloor} \Pr(X_1^{(\ell)} > X_2^{(\ell)}, \ldots, X_{\kappa+1}^{(\ell)} \mid X_{\kappa+1}^{(\ell)} = h)\, \Pr(X_{\kappa+1}^{(\ell)} = h) \\
&\qquad - \sum_{h=0}^{\lfloor \ell/(\kappa+1) \rfloor} \Pr(X_i^{(\ell)} > X_1^{(\ell)}, \ldots, X_{i-1}^{(\ell)}, X_{i+1}^{(\ell)}, \ldots, X_{\kappa+1}^{(\ell)} \mid X_{\kappa+1}^{(\ell)} = h)\, \Pr(X_{\kappa+1}^{(\ell)} = h) . \qquad (14)
\end{aligned}
\]

Now, $\max_{j \leq \kappa} X_j^{(\ell)} = X_i^{(\ell)}$ and $X_{\kappa+1}^{(\ell)} \leq \lfloor \ell/(\kappa+1) \rfloor$ together imply $X_i^{(\ell)} > X_{\kappa+1}^{(\ell)}$. Thus, in the r.h.s. of Eq. (14), we have

\[
\Pr(X_1^{(\ell)} > X_2^{(\ell)}, \ldots, X_{\kappa+1}^{(\ell)} \mid X_{\kappa+1}^{(\ell)} = h) = \Pr(X_1^{(\ell)} > X_2^{(\ell)}, \ldots, X_{\kappa}^{(\ell)} \mid X_{\kappa+1}^{(\ell)} = h)
\]

and

\[
\Pr(X_i^{(\ell)} > X_1^{(\ell)}, \ldots, X_{i-1}^{(\ell)}, X_{i+1}^{(\ell)}, \ldots, X_{\kappa+1}^{(\ell)} \mid X_{\kappa+1}^{(\ell)} = h) = \Pr(X_i^{(\ell)} > X_1^{(\ell)}, \ldots, X_{i-1}^{(\ell)}, X_{i+1}^{(\ell)}, \ldots, X_{\kappa}^{(\ell)} \mid X_{\kappa+1}^{(\ell)} = h) .
\]

Moreover, $X^{(\ell)}$ follows a multinomial distribution with parameters $p$ and $\ell$. Thus $X_{\kappa+1}^{(\ell)} = h$ implies that the remaining entries $X_1^{(\ell)}, \ldots, X_{\kappa}^{(\ell)}$ follow a multinomial distribution with $\ell - h$ trials, and distribution $\left( \frac{p_1}{1 - p_{\kappa+1}}, \ldots, \frac{p_{\kappa}}{1 - p_{\kappa+1}} \right)$. Let $Y^{(\ell-h)} = (Y_1^{(\ell-h)}, \ldots, Y_{\kappa}^{(\ell-h)})$ be the distribution of $X_1^{(\ell)}, \ldots, X_{\kappa}^{(\ell)}$ conditional on $X_{\kappa+1}^{(\ell)} = h$. From Eq. (14) we get

\[
\begin{aligned}
&\Pr(X_1^{(\ell)} > X_2^{(\ell)}, \ldots, X_{\kappa+1}^{(\ell)}) - \Pr(X_i^{(\ell)} > X_1^{(\ell)}, \ldots, X_{i-1}^{(\ell)}, X_{i+1}^{(\ell)}, \ldots, X_{\kappa+1}^{(\ell)}) \\
&\quad \geq \sum_{h=0}^{\lfloor \ell/(\kappa+1) \rfloor} \Pr(X_1^{(\ell)} > X_2^{(\ell)}, \ldots, X_{\kappa}^{(\ell)} \mid X_{\kappa+1}^{(\ell)} = h)\, \Pr(X_{\kappa+1}^{(\ell)} = h) \\
&\qquad - \sum_{h=0}^{\lfloor \ell/(\kappa+1) \rfloor} \Pr(X_i^{(\ell)} > X_1^{(\ell)}, \ldots, X_{i-1}^{(\ell)}, X_{i+1}^{(\ell)}, \ldots, X_{\kappa}^{(\ell)} \mid X_{\kappa+1}^{(\ell)} = h)\, \Pr(X_{\kappa+1}^{(\ell)} = h) \\
&\quad = \sum_{h=0}^{\lfloor \ell/(\kappa+1) \rfloor} \left( \Pr(Y_1^{(\ell-h)} > Y_2^{(\ell-h)}, \ldots, Y_{\kappa}^{(\ell-h)}) - \Pr(Y_i^{(\ell-h)} > Y_1^{(\ell-h)}, \ldots, Y_{i-1}^{(\ell-h)}, Y_{i+1}^{(\ell-h)}, \ldots, Y_{\kappa}^{(\ell-h)}) \right) \Pr(X_{\kappa+1}^{(\ell)} = h) . \qquad (15)
\end{aligned}
\]

Now, using the inductive hypothesis on the r.h.s. of Eq. (15) we get

\[
\begin{aligned}
&\sum_{h=0}^{\lfloor \ell/(\kappa+1) \rfloor} \left( \Pr(Y_1^{(\ell-h)} > Y_2^{(\ell-h)}, \ldots, Y_{\kappa}^{(\ell-h)}) - \Pr(Y_i^{(\ell-h)} > Y_1^{(\ell-h)}, \ldots, Y_{i-1}^{(\ell-h)}, Y_{i+1}^{(\ell-h)}, \ldots, Y_{\kappa}^{(\ell-h)}) \right) \Pr(X_{\kappa+1}^{(\ell)} = h) \\
&\quad \geq \sum_{h=0}^{\lfloor \ell/(\kappa+1) \rfloor} \sqrt{\frac{2\ell - 2h}{\pi}} \cdot \frac{g(\delta, \ell - h)}{4^{\kappa-2}}\, \Pr(X_{\kappa+1}^{(\ell)} = h) \\
&\quad \geq \sqrt{\frac{2\ell}{\pi}} \cdot \frac{g(\delta, \ell)}{4^{\kappa-2}} \cdot \sum_{h=0}^{\lfloor \ell/(\kappa+1) \rfloor} \sqrt{1 - \frac{h}{\ell}}\; \Pr(X_{\kappa+1}^{(\ell)} = h),
\end{aligned}
\]

where, in the last inequality, we used the fact that $g$ is a non-increasing function w.r.t. the second argument (see Lemma 15).

It remains to show that

\[
\sum_{h=0}^{\lfloor \ell/(\kappa+1) \rfloor} \sqrt{1 - \frac{h}{\ell}}\; \Pr(X_{\kappa+1}^{(\ell)} = h) \geq \frac{1}{4} .
\]

Let $W^{(\ell)}$ be a r.v. with probability distribution $\mathrm{Bin}(\ell, \frac{1}{\kappa+1})$. Since $X_{\kappa+1}^{(\ell)} \sim \mathrm{Bin}(\ell, p_{\kappa+1})$ with $p_{\kappa+1} \leq \frac{1}{\kappa+1}$, a standard coupling argument (see for example [22, Exercise 1.1]) shows that

\[
\Pr(X_{\kappa+1}^{(\ell)} \leq h) \geq \Pr(W^{(\ell)} \leq h) .
\]

Hence, we can apply the central limit theorem (Lemma 14) on $W^{(\ell)}$, and get that, for any $\tilde{\epsilon} \leq \frac{2 - \sqrt{3}}{4}$, there exists some fixed constant $\ell_0$ such that, for $\ell \geq \ell_0$, we have

\[
\Pr\left( X_{\kappa+1}^{(\ell)} \leq \frac{\ell}{\kappa+1} \right) \geq \Pr\left( W^{(\ell)} \leq \frac{\ell}{\kappa+1} \right) \geq \frac{1}{2} - \tilde{\epsilon} . \qquad (16)
\]

By using Eq. (16), for $\ell \geq \ell_0$ we finally get that

\[
\begin{aligned}
\sum_{h=0}^{\lfloor \ell/(\kappa+1) \rfloor} \sqrt{1 - \frac{h}{\ell}}\; \Pr(X_{\kappa+1}^{(\ell)} = h)
&\geq \sqrt{1 - \frac{1}{\ell} \left\lfloor \frac{\ell}{\kappa+1} \right\rfloor} \cdot \sum_{h=0}^{\lfloor \ell/(\kappa+1) \rfloor} \Pr(X_{\kappa+1}^{(\ell)} = h) \\
&\geq \sqrt{1 - \frac{1}{\kappa+1}} \cdot \Pr\left( X_{\kappa+1}^{(\ell)} \leq \frac{\ell}{\kappa+1} \right) \\
&\geq \sqrt{\frac{\kappa - 1}{\kappa + 1}} \cdot \left( \frac{1}{2} - \tilde{\epsilon} \right) \geq \frac{1}{\sqrt{3}} \cdot \left( \frac{1}{2} - \tilde{\epsilon} \right) \geq \frac{1}{4},
\end{aligned}
\]

concluding the proof that

\[
\Pr(\mathrm{maj}_\ell(u) = 1) - \Pr(\mathrm{maj}_\ell(u) = i) \geq \sqrt{\frac{2\ell}{\pi}} \cdot \frac{g(\delta, \ell)}{e^{(k-2) \ln 4}} .
\]

By using Proposition 1, we can then prove Lemma 12.

Lemma 12 W.h.p., at the end of Stage 2, all nodes support the initial plurality opinion.

Proof Let $\delta = \Omega(\sqrt{\log n / n})$ be the bias of the opinion distribution at the beginning of a generic phase $j < T'$ of Stage 2. Thanks to Proposition 1, by choosing the constant $c$ of the phase length large enough, in process P we get that $\Pr(\mathrm{maj}_\ell(u) = m) - \Pr(\mathrm{maj}_\ell(u) = i) \geq \alpha\delta$ for some constant $\alpha > 1$ (provided that $\delta \leq 1/2$). Hence, by applying Lemma 16 in Appendix A with $\theta = \delta$, we get $\Pr(c_m^{(\tau_j)} - c_i^{(\tau_j)} \leq \alpha\delta/2) \leq \exp(-(\alpha\delta)^2 n / 16) \leq n^{-\tilde{\alpha}}$ for some constant $\tilde{\alpha}$ that is large enough to apply Lemma 2. Therefore, until $\delta \geq 1/2$, in process P we have that $c_m^{(\tau_j)} - c_i^{(\tau_j)} \geq \alpha\delta/2$ holds w.h.p. From the previous equation it follows that, after $T'$ phases, the protocol has reached an opinion distribution with a bias greater than 1/2. Thus, by a direct application of Lemma 16 and Lemma 2 to $c_m^{(\tau_{T'})} - c_i^{(\tau_{T'})}$, we get that, w.h.p., $c_m^{(\tau_{T'})} - c_i^{(\tau_{T'})} = 1$, concluding the proof.

Finally, the time efficiency claimed in Theorems 1 and 2 directly follows from Lemma 12, while the required memory follows from the fact that in each phase each node needs only to count how many times it has received each opinion, i.e. to count up to at most $O(\ell \log n)$ w.h.p.

4 On the notion of (ε, δ)-majority-preserving matrix

In this section we discuss the notion of (ε, δ)-m.p. noise matrix introduced by Definition 2. Let us consider Eq. (2). The matrix $P$ represents the "perturbation" introduced by the noise, and so $(c \cdot P)_m - (c \cdot P)_i$ measures how much information the system is losing about the correct opinion $m$, in a single communication round. An (ε, δ)-m.p. noise matrix is a noise matrix that preserves at least an ε fraction of bias, provided the initial bias is at least δ. The (ε, δ)-m.p. property essentially characterizes the amount of noise beyond which some coordination problems cannot be solved without further hypotheses on the nodes' knowledge of the matrix $P$. To see why this is the case, consider a noise matrix that is not (ε, δ)-m.p., for which there is a δ-biased opinion distribution $\tilde{c}$ such that $(\tilde{c} \cdot P)_m - (\tilde{c} \cdot P)_i < 0$ for some opinion $i$. Given opinion distribution $\tilde{c}$, from each node's perspective, opinion $m$ does not appear to be the most frequent opinion. Indeed, the messages that are received are more likely to be $i$ than $m$. Thus, plurality consensus cannot be solved from opinion distribution $\tilde{c}$.

Observe that verifying whether a given matrix $P$ is (ε, δ)-m.p. with respect to opinion $m$ consists in checking whether for each $i \neq m$ the value of the following linear program is at least $\epsilon\delta$:

\[
\begin{aligned}
\text{minimize} \quad & (P \cdot c)_m - (P \cdot c)_i \\
\text{subject to} \quad & \textstyle\sum_j c_j = 1, \\
\text{and} \quad & \forall j,\; c_j \geq 0, \qquad \forall j \neq m,\; c_m - c_j - \delta \geq 0 .
\end{aligned}
\]

We now provide some negative and positive examples of (ε, δ)-m.p. noise matrices. First, we note that a natural matrix property such as being diagonally dominant does not imply that the matrix is (ε, δ)-m.p. For example, by multiplying the following diagonally dominant matrix by the δ-biased opinion distribution $c = (1/2 + \delta, 1/2 - \delta, 0)^\top$, we see that it does not even preserve the majority opinion at all when $\epsilon, \delta < 1/6$:

\[
\begin{pmatrix}
\frac{1}{2} + \epsilon & 0 & \frac{1}{2} - \epsilon \\
\frac{1}{2} - \epsilon & \frac{1}{2} + \epsilon & 0 \\
0 & \frac{1}{2} - \epsilon & \frac{1}{2} + \epsilon
\end{pmatrix} .
\]

On the other hand, the following natural generalization of the noise matrix in [26] (see Eq. (1)) is (ε, δ)-m.p. for every $\delta > 0$ with respect to any opinion:

\[
(P)_{i,j} = p_{i,j} =
\begin{cases}
\frac{1}{k} + \epsilon & \text{if } i = j, \\
\frac{1}{k} - \frac{\epsilon}{k-1} & \text{otherwise.}
\end{cases}
\]

More generally, let $P$ be a noise matrix such that

\[
(P)_{i,j} =
\begin{cases}
p & \text{if } i = j, \\
q_l \leq q_{i,j} \leq q_u & \text{otherwise,}
\end{cases}
\qquad (17)
\]

for some positive numbers $p$, $q_u$ and $q_l$. Since

\[
\begin{aligned}
(Pc)_m - (Pc)_i &= p c_m + \sum_{j \neq m} q_{j,m} c_j - p c_i - \sum_{j \neq i} q_{j,i} c_j \\
&\geq p (c_m - c_i) + q_l \sum_{j \neq m} c_j - q_u \sum_{j \neq i} c_j \\
&= p (c_m - c_i) + q_l (1 - c_m) - q_u (1 - c_i) \\
&= p (c_m - c_i) + q_l - q_l c_m - q_u + q_u c_i \\
&\geq p (c_m - c_i) - q_u (c_m - c_i) - (q_u - q_l) \\
&= (p - q_u)(c_m - c_i) - (q_u - q_l) \\
&\geq (p - q_u)\delta - (q_u - q_l) . \qquad (18)
\end{aligned}
\]

By defining $\epsilon = (p - q_u)/2$, we get that the last line in Eq. (18) is at least $\epsilon\delta$ iff $(p - q_u)\delta/2 \geq (q_u - q_l)$, which gives a sufficient condition for any matrix of the form given in Eq. (17) for being (ε, δ)-m.p.

5 Conclusion

In this paper, we solved the general version of rumor spreading and plurality consensus in biological systems. That is, we have solved these problems for an arbitrarily large number $k$ of opinions. We are not aware of realistic biological contexts in which the number of opinions might be a function of the number $n$ of individuals. Nevertheless, it could be interesting, at least from a conceptual point of view, to address rumor spreading and plurality consensus in a scenario in which the number of opinions varies with $n$. This appears to be a technically challenging problem. Indeed, extending the results in the extended abstract of [26] from 2 opinions to any constant number $k$ of opinions already required the use of complex tools. Yet, several of these tools do not apply if $k$ depends on $n$. This is typically the case of Proposition 1. We leave as an open problem the design of stochastic tools that make it possible to handle the scenario where $k = k(n)$.

Acknowledgements Open access funding provided by Max Planck Society. We thank the anonymous reviewers of an earlier version of this work for their constructive criticisms and comments, which were of great help in improving the results and their presentation.

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Appendix

A Technical tools

Lemma 13 For any integer $r \geq 1$ it holds

\[
\frac{2^{2r}}{\sqrt{\pi r}}\, e^{-\frac{1}{7r}} \leq \binom{2r}{r} \leq \frac{2^{2r}}{\sqrt{\pi r}}\, e^{-\frac{1}{9r}} .
\]

Proof By using Stirling's approximation [39]

\[
\sqrt{2\pi r} \left( \frac{r}{e} \right)^r e^{\frac{1}{12r+1}} \leq r! \leq \sqrt{2\pi r} \left( \frac{r}{e} \right)^r e^{\frac{1}{12r}},
\]

applied with the lower bound to $(2r)!$ and with the upper bound to $r!$, we have

\[
\begin{aligned}
\binom{2r}{r} = \frac{(2r)!}{(r!)^2}
&\geq \frac{\sqrt{2\pi \cdot 2r} \left( \frac{2r}{e} \right)^{2r} e^{\frac{1}{24r+1}}}{2\pi r \left( \frac{r}{e} \right)^{2r} e^{\frac{2}{12r}}} \\
&= \frac{2^{2r}}{\sqrt{\pi r}}\, e^{\frac{1}{24r+1} - \frac{1}{6r}} \geq \frac{2^{2r}}{\sqrt{\pi r}}\, e^{-\frac{1}{7r}} .
\end{aligned}
\]

The proof of the upper bound is analogous (swap the roles of the lower and upper Stirling bounds in the first inequality, which gives the exponent $\frac{1}{24r} - \frac{2}{12r+1} \leq -\frac{1}{9r}$).

Lemma 14 Let $X_1, \ldots, X_n$ be a random sample from a Bernoulli($p$) distribution with $p \in (0, 1)$ constant, and let $Z \sim N(0, 1)$. It holds

\[
\lim_{n \to \infty} \sup_{z \in \mathbb{R}} \left| \Pr\left( \frac{\sum_{i=1}^{n} X_i - pn}{\sqrt{n}} \leq z \right) - \Pr\left( Z \leq \frac{z}{\sqrt{p(1-p)}} \right) \right| = 0 .
\]

Lemma 15 The function

\[
g(x, y) =
\begin{cases}
x \left( 1 - x^2 \right)^{\frac{y-1}{2}} & \text{if } x < \frac{1}{\sqrt{y}}, \\
\frac{1}{\sqrt{y}} \left( 1 - \frac{1}{y} \right)^{\frac{y-1}{2}} & \text{if } x \geq \frac{1}{\sqrt{y}},
\end{cases}
\]

with $x \in [0, 1]$ and $y \in [1, +\infty)$ is non-decreasing w.r.t. $x$ and non-increasing w.r.t. $y$.

Proof To show that $g(x, y)$ is non-decreasing w.r.t. $x$, observe that

\[
\frac{\partial}{\partial x} g(x, y) = \left( 1 - x^2 \right)^{\frac{y-1}{2}} - 2x^2\, \frac{y-1}{2} \left( 1 - x^2 \right)^{\frac{y-1}{2} - 1}
\]

for $x < y^{-\frac{1}{2}} < 1$, and

\[
\left( 1 - x^2 \right)^{\frac{y-1}{2}} - 2x^2\, \frac{y-1}{2} \left( 1 - x^2 \right)^{\frac{y-1}{2} - 1} \geq 0
\]

for $x < y^{-\frac{1}{2}}$.

To show that $g(x, y)$ is non-increasing w.r.t. $y$, observe that this is true for $x < y^{-\frac{1}{2}}$. For $x \geq y^{-\frac{1}{2}}$, since

\[
\frac{\partial}{\partial y} \left( -\frac{1}{2} \log y + \frac{y-1}{2} \log\left( 1 - \frac{1}{y} \right) \right) = \frac{\partial}{\partial y} \left( \frac{y-1}{2} \log(y-1) - \frac{y}{2} \log y \right) \leq 0,
\]

we have

\[
\frac{\partial}{\partial y} g(x, y) = \frac{\partial}{\partial y} \exp\left( -\frac{1}{2} \log y + \frac{y-1}{2} \log\left( 1 - \frac{1}{y} \right) \right) \leq 0,
\]

concluding the proof.

Lemma 16 Let $\{X_t\}_{t \in [n]}$ be $n$ i.i.d. random variables such that

\[
X_t =
\begin{cases}
1 & \text{with probability } p, \\
0 & \text{with probability } r, \\
-1 & \text{with probability } q,
\end{cases}
\]

with $p + r + q = 1$. It holds

\[
\Pr\left( \sum_t X_t \leq (1 - \theta)\, \mathbb{E}\Big[ \sum_t X_t \Big] - \theta n \right) \leq \exp\left( -\frac{\theta^2}{4} \left( \mathbb{E}\Big[ \sum_t X_t \Big] + n \right) \right) .
\]

Proof Let us define the r.v.

\[
Y_t = \frac{X_t + 1}{2} . \qquad (19)
\]

We can apply the Chernoff-Hoeffding bound to $Y_t$ (see Theorem 1.1 in [22]), obtaining

\[
\Pr\left( \sum_t Y_t \leq (1 - \theta)\, \mathbb{E}\Big[ \sum_t Y_t \Big] \right) \leq \exp\left( -\frac{\theta^2}{2}\, \mathbb{E}\Big[ \sum_t Y_t \Big] \right)
\]

for any $\theta \in (0, 1)$. Substituting Eq. (19) we have

\[
\begin{aligned}
\Pr\left( \sum_t X_t + n \leq (1 - \theta) \left( \mathbb{E}\Big[ \sum_t X_t \Big] + n \right) \right)
&= \Pr\left( \sum_t X_t \leq (1 - \theta)\, \mathbb{E}\Big[ \sum_t X_t \Big] - \theta n \right) \\
&\leq \exp\left( -\frac{\theta^2}{4} \left( \mathbb{E}\Big[ \sum_t X_t \Big] + n \right) \right),
\end{aligned}
\]

concluding the proof.

B Removing the parity assumption on ℓ

The next lemma shows that, for $k = 2$, the increment of bias at the end of each phase of Stage 2 in the process P is non-decreasing in the value of $\ell$, regardless of its parity. In particular, since Proposition 1 is proven by induction, and since the value of $\ell$ affects only the base case, the next lemma implies also the same kind of monotonicity for general $k$.

Lemma 17 Let $k = 2$, $a = 1$, let $\ell$ be odd, and let $(c \cdot P)_1 \geq (c \cdot P)_2$. The rule of Stage 2 of the protocol is such that

\[
\begin{aligned}
\Pr(\mathrm{maj}_\ell(u) = 1) &= \Pr(\mathrm{maj}_{\ell+1}(u) = 1) \leq \Pr(\mathrm{maj}_{\ell+2}(u) = 1), \\
\Pr(\mathrm{maj}_\ell(u) = 2) &= \Pr(\mathrm{maj}_{\ell+1}(u) = 2) \geq \Pr(\mathrm{maj}_{\ell+2}(u) = 2) .
\end{aligned}
\qquad (20)
\]

Proof To simplify notation, let $p_1 = (c \cdot P)_1$ and $p_2 = (c \cdot P)_2$. By definition, we have

\[
\begin{aligned}
\Pr(\mathrm{maj}_\ell(u) = 1) &= \Pr\left( X_1^{(\ell)} \geq \left\lceil \tfrac{\ell}{2} \right\rceil \right), \\
\Pr(\mathrm{maj}_{\ell+1}(u) = 1) &= \Pr\left( X_1^{(\ell+1)} > \tfrac{\ell+1}{2} \right) + \tfrac{1}{2} \Pr\left( X_1^{(\ell+1)} = \tfrac{\ell+1}{2} \right), \\
\Pr(\mathrm{maj}_{\ell+2}(u) = 1) &= \Pr\left( X_1^{(\ell+2)} \geq \left\lceil \tfrac{\ell+2}{2} \right\rceil \right),
\end{aligned}
\]

where $X_1^{(\ell)}$, $X_1^{(\ell+1)}$ and $X_1^{(\ell+2)}$ are binomial r.v. with probability $p_1$ and number of trials $\ell$, $\ell+1$, and $\ell+2$, respectively. We can view $X_1^{(\ell)}$, $X_1^{(\ell+1)}$, and $X_1^{(\ell+2)}$ as the sum of $\ell$, $\ell+1$ and $\ell+2$ Bernoulli($p_1$) r.v., respectively. In particular, let $Y$ and $Y'$ be independent r.v. with distribution Bernoulli($p_1$). We can couple $X_1^{(\ell)}$, $X_1^{(\ell+1)}$ and $X_1^{(\ell+2)}$ as follows:

\[
X_1^{(\ell+1)} = X_1^{(\ell)} + Y \qquad \text{and} \qquad X_1^{(\ell+2)} = X_1^{(\ell+1)} + Y' .
\]

Since $\ell$ is odd, observe that if $X_1^{(\ell)} > \lceil \ell/2 \rceil$, then $\mathrm{maj}_{\ell+1}(u) = 1$ regardless of the value of $Y$, and similarly if $X_1^{(\ell)} < \lfloor \ell/2 \rfloor$ then $\mathrm{maj}_{\ell+1}(u) = 2$. Thus we have

\[
\begin{aligned}
\Pr(\mathrm{maj}_{\ell+1}(u) = 1)
&= \sum_{i=0}^{\ell} \Pr\left( \mathrm{maj}_{\ell+1}(u) = 1 \mid X_1^{(\ell)} = i \right) \Pr\left( X_1^{(\ell)} = i \right) \\
&= \sum_{i > \lceil \ell/2 \rceil} \Pr\left( X_1^{(\ell)} = i \right) \\
&\quad + \Pr\left( \mathrm{maj}_{\ell+1}(u) = 1 \mid X_1^{(\ell)} = \left\lceil \tfrac{\ell}{2} \right\rceil \right) \Pr\left( X_1^{(\ell)} = \left\lceil \tfrac{\ell}{2} \right\rceil \right) \\
&\quad + \Pr\left( \mathrm{maj}_{\ell+1}(u) = 1 \mid X_1^{(\ell)} = \left\lfloor \tfrac{\ell}{2} \right\rfloor \right) \Pr\left( X_1^{(\ell)} = \left\lfloor \tfrac{\ell}{2} \right\rfloor \right) . \qquad (21)
\end{aligned}
\]

As for the last two terms in the previous equation, we have that

\[
\Pr\left( \mathrm{maj}_{\ell+1}(u) = 1 \mid X_1^{(\ell)} = \left\lceil \tfrac{\ell}{2} \right\rceil \right) = \Pr(Y = 1) + \tfrac{1}{2} \Pr(Y = 0), \qquad (22)
\]

and

\[
\Pr\left( \mathrm{maj}_{\ell+1}(u) = 1 \mid X_1^{(\ell)} = \left\lfloor \tfrac{\ell}{2} \right\rfloor \right) = \tfrac{1}{2} \Pr(Y = 1) . \qquad (23)
\]

Moreover, by a direct calculation one can verify that

\[
\Pr\left( X_1^{(\ell)} = \left\lfloor \tfrac{\ell}{2} \right\rfloor \right) = \frac{\Pr(Y = 0)}{\Pr(Y = 1)}\, \Pr\left( X_1^{(\ell)} = \left\lceil \tfrac{\ell}{2} \right\rceil \right) . \qquad (24)
\]

From Eqs. (22), (23) and (24) it follows that

\[
\begin{aligned}
&\Pr\left( \mathrm{maj}_{\ell+1}(u) = 1 \mid X_1^{(\ell)} = \left\lceil \tfrac{\ell}{2} \right\rceil \right) \Pr\left( X_1^{(\ell)} = \left\lceil \tfrac{\ell}{2} \right\rceil \right)
+ \Pr\left( \mathrm{maj}_{\ell+1}(u) = 1 \mid X_1^{(\ell)} = \left\lfloor \tfrac{\ell}{2} \right\rfloor \right) \Pr\left( X_1^{(\ell)} = \left\lfloor \tfrac{\ell}{2} \right\rfloor \right) \\
&\quad = \left( \Pr(Y = 1) + \tfrac{1}{2} \Pr(Y = 0) \right) \Pr\left( X_1^{(\ell)} = \left\lceil \tfrac{\ell}{2} \right\rceil \right) + \tfrac{1}{2} \Pr(Y = 1) \cdot \Pr\left( X_1^{(\ell)} = \left\lfloor \tfrac{\ell}{2} \right\rfloor \right) \\
&\quad = \Pr\left( X_1^{(\ell)} = \left\lceil \tfrac{\ell}{2} \right\rceil \right) \left( \Pr(Y = 1) + \frac{\Pr(Y = 0)}{2} + \frac{\Pr(Y = 1)}{2} \cdot \frac{\Pr(Y = 0)}{\Pr(Y = 1)} \right) \\
&\quad = \Pr\left( X_1^{(\ell)} = \left\lceil \tfrac{\ell}{2} \right\rceil \right) . \qquad (25)
\end{aligned}
\]

By plugging Eq. (25) in Eq. (21) we get

\[
\Pr(\mathrm{maj}_{\ell+1}(u) = 1) = \Pr(\mathrm{maj}_{\ell}(u) = 1) .
\]

The proof that

\[
\Pr(\mathrm{maj}_{\ell}(u) = 2) = \Pr(\mathrm{maj}_{\ell+1}(u) = 2)
\]

is analogous, proving the first part of Eq. (20).

As for the second part, observe that if $X_1^{(\ell+1)} > \frac{\ell+1}{2}$, then $\mathrm{maj}_{\ell+2}(u) = 1$ regardless of the value of $Y'$, and similarly if $X_1^{(\ell+1)} < \frac{\ell+1}{2}$ then $\mathrm{maj}_{\ell+2}(u) = 2$. Observe also that

\[
\Pr\left( \mathrm{maj}_{\ell+2}(u) = 1 \mid X_1^{(\ell+1)} = \tfrac{\ell+1}{2} \right) = \Pr(Y' = 1) = p_1 .
\]

Because of the previous observations and the hypothesis that $p_1 \geq \frac{1}{2}$, we have that

\[
\begin{aligned}
\Pr(\mathrm{maj}_{\ell+2}(u) = 1)
&= \sum_{i=0}^{\ell+1} \Pr\left( \mathrm{maj}_{\ell+2}(u) = 1 \mid X_1^{(\ell+1)} = i \right) \Pr\left( X_1^{(\ell+1)} = i \right) \\
&= \sum_{i > \frac{\ell+1}{2}} \Pr\left( X_1^{(\ell+1)} = i \right) + \Pr\left( \mathrm{maj}_{\ell+2}(u) = 1 \mid X_1^{(\ell+1)} = \tfrac{\ell+1}{2} \right) \Pr\left( X_1^{(\ell+1)} = \tfrac{\ell+1}{2} \right) \\
&= \sum_{i > \frac{\ell+1}{2}} \Pr\left( X_1^{(\ell+1)} = i \right) + p_1 \cdot \Pr\left( X_1^{(\ell+1)} = \tfrac{\ell+1}{2} \right) \\
&\geq \sum_{i > \frac{\ell+1}{2}} \Pr\left( X_1^{(\ell+1)} = i \right) + \tfrac{1}{2} \Pr\left( X_1^{(\ell+1)} = \tfrac{\ell+1}{2} \right) \\
&= \Pr(\mathrm{maj}_{\ell+1}(u) = 1) . \qquad (26)
\end{aligned}
\]

The proof of

\[
\Pr(\mathrm{maj}_{\ell+2}(u) = 2) \leq \Pr(\mathrm{maj}_{\ell+1}(u) = 2)
\]

is the same up to the inequality in (26), whose direction is reversed because $p_2 \leq \frac{1}{2}$.

C Rumor spreading with ε = Θ(n^{−1/4−η})

In [26] it is shown that at the end of Stage 1 the bias toward the correct opinion is at least $\epsilon^{T+2}/2$ and, at the beginning of Stage 2, they assume a bias toward the correct opinion of $\Omega(\sqrt{\log n / n})$. In this section, we show that, when $\epsilon = \Theta(n^{-\frac{1}{4} - \eta})$ for some $\eta \in (0, 1/4)$, the protocol considered by [26] and us cannot solve the rumor-spreading and the plurality consensus problem in time $\Theta(\log n / \epsilon^2)$.

First, observe that when $\epsilon = \Theta(\sqrt{\log n / n})$ the length of the first phase of Stage 1 is $\log n / \epsilon^2 = \Theta(n \log n)$, which implies that, w.h.p., each node gets at least one message from the source during the first phase. Thus, thanks to our analysis of Stage 2 we have that when $\epsilon = \Theta(\sqrt{\log n / n})$ the protocol effectively solves the rumor-spreading problem, w.h.p., in time $\Theta(\log n / \epsilon^2)$.

In general, for $\epsilon < n^{-1/2 - \eta}$ for some constant $\eta > 0$, if we adopt the second stage right from the beginning (which means that the source node sends $\epsilon^{-2}$ messages), we get that, w.h.p., all nodes receive at least $\log n / (\epsilon^2 n)$ messages. Thus, by a direct application of Lemma 16, after the first phase we get an $\Omega(\sqrt{\log n / n})$-biased opinion distribution, w.h.p., and Stage 2 correctly solves the problem according to Theorem 2.

However, when $\epsilon = \Theta(n^{-\frac{1}{4} - \eta})$ for some $\eta > 0$, from Claim 2 and Lemma 7 we have that, after phase 0 in opinion distribution $c$, at most $O(\log n / \epsilon^2) = O(n^{\frac{1}{2} + 2\eta} \log n)$ nodes are opinionated, and $c$ is ε-biased. Each node that gets opinionated in phase 1 receives a message pushed from some node of $c$, and, because of the noise, the value of this message is distributed according to $c \cdot P$. It follows that $c^{(\tau_1)}$ is an $\epsilon^2/2$-biased opinion distribution with $\epsilon^2 = n^{-\frac{1}{2} - 2\eta}$, which is much smaller than the $\Omega(\sqrt{\log n / n})$ bound required for the second stage.

We believe that no minor modification of the protocol proposed here can correctly solve the noisy rumor-spreading problem when $\epsilon = \Theta(n^{-\frac{1}{4} - \eta})$ in time $O(\log n / \epsilon^2)$.

References

1. Abdullah, M.A., Draief, M.: Global majority consensus by local majority polling on graphs of a given degree sequence. Discrete Appl. Math. 180, 1–10 (2015)
2. Afek, Y., Alon, N., Barad, O., Barkai, N., Bar-Joseph, Z., Hornstein, E.: A biological solution to a fundamental distributed computing problem. Science 331(6014), 183–185 (2011)
3. Afek, Y., Alon, N., Bar-Joseph, Z., Cornejo, A., Haeupler, B., Kuhn, F.: Beeping a maximal independent set. Distrib. Comput. 26(4), 195–208 (2013)
4. Alistarh, D., Aspnes, J., Gelashvili, R.: Space-optimal majority in population protocols. In: Proceedings of the 19th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 2221–2239 (2018)
5. Ame, J.-M., Rivault, C., Deneubourg, J.-L.: Cockroach aggregation based on strain odour recognition. Anim. Behav. 68(4), 793–801 (2004)
6. Angluin, D., Aspnes, J., Eisenstat, D., Ruppert, E.: The computational power of population protocols. Distrib. Comput. 20(4), 279–304 (2007)
7. Angluin, D., Aspnes, J., Eisenstat, D.: A simple population protocol for fast robust approximate majority. Distrib. Comput. 21(2), 87–102 (2008)
8. Aspnes, J., Ruppert, E.: An introduction to population protocols. In: Middleware for Network Eccentric and Mobile Applications. Springer, pp. 97–120 (2009)
9. Becchetti, L., Clementi, A., Natale, E., Pasquale, F., Silvestri, R.: Plurality consensus in the gossip model. In: Proceedings of the 26th Annual ACM-SIAM Symposium on Discrete Algorithms, SIAM, pp. 371–390 (2015)
10. Becchetti, L., Clementi, A., Natale, E., Pasquale, F., Silvestri, R., Trevisan, L.: Simple dynamics for plurality consensus. Distrib. Comput. 30(4), 1–14 (2016)
11. Ben-Shahar, O., Dolev, S., Dolgin, A., Segal, M.: Direction election in flocking swarms. In: Proceedings of the 6th International Workshop on Foundations of Mobile Computing, ACM, pp. 73–80 (2010)
12. Berenbrink, P., Friedetzky, T., Kling, P., Mallmann-Trenn, F., Wastell, C.: Plurality consensus in arbitrary graphs: lessons learned from load balancing. In: Proceedings of the 24th Annual European Symposium on Algorithms, vol. 57, pp. 10:1–10:18 (2016)
13. Boczkowski, L., Korman, A., Natale, E.: Limits for rumor spreading in stochastic populations. In: Proceedings of the 9th Innovations in Theoretical Computer Science Conference, vol. 94, pp. 49:1–49:21 (2018)
14. Boczkowski, L., Korman, A., Natale, E.: Minimizing message size in stochastic communication patterns: fast self-stabilizing protocols with 3 bits. In: Proceedings of the 28th Annual ACM-SIAM Symposium on Discrete Algorithms, SIAM, pp. 2540–2559 (2017)
15. Cardelli, L., Csikász-Nagy, A.: The cell cycle switch computes approximate majority. Sci. Rep. 2, 656–656 (2011)
16. Chazelle, B.: Natural algorithms. In: Proceedings of the 20th Annual ACM-SIAM Symposium on Discrete Algorithms, SIAM, pp. 422–431 (2009)
17. Conradt, L., Roper, T.J.: Group decision-making in animals. Nature 421(6919), 155–158 (2003)
18. Cooper, C., Elsässer, R., Radzik, T.: The power of two choices in distributed voting. In: Automata, Languages, and Programming, vol. 8573 of Lecture Notes in Computer Science. Springer, pp. 435–446 (2014)
19. Demers, A., Greene, D., Hauser, C., Irish, W., Larson, J., Shenker, S., Sturgis, H., Swinehart, D., Terry, D.: Epidemic algorithms for replicated database maintenance. In: Proceedings of the 6th Annual ACM Symposium on Principles of Distributed Computing, ACM, pp. 1–12 (1987)
20. Doerr, B., Goldberg, L.A., Minder, L., Sauerwald, T., Scheideler, C.: Stabilizing consensus with the power of two choices. In: Proceedings of the 23rd Annual ACM Symposium on Parallelism in Algorithms and Architectures, ACM, pp. 149–158 (2011)
21. Draief, M., Vojnovic, M.: Convergence speed of binary interval consensus. SIAM J. Control Optim. 50(3), 1087–1109 (2012)
22. Dubhashi, D.P., Panconesi, A.: Concentration of Measure for the Analysis of Randomized Algorithms. Cambridge University Press, Cambridge (2009)
23. El Gamal, A., Kim, Y.-H.: Network Information Theory. Cambridge University Press, Cambridge (2011)
24. Elsässer, R., Friedetzky, T., Kaaser, D., Mallmann-Trenn, F., Trinker, H.: Brief announcement: rapid asynchronous plurality consensus. In: Proceedings of the 37th ACM Symposium on Principles of Distributed Computing, ACM, pp. 363–365 (2017)
25. Feinerman, O., Haeupler, B., Korman, A.: Breathe before speaking: efficient information dissemination despite noisy, limited and anonymous communication. In: Proceedings of the 34th ACM Symposium on Principles of Distributed Computing, ACM, pp. 114–123. Extended abstract of [26] (2014)
26. Feinerman, O., Haeupler, B., Korman, A.: Breathe before speaking: efficient information dissemination despite noisy, limited and anonymous communication. Distrib. Comput. 30(5), 1–17 (2015)
27. Franks, N.R., Pratt, S.C., Mallon, E.B., Britton, N.F., Sumpter, D.J.: Information flow, opinion polling and collective intelligence in house-hunting social insects. Philos. Trans. R. Soc. Lond. B Biol. Sci. 357(1427), 1567–1583 (2002)
28. Franks, N.R., Dornhaus, A., Fitzsimmons, J.P., Stevens, M.: Speed versus accuracy in collective decision making. Proc. Biol. Sci. 270(1532), 2457–2463 (2003)
29. Ghaffari, M., Parter, M.: A polylogarithmic gossip algorithm for plurality consensus. In: Proceedings of the 36th ACM Symposium on Principles of Distributed Computing, ACM, pp. 117–126 (2016)
30. Giakkoupis, G., Berenbrink, P., Friedetzky, T., Kling, P.: Efficient plurality consensus, or: the benefits of cleaning up from time to time. In: Proceedings of the 43rd International Colloquium on Automata, Languages, and Programming, vol. 55, pp. 136:1–136:14 (2016)
31. Jung, K., Kim, B.Y., Vojnović, M.: Distributed ranking in networks with limited memory and communication. In: Proceedings of the 2012 IEEE International Symposium on Information Theory, IEEE, pp. 980–984 (2012)
32. Karp, R., Schindelhauer, C., Shenker, S., Vocking, B.: Randomized rumor spreading. In: Proceedings of the 41st Annual Symposium on Foundations of Computer Science, IEEE, pp. 565–574 (2000)
33. Kempe, D., Dobra, A., Gehrke, J.: Gossip-based computation of aggregate information. In: Proceedings of the 44th Annual Symposium on Foundations of Computer Science, IEEE, pp. 482–491 (2003)
34. Korman, A., Greenwald, E., Feinerman, O.: Confidence sharing: an economic strategy for efficient information flows in animal groups. PLoS Comput. Biol. 10(10), e1003862–e1003862 (2014)
35. Land, M., Belew, R.: No perfect two-state cellular automata for density classification exists. Phys. Rev. Lett. 74(25), 5148–5150 (1995)
36. Mitzenmacher, M., Upfal, E.: Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, Cambridge (2005)
37. Perron, E., Vasudevan, D., Vojnovic, M.: Using three states for binary consensus on complete graphs. In: Proceedings of 28th IEEE INFOCOM (2009)
38. Pittel, B.: On spreading a rumor. SIAM J. Appl. Math. 47(1), 213–223 (1987)
39. Robbins, H.: A remark on Stirling's formula. Am. Math. Mon. 62, 26–29 (1955)
40. Seeley, T.D., Buhrman, S.C.: Group decision making in swarms of honey bees. Behav. Ecol. Sociobiol. 45(1), 19–31 (1999)
41. Seeley, T.D., Visscher, P.K.: Quorum sensing during nest-site selection by honeybee swarms. Behav. Ecol. Sociobiol. 56(6), 594–601 (2004)
42. Sumpter, D.J., Krause, J., James, R., Couzin, I.D., Ward, A.J.: Consensus decision making by fish. Curr. Biol. 18(22), 1773–1777 (2008)
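As a numerical companion to the diagonally dominant example of Section 4, the following minimal sketch multiplies that 3×3 matrix by the δ-biased distribution $c = (1/2 + \delta, 1/2 - \delta, 0)^\top$ and reports which opinion looks most frequent after the noise. The function names and the choice ε = δ = 1/10 are illustrative assumptions, not part of the paper; exact rational arithmetic is used only to avoid rounding questions.

```python
from fractions import Fraction as F


def noisy_view(P, c):
    """Return the matrix-vector product P·c for a square matrix P given as a list of rows."""
    return [sum(p * cj for p, cj in zip(row, c)) for row in P]


def diagonally_dominant_counterexample(eps, delta):
    """Build the 3x3 diagonally dominant matrix of Section 4 and the delta-biased vector c.

    The helper name is hypothetical; only the matrix entries come from the text.
    """
    half = F(1, 2)
    P = [
        [half + eps, F(0), half - eps],
        [half - eps, half + eps, F(0)],
        [F(0), half - eps, half + eps],
    ]
    c = [half + delta, half - delta, F(0)]
    return P, c


# Illustrative parameters (an assumption): eps = delta = 1/10, both below 1/6.
P, c = diagonally_dominant_counterexample(F(1, 10), F(1, 10))
view = noisy_view(P, c)
plurality_seen = view.index(max(view)) + 1  # 1-based opinion label
print(view, plurality_seen)
```

With these parameters opinion 1 holds the majority in $c$, yet the largest entry of $P \cdot c$ is the second one, which is the loss of majority the text describes for $\epsilon, \delta < 1/6$.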

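The parity statement of Lemma 17 (Eq. (20)) can be checked exactly for small sample sizes. The sketch below, under the assumption that ties are broken by a fair coin as in the proof, computes $\Pr(\mathrm{maj}_\ell(u) = 1)$ for $k = 2$ from the binomial distribution; the function name and the values $p_1 = 3/5$, $\ell = 5$ are hypothetical choices for illustration.

```python
from fractions import Fraction
from math import comb


def prob_maj_is_1(ell, p1):
    """Exact Pr(maj_ell(u) = 1) for k = 2.

    Each of the ell sampled messages is opinion 1 with probability p1,
    and a tie (possible only for even ell) is broken by a fair coin.
    """
    p1 = Fraction(p1)
    p2 = 1 - p1
    total = Fraction(0)
    for x in range(ell + 1):
        pmf = comb(ell, x) * p1**x * p2 ** (ell - x)
        if 2 * x > ell:
            total += pmf          # opinion 1 is the strict majority
        elif 2 * x == ell:
            total += pmf / 2      # tie, opinion 1 wins half of the time
    return total


p1 = Fraction(3, 5)               # illustrative value with p1 >= 1/2
odd, even, next_odd = (prob_maj_is_1(n, p1) for n in (5, 6, 7))
print(odd, even, next_odd)
```

For an odd $\ell$ the first two values coincide and the third is no smaller, matching the chain $\Pr(\mathrm{maj}_\ell(u) = 1) = \Pr(\mathrm{maj}_{\ell+1}(u) = 1) \leq \Pr(\mathrm{maj}_{\ell+2}(u) = 1)$.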
Journal

Distributed Computing, Springer Journals

Published: Jun 6, 2018
