# Random multi-hopper model: super-fast random walks on graphs

## Abstract

We develop a mathematical model considering a random walker with long-range hops on arbitrary graphs. The random multi-hopper can jump to any node of the graph from an initial position, with a probability that decays as a function of the shortest-path distance between the two nodes in the graph. We consider here two decaying functions in the form of Laplace and Mellin transforms of the shortest-path distances. We prove that when the parameters of these transforms approach zero asymptotically, the hitting time in the multi-hopper approaches the minimum possible value for a normal random walker. We show by computational experiments that the multi-hopper explores a graph with clusters or skewed degree distributions more efficiently than a normal random walker. We provide computational evidence of the advantages of the random multi-hopper model with respect to the normal random walk by studying deterministic, random and real-world networks.

## 1. Introduction

Many of the complex systems existing in the real world are better represented as a network than as a continuous system. This includes ecological and biomolecular, social and economic as well as infrastructural and technological networks [1]. The use of random walk models in these systems also provides a large variety of possibilities ranging from the analysis of the diffusion of information and navigability on these networks to the exploration of their structures to detect their fine-grained organization [2, 3]. From a mathematical point of view, these networks are nothing but graphs, and we will use both terms interchangeably here, with preference for the term network when a real-world system is represented. The first work exploring the use of random walks on graphs is credited to Pólya (1921), who was walking in a park and crossed the same couple very often, leading him to consider recurrence relations for random walks [4].
In the mid 1990s, the group of Senft and Ehrlich [5] observed experimentally the self-diffusion of weakly bound Pd atoms. They observed significant contributions to the thermodynamical properties of the system from jumps spanning second and third nearest-neighbours in the metallic surface, which can be considered as a regular lattice. Then, in 1997 the group of Linderoth et al. [6] observed experimentally that for the self-diffusion of Pt atoms on the Pt(110) surface the jumps from non-nearest neighbours also contribute to the diffusion. Even more surprising were the results of the same group when they studied the diffusion of two large organic molecules on the Cu(110) surface. In this case, using scanning tunnelling microscopy, it was observed that long jumps play a dominating role in the diffusion of the two organic molecules, with root-mean-square jump lengths as large as $$3.9$$ and $$6.8$$ lattice spacings [7]. Since then the role of long jumps in adatoms and molecules diffusing on metallic surfaces has been both theoretically and experimentally confirmed in many different systems [8, 9].

Due to this experimental evidence, some attempts were made to consider long-range jumps in the diffusion of a particle on a regular lattice. The first of them was the paper entitled Lattice walks by long jumps by Wrigley et al. [10]. Other works have considered that the space in which the diffusion takes place is continuous, and then applied a random-walk model with Lévy flights to model these long-range effects (see below). However, the development of a general multi-hopper model, in which a random walker hops to any node of a general graph with probabilities that depend on the distance separating the corresponding nodes, is still missing in the literature. Apart from the physical scenarios related to the diffusion of adatoms and admolecules on metallic surfaces, long-range jumps on graphs are of general interest.
For instance, in social networks one can take advantage of the full or partial knowledge of the network beyond first acquaintances to diffuse information in a swifter way than can be done by the traditional nearest-neighbour only strategy. In exploring technological and infrastructural networks we can exploit our knowledge of the topology of the network to jump from a position to non-nearest-neighbours in such a way that the whole system can be explored in shorter time. In 2012, two groups published independently models that are designed to account for all potential long jumps that a random walker can take on a graph. In one of these papers Mateos and Riascos [11] proposed a random walk model with jumps that can go to a non-nearest neighbour with a probability that decays as a power-law of the shortest-path distance separating the two nodes (all formal definitions are done in the Preliminaries section of this paper). In the other work, Estrada [12] generalized the concept of graph Laplacian by introducing the k-path Laplacians, which were then plugged into a generalized diffusion equation to study the influence of long-range jumps on diffusive processes on graphs. While the first paper provides a probabilistic approach to the problem, the second one provides the algebraic tools needed for its generalization and mathematical formalization. In this work, we will consider the generalized formulation of a random walk model with long-range jumps that decay as a function—not necessarily a power-law—of the shortest path distance between the nodes. We propose to call this model the random multi-hopper (RMH). We proceed by using the random walk–electrical networks connection discovered by Doyle in order to formulate mathematically the RMH model and some of its main parameters, namely the hitting and commute times. 
We prove analytically here that for certain asymptotic values of the parameters of the model, the average hitting time of the random walker is the smallest possible for a classical random walker. We study here some deterministic graphs for which some extremal properties of random walks are known, such as lollipop, barbell and path graphs. Then, we move to the analysis of random networks; in particular, we explore here the Erdős–Rényi and the Barabási–Albert models. We finally study a real-world network representing the power grid of the western USA. In all cases, we compare the RMH with the normal random-walk (NRW) model, and conclude that the multi-hopper overcomes several of the difficulties that a normal random walker has to explore a graph. In particular, the multi-hopper explores more efficiently graphs with clusters of highly interconnected nodes as well as graphs with very skewed degree distributions, when compared with the normal random walker. These characteristics, which are omnipresent in real-world networks, make the RMH an excellent choice for transport and search on complex systems.

## 2. Preliminaries

We introduce in this section some definitions and properties associated with random walks on graphs and define the notation used throughout the work. A graph $$G=(V,E)$$ is defined by a set of $$n$$ nodes (vertices) $$V$$ and a set of $$m$$ edges $$E=\{(u,v)|u,v\in V\}$$ between the nodes. All the graphs considered in this work are finite, undirected, simple, without self-loops, and connected. A path of length $$k$$ in $$G$$ is a sequence of different nodes $$u_{1},u_{2},\ldots,u_{k},u_{k+1}$$ such that for all $$1\leq l\leq k$$, $$(u_{l},u_{l+1})\in E$$. The length of a shortest path between two nodes $$i$$ and $$j$$ constitutes a distance function, here designated by $$d\left(i,j\right)$$, which is known as the shortest-path distance between the nodes $$i$$ and $$j$$.
Let $$A=\left(a_{ij}\right)_{n\times n}$$ be the adjacency matrix of the graph where $$a_{ij}=1$$ if and only if $$\left(i,j\right)\in E$$ and it is zero otherwise. The degree of the node $$i$$ is the number of nodes adjacent to it and it is designated here by $$k\left(i\right)$$. A random walk on the graph is a random sequence of vertices generated as follows. Given a starting vertex $$i$$ we select a neighbour $$j$$ at random, and move to this neighbour. Then we select a neighbour $$k$$ of $$j$$ at random, and move to it, and so on [13, 14]. This sequence of random nodes $$v_{t}:t=0,1,\ldots$$ is a Markov chain, and it has probability distribution $$\mathbf{p}_{t}\left(q\right)=\Pr\left(v_{t}=q\right)\!.$$ (2.1) The transition probability matrix is $$P=\left(p_{ij}\right)_{i,j\in V}$$ whose entries are given by $$p_{ij}=\dfrac{a_{ij}}{k_{i}},$$ (2.2) where $$k_{i}$$ is the degree of the node $$i$$, and $$a_{ij}$$ is the corresponding entry of the adjacency matrix. Let $$K$$ be the diagonal matrix whose entries are $$K_{ii}=k_{i}$$, so that $$P=K^{-1}A$$. Then, the vector containing the probability of finding the walker at a given node of the graph at time $$t$$ is $$\mathbf{p}_{t}=\left(P^{T}\right)^{t}\mathbf{p}_{0},$$ (2.3) where $$T$$ represents the transpose of the matrix and $$\mathbf{p}_{0}$$ is the initial probability distribution.

The following are important parameters of the random walk on a graph which are of direct utility in the current work. For a random walk starting at the node $$i$$, the expected number of steps before it reaches node $$j$$ is known as the hitting time and it is designated by $$H\left(i,j\right)$$. The expected number of steps of a random walk starting at node $$i$$ to visit node $$j$$ and then return to node $$i$$ again is known as the commute time, and it is designated by $$\kappa\left(i,j\right)$$. Both quantities are related by: $$\kappa\left(i,j\right)=H\left(i,j\right)+H\left(\,j,i\right)\!.$$ (2.4)

## 3. RMH model

### 3.1 Intuition of the model

In the normal random walk in a graph, the random walker makes steps of length one (in terms of the number of edges travelled) and after each step she rolls the dice again to decide where to move. This is reminiscent of a drunkard who does not remember where her home is, and so she stops at every junction that she finds and decides where to go next in a random way. Let us now suppose the existence of a random walker who does not necessarily stop at a nearest neighbour of her current position. That is, suppose that the random walker placed at the node $$i$$ of the graph selects any node $$q$$ of the graph to which she wants to move. Let us consider that the shortest path distance between $$i$$ and $$q$$ is $$d\left(i,q\right)$$. If $$d\left(i,q\right)>1$$ the random walker will not stop at any of the intermediate nodes between $$i$$ and $$q$$, but she will go directly to that node. This would correspond to a drunkard who thinks she remembers where her home is, then walks a few blocks without stopping at any junction, until she arrives at a given point where she realizes she is lost. Then, she repeats the process again. Therefore, the movements of the drunkard are modelled by a graph, the edges of which are shortest paths of various lengths of the original graph $$G$$, and the probability for the random walker to jump straight from node $$i$$ to node $$q$$ is proportional to a certain weight $$a\left(i,q\right)$$, which is a function of the distance $$d\left(i,q\right)$$ in $$G$$. As examples of decaying functions of the shortest path distance, we mention $$a\left(i,q\right)=\exp\left(-l\cdot d\left(i,q\right)\right)$$, $$a\left(i,q\right)=\left(d\left(i,q\right)\right)^{-s}$$ and $$a\left(i,q\right)=z^{-d\left(i,q\right)}$$, for $$l>0$$, $$s>0$$, and $$z>1$$, respectively.
Hereafter, we will consider the first two for the analysis, called respectively the Laplace transform case and the Mellin transform case, by analogy with these transforms. As an example, let us consider the one-dimensional case consisting of a linear chain of five nodes labelled as 1-2-3-4-5. Let us consider that the attractiveness of a given node from the current position is given by $$a\left(i,q\right)=\exp\left(-0.5\cdot d\left(i,q\right)\right)$$. If the drunkard is placed at Node 1 she has the following probabilities of having a walk of length 1, 2, 3 or 4: 0.46, 0.28, 0.17 and 0.10, respectively (rounded to two decimals). That is, the probability that she still stops at the nearest node from her current position is higher than that for the rest of the nodes, but the latter are not zero as in the classical random walk. If the attractiveness of the farthest nodes is dramatically reduced, the probabilities approach those of the classical random walker. For instance, if $$a\left(i,q\right)=\exp\left(-5\cdot d\left(i,q\right)\right)$$, the probabilities of having a walk of length 1, 2, 3 or 4 are: 0.9933, 0.0067, 0.0000 and 0.0000, respectively. For $$l=10$$, we recover the classical random walk model to four decimal places, with probabilities 1, 0, 0 and 0 for the walks of length 1, 2, 3 and 4, respectively. This clearly indicates that the current model is a generalization of the classical random walk model in which the random walker is allowed, with a certain probability, to travel to a non-nearest neighbour of her current position.

### 3.2 Mathematical formulation

Let $$G=\left(V,E\right)$$ be a simple, undirected graph without self-loops. Let $$d_{max}$$ be the graph diameter, that is, the maximum shortest path distance in the graph.
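As a quick check of the five-node chain example from Section 3.1, the hop probabilities from Node 1 can be recomputed by normalizing the Laplace weights $$\exp\left(-l\cdot d\right)$$ over the distances 1, 2, 3 and 4. The following is a minimal sketch; the function name `hop_probabilities` is an illustrative choice and does not come from the paper.

```python
import math

def hop_probabilities(distances, weight):
    """Normalize distance-dependent weights a(i, q) into hop probabilities."""
    weights = [weight(d) for d in distances]
    total = sum(weights)
    return [w / total for w in weights]

# From Node 1 of the chain 1-2-3-4-5, the other nodes lie at distances 1, 2, 3, 4.
distances = [1, 2, 3, 4]

for l in (0.5, 5.0, 10.0):
    # Laplace-type weight exp(-l * d); l is the decay parameter of the text.
    probs = hop_probabilities(distances, lambda d: math.exp(-l * d))
    print(f"l = {l}: {[round(p, 4) for p in probs]}")
```

Increasing `l` concentrates the probability on the nearest neighbour, which is the sense in which the classical random walk is recovered for large values of the parameter.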
Let us now define the $$d$$-path adjacency matrix ($$d\leq d_{max}$$), denoted by $$A_{d}$$, of a connected graph of $$n$$ nodes as the square, symmetric, $$n\times n$$ matrix whose entries are: $$A_{d}\left(i,j\right)=\begin{cases}1 & \text{if }d_{ij}=d,\\ 0 & \text{otherwise,}\end{cases}$$ (3.1) where $$d_{ij}$$ is the shortest path distance, that is, the number of edges in the shortest path connecting the nodes $$i$$ and $$j$$. The $$d$$-path degree of the node $$i$$ is given by [11, 12] $$k_{d}\left(i\right)=\left(\mathbf{1}^{T}A_{d}\right)_{i},$$ (3.2) where $$\mathbf{1}$$ is an all-ones column vector.

Let us now consider the following transformed $$k$$-path adjacency matrices [12]: $${\hat{A}}^{\tau}=\sum_{d=1}^{d_{max}}c_{d}^{\tau}A_{d},$$ (3.3) where $$\tau$$ indicates the type of transformation, for instance $$c_{d}^{\tau=\text{Mel}}=d^{-s}$$ for $$s>0$$ is the Mellin transform and $$c_{d}^{\tau=\text{Lapl}}=\exp\left(-l\cdot d\right)$$ for $$l>0$$ is the Laplace transform. Let us define the generalized degree of a given node as $$\hat{k}^{\tau}\left(i\right)=\left(\hat{A}^{\tau}\mathbf{1}\right)_{i}.$$ (3.4) Now we define the probability that a particle staying at node $$i$$ hops to the node $$j$$ as $$P^{\tau}\left(i,j\right)=\hat{A}^{\tau}\left(i,j\right)/\hat{k}^{\tau}\left(i\right)\!.$$ (3.5) Notice that if we do not consider any long-range interaction, then $$\hat{A}^{\tau}=A$$ and $$P^{\tau}\left(i,j\right)=1/k\left(i\right)$$ if $$\left(i,j\right)\in E$$ and zero otherwise, where $$k\left(i\right)$$ is the degree of the node. That is, we recover the classical random walk probability. Let us denote by $$\hat{K}^{\tau}$$ the diagonal matrix with $$\hat{K}^{\tau}\left(i,i\right)=\hat{k}^{\tau}\left(i\right)$$ and let us define the matrix $$\hat{P}^{\tau}=\left(\hat{K}^{\tau}\right)^{-1}\hat{A}^{\tau}$$.
Then, the evolution equation ruling the states of the walker at a given time step is given by $$\mathbf{p}_{t+1}=\left(\hat{P}^{\tau}\right)^{T}\mathbf{p}_{t}.$$ (3.6)

### 3.3 $$k$$-path Laplacians, hitting and commute times

In a similar way as for the Laplacian matrix of a graph, we define the Laplacian matrix corresponding to (3.3). That is, $$\hat{L}^{\tau}=\hat{K}^{\tau}-\hat{A}^{\tau},$$ (3.7) where $$\hat{K}^{\tau}$$ is the diagonal matrix of generalized degrees $$\hat{k}^{\tau}\left(i\right)$$ defined in (3.4) and $$\hat{A}^{\tau}$$ is the generalized adjacency matrix defined in (3.3). This is the Laplacian of the graph $$G^{\tau}$$, the (weighted) adjacency matrix of which is $$\hat{A}^{\tau}$$. As a result, this generalized Laplacian $$\hat{L}^{\tau}$$ is positive semi-definite. The graph $$G^{\tau}$$ can be seen as a network of resistances, with the entries of $$\hat{A}^{\tau}$$ representing the conductance (inverse resistance) of the connection between two nodes. By assuming that an electric current is flowing through the network $$G^{\tau}$$ by entering at the node $$i$$ and leaving at the node $$j$$ we can calculate the effective resistance between these two nodes as follows: $$\hat{\Omega}^{\tau}\left(i,j\right)=\hat{\mathcal{L}}^{\tau}\left(i,i\right)+\hat{\mathcal{L}}^{\tau}\left(\,j,j\right)-2\hat{\mathcal{L}}^{\tau}\left(i,j\right)\!,$$ (3.8) where $$\hat{\mathcal{L}}^{\tau}$$ is the Moore–Penrose pseudo-inverse of the generalized Laplacian matrix. It is well known that the analogue of this effective resistance for the simple graph is a distance between the corresponding pair of nodes. It is straightforward to show that this is also the case here and we will call $$\hat{\Omega}^{\tau}\left(i,j\right)$$ the generalized effective resistance between the nodes $$i$$ and $$j$$ in a graph. The sum of all resistance distances in a graph is known as the Kirchhoff index of the graph.
In the context of the multi-hopper model it can be defined in a similar way as $$\hat{\Omega}^{\tau}_{\rm tot}=\sum_{i<j}\hat{\Omega}^{\tau}\left(i,j\right)=\dfrac{1}{2}\mathbf{1}^{T}\hat{\Omega}^{\tau}\mathbf{1}.$$ (3.9) Then, an extension of a result obtained by Nash-Williams [15] and by Chandra et al. [16] allows us to calculate the commute and hitting times based on the generalized resistance distance. That is, the commute time between the corresponding nodes is given by: $$\hat{\kappa}^{\tau}\left(i,j\right)=vol\left(G^{\tau}\right)\hat{\Omega}^{\tau}\left(i,j\right)\!,$$ (3.10) where $$vol\left(G^{\tau}\right)$$ is the volume of $$G^{\tau}$$, that is, the sum of the generalized degrees of its nodes, which equals twice the sum of the weights of the edges of $$G^{\tau}$$ (see for instance [17]). Using the scaled generalized fundamental matrix (SGFM) (see [18, 19]), we can express the hitting and commute times in matrix form as $$\hat{H}^{\tau}=\mathbf{1}\left[{\rm diag}\left(\tilde{Z^{\tau}}^{-1}\right)\right]^{T}-\tilde{Z^{\tau}},$$ (3.11) $$\hat{\kappa}^{\tau}=\mathbf{1}\left[{\rm diag}\left(\tilde{Z^{\tau}}^{-1}\right)\right]^{T}+\left[{\rm diag}\left(\tilde{Z^{\tau}}^{-1}\right)\right]\mathbf{1}^{T}-\tilde{Z^{\tau}}-\tilde{Z^{\tau}}^{T},$$ (3.12) where $$\tilde{Z^{\tau}}$$ is just the SGFM for the graph $$G^{\tau}$$. The expected commute time averaged over all pairs of nodes can be easily obtained from the multi-hopper Kirchhoff index as $$\left\langle \hat{\kappa}^{\tau}\right\rangle =\dfrac{4\left(\mathbf{1}^{T}\mathbf{w^{\tau}}\right)}{n\left(n-1\right)}\hat{\Omega}^{\tau}_{\rm tot},$$ (3.13) where $$\mathbf{w^{\tau}}$$ is the vector containing the weight of each edge in the graph $$G^{\tau}$$.
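The chain from the transformed adjacency matrix (3.3) to the commute time (3.10) can be illustrated on a small graph. The sketch below is not the authors' implementation and rests on two stated assumptions: instead of the Moore–Penrose pseudo-inverse used in (3.8), it grounds node $$j$$ and solves the reduced Laplacian system, which gives the same effective resistance; and it takes $$vol\left(G^{\tau}\right)$$ as the sum of the generalized degrees (twice the total edge weight), the convention that agrees with (3.13). All function names are illustrative.

```python
from collections import deque

def shortest_path_distances(adj):
    """All-pairs shortest-path distances by BFS; adj[u] lists the neighbours of u."""
    n = len(adj)
    dist = [[0] * n for _ in range(n)]
    for src in range(n):
        seen = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        for v, d in seen.items():
            dist[src][v] = d
    return dist

def transformed_adjacency(adj, coeff):
    """Entries coeff(d_ij) off the diagonal and 0 on it, as in (3.3)."""
    dist = shortest_path_distances(adj)
    n = len(adj)
    return [[coeff(dist[i][j]) if i != j else 0.0 for j in range(n)] for i in range(n)]

def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

def effective_resistance(A, i, j):
    """Ground node j, inject a unit current at i; the potential at i is the resistance."""
    n = len(A)
    keep = [u for u in range(n) if u != j]
    # Reduced Laplacian: generalized degree on the diagonal, minus the weights elsewhere.
    L = [[(sum(A[u]) if u == v else 0.0) - A[u][v] for v in keep] for u in keep]
    b = [1.0 if u == i else 0.0 for u in keep]
    return solve(L, b)[keep.index(i)]

def commute_time(A, i, j):
    """Commute time as volume times effective resistance, following (3.10)."""
    vol = sum(sum(row) for row in A)  # assumption: sum of generalized degrees
    return vol * effective_resistance(A, i, j)

# Path on five nodes with Mellin coefficients d**(-s), s = 2.
path5 = [[1], [0, 2], [1, 3], [2, 4], [3]]
A_mellin = transformed_adjacency(path5, lambda d: d ** -2.0)
print(commute_time(A_mellin, 0, 4))
```

Passing `lambda d: 1.0 if d == 1 else 0.0` as the coefficient recovers the normal random walk, which gives a simple sanity check against known commute times on small graphs.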
In a similar way we can obtain the expected hitting time averaged over all pairs of nodes $$\left\langle \hat{H}^{\tau}\right\rangle =\dfrac{2\left(\mathbf{1}^{T}\mathbf{w^{\tau}}\right)}{n\left(n-1\right)}\hat{\Omega}^{\tau}_{\rm tot}.$$ (3.14)

In order to understand the mechanism behind the efficiency of the multi-hopper random walker to explore networks, it is important to relate the structure of the network with dynamical quantities. In the following part we study the stationary probability distribution and the mean first return time and its relation with the distances in the network. The stationary probability distribution vector $$\mathbf{\pi}^{\tau}$$ is given by: $${\mathbf{\pi}}^{\tau}=\frac{\hat{A}^{\tau}\mathbf{1}}{\mathbf{1}^{T}\hat{A}^{\tau}\mathbf{1}}.$$ (3.15) Taking into account the definition of the matrix $$\hat{P}^{\tau}$$, we obtain for the elements $$\pi^{\tau}(i)$$ with $$i=1,2,\ldots,n$$ of the stationary probability distribution: $$\pi^{\tau}(i)=\frac{\hat{k}^{\tau}(i)}{\sum_{j=1}^{n}\hat{k}^{\tau}(\,j)}.$$ (3.16) In this way, we obtain for the Mellin transformation with parameter $$s$$: $$\hat{k}^{\tau=\text{Mel}}(i)=k\left(i\right)+k_{2}\left(i\right)\frac{1}{2^{s}}+k_{3}\left(i\right)\frac{1}{3^{s}}+\ldots+k_{d_{\rm max}}\left(i\right)\frac{1}{d_{\rm max}^{s}},$$ (3.17) and for the Laplace transformation with parameter $$l$$: $$\hat{k}^{\tau=\text{Lapl}}(i)=\left[k\left(i\right)+k_{2}\left(i\right)\frac{1}{e^{l}}+k_{3}\left(i\right)\frac{1}{e^{2l}}+\ldots+k_{d_{\rm max}}\left(i\right)\frac{1}{e^{l(d_{\rm max}-1)}}\right]\frac{1}{e^{l}}.$$ (3.18) The stationary probability distribution in Equation (3.16) determines the probability to find the random walker at the node $$i$$ in the limit of large $$t$$.
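Equations (3.16) to (3.18) translate directly into a few lines of code. The sketch below, with illustrative function names that are not taken from the paper, computes the generalized degrees from breadth-first-search distances and normalizes them into the stationary distribution; the star graph used as an example is an assumption chosen because its hub and leaves have very different degrees.

```python
from collections import deque
import math

def generalized_degrees(adj, coeff):
    """Generalized degree of each node: sum over j != i of coeff(d(i, j)), as in (3.4)."""
    degrees = []
    for src in range(len(adj)):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        degrees.append(sum(coeff(d) for node, d in dist.items() if node != src))
    return degrees

def stationary_distribution(adj, coeff):
    """pi(i) = k_hat(i) / sum_j k_hat(j), as in (3.16)."""
    k_hat = generalized_degrees(adj, coeff)
    total = sum(k_hat)
    return [k / total for k in k_hat]

def mellin(s):
    """Mellin coefficient c_d = d**(-s)."""
    return lambda d: d ** (-s)

def laplace(l):
    """Laplace coefficient c_d = exp(-l * d)."""
    return lambda d: math.exp(-l * d)

# Star graph: node 0 is the hub, nodes 1..4 are leaves.
star = [[1, 2, 3, 4], [0], [0], [0], [0]]
for s in (10.0, 1.0, 0.0):
    print("Mellin s =", s, [round(p, 4) for p in stationary_distribution(star, mellin(s))])
print("Laplace l = 1", [round(p, 4) for p in stationary_distribution(star, laplace(1.0))])
```

For large `s` the hub keeps about half of the stationary probability, as for the normal random walker, while `s = 0` gives every node the same weight.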
Expressions (3.17) and (3.18) allow us to identify how the structure of the network and the long-range strategy controlled by the parameters $$s$$ or $$l$$ combine in order to change the stationary probability distribution. In the limit of $$s,l\to\infty$$ the long-range contribution is null and the result $$\pi(i)=\frac{k_{i}}{\sum_{l=1}^{n}k_{l}}$$ for the normal random walker is recovered. On the other hand, when $$s,l\to0$$, the dynamics includes, in the same proportion, contributions of neighbours, second-, third-,..., and $$d_{max}$$-nearest neighbours. In the limit, the stationary probability distribution is the same for all the nodes and $$\pi(i)=1/n$$.

## 4. On the hitting time in the RMH model

The most important result of this work is related to the average hitting time of the RMH walk when the parameters of the corresponding transforms tend to zero.

Lemma 1. Let us consider the transformed $$k$$-path adjacency matrices: $$\hat{A}^{\tau}=\sum_{d=1}^{d_{max}}c_{d}^{\tau}A_{d},$$ (4.1) with $$c_{d}^{\tau=\text{Mel}}=d^{-s}$$ for $$s>0$$ and $$c_{d}^{\tau=\text{Lapl}}=\exp\left(-l\cdot d\right)$$ for $$l>0$$. Then, when $$s\rightarrow0$$ or $$l\rightarrow0$$ the average hitting time $$\left\langle \hat{H}^{\tau}\right\rangle \rightarrow\left(n-1\right)$$, independently of the topology of the graph, which is the minimum for any graph of $$n$$ nodes.

Proof. First, let $$G_{n}$$ be any connected, simple graph with $$n$$ nodes. Then, $$\left\langle H\left(G_{n}\right)\right\rangle \geq\left\langle H\left(K_{n}\right)\right\rangle =n-1$$, with equality if and only if $$G_{n}=K_{n}$$, where $$K_{n}$$ is the complete graph with $$n$$ nodes (see [20] for a couple of proofs). Then, when $$s\rightarrow0$$ or $$l\rightarrow0$$ we have that $$\hat{A}^{\text{Mel}}\left(i,j\right)\rightarrow1$$ and $$\hat{A}^{\text{Lapl}}\left(i,j\right)\rightarrow1$$, respectively, for every pair of different nodes.
This means that $$\hat{A}^{\tau}\left(i,j\right)=1$$ for all $$i\neq j$$ and $$\hat{A}^{\tau}\left(i,j\right)=0$$ for all $$i=j$$. In other words, $$\hat{A}^{\tau}=\mathbf{1}\mathbf{1}^{T}-I$$, which is the adjacency matrix of the complete graph $$K_{n}$$. In short, when $$s\rightarrow0$$ or $$l\rightarrow0$$, $$\hat{A}^{\tau}\rightarrow A\left(K_{n}\right)$$. As it has been previously proved that $$\left\langle H\left(K_{n}\right)\right\rangle =n-1$$, this proves the result. □

At first sight the result may seem trivial. We replace the graph by a weighted complete graph, thus the hitting time should be simply that of a weighted complete graph. The problem is, however, that the weights that every edge in this complete graph receives depend on the structure of the original graph $$G$$. Then, only at the point when $$s=0$$ or $$l=0$$ do the results coincide with those of the complete graph. However, neither the rate of decay of the hitting time as $$s\rightarrow0$$ or $$l\rightarrow0$$, nor its value at any specific $$s$$ or $$l$$, is trivial, and both strongly depend on the topology of the graph. To give an illustrative example we consider here six different graphs with the same size $$n=100$$. These are a barbell graph $$B\left(100,33,33\right)$$, the path $$P_{100}$$, the cycle $$C_{100}$$, the square lattice $$10\times10$$, the triangular lattice $$10\times10$$ and the star graph $$S_{1,99}$$. A barbell graph $$B\left(n,k_{1},k_{2}\right)$$ is a graph with $$n$$ nodes and two cliques of sizes $$k_{1}$$ and $$k_{2}$$, connected by a path consisting of the remaining nodes. In Fig. 1, we illustrate the results for the hitting time as a function of the parameters $$s$$ (left panel) and $$l$$ (right panel) in the Mellin and Laplace transforms defined before, respectively. As can be seen in both cases the rate of convergence to the smallest hitting time of all these graphs is significantly different from one structure to another.
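Lemma 1 can also be checked numerically on a small example. The sketch below is not the computation behind Fig. 1: it uses a path with $$n=8$$ nodes (an assumption made to keep the run short) and obtains hitting times by iterating the first-step equations $$H\left(i,j\right)=1+\sum_{k}P^{\tau}\left(i,k\right)H\left(k,j\right)$$ with $$H\left(j,j\right)=0$$, rather than through the fundamental-matrix expression (3.11). Function names and tolerances are illustrative.

```python
from collections import deque

def distances_from(adj, src):
    """Shortest-path distances from src by breadth-first search."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def transition_matrix(adj, coeff):
    """Row-normalized transformed adjacency matrix, as in (3.5)."""
    n = len(adj)
    P = []
    for i in range(n):
        dist = distances_from(adj, i)
        row = [coeff(dist[j]) if j != i else 0.0 for j in range(n)]
        total = sum(row)
        P.append([w / total for w in row])
    return P

def average_hitting_time(adj, coeff, tol=1e-12, max_sweeps=100000):
    """Average of H(i, j) over ordered pairs i != j, by Gauss-Seidel sweeps."""
    P = transition_matrix(adj, coeff)
    n = len(P)
    total = 0.0
    for target in range(n):
        h = [0.0] * n  # h[target] stays 0 throughout
        for _ in range(max_sweeps):
            change = 0.0
            for i in range(n):
                if i == target:
                    continue
                new = 1.0 + sum(P[i][k] * h[k] for k in range(n))
                change = max(change, abs(new - h[i]))
                h[i] = new
            if change < tol:
                break
        total += sum(h)
    return total / (n * (n - 1))

# Path graph on 8 nodes; Mellin coefficients d**(-s) for decreasing s.
path8 = [[j for j in (i - 1, i + 1) if 0 <= j < 8] for i in range(8)]
for s in (2.0, 0.5, 0.1, 0.0):
    print("s =", s, average_hitting_time(path8, lambda d: d ** (-s)))
```

At `s = 0` every node is reached with probability $$1/(n-1)$$ and the average hitting time equals $$n-1=7$$, while for positive `s` the value depends on the shortest-path structure of the graph.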
The case of the star is easy to analyse as it is formed only by shortest paths of length one and two. Thus its convergence to the best hitting time is very fast. The worst rate is that of the barbell graph, followed by the path and then the cycle. However, the largest diameter among these three graphs is not the one of the barbell graph but that of the path, which indicates that not only the lengths of the shortest paths involved influence this rate of convergence of the hitting time in general graphs (we will consider the barbell graphs later in this work). The square and triangular lattices show almost identical convergence rates although they also display different shortest-path structures. More important is to consider a specific value of the hitting time at a given value of $$s$$ or $$l$$. Let us consider for instance $$s=2$$ and $$l=2$$. In this case, the barbell graph displays a hitting time 50 times larger than that of the square and triangular lattices in the Mellin transform or 150 times larger in the case of the Laplace transform. All of this shows that the consideration of an RMH in graphs is far from the trivial replacement of a graph by a weighted complete graph.

Fig. 1. Change of the average hitting time in a series of graphs with the same size $$n=100$$ as a function of the parameters $$s$$ and $$l$$ of the Mellin (a) and Laplace (b) transforms, respectively.

## 5. Deterministic graphs

In this section, we study some of the properties of the multi-hopper model for some classes of graphs which have deterministic structure. We study the average hitting time of all $$11,117$$ connected graphs with 8 nodes. The average hitting time has mean $$10.036\pm1.932$$ for all the graphs with $$n=8$$, with a maximum value of $$21.071$$.
These values converge quickly to $$n-1$$ as $$s,l\rightarrow0$$. For instance, for $$s=0.5$$ the mean of the average hitting time is already $$7.062\pm0.024$$, and this value drops to $$7.00253\pm0.00096$$ for $$s=0.1$$. The situation is very similar for $$l\rightarrow0$$, and the mean of the average hitting time is $$7.0037\pm0.0045$$ for $$l=0.1$$ and $$7.000072\pm4.67\cdot10^{-5}$$ for $$l=0.01$$. In the next subsection, we study some specific families of graphs which frequently appear in bounds for the hitting and commute times in graphs.

### 5.1 Lollipop and barbell graphs

The first classes of graphs that we study here are the so-called lollipop and barbell graphs. The lollipop graphs appear in many extremal properties related to random walks on graphs. In 1990, Brightwell and Winkler [21] proved that the hitting time between a pair of nodes $$i$$ and $$j$$ in a graph is maximum for the lollipop graph $$L\left(n,\left\lfloor \tfrac{2n}{3}\right\rfloor \right)$$ consisting of a clique of $$\left\lfloor \tfrac{2n}{3}\right\rfloor$$ nodes including $$i$$ to which a path on the remaining nodes, ending in $$j$$, is attached. The same graph was found by Jonasson as the one containing the pair of nodes maximizing the commute time among all graphs [22]. Here we consider lollipop graphs $$L\left(n,\left\lfloor \tfrac{n}{2}\right\rfloor \right)$$ and $$L\left(n,\left\lfloor \tfrac{2n}{3}\right\rfloor \right)\!,$$ which have appeared already in the previous section, and the symmetric barbell graphs $$B\left(n,\left\lfloor \tfrac{n}{3}\right\rfloor ,\left\lfloor \tfrac{n}{3}\right\rfloor \right)$$. We want to remark that these graphs are not necessarily the extremal ones for the hitting time as discussed in the previous section but they can be considered as representative of their classes. We observe that the three graphs display $$\left\langle H\right\rangle \approx an^{3}$$ for the normal random walk.
The coefficients $$a$$ obtained by using nonlinear fitting of the hitting times with $$n$$ are: $$a\approx0.01387$$ for $$L\left(n,\left\lfloor \tfrac{n}{2}\right\rfloor \right)$$, $$a\approx0.0179$$ for $$B\left(n,\left\lfloor \tfrac{n}{3}\right\rfloor ,\left\lfloor \tfrac{n}{3}\right\rfloor \right)$$ and $$a\approx0.01928$$ for $$L\left(n,\left\lfloor \tfrac{2n}{3}\right\rfloor \right)$$. We then study the variation of the parameters $$s$$ and $$l$$ in the Mellin and Laplace transforms of the multi-hopper model for the three graphs previously studied. In Fig. 2, we illustrate the results of these calculations. An important aspect to remark at this point is the obvious difference between the use of the Mellin and Laplace transforms in the multi-hopper model. First, it is easily observed that the Mellin transform produces a faster decay of the average hitting time than the Laplace one for the three graphs. For instance, the 50% reduction in the average hitting time of the three graphs occurs for values of $$3.5<s<4.0$$ for the Mellin transform, but it happens for $$1.5<l<2.0$$ for the Laplace. This implies that the Laplace transform converges to an average hitting time of $$n-1$$ at much smaller values than the Mellin transform. The second important difference is observed in the insets of Fig. 2. For the Laplace-transformed multi-hopper, the lollipop graph $$L\left(n,\left\lfloor \tfrac{n}{2}\right\rfloor \right)$$ always has the largest value of the average hitting time at any value of $$l$$ among the three graphs studied. However, for the Mellin-transformed case, the lollipop graph $$L\left(n,\left\lfloor \tfrac{n}{2}\right\rfloor \right)$$ has the largest hitting time for large values of $$s$$, but for $$s\lesssim2.1$$ the lollipop graph $$L\left(n,\left\lfloor \tfrac{2n}{3}\right\rfloor \right)$$ is the one with the largest value of the average hitting time (see the crossing in the inset of Fig. 2).
This confirms the complexity of the analysis of the extremal graphs for the multi-hopper model as we have hinted at in the previous section.

Fig. 2. Hitting time as a function of the parameters $$s$$ and $$l$$ for the Mellin (left) and the Laplace (right) transforms, respectively, in lollipops $$L\left(n,\left\lfloor \tfrac{n}{2}\right\rfloor \right)$$ (blue squares), $$L\left(n,\left\lfloor \tfrac{2n}{3}\right\rfloor \right)$$ (red circles) and barbell $$B\left(n,\left\lfloor \tfrac{n}{3}\right\rfloor ,\left\lfloor \tfrac{n}{3}\right\rfloor \right)$$ (yellow triangles) graphs for $$n=999$$. In the inset panels, we zoom the plot for the region $$1.8\leq s\leq2.3$$ and $$0.01\leq l\leq1$$, respectively.

We now concentrate on the variation of the average hitting time with the number of nodes in the lollipop $$L\left(n,\left\lfloor \tfrac{2n}{3}\right\rfloor \right)$$ for different values of the parameters $$s$$ and $$l$$ (Fig. 3). For the Mellin transformed multi-hopper model with a fixed value of the exponent $$s$$, the average hitting time is always a power-law of the number of nodes: $$\left\langle \hat{H}^{\text{Mel}}\left(s\right)\right\rangle \approx an^{\gamma}$$, where $$\gamma\rightarrow3$$ when $$s\rightarrow\infty$$ and $$\gamma\rightarrow1$$ when $$s\rightarrow0$$.
For instance, $$\gamma=2.831$$ for $$s=3$$; $$\gamma=2.129$$ for $$s=2$$; $$\gamma=1.772$$ for $$s=1$$; $$\gamma=1.291$$ for $$s=0.5$$; and $$\gamma=1.011$$ for $$s=0.1$$. The situation is quite similar for the Laplace-transformed multi-hopper model. This observation indicates that for small values of the parameters $$s$$ and $$l$$ the average hitting time changes linearly with the number of nodes. This behaviour recurs for every family of graphs, as we will see in further sections of this work.
Fig. 3. Average hitting times for the RMH model with Mellin (left) and Laplace (right) transforms in the lollipop graph $$L\left(n,\left\lfloor \tfrac{2n}{3}\right\rfloor \right)$$ as a function of the number of nodes $$n$$ in the graph.
We then study the variation of the average hitting time with the number of nodes for the lollipop $$L\left(n,\left\lfloor \tfrac{2n}{3}\right\rfloor \right)$$ for $$0.001\leq s\leq0.05$$ and obtain a linear dependence of the form $$\left\langle \hat{H}_{M}\left(G\right)\!,s\right\rangle \approx\alpha n+\beta$$. Using these linear fits, we can estimate the critical number of nodes $$n_{c}$$ below which $$\left(n-1\right)\leq\left\langle \hat{H}_{M}\left(L\left(n,\left\lfloor \tfrac{2n}{3}\right\rfloor \right)\right)\!,s\right\rangle \leq n$$ for a given value of the parameter $$s$$. Clearly, $$n_{c}\leq\dfrac{\beta}{1-\alpha}.$$ However, we can simplify this expression by observing that $$\beta<-1$$ and that $$\alpha\approx1+2.751s^{2}$$. 
Then, $$n_{c}\leq\dfrac{1}{2.751s^{2}},s\leq0.05.$$ (5.1) The values of the critical number of nodes are given in the Supplementary Information; they range from $$145$$ for $$s=0.05$$ to $$363,504$$ for $$s=0.001.$$ This means, for instance, that any lollipop $$L\left(n,\left\lfloor \tfrac{2n}{3}\right\rfloor \right)$$ having fewer than 58,160 nodes will display an average hitting time bounded between $$n-1$$ and $$n$$ in the RMH model with a Mellin transform and parameter $$s\leq0.0025$$. The previous inequality can also be used the other way around, namely to estimate the value of $$s$$ that should be used such that a lollipop $$L\left(n,\left\lfloor \tfrac{2n}{3}\right\rfloor \right)$$ has average hitting time bounded as $$\left(n-1\right)\leq\left\langle \hat{H}_{M}\left(L\left(n,\left\lfloor \tfrac{2n}{3}\right\rfloor \right)\right)\!,s\right\rangle \leq n$$. For instance, if we would like to know the value of $$s$$ for which any such graph with fewer than 100,000 nodes has average hitting time below $$n$$, we use $$s\geq\dfrac{1}{\sqrt{2.751n_{c}}},s\leq0.05,$$ (5.2) and obtain $$s\approx0.0019$$. We can extrapolate here to roughly estimate the value of $$s$$ for which any lollipop $$L\left(n,\left\lfloor \tfrac{2n}{3}\right\rfloor \right)$$ with fewer than 1 million nodes has hitting time bounded as before. This estimation gives a value of $$s\lesssim0.0006$$. The importance of the previous investigation is the following. Currently, we do not know which graphs have the largest average hitting time among all the graphs with $$n$$ nodes. However, we have strong intuition and evidence that it should be either a lollipop or a barbell graph. For these graphs, the average hitting time is of the order $$n^{3}$$ for the NRW. 
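
The arithmetic behind Eqs. (5.1) and (5.2) can be checked with a few lines of Python. This is only a sketch: the function names `critical_nodes` and `mellin_parameter` are hypothetical, and the only inputs are the fitted coefficient 2.751 and the range $$s\leq0.05$$ quoted in the text.

```python
import math

# Fitted coefficient quoted in the text: alpha ≈ 1 + 2.751 s^2 (reported for s <= 0.05).
FIT_COEFF = 2.751
S_MAX = 0.05


def critical_nodes(s):
    """Right-hand side of Eq. (5.1): estimate of n_c for a Mellin parameter s."""
    if not 0 < s <= S_MAX:
        raise ValueError("the fit is only reported for 0 < s <= 0.05")
    return 1.0 / (FIT_COEFF * s ** 2)


def mellin_parameter(n_c):
    """Right-hand side of Eq. (5.2): value of s associated with a size n_c."""
    s = 1.0 / math.sqrt(FIT_COEFF * n_c)
    if s > S_MAX:
        raise ValueError("n_c is too small: the resulting s lies outside the fitted range")
    return s


for s in (0.05, 0.0025, 0.001):
    print(f"s = {s}: n_c estimate = {int(critical_nodes(s))}")

for n_c in (100_000, 1_000_000):
    print(f"n_c = {n_c}: s estimate = {mellin_parameter(n_c):.4f}")
```

With these definitions, $$s=0.05$$, $$s=0.0025$$ and $$s=0.001$$ give 145, 58,160 and 363,504 nodes, and sizes of $$10^{5}$$ and $$10^{6}$$ nodes give $$s\approx0.0019$$ and $$s\approx0.0006$$.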
We can therefore use the values obtained above for the lollipop $$L\left(n,\left\lfloor \tfrac{2n}{3}\right\rfloor \right)$$ as rough indications of the worst-case scenario that can be expected for any graph. In other words, if we consider a graph of any structure having 1 million nodes, we should expect its average hitting time to be below $$n$$ for $$s\lesssim0.0006$$. We will see that for real-world networks these estimates of $$s$$ are overly conservative by about an order of magnitude, because the structure of these graphs makes the average hitting time drop by orders of magnitude relative to this upper bound. A first flavour of these differences is obtained by the analysis of random graphs in the next section of this work.
5.1.1 Time evolution
In this section, we are not interested in a detailed description of the time evolution of the random walker or the multi-hopper in the lollipop or barbell graphs. Rather, we compare their evolution at different times in order to highlight the main difference between the two models. Consequently, we consider a lollipop $$L\left(n,\left\lfloor \tfrac{2n}{3}\right\rfloor \right)$$ and a barbell $$B\left(n,\left\lfloor \tfrac{n}{3}\right\rfloor ,\left\lfloor \tfrac{n}{3}\right\rfloor \right)$$ graph, both with $$n=999$$ nodes. In both cases we place the random walker at a node of a clique (in the barbell, one of the two cliques). This node is selected not to be the one attached to the path. Let any node in the clique of the lollipop (respectively, in a clique of the barbell) which is not the one connected to the path be designated as the node $$i$$. Let the end point of the path be named $$j$$. We remark that the node $$j$$ does not belong to the clique. Let the node connecting the path to the clique to which $$i$$ belongs be designated as $$k$$. 
We then place the random walker and the multi-hopper at the node $$i$$ of the lollipop and of the barbell and compute the probability of finding them at each node after different times using $$\mathbf{p}_{t}=\left(\hat{P}^{\textrm{Mel,}T}\right)^{t}\mathbf{p}_{0},$$ (5.3) where $$\hat{P}^{\textrm{Mel,}T}$$ is the transpose of $$\hat{P}^{\textrm{Mel}}.$$ As can be seen in Fig. 4, the classical random walker spends most of her time in the clique of the graphs, taking on average $$\left\lfloor \tfrac{2n}{3}\right\rfloor -1$$ steps to visit the node $$k$$ in the lollipop and $$\left\lfloor \tfrac{n}{3}\right\rfloor -1$$ to visit it in the barbell. Once the walker visits the node $$k$$, she moves onto the path towards the node $$j$$ only with probability $$1/\left\lfloor \tfrac{2n}{3}\right\rfloor$$ in the lollipop and $$1/\left\lfloor \tfrac{n}{3}\right\rfloor$$ in the barbell graph. Fig. 4 thus shows that for times as large as $$t=10^{6}$$ the random walker is still stuck in the clique of the lollipop graph. In the case of the barbell, when $$t=10^{3}$$ the walker has visited only the nodes of the clique in which she started, and when $$t=10^{6}$$ she starts to visit the nodes of the other clique.
Fig. 4. Probability distribution at the different nodes of a lollipop $$L\left(n,\left\lfloor \tfrac{2n}{3}\right\rfloor \right)$$ (top) and barbell $$B\left(n,\left\lfloor \tfrac{n}{3}\right\rfloor ,\left\lfloor \tfrac{n}{3}\right\rfloor \right)$$ (bottom) graph with $$n=999$$ nodes. Classical random walk (blue solid line) and the multi-hopper using the Mellin transform with $$s=1$$ (red broken line) and using the Laplace transform with $$l=0.1$$ (red dotted line). The evolution of the probabilities is shown at three different times for a random walk starting at node $$i$$ (see text) at $$t=50$$ (left), $$t=1000$$ (centre) and $$t=10^{6}$$ (right).
On the other hand, the RMH has a non-zero probability of escaping directly from the clique. As can be seen in the left panels of Fig. 4, even for the small time $$t=50$$ the multi-hopper with Mellin transform has already visited all the nodes of the graphs. For this short time, however, the multi-hopper with Laplace transform has visited all the nodes of the cliques plus the initial nodes of the path, but she has not yet arrived at the node $$j$$. For time $$t=1000$$ the multi-hopper with Mellin transform is already in the stationary state and the one with Laplace transform has already visited all the nodes of the graphs. At $$t=10^{6}$$ the multi-hopper has reached the stationary state for both transforms. This significant difference from the classical RW arises because the RMH is not trapped in the cliques: she can go directly from $$i$$ to any node of the graph with a probability that decays as a function of the distance from $$i$$. The first few nodes of the path are therefore frequently visited by the multi-hopper, as they are at relatively short distances from the node $$i$$. Once the multi-hopper has reached these nodes, she can visit the extreme nodes of the graphs more easily, overtaking the classical RW even at relatively short times. 
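
The evolution in Eq. (5.3) can be reproduced on a small graph with plain Python. The sketch below is illustrative only and rests on an assumption not restated in this section: the multi-hopper transition probability from $$i$$ to $$j\neq i$$ is taken proportional to $$d_{ij}^{-s}$$ (Mellin) or $$e^{-l\,d_{ij}}$$ (Laplace), normalised over each row. The helper names and the small size $$n=30$$ are hypothetical stand-ins for the $$n=999$$ graphs of Fig. 4.

```python
import math
from collections import deque


def lollipop(n, m):
    """Adjacency sets of a lollipop: clique on nodes 0..m-1, path m-1, m, ..., n-1."""
    adj = [set() for _ in range(n)]
    for u in range(m):
        for v in range(u + 1, m):
            adj[u].add(v)
            adj[v].add(u)
    for u in range(m - 1, n - 1):
        adj[u].add(u + 1)
        adj[u + 1].add(u)
    return adj


def all_distances(adj):
    """Shortest-path distance matrix via breadth-first search from every node."""
    n = len(adj)
    dist = []
    for source in range(n):
        d = [-1] * n
        d[source] = 0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if d[v] < 0:
                    d[v] = d[u] + 1
                    queue.append(v)
        dist.append(d)
    return dist


def transition_matrix(dist, weight):
    """Row-stochastic matrix with P[i][j] proportional to weight(d_ij) for j != i."""
    P = []
    for row in dist:
        w = [weight(d) if d > 0 else 0.0 for d in row]
        total = sum(w)
        P.append([x / total for x in w])
    return P


def evolve(P, start, t):
    """Return p_t = (P^T)^t p_0 with p_0 concentrated on node `start`."""
    n = len(P)
    p = [0.0] * n
    p[start] = 1.0
    for _ in range(t):
        p = [sum(p[i] * P[i][j] for i in range(n)) for j in range(n)]
    return p


n, m = 30, 20  # small stand-in for L(n, floor(2n/3))
dist = all_distances(lollipop(n, m))
P_nrw = transition_matrix(dist, lambda d: 1.0 if d == 1 else 0.0)  # classical walk
P_mel = transition_matrix(dist, lambda d: d ** -1.0)               # Mellin, s = 1
P_lap = transition_matrix(dist, lambda d: math.exp(-0.1 * d))      # Laplace, l = 0.1

i, j = 0, n - 1  # start inside the clique; j is the end point of the path
for name, P in (("NRW", P_nrw), ("Mellin", P_mel), ("Laplace", P_lap)):
    p = evolve(P, i, 50)
    print(f"{name:8s} probability at node j after 50 steps: {p[j]:.2e}")
```

Even on this small example the classical walker has a direct transition probability of zero from $$i$$ to $$j$$, whereas the multi-hopper can reach $$j$$ in a single step, which is the mechanism described above.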
The way in which an RMH propagates through a path is analysed in the next subsection of this work.
5.2 Path graphs
Another interesting graph to consider is the path $$P_{n}$$. As proved by Palacios [20], this graph has the maximum sum of all effective resistances among all pairs of nodes, i.e., the maximum Kirchhoff index. For the normal RW, Palacios proved that $$\Omega_{tot}\sim n^{3}$$. Because the number of edges in $$P_{n}$$ is $$n-1$$, we easily get that $$\left\langle H\left(P_{n}\right)\right\rangle \sim n^{2}$$. In Fig. 5, we illustrate the evolution of the probabilities of being at a given node of the path of 1000 nodes, labelled in consecutive order from 1 to 1000, in which we have placed the random walker at the node 1. As can be seen in Fig. 5 for $$t=500$$ (left), the classical random walker has visited only the first 100 nodes of the path, while the RMHs for both transforms have already visited all the nodes. As the time increases, the RMH model gives almost identical probabilities of finding the walker at any node of the path, but the classical random walker still shows close to zero probability of being found at the other side of the path for times as large as $$t=5000$$ (see right plots in Fig. 5).
Fig. 5. Probabilities of finding the random walker at a given node of $$P_{1000}$$ at $$t=500$$ (left), $$t=1000$$ (centre) and $$t=5000$$ (right) for the classical (blue solid line) and multi-hopper random walk model with Mellin transform with parameter $$s=2$$ and with the Laplace transform with parameter $$l=0.1$$. 
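
The scaling $$\left\langle H\left(P_{n}\right)\right\rangle \sim n^{2}$$ stated at the beginning of this subsection can be written out as a short worked derivation. The sketch assumes the standard commute-time identity $$H\left(i,j\right)+H\left(j,i\right)=2m\,\Omega_{ij}$$ for a graph with $$m$$ edges, which is not restated in this section, and uses the fact that the effective resistance between two nodes of a path equals their distance:

$$\Omega_{tot}=\sum_{i<j}\left(j-i\right)=\sum_{d=1}^{n-1}d\left(n-d\right)=\frac{n\left(n^{2}-1\right)}{6},$$

$$\left\langle H\left(P_{n}\right)\right\rangle =\frac{1}{n\left(n-1\right)}\sum_{i\neq j}H\left(i,j\right)=\frac{2m\,\Omega_{tot}}{n\left(n-1\right)}=\frac{2\left(n-1\right)}{n\left(n-1\right)}\cdot\frac{n\left(n^{2}-1\right)}{6}=\frac{n^{2}-1}{3}.$$

For $$n=1000$$ this gives $$333,333$$, i.e. approximately $$n^{2}/3$$.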
As in the previous subsection, we study here the influence of the graph size on the hitting time in paths for both the Mellin and Laplace transforms. In particular, we compare both transforms in the multi-hopper random walk with the classical one for path graphs with $$100\leq n\leq1000$$. As expected, the average hitting time in the classical random walk follows a quadratic dependence on the number of nodes, $$\left\langle H_{k}\right\rangle \approx0.3333n^{2}$$. For the multi-hopper, however, it follows power laws with exponent smaller than 2; the dependence is of the form $$\left\langle H_{k}\right\rangle \approx an^{b}$$, with parameters given in the Supplementary Information. The most interesting finding here is that, as for the barbell and lollipop graphs, the average hitting time of paths also increases linearly with the number of nodes for relatively small values of the parameters $$s$$ and $$l$$, as can be seen in Supplementary Fig. S1 accompanying this article.
5.3 Some remarks
It is intuitive to think that the average shortest-path distance plays a fundamental role in explaining the average hitting time of graphs in the normal RW model. Because we allow for long-range jumps in the multi-hopper model, we would then intuitively expect the influence of the shortest-path distance to be diminished in this model. However, one important lesson from the analysis of the lollipop, barbell and path graphs is the following. Although the average shortest-path distance plays some role in determining the average hitting times, it is the existence of large, relatively isolated clusters which plays the major role. 
That is, although a path graph has the largest possible average shortest-path distance of any graph with $$n$$ nodes (the average shortest-path length is equal to $$(n+1)/3$$), it has an average hitting time one power of $$n$$ smaller ($$n^{2}$$ versus $$n^{3}$$) than the lollipop and barbell graphs, which have comparably small average shortest-path distances, particularly the ones analysed in this section. This role of large clusters in graphs, which we discussed in this section, is of great importance for the analysis of real-world networks. Although these networks have relatively small average shortest-path distance due to their small-world properties, they contain many communities (clusters of tightly connected nodes which are poorly connected to each other) that resemble the extremal situation of barbell and lollipop graphs.
6. Random graphs
In this section, we explore the multi-hopper model for two types of random networks: Barabási–Albert (BA) [23] and Erdős–Rényi (ER) [24]. We use the exact result obtained for the expected hitting time averaged over all pairs of nodes $$\left\langle \hat{H}_{\tau}\right\rangle$$ in Equation (3.19). The analysis of the BA and ER random networks shows that the hitting time increases linearly with the number of nodes in the graph (see Supplementary Figures). In all cases, the use of the Mellin transform in the multi-hopper reduces the slope of the linear fits $$\left\langle \hat{H}_{\tau}\right\rangle \approx an+b$$ (see Supplementary Information for the parameters $$a$$ and $$b$$) expressing the dependence of the hitting time on the number of nodes. Thus, we find that in general, using the long-range strategies, the random walker reaches any site of the network more efficiently than with the normal random walk. In Fig. 
6, we fix the number of nodes at $$n=2000$$ in ER and BA networks and calculate the average hitting time as a function of the parameters $$l$$ and $$s$$ for the Laplace and Mellin transforms, respectively. The results confirm previous findings [11] that in the limit $$l,s\to0$$ the hitting times reach the value $$n-1$$. For parameters in the interval $$(0,10)$$, however, the two types of strategies show a strong variation in the values of the average hitting time $$\left\langle \hat{H}^{\tau}\right\rangle$$; this is a direct consequence of how the random walk strategies assign weights to small, intermediate and large steps. Finally, for large values of the parameters, long-range transitions appear with low probability and the values of $$\left\langle \hat{H}^{\tau}\right\rangle$$ are equal to the results for the normal random walk strategy with transitions only to nearest neighbours. These results are of great significance for the further analysis of real-world networks in the next section of this work.
Fig. 6. Influence of the degree distribution on the average hitting time for random networks with $$n=2000$$ nodes. (a) Barabási-Albert (BA) network, (b) connected Erdős-Rényi (ER) networks with probability $$p=\log(n)/n$$. We plot the results for the hitting time as a function of the parameter $$l$$ for the Laplace transform and $$s$$ for the Mellin transform. We depict, with dashed lines, the results for the normal random walk (NRW) and $$n-1$$ obtained for a complete graph.
As we have observed in the previous analysis, there are very significant differences between the random networks considered here and the graphs analysed in the previous section, where the average hitting time increases as a third or second power of the number of nodes. The linear increase observed here for the random graphs studied cannot be understood only on the basis of the fact that they display relatively small shortest-path distances. For instance, we can construct barbell graphs $$B\left(n,\left\lfloor \tfrac{n-k}{2}\right\rfloor ,\left\lfloor \tfrac{n-k}{2}\right\rfloor \right)$$ with small values of $$k$$, which have small average shortest-path distance. Indeed, a graph $$B\left(n,\left\lfloor \tfrac{n-k}{2}\right\rfloor ,\left\lfloor \tfrac{n-k}{2}\right\rfloor \right)$$ has only distances $$d_{ij}\in\left[1,k\right]$$. One important difference, however, between the studied random networks and the barbell and lollipop graphs previously considered is the lack of large cliques in these random graphs which may trap the random walker inside them. In the next subsection, we study this problem by using random graphs with different intercommunity densities of links. In addition, we study the influence of the degree distribution on the hitting time of these random graphs with the goal of understanding the differences between ER and BA networks.
6.1 Influence of communities and degree distribution
We start here by considering the influence of the presence of clusters of nodes, defined in the following way. Let us consider a network with $$n$$ nodes. Let us partition the network into $$k$$ clusters of size $$\left\lfloor \dfrac{n}{k}\right\rfloor$$. Let $$C_{i}$$ and $$C_{j}$$ be two such clusters. 
Then, the probability that two nodes $$p,q\in C_{i}$$ are connected is much larger than the probability that two nodes $$r\in C_{i}$$ and $$s\in C_{j}$$ are connected. This gives rise to a higher internal density of links in the clusters than the inter-cluster density of links. It is a well-known fact that neither the ER nor the BA networks contain this kind of cluster. The lack of such clusters, known in network theory as communities, is characterized by the so-called good expansion properties of these graphs. Loosely speaking, a graph is an expander if it does not contain any structural bottleneck, i.e., a small group of nodes or edges whose removal separates the network into two connected components of approximately the same size [25]. We remark here that both ER and BA graphs have been proved to be expanders when the number of nodes is very large [25, 26]. We then use an implementation of the algorithm described by Lancichinetti et al. [27] to produce undirected random networks with communities and a fixed average degree $$\langle k\rangle$$. A mixing parameter $$\mu$$ defines the fraction of links that a node shares with nodes in other communities. A small value of the mixing parameter produces graphs with tightly connected clusters which are poorly connected to each other. That is, it produces very well-defined communities in the graph. As the mixing parameter increases and surpasses the value 0.5, the communities disappear and the graph looks more and more like an expander for a sufficiently large number of nodes. Here, we explore the effect of communities on the capacity of the multi-hopper random walk strategy to reach any site of the network by constructing random graphs with the same number of nodes and edges but changing the mixing parameter. In Fig. 7, we depict the average hitting time $$\langle H^{\tau}\rangle$$ for different values of the parameters $$l$$ and $$s$$ for networks with communities constructed as described before. 
As can be seen in this figure, for random graphs with well-defined communities (small values of the mixing parameter), the random walker needs significantly longer times to explore the whole network. This is particularly true for relatively large values of the Mellin and Laplace parameters of the multi-hopper model, which indicates that the normal RW is significantly less efficient in networks having communities than in networks not displaying such a structural characteristic. Here again, as these parameters approach zero the hitting time decays to the lower bound, as expected from the theory. In closing, the small hitting times observed for the random graphs studied in the previous subsection are mainly due to the fact that these graphs are expanders and lack any community structure that may trap the random walker for long times without visiting other clusters. The multi-hopper solves this trapping problem by having a larger chance to jump from one community to another, thereby reducing her time inside each of the clusters visited. We provide a video as Supplementary Information accompanying this paper that visualizes these findings.
Fig. 7. Average hitting time for the multi-hopper random walker in networks with communities. (a) Laplace and (b) Mellin strategies. We explore networks with $$n=1000$$ nodes, an average degree $$\langle k\rangle=15$$ and different values of the mixing parameter $$\mu$$ that defines the fraction of connections that a node has with nodes in other communities. 
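
The trapping effect of communities can be illustrated on a toy graph with plain Python. The sketch below is not the benchmark of Lancichinetti et al. used for Fig. 7: as a hypothetical stand-in it takes two cliques joined by a single edge, an extreme case of community structure, and assumes Mellin transition probabilities proportional to $$d_{ij}^{-s}$$. The average hitting time over all ordered pairs is obtained by solving, for each target $$j$$, the linear system $$h_{i}=1+\sum_{k\neq j}P_{ik}h_{k}$$.

```python
from collections import deque


def two_cliques(c):
    """Two cliques of c nodes each, joined by the single edge (c-1, c)."""
    n = 2 * c
    adj = [set() for _ in range(n)]
    for block in (range(0, c), range(c, n)):
        for u in block:
            for v in block:
                if u != v:
                    adj[u].add(v)
    adj[c - 1].add(c)
    adj[c].add(c - 1)
    return adj


def all_distances(adj):
    """Shortest-path distance matrix via breadth-first search from every node."""
    n = len(adj)
    dist = []
    for source in range(n):
        d = [-1] * n
        d[source] = 0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if d[v] < 0:
                    d[v] = d[u] + 1
                    queue.append(v)
        dist.append(d)
    return dist


def transition_matrix(dist, weight):
    """Row-stochastic matrix with P[i][j] proportional to weight(d_ij) for j != i."""
    P = []
    for row in dist:
        w = [weight(d) if d > 0 else 0.0 for d in row]
        total = sum(w)
        P.append([x / total for x in w])
    return P


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def average_hitting_time(P):
    """Mean of H(i, j) over all ordered pairs i != j."""
    n = len(P)
    total = 0.0
    for j in range(n):
        idx = [i for i in range(n) if i != j]
        A = [[(1.0 if a == b else 0.0) - P[a][b] for b in idx] for a in idx]
        total += sum(solve(A, [1.0] * (n - 1)))
    return total / (n * (n - 1))


dist = all_distances(two_cliques(6))
n = len(dist)
h_nrw = average_hitting_time(transition_matrix(dist, lambda d: 1.0 if d == 1 else 0.0))
print(f"normal random walk: {h_nrw:.2f}   (n - 1 = {n - 1})")
for s in (4.0, 1.0, 0.01):
    h = average_hitting_time(transition_matrix(dist, lambda d, s=s: d ** -s))
    print(f"Mellin, s = {s}: {h:.2f}")
```

On this toy graph the average hitting time of the normal walker is well above $$n-1$$, and it approaches $$n-1$$ as $$s$$ decreases, in line with the behaviour described for Fig. 7.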
We now consider the influence of the degree distribution on the performance of the RMH. To this end, we study the stationary probability distribution $$\pi^{\tau}(i)$$ for the Laplace and Mellin transforms in a BA and an ER network with $$n=2000$$ nodes. The results are obtained from the calculation of the long-range degrees $$k^{\tau}(i)$$ in Equations (3.17) and (3.18) and the respective normalization defined in Equation (3.16). In Fig. 8, we summarize our results. The important aspect of these plots is the slope of the corresponding curves for different values of the Mellin and Laplace parameters in the multi-hopper model. If we compare the slopes for the ER network with those of the BA network, we observe that the former are smaller and closer to the constant line $$n^{-1}$$ than the latter. The smallest hitting time is obtained when the curve coincides with this line, which represents the fully connected graph. Thus, the ER graphs are already close to this line, and this is the main reason why they display relatively small hitting times. However, in the BA model when $$s,l$$ are very large the slopes of the curves are very steep and far away from the asymptotic result. As these parameters approach zero, the curves flatten and approach $$\pi^{\tau}(i)=n^{-1}$$, as a consequence of the fact that the graph approaches the fully connected one. That is, the long-range dynamics changes the way in which the random walker reaches the nodes. For small values of the parameter $$s$$ or $$l$$, the stationary probability distribution reaches the value $$\pi(i)=1/n$$. On the other hand, the inverse of the stationary probability distribution defines the average time $$\left\langle t^{\tau}(i)\right\rangle =\frac{1}{\pi^{\tau}(i)}$$ needed for the random walker to return for the first time to the node $$i$$. In this way, the random walker returns frequently to sites with high values of $$\pi^{\tau}$$ and gets trapped in regions with this property. 
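
The flattening of the stationary distribution can be seen on a star graph, the simplest graph with a hub. The sketch below is hedged: Equations (3.16) to (3.18) are not reproduced in this section, so it assumes the usual form for a walk with symmetric weights, $$\pi^{\tau}(i)=k^{\tau}(i)/\sum_{j}k^{\tau}(j)$$ with Mellin long-range degree $$k^{\tau}(i)=\sum_{j\neq i}d_{ij}^{-s}$$. The function names are hypothetical.

```python
def star_distances(n):
    """Distance matrix of a star: node 0 is the hub, nodes 1..n-1 are leaves."""
    return [[0 if i == j else (1 if 0 in (i, j) else 2) for j in range(n)]
            for i in range(n)]


def stationary_distribution(dist, weight):
    """pi(i) = k(i) / sum_j k(j), with k(i) = sum over j != i of weight(d_ij)."""
    k = [sum(weight(d) for d in row if d > 0) for row in dist]
    total = sum(k)
    return [k_i / total for k_i in k]


n = 11
dist = star_distances(n)
for s in (50.0, 2.0, 0.5, 0.001):
    pi = stationary_distribution(dist, lambda d, s=s: d ** -s)
    # 1 / pi(i) is the mean time of first return to node i.
    print(f"s = {s:<6} pi(hub) = {pi[0]:.4f}   mean return time to hub = {1 / pi[0]:.2f}")
```

For large $$s$$ the hub carries half of the stationary probability, as in the normal random walk on a star, while for $$s\rightarrow0$$ its share drops to $$1/n$$ and its mean return time grows to $$n$$.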
As we can see in Fig. 8, the effect of the long-range strategies is to reduce the probability of revisiting highly connected sites and to increase the capacity of reaching any site in the network.
Fig. 8. Stationary probability distribution for multi-hopper random walkers on Barabási-Albert and Erdős-Rényi networks with $$n=2000$$ nodes. (a) Laplace transform, (b) Mellin transform.
In closing, in this section we have seen that a random walker can be trapped in certain regions of a network, that is, have a larger probability of staying in these regions than in other parts of the graph, due to two different factors. The first is the presence of clusters of highly connected nodes in which the random walker is retained for long times before she visits other clusters of the graph. The second is the existence of hubs (highly connected nodes) to which the random walker returns frequently, thereby making her exploration of the network more difficult. These two characteristics, the presence of communities and the existence of fat-tailed degree distributions, are well known to be ubiquitous in real-world networks. The observation that the RMH can overcome both of these factors makes this model an important candidate for the exploration of real-world networks, which is the topic of our next section.
7. Real-world networks
One of the areas in which the RMH has many potential applications is the study of large real-world networks. These are graphs representing the networked skeleton of complex systems ranging from infrastructural and technological to biological and social systems. Normal random walks have previously been used as mechanisms of transport and search on such networks [3, 28, 29]. 
Here we study a few networks representing a variety of real-world complex systems, including biological, communication and infrastructural ones. In Table 1, we report the sizes of these networks as well as the hitting times using the normal random walk and the multi-hopper with Mellin transform. Using Eq. (5.2), we can estimate the lower bound for the value of the Mellin parameter $$s$$ for which $$\left\langle \hat{H}_{M}\right\rangle \leq n$$. These values are given in Table 1 as $$s_{B}$$ for all the networks studied in this section. In addition, we calculate the actual value of this parameter for which $$\left\langle \hat{H}_{M}\right\rangle \leq n$$ in these networks, and report it as $$s_{c}$$ in Table 1. The values of $$s_{c}$$ are obtained as follows. We calculate the value of $$\left\langle \hat{H}_{M}\right\rangle$$ for different values of $$s$$ and obtain a fit of the form $$\left\langle \hat{H}_{M}\right\rangle \approx\alpha s^{2}+\beta$$ for $$0.01\leq s\leq0.5$$. Obviously, $$\beta=n-1$$, which is the lowest value attained by $$\left\langle \hat{H}_{M}\right\rangle$$ for any graph. Using these fitted equations, we then calculate the values of $$s_{c}$$ reported in Table 1. As can be seen, the values of $$s_{c}$$ are on average 10 times larger than the lower bound expected from the lollipop graphs of the same size as the studied networks.
Table 1. Real-world networks studied in this work, their number of nodes $$n$$ and the average hitting time of the normal random walker $$\left\langle H\right\rangle$$. $$s_{c}$$ is the value of the Mellin parameter $$s$$ for which the corresponding network has hitting time smaller than $$n$$. $$s_{B}$$ is the value of the Mellin parameter $$s$$ for which a lollipop graph $$L\left(n,\left\lfloor \dfrac{2n}{3}\right\rfloor \right)$$ with the same number of nodes as the real-world network has hitting time smaller than $$n$$. The last column, % improv., represents the percentage of improvement in the hitting time using the Mellin-transformed multi-hopper with respect to the NRW.

| Network | $$n$$ | $$\left\langle H\right\rangle$$ | $$s_{c}$$ | $$s_{B}$$ | % improv. |
|---|---|---|---|---|---|
| Bio_PPI_yeast | 2,224 | 8,652 | 0.145 | 0.0128 | 389 |
| City_Atlanta | 3,234 | 11,973 | 0.088 | 0.0106 | 370 |
| Colab_Geom | 3,621 | 15,719 | 0.086 | 0.0100 | 434 |
| City_Berlin | 4,495 | 13,752 | 0.094 | 0.0090 | 306 |
| Power_USA | 4,941 | 34,455 | 0.098 | 0.0086 | 697 |
| City_Barcelona | 5,575 | 17,282 | 0.063 | 0.0081 | 310 |
| Colab_AstroPh | 17,903 | 101,072 | 0.048 | 0.0045 | 565 |
| City_Seattle | 20,207 | 108,746 | 0.041 | 0.0042 | 538 |
| Colab_CondMat | 21,363 | 77,298 | 0.049 | 0.0041 | 362 |
| Comm_Enron | 33,696 | 193,714 | 0.040 | 0.0033 | 575 |

8. A note on the multi-hopper probability distributions
In this section, we study the distribution of probabilities of finding the random walker at a given place as a function of the distance from that place to the current position of the walker. In a homogeneous and isotropic space, e.g. in a continuous space, Lévy flights (LF) are a frequently studied model in which the walker can also jump to distant regions of the space [30]. LFs are widely used to model efficient search processes, for instance, of animals searching for sparse food. At every jump the step length is drawn from a long-tailed probability density function with power-law decay [31–33]. Due to the homogeneity and isotropy of the space in which the random walker moves, there should be a perfect correlation between the probability of realizing a jump and the distance separating the two places between which the walker moves. 
That is, the probability of making a jump of length $$k$$ is exactly the same independently of the position in space where the walker starts or ends her jump. This can be interpreted as the fact that at every jump all memory of previous jumps is erased, and thus the jump lengths $$x$$ are independent and identically distributed random variables [34]. Is this situation observed for a graph in the context of the multi-hopper random walk? The answer is in general no. For the RMH, the conditions of homogeneity and isotropy of the space are broken. Consider, as we have seen before, a barbell graph in which there are two cliques and a path connecting them. Let us pick (i) a pair of adjacent nodes inside one of the cliques, (ii) a pair of adjacent nodes in the middle of the path and (iii) the pair formed by a node of the clique and the node of the path to which it is connected. In the three cases the distance between the two nodes of the pair is exactly the same, i.e., one. However, every time the walker departs from one of the nodes in the pair (i), the probability that she returns to the other node in this pair is very high, due to the high interconnection of the clique. In the case of the pair in the middle of the path the situation is quite different. Here, every time the walker departs from one of the two nodes, the probability of returning to the other node is much diminished by the fact that she can get trapped in either of the two cliques and then hardly ever return to the centre of the path. A similar situation can be described for the pair (iii). We therefore have three different probabilities for exactly the same distance between nodes. Using the previous 'memory' analogy, the situation in the graph can be seen as if the random walker remembers the place from which she originally departed, something entirely foreign to the continuous space. In Fig. 9, we illustrate the distribution of probabilities as a function of the distance between the nodes from/to which the RMH jumps for three different graphs.
The first is a cycle graph having 1000 nodes, the second is a barbell graph with 999 nodes and the third is the real-world protein–protein interaction network of yeast. As can be seen, only in the case of the cycle is there a perfect power-law dependency between the probability and the distance, as expected for 'normal' random walks with Lévy flights. In this case, we obtain, as expected, $$p\left(i,j\right)\sim d\left(i,j\right)^{-1}$$ with squared Pearson correlation coefficient $$r^{2}=1$$. However, for the barbell graph we get $$p\left(i,j\right)\sim d\left(i,j\right)^{-0.856}$$ with $$r^{2}=0.678$$. Here, neither is the exponent of the power-law equal to one, nor is the correlation coefficient high enough to accept such a correlation. The most important point, however, can be seen directly in the plot itself. For each value of the distance there are many different probabilities, and such probabilities can vary by more than one order of magnitude. In the case of the real-world network the situation is even worse. Here we obtain $$p\left(i,j\right)\sim d\left(i,j\right)^{-0.673}$$ with $$r^{2}=0.659$$. The exponent of the power-law is far from the expected value of $$-1$$ and the correlation is very poor. The origins of these deviations are the ones we have analysed in this article: heterogeneity in the degree distributions and the presence of regions of differing density (communities), among other potential structural heterogeneities appearing in real-world graphs. In closing, we remark that the main condition satisfied by random walks with Lévy flights in the continuous space is no longer fulfilled by their analogue on graphs. That is the main reason why we have preferred to call this model the multi-hopper instead of using the more traditional name of random walks with Lévy flights.

Fig. 9. (top) Log–log plot of the jumping probabilities in the multi-hopper model using a Mellin transform with exponent $$s=1$$ versus the shortest path distance for a cycle $$C_{1000}$$, a barbell $$B\left(999,333,333\right)$$ and the network of PPI of yeast.

9. Conclusions

We develop here a mathematical and computational framework for using random walks with long-range jumps on graphs of any topology. This multi-hopper model allows a random walker positioned at a given node of a simple, connected graph to jump to any other node of the graph with a probability that decays as a function of the shortest-path distance between her original and final positions. The decaying probabilities for long-range jumps are selected as Laplace or Mellin transforms of the shortest-path distances in this work. We prove here that when the parameters of these transforms approach zero asymptotically, the hitting time of the multi-hopper approaches the minimum possible value for a normal random walker. Thus, the multi-hopper represents a super-fast random walker hopping among the nodes of a graph. We show by computational experiments that the multi-hopper overcomes several of the difficulties that a normal random walker has in exploring a graph. For instance, the multi-hopper explores more efficiently than the normal random walker a graph having clusters of highly interconnected nodes (communities) that are poorly connected to other clusters. It also diffuses faster than the normal random walker in graphs with very skewed degree distributions, such as scale-free networks.
In these graphs, the normal random walker visits hubs more frequently than the nodes with low degree, thus getting stuck around the high-degree nodes of the graph. Finally, we illustrate how the multi-hopper can be useful in transport and search problems in real-world networks where structural heterogeneity, such as the presence of communities and skewed degree distributions, is more a rule than an exception. We hope that the use of the RMH will open new avenues in the exploration of lattices, graphs and real-world networks.

Supplementary data

Supplementary data are available at COMNET online.

Acknowledgments

E.E. thanks the Royal Society for a Wolfson Research Merit Award.

Funding

European Union's Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement [702410 to M.T.S.]; the Concerted Research Action (ARC) programme supported by the Federation Wallonia-Brussels (contract ARC 14/19-060 on Mining and Optimization of Big Data Models) [to J.C.D.]; DFG, project ME 1535/6-1 [to R.M.].

References

1. Estrada E. (2012) The Structure of Complex Networks: Theory and Applications. Oxford: Oxford University Press.
2. Klafter J. & Sokolov I. M. (2011) First Steps in Random Walks: From Tools to Applications. Oxford: Oxford University Press.
3. Noh J. D. & Rieger H. (2004) Random walks on complex networks. Phys. Rev. Lett., 92, 118701.
4. Pólya G. Arithmetische Eigenschaften der Reihenentwicklungen rationaler Funktionen. J. Reine Angew. Math., 151, 1–31.
5. Senft D. C. & Ehrlich G. (1995) Long jumps in surface diffusion: one-dimensional migration of isolated adatoms. Phys. Rev. Lett., 74, 294.
6. Linderoth T. R., Horch S., Lægsgaard E., Stensgaard I. & Besenbacher F. (1997) Surface diffusion of Pt on Pt(110): Arrhenius behavior of long jumps. Phys. Rev. Lett., 78, 4978.
7. Schunack M., Linderoth T. R., Rosei F., Lægsgaard E., Stensgaard I. & Besenbacher F. (2002) Long jumps in the surface diffusion of large molecules. Phys. Rev. Lett., 88, 156102.
8. Ala-Nissila T., Ferrando R. & Ying S. (2002) Collective and single particle diffusion on surfaces. Adv. Phys., 51, 949–1078.
9. Yu C., Guan J., Chen K., Bae S. C. & Granick S. (2013) Single-molecule observation of long jumps in polymer adsorption. ACS Nano, 7, 9735–9742.
10. Wrigley J. D., Twigg M. E. & Ehrlich G. (1990) Lattice walks by long jumps. J. Chem. Phys., 93, 2885–2902.
11. Riascos A. & Mateos J. L. (2012) Long-range navigation on complex networks using Lévy random walks. Phys. Rev. E, 86, 056110.
12. Estrada E. (2012) Path Laplacian matrices: introduction and application to the analysis of consensus in networks. Linear Algebra Appl., 436, 3373–3391.
13. Aldous D. & Fill J. (2002) Reversible Markov Chains and Random Walks on Graphs.
14. Lovász L. (1993) Random walks on graphs. Combinatorics, Paul Erdős Is Eighty, 2, 1–46.
15. Nash-Williams C. S. J. (1959) Random walk and electric currents in networks. Mathematical Proceedings of the Cambridge Philosophical Society, vol. 55. Cambridge University Press, 181–194.
16. Chandra A. K., Raghavan P., Ruzzo W. L., Smolensky R. & Tiwari P. (1996) The electrical resistance of a graph captures its commute and cover times. Comput. Complexity, 6, 312–340.
17. Ghosh A., Boyd S. & Saberi A. (2008) Minimizing effective resistance of a graph. SIAM Rev., 50, 37–66.
18. Boley D., Ranjan G. & Zhang Z.-L. (2011) Commute times for a directed graph using an asymmetric Laplacian. Linear Algebra Appl., 435, 224–242.
19. Grinstead C. M. & Snell J. L. (2012) Introduction to Probability. American Mathematical Society.
20. Palacios J. L. (2001) Resistance distance in graphs and random walks. Int. J. Quantum Chem., 81, 29–33.
21. Brightwell G. & Winkler P. (1990) Maximum hitting time for random walks on graphs. Random Structures Algorithms, 1, 263–276.
22. Jonasson J. (2000) Lollipop graphs are extremal for commute times. Random Structures Algorithms, 16, 131–142.
23. Barabási A.-L. & Albert R. (1999) Emergence of scaling in random networks. Science, 286, 509–512.
24. Erdős P. & Rényi A. (1959) On random graphs, I. Publ. Math. Debrecen, 6, 290–297.
25. Hoory S., Linial N. & Wigderson A. (2006) Expander graphs and their applications. Bull. Amer. Math. Soc., 43, 439–561.
26. Mihail M., Papadimitriou C. & Saberi A. (2003) On certain connectivity properties of the internet topology. Foundations of Computer Science, 2003. Proceedings 44th Annual IEEE Symposium. IEEE, 28–35.
27. Lancichinetti A., Fortunato S. & Radicchi F. (2008) Benchmark graphs for testing community detection algorithms. Phys. Rev. E, 78, 046110.
28. Adamic L. A., Lukose R. M., Puniyani A. R. & Huberman B. A. (2001) Search in power-law networks. Phys. Rev. E, 64, 046135.
29. Guimerà R., Díaz-Guilera A., Vega-Redondo F., Cabrales A. & Arenas A. (2002) Optimal network topologies for local search with congestion. Phys. Rev. Lett., 89, 248701.
30. Shlesinger M. F., Zaslavsky G. M. & Frisch U. (1995) Lévy flights and related topics in physics. Lévy Flights and Related Topics in Physics, 450.
31. Hughes B. D. (1996) Random Walks and Random Environments. Random Walks, 1.
32. Metzler R. & Klafter J. (2000) The random walk's guide to anomalous diffusion: a fractional dynamics approach. Phys. Rep., 339, 1–77.
33. Metzler R. & Klafter J. (2004) The restaurant at the end of the random walk: recent developments in the description of anomalous transport by fractional dynamics. J. Phys. A, 37, R161.
34. Bouchaud J.-P. & Georges A. (1990) Anomalous diffusion in disordered media: statistical mechanisms, models and physical applications. Phys. Rep., 195, 127–293.

© The authors 2017. Published by Oxford University Press. All rights reserved. This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices)

Journal of Complex Networks, Oxford University Press

# Random multi-hopper model: super-fast random walks on graphs

22 pages

Publisher
Oxford University Press
ISSN
2051-1310
eISSN
2051-1329
DOI
10.1093/comnet/cnx043

### Abstract

We develop a mathematical model considering a random walker with long-range hops on arbitrary graphs. The random multi-hopper can jump to any node of the graph from an initial position, with a probability that decays as a function of the shortest-path distance between the two nodes in the graph. We consider here two decaying functions in the form of Laplace and Mellin transforms of the shortest-path distances. We prove that when the parameters of these transforms approach zero asymptotically, the hitting time in the multi-hopper approaches the minimum possible value for a normal random walker. We show by computational experiments that the multi-hopper explores a graph with clusters or skewed degree distributions more efficiently than a normal random walker. We provide computational evidence of the advantages of the random multi-hopper model with respect to the normal random walk by studying deterministic, random and real-world networks.

1. Introduction

Many of the complex systems existing in the real world are better represented as a network than as a continuous system. This includes ecological and biomolecular, social and economical as well as infrastructural and technological networks [1]. The use of random walk models in these systems also provides a large variety of possibilities, ranging from the analysis of the diffusion of information and navigability on these networks to the exploration of their structures to detect their fine-grained organization [2, 3]. From a mathematical point of view, these networks are nothing but graphs, and we will use both terms interchangeably here, with preference for the term network when a real-world system is represented. The first work exploring the use of random walks on graphs is credited to Pólya (1921), who was walking in a park and repeatedly crossed paths with the same couple, leading him to consider recurrence relations for random walks [4].
In the mid 1990s, Senft and Ehrlich [5] observed experimentally the self-diffusion of weakly bound Pd atoms. They observed significant contributions to the thermodynamical properties of the system from jumps spanning second and third nearest-neighbours in the metallic surface, which can be considered as a regular lattice. Then, in 1997 Linderoth et al. [6] observed experimentally that for the self-diffusion of Pt atoms on the Pt(110) surface the jumps between non-nearest neighbours also contribute to the diffusion. Even more surprising were the results of the same group when they studied the diffusion of two large organic molecules on the Cu(110) surface. In this case, using scanning tunnelling microscopy, it was observed that long jumps play a dominating role in the diffusion of the two organic molecules, with root-mean-square jump lengths as large as $$3.9$$ and $$6.8$$ lattice spacings [7]. Since then the role of long jumps in adatoms and molecules diffusing on metallic surfaces has been confirmed both theoretically and experimentally in many different systems [8, 9]. Due to this experimental evidence, some attempts were made to consider long-range jumps in the diffusion of a particle on a regular lattice. The first of them was the paper entitled Lattice walks by long jumps by Wrigley et al. [10]. Other works have considered that the space in which the diffusion takes place is continuous, and then applied a random-walk model with Lévy flights to model these long-range effects (see below). However, the development of a general multi-hopper model, in which a random walker hops to any node of a general graph with probabilities that depend on the distance separating the corresponding nodes, is still missing in the literature. Apart from the physical scenarios related to the diffusion of adatoms and admolecules on metallic surfaces, long-range jumps on graphs are of general interest.
For instance, in social networks one can take advantage of the full or partial knowledge of the network beyond first acquaintances to diffuse information more swiftly than can be done by the traditional nearest-neighbour-only strategy. In exploring technological and infrastructural networks we can exploit our knowledge of the topology of the network to jump from a position to non-nearest neighbours in such a way that the whole system can be explored in a shorter time. In 2012, two groups independently published models that are designed to account for all potential long jumps that a random walker can take on a graph. In one of these papers Riascos and Mateos [11] proposed a random walk model with jumps that can go to a non-nearest neighbour with a probability that decays as a power-law of the shortest-path distance separating the two nodes (all formal definitions are given in the Preliminaries section of this paper). In the other work, Estrada [12] generalized the concept of graph Laplacian by introducing the $$k$$-path Laplacians, which were then plugged into a generalized diffusion equation to study the influence of long-range jumps on diffusive processes on graphs. While the first paper provides a probabilistic approach to the problem, the second one provides the algebraic tools needed for its generalization and mathematical formalization. In this work, we consider the generalized formulation of a random walk model with long-range jumps that decay as a function (not necessarily a power-law) of the shortest path distance between the nodes. We propose to call this model the random multi-hopper (RMH). We proceed by using the random walk–electrical networks connection discovered by Doyle in order to formulate mathematically the RMH model and some of its main parameters, namely the hitting and commute times.
We prove analytically here that for certain asymptotic values of the parameters of the model, the average hitting time of the random walker is the smallest possible for a classical random walker. We study here some deterministic graphs for which some extremal properties of random walks are known, such as lollipop, barbell and path graphs. Then, we move to the analysis of random networks; in particular, we explore here the Erdős–Rényi and the Barabási–Albert models. We finally study a real-world network representing the power grid of the western USA. In all cases, we compare the RMH with the normal random-walk (NRW) model, and conclude that the multi-hopper overcomes several of the difficulties that a normal random walker has in exploring a graph. In particular, the multi-hopper explores graphs with clusters of highly interconnected nodes, as well as graphs with very skewed degree distributions, more efficiently than the normal random walker. These characteristics, which are omnipresent in real-world networks, make the RMH an excellent choice for transport and search on complex systems.

2. Preliminaries

We introduce in this section some definitions and properties associated with random walks on graphs and define the notation used throughout the work. A graph $$G=(V,E)$$ is defined by a set of $$n$$ nodes (vertices) $$V$$ and a set of $$m$$ edges $$E=\{(u,v)|u,v\in V\}$$ between the nodes. All the graphs considered in this work are finite, undirected, simple, without self-loops, and connected. A path of length $$k$$ in $$G$$ is a sequence of different nodes $$u_{1},u_{2},\ldots,u_{k},u_{k+1}$$ such that for all $$1\leq l\leq k$$, $$(u_{l},u_{l+1})\in E$$. The length of a shortest path between two nodes $$i$$ and $$j$$ constitutes a distance function, here designated by $$d\left(i,j\right)$$, which is known as the shortest-path distance between the nodes $$i$$ and $$j$$.
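Since every quantity in the model is built from the shortest-path distances $$d\left(i,j\right)$$, it may help to see how they can be obtained for an unweighted graph. The following is only a minimal sketch using breadth-first search; the adjacency-dictionary input format and the function name `shortest_path_distances` are assumptions made for this illustration and do not come from the article.

```python
from collections import deque


def shortest_path_distances(adj):
    """All-pairs shortest-path distances d(i, j) by breadth-first search.

    `adj` maps each node to an iterable of its neighbours (an input format
    assumed for this sketch). The graph is assumed undirected and connected.
    """
    dist = {}
    for source in adj:
        d = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in d:          # first visit gives the shortest distance
                    d[v] = d[u] + 1
                    queue.append(v)
        dist[source] = d
    return dist


# Path graph on five nodes: 1-2-3-4-5
path5 = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
D = shortest_path_distances(path5)
print(D[1][5], D[2][4])
```

One breadth-first search per source node is sufficient here because all edges have unit length.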
Let $$A=\left(a_{ij}\right)_{n\times n}$$ be the adjacency matrix of the graph, where $$a_{ij}=1$$ if and only if $$\left(i,j\right)\in E$$ and it is zero otherwise. The degree of the node $$i$$ is the number of nodes adjacent to it and it is designated here by $$k\left(i\right)$$ or $$k_{i}$$. A random walk on the graph is a random sequence of vertices generated as follows. Given a starting vertex $$i$$ we select a neighbour $$j$$ at random, and move to this neighbour. Then we select a neighbour $$k$$ of $$j$$ at random, and move to it, and so on [13, 14]. This sequence of random nodes $$v_{t}:t=0,1,\ldots$$ is a Markov chain, and it has probability distribution $$\mathbf{p}_{t}\left(q\right)=\Pr\left(v_{t}=q\right)\!.$$ (2.1) The transition probability matrix is $$P=\left(p_{ij}\right)_{i,j\in V}$$ whose entries are given by $$p_{ij}=\dfrac{a_{ij}}{k_{i}},$$ (2.2) where $$k_{i}$$ is the degree of the node $$i$$ and $$a_{ij}$$ is the corresponding entry of the adjacency matrix. In matrix form, $$P=K^{-1}A$$, where $$K$$ is the diagonal matrix with entries $$K_{ii}=k_{i}$$. Then, the vector containing the probability of finding the walker at a given node of the graph at time $$t$$ is $$\mathbf{p}_{t}=\left(P^{T}\right)^{t}\mathbf{p}_{0},$$ (2.3) where $$T$$ represents the transpose of the matrix and $$\mathbf{p}_{0}$$ is the initial probability distribution. The following are important parameters of the random walk on a graph which are of direct utility in the current work. For a random walk starting at the node $$i$$, the expected number of steps before it reaches node $$j$$ is known as the hitting time and it is designated by $$H\left(i,j\right)$$. The expected number of steps of a random walk starting at node $$i$$ to visit node $$j$$ and then return to node $$i$$ again is known as the commute time, and it is designated by $$\kappa\left(i,j\right)$$. Both quantities are related by: $$\kappa\left(i,j\right)=H\left(i,j\right)+H\left(\,j,i\right)\!.$$ (2.4)

3. RMH model

3.1 Intuition of the model

In the normal random walk on a graph, the random walker makes steps of length one (in terms of the number of edges travelled) and after each step she rolls the dice again to decide where to move. This is reminiscent of a drunkard who does not remember where her home is, and so she stops at every junction that she finds and decides where to go next in a random way. Let us now suppose the existence of a random walker who does not necessarily stop at a nearest neighbour of her current position. That is, suppose that the random walker placed at the node $$i$$ of the graph selects any node $$q$$ of the graph to which she wants to move. Let us consider that the shortest path distance between $$i$$ and $$q$$ is $$d\left(i,q\right)$$. If $$d\left(i,q\right)>1$$ the random walker will not stop at any of the intermediate nodes between $$i$$ and $$q$$, but she will go directly to that node. This would correspond to a drunkard who thinks she remembers where her home is, then walks a few blocks without stopping at any junction, until she arrives at a given point where she realizes she is lost. Then, she repeats the process again. Therefore, the movements of the drunkard are modelled by a graph, the edges of which are shortest paths of various lengths of the original graph $$G$$, and the probability for the random walker to jump straight from node $$i$$ to node $$q$$ is proportional to a certain weight $$a\left(i,q\right)$$, which is a function of the distance $$d\left(i,q\right)$$ in $$G$$. As examples of decaying functions of the shortest path distance, we mention $$a\left(i,q\right)=\exp\left(-l\cdot d\left(i,q\right)\right)$$, $$a\left(i,q\right)=\left(d\left(i,q\right)\right)^{-s}$$ and $$a\left(i,q\right)=z^{-d\left(i,q\right)}$$, for $$l>0$$, $$s>0$$, and $$z>1$$, respectively.
Hereafter, we will consider the first two for the analysis, called respectively the Laplace transform case and the Mellin transform case, by analogy with these transforms. As an example, let us consider the one-dimensional case consisting of a linear chain of five nodes labelled as 1-2-3-4-5. Let us consider that the attractiveness of a given node from the current position of the walker is given by $$a\left(i,q\right)=\exp\left(-0.5\cdot d\left(i,q\right)\right)$$. If the drunkard is placed at Node 1 she has the following probabilities of making a hop of length 1, 2, 3 or 4: 0.46, 0.28, 0.17 and 0.10, respectively. That is, the probability that she stops at the nearest node from her current position is higher than that for the rest of the nodes, but the latter are not zero as in the classical random walk. If the attractiveness of the farthest nodes is dramatically reduced, the probabilities approach those of the classical random walker. For instance, if $$a\left(i,q\right)=\exp\left(-5\cdot d\left(i,q\right)\right)$$, the probabilities of making a hop of length 1, 2, 3 or 4 are 0.9933, 0.0067, 0.000 and 0.000, respectively. For $$l=10$$, we recover the classical random walk model to within four decimal places, with probabilities 1, 0, 0 and 0 for the hops of length 1, 2, 3 and 4, respectively. This clearly indicates that the current model is a generalization of the classical random walk model in which the random walker is allowed, with a certain probability, to travel to a non-nearest neighbour of her current position.

3.2 Mathematical formulation

Let $$G=\left(V,E\right)$$ be a simple, undirected graph without self-loops. Let $$d_{max}$$ be the graph diameter, that is, the maximum shortest path distance in the graph.
Let us now define the $$d$$-path adjacency matrix ($$d\leq d_{max}$$), denoted by $$A_{d}$$, of a connected graph of $$n$$ nodes as the square, symmetric, $$n\times n$$ matrix whose entries are: $$A_{d}\left(i,j\right)=\begin{cases} 1 & \text{if } d_{ij}=d,\\ 0 & \text{otherwise,} \end{cases}$$ (3.1) where $$d_{ij}$$ is the shortest path distance, that is, the number of edges in the shortest path connecting the nodes $$i$$ and $$j$$. The $$d$$-path degree of the node $$i$$ is given by [11, 12] $$k_{d}\left(i\right)=\left(\mathbf{1}^{T}A_{d}\right)_{i},$$ (3.2) where $$\mathbf{1}$$ is an all-ones column vector. Let us now consider the following transformed $$d$$-path adjacency matrices [12]: $${\hat{A}}^{\tau}=\sum_{d=1}^{d_{max}}c_{d}^{\tau}A_{d},$$ (3.3) where $$\tau$$ indicates the type of transformation, for instance $$c_{d}^{\tau=\text{Mel}}=d^{-s}$$ for $$s>0$$ is the Mellin transform and $$c_{d}^{\tau=\text{Lapl}}=\exp\left(-l\cdot d\right)$$ for $$l>0$$ is the Laplace transform. Let us define the generalized degree of a given node as $$\hat{k}^{\tau}\left(i\right)=\left(\hat{A}^{\tau}\mathbf{1}\right)_{i}.$$ (3.4) Now we define the probability that a particle staying at node $$i$$ hops to the node $$j$$ as $$P^{\tau}\left(i,j\right)=\hat{A}^{\tau}\left(i,j\right)/\hat{k}^{\tau}\left(i\right)\!.$$ (3.5) Notice that if we do not consider any long-range interaction, then $$\hat{A}^{\tau}=A$$ and $$P^{\tau}\left(i,j\right)=1/k\left(i\right)$$ if $$\left(i,j\right)\in E$$ and zero otherwise, where $$k\left(i\right)$$ is the degree of the node. That is, we recover the classical random walk probability. Let us denote by $$\hat{K}^{\tau}$$ the diagonal matrix with $$\hat{K}^{\tau}\left(i,i\right)=\hat{k}^{\tau}\left(i\right)$$ and let us define the matrix $$\hat{P}^{\tau}=\left(\hat{K}^{\tau}\right)^{-1}\hat{A}^{\tau}$$.
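The construction in (3.1)–(3.5) can be written down in a few lines. The sketch below is an illustration under stated assumptions rather than the authors' implementation: the function name `multi_hopper_transition`, its arguments and the use of a dense NumPy distance matrix are choices made here. It builds $$\hat{A}^{\tau}$$ directly from the matrix of shortest-path distances, which is equivalent to summing $$c_{d}^{\tau}A_{d}$$ over $$d$$, and then normalizes each row by the generalized degree.

```python
import numpy as np


def multi_hopper_transition(D, transform="laplace", param=0.5):
    """Transformed adjacency, generalized degrees and transition matrix.

    D is the n x n matrix of shortest-path distances of a connected graph.
    `transform` and `param` (l for Laplace, s for Mellin) are names assumed
    for this sketch.
    """
    D = np.asarray(D, dtype=float)
    A_hat = np.zeros_like(D)
    off = D > 0                                   # skip the diagonal (d = 0)
    if transform == "laplace":
        A_hat[off] = np.exp(-param * D[off])      # c_d = exp(-l d)
    elif transform == "mellin":
        A_hat[off] = D[off] ** (-param)           # c_d = d^(-s)
    else:
        raise ValueError("transform must be 'laplace' or 'mellin'")
    k_hat = A_hat.sum(axis=1)                     # generalized degree, Eq. (3.4)
    P_hat = A_hat / k_hat[:, None]                # hop probabilities, Eq. (3.5)
    return A_hat, k_hat, P_hat


# Five-node chain: the distance between nodes i and j is |i - j|
idx = np.arange(5)
D_chain = np.abs(idx[:, None] - idx[None, :])
A_hat, k_hat, P_hat = multi_hopper_transition(D_chain, "laplace", 0.5)
print(np.round(P_hat[0, 1:], 3))   # hop probabilities from the first node
```

For the five-node chain with $$l=0.5$$, the hop probabilities from the first node evaluate to approximately 0.455, 0.276, 0.167 and 0.102 for distances 1 to 4, and every row of $$\hat{P}^{\tau}$$ sums to one.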
Then, the evolution equation ruling the states of the walker at a given time step is given by $$\mathbf{p}_{t+1}=\left(\hat{P}^{\tau}\right)^{T}\mathbf{p}_{t}.$$ (3.6)

3.3 $$k$$-path Laplacians, hitting and commute times

In a similar way as for the Laplacian matrix of a graph, we define the Laplacian matrix corresponding to (3.3). That is, $$\hat{L}^{\tau}=\hat{K}^{\tau}-\hat{A}^{\tau},$$ (3.7) where $$\hat{K}^{\tau}$$ is the diagonal matrix of generalized degrees $$\hat{k}^{\tau}\left(i\right)$$ defined in (3.4) and $$\hat{A}^{\tau}$$ is the generalized adjacency matrix defined in (3.3). This is the Laplacian of the graph $$G^{\tau}$$, the (weighted) adjacency matrix of which is $$\hat{A}^{\tau}$$. As a result, this generalized Laplacian $$\hat{L}^{\tau}$$ is positive semi-definite. The graph $$G^{\tau}$$ can be seen as a network of resistances, with the entries of $$\hat{A}^{\tau}$$ representing the conductance (inverse resistance) of the connection between two nodes. By assuming that an electric current is flowing through the network $$G^{\tau}$$ by entering at the node $$i$$ and leaving at the node $$j$$ we can calculate the effective resistance between these two nodes as follows: $$\hat{\Omega}^{\tau}\left(i,j\right)=\hat{\mathcal{L}}^{\tau}\left(i,i\right)+\hat{\mathcal{L}}^{\tau}\left(\,j,j\right)-2\hat{\mathcal{L}}^{\tau}\left(i,j\right)\!,$$ (3.8) where $$\hat{\mathcal{L}}^{\tau}$$ is the Moore–Penrose pseudo-inverse of the generalized Laplacian matrix. It is well known that the analogue of this effective resistance for the simple graph is a distance between the corresponding pair of nodes. It is straightforward to show that this is also the case here and we will call $$\hat{\Omega}^{\tau}\left(i,j\right)$$ the generalized effective resistance between the nodes $$i$$ and $$j$$ in a graph. The sum of all resistance distances in a graph is known as the Kirchhoff index of the graph.
In the context of the multi-hopper model it can be defined in a similar way as $$\hat{\Omega}^{\tau}_{\rm tot}=\sum_{i<j}\hat{\Omega}^{\tau}\left(i,j\right)=\dfrac{1}{2}\mathbf{1}^{T}\hat{\Omega}^{\tau}\mathbf{1}.$$ (3.9) Then, an extension of a result obtained by Nash-Williams [15] and by Chandra et al. [16] allows us to calculate the commute and hitting times based on the generalized resistance distance. That is, the commute time between the corresponding nodes is given by: $$\hat{\kappa}^{\tau}\left(i,j\right)=vol\left(G^{\tau}\right)\hat{\Omega}^{\tau}\left(i,j\right)\!,$$ (3.10) where $$vol\left(G^{\tau}\right)=\sum_{i}\hat{k}^{\tau}\left(i\right)$$ is the volume of $$G^{\tau}$$, that is, the sum of the generalized degrees or, equivalently, twice the sum of the weights of the edges of $$G^{\tau}$$ (see for instance [17]). Using the scaled generalized fundamental matrix (SGFM) (see [18, 19]), we can express the hitting and commute times in matrix form as $$\hat{H}^{\tau}=\mathbf{1}\left[{\rm diag}\left(\tilde{Z^{\tau}}^{-1}\right)\right]^{T}-\tilde{Z^{\tau}},$$ (3.11) $$\hat{\kappa}^{\tau}=\mathbf{1}\left[{\rm diag}\left(\tilde{Z^{\tau}}^{-1}\right)\right]^{T}+\left[{\rm diag}\left(\tilde{Z^{\tau}}^{-1}\right)\right]\mathbf{1}^{T}-\tilde{Z^{\tau}}-\tilde{Z^{\tau}}^{T},$$ (3.12) where $$\tilde{Z^{\tau}}$$ is the SGFM for the graph $$G^{\tau}$$. The expected commute time averaged over all pairs of nodes can be easily obtained from the multi-hopper Kirchhoff index as $$\left\langle \hat{\kappa}^{\tau}\right\rangle =\dfrac{4\left(\mathbf{1}^{T}\mathbf{w^{\tau}}\right)}{n\left(n-1\right)}\hat{\Omega}^{\tau}_{\rm tot},$$ (3.13) where $$\mathbf{w^{\tau}}$$ is the vector containing the weight of each edge in the graph $$G^{\tau}$$.
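The resistance-based quantities in (3.7)–(3.10) and (3.13) can be checked numerically on small graphs. The sketch below is a hedged illustration, not the authors' code: the function names are chosen here, and the pseudo-inverse is computed with the standard identity $$\hat{\mathcal{L}}=\left(\hat{L}+\mathbf{1}\mathbf{1}^{T}/n\right)^{-1}-\mathbf{1}\mathbf{1}^{T}/n$$, which holds for the Laplacian of any connected weighted graph and avoids choosing a numerical cutoff.

```python
import numpy as np


def effective_resistance(A_hat):
    """Matrix of generalized effective resistances, Eq. (3.8)."""
    A_hat = np.asarray(A_hat, dtype=float)
    n = A_hat.shape[0]
    L = np.diag(A_hat.sum(axis=1)) - A_hat        # generalized Laplacian, Eq. (3.7)
    J = np.ones((n, n)) / n
    L_pinv = np.linalg.inv(L + J) - J             # pseudo-inverse (connected graph)
    diag = np.diag(L_pinv)
    return diag[:, None] + diag[None, :] - 2.0 * L_pinv


def kirchhoff_index(Omega):
    """Sum of resistances over unordered pairs, Eq. (3.9)."""
    return Omega.sum() / 2.0


def average_commute_time(A_hat):
    """Commute time averaged over all pairs of nodes, Eq. (3.13)."""
    A_hat = np.asarray(A_hat, dtype=float)
    n = A_hat.shape[0]
    total_weight = A_hat.sum() / 2.0              # 1^T w: each edge counted once
    omega_tot = kirchhoff_index(effective_resistance(A_hat))
    return 4.0 * total_weight * omega_tot / (n * (n - 1))


# Sanity check on the complete graph K_4 with unit weights
K4 = np.ones((4, 4)) - np.eye(4)
Omega_K4 = effective_resistance(K4)
print(Omega_K4[0, 1], kirchhoff_index(Omega_K4), average_commute_time(K4))
```

For the complete graph $$K_{n}$$ with unit weights every pair has resistance $$2/n$$, so the Kirchhoff index is $$n-1$$ and the average commute time is $$2\left(n-1\right)$$, which is what the example evaluates for $$n=4$$ up to floating-point error.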
In a similar way we can obtain the expected hitting time averaged over all pairs of nodes $$\left\langle \hat{H}^{\tau}\right\rangle =\dfrac{2\left(\mathbf{1}^{T}\mathbf{w^{\tau}}\right)}{n\left(n-1\right)}\hat{\Omega}^{\tau}_{\rm tot}.$$ (3.14) In order to understand the mechanism behind the efficiency of the multi-hopper random walker in exploring networks, it is important to relate the structure of the network to dynamical quantities. In the following part we study the stationary probability distribution and the mean first return time and their relation with the distances in the network. The stationary probability distribution vector $$\mathbf{\pi}^{\tau}$$ is given by $${\mathbf{\pi}}^{\tau}=\frac{\hat{A}^{\tau}\mathbf{1}}{\mathbf{1}^{T}\hat{A}^{\tau}\mathbf{1}}.$$ (3.15) Taking into account the definition of the matrix $$\hat{P}^{\tau}$$, we obtain for the elements $$\pi^{\tau}(i)$$ with $$i=1,2,\ldots,n$$ of the stationary probability distribution: $$\pi^{\tau}(i)=\frac{\hat{k}^{\tau}(i)}{\sum_{j=1}^{n}\hat{k}^{\tau}(\,j)}.$$ (3.16) In this way, we obtain for the Mellin transformation with parameter $$s$$: $$\hat{k}^{\tau=\text{Mel}}(i)=k\left(i\right)+k_{2}\left(i\right)\frac{1}{2^{s}}+k_{3}\left(i\right)\frac{1}{3^{s}}+\ldots+k_{d_{\rm max}}\left(i\right)\frac{1}{d_{\rm max}^{s}},$$ (3.17) and for the Laplace transformation with parameter $$l$$: $$\hat{k}^{\tau=\text{Lapl}}(i)=\left[k\left(i\right)+k_{2}\left(i\right)\frac{1}{e^{l}}+k_{3}\left(i\right)\frac{1}{e^{2l}}+\ldots+k_{d_{\rm max}}\left(i\right)\frac{1}{e^{l(d_{\rm max}-1)}}\right]\frac{1}{e^{l}}.$$ (3.18) The stationary probability distribution in Equation (3.16) determines the probability of finding the random walker at the node $$i$$ in the limit of large $$t$$.
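Equation (3.16) states that the stationary distribution is simply the vector of generalized degrees normalized to sum to one. The following sketch checks this numerically on a five-node path with Mellin weights; the function name `stationary_distribution`, its arguments and the choice of test graph are assumptions made for this illustration.

```python
import numpy as np


def stationary_distribution(D, transform="mellin", param=1.0):
    """Stationary distribution of Eq. (3.16) and the transition matrix.

    D is the matrix of shortest-path distances of a connected graph;
    `param` is s for the Mellin case and l for the Laplace case.
    """
    D = np.asarray(D, dtype=float)
    A_hat = np.zeros_like(D)
    off = D > 0                                   # skip the diagonal (d = 0)
    if transform == "mellin":
        A_hat[off] = D[off] ** (-param)
    elif transform == "laplace":
        A_hat[off] = np.exp(-param * D[off])
    else:
        raise ValueError("transform must be 'mellin' or 'laplace'")
    k_hat = A_hat.sum(axis=1)                     # generalized degrees
    pi = k_hat / k_hat.sum()                      # Eq. (3.16)
    P_hat = A_hat / k_hat[:, None]
    return pi, P_hat


idx = np.arange(5)
D_path = np.abs(idx[:, None] - idx[None, :])      # path graph on five nodes

pi, P_hat = stationary_distribution(D_path, "mellin", 1.0)
print(np.allclose(pi @ P_hat, pi))                # pi is invariant under the walk

pi_large_s, _ = stationary_distribution(D_path, "mellin", 50.0)
pi_small_s, _ = stationary_distribution(D_path, "mellin", 1e-9)
print(np.round(pi_large_s, 3), np.round(pi_small_s, 3))
```

For a very large exponent the result is numerically indistinguishable from the degree-proportional distribution of the normal random walk on the path, while for an exponent close to zero it is numerically uniform.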
Expressions (3.17) and (3.18) allow us to identify how the structure of the network and the long-range strategy, controlled by the parameter $$s$$ or $$l$$, combine to change the stationary probability distribution. In the limit $$s,l\to\infty$$ the long-range contribution vanishes and the result $$\pi(i)=\frac{k_{i}}{\sum_{j=1}^{n}k_{j}}$$ for the normal random walker is recovered. On the other hand, when $$s,l\to0$$, the dynamics includes, in the same proportion, contributions of neighbours, second-, third-, ..., and $$d_{\rm max}$$-nearest neighbours. In this limit, the stationary probability distribution is the same for all the nodes and $$\pi(i)=1/n$$.

4. On the hitting time in the RMH model

The most important result of this work is related to the average hitting time of the RMH walk when the parameters of the corresponding transforms tend to zero.

Lemma 1 Let us consider the transformed $$k$$-path adjacency matrices: $$\hat{A}^{\tau}=\sum_{d=1}^{d_{\rm max}}c_{d}^{\tau}A_{d},$$ (4.1) with $$c_{d}^{\tau=\text{Mel}}=d^{-s}$$ for $$s>0$$ and $$c_{d}^{\tau=\text{Lapl}}=\exp\left(-l\cdot d\right)$$ for $$l>0$$. Then, when $$s\rightarrow0$$ or $$l\rightarrow0$$ the average hitting time $$\left\langle \hat{H}^{\tau}\right\rangle \rightarrow\left(n-1\right)$$, independently of the topology of the graph, which is the minimum for any graph of $$n$$ nodes.

Proof. First, let $$G_{n}$$ be any connected, simple graph with $$n$$ nodes. Then, $$\left\langle \hat{H}\left(G_{n}\right)\right\rangle \geq\left\langle \hat{H}\left(K_{n}\right)\right\rangle =n-1$$, with equality if and only if $$G_{n}=K_{n}$$, where $$K_{n}$$ is the complete graph with $$n$$ nodes (see [20] for a couple of proofs). Then, when $$s\rightarrow0$$ or $$l\rightarrow0$$ we have that $$\hat{A}^{\text{Mel}}\left(i,j\right)\rightarrow1$$ and $$\hat{A}^{\text{Lapl}}\left(i,j\right)\rightarrow1$$, respectively.
This means that $$\hat{A}^{\tau}\left(i,j\right)=1$$ for all $$i\neq j$$ and $$\hat{A}^{\tau}\left(i,j\right)=0$$ for all $$i=j$$. In other words, $$\hat{A}^{\tau}=\mathbf{1}\mathbf{1}^{T}-I$$, which is the adjacency matrix of the complete graph $$K_{n}$$. Hence, when $$s\rightarrow0$$ or $$l\rightarrow0$$, $$\hat{A}^{\tau}\rightarrow A\left(K_{n}\right)$$. Since, as stated above, $$\left\langle H\left(K_{n}\right)\right\rangle =n-1$$, the result follows. □

At first sight the result may seem trivial. We replace the graph by a weighted complete graph, thus the hitting time should be simply that of a weighted complete graph. The problem, however, is that the weight that every edge in this complete graph receives depends on the structure of the original graph $$G$$. Only at the point $$s=0$$ or $$l=0$$ do the results coincide with those of the complete graph. Neither the rate of decay of the hitting time as $$s\rightarrow0$$ or $$l\rightarrow0$$, nor its value at any specific value of $$s$$ or $$l$$, is trivial, and both depend strongly on the topology of the graph. To give an illustrative example we consider here six different graphs with the same size $$n=100$$. These are a barbell graph $$B\left(100,33,33\right)$$, the path $$P_{100}$$, the cycle $$C_{100}$$, the square lattice $$10\times10$$, the triangular lattice $$10\times10$$ and the star graph $$S_{1,99}$$. A barbell graph $$B\left(n,k_{1},k_{2}\right)$$ is a graph with $$n$$ nodes and two cliques of sizes $$k_{1}$$ and $$k_{2}$$, connected by a path consisting of the remaining nodes. In Fig. 1, we illustrate the results for the hitting time as a function of the parameters $$s$$ (left panel) and $$l$$ (right panel) in the Mellin and Laplace transforms defined before, respectively. As can be seen, in both cases the rate of convergence to the smallest hitting time differs significantly from one structure to another.
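The limit in Lemma 1 can be checked numerically on small graphs. The sketch below evaluates Equation (3.14) with the resistance distance taken from the pseudoinverse of the weighted Laplacian of $$\hat{A}^{\tau}$$, which is an assumption about how the generalized resistance distance is computed; the function names and the choice of $$P_{20}$$ and $$C_{20}$$ are likewise illustrative.

```python
import numpy as np


def distance_matrix(n, graph):
    """Closed-form shortest-path distances for the path P_n or the cycle C_n."""
    idx = np.arange(n)
    gap = np.abs(idx[:, None] - idx[None, :])
    if graph == "path":
        return gap
    if graph == "cycle":
        return np.minimum(gap, n - gap)
    raise ValueError("graph must be 'path' or 'cycle'")


def transformed_adjacency(dist, kind, param):
    """A^tau = sum_d c_d A_d with c_d = d^(-s) (Mellin) or exp(-l*d) (Laplace)."""
    W = np.zeros(dist.shape, dtype=float)
    off_diag = dist > 0
    d = dist[off_diag].astype(float)
    W[off_diag] = d ** (-param) if kind == "mellin" else np.exp(-param * d)
    return W


def average_hitting_time(W):
    """Equation (3.14): <H> = 2 (1^T w) / (n (n - 1)) * Omega_tot."""
    W = W / W.max()                      # <H> is invariant under rescaling of weights
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W       # weighted graph Laplacian
    J = np.ones((n, n)) / n
    L_pinv = np.linalg.inv(L + J) - J    # pseudoinverse of L for a connected graph
    omega_tot = n * np.trace(L_pinv)     # sum of resistance distances over i < j
    total_weight = W.sum() / 2.0         # 1^T w, each edge counted once
    return 2.0 * total_weight / (n * (n - 1)) * omega_tot


n = 20
for graph in ("path", "cycle"):
    dist = distance_matrix(n, graph)
    for s in (50.0, 2.0, 1.0, 0.0):
        h = average_hitting_time(transformed_adjacency(dist, "mellin", s))
        print(f"{graph:5s}  s = {s:4.1f}  <H> = {h:.3f}")
```

At $$s=0$$ both graphs give $$n-1=19$$, while for large $$s$$ the values return to those of the normal random walk, $$(n^{2}-1)/3$$ for the path and $$n(n+1)/6$$ for the cycle. The intermediate values differ between the two graphs, in line with the remark that the rate of convergence depends on the topology.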
The case of the star is easy to analyse as it is formed only by shortest paths of length one and two. Thus its convergence to the best hitting time is very fast. The worst rate is that of the barbell graph, followed by the path and then the cycle. However, the largest diameter among these three graphs is not that of the barbell graph but that of the path, which indicates that the lengths of the shortest paths are not the only factor influencing the rate of convergence of the hitting time in general graphs (we will consider the barbell graphs later in this work). The square and triangular lattices show almost identical convergence rates although they display different shortest-path structures. More important is to consider the specific value of the hitting time at a given value of $$s$$ or $$l$$. Let us consider for instance $$s=2$$ and $$l=2$$. In this case, the barbell graph displays a hitting time 50 times larger than that of the square and triangular lattices for the Mellin transform, or 150 times larger for the Laplace transform. All of this shows that the consideration of an RMH in graphs is far from the trivial replacement of a graph by a weighted complete graph.

Fig. 1. Change of the average hitting time in a series of graphs with the same size $$n=100$$ as a function of the parameters $$s$$ and $$l$$ of the Mellin (a) and Laplace (b) transforms, respectively.

5. Deterministic graphs

In this section, we study some of the properties of the multi-hopper model for some classes of graphs which have deterministic structure. We study the average hitting time of all $$11,117$$ connected graphs with 8 nodes. The average hitting time has mean $$10.036\pm1.932$$ for all the graphs with $$n=8$$, with a maximum value of $$21.071$$.
These values converge quickly to $$n-1$$ as $$s,l\rightarrow0$$. For instance, for $$s=0.5$$ the mean of the average hitting time is already $$7.062\pm0.024$$, and this value drops to $$7.00253\pm0.00096$$ for $$s=0.1$$. The situation is very similar for $$l\rightarrow0$$: the mean of the average hitting time is $$7.0037\pm0.0045$$ for $$l=0.1$$ and $$7.000072\pm4.67\cdot10^{-5}$$ for $$l=0.01$$. In the next subsection, we study some specific families of graphs which frequently appear in bounds for the hitting and commute times in graphs.

5.1 Lollipop and barbell graphs

The first classes of graphs that we study here are the so-called lollipop and barbell graphs. The lollipop graphs appear in many extremal properties related to random walks on graphs. In 1990, Brightwell and Winkler [21] proved that the hitting time between a pair of nodes $$i$$ and $$j$$ in a graph is maximum for the lollipop graph $$L\left(n,\left\lfloor \tfrac{2n}{3}\right\rfloor \right)$$ consisting of a clique of $$\left\lfloor \tfrac{2n}{3}\right\rfloor$$ nodes including $$i$$ to which a path on the remaining nodes, ending in $$j$$, is attached. The same graph was found by Jonasson as the one containing the pair of nodes maximizing the commute time among all graphs [22]. Here we consider the lollipop graphs $$L\left(n,\left\lfloor \tfrac{n}{2}\right\rfloor \right)$$ and $$L\left(n,\left\lfloor \tfrac{2n}{3}\right\rfloor \right)\!,$$ which have appeared already in the previous section, and the symmetric barbell graphs $$B\left(n,\left\lfloor \tfrac{n}{3}\right\rfloor ,\left\lfloor \tfrac{n}{3}\right\rfloor \right)$$. We remark that these graphs are not necessarily the extremal ones for the hitting time, as discussed in the previous section, but they can be considered as representative of their classes. We observe that the three graphs display $$\left\langle H\right\rangle \approx an^{3}$$ for the normal random walk.
The coefficients $$a$$ obtained by nonlinear fitting of the hitting times against $$n$$ are: $$a\approx0.01387$$ for $$L\left(n,\left\lfloor \tfrac{n}{2}\right\rfloor \right)$$, $$a\approx0.0179$$ for $$B\left(n,\left\lfloor \tfrac{n}{3}\right\rfloor ,\left\lfloor \tfrac{n}{3}\right\rfloor \right)$$ and $$a\approx0.01928$$ for $$L\left(n,\left\lfloor \tfrac{2n}{3}\right\rfloor \right)$$. We then study the variation of the parameters $$s$$ and $$l$$ in the Mellin and Laplace transforms of the multi-hopper model for the three graphs previously studied. In Fig. 2, we illustrate the results of these calculations. An important aspect to remark at this point is the obvious difference between the use of the Mellin and Laplace transforms in the multi-hopper model. First, it is easily observed that the Mellin transform produces a faster decay of the average hitting time than the Laplace one for the three graphs. For instance, the 50% reduction in the average hitting time of the three graphs occurs for values of $$3.5<s<4.0$$ for the Mellin transform, but for $$1.5<l<2.0$$ for the Laplace transform. This implies that the Laplace transform converges to the average hitting time of $$n-1$$ at much smaller parameter values than the Mellin transform. The second important difference is observed in the insets of Fig. 2. For the Laplace-transformed multi-hopper, the lollipop graph $$L\left(n,\left\lfloor \tfrac{n}{2}\right\rfloor \right)$$ always has the largest value of the average hitting time at any value of $$l$$ among the three graphs studied. However, for the Mellin-transformed case, the lollipop graph $$L\left(n,\left\lfloor \tfrac{n}{2}\right\rfloor \right)$$ has the largest hitting time for large values of $$s$$, but for $$s\lesssim2.1$$ the lollipop graph $$L\left(n,\left\lfloor \tfrac{2n}{3}\right\rfloor \right)$$ is the one with the largest value of the average hitting time (see the crossing in the inset of Fig. 2).
This confirms the complexity of the analysis of the extremal graphs for the multi-hopper model, as we have hinted at in the previous section.

Fig. 2. Hitting time as a function of the parameters $$s$$ and $$l$$ for the Mellin (left) and the Laplace (right) transforms, respectively, in lollipops $$L\left(n,\left\lfloor \tfrac{n}{2}\right\rfloor \right)$$ (blue squares), $$L\left(n,\left\lfloor \tfrac{2n}{3}\right\rfloor \right)$$ (red circles) and barbell $$B\left(n,\left\lfloor \tfrac{n}{3}\right\rfloor ,\left\lfloor \tfrac{n}{3}\right\rfloor \right)$$ (yellow triangles) graphs for $$n=999$$. In the inset panels, we zoom the plot for the region $$1.8\leq s\leq2.3$$ and $$0.01\leq l\leq1$$, respectively.

We now concentrate on the variation of the average hitting time with the number of nodes in the lollipop $$L\left(n,\left\lfloor \tfrac{2n}{3}\right\rfloor \right)$$ for different values of the parameters $$s$$ and $$l$$ (Fig. 3). For the Mellin-transformed multi-hopper model with a fixed value of the exponent $$s$$, the average hitting time is always a power-law of the number of nodes: $$\left\langle \hat{H}^{\text{Mel}}\left(s\right)\right\rangle \approx an^{\gamma}$$, where $$\gamma\rightarrow3$$ when $$s\rightarrow\infty$$ and $$\gamma\rightarrow1$$ when $$s\rightarrow0$$.
For instance, $$\gamma=2.831$$ for $$s=3$$; $$\gamma=2.129$$ for $$s=2$$; $$\gamma=1.772$$ for $$s=1$$; $$\gamma=1.291$$ for $$s=0.5$$; and $$\gamma=1.011$$ for $$s=0.1$$. The situation is quite similar for the Laplace-transformed multi-hopper model. This observation indicates that for small values of the parameters $$s$$ and $$l$$ the average hitting time changes linearly with the number of nodes. This important observation is repeated for every family of graphs, as we will see in further sections of this work.

Fig. 3. Average hitting times for the RMH model with Mellin (left) and Laplace (right) transforms in the lollipop graph $$L\left(n,\left\lfloor \tfrac{2n}{3}\right\rfloor \right)$$ as a function of the number of nodes $$n$$ in the graph.

We then study the variation of the average hitting time with the number of nodes for the lollipop $$L\left(n,\left\lfloor \tfrac{2n}{3}\right\rfloor \right)$$ for $$0.001\leq s\leq0.05$$ and obtain a linear dependence of the type $$\left\langle \hat{H}_{M}\left(G\right)\!,s\right\rangle \approx\alpha n+\beta$$. Using these linear fits, we can estimate the critical number of nodes $$n_{c}$$ below which $$\left(n-1\right)\leq\left\langle \hat{H}_{M}\left(L\left(n,\left\lfloor \tfrac{2n}{3}\right\rfloor \right)\right)\!,s\right\rangle \leq n$$ for a given value of the parameter $$s$$. Requiring $$\alpha n+\beta\leq n$$ with $$\alpha>1$$ gives $$n_{c}\leq\dfrac{\beta}{1-\alpha}.$$ However, we can simplify this expression by observing that $$\beta<-1$$ and that $$\alpha\approx1+2.751s^{2}$$.
Then, $$n_{c}\leq\dfrac{1}{2.751s^{2}},\quad s\leq0.05.$$ (5.1) The values of the critical number of nodes are given in the Supplementary Information; they range from $$145$$ for $$s=0.05$$ to $$363,504$$ for $$s=0.001$$. This means, for instance, that any lollipop $$L\left(n,\left\lfloor \tfrac{2n}{3}\right\rfloor \right)$$ having fewer than 58,160 nodes will display an average hitting time bounded between $$n-1$$ and $$n$$ in the RMH model with a Mellin transform and parameter $$s\leq0.0025$$. The previous inequality can also be used the other way around, namely to estimate the value of $$s$$ that should be used such that a lollipop $$L\left(n,\left\lfloor \tfrac{2n}{3}\right\rfloor \right)$$ has average hitting time bounded as $$\left(n-1\right)\leq\left\langle \hat{H}_{M}\left(L\left(n,\left\lfloor \tfrac{2n}{3}\right\rfloor \right)\right)\!,s\right\rangle \leq n$$. For instance, if we would like to know the value of $$s$$ for which any such lollipop with fewer than 100,000 nodes has an average hitting time bounded in this way, we use $$s\leq\dfrac{1}{\sqrt{2.751n_{c}}},\quad s\leq0.05,$$ (5.2) and obtain $$s\approx0.0019$$. We can extrapolate here to roughly estimate the value of $$s$$ for which any lollipop $$L\left(n,\left\lfloor \tfrac{2n}{3}\right\rfloor \right)$$ with fewer than 1 million nodes has hitting time bounded as before. This estimation gives a value of $$s\lesssim0.0006$$. The importance of the previous investigation is the following. Currently, we do not know which graphs have the largest value of the average hitting time among all the graphs with $$n$$ nodes. However, we have strong intuition and evidence that it should be either a lollipop or a barbell graph. For these graphs, the average hitting time is of the order $$n^{3}$$ for the NRW.
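The two bounds (5.1) and (5.2) are simple arithmetic once the fitted coefficient $$2.751$$ is accepted, and can be evaluated as in the following sketch. The function names are hypothetical, and the formulas are only meant for the fitted range $$s\leq0.05$$ of the lollipop $$L\left(n,\left\lfloor \tfrac{2n}{3}\right\rfloor \right)$$.

```python
import math

# Fitted coefficient in alpha ~ 1 + 2.751 s^2, as reported in the text.
ALPHA_COEFF = 2.751


def critical_nodes(s):
    """Bound (5.1): n_c <= 1 / (2.751 s^2), used only for 0 < s <= 0.05."""
    if not 0.0 < s <= 0.05:
        raise ValueError("the fit is only reported for 0 < s <= 0.05")
    return 1.0 / (ALPHA_COEFF * s ** 2)


def max_parameter(n_c):
    """Bound (5.2) solved for s: the largest s keeping <H> <= n up to n_c nodes."""
    return 1.0 / math.sqrt(ALPHA_COEFF * n_c)


for s in (0.05, 0.0025):
    print(f"s = {s}: n_c ~ {critical_nodes(s):.0f}")
for n_c in (100_000, 1_000_000):
    print(f"n_c = {n_c}: s ~ {max_parameter(n_c):.4f}")
```

Evaluating these functions gives about 145 nodes for $$s=0.05$$ and about 58,160 nodes for $$s=0.0025$$, and $$s\approx0.0019$$ and $$s\approx0.0006$$ for 100,000 and 1 million nodes, respectively, matching the values quoted in the text.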
Then, we can use the previous values obtained for the lollipop $$L\left(n,\left\lfloor \tfrac{2n}{3}\right\rfloor \right)$$ as rough indications of the worst-case scenarios that can be expected for any graph. In other words, if we consider a graph of any structure having 1 million nodes we should expect its average hitting time to stay below $$n$$ for $$s\lesssim0.0006$$. We will see that for real-world networks these values of $$s$$ are overestimated by orders of magnitude, because the structure of these graphs makes the average hitting time drop orders of magnitude below this upper bound. A first flavour of these differences is obtained by the analysis of random graphs in the next section of this work.

5.1.1 Time evolution

In this section, we are not interested in a detailed description of the time evolution of the random walker or the multi-hopper in the lollipop or barbell graphs. Rather, we compare their evolution at different times in order to highlight the main difference between the two models. Consequently we consider a lollipop $$L\left(n,\left\lfloor \tfrac{2n}{3}\right\rfloor \right)$$ and a barbell $$B\left(n,\left\lfloor \tfrac{n}{3}\right\rfloor ,\left\lfloor \tfrac{n}{3}\right\rfloor \right)$$ graph, both with $$n=999$$ nodes. In both cases we place the random walker at a node of one of the cliques. This node is selected not to be the one attached to the path. Let any node in the clique of the lollipop (respectively, in a clique of the barbell) which is not the one connected to the path be designated as the node $$i$$. Let the end point of the path be named $$j$$. We remark that the node $$j$$ does not belong to the clique. Let the node connecting the clique to which $$i$$ belongs and the path be designated as $$k$$.
We then place the random walker and the multi-hopper at the node $$i$$ of the lollipop and of the barbell and compute the probability of finding them at each node after different times, using $$\mathbf{p}_{t}=\left(\hat{P}^{\textrm{Mel,}T}\right)^{t}\mathbf{p}_{0},$$ (5.3) where $$\hat{P}^{\textrm{Mel,}T}$$ is the transpose of $$\hat{P}^{\textrm{Mel}}.$$ As can be seen in Fig. 4, the classical random walker spends most of its time in the clique of the graphs, taking on average $$\left\lfloor \tfrac{2n}{3}\right\rfloor -1$$ steps to visit the node $$k$$ in the lollipop and $$\left\lfloor \tfrac{n}{3}\right\rfloor -1$$ to visit it in the barbell. Once the walker visits the node $$k$$ she can walk towards the node $$j$$ only with probability $$1/\left\lfloor \tfrac{2n}{3}\right\rfloor$$ in the lollipop and $$1/\left\lfloor \tfrac{n}{3}\right\rfloor$$ in the barbell graph. It can then be seen in Fig. 4 that for times as large as $$t=10^{6}$$ the random walker is still stuck in the clique of the lollipop graph. In the case of the barbell, when $$t=10^{3}$$ the walker has visited only the nodes of the clique in which she started, and when $$t=10^{6}$$ she starts to visit the nodes of the other clique.

Fig. 4. Probability distribution at the different nodes of a lollipop $$L\left(n,\left\lfloor \tfrac{2n}{3}\right\rfloor \right)$$ (top) and barbell $$B\left(n,\left\lfloor \tfrac{n}{3}\right\rfloor ,\left\lfloor \tfrac{n}{3}\right\rfloor \right)$$ (bottom) graph with $$n=999$$ nodes. Classical random walk (blue solid line) and the multi-hopper using the Mellin transform with $$s=1$$ (red broken line) and using the Laplace transform with $$l=0.1$$ (red dotted line). The evolution of the probabilities is shown at three different times for a random walk starting at node $$i$$ (see text) at $$t=50$$ (left), $$t=1000$$ (centre) and $$t=10^{6}$$ (right).

On the other hand, the RMH has a non-zero probability of escaping directly from the clique. As can be seen in the left panels of Fig. 4, even for the small time $$t=50$$ the multi-hopper with Mellin transform has already visited all the nodes of the graphs. For this short time, however, the multi-hopper with Laplace transform has visited all the nodes of the cliques plus the initial nodes of the path, but she has not yet arrived at the node $$j$$. For time $$t=1000$$ the multi-hopper with Mellin transform is already in the stationary state and the one with Laplace transform has already visited all the nodes of the graphs. At $$t=10^{6}$$ the multi-hopper has reached the stationary state for both transforms. This significant difference with the classical RW arises because the RMH is not trapped in the cliques: she can go directly from $$i$$ to any node of the graph with a probability that decays as a function of the distance from $$i$$. Then, the first few nodes of the path are frequently visited by the multi-hopper, as they are at relatively short distances from the node $$i$$. Once the multi-hopper has reached these nodes, she can visit the extreme nodes of the graphs more easily, overtaking the classical RW even for relatively short times.
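The evolution in Equation (5.3) can be reproduced on a much smaller lollipop as a minimal sketch. The graph size (a clique of 20 nodes with a path of 10 nodes), the node labels and all function names below are illustrative assumptions; only the Mellin transform with $$s=1$$ and the classical walk are compared.

```python
from collections import deque

import numpy as np


def lollipop_adjacency(clique_size, path_len):
    """Clique on nodes 0..clique_size-1 with a path attached to its last node."""
    n = clique_size + path_len
    A = np.zeros((n, n), dtype=int)
    A[:clique_size, :clique_size] = 1
    np.fill_diagonal(A, 0)
    for u in range(clique_size - 1, n - 1):
        A[u, u + 1] = A[u + 1, u] = 1
    return A


def shortest_path_matrix(adj):
    """All-pairs shortest-path distances of a connected, unweighted graph (BFS)."""
    n = adj.shape[0]
    dist = np.full((n, n), -1, dtype=int)
    for src in range(n):
        dist[src, src] = 0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in np.flatnonzero(adj[u]):
                if dist[src, v] < 0:
                    dist[src, v] = dist[src, u] + 1
                    queue.append(v)
    return dist


def transition_matrix(W):
    """Row-stochastic matrix obtained by dividing each row by its (long-range) degree."""
    return W / W.sum(axis=1, keepdims=True)


def evolve(P, start, t):
    """Equation (5.3): p_t = (P^T)^t p_0 with the walker initially at `start`."""
    p0 = np.zeros(P.shape[0])
    p0[start] = 1.0
    return np.linalg.matrix_power(P.T, t) @ p0


A = lollipop_adjacency(clique_size=20, path_len=10)
dist = shortest_path_matrix(A)
W_mellin = np.zeros(A.shape)
W_mellin[dist > 0] = dist[dist > 0].astype(float) ** (-1.0)   # Mellin, s = 1

i, j = 0, A.shape[0] - 1   # i: clique node not attached to the path; j: end of the path
p_nrw = evolve(transition_matrix(A.astype(float)), i, t=50)
p_rmh = evolve(transition_matrix(W_mellin), i, t=50)
print(f"P(at j after 50 steps): NRW = {p_nrw[j]:.2e}, RMH = {p_rmh[j]:.2e}")
```

In this toy setting the multi-hopper places more probability on the end of the path after 50 steps than the classical walker does, which is the qualitative behaviour described for the $$n=999$$ graphs.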
The way in which an RMH propagates through a path is analysed in the next subsection of this work.

5.2 Path graphs

Another interesting graph to consider is the path $$P_{n}$$. As proved by Palacios [20], this graph has the maximum sum of all effective resistances among all pairs of nodes, i.e., the maximum Kirchhoff index. For the normal RW, Palacios proved that $$\Omega_{\rm tot}\sim n^{3}$$. Because the number of edges in $$P_{n}$$ is $$n-1$$ we easily get that $$\left\langle H\left(P_{n}\right)\right\rangle \sim n^{2}$$. In Fig. 5, we illustrate the evolution of the probabilities of being at a given node of the path of 1000 nodes labelled in consecutive order from 1 to 1000, in which we have placed the random walker at the node 1. As can be seen in Fig. 5 for $$t=500$$ (left), the classical random walker has visited only the first 100 nodes of the path while the RMHs for both transforms have already visited all the nodes. As the time increases, the RMH model gives almost identical probabilities of finding the walker at any node of the path, but the classical random walker still shows close to zero probability of finding the walker at the other side of the path for times as large as $$t=5000$$ (see right plots in Fig. 5).

Fig. 5. Probabilities of finding the random walker at a given node of $$P_{1000}$$ at $$t=500$$ (left), $$t=1000$$ (centre) and $$t=5000$$ (right) for the classical (blue solid line) and multi-hopper random walk model with Mellin transform with parameter $$s=2$$ and with the Laplace transform with parameter $$l=0.1$$.
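The scaling $$\left\langle H\left(P_{n}\right)\right\rangle \sim n^{2}$$ stated above can be checked numerically, together with the effect of the Mellin transform on the exponent. The following sketch uses Equation (3.14) with resistance distances from the Laplacian pseudoinverse; the function names, the sizes and the choice $$s=2$$ are assumptions made for illustration, and the exponents are simple log-log fits rather than the fits reported in the Supplementary Information.

```python
import numpy as np


def path_weights(n, s=None):
    """Path P_n: classical adjacency if s is None, else Mellin weights d^(-s)."""
    idx = np.arange(n)
    dist = np.abs(idx[:, None] - idx[None, :]).astype(float)
    if s is None:
        return (dist == 1).astype(float)
    W = np.zeros((n, n))
    W[dist > 0] = dist[dist > 0] ** (-s)
    return W


def average_hitting_time(W):
    """Equation (3.14): <H> = 2 (1^T w) / (n (n - 1)) * Omega_tot."""
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W       # weighted graph Laplacian
    J = np.ones((n, n)) / n
    L_pinv = np.linalg.inv(L + J) - J    # pseudoinverse of L for a connected graph
    omega_tot = n * np.trace(L_pinv)     # sum of resistance distances over i < j
    return 2.0 * (W.sum() / 2.0) / (n * (n - 1)) * omega_tot


sizes = np.array([50, 100, 200, 400])
h_nrw = np.array([average_hitting_time(path_weights(n)) for n in sizes])
h_rmh = np.array([average_hitting_time(path_weights(n, s=2.0)) for n in sizes])

# The slope of log<H> against log n estimates the power-law exponent.
exp_nrw = np.polyfit(np.log(sizes), np.log(h_nrw), 1)[0]
exp_rmh = np.polyfit(np.log(sizes), np.log(h_rmh), 1)[0]
print(f"fitted exponent: NRW = {exp_nrw:.3f}, Mellin s = 2: {exp_rmh:.3f}")
```

For the classical walk the computed values agree with the closed form $$(n^{2}-1)/3$$, which follows from the Kirchhoff index $$(n^{3}-n)/6$$ of the path and its $$n-1$$ edges.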
As in the previous subsection, we study here the influence of the graph size on the hitting time in paths for both the Mellin and Laplace transforms. In particular, we compare both transformations in the multi-hopper random walk with the classical one for the path graph with $$100\leq n\leq1000$$. As expected, the average hitting time in the classical random walk follows a quadratic dependence on the number of nodes, $$\left\langle H_{k}\right\rangle \approx0.3333n^{2}$$. However, for the multi-hopper it follows power-laws with exponent smaller than 2. In these cases the dependence is of the form $$\left\langle H_{k}\right\rangle \approx an^{b}$$, with parameters given in the Supplementary Information. The most interesting point here is that, as for the barbell and lollipop graphs, the average hitting time of paths also increases linearly with the number of nodes for relatively small values of the parameters $$s$$ and $$l$$, as can be seen in Supplementary Fig. S1 accompanying this article.

5.3 Some remarks

It is intuitive to think that the average shortest-path distance plays a fundamental role in explaining the average hitting time of graphs in the normal RW model. Then, because we allow for long-range jumps in the multi-hopper model, we would intuitively expect the influence of the shortest-path distance to be diminished in this model. However, one important thing that we have learned from the analysis of the lollipop, barbell and path graphs is the following. Although the average shortest-path distance plays some role in the determination of the average hitting times, it is the existence of large, relatively isolated clusters which plays the major role.
That is, although a path graph has the largest possible average shortest-path distance of any graph with $$n$$ nodes (the average shortest-path length is equal to $$(n+1)/3$$), it has an average hitting time one order of magnitude smaller than the lollipop and barbell graphs, which have comparably small average shortest-path distances, particularly the ones analysed in this section. This role of large clusters in graphs, which we discussed in this section, is of great importance for the analysis of real-world networks. Although these networks have relatively small average shortest-path distance due to their small-world properties, they contain many communities (clusters of tightly connected nodes which are poorly connected to each other) which resemble the extremal situation of barbell and lollipop graphs.

6. Random graphs

In this section, we explore the multi-hopper model for two types of random networks: Barabási–Albert (BA) [23] and Erdős–Rényi (ER) [24] types. We use the exact result obtained for the expected hitting time averaged over all pairs of nodes $$\left\langle \hat{H}_{\tau}\right\rangle$$ in Equation (3.19). The analysis of the BA and ER random networks shows that the hitting time increases linearly with the number of nodes in the graph (see Supplementary Figures). In all cases, the use of the Mellin transform in the multi-hopper reduces the slope of the linear fits $$\left\langle \hat{H}_{\tau}\right\rangle \approx an+b$$ (see Supplementary Information for the parameters $$a$$ and $$b$$) expressing the dependence of the hitting time on the number of nodes. Thus, we found that in general, using the long-range strategies, the resulting random walker reaches any site on the network more efficiently than the normal random walk. In Fig.
6, we fix the number of nodes to $$n=2000$$ in ER and BA networks and calculate the average hitting time as a function of the parameters $$l$$ and $$s$$ for the Laplace and Mellin transforms, respectively. The results confirm previous findings [11] that in the limit $$l,s\to0$$ the hitting times reach the value $$n-1$$. However, for parameters in the interval $$(0,10)$$, the two types of strategies present a strong variation in the values of the average hitting time $$\left\langle \hat{H}^{\tau}\right\rangle$$; this is a direct consequence of how the random walk strategies assign weights to small, intermediate and large steps. Finally, for large values of the parameters, long-range transitions appear with low probability and the values of $$\left\langle \hat{H}^{\tau}\right\rangle$$ are equal to the results for the normal random walk strategy with transitions only to nearest neighbours. These results are of great significance for the further analysis of real-world networks in the next section of this work.

Fig. 6. Influence of the degree distribution on the average hitting time for random networks with $$n=2000$$ nodes. (a) Barabási-Albert (BA) network, (b) connected Erdős-Rényi (ER) networks with probability $$p=\log(n)/n$$. We plot the results for the hitting time as a function of the parameter $$l$$ for the Laplace transform and $$s$$ for the Mellin transformation. We depict, with dashed lines, the results for the normal random walk (NRW) and $$n-1$$ obtained for a complete graph.

As we have observed in the previous analysis, there are very significant differences between the random networks considered here and the graphs analysed in the previous section, where the average hitting time increases as a third or second power of the number of nodes. The linear increase observed here for the random graphs studied cannot be understood only on the basis of the fact that they display relatively small shortest-path distances. For instance, we can construct barbell graphs $$B\left(n,\left\lfloor \tfrac{n-k}{2}\right\rfloor ,\left\lfloor \tfrac{n-k}{2}\right\rfloor \right)$$ with small values of $$k$$, which have small average shortest-path distance: such a graph has only distances $$d_{ij}\in\left[1,k\right]$$. One important difference, however, between the studied random networks and the barbell and lollipop graphs previously considered is the lack of large cliques in these random graphs which may trap the random walker inside them. In the next section, we study this problem by using random graphs with different intercommunity density of links. In addition, we study the influence of the degree distribution on the hitting time of these random graphs with the goal of understanding the differences between ER and BA networks.

6.1 Influence of communities and degree distribution

We start here by considering the influence of the presence of clusters of nodes defined in the following way. Let us consider a network with $$n$$ nodes. Let us make a partition of the network in $$k$$ clusters of size $$\left\lfloor \dfrac{n}{k}\right\rfloor$$. Let $$C_{i}$$ and $$C_{j}$$ be two of such clusters.
Then, the probability that two nodes $$p,q\in C_{i}$$ are connected is much larger than the probability that two nodes $$r\in C_{i}$$ and $$s\in C_{j}$$ are connected. This gives rise to higher internal densities of links in the clusters than the inter-cluster density of links. It is a well-known fact that neither the ER nor the BA networks contain this kind of cluster. The lack of such clusters, known in network theory as communities, is characterized by the so-called good expansion properties of these graphs. Loosely speaking, a graph is an expander if it does not contain any structural bottleneck, i.e., a small group of nodes or edges whose removal separates the network into two connected components of approximately the same size [25]. We remark here that both ER and BA graphs have been proved to be expanders when the number of nodes is very large [25, 26]. We then use an implementation of the algorithm described by Lancichinetti et al. [27] to produce undirected random networks with communities with a fixed average degree $$\langle k\rangle$$. A mixing parameter $$\mu$$ defines the fraction of links that a node shares with nodes in other communities. A small value of the mixing parameter produces graphs with tightly connected clusters which are poorly connected among them. That is, it produces very well-defined communities in the graph. As the mixing parameter increases and surpasses the value 0.5, the communities disappear and the graph looks more and more like an expander for a sufficiently large number of nodes. Here, we explore the effect of communities on the capacity of the multi-hopper random walk strategy to reach any site of the network by constructing random graphs with the same number of nodes and edges but changing the mixing parameter. In Fig. 7, we depict the average hitting time $$\langle H^{\tau}\rangle$$ for different values of the parameters $$l$$ and $$s$$ for networks with communities constructed as described before.
As can be seen in this figure, for random graphs with well-defined communities (small values of the mixing parameter), the random walker needs significantly longer times to explore the whole network. This is particularly true for relatively large values of the Mellin and Laplace parameters of the multi-hopper model, which indicates that the normal RW is significantly less efficient in networks having communities than in networks not displaying this structural characteristic. Here again, as these parameters approach zero the hitting time decays to the lower bound, as expected from the theory. In closing, the small hitting times observed for the random graphs studied in the previous subsection are mainly due to the fact that these graphs are expanders and lack any community structure, which may trap the random walker for long times without visiting other clusters. The multi-hopper solves this trapping problem by having a larger chance to jump from one community to another, thereby reducing her time inside each of the clusters visited. We provide a video as Supplementary Information accompanying this paper that visualizes these findings.

Fig. 7. Average hitting time for the multi-hopper random walker in networks with communities. (a) Laplace and (b) Mellin strategies. We explore networks with $$n=1000$$ nodes, an average degree $$\langle k\rangle=15$$ and different values of the mixing parameter $$\mu$$ that defines the fraction of connections that a node has with nodes in other communities.
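The trapping effect of communities can be reproduced on a toy graph. The following is a minimal sketch, not the authors' code: it assumes that the multi-hopper jumps from $$i$$ to $$j$$ with probability proportional to $$d_{ij}^{-s}$$ (Mellin) or $$e^{-l\,d_{ij}}$$ (Laplace), and it replaces the benchmark graphs of [27] by two cliques joined by a single edge. The function names (`shortest_path_matrix`, `multi_hopper_matrix`, `average_hitting_time`, `two_cliques`) are hypothetical helpers introduced only for this illustration.

```python
from collections import deque

import numpy as np


def shortest_path_matrix(adj):
    """All-pairs shortest-path distances of a connected graph, by BFS from each node."""
    n = len(adj)
    dist = np.zeros((n, n))
    for src in range(n):
        seen = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        for v, d in seen.items():
            dist[src, v] = d
    return dist


def multi_hopper_matrix(dist, param, kind="mellin"):
    """Assumed transition matrix: weights d^(-s) (Mellin) or exp(-l*d) (Laplace)."""
    n = dist.shape[0]
    off_diag = ~np.eye(n, dtype=bool)
    weights = np.zeros_like(dist)
    if kind == "mellin":
        weights[off_diag] = dist[off_diag] ** (-param)
    else:
        weights[off_diag] = np.exp(-param * dist[off_diag])
    return weights / weights.sum(axis=1, keepdims=True)


def average_hitting_time(P):
    """Mean of H(i, j) over ordered pairs i != j, using one linear solve per target."""
    n = P.shape[0]
    total = 0.0
    for target in range(n):
        keep = [i for i in range(n) if i != target]
        Q = P[np.ix_(keep, keep)]  # walk restricted to the non-target nodes
        h = np.linalg.solve(np.eye(n - 1) - Q, np.ones(n - 1))
        total += h.sum()
    return total / (n * (n - 1))


def two_cliques(m):
    """Toy community graph (an assumption of this sketch): two m-cliques and one bridge."""
    adj = {i: set() for i in range(2 * m)}
    for block in (range(0, m), range(m, 2 * m)):
        for u in block:
            for v in block:
                if u != v:
                    adj[u].add(v)
    adj[m - 1].add(m)
    adj[m].add(m - 1)
    return adj


dist = shortest_path_matrix(two_cliques(10))
for s in (50.0, 1.0, 0.1, 0.0):
    # Large s behaves like the normal random walk; s = 0 gives uniform jumps.
    print(s, average_hitting_time(multi_hopper_matrix(dist, s)))
```

Under these assumptions, a very large $$s$$ essentially recovers the normal random walk, which is held back by the single bridge, while $$s=0$$ gives uniform jumps and an average hitting time of $$n-1$$.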
We now consider the influence of the degree distribution on the performance of the RMH. We study the stationary probability distribution $$\pi^{\tau}(i)$$ for the Laplace and Mellin transforms in a BA and an ER network with $$n=2000$$ nodes. The results are obtained from the calculation of the long-range degrees $$k^{\tau}(i)$$ in Equations (3.17) and (3.18) and the respective normalization defined in Equation (3.16). In Fig. 8, we summarize our results. The important aspect of these plots is the slope of the corresponding curves for different values of the Mellin and Laplace parameters in the multi-hopper model. If we compare the slopes for the ER network with those of the BA network, we observe that the former are smaller and closer to the constant line $$n^{-1}$$ than the latter. The smallest hitting time is obtained when the slope coincides with this line, which represents the fully connected graph. Thus, the ER graphs are already close to this slope, and this is the main reason why they display relatively small hitting times. In the BA model, however, when $$s,l$$ are very large the slopes of the curves are very steep and far away from the asymptotic result. As these parameters approach zero, the curves become flatter, approaching $$\pi^{\tau}(i)=n^{-1}$$ as a consequence of the fact that the graph approaches the fully connected one. That is, the long-range dynamics changes the way in which the random walker reaches the nodes. For small values of the parameter $$s$$ or $$l$$, the stationary probability distribution reaches the value $$\pi(i)=1/n$$. On the other hand, the inverse of the stationary probability distribution defines the average time $$\left\langle t^{\tau}(i)\right\rangle =\frac{1}{\pi^{\tau}(i)}$$ needed for the random walker to return for the first time to the node $$i$$. In this way, the random walker returns frequently to sites with high values of $$\pi^{\tau}$$ and gets trapped in regions with this property.
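The flattening of the stationary distribution can be illustrated on the simplest graph with a hub. The sketch below is an illustration under stated assumptions rather than the calculation behind Fig. 8: Equations (3.16) to (3.18) are not reproduced in this section, so the code assumes the long-range degree $$k^{\tau}(i)=\sum_{j\neq i}d_{ij}^{-s}$$ (Mellin) or $$\sum_{j\neq i}e^{-l\,d_{ij}}$$ (Laplace) and the normalization $$\pi^{\tau}(i)=k^{\tau}(i)/\sum_{j}k^{\tau}(j)$$. A star graph stands in for a network with a hub, and all function names are hypothetical.

```python
import numpy as np


def star_distances(n):
    """Shortest-path distances of a star graph: node 0 is the hub, the rest are leaves."""
    dist = np.full((n, n), 2.0)  # leaf to leaf goes through the hub
    dist[0, :] = 1.0
    dist[:, 0] = 1.0
    np.fill_diagonal(dist, 0.0)
    return dist


def long_range_weights(dist, param, kind="mellin"):
    """Assumed jump weights: d^(-s) for Mellin, exp(-l*d) for Laplace, zero on the diagonal."""
    n = dist.shape[0]
    off_diag = ~np.eye(n, dtype=bool)
    weights = np.zeros_like(dist)
    if kind == "mellin":
        weights[off_diag] = dist[off_diag] ** (-param)
    else:
        weights[off_diag] = np.exp(-param * dist[off_diag])
    return weights


def stationary_distribution(dist, param, kind="mellin"):
    """pi(i) = k(i) / sum_j k(j), where k(i) is the long-range degree (row sum of weights)."""
    k = long_range_weights(dist, param, kind).sum(axis=1)
    return k / k.sum()


dist = star_distances(11)
for s in (50.0, 2.0, 0.5, 0.0):
    pi = stationary_distribution(dist, s)
    # pi[0] is the hub; 1 / pi[0] is its mean first-return time.
    print(s, pi[0], pi[1], 1.0 / pi[0])
```

Because the weights are symmetric, the normalized long-range degrees are stationary for the row-normalized transition matrix. In this toy case the hub holds half of the probability for very large $$s$$ and exactly $$1/n$$ at $$s=0$$, so its mean return time grows from 2 to $$n$$.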
As we can see in Fig. 8, the effect of the long-range strategies is to reduce the probability of revisiting highly connected sites and to increase the capacity of reaching any site in the network.

Fig. 8. Stationary probability distribution for multi-hopper random walkers on Barabási-Albert and Erdős-Rényi networks with $$n=2000$$ nodes. (a) Laplace transform, (b) Mellin transform.

In closing, in this section we have seen that a random walker can be trapped in certain regions of a network (that is, have a larger probability of staying in these regions than in other parts of the graph) due to two different factors. The first is the presence of clusters of highly connected nodes in which the random walker is retained for long times before she visits other clusters of the graph. The second is the existence of hubs (highly connected nodes) to which the random walker returns frequently, thereby making her exploration of the network more difficult. These two characteristics, the presence of communities and the existence of fat-tailed degree distributions, are well known to be ubiquitous in real-world networks. The observation that the RMH can overcome both of these factors makes this model an important candidate for the exploration of real-world networks, which is the topic of our next section.

7. Real-world networks

One of the areas in which the RMH has many potential applications is the study of large real-world networks. Normal random walks have previously been used as mechanisms of transport and search on such networks [3, 28, 29]. These networks are graphs representing the networked skeleton of complex systems ranging from infrastructural and technological to biological and social systems.
Here we study a few networks representing a variety of real-world complex systems, including biological, communication and infrastructural ones. In Table 1, we report the sizes of these networks as well as the hitting times using the normal random walk and the multi-hopper with the Mellin transform. By using expression (5.2), we can estimate the lower bound for the value of the Mellin parameter $$s$$ for which $$\left\langle \hat{H}_{M}\right\rangle \leq n$$. These values are given in Table 1 as $$s_{B}$$ for all the networks studied in this section. In addition, we calculate the actual value of this parameter for which $$\left\langle \hat{H}_{M}\right\rangle \leq n$$ in these networks, and report it as $$s_{c}$$ in Table 1. The values of $$s_{c}$$ are obtained as follows. We calculate the value of $$\left\langle \hat{H}_{M}\right\rangle$$ for different values of $$s$$ and obtain a fit of the form $$\left\langle \hat{H}_{M}\right\rangle \approx\alpha s^{2}+\beta$$ for $$0.01\leq s\leq0.5$$. Obviously, $$\beta=n-1$$, which is the lowest value attained by $$\left\langle \hat{H}_{M}\right\rangle$$ for any graph. Using these fitted equations, we then calculate the values of $$s_{c}$$ reported in Table 1. As can be seen, the values of $$s_{c}$$ are on average about 10 times larger than the lower bound expected from the lollipop graphs of the same size as the studied networks.

Table 1. Real-world networks studied in this work, their number of nodes $$n$$ and the average hitting time of the normal random walker $$\left\langle H\right\rangle$$. $$s_{c}$$ is the value of the Mellin parameter $$s$$ for which the corresponding network has a hitting time smaller than $$n$$. The value of $$s_{B}$$ is the Mellin parameter $$s$$ for which the corresponding lollipop graph $$L\left(n,\left\lfloor \dfrac{3n}{2}\right\rfloor \right)$$ with the same number of nodes as the real-world network has a hitting time smaller than $$n$$. The last column, % impr., represents the percentage of improvement in the hitting time using the Mellin-transformed multi-hopper with respect to the NRW.

| Network | $$n$$ | $$\left\langle H\right\rangle$$ | $$s_{c}$$ | $$s_{B}$$ | % impr. |
|---|---|---|---|---|---|
| Bio_PPI_yeast | 2,224 | 8,652 | 0.145 | 0.0128 | 389 |
| City_Atlanta | 3,234 | 11,973 | 0.088 | 0.0106 | 370 |
| Colab_Geom | 3,621 | 15,719 | 0.086 | 0.0100 | 434 |
| City_Berlin | 4,495 | 13,752 | 0.094 | 0.0090 | 306 |
| Power_USA | 4,941 | 34,455 | 0.098 | 0.0086 | 697 |
| City_Barcelona | 5,575 | 17,282 | 0.063 | 0.0081 | 310 |
| Colab_AstroPh | 17,903 | 101,072 | 0.048 | 0.0045 | 565 |
| City_Seattle | 20,207 | 108,746 | 0.041 | 0.0042 | 538 |
| Colab_CondMat | 21,363 | 77,298 | 0.049 | 0.0041 | 362 |
| Comm_Enron | 33,696 | 193,714 | 0.040 | 0.0033 | 575 |

8. A note on the multi-hopper probability distributions

In this section, we study the distribution of probabilities of finding the random walker at a given place as a function of the distance from that place to the current position of the walker. In a homogeneous and isotropic space, e.g. in a continuous space, Lévy flights (LF) are a frequently studied model in which the walker can also jump to distant regions of the space [30]. LFs are widely used to model efficient search processes, for instance, of animals searching for sparse food. At every jump the step length is drawn from a long-tailed probability density function with a power-law decay [31–33]. Due to the homogeneity and isotropy of the space in which the random walker moves, there should be a perfect correlation between the probability of realizing a jump and the distance separating the two places between which the walker moves.
That is, the probability of making a jump of length $$k$$ is exactly the same independently of the position in space where the walker starts or ends her jump. This can be interpreted as the fact that at every jump all memory of previous jumps is erased, and thus the jump lengths $$x$$ are independent and identically distributed random variables [34]. Is this situation observed for a graph in the context of the multi-hopper random walk? The answer is, in general, no. For the RMH, the conditions of homogeneity and isotropy of the space are broken. Consider, as before, a barbell graph in which there are two cliques and a path connecting them. Let us pick (i) a pair of nodes inside one of the cliques, (ii) a pair of nodes in the middle of the path and (iii) a pair formed by the node of the clique that is attached to the path and its neighbour on the path. In all three cases the distance between the two nodes of the pair is exactly the same, namely one. However, every time the walker departs from one of the nodes in the pair (i), the probability that she returns to the other node in this pair is very high, due to the high interconnection of the clique. In the case of the pair in the middle of the path the situation is quite different. Here, every time the walker departs from one of the two nodes, the probability of returning to the other node is much diminished by the fact that she can get trapped in either of the two cliques and hardly ever return to the centre of the path. A similar situation can be described for the pair (iii). Thus, we have three different probabilities for exactly the same distance between nodes. Using the 'memory' analogy introduced above, the situation in the graph can be seen as if the random walker remembers the place from which she originally departed, something entirely foreign to the continuous space. In Fig. 9, we illustrate the distribution of probabilities as a function of the distance between the nodes between which the RMH jumps for three different graphs.
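The loss of a one-to-one relation between distance and probability can already be seen at the level of single jumps. The sketch below is a simplified illustration, not the computation behind Fig. 9: it assumes the one-step Mellin jump probability $$P_{ij}=d_{ij}^{-s}/\sum_{k\neq i}d_{ik}^{-s}$$ with $$s=1$$ and uses a small cycle and a small barbell graph (two 5-cliques joined through 5 path nodes) instead of the graphs studied in the text. All function names are hypothetical.

```python
from collections import deque

import numpy as np


def bfs_distances(adj):
    """All-pairs shortest-path distances of a connected graph, by BFS from each node."""
    n = len(adj)
    dist = np.zeros((n, n))
    for src in range(n):
        seen = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        for v, d in seen.items():
            dist[src, v] = d
    return dist


def mellin_jump_matrix(dist, s):
    """Assumed one-step probabilities: P[i, j] proportional to d(i, j)^(-s), zero for i == j."""
    n = dist.shape[0]
    off_diag = ~np.eye(n, dtype=bool)
    weights = np.zeros_like(dist)
    weights[off_diag] = dist[off_diag] ** (-s)
    return weights / weights.sum(axis=1, keepdims=True)


def cycle(n):
    """Cycle graph on n nodes."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def barbell(m, p):
    """Two m-cliques joined through a path of p intermediate nodes."""
    n = 2 * m + p
    adj = {i: set() for i in range(n)}
    for block in (range(0, m), range(m + p, n)):
        for u in block:
            for v in block:
                if u != v:
                    adj[u].add(v)
    for u in range(m - 1, m + p):  # chain from the last node of clique A to the first of clique B
        adj[u].add(u + 1)
        adj[u + 1].add(u)
    return adj


def spread_at_distance(P, dist, d):
    """Smallest and largest jump probability among ordered pairs at distance d."""
    values = P[dist == d]
    return values.min(), values.max()


d_cycle = bfs_distances(cycle(12))
d_bar = bfs_distances(barbell(5, 5))
print(spread_at_distance(mellin_jump_matrix(d_cycle, 1.0), d_cycle, 1))
print(spread_at_distance(mellin_jump_matrix(d_bar, 1.0), d_bar, 1))
```

On the cycle every node has the same normalization, so the smallest and largest probabilities at a given distance coincide. On the barbell the normalization depends on where the walker starts, so pairs at the same distance (for example a pair inside a clique, a pair on the path, and the clique node attached to the path with its path neighbour) receive different probabilities.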
The first of the three graphs in Fig. 9 is a cycle graph with 1000 nodes, the second is a barbell graph with 999 nodes and the third is the real-world protein–protein interaction network of yeast. As can be seen, only in the case of the cycle is there a perfect power-law dependence between the probability and the distance, as expected for 'normal' random walks with Lévy flights. In this case we obtain, as expected, $$p\left(i,j\right)\sim d\left(i,j\right)^{-1}$$ with squared Pearson correlation coefficient $$r^{2}=1$$. However, for the barbell graph we get $$p\left(i,j\right)\sim d\left(i,j\right)^{-0.856}$$ with $$r^{2}=0.678$$. Here, the exponent of the power law is not equal to one, and the correlation coefficient is not high enough to accept such a correlation. The most important feature, however, can be seen directly in the plot itself: for each value of the distance there are many different probabilities, and these probabilities can vary by more than one order of magnitude. In the case of the real-world network the situation is even worse. Here we obtain $$p\left(i,j\right)\sim d\left(i,j\right)^{-0.673}$$ with $$r^{2}=0.659$$. The exponent of the power law is far from the expected value of -1 and the correlation coefficient is very poor. The causes of these deviations are the ones we have analysed in this article: heterogeneity in the degree distributions and the presence of regions of different density (communities), among other potential structural heterogeneities appearing in real-world graphs. In closing, we remark that the main condition observed by random walks with Lévy flights in continuous space is no longer fulfilled by their analogue on graphs. That is the main reason we have preferred to call this model the multi-hopper model instead of the more traditional name of random walks with Lévy flights.

Fig. 9. (top) Log–log plot of the jumping probabilities in the multi-hopper model using a Mellin transform with exponent $$s=1$$ versus the shortest-path distance for a cycle $$C_{1000}$$, a barbell $$B\left(999,333,333\right)$$ and the PPI network of yeast.

9. Conclusions

We develop here a mathematical and computational framework for using random walks with long-range jumps on graphs of any topology. This multi-hopper model allows a random walker positioned at a given node of a simple, connected graph to jump to any other node of the graph with a probability that decays as a function of the shortest-path distance between her original and final positions. In this work, the decaying probabilities for long-range jumps are selected as Laplace or Mellin transforms of the shortest-path distances. We prove that when the parameters of these transforms approach zero asymptotically, the hitting time of the multi-hopper approaches the minimum possible value for a normal random walker. Thus, the multi-hopper represents a super-fast random walker hopping among the nodes of a graph. We show by computational experiments that the multi-hopper overcomes several of the difficulties that a normal random walker has in exploring a graph. For instance, the multi-hopper explores a graph having clusters of highly interconnected nodes that are poorly connected to other clusters (communities) more efficiently than the normal random walker. It also diffuses faster than the normal random walker in graphs with very skewed degree distributions, such as scale-free networks.
In these graphs, the normal random walker visits hubs more frequently than the nodes with low degree, thus getting stuck around the high-degree nodes of the graph. Finally, we illustrate how the multi-hopper can be useful in transport and search problems in real-world networks where structural heterogeneity, such as the presence of communities and skewed degree distributions, is more a rule than an exception. We hope that the use of the RMH will open new avenues in the exploration of lattices, graphs and real-world networks.

Supplementary data

Supplementary data are available at COMNET online.

Acknowledgments

E.E. thanks the Royal Society for a Wolfson Research Merit Award.

Funding

European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement [702410 to M.T.S.]; the Concerted Research Action (ARC) programme supported by the Federation Wallonia-Brussels (contract ARC 14/19-060 on Mining and Optimization of Big Data Models) [to J.C.D.]; DFG, project ME 1535/6-1 [to R.M.].

References

1. Estrada E. (2012) The Structure of Complex Networks: Theory and Applications. Oxford: Oxford University Press.
2. Klafter J. & Sokolov I. M. (2011) First Steps in Random Walks: From Tools to Applications. Oxford: Oxford University Press.
3. Noh J. D. & Rieger H. (2004) Random walks on complex networks. Phys. Rev. Lett., 92, 118701.
4. Pólya G. Arithmetische Eigenschaften der Reihenentwicklungen rationaler Funktionen. J. Reine Angew. Math., 151, 1–31.
5. Senft D. C. & Ehrlich G. (1995) Long jumps in surface diffusion: one-dimensional migration of isolated adatoms. Phys. Rev. Lett., 74, 294.
6. Linderoth T. R., Horch S., Lægsgaard E., Stensgaard I. & Besenbacher F. (1997) Surface diffusion of Pt on Pt(110): Arrhenius behavior of long jumps. Phys. Rev. Lett., 78, 4978.
7. Schunack M., Linderoth T. R., Rosei F., Lægsgaard E., Stensgaard I. & Besenbacher F. (2002) Long jumps in the surface diffusion of large molecules. Phys. Rev. Lett., 88, 156102.
8. Ala-Nissila T., Ferrando R. & Ying S. (2002) Collective and single particle diffusion on surfaces. Adv. Phys., 51, 949–1078.
9. Yu C., Guan J., Chen K., Bae S. C. & Granick S. (2013) Single-molecule observation of long jumps in polymer adsorption. ACS Nano, 7, 9735–9742.
10. Wrigley J. D., Twigg M. E. & Ehrlich G. (1990) Lattice walks by long jumps. J. Chem. Phys., 93, 2885–2902.
11. Riascos A. & Mateos J. L. (2012) Long-range navigation on complex networks using Lévy random walks. Phys. Rev. E, 86, 056110.
12. Estrada E. (2012) Path Laplacian matrices: introduction and application to the analysis of consensus in networks. Linear Algebra Appl., 436, 3373–3391.
13. Aldous D. & Fill J. (2002) Reversible Markov Chains and Random Walks on Graphs.
14. Lovász L. (1993) Random walks on graphs. Combinatorics, Paul Erdős Is Eighty, 2, 1–46.
15. Nash-Williams C. S. J. (1959) Random walk and electric currents in networks. Mathematical Proceedings of the Cambridge Philosophical Society, vol. 55. Cambridge University Press, 181–194.
16. Chandra A. K., Raghavan P., Ruzzo W. L., Smolensky R. & Tiwari P. (1996) The electrical resistance of a graph captures its commute and cover times. Comput. Complexity, 6, 312–340.
17. Ghosh A., Boyd S. & Saberi A. (2008) Minimizing effective resistance of a graph. SIAM Rev., 50, 37–66.
18. Boley D., Ranjan G. & Zhang Z.-L. (2011) Commute times for a directed graph using an asymmetric Laplacian. Linear Algebra Appl., 435, 224–242.
19. Grinstead C. M. & Snell J. L. (2012) Introduction to Probability. American Mathematical Society.
20. Palacios J. L. (2001) Resistance distance in graphs and random walks. Int. J. Quantum Chem., 81, 29–33.
21. Brightwell G. & Winkler P. (1990) Maximum hitting time for random walks on graphs. Random Structures Algorithms, 1, 263–276.
22. Jonasson J. (2000) Lollipop graphs are extremal for commute times. Random Structures Algorithms, 16, 131–142.
23. Barabási A.-L. & Albert R. (1999) Emergence of scaling in random networks. Science, 286, 509–512.
24. Erdős P. & Rényi A. (1959) On random graphs, I. Publ. Math. Debrecen, 6, 290–297.
25. Hoory S., Linial N. & Wigderson A. (2006) Expander graphs and their applications. Bull. Amer. Math. Soc., 43, 439–561.
26. Mihail M., Papadimitriou C. & Saberi A. (2003) On certain connectivity properties of the Internet topology. Foundations of Computer Science, 2003. Proceedings 44th Annual IEEE Symposium. IEEE, 28–35.
27. Lancichinetti A., Fortunato S. & Radicchi F. (2008) Benchmark graphs for testing community detection algorithms. Phys. Rev. E, 78, 046110.
28. Adamic L. A., Lukose R. M., Puniyani A. R. & Huberman B. A. (2001) Search in power-law networks. Phys. Rev. E, 64, 046135.
29. Guimerà R., Díaz-Guilera A., Vega-Redondo F., Cabrales A. & Arenas A. (2002) Optimal network topologies for local search with congestion. Phys. Rev. Lett., 89, 248701.
30. Shlesinger M. F., Zaslavsky G. M. & Frisch U. (1995) Lévy flights and related topics in physics. Lévy Flights and Related Topics in Physics, 450.
31. Hughes B. D. (1996) Random Walks and Random Environments. Random Walks, 1.
32. Metzler R. & Klafter J. (2000) The random walk’s guide to anomalous diffusion: a fractional dynamics approach. Phys. Rep., 339, 1–77.
33. Metzler R. & Klafter J. (2004) The restaurant at the end of the random walk: recent developments in the description of anomalous transport by fractional dynamics. J. Phys. A, 37, R161.
34. Bouchaud J.-P. & Georges A. (1990) Anomalous diffusion in disordered media: statistical mechanisms, models and physical applications. Phys. Rep., 195, 127–293.

© The authors 2017. Published by Oxford University Press. All rights reserved. This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices)

### Journal

Journal of Complex Networks, Oxford University Press

Published: Oct 3, 2017
