Non-Backtracking Centrality Based Random Walk on Networks

Abstract

Random walks are a fundamental tool for analyzing realistic complex networked systems and for implementing randomized algorithms that solve diverse problems such as searching and sampling. In many real applications, their effectiveness and convenience depend on the properties of the random walk (e.g. stationary distribution and hitting time), with biased random walks often outperforming traditional unbiased random walks (TURW). In this paper, we present a new class of biased random walks on a network, non-backtracking centrality based random walks (NBCRW), where the walker prefers to jump to neighbors with high non-backtracking centrality, a measure that has some advantages over eigenvector centrality. We study some properties of the non-backtracking matrix of a network, on the basis of which we propose a theoretical framework for fast computation of the transition probabilities, stationary distribution and hitting times for NBCRW on the network. Within this framework, we study NBCRW on some model and real networks and compare the results with those for TURW and maximal entropy random walks (MERW), the latter being biased random walks based on eigenvector centrality. We show that the behaviors of the stationary distribution and hitting times for NBCRW differ widely from those associated with TURW and MERW, especially for heterogeneous networks.

1. INTRODUCTION

As a fundamental and powerful tool, random walks have found a wide range of applications in computer science and engineering. For example, in the area of communication and information networks, random walks can not only model and describe information delivery [1] and data gathering [2, 3], but also quantify and predict the throughput [4, 5], latency performance [1], transition [6] and search costs [7, 8].
Other related applications of random walks in computer science include community detection [9], recommendation systems [10], computer vision [11], image segmentation [12] and network sampling [13, 14], to name a few. The statistical properties of random walks play an important role in their applications, since they not only characterize the behavior of the random walks themselves, but also capture the performance metrics of different applications. For example, the stationary probability of a node can measure its importance [15] in a network, as well as the visual saliency at a location [16], while the hitting time can serve as a gauge of search performance [7]. Thus, the properties of random walks strongly affect, and to a large extent determine, the outcome of their applications.

Among various random walks, the traditional unbiased random walk (TURW) is probably the simplest one, where the transition probability from the current location to any neighbor at the next time step is uniform. Nevertheless, a vast majority of real-life networks are heterogeneous [17], implying that the importance or roles of different nodes are also distinct. Thus, random walks in realistic heterogeneous networks should be biased [18, 19], with the transition probability to an important neighbor higher than that to an ordinary neighbor. Many works show that, in comparison with TURW, biased random walks are superior in some concrete applications, e.g. network search [18, 20] and sampling [21].

A typical biased random walk is the maximal entropy random walk (MERW) [22], which has received considerable attention [23–26]. The entropy of a random walk quantifies the randomness of its trajectories and can measure the mobility of the random walker [27]. MERW displays some remarkable properties different from those of TURW, e.g. small relaxation time [28] and localization of the stationary distribution [23].
In the past years, MERW has been applied in several areas, such as link prediction [29], visual saliency [16] and digital image forensics [30], and produced more desirable results. MERW is in fact a biased random walk whose transitions are biased towards neighboring nodes with high eigenvector centrality [31], i.e. with large entries in the principal eigenvector of the adjacency matrix. However, a recent study [32] pointed out that the standard eigenvector centrality undergoes a localization transition in heterogeneous networks, which leads to most of the weight concentrating around the hub node and its vicinity. Thus, as a common measure of node importance, the standard eigenvector centrality fails to discriminate between nodes with small weight. As a remedy, an alternative centrality measure, non-backtracking centrality, was proposed [32], which retains the advantages of the standard centrality but avoids its deficiency. This new centrality measure is based on the non-backtracking matrix [33, 34], which has been successfully applied to many problems, such as community detection [34], percolation [35, 36], epidemic spreading [37] and identifying influential nodes [38].

Since the node properties that determine the walker's preference for jumping towards different nodes play a central role in the behavior of biased random walks, an interesting question arises naturally: How does a random walk behave if non-backtracking centrality is incorporated into its transition probabilities?

In this paper, we design a new biased random walk, Non-Backtracking Centrality based Random Walk (NBCRW), with transition probabilities dependent on the non-backtracking centrality. We present a framework for quickly computing the transition probabilities, stationary distribution and hitting times of NBCRW, and provide analytical expressions for the stationary distribution and hitting times. Within this framework, we study NBCRW on some synthetic and real networks, and compare the results with those for TURW and MERW.
We show that the behaviors of NBCRW differ greatly from those of TURW and MERW, in particular for heterogeneous networks.

The main contributions of this paper are summarized as follows:

- We propose a novel type of biased random walks, non-backtracking centrality based random walks (NBCRW), in which the transition probability is proportional to the non-backtracking centrality.
- We develop a theoretical framework for efficiently computing the transition probabilities of NBCRW as well as its properties, including the stationary distribution and hitting times. We derive an analytical expression for the stationary distribution of NBCRW in terms of the leading eigenvalue of the non-backtracking matrix and the non-backtracking centrality. We also determine hitting times for NBCRW, including the hitting time from an arbitrary node to another one, the partial mean hitting time to a given target, and the global mean hitting time to a uniformly selected node.
- Within the established general framework, we study NBCRW analytically or numerically in model and realistic networks, and compare the results with those for TURW and MERW. We show that the stationary distribution and hitting times behave differently from those of TURW and MERW.

The remainder of this paper is organized as follows. Section 2 presents a brief introduction to networks and an overview of TURW and MERW on networks. Section 3 is devoted to the formulation of NBCRW. Section 4 gives the experimental results and a comparison between NBCRW, TURW and MERW in model and real-life networks. Section 5 reports the exact analytical results for stationary distributions and hitting times of NBCRW, TURW and MERW in a class of rose graphs. Section 6 concludes the paper.

2. PRELIMINARIES

In this section, we introduce some useful concepts for graphs and discrete-time random walks on graphs.

2.1.
Concepts for graphs and random walks

Let $G(V,E)$ be a finite connected undirected network (graph) of $N$ nodes and $E$ edges, with node set $V=\{1,2,\ldots,N\}$ and edge set $E=\{(i,j)\mid i,j\in V\}$. The connectivity of nodes is defined by the adjacency matrix $A=(a_{ij})_{N\times N}$, in which the element $a_{ij}=1$ if $(i,j)\in E$, and $a_{ij}=0$ otherwise. Let $\mathcal{N}_i$ denote the set of neighbors of node $i$. The degree of node $i$ is $d_i=|\mathcal{N}_i|=\sum_{j=1}^{N}a_{ij}$, which is the $i$th diagonal entry of the diagonal degree matrix $D=\mathrm{diag}(d_1,d_2,\ldots,d_N)$. The Laplacian matrix of $G$ is defined to be $L=D-A$.

For a graph $G$, we can define a discrete-time nearest-neighbor random walk taking place on it. Any random walk on a network $G$ is in fact a Markov chain characterized by a stochastic matrix $P=(p_{ij})_{N\times N}$, also called the transition probability matrix, with entry $p_{ij}$ describing the transition probability from node $i$ to a neighboring node $j$.

Definition 2.1. For an irreducible random walk on graph $G$, the stationary distribution $\pi=(\pi_1,\pi_2,\ldots,\pi_N)$ is an $N$-dimensional vector satisfying $\pi P=\pi$ and $\sum_{i=1}^{N}\pi_i=1$.

The stationary probabilities can be employed to rank nodes in a network [15]. Another fundamental quantity relevant to random walks is the hitting time [39].

Definition 2.2. For a random walk on graph $G$, the hitting time from node $i$ to node $j$ ($j\neq i$), denoted by $T_{ij}$, is the expected number of jumps required for the walker starting from the source node $i$ to arrive at the target node $j$ for the first time.

The hitting time is a significant indicator of the transition or search cost in a network [7]. Based on the hitting time, we can further define some other quantities for random walks, such as the partial mean hitting time and the global mean hitting time.

Definition 2.3. For a random walk on graph $G$, the partial mean hitting time to node $j$, denoted by $T_j$, is the average of the hitting times $T_{ij}$ over all source nodes in the network:

$$T_j=\frac{1}{N-1}\sum_{i=1}^{N}T_{ij}. \tag{1}$$

The partial mean hitting time $T_j$ is actually the mean absorbing time of an absorbing Markov chain with $j$ being the absorbing state, reflecting the absorbing efficiency of node $j$ [40, 41]. It was recently utilized to measure the importance of node $j$, and is thus called Markov centrality [42].

Definition 2.4. For a random walk on graph $G$, the global mean hitting time, denoted by $\langle T\rangle$, is the average of the hitting times $T_{ij}$ over all $N(N-1)$ pairs of nodes, equivalent to the mean hitting time to a uniformly distributed node, which is given by

$$\langle T\rangle=\frac{1}{N(N-1)}\sum_{i=1}^{N}\sum_{j\neq i}T_{ij}=\frac{1}{N}\sum_{j=1}^{N}T_j. \tag{2}$$

The global mean hitting time can be applied to gauge the search efficiency of a network [43].

Given a network, we can define different random walks. Below we only introduce two much studied random walks: the traditional unbiased random walk (TURW) and the maximal entropy random walk (MERW).

2.2. Traditional unbiased random walk

For TURW on graph $G$, the transition probability from a node $i$ to each of its neighboring nodes $j$ is identical, namely

$$p_{ij}=\frac{a_{ij}}{d_i}. \tag{3}$$

Thus, the transition probability matrix is $P=D^{-1}A$, and the stationary distribution is [44, 45]

$$\pi^{\mathrm{T}}=(\pi_1^{\mathrm{T}},\pi_2^{\mathrm{T}},\ldots,\pi_N^{\mathrm{T}})=\left(\frac{d_1}{2E},\frac{d_2}{2E},\ldots,\frac{d_N}{2E}\right), \tag{4}$$

which implies that all nodes with the same degree have identical occupation probability in the stationary state.

The hitting time for TURW on $G$ can be expressed in terms of the spectrum of its Laplacian matrix $L$. Let $0=\sigma_1<\sigma_2\le\cdots\le\sigma_N$ be the $N$ eigenvalues of $L$, and let $\mu_1,\mu_2,\ldots,\mu_N$ be their corresponding normalized mutually orthogonal eigenvectors, where $\mu_i=(\mu_{i1},\mu_{i2},\ldots,\mu_{iN})^\top$ for each $i=1,2,\ldots,N$. Then, the hitting time $T_{ij}$, partial mean hitting time and global mean hitting time can be represented by [40]

$$T_{ij}=\sum_{z=1}^{N}d_z\sum_{k=2}^{N}\frac{1}{\sigma_k}\left(\mu_{ki}\mu_{kz}-\mu_{ki}\mu_{kj}-\mu_{kj}\mu_{kz}+\mu_{kj}^2\right), \tag{5}$$

$$T_j=\frac{N}{N-1}\sum_{k=2}^{N}\frac{1}{\sigma_k}\left(2E\,\mu_{kj}^2-\mu_{kj}\sum_{z=1}^{N}d_z\mu_{kz}\right) \tag{6}$$

and

$$\langle T\rangle=\frac{2E}{N-1}\sum_{k=2}^{N}\frac{1}{\sigma_k}, \tag{7}$$

respectively.

2.3. Maximal entropy random walk

Different from TURW, MERW on graph $G$ is a biased random walk, whose transition probability is defined based on the leading eigenvalue and eigenvector of the adjacency matrix $A$. Let $\lambda_1>\lambda_2\ge\cdots\ge\lambda_N$ be the $N$ real eigenvalues of $A$, and $\psi_1,\psi_2,\ldots,\psi_N$ their corresponding mutually orthogonal unit eigenvectors, where $\psi_i=(\psi_{i1},\psi_{i2},\ldots,\psi_{iN})^\top$ for each $i=1,2,\ldots,N$. Then, the transition probability $p_{ij}$ from node $i$ to node $j$ in MERW is defined by [22, 23]

$$p_{ij}=\frac{a_{ij}}{\lambda_1}\frac{\psi_{1j}}{\psi_{1i}}. \tag{8}$$

Note that the principal eigenvector $\psi_1$ is in fact the frequently used eigenvector centrality measure [31], with the entry $\psi_{1i}$ defining a centrality score for node $i$. In this sense, MERW can be considered as a biased random walk based on eigenvector centrality. Equation (8) guarantees that MERW maximizes the entropy of the set of trajectories with a given length and end-nodes, leading to the maximal entropy rate of such a process [23].

The stationary distribution of MERW is

$$\pi^{\mathrm{M}}=(\pi_1^{\mathrm{M}},\pi_2^{\mathrm{M}},\ldots,\pi_N^{\mathrm{M}})=(\psi_{11}^2,\psi_{12}^2,\ldots,\psi_{1N}^2). \tag{9}$$

In some networks, especially heterogeneous networks, the eigenvector centrality $\psi_1$ exhibits a localization phenomenon [32], with the weight of the centrality concentrating around one or a few high-degree nodes. Since the stationary probabilities in (9) are the squares of the centrality entries, the stationary distribution for MERW in these networks displays an even more evident localization: the abrupt focusing of occupation probabilities on just a few large-degree nodes and their neighbors.

Interestingly, for MERW on graph $G$, the hitting time $T_{ij}$, partial mean hitting time $T_j$ and global mean hitting time $\langle T\rangle$ can be expressed in terms of the eigenvalues and eigenvectors of the adjacency matrix $A$ [26]:

$$T_{ij}=\frac{1}{\psi_{1j}^2}\sum_{k=2}^{N}\frac{\lambda_1}{\lambda_1-\lambda_k}\left(\psi_{kj}^2-\psi_{ki}\psi_{kj}\frac{\psi_{1j}}{\psi_{1i}}\right), \tag{10}$$

$$T_j=\frac{1}{\psi_{1j}^2(N-1)}\sum_{k=2}^{N}\frac{\lambda_1}{\lambda_1-\lambda_k}\left(N\psi_{kj}^2-\psi_{kj}\psi_{1j}\sum_{i=1}^{N}\frac{\psi_{ki}}{\psi_{1i}}\right), \tag{11}$$

$$\langle T\rangle=\frac{1}{N(N-1)}\sum_{j=1}^{N}\frac{1}{\psi_{1j}^2}\sum_{k=2}^{N}\frac{\lambda_1}{\lambda_1-\lambda_k}\left(N\psi_{kj}^2-\psi_{kj}\psi_{1j}\sum_{i=1}^{N}\frac{\psi_{ki}}{\psi_{1i}}\right). \tag{12}$$

3.
FORMULATION OF NON-BACKTRACKING CENTRALITY BASED RANDOM WALK

For a biased random walk on a graph $G$, its behavior depends on the node quantity on which the transition probability is based. As shown in a recent paper [32], the eigenvector centrality has some flaws, e.g. the localization transition, which results in obvious heterogeneity in the stationary distribution of MERW. Since non-backtracking centrality can avoid this deficiency of eigenvector centrality [32], in this section we propose, as a remedy for MERW, a new biased random walk based on non-backtracking centrality. To begin with, we introduce the non-backtracking centrality and study some of its properties.

3.1. Non-backtracking centrality

The non-backtracking centrality [32] is defined and calculated through the Hashimoto or non-backtracking matrix [33, 34], denoted by $B$, which is a $2E\times 2E$ matrix. Any undirected network $G$ can be transformed into a directed graph by replacing each undirected edge $(i,j)$ with two directed ones, $i\to j$ and $j\to i$. The $2E\times 2E$ non-backtracking matrix $B$ of $G$ describes the relation between the $2E$ different directed edges; its element $B_{i\to j,k\to l}$ is defined as follows:

$$B_{i\to j,\,k\to l}=\begin{cases}1, & j=k \text{ and } i\neq l,\\ 0, & \text{otherwise}.\end{cases} \tag{13}$$

Since all entries of the non-backtracking matrix $B$ are non-negative real numbers, by the Perron–Frobenius theorem [46], its leading eigenvalue is real and non-negative, and there exists a corresponding leading eigenvector whose elements are also non-negative real numbers. Let $\kappa$ be the leading eigenvalue of $B$, and let $v_{i\to j}$ be the element of the leading eigenvector corresponding to the directed edge $i\to j$. Then, $v_{i\to j}$ represents the centrality of node $j$ neglecting any contribution from node $i$. According to the leading eigenvector of $B$, one can define two centrality measures for each node [34], outgoing centrality and incoming centrality, by considering the outgoing and incoming edges of the node.
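The construction in (13) can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the helper names `non_backtracking_matrix` and `leading_eigenpair`, the dense eigen-solver, and the small example graph (a 4-node "diamond", i.e. a complete graph on four nodes with one edge removed) are assumptions chosen only for illustration.

```python
import numpy as np

def non_backtracking_matrix(edges):
    """Build the 2E x 2E matrix B of Eq. (13) from an undirected edge list.

    Hypothetical helper: rows and columns are indexed by directed edges.
    """
    directed = list(edges) + [(j, i) for (i, j) in edges]
    index = {e: n for n, e in enumerate(directed)}
    B = np.zeros((len(directed), len(directed)))
    for (i, j), row in index.items():
        for (k, l), col in index.items():
            # i->j can be followed by k->l only if k == j and l != i.
            if j == k and i != l:
                B[row, col] = 1.0
    return B, directed

def leading_eigenpair(B):
    """Return (kappa, v): the eigenvalue of largest real part and its eigenvector."""
    vals, vecs = np.linalg.eig(B)
    top = np.argmax(vals.real)
    v = vecs[:, top].real
    return vals[top].real, v / v.sum()  # normalise so that the entries sum to 1

# Assumed example graph: complete graph on 4 nodes minus the edge (0, 3).
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
B, directed = non_backtracking_matrix(edges)
kappa, v = leading_eigenpair(B)
```

The double loop costs on the order of $(2E)^2$ operations, which is acceptable only for small graphs; it is used here because it mirrors the definition directly.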
Definition 3.1. For a node $i$ in network $G$, its outgoing centrality is

$$x_i=\sum_{j\in\mathcal{N}_i}v_{i\to j}, \tag{14}$$

and its incoming centrality is

$$y_i=\sum_{j\in\mathcal{N}_i}v_{j\to i}. \tag{15}$$

Note that the outgoing centrality $x_i$ is actually the non-backtracking centrality [32].

Lemma 3.1. For a node $i$ in network $G$, its outgoing and incoming centralities obey

$$\kappa y_i=(d_i-1)x_i. \tag{16}$$

Proof. By the definition of eigenvalues and eigenvectors for matrix $B$, we can establish the equation

$$\kappa v_{i\to j}=\sum_{k\in\mathcal{N}_j,\,k\neq i}v_{j\to k}. \tag{17}$$

Using (14) and (17), we rephrase (15) as

$$y_i=\frac{1}{\kappa}\sum_{j\in\mathcal{N}_i}\sum_{k\in\mathcal{N}_i,\,k\neq j}v_{i\to k}=\frac{1}{\kappa}\left(\sum_{j\in\mathcal{N}_i}\sum_{k\in\mathcal{N}_i}v_{i\to k}-\sum_{j\in\mathcal{N}_i}v_{i\to j}\right)=\frac{1}{\kappa}\left(\sum_{j\in\mathcal{N}_i}x_i-x_i\right)=\frac{1}{\kappa}(d_ix_i-x_i), \tag{18}$$

which is equivalent to (16). □

If graph $G$ is a tree, the leading eigenvalue $\kappa$ of its non-backtracking matrix $B$ is zero. However, when $G$ is not a tree, the leading eigenvalue $\kappa$ of $B$ is positive, and the components of the leading eigenvector can be taken to be all non-negative. In what follows, we consider the case in which $G$ is not a tree.

For a network $G$, computing its non-backtracking centrality involves computing the leading eigenvector of its non-backtracking matrix $B$ of order $2E\times 2E$. If we compute the leading eigenvector directly from the definition, the time and space costs are very high. Fortunately, in practice, we can substantially reduce the cost by a faster computation of $\kappa$ and the non-backtracking centrality $x_i$, utilizing the Ihara determinant [33, 47, 48].

Lemma 3.2. For a network $G$, the leading eigenvalue $\kappa$ of its non-backtracking matrix $B$ is equal to the leading eigenvalue of the $2N\times 2N$ matrix

$$M=\begin{pmatrix}A & I-D\\ I & 0\end{pmatrix}, \tag{19}$$

where $I$ is the $N\times N$ identity matrix. In addition, $x_1,x_2,\ldots,x_N$ correspond to the first $N$ elements of the leading eigenvector of matrix $M$.

Proof. Combining (17) and (14), and using (16) in the last step, the non-backtracking centrality $x_i$ can be rewritten as

$$x_i=\sum_{j\in\mathcal{N}_i}\frac{1}{\kappa}\sum_{k\in\mathcal{N}_j,\,k\neq i}v_{j\to k}=\frac{1}{\kappa}\left(\sum_{j\in\mathcal{N}_i}\sum_{k\in\mathcal{N}_j}v_{j\to k}-\sum_{j\in\mathcal{N}_i}v_{j\to i}\right)=\frac{1}{\kappa}\left(\sum_{j\in\mathcal{N}_i}x_j-y_i\right)=\frac{1}{\kappa}\left(\sum_{j=1}^{N}a_{ij}x_j-\frac{d_i-1}{\kappa}x_i\right). \tag{20}$$

Recasting (20) in matrix notation, one obtains

$$\left(A-\frac{1}{\kappa}D+\frac{1}{\kappa}I\right)x=\kappa x, \tag{21}$$

where $x=(x_1,x_2,\ldots,x_N)^\top$ is a vector composed of the non-backtracking centralities of the $N$ nodes in $G$.

Equation (21) shows that matrices $B$ and $M$ have the same set of real eigenvalues. Furthermore,

$$Mz=\kappa z. \tag{22}$$

Here $z=\left(x\mid\frac{1}{\kappa}x\right)$, in which $x$ represents the first $N$ elements of $z$ and $\frac{1}{\kappa}x$ constitutes the last $N$ elements. Indeed, the first block of $Mz$ is $Ax+\frac{1}{\kappa}(I-D)x=\kappa x$ by (21), and the second block is $x=\kappa\cdot\frac{1}{\kappa}x$. Thus, the leading eigenvalues of matrices $B$ and $M$ are equal to each other, and the first $N$ elements of $z$ correspond to the non-backtracking centralities $x_1,x_2,\ldots,x_N$. □

Lemma 3.2 indicates that computing the leading eigenvalue $\kappa$ of the non-backtracking matrix $B$ of order $2E$ and the non-backtracking centralities $x$ can be reduced to calculating the leading eigenvalue and eigenvector of the matrix $M$ of order $2N$, which is smaller than the order $2E$ of matrix $B$, especially for dense networks. Thus, we can compute $\kappa$ and $x_i$ very rapidly by evaluating the leading eigenvalue and eigenvector of matrix $M$.

3.2. Definition of transition matrix

Depending on which node property the walk is biased towards, we can define different biased random walks. Here we propose a novel random walk, the non-backtracking centrality based random walk (NBCRW), which is biased towards nodes with high non-backtracking centrality.

Definition 3.2. For NBCRW on network $G$, the element at row $i$ and column $j$ of the transition matrix $P$ is

$$p_{ij}=\frac{a_{ij}x_j}{\sum_{k=1}^{N}a_{ik}x_k}. \tag{23}$$

In other words, the transition probability for NBCRW from node $i$ to its neighbor $j$ is proportional to the non-backtracking centrality of $j$.

In order to investigate the properties of NBCRW on network $G$, we propose an approach to construct a weighted network $\mathcal{W}$ from the original network $G$. The weight of each edge in $\mathcal{W}$ is related to the non-backtracking centralities of both end nodes of the edge in $G$. We will show that NBCRW on network $G$ is equivalent to the ordinary random walk [49] on the corresponding weighted network $\mathcal{W}$, with both random walks having the same properties, such as transition probability, stationary distribution, and hitting times.
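Lemma 3.2 and Definition 3.2 can be sketched numerically as follows. This is an illustrative sketch only: the function names `nb_centrality` and `nbcrw_transition_matrix` are hypothetical, a dense eigen-solver is assumed (a sparse or power-iteration solver would be preferable for large networks), and the 4-node example graph is an assumption.

```python
import numpy as np

def nb_centrality(A):
    """Leading eigenvalue kappa and non-backtracking centralities x via the
    2N x 2N matrix M of Eq. (19). Hypothetical helper using a dense solver."""
    N = A.shape[0]
    D = np.diag(A.sum(axis=1))
    I = np.eye(N)
    M = np.block([[A, I - D], [I, np.zeros((N, N))]])
    vals, vecs = np.linalg.eig(M)
    top = np.argmax(vals.real)      # kappa is the eigenvalue of largest real part
    x = vecs[:N, top].real          # first N entries of the leading eigenvector
    return vals[top].real, x / x.sum()

def nbcrw_transition_matrix(A, x):
    """Transition matrix of Eq. (23): p_ij = a_ij x_j / sum_k a_ik x_k."""
    weighted = A * x[np.newaxis, :]  # entry (i, j) equals a_ij * x_j
    return weighted / weighted.sum(axis=1, keepdims=True)

# Assumed example: complete graph on 4 nodes minus the edge (0, 3).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
kappa, x = nb_centrality(A)
P = nbcrw_transition_matrix(A, x)
```

In this example the two degree-3 nodes receive a larger centrality than the two degree-2 nodes, so a walker at node 0 moves to node 1 or node 2 with equal probability, while a walker at node 1 prefers node 2 over nodes 0 and 3.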
Definition 3.3. For an unweighted network $G(V,E)$ with adjacency matrix $A$, let $X$ be the diagonal matrix whose $i$th diagonal entry equals the non-backtracking centrality $x_i$ of node $i$. The corresponding weighted network is defined as $\mathcal{W}(V,E)$, with the weight between nodes $i$ and $j$ given by $w_{ij}=a_{ij}x_ix_j$.

Let $W=(w_{ij})_{N\times N}$ stand for the adjacency matrix of the weighted network $\mathcal{W}$. Different from the adjacency matrix $A$ of the binary network $G$, the elements of $W$ are not simply 0 or 1, but are the weights of all pairs of nodes. By definition, we have $W=XAX$. In the weighted network $\mathcal{W}$, the strength [50] of a node $i$ is $s_i=\sum_{k=1}^{N}w_{ik}=x_i\sum_{k=1}^{N}a_{ik}x_k$, and the total strength of the whole network is $s=\sum_{i=1}^{N}\sum_{j=1}^{N}w_{ij}$. Then, the diagonal strength matrix of $\mathcal{W}$ is defined as $S=\mathrm{diag}(s_1,s_2,\ldots,s_N)$, and the Laplacian matrix of $\mathcal{W}$ is defined by $\mathcal{L}=S-W$.

Theorem 3.1. The transition matrix of NBCRW on an arbitrary connected network $G$ is identical to the transition matrix of the ordinary random walk on the corresponding weighted network $\mathcal{W}$.

Proof. For the ordinary random walk on the weighted network $\mathcal{W}$, the transition probability from node $i$ to node $j$ is

$$p_{ij}=\frac{w_{ij}}{s_i}=\frac{a_{ij}x_ix_j}{\sum_{k=1}^{N}a_{ik}x_ix_k}=\frac{a_{ij}x_j}{\sum_{k=1}^{N}a_{ik}x_k},$$

which completely agrees with (23). Therefore, the transition matrix for NBCRW on $G$ is the same as that of the ordinary random walk on $\mathcal{W}$. □

Since networks $G$ and $\mathcal{W}$ have the same topological structure and transition matrix, NBCRW on $G$ and the ordinary random walk on $\mathcal{W}$ also have identical behaviors. In the sequel, we study the properties of NBCRW on $G$ directly or indirectly by considering those of the ordinary random walk on $\mathcal{W}$.

We note that our proposed NBCRW is different from the non-backtracking random walk [51, 52], a random process during which the walker never goes back along the edge it just traversed. For a general graph $G$, the non-backtracking random walk is not a Markov chain on its vertex set, although it can be regarded as a Markov chain on the set of its directed edges [53], whose adjacency relations are encoded in the non-backtracking matrix $B$. In contrast, NBCRW on the vertex set of $G$ is a biased Markov chain based on non-backtracking centrality. A main goal of this paper is to unveil the impact of biases, especially non-backtracking centrality, on the behaviors of biased random walks.

3.3. Stationary distribution

First, we address the stationary distribution for NBCRW on $G$.

Theorem 3.2. The stationary distribution for NBCRW on network $G$ is $\pi^{\mathrm{B}}=(\pi_1^{\mathrm{B}},\pi_2^{\mathrm{B}},\ldots,\pi_N^{\mathrm{B}})$, where

$$\pi_i^{\mathrm{B}}=\frac{1}{Q}\left(\frac{\kappa^2-1}{\kappa}+\frac{d_i}{\kappa}\right)x_i^2, \tag{24}$$

with

$$Q=\sum_{i=1}^{N}\left(\frac{\kappa^2-1}{\kappa}+\frac{d_i}{\kappa}\right)x_i^2 \tag{25}$$

being the normalization factor that guarantees $\sum_{i=1}^{N}\pi_i^{\mathrm{B}}=1$.

Proof. First, we prove that $\pi^{\mathrm{B}}$ fulfills the detailed balance condition $\pi_i^{\mathrm{B}}p_{ij}=\pi_j^{\mathrm{B}}p_{ji}$ for different $i$ and $j$. To this end, we need to compute the related quantity $\sum_{k=1}^{N}a_{ik}x_k$. From (21), we have

$$Ax=\frac{1}{\kappa}Dx+\frac{\kappa^2-1}{\kappa}x,$$

which means

$$\sum_{k=1}^{N}a_{ik}x_k=\left(\frac{d_i}{\kappa}+\frac{\kappa^2-1}{\kappa}\right)x_i.$$

Thus, we have

$$\pi_i^{\mathrm{B}}p_{ij}=\frac{1}{Q}\left(\frac{\kappa^2-1}{\kappa}+\frac{d_i}{\kappa}\right)x_i^2\cdot\frac{a_{ij}x_j}{\sum_{k=1}^{N}a_{ik}x_k}=\frac{a_{ij}x_ix_j}{Q}.$$

Similarly, we can get $\pi_j^{\mathrm{B}}p_{ji}=a_{ij}x_ix_j/Q$. Hence, the detailed balance condition

$$\pi_i^{\mathrm{B}}p_{ij}=\pi_j^{\mathrm{B}}p_{ji} \tag{26}$$

is satisfied for all pairs of $i$ and $j$. According to (26), we have

$$\sum_{i=1}^{N}\pi_i^{\mathrm{B}}p_{ij}=\sum_{i=1}^{N}\pi_j^{\mathrm{B}}p_{ji}=\pi_j^{\mathrm{B}}. \tag{27}$$

In other words, $\pi^{\mathrm{B}}P=\pi^{\mathrm{B}}$, showing that $\pi^{\mathrm{B}}$ is the stationary distribution for NBCRW on $G$. □

3.4. Hitting times

Let $\theta_1,\theta_2,\ldots,\theta_N$ be the $N$ eigenvalues of the Laplacian matrix $\mathcal{L}$ of the weighted network $\mathcal{W}$, arranged as $0=\theta_1<\theta_2\le\cdots\le\theta_N$, and let $\phi_1,\phi_2,\ldots,\phi_N$ be their corresponding mutually orthogonal eigenvectors of unit length, where $\phi_i=(\phi_{i1},\phi_{i2},\ldots,\phi_{iN})^\top$. Then, the hitting times for NBCRW on $G$ can be expressed in terms of the eigenvalues and eigenvectors of the Laplacian matrix of network $\mathcal{W}$.
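The objects introduced in Sections 3.2 to 3.4 can be checked numerically on a small example. The sketch below is an assumption-laden illustration, not the paper's code: it uses a dense solver, a hypothetical 4-node graph, and a hypothetical helper `hitting_times_to` that obtains hitting times directly from Definition 2.2 by solving the linear system $T_{ij}=1+\sum_{k\neq j}p_{ik}T_{kj}$ rather than from a spectral formula.

```python
import numpy as np

# Assumed example: complete graph on 4 nodes minus the edge (0, 3).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
N = A.shape[0]
d = A.sum(axis=1)

# kappa and x from the 2N x 2N matrix M of Eq. (19).
M = np.block([[A, np.eye(N) - np.diag(d)], [np.eye(N), np.zeros((N, N))]])
vals, vecs = np.linalg.eig(M)
top = np.argmax(vals.real)
kappa = vals[top].real
x = vecs[:N, top].real
x = x / x.sum()

P = A * x / (A @ x)[:, None]                      # Eq. (23)
pi = ((kappa**2 - 1) / kappa + d / kappa) * x**2  # Eq. (24) before normalisation
pi = pi / pi.sum()                                # division by Q of Eq. (25)

W = np.outer(x, x) * A                            # weighted network, W = XAX
s = W.sum(axis=1)                                 # node strengths s_i
theta, phi = np.linalg.eigh(np.diag(s) - W)       # spectrum of the Laplacian S - W

def hitting_times_to(P, j):
    """Hitting times T_ij to target j for all sources i (T_jj set to 0)."""
    n = P.shape[0]
    others = [i for i in range(n) if i != j]
    Q_sub = P[np.ix_(others, others)]             # walk restricted to non-target nodes
    T = np.zeros(n)
    T[others] = np.linalg.solve(np.eye(n - 1) - Q_sub, np.ones(n - 1))
    return T

T_to_1 = hitting_times_to(P, 1)                   # hitting times to node 1
```

Two consistency checks follow from standard Markov chain facts: the vector from (24) should coincide with $s_i/s$, the stationary distribution of the ordinary random walk on the weighted network, and the mean return time to node $j$, $1+\sum_k p_{jk}T_{kj}$, should equal $1/\pi_j^{\mathrm{B}}$.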
Theorem 3.3. For the non-backtracking centrality based random walk on network $G$, the hitting time from a node $i$ to another node $j$ is

$$T_{ij}=\sum_{z=1}^{N}s_z\sum_{k=2}^{N}\frac{1}{\theta_k}\left(\phi_{kj}^2-\phi_{ki}\phi_{kj}-\phi_{kj}\phi_{kz}+\phi_{ki}\phi_{kz}\right), \tag{28}$$

the partial mean hitting time to an arbitrary destination node $j$ is

$$T_j=\frac{N}{N-1}\sum_{k=2}^{N}\frac{1}{\theta_k}\left(s\,\phi_{kj}^2-\phi_{kj}\sum_{z=1}^{N}s_z\phi_{kz}\right), \tag{29}$$

and the global mean hitting time for the whole network $G$ is

$$\langle T\rangle=\frac{s}{N-1}\sum_{k=2}^{N}\frac{1}{\theta_k}. \tag{30}$$

Proof. As mentioned earlier, NBCRW on network $G$ is equivalent to the ordinary random walk on its weighted counterpart $\mathcal{W}$. According to our previous result [54], the theorem follows immediately. □

4. EXPERIMENTS AND RESULTS FOR MODEL AND REALISTIC NETWORKS

In this section, we study NBCRW on some classical model networks (e.g. the Erdös–Rényi (ER) network [55] and the Barabási–Albert (BA) network [56]) and real networks, and compare the results for the stationary distribution and hitting times of NBCRW with those for TURW and MERW.

4.1. Stationary distribution

Figure 1 shows the stationary distribution for TURW, NBCRW and MERW in an ER network with 1000 nodes. We can see that, for all three random walks, the stationary probability of a node approximately increases with the degree of the node: for two nodes with different degrees, the stationary probability of the large-degree node is higher than that of the small-degree node. Moreover, the stationary probabilities of the three random walks are all distributed in a narrow range: the largest stationary probability is less than twice the smallest one. Thus, there is little difference between the stationary probabilities of the three random walks. In particular, the stationary distributions of NBCRW and MERW are almost identical to each other. The main reason for this phenomenon is that the ER network is homogeneous, with different nodes exhibiting similar structural and dynamical properties.

Figure 1. Stationary distribution in an ER network with 1000 nodes, where each pair of nodes is connected with probability $p=0.5$.
The results for TURW, MERW and NBCRW are obtained by (4), (9) and (24), respectively. All nodes are labeled from 1 to 1000 in decreasing order of degree.

Figure 2 exhibits the behaviors of the stationary distribution for TURW, NBCRW and MERW on a BA network with 1000 nodes and average degree 4. We can see that the stationary distributions are heterogeneous for all three random walks. According to (4), the stationary distribution of TURW is similar to the degree distribution. Figure 2a shows that the stationary probability of TURW lies in the interval $[0.0005, 0.02]$. For NBCRW and MERW, the stationary probability lies, respectively, in the intervals $[10^{-6}, 0.11]$ and $[10^{-7}, 0.16]$, so their heterogeneity is more pronounced than that of TURW.

Beyond the extent of heterogeneity, the stationary distributions of the considered random walks differ in other obvious ways. For TURW, the stationary probability of a node is fully determined by its degree: any two nodes with identical degree have the same stationary probability. For NBCRW and MERW, two different nodes generally have different stationary probabilities, regardless of their degrees. Thus, the stationary probabilities of NBCRW and MERW can discriminate between nodes in BA networks, including those with identical degree.

Figure 2. Stationary distribution in a BA network with 1000 nodes and average degree 4. (a) Stationary distribution of TURW, calculated by (4). (b) Stationary distribution of NBCRW, calculated by (24). (c) Stationary distribution of MERW, calculated by (9). According to stationary probability, all the 1000 nodes are labeled from 1 to 1000.
The insets are the stationary probabilities of the 200 nodes with the smallest stationary probability.

However, even for NBCRW and MERW in BA networks, the stationary probabilities differ greatly from each other. For the hub node 1, the stationary probability for MERW is greater than that for NBCRW, while for small-degree nodes, excluding the neighbors of the hub, the stationary probability of a node for MERW is much lower than that for NBCRW. The insets show that, among the 200 small-degree nodes with the lowest stationary probabilities, almost all have stationary probabilities below $10^{-5}$ for MERW, whereas over 150 nodes have stationary probabilities larger than $10^{-5}$ for NBCRW.

In order to quantify the heterogeneity of the stationary distributions of NBCRW and MERW in BA networks, we compute the inverse participation ratio $S=\sum_{i=1}^{N}\pi_i^2$, which is a standard quantity characterizing the localization or inhomogeneity of an indicator [57]: the larger the value of $S$, the more heterogeneous the stationary distribution. In Table 1, we list the inverse participation ratio for NBCRW and MERW on some model and real networks. From Table 1, we can see that for all considered model and real networks, the heterogeneity of the stationary distribution of MERW is more pronounced than that of NBCRW.

Table 1. Inverse participation ratio of stationary distribution for NBCRW and MERW in a variety of networks.
Network          Size    NBCRW          MERW
BA-model         5000    2.346×10^-2    8.589×10^-2
WS-model [58]    1000    1.692×10^-3    2.510×10^-3
Dolphins [59]    53      4.938×10^-2    5.312×10^-2
ca-NetSci [60]   379     7.372×10^-2    7.938×10^-2
C.elegans [61]   448     3.369×10^-2    4.057×10^-2
E-mail [62]      1133    8.517×10^-3    9.557×10^-3
P2P [63]         6299    7.665×10^-3    7.945×10^-3

4.2. Hitting times

Analogous to the case of the stationary distribution, there is little difference in the behavior of hitting times among NBCRW, MERW and TURW for homogeneous networks, e.g. ER networks. Below we study hitting times on heterogeneous networks, focusing on two representative cases: the mean hitting time to the hub node, $T_H$, and the global mean hitting time $\langle T\rangle$. Figures 3 and 4 display, respectively, $T_H$ and $\langle T\rangle$ for the three random walks in BA networks with node number $N$ changing from 1000 to 10 000.

Figure 3. Mean hitting time to the hub node for different random walks in BA networks with average degree 4.
The values of $T_H$ for TURW, MERW and NBCRW are calculated by (6), (11) and (29), respectively.

From Fig. 3, we can see that when the hub is the target node, the mean absorbing time is the smallest for MERW, slightly smaller than that for NBCRW. In contrast, the mean absorbing time to the hub for TURW is significantly higher than those for MERW and NBCRW. These mean absorbing times are all in inverse proportion to the corresponding stationary probabilities. In a previous work [26], we proved that in BA networks the asymptotic scalings of the mean hitting time to the hub for MERW and TURW are $\ln N$ and $N^{1/2}$, respectively, both of which are consistent with Fig. 3.

As opposed to the sublinear scaling of the partial mean hitting time $T_H$ to the hub for TURW, NBCRW and MERW in BA networks, the global mean hitting time $\langle T\rangle$ behaves linearly for TURW and superlinearly for NBCRW and MERW, as indicated in Fig. 4. Although $\langle T\rangle\sim N^{\rho}$ with $\rho>1$ for both NBCRW and MERW, the power exponent $\rho$ of NBCRW is less than that of MERW. In addition, combining the results in Figs 3 and 4, we find that among the three random walks, $T_H$ is the largest and $\langle T\rangle$ is the lowest for TURW, with the latter achieving the minimal possible scaling for TURW on all connected networks [64]; $T_H$ is the smallest and $\langle T\rangle$ is the largest for MERW. For NBCRW, both $T_H$ and $\langle T\rangle$ lie between those associated with TURW and MERW. Thus, for TURW, NBCRW and MERW on a heterogeneous network, the mean absorbing time to a particular target is not representative of the network.

Figure 4. Global mean hitting times for TURW, NBCRW and MERW in BA networks with average degree 4. The inset provides results for TURW and NBCRW for comparison.
The calculation of $\langle T\rangle$ for TURW, MERW and NBCRW is based on (7), (12) and (30), respectively.

In addition to BA networks, we also study the partial mean hitting time to the hub and the global mean hitting time for TURW, NBCRW and MERW in other synthetic and real networks. In Table 2, we provide the related results for the three random walks, where superscripts T, B and M denote the quantities corresponding to TURW, NBCRW and MERW, respectively. From Table 2 we observe that $T_H^T>T_H^B>T_H^M$ and $\langle T\rangle^M>\langle T\rangle^B>\langle T\rangle^T$ for all studied model and real networks.

Table 2. Mean hitting time to a hub node and global mean hitting time for TURW, NBCRW and MERW in a variety of networks.

Network     Size   T_H^T   T_H^B   T_H^M   ⟨T⟩^T        ⟨T⟩^B         ⟨T⟩^M
BA-model    5000   173.9   22.29   9.466   9837         3.005×10^5    7.960×10^6
WS-model    1000   557.4   246.7   147.0   1667         2440          3396
Dolphins    53     39.79   14.43   13.05   106.6        3175          6684
ca-NetSci   379    488.6   14.20   12.56   1892         2.980×10^12   2.767×10^13
C.elegans   448    19.82   7.929   6.472   1045         2.398×10^6    4.421×10^6
E-mail      1133   180.0   27.43   24.46   3713         7.633×10^6    1.118×10^7
P2P         6299   476.1   51.03   48.67   2.094×10^4   1.232×10^10   2.041×10^10
5. ANALYTICAL RESULTS FOR NBCRW ON ROSE GRAPHS

In the preceding section, we showed that in some model and real networks, the behaviors of NBCRW are strikingly different from those of TURW and MERW. Since many real-life networks are scale-free, analytically unveiling the impact of heterogeneous topology on random walks is important for better understanding their dynamical behaviors and applications. In this section, we study NBCRW, TURW and MERW analytically and numerically on a class of heterogeneous rose graphs [65]. For a particular rose graph, we obtain closed-form expressions for the stationary distribution and hitting times of these three random walks, and we obtain numerical results for general rose graphs; the results for the three walks differ widely from one another. They show that topological heterogeneity affects NBCRW, TURW and MERW in evidently different ways.

5.1. Construction of rose graphs

The rose graphs are a family of deterministic networks that allow some of their structural and dynamical properties to be treated analytically.
Let $R_m^l$ denote the rose graph constructed by merging $m$ ($m\ge 2$) cycles of length $l$ ($l$ even) at a central hub node. Here we focus on a specific class of rose graphs, $R_m^4$, in which each petal is a cycle of length 4; see Fig. 5a. It is easy to derive that in $R_m^4$ the total number of nodes is $N_m=3m+1$, and the total number of edges is $E_m=4m$, since each of the $m$ petals contributes four edges.

Figure 5. Rose graphs. (a) Rose graphs $R_m^4$. (b) Rose graphs $R_m^l$, where each of the $m$ petals is a cycle of length $l$. In either graph, the red node, green nodes and blue nodes stand for the hub, internal nodes and peripheral nodes, respectively.

For convenience of description, we partition the $N_m$ nodes of $R_m^4$ into three classes: the hub node, peripheral nodes and internal nodes. The hub node is the unique node of largest degree, the peripheral nodes are the $m$ nodes farthest from the hub node, and the remaining $2m$ nodes are internal nodes, each of which is linked to the hub node and a peripheral node. Furthermore, the $3m+1$ nodes can be labeled from 1 to $3m+1$ as follows. The hub node is labeled 1. For the nodes in the $i$th ($i=1,2,\ldots,m$) petal, the two internal nodes are labeled $3(i-1)+2$ and $3(i-1)+3$, while the peripheral node is labeled $3(i-1)+4$.

5.2. Stationary distribution

For TURW on $R_m^4$, the stationary distribution can be obtained from (4). For NBCRW and MERW on $R_m^4$, the stationary distributions can also be determined exactly.

Theorem 5.1 For NBCRW on rose graphs $R_m^4$, the stationary probabilities at the hub node, an internal node and a peripheral node are

$$\pi_H^B=\frac{m}{2\left(m+\sqrt{2m-1}\right)}=\frac{N_m-1}{2(N_m-1)+2\sqrt{6N_m-15}},\tag{31}$$

$$\pi_I^B=\frac{1}{4m}=\frac{3}{4(N_m-1)}\tag{32}$$

and

$$\pi_P^B=\frac{m\sqrt{2m-1}-2m+1}{2m(m-1)^2}=\frac{3(N_m-1)\sqrt{6N_m-15}-18N_m+45}{2(N_m-1)(N_m-4)^2},\tag{33}$$

respectively.
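The closed forms in Theorem 5.1 can be cross-checked numerically. The following is a minimal sketch, not the authors' implementation, under two assumptions: the non-backtracking centrality $x$ is the first $N_m$ entries of the leading eigenvector of the block matrix $M$ with blocks $A$, $I-D$, $I$ and $O$, the form used in Appendix A; and the walker jumps from $i$ to a neighbor $j$ with probability proportional to $x_j$, the rule used in Appendix C, so that the stationary weight of node $i$ is proportional to $x_i\sum_j a_{ij}x_j$. The function names and the 0-based labeling are hypothetical, and the power iteration on the shifted matrix $M+I$ is a convenience of this sketch, chosen because $M$ has several eigenvalues of modulus $\kappa_1$.

```python
import math

def rose_graph(m):
    """Adjacency lists of R_m^4 (0-based labels): node 0 is the hub; petal i
    holds internal nodes 3i+1, 3i+2 and peripheral node 3i+3."""
    adj = [[] for _ in range(3 * m + 1)]
    for i in range(m):
        u, v, p = 3 * i + 1, 3 * i + 2, 3 * i + 3
        for s, t in ((0, u), (0, v), (u, p), (v, p)):
            adj[s].append(t)
            adj[t].append(s)
    return adj

def nb_centrality(adj, iters=2000):
    """Power iteration on M + I for M = [[A, I - D], [I, O]].
    The shift by I (an assumption of this sketch) makes kappa_1 + 1 the
    unique eigenvalue of largest modulus."""
    n = len(adj)
    # Start from (1, 0): the all-ones vector is itself an eigenvector of M
    # (eigenvalue 1), so it would never converge to the leading one.
    x, y = [1.0] * n, [0.0] * n
    for _ in range(iters):
        nx = [sum(x[j] for j in adj[i]) + (1 - len(adj[i])) * y[i] + x[i]
              for i in range(n)]
        ny = [x[i] + y[i] for i in range(n)]
        norm = math.sqrt(sum(v * v for v in nx + ny))
        x, y = [v / norm for v in nx], [v / norm for v in ny]
    return x

def nbcrw_stationary(adj):
    """Stationary distribution of the walk with P_ij = x_j / sum_{k ~ i} x_k."""
    x = nb_centrality(adj)
    w = [x[i] * sum(x[j] for j in adj[i]) for i in range(len(adj))]
    total = sum(w)
    return [v / total for v in w]

m = 3
pi = nbcrw_stationary(rose_graph(m))
s = math.sqrt(2 * m - 1)
print(pi[0], m / (2 * (m + s)))                               # hub, eq. (31)
print(pi[1], 1 / (4 * m))                                     # internal, eq. (32)
print(pi[3], (m * s - 2 * m + 1) / (2 * m * (m - 1) ** 2))    # peripheral, eq. (33)
```

Each printed pair should agree up to floating-point error, and the same check can be repeated for other values of $m\ge 2$.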
The proof is presented in Appendix A.

Theorem 5.2 For MERW on rose graphs $R_m^4$, the stationary probabilities at the hub node, an internal node and a peripheral node are

$$\pi_H^M=\frac{m}{2m+2}=\frac{N_m-1}{2(N_m-1)+6},\tag{34}$$

$$\pi_I^M=\frac{1}{4m}=\frac{3}{4(N_m-1)}\tag{35}$$

and

$$\pi_P^M=\frac{1}{2m(m+1)}=\frac{9}{2(N_m+2)(N_m-1)},\tag{36}$$

respectively.

The proof is presented in Appendix B.

Thus far, we have obtained the stationary distributions for NBCRW and MERW on $R_m^4$. For TURW on $R_m^4$, the stationary distribution is determined by the degree sequence of nodes and can be directly computed from (4). Since the hub has degree $2m$ and every other node has degree 2, the degrees sum to $8m$, and the stationary probabilities at the hub node, an internal node and a peripheral node are $\pi_H^T=\frac{1}{4}$, $\pi_I^T=\frac{1}{4m}=\frac{3}{4(N_m-1)}$ and $\pi_P^T=\frac{1}{4m}=\frac{3}{4(N_m-1)}$, respectively. If we choose stationary probability as an indicator of node importance, the stationary distribution of TURW fails to differentiate between the internal nodes and the peripheral nodes in $R_m^4$, since internal and peripheral nodes have the same degree. We will show that this shortcoming can be overcome by using the stationary distribution of NBCRW or MERW, although these two also differ greatly from each other.

For NBCRW and MERW on $R_m^4$, Theorems 5.1 and 5.2 show that $\pi_H^M>\pi_H^B$, $\pi_I^M=\pi_I^B$ and $\pi_P^M<\pi_P^B$. Moreover, for both NBCRW and MERW, the stationary probabilities of internal nodes and peripheral nodes are different, in spite of the fact that their degrees are identical. Thus, the stationary distributions of NBCRW and MERW can discriminate between an internal node and a peripheral node. However, there are differences between the stationary probabilities of NBCRW and MERW. For example, from (33) and (36), we can see that the stationary probability of a peripheral node is of order $O(N_m^{-3/2})$ for NBCRW, larger than the order $O(N_m^{-2})$ obtained for MERW. Therefore, in comparison with MERW, the stationary probabilities of NBCRW are distributed over a narrower range of values. To further unveil the distinction between the stationary distributions of NBCRW and MERW, we turn to a rose graph with longer petals.
We compare the stationary distributions of NBCRW and MERW in the rose graph $R_3^{20}$ with 58 nodes, among which the hub node has degree 6, while each of the other 57 nodes has degree 2. We can classify the 58 nodes in $R_3^{20}$ by assigning a level to each node according to its shortest distance to the hub node: the hub node is at level zero, the neighbors of the hub are at level one, and the nodes farthest from the hub are at level ten. Figure 6 provides numerical results for the stationary distributions of NBCRW and MERW at every node in $R_3^{20}$, which show that for both NBCRW and MERW the stationary probability depends on the level: the smaller the level of a node, the larger its stationary probability. However, their differences are also striking. For MERW, the stationary probability is concentrated almost entirely on the hub node and its neighbors, with other nodes getting vanishing weight; for NBCRW, although the stationary probability of the hub is also significantly larger than those of other nodes, the stationary probability of every node is nonvanishing and greater than 0.01, as shown in the inset of Fig. 6. Thus, if we use stationary probability to measure the relative importance of nonhub nodes, MERW can hardly distinguish the nodes at large levels, whereas NBCRW can discriminate between them.

Figure 6. Stationary probabilities of different nodes for NBCRW and MERW in the rose graph $R_3^{20}$.

5.3. Hitting times

In addition to the stationary distribution, for NBCRW on the rose graph $R_m^4$, the partial mean hitting time to the hub node and the global mean hitting time of the whole network can also be determined explicitly. For the purpose of comparison, we also provide the corresponding exact results for TURW and MERW.
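The exact results stated in Theorems 5.3 and 5.4 below can be cross-checked by solving the hitting-time equations $T_i=1+\sum_j P_{ij}T_j$ directly. The sketch below is an illustration under stated assumptions rather than the authors' procedure: every walk is written as $P_{ij}=w_j/\sum_{k\sim i}w_k$ with a per-node weight $w$ (all ones for TURW, the leading eigenvector of $A$ for MERW, the non-backtracking centrality for NBCRW, as in Appendix C), and the class weights are the solutions of (B.1) and (A.10) up to a common factor, worked out by hand for this sketch. All function names are hypothetical.

```python
import math

def rose_graph(m):
    """Adjacency lists of R_m^4 (0-based): node 0 is the hub; petal i has
    internal nodes 3i+1, 3i+2 and peripheral node 3i+3."""
    adj = [[] for _ in range(3 * m + 1)]
    for i in range(m):
        u, v, p = 3 * i + 1, 3 * i + 2, 3 * i + 3
        for s, t in ((0, u), (0, v), (u, p), (v, p)):
            adj[s].append(t)
            adj[t].append(s)
    return adj

def node_weights(m, walk):
    """Bias weights per node, up to a common factor (assumed closed forms)."""
    if walk == "TURW":
        h, i, p = 1.0, 1.0, 1.0
    elif walk == "MERW":      # leading eigenvector of A, solved from (B.1)
        h, i, p = float(m), math.sqrt(2 * m + 2) / 2, 1.0
    else:                     # NBCRW: centrality solved from (A.10)
        k = (2 * m - 1) ** 0.25
        h, i, p = float(m), k * (1 + k * k) / 2, k * k
    return [h] + [i, i, p] * m

def solve(a, b):
    """Gauss-Jordan elimination with partial pivoting (small dense systems)."""
    n = len(b)
    a = [row[:] + [b[r]] for r, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[piv] = a[piv], a[c]
        for r in range(n):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [x - f * y for x, y in zip(a[r], a[c])]
    return [a[r][n] / a[r][r] for r in range(n)]

def hitting_times_to(adj, w, target):
    """Mean hitting times to `target` from every other node."""
    others = [i for i in range(len(adj)) if i != target]
    pos = {node: r for r, node in enumerate(others)}
    a = [[0.0] * len(others) for _ in others]
    for i in others:
        total = sum(w[k] for k in adj[i])
        a[pos[i]][pos[i]] = 1.0
        for j in adj[i]:
            if j != target:   # steps into the target contribute T_target = 0
                a[pos[i]][pos[j]] -= w[j] / total
    t = solve(a, [1.0] * len(others))
    return {node: t[pos[node]] for node in others}

def partial_and_global(m, walk):
    """Return (T_H, <T>): mean hitting time to the hub and global mean."""
    adj, w = rose_graph(m), node_weights(m, walk)
    n = len(adj)
    to_hub = hitting_times_to(adj, w, 0)
    t_hub = sum(to_hub.values()) / (n - 1)
    total = sum(sum(hitting_times_to(adj, w, j).values()) for j in range(n))
    return t_hub, total / (n * (n - 1))

for walk in ("TURW", "NBCRW", "MERW"):
    print(walk, partial_and_global(3, walk))
```

For small $m$ the printed values can be compared directly with the closed forms (37)–(42). Solving one linear system per target costs $O(N_m^4)$ overall, which is acceptable for a check but is not meant as an efficient method.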
Theorem 5.3 The partial mean hitting times to the hub node for TURW, NBCRW and MERW in $R_m^4$ are

$$T_H^T=\frac{10}{3},\tag{37}$$

$$T_H^B=\frac{4}{3}+\frac{2\sqrt{2m-1}}{m}=\frac{4}{3}+\frac{2\sqrt{6N_m-15}}{N_m-1}\tag{38}$$

and

$$T_H^M=\frac{4}{3}+\frac{2}{m}=\frac{4}{3}+\frac{6}{N_m-1},\tag{39}$$

respectively.

The proof is presented in Appendix C.

Theorem 5.3 shows that for TURW, NBCRW and MERW in large rose graphs $R_m^4$ ($N_m\to\infty$), the partial mean hitting time to the hub node tends to a constant, with $T_H^B=T_H^M=4/3$ smaller than $T_H^T=10/3$. Although $T_H^B$ and $T_H^M$ are asymptotically equal in large rose graphs $R_m^4$, the global mean hitting times of NBCRW and MERW behave differently, as can be seen from the following theorem.

Theorem 5.4 The global mean hitting times for TURW, NBCRW and MERW in $R_m^4$ with $N_m=3m+1$ nodes are

$$\langle T\rangle^T=\frac{20m(3m-1)}{3(3m+1)}=\frac{20(N_m-1)(N_m-2)}{9N_m},\tag{40}$$

$$\langle T\rangle^B=\frac{2m^3+12m^2-14m+4}{(3m+1)\sqrt{2m-1}}+\frac{36m^2-8m}{3(3m+1)}=\frac{2N_m^2+30N_m-192}{9\sqrt{6N_m-15}}+\frac{268+20\sqrt{6N_m-15}}{9N_m\sqrt{6N_m-15}}+\frac{12N_m-32}{9}\tag{41}$$

and

$$\langle T\rangle^M=\frac{6m^3+36m^2+10m-12}{9m+3}=\frac{2N_m^3+30N_m^2-36N_m-104}{27N_m},\tag{42}$$

respectively.

The proof is presented in Appendix D.

Theorem 5.4 provides succinct dependence relations of $\langle T\rangle^T$, $\langle T\rangle^B$ and $\langle T\rangle^M$ on the network size $N_m$, from which we find that for large networks (i.e. $N_m\to\infty$), the leading terms of the global mean hitting times for TURW, NBCRW and MERW are $\langle T\rangle^T\sim N_m$, $\langle T\rangle^B\sim (N_m)^{3/2}$ and $\langle T\rangle^M\sim (N_m)^2$, respectively. Thus, $\langle T\rangle^T$, $\langle T\rangle^B$ and $\langle T\rangle^M$ behave differently for the three random walks on $R_m^4$. For TURW, $\langle T\rangle^T$ grows linearly with $N_m$, while for NBCRW and MERW, $\langle T\rangle^B$ and $\langle T\rangle^M$ increase superlinearly with $N_m$, with $\langle T\rangle^B$ smaller than $\langle T\rangle^M$. These results indicate that when searching for a target node selected uniformly in $R_m^4$, TURW is the most efficient and MERW the least efficient, as observed in the model and real networks studied in the previous section.

6. CONCLUSION

The application effects of random walks are determined to a large extent by the properties and behaviors of their stationary distributions and hitting times. Recent works indicate that biased random walks perform better than TURW in multiple applications.
Thus, designing appropriate biased random walks and understanding their properties are of significant importance. In this paper, we defined a new biased random walk, NBCRW, with the bias dependent on the non-backtracking centrality, a recently proposed node centrality measure that has several advantages over the traditional eigenvector centrality metric. We established a theoretical framework for quickly computing the transition probabilities, stationary distribution and hitting times of NBCRW on a general network. Within the proposed framework, we studied NBCRW numerically or analytically on some model and real networks, and compared the results on stationary distribution and hitting times with those for TURW and MERW, the latter of which is a random walk biased toward neighbors with high eigenvector centrality. We found that for homogeneous networks, the behaviors of stationary distribution and hitting times of the three random walks resemble each other. However, for heterogeneous networks, the behaviors of the three random walks differ greatly. For example, the stationary distribution of NBCRW outperforms those of TURW and MERW in discriminating nodes, in particular those with identical degree. With respect to hitting times, a walker finds the hub node most quickly when performing MERW, and detects a uniformly selected target most rapidly when executing TURW. In both cases, the hitting times of NBCRW interpolate between those of TURW and MERW. In view of the distinctive behaviors of NBCRW, it would be interesting to explore its applications in different fields, such as community detection, visual saliency and link prediction.

FUNDING

This work was supported by the National Natural Science Foundation of China under Grant No. 11275049.

REFERENCES

1 Chau, C.-K. and Basu, P. (2011) Analysis of latency of stateless opportunistic forwarding in intermittently connected networks. IEEE/ACM Trans. Netw.
, 19, 1111–1124.
2 Zheng, H., Yang, F., Tian, X., Gan, X., Wang, X. and Xiao, S. (2015) Data gathering with compressive sensing in wireless sensor networks: a random walk based approach. IEEE Trans. Parallel Distrib. Syst., 26, 35–44.
3 Lee, C.-H. and Kwak, J. (2016) Towards distributed optimal movement strategy for data gathering in wireless sensor networks. IEEE Trans. Parallel Distrib. Syst., 27, 574–584.
4 El Gamal, A., Mammen, J., Prabhakar, B. and Shah, D. (2006) Optimal throughput-delay scaling in wireless networks - Part I: the fluid model. IEEE Trans. Inf. Theory, 52, 2568–2592.
5 Liu, J., Jiang, X., Nishiyama, H. and Kato, N. (2012) Exact Throughput Capacity Under Power Control in Mobile Ad Hoc Networks. Proc. IEEE INFOCOM, Orlando, FL, USA, March 25–30, pp. 1–9. IEEE Press, Piscataway, NJ, USA.
6 Li, Y. and Zhang, Z.-L. (2013) Random walks and Green's function on digraphs: a framework for estimating wireless transmission costs. IEEE/ACM Trans. Netw., 21, 135–148.
7 Beraldi, R., Querzoni, L. and Baldoni, R. (2009) Low hitting time random walks in wireless networks. Wirel. Commun. Mob. Comput., 9, 719–732.
8 Lin, T., Lin, P., Wang, H. and Chen, C. (2009) Dynamic search algorithm in unstructured peer-to-peer networks. IEEE Trans. Parallel Distrib. Syst., 20, 654–666.
9 Pons, P. and Latapy, M. (2005) Computing Communities in Large Networks Using Random Walks. Proc. Int. Symp. Comput. Inform. Sci., Istanbul, Turkey, October 26–28, pp. 284–293. Springer.
10 Fouss, F., Pirotte, A., Renders, J.-M. and Saerens, M. (2007) Random-walk computation of similarities between nodes of a graph with application to collaborative recommendation. IEEE Trans. Knowl. Data Eng., 19, 355–369.
11 Gopalakrishnan, V., Hu, Y. and Rajan, D. (2009) Random Walks on Graphs to Model Saliency in Images. Proc. IEEE Int. Conf. Comput. Vision Pattern Recognit., Miami Beach, Florida, USA, June 20–25, pp. 1698–1705. IEEE Press, Piscataway, NJ, USA.
12 Grady, L. (2006) Random walks for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell., 28, 1768–1783.
13 Ribeiro, B. and Towsley, D. (2010) Estimating and Sampling Graphs with Multidimensional Random Walks. Proc. ACM SIGCOMM IMC, New Delhi, India, August 30–September 3, pp. 390–403. ACM Press, New York, NY, USA.
14 Ribeiro, B., Wang, P., Murai, F. and Towsley, D. (2012) Sampling Directed Graphs with Random Walks. Proc. IEEE INFOCOM, March 25–30, Orlando, FL, USA, pp. 1692–1700. IEEE Press, Piscataway, NJ, USA.
15 Brin, S. and Page, L. (1998) The anatomy of a large-scale hypertextual Web search engine. Comput. Netw. ISDN Syst., 30, 107–117.
16 Yu, J.-G., Zhao, J., Tian, J. and Tan, Y. (2014) Maximal entropy random walk for region-based visual saliency. IEEE Trans. Cybern., 44, 1661–1672.
17 Newman, M.E. (2003) The structure and function of complex networks. SIAM Rev., 45, 167–256.
18 Beraldi, R. (2009) Biased random walks in uniform wireless networks. IEEE Trans. Mob. Comput., 8, 500–513.
19 Gjoka, M., Kurant, M., Butts, C.T. and Markopoulou, A. (2010) Walking in Facebook: A Case Study of Unbiased Sampling of OSNs. Proc. IEEE INFOCOM, San Diego, CA, USA, March 14–19, pp. 1–9. IEEE Press, Piscataway, NJ, USA.
20 Ikeda, S., Kubo, I. and Yamashita, M. (2009) The hitting and cover times of random walks on finite graphs using local degree information. Theor. Comput. Sci., 410, 94–100.
21 Maiya, A.S. and Berger-Wolf, T.Y. (2011) Benefits of Bias: Towards Better Characterization of Network Sampling. Proc. Int. Conf. Knowl. Discov. Data Mining, San Diego, CA, USA, August 21–24, pp. 105–113. ACM Press, New York, NY, USA.
22 Parry, W. (1964) Intrinsic Markov chains. Trans. Am. Math. Soc., 112, 55–66.
23 Burda, Z., Duda, J., Luck, J. and Waclaw, B. (2009) Localization of the maximal entropy random walk. Phys. Rev. Lett., 102, 160602.
24 Gómez-Gardeñes, J. and Latora, V. (2008) Entropy rate of diffusion processes on complex networks. Phys. Rev. E, 78, 065102.
25 Peng, X. and Zhang, Z. (2014) Maximal entropy random walk improves efficiency of trapping in dendrimers. J. Chem. Phys., 140, 234104.
26 Lin, Y. and Zhang, Z. (2014) Mean first-passage time for maximal-entropy random walks in complex networks. Sci. Rep., 4, 5365.
27 Kafsi, M., Grossglauser, M. and Thiran, P. (2013) The entropy of conditional Markov trajectories. IEEE Trans. Inf. Theory, 59, 5577–5583.
28 Ochab, J. and Burda, Z. (2012) Exact solution for statics and dynamics of maximal-entropy random walks on Cayley trees. Phys. Rev. E, 85, 021145.
29 Li, R.-H., Yu, J.X. and Liu, J. (2011) Link Prediction: The Power of Maximal Entropy Random Walk. Proc. Int. Conf. Inform. Knowl. Manag., Glasgow, United Kingdom, October 24–28, pp. 1147–1156. ACM Press, New York, NY, USA.
30 Korus, P. and Huang, J. (2016) Improved tampering localization in digital image forensics based on maximal entropy random walk. IEEE Signal Process. Lett., 23, 169–173.
31 Bonacich, P. (1972) Factoring and weighting approaches to status scores and clique identification. J. Math. Sociol., 2, 113–120.
32 Martin, T., Zhang, X. and Newman, M.E.J. (2014) Localization and centrality in networks. Phys. Rev. E, 90, 052808.
33 Hashimoto, K. (1989) Zeta functions of finite graphs and representations of p-adic groups. Adv. Stud. Pure Math., 15, 211–280.
34 Krzakala, F., Moore, C., Mossel, E., Neeman, J., Sly, A., Zdeborová, L. and Zhang, P. (2013) Spectral redemption in clustering sparse networks. Proc. Natl. Acad. Sci., 110, 20935–20940.
35 Karrer, B., Newman, M.E.J. and Zdeborová, L. (2014) Percolation on sparse networks. Phys. Rev. Lett., 113, 208702.
36 Lin, Y., Chen, W. and Zhang, Z. (2017) Assessing Percolation Threshold Based on High-Order Non-backtracking Matrices. Proc. 26th Int. Conf. World Wide Web, Perth, Australia, April 3–7, pp. 223–232. International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland.
37 Shrestha, M., Scarpino, S.V. and Moore, C. (2015) Message-passing approach for recurrent-state epidemic models on networks. Phys. Rev. E, 92, 022821.
38 Morone, F. and Makse, H.A. (2015) Influence maximization in complex networks through optimal percolation. Nature, 524, 65–68.
39 Condamin, S., Bénichou, O., Tejedor, V., Voituriez, R. and Klafter, J. (2007) First-passage times in complex scale-invariant media. Nature, 450, 77–80.
40 Lin, Y., Julaiti, A. and Zhang, Z.Z. (2012) Mean first-passage time for random walks in general graphs with a deep trap. J. Chem. Phys., 137, 124104.
41 Ermon, S., Gomes, C.P., Sabharwal, A. and Selman, B. (2014) Designing Fast Absorbing Markov Chains. Proc. AAAI, July 27–31, pp. 849–855. AAAI Press.
42 White, S. and Smyth, P. (2003) Algorithms for Estimating Relative Importance in Networks. Proc. ACM SIGKDD Int. Conf. Knowl. Discov. Data Mining, Washington DC, USA, August 24–27, pp. 266–275. ACM Press, New York, NY, USA.
43 Feng, M., Qu, H. and Yi, Z. (2014) Highest degree likelihood search algorithm using a state transition matrix for complex networks. IEEE Trans. Circuits and Syst. I, Reg. Papers, 61, 2941–2950.
44 Lovász, L. (1993) Random walks on graphs: a survey. Combinatorics, Paul Erdös is Eighty, 2, 1–46.
45 Aldous, D. and Fill, J. (1999) Reversible Markov chains and random walks on graphs. http://www.stat.berkeley.edu/~aldous/RWG/book.html.
46 Strang, G. (2009) Introduction to Linear Algebra. Wellesley-Cambridge Press, Wellesley, MA.
47 Bass, H. (1992) The Ihara–Selberg zeta function of a tree lattice. Int. J. Math., 03, 717–797.
48 Angel, O., Friedman, J. and Hoory, S. (2015) The non-backtracking spectrum of the universal cover of a graph. Trans. Am. Math. Soc., 367, 4287–4318.
49 Zhang, Z., Shan, T. and Chen, G. (2013) Random walks on weighted networks. Phys. Rev. E, 87, 012112.
50 Barrat, A., Barthelemy, M., Pastor-Satorras, R. and Vespignani, A. (2004) The architecture of complex weighted networks. Proc. Natl. Acad. Sci. USA, 101, 3747–3752.
51 Alon, N., Benjamini, I., Lubetzky, E. and Sodin, S. (2007) Non-backtracking random walks mix faster. Commun. Contemp. Math., 9, 585–603.
52 Fitzner, R. and van der Hofstad, R. (2013) Non-backtracking random walk. J. Stat. Phys., 150, 264–284.
53 Kempton, M. (2016) Non-backtracking random walks and a weighted Ihara's theorem. Open J. Discrete Math., 6, 207–226.
54 Lin, Y. and Zhang, Z.Z. (2013) Random walks in weighted networks with a perfect trap: an application of Laplacian spectra. Phys. Rev. E, 87, 062140.
55 Erdös, P. and Rényi, A. (1960) On the evolution of random graphs. Publ. Math. Inst. Hungar. Acad. Sci., 5, 17–61.
56 Barabási, A.-L. and Albert, R. (1999) Emergence of scaling in random networks. Science, 286, 509–512.
57 Bell, R. and Dean, P. (1970) Atomic vibrations in vitreous silica. Discuss. Faraday Soc., 50, 55–61.
58 Watts, D.J. and Strogatz, S.H. (1998) Collective dynamics of 'small-world' networks. Nature, 393, 440–442.
59 Lusseau, D., Schneider, K., Boisseau, O.J., Haase, P., Slooten, E. and Dawson, S.M. (2003) The bottlenose dolphin community of Doubtful Sound features a large proportion of long-lasting associations. Behav. Ecol. Sociobiol., 54, 396–405.
60 Newman, M.E.J. (2006) Finding community structure in networks using the eigenvectors of matrices. Phys. Rev. E, 74, 036104.
61 Duch, J. and Arenas, A. (2005) Community detection in complex networks using extremal optimization. Phys. Rev. E, 72, 027104.
62 Guimera, R., Danon, L., Diaz-Guilera, A., Giralt, F. and Arenas, A. (2003) Self-similar community structure in a network of human interactions. Phys. Rev. E, 68, 065103.
63 Leskovec, J., Kleinberg, J. and Faloutsos, C.
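(2007) Graph evolution: densification and shrinking diameters. ACM Trans. Knowl. Discov. Data, 1, 2.
64 Tejedor, V., Bénichou, O. and Voituriez, R. (2009) Global mean first-passage times of random walks on complex networks. Phys. Rev. E, 80, 065104.
65 Liu, F. and Huang, Q. (2013) Laplacian spectral characterization of 3-rose graphs. Linear Algebra Appl., 439, 2914–2920.

Appendix A. PROOF OF THEOREM 5.1

Let $\kappa_1$ denote the leading eigenvalue of the matrix $M$ corresponding to $R_m^4$, and let $M_m(\kappa)$ be the characteristic polynomial of $M$, i.e.

$$M_m(\kappa)=\det(\kappa I-M).\tag{A.1}$$

Then $\kappa_1$ is the largest root of the equation $M_m(\kappa)=0$. Equation (A.1) can be recast as

$$M_m(\kappa)=\det\begin{pmatrix}\kappa I-A & D-I\\ -I & \kappa I\end{pmatrix}=\frac{1}{\kappa^{N_m}}\det\begin{pmatrix}\kappa^2 I-\kappa A-I+D & D-I\\ O & \kappa I\end{pmatrix}=\det\left((\kappa^2-1)I+D-\kappa A\right),\tag{A.2}$$

which reduces the computation of $M_m(\kappa)$ to computing the determinant of a new matrix $P_m=(\kappa^2-1)I+D-\kappa A$ of lower order. According to the construction of $R_m^4$, $\det(P_m)$ can be written in the following form:

$$\det(P_m)=\det\begin{pmatrix}\kappa^2+2m-1 & -\kappa e & -\kappa e & \cdots & -\kappa e\\ -\kappa e^{\top} & Q & O & \cdots & O\\ -\kappa e^{\top} & O & Q & \cdots & O\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ -\kappa e^{\top} & O & O & \cdots & Q\end{pmatrix},\tag{A.3}$$

where $e=(1,1,0)$, $O$ is the $3\times 3$ zero matrix, and $Q$ is a $3\times 3$ matrix given by

$$Q=\begin{pmatrix}\kappa^2+1 & 0 & -\kappa\\ 0 & \kappa^2+1 & -\kappa\\ -\kappa & -\kappa & \kappa^2+1\end{pmatrix}.\tag{A.4}$$

Note that the first row of the matrix $P_m$ on the right-hand side (rhs) of (A.3) can be regarded as the sum (i.e. a linear combination with all scalars being 1) of the following $m+1$ vectors: $(\kappa^2+2m-1,\mathbf{0},\mathbf{0},\ldots,\mathbf{0})$, $(0,-\kappa e,\mathbf{0},\ldots,\mathbf{0})$, $(0,\mathbf{0},-\kappa e,\ldots,\mathbf{0})$, …, $(0,\mathbf{0},\mathbf{0},\ldots,-\kappa e)$, where $\mathbf{0}$ represents the zero vector $(0,0,0)$. According to the properties of determinants, $\det(P_m)$ can be rewritten as

$$\det(P_m)=\det\begin{pmatrix}\kappa^2+2m-1 & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{0}\\ -\kappa e^{\top} & Q & O & \cdots & O\\ -\kappa e^{\top} & O & Q & \cdots & O\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ -\kappa e^{\top} & O & O & \cdots & Q\end{pmatrix}+m\cdot\det\begin{pmatrix}0 & -\kappa e & \mathbf{0} & \cdots & \mathbf{0}\\ -\kappa e^{\top} & Q & O & \cdots & O\\ -\kappa e^{\top} & O & Q & \cdots & O\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ -\kappa e^{\top} & O & O & \cdots & Q\end{pmatrix}=(\kappa^2+2m-1)(\det Q)^m+m(\det Q)^{m-1}\det\begin{pmatrix}0 & -\kappa e\\ -\kappa e^{\top} & Q\end{pmatrix}.\tag{A.5}$$

Based on (A.4), we obtain

$$\det(Q)=\kappa^6+\kappa^4+\kappa^2+1\tag{A.6}$$

and

$$\det\begin{pmatrix}0 & -\kappa e\\ -\kappa e^{\top} & Q\end{pmatrix}=-2\kappa^6-4\kappa^4-2\kappa^2.\tag{A.7}$$

Inserting (A.6) and (A.7) into (A.5) yields

$$M_m(\kappa)=\det(P_m)=(\kappa+1)(\kappa-1)(\kappa^2+1)(\kappa^4-2m+1)(\kappa^6+\kappa^4+\kappa^2+1)^{m-1}.\tag{A.8}$$

Thus, the largest eigenvalue $\kappa_1$ of the matrix $M$ is

$$\kappa_1=(2m-1)^{1/4}.\tag{A.9}$$

Next, we derive the eigenvector of unit length corresponding to the eigenvalue $\kappa_1$. Let $x_H$, $x_I$ and $x_P$ represent the non-backtracking centrality of the hub node, an internal node and a peripheral node in graph $R_m^4$, respectively. According to (21), with $\kappa=\kappa_1$, the quantities $x_H$, $x_I$ and $x_P$ satisfy the following system of equations:

$$\begin{cases}2mx_I-\dfrac{(2m-1)x_H}{\kappa}=\kappa x_H,\\[2mm] x_H+x_P-\dfrac{x_I}{\kappa}=\kappa x_I,\\[2mm] x_H^2+\left(\dfrac{x_H}{\kappa}\right)^2+2m\left[x_I^2+\left(\dfrac{x_I}{\kappa}\right)^2\right]+m\left[x_P^2+\left(\dfrac{x_P}{\kappa}\right)^2\right]=1,\end{cases}\tag{A.10}$$

which can be solved to yield

$$\begin{aligned}x_H&=\frac{2\sqrt{m}\,\kappa^3}{\sqrt{(\kappa^2+1)\left(\kappa^8+2\kappa^6+2(8m-3)\kappa^4+2(2m-1)^2\kappa^2+(2m-1)^2\right)}},\\ x_I&=\frac{\kappa^2(\kappa^2+2m-1)}{\sqrt{m(\kappa^2+1)\left(\kappa^8+2\kappa^6+2(8m-3)\kappa^4+2(2m-1)^2\kappa^2+(2m-1)^2\right)}},\\ x_P&=\frac{\kappa(\kappa^4+2m-1)}{\sqrt{m(\kappa^2+1)\left(\kappa^8+2\kappa^6+2(8m-3)\kappa^4+2(2m-1)^2\kappa^2+(2m-1)^2\right)}}.\end{aligned}\tag{A.11}$$

Then the normalization factor $Q$ can be computed by

$$Q=\sum_{i=1}^{N_m}\left(\frac{\kappa^2-1}{\kappa}+\frac{d_i}{\kappa}\right)x_i^2=\left(\frac{\kappa^2-1}{\kappa}+\frac{2m}{\kappa}\right)x_H^2+2m\left(\frac{\kappa^2-1}{\kappa}+\frac{2}{\kappa}\right)x_I^2+m\left(\frac{\kappa^2-1}{\kappa}+\frac{2}{\kappa}\right)x_P^2=\frac{2(2m-1)^{1/4}\left[2m^2+m-1+\sqrt{2m-1}\,(3m-1)\right]}{\left(\sqrt{2m-1}+1\right)\left[m\left(5+\sqrt{2m-1}\right)-2\right]}.\tag{A.12}$$

Plugging (A.9), (A.11) and (A.12) into (24) and considering the relation $N_m=3m+1$, the theorem follows.

Appendix B. PROOF OF THEOREM 5.2

Let $\lambda_1$ be the leading eigenvalue of the adjacency matrix $A$ of graph $R_m^4$, and let $\mu_H$, $\mu_I$ and $\mu_P$ be the elements of the leading eigenvector of unit length corresponding to the hub node, an internal node and a peripheral node, respectively. Then,

$$2m\mu_I=\lambda_1\mu_H,\qquad \mu_H+\mu_P=\lambda_1\mu_I,\qquad 2\mu_I=\lambda_1\mu_P.\tag{B.1}$$

Eliminating $\lambda_1$ from (B.1), we have

$$\frac{2m\mu_I}{\mu_H+\mu_P}=\frac{\mu_H}{\mu_I},\qquad \frac{\mu_H+\mu_P}{2\mu_I}=\frac{\mu_I}{\mu_P},\tag{B.2}$$

which, together with the normalization condition $\mu_H^2+2m\mu_I^2+m\mu_P^2=1$, is solved to yield

$$\mu_H=\sqrt{\frac{m}{2m+2}},\qquad \mu_I=\frac{1}{2\sqrt{m}},\qquad \mu_P=\frac{1}{\sqrt{m(2m+2)}}.\tag{B.3}$$

Combining (B.3) and the relation $N_m=3m+1$, the theorem follows from (9).

Appendix C. PROOF OF THEOREM 5.3

For TURW (MERW, NBCRW) in $R_m^4$, let $T_{I\to H}^T$ ($T_{I\to H}^M$, $T_{I\to H}^B$) be the hitting time from an internal node to the hub, and let $T_{P\to H}^T$ ($T_{P\to H}^M$, $T_{P\to H}^B$) be the hitting time from a peripheral node to the hub. Then, by definition, for the three random walks in $R_m^4$ the partial mean hitting time to the hub can be computed by a uniform formula as

$$T_H^Z=\frac{1}{3}\left(2T_{I\to H}^Z+T_{P\to H}^Z\right),\tag{C.1}$$

where $Z$ can be T, M or B. We next determine the partial mean hitting time to the hub for the three considered random walks.

Case I: TURW.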
Since TURW is unbiased, the quantities $T_{I\to H}^T$ and $T_{P\to H}^T$ satisfy the following relations:

$$T_{I\to H}^T=\frac{1}{2}+\frac{1}{2}\left(1+T_{P\to H}^T\right)\quad\text{and}\quad T_{P\to H}^T=1+T_{I\to H}^T,$$

from which we have

$$T_{I\to H}^T=3\tag{C.2}$$

and

$$T_{P\to H}^T=4.\tag{C.3}$$

Plugging (C.2) and (C.3) into (C.1) yields $T_H^T=\frac{10}{3}$.

Case II: NBCRW. We first determine the transition probabilities between different nodes for NBCRW in $R_m^4$. If the walker is currently at a peripheral node, at the next time step it jumps with equal probability to either of the two internal nodes adjacent to it; if the walker is currently at an internal node, then according to (23) and (A.11), at the next time step it jumps to the hub node or to the peripheral node with probability

$$\frac{x_H}{x_H+x_P}=\frac{m}{m+\sqrt{2m-1}}\quad\text{and}\quad\frac{x_P}{x_H+x_P}=\frac{\sqrt{2m-1}}{m+\sqrt{2m-1}},$$

respectively. Then, we can establish the relations between $T_{I\to H}^B$ and $T_{P\to H}^B$ as

$$T_{I\to H}^B=\frac{m}{m+\sqrt{2m-1}}+\frac{\sqrt{2m-1}}{m+\sqrt{2m-1}}\left(1+T_{P\to H}^B\right)\quad\text{and}\quad T_{P\to H}^B=1+T_{I\to H}^B,$$

which can be solved to yield

$$T_{I\to H}^B=1+\frac{2\sqrt{2m-1}}{m}\tag{C.4}$$

and

$$T_{P\to H}^B=2+\frac{2\sqrt{2m-1}}{m}.$$

Thus, the mean hitting time to the hub for NBCRW in $R_m^4$ is

$$T_H^B=\frac{1}{3}\left(2T_{I\to H}^B+T_{P\to H}^B\right)=\frac{4}{3}+\frac{2\sqrt{2m-1}}{m}=\frac{4}{3}+\frac{2\sqrt{6N_m-15}}{N_m-1}.$$

Case III: MERW. For MERW in $R_m^4$, the transition probability from a node $i$ to one of its neighbors $j$ is $\mu_j/\sum_k a_{ik}\mu_k$. Then, according to (B.3), the probabilities of jumping from an internal node to the hub node and to the peripheral neighbor are

$$\frac{\mu_H}{\mu_H+\mu_P}=\frac{m}{m+1}\quad\text{and}\quad\frac{\mu_P}{\mu_H+\mu_P}=\frac{1}{m+1},$$

respectively. Thus, we can establish the following relations for $T_{I\to H}^M$ and $T_{P\to H}^M$:

$$T_{I\to H}^M=\frac{m}{m+1}+\frac{1}{m+1}\left(1+T_{P\to H}^M\right)\tag{C.5}$$

and

$$T_{P\to H}^M=1+T_{I\to H}^M.\tag{C.6}$$

Solving (C.5) and (C.6), one obtains the analytical expressions for $T_{I\to H}^M$ and $T_{P\to H}^M$ as

$$T_{I\to H}^M=\frac{m+2}{m}\tag{C.7}$$

and

$$T_{P\to H}^M=\frac{2(m+1)}{m}.\tag{C.8}$$

According to (C.1), (C.7) and (C.8), the mean hitting time to the hub for MERW in $R_m^4$ is

$$T_H^M=\frac{1}{3}\left(2T_{I\to H}^M+T_{P\to H}^M\right)=\frac{4}{3}+\frac{2}{m}=\frac{4}{3}+\frac{6}{N_m-1}.$$

This completes the proof.

Appendix D. PROOF OF THEOREM 5.4

We use the superscript $Z\in\{T,M,B\}$ to distinguish related quantities for TURW, MERW and NBCRW on $R_m^4$. For example, $\langle T\rangle^T$ ($\langle T\rangle^M$, $\langle T\rangle^B$) represents the global mean hitting time for TURW (MERW, NBCRW) on $R_m^4$.
By definition, ⟨T⟩Z=1Nm(Nm−1)∑i=1Nm∑j=1NmTi→jZ=TtotZNm(Nm−1), (D.1) where Ti→jZ is the hitting time from node i to node j in Rm4, and TtotZ=∑i=1Nm∑j=1NmTi→jZ denotes the sum of hitting times over all Nm(Nm−1) node pairs in Rm4. Thus, in order to obtain ⟨T⟩Z, we only need to determine TtotZ. By construction, TtotZ can be decomposed into two terms as TtotZ=mTtot,1Z+m(m−1)Ttot,2Z, (D.2) where Ttot,1Z is the sum of hitting times between all pairs of nodes belonging to one of the m petals, and Ttot,2Z is the sum of hitting times between all pairs of nodes in different petals. We now compute Ttot,1Z and Ttot,2Z. For Ttot,1Z, it can be evaluated by Ttot,1Z=2TI→HZ+TP→HZ+2(TH→IZ+TI→IZ+TP→IZ)+TH→PZ+2TI→PZ, (D.3) where TX→YZ represents the hitting time from a node in class X to another node in class Y with both nodes belonging to the same petal. For instance, TI→PZ is the hitting time from an internal node to the peripheral node in the same petal, and TI→IZ is the hitting time from one internal node to the other internal node in the same petal. For Ttot,2Z, we have Ttot,2Z=2(3TI→HZ+2TH→IZ+TH→PZ)+(3TP→HZ+2TH→IZ+TH→PZ). (D.4) Plugging (D.3) and (D.4) into (D.2) yields TtotZ=(6m2−4m)TI→HZ+(3m2−2m)TP→HZ+(6m2−4m)TH→IZ+2mTI→IZ+2mTP→IZ+(3m2−2m)TH→PZ+2mTI→PZ. (D.5) We are now ready to determine the global mean hitting time for TURW, NBCRW and MERW in Rm4 by evaluating those quantities on the rhs of (D.5). Case I: TURW. Since the quantities TI→HT and TP→HT have been obtained earlier, we only require to determine TH→IT, TI→IT, TP→IT, TH→PT and TI→PT, which obey the following relations: TH→IT=m−1m(1+TI→HT+TH→IT)+12m(1+TI→IT)+12m, TI→IT=12(1+TH→IT)+12(1+TP→IT), TP→IT=12(1+TI→IT)+12, TH→PT=m−1m(1+TI→HT+TH→PT)+1m(1+TI→PT and TI→PT=12(1+TH→PT)+12. Using (C.2), the above equations are solved to obtain TH→IT=6m−3, (D.6) TI→IT=4m, (D.7) TP→IT=2m+1, (D.8) TH→PT=8m−4 (D.9) and TI→PT=4m−1. 
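(D.10)
Plugging (C.2)–(C.3) and (D.6)–(D.10) into (D.5) and (D.1), we obtain the explicit expression for the global mean hitting time $\langle T\rangle^{T}$ for TURW in $R_m^4$ and its dependence on the node number $N_m=3m+1$, as given by (40).

Case II: NBCRW. For NBCRW on $R_m^4$, we can establish the following relations for the quantities $T_{H\to I}^{B}$, $T_{I\to I}^{B}$, $T_{P\to I}^{B}$, $T_{H\to P}^{B}$ and $T_{I\to P}^{B}$:
$$T_{H\to I}^{B}=\frac{m-1}{m}\left(1+T_{I\to H}^{B}+T_{H\to I}^{B}\right)+\frac{1}{2m}\left(1+T_{I\to I}^{B}\right)+\frac{1}{2m},$$
$$T_{I\to I}^{B}=\frac{x_H}{x_H+x_P}\left(1+T_{H\to I}^{B}\right)+\frac{x_P}{x_H+x_P}\left(1+T_{P\to I}^{B}\right),$$
$$T_{P\to I}^{B}=\frac{1}{2}\left(1+T_{I\to I}^{B}\right)+\frac{1}{2},$$
$$T_{H\to P}^{B}=\frac{m-1}{m}\left(1+T_{I\to H}^{B}+T_{H\to P}^{B}\right)+\frac{1}{m}\left(1+T_{I\to P}^{B}\right)$$
and
$$T_{I\to P}^{B}=\frac{x_H}{x_H+x_P}\left(1+T_{H\to P}^{B}\right)+\frac{x_P}{x_H+x_P}.$$
Considering (A.11) and (C.4), the above equations are solved to obtain
$$T_{H\to I}^{B}=4m+2\sqrt{2m-1}-1-\frac{2\sqrt{2m-1}}{m}, \tag{D.11}$$
$$T_{I\to I}^{B}=4m, \tag{D.12}$$
$$T_{P\to I}^{B}=2m+1, \tag{D.13}$$
$$T_{H\to P}^{B}=\frac{4m^2\sqrt{2m-1}+2m^3+4m^2-2m\sqrt{2m-1}-6m+2}{m\sqrt{2m-1}} \tag{D.14}$$
and
$$T_{I\to P}^{B}=\frac{2m^2}{\sqrt{2m-1}}+2m-1. \tag{D.15}$$
Substituting (C.4), (C.5), and (D.11)–(D.15) into (D.5) and (D.1) yields (41).

Case III: MERW. For MERW on $R_m^4$, we can also build relations among the relevant hitting times:
$$T_{H\to I}^{M}=\frac{m-1}{m}\left(1+T_{I\to H}^{M}+T_{H\to I}^{M}\right)+\frac{1}{2m}\left(1+T_{I\to I}^{M}\right)+\frac{1}{2m},$$
$$T_{I\to I}^{M}=\frac{\mu_H}{\mu_H+\mu_P}\left(1+T_{H\to I}^{M}\right)+\frac{\mu_P}{\mu_H+\mu_P}\left(1+T_{P\to I}^{M}\right),$$
$$T_{P\to I}^{M}=\frac{1}{2}\left(1+T_{I\to I}^{M}\right)+\frac{1}{2},$$
$$T_{H\to P}^{M}=\frac{m-1}{m}\left(1+T_{I\to H}^{M}+T_{H\to P}^{M}\right)+\frac{1}{m}\left(1+T_{I\to P}^{M}\right)$$
and
$$T_{I\to P}^{M}=\frac{\mu_H}{\mu_H+\mu_P}\left(1+T_{H\to P}^{M}\right)+\frac{\mu_P}{\mu_H+\mu_P}.$$
Making use of (B.3) and (C.7), the above equations are solved to give
$$T_{H\to I}^{M}=4m+1-\frac{2}{m},\quad T_{I\to I}^{M}=4m,\quad T_{P\to I}^{M}=2m+1,\quad T_{H\to P}^{M}=\frac{2(m+1)}{m}\left(m^2+m-1\right)\quad\text{and}\quad T_{I\to P}^{M}=2m(m+1)-1.$$
Plugging the above-obtained results into (D.5) and (D.1) results in (41). This completes the proof of the theorem.

Author notes
Handling editor: Prudence Wong
© The British Computer Society 2018. All rights reserved. For permissions, please email: journals.permissions@oup.com
This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices)
The Computer Journal, Oxford University Press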

Publisher: Oxford University Press
Copyright: © The British Computer Society 2018. All rights reserved. For permissions, please email: journals.permissions@oup.com
ISSN: 0010-4620
eISSN: 1460-2067
DOI: 10.1093/comjnl/bxy028

Abstract Random walks are a fundamental tool for analyzing realistic complex networked systems and implementing randomized algorithms to solve diverse problems such as searching and sampling. For many real applications, their actual effect and convenience depend on the properties (e.g. stationary distribution and hitting time) of random walks, with biased random walks often outperforming traditional unbiased random walks (TURW). In this paper, we present a new class of biased random walks, non-backtracking centrality based random walks (NBCRW) on a network, where the walker prefers to jump to neighbors with high non-backtracking centrality that has some advantages over eigenvector centrality. We study some properties of the non-backtracking matrix of a network, on the basis of which we propose a theoretical framework for fast computation of the transition probabilities, stationary distribution and hitting times for NBCRW on the network. Within the paradigm, we study NBCRW on some model and real networks and compare the results with those corresponding to TURW and maximal entropy random walks (MERW), with the latter being biased random walks based on eigenvector centrality. We show that the behaviors of stationary distribution and hitting times for NBCRW widely differ from those associated with TURW and MERW, especially for heterogeneous networks. 1. INTRODUCTION As a fundamental and powerful tool, random walks have found a wide range of applications in computer science and engineering. For example, in the area of communication and information networks, random walks can not only model and describe information delivery [1] and data gathering [2, 3], but also quantify and predict the throughput [4, 5], latency performance [1], transition [6] and search costs [7, 8]. 
Other related applications of random walks in computer science include community detection [9], recommendation systems [10], computer vision [11], image segmentation [12] and network sampling [13, 14], to name a few. The statistical properties of random walks play an important role in their applications, since they not only characterize the behavior of the random walks themselves, but also capture the performance metrics of different applications. For example, the stationary probability of a node can measure its importance [15] in a network, as well as the visual saliency at a location [16], while hitting time can serve as a gauge of search performance [7]. Thus, the properties of random walks strongly influence, and to a large extent determine, the effects of their applications.

Among various random walks, the traditional unbiased random walk (TURW) is probably the simplest one, where the transition probability from the current location to any neighbor at the next time step is uniform. Nevertheless, a vast majority of real-life networks are heterogeneous [17], implying that the importance or role of different nodes is also distinct. Thus, random walks in realistic heterogeneous networks should be biased [18, 19], with the transition probability to an important neighbor higher than that to an ordinary neighbor. Many works show that, in comparison with TURW, biased random walks are superior in some concrete applications, e.g. network search [18, 20] and sampling [21].

A typical biased random walk is the maximal entropy random walk (MERW) [22], which has received considerable attention [23–26]. The entropy of random walks quantifies the randomness of trajectories and can measure the mobility of a random walker [27]. MERW displays some remarkable properties different from those of TURW, e.g. small relaxation time [28] and localization of the stationary distribution [23].
In the past years, MERW has been applied in several areas, such as link prediction [29], visual saliency [16] and digital image forensics [30], where it produced more desirable effects. MERW is in fact a biased random walk with transitions biased towards neighboring nodes with high eigenvector centrality [31], i.e. the principal eigenvector of the adjacency matrix. However, a recent study [32] pointed out that the standard eigenvector centrality undergoes a localization transition in heterogeneous networks, which leads to most of the weight concentrating around the hub node and its vicinity. Thus, as a common measure of node importance, the standard eigenvector centrality fails to discriminate between nodes with small weight. As a remedy, an alternative centrality measure, non-backtracking centrality, was proposed [32], which preserves the advantages of the standard centrality but avoids its deficiency. This new centrality measure is based on the non-backtracking matrix [33, 34], which has been successfully applied to many problems, such as community detection [34], percolation [35, 36], epidemic spreading [37] and identifying influential nodes [38].

Since the node properties, based on which the walker prefers to jump towards different nodes, play a central role in determining the behavior of biased random walks, an interesting question arises naturally: How does a random walk behave if non-backtracking centrality is incorporated into its transition probabilities? In this paper, we design a new biased random walk, the Non-Backtracking Centrality based Random Walk (NBCRW), with transition probabilities dependent on the non-backtracking centrality. We present a framework for quickly computing the transition probabilities, stationary distribution and hitting times of NBCRW, and provide analytical expressions for the stationary distribution and hitting times. Within this framework, we study NBCRW on some synthetic and real networks, and compare the results with those for TURW and MERW.
We show that the behaviors of NBCRW differ greatly from those of TURW and MERW, in particular for heterogeneous networks. The main contributions of this paper are summarized as follows:
(i) We propose a novel type of biased random walk, the non-backtracking centrality based random walk (NBCRW), in which the transition probability to a neighbor is proportional to the neighbor's non-backtracking centrality.
(ii) We develop a theoretical framework for efficiently computing the transition probabilities of NBCRW as well as its properties, including the stationary distribution and hitting times.
(iii) We derive an analytical expression for the stationary distribution of NBCRW in terms of the leading eigenvalue of the non-backtracking matrix and the non-backtracking centrality. We also determine hitting times for NBCRW, including the hitting time from an arbitrary node to another one, the partial mean hitting time to a given target, and the global mean hitting time to a uniformly selected node.
(iv) Within the established general framework, we study NBCRW analytically or numerically in model and realistic networks, and compare the results with those corresponding to TURW and MERW. We show that its stationary distribution and hitting times behave differently from those of TURW and MERW.

The remainder of this paper is organized as follows. Section 2 presents a brief introduction to networks and an overview of TURW and MERW on networks. Section 3 is devoted to the formulation of NBCRW. Section 4 gives the experimental results and a comparison between NBCRW, TURW and MERW in model and real-life networks. Section 5 reports the exact analytical results for stationary distributions and hitting times of NBCRW, TURW and MERW in a class of rose graphs. Section 6 concludes the paper.

2. PRELIMINARIES
In this section, we introduce some useful concepts for graphs and discrete-time random walks on graphs.
2.1.
Concepts for graphs and random walks
Let $G(V,E)$ be a finite connected undirected network (graph) with $N$ nodes and $E$ edges, node set $V=\{1,2,\ldots,N\}$ and edge set $E=\{(i,j)\mid i,j\in V\}$. The connectivity of nodes is described by the adjacency matrix $A=(a_{ij})_{N\times N}$, in which $a_{ij}=1$ if $(i,j)\in E$, and $a_{ij}=0$ otherwise. Let $N_i$ denote the set of neighbors of node $i$. The degree of node $i$ is $d_i=|N_i|=\sum_{j=1}^{N}a_{ij}$, which is the $i$th diagonal entry of the diagonal degree matrix $D={\rm diag}(d_1,d_2,\ldots,d_N)$. The Laplacian matrix of $G$ is defined to be $L=D-A$.

For a graph $G$, we can define a discrete-time nearest-neighbor random walk taking place on it. Any random walk on a network $G$ is in fact a Markov chain characterized by a unique stochastic matrix $P=(p_{ij})_{N\times N}$, also called the transition probability matrix, with entry $p_{ij}$ describing the transition probability from node $i$ to a neighboring node $j$.

Definition 2.1. For an irreducible random walk on graph $G$, the stationary distribution $\pi=(\pi_1,\pi_2,\ldots,\pi_N)$ is an $N$-dimensional vector satisfying $\pi P=\pi$ and $\sum_{i=1}^{N}\pi_i=1$.

The stationary probabilities can be employed to rank nodes in a network [15]. Another fundamental quantity relevant to random walks is the hitting time [39].

Definition 2.2. For a random walk on graph $G$, the hitting time from node $i$ to node $j$ ($j\neq i$), denoted by $T_{ij}$, is the expected number of jumps required for a walker starting from the source node $i$ to arrive at the target node $j$ for the first time.

The hitting time is a significant indicator of the transition or search cost in a network [7]. Based on the hitting time, we can further define other quantities for random walks, such as the partial mean hitting time and the global mean hitting time.

Definition 2.3. For a random walk on graph $G$, the partial mean hitting time to node $j$, denoted by $T_j$, is the average of the hitting times $T_{ij}$ over all source nodes in the network:
$$T_j=\frac{1}{N-1}\sum_{i=1}^{N}T_{ij}.$$
(1)
The partial mean hitting time $T_j$ is actually the mean absorbing time of an absorbing Markov chain with $j$ being the absorbing state, reflecting the absorbing efficiency of node $j$ [40, 41]. It was recently utilized to measure the importance of node $j$, and is thus called Markov centrality [42].

Definition 2.4. For a random walk on graph $G$, the global mean hitting time, denoted by $\langle T\rangle$, is the average of the hitting times $T_{ij}$ over all $N(N-1)$ pairs of nodes, equivalent to the mean hitting time to a uniformly distributed target node:
$$\langle T\rangle=\frac{1}{N(N-1)}\sum_{i=1}^{N}\sum_{j\neq i}T_{ij}=\frac{1}{N}\sum_{j=1}^{N}T_j. \tag{2}$$

The global mean hitting time can be applied to gauge the search efficiency of a network [43]. Given a network, we can define different random walks. Below we introduce only two much-studied random walks: the traditional unbiased random walk (TURW) and the maximal entropy random walk (MERW).

2.2. Traditional unbiased random walk
For TURW on graph $G$, the transition probability from a node $i$ to each of its neighboring nodes $j$ is identical, namely
$$p_{ij}=\frac{a_{ij}}{d_i}. \tag{3}$$
Thus, the transition probability matrix is $P=D^{-1}A$, and the stationary distribution is [44, 45]
$$\pi^{T}=\left(\pi_1^{T},\pi_2^{T},\ldots,\pi_N^{T}\right)=\left(\frac{d_1}{2E},\frac{d_2}{2E},\ldots,\frac{d_N}{2E}\right), \tag{4}$$
which implies that all nodes with the same degree have identical occupation probability in the stationary state.

The hitting time for TURW on $G$ can be expressed in terms of the spectrum of its Laplacian matrix $L$. Let $0=\sigma_1<\sigma_2\leq\cdots\leq\sigma_N$ be the $N$ eigenvalues of $L$, and let $\mu_1,\mu_2,\ldots,\mu_N$ be the corresponding normalized, mutually orthogonal eigenvectors, where $\mu_i=(\mu_{i1},\mu_{i2},\ldots,\mu_{iN})^{\top}$ for each $i=1,2,\ldots,N$. Then, the hitting time $T_{ij}$, the partial mean hitting time and the global mean hitting time can be represented by [40]
$$T_{ij}=\sum_{z=1}^{N}d_z\sum_{k=2}^{N}\frac{1}{\sigma_k}\left(\mu_{ki}\mu_{kz}-\mu_{ki}\mu_{kj}-\mu_{kj}\mu_{kz}+\mu_{kj}^{2}\right), \tag{5}$$
$$T_j=\frac{N}{N-1}\sum_{k=2}^{N}\frac{1}{\sigma_k}\left(2E\,\mu_{kj}^{2}-\mu_{kj}\sum_{z=1}^{N}d_z\mu_{kz}\right) \tag{6}$$
and
$$\langle T\rangle=\frac{2E}{N-1}\sum_{k=2}^{N}\frac{1}{\sigma_k}, \tag{7}$$
respectively.
2.3.
Maximal entropy random walk
Different from TURW, MERW on graph $G$ is a biased random walk, whose transition probability is defined in terms of the leading eigenvalue and eigenvector of the adjacency matrix $A$. Let $\lambda_1>\lambda_2\geq\cdots\geq\lambda_N$ be the $N$ real eigenvalues of $A$, and $\psi_1,\psi_2,\ldots,\psi_N$ the corresponding mutually orthogonal unit eigenvectors, where $\psi_i=(\psi_{i1},\psi_{i2},\ldots,\psi_{iN})^{\top}$ for each $i=1,2,\ldots,N$. Then, the transition probability $p_{ij}$ from node $i$ to node $j$ in MERW is defined by [22, 23]
$$p_{ij}=\frac{a_{ij}}{\lambda_1}\frac{\psi_{1j}}{\psi_{1i}}. \tag{8}$$
Note that the principal eigenvector $\psi_1$ is in fact the frequently used eigenvector centrality [31], with the entry $\psi_{1i}$ defining a centrality score for node $i$. In this sense, MERW can be considered as a biased random walk based on eigenvector centrality. Equation (8) guarantees that MERW maximizes the entropy of the set of trajectories with a given length and end-nodes, leading to the maximal entropy rate of such a process [23].

The stationary distribution of MERW is
$$\pi^{M}=\left(\pi_1^{M},\pi_2^{M},\ldots,\pi_N^{M}\right)=\left(\psi_{11}^{2},\psi_{12}^{2},\ldots,\psi_{1N}^{2}\right). \tag{9}$$
In some networks, especially heterogeneous ones, the eigenvector centrality $\psi_1$ exhibits a localization phenomenon [32], with the weight of centrality concentrating around one or a few high-degree nodes. Since the stationary probabilities in (9) are the squares of the centrality scores, the stationary distribution for MERW in these networks displays an even more evident localization: the abrupt focusing of occupation probabilities on just a few large-degree nodes and their neighbors.

Interestingly, for MERW on graph $G$, the hitting time $T_{ij}$, the partial mean hitting time $T_j$ and the global mean hitting time $\langle T\rangle$ can be expressed in terms of the eigenvalues and eigenvectors of the adjacency matrix $A$ [26]:
$$T_{ij}=\frac{1}{\psi_{1j}^{2}}\sum_{k=2}^{N}\frac{\lambda_1}{\lambda_1-\lambda_k}\left(\psi_{kj}^{2}-\psi_{ki}\psi_{kj}\frac{\psi_{1j}}{\psi_{1i}}\right), \tag{10}$$
$$T_j=\frac{1}{\psi_{1j}^{2}(N-1)}\sum_{k=2}^{N}\frac{\lambda_1}{\lambda_1-\lambda_k}\left(N\psi_{kj}^{2}-\psi_{kj}\psi_{1j}\sum_{i=1}^{N}\frac{\psi_{ki}}{\psi_{1i}}\right), \tag{11}$$
$$\langle T\rangle=\frac{1}{N(N-1)}\sum_{j=1}^{N}\frac{1}{\psi_{1j}^{2}}\sum_{k=2}^{N}\frac{\lambda_1}{\lambda_1-\lambda_k}\left(N\psi_{kj}^{2}-\psi_{kj}\psi_{1j}\sum_{i=1}^{N}\frac{\psi_{ki}}{\psi_{1i}}\right). \tag{12}$$
3.
FORMULATION OF NON-BACKTRACKING CENTRALITY BASED RANDOM WALK
The behavior of a biased random walk on a graph $G$ depends on the node property on which its transition probability is based. As shown in a recent paper [32], the eigenvector centrality has some flaws, e.g. the localization transition, which results in an obvious heterogeneity in the stationary distribution of MERW. Since non-backtracking centrality can avoid this deficiency of eigenvector centrality [32], in this section we propose, as a remedy for MERW, a new biased random walk based on non-backtracking centrality. To begin with, we introduce the non-backtracking centrality and study some of its properties.

3.1. Non-backtracking centrality
The non-backtracking centrality [32] is defined and calculated through the Hashimoto or non-backtracking matrix [33, 34], denoted by $B$, which is a $2E\times 2E$ matrix. Any undirected network $G$ can be transformed into a directed graph by replacing each undirected edge $(i,j)$ with two directed ones, $i\to j$ and $j\to i$. The non-backtracking matrix $B$ of $G$ describes the relation between the $2E$ directed edges; its element $B_{i\to j,k\to l}$ is defined as
$$B_{i\to j,k\to l}=\begin{cases}1, & j=k \text{ and } i\neq l,\\ 0, & \text{otherwise}.\end{cases} \tag{13}$$
Since all entries of the non-backtracking matrix $B$ are non-negative real numbers, by the Perron–Frobenius theorem [46], its leading eigenvalue is real and non-negative, and there exists a corresponding leading eigenvector whose elements are also non-negative real numbers. Let $\kappa$ be the leading eigenvalue of $B$, and let $v_{i\to j}$ be the element of the leading eigenvector corresponding to the directed edge $i\to j$. Then, $v_{i\to j}$ represents the centrality of node $j$ neglecting any contribution from node $i$. From the leading eigenvector of $B$, one can define two centrality measures for each node [34], the outgoing centrality and the incoming centrality, by considering the outgoing and incoming edges of the node.
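To make definition (13) concrete, the following minimal sketch builds $B$ from an adjacency matrix and extracts the leading eigenpair $(\kappa, v)$. It is an illustration only: it assumes NumPy and a dense eigen-solver, and the function names `non_backtracking_matrix` and `leading_eigenpair` are hypothetical, not taken from the paper.

```python
import numpy as np

def non_backtracking_matrix(A):
    """Build the 2E x 2E matrix B of (13) from a symmetric 0/1 adjacency matrix."""
    A = np.asarray(A)
    n = A.shape[0]
    # Replace every undirected edge by the two directed edges i->j and j->i.
    edges = [(i, j) for i in range(n) for j in range(n) if A[i, j]]
    index = {edge: t for t, edge in enumerate(edges)}
    B = np.zeros((len(edges), len(edges)))
    for (i, j), row in index.items():
        for l in range(n):
            # B[i->j, k->l] = 1 exactly when k = j and l != i.
            if A[j, l] and l != i:
                B[row, index[(j, l)]] = 1.0
    return B, edges

def leading_eigenpair(B):
    """Leading eigenvalue kappa and its eigenvector, scaled so the entries sum to 1."""
    vals, vecs = np.linalg.eig(B)
    t = np.argmax(vals.real)          # the Perron root has the largest real part
    v = vecs[:, t].real
    return vals[t].real, v / v.sum()  # dividing by the sum also fixes the sign

# Example: the complete graph on 4 nodes, where every node has degree 3.
K4 = np.ones((4, 4)) - np.eye(4)
B, edges = non_backtracking_matrix(K4)
kappa, v = leading_eigenpair(B)
print(B.shape, round(kappa, 6))
```

For a $d$-regular graph every row of $B$ sums to $d-1$, so the example should report a $12\times 12$ matrix with $\kappa=2$ and a uniform leading eigenvector. The dense construction is only practical for small graphs, which is the motivation for the reduced matrix introduced later in this subsection.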
Definition 3.1. For a node $i$ in network $G$, its outgoing centrality is
$$x_i=\sum_{j\in N_i}v_{i\to j}, \tag{14}$$
and its incoming centrality is
$$y_i=\sum_{j\in N_i}v_{j\to i}. \tag{15}$$

Note that the outgoing centrality $x_i$ is actually the non-backtracking centrality [32].

Lemma 3.1. For a node $i$ in network $G$, its outgoing and incoming centralities obey
$$\kappa y_i=(d_i-1)x_i. \tag{16}$$

Proof. By the definition of eigenvalues and eigenvectors for matrix $B$, we can establish the equation
$$\kappa v_{i\to j}=\sum_{k\in N_j,\,k\neq i}v_{j\to k}. \tag{17}$$
Using (14) and (17), we rephrase (15) as
$$y_i=\frac{1}{\kappa}\sum_{j\in N_i}\sum_{k\in N_i,\,k\neq j}v_{i\to k}=\frac{1}{\kappa}\left(\sum_{j\in N_i}\sum_{k\in N_i}v_{i\to k}-\sum_{j\in N_i}v_{i\to j}\right)=\frac{1}{\kappa}\left(\sum_{j\in N_i}x_i-x_i\right)=\frac{1}{\kappa}\left(d_ix_i-x_i\right), \tag{18}$$
which is equivalent to (16). □

If graph $G$ is a tree, the leading eigenvalue $\kappa$ of its non-backtracking matrix $B$ is zero. However, when $G$ is not a tree, the leading eigenvalue $\kappa$ of $B$ is positive, and the components of the leading eigenvector can be chosen to be all non-negative. In what follows, we only consider the case where $G$ is not a tree.

For a network $G$, computing its non-backtracking centrality involves computing the leading eigenvector of its non-backtracking matrix $B$ of order $2E\times 2E$. If we compute the leading eigenvector directly from the definition, the time and space costs are very high. Fortunately, in practice we can substantially reduce this cost by a faster computation of $\kappa$ and the non-backtracking centralities $x_i$ that utilizes the Ihara determinant [33, 47, 48].

Lemma 3.2. For a network $G$, the leading eigenvalue $\kappa$ of its non-backtracking matrix $B$ is equal to the leading eigenvalue of the $2N\times 2N$ matrix
$$M=\begin{pmatrix}A & I-D\\ I & 0\end{pmatrix}, \tag{19}$$
where $I$ is the $N\times N$ identity matrix. In addition, $x_1,x_2,\ldots,x_N$ correspond to the first $N$ elements of the leading eigenvector of matrix $M$.

Proof. Combining (17) and (14), and then using (16), the non-backtracking centrality $x_i$ can be rewritten as
$$x_i=\sum_{j\in N_i}\frac{1}{\kappa}\sum_{k\in N_j,\,k\neq i}v_{j\to k}=\frac{1}{\kappa}\left(\sum_{j\in N_i}\sum_{k\in N_j}v_{j\to k}-\sum_{j\in N_i}v_{j\to i}\right)=\frac{1}{\kappa}\left(\sum_{j\in N_i}x_j-y_i\right)=\frac{1}{\kappa}\left(\sum_{j=1}^{N}a_{ij}x_j-\frac{d_i-1}{\kappa}x_i\right). \tag{20}$$
Recasting (20) in matrix notation, one obtains
$$\left(A-\frac{1}{\kappa}D+\frac{1}{\kappa}I\right)x=\kappa x, \tag{21}$$
where $x=(x_1,x_2,\ldots,x_N)^{\top}$ is the vector composed of the non-backtracking centralities of the $N$ nodes in $G$.
Equation (21) shows that matrices $B$ and $M$ have the same set of real eigenvalues. Furthermore,
$$Mz=\kappa z, \tag{22}$$
where $z=\begin{pmatrix}x\\ \frac{1}{\kappa}x\end{pmatrix}$, in which $x$ constitutes the first $N$ elements of $z$ and $\frac{1}{\kappa}x$ constitutes the last $N$ elements. Thus, the leading eigenvalues of matrices $B$ and $M$ are equal to each other, and the first $N$ elements of $z$ correspond to the non-backtracking centralities $x_1,x_2,\ldots,x_N$. □

Lemma 3.2 indicates that the computation of the leading eigenvalue $\kappa$ of the non-backtracking matrix $B$ of order $2E$ and of the non-backtracking centralities $x$ can be reduced to calculating the leading eigenvalue and eigenvector of matrix $M$ of order $2N$, which is smaller than the order $2E$ of matrix $B$, especially for dense networks. Thus, we can compute $\kappa$ and $x_i$ very rapidly by evaluating the leading eigenvalue and eigenvector of matrix $M$.

3.2. Definition of transition matrix
Depending on which node property the walk is biased towards, we can define different biased random walks. Here we propose a novel random walk, the non-backtracking centrality based random walk (NBCRW), which is biased towards nodes with high non-backtracking centrality.

Definition 3.2. For NBCRW on network $G$, the element at row $i$ and column $j$ of the transition matrix $P$ is
$$p_{ij}=\frac{a_{ij}x_j}{\sum_{k=1}^{N}a_{ik}x_k}. \tag{23}$$

In other words, the transition probability for NBCRW from node $i$ to its neighbor $j$ is proportional to the non-backtracking centrality of $j$.

In order to investigate the properties of NBCRW on network $G$, we propose an approach to construct a weighted network $W$ from the original network $G$. The weight of each edge in $W$ is related to the non-backtracking centralities of the two end nodes of that edge in $G$. We will show that NBCRW on network $G$ is equivalent to the ordinary random walk [49] on the corresponding weighted network $W$, with both random walks having the same properties, such as transition probabilities, stationary distribution and hitting times.
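The reduction in Lemma 3.2 and the transition rule (23) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes NumPy with a dense eigen-solver (a sparse solver would be the natural choice for large networks), a connected graph that is neither a tree nor a single cycle, and hypothetical function names `nb_centrality` and `nbcrw_transition_matrix`. The five-node "bowtie" example graph is also an assumption made for illustration.

```python
import numpy as np

def nb_centrality(A):
    """Leading eigenvalue kappa and centralities x from the 2N x 2N matrix M of (19)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    I = np.eye(n)
    D = np.diag(A.sum(axis=1))
    M = np.block([[A, I - D], [I, np.zeros((n, n))]])
    vals, vecs = np.linalg.eig(M)
    t = np.argmax(vals.real)             # kappa is the eigenvalue with the largest real part
    x = vecs[:n, t].real                 # first N entries of the leading eigenvector
    return vals[t].real, x / x.sum()     # normalising by the sum also fixes the sign

def nbcrw_transition_matrix(A, x):
    """Transition matrix of (23): p_ij = a_ij x_j / sum_k a_ik x_k."""
    A = np.asarray(A, dtype=float)
    W = A * x[None, :]                   # entry (i, j) is a_ij * x_j
    return W / W.sum(axis=1, keepdims=True)

# Example: two triangles sharing node 0 (a small heterogeneous graph).
bowtie = np.zeros((5, 5))
for u, v in [(0, 1), (0, 2), (1, 2), (0, 3), (0, 4), (3, 4)]:
    bowtie[u, v] = bowtie[v, u] = 1.0

kappa, x = nb_centrality(bowtie)
P = nbcrw_transition_matrix(bowtie, x)
print(kappa)
print(x)
print(P[1])   # from node 1, the walker prefers the shared node 0 over node 2
```

Only the ratios of the entries of `x` matter in (23), so the normalisation chosen for the eigenvector does not affect `P`.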
Definition 3.3. For an unweighted network $G(V,E)$ with adjacency matrix $A$, let $X$ be the diagonal matrix whose $i$th diagonal entry equals the non-backtracking centrality $x_i$ of node $i$. The corresponding weighted network is defined as $W(V,E)$, with the weight between nodes $i$ and $j$ given by $w_{ij}=a_{ij}x_ix_j$.

Let $W=(w_{ij})_{N\times N}$ stand for the adjacency matrix of the weighted network $W$. Different from the adjacency matrix $A$ of the binary network $G$, the elements of $W$ are not simply 0 or 1, but the weights of all pairs of nodes. By definition, we have $W=XAX$. In the weighted network $W$, the strength [50] of a node $i$ is $s_i=\sum_{k=1}^{N}w_{ik}=x_i\sum_{k=1}^{N}a_{ik}x_k$, and the total strength of the whole network $W$ is $s=\sum_{i=1}^{N}\sum_{j=1}^{N}w_{ij}$. Then, the diagonal strength matrix of $W$ is defined as $S={\rm diag}(s_1,s_2,\ldots,s_N)$, and the Laplacian matrix of $W$ is defined by $L=S-W$.

Theorem 3.1. The transition matrix of NBCRW in an arbitrary connected network $G$ is identical to the transition matrix of the ordinary random walk in the corresponding weighted network $W$.

Proof. For the ordinary random walk in the weighted network $W$, the transition probability from node $i$ to node $j$ is
$$p_{ij}=\frac{w_{ij}}{s_i}=\frac{a_{ij}x_ix_j}{\sum_{k=1}^{N}a_{ik}x_ix_k}=\frac{a_{ij}x_j}{\sum_{k=1}^{N}a_{ik}x_k},$$
which completely agrees with (23). Therefore, the transition matrix for NBCRW on $G$ is the same as that of the ordinary random walk on $W$. □

Since both networks $G$ and $W$ have the same topological structure and transition matrix, NBCRW on $G$ and the ordinary random walk on $W$ also have identical behaviors. In the sequel, we study the properties of NBCRW on $G$ directly or indirectly by considering those of the ordinary random walk on $W$.

We note that our proposed NBCRW is different from the non-backtracking random walk [51, 52], which is a random process during which the walker never goes back along the edge it has just traversed.
For a general graph $G$, the non-backtracking random walk is not a Markov chain on its vertex set, although it can be regarded as a Markov chain on the set of its directed edges [53], whose adjacency relations are encoded in the non-backtracking matrix $B$. In contrast, NBCRW on the vertex set of $G$ is a biased Markov chain based on non-backtracking centrality. A main goal of this paper is to unveil the impact of biases, especially non-backtracking centrality, on the behavior of biased random walks.

3.3. Stationary distribution
First, we address the stationary distribution for NBCRW on $G$.

Theorem 3.2. The stationary distribution for NBCRW on network $G$ is $\pi^{B}=(\pi_1^{B},\pi_2^{B},\ldots,\pi_N^{B})$, where
$$\pi_i^{B}=\left(\frac{\kappa^2-1}{\kappa}+\frac{d_i}{\kappa}\right)\frac{x_i^{2}}{Q}, \tag{24}$$
with
$$Q=\sum_{i=1}^{N}\left(\frac{\kappa^2-1}{\kappa}+\frac{d_i}{\kappa}\right)x_i^{2} \tag{25}$$
being the normalization factor that guarantees $\sum_{i=1}^{N}\pi_i^{B}=1$.

Proof. First, we prove that $\pi^{B}$ fulfills the detailed balance condition $\pi_i^{B}p_{ij}=\pi_j^{B}p_{ji}$ for different $i$ and $j$. To this end, we need to compute the related quantity $\sum_{k=1}^{N}a_{ik}x_k$. From (21), we have
$$Ax=\frac{1}{\kappa}Dx+\frac{\kappa^2-1}{\kappa}x,$$
which means
$$\sum_{k=1}^{N}a_{ik}x_k=\left(\frac{d_i}{\kappa}+\frac{\kappa^2-1}{\kappa}\right)x_i.$$
Thus, we have
$$\pi_i^{B}p_{ij}=\left(\frac{\kappa^2-1}{\kappa}+\frac{d_i}{\kappa}\right)\frac{x_i^{2}}{Q}\cdot\frac{a_{ij}x_j}{\sum_{k=1}^{N}a_{ik}x_k}=\frac{a_{ij}x_ix_j}{Q}.$$
Similarly, we get $\pi_j^{B}p_{ji}=a_{ij}x_ix_j/Q$. Hence, the detailed balance condition
$$\pi_i^{B}p_{ij}=\pi_j^{B}p_{ji} \tag{26}$$
is satisfied for all pairs of $i$ and $j$. According to (26), we have
$$\sum_{i=1}^{N}\pi_i^{B}p_{ij}=\sum_{i=1}^{N}\pi_j^{B}p_{ji}=\pi_j^{B}. \tag{27}$$
In other words, $\pi^{B}P=\pi^{B}$, showing that $\pi^{B}$ is the stationary distribution for NBCRW on $G$. □

3.4. Hitting times
Let $\theta_1,\theta_2,\ldots,\theta_N$ be the $N$ eigenvalues of the Laplacian matrix $L$ of the weighted network $W$, arranged as $0=\theta_1<\theta_2\leq\cdots\leq\theta_N$, and let $\phi_1,\phi_2,\ldots,\phi_N$ be the corresponding mutually orthogonal eigenvectors of unit length, where $\phi_i=(\phi_{i1},\phi_{i2},\ldots,\phi_{iN})^{\top}$. Then, the hitting times for NBCRW on $G$ can be expressed in terms of the eigenvalues and eigenvectors of the Laplacian matrix of network $W$.
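As a numerical companion to Theorem 3.2, the sketch below evaluates the closed form (24)–(25) and checks both stationarity and detailed balance (26) on a small example. It is a hedged illustration rather than the paper's code: NumPy, the dense eigen-solver, the helper names and the five-node example graph are all assumptions introduced here.

```python
import numpy as np

def nb_centrality(A):
    """Leading eigenvalue kappa and centralities x via the 2N x 2N matrix M of (19)."""
    n = A.shape[0]
    I = np.eye(n)
    M = np.block([[A, I - np.diag(A.sum(axis=1))], [I, np.zeros((n, n))]])
    vals, vecs = np.linalg.eig(M)
    t = np.argmax(vals.real)
    x = vecs[:n, t].real
    return vals[t].real, x / x.sum()

def nbcrw_stationary(A, kappa, x):
    """Stationary distribution from the closed form (24), normalised by Q of (25)."""
    d = A.sum(axis=1)
    unnormalised = ((kappa**2 - 1) / kappa + d / kappa) * x**2
    return unnormalised / unnormalised.sum()

def nbcrw_transition_matrix(A, x):
    """Transition matrix of (23)."""
    W = A * x[None, :]
    return W / W.sum(axis=1, keepdims=True)

# Example graph (hypothetical): two triangles sharing node 0.
A = np.zeros((5, 5))
for u, v in [(0, 1), (0, 2), (1, 2), (0, 3), (0, 4), (3, 4)]:
    A[u, v] = A[v, u] = 1.0

kappa, x = nb_centrality(A)
pi = nbcrw_stationary(A, kappa, x)
P = nbcrw_transition_matrix(A, x)
flow = pi[:, None] * P                       # entry (i, j) is pi_i * p_ij
print(np.allclose(pi @ P, pi))               # stationarity: pi P = pi
print(np.allclose(flow, flow.T))             # detailed balance (26)
```

Because `pi` is proportional to the node strengths $s_i=x_i\sum_k a_{ik}x_k$ of the weighted network $W$, the same check also illustrates the equivalence stated in Theorem 3.1.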
Theorem 3.3. For the non-backtracking centrality based random walk on network $G$, the hitting time from a node $i$ to another node $j$ is
$$T_{ij}=\sum_{z=1}^{N}s_z\sum_{k=2}^{N}\frac{1}{\theta_k}\left(\phi_{kj}^{2}-\phi_{ki}\phi_{kj}-\phi_{kj}\phi_{kz}+\phi_{ki}\phi_{kz}\right), \tag{28}$$
the partial mean hitting time to an arbitrary destination node $j$ is
$$T_j=\frac{N}{N-1}\sum_{k=2}^{N}\frac{1}{\theta_k}\left(s\,\phi_{kj}^{2}-\phi_{kj}\sum_{z=1}^{N}s_z\phi_{kz}\right), \tag{29}$$
and the global mean hitting time for the whole network $G$ is
$$\langle T\rangle=\frac{s}{N-1}\sum_{k=2}^{N}\frac{1}{\theta_k}. \tag{30}$$

Proof. As mentioned earlier, NBCRW on network $G$ is equivalent to the ordinary random walk on its weighted counterpart $W$. According to our previous result [54], the theorem follows immediately. □

4. EXPERIMENTS AND RESULTS FOR MODEL AND REALISTIC NETWORKS
In this section, we study NBCRW on some classical model networks (e.g. the Erdös–Rényi (ER) network [55] and the Barabási–Albert (BA) network [56]) and real networks, and compare the results for the stationary distribution and hitting times of NBCRW with those corresponding to TURW and MERW.

4.1. Stationary distribution
Figure 1 shows the stationary distribution for TURW, NBCRW and MERW in an ER network with 1000 nodes. We can see that, for all three random walks, the stationary probability of a node approximately increases with the degree of the node: for two nodes with different degrees, the stationary probability of the large-degree node is higher than that of the small-degree node. Moreover, the stationary probabilities of the three random walks are all distributed in a narrow range: the largest stationary probability is less than twice the smallest one. Thus, there is little difference among the stationary probabilities of the three random walks. In particular, the stationary distributions of NBCRW and MERW are almost identical to each other. The main reason for this phenomenon is that the ER network is homogeneous, with different nodes exhibiting similar structural and dynamical properties.

Figure 1. Stationary distribution in an ER network with 1000 nodes, where each pair of nodes is connected with probability $p=0.5$.
The results for TURW, MERW and NBCRW are obtained by (4), (9) and (24), respectively. The nodes are labeled from 1 to 1000 in decreasing order of degree.

Figure 2 exhibits the behavior of the stationary distribution for TURW, NBCRW and MERW on a BA network with 1000 nodes and average degree 4. We can see that the stationary distributions are heterogeneous for all three random walks. According to (4), the stationary distribution of TURW is similar to the degree distribution. Figure 2a shows that the stationary probability of TURW lies in the interval $[0.0005,0.02]$. For NBCRW and MERW, the stationary probability lies in the intervals $[10^{-6},0.11]$ and $[10^{-7},0.16]$, respectively, so the heterogeneity is more pronounced than for TURW.

In addition to the extent of heterogeneity, the stationary distributions of the considered random walks show other obvious differences. For TURW, the stationary probability of a node is fully determined by its degree: any two nodes with identical degree have the same stationary probability. For NBCRW and MERW, two different nodes generally have different stationary probabilities, regardless of their degrees. Thus, the stationary probabilities of NBCRW and MERW can discriminate between nodes in BA networks, including those with identical degree.

Figure 2. Stationary distribution in a BA network with 1000 nodes and average degree 4. (a) Stationary distribution of TURW, calculated by (4). (b) Stationary distribution of NBCRW, calculated by (24). (c) Stationary distribution of MERW, calculated by (9). The 1000 nodes are labeled from 1 to 1000 according to stationary probability.
The insets are the stationary probabilities of the 200 nodes with the smallest stationary probability.

However, even for NBCRW and MERW in BA networks, the stationary probabilities differ greatly from each other. For the hub node 1, the stationary probability for MERW is greater than that for NBCRW, while for small-degree nodes, excluding the neighbors of the hub, the stationary probability of a node for MERW is much lower than that for NBCRW. The insets show that, among the 200 small-degree nodes with the lowest stationary probabilities, almost all have stationary probabilities below $10^{-5}$ for MERW, whereas over 150 of them have stationary probabilities larger than $10^{-5}$ for NBCRW.

In order to quantify the heterogeneity of the stationary distributions of NBCRW and MERW in BA networks, we compute the inverse participation ratio $S=\sum_{i=1}^{N}\pi_i^{2}$, which is a standard quantity characterizing the localization or inhomogeneity of an indicator [57]: the larger the value of $S$, the more heterogeneous the stationary distribution. In Table 1, we list the inverse participation ratio for NBCRW and MERW on some model and real networks. From Table 1, we can see that for all considered model and real networks, the heterogeneity of the stationary distribution of MERW is more pronounced than that of NBCRW.

Table 1. Inverse participation ratio of stationary distribution for NBCRW and MERW in a variety of networks.
Network           Size    NBCRW           MERW
BA-model          5000    2.346 × 10^-2   8.589 × 10^-2
WS-model [58]     1000    1.692 × 10^-3   2.510 × 10^-3
Dolphins [59]     53      4.938 × 10^-2   5.312 × 10^-2
ca-NetSci [60]    379     7.372 × 10^-2   7.938 × 10^-2
C.elegans [61]    448     3.369 × 10^-2   4.057 × 10^-2
E-mail [62]       1133    8.517 × 10^-3   9.557 × 10^-3
P2P [63]          6299    7.665 × 10^-3   7.945 × 10^-3

4.2. Hitting times
Analogous to the case of the stationary distribution, there is little difference between the hitting times of NBCRW, MERW and TURW for homogeneous networks, e.g. ER networks. Below we study hitting times on heterogeneous networks, focusing on two representative cases: the mean hitting time to the hub node, $T_H$, and the global mean hitting time $\langle T\rangle$. Figures 3 and 4 display, respectively, $T_H$ and $\langle T\rangle$ for the three random walks in BA networks with node number $N$ changing from 1000 to 10 000.

Figure 3. Mean hitting time to the hub node for different random walks in BA networks with average degree 4.
The calculations of $T_H$ for TURW, MERW and NBCRW are based on (6), (11) and (29), respectively.

From Fig. 3, we can see that when the hub is the target node, the mean absorbing time is the smallest for MERW, slightly smaller than that for NBCRW. In contrast, the mean absorbing time to the hub for TURW is significantly higher than those for MERW and NBCRW. For all three walks, the mean absorbing time to the hub is in inverse proportion to the corresponding stationary probability. In a previous work [26], we proved that in BA networks, the asymptotic scalings of the mean hitting time to the hub for MERW and TURW are $\ln N$ and $N^{1/2}$, respectively, both of which are consistent with Fig. 3.

As opposed to the sublinear scaling of the partial mean hitting time $T_H$ to the hub for TURW, NBCRW and MERW in BA networks, the global mean hitting time $\langle T\rangle$ behaves linearly for TURW and superlinearly for NBCRW and MERW, as indicated in Fig. 4. Although $\langle T\rangle\sim N^{\rho}$ with $\rho>1$ for both NBCRW and MERW, the power exponent $\rho$ of NBCRW is less than that of MERW. In addition, combining the results in Figs 3 and 4, we find that among the three random walks, $T_H$ is the largest and $\langle T\rangle$ is the lowest for TURW, with the latter achieving the minimal possible scaling for TURW on all connected networks [64]; $T_H$ is the smallest and $\langle T\rangle$ is the largest for MERW. For NBCRW, both $T_H$ and $\langle T\rangle$ lie between those associated with TURW and MERW. Thus, for TURW, NBCRW and MERW on a heterogeneous network, the mean absorbing time to a particular target is not representative of the whole network.

Figure 4. Global mean hitting times for TURW, NBCRW and MERW in BA networks with average degree 4. The inset provides results for TURW and NBCRW for comparison.
The calculations of $\langle T\rangle$ for TURW, MERW and NBCRW are based on (7), (12) and (30), respectively.

In addition to the BA networks, we also study the partial mean hitting time to the hub and the global mean hitting time for TURW, NBCRW and MERW in other synthetic and real networks. In Table 2, we provide related results for these three random walks, where superscripts $T$, $B$ and $M$ are used to represent the quantities corresponding to TURW, NBCRW and MERW, respectively. From Table 2 we observe that $T_H^{T}>T_H^{B}>T_H^{M}$ and $\langle T\rangle^{M}>\langle T\rangle^{B}>\langle T\rangle^{T}$ for all studied model and realistic networks.

Table 2. Mean hitting time to a hub node and global mean hitting time for TURW, NBCRW and MERW in a variety of networks.

| Network | Size | $T_H^{T}$ | $T_H^{B}$ | $T_H^{M}$ | $\langle T\rangle^{T}$ | $\langle T\rangle^{B}$ | $\langle T\rangle^{M}$ |
|---|---|---|---|---|---|---|---|
| BA-model | 5000 | 173.9 | 22.29 | 9.466 | 9837 | $3.005\times10^{5}$ | $7.960\times10^{6}$ |
| WS-model | 1000 | 557.4 | 246.7 | 147.0 | 1667 | 2440 | 3396 |
| Dolphins | 53 | 39.79 | 14.43 | 13.05 | 106.6 | 3175 | 6684 |
| ca-NetSci | 379 | 488.6 | 14.20 | 12.56 | 1892 | $2.980\times10^{12}$ | $2.767\times10^{13}$ |
| C.elegans | 448 | 19.82 | 7.929 | 6.472 | 1045 | $2.398\times10^{6}$ | $4.421\times10^{6}$ |
| E-mail | 1133 | 180.0 | 27.43 | 24.46 | 3713 | $7.633\times10^{6}$ | $1.118\times10^{7}$ |
| P2P | 6299 | 476.1 | 51.03 | 48.67 | $2.094\times10^{4}$ | $1.232\times10^{10}$ | $2.041\times10^{10}$ |
5. ANALYTICAL RESULTS FOR NBCRW ON ROSE GRAPHS

In the preceding section, we showed that in some model and real networks, the behaviors of NBCRW are strikingly different from those of TURW and MERW. Since many real-life networks are scale-free, analytically unveiling the impact of heterogeneous topology on random walks is important for better understanding their dynamical behaviors and applications. In this section, we study analytically and numerically NBCRW, TURW and MERW on a class of heterogeneous rose graphs [65]. For a particular rose graph, we obtain closed-form expressions for the stationary distribution and hitting times of these three random walks, and we obtain numerical results for general rose graphs; the results for the three walks differ widely from one another. These results show that topological heterogeneity affects NBCRW, TURW and MERW in evidently different ways.

5.1. Construction of rose graphs

The rose graphs are a family of deterministic networks, which allow some of their structural and dynamical properties to be treated analytically.
Let $R_m^l$ denote the rose graph constructed by merging $m$ ($m\ge 2$) cycles of length $l$ ($l$ even) at a central hub node. Here we focus on a specific class of rose graphs, $R_m^4$, in which each petal is a cycle of length 4, see Fig. 5a. Since every petal contributes three non-hub nodes and four edges, the total number of nodes in $R_m^4$ is $N_m=3m+1$, and the total number of edges is $E_m=4m$.

Figure 5. Rose graphs. (a) Rose graph $R_m^4$. (b) Rose graph $R_m^l$, where each of the $m$ petals is a cycle of length $l$. In either graph, the red node, green nodes and blue nodes stand for the hub, internal nodes and peripheral nodes, respectively.

For convenience of description, we partition all the $N_m$ nodes of $R_m^4$ into three classes: the hub node, peripheral nodes and internal nodes. The hub node is the unique node with the largest degree, the peripheral nodes are the $m$ nodes farthest from the hub node, while the remaining $2m$ nodes are internal nodes, each of which is linked to the hub node and a peripheral node. Furthermore, the $3m+1$ nodes can be labeled from 1 to $3m+1$ in the following way. We label the hub node by 1. For the nodes in the $i$th ($i=1,2,\ldots,m$) petal, the two internal nodes are labeled $3(i-1)+2$ and $3(i-1)+3$, while the peripheral node is labeled $3(i-1)+4$.

5.2. Stationary distribution

For TURW on $R_m^4$, the stationary distribution can be obtained from (4). For NBCRW and MERW on $R_m^4$, their stationary distributions can also be determined exactly.

Theorem 5.1 For NBCRW on rose graphs $R_m^4$, the stationary probabilities at the hub node, an internal node and a peripheral node are
$$\pi_H^{B}=\frac{m}{2\left(m+\sqrt{2m-1}\right)}=\frac{N_m-1}{2(N_m-1)+2\sqrt{6N_m-15}}, \tag{31}$$
$$\pi_I^{B}=\frac{1}{4m}=\frac{3}{4(N_m-1)} \tag{32}$$
and
$$\pi_P^{B}=\frac{m\sqrt{2m-1}-2m+1}{2m(m-1)^2}=\frac{3(N_m-1)\sqrt{6N_m-15}-18N_m+45}{2(N_m-1)(N_m-4)^2}, \tag{33}$$
respectively.
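As a quick numerical sanity check of (31)–(33), the following Python sketch builds $R_m^4$ with the labeling of Section 5.1 and tests whether the closed-form probabilities are left invariant by the NBCRW transition matrix. It rests on two assumptions that are not restated in this section: that the walker moves from node $i$ to a neighbor $j$ with probability proportional to the non-backtracking centrality $x_j$ (the rule used in Appendix C), and that the centralities obey the first two relations of (A.10) with $\kappa_1=(2m-1)^{1/4}$. All function names are illustrative.

```python
import math

def rose_graph_r4(m):
    """Adjacency lists of R_m^4; hub = 1, petal i uses labels 3(i-1)+2 .. 3(i-1)+4."""
    adj = {1: []}
    for i in range(1, m + 1):
        a, b, p = 3 * (i - 1) + 2, 3 * (i - 1) + 3, 3 * (i - 1) + 4
        adj[1] += [a, b]                 # hub is joined to both internal nodes
        adj[a], adj[b] = [1, p], [1, p]  # internal nodes: hub and peripheral node
        adj[p] = [a, b]                  # peripheral node closes the 4-cycle
    return adj

def nb_centrality(m):
    """Unnormalized (x_H, x_I, x_P) from the first two relations of (A.10)."""
    kappa = (2 * m - 1) ** 0.25
    x_h = 1.0
    x_i = (kappa ** 2 + 2 * m - 1) * x_h / (2 * m * kappa)
    x_p = (kappa + 1 / kappa) * x_i - x_h
    return x_h, x_i, x_p

def node_class(v):
    """'H' for the hub, 'P' for peripheral labels 3i+1, 'I' otherwise."""
    if v == 1:
        return "H"
    return "P" if (v - 1) % 3 == 0 else "I"

def stationarity_residual(m):
    """Return (max_v |(pi P)_v - pi_v|, sum of pi) for the closed forms (31)-(33)."""
    adj = rose_graph_r4(m)
    x = dict(zip("HIP", nb_centrality(m)))
    s = math.sqrt(2 * m - 1)
    pi = {
        "H": m / (2 * (m + s)),
        "I": 1 / (4 * m),
        "P": (m * s - 2 * m + 1) / (2 * m * (m - 1) ** 2),
    }
    flow = {v: 0.0 for v in adj}
    for u, nbrs in adj.items():
        weight = sum(x[node_class(w)] for w in nbrs)
        for v in nbrs:
            # assumed rule: P(u -> v) = x_v / (sum of x over the neighbors of u)
            flow[v] += pi[node_class(u)] * x[node_class(v)] / weight
    total = sum(pi[node_class(v)] for v in adj)
    residual = max(abs(flow[v] - pi[node_class(v)]) for v in adj)
    return residual, total
```

Calling `stationarity_residual(m)` for a few values of $m\ge 2$ should give a residual at the level of floating-point round-off and a total probability of 1. This is only a consistency check under the stated assumptions and does not replace the proof.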
The proof is presented in Appendix A.

Theorem 5.2 For MERW on rose graphs $R_m^4$, the stationary probabilities at the hub node, an internal node and a peripheral node are
$$\pi_H^{M}=\frac{m}{2m+2}=\frac{N_m-1}{2(N_m-1)+6}, \tag{34}$$
$$\pi_I^{M}=\frac{1}{4m}=\frac{3}{4(N_m-1)} \tag{35}$$
and
$$\pi_P^{M}=\frac{1}{2m(m+1)}=\frac{9}{2(N_m+2)(N_m-1)}, \tag{36}$$
respectively.

The proof is presented in Appendix B.

Thus far, we have obtained the stationary distributions for NBCRW and MERW on $R_m^4$. For TURW on $R_m^4$, the stationary distribution is determined by the degree sequence of the nodes and can be directly computed from (4). Because the hub has degree $2m$, every other node has degree 2 and the degrees sum to $8m$, the stationary probabilities at the hub node, an internal node and a peripheral node are $\pi_H^{T}=\frac{2m}{8m}=\frac{1}{4}$, $\pi_I^{T}=\frac{1}{4m}=\frac{3}{4(N_m-1)}$ and $\pi_P^{T}=\frac{1}{4m}=\frac{3}{4(N_m-1)}$, respectively. If we choose stationary probability as an indicator of node importance, the stationary distribution of TURW fails to differentiate the internal nodes from the peripheral nodes in $R_m^4$, since internal and peripheral nodes have the same degree. We will show that this shortcoming can be overcome by using the stationary distributions of NBCRW and MERW, although these two also differ greatly.

For NBCRW and MERW on $R_m^4$, Theorems 5.1 and 5.2 show that $\pi_H^{M}>\pi_H^{B}$, $\pi_I^{M}=\pi_I^{B}$ and $\pi_P^{M}<\pi_P^{B}$. Moreover, for both NBCRW and MERW, the stationary probabilities of internal nodes and peripheral nodes are different, in spite of the fact that their degrees are identical. Thus, the stationary distributions of NBCRW and MERW can discriminate between an internal node and a peripheral node. However, there exist differences between the stationary probabilities of NBCRW and MERW. For example, from (33) and (36), we can see that the stationary probability of a peripheral node for NBCRW is of order $O(N_m^{-3/2})$, larger than the $O(N_m^{-2})$ obtained for MERW. Therefore, in comparison with MERW, the stationary probabilities of NBCRW are distributed over a narrower range of values. We next examine the distinction between the stationary distributions of NBCRW and MERW in more detail.
We compare the stationary distributions for NBCRW and MERW in the rose graph $R_3^{20}$ with 58 nodes, among which the hub node has degree 6, while each of the other 57 nodes has degree 2. We can classify the 58 nodes in $R_3^{20}$ by assigning a level number to each node according to its shortest distance to the hub node: the hub node is at level zero, the neighbors of the hub are at level one, and the nodes farthest from the hub are at level ten. Figure 6 provides numerical results for the stationary distributions of NBCRW and MERW at every node in $R_3^{20}$, which show that for both NBCRW and MERW, the stationary probability depends on the level: the smaller the level of a node, the larger its stationary probability. However, their differences are also striking. For MERW, the stationary probability is concentrated almost entirely on the hub node and its neighbors, with other nodes getting vanishing weight; for NBCRW, although the stationary probability of the hub is also significantly larger than those of other nodes, the stationary probability of every node is nonvanishing and greater than 0.01, as shown in the inset of Fig. 6. Thus, if we use stationary probability to measure the relative importance of nonhub nodes, MERW can hardly distinguish the nodes at large levels, whereas NBCRW can discriminate them.

Figure 6. Stationary probabilities of different nodes for NBCRW and MERW in the rose graph $R_3^{20}$.

5.3. Hitting times

In addition to the stationary distribution, for NBCRW on the rose graph $R_m^4$, the partial mean hitting time to the hub node and the global mean hitting time of the whole network can also be determined explicitly. For the purpose of comparison, we also provide the corresponding exact results for TURW and MERW.
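For a fixed target node, the mean hitting times of any of these walks satisfy the first-step relation $T_u = 1+\sum_v P_{uv}T_v$ with $T_{\mathrm{target}}=0$, so on a small rose graph they can be obtained by solving a linear system. The sketch below does this for TURW only ($P_{uv}=1/d_u$) in exact rational arithmetic. The function names and the Gauss-Jordan solver are illustrative choices made here, not the procedure used in the paper.

```python
from fractions import Fraction

def rose_graph_r4(m):
    """Adjacency lists of R_m^4; hub = 1, petal i uses labels 3(i-1)+2 .. 3(i-1)+4."""
    adj = {1: []}
    for i in range(1, m + 1):
        a, b, p = 3 * (i - 1) + 2, 3 * (i - 1) + 3, 3 * (i - 1) + 4
        adj[1] += [a, b]
        adj[a], adj[b] = [1, p], [1, p]
        adj[p] = [a, b]
    return adj

def hitting_times_to(adj, target):
    """Exact mean hitting times of the unbiased walk from every node to `target`."""
    nodes = [u for u in adj if u != target]
    idx = {u: k for k, u in enumerate(nodes)}
    n = len(nodes)
    rows = []
    for u in nodes:
        # Row for u:  T_u - (1/d_u) * sum_{v ~ u, v != target} T_v = 1
        row = [Fraction(0)] * (n + 1)
        row[idx[u]] = Fraction(1)
        for v in adj[u]:
            if v != target:
                row[idx[v]] -= Fraction(1, len(adj[u]))
        row[n] = Fraction(1)
        rows.append(row)
    # Gauss-Jordan elimination on the augmented matrix
    for c in range(n):
        piv = next(r for r in range(c, n) if rows[r][c] != 0)
        rows[c], rows[piv] = rows[piv], rows[c]
        inv = 1 / rows[c][c]
        rows[c] = [a * inv for a in rows[c]]
        for r in range(n):
            if r != c and rows[r][c] != 0:
                f = rows[r][c]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[c])]
    return {u: rows[idx[u]][n] for u in nodes}

def partial_mean_hitting_time(adj, target):
    """Average hitting time to `target` over all other starting nodes."""
    t = hitting_times_to(adj, target)
    return sum(t.values()) / len(t)

def global_mean_hitting_time(adj):
    """Average hitting time over all ordered pairs of distinct nodes."""
    n = len(adj)
    total = sum(sum(hitting_times_to(adj, j).values()) for j in adj)
    return total / (n * (n - 1))

adj = rose_graph_r4(3)
print(partial_mean_hitting_time(adj, 1))  # 10/3
print(global_mean_hitting_time(adj))
```

The two printed values can be compared with the TURW expressions (37) and (40) stated in the theorems that follow. Extending the sketch to NBCRW or MERW would only require replacing the uniform weights $1/d_u$ by the corresponding biased transition probabilities.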
Theorem 5.3 The partial mean hitting times to the hub node for TURW, NBCRW and MERW in $R_m^4$ are
$$T_H^{T}=\frac{10}{3}, \tag{37}$$
$$T_H^{B}=\frac{4}{3}+\frac{2\sqrt{2m-1}}{m}=\frac{4}{3}+\frac{2\sqrt{6N_m-15}}{N_m-1} \tag{38}$$
and
$$T_H^{M}=\frac{4}{3}+\frac{2}{m}=\frac{4}{3}+\frac{6}{N_m-1}, \tag{39}$$
respectively.

The proof is presented in Appendix C.

Theorem 5.3 shows that for TURW, NBCRW and MERW in large rose graphs $R_m^4$ ($N_m\to\infty$), the partial mean hitting time to the hub node tends to a constant, with $T_H^{B}$ and $T_H^{M}$ both approaching $4/3$, which is smaller than $T_H^{T}=10/3$. Although $T_H^{B}$ and $T_H^{M}$ are asymptotically equal for NBCRW and MERW in large rose graphs $R_m^4$, their global mean hitting times behave differently, as can be seen from the following theorem.

Theorem 5.4 The global mean hitting times for TURW, NBCRW and MERW in $R_m^4$ with $N_m=3m+1$ nodes are
$$\langle T\rangle^{T}=\frac{20m(3m-1)}{3(3m+1)}=\frac{20(N_m-1)(N_m-2)}{9N_m}, \tag{40}$$
$$\langle T\rangle^{B}=\frac{2m^3+12m^2-14m+4}{(3m+1)\sqrt{2m-1}}+\frac{36m^2-8m}{3(3m+1)}=\frac{2N_m^2+30N_m-192}{9\sqrt{6N_m-15}}+\frac{268+20\sqrt{6N_m-15}}{9N_m\sqrt{6N_m-15}}+\frac{12N_m-32}{9} \tag{41}$$
and
$$\langle T\rangle^{M}=\frac{6m^3+36m^2+10m-12}{9m+3}=\frac{2N_m^3+30N_m^2-36N_m-104}{27N_m}, \tag{42}$$
respectively.

The proof is presented in Appendix D.

Theorem 5.4 provides succinct dependence relations of $\langle T\rangle^{T}$, $\langle T\rangle^{B}$ and $\langle T\rangle^{M}$ on the network size $N_m$, from which we find that for large networks (i.e. $N_m\to\infty$), the leading terms of the global mean hitting times for TURW, NBCRW and MERW are $\langle T\rangle^{T}\sim N_m$, $\langle T\rangle^{B}\sim (N_m)^{3/2}$ and $\langle T\rangle^{M}\sim (N_m)^{2}$, respectively. Thus, $\langle T\rangle^{T}$, $\langle T\rangle^{B}$ and $\langle T\rangle^{M}$ behave differently for the three random walks on $R_m^4$. For TURW, $\langle T\rangle^{T}$ grows linearly with $N_m$, while for NBCRW and MERW, $\langle T\rangle^{B}$ and $\langle T\rangle^{M}$ increase superlinearly with $N_m$, with $\langle T\rangle^{B}$ smaller than $\langle T\rangle^{M}$. These results indicate that when searching for a node distributed uniformly in $R_m^4$, TURW is the most efficient, while MERW is the least efficient, as observed in the model and real networks studied in the previous section.

6. CONCLUSION

The application effects of random walks are determined to a large extent by the properties and behaviors of stationary distribution and hitting times. Recent works indicate that biased random walks perform better in multiple applications than TURW.
Thus, designing appropriate biased random walks and understanding their properties are of significant importance. In this paper, we defined a new biased random walk, NBCRW, with the bias dependent on the non-backtracking centrality, which is a recently proposed node centrality measure having several advantages over the traditional eigenvector centrality metric. We established a theoretical framework for quickly computing the transition probabilities, stationary distribution and hitting times of NBCRW on a general network. Within our proposed framework, we studied numerically or analytically NBCRW on some model and realistic networks, and compared the results about stationary distribution and hitting times with those corresponding to TURW and MERW, the latter of which is actually a random walk biased towards neighbors with high eigenvector centrality. We found that for homogeneous networks, the behaviors of the stationary distribution and hitting times of the three random walks resemble each other. However, for heterogeneous networks, there is a big difference in the behaviors of the three random walks. For example, the stationary distribution of NBCRW outperforms those of TURW and MERW in discriminating nodes, in particular those with identical degree. With respect to hitting times, a walker finds the hub node most quickly when performing MERW, and detects a uniformly selected target most rapidly when executing TURW. In both cases, the hitting times of NBCRW interpolate between those of TURW and MERW. In view of the distinctive behaviors of NBCRW, it will be interesting in future to explore the applications of NBCRW in different fields, such as community detection, visual saliency and link prediction.

FUNDING

This work was supported by the National Natural Science Foundation of China under Grant No. 11275049.

REFERENCES

[1] Chau, C.-K. and Basu, P. (2011) Analysis of latency of stateless opportunistic forwarding in intermittently connected networks. IEEE/ACM Trans. Netw., 19, 1111–1124.
[2] Zheng, H., Yang, F., Tian, X., Gan, X., Wang, X. and Xiao, S. (2015) Data gathering with compressive sensing in wireless sensor networks: a random walk based approach. IEEE Trans. Parallel Distrib. Syst., 26, 35–44.
[3] Lee, C.-H. and Kwak, J. (2016) Towards distributed optimal movement strategy for data gathering in wireless sensor networks. IEEE Trans. Parallel Distrib. Syst., 27, 574–584.
[4] El Gamal, A., Mammen, J., Prabhakar, B. and Shah, D. (2006) Optimal throughput-delay scaling in wireless networks - Part I: the fluid model. IEEE Trans. Inf. Theory, 52, 2568–2592.
[5] Liu, J., Jiang, X., Nishiyama, H. and Kato, N. (2012) Exact Throughput Capacity Under Power Control in Mobile Ad Hoc Networks. Proc. IEEE INFOCOM, Orlando, FL, USA, March 25–30, pp. 1–9. IEEE Press, Piscataway, NJ, USA.
[6] Li, Y. and Zhang, Z.-L. (2013) Random walks and Green's function on digraphs: a framework for estimating wireless transmission costs. IEEE/ACM Trans. Netw., 21, 135–148.
[7] Beraldi, R., Querzoni, L. and Baldoni, R. (2009) Low hitting time random walks in wireless networks. Wirel. Commun. Mob. Comput., 9, 719–732.
[8] Lin, T., Lin, P., Wang, H. and Chen, C. (2009) Dynamic search algorithm in unstructured peer-to-peer networks. IEEE Trans. Parallel Distrib. Syst., 20, 654–666.
[9] Pons, P. and Latapy, M. (2005) Computing Communities in Large Networks Using Random Walks. Proc. Int. Symp. Comput. Inform. Sci., Istanbul, Turkey, October 26–28, pp. 284–293. Springer.
[10] Fouss, F., Pirotte, A., Renders, J.-M. and Saerens, M. (2007) Random-walk computation of similarities between nodes of a graph with application to collaborative recommendation. IEEE Trans. Knowl. Data Eng., 19, 355–369.
[11] Gopalakrishnan, V., Hu, Y. and Rajan, D. (2009) Random Walks on Graphs to Model Saliency in Images. Proc. IEEE Int. Conf. Comput. Vision Pattern Recognit., Miami Beach, Florida, USA, June 20–25, pp. 1698–1705. IEEE Press, Piscataway, NJ, USA.
[12] Grady, L. (2006) Random walks for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell., 28, 1768–1783.
[13] Ribeiro, B. and Towsley, D. (2010) Estimating and Sampling Graphs with Multidimensional Random Walks. Proc. ACM SIGCOMM IMC, New Delhi, India, August 30–September 3, pp. 390–403. ACM Press, New York, NY, USA.
[14] Ribeiro, B., Wang, P., Murai, F. and Towsley, D. (2012) Sampling Directed Graphs with Random Walks. Proc. IEEE INFOCOM, March 25–30, Orlando, FL, USA, pp. 1692–1700. IEEE Press, Piscataway, NJ, USA.
[15] Brin, S. and Page, L. (1998) The anatomy of a large-scale hypertextual Web search engine. Comput. Netw. ISDN Syst., 30, 107–117.
[16] Yu, J.-G., Zhao, J., Tian, J. and Tan, Y. (2014) Maximal entropy random walk for region-based visual saliency. IEEE Trans. Cybern., 44, 1661–1672.
[17] Newman, M.E. (2003) The structure and function of complex networks. SIAM Rev., 45, 167–256.
[18] Beraldi, R. (2009) Biased random walks in uniform wireless networks. IEEE Trans. Mob. Comput., 8, 500–513.
[19] Gjoka, M., Kurant, M., Butts, C.T. and Markopoulou, A. (2010) Walking in Facebook: A Case Study of Unbiased Sampling of OSNs. Proc. IEEE INFOCOM, San Diego, CA, USA, March 14–19, pp. 1–9. IEEE Press, Piscataway, NJ, USA.
[20] Ikeda, S., Kubo, I. and Yamashita, M. (2009) The hitting and cover times of random walks on finite graphs using local degree information. Theor. Comput. Sci., 410, 94–100.
[21] Maiya, A.S. and Berger-Wolf, T.Y. (2011) Benefits of Bias: Towards Better Characterization of Network Sampling. Proc. Int. Conf. Knowl. Discov. Data Mining, San Diego, CA, USA, August 21–24, pp. 105–113. ACM Press, New York, NY, USA.
[22] Parry, W. (1964) Intrinsic Markov chains. Trans. Am. Math. Soc., 112, 55–66.
[23] Burda, Z., Duda, J., Luck, J. and Waclaw, B. (2009) Localization of the maximal entropy random walk. Phys. Rev. Lett., 102, 160602.
[24] Gómez-Gardeñes, J. and Latora, V. (2008) Entropy rate of diffusion processes on complex networks. Phys. Rev. E, 78, 065102.
[25] Peng, X. and Zhang, Z. (2014) Maximal entropy random walk improves efficiency of trapping in dendrimers. J. Chem. Phys., 140, 234104.
[26] Lin, Y. and Zhang, Z. (2014) Mean first-passage time for maximal-entropy random walks in complex networks. Sci. Rep., 4, 5365.
[27] Kafsi, M., Grossglauser, M. and Thiran, P. (2013) The entropy of conditional Markov trajectories. IEEE Trans. Inf. Theory, 59, 5577–5583.
[28] Ochab, J. and Burda, Z. (2012) Exact solution for statics and dynamics of maximal-entropy random walks on Cayley trees. Phys. Rev. E, 85, 021145.
[29] Li, R.-H., Yu, J.X. and Liu, J. (2011) Link Prediction: The Power of Maximal Entropy Random Walk. Proc. Int. Conf. Inform. Knowl. Manag., Glasgow, United Kingdom, October 24–28, pp. 1147–1156. ACM Press, New York, NY, USA.
[30] Korus, P. and Huang, J. (2016) Improved tampering localization in digital image forensics based on maximal entropy random walk. IEEE Signal Process. Lett., 23, 169–173.
[31] Bonacich, P. (1972) Factoring and weighting approaches to status scores and clique identification. J. Math. Sociol., 2, 113–120.
[32] Martin, T., Zhang, X. and Newman, M.E.J. (2014) Localization and centrality in networks. Phys. Rev. E, 90, 052808.
[33] Hashimoto, K. (1989) Zeta functions of finite graphs and representations of p-adic groups. Adv. Stud. Pure Math., 15, 211–280.
[34] Krzakala, F., Moore, C., Mossel, E., Neeman, J., Sly, A., Zdeborová, L. and Zhang, P. (2013) Spectral redemption in clustering sparse networks. Proc. Natl. Acad. Sci., 110, 20935–20940.
[35] Karrer, B., Newman, M.E.J. and Zdeborová, L. (2014) Percolation on sparse networks. Phys. Rev. Lett., 113, 208702.
[36] Lin, Y., Chen, W. and Zhang, Z. (2017) Assessing Percolation Threshold Based on High-Order Non-backtracking Matrices. Proc. 26th Int. Conf. World Wide Web, Perth, Australia, April 3–7, pp. 223–232. International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland.
[37] Shrestha, M., Scarpino, S.V. and Moore, C. (2015) Message-passing approach for recurrent-state epidemic models on networks. Phys. Rev. E, 92, 022821.
[38] Morone, F. and Makse, H.A. (2015) Influence maximization in complex networks through optimal percolation. Nature, 524, 65–68.
[39] Condamin, S., Bénichou, O., Tejedor, V., Voituriez, R. and Klafter, J. (2007) First-passage times in complex scale-invariant media. Nature, 450, 77–80.
[40] Lin, Y., Julaiti, A. and Zhang, Z.Z. (2012) Mean first-passage time for random walks in general graphs with a deep trap. J. Chem. Phys., 137, 124104.
[41] Ermon, S., Gomes, C.P., Sabharwal, A. and Selman, B. (2014) Designing Fast Absorbing Markov Chains. Proc. AAAI, July 27–31, pp. 849–855. AAAI Press.
[42] White, S. and Smyth, P. (2003) Algorithms for Estimating Relative Importance in Networks. Proc. ACM SIGKDD Int. Conf. Knowl. Discov. Data Mining, Washington DC, USA, August 24–27, pp. 266–275. ACM Press, New York, NY, USA.
[43] Feng, M., Qu, H. and Yi, Z. (2014) Highest degree likelihood search algorithm using a state transition matrix for complex networks. IEEE Trans. Circuits Syst. I, Reg. Papers, 61, 2941–2950.
[44] Lovász, L. (1993) Random walks on graphs: a survey. Combinatorics, Paul Erdős is Eighty, 2, 1–46.
[45] Aldous, D. and Fill, J. (1999) Reversible Markov chains and random walks on graphs. http://www.stat.berkeley.edu/~aldous/RWG/book.html.
[46] Strang, G. (2009) Introduction to Linear Algebra. Wellesley-Cambridge Press, Wellesley, MA.
[47] Bass, H. (1992) The Ihara–Selberg zeta function of a tree lattice. Int. J. Math., 3, 717–797.
[48] Angel, O., Friedman, J. and Hoory, S. (2015) The non-backtracking spectrum of the universal cover of a graph. Trans. Am. Math. Soc., 367, 4287–4318.
[49] Zhang, Z., Shan, T. and Chen, G. (2013) Random walks on weighted networks. Phys. Rev. E, 87, 012112.
[50] Barrat, A., Barthelemy, M., Pastor-Satorras, R. and Vespignani, A. (2004) The architecture of complex weighted networks. Proc. Natl. Acad. Sci. USA, 101, 3747–3752.
[51] Alon, N., Benjamini, I., Lubetzky, E. and Sodin, S. (2007) Non-backtracking random walks mix faster. Commun. Contemp. Math., 9, 585–603.
[52] Fitzner, R. and van der Hofstad, R. (2013) Non-backtracking random walk. J. Stat. Phys., 150, 264–284.
[53] Kempton, M. (2016) Non-backtracking random walks and a weighted Ihara's theorem. Open J. Discrete Math., 6, 207–226.
[54] Lin, Y. and Zhang, Z.Z. (2013) Random walks in weighted networks with a perfect trap: an application of Laplacian spectra. Phys. Rev. E, 87, 062140.
[55] Erdős, P. and Rényi, A. (1960) On the evolution of random graphs. Publ. Math. Inst. Hungar. Acad. Sci., 5, 17–61.
[56] Barabási, A.-L. and Albert, R. (1999) Emergence of scaling in random networks. Science, 286, 509–512.
[57] Bell, R. and Dean, P. (1970) Atomic vibrations in vitreous silica. Discuss. Faraday Soc., 50, 55–61.
[58] Watts, D.J. and Strogatz, S.H. (1998) Collective dynamics of 'small-world' networks. Nature, 393, 440–442.
[59] Lusseau, D., Schneider, K., Boisseau, O.J., Haase, P., Slooten, E. and Dawson, S.M. (2003) The bottlenose dolphin community of Doubtful Sound features a large proportion of long-lasting associations. Behav. Ecol. Sociobiol., 54, 396–405.
[60] Newman, M.E.J. (2006) Finding community structure in networks using the eigenvectors of matrices. Phys. Rev. E, 74, 036104.
[61] Duch, J. and Arenas, A. (2005) Community detection in complex networks using extremal optimization. Phys. Rev. E, 72, 027104.
[62] Guimera, R., Danon, L., Diaz-Guilera, A., Giralt, F. and Arenas, A. (2003) Self-similar community structure in a network of human interactions. Phys. Rev. E, 68, 065103.
[63] Leskovec, J., Kleinberg, J. and Faloutsos, C. (2007) Graph evolution: densification and shrinking diameters. ACM Trans. Knowl. Discov. Data, 1, 2.
[64] Tejedor, V., Bénichou, O. and Voituriez, R. (2009) Global mean first-passage times of random walks on complex networks. Phys. Rev. E, 80, 065104.
[65] Liu, F. and Huang, Q. (2013) Laplacian spectral characterization of 3-rose graphs. Linear Algebra Appl., 439, 2914–2920.

Appendix A. PROOF OF THEOREM 5.1

Let $\kappa_1$ denote the leading eigenvalue of the matrix $M$ corresponding to $R_m^4$. Let $M_m(\kappa)$ be the characteristic polynomial of matrix $M$, i.e.
$$M_m(\kappa)=\det(\kappa I-M). \tag{A.1}$$
Then, $\kappa_1$ is the largest root of the equation $M_m(\kappa)=0$. Equation (A.1) can be recast as
$$M_m(\kappa)=\det\begin{pmatrix}\kappa I-A & D-I\\ -I & \kappa I\end{pmatrix}=\frac{1}{\kappa^{N_m}}\det\begin{pmatrix}\kappa^2 I-\kappa A-I+D & I-D\\ O & \kappa I\end{pmatrix}=\det\left((\kappa^2-1)I+D-\kappa A\right), \tag{A.2}$$
which reduces the computation of $M_m(\kappa)$ to computing the determinant of a new matrix $P_m=(\kappa^2-1)I+D-\kappa A$ of lower order. According to the construction of $R_m^4$, $\det(P_m)$ can be written in the following form:
$$\det(P_m)=\det\begin{pmatrix}\kappa^2+2m-1 & -\kappa e & -\kappa e & \cdots & -\kappa e\\ -\kappa e^{\top} & Q & O & \cdots & O\\ -\kappa e^{\top} & O & Q & \cdots & O\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ -\kappa e^{\top} & O & O & \cdots & Q\end{pmatrix}, \tag{A.3}$$
where $e=(1,1,0)$, $O$ is the $3\times 3$ zero matrix, and $Q$ is a $3\times 3$ matrix given by
$$Q=\begin{pmatrix}\kappa^2+1 & 0 & -\kappa\\ 0 & \kappa^2+1 & -\kappa\\ -\kappa & -\kappa & \kappa^2+1\end{pmatrix}. \tag{A.4}$$
Note that the first row of matrix $P_m$ on the right-hand side (rhs) of (A.3) can be regarded as the sum (i.e. a linear combination with all scalars being 1) of the following $m+1$ vectors: $(\kappa^2+2m-1,\mathbf{0},\mathbf{0},\ldots,\mathbf{0})$, $(0,-\kappa e,\mathbf{0},\ldots,\mathbf{0})$, $(0,\mathbf{0},-\kappa e,\ldots,\mathbf{0})$, …, $(0,\mathbf{0},\mathbf{0},\ldots,-\kappa e)$, where $\mathbf{0}$ represents the zero vector $(0,0,0)$. According to the properties of determinants, $\det(P_m)$ can be rewritten as
$$\det(P_m)=\det\begin{pmatrix}\kappa^2+2m-1 & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{0}\\ -\kappa e^{\top} & Q & O & \cdots & O\\ -\kappa e^{\top} & O & Q & \cdots & O\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ -\kappa e^{\top} & O & O & \cdots & Q\end{pmatrix}+m\cdot\det\begin{pmatrix}0 & -\kappa e & \mathbf{0} & \cdots & \mathbf{0}\\ -\kappa e^{\top} & Q & O & \cdots & O\\ -\kappa e^{\top} & O & Q & \cdots & O\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ -\kappa e^{\top} & O & O & \cdots & Q\end{pmatrix}=(\kappa^2+2m-1)(\det Q)^m+m(\det Q)^{m-1}\det\begin{pmatrix}0 & -\kappa e\\ -\kappa e^{\top} & Q\end{pmatrix}. \tag{A.5}$$
Based on (A.4), we obtain
$$\det(Q)=\kappa^6+\kappa^4+\kappa^2+1 \tag{A.6}$$
and
$$\det\begin{pmatrix}0 & -\kappa e\\ -\kappa e^{\top} & Q\end{pmatrix}=-2\kappa^6-4\kappa^4-2\kappa^2. \tag{A.7}$$
Inserting (A.6) and (A.7) into (A.5) yields
$$M_m(\kappa)=\det(P_m)=(\kappa+1)(\kappa-1)(\kappa^2+1)(\kappa^4-2m+1)(\kappa^6+\kappa^4+\kappa^2+1)^{m-1}. \tag{A.8}$$
Thus, the largest eigenvalue $\kappa_1$ of matrix $M$ is $\kappa_1=(2m-1)^{1/4}$.
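(A.9)

Next, we derive the eigenvector of unit length corresponding to eigenvalue $\kappa_1$. Let $x_H$, $x_I$ and $x_P$ denote the non-backtracking centrality of the hub node, an internal node and a peripheral node in graph $R_m^4$, respectively. According to (21), $x_H$, $x_I$ and $x_P$ satisfy the following system of equations:
$$2m x_I-\frac{(2m-1)x_H}{\kappa}=\kappa x_H,\qquad x_H+x_P-\frac{x_I}{\kappa}=\kappa x_I,\qquad x_H^2+\left(\frac{x_H}{\kappa}\right)^2+2m\left[x_I^2+\left(\frac{x_I}{\kappa}\right)^2\right]+m\left[x_P^2+\left(\frac{x_P}{\kappa}\right)^2\right]=1, \tag{A.10}$$
which can be solved to yield
$$x_H=\frac{2\sqrt{m}\,\kappa^3}{\sqrt{(\kappa^2+1)\left(\kappa^8+2\kappa^6+2(8m-3)\kappa^4+2(2m-1)^2\kappa^2+(2m-1)^2\right)}},$$
$$x_I=\frac{\kappa^2(\kappa^2+2m-1)}{\sqrt{m(\kappa^2+1)\left(\kappa^8+2\kappa^6+2(8m-3)\kappa^4+2(2m-1)^2\kappa^2+(2m-1)^2\right)}},$$
$$x_P=\frac{\kappa(\kappa^4+2m-1)}{\sqrt{m(\kappa^2+1)\left(\kappa^8+2\kappa^6+2(8m-3)\kappa^4+2(2m-1)^2\kappa^2+(2m-1)^2\right)}}. \tag{A.11}$$
Then the normalization factor $Q$ can be computed by
$$Q=\sum_{i=1}^{N_m}\left(\frac{\kappa^2-1}{\kappa}+\frac{d_i}{\kappa}\right)x_i^2=\left(\frac{\kappa^2-1}{\kappa}+\frac{2m}{\kappa}\right)x_H^2+2m\left(\frac{\kappa^2-1}{\kappa}+\frac{2}{\kappa}\right)x_I^2+m\left(\frac{\kappa^2-1}{\kappa}+\frac{2}{\kappa}\right)x_P^2=\frac{2(2m-1)^{1/4}\left[2m^2+m-1+\sqrt{2m-1}\,(3m-1)\right]}{\left(\sqrt{2m-1}+1\right)\left[m\left(5+\sqrt{2m-1}\right)-2\right]}. \tag{A.12}$$
Plugging (A.9), (A.11) and (A.12) into (24) and considering the relation $N_m=3m+1$, the theorem follows.

Appendix B. PROOF OF THEOREM 5.2

Let $\lambda_1$ be the leading eigenvalue of the adjacency matrix $A$ of graph $R_m^4$, and let $\mu_H$, $\mu_I$ and $\mu_P$ be the elements of the leading eigenvector of unit length corresponding to the hub node, an internal node and a peripheral node, respectively. Then,
$$2m\mu_I=\lambda_1\mu_H,\qquad \mu_H+\mu_P=\lambda_1\mu_I,\qquad 2\mu_I=\lambda_1\mu_P. \tag{B.1}$$
Dividing the first relation in (B.1) by the second and the second by the third, we have
$$\frac{2m\mu_I}{\mu_H+\mu_P}=\frac{\mu_H}{\mu_I},\qquad \frac{\mu_H+\mu_P}{2\mu_I}=\frac{\mu_I}{\mu_P}, \tag{B.2}$$
which, together with the normalization condition $\mu_H^2+2m\mu_I^2+m\mu_P^2=1$, is solved to yield
$$\mu_H=\sqrt{\frac{m}{2m+2}},\qquad \mu_I=\frac{1}{2\sqrt{m}},\qquad \mu_P=\frac{1}{\sqrt{m(2m+2)}}. \tag{B.3}$$
Combining (B.3) and the relation $N_m=3m+1$, the theorem follows from (9).

Appendix C. PROOF OF THEOREM 5.3

For TURW (MERW, NBCRW) in $R_m^4$, let $T_{I\to H}^{T}$ ($T_{I\to H}^{M}$, $T_{I\to H}^{B}$) be the hitting time from an internal node to the hub, and let $T_{P\to H}^{T}$ ($T_{P\to H}^{M}$, $T_{P\to H}^{B}$) be the hitting time from a peripheral node to the hub. Then, by definition, for the three random walks in $R_m^4$ the partial mean hitting time to the hub can be computed by the uniform formula
$$T_H^{Z}=\frac{1}{3}\left(2T_{I\to H}^{Z}+T_{P\to H}^{Z}\right), \tag{C.1}$$
where $Z$ can be $T$, $M$ or $B$. We next determine the partial mean hitting time to the hub for the three considered random walks.

Case I: TURW.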
Since TURW is unbiased, the quantities $T_{I\to H}^{T}$ and $T_{P\to H}^{T}$ satisfy the following relations: $T_{I\to H}^{T}=\frac{1}{2}+\frac{1}{2}\left(1+T_{P\to H}^{T}\right)$ and $T_{P\to H}^{T}=1+T_{I\to H}^{T}$, from which we have
$$T_{I\to H}^{T}=3 \tag{C.2}$$
and
$$T_{P\to H}^{T}=4. \tag{C.3}$$
Plugging (C.2) and (C.3) into (C.1) yields $T_H^{T}=\frac{10}{3}$.

Case II: NBCRW. We first determine the transition probabilities between different nodes for NBCRW in $R_m^4$. If the walker is currently at a peripheral node, at the next time step it will jump to either of the two internal nodes adjacent to it with equal probability; if the current location of the walker is an internal node, according to (23) and (A.11), at the next time step the probabilities that the walker is at the hub node or at a peripheral node are $\frac{x_H}{x_H+x_P}=\frac{m}{m+\sqrt{2m-1}}$ and $\frac{x_P}{x_H+x_P}=\frac{\sqrt{2m-1}}{m+\sqrt{2m-1}}$, respectively. Then, we can establish the relations between $T_{I\to H}^{B}$ and $T_{P\to H}^{B}$ as $T_{I\to H}^{B}=\frac{m}{m+\sqrt{2m-1}}+\frac{\sqrt{2m-1}}{m+\sqrt{2m-1}}\left(1+T_{P\to H}^{B}\right)$ and $T_{P\to H}^{B}=1+T_{I\to H}^{B}$, which can be solved to yield
$$T_{I\to H}^{B}=1+\frac{2\sqrt{2m-1}}{m} \tag{C.4}$$
and $T_{P\to H}^{B}=2+\frac{2\sqrt{2m-1}}{m}$. Thus, the mean hitting time to the hub for NBCRW in $R_m^4$ is
$$T_H^{B}=\frac{1}{3}\left(2T_{I\to H}^{B}+T_{P\to H}^{B}\right)=\frac{4}{3}+\frac{2\sqrt{2m-1}}{m}=\frac{4}{3}+\frac{2\sqrt{6N_m-15}}{N_m-1}.$$

Case III: MERW. For MERW in $R_m^4$, the transition probability from a node $i$ to one of its neighbors $j$ is $\mu_j/\sum_k a_{ik}\mu_k$. Then, according to (B.3), the probabilities of moving from an internal node to the hub node and to the peripheral neighbor are $\frac{\mu_H}{\mu_H+\mu_P}=\frac{m}{m+1}$ and $\frac{\mu_P}{\mu_H+\mu_P}=\frac{1}{m+1}$, respectively. Thus, we can establish the following relations for $T_{I\to H}^{M}$ and $T_{P\to H}^{M}$:
$$T_{I\to H}^{M}=\frac{m}{m+1}+\frac{1}{m+1}\left(1+T_{P\to H}^{M}\right) \tag{C.5}$$
and
$$T_{P\to H}^{M}=1+T_{I\to H}^{M}. \tag{C.6}$$
Solving (C.5) and (C.6), one obtains the analytical expressions for $T_{I\to H}^{M}$ and $T_{P\to H}^{M}$ as
$$T_{I\to H}^{M}=\frac{m+2}{m} \tag{C.7}$$
and
$$T_{P\to H}^{M}=\frac{2(m+1)}{m}. \tag{C.8}$$
According to (C.1), (C.7) and (C.8), the mean hitting time to the hub for MERW in $R_m^4$ is
$$T_H^{M}=\frac{1}{3}\left(2T_{I\to H}^{M}+T_{P\to H}^{M}\right)=\frac{4}{3}+\frac{2}{m}=\frac{4}{3}+\frac{6}{N_m-1}.$$
This completes the proof.

Appendix D. PROOF OF THEOREM 5.4

We use the superscript $Z\in\{T,M,B\}$ to differentiate related quantities for TURW, MERW and NBCRW on $R_m^4$. For example, $\langle T\rangle^{T}$ ($\langle T\rangle^{M}$, $\langle T\rangle^{B}$) represents the global mean hitting time for TURW (MERW, NBCRW) on $R_m^4$.
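By definition,

$$
\langle T\rangle^{Z}=\frac{1}{N_m(N_m-1)}\sum_{i=1}^{N_m}\sum_{j=1}^{N_m}T_{i\to j}^{Z}=\frac{T_{\rm tot}^{Z}}{N_m(N_m-1)}, \tag{D.1}
$$

where $T_{i\to j}^{Z}$ is the hitting time from node $i$ to node $j$ in $R_m^4$, and $T_{\rm tot}^{Z}=\sum_{i=1}^{N_m}\sum_{j=1}^{N_m}T_{i\to j}^{Z}$ denotes the sum of hitting times over all $N_m(N_m-1)$ node pairs in $R_m^4$. Thus, in order to obtain $\langle T\rangle^{Z}$, we only need to determine $T_{\rm tot}^{Z}$. By construction, $T_{\rm tot}^{Z}$ can be decomposed into two terms as

$$
T_{\rm tot}^{Z}=m\,T_{{\rm tot},1}^{Z}+m(m-1)\,T_{{\rm tot},2}^{Z}, \tag{D.2}
$$

where $T_{{\rm tot},1}^{Z}$ is the sum of hitting times between all pairs of nodes belonging to one of the $m$ petals, and $T_{{\rm tot},2}^{Z}$ is the sum of hitting times between all pairs of nodes in different petals.

We now compute $T_{{\rm tot},1}^{Z}$ and $T_{{\rm tot},2}^{Z}$. The first can be evaluated by

$$
T_{{\rm tot},1}^{Z}=2T_{I\to H}^{Z}+T_{P\to H}^{Z}+2\left(T_{H\to I}^{Z}+T_{I\to I}^{Z}+T_{P\to I}^{Z}\right)+T_{H\to P}^{Z}+2T_{I\to P}^{Z}, \tag{D.3}
$$

where $T_{X\to Y}^{Z}$ represents the hitting time from a node in class $X$ to another node in class $Y$, with both nodes belonging to the same petal. For instance, $T_{I\to P}^{Z}$ is the hitting time from an internal node to the peripheral node in the same petal, and $T_{I\to I}^{Z}$ is the hitting time from one internal node to the other internal node in the same petal. For $T_{{\rm tot},2}^{Z}$, we have

$$
T_{{\rm tot},2}^{Z}=2\left(3T_{I\to H}^{Z}+2T_{H\to I}^{Z}+T_{H\to P}^{Z}\right)+\left(3T_{P\to H}^{Z}+2T_{H\to I}^{Z}+T_{H\to P}^{Z}\right). \tag{D.4}
$$

Plugging (D.3) and (D.4) into (D.2) yields

$$
\begin{aligned}
T_{\rm tot}^{Z}={}&(6m^2-4m)\,T_{I\to H}^{Z}+(3m^2-2m)\,T_{P\to H}^{Z}+(6m^2-4m)\,T_{H\to I}^{Z}\\
&+2m\,T_{I\to I}^{Z}+2m\,T_{P\to I}^{Z}+(3m^2-2m)\,T_{H\to P}^{Z}+2m\,T_{I\to P}^{Z}.
\end{aligned}
\tag{D.5}
$$

We are now ready to determine the global mean hitting time for TURW, NBCRW and MERW in $R_m^4$ by evaluating the quantities on the right-hand side of (D.5).

Case I: TURW. Since the quantities $T_{I\to H}^{T}$ and $T_{P\to H}^{T}$ have been obtained earlier, we only need to determine $T_{H\to I}^{T}$, $T_{I\to I}^{T}$, $T_{P\to I}^{T}$, $T_{H\to P}^{T}$ and $T_{I\to P}^{T}$, which obey the following relations:

$$
\begin{aligned}
T_{H\to I}^{T}&=\frac{m-1}{m}\left(1+T_{I\to H}^{T}+T_{H\to I}^{T}\right)+\frac{1}{2m}\left(1+T_{I\to I}^{T}\right)+\frac{1}{2m},\\
T_{I\to I}^{T}&=\frac{1}{2}\left(1+T_{H\to I}^{T}\right)+\frac{1}{2}\left(1+T_{P\to I}^{T}\right),\\
T_{P\to I}^{T}&=\frac{1}{2}\left(1+T_{I\to I}^{T}\right)+\frac{1}{2},\\
T_{H\to P}^{T}&=\frac{m-1}{m}\left(1+T_{I\to H}^{T}+T_{H\to P}^{T}\right)+\frac{1}{m}\left(1+T_{I\to P}^{T}\right),\\
T_{I\to P}^{T}&=\frac{1}{2}\left(1+T_{H\to P}^{T}\right)+\frac{1}{2}.
\end{aligned}
$$

Using (C.2), the above equations are solved to obtain

$$
T_{H\to I}^{T}=6m-3, \tag{D.6}
$$

$$
T_{I\to I}^{T}=4m, \tag{D.7}
$$

$$
T_{P\to I}^{T}=2m+1, \tag{D.8}
$$

$$
T_{H\to P}^{T}=8m-4 \tag{D.9}
$$

and $T_{I\to P}^{T}=4m-1$.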
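(D.10)

Plugging (C.2)–(C.3) and (D.6)–(D.10) into (D.5) and (D.1), we obtain the explicit expression for the global mean hitting time $\langle T\rangle^{T}$ for TURW in $R_m^4$ and its dependence on the number of nodes $N_m=3m+1$, as given by (40).

Case II: NBCRW. For NBCRW on $R_m^4$, we can establish the following relations for the quantities $T_{H\to I}^{B}$, $T_{I\to I}^{B}$, $T_{P\to I}^{B}$, $T_{H\to P}^{B}$ and $T_{I\to P}^{B}$:

$$
\begin{aligned}
T_{H\to I}^{B}&=\frac{m-1}{m}\left(1+T_{I\to H}^{B}+T_{H\to I}^{B}\right)+\frac{1}{2m}\left(1+T_{I\to I}^{B}\right)+\frac{1}{2m},\\
T_{I\to I}^{B}&=\frac{x_H}{x_H+x_P}\left(1+T_{H\to I}^{B}\right)+\frac{x_P}{x_H+x_P}\left(1+T_{P\to I}^{B}\right),\\
T_{P\to I}^{B}&=\frac{1}{2}\left(1+T_{I\to I}^{B}\right)+\frac{1}{2},\\
T_{H\to P}^{B}&=\frac{m-1}{m}\left(1+T_{I\to H}^{B}+T_{H\to P}^{B}\right)+\frac{1}{m}\left(1+T_{I\to P}^{B}\right),\\
T_{I\to P}^{B}&=\frac{x_H}{x_H+x_P}\left(1+T_{H\to P}^{B}\right)+\frac{x_P}{x_H+x_P}.
\end{aligned}
$$

Considering (A.11) and (C.4), the above equations are solved to obtain

$$
T_{H\to I}^{B}=4m+2\sqrt{2m-1}-1-\frac{2\sqrt{2m-1}}{m}, \tag{D.11}
$$

$$
T_{I\to I}^{B}=4m, \tag{D.12}
$$

$$
T_{P\to I}^{B}=2m+1, \tag{D.13}
$$

$$
T_{H\to P}^{B}=\frac{4m^2\sqrt{2m-1}+2m^3+4m^2-2m\sqrt{2m-1}-6m+2}{m\sqrt{2m-1}} \tag{D.14}
$$

and

$$
T_{I\to P}^{B}=\frac{2m^2}{\sqrt{2m-1}}+2m-1. \tag{D.15}
$$

Substituting (C.4), the accompanying expression for $T_{P\to H}^{B}$, and (D.11)–(D.15) into (D.5) and (D.1) yields (41).

Case III: MERW. For MERW on $R_m^4$, we can also build relations among the related hitting times:

$$
T_{H\to I}^{M}=\frac{m-1}{m}\left(1+T_{I\to H}^{M}+T_{H\to I}^{M}\right)+\frac{1}{2m}\left(1+T_{I\to I}^{M}\right)+\frac{1}{2m}, \tag{Y.1}
$$

$$
\begin{aligned}
T_{I\to I}^{M}&=\frac{\mu_H}{\mu_H+\mu_P}\left(1+T_{H\to I}^{M}\right)+\frac{\mu_P}{\mu_H+\mu_P}\left(1+T_{P\to I}^{M}\right),\\
T_{P\to I}^{M}&=\frac{1}{2}\left(1+T_{I\to I}^{M}\right)+\frac{1}{2},\\
T_{H\to P}^{M}&=\frac{m-1}{m}\left(1+T_{I\to H}^{M}+T_{H\to P}^{M}\right)+\frac{1}{m}\left(1+T_{I\to P}^{M}\right),\\
T_{I\to P}^{M}&=\frac{\mu_H}{\mu_H+\mu_P}\left(1+T_{H\to P}^{M}\right)+\frac{\mu_P}{\mu_H+\mu_P}.
\end{aligned}
$$

Making use of (B.3) and (C.7), the above equations are solved to give

$$
T_{H\to I}^{M}=4m+1-\frac{2}{m},\quad T_{I\to I}^{M}=4m,\quad T_{P\to I}^{M}=2m+1,\quad T_{H\to P}^{M}=\frac{2(m+1)}{m}\left(m^2+m-1\right)\quad\text{and}\quad T_{I\to P}^{M}=2m(m+1)-1.
$$

Plugging the above results into (D.5) and (D.1) results in (41). This completes the proof of the theorem.

Author notes

Handling editor: Prudence Wong

© The British Computer Society 2018. All rights reserved. For permissions, please email: journals.permissions@oup.com

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices)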
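The class-level hitting times of Appendix D and the decomposition (D.5) can be checked against a brute-force computation of all pairwise hitting times. The Python sketch below is one way to do so, assuming NumPy. The function names and the node ordering (hub is node 0; petal $p$ holds internal nodes $3p+1$, $3p+2$ and peripheral node $3p+3$) are illustrative, not from the paper. The walk weights are taken from (A.11) and (B.3), up to a common scale in the NBCRW case, and each walk is assumed to move to a neighbor $j$ with probability proportional to its weight.

```python
import numpy as np

def rose_graph(m):
    """Adjacency matrix of R_m^4: m four-cycles (petals) sharing hub node 0."""
    n = 3 * m + 1
    A = np.zeros((n, n))
    for p in range(m):
        i1, i2, per = 3 * p + 1, 3 * p + 2, 3 * p + 3
        for u, v in [(0, i1), (0, i2), (i1, per), (i2, per)]:
            A[u, v] = A[v, u] = 1.0
    return A

def walk(A, xH, xI, xP):
    """P_ij = a_ij x_j / sum_k a_ik x_k with class-wise weights."""
    x = np.empty(A.shape[0])
    x[0], x[1::3], x[2::3], x[3::3] = xH, xI, xI, xP
    W = A * x[np.newaxis, :]
    return W / W.sum(axis=1, keepdims=True)

def weights(m):
    """Class weights (hub, internal, peripheral) for the three walks."""
    k = (2 * m - 1) ** 0.25  # kappa, from the ratios in Appendix C, Case II
    return {
        "TURW": (1.0, 1.0, 1.0),
        # (A.11) with the common normalizing square root dropped
        "NBCRW": (2 * m * k**3, k**2 * (k**2 + 2 * m - 1), k * (k**4 + 2 * m - 1)),
        # (B.3)
        "MERW": (np.sqrt(m / (2 * m + 2)), 1 / (2 * np.sqrt(m)),
                 1 / np.sqrt(m * (2 * m + 2))),
    }

def all_hitting_times(P):
    """T[i, j] = mean hitting time from i to j, one linear solve per target j."""
    n = P.shape[0]
    T = np.zeros((n, n))
    for j in range(n):
        keep = [i for i in range(n) if i != j]
        Q = P[np.ix_(keep, keep)]
        T[keep, j] = np.linalg.solve(np.eye(n - 1) - Q, np.ones(n - 1))
    return T

def class_times(T):
    """Class-level hitting times read off petal 0 (nodes 1, 2 internal; 3 peripheral)."""
    return {"IH": T[1, 0], "PH": T[3, 0], "HI": T[0, 1], "II": T[2, 1],
            "PI": T[3, 1], "HP": T[0, 3], "IP": T[1, 3]}

def total_via_D5(m, c):
    """Right-hand side of (D.5)."""
    a, b = 6 * m * m - 4 * m, 3 * m * m - 2 * m
    return (a * c["IH"] + b * c["PH"] + a * c["HI"] + 2 * m * c["II"]
            + 2 * m * c["PI"] + b * c["HP"] + 2 * m * c["IP"])

if __name__ == "__main__":
    m = 4
    A = rose_graph(m)
    n = A.shape[0]
    for name, w in weights(m).items():
        T = all_hitting_times(walk(A, *w))
        # brute-force total, (D.5) total, and global mean hitting time (D.1)
        print(name, T.sum(), total_via_D5(m, class_times(T)), T.sum() / (n * (n - 1)))
```

For each walk, the brute-force sum `T.sum()` and the value returned by `total_via_D5` should agree up to rounding, and dividing by $N_m(N_m-1)$ gives the global mean hitting time of (D.1).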

Journal

The Computer Journal, Oxford University Press

Published: Mar 24, 2018
