Uniform sampling of bipartite graphs with degrees in prescribed intervals

Abstract

Random sampling of bipartite graphs with given vertex degrees is an important and well-studied task with many applications in complex network analysis. While it is often assumed that the degrees of an observed network are precisely known, there are good reasons to assume that the true network might differ from the observed one. This motivates us to study the problem of sampling bipartite graphs uniformly at random from the set of all bipartite graphs whose degrees lie in prescribed intervals. In this article, we present a sampling algorithm that is based on the Markov chain Monte Carlo method and prove that the algorithm produces uniformly distributed samples. Since the proposed Markov chain has a high probability of staying at the same state, the sampling algorithm can be quite slow in practice. Therefore, we present a second Markov chain which is designed to converge faster to its stationary distribution at the price of a larger running time in each step. In a set of experiments with scale-free and near-regular networks, we evaluate both Markov chains and examine under which conditions the second Markov chain should be preferred to the first. We close our discussion with an application that demonstrates how our sampling methods can be used to study structural properties of partially observed networks.

1. Introduction

Bipartite networks are widely used to describe the relations between two different groups of objects. In ecology, for example, bipartite networks are used to model the interaction between species like plants and pollinators or to describe the presence or absence of species in geographical regions. The latter type of data is commonly represented as a presence–absence table, in which rows represent species and columns represent sites. If a species is present at a site, the associated table entry is set to one; otherwise, it is zero.
Prominent presence–absence tables like the famous 'Darwin's finches' data set (see Table 1) have been objects of intense study in the last decades.

Table 1. 'Darwin's finches': presence of 13 finch species on 17 islands of the Galápagos archipelago (data from [1]). Columns are islands; the last column $$\mathbf{r}$$ gives row sums and the last row $$\mathbf{c}$$ gives column sums.

Species   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17    r
A         0  0  1  1  1  1  1  1  1  1  0  1  1  1  1  1  1   14
B         1  1  1  1  1  1  1  1  1  1  0  1  0  1  1  0  0   13
C         1  1  1  1  1  1  1  1  1  1  1  1  0  1  1  0  0   14
D         0  0  1  1  1  0  0  1  0  1  0  1  1  0  1  1  1   10
E         1  1  1  0  1  1  1  1  1  1  0  1  0  1  1  0  0   12
F         0  0  0  0  0  0  0  0  0  0  1  0  1  0  0  0  0    2
G         0  0  1  1  1  1  1  1  1  0  0  1  0  1  1  0  0   10
H         0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0    1
I         0  0  1  1  1  1  1  1  1  1  0  1  0  0  1  0  0   10
J         0  0  1  1  1  1  1  1  1  1  0  1  0  1  1  0  0   11
K         0  0  1  1  1  0  1  1  0  1  0  0  0  0  0  0  0    6
L         0  0  1  1  0  0  0  0  0  0  0  0  0  0  0  0  0    2
M         1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1   17
c         4  4 11 10 10  8  9 10  8  9  3 10  4  7  9  3  3  122

Motivation

In sciences like ecology, empirical data are typically collected by the observation and classification of species in the field. However, due to limited resources, ecologists can devote only a restricted amount of time to data collection. Consequently, despite all effort and carefulness, the collected data are typically imperfect. In particular, species-interaction networks will most likely lack rarely occurring interactions between species and, similarly, rare species may falsely be declared absent in a presence–absence table. On the other hand, potential misclassification of species may introduce relations that are not present in reality. Thus, an observed network will often differ from the 'true' network at least slightly. One way to deal with such problems is to incorporate the imperfectness of the data into the network analysis. For example, by assuming that some edges of the true network remained unobserved, we can model the true network as an unknown element of a set of graphs whose node degrees are allowed to exceed the observed ones by a predefined value. Then, the expected structure of the true network can be approximated by drawing uniformly distributed samples from this set. The application of such a procedure requires a method for the uniform sampling of bipartite graphs whose degrees lie in prescribed intervals. In this article, we focus on such sampling algorithms.
Problem definition

Let $$m$$ and $$n$$ be positive integers and let $$\mathbf{r} = (r_1, \ldots, r_m)$$, $$\bar{\mathbf{r}} = (\bar{r}_1, \ldots, \bar{r}_m)$$, $$\mathbf{c} = (c_1, \ldots, c_n)$$ and $$\bar{\mathbf{c}} = (\bar{c}_1, \ldots, \bar{c}_n)$$ be integer vectors, the first two of length $$m$$ and the last two of length $$n$$. Let $$G=(U,V,E)$$ be a simple bipartite graph with $$U = \{ u_1, \ldots, u_m \}$$ and $$V =\{ v_1, \ldots, v_n \}$$ being disjoint vertex sets and $$E \subseteq U \times V$$ being a set of edges. Let $$\delta_G(v)$$ denote the degree of a vertex $$v \in U \cup V$$ in $$G$$. We define $$\Omega(\mathbf{r}, \bar{\mathbf{r}}, \mathbf{c}, \bar{\mathbf{c}})$$ to be the set of bipartite graphs $$G=(U,V,E)$$ such that $$r_i \leq \delta_G(u_i) \leq \bar{r}_i$$ and $$c_j \leq \delta_G(v_j) \leq \bar{c}_j$$ hold for each $$u_i \in U$$ and $$v_j \in V$$. The problem studied in this article is how to sample uniformly from the set $$\Omega(\mathbf{r}, \bar{\mathbf{r}}, \mathbf{c}, \bar{\mathbf{c}})$$.

Notation

While we consider the term network to refer to an 'informal concept describing an object composed of elements and interactions' [2], we use the term graph to denote an abstract mathematical representation of such objects. In such a representation, we neglect the original interpretation of the data and focus on its underlying structure. A bipartite graph $$G=(U,V,E)$$ can be uniquely represented by an $$m \times n$$ binary matrix $$M$$.
The matrix $$M = (m_{ij})$$ is called the bi-adjacency matrix of $$G$$ and is defined by
\[ m_{ij} = \begin{cases} 1,& \text{ if } \{u_i, v_j\} \in E\\ 0,& \text{ otherwise.} \end{cases} \]
Thus, the problem can be reformulated as uniformly sampling a binary matrix $$M \in \{0,1\}^{m \times n}$$ such that $$\mathbf{r}$$ and $$\bar{\mathbf{r}}$$ are lower and upper bounds on the row sums of $$M$$, and $$\mathbf{c}$$ and $$\bar{\mathbf{c}}$$ are lower and upper bounds on the column sums of $$M$$. In many cases, we do not need to distinguish between vertices from $$U$$ and $$V$$. Thus, we define $$d \colon U \cup V \to \mathbb{Z}$$ and $$\bar{d} \colon U \cup V \to \mathbb{Z}$$ to assign the lower and upper degree bounds to the set of vertices:
\[ d(v) = \begin{cases} r_i,& \text{if } v = u_i \in U\\ c_j,& \text{if } v = v_j \in V \end{cases} \qquad \bar{d}(v) = \begin{cases} \bar{r}_i,& \text{if } v = u_i \in U\\ \bar{c}_j,& \text{if } v = v_j \in V. \end{cases} \]
In cases where the integer vectors are known from the context, we will abbreviate $$\Omega(\mathbf{r}, \bar{\mathbf{r}}, \mathbf{c}, \bar{\mathbf{c}})$$ with $$\Omega$$.

Related work

To study the structural properties of networks, Connor and Simberloff [3] suggested statistical hypothesis testing. In such tests, the structure of an observed network is compared with that of randomly sampled null-networks. A much-discussed question is the choice of an appropriate null model that describes how properties of an observed network are reflected by the null-networks. To preserve the characteristics of the observed network, Connor and Simberloff suggested the uniform distribution over the set of bipartite graphs whose degrees match those of the observed one. Different null models with various advantages and disadvantages have been discussed during the last decades [4, 5]. However, the basic null model suggested by Connor and Simberloff is still widely applied.
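To make the matrix reformulation above concrete, the following minimal sketch checks whether a binary matrix lies in $$\Omega(\mathbf{r}, \bar{\mathbf{r}}, \mathbf{c}, \bar{\mathbf{c}})$$. It is our own illustration; the function name `in_omega` and the toy data are hypothetical, not taken from the article.

```python
# Our own illustration: check whether a binary matrix M lies in
# Omega(r, rbar, c, cbar), i.e. row sums within [r_i, rbar_i] and
# column sums within [c_j, cbar_j].
def in_omega(M, r, rbar, c, cbar):
    row_sums = [sum(row) for row in M]
    col_sums = [sum(col) for col in zip(*M)]
    return (all(lo <= s <= hi for lo, s, hi in zip(r, row_sums, rbar))
            and all(lo <= s <= hi for lo, s, hi in zip(c, col_sums, cbar)))

# A 3 x 4 example where each degree may exceed its lower bound by one.
M = [[1, 0, 1, 0],
     [0, 1, 1, 0],
     [1, 1, 0, 1]]
print(in_omega(M, [2, 2, 3], [3, 3, 4], [1, 1, 1, 0], [3, 3, 3, 2]))  # True
```

Setting $$\mathbf{r}=\bar{\mathbf{r}}$$ and $$\mathbf{c}=\bar{\mathbf{c}}$$ recovers the classical exact-degree membership test as a special case.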
Consequently, due to its high importance for network analysis, the uniform sampling of bipartite graphs with exactly prescribed degrees has received much attention. As this is the special case of our problem in which $$\mathbf{c} = \bar{\mathbf{c}}$$ and $$\mathbf{r} = \bar{\mathbf{r}}$$, we will briefly review the most important ideas of these methods. Methods for randomly constructing bipartite graphs whose degrees match a pair of given integer vectors $$\mathbf{r}$$ and $$\mathbf{c}$$ can be roughly divided into two groups. The first group of algorithms, like the ones presented in this article, follows the Markov chain Monte Carlo (MCMC) approach. Such algorithms simulate a Markov chain whose state space is defined as the set of bipartite graphs from which we want to draw samples and whose stationary distribution is uniform. In particular, the famous switch chain has been used for decades to randomize bipartite graphs. Here, the key idea is to randomly select two edges and swap their endpoints. It is well known that all bipartite graphs with fixed degrees can be generated by the iterated application of switches. The number of steps $$\tau(\varepsilon)$$ until the Markov chain's distribution is 'close' to its stationary distribution, up to a deviation of $$\varepsilon$$, is called the total mixing time of the Markov chain. Although the total mixing time can in principle be computed by complete enumeration of the Markov chain's state space [6], this is not a feasible approach in practice since the number of states typically grows exponentially with the size of the bipartite graph. A Markov chain is called rapidly mixing if its total mixing time can be bounded from above by a polynomial in the size of the input. Proving that a Markov chain is rapidly mixing is often a challenging task in theoretical computer science. Although the switch chain was proven to be rapidly mixing for several instance classes [7–9], it is unknown whether this is true in general.
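For completeness, the mixing time mentioned above admits a standard formalization in terms of total variation distance; the following is a textbook definition and is not spelled out in the article itself:

\[ \tau(\varepsilon) = \min \Big\{ t \colon \max_{x \in \Omega} \tfrac{1}{2} \sum_{y \in \Omega} \big| P^t(x,y) - \pi(y) \big| \leq \varepsilon \Big\}, \]

where $$P^t(x,\cdot)$$ denotes the distribution of the chain after $$t$$ steps when started in state $$x$$ and $$\pi$$ denotes its stationary distribution.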
There are several modifications of the classical switch chain, like the curveball algorithm presented by Strona et al. [10] or the algorithm presented by Verhelst [11]. Experimental studies showed that such algorithms are empirically more efficient than the classical switch chain [12]. Bezáková et al. [13] presented a simulated annealing algorithm which produces uniformly distributed bipartite graphs and whose running time is polynomially bounded. However, its running time is still too large to be of practical use. The second group of algorithms uses other approaches like rejection sampling or importance sampling [1, 14, 15]. Although they do not necessarily produce uniformly distributed samples, they can be used for the unbiased estimation of expected values, which is required by many applications. Harrison and Miller [14] showed that sequential importance sampling can be very efficient in practice, although there are counterexamples in which the approach behaves poorly [16]. Exploiting the connection between counting and sampling, Miller and Harrison showed that exact sampling of bipartite graphs is feasible for many real-world data sets [17]. In ecological null-network analysis, several methods have been discussed to create random bipartite graphs whose degrees are allowed to vary from the observed ones. As one of the simplest algorithms, the EE-algorithm [18] distributes edges equiprobably between each pair of vertices. Thereby, it produces random bipartite graphs that possess the same edge total as the observed network but whose vertex degrees may vary entirely at random. To approximately preserve the degrees of the observed network, a family of so-called PP-algorithms can be used to produce bipartite graphs in which the probability of an edge is proportional to the corresponding vertex degrees in the observed network [19].
Ulrich and Gotelli [20] present a PP-algorithm that produces random bipartite graphs whose degrees may vary freely, but in which the expected values of the degrees are intended to match the observed network. Thus, the degrees of a null network will be distributed randomly around the observed ones. In contrast to the PP-approach, our model assumes that the 'true' network is unknown and might be different from the observed one. Consequently, our method is not intended to produce random samples whose edge total matches that of the observed network. Instead, we allow the edge total to vary within the margins specified by the intervals. In particular, our method may be used to create null networks whose degrees are at least as large as the observed ones, which is not possible with the EE- or PP-approach. In addition, to the best of our knowledge, there is no formal proof that the proportional algorithms sample uniformly.

Contribution

In this article, we present two sampling algorithms based on the MCMC method. These algorithms simulate Markov chains whose state space is the set $$\Omega$$. We prove that the stationary distribution of these Markov chains is uniform, thereby showing that the sampling algorithms can be used for the uniform sampling of bipartite graphs whose degrees lie in prescribed intervals. As shown by our experiments, the first Markov chain is slowed down by a high number of so-called loop transitions, which do not change the state of the Markov chain. For this reason, we designed the second Markov chain to avoid loop transitions at the price of a larger running time in each step. Since the running time of an MCMC algorithm is determined by the convergence of its underlying Markov chain and the running time spent in each step, it is not immediately clear under which conditions the second chain should be preferred to the first.
In a set of experiments, we evaluate both Markov chains and find that the second Markov chain works very well if the given degree intervals describe scale-free networks. In contrast, the first chain is more efficient if the degrees are nearly regular. We conclude this article by showing how our sampling methods can be applied to study the properties of partially observed networks. For this purpose, we assume that 'Darwin's finches' are just a partial observation of a larger 'true' network and show how our sampling methods may be used to approximate some expected properties of this network.

2. Methods

Our sampling algorithm is based on the famous MCMC method. Thus, we simulate a Markov chain whose state space is the set $$\Omega$$ and whose stationary distribution is uniform. Starting with an arbitrary bipartite graph $$G \in \Omega$$, we randomly modify $$G$$ to create a state $$G' \in \Omega$$. Afterwards, we set $$G \gets G'$$ and iterate. This produces a series of graphs $$G_0, G_1, \ldots, G_t$$ from $$\Omega$$. After $$t$$ steps, we stop the process and return $$G_t$$ as a random sample. One of the major challenges of the MCMC method is to choose the number of steps $$t$$ large enough that the sample $$G_t$$ is 'sufficiently random'.

Initial state

The MCMC method requires an arbitrary graph $$G_0 = (U,V,E) \in \Omega$$ that is used as the initial state. In many applications, an initial graph is given by the observed network. However, if no observed network is at hand, the first step needs to be the construction of an appropriate bipartite graph whose degrees lie within the prescribed bounds. This is a variant of the classical graph realization problem, which can be solved in $$O(|U| + |V| + |E|)$$ time [21].

2.1 Simple Markov chain

The following Markov chain can be used for the uniform sampling of bipartite graphs from the set $$\Omega$$.
The Markov chain is based on the three operations switch, flip and shift that are used to transform a bipartite graph $$G \in \Omega$$ into $$G' \in \Omega$$ (see Fig. 1).

Fig. 1. Visualization of a switch (left), flip (middle) and shift operation (right).

Definition 2.1 (Switch) Let $$G = (U,V, E) \in \Omega$$. Randomly choose four vertices $$u_i, u_k \in U$$ and $$v_j, v_l \in V$$ without replacement. If $$\{u_i, v_j\} \in E$$, $$\{u_k, v_l\} \in E$$, $$\{u_i, v_l\} \not\in E$$ and $$\{u_k, v_j\} \not\in E$$, return the graph $$G'=(U,V,E')$$ where $$E' = \left(E \setminus \left\{ \{u_i, v_j\}, \{u_k, v_l\} \right\} \right) \cup \left\{ \{u_i, v_l\}, \{u_k, v_j\} \right\}$$. If $$\{u_i, v_l\} \in E$$, $$\{u_k, v_j\} \in E$$, $$\{u_i, v_j\} \not\in E$$ and $$\{u_k, v_l\} \not\in E$$, return the graph $$G'=(U,V,E')$$ where $$E' = \left(E \setminus \left\{ \{u_i, v_l\}, \{u_k, v_j\} \right\} \right) \cup \left\{ \{u_i, v_j\}, \{u_k, v_l\} \right\}$$. Otherwise, return $$G$$.

It is easy to verify that the probability $$p_{\text{switch}}(G,G') = \binom{m}{2}^{-1}\binom{n}{2}^{-1}$$ of transforming a graph $$G$$ into $$G' \not= G$$ via a switch operation is equal to $$p_{\text{switch}}(G',G)$$, since it depends only on the number of vertices, which is the same for each pair of graphs $$G, G' \in \Omega$$.

Definition 2.2 (Flip) Let $$G = (U,V, E) \in \Omega$$. Choose two vertices $$u \in U$$ and $$v \in V$$ uniformly at random. If $$\{u,v\} \in E$$, set $$E' = E \setminus \left\{ \{u,v \} \right\}$$; otherwise, set $$E' = E \cup \left\{ \{u,v\} \right\}$$. Set $$G' = (U,V,E')$$. If $$d(x) \leq \delta_{G'}(x) \leq \bar{d}(x)$$ for each $$x \in \{u,v\}$$, return $$G'$$. Otherwise, return $$G$$.

Unlike the switch and shift operations, the flip changes the total number of edges in $$G$$.
The probability of transforming a graph $$G$$ into $$G' \not= G$$ via a single flip operation is $$p_{\text{flip}}(G,G') = m^{-1}n^{-1} = p_{\text{flip}}(G',G)$$.

Definition 2.3 (Shift) Let $$G = (U,V, E) \in \Omega$$.
1. Choose two vertices $$u \in U$$ and $$v \in V$$ uniformly at random.
2. Choose a vertex $$w$$ from $$(U \cup V) \setminus \{u,v\}$$ uniformly at random.
3. If $$w \in V$$, switch the roles of $$u$$ and $$v$$.
4. If $$\{u,v\} \not\in E$$ and $$\{w,v\} \in E$$, replace the edge $$\{ w,v\}$$ by $$\{u,v\}$$: create $$G'=(U,V,E')$$ with $$E'= (E \setminus \left\{ \{w,v\} \right\}) \cup \left\{ \{ u,v \} \right\}$$ and go to Step 7.
5. If $$\{u,v\} \in E$$ and $$\{w,v\} \not\in E$$, replace the edge $$\{ u,v\}$$ by $$\{w,v\}$$: create $$G'=(U,V,E')$$ with $$E'= (E \setminus \left\{ \{u,v\} \right\}) \cup \left\{ \{ w,v \} \right\}$$ and go to Step 7.
6. In all other cases, return $$G$$.
7. If $$d(x) \leq \delta_{G'}(x) \leq \bar{d}(x)$$ for each $$x \in \{u,w \}$$, return $$G'$$; otherwise, return $$G$$.

The shift operation changes only the degrees of the vertices $$u$$ and $$w$$; the degrees of all other vertices remain unchanged. Transforming $$G$$ into $$G' \not=G$$ via a single shift operation has probability $$p_{\text{shift}}(G,G') = m^{-1}n^{-1}(m+n-2)^{-1} = p_{\text{shift}}(G',G)$$. We can now define our first Markov chain.

Definition 2.4 (Simple Markov chain) Let $$G = (U,V, E) \in \Omega$$ and let $$q \colon \{ \text{switch}, \text{flip}, \text{shift} \} \to [0,1]$$ be an arbitrary probability function that assigns positive probability to each operation.
1. Randomly select one of the three operations switch, flip and shift with respect to the probability function $$q$$.
2. Apply the operation to transform $$G$$ into $$G'$$.
3. Set $$G \gets G'$$ and go to Step 1.

Proof of correctness

In the remainder of this section, we will prove that the simple chain can be used for uniform sampling from the set $$\Omega$$.
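Before turning to the proof, one step of the simple chain (Definitions 2.1 to 2.4) can be summarized in a short sketch. This is our own illustration, not code from the article: we assume the graph is stored as a set of index pairs (i, j) with i indexing U and j indexing V, that m, n >= 2, and that q is uniform over the three operations.

```python
import random

# Sketch of one step of the simple Markov chain; all names and the data
# layout (edge set of (i, j) pairs) are our own illustrative choices.
def chain_step(edges, m, n, d_lo_u, d_hi_u, d_lo_v, d_hi_v, rng=random):
    op = rng.choice(("switch", "flip", "shift"))  # q = 1/3 each (assumed)
    if op == "switch":
        i, k = rng.sample(range(m), 2)
        j, l = rng.sample(range(n), 2)
        # Degrees are unchanged by a switch, so no bound checks are needed.
        if {(i, j), (k, l)} <= edges and not {(i, l), (k, j)} & edges:
            return (edges - {(i, j), (k, l)}) | {(i, l), (k, j)}
        if {(i, l), (k, j)} <= edges and not {(i, j), (k, l)} & edges:
            return (edges - {(i, l), (k, j)}) | {(i, j), (k, l)}
        return edges
    deg_u, deg_v = [0] * m, [0] * n
    for a, b in edges:
        deg_u[a] += 1
        deg_v[b] += 1
    if op == "flip":
        i, j = rng.randrange(m), rng.randrange(n)
        if (i, j) in edges:
            if deg_u[i] - 1 >= d_lo_u[i] and deg_v[j] - 1 >= d_lo_v[j]:
                return edges - {(i, j)}
        elif deg_u[i] + 1 <= d_hi_u[i] and deg_v[j] + 1 <= d_hi_v[j]:
            return edges | {(i, j)}
        return edges
    # shift: a third vertex w from the remaining m + n - 2 vertices
    u, v = rng.randrange(m), rng.randrange(n)
    others = [("U", a) for a in range(m) if a != u] \
           + [("V", b) for b in range(n) if b != v]
    side, w = rng.choice(others)
    if side == "U":  # move the U-endpoint of an edge incident to v
        if (u, v) not in edges and (w, v) in edges:
            if deg_u[u] + 1 <= d_hi_u[u] and deg_u[w] - 1 >= d_lo_u[w]:
                return (edges - {(w, v)}) | {(u, v)}
        elif (u, v) in edges and (w, v) not in edges:
            if deg_u[w] + 1 <= d_hi_u[w] and deg_u[u] - 1 >= d_lo_u[u]:
                return (edges - {(u, v)}) | {(w, v)}
        return edges
    # w in V: the roles of u and v are switched (Step 3 of Definition 2.3)
    if (u, v) not in edges and (u, w) in edges:
        if deg_v[v] + 1 <= d_hi_v[v] and deg_v[w] - 1 >= d_lo_v[w]:
            return (edges - {(u, w)}) | {(u, v)}
    elif (u, v) in edges and (u, w) not in edges:
        if deg_v[w] + 1 <= d_hi_v[w] and deg_v[v] - 1 >= d_lo_v[v]:
            return (edges - {(u, v)}) | {(u, w)}
    return edges
```

Iterating `chain_step` from any initial graph in $$\Omega$$ keeps every intermediate graph inside $$\Omega$$, since each operation rejects moves that would violate the degree bounds, realizing the loop transitions discussed above.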
For this purpose, it suffices to show that the Markov chain is irreducible, aperiodic and reversible with respect to the uniform distribution on $$\Omega$$. It then follows from the fundamental theorem of Markov chains (see, e.g. standard textbooks like [22]) that the Markov chain's stationary distribution is uniform.

Definition 2.5 Let $$p \colon \Omega \times \Omega \to [0,1]$$ be the transition probability of a Markov chain with state space $$\Omega$$. The directed graph $$\Gamma = (\Omega, \Phi)$$ with $$\Phi = \{ (x,y) \colon p(x,y) > 0 \}$$ is called the state graph of the corresponding Markov chain. A Markov chain is called irreducible if $$\Gamma$$ is strongly connected. It is called aperiodic if $$\Gamma$$ is not bipartite, and it is called reversible with respect to a probability function $$\pi \colon \Omega \to [0,1]$$ if $$\pi(x)p(x,y) = \pi(y) p(y,x)$$ for each $$x,y \in \Omega$$.

We start by proving the irreducibility of the chain. For the following theorems, let $$G=(U,V,E)$$ and $$G'=(U,V,E')$$ be arbitrary bipartite graphs from the set $$\Omega$$.

Definition 2.6 Let $$G \triangle G' := (E\setminus E') \cup (E' \setminus E)$$ be the symmetric difference of the edges of $$G$$ and $$G'$$. A sequence of nodes $$p=(v_1, v_2, \ldots, v_k)$$ is called an alternating path (in $$G \triangle G'$$) if the edges $$\{v_i, v_{i+1}\}$$ formed by consecutive vertices lie alternatingly in $$E \setminus E'$$ and $$E' \setminus E$$. An alternating path $$p$$ is called non-extendable if there is no alternating path containing $$p$$ as a proper subpath. An alternating path is called simple if $$v_i \not= v_j$$ for $$1 \leq i \neq j \leq k$$. A sequence of nodes $$(v_1, v_2, \ldots, v_{k}, v_1)$$ is called an alternating cycle (in $$G \triangle G'$$) if $$(v_1, v_2, \ldots, v_{k})$$ is an alternating path. An alternating cycle is called simple if it does not contain a smaller alternating cycle.
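As a tiny concrete example of these notions (our own illustration; the vertex indices and helper name are hypothetical), the two edge sets below differ by a single alternating cycle of length four, and one switch dissolves it completely:

```python
# Our own illustration of the symmetric difference G △ G' and of a switch
# removing an alternating cycle from it.
def sym_diff(E1, E2):
    """Symmetric difference of two edge sets."""
    return (E1 - E2) | (E2 - E1)

E_G  = {(0, 0), (1, 1)}   # edges of G
E_Gp = {(0, 1), (1, 0)}   # edges of G'
print(len(sym_diff(E_G, E_Gp)))  # 4: one alternating cycle of length four

# A switch on G replaces {(0,0), (1,1)} by {(0,1), (1,0)}:
E_switched = (E_G - {(0, 0), (1, 1)}) | {(0, 1), (1, 0)}
print(len(sym_diff(E_switched, E_Gp)))  # 0: the cycle is dissolved
```

This is the base case $$k=4$$ of the argument that follows: a single switch shrinks the symmetric difference by four edges.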
In the special case of $$\mathbf{r}=\bar{\mathbf{r}}$$ and $$\mathbf{c}=\bar{\mathbf{c}}$$, the symmetric difference $$G\triangle G'$$ can be partitioned into a set of simple alternating cycles. Furthermore, in this case, it is well known that an alternating cycle $$(v_1, v_2, \ldots, v_k, v_1)$$ in $$G \triangle G'$$ can be completely dissolved via a series of at most $$k/2$$ switch operations. Thus, when $$\mathbf{r}=\bar{\mathbf{r}}$$ and $$\mathbf{c}=\bar{\mathbf{c}}$$, the graphs $$G$$ and $$G'$$ can be transformed into each other by at most $$|G \triangle G'|/2$$ switch operations. By the following theorems, we will prove a similar result for our more general scenario.

Lemma 2.1 Let $$(v_1, v_2, \ldots, v_{k}, v_1)$$ be a simple alternating cycle in $$G \triangle G'$$. We can apply a switch operation to $$G$$ or $$G'$$, thereby transforming $$G$$ into $$\bar{G}$$ or $$G'$$ into $$\bar{G}'$$, such that $$|G \triangle G'| - |\bar{G} \triangle \bar{G}'| \geq 2$$.

Proof. We prove the lemma by induction on the length $$k$$ of the alternating cycle. Assume that $$\{v_1, v_2\} \in E$$ and $$\{v_{k-1}, v_{k}\} \in E$$; otherwise, switch the roles of $$G$$ and $$G'$$. If $$k=4$$, we can immediately apply a switch to $$G$$ which transforms $$G$$ into $$\bar{G} = (U,V, (E \setminus \{ \{v_1, v_2\}, \{v_3, v_4\} \}) \cup \{ \{v_1, v_4\}, \{v_2, v_3\} \})$$ and leaves $$\bar{G}' \gets G'$$ unchanged. Thus, we have $$|G \triangle G'| - |\bar{G} \triangle \bar{G}'| = 4$$. Now let $$k > 4$$ and assume the lemma has been proven for alternating cycles of length $$i < k$$. Depending on whether or not the edge $$\{v_1, v_4\}$$ exists in $$G$$ and $$G'$$, four cases can occur (see Fig. 2). If $$\{v_1, v_4\} \not\in E$$ and $$\{v_1, v_4\} \not\in E'$$, we can replace the edges $$\{v_1, v_2\} \in E$$ and $$\{v_3, v_4\} \in E$$ by $$\{v_1, v_4\}$$ and $$\{v_2, v_3\}$$.
Thus, a switch operation may transform $$G$$ into $$\bar{G} = (U,V, \bar{E})$$ with $$\bar{E} = (E \setminus \left\{ \{v_1, v_2\}, \{v_3, v_4\} \right\}) \cup \left\{ \{v_1, v_4\}, \{v_2, v_3\} \right\}$$. We set $$\bar{G}' \gets G'$$ and have $$|G \triangle G'| - |\bar{G} \triangle \bar{G}'| = 2$$. If $$\{v_1, v_4\} \not\in E$$ and $$\{v_1, v_4\} \in E'$$, we can replace the edges $$\{v_1, v_2\} \in E$$ and $$\{v_3, v_4\} \in E$$ by $$\{v_1, v_4\}$$ and $$\{v_2, v_3\}$$. Thus, a switch operation may transform $$G$$ into $$\bar{G} = (U,V, \bar{E})$$ with $$\bar{E} = (E \setminus \left\{ \{v_1, v_2\}, \{v_3, v_4\} \right\}) \cup \left\{ \{v_1, v_4\}, \{v_2, v_3\} \right\}$$. We set $$\bar{G}' \gets G'$$ and have $$|G \triangle G'| - |\bar{G} \triangle \bar{G}'| = 4$$. Remarkably, this does not create a smaller alternating cycle. Instead, it creates the alternating path $$(v_4, v_5, \ldots, v_k, v_1)$$ in $$\bar{G} \triangle \bar{G}'$$ of length $$k-2$$. If $$\{v_1, v_4\} \in E$$ and $$\{v_1, v_4\} \not\in E'$$, we do not directly find an applicable switch operation. However, there is an alternating cycle $$(v_1, v_4, \ldots, v_k, v_1)$$ of length $$k-2$$. Thus, by the induction hypothesis, we can find a switch operation that transforms $$G$$ into $$\bar{G}$$ or $$G'$$ into $$\bar{G}'$$ such that $$|G \triangle G'| - |\bar{G} \triangle \bar{G}'| \geq 2$$. If $$\{v_1, v_4\} \in E$$ and $$\{v_1, v_4\} \in E'$$, we can replace the edges $$\{v_1, v_4\} \in E'$$ and $$\{v_2, v_3\} \in E'$$ by $$\{v_1, v_2\}$$ and $$\{v_3, v_4\}$$. Thus, a switch operation may transform $$G'$$ into $$\bar{G}' = (U,V, \bar{E}')$$ with $$\bar{E}' = (E' \setminus \left\{ \{v_1, v_4\}, \{v_2, v_3\} \right\}) \cup \left\{ \{v_1, v_2\}, \{v_3, v_4\} \right\}$$. We set $$\bar{G} \gets G$$ and have $$|G \triangle G'| - |\bar{G} \triangle \bar{G}'| = 2$$. □
Fig. 2. Example of an alternating cycle $$c=(v_1, v_2, \ldots, v_6, v_1)$$ of length $$k=6$$. Straight lines are present in $$G$$, and dashed lines are present in $$G'$$. From left to right: cases one to four of Lemma 2.1.

Lemma 2.2 Let $$(v_1, v_2, \ldots, v_k)$$ be a simple non-extendable alternating path in $$G \triangle G'$$. (a) If $$\{v_1, v_2\} \in E$$, we may remove an edge $$\{v_1, \cdot\}$$ from $$E$$ or add a new edge $$\{v_1, \cdot\}$$ to $$E'$$ without violating the lower and upper bounds on the degree of $$v_1$$. (b) If $$\{v_1, v_2\} \in E'$$, we may add a new edge $$\{v_1, \cdot\}$$ to $$E$$ or remove an edge $$\{v_1, \cdot\}$$ from $$E'$$ without violating the lower and upper bounds on the degree of $$v_1$$.

Proof. Assume $$\{v_1, v_2\} \in E$$; otherwise, switch the roles of $$G$$ and $$G'$$. Since the path is non-extendable, there is no edge $$\{v_1, \cdot\} \in E' \setminus E$$. Thus, $$\delta_G(v_1)$$ exceeds $$\delta_{G'}(v_1)$$ by at least one. Since $$G$$ and $$G'$$ are valid realizations and thus do not violate the lower and upper bounds on the degree of $$v_1$$, the inequalities $$d(v_1) \leq \delta_{G'}(v_1) < \delta_G(v_1) \leq \bar{d}(v_1)$$ must hold. Hence, adding an additional edge $$\{v_1, \cdot\}$$ to $$E'$$ or removing an edge $$\{v_1, \cdot\}$$ from $$E$$ does not violate the lower and upper bounds on the degree of $$v_1$$. □

Lemma 2.3 Let $$p=(v_1, v_2, \ldots, v_{k-1}, v_{k})$$ be a simple non-extendable alternating path in $$G \triangle G'$$. By a series of at most two flip, switch or shift operations, we can transform $$G$$ into $$\bar{G}$$ and $$G'$$ into $$\bar{G}'$$ such that $$|G \triangle G'| - |\bar{G} \triangle \bar{G}'| \geq 1$$.
Proof. First, let $$p$$ be of odd length; thus, $$k$$ is even. Assume that $$\{v_1, v_2\} \in E$$ and $$\{v_{k-1}, v_{k}\} \in E$$; otherwise, switch the roles of $$G$$ and $$G'$$. Depending on the existence of the edge $$\{v_1, v_{k}\}$$ in $$G$$ and $$G'$$, we discuss four cases (see Fig. 3). If $$\{v_1, v_{k}\} \not\in E$$ and $$\{v_1, v_{k} \} \not\in E'$$, Lemma 2.2 allows us to add the edge $$\{v_1, v_{k}\}$$ to $$E'$$. Thus, a flip operation may transform $$G'$$ into $$\tilde{G}' = (U,V, \tilde{E}')$$ with $$\tilde{E}' = E' \cup \left\{ \{v_1, v_{k}\} \right\}$$. By introducing a new edge, we temporarily create a situation where $$|G \triangle G'| - |G \triangle \tilde{G}'| = -1$$. However, we additionally construct the alternating cycle $$(v_1, v_2, \ldots, v_k, v_1)$$ in $$G \triangle \tilde{G}'$$. By Lemma 2.1, we can find a switch operation applicable to $$G$$ or $$\tilde{G}'$$ leading to $$\bar{G}$$ and $$\bar{G}'$$ such that $$|G \triangle G'| - |\bar{G} \triangle \bar{G}'| \geq 1$$. If $$\{v_1, v_{k}\} \not\in E$$ and $$\{v_1, v_{k} \} \in E'$$, the closed path $$(v_1, v_2, \ldots, v_{k}, v_1)$$ is an alternating cycle in $$G \triangle G'$$. Thus, the path $$p$$ is not a non-extendable alternating path. Since this contradicts our assumption, this case cannot occur. If $$\{v_1, v_{k}\} \in E$$ and $$\{v_1, v_{k}\} \not\in E'$$, Lemma 2.2 allows us to remove the edge $$\{v_1, v_k\}$$ from $$E$$. Thus, a flip operation may transform $$G$$ into $$\bar{G} = (U,V, \bar{E})$$ with $$\bar{E} = E \setminus \{ \{v_1, v_{k}\} \}$$. We set $$\bar{G}' \gets G'$$ and have $$|G \triangle G'| - |\bar{G} \triangle \bar{G}'| = 1$$. If $$\{v_1, v_{k}\} \in E$$ and $$\{v_1, v_{k}\} \in E'$$, Lemma 2.2 allows us to remove the edge $$\{v_1, v_k\}$$ from $$E$$. Thus, a flip operation may transform $$G$$ into $$\tilde{G} = (U,V, \tilde{E})$$ with $$\tilde{E} = E \setminus \{ \{v_1, v_{k}\} \}$$.
By removing an edge, we temporarily create a situation in which $$|G \triangle G'| - | \tilde{G} \triangle G'| = -1$$. However, we also create the alternating cycle $$(v_1, v_2, \ldots, v_{k}, v_1)$$ in $$\tilde{G} \triangle G'$$. By Lemma 2.1, we can find a switch operation applicable to either $$\tilde{G}$$ or $$G'$$ leading to $$\bar{{{G}}}$$ and $$\bar{{{G}}}'$$ such that $$|G \triangle G'| - |\bar{{{G}}} \triangle \bar{{{G}}}'| \geq 1$$. Now let $$p$$ be of even length, thus $$k$$ is odd. In this case, the edge $$\{v_1, v_{k}\}$$ can neither exist in $$E$$ nor in $$E'$$ since $$v_1$$ and $$v_{k}$$ are in the same vertex set. However, it suffices to focus on the existence of the edge $$\{v_1, v_{k-1} \}$$ in $$G$$ and $$G'$$ (see Fig. 4). If $$\{v_1, v_{k-1} \} \not\in E$$ and $$\{ v_1, v_{k-1} \} \not\in E'$$, Lemma 2.2 allows us to replace the edge $$\{ v_{k-1}, v_k\} \in E'$$ by $$\{v_{k-1}, v_1 \}$$. Thus, a shift operation may transform $$G'$$ into $$\tilde{G}'=(U,V, \tilde{E}')$$ with $$\tilde{E}' = (E' \setminus \left\{ \{v_{k-1}, v_{k} \} \right\}) \cup \left\{ \{ v_{k-1}, v_1 \} \right\}$$. While this leads to a situation where $$|G \triangle G'| - |G \triangle \tilde{G}'| = 0$$, we also create the alternating cycle $$(v_1, \ldots, v_{k-1}, v_1)$$ in $$G \triangle \tilde{G}'$$. By Lemma 2.1, we can find a switch operation applicable to $$G$$ or $$\tilde{G}'$$ leading to $$\bar{{{G}}}$$ and $$\bar{{{G}}}'$$ such that $$|G \triangle G'| - |\bar{{{G}}} \triangle \bar{{{G}}}'| \geq 2$$. If $$\{v_1, v_{k-1} \} \not\in E$$ and $$\{v_1, v_{k-1}\} \in E'$$, the closed path $$(v_1, v_2, \ldots, v_{k-1}, v_1)$$ is an alternating cycle in $$G \triangle G'$$. Thus, the path $$p$$ is not a non-extendable alternating path. Since this contradicts our assumption, this case cannot occur. If $$\{v_1, v_{k-1}\} \in E$$ and $$\{v_1, v_{k-1} \} \not\in E'$$, Lemma 2.2 allows us to replace the edge $$\{v_1, v_{k-1}\} \in E$$ by $$\{v_k, v_{k-1}\}$$.
Thus, a shift operation transforms $$G$$ into $$\bar{{{G}}}=(U,V,\bar{{{E}}})$$ with $$ \bar{{{E}}} = (E \setminus \left\{ \{v_1, v_{k-1} \} \right\}) \cup \left\{ \{v_{k-1}, v_{k}\} \right\}$$. We set $$\bar{{{G}}}' \gets G'$$ and have $$|G \triangle G'| - |\bar{{{G}}} \triangle \bar{{{G}}}'| = 2$$. If $$\{v_1, v_{k-1} \} \in E$$ and $$\{ v_1, v_{k-1} \} \in E'$$, Lemma 2.2 allows us to replace the edge $$\{v_1, v_{k-1}\} \in E$$ by $$\{v_k, v_{k-1}\}$$. Thus, a shift operation transforms $$G$$ into $$\tilde{G}=(U,V, \tilde{E})$$ with $$\tilde{E} = (E \setminus \left\{ \{v_1, v_{k-1}\} \right\}) \cup \left\{ \{v_{k}, v_{k-1}\} \right\}$$. While this leads to a situation where $$|G \triangle G'| - |\tilde{G} \triangle G'| = 0$$, we also create the alternating cycle $$(v_1, \ldots, v_{k-1}, v_1)$$ in $$\tilde{G} \triangle G'$$. By Lemma 2.1 we can find a switch operation applicable to $$\tilde{G}$$ or $$G'$$ leading to $$\bar{{{G}}}$$ and $$\bar{{{G}}}'$$ such that $$|G \triangle G'| - |\bar{{{G}}} \triangle \bar{{{G}}}'| \geq 2$$. □ Fig. 3. Example of a non-extendable alternating path $$p=(v_1, v_2, \ldots, v_6)$$ of odd length ($$k=6$$). Straight lines are present in $$G$$, and dashed lines are present in $$G'$$. From left to right: cases one to four of the first part of Lemma 2.3. Fig. 4. Example of a non-extendable alternating path $$p=(v_1, v_2, \ldots, v_5)$$ of even length ($$k=5$$). Straight lines are present in $$G$$, and dashed lines are present in $$G'$$. From left to right: cases one to four of the second part of Lemma 2.3.
Lemma 2.4 Let $$G$$ and $$G'$$ be arbitrary graphs from $$\Omega$$. By a series of at most $$2|G \triangle G'|$$ switch, flip and shift operations, we can transform $$G$$ and $$G'$$ into each other. Proof. We prove the lemma by induction on the size of $$G \triangle G'$$. If $$|G \triangle G'| = 0$$, the graphs must be identical. Otherwise, apply the following algorithm to construct either a simple non-extendable alternating path $$p$$ in $$G \triangle G'$$ or a simple alternating cycle $$c$$ in $$G \triangle G'$$. Choose an arbitrary edge $$\{v, w\} \in G \triangle G'$$ and let $$p\gets (v, w)$$. Let $$z \gets w$$ be the rightmost node of the path. While there is some edge $$\{z, y\} \in G \triangle G'$$ that can be used to extend $$p$$ (preserving its alternating structure), append $$y$$ to $$p$$. If this produces a cycle, terminate. Otherwise, let $$z \gets y$$ and iterate until $$p$$ cannot be extended. Now reverse the path $$p$$, let $$z \gets v$$ be the rightmost node and repeat the extension of $$p$$. The algorithm terminates with a simple non-extendable path or detects a simple alternating cycle. In the first case, we can reduce the size of $$G \triangle G'$$ via Lemma 2.3, while in the latter case, we can apply Lemma 2.1. Thus, we can inductively reduce $$|G \triangle G'|$$ to zero. The upper bound on the number of operations follows from Lemmas 2.3 and 2.1 since we need at most two operations to reduce the symmetric difference by at least one. □ Lemma 2.5 The simple Markov chain is aperiodic. Proof. Consider an arbitrary bipartite graph $$G = (U,V,E) \in \Omega$$. If $$|E| \leq 1$$, all switch operations will return $$G' = G$$.
Otherwise, if $$|E| > 1$$, a shift operation may select three vertices $$u$$, $$v$$ and $$w$$ such that $$\{u,v\} \in E$$ and $$\{v,w\} \in E$$. Thus, we return $$G$$ with positive probability. As the state graph possesses at least one loop arc, it cannot be bipartite. Thus, the Markov chain is aperiodic. □ Lemma 2.6 The simple Markov chain is reversible with respect to the uniform distribution. Proof. We showed that each operation can be reversed with the same probability. Hence, the transition probability $$p(G,G') = p(G',G)$$ is symmetric for each pair of states $$G, G' \in \Omega$$. Thus, the Markov chain is reversible with respect to the uniform distribution $$\pi(G) = |\Omega|^{-1}$$ for each $$G \in \Omega$$. □ Theorem 2.7 The stationary distribution of the simple Markov chain is the uniform distribution on $$\Omega$$. Proof. This follows from Lemmas 2.4, 2.5 and 2.6 and the fundamental theorem of Markov chains. □ 2.2 Informed Markov chain As shown in the previous section, the simple chain can be used for the uniform sampling of bipartite graphs from the set $$\Omega$$. However, a drawback of this chain is the large number of loop transitions that arise from the uninformed way in which it selects random vertices. In this section, we present a second Markov chain that is designed to converge faster to its stationary distribution at the price of an increased running time per step. It is based on two key ideas. First, we want to reduce the number of loop transitions by a more informed vertex selection. Second, we increase the number of edges that are affected by a single transition. Inspired by the Curveball algorithm [10], we replace the simple operations switch, shift and flip by their more complex counterparts trade, multi-shift and multi-flip. The main difference between the simple and complex operations is the number of affected edges per transition.
Whereas the simple operations affect at most four edges per step, the number of edges affected by a complex operation is $${O}(|V|)$$. We start with defining three new operations. Definition 2.8 (Trade) Let $$G=(U,V,E) \in \Omega$$. 1. Choose two vertices $$u_i, u_k \in U$$ uniformly at random without replacement. 2. Determine the sets of vertices $$V_i$$, $$V_k$$ and $$V_{ik}$$ such that   \begin{align*} V_i &= \left\{ v \colon \{u_i, v\} \in E \wedge \{u_k, v\} \not\in E \right\}\!, \\ V_k &= \left\{ v \colon \{u_i, v\} \not\in E \wedge \{u_k, v\}\in E \right\}\!, \\ V_{ik} &= \left\{ v \colon \{u_i,v\} \in E \wedge \{u_k,v\} \in E \right\}\!. \end{align*} Thus, $$V_i$$ is the set of vertices that are adjacent to $$u_i$$ but not to $$u_k$$, $$V_k$$ is the set of vertices that are adjacent to $$u_k$$ but not to $$u_i$$ and $$V_{ik}$$ is the set of vertices that are adjacent to both $$u_i$$ and $$u_k$$. 3. Choose a subset $$S \subseteq V_i \cup V_k$$ of size $$|V_i|$$ uniformly at random. 4. Create the new sets of edges   \begin{align*} E'_i &= \left\{ \{u_i, v\} \colon v \in S \right\}\!,\\ E'_k &= \left\{ \{u_k, v\} \colon v \in (V_i \cup V_k) \setminus S \right\}\!, \\ E'_{ik} &= \left\{ \{u_i, v\}, \{u_k, v\} \colon v \in V_{ik} \right\} \end{align*} and return the bipartite graph $$G'=(U, V, E')$$ with $$E' = E'_i \cup E'_k \cup E'_{ik}$$. Figure 5 visualizes the trade operation. The trade may create a graph $$G'$$ such that $$G = G'$$ when in step three $$S$$ is chosen to be equal to $$V_i$$. In addition, it contains the switch as a special case when in step three $$S$$ is chosen such that $$| V_i \triangle S | = 2$$. Carstens [23] showed that the transition probabilities of the trade operation are symmetric. Fig. 5. Visualization of a trade operation. Transforming the left graph into the right: $$V_i = \{ v_1, v_2\}$$ (grey, solid border), $$V_k = \{v_4, v_5, v_6\}$$ (grey, dashed border).
The set $$S=\{v_4, v_5 \}$$ was chosen randomly from $$V_i \cup V_k$$. Definition 2.9 (Multi-flip) Let $$G=(U,V,E) \in \Omega$$. 1. Choose a vertex $$u \in U$$ uniformly at random. 2. Determine the sets of vertices $$V_0$$ and $$V_1$$ such that   \begin{align*} V_0 &= \left\{ v \colon \{u,v\} \not\in E \wedge \delta_G(v) < \bar{{{d}}}(v) \right\}\!,\\ V_1 &= \left\{ v \colon \{u,v\} \in E \wedge \delta_G(v) > {d}(v) \right\}\!. \end{align*} Thus, $$V_0$$ is the set of vertices that are not adjacent to $$u$$ and may be part of an additional edge without violating the upper bound on their degree. In contrast, $$V_1$$ is the set of vertices that are adjacent to $$u$$ and may lose an incident edge without violating the lower bound on their degree. 3. Let $$s = \min \left(\bar{{{d}}}(u) - {d}(u), |V_0 \cup V_1| \right)$$. If $$s = 0$$, return $$G$$. 4. Choose a subset $$S \subseteq V_0 \cup V_1$$ of size $$s$$ uniformly at random. 5. Choose a subset $$T \subseteq S$$ of arbitrary size uniformly at random. 6. Create the sets of edges $$E_0$$ and $$E_1$$ such that   \begin{align*} E_0 = \left\{ \{u,v\} \in E \colon v \in T \right\}\!, \hspace{10mm} E_1 = \left\{ \{u,v\} \not\in E \colon v \in T \right\}\!. \end{align*} Remove the edges $$E_0$$ from $$E$$ and add the edges $$E_1$$, thus create the bipartite graph $$G'=(U,V,E')$$ with $$E' = (E \setminus E_0) \cup E_1$$. 7. If $${d}(u) \leq \delta_{G'}(u) \leq \bar{{{d}}}(u)$$ return $$G'$$, otherwise return $$G$$. Figure 6 visualizes the multi-flip. The multi-flip contains the simple flip as a special case, when $$|T|=1$$. In addition, the multi-flip may produce $$G'=G$$ when $$T = \emptyset$$.
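To make the steps of Definition 2.9 concrete, the following sketch implements a single multi-flip step. It is our own illustration (representing the graph as a set of $$(u, v)$$ pairs and the degree bounds as dictionaries), not the implementation used in our experiments:

```python
import random

def multi_flip(edges, u, V, d_lo, d_hi):
    """One multi-flip step at vertex u (sketch of Definition 2.9).

    edges : set of (u, v) pairs; d_lo, d_hi : dicts of degree bounds.
    Returns the new edge set (which may equal the input set).
    """
    def deg(v):  # degree of a vertex v from V
        return sum(1 for (_, y) in edges if y == v)
    # step 2: vertices that may gain (V0) or lose (V1) an edge
    V0 = [v for v in V if (u, v) not in edges and deg(v) < d_hi[v]]
    V1 = [v for v in V if (u, v) in edges and deg(v) > d_lo[v]]
    # step 3
    s = min(d_hi[u] - d_lo[u], len(V0) + len(V1))
    if s == 0:
        return edges
    # steps 4 and 5: S uniform of size s, T a uniform subset of S
    S = random.sample(V0 + V1, s)
    T = [v for v in S if random.random() < 0.5]
    # step 6: toggle the edges {u, v} for v in T
    E0 = {(u, v) for v in T if (u, v) in edges}      # edges to remove
    E1 = {(u, v) for v in T if (u, v) not in edges}  # edges to add
    new_edges = (edges - E0) | E1
    # step 7: accept only if the degree bounds of u are respected
    deg_u = sum(1 for (x, _) in new_edges if x == u)
    return new_edges if d_lo[u] <= deg_u <= d_hi[u] else edges
```

Note that step 7 may reject the candidate graph, in which case the multi-flip degenerates to a loop transition.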
Fig. 6. Visualization of a multi-flip operation. The numbers in brackets show the upper and lower bounds on the degrees. Transforming the left graph into the right: $$V_0 = \{ v_3, v_4\}$$ (grey, dashed border), $$V_1 = \{ v_1, v_2\}$$ (grey, solid border). Since the bounds on the degree of $$u$$ differ by three, we randomly chose the set $$S = \{v_2, v_3, v_4\}$$ of size three from $$V_0 \cup V_1$$. Subsequently, we randomly chose $$T = S$$ as a subset of $$S$$. Lemma 2.7 The transition probability of the multi-flip is symmetric. Proof. Consider two graphs $$G=(U,V,E)$$ and $$G'=(U,V,E')$$ that can be transformed into each other via a single multi-flip. To distinguish the sets $$V_0$$, $$V_1$$ and $$T$$ in $$G$$ and $$G'$$, we affiliate each set with the corresponding graph. Following from its definition, the probability for transforming a graph $$G $$ into $$G'\neq G$$ via a single multi-flip is   \begin{align*} p_{\text{multi-flip}}(G, G') = m^{-1} \binom{|V_0(G) \cup V_1(G)|}{s}^{-1} 2^{-s}, \end{align*} where $$s = \min(\bar{{{d}}}(u) - {d}(u), |V_0(G) \cup V_1(G)|)$$. To prove that $$p_{\text{multi-flip}}(G, G')$$ equals $$p_{\text{multi-flip}}(G', G)$$, it suffices to show $$|V_0(G) \cup V_1(G)| = |V_0(G') \cup V_1(G')|$$ since $$m$$ and $$\bar{{{d}}}(u) - {d}(u)$$ are invariant constants. Therefore, assume $$|V_0(G) \cup V_1(G)| > |V_0(G') \cup V_1(G')|$$, otherwise switch the roles of $$G$$ and $$G'$$.
Let $$v \in V_0(G)$$ be a vertex that is neither in $$V_0(G')$$ nor in $$V_1(G')$$. Since $$v \in V_0(G)$$, it follows from the definition of $$V_0$$ that $$\{u,v\} \not\in E$$. Now consider two cases that can occur while transforming $$G$$ into $$G'$$. If $$v \in T(G)$$, by the definition of the multi-flip the edge $$\{u,v\}$$ must be included in $$E'$$. Thus, $$\delta_{G'}(v) = \delta_G(v) + 1$$. Since $$\{u,v\} \in E'$$ and $${d}(v) < \delta_{G'}(v)$$, it follows that $$v \in V_1(G')$$. Otherwise, if $$v \not\in T(G)$$, the degree of $$v$$ is not changed by the multi-flip operation and thus, $$v \in V_0(G')$$. In either case, $$v \in V_0(G')$$ or $$v \in V_1(G')$$, which contradicts our assumption. The symmetric case $$v \in V_1(G)$$ can be proven analogously. □ Definition 2.10 (Multi-shift) Let $$G=(U,V,E) \in \Omega$$. 1. Choose a random vertex $$u \in U$$. 2. Determine the sets of vertices $$V_0$$ and $$V_1$$ such that   \begin{align*} V_0 &= \left\{ v \colon \{u,v\} \not\in E \wedge \delta_G(v) < \bar{{{d}}}(v) \right\}\!,\\ V_1 &= \left\{ v \colon \{u,v\} \in E \wedge \delta_G(v) > {d}(v) \right\}\!. \end{align*} Up to this point, the multi-shift is defined similarly to the multi-flip. 3. Choose a subset $$S \subseteq V_0 \cup V_1$$ of size $$|V_1|$$ uniformly at random. 4. Create the sets of edges $$E_0$$ and $$E_1$$ such that   \begin{align*} E_0 = \left\{ \{u,v\} \in E \colon v \in (V_0 \cup V_1) \setminus S \right\}\!, \hspace{10mm} E_1 = \left\{ \{u,v\} \not\in E \colon v \in S \right\}\!. \end{align*} Remove the edges $$E_0$$ from $$E$$ and add the edges $$E_1$$, thus create and return the bipartite graph $$G'=(U,V,E')$$ with $$E' = (E \setminus E_0) \cup E_1$$. Figure 7 visualizes the multi-shift. In contrast to the multi-flip, the multi-shift operation does not change the degree of $$u$$ since $$S$$ is chosen to have the same cardinality as $$V_1$$.
In addition, the upper or lower bound on the degree of a node $$v \in V$$ cannot be violated by the way we define $$V_0$$ and $$V_1$$. Therefore, $$G' \in \Omega$$. Like the other operations, the multi-shift contains the possibility of creating $$G' = G$$ when in step three $$S$$ is chosen to be equal to $$V_1$$. The multi-shift does not yet contain the simple shift as a special case since it only modifies the degree of nodes from $$V$$. However, we will take care of this when defining the Markov chain. Fig. 7. Visualization of a multi-shift operation. The numbers in brackets show the relevant upper and lower bounds on the degrees. Transforming the left graph into the right: $$V_0 = \{ v_3, v_4\}$$ (grey, dashed border), $$V_1 = \{ v_1, v_2\}$$ (grey, solid border). We randomly selected $$S = \{v_3, v_4\}$$ of size $$|V_1|$$ from $$V_0 \cup V_1$$. Lemma 2.8 The transition probability of the multi-shift is symmetric. Proof. Following from its definition, the probability for transforming a graph $$G $$ into $$G'\neq G$$ via a single multi-shift is   \begin{align*} p_{\text{multi-shift}}(G, G') = m^{-1} \binom{|V_0(G) \cup V_1(G)|}{|V_1(G)|}^{-1}. \end{align*} Since the degree of $$u$$ is not changed by a multi-shift, $$|V_1(G)|$$ is equal to $$|V_1(G')|$$. In addition, a similar argument as used in the proof of Lemma 2.7 shows $$|V_0(G) \cup V_1(G)| = |V_0(G') \cup V_1(G')|$$. Thus, $$p_{\text{multi-shift}}(G, G') = p_{\text{multi-shift}}(G', G)$$.
□ Markov chain By the way we defined our operations, a single operation affects at most two vertices from $$U$$ and up to $$|V|$$ vertices from $$V$$. Thus, we need at least $$|U|/2$$ operations until each vertex from $$U$$ is affected by an operation. We intend to speed up the Markov chain by occasionally switching the roles of $$U$$ and $$V$$. However, if $$|U| \ll |V|$$, we do not want to switch the roles of $$U$$ and $$V$$ too often since the number of affected edges per operation is largely reduced in such cases. Thus, we switch the roles of $$U$$ and $$V$$ with probability $$|U|/(|U|+|V|) = m/(m+n)$$. The informed Markov chain can now be stated as follows: Definition 2.11 (Informed Markov Chain) Let $$q \colon \{ \text{trade}, \text{multi-flip}, \text{multi-shift} \} \to [0,1]$$ be an arbitrary probability function that assigns positive probability to each operation. Let $$G=(U,V,E) \in \Omega$$. 1. Select a real number $$x \in [0, 1)$$ uniformly at random. If $$x < m/(m+n)$$, switch the roles of $$U$$ and $$V$$. 2. Randomly select one of the three operations trade, multi-flip and multi-shift with respect to the probability function $$q$$. 3. Apply the selected operation type to transform $$G$$ into $$G'$$. 4. Set $$G \gets G'$$ and go to step 1. Theorem 2.12 The stationary distribution of the informed Markov chain is the uniform distribution on $$\Omega$$. Proof. We first prove that the informed chain is irreducible. Since each of the three complex operations contains its simple counterpart as a special case, the informed chain contains the transitions of the simple chain. Since the simple chain is irreducible, the informed chain must be irreducible, too. In addition, the informed chain is aperiodic since each operation may create a loop with non-zero probability. Thus, the informed chain possesses a unique stationary distribution.
Since the transition probabilities of each operation type are symmetric, it follows from the fundamental theorem of Markov chains that the stationary distribution is the uniform distribution. □ 2.3 Dynamic adjustment of probability The probability function $$q$$ for selecting an operation type has a large effect on the Markov chain’s speed of convergence. A simple implementation is to select each operation with equal probability. However, this is obviously not a good choice in the special case of $$\mathbf{{r}} = \bar{{{\mathbf{r}}}}$$ and $$\mathbf{{c}} = \bar{{{\mathbf{c}}}}$$ since all flip and shift operations must result in a loop transition and thus, at least 2/3 of all transitions will be loops. It is not immediately clear how to choose $$q$$ in a better way since the applicability of the operations heavily depends on the given integer vectors. Thus, we implemented a strategy that initially selects each type of operation with equal probability and dynamically adjusts $$q$$ during the simulation of the Markov chain. For this purpose, we keep track of the number of simulated transitions of each type and the number of non-loop transitions of each type. Every 100 steps, we rebalance the probability function such that each type of operation gains a probability that is proportional to its success rate. Due to the dynamic adjustment of $$q$$, less time will be spent on operations that tend to have a large loop probability. To ensure positive probability for each type of operation, we slightly dampen the effect of the rebalancing by incorporating previous values.
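The dampened update just described can be sketched as follows; the function and variable names are ours, chosen purely for illustration:

```python
def rebalance(q, successes, totals):
    """Dampened rebalancing of the operation-selection probabilities.

    q         : dict mapping operation type -> current probability
    successes : dict mapping operation type -> non-loop transitions so far
    totals    : dict mapping operation type -> simulated transitions so far
    """
    # guard against division by zero: retain the previous probabilities
    if any(totals[op] == 0 for op in q):
        return dict(q)
    alpha = {op: successes[op] / totals[op] for op in q}  # success rates
    z = sum(alpha.values())
    if z == 0:
        return dict(q)
    # average the old probability with the normalized success rate,
    # so every operation type keeps positive probability
    return {op: (q[op] + alpha[op] / z) / 2 for op in q}
```

Every 100 simulated steps, the chain would replace $$q$$ by the result of such a call; the probabilities still sum to one, and since the old values enter the average, each operation type retains positive probability.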
More precisely, we define $$s^{(t)}_{\text{switch}}$$, $$s^{(t)}_{\text{flip}}$$, $$s^{(t)}_{\text{shift}}$$ to be the number of non-loop transitions of each operation type within the first $$t$$ steps of the simulation and $$n^{(t)}_{\text{switch}}$$, $$n^{(t)}_{\text{flip}}$$, $$n^{(t)}_{\text{shift}}$$ to be the total number of simulated operations of each type within the first $$t$$ steps (including loop transitions). We define the success rate $$\alpha^{(t)}$$ of each operation type after $$t$$ steps to be   \begin{align*} \alpha^{(t)}_{\text{switch}} = s^{(t)}_{\text{switch}} / n^{(t)}_{\text{switch}},\hspace{12mm} \alpha^{(t)}_{\text{flip}} = s^{(t)}_{\text{flip}} / n^{(t)}_{\text{flip}},\hspace{12mm} \alpha^{(t)}_{\text{shift}} = s^{(t)}_{\text{shift}} / n^{(t)}_{\text{shift}} \end{align*} and define $$z^{(t)} = \alpha^{(t)}_{\text{switch}} + \alpha^{(t)}_{\text{flip}} + \alpha^{(t)}_{\text{shift}}$$. The probability function $$q$$ at step $$t$$ is defined as   \[ q_{\text{switch}}^{(t)} = \begin{cases} 1/3, & \text{ if } t = 0,\\ q_{\text{switch}}^{(t-1)}, & \text{ if } (t \bmod 100) \not= 0,\\ \left( q_{\text{switch}}^{(t-100)} + \alpha^{(t)}_{\text{switch}} / z^{(t)} \right) / 2, & \text{ if } t > 0 \text{ and } (t \bmod 100) = 0. \end{cases} \] The probabilities of the flip, the shift and the more complex operations are defined analogously. (In the very unlikely case where the calculations described above require a division by zero, we simply retain the previous values $$q^{(t-100)}$$.) In the following discussion, we will call a Markov chain static when $$q$$ is fixed to 1/3 for each type of operation and dynamic when the probability function is adjusted dynamically. 3. Experimental results and discussion In this section, we present an experimental evaluation of the Markov chains presented in this article.
Whereas the first experiment is designed to gain insights into structural properties of the Markov chains, the second experiment is devoted to an empirical evaluation of the number of steps that is sufficient to produce random samples. Finally, we come back to Darwin’s finches and demonstrate how our sampling methods can be used to analyse the structural properties of partially observed networks. 3.1 State graph analysis In the first experiment, we analyse properties of our Markov chains. When the number of states $$|\Omega|$$ is not too large, we can completely enumerate the set $$\Omega$$ to construct the state graph of the Markov chains. In this experiment, we considered classes of integer vectors whose members can be scaled to different sizes. Consider, for example, the class of integer vectors   \begin{align*} \mathbf{{r}} &= (n-1, n-2, 3) & \mathbf{{c}}& = (2,2, \ldots, 2)\\ \bar{{{\mathbf{r}}}} &= (n, n-1, 4) & \bar{{{\mathbf{c}}}}& = (\underbrace{3,3, \ldots, 3}_{n \text{ times}}), \end{align*} where the length $$n \in \mathbb{N}$$ of the vectors $$\mathbf{{c}}$$ and $$\bar{{{\mathbf{c}}}}$$ can be scaled to produce different members of the class. We used the software library marathon [6] to construct the state graph of the simple and informed Markov chain for $$5 \leq n < 35$$. The total mixing time $$\tau(\varepsilon)$$ gives a formal measure of how many steps are needed to come sufficiently close to the stationary distribution of a Markov chain. However, computing this quantity becomes infeasible even when $$|\Omega|$$ is moderately large. Hence, we computed the upper spectral bound [24] on the total mixing time   \begin{align} \tau(\varepsilon) \leq (1-\lambda_{\max})^{-1} \cdot \left( \ln(\varepsilon^{-1}) + \ln(|\Omega|) \right)\!, \end{align} (3.1) where $$\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_{|\Omega|}$$ are the real eigenvalues of a Markov chain’s transition matrix and $$\lambda_{\max} = \max(|\lambda_2|, |\lambda_{|\Omega|}|)$$.
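For instances that are small enough to construct the transition matrix explicitly, the bound (3.1) is straightforward to evaluate. The following is a minimal sketch using NumPy (our reversible chains have symmetric transition matrices, so the eigenvalues are real); it is an illustration, not the routine implemented in marathon:

```python
import numpy as np

def spectral_bound(P, eps=0.01):
    """Evaluate the upper spectral bound (3.1) on the total mixing time.

    P : (N, N) symmetric transition matrix of a reversible Markov chain.
    """
    eigs = np.sort(np.linalg.eigvalsh(P))[::-1]  # lambda_1 >= ... >= lambda_N
    lam_max = max(abs(eigs[1]), abs(eigs[-1]))   # second-largest modulus
    n_states = P.shape[0]
    return (np.log(1.0 / eps) + np.log(n_states)) / (1.0 - lam_max)
```

For a two-state chain with transition matrix $$\bigl(\begin{smallmatrix} 0.9 & 0.1 \\ 0.1 & 0.9 \end{smallmatrix}\bigr)$$, the eigenvalues are $$1$$ and $$0.8$$, so the bound evaluates to $$(\ln(100) + \ln 2)/0.2$$.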
Figure 8 shows the growth of the spectral bound for the state graphs investigated in this experiment. The data suggest that the upper spectral bound of the simple chain grows more than one order of magnitude faster than that of the informed chain. This may be explained by considering the expected probability of loop transitions. We observe that the loop probability of the simple chain increases with growing $$n$$, whereas it decreases in case of the informed chain. By counting the number of non-zero transitions, we confirm that the informed chain, as intended, is much denser than the simple chain. For other classes of integer vectors, we made similar observations. Fig. 8. Upper spectral bound (left) of the state graphs for a special class of integer vectors ($$\varepsilon=0.01$$). Right: Expected loop probability calculated from the state graphs. 3.2 Convergence of sample means As the number of states grows fast, the evaluation of inequality (3.1) becomes infeasible when the input instances grow larger. For this reason, we analysed the efficiency of the Markov chains by a different approach that is based on simulation. Methodology By simulating $$t$$ steps of a Markov chain and returning the final state, we produce a random sample $$x^{(t)} \in \Omega$$. Let $$p_{s}^{(t)}(x)$$ be the probability of returning state $$x \in \Omega$$ after exactly $$t$$ simulation steps, while starting at state $$s \in \Omega$$. If we repeat this simulation $$N$$ times (always starting from $$s\in \Omega$$), we produce a series of independent and identically distributed random samples $$x^{(t)}_1, x^{(t)}_2, \ldots, x^{(t)}_N$$.
By evaluating a target function $$f \colon \Omega \to \mathbb{R}$$ on each random sample and calculating the sample mean   \[ \bar{\mu}_{s}^{(t)}[\,f(X)] := \frac{1}{N} \sum_{i=1}^{N} f \left(x^{(t)}_i \right)\!, \] we approximate the expected value   \[ E_{p^{(t)}_s}[\,f(X)] := \sum_{x \in \Omega} f(x) p_s^{(t)}(x), \] with respect to the probability distribution $$p^{(t)}_s(x)$$. By Theorems 2.7, 2.12 and by the law of large numbers, the sample mean $$\bar{{{\mu}}}_{s}^{(t)}[\,f(X)]$$ will approach the expected value   \[ E[\,f(X)] := |\Omega|^{-1} \sum_{x \in \Omega} f(x), \] when $$t,N \to \infty$$. By letting $$t$$ grow and observing the sample means $$\bar{{{\mu}}}_{s}^{(t)}[\,f(X)]$$, we can detect the number of steps after which the sample means stabilize. This number can be used as an indicator for the efficiency of a Markov chain. Experimental setup Starting from a fixed bipartite graph $$s \in \Omega$$, we simulated $$100000$$ random transitions of both the simple and the informed Markov chain. Every 10 steps, we interrupted the simulation to evaluate the target function on the current state of the Markov chain. This process was repeated $$N=1000$$ times to calculate the sample means $$\bar{{{\mu}}}_{s}^{(t)}[\,f(X)]$$ for $$t=10,20, \ldots, 100000$$. Target function In this experiment, we tested several target functions $$f \colon \Omega \to \mathbb{R}$$. In ecology, the structure of bipartite networks can be quantified by several metrics, from which we considered the following: The $$\bar{S}^2$$ statistic introduced by Roberts and Stone [25]. The nested subset statistic $$S_{\text{nest}}$$ introduced by Patterson and Atmar [26]. The nestedness measure based on overlap and decreasing fill (NODF) [27]. The spectral radius of the bipartite graph’s adjacency matrix [28].
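As an illustration, the last of these metrics is simple to compute directly. The following is our own helper, assuming the bipartite graph is given as an $$m \times n$$ 0/1 biadjacency matrix $$B$$; the full adjacency matrix is then the symmetric block matrix with $$B$$ and $$B^T$$ off the diagonal:

```python
import numpy as np

def spectral_radius(B):
    """Spectral radius of the bipartite graph with biadjacency matrix B."""
    m, n = B.shape
    A = np.zeros((m + n, m + n))   # full adjacency matrix [[0, B], [B^T, 0]]
    A[:m, m:] = B
    A[m:, :m] = B.T
    return float(np.max(np.abs(np.linalg.eigvalsh(A))))
```

For a bipartite graph, this value coincides with the largest singular value of $$B$$; e.g. the complete bipartite graph $$K_{2,2}$$ has spectral radius 2.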
Initial state To clearly observe the stabilization of the sample means, the value $$f(s)$$ at the initial state $$s \in \Omega$$ should considerably differ from the expected value $$E[\,f(X)]$$. As many metrics are known to be correlated with the edge total, we intentionally constructed the initial state $$s$$ to possess as many edges as allowed by the upper bounds $$\bar{{{\mathbf{r}}}}$$ and $$\bar{{{\mathbf{c}}}}$$. This can be achieved by applying an appropriate realization algorithm [21]. Scale-free bipartite graphs Many real-world networks are known to have a scale-free degree distribution. For this reason, we used a Barabási–Albert model to create the degree sequence of a random, scale-free bipartite graph $$G=(U,V,E)$$. Starting with a trivial graph consisting of two vertices connected by a single edge, we iteratively added new vertices to $$U$$ and $$V$$. Each newly added vertex is connected to existing vertices with probability proportional to their degrees. Thus, vertices that already have a high degree tend to gain more neighbours. After 100 vertices have been included in each of $$U$$ and $$V$$, we stop the process and determine the degrees of the vertices in $$U$$ and $$V$$ as integer vectors $$\mathbf{{r}}$$ and $$\mathbf{{c}}$$. To simulate the imperfection of the data collection process, we added a constant $$d$$ to each integer to create the vectors of upper bounds $$\bar{{{\mathbf{r}}}}$$ and $$\bar{{{\mathbf{c}}}}$$. Based on these vectors, we estimated the efficiency of our Markov chains by observing the sample means of the $$\bar{S}^2$$ statistic. The first four diagrams of Fig. 9 show the results for $$d=1$$ and $$d=5$$. Considering the case of $$d=1$$, we observe that the sample mean of the informed chain stabilizes after very few steps. In contrast, the simple chain needs several thousand iterations to stabilize.
By counting the number of loop transitions, we observed that about 94% of all transitions of the simple chain were loops, in comparison to only 27% of the informed chain’s transitions. Using dynamic probability adjustment, the simple chain starts to deviate from its static counterpart after the first rebalancing at $$t=100$$. During the simulation, 11% of all operations were switches, 76% were flips and 13% were shifts. Overall, the dynamic rebalancing reduced the simple chain’s number of loops by 4%. In contrast, the dynamic adjustment could not improve the informed chain, as the necessary number of steps is already very small. As this observation applies to each of the following experiments, we do not consider the dynamic adjustment for the informed chain from now on. Considering the case of $$d=5$$, we observe that the sample means produced by the simple chain converge significantly faster than before since the loop rate of the Markov chain drops to 85%. Fig. 9. Sample means of the $$\bar{S}^2$$ metric for scale-free and near-regular integer vectors. Although we observed that the informed chain is faster in terms of iterations, this does not necessarily imply that it is also faster in practice since the running time of a single step is larger. However, our experiments showed that the fast convergence of the chain justifies the larger running time per step. We conclude that the informed chain is well suited for the randomization of scale-free graphs, especially if the gap between the degree bounds is small. Near-regular bipartite graphs Next, we experimented with the degrees of near-regular bipartite graphs. For this purpose, we defined two integer vectors $$\mathbf{{r}}$$ and $$\mathbf{{c}}$$ of length $$m=n=100$$ with $${r}_i = {c}_i = 50$$ for each $$i = 1, \ldots, 100$$.
As before, we added a constant $$d$$ to each integer to produce the vectors of upper bounds $$\bar{{{\mathbf{r}}}}$$ and $$\bar{{{\mathbf{c}}}}$$. The last four diagrams in Fig. 9 show the sample means for $$d=1$$ and $$d=5$$. For $$d=1$$, we observe that the informed chain is faster in terms of iterations but only slightly more efficient than the simple chain in practice. This can be explained by the simple chain's rate of loop transitions, which is 82% and thus comparatively small. In comparison, 24% of the informed chain's transitions were loops. With dynamic probability adjustment, the simple chain selected switch operations with a probability of 25%, flip operations with 50% and shift operations with 25%, thereby reducing the number of loops by 3%. For $$d=5$$, we found that the simple chain is even more efficient than before, as its loop rate dropped to 59%. Although only 7% of the informed chain's transitions were loops, the simple chain is superior in practice. We conclude that the simple chain with dynamic adjustment is best suited for the randomization of near-regular bipartite graphs. Sensitivity to the target function Replacing the $$\bar{S}^2$$ metric by the other target functions mentioned above confirmed our previous observations. Although the sample means stabilized after a different number of steps, the qualitative results were mostly unchanged. As an example, Fig. 10 shows the sample means of the spectral radius metric. We conclude that our observations were not a side effect of the metric used for convergence detection. Fig. 10. Sample means of the spectral radius metric for scale-free and near-regular integer vectors.
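A plausible implementation of the dynamic probability adjustment discussed above is to periodically re-weight the three operation types by their observed non-loop rates, so that operations that mostly produce loops are selected less often. The rule below is a hypothetical sketch; the article's exact rebalancing rule may differ.

```python
def rebalance(op_counts, loop_counts, floor=0.05):
    """Return new selection probabilities q for the operation types.

    op_counts[op]   -- how often operation `op` was attempted
    loop_counts[op] -- how often it resulted in a loop transition
    The rule (an assumption for illustration): weight each operation by
    its estimated non-loop rate, clipped at a small floor so that no
    operation type dies out, then normalize to a distribution.
    """
    accept = {op: max(floor, 1.0 - loop_counts[op] / max(1, op_counts[op]))
              for op in op_counts}
    total = sum(accept.values())
    return {op: w / total for op, w in accept.items()}

# Example: switches mostly loop, flips rarely do, so flips gain weight.
q = rebalance({'switch': 100, 'flip': 100, 'shift': 100},
              {'switch': 90, 'flip': 10, 'shift': 50})
```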
3.3 Sampling application In a final experiment, we demonstrate how our sampling methods can be applied to study properties of partially observed networks. Here, we assumed that the observed 'Darwin's finches' network derived from Table 1 is part of some larger 'true' network whose exact shape is unknown due to missing observations. To model this kind of uncertainty, we assume that each island of the Galápagos archipelago may be inhabited by up to one finch species whose presence was not observed during the data collection. Symmetrically, we allow each finch species to be present at one additional island. Mathematically speaking, the true 'Darwin's finches' network is assumed to be an unknown element of a set $$S$$ that contains all bipartite graphs $$G=(U,V,E)$$, (a) with $$|U|=13$$ and $$|V|=17$$ nodes, (b) whose degrees may exceed those of the observed network by at most one and (c) which contain all edges of the observed network. In this experiment, we approximate the expected structure of the true network by uniformly sampling from the set $$S$$. Rejection sampling Our methods cannot be applied directly to sample uniformly from $$S$$, as they do not incorporate requirement (c). However, as $$S$$ is clearly a subset of $$\Omega(\mathbf{{r}}, \bar{{{\mathbf{r}}}}, \mathbf{{c}}, \bar{{{\mathbf{c}}}})$$ with   \begin{align*} \mathbf{{r}} &= (14, 13, 14, 10, 12, 2, 10, 1, 10, 11, 6, 2, 17),\\ \bar{{{\mathbf{r}}}} &= (15, 14, 15, 11, 13, 3, 11, 2, 11, 12, 7, 3, 18),\\ \mathbf{{c}} &= (4, 4, 11, 10, 10, 8, 9, 10, 8, 9, 3, 10, 4, 7, 9, 3, 3),\\ \bar{{{\mathbf{c}}}} &= (5, 5, 12, 11, 11, 9, 10, 11, 9, 10, 4, 11, 5, 8, 10, 4, 4), \end{align*} we can apply the idea of rejection sampling. By randomly sampling from $$\Omega(\mathbf{{r}}, \bar{{{\mathbf{r}}}}, \mathbf{{c}}, \bar{{{\mathbf{c}}}})$$ and rejecting all samples that miss an observed edge, we ensure that the non-rejected samples are uniformly distributed in $$S$$.
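The basic rejection strategy can be sketched as follows. Here `sample_omega` stands for any uniform sampler over $$\Omega(\mathbf{{r}}, \bar{{{\mathbf{r}}}}, \mathbf{{c}}, \bar{{{\mathbf{c}}}})$$ (e.g. one of the Markov chains above); its interface, returning a graph as a set of edges, is an assumption made for illustration.

```python
def sample_from_S(sample_omega, observed_edges, max_tries=10**6):
    """Rejection sampling: draw from the superset Omega until a sample
    contains every observed edge (requirement (c)); such a sample is a
    uniform element of S."""
    for _ in range(max_tries):
        edges = sample_omega()           # a set of (u, v) pairs
        if observed_edges <= edges:      # keep only supersets
            return edges
    raise RuntimeError("no sample accepted; S is a tiny fraction of Omega")
```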
As we draw uniformly, we need to draw about $$|\Omega(\mathbf{{r}}, \bar{{{\mathbf{r}}}}, \mathbf{{c}}, \bar{{{\mathbf{c}}}})| / |S|$$ samples from $$\Omega(\mathbf{{r}}, \bar{{{\mathbf{r}}}}, \mathbf{{c}}, \bar{{{\mathbf{c}}}})$$ to find a single sample from $$S$$. Unfortunately, as $$S$$ is just a tiny subset of the much larger $$\Omega(\mathbf{{r}}, \bar{{{\mathbf{r}}}}, \mathbf{{c}}, \bar{{{\mathbf{c}}}})$$, this strategy is too inefficient to be of practical use. For that reason, we first transform the sampling problem into an equivalent problem that can be solved more efficiently. Problem transformation By definition, a graph $$G \in S$$ contains all edges of the observed network. By removing these edges from $$G$$, we create a graph $$G'$$ with reduced degrees. In particular, as each node of $$G$$ may possess at most one additional edge, the degrees of $$G'$$ will be either zero or one. Thus, we may replace the original sampling problem by the problem of uniform sampling from a set $$T$$ that contains all bipartite graphs $$G=(U,V,E)$$, (a) with $$|U|=13$$ and $$|V|=17$$ nodes, (b) whose degrees are either zero or one and (c) which do not contain edges that are present in the observed network. In Fig. 11, we demonstrate the problem transformation with a small example. The reduced sampling problem is equivalent to the original one: an element of $$T$$ can be transformed into an element of $$S$$ by adding the observed edges and, vice versa, an element of $$S$$ can be converted into an element of $$T$$ by removing all observed edges. Thus, there is a one-to-one correspondence between $$S$$ and $$T$$, and we conclude that $$|S| = |T|$$. Fig. 11. Illustration of the problem transformation. Let the network on the left be our observed network. We assume that the unknown 'true' network may contain up to one additional edge per node, thus its degrees may exceed the observed ones by one.
By removing the observed edges, we create a reduced network whose degrees may be either zero or one. Refined sampling strategy As $$T$$ is a subset of the set $$\Omega({\mathbf{p}}, \bar{{{\mathbf{p}}}}, {\mathbf{q}}, \bar{{{\mathbf{q}}}})$$ with   \[ {\mathbf{p}} = [0]^{13}, \quad \bar{{{\mathbf{p}}}} = [1]^{13}, \quad {\mathbf{q}} = [0]^{17}, \quad \bar{{{\mathbf{q}}}} = [1]^{17}, \] we may produce uniform samples from $$T$$ by uniformly sampling from $$\Omega({\mathbf{p}}, \bar{{{\mathbf{p}}}}, {\mathbf{q}}, \bar{{{\mathbf{q}}}})$$ and rejecting all samples that contain an observed edge. After adding the observed edges, each non-rejected sample will be uniformly distributed in $$S$$. To demonstrate that this refined strategy is superior to the basic rejection sampling strategy, we calculated the sizes of the sets $$\Omega(\mathbf{{r}}, \bar{{{\mathbf{r}}}}, \mathbf{{c}}, \bar{{{\mathbf{c}}}})$$ and $$\Omega({\mathbf{p}}, \bar{{{\mathbf{p}}}}, {\mathbf{q}}, \bar{{{\mathbf{q}}}})$$ using a counting approach similar to that of Miller and Harrison [17]. In doing so, we found $$|\Omega(\mathbf{{r}}, \bar{{{\mathbf{r}}}}, \mathbf{{c}}, \bar{{{\mathbf{c}}}})| \approx 3.3 \times 10^{26}$$ and $$|\Omega({\mathbf{p}}, \bar{{{\mathbf{p}}}}, {\mathbf{q}}, \bar{{{\mathbf{q}}}})| \approx 1.2 \times 10^{14}$$.
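The refined strategy admits an analogous sketch. Here `sample_reduced` is assumed to sample uniformly from $$\Omega({\mathbf{p}}, \bar{{{\mathbf{p}}}}, {\mathbf{q}}, \bar{{{\mathbf{q}}}})$$, i.e. to return a set of edges in which every vertex degree is zero or one; the function names are ours.

```python
def sample_true_network(sample_reduced, observed_edges, max_tries=10**6):
    """Refined rejection sampling: draw a uniform element G' of the small
    set Omega(p, p_bar, q, q_bar), reject it if it reuses an observed
    edge (requirement (c) on T), and add the observed edges back. By the
    one-to-one correspondence between T and S, the result is a uniform
    element of S."""
    for _ in range(max_tries):
        extra = sample_reduced()              # edges of G'
        if extra.isdisjoint(observed_edges):  # no observed edge reused
            return observed_edges | extra
    raise RuntimeError("exceeded max_tries")
```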
As $$\Omega({\mathbf{p}}, \bar{{{\mathbf{p}}}}, {\mathbf{q}}, \bar{{{\mathbf{q}}}})$$ is several orders of magnitude smaller than $$\Omega(\mathbf{{r}}, \bar{{{\mathbf{r}}}}, \mathbf{{c}}, \bar{{{\mathbf{c}}}})$$, the refined strategy outperforms the basic strategy by a factor of   \[ \frac{|\Omega(\mathbf{{r}}, \bar{{{\mathbf{r}}}}, \mathbf{{c}}, \bar{{{\mathbf{c}}}})| \cdot |T|}{|\Omega({\mathbf{p}}, \bar{{{\mathbf{p}}}}, {\mathbf{q}}, \bar{{{\mathbf{q}}}})|\cdot |S|} = \frac{|\Omega(\mathbf{{r}}, \bar{{{\mathbf{r}}}}, \mathbf{{c}}, \bar{{{\mathbf{c}}}})|}{|\Omega({\mathbf{p}}, \bar{{{\mathbf{p}}}}, {\mathbf{q}}, \bar{{{\mathbf{q}}}})|} \approx \frac{3.3\cdot 10^{26}}{1.2 \cdot 10^{14}} \approx 2.8 \cdot 10^{12}. \] Experimental setup To approximate the expected structure of the true network, we generated one million uniform samples from the set $$S$$ and constructed the sampling histograms of the four associated metrics. To create a reference against which the structure of the true network can be compared, we additionally calculated the corresponding sampling histograms for the set $$\Omega({\mathbf{r}}, {\mathbf{r}}, {\mathbf{c}}, {\mathbf{c}})$$ of graphs with the same degrees as the observed 'Darwin's finches' network and for the set $$\Omega({\mathbf{r}}, \bar{{{\mathbf{r}}}}, {\mathbf{c}}, \bar{{{\mathbf{c}}}})$$ of graphs with degrees similar to those of the true network but which need not contain the observed edges. Results Figure 12 shows the calculated sampling histograms. We observe that the expected structural properties of the true network clearly exceed those of the observed network with respect to $$\bar{S}^2$$, $$S_{\text{nest}}$$ and the spectral radius. In contrast, the observed and the true network are structured very similarly with respect to NODF. In addition, we estimated the expected number of edges in the true network to be 130.5.
In contrast, the number of edges in the observed network is 122. Fig. 12. Sampling histograms obtained by uniform sampling from three sets of networks. Left column: the set $$S$$ of potential 'true' networks. Middle column: the set $$\Omega(\mathbf{{r}}, \bar{{{\mathbf{r}}}}, \mathbf{{c}}, \bar{{{\mathbf{c}}}})$$, whose elements need not contain the observed edges. Right column: the set of networks with the same degrees as 'Darwin's finches'. The vertical solid lines mark the structural properties of the observed 'Darwin's finches' network. The dashed lines mark the estimated expected values of our assumed 'true' network. To assess the influence of the observed edges, we compared the structure of the true network with the structure that would be expected by chance if the observed edges were not required to be included in the true network (see middle column). In doing so, we found that the presence of the observed edges forces the $$\bar{{{S}}}^2$$ and the spectral radius of the true network to be significantly larger than would be expected by chance alone. In contrast, the observed edges barely affect the $$S_{\text{nest}}$$ and NODF metrics.
Interestingly, the presence of the observed edges forces the true network to be denser than would be expected by chance, as we approximated the expected number of edges of networks in $$\Omega(\mathbf{{r}}, \bar{{{\mathbf{r}}}}, \mathbf{{c}}, \bar{{{\mathbf{c}}}})$$ to be 128.5. We conclude that the structural properties of our imaginary true network significantly differ from the structure that would be expected by chance if the observed edges were not required to exist. In an analogous experiment, we evaluated the influence of the observed edges by comparing the structure of the observed network with that of null-networks that have identical degrees but need not contain the observed edges (right column). In doing so, we found that the $$\bar{S}^2$$, NODF and spectral radius metrics of the observed 'Darwin's finches' network notably differ from those of most random null-networks, whereas the values for $$S_{\text{nest}}$$ are very similar. Summarizing these observations, we found that the structure of our imaginary true network is affected by the existence of the observed edges and may change when additional edges are incorporated. 4. Conclusion We introduced a Markov chain whose state space is the set of all bipartite graphs whose degrees lie in prescribed intervals. Afterwards, we introduced a second Markov chain that is designed to affect multiple edges per step and to avoid loop transitions. We showed that both Markov chains can be used for uniform sampling by proving that their stationary distribution is uniform. Experimentally, we showed that the second Markov chain converges faster to its stationary distribution. However, its practical applicability depends on the distribution of the given degrees. We found that the informed chain is more efficient for scale-free degree distributions, whereas the simple chain is superior when the degrees are near-regular.
Since many real-world networks have a scale-free degree distribution, the informed chain is well suited for many practical applications. Furthermore, we showed how the dynamic adjustment of the probability function $$q$$ can speed up the simple chain, whereas it has almost no effect on the informed chain, since the loop probability is already very small and equally distributed between the three types of operations. In an application, we showed how our methods can be used to study properties of partially observed networks and found that the presence of unobserved edges may influence the expected structure of the network. In the future, we plan to extend our sampling algorithms to wider areas by enabling uniform sampling of non-bipartite graphs, directed graphs and multi-graphs. Acknowledgements We thank Annabell Berger for her valuable suggestions and critical remarks. References 1. Chen Y., Diaconis P., Holmes S. P. & Liu J. S. (2005) Sequential Monte Carlo methods for statistical analysis of tables. J. Amer. Statist. Assoc., 100, 109–120. 2. Brandes U. & Erlebach T. (eds) (2005) Network Analysis: Methodological Foundations, LNCS, vol. 3418. Berlin-Heidelberg-New York: Springer. 3. Connor E. F. & Simberloff D. (1979) The assembly of species communities: chance or competition? Ecology, 60, 1132–1140. 4. Gotelli N. & Graves G. (1996) Null Models in Ecology. Washington, USA: Smithsonian Institution Press. 5. Gotelli N. J. (2000) Null model analysis of species co-occurrence patterns. Ecology, 81, 2606–2621. 6. Rechner S. & Berger A. (2016) Marathon: an open source software library for the analysis of Markov-Chain Monte Carlo algorithms. PLoS One, 11, e0147935. 7. Erdős P., Miklós I. & Toroczkai Z.
(2017) New classes of degree sequences with fast mixing swap Markov chain sampling. Combinatorics, Probability and Computing, 1–22, https://doi.org/10.1017/S0963548317000499. 8. Greenhill C. (2014) The switch Markov chain for sampling irregular graphs. Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms (Indyk P. ed.), Philadelphia, USA: Society for Industrial and Applied Mathematics, pp. 1564–1572. 9. Kannan R., Tetali P. & Vempala S. (1999) Simple Markov-chain algorithms for generating bipartite graphs and tournaments. Random Struct. Algorithms, 14, 293–308. 10. Strona G., Nappo D., Boccacci F., Fattorini S. & San-Miguel-Ayanz J. (2014) A fast and unbiased procedure to randomize ecological binary matrices with fixed row and column totals. Nat. Commun., 5, article no. 4114. 11. Verhelst N. D. (2008) An efficient MCMC algorithm to sample binary matrices with fixed marginals. Psychometrika, 73, 705. 12. Carstens C. J., Berger A. & Strona G. (2016) Curveball: a new generation of sampling algorithms for graphs with fixed degree sequence. arXiv preprint arXiv:1609.05137v2 (submitted on 22 December 2016). 13. Bezáková I., Bhatnagar N. & Vigoda E. (2007) Sampling binary contingency tables with a greedy start. Random Struct. Algorithms, 30, 168–205. 14. Harrison M. T. & Miller J. W. (2013) Importance sampling for weighted binary random matrices with specified margins. arXiv preprint arXiv:1301.3928v1 (submitted on 16 January 2013). 15. Holmes R. & Jones L. (1996) On uniform generation of two-way tables with fixed margins and the conditional volume test of Diaconis and Efron. Ann. Statist., 24, 64–68. 16. Bezáková I., Sinclair A., Štefankovič D. & Vigoda E.
(2012) Negative examples for sequential importance sampling of binary contingency tables. Algorithmica, 64, 606–620. 17. Miller J. W. & Harrison M. T. (2013) Exact sampling and counting for fixed-margin matrices. Ann. Statist., 41, 1569–1592. 18. Schluter D. (1984) A variance test for detecting species associations, with some example applications. Ecology, 65, 998–1005. 19. Gilpin M. E. & Diamond J. M. (1982) Factors contributing to non-randomness in species co-occurrences on islands. Oecologia, 52, 75–84. 20. Ulrich W. & Gotelli N. J. (2012) A null model algorithm for presence–absence matrices based on proportional resampling. Ecol. Model., 244, 20–27. 21. Rechner S. (2017) An optimal realization algorithm for bipartite graphs with degrees in prescribed intervals. arXiv preprint arXiv:1708.05520v1 (submitted on 18 August 2017). 22. Levin D. A., Peres Y. & Wilmer E. L. (2009) Markov Chains and Mixing Times. Providence, Rhode Island, USA: American Mathematical Society. 23. Carstens C. J. (2015) Proof of uniform sampling of binary matrices with fixed row sums and column sums for the fast curveball algorithm. Phys. Rev. E, 91, 042812. 24. Diaconis P. & Stroock D. (1991) Geometric bounds for eigenvalues of Markov chains. Ann. Appl. Probab., 1, 36–61. 25. Roberts A. & Stone L. (1990) Island-sharing by archipelago species. Oecologia, 83, 560–567. 26. Patterson B. D. & Atmar W. (1986) Nested subsets and the structure of insular mammalian faunas and archipelagos. Biol. J. Linnean Soc., 28, 65–82. 27. Almeida-Neto M., Guimarães P., Guimarães P. R., Loyola R. D. & Ulrich W.
(2008) A consistent metric for nestedness analysis in ecological systems: reconciling concept and measurement. Oikos, 117, 1227–1239. 28. Staniczenko P. P., Kopp J. C. & Allesina S. (2013) The ghost of nestedness in ecological networks. Nat. Commun., 4, 1391. © The authors 2017. Published by Oxford University Press. All rights reserved. Journal of Complex Networks, Oxford University Press.
ISSN 2051-1310. eISSN 2051-1329. DOI: 10.1093/comnet/cnx059.
Prominent presence–absence tables like the famous 'Darwin's finches' data set (see Table 1) have been objects of intense study in the last decades.

Table 1. 'Darwin's finches': presence of $$13$$ finch species on $$17$$ islands of the Galápagos archipelago (data from [1]). The final column gives the row sums $$\mathbf{r}$$; the final row gives the column sums $$\mathbf{c}$$.

Species   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17    r
A         0  0  1  1  1  1  1  1  1  1  0  1  1  1  1  1  1   14
B         1  1  1  1  1  1  1  1  1  1  0  1  0  1  1  0  0   13
C         1  1  1  1  1  1  1  1  1  1  1  1  0  1  1  0  0   14
D         0  0  1  1  1  0  0  1  0  1  0  1  1  0  1  1  1   10
E         1  1  1  0  1  1  1  1  1  1  0  1  0  1  1  0  0   12
F         0  0  0  0  0  0  0  0  0  0  1  0  1  0  0  0  0    2
G         0  0  1  1  1  1  1  1  1  0  0  1  0  1  1  0  0   10
H         0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0    1
I         0  0  1  1  1  1  1  1  1  1  0  1  0  0  1  0  0   10
J         0  0  1  1  1  1  1  1  1  1  0  1  0  1  1  0  0   11
K         0  0  1  1  1  0  1  1  0  1  0  0  0  0  0  0  0    6
L         0  0  1  1  0  0  0  0  0  0  0  0  0  0  0  0  0    2
M         1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1   17
c         4  4 11 10 10  8  9 10  8  9  3 10  4  7  9  3  3  122

Motivation In sciences like ecology, empirical data are typically collected by observing and classifying species in the field. However, due to limited resources, ecologists can spend only a restricted amount of time on data collection. Consequently, despite all effort and carefulness, the collected data are typically imperfect. In particular, species-interaction networks will most likely lack rarely occurring interactions between species and, similarly, rare species may falsely be declared absent in a presence–absence table. On the other hand, potential misclassification of species may introduce relations that are not present in reality. Thus, an observed network will often differ at least slightly from the 'true' network. One way to deal with such problems is to incorporate the imperfectness of the data into the network analysis. For example, by assuming that some edges of the true network remained unobserved, we can model the true network as an unknown element of a set of graphs whose node degrees are allowed to exceed the observed ones by a predefined value. Then, the expected structure of the true network can be approximated by drawing uniformly distributed samples from this set. The application of such a procedure requires a method for the uniform sampling of bipartite graphs whose degrees lie in prescribed intervals. In this article, we focus on such sampling algorithms.
Problem definition Let $$m$$ and $$n$$ be positive integers and let $$\mathbf{{r}} = ({r}_1, \ldots, {r}_m)$$, $$\bar{{{\mathbf{r}}}} = (\bar{{{r}}}_1, \ldots, \bar{{{r}}}_m)$$, $$\mathbf{{c}} = ({c}_1, \ldots, {c}_n)$$, and $$\bar{{{\mathbf{c}}}} = (\bar{{{c}}}_1, \ldots, \bar{{{c}}}_n)$$ be integer vectors of length $$m$$ and $$n$$, respectively. Let $$G=(U,V,E)$$ be a simple bipartite graph with disjoint vertex sets $$U = \{ u_1, \ldots, u_m \}$$ and $$V =\{ v_1, \ldots, v_n \}$$ and edge set $$E \subseteq U \times V$$. Let $$\delta_G(v)$$ denote the degree of a vertex $$v \in U \cup V$$ in $$G$$. We define $$\Omega(\mathbf{{r}}, \bar{{{\mathbf{r}}}}, \mathbf{{c}}, \bar{{{\mathbf{c}}}})$$ to be the set of bipartite graphs $$G=(U,V,E)$$ such that $${r}_i \leq \delta_G(u_i) \leq \bar{{{r}}}_i$$ and $${c}_j \leq \delta_G(v_j) \leq \bar{{{c}}}_j$$ hold for each $$u_i \in U$$ and $$v_j \in V$$. The problem studied in this article is how to sample uniformly at random from the set $$\Omega(\mathbf{{r}}, \bar{{{\mathbf{r}}}}, \mathbf{{c}}, \bar{{{\mathbf{c}}}})$$. Notation While we use the term network to refer to an 'informal concept describing an object composed of elements and interactions' [2], we use the term graph to denote an abstract mathematical representation of such an object. In such a representation, we neglect the original interpretation of the data and focus on its underlying structure. A bipartite graph $$G=(U,V,E)$$ can be uniquely represented by an $$m \times n$$ binary matrix $$M$$.
The matrix $$M = (m_{ij})$$ is called the bi-adjacency matrix of $$G$$ and is defined by   \[ m_{ij} = \begin{cases} 1,& \text{ if } \{u_i, v_j\} \in E\\ 0,& \text{ otherwise.} \end{cases} \] Thus, the problem can be reformulated as uniformly sampling a binary matrix $$M \in \{0,1\}^{m \times n}$$ such that $$\mathbf{{r}}$$ and $$\bar{{{\mathbf{r}}}}$$ are lower and upper bounds on the row sums of $$M,$$ and $$\mathbf{{c}}$$ and $$\bar{{{\mathbf{c}}}}$$ are lower and upper bounds on the column sums of $$M$$. In many cases, we do not need to distinguish between vertices from $$U$$ and $$V$$. Thus, we define $${d} \colon U \cup V \to \mathbb{Z}$$ and $$\bar{{{d}}} \colon U \cup V \to \mathbb{Z}$$ to assign the lower and upper degree bounds to the set of vertices:   \[ {d}(v) = \begin{cases} {r}_i,& \text{if } v = u_i \in U\\ {c}_j,& \text{if } v = v_j \in V \end{cases} \qquad \bar{{{d}}}(v) = \begin{cases} \bar{{{r}}}_i,& \text{if } v = u_i \in U\\ \bar{{{c}}}_j,& \text{if } v = v_j \in V. \end{cases} \] In cases where the integer vectors are known from the context, we will abbreviate $$\Omega(\mathbf{{r}}, \bar{{{\mathbf{r}}}}, \mathbf{{c}}, \bar{{{\mathbf{c}}}})$$ with $$\Omega$$.
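In the matrix formulation, membership in $$\Omega(\mathbf{{r}}, \bar{{{\mathbf{r}}}}, \mathbf{{c}}, \bar{{{\mathbf{c}}}})$$ reduces to checking row and column sums against the bounds. A minimal sketch (the function name is ours):

```python
def in_omega(M, r, r_bar, c, c_bar):
    """Check whether the binary matrix M (given as a list of rows) lies
    in Omega(r, r_bar, c, c_bar): every row sum must lie in
    [r_i, r_bar_i] and every column sum in [c_j, c_bar_j]."""
    rows = [sum(row) for row in M]            # vertex degrees in U
    cols = [sum(col) for col in zip(*M)]      # vertex degrees in V
    return (all(lo <= x <= hi for lo, x, hi in zip(r, rows, r_bar))
            and all(lo <= x <= hi for lo, x, hi in zip(c, cols, c_bar)))
```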
Consequently, due to its high importance for network analysis, the uniform sampling of bipartite graphs with exactly prescribed degrees has received much attention. As this is the special case of our problem in which $$\mathbf{{c}} = \bar{{{\mathbf{c}}}}$$ and $$\mathbf{{r}} =\bar{{{\mathbf{r}}}}$$, we briefly review the most important ideas of these methods. Methods for randomly constructing bipartite graphs whose degrees match a pair of given integer vectors $$\mathbf{r}$$ and $$\mathbf{c}$$ can be roughly divided into two groups. The first group of algorithms, like the ones presented in this article, follow the Markov chain Monte Carlo (MCMC) approach. Such algorithms simulate a Markov chain whose state space is defined as the set of bipartite graphs from which we want to draw samples and whose stationary distribution is uniform. In particular, the famous switch chain has been used for decades to randomize bipartite graphs. Here, the key idea is to randomly select two edges and exchange their endpoints. It is well known that all bipartite graphs with fixed degrees can be generated by the iterated application of switches. The number of steps $$\tau(\varepsilon)$$ until the Markov chain's distribution is within $$\varepsilon$$ of its stationary distribution is called the total mixing time of the Markov chain. Although the total mixing time can in principle be computed by complete enumeration of the Markov chain's state space [6], this approach is not feasible in practice, since the number of states typically grows exponentially with the size of the bipartite graph. A Markov chain is called rapidly mixing if its total mixing time can be bounded from above by a polynomial in the size of the input. Proving that a Markov chain is rapidly mixing is often a challenging task in theoretical computer science. Although the switch chain has been proven to be rapidly mixing for several instance classes [7–9], it is unknown whether this is true in general.
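A single step of the classical switch chain can be sketched as follows; edges are represented as (u, v) pairs, and the helper name is ours.

```python
import random

def switch_step(edges, rng=random):
    """One step of the switch chain: pick two edges uniformly at random
    and try to exchange their endpoints. If the exchange would create an
    edge that already exists, the chain stays at the current state (a
    loop transition). Both outcomes preserve all vertex degrees."""
    (u1, v1), (u2, v2) = rng.sample(sorted(edges), 2)
    if (u1, v2) in edges or (u2, v1) in edges:
        return edges                                   # loop transition
    return (edges - {(u1, v1), (u2, v2)}) | {(u1, v2), (u2, v1)}
```

Note that the case where the two chosen edges share an endpoint is handled automatically: the exchange would then reproduce an existing edge, so the step is a loop transition.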
There are several modifications of the classical switch chain, like the curveball algorithm presented by Strona et al. [10] or the algorithm presented by Verhelst [11]. Experimental studies showed that such algorithms are empirically more efficient than the classical switch chain [12]. Bezáková et al. [13] presented a simulated annealing algorithm which produces uniformly distributed bipartite graphs and whose running time is polynomially bounded. However, its running time is still too large to be of practical use. The second group of algorithms uses other approaches like rejection sampling or importance sampling [1, 14, 15]. Although they do not necessarily produce uniformly distributed samples, they can be used for the unbiased estimation of expected values, which is required by many applications. Harrison and Miller [14] showed that sequential importance sampling can be very efficient in practice, although there are counterexamples on which the approach behaves poorly [16]. Exploiting the connection between counting and sampling, Miller and Harrison showed that exact sampling of bipartite graphs is feasible for many real-world data sets [17]. In ecological null-network analysis, several methods have been discussed to create random bipartite graphs whose degrees are allowed to vary from the observed ones. As one of the simplest algorithms, the EE-algorithm [18] distributes edges equiprobably between each pair of vertices. Thereby, it produces random bipartite graphs that possess the same edge total as the observed network but whose vertex degrees may vary entirely at random. To approximately preserve the degrees of the observed network, a family of so-called PP-algorithms can be used to produce bipartite graphs in which the probability of an edge is proportional to the corresponding vertex degrees in the observed network [19].
Ulrich and Gotelli [20] present a PP-algorithm that produces random bipartite graphs whose degrees may vary freely but in which the expected values of the degrees are intended to match the observed network. Thus, the degrees of a null network will be distributed randomly around the observed ones. In contrast to the PP-approach, our model assumes that the ‘true’ network is unknown and might be different from the observed one. Consequently, our method is not intended to produce random samples whose edge total matches that of the observed network. Instead, we allow the edge total to vary within the margins specified by the intervals. In particular, our method may be used to create null networks whose degrees are at least as large as the observed ones, which is not possible with the EE- or PP-approach. In addition, there is, to the best of our knowledge, no formal proof that the sampling of the proportional algorithms is uniform. Contribution In this article, we present two sampling algorithms based on the MCMC method. These algorithms simulate Markov chains whose state space is the set $$\Omega$$. We prove that the stationary distribution of these Markov chains is uniform, thereby showing that the sampling algorithms can be used for uniform sampling of bipartite graphs whose degrees lie in prescribed intervals. As shown by experiments, the first Markov chain is slowed down by a high number of so-called loop transitions, which do not change the state of the Markov chain. For this reason, we designed the second Markov chain to avoid loop transitions at the price of a larger running time in each step. Since the running time of an MCMC algorithm is determined by the convergence of its underlying Markov chain and the running time spent in each step, it is not immediately clear under which conditions the second chain should be preferred to the first.
In a set of experiments, we evaluate both Markov chains and find that the second Markov chain works very well if the given degree intervals describe scale-free networks. In contrast, the first chain is more efficient if the degrees are nearly regular. We conclude this article by showing how our sampling methods can be applied to study the properties of partially observed networks. For this purpose, we assume that ‘Darwin’s finches’ are just a partial observation of a larger ‘true’ network and show how our sampling methods may be used to approximate some expected properties of this network. 2. Methods Our sampling algorithm is based on the famous MCMC method. Thus, we simulate a Markov chain whose state space is the set $$\Omega$$ and whose stationary distribution is uniform. Starting with an arbitrary bipartite graph $$G \in \Omega$$, we randomly modify $$G$$ to create a state $$G' \in \Omega$$. Afterwards, we set $$G \gets G'$$ and iterate. This produces a series of graphs $$G_0, G_1, \ldots, G_t$$ from $$\Omega$$. After $$t$$ steps, we stop the process and return $$G_t$$ as a random sample. One of the major challenges of the MCMC method is to choose the number of steps $$t$$ large enough that the sample $$G_t$$ is ‘sufficiently random’. Initial state The MCMC method requires an arbitrary graph $$G_0 = (U,V,E) \in \Omega$$ that is used as the initial state. In many applications, an initial graph is given by the observed network. However, if no observed network is at hand, the first step needs to be the construction of an appropriate bipartite graph whose degrees lie within the prescribed bounds. This is a variant of the classical graph realization problem, which can be solved in $$O(|U| + |V| + |E|)$$ time [21]. 2.1 Simple Markov chain The following Markov chain can be used for the uniform sampling of bipartite graphs from the set $$\Omega$$.
The Markov chain is based on the three operations switch, flip and shift that are used to transform a bipartite graph $$G \in \Omega$$ into $$G' \in \Omega$$ (see Fig. 1). Fig. 1. Visualization of a switch (left), flip (mid) and shift operation (right). Definition 2.1 (Switch) Let $$G = (U,V, E) \in \Omega$$. Randomly choose four vertices $$u_i, u_k \in U$$ and $$v_j, v_l \in V$$ without replacement. If $$\{u_i, v_j\} \in E$$, $$\{u_k, v_l\} \in E$$, $$\{u_i, v_l\} \not\in E$$ and $$\{u_k, v_j\} \not\in E$$, return the graph $$G'=(U,V,E')$$ where $$E' = \left(E \setminus \left\{ \{u_i, v_j\}, \{u_k, v_l\} \right\} \right) \cup \left\{ \{u_i, v_l\}, \{u_k, v_j\} \right\}$$. If $$\{u_i, v_l\} \in E$$, $$\{u_k, v_j\} \in E$$, $$\{u_i, v_j\} \not\in E$$ and $$\{u_k, v_l\} \not\in E$$, return the graph $$G'=(U,V,E')$$ where $$E' = \left(E \setminus \left\{ \{u_i, v_l\}, \{u_k, v_j\} \right\} \right) \cup \left\{ \{u_i, v_j\}, \{u_k, v_l\} \right\}$$. Otherwise, return $$G$$. It is easy to verify that the probability $$p_{\text{switch}}(G,G') = \binom{m}{2}^{-1}\binom{n}{2}^{-1}$$ of transforming a graph $$G$$ into $$G' \not= G$$ via a switch operation is equal to $$p_{\text{switch}}(G',G)$$ since it only depends on the number of vertices, which is invariant for each pair of graphs $$G, G' \in \Omega$$. Definition 2.2 (Flip) Let $$G = (U,V, E) \in \Omega$$. Choose two vertices $$u \in U$$ and $$v \in V$$ uniformly at random. If $$\{u,v\} \in E$$, set $$E' = E \setminus \left\{ \{u,v \} \right\}$$, otherwise set $$E' = E \cup \left\{ \{u,v\} \right\}$$. Set $$G' = (U,V,E')$$. If $$d(x) \leq \delta_{G'}(x) \leq \bar{d}(x)$$ for each $$x \in \{u,v\}$$, return $$G'$$. Otherwise, return $$G$$. Unlike the switch and shift operations, the flip changes the total number of edges in $$G$$.
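To make the two operations concrete, the following sketch implements the switch and the flip on the bi-adjacency matrix $$M$$. This is illustrative code, not the authors' implementation; all names and the matrix representation are our own choices.

```python
import random

def switch(M, rng=random):
    # Definition 2.1: choose two rows and two columns; if the induced 2x2
    # submatrix is a "checkerboard", exchange its two diagonals. All row and
    # column sums are preserved; otherwise M is returned unchanged.
    i, k = rng.sample(range(len(M)), 2)
    j, l = rng.sample(range(len(M[0])), 2)
    if M[i][j] == M[k][l] == 1 and M[i][l] == M[k][j] == 0:
        M[i][j], M[k][l], M[i][l], M[k][j] = 0, 0, 1, 1
    elif M[i][l] == M[k][j] == 1 and M[i][j] == M[k][l] == 0:
        M[i][l], M[k][j], M[i][j], M[k][l] = 0, 0, 1, 1
    return M

def flip(M, r_lo, r_hi, c_lo, c_hi, rng=random):
    # Definition 2.2: toggle a uniformly chosen entry and revert the change
    # (a loop transition) if a row or column bound would be violated.
    i = rng.randrange(len(M))
    j = rng.randrange(len(M[0]))
    M[i][j] ^= 1
    row = sum(M[i])
    col = sum(r[j] for r in M)
    if not (r_lo[i] <= row <= r_hi[i] and c_lo[j] <= col <= c_hi[j]):
        M[i][j] ^= 1
    return M
```

Note that both functions return $$M$$ itself on a loop transition, mirroring the 'otherwise, return $$G$$' clauses of the definitions.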
The probability of transforming a graph $$G$$ into $$G' \not= G$$ via a single flip operation is $$p_{\text{flip}}(G,G') = m^{-1}n^{-1} = p_{\text{flip}}(G',G)$$. Definition 2.3 (Shift) Let $$G = (U,V, E) \in \Omega$$. 1. Choose two vertices $$u \in U$$ and $$v \in V$$ uniformly at random. 2. Choose a vertex $$w$$ from $$(U \cup V) \setminus \{u,v\}$$ uniformly at random. 3. If $$w \in V$$, switch the roles of $$u$$ and $$v$$. 4. If $$\{u,v\} \not\in E$$ and $$\{w,v\} \in E$$, replace the edge $$\{ w,v\}$$ by $$\{u,v\}$$. Thus, create $$G'=(U,V,E')$$ with $$E'= (E \setminus \left\{ \{w,v\} \right\}) \cup \left\{ \{ u,v \} \right\}$$ and go to Step 7. 5. If $$\{u,v\} \in E$$ and $$\{w,v\} \not\in E$$, replace the edge $$\{ u,v\}$$ by $$\{w,v\}$$. Thus, create $$G'=(U,V,E')$$ with $$E'= (E \setminus \left\{ \{u,v\} \right\}) \cup \left\{ \{ w,v \} \right\}$$ and go to Step 7. 6. In all other cases, return $$G$$. 7. If $$d(x) \leq \delta_{G'}(x) \leq \bar{d}(x)$$ for each $$x \in \{u,w \}$$, return $$G'$$; otherwise, return $$G$$. The shift operation only affects the degrees of the vertices $$u$$ and $$w$$; the degrees of all other vertices remain unchanged. Transforming $$G$$ into $$G' \not=G$$ via a single shift operation has a probability of $$p_{\text{shift}}(G,G') = m^{-1}n^{-1}(m+n-2)^{-1} = p_{\text{shift}}(G',G)$$. We can now define our first Markov chain. Definition 2.4 (Simple Markov chain) Let $$G = (U,V, E) \in \Omega$$. Let $$q \colon \{ \text{switch}, \text{flip}, \text{shift} \} \to [0,1]$$ be an arbitrary probability function that assigns positive probability to each operation. 1. Randomly select one of the three operations switch, flip and shift with respect to the probability function $$q$$. 2. Apply the operation to transform $$G$$ into $$G'$$. 3. Set $$G \gets G'$$ and go to Step 1. Proof of correctness In the remainder of this section, we will prove that the simple chain can be used for uniform sampling from the set $$\Omega$$.
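Before turning to the proof, the shift and the step rule of Definition 2.4 can be sketched in code. This is an illustrative sketch with our own names; for brevity, the shift is shown only for the case $$w \in U$$ (the case $$w \in V$$ is symmetric and corresponds to applying the same move to the transposed matrix).

```python
import random

def shift_rows(M, r_lo, r_hi, rng=random):
    # Definition 2.3, restricted to the case w in U. One endpoint of an edge
    # in column v is moved between rows u and w; the column sum of v and all
    # other row sums stay unchanged, so only the bounds of u and w matter.
    u = rng.randrange(len(M))
    v = rng.randrange(len(M[0]))
    w = rng.choice([x for x in range(len(M)) if x != u])
    if M[u][v] == 0 and M[w][v] == 1:
        src, dst = w, u
    elif M[u][v] == 1 and M[w][v] == 0:
        src, dst = u, w
    else:
        return M  # loop transition
    if sum(M[dst]) + 1 <= r_hi[dst] and sum(M[src]) - 1 >= r_lo[src]:
        M[src][v], M[dst][v] = 0, 1
    return M

def chain_step(M, operations, q, rng=random):
    # Definition 2.4: select one operation according to q and apply it.
    # `operations` is any list of callables acting on M, e.g. a switch,
    # a flip and a shift as defined in the text.
    op = rng.choices(operations, weights=q)[0]
    return op(M)
```

Iterating `chain_step` then produces the series $$G_0, G_1, \ldots, G_t$$ described at the beginning of this section.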
Therefore, it suffices to show that the Markov chain is irreducible, aperiodic and reversible with respect to the uniform distribution on $$\Omega$$. It then follows from the fundamental theorem of Markov chains (see, e.g. standard textbooks like [22]) that the Markov chain’s stationary distribution is uniform. Definition 2.5 Let $$p \colon \Omega \times \Omega \to [0,1]$$ be the transition probability of a Markov chain with state space $$\Omega$$. The directed graph $$\Gamma = (\Omega, \Phi)$$ with $$\Phi = \{ (x,y) \colon p(x,y) > 0 \}$$ is called the state graph of the corresponding Markov chain. A Markov chain is called irreducible if $$\Gamma$$ is strongly connected. It is called aperiodic if $$\Gamma$$ is not bipartite, and it is called reversible with respect to a probability function $$\pi \colon \Omega \to [0,1]$$ if $$\pi(x)p(x,y) = \pi(y) p(y,x)$$ for each $$x,y \in \Omega$$. We start by proving the irreducibility of the chain. For the following lemmas, let $$G=(U,V,E)$$ and $$G'=(U,V,E')$$ be arbitrary bipartite graphs from the set $$\Omega$$. Definition 2.6 Let $$G \triangle G' := (E\setminus E') \cup (E' \setminus E)$$ be the symmetric difference of the edges of $$G$$ and $$G'$$. A sequence of nodes $$p=(v_1, v_2, \ldots, v_k)$$ is called an alternating path (in $$G \triangle G'$$) if the edges between consecutive vertices lie alternatingly in $$E \setminus E'$$ and $$E' \setminus E$$. An alternating path $$p$$ is called non-extendable if there is no alternating path containing $$p$$ as a proper subpath. An alternating path is called simple if $$v_i \not= v_j$$ for $$1 \leq i \neq j \leq k$$. A sequence of nodes $$(v_1, v_2, \ldots, v_{k}, v_1)$$ is called an alternating cycle (in $$G \triangle G'$$) if $$(v_1, v_2, \ldots, v_{k})$$ is an alternating path and the closing edge $$\{v_k, v_1\}$$ preserves the alternation. An alternating cycle is called simple if it does not contain a smaller alternating cycle.
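Definition 2.6 can be made concrete with a few lines of code. In this illustrative sketch (our own naming, not part of the article), edges are modelled as `frozenset`s so that $$\{u,v\} = \{v,u\}$$:

```python
def sym_diff(E, E_prime):
    # G triangle G': edges present in exactly one of the two graphs.
    return (E - E_prime) | (E_prime - E)

def is_alternating_path(path, E, E_prime):
    # Check that the edges between consecutive vertices of `path` lie
    # alternatingly in E \ E' and E' \ E (Definition 2.6).
    sides = []
    for a, b in zip(path, path[1:]):
        edge = frozenset((a, b))
        if edge in E - E_prime:
            sides.append(0)
        elif edge in E_prime - E:
            sides.append(1)
        else:
            return False  # edge missing from the symmetric difference
    return all(s != t for s, t in zip(sides, sides[1:]))
```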
In the special case of $$\mathbf{{r}}=\bar{{{\mathbf{r}}}}$$ and $$\mathbf{{c}}=\bar{{{\mathbf{c}}}}$$, the symmetric difference $$G\triangle G'$$ can be partitioned into a set of simple alternating cycles. Furthermore, in this case, it is well known that an alternating cycle $$(v_1, v_2, \ldots, v_k, v_1)$$ in $$G \triangle G'$$ can be completely dissolved via a series of at most $$k/2$$ switch operations. Thus, when $$\mathbf{{r}}=\bar{{{\mathbf{r}}}}$$ and $$\mathbf{{c}}=\bar{{{\mathbf{c}}}}$$, the graphs $$G$$ and $$G'$$ can be transformed into each other by at most $$|G \triangle G'|/2$$ switch operations. By the following theorems, we will prove a similar result for our more general scenario. Lemma 2.1 Let $$(v_1, v_2, \ldots, v_{k}, v_1)$$ be a simple alternating cycle in $$G \triangle G'$$. We can apply a switch operation to $$G$$ or $$G'$$, thereby transforming $$G$$ into $$\bar{{{G}}}$$ or $$G'$$ into $$\bar{{{G}}}'$$ such that $$|G \triangle G'| - |\bar{{{G}}} \triangle \bar{{{G}}}'| \geq 2$$. Proof. We prove the theorem inductively by the length $$k$$ of the alternating cycle. Assume that $$\{v_1, v_2\} \in E$$ and $$\{v_{k-1}, v_{k}\} \in E$$, otherwise switch the roles of $$G$$ and $$G'$$. If $$k=4$$, we can immediately apply a switch to $$G$$ which transforms $$G$$ into $$\bar{{{G}}} = (U,V, E \setminus \{ \{v_1, v_2\}, \{v_3, v_4\} \} \cup \{ \{v_1, v_4\}, \{v_2, v_3\} \})$$ and leaves $$\bar{{{G}}}' \gets G'$$ unchanged. Thus, we have $$|G \triangle G'| - |\bar{{{G}}} \triangle \bar{{{G}}}'| = 4$$. Now let $$k > 4$$ and assume the theorem has been proven for alternating cycles of length $$i < k$$. Depending on whether or not the edge $$\{v_1, v_4\}$$ exists in $$G$$ and $$G'$$, four cases can occur (see Fig. 2). If $$\{v_1, v_4\} \not\in E$$ and $$\{v_1, v_4\} \not\in E'$$, we can replace the edges $$\{v_1, v_2\} \in E$$ and $$\{v_3, v_4\} \in E$$ by $$\{v_1, v_4\}$$ and $$\{v_2, v_3\}$$. 
Thus, a switch operation may transform $$G$$ into $$\bar{{{G}}} = (U,V, \bar{{{E}}})$$ with $$\bar{{{E}}} = (E \setminus \left\{ \{v_1, v_2\}, \{v_3, v_4\} \right\}) \cup \left\{ \{v_1, v_4\}, \{v_2, v_3\} \right\} )$$. We set $$\bar{{{G}}}' \gets G'$$ and have $$|G \triangle G'| - |\bar{{{G}}} \triangle \bar{{{G}}}'| = 2$$. If $$\{v_1, v_4\} \not\in E$$ and $$\{v_1, v_4\} \in E'$$, we can replace the edges $$\{v_1, v_2\} \in E$$ and $$\{v_3, v_4\} \in E$$ by $$\{v_1, v_4\}$$ and $$\{v_2, v_3\}$$. Thus, a switch operation may transform $$G$$ into $$\bar{{{G}}} = (U,V, \bar{{{E}}})$$ with $$\bar{{{E}}} = (E \setminus \left\{ \{v_1, v_2\}, \{v_3, v_4\} \right\}) \cup \left\{ \{v_1, v_4\}, \{v_2, v_3\} \right\} )$$. We set $$\bar{{{G}}}' \gets G'$$ and have $$|G \triangle G'| - |\bar{{{G}}} \triangle \bar{{{G}}}'| = 4$$. Remarkably, this does not create a smaller alternating cycle. Instead, it creates the alternating path $$(v_4, v_5, \ldots, v_k, v_1)$$ in $$\bar{{{G}}} \triangle \bar{{{G}}}'$$ of length $$k-2$$. If $$\{v_1, v_4\} \in E$$ and $$\{v_1, v_4\} \not\in E'$$, we do not directly find an applicable switch operation. However, there is an alternating cycle $$(v_1, v_4, \ldots, v_k, v_1)$$ of length $$k-2$$. Thus, by induction hypothesis, we can find a switch operation that transforms $$G$$ into $$\bar{{{G}}}$$ or $$G'$$ into $$\bar{{{G}}}'$$ such that $$|G \triangle G'| - |\bar{{{G}}} \triangle \bar{{{G'}}}| \geq 2$$. If $$\{v_1, v_4\} \in E$$ and $$\{v_1, v_4\} \in E'$$, we can replace the edges $$\{v_1, v_4\} \in E'$$ and $$\{v_2, v_3\} \in E'$$ by $$\{v_1, v_2\}$$ and $$\{v_3, v_4\}$$. Thus, a switch operation may transform $$G'$$ into $$\bar{{{G}}}' = (U,V, \bar{{{E}}}')$$ with $$\bar{{{E}}}' = (E' \setminus \left\{ \{v_1, v_4\}, \{v_2, v_3\} \right\}) \cup \left\{ \{v_1, v_2\}, \{v_3, v_4\} \right\}$$. We set $$\bar{{{G}}} \gets G$$ and have $$|G \triangle G'| - |\bar{{{G}}} \triangle \bar{{{G'}}}| = 2$$. □ Fig. 2. 
Example of an alternating cycle $$c=(v_1, v_2, \ldots, v_6, v_1)$$ of length $$k=6$$. Straight lines are present in $$G$$, and dashed lines are present in $$G'$$. From left to right: cases one to four of Lemma 2.1. Lemma 2.2 Let $$(v_1, v_2, \ldots, v_k)$$ be a simple non-extendable alternating path in $$G \triangle G'$$. (a) If $$\{v_1, v_2\} \in E$$, we may remove an edge $$\{v_1, \cdot\}$$ from $$E$$ or add a new edge $$\{v_1, \cdot\}$$ to $$E'$$ without violating the lower and upper bounds on the degree of $$v_1$$. (b) If $$\{v_1, v_2\} \in E'$$, we may add a new edge $$\{v_1, \cdot\}$$ to $$E$$ or remove an edge $$\{v_1, \cdot\}$$ from $$E'$$ without violating the lower and upper bounds on the degree of $$v_1$$. Proof. Assume $$\{v_1, v_2\} \in E$$, otherwise switch the roles of $$G$$ and $$G'$$. Since the path is non-extendable, there is no edge $$\{v_1, \cdot\} \in E' \setminus E$$. Thus, $$\delta_G(v_1)$$ exceeds $$\delta_{G'}(v_1)$$ by at least one. Since $$G$$ and $$G'$$ are valid realizations and so do not violate the lower and upper bounds on the degree of $$v_1$$, the inequalities $$d(v_1) \leq \delta_{G'}(v_1) < \delta_G(v_1) \leq \bar{d}(v_1)$$ must hold. Hence, by adding an additional edge $$\{v_1, \cdot\}$$ to $$E'$$ or removing an edge $$\{v_1, \cdot\}$$ from $$E$$, the lower and upper bounds on the degree of $$v_1$$ are not violated. □ Lemma 2.3 Let $$p=(v_1, v_2, \ldots, v_{k-1}, v_{k})$$ be a simple non-extendable alternating path in $$G \triangle G'$$. By a series of at most two flip, switch or shift operations, we can transform $$G$$ into $$\bar{G}$$ and $$G'$$ into $$\bar{G}'$$ such that $$|G \triangle G'| - |\bar{G} \triangle \bar{G}'| \geq 1$$.
Proof. First, let $$p$$ be of odd length, thus $$k$$ is even. Assume that $$\{v_1, v_2\} \in E$$ and $$\{v_{k-1}, v_{k}\} \in E$$, otherwise switch the roles of $$G$$ and $$G'$$. Depending on the existence of the edge $$\{v_1, v_{k}\}$$ in $$G$$ and $$G'$$, we discuss four cases (see Fig. 3). If $$\{v_1, v_{k}\} \not\in E$$ and $$\{v_1, v_{k} \} \not\in E'$$, Lemma 2.2 allows us to add the edge $$\{v_1, v_{k}\}$$ to $$E'$$. Thus, a flip operation may transform $$G'$$ into $$\tilde{G}' = (U,V, \tilde{E}')$$ with $$\tilde{E}' = E' \cup \left\{ \{v_1, v_{k}\} \right\}$$. By introducing a new edge, we temporarily create a situation where $$|G \triangle G'| - |G \triangle \tilde{G}'| = -1$$. However, we additionally construct the alternating cycle $$(v_1, v_2, \ldots, v_k, v_1)$$ in $$G \triangle \tilde{G}'$$. By Lemma 2.1, we can find a switch operation applicable to $$G$$ or $$\tilde{G}'$$ leading to $$\bar{G}$$ and $$\bar{G}'$$ such that $$|G \triangle G'| - |\bar{G} \triangle \bar{G}'| \geq 1$$. If $$\{v_1, v_{k}\} \not\in E$$ and $$\{v_1, v_{k} \} \in E'$$, the closed path $$(v_1, v_2, \ldots, v_{k}, v_1)$$ is an alternating cycle in $$G \triangle G'$$. Thus, the path $$p$$ is not a non-extendable alternating path. Since this contradicts our assumption, this case cannot occur. If $$\{v_1, v_{k}\} \in E$$ and $$\{v_1, v_{k}\} \not\in E'$$, Lemma 2.2 allows us to remove the edge $$\{v_1, v_k\}$$ from $$E$$. Thus, a flip operation may transform $$G$$ into $$\bar{G} = (U,V, \bar{E})$$ with $$\bar{E} = E \setminus \{ \{v_1, v_{k}\} \}$$. We set $$\bar{G}' \gets G'$$ and have $$|G \triangle G'| - |\bar{G} \triangle \bar{G}'| = 1$$. If $$\{v_1, v_{k}\} \in E$$ and $$\{v_1, v_{k}\} \in E'$$, Lemma 2.2 allows us to remove the edge $$\{v_1, v_k\}$$ from $$E$$. Thus, a flip operation may transform $$G$$ into $$\tilde{G} = (U,V, \tilde{E})$$ with $$\tilde{E} = E \setminus \{ \{v_1, v_{k}\} \}$$.
By removing an edge, we temporarily create a situation in which $$|G \triangle G'| - | \tilde{G} \triangle G'| = -1$$. However, we also create the alternating cycle $$(v_1, v_2, \ldots, v_{k}, v_1)$$ in $$\tilde{G} \triangle G'$$. By Lemma 2.1, we can find a switch operation applicable to either $$\tilde{G}$$ or $$G'$$ leading to $$\bar{G}$$ and $$\bar{G}'$$ such that $$|G \triangle G'| - |\bar{G} \triangle \bar{G}'| \geq 1$$. Now let $$p$$ be of even length, thus $$k$$ is odd. In this case, the edge $$\{v_1, v_{k}\}$$ can exist neither in $$E$$ nor in $$E'$$ since $$v_1$$ and $$v_{k}$$ are in the same vertex set. However, it suffices to focus on the existence of the edge $$\{v_1, v_{k-1} \}$$ in $$G$$ and $$G'$$ (see Fig. 4). If $$\{v_1, v_{k-1} \} \not\in E$$ and $$\{ v_1, v_{k-1} \} \not\in E'$$, Lemma 2.2 allows us to replace the edge $$\{ v_{k-1}, v_k\} \in E'$$ by $$\{v_{k-1}, v_1 \}$$. Thus, a shift operation may transform $$G'$$ into $$\tilde{G}'=(U,V, \tilde{E}')$$ with $$\tilde{E}' = (E' \setminus \left\{ \{v_{k-1}, v_{k} \} \right\}) \cup \left\{ \{ v_{k-1}, v_1 \} \right\}$$. While this leads to a situation where $$|G \triangle G'| - |G \triangle \tilde{G}'| = 0$$, we also create the alternating cycle $$(v_1, \ldots, v_{k-1}, v_1)$$ in $$G \triangle \tilde{G}'$$. By Lemma 2.1, we can find a switch operation applicable to $$G$$ or $$\tilde{G}'$$ leading to $$\bar{G}$$ and $$\bar{G}'$$ such that $$|G \triangle G'| - |\bar{G} \triangle \bar{G}'| \geq 2$$. If $$\{v_1, v_{k-1} \} \not\in E$$ and $$\{v_1, v_{k-1}\} \in E'$$, the closed path $$(v_1, v_2, \ldots, v_{k-1}, v_1)$$ is an alternating cycle in $$G \triangle G'$$. Thus, the path $$p$$ is not a non-extendable alternating path. Since this contradicts our assumption, this case cannot occur. If $$\{v_1, v_{k-1}\} \in E$$ and $$\{v_1, v_{k-1} \} \not\in E'$$, Lemma 2.2 allows us to replace the edge $$\{v_1, v_{k-1}\} \in E$$ by $$\{v_k, v_{k-1}\}$$.
Thus, a shift operation transforms $$G$$ into $$\bar{G}=(U,V,\bar{E})$$ with $$ \bar{E} = (E \setminus \left\{ \{v_1, v_{k-1} \} \right\}) \cup \left\{ \{v_{k-1}, v_{k}\} \right\}$$. We set $$\bar{G}' \gets G'$$ and have $$|G \triangle G'| - |\bar{G} \triangle \bar{G}'| = 2$$. If $$\{v_1, v_{k-1} \} \in E$$ and $$\{ v_1, v_{k-1} \} \in E'$$, Lemma 2.2 allows us to replace the edge $$\{v_1, v_{k-1}\} \in E$$ by $$\{v_k, v_{k-1}\}$$. Thus, a shift operation transforms $$G$$ into $$\tilde{G}=(U,V, \tilde{E})$$ with $$\tilde{E} = (E \setminus \left\{ \{v_1, v_{k-1}\} \right\}) \cup \left\{ \{v_{k}, v_{k-1}\} \right\}$$. While this leads to a situation where $$|G \triangle G'| - |\tilde{G} \triangle G'| = 0$$, we also create the alternating cycle $$(v_1, \ldots, v_{k-1}, v_1)$$ in $$\tilde{G} \triangle G'$$. By Lemma 2.1, we can find a switch operation applicable to $$\tilde{G}$$ or $$G'$$ leading to $$\bar{G}$$ and $$\bar{G}'$$ such that $$|G \triangle G'| - |\bar{G} \triangle \bar{G}'| \geq 2$$. □ Fig. 3. Example of a non-extendable alternating path $$p=(v_1, v_2, \ldots, v_6)$$ of odd length ($$k=6$$). Straight lines are present in $$G$$, and dashed lines are present in $$G'$$. From left to right: cases one to four of the first part of Lemma 2.3. Fig. 4. Example of a non-extendable alternating path $$p=(v_1, v_2, \ldots, v_5)$$ of even length ($$k=5$$). Straight lines are present in $$G$$, and dashed lines are present in $$G'$$. From left to right: cases one to four of the second part of Lemma 2.3. Lemma 2.4 Let $$G$$ and $$G'$$ be arbitrary graphs from $$\Omega$$. By a series of at most $$2|G \triangle G'|$$ switch, flip and shift operations, we can transform $$G$$ and $$G'$$ into each other. Proof. We prove the lemma inductively by the size of $$G \triangle G'$$. If $$|G \triangle G'| = 0$$, the graphs must be identical. Otherwise, apply the following algorithm to construct either a simple non-extendable alternating path $$p$$ in $$G \triangle G'$$ or a simple alternating cycle $$c$$ in $$G \triangle G'$$. Choose an arbitrary edge $$\{v, w\} \in G \triangle G'$$ and let $$p\gets (v, w)$$. Let $$z \gets w$$ be the rightmost node of the path. While there is some edge $$\{z, y\} \in G \triangle G'$$ that can be used to extend $$p$$ (preserving its alternating structure), append $$y$$ to $$p$$. If this produces a cycle, terminate. Otherwise, let $$z \gets y$$ and iterate until $$p$$ cannot be extended. Now reverse the path $$p$$, let $$z \gets v$$ be the rightmost node and repeat the extension of $$p$$. The algorithm terminates with a simple non-extendable path or detects a simple alternating cycle. In the first case, we can reduce the size of $$G \triangle G'$$ via Lemma 2.3, while in the latter case, we can apply Lemma 2.1. Thus, we can inductively reduce $$|G \triangle G'|$$ to zero. The upper bound on the number of operations follows from Lemmas 2.3 and 2.1 since we need at most two operations to reduce the symmetric difference by at least one. □ Lemma 2.5 The simple Markov chain is aperiodic. Proof. Consider an arbitrary bipartite graph $$G = (U,V,E) \in \Omega$$. If $$|E| \leq 1$$, all switch operations will return $$G' = G$$.
Otherwise, if $$|E| > 1$$, a shift operation may select three vertices $$u$$, $$v$$ and $$w$$ such that $$\{u,v\} \in E$$ and $$\{v,w\} \in E$$. Thus, we return $$G$$ with positive probability. As the state graph possesses at least one loop arc, it cannot be bipartite. Thus, the Markov chain is aperiodic. □ Lemma 2.6 The simple Markov chain is reversible with respect to the uniform distribution. Proof. We showed that each operation can be reversed with the same probability. Hence, the transition probability $$p(G,G') = p(G',G)$$ is symmetric for each pair of states $$G, G' \in \Omega$$. Thus, the Markov chain is reversible with respect to the uniform distribution $$\pi(G) = |\Omega|^{-1}$$ for each $$G \in \Omega$$. □ Theorem 2.7 The stationary distribution of the simple Markov chain is the uniform distribution on $$\Omega$$. Proof. This follows from Lemmas 2.4, 2.5 and 2.6 and the fundamental theorem of Markov chains. □ 2.2 Informed Markov chain As shown in the previous section, the simple chain can be used for the uniform sampling of bipartite graphs from the set $$\Omega$$. However, a problem of this chain is the large number of loop transitions that arise from the uninformed way in which it selects vertices. In this section, we present a second Markov chain that is designed to converge faster to its stationary distribution at the price of an increased running time per step. It is based on two key ideas. First, we want to reduce the number of loop transitions by a more informed vertex selection. Additionally, we increase the number of edges that are affected by a single transition. Inspired by the curveball algorithm [10], we replace the simple operations switch, shift and flip by their more complex counterparts trade, multi-shift and multi-flip. The main difference between the simple and the complex operations is the number of affected edges per transition.
Whereas the simple operations affect at most four edges per step, the number of edges affected by a complex operation is $${O}(|V|)$$. We start by defining three new operations. Definition 2.8 (Trade) Let $$G=(U,V,E) \in \Omega$$. 1. Choose two vertices $$u_i, u_k \in U$$ uniformly at random without replacement. 2. Determine the sets of vertices $$V_i$$, $$V_k$$ and $$V_{ik}$$ such that   \begin{align*} V_i &= \left\{ v \colon \{u_i, v\} \in E \wedge \{u_k, v\} \not\in E \right\}\!, \\ V_k &= \left\{ v \colon \{u_i, v\} \not\in E \wedge \{u_k, v\}\in E \right\}\!, \\ V_{ik} &= \left\{ v \colon \{u_i,v\} \in E \wedge \{u_k,v\} \in E \right\}\!. \end{align*} Thus, $$V_i$$ is the set of vertices that are adjacent to $$u_i$$ but not to $$u_k$$, $$V_k$$ is the set of vertices that are adjacent to $$u_k$$ but not to $$u_i$$ and $$V_{ik}$$ is the set of vertices that are adjacent to both $$u_i$$ and $$u_k$$. 3. Choose a subset $$S \subseteq V_i \cup V_k$$ of size $$|V_i|$$ uniformly at random. 4. Create the new sets of edges   \begin{align*} E'_i &= \left\{ \{u_i, v\} \colon v \in S \right\}\!,\\ E'_k &= \left\{ \{u_k, v\} \colon v \in (V_i \cup V_k) \setminus S \right\}\!, \\ E'_{ik} &= \left\{ \{u_i, v\}, \{u_k, v\} \colon v \in V_{ik} \right\} \end{align*} and return the bipartite graph $$G'=(U, V, E')$$ with $$E' = E'_i \cup E'_k \cup E'_{ik}$$. Figure 5 visualizes the trade operation. The trade may create a graph $$G'$$ such that $$G = G'$$ when, in Step 3, $$S$$ is chosen to be equal to $$V_i$$. In addition, it contains the switch as a special case when, in Step 3, $$S$$ is chosen such that $$| V_i \triangle S | = 2$$. Carstens [23] showed that the transition probabilities of the trade operation are symmetric. Fig. 5. Visualization of a trade operation. Transforming the left graph into the right: $$V_i = \{ v_1, v_2\}$$ (grey, solid border), $$V_k = \{v_4, v_5, v_6\}$$ (grey, dashed border).
The set $$S=\{v_4, v_5 \}$$ was chosen randomly from $$V_i \cup V_k$$. Definition 2.9 (Multi-flip) Let $$G=(U,V,E) \in \Omega$$. 1. Choose a vertex $$u \in U$$ uniformly at random. 2. Determine the sets of vertices $$V_0$$ and $$V_1$$ such that   \begin{align*} V_0 &= \left\{ v \colon \{u,v\} \not\in E \wedge \delta_G(v) < \bar{d}(v) \right\}\!,\\ V_1 &= \left\{ v \colon \{u,v\} \in E \wedge \delta_G(v) > d(v) \right\}\!. \end{align*} Thus, $$V_0$$ is the set of vertices that are not adjacent to $$u$$ and may be part of an additional edge without violating the upper bound on their degree. In contrast, $$V_1$$ is the set of vertices that are adjacent to $$u$$ and may lose an incident edge without violating the lower bound on their degree. 3. Let $$s = \min \left(\bar{d}(u) - d(u), |V_0 \cup V_1| \right)$$. If $$s = 0$$, return $$G$$. 4. Choose a subset $$S \subseteq V_0 \cup V_1$$ of size $$s$$ uniformly at random. 5. Choose a subset $$T \subseteq S$$ of arbitrary size uniformly at random. 6. Create the sets of edges $$E_0$$ and $$E_1$$ such that   \begin{align*} E_0 = \left\{ \{u,v\} \in E \colon v \in T \right\}\!, \hspace{10mm} E_1 = \left\{ \{u,v\} \not\in E \colon v \in T \right\}\!. \end{align*} Remove the edges $$E_0$$ from $$E$$ and add the edges $$E_1$$, thus creating the bipartite graph $$G'=(U,V,E')$$ with $$E' = (E \setminus E_0) \cup E_1$$. 7. If $$d(u) \leq \delta_{G'}(u) \leq \bar{d}(u)$$, return $$G'$$; otherwise, return $$G$$. Figure 6 visualizes the multi-flip. The multi-flip contains the simple flip as a special case when $$|T|=1$$. In addition, the multi-flip may produce $$G'=G$$ when $$T = \emptyset$$. Fig. 6.
Visualization of a multi-flip operation. The numbers in brackets show the upper and lower bounds on the degrees. Transforming the left graph into the right: $$V_0 = \{ v_3, v_4\}$$ (grey, dashed border), $$V_1 = \{ v_1, v_2\}$$ (grey, solid border). Since the bounds on the degree of $$u$$ differ by three, we randomly chose the set $$S = \{v_2, v_3, v_4\}$$ of size three from $$V_0 \cup V_1$$. Subsequently, we randomly chose $$T = S$$ as a subset of $$S$$. Lemma 2.7 The transition probability of the multi-flip is symmetric. Proof. Consider two graphs $$G=(U,V,E)$$ and $$G'=(U,V,E')$$ that can be transformed into each other via a single multi-flip. To distinguish the sets $$V_0$$, $$V_1$$ and $$T$$ in $$G$$ and $$G'$$, we associate each set with the corresponding graph. Following from its definition, the probability of transforming a graph $$G$$ into $$G'\neq G$$ via a single multi-flip is   \begin{align*} p_{\text{multi-flip}}(G, G') = m^{-1} \binom{|V_0(G) \cup V_1(G)|}{s}^{-1} 2^{-s}, \end{align*} where $$s = \min(\bar{d}(u) - d(u), |V_0(G) \cup V_1(G)|)$$. To prove that $$p_{\text{multi-flip}}(G, G')$$ equals $$p_{\text{multi-flip}}(G', G)$$, it suffices to show $$|V_0(G) \cup V_1(G)| = |V_0(G') \cup V_1(G')|$$ since $$m$$ and $$\bar{d}(u) - d(u)$$ are invariant constants. Therefore, assume $$|V_0(G) \cup V_1(G)| > |V_0(G') \cup V_1(G')|$$, otherwise switch the roles of $$G$$ and $$G'$$.
Let $$v \in V_0(G)$$ be a vertex that is neither in $$V_0(G')$$ nor in $$V_1(G')$$. Since $$v \in V_0(G)$$, it follows from the definition of $$V_0$$ that $$\{u,v\} \not\in E$$. Now consider two cases that can occur while transforming $$G$$ into $$G'$$. If $$v \in T(G)$$, by the definition of the multi-flip the edge $$\{u,v\}$$ must be included in $$E'$$. Thus, $$\delta_{G'}(v) = \delta_G(v) + 1$$. Since $$\{u,v\} \in E'$$ and $${d}(v) < \delta_{G'}(v)$$, it follows that $$v \in V_1(G')$$. Otherwise, if $$v \not\in T(G)$$, the degree of $$v$$ is not changed by the multi-flip operation and thus, $$v \in V_0(G')$$. In either case, $$v \in V_0(G')$$ or $$v \in V_1(G')$$, which contradicts our assumption. The symmetric case $$v \in V_1(G)$$ can be proven analogously. □

Definition 2.10 (Multi-shift) Let $$G=(U,V,E) \in \Omega$$.

1. Choose a vertex $$u \in U$$ uniformly at random.

2. Determine the sets of vertices $$V_0$$ and $$V_1$$ such that   \begin{align*} V_0 &= \left\{ v \colon \{u,v\} \not\in E \wedge \delta_G(v) < \bar{{{d}}}(v) \right\}\!,\\ V_1 &= \left\{ v \colon \{u,v\} \in E \wedge \delta_G(v) > {d}(v) \right\}\!. \end{align*} Up to this point, the multi-shift is defined like the multi-flip.

3. Choose a subset $$S \subseteq V_0 \cup V_1$$ of size $$|V_1|$$ uniformly at random.

4. Create the sets of edges $$E_0$$ and $$E_1$$ such that   \begin{align*} E_0 = \left\{ \{u,v\} \in E \colon v \in (V_0 \cup V_1) \setminus S \right\}\!, \hspace{10mm} E_1 = \left\{ \{u,v\} \not\in E \colon v \in S \right\}\!. \end{align*} Remove the edges $$E_0$$ from $$E$$ and add the edges $$E_1$$, thus creating and returning the bipartite graph $$G'=(U,V,E')$$ with $$E' = (E \setminus E_0) \cup E_1$$.

Figure 7 visualizes the multi-shift. In contrast to the multi-flip, the multi-shift operation does not change the degree of $$u$$ since $$S$$ is chosen to have the same cardinality as $$V_1$$: the $$|V_1 \setminus S|$$ edges removed by $$E_0$$ are exactly balanced by the $$|S \cap V_0|$$ edges added by $$E_1$$.
In addition, the upper or lower bound on the degree of a node $$v \in V$$ cannot be violated, by the way we define $$V_0$$ and $$V_1$$. Therefore, $$G' \in \Omega$$. Like the other operations, the multi-shift may create $$G' = G$$, namely when $$S$$ is chosen to be equal to $$V_1$$ in step three. The multi-shift does not yet contain the simple shift as a special case since it only modifies the degrees of nodes from $$V$$. However, we will take care of this when defining the Markov chain.

Fig. 7. Visualization of a multi-shift operation. The numbers in brackets show the relevant upper and lower bounds on the degrees. Transforming the left graph into the right: $$V_0 = \{ v_3, v_4\}$$ (grey, dashed border), $$V_1 = \{ v_1, v_2\}$$ (grey, solid border). We randomly selected $$S = \{v_3, v_4\}$$ of size $$|V_1|$$ from $$V_0 \cup V_1$$.

Lemma 2.8 The transition probability of the multi-shift is symmetric.

Proof. Following from its definition, the probability of transforming a graph $$G$$ into $$G'\neq G$$ via a single multi-shift is   \begin{align*} p_{\text{multi-shift}}(G, G') = m^{-1} \binom{|V_0(G) \cup V_1(G)|}{|V_1(G)|}^{-1}. \end{align*} Since the degree of $$u$$ is not changed by a multi-shift, $$|V_1(G)|$$ is equal to $$|V_1(G')|$$. In addition, an argument similar to that used in the proof of Lemma 2.7 shows $$|V_0(G) \cup V_1(G)| = |V_0(G') \cup V_1(G')|$$. Thus, $$p_{\text{multi-shift}}(G, G') = p_{\text{multi-shift}}(G', G)$$. □
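A single multi-shift step can likewise be sketched in Python. As with any sketch here, the representation (edges as a set of frozensets, degree bounds as dictionaries `d_lo` and `d_hi`) and all names are our own assumptions:

```python
import random

def degree(E, x):
    """Number of edges in E incident to vertex x (edges are frozensets)."""
    return sum(1 for e in E if x in e)

def multi_shift(U, V, E, d_lo, d_hi):
    """One multi-shift step; the degree of the chosen vertex u is preserved."""
    u = random.choice(U)
    V0 = [v for v in V if frozenset((u, v)) not in E and degree(E, v) < d_hi[v]]
    V1 = [v for v in V if frozenset((u, v)) in E and degree(E, v) > d_lo[v]]
    S = random.sample(V0 + V1, len(V1))   # |S| = |V1| keeps deg(u) fixed
    E_new = set(E)
    for v in V0 + V1:
        e = frozenset((u, v))
        if v in S:
            E_new.add(e)       # after the step, u is adjacent to exactly S
        else:
            E_new.discard(e)   # edges towards (V0 + V1) \ S are removed
    return E_new
```

Since `add` on an existing edge and `discard` on an absent one are no-ops, the loop realizes exactly the edge sets $$E_0$$ and $$E_1$$ of the definition.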
2.2 Markov chain

By the way we defined our operations, a single operation affects at most two vertices from $$U$$ and up to $$|V|$$ vertices from $$V$$. Thus, we need at least $$|U|/2$$ operations until each vertex from $$U$$ is affected by an operation. We intend to speed up the Markov chain by occasionally switching the roles of $$U$$ and $$V$$. However, if $$|U| \ll |V|$$, we do not want to switch the roles of $$U$$ and $$V$$ too often since the number of affected edges per operation is greatly reduced in such cases. Thus, we switch the roles of $$U$$ and $$V$$ with probability $$|U|/(|U|+|V|) = m/(m+n)$$. The informed Markov chain can now be stated as follows:

Definition 2.11 (Informed Markov chain) Let $$q \colon \{ \text{trade}, \text{multi-flip}, \text{multi-shift} \} \to [0,1]$$ be an arbitrary probability function that assigns positive probability to each operation. Let $$G=(U,V,E) \in \Omega$$.

1. Select a real number $$x \in [0, 1)$$ uniformly at random. If $$x < m/(m+n)$$, switch the roles of $$U$$ and $$V$$.

2. Randomly select one of the three operations trade, multi-flip and multi-shift with respect to the probability function $$q$$.

3. Apply the selected operation to transform $$G$$ into $$G'$$.

4. Set $$G \gets G'$$ and go to step 1.

Theorem 2.12 The stationary distribution of the informed Markov chain is the uniform distribution on $$\Omega$$.

Proof. We first prove that the informed chain is irreducible. Since each of the three complex operations contains its simple counterpart as a special case, the informed chain contains the transitions of the simple chain. Since the simple chain is irreducible, the informed chain must be irreducible, too. In addition, the informed chain is aperiodic since each operation may create a loop with non-zero probability. Thus, the informed chain possesses a unique stationary distribution.
Since the transition probabilities of each operation type are symmetric, it follows from the fundamental theorem of Markov chains that the stationary distribution is the uniform distribution. □

2.3 Dynamic adjustment of probability

The probability function $$q$$ for selecting an operation type has a large effect on the Markov chain's speed of convergence. A simple implementation is to select each operation with equal probability. However, this is obviously not a good choice in the special case of $$\mathbf{{r}} = \bar{{{\mathbf{r}}}}$$ and $$\mathbf{{c}} = \bar{{{\mathbf{c}}}}$$, since all flip and shift operations must then result in a loop transition and thus at least 2/3 of all transitions will be loops. It is not immediately clear how to choose $$q$$ in a better way, since the applicability of the operations heavily depends on the given integer vectors. Thus, we implemented a strategy that initially selects each type of operation with equal probability and dynamically adjusts $$q$$ during the simulation of the Markov chain. For this purpose, we keep track of the number of simulated transitions of each type and the number of non-loop transitions of each type. Every 100 steps, we rebalance the probability function such that each type of operation gains a probability that is proportional to its success rate. Due to the dynamic adjustment of $$q$$, less time will be spent on operations that tend to have a large loop probability. To ensure positive probability for each type of operation, we slightly dampen the effect of the rebalancing by incorporating previous values.
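One such dampened rebalancing step might look as follows in Python (the function name and argument layout are illustrative only; the counts are cumulative over the simulation so far):

```python
def rebalance(q, non_loops, totals):
    """Update the selection probabilities q from cumulative counts.

    non_loops[op] / totals[op] is the success rate of operation type op;
    the new q averages the previous q with the normalized success rates,
    which dampens the adjustment and keeps every probability positive.
    """
    ops = list(q)
    if any(totals[op] == 0 for op in ops):
        return dict(q)                     # avoid division by zero
    alpha = {op: non_loops[op] / totals[op] for op in ops}
    z = sum(alpha.values())
    if z == 0:
        return dict(q)                     # all rates zero: keep previous q
    return {op: (q[op] + alpha[op] / z) / 2 for op in ops}
```

Because both the previous probabilities and the normalized rates sum to one, the updated probabilities again sum to one.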
To be precise, we define $$s^{(t)}_{\text{switch}}$$, $$s^{(t)}_{\text{flip}}$$, $$s^{(t)}_{\text{shift}}$$ to be the number of non-loop transitions of each operation type within the first $$t$$ steps of the simulation and $$n^{(t)}_{\text{switch}}$$, $$n^{(t)}_{\text{flip}}$$, $$n^{(t)}_{\text{shift}}$$ to be the total number of simulated operations of each type within the first $$t$$ steps (including loop transitions). We define the success rate $$\alpha^{(t)}$$ of each operation type after $$t$$ steps to be   \begin{align*} \alpha^{(t)}_{\text{switch}} = s^{(t)}_{\text{switch}} / n^{(t)}_{\text{switch}},\hspace{12mm} \alpha^{(t)}_{\text{flip}} = s^{(t)}_{\text{flip}} / n^{(t)}_{\text{flip}},\hspace{12mm} \alpha^{(t)}_{\text{shift}} = s^{(t)}_{\text{shift}} / n^{(t)}_{\text{shift}} \end{align*} and define $$z^{(t)} = \alpha^{(t)}_{\text{switch}} + \alpha^{(t)}_{\text{flip}} + \alpha^{(t)}_{\text{shift}}$$. The probability function $$q$$ at step $$t$$ is defined as   \[ q_{\text{switch}}^{(t)} = \begin{cases} 1/3, & \text{ if } t = 0,\\ q_{\text{switch}}^{(t-1)}, & \text{ if } (t \bmod 100) \not= 0,\\ \left( q_{\text{switch}}^{(t-100)} + \alpha^{(t)}_{\text{switch}} / z^{(t)} \right) / 2, & \text{ if } t > 0 \text{ and } (t \bmod 100) = 0. \end{cases} \] The probabilities of the flip and shift operations, as well as of the more complex operations, are defined analogously. (In the very unlikely case that these calculations require a division by zero, we simply take over the previous values $$q^{(t-100)}$$.) In the following discussion, we will call a Markov chain static when $$q$$ is fixed to 1/3 for each type of operation and dynamic when we use dynamic adjustment of the probability function.

3. Experimental results and discussion

In this section, we present an experimental evaluation of the Markov chains presented in this article.
Whereas the first experiment is designed to gain insights into structural properties of the Markov chains, the second experiment is devoted to an empirical evaluation of the number of steps that is sufficient to produce random samples. Finally, we come back to Darwin's finches and demonstrate how our sampling methods can be used to analyse the structural properties of partially observed networks.

3.1 State graph analysis

In the first experiment, we analyse properties of our Markov chains. When the number of states $$|\Omega|$$ is not too large, we can completely enumerate the set $$\Omega$$ to construct the state graph of the Markov chains. In this experiment, we considered classes of integer vectors whose members can be scaled to different sizes. Consider, for example, the class of integer vectors   \begin{align*} \mathbf{{r}} &= (n-1, n-2, 3) & \mathbf{{c}}& = (2,2, \ldots, 2)\\ \bar{{{\mathbf{r}}}} &= (n, n-1, 4) & \bar{{{\mathbf{c}}}}& = (\underbrace{3,3, \ldots, 3}_{n \text{ times}}), \end{align*} where the length $$n \in \mathbb{N}$$ of the vectors $$\mathbf{{c}}$$ and $$\bar{{{\mathbf{c}}}}$$ can be scaled to produce different members of the class. We used the software library marathon [6] to construct the state graph of the simple and informed Markov chain for $$5 \leq n < 35$$. The total mixing time $$\tau(\varepsilon)$$ gives a formal measure of how many steps are needed to come sufficiently close to the stationary distribution of a Markov chain. However, computing this quantity becomes infeasible even when $$|\Omega|$$ is moderately large. Hence, we computed the upper spectral bound [24] on the total mixing time   \begin{align} \tau(\varepsilon) \leq (1-\lambda_{\max})^{-1} \cdot \left( \ln(\varepsilon^{-1}) + \ln(|\Omega|) \right)\!, \end{align} (3.1) where $$\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_{|\Omega|}$$ are the real eigenvalues of a Markov chain's transition matrix and $$\lambda_{\max} = \max(|\lambda_2|, |\lambda_{|\Omega|}|)$$.
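Given the full transition matrix of a chain with symmetric transition probabilities, the bound (3.1) can be evaluated directly; the following NumPy sketch (function name ours) assumes the matrix fits in memory as a dense array:

```python
import numpy as np

def spectral_bound(P, eps):
    """Evaluate the upper bound (3.1) on the total mixing time.

    P must be the (symmetric) transition matrix of the chain, so all
    eigenvalues are real and eigvalsh applies.
    """
    lam = np.sort(np.linalg.eigvalsh(P))        # eigenvalues, ascending
    lam_max = max(abs(lam[-2]), abs(lam[0]))    # max(|lambda_2|, |lambda_min|)
    n_states = P.shape[0]
    return (np.log(1.0 / eps) + np.log(n_states)) / (1.0 - lam_max)
```

For a chain with a large loop probability, $$\lambda_{\max}$$ is close to one and the bound blows up accordingly.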
Figure 8 shows the growth of the spectral bound for the state graphs investigated in this experiment. The data suggest that the upper spectral bound of the simple chain grows more than one order of magnitude faster than that of the informed chain. This may be explained by considering the expected probability of loop transitions. We observe that the loop probability of the simple chain increases with growing $$n$$, whereas it decreases in the case of the informed chain. By counting the number of non-zero transitions, we confirm that the state graph of the informed chain, as intended, is much denser than that of the simple chain. For other classes of integer vectors, we made similar observations.

Fig. 8. Upper spectral bound (left) of the state graphs for a special class of integer vectors ($$\varepsilon=0.01$$). Right: Expected loop probability calculated from the state graphs.

3.2 Convergence of sample means

As the number of states grows fast, the evaluation of inequality (3.1) becomes infeasible when the input instances grow larger. For this reason, we analysed the efficiency of the Markov chains by a different approach that is based on simulation.

Methodology By simulating $$t$$ steps of a Markov chain and returning the final state, we produce a random sample $$x^{(t)} \in \Omega$$. Let $$p_{s}^{(t)}(x)$$ be the probability of returning state $$x \in \Omega$$ after exactly $$t$$ simulation steps, while starting at state $$s \in \Omega$$. If we repeat this simulation $$N$$ times (always starting from $$s\in \Omega$$), we produce a series of independent and identically distributed random samples $$x^{(t)}_1, x^{(t)}_2, \ldots, x^{(t)}_N$$.
By evaluating a target function $$f \colon \Omega \to \mathbb{R}$$ on each random sample and calculating the sample mean   \[ \bar{\mu}_{s}^{(t)}[\,f(X)] := \frac{1}{N} \sum_{i=1}^{N} f \left(x^{(t)}_i \right)\!, \] we approximate the expected value   \[ E_{p^{(t)}_s}[\,f(X)] := \sum_{x \in \Omega} f(x) p_s^{(t)}(x), \] with respect to the probability distribution $$p^{(t)}_s(x)$$. By Theorems 2.7 and 2.12 and by the law of large numbers, the sample mean $$\bar{{{\mu}}}_{s}^{(t)}[\,f(X)]$$ approaches the expected value   \[ E[\,f(X)] := |\Omega|^{-1} \sum_{x \in \Omega} f(x), \] as $$t,N \to \infty$$. By letting $$t$$ grow and observing the sample means $$\bar{{{\mu}}}_{s}^{(t)}[\,f(X)]$$, we can detect the number of steps after which the sample means stabilize. This number can be used as an indicator for the efficiency of a Markov chain.

Experimental setup Starting from a fixed bipartite graph $$s \in \Omega$$, we simulated $$100000$$ random transitions of both the simple and the informed Markov chain. Every 10 steps, we interrupted the simulation to evaluate the target function on the current state of the Markov chain. This process was repeated $$N=1000$$ times to calculate the sample means $$\bar{{{\mu}}}_{s}^{(t)}[\,f(X)]$$ for $$t=10,20, \ldots, 100000$$.

Target function In this experiment, we tested several target functions $$f \colon \Omega \to \mathbb{R}$$. In ecology, the structure of bipartite networks can be quantified by several metrics, of which we considered the following: the $$\bar{S}^2$$ statistic introduced by Roberts and Stone [25]; the nested subset statistic $$S_{\text{nest}}$$ introduced by Patterson and Atmar [26]; the nestedness measure based on overlap and decreasing fill (NODF) [27]; and the spectral radius of the bipartite graph's adjacency matrix [28].
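As an illustration of the last metric, the spectral radius can be computed from the $$m \times n$$ biadjacency matrix; this is a small NumPy sketch of our own, not the implementation used in [28]:

```python
import numpy as np

def spectral_radius(B):
    """Spectral radius of the (m+n) x (m+n) adjacency matrix
    [[0, B], [B^T, 0]] of a bipartite graph with biadjacency matrix B."""
    m, n = B.shape
    A = np.zeros((m + n, m + n))
    A[:m, m:] = B
    A[m:, :m] = B.T
    return float(max(abs(np.linalg.eigvalsh(A))))  # A is symmetric
```

Equivalently, this value is the largest singular value of $$B$$.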
Initial state To clearly observe the stabilization of the sample means, the value $$f(s)$$ at the initial state $$s \in \Omega$$ should differ considerably from the expected value $$E[\,f(X)]$$. As many metrics are known to be correlated with the edge total, we intentionally constructed the initial state $$s$$ to possess as many edges as allowed by the upper bounds $$\bar{{{\mathbf{r}}}}$$ and $$\bar{{{\mathbf{c}}}}$$. This can be achieved by applying an appropriate realization algorithm [21].

Scale-free bipartite graphs Many real-world networks are known to have a scale-free degree distribution. For this reason, we used a Barabási–Albert model to create the degree sequence of a random, scale-free bipartite graph $$G=(U,V,E)$$. Starting with a trivial graph consisting of two vertices connected by a single edge, we iteratively added new vertices to $$U$$ and $$V$$. Each newly added vertex is connected to existing vertices with a probability proportional to their current degrees. Thus, vertices that already have a high degree tend to gain more neighbours. After 100 vertices have been included in each of $$U$$ and $$V$$, we stop the process and record the degrees of the vertices in $$U$$ and $$V$$ as integer vectors $$\mathbf{{r}}$$ and $$\mathbf{{c}}$$. To simulate the imperfection of the data collection process, we added a constant $$d$$ to each integer to create the vectors of upper bounds $$\bar{{{\mathbf{r}}}}$$ and $$\bar{{{\mathbf{c}}}}$$. Based on these vectors, we estimated the efficiency of our Markov chains by observing the sample means of the $$\bar{S}^2$$ statistic. The first four diagrams of Fig. 9 show the results for $$d=1$$ and $$d=5$$. Considering the case of $$d=1$$, we observe that the sample mean of the informed chain stabilizes after very few steps. In contrast, the simple chain needs several thousand iterations to stabilize.
By counting the number of loop transitions, we observed that about 94% of all transitions of the simple chain were loops, compared with only 27% of the informed chain's transitions. Using dynamic probability adjustment, the simple chain starts to deviate from its static counterpart after the first rebalancing at $$t=100$$. During the simulation, 11% of all operations were switches, 76% were flips and 13% were shifts. Overall, the dynamic rebalancing reduced the simple chain's number of loops by 4%. In contrast, the dynamic adjustment could not improve the informed chain, as the necessary number of steps is already very small. As this observation applies to each of the following experiments, we do not consider the dynamic adjustment for the informed chain from now on. Considering the case of $$d=5$$, we observe that the sample means produced by the simple chain converge significantly faster than before since the loop rate of the Markov chain drops to 85%.

Fig. 9. Sample means of the $$\bar{S}^2$$ metric for scale-free and near-regular integer vectors.

Although we observed that the informed chain is faster in terms of iterations, this does not necessarily imply that it is also faster in practice since the running time of a single step is larger. However, our experiments showed that the fast convergence of the chain justifies the larger running time per step. We conclude that the informed chain is well suited for the randomization of scale-free graphs, especially if the gap between the degree bounds is small.

Near-regular bipartite graphs Next, we experimented with the degrees of near-regular bipartite graphs. For this purpose, we defined two integer vectors $$\mathbf{{r}}$$ and $$\mathbf{{c}}$$ of length $$m=n=100$$ with $${r}_i = {c}_i = 50$$ for each $$i = 1, \ldots, 100$$.
As before, we added a constant $$d$$ to each integer to produce the vectors of upper bounds $$\bar{{{\mathbf{r}}}}$$ and $$\bar{{{\mathbf{c}}}}$$. The last four diagrams in Fig. 9 show the sample means for $$d=1$$ and $$d=5$$. Considering $$d=1$$, we observe that the informed chain is faster in terms of iterations but only slightly more efficient than the simple chain in practice. This can be explained by the simple chain's rate of loop transitions, which is 82% and thus comparatively small. In comparison, 24% of the informed chain's transitions were loops. With dynamic probability adjustment, the simple chain selected switch operations with a probability of 25%, flip operations with 50% and shift operations with 25%, thereby effectively reducing the number of loops by 3%. Considering $$d=5$$, we found that the simple chain is even more efficient than before, as its loop rate dropped to 59%. Although only 7% of the informed chain's transitions were loops, the simple chain is superior in practice. We conclude that the simple chain with dynamic adjustment is best suited for the randomization of near-regular bipartite graphs.

Sensitivity to the target function Replacing the $$\bar{S}^2$$ metric by the other mentioned target functions confirmed our previous observations. Although the sample means stabilized after a different number of steps, the qualitative results were mostly unchanged. As an example, Fig. 10 shows the sample means of the spectral radius metric. We conclude that our observations were not a side-effect of the metric used for convergence detection.

Fig. 10. Sample means of the spectral radius metric for scale-free and near-regular integer vectors.
3.3 Sampling application

In a final experiment, we demonstrate how our sampling methods can be applied to study properties of partially observed networks. In this experiment, we assumed that the observed ‘Darwin's Finches' network derived from Table 1 is part of some larger ‘true' network whose exact shape is unknown due to missing observations. To model this kind of uncertainty, we assume that each island of the Galápagos archipelago may be inhabited by up to one finch species whose presence was not observed during the data collection. Symmetrically, we allow each finch species to be present at one additional island. Mathematically speaking, the true ‘Darwin's Finches' network is assumed to be an unknown element of a set $$S$$ that contains all bipartite graphs $$G=(U,V,E)$$, (a) with $$|U|=13$$ and $$|V|=17$$ nodes, (b) whose degrees may exceed those of the observed network by one and (c) which contain all edges of the observed network. In this experiment, we approximate the expected structure of the true network by uniformly sampling from the set $$S$$.

Rejection sampling Our methods cannot be applied directly to sample uniformly from $$S$$ as they do not incorporate requirement (c). However, as $$S$$ is clearly a subset of $$\Omega(\mathbf{{r}}, \bar{{{\mathbf{r}}}}, \mathbf{{c}}, \bar{{{\mathbf{c}}}})$$ with   \begin{align*} \mathbf{{r}} &= (14, 13, 14, 10, 12, 2, 10, 1, 10, 11, 6, 2, 17),\\ \bar{{{\mathbf{r}}}} &= (15, 14, 15, 11, 13, 3, 11, 2, 11, 12, 7, 3, 18),\\ \mathbf{{c}} &= (4, 4, 11, 10, 10, 8, 9, 10, 8, 9, 3, 10, 4, 7, 9, 3, 3),\\ \bar{{{\mathbf{c}}}} &= (5, 5, 12, 11, 11, 9, 10, 11, 9, 10, 4, 11, 5, 8, 10, 4, 4), \end{align*} we can apply the idea of rejection sampling. By randomly sampling from $$\Omega(\mathbf{{r}}, \bar{{{\mathbf{r}}}}, \mathbf{{c}}, \bar{{{\mathbf{c}}}})$$ and rejecting all samples that miss an observed edge, we ensure that the non-rejected samples are uniformly distributed in $$S$$.
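Generically, this rejection strategy can be sketched as follows; `sample_omega` stands for any uniform sampler on the superset (for instance, a long run of one of the Markov chains), and all names are our own:

```python
import random

def rejection_sample(sample_omega, observed_edges):
    """Draw from the superset until a sample contains every observed edge;
    conditioning a uniform sampler this way yields a uniform sample on S."""
    while True:
        G = sample_omega()
        if observed_edges <= G:        # subset test: all observed edges present
            return G

# Toy stand-in for a real sampler: uniform subsets of a two-edge universe.
universe = [frozenset(('u1', 'v1')), frozenset(('u1', 'v2'))]
def toy_sampler():
    return {e for e in universe if random.random() < 0.5}
```

The expected number of draws per accepted sample is the ratio of the two set sizes, which motivates the problem transformation below.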
As we draw uniformly, we need to draw about $$|\Omega(\mathbf{{r}}, \bar{{{\mathbf{r}}}}, \mathbf{{c}}, \bar{{{\mathbf{c}}}})| / |S|$$ samples from $$\Omega(\mathbf{{r}}, \bar{{{\mathbf{r}}}}, \mathbf{{c}}, \bar{{{\mathbf{c}}}})$$ to find a single sample from $$S$$. Unfortunately, as $$S$$ is just a tiny subset of the much larger $$\Omega(\mathbf{{r}}, \bar{{{\mathbf{r}}}}, \mathbf{{c}}, \bar{{{\mathbf{c}}}})$$, this strategy is too inefficient to be of practical use. For that reason, we first transform the sampling problem into an equivalent problem that can be solved more efficiently.

Problem transformation By definition, a graph $$G \in S$$ contains all edges of the observed network. By removing these edges from $$G$$, we create a graph $$G'$$ with reduced degrees. In particular, as each node of $$G$$ may possess at most one additional edge, the degrees of $$G'$$ will be either zero or one. Thus, we may replace the original sampling problem by the problem of uniform sampling from a set $$T$$ that contains all bipartite graphs $$G=(U,V,E)$$, (a) with $$|U|=13$$ and $$|V|=17$$ nodes, (b) whose degrees are either zero or one and (c) which do not contain edges that are present in the observed network. In Fig. 11, we demonstrate the problem transformation with a small example. It is easy to see that the reduced sampling problem is equivalent to the original one, as we can easily transform an element of $$T$$ into an element of $$S$$ by adding the observed edges. Vice versa, an element of $$S$$ can be converted into an element of $$T$$ by removing all observed edges. Thus, there is a one-to-one correspondence between $$S$$ and $$T$$, and consequently $$|S| = |T|$$.

Fig. 11. Illustration of the problem transformation. Let the network on the left be our observed network. We assume that the unknown ‘true' network may contain up to one additional edge per node, thus its degrees may exceed the observed ones by one.
By removing the observed edges, we create a reduced network whose degrees may be either zero or one.

Refined sampling strategy As $$T$$ is a subset of the set $$\Omega({\mathbf{p}}, \bar{{{\mathbf{p}}}}, {\mathbf{q}}, \bar{{{\mathbf{q}}}})$$ with   \[ {\mathbf{p}} = [0]^{13}, \quad \bar{{{\mathbf{p}}}} = [1]^{13}, \quad {\mathbf{q}} = [0]^{17}, \quad \bar{{{\mathbf{q}}}} = [1]^{17}, \] we may produce uniform samples from $$T$$ by uniformly sampling from $$\Omega({\mathbf{p}}, \bar{{{\mathbf{p}}}}, {\mathbf{q}}, \bar{{{\mathbf{q}}}})$$ and rejecting all samples that contain an observed edge. After adding the observed edges, each non-rejected sample is uniformly distributed in $$S$$. To demonstrate that this refined strategy is superior to the basic rejection sampling strategy, we calculated the sizes of the sets $$\Omega(\mathbf{{r}}, \bar{{{\mathbf{r}}}}, \mathbf{{c}}, \bar{{{\mathbf{c}}}})$$ and $$\Omega({\mathbf{p}}, \bar{{{\mathbf{p}}}}, {\mathbf{q}}, \bar{{{\mathbf{q}}}})$$ by using a counting approach similar to that of Miller and Harrison [17]. In doing so, we found $$|\Omega(\mathbf{{r}}, \bar{{{\mathbf{r}}}}, \mathbf{{c}}, \bar{{{\mathbf{c}}}})| \approx 3.3 \times 10^{26}$$ and $$|\Omega({\mathbf{p}}, \bar{{{\mathbf{p}}}}, {\mathbf{q}}, \bar{{{\mathbf{q}}}})| \approx 1.2 \times 10^{14}$$.
As $$\Omega({\mathbf{p}}, \bar{{{\mathbf{p}}}}, {\mathbf{q}}, \bar{{{\mathbf{q}}}})$$ is several orders of magnitude smaller than $$\Omega(\mathbf{{r}}, \bar{{{\mathbf{r}}}}, \mathbf{{c}}, \bar{{{\mathbf{c}}}})$$, the refined strategy outperforms the basic strategy by a factor of   \[ \frac{|\Omega(\mathbf{{r}}, \bar{{{\mathbf{r}}}}, \mathbf{{c}}, \bar{{{\mathbf{c}}}})| \cdot |T|}{|\Omega({\mathbf{p}}, \bar{{{\mathbf{p}}}}, {\mathbf{q}}, \bar{{{\mathbf{q}}}})|\cdot |S|} = \frac{|\Omega(\mathbf{{r}}, \bar{{{\mathbf{r}}}}, \mathbf{{c}}, \bar{{{\mathbf{c}}}})|}{|\Omega({\mathbf{p}}, \bar{{{\mathbf{p}}}}, {\mathbf{q}}, \bar{{{\mathbf{q}}}})|} \approx \frac{3.3\cdot 10^{26}}{1.2 \cdot 10^{14}} \approx 2.8 \cdot 10^{12}. \]

Experimental setup To approximate the expected structure of the true network, we generated one million uniform samples from the set $$S$$ and constructed the sampling histograms of the four metrics introduced above. To create a reference against which the structure of the true network can be compared, we additionally calculated the corresponding sampling histograms for the set $$\Omega({\mathbf{r}}, {\mathbf{r}}, {\mathbf{c}}, {\mathbf{c}})$$ of graphs with the same degrees as the observed ‘Darwin's Finches' network and for the set $$\Omega({\mathbf{r}}, \bar{{{\mathbf{r}}}}, {\mathbf{c}}, \bar{{{\mathbf{c}}}})$$ of graphs whose degrees are similar to those of the true network but which need not contain the observed edges.

Results Figure 12 shows the calculated sampling histograms. We observe that the expected structural properties of the true network clearly exceed those of the observed network with respect to $$\bar{S}^2$$, $$S_{\text{nest}}$$ and the spectral radius. In contrast, the observed and the true network are structured very similarly with respect to NODF. In addition, we estimated the expected number of edges in the true network to be 130.5.
In contrast, the number of edges in the observed network is 122.

Fig. 12. Sampling histograms obtained by uniform sampling from three sets of networks. Left column: the set $$S$$ of potential ‘true' networks. Middle column: the set $$\Omega(\mathbf{{r}}, \bar{{{\mathbf{r}}}}, \mathbf{{c}}, \bar{{{\mathbf{c}}}})$$ of networks that are not required to contain the observed edges. Right column: the set of networks with the same degrees as ‘Darwin's Finches'. The vertical solid lines mark the structural properties of the observed ‘Darwin's Finches' network. The dashed lines mark the estimated expected values of our assumed ‘true' network.

To assess the influence of the observed edges, we compared the structure of the true network with the structure that would be expected by chance if the observed edges were not required to be included in the true network (see middle column). In doing so, we found that the presence of the observed edges forces the $$\bar{{{S}}}^2$$ and the spectral radius of the true network to be significantly larger than would be expected by chance alone. In contrast, the observed edges barely affect the $$S_{\text{nest}}$$ and NODF metrics.
Interestingly, the presence of the observed edges forces the true network to be denser than would be expected by chance, as we approximated the expected number of edges of networks in $$\Omega(\mathbf{{r}}, \bar{{{\mathbf{r}}}}, \mathbf{{c}}, \bar{{{\mathbf{c}}}})$$ to be 128.5. We conclude that the structural properties of our imaginary true network significantly differ from the structure that would be expected by chance if the observed edges were not required to exist. In an analogous experiment, we evaluated the influence of the observed edges by comparing the structure of the observed network with that of null-networks with identical degrees that need not contain the observed edges (right column). In doing so, we found that the $$\bar{S}^2$$, NODF and spectral radius metrics of the observed ‘Darwin's Finches' network notably differ from those of most random null-networks, whereas the values for $$S_{\text{nest}}$$ are very similar. Summarizing these observations, we found that the structure of our imaginary true network is affected by the existence of the observed edges and may change when additional edges are incorporated.

4. Conclusion

We introduced a Markov chain whose state space is the set of all bipartite graphs whose degrees lie in prescribed intervals. Afterwards, we introduced a second Markov chain that is designed to affect multiple edges per step and to avoid loop transitions. We showed that both Markov chains can be used for uniform sampling by proving that their stationary distribution is uniform. Experimentally, we showed that the second Markov chain converges faster to its stationary distribution. However, its practical applicability depends on the distribution of the given degrees. We found that the informed chain is more efficient for scale-free degree distributions, whereas the simple chain is superior when the degrees are near-regular.
Since many real-world networks have a scale-free degree distribution, the informed chain is well suited for many practical applications. Furthermore, we showed how the dynamic adjustment of the probability function $$q$$ can speed up the simple chain, whereas it has almost no effect on the informed chain since the loop probability is already very small and equally distributed between the three types of operations. In an application, we showed how our methods can be used to study properties of partially observed networks and found that the presence of unobserved edges may influence the expected structure of the network. In the future, we plan to extend our sampling algorithms to wider areas by enabling uniform sampling of non-bipartite graphs, directed graphs and multi-graphs.

Acknowledgements We thank Annabell Berger for her valuable suggestions and critical remarks.

References

1. Chen Y., Diaconis P., Holmes S. P. & Liu J. S. (2005) Sequential Monte Carlo methods for statistical analysis of tables. J. Amer. Statist. Assoc., 100, 109–120.

2. Brandes U. & Erlebach T. (eds) (2005) Network Analysis: Methodological Foundations, LNCS, vol. 3418. Berlin, Heidelberg, New York: Springer.

3. Connor E. F. & Simberloff D. (1979) The assembly of species communities: chance or competition? Ecology, 60, 1132–1140.

4. Gotelli N. & Graves G. (1996) Null Models in Ecology. Washington, USA: Smithsonian Institution Press.

5. Gotelli N. J. (2000) Null model analysis of species co-occurrence patterns. Ecology, 81, 2606–2621.

6. Rechner S. & Berger A. (2016) Marathon: an open source software library for the analysis of Markov-Chain Monte Carlo algorithms. PLoS One, 11, e0147935.

7. Erdős P., Miklós I. & Toroczkai Z.
( 2017) New Classes of Degree Sequences with Fast Mixing Swap Markov Chain Sampling. Combinatorics, Probability and Computing , 1– 22, https://doi.org/10.1017/S0963548317000499. 8. Greenhill C. ( 2014) The switch Markov chain for sampling irregular graphs. Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms  ( Indyk P. ed.), Philadelphia, USA: Society for Industrial and Applied Mathematics, pp. 1564– 1572. Google Scholar CrossRef Search ADS   9. Kannan R., Tetali P. & Vempala S. ( 1999) Simple Markov-chain algorithms for generating bipartite graphs and tournaments. Random Struct. Algorithms , 14, 293– 308. Google Scholar CrossRef Search ADS   10. Strona G., Nappo D., Boccacci F., Fattorini S. & San-Miguel-Ayanz J. ( 2014) A fast and unbiased procedure to randomize ecological binary matrices with fixed row and column totals. Nat. Commun. , 5, article no. 4114. 11. Verhelst N. D. ( 2008) An efficient MCMC algorithm to sample binary matrices with fixed marginals. Psychometrika , 73, 705. Google Scholar CrossRef Search ADS   12. Carstens C. J., Berger A. & Strona G. ( 2016) Curveball: a new generation of sampling algorithms for graphs with fixed degree sequence. arXiv preprint arXiv:1609.05137v2 ( submitted on 22 December 2016). 13. Bezáková I., Bhatnagar N. & Vigoda E. ( 2007) Sampling binary contingency tables with a greedy start. Random Struct. Algorithms , 30, 168– 205. Google Scholar CrossRef Search ADS   14. Harrison M. T. & Miller J. W. ( 2013). Importance sampling for weighted binary random matrices with specified margins. arXiv preprint arXiv:1301.3928v1 ( submitted on 16 January 2013). 15. Holmes R. & Jones L. ( 1996) On uniform generation of two-way tables with fixed margins and the conditional volume test of Diaconis and Efron. Ann. Statist. , 24, 64– 68. Google Scholar CrossRef Search ADS   16. Bezáková I., Sinclair A., Štefankovič D. & Vigoda E. 
( 2012) Negative examples for sequential importance sampling of binary contingency tables. Algorithmica , 64, 606– 620. Google Scholar CrossRef Search ADS   17. Miller J. W. & Harrison M. T. ( 2013) Exact sampling and counting for fixed-margin matrices. Ann. Statist. , 41, 1569– 1592. Google Scholar CrossRef Search ADS   18. Schluter D. ( 1984) A variance test for detecting species associations, with some example applications. Ecology , 65, 998– 1005. Google Scholar CrossRef Search ADS   19. Gilpin M. E. & Diamond J. M. ( 1982) Factors contributing to non-randomness in species co-occurrences on islands. Oecologia , 52, 75– 84. Google Scholar CrossRef Search ADS PubMed  20. Ulrich W. & Gotelli N. J. ( 2012) A null model algorithm for presence–absence matrices based on proportional resampling. Ecol. Model. , 244, 20– 27. Google Scholar CrossRef Search ADS   21. Rechner S. ( 2017) An optimal realization algorithm for bipartite graphs with degrees in prescribed intervals. arXiv preprint arXiv:1708.05520v1 ( submitted on 18 August 2017). 22. Levin D. A., Peres Y. & Wilmer E. L. ( 2009) Markov Chains and Mixing TImes . Providence, Rhode Island, USA: American Mathematical Society. 23. Carstens C. J. ( 2015) Proof of uniform sampling of binary matrices with fixed row sums and column sums for the fast curveball algorithm. Phys. Rev. E , 91, 042812. Google Scholar CrossRef Search ADS   24. Diaconis P. & Stroock D. ( 1991) Geometric bounds for eigenvalues of Markov chains. Ann. Appl. Probab. , 1, 36– 61. Google Scholar CrossRef Search ADS   25. Roberts A. & Stone L. ( 1990) Island-sharing by archipelago species. Oecologia , 83, 560– 567. Google Scholar CrossRef Search ADS PubMed  26. Patterson B. D. & Atmar W. ( 1986) Nested subsets and the structure of insular mammalian faunas and archipelagos. Biol. J. Linnean Soc. , 28, 65– 82. Google Scholar CrossRef Search ADS   27. Almeida-Neto M., Guimarães P., Guimarães P. R., Loyola R. D. & Ulrich W. 
( 2008) A consistent metric for nestedness analysis in ecological systems: reconciling concept and measurement. Oikos , 117, 1227– 1239. Google Scholar CrossRef Search ADS   28. Staniczenko P. P., Kopp J. C. & Allesina S. ( 2013) The ghost of nestedness in ecological networks. Nat. Commun. , 4, 1391. Google Scholar CrossRef Search ADS PubMed  © The authors 2017. Published by Oxford University Press. All rights reserved.
Journal of Complex Networks, Oxford University Press. Published: 22 December 2017.