Journal of Complex Networks, Volume 6 (4) – Aug 1, 2018


- Publisher
- Oxford University Press
- Copyright
- © The authors 2017. Published by Oxford University Press. All rights reserved.
- ISSN
- 2051-1310
- eISSN
- 2051-1329
- DOI
- 10.1093/comnet/cnx048

Abstract

Temporal networks are increasingly being used to model the interactions of complex systems. Most studies require the temporal aggregation of edges (or events) into discrete time steps to perform analysis. In this article, we describe a static, behavioural representation of a temporal network, the temporal event graph (TEG). The TEG describes the temporal network in terms of both inter-event time and two-event temporal motifs. By considering the distributions of these quantities in unison, we provide a new method to characterize the behaviour of individuals and collectives in temporal networks as well as providing a natural decomposition of the network. We illustrate the utility of the TEG by providing examples on both synthetic and real temporal networks.

1. Introduction

Temporal networks have seen increased use in the study of dynamics in complex systems. This is due partly to the increase in available timestamped data from sources such as Twitter [1] and proximity sensors [2], among others, but also due to the recognition that the temporal patterns of complex systems have a major influence on the proliferation of processes upon them [3]. The role that temporal patterns of connectivity play in dynamics is not fully understood, the clearest example being whether spreading mechanisms are helped or hindered by temporal connections [4–7]. What is clear, however, is that the inclusion of temporal information uncovers patterns not observable from the study of a static network alone [8, 9], making it vital to be able to characterize and understand the structure of temporal networks.

There are many ways to represent a temporally evolving network [10–12], the simplest being a sequence of timestamped edges $$(u,v,t)$$ which may also last for a duration. This is referred to as an event or contact sequence [13]. A number of representations aim to describe a temporal network as a single static network (with varying degrees of aggregation).
We briefly review some of these constructions before introducing the temporal event graph (TEG).

A trivial way to represent a temporal network is to consider an aggregation in the temporal dimension, creating a time-aggregated graph. Two nodes share an edge in the time-aggregated graph if they share an edge at any point in time. Alternatively, the time-aggregated graph can be a weighted graph where each edge is weighted by how many times it appears over time. This representation can be useful to understand the density of the network or highly active edges, but in aggregating the temporal dimension we lose information about causality and temporal walks.

In reachability graphs [14], there is a directed edge from node $$u$$ to node $$v$$ if there exists a temporal path from $$u$$ to $$v$$. This captures whether or not one node can affect another and is useful for understanding how well processes can spread on the network. This can also be generalized to ask whether a temporal path exists starting at time $$t$$ but taking no longer than a fixed duration $$\delta$$ [15]. The reachability graph tells us whether or not a temporal path exists, but it does not describe the path itself (length, route taken, etc.).

More recent approaches aim to keep all the information of the temporal network. This is done by considering nodes at different times separately (sometimes referred to as time-unfolded nodes [3]), in what is known as a time-node graph [16]. In a time-node graph, a temporal event $$(u,v,t)$$ is represented by an edge from a node $$u_t$$ to a node $$v_{t+1}$$, or to node $$v_{t+\delta}$$ if the event has duration $$\delta$$. This type of representation can be useful as static methods can be applied which, due to the structure of the network, automatically include the temporal dimension. Results for individual nodes can then be collated by mapping (or ‘refolding’) the timestamped nodes back to the original nodes.
This type of network can also be conveniently expressed as an adjacency tensor or multilayer network when time is discretized [17]. Like the reachability graph, this representation preserves temporal paths, and is in fact lossless: the representation uniquely defines a temporal network.

The second class of temporal network representations considers the dual of the network, where the edges of the original network are now the vertices of the dual. The transmission graph [18] connects edges of the network when a temporal event occurs between two connected edges within a short time window. For example, a sequence of events $$(u,v,t_1), (v,w,t_2)$$ would create a link between vertices $$(u,v)$$ and $$(v,w)$$ provided $$t_2-t_1$$ was sufficiently small. These links between vertices can be cumulative, or persist only for a set time. Closely related to the transmission graph are the second-order time-aggregated networks of [3]. These correspond to the final state of the cumulative transmission graph. The link weights are the number of times those edges have been involved in a two-step path. In a similar vein, the second-order memory networks of [19] consider the probability of the two-step path given that the first step has occurred, that is, the probability of edge $$(v,w)$$ occurring after an observation of edge $$(u,v)$$. Unlike the time-node graphs, these representations are lossy. We know that paths have occurred along certain edges but we do not know exactly when.

In this article, we introduce the temporal event graph (TEG), a static representation of a temporal network. The TEG combines approaches from time-unfolded nodes and the dual representation of transmission graphs. In the TEG, the vertices are individual timestamped events $$(u,v,t)$$ which are connected if they occur close in time, and share one or more nodes. This differs from the approaches of transmission graphs and memory networks where only edges $$(u,v)$$ between nodes act as vertices.
As this is a network of temporal events, we can consider the relationship between any two events, namely the inter-event time (IET) and the two-event temporal motif they form [13]. The analysis of temporal motifs has previously uncovered the behaviour of individuals when applied to a number of different temporal networks [9, 20–22]. Similarly, the IETs of temporal networks have attracted a large number of studies [23–25]. Using the TEG, we are able to decompose a temporal network into constituent components and study the motif and IET distributions in tandem, highlighting the heterogeneity of behaviour across components and allowing us to uncover patterns of behaviour not seen when considering motifs and IETs alone.

In Section 2, we formally describe the TEG. In Section 3, we outline some of the theoretical properties of the TEG, showing specifically that we can recover a temporal network from the TEG up to a translation of disconnected components in time. In Section 4, we state the statistical properties which describe the TEG before applying them in Section 5 to characterize an online social network. Finally, in Section 6 we discuss the possible applications of the TEG and further generalizations.

2. The temporal event graph

We consider temporal networks as a sequence of temporal events $$E$$. Let $$V \subset \mathbb{N}$$ be a set of interacting nodes, and $$T \subset \mathbb{R}^+_0$$ a non-empty ordered set of interaction times, then the temporal network is defined as the tuple $$G=G(V,E,T)$$ where $$E \subset V^2 \times T$$. An individual event $$e_i=(u_i,v_i,t_i)\in E$$ corresponds to an interaction of node $$u_i$$ with node $$v_i$$ at time $$t_i$$ (here assuming interaction is instantaneous and that each $$t_i$$ is distinct). The systems most suited to this representation are communication networks (letter and email correspondence, phone calls, social media, etc.) and proximity networks (human contact networks) [26].
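The tuple $$G(V,E,T)$$ defined above can be held in a very small data structure. The following is a minimal sketch and not code from the article: the names `Event` and `temporal_network` are hypothetical, and the distinct-time check simply enforces the assumption stated in the definition.

```python
from typing import NamedTuple


class Event(NamedTuple):
    u: int    # source node u_i
    v: int    # target node v_i
    t: float  # event time t_i


def temporal_network(raw_events):
    """Return (V, E, T) with the events E sorted by time.

    Raises ValueError if two events share a timestamp, since the
    definition assumes every t_i is distinct.
    """
    E = sorted((Event(*e) for e in raw_events), key=lambda e: e.t)
    T = [e.t for e in E]
    if len(set(T)) != len(T):
        raise ValueError("event times must be distinct")
    V = {node for e in E for node in (e.u, e.v)}
    return V, E, T


# Three events given out of order; they come back sorted by time.
V, E, T = temporal_network([(2, 3, 1.5), (1, 2, 0.0), (1, 2, 4.0)])
```

Keeping the events sorted by time means that the index of an event can double as its position in the time ordering, which is convenient for the constructions that follow.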
To define the TEG we first need to be able to relate two events in a meaningful way, capturing the relationship of the nodes and the temporal proximity of the events. One such relation is that of $$\Delta t$$-adjacency [13].

Definition 2.1 Two time-ordered events $$e_i, e_j$$ are said to be $$\Delta t$$-adjacent if they share at least one node $$\left(\{u_i, v_i\} \cap \{u_j, v_j\} \neq \emptyset \right)$$ and the time between the two events (IET) is no greater than $$\Delta t$$, that is $$0 < t_j-t_i \le \Delta t$$.

This definition of $$\Delta t$$-adjacency makes no assumption on the directionality of events; events can be directed or undirected. As such this definition gives little information as to the existence of temporal paths in the network, although one could modify it accordingly to consider only events which are guaranteed to create a path. For the remainder of this article, we assume that events are directed. Using the definition of $$\Delta t$$-adjacency, we can now formally define the TEG.

Definition 2.2 For a temporal network $$G = G(V, E, T)$$, the $$\Delta t$$-temporal event graph, hereafter known as the $$\Delta t$$-TEG, is a directed graph $$\mathcal{G} = \mathcal{G}(\mathcal{V}, \mathcal{E})$$ with $$\mathcal{V} = E$$ and $$\mathcal{E} \subset \mathcal{V} \times \mathcal{V}$$. The graph is defined such that there is a vertex for each event in $$E$$ and each vertex is connected to the subsequent $$\Delta t$$-adjacent event of each node in that event. More precisely, let \begin{align*} S(u_i) &= \{ k | \phantom{1}\underbrace{( \{u_i\} \cap \{u_k, v_k\} \neq \emptyset )}_{\textrm{Share a node}}\phantom{.} \text{ and } \underbrace{(0 < t_k - t_i \le \Delta t)}_{\textrm{Occur within }\Delta t \textrm{ of each other}}\}, \end{align*} be the set of subsequent $$\Delta t$$-adjacent events for the node $$u_i$$ with the equivalent set defined for $$v_i$$.
The set of edges in the TEG is then given by \begin{align*} \mathcal{E} &= \{ (e_i, e_j) | (j = \min\{S(u_i)\}) \text{ or } (j = \min\{S(v_i)\}) \}. \end{align*} This construction means that each vertex has an out-degree and in-degree of at most two (see Lemma 3.1).

The $$\Delta t$$-TEG consists of one or more temporal components (or maximal temporal subgraphs [13]), that is, for each pair of events in a component there exists a sequence of events between them such that all pairs of consecutive events are $$\Delta t$$-adjacent, so that each pair of events is $$\Delta t$$-connected. From a purely graphical standpoint, these are the weakly connected components of the $$\Delta t$$-TEG.

Of particular interest is the $$\Delta t$$-TEG in the limit $$\Delta t \to \infty$$, hereafter referred to as the TEG. This captures all possible connected events (due to no cut-off) and hence gives us the most information on the temporal network. In practice, for a given empirical temporal network there exists a maximal $$\Delta t$$ which we can consider as infinity, given that these networks are sampled over a finite time window.

The examples in Fig. 1 show how the TEG is constructed from an event sequence. To avoid ambiguity we use the terms nodes and events for the temporal network, and vertices and edges for the TEG.

Fig. 1. Illustration of the duality of temporal networks and the TEG. (a) Four simple temporal networks (event sequences) involving four events. (b) Pictorial representations of the temporal networks. Event labels represent the instantaneous time when that event occurred between two nodes. (c) The TEG for each temporal network (with $$\Delta t \to \infty$$). (d) The corresponding edge-labelled TEGs (Def. 2.3). Edges are labelled with the tuple $$(\tau, \mu)$$, the IET and motif, respectively. Note in the bottom example the next two events for node A are connected to the first event. This is consistent as the ABBA edge occurred after that of the ABAC, that is node A’s subsequent event was $$\textrm{A} \to \textrm{C}$$ and node B’s subsequent event was $$\textrm{B} \to \textrm{A}$$ (coincidentally A’s next event).

There are two important functions of the edge set to consider. Firstly, the function $$\tau: \mathcal{E} \to \mathbb{R}_{0}^+$$, given by $$\tau\left((e_i, e_j)\right) = t_j - t_i$$, describes the IET between the two events. Secondly, the function $$\mu: \mathcal{E} \to \mathbb{M}$$, where $$\mathbb{M} = \{\textrm{ABAB, ABBA, ABAC, ABCA, ABBC, ABCB}\}$$ is the set of two-event motifs (Table 1), describes the relative positions of the nodes between events. These motifs are given a descriptive name which is indicative of the behaviour associated with each pattern. For example, the ABAC motif is described as the broadcast motif as node A is in contact with multiple other nodes (B and C). The ABBA motif is the reciprocal motif, as an event from A to B is then followed by the reciprocal event B to A.
Let $$f_{ij}$$ be an enumeration of the ordered sequence of nodes $$(u_i, v_i, u_j, v_j)$$ (not necessarily distinct) mapped to the corresponding alphabetic character,$$^{1}$$ then $$\mu\left((e_i, e_j)\right) = f_{ij}(u_i)f_{ij}(v_i)f_{ij}(u_j)f_{ij}(v_j)$$. For example, the edge $$((5,10,t_0), (10,12,t_1))$$ has nodes ($$5,10,12$$) which are mapped to (A,B,C) and so becomes ABBC under the action of $$\mu$$. It is also possible for the motif function $$\mu$$ to incorporate other event data such as event or node colourings.

Table 1 The set of all possible two-event motifs $$\mathbb{M}$$, given by their contact sequence, description, label, and label properties $$\xi_{\rm in}$$, $$\xi_{\rm out}$$ and $$\xi_{\rm switch}$$

| Motif | Name | Shorthand | $$\xi_{\rm out}$$ | $$\xi_{\rm in}$$ | $$\xi_{\rm switch}$$ |
| --- | --- | --- | --- | --- | --- |
| $$A\to B$$, $$B\to A$$ | Reciprocal | ABBA | AB | BA | $$-1$$ |
| $$A\to B$$, $$A\to B$$ | Repeated | ABAB | AB | AB | $$1$$ |
| $$A\to B$$, $$A\to C$$ | Broadcasting | ABAC | A$$\bullet$$ | A$$\bullet$$ | $$1$$ |
| $$A\to B$$, $$C\to A$$ | Non-sequential | ABCA | A$$\bullet$$ | $$\bullet$$A | $$-1$$ |
| $$A\to B$$, $$B\to C$$ | Message Passing | ABBC | $$\bullet$$B | B$$\bullet$$ | $$-1$$ |
| $$A\to B$$, $$C\to B$$ | Receiving | ABCB | $$\bullet$$B | $$\bullet$$B | $$1$$ |

There are three properties of the motif set, $$(\xi_{\rm out}, \xi_{\rm in}, \xi_{\rm switch})$$, which are required in Section 3.1. For event pairs involving three distinct nodes, we define $$\xi_{\rm out}$$ to be the label and position, in the earlier event, of the node which appears in both events, $$\xi_{\rm in}$$ to be the label and position of the shared node in the later event, and $$\xi_{\rm switch} = 1$$ if $$\xi_{\rm out}=\xi_{\rm in}$$ and $$-1$$ otherwise. For example, in the motif ABBC the node labelled B is carried forward from the first event so $$\xi_{\rm out}(\textrm{ABBC})=\bullet\textrm{B}$$ and takes the first position in the second event so that $$\xi_{\rm in}(\textrm{ABBC})= \textrm{B}\bullet$$. Subsequently, as $$\xi_{\rm out}\neq\xi_{\rm in}$$, we have $$\xi_{\rm switch}(\textrm{ABBC})=-1$$: the node labelled B has switched from being the target of an event to being a source. For consistency, we define $$\xi_{\rm out}(\textrm{ABAB}) = \textrm{AB} = \xi_{\rm out}(\textrm{ABBA})$$ and $$\xi_{\rm in}(\textrm{ABAB})=\textrm{AB}$$ and $$\xi_{\rm in}(\textrm{ABBA}) = \textrm{BA}$$.
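Definition 2.2 and the edge functions $$\tau$$ and $$\mu$$ can be sketched together in a few lines. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names `motif` and `edge_labelled_teg` and the dictionary `XI` are hypothetical, events are assumed to have distinct times and no self-loops, and the construction uses the observation that "the subsequent event of each node" is the same relation as "the previous event of each node" read backwards. The `XI` entries are transcribed from Table 1.

```python
import math


def motif(first, second):
    """Two-event motif mu, e.g. (5, 10, t0), (10, 12, t1) -> 'ABBC'."""
    nodes = (first[0], first[1], second[0], second[1])
    labels = {}
    for node in nodes:
        if node not in labels:
            labels[node] = "ABCD"[len(labels)]  # label in order of appearance
    return "".join(labels[node] for node in nodes)


def edge_labelled_teg(events, delta_t=math.inf):
    """Return (sorted events, {(i, j): (tau, mu)}) for the delta_t-TEG."""
    events = sorted(events, key=lambda e: e[2])
    edges = {}
    last = {}  # node -> index of the most recent event involving that node
    for j, (u, v, t) in enumerate(events):
        for node in {u, v}:
            i = last.get(node)
            if i is not None and 0 < t - events[i][2] <= delta_t:
                edges[(i, j)] = (t - events[i][2], motif(events[i], events[j]))
            last[node] = j
    return events, edges


# (xi_out, xi_in, xi_switch) for each motif, transcribed from Table 1.
XI = {
    "ABBA": ("AB", "BA", -1), "ABAB": ("AB", "AB", 1),
    "ABAC": ("A•", "A•", 1), "ABCA": ("A•", "•A", -1),
    "ABBC": ("•B", "B•", -1), "ABCB": ("•B", "•B", 1),
}

raw = [("A", "B", 1), ("A", "C", 2), ("B", "A", 3), ("C", "D", 10)]
events, edges = edge_labelled_teg(raw)
```

In this small example the first event has two out-edges, one following node A (motif ABAC) and one following node B (motif ABBA), which is the situation described for the bottom example of Fig. 1. Passing a finite `delta_t` drops any edge whose IET exceeds it.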
When constructing the TEG from a temporal network, we have information on the events and their connectivity. We can also consider a TEG without the event information, defined purely by the connectivity information and edge functions (IETs and motifs).

Definition 2.3 The edge-labelled TEG is the static graph defined by the upper-triangular adjacency pair $$(A^{\tau}, A^{\mu})$$ where \begin{align*} A^{\tau}_{ij} = \begin{cases} \tau(e_i, e_j) &\mbox{if } (e_i, e_j) \in \mathcal{E} \\ 0 & \mbox{otherwise}, \end{cases} \end{align*} is the weighted adjacency matrix consisting of IETs and \begin{align*} A^{\mu}_{ij} = \begin{cases} \mu(e_i, e_j) &\mbox{if } (e_i, e_j) \in \mathcal{E} \\ 0 & \mbox{otherwise}, \end{cases} \end{align*} is the matrix containing edge motif labels.

3. Theoretical properties of the TEG

In this section, we state and prove a number of properties of the TEG and list the conditions required for an edge-labelled TEG to represent a temporal network. Using this knowledge, we are then able to show how the edge-labelled TEG uniquely defines a temporal network up to a translation in time between components.

Lemma 3.1 Each vertex in the TEG has in-degree at most two and out-degree at most two.

Proof. Consider an event vertex representing the event $$e_i = (u_i, v_i, t_i)$$. From our definition we let \begin{align*} A^+_{u}(i) &= \{ k | ( \{u_i\} \cap \{u_k, v_k\} \neq \emptyset ) \text{ and } (0 < t_k - t_i \le \Delta t) \}, \\ A^+_{v}(i) &= \{ k | ( \{v_i\} \cap \{u_k, v_k\} \neq \emptyset ) \text{ and } (0 < t_k - t_i \le \Delta t) \} \end{align*} be the subsequent $$\Delta t$$-adjacent events for the nodes $$u_i$$ and $$v_i$$ respectively. The set of edges in the TEG is given by \begin{align*} \mathcal{E} &= \{ (e_i, e_j) | j = \min(A^+_{u}(i)) \text{ or } j = \min(A^+_{v}(i)) \}. \end{align*} Therefore, for each $$e_i$$ there are edges to the two events whose indices are the minima of each set.
These two minima need not be distinct, nor even exist, and so there are at most two out-edges. For the in-degree, the previous $$\Delta t$$-adjacent events for the nodes $$u_i$$ and $$v_i$$ are \begin{align*} A^-_{u}(i) &= \{ k | ( \{u_i\} \cap \{u_k, v_k\} \neq \emptyset ) \text{ and } (0 < t_i - t_k \le \Delta t) \}, \\ A^-_{v}(i) &= \{ k | ( \{v_i\} \cap \{u_k, v_k\} \neq \emptyset ) \text{ and } (0 < t_i - t_k \le \Delta t) \} \end{align*} We can analogously define the edge set as \begin{align*} \mathcal{E} &= \{ (e_j, e_i) | j = \max(A^-_{u}(i)) \text{ or } j = \max(A^-_{v}(i)) \}. \end{align*} By the same reasoning as with the forward definition, vertices have an in-degree of at most two. □

Lemma 3.2 The TEG is a directed acyclic graph (DAG).

Proof. For a graph $$G$$ to be a DAG, no node in $$G$$ may have a directed path from that node back to itself. The edge set is given by \begin{align*} \mathcal{E} &= \{ (e_i, e_j) |\, j = \min(A^+_{u}(i)) \textrm{ or } j = \min(A^+_{v}(i)) \}, \end{align*} where the set $$A^+_{u}(i)$$ contains only events $$e_k$$ such that $$t_k > t_i$$ by definition. Suppose there exists a directed path from event $$e_i$$ back to itself via a sequence of ordered events $$e_{k_1}, e_{k_2}, \dots e_{k_n}$$. Then by transitivity this implies $$t_i < t_{k_1} < t_{k_2} < \dots < t_{k_n} < t_i$$, which is a contradiction. Hence no such path exists and the TEG is a DAG. Simply put, as edges travel strictly forward in time there can be no cycles in the graph. □

Lemma 3.3 The sets of nodes in the temporal components of the TEG are disjoint, that is, if there exist two temporal components of the TEG, $$C_1$$ and $$C_2$$, with node sets $$V_1, V_2 \subset V$$ then $$V_1 \cap V_2 = \emptyset$$.

Proof. Suppose $$V_1 \cap V_2 \neq \emptyset$$ and there exists a node $$u \in V_1 \cap V_2$$. Then there exists a set of events in $$C_1$$ which contain $$u$$ with times $$t^{(1)}_1, t^{(1)}_2, \dots, t^{(1)}_{n_1}$$.
Similarly there exists a set of events in $$C_2$$ which contain $$u$$ with times $$t^{(2)}_1, t^{(2)}_2, \dots, t^{(2)}_{n_2}$$. Assuming that event times are distinct, there exists an ordering of these times. Regardless of the relative ordering of these times there must exist a time $$t^{(1)}_i$$ immediately followed by a time $$t^{(2)}_j$$ (or vice versa). The two corresponding events share the node $$u$$ and are consecutive among the events containing $$u$$, meaning the two events are adjacent. This implies there exists an edge between the two events by definition of the TEG, and so $$C_1$$ and $$C_2$$ are one component. This contradicts the assumption that $$C_1$$ and $$C_2$$ are separate components, and hence $$C_1$$ and $$C_2$$ must contain distinct nodes. □

Note that this is not true in the $$\Delta t$$-TEG, even if the components completely overlap in time.

Lemma 3.4 The maximal path (allowing for backwards traversal along edges with negative weight) through a temporal component of the edge-labelled TEG includes the earliest and latest event in the temporal component.

Proof. Let $$p_{\rm{max}} = (e_0, \dots, e_k)$$ be the sequence of vertices in the maximal path. Suppose there exists an event $$e_{*} \notin p_{{\rm{max}}}$$ such that $$t_* < t_i$$ for $$i=0,\dots,k$$. Then, as the component is connected, there exists a path $$p_{*i}$$ (ignoring edge directions) from $$e_* \to e_i$$, $$\forall e_i \in p_{{\rm{max}}}$$. Then $$l(p_{*i}) > l(p_{0i})$$ where $$l(\cdot)$$ is the weighted length of the path, and hence the path $$e_* \to e_i \to e_k$$ is longer than $$p_{\rm{max}}$$. This is a contradiction and hence the maximal path through the component must contain the earliest event. A similar but opposite argument shows that the latest event is also contained in the maximal path. □

3.1 Duality

Before we show the relationship between the edge-labelled TEG and the temporal network, we first show that not all permutations of the vertices and edges of an edge-labelled TEG describe a temporal network.
The structure of the TEG takes a specific form, and there are four conditions required for an edge-labelled TEG to represent a temporal network.

[C1] Event times must be consistent across all paths (Fig. 2(c)): Let $$P_{ij}$$ be the set of all directed paths between vertices $$i$$ and $$j$$. We describe a path $$p_\alpha \in P_{ij}$$ as the sequence of edges in the path. The sum of IETs along all paths must be equal, that is \begin{align*} \sum_{(k,l) \in p_\alpha}{A^\tau_{kl}} = \sum_{(k,l) \in p_\beta}{A^\tau_{kl}} \textrm{ for all } p_\alpha, p_\beta \in P_{ij}. \end{align*}

[C2] Nodes in each event have only one subsequent event (Fig. 2(a)): For each pair of out-edges $$(i, k), (i, l)$$ of a vertex, we require $$\xi_{\rm out}(A^\mu_{ik}) \neq \xi_{\rm out}(A^\mu_{il})$$.

[C3] Nodes in each event cannot be overprescribed (Fig. 2(b)): For each pair of in-edges $$(k, i), (l, i)$$ of a vertex, we require $$\xi_{\rm in}(A^\mu_{ki}) \neq \xi_{\rm in}(A^\mu_{li})$$.

[C4] Edge types and nodes must be consistent across multiple paths (Fig. 2(d)): If there exists an edge $$(i,j)$$ such that there exists a secondary path $$p \in P_{ij}$$ via at least one other vertex then \begin{align*} A^\mu_{ij} = \begin{cases} \textrm{ABAB} &\mbox{if } \displaystyle\prod_{(k,l) \in p}{\xi_{\rm switch}(A^\mu_{kl})} = 1 \\ \textrm{ABBA} & \mbox{if } \displaystyle\prod_{(k,l) \in p}{\xi_{\rm switch}(A^\mu_{kl})} = -1 \end{cases} . \end{align*} Conversely, if there is a vertex with two in-edges, $$(i,j), (k,j)$$, with $$A^\mu_{ij} \in \{\textrm{ABAB, ABBA}\}$$ then there exists a path $$p \in P_{ij}$$ with $$(k,j) \in p$$ and $$\prod_{(m,n) \in p}{\xi_{\textrm{switch}}(A^\mu_{mn})} = \xi_{\textrm{switch}}(A^\mu_{ij})$$. Similarly, for a vertex with two out-edges $$(i,j), (i,k)$$ with $$A^\mu_{ij} \in \{\textrm{ABAB, ABBA}\}$$ there exists a path $$p \in P_{ij}$$ with $$(i,k) \in p$$ and $$\prod_{(m,n) \in p}{\xi_{\textrm{switch}}(A^\mu_{mn})} = \xi_{\textrm{switch}}(A^\mu_{ij})$$.

Fig.
2. Inconsistent edge-labelled TEGs. Edges are labelled with the tuple $$(\tau, \mu)$$. (a) The subsequent two events for node A are included as edges, breaking condition [C2]. (b) Both incoming edge types dictate the first node of the event which is contradictory (condition [C3]). (c) The IETs across multiple paths are not equal (condition [C1]). (d) The edge between events $$e_1$$ and $$e_3$$ is incorrectly labelled. By reconstructing the temporal network or using condition [C4] we see that $$A^\mu_{13}= \textrm{ABAB}$$.

We call graphs which satisfy these conditions consistent graphs. Those graphs which do not satisfy these conditions are inconsistent in that they do not uniquely describe a temporal network, and attempting to recover the temporal network using the following algorithm will lead to contradiction. Examples of inconsistent TEGs are given in Fig. 2. Generally it is difficult to generate graphs which satisfy these conditions; however, any TEG generated from a temporal network will be consistent.

For each connected component of an edge-labelled TEG we are able to reconstruct the temporal network with the following algorithm:

(a) Find the maximal path from a root vertex (no incoming edges) to a leaf vertex (no outgoing edges) in the edge-labelled TEG using the network of IETs, $$A^\tau$$, allowing for backwards traversal along edges with opposite weight (Fig. 3(a)).
This can be achieved by finding the shortest path in the network $$\left(A^\tau\right)^\top - A^\tau$$, where $$\cdot^\top$$ is the transpose.

(b) Label the first vertex in the maximal path with $$t=0$$ and subsequently propagate the event times through the edge-labelled TEG along the edges. We know that this event is the earliest in the component by Lemma 3.4. For a vertex $$i$$, the time at which that event occurs is given by \begin{align*} t_i = \sum_{(m,n) \in P_{0i}} \left(\left(A^\tau\right)^\top - A^\tau\right)_{mn} \end{align*} To be able to do this we require condition [C1], otherwise the existence of multiple paths between vertices can lead to a contradiction in event times.

(c) For events in time order, resolve the nodes in each event from the incoming edges (Fig. 3(b,c)). We require condition [C3] here, otherwise there can be a conflict when resolving a node position. If a node in an event is unprescribed (the event has zero or one incoming edge) then the unprescribed nodes are given a new label.

Condition [C2] is required by definition of the edge-labelled TEG to enforce that the subsequent event of each node is connected by an edge. Without it, the subsequent two events for one node could both be given as edges. Finally, condition [C4] ensures that the edge-labelled TEG is uniquely labelled (Fig. 2(d)).

The existence of an inverse algorithm confirms a duality between the edge-labelled TEG and the temporal network.

Theorem 3.5 Let $$X$$ be the set of all temporal networks translated in time such that the first event occurs at $$t=0$$, nodes are labelled in order of appearance, and such that the time-aggregated graph of connections is connected. Let $$Y$$ be the set of all consistent and connected TEGs. Then there exists a bijection $$f: X \to Y$$, that is, an edge-labelled TEG uniquely describes a temporal network in $$X$$.

Fig. 3. The inverse algorithm for the TEG.
(a) The maximal path between root and leaf vertices (red/grey) through the TEG with edges labelled with IETs. Once the maximal path has been found, the root vertex is assigned time $$t=0$$ and the remainder of times are found by propagation along all other edges (black). (b) The resolution of an event from two incoming edges. Each incoming edge determines one of the nodes in the later event. (c) The resolution of an event with one incoming edge. In this case only one node is prescribed and so the other is given a new label.

Proof. Trivially, for each temporal network there exists only one edge-labelled TEG as the nodes in each event have at most one subsequent event$$^{2}$$ and the functions $$\tau$$ and $$\mu$$ are deterministic. The proof rests on the existence of the inverse algorithm $$f^{-1}$$, outlined above. We consider a general event $$e_i = (u_i, v_i, t_i)$$ in the temporal network, and its representative vertex $$x$$ in the edge-labelled TEG. By the translation of the temporal network, this event occurs $$t_i$$ time units after the first event. By finding the maximal path through the edge-labelled TEG, we find the first event in the temporal network (Lemma 3.4) and can hence find the time at which $$x$$ occurs relative to this first event, that is $$t_i$$. The event is now correctly placed in time.
To recover the nodes of the event $$u_i$$ and $$v_i$$, assume the nodes in all previous events have been correctly determined in order of appearance. There are three possible cases:

1. Event $$e_i$$ has no incoming edges. In this case neither of these nodes has previously interacted and both can be enumerated.
2. Event $$e_i$$ has one incoming edge prescribing one node. In this case a new node is involved and is enumerated accordingly (Fig. 3(c)).
3. Event $$e_i$$ has one or two incoming edges prescribing both nodes. In this case the nodes are completely determined by previous events (Fig. 3(b)).

For the base case, the earliest event vertices have no incoming edges and are labelled freely. Subsequent event vertices must then have all incoming edges prescribed as they occur strictly earlier in time. Hence the nodes in $$e_i$$ are correctly labelled, relative to the labelling of the previous events. As both nodes are labelled relative to previous events, and the time of the event is positioned relative to the first event, the event is recovered from the TEG. Since this argument holds for an arbitrary event in the temporal network, it holds for all. Therefore $$f^{-1}(f(X)) = X$$, and $$f$$ is a bijection. □

Corollary 3.6 A TEG $$\mathcal{G}$$ consisting of multiple connected components defines a temporal network up to a translation of time between components. If the events of $$\mathcal{G}$$ are time stamped, then $$\mathcal{G}$$ uniquely defines a temporal network.

Proof. By Theorem 3.5, for each connected component there exists a unique temporal network such that the earliest event occurs at $$t=0$$. Trivially there exists an ensemble of temporal networks with the same TEG, dependent on the choice of earliest event time for each component. If the time of this event is given then the choice is removed and hence the TEG uniquely defines the temporal network.
□

Time translation between components may seem disconcerting; however, these components are disconnected and do not share any nodes (using Lemma 3.3). This means that, assuming the network is not visible to those within it, any dynamics on the network are completely independent across components.$$^{3}$$ Most digital communication channels that we will consider are hidden from an observer, for example email, SMS and telephone calls. Other sources of communication such as Twitter are in the public domain and so all messages are observable (although they require active searching). Furthermore, with real examples we keep the event timestamps, which fixes the temporal components in time, and so the TEG uniquely defines a temporal network.

This means that the temporal network can be uniquely defined, up to the time translation of components, by the network of subsequent adjacent events, their IETs, and the motifs formed between them. As a result, considering the network in this formalism is equivalent to studying the temporal network, as the same information is contained in both representations.

4. Statistical properties of the TEG

We now look to describe the TEG statistically by considering the temporal components as a function of $$\Delta t$$ and the distribution of IETs and motifs across these components. In this section we consider the $$\Delta t$$-TEG as a weighted directed static network where edge weights are the IETs between events. This allows us to prune a network based on edge weights (IETs). The $$\Delta t$$-TEG contains edges $$(i,j)$$ where $$A^\tau_{ij} \le \Delta t$$, using the notation of the edge-labelled TEG from Section 2. Let $$C_i^{\Delta t}$$ be the $$i$$th temporal component of the $$\Delta t$$-TEG, where components are partially ordered by the number of events they contain such that $$|C_0| \ge |C_1| \ge \cdots$$. These components provide a natural decomposition of the temporal network into constituent events.
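As an illustration of this decomposition, the following is a minimal sketch of how the temporal components of a $$\Delta t$$-TEG might be computed. It assumes that the TEG links each event to the next event of each of its nodes (the construction of Section 2 is not reproduced here), that no node takes part in two simultaneous events, and that components are the weakly connected components of the pruned graph. The function names `build_teg` and `temporal_components` are hypothetical and not taken from the article.

```python
from collections import defaultdict

def build_teg(events):
    """Return TEG edges (i, j, iet) for events given as (u, v, t) tuples.

    Assumption: event i is linked to event j when j is the next event
    involving one of the nodes of i; iet is the time between them.
    """
    order = sorted(range(len(events)), key=lambda k: events[k][2])
    last_event = {}  # node -> index of its most recent event
    edges = []
    for j in order:
        u, v, t = events[j]
        # Predecessor events, de-duplicated when u and v share one.
        preds = sorted({last_event[n] for n in (u, v) if n in last_event})
        for i in preds:
            edges.append((i, j, t - events[i][2]))
        last_event[u] = last_event[v] = j
    return edges

def temporal_components(events, dt):
    """Weakly connected components of the dt-TEG, largest first."""
    parent = list(range(len(events)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j, iet in build_teg(events):
        if iet <= dt:  # keep only edges with IET at most dt
            parent[find(i)] = find(j)

    groups = defaultdict(list)
    for k in range(len(events)):
        groups[find(k)].append(k)
    return sorted(groups.values(), key=len, reverse=True)

events = [("a", "b", 0), ("b", "c", 2), ("a", "b", 10), ("d", "e", 3)]
components = temporal_components(events, dt=4)
```

In this toy example the first two events form one component for $$\Delta t = 4$$, while the later event on the same nodes is split off because its IETs exceed $$\Delta t$$.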
The temporal components do not, however, immediately give any insight into the connected components of nodes. In fact it has been shown previously that finding strongly connected components of nodes in temporal networks is an NP-complete problem [27], so we make no such attempt.

4.1 Component sizes, distribution and growth

The number and size of temporal components in the $$\Delta t$$-TEG evidently depends on $$\Delta t$$. A natural question is to ask how many temporal components there are in a temporal network and how the events are distributed between them. In the limit $$\Delta t \to 0$$ the TEG will be completely disconnected (assuming a node does not participate in two events at once); however, it is not guaranteed that as $$\Delta t \to \infty$$ a single component will form. In fact in the limit $$\Delta t \to \infty$$ the components of the TEG contain distinct sets of nodes (Lemma 3.3) and correspond to the connected components of the time-aggregated temporal network. For intermediate $$\Delta t$$ the structure of the TEG has a complex dependency on both the connectivity of the nodes (network topology) and the timing between subsequent events.

To characterize the network structure, we look at the component size distribution of the $$\Delta t$$-TEG. By size we mean the number of events in each component, although we can similarly consider the number of distinct nodes or component duration. We are also interested in the size of the largest component $$|C_0^{\Delta t}|$$, given by the number of events in that component. In particular, understanding the growth of $$|C_0^{\Delta t}|$$ as a function of $$\Delta t$$ gives an understanding of the different timescales involved in the network, for example, what fraction of the whole network it contains and the value of $$\Delta t$$ at which it reaches $$95\%$$ of its total size. As an example, we look at a randomly generated temporal network.
To generate a temporal network of $$N$$ nodes with $$M$$ events and a prescribed IET distribution $$X$$, we initialize $$t=0$$ and perform the following iteration for each event:

1. Increment $$t$$ to $$t+\tau$$, where $$\tau$$ is drawn from $$X$$.
2. Draw $$u, v$$ from $$\{1, \dots, N \}$$ without replacement.
3. Add event $$(u,v,t)$$ to the temporal network.

In Fig. 4, we see the results for a random temporal network where $$N=200$$, $$M=5,000$$, and $$X$$ is power-law distributed with density $$P(x;a) = ax^{a-1}$$, where $$0 \le x \le 1$$ and $$a=0.2$$. Results are averaged over an ensemble of $$100$$ temporal networks. The size of the largest component has a sigmoidal dependence on $$\Delta t$$, with only a small fraction of the TEG connected below a characteristic time, and the majority of events connected above (Fig. 4(a)). The average duration of the temporal network is $$1,000$$, meaning that when $$\Delta t$$ is only $$2\%$$ of the network duration, the majority of events are connected. Also, due to the random selection of nodes, the largest component ultimately contains every event as $$\Delta t \to \infty$$.

The distribution of temporal components (Fig. 4(b)) also displays this transition. For $$\Delta t = 5$$ there is a continuous spectrum of component sizes, although the maximum observed size is less than $$10\%$$ of events. The probability of observing components any larger becomes exponentially small. For $$\Delta t = 10$$ almost all possible component sizes are observed. However, above the characteristic time at $$\Delta t = 15$$, the distribution is not continuous. Components are either a small fraction of the TEG or the majority fraction. There are no components of intermediate size.
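The three-step generation procedure described above can be sketched as follows. This is a minimal illustration rather than the code used to produce Fig. 4: the names `random_temporal_network` and `power_law_iet` are hypothetical, and the power-law IETs are drawn by inverse-transform sampling, which assumes the density $$P(x;a) = ax^{a-1}$$ on $$[0,1]$$ so that the CDF is $$x^a$$.

```python
import random

def power_law_iet(a):
    """Sampler for density a * x**(a - 1) on [0, 1].

    The CDF is x**a, so inverting a uniform draw U gives U**(1/a).
    """
    def draw(rng):
        return rng.random() ** (1.0 / a)
    return draw

def random_temporal_network(n_nodes, n_events, draw_iet, seed=None):
    """Generate a list of events (u, v, t) with nodes labelled 1..n_nodes."""
    rng = random.Random(seed)
    t = 0.0
    events = []
    for _ in range(n_events):
        t += draw_iet(rng)                            # step 1: advance time
        u, v = rng.sample(range(1, n_nodes + 1), 2)   # step 2: two distinct nodes
        events.append((u, v, t))                      # step 3: record the event
    return events

# Parameters quoted in the text: N = 200, M = 5,000, a = 0.2.
events = random_temporal_network(200, 5000, power_law_iet(0.2), seed=1)
```

Passing the IET sampler as an argument keeps the generator independent of the choice of $$X$$, so other distributions can be substituted without changing the loop.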
Fig. 4. Temporal component dependence on $$\Delta t$$. (a) The size of the largest temporal component in the $$\Delta t$$-TEG as a fraction of all events for a random temporal network of $$200$$ nodes and $$5,000$$ events. The largest component size has a sigmoidal dependence on $$\Delta t$$, with a sharp transitional period from being only a small fraction of all events ($$<$$10%), to containing almost all events ($$>$$90%). (b) The corresponding distribution of temporal component sizes for $$\Delta t = 5,10,15$$ constructed using an ensemble of random temporal networks. For $$\Delta t=5$$ there is a range of component sizes, none of which make up more than $$10\%$$ of the network. For $$\Delta t=10$$ components can take any size. For $$\Delta t=15$$ components either make up the majority of the network, or are small isolated components.

One way to visualize the temporal components is through a temporal barcode, as seen in Fig. 5. This displays the components of the $$\Delta t$$-TEG, ordered by their size with the largest components at the bottom. Within each component, each individual event is plotted as a single vertical line. This visualization allows us to see the duration of each component, its temporal position relative to other components, and the distribution of IETs within the component.

Fig. 5. Illustration of the temporal barcode associated with a $$\Delta t$$-TEG. (a) A temporal network involving six nodes and nine events. Event labels represent the instantaneous time when that event occurred. (b) The temporal components of (a) when $$\Delta t = 4$$. (c) The temporal barcode of (b). There are three different components. Events in each component appear as black lines. Components 1 and 2 are distinct from 3 as they involve a distinct set of nodes. Components 1 and 2 are distinct as there is a gap greater than $$\Delta t$$ between activity on the nodes.

4.2 Motif and inter-event time distributions

We can also consider the IET and motif distributions across the temporal components of the TEG.$$^{4}$$ The simple temporal networks in Fig. 6 have trivial motif distributions. In Fig. 6(a) the only motif present is that of ABAC, reflective of the broadcasting type behaviour of node $$\epsilon$$ in this instance. If we were to consider the distribution of motifs in Fig. 6(b), we would see an equal split between the ABAB and ABBA motifs. However, considering the motif distribution of each component we see that there are in fact two distinct components containing either the ABAB or ABBA motif only.
Without a suitable null model for the temporal network, analysing the motif distributions alone cannot give the significance of any observations [28, 29], and choosing a null model is non-trivial beyond time-shuffling and time-reversal [30, 31]. Comparing the temporal network with itself, however, allows us to gain information about the relative motif counts across components. Motif counts can be compared across different node or event types, or even different intervals in the network; however, given the use of temporal components in the calculation of motif counts, comparing the motif distributions across temporal components is a natural way to proceed.

Fig. 6. Examples of temporal networks and their TEGs. (a) (Left) a temporal network consisting of a central node messaging four other nodes in turn. (Right) the corresponding TEG. (b) (Left) a temporal network consisting of two pairs of nodes. The bottom pair periodically reciprocate messages in turn, whereas in the top pair all messages are sent in one direction. (Right) the corresponding TEG.

Returning to the random temporal network example of Section 4.1, it can be shown that the motif distribution is given by

\begin{align} \Pr(x) = \begin{cases} \frac{1}{4N-6} & \text{for } x \in \{\mathrm{ABAB}, \mathrm{ABBA}\}, \\ \frac{N-2}{4N-6} & \text{for } x \in \{\mathrm{ABBC}, \mathrm{ABCB}, \mathrm{ABAC}, \mathrm{ABCA}\}. \end{cases} \tag{4.1} \end{align}

So, as $$N \to \infty$$, the ABAB and ABBA motifs are less likely to be observed and all other motifs are observed with equal probability.
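The probabilities in (4.1) can be checked by direct counting. The sketch below is an illustration under a stated assumption, not the article's derivation: it fixes a first event $$(\mathrm{A},\mathrm{B})$$ and treats the second event as uniformly distributed over all ordered pairs of distinct nodes that share at least one node with the first. There are $$4N-6$$ such pairs, two on the nodes A and B alone and $$N-2$$ for each three-node motif. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def motif_label(first, second):
    """Label a two-event motif, e.g. (0, 1) then (1, 2) gives 'ABBC'."""
    a, b = first
    names = {a: "A", b: "B"}
    return "AB" + "".join(names.get(node, "C") for node in second)

def motif_distribution(n_nodes):
    """Exact motif probabilities for n_nodes >= 3 under the uniform assumption."""
    first = (0, 1)
    counts = {}
    for second in permutations(range(n_nodes), 2):
        if set(second) & set(first):  # second event shares a node with the first
            label = motif_label(first, second)
            counts[label] = counts.get(label, 0) + 1
    total = sum(counts.values())  # equals 4 * n_nodes - 6
    return {label: Fraction(c, total) for label, c in counts.items()}

distribution = motif_distribution(5)
```

For $$N=5$$ this gives $$1/14$$ for each of ABAB and ABBA and $$3/14$$ for each three-node motif, in agreement with (4.1), so the two-node motifs become increasingly rare as $$N$$ grows.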
The vanishing probability of the ABAB and ABBA motifs illustrates why the random temporal network model is an unsuitable null model for social systems, where one expects a degree of reciprocity.

Coupled to each motif, each edge in the TEG carries the IET between the two connected events. This is the time between events in which an individual node participates (inward and outward activity). This time differs from what has been previously studied in temporal networks, which is usually the global time between events for the entire network or the outward activity of an individual node [10, 25, 32]. We may also partition the IETs based on the motif formed between the two events. For example we can calculate $$\Pr(t | m)$$ with $$m \in \mathbb{M}$$, the probability of observing an IET of $$t$$ given that the motif formed between the two events is $$m$$. By considering these conditional probabilities, we can uncover more information about the process that generated the temporal network than if we considered the IETs and motifs in isolation.

In Fig. 7(a), we plot the CCDF of the IETs of the TEG. For real data, this distribution is a complex function of node interactivity and activity patterns. For the random temporal network, however, the distribution is approximately geometric. This is due to each node having a constant probability of being in an event at each iteration. The distribution would be exactly geometric if $$X$$ were deterministic. In Fig. 7(b), we see that the motifs with two nodes (ABAB and ABBA) occur faster on average than the motifs containing three nodes. This makes sense as in the random temporal network model the three-node motifs are more likely to occur, and so the two-node motifs must occur quickly or not at all.
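The conditional distributions $$\Pr(t | m)$$ discussed above can be estimated empirically by grouping the IETs on TEG edges by the motif each edge forms. The following is a minimal sketch under several assumptions: events are directed tuples $$(u,v,t)$$, each event is linked to the next event of each of its nodes, no filtering for valid motifs is applied, and the CCDF is taken as the fraction of IETs strictly greater than $$t$$. All function names are hypothetical.

```python
from collections import defaultdict

def teg_edges_with_motifs(events):
    """Return a (motif, iet) pair for every TEG edge."""
    order = sorted(range(len(events)), key=lambda k: events[k][2])
    last_event = {}  # node -> index of its most recent event
    edges = []
    for j in order:
        u, v, t = events[j]
        preds = sorted({last_event[n] for n in (u, v) if n in last_event})
        for i in preds:
            a, b, t_prev = events[i]
            names = {a: "A", b: "B"}  # relabel nodes by order of appearance
            motif = "AB" + names.get(u, "C") + names.get(v, "C")
            edges.append((motif, t - t_prev))
        last_event[u] = last_event[v] = j
    return edges

def iets_by_motif(events):
    """Group IETs by motif; each list is an empirical sample of Pr(t | m)."""
    grouped = defaultdict(list)
    for motif, iet in teg_edges_with_motifs(events):
        grouped[motif].append(iet)
    return dict(grouped)

def ccdf(samples, t):
    """Empirical complementary CDF: fraction of samples strictly above t."""
    return sum(1 for x in samples if x > t) / len(samples)

events = [("a", "b", 0), ("b", "a", 1), ("a", "b", 3), ("a", "c", 7)]
grouped = iets_by_motif(events)
```

In this toy sequence the two reciprocated messages give ABBA motifs with IETs of 1 and 2, and the final message gives a single ABAC motif with an IET of 4.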
Fig. 7. The IET distributions for the random temporal network. (a) The CCDF for the IET distribution of the TEG, that is the time between consecutive events for each node. (b) The CCDFs of the IET distributions, conditional on the motif formed. The motifs containing two nodes have on average a smaller IET than motifs with three nodes.

4.3 Induced aggregate networks

The $$\Delta t$$-TEG provides a convenient way to decompose a larger temporal network; however, being event-centric, it can be difficult to assess the connectivity of the nodes within each component. This information can be extracted easily by considering the static aggregation of the temporal component. The static network can then be analysed using standard methods to find quantities of interest. In particular, we will be interested in the number of nodes, edge density, the fraction of reciprocated edges and network diameter. Studying the components of the decomposed network offers the advantage of understanding the role of nodes within a particular context, as opposed to consideration of the static graph of the full temporal network, which may be dense or noisy, or of fixed intervals which may dissect patterns of behaviour.

Partitioning the random temporal network into intervals of fixed width results in a series of Erdős–Rényi (ER) static networks with edge forming parameter $$p$$ dependent on the number of events in each partition. This gives the ‘temporal ER network’ as described in [33]. The aggregated networks of the TEG components by contrast are not in the class of ER graphs as they are guaranteed to be connected, and a full analysis of their properties is yet to be undertaken.

5. Application to data

In this section, we consider the social network of students from University of California, Irvine (UCI) [34, 35].$$^{5}$$ The social network was created to sustain social interaction among students and to help enlarge their social circles. Students created a profile which contained a short biography and demographic characteristics. Students could then view or message any other student in the network. The data set covers a period of six months from April to October 2004; over this time $$59,835$$ messages were sent between $$1,899$$ users.$$^{6}$$ The resulting aggregate network has $$20,296$$ directed edges, meaning the network is sparse ($$0.56\%$$ of all possible edges are present).

To get a first impression of the network structure we look at the temporal barcode of the $$10$$ min-TEG over a short period (12 h) of the data in Fig. 8. The TEG consists of multiple large components which occur over a duration of over an hour. Some of these components overlap in time, suggesting that distinct conversations were occurring. Over the same time period there are a number of smaller components of interest. For example, component $$20$$ consists of two users exchanging messages back and forth over a period of $$30$$ min. Interestingly one of these users has a response time significantly shorter than the other. Component $$19$$ is a more complicated mix of broadcasting nodes which then converse amongst themselves. Despite containing more events than component $$20$$, this component only lasts for a duration of 9 min. We can quantify these observations by examining the motif and IET distributions across each component.

We now systematically study the structural dependence of the TEG on the parameter $$\Delta t$$. In Fig. 9(a) we plot the size of the largest component (as a fraction of all events). As we increase $$\Delta t$$, the largest component increases in a step-wise fashion, indicating that there are distinct timescales in the connectivity of the TEG.
For example, the jump in connectivity around $$\Delta t = 6$$ h could be attributed to message pairs which occur in the late evening and then first thing in the morning. The step-like nature of the largest component highlights a need to exercise caution when choosing a value of $$\Delta t$$. Any analysis of these networks may be robust to small variation in $$\Delta t$$ on each step; however, there remains the question of which step is most suitable for the analysis. We also see that as $$\Delta t \to \infty$$ (not shown) the largest component contains all but four events (out of $$59,835$$). Consequently the aggregate static network contains only four components with the activity of the smaller three consisting of one or two events only.

The distributions of component sizes for varying $$\Delta t$$ are given in Fig. 9(b). The component size distributions mirror those of the random temporal network, with the formation of a giant component as $$\Delta t$$ increases. However in this case the transition from multiple small components to a giant component is less abrupt. For $$\Delta t=60$$ (blue squares) and for other small values of $$\Delta t$$ the distribution of component sizes can be well approximated by a power-law distribution.

Fig. 8. The top $$20$$ components of the $$10$$ min-TEG for a 12-h period beginning on $$10$$th May $$2004$$. As with Fig. 5, each vertical black line represents a single event. The $$10$$ min-TEG consists of a number of large components (with some overlap) which occur in the mid to late evening. There are also many smaller components occurring at the same time with distinctive IETs.

Fig. 9. Temporal component dependence on $$\Delta t$$. (a) The size of the largest temporal component in the $$\Delta t$$-TEG as a fraction of all events for the UCI network. (b) The corresponding distribution of temporal component sizes for $$\Delta t = 60$$, 3,600 and 86,400 s (corresponding to $$1$$ min, $$1$$ h and $$1$$ day). For $$\Delta t=60$$ the distribution of component sizes is well represented by a power-law distribution. For $$\Delta t=3,600$$ we see the onset of the giant component. For $$\Delta t=86,400$$ the largest component is over $$80\%$$ of all events and there are very few moderately sized components. The remaining components are all relatively small.

For the remainder of this section we fix $$\Delta t = 3,600$$ s ($$1$$ h). The value of $$\Delta t$$ in this case (as in previous work) is chosen arbitrarily, although as Fig. 9 confirms we are in a regime where the largest component is no larger than $$5\%$$ of the total number of events.
We first calculate the distribution of two-event motifs across the entire network and compare this to the average distribution over an ensemble of $$200$$ time-shuffled versions of the network$$^{7}$$ (Table 2). The distribution of motifs in the true network differs significantly from the random ensemble in most motifs; in particular, the ABBA motif is over-represented ($$z=569$$) and the ABBC motif is under-represented ($$z=-162$$).

Table 2 Motif distribution for the UCI temporal network and ensemble of shuffled networks. Standardized scores are given in brackets

| Network | ABAB | ABBA | ABAC | ABCA | ABBC | ABCB |
| --- | --- | --- | --- | --- | --- | --- |
| UCI | $$7.0 \times 10^{-2}$$ | $$0.14$$ | $$0.27$$ | $$0.15$$ | $$0.11$$ | $$0.25$$ |
| Shuffled | $$9.0 \times 10^{-3}$$ | $$7.6 \times 10^{-3}$$ | $$0.27$$ | $$0.22$$ | $$0.22$$ | $$0.26$$ |
| | (215) | (569) | ($$-$$3.93) | ($$-$$94.8) | ($$-$$162) | ($$-$$16.9) |

By considering the TEG structure (the temporal components in particular) we can decompose the motif counts into the components from which they originate (see Fig. 10). There are two observations to note.
First, the majority of the TEG components (red/grey dots) lie outside the bulk of the time-shuffled component distribution (shading, darker being higher density), which confirms our earlier observation that the difference in aggregate motif distributions is significant. This is somewhat unsurprising as we have removed any temporal correlations between events by shuffling. More surprising is that the diversity in the motif counts of each component is greater than in the randomized networks (a z-score of $$12$$ when considering the average nearest-neighbour distance). The largest components have distributions close to the average for the entire network; however, there are certain components where one motif is more greatly expressed than the others. For example, in the three components in the top right of Fig. 10 the ABAC motif makes up over $$95\%$$ of all observed motifs. This highlights that the temporal network is not simply made up of homogeneous groups of nodes and activity but instead consists of distinct heterogeneous components. The largest components can be decomposed by further reducing $$\Delta t$$, which may isolate more diverse behaviour.

Fig. 10. The motif distribution of the largest $$100$$ components of the $$1$$h-TEG, reduced to two dimensions using principal component analysis (red/grey dots). Behind, a kernel density estimate of the motif distribution from the largest $$100$$ components of $$200$$ time-shuffled networks (shading). Darker areas have higher probability. Here, we can see that the average motif distribution differs from the randomized networks, and also that there is a larger variance among the components compared to the randomized networks (see text).

Much like in Fig. 6(b), by considering the motif distribution at the component level we are able to understand the behaviour of nodes and groups of nodes much more clearly than by considering the aggregate motif distribution alone. This also has a major consequence should we attempt to model this network, or any temporal network in general. To faithfully model this network we need to incorporate the heterogeneity of behaviour across different nodes and also across time.

Another benefit of studying IETs and motifs in tandem is that we can consider the IET distribution conditioned on the motif formed between the two events. In Fig. 11(a) we see the IET distribution across the whole TEG (blue/black $$\square$$), compared to the same distribution over an ensemble of $$100$$ time-shuffled versions of the network (dashed). The time-shuffled CCDF is well modelled by a log-normal distribution; however, in the real network smaller IETs are over-represented and the log-normal fit is poor. By considering the conditional IET distributions, $$\Pr(t|m)$$, we see in Fig. 11(b) that on average the ABAB motif occurs much more quickly than the other five motifs, and all motif IETs are smaller than the random ensemble equivalent. This is most likely due to users of the network breaking up their messages and sending the same information over multiple messages. These conditional distributions help further our understanding of the generative mechanisms of the network, and also show that simple priority queue models of temporal networks [25], while useful, cannot capture the rich behaviour observed.

Fig. 11. The IET distributions for the UCI network. (Left) the CCDF for the IET distribution of the TEG (blue/black $$\square$$), that is the time between consecutive events for each node. The IET CCDF of an ensemble of $$100$$ time-shuffled versions of the temporal network is given by the dashed line. (Right) the CCDFs of the IET distributions, conditional on the motif formed. The fastest appearing motif on average is that of ABAB. The slowest appearing motif is the ABCB motif. The time-shuffled ensemble CCDFs for each motif are all roughly identical (not shown).

6. Conclusions

In this article, we introduced the TEG, a static representation of a temporal network. Furthermore, we showed that the TEG can uniquely define a temporal network up to a translation of disconnected components in time. In this sense, we are able to fully describe a temporal network in terms of IETs and two-event motifs. In Section 5, we showed that the TEG provides a natural decomposition of the temporal network and that by considering the IET conditioned on motif type we were able to uncover different timescales for behaviour that would not be visible when considering both properties independently. We also saw how the behaviour of individuals and collectives differed across temporal components, suggesting that temporal motifs should be considered on a component level, rather than in aggregate.
This also suggests new ways to model temporal networks. A possible method may be to model the temporal components, matching their size distributions and placement in time, and then to model the behaviour of each component individually. It is also worth noting that the TEG is not limited to simple event tuples $$(u,v,t)$$ but can be generalized in the same fashion as temporal motifs to include coloured events or nodes, for example to distinguish phone calls and SMS messages in communication networks. Provided a meaningful relationship between events exists, any such sequence of timestamped events can in fact be represented by a TEG.

The calculation of the TEG is also computationally efficient. Building the TEG from a temporal network can be done in time which scales linearly with the number of events in the network. This means that this type of analysis is well suited to large data sets such as those extracted from social or telecommunication networks. It also allows the TEG to be constructed in real time (provided data is received sequentially) and so provides a method to quickly assess behavioural changes in the network.

While many details of the TEG are yet to be explored, our study demonstrates how a temporal network can be represented as a static network of events and how it can be classified using event relationships. This finding provides an initial, but significant, step towards the systematic investigation of temporal networks and their generating mechanisms.

Acknowledgements

We thank Jonathan Ward for the careful review of this manuscript and Alastair Rucklidge, Mauro Mobilia, Mary Aprahamian and Charles Taylor for useful discussion and insightful offerings. We also thank the two anonymous referees for their input and suggestions.
Funding

Engineering and Physical Sciences Research Council CASE Studentship (Grant number EP/L50550X/1); Bloom Agency.8

Footnotes

Edited by: James Gleeson

1 For example $$f_{ij}(u_i) = \text{A}, f_{ij}(v_i) = \text{B}, \dots$$.
2 Here we assume that a node participates in only one event at the same time.
3 In the case where the network is visible, observing the network usually prompts a response that is directed towards the observed agents, subsequently connecting the two components. There may be cases where nodes in one component observe nodes in another and act upon that information without any interaction with the component. In these cases it is important to include the time stamp of each event in the TEG.
4 For consistency with the work of [13] we will consider only valid motifs and their corresponding IETs.
5 Data available at: http://snap.stanford.edu/data/CollegeMsg.html (accessed 22 September 2017).
6 Special users who broadcast messages to the entire network were removed.
7 Node pairs are kept the same; however, times are shuffled between events.
8 bloomagency.co.uk.

References

1. Kwak H., Lee C., Park H. & Moon S. (2010) What is Twitter, a social network or a news media? Proceedings of the 19th International Conference on World Wide Web. New York, NY, USA: ACM, pp. 591–600.
2. Sociopatterns. Available at http://www.sociopatterns.org (accessed 22 September 2017).
3. Scholtes I., Wider N., Pfitzner R., Garas A., Tessone C. J. & Schweitzer F. (2014) Causality-driven slow-down and speed-up of diffusion in non-Markovian temporal networks. Nat. Commun., 5, 5024.
4. Masuda N., Klemm K. & Eguíluz V. M. (2013) Temporal networks: slowing down diffusion by long lasting interactions. Phys. Rev. Lett., 111, 188701.
5. Luis Iribarren J. & Moro E. (2009) Impact of human activity patterns on the dynamics of information diffusion. Phys. Rev. Lett., 103, 038702.
6. Karsai M., Kivelä M., Pan R., Kaski K., Kertész J., Barabási A.-L. & Saramäki J. (2011) Small but slow world: how network topology and burstiness slow down spreading. Phys. Rev. E, 83, 025102.
7. Onaga T., Gleeson J. P. & Masuda N. (2017) Concurrency-induced transitions in epidemic dynamics on temporal networks. Phys. Rev. Lett., 119, 108301.
8. Mastrandrea R., Fournet J. & Barrat A. (2015) Contact patterns in a high school: a comparison between data collected using wearable sensors, contact diaries and friendship surveys. PLoS One, 10, e0136497.
9. Kovanen L., Kaski K., Kertész J. & Saramäki J. (2013) Temporal motifs reveal homophily, gender-specific patterns, and group talk in call sequences. Proc. Natl. Acad. Sci. USA, 110, 18070–18075.
10. Holme P. & Saramäki J. (2013) Temporal Networks. Berlin, Germany: Springer.
11. Holme P. (2015) Modern temporal network theory: a colloquium. Eur. Phys. J. B, 88, 234.
12. Masuda N. & Lambiotte R. (2016) A Guide to Temporal Networks, volume 4. London, UK: World Scientific.
13. Kovanen L., Karsai M., Kaski K., Kertész J. & Saramäki J. (2011) Temporal motifs in time-dependent networks. J. Stat. Mech. Theory Exp., 2011, P11005.
14. Moody J. (2002) The importance of relationship timing for diffusion. Social Forces, 81, 25–56.
15. Whitbeck J., Dias de Amorim M., Conan V. & Guillaume J. (2012) Temporal reachability graphs. Proceedings of the 18th Annual International Conference on Mobile Computing and Networking. New York, NY, USA: ACM, pp. 377–88.
16. Michail O. (2016) An introduction to temporal graphs: an algorithmic perspective. Internet Math., 12, 239–80.
17. Wehmuth K., Ziviani A. & Fleury E. (2015) A unifying model for representing time-varying graphs. IEEE International Conference on Data Science and Advanced Analytics (DSAA) (Gaussier E., Cao L., Gallinari P., Kwok J., Pasi G. & Zaiane O. eds), number 36678. Piscataway, NJ, USA: IEEE, pp. 1–10.
18. Riolo C. S., Koopman J. S. & Chick S. E. (2001) Methods and measures for the description of epidemiologic contact networks. J. Urban Health, 78, 446–57.
19. Rosvall M., Esquivel A. V., Lancichinetti A., West J. D. & Lambiotte R. (2014) Memory in network flows and its effects on spreading dynamics and community detection. Nat. Commun., 5.
20. Jurgens D. & Lu T.-C. (2012) Temporal motifs reveal the dynamics of editor interactions in Wikipedia. ICWSM. Palo Alto, CA, USA: AAAI.
21. Schneider C. M., Belik V., Couronné T., Smoreda Z. & González M. C. (2013) Unravelling daily human mobility motifs. J. R. Soc. Interface, 10, 20130246.
22. Xie W.-J., Li M.-X., Jiang Z.-Q. & Zhou W.-X. (2014) Triadic motifs in the dependence networks of virtual societies. Sci. Rep., 4.
23. Goh K.-I. & Barabási A.-L. (2008) Burstiness and memory in complex systems. Europhys. Lett., 81, 48002.
24. Lambiotte R., Tabourier L. & Delvenne J.-C. (2013) Burstiness and spreading on temporal networks. Eur. Phys. J. B, 86, 1–4.
25. Barabási A.-L. (2005) The origin of bursts and heavy tails in human dynamics. Nature, 435, 207–211.
26. Barrat A., Cattuto C., Tozzi A. E., Vanhems P. & Voirin N. (2014) Measuring contact patterns with wearable sensors: methods, data characteristics and applications to data-driven simulations of infectious diseases. Clin. Microbiol. Infect., 20, 10–16.
27. Nicosia V., Tang J., Musolesi M., Russo G., Mascolo C. & Latora V. (2012) Components in time-varying graphs. Chaos, 22, 023101.
28. Shen-Orr S. S., Milo R., Mangan S. & Alon U. (2002) Network motifs in the transcriptional regulation network of Escherichia coli. Nat. Genet., 31, 64–68.
29. Milo R., Shen-Orr S., Itzkovitz S., Kashtan N., Chklovskii D. & Alon U. (2002) Network motifs: simple building blocks of complex networks. Science, 298, 824–827.
30. Artzy-Randrup Y., Fleishman S. J., Ben-Tal N. & Stone L. (2004) Comment on ‘Network motifs: simple building blocks of complex networks’ and ‘Superfamilies of evolved and designed networks’. Science, 305, 1107.
31. Bajardi P., Barrat A., Natale F., Savini L. & Colizza V. (2011) Dynamical patterns of cattle trade movements. PLoS One, 6, e19869.
32. Jo H.-H., Karsai M., Kertész J. & Kaski K. (2012) Circadian pattern and burstiness in mobile phone communication. New J. Phys., 14, 013055.
33. Scellato S., Leontiadis I., Mascolo C., Basu P. & Zafer M. (2013) Evaluating temporal robustness of mobile networks. IEEE Trans. Mobile Comput., 12, 105–117.
34. Panzarasa P., Opsahl T. & Carley K. M. (2009) Patterns and dynamics of users’ behavior and interaction: network analysis of an online community. J. Amer. Soc. Inform. Sci. Technol., 60, 911–932.
35. Opsahl T. & Panzarasa P. (2009) Clustering in weighted networks. Soc. Netw., 31, 155–163.

© The authors 2017. Published by Oxford University Press. All rights reserved.
This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices)

Journal of Complex Networks – Oxford University Press

**Published: ** Aug 1, 2018
