From Large Deviations to Semidistances of Transport and Mixing: Coherence Analysis for Finite Lagrangian Data

From Large Deviations to Semidistances of Transport and Mixing: Coherence Analysis for Finite... J Nonlinear Sci https://doi.org/10.1007/s00332-018-9471-0 From Large Deviations to Semidistances of Transport and Mixing: Coherence Analysis for Finite Lagrangian Data 1 2 Péter Koltai · D. R. Michiel Renger Received: 8 September 2017 / Accepted: 19 May 2018 © The Author(s) 2018 Abstract One way to analyze complicated non-autonomous flows is through try- ing to understand their transport behavior. In a quantitative, set-oriented approach to transport and mixing, finite time coherent sets play an important role. These are time-parametrized families of sets with unlikely transport to and from their sur- roundings under small or vanishing random perturbations of the dynamics. Here we propose, as a measure of transport and mixing for purely advective (i.e., determin- istic) flows, (semi)distances that arise under vanishing perturbations in the sense of large deviations. Analogously, for given finite Lagrangian trajectory data we derive a discrete-time-and-space semidistance that comes from the “best” approximation of the randomly perturbed process conditioned on this limited information of the deter- ministic flow. It can be computed as shortest path in a graph with time-dependent weights. Furthermore, we argue that coherent sets are regions of maximal farness in terms of transport and mixing, and hence they occur as extremal regions on a spanning structure of the state space under this semidistance—in fact, under any distance mea- sure arising from the physical notion of transport. Based on this notion, we develop a tool to analyze the state space (or the finite trajectory data at hand) and identify coherent regions. We validate our approach on idealized prototypical examples and well-studied standard cases. Communicated by Oliver Junge. B Péter Koltai peter.koltai@fu-berlin.de D. R. Michiel Renger d.r.michiel.renger@wias-berlin.de Institute of Mathematics, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany Weierstraß-Institut, Mohrenstraße 39, 10117 Berlin, Germany 123 J Nonlinear Sci Keywords Large deviation · Coherent sets · Lagrangian data · Mixing . Mixing distance . Transport distance Mathematics Subject Classification Primary 37N10 · 60F10; Secondary 60J22 · 76F25 · 05C12 · 37M10 1 Introduction Transport in Dynamical Systems Instrumental to understanding the essential behav- ior of complicated non-autonomous flows is to grasp how transport is happening in them. This leads on a qualitative level to objects that prohibit transport, commonly named transport barriers; often originating from the geometric picture for autonomous systems and that trajectories are unable to cross co-dimension 1 invariant manifolds (Haller 2000, 2001; Haller and Beron-Vera 2012, 2013). For periodically forced sys- tems, invariant manifolds enclose regions called “lobes” that get transported across these periodically varying manifolds (MacKay et al. 1984; Rom-Kedar and Wiggins 1990). On a quantitative level, one searches for surfaces of small flux (Balasuriya et al. 2014; Karrasch 2016; Froyland and Koltai 2017), so-called partial barriers (Wigner 1937;Meiss 1992). Instead of characterizing regions that do not mix with one another via enclosing them by boundaries of low flux, there are approaches that aim to describe these sets directly. Such set-oriented concepts are strongly interwoven with the the- ory of transfer operators (Perron–Frobenius and Koopman operators) and comprise almost-invariant sets (Dellnitz and Junge 1999), ergodic partitions (Mezic´ and Wig- gins 1999) in autonomous, and coherent sets (Froyland et al. 2010; Froyland 2013)in the non-autonomous cases. Distinctive attention has been given to coherent sets, which are a (possibly time- dependent) family of sets having little or no exchange with their surrounding in terms of transport and are robust to small diffusion over a finite time of consideration (Froyland 2013, 2015). Natural examples include moving vortices in atmospheric (Rypina et al. 2007; Froyland et al. 2010), oceanographic (Treguier et al. 2003; Dellnitz et al. 2009; Froyland et al. 2015), and plasma flows (Padberg et al. 2007). In such applications, one would like to be able to find coherent sets even in the cases when a dynamical model that can be evaluated arbitrarily often is not available, only a finite set of Lagrangian trajectory data (passive tracers moving with flow with positions sampled at discrete time instances). This problem has received lot of attention in recent years, and a diverse collection of tools has been developed to tackle it (Budišic´ and Mezic´ 2012; Allshouse and Thiffeault 2012; Ser-Giacomi et al. 2015; Froyland and Padberg- Gehle 2015; Allshouse and Peacock 2015; Williams et al. 2015; Hadjighasem et al. 2016; Banisch and Koltai 2017; Schlueter-Kuck and Dabiri 2017; Padberg-Gehle and Schneide 2017; Rypina and Pratt 2017; Fabregat et al. 2016; Froyland and Junge 2017). While other current methods aim at collecting trajectories into coherent sets, in Banisch and Koltai (2017) it has been proposed to go one step further and analyze the connectivity structure of the state space under transport and mixing with “transport 123 J Nonlinear Sci coordinates” and the “skeleton of transport”. Very similar observations have been made earlier in Budišic´ and Mezic( ´ 2012) in the infinite-time limit for periodically forced systems. While coherent sets (and transport barriers) aim at partitioning the state space, the skeleton is aiming at “spanning” the state space with respect to transport. In this respect, coherent sets can be associated with distinct “extremal regions” of the skeleton. Here we will only use this idea of extremality, more precisely that coherent sets are “maximally far” from one another, as measured by transport. To this end, we will need to measure “farness” of dynamical trajectories. Several dynamical distance measures have been put forward already to measure the “distance” or “dissimilarity” of trajectories or initial states in dynamical systems (Mezic´ and Banaszuk 2004; Budišic´ and Mezic´ 2012; Froyland and Padberg-Gehle 2015; Hadjighasem et al. 2016; Fabregat et al. 2016; Karrasch and Keller 2016). The majority of them are shown to serve their purpose well in revealing coherent structures efficiently and reliably. However, they are either heuristic in the sense that they are not derived from the physical notion of transport and mixing, or no discretizations to finite scattered trajectory data have been developed. The purpose of this paper is thus twofold. On the one hand, we develop a dis- tance measure (a semidistance) between trajectories that is derived from the physical notion of transport and mixing subject to diffusion of vanishing strength, and we also derive a discretized distance measure for finite (also possibly sparse and incomplete) Lagrangian data that is consistent with its continuous counterpart in the limit of infinite data. On the other hand, we construct a tool to analyze with such distances the structure of the state space under transport, especially to find coherent sets. This tool makes use of the idea that coherent sets are some sort of extremal regions on a spanning structure with respect to transport, although in this work we will not investigate this “skeleton” in its entirety. Finite Time Coherent Sets Let us consider the ordinary differential equation (ODE) x˙ = v(t, x ) (1) t t on some bounded X ⊂ R and on a finite time interval [0, T ] for some T > 0. Throughout the paper, we will assume that v :[0, T]× X → R is a continuous velocity field tangential at the boundary, such that the flow of (1), denoted by φ [·], s,t 0 ≤ s, t ≤ T , is a diffeomorphism on appropriate subsets of X.For t < s we flow −1 backward in time: φ = φ . s,t t,s Many different notions to characterize coherent sets have been proposed in the liter- ature. Central to all of these notions is the idea that coherent sets should be robust under noise; without such a requirement, any non-intersecting characteristic of a singleton could be considered a coherent set. To this end, one typically perturbs the ODE (1)by a random noise (Denner et al. 2016; Froyland and Koltai 2017; Karrasch and Keller 2016), leading to the Itô stochastic differential equation (SDE) (ε) (ε) dx = v(t, x )dt + εdw , (2) t t We denote random variables by boldface symbols. 123 J Nonlinear Sci where {w } is a Wiener process (Brownian motion) with generator φ → φ, t t ∈[0,T ] reflecting boundaries, and starting from w = 0 (deterministically) and ε> 0is, at least for now, a given small constant. In fact, the rigorous mathematical formulation of an SDE with reflecting boundaries can be quite subtle, see Andres (2009). We ignore this issue as it does not affect our analysis. According to the definition of finite time coherent pairs (Froyland et al. 2010; Froyland 2013; Koltai et al. 2016), two sets A, B ⊂ X are coherent for times 0 and T if most mass from set A is likely to end up in set B, and most mass ending up in set B is likely to originate from set A, that is, (ε) (ε) (ε) (ε) P x ∈ B | x ∈ A ≈ 1, and P x ∈ A | x ∈ B ≈ 1. (3) T 0 0 T Naturally, for practical purposes one would need to choose how small ε and how large these probabilities should be. As the systems we are dealing with are often determin- istic by nature, and there is no “physically straightforward” choice of the diffusion strength ε, our first aim is to remove some of this indeterminacy by quantifying what it means for probabilities to be close to 1 for small ε, in terms of large deviations as we explain below. However, it turns out that the forward and backward conditions (3) are essentially equivalent in the large-deviation regime, and even worse, the large- deviation limits of (3) hardly give any quantitative information about how coherent two sets might be, as discussed in Appendix A. To conclude, the large deviations of conditions (3) do not yield sensible conditions for coherence. Large-Deviation-Based Semidistances In the current paper, we take a different approach. We study semidistances that quantify how unlikely it is for mass to flow from one point to another. These are semidistances in the sense that they satisfy all properties of a metric except for the triangle inequality. In the first part of this paper, in Sects. 2 and 3, we show how such semidistances can arise naturally from probabilistic arguments via large-deviation principles, as we explain below. In the second part, in Sects. 4 and 5, we discuss how (general) semidistances can be used to analyze coherent sets, and we apply the concepts of this paper to a number of examples. In Sect. 2, we derive two different semidistances from the large deviations of two (ε) probabilities. The first one is related to the probability that the endpoint x of the (ε) random path is φ [ y], given that it starts in x = x, for any two initial positions 0,T x , y ∈ X.As ε → 0, the process can no longer deviate from the deterministic flow of (1), and hence this probability will converge to 0 whenever x = y. In fact, it converges exponentially fast (Freidlin and Wentzell 1998), i.e., (ε) (ε) − μ (x → y) P x φ [ y]| x = x ∼ e (4) 0,T T 0 for some function μ (x → y) ≥ 0, where this statement and notation are made precise in Sect. 2.1. Such exponential convergence results are called large-deviation principles, A different way of factoring out diffusion to obtain coherent sets for deterministic flows appeared in the set-oriented transfer operator-based characterization in Froyland (2015); Froyland and Kwok (2016), leading to the notion of the dynamic Laplacian. See also our concluding remarks in Sect. 6.1. 123 J Nonlinear Sci cross Fig. 1 μ (x , y) is the cost to move from x to φ [ y] and 0,T from y to φ [x ] 0,T φ [y] 0,T φ [x] 0,T meet Fig. 2 μ (x , y) is the cost for two trajectories to meet φ [y] 0,T φ [x] 0,T and μ (x → y) is the large-deviation rate. The less probable it is to reach one point from another, the larger the rate between them. The first semidistance is then obtained via symmetrization: cross μ (x , y) := μ (x → y) + μ ( y→x ). (5) T T We call this the cross semidistance, since it arises from mass flowing from x to φ [ y] 0,T and mass flowing from y to φ [x ] simultaneously and independently, see Fig. 1. 0,T The second semidistance arises as the large deviations of the probability for two (ε) (ε) independent random trajectories x , y starting at x and y, respectively, to meet at or before time T (Fig. 2): 1 meet (ε) (ε) (ε) (ε) − μ (x , y) ε T P x y | x = x , y = y ∼ e , (6) T T 0 0 where the meeting semidistance is given by meet μ (x , y) := inf μ (x →z) + μ ( y→z). T T z∈ X cross meet By this procedure, we find two semidistances μ and μ that can be used as a T T measure of “farness” of points x , y, which will be low for points in the same coherent set, and high otherwise. Since both arise from large-deviation principles, they have a nice additional interpretation as a probabilistic cost or free energy that needs to be paid in order to deviate from the expected flows; such interpretation is common in statistical physics, see for example Onsager and Machlup (1953). 123 J Nonlinear Sci Nevertheless, we will see that in order to calculate these costs explicitly, the velocity field v needs to be known. As discussed above, this is in practice seldom the case; mostly one can only assume to have discrete-time snapshots of the positions of a limited number of floaters. With this in mind, we derive similar cost functions as above, that are based on such a finite data set only. This will be the content of Sect. 3. First, the dynamics is discretized in time and space by conditioning a usual time-stepping method for the SDE (2) on the event that the random continuous trajectories are to be found in the set of known floater positions at the K ∈ N given time instances. As above, we then cross meet derive two large-deviation semidistances ν (x , y) and ν (x , y) that have a clear K K probabilistic interpretation, that can be used to characterize coherent sets, and that are based on the finite data set rather than on the explicit velocity field. In fact, we will show that these discrete-space–time semidistances are really specific discretizations cross meet of the continuous-space–time semidistances μ and μ . As shown in Sect. 3.4, T T they can be computed as shortest path lengths in a time-dependent weighted graph. We give an algorithm to compute these shortest paths in Appendix B. Let us stress that these semidistances are defined for deterministic dynamical sys- tems. The random perturbation that is factored out by the large-deviation principle is merely acting as a catalyst to help quantify how strongly distinct trajectories mix—or, we should rather say how poorly, as the transport from one trajectory to another is inversely proportional to their semidistance. Coherence Analysis with Semidistances In Sect. 4, we describe how in general a semidistance on finite Lagrangian data can be used to analyze coherence. Key to our method is the notion of cornerstone: a point that is furthest away from all other points. Cornerstones are though of as “endpoints” of a spanning structure, and ideally each cornerstone is in some sense the center of a coherent set. As a next step, trajectories can be clustered around cornerstones to yield coherent sets. Of course, this approach is very close to the k-means- and fuzzy c-means clustering of trajectories with respect to dynamical distances in Hadjighasem et al. (2016); Froyland and Padberg-Gehle (2015), with the important difference that the centers are not chosen by the heuristics of these clustering approaches, but with regard to the properties of coherent sets in the light of transport and mixing. To exemplify the usefulness of the theory put forth in this paper, in Sects. 4 and 5 we test our approach on a number of standard test cases. Finally, Sect. 6 discusses possible combinations of this work with other concepts. 2 Large-Deviation Semidistances in Continuous Time and Space In this section, we study large deviations of the forms (4) and (6). In large-deviation theory, it is often easier to first study large deviations in a larger space. In our setting, we first study the large deviations of paths in Sect. 2.1 before transforming to the large deviations of the endpoints in Sect. 2.2. We end with a discussion of the resulting cross meet semidistances μ ,μ in Sect. 2.3. T T 123 J Nonlinear Sci 2.1 Large Deviations of Paths We denote paths by w to distinguish them from points w.Let P be the Wiener (·) measure, i.e., the probability that a Brownian path lies in a set U ⊂ C (0, T ; R ) is P[w ∈ U ]. Recall that there does not exist a canonical probability measure on the (·) space of paths, and so the Wiener measure cannot be identified with a meaningful density. This means that one always needs to consider sets rather than particular realizations of the Brownian path. Nevertheless, large-deviation rates are always local, in the sense that they depend on one realization only (the most likely one in the set U under consideration). This motivates writing w f if w lies in an infinitesimal (·) (·) (·) neighborhood U of the path f . We will make this more precise below. (·) The large deviations for the SDE (2) are a standard result by Freidlin and Wentzell (1998). This result can be derived via a combination of Schilder’s theorem and a contraction principle as we now explain. We first consider the noise part εw , which clearly converges (almost surely uni- formly) to the constant path 0 as ε → 0. The corresponding large-deviation principle is given by Schilder’s theorem (Dembo and Zeitouni 1987, Th. 5.2.3): −ε log P εw w −−→ |˙ w | dt, (7) (·) (·) t ε→0 for differentiable paths w starting from w = 0 (otherwise the limit will be ∞). (·) 0 Let us assume that the velocity field v(t, ·) is Lipschitz, so for each realization of (ε) the Brownian path w = w corresponds a unique solution x of the SDE, starting (·) (·) (·) (ε) from some given x = x, see (Øksendal 2003, Th. 5.2.1). The contraction principle (Dembo and Zeitouni 1987, Th. 4.2.1) then states that the large-deviation rate of a path x is given by the minimum of (7) over all realizations of the noise that give rise (·) to that path, i.e., (ε) −ε log P x x −−→ inf |˙ w | dt (·) t (·) ε→0 w :˙ x =v(t,x )+˙ w (·) t t t = |˙ x − v(t, x )| dt, (8) t t for differentiable paths x starting from x = x. (·) 0 2.2 Large Deviations of Endpoints We now derive the large-deviation principle of the type (4) as discussed in the Intro- duction. In a sense, the pathwise large deviations (8) encode more information than is (ε) needed if we are only interested in the endpoint x φ ( y) of the random path. 0,T 3 1 2 More rigorously, (7) means ε log P εw ∈ U − −− → − inf |˙ w | dt, where, for technical (·) w ∈U (·) 2 0 ε→0 reasons, this convergence is realized by a liminf lower bound for open sets U and limsup upper bound for closed sets U ; see Dembo and Zeitouni (1987). 123 J Nonlinear Sci Another application of the contraction principle then states that the large-deviation rate for the endpoint is the minimum of (8) over all paths starting from x and ending in that given endpoint φ [ y], that is: 0,T (ε) (ε) −ε log P x φ [ y]| x = x 0,T T 0 −−→ inf |˙ x − v(t, x )| dt =: μ (x → y). (9) t t T ε→0 x :x =x ,x =φ [ y] (·) 0 T 0,T This defines the “one-way” rate that we are after. cross The sum (5) then defines the cross-semidistance μ (x , y) and has a natural inter- pretation in terms of large deviations: As mentioned in the Introduction, it arises from (ε) (ε) two independent and simultaneous copies x , y . By independence, the probability t t (ε) (ε) (ε) (ε) that (x , y ) (φ [ y],φ [x ]) given (x , y ) = (x , y) is a product of one-way 0,T 0,T T T 0 0 probabilities, yielding the sum of two one-way rates in the large deviations, see Fig. 1. (ε) A similar argument can be used to derive the meeting large deviations (6). Let x (ε) and y be two independent solutions of the SDE (2), starting from given x and y, respectively. We consider the probability that both trajectories end in a given point, say φ [z] for some z ∈ R , see Fig. 2. Assuming independence of the two trajecto- 0,T ries, we immediately get (ε) (ε) (ε) (ε) − ε log P x φ [z], y φ [z]| x = x , y = y 0,T 0,T T T 0 0 (ε) (ε) (ε) (ε) =−ε log P x φ [z]| x = x − ε log P y φ [z]| y = y 0,T 0,T T 0 T 0 (9) −−→ μ (x →z) + μ ( y→z). T T ε→0 However, we are only interested in the probability that the two trajectories meet, and not in the point where they meet. A final contraction principle thus yields: (ε) (ε) (ε) (ε) −ε log P x y | x = x , y = y T T 0 0 meet −−→ inf μ x →z + μ y→z =: μ (x , y). (10) T T ε→0 z∈ X Observe that the two paths could also meet earlier and subsequently follow the same trajectory up until time T with zero cost; the time T thus acts as a maximum time at which the paths should meet. 2.3 The Semidistances We now discuss some metric properties of the rate functionals. Recall from Introduc- tion that we assumed that the flow is a diffeomorphism. Therefore μ (x → y) = 0if and only if x = y. It is then easy to see that, for any x , y, cross meet (i) μ (x , y) ≥ 0 and μ (x , y) ≥ 0, T T cross meet (ii) μ (x , y) = 0 ⇐⇒ x = y and μ (x , y) = 0 ⇐⇒ x = y, T T cross cross meet meet (iii) μ (x , y) = μ ( y, x ) and μ (x , y) = μ ( y, x ). T T T T 123 J Nonlinear Sci cross meet However, the triangle inequality can fail, and so μ and μ are semidistances T T only. We point out the following useful relation between the two. Observe that by meet the definition, μ (x , y) ≤ μ (x → y) + μ ( y, y) = μ (x → y), and similarly T T T meet μ (x , y) ≤ μ ( y→x ). Therefore, meet μ (x , y) ≤ min μ (x → y), μ ( y→x ) ≤ max μ (x → y), μ ( y→x ) T T T T cross ≤ μ (x , y). In order to investigate which semidistance is more suitable to study coherence, one cross meet would need to study in which setting the gap μ (x , y) − μ (x , y) becomes large. T T This is beyond the scope of this paper, but we will show for several examples that both work as they should. Remark 2.1 (Invariance under time reversal) Note the following invariance property of μ under time reversal: 2 ← − μ (x → y) = inf |˙ y + v(T − t, y )| dt =: μ φ [ y]→φ [x ] , T t t T 0,T 0,T y : y =φ [ y] 0 0,T (·) 0 −1 y =φ [φ [x ]] T 0,T 0,T ← − where μ is the one-way rate associated with the backward system y ˙ =−v(T −t, y ). T t t cross This time-reversal property is retained for the cross-semidistance: μ (x , y) = ← −cross meet μ φ [x ],φ [ y] . However, the meeting semidistance μ (x , y) = inf 0,T 0,T z T T ← − ← − μ z→φ [x ] + μ z→φ [ y] is the same as the cost for the backward tra- T 0,T T 0,T jectories to start in a joint position and end in φ [x ] and φ [ y]. 0,T 0,T 2.4 A Simple Example Let us consider a very simple example, where the domain of interest is the inter- val [0, L], and there is no dynamics, i.e., v ≡ 0. Its primary purpose is to form our intuition and expectations about how the semidistances work in more complicated settings. In particular, we shall see how the semidistances scale in time and system size. The system is considered on the time interval [0, T ]. One can then easily see that x˙ ≡ L/ T is an optimal path in (9), thus giving T 2 1 L L (0→L) = μ (L →0) = dt = , T T 2 T 2T 2 2 L L cross and so μ (0, L) = . Thus, also μ (0→L/2) = μ (L →L/2) = . In general, T T T T 8T the one-way cost is proportional to the squared distance and inversely proportional to time. This also gives meet μ (0, L) = , 4T 123 J Nonlinear Sci so, in this symmetric situation the meeting distance is half of one-way cost and quarter of the cross-semidistance. We will revisit this example in the next section and will realize that the behavior of the discrete semidistances deviates from the one observed here for continuous space and time. 3 Large-Deviation Semidistances in Discrete Time and Space cross meet As mentioned in the Introduction, the cost functions μ and μ are difficult T T to calculate explicitly, and impossible if the velocity or flow field is not explicitly known. In this section, we take a more practical approach. We will assume that the only information at hand is the position at finite times of a finite number I of floaters. (i ) To be more specific, let {x } ⊂ R be given positions of floaters i = k=0,...;K ,i =1,..., I 1,..., I at time kτ for k = 0,..., K for some τ> 0. Assuming that the floaters sample from the deterministic flow field φ , we know that for each floater i, s,t (i ) (i ) x = φ [x ] . (11) kτ,(k+1)τ k+1 k If we would add noise to the system, we would find random particles described by the set of SDEs (i,ε) (i,ε) (i ) (i,ε) (i ) dx = v(t, x )dt + εdw , x = x , for i = 1,..., I, (12) t t t 0 0 (i ) where w are now independent standard Brownian motions. Our strategy is to study the probability that random particles described by the SDEs (12) deviate from the given floater trajectories (11), conditional to the fact that all our knowledge about the otherwise unknown flow field φ comes from the time- s,t and space-discrete set of trajectory data (11). We first approximate the SDEs (12) by discrete-time, continuous-space Markov processes in Sect. 3.1, as it is done in standard time-stepping methods for SDEs (Kloeden and Platen 2010). Next, in Sect. 3.2 we condition these discrete-time processes on the given floater positions. Then we calculate the large-deviation rate for trajectories in Sect. 3.3, and for endpoints in Sect. 3.4. Finally, we end the section with a discussion of the metric properties of the resulting large-deviation rates in Sect. 3.5. 3.1 Discrete-Time Approximation We first focus our attention to one time step kτ → (k + 1)τ of one trajectory i, and temporarily drop the superindex for brevity. Since the noise process is a standard Brownian motion, we know its density, √ √ dP[ εw ∈ d y εw = x ] |x − y| (k+1)τ kτ −d/2 = (2πετ ) exp − . (13) d y 2ετ 123 J Nonlinear Sci Hence, we have exact information on the purely deterministic part of the SDE by (11) and on the purely noise part by (13). We combine this information by using the following time-stepping approximation for the SDE (2). + − Fix an α ∈[0, 1], and let (ξ ,ξ ) be independent normally distributed R - k=0,...,K k k valued random variables with unit variance. Given the approximated random position x ˜ at time kτ , we iterate + + x ˜ := x ˜ + ατ ε ξ k k − + x ˜ := φ [˜ x ] kτ,(k+1)τ (14) k+1 k − − x ˜ := x ˜ + (1 − α)τ ε ξ k+1 k+1 (k+1)τ + − Here, x ˜ and x ˜ are only auxiliary (intermediate) steps. The method (14) is a special k k+1 case of a splitting method, since the deterministic evolution and purely noise parts of the SDE (2) are handled separately in the distinct steps; here it would be “noise-flow- noise”. We would like to stress that our choice of discretization is made on the basis that we can use the available information on the flow given by (11). In the realm of one- step methods for SDEs we are bound to choices of the form (14), because there is no information on the drift other than the flow generated by it on prescribed time intervals [kτ, (k + 1)τ ). Given the form of time-stepping (14), the optimal [in the sense of highest weak consistency order Kloeden and Platen (2010)] approximation of the SDE is obtained by choosing α = 1/2. That is the so-called Strang-splitting (Strang 1968) and has weak order two, while for α = 1/2 we only get order one. Having performed discretization in time, in the next section we derive a discrete- time, discrete-space, α-dependent Markov chain that we will use to derive discrete semidistances. 3.2 Conditioning on Finite Data Recall that we considered one discrete-time process (x ˜ ) with initial condition k k=1,...,K (i ) x ˜ = x , and that we suppressed the dependency on i. For each k = 0,..., K,we introduce the set ( j ) A := {x } . k j =1,... I of available points at time t = kτ . We now condition the random process on the event that for each realization x ˜ ∈ A and for each intermediate point x ˜ ∈ A . k k k Formally, this can be seen by denoting the generators of the noise process and advection by A and B, respectively, and estimating the difference of the Markov propagators associated with (12)and (14)by performing formal Taylor expansions in τ with the non-commuting operators A and B, to obtain O(τ ), α = 1/2, τ( A+ B) (1−α)τ A τ B ατ A e − e e e = O(τ ), α = 1/2. 123 J Nonlinear Sci This automatically implies conditioning of the other intermediate points x ˜ ∈ A k+1 k+1 due to (11). We choose to condition on the intermediate points for practical reasons; otherwise, we would not be able to perform the second step in (14), since the discrete trajectories are our only information about the flow, cf. Remark 3.1 below. The conditioning on the finite data set results in replacing the discrete-time continuous-space process by a fully discrete-time discrete-space Markov chain that hops between the given trajectories. Therefore, the state of the new Markov chain can be represented by the labels j = 1,..., I ; this is particularly useful since the deterministic flow (11) will change the positions but not the labels. Since the resulting process is still Markovian, we can fully characterize its behavior through its transi- tion probabilities for one time-step k → k + 1. We now calculate these transition probabilities, dealing with each step in (14) separately. See Fig. 3 for a sketch. For the transition from x ˜ to x ˜ , where we know the increment distribution (13), note that we are in fact conditioning on a null set, so that the conditional probabilities are sensibly defined as the limits over balls B (·) of small radii r → 0 around these points. We thus obtain, for any j, = 1,... I : + + ( ) ( j ) + p ( j, ) := P x ˜ = x x ˜ = x and x ˜ ∈ A k k k k k k k ) ( j ) P x ˜ ∈ B (x ) x ˜ = x r k k k k = lim ( j ) r →0 P x ˜ ∈ B (A ) x ˜ = x r k k k k ) ( j ) exp − x − x /(2αετ ) k k =   , (15) I ( ) ( j ) exp − x − x  /(2αετ ) ˆ k k =1 and similarly for the transition from x ˜ to x ˜ : k+1 k+1 − (m) − ( p ( , m) := P x ˜ = x x ˜ = x and x ˜ ∈ A k+1 k+1 k+1 k+1 k+1 k+1 k+1 (m) ( exp − x − x / (2(1 − α)ετ ) k+1 k+1 = . (16) (m ˆ ) ( exp − x − x / (2(1 − α)ετ ) m ˆ =1 k+1 k+1 + − Since the transition from x ˜ to x ˜ is deterministic (middle equation in (14)), we k k+1 have that, (m) ( j ) + − P ( j, m) := P x ˜ = x x ˜ = x and x ˜ ∈ A , x ˜ ∈ A k k+1 k k k+1 k+1 k k k+1 + − = p ( j, ) p ( , m). (17) k k+1 =1 In words, the process performs the following three subsequent steps for one time step (see Fig. 3): 123 J Nonlinear Sci Fig. 3 One time step of the discrete-time discrete-space Markov chain i i = m k+1 p ( ) φ k+1 kτ,(k+1)τ p ( ) i = j A k+1 ( j ) ) (k,+) 1. Start in x , and perform a jump to some x with probability p , k k j, ) ( 2. Perform a deterministic jump from x to x , k k+1 ) (m) (k+1,−) 3. Perform a jump from x to x with probability p . k+1 k+1 ,m The transition probabilities P ( j, m) define our new, discrete-time Markov chain (i ) on the discrete space {1,..., I }. To shorten notation, we will write k k=0,...,K i := (i ) for a discrete path, analogous to the continuous-time setting. By the (·) k k=0,...,K Markov property, the probability that the Markov chain realizes such a path is simply K −1 P i = i = P (i , i ), (18) (·) (·) k k k+1 k=0 where we assumed that the chain starts (deterministically) from i . 3.3 Large Deviations of Discrete Trajectories We now study the large deviations of the discrete Markov chain i . Similarly to the continuous setting from Sect. 2, we start from the large deviations of paths. First we + − calculate the large deviations for p ( j, ) and p ( , m). By the Laplace princi- k k+1 ple (27), ⎛ ⎞ 2 2 I ( ) ( j ) ( ) ( j ) x − x  − x − x (15) k k k k ⎝ ⎠ −ε log p ( j, ) = ε log exp 2αετ =1 ) ( j )  ( ) ( j ) ) ( j ) x − x  − x − x |x − x | k k k k k k −−→ max = . 2ατ 2ατ ε→0 =1,..., I We will make this simplification again below. Similarly, we obtain 123 J Nonlinear Sci (m) ( |x − x | (16) − k+1 k+1 −ε log p ( , m) −−→ . k+1 2(1 − α)τ ε→0 Using these two exponential approximations, we can again use the Laplace princi- ple (27) to find for the jump probability of one time step: (17) + − lim −ε log P ( j, m) = lim −ε log p ( j, ) p ( , m) k k+1 ε→0 ε→0 =1 + − = min lim −ε log p ( j, ) − ε log p ( , m) k k+1 =1,..., I ε→0 ) ( j ) (m) ( 2 2 |x − x | |x − x | k k k+1 k+1 = min + . 2ατ 2(1 − α)τ =1,...,m Finally, the large-deviation rate of a discrete path is K −1 (18) −ε log P i = i =−ε log P (i , i ) (·) (·) k k k+1 k=0 K −1 (i ) (i ) ( ) k+1 k 2 2 |x − x | |x − x | k+1 k+1 k k −−→ min + := J (i ), (·) 2ατ 2(1 − α)τ ε→0 =1,..., I k=0 (19) Remark 3.1 Recall that we conditioned on the event that all x ˜ as well as the interme- diate points x ˜ lie in the set A of available points. One might argue that in practice only the points x ˜ are measured to lie in A , while the other two are mathematical k k constructs that may lie anywhere. However, if we would relax this conditioning and follow the calculations as above, we would find: (m) ( j ) x − φ [x ] |x − x | t ,t k+1 k k+1 −ε log P ( j, m) −−→ min + d 2ατ 2(1 − α)τ ε→0 x ∈R (m ˆ ) x − φ [x ] t ,t k k+1 k+1 − min . 2(1 − α)τ m ˆ =1,..., I Since this large-deviation rate still depends on the unknown flow field φ, it cannot be used if only the data of a finite number of floaters is available. Remark 3.2 (Missing data and non-uniform time-sampling) Note that the construction works exactly as described above even if information about trajectories is partially missing. The conditioning on the set A works identically, but now these sets might have different cardinalities smaller or equal I . Observe that our only information about the deterministic flow for times in [kτ, (k + 1)τ ) comes from those trajectories that are available both in A and A . If this intersection k k+1 is empty, we need to skip that time slice completely. This is not a problem, since our choice of sampling time uniformly by the step size τ was solely in order to ease 123 J Nonlinear Sci presentation. As the reader has probably observed, the extension for varying time steps τ is straightforward. 3.4 Large Deviations of Endpoints Analogously to the continuous setting, we study the large deviations of the one-way probability to hop from i to j in discrete time K , and the meeting probability that two independent chains, starting from i and j, respectively, meet by discrete time K or earlier. Since the paths (19) encode more information than the endpoints, we can now easily derive the large deviations of the one-way probability by a contraction principle. Indeed, for any two indices i, j = 1,..., I , −ε log P[i = j | i = i ] −−→ min J (i ) =: ν (i → j ), (20) K 0 (·) K ε→0 i : i =i,i = j (·) 0 K where J is the discrete-path large-deviation rate (19). Note that J is the shortest path length in a graph with time-dependent edge weights ( j ) ) (i ) ( 2 2 |x − x | |x − x | k k k+1 k+1 w (i, j ) = min + . 2ατ 2(1 − α)τ =1,..., I cross Again, the sum ν (i, j ) := ν (i → j ) + ν ( j →i ) can be given an interpretation K K in terms of large deviations as in Sect. 2.2. Moreover, following the same argument as in (10), if we take two independent trajectories i and j , then (·) (·) meet −ε log P[i = j | i = i, j = j]−−→ min ν (i → ) + ν ( j → ) =: ν (i, j ). K K 0 0 K K ε→0 =1,..., I 3.5 The Semidistances It is easily checked that in the discrete setting the properties of a semidistance are also satisfied: cross meet (i) ν (i, j ) ≥ 0 and ν (i, j ) ≥ 0, K K cross meet (ii) ν (i, j ) = 0 ⇐⇒ i = j and ν (i, j ) = 0 ⇐⇒ i = j, K K cross cross meet meet (iii) ν (i, j ) = ν ( j, i ) and ν (i, j ) = ν ( j, i ). K K K K Furthermore, the triangle inequality fails, but we again have the following estimate: meet cross ν (i, j ) ≤ min ν (i → j ), ν ( j →i ) ≤ max ν (i → j ), ν ( j →i ) ≤ ν (i, j ). K K K K K K Both semidistances can be computed from shortest path costs, where the cost of a path is given by (19). We stress that this expression is fairly simple and depends on the flow field through the known positions of the floaters x only. Because of this: (1) these costs can be used in practice if the velocity field is unknown (Sects. 4 and 5); (2) these costs can even be applied to cases where there may not be an underlying velocity field, as for example in discrete-time dynamical system (Sect. 4.1). 123 J Nonlinear Sci Fig. 4 Hopping back and forth (solid line) between two given trajectories (dashed lines) (j) (i) These semidistances can be computed by first computing the one-way rates ν (i → j ) using Algorithm 1, see Appendix B. From these rates, one readily obtains the semidistances via cross meet ν (i, j ) = ν (i → j ) + ν ( j → i ) and ν (i, j ) K K K K = min ν (i → ) + ν ( j → ). K K =1,..., I Remark 3.3 (Time reversal for discrete semidistances) Similarly to Remark 2.1,the ← − one-way cost satisfies the time-reversal property ν (i → j ) = ν ( j → i ), provided K K ← − α = 1/2, where ν is the cost associated with the backward dynamics. Moreover, this time-reversal property also holds for the cross-semidistance, whereas for the meeting ← − ← − meet semidistance ν (i, j ) = min ν ( → i ) + ν ( → j ). Apart from the =1,..., I K K superior consistency order discussed in Sect. 3.1, the invariance of semidistances under time reversal is another reason for choosing α = 1/2. Remark 3.4 Other large-deviation-based semidistances are also possible. If one con- siders the “noise-flow” (i.e., α = 1) time-stepping scheme for the SDE rather than “noise-flow-noise”, expression (19) simplifies a bit. As another example of a large- (i ) ( j ) deviation-based semidistance between two given discrete paths {x , x } , one k=0,...K k k could consider the probability to hop back and forth between the two trajectories, see Fig. 4. In that case, we find in the large-deviation scaling for α = 1: K −1 (i ) ( j ) |x − x | k k −ε log P[i = j , i = i , ··· | i = i ] −−→ , (21) 1 1 2 2 0 0 ε→0 2τ k=0 for α = 0 the sum would go from k = 1to K . Naturally, this is simply the L -distance between two trajectories, as considered earlier in Froyland and Padberg-Gehle (2015). Although this construction is very easy to calculate and its square root is a genuine metric, it is less interpretable as a cost for transport and mixing. meet cross Remark 3.5 It should be noted that the semidistances ν ,ν scale quadratically K K in space; this becomes even more apparent in the example considered in Sect. 3.7. In the case of the L -distance (21), the cost becomes a genuine distance after taking meet cross the square root. However, if we take the square roots of ν and ν , the triangle K K 123 J Nonlinear Sci inequality still fails. We therefore stick to the quadratic scaling as this has the most direct interpretation as large-deviation costs. Remark 3.6 (Eulerian transport vs Lagrangian mixing) When speaking of transport in this paper, we mean “transport (of probability) from a trajectory to another”, to express how the dynamics is mixing up regions these two trajectories come in contact with. This can be seen as a Lagrangian perspective. We express with large-deviation rates the unlikeliness of transitions between trajectories, and these are then computed as shortest paths, cf. Sect. 3.4. Deceivingly similar mathematical constructions show up in Ser-Giacomi et al. (2015), where the authors consider “highly probable paths” of non-homogeneous Markov chains, which also leads to a time-dependent shortest path problem. Note, however, that this is orthogonal to our concept, as this is quantifying likeliness. A further important distinction is that their Markov chain is constructed in an Eulerian manner (opposed to our Lagrangian setting), meaning that it describes transport between fixed regions of state space; serving as a discretization of the flow field (Froyland et al. May 2007, 2010). 3.6 Discretization of the Continuous Semidistances We now show that the one-way discrete-space–time cost ν can also be obtained by discretizing the continuous-space–time cost μ . This means that discretization and derivation of the large-deviation principle are interchangeable operations (if done the right way). We will not be precise about the discretization error; of course one needs to assume that the number of floaters is sufficiently large. K −1 We first divide the time interval into subintervals [0, T ) = [kτ, (k + α)τ ) ∪ k=0 [(k + α)τ, (k + 1)τ ). Recall that φ is the flow associated with v(t, ·), that is, for t ,t any t , x, ∂ φ [x]= v t,φ [x ] . t t ,t t ,t 0 0 Note in what follows that x is some path, not necessarily a trajectory of the flow. In (·) each interval [kτ, (k + α)τ ), we approximate by finite differences: x − x x − φ [x ] (k+α)τ kτ (k+α)τ (k+α)τ,kτ (k+α)τ x˙ ≈ and v(t, x ) ≈ . t t ατ ατ In each interval [(k + α)τ, (k + 1)τ ), we approximate: x − x φ [x ]− x (k+1)τ (k+α)τ (k+α)τ,(k+1)τ (k+α)τ (k+α)τ x˙ ≈ and v(t, x ) ≈ . t t (1 − α)τ (1 − α)τ Because of the assumption that the flow is one-to-one, we can always write x = (k+α)τ φ [ˆ x ] for some xˆ . We thus obtain: kτ,(k+α)τ k k K −1 T 2 2 |x −ˆ x | |x − φ [ˆ x ]| 1 2 kτ k (k+1)τ kτ,(k+1)τ k x˙ − v(t, x ) dt ≈ + . t t 2ατ 2(1 − α)τ k=0 123 J Nonlinear Sci (i ) (i ) Since the number of floaters {x } is large, we can find an x close k=0,...,K ;i =1,..., I k k to xˆ ,giving (i ) ( j ) (i ) ( j ) μ (x → x ) ≈ inf |˙ x + v(t, x )| dt : x = x , x = x , x ∈ A , T t t 0 T kτ k 0 K 0 K x := φ [x ]∈ A (k+α)τ,kτ (k+α)τ ) k (i ) (i ) K −1 ( ) k+1 ( k 2 2 |x − x | |x − x | k k k+1 k+1 ≈ min min + i :i =i,i = j =1,..., I 2ατ 2(1 − α)τ (·) 0 K k=0 = ν (i → j ). This shows that we can either derive the large-deviation rate function in continuous space and discretize this to finite trajectories (as done here), or we can restrict the continuous dynamics to finite trajectory data and derive a large-deviation rate function for that (as done above); we obtain consistent results whichever route we take. 3.7 The Simple Example Revisited Let us now demonstrate how the results of this section apply to the example of Sect. 2.4. Discrete Time and Continuous Space Let us first suppose we are given infinitely many “trajectories” of the system, one starting at each point x ∈[0, L], and they are sampled at discrete time points kτ , k = 0, 1,..., K , with τ = . From Sect. 3.3 with α = 1/2 we obtain, by writing x = , that 2 2 2 (x ) (L/K ) L 1 1 ν (0→L) = = K · = , 2 2 τ T /K 2T k=1 where we used that the optimal discrete path in (20) is the one making jumps of equal lengths x. Note that the rate function is identical to that in the fully continuous case. Analogously, ν (0→L/2) = ν (L/2→L) = , and generally, if |x − y|= K K 8T δ, then ν (x → y) = . The derived semidistances scale similarly. Note that the 2T semidistances converge to zero as T →∞. Discrete Time and Space If we are given a finite number I of equispaced trajectories of this system sampled at the same times as in the previous paragraph, the virtual random walker cannot make arbitrarily small jumps as in the continuous state case, thus 2 2 (x ) L 1 k ν (0→L) = ≈ , if K ≤ I , τ 2T k=1 123 J Nonlinear Sci −1 since we can take x ≈ L/K with error O( I ) as I grows. However, if K > I , the smallest jumps are x = , thus 2 2 (L/ I ) L K ν (0→L) = I · = . τ 2IT Thus, if the observation time of trajectories grows and they are still observed at the same rate (i.e., τ stays constant), the semidistances saturate at and do not converge I τ to zero as in the continuous time case. Moreover, to reach y = L/2 from x = 0, we still cannot make smaller jumps than x = , but now we only require only I /2 I (L/ I ) L of them, such that we obtain ν (0→L/2) = ν (L/2→L) = · = (for K K 2 τ 4 I τ even I , and vanishing error for odd I as I grows). The main lesson is that while in the continuous space case halving the Euclidean distance makes the semidistance scale by , if the spatial resolution of trajectories is coarse, the discrete semidistance scales only by . In general, if |x − y|= δ, then on a coarse resolution grid it takes about jumps to travel between these two points, δ (x ) x and we obtain ν (x → y) ≈ · = δ · . Note that x and τ are constant x τ τ quantities, and thus the one-way discrete cost scales linearly in the Euclidean distance between the two points, as opposed to quadratic scaling in the continuous space case. 4 Coherence Analysis with Semidistances Let us assume that we are given a set of discrete time and space trajectories ( j ) {x } , and a (semi)distance d. We now describe how such semidis- j =1,..., I,k=0,...,K tances can be used to distinguish and analyze coherent sets from the finite data. We shall work with an unspecified semidistance d, but of course the semidistances that we cross meet have in mind are ν and ν that we derived in the previous section. Other—not K K large-deviation based—distance measures could be used just as well, as we discuss below. Nevertheless, the semidistances should not be completely arbitrary; we assume meet cross that they share the behavior of ν and ν that we discuss in Sects. 4.1 and 4.2. K K To illustrate the ideas, we first analyze the behavior of two one-dimensional proto- typical examples. These examples show the difference between two types of regions: “mixing” and “static” (also known as “regular”). In these one-dimensional and simple examples, one can easily determine the regions and whether they are mixing or static from the semidistances. One example has two invariant sets under the dynamics, which is (measure-theoretically and topologically) mixing on both of them. The other has two static regions, where the mutual physical distance of trajectories does not change under the dynamics, and these regions are separated by a third, mixing region. After this we proceed with a more involved model: a two-dimensional periodically forced double-gyre flow, where the boundaries of the separate regions are no longer as clear-cut as in the one-dimensional example. Nevertheless, we will show that one can identify the separate regions via the tools that we present next. To this end, we introduce the notion of cornerstones, representing possible coherent sets or mixing regions, and then discuss how to find them and when to stop searching 123 J Nonlinear Sci φ(x) φ(x) x x Fig. 5 Left: the time-discrete flow map with two invariant mixing subdomains. Right: the time-discrete flow map with two static regions and an invariant mixing subdomains between them for them. Finally, to obtain coherent sets, we assign the trajectories to cornerstones. The notion of fuzzy affiliations will be used to express the uncertainty whether a trajectory close to the boundary of a set belongs to it or not. Note that in the case of finite data such an uncertainty is always present. 4.1 Two Illustrative Model Cases Two Invariant, Mixing Subdomains As mentioned in Sect. 3.5, we may also apply the techniques developed in this paper to a discrete-time dynamical system. To gain some intuition for the behavior of the semidistances at mixing regions, we consider the discrete-time system on the unit interval X =[0, 1] and one-step flow map, see Fig. 5 (left), 1 1 4x mod , x < 2 2 φ(x ) = 1 1 1 1 4(x − ) mod + , x ≥ . 2 2 2 2 1 1 −1 The sets X =[0, ], X = ( , 1] are invariant, i.e., φ ( X ) = X 1 2 1 1 2 2 −1 and φ ( X ) = X , and φ is simply the circle-quadrupling map on each of these 2 2 sets, i.e., it is mixing on the single components. Consequently, t t lim inf φ (x ) − φ ( y) =0(22) t ∈N, t →∞ for (Lebesgue-)almost every pair x , y ∈ X , i = 1, 2. If a system ( X,ψ,μ) is mixing, then ( X × X,ψ ×ψ, μ×μ) is ergodic (Walters 2000, Theorem 1.24) . Thus, for μ × μ-almost every pair (x , y), the trajectory (ψ × ψ) (x , y) will enter every set A of nonzero measure for some t ≥ 0. This shows (22) by taking ψ = φ| or ψ = φ| ,and A ={(x , y) ||x − y| <ε} (0,1/2) (1/2,1) for any fixed ε> 0. 123 J Nonlinear Sci 0.8 0.6 0.4 0.2 0 20 40 60 80 100 120 140 160 180 200 Fig. 6 Two trajectories of the map φ of length 200 steps, starting in X and X , respectively. Theory shows 1 2 that they come arbitrary close, eventually. Here they get the closest at time step 193, shown by a circle Thus, ν (i → j ) → 0as K →∞, because if x , x are both in X or both in X , K i j 1 2 then (22) shows that their trajectories get arbitrarily close eventually. If the trajectories start in different halves of [0, 1], then t 1 t 1 lim inf φ (x ) − + φ (x ) − = 0 , (23) i j 2 2 t ∈N, t →∞ thus the jump from one trajectory to another gets arbitrarily cheap. See Fig. 6. The transport semidistances between any two points within the same region are very small, at least if the time window is large enough. This behavior is typical for mixing regions. In fact, since the two mixing regions are only separated by one point, it is relatively cheap to move from one region to the other, and so the semidistances between two points in separate regions converge with increasing time to zero. Nevertheless, the semidistances still detect a difference between the two invariant sets: The semidistance between two trajectories in the same invariant component goes in general quicker to zero than the one between two from different components, as shown in Fig. 7 for I = 100 initially equispaced trajectories. Thus, it is the relative difference between the semidistances that is relevant for the transport structure of the state space, and not the absolute values. Two Static Regions Divided by a Mixing One To gain some intuition about static regions, let us now consider the discrete-time system on X =[0, 1] given by 1 3 x , x ∈[0, ) ∪ ( , 1] 4 4 φ(x ) = 1 1 1 1 3 2(x − ) mod + , x ∈[ , ] , 4 2 4 4 4 see Fig. 5 (right). This map has three invariant sets. The left and right ones are static, such that the mapping restricted to them is the identity, and are meant to model regions 6 t Applying Footnote 5 to the case where ψ = φ| , we obtain that (φ| × φ| ) (x , y) enters (0,1/2) (0,1/2) (0,1/2) 1 1 A ={(x , y) ||x −1/2|+| y| <ε} eventually. Noting that φ| = φ| (·− )+ , the claim follows. (1/2,1) (0,1/2) 2 2 Comparing the different slopes in Fig. 7, based on the reasoning in Footnote 5 and here we conjecture that the minimal distance of two trajectories decays faster in the case when they both start in the same invariant set, because the set {(x , y) ||x − y| <ε} is larger in measure than {(x , y) ||x − 1/2|+| y| <ε}. 123 J Nonlinear Sci -2 -2 10 10 -3 -3 10 10 -4 -4 10 10 -5 -5 10 10 -6 -6 10 10 -7 -7 10 10 0 1020304050 0 1020304050 K K cross meet Fig. 7 Semidistances ν (i → j ) (left) and ν (i, j ) (right) for increasing maximal time K,averaged K K over x , x ∈ X (downward-pointing triangles), x , x ∈ X (upward-pointing triangles), and x , ∈ i j 1 i j 2 i X , x ∈ X (circles), respectively. Note that the decrease in the distance is much slower for trajectories 1 j 2 taken from different invariant sets Fig. 8 One-way cost ν (i → j ) for i = 1 (left) and i = 51 (right) for the map with two static and one mixing region of the state space in complicated flows that are “static” in the sense that the mutual dis- tance of points is not changed (or just barely) by the dynamics. We will consider these as one kind of prototype for coherent sets. The third region is mixing and physically separates the other two. We take I = 100 initially equispaced trajectories and compute the one-way costs ν (i →·) with K = 50 for i = 1 and i = 51, respectively, shown in Fig. 8. From our analysis in Sect. 3.7, we would have expected to see quadratic growth of the one-way cost with respect to physical distance in the static regions, but we only observe linear growth. This is due to the finite number of considered trajectories, as also explained in the second paragraph of Sect. 3.7. All points of the mixing region have almost the same cost from any one point in the static regions, and approximately zero cost from one another. To obtain the cost between two points of different static regions, one has to consider the cost to go to the boundary of the static and mixing regions (linear cost in Euclidean distance), travel on a trajectory from there to the boundary of the other static region (at zero cost), and then go from there to the desired 123 J Nonlinear Sci Fig. 9 Sketch of the velocity field of the periodically forced double-gyre flow at two different times. The horizontal axis is x , and the vertical is x 1 2 point. (Again, linear cost in Euclidean distance that needs to be covered.) Thus, the cost (and semidistance) between these two points is the sum of their one-way cost (and semidistance) to the mixing region, provided the time of consideration is sufficiently large for the mixing to take place. Since our fictive random walker uses trajectories of the mixing region to travel from one static region to the other, we will also call it transition region henceforth. To conclude, from Fig. 8 we can easily identify three separate regions, and from the steepness of the slopes (linear/quadratic or flat), we can determine wether a region is static or mixing. As the next example shows, this distinction is usually not as clear as in these constructed examples, but the main ideas will be based on this observation. 4.2 The Periodically Forced Double Gyre Let us now consider the non-autonomous system x˙ = v(t, x ) on X =[0, 2]×[0, 1] t t with (Froyland and Padberg-Gehle 2014) −π A sin (π f (t, x )) cos(π x ) 1 2 v(t, x ) := , (24) d f π A cos (π f (t, x )) sin(π x ) (t, x ) 1 2 1 dz where f (t, z) = β sin(ωt )z + (1 − 2β sin(ωt ))z. We fix the parameter values A = 0.25, β = 0.25 and ω = 2π; hence, the vector field has time period 1. The system preserves the Lebesgue measure on X. Equation (24) describes two counter-rotating gyres next to each other (the left one rotates clockwise), with the vertical boundary between the gyres oscillating periodically, see Fig. 9. We choose a uniform 50 × 25 grid as initial conditions for the floaters at time t = 0; i.e., I = 1250. We sample the trajectories of these floaters at times t = kτ , k = 0, 1,..., K , where K = 100 and τ = 0.2. That means, the length of trajectories in consideration is 20 times the period of the forcing. Employing our large-deviation-based distance computations on this data set using Algorithm 1 and α = 1/2, we get the one-way costs ν (i → j ), i, j = 1,..., I,from cross meet which we compute ν (i, j ) and ν (i, j ). K K Note that for this argument, ergodicity of the dynamics in the “mixing” region would be sufficient, since one only needs to travel “from one static region to the other”. The crucial additional property we get from mixingness is that the mutual semidistances of points in this region go to zero. 123 J Nonlinear Sci -1 0.045 0.04 0.035 0.03 0.025 -2 0.02 0.015 0.01 0.005 -3 0 10 0 1 2 3 4 0 200 400 600 800 1000 1200 10 10 10 10 10 meet Fig. 10 Left: ν (c , ·) sorted in ascending order. Right: the same as left, on a log-log scale. The horizontal axis shows the rank, and the vertical shows the semidistance value As a first simple analysis, we can order the points by their semidistances to the center of one gyre, see Fig. 10. Here and in the following, the rates and semidistances will be always given in units 1/τ . On a log-log scale, the slope 1/2 (square-root-type behav- ior) indicates that most trajectories in the gyre are approximately concentric circular regions around the center. Since the semidistances grow linearly in the Euclidean distance inside the gyre, we see that we are in the low-resolution regime discussed in Sect. 3.7. Analogously to Fig. 8, we can again (vaguely) distinguish three regions: a steep (square-root-type) region, a flat region, and another steep (flipped square-root- type) region. As before, the flat region is typically strongly mixing, and the steep regions are static. We shall make this distinction more precise in the next sections. Remark 4.1 (Three-dimensional flows) Clearly, the scaling behavior shown in Figs. 8 and 10is dimension-dependent. For a three-dimensional static region, one would see a slope 1/3 on a log–log scale. The question remains, how does a typical coherent set behave there; can it be modeled by a static region? If an incompressible flow rotates uniformly in a plane, it necessarily has a constant shifting motion in the perpendicular axial direction, leading to cylindrical vortices. If the cylindrical rings of a vortex rotate at different angular frequencies, the flow speed along axial directions is non-uniform, and mixing-type behavior occurs in the vortex (Halász et al. 2007). We leave the analysis of such systems to future work and proceed with analyzing different aspects of prototypical two-dimensional flows here. 4.3 Cornerstones To start the analysis of the state space under a semidistance d, we randomly choose a trajectory, represented by a label c ∈{1,..., I }, and compute the trajectory furthest from it, i.e, we set 8 2 As on a regular grid there are O(δ ) points not further than Euclidean distance δ from a reference point, 1/2 the r-th closest point to the reference point has distance O(r ). 123 J Nonlinear Sci 0.045 0.04 c , c 0.035 1 2 c , c , c 1 2 3 0.03 0.025 0.02 0.015 0.01 0.005 0 200 400 600 800 1000 1200 Fig. 11 The objective functions of the maximization problem in (25), sorted in ascending order (yellow meet and purple). Blue and red: ν (c , ·), i = 1, 2 (Color figure online) c = arg max d(i, c ). 1 0 i =1,..., I To find a set of points that “spans” the state space, we identify successively further trajectories that are far away from all the other already identified “corner- stones” {c } ,asinRüdrichetal. (2017): q q=1,..., Q c = arg max min d(i, c ). (25) Q+1 q q=1,..., Q i =1,..., I Observe that in this optimization problem we ignore the first, randomly chosen tra- jectory c ; hence, the set of cornerstones {c } will be less dependent on this 0 q q=1,..., Q randomness. Moreover, even if the first trajectory c would represent a coherent set, the algorithm will eventually provide a new cornerstone in that set, which lies closer to the semidistance center of that set. For the double gyre and the meeting distance, we identified three cornerstones. The objective function of the maximization problem (25) is plotted in Fig. 11;this yields a similar but more detailed picture as Fig. 10. Note that the chaotic, well-mixed transition region appears as flat region in these distance graphs, and the gyres appear as steep regions toward the maxima of the respective graphs. That the chaotic region is well mixed, and has no stratification (invariant rings as the gyres), can be seen from its flat behavior toward its maximum. The forth cornerstone is part of a gyre, and it starts to stratify it. Nevertheless, its distance to the other corners is much smaller. To get a first glimpse of the separate regions in the state space, we have plotted the semidistances from each cornerstone in Fig. 12, both at the initial and final times. Note that since we work with trajectory labels rather than physical positions, the semidistances are invariant in time, whereas the physical positions of the floaters change over time. From these figures, one can approximately identify the two static 123 J Nonlinear Sci 1 1 0.04 0.04 0.8 0.8 0.03 0.03 0.6 0.6 0.02 0.02 0.4 0.4 0.01 0.01 0.2 0.2 0 0 0 0 0 0.5 1 1.5 2 0 0.5 1 1.5 2 1 1 0.04 0.04 0.8 0.8 0.03 0.03 0.6 0.6 0.02 0.02 0.4 0.4 0.01 0.01 0.2 0.2 0 0 0 0 0 0.5 1 1.5 2 0 0.5 1 1.5 2 1 1 0.8 0.8 0.015 0.015 0.6 0.6 0.01 0.01 0.4 0.4 0.005 0.005 0.2 0.2 0 0 0 0 0 0.5 1 1.5 2 0 0.5 1 1.5 2 meet Fig. 12 Distances ν from trajectory c (top), c (middle) and c (bottom), marked by the magenta 1 2 3 circle, at initial (left) and final times (right). The semidistances are given in units 1/τ . The horizontal axis is x , and the vertical is x (Color figure online) 1 2 (gyre) regions, being very close and very far from c and c , respectively, and the 1 2 chaotic transition region in between, having approximately constant distance from c and c , cf. (Froyland and Padberg-Gehle 2014, Figure 1) . 4.4 Number of Cornerstones How to determine the number of cornerstones that should be used? Is there an optimal number, or is it up to our liking? In the case of the double gyre, as noted above, a fourth cornerstone would be part of one of the gyres, and assigning affiliations would thus split one gyre into two sets. If the gyres would consist of a continuum of periodic orbits, then we could proceed and split them this way into as many rings as we like. The same situation in an idealized framework appears in Sect. 4.1 for the static regions: since they are static, arbitrary subsets are perfectly coherent (even invariant in this case). A good place to stop searching for further cornerstones would be when they would start to subdivide “maximal coherent” sets, as the gyres in the double-gyre example, or the static sets in the second, and the invariant sets in the first example of Sect. 4.1. To this end we make an idealized assumption: Coherent sets appear as for the second example in Sect. 4.1, i.e., multiple static regions divided by one mixing region. Here, “static” is meant in the sense that the mutual distances between points in the set barely change. Such an assumption was also utilized in (Hadjighasem et al. 2016). 123 J Nonlinear Sci 1 1 0.8 0.8 0.6 0.6 0.4 0.4 0.2 0.2 0 0 0 0.5 1 1.5 2 00.5 11.5 2 meet Fig. 13 The trajectories closest in terms of ν to one of the cornerstones than to the others. Left: initial time, right: final time. The horizontal axis is x , the vertical is x 1 2 Note that if there are C ≥ 2 coherent sets, the first C corner stones are going to be in them, one in each. This is due to the fact that to move from the center of one static region to another, the shortest path in (20) needs to move out of one set, travel in the transition region to the other set, and move to its center, hence maximizing the minimal distance to all other cornerstones. After finding all static regions, the next cornerstone is to be found in the transition region, if all static regions are approximately of the same size—which we assume here. The crucial observation is that this (C + 1)-st cornerstone is half as far from the other cornerstones, as they are from one another. In other words, d(c , c ) + d(c , c ) ≈ d(c , c ), i, j ≤ C. i C +1 j C +1 i j To summarize, our simple check when to stop searching for cornerstones is going to be, when the value of the objective function in (25) drops by at least a factor two compared with the previous value. Observe how nicely this works in the periodically forced double-gyre case: the rightmost points of the curves in Fig. 11 are the optimizers, and the corresponding value of the yellow curve is less than half of the values for the first two cornerstones. This indicates to stop with three cornerstones, as they will represent both the gyres and the transition region. 4.5 Clustering and Fuzzy Affiliations To get an even more crisp picture of the subdivision of the state space into regions which are far away in terms of the semidistance d, we assign to each cornerstone c , c , c the 1 2 3 trajectories that are closer to them than to the two other cornerstones, respectively. For the periodically forced double gyre and the meeting distance this is shown in Fig. 13. Comparing this picture with the typical trajectories of the time-1 Poincaré map of the system [again, see Froyland and Padberg-Gehle 2014, Figure 1], it appears that the gyre regions in our figure are smaller. This is due to the nature of the transport distance at hand: the gyres are partly made up of so-called regular regions of the Poincaré map, meaning that typical trajectories move on periodic orbits that are approximately concentric circular lines. Transport between these trajectories is only possible through diffusion, and the price one has to pay for this transport in radial direction is reflected by the rate function (recall, this is what we model by the static regions in Sect. 4.1). The Here we assume that we are in the coarse spatial resolution case, where the semidistances scale linearly and not quadratically, cf. Sect. 3.7. Otherwise, the drop in the distance is more than a factor two (toward factor four). 123 J Nonlinear Sci 1 1 1 1 1 1 0.8 0.8 0.8 0.8 0.8 0.8 0.6 0.6 0.6 0.6 0.6 0.6 0.4 0.4 0.4 0.4 0.4 0.4 0.2 0.2 0.2 0.2 0.2 0.2 0 0 0 0 0 0 0.5 1 1.5 2 0 0.5 1 1.5 2 00.5 11.5 2 Fig. 14 Fuzzy affiliations q (·) of the trajectories to the three cornerstones, c , c , c (from left to right) c 1 2 3 for fuzziness exponent m = 2, shown at initial time. The horizontal axis is x , the vertical is x 1 2 cost to get from the center of the gyre (the cornerstone c or c ) to a regular trajectory 1 2 in the same gyre is proportional to the “radial distance” between them (compare with the static part of the second example in Sect. 4.1). This behavior is not characteristic for the well-mixed transition region, because there the dynamics (eventually) brings any two trajectories close to each other. The effect is most prominent if the time frame of consideration grows infinitely large, and on our finite time horizon it appears as a flattening of the curve. This brings us back to why the blue and green regions in Fig. 13 are smaller than gyres in the Poincaré map. The answer is simply, because the outer periodic orbits are closer to the transition region than to the center of the gyre, hence also closer to the cornerstone c that is in the transition region, because the points in the transition region have very small distance from one another. Instead of a hard clustering we can assign the trajectories to the cornerstones by fuzzy affiliations q (·), to obtain more refined information on coherence. For instance, let m > 1, and minimizing the affiliation-weighted penalty function m 2 q ( j ) d(c , j ) c i j =1 i =1 subject to the constraints 0 ≤ q for i = 1,..., and q ( j ) = 1 for every j = c c i i =1 i 1,..., I , yields q ( j ) = . (26) m−1 d(c , j ) k=1 d(c , j ) This is the affiliation function in the fuzzy c-means algorithm (Bezdek 1981), giv- ing q ( j ) = 1 ⇔ d(c , j ) = 0, i.e., affiliation is maximal if the distance is minimal. c i Further, the parameter m controls the fuzziness of the clustering: large m gives soft clusters, while m approaching 1 gives more and more “crisp” clusters as the affiliations converge either to 0 or to 1 (Bezdek et al. 1987). The resulting affiliations (indicated at initial time) for m = 2 are shown in Fig. 14.For m close to 1 we obtain affiliations very similar to the hard clusters in Fig. 13. 5 Numerical Results In the previous section we already presented numerical results for the double-gyre system, where we used the results to motivate and develop the analysis tools. In 123 J Nonlinear Sci this section we apply these tools to two other well-analyzed test cases: the perturbed Bickley Jet and the rotating (transitory) double gyre. They are different paradigmatic examples, as the Bickley Jet has a non-vortex coherent set (the jet core), and the transitory double gyre is not a periodically forced system, thus genuinely living on a finite time interval. Let us also point out that the examples presented here and in the previous section are all one- or two-dimensional. Although we expect the analysis in higher dimensions to be at least qualitatively not very different from the two-dimensional case, dealing with the nevertheless arising subtle differences (see Remark 4.1) is beyond the scope of this conceptual work. cross meet It turns out that the choice between the two semidistances ν or ν has only K K marginal influence on the results. In this section we shall mostly work with the cross- semidistance. 5.1 The Bickley Jet We consider a perturbed Bickley Jet as described in Rypina et al. (2007). This is an idealized zonal jet approximation in a band around a fixed latitude, assuming incom- pressibility, on which three traveling Rossby waves are superimposed, see Fig. 15.The ∂ ∂ dynamics is given by x˙ = v(t, x ) with v(t, x ) = (− , ) and stream function t t ∂ x ∂ x 2 1 (t, x , x ) =−U L tanh x /L + U L sech x /L A cos k x − c t . ( ( )) 1 2 0 2 0 2 n n 1 n n=1 The constants are chosen as in Rypina et al. (2007, Section 4) . In particular, we set k = 2n/r with r = 6.371, U = 5.414, and L = 1.77. The phase speeds c n e e 0 n of the Rossby waves are c = 0.1446U , c = 0.205U , c = 0.461U , their ampli- 1 0 2 0 3 0 tudes A = 0.0075, A = 0.15, and A = 0.3, as in Hadjighasem et al. (2016). The 1 2 3 system is considered on a state space X =[0,π r ]×[−3, 3] which is periodic in the horizontal x coordinate. We choose a uniform 60 × 18 grid as initial conditions for the floaters at time t = 0; i.e., I = 1080. We sample the trajectories of these floaters at times t = kτ , k = 0, 1,..., K , where K = 80 and τ = 0.5. In this time interval, typical trajectories cross the cylindrical state space horizontally 4–5 times, trajectories in the jet core (the wavy structure in Fig. 15) up to 9 times. Employing our large-deviation-based distance computations on this data set using Algorithm 1 and α = 1/2, we get the rates ν (i → j ), i, j = 1,..., I . From these Fig. 15 Sketch of the Bickley Jet flow field at two different times. The flow pattern travels from left to right on the horizontally periodic domain. The horizontal axis is x , the vertical is x 1 2 123 J Nonlinear Sci cross rates we readily obtain the ν (i, j ) via cross ν (i, j ) = ν (i → j ) + ν ( j →i). K K We repeat the cornerstone finding analysis from the previous section. The optimal values of the objective function in the cornerstone finding problem (25) are for 8 cornerstones, in order: 2.06, 3.21, 2.45, 2.33, 2.30, 2.14, 1.42, 0.70. Recall that the first value is with respect to a random cornerstone c that we discard. These numerical values with our previous analysis shed light on the topological struc- ture of the state space with respect transport and mixing. Note, that our assumption from Sect. 4.4, that all coherent sets are divided by one mixing region, is not satisfied: the jet core is a coherent set itself, dividing two mixing regions (below and above it), each containing 3 further coherent sets (the gyres). Thus, c and c have maximal 1 2 cross distance (ν (c , c ) = 3.21), because the random walker needs to cross the jet 1 2 core. Every further cornerstone c ,..., c can be reached from either c or c through 3 6 1 2 one of the mixing regions, and thus have a very similar cost. The deviation of these costs, 2.14–2.45, shows that we did not reach the state of full mixing on the chosen time interval. Now, the seventh cornerstone lies in the jet core, which has to be crossed if traveling between cornerstones that are below and above it, respectively. The corresponding cost (1.42) is a bit larger than half of the previous cost, because c does not lie on the shortest path between cornerstones below and above the jet core. Intuitively, the “center line” of the jet core should be equally far from all cornerstones c ,..., c , 1 6 if the time interval is large enough such that the regions around the gyres are truly mixing. Since it is not, there are points on the boundary of the jet core which are easier to reach from them, and thus easier to cross there. The cornerstone c represents the position where it is the hardest to cross. The eighth cornerstone has truly half the semidistance to the closest one than c , and lies in one of the mixing regions. We show our results for seven cornerstones. The semidistances are shown in Fig. 16. The corresponding fuzzy affiliations from (26)for m = 1.1are showninFig. 17. They show a very crisp distinction of the six gyres from the rest of the state space. The bottom right figure shows the affiliation q (·) for m = 1.9, which suggests that the region around the gyres could still be partitioned into coherent sets itself: the jet core appears more strongly affiliated to this cornerstone than the other trajectories. It is not surprising that we could not see this for m = 1.1, since the closer m is to 1, the more “crisp” the affiliation function is forced to be, and the mixing region is more easily reached from the thin jet core than from the gyres. If we include additional cornerstones, results tend to deteriorate due to the low resolution and because the chosen time interval is not giving full mixing. 123 J Nonlinear Sci 3 3 2 2 2 2 0 0 1 1 -2 -2 0 0 0 5 10 15 20 0 5 10 15 20 3 3 2 2 2 2 0 0 1 1 -2 -2 0 0 0 5 10 15 20 0 5 10 15 20 3 3 2 2 2 2 0 0 1 1 -2 -2 0 0 0 5 10 15 20 0 5 10 15 20 1.5 0.5 -2 0 5 10 15 20 Fig. 16 Row-wise from top left to bottom: the identified corner stores c , i = 1,..., 7, (magenta circles) cross and their distances ν (c , ·) to the other trajectories, at initial time. The cornerstones are located in the six gyres and the central jet region. The distances are given in units 1/τ . The horizontal axis is x ,the vertical is x (Color figure online) 1 1 2 2 0 0.5 0 0.5 -2 -2 0 0 0 5 10 15 20 05 10 15 20 1 1 2 2 0 0.5 0 0.5 -2 -2 0 0 0 5 10 15 20 05 10 15 20 1 1 2 2 0.5 0.5 0 0 -2 -2 0 0 0 5 10 15 20 05 10 15 20 1 1 2 2 0 0.5 0 0.5 -2 -2 0 0 0 5 10 15 20 05 10 15 20 Fig. 17 Row-wise from top left to bottom: the fuzzy affiliations (26) of the trajectories at time t = 5tothe cornerstones c ,..., c , respectively (magenta circles). Bottom right: affiliation q (·) for m = 1.9, which 1 7 c suggests that the 7th coherent region could contain a coherent set itself: the jet core. The horizontal axis is x , the vertical is x (Color figure online) 1 2 123 J Nonlinear Sci Fig. 18 Sketch of the flow field of the rotating double gyre at initial (left) and final (right) times. The horizontal axis is x , the vertical is x 1 2 5.2 The Rotating Double Gyre Let us consider a prototype for a system, where transport is considered only on a limited time interval. The rotating double-gyre system (Mosovsky and Meiss 2011) is given by the stream function ψ(t, x , x ) = (1−s(t ))ψ (x , x )+s(t )ψ (x , x ), with s(t ) = 1 2 P 1 2 F 1 2 t (3 −2t ), ψ (x , x ) = sin(2π x ) sin(π x ), and ψ (x , x ) = sin(π x ) sin(2π x ), P 1 2 1 2 F 1 2 1 2 and is considered on the state space X =[0, 1] and time interval t ∈[0, 1].The two gyres, which initially occupy the left and right halves of the unit square, turn during this time by π/2 to occupy the top and bottom halves at final time, see Fig. 18. We choose a uniform 30 ×30 grid as initial conditions for the floaters at time t = 0; i.e., I = 900. We sample the trajectories of these floaters at times t = kτ , k = 0, 1,..., K , where K = 100 and τ = 0.01. We employ the cross-semidistance, and start our cornerstone search. The first three values of the optimization problem (25)are 0.0274, 0.0474, 0.0262. We identify the significant drop after two corner stones, hence we expect two coherent sets with one mixing region dividing them. The drop in the distance is by a factor 0.55, which is not below one half, the reason for this being again that the time interval of consideration is not sufficient for perfect mixing of the transition region. The semidistances from the three identified cores and the affiliations to these cores cross for exponent m = 1.2 are shown in Figs. 19 and 20, respectively. Although both ν meet and ν have been shown to be able to detect coherent sets, we demonstrate their different nature by showing the shortest paths in the respective distance in Fig. 21. Finally, we demonstrate the approach for a scattered set of sparse data points, taking 400 initial points randomly distributed in X, and repeating the analysis for their trajectories. We show the resulting fuzzy affiliations in Fig. 22. 6 Discussion and Outlook 6.1 The Dynamic Laplacian Froyland (2015) has introduced the dynamic Laplacian as a transport-related tool to find coherent sets. Similarly to our approach, it makes use of a small random perturbation of size ε, then ε is driven to zero. 123 J Nonlinear Sci 1 1 1 0.045 0.045 0.025 0.9 0.9 0.9 0.04 0.04 0.8 0.8 0.8 0.035 0.035 0.02 0.7 0.7 0.7 0.03 0.03 0.6 0.6 0.6 0.015 0.025 0.025 0.5 0.5 0.5 0.02 0.02 0.4 0.4 0.4 0.01 0.015 0.015 0.3 0.3 0.3 0.01 0.01 0.2 0.2 0.2 0.005 0.005 0.1 0.1 0.005 0.1 0 0 0 0 0 0 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 1 1 1 0.045 0.045 0.025 0.9 0.9 0.9 0.04 0.04 0.8 0.8 0.8 0.035 0.035 0.02 0.7 0.7 0.7 0.03 0.03 0.6 0.6 0.6 0.015 0.025 0.025 0.5 0.5 0.5 0.02 0.02 0.4 0.4 0.4 0.01 0.015 0.015 0.3 0.3 0.3 0.01 0.01 0.2 0.2 0.2 0.005 0.005 0.005 0.1 0.1 0.1 0 0 0 0 0 0 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 cross Fig. 19 From left to right: the identified corner stores c , i = 1,..., 3, (magenta circles) and their distances ν (c , ·) to the other trajectories, at initial time (top) and final i i time (bottom). The distances are given in units 1/τ . The horizontal axis is x , the vertical is x (Color figure online) 1 2 J Nonlinear Sci 1 1 1 1 1 1 0.9 0.9 0.9 0.9 0.9 0.9 0.8 0.8 0.8 0.8 0.8 0.8 0.7 0.7 0.7 0.7 0.7 0.7 0.6 0.6 0.6 0.6 0.6 0.6 0.5 0.5 0.5 0.5 0.5 0.5 0.4 0.4 0.4 0.4 0.4 0.4 0.3 0.3 0.3 0.3 0.3 0.3 0.2 0.2 0.2 0.2 0.2 0.2 0.1 0.1 0.1 0.1 0.1 0.1 0 0 0 0 0 0 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 1 1 1 1 1 1 0.9 0.9 0.9 0.9 0.9 0.9 0.8 0.8 0.8 0.8 0.8 0.8 0.7 0.7 0.7 0.7 0.7 0.7 0.6 0.6 0.6 0.6 0.6 0.6 0.5 0.5 0.5 0.5 0.5 0.5 0.4 0.4 0.4 0.4 0.4 0.4 0.3 0.3 0.3 0.3 0.3 0.3 0.2 0.2 0.2 0.2 0.2 0.2 0.1 0.1 0.1 0.1 0.1 0.1 0 0 0 0 0 0 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 Fig. 20 The fuzzy affiliations computed with m = 1.2 to the cornerstones c (top) and c (bottom), at times t = 0, 0.5, 1, from left to right, respectively. The horizontal axis 1 2 is x , the vertical is x 1 2 J Nonlinear Sci 1 1 1 0.9 0.9 0.9 0.8 0.8 0.8 0.7 0.7 0.7 0.6 0.6 0.6 0.5 0.5 0.5 0.4 0.4 0.4 0.3 0.3 0.3 0.2 0.2 0.2 0.1 0.1 0.1 0 0 0 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 Fig. 21 Shortest paths from cornerstone c to c (left), from c to c (middle), and the meeting paths with the shortest joint length (right). The brighter the color of a path 1 2 2 1 segment, the larger the cost of that transition. For those segments for which the path crosses from one trajectory to another we show with a dashed segment how the former trajectory would have continued. The starting point of the path is indicated by a circle, and the endpoint by a triangle. The horizontal axis is x , the vertical is x 1 2 J Nonlinear Sci 1 1 1 1 1 1 0.9 0.9 0.9 0.9 0.9 0.9 0.8 0.8 0.8 0.8 0.8 0.8 0.7 0.7 0.7 0.7 0.7 0.7 0.6 0.6 0.6 0.6 0.6 0.6 0.5 0.5 0.5 0.5 0.5 0.5 0.4 0.4 0.4 0.4 0.4 0.4 0.3 0.3 0.3 0.3 0.3 0.3 0.2 0.2 0.2 0.2 0.2 0.2 0.1 0.1 0.1 0.1 0.1 0.1 0 0 0 0 0 0 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 1 1 1 1 1 1 0.9 0.9 0.9 0.9 0.9 0.9 0.8 0.8 0.8 0.8 0.8 0.8 0.7 0.7 0.7 0.7 0.7 0.7 0.6 0.6 0.6 0.6 0.6 0.6 0.5 0.5 0.5 0.5 0.5 0.5 0.4 0.4 0.4 0.4 0.4 0.4 0.3 0.3 0.3 0.3 0.3 0.3 0.2 0.2 0.2 0.2 0.2 0.2 0.1 0.1 0.1 0.1 0.1 0.1 0 0 0 0 0 0 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 Fig. 22 The rotating double gyre with randomly chosen 400 initial points. The fuzzy affiliations computed with m = 1.2 to the cornerstones c (top) and c (bottom), at 1 2 times t = 0, 0.5, 1, from left to right, respectively. The horizontal axis is x , the vertical is x 1 2 J Nonlinear Sci Numerical methods so far discretize directly the dynamic Laplacian (Froyland and Junge 2015; Banisch and Koltai 2017; Froyland and Junge 2017). In light of our analysis, which can be used both ways (derive the large-deviation principle in continuous space, then discretize it to finite trajectories, cf. Sect. 3.6, or discretize the dynamics to finite trajectories, then derive the large-deviation principle on them, cf. Sect. 3.3), we ask whether there is a discrete dynamic Laplacian that can be derived from a discretization of the perturbed dynamics? Mimicking the construction in Froyland (2015) and sketching the idea while skip- I × I ping details, one should construct a discrete, ε-dependent transfer operator T ∈ R , that represents transition probabilities of a forward-backward process, then obtain a discrete dynamic Laplace operator L := T . A discrete transfer opera- dyn ε dε ε=0 tor T that is a consistent approximation of the continuous dynamics can be obtained by a construction as in Sect. 3.2, by using the transition probabilities (17). Tech- nical details aside, we see that the probabilities are linear combinations of terms −x /ε of the form e , where x here is a formal distance term that appears in the formulas. Differentiation with respect to ε immediately yields that all off-diagonal entries (basically, where x > 0) of L are zero, in fact the matrix is the iden- dyn tity. Thus, this approach of discretizing the dynamics first, and then factoring out the ε-small stochastic perturbation does not give a dynamically meaningful result. In analytic terms the very same problem occurred in a different attempt to introduce a discrete dynamic Laplacian from a discrete transfer operator, see (Banisch and Koltai 2017, Section IV) . In general, it would be desirable to understand when and how can the “first discretize, then factor out ε” methods work, such that they can complement the methods that directly discretize the (continuous) dynamic Laplace operator. 6.2 Other Distance Measures The time-dependent shortest path problem used to compute our semidistances is com- putationally demanding in our current algorithmic realization, which theoretically limits the number of trajectories that can be handled. Moreover, they do not satisfy the triangle inequality, hence they are not a metric. Although numerical efficiency is not the main focus of this paper, and we demonstrated the usefulness of our semidis- tances in unraveling the underlying dynamical structure of the example systems, a more cheaply computable metric would enhance the utility and significance of the analysis methods presented here. Ultimately, one would like to understand the intrinsic, possibly low-dimensional geometric organization of the state space with respect to transport and mixing, as pioneered in Banisch and Koltai (2017). Employing proper metrics would allow, e.g., the usage of low-dimensional embedding techniques, such as multidimensional scaling, to represent and better understand this geometric organization. One canon- ical candidate would be the metric structure related to the dynamic Laplacian, considered in Karrasch and Keller (2016). This will be subject of future stud- ies. 123 J Nonlinear Sci To summarize, although other distance measures could be used to analyze compli- cated dynamic behavior, we showed that the semidistances we derived in this paper from the physical notion of transport and mixing in the vanishing diffusion setting are natural and effective diagnostic tools. Acknowledgements This work is supported by the Deutsche Forschungsgemeinschaft (DFG) through the Priority Programme SPP 1881 “Turbulent Superstructures”, and through the CRC 1114 “Scaling Cascades in Complex Systems”, projects A01 and C08. Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 Interna- tional License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. A Large Deviations of the Forward-Backward Conditions In this appendix we explore the conditions (3) in the large-deviation regime. The argument is based on the Laplace Principle, which states that for any measure ρ and function f : − f (x ) lim −ε log e ρ(dx ) = inf f (x ). (27) x ∈supp ρ ε→0 As in (9), (ε) (ε) −ε log P x y | x = x T 0 −−→ inf |˙ x − v(t, x )| dt =: λ (x → y), (28) t t T x :x =x ,x = y ε→0 (·) 0 T where, contrary to (9), the symbol y now denotes a position at time T , that is, μ (x →x˜ ) = λ (x →φ [˜ x ]) = λ (x → y). T T 0,T T Fix an ε-independent initial probability measure ρ (dx ) = P[x ∈ dx ].For the 0 0 large deviations of the forward condition in (3), it follows from the Laplace principle that (ε) fw J ( B| A) := lim −ε log P x ∈ B | x ∈ A T T ε→0 (ε) = lim −ε log P x ∈ d y | x = x ρ (dx ) 0 0 ε→0 B A (28) − λ (x → y) = lim −ε log e ρ (dx ) ε→0 B A (27) = inf inf λ (x → y). y∈ B x ∈ A∩supp ρ Observe that since the initial distribution ρ is independent of ε, it only appears in the large deviations through its support supp ρ . 123 J Nonlinear Sci The large deviations of the backward conditions in (3) can be calculated analo- gously, but now the conditioning does depend on ε. By Bayes’ rule, the rate function of the backward condition is bw (ε) J ( A| B) := lim −ε log P[x ∈ A | x ∈ B] T T ε→0 P[x ∈ A] (ε) 0 = lim −ε log P[x ∈ B | x ∈ A] T (ε) ε→0 P[x ∈ B] (ε) = lim −ε log P x ∈ d y | x = x ρ (dx ) 0 0 ε→0 B A (ε) + ε log P x ∈ d y | x = x ρ (dx ) − ε log ρ ( A) 0 0 0 (27,28) = inf inf λ (x → y) − inf inf λ (x → y) T T x ∈supp ρ y∈ B x ∈ A∩supp ρ y∈ B 0 fw fw = J ( B| A) − J ( B| X ). T T If we assume that that there is at least one admissible path x that starts in supp ρ (·) 0 fw fw bw and ends in B, then in fact J ( B| X ) = 0, and so J ( B| A) = J ( A| B). T T T These calculations have two important implications. First, observe that while the (ε) (ε) forward and backward probabilities P[x ∈ B | x ∈ A] and P[x ∈ A | x ∈ B] are 0 0 T T not equal in general, the forward and backward rate functions are. The same argument even holds if we shrink the sets A ad B down to single points x and y; in that case we obtain for the “backward rates” that (ε) λ (x ← y) := lim −ε log P x x | x = y = λ (x → y), T 0 T ε→0 cf. Remark 2.1. Apparently, in the large-deviation scaling it does not matter whether we consider the forward or the backward process. Since the forward condition fw J ( B| A) ≈ 0 in itself does not hold enough information to characterize coher- bw ence and the backward condition J ( A| B) ≈ 0 does not add information, these conditions are not helpful to characterize coherence. fw Secondly, we see that J ( B| A) = 0 as soon as φ [ A ∩ supp ρ ]∩ B =∅. 0,T 0 fw Naturally, there are many such pairs A, B, and the set function J does not give any quantitative information about which pairs are more coherent than others. Because of this, the large deviations of the forward and backward conditions (3)are even less useful to identify coherent sets. We start to gain useful information about coherence, if there are at least two coherent fw fw pairs, say A , B and A , B . Then the rates J ( B | A ) and J ( B | A ) are in 1 1 2 2 2 1 1 2 T T general large, since coherence of the respective set pairs dictate that it is very unlikely to encounter paths from pair #1 to pair #2. Using these rates as measures of farness is one idea this paper exploits. 123 J Nonlinear Sci B Algorithm: Shortest Path in Time-Dependent Graphs Since we could not find an algorithm suited to our purpose, we describe in this appendix a solution we came up with to solve the problem of finding shortest paths in a graph with time-dependent nonnegative edge weights. Transition is only possible between nodes that are connected by an edge of positive weight. There are several solutions to the shortest path problem for time-independent graphs, such as Dijkstra’s algorithm (Dijkstra 1959)orthe Floyd–Warshall algorithm (Floyd 1962). Each of them use in some sense a “monotonicity” argument, namely, that sub-paths of shortest paths are shortest paths themselves. This does not hold for time-dependent graphs, because at every step that we make the environment might change completely, and the number of steps we can make is limited by the number of time instances of the graph. We propose the following algorithm to compute shortest paths from a specific node s to all other nodes. Note that we can stay in a node for any time at zero cost. The weight of the transition i → j at time t is denoted by w (i → j ). Algorithm 1 Shortest distance in time-dependent graphs 1: R ={s}, R =∅ (reached states at times 0 and 1) ol d new 2: dist(s) = 0, dist(i ) =∞ for i = s 3: for t = 1,..., T do 4: while R =∅ do ol d 5: v = arg max dist(i ) i ∈ R ol d 6: R ← R \{v} ol d ol d 7: for j : w (v → j)< ∞ do 8: if dist(v) + w (v → j)< dist( j ) then 9: dist( j ) = dist(v) + w (v → j ) 10: R ← R ∪{ j } new new 11: R = R new ol d It is important to have the max on line 5, since if we does not start the update procedure at the node which has the maximal distance, then we might erroneously cut off nodes that could still be reached from it. Algorithm 1 can clearly be extended to keep track of the shortest path as well. The distance of a node j is updated to a smaller one, whenever there is a path through some other node v that is shorter than the previous one (line 10). Hence, the new candidate shortest path is the one leading to v and then jumping to j in the current time step. This is implemented in Algorithm 2 (line 12). Herein, path(i → j ) is the shortest path from node i to node j, such that path (i → j ) is the node the walker resides in at time t = 0, 1,..., T while going through the shortest path, and path (i → j ) = i.If there is no path from i to j, then path(i → j ) is the zero vector. We use 1 : k to denote the index set 1, 2,..., k. It should be mentioned here that every time-dependent shortest path problem can be rephrased as a time-independent problem by considering each node at each time point as a distinct node of a large graph, and then it could be solved by standard methods. We do not take this approach here, as it might introduce memory requirement issues. 123 J Nonlinear Sci Algorithm 2 Shortest path in time-dependent graphs 1: R ={s}, R =∅ (reached states at times 0 and 1) new ol d 2: dist(s) = 0, dist(i ) =∞ for i = s T +1 3: path(s→ j ) = 0 ∈ R for all j, path (s→s) = s 4: for t = 1,..., T do 5: while R =∅ do ol d 6: v = arg max dist(i ) i ∈ R ol d 7: R ← R \{v} ol d ol d 8: for j : w (v → j)< ∞ do 9: if dist(v) + w (v → j)< dist( j ) then 10: dist( j ) = dist(v) + w (v → j ) 11: path (s→ j ) = path (s→v), path (s→ j ) = j 1:t −1 1:t −1 t 12: R ← R ∪{ j } new new 13: path (s→s) = s 14: R = R ol d new References Allshouse, M.R., Peacock, T.: Lagrangian based methods for coherent structure detection. Chaos: an Inter- disciplinary. J. Nonlinear Sci. 25(9), 97617 (2015) Allshouse, M.R., Thiffeault, J.-L.: Detecting coherent structures using braids. Physica D: Nonlinear Phenom. 241(2), 95–105 (2012) Andres, S.: Diffusion processes with reflection. PhD thesis, TU Berlin (2009) Balasuriya, S., Froyland, G., Santitissadeekorn, N.: Absolute flux optimising curves of flows on a surface. J. Math. Anal. Appl. 409(1), 119–139 (2014) Banisch, R., Koltai, P.: Understanding the geometry of transport: diffusion maps for Lagrangian trajectory data unravel coherent sets. Chaos: an Interdisciplinary. J. Nonlinear Sci. 27(3), 035804 (2017) Bezdek, J.C.: Pattern Recognition with Fuzzy Objective Function Algorithms. Kluwer Academic Publishers, Dordrecht (1981) Bezdek, J.C., Hathaway, R.J., Sabin, M.J., Tucker, W.T.: Convergence theory for fuzzy c-means: counterex- amples and repairs. IEEE Trans. Syst. Man Cybern. 17(5), 873–877 (1987) Budišic, ´ M., Mezic, ´ I.: Geometry of the ergodic quotient reveals coherent structures in flows. Physica D: Nonlinear Phenom. 241(15), 1255–1269 (2012) Dellnitz, M., Junge, O.: On the approximation of complicated dynamical behavior. SIAM J. Numer. Anal. 36, 491–515 (1999) Dellnitz, M., Froyland, G., Horenkamp, C., Padberg-Gehle, K., Gupta, A.S.: Seasonal variability of the subpolar gyres in the Southern Ocean: a numerical investigation based on transfer operators. Nonlinear Proces. Geophys. 16, 655–664 (2009) Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications, vol. 38, 2nd edn. Springer, New York (1987) Denner, A., Junge, O., Matthes, D.: Computing coherent sets using the Fokker–Planck equation. J. Comput. Dyn. 3(2), 163–177 (2016) Dijkstra, E.W.: A note on two problems in connexion with graphs. Numer. Math. 1(1), 269–271 (1959) Fabregat , A., Mezic, I., Poje, A. C.: Finite-time partitions for Lagrangian structure identification in Gulf Stream eddy transport. (2016). arXiv preprint arXiv:1606.07382 Floyd, R.W.: Algorithm 97: shortest path. Commun. ACM 5(6), 345 (1962) Freidlin, M., Wentzell, A.: Random Perturbations of Dynamical Systems, 2nd edn. Springer, Berlin (1998) Froyland, G., Junge, O.: Robust FEM-based extraction of finite-time coherent sets using scattered, sparse, and incomplete trajectories. (2017). arXiv preprint arXiv:1705.03640 Froyland, G., Kwok, E.: A dynamic Laplacian for identifying Lagrangian coherent structures on weighted Riemannian manifolds (2016). arXiv preprint arXiv:1610.01128 Froyland, G., Padberg-Gehle, K.: Almost-invariant and finite-time coherent sets: directionality, duration, and diffusion. In: Ergodic Theory, Open Dynamics, and Coherent Structures, pp. 171–216. Springer (2014) 123 J Nonlinear Sci Froyland, G.: An analytic framework for identifying finite-time coherent sets in time-dependent dynamical systems. Physica D: Nonlinear Phenom. 250, 1–19 (2013) Froyland, G.: Dynamic isoperimetry and the geometry of Lagrangian coherent structures. Nonlinearity 28(10), 3587 (2015) Froyland, G., Junge, O.: On fast computation of finite-time coherent sets using radial basis functions. Chaos: an Interdisciplinary. J. Nonlinear Sci. 25(8), 087409 (2015) Froyland, G., Koltai, P.: Estimating long-term behavior of periodically driven flows without trajectory integration. Nonlinearity 30(5), 1948 (2017) Froyland, G., Padberg-Gehle, K.: A rough-and-ready cluster-based approach for extracting finite-time coher- ent sets from sparse and incomplete trajectory data. Chaos: an Interdisciplinary. J. Nonlinear Sci. 25(8), 087406 (2015) Froyland, G., Padberg, K., England, M.H., Treguier, A.M.: Detection of coherent oceanic structures via transfer operators. Phys. Rev. Lett. 98, 224503 (2007) Froyland, G., Santitissadeekorn, N., Monahan, A.: Transport in time-dependent dynamical systems: finite- time coherent sets. Chaos: an Interdisciplinary. J. Nonlinear Sci. 20(4), 043116 (2010) Froyland, G., Horenkamp, C., Rossi, V., van Sebille, E.: Studying an Agulhas ring’s long-term pathway and decay with finite-time coherent sets. Chaos 25(8), 083119 (2015) Hadjighasem, A., Karrasch, D., Teramoto, H., Haller, G.: Spectral-clustering approach to Lagrangian vortex detection. Phys. Rev. E 93(6), 063107 (2016) Halász, G., Gyüre, B., Jánosi, I.M., Szabó, K.G., Tél, T.: Vortex flow generated by a magnetic stirrer. Am. J. Phys. 75(12), 1092–1098 (2007) Haller, G., Beron-Vera, F.: Coherent Lagrangian vortices: the black holes of turbulence. J. Fluid Mech. 731, R4 (2013). https://doi.org/10.1017/jfm.2013.391 Haller, G.: Finding finite-time invariant manifolds in two-dimensional velocity fields. Chaos: an Interdisci- plinary. J. Nonlinear Sci. 10(1), 99–108 (2000) Haller, G.: Distinguished material surfaces and coherent structures in three-dimensional fluid flows. Physica D 149(4), 248–277 (2001) Haller, G., Beron-Vera, F.J.: Geodesic theory of transport barriers in two-dimensional flows. Physica D: Nonlinear Phenom. 241(20), 1680–1702 (2012) Karrasch, D., Keller, J.: A geometric heat-flow theory of Lagrangian coherent structures (2016). arXiv Preprint arXiv:1608.05598 Karrasch, D.: Lagrangian transport through surfaces in volume-preserving flows. SIAM J. Appl. Math. 76(3), 1178–1190 (2016) Kloeden, P.E., Platen, E.: Numerical Solution of Stochastic Differential Equations, 3rd edn. Springer, Berlin (2010) Koltai, P., Ciccotti, G., Schütte, C.: On metastability and Markov state models for non-stationary molecular dynamics. J. Chem. Phys. 145(17), 174103 (2016) MacKay, R., Meiss, J., Percival, I.: Transport in Hamiltonian systems. Physica D: Nonlinear Phenom. 13(1–2), 55–81 (1984) Meiss, J.: Symplectic maps, variational principles, and transport. Rev. Mod. Phys. 64(3), 795 (1992) Mezic, ´ I., Banaszuk, A.: Comparison of systems with complex behavior. Physica D: Nonlinear Phenom. 197(1), 101–133 (2004) Mezic, ´ I., Wiggins, S.: A method for visualization of invariant sets of dynamical systems based on the ergodic partition. Chaos: an Interdisciplinary. J. Nonlinear Sci. 9(1), 213–218 (1999) Mosovsky, B.A., Meiss, J.D.: Transport in transitory dynamical systems. SIAM J. Appl. Dyn. Syst. 10(1), 35–65 (2011) Øksendal, B.: Stochastic Differential Equations—An Introduction with Applications, 6th edn. Springer, Berlin (2003) Onsager, L., Machlup, S.: Fluctuations and irreversible processes. Phys. Rev. 91(6), 1505–1512 (1953) Padberg, K., Hauff, T., Jenko, F., Junge, O.: Lagrangian structures and transport in turbulent magnetized plasmas. New J. Phys. 9, 400 (2007) Padberg-Gehle, K., Schneide, C.: Network-based study of Lagrangian transport and mixing. Nonlinear Proces. Geophys. Discuss. 1–14, 2017 (2017) Rom-Kedar, V., Wiggins, S.: Transport in two-dimensional maps. Arch. Ration. Mech. Anal. 109(3), 239– 298 (1990) 123 J Nonlinear Sci Rüdrich, S., Sarich, M., Schütte, C.: Utilizing hitting times for finding metastable sets in non-reversible markov chains. To appear in J. Comput. Dyn. (2017) Preprint https://opus4.kobv.de/opus4-zib/ frontdoor/index/index/docId/5120 Rypina, I.I., Pratt, L.J.: Trajectory encounter volume as a diagnostic of mixing potential in fluid flows. Nonlinear Proces. Geophys. 24(2), 189 (2017) Rypina, I., Brown, M., Beron-Vera, F., Kocak, H., Olascoaga, M., Udovydchenkov, I.: On the Lagrangian dynamics of atmospheric zonal jets and the permeability of the stratospheric polar vortex. J. Atmos. Sci. 64(10), 3595–3610 (2007) Schlueter-Kuck, K.L., Dabiri, J.O.: Coherent structure colouring: identification of coherent structures from sparse data using graph theory. J. Fluid Mech. 811, 468–486 (2017) Ser-Giacomi, E., Rossi, V., López, C., Hernández-García, E.: Flow networks: a characterization of geo- physical fluid transport. Chaos: an Interdisciplinary. J. Nonlinear Sci. 25(3), 036404 (2015) Ser-Giacomi, E., Vasile, R., Hernández-García, E., López, C.: Most probable paths in temporal weighted networks: an application to ocean transport. Phys. Rev. E 92(1), 012818 (2015) Strang, G.: On the construction and comparison of difference schemes. SIAM J. Numer. Anal. 5(3), 506–517 (1968) Treguier, A.-M., Boebel, O., Barnier, B., Madec, G.: Agulhas eddy fluxes in a 1/6 degrees Atlantic model. Deep Sea Res. Part II 50(1), 251–280 (2003) Walters, P.: An Introduction to Ergodic Theory, vol. 79. Springer, Berlink (2000) Wigner, E.P.: Calculation of the rate of elementary association reactions. J. Chem. Phys. 5, 720–725 (1937) Williams, M.O., Rypina, I.I., Rowley, C.W.: Identifying finite-time coherent sets from limited quantities of Lagrangian data. Chaos 25(8), 087408 (2015) http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Journal of Nonlinear Science Springer Journals

From Large Deviations to Semidistances of Transport and Mixing: Coherence Analysis for Finite Lagrangian Data

Free
43 pages
Loading next page...
 
/lp/springer_journal/from-large-deviations-to-semidistances-of-transport-and-mixing-yocP1SwKs8
Publisher
Springer US
Copyright
Copyright © 2018 by The Author(s)
Subject
Mathematics; Analysis; Theoretical, Mathematical and Computational Physics; Classical Mechanics; Mathematical and Computational Engineering; Economic Theory/Quantitative Economics/Mathematical Methods
ISSN
0938-8974
eISSN
1432-1467
D.O.I.
10.1007/s00332-018-9471-0
Publisher site
See Article on Publisher Site

Abstract

J Nonlinear Sci https://doi.org/10.1007/s00332-018-9471-0 From Large Deviations to Semidistances of Transport and Mixing: Coherence Analysis for Finite Lagrangian Data 1 2 Péter Koltai · D. R. Michiel Renger Received: 8 September 2017 / Accepted: 19 May 2018 © The Author(s) 2018 Abstract One way to analyze complicated non-autonomous flows is through try- ing to understand their transport behavior. In a quantitative, set-oriented approach to transport and mixing, finite time coherent sets play an important role. These are time-parametrized families of sets with unlikely transport to and from their sur- roundings under small or vanishing random perturbations of the dynamics. Here we propose, as a measure of transport and mixing for purely advective (i.e., determin- istic) flows, (semi)distances that arise under vanishing perturbations in the sense of large deviations. Analogously, for given finite Lagrangian trajectory data we derive a discrete-time-and-space semidistance that comes from the “best” approximation of the randomly perturbed process conditioned on this limited information of the deter- ministic flow. It can be computed as shortest path in a graph with time-dependent weights. Furthermore, we argue that coherent sets are regions of maximal farness in terms of transport and mixing, and hence they occur as extremal regions on a spanning structure of the state space under this semidistance—in fact, under any distance mea- sure arising from the physical notion of transport. Based on this notion, we develop a tool to analyze the state space (or the finite trajectory data at hand) and identify coherent regions. We validate our approach on idealized prototypical examples and well-studied standard cases. Communicated by Oliver Junge. B Péter Koltai peter.koltai@fu-berlin.de D. R. Michiel Renger d.r.michiel.renger@wias-berlin.de Institute of Mathematics, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany Weierstraß-Institut, Mohrenstraße 39, 10117 Berlin, Germany 123 J Nonlinear Sci Keywords Large deviation · Coherent sets · Lagrangian data · Mixing . Mixing distance . Transport distance Mathematics Subject Classification Primary 37N10 · 60F10; Secondary 60J22 · 76F25 · 05C12 · 37M10 1 Introduction Transport in Dynamical Systems Instrumental to understanding the essential behav- ior of complicated non-autonomous flows is to grasp how transport is happening in them. This leads on a qualitative level to objects that prohibit transport, commonly named transport barriers; often originating from the geometric picture for autonomous systems and that trajectories are unable to cross co-dimension 1 invariant manifolds (Haller 2000, 2001; Haller and Beron-Vera 2012, 2013). For periodically forced sys- tems, invariant manifolds enclose regions called “lobes” that get transported across these periodically varying manifolds (MacKay et al. 1984; Rom-Kedar and Wiggins 1990). On a quantitative level, one searches for surfaces of small flux (Balasuriya et al. 2014; Karrasch 2016; Froyland and Koltai 2017), so-called partial barriers (Wigner 1937;Meiss 1992). Instead of characterizing regions that do not mix with one another via enclosing them by boundaries of low flux, there are approaches that aim to describe these sets directly. Such set-oriented concepts are strongly interwoven with the the- ory of transfer operators (Perron–Frobenius and Koopman operators) and comprise almost-invariant sets (Dellnitz and Junge 1999), ergodic partitions (Mezic´ and Wig- gins 1999) in autonomous, and coherent sets (Froyland et al. 2010; Froyland 2013)in the non-autonomous cases. Distinctive attention has been given to coherent sets, which are a (possibly time- dependent) family of sets having little or no exchange with their surrounding in terms of transport and are robust to small diffusion over a finite time of consideration (Froyland 2013, 2015). Natural examples include moving vortices in atmospheric (Rypina et al. 2007; Froyland et al. 2010), oceanographic (Treguier et al. 2003; Dellnitz et al. 2009; Froyland et al. 2015), and plasma flows (Padberg et al. 2007). In such applications, one would like to be able to find coherent sets even in the cases when a dynamical model that can be evaluated arbitrarily often is not available, only a finite set of Lagrangian trajectory data (passive tracers moving with flow with positions sampled at discrete time instances). This problem has received lot of attention in recent years, and a diverse collection of tools has been developed to tackle it (Budišic´ and Mezic´ 2012; Allshouse and Thiffeault 2012; Ser-Giacomi et al. 2015; Froyland and Padberg- Gehle 2015; Allshouse and Peacock 2015; Williams et al. 2015; Hadjighasem et al. 2016; Banisch and Koltai 2017; Schlueter-Kuck and Dabiri 2017; Padberg-Gehle and Schneide 2017; Rypina and Pratt 2017; Fabregat et al. 2016; Froyland and Junge 2017). While other current methods aim at collecting trajectories into coherent sets, in Banisch and Koltai (2017) it has been proposed to go one step further and analyze the connectivity structure of the state space under transport and mixing with “transport 123 J Nonlinear Sci coordinates” and the “skeleton of transport”. Very similar observations have been made earlier in Budišic´ and Mezic( ´ 2012) in the infinite-time limit for periodically forced systems. While coherent sets (and transport barriers) aim at partitioning the state space, the skeleton is aiming at “spanning” the state space with respect to transport. In this respect, coherent sets can be associated with distinct “extremal regions” of the skeleton. Here we will only use this idea of extremality, more precisely that coherent sets are “maximally far” from one another, as measured by transport. To this end, we will need to measure “farness” of dynamical trajectories. Several dynamical distance measures have been put forward already to measure the “distance” or “dissimilarity” of trajectories or initial states in dynamical systems (Mezic´ and Banaszuk 2004; Budišic´ and Mezic´ 2012; Froyland and Padberg-Gehle 2015; Hadjighasem et al. 2016; Fabregat et al. 2016; Karrasch and Keller 2016). The majority of them are shown to serve their purpose well in revealing coherent structures efficiently and reliably. However, they are either heuristic in the sense that they are not derived from the physical notion of transport and mixing, or no discretizations to finite scattered trajectory data have been developed. The purpose of this paper is thus twofold. On the one hand, we develop a dis- tance measure (a semidistance) between trajectories that is derived from the physical notion of transport and mixing subject to diffusion of vanishing strength, and we also derive a discretized distance measure for finite (also possibly sparse and incomplete) Lagrangian data that is consistent with its continuous counterpart in the limit of infinite data. On the other hand, we construct a tool to analyze with such distances the structure of the state space under transport, especially to find coherent sets. This tool makes use of the idea that coherent sets are some sort of extremal regions on a spanning structure with respect to transport, although in this work we will not investigate this “skeleton” in its entirety. Finite Time Coherent Sets Let us consider the ordinary differential equation (ODE) x˙ = v(t, x ) (1) t t on some bounded X ⊂ R and on a finite time interval [0, T ] for some T > 0. Throughout the paper, we will assume that v :[0, T]× X → R is a continuous velocity field tangential at the boundary, such that the flow of (1), denoted by φ [·], s,t 0 ≤ s, t ≤ T , is a diffeomorphism on appropriate subsets of X.For t < s we flow −1 backward in time: φ = φ . s,t t,s Many different notions to characterize coherent sets have been proposed in the liter- ature. Central to all of these notions is the idea that coherent sets should be robust under noise; without such a requirement, any non-intersecting characteristic of a singleton could be considered a coherent set. To this end, one typically perturbs the ODE (1)by a random noise (Denner et al. 2016; Froyland and Koltai 2017; Karrasch and Keller 2016), leading to the Itô stochastic differential equation (SDE) (ε) (ε) dx = v(t, x )dt + εdw , (2) t t We denote random variables by boldface symbols. 123 J Nonlinear Sci where {w } is a Wiener process (Brownian motion) with generator φ → φ, t t ∈[0,T ] reflecting boundaries, and starting from w = 0 (deterministically) and ε> 0is, at least for now, a given small constant. In fact, the rigorous mathematical formulation of an SDE with reflecting boundaries can be quite subtle, see Andres (2009). We ignore this issue as it does not affect our analysis. According to the definition of finite time coherent pairs (Froyland et al. 2010; Froyland 2013; Koltai et al. 2016), two sets A, B ⊂ X are coherent for times 0 and T if most mass from set A is likely to end up in set B, and most mass ending up in set B is likely to originate from set A, that is, (ε) (ε) (ε) (ε) P x ∈ B | x ∈ A ≈ 1, and P x ∈ A | x ∈ B ≈ 1. (3) T 0 0 T Naturally, for practical purposes one would need to choose how small ε and how large these probabilities should be. As the systems we are dealing with are often determin- istic by nature, and there is no “physically straightforward” choice of the diffusion strength ε, our first aim is to remove some of this indeterminacy by quantifying what it means for probabilities to be close to 1 for small ε, in terms of large deviations as we explain below. However, it turns out that the forward and backward conditions (3) are essentially equivalent in the large-deviation regime, and even worse, the large- deviation limits of (3) hardly give any quantitative information about how coherent two sets might be, as discussed in Appendix A. To conclude, the large deviations of conditions (3) do not yield sensible conditions for coherence. Large-Deviation-Based Semidistances In the current paper, we take a different approach. We study semidistances that quantify how unlikely it is for mass to flow from one point to another. These are semidistances in the sense that they satisfy all properties of a metric except for the triangle inequality. In the first part of this paper, in Sects. 2 and 3, we show how such semidistances can arise naturally from probabilistic arguments via large-deviation principles, as we explain below. In the second part, in Sects. 4 and 5, we discuss how (general) semidistances can be used to analyze coherent sets, and we apply the concepts of this paper to a number of examples. In Sect. 2, we derive two different semidistances from the large deviations of two (ε) probabilities. The first one is related to the probability that the endpoint x of the (ε) random path is φ [ y], given that it starts in x = x, for any two initial positions 0,T x , y ∈ X.As ε → 0, the process can no longer deviate from the deterministic flow of (1), and hence this probability will converge to 0 whenever x = y. In fact, it converges exponentially fast (Freidlin and Wentzell 1998), i.e., (ε) (ε) − μ (x → y) P x φ [ y]| x = x ∼ e (4) 0,T T 0 for some function μ (x → y) ≥ 0, where this statement and notation are made precise in Sect. 2.1. Such exponential convergence results are called large-deviation principles, A different way of factoring out diffusion to obtain coherent sets for deterministic flows appeared in the set-oriented transfer operator-based characterization in Froyland (2015); Froyland and Kwok (2016), leading to the notion of the dynamic Laplacian. See also our concluding remarks in Sect. 6.1. 123 J Nonlinear Sci cross Fig. 1 μ (x , y) is the cost to move from x to φ [ y] and 0,T from y to φ [x ] 0,T φ [y] 0,T φ [x] 0,T meet Fig. 2 μ (x , y) is the cost for two trajectories to meet φ [y] 0,T φ [x] 0,T and μ (x → y) is the large-deviation rate. The less probable it is to reach one point from another, the larger the rate between them. The first semidistance is then obtained via symmetrization: cross μ (x , y) := μ (x → y) + μ ( y→x ). (5) T T We call this the cross semidistance, since it arises from mass flowing from x to φ [ y] 0,T and mass flowing from y to φ [x ] simultaneously and independently, see Fig. 1. 0,T The second semidistance arises as the large deviations of the probability for two (ε) (ε) independent random trajectories x , y starting at x and y, respectively, to meet at or before time T (Fig. 2): 1 meet (ε) (ε) (ε) (ε) − μ (x , y) ε T P x y | x = x , y = y ∼ e , (6) T T 0 0 where the meeting semidistance is given by meet μ (x , y) := inf μ (x →z) + μ ( y→z). T T z∈ X cross meet By this procedure, we find two semidistances μ and μ that can be used as a T T measure of “farness” of points x , y, which will be low for points in the same coherent set, and high otherwise. Since both arise from large-deviation principles, they have a nice additional interpretation as a probabilistic cost or free energy that needs to be paid in order to deviate from the expected flows; such interpretation is common in statistical physics, see for example Onsager and Machlup (1953). 123 J Nonlinear Sci Nevertheless, we will see that in order to calculate these costs explicitly, the velocity field v needs to be known. As discussed above, this is in practice seldom the case; mostly one can only assume to have discrete-time snapshots of the positions of a limited number of floaters. With this in mind, we derive similar cost functions as above, that are based on such a finite data set only. This will be the content of Sect. 3. First, the dynamics is discretized in time and space by conditioning a usual time-stepping method for the SDE (2) on the event that the random continuous trajectories are to be found in the set of known floater positions at the K ∈ N given time instances. As above, we then cross meet derive two large-deviation semidistances ν (x , y) and ν (x , y) that have a clear K K probabilistic interpretation, that can be used to characterize coherent sets, and that are based on the finite data set rather than on the explicit velocity field. In fact, we will show that these discrete-space–time semidistances are really specific discretizations cross meet of the continuous-space–time semidistances μ and μ . As shown in Sect. 3.4, T T they can be computed as shortest path lengths in a time-dependent weighted graph. We give an algorithm to compute these shortest paths in Appendix B. Let us stress that these semidistances are defined for deterministic dynamical sys- tems. The random perturbation that is factored out by the large-deviation principle is merely acting as a catalyst to help quantify how strongly distinct trajectories mix—or, we should rather say how poorly, as the transport from one trajectory to another is inversely proportional to their semidistance. Coherence Analysis with Semidistances In Sect. 4, we describe how in general a semidistance on finite Lagrangian data can be used to analyze coherence. Key to our method is the notion of cornerstone: a point that is furthest away from all other points. Cornerstones are though of as “endpoints” of a spanning structure, and ideally each cornerstone is in some sense the center of a coherent set. As a next step, trajectories can be clustered around cornerstones to yield coherent sets. Of course, this approach is very close to the k-means- and fuzzy c-means clustering of trajectories with respect to dynamical distances in Hadjighasem et al. (2016); Froyland and Padberg-Gehle (2015), with the important difference that the centers are not chosen by the heuristics of these clustering approaches, but with regard to the properties of coherent sets in the light of transport and mixing. To exemplify the usefulness of the theory put forth in this paper, in Sects. 4 and 5 we test our approach on a number of standard test cases. Finally, Sect. 6 discusses possible combinations of this work with other concepts. 2 Large-Deviation Semidistances in Continuous Time and Space In this section, we study large deviations of the forms (4) and (6). In large-deviation theory, it is often easier to first study large deviations in a larger space. In our setting, we first study the large deviations of paths in Sect. 2.1 before transforming to the large deviations of the endpoints in Sect. 2.2. We end with a discussion of the resulting cross meet semidistances μ ,μ in Sect. 2.3. T T 123 J Nonlinear Sci 2.1 Large Deviations of Paths We denote paths by w to distinguish them from points w.Let P be the Wiener (·) measure, i.e., the probability that a Brownian path lies in a set U ⊂ C (0, T ; R ) is P[w ∈ U ]. Recall that there does not exist a canonical probability measure on the (·) space of paths, and so the Wiener measure cannot be identified with a meaningful density. This means that one always needs to consider sets rather than particular realizations of the Brownian path. Nevertheless, large-deviation rates are always local, in the sense that they depend on one realization only (the most likely one in the set U under consideration). This motivates writing w f if w lies in an infinitesimal (·) (·) (·) neighborhood U of the path f . We will make this more precise below. (·) The large deviations for the SDE (2) are a standard result by Freidlin and Wentzell (1998). This result can be derived via a combination of Schilder’s theorem and a contraction principle as we now explain. We first consider the noise part εw , which clearly converges (almost surely uni- formly) to the constant path 0 as ε → 0. The corresponding large-deviation principle is given by Schilder’s theorem (Dembo and Zeitouni 1987, Th. 5.2.3): −ε log P εw w −−→ |˙ w | dt, (7) (·) (·) t ε→0 for differentiable paths w starting from w = 0 (otherwise the limit will be ∞). (·) 0 Let us assume that the velocity field v(t, ·) is Lipschitz, so for each realization of (ε) the Brownian path w = w corresponds a unique solution x of the SDE, starting (·) (·) (·) (ε) from some given x = x, see (Øksendal 2003, Th. 5.2.1). The contraction principle (Dembo and Zeitouni 1987, Th. 4.2.1) then states that the large-deviation rate of a path x is given by the minimum of (7) over all realizations of the noise that give rise (·) to that path, i.e., (ε) −ε log P x x −−→ inf |˙ w | dt (·) t (·) ε→0 w :˙ x =v(t,x )+˙ w (·) t t t = |˙ x − v(t, x )| dt, (8) t t for differentiable paths x starting from x = x. (·) 0 2.2 Large Deviations of Endpoints We now derive the large-deviation principle of the type (4) as discussed in the Intro- duction. In a sense, the pathwise large deviations (8) encode more information than is (ε) needed if we are only interested in the endpoint x φ ( y) of the random path. 0,T 3 1 2 More rigorously, (7) means ε log P εw ∈ U − −− → − inf |˙ w | dt, where, for technical (·) w ∈U (·) 2 0 ε→0 reasons, this convergence is realized by a liminf lower bound for open sets U and limsup upper bound for closed sets U ; see Dembo and Zeitouni (1987). 123 J Nonlinear Sci Another application of the contraction principle then states that the large-deviation rate for the endpoint is the minimum of (8) over all paths starting from x and ending in that given endpoint φ [ y], that is: 0,T (ε) (ε) −ε log P x φ [ y]| x = x 0,T T 0 −−→ inf |˙ x − v(t, x )| dt =: μ (x → y). (9) t t T ε→0 x :x =x ,x =φ [ y] (·) 0 T 0,T This defines the “one-way” rate that we are after. cross The sum (5) then defines the cross-semidistance μ (x , y) and has a natural inter- pretation in terms of large deviations: As mentioned in the Introduction, it arises from (ε) (ε) two independent and simultaneous copies x , y . By independence, the probability t t (ε) (ε) (ε) (ε) that (x , y ) (φ [ y],φ [x ]) given (x , y ) = (x , y) is a product of one-way 0,T 0,T T T 0 0 probabilities, yielding the sum of two one-way rates in the large deviations, see Fig. 1. (ε) A similar argument can be used to derive the meeting large deviations (6). Let x (ε) and y be two independent solutions of the SDE (2), starting from given x and y, respectively. We consider the probability that both trajectories end in a given point, say φ [z] for some z ∈ R , see Fig. 2. Assuming independence of the two trajecto- 0,T ries, we immediately get (ε) (ε) (ε) (ε) − ε log P x φ [z], y φ [z]| x = x , y = y 0,T 0,T T T 0 0 (ε) (ε) (ε) (ε) =−ε log P x φ [z]| x = x − ε log P y φ [z]| y = y 0,T 0,T T 0 T 0 (9) −−→ μ (x →z) + μ ( y→z). T T ε→0 However, we are only interested in the probability that the two trajectories meet, and not in the point where they meet. A final contraction principle thus yields: (ε) (ε) (ε) (ε) −ε log P x y | x = x , y = y T T 0 0 meet −−→ inf μ x →z + μ y→z =: μ (x , y). (10) T T ε→0 z∈ X Observe that the two paths could also meet earlier and subsequently follow the same trajectory up until time T with zero cost; the time T thus acts as a maximum time at which the paths should meet. 2.3 The Semidistances We now discuss some metric properties of the rate functionals. Recall from Introduc- tion that we assumed that the flow is a diffeomorphism. Therefore μ (x → y) = 0if and only if x = y. It is then easy to see that, for any x , y, cross meet (i) μ (x , y) ≥ 0 and μ (x , y) ≥ 0, T T cross meet (ii) μ (x , y) = 0 ⇐⇒ x = y and μ (x , y) = 0 ⇐⇒ x = y, T T cross cross meet meet (iii) μ (x , y) = μ ( y, x ) and μ (x , y) = μ ( y, x ). T T T T 123 J Nonlinear Sci cross meet However, the triangle inequality can fail, and so μ and μ are semidistances T T only. We point out the following useful relation between the two. Observe that by meet the definition, μ (x , y) ≤ μ (x → y) + μ ( y, y) = μ (x → y), and similarly T T T meet μ (x , y) ≤ μ ( y→x ). Therefore, meet μ (x , y) ≤ min μ (x → y), μ ( y→x ) ≤ max μ (x → y), μ ( y→x ) T T T T cross ≤ μ (x , y). In order to investigate which semidistance is more suitable to study coherence, one cross meet would need to study in which setting the gap μ (x , y) − μ (x , y) becomes large. T T This is beyond the scope of this paper, but we will show for several examples that both work as they should. Remark 2.1 (Invariance under time reversal) Note the following invariance property of μ under time reversal: 2 ← − μ (x → y) = inf |˙ y + v(T − t, y )| dt =: μ φ [ y]→φ [x ] , T t t T 0,T 0,T y : y =φ [ y] 0 0,T (·) 0 −1 y =φ [φ [x ]] T 0,T 0,T ← − where μ is the one-way rate associated with the backward system y ˙ =−v(T −t, y ). T t t cross This time-reversal property is retained for the cross-semidistance: μ (x , y) = ← −cross meet μ φ [x ],φ [ y] . However, the meeting semidistance μ (x , y) = inf 0,T 0,T z T T ← − ← − μ z→φ [x ] + μ z→φ [ y] is the same as the cost for the backward tra- T 0,T T 0,T jectories to start in a joint position and end in φ [x ] and φ [ y]. 0,T 0,T 2.4 A Simple Example Let us consider a very simple example, where the domain of interest is the inter- val [0, L], and there is no dynamics, i.e., v ≡ 0. Its primary purpose is to form our intuition and expectations about how the semidistances work in more complicated settings. In particular, we shall see how the semidistances scale in time and system size. The system is considered on the time interval [0, T ]. One can then easily see that x˙ ≡ L/ T is an optimal path in (9), thus giving T 2 1 L L (0→L) = μ (L →0) = dt = , T T 2 T 2T 2 2 L L cross and so μ (0, L) = . Thus, also μ (0→L/2) = μ (L →L/2) = . In general, T T T T 8T the one-way cost is proportional to the squared distance and inversely proportional to time. This also gives meet μ (0, L) = , 4T 123 J Nonlinear Sci so, in this symmetric situation the meeting distance is half of one-way cost and quarter of the cross-semidistance. We will revisit this example in the next section and will realize that the behavior of the discrete semidistances deviates from the one observed here for continuous space and time. 3 Large-Deviation Semidistances in Discrete Time and Space cross meet As mentioned in the Introduction, the cost functions μ and μ are difficult T T to calculate explicitly, and impossible if the velocity or flow field is not explicitly known. In this section, we take a more practical approach. We will assume that the only information at hand is the position at finite times of a finite number I of floaters. (i ) To be more specific, let {x } ⊂ R be given positions of floaters i = k=0,...;K ,i =1,..., I 1,..., I at time kτ for k = 0,..., K for some τ> 0. Assuming that the floaters sample from the deterministic flow field φ , we know that for each floater i, s,t (i ) (i ) x = φ [x ] . (11) kτ,(k+1)τ k+1 k If we would add noise to the system, we would find random particles described by the set of SDEs (i,ε) (i,ε) (i ) (i,ε) (i ) dx = v(t, x )dt + εdw , x = x , for i = 1,..., I, (12) t t t 0 0 (i ) where w are now independent standard Brownian motions. Our strategy is to study the probability that random particles described by the SDEs (12) deviate from the given floater trajectories (11), conditional to the fact that all our knowledge about the otherwise unknown flow field φ comes from the time- s,t and space-discrete set of trajectory data (11). We first approximate the SDEs (12) by discrete-time, continuous-space Markov processes in Sect. 3.1, as it is done in standard time-stepping methods for SDEs (Kloeden and Platen 2010). Next, in Sect. 3.2 we condition these discrete-time processes on the given floater positions. Then we calculate the large-deviation rate for trajectories in Sect. 3.3, and for endpoints in Sect. 3.4. Finally, we end the section with a discussion of the metric properties of the resulting large-deviation rates in Sect. 3.5. 3.1 Discrete-Time Approximation We first focus our attention to one time step kτ → (k + 1)τ of one trajectory i, and temporarily drop the superindex for brevity. Since the noise process is a standard Brownian motion, we know its density, √ √ dP[ εw ∈ d y εw = x ] |x − y| (k+1)τ kτ −d/2 = (2πετ ) exp − . (13) d y 2ετ 123 J Nonlinear Sci Hence, we have exact information on the purely deterministic part of the SDE by (11) and on the purely noise part by (13). We combine this information by using the following time-stepping approximation for the SDE (2). + − Fix an α ∈[0, 1], and let (ξ ,ξ ) be independent normally distributed R - k=0,...,K k k valued random variables with unit variance. Given the approximated random position x ˜ at time kτ , we iterate + + x ˜ := x ˜ + ατ ε ξ k k − + x ˜ := φ [˜ x ] kτ,(k+1)τ (14) k+1 k − − x ˜ := x ˜ + (1 − α)τ ε ξ k+1 k+1 (k+1)τ + − Here, x ˜ and x ˜ are only auxiliary (intermediate) steps. The method (14) is a special k k+1 case of a splitting method, since the deterministic evolution and purely noise parts of the SDE (2) are handled separately in the distinct steps; here it would be “noise-flow- noise”. We would like to stress that our choice of discretization is made on the basis that we can use the available information on the flow given by (11). In the realm of one- step methods for SDEs we are bound to choices of the form (14), because there is no information on the drift other than the flow generated by it on prescribed time intervals [kτ, (k + 1)τ ). Given the form of time-stepping (14), the optimal [in the sense of highest weak consistency order Kloeden and Platen (2010)] approximation of the SDE is obtained by choosing α = 1/2. That is the so-called Strang-splitting (Strang 1968) and has weak order two, while for α = 1/2 we only get order one. Having performed discretization in time, in the next section we derive a discrete- time, discrete-space, α-dependent Markov chain that we will use to derive discrete semidistances. 3.2 Conditioning on Finite Data Recall that we considered one discrete-time process (x ˜ ) with initial condition k k=1,...,K (i ) x ˜ = x , and that we suppressed the dependency on i. For each k = 0,..., K,we introduce the set ( j ) A := {x } . k j =1,... I of available points at time t = kτ . We now condition the random process on the event that for each realization x ˜ ∈ A and for each intermediate point x ˜ ∈ A . k k k Formally, this can be seen by denoting the generators of the noise process and advection by A and B, respectively, and estimating the difference of the Markov propagators associated with (12)and (14)by performing formal Taylor expansions in τ with the non-commuting operators A and B, to obtain O(τ ), α = 1/2, τ( A+ B) (1−α)τ A τ B ατ A e − e e e = O(τ ), α = 1/2. 123 J Nonlinear Sci This automatically implies conditioning of the other intermediate points x ˜ ∈ A k+1 k+1 due to (11). We choose to condition on the intermediate points for practical reasons; otherwise, we would not be able to perform the second step in (14), since the discrete trajectories are our only information about the flow, cf. Remark 3.1 below. The conditioning on the finite data set results in replacing the discrete-time continuous-space process by a fully discrete-time discrete-space Markov chain that hops between the given trajectories. Therefore, the state of the new Markov chain can be represented by the labels j = 1,..., I ; this is particularly useful since the deterministic flow (11) will change the positions but not the labels. Since the resulting process is still Markovian, we can fully characterize its behavior through its transi- tion probabilities for one time-step k → k + 1. We now calculate these transition probabilities, dealing with each step in (14) separately. See Fig. 3 for a sketch. For the transition from x ˜ to x ˜ , where we know the increment distribution (13), note that we are in fact conditioning on a null set, so that the conditional probabilities are sensibly defined as the limits over balls B (·) of small radii r → 0 around these points. We thus obtain, for any j, = 1,... I : + + ( ) ( j ) + p ( j, ) := P x ˜ = x x ˜ = x and x ˜ ∈ A k k k k k k k ) ( j ) P x ˜ ∈ B (x ) x ˜ = x r k k k k = lim ( j ) r →0 P x ˜ ∈ B (A ) x ˜ = x r k k k k ) ( j ) exp − x − x /(2αετ ) k k =   , (15) I ( ) ( j ) exp − x − x  /(2αετ ) ˆ k k =1 and similarly for the transition from x ˜ to x ˜ : k+1 k+1 − (m) − ( p ( , m) := P x ˜ = x x ˜ = x and x ˜ ∈ A k+1 k+1 k+1 k+1 k+1 k+1 k+1 (m) ( exp − x − x / (2(1 − α)ετ ) k+1 k+1 = . (16) (m ˆ ) ( exp − x − x / (2(1 − α)ετ ) m ˆ =1 k+1 k+1 + − Since the transition from x ˜ to x ˜ is deterministic (middle equation in (14)), we k k+1 have that, (m) ( j ) + − P ( j, m) := P x ˜ = x x ˜ = x and x ˜ ∈ A , x ˜ ∈ A k k+1 k k k+1 k+1 k k k+1 + − = p ( j, ) p ( , m). (17) k k+1 =1 In words, the process performs the following three subsequent steps for one time step (see Fig. 3): 123 J Nonlinear Sci Fig. 3 One time step of the discrete-time discrete-space Markov chain i i = m k+1 p ( ) φ k+1 kτ,(k+1)τ p ( ) i = j A k+1 ( j ) ) (k,+) 1. Start in x , and perform a jump to some x with probability p , k k j, ) ( 2. Perform a deterministic jump from x to x , k k+1 ) (m) (k+1,−) 3. Perform a jump from x to x with probability p . k+1 k+1 ,m The transition probabilities P ( j, m) define our new, discrete-time Markov chain (i ) on the discrete space {1,..., I }. To shorten notation, we will write k k=0,...,K i := (i ) for a discrete path, analogous to the continuous-time setting. By the (·) k k=0,...,K Markov property, the probability that the Markov chain realizes such a path is simply K −1 P i = i = P (i , i ), (18) (·) (·) k k k+1 k=0 where we assumed that the chain starts (deterministically) from i . 3.3 Large Deviations of Discrete Trajectories We now study the large deviations of the discrete Markov chain i . Similarly to the continuous setting from Sect. 2, we start from the large deviations of paths. First we + − calculate the large deviations for p ( j, ) and p ( , m). By the Laplace princi- k k+1 ple (27), ⎛ ⎞ 2 2 I ( ) ( j ) ( ) ( j ) x − x  − x − x (15) k k k k ⎝ ⎠ −ε log p ( j, ) = ε log exp 2αετ =1 ) ( j )  ( ) ( j ) ) ( j ) x − x  − x − x |x − x | k k k k k k −−→ max = . 2ατ 2ατ ε→0 =1,..., I We will make this simplification again below. Similarly, we obtain 123 J Nonlinear Sci (m) ( |x − x | (16) − k+1 k+1 −ε log p ( , m) −−→ . k+1 2(1 − α)τ ε→0 Using these two exponential approximations, we can again use the Laplace princi- ple (27) to find for the jump probability of one time step: (17) + − lim −ε log P ( j, m) = lim −ε log p ( j, ) p ( , m) k k+1 ε→0 ε→0 =1 + − = min lim −ε log p ( j, ) − ε log p ( , m) k k+1 =1,..., I ε→0 ) ( j ) (m) ( 2 2 |x − x | |x − x | k k k+1 k+1 = min + . 2ατ 2(1 − α)τ =1,...,m Finally, the large-deviation rate of a discrete path is K −1 (18) −ε log P i = i =−ε log P (i , i ) (·) (·) k k k+1 k=0 K −1 (i ) (i ) ( ) k+1 k 2 2 |x − x | |x − x | k+1 k+1 k k −−→ min + := J (i ), (·) 2ατ 2(1 − α)τ ε→0 =1,..., I k=0 (19) Remark 3.1 Recall that we conditioned on the event that all x ˜ as well as the interme- diate points x ˜ lie in the set A of available points. One might argue that in practice only the points x ˜ are measured to lie in A , while the other two are mathematical k k constructs that may lie anywhere. However, if we would relax this conditioning and follow the calculations as above, we would find: (m) ( j ) x − φ [x ] |x − x | t ,t k+1 k k+1 −ε log P ( j, m) −−→ min + d 2ατ 2(1 − α)τ ε→0 x ∈R (m ˆ ) x − φ [x ] t ,t k k+1 k+1 − min . 2(1 − α)τ m ˆ =1,..., I Since this large-deviation rate still depends on the unknown flow field φ, it cannot be used if only the data of a finite number of floaters is available. Remark 3.2 (Missing data and non-uniform time-sampling) Note that the construction works exactly as described above even if information about trajectories is partially missing. The conditioning on the set A works identically, but now these sets might have different cardinalities smaller or equal I . Observe that our only information about the deterministic flow for times in [kτ, (k + 1)τ ) comes from those trajectories that are available both in A and A . If this intersection k k+1 is empty, we need to skip that time slice completely. This is not a problem, since our choice of sampling time uniformly by the step size τ was solely in order to ease 123 J Nonlinear Sci presentation. As the reader has probably observed, the extension for varying time steps τ is straightforward. 3.4 Large Deviations of Endpoints Analogously to the continuous setting, we study the large deviations of the one-way probability to hop from i to j in discrete time K , and the meeting probability that two independent chains, starting from i and j, respectively, meet by discrete time K or earlier. Since the paths (19) encode more information than the endpoints, we can now easily derive the large deviations of the one-way probability by a contraction principle. Indeed, for any two indices i, j = 1,..., I , −ε log P[i = j | i = i ] −−→ min J (i ) =: ν (i → j ), (20) K 0 (·) K ε→0 i : i =i,i = j (·) 0 K where J is the discrete-path large-deviation rate (19). Note that J is the shortest path length in a graph with time-dependent edge weights ( j ) ) (i ) ( 2 2 |x − x | |x − x | k k k+1 k+1 w (i, j ) = min + . 2ατ 2(1 − α)τ =1,..., I cross Again, the sum ν (i, j ) := ν (i → j ) + ν ( j →i ) can be given an interpretation K K in terms of large deviations as in Sect. 2.2. Moreover, following the same argument as in (10), if we take two independent trajectories i and j , then (·) (·) meet −ε log P[i = j | i = i, j = j]−−→ min ν (i → ) + ν ( j → ) =: ν (i, j ). K K 0 0 K K ε→0 =1,..., I 3.5 The Semidistances It is easily checked that in the discrete setting the properties of a semidistance are also satisfied: cross meet (i) ν (i, j ) ≥ 0 and ν (i, j ) ≥ 0, K K cross meet (ii) ν (i, j ) = 0 ⇐⇒ i = j and ν (i, j ) = 0 ⇐⇒ i = j, K K cross cross meet meet (iii) ν (i, j ) = ν ( j, i ) and ν (i, j ) = ν ( j, i ). K K K K Furthermore, the triangle inequality fails, but we again have the following estimate: meet cross ν (i, j ) ≤ min ν (i → j ), ν ( j →i ) ≤ max ν (i → j ), ν ( j →i ) ≤ ν (i, j ). K K K K K K Both semidistances can be computed from shortest path costs, where the cost of a path is given by (19). We stress that this expression is fairly simple and depends on the flow field through the known positions of the floaters x only. Because of this: (1) these costs can be used in practice if the velocity field is unknown (Sects. 4 and 5); (2) these costs can even be applied to cases where there may not be an underlying velocity field, as for example in discrete-time dynamical system (Sect. 4.1). 123 J Nonlinear Sci Fig. 4 Hopping back and forth (solid line) between two given trajectories (dashed lines) (j) (i) These semidistances can be computed by first computing the one-way rates ν (i → j ) using Algorithm 1, see Appendix B. From these rates, one readily obtains the semidistances via cross meet ν (i, j ) = ν (i → j ) + ν ( j → i ) and ν (i, j ) K K K K = min ν (i → ) + ν ( j → ). K K =1,..., I Remark 3.3 (Time reversal for discrete semidistances) Similarly to Remark 2.1,the ← − one-way cost satisfies the time-reversal property ν (i → j ) = ν ( j → i ), provided K K ← − α = 1/2, where ν is the cost associated with the backward dynamics. Moreover, this time-reversal property also holds for the cross-semidistance, whereas for the meeting ← − ← − meet semidistance ν (i, j ) = min ν ( → i ) + ν ( → j ). Apart from the =1,..., I K K superior consistency order discussed in Sect. 3.1, the invariance of semidistances under time reversal is another reason for choosing α = 1/2. Remark 3.4 Other large-deviation-based semidistances are also possible. If one con- siders the “noise-flow” (i.e., α = 1) time-stepping scheme for the SDE rather than “noise-flow-noise”, expression (19) simplifies a bit. As another example of a large- (i ) ( j ) deviation-based semidistance between two given discrete paths {x , x } , one k=0,...K k k could consider the probability to hop back and forth between the two trajectories, see Fig. 4. In that case, we find in the large-deviation scaling for α = 1: K −1 (i ) ( j ) |x − x | k k −ε log P[i = j , i = i , ··· | i = i ] −−→ , (21) 1 1 2 2 0 0 ε→0 2τ k=0 for α = 0 the sum would go from k = 1to K . Naturally, this is simply the L -distance between two trajectories, as considered earlier in Froyland and Padberg-Gehle (2015). Although this construction is very easy to calculate and its square root is a genuine metric, it is less interpretable as a cost for transport and mixing. meet cross Remark 3.5 It should be noted that the semidistances ν ,ν scale quadratically K K in space; this becomes even more apparent in the example considered in Sect. 3.7. In the case of the L -distance (21), the cost becomes a genuine distance after taking meet cross the square root. However, if we take the square roots of ν and ν , the triangle K K 123 J Nonlinear Sci inequality still fails. We therefore stick to the quadratic scaling as this has the most direct interpretation as large-deviation costs. Remark 3.6 (Eulerian transport vs Lagrangian mixing) When speaking of transport in this paper, we mean “transport (of probability) from a trajectory to another”, to express how the dynamics is mixing up regions these two trajectories come in contact with. This can be seen as a Lagrangian perspective. We express with large-deviation rates the unlikeliness of transitions between trajectories, and these are then computed as shortest paths, cf. Sect. 3.4. Deceivingly similar mathematical constructions show up in Ser-Giacomi et al. (2015), where the authors consider “highly probable paths” of non-homogeneous Markov chains, which also leads to a time-dependent shortest path problem. Note, however, that this is orthogonal to our concept, as this is quantifying likeliness. A further important distinction is that their Markov chain is constructed in an Eulerian manner (opposed to our Lagrangian setting), meaning that it describes transport between fixed regions of state space; serving as a discretization of the flow field (Froyland et al. May 2007, 2010). 3.6 Discretization of the Continuous Semidistances We now show that the one-way discrete-space–time cost ν can also be obtained by discretizing the continuous-space–time cost μ . This means that discretization and derivation of the large-deviation principle are interchangeable operations (if done the right way). We will not be precise about the discretization error; of course one needs to assume that the number of floaters is sufficiently large. K −1 We first divide the time interval into subintervals [0, T ) = [kτ, (k + α)τ ) ∪ k=0 [(k + α)τ, (k + 1)τ ). Recall that φ is the flow associated with v(t, ·), that is, for t ,t any t , x, ∂ φ [x]= v t,φ [x ] . t t ,t t ,t 0 0 Note in what follows that x is some path, not necessarily a trajectory of the flow. In (·) each interval [kτ, (k + α)τ ), we approximate by finite differences: x − x x − φ [x ] (k+α)τ kτ (k+α)τ (k+α)τ,kτ (k+α)τ x˙ ≈ and v(t, x ) ≈ . t t ατ ατ In each interval [(k + α)τ, (k + 1)τ ), we approximate: x − x φ [x ]− x (k+1)τ (k+α)τ (k+α)τ,(k+1)τ (k+α)τ (k+α)τ x˙ ≈ and v(t, x ) ≈ . t t (1 − α)τ (1 − α)τ Because of the assumption that the flow is one-to-one, we can always write x = (k+α)τ φ [ˆ x ] for some xˆ . We thus obtain: kτ,(k+α)τ k k K −1 T 2 2 |x −ˆ x | |x − φ [ˆ x ]| 1 2 kτ k (k+1)τ kτ,(k+1)τ k x˙ − v(t, x ) dt ≈ + . t t 2ατ 2(1 − α)τ k=0 123 J Nonlinear Sci (i ) (i ) Since the number of floaters {x } is large, we can find an x close k=0,...,K ;i =1,..., I k k to xˆ ,giving (i ) ( j ) (i ) ( j ) μ (x → x ) ≈ inf |˙ x + v(t, x )| dt : x = x , x = x , x ∈ A , T t t 0 T kτ k 0 K 0 K x := φ [x ]∈ A (k+α)τ,kτ (k+α)τ ) k (i ) (i ) K −1 ( ) k+1 ( k 2 2 |x − x | |x − x | k k k+1 k+1 ≈ min min + i :i =i,i = j =1,..., I 2ατ 2(1 − α)τ (·) 0 K k=0 = ν (i → j ). This shows that we can either derive the large-deviation rate function in continuous space and discretize this to finite trajectories (as done here), or we can restrict the continuous dynamics to finite trajectory data and derive a large-deviation rate function for that (as done above); we obtain consistent results whichever route we take. 3.7 The Simple Example Revisited Let us now demonstrate how the results of this section apply to the example of Sect. 2.4. Discrete Time and Continuous Space Let us first suppose we are given infinitely many “trajectories” of the system, one starting at each point x ∈[0, L], and they are sampled at discrete time points kτ , k = 0, 1,..., K , with τ = . From Sect. 3.3 with α = 1/2 we obtain, by writing x = , that 2 2 2 (x ) (L/K ) L 1 1 ν (0→L) = = K · = , 2 2 τ T /K 2T k=1 where we used that the optimal discrete path in (20) is the one making jumps of equal lengths x. Note that the rate function is identical to that in the fully continuous case. Analogously, ν (0→L/2) = ν (L/2→L) = , and generally, if |x − y|= K K 8T δ, then ν (x → y) = . The derived semidistances scale similarly. Note that the 2T semidistances converge to zero as T →∞. Discrete Time and Space If we are given a finite number I of equispaced trajectories of this system sampled at the same times as in the previous paragraph, the virtual random walker cannot make arbitrarily small jumps as in the continuous state case, thus 2 2 (x ) L 1 k ν (0→L) = ≈ , if K ≤ I , τ 2T k=1 123 J Nonlinear Sci −1 since we can take x ≈ L/K with error O( I ) as I grows. However, if K > I , the smallest jumps are x = , thus 2 2 (L/ I ) L K ν (0→L) = I · = . τ 2IT Thus, if the observation time of trajectories grows and they are still observed at the same rate (i.e., τ stays constant), the semidistances saturate at and do not converge I τ to zero as in the continuous time case. Moreover, to reach y = L/2 from x = 0, we still cannot make smaller jumps than x = , but now we only require only I /2 I (L/ I ) L of them, such that we obtain ν (0→L/2) = ν (L/2→L) = · = (for K K 2 τ 4 I τ even I , and vanishing error for odd I as I grows). The main lesson is that while in the continuous space case halving the Euclidean distance makes the semidistance scale by , if the spatial resolution of trajectories is coarse, the discrete semidistance scales only by . In general, if |x − y|= δ, then on a coarse resolution grid it takes about jumps to travel between these two points, δ (x ) x and we obtain ν (x → y) ≈ · = δ · . Note that x and τ are constant x τ τ quantities, and thus the one-way discrete cost scales linearly in the Euclidean distance between the two points, as opposed to quadratic scaling in the continuous space case. 4 Coherence Analysis with Semidistances Let us assume that we are given a set of discrete time and space trajectories ( j ) {x } , and a (semi)distance d. We now describe how such semidis- j =1,..., I,k=0,...,K tances can be used to distinguish and analyze coherent sets from the finite data. We shall work with an unspecified semidistance d, but of course the semidistances that we cross meet have in mind are ν and ν that we derived in the previous section. Other—not K K large-deviation based—distance measures could be used just as well, as we discuss below. Nevertheless, the semidistances should not be completely arbitrary; we assume meet cross that they share the behavior of ν and ν that we discuss in Sects. 4.1 and 4.2. K K To illustrate the ideas, we first analyze the behavior of two one-dimensional proto- typical examples. These examples show the difference between two types of regions: “mixing” and “static” (also known as “regular”). In these one-dimensional and simple examples, one can easily determine the regions and whether they are mixing or static from the semidistances. One example has two invariant sets under the dynamics, which is (measure-theoretically and topologically) mixing on both of them. The other has two static regions, where the mutual physical distance of trajectories does not change under the dynamics, and these regions are separated by a third, mixing region. After this we proceed with a more involved model: a two-dimensional periodically forced double-gyre flow, where the boundaries of the separate regions are no longer as clear-cut as in the one-dimensional example. Nevertheless, we will show that one can identify the separate regions via the tools that we present next. To this end, we introduce the notion of cornerstones, representing possible coherent sets or mixing regions, and then discuss how to find them and when to stop searching 123 J Nonlinear Sci φ(x) φ(x) x x Fig. 5 Left: the time-discrete flow map with two invariant mixing subdomains. Right: the time-discrete flow map with two static regions and an invariant mixing subdomains between them for them. Finally, to obtain coherent sets, we assign the trajectories to cornerstones. The notion of fuzzy affiliations will be used to express the uncertainty whether a trajectory close to the boundary of a set belongs to it or not. Note that in the case of finite data such an uncertainty is always present. 4.1 Two Illustrative Model Cases Two Invariant, Mixing Subdomains As mentioned in Sect. 3.5, we may also apply the techniques developed in this paper to a discrete-time dynamical system. To gain some intuition for the behavior of the semidistances at mixing regions, we consider the discrete-time system on the unit interval X =[0, 1] and one-step flow map, see Fig. 5 (left), 1 1 4x mod , x < 2 2 φ(x ) = 1 1 1 1 4(x − ) mod + , x ≥ . 2 2 2 2 1 1 −1 The sets X =[0, ], X = ( , 1] are invariant, i.e., φ ( X ) = X 1 2 1 1 2 2 −1 and φ ( X ) = X , and φ is simply the circle-quadrupling map on each of these 2 2 sets, i.e., it is mixing on the single components. Consequently, t t lim inf φ (x ) − φ ( y) =0(22) t ∈N, t →∞ for (Lebesgue-)almost every pair x , y ∈ X , i = 1, 2. If a system ( X,ψ,μ) is mixing, then ( X × X,ψ ×ψ, μ×μ) is ergodic (Walters 2000, Theorem 1.24) . Thus, for μ × μ-almost every pair (x , y), the trajectory (ψ × ψ) (x , y) will enter every set A of nonzero measure for some t ≥ 0. This shows (22) by taking ψ = φ| or ψ = φ| ,and A ={(x , y) ||x − y| <ε} (0,1/2) (1/2,1) for any fixed ε> 0. 123 J Nonlinear Sci 0.8 0.6 0.4 0.2 0 20 40 60 80 100 120 140 160 180 200 Fig. 6 Two trajectories of the map φ of length 200 steps, starting in X and X , respectively. Theory shows 1 2 that they come arbitrary close, eventually. Here they get the closest at time step 193, shown by a circle Thus, ν (i → j ) → 0as K →∞, because if x , x are both in X or both in X , K i j 1 2 then (22) shows that their trajectories get arbitrarily close eventually. If the trajectories start in different halves of [0, 1], then t 1 t 1 lim inf φ (x ) − + φ (x ) − = 0 , (23) i j 2 2 t ∈N, t →∞ thus the jump from one trajectory to another gets arbitrarily cheap. See Fig. 6. The transport semidistances between any two points within the same region are very small, at least if the time window is large enough. This behavior is typical for mixing regions. In fact, since the two mixing regions are only separated by one point, it is relatively cheap to move from one region to the other, and so the semidistances between two points in separate regions converge with increasing time to zero. Nevertheless, the semidistances still detect a difference between the two invariant sets: The semidistance between two trajectories in the same invariant component goes in general quicker to zero than the one between two from different components, as shown in Fig. 7 for I = 100 initially equispaced trajectories. Thus, it is the relative difference between the semidistances that is relevant for the transport structure of the state space, and not the absolute values. Two Static Regions Divided by a Mixing One To gain some intuition about static regions, let us now consider the discrete-time system on X =[0, 1] given by 1 3 x , x ∈[0, ) ∪ ( , 1] 4 4 φ(x ) = 1 1 1 1 3 2(x − ) mod + , x ∈[ , ] , 4 2 4 4 4 see Fig. 5 (right). This map has three invariant sets. The left and right ones are static, such that the mapping restricted to them is the identity, and are meant to model regions 6 t Applying Footnote 5 to the case where ψ = φ| , we obtain that (φ| × φ| ) (x , y) enters (0,1/2) (0,1/2) (0,1/2) 1 1 A ={(x , y) ||x −1/2|+| y| <ε} eventually. Noting that φ| = φ| (·− )+ , the claim follows. (1/2,1) (0,1/2) 2 2 Comparing the different slopes in Fig. 7, based on the reasoning in Footnote 5 and here we conjecture that the minimal distance of two trajectories decays faster in the case when they both start in the same invariant set, because the set {(x , y) ||x − y| <ε} is larger in measure than {(x , y) ||x − 1/2|+| y| <ε}. 123 J Nonlinear Sci -2 -2 10 10 -3 -3 10 10 -4 -4 10 10 -5 -5 10 10 -6 -6 10 10 -7 -7 10 10 0 1020304050 0 1020304050 K K cross meet Fig. 7 Semidistances ν (i → j ) (left) and ν (i, j ) (right) for increasing maximal time K,averaged K K over x , x ∈ X (downward-pointing triangles), x , x ∈ X (upward-pointing triangles), and x , ∈ i j 1 i j 2 i X , x ∈ X (circles), respectively. Note that the decrease in the distance is much slower for trajectories 1 j 2 taken from different invariant sets Fig. 8 One-way cost ν (i → j ) for i = 1 (left) and i = 51 (right) for the map with two static and one mixing region of the state space in complicated flows that are “static” in the sense that the mutual dis- tance of points is not changed (or just barely) by the dynamics. We will consider these as one kind of prototype for coherent sets. The third region is mixing and physically separates the other two. We take I = 100 initially equispaced trajectories and compute the one-way costs ν (i →·) with K = 50 for i = 1 and i = 51, respectively, shown in Fig. 8. From our analysis in Sect. 3.7, we would have expected to see quadratic growth of the one-way cost with respect to physical distance in the static regions, but we only observe linear growth. This is due to the finite number of considered trajectories, as also explained in the second paragraph of Sect. 3.7. All points of the mixing region have almost the same cost from any one point in the static regions, and approximately zero cost from one another. To obtain the cost between two points of different static regions, one has to consider the cost to go to the boundary of the static and mixing regions (linear cost in Euclidean distance), travel on a trajectory from there to the boundary of the other static region (at zero cost), and then go from there to the desired 123 J Nonlinear Sci Fig. 9 Sketch of the velocity field of the periodically forced double-gyre flow at two different times. The horizontal axis is x , and the vertical is x 1 2 point. (Again, linear cost in Euclidean distance that needs to be covered.) Thus, the cost (and semidistance) between these two points is the sum of their one-way cost (and semidistance) to the mixing region, provided the time of consideration is sufficiently large for the mixing to take place. Since our fictive random walker uses trajectories of the mixing region to travel from one static region to the other, we will also call it transition region henceforth. To conclude, from Fig. 8 we can easily identify three separate regions, and from the steepness of the slopes (linear/quadratic or flat), we can determine wether a region is static or mixing. As the next example shows, this distinction is usually not as clear as in these constructed examples, but the main ideas will be based on this observation. 4.2 The Periodically Forced Double Gyre Let us now consider the non-autonomous system x˙ = v(t, x ) on X =[0, 2]×[0, 1] t t with (Froyland and Padberg-Gehle 2014) −π A sin (π f (t, x )) cos(π x ) 1 2 v(t, x ) := , (24) d f π A cos (π f (t, x )) sin(π x ) (t, x ) 1 2 1 dz where f (t, z) = β sin(ωt )z + (1 − 2β sin(ωt ))z. We fix the parameter values A = 0.25, β = 0.25 and ω = 2π; hence, the vector field has time period 1. The system preserves the Lebesgue measure on X. Equation (24) describes two counter-rotating gyres next to each other (the left one rotates clockwise), with the vertical boundary between the gyres oscillating periodically, see Fig. 9. We choose a uniform 50 × 25 grid as initial conditions for the floaters at time t = 0; i.e., I = 1250. We sample the trajectories of these floaters at times t = kτ , k = 0, 1,..., K , where K = 100 and τ = 0.2. That means, the length of trajectories in consideration is 20 times the period of the forcing. Employing our large-deviation-based distance computations on this data set using Algorithm 1 and α = 1/2, we get the one-way costs ν (i → j ), i, j = 1,..., I,from cross meet which we compute ν (i, j ) and ν (i, j ). K K Note that for this argument, ergodicity of the dynamics in the “mixing” region would be sufficient, since one only needs to travel “from one static region to the other”. The crucial additional property we get from mixingness is that the mutual semidistances of points in this region go to zero. 123 J Nonlinear Sci -1 0.045 0.04 0.035 0.03 0.025 -2 0.02 0.015 0.01 0.005 -3 0 10 0 1 2 3 4 0 200 400 600 800 1000 1200 10 10 10 10 10 meet Fig. 10 Left: ν (c , ·) sorted in ascending order. Right: the same as left, on a log-log scale. The horizontal axis shows the rank, and the vertical shows the semidistance value As a first simple analysis, we can order the points by their semidistances to the center of one gyre, see Fig. 10. Here and in the following, the rates and semidistances will be always given in units 1/τ . On a log-log scale, the slope 1/2 (square-root-type behav- ior) indicates that most trajectories in the gyre are approximately concentric circular regions around the center. Since the semidistances grow linearly in the Euclidean distance inside the gyre, we see that we are in the low-resolution regime discussed in Sect. 3.7. Analogously to Fig. 8, we can again (vaguely) distinguish three regions: a steep (square-root-type) region, a flat region, and another steep (flipped square-root- type) region. As before, the flat region is typically strongly mixing, and the steep regions are static. We shall make this distinction more precise in the next sections. Remark 4.1 (Three-dimensional flows) Clearly, the scaling behavior shown in Figs. 8 and 10is dimension-dependent. For a three-dimensional static region, one would see a slope 1/3 on a log–log scale. The question remains, how does a typical coherent set behave there; can it be modeled by a static region? If an incompressible flow rotates uniformly in a plane, it necessarily has a constant shifting motion in the perpendicular axial direction, leading to cylindrical vortices. If the cylindrical rings of a vortex rotate at different angular frequencies, the flow speed along axial directions is non-uniform, and mixing-type behavior occurs in the vortex (Halász et al. 2007). We leave the analysis of such systems to future work and proceed with analyzing different aspects of prototypical two-dimensional flows here. 4.3 Cornerstones To start the analysis of the state space under a semidistance d, we randomly choose a trajectory, represented by a label c ∈{1,..., I }, and compute the trajectory furthest from it, i.e, we set 8 2 As on a regular grid there are O(δ ) points not further than Euclidean distance δ from a reference point, 1/2 the r-th closest point to the reference point has distance O(r ). 123 J Nonlinear Sci 0.045 0.04 c , c 0.035 1 2 c , c , c 1 2 3 0.03 0.025 0.02 0.015 0.01 0.005 0 200 400 600 800 1000 1200 Fig. 11 The objective functions of the maximization problem in (25), sorted in ascending order (yellow meet and purple). Blue and red: ν (c , ·), i = 1, 2 (Color figure online) c = arg max d(i, c ). 1 0 i =1,..., I To find a set of points that “spans” the state space, we identify successively further trajectories that are far away from all the other already identified “corner- stones” {c } ,asinRüdrichetal. (2017): q q=1,..., Q c = arg max min d(i, c ). (25) Q+1 q q=1,..., Q i =1,..., I Observe that in this optimization problem we ignore the first, randomly chosen tra- jectory c ; hence, the set of cornerstones {c } will be less dependent on this 0 q q=1,..., Q randomness. Moreover, even if the first trajectory c would represent a coherent set, the algorithm will eventually provide a new cornerstone in that set, which lies closer to the semidistance center of that set. For the double gyre and the meeting distance, we identified three cornerstones. The objective function of the maximization problem (25) is plotted in Fig. 11;this yields a similar but more detailed picture as Fig. 10. Note that the chaotic, well-mixed transition region appears as flat region in these distance graphs, and the gyres appear as steep regions toward the maxima of the respective graphs. That the chaotic region is well mixed, and has no stratification (invariant rings as the gyres), can be seen from its flat behavior toward its maximum. The forth cornerstone is part of a gyre, and it starts to stratify it. Nevertheless, its distance to the other corners is much smaller. To get a first glimpse of the separate regions in the state space, we have plotted the semidistances from each cornerstone in Fig. 12, both at the initial and final times. Note that since we work with trajectory labels rather than physical positions, the semidistances are invariant in time, whereas the physical positions of the floaters change over time. From these figures, one can approximately identify the two static 123 J Nonlinear Sci 1 1 0.04 0.04 0.8 0.8 0.03 0.03 0.6 0.6 0.02 0.02 0.4 0.4 0.01 0.01 0.2 0.2 0 0 0 0 0 0.5 1 1.5 2 0 0.5 1 1.5 2 1 1 0.04 0.04 0.8 0.8 0.03 0.03 0.6 0.6 0.02 0.02 0.4 0.4 0.01 0.01 0.2 0.2 0 0 0 0 0 0.5 1 1.5 2 0 0.5 1 1.5 2 1 1 0.8 0.8 0.015 0.015 0.6 0.6 0.01 0.01 0.4 0.4 0.005 0.005 0.2 0.2 0 0 0 0 0 0.5 1 1.5 2 0 0.5 1 1.5 2 meet Fig. 12 Distances ν from trajectory c (top), c (middle) and c (bottom), marked by the magenta 1 2 3 circle, at initial (left) and final times (right). The semidistances are given in units 1/τ . The horizontal axis is x , and the vertical is x (Color figure online) 1 2 (gyre) regions, being very close and very far from c and c , respectively, and the 1 2 chaotic transition region in between, having approximately constant distance from c and c , cf. (Froyland and Padberg-Gehle 2014, Figure 1) . 4.4 Number of Cornerstones How to determine the number of cornerstones that should be used? Is there an optimal number, or is it up to our liking? In the case of the double gyre, as noted above, a fourth cornerstone would be part of one of the gyres, and assigning affiliations would thus split one gyre into two sets. If the gyres would consist of a continuum of periodic orbits, then we could proceed and split them this way into as many rings as we like. The same situation in an idealized framework appears in Sect. 4.1 for the static regions: since they are static, arbitrary subsets are perfectly coherent (even invariant in this case). A good place to stop searching for further cornerstones would be when they would start to subdivide “maximal coherent” sets, as the gyres in the double-gyre example, or the static sets in the second, and the invariant sets in the first example of Sect. 4.1. To this end we make an idealized assumption: Coherent sets appear as for the second example in Sect. 4.1, i.e., multiple static regions divided by one mixing region. Here, “static” is meant in the sense that the mutual distances between points in the set barely change. Such an assumption was also utilized in (Hadjighasem et al. 2016). 123 J Nonlinear Sci 1 1 0.8 0.8 0.6 0.6 0.4 0.4 0.2 0.2 0 0 0 0.5 1 1.5 2 00.5 11.5 2 meet Fig. 13 The trajectories closest in terms of ν to one of the cornerstones than to the others. Left: initial time, right: final time. The horizontal axis is x , the vertical is x 1 2 Note that if there are C ≥ 2 coherent sets, the first C corner stones are going to be in them, one in each. This is due to the fact that to move from the center of one static region to another, the shortest path in (20) needs to move out of one set, travel in the transition region to the other set, and move to its center, hence maximizing the minimal distance to all other cornerstones. After finding all static regions, the next cornerstone is to be found in the transition region, if all static regions are approximately of the same size—which we assume here. The crucial observation is that this (C + 1)-st cornerstone is half as far from the other cornerstones, as they are from one another. In other words, d(c , c ) + d(c , c ) ≈ d(c , c ), i, j ≤ C. i C +1 j C +1 i j To summarize, our simple check when to stop searching for cornerstones is going to be, when the value of the objective function in (25) drops by at least a factor two compared with the previous value. Observe how nicely this works in the periodically forced double-gyre case: the rightmost points of the curves in Fig. 11 are the optimizers, and the corresponding value of the yellow curve is less than half of the values for the first two cornerstones. This indicates to stop with three cornerstones, as they will represent both the gyres and the transition region. 4.5 Clustering and Fuzzy Affiliations To get an even more crisp picture of the subdivision of the state space into regions which are far away in terms of the semidistance d, we assign to each cornerstone c , c , c the 1 2 3 trajectories that are closer to them than to the two other cornerstones, respectively. For the periodically forced double gyre and the meeting distance this is shown in Fig. 13. Comparing this picture with the typical trajectories of the time-1 Poincaré map of the system [again, see Froyland and Padberg-Gehle 2014, Figure 1], it appears that the gyre regions in our figure are smaller. This is due to the nature of the transport distance at hand: the gyres are partly made up of so-called regular regions of the Poincaré map, meaning that typical trajectories move on periodic orbits that are approximately concentric circular lines. Transport between these trajectories is only possible through diffusion, and the price one has to pay for this transport in radial direction is reflected by the rate function (recall, this is what we model by the static regions in Sect. 4.1). The Here we assume that we are in the coarse spatial resolution case, where the semidistances scale linearly and not quadratically, cf. Sect. 3.7. Otherwise, the drop in the distance is more than a factor two (toward factor four). 123 J Nonlinear Sci 1 1 1 1 1 1 0.8 0.8 0.8 0.8 0.8 0.8 0.6 0.6 0.6 0.6 0.6 0.6 0.4 0.4 0.4 0.4 0.4 0.4 0.2 0.2 0.2 0.2 0.2 0.2 0 0 0 0 0 0 0.5 1 1.5 2 0 0.5 1 1.5 2 00.5 11.5 2 Fig. 14 Fuzzy affiliations q (·) of the trajectories to the three cornerstones, c , c , c (from left to right) c 1 2 3 for fuzziness exponent m = 2, shown at initial time. The horizontal axis is x , the vertical is x 1 2 cost to get from the center of the gyre (the cornerstone c or c ) to a regular trajectory 1 2 in the same gyre is proportional to the “radial distance” between them (compare with the static part of the second example in Sect. 4.1). This behavior is not characteristic for the well-mixed transition region, because there the dynamics (eventually) brings any two trajectories close to each other. The effect is most prominent if the time frame of consideration grows infinitely large, and on our finite time horizon it appears as a flattening of the curve. This brings us back to why the blue and green regions in Fig. 13 are smaller than gyres in the Poincaré map. The answer is simply, because the outer periodic orbits are closer to the transition region than to the center of the gyre, hence also closer to the cornerstone c that is in the transition region, because the points in the transition region have very small distance from one another. Instead of a hard clustering we can assign the trajectories to the cornerstones by fuzzy affiliations q (·), to obtain more refined information on coherence. For instance, let m > 1, and minimizing the affiliation-weighted penalty function m 2 q ( j ) d(c , j ) c i j =1 i =1 subject to the constraints 0 ≤ q for i = 1,..., and q ( j ) = 1 for every j = c c i i =1 i 1,..., I , yields q ( j ) = . (26) m−1 d(c , j ) k=1 d(c , j ) This is the affiliation function in the fuzzy c-means algorithm (Bezdek 1981), giv- ing q ( j ) = 1 ⇔ d(c , j ) = 0, i.e., affiliation is maximal if the distance is minimal. c i Further, the parameter m controls the fuzziness of the clustering: large m gives soft clusters, while m approaching 1 gives more and more “crisp” clusters as the affiliations converge either to 0 or to 1 (Bezdek et al. 1987). The resulting affiliations (indicated at initial time) for m = 2 are shown in Fig. 14.For m close to 1 we obtain affiliations very similar to the hard clusters in Fig. 13. 5 Numerical Results In the previous section we already presented numerical results for the double-gyre system, where we used the results to motivate and develop the analysis tools. In 123 J Nonlinear Sci this section we apply these tools to two other well-analyzed test cases: the perturbed Bickley Jet and the rotating (transitory) double gyre. They are different paradigmatic examples, as the Bickley Jet has a non-vortex coherent set (the jet core), and the transitory double gyre is not a periodically forced system, thus genuinely living on a finite time interval. Let us also point out that the examples presented here and in the previous section are all one- or two-dimensional. Although we expect the analysis in higher dimensions to be at least qualitatively not very different from the two-dimensional case, dealing with the nevertheless arising subtle differences (see Remark 4.1) is beyond the scope of this conceptual work. cross meet It turns out that the choice between the two semidistances ν or ν has only K K marginal influence on the results. In this section we shall mostly work with the cross- semidistance. 5.1 The Bickley Jet We consider a perturbed Bickley Jet as described in Rypina et al. (2007). This is an idealized zonal jet approximation in a band around a fixed latitude, assuming incom- pressibility, on which three traveling Rossby waves are superimposed, see Fig. 15.The ∂ ∂ dynamics is given by x˙ = v(t, x ) with v(t, x ) = (− , ) and stream function t t ∂ x ∂ x 2 1 (t, x , x ) =−U L tanh x /L + U L sech x /L A cos k x − c t . ( ( )) 1 2 0 2 0 2 n n 1 n n=1 The constants are chosen as in Rypina et al. (2007, Section 4) . In particular, we set k = 2n/r with r = 6.371, U = 5.414, and L = 1.77. The phase speeds c n e e 0 n of the Rossby waves are c = 0.1446U , c = 0.205U , c = 0.461U , their ampli- 1 0 2 0 3 0 tudes A = 0.0075, A = 0.15, and A = 0.3, as in Hadjighasem et al. (2016). The 1 2 3 system is considered on a state space X =[0,π r ]×[−3, 3] which is periodic in the horizontal x coordinate. We choose a uniform 60 × 18 grid as initial conditions for the floaters at time t = 0; i.e., I = 1080. We sample the trajectories of these floaters at times t = kτ , k = 0, 1,..., K , where K = 80 and τ = 0.5. In this time interval, typical trajectories cross the cylindrical state space horizontally 4–5 times, trajectories in the jet core (the wavy structure in Fig. 15) up to 9 times. Employing our large-deviation-based distance computations on this data set using Algorithm 1 and α = 1/2, we get the rates ν (i → j ), i, j = 1,..., I . From these Fig. 15 Sketch of the Bickley Jet flow field at two different times. The flow pattern travels from left to right on the horizontally periodic domain. The horizontal axis is x , the vertical is x 1 2 123 J Nonlinear Sci cross rates we readily obtain the ν (i, j ) via cross ν (i, j ) = ν (i → j ) + ν ( j →i). K K We repeat the cornerstone finding analysis from the previous section. The optimal values of the objective function in the cornerstone finding problem (25) are for 8 cornerstones, in order: 2.06, 3.21, 2.45, 2.33, 2.30, 2.14, 1.42, 0.70. Recall that the first value is with respect to a random cornerstone c that we discard. These numerical values with our previous analysis shed light on the topological struc- ture of the state space with respect transport and mixing. Note, that our assumption from Sect. 4.4, that all coherent sets are divided by one mixing region, is not satisfied: the jet core is a coherent set itself, dividing two mixing regions (below and above it), each containing 3 further coherent sets (the gyres). Thus, c and c have maximal 1 2 cross distance (ν (c , c ) = 3.21), because the random walker needs to cross the jet 1 2 core. Every further cornerstone c ,..., c can be reached from either c or c through 3 6 1 2 one of the mixing regions, and thus have a very similar cost. The deviation of these costs, 2.14–2.45, shows that we did not reach the state of full mixing on the chosen time interval. Now, the seventh cornerstone lies in the jet core, which has to be crossed if traveling between cornerstones that are below and above it, respectively. The corresponding cost (1.42) is a bit larger than half of the previous cost, because c does not lie on the shortest path between cornerstones below and above the jet core. Intuitively, the “center line” of the jet core should be equally far from all cornerstones c ,..., c , 1 6 if the time interval is large enough such that the regions around the gyres are truly mixing. Since it is not, there are points on the boundary of the jet core which are easier to reach from them, and thus easier to cross there. The cornerstone c represents the position where it is the hardest to cross. The eighth cornerstone has truly half the semidistance to the closest one than c , and lies in one of the mixing regions. We show our results for seven cornerstones. The semidistances are shown in Fig. 16. The corresponding fuzzy affiliations from (26)for m = 1.1are showninFig. 17. They show a very crisp distinction of the six gyres from the rest of the state space. The bottom right figure shows the affiliation q (·) for m = 1.9, which suggests that the region around the gyres could still be partitioned into coherent sets itself: the jet core appears more strongly affiliated to this cornerstone than the other trajectories. It is not surprising that we could not see this for m = 1.1, since the closer m is to 1, the more “crisp” the affiliation function is forced to be, and the mixing region is more easily reached from the thin jet core than from the gyres. If we include additional cornerstones, results tend to deteriorate due to the low resolution and because the chosen time interval is not giving full mixing. 123 J Nonlinear Sci 3 3 2 2 2 2 0 0 1 1 -2 -2 0 0 0 5 10 15 20 0 5 10 15 20 3 3 2 2 2 2 0 0 1 1 -2 -2 0 0 0 5 10 15 20 0 5 10 15 20 3 3 2 2 2 2 0 0 1 1 -2 -2 0 0 0 5 10 15 20 0 5 10 15 20 1.5 0.5 -2 0 5 10 15 20 Fig. 16 Row-wise from top left to bottom: the identified corner stores c , i = 1,..., 7, (magenta circles) cross and their distances ν (c , ·) to the other trajectories, at initial time. The cornerstones are located in the six gyres and the central jet region. The distances are given in units 1/τ . The horizontal axis is x ,the vertical is x (Color figure online) 1 1 2 2 0 0.5 0 0.5 -2 -2 0 0 0 5 10 15 20 05 10 15 20 1 1 2 2 0 0.5 0 0.5 -2 -2 0 0 0 5 10 15 20 05 10 15 20 1 1 2 2 0.5 0.5 0 0 -2 -2 0 0 0 5 10 15 20 05 10 15 20 1 1 2 2 0 0.5 0 0.5 -2 -2 0 0 0 5 10 15 20 05 10 15 20 Fig. 17 Row-wise from top left to bottom: the fuzzy affiliations (26) of the trajectories at time t = 5tothe cornerstones c ,..., c , respectively (magenta circles). Bottom right: affiliation q (·) for m = 1.9, which 1 7 c suggests that the 7th coherent region could contain a coherent set itself: the jet core. The horizontal axis is x , the vertical is x (Color figure online) 1 2 123 J Nonlinear Sci Fig. 18 Sketch of the flow field of the rotating double gyre at initial (left) and final (right) times. The horizontal axis is x , the vertical is x 1 2 5.2 The Rotating Double Gyre Let us consider a prototype for a system, where transport is considered only on a limited time interval. The rotating double-gyre system (Mosovsky and Meiss 2011) is given by the stream function ψ(t, x , x ) = (1−s(t ))ψ (x , x )+s(t )ψ (x , x ), with s(t ) = 1 2 P 1 2 F 1 2 t (3 −2t ), ψ (x , x ) = sin(2π x ) sin(π x ), and ψ (x , x ) = sin(π x ) sin(2π x ), P 1 2 1 2 F 1 2 1 2 and is considered on the state space X =[0, 1] and time interval t ∈[0, 1].The two gyres, which initially occupy the left and right halves of the unit square, turn during this time by π/2 to occupy the top and bottom halves at final time, see Fig. 18. We choose a uniform 30 ×30 grid as initial conditions for the floaters at time t = 0; i.e., I = 900. We sample the trajectories of these floaters at times t = kτ , k = 0, 1,..., K , where K = 100 and τ = 0.01. We employ the cross-semidistance, and start our cornerstone search. The first three values of the optimization problem (25)are 0.0274, 0.0474, 0.0262. We identify the significant drop after two corner stones, hence we expect two coherent sets with one mixing region dividing them. The drop in the distance is by a factor 0.55, which is not below one half, the reason for this being again that the time interval of consideration is not sufficient for perfect mixing of the transition region. The semidistances from the three identified cores and the affiliations to these cores cross for exponent m = 1.2 are shown in Figs. 19 and 20, respectively. Although both ν meet and ν have been shown to be able to detect coherent sets, we demonstrate their different nature by showing the shortest paths in the respective distance in Fig. 21. Finally, we demonstrate the approach for a scattered set of sparse data points, taking 400 initial points randomly distributed in X, and repeating the analysis for their trajectories. We show the resulting fuzzy affiliations in Fig. 22. 6 Discussion and Outlook 6.1 The Dynamic Laplacian Froyland (2015) has introduced the dynamic Laplacian as a transport-related tool to find coherent sets. Similarly to our approach, it makes use of a small random perturbation of size ε, then ε is driven to zero. 123 J Nonlinear Sci 1 1 1 0.045 0.045 0.025 0.9 0.9 0.9 0.04 0.04 0.8 0.8 0.8 0.035 0.035 0.02 0.7 0.7 0.7 0.03 0.03 0.6 0.6 0.6 0.015 0.025 0.025 0.5 0.5 0.5 0.02 0.02 0.4 0.4 0.4 0.01 0.015 0.015 0.3 0.3 0.3 0.01 0.01 0.2 0.2 0.2 0.005 0.005 0.1 0.1 0.005 0.1 0 0 0 0 0 0 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 1 1 1 0.045 0.045 0.025 0.9 0.9 0.9 0.04 0.04 0.8 0.8 0.8 0.035 0.035 0.02 0.7 0.7 0.7 0.03 0.03 0.6 0.6 0.6 0.015 0.025 0.025 0.5 0.5 0.5 0.02 0.02 0.4 0.4 0.4 0.01 0.015 0.015 0.3 0.3 0.3 0.01 0.01 0.2 0.2 0.2 0.005 0.005 0.005 0.1 0.1 0.1 0 0 0 0 0 0 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 cross Fig. 19 From left to right: the identified corner stores c , i = 1,..., 3, (magenta circles) and their distances ν (c , ·) to the other trajectories, at initial time (top) and final i i time (bottom). The distances are given in units 1/τ . The horizontal axis is x , the vertical is x (Color figure online) 1 2 J Nonlinear Sci 1 1 1 1 1 1 0.9 0.9 0.9 0.9 0.9 0.9 0.8 0.8 0.8 0.8 0.8 0.8 0.7 0.7 0.7 0.7 0.7 0.7 0.6 0.6 0.6 0.6 0.6 0.6 0.5 0.5 0.5 0.5 0.5 0.5 0.4 0.4 0.4 0.4 0.4 0.4 0.3 0.3 0.3 0.3 0.3 0.3 0.2 0.2 0.2 0.2 0.2 0.2 0.1 0.1 0.1 0.1 0.1 0.1 0 0 0 0 0 0 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 1 1 1 1 1 1 0.9 0.9 0.9 0.9 0.9 0.9 0.8 0.8 0.8 0.8 0.8 0.8 0.7 0.7 0.7 0.7 0.7 0.7 0.6 0.6 0.6 0.6 0.6 0.6 0.5 0.5 0.5 0.5 0.5 0.5 0.4 0.4 0.4 0.4 0.4 0.4 0.3 0.3 0.3 0.3 0.3 0.3 0.2 0.2 0.2 0.2 0.2 0.2 0.1 0.1 0.1 0.1 0.1 0.1 0 0 0 0 0 0 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 Fig. 20 The fuzzy affiliations computed with m = 1.2 to the cornerstones c (top) and c (bottom), at times t = 0, 0.5, 1, from left to right, respectively. The horizontal axis 1 2 is x , the vertical is x 1 2 J Nonlinear Sci 1 1 1 0.9 0.9 0.9 0.8 0.8 0.8 0.7 0.7 0.7 0.6 0.6 0.6 0.5 0.5 0.5 0.4 0.4 0.4 0.3 0.3 0.3 0.2 0.2 0.2 0.1 0.1 0.1 0 0 0 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 Fig. 21 Shortest paths from cornerstone c to c (left), from c to c (middle), and the meeting paths with the shortest joint length (right). The brighter the color of a path 1 2 2 1 segment, the larger the cost of that transition. For those segments for which the path crosses from one trajectory to another we show with a dashed segment how the former trajectory would have continued. The starting point of the path is indicated by a circle, and the endpoint by a triangle. The horizontal axis is x , the vertical is x 1 2 J Nonlinear Sci 1 1 1 1 1 1 0.9 0.9 0.9 0.9 0.9 0.9 0.8 0.8 0.8 0.8 0.8 0.8 0.7 0.7 0.7 0.7 0.7 0.7 0.6 0.6 0.6 0.6 0.6 0.6 0.5 0.5 0.5 0.5 0.5 0.5 0.4 0.4 0.4 0.4 0.4 0.4 0.3 0.3 0.3 0.3 0.3 0.3 0.2 0.2 0.2 0.2 0.2 0.2 0.1 0.1 0.1 0.1 0.1 0.1 0 0 0 0 0 0 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 1 1 1 1 1 1 0.9 0.9 0.9 0.9 0.9 0.9 0.8 0.8 0.8 0.8 0.8 0.8 0.7 0.7 0.7 0.7 0.7 0.7 0.6 0.6 0.6 0.6 0.6 0.6 0.5 0.5 0.5 0.5 0.5 0.5 0.4 0.4 0.4 0.4 0.4 0.4 0.3 0.3 0.3 0.3 0.3 0.3 0.2 0.2 0.2 0.2 0.2 0.2 0.1 0.1 0.1 0.1 0.1 0.1 0 0 0 0 0 0 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 Fig. 22 The rotating double gyre with randomly chosen 400 initial points. The fuzzy affiliations computed with m = 1.2 to the cornerstones c (top) and c (bottom), at 1 2 times t = 0, 0.5, 1, from left to right, respectively. The horizontal axis is x , the vertical is x 1 2 J Nonlinear Sci Numerical methods so far discretize directly the dynamic Laplacian (Froyland and Junge 2015; Banisch and Koltai 2017; Froyland and Junge 2017). In light of our analysis, which can be used both ways (derive the large-deviation principle in continuous space, then discretize it to finite trajectories, cf. Sect. 3.6, or discretize the dynamics to finite trajectories, then derive the large-deviation principle on them, cf. Sect. 3.3), we ask whether there is a discrete dynamic Laplacian that can be derived from a discretization of the perturbed dynamics? Mimicking the construction in Froyland (2015) and sketching the idea while skip- I × I ping details, one should construct a discrete, ε-dependent transfer operator T ∈ R , that represents transition probabilities of a forward-backward process, then obtain a discrete dynamic Laplace operator L := T . A discrete transfer opera- dyn ε dε ε=0 tor T that is a consistent approximation of the continuous dynamics can be obtained by a construction as in Sect. 3.2, by using the transition probabilities (17). Tech- nical details aside, we see that the probabilities are linear combinations of terms −x /ε of the form e , where x here is a formal distance term that appears in the formulas. Differentiation with respect to ε immediately yields that all off-diagonal entries (basically, where x > 0) of L are zero, in fact the matrix is the iden- dyn tity. Thus, this approach of discretizing the dynamics first, and then factoring out the ε-small stochastic perturbation does not give a dynamically meaningful result. In analytic terms the very same problem occurred in a different attempt to introduce a discrete dynamic Laplacian from a discrete transfer operator, see (Banisch and Koltai 2017, Section IV) . In general, it would be desirable to understand when and how can the “first discretize, then factor out ε” methods work, such that they can complement the methods that directly discretize the (continuous) dynamic Laplace operator. 6.2 Other Distance Measures The time-dependent shortest path problem used to compute our semidistances is com- putationally demanding in our current algorithmic realization, which theoretically limits the number of trajectories that can be handled. Moreover, they do not satisfy the triangle inequality, hence they are not a metric. Although numerical efficiency is not the main focus of this paper, and we demonstrated the usefulness of our semidis- tances in unraveling the underlying dynamical structure of the example systems, a more cheaply computable metric would enhance the utility and significance of the analysis methods presented here. Ultimately, one would like to understand the intrinsic, possibly low-dimensional geometric organization of the state space with respect to transport and mixing, as pioneered in Banisch and Koltai (2017). Employing proper metrics would allow, e.g., the usage of low-dimensional embedding techniques, such as multidimensional scaling, to represent and better understand this geometric organization. One canon- ical candidate would be the metric structure related to the dynamic Laplacian, considered in Karrasch and Keller (2016). This will be subject of future stud- ies. 123 J Nonlinear Sci To summarize, although other distance measures could be used to analyze compli- cated dynamic behavior, we showed that the semidistances we derived in this paper from the physical notion of transport and mixing in the vanishing diffusion setting are natural and effective diagnostic tools. Acknowledgements This work is supported by the Deutsche Forschungsgemeinschaft (DFG) through the Priority Programme SPP 1881 “Turbulent Superstructures”, and through the CRC 1114 “Scaling Cascades in Complex Systems”, projects A01 and C08. Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 Interna- tional License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. A Large Deviations of the Forward-Backward Conditions In this appendix we explore the conditions (3) in the large-deviation regime. The argument is based on the Laplace Principle, which states that for any measure ρ and function f : − f (x ) lim −ε log e ρ(dx ) = inf f (x ). (27) x ∈supp ρ ε→0 As in (9), (ε) (ε) −ε log P x y | x = x T 0 −−→ inf |˙ x − v(t, x )| dt =: λ (x → y), (28) t t T x :x =x ,x = y ε→0 (·) 0 T where, contrary to (9), the symbol y now denotes a position at time T , that is, μ (x →x˜ ) = λ (x →φ [˜ x ]) = λ (x → y). T T 0,T T Fix an ε-independent initial probability measure ρ (dx ) = P[x ∈ dx ].For the 0 0 large deviations of the forward condition in (3), it follows from the Laplace principle that (ε) fw J ( B| A) := lim −ε log P x ∈ B | x ∈ A T T ε→0 (ε) = lim −ε log P x ∈ d y | x = x ρ (dx ) 0 0 ε→0 B A (28) − λ (x → y) = lim −ε log e ρ (dx ) ε→0 B A (27) = inf inf λ (x → y). y∈ B x ∈ A∩supp ρ Observe that since the initial distribution ρ is independent of ε, it only appears in the large deviations through its support supp ρ . 123 J Nonlinear Sci The large deviations of the backward conditions in (3) can be calculated analo- gously, but now the conditioning does depend on ε. By Bayes’ rule, the rate function of the backward condition is bw (ε) J ( A| B) := lim −ε log P[x ∈ A | x ∈ B] T T ε→0 P[x ∈ A] (ε) 0 = lim −ε log P[x ∈ B | x ∈ A] T (ε) ε→0 P[x ∈ B] (ε) = lim −ε log P x ∈ d y | x = x ρ (dx ) 0 0 ε→0 B A (ε) + ε log P x ∈ d y | x = x ρ (dx ) − ε log ρ ( A) 0 0 0 (27,28) = inf inf λ (x → y) − inf inf λ (x → y) T T x ∈supp ρ y∈ B x ∈ A∩supp ρ y∈ B 0 fw fw = J ( B| A) − J ( B| X ). T T If we assume that that there is at least one admissible path x that starts in supp ρ (·) 0 fw fw bw and ends in B, then in fact J ( B| X ) = 0, and so J ( B| A) = J ( A| B). T T T These calculations have two important implications. First, observe that while the (ε) (ε) forward and backward probabilities P[x ∈ B | x ∈ A] and P[x ∈ A | x ∈ B] are 0 0 T T not equal in general, the forward and backward rate functions are. The same argument even holds if we shrink the sets A ad B down to single points x and y; in that case we obtain for the “backward rates” that (ε) λ (x ← y) := lim −ε log P x x | x = y = λ (x → y), T 0 T ε→0 cf. Remark 2.1. Apparently, in the large-deviation scaling it does not matter whether we consider the forward or the backward process. Since the forward condition fw J ( B| A) ≈ 0 in itself does not hold enough information to characterize coher- bw ence and the backward condition J ( A| B) ≈ 0 does not add information, these conditions are not helpful to characterize coherence. fw Secondly, we see that J ( B| A) = 0 as soon as φ [ A ∩ supp ρ ]∩ B =∅. 0,T 0 fw Naturally, there are many such pairs A, B, and the set function J does not give any quantitative information about which pairs are more coherent than others. Because of this, the large deviations of the forward and backward conditions (3)are even less useful to identify coherent sets. We start to gain useful information about coherence, if there are at least two coherent fw fw pairs, say A , B and A , B . Then the rates J ( B | A ) and J ( B | A ) are in 1 1 2 2 2 1 1 2 T T general large, since coherence of the respective set pairs dictate that it is very unlikely to encounter paths from pair #1 to pair #2. Using these rates as measures of farness is one idea this paper exploits. 123 J Nonlinear Sci B Algorithm: Shortest Path in Time-Dependent Graphs Since we could not find an algorithm suited to our purpose, we describe in this appendix a solution we came up with to solve the problem of finding shortest paths in a graph with time-dependent nonnegative edge weights. Transition is only possible between nodes that are connected by an edge of positive weight. There are several solutions to the shortest path problem for time-independent graphs, such as Dijkstra’s algorithm (Dijkstra 1959)orthe Floyd–Warshall algorithm (Floyd 1962). Each of them use in some sense a “monotonicity” argument, namely, that sub-paths of shortest paths are shortest paths themselves. This does not hold for time-dependent graphs, because at every step that we make the environment might change completely, and the number of steps we can make is limited by the number of time instances of the graph. We propose the following algorithm to compute shortest paths from a specific node s to all other nodes. Note that we can stay in a node for any time at zero cost. The weight of the transition i → j at time t is denoted by w (i → j ). Algorithm 1 Shortest distance in time-dependent graphs 1: R ={s}, R =∅ (reached states at times 0 and 1) ol d new 2: dist(s) = 0, dist(i ) =∞ for i = s 3: for t = 1,..., T do 4: while R =∅ do ol d 5: v = arg max dist(i ) i ∈ R ol d 6: R ← R \{v} ol d ol d 7: for j : w (v → j)< ∞ do 8: if dist(v) + w (v → j)< dist( j ) then 9: dist( j ) = dist(v) + w (v → j ) 10: R ← R ∪{ j } new new 11: R = R new ol d It is important to have the max on line 5, since if we does not start the update procedure at the node which has the maximal distance, then we might erroneously cut off nodes that could still be reached from it. Algorithm 1 can clearly be extended to keep track of the shortest path as well. The distance of a node j is updated to a smaller one, whenever there is a path through some other node v that is shorter than the previous one (line 10). Hence, the new candidate shortest path is the one leading to v and then jumping to j in the current time step. This is implemented in Algorithm 2 (line 12). Herein, path(i → j ) is the shortest path from node i to node j, such that path (i → j ) is the node the walker resides in at time t = 0, 1,..., T while going through the shortest path, and path (i → j ) = i.If there is no path from i to j, then path(i → j ) is the zero vector. We use 1 : k to denote the index set 1, 2,..., k. It should be mentioned here that every time-dependent shortest path problem can be rephrased as a time-independent problem by considering each node at each time point as a distinct node of a large graph, and then it could be solved by standard methods. We do not take this approach here, as it might introduce memory requirement issues. 123 J Nonlinear Sci Algorithm 2 Shortest path in time-dependent graphs 1: R ={s}, R =∅ (reached states at times 0 and 1) new ol d 2: dist(s) = 0, dist(i ) =∞ for i = s T +1 3: path(s→ j ) = 0 ∈ R for all j, path (s→s) = s 4: for t = 1,..., T do 5: while R =∅ do ol d 6: v = arg max dist(i ) i ∈ R ol d 7: R ← R \{v} ol d ol d 8: for j : w (v → j)< ∞ do 9: if dist(v) + w (v → j)< dist( j ) then 10: dist( j ) = dist(v) + w (v → j ) 11: path (s→ j ) = path (s→v), path (s→ j ) = j 1:t −1 1:t −1 t 12: R ← R ∪{ j } new new 13: path (s→s) = s 14: R = R ol d new References Allshouse, M.R., Peacock, T.: Lagrangian based methods for coherent structure detection. Chaos: an Inter- disciplinary. J. Nonlinear Sci. 25(9), 97617 (2015) Allshouse, M.R., Thiffeault, J.-L.: Detecting coherent structures using braids. Physica D: Nonlinear Phenom. 241(2), 95–105 (2012) Andres, S.: Diffusion processes with reflection. PhD thesis, TU Berlin (2009) Balasuriya, S., Froyland, G., Santitissadeekorn, N.: Absolute flux optimising curves of flows on a surface. J. Math. Anal. Appl. 409(1), 119–139 (2014) Banisch, R., Koltai, P.: Understanding the geometry of transport: diffusion maps for Lagrangian trajectory data unravel coherent sets. Chaos: an Interdisciplinary. J. Nonlinear Sci. 27(3), 035804 (2017) Bezdek, J.C.: Pattern Recognition with Fuzzy Objective Function Algorithms. Kluwer Academic Publishers, Dordrecht (1981) Bezdek, J.C., Hathaway, R.J., Sabin, M.J., Tucker, W.T.: Convergence theory for fuzzy c-means: counterex- amples and repairs. IEEE Trans. Syst. Man Cybern. 17(5), 873–877 (1987) Budišic, ´ M., Mezic, ´ I.: Geometry of the ergodic quotient reveals coherent structures in flows. Physica D: Nonlinear Phenom. 241(15), 1255–1269 (2012) Dellnitz, M., Junge, O.: On the approximation of complicated dynamical behavior. SIAM J. Numer. Anal. 36, 491–515 (1999) Dellnitz, M., Froyland, G., Horenkamp, C., Padberg-Gehle, K., Gupta, A.S.: Seasonal variability of the subpolar gyres in the Southern Ocean: a numerical investigation based on transfer operators. Nonlinear Proces. Geophys. 16, 655–664 (2009) Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications, vol. 38, 2nd edn. Springer, New York (1987) Denner, A., Junge, O., Matthes, D.: Computing coherent sets using the Fokker–Planck equation. J. Comput. Dyn. 3(2), 163–177 (2016) Dijkstra, E.W.: A note on two problems in connexion with graphs. Numer. Math. 1(1), 269–271 (1959) Fabregat , A., Mezic, I., Poje, A. C.: Finite-time partitions for Lagrangian structure identification in Gulf Stream eddy transport. (2016). arXiv preprint arXiv:1606.07382 Floyd, R.W.: Algorithm 97: shortest path. Commun. ACM 5(6), 345 (1962) Freidlin, M., Wentzell, A.: Random Perturbations of Dynamical Systems, 2nd edn. Springer, Berlin (1998) Froyland, G., Junge, O.: Robust FEM-based extraction of finite-time coherent sets using scattered, sparse, and incomplete trajectories. (2017). arXiv preprint arXiv:1705.03640 Froyland, G., Kwok, E.: A dynamic Laplacian for identifying Lagrangian coherent structures on weighted Riemannian manifolds (2016). arXiv preprint arXiv:1610.01128 Froyland, G., Padberg-Gehle, K.: Almost-invariant and finite-time coherent sets: directionality, duration, and diffusion. In: Ergodic Theory, Open Dynamics, and Coherent Structures, pp. 171–216. Springer (2014) 123 J Nonlinear Sci Froyland, G.: An analytic framework for identifying finite-time coherent sets in time-dependent dynamical systems. Physica D: Nonlinear Phenom. 250, 1–19 (2013) Froyland, G.: Dynamic isoperimetry and the geometry of Lagrangian coherent structures. Nonlinearity 28(10), 3587 (2015) Froyland, G., Junge, O.: On fast computation of finite-time coherent sets using radial basis functions. Chaos: an Interdisciplinary. J. Nonlinear Sci. 25(8), 087409 (2015) Froyland, G., Koltai, P.: Estimating long-term behavior of periodically driven flows without trajectory integration. Nonlinearity 30(5), 1948 (2017) Froyland, G., Padberg-Gehle, K.: A rough-and-ready cluster-based approach for extracting finite-time coher- ent sets from sparse and incomplete trajectory data. Chaos: an Interdisciplinary. J. Nonlinear Sci. 25(8), 087406 (2015) Froyland, G., Padberg, K., England, M.H., Treguier, A.M.: Detection of coherent oceanic structures via transfer operators. Phys. Rev. Lett. 98, 224503 (2007) Froyland, G., Santitissadeekorn, N., Monahan, A.: Transport in time-dependent dynamical systems: finite- time coherent sets. Chaos: an Interdisciplinary. J. Nonlinear Sci. 20(4), 043116 (2010) Froyland, G., Horenkamp, C., Rossi, V., van Sebille, E.: Studying an Agulhas ring’s long-term pathway and decay with finite-time coherent sets. Chaos 25(8), 083119 (2015) Hadjighasem, A., Karrasch, D., Teramoto, H., Haller, G.: Spectral-clustering approach to Lagrangian vortex detection. Phys. Rev. E 93(6), 063107 (2016) Halász, G., Gyüre, B., Jánosi, I.M., Szabó, K.G., Tél, T.: Vortex flow generated by a magnetic stirrer. Am. J. Phys. 75(12), 1092–1098 (2007) Haller, G., Beron-Vera, F.: Coherent Lagrangian vortices: the black holes of turbulence. J. Fluid Mech. 731, R4 (2013). https://doi.org/10.1017/jfm.2013.391 Haller, G.: Finding finite-time invariant manifolds in two-dimensional velocity fields. Chaos: an Interdisci- plinary. J. Nonlinear Sci. 10(1), 99–108 (2000) Haller, G.: Distinguished material surfaces and coherent structures in three-dimensional fluid flows. Physica D 149(4), 248–277 (2001) Haller, G., Beron-Vera, F.J.: Geodesic theory of transport barriers in two-dimensional flows. Physica D: Nonlinear Phenom. 241(20), 1680–1702 (2012) Karrasch, D., Keller, J.: A geometric heat-flow theory of Lagrangian coherent structures (2016). arXiv Preprint arXiv:1608.05598 Karrasch, D.: Lagrangian transport through surfaces in volume-preserving flows. SIAM J. Appl. Math. 76(3), 1178–1190 (2016) Kloeden, P.E., Platen, E.: Numerical Solution of Stochastic Differential Equations, 3rd edn. Springer, Berlin (2010) Koltai, P., Ciccotti, G., Schütte, C.: On metastability and Markov state models for non-stationary molecular dynamics. J. Chem. Phys. 145(17), 174103 (2016) MacKay, R., Meiss, J., Percival, I.: Transport in Hamiltonian systems. Physica D: Nonlinear Phenom. 13(1–2), 55–81 (1984) Meiss, J.: Symplectic maps, variational principles, and transport. Rev. Mod. Phys. 64(3), 795 (1992) Mezic, ´ I., Banaszuk, A.: Comparison of systems with complex behavior. Physica D: Nonlinear Phenom. 197(1), 101–133 (2004) Mezic, ´ I., Wiggins, S.: A method for visualization of invariant sets of dynamical systems based on the ergodic partition. Chaos: an Interdisciplinary. J. Nonlinear Sci. 9(1), 213–218 (1999) Mosovsky, B.A., Meiss, J.D.: Transport in transitory dynamical systems. SIAM J. Appl. Dyn. Syst. 10(1), 35–65 (2011) Øksendal, B.: Stochastic Differential Equations—An Introduction with Applications, 6th edn. Springer, Berlin (2003) Onsager, L., Machlup, S.: Fluctuations and irreversible processes. Phys. Rev. 91(6), 1505–1512 (1953) Padberg, K., Hauff, T., Jenko, F., Junge, O.: Lagrangian structures and transport in turbulent magnetized plasmas. New J. Phys. 9, 400 (2007) Padberg-Gehle, K., Schneide, C.: Network-based study of Lagrangian transport and mixing. Nonlinear Proces. Geophys. Discuss. 1–14, 2017 (2017) Rom-Kedar, V., Wiggins, S.: Transport in two-dimensional maps. Arch. Ration. Mech. Anal. 109(3), 239– 298 (1990) 123 J Nonlinear Sci Rüdrich, S., Sarich, M., Schütte, C.: Utilizing hitting times for finding metastable sets in non-reversible markov chains. To appear in J. Comput. Dyn. (2017) Preprint https://opus4.kobv.de/opus4-zib/ frontdoor/index/index/docId/5120 Rypina, I.I., Pratt, L.J.: Trajectory encounter volume as a diagnostic of mixing potential in fluid flows. Nonlinear Proces. Geophys. 24(2), 189 (2017) Rypina, I., Brown, M., Beron-Vera, F., Kocak, H., Olascoaga, M., Udovydchenkov, I.: On the Lagrangian dynamics of atmospheric zonal jets and the permeability of the stratospheric polar vortex. J. Atmos. Sci. 64(10), 3595–3610 (2007) Schlueter-Kuck, K.L., Dabiri, J.O.: Coherent structure colouring: identification of coherent structures from sparse data using graph theory. J. Fluid Mech. 811, 468–486 (2017) Ser-Giacomi, E., Rossi, V., López, C., Hernández-García, E.: Flow networks: a characterization of geo- physical fluid transport. Chaos: an Interdisciplinary. J. Nonlinear Sci. 25(3), 036404 (2015) Ser-Giacomi, E., Vasile, R., Hernández-García, E., López, C.: Most probable paths in temporal weighted networks: an application to ocean transport. Phys. Rev. E 92(1), 012818 (2015) Strang, G.: On the construction and comparison of difference schemes. SIAM J. Numer. Anal. 5(3), 506–517 (1968) Treguier, A.-M., Boebel, O., Barnier, B., Madec, G.: Agulhas eddy fluxes in a 1/6 degrees Atlantic model. Deep Sea Res. Part II 50(1), 251–280 (2003) Walters, P.: An Introduction to Ergodic Theory, vol. 79. Springer, Berlink (2000) Wigner, E.P.: Calculation of the rate of elementary association reactions. J. Chem. Phys. 5, 720–725 (1937) Williams, M.O., Rypina, I.I., Rowley, C.W.: Identifying finite-time coherent sets from limited quantities of Lagrangian data. Chaos 25(8), 087408 (2015)

Journal

Journal of Nonlinear ScienceSpringer Journals

Published: Jun 1, 2018

References

You’re reading a free preview. Subscribe to read the entire article.


DeepDyve is your
personal research library

It’s your single place to instantly
discover and read the research
that matters to you.

Enjoy affordable access to
over 18 million articles from more than
15,000 peer-reviewed journals.

All for just $49/month

Explore the DeepDyve Library

Search

Query the DeepDyve database, plus search all of PubMed and Google Scholar seamlessly

Organize

Save any article or search result from DeepDyve, PubMed, and Google Scholar... all in one place.

Access

Get unlimited, online access to over 18 million full-text articles from more than 15,000 scientific journals.

Your journals are on DeepDyve

Read from thousands of the leading scholarly journals from SpringerNature, Elsevier, Wiley-Blackwell, Oxford University Press and more.

All the latest content is available, no embargo periods.

See the journals in your area

DeepDyve

Freelancer

DeepDyve

Pro

Price

FREE

$49/month
$360/year

Save searches from
Google Scholar,
PubMed

Create lists to
organize your research

Export lists, citations

Read DeepDyve articles

Abstract access only

Unlimited access to over
18 million full-text articles

Print

20 pages / month

PDF Discount

20% off