# Revealing transient strain in geodetic data with Gaussian process regression

Revealing transient strain in geodetic data with Gaussian process regression Summary Transient strain derived from global navigation satellite system (GNSS) data can be used to detect and understand geophysical processes such as slow slip events and post-seismic deformation. Here we propose using Gaussian process regression (GPR) as a tool for estimating transient strain from GNSS data. GPR is a non-parametric, Bayesian method for interpolating scattered data. In our approach, we assume a stochastic prior model for transient displacements. The prior describes how much we expect transient displacements to covary spatially and temporally. A posterior estimate of transient strain is obtained by differentiating the posterior transient displacements, which are formed by conditioning the prior with the GNSS data. As a demonstration, we use GPR to detect transient strain resulting from slow slip events in the Pacific Northwest. Maximum likelihood methods are used to constrain a prior model for transient displacements in this region. The temporal covariance of our prior model is described by a compact Wendland covariance function, which significantly reduces the computational burden that can be associated with GPR. Our results reveal the spatial and temporal evolution of strain from slow slip events. We verify that the transient strain estimated with GPR is in fact geophysical signal by comparing it to the seismic tremor that is associated with Pacific Northwest slow slip events. Satellite geodesy, Transient deformation, Spatial analysis, Statistical methods, Time-series analysis, Dynamics of lithosphere and mantle 1 INTRODUCTION Crustal strain rates are fundamentally important quantities for assessing seismic hazard. Knowing where and how quickly strain is accumulating gives insight into where we can expect stored elastic energy to be released seismically. Estimates of secular, or long-term, strain rates have been used to constrain seismic hazard models such as UCERF3 (Field et al. 2014). Transient strain, which is anomalous strain caused by geophysical processes such as slow slip events (SSEs) or post-seismic deformation, is also relevant for assessing seismic hazard. While transient strain itself is not damaging, there is a possibility that it can trigger major earthquakes (Roeloffs 2006; Freed & Lin 2001). Global navigation satellite system (GNSS) networks, such as the Plate Boundary Observatory (PBO), can be used to infer secular and transient strain. Numerous methods for converting GNSS displacement data to strain have been proposed, and most involve assuming some parametric form for the deformation signal. The simplest method for estimating secular strain rates assumes that GNSS derived velocities can be described with a first-order polynomial (i.e. the deformation gradients are constant) over some subnetwork of the GNSS stations (e.g. Feigl et al. 1990; Murray & Lisowski 2000). The components of the secular strain rate tensor for each subnetwork are then determined from the least squares fit to the velocities. The assumption that deformation gradients are spatially uniform is not appropriate when subnetworks span a very large area. To help overcome this deficiency, Shen et al. (1996, 2015) used an inverse distance weighting scheme, in which the estimated secular strain rate at any point is primarily controlled by the velocities at nearby stations. The method of Shen et al. (1996, 2015) can be viewed as a form of local least squares regression with a first-order polynomial. Other methods for estimating secular strain rates have parameterized GNSS derived velocities with bi-cubic splines (Beavan & Haines 2001), spherical wavelets (Tape et al. 2009), and elastostatic Green’s functions (Sandwell & Wessel 2016). The type of basis functions and the number of degrees of freedom for a parameterization, which are often chosen subjectively, have a strong influence on the solution. If there are too few degrees of freedom in the parameterization, then estimated secular strain rates will be biased and the uncertainties will be underestimated. On the other hand, if there are too many degrees of freedom, then there will not be any coherent features in the estimated secular strain rates. The methods described by Beavan & Haines (2001) and Tape et al. (2009) also require the user to specify penalty parameters that control a similar trade-off between bias and variance in the solution. One could also parameterize deformation with a physically motivated model of interseismic deformation (e.g. Meade & Hager 2005; McCaffrey et al. 2007). In such models, the lithospheric rheology and fault geometries are assumed to be known. Any errors in the assumed physical model could result in biased estimates of the secular strain rates and underestimated uncertainties. The aforementioned studies are concerned with estimating secular strain rates. In recent years the Southern California Earthquake Center (SCEC) community has shown interest in developing methods for detecting transient deformation. SCEC supported a transient detection exercise (Lohman & Murray 2013), where several research groups tested their methods for detecting transient geophysical signal with a synthetic GNSS data set. Among the methods tested were the Network Strain Filter (NSF; Ohtani et al. 2010) and the Network Inversion Filter (NIF; Segall & Mathews 1997). The NSF uses a wavelet parameterization to describe the spatial component of geophysical signal. The NIF, which is intended for imaging slow fault slip from geodetic data, uses the elastic dislocation Green’s functions from Okada (1992). For the NSF and NIF, the time dependence of the geophysical signal is modelled as integrated Brownian motion. The method described in Holt & Shcherbenko (2013) was also tested in the SCEC transient detection exercise. Holt & Shcherbenko (2013) detect transient signal from an estimate of transient strain rates, and they estimate transient strain rates using a bi-cubic spatial parameterization of displacements between time epochs. Holt & Shcherbenko (2013) define a detection threshold based on the transient strain rate magnitude, and we demonstrate that this is indeed an effective criterion for identifying a geophysical signal. For the same reasons described above, the transient deformation and corresponding uncertainties estimated by these methods can be biased by the chosen spatial parameterization. It is then difficult to distinguish signal from noise with these methods, which limits their utility for transient detection. Here we propose using Gaussian process regression (GPR) (Rasmussen & Williams 2006) to estimate transient strain rates from GNSS data. GPR is closely related to kriging (Cressie 1993) and least squares collocation (Moritz 1978). The latter has been used by Kato et al. (1998) and El-Fiky & Kato (1998) to estimate secular strain rates from GNSS data. GPR is a Bayesian, non-parametric method for inferring a continuous function from scattered data. Since GNSS stations are irregularly spaced and observation times may differ between stations, GPR is an ideal tool for synthesizing discrete GNSS data into a spatially and temporally continuous representation of transient strain rates. GPR is Bayesian in that we use a prior model to control the spatial and temporal roughness of the inferred transient strain rates. The prior is specified as a stochastic process, namely a Gaussian process. If there is no information available to help choose an appropriate prior Gaussian process, then maximum likelihood methods can be used to objectively choose one that is most consistent with the data. We use GPR to infer transient strain rates resulting from SSEs in the Pacific Northwest, demonstrating that GPR is an effective tool for detecting transient geophysical processes. 2 ESTIMATING TRANSIENT STRAIN RATES We seek a spatially and temporally dependent estimate of transient strain rates. We consider transient strain rates to be any deviation from the secular strain rates. We denote transient strain rates as $$\dot{\varepsilon }(p)$$, where p represents the ordered pair $$(\vec{x},t)$$, $$\vec{x}= (x_\mathrm{e},x_\mathrm{n})$$ are spatial coordinates, and t is time. The subscripts ‘e’ and ‘n’ indicate the east and north component, and we assume that the study region is sufficiently small that $$\vec{x}$$ can be considered a point in a 2-D Cartesian map projection that is aligned with the cardinal directions. We determine $$\dot{\varepsilon }$$ by spatially and temporally differentiating estimates of transient displacements, which we constrain with GNSS data. We let $$\vec{u}(p) = (u_\mathrm{e}(p),u_\mathrm{n}(p))$$ be our prior understanding of transient displacements. Since transient displacements are not precisely known, we cannot consider $$\vec{u}$$ to be a deterministic function. Instead, $$\vec{u}$$ is considered to be a stochastic process containing a distribution of functions that could potentially describe transient displacements. Specifically, we let each component of $$\vec{u}$$ be a Gaussian process. A Gaussian process is a stochastic process whose value at any collection of points can be described with a multivariate normal distribution. That is to say, the random vector $$[u_i(p)]_{p \in \bf {P}}$$ has a multivariate normal distribution for any collection of points $$\bf {P}$$. A realization of the random vector $$[u_i(p)]_{p \in \bf {P}}$$ can be interpreted as a realization of ui (i.e. a sample function) that is evaluated at $$\bf {P}$$. Just as a multivariate normal distribution is fully determined by a mean vector and a covariance matrix, a Gaussian process is fully determined by a mean function and a covariance function. For example, Brownian motion, B(t), is a well-known Gaussian process in $$\mathbb {R}^1$$ which has the mean function E[B(t)] = 0 and the covariance function Cov[B(t), B(t΄)] = ϕ2min (t, t΄), where t, t΄ ≥ 0 and ϕ is a scale factor. We let each component of $$\vec{u}$$ have zero mean and a generic covariance function $$\mathrm{Cov}\left[ u_i(p),u_i(p^{\prime }) \right] = C_{u_i}(p,p^{\prime })$$. Using a more concise notation, we write our prior on each component of $$\vec{u}$$ as $$u_i \sim \mathcal {GP}\left(0,C_{u_i}\right)$$. The function $$C_{u_i}$$ must be positive definite in order to be a valid covariance function. By definition, $$C_{u_i}$$ is a positive definite function if the matrix $$[C_{u_i}(p,p^{\prime })]_{(p,p^{\prime }) \in \bf {P}\times \bf {P}}$$ is positive definite for any set of points $$\bf {P}$$ (Cressie 1993, section 2.5). We assume that $$C_{u_i}$$ can be separated into spatial and temporal functions as   \begin{eqnarray} C_{u_i}(p,p^{\prime }) = C_{u_i}\left((\vec{x},t),(\vec{x}^{\prime },t^{\prime })\right) = X_i(\vec{x},\vec{x}^{\prime })T_i(t,t^{\prime }). \end{eqnarray} (1)As long as the functions Xi and Ti are positive definite, $$C_{u_i}$$ is guaranteed to also be positive definite (Rasmussen & Williams 2006, section 4.2.4). We also require that the derivatives $$\partial ^2 X_i(\vec{x},\vec{x}^{\prime })/ \partial x_j \partial x_j^{\prime }$$ and ∂2Ti(t, t΄)/∂t∂t΄ exist. This ensures that ui is spatially and temporally differentiable, allowing us to compute transient strain rates (see Adler (1981, section 2.2) or Papoulis (1991, appx. 10A) for a definition of stochastic differentiation and the conditions for differentiability). We provide an example to give the prior on transient displacements a more tangible meaning. We can use a squared exponential function for Xi and Ti,   \begin{eqnarray} X_i(\vec{x},\vec{x}^{\prime }) &=& \exp \left(\frac{-||\vec{x}- \vec{x}^{\prime }||_2^2}{2\ell ^2}\right),\nonumber\\ T_i(t,t^{\prime }) &=& \phi ^2 \exp \left(\frac{-|t - t^{\prime }|^2}{2\tau ^2}\right), \end{eqnarray} (2)which satisfies our requirements for positive definiteness and differentiability. The parameters ℓ and τ control the length-scale and timescale, respectively, of realizations of ui. The parameter ϕ, which we have arbitrarily chosen to incorporate into Ti rather than Xi, controls the amplitude of realizations of ui. Ideally, we want realizations of ui to have a length-scale, timescale, and amplitude that resemble what we expect for the true transient displacements. In Fig. 1(a), we show $$C_{u_i}$$ using the squared exponential function for Xi and Ti and the parameters ℓ = 100 km, τ = 10 d, and ϕ = 1 mm. A single realization of ui corresponding to that choice of covariance function is shown in Fig. 1(b) (see Rasmussen & Williams (2006, appx. A.2) for details on drawing realizations from Gaussian processes). While the squared exponential function is a commonly used covariance function for GPR, it is not appropriate for every application. The appropriate choice for Xi and Ti may vary depending on the geophysical signal we are trying to describe. To keep this section sufficiently general, we hold off on specifying Xi and Ti until Section 3.2, where we estimate transient strain from SSEs in the Pacific Northwest. Figure 1. View largeDownload slide (a) An example covariance function for transient displacements, $$C_{u_i}(p,p^{\prime })$$, shown as a function of the spatial and temporal distance between p and p΄. (b) A single realization of transient displacements, ui(p), corresponding to the covariance function from panel (a). The realization is a function of three variables, t, xe and xn, although we only show its dependence on t and one of the spatial dimensions. Figure 1. View largeDownload slide (a) An example covariance function for transient displacements, $$C_{u_i}(p,p^{\prime })$$, shown as a function of the spatial and temporal distance between p and p΄. (b) A single realization of transient displacements, ui(p), corresponding to the covariance function from panel (a). The realization is a function of three variables, t, xe and xn, although we only show its dependence on t and one of the spatial dimensions. GNSS data records transient displacements as well as other physical and non-physical processes that we are not interested in. We first consider component i of a single GNSS displacement observation made at station j, which is located at $$\vec{x}^{(j)}$$, and time t(k). We describe this observation, $$d_i^{*(jk)}$$, as a realization of the random variable   \begin{eqnarray} d_i^{(jk)} &=& u_i\left(\vec{x}^{(j)},t^{(k)}\right) + \eta _i^{(jk)} + a_i^{(j)} + b_i^{(j)}t^{(k)} \nonumber\\ &&+\,c_i^{(j)}\sin \left(2 \pi t^{(k)}\right) + e_i^{(j)}\cos \left(2 \pi t^{(k)}\right)\nonumber\\ &&+\,f_i^{(j)}\sin \left(4 \pi t^{(k)}\right) + g_i^{(j)}\cos \left(4 \pi t^{(k)}\right), \end{eqnarray} (3)where $$\eta _i^{(jk)}$$ describes noise, $$a_i^{(j)}$$ is an offset that is unique for each station, $$b_i^{(j)}$$ is the secular velocity at $$\vec{x}^{(j)}$$, and the sinusoids describe seasonal deformation (using units of years for t(k)). We then consider the column vector of n GNSS displacement observations, $$\bf {d}^*_i$$, where the subscript indicates that we are still only considering component i of displacements. The observations are made at m stations, and the times and positions for each observation are described by the set $$\bf {P}$$. The data vector $$\bf {d}^*_i$$ can be considered a realization of the random vector $$\bf {d}_i$$, which is formed by evaluating eq. (3) at each point in $$\bf {P}$$. To write out $$\bf {d}_i$$ explicitly, we let $$\bf {G}$$ be an n × 6 m matrix consisting of the basis functions from eq. (3) (i.e. the linear trends and sinusoids for each station) evaluated at each point in $$\bf {P}$$. The coefficients corresponding to each basis function are collected into the column vector $$\bf {m}_i$$. The noise for all the observations are described by the column vector $$\bf {\eta }_i$$. We can then write $$\bf {d}_i$$ as   \begin{eqnarray} \bf {d}_i = u_i(\bf {P}) + \bf {\eta }_i + \bf {G}\bf {m}_i, \end{eqnarray} (4)where the notation $$u_i(\bf {P})$$ represents the column vector $$[u_i(p)]_{p \in \bf {P}}$$. We assume a diffuse prior for the components of $$\bf {m}_i$$. That is to say $$\bf {m}_i \sim \mathcal {N}(\bf {0},\kappa ^2\bf {I})$$ in the limit as κ → ∞. Of course, the secular velocities, $$b_i^{(j)}$$, are spatially correlated and we could invoke a tectonic model to form a prior on $$b_i^{(j)}$$. However, in our application to the Pacific Northwest, we will be using displacement time series which are long enough to sufficiently constrain $$b_i^{(j)}$$ for each station, avoiding the need to incorporate a prior. Likewise, seasonal deformation is spatially correlated (Dong et al. 2002; Langbein 2008), and it may be worth exploring and exploiting such a correlation in a future study. We assume that $$\bf {\eta }_i$$ is a spatially and/or temporally correlated random vector distributed as $$\mathcal {N}(\bf {0},\bf {C}_{\eta _i})$$. For example, $$\bf {\eta }_i$$ can be uncorrelated white noise, temporally correlated noise describing benchmark wobble (e.g. Wyatt 1982, 1989), and/or spatially correlated noise describing common mode error (e.g. Wdowinski et al. 1997). The appropriate noise model may vary depending on the application, and we hold off on specifying the covariance matrix, $$\bf {C}_{\eta_i}$$, until Section 3.1. We are now able to write the distribution of $$\bf {d}_i$$ as   \begin{eqnarray} \bf {d}_i \sim \mathcal {N}\left(\bf {0}, C_{u_i}(\bf {P},\bf {P}) + \bf {C}_{\eta _i} + \kappa ^2\bf {G}\bf {G}^T \right), \end{eqnarray} (5)where $$C_{u_i}(\bf {P},\bf {P})$$ represents the matrix $$[C_{u_i}(p,p^{\prime })]_{(p,p^{\prime }) \in \bf {P}\times \bf {P}}$$. We form a posterior estimate of transient displacements, denoted as $$\hat{u}_i$$, by updating ui with the fact that $$\bf {d}_i^*$$ was realized from the random vector $$\bf {d}_i$$, which we write as $$\hat{u}_i = u_i |(\bf {d}_i = \bf {d}^*_i)$$. A general solution for $$\hat{u}_i$$ is derived in von Mises (1964, section 8.9), where we find that $$\hat{u}_i$$ is distributed as $$\mathcal {GP}(\mu _{\hat{u}_i},C_{\hat{u}_i})$$ with mean function   \begin{eqnarray} \mu _{\hat{u}_i}(p) &=& \mathrm{E}\left[ u_i(p) \right] + \mathrm{Cov}\left[ u_i(p),\bf {d}_i \right] \mathrm{Cov}\left[ \bf {d}_i \right]^{-1} \left(\bf {d}^*_i - \mathrm{E}\left[ \bf {d}_i \right]\right)\nonumber \\ &=& C_{u_i}(p,\bf {P})\left( C_{u_i}(\bf {P},\bf {P}) + \bf {C}_{\eta _i} + \kappa ^2\bf {G}\bf {G}^T\right)^{-1} \bf {d}^*_i \end{eqnarray} (6)and covariance function   \begin{eqnarray} C_{\hat{u}_i}(p,p^{\prime }) &=& \mathrm{Cov}\left[ u_i(p),u_i(p^{\prime }) \right]\nonumber\\ && -\, \mathrm{Cov}\left[ u_i(p),\bf {d}_i \right] \mathrm{Cov}\left[ \bf {d}_i \right]^{-1} \mathrm{Cov}\left[ \bf {d}_i,u_i(p^{\prime }) \right] \nonumber \\ &=& C_{u_i}(p,p^{\prime }) - C_{u_i}(p,\bf {P})\left( C_{u_i}(\bf {P},\bf {P})\right.\nonumber\\ &&\left. +\, \bf {C}_{\eta _i} + \kappa ^2\bf {G}\bf {G}^T \right)^{-1} C_{u_i}(\bf {P},p^{\prime }). \end{eqnarray} (7)However, we are interested in the limit as κ → ∞, and the form for eqs (6) and (7) is not suitable for evaluating this limit. We use a partitioned matrix inversion identity (Press et al. 2007, sec. 2.7.4) to rewrite eqs (6) and (7) as   \begin{eqnarray} \mu _{\hat{u}_i}(p) = \left[\begin{array}{c@{\quad}c}C_{u_i}(p,\bf {P}) & \bf {0}\\ \end{array}\right] \left[\begin{array}{c@{\quad}c}C_{u_i}(\bf {P},\bf {P}) + \bf {C}_{\eta _i} & \bf {G}\\ \bf {G}^T & -\kappa ^{-2} \bf {I}\\ \end{array}\right]^{-1} \left[\begin{array}{c}\bf {d}^*_i \\ \bf {0}\\ \end{array}\right]\!\!\!\!\nonumber\\ \end{eqnarray} (8)and   \begin{eqnarray} C_{\hat{u}_i}(p,p^{\prime }) &=& C_{u_i}(p,p^{\prime }) -[C_{u_i}(p,\bf {P}) \quad \bf {0}]\nonumber\\ &&\times\, \left[\begin{array}{c@{\quad}c}C_{u_i}(\bf {P},\bf {P}) + \bf {C}_{\eta _i} & \bf {G}\\ \bf {G}^T & -\kappa ^{-2} \bf {I}\\ \end{array}\right]^{-1} \left[\begin{array}{c}C_{u_i}(\bf {P},p^{\prime }) \\ \bf {0}\nonumber\\ \end{array}\right].\nonumber\\ \end{eqnarray} (9)Taking the limit as κ → ∞, we get the solution for the mean and covariance of $$\hat{u}_i$$,   \begin{eqnarray} \mu _{\hat{u}_i}(p) = \left[\begin{array}{c@{\quad}c}C_{u_i}(p,\bf {P}) & \bf {0}\\ \end{array}\right] \left[\begin{array}{c@{\quad}c}C_{u_i}(\bf {P},\bf {P}) + \bf {C}_{\eta _i} & \bf {G}\\ \bf {G}^T & \bf {0}\\ \end{array}\right]^{-1} \left[\begin{array}{c}\bf {d}^*_i \\ \bf {0}\\ \end{array}\right]\!\!\!\!\nonumber\\ \end{eqnarray} (10)and   \begin{eqnarray} C_{\hat{u}_i}(p,p^{\prime }) &=& C_{u_i}(p,p^{\prime }) - \left[\begin{array}{c@{\quad}c}C_{u_i}(p,\bf {P}) & \bf {0}\\ \end{array}\right]\nonumber\\ &&\times \left[\begin{array}{c@{\quad}c}C_{u_i}(\bf {P},\bf {P}) + \bf {C}_{\eta _i} & \bf {G}\\ \bf {G}^T & \bf {0}\\ \end{array}\right]^{-1} \left[\begin{array}{c}C_{u_i}(\bf {P},p^{\prime }) \\ \bf {0}\\ \end{array}\right]. \end{eqnarray} (11)To ensure that the inverse matrices in eqs (10) and (11) exist, each column in $$\bf {G}$$ must be linearly independent. This condition tends to be violated when there are too few observations at a station. In that case, a singular value decomposition can be used to remove linearly dependent components from $$\bf {G}$$. It should be noted that we have ignored any covariances between the easting and northing components of $$\vec{u}$$ and $$\vec{\bf {d}}$$. This simplification reduces the computational complexity of evaluating the posterior transient displacements because each component can be evaluated independently. However, we are inherently assuming that the principal components of $$\vec{u}(p)$$ and $$\vec{d}^{(jk)}$$ are aligned with the cardinal directions, which is admittedly an arbitrary assumption. The posterior transient displacements are spatially and temporally continuous, and we can use eqs (10) and (11) to evaluate $$\hat{u}_i$$ at any p. Furthermore, $$\hat{u}_i$$ is spatially and temporally differentiable, allowing us to formulate $$\dot{\varepsilon }$$ at any p that we may be interested in. The components of $$\dot{\varepsilon }$$ can be written as   \begin{eqnarray} \dot{\varepsilon }_{ij}(p) = \frac{1}{2} \frac{\partial }{\partial t} \left(\frac{\partial \hat{u}_i(p)}{\partial x_j} + \frac{\partial \hat{u}_j(p)}{\partial x_i} \right). \end{eqnarray} (12)Since eq. (12) is a linear operation on the Gaussian processes $$\hat{u}_i$$ and $$\hat{u}_j$$, we know that $$\dot{\varepsilon }_{ij}$$ is also a Gaussian process. From Papoulis (1991, ch. 10), we find that the mean and covariance functions for $$\dot{\varepsilon }_{ij}$$ are   \begin{eqnarray} \mu _{\dot{\varepsilon }_{\mathrm{e}\mathrm{e}}}(p) = \frac{\partial ^2 \mu _{\hat{u}_\mathrm{e}}(p)}{\partial t {\, } \partial x_\mathrm{e}} \end{eqnarray} (13)  \begin{eqnarray} \mu _{\dot{\varepsilon }_{\mathrm{n}\mathrm{n}}}(p) = \frac{\partial ^2 \mu _{\hat{u}_\mathrm{n}}(p)}{\partial t {\, } \partial x_\mathrm{n}} \end{eqnarray} (14)  \begin{eqnarray} \mu _{\dot{\varepsilon }_{\mathrm{e}\mathrm{n}}}(p) = \frac{1}{2}\frac{\partial }{\partial t}\left( \frac{\partial \mu _{\hat{u}_\mathrm{e}}(p)}{\partial x_\mathrm{n}} + \frac{\partial \mu _{\hat{u}_\mathrm{n}}(p)}{\partial x_\mathrm{e}} \right) \end{eqnarray} (15)and   \begin{eqnarray} C_{\dot{\varepsilon }_{\mathrm{e}\mathrm{e}}}(p,p^{\prime }) = \frac{\partial ^4 C_{\hat{u}_\mathrm{e}}(p,p^{\prime })}{\partial t {\, } \partial t^{\prime } {\, } \partial x_\mathrm{e}{\, } \partial x^{\prime }_\mathrm{e}} \end{eqnarray} (16)  \begin{eqnarray} C_{\dot{\varepsilon }_{\mathrm{n}\mathrm{n}}}(p,p^{\prime }) = \frac{\partial ^4 C_{\hat{u}_\mathrm{n}}(p,p^{\prime })}{\partial t {\, } \partial t^{\prime } {\, } \partial x_\mathrm{n}{\, } \partial x^{\prime }_\mathrm{n}} \end{eqnarray} (17)   \begin{eqnarray} C_{\dot{\varepsilon }_{\mathrm{e}\mathrm{n}}}(p,p^{\prime }) = \frac{1}{4} \frac{\partial ^2}{\partial t {\, } \partial t^{\prime }}\left( \frac{\partial ^2 C_{\hat{u}_\mathrm{e}}(p,p^{\prime })}{\partial x_\mathrm{n}{\, } \partial x^{\prime }_\mathrm{n}} + \frac{\partial ^2 C_{\hat{u}_\mathrm{n}}(p,p^{\prime })}{\partial x_\mathrm{e}{\, } \partial x^{\prime }_\mathrm{e}} \right), \end{eqnarray} (18)respectively. The above derivatives of $$\mu _{\hat{u}_i}$$ and $$C_{\hat{u}_i}$$ are computed analytically by replacing the terms $$C_{u_i}(p,\bf {P})$$, $$C_{u_i}(\bf {P},p^{\prime })$$, and $$C_{u_i}(p,p^{\prime })$$ in eqs (10) and (11) with their appropriate derivatives. For example, the mean and covariance functions for $$\dot{\varepsilon }_{\mathrm{e}\mathrm{e}}$$ can be written more verbosely as   \begin{eqnarray} \mu _{\dot{\varepsilon }_{\mathrm{e}\mathrm{e}}}(p) &=& \left[\begin{array}{c@{\quad}c}\frac{\partial ^2 C_{u_\mathrm{e}}\left( p,\bf {P}\right)}{\partial t {\, } \partial x_\mathrm{e}} & \bf {0}\\ \end{array}\right] \left[\begin{array}{c@{\quad}c}C_{u_\mathrm{e}}(\bf {P},\bf {P}) + \bf {C}_{\eta _\mathrm{e}} & \bf {G}\\ \bf {G}^T & \bf {0}\\ \end{array}\right]^{-1} \left[\begin{array}{c}\bf {d}^*_\mathrm{e}\\ \bf {0}\\ \end{array}\right]\nonumber\\ \end{eqnarray} (19)and   \begin{eqnarray} C_{\dot{\varepsilon }_{\mathrm{e}\mathrm{e}}}(p,p^{\prime }) &=& \frac{\partial ^4 C_{u_\mathrm{e}}(p,p^{\prime })}{\partial t {\, } \partial t^{\prime } {\, } \partial x_\mathrm{e}{\, } \partial x^{\prime }_\mathrm{e}} - \left[\begin{array}{c@{\quad}c}\frac{\partial ^2 C_{u_\mathrm{e}}\left( p,\bf {P}\right)}{\partial t {\, } \partial x_\mathrm{e}} & \bf {0} \end{array}\right]\nonumber\\ &&\times\,\left[\begin{array}{c@{\quad}c}C_{u_\mathrm{e}}(\bf {P},\bf {P}) + \bf {C}_{\eta _\mathrm{e}} & \bf {G}\\ \bf {G}^T & \bf {0}\\ \end{array}\right]^{-1} \left[\begin{array}{c}\frac{\partial ^2 C_{u_\mathrm{e}}\left(\bf {P}, p^{\prime } \right)}{\partial t^{\prime } {\, } \partial x^{\prime }_\mathrm{e}} \\ \bf {0}\\ \end{array}\right].\nonumber\\ \end{eqnarray} (20) 2.1 Transient detection criterion Our motivation for estimating transient strain rates is, in part, to detect transient geophysical processes. As we will see, geophysical signal can be easily identified by visually inspecting the solution for $$\dot{\varepsilon }$$. However, if we want to detect geophysical signal automatically, then we need to define a detection criterion. We use a signal-to-noise ratio, SNR, that is based on the Frobenius norm of $$\dot{\varepsilon }$$, $$||\dot{\varepsilon }||_\mathrm{F} = \left(\dot{\varepsilon }_{\mathrm{e}\mathrm{e}}^2 + \dot{\varepsilon }_{\mathrm{n}\mathrm{n}}^2 + 2\dot{\varepsilon }_{\mathrm{e}\mathrm{n}}^2\right)^{\frac{1}{2}}$$, for our detection criterion. In the geodetic literature, $$||\dot{\varepsilon }||_\mathrm{F}$$ is often used as a metric for the strain rate ‘magnitude’, and it is sometimes referred to as the second invariant of strain rate. Noting that $$||\dot{\varepsilon }||_\mathrm{F}$$ is a random variable, we take SNR to be the ratio of the estimated mean and standard deviation of $$||\dot{\varepsilon }||_\mathrm{F}$$. An estimate of the mean is found by evaluating $$||\dot{\varepsilon }||_\mathrm{F}$$ at the mean of $$\dot{\varepsilon }$$,   \begin{eqnarray} \mu _{||\dot{\varepsilon }||_\mathrm{F}} \approx ||\dot{\varepsilon }||_F \big |_{\dot{\varepsilon }= \mu _{\dot{\varepsilon }}} = \left(\mu _{\dot{\varepsilon }_{\mathrm{e}\mathrm{e}}}^2 + \mu _{\dot{\varepsilon }_{\mathrm{n}\mathrm{n}}}^2 + 2\mu _{\dot{\varepsilon }_{\mathrm{e}\mathrm{n}}}^2 \right)^\frac{1}{2}, \end{eqnarray} (21)and we use nonlinear uncertainty propagation to estimate the standard deviation,   \begin{eqnarray} \sigma _{||\dot{\varepsilon }||_\mathrm{F}} &\approx& \left( \left(\left. \frac{\partial ||\dot{\varepsilon }||_\mathrm{F}}{\partial \dot{\varepsilon }_{\mathrm{e}\mathrm{e}}}\right|_{\dot{\varepsilon }= \mu _{\dot{\varepsilon }}} \right)^2 \sigma ^2_{\dot{\varepsilon }_{\mathrm{e}\mathrm{e}}} + \left(\left. \frac{\partial ||\dot{\varepsilon }||_\mathrm{F}}{\partial \dot{\varepsilon }_{\mathrm{n}\mathrm{n}}}\right|_{\dot{\varepsilon }= \mu _{\dot{\varepsilon }}} \right)^2 \sigma ^2_{\dot{\varepsilon }_{\mathrm{n}\mathrm{n}}} \right.\nonumber\\ &&\left. +\, \left(\left. \frac{\partial ||\dot{\varepsilon }||_\mathrm{F}}{\partial \dot{\varepsilon }_{\mathrm{e}\mathrm{n}}}\right|_{\dot{\varepsilon }= \mu _{\dot{\varepsilon }}} \right)^2 \sigma ^2_{\dot{\varepsilon }_{\mathrm{e}\mathrm{n}}} \right)^\frac{1}{2}, \end{eqnarray} (22)where $$\sigma ^2_{\dot{\varepsilon }_{ij}}(p) = C_{\dot{\varepsilon }_{ij}}(p,p)$$. After some calculations, we find SNR to be   \begin{eqnarray} {\mathrm{SNR}(p) = \frac{\mu _{||\dot{\varepsilon }||_\mathrm{F}}(p)}{\sigma _{||\dot{\varepsilon }||_\mathrm{F}}(p)} }\nonumber \\ &&\quad = \frac{\mu _{\dot{\varepsilon }_{\mathrm{e}\mathrm{e}}}(p)^2 + \mu _{\dot{\varepsilon }_{\mathrm{n}\mathrm{n}}}(p)^2 + 2\mu _{\dot{\varepsilon }_{\mathrm{e}\mathrm{n}}}(p)^2}{\big (\sigma ^2_{\dot{\varepsilon }_{\mathrm{e}\mathrm{e}}}(p)\mu _{\dot{\varepsilon }_{\mathrm{e}\mathrm{e}}}(p)^2 + \sigma ^2_{\dot{\varepsilon }_{\mathrm{n}\mathrm{n}}}(p)\mu _{\dot{\varepsilon }_{\mathrm{n}\mathrm{n}}}(p)^2 + 4\sigma ^2_{\dot{\varepsilon }_{\mathrm{e}\mathrm{n}}}(p)\mu _{\dot{\varepsilon }_{\mathrm{e}\mathrm{n}}}(p)^2 \big )^{\frac{1}{2}}}.\nonumber\\ \end{eqnarray} (23)We explicitly show that SNR is a function of p to emphasize that it identifies the position and time of anomalous deformation. We can reasonably suspect that some transient geophysical phenomena is occurring wherever and whenever SNR is larger than ∼3. 2.2 Outlier detection In deriving our formulation for transient strain rates, we have assumed that noise in the data vector is normally distributed. This is not an appropriate assumption for GNSS data, which often have more outliers than would be expected for normally distributed noise. Methods for analysing GNSS data should either be robust against outliers or should involve a preprocessing step in which outliers are detected and removed. Examples of the former include the MIDAS method for estimating secular velocities (Blewitt et al. 2016) and the GPS Imaging method for detecting spatially coherent features (Hammond et al. 2016). In this study, we identify and remove outliers as a preprocessing step before estimating $$\dot{\varepsilon }$$. Outliers are identified based on the residuals for a model that best fits the observed data. Observations with residuals that exceed some threshold are removed. This strategy for detecting outliers is commonly used for GNSS data, where the model being fit to the data typically consists of a linear trend and seasonal terms for each GNSS station (e.g. Johansson et al. 2002; Dong et al. 2006; Bos et al. 2013). To prevent transient geophysical signal from being erroneously identified as outliers, the model used in our outlier detection algorithm additionally consists of a temporally correlated Gaussian process. The details of our algorithm are given in the Appendix. It should be noted that our algorithm does not identify jumps in GNSS time-series, which is another common issue. Some, but not all, jumps can be automatically removed by looking up the dates of equipment changes and earthquakes (Gazeaux et al. 2013). However, it is still necessary to manually find and remove jumps of unknown origin. 3 APPLICATION TO PACIFIC NORTHWEST SLOW SLIP EVENTS In this section we estimate transient strain rates in the Pacific Northwest, and we are specifically interested in identifying transient strain resulting from SSEs. SSEs in the Pacific Northwest were first discovered by Dragert et al. (2001) and we refer the reader to Schwartz & Rokosky (2007) for a review of the subsequent research. SSEs in the Pacific Northwest can be detected by monitoring for associated seismic tremor (Rogers & Dragert 2003), which is actively being done by the Pacific Northwest Seismic Network (Wech 2010). We can then compare the tremor records to our estimated transient strain rates to verify that we are indeed identifying strain from SSEs. Before estimating transient strain rates, we first establish a noise model for GNSS stations in this region, and then we establish a prior Gaussian process to describe displacements from SSEs. We use the daily displacement solutions for continuous GNSS stations generated by the Geodesy Advancing Geosciences and EarthScope (GAGE) Facility (Herring et al. 2016). We limit the data set to the stations and times that are pertinent to the seven most recent SSEs in the Puget Sound region. The earliest SSE considered in this study began in August 2010, and the most recent SSE began in February 2017. The positions of GNSS stations used to estimate transient strain rates are shown in Fig. 2. Figure 2. View largeDownload slide Positions of the GNSS stations used to estimate transient strain rates. The coloured regions indicate the distribution of seismic tremor as determined by Wech (2010). The red dots show the positions of GNSS stations mentioned in this paper. The blue dot indicates the location of the transient strain rates shown in Fig. 7 and the signal-to-noise ratio shown in Fig.8. The blue dashed circle demarcates the spatial extent of the tremors shown in Fig. 8. Figure 2. View largeDownload slide Positions of the GNSS stations used to estimate transient strain rates. The coloured regions indicate the distribution of seismic tremor as determined by Wech (2010). The red dots show the positions of GNSS stations mentioned in this paper. The blue dot indicates the location of the transient strain rates shown in Fig. 7 and the signal-to-noise ratio shown in Fig.8. The blue dashed circle demarcates the spatial extent of the tremors shown in Fig. 8. 3.1 Noise model We consider the noise vector, $$\bf {\eta }_i$$, to be composed of a temporally correlated Gaussian process, $$z_i \sim \mathcal {GP}(0,C_{z_i})$$, and a vector of uncorrelated Gaussian noise, $$\bf {w}_i$$, so that   \begin{eqnarray} \bf {\eta }_i = z_i(\bf {P}) + \bf {w}_i. \end{eqnarray} (24)The standard deviations for $$\bf {w}_i$$ are taken to be the uncertainties derived for the GNSS displacement solutions, $$\bf {\sigma }_i$$. The noise vector then has zero mean and the covariance matrix   \begin{eqnarray} \bf {C}_{\eta _i} = C_{z_i}(\bf {P},\bf {P}) + \mathrm{diag}(\bf {\sigma }_i^2). \end{eqnarray} (25) The temporally correlated noise in GNSS data has been thoroughly studied over the past two decades (see Bock & Melgar (2016) or He et al. (2017) for a review). Most studies found that the temporally correlated noise can be described with Brownian motion (also known as random walk noise or a Weiner process) or flicker noise (e.g. Zhang et al. 1997; Mao et al. 1999; Williams et al. 2004; Langbein 2008; Murray & Svarc 2017). Less frequently used noise models include a band-pass process (e.g. Langbein 2004) and a first-order Gauss–Markov (FOGM) process (e.g. Langbein 2004; Dmitrieva et al. 2015). There is some physical justification for using Brownian motion as a noise model because it accurately describes the power spectrum of motion resulting from instability in a geodetic monument (e.g. Wyatt 1982, 1989). In particular, the power spectra for Brownian motion and monument motion both decay proportionally to f−2, where f is frequency. However, Brownian motion necessarily contains a reference time at which the process begins. Since there is no notion of when noise ‘begins’ in GNSS data, we do not find Brownian motion to be an appropriate noise model. On the other hand, an FOGM process has no reference time (i.e. it is stationary), and its power spectrum,   \begin{eqnarray} S(f) = \frac{\beta ^2}{(2 \pi f)^2 + \alpha ^2}, \end{eqnarray} (26)decays proportionally to f−2 above the cut-off frequency α/(2π). We choose zi to be a spatially uncorrelated FOGM process. The covariance function for zi is then   \begin{eqnarray} C_{z_i}\!\left((\vec{x},t),(\vec{x}^{\prime },t^{\prime })\right) = \frac{\beta ^2}{2\alpha } \exp \left(-\alpha |t - t^{\prime }|\right)\delta _{\vec{x},\vec{x}^{\prime }}, \end{eqnarray} (27)where $$\delta _{\vec{x},\vec{x}^{\prime }}$$ is 1 if $$\vec{x}= \vec{x}^{\prime }$$ and 0 otherwise (Rasmussen & Williams 2006, section B2). By choosing an FOGM process for zi, we have introduced two parameters that need to be constrained, α and β. We constrain these parameters, which we collectively refer to as $$\bf {\theta }$$, with the Restricted Maximum Likelihood (REML) method (Harville 1974). The REML method is conceptually similar to the Maximum Likelihood Estimation (MLE) method from Langbein & Johnson (1997), where we pick $$\bf {\theta }$$ to maximize the probability of drawing the observed data, $$\bf {d}_i^*$$, from the random vector $$\bf {d}_i$$. However, unlike the MLE method, the REML method produces unbiased estimates of $$\bf {\theta }$$ (Cressie 1993, section 2.6). We use the REML method to estimate $$\bf {\theta }$$ at 38 continuous GNSS stations in the Pacific Northwest that are east of 121°W. These stations are sufficiently far from the subduction zone that they are unlikely to record transient deformation from SSEs, allowing us to ignore the term $$u_i(\bf {P})$$ in $$\bf {d}_i$$. We assume the noise at these inland stations is representative of the noise at all the stations considered in this study, which is probably a poor assumption since the inland stations are subject to distinctly different climatic conditions. Nonetheless, we find $$\bf {\theta }$$ for each of the inland stations and for each displacement component by maximizing the REML likelihood function   \begin{eqnarray} \mathcal {L}(\bf {\theta }) &=& \left(\frac{\left|\bf {G}^T\bf {G}\right|}{(2\pi )^{n-6m} \left| \bf {C}_{\eta _i}(\bf {\theta }) \right| \left| \bf {G}^T\bf {C}_{\eta _i}(\bf {\theta })^{-1}\bf {G}\right|}\right)^{\frac{1}{2}} e^{-\frac{1}{2}\bf {d}_i^{*T}\bf {K}(\bf {\theta })\bf {d}_i^*}\nonumber\\ \end{eqnarray} (28)with respect to $$\bf {\theta }$$, where   \begin{eqnarray} \bf {K}(\bf {\theta }) = \bf {C}_{\eta _i}(\bf {\theta })^{-1} - \bf {C}_{\eta _i}(\bf {\theta })^{-1}\bf {G}\left(\bf {G}^T\bf {C}_{\eta _i}(\bf {\theta })^{-1}\bf {G}\right)^{-1} \bf {G}^T\bf {C}_{\eta _i}(\bf {\theta })^{-1}.\nonumber\\ \end{eqnarray} (29)In the above equation, $$\bf {d}^*_i$$ consists of the data for a single station. The distribution of inferred values for α and β are shown in Fig. 3. The estimates of β for the easting and northing components are clustered around 0.5 mm yr−0.5. The corresponding estimates of α tend to cluster around 0 yr−1, indicating that the power spectrum of noise indeed decays proportionally to f−2 over a wide band of frequencies. We also estimate α and β for the vertical component of displacements, with the hope that vertical deformation gradients could reveal some geophysical signal. For the vertical component, β is relatively large with a median value of 13.5 mm yr−0.5. The inferred values for α are also higher for the vertical component with a median value of 8.21 yr−1. In Fig. 4, we use the median values of α and β to generate two realizations of FOGM noise for each component. The realizations span five years, and the easting and northing realizations drift by about 1 mm over these 5 yr. In the context of detecting SSEs, which produce several mm’s of surface displacement on the timescale of weeks, the estimated FOGM noise for the easting and northing component is negligible. In contrast, the estimated FOGM noise for the vertical component is larger than the signal we would expect from SSEs. We suspect that the higher amplitude for the FOGM noise in the vertical component is accommodating for deficiencies in our rather simple seasonal model. Based on this analysis, we henceforth ignore temporally correlated noise in the easting and northing components because of its low amplitude. We also do not use vertical displacements because of the presumably low SNR. Figure 3. View largeDownload slide Distribution of inferred values for the parameters in the FOGM noise model (eq. 27). ‘IQR’ is the interquartile range of inferred values. Figure 3. View largeDownload slide Distribution of inferred values for the parameters in the FOGM noise model (eq. 27). ‘IQR’ is the interquartile range of inferred values. Figure 4. View largeDownload slide Two samples of FOGM noise for each displacement component. The parameters for the FOGM noise model have been set to the median values from Fig. 3. Figure 4. View largeDownload slide Two samples of FOGM noise for each displacement component. The parameters for the FOGM noise model have been set to the median values from Fig. 3. Another significant source of noise in GNSS data is common mode error (e.g. Wdowinski et al. 1997; Dong et al. 2006), which is noise that is highly spatially correlated. When not accounted for, common mode error manifests as spatially uniform undulations in our estimated transient displacements. These undulations have no effect on the transient strain rates, and so we do not need to include common mode error in our noise model. 3.2 Prior model We next establish a prior model for transient displacements. Specifically, we discuss our choice for the spatial and temporal functions making up $$C_{u_i}$$, Xi and Ti. For Xi, we use the squared exponential (SE) covariance function,   \begin{eqnarray} X_i(\vec{x},\vec{x}^{\prime }) = \exp \left(\frac{-||\vec{x}- \vec{x}^{\prime }||_2^2}{2 \ell ^2}\right). \end{eqnarray} (30)The SE covariance function is commonly used for kriging (e.g. Cressie 1993) and GPR (e.g. Rasmussen & Williams 2006). In terms of geodetic applications, Kato et al. (1998) and El-Fiky & Kato (1998) demonstrated that the SE covariance function accurately describes the spatial covariance of secular GNSS derived velocities in Japan. We consider three potential choices for Ti. The first option is the 1-D SE covariance function,   \begin{eqnarray} T_i(t,t^{\prime }) = \phi ^2\exp \left(\frac{-|t - t^{\prime }|^2}{2\tau ^2}\right). \end{eqnarray} (31) Note that Ti includes the parameter ϕ, which serves to scale the covariance function $$C_{u_i}$$. A second option is a member of the Wendland class of covariance functions (Wendland 2005). Wendland covariance functions have compact support, and thus their corresponding covariance matrices are sparse. We exploit this sparsity with the CHOLMOD software package (Chen et al. 2008). Wendland covariance functions can be constructed such that they are positive definite in $$\mathbb {R}^d$$ and their corresponding Gaussian process can be differentiated k times. We use d = 1, because we are describing the temporal covariance of ui, and we use k = 2, giving samples of ui continuous velocities and accelerations. The corresponding Wendland covariance function is   \begin{eqnarray} T_i(t,t^{\prime }) = \phi ^2\left(1 - \frac{|t - t^{\prime }|}{\tau }\right)^5_+ \left(\frac{8|t - t^{\prime }|^2}{\tau ^2} + \frac{5|t - t^{\prime }|}{\tau } + 1\right), \end{eqnarray} (32)where (t)+ = max(0, t). A third option for Ti is the covariance function for integrated Brownian motion (IBM). IBM is a Gaussian process with zero mean and its covariance function can be found by integrating the covariance function for Brownian motion,   \begin{eqnarray} T_i(t,t^{\prime }) &=& \int _0^t \int _0^{t^{\prime }} \phi ^2 \min (s,s^{\prime }) {\, }\text{d}s^{\prime }{\, }\text{d}s \nonumber \\ &=& \frac{\phi ^2}{2}\min (t,t^{\prime })^2 \left(\max (t,t^{\prime }) - \frac{1}{3}\min (t,t^{\prime })\right), \ \ \ t,t^{\prime } \ge 0.\nonumber\\ \end{eqnarray} (33)IBM has been used in the context of Kalman filtering as a non-parametric model for the time dependence of geophysical signals (e.g. Segall & Mathews 1997; McGuire & Segall 2003; Ohtani et al. 2010; Hines & Hetland 2016). Similar to Brownian motion, IBM has a reference time, t = 0, at which the process begins. For some geophysical signals, it is appropriate to have this reference time. For example, if we are trying to identify post-seismic deformation, then t should be zero at the time of the earthquake. However, if we are interested in detecting transient events, where there is no known start time, then IBM may not be an appropriate prior model. Instead, one may prefer to use the SE or Wendland covariance functions because they are stationary. In the following analysis, we make the choice that t is zero on the first epoch of $$\bf {d}_i^*$$. Using an earlier reference time does not change the results discussed in this section. We use the REML method to determine the appropriate values for the parameters in $$C_{u_i}$$, which are ℓ, ϕ and τ. We refer to these parameters collectively as $$\bf {\theta }$$. Just as in Section 3.1, we pick $$\bf {\theta }$$ to maximize the probability of drawing $$\bf {d}_i^*$$ from $$\bf {d}_i$$, but now we are including $$u_i(\bf {P})$$ in $$\bf {d}_i$$, and we are optimizing the parameters for $$C_{u_i}$$ rather than $$\bf {C}_{\eta _i}$$. We divide the GNSS data into seven subsets that are 4 months long and each centred on the time of an SSE. The times of the seven SSEs are determined with tremor records from Wech (2010). We find the optimal $$\bf {\theta }$$ for each subset of data, for each displacement component, and for each choice of Ti. The REML likelihood function that we are maximizing with respect to $$\bf {\theta }$$ is now   \begin{eqnarray} \mathcal {L}(\bf {\theta }) = \left(\frac{\left| \bf {G}^T \bf {G}\right|}{(2\pi )^{n-6m} \left| \bf {\Sigma }(\bf {\theta }) \right| \left| \bf {G}^T \bf {\Sigma }(\bf {\theta })^{-1} \bf {G}\right|}\right)^{\frac{1}{2}} e^{-\frac{1}{2} \bf {d}_i^{*T} \bf {K}(\bf {\theta }) \bf {d}_i^*}, \end{eqnarray} (34)where   \begin{eqnarray} \bf {K}(\bf {\theta }) = \bf {\Sigma }(\bf {\theta })^{-1} - \bf {\Sigma }(\bf {\theta })^{-1} \bf {G}\left( \bf {G}^T \bf {\Sigma }(\bf {\theta })^{-1} \bf {G}\right)^{-1} \bf {G}^T \bf {\Sigma }(\bf {\theta })^{-1} \end{eqnarray} (35)and $$\bf {\Sigma }(\bf {\theta }) = \bf {C}_{\eta _i} + C_{u_i}(\bf {P},\bf {P};\bf {\theta })$$. In the above equation, $$\bf {d}^*_i$$ consists of a single subset of data. The estimated parameters are summarized in Table 1. Based on the interquartile ranges, the estimated parameters for the SE and Wendland covariance functions do not vary significantly between SSEs. This suggests that the median of the estimated parameters should be appropriate for all the Pacific Northwest SSEs. For the IBM model, there are several anomalously large values for ℓ and ϕ, which is why the interquartile ranges are large. Table 1. Optimal values for the parameters in Xi and Ti determined with the REML method. The covariance function for Ti is indicated by the ‘Ti’ column. The SE, Wendland and IBM covariance functions are defined in eqs (31), (32) and (33), respectively. We use eq. (30) for Xi. The parameters are estimated for each of the seven SSEs considered in this study, and the tabulated values indicate the median and interquartile ranges of estimated values. The ‘diff log (REML)’ column compares the optimal log REML likelihoods to those for when Ti is the SE covariance function. Negative values indicate that observations are more consistent with the SE covariance function. Ti  Component  ℓ  ϕ  τ  diff. log (REML)  SE  East  92 ± 25 km  0.62 ± 0.11 mm  0.026 ± 0.011 yr  –  SE  North  91 ± 53 km  0.43 ± 0.05 mm  0.030 ± 0.017 yr  –  Wendland  East  95 ± 30 km  0.66 ± 0.15 mm  0.093 ± 0.044 yr  0.78 ± 0.87  Wendland  North  92 ± 57 km  0.46 ± 0.10 mm  0.116 ± 0.057 yr  0.08 ± 0.58  IBM  East  110 ± 130 km  290 ± 420 mm yr−1.5  –  − 16.4 ± 7.8  IBM  North  150 ± 560 km  110 ± 250 mm yr−1.5  –  − 10.1 ± 2.3  Ti  Component  ℓ  ϕ  τ  diff. log (REML)  SE  East  92 ± 25 km  0.62 ± 0.11 mm  0.026 ± 0.011 yr  –  SE  North  91 ± 53 km  0.43 ± 0.05 mm  0.030 ± 0.017 yr  –  Wendland  East  95 ± 30 km  0.66 ± 0.15 mm  0.093 ± 0.044 yr  0.78 ± 0.87  Wendland  North  92 ± 57 km  0.46 ± 0.10 mm  0.116 ± 0.057 yr  0.08 ± 0.58  IBM  East  110 ± 130 km  290 ± 420 mm yr−1.5  –  − 16.4 ± 7.8  IBM  North  150 ± 560 km  110 ± 250 mm yr−1.5  –  − 10.1 ± 2.3  View Large Next, we identify which covariance function for Ti best describes the time dependence of deformation from SSEs. One approach is to choose the covariance function that produces the largest optimal REML likelihoods, similar to the analysis in Langbein (2004). In Table 1, we summarize how the optimal REML likelihoods for the Wendland and IBM covariance functions compare to those for the SE covariance function. Based on the differences in optimal REML likelihoods, the data is substantially more likely to come from a Gaussian process with an SE or Wendland covariance function than an IBM covariance function. The REML likelihoods do not definitively indicate whether the SE or Wendland covariance function is preferable. A more intuitive way of deciding which function to use for Ti is to compare the posterior displacements and the observed displacements. The posterior displacements consist of our estimate of transient displacements, secular trends, and seasonal deformation. Specifically, the posterior displacement vector is $$\skew6\hat{\bf {d}}_i = \left( u_i(\bf {P}) + \bf {G}\bf {m}_i \right) | \left( \bf {d}_i = \bf {d}_i^* \right)$$. Following a similar procedure as in Section 2, it can be shown that $$\skew4\hat{\bf {d}}_i$$ has a multivariate normal distribution with mean vector   \begin{eqnarray} \bf {\mu }_{\hat{d}_i} = \left[\begin{array}{c@{\quad}c}C_{u_i}(\bf {P},\bf {P}) & \bf {G}\\ \end{array}\right] \left[\begin{array}{c@{\quad}c}\bf {C}_{\eta _i} + C_{u_i}(\bf {P},\bf {P}) & \bf {G}\\ \bf {G}^T & \bf {0}\\ \end{array}\right]^{-1} \left[\begin{array}{c}\bf {d}_i^* \\ \bf {0}\\ \end{array}\right] \end{eqnarray} (36)and covariance matrix   \begin{eqnarray} \bf {C}_{\hat{d}_i} &=& C_{u_i}(\bf {P},\bf {P}) - \left[\begin{array}{c@{\quad}c}C_{u_i}(\bf {P},\bf {P}) & \bf {G}\\ \end{array}\right]\nonumber\\ &&\times\,\left[\begin{array}{c@{\quad}c}\bf {C}_{\eta _i} + C_{u_i}(\bf {P},\bf {P}) & \bf {G}\\ \bf {G}^T & \bf {0}\\ \end{array}\right]^{-1} \left[\begin{array}{c}C_{u_i}(\bf {P},\bf {P}) \\ \bf {G}^T \\ \end{array}\right]. \end{eqnarray} (37)We compute $$\skew6\hat{\bf {d}}_i$$ using the SE, Wendland, and IBM covariance functions for Ti, and we use the median values from Table 1 for $$\bf {\theta }$$. Fig. 5 compares $$\bf {d}_i^*$$ to $$\skew6\hat{\bf {d}}_i$$ during the winter 2015–2016 SSE at stations ALBH, P436 and SC02, which are the stations that recorded the strongest signal from this SSE. Based on Fig. 5, the chosen covariance function for Ti has almost no effect on $$\skew6\hat{\bf {d}}_i$$. The posterior displacements for the IBM covariance function contain slightly more high frequency, and perhaps spurious, features. Regardless of the chosen covariance function, $$\skew6\hat{\bf {d}}_i$$ appears to underestimate the rate of deformation during the SSE at stations ALBH and SC02. The deformation at these two stations is particularly rapid compared to the surrounding stations, and the misfit between $$\bf {d}_i^*$$ and $$\skew6\hat{\bf {d}}_i$$ is likely due to oversmoothing. This oversmoothing could indicate that the chosen values for τ or ℓ are too large. However, $$\skew6\hat{\bf {d}}_i$$ does faithfully fit $$\bf {d}_i^*$$ at the remaining stations, and so we do not attempt to further refine the parameters. Figure 5. View largeDownload slide Easting component of the observed displacements, $$\bf {d}_\mathrm{e}^*$$, and posterior displacements, $$\skew6\hat{\bf {d}}_\mathrm{e}$$, during the winter 2015–2016 SSE. We show $$\skew6\hat{\bf {d}}_\mathrm{e}$$ for when Ti is an SE, Wendland and IBM covariance function. The one standard deviation uncertainties are shown for $$\bf {d}_\mathrm{e}^*$$ and the $$\skew6\hat{\bf {d}}_\mathrm{e}$$ that has an SE covariance function for Ti. For clarity, uncertainties are not shown for the $$\skew6\hat{\bf {d}}_\mathrm{e}$$ that have an IBM or Wendland covariance function for Ti. The posterior displacements for the different covariance functions are all practically indistinguishable. Figure 5. View largeDownload slide Easting component of the observed displacements, $$\bf {d}_\mathrm{e}^*$$, and posterior displacements, $$\skew6\hat{\bf {d}}_\mathrm{e}$$, during the winter 2015–2016 SSE. We show $$\skew6\hat{\bf {d}}_\mathrm{e}$$ for when Ti is an SE, Wendland and IBM covariance function. The one standard deviation uncertainties are shown for $$\bf {d}_\mathrm{e}^*$$ and the $$\skew6\hat{\bf {d}}_\mathrm{e}$$ that has an SE covariance function for Ti. For clarity, uncertainties are not shown for the $$\skew6\hat{\bf {d}}_\mathrm{e}$$ that have an IBM or Wendland covariance function for Ti. The posterior displacements for the different covariance functions are all practically indistinguishable. For our estimates of transient strain discussed in the next section, we ultimately settle on the Wendland covariance function for Ti, and we use the median values from Table 1 for the parameters. We choose the Wendland covariance function over the SE covariance function because of its computational advantages. 3.3 Transient Strain Rates Having established a noise model and a prior for transient displacements, we can now calculate transient strain rates, $$\dot{\varepsilon }$$, in the Puget Sound region from the GNSS data. We evaluate $$\dot{\varepsilon }$$ at a grid of points spanning the study area for each day from 2010 January 1 to 2017 May 15. In Fig. 6, we show a map view of $$\dot{\varepsilon }$$ on January 1, 2016, which coincides with the height of the winter 2015–2016 SSE. In Fig. 7, we show a time series of $$\dot{\varepsilon }$$ at a position on the Olympic Peninsula, which is where $$\dot{\varepsilon }$$ tends to be the largest during SSEs. We also include a supplementary animation showing a map view of $$\dot{\varepsilon }$$ over time. In each figure, we show the mean and standard deviation of $$\dot{\varepsilon }$$, making it easy to identify which features are statistically significant. Figure 6. View largeDownload slide Transient strain rates during the winter 2015–2016 SSE. The glyphs show the normal strain rates as a function of azimuth, where orange indicates compression and blue indicates extension. The shaded regions show the one standard deviation uncertainties for the normal strain rates. Figure 6. View largeDownload slide Transient strain rates during the winter 2015–2016 SSE. The glyphs show the normal strain rates as a function of azimuth, where orange indicates compression and blue indicates extension. The shaded regions show the one standard deviation uncertainties for the normal strain rates. Figure 7. View largeDownload slide Components of the transient strain rates at the position shown in Fig. 2. The shaded regions indicate the one standard deviation uncertainties. Figure 7. View largeDownload slide Components of the transient strain rates at the position shown in Fig. 2. The shaded regions indicate the one standard deviation uncertainties. The transient strain rates shown for the winter 2015–2016 SSE in Fig. 6 are characteristic of most SSEs in the Puget Sound region. The SSEs cause trench perpendicular compression in the Olympic Peninsula and extension east of Puget Sound. The strain transitions from compression to extension around the southern tip of Vancouver Island, coinciding with where fault slip tends to be inferred (e.g. Dragert et al. 2001; Wech et al. 2009; Schmidt & Gao 2010). Hence, this pattern of strain is to be expected. Over decadal timescales, there is trench perpendicular compression throughout this study region caused by steady tectonic plate motion (Murray & Lisowski 2000; McCaffrey et al. 2007, 2013). When comparing the transient strain rates caused by SSEs to the secular strain rates, we see that SSEs are concentrating tectonically accumulated strain energy towards the trench, and presumably pushing the subduction zone closer to failure. To verify that the estimated transient strain rates are accurately identifying geophysical signal, we compare the signal-to-noise ratio from eq. (23) at a point on the Olympic Peninsula to the frequency of seismic tremor (Fig. 8). A signal-to-noise ratio greater than ∼3 can be interpreted as a detected geophysical signal. We detect nine distinct events, which each correspond to peaks in seismic tremor. The smaller events detected in August 2014 and February 2017 can be considered inter-SSE events. They were not among the SSEs used to constrain the prior covariance function. In between peaks in seismic tremor, the signal-to-noise ratio is consistently between 0 and 2, suggesting that all the transient strain detected at this point on the Olympic Peninsula is associated with SSEs and inter-SSE events. Figure 8. View largeDownload slide Top: signal-to-noise ratio (eq. 23) at the position shown in Fig. 2. Bottom: frequency of seismic tremors in the region shown in Fig. 2. Figure 8. View largeDownload slide Top: signal-to-noise ratio (eq. 23) at the position shown in Fig. 2. Bottom: frequency of seismic tremors in the region shown in Fig. 2. The results we have presented indicate that we are identifying the transient strain that we should expect to see. This is not to say that there are no unexpected features in $$\dot{\varepsilon }$$ that are worth further exploration. However, a discussion on features in $$\dot{\varepsilon }$$ that have unknown origin is beyond the scope of this paper. 4 DISCUSSION Our results demonstrate that GPR is an effective tool for estimating transient strain rates and detecting geophysical processes from GNSS data. However, there are some aspects of our method that warrant further research. For example, there is potential for a more thorough analysis of the spatiotemporal noise in the GNSS data than what was performed in Section 3.1. Specifically, we did not explore spatially correlated noise, such as common mode error. We have assumed that any spatially correlated noise has a sufficiently long wavelength that it has a negligible effect on transient strain rates. We can also improve upon the seasonal noise model used in this study, which consists of four sinusoids for each station. We did not explore the roughness of the seasonal deformation (i.e. the number of sinusoids needed to describe the deformation). The periodic Gaussian process (Mackay 1998) is an alternative model for seasonal deformation and is well suited for exploring the roughness of seasonal deformation. The periodic Gaussian process has zero mean and the covariance function   \begin{eqnarray} T(t,t^{\prime }) = \phi ^2 \exp \left(\frac{-\sin (\pi |t - t^{\prime }|)^2}{2\tau ^2}\right). \end{eqnarray} (38)Realizations have annual periodicity and the roughness is controlled by τ. Decreasing τ has the same effect as including higher frequency sinusoids in the seasonal model. The optimal value for τ can be found with the REML method. Another potential research direction would be to reduce the computational cost of our method for estimating transient strain rates. GPR is generally computationally expensive when there are many observations. The transient strain rates estimated in this study are constrained by seven years of daily displacement observations from 94 GNSS stations, which amounts to about 240 000 observations for each displacement component. For a data set with this size, it is difficult to evaluate the matrix inverses in eqs (10) and (11). We alleviate this computational cost by using a compact Wendland covariance function for our prior. By using a compact covariance function for our prior, eqs (10) and (11) become sparse systems of equations that are more tractable to evaluate. However, the sparsity decreases when the Wendland covariance function has a larger timescale parameter. The Wendland covariance function then offers less of a computational advantage when we are interested in geophysical processes that occur over longer timescales. An alternative approach to dealing with large data sets would be to explore approximation methods for GPR (Rasmussen & Williams 2006, ch. 8). In addition to detecting geophysical processes, the GNSS derived transient strain rates can be used to better understand the data from borehole strain metres (BSMs). The PBO contains about forty BSMs in the Pacific Northwest, and it has been demonstrated that BSMs are able to record transient geophysical events such as SSEs (e.g. Dragert & Wang 2011). However, there are complications that prevent BSM data from being used quantitatively in geophysical studies. One difficulty is that BSM data should be calibrated with a well-known strain source, such as diurnal and semi-diurnal tides (Hart et al. 1996; Roeloffs 2010; Hodgkinson et al. 2013). Unfortunately, the tidal forces at BSMs which record SSEs are strongly influenced by local bodies of water such as the Straight of Juan de Fuca, making it difficult to form a theoretical prediction of tidal strains (Roeloffs 2010). Another complication is that noise in BSM data is not well understood. The noise consists, in part, of a long-term decay resulting from the instrument equilibrating with the surrounding rock (Gladwin et al. 1987). Typically, this noise is dealt with in an ad-hoc manner by fitting and removing exponentials and low-order polynomials. We envision that the GNSS derived strain rates from this paper can be used as a reference strain for calibrating BSM data and quantify its noise. 5 CONCLUSION We propose using GPR to estimate transient strain rates from GNSS data. Most other methods for estimating strain rates assume a parametric representation of deformation, which can bias the results if the parameterization is not chosen carefully. Here we assume a stochastic, rather than parametric, prior model for displacements. Our prior model describes how much we expect transient displacements to covary spatially and temporally. If we know nothing about the underlying signal that we are trying to recover, then the prior model can be chosen objectively with maximum likelihood methods. We demonstrate that GPR is an effective tool for detecting geophysical processes, such as SSEs, in our application to the Pacific Northwest. While this paper just focuses on using GPR to estimate transient strain rates and detect geophysical processes, we believe that GPR is a powerful tool that can be applied to a wide range of geophysical problems. Acknowledgements This material is based upon work supported by the National Science Foundation under grant EAR 1245263. The EarthScope Plate Boundary Observatory data are provided by UNAVCO through the GAGE Facility with support from the National Science Foundation (NSF) and National Aeronautics and Space Administration (NASA) under NSF Cooperative Agreement EAR-1261833. An implementation of the method described in this paper is named Python-based Geodetic Network Strain software (PyGeoNS). PyGeoNS is distributed under the MIT License and can be found at www.github.com/treverhines/PyGeoNS. REFERENCES Acheson D.T., 1975. Data editing—subroutine editq, Tech. rep., US Department of Commerce, National Oceanic and Atmospheric Administration, Environmental Data Service. Adler R.J., 1981. The Geometry of Random Fields , John Wiley & Sons. Beavan J., Haines J., 2001. Contemporary horizontal velocity and strain rate fields of the Pacific-Australian plate boundary zone through New Zealand, J. geophys. Res. , 106( B1), 741– 770. Google Scholar CrossRef Search ADS   Blewitt G., Kreemer C., Hammond W.C., Gazeaux J., 2016. MIDAS robust trend estimator for accurate GPS station velocities without step detection, J. geophys. Res. , 121, 2054– 2068. Google Scholar CrossRef Search ADS   Bock Y. Melgar D., 2016. Physical applications of GPS geodesy: a review, Rep. Prog. Phys. , 79( 10), 106801, 16– 23. Google Scholar CrossRef Search ADS   Bos M.S., Fernandes R.M.S., Williams S.D.P., Bastos L., 2013. Fast error analysis of continuous GNSS observations with missing data, J. Geod. , 87( 4), 351– 360. Google Scholar CrossRef Search ADS   Chen Y., Davis T.a., Hager W.W., 2008. Algorithm 887 : CHOLMOD, supernodal sparse Cholesky factorization and update/downdate, ACM Trans. Math. Softw. , 35( 3), 1– 12. Google Scholar CrossRef Search ADS   Cressie N., 1993. Statistics for Spatial Data , John Wiley & Sons, rev. edn. Dmitrieva K., Segall P., DeMets C., 2015. Network-based estimation of time-dependent noise in GPS position time series, J. Geod. , 89( 6), 591– 606. Google Scholar CrossRef Search ADS   Dong D. Fang P. Bock Y. Cheng M.K. Miyazaki S., 2002. Anatomy of apparent seasonal variations from GPS-derived site position time series, J. geophys. Res. , 107( B4), ETG 9-1–ETG 9-16. Dong D., Fang P., Bock Y., Webb F., Prawirodirdjo L., Kedar S., Jamason P., 2006. Spatiotemporal filtering using principal component analysis and Karhunen–Loeve expansion approaches for regional GPS network analysis, J. geophys. Res. , 111( 3), 1– 16. Google Scholar PubMed  Dragert H., Wang K., 2011. Temporal evolution of an episodic tremor and slip event along the northern Cascadia margin, J. geophys. Res. , 116( 12), 1– 12. Dragert H., Wang K., James T.S., 2001. A silent slip event on the deeper Cascadia subduction interface, Science , 292, 1525– 1528. Google Scholar CrossRef Search ADS PubMed  El-Fiky G.S., Kato T., 1998. Continuous distribution of the horizontal strain in the Tohoku district, Japan, predicted by least-squares collocation, J. Geodyn. , 27( 2), 213– 236. Google Scholar CrossRef Search ADS   Feigl K.L., King R.W., Jordan T.H., 1990. Geodetic measurement of tectonic deformation in the Santa Maria Fold and Thrust Belt, California, J. geophys. Res. , 95( B3), 2679– 2699. Google Scholar CrossRef Search ADS   Field E.H. et al.  , 2014. Uniform California Earthquake Rupture Forecast, version 3 (UCERF3) -The time-independent model, Bull. seism. Soc. Am. , 104( 3), 1122– 1180. Google Scholar CrossRef Search ADS   Freed a.M., Lin J., 2001. Delayed triggering of the 1999 Hector Mine earthquake by viscoelastic stress transfer, Nature , 411, 180– 183. Google Scholar CrossRef Search ADS PubMed  Gazeaux J. et al.  , 2013. Detecting offsets in GPS time series: first results from the detection of offsets in GPS experiment, J. geophys. Res. , 118( 5), 2397– 2407. Google Scholar CrossRef Search ADS   Gladwin M.T., Gwyther R.L., Hart R., Francis M., Johnston M.J.S., 1987. Borehole tensor strain measurements in California, J. geophys. Res. , 92( B8), 7981– 7988. Google Scholar CrossRef Search ADS   Hammond W.C. Blewitt G. Kreemer C., 2016. GPS imaging of vertical land motion in California and Nevada: Implications for Sierra Nevada uplift, J. geophys. Res . Hart R. H.G. Gladwin M.T. Gwyther R.L. Agnew D.C. Wyatt F.K., 1996. Tidal calibration of borehole strain meters: Removing the effects of small-scale inhomogeneity, J. geophys. Res. , 101( 96). Harville D.A., 1974. Bayesian Inference for Variance Components Using Only Error Contrasts, Biometrika , 61( 2), 383– 385. Google Scholar CrossRef Search ADS   He X., Montillet J.P., Fernandes R., Bos M., Yu K., Hua X., Jiang W., 2017. Review of current GPS methodologies for producing accurate time series and their error sources, J. Geodyn. , 106, 12– 29. Google Scholar CrossRef Search ADS   Herring T.A. et al.  , 2016. Plate Boundary Observatory and related networks: GPS data analysis methods and geodetic product, Reviews of Geophysics , pp. 1– 50. Google Scholar CrossRef Search ADS   Hines T.T., Hetland E.A., 2016. Rheologic constraints on the upper mantle from five years of postseismic deformation following the El Mayor-Cucapah earthquake, J. geophys. Res. , 121, 6809– 6827. Google Scholar CrossRef Search ADS   Hodgkinson K., Agnew D., Roeloffs E., 2013. Working With Strainmeter Data, EOS, Trans. Am. geophys. Un. , 94( 9), 91. Google Scholar CrossRef Search ADS   Holt W.E., Shcherbenko G., 2013. Toward a continuous monitoring of the horizontal displacement gradient tensor field in Southern California using cGPS observations from Plate Boundary Observatory (PBO), Seismol. Res. Lett. , 84( 3), 455– 467. Google Scholar CrossRef Search ADS   Johansson J.M. et al.  , 2002. Continuous GPS measurements of postglacial adjustment in Fennoscandia 1. Geodetic results, J. geophys. Res. , 107, ETG 3-1–ETG 3-27. Kato T., El-Fiky G.S., Oware E.N., Miyazaki S., 1998. Crustal strains in the Japanese islands as deduced from dense GPS array, Geophys. Res. Lett. , 25( 18), 3445– 3448. Google Scholar CrossRef Search ADS   Langbein J., 2004. Noise in two-color electronic distance meter measurements revisited, J. geophys. Res. , 109( 4), 1– 16. Langbein J., 2008. Noise in GPS displacement measurements from Southern California and Southern Nevada, J. geophys. Res. , 113( 5), 1– 12. Langbein J., Johnson H., 1997. Correlated errors in geodetic time series: implications for time-dependent deformation, J. geophys. Res. , 102( B1), 591– 603. Google Scholar CrossRef Search ADS   Lohman R.B., Murray J.R., 2013. The SCEC geodetic transient-detection validation exercise, Seism. Res. Lett. , 84( 3), 419– 425. Google Scholar CrossRef Search ADS   Mackay D.J.C., 1998. Introduction to gaussian processes, Neural Netw. Mach. Learn. , 168( 1996), 133– 165. Mao A., Harrison G.A., Dixon H., 1999. Noise in GPS coordinate time series, J. geophys. Res. , 104( B2), 2797– 2816. Google Scholar CrossRef Search ADS   McCaffrey R. et al.  , 2007. Fault locking, block rotation and crustal deformation in the Pacific Northwest, Geophys. J. Int. , 169( 3), 1315– 1340. Google Scholar CrossRef Search ADS   McCaffrey R., King R.W., Payne S.J., Lancaster M., 2013. Active tectonics of northwestern U.S. inferred from GPS-derived surface velocities, J. geophys. Res. , 118, 709– 723. Google Scholar CrossRef Search ADS   McGuire J.J., Segall P., 2003. Imaging of aseismic fault slip transients recorded by dense geodetic networks, Geophys. J. Int. , 155, 778– 788. Google Scholar CrossRef Search ADS   Meade B.J., Hager B.H., 2005. Block models of crustal motion in southern California constrained by GPS measurements, J. geophys. Res. , 110, 1– 19. Google Scholar CrossRef Search ADS   Moritz H., 1978. Least-Squares Collocation, Rev. Geophys. , 16( 3), 421– 430. Google Scholar CrossRef Search ADS   Murray J.R., Svarc J., 2017. Global Positioning System data collection, processing, and analysis conducted by the U.S. Geological Survey Earthquake Hazards Program, Seismol. Res. Lett. , 88( 3), 1– 10. Google Scholar CrossRef Search ADS   Murray M.H., Lisowski M., 2000. Strain accumulation along the Cascadia subduction zone in western Washington, Geophys. Res. Lett. , 27( 22), 3631– 3634. Google Scholar CrossRef Search ADS   Ohtani R., McGuire J.J., Segall P., 2010. Network strain filter: A new tool for monitoring and detecting transient deformation signals in GPS arrays, J. geophys. Res. , 115( 12), 1– 17. Okada Y., 1992. Internal deformation due to shear and tensile faults in a half space, Bull. seism. Soc. Am. , 82( 2), 1018– 1040. Papoulis A., 1991. Probability, Random Variables, and Stochastic Processes , 3rd edn, McGraw-Hill. Press W.H. Flannery B.P. Teukolsky S.A. Vetterling W.T., 2007. Numerical Recipes: The Art of Scientific Computing , 3rd edn, Cambridge Univ. Press. Rasmussen C.E. Williams C.K.I., 2006. Gaussian Processes for Machine Learning , The MIT Press. Roeloffs E., 2010. Tidal calibration of Plate Boundary Observatory borehole strainmeters: roles of vertical and shear coupling, J. geophys. Res. , 115( 6), 1– 25. Roeloffs E.A., 2006. Evidence for aseismic deformation rate changes prior to earthquakes, Annu. Rev. Earth Planet. Sci. , 34( 1), 591– 627. Google Scholar CrossRef Search ADS   Rogers G., Dragert H., 2003. Episodic tremor and slip on the Cascadia subduction zone: the chatter of silent slip, Science , 300, 1942– 1943. Google Scholar CrossRef Search ADS PubMed  Sandwell D.T. Wessel P., 2016. Interpolation of 2-D vector data using constraints from elasticity, Geophys. Res. Lett. , 43, 10 703– 10 709. Google Scholar CrossRef Search ADS   Schmidt D.A., Gao H., 2010. Source parameters and time-dependent slip distributions of slow slip events on the Cascadia subduction zone from 1998 to 2008, J. geophys. Res. , 115( 4), 1– 13. Schwartz S.Y., Rokosky J.M., 2007. Slow slip events and seismic tremor at circum-Pacific subduction zones, Rev. Geophys. , 45, 1– 32. Google Scholar CrossRef Search ADS   Segall P. Mathews M., 1997. Time dependent inversion of geodetic data, J. geophys. Res. , 102( B10), 22 391–22 409. Shen Z., Wang M., Zeng Y., Wang F., 2015. Optimal Interpolation of Spatially Discretized Geodetic Data, Bull. seism. Soc. Am. , 105( 4), 2117– 2127. Google Scholar CrossRef Search ADS   Shen Z.K., Jackson D.D., Ge B.X., Bob X.G., 1996. Crustal deformation across and beyond the Los Angeles basin from geodetic measurements, J. geophys. Res. , 101( B12), 27927– 27957. Google Scholar CrossRef Search ADS   Tape C., Musé P., Simons M., Dong D., Webb F., 2009. Multiscale estimation of GPS velocity fields, Geophys. J. Int. , 179( 2), 945– 971. Google Scholar CrossRef Search ADS   von Mises R., 1964. Mathematical Theory of Probability and Statistics , Academic Press. Wdowinski S. Zhang J. Fang P. Genrich J., 1997. Southern California Permanent GPS Geodetic Array: Spatial filtering of daily positions for estimating coseismic and postseismic displacements induced by the 1992 Landers earthquake, 102( 97), 57– 70. Wech A.G., 2010. Interactive Tremor Monitoring, Seismol. Res. Lett. , 81( 4), 664– 669. Google Scholar CrossRef Search ADS   Wech A.G., Creager K.C., Melbourne T.I., 2009. Seismic and geodetic constraints on Cascadia slow slip, J. geophys. Res. , 114( 10), 1– 9. Wendland H., 2005. Scattered Data Approximation , Cambridge Univ. Press. Williams S.D.P. Bock Y. Fang P. Jamason P. Nikolaidis R.M. Prawirodirdjo L. Miller M. Johnson D.J., 2004. Error analysis of continuous GPS position time series, J. geophys. Res. , 109( B3), 1– 19. Google Scholar CrossRef Search ADS   Wyatt F., 1982. Displacement of Surface Monuments: Horizontal Motion, J. geophys. Res. , 87( B2), 979– 989. Google Scholar CrossRef Search ADS   Wyatt F.K., 1989. Displacement of surface monuments: Vertical motion, J. geophys. Res. , 94( B2), 1655– 1664. Google Scholar CrossRef Search ADS   Zhang J. Bock Y. Johnson H. Fang P. Williams S. Genrich J. Wdowinski S. Behr J., 1997. Southern California Permanent GPS Geodetic Array: error analysis of daily position estimates and site velocities, J. geophys. Res. , 102( B8), 18 035–18 055. SUPPORTING INFORMATION Supplementary data are available at GJI online. strain-animation.mp4 Please note: Oxford University Press is not responsible for the content or functionality of any supporting materials supplied by the authors. Any queries (other than missing material) should be directed to the corresponding author for the paper. APPENDIX: OUTLIER DETECTION ALGORITHM Our outlier detection algorithm is loosely based on the data editing algorithm from Acheson (1975). Let $$\bf {d}^*$$ denote all n GNSS displacement observations for a single directional component, which have been made at positions and times $$\bf {P}$$. We describe $$\bf {d}^*$$ as a realization of the random vector   \begin{eqnarray} \bf {d}= \bf {G}\bf {m} + v(\bf {P}) + \bf {w}, \end{eqnarray} (A1)where $$\bf {G}$$ and $$\bf {m}$$ are the same as in eq. (4), v is a Gaussian process distributed as $$\mathcal {GP}(0,C_v)$$, and $$\bf {w}$$ is a vector of uncorrelated Gaussian noise with known standard deviations $$\bf {\sigma }$$. The Gaussian process v is intended to describe transient features in the data that cannot be explained by the linear trend or seasonal terms in $$\bf {G}$$. We let the temporal covariance of v be a squared exponential, and we let v be spatially uncorrelated so that   \begin{eqnarray} C_v(p,p^{\prime }) = \phi ^2\exp \left(\frac{-|t - t^{\prime }|^2}{2\tau ^2}\right) \delta _{\vec{x},\vec{x}^{\prime }}, \end{eqnarray} (A2)where $$\delta _{\vec{x},\vec{x}^{\prime }} = 1$$ if $$\vec{x}= \vec{x}^{\prime }$$ and 0 otherwise. The spatial covariance of v has little effect on the detected outliers, and so we have assumed that v is spatially uncorrelated for simplicity. Based on our experience, v can reasonably describe most transient features in the data when we set ϕ = 1 mm and τ = 10 d. Our goal is to find the index set of non-outliers in $$\bf {d}^*$$, which we denote as $$\bf {\Omega }$$. We use a tilde to indicate that an array only contains elements corresponding to $$\bf {\Omega }$$ (e.g. the vector of non-outlier observations is denoted $$\tilde{\bf {d}}^* = [d_i^*]_{i \in \bf {\Omega }}$$). The outliers are identified iteratively, and we initiate $$\bf {\Omega }$$ with all n indices. We consider outliers to be data that are poorly explained by the model $$\bf {G}\bf {m} + v(\bf {P})$$, which are determined by the residual vector   \begin{eqnarray} \bf {r} &=& \bf {d}^* - \mathrm{E}\left[ \big (\bf {G}\bf {m} + v(\bf {P})\big ) \big | \big (\tilde{\bf {d}} = \tilde{\bf {d}}^*\big ) \right]= \bf {d}^* - \left[\begin{array}{c@{\quad}c}C_v(\bf {P},\tilde{\bf {P}}) & \bf {G} \end{array}\right]\nonumber\\ &&\times \left[\begin{array}{c@{\quad}c}C_v(\tilde{\bf {P}},\tilde{\bf {P}}) + \mathrm{diag}(\tilde{\bf {\sigma }}^2) & \tilde{\bf {G}} \\ \tilde{\bf {G}}^T & \bf {0}\\ \end{array}\right]^{-1} \left[\begin{array}{c}\tilde{\bf {d}}^* \\ \bf {0}\\ \end{array}\right]. \end{eqnarray} (A3)Data with abnormally large residuals are identified as outliers. For each iteration, we compute $$\bf {r}$$ and then update $$\bf {\Omega }$$ so that it contains the indices of $$\bf {r}$$ whose weighted values are less than λ times the weighted root mean square of $$\tilde{\bf {r}}$$,   \begin{eqnarray} \bf {\Omega } \leftarrow \left\lbrace i : \left|\frac{r_i}{\sigma _i}\right| < \lambda \cdot \sqrt{\frac{1}{|\bf {\Omega }|} \sum _{j \in \bf {\Omega }} \frac{r_j^2}{\sigma _j^2}} \right\rbrace . \end{eqnarray} (A4)Iterations continue until the new $$\bf {\Omega }$$ is the same as the previous $$\bf {\Omega }$$. The outlier detection algorithm is demonstrated in Fig. A1. For the demonstration, we use the easting component of displacements at a single station, SC03, which is located on Mt. Olympus in Washington state. Station SC03 records anomalous observations during the winter, presumably because of snow and ice accumulation, and we want to remove these observations. The station also records periodic westward motion from SSEs, and we want to keep this deformation intact. The detected outliers are shown in panel (b). For comparison, we also show the detected outliers when we do not include the Gaussian process v in our model for the data (panel a). When v is not included, real transient deformation resulting from SSEs is erroneously identified as outliers. When v is included, the identified outliers only consist of the anomalous deformation that lacks temporal continuity. It should be noted that we use λ = 2.5 for this demonstration, which causes the outlier detection algorithm to be particularly aggressive. In Section 3, we clean the data using a more tolerant λ = 4.0. Figure A1. View largeDownload slide Outliers detected in the easting component of displacements at station SC03. The orange markers indicate detected outliers. The orange line is the best fit model to the data, which is used to compute the residual vector $$\bf {r}$$. The model being fit to the data in panel (a) is $$\bf {G}\bf {m}$$, and the model in panel (b) is $$\bf {G}\bf {m} + v(\bf {P})$$. Figure A1. View largeDownload slide Outliers detected in the easting component of displacements at station SC03. The orange markers indicate detected outliers. The orange line is the best fit model to the data, which is used to compute the residual vector $$\bf {r}$$. The model being fit to the data in panel (a) is $$\bf {G}\bf {m}$$, and the model in panel (b) is $$\bf {G}\bf {m} + v(\bf {P})$$. © The Author(s) 2017. Published by Oxford University Press on behalf of The Royal Astronomical Society. http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Geophysical Journal International Oxford University Press

# Revealing transient strain in geodetic data with Gaussian process regression

, Volume 212 (3) – Mar 1, 2018
15 pages

/lp/ou_press/revealing-transient-strain-in-geodetic-data-with-gaussian-process-kvH7AHSrrf
Publisher
The Royal Astronomical Society
ISSN
0956-540X
eISSN
1365-246X
D.O.I.
10.1093/gji/ggx525
Publisher site
See Article on Publisher Site

### Abstract

Summary Transient strain derived from global navigation satellite system (GNSS) data can be used to detect and understand geophysical processes such as slow slip events and post-seismic deformation. Here we propose using Gaussian process regression (GPR) as a tool for estimating transient strain from GNSS data. GPR is a non-parametric, Bayesian method for interpolating scattered data. In our approach, we assume a stochastic prior model for transient displacements. The prior describes how much we expect transient displacements to covary spatially and temporally. A posterior estimate of transient strain is obtained by differentiating the posterior transient displacements, which are formed by conditioning the prior with the GNSS data. As a demonstration, we use GPR to detect transient strain resulting from slow slip events in the Pacific Northwest. Maximum likelihood methods are used to constrain a prior model for transient displacements in this region. The temporal covariance of our prior model is described by a compact Wendland covariance function, which significantly reduces the computational burden that can be associated with GPR. Our results reveal the spatial and temporal evolution of strain from slow slip events. We verify that the transient strain estimated with GPR is in fact geophysical signal by comparing it to the seismic tremor that is associated with Pacific Northwest slow slip events. Satellite geodesy, Transient deformation, Spatial analysis, Statistical methods, Time-series analysis, Dynamics of lithosphere and mantle 1 INTRODUCTION Crustal strain rates are fundamentally important quantities for assessing seismic hazard. Knowing where and how quickly strain is accumulating gives insight into where we can expect stored elastic energy to be released seismically. Estimates of secular, or long-term, strain rates have been used to constrain seismic hazard models such as UCERF3 (Field et al. 2014). Transient strain, which is anomalous strain caused by geophysical processes such as slow slip events (SSEs) or post-seismic deformation, is also relevant for assessing seismic hazard. While transient strain itself is not damaging, there is a possibility that it can trigger major earthquakes (Roeloffs 2006; Freed & Lin 2001). Global navigation satellite system (GNSS) networks, such as the Plate Boundary Observatory (PBO), can be used to infer secular and transient strain. Numerous methods for converting GNSS displacement data to strain have been proposed, and most involve assuming some parametric form for the deformation signal. The simplest method for estimating secular strain rates assumes that GNSS derived velocities can be described with a first-order polynomial (i.e. the deformation gradients are constant) over some subnetwork of the GNSS stations (e.g. Feigl et al. 1990; Murray & Lisowski 2000). The components of the secular strain rate tensor for each subnetwork are then determined from the least squares fit to the velocities. The assumption that deformation gradients are spatially uniform is not appropriate when subnetworks span a very large area. To help overcome this deficiency, Shen et al. (1996, 2015) used an inverse distance weighting scheme, in which the estimated secular strain rate at any point is primarily controlled by the velocities at nearby stations. The method of Shen et al. (1996, 2015) can be viewed as a form of local least squares regression with a first-order polynomial. Other methods for estimating secular strain rates have parameterized GNSS derived velocities with bi-cubic splines (Beavan & Haines 2001), spherical wavelets (Tape et al. 2009), and elastostatic Green’s functions (Sandwell & Wessel 2016). The type of basis functions and the number of degrees of freedom for a parameterization, which are often chosen subjectively, have a strong influence on the solution. If there are too few degrees of freedom in the parameterization, then estimated secular strain rates will be biased and the uncertainties will be underestimated. On the other hand, if there are too many degrees of freedom, then there will not be any coherent features in the estimated secular strain rates. The methods described by Beavan & Haines (2001) and Tape et al. (2009) also require the user to specify penalty parameters that control a similar trade-off between bias and variance in the solution. One could also parameterize deformation with a physically motivated model of interseismic deformation (e.g. Meade & Hager 2005; McCaffrey et al. 2007). In such models, the lithospheric rheology and fault geometries are assumed to be known. Any errors in the assumed physical model could result in biased estimates of the secular strain rates and underestimated uncertainties. The aforementioned studies are concerned with estimating secular strain rates. In recent years the Southern California Earthquake Center (SCEC) community has shown interest in developing methods for detecting transient deformation. SCEC supported a transient detection exercise (Lohman & Murray 2013), where several research groups tested their methods for detecting transient geophysical signal with a synthetic GNSS data set. Among the methods tested were the Network Strain Filter (NSF; Ohtani et al. 2010) and the Network Inversion Filter (NIF; Segall & Mathews 1997). The NSF uses a wavelet parameterization to describe the spatial component of geophysical signal. The NIF, which is intended for imaging slow fault slip from geodetic data, uses the elastic dislocation Green’s functions from Okada (1992). For the NSF and NIF, the time dependence of the geophysical signal is modelled as integrated Brownian motion. The method described in Holt & Shcherbenko (2013) was also tested in the SCEC transient detection exercise. Holt & Shcherbenko (2013) detect transient signal from an estimate of transient strain rates, and they estimate transient strain rates using a bi-cubic spatial parameterization of displacements between time epochs. Holt & Shcherbenko (2013) define a detection threshold based on the transient strain rate magnitude, and we demonstrate that this is indeed an effective criterion for identifying a geophysical signal. For the same reasons described above, the transient deformation and corresponding uncertainties estimated by these methods can be biased by the chosen spatial parameterization. It is then difficult to distinguish signal from noise with these methods, which limits their utility for transient detection. Here we propose using Gaussian process regression (GPR) (Rasmussen & Williams 2006) to estimate transient strain rates from GNSS data. GPR is closely related to kriging (Cressie 1993) and least squares collocation (Moritz 1978). The latter has been used by Kato et al. (1998) and El-Fiky & Kato (1998) to estimate secular strain rates from GNSS data. GPR is a Bayesian, non-parametric method for inferring a continuous function from scattered data. Since GNSS stations are irregularly spaced and observation times may differ between stations, GPR is an ideal tool for synthesizing discrete GNSS data into a spatially and temporally continuous representation of transient strain rates. GPR is Bayesian in that we use a prior model to control the spatial and temporal roughness of the inferred transient strain rates. The prior is specified as a stochastic process, namely a Gaussian process. If there is no information available to help choose an appropriate prior Gaussian process, then maximum likelihood methods can be used to objectively choose one that is most consistent with the data. We use GPR to infer transient strain rates resulting from SSEs in the Pacific Northwest, demonstrating that GPR is an effective tool for detecting transient geophysical processes. 2 ESTIMATING TRANSIENT STRAIN RATES We seek a spatially and temporally dependent estimate of transient strain rates. We consider transient strain rates to be any deviation from the secular strain rates. We denote transient strain rates as $$\dot{\varepsilon }(p)$$, where p represents the ordered pair $$(\vec{x},t)$$, $$\vec{x}= (x_\mathrm{e},x_\mathrm{n})$$ are spatial coordinates, and t is time. The subscripts ‘e’ and ‘n’ indicate the east and north component, and we assume that the study region is sufficiently small that $$\vec{x}$$ can be considered a point in a 2-D Cartesian map projection that is aligned with the cardinal directions. We determine $$\dot{\varepsilon }$$ by spatially and temporally differentiating estimates of transient displacements, which we constrain with GNSS data. We let $$\vec{u}(p) = (u_\mathrm{e}(p),u_\mathrm{n}(p))$$ be our prior understanding of transient displacements. Since transient displacements are not precisely known, we cannot consider $$\vec{u}$$ to be a deterministic function. Instead, $$\vec{u}$$ is considered to be a stochastic process containing a distribution of functions that could potentially describe transient displacements. Specifically, we let each component of $$\vec{u}$$ be a Gaussian process. A Gaussian process is a stochastic process whose value at any collection of points can be described with a multivariate normal distribution. That is to say, the random vector $$[u_i(p)]_{p \in \bf {P}}$$ has a multivariate normal distribution for any collection of points $$\bf {P}$$. A realization of the random vector $$[u_i(p)]_{p \in \bf {P}}$$ can be interpreted as a realization of ui (i.e. a sample function) that is evaluated at $$\bf {P}$$. Just as a multivariate normal distribution is fully determined by a mean vector and a covariance matrix, a Gaussian process is fully determined by a mean function and a covariance function. For example, Brownian motion, B(t), is a well-known Gaussian process in $$\mathbb {R}^1$$ which has the mean function E[B(t)] = 0 and the covariance function Cov[B(t), B(t΄)] = ϕ2min (t, t΄), where t, t΄ ≥ 0 and ϕ is a scale factor. We let each component of $$\vec{u}$$ have zero mean and a generic covariance function $$\mathrm{Cov}\left[ u_i(p),u_i(p^{\prime }) \right] = C_{u_i}(p,p^{\prime })$$. Using a more concise notation, we write our prior on each component of $$\vec{u}$$ as $$u_i \sim \mathcal {GP}\left(0,C_{u_i}\right)$$. The function $$C_{u_i}$$ must be positive definite in order to be a valid covariance function. By definition, $$C_{u_i}$$ is a positive definite function if the matrix $$[C_{u_i}(p,p^{\prime })]_{(p,p^{\prime }) \in \bf {P}\times \bf {P}}$$ is positive definite for any set of points $$\bf {P}$$ (Cressie 1993, section 2.5). We assume that $$C_{u_i}$$ can be separated into spatial and temporal functions as   \begin{eqnarray} C_{u_i}(p,p^{\prime }) = C_{u_i}\left((\vec{x},t),(\vec{x}^{\prime },t^{\prime })\right) = X_i(\vec{x},\vec{x}^{\prime })T_i(t,t^{\prime }). \end{eqnarray} (1)As long as the functions Xi and Ti are positive definite, $$C_{u_i}$$ is guaranteed to also be positive definite (Rasmussen & Williams 2006, section 4.2.4). We also require that the derivatives $$\partial ^2 X_i(\vec{x},\vec{x}^{\prime })/ \partial x_j \partial x_j^{\prime }$$ and ∂2Ti(t, t΄)/∂t∂t΄ exist. This ensures that ui is spatially and temporally differentiable, allowing us to compute transient strain rates (see Adler (1981, section 2.2) or Papoulis (1991, appx. 10A) for a definition of stochastic differentiation and the conditions for differentiability). We provide an example to give the prior on transient displacements a more tangible meaning. We can use a squared exponential function for Xi and Ti,   \begin{eqnarray} X_i(\vec{x},\vec{x}^{\prime }) &=& \exp \left(\frac{-||\vec{x}- \vec{x}^{\prime }||_2^2}{2\ell ^2}\right),\nonumber\\ T_i(t,t^{\prime }) &=& \phi ^2 \exp \left(\frac{-|t - t^{\prime }|^2}{2\tau ^2}\right), \end{eqnarray} (2)which satisfies our requirements for positive definiteness and differentiability. The parameters ℓ and τ control the length-scale and timescale, respectively, of realizations of ui. The parameter ϕ, which we have arbitrarily chosen to incorporate into Ti rather than Xi, controls the amplitude of realizations of ui. Ideally, we want realizations of ui to have a length-scale, timescale, and amplitude that resemble what we expect for the true transient displacements. In Fig. 1(a), we show $$C_{u_i}$$ using the squared exponential function for Xi and Ti and the parameters ℓ = 100 km, τ = 10 d, and ϕ = 1 mm. A single realization of ui corresponding to that choice of covariance function is shown in Fig. 1(b) (see Rasmussen & Williams (2006, appx. A.2) for details on drawing realizations from Gaussian processes). While the squared exponential function is a commonly used covariance function for GPR, it is not appropriate for every application. The appropriate choice for Xi and Ti may vary depending on the geophysical signal we are trying to describe. To keep this section sufficiently general, we hold off on specifying Xi and Ti until Section 3.2, where we estimate transient strain from SSEs in the Pacific Northwest. Figure 1. View largeDownload slide (a) An example covariance function for transient displacements, $$C_{u_i}(p,p^{\prime })$$, shown as a function of the spatial and temporal distance between p and p΄. (b) A single realization of transient displacements, ui(p), corresponding to the covariance function from panel (a). The realization is a function of three variables, t, xe and xn, although we only show its dependence on t and one of the spatial dimensions. Figure 1. View largeDownload slide (a) An example covariance function for transient displacements, $$C_{u_i}(p,p^{\prime })$$, shown as a function of the spatial and temporal distance between p and p΄. (b) A single realization of transient displacements, ui(p), corresponding to the covariance function from panel (a). The realization is a function of three variables, t, xe and xn, although we only show its dependence on t and one of the spatial dimensions. GNSS data records transient displacements as well as other physical and non-physical processes that we are not interested in. We first consider component i of a single GNSS displacement observation made at station j, which is located at $$\vec{x}^{(j)}$$, and time t(k). We describe this observation, $$d_i^{*(jk)}$$, as a realization of the random variable   \begin{eqnarray} d_i^{(jk)} &=& u_i\left(\vec{x}^{(j)},t^{(k)}\right) + \eta _i^{(jk)} + a_i^{(j)} + b_i^{(j)}t^{(k)} \nonumber\\ &&+\,c_i^{(j)}\sin \left(2 \pi t^{(k)}\right) + e_i^{(j)}\cos \left(2 \pi t^{(k)}\right)\nonumber\\ &&+\,f_i^{(j)}\sin \left(4 \pi t^{(k)}\right) + g_i^{(j)}\cos \left(4 \pi t^{(k)}\right), \end{eqnarray} (3)where $$\eta _i^{(jk)}$$ describes noise, $$a_i^{(j)}$$ is an offset that is unique for each station, $$b_i^{(j)}$$ is the secular velocity at $$\vec{x}^{(j)}$$, and the sinusoids describe seasonal deformation (using units of years for t(k)). We then consider the column vector of n GNSS displacement observations, $$\bf {d}^*_i$$, where the subscript indicates that we are still only considering component i of displacements. The observations are made at m stations, and the times and positions for each observation are described by the set $$\bf {P}$$. The data vector $$\bf {d}^*_i$$ can be considered a realization of the random vector $$\bf {d}_i$$, which is formed by evaluating eq. (3) at each point in $$\bf {P}$$. To write out $$\bf {d}_i$$ explicitly, we let $$\bf {G}$$ be an n × 6 m matrix consisting of the basis functions from eq. (3) (i.e. the linear trends and sinusoids for each station) evaluated at each point in $$\bf {P}$$. The coefficients corresponding to each basis function are collected into the column vector $$\bf {m}_i$$. The noise for all the observations are described by the column vector $$\bf {\eta }_i$$. We can then write $$\bf {d}_i$$ as   \begin{eqnarray} \bf {d}_i = u_i(\bf {P}) + \bf {\eta }_i + \bf {G}\bf {m}_i, \end{eqnarray} (4)where the notation $$u_i(\bf {P})$$ represents the column vector $$[u_i(p)]_{p \in \bf {P}}$$. We assume a diffuse prior for the components of $$\bf {m}_i$$. That is to say $$\bf {m}_i \sim \mathcal {N}(\bf {0},\kappa ^2\bf {I})$$ in the limit as κ → ∞. Of course, the secular velocities, $$b_i^{(j)}$$, are spatially correlated and we could invoke a tectonic model to form a prior on $$b_i^{(j)}$$. However, in our application to the Pacific Northwest, we will be using displacement time series which are long enough to sufficiently constrain $$b_i^{(j)}$$ for each station, avoiding the need to incorporate a prior. Likewise, seasonal deformation is spatially correlated (Dong et al. 2002; Langbein 2008), and it may be worth exploring and exploiting such a correlation in a future study. We assume that $$\bf {\eta }_i$$ is a spatially and/or temporally correlated random vector distributed as $$\mathcal {N}(\bf {0},\bf {C}_{\eta _i})$$. For example, $$\bf {\eta }_i$$ can be uncorrelated white noise, temporally correlated noise describing benchmark wobble (e.g. Wyatt 1982, 1989), and/or spatially correlated noise describing common mode error (e.g. Wdowinski et al. 1997). The appropriate noise model may vary depending on the application, and we hold off on specifying the covariance matrix, $$\bf {C}_{\eta_i}$$, until Section 3.1. We are now able to write the distribution of $$\bf {d}_i$$ as   \begin{eqnarray} \bf {d}_i \sim \mathcal {N}\left(\bf {0}, C_{u_i}(\bf {P},\bf {P}) + \bf {C}_{\eta _i} + \kappa ^2\bf {G}\bf {G}^T \right), \end{eqnarray} (5)where $$C_{u_i}(\bf {P},\bf {P})$$ represents the matrix $$[C_{u_i}(p,p^{\prime })]_{(p,p^{\prime }) \in \bf {P}\times \bf {P}}$$. We form a posterior estimate of transient displacements, denoted as $$\hat{u}_i$$, by updating ui with the fact that $$\bf {d}_i^*$$ was realized from the random vector $$\bf {d}_i$$, which we write as $$\hat{u}_i = u_i |(\bf {d}_i = \bf {d}^*_i)$$. A general solution for $$\hat{u}_i$$ is derived in von Mises (1964, section 8.9), where we find that $$\hat{u}_i$$ is distributed as $$\mathcal {GP}(\mu _{\hat{u}_i},C_{\hat{u}_i})$$ with mean function   \begin{eqnarray} \mu _{\hat{u}_i}(p) &=& \mathrm{E}\left[ u_i(p) \right] + \mathrm{Cov}\left[ u_i(p),\bf {d}_i \right] \mathrm{Cov}\left[ \bf {d}_i \right]^{-1} \left(\bf {d}^*_i - \mathrm{E}\left[ \bf {d}_i \right]\right)\nonumber \\ &=& C_{u_i}(p,\bf {P})\left( C_{u_i}(\bf {P},\bf {P}) + \bf {C}_{\eta _i} + \kappa ^2\bf {G}\bf {G}^T\right)^{-1} \bf {d}^*_i \end{eqnarray} (6)and covariance function   \begin{eqnarray} C_{\hat{u}_i}(p,p^{\prime }) &=& \mathrm{Cov}\left[ u_i(p),u_i(p^{\prime }) \right]\nonumber\\ && -\, \mathrm{Cov}\left[ u_i(p),\bf {d}_i \right] \mathrm{Cov}\left[ \bf {d}_i \right]^{-1} \mathrm{Cov}\left[ \bf {d}_i,u_i(p^{\prime }) \right] \nonumber \\ &=& C_{u_i}(p,p^{\prime }) - C_{u_i}(p,\bf {P})\left( C_{u_i}(\bf {P},\bf {P})\right.\nonumber\\ &&\left. +\, \bf {C}_{\eta _i} + \kappa ^2\bf {G}\bf {G}^T \right)^{-1} C_{u_i}(\bf {P},p^{\prime }). \end{eqnarray} (7)However, we are interested in the limit as κ → ∞, and the form for eqs (6) and (7) is not suitable for evaluating this limit. We use a partitioned matrix inversion identity (Press et al. 2007, sec. 2.7.4) to rewrite eqs (6) and (7) as   \begin{eqnarray} \mu _{\hat{u}_i}(p) = \left[\begin{array}{c@{\quad}c}C_{u_i}(p,\bf {P}) & \bf {0}\\ \end{array}\right] \left[\begin{array}{c@{\quad}c}C_{u_i}(\bf {P},\bf {P}) + \bf {C}_{\eta _i} & \bf {G}\\ \bf {G}^T & -\kappa ^{-2} \bf {I}\\ \end{array}\right]^{-1} \left[\begin{array}{c}\bf {d}^*_i \\ \bf {0}\\ \end{array}\right]\!\!\!\!\nonumber\\ \end{eqnarray} (8)and   \begin{eqnarray} C_{\hat{u}_i}(p,p^{\prime }) &=& C_{u_i}(p,p^{\prime }) -[C_{u_i}(p,\bf {P}) \quad \bf {0}]\nonumber\\ &&\times\, \left[\begin{array}{c@{\quad}c}C_{u_i}(\bf {P},\bf {P}) + \bf {C}_{\eta _i} & \bf {G}\\ \bf {G}^T & -\kappa ^{-2} \bf {I}\\ \end{array}\right]^{-1} \left[\begin{array}{c}C_{u_i}(\bf {P},p^{\prime }) \\ \bf {0}\nonumber\\ \end{array}\right].\nonumber\\ \end{eqnarray} (9)Taking the limit as κ → ∞, we get the solution for the mean and covariance of $$\hat{u}_i$$,   \begin{eqnarray} \mu _{\hat{u}_i}(p) = \left[\begin{array}{c@{\quad}c}C_{u_i}(p,\bf {P}) & \bf {0}\\ \end{array}\right] \left[\begin{array}{c@{\quad}c}C_{u_i}(\bf {P},\bf {P}) + \bf {C}_{\eta _i} & \bf {G}\\ \bf {G}^T & \bf {0}\\ \end{array}\right]^{-1} \left[\begin{array}{c}\bf {d}^*_i \\ \bf {0}\\ \end{array}\right]\!\!\!\!\nonumber\\ \end{eqnarray} (10)and   \begin{eqnarray} C_{\hat{u}_i}(p,p^{\prime }) &=& C_{u_i}(p,p^{\prime }) - \left[\begin{array}{c@{\quad}c}C_{u_i}(p,\bf {P}) & \bf {0}\\ \end{array}\right]\nonumber\\ &&\times \left[\begin{array}{c@{\quad}c}C_{u_i}(\bf {P},\bf {P}) + \bf {C}_{\eta _i} & \bf {G}\\ \bf {G}^T & \bf {0}\\ \end{array}\right]^{-1} \left[\begin{array}{c}C_{u_i}(\bf {P},p^{\prime }) \\ \bf {0}\\ \end{array}\right]. \end{eqnarray} (11)To ensure that the inverse matrices in eqs (10) and (11) exist, each column in $$\bf {G}$$ must be linearly independent. This condition tends to be violated when there are too few observations at a station. In that case, a singular value decomposition can be used to remove linearly dependent components from $$\bf {G}$$. It should be noted that we have ignored any covariances between the easting and northing components of $$\vec{u}$$ and $$\vec{\bf {d}}$$. This simplification reduces the computational complexity of evaluating the posterior transient displacements because each component can be evaluated independently. However, we are inherently assuming that the principal components of $$\vec{u}(p)$$ and $$\vec{d}^{(jk)}$$ are aligned with the cardinal directions, which is admittedly an arbitrary assumption. The posterior transient displacements are spatially and temporally continuous, and we can use eqs (10) and (11) to evaluate $$\hat{u}_i$$ at any p. Furthermore, $$\hat{u}_i$$ is spatially and temporally differentiable, allowing us to formulate $$\dot{\varepsilon }$$ at any p that we may be interested in. The components of $$\dot{\varepsilon }$$ can be written as   \begin{eqnarray} \dot{\varepsilon }_{ij}(p) = \frac{1}{2} \frac{\partial }{\partial t} \left(\frac{\partial \hat{u}_i(p)}{\partial x_j} + \frac{\partial \hat{u}_j(p)}{\partial x_i} \right). \end{eqnarray} (12)Since eq. (12) is a linear operation on the Gaussian processes $$\hat{u}_i$$ and $$\hat{u}_j$$, we know that $$\dot{\varepsilon }_{ij}$$ is also a Gaussian process. From Papoulis (1991, ch. 10), we find that the mean and covariance functions for $$\dot{\varepsilon }_{ij}$$ are   \begin{eqnarray} \mu _{\dot{\varepsilon }_{\mathrm{e}\mathrm{e}}}(p) = \frac{\partial ^2 \mu _{\hat{u}_\mathrm{e}}(p)}{\partial t {\, } \partial x_\mathrm{e}} \end{eqnarray} (13)  \begin{eqnarray} \mu _{\dot{\varepsilon }_{\mathrm{n}\mathrm{n}}}(p) = \frac{\partial ^2 \mu _{\hat{u}_\mathrm{n}}(p)}{\partial t {\, } \partial x_\mathrm{n}} \end{eqnarray} (14)  \begin{eqnarray} \mu _{\dot{\varepsilon }_{\mathrm{e}\mathrm{n}}}(p) = \frac{1}{2}\frac{\partial }{\partial t}\left( \frac{\partial \mu _{\hat{u}_\mathrm{e}}(p)}{\partial x_\mathrm{n}} + \frac{\partial \mu _{\hat{u}_\mathrm{n}}(p)}{\partial x_\mathrm{e}} \right) \end{eqnarray} (15)and   \begin{eqnarray} C_{\dot{\varepsilon }_{\mathrm{e}\mathrm{e}}}(p,p^{\prime }) = \frac{\partial ^4 C_{\hat{u}_\mathrm{e}}(p,p^{\prime })}{\partial t {\, } \partial t^{\prime } {\, } \partial x_\mathrm{e}{\, } \partial x^{\prime }_\mathrm{e}} \end{eqnarray} (16)  \begin{eqnarray} C_{\dot{\varepsilon }_{\mathrm{n}\mathrm{n}}}(p,p^{\prime }) = \frac{\partial ^4 C_{\hat{u}_\mathrm{n}}(p,p^{\prime })}{\partial t {\, } \partial t^{\prime } {\, } \partial x_\mathrm{n}{\, } \partial x^{\prime }_\mathrm{n}} \end{eqnarray} (17)   \begin{eqnarray} C_{\dot{\varepsilon }_{\mathrm{e}\mathrm{n}}}(p,p^{\prime }) = \frac{1}{4} \frac{\partial ^2}{\partial t {\, } \partial t^{\prime }}\left( \frac{\partial ^2 C_{\hat{u}_\mathrm{e}}(p,p^{\prime })}{\partial x_\mathrm{n}{\, } \partial x^{\prime }_\mathrm{n}} + \frac{\partial ^2 C_{\hat{u}_\mathrm{n}}(p,p^{\prime })}{\partial x_\mathrm{e}{\, } \partial x^{\prime }_\mathrm{e}} \right), \end{eqnarray} (18)respectively. The above derivatives of $$\mu _{\hat{u}_i}$$ and $$C_{\hat{u}_i}$$ are computed analytically by replacing the terms $$C_{u_i}(p,\bf {P})$$, $$C_{u_i}(\bf {P},p^{\prime })$$, and $$C_{u_i}(p,p^{\prime })$$ in eqs (10) and (11) with their appropriate derivatives. For example, the mean and covariance functions for $$\dot{\varepsilon }_{\mathrm{e}\mathrm{e}}$$ can be written more verbosely as   \begin{eqnarray} \mu _{\dot{\varepsilon }_{\mathrm{e}\mathrm{e}}}(p) &=& \left[\begin{array}{c@{\quad}c}\frac{\partial ^2 C_{u_\mathrm{e}}\left( p,\bf {P}\right)}{\partial t {\, } \partial x_\mathrm{e}} & \bf {0}\\ \end{array}\right] \left[\begin{array}{c@{\quad}c}C_{u_\mathrm{e}}(\bf {P},\bf {P}) + \bf {C}_{\eta _\mathrm{e}} & \bf {G}\\ \bf {G}^T & \bf {0}\\ \end{array}\right]^{-1} \left[\begin{array}{c}\bf {d}^*_\mathrm{e}\\ \bf {0}\\ \end{array}\right]\nonumber\\ \end{eqnarray} (19)and   \begin{eqnarray} C_{\dot{\varepsilon }_{\mathrm{e}\mathrm{e}}}(p,p^{\prime }) &=& \frac{\partial ^4 C_{u_\mathrm{e}}(p,p^{\prime })}{\partial t {\, } \partial t^{\prime } {\, } \partial x_\mathrm{e}{\, } \partial x^{\prime }_\mathrm{e}} - \left[\begin{array}{c@{\quad}c}\frac{\partial ^2 C_{u_\mathrm{e}}\left( p,\bf {P}\right)}{\partial t {\, } \partial x_\mathrm{e}} & \bf {0} \end{array}\right]\nonumber\\ &&\times\,\left[\begin{array}{c@{\quad}c}C_{u_\mathrm{e}}(\bf {P},\bf {P}) + \bf {C}_{\eta _\mathrm{e}} & \bf {G}\\ \bf {G}^T & \bf {0}\\ \end{array}\right]^{-1} \left[\begin{array}{c}\frac{\partial ^2 C_{u_\mathrm{e}}\left(\bf {P}, p^{\prime } \right)}{\partial t^{\prime } {\, } \partial x^{\prime }_\mathrm{e}} \\ \bf {0}\\ \end{array}\right].\nonumber\\ \end{eqnarray} (20) 2.1 Transient detection criterion Our motivation for estimating transient strain rates is, in part, to detect transient geophysical processes. As we will see, geophysical signal can be easily identified by visually inspecting the solution for $$\dot{\varepsilon }$$. However, if we want to detect geophysical signal automatically, then we need to define a detection criterion. We use a signal-to-noise ratio, SNR, that is based on the Frobenius norm of $$\dot{\varepsilon }$$, $$||\dot{\varepsilon }||_\mathrm{F} = \left(\dot{\varepsilon }_{\mathrm{e}\mathrm{e}}^2 + \dot{\varepsilon }_{\mathrm{n}\mathrm{n}}^2 + 2\dot{\varepsilon }_{\mathrm{e}\mathrm{n}}^2\right)^{\frac{1}{2}}$$, for our detection criterion. In the geodetic literature, $$||\dot{\varepsilon }||_\mathrm{F}$$ is often used as a metric for the strain rate ‘magnitude’, and it is sometimes referred to as the second invariant of strain rate. Noting that $$||\dot{\varepsilon }||_\mathrm{F}$$ is a random variable, we take SNR to be the ratio of the estimated mean and standard deviation of $$||\dot{\varepsilon }||_\mathrm{F}$$. An estimate of the mean is found by evaluating $$||\dot{\varepsilon }||_\mathrm{F}$$ at the mean of $$\dot{\varepsilon }$$,   \begin{eqnarray} \mu _{||\dot{\varepsilon }||_\mathrm{F}} \approx ||\dot{\varepsilon }||_F \big |_{\dot{\varepsilon }= \mu _{\dot{\varepsilon }}} = \left(\mu _{\dot{\varepsilon }_{\mathrm{e}\mathrm{e}}}^2 + \mu _{\dot{\varepsilon }_{\mathrm{n}\mathrm{n}}}^2 + 2\mu _{\dot{\varepsilon }_{\mathrm{e}\mathrm{n}}}^2 \right)^\frac{1}{2}, \end{eqnarray} (21)and we use nonlinear uncertainty propagation to estimate the standard deviation,   \begin{eqnarray} \sigma _{||\dot{\varepsilon }||_\mathrm{F}} &\approx& \left( \left(\left. \frac{\partial ||\dot{\varepsilon }||_\mathrm{F}}{\partial \dot{\varepsilon }_{\mathrm{e}\mathrm{e}}}\right|_{\dot{\varepsilon }= \mu _{\dot{\varepsilon }}} \right)^2 \sigma ^2_{\dot{\varepsilon }_{\mathrm{e}\mathrm{e}}} + \left(\left. \frac{\partial ||\dot{\varepsilon }||_\mathrm{F}}{\partial \dot{\varepsilon }_{\mathrm{n}\mathrm{n}}}\right|_{\dot{\varepsilon }= \mu _{\dot{\varepsilon }}} \right)^2 \sigma ^2_{\dot{\varepsilon }_{\mathrm{n}\mathrm{n}}} \right.\nonumber\\ &&\left. +\, \left(\left. \frac{\partial ||\dot{\varepsilon }||_\mathrm{F}}{\partial \dot{\varepsilon }_{\mathrm{e}\mathrm{n}}}\right|_{\dot{\varepsilon }= \mu _{\dot{\varepsilon }}} \right)^2 \sigma ^2_{\dot{\varepsilon }_{\mathrm{e}\mathrm{n}}} \right)^\frac{1}{2}, \end{eqnarray} (22)where $$\sigma ^2_{\dot{\varepsilon }_{ij}}(p) = C_{\dot{\varepsilon }_{ij}}(p,p)$$. After some calculations, we find SNR to be   \begin{eqnarray} {\mathrm{SNR}(p) = \frac{\mu _{||\dot{\varepsilon }||_\mathrm{F}}(p)}{\sigma _{||\dot{\varepsilon }||_\mathrm{F}}(p)} }\nonumber \\ &&\quad = \frac{\mu _{\dot{\varepsilon }_{\mathrm{e}\mathrm{e}}}(p)^2 + \mu _{\dot{\varepsilon }_{\mathrm{n}\mathrm{n}}}(p)^2 + 2\mu _{\dot{\varepsilon }_{\mathrm{e}\mathrm{n}}}(p)^2}{\big (\sigma ^2_{\dot{\varepsilon }_{\mathrm{e}\mathrm{e}}}(p)\mu _{\dot{\varepsilon }_{\mathrm{e}\mathrm{e}}}(p)^2 + \sigma ^2_{\dot{\varepsilon }_{\mathrm{n}\mathrm{n}}}(p)\mu _{\dot{\varepsilon }_{\mathrm{n}\mathrm{n}}}(p)^2 + 4\sigma ^2_{\dot{\varepsilon }_{\mathrm{e}\mathrm{n}}}(p)\mu _{\dot{\varepsilon }_{\mathrm{e}\mathrm{n}}}(p)^2 \big )^{\frac{1}{2}}}.\nonumber\\ \end{eqnarray} (23)We explicitly show that SNR is a function of p to emphasize that it identifies the position and time of anomalous deformation. We can reasonably suspect that some transient geophysical phenomena is occurring wherever and whenever SNR is larger than ∼3. 2.2 Outlier detection In deriving our formulation for transient strain rates, we have assumed that noise in the data vector is normally distributed. This is not an appropriate assumption for GNSS data, which often have more outliers than would be expected for normally distributed noise. Methods for analysing GNSS data should either be robust against outliers or should involve a preprocessing step in which outliers are detected and removed. Examples of the former include the MIDAS method for estimating secular velocities (Blewitt et al. 2016) and the GPS Imaging method for detecting spatially coherent features (Hammond et al. 2016). In this study, we identify and remove outliers as a preprocessing step before estimating $$\dot{\varepsilon }$$. Outliers are identified based on the residuals for a model that best fits the observed data. Observations with residuals that exceed some threshold are removed. This strategy for detecting outliers is commonly used for GNSS data, where the model being fit to the data typically consists of a linear trend and seasonal terms for each GNSS station (e.g. Johansson et al. 2002; Dong et al. 2006; Bos et al. 2013). To prevent transient geophysical signal from being erroneously identified as outliers, the model used in our outlier detection algorithm additionally consists of a temporally correlated Gaussian process. The details of our algorithm are given in the Appendix. It should be noted that our algorithm does not identify jumps in GNSS time-series, which is another common issue. Some, but not all, jumps can be automatically removed by looking up the dates of equipment changes and earthquakes (Gazeaux et al. 2013). However, it is still necessary to manually find and remove jumps of unknown origin. 3 APPLICATION TO PACIFIC NORTHWEST SLOW SLIP EVENTS In this section we estimate transient strain rates in the Pacific Northwest, and we are specifically interested in identifying transient strain resulting from SSEs. SSEs in the Pacific Northwest were first discovered by Dragert et al. (2001) and we refer the reader to Schwartz & Rokosky (2007) for a review of the subsequent research. SSEs in the Pacific Northwest can be detected by monitoring for associated seismic tremor (Rogers & Dragert 2003), which is actively being done by the Pacific Northwest Seismic Network (Wech 2010). We can then compare the tremor records to our estimated transient strain rates to verify that we are indeed identifying strain from SSEs. Before estimating transient strain rates, we first establish a noise model for GNSS stations in this region, and then we establish a prior Gaussian process to describe displacements from SSEs. We use the daily displacement solutions for continuous GNSS stations generated by the Geodesy Advancing Geosciences and EarthScope (GAGE) Facility (Herring et al. 2016). We limit the data set to the stations and times that are pertinent to the seven most recent SSEs in the Puget Sound region. The earliest SSE considered in this study began in August 2010, and the most recent SSE began in February 2017. The positions of GNSS stations used to estimate transient strain rates are shown in Fig. 2. Figure 2. View largeDownload slide Positions of the GNSS stations used to estimate transient strain rates. The coloured regions indicate the distribution of seismic tremor as determined by Wech (2010). The red dots show the positions of GNSS stations mentioned in this paper. The blue dot indicates the location of the transient strain rates shown in Fig. 7 and the signal-to-noise ratio shown in Fig.8. The blue dashed circle demarcates the spatial extent of the tremors shown in Fig. 8. Figure 2. View largeDownload slide Positions of the GNSS stations used to estimate transient strain rates. The coloured regions indicate the distribution of seismic tremor as determined by Wech (2010). The red dots show the positions of GNSS stations mentioned in this paper. The blue dot indicates the location of the transient strain rates shown in Fig. 7 and the signal-to-noise ratio shown in Fig.8. The blue dashed circle demarcates the spatial extent of the tremors shown in Fig. 8. 3.1 Noise model We consider the noise vector, $$\bf {\eta }_i$$, to be composed of a temporally correlated Gaussian process, $$z_i \sim \mathcal {GP}(0,C_{z_i})$$, and a vector of uncorrelated Gaussian noise, $$\bf {w}_i$$, so that   \begin{eqnarray} \bf {\eta }_i = z_i(\bf {P}) + \bf {w}_i. \end{eqnarray} (24)The standard deviations for $$\bf {w}_i$$ are taken to be the uncertainties derived for the GNSS displacement solutions, $$\bf {\sigma }_i$$. The noise vector then has zero mean and the covariance matrix   \begin{eqnarray} \bf {C}_{\eta _i} = C_{z_i}(\bf {P},\bf {P}) + \mathrm{diag}(\bf {\sigma }_i^2). \end{eqnarray} (25) The temporally correlated noise in GNSS data has been thoroughly studied over the past two decades (see Bock & Melgar (2016) or He et al. (2017) for a review). Most studies found that the temporally correlated noise can be described with Brownian motion (also known as random walk noise or a Weiner process) or flicker noise (e.g. Zhang et al. 1997; Mao et al. 1999; Williams et al. 2004; Langbein 2008; Murray & Svarc 2017). Less frequently used noise models include a band-pass process (e.g. Langbein 2004) and a first-order Gauss–Markov (FOGM) process (e.g. Langbein 2004; Dmitrieva et al. 2015). There is some physical justification for using Brownian motion as a noise model because it accurately describes the power spectrum of motion resulting from instability in a geodetic monument (e.g. Wyatt 1982, 1989). In particular, the power spectra for Brownian motion and monument motion both decay proportionally to f−2, where f is frequency. However, Brownian motion necessarily contains a reference time at which the process begins. Since there is no notion of when noise ‘begins’ in GNSS data, we do not find Brownian motion to be an appropriate noise model. On the other hand, an FOGM process has no reference time (i.e. it is stationary), and its power spectrum,   \begin{eqnarray} S(f) = \frac{\beta ^2}{(2 \pi f)^2 + \alpha ^2}, \end{eqnarray} (26)decays proportionally to f−2 above the cut-off frequency α/(2π). We choose zi to be a spatially uncorrelated FOGM process. The covariance function for zi is then   \begin{eqnarray} C_{z_i}\!\left((\vec{x},t),(\vec{x}^{\prime },t^{\prime })\right) = \frac{\beta ^2}{2\alpha } \exp \left(-\alpha |t - t^{\prime }|\right)\delta _{\vec{x},\vec{x}^{\prime }}, \end{eqnarray} (27)where $$\delta _{\vec{x},\vec{x}^{\prime }}$$ is 1 if $$\vec{x}= \vec{x}^{\prime }$$ and 0 otherwise (Rasmussen & Williams 2006, section B2). By choosing an FOGM process for zi, we have introduced two parameters that need to be constrained, α and β. We constrain these parameters, which we collectively refer to as $$\bf {\theta }$$, with the Restricted Maximum Likelihood (REML) method (Harville 1974). The REML method is conceptually similar to the Maximum Likelihood Estimation (MLE) method from Langbein & Johnson (1997), where we pick $$\bf {\theta }$$ to maximize the probability of drawing the observed data, $$\bf {d}_i^*$$, from the random vector $$\bf {d}_i$$. However, unlike the MLE method, the REML method produces unbiased estimates of $$\bf {\theta }$$ (Cressie 1993, section 2.6). We use the REML method to estimate $$\bf {\theta }$$ at 38 continuous GNSS stations in the Pacific Northwest that are east of 121°W. These stations are sufficiently far from the subduction zone that they are unlikely to record transient deformation from SSEs, allowing us to ignore the term $$u_i(\bf {P})$$ in $$\bf {d}_i$$. We assume the noise at these inland stations is representative of the noise at all the stations considered in this study, which is probably a poor assumption since the inland stations are subject to distinctly different climatic conditions. Nonetheless, we find $$\bf {\theta }$$ for each of the inland stations and for each displacement component by maximizing the REML likelihood function   \begin{eqnarray} \mathcal {L}(\bf {\theta }) &=& \left(\frac{\left|\bf {G}^T\bf {G}\right|}{(2\pi )^{n-6m} \left| \bf {C}_{\eta _i}(\bf {\theta }) \right| \left| \bf {G}^T\bf {C}_{\eta _i}(\bf {\theta })^{-1}\bf {G}\right|}\right)^{\frac{1}{2}} e^{-\frac{1}{2}\bf {d}_i^{*T}\bf {K}(\bf {\theta })\bf {d}_i^*}\nonumber\\ \end{eqnarray} (28)with respect to $$\bf {\theta }$$, where   \begin{eqnarray} \bf {K}(\bf {\theta }) = \bf {C}_{\eta _i}(\bf {\theta })^{-1} - \bf {C}_{\eta _i}(\bf {\theta })^{-1}\bf {G}\left(\bf {G}^T\bf {C}_{\eta _i}(\bf {\theta })^{-1}\bf {G}\right)^{-1} \bf {G}^T\bf {C}_{\eta _i}(\bf {\theta })^{-1}.\nonumber\\ \end{eqnarray} (29)In the above equation, $$\bf {d}^*_i$$ consists of the data for a single station. The distribution of inferred values for α and β are shown in Fig. 3. The estimates of β for the easting and northing components are clustered around 0.5 mm yr−0.5. The corresponding estimates of α tend to cluster around 0 yr−1, indicating that the power spectrum of noise indeed decays proportionally to f−2 over a wide band of frequencies. We also estimate α and β for the vertical component of displacements, with the hope that vertical deformation gradients could reveal some geophysical signal. For the vertical component, β is relatively large with a median value of 13.5 mm yr−0.5. The inferred values for α are also higher for the vertical component with a median value of 8.21 yr−1. In Fig. 4, we use the median values of α and β to generate two realizations of FOGM noise for each component. The realizations span five years, and the easting and northing realizations drift by about 1 mm over these 5 yr. In the context of detecting SSEs, which produce several mm’s of surface displacement on the timescale of weeks, the estimated FOGM noise for the easting and northing component is negligible. In contrast, the estimated FOGM noise for the vertical component is larger than the signal we would expect from SSEs. We suspect that the higher amplitude for the FOGM noise in the vertical component is accommodating for deficiencies in our rather simple seasonal model. Based on this analysis, we henceforth ignore temporally correlated noise in the easting and northing components because of its low amplitude. We also do not use vertical displacements because of the presumably low SNR. Figure 3. View largeDownload slide Distribution of inferred values for the parameters in the FOGM noise model (eq. 27). ‘IQR’ is the interquartile range of inferred values. Figure 3. View largeDownload slide Distribution of inferred values for the parameters in the FOGM noise model (eq. 27). ‘IQR’ is the interquartile range of inferred values. Figure 4. View largeDownload slide Two samples of FOGM noise for each displacement component. The parameters for the FOGM noise model have been set to the median values from Fig. 3. Figure 4. View largeDownload slide Two samples of FOGM noise for each displacement component. The parameters for the FOGM noise model have been set to the median values from Fig. 3. Another significant source of noise in GNSS data is common mode error (e.g. Wdowinski et al. 1997; Dong et al. 2006), which is noise that is highly spatially correlated. When not accounted for, common mode error manifests as spatially uniform undulations in our estimated transient displacements. These undulations have no effect on the transient strain rates, and so we do not need to include common mode error in our noise model. 3.2 Prior model We next establish a prior model for transient displacements. Specifically, we discuss our choice for the spatial and temporal functions making up $$C_{u_i}$$, Xi and Ti. For Xi, we use the squared exponential (SE) covariance function,   \begin{eqnarray} X_i(\vec{x},\vec{x}^{\prime }) = \exp \left(\frac{-||\vec{x}- \vec{x}^{\prime }||_2^2}{2 \ell ^2}\right). \end{eqnarray} (30)The SE covariance function is commonly used for kriging (e.g. Cressie 1993) and GPR (e.g. Rasmussen & Williams 2006). In terms of geodetic applications, Kato et al. (1998) and El-Fiky & Kato (1998) demonstrated that the SE covariance function accurately describes the spatial covariance of secular GNSS derived velocities in Japan. We consider three potential choices for Ti. The first option is the 1-D SE covariance function,   \begin{eqnarray} T_i(t,t^{\prime }) = \phi ^2\exp \left(\frac{-|t - t^{\prime }|^2}{2\tau ^2}\right). \end{eqnarray} (31) Note that Ti includes the parameter ϕ, which serves to scale the covariance function $$C_{u_i}$$. A second option is a member of the Wendland class of covariance functions (Wendland 2005). Wendland covariance functions have compact support, and thus their corresponding covariance matrices are sparse. We exploit this sparsity with the CHOLMOD software package (Chen et al. 2008). Wendland covariance functions can be constructed such that they are positive definite in $$\mathbb {R}^d$$ and their corresponding Gaussian process can be differentiated k times. We use d = 1, because we are describing the temporal covariance of ui, and we use k = 2, giving samples of ui continuous velocities and accelerations. The corresponding Wendland covariance function is   \begin{eqnarray} T_i(t,t^{\prime }) = \phi ^2\left(1 - \frac{|t - t^{\prime }|}{\tau }\right)^5_+ \left(\frac{8|t - t^{\prime }|^2}{\tau ^2} + \frac{5|t - t^{\prime }|}{\tau } + 1\right), \end{eqnarray} (32)where (t)+ = max(0, t). A third option for Ti is the covariance function for integrated Brownian motion (IBM). IBM is a Gaussian process with zero mean and its covariance function can be found by integrating the covariance function for Brownian motion,   \begin{eqnarray} T_i(t,t^{\prime }) &=& \int _0^t \int _0^{t^{\prime }} \phi ^2 \min (s,s^{\prime }) {\, }\text{d}s^{\prime }{\, }\text{d}s \nonumber \\ &=& \frac{\phi ^2}{2}\min (t,t^{\prime })^2 \left(\max (t,t^{\prime }) - \frac{1}{3}\min (t,t^{\prime })\right), \ \ \ t,t^{\prime } \ge 0.\nonumber\\ \end{eqnarray} (33)IBM has been used in the context of Kalman filtering as a non-parametric model for the time dependence of geophysical signals (e.g. Segall & Mathews 1997; McGuire & Segall 2003; Ohtani et al. 2010; Hines & Hetland 2016). Similar to Brownian motion, IBM has a reference time, t = 0, at which the process begins. For some geophysical signals, it is appropriate to have this reference time. For example, if we are trying to identify post-seismic deformation, then t should be zero at the time of the earthquake. However, if we are interested in detecting transient events, where there is no known start time, then IBM may not be an appropriate prior model. Instead, one may prefer to use the SE or Wendland covariance functions because they are stationary. In the following analysis, we make the choice that t is zero on the first epoch of $$\bf {d}_i^*$$. Using an earlier reference time does not change the results discussed in this section. We use the REML method to determine the appropriate values for the parameters in $$C_{u_i}$$, which are ℓ, ϕ and τ. We refer to these parameters collectively as $$\bf {\theta }$$. Just as in Section 3.1, we pick $$\bf {\theta }$$ to maximize the probability of drawing $$\bf {d}_i^*$$ from $$\bf {d}_i$$, but now we are including $$u_i(\bf {P})$$ in $$\bf {d}_i$$, and we are optimizing the parameters for $$C_{u_i}$$ rather than $$\bf {C}_{\eta _i}$$. We divide the GNSS data into seven subsets that are 4 months long and each centred on the time of an SSE. The times of the seven SSEs are determined with tremor records from Wech (2010). We find the optimal $$\bf {\theta }$$ for each subset of data, for each displacement component, and for each choice of Ti. The REML likelihood function that we are maximizing with respect to $$\bf {\theta }$$ is now   \begin{eqnarray} \mathcal {L}(\bf {\theta }) = \left(\frac{\left| \bf {G}^T \bf {G}\right|}{(2\pi )^{n-6m} \left| \bf {\Sigma }(\bf {\theta }) \right| \left| \bf {G}^T \bf {\Sigma }(\bf {\theta })^{-1} \bf {G}\right|}\right)^{\frac{1}{2}} e^{-\frac{1}{2} \bf {d}_i^{*T} \bf {K}(\bf {\theta }) \bf {d}_i^*}, \end{eqnarray} (34)where   \begin{eqnarray} \bf {K}(\bf {\theta }) = \bf {\Sigma }(\bf {\theta })^{-1} - \bf {\Sigma }(\bf {\theta })^{-1} \bf {G}\left( \bf {G}^T \bf {\Sigma }(\bf {\theta })^{-1} \bf {G}\right)^{-1} \bf {G}^T \bf {\Sigma }(\bf {\theta })^{-1} \end{eqnarray} (35)and $$\bf {\Sigma }(\bf {\theta }) = \bf {C}_{\eta _i} + C_{u_i}(\bf {P},\bf {P};\bf {\theta })$$. In the above equation, $$\bf {d}^*_i$$ consists of a single subset of data. The estimated parameters are summarized in Table 1. Based on the interquartile ranges, the estimated parameters for the SE and Wendland covariance functions do not vary significantly between SSEs. This suggests that the median of the estimated parameters should be appropriate for all the Pacific Northwest SSEs. For the IBM model, there are several anomalously large values for ℓ and ϕ, which is why the interquartile ranges are large. Table 1. Optimal values for the parameters in Xi and Ti determined with the REML method. The covariance function for Ti is indicated by the ‘Ti’ column. The SE, Wendland and IBM covariance functions are defined in eqs (31), (32) and (33), respectively. We use eq. (30) for Xi. The parameters are estimated for each of the seven SSEs considered in this study, and the tabulated values indicate the median and interquartile ranges of estimated values. The ‘diff log (REML)’ column compares the optimal log REML likelihoods to those for when Ti is the SE covariance function. Negative values indicate that observations are more consistent with the SE covariance function. Ti  Component  ℓ  ϕ  τ  diff. log (REML)  SE  East  92 ± 25 km  0.62 ± 0.11 mm  0.026 ± 0.011 yr  –  SE  North  91 ± 53 km  0.43 ± 0.05 mm  0.030 ± 0.017 yr  –  Wendland  East  95 ± 30 km  0.66 ± 0.15 mm  0.093 ± 0.044 yr  0.78 ± 0.87  Wendland  North  92 ± 57 km  0.46 ± 0.10 mm  0.116 ± 0.057 yr  0.08 ± 0.58  IBM  East  110 ± 130 km  290 ± 420 mm yr−1.5  –  − 16.4 ± 7.8  IBM  North  150 ± 560 km  110 ± 250 mm yr−1.5  –  − 10.1 ± 2.3  Ti  Component  ℓ  ϕ  τ  diff. log (REML)  SE  East  92 ± 25 km  0.62 ± 0.11 mm  0.026 ± 0.011 yr  –  SE  North  91 ± 53 km  0.43 ± 0.05 mm  0.030 ± 0.017 yr  –  Wendland  East  95 ± 30 km  0.66 ± 0.15 mm  0.093 ± 0.044 yr  0.78 ± 0.87  Wendland  North  92 ± 57 km  0.46 ± 0.10 mm  0.116 ± 0.057 yr  0.08 ± 0.58  IBM  East  110 ± 130 km  290 ± 420 mm yr−1.5  –  − 16.4 ± 7.8  IBM  North  150 ± 560 km  110 ± 250 mm yr−1.5  –  − 10.1 ± 2.3  View Large Next, we identify which covariance function for Ti best describes the time dependence of deformation from SSEs. One approach is to choose the covariance function that produces the largest optimal REML likelihoods, similar to the analysis in Langbein (2004). In Table 1, we summarize how the optimal REML likelihoods for the Wendland and IBM covariance functions compare to those for the SE covariance function. Based on the differences in optimal REML likelihoods, the data is substantially more likely to come from a Gaussian process with an SE or Wendland covariance function than an IBM covariance function. The REML likelihoods do not definitively indicate whether the SE or Wendland covariance function is preferable. A more intuitive way of deciding which function to use for Ti is to compare the posterior displacements and the observed displacements. The posterior displacements consist of our estimate of transient displacements, secular trends, and seasonal deformation. Specifically, the posterior displacement vector is $$\skew6\hat{\bf {d}}_i = \left( u_i(\bf {P}) + \bf {G}\bf {m}_i \right) | \left( \bf {d}_i = \bf {d}_i^* \right)$$. Following a similar procedure as in Section 2, it can be shown that $$\skew4\hat{\bf {d}}_i$$ has a multivariate normal distribution with mean vector   \begin{eqnarray} \bf {\mu }_{\hat{d}_i} = \left[\begin{array}{c@{\quad}c}C_{u_i}(\bf {P},\bf {P}) & \bf {G}\\ \end{array}\right] \left[\begin{array}{c@{\quad}c}\bf {C}_{\eta _i} + C_{u_i}(\bf {P},\bf {P}) & \bf {G}\\ \bf {G}^T & \bf {0}\\ \end{array}\right]^{-1} \left[\begin{array}{c}\bf {d}_i^* \\ \bf {0}\\ \end{array}\right] \end{eqnarray} (36)and covariance matrix   \begin{eqnarray} \bf {C}_{\hat{d}_i} &=& C_{u_i}(\bf {P},\bf {P}) - \left[\begin{array}{c@{\quad}c}C_{u_i}(\bf {P},\bf {P}) & \bf {G}\\ \end{array}\right]\nonumber\\ &&\times\,\left[\begin{array}{c@{\quad}c}\bf {C}_{\eta _i} + C_{u_i}(\bf {P},\bf {P}) & \bf {G}\\ \bf {G}^T & \bf {0}\\ \end{array}\right]^{-1} \left[\begin{array}{c}C_{u_i}(\bf {P},\bf {P}) \\ \bf {G}^T \\ \end{array}\right]. \end{eqnarray} (37)We compute $$\skew6\hat{\bf {d}}_i$$ using the SE, Wendland, and IBM covariance functions for Ti, and we use the median values from Table 1 for $$\bf {\theta }$$. Fig. 5 compares $$\bf {d}_i^*$$ to $$\skew6\hat{\bf {d}}_i$$ during the winter 2015–2016 SSE at stations ALBH, P436 and SC02, which are the stations that recorded the strongest signal from this SSE. Based on Fig. 5, the chosen covariance function for Ti has almost no effect on $$\skew6\hat{\bf {d}}_i$$. The posterior displacements for the IBM covariance function contain slightly more high frequency, and perhaps spurious, features. Regardless of the chosen covariance function, $$\skew6\hat{\bf {d}}_i$$ appears to underestimate the rate of deformation during the SSE at stations ALBH and SC02. The deformation at these two stations is particularly rapid compared to the surrounding stations, and the misfit between $$\bf {d}_i^*$$ and $$\skew6\hat{\bf {d}}_i$$ is likely due to oversmoothing. This oversmoothing could indicate that the chosen values for τ or ℓ are too large. However, $$\skew6\hat{\bf {d}}_i$$ does faithfully fit $$\bf {d}_i^*$$ at the remaining stations, and so we do not attempt to further refine the parameters. Figure 5. View largeDownload slide Easting component of the observed displacements, $$\bf {d}_\mathrm{e}^*$$, and posterior displacements, $$\skew6\hat{\bf {d}}_\mathrm{e}$$, during the winter 2015–2016 SSE. We show $$\skew6\hat{\bf {d}}_\mathrm{e}$$ for when Ti is an SE, Wendland and IBM covariance function. The one standard deviation uncertainties are shown for $$\bf {d}_\mathrm{e}^*$$ and the $$\skew6\hat{\bf {d}}_\mathrm{e}$$ that has an SE covariance function for Ti. For clarity, uncertainties are not shown for the $$\skew6\hat{\bf {d}}_\mathrm{e}$$ that have an IBM or Wendland covariance function for Ti. The posterior displacements for the different covariance functions are all practically indistinguishable. Figure 5. View largeDownload slide Easting component of the observed displacements, $$\bf {d}_\mathrm{e}^*$$, and posterior displacements, $$\skew6\hat{\bf {d}}_\mathrm{e}$$, during the winter 2015–2016 SSE. We show $$\skew6\hat{\bf {d}}_\mathrm{e}$$ for when Ti is an SE, Wendland and IBM covariance function. The one standard deviation uncertainties are shown for $$\bf {d}_\mathrm{e}^*$$ and the $$\skew6\hat{\bf {d}}_\mathrm{e}$$ that has an SE covariance function for Ti. For clarity, uncertainties are not shown for the $$\skew6\hat{\bf {d}}_\mathrm{e}$$ that have an IBM or Wendland covariance function for Ti. The posterior displacements for the different covariance functions are all practically indistinguishable. For our estimates of transient strain discussed in the next section, we ultimately settle on the Wendland covariance function for Ti, and we use the median values from Table 1 for the parameters. We choose the Wendland covariance function over the SE covariance function because of its computational advantages. 3.3 Transient Strain Rates Having established a noise model and a prior for transient displacements, we can now calculate transient strain rates, $$\dot{\varepsilon }$$, in the Puget Sound region from the GNSS data. We evaluate $$\dot{\varepsilon }$$ at a grid of points spanning the study area for each day from 2010 January 1 to 2017 May 15. In Fig. 6, we show a map view of $$\dot{\varepsilon }$$ on January 1, 2016, which coincides with the height of the winter 2015–2016 SSE. In Fig. 7, we show a time series of $$\dot{\varepsilon }$$ at a position on the Olympic Peninsula, which is where $$\dot{\varepsilon }$$ tends to be the largest during SSEs. We also include a supplementary animation showing a map view of $$\dot{\varepsilon }$$ over time. In each figure, we show the mean and standard deviation of $$\dot{\varepsilon }$$, making it easy to identify which features are statistically significant. Figure 6. View largeDownload slide Transient strain rates during the winter 2015–2016 SSE. The glyphs show the normal strain rates as a function of azimuth, where orange indicates compression and blue indicates extension. The shaded regions show the one standard deviation uncertainties for the normal strain rates. Figure 6. View largeDownload slide Transient strain rates during the winter 2015–2016 SSE. The glyphs show the normal strain rates as a function of azimuth, where orange indicates compression and blue indicates extension. The shaded regions show the one standard deviation uncertainties for the normal strain rates. Figure 7. View largeDownload slide Components of the transient strain rates at the position shown in Fig. 2. The shaded regions indicate the one standard deviation uncertainties. Figure 7. View largeDownload slide Components of the transient strain rates at the position shown in Fig. 2. The shaded regions indicate the one standard deviation uncertainties. The transient strain rates shown for the winter 2015–2016 SSE in Fig. 6 are characteristic of most SSEs in the Puget Sound region. The SSEs cause trench perpendicular compression in the Olympic Peninsula and extension east of Puget Sound. The strain transitions from compression to extension around the southern tip of Vancouver Island, coinciding with where fault slip tends to be inferred (e.g. Dragert et al. 2001; Wech et al. 2009; Schmidt & Gao 2010). Hence, this pattern of strain is to be expected. Over decadal timescales, there is trench perpendicular compression throughout this study region caused by steady tectonic plate motion (Murray & Lisowski 2000; McCaffrey et al. 2007, 2013). When comparing the transient strain rates caused by SSEs to the secular strain rates, we see that SSEs are concentrating tectonically accumulated strain energy towards the trench, and presumably pushing the subduction zone closer to failure. To verify that the estimated transient strain rates are accurately identifying geophysical signal, we compare the signal-to-noise ratio from eq. (23) at a point on the Olympic Peninsula to the frequency of seismic tremor (Fig. 8). A signal-to-noise ratio greater than ∼3 can be interpreted as a detected geophysical signal. We detect nine distinct events, which each correspond to peaks in seismic tremor. The smaller events detected in August 2014 and February 2017 can be considered inter-SSE events. They were not among the SSEs used to constrain the prior covariance function. In between peaks in seismic tremor, the signal-to-noise ratio is consistently between 0 and 2, suggesting that all the transient strain detected at this point on the Olympic Peninsula is associated with SSEs and inter-SSE events. Figure 8. View largeDownload slide Top: signal-to-noise ratio (eq. 23) at the position shown in Fig. 2. Bottom: frequency of seismic tremors in the region shown in Fig. 2. Figure 8. View largeDownload slide Top: signal-to-noise ratio (eq. 23) at the position shown in Fig. 2. Bottom: frequency of seismic tremors in the region shown in Fig. 2. The results we have presented indicate that we are identifying the transient strain that we should expect to see. This is not to say that there are no unexpected features in $$\dot{\varepsilon }$$ that are worth further exploration. However, a discussion on features in $$\dot{\varepsilon }$$ that have unknown origin is beyond the scope of this paper. 4 DISCUSSION Our results demonstrate that GPR is an effective tool for estimating transient strain rates and detecting geophysical processes from GNSS data. However, there are some aspects of our method that warrant further research. For example, there is potential for a more thorough analysis of the spatiotemporal noise in the GNSS data than what was performed in Section 3.1. Specifically, we did not explore spatially correlated noise, such as common mode error. We have assumed that any spatially correlated noise has a sufficiently long wavelength that it has a negligible effect on transient strain rates. We can also improve upon the seasonal noise model used in this study, which consists of four sinusoids for each station. We did not explore the roughness of the seasonal deformation (i.e. the number of sinusoids needed to describe the deformation). The periodic Gaussian process (Mackay 1998) is an alternative model for seasonal deformation and is well suited for exploring the roughness of seasonal deformation. The periodic Gaussian process has zero mean and the covariance function   \begin{eqnarray} T(t,t^{\prime }) = \phi ^2 \exp \left(\frac{-\sin (\pi |t - t^{\prime }|)^2}{2\tau ^2}\right). \end{eqnarray} (38)Realizations have annual periodicity and the roughness is controlled by τ. Decreasing τ has the same effect as including higher frequency sinusoids in the seasonal model. The optimal value for τ can be found with the REML method. Another potential research direction would be to reduce the computational cost of our method for estimating transient strain rates. GPR is generally computationally expensive when there are many observations. The transient strain rates estimated in this study are constrained by seven years of daily displacement observations from 94 GNSS stations, which amounts to about 240 000 observations for each displacement component. For a data set with this size, it is difficult to evaluate the matrix inverses in eqs (10) and (11). We alleviate this computational cost by using a compact Wendland covariance function for our prior. By using a compact covariance function for our prior, eqs (10) and (11) become sparse systems of equations that are more tractable to evaluate. However, the sparsity decreases when the Wendland covariance function has a larger timescale parameter. The Wendland covariance function then offers less of a computational advantage when we are interested in geophysical processes that occur over longer timescales. An alternative approach to dealing with large data sets would be to explore approximation methods for GPR (Rasmussen & Williams 2006, ch. 8). In addition to detecting geophysical processes, the GNSS derived transient strain rates can be used to better understand the data from borehole strain metres (BSMs). The PBO contains about forty BSMs in the Pacific Northwest, and it has been demonstrated that BSMs are able to record transient geophysical events such as SSEs (e.g. Dragert & Wang 2011). However, there are complications that prevent BSM data from being used quantitatively in geophysical studies. One difficulty is that BSM data should be calibrated with a well-known strain source, such as diurnal and semi-diurnal tides (Hart et al. 1996; Roeloffs 2010; Hodgkinson et al. 2013). Unfortunately, the tidal forces at BSMs which record SSEs are strongly influenced by local bodies of water such as the Straight of Juan de Fuca, making it difficult to form a theoretical prediction of tidal strains (Roeloffs 2010). Another complication is that noise in BSM data is not well understood. The noise consists, in part, of a long-term decay resulting from the instrument equilibrating with the surrounding rock (Gladwin et al. 1987). Typically, this noise is dealt with in an ad-hoc manner by fitting and removing exponentials and low-order polynomials. We envision that the GNSS derived strain rates from this paper can be used as a reference strain for calibrating BSM data and quantify its noise. 5 CONCLUSION We propose using GPR to estimate transient strain rates from GNSS data. Most other methods for estimating strain rates assume a parametric representation of deformation, which can bias the results if the parameterization is not chosen carefully. Here we assume a stochastic, rather than parametric, prior model for displacements. Our prior model describes how much we expect transient displacements to covary spatially and temporally. If we know nothing about the underlying signal that we are trying to recover, then the prior model can be chosen objectively with maximum likelihood methods. We demonstrate that GPR is an effective tool for detecting geophysical processes, such as SSEs, in our application to the Pacific Northwest. While this paper just focuses on using GPR to estimate transient strain rates and detect geophysical processes, we believe that GPR is a powerful tool that can be applied to a wide range of geophysical problems. Acknowledgements This material is based upon work supported by the National Science Foundation under grant EAR 1245263. The EarthScope Plate Boundary Observatory data are provided by UNAVCO through the GAGE Facility with support from the National Science Foundation (NSF) and National Aeronautics and Space Administration (NASA) under NSF Cooperative Agreement EAR-1261833. An implementation of the method described in this paper is named Python-based Geodetic Network Strain software (PyGeoNS). PyGeoNS is distributed under the MIT License and can be found at www.github.com/treverhines/PyGeoNS. REFERENCES Acheson D.T., 1975. Data editing—subroutine editq, Tech. rep., US Department of Commerce, National Oceanic and Atmospheric Administration, Environmental Data Service. Adler R.J., 1981. The Geometry of Random Fields , John Wiley & Sons. Beavan J., Haines J., 2001. Contemporary horizontal velocity and strain rate fields of the Pacific-Australian plate boundary zone through New Zealand, J. geophys. Res. , 106( B1), 741– 770. Google Scholar CrossRef Search ADS   Blewitt G., Kreemer C., Hammond W.C., Gazeaux J., 2016. MIDAS robust trend estimator for accurate GPS station velocities without step detection, J. geophys. Res. , 121, 2054– 2068. Google Scholar CrossRef Search ADS   Bock Y. Melgar D., 2016. Physical applications of GPS geodesy: a review, Rep. Prog. Phys. , 79( 10), 106801, 16– 23. Google Scholar CrossRef Search ADS   Bos M.S., Fernandes R.M.S., Williams S.D.P., Bastos L., 2013. Fast error analysis of continuous GNSS observations with missing data, J. Geod. , 87( 4), 351– 360. Google Scholar CrossRef Search ADS   Chen Y., Davis T.a., Hager W.W., 2008. Algorithm 887 : CHOLMOD, supernodal sparse Cholesky factorization and update/downdate, ACM Trans. Math. Softw. , 35( 3), 1– 12. Google Scholar CrossRef Search ADS   Cressie N., 1993. Statistics for Spatial Data , John Wiley & Sons, rev. edn. Dmitrieva K., Segall P., DeMets C., 2015. Network-based estimation of time-dependent noise in GPS position time series, J. Geod. , 89( 6), 591– 606. Google Scholar CrossRef Search ADS   Dong D. Fang P. Bock Y. Cheng M.K. Miyazaki S., 2002. Anatomy of apparent seasonal variations from GPS-derived site position time series, J. geophys. Res. , 107( B4), ETG 9-1–ETG 9-16. Dong D., Fang P., Bock Y., Webb F., Prawirodirdjo L., Kedar S., Jamason P., 2006. Spatiotemporal filtering using principal component analysis and Karhunen–Loeve expansion approaches for regional GPS network analysis, J. geophys. Res. , 111( 3), 1– 16. Google Scholar PubMed  Dragert H., Wang K., 2011. Temporal evolution of an episodic tremor and slip event along the northern Cascadia margin, J. geophys. Res. , 116( 12), 1– 12. Dragert H., Wang K., James T.S., 2001. A silent slip event on the deeper Cascadia subduction interface, Science , 292, 1525– 1528. Google Scholar CrossRef Search ADS PubMed  El-Fiky G.S., Kato T., 1998. Continuous distribution of the horizontal strain in the Tohoku district, Japan, predicted by least-squares collocation, J. Geodyn. , 27( 2), 213– 236. Google Scholar CrossRef Search ADS   Feigl K.L., King R.W., Jordan T.H., 1990. Geodetic measurement of tectonic deformation in the Santa Maria Fold and Thrust Belt, California, J. geophys. Res. , 95( B3), 2679– 2699. Google Scholar CrossRef Search ADS   Field E.H. et al.  , 2014. Uniform California Earthquake Rupture Forecast, version 3 (UCERF3) -The time-independent model, Bull. seism. Soc. Am. , 104( 3), 1122– 1180. Google Scholar CrossRef Search ADS   Freed a.M., Lin J., 2001. Delayed triggering of the 1999 Hector Mine earthquake by viscoelastic stress transfer, Nature , 411, 180– 183. Google Scholar CrossRef Search ADS PubMed  Gazeaux J. et al.  , 2013. Detecting offsets in GPS time series: first results from the detection of offsets in GPS experiment, J. geophys. Res. , 118( 5), 2397– 2407. Google Scholar CrossRef Search ADS   Gladwin M.T., Gwyther R.L., Hart R., Francis M., Johnston M.J.S., 1987. Borehole tensor strain measurements in California, J. geophys. Res. , 92( B8), 7981– 7988. Google Scholar CrossRef Search ADS   Hammond W.C. Blewitt G. Kreemer C., 2016. GPS imaging of vertical land motion in California and Nevada: Implications for Sierra Nevada uplift, J. geophys. Res . Hart R. H.G. Gladwin M.T. Gwyther R.L. Agnew D.C. Wyatt F.K., 1996. Tidal calibration of borehole strain meters: Removing the effects of small-scale inhomogeneity, J. geophys. Res. , 101( 96). Harville D.A., 1974. Bayesian Inference for Variance Components Using Only Error Contrasts, Biometrika , 61( 2), 383– 385. Google Scholar CrossRef Search ADS   He X., Montillet J.P., Fernandes R., Bos M., Yu K., Hua X., Jiang W., 2017. Review of current GPS methodologies for producing accurate time series and their error sources, J. Geodyn. , 106, 12– 29. Google Scholar CrossRef Search ADS   Herring T.A. et al.  , 2016. Plate Boundary Observatory and related networks: GPS data analysis methods and geodetic product, Reviews of Geophysics , pp. 1– 50. Google Scholar CrossRef Search ADS   Hines T.T., Hetland E.A., 2016. Rheologic constraints on the upper mantle from five years of postseismic deformation following the El Mayor-Cucapah earthquake, J. geophys. Res. , 121, 6809– 6827. Google Scholar CrossRef Search ADS   Hodgkinson K., Agnew D., Roeloffs E., 2013. Working With Strainmeter Data, EOS, Trans. Am. geophys. Un. , 94( 9), 91. Google Scholar CrossRef Search ADS   Holt W.E., Shcherbenko G., 2013. Toward a continuous monitoring of the horizontal displacement gradient tensor field in Southern California using cGPS observations from Plate Boundary Observatory (PBO), Seismol. Res. Lett. , 84( 3), 455– 467. Google Scholar CrossRef Search ADS   Johansson J.M. et al.  , 2002. Continuous GPS measurements of postglacial adjustment in Fennoscandia 1. Geodetic results, J. geophys. Res. , 107, ETG 3-1–ETG 3-27. Kato T., El-Fiky G.S., Oware E.N., Miyazaki S., 1998. Crustal strains in the Japanese islands as deduced from dense GPS array, Geophys. Res. Lett. , 25( 18), 3445– 3448. Google Scholar CrossRef Search ADS   Langbein J., 2004. Noise in two-color electronic distance meter measurements revisited, J. geophys. Res. , 109( 4), 1– 16. Langbein J., 2008. Noise in GPS displacement measurements from Southern California and Southern Nevada, J. geophys. Res. , 113( 5), 1– 12. Langbein J., Johnson H., 1997. Correlated errors in geodetic time series: implications for time-dependent deformation, J. geophys. Res. , 102( B1), 591– 603. Google Scholar CrossRef Search ADS   Lohman R.B., Murray J.R., 2013. The SCEC geodetic transient-detection validation exercise, Seism. Res. Lett. , 84( 3), 419– 425. Google Scholar CrossRef Search ADS   Mackay D.J.C., 1998. Introduction to gaussian processes, Neural Netw. Mach. Learn. , 168( 1996), 133– 165. Mao A., Harrison G.A., Dixon H., 1999. Noise in GPS coordinate time series, J. geophys. Res. , 104( B2), 2797– 2816. Google Scholar CrossRef Search ADS   McCaffrey R. et al.  , 2007. Fault locking, block rotation and crustal deformation in the Pacific Northwest, Geophys. J. Int. , 169( 3), 1315– 1340. Google Scholar CrossRef Search ADS   McCaffrey R., King R.W., Payne S.J., Lancaster M., 2013. Active tectonics of northwestern U.S. inferred from GPS-derived surface velocities, J. geophys. Res. , 118, 709– 723. Google Scholar CrossRef Search ADS   McGuire J.J., Segall P., 2003. Imaging of aseismic fault slip transients recorded by dense geodetic networks, Geophys. J. Int. , 155, 778– 788. Google Scholar CrossRef Search ADS   Meade B.J., Hager B.H., 2005. Block models of crustal motion in southern California constrained by GPS measurements, J. geophys. Res. , 110, 1– 19. Google Scholar CrossRef Search ADS   Moritz H., 1978. Least-Squares Collocation, Rev. Geophys. , 16( 3), 421– 430. Google Scholar CrossRef Search ADS   Murray J.R., Svarc J., 2017. Global Positioning System data collection, processing, and analysis conducted by the U.S. Geological Survey Earthquake Hazards Program, Seismol. Res. Lett. , 88( 3), 1– 10. Google Scholar CrossRef Search ADS   Murray M.H., Lisowski M., 2000. Strain accumulation along the Cascadia subduction zone in western Washington, Geophys. Res. Lett. , 27( 22), 3631– 3634. Google Scholar CrossRef Search ADS   Ohtani R., McGuire J.J., Segall P., 2010. Network strain filter: A new tool for monitoring and detecting transient deformation signals in GPS arrays, J. geophys. Res. , 115( 12), 1– 17. Okada Y., 1992. Internal deformation due to shear and tensile faults in a half space, Bull. seism. Soc. Am. , 82( 2), 1018– 1040. Papoulis A., 1991. Probability, Random Variables, and Stochastic Processes , 3rd edn, McGraw-Hill. Press W.H. Flannery B.P. Teukolsky S.A. Vetterling W.T., 2007. Numerical Recipes: The Art of Scientific Computing , 3rd edn, Cambridge Univ. Press. Rasmussen C.E. Williams C.K.I., 2006. Gaussian Processes for Machine Learning , The MIT Press. Roeloffs E., 2010. Tidal calibration of Plate Boundary Observatory borehole strainmeters: roles of vertical and shear coupling, J. geophys. Res. , 115( 6), 1– 25. Roeloffs E.A., 2006. Evidence for aseismic deformation rate changes prior to earthquakes, Annu. Rev. Earth Planet. Sci. , 34( 1), 591– 627. Google Scholar CrossRef Search ADS   Rogers G., Dragert H., 2003. Episodic tremor and slip on the Cascadia subduction zone: the chatter of silent slip, Science , 300, 1942– 1943. Google Scholar CrossRef Search ADS PubMed  Sandwell D.T. Wessel P., 2016. Interpolation of 2-D vector data using constraints from elasticity, Geophys. Res. Lett. , 43, 10 703– 10 709. Google Scholar CrossRef Search ADS   Schmidt D.A., Gao H., 2010. Source parameters and time-dependent slip distributions of slow slip events on the Cascadia subduction zone from 1998 to 2008, J. geophys. Res. , 115( 4), 1– 13. Schwartz S.Y., Rokosky J.M., 2007. Slow slip events and seismic tremor at circum-Pacific subduction zones, Rev. Geophys. , 45, 1– 32. Google Scholar CrossRef Search ADS   Segall P. Mathews M., 1997. Time dependent inversion of geodetic data, J. geophys. Res. , 102( B10), 22 391–22 409. Shen Z., Wang M., Zeng Y., Wang F., 2015. Optimal Interpolation of Spatially Discretized Geodetic Data, Bull. seism. Soc. Am. , 105( 4), 2117– 2127. Google Scholar CrossRef Search ADS   Shen Z.K., Jackson D.D., Ge B.X., Bob X.G., 1996. Crustal deformation across and beyond the Los Angeles basin from geodetic measurements, J. geophys. Res. , 101( B12), 27927– 27957. Google Scholar CrossRef Search ADS   Tape C., Musé P., Simons M., Dong D., Webb F., 2009. Multiscale estimation of GPS velocity fields, Geophys. J. Int. , 179( 2), 945– 971. Google Scholar CrossRef Search ADS   von Mises R., 1964. Mathematical Theory of Probability and Statistics , Academic Press. Wdowinski S. Zhang J. Fang P. Genrich J., 1997. Southern California Permanent GPS Geodetic Array: Spatial filtering of daily positions for estimating coseismic and postseismic displacements induced by the 1992 Landers earthquake, 102( 97), 57– 70. Wech A.G., 2010. Interactive Tremor Monitoring, Seismol. Res. Lett. , 81( 4), 664– 669. Google Scholar CrossRef Search ADS   Wech A.G., Creager K.C., Melbourne T.I., 2009. Seismic and geodetic constraints on Cascadia slow slip, J. geophys. Res. , 114( 10), 1– 9. Wendland H., 2005. Scattered Data Approximation , Cambridge Univ. Press. Williams S.D.P. Bock Y. Fang P. Jamason P. Nikolaidis R.M. Prawirodirdjo L. Miller M. Johnson D.J., 2004. Error analysis of continuous GPS position time series, J. geophys. Res. , 109( B3), 1– 19. Google Scholar CrossRef Search ADS   Wyatt F., 1982. Displacement of Surface Monuments: Horizontal Motion, J. geophys. Res. , 87( B2), 979– 989. Google Scholar CrossRef Search ADS   Wyatt F.K., 1989. Displacement of surface monuments: Vertical motion, J. geophys. Res. , 94( B2), 1655– 1664. Google Scholar CrossRef Search ADS   Zhang J. Bock Y. Johnson H. Fang P. Williams S. Genrich J. Wdowinski S. Behr J., 1997. Southern California Permanent GPS Geodetic Array: error analysis of daily position estimates and site velocities, J. geophys. Res. , 102( B8), 18 035–18 055. SUPPORTING INFORMATION Supplementary data are available at GJI online. strain-animation.mp4 Please note: Oxford University Press is not responsible for the content or functionality of any supporting materials supplied by the authors. Any queries (other than missing material) should be directed to the corresponding author for the paper. APPENDIX: OUTLIER DETECTION ALGORITHM Our outlier detection algorithm is loosely based on the data editing algorithm from Acheson (1975). Let $$\bf {d}^*$$ denote all n GNSS displacement observations for a single directional component, which have been made at positions and times $$\bf {P}$$. We describe $$\bf {d}^*$$ as a realization of the random vector   \begin{eqnarray} \bf {d}= \bf {G}\bf {m} + v(\bf {P}) + \bf {w}, \end{eqnarray} (A1)where $$\bf {G}$$ and $$\bf {m}$$ are the same as in eq. (4), v is a Gaussian process distributed as $$\mathcal {GP}(0,C_v)$$, and $$\bf {w}$$ is a vector of uncorrelated Gaussian noise with known standard deviations $$\bf {\sigma }$$. The Gaussian process v is intended to describe transient features in the data that cannot be explained by the linear trend or seasonal terms in $$\bf {G}$$. We let the temporal covariance of v be a squared exponential, and we let v be spatially uncorrelated so that   \begin{eqnarray} C_v(p,p^{\prime }) = \phi ^2\exp \left(\frac{-|t - t^{\prime }|^2}{2\tau ^2}\right) \delta _{\vec{x},\vec{x}^{\prime }}, \end{eqnarray} (A2)where $$\delta _{\vec{x},\vec{x}^{\prime }} = 1$$ if $$\vec{x}= \vec{x}^{\prime }$$ and 0 otherwise. The spatial covariance of v has little effect on the detected outliers, and so we have assumed that v is spatially uncorrelated for simplicity. Based on our experience, v can reasonably describe most transient features in the data when we set ϕ = 1 mm and τ = 10 d. Our goal is to find the index set of non-outliers in $$\bf {d}^*$$, which we denote as $$\bf {\Omega }$$. We use a tilde to indicate that an array only contains elements corresponding to $$\bf {\Omega }$$ (e.g. the vector of non-outlier observations is denoted $$\tilde{\bf {d}}^* = [d_i^*]_{i \in \bf {\Omega }}$$). The outliers are identified iteratively, and we initiate $$\bf {\Omega }$$ with all n indices. We consider outliers to be data that are poorly explained by the model $$\bf {G}\bf {m} + v(\bf {P})$$, which are determined by the residual vector   \begin{eqnarray} \bf {r} &=& \bf {d}^* - \mathrm{E}\left[ \big (\bf {G}\bf {m} + v(\bf {P})\big ) \big | \big (\tilde{\bf {d}} = \tilde{\bf {d}}^*\big ) \right]= \bf {d}^* - \left[\begin{array}{c@{\quad}c}C_v(\bf {P},\tilde{\bf {P}}) & \bf {G} \end{array}\right]\nonumber\\ &&\times \left[\begin{array}{c@{\quad}c}C_v(\tilde{\bf {P}},\tilde{\bf {P}}) + \mathrm{diag}(\tilde{\bf {\sigma }}^2) & \tilde{\bf {G}} \\ \tilde{\bf {G}}^T & \bf {0}\\ \end{array}\right]^{-1} \left[\begin{array}{c}\tilde{\bf {d}}^* \\ \bf {0}\\ \end{array}\right]. \end{eqnarray} (A3)Data with abnormally large residuals are identified as outliers. For each iteration, we compute $$\bf {r}$$ and then update $$\bf {\Omega }$$ so that it contains the indices of $$\bf {r}$$ whose weighted values are less than λ times the weighted root mean square of $$\tilde{\bf {r}}$$,   \begin{eqnarray} \bf {\Omega } \leftarrow \left\lbrace i : \left|\frac{r_i}{\sigma _i}\right| < \lambda \cdot \sqrt{\frac{1}{|\bf {\Omega }|} \sum _{j \in \bf {\Omega }} \frac{r_j^2}{\sigma _j^2}} \right\rbrace . \end{eqnarray} (A4)Iterations continue until the new $$\bf {\Omega }$$ is the same as the previous $$\bf {\Omega }$$. The outlier detection algorithm is demonstrated in Fig. A1. For the demonstration, we use the easting component of displacements at a single station, SC03, which is located on Mt. Olympus in Washington state. Station SC03 records anomalous observations during the winter, presumably because of snow and ice accumulation, and we want to remove these observations. The station also records periodic westward motion from SSEs, and we want to keep this deformation intact. The detected outliers are shown in panel (b). For comparison, we also show the detected outliers when we do not include the Gaussian process v in our model for the data (panel a). When v is not included, real transient deformation resulting from SSEs is erroneously identified as outliers. When v is included, the identified outliers only consist of the anomalous deformation that lacks temporal continuity. It should be noted that we use λ = 2.5 for this demonstration, which causes the outlier detection algorithm to be particularly aggressive. In Section 3, we clean the data using a more tolerant λ = 4.0. Figure A1. View largeDownload slide Outliers detected in the easting component of displacements at station SC03. The orange markers indicate detected outliers. The orange line is the best fit model to the data, which is used to compute the residual vector $$\bf {r}$$. The model being fit to the data in panel (a) is $$\bf {G}\bf {m}$$, and the model in panel (b) is $$\bf {G}\bf {m} + v(\bf {P})$$. Figure A1. View largeDownload slide Outliers detected in the easting component of displacements at station SC03. The orange markers indicate detected outliers. The orange line is the best fit model to the data, which is used to compute the residual vector $$\bf {r}$$. The model being fit to the data in panel (a) is $$\bf {G}\bf {m}$$, and the model in panel (b) is $$\bf {G}\bf {m} + v(\bf {P})$$. © The Author(s) 2017. Published by Oxford University Press on behalf of The Royal Astronomical Society.

### Journal

Geophysical Journal InternationalOxford University Press

Published: Mar 1, 2018

## You’re reading a free preview. Subscribe to read the entire article.

### DeepDyve is your personal research library

It’s your single place to instantly
that matters to you.

over 18 million articles from more than
15,000 peer-reviewed journals.

All for just $49/month ### Explore the DeepDyve Library ### Search Query the DeepDyve database, plus search all of PubMed and Google Scholar seamlessly ### Organize Save any article or search result from DeepDyve, PubMed, and Google Scholar... all in one place. ### Access Get unlimited, online access to over 18 million full-text articles from more than 15,000 scientific journals. ### Your journals are on DeepDyve Read from thousands of the leading scholarly journals from SpringerNature, Elsevier, Wiley-Blackwell, Oxford University Press and more. All the latest content is available, no embargo periods. DeepDyve ### Freelancer DeepDyve ### Pro Price FREE$49/month
\$360/year

Save searches from
PubMed

Create lists to

Export lists, citations