TY - JOUR AU - Szuberla, Curt A L AB - SUMMARY Infrasound array data are commonly used to detect and characterize infrasonic signals from a variety of natural and anthropogenic sources. Here we examine the effectiveness of robust regression estimators (L1-norm regression, M-estimators and least trimmed squares) for infrasound array processing, and compare them against standard array processing algorithms (least-squares estimation, frequency–wavenumber analysis and progressive multi-channel correlation) using a combination of real and synthetic data. Of particular interest is how each algorithm performs when one of the array elements produces data outliers. Synthetic tests on elements containing a clock error, constant values or only pink noise are performed, and we analyse the relative ability of the estimators to recover plane wave parameters. The L1-norm regression, M-estimate, frequency–wavenumber analysis and least trimmed squares estimates provided superior results than conventional least-squares estimation. Evaluation of least trimmed squares weights consistently identified the element with the simulated error, providing additional information on array performance. Least trimmed squares processing consistently identified an element with reversed polarity for Alaska Volcano Observatory array ADKI. International Monitoring System stations IS57 and IS55 were likewise processed. Data from an element of IS57, which had lower cross-correlation values than the remaining elements, were consistently identified as having outliers in array processing. An element with a timing error was identified in the analysis of IS55 data. These results suggest robust regression methods, in particular least trimmed squares, improve upon standard methods and should be used more widely, as they can provide robust array processing results and insight into array performance. Further, robust regression methods are not limited to infrasound array processing applications, and it is likely that they would also be effective for seismic array data. Time-series analysis, Earthquake monitoring and test-ban treaty verification, Statistical methods 1 INTRODUCTION Infrasound, acoustics in the frequency band below roughly 20 Hz, can propagate through the atmosphere across global distances from a rich variety of natural and man-made sources. To monitor compliance with the Comprehensive Nuclear-Test-Ban Treaty (CTBT), the International Monitoring System (IMS) adopted infrasound as one of its key technologies (Christie & Campus 2010). A network of 60 globally distributed infrasound sensor groups, called arrays, is planned, which enables a broad range of scientific and civil applications. Currently, 51 of the 60 arrays are installed and certified, which complements the existing seismic, hydroacoustic and radionuclide stations. Infrasound arrays are also commonly used by volcano observatories, such as the Alaska Volcano Observatory (AVO), for volcano monitoring (Lyons et al. 2020) and research (Fee et al. 2017; Haney et al. 2018). Infrasound arrays complement existing seismic and lightning detection networks, and can detect and characterize remote volcanic eruptions when satellite imagery is impeded by cloud cover (Fee & Matoza 2013; Lyons et al. 2020). Array processing is the combination of multiple sensor inputs for signal detection schemes and parameter estimation that would not be otherwise possible with single sensor coverage. If we assume an infrasonic wave propagates across an array as a plane wave, that is, the differential pressure is equal in a plane perpendicular to the propagation direction, then we can solve for an optimal direction-of-arrival and trace velocity of the traversing wave with either frequency or time domain techniques. These array parameter estimates are particularly useful when the source time and location are unknown, such as in early detection scenarios (e.g. Modrak et al. 2010). Accurate array processing is important for nearly all aspects of IMS detection capabilities (Le Pichon et al. 2009; Green & Bowers 2010), for the study of atmospheric structure (e.g. Assink et al. 2014), and other civil and scientific applications such as the monitoring of volcanic ash (e.g. Fee et al.2017). In an attempt to maximize global coverage and minimize cultural noise, IMS arrays are often located in remote locations with harsh environmental conditions that pose challenges for maintaining high data quality at all elements. Arrays deployed for volcano and other monitoring tasks can face similar issues. These situations pose an operational problem, as the electronics and wind noise reduction system (WNRS) can degrade or break, affecting the array performance. Maintenance of mission capability for CTBT verification requires continuous quality control at IMS infrasound arrays. The lack of ground-truth sources and complex, time-dependent propagation effects (e.g. Smets et al. 2015) relative to seismic counterparts makes detection of poorly performing elements challenging (Assink et al. 2014; Koch & Pilger 2019). Further, the loss or degradation of potentially multiple elements on an array’s performance is not understood. The uncertainty of a parameter estimate from array data depends (in part) on the spatial distribution of the elements (Szuberla & Olson 2004), and an improper consideration of the geometry can bias the parameters estimates (Edwards & Green 2012). An in situ diagnostic of array health could provide better monitoring capabilities and greater cost-effectiveness (Gabrielson 2011). Given a perfectly working array, signal processing of infrasonic data is still affected by many factors, including array geometry, local winds and signal propagation effects over interelement distances. Under some conditions, an impinging wave may distort or multipath over the approximately 1–3 km aperture IMS array (Christie & Campus 2010), leading to a loss of coherence and phase association problems (Green 2015). Ambient winds can induce low signal-to-noise conditions or obscure underlying signals entirely. Rosette pipe arrays and other WNRS are employed at IMS arrays to provide omnidirectional wind noise reduction, but generally low signal-to-noise conditions still present a challenging array processing problem (Walker & Hedlin 2010). Here we can apply new data analytic and algorithmic approaches in an attempt to mitigate these continuous issues. Using the plane wave assumption and the array geometry, we start our approach by examining the time-difference of arrival problem as a regression problem (e.g. Szuberla & Olson 2004). Specifically, we regress interelement time delays over the interelement distances, termed the co-array coordinates. This formulation allows us to apply robust regression techniques to our analysis. Robust regression techniques are designed to be resistant to outliers, extreme values that do not appear to belong to the underlying data distribution (Rousseeuw 1984; Rousseeuw & Leroy 1987; Ryan 2018). In a time domain array processing context, outliers are time delays that are inconsistent with the plane wave model. We show that these outliers can be related back to individual elements, which can grant new insight into infrasound array performance. In this paper, we compare the performance of some standard infrasound array processing methods (least-squares estimation, frequency–wavenumber analysis, L1-norm regression and PMCC) with robust regression techniques [M-estimators and least trimmed squares (LTS)]. M-estimators generalize the objective function used in least-squares and L1-norm regression techniques, and LTS generalizes the least-squares fit to cover data subsets. We show how LTS processing not only improves array estimates but also enables an extra quality control aspect as part of standard processing that can be useful for station operation and maintenance. In Section 2, we provide background information on the infrasound array processing techniques, followed by initial algorithm tests on synthetic and real array data in Sections 3.1 and 3.2. We show how the LTS algorithm helps identify degraded array performance on active IMS and AVO arrays. Finally, we conclude with a discussion on the application and limitations of LTS and other regression-based array processing techniques. We find that LTS and other robust regression techniques have superior performance to standard techniques and should be applied more regularly to infrasound and seismic array processing. 2 BACKGROUND AND METHODOLOGY 2.1 The plane wave model Fundamental to many infrasound array processing techniques is the plane wave propagation assumption (Cansi et al. 1993; Szuberla & Olson 2004). The plane is determined by the wavenumber vector, k, of the propagating wave, which may be considered as a spatial frequency variable, similar to how angular frequency ω is a temporal frequency variable (Johnson & Dudgeon 1993). These two quantities are related through the slowness vector, |$\boldsymbol {\beta } = {\mathbf k}/\omega$|⁠, which is what is typically estimated during infrasound beamforming, that is, the process where coherent traces at distributed receivers are time aligned for specified slowness values. We report the slowness vector estimates in components, which we express equivalently as a backazimuth (θ = tan−1(βy/βx), with reference to north) and speed in the plane of the array (trace velocity |$= 1/||\boldsymbol {\beta }||$|⁠). The plane wave approximation is typically valid far from an acoustic source (Szuberla et al. 2006). However, this model can fail over local ranges or due to other propagation effects (e.g. multipathing) or topographic influences (Berteussen 1976; Szuberla et al. 2006; Green 2015). We use 2-D beamforming in this paper and assume the bias from interelement elevation changes to be negligible (Edwards & Green 2012). 2.2 Robust regression We apply robust regression techniques to estimate plane wave parameters. These regression methods all fit interelement time delays to the co-array geometry. Time delays for the least-squares, L1-norm regression, M-estimation and LTS methods are all derived from the maxima of cross-correlation functions; therefore, in this paper, the starting set of time delays for each regression method will be the same. Robust techniques are designed to resist various types of statistical outliers and highlight outlying observations by their large residuals from the robust fit (Rousseeuw & Leroy 1987; Ryan 2018); such outliers commonly complicate infrasound array processing. The effect of outliers on the performance of statistical estimators can be quantified through the breakdown point, the number of data points that can cause unbounded bias in the estimate (Rousseeuw & Leroy 1987). Least-squares estimation, like the sample mean, has the smallest possible breakdown point of a single data point. High-breakdown value estimators, such as the sample median and LTS estimator, can achieve the maximum possible breakdown point of 50 per cent of the data (Rousseeuw 1984). In the following sections we describe the most commonly used infrasound array processing techniques, as well as new, robust regression array processing techniques. 2.3 Standard array processing methods 2.3.1 Least-squares beamforming Consider a signal that propagates with a slowness |$\boldsymbol {\beta }$| across an array of n sensors with distinct locations on a Cartesian grid (xi, yi). We assume a plane wave model, and use the maximum of the normalized cross-correlation function between the |$N = \binom{n}{2}$| station pairs to obtain interelement traveltimes t(ij). We can express this model with matrix algebra as $$\begin{eqnarray} \boldsymbol {\tau } = \mathbf {A}\boldsymbol {\beta } + \boldsymbol {\epsilon }, \end{eqnarray}$$(1) where |$\boldsymbol {\tau }$| is the N × 1 vector of interelement traveltime differences, |$\mathbf {A}$| is the N × 2 matrix of co-array coordinates, |$\boldsymbol {\beta }$| is the 2 × 1 slowness vector (βx, βy) and |$\boldsymbol {\epsilon }$| is the N × 1 vector of measurement errors. This overdetermined system of equations is usually solved in a least-squares sense (Berteussen 1976; Szuberla & Olson 2004), which minimizes the sum of the squared residuals: $$\begin{eqnarray} \boldsymbol {\skew{4}\hat{\beta }} = \genfrac{}{}{0.0pt}{}{\text{minimize}}{\boldsymbol {\beta }} \sum _{i=1}^{N} |r_i(\boldsymbol {\beta })|^2, \end{eqnarray}$$(2) where ri is the ith component of the residual vector, |$\boldsymbol {r} = \mathbf {A}\boldsymbol {\beta } - \boldsymbol {\tau }$|⁠. We note that this model can accommodate a priori knowledge of element quality through a diagonal weighting matrix with binary (0 or 1) weights, and this 2-D system can be readily extended to d-dimensions (Szuberla & Olson 2004). For normally distributed errors, the least-squares (L2-norm) solution is the maximum likelihood solution, and under the assumption that errors are uncorrelated, have constant variance and have an expectation value of zero, it is the best (minimum-variance) linear unbiased estimator (Montgomery et al. 2001). Statistical bounds on the uncertainty of the estimate |$\boldsymbol {\beta }$| are expressed through a covariance matrix (Szuberla & Olson 2004). 2.3.2 Frequency–wavenumber (f–k) analysis Consider a broad-band transient signal that propagates as a plane wave across an array of n distinct sensors. These sensors have locations |$\boldsymbol {r_j}$| relative to a reference location |$\boldsymbol {r_0}$|⁠. The signal recorded at sensor uj at time t is expressed as $$\begin{eqnarray} u_j(t) = s(t - \boldsymbol {\beta _0} \cdot \boldsymbol {r_j}), \end{eqnarray}$$(3) where |$\boldsymbol {\beta _0}$| is the slowness vector of the plane wave (Rost & Thomas 2002). The array beam, y(t), is formed by averaging the recorded signals $$\begin{eqnarray} y(t) = \frac{1}{n}\sum _{j=1}^{n} s(t - (\boldsymbol {\beta } - \boldsymbol {\beta _0}) \cdot \boldsymbol {r_j}), \end{eqnarray}$$(4) where |$\boldsymbol {\beta }$| is the beam steering vector. Beamforming distorts broad-band signals, but an approximate characterization can be created by considering the total energy of the array beam (Kelly 1967). Applying Parseval’s theorem, we express the energy |$E(\boldsymbol {k})$| as $$\begin{eqnarray} E(\boldsymbol {k} - \boldsymbol {k_0}) = \frac{1}{2\pi } \int _{-\infty }^{\infty } \Big |S(\omega )\Big |^2\Big |\frac{1}{n}\sum _{j=1}^{n} \exp {(2\pi \mathrm{ i} (\boldsymbol {k} - \boldsymbol {k_0})\cdot \boldsymbol {r_j})}\Big |^2 \mathrm{ d}\omega , \\ \end{eqnarray}$$(5) where S(ω) is the Fourier transform of s(t). The squared term on the right-hand side of the integrand is the array response function, which we denote as $$\begin{eqnarray} |A(\boldsymbol {k} - \boldsymbol {k_0})|^2 = \Big |\frac{1}{n}\sum _{j=1}^{n} \exp {(2\pi \mathrm{ i }(\boldsymbol {k} - \boldsymbol {k_0})\cdot \boldsymbol {r_j})}\Big |^2. \end{eqnarray}$$(6) This expression describes the array pattern in wavenumber space over narrow-band frequencies. After normalizing the signal power spectral density (Kelly 1967; Kvaerna & Doornbos 1986), we directly relate the total energy on the array beam to the array response function. For a signal of interest, this technique uses a grid search over hypothetical slowness vectors to maximize array gain. The slowness vector that returns the maximum beam energy provides an estimate of the velocity and backazimuth of the incident plane wave. There are many variations of this technique beyond the semblance-based method applied here (Capon 1969; Gibbons et al. 2018; den Ouden et al. 2020), and it is often used for seismic array processing. We apply frequency–wavenumber (f–k) analysis to compare our time-domain regression techniques with a commonly used frequency-based method. 2.3.3 Progressive multi-channel correlation We also apply progressive multi-channel correlation (PMCC) to our synthetic examples. It is the main array processing tool used at the International Data Center (created to support verification of the CTBT) and is commonly applied by some in the community to infrasound array processing problems (Cansi 1995; Brachet et al. 2010). In the absence of noise, a traveltime closure relation holds when a plane wave propagates between any three sensors in an array, called an array triplet. The relative time delays between the sensor pairs in the array triplet are determined from the maxima of their cross-correlation functions. Across multiple narrowband frequency ranges, the quadratic residual of the closure relation is calculated, and a detection is declared if this value is below a given threshold (Brachet et al. 2010). This technique initializes on the smallest group of three sensors in the array and progressively adds distant sensors to the calculations if the time delay (from the cross-correlation function maximum) is consistent with the delay predicted from plane wave propagation. Consistent time delays in frequency, backazimuth and trace-velocity space are denoted by ‘pixels’, and adjacent pixels with similar properties are joined into PMCC ‘families’ (Brachet et al. 2010). Pixels and families are commonly displayed on a 2-D frequency–time grid. We calculate sample statistics over pixel families (i.e. medians and means) to summarize the plane wave parameters. Processing detection and family association parameters are uniform for each synthetic example and are listed in the Supporting Information. 2.3.4 L1 regression Minimization of an L1-norm objective function is sometimes applied in the geophysical literature when data may have outlying points (Claerbout & Muir 1973; Scales et al. 1988; Aster et al. 2013; Silwal & Tape 2016). In the context of regression, the optimal vector |$\boldsymbol {\hat{\beta }}$| is found that minimizes the L1 norm of the residual vector $$\begin{eqnarray} \boldsymbol {\skew{4}\hat{\beta }} = \genfrac{}{}{0.0pt}{}{\text{minimize}}{\boldsymbol {\beta }} \sum _{i=1}^{N} |r_i(\boldsymbol {\beta })|. \end{eqnarray}$$(7) The L1-norm estimate is the maximum likelihood estimator for errors following a double exponential distribution. Unlike least-squares regression, a general closed form solution does not exist for minimizing for the L1 norm, so iterative techniques such as iteratively reweighted least squares are needed (Beaton & Tukey 1974). L1 regression has not typically been applied in infrasonic or seismic array processing, although Haney et al. (2018) employed an iteratively reweighted least-squares method to volcano infrasound data. For computational efficiency, we use the free Python convex optimization software CVXOPT (cvxopt.org) to perform the regression. This package rapidly solves the equivalent linear programming problem with a custom interior-point solver. 2.4 Robust array processing methods 2.4.1 M-estimation M-estimators are a family of functions that generalize the objective function used in L2- and L1-regression problems $$\begin{eqnarray} \boldsymbol {\skew{4}\hat{\beta }} = \genfrac{}{}{0.0pt}{}{\text{minimize}}{\boldsymbol {\beta }} \sum _{i=1}^{N}\rho \Big ( r_i(\boldsymbol {\beta }) \Big ), \end{eqnarray}$$(8) where ρ is a symmetric function of the residuals with a unique minimum at zero (Huber 1973; Rousseeuw & Leroy 1987). The M-estimate obtained from a function is the maximum likelihood estimate for a distribution with probability density function f(t) = Ce−ρ(z), although ρ need not be the maximum likelihood solution for any distribution (Maronna et al. 2006). If |$\rho = ({1}/{2}) z^2$| or ρ = |z|, we recover the least-squares estimator and the L1-norm estimator, respectively. If the residuals were multiplied by a constant (i.e. scaled), then the M-estimator solution may not be the same. To obtain a solution independent of data units, we solve the related equations (Montgomery et al. 2001) $$\begin{eqnarray} \boldsymbol {\skew{4}\hat{\beta }} &=& \genfrac{}{}{0.0pt}{}{\text{minimize}}{\boldsymbol {\beta }} \sum _{i=1}^{N}\rho \Big ( \frac{r_i(\boldsymbol {\beta })}{q} \Big ) \\ &=& \genfrac{}{}{0.0pt}{}{\text{minimize}}{\boldsymbol {\beta }}\sum _{i=1}^{N} \rho \Big ( \frac{\boldsymbol {\tau _i} - {{A_i}} \boldsymbol {\beta }}{q} \Big ), \end{eqnarray}$$(9) where q is a robust estimate of scale and |${A}_i$| is the ith row of the co-array matrix |$\mathbf {A}$|⁠. A popular scale choice, for example, Rousseeuw & Leroy (1987) among others, is the median absolute deviation $$\begin{eqnarray} q &=& \frac{1}{\Phi ^{-1}(3/4)}\times \text{median}|r_i - \text{median}(r_i)| \\ &\approx& 1.483\times \text{median}|r_i - \text{median}(r_i)|, \end{eqnarray}$$(10) where Φ−1 is the inverse of the cumulative distribution function for the standard normal distribution. To minimize (eq. 9), we set the derivative with respect to the model parameters |$\boldsymbol {\beta }_j$| equal to zero $$\begin{eqnarray} \sum _{i=1}^{N}\mathbf {A}_{ij} \psi \Big ( \frac{r_i}{q} \Big ) = 0, \end{eqnarray}$$(11) where ψ = ρ′, and |$\mathbf {A}_{ij}$| is the jth coordinate of the ith co-array element (Montgomery et al. 2001). To solve this equation, we apply iteratively reweighted least-squares (Beaton & Tukey 1974) with convergence determined by the scale-free gradient condition (Dennis 1977; Coleman et al. 1980). Due to previous work showing that the Biweight estimator returned lower variance parameter estimates than the least-squares estimator for several chemical data sets (Phillips & Eyrlng 1983), we use the Biweight function (ψ) in this analysis. After an initial L1-norm fit, the standardized residuals (ri/q) are calculated, and each data point is weighted with the (Biweight) weighting function. Data points with large residuals are down-weighted, which decreases their influence in the next iteration. For various M-estimators, weights are assigned with a maximum value of 1 and a (sometimes asymptotic) minimum value of 0. In this way, the M-estimator attempts to remove outlying data points. See the Supporting Information for additional M-estimators. While not a commonly applied technique in geophysics, there are some previously published applications of M-estimators to chemical data sets. For example, they were used for an accurate regression estimate of a calibration curve when the original data had a known outlier (Tiede & Pagano 1979). In both of the above works, the authors determined that the M-estimate was more accurate than the least-squares estimate when the underlying data errors were non-normal due to belonging to an inherently non-Gaussian data distribution or from the addition of data outliers (Tiede & Pagano 1979; Phillips & Eyrlng 1983). These univariate conclusions were further expanded to multiple linear regression calibrations (Wei et al. 1993). 2.4.2 LTS LTS is a high breakdown value estimator for a sample of size N defined as $$\begin{eqnarray} \boldsymbol {\skew{4}\hat{\beta }} = \genfrac{}{}{0.0pt}{}{\text{minimize}}{\boldsymbol {\beta }} \ \sum _{i=1}^h |r(\boldsymbol {\beta })|^2_{i:N}, \end{eqnarray}$$(12) where h is the subset size and |$r_1^2 \le r_2^2 \le ... \le r_N^2$| are the ordered squared residuals (ascending order; Rousseeuw 1984; Rousseeuw & Leroy 1987). In other words, it is the least-squares fit over a subset of the data points such that the sum of the associated squared residuals is minimized. For LTS, h ≤ N, and if h = N, then we recover the ordinary least-squares fit. In this paper, we apply the FAST-LTS resampling algorithm (Rousseeuw & Van Driessen 1999, 2006) with small sample correction factors (Pison et al. 2002). This algorithm makes use of several (e.g. 500) data subsets of size h (an h-subset) and a ‘concentration’ step, which is a systematic process to choose a new h-subset with a smaller least-squares residual than the current h-subset. After applying 10 concentration steps to each initial subset, the 10-h-subsets with the lowest sum of squared residuals are saved. Concentration steps are then applied to these sets of data points until they converge (Rousseeuw & Van Driessen 2006). After this initial data fit, the sample scale (⁠|$\hat{\sigma }$|⁠) is estimated, and binary weights are assigned to the data points (Rousseeuw & Leroy 1987; Rousseeuw & Hubert 1997) $$\begin{eqnarray} w_i = \left\lbrace \begin{array}{@{}l@{\quad }l@{}}0 & \text{if } \big | \frac{r_i}{\hat{\sigma }}\big | \gt 2.5, \\ 1 & \text{otherwise}. \end{array}\right. \end{eqnarray}$$(13) A weighted least-squares fit is performed to increase statistical efficiency. Station pairs whose corresponding data points have a final weight of zero are flagged for additional analysis. In choosing the subset size, h, there appears to be an inherent trade-off between breakdown value and statistical efficiency (Rousseeuw & Hubert 1997). A lower h-value results in a higher breakdown point while a higher h-value leads to higher statistical efficiency. The optimal h-value between these two extremes is ⌊(N + p + 1)/2⌋, where N is the total number of data points, p is the number of parameters we are trying to fit and ⌊⌋ is the floor function (Rousseeuw & Hubert 1997). However, any h-value from ⌊N/2⌋ + 1 to N is mathematically valid. To apply this method to data sets of arbitrary size, h is typically expressed as a fraction of the data set, h =⌊αN⌋, where 0 ≤ α ≤ 1. LTS has previously been applied in multiple fields, such as nonlinear regression filtering in image restoration (Koivunen 1995), re-examining economic trends (Zaman et al. 2001) and determining parameters of the Fundamental Plane in astrophysical calculations (Cappellari et al. 2013). These studies again found that the robust technique (i.e. LTS) was superior to the least-squares estimate. For array processing, we find it effective to choose an h-value such that all station pairs that include a single outlier element are removed. For a co-array of size N, this means that we trim n − 1 data points. On the active array data in Section 3.2, a subset of size α = 0.5 was chosen for exploratory analysis. LTS application to array data sets is extremely rapid and processing is completed in about half a second for each 30 s data window (8 traces, 28 station pairs) on a standard personal computer. Our LTS implementation is built on the Python modules Numpy and Scipy, and it is publicly available on github (https://github.com/uafgeotools/lts_array). 3 RESULTS 3.1 Synthetic tests In order to compare the aforementioned algorithms in an array-processing application, we attempt to recover plane wave parameters under various model deviations. The ability of the estimator to recover the fixed, arbitrarily chosen plane wave parameters (backazimuth = 203.5° and trace velocity = 340 m s−1) is compared in each case by calculating the relative errors (deviation of the estimate from the simulated value divided by the simulated value). We create synthetic data with a 100 Hz sample rate to avoid sampling artefacts and use a fixed station geometry (IS53; Fig. 1). For simplicity, an impulsive source waveform is created by joining a 2.5 Hz sine wave with a 1.5 Hz, damped cosine (time constant = 2 s). Scaled pseudo-random pink noise (i.e. noise with a power spectral density proportional to 1/f, similar to background infrasound noise) is added to a 30 s data segment to create the waveforms. Data are then filtered between 0.5 and 3.0 Hz. The entire data segment (30 s) is treated as one continuous window for the time-domain cross-correlations and f–k beamforming. In the f–k beamforming, a square grid of slowness components (βx and βy) from −3.5 to 3.5 s km−1 in 0.1 s km−1 increments is defined, and frequencies from 0.5 to 3.0 Hz are looped over in approximately 0.0244 Hz increments. Figure 1. Open in new tabDownload slide Infrasound arrays used in this study with elements labelled. (a) IMS array IS53 located in Fairbanks, Alaska. The blue line represents the plane wave simulated in the synthetic tests. (b) The Alaska Volcano Observatory ADKI infrasound array located on Adak Island, Alaska. (c) IMS infrasound array IS57 in Piñon Flats, California. (d) IMS infrasound array IS55 in Windless Bight, Antarctica. PMCC analysis works differently from the above techniques, and our synthetic data processing pipeline is modified accordingly. The synthetic data are padded with pseudo-random pink noise to create 2 min traces and downsampled to 20 Hz. Waveforms are filtered between 0.5 and 3.0 Hz and processed over ten linearly spaced frequency bins (0.5–3.0 Hz) in 30 s windows with 90 per cent overlap. We note that PMCC calculations are performed over narrow frequency bands, which adds an additional processing difference to the broad-band time domain techniques outlined above. Output from PMCC is a collection of pixel families in a frequency–time grid (Supporting Information). To compare with the other algorithms, we show the mean and median backazimuth and trace-velocity values over the entire pixel family formed in each case. We choose to only modify the synthetic data for element 1. This choice is somewhat arbitrary, and although model variations are consistently simulated at this element, the effect of individual failures is not equal among the elements. This differential importance can be shown with array response functions (e.g. Gibbons et al. 2018) or the slowness confidence ellipsoid method of Szuberla & Olson (2004). Due to the modifications introduced to element 1, we uniformly only consider the seven elements 2–8 in our signal-to-noise ratio (SNR) calculations. The approximate SNR of the signals is estimated by first calculating the Fisher statistic (F) across elements 2–8 (Melton & Bailey 1957; Blandford 1974) and then converting with the expression $$\begin{eqnarray} F = n \times \text{SNR} + 1, \end{eqnarray}$$(14) where n is the number of elements. The SNR expresses the ratio of signal power (Psignal) to noise power (Psignal) (Melton & Bailey 1957). The SNR is converted to decibels (dB) using the expression $$\begin{eqnarray} \text{dB} = 10\log _{10}(P_{\text{signal}}/P_{\text{noise}}). \end{eqnarray}$$(15) Our synthetic tests are by no means exhaustive, but they provide insight into how the algorithms will perform when model assumptions are violated. We note that due to the use of pink noise instead of Gaussian noise, our SNR estimates from the F statistic are approximate. We consider an extensive catalogue of testing on these simplified signals to be beyond the scope of this paper. 3.1.1 Synthetic Case 1: an element with a timing error We simulate a plane wave signal traversing array IS53 where the timing at element 1 is not synchronized with the other elements, resulting in an effective time shift of −5 s at the element. After filtering, the traces have an SNR of approximately 1.74 dB (Fig. 2). Results of the array processing ensemble are compared in Table 1. Least-squares estimation poorly recovers the trace velocity and backazimuth of the original signal (64.2 and 22.5 per cent relative errors, respectively), and the M-estimation with the L2 start only provides minor improvement (62.7 per cent in trace velocity and 20.7 per cent relative error in backazimuth, respectively). L1-norm regression, M-estimation with the L1-norm start, f–k analysis and LTS all show similar performance and closely recover the original signal parameters. All element pairs containing 1 (1–2, 1–3, 1–4, 1–5, 1–6, 1–7 and 1–8) are dropped from LTS analysis for both h = ⌊0.50N⌋ and h = ⌊0.75N⌋ subset sizes. We note though that additional station pairs (e.g. 2–6) are occasionally dropped due to random fluctuations of the pink noise; however, element pairs with element 1 are consistently dropped. Figure 2. Open in new tabDownload slide (a) Synthetic waveforms for IS53 analysed in Section 3.1.1. Traces show the simulated 0.5–3.0 Hz filtered data for a plane wave crossing the array (Fig. 1). The blue waveform in (a) shows the proper signal arrival time at element 1. (b) Synthetic waveforms for IS53 analysed in Section 3.1.3. Element 1 is modified to contain only pink noise (SNR = 0). (c) Synthetic waveforms for IS53 analysed in Section 3.1.4. These waveforms have the same arrival time alignments as in (a), but the approximate SNR is 0.07 dB compared to 1.74 dB in (a). (d) Actual data from a Mount Erebus eruption in 2009 recorded at IMS array IS55 (Fig. 1) and analysed in Section 3.2.3. In this case data are filtered between 0.5 and 10.0 Hz. The vertical axis units are shared across all subplots. Table 1. Array processing results for synthetic plane wave data filtered between 0.5 and 3.0 Hz and a timing error on element 1 (Fig. 2). PMCC statistics are calculated over 78 pixels. Method . Trace velocity (km s−1) . Relative error (per cent) . Backazimuth (°) . Relative error (per cent) . Synthetic test 0.340 – 203.50 – Least squares 0.558 64.2 249.33 22.5 L1 norm 0.341 0.3 203.59 0.0 f–k analysis (semblance) 0.341 0.2 203.64 0.1 M-estimation (L2 start) 0.553 62.7 245.62 20.7 M-estimation (L1 start) 0.340 0.1 203.57 0.0 LTS (h = ⌊0.75N⌋) 0.340 0.1 203.56 0.0 LTS (h = ⌊0.50N⌋) 0.340 0.1 203.56 0.0 PMCC family mean 0.342 0.6 203.10 0.2 PMCC family median 0.341 0.3 203.10 0.2 Method . Trace velocity (km s−1) . Relative error (per cent) . Backazimuth (°) . Relative error (per cent) . Synthetic test 0.340 – 203.50 – Least squares 0.558 64.2 249.33 22.5 L1 norm 0.341 0.3 203.59 0.0 f–k analysis (semblance) 0.341 0.2 203.64 0.1 M-estimation (L2 start) 0.553 62.7 245.62 20.7 M-estimation (L1 start) 0.340 0.1 203.57 0.0 LTS (h = ⌊0.75N⌋) 0.340 0.1 203.56 0.0 LTS (h = ⌊0.50N⌋) 0.340 0.1 203.56 0.0 PMCC family mean 0.342 0.6 203.10 0.2 PMCC family median 0.341 0.3 203.10 0.2 Open in new tab Table 1. Array processing results for synthetic plane wave data filtered between 0.5 and 3.0 Hz and a timing error on element 1 (Fig. 2). PMCC statistics are calculated over 78 pixels. Method . Trace velocity (km s−1) . Relative error (per cent) . Backazimuth (°) . Relative error (per cent) . Synthetic test 0.340 – 203.50 – Least squares 0.558 64.2 249.33 22.5 L1 norm 0.341 0.3 203.59 0.0 f–k analysis (semblance) 0.341 0.2 203.64 0.1 M-estimation (L2 start) 0.553 62.7 245.62 20.7 M-estimation (L1 start) 0.340 0.1 203.57 0.0 LTS (h = ⌊0.75N⌋) 0.340 0.1 203.56 0.0 LTS (h = ⌊0.50N⌋) 0.340 0.1 203.56 0.0 PMCC family mean 0.342 0.6 203.10 0.2 PMCC family median 0.341 0.3 203.10 0.2 Method . Trace velocity (km s−1) . Relative error (per cent) . Backazimuth (°) . Relative error (per cent) . Synthetic test 0.340 – 203.50 – Least squares 0.558 64.2 249.33 22.5 L1 norm 0.341 0.3 203.59 0.0 f–k analysis (semblance) 0.341 0.2 203.64 0.1 M-estimation (L2 start) 0.553 62.7 245.62 20.7 M-estimation (L1 start) 0.340 0.1 203.57 0.0 LTS (h = ⌊0.75N⌋) 0.340 0.1 203.56 0.0 LTS (h = ⌊0.50N⌋) 0.340 0.1 203.56 0.0 PMCC family mean 0.342 0.6 203.10 0.2 PMCC family median 0.341 0.3 203.10 0.2 Open in new tab 3.1.2 Synthetic Case 2: a flat channel of all zeroes We simulate a plane wave signal traversing array IS53 where element 1 has a flat channel composed of all zeros. After filtering, the traces have an SNR of approximately 1.39 dB. Results of the array processing are shown in Table 2. Least-squares estimation again poorly recovers the trace velocity and backazimuth of the original signal (75.0 and 66.5 per cent relative error, respectively), and the M-estimation with the L2 start again does not improve the estimation much (73.7 and 66.1 per cent relative error in trace velocity and backazimuth, respectively). L1-norm regression, M-estimation with the L1-norm start, f–k analysis and LTS again all show similar performance and recover the original signal parameters. All element pairs containing element 1 are dropped in LTS runs with both h = ⌊0.50N⌋ and h = ⌊0.75N⌋ subset sizes. Table 2. Array processing results for synthetic plane wave data filtered between 0.5 and 3.0 Hz and a flat channel of zeroes on element 1. PMCC statistics are calculated over 68 pixels. Method . Trace velocity (km s−1) . Relative error (per cent) . Backazimuth (°) . Relative error (per cent) . Synthetic test 0.340 – 203.50 – Least squares 0.085 75.0 338.79 66.5 L1 norm 0.342 0.4 203.69 0.1 f–k analysis (semblance) 0.341 0.3 203.46 0.0 M-estimation (L2 start) 0.089 73.7 337.99 66.1 M-estimation (L1 start) 0.341 0.2 203.54 0.0 LTS (h = ⌊0.75N⌋) 0.341 0.2 203.54 0.0 LTS (h = ⌊0.50N⌋) 0.341 0.2 203.54 0.0 PMCC family mean 0.342 0.6 203.30 0.1 PMCC family median 0.342 0.6 203.20 0.1 Method . Trace velocity (km s−1) . Relative error (per cent) . Backazimuth (°) . Relative error (per cent) . Synthetic test 0.340 – 203.50 – Least squares 0.085 75.0 338.79 66.5 L1 norm 0.342 0.4 203.69 0.1 f–k analysis (semblance) 0.341 0.3 203.46 0.0 M-estimation (L2 start) 0.089 73.7 337.99 66.1 M-estimation (L1 start) 0.341 0.2 203.54 0.0 LTS (h = ⌊0.75N⌋) 0.341 0.2 203.54 0.0 LTS (h = ⌊0.50N⌋) 0.341 0.2 203.54 0.0 PMCC family mean 0.342 0.6 203.30 0.1 PMCC family median 0.342 0.6 203.20 0.1 Open in new tab Table 2. Array processing results for synthetic plane wave data filtered between 0.5 and 3.0 Hz and a flat channel of zeroes on element 1. PMCC statistics are calculated over 68 pixels. Method . Trace velocity (km s−1) . Relative error (per cent) . Backazimuth (°) . Relative error (per cent) . Synthetic test 0.340 – 203.50 – Least squares 0.085 75.0 338.79 66.5 L1 norm 0.342 0.4 203.69 0.1 f–k analysis (semblance) 0.341 0.3 203.46 0.0 M-estimation (L2 start) 0.089 73.7 337.99 66.1 M-estimation (L1 start) 0.341 0.2 203.54 0.0 LTS (h = ⌊0.75N⌋) 0.341 0.2 203.54 0.0 LTS (h = ⌊0.50N⌋) 0.341 0.2 203.54 0.0 PMCC family mean 0.342 0.6 203.30 0.1 PMCC family median 0.342 0.6 203.20 0.1 Method . Trace velocity (km s−1) . Relative error (per cent) . Backazimuth (°) . Relative error (per cent) . Synthetic test 0.340 – 203.50 – Least squares 0.085 75.0 338.79 66.5 L1 norm 0.342 0.4 203.69 0.1 f–k analysis (semblance) 0.341 0.3 203.46 0.0 M-estimation (L2 start) 0.089 73.7 337.99 66.1 M-estimation (L1 start) 0.341 0.2 203.54 0.0 LTS (h = ⌊0.75N⌋) 0.341 0.2 203.54 0.0 LTS (h = ⌊0.50N⌋) 0.341 0.2 203.54 0.0 PMCC family mean 0.342 0.6 203.30 0.1 PMCC family median 0.342 0.6 203.20 0.1 Open in new tab 3.1.3 Synthetic Case 3: a channel with only noise (SNR = 0) We also simulate a plane wave signal traversing array IS53 with an approximate SNR of 1.01 dB after filtering. The time-series for element 1 is replaced by pure pink noise (SNR = 0 dB). Array processing results are shown in Table 3. Least-squares estimation poorly recovers the trace velocity and backazimuth of the original signal (27.4 per cent relative error in trace velocity and 6.9 per cent relative error in backazimuth). The M-estimation with the L2 start returns a trace velocity and backazimuth estimate with 11.9 per cent relative error in trace velocity and 3.8 per cent relative error in backazimuth. The L1-norm regression, M-estimation with the L1-norm start, f–k analysis and LTS all show similar performance and essentially recover the original signal parameters. All element pairs containing 1 are dropped for both the LTS h = ⌊0.75N⌋ and h = ⌊0.50N⌋ subset sizes. Table 3. Array processing results for synthetic plane wave data filtered between 0.5 and 3.0 Hz with element 1 replaced by pure pink noise (SNR = 0). PMCC statistics are calculated over 68 pixels. Method . Trace velocity (km s−1) . Relative error (per cent) . Backazimuth (°) . Relative error (per cent) . Synthetic test 0.340 – 203.50 – Least squares 0.247 27.4 189.49 6.9 L1 norm 0.339 0.3 203.36 0.1 f–k analysis (semblance) 0.342 0.7 203.53 0.0 M-estimation (L2 start) 0.300 11.9 195.79 3.8 M-estimation (L1 start) 0.340 0.0 203.43 0.0 LTS (h = ⌊0.75N⌋) 0.340 0.0 203.43 0.0 LTS (h = ⌊0.50N⌋) 0.340 0.0 203.43 0.0 PMCC family mean 0.343 0.9 203.40 0.0 PMCC family median 0.342 0.6 203.40 0.0 Method . Trace velocity (km s−1) . Relative error (per cent) . Backazimuth (°) . Relative error (per cent) . Synthetic test 0.340 – 203.50 – Least squares 0.247 27.4 189.49 6.9 L1 norm 0.339 0.3 203.36 0.1 f–k analysis (semblance) 0.342 0.7 203.53 0.0 M-estimation (L2 start) 0.300 11.9 195.79 3.8 M-estimation (L1 start) 0.340 0.0 203.43 0.0 LTS (h = ⌊0.75N⌋) 0.340 0.0 203.43 0.0 LTS (h = ⌊0.50N⌋) 0.340 0.0 203.43 0.0 PMCC family mean 0.343 0.9 203.40 0.0 PMCC family median 0.342 0.6 203.40 0.0 Open in new tab Table 3. Array processing results for synthetic plane wave data filtered between 0.5 and 3.0 Hz with element 1 replaced by pure pink noise (SNR = 0). PMCC statistics are calculated over 68 pixels. Method . Trace velocity (km s−1) . Relative error (per cent) . Backazimuth (°) . Relative error (per cent) . Synthetic test 0.340 – 203.50 – Least squares 0.247 27.4 189.49 6.9 L1 norm 0.339 0.3 203.36 0.1 f–k analysis (semblance) 0.342 0.7 203.53 0.0 M-estimation (L2 start) 0.300 11.9 195.79 3.8 M-estimation (L1 start) 0.340 0.0 203.43 0.0 LTS (h = ⌊0.75N⌋) 0.340 0.0 203.43 0.0 LTS (h = ⌊0.50N⌋) 0.340 0.0 203.43 0.0 PMCC family mean 0.343 0.9 203.40 0.0 PMCC family median 0.342 0.6 203.40 0.0 Method . Trace velocity (km s−1) . Relative error (per cent) . Backazimuth (°) . Relative error (per cent) . Synthetic test 0.340 – 203.50 – Least squares 0.247 27.4 189.49 6.9 L1 norm 0.339 0.3 203.36 0.1 f–k analysis (semblance) 0.342 0.7 203.53 0.0 M-estimation (L2 start) 0.300 11.9 195.79 3.8 M-estimation (L1 start) 0.340 0.0 203.43 0.0 LTS (h = ⌊0.75N⌋) 0.340 0.0 203.43 0.0 LTS (h = ⌊0.50N⌋) 0.340 0.0 203.43 0.0 PMCC family mean 0.343 0.9 203.40 0.0 PMCC family median 0.342 0.6 203.40 0.0 Open in new tab 3.1.4 Synthetic Case 4: low SNR data In this example, we again show a plane wave signal crossing IS53 with an effective time shift of −5 s at element 1. In this case, however, the post-filtering SNR is lower, approximately 0.07 dB (Fig. 2). The results of the array processing are summarized in Table 4 and show a similar trend in estimator performance as demonstrated in Section 3.1.1 (Table 1). For the h = ⌊0.75N⌋ subset size, all the element pairs containing element 1 are dropped. However, when h = ⌊0.50N⌋, the element pairs 3–5, 3–6, 3–8 and 4–8 are also dropped in addition to all of the element pairs containing element 1. These station pairs were dropped not because they were associated with an error at an element, but because the elements were misaligned in the cross-correlations due to the low signal-to-noise level. This traveltime misalignment caused the data points corresponding to these four station pairs to be offset from the best-fit plane (Supporting Information). Table 4. Array processing results for low SNR, time shifted (element 1) data filtered between 0.5 and 3.0 Hz. PMCC statistics are calculated over 54 pixels. Method . Trace velocity (km s−1) . Relative error (per cent) . Backazimuth [°] . Relative error (per cent) . Synthetic test 0.340 – 203.50 – Least squares 0.539 58.5 250.91 23.3 L1 norm 0.393 15.56 211.90 4.1 f–k analysis (semblance) 0.341 0.2 203.64 0.1 M-estimation (L2 start) 0.542 59.5 244.73 20.3 M-estimation (L1 start) 0.347 1.9 205.53 1.0 LTS (h = ⌊0.75N⌋) 0.347 2.0 205.48 1.0 LTS (h = ⌊0.50N⌋) 0.339 0.2 203.51 0.0 PMCC family mean 0.332 2.4 203.10 0.2 PMCC family median 0.333 2.1 203.2 0.1 Method . Trace velocity (km s−1) . Relative error (per cent) . Backazimuth [°] . Relative error (per cent) . Synthetic test 0.340 – 203.50 – Least squares 0.539 58.5 250.91 23.3 L1 norm 0.393 15.56 211.90 4.1 f–k analysis (semblance) 0.341 0.2 203.64 0.1 M-estimation (L2 start) 0.542 59.5 244.73 20.3 M-estimation (L1 start) 0.347 1.9 205.53 1.0 LTS (h = ⌊0.75N⌋) 0.347 2.0 205.48 1.0 LTS (h = ⌊0.50N⌋) 0.339 0.2 203.51 0.0 PMCC family mean 0.332 2.4 203.10 0.2 PMCC family median 0.333 2.1 203.2 0.1 Open in new tab Table 4. Array processing results for low SNR, time shifted (element 1) data filtered between 0.5 and 3.0 Hz. PMCC statistics are calculated over 54 pixels. Method . Trace velocity (km s−1) . Relative error (per cent) . Backazimuth [°] . Relative error (per cent) . Synthetic test 0.340 – 203.50 – Least squares 0.539 58.5 250.91 23.3 L1 norm 0.393 15.56 211.90 4.1 f–k analysis (semblance) 0.341 0.2 203.64 0.1 M-estimation (L2 start) 0.542 59.5 244.73 20.3 M-estimation (L1 start) 0.347 1.9 205.53 1.0 LTS (h = ⌊0.75N⌋) 0.347 2.0 205.48 1.0 LTS (h = ⌊0.50N⌋) 0.339 0.2 203.51 0.0 PMCC family mean 0.332 2.4 203.10 0.2 PMCC family median 0.333 2.1 203.2 0.1 Method . Trace velocity (km s−1) . Relative error (per cent) . Backazimuth [°] . Relative error (per cent) . Synthetic test 0.340 – 203.50 – Least squares 0.539 58.5 250.91 23.3 L1 norm 0.393 15.56 211.90 4.1 f–k analysis (semblance) 0.341 0.2 203.64 0.1 M-estimation (L2 start) 0.542 59.5 244.73 20.3 M-estimation (L1 start) 0.347 1.9 205.53 1.0 LTS (h = ⌊0.75N⌋) 0.347 2.0 205.48 1.0 LTS (h = ⌊0.50N⌋) 0.339 0.2 203.51 0.0 PMCC family mean 0.332 2.4 203.10 0.2 PMCC family median 0.333 2.1 203.2 0.1 Open in new tab 3.2 Real infrasound arrays 3.2.1 Adak infrasound array The Adak infrasound array is operated by the Alaska Volcano Observatory on Adak Island, Alaska (ADKI; Fig. 1) and was installed in early June 2017. The six element array has an aperture of ∼200 m, and uses Chaparral Physics Model 25Vx sensors with a sample rate of 100 Hz. We examine a 2017 June 10 eruption of Bogoslof volcano that is clearly recorded ∼600 km away at ADKI between 13:56:30 and 14:01:45 UTC (Fig. 3). The eruption location and timing was well-constrained by AVO for this remote volcano (Lyons et al. 2020), so we consider the source location to be exactly known. Figure 3. Open in new tabDownload slide ADKI backazimuth (left) and trace-velocity (right) estimates for the full ensemble of estimators for an eruption signal from Bogoslof volcano: (a) least squares, (b) L1-norm regression, (c) M-estimators, (d) f–k analysis and (e) least trimmed squares (α = 0.5). Points are shaded according to the median cross-correlation maxima in each window, and the values are the same across all subplots. While the f–k analysis does not use cross-correlations, the same shading was applied for visual comparison. The grey dashed line corresponds to the backazimuth from the ADKI infrasound array to Bogoslof volcano (∼65°), and the dual grey lines on the right (a2–e2) correspond to realistic acoustic wave bounds of 300 and 430 m s−1. ADKI waveforms are filtered between 0.5 and 5.0 Hz, and array cross-correlations are performed in 30 s windows with 50 per cent overlap. The median of the cross-correlation maxima in each window, the MdCCM, is used as a coherence metric. This frequency band is chosen to minimize microbarom noise and matches the operational passband used by AVO (Lyons et al. 2020). Fig. 3 shows the array processing results for the selected algorithms. Trace-velocity and backazimuth estimates from least-squares estimation are highly scattered between 13:56:30 and 14:01:45 UTC (Figs 3a1 and a2), despite the high MdCCM values (≳ 0.80). Large backazimuth scatter is also evident in the L1-norm regression, M-estimation and f–k analysis (Figs 3b1–e1). The trace-velocity estimates are likewise scattered in many highly correlated time windows lying outside the bounds of physically realistic acoustic velocities (Figs 3b2–e2). Note that for regression techniques, cross-correlations are performed prior to the regression, so the MdCCM values are the same across all time domain techniques. The LTS estimate of backazimuth has a much lower variance than the other techniques and points towards Bogoslof volcano (Figs 3e and 4c). The LTS trace-velocity estimates also appear more tightly clustered than the other estimators around a mean of ∼351 m s−1 (Figs 3a2–e2 and 4b). Fig. 4(d) shows the number of dropped station pairs containing the respective element. Evaluation of this panel of LTS weights shows that element 2 is consistently dropped from the regressions (Fig. 4d). After further analysis, including visual inspection of waveforms from the eruption sequence, it was discovered that the polarity of element 2 was reversed in the sensor connection with respect to the other elements (Supporting Information). The connection was repaired during the next maintenance visit. A ‘lift test’ where the sensor is manually lifted above the ground can be used to test the polarity of sensor connections during installation, but, due to the remote location, absence of real-time communications during the installation, and the lack of ground-truth sources, it was otherwise difficult to determine the reversed polarity in the data during the array installation. Figure 4. Open in new tabDownload slide Least trimmed squares estimation of trace velocity and backazimuth for the ADKI array. (a) Pressure trace from element 1 displaying a clear signal from 13:56:30 to 14:01:45 UTC. (b) LTS trace-velocity estimates are within realistic acoustic wave bounds of 300 and 430 m s−1 (grey lines). (c) LTS backazimuth estimates are ∼100°, shift to ∼65° during the time of high MdCCM, and then a return to ∼100°. We hypothesize that the highly coherent Bogoslof signal arrives during an extended period of surf or microbarom noise arriving from a different backazimuth. (d) Visualization of the elements flagged by LTS in each time window. These values denote the number of instances where the differential traveltimes from an element are dropped from the LTS estimate. Autocorrelations are not included, so the maximum number of dropped station pairs for each element is one less than the number of array elements. This figure shows that element 2 is frequently dropped entirely from the parameter estimates. Outside of the 13:56:30 to 14:01:45 UTC window, the L1-norm regression, M-estimation and LTS estimates appear to suggest a weaker continuous signal from a backazimuth of ∼100° (Figs 3 and 4). However, only the LTS solution appears to have the resolution to show a distinct change in backazimuth to Bogoslof volcano and back to ∼100° after the volcanic signal appears to have passed. We suspect this signal to potentially be microbarom or surf noise, and additional analysis is beyond the scope of this work. 3.2.2 IS57: Piñon Flats IMS station IS57 is an eight element array located in Piñon Flats, CA, USA (Fig. 1). This 1.5 km aperture array uses Chaparral 50A microphones with a sample rate of 20 Hz. A previous analysis of microbarom signals for 2017 IS57 data discovered that the inclusion of outer elements, particularly element 8, resulted in a lower median peak cross-correlation value among all of the element pairs in the ∼0.1–0.5 Hz (microbarom) band (Gabrielson 2017). We apply the LTS algorithm to this same data set. In our analysis, IS57 data from 2017 February 15 to 16 is filtered between 0.1 and 0.5 Hz and processed in 60 s windows with 50 per cent overlap. MdCCM is again used as a coherence metric (Fig. 6). Using LTS (α = 0.5), element 8 is consistently dropped from the analysis (Fig. 5) yielding an improvement to parameter estimation similar to completely dropping element 8 a priori before least-squares processing (Fig. 6). Frequency–wavenumber analysis also appears to be resistant to this phenomenon at the eight element array and shows similar beamforming results to LTS (Figs 6e–f). The L1-norm and M-estimation solutions appear to show trace-velocity and backazimuth solution sets similar to the f–k and LTS beamforming, but with many scattered estimates obscuring any clear backazimuth (Figs 6c1 and d1). Figure 5. Open in new tabDownload slide Least trimmed squares estimation of plane wave parameters for a microbarom signal recorded at IS57. (a) 32 hr of data from element 1. (b) Trace-velocity and (c) backazimuth estimates for the above data. The trace-velocity estimates during times of high correlation are physically realistic, and the backazimuth estimate converges to a consistent direction. Note the migration backazimuth to the north beginning around 10:00 UTC. (d) Number of times an element is flagged by LTS in each time window. Figure 6. Open in new tabDownload slide Backazimuth (left) and trace-velocity (right) estimates for the full ensemble of estimators for the IS57: least squares, L1-norm regression, M-estimators, f–k analysis and least trimmed squares (α = 0.5). Points are shaded according to the median cross-correlation maxima in each window, and the values are the same across all subplots. While the f–k analysis does not use cross-correlations, the same shading was applied for visual comparison. The dual grey lines on the right correspond to bounds of 300 and 430 m s−1. 3.2.3 IS55: Windless Bight IMS array IS55 is located ∼26 km away from Mount Erebus on the Ross Ice Shelf in Antarctica (Fig. 1). The array aperture is ∼2 km, the sample rate is 20 Hz, and the station uses eight Chaparral 50A microphones. We examine a small explosion from Mount Erebus on 2009 September 9 that was recorded as a short, impulsive infrasound signal at IS55 (Fig. 7). Data are filtered between 0.5 and 10.0 Hz using a two-pole acausal Butterworth filter and processed in 15 s windows with 75 per cent overlap. We note that array processing for this section is performed with the 2009 array geometry. Due to the glacial dynamics, the array elements move tens of meters each year. Figure 7. Open in new tabDownload slide Least trimmed squares estimation of plane wave parameters for a Mount Erebus explosion recorded at IMS station IS55. (a) A transient waveform recorded at element 1 filtered between 0.5 and 10.0 Hz. This signal is representative of the set of signals recorded at IS55 from this explosion. We note that the acausal filter has created an apparent rarefaction pulse in the waveform before and after the compression peak. (b) The trace velocity over times of high interelement correlation is physically realistic (dual grey lines), and the (c) backazimuth over the same time points back to the direction of Mount Erebus volcano (dashed line). (d) A graphical display of the elements dropped from the regression in the LTS processing. We note that element 6 is consistently dropped over the analysis window, and element 2 is also dropped during the high-correlation times. Inspection of the filtered data clearly shows a flat channel at element 6 (Fig. 2). While this is a relatively simple problem to detect (Brachet et al. 2010), we leave it in the processing to test the algorithms performance. Interestingly, previous work noted there was also a timing issue at element 2 due to the intermittent loss of its digitizer GPS-lock, which resulted in an inaccurate absolute time at that element (Denny 2012). This timing discrepancy produces an effective time shift approximately 2 s forward in time at element 2 relative to the other elements. There also appears to be some electrical noise on element 3, but the effect on the cross-correlations in this passband appears to be negligible compared to the other technical issues. Finally, we note that the acausal filter creates apparent rarefaction pulses on both sides of the peak compression signal (Figs 2 and 7). The signal distortion does not change the peak cross-correlation times, and unfiltered traces for comparison are presented in the Supporting Information. In the time segments of high MdCCM, least squares returns a velocity estimate of ∼368 m s−1 and a backazimuth of ∼324°. The L1 norm, M-estimator, f–k analysis and LTS solutions all show only small differences between the estimators, that is, a trace velocity of ∼320 m s−1 and a backazimuth of ∼338° (Fig. 7). These backazimuth estimates are in good agreement with the 337 degree geographic backazimuth of IS55 to Mount Erebus. While the non-least-squares estimators all appear to return an accurate estimation of the plane wave parameters (Supporting Information), consideration of LTS weights gives us an additional view of data quality. Element 6 is largely dropped from the analysis throughout this time period, and element 2 is also dropped from the analysis in the high-MdCCM regions. To more closely view how LTS processes the data, we present a visual representation of the LTS model fits in Fig. 8. These data points represent the cross-correlation derived, interelement time lags from the 15 s window centred at 17:05:49 UTC, and similar results are generated in each sliding window with high MdCCM (Fig. 7). Using the station co-array locations, all of the time-domain regression algorithms attempt to fit a singular plane through the data. The black shaded points represent those included in the LTS fit, while the blue and red points were dropped from the analysis by the algorithm. We see that the blue points (from the station pairs with element 6) were correlating at the end of the 15 s window, and are situated far from the other traveltimes. The red points (from the station pairs including element 2) are offset from the LTS derived best-fit plane (not shown) through the black points. Figure 8. Open in new tabDownload slide The IS55 least trimmed squares model fits for the processing window centred at 17:05:49 UTC (Fig. 7). The flat data (element 6) is evident from the cluster of blue points with a 15.0 s differential traveltime, the length of the cross-correlation window. The cross-correlation results for the element with a timing error (element 2) are shown in red and are offset from the best-fit plane determined by LTS (black). 4 DISCUSSION An anomalous traveltime is one that deviates from the time predicted by the plane wave propagation model, which underlies all of the estimation techniques mentioned above. Occasional deviations from this model in seismology, called wave-front aberrations, are well documented (Berteussen 1976; Gibbons et al. 2018). We show that a variety of infrasound element issues (e.g. timing, SNR, polarity) can manifest as an apparent deviation from the plane wave model (e.g. Fig. 3). Least-squares estimation of plane wave parameters attempts to incorporate these deviations into the processing, which can yield highly inaccurate results (Tables 1–3). In principle, individual elements can be down-weighted or dropped in least-squares analysis (Szuberla et al. 2006) and other array processing techniques, but it is rare and typically difficult to know a priori if the element is malfunctioning or has diminished data quality. For the various synthetic and array data case studies examined in this paper, the robust estimators consistently return more accurate trace-velocity and backazimuth estimates than least-squares estimation (see Tables 1–3, Figs 3 and 6). The L1-norm regression performs well in all of the synthetic case studies, and also appears to return an accurate estimate of plane wave parameters for the IS55 example (Section 3.2.3). However, the performance of this estimator appears to degrade on the ADKI (Section 3.2.1) and IS57 (Section 3.2.2) array data. In addition to the increased complexity of the real array data compared to the synthetic data, the performance deterioration of the estimator may be the result of the spatial configuration of the array elements. Viewed as a regression problem, station pairs on the outer boundary of the co-array have a larger influence on the final regression fit than station pairs closer to the co-array centre (Rousseeuw 1984; Montgomery et al. 2001). In statistical terms, these points are leverage points and have higher influence on the final fit than internal points (Rousseeuw & Van Driessen 1999; Ryan 2018). The L1-norm regression estimator is particularly affected by outlying data points that are also leverage points; one outlying data point can bias this estimator without bound (Ryan 2018). We recommend a dedicated publication on regression methods for a more rigorous discussion of influential data points, for example, Ryan (2018) or Montgomery et al. (2001). The Biweight M-estimator likewise performs well on the synthetic cases shown here, but shows poor performance on the ADKI and IS57 array examples. With the least-squares seed, the M-estimates show an improved trace velocity and backazimuth over the original least-squares estimate (Tables 1–3). However, for the purposes of accurate plane wave parameter estimation, this improvement is not sufficient. When an L1-norm start is used for the M-estimation, the M-estimates are more accurate than the least-squares estimates (Fig. 6), but additional improvement over the L1-norm estimate appears to be largely negligible. The breakdown point for M-estimators is also a single point (Ryan 2018). As suggested mathematically, we interpret the degradation of their behaviour in an array processing context to be for the same reasons as the L1-norm estimator (Ryan 2018). As other M-estimators suffer from the same susceptibility to the outlier configuration as the least squares, the L1-norm estimator, and the Biweight M-estimator used here, we do not expect the application of other M-estimates (Supporting Information) to yield different results. The f–k analysis performs well in many of the sample cases shown here (Tables 1–3, Fig. 6). One reason for this success may be the so-called translation property of Fourier transforms, where time shifts are converted into phase angle multiples of the original Fourier transforms (Bracewell 2000). The f–k analysis is also the only technique we applied that did not rely on cross-correlation derived traveltimes, which may provide some robustness in low SNR conditions (e.g. Table 4). However, we note that this method seemed to perform particularly poorly on the ADKI example where there was an element with reversed polarity (Fig. 3). The LTS estimator shows superior performance to least squares in all of the situations examined here. Additionally, LTS appears to return more accurate trace-velocity and backazimuth estimates than L1-norm regression and M-estimation in the ADKI (Section 3.2.1) and IS57 (Section 3.2.2) examples. The estimator also returns more stable and accurate estimates than f–k analysis when one of the elements had reversed polarity (ADKI), and it shows similar performance for the phenomenon illustrated in Section 3.2.2 for IS57. When beamforming the IS57 data, Gabrielson (2017) found that dropping element 8 from least-squares analysis resulted in backazimuth estimates with lower variation (Fig. 6). They also found that cross-correlation between all station pairs involving element 8 was lower than expected. Spectral analysis and beamforming of high-frequency data suggested that the element was in fact working properly, so it was proposed that element 8 experienced microbarom arrivals that were inconsistent with the plane wave model and may be related to the transverse correlation length (e.g. Gossard & Sailors 1970; Olson & Szuberla 2005; Green 2015). This was consistent with other time periods where the outer elements (6 and 7) had lower cross-correlation values and were perpendicular to the propagation direction (Gabrielson 2017). If this phenomenon is related to transverse correlation length, then this example suggests that LTS could be used for improved estimates in routine microbarom processing. In addition to the more accurate plane wave parameter estimates afforded by LTS, we can interpret the pattern of dropped station pairs as a diagnostic tool for monitoring station performance and informing station maintenance. In fact, the ability of LTS to single out elements that do not conform to the model assumptions is what distinguishes it from the other robust estimators. In the synthetic cases where element 1 was modified to have either a differential travel-time error (Synthetic Case 1), a flat channel of null data (Synthetic Case 2), or a channel of only pink noise (Synthetic Case 3), all station pairs that included the modified data were dropped from (i.e. flagged by) the LTS estimate. For the ADKI case study (Section 3.2.1), element 2 was nearly continuously dropped during the analysis period. The suggested non-planar arrival at element 8 was likewise dropped in the IS57 data. In the IS55 example, both the flat channel and the element with the timing error were dropped from the analysis, which exemplifies how LTS can remove multiple channels for larger arrays. We propose that LTS weights can be used as an analyst tool or the basis of an operational data quality alarm. As a subset algorithm, LTS broadly resembles PMCC (Cansi 1995) and limited-sensor pair correlation (Gibbons et al. 2018). The LTS subset size is pre-determined at the start of the algorithm, and the final element pair weights are determined by considering the magnitude of the scaled residuals from the best regression fit over the station pair subsets. In the above cases (Sections 3.2.1–3.2.3), a subset of half the data was chosen for exploratory analysis. This proportion results in a lower statistical efficiency than if a larger subset were used (Rousseeuw & Leroy 1987; Ryan 2018). However, evidence from the synthetic tests suggests that this effect is minor compared to the effect of the outlying data points (Tables 1–3 and additional tests in the Supporting Information). In practice, an ensemble of multiple subset sizes may be the best approach for exploratory analysis (Rousseeuw & Leroy 1987). In our experience, the robust estimators do not perform better than the traditional methods on arrays with small numbers of elements (n < 5, Supporting Information). For the inversion to be well-posed, at least three stations are required for analysis. In the instance that n = 3, there are three unique station pairs and it would not be possible to detect outlying traveltime delays without prior information. For n = 4, there are six unique station pairs, which means that an outlying traveltime delay at one station could potentially contaminate 50 per cent of the cross-correlation derived differential times. This is the maximum possible breakdown point that any robust estimator can achieve (Rousseeuw 1984; Rousseeuw & Hubert 1997), therefore it is not surprising that our LTS analysis for arrays with four elements do not appear to show improvement over standard least-squares processing (Supporting Information). However, we note that they are no worse though, with LTS producing equivalent results (Supporting Information). For n = 5, robust estimation performs better than least-squares estimation (ten total station pairs), but the influence of the array geometry is visible. The effect of array geometry on arrays with six or more elements was not observed. This trend may be the result of data dimensionality (Hubert & Debruyne 2010); for arrays with five elements or fewer, two parameters (velocity and backazimuth) are estimated with 10 or fewer data points. Although some algorithms are relatively resistant to outliers in the data (e.g. PMCC), the robust regression techniques described here have uncertainty bounds encoded in covariance matrices. For the M-estimates, expressions are asymptotically derived (Huber 1973). The LTS processing ends in a least-squares fit with binary weights, so uncertainty is again expressed as a covariance matrix. We also note that for complex Gaussian signals, it can be shown that the Bartlett and Capon f–k estimators of beam power are multiples of a chi-square variable (Capon 1969, 1970). Due to the relative simplicity that uncertainty can be expressed with these estimators, we propose LTS processing as a potential alternative to PMCC processing. Robust regression is a tool whose primary use is to identify outlying data points. Large differences between a robust estimate and the least-squares estimate suggest that there are outlying data points present in the data set that should be investigated (Rousseeuw & Leroy 1987). For infrasound array processing, our examples suggest that these differences could be the result of reduced data quality or non-planar propagation. These methods are not limited to infrasound array processing; many seismic array processing techniques rely on the plane wave assumption to estimate optimal backazimuth and velocity values (Rost & Thomas 2002). For example, this analysis can be done through a least-squares fit (Berteussen 1976), f–k analysis (Capon 1969) or the Cophase method (Rost & Thomas 2002). Due to local geology, the plane wave assumption is known to occasionally fail in seismic processing (Berteussen 1976; Rost & Thomas 2002; Gibbons et al. 2018). Seismometers may also experience instrumental time shifts, and LTS may complement existing tools (e.g. Stehly et al. 2007) for the seismic analyst processing array data. Seismic arrays are often deployed in remote locations, and an automated monitor of array health would likewise be useful. 5 CONCLUSIONS We introduce robust regression techniques into the infrasound array processing literature by comparing the performance of various algorithms on actual and synthetically flawed array data. With synthetic data, we show that realistic degradation of station performance (a timing error, a flat channel and excessive noise) can produce outlying cross-correlation derived traveltimes. These outliers can cause grossly aberrant estimates of plane wave parameters when least squares and M-estimators are used. In these simple examples, L1-norm regression, f–k analysis, PMCC and LTS well recover the original plane wave parameters. We then examine three cases of real data from both the IMS and AVO that contain problematic data that would otherwise be difficult for an analyst to detect. For ADKI and IS55, the arrays had sensor problems—an element with reversed polarity (ADKI) and an element with an approximately 2 s clock error (IS55). At IS57, interelement correlation with element 8 is lower than expected (Gabrielson 2017), which may be the result of non-planar propagation across the array. At ADKI, all of the estimators (least squares, L1-norm regression, M-estimation, and f–k analysis) apart from LTS return highly scattered backazimuth estimates. LTS, on the other hand, resolves a shift in backazimuth from ∼100° to ∼65° at the arrival of the Bogoslof eruption signal. For LTS, velocity estimates are also physically realistic; f–k analysis returns no physically realistic velocity estimates during this period. Examination of LTS weights consistently shows that element 2 is dropped from the analysis. At IS57, both LTS and f–k analysis returned backazimuth and velocity estimates that are distinct and physically realistic. L1-norm regression and M-estimation return backazimuth estimates that do not clearly point to a single source, but a consistent source velocity is obtained if physical information is also used. When processing the flawed IS55 data, all of the tested estimators perform better than least squares, but evaluation of LTS weights points to a timing error at element 2. In this case, the extra information of LTS weights distinguishes that estimator from the others. Our results show that robust estimators provide more accurate plane wave parameter estimates than conventional least-squares processing when there are problems with the data, and perform similarly when no issues exist. In addition to improved estimates, evaluation of LTS weights identified data quality issues in both the synthetic and active array data examples. While the robust estimators appear to lose some effectiveness at small element arrays (n ≤ 5 elements), they still perform equally to standard least squares. Our results suggest that LTS should be used for infrasound array processing instead of traditional methods. In particular, we note that these techniques are amenable to automated processing, relieving sensor network analysts from the burden of sifting out poorly performing or malfunctioning channels from the data stream and automatically flagging elements requiring further inspection or maintenance. Further, these methods are not limited to infrasound array processing and may also be applied to seismic array processing as well as general geophysical inverse theory. SUPPORTING INFORMATION Figure S1. Left: original unfiltered ADKI data between UTC 13:57:32 and 13:58:32. Right: the same data, but with element 2 reversed. Note the peak at approximately UTC 13:57:55. Figure S2. PMCC time–frequency grid from Synthetic Case 1. The mean and median statistics of the family are recorded in Table 2. Figure S3. PMCC time–frequency grid from Synthetic Case 2. The mean and median statistics of the family are recorded in Table 3. Figure S4. PMCC time–frequency grid from Synthetic Case 3. The mean and median statistics of the family are recorded in Table 4. Figure S5. PMCC time–frequency grid from Synthetic Case 4. The mean and median statistics of the family are recorded in Table 5. Figure S6. This figure shows the least trimmed squares model fit for Synthetic Case 4 (Section 3.1.4 in the main text). The blue points approximately 5 s above the best-fit plane correspond to the station pairs including element 1, which had the −5 s time shift. The four red points clustered around the back points are the station pairs 3–5, 3–6, 3–8 and 4–8. The pink noise caused a cross-correlation misalignment that offsets these four points from the best-fit plane (black). Figure S7. Array geometries considered in this Supplement. Figure S8. LTS processing results on subsets of the ADKI array for the Bogoslof eruption analysed in the main text. Left: the first column shows the trace-velocity estimates when different elements are removed. Middle: the second column shows the backazimuth estimates as elements are removed. Right: the third column shows the elements that are tagged in the algorithm when an element is removed prior to the processing. Figure S9. Backazimuth (left) and trace-velocity (right) estimates for the full ensemble of estimators for the Mount Erebus explosion recorded at IS55 and shown in Fig. 7 of the main text: (a) least squares, (b) L1 norm regression, (c) M-estimators, (d) f–k analysis and (e) Least trimmed squares (α = 0.5). Points are shaded according to the MdCCM in each window, and the values are the same across all subplots. While f–k analysis does not use cross-correlations, the same shading was applied for visual comparison. The grey dashed line corresponds to the backazimuth from IS55 to Mount Erebus (≈300○), and the dual grey lines on the right correspond to bounds of 300 and 430 m s−1. Figure S10. Waveforms from IS55 that depict an explosion at Mount Erebus volcano. Traces are filtered with a high pass filtered at 0.02 Hz and tapered. Note the lack of rarefaction pulse before the compression peak on elements 1–5 and 7–8. Furthermore, we point out that the vertical axes for elements 1 and 6 are different than the other six elements. These data are processed in the main text. Table S1. Some M-estimator functions, adapted from Grynovicki & Thomas (1983). Tuning constants were calculated to have a 95% asymptotic efficiency with respect to least squares when the errors are normally distributed. Table S2. PMCC detection and family association parameters used in the processing of each synthetic example (Sections 3.1.1–3.1.4). Values are the same across each of the 10 linearly spaced frequency bands from 0.5 to 3.0 Hz. Abbreviations are as shown in DTK-GPMCC 5.6.2. Table S3. Robust estimator processing results on an array with IS56 geometry and a 5 s timing error on element 1. Table S4. Robust estimator processing results for the IS56 array geometry and a 5 s timing error on element 2. Table S5. A 5 s timing error is simulated on element 1 for the IS49 array geometry. Table S6. A 5 s timing error is simulated on element 2 for the IS49 array geometry. Table S7. A 5 s error simulated on element 1 of an array with pentagonal geometry (Fig. S7C). Please note: Oxford University Press is not responsible for the content or functionality of any supporting materials supplied by the authors. Any queries (other than missing material) should be directed to the corresponding author for the paper. ACKNOWLEDGEMENTS The Python code for the least trimmed squares algorithm is freely available at https://github.com/uafgeotools/lts_array. The reading, writing and filtering of infrasonic data relied on tools in the ObsPy Python toolbox (Beyreuther et al. 2010). Observations of volcanic activity were made by the Alaska Volcano Observatory and are detailed at www.avo.alaska.edu. The authors would like to acknowledge Thomas B. Gabrielson, Daniel C. Bowman, Philip S. Blom, Roger Waxler and Alexandra M. Iezzi for insightful discussions related to various portions of this work. Additionally, the authors would like to thank editor Sidao Ni and two anonymous reviewers for comments that improved the quality of this paper. This publication was made possible through support provided by the Defense Threat Reduction Agency Nuclear Arms Control Technology program under contract HDTRA1-17-C-0031. Distribution Statement A: Approved for public release; distribution is unlimited. REFERENCES Assink J. , Le Pichon A., Blanc E., Kallel M., Khemiri L., 2014 . Evaluation of wind and temperature profiles from ECMWF analysis on two hemispheres using volcanic infrasound , J. Geophys. Res. Atmos. , 119 , 8659 – 8683 . 10.1002/2014JD021632 Google Scholar Crossref Search ADS WorldCat Aster R.C. , Borchers B., Thurber C.H., 2013 . Parameter Estimation and Inverse Problems , 2nd edn, Academic Press . Google Scholar Google Preview OpenURL Placeholder Text WorldCat COPAC Beaton A.E. , Tukey J.W., 1974 . The fitting of power series, meaning polynomials, illustrated on band-spectroscopic data , Technometrics , 16 ( 2 ), 147 – 185 . 10.2307/1267936 Google Scholar Crossref Search ADS WorldCat Berteussen K.A. , 1976 . The origin of slowness and azimuth anomalies at large arrays , Bull. seism. Soc. Am. , 66 ( 3 ), 719 – 741 . Google Scholar OpenURL Placeholder Text WorldCat Beyreuther M. , Barsch R., Krischer L., Megies T., Behr Y., Wassermann J., 2010 . ObsPy: a Python toolbox for seismology , Seismol. Res. Lett. , 81 ( 3 ), 530 – 533 . 10.1785/gssrl.81.3.530 Google Scholar Crossref Search ADS WorldCat Blandford R.R. , 1974 . An automatic event detector at the Tonto Forest seismic observatory , Geophysics , 39 ( 5 ), 633 – 643 . 10.1190/1.1440453 Google Scholar Crossref Search ADS WorldCat Bracewell R.N. , 2000 . The Fourier Transform and Its Applications , 3rd edn, The McGraw-Hill Companies, Inc Google Scholar Google Preview OpenURL Placeholder Text WorldCat COPAC Brachet N. , Brown D., Le Bras R., Cansi Y., Maille P., Coyne J., 2010 . Monitoring the Earth’s atmosphere with the global IMS infrasound network , in Infrasound Monitoring for Atmospheric Studies , chap. 3, Vol. 1, pp. 77 – 118 ., eds Le Pichon A., Blanc E., Hauchecorne A., Springer Science+Business Media . Google Scholar Crossref Search ADS Google Preview WorldCat COPAC Cansi Y. , 1995 . An automatic seismic event processing for detection and location: the P.M.C.C. method , Geophys. Res. Lett. , 22 ( 9 ), 1021 – 1024 . 10.1029/95GL00468 Google Scholar Crossref Search ADS WorldCat Cansi Y. , Plantet J.-L., Massinon B., 1993 . Earthquake location applied to a mini-array: k-spectrum versus correlation method , Geophys. Res. Lett. , 20 ( 17 ), 1819 – 1822 . 10.1029/93GL01397 Google Scholar Crossref Search ADS WorldCat Capon J. , 1969 . High-resolution frequency–wavenumber spectrum analysis , Proc. IEEE , 57 ( 8 ), 1408 – 1418 . 10.1109/PROC.1969.7278 Google Scholar Crossref Search ADS WorldCat Capon J. , 1970 . Probability distributions for estimators of the frequency–wavenumber spectrum , Proc. IEEE , 58 ( 10 ), 1785 – 1786 . 10.1109/PROC.1970.8014 Google Scholar Crossref Search ADS WorldCat Cappellari M. et al. ., 2013 . The ATLAS 3D project – XV. Benchmark for early-type galaxies scaling relations from 260 dynamical models: mass-to-light ratio, dark matter, Fundamental Plane and Mass Plane , Mon. Not. R. Astron. Soc. , 432 , 1709 – 1741 . 10.1093/mnras/stt562 Google Scholar Crossref Search ADS WorldCat Christie D.R. , Campus P., 2010 . The IMS infrasound network: design and experiment of infrasound stations , in Infrasound Monitoring for Atmospheric Studies , chap. 2, Vol. 1, pp. 29 – 76 ., eds Le Pichon A., Blanc E., Hauchecorne A., Springer Science+Business Media . Google Scholar Google Preview OpenURL Placeholder Text WorldCat COPAC Claerbout J.F. , Muir F., 1973 . Robust modeling with erratic data , Geophysics , 38 , 826 – 844 . 10.1190/1.1440378 Google Scholar Crossref Search ADS WorldCat Coleman D. , Holland P., Kaden N., Klema V., Peters S.C., 1980 . A system of subroutines for iteratively reweighted least squares computations , ACM Tran. Math. Softw. , 6 ( 3 ), 327 – 336 . Google Scholar Crossref Search ADS WorldCat Dennis J.E. , 1977 . Non-linear least squares and equations , in The State of the Art in Numerical Analysis , pp. 269 – 312 ., eds Duff I.S., Watson G.A., Academic Press . Google Scholar Google Preview OpenURL Placeholder Text WorldCat COPAC Denny A. , 2012 . Infrasonic survey of Mount Erebus eruptions in Antarctica, M.S. thesis , University of Alaska Fairbanks . den Ouden O.F.C. , Assink J.D., Smets P. S.M., Shani-Kadmiel S., Averbuch G., Evers L.G., 2020 . CLEAN beamforming for the enhanced detection of multiple infrasonic sources , Geophys. J. Int. , 221 ( 1 ), 305 – 317 . 10.1093/gji/ggaa010 Google Scholar Crossref Search ADS WorldCat Edwards W.N. , Green D.N., 2012 . Effect of interarray elevation differences on infrasound beamforming , Geophys. J. Int. , 190 , 335 – 346 . 10.1111/j.1365–246X.2012.05465.x Google Scholar Crossref Search ADS WorldCat Fee D. , Matoza R.S., 2013 . An overview of volcano infrasound: from Hawaiian to Plinian, local to global , J. Volc. Geotherm. Res. , 249 , 123 – 139 . 10.1016/j.jvolgeores.2012.09.002 Google Scholar Crossref Search ADS WorldCat Fee D. , Haney M.M., Matoza R.S., Van Eaton Alexa R., Cervelli P., Schneider D.J., Iezzi A.M., 2017 . Volcanic tremor and plume height hysteresis from Pavlof Volcano, Alaska , Science , 355 , 45 – 48 . 10.1126/science.aah6108 Google Scholar Crossref Search ADS PubMed WorldCat Gabrielson T. , 2017 . State-of-health assessment of infrasound elements using operational data , in Infrasound Technology Workshop 2017 Book of Abstracts . Google Scholar Google Preview OpenURL Placeholder Text WorldCat COPAC Gabrielson T.B. , 2011 . In situ calibration of atmospheric-infrasound sensors including the effects of wind-noise-reduction pipe systems , J. acoust. Soc. Am. , 130 ( 3 ), 1154 – 1163 . 10.1121/1.3613925 Google Scholar Crossref Search ADS PubMed WorldCat Gibbons S.J. , Ruigrok E., Kværna T., 2018 . Improving slowness estimate stability and visualization using limited sensor pair correlation on seismic arrays , Geophys. J. Int. , 213 , 447 – 460 . 10.1093/gji/ggx550 Google Scholar Crossref Search ADS WorldCat Gossard E.E. , Sailors D.B., 1970 . Dispersion bandwidth deduced from coherency of wave recordings from spatially separated sites , J. geophys. Res. , 75 ( 7 ), 1324 – 1329 . 10.1029/JA075i007p01324 Google Scholar Crossref Search ADS WorldCat Green D.N. , 2015 . The spatial coherence structure of infrasonic waves: analysis of data from International Monitoring System arrays , Geophys. J. Int. , 201 ( July ), 377 – 389 . 10.1093/gji/ggu495 Google Scholar Crossref Search ADS WorldCat Green D.N. , Bowers D., 2010 . Estimating the detection capability of the International Monitoring System infrasound network , J. geophys. Res. , 115 ( 18 ), 1 – 18 . 10.1029/2010JD014017 Google Scholar OpenURL Placeholder Text WorldCat Crossref Haney M.M. , Van Eaton A.R., Lyons J.J., Kramer R.L., Fee D., Iezzi A.M., 2018 . Volcano thunder from explosive eruptions at Bogoslof volcano, Alaska , Geophys. Res. Lett. , 45 , 3429 – 3435 . 10.1002/2017GL076911 Google Scholar Crossref Search ADS WorldCat Huber P.J. , 1973 . Robust regression: asymptotics conjectures and Monte Carlo , Ann. Stat. , 1 ( 5 ), 799 – 821 . Google Scholar Crossref Search ADS WorldCat Hubert M. , Debruyne M., 2010 . Minimum covariance determinant , Wiley Interdiscip. Rev. Comput. Stat. , 2 ( 1 ), 36 – 43 . 10.1002/wics.61 Google Scholar Crossref Search ADS WorldCat Johnson D.H. , Dudgeon D.E., 1993 . Array Signal Processing , Prentice Hall . Google Scholar Google Preview OpenURL Placeholder Text WorldCat COPAC Kelly E.J. , 1967 . Response of seismic arrays to wide-band signals, Tech. rep. , Massachusetts Institute of Technology Lincoln Laboratory . Koch K. , Pilger C., 2019 . Infrasound observations from the site of past underground nuclear explosions in North Korea , Geophys. J. Int. , 216 , 182 – 200 . 10.1093/gji/ggy381 Google Scholar Crossref Search ADS WorldCat Koivunen V. , 1995 . A robust nonlinear filter for image restoration , IEEE Trans. Image Process. , 4 ( 5 ), 569 – 578 . 10.1109/83.382492 Google Scholar Crossref Search ADS PubMed WorldCat Kvaerna T. , Doornbos D., 1986 . An integrated approach to slowness analysis with arrays and three-component stations , NORSAR Scientific Report 2-1985/1986, Semiannual Technical Summary , pp. 60 – 69 ., NORSAR , Kjeller, Norway . Google Scholar Le Pichon A. , Vergoz J., Blanc E., Guilbert J., Ceranna L., Evers L., Brächet N., 2009 . Assessing the performance of the International Monitoring System’s infrasound network: geographical coverage and temporal variabilities , J. geophys. Res. , 114 ( 8 ), 25 – 28 . 10.1029/2008JD010907 Google Scholar OpenURL Placeholder Text WorldCat Crossref Lyons J.J. , Iezzi A.M., Fee D., Schwaiger H.F., Wech A., Haney M.M., 2020 . Infrasound generated by the 2016–17 shallow submarine eruption of Bogoslof volcano, Alaska , Bull. Volcanol. , doi:10.1007/s00445–019–1355–0 . 10.1007/s00445–019–1355–0 Google Scholar OpenURL Placeholder Text WorldCat Crossref Maronna R.A. , Martin R.D., Yohai V.J., 2006 . Robust Statistics , John Wiley & Sons, Ltd . Google Scholar Crossref Search ADS Google Preview WorldCat COPAC Melton B.S. , Bailey L.F., 1957 . Multiple signal correlators , Geophysics , 22 ( 3 ), 565 – 588 . 10.1190/1.1438390 Google Scholar Crossref Search ADS WorldCat Modrak R.T. , Arrowsmith S.J., Anderson D.N., 2010 . A Bayesian framework for infrasound location , Geophys. J. Int. , 181 , 399 – 405 . 10.1111/j.1365–246X.2010.04499.x Google Scholar Crossref Search ADS WorldCat Montgomery D.C. , Peck, Elizabeth A., Vining G.G., 2001 . Introduction to Linear Regression Analysis , 3rd edn, John Wiley & Sons, Inc . Google Scholar Google Preview OpenURL Placeholder Text WorldCat COPAC Olson J.V. , Szuberla C.A.L., 2005 . Distribution of wave packet sizes in microbarom wave trains observed in Alaska , J. acoust. Soc. Am. , 117 ( 3 ), 1032 – 1037 . 10.1121/1.1854651 Google Scholar Crossref Search ADS WorldCat Phillips G.R. , Eyrlng E.M., 1983 . Comparison of conventional and robust regression in analysis of chemical data , Anal. Chem. , 55 ( 7 ), 1134 – 1138 . 10.1021/ac00258a035 Google Scholar Crossref Search ADS WorldCat Pison G. , Van Aelst S., Willems G., 2002 . Small sample corrections for LTS and MCD , Metrika , 55 ( 1–2 ), 111 – 123 . 10.1007/s001840200191 Google Scholar Crossref Search ADS WorldCat Rost S. , Thomas C., 2002 . Array seismology: methods and applications , Rev. Geophys. , 40 ( 3 ), 1 – 27 . 10.1029/2000RG000100 Google Scholar Crossref Search ADS WorldCat Rousseeuw P. , Hubert M., 1997 . Recent developments in PROGRESS , Comput. Stat. Data Anal. , 21 , 67 – 85 . 10.1214/lnms/1215454138 Google Scholar Crossref Search ADS WorldCat Rousseeuw P.J. , 1984 . Least median of squares regression , J. Am. Stat. Assoc. , 79 ( 388 ), 871 – 880 . 10.1080/01621459.1984.10477105 Google Scholar Crossref Search ADS WorldCat Rousseeuw P.J. , Leroy A.M., 1987 . Robust Regression and Outlier Detection , John Wiley & Sons, Inc . Google Scholar Crossref Search ADS Google Preview WorldCat COPAC Rousseeuw P.J. , Van Driessen K., 1999 . A fast algorithm for the minimum covariance determinant estimator , Technometrics , 41 ( 3 ), 212 – 223 . 10.1080/00401706.1999.10485670 Google Scholar Crossref Search ADS WorldCat Rousseeuw P.J. , Van Driessen K., 2006 . Computing LTS regression for large data sets , Data Min. Knowl. Discovery , 12 ( 1 ), 29 – 45 . 10.1007/s10618–005–0024–4 Google Scholar Crossref Search ADS WorldCat Ryan T.P. , 2018 . Modern Regression Methods , John Wiley & Sons, INC . Google Scholar Google Preview OpenURL Placeholder Text WorldCat COPAC Scales J.A. , Gersztenkorn A., Treitel S., 1988 . Fast lp solution of large, sparse, linear systems: application to seismic travel time tomography , J. Comput. Phys. , 75 , 314 – 333 . 10.1016/0021–9991(88)90115–5 Google Scholar Crossref Search ADS WorldCat Silwal V. , Tape C., 2016 . Seismic moment tensors and estimated uncertainties in southern Alaska , J. geophys. Res. , 121 ( 4 ), 2772 – 2797 . 10.1002/2015JB012588 Google Scholar Crossref Search ADS WorldCat Smets P.S.M. , Evers L.G., Näsholm S.P., Gibbons S.J., 2015 . Probabilistic infrasound propagation using realistic atmospheric perturbations , Geophys. Res. Lett. , 42 , 6510 – 6517 . 10.1002/2015GL064992.1 Google Scholar Crossref Search ADS WorldCat Stehly L. , Campillo M., Shapiro N.M., 2007 . Traveltime measurements from noise correlation: stability and detection of instrumental time-shifts , Geophys. J. Int. , 171 ( 1 ), 223 – 230 . 10.1111/j.1365–246X.2007.03492.x Google Scholar Crossref Search ADS WorldCat Szuberla C.A.L. , Olson J.V., 2004 . Uncertainties associated with parameter estimation in atmospheric infrasound arrays , J. acoust. Soc. Am. , 115 ( 1 ), 253 – 258 . 10.1121/1.1635407 Google Scholar Crossref Search ADS PubMed WorldCat Szuberla C.A.L. , Arnoult K.M., Olson J.V., Wilson D.K., 2006 . Discrimination of near-field infrasound sources based on time-difference of arrival information , J. acoust. Soc. Am. , 120 ( 3 ), EL23 – EL28 . 10.1121/1.2234517 Google Scholar Crossref Search ADS WorldCat Tiede J.J. , Pagano M., 1979 . The application of robust calibration to radioimmunoassay , Biometrics , 35 ( 3 ), 567 – 574 . 10.2307/2530247 Google Scholar Crossref Search ADS PubMed WorldCat Walker K.T. , Hedlin M., 2010 . A review of wind-noise reduction methodologies , Infrasound Monitoring for Atmospheric Studies , chap. 5, vol. 1, pp. 141 – 182 ., eds Le Pichon A., Blanc E., Hauchecorne A., Springer Science+Business Media . Google Scholar Crossref Search ADS Google Preview WorldCat COPAC Wei W.-Z. , Zhu W.-H., Yao S.-Z., 1993 . Application of robust regression in multicomponent UV spectrophotometry by direct calibration , Chemom. Intell. Lab. Syst. , 18 , 17 – 26 . 10.1016/0169–7439(93)80041–F Google Scholar Crossref Search ADS WorldCat Zaman A. , Rousseeuw P.J., Orhan M., 2001 . Econometric applications of high-breakdown robust regression techniques , Econ. Lett. , 71 , 1 – 8 . 10.1016/S0165–1765(00)00404–3 Google Scholar Crossref Search ADS WorldCat © The Author(s) 2020. Published by Oxford University Press on behalf of The Royal Astronomical Society. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. © The Author(s) 2020. Published by Oxford University Press on behalf of The Royal Astronomical Society. TI - Improved infrasound array processing with robust estimators JF - Geophysical Journal International DO - 10.1093/gji/ggaa110 DA - 2020-06-01 UR - https://www.deepdyve.com/lp/oxford-university-press/improved-infrasound-array-processing-with-robust-estimators-N2VyLryzIF SP - 2058 EP - 2074 VL - 221 IS - 3 DP - DeepDyve ER -