# Automatic microseismic event picking via unsupervised machine learning

Automatic microseismic event picking via unsupervised machine learning Abstract Effective and efficient arrival picking plays an important role in microseismic and earthquake data processing and imaging. Widely used short-term-average long-term-average ratio (STA/LTA) based arrival picking algorithms suffer from the sensitivity to moderate-to-strong random ambient noise. To make the state-of-the-art arrival picking approaches effective, microseismic data need to be first pre-processed, for example, removing sufficient amount of noise, and second analysed by arrival pickers. To conquer the noise issue in arrival picking for weak microseismic or earthquake event, I leverage the machine learning techniques to help recognizing seismic waveforms in microseismic or earthquake data. Because of the dependency of supervised machine learning algorithm on large volume of well-designed training data, I utilize an unsupervised machine learning algorithm to help cluster the time samples into two groups, that is, waveform points and non-waveform points. The fuzzy clustering algorithm has been demonstrated to be effective for such purpose. A group of synthetic, real microseismic and earthquake data sets with different levels of complexity show that the proposed method is much more robust than the state-of-the-art STA/LTA method in picking microseismic events, even in the case of moderately strong background noise. Inverse theory, Time-series analysis, Earthquake source observations INTRODUCTION Strong noise in microseismic data, especially in surface microseismic data, may impede the effective usage of the microseismic signals for subsurface characterization and monitoring, for example, hydraulic fracturing monitoring (Rodriguez et al. 2012; Sabbione & Sacchi 2015; Mousavi & Langston 2016c, 2017; Huang et al. 2017b,e). The determination of waveform arrival times is important for a variety of seismological applications including earthquake detection and seismic tomography (Bogiatzis & Ishii 2015). First arrivals can be identified on the seismograms by conventional picking, but these types of manual methods depend on external factors like the scale and quality of the seismic data, amplitude ratio, sensitivity of the picking cursor and user experience (Senkaya & Karsli 2014). An automatic arrival picking method is always preferred in the community. Hatherly (1982) designed a technique to accurately determine first arrival times by considering both the first kick and the first inflection point on each trace based on some statistic criteria. Senkaya & Karsli (2014) proposed a semi-automatic arrival picking method based on the cross-correlation technique (CCT). The CCT can reduce the dependency on user and decrease incorrect picking caused by environmental noise, displaying characteristics and scaling factors. Waveform correlation of strong events is a typical way to detect weak events in the earthquake seismology community, which is sensitive to small-amplitude events (Song et al. 2010). According to the study of many researchers, the waveform correlation based detection methods can be effective as long as the separation between the master event and the target event is less than the dominant wavelength (Gibbons & Ringdal 2006; Michelet & Toksöz 2007; Arrowsmith & Eisner 2006). Song et al. (2010) applied the correlation-based weak event detection method to microseismic events caused by hydraulic fracturing. In this method, a master event with high signal-to-noise ratio (SNR) is selected and then is used to cross-correlate with a weak event to find those events with the similar location, fault mechanism, and propagation path as the master event (Eisner et al. 2006). They improved the traditional short-term-average long-term-average ratio (STA/LTA) algorithm (Allen 1978; Vaezi & Van der Baan 2015), which is very sensitive to background noise level (Withers et al. 1998; Kalkan 2016), by transforming the time-domain data to the spectrogram domain to better identify P- and S-wave arrivals (Gibbons et al. 2006; Song et al. 2010). Gelchinsky & Shtivelman (1983) proposed a hybrid method that combines the benefits of both correlation-based methods and the statistics based methods. Sabbione & Velis (2010) proposed a mispick-correcting technique to exploit the benefits of the data present in the entire shot record, which allows us to adjust the trace-by-trace picks and to discard picks associated with bad or dead traces. Velis et al. (2015) utilized pattern recognition techniques to detect waveforms from microseismic data and used reduced-rank filtering approach to improve the SNR of the waveforms. Coppens (1985) proposed a fully automatic method to pick first arrivals that makes use of the delay-time method in order to compute static corrections at each shot position. From the automatically picked first arrivals on common-offset gathers, the delay times, weathering and subweathering velocities can be easily determined. Murat & Rudman (1992) applied back-propagation neural network to pick first arrivals (first breaks) in a background of noise. The back-propagation neural network is used to decide whether each half-cycle on the trace is the first arrival or not. The four attributes used as the features for the neural network are (1) peak amplitude of a half-cycle; (2) amplitude difference between the peak value of the half-cycle and the previous (or following) half-cycle; (3) root mean square (RMS) amplitude ratio for a data window (0.3 s) before and after the half-cycle; (4) RMS amplitude ratio for a data window (0.06 s) on adjacent traces. Zhang et al. (2003) developed an automatic P-wave arrival detection and picking algorithm based on the wavelet transform and Akaike information criteria (AIC) picker. They applied the discrete wavelet transform to seismograms based on a local-window strategy (Zu et al. 2017). In each local time-window, AIC autopicker is applied to the thresholded absolute wavelet coefficients at different scales, and the consistency of those picks in different scales is evaluated to determine whether a P-wave arrival has been detected in the given local window. Bogiatzis & Ishii (2015) proposed an algorithm for determining P- and S-wave arrival times based on continuous wavelet transform of the waveforms. The wavelet transform is well known for its superb performance in analysing nonstationary signals because its basis functions are localized in time and frequency. The wavelet coefficients are calculated using the vertical component for determining P-wave arrivals and polarization of the shear waves is utilized to cross examine the wavelet coefficients in horizontal components for determining S-wave arrivals. Because of the small magnitude of microseismic events and noisy borehole or surface recording environment, the microseismic signals may often be neglected if no proper denoising algorithms or event detection techniques are applied (Mousavi et al. 2016; Mousavi & Langston 2016a,b). The microseismic data can also be denoised using multichannel methods, which are widely used in the active-source seismic community. The state-of-the-art multichannel denoising algorithms include transform domain thresholding methods (Candès et al. 2006; Chen et al. 2014; Chen 2016; Gan et al. 2016a; Mousavi et al. 2016; Zu et al. 2016), predictive filtering methods (Liu et al. 2012; Liu & Chen 2013), singular spectrum analysis (Vautard et al. 1992; Huang et al. 2016; Chen et al. 2016a,b; Zhang et al. 2016, 2017; Huang et al. 2017c; Siahsar et al. 2017c), low-rank approximation based methods (Xue et al. 2016a; Chen et al. 2017; Huang et al. 2017a; Wang et al. 2017; Xie et al. 2017; Zhou et al. 2017), dictionary learning based methods (Elad & Aharon 2006; Chen 2017; Siahsar et al. 2017b,a), and mean/median filtering methods (Gan et al. 2016b; Xue et al. 2016b). Forghani-Arani et al. (2013) proposed a technique for suppressing noise in surface microseismic data based on the distinct characteristics that microseismic signal and noise show in the τ − p domain. Before separation in the τ − p domain, a scanning approach that is similar to semblance analysis needs to be used to test all possible double-couple orientations to determine an estimated orientation that best accounts for the polarity pattern of any microseismic events (Forghani-Arani et al. 2013). The multichannel denoising methods rely on a fairly dense spatial sampling of the data. For most microseismic monitoring projects, where the number of spatial geophones is not large enough, the multichannel denoising methods are applicable or cannot obtain acceptable results. Han & van der Baan (2015) applied ensemble empirical mode decomposition (EEMD) method combined with adaptive interval thresholding strategy to denoise microseismic data. Empirical mode decomposition (EMD) was developed by Huang et al. (1998) to adaptively decompose non-stationary and nonlinear signal into locally stationary components, which are called intrinsic mode functions (IMF). The EMD is a quasi-orthogonal decomposition in that the cross-correlation coefficients between different IMFs are close to zero (Han & van der Baan 2015). Ensemble empirical mode decomposition is an improved version of EMD in that it solves the mode-mixing problem in EMD to some extent and has a better time-frequency depicting performance than EMD (Wu & Huang 2009). Inspired by the EMD method, Gómez & Velis (2016) proposed a data-driven EMD-like algorithm to suppress noise either in microseismic data as a trace-by-trace process or in seismic data as a f − x domain algorithm. The EMD-like algorithm is much more computationally efficient than the EMD method and requires less user intervention. However, because of the ‘empirical’ nature of EMD, the EMD-related algorithms are not supported by a certain mathematical mode, and thus the performance of EMD based methods cannot be well controlled. In a similar way as the EMD based methods, Li et al. (2016a) developed a morphological decomposition (Huang et al. 2017d) based method, which is much faster than EMD, to remove low-frequency noise in microseismic data. Li et al. (2016b) further designed a morphological decomposition strategy for recovering the weak signals in microseismic data. Other latest EMD-like data decomposition algorithms include the variational mode decomposition (Liu et al. 2015, 2016b, 2017), empirical wavelet transform (Gilles 2013; Liu et al. 2016a), and regularized non-stationary autoregression (Fomel 2013; Wu et al.2017). Denoising microseismic data will inevitably cause signal damages to the useful signals, which degrades the fidelity of the processed data (Chen & Fomel 2015). On the one hand, small-amplitude signals may be neglected due to the signal damage and thus arrival picking will be negatively affected. On the other hand, the damaged waveform amplitude will greatly affect the subsequent source mechanism inversion and microseismic based waveform inversion (Maxwell et al. 2010; Song & Toksöz 2011). Considering the drawbacks and potential pitfalls caused by denoising, a robust arrival picking method without the need of pre-processing is strongly in demand for the microseismic community. In this paper, I propose a novel automatic microseismic arrival picking method based on clustering. Clustering is a type of unsupervised machine learning techniques, which allows us to classify the given data by using the data itself. Unlike the supervised machine learning techniques which heavily depend on the input training data set, the unsupervised machine learning approaches (e.g. clustering) rely only on the data itself and thus is much more flexible. More specifically, I demonstrate the use of the fuzzy clustering algorithm (Yang et al. 2006) in arrival picking problem. Results from a comprehensive analysis of a group of synthetic and real data sets with different levels of complexity demonstrate the strong antinoise ability of the proposed unsupervised machine learning algorithm in picking first arrivals from raw noisy microseismic data. The efficiency is also demonstrated to be acceptable and thus the method can be readily applicable. I organize the paper as follows: I first give a brief introduction of the concept of clustering analysis, and then I formulate the microseismic event picking problem as a clustering problem. Next, I introduce the iterative solver for solving the fuzzy clustering problem. I then use a group of examples with comprehensive analysis and discussion to demonstrate the performance of the proposed method and compare the performance with the state-of-the-art STA/LTA method. Finally, I draw some key conclusions in the end of the paper. THEORY Clustering analysis Clustering analysis is a type of unsupervised machine learning approach. The target of clustering analysis is to group the input data into several clusters just according to the inherent features of the input data set. The number of groups can be defined in advance according to the purpose of a specific problem. Simply speaking, each cluster after clustering analysis is a collection of objects which have some sort of similarities which defer them from objects in the other clusters. Clustering is driven only by the choice of input features (or attributes) and the number of desired clusters and thus is much more flexible than those supervised machine learning techniques where a large amount of training data is required. An important component of a clustering algorithm is the distance measure between data points. If the components of the data instance vectors are all in the same physical units then it is possible for the simple Euclidean distance metric to be sufficient to successfully group similar data instances. The following Minkowski Metric is a common way for measuring distance for an N dimensional data   $$d(\mathbf {x}_i,\mathbf {x}_j) =\, \parallel\! \mathbf {x}_i - \mathbf {x}_j\!\parallel _p,$$ (1)where d(xi, xj) denotes the distance between xi and xj. xi denotes the ith N dimensional data point. xi and xj are both vectors. ∥ · ∥p denotes p-norm of the input vector. The Euclidean distance is a special case where p = 2, while Manhattan metric has p = 1. However, there are no general theoretical guidelines for selecting a measure for any given application. Microseismic event picking as a clustering problem A microseismic record can be classified as waveform and non-waveform components. The first index of waveform components can be treated as the arrival of the microseismic event. The essence of the arrival picking problem is thus turned into a classification problem given a group of data points. When a group of training data is given together with predefined data features, the classification problem can be viewed as a supervised classification problem (e.g. binary classification). If one even wants to classify the microseismic record using the data itself, the problem becomes a classic clustering analysis problem. The most important factor that affects the performance of the clustering analysis is the selected feature vector. In the algorithm, I propose three features to construct the feature vector, which are mean, power, and STA/LTA. All these feature vectors are measured in the time domain. The three features are defined as follows: Mean M  $$M(i)=\frac{1}{N}\sum _{i-w}^{i+w} d(i)$$ (2) Power E  $$E(i)=\sum _{i-w}^{i+w} d^2(i)$$ (3) STA/LTA R  \begin{eqnarray} {}{\rm STA}(i)&=&\frac{1}{{\rm NSTA}} \sum _{j=i-{\rm NSTA}}^{i} d(j) \nonumber\\ {}{\rm LTA}(i)&=&\frac{1}{{\rm NLTA}} \sum _{j=i-{\rm NLTA}}^{i} d(j) \nonumber\\ R(i)&=&{\rm STA}(i)/{\rm LTA}(i) \end{eqnarray} (4) w denotes half of window length. d(i) denotes the input seismic. NSTA and NLTA denote short-term and long-term periods, respectively. Fig. 1 shows a simple example for demonstrating the extracted features. Fig. 1 contains two synthetic data in the clean and noisy cases. The noisy data contains a large amount of noise that makes the effective signals almost buried under the noise. The SNR of this noisy data is −3.64 dB. The red circles in panels (a) and (b) are the picked arrival indices using the presented algorithm. It is very clear that in both cases, the presented algorithm obtains very successful arrival picking results. Fig. 2 shows the three aforementioned features used for clustering for clean data (Fig. 2a) and noisy data (Fig. 2b). Figure 1. View largeDownload slide Simple synthetic data for demonstrating clustering features. (a) Clean data. (b) Noisy data with SNR = −3.64 dB. The red circles in (a) and (b) are the picked arrival indices. Figure 1. View largeDownload slide Simple synthetic data for demonstrating clustering features. (a) Clean data. (b) Noisy data with SNR = −3.64 dB. The red circles in (a) and (b) are the picked arrival indices. Figure 2. View largeDownload slide Predefined features of (a) clean data and (b) noisy data for clustering. Figure 2. View largeDownload slide Predefined features of (a) clean data and (b) noisy data for clustering. Fuzzy clustering Fuzzy clustering belongs to the type of overlapping clustering, and uses fuzzy sets to cluster data, so that each point may belong to two or more clusters with different degrees of membership. In this case, data will be associated with an appropriate membership value. Fuzzy c-means is a method of clustering which allows one piece of data to belong to two or more clusters (Dunn 1973; Bezdek 1981). It is based on minimization of the following objective function   $$J = \sum _{i=1}^{N} \sum _{j=1}^{C} u_{i,j}^m \parallel\! \mathbf {x}_i - \mathbf {c}_j\!\parallel ^2,\quad 1\le m\le \infty ,$$ (5)where m is any real number greater than 1, ui, j is the degree of membership of xi in the cluster j, xi is the ith of d-dimensional measured data (d is the dimension of xi), cj is the d-dimension center of the cluster, and ∥ · ∥ is any norm expressing the similarity between any measured data and the center, for example, L2 norm. C denotes the number of clusters or classes needed to be classified. N denotes the length of the input signal. Note that both xi and cj are vectors of the same dimension. The dimension of the vectors is decided by the number of features. For example, in this paper, I use mean, power, STA/LTA as the three features and thus the dimension of both xi and cj is 3. Each xi is composed of the values of all feature components at time index i. Fuzzy partitioning is carried out through an iterative optimization of the objective function shown above, with the update of membership ui, j and the cluster centres cj by:   $$u_{i,j} = \frac{1}{\displaystyle \sum _{k=1}^C \left(\frac{\parallel\! \mathbf {x}_i - \mathbf {c}_j \!\parallel }{\parallel\! \mathbf {x}_i - \mathbf {c}_k\!\parallel } \right)^{\frac{2}{m-1}}},$$ (6)and   $$\mathbf {c}_j = \frac{\displaystyle \sum _{i=1}^{N} u_{i,j}^m \mathbf {x}_i}{\displaystyle \sum _{i=1}^N u_{i,j}^m}.$$ (7) This iteration will stop when $$\max _{i,j} \lbrace |u_{i,j}^{(k+1)} - u_{i,j}^{(k)}|\rbrace <\epsilon$$, where ε is a termination criterion between 0 and 1, whereas k is the iteration step. This procedure converges to a local minimum or a saddle point of Jm. The algorithm is composed of the following steps: Initialize U = [ui, j] matrix, U0 At k-step: calculate the centres vectors Ck = [cj] with Uk using   $$\mathbf {c}_j = \frac{\displaystyle \sum _{i=1}^{N} u_{i,j}^m \cdot \mathbf {x}_i}{ \displaystyle \sum _{i=1}^N u_{i,j}^m}.$$ (8) Update Uk, Uk + 1 using   $$u_{i,j} = \frac{1}{\displaystyle \sum _{k=1}^C \left(\frac{\parallel\! \mathbf {x}_i - \mathbf {c}_j \!\parallel }{ \parallel\! \mathbf {x}_i - \mathbf {c}_k \!\parallel } \right)^{\frac{2}{m-1}}}.$$ (9) If ∥Uk + 1 − Uk∥ < ε, then STOP; otherwise return to step 2. Eq. (9) can be further derived as   \begin{eqnarray} u_{i,j} &=& \frac{1}{\displaystyle \sum _{k=1}^C \left(\frac{\parallel\! \mathbf {x}_i - \mathbf {c}_j\!\parallel }{\parallel\! \mathbf {x}_i - \mathbf {c}_k\!\parallel } \right)^{\frac{2}{m-1}}} \nonumber\\ &=& \frac{1}{\displaystyle \parallel\! \mathbf {x}_i - \mathbf {c}_j\!\parallel ^{\frac{2}{m-1}}\sum _{k=1}^C \left(\frac{1}{\parallel\! \mathbf {x}_i - \mathbf {c}_k\!\parallel } \right)^{\frac{2}{m-1}}} \nonumber\\ &=& \frac{\parallel\! \mathbf {x}_i - \mathbf {c}_j\!\parallel ^{-\frac{2}{m-1}}}{\displaystyle \sum _{k=1}^C \left(\frac{1}{\parallel\! \mathbf {x}_i - \mathbf {c}_k\!\parallel } \right)^{\frac{2}{m-1}}} \nonumber\\ &=& \frac{\parallel\! \mathbf {x}_i - \mathbf {c}_j\!\parallel ^{-\frac{2}{m-1}}}{\displaystyle \sum _{k=1}^C \parallel\! \mathbf {x}_i - \mathbf {c}_k\!\parallel ^{-\frac{2}{m-1}}} . \end{eqnarray} (10)for easier implementation. The above iteration terminates either when the maximum number of iteration (e.g. 100) is reached or when it is converged (e.g. ∥Uk + 1 − Uk∥ < ε). The obtained membership vectors ui, j is then used to detect the microseismic event. Fig. 3 shows the membership values calculated by iterative estimation for the data shown in Fig. 1. Fig. 3(a) corresponds to the clean data and Fig. 3(b) corresponds to the noisy data. It is clear from Fig. 3 that a microseismic arrival exists when the membership value jumps from one to zero (or from zero to one). Via this criterion, one can automatically detect the microseismic event. In the DISCUSSION section, I give a brief discussion on the implementation and parameter selection for the aforementioned algorithm. Figure 3. View largeDownload slide Calculated membership values of (a) clean data and (b) noisy data that define different clusters. Figure 3. View largeDownload slide Calculated membership values of (a) clean data and (b) noisy data that define different clusters. EXAMPLES To numerically evaluate the performance, I use the arrival picking error metric which is defined as follows:   $${\rm Error} = \sum _{h}^{H} |I(h) - \hat{I}(h) |,$$ (11)where Error denotes the picking error measured in samples. I(h) denotes the index corresponding to the exact first arrival for hth trace in a multichannel 2-D microseismic record and $$\hat{I}(h)$$ denotes the picked arrival (index). The exact arrival is found by applying the proposed method on clean synthetic data. I first use a multichannel synthetic data set to demonstrate the performance of the proposed method. The microseismic data set is simulated from the two-layer velocity model shown in Fig. 4. The blue inverted triangles in the first layer denote the 50 evenly spaced geophones for recording the microseismic signals. The blue dots in the second layer denote the two microseismic sources. The two sources are generated during the hydraulic fracturing process. I use the acoustic wave to simulate the recorded data, as shown in Fig. 5. Figs 5(a) and (b) show the clean microseismic data. The 50 spatial traces correspond to recorded data from 50 geophones. The red circles in Fig. 5(a) correspond to the picked first arrivals for this clean data set using the proposed clustering method. The blue circles in Fig. 5(b) correspond to the picked first arrivals using the STA/LTA method. From this clean data test, it is clear that both clustering method and the STA/LTA method work well in detecting the first arrivals when the SNR is very high. As a comparison, I then conduct an experiment for noisy data. I simulate the noisy data by adding some random noise with SNR = 0.24 dB. The arrival picking results are shown in Fig. 6. From Fig. 7(a), one can see clearly that the red circles, which correspond to the proposed method, successfully picked all the first arrivals without making any mistake. While for the result by the traditional STA/LTA method, as indicated by the blue circles in Fig. 7(b), most picked arrivals are not correct. The noisy data example demonstrates that for noisy data set, the proposed method can be robust to obtain acceptable arrival picking results while the STA/LTA method cannot perform well. In other words, the proposed method is not sensitive to noise while the STA/LTA method is very sensitive to ambient random noise. Figure 4. View largeDownload slide The geometry of the synthetic example. Figure 4. View largeDownload slide The geometry of the synthetic example. Figure 5. View largeDownload slide Comparison of arrival picking results for clean data using the presented (a) and STA/LTA (b) methods. Figure 5. View largeDownload slide Comparison of arrival picking results for clean data using the presented (a) and STA/LTA (b) methods. Figure 6. View largeDownload slide Comparison of arrival picking results for noisy data using the presented (a) and STA/LTA (b) methods. Figure 6. View largeDownload slide Comparison of arrival picking results for noisy data using the presented (a) and STA/LTA (b) methods. Figure 7. View largeDownload slide Comparison of arrival picking results for denoised data using the presented (a) and STA/LTA (b) methods. Figure 7. View largeDownload slide Comparison of arrival picking results for denoised data using the presented (a) and STA/LTA (b) methods. To compare the arrival picking performance of the two methods on raw noisy data is not exactly fair. In practice, the noisy data is usually denoised first and then passed into the arrival picker. To compare the performance of the two methods on denoised data, I apply a multichannel denoising operator to the noisy data shown in Fig. 6 to remove most of the noise. The multichannel denoising operator I use is the damped multichannel singular spectrum (DMSSA) algorithm, proposed by Huang et al. (2016). It is worth noting that since for this example, the spatial sampling is dense and the number of spatial traces is relatively high (50 in this case), one can obtain a multichannel microseismic record with very coherent events. Thus one can apply the multichannel denoising operator to successfully remove most noise without damaging the useful signals much. However, for most microseismic monitoring situations, the spatial sampling is not so optimal as in this synthetic test, and multichannel denoising methods are not applicable. Instead, a single-channel denoising operator is used to obtain a slightly worse performance. Here I focus on the comparison of arrival picking performance and neglect the denoising issue. As one can see from Fig. 7, the denoised data is much cleaner than the noisy data shown in Fig. 6. The proposed method, indicated by the red circles, successfully picks those arrivals again but the STA/LTA method still makes a lot of mistakes, as shown in Fig. 7(b) by the blue circles. Although much better than the result of the noisy data, there are still several traces where wrong arrivals are picked. I then test the sensitivity of different methods to the noise level. I create six different noisy data sets with different noise levels. I increase the noise level by increasing the variance of Gaussian white noise from σ2 = 0.1 to σ2 = 0.6, where σ2 denotes noise variance. The six noisy data sets with increasing noise level are shown in Fig. 8. The red circles in each subfigure of Fig. 8 show the picked arrivals using the proposed clustering method. One can observe that the proposed method is insensitive to the noise level. When the noise variance is smaller than 0.6, the proposed method does not make obvious mistakes. The proposed method causes three obviously inaccurate picked arrivals until the noise variance is increased to 0.6. As a comparison, Fig. 9 shows arrival picking performance of the STA/LTA method for the same six noisy data sets. It is quite clear that the STA/LTA method is very sensitive to the noise. Even the noise variance is as small as σ2 = 0.1, the STA/LTA method causes a lot of inaccurate picked arrivals. The performance becomes worse and worse when the noise variance is gradually increased. Figure 8. View largeDownload slide Arrival picking results using the proposed method for different input noise levels. (a) σ2 = 0.1. (b) σ2 = 0.2. (c) σ2 = 0.3. (d) σ2 = 0.4. (e) σ2 = 0.5. (f) σ2 = 0.6. Figure 8. View largeDownload slide Arrival picking results using the proposed method for different input noise levels. (a) σ2 = 0.1. (b) σ2 = 0.2. (c) σ2 = 0.3. (d) σ2 = 0.4. (e) σ2 = 0.5. (f) σ2 = 0.6. Figure 9. View largeDownload slide Arrival picking results using the STA/LTA method for different input noise levels. (a) σ2 = 0.1. (b) σ2 = 0.2. (c) σ2 = 0.3. (d) σ2 = 0.4. (e) σ2 = 0.5. (f) σ2 = 0.6. Figure 9. View largeDownload slide Arrival picking results using the STA/LTA method for different input noise levels. (a) σ2 = 0.1. (b) σ2 = 0.2. (c) σ2 = 0.3. (d) σ2 = 0.4. (e) σ2 = 0.5. (f) σ2 = 0.6. To numerically compare the performance, the picking error defined in eq. (11) is calculated for each case and a diagram of arrival picking error with respect to noise level (measured by variance) is plotted for each individual method and is shown in Fig. 10. In Fig. 10, the red line corresponds to the proposed method, which increases very slowly and is always less than 300 samples. The blue line corresponds to the STA/LTA method, which is much larger than the error of the proposed method. Even in the cleanest case, the picking error is very high (about 2000 samples). The green line corresponds to the SNRs of different data, which decreases from 8.21 dB when σ2 = 0.1 to −7.35 dB when σ2 = 0.6. Figure 10. View largeDownload slide Error diagram with respect to input noise level. The right green ‘y’ axis shows the corresponding SNR for different input noise level. The error is characterized by eq. (11). Figure 10. View largeDownload slide Error diagram with respect to input noise level. The right green ‘y’ axis shows the corresponding SNR for different input noise level. The error is characterized by eq. (11). For computational cost comparison, the traditional method takes 0.23 s for processing the data shown in Fig. 5(a) while the proposed algorithm takes 2.5 s. The data contains 351 samples and 50 traces. The arrival picking algorithms are applied trace by trace. The computation is done on a PC station equipped with an Intel Core i7 CPU clocked at 3.1 GHz and 16 GB of RAM. Thus, I conclude that the proposed algorithm obtains a better performance at the expense of a larger computational cost. Although the computational cost is roughly 10 times larger than the traditional algorithm, it is still acceptable in industrial applications, thus the computation should not be a serious problem. Fig. 11 shows the performance for the first field data example. This is a surface-recorded microseismic data. The data quality is relatively high and one can see the first arrivals caused by hydraulic perforation very clearly. Fig. 11(a) shows the multichannel field data and the arrival picking results using the presented clustering method. There are 40 receivers for this data set and the red circles indicate the picked arrival indices. Fig. 11(b) shows the result from the STA/LTA method. The performance of the proposed method is very close to excellent except for one picking mistake, as shown in the fourth trace in Fig. 11(a). Because the data quality of this data is relatively high, the STA/LTA method obtains accurate picking in some traces, but for most traces, the STA/LTA fails in picking the accurate arrival. Note that in this test, I apply the two methods directly to the raw microseismic data, so the performance of the two methods demonstrates their relative robustnesses to noise. It is worth mentioning that the STA/LTA method is usually implemented after an initial denoising step is done on the raw data, where the SNR becomes much better and more tolerable for the STA/LTA method. Figure 11. View largeDownload slide Comparison of arrival picking results for the real surface microseismic data using the presented (a) and STA/LTA (b) methods. Figure 11. View largeDownload slide Comparison of arrival picking results for the real surface microseismic data using the presented (a) and STA/LTA (b) methods. Figure 12. View largeDownload slide Single trace comparison for presented and STA/LTA methods. (a) The 20th trace from the first field data and the picked result using clustering method. (b) The 20th trace from the first field data and the picked result using STA/LTA method. Figure 12. View largeDownload slide Single trace comparison for presented and STA/LTA methods. (a) The 20th trace from the first field data and the picked result using clustering method. (b) The 20th trace from the first field data and the picked result using STA/LTA method. I extract the 20th trace in Fig. 11 for a single-trace comparison in Fig. 12. Figs 12(a) and (b) show the results from the two methods, where one can more clearly see the 1D waveform signals and the difference in picked time indices. The three features, namely, Mean, Power, and STA/LTA, are shown in Fig. 13 for the selected single trace. It is very clear that the mean and power features are relatively insensitive to noise since before 0.35s both mean and power are almost zero. The STA/LTA, however, is much more sensitive to ambient noise as one can observe a lot of fake peaks in the STA/LTA curve. The integrated clustering analysis over the two less noise-insensitive features (i.e. mean and power) and a more noise-sensitive feature (i.e. STA/LTA) through the fuzzy C-means framework accounts for the anti-noise superiority of the presented method to the traditional STA/LTA method. The two groups of membership values of the real single microseismic trace are shown in Fig. 14. A distinct zero-to-one jump happens around 168 ms can be observed from the top panel of Fig. 14. Figure 13. View largeDownload slide The three features exacted in the proposed method. Figure 13. View largeDownload slide The three features exacted in the proposed method. Figure 14. View largeDownload slide The two groups of membership values calculated in the clustering analysis for 20th trace from the first field data. Figure 14. View largeDownload slide The two groups of membership values calculated in the clustering analysis for 20th trace from the first field data. I then show a more complicated real microseismic event. Fig. 15 shows the original record with the horizontal components H1 and H2, and the vertical component, respectively. Twelve three-component geophones are used to record the signals from hydraulic fracturing. The microseismic record is noisy and amplitude of events is weak. So not all signals are immediately detectable. I apply the proposed clustering method and the STA/LTA method to the data set and show the result in Figs 15 and 16, respectively. This data is much noisier than the previous example. However, the proposed method is still very robust in detecting those arrivals, which appear spatially coherent, as indicated by the red circles in Fig. 15. The results from the STA/LTA method, however, are in a mess, as indicated by the blue circles in Fig. 16. Without denoising, the STA/LTA method is almost not possible to accurately detect the microseismic events while the proposed method can work properly even in severely corrupted data. Figure 15. View largeDownload slide Arrival picking results for the second real surface microseismic data with extremely strong background noise using the presented method. Figure 15. View largeDownload slide Arrival picking results for the second real surface microseismic data with extremely strong background noise using the presented method. Figure 16. View largeDownload slide Arrival picking results for the second real surface microseismic data with extremely strong background noise using the STA/LTA method. Figure 16. View largeDownload slide Arrival picking results for the second real surface microseismic data with extremely strong background noise using the STA/LTA method. The last real data example is an earthquake data stack profile. Fig. 17 shows Professor Peter Shearer’s stacks over many earthquakes at a constant epicentral distance (offset angle) (Shearer 1991a,b). Fig. 17 has been improved a lot from the raw data by stacking different earthquake data. Different seismic phases can be seen from the Fig. 17. However, one can see that there are still significant random and erratic noise existing in the stack profile. By applying the proposed method directly to the raw stack data, I obtain a very successful result, which is shown in Fig. 17. Fig. 18 shows the arrival picking performance using the STA/LTA method, where the most picking is accurate while there are still a lot of inaccurate picking. Figure 17. View largeDownload slide Arrival picking results for an earthquake data with extremely strong background noise using the presented method. Figure 17. View largeDownload slide Arrival picking results for an earthquake data with extremely strong background noise using the presented method. Figure 18. View largeDownload slide Arrival picking results for an earthquake data with extremely strong background noise using the STA/LTA method. Figure 18. View largeDownload slide Arrival picking results for an earthquake data with extremely strong background noise using the STA/LTA method. DISCUSSIONS The implementation of the proposed clustering method is quite straightforward and it is easy to control the performance of the algorithm. The four steps shown in the THEORY section are almost the whole framework of the algorithm. Once the feature vectors are fixed (e.g. mean, power, and STA/LTA), the parameters needed to be defined are just exponent parameter m for the membership matrix U (see eq. 5), which is usually fixed as m = 2, the number of clusters C, which is fixed as C = 2. In other words, the method is a fully automatic method and almost does not require any human inference. Although minimizing eq. (5) requires several iterations, which is the main computational cost of the algorithm, it usually takes less than 20 iterations to converge. According to the numerical tests shown above, the computational cost of the presented algorithm is only about 10 times larger than the traditional STA/LTA method and the efficiency in practical applications is still quite acceptable. The presented automatic microseismic event detection method based on unsupervised machine learning is more a concept than a specific type of microseismic event detection method. One the one hand, in the unsupervised machine learning framework, the clustering method is not limited to the fuzzy clustering algorithm. Other unsupervised machine learning algorithms are worth being investigated, such as Hierarchical clustering, k-Means clustering, Self-organizing maps, Hidden Markov models, etc. On the other hand, the feature vector in the unsupervised machine learning framework has the potential to be improved. In this paper, I just investigate the performance based on the three-component feature vector that is composed of mean, power, and STA/LTA. While each feature component can serve as an individual event picker, the combined usage of all these components in the feature vector through the integrated fuzzy clustering framework is more appealing. As I have shown, the integrated application of all three feature components via clustering is much more robust than the single component, for example, the STA/LTA method, in a noisy environment. From this aspect, all other individual event picking methods can be integrated into the same framework for more robust performance, such as the power spectral density method in Vaezi & Van der Baan (2015) and the Akaike information criterion method in Leonard (2000) and Zhang et al. (2003). From the comprehensive analysis and discussions of the different synthetic and real data tests, the main advantage of the proposed method is its anti-noise ability. It has been shown that when the noise level is weak to moderately strong, the performance of the proposed method is very robust. This feature makes the clustering method very appealing to practical usage, since in most cases where the noise is not extremely high, the proposed method can be applied without any pre-processing step and thus can serve as the very initial estimate of microseismic event detection and serve for the subsequent event source localization. When the noise level becomes extremely strong, as seen from the sensitivity test to noise level, the clustering method may make some mistakes. Although not perfect in all situations, when the noise level becomes extremely high, a simple and mild denoising filter can help remove a small amount of noise but without damaging the useful signals, which is followed by applying the presented event picker. The future investigation includes applying the proposed method to a much larger volume of data set and to further verify the anti-noise effect when the method is applied in the field. CONCLUSIONS Microseismic and earthquake data may contain strong distractive background noise that may heavily affect the waveform arrival picking, and could further result in an unconvincing tomographic model that is based on the arrival picking. I have introduced an effective and intelligent arrival picking algorithm that is based on clustering analysis. The three features (Mean, Power, STA/LTA) fed into the intelligent clustering algorithm make the clustering engine capable of detecting arrivals in extremely noisy environment without the need of pre-processing the data. The presented arrival picking algorithm is an unsupervised machine learning technique that can be applied to an arbitrarily large amount of microseismic (and earthquake) data. Unlike the supervised machine learning techniques, the proposed method does not require a reasonably large volume of training data and thus can be fairly flexible. The fuzzy clustering algorithm has been shown to be an effective clustering analysis method in the presented framework. The comparison between the presented algorithm with the state-of-the-art STA/LTA method shows very promising performance, especially in the low-SNR data set based on a combination of several synthetic and real data examples with different levels of complexity and noise strength. Acknowledgements I would like to thank Weilin Huang, Huijian Li, Shaohuan Zu and Dong Zhang for inspiring discussions. The research is previously supported by Texas Consortium for Computational Seismology (TCCS) and is currently supported by Distinguished Postdoctoral Fellowship at Oak Ridge National Laboratory. REFERENCES Allen R.V., 1978. Automatic earthquake recognition and timing from single traces, Bull. seism. Soc. Am. , 68( 5), 1521– 1532. Arrowsmith S.J., Eisner L., 2006. A technique for identifying microseismic multiplets and application to the Valhall field, North Sea, Geophys. J. Int. , 71( 2), V31– V40. Bezdek J.C., 1981. Pattern Recognition with Fuzzy Objective Function Algoritms , Plenum Press. Google Scholar CrossRef Search ADS   Bogiatzis P., Ishii M., 2015. Continuous wavelet decomposition algorithms for automatic detection of compressional-and shear-wave arrival times, Bull. seism. Soc. Am. , 105( 3), 1628– 1641. Google Scholar CrossRef Search ADS   Candès E.J., Demanet L., Donoho D.L., Ying L., 2006. Fast discrete curvelet transforms, Multiscale Model. Simul. , 5, 861– 899. Google Scholar CrossRef Search ADS   Chen Y., 2016. Dip-separated structural filtering using seislet thresholding and adaptive empirical mode decomposition based dip filter, Geophys. J. Int. , 206( 1), 457– 469. Google Scholar CrossRef Search ADS   Chen Y., 2017. Fast dictionary learning for noise attenuation of multidimensional seismic data, Geophys. J. Int. , 209, 21– 31. Chen Y., Fomel S., 2015. Random noise attenuation using local signal-and-noise orthogonalization, Geophysics , 80, WD1– WD9. Google Scholar CrossRef Search ADS   Chen Y., Fomel S., Hu J., 2014. Iterative deblending of simultaneous-source seismic data using seislet-domain shaping regularization, Geophysics , 79, V179– V189. Google Scholar CrossRef Search ADS   Chen Y., Zhang D., Huang W., Chen W., 2016a. An open-source matlab code package for improved rank-reduction 3D seismic data denoising and reconstruction, Comput. Geosci. , 95, 59– 66. Google Scholar CrossRef Search ADS   Chen Y., Zhang D., Jin Z., Chen X., Zu S., Huang W., Gan S., 2016b. Simultaneous denoising and reconstruction of 5D seismic data via damped rank-reduction method, Geophys. J. Int. , 206, 1695– 1717. Google Scholar CrossRef Search ADS   Chen Y., Zhou Y., Chen W., Zu S., Huang W., Zhang D., 2017. Empirical low rank decomposition for seismic noise attenuation, IEEE Trans. Geosci. Remote Sen. , 55( 8), 4696– 4711. Google Scholar CrossRef Search ADS   Coppens F., 1985. First arrival picking on common-offset trace collections for automatic estimation of static corrections, Geophys. Prospect. , 33( 8), 1212– 1231. Google Scholar CrossRef Search ADS   Dunn J.C., 1973. A fuzzy relative of the isodata process and its use in detecting compact well-separated clusters, J. Cybern. , 3, 32– 57. Google Scholar CrossRef Search ADS   Eisner L., Fischer T., Calvez J.H.L., 2006. Detection of repeated hy- draulic fracturing out-of-zone growth by microseismic monitoring, Leading Edge , 25, 548– 554. Google Scholar CrossRef Search ADS   Elad M., Aharon M., 2006. Image denoising via sparse and redundant representations over learned dictionaries, IEEE Trans. Image Process. , 15( 12), 3736– 3745. Google Scholar CrossRef Search ADS PubMed  Fomel S., 2013. Seismic data decomposition into spectral components using regularized nonstationary autoregression, Geophysics , 78, O69– O76. Google Scholar CrossRef Search ADS   Forghani-Arani F., Willis M., Haines S.S., Batzle M., Behura J., Davidson M., 2013. An effective noise-suppression technique for surface microseismic data, Geophysics , 78( 6), KS85– KS95. Google Scholar CrossRef Search ADS   Gan S., Wang S., Chen Y., Chen X., 2016a. Simultaneous-source separation using iterative seislet-frame thresholding, IEEE Geosci. Remote Sen. Lett. , 13, 197– 201. Google Scholar CrossRef Search ADS   Gan S., Wang S., Chen Y., Chen X., Xiang K., 2016b. Separation of simultaneous sources using a structural-oriented median filter in the flattened dimension, Comput. Geosci. , 86, 46– 54. Google Scholar CrossRef Search ADS   Gelchinsky B., Shtivelman V., 1983. Automatic picking of first arrivals and parameterization of traveltime curves, Geophys. Prospect. , 31, 915– 928. Google Scholar CrossRef Search ADS   Gibbons S.J., Ringdal F., 2006. The detection of low magnitude seismic events using array-based waveform correlation, Geophys. J. Int. , 165( 1), 149– 166. Google Scholar CrossRef Search ADS   Gibbons S.J., Ringdal F., Kvaerna T., 2006. Detection and characteriza- tion of seismic phases using continuous spectral estimation on incoherent and partially coherent arrays, Geophys. J. Int. , 172( 1), 405– 421. Google Scholar CrossRef Search ADS   Gilles J., 2013. Empirical wavelet transform, IEEE Trans. Signal Process. , 61, 3999– 4010. Google Scholar CrossRef Search ADS   Gómez J.L., Velis D.R., 2016. A simple method inspired by empirical mode decomposition for denoising seismic data, Geophysics , 81( 6), V403– V413. Google Scholar CrossRef Search ADS   Han J., van der Baan M., 2015. Microseismic and seismic denoising via ensemble empirical mode decomposition and adaptive thresholding, Geophysics , 80( 6), KS69– KS80. Google Scholar CrossRef Search ADS   Hatherly P., 1982. A computer method for determining seismic first arrival times, Geophysics , 47, 1431– 1436. Google Scholar CrossRef Search ADS   Huang N.E., Shen Z., Long S.R., Wu M.C., Shih H.H., Zheng Q., Yen N.-C., Tung C.C., Liu H.H., 1998. The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis, Proc. R. Soc. A , 454, 903– 995. Google Scholar CrossRef Search ADS   Huang W., Wang R., Chen Y., Li H., Gan S., 2016. Damped multichannel singular spectrum analysis for 3D random noise attenuation, Geophysics , 81( 4), V261– V270. Google Scholar CrossRef Search ADS   Huang W., Wang R., Chen X., Chen Y., 2017a. Double least squares projections method for signal estimation, IEEE Trans. Geosci. Remote Sen. , 55( 7), 4111– 4129. Google Scholar CrossRef Search ADS   Huang W., Wang R., Li H., Chen Y., 2017b. Unveiling the signals from extremely noisy microseismic data for high-resolution hydraulic fracturing monitoring, Sci. Rep. , 7, 11996, doi:10.1038/s41598-017-09711-2. Google Scholar CrossRef Search ADS   Huang W., Wang R., Yuan Y., Gan S., Chen Y., 2017c. Signal extraction using randomized-order multichannel singular spectrum analysis, Geophysics , 82( 2), V59– V74. Google Scholar CrossRef Search ADS   Huang W., Wang R., Zhang D., Zhou Y., Yang W., Chen Y., 2017d. Mathematical morphological filtering for linear noise attenuation of seismic data, Geophysics , 82( 6), V369– V384. Google Scholar CrossRef Search ADS   Huang W., Wang R., Zu S., Chen Y., 2017e. Low-frequency noise attenuation in seismic and microseismic data using mathematical morphological filtering, Geophys. J. Int. , 211, 1318– 1340. Kalkan E., 2016. An automatic p-phase arrival-time picker, Bull. seism. Soc. Am. , 106( 3), doi:10.1785/0120150111. Leonard M., 2000. Comparison of manual and automatic onset time picking, Bull. seism. Soc. Am. , 90( 6), 1384– 1390. Google Scholar CrossRef Search ADS   Li H., Wang R., Cao S., Chen Y., Huang W., 2016a. A method for low-frequency noise suppression based on mathematical morphology in microseismic monitoring, Geophysics , 81, V159– V167. Google Scholar CrossRef Search ADS   Li H., Wang R., Cao S., Chen Y., Tian N., Chen X., 2016b. Weak signal detection using multiscale morphology in microseismic monitoring, J. Appl. Geophys. , 133, 39– 49. Google Scholar CrossRef Search ADS   Liu G., Chen X., 2013. Noncausal f–x–y regularized nonstationary prediction filtering for random noise attenuation on 3D seismic data, J. Appl. Geophys. , 93, 60– 66. Google Scholar CrossRef Search ADS   Liu G., Chen X., Du J., Wu K., 2012. Random noise attenuation using f–x regularized nonstationary autoregression, Geophysics , 77, V61– V69. Google Scholar CrossRef Search ADS   Liu W., Cao S., He Y., 2015. Ground roll attenuation using variational mode decomposition, in 77th Annual International Conference and Exhibition, EAGE, Extended Abstracts . Liu W., Cao S., Chen Y., 2016a. Seismic time-frequency analysis via empirical wavelet transform, IEEE Geosci. Remote Sens. Lett. , 13, 28– 32. Google Scholar CrossRef Search ADS   Liu W., Cao S., Chen Y., 2016b. Applications of variational mode decomposition in seismic time-frequency analysis, Geophysics , 81, V365– V378. Google Scholar CrossRef Search ADS   Liu W., Cao S., Wang Z., Kong X., Chen Y., 2017. Spectral decomposition for hydrocarbon detection based on VMD and teager-kaiser energy, IEEE Geosci. Remote Sens. Lett. , 14( 4), 539– 543. Google Scholar CrossRef Search ADS   Maxwell S.C., Rutledge J., Jones R., Fehler M., 2010. Petroleum reservoir characterization using downhole microseismic monitoring, Geophysics , 75( 5), 75A129– 75A137. Google Scholar CrossRef Search ADS   Michelet S., Toksöz M.N., 2007. Fracture mapping in the Soultz-sous-forets geothermal field using microearthquake locations, J. geophys. Res. , 112, B07315, doi:10.1029/2006JB004442. Google Scholar CrossRef Search ADS   Mousavi S.M., Langston C.A., 2016a. Hybrid seismic denoising using higher-order statistics and improved wavelet block thresholding, Bull. seism. Soc. Am. , 106( 4), 1380– 1393. Google Scholar CrossRef Search ADS   Mousavi S.M., Langston C.A., 2016b. Adaptive noise estimation and suppression for improving microseismic event detection, J. Appl. Geophys. , 132, 116– 124. Google Scholar CrossRef Search ADS   Mousavi S.M., Langston C.A., 2016c. Fast and novel microseismic detection using time-frequency analysis, in 86th Annual International Meeting , SEG, Expanded Abstracts, pp. 2632– 2636. Mousavi S.M., Langston C.A., 2017. Automatic noise-removal/signal-removal based on general cross-validation thresholding in synchrosqueezed domain and its application on earthquake data, Geophysics , 82( 4), V211– V227. Google Scholar CrossRef Search ADS   Mousavi S.M., Langston C.A., Horton S.P., 2016. Automatic microseismic denoising and onset detection using the synchrosqueezed continuous wavelet transform, Geophysics , 81( 4), V341– V355. Google Scholar CrossRef Search ADS   Murat M.E., Rudman A.J., 1992. Automated first arrival picking: A neural network approach, Geophys. Prospect. , 40( 6), 587– 604. Google Scholar CrossRef Search ADS   Rodriguez I.V., Bonar D., Sacchi M., 2012. Microseismic data denoising using a 3c group sparsity constrained time-frequency transform, Geophysics , 77( 2), V21– V29. Google Scholar CrossRef Search ADS   Sabbione J.I., Sacchi M.D., 2015. Radon transform-based microseismic event detection and signal-to-noise ratio enhancement, J. Appl. Geophys. , 113, 51– 63. Google Scholar CrossRef Search ADS   Sabbione J.I., Velis D., 2010. Automatic first-breaks picking: New strategies and algorithms, Geophysics , 75( 4), V67– V76. Google Scholar CrossRef Search ADS   Senkaya M., Karsli H., 2014. A semi-automatic approach to identify first arrival time: the cross-correlation technique (CCT), Earth Sci. Res. J. , 18( 2), 107– 113. Google Scholar CrossRef Search ADS   Shearer P.M., 1991a. Imaging global body wave phases by stacking long-period seismograms, J. geophys. Res. , 96( B12), 20 535– 20 324. Google Scholar CrossRef Search ADS   Shearer P.M., 1991b. Constraints on upper mantle discontinuities from observations of long period reflected and converted phases, J. geophys. Res. , 96( B11), 18 147– 18 182. Google Scholar CrossRef Search ADS   Siahsar M.A.N., Abolghasemi V., Chen Y., 2017a. Simultaneous denoising and interpolation of 2d seismic data using data-driven non-negative dictionary learning, Signal Process. , 141, 309– 321. Google Scholar CrossRef Search ADS   Siahsar M.A.N., Gholtashi S., Kahoo A.R., Chen W., Chen Y., 2017b. Data-driven multi-task sparse dictionary learning for noise attenuation of 3D seismic data, Geophysics , 82( 6), V385– V396. Google Scholar CrossRef Search ADS   Siahsar M.A.N., Gholtashi S., Olyaei E., Chen W., Chen Y., 2017c. Simultaneous denoising and interpolation of 3D seismic data via damped data-driven optimal singular value shrinkage, IEEE Geosci. Remote Sens. Lett. , 14( 7), 1086– 1090. Google Scholar CrossRef Search ADS   Song F., Toksöz M.N., 2011. Full-waveform based complete moment tensor inversion and source parameter estimation from downhole microseismic data for hydrofracture monitoring, Geophysics , 76( 6), WC103– WC116. Google Scholar CrossRef Search ADS   Song F., Kuleli H.S., Toksöz M.N., Ay E., Zhang H., 2010. An improved method for hydrofracture-induced microseismic event detection and phase picking, Geophysics , 75( 6), A47– A52. Google Scholar CrossRef Search ADS   Vaezi Y., Van der Baan M., 2015. Comparison of the STA/LTA and power spectral density methods for microseismic event detection, Geophys. J. Int. , 203( 3), 1896– 1908. Google Scholar CrossRef Search ADS   Vautard R., Yiou P., Ghil M., 1992. Singular-spectrum analysis: A toolkit for short, noisy chaotic signals, Phys. D , 58( 1), 95– 126. Google Scholar CrossRef Search ADS   Velis D., Sabbione J.I., Sacchi M.D., 2015. Fast and automatic microseismic phase-arrival detection and denoising by pattern recognition and reduced-rank filtering, Geophysics , 80( 6), WC25– WC38. Google Scholar CrossRef Search ADS   Wang Y., Zhou H., Zu S., Mao W., Chen Y., 2017. Three-operator proximal splitting scheme for 3D seismic data reconstruction, IEEE Geosci. Remote Sens. Lett. , 14, 1830– 1834. Google Scholar CrossRef Search ADS   Withers M., Aster R., Young C., Beiriger J., Harris M., Moore S., Trujillo J., 1998. A comparison of select trigger algorithms for automated global seismic phase and event detection, Bull. seism. Soc. Am. , 88( 1), 95– 106. Wu G., Fomel S., Chen Y., 2017. Data-driven time-frequency analysis of seismic data using non-stationary prony method, Geophys. Prospect. , doi:10.1111/1365–2478.12530. Wu Z., Huang N.E., 2009. Ensemble empirical mode decomposition: A noise-assisted data analysis method, Adv. Adapt. Data Anal. , 1, 1– 41. Google Scholar CrossRef Search ADS   Xie J., Chen W., Zhang D., Zu S., Chen Y., 2017. Application of principal component analysis in weighted stacking of seismic data, IEEE Geosci. Remote Sens. Lett. , 14( 8), 1213– 1217. Google Scholar CrossRef Search ADS   Xue Y., Chang F., Zhang D., Chen Y., 2016a. Simultaneous sources separation via an iterative rank-increasing method, IEEE Geosci. Remote Sens. Lett. , 13( 12), 1915– 1919. Google Scholar CrossRef Search ADS   Xue Z., Chen Y., Fomel S., Sun J., 2016b. Seismic imaging of incomplete data and simultaneous-source data using least-squares reverse time migration with shaping regularization, Geophysics , 81, S11– S20. Google Scholar CrossRef Search ADS   Yang P., Yin X., Zhang G., 2006. Seismic data analysis based on fuzzy clustering, in 2006 8th International Conference on Signal Processing , doi:10.1109/ICOSP.2006.346109. Zhang H., Thurber C., Rowe C., 2003. Automatic p-wave arrival detection and picking with multiscale wavelet analysis for single-component recordings, Bull. seism. Soc. Am. , 93( 5), 1904– 1912. Google Scholar CrossRef Search ADS   Zhang D., Chen Y., Huang W., Gan S., 2016. Multi-step damped multichannel singular spectrum analysis for simultaneous reconstruction and denoising of 3D seismic data, J. Geophys. Eng. , 13, 704– 720. Google Scholar CrossRef Search ADS   Zhang D., Zhou Y., Chen H., Chen W., Zu S., Chen Y., 2017. Hybrid rank-sparsity constraint model for simultaneous reconstruction and denoising of 3D seismic data, Geophysics , 82( 5), V351– V367. Google Scholar CrossRef Search ADS   Zhou Y., Shi C., Chen H., Xie J., Wu G., Chen Y., 2017. Spike-like blending noise attenuation using structural low-rank decomposition, IEEE Geosci. Remote Sens. Lett. , 14( 9), 1633– 1637. Google Scholar CrossRef Search ADS   Zu S., Zhou H., Chen Y., Qu S., Zou X., Chen H., Liu R., 2016. A periodically varying code for improving deblending of simultaneous sources in marine acquisition, Geophysics , 81( 3), V213– V225. Google Scholar CrossRef Search ADS   Zu S., Zhou H., Mao W., Zhang D., Li C., Pan X., Chen Y., 2017. Iterative deblending of simultaneous-source data using a coherency-pass shaping operator, Geophys. J. Int. , 211( 1), 541– 557. Google Scholar CrossRef Search ADS   © The Author 2017. Published by Oxford University Press on behalf of The Royal Astronomical Society. http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Geophysical Journal International Oxford University Press

# Automatic microseismic event picking via unsupervised machine learning

, Volume 212 (1) – Jan 1, 2018
15 pages

Loading next page...

/lp/ou_press/automatic-microseismic-event-picking-via-unsupervised-machine-learning-jnC18K6Hbt
Publisher
Oxford University Press
Copyright
© The Author 2017. Published by Oxford University Press on behalf of The Royal Astronomical Society.
ISSN
0956-540X
eISSN
1365-246X
D.O.I.
10.1093/gji/ggx420
Publisher site
See Article on Publisher Site

### Abstract

Abstract Effective and efficient arrival picking plays an important role in microseismic and earthquake data processing and imaging. Widely used short-term-average long-term-average ratio (STA/LTA) based arrival picking algorithms suffer from the sensitivity to moderate-to-strong random ambient noise. To make the state-of-the-art arrival picking approaches effective, microseismic data need to be first pre-processed, for example, removing sufficient amount of noise, and second analysed by arrival pickers. To conquer the noise issue in arrival picking for weak microseismic or earthquake event, I leverage the machine learning techniques to help recognizing seismic waveforms in microseismic or earthquake data. Because of the dependency of supervised machine learning algorithm on large volume of well-designed training data, I utilize an unsupervised machine learning algorithm to help cluster the time samples into two groups, that is, waveform points and non-waveform points. The fuzzy clustering algorithm has been demonstrated to be effective for such purpose. A group of synthetic, real microseismic and earthquake data sets with different levels of complexity show that the proposed method is much more robust than the state-of-the-art STA/LTA method in picking microseismic events, even in the case of moderately strong background noise. Inverse theory, Time-series analysis, Earthquake source observations INTRODUCTION Strong noise in microseismic data, especially in surface microseismic data, may impede the effective usage of the microseismic signals for subsurface characterization and monitoring, for example, hydraulic fracturing monitoring (Rodriguez et al. 2012; Sabbione & Sacchi 2015; Mousavi & Langston 2016c, 2017; Huang et al. 2017b,e). The determination of waveform arrival times is important for a variety of seismological applications including earthquake detection and seismic tomography (Bogiatzis & Ishii 2015). First arrivals can be identified on the seismograms by conventional picking, but these types of manual methods depend on external factors like the scale and quality of the seismic data, amplitude ratio, sensitivity of the picking cursor and user experience (Senkaya & Karsli 2014). An automatic arrival picking method is always preferred in the community. Hatherly (1982) designed a technique to accurately determine first arrival times by considering both the first kick and the first inflection point on each trace based on some statistic criteria. Senkaya & Karsli (2014) proposed a semi-automatic arrival picking method based on the cross-correlation technique (CCT). The CCT can reduce the dependency on user and decrease incorrect picking caused by environmental noise, displaying characteristics and scaling factors. Waveform correlation of strong events is a typical way to detect weak events in the earthquake seismology community, which is sensitive to small-amplitude events (Song et al. 2010). According to the study of many researchers, the waveform correlation based detection methods can be effective as long as the separation between the master event and the target event is less than the dominant wavelength (Gibbons & Ringdal 2006; Michelet & Toksöz 2007; Arrowsmith & Eisner 2006). Song et al. (2010) applied the correlation-based weak event detection method to microseismic events caused by hydraulic fracturing. In this method, a master event with high signal-to-noise ratio (SNR) is selected and then is used to cross-correlate with a weak event to find those events with the similar location, fault mechanism, and propagation path as the master event (Eisner et al. 2006). They improved the traditional short-term-average long-term-average ratio (STA/LTA) algorithm (Allen 1978; Vaezi & Van der Baan 2015), which is very sensitive to background noise level (Withers et al. 1998; Kalkan 2016), by transforming the time-domain data to the spectrogram domain to better identify P- and S-wave arrivals (Gibbons et al. 2006; Song et al. 2010). Gelchinsky & Shtivelman (1983) proposed a hybrid method that combines the benefits of both correlation-based methods and the statistics based methods. Sabbione & Velis (2010) proposed a mispick-correcting technique to exploit the benefits of the data present in the entire shot record, which allows us to adjust the trace-by-trace picks and to discard picks associated with bad or dead traces. Velis et al. (2015) utilized pattern recognition techniques to detect waveforms from microseismic data and used reduced-rank filtering approach to improve the SNR of the waveforms. Coppens (1985) proposed a fully automatic method to pick first arrivals that makes use of the delay-time method in order to compute static corrections at each shot position. From the automatically picked first arrivals on common-offset gathers, the delay times, weathering and subweathering velocities can be easily determined. Murat & Rudman (1992) applied back-propagation neural network to pick first arrivals (first breaks) in a background of noise. The back-propagation neural network is used to decide whether each half-cycle on the trace is the first arrival or not. The four attributes used as the features for the neural network are (1) peak amplitude of a half-cycle; (2) amplitude difference between the peak value of the half-cycle and the previous (or following) half-cycle; (3) root mean square (RMS) amplitude ratio for a data window (0.3 s) before and after the half-cycle; (4) RMS amplitude ratio for a data window (0.06 s) on adjacent traces. Zhang et al. (2003) developed an automatic P-wave arrival detection and picking algorithm based on the wavelet transform and Akaike information criteria (AIC) picker. They applied the discrete wavelet transform to seismograms based on a local-window strategy (Zu et al. 2017). In each local time-window, AIC autopicker is applied to the thresholded absolute wavelet coefficients at different scales, and the consistency of those picks in different scales is evaluated to determine whether a P-wave arrival has been detected in the given local window. Bogiatzis & Ishii (2015) proposed an algorithm for determining P- and S-wave arrival times based on continuous wavelet transform of the waveforms. The wavelet transform is well known for its superb performance in analysing nonstationary signals because its basis functions are localized in time and frequency. The wavelet coefficients are calculated using the vertical component for determining P-wave arrivals and polarization of the shear waves is utilized to cross examine the wavelet coefficients in horizontal components for determining S-wave arrivals. Because of the small magnitude of microseismic events and noisy borehole or surface recording environment, the microseismic signals may often be neglected if no proper denoising algorithms or event detection techniques are applied (Mousavi et al. 2016; Mousavi & Langston 2016a,b). The microseismic data can also be denoised using multichannel methods, which are widely used in the active-source seismic community. The state-of-the-art multichannel denoising algorithms include transform domain thresholding methods (Candès et al. 2006; Chen et al. 2014; Chen 2016; Gan et al. 2016a; Mousavi et al. 2016; Zu et al. 2016), predictive filtering methods (Liu et al. 2012; Liu & Chen 2013), singular spectrum analysis (Vautard et al. 1992; Huang et al. 2016; Chen et al. 2016a,b; Zhang et al. 2016, 2017; Huang et al. 2017c; Siahsar et al. 2017c), low-rank approximation based methods (Xue et al. 2016a; Chen et al. 2017; Huang et al. 2017a; Wang et al. 2017; Xie et al. 2017; Zhou et al. 2017), dictionary learning based methods (Elad & Aharon 2006; Chen 2017; Siahsar et al. 2017b,a), and mean/median filtering methods (Gan et al. 2016b; Xue et al. 2016b). Forghani-Arani et al. (2013) proposed a technique for suppressing noise in surface microseismic data based on the distinct characteristics that microseismic signal and noise show in the τ − p domain. Before separation in the τ − p domain, a scanning approach that is similar to semblance analysis needs to be used to test all possible double-couple orientations to determine an estimated orientation that best accounts for the polarity pattern of any microseismic events (Forghani-Arani et al. 2013). The multichannel denoising methods rely on a fairly dense spatial sampling of the data. For most microseismic monitoring projects, where the number of spatial geophones is not large enough, the multichannel denoising methods are applicable or cannot obtain acceptable results. Han & van der Baan (2015) applied ensemble empirical mode decomposition (EEMD) method combined with adaptive interval thresholding strategy to denoise microseismic data. Empirical mode decomposition (EMD) was developed by Huang et al. (1998) to adaptively decompose non-stationary and nonlinear signal into locally stationary components, which are called intrinsic mode functions (IMF). The EMD is a quasi-orthogonal decomposition in that the cross-correlation coefficients between different IMFs are close to zero (Han & van der Baan 2015). Ensemble empirical mode decomposition is an improved version of EMD in that it solves the mode-mixing problem in EMD to some extent and has a better time-frequency depicting performance than EMD (Wu & Huang 2009). Inspired by the EMD method, Gómez & Velis (2016) proposed a data-driven EMD-like algorithm to suppress noise either in microseismic data as a trace-by-trace process or in seismic data as a f − x domain algorithm. The EMD-like algorithm is much more computationally efficient than the EMD method and requires less user intervention. However, because of the ‘empirical’ nature of EMD, the EMD-related algorithms are not supported by a certain mathematical mode, and thus the performance of EMD based methods cannot be well controlled. In a similar way as the EMD based methods, Li et al. (2016a) developed a morphological decomposition (Huang et al. 2017d) based method, which is much faster than EMD, to remove low-frequency noise in microseismic data. Li et al. (2016b) further designed a morphological decomposition strategy for recovering the weak signals in microseismic data. Other latest EMD-like data decomposition algorithms include the variational mode decomposition (Liu et al. 2015, 2016b, 2017), empirical wavelet transform (Gilles 2013; Liu et al. 2016a), and regularized non-stationary autoregression (Fomel 2013; Wu et al.2017). Denoising microseismic data will inevitably cause signal damages to the useful signals, which degrades the fidelity of the processed data (Chen & Fomel 2015). On the one hand, small-amplitude signals may be neglected due to the signal damage and thus arrival picking will be negatively affected. On the other hand, the damaged waveform amplitude will greatly affect the subsequent source mechanism inversion and microseismic based waveform inversion (Maxwell et al. 2010; Song & Toksöz 2011). Considering the drawbacks and potential pitfalls caused by denoising, a robust arrival picking method without the need of pre-processing is strongly in demand for the microseismic community. In this paper, I propose a novel automatic microseismic arrival picking method based on clustering. Clustering is a type of unsupervised machine learning techniques, which allows us to classify the given data by using the data itself. Unlike the supervised machine learning techniques which heavily depend on the input training data set, the unsupervised machine learning approaches (e.g. clustering) rely only on the data itself and thus is much more flexible. More specifically, I demonstrate the use of the fuzzy clustering algorithm (Yang et al. 2006) in arrival picking problem. Results from a comprehensive analysis of a group of synthetic and real data sets with different levels of complexity demonstrate the strong antinoise ability of the proposed unsupervised machine learning algorithm in picking first arrivals from raw noisy microseismic data. The efficiency is also demonstrated to be acceptable and thus the method can be readily applicable. I organize the paper as follows: I first give a brief introduction of the concept of clustering analysis, and then I formulate the microseismic event picking problem as a clustering problem. Next, I introduce the iterative solver for solving the fuzzy clustering problem. I then use a group of examples with comprehensive analysis and discussion to demonstrate the performance of the proposed method and compare the performance with the state-of-the-art STA/LTA method. Finally, I draw some key conclusions in the end of the paper. THEORY Clustering analysis Clustering analysis is a type of unsupervised machine learning approach. The target of clustering analysis is to group the input data into several clusters just according to the inherent features of the input data set. The number of groups can be defined in advance according to the purpose of a specific problem. Simply speaking, each cluster after clustering analysis is a collection of objects which have some sort of similarities which defer them from objects in the other clusters. Clustering is driven only by the choice of input features (or attributes) and the number of desired clusters and thus is much more flexible than those supervised machine learning techniques where a large amount of training data is required. An important component of a clustering algorithm is the distance measure between data points. If the components of the data instance vectors are all in the same physical units then it is possible for the simple Euclidean distance metric to be sufficient to successfully group similar data instances. The following Minkowski Metric is a common way for measuring distance for an N dimensional data   $$d(\mathbf {x}_i,\mathbf {x}_j) =\, \parallel\! \mathbf {x}_i - \mathbf {x}_j\!\parallel _p,$$ (1)where d(xi, xj) denotes the distance between xi and xj. xi denotes the ith N dimensional data point. xi and xj are both vectors. ∥ · ∥p denotes p-norm of the input vector. The Euclidean distance is a special case where p = 2, while Manhattan metric has p = 1. However, there are no general theoretical guidelines for selecting a measure for any given application. Microseismic event picking as a clustering problem A microseismic record can be classified as waveform and non-waveform components. The first index of waveform components can be treated as the arrival of the microseismic event. The essence of the arrival picking problem is thus turned into a classification problem given a group of data points. When a group of training data is given together with predefined data features, the classification problem can be viewed as a supervised classification problem (e.g. binary classification). If one even wants to classify the microseismic record using the data itself, the problem becomes a classic clustering analysis problem. The most important factor that affects the performance of the clustering analysis is the selected feature vector. In the algorithm, I propose three features to construct the feature vector, which are mean, power, and STA/LTA. All these feature vectors are measured in the time domain. The three features are defined as follows: Mean M  $$M(i)=\frac{1}{N}\sum _{i-w}^{i+w} d(i)$$ (2) Power E  $$E(i)=\sum _{i-w}^{i+w} d^2(i)$$ (3) STA/LTA R  \begin{eqnarray} {}{\rm STA}(i)&=&\frac{1}{{\rm NSTA}} \sum _{j=i-{\rm NSTA}}^{i} d(j) \nonumber\\ {}{\rm LTA}(i)&=&\frac{1}{{\rm NLTA}} \sum _{j=i-{\rm NLTA}}^{i} d(j) \nonumber\\ R(i)&=&{\rm STA}(i)/{\rm LTA}(i) \end{eqnarray} (4) w denotes half of window length. d(i) denotes the input seismic. NSTA and NLTA denote short-term and long-term periods, respectively. Fig. 1 shows a simple example for demonstrating the extracted features. Fig. 1 contains two synthetic data in the clean and noisy cases. The noisy data contains a large amount of noise that makes the effective signals almost buried under the noise. The SNR of this noisy data is −3.64 dB. The red circles in panels (a) and (b) are the picked arrival indices using the presented algorithm. It is very clear that in both cases, the presented algorithm obtains very successful arrival picking results. Fig. 2 shows the three aforementioned features used for clustering for clean data (Fig. 2a) and noisy data (Fig. 2b). Figure 1. View largeDownload slide Simple synthetic data for demonstrating clustering features. (a) Clean data. (b) Noisy data with SNR = −3.64 dB. The red circles in (a) and (b) are the picked arrival indices. Figure 1. View largeDownload slide Simple synthetic data for demonstrating clustering features. (a) Clean data. (b) Noisy data with SNR = −3.64 dB. The red circles in (a) and (b) are the picked arrival indices. Figure 2. View largeDownload slide Predefined features of (a) clean data and (b) noisy data for clustering. Figure 2. View largeDownload slide Predefined features of (a) clean data and (b) noisy data for clustering. Fuzzy clustering Fuzzy clustering belongs to the type of overlapping clustering, and uses fuzzy sets to cluster data, so that each point may belong to two or more clusters with different degrees of membership. In this case, data will be associated with an appropriate membership value. Fuzzy c-means is a method of clustering which allows one piece of data to belong to two or more clusters (Dunn 1973; Bezdek 1981). It is based on minimization of the following objective function   $$J = \sum _{i=1}^{N} \sum _{j=1}^{C} u_{i,j}^m \parallel\! \mathbf {x}_i - \mathbf {c}_j\!\parallel ^2,\quad 1\le m\le \infty ,$$ (5)where m is any real number greater than 1, ui, j is the degree of membership of xi in the cluster j, xi is the ith of d-dimensional measured data (d is the dimension of xi), cj is the d-dimension center of the cluster, and ∥ · ∥ is any norm expressing the similarity between any measured data and the center, for example, L2 norm. C denotes the number of clusters or classes needed to be classified. N denotes the length of the input signal. Note that both xi and cj are vectors of the same dimension. The dimension of the vectors is decided by the number of features. For example, in this paper, I use mean, power, STA/LTA as the three features and thus the dimension of both xi and cj is 3. Each xi is composed of the values of all feature components at time index i. Fuzzy partitioning is carried out through an iterative optimization of the objective function shown above, with the update of membership ui, j and the cluster centres cj by:   $$u_{i,j} = \frac{1}{\displaystyle \sum _{k=1}^C \left(\frac{\parallel\! \mathbf {x}_i - \mathbf {c}_j \!\parallel }{\parallel\! \mathbf {x}_i - \mathbf {c}_k\!\parallel } \right)^{\frac{2}{m-1}}},$$ (6)and   $$\mathbf {c}_j = \frac{\displaystyle \sum _{i=1}^{N} u_{i,j}^m \mathbf {x}_i}{\displaystyle \sum _{i=1}^N u_{i,j}^m}.$$ (7) This iteration will stop when $$\max _{i,j} \lbrace |u_{i,j}^{(k+1)} - u_{i,j}^{(k)}|\rbrace <\epsilon$$, where ε is a termination criterion between 0 and 1, whereas k is the iteration step. This procedure converges to a local minimum or a saddle point of Jm. The algorithm is composed of the following steps: Initialize U = [ui, j] matrix, U0 At k-step: calculate the centres vectors Ck = [cj] with Uk using   $$\mathbf {c}_j = \frac{\displaystyle \sum _{i=1}^{N} u_{i,j}^m \cdot \mathbf {x}_i}{ \displaystyle \sum _{i=1}^N u_{i,j}^m}.$$ (8) Update Uk, Uk + 1 using   $$u_{i,j} = \frac{1}{\displaystyle \sum _{k=1}^C \left(\frac{\parallel\! \mathbf {x}_i - \mathbf {c}_j \!\parallel }{ \parallel\! \mathbf {x}_i - \mathbf {c}_k \!\parallel } \right)^{\frac{2}{m-1}}}.$$ (9) If ∥Uk + 1 − Uk∥ < ε, then STOP; otherwise return to step 2. Eq. (9) can be further derived as   \begin{eqnarray} u_{i,j} &=& \frac{1}{\displaystyle \sum _{k=1}^C \left(\frac{\parallel\! \mathbf {x}_i - \mathbf {c}_j\!\parallel }{\parallel\! \mathbf {x}_i - \mathbf {c}_k\!\parallel } \right)^{\frac{2}{m-1}}} \nonumber\\ &=& \frac{1}{\displaystyle \parallel\! \mathbf {x}_i - \mathbf {c}_j\!\parallel ^{\frac{2}{m-1}}\sum _{k=1}^C \left(\frac{1}{\parallel\! \mathbf {x}_i - \mathbf {c}_k\!\parallel } \right)^{\frac{2}{m-1}}} \nonumber\\ &=& \frac{\parallel\! \mathbf {x}_i - \mathbf {c}_j\!\parallel ^{-\frac{2}{m-1}}}{\displaystyle \sum _{k=1}^C \left(\frac{1}{\parallel\! \mathbf {x}_i - \mathbf {c}_k\!\parallel } \right)^{\frac{2}{m-1}}} \nonumber\\ &=& \frac{\parallel\! \mathbf {x}_i - \mathbf {c}_j\!\parallel ^{-\frac{2}{m-1}}}{\displaystyle \sum _{k=1}^C \parallel\! \mathbf {x}_i - \mathbf {c}_k\!\parallel ^{-\frac{2}{m-1}}} . \end{eqnarray} (10)for easier implementation. The above iteration terminates either when the maximum number of iteration (e.g. 100) is reached or when it is converged (e.g. ∥Uk + 1 − Uk∥ < ε). The obtained membership vectors ui, j is then used to detect the microseismic event. Fig. 3 shows the membership values calculated by iterative estimation for the data shown in Fig. 1. Fig. 3(a) corresponds to the clean data and Fig. 3(b) corresponds to the noisy data. It is clear from Fig. 3 that a microseismic arrival exists when the membership value jumps from one to zero (or from zero to one). Via this criterion, one can automatically detect the microseismic event. In the DISCUSSION section, I give a brief discussion on the implementation and parameter selection for the aforementioned algorithm. Figure 3. View largeDownload slide Calculated membership values of (a) clean data and (b) noisy data that define different clusters. Figure 3. View largeDownload slide Calculated membership values of (a) clean data and (b) noisy data that define different clusters. EXAMPLES To numerically evaluate the performance, I use the arrival picking error metric which is defined as follows:   $${\rm Error} = \sum _{h}^{H} |I(h) - \hat{I}(h) |,$$ (11)where Error denotes the picking error measured in samples. I(h) denotes the index corresponding to the exact first arrival for hth trace in a multichannel 2-D microseismic record and $$\hat{I}(h)$$ denotes the picked arrival (index). The exact arrival is found by applying the proposed method on clean synthetic data. I first use a multichannel synthetic data set to demonstrate the performance of the proposed method. The microseismic data set is simulated from the two-layer velocity model shown in Fig. 4. The blue inverted triangles in the first layer denote the 50 evenly spaced geophones for recording the microseismic signals. The blue dots in the second layer denote the two microseismic sources. The two sources are generated during the hydraulic fracturing process. I use the acoustic wave to simulate the recorded data, as shown in Fig. 5. Figs 5(a) and (b) show the clean microseismic data. The 50 spatial traces correspond to recorded data from 50 geophones. The red circles in Fig. 5(a) correspond to the picked first arrivals for this clean data set using the proposed clustering method. The blue circles in Fig. 5(b) correspond to the picked first arrivals using the STA/LTA method. From this clean data test, it is clear that both clustering method and the STA/LTA method work well in detecting the first arrivals when the SNR is very high. As a comparison, I then conduct an experiment for noisy data. I simulate the noisy data by adding some random noise with SNR = 0.24 dB. The arrival picking results are shown in Fig. 6. From Fig. 7(a), one can see clearly that the red circles, which correspond to the proposed method, successfully picked all the first arrivals without making any mistake. While for the result by the traditional STA/LTA method, as indicated by the blue circles in Fig. 7(b), most picked arrivals are not correct. The noisy data example demonstrates that for noisy data set, the proposed method can be robust to obtain acceptable arrival picking results while the STA/LTA method cannot perform well. In other words, the proposed method is not sensitive to noise while the STA/LTA method is very sensitive to ambient random noise. Figure 4. View largeDownload slide The geometry of the synthetic example. Figure 4. View largeDownload slide The geometry of the synthetic example. Figure 5. View largeDownload slide Comparison of arrival picking results for clean data using the presented (a) and STA/LTA (b) methods. Figure 5. View largeDownload slide Comparison of arrival picking results for clean data using the presented (a) and STA/LTA (b) methods. Figure 6. View largeDownload slide Comparison of arrival picking results for noisy data using the presented (a) and STA/LTA (b) methods. Figure 6. View largeDownload slide Comparison of arrival picking results for noisy data using the presented (a) and STA/LTA (b) methods. Figure 7. View largeDownload slide Comparison of arrival picking results for denoised data using the presented (a) and STA/LTA (b) methods. Figure 7. View largeDownload slide Comparison of arrival picking results for denoised data using the presented (a) and STA/LTA (b) methods. To compare the arrival picking performance of the two methods on raw noisy data is not exactly fair. In practice, the noisy data is usually denoised first and then passed into the arrival picker. To compare the performance of the two methods on denoised data, I apply a multichannel denoising operator to the noisy data shown in Fig. 6 to remove most of the noise. The multichannel denoising operator I use is the damped multichannel singular spectrum (DMSSA) algorithm, proposed by Huang et al. (2016). It is worth noting that since for this example, the spatial sampling is dense and the number of spatial traces is relatively high (50 in this case), one can obtain a multichannel microseismic record with very coherent events. Thus one can apply the multichannel denoising operator to successfully remove most noise without damaging the useful signals much. However, for most microseismic monitoring situations, the spatial sampling is not so optimal as in this synthetic test, and multichannel denoising methods are not applicable. Instead, a single-channel denoising operator is used to obtain a slightly worse performance. Here I focus on the comparison of arrival picking performance and neglect the denoising issue. As one can see from Fig. 7, the denoised data is much cleaner than the noisy data shown in Fig. 6. The proposed method, indicated by the red circles, successfully picks those arrivals again but the STA/LTA method still makes a lot of mistakes, as shown in Fig. 7(b) by the blue circles. Although much better than the result of the noisy data, there are still several traces where wrong arrivals are picked. I then test the sensitivity of different methods to the noise level. I create six different noisy data sets with different noise levels. I increase the noise level by increasing the variance of Gaussian white noise from σ2 = 0.1 to σ2 = 0.6, where σ2 denotes noise variance. The six noisy data sets with increasing noise level are shown in Fig. 8. The red circles in each subfigure of Fig. 8 show the picked arrivals using the proposed clustering method. One can observe that the proposed method is insensitive to the noise level. When the noise variance is smaller than 0.6, the proposed method does not make obvious mistakes. The proposed method causes three obviously inaccurate picked arrivals until the noise variance is increased to 0.6. As a comparison, Fig. 9 shows arrival picking performance of the STA/LTA method for the same six noisy data sets. It is quite clear that the STA/LTA method is very sensitive to the noise. Even the noise variance is as small as σ2 = 0.1, the STA/LTA method causes a lot of inaccurate picked arrivals. The performance becomes worse and worse when the noise variance is gradually increased. Figure 8. View largeDownload slide Arrival picking results using the proposed method for different input noise levels. (a) σ2 = 0.1. (b) σ2 = 0.2. (c) σ2 = 0.3. (d) σ2 = 0.4. (e) σ2 = 0.5. (f) σ2 = 0.6. Figure 8. View largeDownload slide Arrival picking results using the proposed method for different input noise levels. (a) σ2 = 0.1. (b) σ2 = 0.2. (c) σ2 = 0.3. (d) σ2 = 0.4. (e) σ2 = 0.5. (f) σ2 = 0.6. Figure 9. View largeDownload slide Arrival picking results using the STA/LTA method for different input noise levels. (a) σ2 = 0.1. (b) σ2 = 0.2. (c) σ2 = 0.3. (d) σ2 = 0.4. (e) σ2 = 0.5. (f) σ2 = 0.6. Figure 9. View largeDownload slide Arrival picking results using the STA/LTA method for different input noise levels. (a) σ2 = 0.1. (b) σ2 = 0.2. (c) σ2 = 0.3. (d) σ2 = 0.4. (e) σ2 = 0.5. (f) σ2 = 0.6. To numerically compare the performance, the picking error defined in eq. (11) is calculated for each case and a diagram of arrival picking error with respect to noise level (measured by variance) is plotted for each individual method and is shown in Fig. 10. In Fig. 10, the red line corresponds to the proposed method, which increases very slowly and is always less than 300 samples. The blue line corresponds to the STA/LTA method, which is much larger than the error of the proposed method. Even in the cleanest case, the picking error is very high (about 2000 samples). The green line corresponds to the SNRs of different data, which decreases from 8.21 dB when σ2 = 0.1 to −7.35 dB when σ2 = 0.6. Figure 10. View largeDownload slide Error diagram with respect to input noise level. The right green ‘y’ axis shows the corresponding SNR for different input noise level. The error is characterized by eq. (11). Figure 10. View largeDownload slide Error diagram with respect to input noise level. The right green ‘y’ axis shows the corresponding SNR for different input noise level. The error is characterized by eq. (11). For computational cost comparison, the traditional method takes 0.23 s for processing the data shown in Fig. 5(a) while the proposed algorithm takes 2.5 s. The data contains 351 samples and 50 traces. The arrival picking algorithms are applied trace by trace. The computation is done on a PC station equipped with an Intel Core i7 CPU clocked at 3.1 GHz and 16 GB of RAM. Thus, I conclude that the proposed algorithm obtains a better performance at the expense of a larger computational cost. Although the computational cost is roughly 10 times larger than the traditional algorithm, it is still acceptable in industrial applications, thus the computation should not be a serious problem. Fig. 11 shows the performance for the first field data example. This is a surface-recorded microseismic data. The data quality is relatively high and one can see the first arrivals caused by hydraulic perforation very clearly. Fig. 11(a) shows the multichannel field data and the arrival picking results using the presented clustering method. There are 40 receivers for this data set and the red circles indicate the picked arrival indices. Fig. 11(b) shows the result from the STA/LTA method. The performance of the proposed method is very close to excellent except for one picking mistake, as shown in the fourth trace in Fig. 11(a). Because the data quality of this data is relatively high, the STA/LTA method obtains accurate picking in some traces, but for most traces, the STA/LTA fails in picking the accurate arrival. Note that in this test, I apply the two methods directly to the raw microseismic data, so the performance of the two methods demonstrates their relative robustnesses to noise. It is worth mentioning that the STA/LTA method is usually implemented after an initial denoising step is done on the raw data, where the SNR becomes much better and more tolerable for the STA/LTA method. Figure 11. View largeDownload slide Comparison of arrival picking results for the real surface microseismic data using the presented (a) and STA/LTA (b) methods. Figure 11. View largeDownload slide Comparison of arrival picking results for the real surface microseismic data using the presented (a) and STA/LTA (b) methods. Figure 12. View largeDownload slide Single trace comparison for presented and STA/LTA methods. (a) The 20th trace from the first field data and the picked result using clustering method. (b) The 20th trace from the first field data and the picked result using STA/LTA method. Figure 12. View largeDownload slide Single trace comparison for presented and STA/LTA methods. (a) The 20th trace from the first field data and the picked result using clustering method. (b) The 20th trace from the first field data and the picked result using STA/LTA method. I extract the 20th trace in Fig. 11 for a single-trace comparison in Fig. 12. Figs 12(a) and (b) show the results from the two methods, where one can more clearly see the 1D waveform signals and the difference in picked time indices. The three features, namely, Mean, Power, and STA/LTA, are shown in Fig. 13 for the selected single trace. It is very clear that the mean and power features are relatively insensitive to noise since before 0.35s both mean and power are almost zero. The STA/LTA, however, is much more sensitive to ambient noise as one can observe a lot of fake peaks in the STA/LTA curve. The integrated clustering analysis over the two less noise-insensitive features (i.e. mean and power) and a more noise-sensitive feature (i.e. STA/LTA) through the fuzzy C-means framework accounts for the anti-noise superiority of the presented method to the traditional STA/LTA method. The two groups of membership values of the real single microseismic trace are shown in Fig. 14. A distinct zero-to-one jump happens around 168 ms can be observed from the top panel of Fig. 14. Figure 13. View largeDownload slide The three features exacted in the proposed method. Figure 13. View largeDownload slide The three features exacted in the proposed method. Figure 14. View largeDownload slide The two groups of membership values calculated in the clustering analysis for 20th trace from the first field data. Figure 14. View largeDownload slide The two groups of membership values calculated in the clustering analysis for 20th trace from the first field data. I then show a more complicated real microseismic event. Fig. 15 shows the original record with the horizontal components H1 and H2, and the vertical component, respectively. Twelve three-component geophones are used to record the signals from hydraulic fracturing. The microseismic record is noisy and amplitude of events is weak. So not all signals are immediately detectable. I apply the proposed clustering method and the STA/LTA method to the data set and show the result in Figs 15 and 16, respectively. This data is much noisier than the previous example. However, the proposed method is still very robust in detecting those arrivals, which appear spatially coherent, as indicated by the red circles in Fig. 15. The results from the STA/LTA method, however, are in a mess, as indicated by the blue circles in Fig. 16. Without denoising, the STA/LTA method is almost not possible to accurately detect the microseismic events while the proposed method can work properly even in severely corrupted data. Figure 15. View largeDownload slide Arrival picking results for the second real surface microseismic data with extremely strong background noise using the presented method. Figure 15. View largeDownload slide Arrival picking results for the second real surface microseismic data with extremely strong background noise using the presented method. Figure 16. View largeDownload slide Arrival picking results for the second real surface microseismic data with extremely strong background noise using the STA/LTA method. Figure 16. View largeDownload slide Arrival picking results for the second real surface microseismic data with extremely strong background noise using the STA/LTA method. The last real data example is an earthquake data stack profile. Fig. 17 shows Professor Peter Shearer’s stacks over many earthquakes at a constant epicentral distance (offset angle) (Shearer 1991a,b). Fig. 17 has been improved a lot from the raw data by stacking different earthquake data. Different seismic phases can be seen from the Fig. 17. However, one can see that there are still significant random and erratic noise existing in the stack profile. By applying the proposed method directly to the raw stack data, I obtain a very successful result, which is shown in Fig. 17. Fig. 18 shows the arrival picking performance using the STA/LTA method, where the most picking is accurate while there are still a lot of inaccurate picking. Figure 17. View largeDownload slide Arrival picking results for an earthquake data with extremely strong background noise using the presented method. Figure 17. View largeDownload slide Arrival picking results for an earthquake data with extremely strong background noise using the presented method. Figure 18. View largeDownload slide Arrival picking results for an earthquake data with extremely strong background noise using the STA/LTA method. Figure 18. View largeDownload slide Arrival picking results for an earthquake data with extremely strong background noise using the STA/LTA method. DISCUSSIONS The implementation of the proposed clustering method is quite straightforward and it is easy to control the performance of the algorithm. The four steps shown in the THEORY section are almost the whole framework of the algorithm. Once the feature vectors are fixed (e.g. mean, power, and STA/LTA), the parameters needed to be defined are just exponent parameter m for the membership matrix U (see eq. 5), which is usually fixed as m = 2, the number of clusters C, which is fixed as C = 2. In other words, the method is a fully automatic method and almost does not require any human inference. Although minimizing eq. (5) requires several iterations, which is the main computational cost of the algorithm, it usually takes less than 20 iterations to converge. According to the numerical tests shown above, the computational cost of the presented algorithm is only about 10 times larger than the traditional STA/LTA method and the efficiency in practical applications is still quite acceptable. The presented automatic microseismic event detection method based on unsupervised machine learning is more a concept than a specific type of microseismic event detection method. One the one hand, in the unsupervised machine learning framework, the clustering method is not limited to the fuzzy clustering algorithm. Other unsupervised machine learning algorithms are worth being investigated, such as Hierarchical clustering, k-Means clustering, Self-organizing maps, Hidden Markov models, etc. On the other hand, the feature vector in the unsupervised machine learning framework has the potential to be improved. In this paper, I just investigate the performance based on the three-component feature vector that is composed of mean, power, and STA/LTA. While each feature component can serve as an individual event picker, the combined usage of all these components in the feature vector through the integrated fuzzy clustering framework is more appealing. As I have shown, the integrated application of all three feature components via clustering is much more robust than the single component, for example, the STA/LTA method, in a noisy environment. From this aspect, all other individual event picking methods can be integrated into the same framework for more robust performance, such as the power spectral density method in Vaezi & Van der Baan (2015) and the Akaike information criterion method in Leonard (2000) and Zhang et al. (2003). From the comprehensive analysis and discussions of the different synthetic and real data tests, the main advantage of the proposed method is its anti-noise ability. It has been shown that when the noise level is weak to moderately strong, the performance of the proposed method is very robust. This feature makes the clustering method very appealing to practical usage, since in most cases where the noise is not extremely high, the proposed method can be applied without any pre-processing step and thus can serve as the very initial estimate of microseismic event detection and serve for the subsequent event source localization. When the noise level becomes extremely strong, as seen from the sensitivity test to noise level, the clustering method may make some mistakes. Although not perfect in all situations, when the noise level becomes extremely high, a simple and mild denoising filter can help remove a small amount of noise but without damaging the useful signals, which is followed by applying the presented event picker. The future investigation includes applying the proposed method to a much larger volume of data set and to further verify the anti-noise effect when the method is applied in the field. CONCLUSIONS Microseismic and earthquake data may contain strong distractive background noise that may heavily affect the waveform arrival picking, and could further result in an unconvincing tomographic model that is based on the arrival picking. I have introduced an effective and intelligent arrival picking algorithm that is based on clustering analysis. The three features (Mean, Power, STA/LTA) fed into the intelligent clustering algorithm make the clustering engine capable of detecting arrivals in extremely noisy environment without the need of pre-processing the data. The presented arrival picking algorithm is an unsupervised machine learning technique that can be applied to an arbitrarily large amount of microseismic (and earthquake) data. Unlike the supervised machine learning techniques, the proposed method does not require a reasonably large volume of training data and thus can be fairly flexible. The fuzzy clustering algorithm has been shown to be an effective clustering analysis method in the presented framework. The comparison between the presented algorithm with the state-of-the-art STA/LTA method shows very promising performance, especially in the low-SNR data set based on a combination of several synthetic and real data examples with different levels of complexity and noise strength. Acknowledgements I would like to thank Weilin Huang, Huijian Li, Shaohuan Zu and Dong Zhang for inspiring discussions. The research is previously supported by Texas Consortium for Computational Seismology (TCCS) and is currently supported by Distinguished Postdoctoral Fellowship at Oak Ridge National Laboratory. REFERENCES Allen R.V., 1978. Automatic earthquake recognition and timing from single traces, Bull. seism. Soc. Am. , 68( 5), 1521– 1532. Arrowsmith S.J., Eisner L., 2006. A technique for identifying microseismic multiplets and application to the Valhall field, North Sea, Geophys. J. Int. , 71( 2), V31– V40. Bezdek J.C., 1981. Pattern Recognition with Fuzzy Objective Function Algoritms , Plenum Press. Google Scholar CrossRef Search ADS   Bogiatzis P., Ishii M., 2015. Continuous wavelet decomposition algorithms for automatic detection of compressional-and shear-wave arrival times, Bull. seism. Soc. Am. , 105( 3), 1628– 1641. Google Scholar CrossRef Search ADS   Candès E.J., Demanet L., Donoho D.L., Ying L., 2006. Fast discrete curvelet transforms, Multiscale Model. Simul. , 5, 861– 899. Google Scholar CrossRef Search ADS   Chen Y., 2016. Dip-separated structural filtering using seislet thresholding and adaptive empirical mode decomposition based dip filter, Geophys. J. Int. , 206( 1), 457– 469. Google Scholar CrossRef Search ADS   Chen Y., 2017. Fast dictionary learning for noise attenuation of multidimensional seismic data, Geophys. J. Int. , 209, 21– 31. Chen Y., Fomel S., 2015. Random noise attenuation using local signal-and-noise orthogonalization, Geophysics , 80, WD1– WD9. Google Scholar CrossRef Search ADS   Chen Y., Fomel S., Hu J., 2014. Iterative deblending of simultaneous-source seismic data using seislet-domain shaping regularization, Geophysics , 79, V179– V189. Google Scholar CrossRef Search ADS   Chen Y., Zhang D., Huang W., Chen W., 2016a. An open-source matlab code package for improved rank-reduction 3D seismic data denoising and reconstruction, Comput. Geosci. , 95, 59– 66. Google Scholar CrossRef Search ADS   Chen Y., Zhang D., Jin Z., Chen X., Zu S., Huang W., Gan S., 2016b. Simultaneous denoising and reconstruction of 5D seismic data via damped rank-reduction method, Geophys. J. Int. , 206, 1695– 1717. Google Scholar CrossRef Search ADS   Chen Y., Zhou Y., Chen W., Zu S., Huang W., Zhang D., 2017. Empirical low rank decomposition for seismic noise attenuation, IEEE Trans. Geosci. Remote Sen. , 55( 8), 4696– 4711. Google Scholar CrossRef Search ADS   Coppens F., 1985. First arrival picking on common-offset trace collections for automatic estimation of static corrections, Geophys. Prospect. , 33( 8), 1212– 1231. Google Scholar CrossRef Search ADS   Dunn J.C., 1973. A fuzzy relative of the isodata process and its use in detecting compact well-separated clusters, J. Cybern. , 3, 32– 57. Google Scholar CrossRef Search ADS   Eisner L., Fischer T., Calvez J.H.L., 2006. Detection of repeated hy- draulic fracturing out-of-zone growth by microseismic monitoring, Leading Edge , 25, 548– 554. Google Scholar CrossRef Search ADS   Elad M., Aharon M., 2006. Image denoising via sparse and redundant representations over learned dictionaries, IEEE Trans. Image Process. , 15( 12), 3736– 3745. Google Scholar CrossRef Search ADS PubMed  Fomel S., 2013. Seismic data decomposition into spectral components using regularized nonstationary autoregression, Geophysics , 78, O69– O76. Google Scholar CrossRef Search ADS   Forghani-Arani F., Willis M., Haines S.S., Batzle M., Behura J., Davidson M., 2013. An effective noise-suppression technique for surface microseismic data, Geophysics , 78( 6), KS85– KS95. Google Scholar CrossRef Search ADS   Gan S., Wang S., Chen Y., Chen X., 2016a. Simultaneous-source separation using iterative seislet-frame thresholding, IEEE Geosci. Remote Sen. Lett. , 13, 197– 201. Google Scholar CrossRef Search ADS   Gan S., Wang S., Chen Y., Chen X., Xiang K., 2016b. Separation of simultaneous sources using a structural-oriented median filter in the flattened dimension, Comput. Geosci. , 86, 46– 54. Google Scholar CrossRef Search ADS   Gelchinsky B., Shtivelman V., 1983. Automatic picking of first arrivals and parameterization of traveltime curves, Geophys. Prospect. , 31, 915– 928. Google Scholar CrossRef Search ADS   Gibbons S.J., Ringdal F., 2006. The detection of low magnitude seismic events using array-based waveform correlation, Geophys. J. Int. , 165( 1), 149– 166. Google Scholar CrossRef Search ADS   Gibbons S.J., Ringdal F., Kvaerna T., 2006. Detection and characteriza- tion of seismic phases using continuous spectral estimation on incoherent and partially coherent arrays, Geophys. J. Int. , 172( 1), 405– 421. Google Scholar CrossRef Search ADS   Gilles J., 2013. Empirical wavelet transform, IEEE Trans. Signal Process. , 61, 3999– 4010. Google Scholar CrossRef Search ADS   Gómez J.L., Velis D.R., 2016. A simple method inspired by empirical mode decomposition for denoising seismic data, Geophysics , 81( 6), V403– V413. Google Scholar CrossRef Search ADS   Han J., van der Baan M., 2015. Microseismic and seismic denoising via ensemble empirical mode decomposition and adaptive thresholding, Geophysics , 80( 6), KS69– KS80. Google Scholar CrossRef Search ADS   Hatherly P., 1982. A computer method for determining seismic first arrival times, Geophysics , 47, 1431– 1436. Google Scholar CrossRef Search ADS   Huang N.E., Shen Z., Long S.R., Wu M.C., Shih H.H., Zheng Q., Yen N.-C., Tung C.C., Liu H.H., 1998. The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis, Proc. R. Soc. A , 454, 903– 995. Google Scholar CrossRef Search ADS   Huang W., Wang R., Chen Y., Li H., Gan S., 2016. Damped multichannel singular spectrum analysis for 3D random noise attenuation, Geophysics , 81( 4), V261– V270. Google Scholar CrossRef Search ADS   Huang W., Wang R., Chen X., Chen Y., 2017a. Double least squares projections method for signal estimation, IEEE Trans. Geosci. Remote Sen. , 55( 7), 4111– 4129. Google Scholar CrossRef Search ADS   Huang W., Wang R., Li H., Chen Y., 2017b. Unveiling the signals from extremely noisy microseismic data for high-resolution hydraulic fracturing monitoring, Sci. Rep. , 7, 11996, doi:10.1038/s41598-017-09711-2. Google Scholar CrossRef Search ADS   Huang W., Wang R., Yuan Y., Gan S., Chen Y., 2017c. Signal extraction using randomized-order multichannel singular spectrum analysis, Geophysics , 82( 2), V59– V74. Google Scholar CrossRef Search ADS   Huang W., Wang R., Zhang D., Zhou Y., Yang W., Chen Y., 2017d. Mathematical morphological filtering for linear noise attenuation of seismic data, Geophysics , 82( 6), V369– V384. Google Scholar CrossRef Search ADS   Huang W., Wang R., Zu S., Chen Y., 2017e. Low-frequency noise attenuation in seismic and microseismic data using mathematical morphological filtering, Geophys. J. Int. , 211, 1318– 1340. Kalkan E., 2016. An automatic p-phase arrival-time picker, Bull. seism. Soc. Am. , 106( 3), doi:10.1785/0120150111. Leonard M., 2000. Comparison of manual and automatic onset time picking, Bull. seism. Soc. Am. , 90( 6), 1384– 1390. Google Scholar CrossRef Search ADS   Li H., Wang R., Cao S., Chen Y., Huang W., 2016a. A method for low-frequency noise suppression based on mathematical morphology in microseismic monitoring, Geophysics , 81, V159– V167. Google Scholar CrossRef Search ADS   Li H., Wang R., Cao S., Chen Y., Tian N., Chen X., 2016b. Weak signal detection using multiscale morphology in microseismic monitoring, J. Appl. Geophys. , 133, 39– 49. Google Scholar CrossRef Search ADS   Liu G., Chen X., 2013. Noncausal f–x–y regularized nonstationary prediction filtering for random noise attenuation on 3D seismic data, J. Appl. Geophys. , 93, 60– 66. Google Scholar CrossRef Search ADS   Liu G., Chen X., Du J., Wu K., 2012. Random noise attenuation using f–x regularized nonstationary autoregression, Geophysics , 77, V61– V69. Google Scholar CrossRef Search ADS   Liu W., Cao S., He Y., 2015. Ground roll attenuation using variational mode decomposition, in 77th Annual International Conference and Exhibition, EAGE, Extended Abstracts . Liu W., Cao S., Chen Y., 2016a. Seismic time-frequency analysis via empirical wavelet transform, IEEE Geosci. Remote Sens. Lett. , 13, 28– 32. Google Scholar CrossRef Search ADS   Liu W., Cao S., Chen Y., 2016b. Applications of variational mode decomposition in seismic time-frequency analysis, Geophysics , 81, V365– V378. Google Scholar CrossRef Search ADS   Liu W., Cao S., Wang Z., Kong X., Chen Y., 2017. Spectral decomposition for hydrocarbon detection based on VMD and teager-kaiser energy, IEEE Geosci. Remote Sens. Lett. , 14( 4), 539– 543. Google Scholar CrossRef Search ADS   Maxwell S.C., Rutledge J., Jones R., Fehler M., 2010. Petroleum reservoir characterization using downhole microseismic monitoring, Geophysics , 75( 5), 75A129– 75A137. Google Scholar CrossRef Search ADS   Michelet S., Toksöz M.N., 2007. Fracture mapping in the Soultz-sous-forets geothermal field using microearthquake locations, J. geophys. Res. , 112, B07315, doi:10.1029/2006JB004442. Google Scholar CrossRef Search ADS   Mousavi S.M., Langston C.A., 2016a. Hybrid seismic denoising using higher-order statistics and improved wavelet block thresholding, Bull. seism. Soc. Am. , 106( 4), 1380– 1393. Google Scholar CrossRef Search ADS   Mousavi S.M., Langston C.A., 2016b. Adaptive noise estimation and suppression for improving microseismic event detection, J. Appl. Geophys. , 132, 116– 124. Google Scholar CrossRef Search ADS   Mousavi S.M., Langston C.A., 2016c. Fast and novel microseismic detection using time-frequency analysis, in 86th Annual International Meeting , SEG, Expanded Abstracts, pp. 2632– 2636. Mousavi S.M., Langston C.A., 2017. Automatic noise-removal/signal-removal based on general cross-validation thresholding in synchrosqueezed domain and its application on earthquake data, Geophysics , 82( 4), V211– V227. Google Scholar CrossRef Search ADS   Mousavi S.M., Langston C.A., Horton S.P., 2016. Automatic microseismic denoising and onset detection using the synchrosqueezed continuous wavelet transform, Geophysics , 81( 4), V341– V355. Google Scholar CrossRef Search ADS   Murat M.E., Rudman A.J., 1992. Automated first arrival picking: A neural network approach, Geophys. Prospect. , 40( 6), 587– 604. Google Scholar CrossRef Search ADS   Rodriguez I.V., Bonar D., Sacchi M., 2012. Microseismic data denoising using a 3c group sparsity constrained time-frequency transform, Geophysics , 77( 2), V21– V29. Google Scholar CrossRef Search ADS   Sabbione J.I., Sacchi M.D., 2015. Radon transform-based microseismic event detection and signal-to-noise ratio enhancement, J. Appl. Geophys. , 113, 51– 63. Google Scholar CrossRef Search ADS   Sabbione J.I., Velis D., 2010. Automatic first-breaks picking: New strategies and algorithms, Geophysics , 75( 4), V67– V76. Google Scholar CrossRef Search ADS   Senkaya M., Karsli H., 2014. A semi-automatic approach to identify first arrival time: the cross-correlation technique (CCT), Earth Sci. Res. J. , 18( 2), 107– 113. Google Scholar CrossRef Search ADS   Shearer P.M., 1991a. Imaging global body wave phases by stacking long-period seismograms, J. geophys. Res. , 96( B12), 20 535– 20 324. Google Scholar CrossRef Search ADS   Shearer P.M., 1991b. Constraints on upper mantle discontinuities from observations of long period reflected and converted phases, J. geophys. Res. , 96( B11), 18 147– 18 182. Google Scholar CrossRef Search ADS   Siahsar M.A.N., Abolghasemi V., Chen Y., 2017a. Simultaneous denoising and interpolation of 2d seismic data using data-driven non-negative dictionary learning, Signal Process. , 141, 309– 321. Google Scholar CrossRef Search ADS   Siahsar M.A.N., Gholtashi S., Kahoo A.R., Chen W., Chen Y., 2017b. Data-driven multi-task sparse dictionary learning for noise attenuation of 3D seismic data, Geophysics , 82( 6), V385– V396. Google Scholar CrossRef Search ADS   Siahsar M.A.N., Gholtashi S., Olyaei E., Chen W., Chen Y., 2017c. Simultaneous denoising and interpolation of 3D seismic data via damped data-driven optimal singular value shrinkage, IEEE Geosci. Remote Sens. Lett. , 14( 7), 1086– 1090. Google Scholar CrossRef Search ADS   Song F., Toksöz M.N., 2011. Full-waveform based complete moment tensor inversion and source parameter estimation from downhole microseismic data for hydrofracture monitoring, Geophysics , 76( 6), WC103– WC116. Google Scholar CrossRef Search ADS   Song F., Kuleli H.S., Toksöz M.N., Ay E., Zhang H., 2010. An improved method for hydrofracture-induced microseismic event detection and phase picking, Geophysics , 75( 6), A47– A52. Google Scholar CrossRef Search ADS   Vaezi Y., Van der Baan M., 2015. Comparison of the STA/LTA and power spectral density methods for microseismic event detection, Geophys. J. Int. , 203( 3), 1896– 1908. Google Scholar CrossRef Search ADS   Vautard R., Yiou P., Ghil M., 1992. Singular-spectrum analysis: A toolkit for short, noisy chaotic signals, Phys. D , 58( 1), 95– 126. Google Scholar CrossRef Search ADS   Velis D., Sabbione J.I., Sacchi M.D., 2015. Fast and automatic microseismic phase-arrival detection and denoising by pattern recognition and reduced-rank filtering, Geophysics , 80( 6), WC25– WC38. Google Scholar CrossRef Search ADS   Wang Y., Zhou H., Zu S., Mao W., Chen Y., 2017. Three-operator proximal splitting scheme for 3D seismic data reconstruction, IEEE Geosci. Remote Sens. Lett. , 14, 1830– 1834. Google Scholar CrossRef Search ADS   Withers M., Aster R., Young C., Beiriger J., Harris M., Moore S., Trujillo J., 1998. A comparison of select trigger algorithms for automated global seismic phase and event detection, Bull. seism. Soc. Am. , 88( 1), 95– 106. Wu G., Fomel S., Chen Y., 2017. Data-driven time-frequency analysis of seismic data using non-stationary prony method, Geophys. Prospect. , doi:10.1111/1365–2478.12530. Wu Z., Huang N.E., 2009. Ensemble empirical mode decomposition: A noise-assisted data analysis method, Adv. Adapt. Data Anal. , 1, 1– 41. Google Scholar CrossRef Search ADS   Xie J., Chen W., Zhang D., Zu S., Chen Y., 2017. Application of principal component analysis in weighted stacking of seismic data, IEEE Geosci. Remote Sens. Lett. , 14( 8), 1213– 1217. Google Scholar CrossRef Search ADS   Xue Y., Chang F., Zhang D., Chen Y., 2016a. Simultaneous sources separation via an iterative rank-increasing method, IEEE Geosci. Remote Sens. Lett. , 13( 12), 1915– 1919. Google Scholar CrossRef Search ADS   Xue Z., Chen Y., Fomel S., Sun J., 2016b. Seismic imaging of incomplete data and simultaneous-source data using least-squares reverse time migration with shaping regularization, Geophysics , 81, S11– S20. Google Scholar CrossRef Search ADS   Yang P., Yin X., Zhang G., 2006. Seismic data analysis based on fuzzy clustering, in 2006 8th International Conference on Signal Processing , doi:10.1109/ICOSP.2006.346109. Zhang H., Thurber C., Rowe C., 2003. Automatic p-wave arrival detection and picking with multiscale wavelet analysis for single-component recordings, Bull. seism. Soc. Am. , 93( 5), 1904– 1912. Google Scholar CrossRef Search ADS   Zhang D., Chen Y., Huang W., Gan S., 2016. Multi-step damped multichannel singular spectrum analysis for simultaneous reconstruction and denoising of 3D seismic data, J. Geophys. Eng. , 13, 704– 720. Google Scholar CrossRef Search ADS   Zhang D., Zhou Y., Chen H., Chen W., Zu S., Chen Y., 2017. Hybrid rank-sparsity constraint model for simultaneous reconstruction and denoising of 3D seismic data, Geophysics , 82( 5), V351– V367. Google Scholar CrossRef Search ADS   Zhou Y., Shi C., Chen H., Xie J., Wu G., Chen Y., 2017. Spike-like blending noise attenuation using structural low-rank decomposition, IEEE Geosci. Remote Sens. Lett. , 14( 9), 1633– 1637. Google Scholar CrossRef Search ADS   Zu S., Zhou H., Chen Y., Qu S., Zou X., Chen H., Liu R., 2016. A periodically varying code for improving deblending of simultaneous sources in marine acquisition, Geophysics , 81( 3), V213– V225. Google Scholar CrossRef Search ADS   Zu S., Zhou H., Mao W., Zhang D., Li C., Pan X., Chen Y., 2017. Iterative deblending of simultaneous-source data using a coherency-pass shaping operator, Geophys. J. Int. , 211( 1), 541– 557. Google Scholar CrossRef Search ADS   © The Author 2017. Published by Oxford University Press on behalf of The Royal Astronomical Society.

### Journal

Geophysical Journal InternationalOxford University Press

Published: Jan 1, 2018

## You’re reading a free preview. Subscribe to read the entire article.

### DeepDyve is your personal research library

It’s your single place to instantly
discover and read the research
that matters to you.

Enjoy affordable access to
over 18 million articles from more than
15,000 peer-reviewed journals.

All for just $49/month ### Explore the DeepDyve Library ### Search Query the DeepDyve database, plus search all of PubMed and Google Scholar seamlessly ### Organize Save any article or search result from DeepDyve, PubMed, and Google Scholar... all in one place. ### Access Get unlimited, online access to over 18 million full-text articles from more than 15,000 scientific journals. ### Your journals are on DeepDyve Read from thousands of the leading scholarly journals from SpringerNature, Elsevier, Wiley-Blackwell, Oxford University Press and more. All the latest content is available, no embargo periods. DeepDyve ### Freelancer DeepDyve ### Pro Price FREE$49/month
\$360/year

Save searches from
Google Scholar,
PubMed

Create lists to
organize your research

Export lists, citations

Read DeepDyve articles

Abstract access only

Unlimited access to over
18 million full-text articles

Print

20 pages / month

PDF Discount

20% off