# Total variation regularization of the 3-D gravity inverse problem using a randomized generalized singular value decomposition

## SUMMARY

We present a fast algorithm for the total variation regularization of the 3-D gravity inverse problem. Through imposition of the total variation regularization, subsurface structures with sharp discontinuities are preserved better than with a conventional minimum-structure inversion. The associated problem formulation for the regularization is nonlinear but can be solved using an iteratively reweighted least-squares algorithm. For small-scale problems the regularized least-squares problem at each iteration can be solved using the generalized singular value decomposition. This is not feasible for large-scale, or even moderate-scale, problems. Instead we introduce the use of a randomized generalized singular value decomposition in order to reduce the dimensions of the problem and provide an effective and efficient solution technique. For further efficiency an alternating direction algorithm is used to implement the total variation weighting operator within the iteratively reweighted least-squares algorithm. Presented results for synthetic examples demonstrate that the novel randomized decomposition provides good accuracy for reduced computational and memory demands as compared to use of classical approaches.

**Key words:** Gravity anomalies and Earth structure; Asia; Inverse theory; Numerical approximations and analysis.

## 1 INTRODUCTION

Regularization is imposed in order to find acceptable solutions to the ill-posed and non-unique problem of gravity inversion. Most current regularization techniques minimize a global objective function that consists of a data misfit term and a stabilization term (Li & Oldenburg 1998; Portniaguine & Zhdanov 1999; Boulanger & Chouteau 2001; Vatankhah et al. 2017a).
Generally, for potential field inversion the data misfit is measured as a weighted L2-norm of the difference between the observed data and the data predicted by the reconstructed model (Pilkington 2009). Stabilization aims both to counteract the ill-posedness of the problem, so that changes in the model parameters due to small changes in the observed data are controlled (Fedi et al. 2005; Paoletti et al. 2014), and to impose realistic characteristics on the reconstructed model. Many forms of robust and reliable stabilization have been used by the geophysical community. Minimum-structure inversion, based on an L2-norm measure of the gradient of the model parameters, has been commonly used in potential field inversion (Li & Oldenburg 1996, 1998). Although the methodology has been used successfully in many research papers, it is not able to recover a non-smooth density distribution such as is expected in the presence of mineral bodies, dikes, etc. (Farquharson 2008). Sparse inversion based on a minimum volume constraint, the L0-norm of the model parameters, was introduced by Last & Kubik (1983) and has since been developed and used widely for reconstructing blocky images of the subsurface (Portniaguine & Zhdanov 1999; Boulanger & Chouteau 2001; Ajo-Franklin et al. 2007; Vatankhah et al. 2015). On the other hand, the L1 and Cauchy norms of the model parameters can be used to reconstruct sparse and compact solutions (Pilkington 2009; Vatankhah et al. 2017a). Alternatively, total variation (TV) regularization, based on an L1-norm of the gradient of the model parameters, preserves edges in the model and provides a reconstruction that is blocky and non-smooth (Bertete-Aguirre et al. 2002; Farquharson 2008). The selection of the stabilization to be used for gravity inversion therefore depends on the anticipated true representation of the subsurface model and is problem specific.
As the dimension of the inverse problem increases, efficient and practical algorithms for all of the aforementioned regularization formulations are needed. For example, the following techniques have all been used for the solution of large-scale problems: (1) exploiting the symmetry of the gravity kernel (Boulanger & Chouteau 2001); (2) data-space inversion (Pilkington 2009); and (3) projection of the problem onto a smaller subspace (Oldenburg et al. 1993; Vatankhah et al. 2017a). Here the focus is on the development of a new randomized algorithm for the 3-D linear inversion of gravity data with TV stabilization. The approach is very effective when the underlying inverse problem is of moderate scale. Regularization using TV stabilization yields an objective function which is nonlinear in the solution. We therefore use an iteratively reweighted least-squares (IRLS) formulation of the generalized Tikhonov problem to solve the nonlinear problem (Portniaguine & Zhdanov 1999). This presents no difficulty for small-scale problems; the generalized singular value decomposition (GSVD) can be used to compute the mutual decomposition of the model and stabilization matrices, from which the solution is immediate at each iteration. Furthermore, the GSVD provides a convenient form for regularization parameter-choice rules (Xiang & Zou 2013; Chung & Palmer 2015). The approach using the GSVD, however, is not practical when the dimension of the problem increases, neither in terms of computational cost nor memory demands. Randomized algorithms compute low-rank matrix approximations with reduced memory and computational costs (Halko et al. 2011; Xiang & Zou 2013; Voronin et al. 2015; Wei et al. 2016). Random sampling is used to construct a low-dimensional subspace that captures the dominant spectral properties of the original matrix; then, as developed by Wei et al. (2016), a GSVD can be applied to regularize on this dominant spectral subspace.
We demonstrate that applying this randomized GSVD (RGSVD) methodology at each step of the iterative TV regularization provides a fast and efficient algorithm for 3-D gravity inversion that inherits all the properties of TV inversion for small-scale problems. We also show how the weighting matrix at each iteration can be determined, and how an alternating direction algorithm can further reduce the memory overhead associated with calculating the GSVD of the projected subproblem at each step of the IRLS.

## 2 INVERSION METHODOLOGY

Suppose the subsurface is divided into a large number of cells of fixed size but unknown density (Li & Oldenburg 1998; Boulanger & Chouteau 2001). The unknown densities of the cells are stacked in vector $$\mathbf {m}\in \mathcal {R}^{n}$$ and the measured data on the surface are stacked in $$\mathbf {d}_{\mathrm{obs}}\in \mathcal {R}^{m}$$; the two are related via the linear relationship
\begin{eqnarray} \mathbf {d}_{\mathrm{obs}}= G\mathbf {m}, \end{eqnarray}
(1) for forward model matrix $$G \in \mathcal {R}^{m \times n}$$ (m ≪ n). The aim is to find an acceptable model for the densities that predicts the observed data at the noise level. For gravity inversion, eq. (1) is modified through inclusion of an estimated prior model, mapr, yielding (Li & Oldenburg 1998)
\begin{eqnarray} \mathbf {d}_{\mathrm{obs}}- G\mathbf {m}_{\mathrm{apr}}= G\mathbf {m}- G\mathbf {m}_{\mathrm{apr}}. \end{eqnarray}
(2) First, eq. (1) is replaced by Gy = r using r = dobs − Gmapr and y = m − mapr. Then a diagonal depth weighting matrix with entries $$(W_{\mathrm{depth}})_{jj} = 1/(z_j + \xi )^{\beta }$$ (Boulanger & Chouteau 2001) is incorporated in the model via $$\tilde{G}=GW_{\mathrm{depth}}^{-1}$$, yielding
\begin{eqnarray} \mathbf {r}= \tilde{G}\mathbf {h}, \end{eqnarray}
(3) for h = Wdepthy. In Wdepth, zj is the mean depth of cell j, β is a weighting parameter usually chosen based on the data type and dimension of the problem, and ξ is a very small positive number (Boulanger & Chouteau 2001).
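As an illustration of the depth-weighted transformation, the following sketch builds the diagonal weighting and forms $\tilde{G}=GW_{\mathrm{depth}}^{-1}$; the values of β and ξ and the toy matrix G are illustrative placeholders, not those used in the paper.

```python
import numpy as np

def depth_weighting(z, beta=0.8, xi=1e-8):
    """Diagonal entries (W_depth)_jj = 1/(z_j + xi)**beta, with z_j the mean
    depth of cell j. beta and xi here are placeholder values."""
    return 1.0 / (z + xi) ** beta

# Toy sizes: m = 3 data, n = 5 cells; G is random for illustration only.
rng = np.random.default_rng(0)
G = rng.standard_normal((3, 5))
z = np.array([25.0, 75.0, 125.0, 175.0, 225.0])   # mean cell depths (m)

w = depth_weighting(z)
G_tilde = G / w   # G W_depth^{-1}: column j of G is scaled by 1/w_j
# A solution h of r = G_tilde h maps back via y = W_depth^{-1} h and m = m_apr + y.
```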
Due to the ill-posedness of the problem, eq. (3) cannot be solved directly but must be stabilized in order to provide a practically acceptable solution. Using the L2-TV regularization methodology, h is obtained by minimizing the objective function (Wohlberg & Rodriguez 2007)
\begin{eqnarray} P^{\alpha }(\mathbf {h})=\Vert W_{\mathbf {d}}(\tilde{G}\mathbf {h}-\mathbf {r}) \Vert _2^2 + \alpha ^2 \Vert \vert \nabla \mathbf {h}\vert \Vert _1^1, \end{eqnarray}
(4) in which the diagonal weighting matrix Wd approximates the square root of the inverse covariance matrix for the independent noise in the data. Specifically, (Wd)ii = 1/ηi in which ηi is the standard deviation of the noise for the ith datum. Regularization parameter α provides a trade-off between the weighted data misfit and the stabilization. The absolute value of the gradient vector is $$\vert \nabla \mathbf {h}\vert = \sqrt{(D_{\rm {x}}\mathbf {h})^2+(D_{\rm {y}}\mathbf {h})^2+(D_{\rm {z}}\mathbf {h})^2}$$, in which Dx, Dy and Dz are the discrete derivative operators in the x-, y- and z-directions, respectively. The derivatives at the centres of the cells are approximated to low order using forward differences, with backward differencing at the boundary points (Li & Oldenburg 2000), yielding matrices Dx, Dy and Dz which are square of size n × n. Although (4) is a convex optimization problem with a unique solution, the TV term is not differentiable everywhere with respect to h, and to find a solution it is helpful to rewrite the stabilization term using a weighted L2-norm, following Wohlberg & Rodriguez (2007).
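The square derivative operators described above can be assembled as sparse matrices; a minimal sketch, assuming cells are ordered with x varying fastest (the function names are ours, not from the paper):

```python
import numpy as np
import scipy.sparse as sp

def d1(n):
    """1-D derivative: forward differences, with a backward difference at the
    last point, giving a square n x n operator."""
    D = sp.diags([-np.ones(n), np.ones(n - 1)], [0, 1], format="lil")
    D[n - 1, n - 2], D[n - 1, n - 1] = -1.0, 1.0   # backward difference at boundary
    return sp.csr_matrix(D)

def gradient_operators(nx, ny, nz):
    """Dx, Dy, Dz on an nx*ny*nz grid, each square of size n = nx*ny*nz."""
    Ix, Iy, Iz = sp.identity(nx), sp.identity(ny), sp.identity(nz)
    Dx = sp.kron(Iz, sp.kron(Iy, d1(nx)))
    Dy = sp.kron(Iz, sp.kron(d1(ny), Ix))
    Dz = sp.kron(d1(nz), sp.kron(Iy, Ix))
    return Dx, Dy, Dz
```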
Given vectors ξ, χ and ψ and a diagonal matrix WR with entries wr we have
\begin{eqnarray} \Bigg \Vert \left( \begin{array}{c@{\hskip7pt}c@{\hskip7pt}c}W_{\mathrm{R}}& 0 & 0 \\ 0 & W_{\mathrm{R}}& 0 \\ 0 & 0 & W_{\mathrm{R}}\end{array} \right) \left( \begin{array}{c }\xi \\ \chi \\ \psi \end{array} \right) \Bigg \Vert _2^2 &=& \sum _r\big(w_{\mathrm{r}}^2 \xi _{\mathrm{r}}^2 + w_{\mathrm{r}}^2 \chi _{\mathrm{r}}^2 + w_{\mathrm{r}}^2 \psi _{\mathrm{r}}^2\big)\nonumber\\ &=& \sum _r w_{\mathrm{r}}^2 \big(\sqrt{\xi _{\mathrm{r}}^2 + \chi _{\mathrm{r}}^2 + \psi _{\mathrm{r}}^2}\big)^2. \end{eqnarray}
(5) Setting $$w_{\mathrm{r}} = (\xi _{\mathrm{r}}^2 + \chi _{\mathrm{r}}^2 + \psi _{\mathrm{r}}^2)^{-1/4}$$ we have
\begin{eqnarray} \Biggr \Vert \left( \begin{array}{c@{\hskip7pt}c@{\hskip7pt}c}W_{\mathrm{R}}& 0 & 0 \\ 0 & W_{\mathrm{R}}& 0 \\ 0 & 0 & W_{\mathrm{R}}\end{array} \right) \left( \begin{array}{c }\xi \\ \chi \\ \psi \end{array} \right) \Biggr \Vert _2^2 =\sum _r \big(\sqrt{\xi _{\mathrm{r}}^2 + \chi _{\mathrm{r}}^2 + \psi _{\mathrm{r}}^2}\big). \end{eqnarray}
(6) Now with ξ = Dxh, χ = Dyh, ψ = Dzh and WR(k) defined to have entries wr(k) calculated for ∇h at iteration k − 1, given by $$w_{\mathrm{r}}^{(k)}= ((D_{\rm {x}}\mathbf {h}^{(k-1)})_r^2 + (D_{\rm {y}}\mathbf {h}^{(k-1)})_r^2 + (D_{\rm {z}}\mathbf {h}^{(k-1)})_r^2 +\epsilon ^2)^{-1/4}$$, the TV stabilizer is approximated via
\begin{eqnarray} \Vert \vert \nabla \mathbf {h}\vert \Vert _1^1= \left\Vert \sqrt{(D_{\rm {x}}\mathbf {h})^2+(D_{\rm {y}}\mathbf {h})^2+(D_{\rm {z}}\mathbf {h})^2} \right\Vert _1^1 \approx \Vert W^{(k)} D \mathbf {h}\Vert _2^2, \end{eqnarray}
(7) for derivative operator D = [Dx; Dy; Dz] and block-diagonal weighting matrix $$W^{(k)} = \mathrm{diag}(W_{\mathrm{R}}^{(k)}, W_{\mathrm{R}}^{(k)}, W_{\mathrm{R}}^{(k)})$$. Here, 0 < ε ≪ 1 is added to avoid the possibility of division by zero, and superscript k indicates that matrix WR is updated using the model parameters of the previous iteration.
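The IRLS weights $w_r^{(k)}$ above are elementwise; a minimal sketch (the ε value is illustrative):

```python
import numpy as np

def irls_weights(gx, gy, gz, eps=1e-9):
    """TV/IRLS weights w_r = ((Dx h)_r^2 + (Dy h)_r^2 + (Dz h)_r^2 + eps^2)**(-1/4),
    evaluated at the previous iterate; eps guards against division by zero.
    Replacing the exponent -1/4 by -1/2 gives the MGS stabilizer instead."""
    return (gx ** 2 + gy ** 2 + gz ** 2 + eps ** 2) ** (-0.25)
```

The diagonal matrix $W_R^{(k)}$ is then obtained by placing `irls_weights(Dx @ h, Dy @ h, Dz @ h)` on a diagonal, repeated three times along the diagonal of $W^{(k)}$.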
Hence (4) is rewritten as a general Tikhonov functional
\begin{eqnarray} P^{\alpha }(\mathbf {h})=\Vert W_{\mathbf {d}}(\tilde{G}\mathbf {h}-\mathbf {r}) \Vert _2^2 + \alpha ^2 \Vert \tilde{D} \mathbf {h}\Vert _2^2, \quad \tilde{D}=W^{(k)}D, \end{eqnarray}
(8) for which the minimum is explicitly expressible as
\begin{eqnarray} \mathbf {h}=(\tilde{\tilde{G}}^T\tilde{\tilde{G}}+\alpha ^2\tilde{D}^T\tilde{D})^{-1}\tilde{\tilde{G}}^T\tilde{\mathbf {r}}, \quad \tilde{\tilde{G}}=W_{\mathbf {d}}\tilde{G} \quad \mathrm{and} \quad \tilde{\mathbf {r}}=W_{\mathbf {d}}\mathbf {r}. \end{eqnarray}
(9) The model update is then
\begin{eqnarray} \mathbf {m}(\alpha )=\mathbf {m}_{\mathrm{apr}}+W_{\mathrm{depth}}^{-1}\mathbf {h}(\alpha ). \end{eqnarray}
(10) While the general Tikhonov functional can be replaced by a standard-form Tikhonov functional, that is, with regularization term $$\Vert \mathbf {h}\Vert _2^2$$, when $$\tilde{D}$$ is easily invertible, for example when diagonal, the form of $$\tilde{D}$$ in this case makes that transformation prohibitively costly and we must solve the general form. We show the iterative process for the solution in Algorithm 2. The iteration terminates when the solution satisfies the noise level or a predefined maximum number of iterations is reached (Boulanger & Chouteau 2001). Furthermore, the density limits [ρmin, ρmax] are imposed at each iteration: if at any iteration a density value falls outside these predefined bounds, the value is projected back to the nearest bound (Boulanger & Chouteau 2001). For small-scale problems, in which the dimensions of $$\tilde{\tilde{G}}$$, and consequently $$\tilde{D}$$, are small, the solution h(α) is found at minimal cost using the GSVD of the matrix pair $$[\tilde{\tilde{G}},\tilde{D}]$$, as shown in Appendix A (Aster et al. 2013; Vatankhah et al. 2014).
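For a small problem, the minimizer (9) and the bound projection can be written down directly; a sketch via the normal equations (the GSVD route of Appendix A replaces this at scale):

```python
import numpy as np

def tikhonov_solve(Gtt, Dt, r_tilde, alpha):
    """Minimizer of ||Gtt h - r_tilde||^2 + alpha^2 ||Dt h||^2, eq. (9),
    via the normal equations; practical only when n is small."""
    A = Gtt.T @ Gtt + alpha ** 2 * (Dt.T @ Dt)
    return np.linalg.solve(A, Gtt.T @ r_tilde)

def project_bounds(m, rho_min, rho_max):
    """Project density values back onto [rho_min, rho_max] after each update."""
    return np.clip(m, rho_min, rho_max)
```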
Furthermore, given the GSVD, the regularization parameter may be estimated cheaply using standard parameter-choice techniques (Xiang & Zou 2013; Chung & Palmer 2015). But for large-scale problems it is not practical to calculate the GSVD at each iteration, with respect both to computational cost and to memory demands. Instead, the size of the original large problem can be reduced greatly using a randomization technique, which provides the GSVD in a more feasible and efficient manner. The solution of the reduced system is still a good approximation of that of the original system (Halko et al. 2011; Xiang & Zou 2013; Voronin et al. 2015; Xiang & Zou 2015; Wei et al. 2016). Here, we use the randomized GSVD (RGSVD) algorithm developed by Wei et al. (2016) for underdetermined problems, in which the mutual decomposition of the matrix pair $$[\tilde{\tilde{G}},\tilde{D}]$$ is approximated by
\begin{eqnarray} \tilde{\tilde{G}} \approx U \Lambda Z, \quad \tilde{D} \approx V M Z. \end{eqnarray}
(11) The steps of the algorithm are given in Algorithm 1. Steps 1 to 3 are used to form matrix Q, which approximates the range of $$\tilde{\tilde{G}}^T$$. At step 4, $$\tilde{\tilde{G}}$$ is projected onto a lower-dimensional matrix B1, which provides information on the range of $$\tilde{\tilde{G}}$$. The same projection is applied to matrix $$\tilde{D}$$. In step 5, an economy-sized GSVD is computed for the matrix pair [B1, B2]. Parameter q balances the accuracy and efficiency of Algorithm 1 and determines the dimension of the subspace for the projected problem. When q is small, this methodology is very effective and leads to a fast GSVD computation. At the same time, q has to be selected large enough to capture the dominant spectral properties of the original problem, so that the solution obtained using the RGSVD is close to the solution that would be obtained using the GSVD.
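The projection steps can be sketched as follows. This is a generic randomized range finder in the spirit of steps 1–4 of Algorithm 1, not the exact implementation of Wei et al. (2016), and for clarity the projected Tikhonov solution is written as a direct solve rather than through the GSVD of [B1, B2] (the two are algebraically equivalent):

```python
import numpy as np

def rgsvd_project(Gtt, D, q, rng):
    """Capture a q-dimensional subspace approximating range(Gtt^T), then
    project both matrices of the pair onto it."""
    m, n = Gtt.shape
    Omega = rng.standard_normal((q, m))   # Gaussian test matrix
    Y = (Omega @ Gtt).T                   # samples of range(Gtt^T), n x q
    Q, _ = np.linalg.qr(Y)                # orthonormal basis Q, n x q
    B1 = Gtt @ Q                          # m x q projection of Gtt
    B2 = D @ Q                            # p x q projection of the stabilizer
    return Q, B1, B2

def projected_solution(Q, B1, B2, r_tilde, alpha):
    """h(alpha) ~= Q (B1^T B1 + alpha^2 B2^T B2)^{-1} B1^T r_tilde."""
    A = B1.T @ B1 + alpha ** 2 * (B2.T @ B2)
    return Q @ np.linalg.solve(A, B1.T @ r_tilde)
```

When the stabilizer is the identity and q equals the rank of $\tilde{\tilde{G}}$, the projected solution coincides with the full Tikhonov solution (a useful sanity check); in general it is an approximation whose quality depends on the choice of q.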
For any GSVD decomposition (11), including that obtained via Algorithm 1, h(α) of (9) is given by
\begin{eqnarray} \mathbf {h}(\alpha )= \left( Z^T \Lambda ^T U^T U \Lambda Z + \alpha ^2 Z^T M^T V^T V M Z \right)^{-1} Z^T \Lambda ^T U^T \tilde{\mathbf {r}}, \end{eqnarray}
(12) which simplifies to (Aster et al. 2013; Vatankhah et al. 2014)
\begin{eqnarray} \mathbf {h}(\alpha )= \sum _{i=1}^{q} \frac{\gamma ^2_i}{\gamma ^2_i+\alpha ^2} \frac{\mathbf {u}^T_{i}\tilde{\mathbf {r}}}{\lambda _i} (Z^{\dagger })_{i}. \end{eqnarray}
(13) Here γi is the ith generalized singular value, see Appendix A, and Z† is the Moore–Penrose inverse of Z (Wei et al. 2016). Incorporating this single step of the RGSVD within the TV algorithm yields the iteratively reweighted TVRGSVD algorithm given in Algorithm 2. Note that steps 1–3 and the calculation of B1 in step 4 of Algorithm 1 are the same for all TV iterations and are thus outside the loop in Algorithm 2. At each iteration k, the matrices W, $$\tilde{D}$$ and B2 are updated, and the GSVD is determined for the matrix pair $$[B_1, B_2^{(k)}]$$. We note that it is immediate to use the minimum gradient support (MGS) stabilizer introduced by Portniaguine & Zhdanov (1999) in Algorithm 2, by replacing the exponent −1/4 in WR with −1/2 and keeping all other parameters fixed. The dominant costs of finding solution (13) at each step of Algorithm 2 depend on the specifics of the algorithm used to estimate the GSVD. Using the RGSVD, the cost is determined by the use of Algorithm 1 and by the cost of finding Z† needed in (13). These costs are summarized in terms of estimates of the number of floating point operations (flops) in Table 1. For q ≪ m ≪ n, and p = 3n, the dominant cost is approximately $$6qn^2 + 34q^2n + 4qmn$$ and arises from steps 2, 4 and 5 and the calculation of Z†. This compares favourably with the cost, approximately $$(208/3)n^3$$, of using the full GSVD, see Appendix B.
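The dominant-cost expressions are easy to compare numerically; a sketch using the estimates quoted above (with p = 3n):

```python
def rgsvd_flops(q, m, n):
    """Approximate dominant cost of the randomized route: 6qn^2 + 34q^2n + 4qmn."""
    return 6 * q * n ** 2 + 34 * q ** 2 * n + 4 * q * m * n

def gsvd_flops(n):
    """Approximate cost of the full GSVD, (208/3) n^3."""
    return (208 / 3) * n ** 3

# Moderate-scale example from the text: (m, n) = (900, 9000) with q = 500.
speedup = gsvd_flops(9000) / rgsvd_flops(500, 900, 9000)
```

For these values the randomized route is roughly two orders of magnitude cheaper, consistent with the q/(12n) ≪ 1 criterion discussed next.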
The potential impact of using the RGSVD rather than the GSVD for the solution of the TV problem can be seen by using the flop estimates to compare RGSVD and GSVD for examples of m, n and choices of q for the RGSVD. The RGSVD offers a significant reduction in the computational cost, as illustrated in Figs 1(a) and (b). In these plots we give the estimates for using $$\tilde{D}$$ with p = 3n or p = n, and for the major terms $$6qn^2$$ and $$(208/3)n^3$$, in each case. To compare a moderate- and a large-scale problem we illustrate for the pairs (m, n) = (900, 9000) and (6000, 60 000), respectively. These results show that the major terms are sufficient to determine the relative impact of using RGSVD in place of GSVD; RGSVD will be more efficient for q/(12n) ≪ 1. We will show for simulated data that this comes without loss of accuracy.

Figure 1. Estimates of the total number of flops using the RGSVD and GSVD algorithms for two choices of (m, n) with n ≫ m.

Table 1. Computational cost of each step of Algorithm 1, based on approximate flop counts.

| Step | 2 | 3 | 4 | 5 | 6 | Z† |
| --- | --- | --- | --- | --- | --- | --- |
| Flops | $2lmn$ | $2l^2(n - l/3)$ | $2qmn$ and $6qn^2$ | $8q^2(p + m) + (118/3)q^3$ | $2q^2n$ | $6nq^2 + 20q^3$ |

### 2.0.1 An alternating direction algorithm

Step 9 of Algorithm 2 requires the economy GSVD decomposition in which matrix $$B_2^{(k)}$$ is of size 3n × q. For a large-scale problem the computational cost and memory demands associated with the calculation of the GSVD limit the size of the problem that can be solved.
We therefore turn to an alternative approach for large-scale 3-D problems and adopt an alternating direction (AD) strategy: to handle a large-scale problem requiring derivatives in several dimensions, we split the problem into pieces, handling each direction one after the other. This technique has been in the literature for some time for the solution of large-scale partial differential equations, most notably through the alternating direction implicit method (Peaceman & Rachford 1955). Matrix D(k), generated via steps 16 and 17 of Algorithm 2, can be changed without changing the other steps of the algorithm. For the AD algorithm we may therefore alternate over $$D^{(k)} = W_{\mathrm{R_x}}^{(k)}D_{\rm {x}}$$, $$W_{\mathrm{R_y}}^{(k)}D_{\rm {y}}$$ or $$W_{\mathrm{R_z}}^{(k)}D_{\rm {z}}$$, dependent on (k mod 3) = 0, 1 or 2, respectively. Then $$B_2^{(k)}$$ is only of size n × q, yielding reduced memory demands for calculating the GSVD. We note that D is also initialized consistently. The AD approach amounts to applying the edge-preserving algorithm in each direction independently and cycling over all directions. Practically, we also find that there is nothing to be gained by requiring that the derivative matrices be square, and, following Hansen (2007), we ignore the derivatives at the boundaries. For a 1-D problem this amounts to taking matrix Dx of size (n − 1) × n for a line with n points. Thus, dependent on k mod 3, we have a weighting matrix of size px × px, py × py or pz × pz for matrices Dx, Dy and Dz of sizes px × n, py × n and pz × n, respectively, where px, py and pz (all less than n) depend on the number of boundary points in each direction. Matrix $$B_2^{(k)}$$ is thus reduced in size and the GSVD calculation is more efficient at each step.
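A sketch of the two AD ingredients described above: the non-square forward-difference matrix with boundary derivatives dropped, and the cycling over directions by k mod 3 (function names are ours):

```python
import numpy as np
import scipy.sparse as sp

def d1_open(n):
    """(n-1) x n forward-difference matrix with the boundary derivative
    dropped, following the Hansen (2007) convention used by the AD variant."""
    return sp.diags([-np.ones(n - 1), np.ones(n - 1)], [0, 1],
                    shape=(n - 1, n), format="csr")

def ad_direction(k, Dx, Dy, Dz):
    """Select the derivative operator for iteration k: x, y or z as k mod 3."""
    return (Dx, Dy, Dz)[k % 3]
```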
Furthermore, we find that rather than calculating the relevant weight matrix WR for the given direction alone, we form the weighted entries from the components of the gradient vector in all three dimensions, thus realistically approximating the gradient magnitude as given in (5). For the presented results we use the AD version of Algorithm 2, noting that this is not necessary for the smaller problems but is generally more efficient and reliable for the larger 3-D problem formulations.

### 2.1 Estimation of the regularization parameter α

To this point we have assumed a known value for the regularization parameter α. In practice we wish to find α dynamically, so as to regularize appropriately at each step of the iteration, recognizing that the conditioning of the problem changes with the iteration. Here we use the method of unbiased predictive risk estimation, which we have found to be robust in our earlier work (Vatankhah et al. 2015; Renaut et al. 2017; Vatankhah et al. 2017a). The method, which goes back to Mallows (1973), requires some knowledge of the noise level in the data and was carefully developed in Vogel (2002) for the standard Tikhonov functional. It was further extended for use with TV regularization by Lin et al. (2010).
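The minimization of the unbiased predictive risk function of eq. (15) reduces, in terms of the GSVD quantities, to a grid search over a scalar function of α; a minimal sketch, assuming the generalized singular values γi and the coefficients $\mathbf{u}_i^T\tilde{\mathbf{r}}$ have already been computed (the grid size is an arbitrary choice of ours):

```python
import numpy as np

def upre(alpha, gamma, s, q_dim):
    """U(alpha) of eq. (15): gamma holds the generalized singular values,
    s the coefficients u_i^T r_tilde; note 1/(gamma^2/alpha^2 + 1) = 1 - f."""
    f = gamma ** 2 / (gamma ** 2 + alpha ** 2)   # filter factors
    return np.sum(((1.0 - f) * s) ** 2) + 2.0 * np.sum(f) - q_dim

def upre_alpha(gamma, s, n_grid=200):
    """Grid search for the minimizer of U on [min gamma, max gamma]."""
    g = gamma[gamma > 0]
    alphas = np.logspace(np.log10(g.min()), np.log10(g.max()), n_grid)
    U = [upre(a, gamma, s, gamma.size) for a in alphas]
    return alphas[int(np.argmin(U))]
```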
Defining the residual $$R(\mathbf {h}(\alpha )) = \tilde{\tilde{G}} \mathbf {h}(\alpha )-\tilde{\mathbf {r}}$$ and influence matrix $$H_{TV,\alpha }=\tilde{\tilde{G}}(\tilde{\tilde{G}}^{T}\tilde{\tilde{G}}+\alpha ^2\tilde{D}^T\tilde{D})^{-1}\tilde{\tilde{G}}^{T}$$, the optimal parameter α is the minimizer of
\begin{eqnarray} U(\alpha )=\Vert R(\mathbf {h}(\alpha ))\Vert _2^2+ 2 \,\text{trace}(H_{TV,\alpha })-m, \end{eqnarray}
(14) which is given in terms of the GSVD by
\begin{eqnarray} U(\alpha )=\sum _{i=1}^{q} \left( \frac{1}{\gamma _i^2 \alpha ^{-2} + 1} \right)^2 \left(\mathbf {u}_i^T\tilde{\mathbf {r}} \right)^2 + 2 \left( \sum _{i=1}^{q} \frac{\gamma _i^2}{\gamma _i^2+\alpha ^2}\right) - q. \end{eqnarray}
(15) Typically αopt is found by evaluating (15) over a range of α between the minimum and maximum γi; the α which minimizes the function is selected as αopt.

## 3 SYNTHETIC EXAMPLES

### 3.1 Model consisting of two dipping dikes

As a first example, we use a complex model that consists of two embedded dikes of different sizes, dipping in opposite directions but with the same density contrast of 1 g cm−3, Fig. 2. Gravity data, dexact, are generated on the surface for a grid of 30 × 30 = 900 points with grid spacing 50 m. Gaussian noise with standard deviation (0.02 (dexact)i + 0.002 ‖dexact‖) is added to each datum. The noisy data, dobs, are illustrated in Fig. 3. Inversion is performed for a subsurface volume of 9000 cubes of size 50 m in each dimension, using the matrix $$\tilde{\tilde{G}}$$ of size 900 × 9000. Use of this relatively small model permits examination of the inversion methodology with respect to different parameter choices and provides the framework to be used for more realistic larger models. All computations are performed on a desktop computer with an Intel Core i7-4790 CPU (3.6 GHz) and 16 GB RAM.

Figure 2. A model that consists of two dikes dipping in opposite directions.
(a) Cross-section at northing = 725 m; (b) plane-section at depth = 100 m.

Figure 3. The gravity anomaly produced by the model shown in Fig. 2 and contaminated by Gaussian noise.

The parameters of the inversion needed for Algorithm 2, and the results, are detailed for each example in Table 2. These are the maximum number of iterations Kmax, the bound constraints ρmin and ρmax, the prior model mapr and the choice of q. Further, the inversion terminates if $$\chi _{\text{computed}}^2 =\Vert W_{\mathbf {d}}(\mathbf {d}_{\mathrm{obs}}-\mathbf {d}_{\mathrm{pre}}^{(K)})\Vert _2^2 \le m+\sqrt{2m}=942.4$$ is satisfied for iteration K < Kmax. The results for different choices of q, with all other parameters the same, are illustrated in Figs 4–6. With q = 100 the two dipping structures are recovered, but the iteration has not converged by K = 200, and both the relative error and $$\chi _{\text{computed}}^2$$ are large. With q = 300 and 500 better reconstructions of the dikes are achieved, although the extension of the left dike is overestimated. While the errors are nearly the same, the inversions terminate at K = 174 and K = 49 for q = 300 and q = 500, respectively. This leads to a lower computational time for q = 500 as compared with the other two cases, see Table 2. We note that for all three cases the relative error decreases rapidly over the early iterations, after which there is little change in the model between iterations. For example, the result for q = 300 at iteration K = 50, as illustrated in Fig. 7 and detailed in Table 2, is acceptable and is achieved with a substantial reduction in the computational time.

Figure 4. The reconstructed model for the data in Fig. 3 using Algorithm 2 with q = 100. (a) Cross-section at northing = 725 m; (b) plane-section at depth = 100 m.

Figure 5. The reconstructed model for the data in Fig. 3 using Algorithm 2 with q = 300. (a) Cross-section at northing = 725 m; (b) plane-section at depth = 100 m.

Figure 6. The reconstructed model for the data in Fig. 3 using Algorithm 2 with q = 500. (a) Cross-section at northing = 725 m; (b) plane-section at depth = 100 m.

Figure 7. The reconstructed model for the data in Fig. 3 using Algorithm 2 with q = 300 at K = 50. (a) Cross-section at northing = 725 m; (b) plane-section at depth = 100 m.

Table 2. Parameters and results for Algorithm 2 applied to the small-scale dipping dikes model. The number of iterations is K, α(K) is the final regularization parameter and $$\text{RE}^{(K)}=\frac{\Vert \mathbf {m}_{\mathrm{exact}}-\mathbf {m}^{(K)} \Vert _2}{\Vert \mathbf {m}_{\mathrm{exact}} \Vert _2}$$ is the relative error of the reconstructed model at iteration K.
The final $$\chi _{\text{computed}}^2$$ is also reported.

| mapr | ρmin (g cm−3) | ρmax (g cm−3) | Kmax | q | RE(K) | α(K) | K | $$\chi _{\text{computed}}^2$$ | Time (s) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0 | 1 | 200 | 100 | 0.7745 | 26.63 | 200 | 3978.9 | 494 |
| 0 | 0 | 1 | 200 | 300 | 0.7143 | 41.62 | 174 | 942.4 | 595 |
| 0 | 0 | 1 | 200 | 500 | 0.7244 | 51.62 | 49 | 942.2 | 249 |
| 0 | 0 | 1 | 50 | 300 | 0.7308 | 38.01 | 50 | 1165.7 | 165 |
| 0 | 0 | 2 | 200 | 500 | 0.7257 | 31.60 | 49 | 941.6 | 227 |
| ≠ 0 | 0 | 1 | 200 | 500 | 0.7076 | 76.74 | 200 | 1952.6 | 972 |

As compared to inversion using the L2-norm of the gradient of the model parameters, see, for example, Li & Oldenburg (1996), or conventional minimum-structure inversion, the results obtained using Algorithm 2 provide a subsurface model that is not smooth. Further, as compared with the L0- and L1-norms applied to the model parameters, see, for example, Last & Kubik (1983), Portniaguine & Zhdanov (1999) and Vatankhah et al. (2017a), the TV inversion does not yield a model that is sparse or compact. On the other hand, the TV inversion is far less dependent on correct specification of the model constraints. This is illustrated in Fig. 8, which shows the same case as Fig. 6 but with ρmax = 2 g cm−3, and demonstrates that the approach is generally robust. Finally, to contrast with the algorithm presented in Bertete-Aguirre et al. (2002), the result in Fig. 9(b) is for an alternative choice of mapr, illustrated in Fig. 9(a), with q = 500, which is consistent with the approach in Bertete-Aguirre et al. (2002).
Here mapr is obtained by taking the true model and adding Gaussian noise with a standard deviation of (0.05 mtrue + 0.02 ‖mtrue‖). Not surprisingly, the reconstructed density model is more focused and is closer to the true model. The computed value of $$\chi _{\text{computed}}^2$$ is, however, larger than the specified value and the algorithm terminates at Kmax. This occurs due to the appearance of an incorrect density distribution in the first layer of the subsurface. Together, these results demonstrate that the TVRGSVD technique is successful and offers a good option for the solution of larger scale models.

Figure 8. The reconstructed model for the data in Fig. 3 using Algorithm 2 with q = 500 when the incorrect upper density bound ρmax = 2 g cm−3 is selected. (a) Cross-section at northing = 725 m; (b) plane-section at depth = 100 m.

Figure 9. (a) The initial model, generated by adding Gaussian noise with standard deviation of (0.05 mtrue + 0.02 ‖mtrue‖) to the true model. (b) The reconstructed model for the data in Fig. 3 using Algorithm 2 with q = 500 when the model shown in Fig. 9(a) is used as mapr.

### 3.2 Model of multiple bodies

We now consider a more complex and larger model that consists of six bodies with various geometries, sizes, depths and densities, as illustrated in Fig. 10 and further detailed in Vatankhah et al. (2017a).
The surface gravity data are calculated on a 100 × 60 grid with 100 m spacing. Gaussian noise with standard deviation of (0.02 (dexact)i + 0.001 ‖dexact‖) is added to each datum, giving the noisy data set shown in Fig. 11. The subsurface is divided into 100 × 60 × 10 = 60 000 cubes of size 100 m in each dimension. For the inversion we use Algorithm 2 with mapr = 0, ρmin = 0 g cm−3, ρmax = 1 g cm−3 and Kmax = 50. We perform the inversion for three values of q, 500, 1000 and 2000, and report the results in Table 3. The reconstruction with q = 500 is less satisfactory than that achieved with larger q, and the iterations terminate at Kmax with a large value of $$\chi _{\text{computed}}^2$$. The results with q = 1000 and q = 2000 have similar relative errors, but the computational cost is much reduced using q = 1000; although the desired $$\chi _{\text{computed}}^2$$ is not achieved, the result is close and acceptable. The reconstructed model and its associated gravity response, using q = 1000, are illustrated in Figs 12 and 13, respectively. While the maximum depths of the anomalies are overestimated, the horizontal borders are reconstructed accurately. These two examples suggest that q > m/6 is suitable for Algorithm 2, which confirms our previous conclusions when using the randomized SVD, see Vatankhah et al. (2017b). We note that, as compared to the case in which the standard Tikhonov functional can be used, see such results with the focusing inversion in Vatankhah et al. (2017a,b), the computational cost here is much higher.

Figure 10. Model consisting of six bodies with various geometries and sizes embedded in a homogeneous background. The bodies have densities 1 g cm−3 and 0.8 g cm−3. (a) Plane-section at depth = 100 m; (b) plane-section at depth = 300 m; (c) plane-section at depth = 500 m; (d) plane-section at depth = 700 m.

Figure 11. The gravity anomaly produced by the multiple-body model shown in Fig. 10 and contaminated by Gaussian noise.

Figure 12. The reconstructed model for the data in Fig. 11 using Algorithm 2 with q = 1000. (a) Plane-section at depth = 100 m; (b) plane-section at depth = 300 m; (c) plane-section at depth = 500 m; (d) plane-section at depth = 700 m.

Figure 13. The gravity anomaly produced by the reconstructed model shown in Fig. 12.

Table 3. The results of the inversion for the multiple bodies example using Algorithm 2.
| mapr | ρmin (g cm−3) | ρmax (g cm−3) | Kmax | q | RE(K) | α(K) | K | $$\chi _{\text{computed}}^2$$ | Time (s) |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 50 | 500 | 0.6710 | 15.31 | 50 | 11719.2 | 6014 |
| 0 | 0 | 1 | 50 | 1000 | 0.6451 | 18.02 | 50 | 7324.0 | 7262 |
| 0 | 0 | 1 | 50 | 2000 | 0.6452 | 18.79 | 44 | 6103.4 | 10666 |

4 REAL DATA We use the gravity data from the Goiás Alkaline Province (GAP) of the central region of Brazil. The GAP is a result of mafic-alkaline magmatism that occurred in the Late Cretaceous and includes mafic–ultramafic alkaline complexes in the northern portion, subvolcanic alkaline intrusions in the central region and volcanic products to the south, with several dikes throughout the area (Dutra et al. 2012). We select a region from the northern part of the GAP in which the Morro do Engenho Complex (ME) outcrops and another intrusive body, A2, is completely covered by Quaternary sediments (Dutra & Marangoni 2009). The data were digitized carefully from fig. 3 in Dutra & Marangoni (2009) and re-gridded onto 45 × 53 = 2385 data points with 1 km spacing, Fig. 14(a). For these data, the result of the smooth inversion using the Li & Oldenburg (1998) algorithm was presented in Dutra & Marangoni (2009) and Dutra et al. (2012). Furthermore, the result of the focusing inversion based on the L1-norm stabilizer is available in Vatankhah et al. (2017b). The results using the TV inversion presented here can therefore be compared with the inversions using both aforementioned algorithms. Figure 14. (a) Residual gravity data over the Morro do Engenho complex, digitized from Dutra & Marangoni (2009); (b) the response of the reconstructed model shown in Fig. 15. For the inversion we divide the subsurface into 45 × 53 × 14 = 33 390 rectangular prisms of side 1 km. The density bounds ρmin = 0 g cm−3 and ρmax = 0.3 g cm−3 are selected based on geological information from Dutra & Marangoni (2009). Algorithm 2 was implemented with q = 500 and Kmax = 100. The results of the inversion are presented in Table 4, with the predicted data due to the reconstructed model illustrated in Fig. 14(b) and the reconstructed model in Fig. 15. As compared to the smooth estimate of the subsurface shown in Dutra & Marangoni (2009), the obtained subsurface is blocky and non-smooth. The subsurface is also not as sparse as that obtained by the focusing inversion in Vatankhah et al. (2017b). The ME and A2 bodies extend to maximum depths of 12 and 8 km, respectively. Unlike the result obtained using the focusing inversion, for the TV inversion the connection between ME and A2 at depths of 4 to 7 km is not strong. We should note that the computational time for the focusing inversion presented in Vatankhah et al. (2017b) is much smaller than for the TV algorithm presented here. Figure 15. The plane-sections of the reconstructed model for the data in Fig. 14(a) using Algorithm 2 with q = 500. The sections are at the depths specified in the figures. Table 4. The results of the inversion for the data presented in Fig. 14 using Algorithm 2.
| mapr | ρmin (g cm−3) | ρmax (g cm−3) | Kmax | q | α(K) | K | $$\chi _{\text{computed}}^2$$ | Time (s) |
|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0.3 | 100 | 500 | 102.48 | 38 | 2453.9 | 1533 |

5 CONCLUSIONS We developed an algorithm for total variation regularization applied to the 3-D gravity inverse problem. The algorithm provides a non-smooth and blocky image of the subsurface, which may be useful when discontinuity of the subsurface is anticipated. Using the randomized generalized singular value decomposition, we have demonstrated that moderate-scale problems can be solved in a reasonable computational time. This appears to be the first use of TV regularization with the RGSVD for gravity inversion. Our presented results use an alternating direction implementation to further improve the efficiency and reduce the memory demands of the problem. The results show that there is less sensitivity to the provision of good bounds for the density values, and thus TV may have a role to play for moderate-scale problems where limited information on the subsurface model parameters is available. It was of interest to investigate the 3-D algorithm developed here in view of the results of Bertete-Aguirre et al. (2002), which advocated TV regularization in the context of 2-D gravity inversion. We obtained results that are comparable with the simulations presented in Bertete-Aguirre et al. (2002), but here for much larger problems. In our simulations we have found that the computational time is much larger than that required for the focusing algorithm presented in Vatankhah et al. (2017a,b).
Because it is not possible to transform the TV regularization to standard Tikhonov form, the much more expensive GSVD algorithm has to be used in place of the SVD that can be utilized for the focusing inversion presented in Vatankhah et al. (2017b). On the other hand, the advantage of the TV inversion as compared to the focusing inversion is its lesser dependence on the density constraints and the generation of a subsurface which is not as smooth and admits discontinuities. The impact of the algorithm was illustrated for the inversion of real gravity data from the Morro do Engenho complex in central Brazil. ACKNOWLEDGEMENTS We would like to thank the two reviewers, Dr Valeria Paoletti and an anonymous reviewer, whose helpful comments improved the manuscript. REFERENCES
Ajo-Franklin J.B., Minsley B.J., Daley T.M., 2007. Applying compactness constraints to differential traveltime tomography, Geophysics, 72(4), R67–R75.
Aster R.C., Borchers B., Thurber C.H., 2013. Parameter Estimation and Inverse Problems, 2nd edn, Elsevier.
Bertete-Aguirre H., Cherkaev E., Oristaglio M., 2002. Non-smooth gravity problem with total variation penalization functional, Geophys. J. Int., 149, 499–507.
Boulanger O., Chouteau M., 2001. Constraints in 3D gravity inversion, Geophys. Prospect., 49, 265–280.
Chung J., Palmer K., 2015. A hybrid LSMR algorithm for large-scale Tikhonov regularization, SIAM J. Sci. Comput., 37(5), S562–S580.
Dutra A.C., Marangoni Y.R., 2009. Gravity and magnetic 3D inversion of Morro do Engenho complex, Central Brazil, J. South Am. Earth Sci., 28, 193–203.
Dutra A.C., Marangoni Y.R., Junqueira-Brod T.C., 2012. Investigation of the Goiás Alkaline Province, Central Brazil: application of gravity and magnetic methods, J. South Am. Earth Sci., 33, 43–55.
Farquharson C.G., 2008. Constructing piecewise-constant models in multidimensional minimum-structure inversions, Geophysics, 73(1), K1–K9.
Fedi M., Hansen P.C., Paoletti V., 2005. Tutorial: analysis of depth resolution in potential field inversion, Geophysics, 70(6), A1–A11.
Golub G.H., van Loan C., 2013. Matrix Computations, 4th edn, Johns Hopkins University Press, Baltimore.
Halko N., Martinsson P.G., Tropp J.A., 2011. Finding structure with randomness: probabilistic algorithms for constructing approximate matrix decompositions, SIAM Rev., 53(2), 217–288.
Hansen P.C., 2007. Regularization Tools: a Matlab package for analysis and solution of discrete ill-posed problems, version 4.1 for Matlab 7.3, Numer. Algorithms, 46, 189–194.
Last B.J., Kubik K., 1983. Compact gravity inversion, Geophysics, 48, 713–721.
Li Y., Oldenburg D.W., 1996. 3-D inversion of magnetic data, Geophysics, 61, 394–408.
Li Y., Oldenburg D.W., 1998. 3-D inversion of gravity data, Geophysics, 63, 109–119.
Li Y., Oldenburg D.W., 2000. Incorporating geologic dip information into geophysical inversions, Geophysics, 65, 148–157.
Lin Y., Wohlberg B., Guo H., 2010. UPRE method for total variation parameter selection, Signal Process., 90, 2546–2551.
Mallows C.L., 1973. Some comments on Cp, Technometrics, 15(4), 661–675.
Oldenburg D.W., McGillivray P.R., Ellis R.G., 1993. Generalized subspace methods for large-scale inverse problems, Geophys. J. Int., 114, 12–20.
Paige C.C., Saunders M.A., 1981. Towards a generalized singular value decomposition, SIAM J. Numer. Anal., 18, 398–405.
Paoletti V., Hansen P.C., Hansen M.F., Fedi M., 2014. A computationally efficient tool for assessing the depth resolution in potential-field inversion, Geophysics, 79(4), A33–A38.
Peaceman D.W., Rachford H.H., Jr., 1955. The numerical solution of parabolic and elliptic differential equations, J. Soc. Indust. Appl. Math., 3, 28–41.
Pilkington M., 2009. 3D magnetic data-space inversion with sparseness constraints, Geophysics, 74, L7–L15.
Portniaguine O., Zhdanov M.S., 1999. Focusing geophysical inversion images, Geophysics, 64, 874–887.
Renaut R.A., Vatankhah S., Ardestani V.E., 2017. Hybrid and iteratively reweighted regularization by unbiased predictive risk and weighted GCV for projected systems, SIAM J. Sci. Comput., 39(2), B221–B243.
Vatankhah S., Ardestani V.E., Renaut R.A., 2014. Automatic estimation of the regularization parameter in 2-D focusing gravity inversion: application of the method to the Safo manganese mine in northwest of Iran, J. Geophys. Eng., 11, 045001.
Vatankhah S., Ardestani V.E., Renaut R.A., 2015. Application of the χ2 principle and unbiased predictive risk estimator for determining the regularization parameter in 3-D focusing gravity inversion, Geophys. J. Int., 200, 265–277.
Vatankhah S., Renaut R.A., Ardestani V.E., 2017a. 3-D projected L1 inversion of gravity data using truncated unbiased predictive risk estimator for regularization parameter estimation, Geophys. J. Int., 210(3), 1872–1887.
Vatankhah S., Renaut R.A., Ardestani V.E., 2017b. A fast algorithm for regularized focused 3-D inversion of gravity data using the randomized SVD, Geophysics, in press, arXiv:1706.06141v1.
Vogel C.R., 2002.
Computational Methods for Inverse Problems, SIAM Frontiers in Applied Mathematics, SIAM.
Voronin S., Mikesell D., Nolet G., 2015. Compression approaches for the regularized solutions of linear systems from large-scale inverse problems, Int. J. Geomath., 6, 251–294.
Wei Y., Xie P., Zhang L., 2016. Tikhonov regularization and randomized GSVD, SIAM J. Matrix Anal. Appl., 37(2), 649–675.
Wohlberg B., Rodríguez P., 2007. An iteratively reweighted norm algorithm for minimization of total variation functionals, IEEE Signal Process. Lett., 14(12), 948–951.
Xiang H., Zou J., 2013. Regularization with randomized SVD for large-scale discrete inverse problems, Inverse Probl., 29, 085008.
Xiang H., Zou J., 2015. Randomized algorithms for large-scale inverse problems with general Tikhonov regularizations, Inverse Probl., 31, doi:10.1088/0266-5611/31/8/085008.
APPENDIX A: THE GENERALIZED SINGULAR VALUE DECOMPOSITION Suppose $$\tilde{\tilde{G}} \in \mathcal {R}^{m\times n}$$, $$\tilde{D} \in \mathcal {R}^{p \times n}$$ and $$\mathcal {N}(\tilde{\tilde{G}}) \cap \mathcal {N}(\tilde{D}) = 0$$, where $$\mathcal {N}(\tilde{\tilde{G}})$$ is the null space of the matrix $$\tilde{\tilde{G}}$$. Then there exist orthogonal matrices $$U \in \mathcal {R}^{m\times m}$$, $$V \in \mathcal {R}^{p \times p}$$ and a non-singular matrix $$X \in \mathcal {R}^{n \times n}$$ such that $$\tilde{\tilde{G}}=U \Lambda X^T$$ and $$\tilde{D}=V M X^T$$ (Paige & Saunders 1981; Aster et al. 2013). Here, $$\Lambda \in \mathcal {R}^{m \times n}$$ is zero except for the entries $$0 < \Lambda _{1,(n-m)+1} \le \cdots \le \Lambda _{m,n} < 1$$, and M is diagonal of size p × n with entries $$M_{1,1}\ge M_{2,2}\ge \cdots \ge M_{p^*,p^*}>0$$, where p* ≔ min(p, n).
The generalized singular values of the matrix pair $$[\tilde{\tilde{G}}, \tilde{D}]$$ are γi = λi/μi, ordered so that $$\gamma _1 = \cdots = \gamma _{n-m} = 0 < \gamma _{(n-m)+1} \le \cdots \le \gamma _n$$, where $$\Lambda ^T\Lambda =\mathrm{diag}(0,\ldots, 0,\lambda ^2_{(n-m)+1},\ldots , \lambda _n^2)$$, $$M^T M= \mathrm{diag}(1,\ldots ,1,\mu _{(n-m)+1}^2,\ldots ,\mu _n^2)$$ and $$\lambda _i^2+\mu _i^2=1$$ for all i = 1 : n, that is, MTM + ΛTΛ = In. Using the GSVD, with ui the ith column of the matrix U, we may immediately write the solution of (9) as   \begin{eqnarray} \mathbf {h}(\alpha )= \sum _{i=(n-m)+1}^{n} \frac{\gamma ^2_i}{\gamma ^2_i+\alpha ^2} \frac{\mathbf {u}^T_{i-(n-m)}\tilde{\mathbf {r}}}{\lambda _i} (X^T)^{-1}_{i}, \end{eqnarray} (A1) where $$(X^T)^{-1}_{i}$$ is the ith column of the inverse of the matrix XT. APPENDIX B: ESTIMATION OF THE COMPUTATIONAL COST OF FORMING A GSVD We follow the steps in Golub & van Loan (2013, Algorithm 8.7.2) to provide an estimate of the dominant number of floating point operations required to form a GSVD of two matrices A and B, without consideration of any kind of sparsity structure. For matrices $$A \in \mathcal {R}^{m\times n}$$ and $$B \in \mathcal {R}^{p \times n}$$, we start by computing the QR factorization of the matrix $$C=[A;B] \in \mathcal {R}^{(m+p)\times n}$$, for which the operation cost is $$2n^2(m + p - n/3)$$. Next the algorithm requires the calculation of two singular value decompositions, for the top and lower blocks of the A matrix in the QR factorization. These cost $$6nm^2 + 20m^3$$ and $$6pn^2 + 20n^3$$, respectively; Golub & van Loan (2013, p. 493). The total cost for our application is thus $$(58/3)n^3 + (8p + 2m)n^2 + 2m^2(3n + 10m)$$. For p = n and p = 3n the total costs of the GSVD are $$(82/3)n^3 + 2mn^2 + 2m^2(3n + 10m)$$ and $$(130/3)n^3 + 2mn^2 + 2m^2(3n + 10m)$$, respectively.
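The operation counts above are easy to tabulate. The short script below transcribes the QR and SVD costs from this appendix and checks them against the quoted closed-form total; the function names are ours.

```python
def gsvd_flops(m, n, p):
    """Approximate flop count for forming the GSVD of an m x n and a
    p x n matrix, following the counts in this appendix."""
    qr = 2 * n**2 * (m + p - n / 3)        # QR factorization of C = [A; B]
    svd_top = 6 * n * m**2 + 20 * m**3     # SVD of the top block
    svd_bottom = 6 * p * n**2 + 20 * n**3  # SVD of the lower block
    return qr + svd_top + svd_bottom

def gsvd_flops_closed(m, n, p):
    """Closed-form total quoted in the text."""
    return (58 / 3) * n**3 + (8 * p + 2 * m) * n**2 + 2 * m**2 * (3 * n + 10 * m)
```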
We also include the cost 26n³, found using the SVD, of forming the pseudo-inverse of X: although X is theoretically non-singular in exact arithmetic, for these large-scale problems it can be very ill-conditioned, and the solution is more stably obtained using X†. In contrast, for the RGSVD we have the projected matrices $$B_1 \in \mathcal {R}^{m\times q}$$ and $$B_2 \in \mathcal {R}^{p \times q}$$, and the costs for the QR factorization and the two SVDs are $$2q^2(m + p - q/3)$$, $$6mq^2 + 20q^3$$ and $$6pq^2 + 20q^3$$, respectively. The total cost is then $$8q^2(p + m) + (118/3)q^3$$, which for the cases p = n and p = 3n gives totals $$8q^2(n + m) + (118/3)q^3$$ and $$8q^2(3n + m) + (118/3)q^3$$, respectively. We also need Z†, costing approximately $$6nq^2 + 20q^3$$. © The Author(s) 2018. Published by Oxford University Press on behalf of The Royal Astronomical Society. Geophysical Journal International, Oxford University Press.

Geophysical Journal International, Volume 213, Issue 1, 1 April 2018, 11 pages.
Publisher: The Royal Astronomical Society (Oxford University Press). ISSN: 0956-540X; eISSN: 1365-246X. DOI: 10.1093/gji/ggy014.

The most important point is that as the dimension of the inverse problem increases, efficient and practical algorithms for all the aforementioned regularization formulations are needed.
For example, the following methods have all been used for the solution of large-scale problems: (1) exploiting the symmetry of the gravity kernel (Boulanger & Chouteau 2001); (2) data-space inversion (Pilkington 2009); and (3) projection of the problem onto a smaller subspace (Oldenburg et al. 1993; Vatankhah et al. 2017a). Here the focus is on the development of a new randomized algorithm for the 3-D linear inversion of gravity data with TV stabilization. The approach is very effective when the underlying inverse problem is of moderate scale. Regularization using TV stabilization yields an objective function which is nonlinear in the solution. We therefore use the iteratively reweighted generalized Tikhonov least-squares (IRLS) formulation to solve the nonlinear problem (Portniaguine & Zhdanov 1999). This presents no difficulty for small-scale problems; the generalized singular value decomposition (GSVD) can be used to compute the mutual decomposition of the model and stabilization matrices, from which the solution is immediate at each iteration. Furthermore, the GSVD provides a convenient form for regularization parameter-choice rules (Xiang & Zou 2013; Chung & Palmer 2015). The approach using the GSVD, however, is not practical when the dimension of the problem increases, in terms of either computational cost or memory demands. Randomized algorithms compute low-rank matrix approximations with reduced memory and computational costs (Halko et al. 2011; Xiang & Zou 2013; Voronin et al. 2015; Wei et al. 2016). Random sampling is used to construct a low-dimensional subspace that captures the dominant spectral properties of the original matrix; then, as developed by Wei et al. (2016), a GSVD can be applied to regularize within this dominant spectral space.
We demonstrate that applying this randomized GSVD (RGSVD) methodology at each step of the iterative TV regularization provides a fast and efficient algorithm for 3-D gravity inversion that inherits all the properties of TV inversion for small-scale problems. We also show how the weighting matrix at each iteration can be determined and how an alternating direction algorithm can reduce the memory overhead associated with calculating the GSVD of the projected subproblem at each step of the IRLS. 2 INVERSION METHODOLOGY Suppose the subsurface is divided into a large number of cells of fixed size but unknown density (Li & Oldenburg 1998; Boulanger & Chouteau 2001). The unknown densities of the cells are stacked in the vector $$\mathbf {m}\in \mathcal {R}^{n}$$ and the measured data on the surface are stacked in $$\mathbf {d}_{\mathrm{obs}}\in \mathcal {R}^{m}$$; they are related to the densities via the linear relationship   \begin{eqnarray} \mathbf {d}_{\mathrm{obs}}= G\mathbf {m}, \end{eqnarray} (1) for forward model matrix $$G \in \mathcal {R}^{m \times n}$$ (m ≪ n). The aim is to find an acceptable model for the densities that predicts the observed data at the noise level. For gravity inversion, eq. (1) is modified through inclusion of an estimated prior model, mapr, yielding (Li & Oldenburg 1998)   \begin{eqnarray} \mathbf {d}_{\mathrm{obs}}- G\mathbf {m}_{\mathrm{apr}}= G\mathbf {m}- G\mathbf {m}_{\mathrm{apr}}. \end{eqnarray} (2) First, eq. (1) is replaced by Gy = r using r = dobs − Gmapr and y = m − mapr. Then a diagonal depth weighting matrix Wdepth, with entries $$(W_{\mathrm{depth}})_{jj}= (z_j + \xi )^{-\beta }$$ (Boulanger & Chouteau 2001), is incorporated in the model via $$\tilde{G}=GW_{\mathrm{depth}}^{-1}$$, yielding   \begin{eqnarray} \mathbf {r}= \tilde{G}\mathbf {h}, \end{eqnarray} (3) for h = Wdepthy. Here zj is the mean depth of cell j, β is a weighting parameter usually chosen based on the data type and dimension of the problem, and ξ is a very small positive number (Boulanger & Chouteau 2001).
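The depth weighting step above amounts to a column scaling of the kernel. A minimal sketch, assuming z holds the mean cell depths; the example values for β and ξ are ours (the paper chooses them by data type and problem dimension).

```python
import numpy as np

def depth_weighted_kernel(G, z, beta=0.8, xi=1e-8):
    """Apply the diagonal depth weighting (W_depth)_jj = (z_j + xi)^(-beta)
    without building the diagonal matrix explicitly, i.e. form
    G_tilde = G @ W_depth^{-1} by scaling column j by (z_j + xi)^beta."""
    w = (z + xi) ** (-beta)          # diagonal entries of W_depth
    return G / w[np.newaxis, :]      # divide column j by w_j
```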
Due to the ill-posedness of the problem, eq. (3) cannot be solved directly but must be stabilized in order to provide a practically acceptable solution. Using the L2-TV regularization methodology, h is obtained by minimizing the objective function (Wohlberg & Rodríguez 2007)   \begin{eqnarray} P^{\alpha }(\mathbf {h})=\Vert W_{\mathbf {d}}(\tilde{G}\mathbf {h}-\mathbf {r}) \Vert _2^2 + \alpha ^2 \Vert \vert \nabla \mathbf {h}\vert \Vert _1^1, \end{eqnarray} (4) in which the diagonal weighting matrix Wd approximates the square root of the inverse covariance matrix for the independent noise in the data. Specifically, (Wd)ii = 1/ηi, in which ηi is the standard deviation of the noise for the ith datum. The regularization parameter α provides a trade-off between the weighted data misfit and the stabilization. The absolute value of the gradient vector is $$\vert \nabla \mathbf {h}\vert = \sqrt{(D_{\rm {x}}\mathbf {h})^2+(D_{\rm {y}}\mathbf {h})^2+(D_{\rm {z}}\mathbf {h})^2}$$, in which Dx, Dy and Dz are the discrete derivative operators in the x, y and z-directions, respectively. The derivatives at the centres of the cells are approximated to low order using forward differences, with backward differencing at the boundary points (Li & Oldenburg 2000), yielding matrices Dx, Dy and Dz which are square of size n × n. Although (4) is a convex optimization problem with a unique solution, the TV term is not differentiable everywhere with respect to h, and to find a solution it is helpful to rewrite the stabilization term using a weighted L2-norm, following Wohlberg & Rodríguez (2007).
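For a regular nx × ny × nz grid, derivative matrices of the kind described above can be assembled as sparse Kronecker products of a 1-D forward-difference operator with a backward difference at the boundary. This is a sketch: the x-fastest cell ordering and unit cell spacing are our assumptions.

```python
import numpy as np
import scipy.sparse as sp

def diff_1d(n):
    """Forward differences, with a backward difference at the last point."""
    D = sp.lil_matrix((n, n))
    for i in range(n - 1):
        D[i, i], D[i, i + 1] = -1.0, 1.0
    D[n - 1, n - 2], D[n - 1, n - 1] = -1.0, 1.0   # backward at the boundary
    return D.tocsr()

def gradient_operators(nx, ny, nz):
    """Dx, Dy, Dz, each n x n with n = nx*ny*nz, for x-fastest ordering."""
    Ix, Iy, Iz = sp.identity(nx), sp.identity(ny), sp.identity(nz)
    Dx = sp.kron(Iz, sp.kron(Iy, diff_1d(nx)))
    Dy = sp.kron(Iz, sp.kron(diff_1d(ny), Ix))
    Dz = sp.kron(diff_1d(nz), sp.kron(Iy, Ix))
    return Dx.tocsr(), Dy.tocsr(), Dz.tocsr()
```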
Given vectors ξ, χ and ψ and a diagonal matrix WR with entries wr, we have   \begin{eqnarray} \Bigg \Vert \left( \begin{array}{c@{\hskip7pt}c@{\hskip7pt}c}W_{\mathrm{R}}& 0 & 0 \\ 0 & W_{\mathrm{R}}& 0 \\ 0 & 0 & W_{\mathrm{R}}\end{array} \right) \left( \begin{array}{c }\xi \\ \chi \\ \psi \end{array} \right) \Bigg \Vert _2^2 &=& \sum _r\big(w_{\mathrm{r}}^2 \xi _{\mathrm{r}}^2 + w_{\mathrm{r}}^2 \chi _{\mathrm{r}}^2 + w_{\mathrm{r}}^2 \psi _{\mathrm{r}}^2\big)\nonumber\\ &=& \sum _r w_{\mathrm{r}}^2 \big(\sqrt{\xi _{\mathrm{r}}^2 + \chi _{\mathrm{r}}^2 + \psi _{\mathrm{r}}^2}\big)^2. \end{eqnarray} (5) Setting $$w_{\mathrm{r}} = (\xi _{\mathrm{r}}^2 + \chi _{\mathrm{r}}^2 + \psi _{\mathrm{r}}^2)^{-1/4}$$ we have   \begin{eqnarray} \Bigg \Vert \left( \begin{array}{c@{\hskip7pt}c@{\hskip7pt}c}W_{\mathrm{R}}& 0 & 0 \\ 0 & W_{\mathrm{R}}& 0 \\ 0 & 0 & W_{\mathrm{R}}\end{array} \right) \left( \begin{array}{c }\xi \\ \chi \\ \psi \end{array} \right) \Bigg \Vert _2^2 =\sum _r \big(\sqrt{\xi _{\mathrm{r}}^2 + \chi _{\mathrm{r}}^2 + \psi _{\mathrm{r}}^2}\big). \end{eqnarray} (6) Now with ξ = Dxh, χ = Dyh, ψ = Dzh and $$W_{\mathrm{R}}^{(k)}$$ defined to have entries $$w_{\mathrm{r}}^{(k)}$$ calculated for ∇h at iteration k − 1, given by $$w_{\mathrm{r}}^{(k)}= ((D_{\rm {x}}\mathbf {h}^{(k-1)})_r^2 + (D_{\rm {y}}\mathbf {h}^{(k-1)})_r^2 + (D_{\rm {z}}\mathbf {h}^{(k-1)})_r^2 +\epsilon ^2)^{-1/4}$$, the TV stabilizer is approximated via   \begin{eqnarray} \Vert \vert \nabla \mathbf {h}\vert \Vert _1^1= \left\Vert \sqrt{(D_{\rm {x}}\mathbf {h})^2+(D_{\rm {y}}\mathbf {h})^2+(D_{\rm {z}}\mathbf {h})^2} \right\Vert _1^1 \approx \Vert W^{(k)} D \mathbf {h}\Vert _2^2, \end{eqnarray} (7) for the derivative operator D = [Dx; Dy; Dz]. Here, 0 < ε ≪ 1 is added to avoid the possibility of division by zero, and the superscript k indicates that the matrix WR is updated using the model parameters of the previous iteration.
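The weight update at each IRLS iteration is then a single vectorized computation. A sketch, with the helper name and default ε being ours:

```python
import numpy as np

def tv_irls_weights(Dx, Dy, Dz, h_prev, eps=1e-9):
    """Entries of W_R^{(k)}:
    w_r = ((Dx h)_r^2 + (Dy h)_r^2 + (Dz h)_r^2 + eps^2)^(-1/4),
    evaluated at the previous iterate h_prev. Replacing the exponent
    -1/4 by -1/2 gives the minimum gradient support (MGS) weights."""
    g2 = (Dx @ h_prev) ** 2 + (Dy @ h_prev) ** 2 + (Dz @ h_prev) ** 2
    return (g2 + eps**2) ** (-0.25)
```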
Hence (4) is rewritten as a general Tikhonov functional   \begin{eqnarray} P^{\alpha }(\mathbf {h})=\Vert W_{\mathbf {d}}(\tilde{G}\mathbf {h}-\mathbf {r}) \Vert _2^2 + \alpha ^2 \Vert \tilde{D} \mathbf {h}\Vert _2^2, \quad \tilde{D}=WD, \end{eqnarray} (8) for which the minimum is explicitly expressible as   \begin{eqnarray} \mathbf {h}=(\tilde{\tilde{G}}^T\tilde{\tilde{G}}+\alpha ^2\tilde{D}^T\tilde{D})^{-1}\tilde{\tilde{G}}^T\tilde{\mathbf {r}}, \quad \tilde{\tilde{G}}=W_{\mathbf {d}}\tilde{G} \quad \mathrm{and} \quad \tilde{\mathbf {r}}=W_{\mathbf {d}}\mathbf {r}. \end{eqnarray} (9) The model update is then   \begin{eqnarray} \mathbf {m}(\alpha )=\mathbf {m}_{\mathrm{apr}}+W_{\mathrm{depth}}^{-1}\mathbf {h}(\alpha ). \end{eqnarray} (10) While the Tikhonov function can be replaced by a standard-form Tikhonov function, that is, with regularization term ‖h‖2, when $$\tilde{D}$$ is easily invertible (for example, when diagonal), the form of $$\tilde{D}$$ here makes that transformation prohibitively costly and we must solve the general form. We show the iterative process for the solution in Algorithm 2. The iteration terminates when the solution satisfies the noise level or when a predefined maximum number of iterations is reached (Boulanger & Chouteau 2001). Furthermore, the density limits [ρmin, ρmax] are imposed at each iteration. If at any iteration a density value falls outside these predefined bounds, the value is projected back to the nearest bound value (Boulanger & Chouteau 2001). For small-scale problems, in which the dimensions of $$\tilde{\tilde{G}}$$, and consequently $$\tilde{D}$$, are small, the solution h(α) is found at minimal cost using the GSVD of the matrix pair $$[\tilde{\tilde{G}},\tilde{D}]$$, as shown in Appendix A (Aster et al. 2013; Vatankhah et al. 2014).
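For a small instance, the minimizer (9) can also be computed without forming the normal equations, by solving the equivalent stacked least-squares problem. A sketch, not the authors' GSVD-based implementation:

```python
import numpy as np

def tikhonov_general(G, D, r, alpha):
    """Minimize ||G h - r||^2 + alpha^2 ||D h||^2 by solving the
    equivalent stacked least-squares system [G; alpha*D] h ~ [r; 0]."""
    A = np.vstack([G, alpha * D])
    b = np.concatenate([r, np.zeros(D.shape[0])])
    h, *_ = np.linalg.lstsq(A, b, rcond=None)
    return h
```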
Furthermore, given the GSVD, the regularization parameter may be estimated cheaply using standard parameter-choice techniques (Xiang & Zou 2013; Chung & Palmer 2015). But for large-scale problems it is not practical to calculate the GSVD at each iteration, with respect to both computational cost and memory demands. Instead, the size of the original large problem can be reduced greatly using a randomization technique which provides the GSVD in a more feasible and efficient manner. The solution of the reduced system is still a good approximation of that of the original system (Halko et al. 2011; Xiang & Zou 2013; Voronin et al. 2015; Xiang & Zou 2015; Wei et al. 2016). Here, we use the randomized GSVD (RGSVD) algorithm developed by Wei et al. (2016) for underdetermined problems, in which the mutual decomposition of the matrix pair $$[\tilde{\tilde{G}},\tilde{D}]$$ is approximated by   \begin{eqnarray} \tilde{\tilde{G}} \approx U \Lambda Z, \quad \tilde{D} \approx V M Z. \end{eqnarray} (11) The steps of the algorithm are given in Algorithm 1. Steps 1 to 3 are used to form the matrix Q, which approximates the range of $$\tilde{\tilde{G}}^T$$. At step 4, $$\tilde{\tilde{G}}$$ is projected onto a lower dimensional matrix B1, which provides information on the range of $$\tilde{\tilde{G}}$$. The same projection is applied to the matrix $$\tilde{D}$$. In step 5, an economy-sized GSVD is computed for the matrix pair [B1, B2]. The parameter q balances the accuracy and efficiency of Algorithm 1 and determines the dimension of the subspace for the projected problem. When q is small, this methodology is very effective and leads to a fast GSVD computation. Simultaneously, the parameter q has to be selected large enough to capture the dominant spectral properties of the original problem, with the aim that the solution obtained using the RGSVD is close to the solution that would be obtained using the GSVD.
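A minimal sketch of the randomized projection in Algorithm 1 follows. For simplicity we solve the projected Tikhonov subproblem with a dense least-squares solve in place of the economy GSVD of [B1, B2]; the oversampling parameter and the use of `lstsq` are our assumptions. Under these assumptions the sketch recovers the full solution exactly when the kernel is low rank.

```python
import numpy as np

def rgsvd_projected_solve(G, D, r, alpha, q, oversample=10, seed=0):
    """Randomized reduction of (9): sample the row space of G with a
    Gaussian test matrix, project G and D onto the resulting q-dimensional
    basis Q, and solve the small Tikhonov subproblem. Returns h ~ Q x."""
    rng = np.random.default_rng(seed)
    m, n = G.shape
    Omega = rng.standard_normal((q + oversample, m))
    Y = Omega @ G                        # samples of the row space of G
    Q, _ = np.linalg.qr(Y.T)             # orthonormal basis, n x (q+oversample)
    Q = Q[:, :q]                         # keep a q-dimensional subspace
    B1, B2 = G @ Q, D @ Q                # projected matrices (m x q, p x q)
    A = np.vstack([B1, alpha * B2])
    b = np.concatenate([r, np.zeros(B2.shape[0])])
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return Q @ x
```

When G has rank no greater than q, the basis Q captures its row space and the projected solution coincides with the full minimizer of (9) for D = I, since that minimizer lies in the row space of G.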
For any GSVD decomposition (11), including that obtained via Algorithm 1, h(α) of (9) is given by   \begin{eqnarray} \mathbf {h}(\alpha )= \left( Z^T \Lambda ^T U^T U \Lambda Z + \alpha ^2 Z^T M^T V^T V M Z \right)^{-1} Z^T \Lambda ^T U^T \tilde{\mathbf {r}}, \end{eqnarray} (12) which simplifies to (Aster et al. 2013; Vatankhah et al. 2014)   \begin{eqnarray} \mathbf {h}(\alpha )= \sum _{i=1}^{q} \frac{\gamma ^2_i}{\gamma ^2_i+\alpha ^2} \frac{\mathbf {u}^T_{i}\tilde{\mathbf {r}}}{\lambda _i} (Z^{\dagger })_{i}. \end{eqnarray} (13) Here γi is the ith generalized singular value, see Appendix A, and Z† is the Moore–Penrose inverse of Z (Wei et al. 2016). Incorporating this single step of the RGSVD within the TV algorithm yields the iteratively reweighted TVRGSVD algorithm given in Algorithm 2. Note that steps 1–3 and the calculation of B1 in step 4 of Algorithm 1 are the same for all TV iterations and are thus outside the loop in Algorithm 2. At each iteration k, the matrices W, $$\tilde{D}$$ and B2 are updated, and the GSVD is determined for the matrix pair $$[B_1, B_2^{(k)}]$$. We note that it is immediate to use instead the minimum gradient support (MGS) stabilizer introduced by Portniaguine & Zhdanov (1999) in Algorithm 2, by replacing the exponent (−1/4) in WR with (−1/2) and keeping all other parameters fixed. The dominant costs of finding the solution (13) at each step of Algorithm 2 depend on the specifics of the algorithm used to estimate the GSVD. Using the RGSVD, the cost is determined by the use of Algorithm 1 and the cost of finding the Z† needed in (13). These costs are summarized in terms of estimates of the number of floating point operations (flops) in Table 1. For q ≪ m ≪ n and p = 3n, the dominant cost is approximately 6qn² + 34q²n + 4qmn and occurs due to steps 2, 4 and 5, and the calculation of Z†. This compares favourably with the cost, approximately (208/3)n³, of using the full GSVD, see Appendix B.
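The filtered expansion (13) can be sanity-checked on a trivially diagonal pair G = diag(λ), D = diag(μ) with λ² + μ² = 1: the GSVD is then U = V = Z = I and γ = λ/μ, so (13) can be compared term by term against the explicit solve (9). This is our own check; all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
q = 8
lam = np.sort(rng.uniform(0.1, 0.9, q))
mu = np.sqrt(1.0 - lam**2)          # enforce lam^2 + mu^2 = 1
gamma = lam / mu                    # generalized singular values
r = rng.standard_normal(q)
alpha = 0.5

# eq (13): filtered spectral expansion (here Z^dagger = I and u_i = e_i)
h_spectral = (gamma**2 / (gamma**2 + alpha**2)) * r / lam

# eq (9): direct regularized solve with the diagonal matrices
h_direct = lam * r / (lam**2 + alpha**2 * mu**2)

assert np.allclose(h_spectral, h_direct)
```

The agreement follows algebraically: γ²/(γ² + α²) · 1/λ = λ/(λ² + α²μ²) once γ = λ/μ is substituted.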
The potential impact of using the RGSVD rather than the GSVD for the solution of the TV problem can be seen by using the flop estimates to compare the RGSVD and GSVD for examples of m, n and choices of q for the RGSVD. The RGSVD offers a significant reduction in the computational cost, as illustrated in Figs 1(a) and (b). In these plots we give the estimates for using either $$\tilde{D}$$ with p = 3n or p = n, and for the major terms 6qn² and (208/3)n³, in each case. To compare a moderate- and a large-scale problem we illustrate for the pairs (m, n) = (900, 9000) and (6000, 60 000), respectively. These results show that the major terms are sufficient to determine the relative impact of using RGSVD in place of GSVD; RGSVD will be more efficient for q/(12n) ≪ 1. We will show for simulated data that this comes without loss of accuracy. Figure 1. Estimates of the total number of flops using the RGSVD and GSVD algorithms for two choices of (m, n) with n ≫ m. Table 1. Computational cost of each step of Algorithm 1 based on an approximate count of flops: step 2, 2lmn; step 3, 2l²(n − l/3); step 4, 2qmn and 6qn²; step 5, 8q²(p + m) + (118/3)q³; step 6, 2q²n; and the computation of Z†, 6nq² + 20q³. 2.0.1 An Alternating Direction algorithm Step 9 of Algorithm 2 requires the economy GSVD decomposition in which the matrix $$B_2^{(k)}$$ is of size 3n × q. For a large-scale problem the computational cost and memory demands associated with the calculation of the GSVD limit the size of the problem that can be solved.
We therefore turn to an alternative approach for large-scale 3-D problems and adopt an Alternating Direction (AD) strategy: rather than handling derivatives in all three dimensions at once, we split the problem into pieces and treat each direction in turn. This technique has a long history for the solution of large-scale partial differential equations, most notably through the alternating direction implicit method (Peaceman & Rachford 1955). The matrix D(k) generated via steps 16 and 17 of Algorithm 2 can be changed without changing the other steps of the algorithm. For the AD algorithm we therefore alternate over $$D^{(k)} = W_{\mathrm{R_x}}^{(k)}D_{\rm {x}}$$, $$W_{\mathrm{R_y}}^{(k)}D_{\rm {y}}$$ or $$W_{\mathrm{R_z}}^{(k)}D_{\rm {z}}$$, dependent on (k mod 3) = 0, 1 or 2, respectively. Then $$B_2^{(k)}$$ is only of size n × q, yielding reduced memory demands for calculating the GSVD. We note that D is also initialized consistently. The AD approach amounts to applying the edge-preserving algorithm in each direction independently and cycling over all directions. Practically, we also find that there is nothing to be gained by requiring that the derivative matrices are square and, following Hansen (2007), we ignore the derivatives at the boundaries. For a 1-D problem this amounts to taking the matrix Dx of size (n − 1) × n for a line with n points. Thus, dependent on k mod 3, we have a weighting matrix of size px × px, py × py or pz × pz for matrices Dx, Dy and Dz of sizes px × n, py × n and pz × n, respectively, where px, py and pz (all less than n) depend on the number of boundary points in each direction. Matrix $$B_2^{(k)}$$ is thus reduced in size and the GSVD calculation is more efficient at each step.
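The directional operators and the AD cycling over k mod 3 can be sketched with Kronecker products of 1-D difference matrices. This is a dense NumPy illustration under an assumed x-fastest cell ordering; in practice these operators would be stored as sparse matrices, and the function names are ours.

```python
import numpy as np

def diff_1d(n):
    """Forward-difference matrix of size (n - 1) x n; boundary derivatives
    are dropped, following the Hansen (2007) convention noted in the text."""
    D = np.zeros((n - 1, n))
    D[np.arange(n - 1), np.arange(n - 1)] = -1.0
    D[np.arange(n - 1), np.arange(1, n)] = 1.0
    return D

def directional_operator(nx, ny, nz, k):
    """AD sketch: select D_x, D_y or D_z according to k mod 3, for a grid of
    nx*ny*nz cells ordered with x fastest (an assumed ordering)."""
    Ix, Iy, Iz = np.eye(nx), np.eye(ny), np.eye(nz)
    if k % 3 == 0:
        return np.kron(Iz, np.kron(Iy, diff_1d(nx)))   # D_x, size p_x x n
    if k % 3 == 1:
        return np.kron(Iz, np.kron(diff_1d(ny), Ix))   # D_y, size p_y x n
    return np.kron(diff_1d(nz), np.kron(Iy, Ix))       # D_z, size p_z x n
```

Each operator has fewer than n rows, consistent with px, py, pz < n, and each annihilates a constant model, as a derivative operator should.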
Furthermore, rather than calculating the weighting matrix WR using only the derivative in the given direction, we form the weighted entries from the approximation of the relevant component of the gradient vector over all three dimensions, thus realistically approximating the gradient as given in (5). For the presented results we will use the AD version of Algorithm 2, noting that this is not necessary for the smaller problems, but is generally more efficient and reliable for the larger 3-D problem formulations. 2.1 Estimation of the regularization parameter α As presented to this point we have assumed a known value for the regularization parameter α. Practically, we wish to find α dynamically, appropriately regularizing at each step of the iteration so as to recognize that the conditioning of the problem changes with the iteration. Here we use the method of unbiased predictive risk estimation, which we have found to be robust in our earlier work (Renaut et al. 2017; Vatankhah et al. 2015, 2017a). The method, which goes back to Mallows (1973), requires some knowledge of the noise level in the data and was carefully developed in Vogel (2002) for the standard Tikhonov functional. It was further extended for use with TV regularization by Lin et al. (2010).
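In the GSVD coordinates the unbiased predictive risk estimator costs only O(q) per evaluation of α, since it reduces to sums over the filter factors. A hedged sketch, checking that spectral form against the matrix definition of the estimator (residual norm plus 2 trace(H) minus m) on a diagonal pair for which the GSVD is the identity; function names and the α grid are assumptions.

```python
import numpy as np

def upre_spectral(alpha, gamma, proj, q):
    """UPRE in GSVD coordinates; proj[i] plays the role of u_i^T r~."""
    filt = alpha**2 / (gamma**2 + alpha**2)
    return (np.sum(filt**2 * proj**2)
            + 2.0 * np.sum(gamma**2 / (gamma**2 + alpha**2)) - q)

def upre_matrix(alpha, G, D, r):
    """UPRE from its definition: ||residual||^2 + 2 trace(H) - m."""
    m = G.shape[0]
    A = np.linalg.solve(G.T @ G + alpha**2 * (D.T @ D), G.T)
    H = G @ A                      # influence matrix
    res = G @ (A @ r) - r
    return res @ res + 2.0 * np.trace(H) - m

# diagonal pair: GSVD is the identity, gamma = lam / mu
rng = np.random.default_rng(3)
n = 6
lam = rng.uniform(0.2, 0.9, n)
mu = np.sqrt(1.0 - lam**2)
G, D = np.diag(lam), np.diag(mu)
r = rng.standard_normal(n)
alphas = np.logspace(-3, 1, 60)
u_spec = [upre_spectral(a, lam / mu, r, n) for a in alphas]
u_mat = [upre_matrix(a, G, D, r) for a in alphas]
alpha_opt = alphas[int(np.argmin(u_spec))]
```

Evaluating the spectral form on a grid of α and taking the minimizer mirrors the grid search described for αopt in the text.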
Defining the residual $$R(\mathbf {h}(\alpha )) = \tilde{\tilde{G}} \mathbf {h}(\alpha )-\tilde{\mathbf {r}}$$ and the influence matrix $$H_{TV,\alpha }=\tilde{\tilde{G}}(\tilde{\tilde{G}}^{T}\tilde{\tilde{G}}+\alpha ^2\tilde{D}^T\tilde{D})^{-1}\tilde{\tilde{G}}^{T}$$, the optimal parameter α is the minimizer of   \begin{eqnarray} U(\alpha )=\Vert R(\mathbf {h}(\alpha ))\Vert _2^2+ 2 \,\text{trace}(H_{TV,\alpha })-m, \end{eqnarray} (14) which is given in terms of the GSVD by   \begin{eqnarray} U(\alpha )=\sum _{i=1}^{q} \left( \frac{1}{\gamma _i^2 \alpha ^{-2} + 1} \right)^2 \left(\mathbf {u}_i^T\tilde{\mathbf {r}} \right)^2 + 2 \left( \sum _{i=1}^{q} \frac{\gamma _i^2}{\gamma _i^2+\alpha ^2}\right) - q. \end{eqnarray} (15) Typically αopt is found by evaluating (15) on a range of α between the minimum and maximum γi; the α which minimizes the function is then selected as αopt. 3 SYNTHETIC EXAMPLES 3.1 Model consisting of two dipping dikes As a first example, we use a complex model that consists of two embedded dipping dikes of different sizes, dipping in opposite directions but with the same density contrast of 1 g cm−3, Fig. 2. Gravity data, dexact, are generated on the surface for a grid of 30 × 30 = 900 points with grid spacing 50 m. Gaussian noise with standard deviation (0.02 (dexact)i + 0.002 ‖dexact‖) is added to each datum. The resulting noisy data, dobs, are illustrated in Fig. 3. Inversion is performed for a subsurface volume of 9000 cubes of size 50 m in each dimension, using the matrix $$\tilde{\tilde{G}}$$ of size 900 × 9000. Use of this relatively small model permits examination of the inversion methodology with respect to different parameter choices and provides the framework to be used for more realistic larger models. All computations are performed on a desktop computer with an Intel Core i7-4790 CPU 3.6 GHz processor and 16 GB RAM. Figure 2. A model that consists of two dikes dipping in opposite directions.
(a) Cross-section at northing = 725 m; (b) Plane-section at depth = 100 m. Figure 3. The gravity anomaly produced by the model shown in Fig. 2 and contaminated by Gaussian noise. The parameters of the inversion needed for Algorithm 2, and the results, are detailed for each example in Table 2. These are the maximum number of iterations for the inversion Kmax, the bound constraints for the model ρmin and ρmax, the initial model mapr and the choice of q. Further, the inversion terminates if $$\chi _{\text{computed}}^2 =\Vert W_{\mathbf {d}}(\mathbf {d}_{\mathrm{obs}}-\mathbf {d}_{\mathrm{pre}}^{(K)})\Vert _2^2 \le m+\sqrt{2m}=942.4$$ is satisfied for some iteration K < Kmax. The results for different choices of q, with all other parameters the same, are illustrated in Figs 4–6, respectively. With q = 100 the two dipping structures are recovered, but the iteration has not converged by K = 200, and both the relative error and $$\chi _{\text{computed}}^2$$ are large. With q = 300 and 500 better reconstructions of the dikes are achieved, although the extension of the left dike is overestimated. While the errors are nearly the same, the inversions terminate at K = 174 and K = 49 for q = 300 and q = 500, respectively. This leads to a lower computational time when using q = 500 as compared with the other two cases, see Table 2. We note that for all three cases the relative error decreases rapidly over the early iterations, after which there is little change in the model between iterations. For example, the result for q = 300 at iteration K = 50, as illustrated in Fig.
7 and detailed in Table 2, is acceptable and is achieved with a substantial reduction in the computational time. Figure 4. The reconstructed model for the data in Fig. 3 using Algorithm 2 with q = 100. (a) Cross-section at northing = 725 m; (b) Plane-section at depth = 100 m. Figure 5. The reconstructed model for the data in Fig. 3 using Algorithm 2 with q = 300. (a) Cross-section at northing = 725 m; (b) Plane-section at depth = 100 m. Figure 6. The reconstructed model for the data in Fig. 3 using Algorithm 2 with q = 500. (a) Cross-section at northing = 725 m; (b) Plane-section at depth = 100 m. Figure 7. The reconstructed model for the data in Fig. 3 using Algorithm 2 with q = 300 at K = 50. (a) Cross-section at northing = 725 m; (b) Plane-section at depth = 100 m. Table 2. Parameters and results for Algorithm 2 applied to the small-scale dipping dikes model. The number of iterations is K, α(K) is the final regularization parameter, and $$\text{RE}^{(K)}=\frac{\Vert \mathbf {m}_{\mathrm{exact}}-\mathbf {m}^{(K)} \Vert _2}{\Vert \mathbf {m}_{\mathrm{exact}} \Vert _2}$$ is the relative error of the reconstructed model at iteration K.
The final $$\chi _{\text{computed}}^2$$ is also reported.

| mapr | ρmin (g cm−3) | ρmax (g cm−3) | Kmax | q | RE(K) | α(K) | K | $$\chi _{\text{computed}}^2$$ | Time (s) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0 | 1 | 200 | 100 | 0.7745 | 26.63 | 200 | 3978.9 | 494 |
| 0 | 0 | 1 | 200 | 300 | 0.7143 | 41.62 | 174 | 942.4 | 595 |
| 0 | 0 | 1 | 200 | 500 | 0.7244 | 51.62 | 49 | 942.2 | 249 |
| 0 | 0 | 1 | 50 | 300 | 0.7308 | 38.01 | 50 | 1165.7 | 165 |
| 0 | 0 | 2 | 200 | 500 | 0.7257 | 31.60 | 49 | 941.6 | 227 |
| ≠ 0 | 0 | 1 | 200 | 500 | 0.7076 | 76.74 | 200 | 1952.6 | 972 |

As compared to inversion using the L2-norm of the gradient of the model parameters, see, for example, Li & Oldenburg (1996), or conventional minimum-structure inversion, the results obtained using Algorithm 2 provide a subsurface model that is not smooth. Further, as compared with the L0 and L1 norms applied to the model parameters, see, for example, Last & Kubik (1983), Portniaguine & Zhdanov (1999) and Vatankhah et al. (2017a), the TV inversion does not yield a model that is sparse or compact. On the other hand, the TV inversion is far less dependent on correct specification of the model constraints. This is illustrated in Fig. 8, which shows the same case as Fig. 6 but with ρmax = 2 g cm−3, and demonstrates that the approach is generally robust. Finally, to contrast with the algorithm presented in Bertete-Aguirre et al. (2002), the result in Fig. 9(b) is for an alternative choice of mapr, illustrated in Fig. 9(a), and q = 500, which is consistent with the approach in Bertete-Aguirre et al. (2002).
Here mapr is obtained by taking the true model and adding Gaussian noise with a standard deviation of (0.05 mtrue + 0.02 ‖mtrue‖). Not surprisingly, the reconstructed density model is more focused and closer to the true model. The computed value of $$\chi _{\text{computed}}^2$$ is, however, larger than the specified value and the algorithm terminates at Kmax. This occurs due to the appearance of an incorrect density distribution in the first layer of the subsurface. Together, these results demonstrate that the TVRGSVD technique is successful and offers a good option for the solution of larger scale models. Figure 8. The reconstructed model for the data in Fig. 3 using Algorithm 2 with q = 500 when the incorrect upper density bound ρmax = 2 g cm−3 is selected. (a) Cross-section at northing = 725 m; (b) Plane-section at depth = 100 m. Figure 9. (a) The initial model, generated by adding Gaussian noise with a standard deviation of (0.05 mtrue + 0.02 ‖mtrue‖) to the true model. (b) The reconstructed model for the data in Fig. 3 using Algorithm 2 with q = 500 when the model shown in Fig. 9(a) is used as mapr. 3.2 Model of multiple bodies We now consider a more complex and larger model that consists of six bodies with various geometries, sizes, depths and densities, as illustrated in Fig. 10 and further detailed in Vatankhah et al. (2017a).
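Both synthetic examples contaminate the exact data with zero-mean Gaussian noise whose per-datum standard deviation combines a relative term and a term proportional to the overall anomaly norm. A minimal sketch; the function name, the use of |d_i| for the relative term, and the seeding are our assumptions.

```python
import numpy as np

def add_noise(d_exact, tau1, tau2, seed=0):
    """Gaussian noise with per-datum standard deviation
    tau1*|d_i| + tau2*||d||_2; the synthetic tests in the text use
    (tau1, tau2) = (0.02, 0.002) and (0.02, 0.001)."""
    rng = np.random.default_rng(seed)
    sigma = tau1 * np.abs(d_exact) + tau2 * np.linalg.norm(d_exact)
    return d_exact + sigma * rng.standard_normal(d_exact.shape), sigma
```

The per-datum sigma values also supply the diagonal data weighting used in the chi-squared stopping test.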
The surface gravity data are calculated on a 100 × 60 grid with 100 m spacing. Gaussian noise with a standard deviation of (0.02 (dexact)i + 0.001 ‖dexact‖) is added to each datum, giving the noisy data set shown in Fig. 11. The subsurface is divided into 100 × 60 × 10 = 60 000 cubes of size 100 m in each dimension. For the inversion we use Algorithm 2 with mapr = 0, ρmin = 0 g cm−3, ρmax = 1 g cm−3 and Kmax = 50. We perform the inversion for three values of q, 500, 1000 and 2000, and report the results in Table 3. The reconstruction with q = 500 is less satisfactory than that achieved with larger q, and the iterations terminate at Kmax with a large value of $$\chi _{\text{computed}}^2$$. The results with q = 1000 and q = 2000 have similar relative errors, but the computational cost is much reduced using q = 1000; although the desired $$\chi _{\text{computed}}^2$$ is not achieved, the result is close and acceptable. The reconstructed model and its associated gravity response, using q = 1000, are illustrated in Figs 12 and 13, respectively. While the maximum depths of the anomalies are overestimated, the horizontal borders are reconstructed accurately. These two examples suggest that q > m/6 is suitable for Algorithm 2, confirming our previous conclusions when using the randomized SVD, see Vatankhah et al. (2017b). We note that, as compared to the case in which the standard Tikhonov functional can be used, see the results with the focusing inversion in Vatankhah et al. (2017a,b), the computational cost here is much higher. Figure 10. Model consisting of six bodies with various geometries and sizes embedded in a homogeneous background. The bodies have densities 1 g cm−3 and 0.8 g cm−3. (a) Plane-section at depth = 100 m; (b) Plane-section at depth = 300 m; (c) Plane-section at depth = 500 m; (d) Plane-section at depth = 700 m.
Figure 11. The gravity anomaly produced by the multiple-body model shown in Fig. 10 and contaminated by Gaussian noise. Figure 12. The reconstructed model for the data in Fig. 11 using Algorithm 2 with q = 1000. (a) Plane-section at depth = 100 m; (b) Plane-section at depth = 300 m; (c) Plane-section at depth = 500 m; (d) Plane-section at depth = 700 m. Figure 13. The gravity anomaly produced by the reconstructed model shown in Fig. 12. Table 3. The results of the inversion for the multiple bodies example using Algorithm 2.
| mapr | ρmin (g cm−3) | ρmax (g cm−3) | Kmax | q | RE(K) | α(K) | K | $$\chi _{\text{computed}}^2$$ | Time (s) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0 | 1 | 50 | 500 | 0.6710 | 15.31 | 50 | 11719.2 | 6014 |
| 0 | 0 | 1 | 50 | 1000 | 0.6451 | 18.02 | 50 | 7324.0 | 7262 |
| 0 | 0 | 1 | 50 | 2000 | 0.6452 | 18.79 | 44 | 6103.4 | 10666 |

4 REAL DATA We use gravity data from the Goiás Alkaline Province (GAP) in the central region of Brazil. The GAP is a result of mafic-alkaline magmatism that occurred in the Late Cretaceous and includes mafic–ultramafic alkaline complexes in the northern portion, subvolcanic alkaline intrusions in the central region, and volcanic products to the south, with several dikes throughout the area (Dutra et al. 2012). We select a region from the northern part of the GAP in which the Morro do Engenho complex (ME) outcrops and another intrusive body, A2, is completely covered by Quaternary sediments (Dutra & Marangoni 2009). The data were carefully digitized from fig. 3 in Dutra & Marangoni (2009) and re-gridded onto 45 × 53 = 2385 data points with spacing 1 km, Fig. 14(a). For these data, the result of the smooth inversion using the Li & Oldenburg (1998) algorithm was presented in Dutra & Marangoni (2009) and Dutra et al. (2012). Furthermore, the result of a focusing inversion based on an L1-norm stabilizer is available in Vatankhah et al. (2017b). The results using the TV inversion presented here can therefore be compared with the inversions using both aforementioned algorithms. Figure 14. (a) Residual gravity data over the Morro do Engenho complex digitized from Dutra & Marangoni (2009); (b) the response of the reconstructed model shown in Fig. 15.
For the inversion we divide the subsurface into 45 × 53 × 14 = 33 390 rectangular prisms of size 1 km. The density bounds ρmin = 0 g cm−3 and ρmax = 0.3 g cm−3 are selected based on geological information from Dutra & Marangoni (2009). Algorithm 2 was implemented with q = 500 and Kmax = 100. The results of the inversion are presented in Table 4, with the predicted data due to the reconstructed model illustrated in Fig. 14(b) and the reconstructed model in Fig. 15. As compared to the smooth estimate of the subsurface shown in Dutra & Marangoni (2009), the obtained subsurface is blocky and non-smooth. The subsurface is also not as sparse as that obtained by the focusing inversion in Vatankhah et al. (2017b). The ME and A2 bodies extend to maximum depths of 12 and 8 km, respectively. Unlike the result obtained using the focusing inversion, for the TV inversion the connection between ME and A2 at depths 4 to 7 km is not strong. We note that the computational time for the focusing inversion presented in Vatankhah et al. (2017b) is much smaller than for the TV algorithm presented here. Figure 15. The plane-sections of the reconstructed model for the data in Fig. 14(a) using Algorithm 2 with q = 400. The sections are at the depths specified in the figures. Table 4. The results of the inversion for the data presented in Fig. 14 using Algorithm 2.
| mapr | ρmin (g cm−3) | ρmax (g cm−3) | Kmax | q | α(K) | K | $$\chi _{\text{computed}}^2$$ | Time (s) |
| --- | --- | --- | --- | --- | --- | --- | --- |--- |
| 0 | 0 | 0.3 | 100 | 500 | 102.48 | 38 | 2453.9 | 1533 |

5 CONCLUSIONS We have developed an algorithm for total variation regularization applied to the 3-D gravity inverse problem. The algorithm provides a non-smooth and blocky image of the subsurface, which may be useful when discontinuity of the subsurface is anticipated. Using the randomized generalized singular value decomposition, we have demonstrated that moderate-scale problems can be solved in a reasonable computational time. This appears to be the first use of TV regularization with the RGSVD for gravity inversion. Our presented results use an Alternating Direction implementation to further improve the efficiency and reduce the memory demands of the problem. The results show that there is less sensitivity to the provision of good bounds for the density values, and thus TV regularization may have a role to play for moderate-scale problems where limited information on the subsurface model parameters is available. It was of interest to investigate the 3-D algorithm developed here in view of the results of Bertete-Aguirre et al. (2002), which advocated TV regularization in the context of 2-D gravity inversion. We obtained results that are comparable with the simulations presented in Bertete-Aguirre et al. (2002), but here for much larger problems. In our simulations we have found that the computational time is much larger than that required for the focusing algorithm presented in Vatankhah et al. (2017a,b).
Because it is not possible to transform the TV regularization to standard Tikhonov form, the much more expensive GSVD algorithm has to be used in place of the SVD that can be utilized for the focusing inversion presented in Vatankhah et al. (2017b). On the other hand, the advantage of the TV inversion as compared to the focusing inversion is its lesser dependence on the density constraints and the generation of a subsurface which is not as smooth and admits discontinuities. The impact of the algorithm was illustrated by the inversion of real gravity data from the Morro do Engenho complex in central Brazil. ACKNOWLEDGEMENTS We would like to thank the two reviewers, Dr Valeria Paoletti and one anonymous reviewer, whose helpful comments helped us to improve the manuscript. REFERENCES Ajo-Franklin J.B., Minsley B.J., Daley T.M., 2007. Applying compactness constraints to differential traveltime tomography, Geophysics, 72(4), R67–R75. Aster R.C., Borchers B., Thurber C.C., 2013. Parameter Estimation and Inverse Problems, 2nd edn, Elsevier. Bertete-Aguirre H., Cherkaev E., Oristaglio M., 2002. Non-smooth gravity problem with total variation penalization functional, Geophys. J. Int., 149, 499–507. Boulanger O., Chouteau M., 2001. Constraints in 3D gravity inversion, Geophys. Prospect., 49, 265–280. Chung J., Palmer K., 2015. A hybrid LSMR algorithm for large-scale Tikhonov regularization, SIAM J. Sci. Comput., 37(5), S562–S580. Dutra A.C., Marangoni Y.R., 2009. Gravity and magnetic 3D inversion of Morro do Engenho complex, Central Brazil, J. South Am. Earth Sci., 28, 193–203. Dutra A.C., Marangoni Y.R., Junqueira-Brod T.C., 2012. Investigation of the Goiás Alkaline Province, Central Brazil: application of gravity and magnetic methods, J. South Am. Earth Sci., 33, 43–55.
Farquharson C.G., 2008. Constructing piecewise-constant models in multidimensional minimum-structure inversions, Geophysics, 73(1), K1–K9. Fedi M., Hansen P.C., Paoletti V., 2005. Tutorial: analysis of depth resolution in potential-field inversion, Geophysics, 70(6), A1–A11. Golub G.H., van Loan C., 2013. Matrix Computations, 4th edn, Johns Hopkins University Press, Baltimore. Halko N., Martinsson P.G., Tropp J.A., 2011. Finding structure with randomness: probabilistic algorithms for constructing approximate matrix decompositions, SIAM Rev., 53(2), 217–288. Hansen P.C., 2007. Regularization Tools: a Matlab package for analysis and solution of discrete ill-posed problems, version 4.1 for Matlab 7.3, Numer. Algorithms, 46, 189–194. Last B.J., Kubik K., 1983. Compact gravity inversion, Geophysics, 48, 713–721. Li Y., Oldenburg D.W., 1996. 3-D inversion of magnetic data, Geophysics, 61, 394–408. Li Y., Oldenburg D.W., 1998. 3-D inversion of gravity data, Geophysics, 63, 109–119. Li Y., Oldenburg D.W., 2000. Incorporating geologic dip information into geophysical inversions, Geophysics, 65, 148–157. Lin Y., Wohlberg B., Guo H., 2010. UPRE method for total variation parameter selection, Signal Process., 90, 2546–2551. Mallows C.L., 1973. Some comments on Cp, Technometrics, 15(4), 661–675. Oldenburg D.W., McGillivray P.R., Ellis R.G., 1993. Generalized subspace methods for large-scale inverse problems, Geophys. J. Int., 114, 12–20. Paige C.C., Saunders M.A., 1981. Towards a generalized singular value decomposition, SIAM J. Numer. Anal., 18, 398–405.
Paoletti V., Hansen P.C., Hansen M.F., Fedi M., 2014. A computationally efficient tool for assessing the depth resolution in potential-field inversion, Geophysics, 79(4), A33–A38. Peaceman D.W., Rachford H.H., Jr, 1955. The numerical solution of parabolic and elliptic differential equations, J. Soc. Indust. Appl. Math., 3, 28–41. Pilkington M., 2009. 3D magnetic data-space inversion with sparseness constraints, Geophysics, 74, L7–L15. Portniaguine O., Zhdanov M.S., 1999. Focusing geophysical inversion images, Geophysics, 64, 874–887. Renaut R.A., Vatankhah S., Ardestani V.E., 2017. Hybrid and iteratively reweighted regularization by unbiased predictive risk and weighted GCV for projected systems, SIAM J. Sci. Comput., 39(2), B221–B243. Vatankhah S., Ardestani V.E., Renaut R.A., 2014. Automatic estimation of the regularization parameter in 2-D focusing gravity inversion: application of the method to the Safo manganese mine in northwest Iran, J. Geophys. Eng., 11, 045001. Vatankhah S., Ardestani V.E., Renaut R.A., 2015. Application of the χ2 principle and unbiased predictive risk estimator for determining the regularization parameter in 3D focusing gravity inversion, Geophys. J. Int., 200, 265–277. Vatankhah S., Renaut R.A., Ardestani V.E., 2017a. 3-D projected L1 inversion of gravity data using truncated unbiased predictive risk estimator for regularization parameter estimation, Geophys. J. Int., 210(3), 1872–1887. Vatankhah S., Renaut R.A., Ardestani V.E., 2017b. A fast algorithm for regularized focused 3-D inversion of gravity data using the randomized SVD, Geophysics, in press, arXiv:1706.06141v1. Vogel C.R., 2002.
Computational Methods for Inverse Problems, SIAM Frontiers in Applied Mathematics, SIAM. Voronin S., Mikesell D., Nolet G., 2015. Compression approaches for the regularized solutions of linear systems from large-scale inverse problems, Int. J. Geomath., 6, 251–294. Wei Y., Xie P., Zhang L., 2016. Tikhonov regularization and randomized GSVD, SIAM J. Matrix Anal. Appl., 37(2), 649–675. Wohlberg B., Rodríguez P., 2007. An iteratively reweighted norm algorithm for minimization of total variation functionals, IEEE Signal Process. Lett., 14(12), 948–951. Xiang H., Zou J., 2013. Regularization with randomized SVD for large-scale discrete inverse problems, Inverse Probl., 29, 085008. Xiang H., Zou J., 2015. Randomized algorithms for large-scale inverse problems with general Tikhonov regularization, Inverse Probl., 31, 085008. APPENDIX A: THE GENERALIZED SINGULAR VALUE DECOMPOSITION Suppose $$\tilde{\tilde{G}} \in \mathcal {R}^{m\times n}$$, $$\tilde{D} \in \mathcal {R}^{p \times n}$$ and $$\mathcal {N}(\tilde{\tilde{G}}) \cap \mathcal {N}(\tilde{D}) = 0$$, where $$\mathcal {N}(\tilde{\tilde{G}})$$ is the null space of matrix $$\tilde{\tilde{G}}$$. Then there exist orthogonal matrices $$U \in \mathcal {R}^{m\times m}$$, $$V \in \mathcal {R}^{p \times p}$$ and a non-singular matrix $$X \in \mathcal {R}^{n \times n}$$ such that $$\tilde{\tilde{G}}=U \Lambda X^T$$ and $$\tilde{D}=V M X^T$$ (Paige & Saunders 1981; Aster et al. 2013). Here, $$\Lambda \in \mathcal {R}^{m \times n}$$ is zero except for the entries $$0 < \Lambda _{1,(n-m)+1} \le \cdots \le \Lambda _{m,n} < 1$$, and M is diagonal of size p × n with entries $$M_{1,1}\ge M_{2,2}\ge \cdots \ge M_{p^*,p^*}>0$$, where p* ≔ min(p, n).
The generalized singular values of the matrix pair $[\tilde{\tilde{G}}, \tilde{D}]$ are $\gamma_i = \lambda_i/\mu_i$, where $\gamma_1 = \cdots = \gamma_{n-m} = 0 < \gamma_{(n-m)+1} \le \cdots \le \gamma_n$, with $\Lambda^T\Lambda = \mathrm{diag}(0,\ldots,0,\lambda^2_{(n-m)+1},\ldots,\lambda_n^2)$, $M^T M = \mathrm{diag}(1,\ldots,1,\mu_{(n-m)+1}^2,\ldots,\mu_n^2)$, and $\lambda_i^2+\mu_i^2=1$, $\forall i = 1:n$; that is, $M^TM + \Lambda^T\Lambda = I_n$. Using the GSVD, with $\mathbf{u}_i$ the $i$th column of the matrix $U$, we may immediately write the solution of (9) as
\begin{eqnarray} \mathbf {h}(\alpha )= \sum _{i=(n-m)+1}^{n} \frac{\gamma ^2_i}{\gamma ^2_i+\alpha ^2} \frac{\mathbf {u}^T_{i-(n-m)}\tilde{\mathbf {r}}}{\lambda _i} (X^T)^{-1}_{i}, \end{eqnarray} (A1)
where $(X^T)^{-1}_{i}$ is the $i$th column of the inverse of the matrix $X^T$.

APPENDIX B: ESTIMATION OF THE COMPUTATIONAL COST OF FORMING A GSVD

We follow the steps in Golub & van Loan (2013, Algorithm 8.7.2) to estimate the dominant number of floating point operations required to form a GSVD of two matrices $A$ and $B$, without taking account of any sparsity structure. For matrices $A \in \mathcal{R}^{m\times n}$ and $B \in \mathcal{R}^{p \times n}$, we start by computing the QR factorization of the stacked matrix $C=[A;B] \in \mathcal{R}^{(m+p)\times n}$, at a cost of $2n^2(m + p - n/3)$ operations. The algorithm then requires two singular value decompositions, for the upper and lower blocks of the orthogonal factor of the QR factorization, costing $6nm^2 + 20m^3$ and $6pn^2 + 20n^3$ operations, respectively, Golub & van Loan (2013, page 493). The total cost for our application is thus $(58/3)n^3 + (8p + 2m)n^2 + 2m^2(3n + 10m)$. For $p = n$ and $p = 3n$ the total costs of the GSVD are therefore $(82/3)n^3 + 2mn^2 + 2m^2(3n + 10m)$ and $(130/3)n^3 + 2mn^2 + 2m^2(3n + 10m)$, respectively.
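As an aside, the solution formula (A1) is straightforward to verify numerically. The sketch below (all dimensions, the seed, and the value of α are purely illustrative, not taken from the paper, and (9) is assumed to be the standard Tikhonov problem $\min \Vert \tilde{\tilde{G}}\mathbf{h}-\tilde{\mathbf{r}}\Vert ^2 + \alpha^2\Vert \tilde{D}\mathbf{h}\Vert ^2$) builds a synthetic GSVD pair satisfying the normalization $\lambda_i^2+\mu_i^2=1$ and checks that (A1) reproduces the solution of the regularized normal equations:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 8      # illustrative: m < n, as for underdetermined gravity inversion
p = n            # square regularization operator for simplicity

# Synthetic GSVD pair: G = U Lam X^T, D = V M X^T
U, _ = np.linalg.qr(rng.standard_normal((m, m)))
V, _ = np.linalg.qr(rng.standard_normal((p, p)))
X = np.eye(n) + 0.1 * rng.standard_normal((n, n))  # well-conditioned, non-singular

lam = np.zeros(n)
lam[n - m:] = np.sort(rng.uniform(0.1, 0.9, m))    # 0 < lambda_{(n-m)+1} <= ... <= lambda_n < 1
mu = np.sqrt(1.0 - lam**2)                         # enforce lambda_i^2 + mu_i^2 = 1

Lam = np.zeros((m, n))
Lam[np.arange(m), np.arange(n - m, n)] = lam[n - m:]   # nonzeros in the last m columns
M = np.diag(mu)                                        # p = n, so M is square here

G = U @ Lam @ X.T
D = V @ M @ X.T

alpha = 0.5
r = rng.standard_normal(m)

# (i) direct Tikhonov solution of min ||G h - r||^2 + alpha^2 ||D h||^2
h_direct = np.linalg.solve(G.T @ G + alpha**2 * (D.T @ D), G.T @ r)

# (ii) solution via the GSVD filter formula (A1)
XinvT = np.linalg.inv(X.T)                 # columns are (X^T)^{-1}_i
gamma = lam[n - m:] / mu[n - m:]           # generalized singular values
h_gsvd = np.zeros(n)
for k, i in enumerate(range(n - m, n)):    # i runs over the last m indices
    filt = gamma[k]**2 / (gamma[k]**2 + alpha**2)
    h_gsvd += filt * (U[:, k] @ r) / lam[i] * XinvT[:, i]

assert np.allclose(h_direct, h_gsvd)
```

The filter factor $\gamma_i^2/(\gamma_i^2+\alpha^2)$ damps components with small generalized singular values, which is what stabilizes the inversion.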
Forming the pseudo-inverse of $X$ via the SVD adds a further $26n^3$ operations; although $X$ is theoretically non-singular in exact arithmetic, for these large-scale problems it can be very ill-conditioned, and the solution is more stably obtained using $X^{\dagger}$. In contrast, for the RGSVD we have projected matrices $B_1 \in \mathcal{R}^{m\times q}$ and $B_2 \in \mathcal{R}^{p \times q}$, and the costs for the QR factorization and the two SVDs are $2q^2(m + p - q/3)$, $6mq^2 + 20q^3$ and $6pq^2 + 20q^3$, respectively. The total cost is then $(8p + 8m)q^2 + (118/3)q^3$, which for the cases $p = n$ and $p = 3n$ gives totals $8q^2(n + m) + (118/3)q^3$ and $8q^2(3n + m) + (118/3)q^3$, respectively. Forming $Z^{\dagger}$ adds approximately a further $6nq^2 + 20q^3$ operations. © The Author(s) 2018. Published by Oxford University Press on behalf of The Royal Astronomical Society.
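Putting these counts together, a short sketch (the sizes below are illustrative only, not the paper's test problems) compares the full GSVD cost, including the pseudo-inverse of $X$, with the randomized variant for a target rank $q \ll n$:

```python
def gsvd_cost(m, n, p):
    """Dominant flop count for the classical GSVD of the pair (A, B), with A of
    size m x n and B of size p x n, plus the SVD-based pseudo-inverse of X."""
    qr = 2 * n**2 * (m + p - n / 3)                # QR of the stacked matrix
    svds = (6 * n * m**2 + 20 * m**3) + (6 * p * n**2 + 20 * n**3)
    pinv_x = 26 * n**3                             # pseudo-inverse of X
    return qr + svds + pinv_x

def rgsvd_cost(m, n, p, q):
    """Dominant flop count for the randomized GSVD on the projected m x q and
    p x q matrices, plus the pseudo-inverse of Z."""
    qr = 2 * q**2 * (m + p - q / 3)
    svds = (6 * m * q**2 + 20 * q**3) + (6 * p * q**2 + 20 * q**3)
    pinv_z = 6 * n * q**2 + 20 * q**3
    return qr + svds + pinv_z

# Example: m observations, n model cells, p = n, target rank q << n.
m, n, q = 1000, 10_000, 500
print(f"speed-up ~ {gsvd_cost(m, n, n) / rgsvd_cost(m, n, n, q):.0f}x")
```

The key point is that, apart from the $6nq^2$ term from $Z^{\dagger}$, only the target rank $q$ enters cubically in the randomized variant, whereas the classical GSVD scales with $n^3$.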

### Journal

Geophysical Journal International, Oxford University Press

Published: Apr 1, 2018
