Elastic full waveform inversion based on the homogenization method: theoretical framework and 2-D numerical illustrations

Elastic full waveform inversion based on the homogenization method: theoretical framework and 2-D... SUMMARY Seismic imaging is an efficient tool to investigate the Earth interior. Many of the different imaging techniques currently used, including the so-called full waveform inversion (FWI), are based on limited frequency band data. Such data are not sensitive to the true earth model, but to a smooth version of it. This smooth version can be related to the true model by the homogenization technique. Homogenization for wave propagation in deterministic media with no scale separation, such as geological media, has been recently developed. With such an asymptotic theory, it is possible to compute an effective medium valid for a given frequency band such that effective waveforms and true waveforms are the same up to a controlled error. In this work we make the link between limited frequency band inversion, mainly FWI, and homogenization. We establish the relation between a true model and an FWI result model. This relation is important for a proper interpretation of FWI images. We numerically illustrate, in the 2-D case, that an FWI result is at best the homogenized version of the true model. Moreover, it appears that the homogenized FWI model is quite independent of the FWI parametrization, as long as it has enough degrees of freedom. In particular, inverting for the full elastic tensor is, in each of our tests, always a good choice. We show how the homogenization can help to understand FWI behaviour and help to improve its robustness and convergence by efficiently constraining the solution space of the inverse problem. Inverse theory, Numerical solutions, Computational seismology, Seismic tomography, Wave propagation 1 INTRODUCTION Since the late sixties, seismic data have been used to investigate the Earth interior and to image its mechanical properties, for academic and industrial purposes, at scales ranging from few metres to the global Earth. Modern seismic imaging methods are based on an inverse problem making it possible to retrieve information about the Earth mechanical properties from active (explosion or airgun) or passive (earthquakes, ambient noise) seismic records. Because of the tremendous computing power such a scheme would require, a full exploration of the inverse problem solution space based on the direct modelling of the seismogram full waveforms, as envisioned by Tarantola (2005), is still out of reach, despite the progress in the design of high performance computing devices, reaching now exascale performances. Because of this intrinsic limitation, seismologists have developed seismic imaging and tomography techniques which are mainly based on reduced data, (i.e. secondary observables such as body waves arrival times, surface wave phase velocities, normal model eigenfrequencies ...) together with an appropriate approximate solution of the wave propagation modelling problem, that are quicker to solve than the full wave equation. These choices make it possible to assume that the inverse problem nonlinearity is weak enough so that a solution to the inverse problem can be computed through local optimization techniques. Nevertheless, mathematical regularization (non-data-driven constraints) are always necessary to make the solution of the inverse problem unique. In this framework, for global and regional scales, time arrival and phase velocity tomography methods have provided significant results and images of the deep Earth (e.g. Romanowicz 2003, for a review). At the exploration scale, migration imaging techniques from reflection data have been introduced to recover high-resolution structural information of the subsurface reflectivity. These high-resolution seismic imaging techniques rely on a prior knowledge of the smooth background wave velocity and, in their earlier version, on a linearized wave propagation operator, aiming at refocusing primary reflections only in depth (Claerbout 1985; Beylkin 1987; Beylkin & Burridge 1990). During the last two decades, the simultaneous development of wide angle/azimuth broadband seismic acquisition systems and high performance computing devices have made possible the successful application of seismic imaging techniques based on the full waveform, from the exploration to the global scale. Compared to the previous techniques (tomography, migration), full waveform inversion (FWI) aims at extracting the information from the entire waveform in a limited frequency band, through a local minimization of the difference between the observed and synthetic data. At the global and continental scales, FWI methods, often based on the Spectral Element Method for the forward modelling tool and gradient descent algorithms (steepest descent, nonlinear conjugate gradient) for the optimization, are making important progress (Tape et al.2010; Lekić & Romanowicz 2011; Fichtner et al.2013). At the exploration scale, the method is now routinely used in the industry, mainly for the 3-D reconstruction of the pressure wave velocity in the acoustic approximation (Plessix & Perkins 2010; Sirgue et al.2010; Peter et al.2011; Warner et al.2013; Operto et al.2015). Based on these successful applications, two main research directions are currently identified. The first consists in a better interpretation of physics of wave propagation, going beyond the current common mechanical approximations (typically the acoustic approximation at the exploration scale and the VS approximation at the regional and global scales) and a better accounting for the amplitude of the seismic signal. This requires the use of much more expensive viscoelastic wave propagation modelling solvers, correctly accounting for the attenuation and the anisotropy of the subsurface. This also yields a challenging multiparameter inverse problem involving at least P-wave and S-wave velocities to be reconstructed. The second research direction consists in the design of a more automatized workflow, less relying on human expertise, enhancing the robustness of FWI procedures. Indeed, the FWI problem is intrinsically ill-posed as the misfit function might exhibit several local minima leading to non-informative geological models. Designing strategies to avoid these local minima is at the heart of many methodological studies. In practice, this requires a careful data analysis, and the design of ad hoc multilevel hierarchical schemes based on frequency band, time-windowing and offset selections (also known as layer stripping approach) (Bunks et al.1995; Pratt 1999; Shipp & Singh 2002; Brossier et al.2009). This is complemented through the design of kinematically accurate initial models through high-resolution tomography techniques (Billette & Lambaré 1998; Alerini et al.2007) and the introduction of prior knowledge through the design of regularization strategies (Tikhonov, smoothing techniques) and prior model information. This second line of investigation might be summarized as a search for an FWI formulation leading to a better posed inverse problem, potentially less prone to exhibit local minima. In this paper, we investigate numerically how the concept of homogenization and equivalent medium could help in this respect. Our leading idea is to account for the fact that seismic waves propagating within the subsurface interior are always frequency band-limited. For this reason any subsurface mechanical properties variations smaller than a fraction of the shortest propagated wavelength are ‘seen’ as smooth, often anisotropic, heterogeneities (Capdeville et al.2013). The homogenization theory provides means to compute this equivalent anisotropic medium, the medium ‘seen’ by the wavefield. Homogenization, effective media or upscaling techniques gather a wide range of methods able to compute effective properties and equations of a fine scale problem when large scale properties are needed. In the context of wave propagation, the idea is to remove the heterogeneities of scale much smaller than the minimum wavefield wavelength and to replace them by effective properties. For ‘long’ elastic waves propagating in stratified media, Backus (1962) gave explicit formula to upscale finely layered media. One important result of this work is to show that a finely layered isotropic medium becomes an anisotropic effective medium for long waves. For periodic media, an important class of methods, the two-scale homogenization methods, has been developed (e.g. Sanchez-Palencia 1980). To obtain the effective media, the effective equations and local correctors, two-scale homogenization methods require one to solve the so-called periodic-cell problem. This periodic-cell problem can be solved analytically only for the specific case of layered media, whereas a numerical method such as finite elements is necessary for more general media. For stochastic media, methods formally similar to the two-scale homogenization methods exist (e.g. Bensoussan et al.1978; Blanc et al.2007). Finally, solutions called ‘numerical homogenization’ (e.g. Weinan et al.2007) also needs a scale separation. Typical geological media present no spatial periodicity, no natural scale separation or any kind of spatial statistical invariance. This difficulty excludes all of the above mentioned homogenization techniques from the problem of upscaling geological media. To fill this gap, the non-periodic homogenization technique (Capdeville et al.2010b; Guillot et al.2010; Capdeville et al.2010a, 2015; Capdeville & Cance 2015) has recently been introduced. While the non-periodic homogenization technique is strongly inspired from the classical two-scale periodic homogenization, it has some major differences. One of them relies on the fact that the obtained effective properties are not spatially constant, they are just ‘smoother’ than the original medium. So far, this homogenization method has been mostly used to simplify seismic forward modelling. In this context, the non-periodic homogenization can be seen as a pre-processing step of the elastic model prior to being used in the wave equation solver (e.g. spectral elements, finite differences ...). By removing all the fine scales, the non-periodic homogenization drastically reduces meshing complexity associated with elastic discontinuities and makes the computation optimum (see e.g. Capdeville et al.2015). Note that for relatively weak elastic contrasts, an alternate solution based on a second order Born approximation exists (Jordan 2015). In the inversion context, the homogenization method has been already used either to mix different scales (Afanasiev et al.2015) or to constrain FWI gradient updates in the layered media case (Afanasiev et al.2016). More generally, the homogenization concept raises a puzzling point: for a given frequency band-limited data set, at least two Earth subsurface models minimize similarly a waveform misfit function, the true and the homogenized models. Knowing that, for local optimization approaches, only one model can be found, we can already guess that the solution model is, at best, the homogenized model and not the true model. Indeed, homogenized models represent what is ‘seen’ by the wavefield. More technically, the homogenized models are ‘simpler’ in the sense that they need a finite number of parameters to describe them, which is not the case for the true models. This has been confirmed in a recent work, but only in the layered media case: it has been proposed that FWI can retrieve at best the homogenized medium and that this fact can be used to properly set up the inversion problem (Capdeville et al.2013). Extending those results from the stratified media case to the general case is not trivial for similar reasons that extending the homogenization principle from the layered case to the general case is also difficult (Capdeville et al.2010a). In the present work, we pursue mainly two objectives. First, we extend the results of Capdeville et al. (2013) to higher dimensions (2-D and 3-D). Second, we aim at presenting ideas on how homogenization concepts can be used to better constrain and better understand FWI. Beyond the objective of making and understanding the FWI better, we also aim at establishing the relation between the true model and the model resulting from an FWI. This relation is fundamental to set up a downscaling inverse problem. Its objective is to properly interpret the FWI results as a probability distribution over possible fine scale models (Bodin et al.2015; Nawaz & Curtis 2017) where fine scales are here scales of interest for the interpretation below the FWI resolution limit. In the following, we first define FWI and its solution space. We then introduce the homogenization concept and its consequences for the inverse problem. We then show a series of examples illustrating the exposed ideas before discussing our results and concluding our work. 2 THE FULL WAVEFORM INVERSE PROBLEM 2.1 Context We consider an elastic body $$\boldsymbol{\Omega }$$ of boundary $$\partial \boldsymbol{\Omega }$$ (see Fig. 1). We assume that this elastic body is fully characterized by its ‘true’ parameters, the density ρt(x) and the fourth order elastic tensor ct(x) defined for all x in $$\boldsymbol{\Omega }$$. We consider Ns point sources, characterized by their location $$\mathbf {x}_s^t$$ and mechanism $$\mathbf {q}_s^t$$ (moment tensor or force vector) for s ∈ {1, ..., Ns}. These sources are triggered independently in $$\boldsymbol{\Omega }$$ and recorded by Nr, d = 2 or d = 3 component receivers, in 2-D or 3-D respectively, located in xr, r ∈ {1, ..., Nr}. Figure 1. View largeDownload slide The $$\boldsymbol{\Omega }$$ domain of boundary $$\partial \boldsymbol{\Omega }$$. Ns = 3 sources (stars) and Nr = 4 receivers (triangles) are represented. The circles connected with dashed lines are two successive zooms on $$\boldsymbol{\Omega }$$ texture symbolizing the multiscale nature of the domain heterogeneities. Figure 1. View largeDownload slide The $$\boldsymbol{\Omega }$$ domain of boundary $$\partial \boldsymbol{\Omega }$$. Ns = 3 sources (stars) and Nr = 4 receivers (triangles) are represented. The circles connected with dashed lines are two successive zooms on $$\boldsymbol{\Omega }$$ texture symbolizing the multiscale nature of the domain heterogeneities. Based on the recorded waveforms, an FWI aims at retrieving information about the elastic model and sources in $$\boldsymbol{\Omega }$$. In the following, we define more precisely the model space $$\boldsymbol{\mathcal {M}}$$ to which belong potential solutions to the inverse problem, the data set d and the forward problem linking potential models and synthetic data. Finally, we define the misfit function between the recorded data and synthetic data as well as the chosen optimization algorithm to solve the inverse problem. 2.2 The fine scale model space The fine scale model space is the one to which the true model belongs. It contains all the unknown parameters required to model the data in the full frequency band. In our case, this corresponds to the density and elastic parameters in $$\boldsymbol{\Omega }$$, together with the source parameters. We first define $$\boldsymbol{\mathcal {E}}(\boldsymbol{\Omega })$$, the admissible elastic properties space. $$\boldsymbol{\mathcal {E}}(\boldsymbol{\Omega })$$ gathers all the physically possible density and elastic tensor distributions (ρ(x), c(x)) in $$\boldsymbol{\Omega }$$ such that ρ(x) is positive and bounded; c(x) is a fourth order tensor, positive definite and bounded; c(x) follows the classical major and minor symmetries, leading to only six independent coefficients in 2-D and 21 in 3-D. In general, elastic structures are multiscale: for a given scale, there are always heterogeneities at a smaller scale (see Fig. 1). Consequently, the number of parameters necessary to characterize in a deterministic way the spatial distribution of a general elastic structure is infinite. As a consequence, $$\boldsymbol{\mathcal {E}}$$ is an infinite dimensional space. Note that, in the real world, there may be a finite limit to this dimension when reaching the atomic scale. But this limit is out of reach and is not considered here. Assuming the sources are localized in space (point sources), we define the source parameter space as   \begin{equation} \boldsymbol{\mathcal {S}}=\left\lbrace (\mathbf {x}_s,\mathbf {q}_s)\in (\boldsymbol{\Omega }\times \mathbb {R}^p),\ s\in \lbrace 1\dots N_s\rbrace \right\rbrace, \end{equation} (1)where xs is the source location and qs the source mechanism, which can either be a moment tensor Ms (in this case we have p = d(d + 1)/2) or a force vector $$\boldsymbol{f}_{\!s}$$ (respectively p = d). The model space $$\boldsymbol{\mathcal {M}}$$ is defined by   \begin{equation} \boldsymbol{\mathcal {M}}=\boldsymbol{\mathcal {E}}\cup \boldsymbol{\mathcal {S}}. \end{equation} (2)It is also an infinite dimensional space by definition. A model $$\mathbf {m}\in \boldsymbol{\mathcal {M}}$$, potentially a solution of the FWI problem, made of an elastic model and of the source parameters, can be written as   \begin{equation} \mathbf {m}=\left\lbrace (\rho (\mathbf {x}),\mathbf {c}(\mathbf {x})), \ \forall \mathbf {x}\in \boldsymbol{\Omega }; (\mathbf {x}_s,\mathbf {q}_s), s\in \lbrace 1,...,N_s\rbrace \right\rbrace. \end{equation} (3)The ‘true’ model mt is   \begin{equation} \mathbf {m}_t=\left\lbrace (\rho ^t(\mathbf {x}),\mathbf {c}^t(\mathbf {x})), \ \forall \mathbf {x}\in \boldsymbol{\Omega }; (\mathbf {x}^t_s,\mathbf {q}^t_s), s\in \lbrace 1,...,N_s\rbrace \right\rbrace. \end{equation} (4) In the following, the model m restricted to the source number s is denoted by m|s:   \begin{equation} \mathbf {m}\vert _s=\{(\rho (\mathbf {x}),\mathbf {c}(\mathbf {x})), \ \forall \mathbf {x}\in \boldsymbol{\Omega }; \mathbf {x}_s, \mathbf {q}_s\}. \end{equation} (5) 2.3 The forward modelling equations We assume that, for a given model m, of component (ρ, c), for a source number s among the Ns sources, for any $$\mathbf {x}\in \boldsymbol{\Omega }$$ and any time t ∈ [0, T], the displacement u(x, t; m|s) is the solution of the elastic wave equation   \begin{eqnarray} &&{\rho \partial _{tt} \mathbf {u}- \boldsymbol{\nabla }\cdot \boldsymbol{\sigma }= \mathbf {s}_s, }\nonumber\\ &&\boldsymbol{\sigma }= \mathbf {c}:\boldsymbol{\epsilon }(\mathbf {u}), \end{eqnarray} (6)where ss(x, t) is the sth external source term, $$\boldsymbol{\sigma }(\mathbf {x},t;\mathbf {m}\vert _s)$$ is the stress and $$\boldsymbol{\epsilon }(\mathbf {u}) =\frac{1}{2}(\boldsymbol{\nabla }\mathbf {u}+{(\boldsymbol{\nabla }\mathbf {u})}^{\intercal })$$ is the strain and ⊺ the ‘transpose’ operator. Inelastic losses are ignored in this work for the sake of simplicity. They have nevertheless an important effect in practice (see, for instance, Kamei & Pratt 2013). The wave equation is completed by the normal stress free boundary condition, for $$\mathbf {x}\in \partial \boldsymbol{\Omega }$$,   \begin{equation} \boldsymbol{\sigma }(\mathbf {x},t) \cdot \mathbf {n}(\mathbf {x}) =0\,,\ \ \ \forall (\mathbf {x},t)\in \partial \boldsymbol{\Omega }\times [0,T]\,, \end{equation} (7)where n is the outward normal to $$\partial \boldsymbol{\Omega }$$, and zero initial conditions (the medium is considered to be at rest at t = 0). We consider sources ss(x, t) of the following form   \begin{equation} \mathbf {s}_s(\mathbf {x},t)=-\mathbf {M}_s\cdot \boldsymbol{\nabla }\delta (\mathbf {x}-\mathbf {x}_s)g_s(t)\,, \end{equation} (8)for moment tensor sources or   \begin{equation} \mathbf {s}_s(\mathbf {x},t)=\boldsymbol{f}_{\!s} \delta (\mathbf {x}-\mathbf {x}_s)g_s(t)\,, \end{equation} (9)for force vector sources, where gs(t) is the source time function. 2.4 The data set The Ns elastic sources are triggered independently in $$\boldsymbol{\Omega }$$, generating elastic waves, for which the corresponding displacement ds, r(t) at receivers r is recorded for a duration T. The collected data set is   \begin{equation} \mathbf {d}=\lbrace \mathbf {d}_{s,r}(t), (r,s,t)\in \lbrace 1,...,N_r\rbrace \times \lbrace 1,...,N_s\rbrace \times [0,T]\rbrace \,. \end{equation} (10) In the following, we assume that the physics of the problem is known (i.e. the true model mt belongs to $$\boldsymbol{\mathcal {M}}$$) and that solving the forward problem defined in Section 2.3 accurately models the data within some limit corresponding to numerical noise. 2.5 FWI misfit function We define the classical least-squares misfit function:   \begin{equation} E(\mathbf {m})=\sum _{r,s}\int _{0}^{T}\left(\mathbf {d}_s(\mathbf {x}_r,t)-\mathbf {u}(\mathbf {x}_r,t;\mathbf {m}\vert _s)\right)^2\,{\rm d}t. \end{equation} (11)where u(xr, t; mt|s) is the solution of the wave eq. (6) computed for the model m restricted to a source s. Other misfit functions could be chosen (such as those based on phase or envelope measurements, cross-correlation approaches, deconvolution approaches, optimal transport approaches, etc.), but as long as they rely on finite frequency data, we expect the conclusions of the present study to remain unchanged. The FWI problem is formulated as the minimization problem   \begin{equation} \bar{\mathbf {m}}= \mathop{arg\min }\limits_{\mathbf {m}\in \boldsymbol{\mathcal {M}}} E(\mathbf {m}) \,. \end{equation} (12) It can be mathematically proven that, assuming perfect illumination (unlimited number of sources, unlimited number of receivers recording both displacement and traction on a closed surface, different from the free surface, and an infinite observation time) the true model mt can be uniquely recovered and, in this case, $$\bar{\mathbf {m}}=\mathbf {m}_t$$ (Nachman 1988; Nakamura & Uhlmann 1994). However, in practice, because computing power is finite, (12) cannot be solved. In the FWI framework, the classical solution to this problem is to limit the data frequency band by introducing a maximum frequency fm to the recorded signal and to the source time function gs. This can simply be done using a low-pass filter $$\mathcal {F}^{f_{\rm m}}$$:   \begin{equation} \mathbf {d}_s^{f_{\rm m}}(\mathbf {x}_r,t)=\mathcal {F}^{f_{\rm m}}(\mathbf {d}_s(\mathbf {x}_r,t))\,. \end{equation} (13)The misfit function to minimize is then   \begin{equation} E^{f_{\rm m}}(\mathbf {m})=\sum _{r,s}\int _0^T\left(\mathbf {d}_s^{f_{\rm m}}(\mathbf {x}_r,t)-\mathbf {u}^{f_{\rm m}}(\mathbf {x}_r,t;\mathbf {m}\vert _s)\right)^2\,{\rm d}t. \end{equation} (14)where $$\mathbf {u}^{f_{\rm m}}$$ is computed with the low-pass filtered source time function $$g_s^{f_{\rm m}}=\mathcal {F}^{f_{\rm m}}(g_s)$$. For most media, a bounded dispersion relation between the frequency f and the wavelength λ of the wavefield exists (there are a few exceptions, see Section 7). For such media, introducing a maximum frequency fm to the recorded signal warrants the existence of a minimum wavelength λmin. Using the assumption that waves only see a blurred version of scales smaller than λmin, and therefore that these scales can somehow be ignored in the reconstruction process, introducing a maximum frequency fm has an important consequence: it yields the possibility of using a finite dimensional space approximation $$\boldsymbol{\mathcal {M}}^h$$ of $$\boldsymbol{\mathcal {M}}$$. The superscript h is a number that characterizes the discretization made in the finite dimensional approximation and can be thought as a ‘resolution’ parameter. For example, if the elastic model is represented spatially by constant velocity blocks, h can be the size of these blocks. In general h is directly related to λmin but other information such as the illumination angles, offset ranges or data coverage can influence the choice of h. In $$\boldsymbol{\mathcal {M}}^h$$, models do not contain arbitrarily small scale heterogeneities and, knowing that the data need to be modelled only in a limited frequency band, the forward problem (eqs 6 and 7) can be solved in a bounded computing time. Finally, replacing the original minimization problem (12) by   \begin{equation} \bar{\mathbf {m}}^h = \mathop{arg\min }\limits_{\mathbf {m}\in \boldsymbol{\mathcal {M}}^h} E^{f_{\rm m}}(\mathbf {m}) \end{equation} (15)makes the inverse problem solvable, $$\boldsymbol{\mathcal {M}}^h$$ being a finite dimensional space. To solve (15), two classes of algorithms exist: local optimization techniques or global search approaches (Tarantola 2005). While Bayesian global search approaches would be ideal, yielding the possibility to access the full posterior density function, they are still too expensive, from a computational point of view, to be used for realistic scale FWI problem. Only a few limited experiments have been done in this direction so far (e.g. Käufl et al.2013). For practical applications, local optimization techniques represent the state-of-the art for solving the FWI problem, relying on the gradient direction and different levels of approximation for the inverse Hessian operator [identity, l-BFGS (Nocedal 1980; Byrd et al.1995), Gauss–Newton (Pratt et al.1998); see Métivier & Brossier (2016) for a review]. Provided that a linesearch algorithm or a trust-region strategy is used, these methods converge to the closest local minimum from the initial guess and hopefully to the global minimum if that initial guess is good. Nevertheless, if the global minimum is not unique, this is a problem. In practice, all the FWI techniques based on local optimization approaches implicitly assume a unique global minimum exists and assume that the initial guess lies within the ‘basin of attraction’ of the minimization. This is the reason why the uniqueness of the solution of the inverse problem is important. Even if, in $$\boldsymbol{\mathcal {M}}$$, minimizing E under perfect conditions warrants a unique solution, it is not the case for $$E^{f_{\rm m}}$$: there may be an infinite number of fine scale models that can produce the same limited frequency data (there is a large null space). In $$\boldsymbol{\mathcal {M}}^h$$, it can be different and the uniqueness can be recovered, but it is not granted. Independently of the global search versus local optimization choice made to solve (15), many other choices may influence the inverse problem solution $$\bar{\mathbf {m}}^h$$, even for a perfect data coverage. The resolution parameter h itself and the way it is implemented have a strong effect. Depending on the authors and their chosen approach, a wide range of solutions for the spatial approximation is used, from spatially constant blocks to high degree polynomials (typically, the same polynomial approximation as the one used for the waveform forward modelling). Physical approximation choices made for the constitutive parameters may also strongly influence the inversion result. For instance, we might hope from fine scale prior information that the true model is isotropic, in this case a common choice in seismology is to only invert for VP and VS (or even only one of the two) instead of cijkl. If the solution to the inverse problem $$\bar{\mathbf {m}}^h$$ depends on technical choices to parametrize $$\boldsymbol{\mathcal {M}}^h$$, we know that it cannot be the true model mt. Indeed, in general, mt does not belong to $$\boldsymbol{\mathcal {M}}^h$$: the single fact that mt is multiscale while $$\boldsymbol{\mathcal {M}}^h$$ is not prevents it. This is expected: we know for instance, following the diffraction tomography theory (Devaney 1984), that FWI can only reach a resolution down to half the smallest propagated wavelength. Nevertheless, the relation between $$\bar{\mathbf {m}}^h$$ and mt is unknown and this is a serious issue: in particular it can be difficult to dissociate true high-resolution information and small scale artefacts coming from a combination of noise, lack of illumination, and discretization setups. The knowledge of a more formal relation between $$\bar{\mathbf {m}}^h$$ and mt would thus allow a better interpretation of the FWI result $$\bar{\mathbf {m}}^h$$. The problem that $$\bar{\mathbf {m}}^h$$ strongly depends on the particular choice of $$\boldsymbol{\mathcal {M}}^h$$ and that the relation between $$\bar{\mathbf {m}}^h$$ and mt is unknown is not often discussed in the literature. We feel that this discussion is missing, as understanding these issues would help to better apprehend the behaviour of the inversion schemes and therefore could open ways to improve them. To us, this issue is directly linked to the question of how scales relate to each other through the wave equation: it is a homogenization problem. 3 HOMOGENIZATION CONCEPTS In this section, we summarize the results obtained by Capdeville et al. (2010a), Guillot et al. (2010) and Capdeville et al. (2010b) about homogenization of complex deterministic media with no scale separation for wave propagation problems. For limited frequency band data, seismic waves do not ‘see’ the true Earth model, but a blurred version of it. This blurred Earth model is called the effective or the homogenized Earth model. For a given signal maximum frequency fm, it can be computed thanks to the homogenization technique. As mentioned earlier, in most media, the maximum frequency fm can be associated with a minimum wavelength λmin, and in the following, we assume that a minimum wavelength λmin exists. We moreover assume that a lower bound estimate of λmin can be found as VSmin/fm where VSmin is the slowest velocity in the medium. This is a strong assumption. However, to the best of our knowledge, natural media always satisfies it (while synthetic complex meta-material can break it: for instance, resonating media with Helmholtz resonator inclusions (Zhao et al.2016)). We introduce λ0, a constant separating scales considered as macroscopic (large scales) from the one considered as microscopic (fine scales). We then introduce   \begin{equation} \varepsilon _0=\frac{\lambda _0}{\lambda _{{\rm min}}}\,, \end{equation} (16)a small parameter which measures the scale separation position with respect to λmin. For example, when homogenization is used as a pre-processing step to the forward modelling solver, in order to obtain a precise agreement between true and homogenized waveforms, ε0 = 0.5 is often a good choice for most geological media and a few tens of wavelength of wave propagation (Capdeville et al.2010b). Two-scale homogenization techniques are asymptotic methods that explicitly use two space variables: x for the macroscopic spatial variations and $$\mathbf {y}=\frac{\mathbf {x}}{\varepsilon _0}$$ for microscopic spatial variations. For example, for a multiscale medium such as a geological medium, the density ρ depends on both macroscopic and microscopic scales and is written $$\rho \left(\mathbf {x},\frac{\mathbf {x}}{\varepsilon _0}\right)$$. Similarly, the solution of the wave equation with a maximum frequency fm is written $$\mathbf {u}^{f_{\rm m}}\left(\mathbf {x},\frac{\mathbf {x}}{\varepsilon _0},t\right)$$. Although these properties are written as if they depend on two independent scales, it should be noted that these variables (and others to follow) only have a physical meaning once evaluated on the axis $$\mathbf {y}=\frac{\mathbf {x}}{\varepsilon _0}$$. A function κ that would not depend upon the microscopic scale would be written κ(x). For some quantity $$\kappa (\mathbf {x},\frac{\mathbf {x}}{\varepsilon _0})$$ depending on both scales, we may find an ‘effective’ version κ*(x) that only depends on the macroscopic scale. In the following, effective quantities are noted with a *. In general, the relation between two scales and effective quantities is complex and not just a simple linear smoothing. All the effective quantities actually depend on the actual value of ε0 and λmin. In the following, however, we drop this explicit dependency on ε0 and λmin to simplify the notations. An effective quantity $$\kappa ^{*,\varepsilon _0,\lambda _{{\rm min}}}$$ is simply denoted by κ*. The main results of the homogenization theory are the following: to the first order in ε0, the relation between the true and effective displacement is   \begin{eqnarray} \mathbf {u}^{f_{\rm m}}\left(\mathbf {x},\frac{\mathbf {x}}{\varepsilon _0},t\right) {=} \mathbf {u}^*(\mathbf {x},t) {+} \varepsilon _0\boldsymbol{\chi }\left(\mathbf {x},\!\frac{\mathbf {x}}{\varepsilon _0}\right):\boldsymbol{\epsilon }(\mathbf {u}^*)(\mathbf {x},t) {+} O\left(\varepsilon _0^2\right), \end{eqnarray} (17)where u* is the effective displacement, $$\boldsymbol{\chi }$$ is the first order corrector and $$\boldsymbol{\epsilon }(\mathbf {u}^*)$$ is the strain related to the effective displacement ($$\epsilon _{ij}(\mathbf {u}^*)=\frac{1}{2}(\partial _{x_i}u^*_j+\partial _{x_j}u^*_i)$$). This first-order corrector $$\boldsymbol{\chi }$$ is a third order tensor. It does not depend on time nor on sources, however, it depends on the fine scales x/ε0, ε0 and λmin (see Appendix A). The effective displacement u* is solution of the following effective wave equation   \begin{eqnarray} &&{\rho ^*\ddot{\mathbf {u}}^*-\boldsymbol{\nabla }\cdot \boldsymbol{\sigma }^*=\boldsymbol{f}^*,} \nonumber\\ &&\quad \quad \,\, \boldsymbol{\sigma }^*=\mathbf {c}^*:\boldsymbol{\epsilon }(\mathbf {u}^*)\,, \end{eqnarray} (18)  \begin{equation} \mathbf {T}(\mathbf {u}^*)=\varepsilon _0{\boldsymbol{\Gamma }}^*(\mathbf {u}^*)\,{\rm on }\,\partial \boldsymbol{\Omega }^*, \end{equation} (19)where $$\mathbf {T}(\mathbf {u}^*)=\mathbf {c}^*:\boldsymbol{\epsilon }(\mathbf {u}^*)\cdot \mathbf {n}^*$$, $$\partial \boldsymbol{\Omega }^*$$ is the effective boundary, n* its outward normal and $${\boldsymbol{\Gamma }}^*$$ is a Dirichlet to Neumann operator (DtN) for the effective boundary condition (Capdeville & Marigo 2008, 2013). It shall be noted that, to the leading order in ε0, the effective wave eqs (18) and (19) is also a wave equation, comparable to the original wave eqs (6) and (7), but with different elastic and density coefficients. This remains true to the first order in ε0, but the original Neumann boundary condition changes to become the DtN condition (19). The effective density ρ*(x), elastic tensor c*(x), the boundary layer corrector $${\boldsymbol{\Gamma }}^*(\mathbf {x})$$ as well as the corrector $$\boldsymbol{\chi }(\mathbf {x},\mathbf {x}/\varepsilon _0)$$ can be obtained thanks to the homogenization operator $$\boldsymbol{\mathcal {H}}$$:   \begin{equation} \left(\rho ^*,\mathbf {c}^*,{\boldsymbol{\Gamma }}^*,\boldsymbol{\chi }\right)=\boldsymbol{\mathcal {H}}(\rho ,\mathbf {c})\,. \end{equation} (20)The operator $$\boldsymbol{\mathcal {H}}$$ depends on ε0 and λmin. As λmin is directly tied to fm, $$\boldsymbol{\mathcal {H}}$$ is a frequency dependent operator. For a fixed ε0, the roughness of the effective medium ρ*, c* increases with fm. The nonlinear process behind the homogenization is summarized in Appendix A. From eq. (17), two important conclusions can be drawn: to leading order in ε0, the displacement does not depend on the microscopic scale $$\frac{\mathbf {x}}{\varepsilon _0}$$. To the first order, the displacement depends on the microscopic scale $$\frac{\mathbf {x}}{\varepsilon _0}$$, but only through a ‘site effect’ $$\boldsymbol{\chi }$$, that is independent of time and sources and which needs to be known only at the receiver positions. Finally, the effective source mechanism $$\mathbf {q}_s^*$$ can also be required. For moment tensor sources, the moment tensor has to be corrected to the leading order in ε0 as   \begin{equation} \mathbf {M}_s^*=\mathbf {G}(\mathbf {x}_s,\frac{\mathbf {x}_s}{\varepsilon _0}):\mathbf {M}, \end{equation} (21)where G is the strain concentrator and is defined in Appendix A. The order 1 correction for vector forces can be found in Capdeville et al. (2010a). Effective models belong to a space that is more complex to define than the fine scale models: the effective domain $$\boldsymbol{\Omega }^*$$ itself can be different from the original domain $$\boldsymbol{\Omega }$$: a fine scale topography can be replaced by an effective one (Capdeville & Marigo 2013); $$\boldsymbol{\mathcal {E}}^*(\boldsymbol{\Omega }^*)$$, the effective elastic properties admissible space gathers all the possible effective elastic tensor and density. The main difference with $$\boldsymbol{\mathcal {E}}$$ is that it is a finite dimensional space and its dimensions is proportional to (fm/ε0)d (see Appendix A); the source parameter space remains unchanged, unless first order correctors are introduced; a new space, $$\boldsymbol{\mathcal {C}}$$ gathering the receiver correctors χ at the receiver locations is necessary. Its dimension is proportional to Nr; a new space, $$\boldsymbol{\mathcal {G}}^*$$ gathering all the possible boundary correctors $${\boldsymbol{\Gamma }}^*$$ is also required. Its dimension is also finite and proportional to (fm/ε0)d − 1; finally, note that no new space is required for the source correction term (21). To the leading order, the corrected moment tensor is still a moment tensor and can be inverted as usual. This also implies that only the corrected moment tensor can be retrieved from the seismic data, not the true moment tensor (Burgos et al.2016). The effective model space   \begin{equation} \boldsymbol{\mathcal {M}}^*=\boldsymbol{\mathcal {E}}^*\cup \boldsymbol{\mathcal {S}}\cup \boldsymbol{\mathcal {C}}\cup \boldsymbol{\mathcal {G}}^*, \end{equation} (22)is also a finite dimensional space. It is the image, or the projection, of $$\boldsymbol{\mathcal {M}}$$ through the homogenization operator $$\boldsymbol{\mathcal {H}}$$:   \begin{equation} \boldsymbol{\mathcal {M}}^*=\boldsymbol{\mathcal {H}}(\boldsymbol{\mathcal {M}})\,. \end{equation} (23)In general, $$\boldsymbol{\mathcal {M}}$$ does not contain $$\boldsymbol{\mathcal {M}}^*$$  \begin{equation} \boldsymbol{\mathcal {M}}^*\not\subset \boldsymbol{\mathcal {M}}\,. \end{equation} (24)This can be due to the receivers and boundary correctors, but this might not be the only reason. Another important case where $$\boldsymbol{\mathcal {M}}^*$$ is not included in $$\boldsymbol{\mathcal {M}}$$ is related to the situation where $$\boldsymbol{\mathcal {M}}$$ is restricted to isotropic media. Because fine scale isotropic heterogeneities lead to anisotropic effective heterogeneities, $$\boldsymbol{\mathcal {M}}^*$$ contains anisotropic media, even for isotropic $$\boldsymbol{\mathcal {M}}$$: thus, $$\boldsymbol{\mathcal {M}}^*$$ might not be included in $$\boldsymbol{\mathcal {M}}$$. Finally, a model $$\mathbf {m}^*\in \boldsymbol{\mathcal {M}}^*$$ can be written as   \begin{eqnarray} &&{ \mathbf {m}^*=\left\lbrace (\rho ^*(\mathbf {x}),\mathbf {c}^*(\mathbf {x})), \ \forall \mathbf {x}\in \boldsymbol{\Omega }^*\,; \right. (\mathbf {x}_s,\mathbf {q}^*_s), s\in \lbrace 1,...,N_s\rbrace ; } \nonumber\\ &&\boldsymbol{\chi }\left(\mathbf {x}_r,\frac{\mathbf {x}_r}{\varepsilon _0}\right), r\in \lbrace 1,...,N_r\rbrace ; \left. {\boldsymbol{\Gamma }}^*(\mathbf {x}), \ \forall \mathbf {x}\in \partial \boldsymbol{\Omega }^*\right\rbrace. \end{eqnarray} (25) 4 THE HOMOGENIZATION FULL WAVEFORM INVERSION (HFWI) PROBLEM Based on the homogenization framework, we can define another misfit function: for any model $$\mathbf {m}^*\in \boldsymbol{\mathcal {M}}^*,$$  \begin{equation} E^*(\mathbf {m}^*)=\sum _{r,s}\int _0^T\left(\mathbf {d}_s^{f_{\rm m}}\left(\mathbf {x}_r,t)-\tilde{\mathbf {u}}^*(\mathbf {x}_r,\frac{\mathbf {x}_r}{\varepsilon _0},t;\mathbf {m}^*\vert _s\right)\right)^2\,{\rm d}t,\!\! \end{equation} (26)where   \begin{eqnarray} \tilde{\mathbf {u}}^*\left(\mathbf {x},\frac{\mathbf {x}}{\varepsilon _0},t;\mathbf {m}^*\vert _s\right) {=} \mathbf {u}^*(\mathbf {x},t;\mathbf {m}^*\vert _s) {+} \varepsilon _0\boldsymbol{\chi }\left(\mathbf {x},\frac{\mathbf {x}}{\varepsilon _0}\right):\boldsymbol{\nabla }\mathbf {u}^*(\mathbf {x},t;\mathbf {m}^*\vert _s) \nonumber\\ \end{eqnarray} (27)is the zero order effective displacement plus the first order corrector at x. The associated effective (or homogenized) minimization problem is   \begin{equation} \bar{\mathbf {m}}^*= \mathop{arg\min }\limits_{\mathbf {m}^*\in \boldsymbol{\mathcal {M}}^*} E^*(\mathbf {m}^*)\,. \end{equation} (28)Eqs (26)–(28) define the homogenized FWI (HFWI) problem. For a fine scale model $$\mathbf {m}\in \boldsymbol{\mathcal {M}}$$ and its homogenized version $$\mathbf {m}^*=\boldsymbol{\mathcal {H}}(\mathbf {m})$$, using (17), we have   \begin{equation} E^{f_{\rm m}}(\mathbf {m})=E^*(\mathbf {m}^*)+O(\varepsilon _0^2)\,. \end{equation} (29)Knowing that mt minimizes E and obviously also minimizes $$E^{f_{\rm m}}$$, we deduce from the last equation that $$\mathbf {m}_t^*=\boldsymbol{\mathcal {H}}(\mathbf {m}_t)$$ is a solution of the HFWI problem (28). The solution of the HFWI problem therefore belongs to $$\boldsymbol{\mathcal {M}}^*$$ by construction. This fact is not true for many $$\boldsymbol{\mathcal {M}}^h$$ choices for classical FWI approaches. For example, $$\boldsymbol{\mathcal {M}}^h$$ is often limited to isotropic media whereas the solution of the inverse problem is anisotropic due to upscaling effects, implying the solution does not belong to $$\boldsymbol{\mathcal {M}}^h$$. This is an important result: the homogenized version of the true model is a solution of the HFWI problem. If we then assume that the solution of the HFWI problem is unique, this leads to establish the relation between the true model and the limited frequency band inverse problem solution   \begin{equation} \bar{\mathbf {m}}^*= \boldsymbol{\mathcal {H}}(\mathbf {m}_t)\,. \end{equation} (30)Assuming the uniqueness of the solution of the HFWI problem is a strong statement. Nevertheless, this uniqueness can be reached, as it has been illustrated numerically in the layered media case (Capdeville et al.2013). We show in this study that uniqueness of the solution is also possible in the case of 2-D spatially varying media, at least for low noise and good data coverage. If this assumption is not met, which can happen in many situations (bad coverage, bad quality data ...), we fall back to a classical local minimum problem. In such cases, similarly to most other local optimization methods, little can be done apart from using better data coverage, better quality data, relying on a better starting model, or introducing a priori information. Compared to more classical finite dimensional space approximation $$\boldsymbol{\mathcal {M}}^h$$ used to solve the FWI, using $$\boldsymbol{\mathcal {M}}^*$$ and solving the HFWI problem thus presents the following advantages: similarly to $$\boldsymbol{\mathcal {M}}^h$$, $$\boldsymbol{\mathcal {M}}^*$$ is a finite dimensional space (see Appendix A) rendering the HFWI problem solvable; the relation between the solution of the HFWI problem and the true model is known through (30). This point is very important for the interpretation of the inverted model (also called ‘downscaling’ step); the solution of the HFWI problem exists and belongs to $$\boldsymbol{\mathcal {M}}^*$$. In the next section, we present how to practically solve the HFWI problem. 5 HFWI PRACTICAL CONSIDERATIONS 5.1 The minimization scheme We rely here on the least squares approach and local optimization algorithms, a pragmatic and well adapted choice for the HFWI problem. In this study, we implement the Gauss–Newton iterative scheme (see for instance Tarantola & Valette 1982). Given mi, the inverted model at iteration i, we obtain the model at the iteration i + 1 as   \begin{equation} \mathbf {m}^{i+1}=\mathbf {m}^i+\left({(\mathbf {F}^i)}^{\intercal }\cdot \mathbf {F}^i +{\boldsymbol{\lambda }}^i \right)^{-1}\left[ {(\mathbf {F}^i)}^{\intercal }\cdot (\mathbf {d}-\tilde{\mathbf {u}}^*(\mathbf {m}^i))\right], \end{equation} (31)where d is the data vector for all sources, receivers and components, $$\tilde{\mathbf {u}}^*(\mathbf {m}^i)$$ is the effective synthetic seismograms computed in model mi also for all sources, receivers and components, Fi is the partial derivative matrix (matrix of the Fréchet derivatives)   \begin{equation} \mathbf {F}^i=\left[\frac{\partial \tilde{\mathbf {u}}^*}{\partial \mathbf {m}}\right]_{\mathbf {m}=\mathbf {m}^i}\,. \end{equation} (32)The damping parameter $${\boldsymbol{\lambda }}^i$$ is used to stabilize the inversion of the approximate Hessian (Fi)⊺ · Fi. We rely on the adjoint technique to numerically evaluate the waveform partial derivative Fi (Tarantola 1984; Pratt et al.1998). Practically, the full wavefields for each source and adjoint source are stored on disk and the partial derivatives are assembled afterward. For modest size 2-D tests, this solution can be efficiently implemented in terms of memory requirement and computational resources. The damping $${\boldsymbol{\lambda }}^i$$ is decreased while the iteration number increases until convergence. Whatever the technical choices made to solve (28), it implies describing the homogenized model space $$\boldsymbol{\mathcal {M}}^*$$. The most direct way to do so would be to set up an explicitly parametrization of this space. We consider two cases: the layered media case, for which such an explicit parametrization already exists, and the general case, for which it is not yet available. 5.2 Parametrization: the layered media case The layered media case is a special case in the sense that there exists an analytical solution to the cell problem (A3). VTI layered media can be described by five elastic parameters A(z), C(z), F(z), L(z), F(z) (e.g. Stoneley 1949) where z is the axis perpendicular to the layering. In such a case, the homogenization operator $$\boldsymbol{\mathcal {H}}$$ consists of the three following steps (Backus 1962; Capdeville & Marigo 2007): build the Backus parameter vector:   \begin{equation} \mathbf {p}(z)=\left(\frac{1}{C},\frac{1}{L},A-\frac{F^2}{C},\frac{F}{C},N,\rho \right)(z); \end{equation} (33) upscale the Backus vector with a simple spatial low-pass filtering (see Appendix A; this low-pass filtering in space shall not be confused with the time domain low-pass filtering used previously to limit the data frequency band):   \begin{equation} \mathbf {p}^*(z)=\mathcal {F}^{k_0}(\mathbf {p})(z); \end{equation} (34) build back the effective transversely isotropic coefficients:   \begin{eqnarray} (A^*,C^*,F^*,L^*,N^*,\rho ^*)(z)\!=\!\left(p_3^*\!+\!\frac{p_4^*}{p_1^*},\!\frac{1}{p_1^*},\!\frac{p_4^*}{p_1^*} ,\!\frac{1}{p_2^*},p_5^*,p_6^* \right)(z)\,. \nonumber\\ \end{eqnarray} (35) Note that eq. (34) is equivalent to muting coefficients in the Fourier domain for spatial frequencies larger than k0. If $$\tilde{\mathbf {p}}$$ are the Fourier coefficients of p,   \begin{equation} \tilde{\mathbf {p}}_k=\frac{1}{L_z}\int _{L_z}\mathbf {p}(z){\rm e}^{-ikz} {\rm d}z, \end{equation} (36)where Lz is the layered domain, then, eq. (34) corresponds to   \begin{equation} \mathbf {p}^*(z)=\sum _{\vert k\vert < k_0}\tilde{w}_k\tilde{\mathbf {p}}_k\, {\rm e}^{ikz}, \end{equation} (37)where $$\tilde{w}_k$$ are the Fourier coefficient of the filtering wavelet hidden in $$\boldsymbol{\mathcal {F}}^{k_0}$$. In such a case, we can define a parametrization based on the Backus parameter in the Fourier coefficient domain. Forgetting about receiver and boundary correctors, we can build $$\boldsymbol{\mathcal {M}}^*$$ as   \begin{equation} \boldsymbol{\mathcal {M}}^*=\left\lbrace {\rm all\, admissible}\,\tilde{\mathbf {p}}_k, k\in \lbrace -k_0, ..., k_0\rbrace \right\rbrace, \end{equation} (38)which defines a finite dimensional explicit parametrization of $$\boldsymbol{\mathcal {M}}^*$$. A similar parametrization has been used by Capdeville et al. (2013) (but with Lagrange polynomial instead of Fourier polynomial) to numerically illustrate that an FWI model is indeed an homogenized model in the layered media case. 5.3 Parametrization: the general media case For more general media, an explicit parametrization of the homogenized model space is more difficult to set up. This is due to the fact that the analytical solution to the cell problem (A3) used in the layered media case does not exist for higher dimension cases (2-D and 3-D heterogeneities). It could be nevertheless possible to design an explicit $$\boldsymbol{\mathcal {M}}^*$$ parametrization and this is a possibility we intend to study in the future. In this study, we rely on a simpler idea: instead of searching for a solution directly in $$\boldsymbol{\mathcal {M}}^*$$, we rely on an approximate $$\boldsymbol{\mathcal {M}}^{*h}$$ and use the homogenization operator $$\boldsymbol{\mathcal {H}}$$ to project the solution back to $$\boldsymbol{\mathcal {M}}^{*}$$. For iterative schemes such as the Gauss–Newton algorithm used in the present work, the projection can be done at each iteration (see Fig. 2), or only at convergence, as needed. If an increasing frequency band is used through the iterations (for frequency-grid continuation (Kolb et al.1986; Bunks et al.1995)), the projection needs to be done at least at each frequency band jump. This is a simple solution, which comes with the following drawback: depending on the practical choices made for $$\boldsymbol{\mathcal {M}}^{*h}$$, the iterative inversion scheme may fail to converge if $$\boldsymbol{\mathcal {M}}^{*h}$$ restricts too much the update from progressing toward the solution once the new model is homogenized. Unfortunately, we are not able yet to provide a precise criteria on $$\boldsymbol{\mathcal {M}}^{*h}$$ to ensure the convergence. Figure 2. View largeDownload slide Flowchart used to apply the homogenization projection in the Gauss–Newton scheme. Figure 2. View largeDownload slide Flowchart used to apply the homogenization projection in the Gauss–Newton scheme. Indeed, there are an infinite number of possibilities for practically setting up an explicit parametrization for $$\boldsymbol{\mathcal {M}}^{*h}$$. In the following examples, limiting ourselves to 2-D cases, we rely on a piece-wise polynomial basis for the spatial parametrization. More precisely, a square inversion sub-domain $$\boldsymbol{\mathcal {I}}\subset \boldsymbol{\Omega }$$ is chosen and divided into n × n non-overlapping elements:   \begin{equation*} \boldsymbol{\mathcal {I}}=\sum _{e=1}^{n^2} \boldsymbol{\mathcal {I}}_e^{n}. \end{equation*} An example of inversion domain $$\boldsymbol{\mathcal {I}}$$ and an associated inversion mesh for n = 10 is shown in Fig. 3(a). Over each element $$\boldsymbol{\mathcal {I}}_e^{n}$$, similar to what is done for spectral elements, elastic parameters and density are represented using a 2-D tensorial product polynomial approximation of degree N in each direction. This defines the parametrization $$\boldsymbol{\mathcal {P}}_n^N(\boldsymbol{\mathcal {I}})$$ (n × n elements of degree N × N) that we use for the numerical tests in the next section. We do not impose the continuity of the fields between elements, which implies that $$\boldsymbol{\mathcal {P}}_n^N(\boldsymbol{\mathcal {I}})$$ has n2 × (N + 1)2 degrees of freedom for each scalar. Figure 3. View largeDownload slide Weak and strong contrast circular inclusion tests configuration. (a) sketch of the elastic model and of the sources (stars) receivers (triangles) geometry used to generate the synthetic data to be inverted. The square in black dashed line represents the inverted domain $$\boldsymbol{\mathcal {I}}$$. The grey dashed lines represent the elements $$\boldsymbol{\mathcal {I}}_e^{10}$$ of an inversion mesh example. The dotted line is the cross-section line used for panel (c). (b) kinetic energy snapshot for one of the source (star symbol), for t = 11/fm for the strong heterogeneity case. The unstructured spectral element mesh used to solve the wave equation is overlaid in grey. (c) VS cross-section along the dotted line in plot (a) for weak and strong heterogeneity models. See Table 1 for a complete description of the elastic material properties of those models. Figure 3. View largeDownload slide Weak and strong contrast circular inclusion tests configuration. (a) sketch of the elastic model and of the sources (stars) receivers (triangles) geometry used to generate the synthetic data to be inverted. The square in black dashed line represents the inverted domain $$\boldsymbol{\mathcal {I}}$$. The grey dashed lines represent the elements $$\boldsymbol{\mathcal {I}}_e^{10}$$ of an inversion mesh example. The dotted line is the cross-section line used for panel (c). (b) kinetic energy snapshot for one of the source (star symbol), for t = 11/fm for the strong heterogeneity case. The unstructured spectral element mesh used to solve the wave equation is overlaid in grey. (c) VS cross-section along the dotted line in plot (a) for weak and strong heterogeneity models. See Table 1 for a complete description of the elastic material properties of those models. 6 SYNTHETIC INVERSION TESTS In this section, we present a series of examples illustrating the ideas presented above. More precisely, the objectives are to illustrate through numerical experiments: the validity of eq. (30) in the presented case studies; the influence of the practical choice of $$\boldsymbol{\mathcal {M}}^{h}$$ for standard FWI; the influence of the practical choice of $$\boldsymbol{\mathcal {M}}^{*h}$$ for HFWI; that the HFWI problem can be successful where a classical FWI fails. We run three synthetic inversions in the two heterogeneous models presented in Figs 3 and 4. In both ‘true’ elastic models, unknown heterogeneities are embedded in a known homogeneous isotropic background. The background density, P and S wave velocities are denoted by ρ0, VP0 and VS0 respectively. We use 16 sources and 16 receivers, regularly located around the heterogeneities. Figure 4. View largeDownload slide Faulted layers test configuration. (a) sketch of the elastic model and of the sources (stars) receivers (triangles) geometry used to generate the synthetic data to be inverted. The square in black dashed line represents the inverted domain $$\boldsymbol{\mathcal {I}}$$ and the dotted line is the cross-section line used for plot (c). (b) kinetic energy snapshot for one of the source (star symbol), for t = 11/fm for the strong heterogeneity case. The unstructured spectral element mesh used to solve the wave equation is overlapped in grey lines. (c) VS cross-section along the dotted line in plot (a). See Table 1 for a complete description of the elastic material properties of this model. Figure 4. View largeDownload slide Faulted layers test configuration. (a) sketch of the elastic model and of the sources (stars) receivers (triangles) geometry used to generate the synthetic data to be inverted. The square in black dashed line represents the inverted domain $$\boldsymbol{\mathcal {I}}$$ and the dotted line is the cross-section line used for plot (c). (b) kinetic energy snapshot for one of the source (star symbol), for t = 11/fm for the strong heterogeneity case. The unstructured spectral element mesh used to solve the wave equation is overlapped in grey lines. (c) VS cross-section along the dotted line in plot (a). See Table 1 for a complete description of the elastic material properties of this model. The source mechanism is the same for all sources: it is an horizontal vector force. The source time function gs(t) is a tapered door function in the frequency domain (see Fig. 5) which has a maximum frequency fm. This source time function is the same for all the sources. We define the background minimum wavelength as   \begin{equation} \lambda _{{\rm min}}=V_{S0}/f_{\rm m}. \end{equation} (39)In the following, all distances are measured as functions of λmin, the frequencies as functions of fm and the time as a function of 1/fm. For both settings, the domain size is $$16\times 16\ \lambda _{{\rm min}}^2$$ with absorbing boundary conditions around the domain. The inversion sub-domain $$\boldsymbol{\mathcal {I}}$$ is a square of dimension $$12\times 12\ \lambda _{{\rm min}}^2$$. The sources and receivers are located around $$\boldsymbol{\mathcal {I}}$$ leading to an excellent data coverage. Nevertheless this coverage is still sparse as the distance between each receiver and each source is equal to 3.2 λmin. Note that the sources and receivers are located outside of $$\boldsymbol{\mathcal {I}}$$ and that no free surface is present in the inverted domain. The idea of such settings is to avoid inverting for the receiver corrector $$\boldsymbol{\chi }$$, the source parameters as well as the boundary corrector $${\boldsymbol{\Gamma }}^*$$ and to focus on the reconstruction of ρ* and c*. Figure 5. View largeDownload slide Left: source time function gs amplitude spectra. The frequency axis unit is the maximum frequency fm. Right: source time function gs in the time domain. The time axis unit is 1/fm. Figure 5. View largeDownload slide Left: source time function gs amplitude spectra. The frequency axis unit is the maximum frequency fm. Right: source time function gs in the time domain. The time axis unit is 1/fm. For the first setting, (‘circular inclusions’, Fig. 3) two circular inclusions, one faster and one slower than the background medium are present. The faster inclusion is surrounded by a thin slow layer representing a damaged layer. The thickness of this layer is λmin/8. Both the sharpness of the circular inclusions and the thin damaged layer are fine scale and can only be recovered in an effective way by FWI. Two velocity contrasts are used for the inclusions in two different experiments, one labelled as ‘weak contrast’, the other one labelled as ‘strong contrast’ (see Table 1 and Fig. 3c). Table 1. Material properties used in the different tests.   VP (km s−1)  VS (km s−1)  ρ (103 kg m−3)  Background  5.6  3.17  2.61  Circular inclusions, weak heterogeneities case:  A  6.27  3.48  2.73  B & d  4.85  2.69  2.47  Circular inclusions, strong heterogeneities case:  A  7.41  4.12  3.05  B & d  2.3  1.58  2.35  Layered inclusion test:  Slow layers:  4.67  2.6  2.45  Fast layers:  6.72  3.74  2.85    VP (km s−1)  VS (km s−1)  ρ (103 kg m−3)  Background  5.6  3.17  2.61  Circular inclusions, weak heterogeneities case:  A  6.27  3.48  2.73  B & d  4.85  2.69  2.47  Circular inclusions, strong heterogeneities case:  A  7.41  4.12  3.05  B & d  2.3  1.58  2.35  Layered inclusion test:  Slow layers:  4.67  2.6  2.45  Fast layers:  6.72  3.74  2.85  View Large For the second inversion setting, (‘faulted layers’, Fig. 4), a horizontally layered medium is split by a tiled fault. The layer thickness is also λmin/8. Both the layers and the tilted fault are fine scale which can, again, only be recovered in an effective way by FWI. Choosing λmin = O(1cm), the ‘circular inclusions’ model could represent for instance a piece of concrete with an iron bar inclusion (the fast inclusion ‘A’) and two damaged areas (the slow inclusions ‘B’ and ‘d’), one circular and one around the bar. Choosing λmin = O(10 m), the ‘faulted layered’ model could represent a more geological situation where the sediments represented by the layers are offset by a tilted fault. In both cases, imaging as best as possible the heterogeneities is the challenge. For the first situation, the acquisition setting could be thought as ‘realistic’, however this is not the case for the second. Nevertheless, our objective here is to focus on the ideas developed in the previous sections without mixing with other important issues such as the data coverage. This will be the purpose of future works. For each experiment, the data to be inverted are generated using a 2-D spectral element solver (Komatitsch & Vilotte 1998), meshing all the interfaces of the heterogeneities (see Figs 3b and 4b). Perfectly Matched Layers (PMLs; Festa & Vilotte 2005) are used to absorb outgoing waves around the domain. No synthetic noise is added to the synthetic data. The spectral element mesh used to compute the approximate Hessian and the gradient in eq. (31) is a simple regular mesh based on the inversion parametrization (on the $$\boldsymbol{\mathcal {I}}_e^{n}$$ elements). In these synthetic tests, we avoid an ‘inverse crime’ by virtue of the following: first, the fact that the spectral element meshes used to generate the data to be inverted ($$\mathbf {d}^{f_{\rm m}}$$) and to model the data during the inversion process (u*) are not the same implies that the discretization errors are different. Second, the true model cannot be accurately represented with the inversion parametrization, which prevents us from injecting prior information about the inclusion shapes into the inversion. Nevertheless, the PML error (remaining small reflections from the domain boundaries) are the same for the data generation and inversion, and this error might be used by the inversion scheme. The following inversion tests often involve the full elastic tensor cijkl and its six independent coefficients (in 2-D). To present the results, we choose to first project anisotropic c to the nearest isotropic tensor ciso following Browaeys & Chevrot (2004). Then, P- and S-wave velocities are defined as   \begin{equation} V_P(\mathbf {x})=\sqrt{(c_{1111}^{\rm iso}(\mathbf {x})/\rho (\mathbf {x}))}\,, \end{equation} (40)  \begin{equation} V_S(\mathbf {x})=\sqrt{(c_{1212}^{\rm iso}(\mathbf {x})/\rho (\mathbf {x}))}\,, \end{equation} (41)and the total anisotropy as   \begin{equation} {\rm aniso}(\mathbf {x})=\frac{\sqrt{\sum _{ijkl}(c^{\rm iso}_{ijkl}(\mathbf {x})-c_{ijkl}(\mathbf {x}))^2}}{\sqrt{\sum _{ijkl}(c^{\rm iso}_{ijkl}(\mathbf {x}))^2}}. \end{equation} (42) 6.1 Circular inclusion weak heterogeneity test In this Section, we present a first test using the circular inclusion model with the ‘weak’ heterogeneities (see Fig. 3 and Table 1). The heterogeneities are chosen to be weak enough so that the Gauss–Newton scheme can converge without projecting to the homogenized model space at each iteration. In other word, the Gauss–Newton scheme is applied classically without using homogenization. The fine scales of the model such as the layer ‘d’ and also the sharp velocity jumps between the background media and the inclusions cannot be recovered by the inversion. Indeed they are below the resolution limit, and, technically, they cannot be represented in a spatial parametrization without prior knowledge about their shape (i.e. without explicitly meshing the discontinuities). This situation is realistic: in general, the heterogeneity geometry is unknown. We perform three different inversions each using a different model space $$\boldsymbol{\mathcal {M}}^h$$ parametrization: $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$: $$\boldsymbol{\mathcal {P}}_{30}^1(\boldsymbol{\mathcal {I}})$$ with ρ, VP and VS; $$\boldsymbol{\mathcal {M}}_{10}^{3,{\rm ani}}$$: $$\boldsymbol{\mathcal {P}}_{10}^3(\boldsymbol{\mathcal {I}})$$ with ρ and c; $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$: $$\boldsymbol{\mathcal {P}}_{20}^1(\boldsymbol{\mathcal {I}})$$ with ρ and c; where the notation $$\boldsymbol{\mathcal {P}}_n^N(\boldsymbol{\mathcal {I}})$$ for the decomposition of the domain $$\boldsymbol{\mathcal {I}}$$ in n × n elements of degree N × N is defined at the end of Section 5.3. These three parametrizations have roughly the same number of free parameters (unknowns), 302 × 22 × 3 = 10800 for $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$, 102 × 42 × 7 = 11200 for $$\boldsymbol{\mathcal {M}}_{10}^{3,{\rm ani}}$$ and 202 × 22 × 7 = 11200 for $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$. The three inversions are carried out and the Gauss–Newton scheme is stopped when a 98 per cent reduction in the square-root of the misfit function is reached. In each of the three cases, the inversion converges in about 10 iterations. The raw results of the inversions are presented in Figs 6 and 7. By raw results, we mean the inversion results before any homogenization. From these figures, it can be noted that the three inversions qualitatively retrieve the location and signs of the reference model heterogeneities. Nevertheless, quantitatively, the models are different from each other and all include a significant amount of noise. Indeed, from the VS maps (Fig. 6), the heterogeneous circles can be roughly located and the sign of the heterogeneity contrast (slow or fast) can be guessed. Even the slow layer around the inclusion ‘A’ seems to be reconstructed. Nevertheless, from the cross-sections (Fig. 7), it appears that the results are not accurate. The actual velocity values in each inclusion are not recovered. Moreover, the results strongly depend on the parametrization choice: the imprint of the inversion mesh is clearly visible and, for example, it is difficult to measure the value of the actual thickness and the VS value of the thin slow layer ‘d’. Figure 6. View largeDownload slide Weak contrast circular inclusion test raw inversion results. VS for the reference model (upper left panel), raw VS inversion results for the $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ parametrization (upper right panel), $$\boldsymbol{\mathcal {M}}_{10}^{3,{\rm ani}}$$ (lower left panel) and $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$ (lower right panel) parametrizations are plotted. For $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ and $$\boldsymbol{\mathcal {M}}_{10}^{3,{\rm ani}}$$, VS is defined following eq. (41). Figure 6. View largeDownload slide Weak contrast circular inclusion test raw inversion results. VS for the reference model (upper left panel), raw VS inversion results for the $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ parametrization (upper right panel), $$\boldsymbol{\mathcal {M}}_{10}^{3,{\rm ani}}$$ (lower left panel) and $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$ (lower right panel) parametrizations are plotted. For $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ and $$\boldsymbol{\mathcal {M}}_{10}^{3,{\rm ani}}$$, VS is defined following eq. (41). Figure 7. View largeDownload slide Weak contrast circular inclusion test raw inversion results. Cross-sections along the dotted line shown in Fig. 3(a), for the reference model (‘ref’) and for the raw inversion results for the $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$, $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ and $$\boldsymbol{\mathcal {M}}_{10}^{3,{\rm ani}}$$ parametrizations. VS, VP, ρ and the total anisotropy are presented. For anisotropic parametrizations, VP, VS and the total anisotropy are defined in eqs (40)–(42). Figure 7. View largeDownload slide Weak contrast circular inclusion test raw inversion results. Cross-sections along the dotted line shown in Fig. 3(a), for the reference model (‘ref’) and for the raw inversion results for the $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$, $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ and $$\boldsymbol{\mathcal {M}}_{10}^{3,{\rm ani}}$$ parametrizations. VS, VP, ρ and the total anisotropy are presented. For anisotropic parametrizations, VP, VS and the total anisotropy are defined in eqs (40)–(42). We next follow the main idea of this paper and project all the inversions result as well as the reference model to the homogenized space using the homogenization operator $$\boldsymbol{\mathcal {H}}$$. To this purpose, we need to give a value to the ε0 parameter (see eq. 16). As mentioned earlier, when dealing with forward modelling problems, ε0 = 0.5 is often a good choice. However, for this projection step any value can be chosen in practice. To focus on long wavelength results, we choose ε0 = 2, to study results at the resolution limit, ε0 = 0.5 is used while for intermediate results we choose ε0 = 1. The homogenized reference model and those resulting from the three inversions are presented in Figs 8 and 9. From the 2-D maps obtained for ε0 = 1, it appears clearly that the four models are the same once homogenized. This is quantitatively confirmed with the model cross-sections (Fig. 9) plotted for three different values of ε0 (2, 1 and 0.5). For large ε0 (low resolution), the models are all the same within a small error. For small ε0 (high resolution), the models are still in good agreement, but with a larger dispersion. The fact that the error becomes larger with smaller ε0, for a fixed maximum frequency fm, is expected: when reaching the resolution limit (≃ λmin/2, which corresponds to ε0 = 0.5), the wavefield sensitivity to the heterogeneities is lower, leading to a poor constraint on their amplitude. A good compromise between resolution and accuracy seems to be here ε0 = 1. Figure 8. View largeDownload slide Weak contrast circular inclusion test reference model and three different inversion results, all homogenized with ε0 = 1 (see Section 6.1). VS, VP, ρ and the ‘total anisotropy’ are presented (lines of panels from top to bottom, respectively ). Reference, $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$, $$\boldsymbol{\mathcal {M}}_{10}^{3,{\rm ani}}$$ and $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$ inverted models are plotted (left to right columns of panels, respectively). VS, VP and the ‘total anisotropy’ are obtained following eqs (40)–(42). Figure 8. View largeDownload slide Weak contrast circular inclusion test reference model and three different inversion results, all homogenized with ε0 = 1 (see Section 6.1). VS, VP, ρ and the ‘total anisotropy’ are presented (lines of panels from top to bottom, respectively ). Reference, $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$, $$\boldsymbol{\mathcal {M}}_{10}^{3,{\rm ani}}$$ and $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$ inverted models are plotted (left to right columns of panels, respectively). VS, VP and the ‘total anisotropy’ are obtained following eqs (40)–(42). Figure 9. View largeDownload slide Weak contrast circular inclusion test reference and homogenized inversion results. Cross-sections along the dotted line in Fig. 3(a) for the reference model (‘ref’) and for the inversion results for the $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$, $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ and $$\boldsymbol{\mathcal {M}}_{10}^{3,{\rm ani}}$$ parametrizations are represented. Three different values of ε0 are used (ε0 = 2, 1 and 0.5). VP, VS and the ‘total anisotropy’, obtained following eqs (40)–(42), are represented. Figure 9. View largeDownload slide Weak contrast circular inclusion test reference and homogenized inversion results. Cross-sections along the dotted line in Fig. 3(a) for the reference model (‘ref’) and for the inversion results for the $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$, $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ and $$\boldsymbol{\mathcal {M}}_{10}^{3,{\rm ani}}$$ parametrizations are represented. Three different values of ε0 are used (ε0 = 2, 1 and 0.5). VP, VS and the ‘total anisotropy’, obtained following eqs (40)–(42), are represented. It is interesting to note that the isotropic inversion ($$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$) is able to retrieve the expected anisotropy once homogenized (even if its amplitude is slightly too low). It implies that the inversion is able to find an isotropic model, that is not the true model, but that once homogenized is in good agreement with the true homogenized model. Finally, it can also be noted that the density is retrieved relatively accurately (the density results seem to be more noisy, but this should be related to the fact that the density contrast is quite small compared to the velocity contrasts associated with VS and VP). From this first experiment, we can conclude that the elastic models resulting from FWI are definitively not unique: they depend on the particular choice of parametrization for $$\boldsymbol{\mathcal {M}}^h$$. However, the homogenized versions of these models are actually all the same. 6.2 Faulted layers test A case study similar to the previous one is presented here, with the faulted layers model described in Fig. 4. Only two parametrizations are tested: $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ and $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$. The inversions are carried out using the same settings as the one defined for the previous case study, with the same misfit reduction stopping criteria. The $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ inversion converges within 20 iterations, but the $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$ inversion stops progressing after four iterations and only a 50 per cent misfit reduction is achieved. As it can be seen in the raw inversion results (Figs 10 and 11), it is not possible to find a model explaining the data in $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$, at least with a local optimization algorithm such as the Gauss–Newton scheme used here. Figure 10. View largeDownload slide Faulted layers test raw inversion results. VS for the reference model (left panel), raw VS inversion results for the $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ parametrization (middle panel) and $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$ (right panel) parametrizations are plotted. For the anisotropic parametrization, VS is defined following eq. (41). Figure 10. View largeDownload slide Faulted layers test raw inversion results. VS for the reference model (left panel), raw VS inversion results for the $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ parametrization (middle panel) and $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$ (right panel) parametrizations are plotted. For the anisotropic parametrization, VS is defined following eq. (41). Figure 11. View largeDownload slide Faulted layers test raw inversion results. Cross-sections along the dotted line shown in Fig. 4(a), for the reference model (‘ref’) and for the raw inversion results for the $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$ and for $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ parametrizations. VS, VP, ρ and the total anisotropy are presented. For anisotropic parametrization, VP, VS and the total anisotropy are defined following eqs (40)–(42). Figure 11. View largeDownload slide Faulted layers test raw inversion results. Cross-sections along the dotted line shown in Fig. 4(a), for the reference model (‘ref’) and for the raw inversion results for the $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$ and for $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ parametrizations. VS, VP, ρ and the total anisotropy are presented. For anisotropic parametrization, VP, VS and the total anisotropy are defined following eqs (40)–(42). To test if the homogenized inversion principle can help in this case, we use the algorithm presented in Fig. 2: at each Gauss–Newton iteration, we project the obtained model to the homogenization space with the $$\boldsymbol{\mathcal {H}}$$ operator using ε0 = 1/3. We therefore use $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$ for $$\boldsymbol{\mathcal {M}}^{*h}$$. In the following, we refer to this inversion as the $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso},*}$$ parametrization inversion. Inserting this projection step, the inversion almost converges (a 96 per cent misfit reduction was obtained) after about 50 Gauss–Newton iterations. Of course, there is here a little interest in using the homogenized inversion compared to inverting in $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ (the inversion convergence is slow), but it shows that projecting to the homogenized space can help to recover a failing inversion. At this stage, we cannot conclude if this observation goes beyond what could be obtained with simple Tikhonov regularization (see e.g. Brenders & Pratt (2007)) and this will need to be investigated deeper in future works. This $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso},*}$$ inversion is the first presented example of the HFWI algorithm. The raw results for the $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ and $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$ inversions are the ones presented in Figs 10 and 11. There is no raw results for the $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso},*}$$ inversion because each iteration is homogenized. From these raw results, we see that, differently to the circular inclusions test, it is only possible to tell that an heterogeneity is present in the layered area, without any visible or qualitative detail on it. We can only tell that, for the $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ inversion, the layered area appears as a slow velocity area. It is not possible to guess anything about the layering and even less about the fault. Following the main idea of this study, Figs 12 and 13 depict the homogenized true model, the homogenized $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$, $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$ and $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso},*}$$ inverted models, for ε0 = 1. Similarly to the previous test, the $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ inverted model is, once homogenized, in very good agreement with the homogenized true model. These plots also confirm that the $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$ has failed to converge to a correct model. Nevertheless, when the homogenization is used at each Gauss–Newton iteration, the same inversion $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso},*}$$ result is, once homogenized with ε0 = 1, in very good agreement with the homogenized true model. For the $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ and $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso},*}$$ inversions, the position of the fault can also be guessed from the total anisotropy 2-D plot (Fig. 12, bottom panels). Figure 12. View largeDownload slide Faulted layers test reference model and inversion results, all homogenized with ε0 = 1 (see Section 6.2). VS, VP, ρ and the ‘total anisotropy’ are presented (lines of panels from top to bottom, respectively ). Reference, $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$, $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso},*}$$ and $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$ inverted models are plotted (left to right columns of panels, respectively). VS, VP, and the ‘total anisotropy’ are obtained following eqs (40)–(42). Figure 12. View largeDownload slide Faulted layers test reference model and inversion results, all homogenized with ε0 = 1 (see Section 6.2). VS, VP, ρ and the ‘total anisotropy’ are presented (lines of panels from top to bottom, respectively ). Reference, $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$, $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso},*}$$ and $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$ inverted models are plotted (left to right columns of panels, respectively). VS, VP, and the ‘total anisotropy’ are obtained following eqs (40)–(42). Figure 13. View largeDownload slide Faulted layers test reference and inversion homogenized models. Cross-sections along the dotted line in Fig. 4(a) for ε0 = 1 are plotted. VP, VS and the ‘total anisotropy’ are obtained following eqs (40)–(42). Figure 13. View largeDownload slide Faulted layers test reference and inversion homogenized models. Cross-sections along the dotted line in Fig. 4(a) for ε0 = 1 are plotted. VP, VS and the ‘total anisotropy’ are obtained following eqs (40)–(42). From this test, we conclude that, depending on the model we want to retrieve, the choice of $$\boldsymbol{\mathcal {M}}^h$$ can influence the convergence of a classical FWI. In particular, choosing an anisotropic parametrization is a good option, even if it seems apparently more difficult. This example illustrates the advantage of projecting to the homogenized space at each iteration of the inversion. Finally, Fig. 14 depicts horizontal cross-sections in the ε0 = 1 homogenized reference and $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ inverted models for the c1112 elastic coefficient. In a vertically transverse isotropic (VTI) model, we have c1112 = 0. Knowing that a horizontally layered model becomes a VTI model through homogenization and that the faulted layers test model is horizontally layered everywhere but near the fault, we expect c1112 to be zero except near the fault. This is indeed what is observed in Fig. 14 from both the homogenized reference and inverted models. It implies that, provided that we have the prior information that there is potentially a fault in the medium, it would be possible to identify it from the inversion, even at this low resolution. Such an interpretation of the inverted results can be seen as a downscaling inverse problem and is discussed in Section 7. Figure 14. View largeDownload slide Faulted layers test inversion result. Cross-sections along an horizontal line for y = 4λmin in $$c_{1112}^*$$ for the reference model and the $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ inverted model, both homogenized with ε0 = 1. Figure 14. View largeDownload slide Faulted layers test inversion result. Cross-sections along an horizontal line for y = 4λmin in $$c_{1112}^*$$ for the reference model and the $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ inverted model, both homogenized with ε0 = 1. 6.3 Circular inclusion strong heterogeneity test Finally, for the last test, we come back to the circular inclusion configuration already used in Section 6.1, with stronger velocity contrasts, especially for the slow regions (see Fig. 3 and Table 1). The effect of the heterogeneity on the waveforms is large and none of the parametrizations already used, with or without homogenization at each step, is able to converge and to avoid falling in a local minimum. To resolve this problem, we rely on the classical frequency-grid continuation solution (Kolb et al.1986; Bunks et al.1995). We first perform the inversion in a low frequency band [0, fm/2] and use the homogenized inversion results to start the inversion in the full frequency band [0, fm]. Applying these two successive frequency band inversions, the simple Gauss–Newton scheme (without homogenization) systematically proposes a non-physical model after a few iterations and fails to converge. We therefore rely on the homogenized Gauss–Newton scheme and use $$\boldsymbol{\mathcal {M}}_{20}^{0,{\rm ani},*}$$ for the low-frequency band step and then $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani},*}$$ for the full-frequency band step. In this framework, the convergence rate is slower than for the first experiments (about 20 iterations for the first step and 40 for the second), however the misfit function is decreased down to the 99 per cent objective and the convergence is reached. The final results of the inversion, homogenized with ε0 = 1, are summarized in Fig. 15. Once again, a very good agreement between the homogenized true and the homogenized inverted models is obtained. The high amplitude of the effective anisotropy (up to 30 per cent), knowing the isotropy of the underlying thin scale model shall be noted. The different anisotropy pattern between heterogeneities with or without a slow layers around them could help to find their finer scale characteristics in a downscaling process. Figure 15. View largeDownload slide Strong contrast circular inclusion test reference model and inversion results, both homogenized with ε0 = 1 (see Section 6.3). On the left column panels are presented maps of VS (top panel) and the ‘total anisotropy’ (bottom panel) of the reference model and, on the middle column, the one for the inverted model with the $$\boldsymbol{\mathcal {M}}_{20}^{0,{\rm ani},*}$$ parametrization. On the right column are plotted cross-sections along the dotted line seen on the maps presented in the panel from right and left columns for VS (top panel) and the ‘total anisotropy’ (bottom panel). Figure 15. View largeDownload slide Strong contrast circular inclusion test reference model and inversion results, both homogenized with ε0 = 1 (see Section 6.3). On the left column panels are presented maps of VS (top panel) and the ‘total anisotropy’ (bottom panel) of the reference model and, on the middle column, the one for the inverted model with the $$\boldsymbol{\mathcal {M}}_{20}^{0,{\rm ani},*}$$ parametrization. On the right column are plotted cross-sections along the dotted line seen on the maps presented in the panel from right and left columns for VS (top panel) and the ‘total anisotropy’ (bottom panel). From this last case study, we conclude that the homogenization can help FWI to successfully converge in this difficult case. 7 DISCUSSION One of the objectives of this paper is to extend the results obtained in the layered media case (Capdeville et al.2013), to higher dimension cases (here 2-D), which illustrates the fact that the model obtained from FWI are, at best, an homogenized version of the true model. It establishes the relation between the inverted and true models for limited frequency band data. Even if this relation might not always be true, the tests performed here seem to confirm this assumption: in cases where the inversion converges without homogenizing at each Gauss–Newton step, the inverted raw models are clearly parametrization dependent. Nevertheless, once homogenized, the different models are all the same and consistent with the homogenized true model. We interpret this observation in the following way: the differences between raw inversion results depend on fine scales in the parametrization that are not uniquely constrained by the data. In other words, these fine scales may be necessary to explain the data, but they are not unique and depend on the parametrization. For example, using an isotropic parametrization (such as $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$ in Section 6.1), knowing that the effective medium behave in an anisotropic way, might not prevent the convergence of the inversion. But this strategy may find isotropic fine scales allowed by the parametrization which will try to emulate the correct anisotropy in the effective sense. Depending on the parametrization, these fine scales will be different, but they will all have the same effective effect. We conclude that FWI, indeed, finds the homogenized model and not the true model. The next related important point is to ensure that the assumption that the solution to the homogenized inverse problem is the homogenized true model (eq. 30) is true. Once again, for all the examples tested here, this relation seems to be satisfied. A corollary of this observation is that, in the homogenized space, the FWI inversion has a unique solution. This is true at least in the presented example and all the other tests we have done so far. This is definitely not a proof but, at least, it shows that it is a reasonable assumption. Cases breaking the uniqueness are more likely to be found for poor data coverage or noisy data. In such cases, the relation between the true and inverted models through homogenization is broken which is a serious issue for the interpretation stage. But this is not specific to HFWI. This framework gives some new insight about the FWI process, even if the homogenization is not used. For example, many FWI inversion scheme makes use of spatial smoothing to regularize their gradient update, especially near the sources and receivers (Williamson et al.2011; Trinh et al.2017). Based on our results, knowing that, for low-velocity contrast, the homogenization operator $$\boldsymbol{\mathcal {H}}$$ is close to a simple low-pass spatial filtering $$\boldsymbol{\mathcal {F}}$$, it makes sense to low pass-filter results from an FWI: for weakly contrasted models, this is equivalent to homogenizing the inverted model. Another example where the presented work gives a better understanding about common practice is the frequency-grid continuation methods (Kolb et al.1986; Bunks et al.1995). Using lower frequencies reduces the homogenized model space dimensional (see eq. A9), and, if low enough frequencies are present in the data, a misfit function with a single global and local minimum can be reached. Our work also raises questions about other practices: for instance, inverting for fine scales, such as sharp interfaces, based on limited frequency band data, appears to be counterproductive. Indeed, even if, for instance, the true model contains sharp interfaces, inverting for these interfaces makes the problems strongly nonlinear. More importantly, the imprint of these interfaces in the bandpassed seismic data is very weak, making the inverse problem poorly constrained. It thus appears more reasonable to invert for a smooth anisotropic media and leave the interpretation of the inverted model (such as the detection of sharp interfaces) to a second-step downscaling inversion. The resolution of FWI or HFWI is directly linked to λmin and to the choice of ε0. In the forward modelling context, for a fixed λmin value, depending on the desired accuracy, the number of propagated wavelengths and on the model velocity contrast, a good choice for ε0 lies between 0.5 and 0.25 (Capdeville et al.2010b). We could thus deduce that the FWI resolution can reach λmin/2 to λmin/4 resolution. Nevertheless, the different tests performed here show that with ε0 = 0.5, the noise on the results is significant, increasing the uncertainty on the reconstruction (leading to large error bars). It is therefore probably best to limit the results to larger ε0, such as ε0 = 1, for which the error appears lower. This could be potentially improved by increasing the number of sources and receivers, but this observation testifies that the sensitivity of the waveforms to heterogeneities decrease strongly for ε0 close to 0.5, and that the resolution limit should be set closer to 1 than 0.5. Note that these numbers actually depend on where the actual maximum frequency fm is placed: in Fig. 5, fm is really the maximum frequency, but its position could have been chosen at the end of the plateau (≃ 0.8 of the current fm). This would give an apparent more optimistic value for the resolution limit of the inversion, but it would not change the final images. This should be kept in mind when discussing the FWI resolution limits as, due to possible difference in the definitions, the λmin/2 resolution of one can be the λmin/4 of another. One important notion, both for the homogenization and for the inverse problem, is the minimum wavelength λmin. Here, we have assumed that λmin is roughly constant all over the domain. For more realistic situation, especially for geological media, λmin may vary dramatically over the inverted domain, especially with depth. Such a case is not a real issue but an adaptive homogenization, that already exists for layered media (Capdeville et al.2013,appendix B), would be needed for 2-D and 3-D media. Media that do not present a minimum wavelength for some frequencies are more problematic and cannot be inverted with any FWI methods (FWI or HFWI). Media with Helmholtz resonator inclusions (Zhao et al.2016), but also with fluid or void inclusion of size comparable or larger to the minimum wavelength are among those. Being able to apply FWI or HFWI schemes to such media would require to use very low frequencies only (such that a minimum wavelength exists anyway) or a drastically different approach. Another difficulty related to λmin is that, in the presented algorithm and examples, it has been assumed to be known a priori. For realistic situations, this often will not be true. Indeed, if the starting model is far enough from the homogenized true model, the true λmin can be significantly different from the initial estimate and therefore the homogenization may be poorly tuned. In such a case, the λmin is part of the unknown which makes the whole process implicit. An iterative scheme is therefore necessary to find the correct λmin. This is probably not a great difficulty but it will need to be tested in a future work. As it has been shown in the examples, the parametrization choice for $$\boldsymbol{\mathcal {M}}^h$$ or $$\boldsymbol{\mathcal {M}}^{h,*}$$ has an influence on the raw results, but not on the homogenized results, if the inversion has converged. It appears that, for relatively weak velocity contrast target models, this choice is not critical as long as there are enough degrees of freedom in the parametrization. For intermediate velocity contrast target models, classical FWI still converge, but using a fully anisotropic parametrization ($$\boldsymbol{\mathcal {M}}_n^{N,{\rm ani}}$$) is necessary. In any case, we found it is safer to use a fully anisotropic parametrization than a isotropic parametrization. Once the fully anisotropic tensor is inverted for, the spatial parametrization is not really critical as long as it provides between 3 to 4 degrees of freedom per wavelength in one direction. We nevertheless observe a slightly better stability of the inversion for low polynomial degree (e.g. using $$\boldsymbol{\mathcal {P}}_{20}^1$$ is more stable than $$\boldsymbol{\mathcal {P}}_{10}^3$$ for the same number of degree of freedom). An interesting consequence is that a rough parametrization, such as square blocks containing constant elastic properties, performs very well in terms of stability and convergence, better than a high order polynomial parametrization, as long as there are enough degrees of freedom per wavelength. In any case, once homogenized, the results correspond to the homogenized true model. As we saw above, the model resulting from HFWI are homogenized models and cannot be true models. Such models are well suited for data prediction. This could be very useful in many situations, such as source localization and moment tensor inversions. But they are poorly suited to geological interpretation. Indeed, because true interfaces are blurred and apparent anisotropy is introduced, interpretation can be difficult and misleading (for an example, see Capdeville et al.2013). Note that this issue is not specific to HFWI: any seismic inversion based on limited frequency band data is subjected to this issue. At least, with HFWI, it is made clear that the results should be interpreted with care. To properly interpret the homogenized model obtained through HFWI, we propose a second-step inverse problem: the downscaling or de-homogenizing inverse problem. The idea of such a new inverse problem is to find all the finer scale models both compatible with some a priori information and the HFWI model. Such an inversion solution is highly non-unique and needs to be carried out with a global search approach. The idea has already been successfully tested in the layered model case (Bodin et al.2015) and its development for higher dimension problems needs to be done. Combining HFWI and a downscaling inversion is an indirect way to address the full waveform inverse problem with a fully probabilistic approach. If the overall context of this work is seismology at all scales, our results can be extended to other domains, such as non-destructive testing at all scales with (almost) no modifications. Furthermore, although we have limited ourselves to the elastic case, the acoustic case is probably very similar and most of our conclusions would probably remain valid. Nevertheless, there are some important differences between the two cases related to the non-uniqueness of the inverse problem that would deserve a particular attention. 8 CONCLUSIONS AND PERSPECTIVES We have confirmed numerically that an FWI based on limited frequency band data can retrieve, at best, the homogenized version of the true model. The relation between the true and inverted models is known through eq. (30). This is an important progress as, for classical FWI scheme, this relation is not known. This relation opens the door to a proper interpretation of the inverted model with a downscaling inverse problem. To make the principle exposed here applicable to more realistic geological cases, many aspects still need to be explored. Because, in general, the source and receivers are in the inverted area, their associated correctors need to be inverted. Similarly, the free surface boundary corrector needs to be inverted. The fact that, in general, the minimum wavelength strongly varies over the domain also needs to be accounted for. The adaptive homogenization is potentially a good solution for such a problem. An explicit parametrization instead of the implicit one used here should be attempted. The general downscaling problem needs to be addressed and, finally, these works need to be extended to 3-D. All these aspects will be explored in future works. ACKNOWLEDGEMENTS We are very grateful to Gerhard Pratt and Michael Afanasiev for their positive and useful reviews that greatly helped to improve the manuscript. We thank R. Brossier, J. Virieux, A. Fichtner, T. Bodin and the European COST action TIDES (ES1401) for fruitful discussions. This work was funded by the HIWAI ANR project (ANR-16-CE31-0022-01). The computations were done on the Centre de Calcul Intensif des Pays de la Loire (CCIPL) computers. REFERENCES Afanasiev M., Peter D., Sager K., Simutė S., Ermert L., Krischer L., Fichtner A., 2015. Foundations for a multiscale collaborative earth model, Geophys. J. Int. , 204( 1), 39– 58. https://doi.org/10.1093/gji/ggv439 Google Scholar CrossRef Search ADS   Afanasiev M., Boehm C., May D., Fichtner A., 2016. Using effective medium theory to better constrain full waveform inversion, in 78th EAGE Conference and Exhibition 2016 , doi:10.3997/2214-4609.201601614. Alerini M., Lambaré G., Baina R., Podvin P., Bégat S.L., 2007. Two-dimensional PP/PS-stereotomography: P- and S-waves velocities estimation from OBC data, Geophys. J. Int. , 170, 725– 736. https://doi.org/10.1111/j.1365-246X.2007.03439.x Google Scholar CrossRef Search ADS   Backus G., 1962. Long-wave elastic anisotropy produced by horizontal layering, J. geophys. Res. , 67( 11), 4427– 4440. https://doi.org/10.1029/JZ067i011p04427 Google Scholar CrossRef Search ADS   Bensoussan A., Lions J.-L., Papanicolaou G., 1978. Asymptotic Analysis of Periodic Structures , North Holland. Beylkin G., 1987. Mathematical theory for seismic migration and spatial resolution, in Deconvolution and Inversion. , pp. 291– 304, eds Bernabini M., Carrion P., Jacovitti G., Rocca F., Treitel S., Worthington M., Blackwell Scientific Publications. Beylkin G., Burridge R., 1990. Linearized inverse scattering problems in acoustics and elasticity, Wave motion , 12, 15– 52. https://doi.org/10.1016/0165-2125(90)90017-X Google Scholar CrossRef Search ADS   Billette F., Lambaré G., 1998. Velocity macro-model estimation from seismic reflection data by stereotomography, Geophys. J. Int. , 135( 2), 671– 680. https://doi.org/10.1046/j.1365-246X.1998.00632.x Google Scholar CrossRef Search ADS   Blanc X., Le Bris C., Lions P.-L., 2007. Stochastic homogenization and random lattices, J. Math. Pures Appl. , 88( 1), 34– 63. https://doi.org/10.1016/j.matpur.2007.04.006 Google Scholar CrossRef Search ADS   Bodin T., Capdeville Y., Romanowicz B., Montagner J.-P., 2015. Interpreting radial anisotropy in global and regional tomographic models, in The Earth’s Heterogeneous Mantle , pp. 105– 144, eds Khan A., Deschamps F., Springer. Google Scholar CrossRef Search ADS   Brenders A., Pratt R., 2007. Full waveform tomography for lithospheric imaging: results from a blind test in a realistic crustal model, Geophys. J. Int. , 168( 1), 133– 151. https://doi.org/10.1111/j.1365-246X.2006.03156.x Google Scholar CrossRef Search ADS   Brossier R., Operto S., Virieux J., 2009. Seismic imaging of complex onshore structures by 2D elastic frequency-domain full-waveform inversion, Geophysics , 74( 6), WCC105– WCC118. https://doi.org/10.1190/1.3215771 Google Scholar CrossRef Search ADS   Browaeys J.T., Chevrot S., 2004. Decomposition of the elastic tensor and geophysical applications, Geophys. J. Int. , 159, 667– 678. https://doi.org/10.1111/j.1365-246X.2004.02415.x Google Scholar CrossRef Search ADS   Bunks C., Saleck F., Zaleski S., Chavent G., 1995. Multiscale seismic waveform inversion, Geophysics , 60( 5), 1457– 1473. https://doi.org/10.1190/1.1443880 Google Scholar CrossRef Search ADS   Burgos G., Capdeville Y., Guillot L., 2016. Homogenized moment tensor and the effect of near-field heterogeneities on nonisotropic radiation in nuclear explosion, J. geophys. Res. , 121( 6), 4366– 4389. https://doi.org/10.1002/2015JB012744 Google Scholar CrossRef Search ADS   Byrd R.H., Lu P., Nocedal J., 1995. A limited memory algorithm for bound constrained optimization, SIAM J. Sci. Stat. Comput. , 16, 1190– 1208. https://doi.org/10.1137/0916069 Google Scholar CrossRef Search ADS   Capdeville Y., Cance P., 2015. Residual homogenization for elastic wave propagation in complex media, Geophys. J. Int. , 200( 2), 984– 997. https://doi.org/10.1093/gji/ggu452 Google Scholar CrossRef Search ADS   Capdeville Y., Marigo J.J., 2007. Second order homogenization of the elastic wave equation for non-periodic layered media, Geophys. J. Int. , 170, 823– 838. https://doi.org/10.1111/j.1365-246X.2007.03462.x Google Scholar CrossRef Search ADS   Capdeville Y., Marigo J.J., 2008. Shallow layer correction for spectral element like methods, Geophys. J. Int. , 172, 1135– 1150. https://doi.org/10.1111/j.1365-246X.2007.03703.x Google Scholar CrossRef Search ADS   Capdeville Y., Marigo J.-J., 2013. A non-periodic two scale asymptotic method to take account of rough topographies for 2-D elastic wave propagation, Geophys. J. Int. , 192( 1), 163– 189. https://doi.org/10.1093/gji/ggs001 Google Scholar CrossRef Search ADS   Capdeville Y., Guillot L., Marigo J.J., 2010a. 1-D non periodic homogenization for the wave equation, Geophys. J. Int. , 181, 897– 910. Capdeville Y., Guillot L., Marigo J.J., 2010b. 2D nonperiodic homogenization to upscale elastic media for P–SV waves, Geophys. J. Int. , 182, 903– 922. https://doi.org/10.1111/j.1365-246X.2010.04636.x Google Scholar CrossRef Search ADS   Capdeville Y., Stutzmann E., Wang N., Montagner J.-P., 2013. Residual homogenization for seismic forward and inverse problems in layered media, Geophys. J. Int. , 194( 1), 470– 487. https://doi.org/10.1093/gji/ggt102 Google Scholar CrossRef Search ADS   Capdeville Y., Zhao M., Cupillard P., 2015. Fast Fourier homogenization for elastic wave propagation in complex media, Wave Motion , 54, 170– 186. https://doi.org/10.1016/j.wavemoti.2014.12.006 Google Scholar CrossRef Search ADS   Claerbout J., 1985. Imaging the Earth’s Interior , Blackwell Scientific Publication. Devaney A., 1984. Geophysical diffraction tomography, IEEE Trans. Geosci. Remote Sens. , GE-22( 1), 3– 13. https://doi.org/10.1109/TGRS.1984.350573 Google Scholar CrossRef Search ADS   Festa G., Vilotte J.-P., 2005. The Newmark scheme as velocity-stress time-staggering: an efficient PML implementation for spectral element simulations of elastodynamics, Geophys. J. Int. , 161, 789– 812. https://doi.org/10.1111/j.1365-246X.2005.02601.x Google Scholar CrossRef Search ADS   Fichtner A., Trampert J., Cupillard P., Saygin E., Taymaz T., Capdeville Y., Villaseñor A., 2013. Multiscale full waveform inversion, Geophys. J. Int. , 194( 1), 534– 556. https://doi.org/10.1093/gji/ggt118 Google Scholar CrossRef Search ADS   Guillot L., Capdeville Y., Marigo J.J., 2010. 2-D non periodic homogenization for the SH wave equation, Geophys. J. Int. , 182, 1438– 1454. https://doi.org/10.1111/j.1365-246X.2010.04688.x Google Scholar CrossRef Search ADS   Jordan T.H., 2015. An effective medium theory for three-dimensional elastic heterogeneities, Geophys. J. Int. , 203( 2), 1343– 1354. https://doi.org/10.1093/gji/ggv355 Google Scholar CrossRef Search ADS   Kamei R., Pratt R., 2013. Inversion strategies for visco-acoustic waveform inversion, Geophys. J. Int. , 194( 2), 859– 884. https://doi.org/10.1093/gji/ggt109 Google Scholar CrossRef Search ADS   Käufl P., Fichtner A., Igel H., 2013. Probabilistic full waveform inversion based on tectonic regionalization-development and application to the Australian upper mantle, Geophys. J. Int. , 193( 1), 437– 451. https://doi.org/10.1093/gji/ggs131 Google Scholar CrossRef Search ADS   Kolb P., Collino F., Lailly P., 1986. Pre-stack inversion of a 1-D medium, Proc. IEEE , 74( 3), 498– 508. https://doi.org/10.1109/PROC.1986.13490 Google Scholar CrossRef Search ADS   Komatitsch D., Vilotte J.P., 1998. The spectral element method: an effective tool to simulate the seismic response of 2D and 3D geological structures, Bull. seism. Soc. Am. , 88, 368– 392. Lekić V., Romanowicz B., 2011. Inferring upper-mantle structure by full waveform tomography with the spectral element method, Geophys. J. Int. , 185( 2), 799– 831. https://doi.org/10.1111/j.1365-246X.2011.04969.x Google Scholar CrossRef Search ADS   Métivier L., Brossier R., 2016. The SEISCOPE optimization toolbox: a large-scale nonlinear optimization library based on reverse communication, Geophysics , 81( 2), F11– F25. Google Scholar CrossRef Search ADS   Michel J., Moulinec H., Suquet P., 1999. Effective properties of composite materials with periodic microstructure: a computational approach, Comput. Methods Appl. Mech. Eng. , 172( 1), 109– 143. https://doi.org/10.1016/S0045-7825(98)00227-8 Google Scholar CrossRef Search ADS   Nachman A.I., 1988. Reconstructions from boundary measurements, Ann. Math. , 128( 3), 531– 576. https://doi.org/10.2307/1971435 Google Scholar CrossRef Search ADS   Nakamura G., Uhlmann G., 1994. Global uniqueness for an inverse boundary problem arising in elasticity, Inventiones math. , 118( 1), 457– 474. https://doi.org/10.1007/BF01231541 Google Scholar CrossRef Search ADS   Nawaz M.A., Curtis A., 2017. Bayesian inversion of seismic attributes for geological facies using a hidden Markov model, Geophys. J. Int. , 208, 1184– 1200. https://doi.org/10.1093/gji/ggw411 Google Scholar CrossRef Search ADS   Nocedal J., 1980. Updating quasi-Newton matrices with limited storage, Math. Comput. , 35( 151), 773– 782. https://doi.org/10.1090/S0025-5718-1980-0572855-7 Google Scholar CrossRef Search ADS   Operto S., Miniussi A., Brossier R., Combe L., Métivier L., Monteiller V., Ribodetti A., Virieux J., 2015. Efficient 3-D frequency-domain mono-parameter full-waveform inversion of ocean-bottom cable data: application to Valhall in the visco-acoustic vertical transverse isotropic approximation, Geophys. J. Int. , 202( 2), 1362– 1391. https://doi.org/10.1093/gji/ggv226 Google Scholar CrossRef Search ADS   Peter D. et al.  , 2011. Forward and adjoint simulations of seismic wave propagation on fully unstructured hexahedral meshes, Geophys. J. Int. , 186( 2), 721– 739. https://doi.org/10.1111/j.1365-246X.2011.05044.x Google Scholar CrossRef Search ADS   Plessix R.E., Perkins C., 2010. Full waveform inversion of a deep water ocean bottom seismometer dataset, First Break , 28, 71– 78. https://doi.org/10.3997/1365-2397.2010013 Google Scholar CrossRef Search ADS   Pratt R., Shin C., Hicks G., 1998. Gauss–Newton and full newton methods in frequency domain seismic waveform inversion, Geophys. J. Int. , 133, 341– 362. https://doi.org/10.1046/j.1365-246X.1998.00498.x Google Scholar CrossRef Search ADS   Pratt R.G., 1999. Seismic waveform inversion in the frequency domain, part I : theory and verification in a physical scale model, Geophysics , 64, 888– 901. https://doi.org/10.1190/1.1444597 Google Scholar CrossRef Search ADS   Romanowicz B., 2003. Global mantle tomography: progress status in the last 10 years, Annu. Rev. Geophys. Space Phys , 31, 303– 328. Sanchez-Palencia E., 1980, Non Homogeneous Media and Vibration Theory , Number 127 in Lecture Notes in Physics, Springer. Shipp R.M., Singh S.C., 2002. Two-dimensional full wavefield inversion of wide-aperture marine seismic streamer data, Geophys. J. Int. , 151, 325– 344. https://doi.org/10.1046/j.1365-246X.2002.01645.x Google Scholar CrossRef Search ADS   Sirgue L., Barkved O.I., Dellinger J., Etgen J., Albertin U., Kommedal J.H., 2010. Full waveform inversion: the next leap forward in imaging at Valhall, First Break , 28, 65– 70. https://doi.org/10.3997/1365-2397.2010012 Google Scholar CrossRef Search ADS   Stoneley R., 1949. The seismological implications of aeolotropy in continental structure, Geophys. Suppl. Mon. Not. R. Astron. Soc. , 5( 8), 343– 353. https://doi.org/10.1111/j.1365-246X.1949.tb02949.x Tape C., Liu Q., Maggi A., Tromp J., 2010. Seismic tomography of the southern California crust based on spectral-element and adjoint methods, Geophys. J. Int. , 180( 1), 433– 462. https://doi.org/10.1111/j.1365-246X.2009.04429.x Google Scholar CrossRef Search ADS   Tarantola A., 1984. Inversion of seismic reflection data in the acoustic approximation, Geophysics , 49, 1259– 1266. https://doi.org/10.1190/1.1441754 Google Scholar CrossRef Search ADS   Tarantola A., 2005. Inverse Problem Theory and Methods for Model Parameter Estimation , SIAM. Google Scholar CrossRef Search ADS   Tarantola A., Valette B., 1982. Generalized nonlinear inverse problems solved using the least squares criterion, Rev. Geophys. , 20, 219– 232. https://doi.org/10.1029/RG020i002p00219 Google Scholar CrossRef Search ADS   Trinh P.T., Brossier R., Métivier L., Virieux J., Wellington P., 2017. Bessel smoothing filter for spectral element mesh, Geophys. J. Int. , 209( 3), 1489– 1512. https://doi.org/10.1093/gji/ggx103 Google Scholar CrossRef Search ADS   Warner M. et al.  , 2013. Anisotropic 3D full-waveform inversion, Geophysics , 78( 2), R59– R80. https://doi.org/10.1190/geo2012-0338.1 Google Scholar CrossRef Search ADS   Weinan E., Engquist B., Li X., Ren W., Vanden-Eijnden E., 2007. Heterogeneous multiscale methods: a review, Commun. Comput. Phys. , 2( 3), 367– 450. Williamson P., Atle A., Fei W., Hale D., 2011. Regularization of wave-equation migration velocity analysis by structure-oriented smoothing, in 2011 SEG Annual Meeting , Society of Exploration Geophysicists. Zhao M., Capdeville Y., Zhang H., 2016. Direct numerical modeling of time-reversal acoustic subwavelength focusing, Wave Motion , 67, 102– 115. https://doi.org/10.1016/j.wavemoti.2016.07.010 Google Scholar CrossRef Search ADS   APPENDIX A: THE HOMOGENIZATION OPERATOR $$\boldsymbol{\mathcal {H}}$$ In this appendix, we summarize the practical steps necessary to build the homogenization operator $$\boldsymbol{\mathcal {H}}$$. The complete development and theory can be found in Capdeville et al. (2010b, 2015). To practically implement the user-defined λ0 scale separation, we need to introduce a low-pass filter operator $$\mathcal {F}^{k_0}$$, such that, for any quantity g(x), $$\mathcal {F}^{k_0}(g)(\mathbf {x})$$ does not contain any spatial variation smaller than λ0 = 1/k0. This low-pass filter operator can be written as   \begin{equation} \mathcal {F}^{k_0}(g)(\mathbf {x})=(w^{k_0}*g)(\mathbf {x}), \end{equation} (A1)where * is the spatial convolution and $$w^{k_0}$$ is the filter wavelet. Once λ0 is chosen, it sets the value of ε0 through (16). The different steps allowing us to build the upscaling operator $$\boldsymbol{\mathcal {H}}$$ such that   \begin{equation} \mathbf {c}^*=\boldsymbol{\mathcal {H}}(\mathbf {c})\,, \end{equation} (A2)and to find the correctors are the following: Step 1. We first solve the so-called cell problem to find the initial guess correctors $$\boldsymbol{\chi }_{{\!{\rm s }}}^{lm}(\mathbf {x})$$. It consists in solving the following elastostatic set of problems (3 in 2-D, 6 in 3-D) in $$\boldsymbol{\Omega }$$:   \begin{eqnarray} \boldsymbol{\nabla }\cdot \mathbf {c}:\boldsymbol{\epsilon }\left(\boldsymbol{\chi }_{{\!{\rm s }}}^{lm}\right)&=&\mathbf {F}^{lm}\,,\nonumber\\ \mathbf {F}^{lm} &=& -\boldsymbol{\nabla }\cdot (\mathbf {c}:(\mathbf {e}_l\otimes \mathbf {e}_m))\,, \end{eqnarray} (A3)with periodic boundary conditions on $$\partial \boldsymbol{\Omega }$$, where the ei, i ∈ {1, ..., d}, are the Cartesian unit basis vectors. Step 2. Once the initial corrector guess $$\boldsymbol{\chi }_{{\!{\rm s }}}^{lm}$$ is obtained, we compute   \begin{eqnarray} \left[\mathbf {G}_{{\!{\rm s }}}\right]_{ijlm}(\mathbf {x})&=& \frac{1}{2}\left( \delta _{il}\delta _{jm}+\delta _{jl}\delta _{im}\right)+\left[\boldsymbol{\epsilon }\left({\boldsymbol{\chi }}_{{\!{\rm s }}}^{lm}\right)\right]_{ij},\nonumber\\ \mathbf {H}_{{\!{\rm s }}}(\mathbf {x})&=& \mathbf {c}: \mathbf {G}_{{\!{\rm s }}}. \end{eqnarray} (A4)The ε0 effective tensor can be directly obtained as   \begin{equation} \mathbf {c}^*(\mathbf {x})=\mathcal {F}^{k_0}(\mathbf {H}_{{\!{\rm s }}}): \mathcal {F}^{k_0}({\mathbf {G}_{{\!{\rm s }}}})^{-1}(\mathbf {x}). \end{equation} (A5)At this stage the upscaling operator $$\boldsymbol{\mathcal {H}}$$, defined in eq. (A2), is known. The effective density is simply   \begin{equation} \rho ^*(\mathbf {x})=\mathcal {F}^{k_0}(\rho )(\mathbf {x}); \end{equation} (A6) Step 3. Finally, the strain concentrator is obtained as   \begin{equation} \mathbf {G}(\mathbf {x},\mathbf {y})=\mathbf {I}+\left(\mathbf {G}_{{\!{\rm s }}} -\mathcal {F}^{k_0}(\mathbf {G}_{{\!{\rm s }}})\right)(\varepsilon _0\mathbf {y}):\mathcal {F}^{k_0}(\mathbf {G}_{{\!{\rm s }}})^{-1}(\mathbf {x}), \end{equation} (A7)and the first-order corrector $$\boldsymbol{\chi }(\mathbf {x},\mathbf {y})$$ is obtained solving, for each x (fixed),   \begin{equation} \boldsymbol{\nabla }{\boldsymbol{\boldsymbol{\chi }}}(\mathbf {x},\mathbf {y})+ {(\boldsymbol{\nabla }{\boldsymbol{\boldsymbol{\chi }}})}^{\intercal }(\mathbf {x},\mathbf {y})=2(\mathbf {G}(\mathbf {x},\mathbf {y})-\mathbf {I}), \end{equation} (A8)where the $$\boldsymbol{\nabla }$$ operators here applies on the y variable and I is the identity operator. Solving the cell problem (A3) usually requires a numerical solver. In practice, we can use two different schemes, one based on finite elements (Capdeville et al.2010b) and one based on an iterative FFT scheme (Capdeville et al.2015), derived from Michel et al. (1999). Here, in the examples, the first has been used to compute the $$\mathbf {m}_t^*$$, the second to compute $$\bar{\mathbf {m}}^{h*}$$ and the homogenized projection of the Gauss–Newton updated model at each iteration. It can be seen that the homogenization operator actually depends on both the maximum frequency fm (which determines λmin through the medium dispersion relation) and on ε0 (which determines λ0 through (16)). The main assumption behind the homogenization is the existence of a minimum wavelength λmin. For most media, this is usually the case. Nevertheless, note that some synthetic media can break this assumption and they cannot be homogenized (at least in the sense used here) and they probably cannot be imaged (at least not with the scheme presented here). Media with fluid or void inclusions of size comparable to λmin break the assumption that c is positive definite and such media cannot be homogenized either without doing a domain decomposition. Such media cannot be imaged this way either, unless fluid–solid boundaries are previously known and explicitly taken into account. But media with void or fluid inclusions of size significantly smaller than λmin can still be homogenized and therefore inverted. One important consequence of the homogenization theory is that the effective model space $$\boldsymbol{\mathcal {M}}^*$$, image of the model space $$\boldsymbol{\mathcal {M}}$$ through $$\boldsymbol{\mathcal {H}}$$ ($$\boldsymbol{\mathcal {M}}^*=\boldsymbol{\mathcal {H}}(\boldsymbol{\mathcal {M}})$$) is a finite dimensional space (even prior to any discretization that could be necessary for practical reasons). Indeed, from eq. (A5), it can be seen that both $$\mathcal {F}^{k_0}(\mathbf {H}_{{\!{\rm s }}})$$ and $$\mathcal {F}^{k_0}(\mathbf {G}_{{\!{\rm s }}})$$ belong a finite dimensional Fourier space and therefore c* can be describe with a finite number of parameters. The exact dimension of $$\boldsymbol{\mathcal {M}}^*$$ as a function of fm and ε0 is difficult to measure precisely, but it scales as   \begin{equation} \dim (\boldsymbol{\mathcal {M}}^*)\propto \left(\frac{f_{\rm m}}{\varepsilon _0}\right)^d\,. \end{equation} (A9)The fact that $$\dim (\boldsymbol{\mathcal {M}}^*)$$ cannot be measured precisely is because, as any space-wavenumber problem, it is not possible to be precise on both on frequency (fm) and space (λmin) at the same time. Finally, we said nothing about the boundary conditions. This is a more complex issue that requires two matched asymptotic expansions (Capdeville & Marigo 2008; Capdeville et al.2013). At the order 0 in ε0, nothing special needs to be done and any smooth boundary and a free surface will be correct. At order one in ε0, it appears that an effective domain boundary $$\partial \boldsymbol{\Omega }^*$$ needs to be introduced. It transforms a potential rough topography in a smooth effective one. Moreover, the free boundary condition needs to be replaced by a Dirichlet to Neumann condition (see eq. 19). The coefficients involved in the Dirichlet to Neumann operator $${\boldsymbol{\Gamma }}^*$$ only depend on the macroscopic scale (they vary smoothly) and are therefore not a problem for an inverse problem (but the effective parameters imbedded in $${\boldsymbol{\Gamma }}^*$$ would still need to be inverted for). © The Author(s) 2018. Published by Oxford University Press on behalf of The Royal Astronomical Society. http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Geophysical Journal International Oxford University Press

Elastic full waveform inversion based on the homogenization method: theoretical framework and 2-D numerical illustrations

Loading next page...
 
/lp/ou_press/elastic-full-waveform-inversion-based-on-the-homogenization-method-jFZcfkR3RV
Publisher
The Royal Astronomical Society
Copyright
© The Author(s) 2018. Published by Oxford University Press on behalf of The Royal Astronomical Society.
ISSN
0956-540X
eISSN
1365-246X
D.O.I.
10.1093/gji/ggy039
Publisher site
See Article on Publisher Site

Abstract

SUMMARY Seismic imaging is an efficient tool to investigate the Earth interior. Many of the different imaging techniques currently used, including the so-called full waveform inversion (FWI), are based on limited frequency band data. Such data are not sensitive to the true earth model, but to a smooth version of it. This smooth version can be related to the true model by the homogenization technique. Homogenization for wave propagation in deterministic media with no scale separation, such as geological media, has been recently developed. With such an asymptotic theory, it is possible to compute an effective medium valid for a given frequency band such that effective waveforms and true waveforms are the same up to a controlled error. In this work we make the link between limited frequency band inversion, mainly FWI, and homogenization. We establish the relation between a true model and an FWI result model. This relation is important for a proper interpretation of FWI images. We numerically illustrate, in the 2-D case, that an FWI result is at best the homogenized version of the true model. Moreover, it appears that the homogenized FWI model is quite independent of the FWI parametrization, as long as it has enough degrees of freedom. In particular, inverting for the full elastic tensor is, in each of our tests, always a good choice. We show how the homogenization can help to understand FWI behaviour and help to improve its robustness and convergence by efficiently constraining the solution space of the inverse problem. Inverse theory, Numerical solutions, Computational seismology, Seismic tomography, Wave propagation 1 INTRODUCTION Since the late sixties, seismic data have been used to investigate the Earth interior and to image its mechanical properties, for academic and industrial purposes, at scales ranging from few metres to the global Earth. Modern seismic imaging methods are based on an inverse problem making it possible to retrieve information about the Earth mechanical properties from active (explosion or airgun) or passive (earthquakes, ambient noise) seismic records. Because of the tremendous computing power such a scheme would require, a full exploration of the inverse problem solution space based on the direct modelling of the seismogram full waveforms, as envisioned by Tarantola (2005), is still out of reach, despite the progress in the design of high performance computing devices, reaching now exascale performances. Because of this intrinsic limitation, seismologists have developed seismic imaging and tomography techniques which are mainly based on reduced data, (i.e. secondary observables such as body waves arrival times, surface wave phase velocities, normal model eigenfrequencies ...) together with an appropriate approximate solution of the wave propagation modelling problem, that are quicker to solve than the full wave equation. These choices make it possible to assume that the inverse problem nonlinearity is weak enough so that a solution to the inverse problem can be computed through local optimization techniques. Nevertheless, mathematical regularization (non-data-driven constraints) are always necessary to make the solution of the inverse problem unique. In this framework, for global and regional scales, time arrival and phase velocity tomography methods have provided significant results and images of the deep Earth (e.g. Romanowicz 2003, for a review). At the exploration scale, migration imaging techniques from reflection data have been introduced to recover high-resolution structural information of the subsurface reflectivity. These high-resolution seismic imaging techniques rely on a prior knowledge of the smooth background wave velocity and, in their earlier version, on a linearized wave propagation operator, aiming at refocusing primary reflections only in depth (Claerbout 1985; Beylkin 1987; Beylkin & Burridge 1990). During the last two decades, the simultaneous development of wide angle/azimuth broadband seismic acquisition systems and high performance computing devices have made possible the successful application of seismic imaging techniques based on the full waveform, from the exploration to the global scale. Compared to the previous techniques (tomography, migration), full waveform inversion (FWI) aims at extracting the information from the entire waveform in a limited frequency band, through a local minimization of the difference between the observed and synthetic data. At the global and continental scales, FWI methods, often based on the Spectral Element Method for the forward modelling tool and gradient descent algorithms (steepest descent, nonlinear conjugate gradient) for the optimization, are making important progress (Tape et al.2010; Lekić & Romanowicz 2011; Fichtner et al.2013). At the exploration scale, the method is now routinely used in the industry, mainly for the 3-D reconstruction of the pressure wave velocity in the acoustic approximation (Plessix & Perkins 2010; Sirgue et al.2010; Peter et al.2011; Warner et al.2013; Operto et al.2015). Based on these successful applications, two main research directions are currently identified. The first consists in a better interpretation of physics of wave propagation, going beyond the current common mechanical approximations (typically the acoustic approximation at the exploration scale and the VS approximation at the regional and global scales) and a better accounting for the amplitude of the seismic signal. This requires the use of much more expensive viscoelastic wave propagation modelling solvers, correctly accounting for the attenuation and the anisotropy of the subsurface. This also yields a challenging multiparameter inverse problem involving at least P-wave and S-wave velocities to be reconstructed. The second research direction consists in the design of a more automatized workflow, less relying on human expertise, enhancing the robustness of FWI procedures. Indeed, the FWI problem is intrinsically ill-posed as the misfit function might exhibit several local minima leading to non-informative geological models. Designing strategies to avoid these local minima is at the heart of many methodological studies. In practice, this requires a careful data analysis, and the design of ad hoc multilevel hierarchical schemes based on frequency band, time-windowing and offset selections (also known as layer stripping approach) (Bunks et al.1995; Pratt 1999; Shipp & Singh 2002; Brossier et al.2009). This is complemented through the design of kinematically accurate initial models through high-resolution tomography techniques (Billette & Lambaré 1998; Alerini et al.2007) and the introduction of prior knowledge through the design of regularization strategies (Tikhonov, smoothing techniques) and prior model information. This second line of investigation might be summarized as a search for an FWI formulation leading to a better posed inverse problem, potentially less prone to exhibit local minima. In this paper, we investigate numerically how the concept of homogenization and equivalent medium could help in this respect. Our leading idea is to account for the fact that seismic waves propagating within the subsurface interior are always frequency band-limited. For this reason any subsurface mechanical properties variations smaller than a fraction of the shortest propagated wavelength are ‘seen’ as smooth, often anisotropic, heterogeneities (Capdeville et al.2013). The homogenization theory provides means to compute this equivalent anisotropic medium, the medium ‘seen’ by the wavefield. Homogenization, effective media or upscaling techniques gather a wide range of methods able to compute effective properties and equations of a fine scale problem when large scale properties are needed. In the context of wave propagation, the idea is to remove the heterogeneities of scale much smaller than the minimum wavefield wavelength and to replace them by effective properties. For ‘long’ elastic waves propagating in stratified media, Backus (1962) gave explicit formula to upscale finely layered media. One important result of this work is to show that a finely layered isotropic medium becomes an anisotropic effective medium for long waves. For periodic media, an important class of methods, the two-scale homogenization methods, has been developed (e.g. Sanchez-Palencia 1980). To obtain the effective media, the effective equations and local correctors, two-scale homogenization methods require one to solve the so-called periodic-cell problem. This periodic-cell problem can be solved analytically only for the specific case of layered media, whereas a numerical method such as finite elements is necessary for more general media. For stochastic media, methods formally similar to the two-scale homogenization methods exist (e.g. Bensoussan et al.1978; Blanc et al.2007). Finally, solutions called ‘numerical homogenization’ (e.g. Weinan et al.2007) also needs a scale separation. Typical geological media present no spatial periodicity, no natural scale separation or any kind of spatial statistical invariance. This difficulty excludes all of the above mentioned homogenization techniques from the problem of upscaling geological media. To fill this gap, the non-periodic homogenization technique (Capdeville et al.2010b; Guillot et al.2010; Capdeville et al.2010a, 2015; Capdeville & Cance 2015) has recently been introduced. While the non-periodic homogenization technique is strongly inspired from the classical two-scale periodic homogenization, it has some major differences. One of them relies on the fact that the obtained effective properties are not spatially constant, they are just ‘smoother’ than the original medium. So far, this homogenization method has been mostly used to simplify seismic forward modelling. In this context, the non-periodic homogenization can be seen as a pre-processing step of the elastic model prior to being used in the wave equation solver (e.g. spectral elements, finite differences ...). By removing all the fine scales, the non-periodic homogenization drastically reduces meshing complexity associated with elastic discontinuities and makes the computation optimum (see e.g. Capdeville et al.2015). Note that for relatively weak elastic contrasts, an alternate solution based on a second order Born approximation exists (Jordan 2015). In the inversion context, the homogenization method has been already used either to mix different scales (Afanasiev et al.2015) or to constrain FWI gradient updates in the layered media case (Afanasiev et al.2016). More generally, the homogenization concept raises a puzzling point: for a given frequency band-limited data set, at least two Earth subsurface models minimize similarly a waveform misfit function, the true and the homogenized models. Knowing that, for local optimization approaches, only one model can be found, we can already guess that the solution model is, at best, the homogenized model and not the true model. Indeed, homogenized models represent what is ‘seen’ by the wavefield. More technically, the homogenized models are ‘simpler’ in the sense that they need a finite number of parameters to describe them, which is not the case for the true models. This has been confirmed in a recent work, but only in the layered media case: it has been proposed that FWI can retrieve at best the homogenized medium and that this fact can be used to properly set up the inversion problem (Capdeville et al.2013). Extending those results from the stratified media case to the general case is not trivial for similar reasons that extending the homogenization principle from the layered case to the general case is also difficult (Capdeville et al.2010a). In the present work, we pursue mainly two objectives. First, we extend the results of Capdeville et al. (2013) to higher dimensions (2-D and 3-D). Second, we aim at presenting ideas on how homogenization concepts can be used to better constrain and better understand FWI. Beyond the objective of making and understanding the FWI better, we also aim at establishing the relation between the true model and the model resulting from an FWI. This relation is fundamental to set up a downscaling inverse problem. Its objective is to properly interpret the FWI results as a probability distribution over possible fine scale models (Bodin et al.2015; Nawaz & Curtis 2017) where fine scales are here scales of interest for the interpretation below the FWI resolution limit. In the following, we first define FWI and its solution space. We then introduce the homogenization concept and its consequences for the inverse problem. We then show a series of examples illustrating the exposed ideas before discussing our results and concluding our work. 2 THE FULL WAVEFORM INVERSE PROBLEM 2.1 Context We consider an elastic body $$\boldsymbol{\Omega }$$ of boundary $$\partial \boldsymbol{\Omega }$$ (see Fig. 1). We assume that this elastic body is fully characterized by its ‘true’ parameters, the density ρt(x) and the fourth order elastic tensor ct(x) defined for all x in $$\boldsymbol{\Omega }$$. We consider Ns point sources, characterized by their location $$\mathbf {x}_s^t$$ and mechanism $$\mathbf {q}_s^t$$ (moment tensor or force vector) for s ∈ {1, ..., Ns}. These sources are triggered independently in $$\boldsymbol{\Omega }$$ and recorded by Nr, d = 2 or d = 3 component receivers, in 2-D or 3-D respectively, located in xr, r ∈ {1, ..., Nr}. Figure 1. View largeDownload slide The $$\boldsymbol{\Omega }$$ domain of boundary $$\partial \boldsymbol{\Omega }$$. Ns = 3 sources (stars) and Nr = 4 receivers (triangles) are represented. The circles connected with dashed lines are two successive zooms on $$\boldsymbol{\Omega }$$ texture symbolizing the multiscale nature of the domain heterogeneities. Figure 1. View largeDownload slide The $$\boldsymbol{\Omega }$$ domain of boundary $$\partial \boldsymbol{\Omega }$$. Ns = 3 sources (stars) and Nr = 4 receivers (triangles) are represented. The circles connected with dashed lines are two successive zooms on $$\boldsymbol{\Omega }$$ texture symbolizing the multiscale nature of the domain heterogeneities. Based on the recorded waveforms, an FWI aims at retrieving information about the elastic model and sources in $$\boldsymbol{\Omega }$$. In the following, we define more precisely the model space $$\boldsymbol{\mathcal {M}}$$ to which belong potential solutions to the inverse problem, the data set d and the forward problem linking potential models and synthetic data. Finally, we define the misfit function between the recorded data and synthetic data as well as the chosen optimization algorithm to solve the inverse problem. 2.2 The fine scale model space The fine scale model space is the one to which the true model belongs. It contains all the unknown parameters required to model the data in the full frequency band. In our case, this corresponds to the density and elastic parameters in $$\boldsymbol{\Omega }$$, together with the source parameters. We first define $$\boldsymbol{\mathcal {E}}(\boldsymbol{\Omega })$$, the admissible elastic properties space. $$\boldsymbol{\mathcal {E}}(\boldsymbol{\Omega })$$ gathers all the physically possible density and elastic tensor distributions (ρ(x), c(x)) in $$\boldsymbol{\Omega }$$ such that ρ(x) is positive and bounded; c(x) is a fourth order tensor, positive definite and bounded; c(x) follows the classical major and minor symmetries, leading to only six independent coefficients in 2-D and 21 in 3-D. In general, elastic structures are multiscale: for a given scale, there are always heterogeneities at a smaller scale (see Fig. 1). Consequently, the number of parameters necessary to characterize in a deterministic way the spatial distribution of a general elastic structure is infinite. As a consequence, $$\boldsymbol{\mathcal {E}}$$ is an infinite dimensional space. Note that, in the real world, there may be a finite limit to this dimension when reaching the atomic scale. But this limit is out of reach and is not considered here. Assuming the sources are localized in space (point sources), we define the source parameter space as   \begin{equation} \boldsymbol{\mathcal {S}}=\left\lbrace (\mathbf {x}_s,\mathbf {q}_s)\in (\boldsymbol{\Omega }\times \mathbb {R}^p),\ s\in \lbrace 1\dots N_s\rbrace \right\rbrace, \end{equation} (1)where xs is the source location and qs the source mechanism, which can either be a moment tensor Ms (in this case we have p = d(d + 1)/2) or a force vector $$\boldsymbol{f}_{\!s}$$ (respectively p = d). The model space $$\boldsymbol{\mathcal {M}}$$ is defined by   \begin{equation} \boldsymbol{\mathcal {M}}=\boldsymbol{\mathcal {E}}\cup \boldsymbol{\mathcal {S}}. \end{equation} (2)It is also an infinite dimensional space by definition. A model $$\mathbf {m}\in \boldsymbol{\mathcal {M}}$$, potentially a solution of the FWI problem, made of an elastic model and of the source parameters, can be written as   \begin{equation} \mathbf {m}=\left\lbrace (\rho (\mathbf {x}),\mathbf {c}(\mathbf {x})), \ \forall \mathbf {x}\in \boldsymbol{\Omega }; (\mathbf {x}_s,\mathbf {q}_s), s\in \lbrace 1,...,N_s\rbrace \right\rbrace. \end{equation} (3)The ‘true’ model mt is   \begin{equation} \mathbf {m}_t=\left\lbrace (\rho ^t(\mathbf {x}),\mathbf {c}^t(\mathbf {x})), \ \forall \mathbf {x}\in \boldsymbol{\Omega }; (\mathbf {x}^t_s,\mathbf {q}^t_s), s\in \lbrace 1,...,N_s\rbrace \right\rbrace. \end{equation} (4) In the following, the model m restricted to the source number s is denoted by m|s:   \begin{equation} \mathbf {m}\vert _s=\{(\rho (\mathbf {x}),\mathbf {c}(\mathbf {x})), \ \forall \mathbf {x}\in \boldsymbol{\Omega }; \mathbf {x}_s, \mathbf {q}_s\}. \end{equation} (5) 2.3 The forward modelling equations We assume that, for a given model m, of component (ρ, c), for a source number s among the Ns sources, for any $$\mathbf {x}\in \boldsymbol{\Omega }$$ and any time t ∈ [0, T], the displacement u(x, t; m|s) is the solution of the elastic wave equation   \begin{eqnarray} &&{\rho \partial _{tt} \mathbf {u}- \boldsymbol{\nabla }\cdot \boldsymbol{\sigma }= \mathbf {s}_s, }\nonumber\\ &&\boldsymbol{\sigma }= \mathbf {c}:\boldsymbol{\epsilon }(\mathbf {u}), \end{eqnarray} (6)where ss(x, t) is the sth external source term, $$\boldsymbol{\sigma }(\mathbf {x},t;\mathbf {m}\vert _s)$$ is the stress and $$\boldsymbol{\epsilon }(\mathbf {u}) =\frac{1}{2}(\boldsymbol{\nabla }\mathbf {u}+{(\boldsymbol{\nabla }\mathbf {u})}^{\intercal })$$ is the strain and ⊺ the ‘transpose’ operator. Inelastic losses are ignored in this work for the sake of simplicity. They have nevertheless an important effect in practice (see, for instance, Kamei & Pratt 2013). The wave equation is completed by the normal stress free boundary condition, for $$\mathbf {x}\in \partial \boldsymbol{\Omega }$$,   \begin{equation} \boldsymbol{\sigma }(\mathbf {x},t) \cdot \mathbf {n}(\mathbf {x}) =0\,,\ \ \ \forall (\mathbf {x},t)\in \partial \boldsymbol{\Omega }\times [0,T]\,, \end{equation} (7)where n is the outward normal to $$\partial \boldsymbol{\Omega }$$, and zero initial conditions (the medium is considered to be at rest at t = 0). We consider sources ss(x, t) of the following form   \begin{equation} \mathbf {s}_s(\mathbf {x},t)=-\mathbf {M}_s\cdot \boldsymbol{\nabla }\delta (\mathbf {x}-\mathbf {x}_s)g_s(t)\,, \end{equation} (8)for moment tensor sources or   \begin{equation} \mathbf {s}_s(\mathbf {x},t)=\boldsymbol{f}_{\!s} \delta (\mathbf {x}-\mathbf {x}_s)g_s(t)\,, \end{equation} (9)for force vector sources, where gs(t) is the source time function. 2.4 The data set The Ns elastic sources are triggered independently in $$\boldsymbol{\Omega }$$, generating elastic waves, for which the corresponding displacement ds, r(t) at receivers r is recorded for a duration T. The collected data set is   \begin{equation} \mathbf {d}=\lbrace \mathbf {d}_{s,r}(t), (r,s,t)\in \lbrace 1,...,N_r\rbrace \times \lbrace 1,...,N_s\rbrace \times [0,T]\rbrace \,. \end{equation} (10) In the following, we assume that the physics of the problem is known (i.e. the true model mt belongs to $$\boldsymbol{\mathcal {M}}$$) and that solving the forward problem defined in Section 2.3 accurately models the data within some limit corresponding to numerical noise. 2.5 FWI misfit function We define the classical least-squares misfit function:   \begin{equation} E(\mathbf {m})=\sum _{r,s}\int _{0}^{T}\left(\mathbf {d}_s(\mathbf {x}_r,t)-\mathbf {u}(\mathbf {x}_r,t;\mathbf {m}\vert _s)\right)^2\,{\rm d}t. \end{equation} (11)where u(xr, t; mt|s) is the solution of the wave eq. (6) computed for the model m restricted to a source s. Other misfit functions could be chosen (such as those based on phase or envelope measurements, cross-correlation approaches, deconvolution approaches, optimal transport approaches, etc.), but as long as they rely on finite frequency data, we expect the conclusions of the present study to remain unchanged. The FWI problem is formulated as the minimization problem   \begin{equation} \bar{\mathbf {m}}= \mathop{arg\min }\limits_{\mathbf {m}\in \boldsymbol{\mathcal {M}}} E(\mathbf {m}) \,. \end{equation} (12) It can be mathematically proven that, assuming perfect illumination (unlimited number of sources, unlimited number of receivers recording both displacement and traction on a closed surface, different from the free surface, and an infinite observation time) the true model mt can be uniquely recovered and, in this case, $$\bar{\mathbf {m}}=\mathbf {m}_t$$ (Nachman 1988; Nakamura & Uhlmann 1994). However, in practice, because computing power is finite, (12) cannot be solved. In the FWI framework, the classical solution to this problem is to limit the data frequency band by introducing a maximum frequency fm to the recorded signal and to the source time function gs. This can simply be done using a low-pass filter $$\mathcal {F}^{f_{\rm m}}$$:   \begin{equation} \mathbf {d}_s^{f_{\rm m}}(\mathbf {x}_r,t)=\mathcal {F}^{f_{\rm m}}(\mathbf {d}_s(\mathbf {x}_r,t))\,. \end{equation} (13)The misfit function to minimize is then   \begin{equation} E^{f_{\rm m}}(\mathbf {m})=\sum _{r,s}\int _0^T\left(\mathbf {d}_s^{f_{\rm m}}(\mathbf {x}_r,t)-\mathbf {u}^{f_{\rm m}}(\mathbf {x}_r,t;\mathbf {m}\vert _s)\right)^2\,{\rm d}t. \end{equation} (14)where $$\mathbf {u}^{f_{\rm m}}$$ is computed with the low-pass filtered source time function $$g_s^{f_{\rm m}}=\mathcal {F}^{f_{\rm m}}(g_s)$$. For most media, a bounded dispersion relation between the frequency f and the wavelength λ of the wavefield exists (there are a few exceptions, see Section 7). For such media, introducing a maximum frequency fm to the recorded signal warrants the existence of a minimum wavelength λmin. Using the assumption that waves only see a blurred version of scales smaller than λmin, and therefore that these scales can somehow be ignored in the reconstruction process, introducing a maximum frequency fm has an important consequence: it yields the possibility of using a finite dimensional space approximation $$\boldsymbol{\mathcal {M}}^h$$ of $$\boldsymbol{\mathcal {M}}$$. The superscript h is a number that characterizes the discretization made in the finite dimensional approximation and can be thought as a ‘resolution’ parameter. For example, if the elastic model is represented spatially by constant velocity blocks, h can be the size of these blocks. In general h is directly related to λmin but other information such as the illumination angles, offset ranges or data coverage can influence the choice of h. In $$\boldsymbol{\mathcal {M}}^h$$, models do not contain arbitrarily small scale heterogeneities and, knowing that the data need to be modelled only in a limited frequency band, the forward problem (eqs 6 and 7) can be solved in a bounded computing time. Finally, replacing the original minimization problem (12) by   \begin{equation} \bar{\mathbf {m}}^h = \mathop{arg\min }\limits_{\mathbf {m}\in \boldsymbol{\mathcal {M}}^h} E^{f_{\rm m}}(\mathbf {m}) \end{equation} (15)makes the inverse problem solvable, $$\boldsymbol{\mathcal {M}}^h$$ being a finite dimensional space. To solve (15), two classes of algorithms exist: local optimization techniques or global search approaches (Tarantola 2005). While Bayesian global search approaches would be ideal, yielding the possibility to access the full posterior density function, they are still too expensive, from a computational point of view, to be used for realistic scale FWI problem. Only a few limited experiments have been done in this direction so far (e.g. Käufl et al.2013). For practical applications, local optimization techniques represent the state-of-the art for solving the FWI problem, relying on the gradient direction and different levels of approximation for the inverse Hessian operator [identity, l-BFGS (Nocedal 1980; Byrd et al.1995), Gauss–Newton (Pratt et al.1998); see Métivier & Brossier (2016) for a review]. Provided that a linesearch algorithm or a trust-region strategy is used, these methods converge to the closest local minimum from the initial guess and hopefully to the global minimum if that initial guess is good. Nevertheless, if the global minimum is not unique, this is a problem. In practice, all the FWI techniques based on local optimization approaches implicitly assume a unique global minimum exists and assume that the initial guess lies within the ‘basin of attraction’ of the minimization. This is the reason why the uniqueness of the solution of the inverse problem is important. Even if, in $$\boldsymbol{\mathcal {M}}$$, minimizing E under perfect conditions warrants a unique solution, it is not the case for $$E^{f_{\rm m}}$$: there may be an infinite number of fine scale models that can produce the same limited frequency data (there is a large null space). In $$\boldsymbol{\mathcal {M}}^h$$, it can be different and the uniqueness can be recovered, but it is not granted. Independently of the global search versus local optimization choice made to solve (15), many other choices may influence the inverse problem solution $$\bar{\mathbf {m}}^h$$, even for a perfect data coverage. The resolution parameter h itself and the way it is implemented have a strong effect. Depending on the authors and their chosen approach, a wide range of solutions for the spatial approximation is used, from spatially constant blocks to high degree polynomials (typically, the same polynomial approximation as the one used for the waveform forward modelling). Physical approximation choices made for the constitutive parameters may also strongly influence the inversion result. For instance, we might hope from fine scale prior information that the true model is isotropic, in this case a common choice in seismology is to only invert for VP and VS (or even only one of the two) instead of cijkl. If the solution to the inverse problem $$\bar{\mathbf {m}}^h$$ depends on technical choices to parametrize $$\boldsymbol{\mathcal {M}}^h$$, we know that it cannot be the true model mt. Indeed, in general, mt does not belong to $$\boldsymbol{\mathcal {M}}^h$$: the single fact that mt is multiscale while $$\boldsymbol{\mathcal {M}}^h$$ is not prevents it. This is expected: we know for instance, following the diffraction tomography theory (Devaney 1984), that FWI can only reach a resolution down to half the smallest propagated wavelength. Nevertheless, the relation between $$\bar{\mathbf {m}}^h$$ and mt is unknown and this is a serious issue: in particular it can be difficult to dissociate true high-resolution information and small scale artefacts coming from a combination of noise, lack of illumination, and discretization setups. The knowledge of a more formal relation between $$\bar{\mathbf {m}}^h$$ and mt would thus allow a better interpretation of the FWI result $$\bar{\mathbf {m}}^h$$. The problem that $$\bar{\mathbf {m}}^h$$ strongly depends on the particular choice of $$\boldsymbol{\mathcal {M}}^h$$ and that the relation between $$\bar{\mathbf {m}}^h$$ and mt is unknown is not often discussed in the literature. We feel that this discussion is missing, as understanding these issues would help to better apprehend the behaviour of the inversion schemes and therefore could open ways to improve them. To us, this issue is directly linked to the question of how scales relate to each other through the wave equation: it is a homogenization problem. 3 HOMOGENIZATION CONCEPTS In this section, we summarize the results obtained by Capdeville et al. (2010a), Guillot et al. (2010) and Capdeville et al. (2010b) about homogenization of complex deterministic media with no scale separation for wave propagation problems. For limited frequency band data, seismic waves do not ‘see’ the true Earth model, but a blurred version of it. This blurred Earth model is called the effective or the homogenized Earth model. For a given signal maximum frequency fm, it can be computed thanks to the homogenization technique. As mentioned earlier, in most media, the maximum frequency fm can be associated with a minimum wavelength λmin, and in the following, we assume that a minimum wavelength λmin exists. We moreover assume that a lower bound estimate of λmin can be found as VSmin/fm where VSmin is the slowest velocity in the medium. This is a strong assumption. However, to the best of our knowledge, natural media always satisfies it (while synthetic complex meta-material can break it: for instance, resonating media with Helmholtz resonator inclusions (Zhao et al.2016)). We introduce λ0, a constant separating scales considered as macroscopic (large scales) from the one considered as microscopic (fine scales). We then introduce   \begin{equation} \varepsilon _0=\frac{\lambda _0}{\lambda _{{\rm min}}}\,, \end{equation} (16)a small parameter which measures the scale separation position with respect to λmin. For example, when homogenization is used as a pre-processing step to the forward modelling solver, in order to obtain a precise agreement between true and homogenized waveforms, ε0 = 0.5 is often a good choice for most geological media and a few tens of wavelength of wave propagation (Capdeville et al.2010b). Two-scale homogenization techniques are asymptotic methods that explicitly use two space variables: x for the macroscopic spatial variations and $$\mathbf {y}=\frac{\mathbf {x}}{\varepsilon _0}$$ for microscopic spatial variations. For example, for a multiscale medium such as a geological medium, the density ρ depends on both macroscopic and microscopic scales and is written $$\rho \left(\mathbf {x},\frac{\mathbf {x}}{\varepsilon _0}\right)$$. Similarly, the solution of the wave equation with a maximum frequency fm is written $$\mathbf {u}^{f_{\rm m}}\left(\mathbf {x},\frac{\mathbf {x}}{\varepsilon _0},t\right)$$. Although these properties are written as if they depend on two independent scales, it should be noted that these variables (and others to follow) only have a physical meaning once evaluated on the axis $$\mathbf {y}=\frac{\mathbf {x}}{\varepsilon _0}$$. A function κ that would not depend upon the microscopic scale would be written κ(x). For some quantity $$\kappa (\mathbf {x},\frac{\mathbf {x}}{\varepsilon _0})$$ depending on both scales, we may find an ‘effective’ version κ*(x) that only depends on the macroscopic scale. In the following, effective quantities are noted with a *. In general, the relation between two scales and effective quantities is complex and not just a simple linear smoothing. All the effective quantities actually depend on the actual value of ε0 and λmin. In the following, however, we drop this explicit dependency on ε0 and λmin to simplify the notations. An effective quantity $$\kappa ^{*,\varepsilon _0,\lambda _{{\rm min}}}$$ is simply denoted by κ*. The main results of the homogenization theory are the following: to the first order in ε0, the relation between the true and effective displacement is   \begin{eqnarray} \mathbf {u}^{f_{\rm m}}\left(\mathbf {x},\frac{\mathbf {x}}{\varepsilon _0},t\right) {=} \mathbf {u}^*(\mathbf {x},t) {+} \varepsilon _0\boldsymbol{\chi }\left(\mathbf {x},\!\frac{\mathbf {x}}{\varepsilon _0}\right):\boldsymbol{\epsilon }(\mathbf {u}^*)(\mathbf {x},t) {+} O\left(\varepsilon _0^2\right), \end{eqnarray} (17)where u* is the effective displacement, $$\boldsymbol{\chi }$$ is the first order corrector and $$\boldsymbol{\epsilon }(\mathbf {u}^*)$$ is the strain related to the effective displacement ($$\epsilon _{ij}(\mathbf {u}^*)=\frac{1}{2}(\partial _{x_i}u^*_j+\partial _{x_j}u^*_i)$$). This first-order corrector $$\boldsymbol{\chi }$$ is a third order tensor. It does not depend on time nor on sources, however, it depends on the fine scales x/ε0, ε0 and λmin (see Appendix A). The effective displacement u* is solution of the following effective wave equation   \begin{eqnarray} &&{\rho ^*\ddot{\mathbf {u}}^*-\boldsymbol{\nabla }\cdot \boldsymbol{\sigma }^*=\boldsymbol{f}^*,} \nonumber\\ &&\quad \quad \,\, \boldsymbol{\sigma }^*=\mathbf {c}^*:\boldsymbol{\epsilon }(\mathbf {u}^*)\,, \end{eqnarray} (18)  \begin{equation} \mathbf {T}(\mathbf {u}^*)=\varepsilon _0{\boldsymbol{\Gamma }}^*(\mathbf {u}^*)\,{\rm on }\,\partial \boldsymbol{\Omega }^*, \end{equation} (19)where $$\mathbf {T}(\mathbf {u}^*)=\mathbf {c}^*:\boldsymbol{\epsilon }(\mathbf {u}^*)\cdot \mathbf {n}^*$$, $$\partial \boldsymbol{\Omega }^*$$ is the effective boundary, n* its outward normal and $${\boldsymbol{\Gamma }}^*$$ is a Dirichlet to Neumann operator (DtN) for the effective boundary condition (Capdeville & Marigo 2008, 2013). It shall be noted that, to the leading order in ε0, the effective wave eqs (18) and (19) is also a wave equation, comparable to the original wave eqs (6) and (7), but with different elastic and density coefficients. This remains true to the first order in ε0, but the original Neumann boundary condition changes to become the DtN condition (19). The effective density ρ*(x), elastic tensor c*(x), the boundary layer corrector $${\boldsymbol{\Gamma }}^*(\mathbf {x})$$ as well as the corrector $$\boldsymbol{\chi }(\mathbf {x},\mathbf {x}/\varepsilon _0)$$ can be obtained thanks to the homogenization operator $$\boldsymbol{\mathcal {H}}$$:   \begin{equation} \left(\rho ^*,\mathbf {c}^*,{\boldsymbol{\Gamma }}^*,\boldsymbol{\chi }\right)=\boldsymbol{\mathcal {H}}(\rho ,\mathbf {c})\,. \end{equation} (20)The operator $$\boldsymbol{\mathcal {H}}$$ depends on ε0 and λmin. As λmin is directly tied to fm, $$\boldsymbol{\mathcal {H}}$$ is a frequency dependent operator. For a fixed ε0, the roughness of the effective medium ρ*, c* increases with fm. The nonlinear process behind the homogenization is summarized in Appendix A. From eq. (17), two important conclusions can be drawn: to leading order in ε0, the displacement does not depend on the microscopic scale $$\frac{\mathbf {x}}{\varepsilon _0}$$. To the first order, the displacement depends on the microscopic scale $$\frac{\mathbf {x}}{\varepsilon _0}$$, but only through a ‘site effect’ $$\boldsymbol{\chi }$$, that is independent of time and sources and which needs to be known only at the receiver positions. Finally, the effective source mechanism $$\mathbf {q}_s^*$$ can also be required. For moment tensor sources, the moment tensor has to be corrected to the leading order in ε0 as   \begin{equation} \mathbf {M}_s^*=\mathbf {G}(\mathbf {x}_s,\frac{\mathbf {x}_s}{\varepsilon _0}):\mathbf {M}, \end{equation} (21)where G is the strain concentrator and is defined in Appendix A. The order 1 correction for vector forces can be found in Capdeville et al. (2010a). Effective models belong to a space that is more complex to define than the fine scale models: the effective domain $$\boldsymbol{\Omega }^*$$ itself can be different from the original domain $$\boldsymbol{\Omega }$$: a fine scale topography can be replaced by an effective one (Capdeville & Marigo 2013); $$\boldsymbol{\mathcal {E}}^*(\boldsymbol{\Omega }^*)$$, the effective elastic properties admissible space gathers all the possible effective elastic tensor and density. The main difference with $$\boldsymbol{\mathcal {E}}$$ is that it is a finite dimensional space and its dimensions is proportional to (fm/ε0)d (see Appendix A); the source parameter space remains unchanged, unless first order correctors are introduced; a new space, $$\boldsymbol{\mathcal {C}}$$ gathering the receiver correctors χ at the receiver locations is necessary. Its dimension is proportional to Nr; a new space, $$\boldsymbol{\mathcal {G}}^*$$ gathering all the possible boundary correctors $${\boldsymbol{\Gamma }}^*$$ is also required. Its dimension is also finite and proportional to (fm/ε0)d − 1; finally, note that no new space is required for the source correction term (21). To the leading order, the corrected moment tensor is still a moment tensor and can be inverted as usual. This also implies that only the corrected moment tensor can be retrieved from the seismic data, not the true moment tensor (Burgos et al.2016). The effective model space   \begin{equation} \boldsymbol{\mathcal {M}}^*=\boldsymbol{\mathcal {E}}^*\cup \boldsymbol{\mathcal {S}}\cup \boldsymbol{\mathcal {C}}\cup \boldsymbol{\mathcal {G}}^*, \end{equation} (22)is also a finite dimensional space. It is the image, or the projection, of $$\boldsymbol{\mathcal {M}}$$ through the homogenization operator $$\boldsymbol{\mathcal {H}}$$:   \begin{equation} \boldsymbol{\mathcal {M}}^*=\boldsymbol{\mathcal {H}}(\boldsymbol{\mathcal {M}})\,. \end{equation} (23)In general, $$\boldsymbol{\mathcal {M}}$$ does not contain $$\boldsymbol{\mathcal {M}}^*$$  \begin{equation} \boldsymbol{\mathcal {M}}^*\not\subset \boldsymbol{\mathcal {M}}\,. \end{equation} (24)This can be due to the receivers and boundary correctors, but this might not be the only reason. Another important case where $$\boldsymbol{\mathcal {M}}^*$$ is not included in $$\boldsymbol{\mathcal {M}}$$ is related to the situation where $$\boldsymbol{\mathcal {M}}$$ is restricted to isotropic media. Because fine scale isotropic heterogeneities lead to anisotropic effective heterogeneities, $$\boldsymbol{\mathcal {M}}^*$$ contains anisotropic media, even for isotropic $$\boldsymbol{\mathcal {M}}$$: thus, $$\boldsymbol{\mathcal {M}}^*$$ might not be included in $$\boldsymbol{\mathcal {M}}$$. Finally, a model $$\mathbf {m}^*\in \boldsymbol{\mathcal {M}}^*$$ can be written as   \begin{eqnarray} &&{ \mathbf {m}^*=\left\lbrace (\rho ^*(\mathbf {x}),\mathbf {c}^*(\mathbf {x})), \ \forall \mathbf {x}\in \boldsymbol{\Omega }^*\,; \right. (\mathbf {x}_s,\mathbf {q}^*_s), s\in \lbrace 1,...,N_s\rbrace ; } \nonumber\\ &&\boldsymbol{\chi }\left(\mathbf {x}_r,\frac{\mathbf {x}_r}{\varepsilon _0}\right), r\in \lbrace 1,...,N_r\rbrace ; \left. {\boldsymbol{\Gamma }}^*(\mathbf {x}), \ \forall \mathbf {x}\in \partial \boldsymbol{\Omega }^*\right\rbrace. \end{eqnarray} (25) 4 THE HOMOGENIZATION FULL WAVEFORM INVERSION (HFWI) PROBLEM Based on the homogenization framework, we can define another misfit function: for any model $$\mathbf {m}^*\in \boldsymbol{\mathcal {M}}^*,$$  \begin{equation} E^*(\mathbf {m}^*)=\sum _{r,s}\int _0^T\left(\mathbf {d}_s^{f_{\rm m}}\left(\mathbf {x}_r,t)-\tilde{\mathbf {u}}^*(\mathbf {x}_r,\frac{\mathbf {x}_r}{\varepsilon _0},t;\mathbf {m}^*\vert _s\right)\right)^2\,{\rm d}t,\!\! \end{equation} (26)where   \begin{eqnarray} \tilde{\mathbf {u}}^*\left(\mathbf {x},\frac{\mathbf {x}}{\varepsilon _0},t;\mathbf {m}^*\vert _s\right) {=} \mathbf {u}^*(\mathbf {x},t;\mathbf {m}^*\vert _s) {+} \varepsilon _0\boldsymbol{\chi }\left(\mathbf {x},\frac{\mathbf {x}}{\varepsilon _0}\right):\boldsymbol{\nabla }\mathbf {u}^*(\mathbf {x},t;\mathbf {m}^*\vert _s) \nonumber\\ \end{eqnarray} (27)is the zero order effective displacement plus the first order corrector at x. The associated effective (or homogenized) minimization problem is   \begin{equation} \bar{\mathbf {m}}^*= \mathop{arg\min }\limits_{\mathbf {m}^*\in \boldsymbol{\mathcal {M}}^*} E^*(\mathbf {m}^*)\,. \end{equation} (28)Eqs (26)–(28) define the homogenized FWI (HFWI) problem. For a fine scale model $$\mathbf {m}\in \boldsymbol{\mathcal {M}}$$ and its homogenized version $$\mathbf {m}^*=\boldsymbol{\mathcal {H}}(\mathbf {m})$$, using (17), we have   \begin{equation} E^{f_{\rm m}}(\mathbf {m})=E^*(\mathbf {m}^*)+O(\varepsilon _0^2)\,. \end{equation} (29)Knowing that mt minimizes E and obviously also minimizes $$E^{f_{\rm m}}$$, we deduce from the last equation that $$\mathbf {m}_t^*=\boldsymbol{\mathcal {H}}(\mathbf {m}_t)$$ is a solution of the HFWI problem (28). The solution of the HFWI problem therefore belongs to $$\boldsymbol{\mathcal {M}}^*$$ by construction. This fact is not true for many $$\boldsymbol{\mathcal {M}}^h$$ choices for classical FWI approaches. For example, $$\boldsymbol{\mathcal {M}}^h$$ is often limited to isotropic media whereas the solution of the inverse problem is anisotropic due to upscaling effects, implying the solution does not belong to $$\boldsymbol{\mathcal {M}}^h$$. This is an important result: the homogenized version of the true model is a solution of the HFWI problem. If we then assume that the solution of the HFWI problem is unique, this leads to establish the relation between the true model and the limited frequency band inverse problem solution   \begin{equation} \bar{\mathbf {m}}^*= \boldsymbol{\mathcal {H}}(\mathbf {m}_t)\,. \end{equation} (30)Assuming the uniqueness of the solution of the HFWI problem is a strong statement. Nevertheless, this uniqueness can be reached, as it has been illustrated numerically in the layered media case (Capdeville et al.2013). We show in this study that uniqueness of the solution is also possible in the case of 2-D spatially varying media, at least for low noise and good data coverage. If this assumption is not met, which can happen in many situations (bad coverage, bad quality data ...), we fall back to a classical local minimum problem. In such cases, similarly to most other local optimization methods, little can be done apart from using better data coverage, better quality data, relying on a better starting model, or introducing a priori information. Compared to more classical finite dimensional space approximation $$\boldsymbol{\mathcal {M}}^h$$ used to solve the FWI, using $$\boldsymbol{\mathcal {M}}^*$$ and solving the HFWI problem thus presents the following advantages: similarly to $$\boldsymbol{\mathcal {M}}^h$$, $$\boldsymbol{\mathcal {M}}^*$$ is a finite dimensional space (see Appendix A) rendering the HFWI problem solvable; the relation between the solution of the HFWI problem and the true model is known through (30). This point is very important for the interpretation of the inverted model (also called ‘downscaling’ step); the solution of the HFWI problem exists and belongs to $$\boldsymbol{\mathcal {M}}^*$$. In the next section, we present how to practically solve the HFWI problem. 5 HFWI PRACTICAL CONSIDERATIONS 5.1 The minimization scheme We rely here on the least squares approach and local optimization algorithms, a pragmatic and well adapted choice for the HFWI problem. In this study, we implement the Gauss–Newton iterative scheme (see for instance Tarantola & Valette 1982). Given mi, the inverted model at iteration i, we obtain the model at the iteration i + 1 as   \begin{equation} \mathbf {m}^{i+1}=\mathbf {m}^i+\left({(\mathbf {F}^i)}^{\intercal }\cdot \mathbf {F}^i +{\boldsymbol{\lambda }}^i \right)^{-1}\left[ {(\mathbf {F}^i)}^{\intercal }\cdot (\mathbf {d}-\tilde{\mathbf {u}}^*(\mathbf {m}^i))\right], \end{equation} (31)where d is the data vector for all sources, receivers and components, $$\tilde{\mathbf {u}}^*(\mathbf {m}^i)$$ is the effective synthetic seismograms computed in model mi also for all sources, receivers and components, Fi is the partial derivative matrix (matrix of the Fréchet derivatives)   \begin{equation} \mathbf {F}^i=\left[\frac{\partial \tilde{\mathbf {u}}^*}{\partial \mathbf {m}}\right]_{\mathbf {m}=\mathbf {m}^i}\,. \end{equation} (32)The damping parameter $${\boldsymbol{\lambda }}^i$$ is used to stabilize the inversion of the approximate Hessian (Fi)⊺ · Fi. We rely on the adjoint technique to numerically evaluate the waveform partial derivative Fi (Tarantola 1984; Pratt et al.1998). Practically, the full wavefields for each source and adjoint source are stored on disk and the partial derivatives are assembled afterward. For modest size 2-D tests, this solution can be efficiently implemented in terms of memory requirement and computational resources. The damping $${\boldsymbol{\lambda }}^i$$ is decreased while the iteration number increases until convergence. Whatever the technical choices made to solve (28), it implies describing the homogenized model space $$\boldsymbol{\mathcal {M}}^*$$. The most direct way to do so would be to set up an explicitly parametrization of this space. We consider two cases: the layered media case, for which such an explicit parametrization already exists, and the general case, for which it is not yet available. 5.2 Parametrization: the layered media case The layered media case is a special case in the sense that there exists an analytical solution to the cell problem (A3). VTI layered media can be described by five elastic parameters A(z), C(z), F(z), L(z), F(z) (e.g. Stoneley 1949) where z is the axis perpendicular to the layering. In such a case, the homogenization operator $$\boldsymbol{\mathcal {H}}$$ consists of the three following steps (Backus 1962; Capdeville & Marigo 2007): build the Backus parameter vector:   \begin{equation} \mathbf {p}(z)=\left(\frac{1}{C},\frac{1}{L},A-\frac{F^2}{C},\frac{F}{C},N,\rho \right)(z); \end{equation} (33) upscale the Backus vector with a simple spatial low-pass filtering (see Appendix A; this low-pass filtering in space shall not be confused with the time domain low-pass filtering used previously to limit the data frequency band):   \begin{equation} \mathbf {p}^*(z)=\mathcal {F}^{k_0}(\mathbf {p})(z); \end{equation} (34) build back the effective transversely isotropic coefficients:   \begin{eqnarray} (A^*,C^*,F^*,L^*,N^*,\rho ^*)(z)\!=\!\left(p_3^*\!+\!\frac{p_4^*}{p_1^*},\!\frac{1}{p_1^*},\!\frac{p_4^*}{p_1^*} ,\!\frac{1}{p_2^*},p_5^*,p_6^* \right)(z)\,. \nonumber\\ \end{eqnarray} (35) Note that eq. (34) is equivalent to muting coefficients in the Fourier domain for spatial frequencies larger than k0. If $$\tilde{\mathbf {p}}$$ are the Fourier coefficients of p,   \begin{equation} \tilde{\mathbf {p}}_k=\frac{1}{L_z}\int _{L_z}\mathbf {p}(z){\rm e}^{-ikz} {\rm d}z, \end{equation} (36)where Lz is the layered domain, then, eq. (34) corresponds to   \begin{equation} \mathbf {p}^*(z)=\sum _{\vert k\vert < k_0}\tilde{w}_k\tilde{\mathbf {p}}_k\, {\rm e}^{ikz}, \end{equation} (37)where $$\tilde{w}_k$$ are the Fourier coefficient of the filtering wavelet hidden in $$\boldsymbol{\mathcal {F}}^{k_0}$$. In such a case, we can define a parametrization based on the Backus parameter in the Fourier coefficient domain. Forgetting about receiver and boundary correctors, we can build $$\boldsymbol{\mathcal {M}}^*$$ as   \begin{equation} \boldsymbol{\mathcal {M}}^*=\left\lbrace {\rm all\, admissible}\,\tilde{\mathbf {p}}_k, k\in \lbrace -k_0, ..., k_0\rbrace \right\rbrace, \end{equation} (38)which defines a finite dimensional explicit parametrization of $$\boldsymbol{\mathcal {M}}^*$$. A similar parametrization has been used by Capdeville et al. (2013) (but with Lagrange polynomial instead of Fourier polynomial) to numerically illustrate that an FWI model is indeed an homogenized model in the layered media case. 5.3 Parametrization: the general media case For more general media, an explicit parametrization of the homogenized model space is more difficult to set up. This is due to the fact that the analytical solution to the cell problem (A3) used in the layered media case does not exist for higher dimension cases (2-D and 3-D heterogeneities). It could be nevertheless possible to design an explicit $$\boldsymbol{\mathcal {M}}^*$$ parametrization and this is a possibility we intend to study in the future. In this study, we rely on a simpler idea: instead of searching for a solution directly in $$\boldsymbol{\mathcal {M}}^*$$, we rely on an approximate $$\boldsymbol{\mathcal {M}}^{*h}$$ and use the homogenization operator $$\boldsymbol{\mathcal {H}}$$ to project the solution back to $$\boldsymbol{\mathcal {M}}^{*}$$. For iterative schemes such as the Gauss–Newton algorithm used in the present work, the projection can be done at each iteration (see Fig. 2), or only at convergence, as needed. If an increasing frequency band is used through the iterations (for frequency-grid continuation (Kolb et al.1986; Bunks et al.1995)), the projection needs to be done at least at each frequency band jump. This is a simple solution, which comes with the following drawback: depending on the practical choices made for $$\boldsymbol{\mathcal {M}}^{*h}$$, the iterative inversion scheme may fail to converge if $$\boldsymbol{\mathcal {M}}^{*h}$$ restricts too much the update from progressing toward the solution once the new model is homogenized. Unfortunately, we are not able yet to provide a precise criteria on $$\boldsymbol{\mathcal {M}}^{*h}$$ to ensure the convergence. Figure 2. View largeDownload slide Flowchart used to apply the homogenization projection in the Gauss–Newton scheme. Figure 2. View largeDownload slide Flowchart used to apply the homogenization projection in the Gauss–Newton scheme. Indeed, there are an infinite number of possibilities for practically setting up an explicit parametrization for $$\boldsymbol{\mathcal {M}}^{*h}$$. In the following examples, limiting ourselves to 2-D cases, we rely on a piece-wise polynomial basis for the spatial parametrization. More precisely, a square inversion sub-domain $$\boldsymbol{\mathcal {I}}\subset \boldsymbol{\Omega }$$ is chosen and divided into n × n non-overlapping elements:   \begin{equation*} \boldsymbol{\mathcal {I}}=\sum _{e=1}^{n^2} \boldsymbol{\mathcal {I}}_e^{n}. \end{equation*} An example of inversion domain $$\boldsymbol{\mathcal {I}}$$ and an associated inversion mesh for n = 10 is shown in Fig. 3(a). Over each element $$\boldsymbol{\mathcal {I}}_e^{n}$$, similar to what is done for spectral elements, elastic parameters and density are represented using a 2-D tensorial product polynomial approximation of degree N in each direction. This defines the parametrization $$\boldsymbol{\mathcal {P}}_n^N(\boldsymbol{\mathcal {I}})$$ (n × n elements of degree N × N) that we use for the numerical tests in the next section. We do not impose the continuity of the fields between elements, which implies that $$\boldsymbol{\mathcal {P}}_n^N(\boldsymbol{\mathcal {I}})$$ has n2 × (N + 1)2 degrees of freedom for each scalar. Figure 3. View largeDownload slide Weak and strong contrast circular inclusion tests configuration. (a) sketch of the elastic model and of the sources (stars) receivers (triangles) geometry used to generate the synthetic data to be inverted. The square in black dashed line represents the inverted domain $$\boldsymbol{\mathcal {I}}$$. The grey dashed lines represent the elements $$\boldsymbol{\mathcal {I}}_e^{10}$$ of an inversion mesh example. The dotted line is the cross-section line used for panel (c). (b) kinetic energy snapshot for one of the source (star symbol), for t = 11/fm for the strong heterogeneity case. The unstructured spectral element mesh used to solve the wave equation is overlaid in grey. (c) VS cross-section along the dotted line in plot (a) for weak and strong heterogeneity models. See Table 1 for a complete description of the elastic material properties of those models. Figure 3. View largeDownload slide Weak and strong contrast circular inclusion tests configuration. (a) sketch of the elastic model and of the sources (stars) receivers (triangles) geometry used to generate the synthetic data to be inverted. The square in black dashed line represents the inverted domain $$\boldsymbol{\mathcal {I}}$$. The grey dashed lines represent the elements $$\boldsymbol{\mathcal {I}}_e^{10}$$ of an inversion mesh example. The dotted line is the cross-section line used for panel (c). (b) kinetic energy snapshot for one of the source (star symbol), for t = 11/fm for the strong heterogeneity case. The unstructured spectral element mesh used to solve the wave equation is overlaid in grey. (c) VS cross-section along the dotted line in plot (a) for weak and strong heterogeneity models. See Table 1 for a complete description of the elastic material properties of those models. 6 SYNTHETIC INVERSION TESTS In this section, we present a series of examples illustrating the ideas presented above. More precisely, the objectives are to illustrate through numerical experiments: the validity of eq. (30) in the presented case studies; the influence of the practical choice of $$\boldsymbol{\mathcal {M}}^{h}$$ for standard FWI; the influence of the practical choice of $$\boldsymbol{\mathcal {M}}^{*h}$$ for HFWI; that the HFWI problem can be successful where a classical FWI fails. We run three synthetic inversions in the two heterogeneous models presented in Figs 3 and 4. In both ‘true’ elastic models, unknown heterogeneities are embedded in a known homogeneous isotropic background. The background density, P and S wave velocities are denoted by ρ0, VP0 and VS0 respectively. We use 16 sources and 16 receivers, regularly located around the heterogeneities. Figure 4. View largeDownload slide Faulted layers test configuration. (a) sketch of the elastic model and of the sources (stars) receivers (triangles) geometry used to generate the synthetic data to be inverted. The square in black dashed line represents the inverted domain $$\boldsymbol{\mathcal {I}}$$ and the dotted line is the cross-section line used for plot (c). (b) kinetic energy snapshot for one of the source (star symbol), for t = 11/fm for the strong heterogeneity case. The unstructured spectral element mesh used to solve the wave equation is overlapped in grey lines. (c) VS cross-section along the dotted line in plot (a). See Table 1 for a complete description of the elastic material properties of this model. Figure 4. View largeDownload slide Faulted layers test configuration. (a) sketch of the elastic model and of the sources (stars) receivers (triangles) geometry used to generate the synthetic data to be inverted. The square in black dashed line represents the inverted domain $$\boldsymbol{\mathcal {I}}$$ and the dotted line is the cross-section line used for plot (c). (b) kinetic energy snapshot for one of the source (star symbol), for t = 11/fm for the strong heterogeneity case. The unstructured spectral element mesh used to solve the wave equation is overlapped in grey lines. (c) VS cross-section along the dotted line in plot (a). See Table 1 for a complete description of the elastic material properties of this model. The source mechanism is the same for all sources: it is an horizontal vector force. The source time function gs(t) is a tapered door function in the frequency domain (see Fig. 5) which has a maximum frequency fm. This source time function is the same for all the sources. We define the background minimum wavelength as   \begin{equation} \lambda _{{\rm min}}=V_{S0}/f_{\rm m}. \end{equation} (39)In the following, all distances are measured as functions of λmin, the frequencies as functions of fm and the time as a function of 1/fm. For both settings, the domain size is $$16\times 16\ \lambda _{{\rm min}}^2$$ with absorbing boundary conditions around the domain. The inversion sub-domain $$\boldsymbol{\mathcal {I}}$$ is a square of dimension $$12\times 12\ \lambda _{{\rm min}}^2$$. The sources and receivers are located around $$\boldsymbol{\mathcal {I}}$$ leading to an excellent data coverage. Nevertheless this coverage is still sparse as the distance between each receiver and each source is equal to 3.2 λmin. Note that the sources and receivers are located outside of $$\boldsymbol{\mathcal {I}}$$ and that no free surface is present in the inverted domain. The idea of such settings is to avoid inverting for the receiver corrector $$\boldsymbol{\chi }$$, the source parameters as well as the boundary corrector $${\boldsymbol{\Gamma }}^*$$ and to focus on the reconstruction of ρ* and c*. Figure 5. View largeDownload slide Left: source time function gs amplitude spectra. The frequency axis unit is the maximum frequency fm. Right: source time function gs in the time domain. The time axis unit is 1/fm. Figure 5. View largeDownload slide Left: source time function gs amplitude spectra. The frequency axis unit is the maximum frequency fm. Right: source time function gs in the time domain. The time axis unit is 1/fm. For the first setting, (‘circular inclusions’, Fig. 3) two circular inclusions, one faster and one slower than the background medium are present. The faster inclusion is surrounded by a thin slow layer representing a damaged layer. The thickness of this layer is λmin/8. Both the sharpness of the circular inclusions and the thin damaged layer are fine scale and can only be recovered in an effective way by FWI. Two velocity contrasts are used for the inclusions in two different experiments, one labelled as ‘weak contrast’, the other one labelled as ‘strong contrast’ (see Table 1 and Fig. 3c). Table 1. Material properties used in the different tests.   VP (km s−1)  VS (km s−1)  ρ (103 kg m−3)  Background  5.6  3.17  2.61  Circular inclusions, weak heterogeneities case:  A  6.27  3.48  2.73  B & d  4.85  2.69  2.47  Circular inclusions, strong heterogeneities case:  A  7.41  4.12  3.05  B & d  2.3  1.58  2.35  Layered inclusion test:  Slow layers:  4.67  2.6  2.45  Fast layers:  6.72  3.74  2.85    VP (km s−1)  VS (km s−1)  ρ (103 kg m−3)  Background  5.6  3.17  2.61  Circular inclusions, weak heterogeneities case:  A  6.27  3.48  2.73  B & d  4.85  2.69  2.47  Circular inclusions, strong heterogeneities case:  A  7.41  4.12  3.05  B & d  2.3  1.58  2.35  Layered inclusion test:  Slow layers:  4.67  2.6  2.45  Fast layers:  6.72  3.74  2.85  View Large For the second inversion setting, (‘faulted layers’, Fig. 4), a horizontally layered medium is split by a tiled fault. The layer thickness is also λmin/8. Both the layers and the tilted fault are fine scale which can, again, only be recovered in an effective way by FWI. Choosing λmin = O(1cm), the ‘circular inclusions’ model could represent for instance a piece of concrete with an iron bar inclusion (the fast inclusion ‘A’) and two damaged areas (the slow inclusions ‘B’ and ‘d’), one circular and one around the bar. Choosing λmin = O(10 m), the ‘faulted layered’ model could represent a more geological situation where the sediments represented by the layers are offset by a tilted fault. In both cases, imaging as best as possible the heterogeneities is the challenge. For the first situation, the acquisition setting could be thought as ‘realistic’, however this is not the case for the second. Nevertheless, our objective here is to focus on the ideas developed in the previous sections without mixing with other important issues such as the data coverage. This will be the purpose of future works. For each experiment, the data to be inverted are generated using a 2-D spectral element solver (Komatitsch & Vilotte 1998), meshing all the interfaces of the heterogeneities (see Figs 3b and 4b). Perfectly Matched Layers (PMLs; Festa & Vilotte 2005) are used to absorb outgoing waves around the domain. No synthetic noise is added to the synthetic data. The spectral element mesh used to compute the approximate Hessian and the gradient in eq. (31) is a simple regular mesh based on the inversion parametrization (on the $$\boldsymbol{\mathcal {I}}_e^{n}$$ elements). In these synthetic tests, we avoid an ‘inverse crime’ by virtue of the following: first, the fact that the spectral element meshes used to generate the data to be inverted ($$\mathbf {d}^{f_{\rm m}}$$) and to model the data during the inversion process (u*) are not the same implies that the discretization errors are different. Second, the true model cannot be accurately represented with the inversion parametrization, which prevents us from injecting prior information about the inclusion shapes into the inversion. Nevertheless, the PML error (remaining small reflections from the domain boundaries) are the same for the data generation and inversion, and this error might be used by the inversion scheme. The following inversion tests often involve the full elastic tensor cijkl and its six independent coefficients (in 2-D). To present the results, we choose to first project anisotropic c to the nearest isotropic tensor ciso following Browaeys & Chevrot (2004). Then, P- and S-wave velocities are defined as   \begin{equation} V_P(\mathbf {x})=\sqrt{(c_{1111}^{\rm iso}(\mathbf {x})/\rho (\mathbf {x}))}\,, \end{equation} (40)  \begin{equation} V_S(\mathbf {x})=\sqrt{(c_{1212}^{\rm iso}(\mathbf {x})/\rho (\mathbf {x}))}\,, \end{equation} (41)and the total anisotropy as   \begin{equation} {\rm aniso}(\mathbf {x})=\frac{\sqrt{\sum _{ijkl}(c^{\rm iso}_{ijkl}(\mathbf {x})-c_{ijkl}(\mathbf {x}))^2}}{\sqrt{\sum _{ijkl}(c^{\rm iso}_{ijkl}(\mathbf {x}))^2}}. \end{equation} (42) 6.1 Circular inclusion weak heterogeneity test In this Section, we present a first test using the circular inclusion model with the ‘weak’ heterogeneities (see Fig. 3 and Table 1). The heterogeneities are chosen to be weak enough so that the Gauss–Newton scheme can converge without projecting to the homogenized model space at each iteration. In other word, the Gauss–Newton scheme is applied classically without using homogenization. The fine scales of the model such as the layer ‘d’ and also the sharp velocity jumps between the background media and the inclusions cannot be recovered by the inversion. Indeed they are below the resolution limit, and, technically, they cannot be represented in a spatial parametrization without prior knowledge about their shape (i.e. without explicitly meshing the discontinuities). This situation is realistic: in general, the heterogeneity geometry is unknown. We perform three different inversions each using a different model space $$\boldsymbol{\mathcal {M}}^h$$ parametrization: $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$: $$\boldsymbol{\mathcal {P}}_{30}^1(\boldsymbol{\mathcal {I}})$$ with ρ, VP and VS; $$\boldsymbol{\mathcal {M}}_{10}^{3,{\rm ani}}$$: $$\boldsymbol{\mathcal {P}}_{10}^3(\boldsymbol{\mathcal {I}})$$ with ρ and c; $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$: $$\boldsymbol{\mathcal {P}}_{20}^1(\boldsymbol{\mathcal {I}})$$ with ρ and c; where the notation $$\boldsymbol{\mathcal {P}}_n^N(\boldsymbol{\mathcal {I}})$$ for the decomposition of the domain $$\boldsymbol{\mathcal {I}}$$ in n × n elements of degree N × N is defined at the end of Section 5.3. These three parametrizations have roughly the same number of free parameters (unknowns), 302 × 22 × 3 = 10800 for $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$, 102 × 42 × 7 = 11200 for $$\boldsymbol{\mathcal {M}}_{10}^{3,{\rm ani}}$$ and 202 × 22 × 7 = 11200 for $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$. The three inversions are carried out and the Gauss–Newton scheme is stopped when a 98 per cent reduction in the square-root of the misfit function is reached. In each of the three cases, the inversion converges in about 10 iterations. The raw results of the inversions are presented in Figs 6 and 7. By raw results, we mean the inversion results before any homogenization. From these figures, it can be noted that the three inversions qualitatively retrieve the location and signs of the reference model heterogeneities. Nevertheless, quantitatively, the models are different from each other and all include a significant amount of noise. Indeed, from the VS maps (Fig. 6), the heterogeneous circles can be roughly located and the sign of the heterogeneity contrast (slow or fast) can be guessed. Even the slow layer around the inclusion ‘A’ seems to be reconstructed. Nevertheless, from the cross-sections (Fig. 7), it appears that the results are not accurate. The actual velocity values in each inclusion are not recovered. Moreover, the results strongly depend on the parametrization choice: the imprint of the inversion mesh is clearly visible and, for example, it is difficult to measure the value of the actual thickness and the VS value of the thin slow layer ‘d’. Figure 6. View largeDownload slide Weak contrast circular inclusion test raw inversion results. VS for the reference model (upper left panel), raw VS inversion results for the $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ parametrization (upper right panel), $$\boldsymbol{\mathcal {M}}_{10}^{3,{\rm ani}}$$ (lower left panel) and $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$ (lower right panel) parametrizations are plotted. For $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ and $$\boldsymbol{\mathcal {M}}_{10}^{3,{\rm ani}}$$, VS is defined following eq. (41). Figure 6. View largeDownload slide Weak contrast circular inclusion test raw inversion results. VS for the reference model (upper left panel), raw VS inversion results for the $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ parametrization (upper right panel), $$\boldsymbol{\mathcal {M}}_{10}^{3,{\rm ani}}$$ (lower left panel) and $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$ (lower right panel) parametrizations are plotted. For $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ and $$\boldsymbol{\mathcal {M}}_{10}^{3,{\rm ani}}$$, VS is defined following eq. (41). Figure 7. View largeDownload slide Weak contrast circular inclusion test raw inversion results. Cross-sections along the dotted line shown in Fig. 3(a), for the reference model (‘ref’) and for the raw inversion results for the $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$, $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ and $$\boldsymbol{\mathcal {M}}_{10}^{3,{\rm ani}}$$ parametrizations. VS, VP, ρ and the total anisotropy are presented. For anisotropic parametrizations, VP, VS and the total anisotropy are defined in eqs (40)–(42). Figure 7. View largeDownload slide Weak contrast circular inclusion test raw inversion results. Cross-sections along the dotted line shown in Fig. 3(a), for the reference model (‘ref’) and for the raw inversion results for the $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$, $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ and $$\boldsymbol{\mathcal {M}}_{10}^{3,{\rm ani}}$$ parametrizations. VS, VP, ρ and the total anisotropy are presented. For anisotropic parametrizations, VP, VS and the total anisotropy are defined in eqs (40)–(42). We next follow the main idea of this paper and project all the inversions result as well as the reference model to the homogenized space using the homogenization operator $$\boldsymbol{\mathcal {H}}$$. To this purpose, we need to give a value to the ε0 parameter (see eq. 16). As mentioned earlier, when dealing with forward modelling problems, ε0 = 0.5 is often a good choice. However, for this projection step any value can be chosen in practice. To focus on long wavelength results, we choose ε0 = 2, to study results at the resolution limit, ε0 = 0.5 is used while for intermediate results we choose ε0 = 1. The homogenized reference model and those resulting from the three inversions are presented in Figs 8 and 9. From the 2-D maps obtained for ε0 = 1, it appears clearly that the four models are the same once homogenized. This is quantitatively confirmed with the model cross-sections (Fig. 9) plotted for three different values of ε0 (2, 1 and 0.5). For large ε0 (low resolution), the models are all the same within a small error. For small ε0 (high resolution), the models are still in good agreement, but with a larger dispersion. The fact that the error becomes larger with smaller ε0, for a fixed maximum frequency fm, is expected: when reaching the resolution limit (≃ λmin/2, which corresponds to ε0 = 0.5), the wavefield sensitivity to the heterogeneities is lower, leading to a poor constraint on their amplitude. A good compromise between resolution and accuracy seems to be here ε0 = 1. Figure 8. View largeDownload slide Weak contrast circular inclusion test reference model and three different inversion results, all homogenized with ε0 = 1 (see Section 6.1). VS, VP, ρ and the ‘total anisotropy’ are presented (lines of panels from top to bottom, respectively ). Reference, $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$, $$\boldsymbol{\mathcal {M}}_{10}^{3,{\rm ani}}$$ and $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$ inverted models are plotted (left to right columns of panels, respectively). VS, VP and the ‘total anisotropy’ are obtained following eqs (40)–(42). Figure 8. View largeDownload slide Weak contrast circular inclusion test reference model and three different inversion results, all homogenized with ε0 = 1 (see Section 6.1). VS, VP, ρ and the ‘total anisotropy’ are presented (lines of panels from top to bottom, respectively ). Reference, $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$, $$\boldsymbol{\mathcal {M}}_{10}^{3,{\rm ani}}$$ and $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$ inverted models are plotted (left to right columns of panels, respectively). VS, VP and the ‘total anisotropy’ are obtained following eqs (40)–(42). Figure 9. View largeDownload slide Weak contrast circular inclusion test reference and homogenized inversion results. Cross-sections along the dotted line in Fig. 3(a) for the reference model (‘ref’) and for the inversion results for the $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$, $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ and $$\boldsymbol{\mathcal {M}}_{10}^{3,{\rm ani}}$$ parametrizations are represented. Three different values of ε0 are used (ε0 = 2, 1 and 0.5). VP, VS and the ‘total anisotropy’, obtained following eqs (40)–(42), are represented. Figure 9. View largeDownload slide Weak contrast circular inclusion test reference and homogenized inversion results. Cross-sections along the dotted line in Fig. 3(a) for the reference model (‘ref’) and for the inversion results for the $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$, $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ and $$\boldsymbol{\mathcal {M}}_{10}^{3,{\rm ani}}$$ parametrizations are represented. Three different values of ε0 are used (ε0 = 2, 1 and 0.5). VP, VS and the ‘total anisotropy’, obtained following eqs (40)–(42), are represented. It is interesting to note that the isotropic inversion ($$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$) is able to retrieve the expected anisotropy once homogenized (even if its amplitude is slightly too low). It implies that the inversion is able to find an isotropic model, that is not the true model, but that once homogenized is in good agreement with the true homogenized model. Finally, it can also be noted that the density is retrieved relatively accurately (the density results seem to be more noisy, but this should be related to the fact that the density contrast is quite small compared to the velocity contrasts associated with VS and VP). From this first experiment, we can conclude that the elastic models resulting from FWI are definitively not unique: they depend on the particular choice of parametrization for $$\boldsymbol{\mathcal {M}}^h$$. However, the homogenized versions of these models are actually all the same. 6.2 Faulted layers test A case study similar to the previous one is presented here, with the faulted layers model described in Fig. 4. Only two parametrizations are tested: $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ and $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$. The inversions are carried out using the same settings as the one defined for the previous case study, with the same misfit reduction stopping criteria. The $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ inversion converges within 20 iterations, but the $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$ inversion stops progressing after four iterations and only a 50 per cent misfit reduction is achieved. As it can be seen in the raw inversion results (Figs 10 and 11), it is not possible to find a model explaining the data in $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$, at least with a local optimization algorithm such as the Gauss–Newton scheme used here. Figure 10. View largeDownload slide Faulted layers test raw inversion results. VS for the reference model (left panel), raw VS inversion results for the $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ parametrization (middle panel) and $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$ (right panel) parametrizations are plotted. For the anisotropic parametrization, VS is defined following eq. (41). Figure 10. View largeDownload slide Faulted layers test raw inversion results. VS for the reference model (left panel), raw VS inversion results for the $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ parametrization (middle panel) and $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$ (right panel) parametrizations are plotted. For the anisotropic parametrization, VS is defined following eq. (41). Figure 11. View largeDownload slide Faulted layers test raw inversion results. Cross-sections along the dotted line shown in Fig. 4(a), for the reference model (‘ref’) and for the raw inversion results for the $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$ and for $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ parametrizations. VS, VP, ρ and the total anisotropy are presented. For anisotropic parametrization, VP, VS and the total anisotropy are defined following eqs (40)–(42). Figure 11. View largeDownload slide Faulted layers test raw inversion results. Cross-sections along the dotted line shown in Fig. 4(a), for the reference model (‘ref’) and for the raw inversion results for the $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$ and for $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ parametrizations. VS, VP, ρ and the total anisotropy are presented. For anisotropic parametrization, VP, VS and the total anisotropy are defined following eqs (40)–(42). To test if the homogenized inversion principle can help in this case, we use the algorithm presented in Fig. 2: at each Gauss–Newton iteration, we project the obtained model to the homogenization space with the $$\boldsymbol{\mathcal {H}}$$ operator using ε0 = 1/3. We therefore use $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$ for $$\boldsymbol{\mathcal {M}}^{*h}$$. In the following, we refer to this inversion as the $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso},*}$$ parametrization inversion. Inserting this projection step, the inversion almost converges (a 96 per cent misfit reduction was obtained) after about 50 Gauss–Newton iterations. Of course, there is here a little interest in using the homogenized inversion compared to inverting in $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ (the inversion convergence is slow), but it shows that projecting to the homogenized space can help to recover a failing inversion. At this stage, we cannot conclude if this observation goes beyond what could be obtained with simple Tikhonov regularization (see e.g. Brenders & Pratt (2007)) and this will need to be investigated deeper in future works. This $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso},*}$$ inversion is the first presented example of the HFWI algorithm. The raw results for the $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ and $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$ inversions are the ones presented in Figs 10 and 11. There is no raw results for the $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso},*}$$ inversion because each iteration is homogenized. From these raw results, we see that, differently to the circular inclusions test, it is only possible to tell that an heterogeneity is present in the layered area, without any visible or qualitative detail on it. We can only tell that, for the $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ inversion, the layered area appears as a slow velocity area. It is not possible to guess anything about the layering and even less about the fault. Following the main idea of this study, Figs 12 and 13 depict the homogenized true model, the homogenized $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$, $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$ and $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso},*}$$ inverted models, for ε0 = 1. Similarly to the previous test, the $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ inverted model is, once homogenized, in very good agreement with the homogenized true model. These plots also confirm that the $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$ has failed to converge to a correct model. Nevertheless, when the homogenization is used at each Gauss–Newton iteration, the same inversion $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso},*}$$ result is, once homogenized with ε0 = 1, in very good agreement with the homogenized true model. For the $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ and $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso},*}$$ inversions, the position of the fault can also be guessed from the total anisotropy 2-D plot (Fig. 12, bottom panels). Figure 12. View largeDownload slide Faulted layers test reference model and inversion results, all homogenized with ε0 = 1 (see Section 6.2). VS, VP, ρ and the ‘total anisotropy’ are presented (lines of panels from top to bottom, respectively ). Reference, $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$, $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso},*}$$ and $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$ inverted models are plotted (left to right columns of panels, respectively). VS, VP, and the ‘total anisotropy’ are obtained following eqs (40)–(42). Figure 12. View largeDownload slide Faulted layers test reference model and inversion results, all homogenized with ε0 = 1 (see Section 6.2). VS, VP, ρ and the ‘total anisotropy’ are presented (lines of panels from top to bottom, respectively ). Reference, $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$, $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso},*}$$ and $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$ inverted models are plotted (left to right columns of panels, respectively). VS, VP, and the ‘total anisotropy’ are obtained following eqs (40)–(42). Figure 13. View largeDownload slide Faulted layers test reference and inversion homogenized models. Cross-sections along the dotted line in Fig. 4(a) for ε0 = 1 are plotted. VP, VS and the ‘total anisotropy’ are obtained following eqs (40)–(42). Figure 13. View largeDownload slide Faulted layers test reference and inversion homogenized models. Cross-sections along the dotted line in Fig. 4(a) for ε0 = 1 are plotted. VP, VS and the ‘total anisotropy’ are obtained following eqs (40)–(42). From this test, we conclude that, depending on the model we want to retrieve, the choice of $$\boldsymbol{\mathcal {M}}^h$$ can influence the convergence of a classical FWI. In particular, choosing an anisotropic parametrization is a good option, even if it seems apparently more difficult. This example illustrates the advantage of projecting to the homogenized space at each iteration of the inversion. Finally, Fig. 14 depicts horizontal cross-sections in the ε0 = 1 homogenized reference and $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ inverted models for the c1112 elastic coefficient. In a vertically transverse isotropic (VTI) model, we have c1112 = 0. Knowing that a horizontally layered model becomes a VTI model through homogenization and that the faulted layers test model is horizontally layered everywhere but near the fault, we expect c1112 to be zero except near the fault. This is indeed what is observed in Fig. 14 from both the homogenized reference and inverted models. It implies that, provided that we have the prior information that there is potentially a fault in the medium, it would be possible to identify it from the inversion, even at this low resolution. Such an interpretation of the inverted results can be seen as a downscaling inverse problem and is discussed in Section 7. Figure 14. View largeDownload slide Faulted layers test inversion result. Cross-sections along an horizontal line for y = 4λmin in $$c_{1112}^*$$ for the reference model and the $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ inverted model, both homogenized with ε0 = 1. Figure 14. View largeDownload slide Faulted layers test inversion result. Cross-sections along an horizontal line for y = 4λmin in $$c_{1112}^*$$ for the reference model and the $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani}}$$ inverted model, both homogenized with ε0 = 1. 6.3 Circular inclusion strong heterogeneity test Finally, for the last test, we come back to the circular inclusion configuration already used in Section 6.1, with stronger velocity contrasts, especially for the slow regions (see Fig. 3 and Table 1). The effect of the heterogeneity on the waveforms is large and none of the parametrizations already used, with or without homogenization at each step, is able to converge and to avoid falling in a local minimum. To resolve this problem, we rely on the classical frequency-grid continuation solution (Kolb et al.1986; Bunks et al.1995). We first perform the inversion in a low frequency band [0, fm/2] and use the homogenized inversion results to start the inversion in the full frequency band [0, fm]. Applying these two successive frequency band inversions, the simple Gauss–Newton scheme (without homogenization) systematically proposes a non-physical model after a few iterations and fails to converge. We therefore rely on the homogenized Gauss–Newton scheme and use $$\boldsymbol{\mathcal {M}}_{20}^{0,{\rm ani},*}$$ for the low-frequency band step and then $$\boldsymbol{\mathcal {M}}_{20}^{1,{\rm ani},*}$$ for the full-frequency band step. In this framework, the convergence rate is slower than for the first experiments (about 20 iterations for the first step and 40 for the second), however the misfit function is decreased down to the 99 per cent objective and the convergence is reached. The final results of the inversion, homogenized with ε0 = 1, are summarized in Fig. 15. Once again, a very good agreement between the homogenized true and the homogenized inverted models is obtained. The high amplitude of the effective anisotropy (up to 30 per cent), knowing the isotropy of the underlying thin scale model shall be noted. The different anisotropy pattern between heterogeneities with or without a slow layers around them could help to find their finer scale characteristics in a downscaling process. Figure 15. View largeDownload slide Strong contrast circular inclusion test reference model and inversion results, both homogenized with ε0 = 1 (see Section 6.3). On the left column panels are presented maps of VS (top panel) and the ‘total anisotropy’ (bottom panel) of the reference model and, on the middle column, the one for the inverted model with the $$\boldsymbol{\mathcal {M}}_{20}^{0,{\rm ani},*}$$ parametrization. On the right column are plotted cross-sections along the dotted line seen on the maps presented in the panel from right and left columns for VS (top panel) and the ‘total anisotropy’ (bottom panel). Figure 15. View largeDownload slide Strong contrast circular inclusion test reference model and inversion results, both homogenized with ε0 = 1 (see Section 6.3). On the left column panels are presented maps of VS (top panel) and the ‘total anisotropy’ (bottom panel) of the reference model and, on the middle column, the one for the inverted model with the $$\boldsymbol{\mathcal {M}}_{20}^{0,{\rm ani},*}$$ parametrization. On the right column are plotted cross-sections along the dotted line seen on the maps presented in the panel from right and left columns for VS (top panel) and the ‘total anisotropy’ (bottom panel). From this last case study, we conclude that the homogenization can help FWI to successfully converge in this difficult case. 7 DISCUSSION One of the objectives of this paper is to extend the results obtained in the layered media case (Capdeville et al.2013), to higher dimension cases (here 2-D), which illustrates the fact that the model obtained from FWI are, at best, an homogenized version of the true model. It establishes the relation between the inverted and true models for limited frequency band data. Even if this relation might not always be true, the tests performed here seem to confirm this assumption: in cases where the inversion converges without homogenizing at each Gauss–Newton step, the inverted raw models are clearly parametrization dependent. Nevertheless, once homogenized, the different models are all the same and consistent with the homogenized true model. We interpret this observation in the following way: the differences between raw inversion results depend on fine scales in the parametrization that are not uniquely constrained by the data. In other words, these fine scales may be necessary to explain the data, but they are not unique and depend on the parametrization. For example, using an isotropic parametrization (such as $$\boldsymbol{\mathcal {M}}_{30}^{1,{\rm iso}}$$ in Section 6.1), knowing that the effective medium behave in an anisotropic way, might not prevent the convergence of the inversion. But this strategy may find isotropic fine scales allowed by the parametrization which will try to emulate the correct anisotropy in the effective sense. Depending on the parametrization, these fine scales will be different, but they will all have the same effective effect. We conclude that FWI, indeed, finds the homogenized model and not the true model. The next related important point is to ensure that the assumption that the solution to the homogenized inverse problem is the homogenized true model (eq. 30) is true. Once again, for all the examples tested here, this relation seems to be satisfied. A corollary of this observation is that, in the homogenized space, the FWI inversion has a unique solution. This is true at least in the presented example and all the other tests we have done so far. This is definitely not a proof but, at least, it shows that it is a reasonable assumption. Cases breaking the uniqueness are more likely to be found for poor data coverage or noisy data. In such cases, the relation between the true and inverted models through homogenization is broken which is a serious issue for the interpretation stage. But this is not specific to HFWI. This framework gives some new insight about the FWI process, even if the homogenization is not used. For example, many FWI inversion scheme makes use of spatial smoothing to regularize their gradient update, especially near the sources and receivers (Williamson et al.2011; Trinh et al.2017). Based on our results, knowing that, for low-velocity contrast, the homogenization operator $$\boldsymbol{\mathcal {H}}$$ is close to a simple low-pass spatial filtering $$\boldsymbol{\mathcal {F}}$$, it makes sense to low pass-filter results from an FWI: for weakly contrasted models, this is equivalent to homogenizing the inverted model. Another example where the presented work gives a better understanding about common practice is the frequency-grid continuation methods (Kolb et al.1986; Bunks et al.1995). Using lower frequencies reduces the homogenized model space dimensional (see eq. A9), and, if low enough frequencies are present in the data, a misfit function with a single global and local minimum can be reached. Our work also raises questions about other practices: for instance, inverting for fine scales, such as sharp interfaces, based on limited frequency band data, appears to be counterproductive. Indeed, even if, for instance, the true model contains sharp interfaces, inverting for these interfaces makes the problems strongly nonlinear. More importantly, the imprint of these interfaces in the bandpassed seismic data is very weak, making the inverse problem poorly constrained. It thus appears more reasonable to invert for a smooth anisotropic media and leave the interpretation of the inverted model (such as the detection of sharp interfaces) to a second-step downscaling inversion. The resolution of FWI or HFWI is directly linked to λmin and to the choice of ε0. In the forward modelling context, for a fixed λmin value, depending on the desired accuracy, the number of propagated wavelengths and on the model velocity contrast, a good choice for ε0 lies between 0.5 and 0.25 (Capdeville et al.2010b). We could thus deduce that the FWI resolution can reach λmin/2 to λmin/4 resolution. Nevertheless, the different tests performed here show that with ε0 = 0.5, the noise on the results is significant, increasing the uncertainty on the reconstruction (leading to large error bars). It is therefore probably best to limit the results to larger ε0, such as ε0 = 1, for which the error appears lower. This could be potentially improved by increasing the number of sources and receivers, but this observation testifies that the sensitivity of the waveforms to heterogeneities decrease strongly for ε0 close to 0.5, and that the resolution limit should be set closer to 1 than 0.5. Note that these numbers actually depend on where the actual maximum frequency fm is placed: in Fig. 5, fm is really the maximum frequency, but its position could have been chosen at the end of the plateau (≃ 0.8 of the current fm). This would give an apparent more optimistic value for the resolution limit of the inversion, but it would not change the final images. This should be kept in mind when discussing the FWI resolution limits as, due to possible difference in the definitions, the λmin/2 resolution of one can be the λmin/4 of another. One important notion, both for the homogenization and for the inverse problem, is the minimum wavelength λmin. Here, we have assumed that λmin is roughly constant all over the domain. For more realistic situation, especially for geological media, λmin may vary dramatically over the inverted domain, especially with depth. Such a case is not a real issue but an adaptive homogenization, that already exists for layered media (Capdeville et al.2013,appendix B), would be needed for 2-D and 3-D media. Media that do not present a minimum wavelength for some frequencies are more problematic and cannot be inverted with any FWI methods (FWI or HFWI). Media with Helmholtz resonator inclusions (Zhao et al.2016), but also with fluid or void inclusion of size comparable or larger to the minimum wavelength are among those. Being able to apply FWI or HFWI schemes to such media would require to use very low frequencies only (such that a minimum wavelength exists anyway) or a drastically different approach. Another difficulty related to λmin is that, in the presented algorithm and examples, it has been assumed to be known a priori. For realistic situations, this often will not be true. Indeed, if the starting model is far enough from the homogenized true model, the true λmin can be significantly different from the initial estimate and therefore the homogenization may be poorly tuned. In such a case, the λmin is part of the unknown which makes the whole process implicit. An iterative scheme is therefore necessary to find the correct λmin. This is probably not a great difficulty but it will need to be tested in a future work. As it has been shown in the examples, the parametrization choice for $$\boldsymbol{\mathcal {M}}^h$$ or $$\boldsymbol{\mathcal {M}}^{h,*}$$ has an influence on the raw results, but not on the homogenized results, if the inversion has converged. It appears that, for relatively weak velocity contrast target models, this choice is not critical as long as there are enough degrees of freedom in the parametrization. For intermediate velocity contrast target models, classical FWI still converge, but using a fully anisotropic parametrization ($$\boldsymbol{\mathcal {M}}_n^{N,{\rm ani}}$$) is necessary. In any case, we found it is safer to use a fully anisotropic parametrization than a isotropic parametrization. Once the fully anisotropic tensor is inverted for, the spatial parametrization is not really critical as long as it provides between 3 to 4 degrees of freedom per wavelength in one direction. We nevertheless observe a slightly better stability of the inversion for low polynomial degree (e.g. using $$\boldsymbol{\mathcal {P}}_{20}^1$$ is more stable than $$\boldsymbol{\mathcal {P}}_{10}^3$$ for the same number of degree of freedom). An interesting consequence is that a rough parametrization, such as square blocks containing constant elastic properties, performs very well in terms of stability and convergence, better than a high order polynomial parametrization, as long as there are enough degrees of freedom per wavelength. In any case, once homogenized, the results correspond to the homogenized true model. As we saw above, the model resulting from HFWI are homogenized models and cannot be true models. Such models are well suited for data prediction. This could be very useful in many situations, such as source localization and moment tensor inversions. But they are poorly suited to geological interpretation. Indeed, because true interfaces are blurred and apparent anisotropy is introduced, interpretation can be difficult and misleading (for an example, see Capdeville et al.2013). Note that this issue is not specific to HFWI: any seismic inversion based on limited frequency band data is subjected to this issue. At least, with HFWI, it is made clear that the results should be interpreted with care. To properly interpret the homogenized model obtained through HFWI, we propose a second-step inverse problem: the downscaling or de-homogenizing inverse problem. The idea of such a new inverse problem is to find all the finer scale models both compatible with some a priori information and the HFWI model. Such an inversion solution is highly non-unique and needs to be carried out with a global search approach. The idea has already been successfully tested in the layered model case (Bodin et al.2015) and its development for higher dimension problems needs to be done. Combining HFWI and a downscaling inversion is an indirect way to address the full waveform inverse problem with a fully probabilistic approach. If the overall context of this work is seismology at all scales, our results can be extended to other domains, such as non-destructive testing at all scales with (almost) no modifications. Furthermore, although we have limited ourselves to the elastic case, the acoustic case is probably very similar and most of our conclusions would probably remain valid. Nevertheless, there are some important differences between the two cases related to the non-uniqueness of the inverse problem that would deserve a particular attention. 8 CONCLUSIONS AND PERSPECTIVES We have confirmed numerically that an FWI based on limited frequency band data can retrieve, at best, the homogenized version of the true model. The relation between the true and inverted models is known through eq. (30). This is an important progress as, for classical FWI scheme, this relation is not known. This relation opens the door to a proper interpretation of the inverted model with a downscaling inverse problem. To make the principle exposed here applicable to more realistic geological cases, many aspects still need to be explored. Because, in general, the source and receivers are in the inverted area, their associated correctors need to be inverted. Similarly, the free surface boundary corrector needs to be inverted. The fact that, in general, the minimum wavelength strongly varies over the domain also needs to be accounted for. The adaptive homogenization is potentially a good solution for such a problem. An explicit parametrization instead of the implicit one used here should be attempted. The general downscaling problem needs to be addressed and, finally, these works need to be extended to 3-D. All these aspects will be explored in future works. ACKNOWLEDGEMENTS We are very grateful to Gerhard Pratt and Michael Afanasiev for their positive and useful reviews that greatly helped to improve the manuscript. We thank R. Brossier, J. Virieux, A. Fichtner, T. Bodin and the European COST action TIDES (ES1401) for fruitful discussions. This work was funded by the HIWAI ANR project (ANR-16-CE31-0022-01). The computations were done on the Centre de Calcul Intensif des Pays de la Loire (CCIPL) computers. REFERENCES Afanasiev M., Peter D., Sager K., Simutė S., Ermert L., Krischer L., Fichtner A., 2015. Foundations for a multiscale collaborative earth model, Geophys. J. Int. , 204( 1), 39– 58. https://doi.org/10.1093/gji/ggv439 Google Scholar CrossRef Search ADS   Afanasiev M., Boehm C., May D., Fichtner A., 2016. Using effective medium theory to better constrain full waveform inversion, in 78th EAGE Conference and Exhibition 2016 , doi:10.3997/2214-4609.201601614. Alerini M., Lambaré G., Baina R., Podvin P., Bégat S.L., 2007. Two-dimensional PP/PS-stereotomography: P- and S-waves velocities estimation from OBC data, Geophys. J. Int. , 170, 725– 736. https://doi.org/10.1111/j.1365-246X.2007.03439.x Google Scholar CrossRef Search ADS   Backus G., 1962. Long-wave elastic anisotropy produced by horizontal layering, J. geophys. Res. , 67( 11), 4427– 4440. https://doi.org/10.1029/JZ067i011p04427 Google Scholar CrossRef Search ADS   Bensoussan A., Lions J.-L., Papanicolaou G., 1978. Asymptotic Analysis of Periodic Structures , North Holland. Beylkin G., 1987. Mathematical theory for seismic migration and spatial resolution, in Deconvolution and Inversion. , pp. 291– 304, eds Bernabini M., Carrion P., Jacovitti G., Rocca F., Treitel S., Worthington M., Blackwell Scientific Publications. Beylkin G., Burridge R., 1990. Linearized inverse scattering problems in acoustics and elasticity, Wave motion , 12, 15– 52. https://doi.org/10.1016/0165-2125(90)90017-X Google Scholar CrossRef Search ADS   Billette F., Lambaré G., 1998. Velocity macro-model estimation from seismic reflection data by stereotomography, Geophys. J. Int. , 135( 2), 671– 680. https://doi.org/10.1046/j.1365-246X.1998.00632.x Google Scholar CrossRef Search ADS   Blanc X., Le Bris C., Lions P.-L., 2007. Stochastic homogenization and random lattices, J. Math. Pures Appl. , 88( 1), 34– 63. https://doi.org/10.1016/j.matpur.2007.04.006 Google Scholar CrossRef Search ADS   Bodin T., Capdeville Y., Romanowicz B., Montagner J.-P., 2015. Interpreting radial anisotropy in global and regional tomographic models, in The Earth’s Heterogeneous Mantle , pp. 105– 144, eds Khan A., Deschamps F., Springer. Google Scholar CrossRef Search ADS   Brenders A., Pratt R., 2007. Full waveform tomography for lithospheric imaging: results from a blind test in a realistic crustal model, Geophys. J. Int. , 168( 1), 133– 151. https://doi.org/10.1111/j.1365-246X.2006.03156.x Google Scholar CrossRef Search ADS   Brossier R., Operto S., Virieux J., 2009. Seismic imaging of complex onshore structures by 2D elastic frequency-domain full-waveform inversion, Geophysics , 74( 6), WCC105– WCC118. https://doi.org/10.1190/1.3215771 Google Scholar CrossRef Search ADS   Browaeys J.T., Chevrot S., 2004. Decomposition of the elastic tensor and geophysical applications, Geophys. J. Int. , 159, 667– 678. https://doi.org/10.1111/j.1365-246X.2004.02415.x Google Scholar CrossRef Search ADS   Bunks C., Saleck F., Zaleski S., Chavent G., 1995. Multiscale seismic waveform inversion, Geophysics , 60( 5), 1457– 1473. https://doi.org/10.1190/1.1443880 Google Scholar CrossRef Search ADS   Burgos G., Capdeville Y., Guillot L., 2016. Homogenized moment tensor and the effect of near-field heterogeneities on nonisotropic radiation in nuclear explosion, J. geophys. Res. , 121( 6), 4366– 4389. https://doi.org/10.1002/2015JB012744 Google Scholar CrossRef Search ADS   Byrd R.H., Lu P., Nocedal J., 1995. A limited memory algorithm for bound constrained optimization, SIAM J. Sci. Stat. Comput. , 16, 1190– 1208. https://doi.org/10.1137/0916069 Google Scholar CrossRef Search ADS   Capdeville Y., Cance P., 2015. Residual homogenization for elastic wave propagation in complex media, Geophys. J. Int. , 200( 2), 984– 997. https://doi.org/10.1093/gji/ggu452 Google Scholar CrossRef Search ADS   Capdeville Y., Marigo J.J., 2007. Second order homogenization of the elastic wave equation for non-periodic layered media, Geophys. J. Int. , 170, 823– 838. https://doi.org/10.1111/j.1365-246X.2007.03462.x Google Scholar CrossRef Search ADS   Capdeville Y., Marigo J.J., 2008. Shallow layer correction for spectral element like methods, Geophys. J. Int. , 172, 1135– 1150. https://doi.org/10.1111/j.1365-246X.2007.03703.x Google Scholar CrossRef Search ADS   Capdeville Y., Marigo J.-J., 2013. A non-periodic two scale asymptotic method to take account of rough topographies for 2-D elastic wave propagation, Geophys. J. Int. , 192( 1), 163– 189. https://doi.org/10.1093/gji/ggs001 Google Scholar CrossRef Search ADS   Capdeville Y., Guillot L., Marigo J.J., 2010a. 1-D non periodic homogenization for the wave equation, Geophys. J. Int. , 181, 897– 910. Capdeville Y., Guillot L., Marigo J.J., 2010b. 2D nonperiodic homogenization to upscale elastic media for P–SV waves, Geophys. J. Int. , 182, 903– 922. https://doi.org/10.1111/j.1365-246X.2010.04636.x Google Scholar CrossRef Search ADS   Capdeville Y., Stutzmann E., Wang N., Montagner J.-P., 2013. Residual homogenization for seismic forward and inverse problems in layered media, Geophys. J. Int. , 194( 1), 470– 487. https://doi.org/10.1093/gji/ggt102 Google Scholar CrossRef Search ADS   Capdeville Y., Zhao M., Cupillard P., 2015. Fast Fourier homogenization for elastic wave propagation in complex media, Wave Motion , 54, 170– 186. https://doi.org/10.1016/j.wavemoti.2014.12.006 Google Scholar CrossRef Search ADS   Claerbout J., 1985. Imaging the Earth’s Interior , Blackwell Scientific Publication. Devaney A., 1984. Geophysical diffraction tomography, IEEE Trans. Geosci. Remote Sens. , GE-22( 1), 3– 13. https://doi.org/10.1109/TGRS.1984.350573 Google Scholar CrossRef Search ADS   Festa G., Vilotte J.-P., 2005. The Newmark scheme as velocity-stress time-staggering: an efficient PML implementation for spectral element simulations of elastodynamics, Geophys. J. Int. , 161, 789– 812. https://doi.org/10.1111/j.1365-246X.2005.02601.x Google Scholar CrossRef Search ADS   Fichtner A., Trampert J., Cupillard P., Saygin E., Taymaz T., Capdeville Y., Villaseñor A., 2013. Multiscale full waveform inversion, Geophys. J. Int. , 194( 1), 534– 556. https://doi.org/10.1093/gji/ggt118 Google Scholar CrossRef Search ADS   Guillot L., Capdeville Y., Marigo J.J., 2010. 2-D non periodic homogenization for the SH wave equation, Geophys. J. Int. , 182, 1438– 1454. https://doi.org/10.1111/j.1365-246X.2010.04688.x Google Scholar CrossRef Search ADS   Jordan T.H., 2015. An effective medium theory for three-dimensional elastic heterogeneities, Geophys. J. Int. , 203( 2), 1343– 1354. https://doi.org/10.1093/gji/ggv355 Google Scholar CrossRef Search ADS   Kamei R., Pratt R., 2013. Inversion strategies for visco-acoustic waveform inversion, Geophys. J. Int. , 194( 2), 859– 884. https://doi.org/10.1093/gji/ggt109 Google Scholar CrossRef Search ADS   Käufl P., Fichtner A., Igel H., 2013. Probabilistic full waveform inversion based on tectonic regionalization-development and application to the Australian upper mantle, Geophys. J. Int. , 193( 1), 437– 451. https://doi.org/10.1093/gji/ggs131 Google Scholar CrossRef Search ADS   Kolb P., Collino F., Lailly P., 1986. Pre-stack inversion of a 1-D medium, Proc. IEEE , 74( 3), 498– 508. https://doi.org/10.1109/PROC.1986.13490 Google Scholar CrossRef Search ADS   Komatitsch D., Vilotte J.P., 1998. The spectral element method: an effective tool to simulate the seismic response of 2D and 3D geological structures, Bull. seism. Soc. Am. , 88, 368– 392. Lekić V., Romanowicz B., 2011. Inferring upper-mantle structure by full waveform tomography with the spectral element method, Geophys. J. Int. , 185( 2), 799– 831. https://doi.org/10.1111/j.1365-246X.2011.04969.x Google Scholar CrossRef Search ADS   Métivier L., Brossier R., 2016. The SEISCOPE optimization toolbox: a large-scale nonlinear optimization library based on reverse communication, Geophysics , 81( 2), F11– F25. Google Scholar CrossRef Search ADS   Michel J., Moulinec H., Suquet P., 1999. Effective properties of composite materials with periodic microstructure: a computational approach, Comput. Methods Appl. Mech. Eng. , 172( 1), 109– 143. https://doi.org/10.1016/S0045-7825(98)00227-8 Google Scholar CrossRef Search ADS   Nachman A.I., 1988. Reconstructions from boundary measurements, Ann. Math. , 128( 3), 531– 576. https://doi.org/10.2307/1971435 Google Scholar CrossRef Search ADS   Nakamura G., Uhlmann G., 1994. Global uniqueness for an inverse boundary problem arising in elasticity, Inventiones math. , 118( 1), 457– 474. https://doi.org/10.1007/BF01231541 Google Scholar CrossRef Search ADS   Nawaz M.A., Curtis A., 2017. Bayesian inversion of seismic attributes for geological facies using a hidden Markov model, Geophys. J. Int. , 208, 1184– 1200. https://doi.org/10.1093/gji/ggw411 Google Scholar CrossRef Search ADS   Nocedal J., 1980. Updating quasi-Newton matrices with limited storage, Math. Comput. , 35( 151), 773– 782. https://doi.org/10.1090/S0025-5718-1980-0572855-7 Google Scholar CrossRef Search ADS   Operto S., Miniussi A., Brossier R., Combe L., Métivier L., Monteiller V., Ribodetti A., Virieux J., 2015. Efficient 3-D frequency-domain mono-parameter full-waveform inversion of ocean-bottom cable data: application to Valhall in the visco-acoustic vertical transverse isotropic approximation, Geophys. J. Int. , 202( 2), 1362– 1391. https://doi.org/10.1093/gji/ggv226 Google Scholar CrossRef Search ADS   Peter D. et al.  , 2011. Forward and adjoint simulations of seismic wave propagation on fully unstructured hexahedral meshes, Geophys. J. Int. , 186( 2), 721– 739. https://doi.org/10.1111/j.1365-246X.2011.05044.x Google Scholar CrossRef Search ADS   Plessix R.E., Perkins C., 2010. Full waveform inversion of a deep water ocean bottom seismometer dataset, First Break , 28, 71– 78. https://doi.org/10.3997/1365-2397.2010013 Google Scholar CrossRef Search ADS   Pratt R., Shin C., Hicks G., 1998. Gauss–Newton and full newton methods in frequency domain seismic waveform inversion, Geophys. J. Int. , 133, 341– 362. https://doi.org/10.1046/j.1365-246X.1998.00498.x Google Scholar CrossRef Search ADS   Pratt R.G., 1999. Seismic waveform inversion in the frequency domain, part I : theory and verification in a physical scale model, Geophysics , 64, 888– 901. https://doi.org/10.1190/1.1444597 Google Scholar CrossRef Search ADS   Romanowicz B., 2003. Global mantle tomography: progress status in the last 10 years, Annu. Rev. Geophys. Space Phys , 31, 303– 328. Sanchez-Palencia E., 1980, Non Homogeneous Media and Vibration Theory , Number 127 in Lecture Notes in Physics, Springer. Shipp R.M., Singh S.C., 2002. Two-dimensional full wavefield inversion of wide-aperture marine seismic streamer data, Geophys. J. Int. , 151, 325– 344. https://doi.org/10.1046/j.1365-246X.2002.01645.x Google Scholar CrossRef Search ADS   Sirgue L., Barkved O.I., Dellinger J., Etgen J., Albertin U., Kommedal J.H., 2010. Full waveform inversion: the next leap forward in imaging at Valhall, First Break , 28, 65– 70. https://doi.org/10.3997/1365-2397.2010012 Google Scholar CrossRef Search ADS   Stoneley R., 1949. The seismological implications of aeolotropy in continental structure, Geophys. Suppl. Mon. Not. R. Astron. Soc. , 5( 8), 343– 353. https://doi.org/10.1111/j.1365-246X.1949.tb02949.x Tape C., Liu Q., Maggi A., Tromp J., 2010. Seismic tomography of the southern California crust based on spectral-element and adjoint methods, Geophys. J. Int. , 180( 1), 433– 462. https://doi.org/10.1111/j.1365-246X.2009.04429.x Google Scholar CrossRef Search ADS   Tarantola A., 1984. Inversion of seismic reflection data in the acoustic approximation, Geophysics , 49, 1259– 1266. https://doi.org/10.1190/1.1441754 Google Scholar CrossRef Search ADS   Tarantola A., 2005. Inverse Problem Theory and Methods for Model Parameter Estimation , SIAM. Google Scholar CrossRef Search ADS   Tarantola A., Valette B., 1982. Generalized nonlinear inverse problems solved using the least squares criterion, Rev. Geophys. , 20, 219– 232. https://doi.org/10.1029/RG020i002p00219 Google Scholar CrossRef Search ADS   Trinh P.T., Brossier R., Métivier L., Virieux J., Wellington P., 2017. Bessel smoothing filter for spectral element mesh, Geophys. J. Int. , 209( 3), 1489– 1512. https://doi.org/10.1093/gji/ggx103 Google Scholar CrossRef Search ADS   Warner M. et al.  , 2013. Anisotropic 3D full-waveform inversion, Geophysics , 78( 2), R59– R80. https://doi.org/10.1190/geo2012-0338.1 Google Scholar CrossRef Search ADS   Weinan E., Engquist B., Li X., Ren W., Vanden-Eijnden E., 2007. Heterogeneous multiscale methods: a review, Commun. Comput. Phys. , 2( 3), 367– 450. Williamson P., Atle A., Fei W., Hale D., 2011. Regularization of wave-equation migration velocity analysis by structure-oriented smoothing, in 2011 SEG Annual Meeting , Society of Exploration Geophysicists. Zhao M., Capdeville Y., Zhang H., 2016. Direct numerical modeling of time-reversal acoustic subwavelength focusing, Wave Motion , 67, 102– 115. https://doi.org/10.1016/j.wavemoti.2016.07.010 Google Scholar CrossRef Search ADS   APPENDIX A: THE HOMOGENIZATION OPERATOR $$\boldsymbol{\mathcal {H}}$$ In this appendix, we summarize the practical steps necessary to build the homogenization operator $$\boldsymbol{\mathcal {H}}$$. The complete development and theory can be found in Capdeville et al. (2010b, 2015). To practically implement the user-defined λ0 scale separation, we need to introduce a low-pass filter operator $$\mathcal {F}^{k_0}$$, such that, for any quantity g(x), $$\mathcal {F}^{k_0}(g)(\mathbf {x})$$ does not contain any spatial variation smaller than λ0 = 1/k0. This low-pass filter operator can be written as   \begin{equation} \mathcal {F}^{k_0}(g)(\mathbf {x})=(w^{k_0}*g)(\mathbf {x}), \end{equation} (A1)where * is the spatial convolution and $$w^{k_0}$$ is the filter wavelet. Once λ0 is chosen, it sets the value of ε0 through (16). The different steps allowing us to build the upscaling operator $$\boldsymbol{\mathcal {H}}$$ such that   \begin{equation} \mathbf {c}^*=\boldsymbol{\mathcal {H}}(\mathbf {c})\,, \end{equation} (A2)and to find the correctors are the following: Step 1. We first solve the so-called cell problem to find the initial guess correctors $$\boldsymbol{\chi }_{{\!{\rm s }}}^{lm}(\mathbf {x})$$. It consists in solving the following elastostatic set of problems (3 in 2-D, 6 in 3-D) in $$\boldsymbol{\Omega }$$:   \begin{eqnarray} \boldsymbol{\nabla }\cdot \mathbf {c}:\boldsymbol{\epsilon }\left(\boldsymbol{\chi }_{{\!{\rm s }}}^{lm}\right)&=&\mathbf {F}^{lm}\,,\nonumber\\ \mathbf {F}^{lm} &=& -\boldsymbol{\nabla }\cdot (\mathbf {c}:(\mathbf {e}_l\otimes \mathbf {e}_m))\,, \end{eqnarray} (A3)with periodic boundary conditions on $$\partial \boldsymbol{\Omega }$$, where the ei, i ∈ {1, ..., d}, are the Cartesian unit basis vectors. Step 2. Once the initial corrector guess $$\boldsymbol{\chi }_{{\!{\rm s }}}^{lm}$$ is obtained, we compute   \begin{eqnarray} \left[\mathbf {G}_{{\!{\rm s }}}\right]_{ijlm}(\mathbf {x})&=& \frac{1}{2}\left( \delta _{il}\delta _{jm}+\delta _{jl}\delta _{im}\right)+\left[\boldsymbol{\epsilon }\left({\boldsymbol{\chi }}_{{\!{\rm s }}}^{lm}\right)\right]_{ij},\nonumber\\ \mathbf {H}_{{\!{\rm s }}}(\mathbf {x})&=& \mathbf {c}: \mathbf {G}_{{\!{\rm s }}}. \end{eqnarray} (A4)The ε0 effective tensor can be directly obtained as   \begin{equation} \mathbf {c}^*(\mathbf {x})=\mathcal {F}^{k_0}(\mathbf {H}_{{\!{\rm s }}}): \mathcal {F}^{k_0}({\mathbf {G}_{{\!{\rm s }}}})^{-1}(\mathbf {x}). \end{equation} (A5)At this stage the upscaling operator $$\boldsymbol{\mathcal {H}}$$, defined in eq. (A2), is known. The effective density is simply   \begin{equation} \rho ^*(\mathbf {x})=\mathcal {F}^{k_0}(\rho )(\mathbf {x}); \end{equation} (A6) Step 3. Finally, the strain concentrator is obtained as   \begin{equation} \mathbf {G}(\mathbf {x},\mathbf {y})=\mathbf {I}+\left(\mathbf {G}_{{\!{\rm s }}} -\mathcal {F}^{k_0}(\mathbf {G}_{{\!{\rm s }}})\right)(\varepsilon _0\mathbf {y}):\mathcal {F}^{k_0}(\mathbf {G}_{{\!{\rm s }}})^{-1}(\mathbf {x}), \end{equation} (A7)and the first-order corrector $$\boldsymbol{\chi }(\mathbf {x},\mathbf {y})$$ is obtained solving, for each x (fixed),   \begin{equation} \boldsymbol{\nabla }{\boldsymbol{\boldsymbol{\chi }}}(\mathbf {x},\mathbf {y})+ {(\boldsymbol{\nabla }{\boldsymbol{\boldsymbol{\chi }}})}^{\intercal }(\mathbf {x},\mathbf {y})=2(\mathbf {G}(\mathbf {x},\mathbf {y})-\mathbf {I}), \end{equation} (A8)where the $$\boldsymbol{\nabla }$$ operators here applies on the y variable and I is the identity operator. Solving the cell problem (A3) usually requires a numerical solver. In practice, we can use two different schemes, one based on finite elements (Capdeville et al.2010b) and one based on an iterative FFT scheme (Capdeville et al.2015), derived from Michel et al. (1999). Here, in the examples, the first has been used to compute the $$\mathbf {m}_t^*$$, the second to compute $$\bar{\mathbf {m}}^{h*}$$ and the homogenized projection of the Gauss–Newton updated model at each iteration. It can be seen that the homogenization operator actually depends on both the maximum frequency fm (which determines λmin through the medium dispersion relation) and on ε0 (which determines λ0 through (16)). The main assumption behind the homogenization is the existence of a minimum wavelength λmin. For most media, this is usually the case. Nevertheless, note that some synthetic media can break this assumption and they cannot be homogenized (at least in the sense used here) and they probably cannot be imaged (at least not with the scheme presented here). Media with fluid or void inclusions of size comparable to λmin break the assumption that c is positive definite and such media cannot be homogenized either without doing a domain decomposition. Such media cannot be imaged this way either, unless fluid–solid boundaries are previously known and explicitly taken into account. But media with void or fluid inclusions of size significantly smaller than λmin can still be homogenized and therefore inverted. One important consequence of the homogenization theory is that the effective model space $$\boldsymbol{\mathcal {M}}^*$$, image of the model space $$\boldsymbol{\mathcal {M}}$$ through $$\boldsymbol{\mathcal {H}}$$ ($$\boldsymbol{\mathcal {M}}^*=\boldsymbol{\mathcal {H}}(\boldsymbol{\mathcal {M}})$$) is a finite dimensional space (even prior to any discretization that could be necessary for practical reasons). Indeed, from eq. (A5), it can be seen that both $$\mathcal {F}^{k_0}(\mathbf {H}_{{\!{\rm s }}})$$ and $$\mathcal {F}^{k_0}(\mathbf {G}_{{\!{\rm s }}})$$ belong a finite dimensional Fourier space and therefore c* can be describe with a finite number of parameters. The exact dimension of $$\boldsymbol{\mathcal {M}}^*$$ as a function of fm and ε0 is difficult to measure precisely, but it scales as   \begin{equation} \dim (\boldsymbol{\mathcal {M}}^*)\propto \left(\frac{f_{\rm m}}{\varepsilon _0}\right)^d\,. \end{equation} (A9)The fact that $$\dim (\boldsymbol{\mathcal {M}}^*)$$ cannot be measured precisely is because, as any space-wavenumber problem, it is not possible to be precise on both on frequency (fm) and space (λmin) at the same time. Finally, we said nothing about the boundary conditions. This is a more complex issue that requires two matched asymptotic expansions (Capdeville & Marigo 2008; Capdeville et al.2013). At the order 0 in ε0, nothing special needs to be done and any smooth boundary and a free surface will be correct. At order one in ε0, it appears that an effective domain boundary $$\partial \boldsymbol{\Omega }^*$$ needs to be introduced. It transforms a potential rough topography in a smooth effective one. Moreover, the free boundary condition needs to be replaced by a Dirichlet to Neumann condition (see eq. 19). The coefficients involved in the Dirichlet to Neumann operator $${\boldsymbol{\Gamma }}^*$$ only depend on the macroscopic scale (they vary smoothly) and are therefore not a problem for an inverse problem (but the effective parameters imbedded in $${\boldsymbol{\Gamma }}^*$$ would still need to be inverted for). © The Author(s) 2018. Published by Oxford University Press on behalf of The Royal Astronomical Society.

Journal

Geophysical Journal InternationalOxford University Press

Published: May 1, 2018

There are no references for this article.

You’re reading a free preview. Subscribe to read the entire article.


DeepDyve is your
personal research library

It’s your single place to instantly
discover and read the research
that matters to you.

Enjoy affordable access to
over 18 million articles from more than
15,000 peer-reviewed journals.

All for just $49/month

Explore the DeepDyve Library

Search

Query the DeepDyve database, plus search all of PubMed and Google Scholar seamlessly

Organize

Save any article or search result from DeepDyve, PubMed, and Google Scholar... all in one place.

Access

Get unlimited, online access to over 18 million full-text articles from more than 15,000 scientific journals.

Your journals are on DeepDyve

Read from thousands of the leading scholarly journals from SpringerNature, Elsevier, Wiley-Blackwell, Oxford University Press and more.

All the latest content is available, no embargo periods.

See the journals in your area

DeepDyve

Freelancer

DeepDyve

Pro

Price

FREE

$49/month
$360/year

Save searches from
Google Scholar,
PubMed

Create lists to
organize your research

Export lists, citations

Read DeepDyve articles

Abstract access only

Unlimited access to over
18 million full-text articles

Print

20 pages / month

PDF Discount

20% off