Convergence and adaptive discretization of the IRGNM Tikhonov and the IRGNM Ivanov method under a tangential cone condition in Banach space

Numerische Mathematik, Numer. Math. (2018) 140:449–478
https://doi.org/10.1007/s00211-018-0971-5

Barbara Kaltenbacher · Mario Luiz Previatti de Souza

Received: 24 July 2017 / Revised: 27 February 2018 / Published online: 29 May 2018
© The Author(s) 2018

Abstract In this paper we consider the iteratively regularized Gauss–Newton method (IRGNM) in its classical Tikhonov version as well as two further versions, of Ivanov type and of Morozov type. In these two alternative versions, regularization is achieved by imposing bounds on the solution or by minimizing some regularization functional under a constraint on the data misfit, respectively. We do so in a general Banach space setting and under a tangential cone condition, while convergence (without source conditions, thus without rates) has so far only been proven under stronger restrictions on the nonlinearity of the operator and/or on the spaces. Moreover, we provide a convergence result for the discretized problem with an appropriate control on the error and show how to provide the required error bounds by goal oriented weighted dual residual estimators. The results are illustrated for an inverse source problem for a nonlinear elliptic boundary value problem, for the cases of a measure valued and of an $L^\infty$ source. For the latter, we also provide numerical results with the Ivanov type IRGNM.

Mathematics Subject Classification 65F22 · 65N20

Supported by the Austrian Science Fund FWF under the Grant I2271 “Regularization and Discretization of Inverse Problems for PDEs in Banach Spaces”.

Barbara Kaltenbacher, barbara.kaltenbacher@aau.at
Mario Luiz Previatti de Souza, mario.previatti@aau.at
Institute of Mathematics, Alpen-Adria-Universität Klagenfurt, Klagenfurt, Austria

1 Introduction

In this paper we consider a nonlinear ill-posed operator equation

$$F(x) = y, \tag{1}$$

where the possibly nonlinear operator $F : D(F) \subseteq X \to Y$ with domain $D(F)$ maps between real Banach spaces $X$ and $Y$. We are interested in the ill-posed situation, i.e., $F$ fails to be continuously invertible, and the data are contaminated with noise, thus regularization has to be applied (see, e.g., [8,27], and references therein).

Throughout this paper we will assume that an exact solution $x^\dagger \in D(F)$ of (1) exists, i.e., $F(x^\dagger) = y$, and that the noise level $\delta$ in the (deterministic) estimate

$$\|y - y^\delta\| \le \delta \tag{2}$$

is known.

In part, we will also refer to the formulation of the inverse problem as a system of model and observation equation

$$A(x,u) = 0, \tag{3}$$
$$C(u) = y. \tag{4}$$

Here $A : X \times V \to W^*$ and $C : V \to Y$ are the model and observation operator, so that with the parameter-to-state map $S : X \to V$ satisfying $A(x, S(x)) = 0$ and $F = C \circ S$, (1) is equivalent to the all-at-once formulation (3), (4).

Newton type methods for the solution of nonlinear ill-posed problems (1) have been extensively studied in Hilbert spaces (see, e.g., [2,20] and the references therein) and more recently also in a Banach space setting. In particular, the iteratively regularized Gauss–Newton method [1] can be generalized to a Banach space setting by calculating iterates $x_{k+1}^\delta$ in a Tikhonov type variational form as

$$x_{k+1}^\delta \in \operatorname{argmin}_{x \in \mathcal{C}} \|F'(x_k^\delta)(x - x_k^\delta) + F(x_k^\delta) - y^\delta\|^p + \alpha_k \mathcal{R}(x), \tag{5}$$

see, e.g., [11,16,17,21,28], where $p \in [1,\infty)$, $(\alpha_k)_{k\in\mathbb{N}}$ is a sequence of regularization parameters, and $\mathcal{R}$ is some nonnegative regularization functional.
Alternatively, one might introduce regularization by imposing some bound $\rho_k$ on the norm of $x$, or, again, more generally, on a regularization functional of $x$

$$x_{k+1}^\delta \in \operatorname{argmin}_{x \in \mathcal{C}} \|F'(x_k^\delta)(x - x_k^\delta) + F(x_k^\delta) - y^\delta\| \quad \text{such that } \mathcal{R}(x) \le \rho_k, \tag{6}$$

which corresponds to Ivanov regularization or the method of quasi solutions, see, e.g., [7,13–15,22,24,26].

A third way of incorporating regularization in a Newton type iteration is Morozov regularization, also called the method of the residuals, see, e.g., [9,22,23]

$$x_{k+1}^\delta \in \operatorname{argmin}_{x \in \mathcal{C}} \mathcal{R}(x) \quad \text{such that } \|F'(x_k^\delta)(x - x_k^\delta) + F(x_k^\delta) - y^\delta\| \le \sigma \|F(x_k^\delta) - y^\delta\|, \tag{7}$$

for some $\sigma \in (0,1)$, where the choice of the bound in the inequality constraint is very much inspired by the inexact Newton type regularization parameter choice in [10].

We restrict ourselves to the norm in $Y$ as a measure of the data misfit, but the analysis could as well be extended to more general functionals $\mathcal{S}$ satisfying certain conditions, as, e.g., in [11,28].

Here $\mathcal{C}$ is a set (possibly chosen with convenient properties for carrying out the minimization) containing $x^\dagger$ and being contained in $D(F)$, such that $F$ satisfies additional conditions on $\mathcal{C}$, see (8), (10) below. If $F$ is defined on all of $X$, then the minimization problem (5) can be posed in an unconstrained way, $\mathcal{C} = X$.

As a restriction on the nonlinearity of the forward operator $F$ we impose the tangential cone condition

$$\|F(\tilde{x}) - F(x) - F'(x)(\tilde{x} - x)\| \le c_{tc} \|F(\tilde{x}) - F(x)\| \quad \text{for all } \tilde{x}, x \in B_R \tag{8}$$

(also called Scherzer condition, cf. [25]) for some constant $c_{tc} < 1/3$. Here, for any $r > 0$,

$$B_r = \{x \in \mathcal{C} : \mathcal{R}(x) \le r\} \tag{9}$$

is a sublevel set of the regularization functional and $R$ will be specified in the convergence result Theorem 1.
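To make condition (8) more tangible, the following minimal sketch samples the ratio on its two sides for a toy operator. Everything in it is an assumption made for illustration and is not taken from the paper: the componentwise operator `F(x) = x + 0.1 x^3` on $\mathbb{R}^n$, the Euclidean norm, and a box of radius one standing in for the sublevel set $B_R$. Sampling can only suggest a value for $c_{tc}$; it does not prove (8).

```python
import math
import random


def F(x):
    # Toy forward operator (hypothetical, for illustration only).
    return [xi + 0.1 * xi**3 for xi in x]


def apply_F_prime(x, h):
    # Action of the (diagonal) Jacobian F'(x) on a direction h.
    return [(1.0 + 0.3 * xi**2) * hi for xi, hi in zip(x, h)]


def norm(v):
    return math.sqrt(sum(vi * vi for vi in v))


def estimate_tcc_constant(n=5, radius=1.0, samples=2000, seed=0):
    """Largest sampled value of
    ||F(xt) - F(x) - F'(x)(xt - x)|| / ||F(xt) - F(x)||
    over random pairs in the box [-radius, radius]^n."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        x = [rng.uniform(-radius, radius) for _ in range(n)]
        xt = [rng.uniform(-radius, radius) for _ in range(n)]
        diff = [a - b for a, b in zip(F(xt), F(x))]
        denom = norm(diff)
        if denom == 0.0:
            continue
        lin = apply_F_prime(x, [a - b for a, b in zip(xt, x)])
        err = norm([d - l for d, l in zip(diff, lin)])
        worst = max(worst, err / denom)
    return worst


c_est = estimate_tcc_constant()
print("sampled tangential cone ratio:", round(c_est, 3))
```

For this particular toy operator the ratio can be bounded by hand by 0.225 on the unit box, so the sampled value stays below the threshold $1/3$ required above; for a genuinely ill-posed $F$ no such simple check is available.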
Note that the convergence conditions imposed in [11,16,17,21,28] in the situation without source condition, namely local invariance of the range of $F'(x)^*$, are slightly stronger, since this adjoint range invariance is sufficient for (8). However, most probably the gap is not very large, as in those application examples where (8) has been verified, the proof of (8) is actually often done via adjoint range invariance.

In (5), (6), (7), the bounded linear operator $F'(x)$ is not necessarily a Gâteaux or Fréchet derivative of $F$, but just some local linearization (in the sense of (8)), satisfying additionally the weak closedness condition

$$\forall x \in \mathcal{C},\ (x_n)_{n\in\mathbb{N}} \subseteq \mathcal{C} : \quad \bigl(x_n \xrightarrow{\mathcal{T}_X} \hat{x} \ \text{ and } \ F'(x)x_n \xrightarrow{\mathcal{T}_Y} y\bigr) \ \Rightarrow \ \bigl(\hat{x} \in \mathcal{C} \ \text{ and } \ F'(x)\hat{x} = y\bigr). \tag{10}$$

Here, $\mathcal{T}_X$ and $\mathcal{T}_Y$ are topologies on $X$ and $Y$ (e.g., just the weak or weak* topologies) such that bounded sets in $Y$ are $\mathcal{T}_Y$-compact and the norm in $Y$ is $\mathcal{T}_Y$-lower semicontinuous.

The remainder of this paper is organized as follows. In Sect. 2 we state and prove convergence results in the continuous and discretized setting. Section 3 shows how to actually obtain the required discretization error estimates by a goal oriented weighted dual residual approach and Sect. 4 illustrates the theoretical findings by an inverse source problem for a nonlinear PDE. In Sect. 5 we provide some numerical results for this model problem and Sect. 6 concludes with some remarks.

2 Convergence

In this section we will study convergence of the IRGNM iterates first of all in a continuous setting, then in the situation of having discretized for computational purposes.
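The regularization parameters $\alpha_k$, $\rho_k$, $\sigma$ are chosen a priori

$$\alpha_k = \alpha_0 \theta^k \quad \text{for some } \theta \in \Bigl(\Bigl(\frac{2c_{tc}}{1-c_{tc}}\Bigr)^p, 1\Bigr) \tag{11}$$

(note that $\bigl(\frac{2c_{tc}}{1-c_{tc}}\bigr)^p < 1$ for $c_{tc} < 1/3$),

$$\rho_k \equiv \rho \ge \mathcal{R}(x^\dagger), \tag{12}$$

and

$$\sigma \ge \frac{1+c_{tc}}{\tau} + c_{tc}, \quad \sigma < 1 - 2c_{tc}, \tag{13}$$

with $\tau$ as in (14), and the iteration is stopped according to the discrepancy principle

$$k_* = k_*(\delta, y^\delta) = \min\{k \in \mathbb{N}_0 : \|F(x_k^\delta) - y^\delta\| \le \tau\delta\} \tag{14}$$

with some fixed $\tau > 1$ chosen sufficiently large but independent of $\delta$.

Theorem 1 Let $\mathcal{R} : X \to [0,\infty]$ be proper, convex and $\mathcal{T}_X$ lower semicontinuous with $\mathcal{R}(x^\dagger) < \infty$ and let, for all $r \in [\mathcal{R}(x^\dagger), \infty)$ in case of (5), or for all $r \in [\mathcal{R}(x^\dagger), \rho]$ in case of (6), or for $r = \mathcal{R}(x^\dagger)$ in case of (7), the sublevel set (9) be compact with respect to the topology $\mathcal{T}_X$ on $X$. Moreover, let $F$ satisfy (8), (10). Finally, let the family of data $(y^\delta)_{\delta>0}$ satisfy (2).

(i) Then for fixed $\delta$, $y^\delta$, the iterates according to (5)–(7) are well-defined and satisfy

$$x_k^\delta \in B_R \quad \text{with } R \begin{cases} \text{defined by (23), (19), (20)} & \text{in case of (5)} \\ = \rho & \text{in case of (6)} \\ = \mathcal{R}(x^\dagger) & \text{in case of (7)} \end{cases} \tag{15}$$

for all $k \le k_*(\delta, y^\delta)$, which denotes the stopping index according to the discrepancy principle (14) with $\tau$ sufficiently large, and this stopping index $k_*(\delta, y^\delta)$ is finite.

(ii) Moreover, for all three methods we have $\mathcal{T}_X$-subsequential convergence as $\delta \to 0$, i.e., $(x^\delta_{k_*(\delta,y^\delta)})_{\delta>0}$ has a $\mathcal{T}_X$-convergent subsequence and the limit of every $\mathcal{T}_X$-convergent subsequence solves (1). If the solution $x^\dagger$ of (1) is unique in $B_R$, then $x^\delta_{k_*(\delta,y^\delta)} \xrightarrow{\mathcal{T}_X} x^\dagger$ as $\delta \to 0$.

(iii) Additionally, $k_*$ satisfies the asymptotics $k_* = O(\log(1/\delta))$.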
Proof Existence of minimizers $x_{k+1}^\delta$ of (5)–(7) for fixed $k$, $x_k^\delta$ and $y^\delta$ follows by the direct method of calculus of variations: In all three cases, the cost functional

$$J_k(x) = \|F'(x_k^\delta)(x - x_k^\delta) + F(x_k^\delta) - y^\delta\|^p + \alpha_k \mathcal{R}(x) \quad \text{in case of (5)},$$
$$J_k(x) = \|F'(x_k^\delta)(x - x_k^\delta) + F(x_k^\delta) - y^\delta\|^2 \quad \text{in case of (6)},$$
$$J_k(x) = \mathcal{R}(x) \quad \text{in case of (7)},$$

is bounded from below and the admissible set

$$X^{ad} = \mathcal{C} \quad \text{in case of (5)},$$
$$X^{ad} = B_\rho \quad \text{in case of (6)},$$
$$X^{ad} = \{x \in \mathcal{C} : \|F'(x_k^\delta)(x - x_k^\delta) + F(x_k^\delta) - y^\delta\| \le \sigma\|F(x_k^\delta) - y^\delta\|\} \quad \text{in case of (7)}$$

is nonempty (for (6) this follows from $\rho \ge \mathcal{R}(x^\dagger)$ and for (7) from (8), (14) and (13), see (16) below). Hence, there exists a minimizing sequence $(x^l)_{l\in\mathbb{N}} \subseteq X^{ad} \cap B_r$ for

$$r = J_k(x^\dagger) \ \text{in case of (5)}, \quad r = \rho \ \text{in case of (6)}, \quad r = \mathcal{R}(x^\dagger) \ \text{in case of (7)},$$

with bounded linearized residuals $\|F'(x_k^\delta)(x^l - x_k^\delta) + F(x_k^\delta) - y^\delta\| \le s$ for

$$s = J_k(x^\dagger)^{1/p} \ \text{in case of (5), (6)}, \quad s = \sigma\|F(x_k^\delta) - y^\delta\| \ \text{in case of (7)},$$

and $\lim_{l\to\infty} J_k(x^l) = \inf_{x \in X^{ad}} J_k(x)$. By $\mathcal{T}_X$-compactness of $B_r$, the sequence $(x^l)_{l\in\mathbb{N}}$ has a $\mathcal{T}_X$-convergent subsequence $(x^{l_m})_{m\in\mathbb{N}}$ with limit $\bar{x} \in B_r$. Moreover, $\mathcal{T}_Y$-compactness of norm bounded sets in $Y$ together with $\mathcal{T}_X$-$\mathcal{T}_Y$-closedness of $F'(x_k^\delta)$ and $\mathcal{T}_Y$ lower semicontinuity of the norm in $Y$ implies that in all three cases $J_k(\bar{x}) \le \liminf_{m\to\infty} J_k(x^{l_m}) = \inf_{x \in X^{ad}} J_k(x)$ and $\bar{x} \in X^{ad}$, hence $\bar{x}$ is a minimizer.

Note that (ii) follows from (i) by standard arguments and our assumption on $\mathcal{T}_X$-compactness of $B_R$.

Thus it remains to prove (i) and (iii) for the three versions (5), (6), (7) of the IRGNM. For this purpose we are going to show that for every $\delta > 0$, there exists $k_* = k_*(\delta, y^\delta)$ such that $k_* \sim \log(1/\delta)$, and the stopping criterion according to the discrepancy principle $\|F(x^\delta_{k_*(\delta,y^\delta)}) - y^\delta\| \le \tau\delta$ is satisfied. For (5), we also need to show that $\mathcal{R}(x_k^\delta) \le R$ for $k \le k_*(\delta, y^\delta)$, whereas in (6) this automatically holds by (12).
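The same holds true for (7): If $x_k^\delta \in B_R$, then by (8), (14) and (13) we have

$$\|F'(x_k^\delta)(x^\dagger - x_k^\delta) + F(x_k^\delta) - y^\delta\| \le c_{tc}\|F(x_k^\delta) - y^\delta\| + (1+c_{tc})\delta \le \Bigl(c_{tc} + \frac{1+c_{tc}}{\tau}\Bigr)\|F(x_k^\delta) - y^\delta\|, \tag{16}$$

so $x^\dagger$ is admissible, hence $\mathcal{R}(x_{k+1}^\delta) \le \mathcal{R}(x^\dagger)$, i.e., $x_{k+1}^\delta \in B_R$.

We start with the Tikhonov version (5) and carry out an induction proof of the following statement: For all $k \in \{0, \ldots, k_*(\delta, y^\delta)\}$

$$\mathcal{R}(x_k^\delta) \le R \quad \text{and} \quad \forall j \in \{0, \ldots, k-1\} : \ d_{j+1} + \alpha_j R_{j+1} \le q d_j + \alpha_j R^\dagger + C\delta^p, \tag{17}$$

where

$$d_k := 2^{1-p}(1-c_{tc})^p \|F(x_k^\delta) - y^\delta\|^p, \tag{18}$$
$$q := 2^{p-1}\bigl((1+\gamma)^{p-1} + 1\bigr)\Bigl(\frac{c_{tc}}{1-c_{tc}}\Bigr)^p \in (0,1), \tag{19}$$
$$R_k := \mathcal{R}(x_k^\delta), \quad R^\dagger := \mathcal{R}(x^\dagger), \quad C := \Bigl(\frac{1+\gamma}{\gamma}\Bigr)^{p-1}(1+c_{tc})^p, \tag{20}$$

for some fixed small $\gamma \in (0,1)$. We will require $\frac{q}{\theta} < 1$, which by definition of $q$ (19) is achievable for $\gamma > 0$ sufficiently small, due to $\theta > \bigl(\frac{2c_{tc}}{1-c_{tc}}\bigr)^p$, cf. (11). By Lemma 2 (see the “Appendix”) the right hand side estimate in (17) implies

$$d_k + \alpha_{k-1} R_k < q^k d_0 + \frac{1}{1 - \frac{q}{\theta}}\,\alpha_{k-1} R^\dagger + \frac{1}{1-q}\,C\delta^p. \tag{21}$$

Using the minimality of $x_{k+1}^\delta$ and (2), (8) together with $x^\dagger, x_k^\delta \in B_R$, we have

$$\begin{aligned} \|F'(x_k^\delta)(x_{k+1}^\delta - x_k^\delta) + F(x_k^\delta) - y^\delta\|^p + \alpha_k \mathcal{R}(x_{k+1}^\delta) &\le \|F'(x_k^\delta)(x^\dagger - x_k^\delta) + F(x_k^\delta) - y^\delta\|^p + \alpha_k \mathcal{R}(x^\dagger) \\ &\le \bigl(c_{tc}\|F(x_k^\delta) - y^\delta\| + (1+c_{tc})\delta\bigr)^p + \alpha_k \mathcal{R}(x^\dagger). \end{aligned} \tag{22}$$

From (21) and (14) we infer

$$\Bigl(2^{1-p}(1-c_{tc})^p - \frac{C}{(1-q)\tau^p}\Bigr)\|F(x_k^\delta) - y^\delta\|^p \le q^k d_0 + \theta^k \alpha_0 \frac{R^\dagger}{\theta - q}.$$

Using this and again (14) in (22) yields

$$\begin{aligned} \mathcal{R}(x_{k+1}^\delta) &\le \Bigl(c_{tc} + \frac{1+c_{tc}}{\tau}\Bigr)^p \Bigl(2^{1-p}(1-c_{tc})^p - \frac{C}{(1-q)\tau^p}\Bigr)^{-1} \times \Bigl(\frac{q^k d_0}{\theta^k \alpha_0} + \frac{R^\dagger}{\theta - q}\Bigr) + R^\dagger \\ &\le \Bigl(\frac{1}{3} + \frac{4}{3\tau}\Bigr)^p \Bigl(\frac{2}{3^p} - \frac{C}{(1-q)\tau^p}\Bigr)^{-1} \times \Bigl(2^{1-p}\frac{\|F(x_0) - y^\delta\|^p}{\alpha_0} + \frac{R^\dagger}{\theta - q}\Bigr) + R^\dagger =: R. \end{aligned} \tag{23}$$

On the other hand, since we have established $x_{k+1}^\delta \in B_R$, we can apply (8) to the left hand side of (22) to obtain

$$\begin{aligned} &\|F'(x_k^\delta)(x_{k+1}^\delta - x_k^\delta) + F(x_k^\delta) - y^\delta\|^p + \alpha_k \mathcal{R}(x_{k+1}^\delta) \\ &\quad \ge \bigl((1-c_{tc})\|F(x_{k+1}^\delta) - y^\delta\| - c_{tc}\|F(x_k^\delta) - y^\delta\|\bigr)^p + \alpha_k \mathcal{R}(x_{k+1}^\delta), \end{aligned}$$

see also [11, Lemma 5.2] and [21, proof of Theorem 3].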
To handle the power $p$ we make use of the following inequalities that can be proven by solving extremal value problems, see the “Appendix”

$$(a+b)^p \le (1+\gamma)^{p-1} a^p + \Bigl(\frac{1+\gamma}{\gamma}\Bigr)^{p-1} b^p \quad \text{and} \quad (a-b)^p \ge (1-\epsilon)^{p-1} a^p - \Bigl(\frac{1-\epsilon}{\epsilon}\Bigr)^{p-1} b^p, \tag{24}$$

for all $a, b > 0$, $p \ge 1$ and $\gamma, \epsilon \in (0,1)$, where for the right hand inequality to hold, additionally $a \ge b$ is needed.

Hence, in case $(1-c_{tc})\|F(x_{k+1}^\delta) - y^\delta\| \ge c_{tc}\|F(x_k^\delta) - y^\delta\|$ the following general estimate holds

$$\begin{aligned} &(1-\epsilon)^{p-1}(1-c_{tc})^p \|F(x_{k+1}^\delta) - y^\delta\|^p + \alpha_k \mathcal{R}(x_{k+1}^\delta) \\ &\quad \le \Bigl(\Bigl(\frac{1-\epsilon}{\epsilon}\Bigr)^{p-1} + (1+\gamma)^{p-1}\Bigr) c_{tc}^p \|F(x_k^\delta) - y^\delta\|^p + \alpha_k \mathcal{R}(x^\dagger) + \Bigl(\frac{1+\gamma}{\gamma}\Bigr)^{p-1}(1+c_{tc})^p \delta^p, \end{aligned} \tag{25}$$

for $\gamma, \epsilon \in (0,1)$.

So in order for this recursion to yield geometric decay of $\|F(x_k^\delta) - y^\delta\|$, we need to ensure

$$(1-\epsilon)^{p-1}(1-c_{tc})^p > \Bigl(\Bigl(\frac{1-\epsilon}{\epsilon}\Bigr)^{p-1} + (1+\gamma)^{p-1}\Bigr) c_{tc}^p \tag{26}$$

for a proper choice of $\epsilon, \gamma \in (0,1)$. To obtain the largest possible (and therefore least restrictive) bound on $c_{tc}$, we rewrite the requirement above as

$$\Bigl(\frac{c_{tc}}{1-c_{tc}}\Bigr)^p < \sup_{\epsilon,\gamma\in(0,1)} (1-\epsilon)^{p-1}\Bigl(\Bigl(\frac{1-\epsilon}{\epsilon}\Bigr)^{p-1} + (1+\gamma)^{p-1}\Bigr)^{-1} = \sup_{\epsilon\in(0,1)} \underbrace{(1-\epsilon)^{p-1}\Bigl(1 + \Bigl(\frac{1-\epsilon}{\epsilon}\Bigr)^{p-1}\Bigr)^{-1}}_{=\varphi(\epsilon)} = \varphi\bigl(\tfrac{1}{2}\bigr) = 2^{-p},$$

as can be found out by evaluating the derivative of $\varphi$

$$\varphi'(\epsilon) = -(p-1)(1-\epsilon)^{p-2}\Bigl(1 + \Bigl(\frac{1-\epsilon}{\epsilon}\Bigr)^{p-1}\Bigr)^{-2}\Bigl(1 - \Bigl(\frac{1-\epsilon}{\epsilon}\Bigr)^{p}\Bigr).$$

Thus from now on we set $\epsilon = \frac{1}{2}$ and assume that $\gamma > 0$ is sufficiently small so that (26) holds with $\epsilon = \frac{1}{2}$, i.e., (19). Then, using (11), estimate (25) can be written as

$$d_{k+1} + \alpha_k R_{k+1} \le q d_k + \alpha_0 \theta^k R^\dagger + C\delta^p, \tag{27}$$

which we first of all regard as a recursive estimate for $d_k$.

To derive a similar estimate also in the complementary case $(1-c_{tc})\|F(x_{k+1}^\delta) - y^\delta\| < c_{tc}\|F(x_k^\delta) - y^\delta\|$, we use the fact that, for $d_k$ as in (18), this inequality just means

$$d_{k+1} < \Bigl(\frac{c_{tc}}{1-c_{tc}}\Bigr)^p d_k$$

and, using (22) and the left hand part of (24),

$$\alpha_k R_{k+1} \le (1+\gamma)^{p-1} c_{tc}^p \|F(x_k^\delta) - y^\delta\|^p + \alpha_k R^\dagger + \Bigl(\frac{1+\gamma}{\gamma}\Bigr)^{p-1}(1+c_{tc})^p \delta^p,$$

hence after addition we again get (27) (even with a slightly smaller value of $q := \bigl(1 + 2^{p-1}(1+\gamma)^{p-1}\bigr)\bigl(\frac{c_{tc}}{1-c_{tc}}\bigr)^p$).

Thus in both cases, using Lemma 2 we can conclude that

$$d_{k+1} + \alpha_k R_{k+1} < q^{k+1} d_0 + \frac{1}{1-\frac{q}{\theta}}\,\alpha_k R^\dagger + \frac{1}{1-q}\,C\delta^p. \tag{28}$$

This finishes the induction proof of (17) for all $k \in \{0, \ldots, k_*(\delta, y^\delta)\}$.

We next show that the discrepancy stopping criterion from (14), i.e., $d_k \le \tilde{\tau}\delta^p$ for $\tilde{\tau} = 2^{1-p}(1-c_{tc})^p \tau^p$, will be satisfied after finitely many, namely $O(\log(1/\delta))$, steps. For this purpose, note that $\tilde{\tau} > \frac{C}{1-q}$, provided $\tau$ is chosen sufficiently large, which we assume to be done. Thus, indeed, using (11), (21), we have

$$d_k \le d_k + \alpha_{k-1} R_k < \theta^k \Bigl(d_0 + \frac{\alpha_0}{\theta - q} R^\dagger\Bigr) + \frac{C}{1-q}\,\delta^p, \tag{29}$$

where the right hand side falls below $\tilde{\tau}\delta^p$ as soon as

$$k \ge (\log 1/\theta)^{-1}\Bigl(p\log(1/\delta) + \log\Bigl(d_0 + \frac{\alpha_0}{\theta - q} R^\dagger\Bigr) - \log\Bigl(\tilde{\tau} - \frac{C}{1-q}\Bigr)\Bigr) =: \bar{k}(\delta).$$

Thus we get the upper estimate $k_*(\delta, y^\delta) \le \bar{k}(\delta) = O(\log(1/\delta))$.

For the Ivanov version (6), it only remains to show finiteness of the stopping index, as boundedness of the $\mathcal{R}$ values by $R = \rho$ holds by definition. Applying the minimality argument with $x^\dagger$ being admissible [cf. (12)] to (6) leads to the special case $p = 1$, $\alpha_k = 0$ in (25)

$$(1-c_{tc})\|F(x_{k+1}^\delta) - y^\delta\| \le 2c_{tc}\|F(x_k^\delta) - y^\delta\| + (1+c_{tc})\delta.$$

Our notation becomes

$$d_k := (1-c_{tc})\|F(x_k^\delta) - y^\delta\|, \quad q := \frac{2c_{tc}}{1-c_{tc}} \in (0,1), \quad C := (1+c_{tc}),$$

which gives

$$d_{k+1} \le q d_k + C\delta,$$

and by induction, one can conclude

$$d_k < q^k d_0 + \frac{C}{1-q}\,\delta,$$

where the right hand side is smaller than $\tilde{\tau}\delta$ (with $\tilde{\tau} = (1-c_{tc})\tau$) for all

$$k \ge (\log 1/q)^{-1}\Bigl(\log(1/\delta) + \log d_0 - \log\Bigl(\tilde{\tau} - \frac{C}{1-q}\Bigr)\Bigr) =: \bar{k}(\delta),$$

so that we can again conclude $k_*(\delta, y^\delta) \le \bar{k}(\delta) = O(\log(1/\delta))$.

Finally we consider (7), where boundedness of the $\mathcal{R}$ values by $R = \mathcal{R}(x^\dagger)$ holds by minimality and the fact that $x^\dagger$ is admissible, cf. (16).
Geometric decay of the residuals follows by the estimate

$$\sigma\|F(x_k^\delta) - y^\delta\| \ge \|F'(x_k^\delta)(x_{k+1}^\delta - x_k^\delta) + F(x_k^\delta) - y^\delta\| \ge (1-c_{tc})\|F(x_{k+1}^\delta) - y^\delta\| - c_{tc}\|F(x_k^\delta) - y^\delta\| \tag{30}$$

and (13), i.e.,

$$\|F(x_{k+1}^\delta) - y^\delta\| \le q\|F(x_k^\delta) - y^\delta\| \quad \text{with} \quad q = \frac{\sigma + c_{tc}}{1-c_{tc}},$$

so that similarly to above we end up with a logarithmic estimate for $k_*$.

Remark 1 Convergence of $\mathcal{R}(x^\delta_{k_*(\delta,y^\delta)})$ to $\mathcal{R}(x^\dagger)$ as $\delta \to 0$ holds along the $\mathcal{T}_X$ convergent subsequence according to Theorem 1 (ii), first of all for the Morozov and the Ivanov version of the IRGNM, with the choice $\rho = \mathcal{R}(x^\dagger)$ for the latter, since in both cases $\mathcal{R}(x^\delta_{k_*(\delta,y^\delta)}) \le \mathcal{R}(x^\dagger)$ holds for all $\delta$ and $\mathcal{R}$ is $\mathcal{T}_X$ lower semicontinuous. The same holds true also for the Tikhonov version with the alternative choice of $\alpha_k$ such that

$$\underline{\sigma}\,\|F(x_k^\delta) - y^\delta\| \le \|F'(x_k^\delta)(x_{k+1}^\delta(\alpha_k) - x_k^\delta) + F(x_k^\delta) - y^\delta\| \le \overline{\sigma}\,\|F(x_k^\delta) - y^\delta\|$$

for some constants $\underline{\sigma}$, $\overline{\sigma}$ satisfying $c_{tc} + \frac{1+c_{tc}}{\tau} < \underline{\sigma} < \overline{\sigma} < 1$ in place of (11), as can be seen directly from (22). If $\mathcal{R}$ is defined by the norm on a space with the Kadets-Klee property, and $\mathcal{T}_X$ is the weak topology of this space, then this implies norm convergence of $x^\delta_{k_*(\delta,y^\delta)}$ to $x^\dagger$ along the same subsequence.

Remark 2 The fact that $x_k^\delta$ stays in $B_R$ [cf. (15)] is crucial for the applicability of the tangential cone condition (8) in these iterates. If the functional $\mathcal{R}$ quantifies some distance to an a priori guess $x_0$ (e.g., $\mathcal{R}(x) = \|x - x_0\|^q$ for some norm $\|\cdot\|$ and some $q > 0$), then $x^\dagger \in B_R$ with small $R$ means closeness of $x_0$ to $x^\dagger$ in a certain sense. Thus, the smaller $R$ is, the easier it might be to achieve (8) with some $c_{tc} < \frac{1}{3}$. On the other hand, making $R$ according to (15) small means closeness of $x_0$ to $x^\dagger$. Thus we deal with local convergence, as is typical for Newton type methods.
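The interplay of the Tikhonov step (5), the geometric parameter choice (11) and the discrepancy principle (14) can be sketched in a few lines. The setting below is an assumption chosen purely for illustration and is far simpler than the Banach space framework of this paper: $X = Y = \mathbb{R}^n$ with the Euclidean norm, $p = 2$, $\mathcal{R}(x) = \|x - x_0\|^2$, $\mathcal{C} = X$, and a hypothetical componentwise operator whose Jacobian is diagonal, so that each Newton step decouples into scalar quadratic problems. The function name `irgnm_tikhonov` and all parameter values are likewise illustrative.

```python
import math


def F(x):
    # Toy forward operator (hypothetical; well-posed, used only to show the loop).
    return [xi + 0.1 * xi**3 for xi in x]


def F_diag(x):
    # Diagonal of the Jacobian F'(x) of the componentwise operator.
    return [1.0 + 0.3 * xi**2 for xi in x]


def norm(v):
    return math.sqrt(sum(vi * vi for vi in v))


def irgnm_tikhonov(y_delta, x0, delta, alpha0=1.0, theta=0.5, tau=2.0, max_iter=50):
    """IRGNM with Tikhonov steps (5) for p = 2 and R(x) = ||x - x0||^2.

    Returns the iterate at the stopping index and the index itself
    (None if the discrepancy principle was not met within max_iter)."""
    x = list(x0)
    for k in range(max_iter + 1):
        residual = [fi - yi for fi, yi in zip(F(x), y_delta)]
        if norm(residual) <= tau * delta:      # discrepancy principle (14)
            return x, k
        alpha = alpha0 * theta**k              # a priori choice (11)
        jac = F_diag(x)
        # Componentwise minimizer of (j*(z - x_i) + r_i)^2 + alpha*(z - x0_i)^2.
        x = [(j * j * xi - j * ri + alpha * x0i) / (j * j + alpha)
             for j, xi, ri, x0i in zip(jac, x, residual, x0)]
    return x, None


x_true = [-0.8, -0.4, 0.0, 0.4, 0.8]
delta = 1e-3
noise_dir = [1.0, -1.0, 1.0, -1.0, 1.0]
scale = delta / norm(noise_dir)                # noise of norm exactly delta
y_delta = [fi + scale * ni for fi, ni in zip(F(x_true), noise_dir)]

x_rec, k_star = irgnm_tikhonov(y_delta, [0.0] * 5, delta)
final_residual = norm([fi - yi for fi, yi in zip(F(x_rec), y_delta)])
error = norm([a - b for a, b in zip(x_rec, x_true)])
print("stopping index:", k_star, "residual:", final_residual, "error:", error)
```

In this sketch the residual shrinks roughly like $\alpha_k$, so the stopping index grows logarithmically in $1/\delta$, in line with Theorem 1 (iii). The Ivanov and Morozov variants (6), (7) would replace the closed-form step by a constrained minimization.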
Now we consider the appearance of discretization errors in the numerical solution of (5)–(7), arising from restriction of the minimization to finite dimensional subspaces $X_h^k$ and leading to discretized iterates $x_{k,h}^\delta$ and an approximate version $F_h^k$ of the forward operator, i.e., we consider the discretized version of Tikhonov-IRGNM (5)

$$x_{k+1,h}^\delta \in \operatorname{argmin}_{x \in \mathcal{C} \cap X_h^k} \|{F_h^k}'(x_{k,h}^\delta)(x - x_{k,h}^\delta) + F_h^k(x_{k,h}^\delta) - y^\delta\|^p + \alpha_k \mathcal{R}(x), \tag{31}$$

of Ivanov-IRGNM (6)

$$x_{k+1,h}^\delta \in \operatorname{argmin}_{x \in \mathcal{C} \cap X_h^k} \|{F_h^k}'(x_{k,h}^\delta)(x - x_{k,h}^\delta) + F_h^k(x_{k,h}^\delta) - y^\delta\| \quad \text{such that } \mathcal{R}(x) \le \rho, \tag{32}$$

and of Morozov-IRGNM (7)

$$x_{k+1,h}^\delta \in \operatorname{argmin}_{x \in \mathcal{C} \cap X_h^k} \mathcal{R}(x) \quad \text{such that } \|{F_h^k}'(x_{k,h}^\delta)(x - x_{k,h}^\delta) + F_h^k(x_{k,h}^\delta) - y^\delta\| \le \sigma\|F_h^k(x_{k,h}^\delta) - y^\delta\|, \tag{33}$$

respectively. Moreover, also in the discrepancy principle, the residual is replaced by its actually computable discretized version

$$k_* = k_*(\delta, y^\delta) = \min\{k \in \mathbb{N}_0 : \|F_h^k(x_{k,h}^\delta) - y^\delta\| \le \tau\delta\}. \tag{34}$$

We define the auxiliary continuous iterates

$$x_{k+1}^\delta \in \operatorname{argmin}_{x \in \mathcal{C}} \|F'(x_{k,h}^\delta)(x - x_{k,h}^\delta) + F(x_{k,h}^\delta) - y^\delta\|^p + \alpha_k \mathcal{R}(x), \tag{35}$$

$$x_{k+1}^\delta \in \operatorname{argmin}_{x \in \mathcal{C}} \|F'(x_{k,h}^\delta)(x - x_{k,h}^\delta) + F(x_{k,h}^\delta) - y^\delta\| \quad \text{such that } \mathcal{R}(x) \le \rho, \tag{36}$$

and

$$x_{k+1}^\delta \in \operatorname{argmin}_{x \in \mathcal{C}} \mathcal{R}(x) \quad \text{such that } \|F'(x_{k,h}^\delta)(x - x_{k,h}^\delta) + F(x_{k,h}^\delta) - y^\delta\| \le \sigma\|F(x_{k,h}^\delta) - y^\delta\|, \tag{37}$$

respectively, in order to be able to use minimality, i.e., to compare with the continuous exact solution $x^\dagger$. For an illustration we refer to [18, Figure 1].

First of all, we assess how large the discretization errors can be allowed to be while still enabling convergence. Later on, in Sect. 3, we will describe how to actually obtain such estimates a posteriori and how to achieve the prescribed accuracy by adaptive discretization.
Corollary 1 Let the assumptions of Theorem 1 be satisfied and assume that the dis- cretization error estimates δ δ δ δ F (x ) − y −F (x ) − y ≤ η (38) k+1 k+1,h k+1 k δ δ δ δ F (x ) − y −F (x ) − y  ≤ ξ (39) h k,h k,h δ δ R(x ) − R(x ) ≤ ζ (40) k,h k (note that no absolute value is needed in (38), (40); moreover, (40) is only be needed for (5) and (7)) hold with k δ δ k δ δ η ≤ c F (x ) − y  ξ ≤ c F (x ) − y ,ζ ≤ ζ. (41) k η k ξ k h k,h h k,h for all k ≤ k (δ, y ) and constants c , c > 0 sufficiently small, ζ> 0. ∗ η ξ δ δ Then the assertions of Theorem 1 remain valid for x in place of x δ δ k (δ,y ),h k (δ,y ) ∗ ∗ with (34) in place of (14) and (42) in place of (23). Proof For the Tikhonov version (31), in order to inductively estimate R(x ), k+1,h given x ∈ B , note that from (43) with k + 1 replaced by k, we get like in (23) that p −1 1 4 2 C R(x ) ≤ + − k+1,h p p 3 3τ 3 (1 − q)τ δ δ p † F (x ) − y R + ζ 1−p 0 † × 2 + + R =: R (42) α θ − q 123 460 B. Kaltenbacher, M. L. P. de Souza where 1−p p δ δ p d := 2 (1 − c ) F (x ) − y  , k,h tc k,h tc p−1 p−1 p−1 q ˜ := 2 (1 + γ) + (1 +˜ γ) , 1 − c tc q =˜ q + D ∈ (0, 1), 1 − c δ † † R := R(x ), R := R(x ), k,h k,h p−1 p−1 1 + γ 1 +˜ γ p p C := (1 + c ) , D := (1 − c ) , tc tc γ γ˜ for γ, γ, ˜ c ∈ (0, 1), which are chosen small enough so that q <θ. As before, from δ † the minimality of x and (2), (8)aswellas x ∈ D(F ),wehave k+1 δ δ δ δ δ (1 − c )F (x ) − y − c F (x ) − y  + α R(x ) tc tc k k+1 k,h k+1 δ δ † ≤ c F (x ) − y + (1 + c )δ + α R(x ), tc tc k k,h then using (38), (40), δ δ δ δ δ (1 − c )(F (x ) − y − η ) − c F (x ) − y  + α R(x ) tc k+1 tc k k+1,h k,h k+1,h δ δ † ≤ c F (x ) − y + (1 + c )δ + α R(x ) + α ζ . tc tc k k k+1 k,h Hence, with the same technique as in the proof of Theorem 1,using (24) with = ,wehave k † p d + α R ≤˜qd + α θ (R + ζ ) + C δ + Dη k+1,h k k+1,h k,h 0 k+1 k+1 k † p ≤ qd + α θ (R + ζ ) + C δ , k,h 0 k+1 using (41). 
From this, by induction we conclude 1 1 k+1 † p d + α R ≤ q d + α (R + ζ) + C δ (43) k+1,h k k+1,h 0 k 1 − 1 − q Hence, by (39), (41), we have the following estimate k δ δ F (x ) − y h k,h 1/p p−1 2 α C 1 k † p ≤ δ + θ d + (R + ζ) + δ , (1 − c ) θ − q 1 − q 1 − c tc ξ 123 Convergence and adaptive discretization of the IRGNM… 461 where the right hand side falls below τδ as soon as α C −1 † k ≥ (log 1/θ ) p log(1/δ) + log d + (R + ζ) − log τ˜ − θ − q 1 − q =: k(δ), 1− p p p C for τ˜ = 2 (1 − c ) (τ (1 − c )) . Note that τ> ˜ , provided τ is chosen tc ξ 1−q sufficiently large, which we assume to be done. That is, we have shown that the discrepancy stopping criterion from (34) will be satisfied after finitely many, namely O(log(1/δ)), steps. On the other hand, the continuous discrepancy at the iterate defined by the dis- cretized discrepancy principle (34)by(39), (41) satisfies δ δ F (x ) − y ≤ τ(1 + c )δ . k,h To estimate R(x ), note that according to our notation, from (43), we get, k (δ,y ),h likein(23), that for all k ∈{1,..., k (δ, y )} −1 d R + ζ C C R ≤ θ + 1 + τ˜ − =: R. α θ − q 1 − q 1 − q Now we show finiteness of the stopping index for the discretized Ivanov-IRGNM (32). By minimality of x and (38), for this problem we have k+1 δ δ δ δ (1 − c )F (x ) − y ≤ 2c F (x ) − y + (1 + c )δ + (1 − c )η . tc tc tc tc k+1 k+1,h k,h which with δ δ d := (1 − c )F (x ) − y , k,h tc k,h 2c c tc η q ˜ := , q =˜ q + D ∈ (0, 1), 1 − c 1 − c tc ξ C := (1 + c ), D := (1 − c ), tc tc by induction, (39) and (41)gives 1 1 C k δ δ k F (x ) − y ≤ d + ξ ≤ q d + δ +ˆ τδ, k,h k 0 h k,h 1 − c 1 − c 1 − q tc tc where the right hand side is smaller than τδ for all −1 k ≥ (log 1/q) log(1/δ) + log d − log τ˜ − =: k(δ), 1 − q 123 462 B. Kaltenbacher, M. L. P. de Souza with τ˜ = (1 − c )τ (1 − c ), so that we can again conclude k (δ, y ) ≤ k(δ) = tc ξ ∗ O(log(1/δ)). It remains to show finiteness of the stopping index for the discretized Morozov- δ δ δ IRGNM (33). 
By minimality of x we have (30) with x replaced by x , thus the k+1 k k,h inequalities (38) and (41) yield σ + c tc δ δ δ δ F (x ) − y ≤   F (x ) − y k+1,h k,h 1 − (1 − c ) tc 1−c then, by (39) and induction k δ δ k δ δ F (x ) − y ≤ q F (x ) − y  (44) h k,h 0,h where σ + c tc q :=   ∈ (0, 1), 1 − (1 − c ) tc 1−c and the right hand side of (44) falls below τδ for all −1 k ≥ (log 1/q) (log(1/δ) + log d − log τ˜) =: k(δ), where τ˜ = τ(1 − c ), and we can again conclude k (δ, y ) ≤ k(δ) = O(log(1/δ)). ξ ∗ Boundedness of the R values for (33)by R(x ) + ζ follows like in the proof of Theorem 1 together with (40), (41). 3 Error estimators for adaptive discretization The error estimators η , ξ and ζ can be quantified, e.g., by means of a goal oriented k k k dual weighted residual (DWR) approach [3], applied to the minimization problems δ δ δ δ δ p (x ,v , u , u ) ∈ argmin 3 C (u ˜)v + C (u ˜) − y  + α R(x ) k+1,h k,h k+1 k,h (x ,v,u,u ˜)∈C×V δ δ s.t. ∀w ∈ W :A (x , u ˜)(x − x ) + A (x , u ˜)v, w = 0, W ,W x k,h k,h u k,h ∗ ∗ A(x , u ˜), w = 0, A(x , u), w = 0, W ,W W ,W k,h (45) (note that the last constraint is added in order to enable computation of I below) δ δ δ δ δ 2 (x ,v , u , u ) ∈ argmin 3 C (u ˜)v + C (u ˜) − y k+1,h k,h k+1 k,h (x ,v,u,u ˜)∈C×V s.t. R(x ) ≤ ρ , δ δ and ∀w ∈ W :A (x , u ˜)(x − x ) + A (x , u ˜)v, w ∗ = 0, W ,W x k,h k,h u k,h A(x , u ˜), w ∗ = 0, A(x , u), w ∗ = 0, W ,W W ,W k,h (46) 123 Convergence and adaptive discretization of the IRGNM… 463 and δ δ δ δ (x ,v , u , u ) ∈ argmin 3 R(x ) (x ,v,u,u ˜)∈C×V k+1,h k,h k+1 k,h δ δ s.t. C (u ˜)v + C (u ˜) − y ≤ σ C (u ˜) − y , δ δ and ∀w ∈ W :A (x , u ˜)(x − x ) + A (x , u ˜)v, w = 0, W ,W x u k,h k,h k,h ∗ ∗ A(x , u ˜), w = 0, A(x , u), w = 0, W ,W W ,W k,h (47) which are equivalent to (5), (6), and (7), respectively, with k δ k δ I (x,v, u, u ˜) =C (u ˜) − y  , I (x,v, u, u ˜) =C (u) − y  , 1 2 I (x,v, u, u ˜) = R(x ) as quantities of interest [where I is only needed for (5) and (7)]. 
We assume that $C$, $\mathcal{R}$ and the norms can be evaluated without discretization error, so the discretized versions of $I_i^k$ only arise due to discreteness of the arguments. Indeed, it is easy to see that the left hand sides of (38) and (39) can be bounded (at least approximately) by combinations of $I_1^k$ and $I_2^k$, using the triangle inequality:

$$\begin{aligned} \|F(x_{k+1,h}^\delta) - y^\delta\| - \|F(x_{k+1}^\delta) - y^\delta\| &= I_1^{k+1}(x_{k+2}^\delta, v_{k+1}^\delta, u_{k+2}^\delta, \tilde{u}_{k+1}^\delta) - I_1^{k+1}(x_{k+2,h}^\delta, v_{k+1,h}^\delta, u_{k+2,h}^\delta, \tilde{u}_{k+1,h}^\delta) \\ &\quad - \bigl(I_2^k(x_{k+1}^\delta, v_k^\delta, u_{k+1}^\delta, \tilde{u}_k^\delta) - I_2^k(x_{k+1,h}^\delta, v_{k,h}^\delta, u_{k+1,h}^\delta, \tilde{u}_{k,h}^\delta)\bigr) + R_h^{k+1}; \end{aligned} \tag{48}$$

$$\|F_h^k(x_{k,h}^\delta) - y^\delta\| - \|F(x_{k,h}^\delta) - y^\delta\| = I_1^k(x_{k+1,h}^\delta, v_{k,h}^\delta, u_{k+1,h}^\delta, \tilde{u}_{k,h}^\delta) - I_1^k(x_{k+1}^\delta, v_k^\delta, u_{k+1}^\delta, \tilde{u}_k^\delta), \tag{49}$$

where we will neglect $R_h^{k+1} = \|F_h^{k+1}(x_{k+1,h}^\delta) - y^\delta\| - \|F_h^k(x_{k+1,h}^\delta) - y^\delta\|$.

It is important to note that $I_{1,h}^{k+1}$ is not equal to $I_{2,h}^k$, see [18].

The computation of the a posteriori error estimators $\eta_k, \xi_k, \zeta_k$ is done as in [18]. These error estimators can be used within the following adaptive algorithm for error control and mesh refinement: We start on a coarse mesh, solve the discretized optimization problem and evaluate the error estimator. Thereafter, we refine the current mesh using local information obtained from the error estimator, reducing the error with respect to the quantity of interest. This procedure is iterated until the value of the error estimator is below the given tolerance (41), cf. [3].

In this case, all the variables $x, v, u, \tilde{u}$ are subject to a new discretization. For better readability we will partially omit the iteration index $k$ and the discretization index $h$. The previous iterate $x_k^\delta$ is fixed and not subject to a new discretization.

Consider now the cost functional for (45)

$$J(x, v, \tilde{u}) = \|C'(\tilde{u})v + C(\tilde{u}) - y^\delta\|^p + \alpha_k \mathcal{R}(x)$$

and define the Lagrangian functional

$$L(x, v, u, \tilde{u}, \lambda, \tilde{\mu}, \mu) := J(x, v, \tilde{u}) + \langle A_x'(x_k^\delta, \tilde{u})(x - x_k^\delta) + A_u'(x_k^\delta, \tilde{u})v, \lambda\rangle_{W^*,W} + \langle A(x_k^\delta, \tilde{u}), \tilde{\mu}\rangle_{W^*,W} + \langle A(x, u), \mu\rangle_{W^*,W}, \tag{50}$$

neglecting for simplicity (cf. Remark 2) the constraints defined by $\mathcal{C}$.

The first-order necessary optimality conditions for (45) are given by stationarity for the Lagrangian $L$. Setting $z = (x, v, u, \tilde{u}, \lambda, \tilde{\mu}, \mu)$, they read

$$L'(z)(dz) = 0, \quad \forall dz \in Z = X \times V \times V \times V \times W \times W \times W$$

and for the discretized problem,

$$L'(z_h)(dz_h) = 0, \quad \forall dz_h \in Z_h = X_h \times V_h \times V_h \times V_h \times W_h \times W_h \times W_h.$$

To derive a posteriori error estimators for the error with respect to the quantities of interest $(I_1, I_2, I_3)$, we introduce auxiliary functionals $M_i$:

$$M_i(z, \bar{z}) = I_i(z) + L'(z)\bar{z}, \quad z, \bar{z} \in Z, \quad i = 1, 2, 3.$$

Let $\tilde{z} = (z, \bar{z}) \in \tilde{Z} = Z \times Z$ and $\tilde{z}_h = (z_h, \bar{z}_h) \in \tilde{Z}_h = Z_h \times Z_h$ be continuous and discrete stationary points of $M_i$ satisfying

$$M_i'(\tilde{z})(d\tilde{z}) = 0, \ \forall d\tilde{z} \in \tilde{Z}, \qquad M_i'(\tilde{z}_h)(d\tilde{z}_h) = 0, \ \forall d\tilde{z}_h \in \tilde{Z}_h,$$

respectively. Then, $z$, $z_h$ are continuous and discrete stationary points of $L$ and there holds $I_i(z) = M_i(\tilde{z})$, $i = 1, 2, 3$. Thus the $z$ part, as computed already during the numerical solution of the minimization problem (45) (or (46)), remains fixed for all $i \in \{1, 2, 3\}$. Moreover, after computing the discrete stationary point $z_h$ for $L$ (e.g., by applying Newton’s method), it requires only one more Newton step to compute the $\bar{z}$ coordinate of the stationary point for $M_i$ from

$$L''(z_h)(\bar{z}_{i,h}, d\bar{z}_h) = -I_i'(z_h)\,d\bar{z}_h, \quad \forall d\bar{z}_h \in Z_h.$$

According to [3], there holds

$$I_i(x, v, \tilde{u}) - I_i(x_h, v_h, \tilde{u}_h) = M_i'(\tilde{z}_h)(\tilde{z} - \hat{z}_h) + R, \quad \forall \hat{z}_h \in \tilde{Z}_h, \quad i = 1, 2, 3,$$

with a remainder term $R$ of order $O(\|\tilde{z} - \tilde{z}_h\|^3)$ that is therefore neglected.
Thus we use

$$I_i^k(z) - I_i^k(z_h) \approx M_i'(z_h, \bar{z}_{i,h})(\pi_h \tilde{z}_{i,h} - \tilde{z}_{i,h}) = \epsilon_i^k,$$

where $\pi_h$ is an operator defined such that $(\pi_h \tilde{z}_{i,h} - \tilde{z}_{i,h})$ approximates the interpolation error as in [18], typically defined by local averaging, to define the estimators $\eta_k$, $\xi_k$, $\zeta_k$ according to the rule

$$\eta_{k+1} = \epsilon_1^{k+1} + \epsilon_2^k, \quad \xi_k = \epsilon_1^k, \quad \zeta_k = \epsilon_3^k; \tag{51}$$

cf. (48), (49). The estimators obtained by this procedure can be used to trigger local mesh refinement until the requirements (41) are met, cf. [3].

Explicitly, for $p = 2$ (for simplicity) such a stationary point $z = (x, v, u, \tilde{u}, \lambda, \tilde{\mu}, \mu)$ can be computed by solving the following system of equations (analogously for the discrete stationary point of $L$)

$$-\bigl(A_x'(x, u)^*\mu + A_x'(x_k^\delta, \tilde{u})^*\lambda\bigr) \in \alpha_k \partial\mathcal{R}(x); \tag{52}$$
$$2\langle C'(\tilde{u})(dv), C'(\tilde{u})v + C(\tilde{u}) - y^\delta\rangle + \langle A_u'(x_k^\delta, \tilde{u})(dv), \lambda\rangle = 0, \quad \forall dv \in V; \tag{53}$$
$$\langle A_u'(x, u)(du), \mu\rangle = 0, \quad \forall du \in V; \tag{54}$$
$$\begin{aligned} &\langle A_{xu}''(x_k^\delta, \tilde{u})(x - x_k^\delta, d\tilde{u}) + A_{uu}''(x_k^\delta, \tilde{u})(v, d\tilde{u}), \lambda\rangle + \langle A_u'(x_k^\delta, \tilde{u})(d\tilde{u}), \tilde{\mu}\rangle \\ &\quad + 2\langle C''(\tilde{u})(d\tilde{u}, v) + C'(\tilde{u})(d\tilde{u}), C'(\tilde{u})v + C(\tilde{u}) - y^\delta\rangle = 0, \quad \forall d\tilde{u} \in V; \end{aligned} \tag{55}$$
$$\langle A_x'(x_k^\delta, \tilde{u})(x - x_k^\delta) + A_u'(x_k^\delta, \tilde{u})v, d\lambda\rangle = 0, \quad \forall d\lambda \in W; \tag{56}$$
$$\langle A(x_k^\delta, \tilde{u}), d\tilde{\mu}\rangle = 0, \quad \forall d\tilde{\mu} \in W; \tag{57}$$
$$\langle A(x, u), d\mu\rangle = 0, \quad \forall d\mu \in W. \tag{58}$$

Note that (58) is decoupled from the other equations and that if $A_u'(x, u)$ is injective, Eq. (54) implies $\mu = 0$.

Summarizing, since we have a convex minimization problem, after solving a nonlinear system of seven equations to find the minimizer, we need only one more Newton step to compute the error estimators to check whether we need a refinement on the mesh or not.

Regarding the problem (46) related to the Ivanov-IRGNM, we have the Lagrangian functional (50) with the cost functional defined by

$$J(x, v, \tilde{u}) = \|C'(\tilde{u})v + C(\tilde{u}) - y^\delta\|^2 + I_{(-\infty,0]}(\mathcal{R}(x) - \rho),$$

and the indicator functional $I_{(-\infty,0]}(\mathcal{R}(x) - \rho)$ takes the role of a regularization functional. The resulting optimality system is the same as above, cf. (52)–(58), just with (52) replaced by

$$-\bigl(A_x'(x, u)^*\mu + A_x'(x_k^\delta, \tilde{u})^*\lambda\bigr) \in \partial I_{(-\infty,0]}(\mathcal{R}(x) - \rho). \tag{59}$$

Similarly for (47) for Morozov-IRGNM, with the cost function

$$J(x, v, \tilde{u}) = \mathcal{R}(x) + \underbrace{I_{(-\infty,0]}\bigl(\|C'(\tilde{u})v + C(\tilde{u}) - y^\delta\| - \sigma\|C(\tilde{u}) - y^\delta\|\bigr)}_{=:Q(\tilde{u},v)},$$

we end up with an optimality system by setting $\alpha_k = 1$ and replacing (53), (55) in (52)–(58) by

$$-A_u'(x_k^\delta, \tilde{u})^*\lambda \in \partial_v Q(\tilde{u}, v) \tag{60}$$
$$-\bigl(A_{xu}''(x_k^\delta, \tilde{u})(x - x_k^\delta) + A_{uu}''(x_k^\delta, \tilde{u})v\bigr)^*\lambda - A_u'(x_k^\delta, \tilde{u})^*\tilde{\mu} \in \partial_{\tilde{u}} Q(\tilde{u}, v) \tag{61}$$

respectively.

Note that the bound on $I_2^k$ only appears, via (51), in connection to the assumption $\eta_k \le c_\eta\|F_h^k(x_{k,h}^\delta) - y^\delta\|$, for $k \le k_*(\delta, y^\delta)$, in (41). This may be satisfied in practice without refining explicitly with respect to $\eta_k$, but simply by refining with respect to the other error estimators $\xi_k$ (and $\zeta_k$ in the Tikhonov or Morozov case). The fact that $I_{1,h}^k$ and $I_{2,h}^{k-1}$ only differ in the discretization level motivates the assumption that for small $h$, we have $I_{1,h}^k \approx I_{2,h}^{k-1}$ and $\eta_{k-1} \approx \xi_k$. Therefore, the algorithm used in actual computations will be built neglecting $I_2^k$ and hence skipping the constraint $\langle A(x, u), w\rangle_{W^*,W} = 0$, $\forall w \in W$ in (45), (46), (47), which implies a modification of the Lagrangian (50) accordingly.

Therefore, the corresponding optimality system for $p = 2$ in the Tikhonov case is given by

$$-A_x'(x_k^\delta, \tilde{u})^*\lambda \in \alpha_k \partial\mathcal{R}(x); \tag{62}$$
$$2\langle C'(\tilde{u})(dv), C'(\tilde{u})v + C(\tilde{u}) - y^\delta\rangle + \langle A_u'(x_k^\delta, \tilde{u})(dv), \lambda\rangle = 0, \quad \forall dv \in V; \tag{63}$$
$$\begin{aligned} &\langle A_{xu}''(x_k^\delta, \tilde{u})(x - x_k^\delta, d\tilde{u}) + A_{uu}''(x_k^\delta, \tilde{u})(v, d\tilde{u}), \lambda\rangle + \langle A_u'(x_k^\delta, \tilde{u})(d\tilde{u}), \tilde{\mu}\rangle \\ &\quad + 2\langle C''(\tilde{u})(d\tilde{u}, v) + C'(\tilde{u})(d\tilde{u}), C'(\tilde{u})v + C(\tilde{u}) - y^\delta\rangle = 0, \quad \forall d\tilde{u} \in V; \end{aligned} \tag{64}$$
$$\langle A_x'(x_k^\delta, \tilde{u})(x - x_k^\delta) + A_u'(x_k^\delta, \tilde{u})v, d\lambda\rangle = 0, \quad \forall d\lambda \in W; \tag{65}$$
$$\langle A(x_k^\delta, \tilde{u}), d\tilde{\mu}\rangle = 0, \quad \forall d\tilde{\mu} \in W. \tag{66}$$

Note that Eq. (66) is decoupled from the others.
Therefore, the strategy is to solve (66) for $\tilde u$ first, then solve the linear system (62), (63), (65) for $(x, v, \lambda)$, and finally compute $\tilde\mu$ via the linear equation (64). Here, the system (62), (63), (65) can be interpreted as the optimality conditions for the following problem
$$ (x_{k+1,h}^\delta, v_{k,h}^\delta) \in \operatorname*{argmin}_{(x,v)\in\mathcal{C}\times V} \|C'(\tilde u)v + C(\tilde u) - y^\delta\|^2 + \alpha_k\mathcal{R}(x) $$
$$ \text{s.t. } \forall w \in W:\ \langle A_x'(x_{k,h}^\delta,\tilde u)(x - x_{k,h}^\delta) + A_u'(x_{k,h}^\delta,\tilde u)v,\, w\rangle_{W^*,W} = 0. $$
For the Ivanov case, we have to solve (63)–(66) with
$$ -A_x'(x_k^\delta,\tilde u)^*\lambda \in \partial I_{(-\infty,0]}(\mathcal{R}(x) - \rho) \tag{67} $$
in place of (62). Hence again (66) is decoupled from the other equations, (64) is linear with respect to $\tilde\mu$ once $(x, v, \lambda)$ has been computed, and the remaining system for $(x, v, \lambda)$ can be interpreted as the optimality conditions for the following problem
$$ (x_{k+1,h}^\delta, v_{k,h}^\delta) \in \operatorname*{argmin}_{(x,v)\in\mathcal{C}\times V} \|C'(\tilde u)v + C(\tilde u) - y^\delta\|^2 $$
$$ \text{s.t. } \mathcal{R}(x) \le \rho \ \text{ and } \ \forall w \in W:\ \langle A_x'(x_{k,h}^\delta,\tilde u)(x - x_{k,h}^\delta) + A_u'(x_{k,h}^\delta,\tilde u)v,\, w\rangle_{W^*,W} = 0. $$
The Morozov case requires solution of (62) (with $\alpha_k = 1$), (65), (66), (60), (61). Thus again, we first solve (66) for $\tilde u$, then the system (62) (with $\alpha_k = 1$), (65), (60), which is the first order optimality condition for
$$ (x_{k+1,h}^\delta, v_{k,h}^\delta) \in \operatorname*{argmin}_{(x,v)\in\mathcal{C}\times V} \mathcal{R}(x) $$
$$ \text{s.t. } \|C'(\tilde u)v + C(\tilde u) - y^\delta\| \le \sigma\|C(\tilde u) - y^\delta\| \ \text{ and } \ \forall w \in W:\ \langle A_x'(x_{k,h}^\delta,\tilde u)(x - x_{k,h}^\delta) + A_u'(x_{k,h}^\delta,\tilde u)v,\, w\rangle_{W^*,W} = 0, $$
with Lagrange multiplier $\lambda$ for the equality constraint, and finally the (now possibly nonlinear) inclusion (61) for $\tilde\mu$.

Remark 3 Since DWR estimators are based on residuals which are computed in the optimization process, the additional costs for estimation are very low, which makes this approach attractive for our purposes.
However, although these error estimators are known to work efficiently in practice (see [3]), they are not reliable, i.e., the conditions $|I_i^k(z) - I_i^k(z_h)| \le \varepsilon_i^k$, $i = 1, 2, 3$, cannot be guaranteed in a strict sense in the computations, since we neglect the remainder term $R$ and use an approximation for $\tilde z - \hat z$. As our analysis in Theorem 1 is kept rather general, it is not restricted to DWR estimators and would also work with different (e.g., reliable) error estimators.

4 Model examples

We present a model example to illustrate the abstract setting from the previous section. Consider the following inverse source problem for a semilinear elliptic PDE, where the model and observation equations are given by
\begin{align}
-\Delta u + \kappa u^3 &= \chi_{\omega_c} x \quad \text{in } \Omega \subset \mathbb{R}^d, \tag{68}\\
u &= 0 \quad \text{on } \partial\Omega, \tag{69}\\
C(u) &= u|_{\omega_o}, \qquad \|y - y^\delta\|_{L^2(\omega_o)} \le \delta, \tag{70}
\end{align}
where $\chi_{\omega_c}$ denotes the extension by zero of a function on $\omega_c$ to a function on all of $\Omega$.

We first of all consider Tikhonov regularization and, aiming for a sparsely supported source, therefore use the space of Radon measures $\mathcal{M}(\omega_c)$ as a preimage space $X$. Thus we define the operators $A : \mathcal{M}(\omega_c) \times W_0^{1,q'}(\Omega) \to W^{-1,q'}(\Omega)$, $A(x,u) = -\Delta u + \kappa u^3 - \chi_{\omega_c} x$, $\kappa \in \mathbb{R}$, and the injection $C : W_0^{1,q'}(\Omega) \to L^2(\omega_o) = Y$, $q > d$, where $\Omega$ is a bounded domain in $\mathbb{R}^d$ with $d = 2$ or $3$, with Lipschitz boundary $\partial\Omega$, and $\omega_c, \omega_o \subset \Omega$ are the control domain and the observation domain, respectively.

A monotonicity argument yields well posedness of the above semilinear boundary value problem, i.e., well-definedness of $u \in W_0^{1,q'}(\Omega)$ as a solution to the elliptic boundary value problem (68), (69), as long as we can guarantee that $u^3 \in W^{-1,q}(\Omega)$ for any $u \in W_0^{1,q'}(\Omega)$, i.e., the embeddings $W_0^{1,q'}(\Omega) \to L^{3r}(\Omega)$ and $L^r(\Omega) \to W^{-1,q}(\Omega)$ are continuous for some $r \in [1,\infty]$, which (by duality) is the case iff $W_0^{1,q'}(\Omega)$ embeds continuously both into $L^{3r}(\Omega)$ and $L^{r'}(\Omega)$.
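By Sobolev's Embedding Theorem, this boils down to the inequalities
$$ 1 - \frac{d}{q'} \ge -\frac{d}{3r} \quad \text{and} \quad 1 - \frac{d}{q'} \ge -\frac{d}{r'}, $$
which by elementary computations turn out to be equivalent to
$$ \frac{dq}{q+d} \le r \le \frac{dq}{3(dq - q - d)}, \tag{71} $$
where the left hand side is larger than one and the denominator on the right hand side is positive due to the fact that for $d \ge 2$ we have $q > d \ge d' = \frac{d}{d-1}$. Taking the extremal bounds for $q > d$ in (71) (note that the lower bound is increasing and the upper bound is decreasing with $q$) we get
$$ \frac{d}{2} < r < \frac{d}{3(d-2)}. \tag{72} $$
Thus, as a by-product, we get that for any $t \in [1, \bar t)$ there exists $q > d$ such that $W_0^{1,q'}(\Omega)$ continuously embeds into $L^t(\Omega)$, with
$$ \bar t = \infty \ \text{in case } d = 2 \quad \text{and} \quad \bar t = 3 \ \text{in case } d = 3. \tag{73} $$
For the regularization functional $\mathcal{R}(x) = \|x\|_{\mathcal{M}(\omega_c)}$, the IRGNM-Tikhonov minimization step is given by (ignoring $h$ in the notation)
$$ (x_{k+1}^\delta, v_k^\delta, u_k^\delta) \in \operatorname*{argmin}_{(x,v,\tilde u)\in\mathcal{M}(\omega_c)\times(W_0^{1,q'}(\Omega))^2} \|v + \tilde u - y^\delta\|_{L^2(\omega_o)}^2 + \alpha_k\|x\|_{\mathcal{M}(\omega_c)} $$
$$ \text{s.t. } \forall w \in W_0^{1,q}(\Omega):\quad \int_\Omega (\nabla v\cdot\nabla w + 3\kappa\tilde u^2 v w)\,d\Omega = \int_{\omega_c} w\,d(x - x_k^\delta), \qquad \int_\Omega (\nabla\tilde u\cdot\nabla w + \kappa\tilde u^3 w)\,d\Omega = \int_{\omega_c} w\,dx_k^\delta. $$
Here and below $\int_\Omega \cdot\,d\Omega$ and $\int_{\omega_c} \cdot\,dx$ denote the integrals with respect to the Lebesgue measure and with respect to the measure $x$, respectively. Therefore, to compute this Gauss–Newton step, one first needs to solve the nonlinear equation
$$ -\Delta\tilde u + \kappa\tilde u^3 = \chi_{\omega_c} x_k^\delta \tag{74} $$
for $\tilde u = u_k^\delta$, then solve the following optimality system with respect to $(x, v, \lambda)$ (written in a strong formulation)
\begin{align*}
& \|\lambda\|_{C_b(\omega_c)} \le \alpha_k \quad \text{and} \quad \int_{\omega_c} (x^* - \lambda)\,dx \le 0 \quad \forall x^* \in B_{\alpha_k}^{C_b(\omega_c)},\\
& -\Delta\lambda + 3\kappa(u_k^\delta)^2\lambda + 2v + 2u_k^\delta = 2\chi_{\omega_o} y^\delta,\\
& -\Delta v + 3\kappa(u_k^\delta)^2 v = \chi_{\omega_c}(x - x_k^\delta),
\end{align*}
which can be interpreted as the optimality system for the minimization problem
$$ (x_{k+1}^\delta, v_k^\delta) \in \operatorname*{argmin}_{(x,v)\in\mathcal{M}(\omega_c)\times W_0^{1,q'}(\Omega)} \|u_k^\delta + v - y^\delta\|_{L^2(\omega_o)}^2 + \alpha_k\|x\|_{\mathcal{M}(\omega_c)} \quad \text{s.t. } -\Delta v + 3\kappa(u_k^\delta)^2 v = \chi_{\omega_c}(x - x_k^\delta), \tag{75} $$
with Lagrange multiplier $\lambda$ for the equality constraint, and finally, compute $\tilde\mu$ by solving
$$ -\Delta\tilde\mu + 3\kappa(u_k^\delta)^2\tilde\mu = -6\kappa u_k^\delta v\lambda - 2(v + u_k^\delta - \chi_{\omega_o} y^\delta). \tag{76} $$
For carrying out the IRGNM iteration, $\tilde\mu$ is not required, but we need it for evaluating the error estimators.

For the Ivanov case, we consider the same model and observation equations (68), (69), (70), but now we intend to regularize by imposing $L^\infty$ bounds and thus use the slightly different function space setting $A : L^\infty(\omega_c) \times H_0^1(\Omega) \to H^{-1}(\Omega)$, $A(x,u) = -\Delta u + \kappa u^3 - \chi_{\omega_c} x$, $\kappa \in \mathbb{R}$, and the injection $C : H_0^1(\Omega) \to L^2(\omega_o)$. The IRGNM-Ivanov minimization step with the regularization functional $\mathcal{R}(x) = \|x\|_{L^\infty(\omega_c)}$ is given by
$$ (x_{k+1}^\delta, v_k^\delta, u_k^\delta) \in \operatorname*{argmin}_{(x,v,\tilde u)\in L^\infty(\omega_c)\times(H_0^1(\Omega))^2} \|v + \tilde u - y^\delta\|_{L^2(\omega_o)}^2 $$
$$ \text{s.t. } \|x\|_{L^\infty(\omega_c)} \le \rho \ \text{ and } \ \forall w \in H_0^1(\Omega):\quad \int_\Omega (\nabla v\cdot\nabla w + 3\kappa\tilde u^2 v w)\,d\Omega = \int_{\omega_c} w(x - x_k^\delta)\,d\Omega, \qquad \int_\Omega (\nabla\tilde u\cdot\nabla w + \kappa\tilde u^3 w)\,d\Omega = \int_{\omega_c} w\,x_k^\delta\,d\Omega. $$
For the Gauss–Newton step, one needs to first solve the nonlinear equation (74) for $\tilde u = u_k^\delta$, and then solve the following optimality system with respect to $(x, v, \lambda)$
\begin{align*}
& \|x\|_{L^\infty(\omega_c)} \le \rho \quad \text{and} \quad \int_{\omega_c} (x^* - x)\lambda\,d\Omega \le 0 \quad \forall x^* \in B_\rho^{L^\infty(\omega_c)},\\
& -\Delta\lambda + 3\kappa(u_k^\delta)^2\lambda + 2v + 2u_k^\delta = 2\chi_{\omega_o} y^\delta,\\
& -\Delta v + 3\kappa(u_k^\delta)^2 v = \chi_{\omega_c}(x - x_k^\delta),
\end{align*}
which can be interpreted as the optimality system for the minimization problem
$$ (x_{k+1}^\delta, v_k^\delta) \in \operatorname*{argmin}_{(x,v)\in L^\infty(\omega_c)\times H_0^1(\Omega)} \|u_k^\delta + v - y^\delta\|_{L^2(\omega_o)}^2 \quad \text{s.t. } \|x\|_{L^\infty(\omega_c)} \le \rho, \quad -\Delta v + 3\kappa(u_k^\delta)^2 v = \chi_{\omega_c}(x - x_k^\delta), \tag{77} $$
with Lagrange multiplier $\lambda$ for the equality constraint. Finally, $\tilde\mu$ is computed from (76).

For the IRGNM-Morozov case, using for simplicity the regularization functional $\mathcal{R}(x) = \frac12\|x\|_{L^2(\omega_c)}^2$, and leaving the rest of the setting as in the IRGNM-Ivanov case, the step is defined by
$$ (x_{k+1}^\delta, v_k^\delta, u_k^\delta) \in \operatorname*{argmin}_{(x,v,\tilde u)\in L^\infty(\omega_c)\times(H_0^1(\Omega))^2} \tfrac12\|x\|_{L^2(\omega_c)}^2 $$
$$ \text{s.t. } \|v + \tilde u - y^\delta\|_{L^2(\omega_o)}^2 \le \sigma\|\tilde u - y^\delta\|_{L^2(\omega_o)}^2 \ \text{ and } \ \forall w \in H_0^1(\Omega):\quad \int_\Omega (\nabla v\cdot\nabla w + 3\kappa\tilde u^2 v w)\,d\Omega = \int_{\omega_c} w(x - x_k^\delta)\,d\Omega, \qquad \int_\Omega (\nabla\tilde u\cdot\nabla w + \kappa\tilde u^3 w)\,d\Omega = \int_{\omega_c} w\,x_k^\delta\,d\Omega. $$
So again we first solve (74) for $\tilde u = u_k^\delta$, then the minimization problem
$$ (x_{k+1}^\delta, v_k^\delta) \in \operatorname*{argmin}_{(x,v)\in L^\infty(\omega_c)\times H_0^1(\Omega)} \tfrac12\|x\|_{L^2(\omega_c)}^2 \quad \text{s.t. } \|v + u_k^\delta - y^\delta\|_{L^2(\omega_o)}^2 \le \sigma\|u_k^\delta - y^\delta\|_{L^2(\omega_o)}^2, \quad -\Delta v + 3\kappa(u_k^\delta)^2 v = \chi_{\omega_c}(x - x_k^\delta), $$
or actually its first order optimality system
\begin{align*}
& \lambda|_{\omega_c} = x,\\
& \varphi \ge 0, \quad \|v + u_k^\delta - y^\delta\|_{L^2(\omega_o)}^2 \le \sigma\|u_k^\delta - y^\delta\|_{L^2(\omega_o)}^2, \quad \varphi\bigl(\|v + u_k^\delta - y^\delta\|_{L^2(\omega_o)}^2 - \sigma\|u_k^\delta - y^\delta\|_{L^2(\omega_o)}^2\bigr) = 0,\\
& -\Delta v + 3\kappa(u_k^\delta)^2 v = \chi_{\omega_c}(x - x_k^\delta),
\end{align*}
for $(x_{k+1}^\delta, v_k^\delta, \varphi, \lambda)$, and finally,
$$ -\Delta\tilde\mu + 3\kappa(u_k^\delta)^2\tilde\mu = -6\kappa u_k^\delta v\lambda - \varphi\bigl(v + (1 - \sigma)(u_k^\delta - \chi_{\omega_o} y^\delta)\bigr) $$
for $\tilde\mu$.

For numerically efficient methods to solve the minimization problems (75) and (77) we refer to, e.g., [4–6] and the references therein.

We finally check the tangential cone condition in case $\omega_o = \Omega$ and, for simplicity, also $\omega_c = \Omega$, in both settings
$$ X = \mathcal{M}(\omega_c), \quad V = W_0^{1,q'}(\Omega), \quad W = W_0^{1,q}(\Omega) $$
(where we will have to restrict ourselves to $d = 2$) and
$$ X = L^\infty(\omega_c) \ \text{or} \ X = L^2(\omega_c), \quad V = W = H_0^1(\Omega). $$
For this purpose, we use the fact that with the notation $F(\tilde x) = \tilde u|_{\omega_o}$, $F(x) = u|_{\omega_o}$, $F(\tilde x) - F(x) = v|_{\omega_o}$ and $F(\tilde x) - F(x) - F'(x)(\tilde x - x) = w|_{\omega_o}$, the functions $v, w \in W_0^{1,q'}(\Omega)$ satisfy the homogeneous Dirichlet boundary value problems for the equations
\begin{align*}
-\Delta v + \kappa(\tilde u^2 + \tilde u u + u^2)v &= \tilde x - x,\\
-\Delta w + 3\kappa u^2 w &= -\kappa(\tilde u + 2u)v^2.
\end{align*}
The first equation follows from $\tilde u^3 - u^3 = (\tilde u^2 + \tilde u u + u^2)v$ with $v = \tilde u - u$; the second is obtained by subtracting the linearized equation $-\Delta z + 3\kappa u^2 z = \tilde x - x$ for $z = F'(x)(\tilde x - x)$ and using $\tilde u^2 + \tilde u u - 2u^2 = (\tilde u + 2u)v$.

Using an Aubin–Nitsche type duality trick, we can estimate the $L^2$ norm of $w$ via the adjoint state $p \in W_0^{1,n}(\Omega)$, which solves $-\Delta p + 3\kappa u^2 p = w$ with homogeneous Dirichlet boundary conditions, so that by Hölder's inequality
\begin{align*}
\|w\|_{L^2(\Omega)}^2 &= \langle w, (-\Delta + 3\kappa u^2\,\mathrm{id})p\rangle = \langle(-\Delta + 3\kappa u^2\,\mathrm{id})w, p\rangle = \langle -\kappa(\tilde u + 2u)v^2, p\rangle\\
&\le \kappa\|v\|_{L^2(\Omega)}\|\tilde u + 2u\|_{L^m(\Omega)}\|v\|_{L^m(\Omega)}\|p\|_{L^{\frac{2m}{m-4}}(\Omega)}\\
&\le C\kappa\|v\|_{L^2(\Omega)}\|\tilde u + 2u\|_{L^m(\Omega)}\|v\|_{L^m(\Omega)}\|w\|_{L^2(\Omega)},
\end{align*}
where we aim at choosing $m \in [4,\infty]$, $n \in [1,\infty]$ such that indeed
$$ \|p\|_{W^{1,n}(\Omega)} \le C\|w\|_{W^{-1,n}(\Omega)} \le C\|w\|_{L^2(\Omega)} $$
and the embeddings $V \to L^m(\Omega)$, $W_0^{1,n}(\Omega) \to L^{\frac{2m}{m-4}}(\Omega)$, $L^2(\Omega) \to W^{-1,n}(\Omega)$ are continuous. If we succeed in doing so, we can bound $C\kappa\|\tilde u + 2u\|_{L^m(\Omega)}\|v\|_{L^m(\Omega)}$ by some constant $c_{tc}$, which will be small provided $\|\tilde x - x\|_X$ and hence $\|v\|_{L^m(\Omega)}$ is small. Thus, the numbers $n$, $m$ are limited by the requirements
$$ V \subseteq L^m(\Omega) \ \text{and} \ W_0^{1,n}(\Omega) \subseteq L^{\frac{2m}{m-4}}(\Omega) \ \text{and} \ m \ge 4, \tag{78} $$
$L^2(\Omega) \subseteq W^{-1,n}(\Omega)$, i.e., by duality,
$$ W_0^{1,n'}(\Omega) \subseteq L^2(\Omega), \tag{79} $$
and the fact that $\kappa u^2 p \in L^o(\Omega)$ should be contained in $W^{-1,n}(\Omega)$ for $u \in V \subseteq L^t(\Omega)$ and $p \in W_0^{1,n}(\Omega)$, which via Hölder's inequality in
$$ \Bigl(\int_\Omega (u^2 p)^o\,d\Omega\Bigr)^{1/o} \le \|u\|_{L^t(\Omega)}^2\|p\|_{L^{\frac{ot}{t-2o}}(\Omega)} $$
and duality leads to the requirements
$$ W_0^{1,n'}(\Omega) \subseteq L^{o'}(\Omega) \ \text{and} \ V \subseteq L^t(\Omega) \ \text{and} \ W_0^{1,n}(\Omega) \subseteq L^{\frac{ot}{t-2o}}(\Omega) \ \text{and} \ o \le \frac{t}{2}. \tag{80} $$
In case $V = W_0^{1,q'}(\Omega)$ with $q > d$ and $d = 3$, (78) will not work out, since according to (73), $m$ cannot be chosen larger than or equal to four.

In case $V = W_0^{1,q'}(\Omega)$ with $q > d$ and $d = 2$, we can choose, e.g., $t = m = n = 6$, $o = 2$ to satisfy (78), (79), (80) as well as $t, m < \bar t$ as in (73).

The same choice is possible in case $V = H_0^1(\Omega)$ with $d \in \{2, 3\}$.

5 Numerical tests

In this section, we provide some numerical illustration of the IRGNM Ivanov method applied to the example from Sect. 4, i.e., each Newton step consists of solving (74) and subsequently (77).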
For the numerical solution of (74) we apply a damped Newton iteration to the equation $\Phi(\tilde u) = 0$, where
$$ \Phi : H_0^1(\Omega) \to H^{-1}(\Omega), \qquad \Phi(\tilde u) = -\Delta\tilde u + \kappa\tilde u^3 - x_k^\delta, $$
$$ \tilde u^{l+1} = \tilde u^l - \bigl(-\Delta + 3\kappa(\tilde u^l)^2\bigr)^{-1}\bigl(-\Delta\tilde u^l + \kappa(\tilde u^l)^3 - x_k^\delta\bigr), $$
which is stopped as soon as $\|\Phi(\tilde u^l)\|_{H^{-1}(\Omega)}$ has been reduced by a factor of $10^{-4}$. The sources $x$ and states $u$ are discretized by piecewise linear finite elements, hence after elimination of the state via the linear equality constraint, (77) becomes a box constrained quadratic program for the discretized version of $x$, which we solve with the method from [12] using the Matlab code mkr_box provided to us by Philipp Hungerländer, Alpen-Adria Universität Klagenfurt. All implementations have been done in Matlab.

We performed test computations on a 2-d domain $\omega_o = \omega_c = \Omega = (-1, 1)^2$, on a regular computational finite element grid consisting of $2 \cdot N \cdot N$ triangles, with $N = 32$. We first of all consider $\kappa = 1$ (below we will also show results with $\kappa = 100$) and the piecewise constant exact source function
$$ x_{ex} = -10 + 20 \cdot 1\!\!1_B, \tag{81} $$
where $B$ is the ball of radius 0.2 around $(-0.4, -0.3)$, cf. Fig. 1, and correspondingly set $\rho = 10$.

Fig. 1 Left: exact source $x_{ex}$; right: locations of spots for testing weak* $L^\infty$ convergence

Table 1 Convergence as $\delta \to 0$: averaged errors of five test runs with uniform noise

δ        err_spot1   err_spot2   err_spot3   err_L1(Ω)
0.1000   0           4.0818      8.0043      0.0627
0.0667   0.1558      3.6454      7.8451      0.0541
0.0333   0           3.0442      6.5726      0.0370
0.0100   0           0           3.9091      0.0188

In order to avoid an inverse crime, we generated the synthetic data on a finer grid and, after projection of $u_{ex}$ onto the computational grid, we added normally distributed random noise of levels $\delta \in \{0.001, 0.01, 0.1\}$ to obtain synthetic data $y^\delta$. In all tests we start with the constant function with value zero for $x_0$. Moreover, we always set $\tau = 1.1$.
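The two halves of a Newton step described above can be sketched in a few lines. The following is only an illustration under simplifying assumptions that are not taken from the paper: a 1-d finite difference discretization on $(0,1)$ instead of piecewise linear finite elements on $(-1,1)^2$, the Euclidean norm of the residual as a stand-in for the $H^{-1}$ norm, simple step halving as the damping rule, and a projected gradient iteration in place of the active set method from [12]. All function names (`laplacian_1d`, `solve_state`, `ivanov_step`, `misfit`) are hypothetical.

```python
import numpy as np

def laplacian_1d(n):
    """Finite difference matrix of -d^2/ds^2 on (0, 1) with zero Dirichlet data."""
    h = 1.0 / (n + 1)
    A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
    return A, h

def solve_state(x, kappa, A, reduction=1e-4, max_iter=50):
    """Damped Newton for Phi(u) = A u + kappa u^3 - x = 0, started at u = 0."""
    phi = lambda u: A @ u + kappa * u**3 - x
    u = np.zeros_like(x)
    r0 = np.linalg.norm(phi(u))
    for _ in range(max_iter):
        r = phi(u)
        if np.linalg.norm(r) <= reduction * r0:   # residual reduced by the factor
            break
        du = np.linalg.solve(A + 3.0 * kappa * np.diag(u**2), -r)
        t = 1.0                                   # halve the step until the residual decreases
        while np.linalg.norm(phi(u + t * du)) >= np.linalg.norm(r) and t > 1e-8:
            t *= 0.5
        u = u + t * du
    return u

def ivanov_step(x_k, u_k, y, kappa, rho, A, h, n_iter=500):
    """Projected gradient for min h*|u_k + K (x - x_k) - y|^2 subject to |x_i| <= rho."""
    K = np.linalg.inv(A + 3.0 * kappa * np.diag(u_k**2))   # linearized solution operator
    step = 1.0 / (2.0 * h * np.linalg.norm(K, 2) ** 2)     # 1 / Lipschitz constant of the gradient
    x = np.clip(x_k, -rho, rho)
    for _ in range(n_iter):
        grad = 2.0 * h * K.T @ (u_k + K @ (x - x_k) - y)
        x = np.clip(x - step * grad, -rho, rho)            # projection onto the box
    return x

# Small demonstration with a 1-d analogue of the piecewise constant source.
n, kappa, rho = 50, 1.0, 10.0
A, h = laplacian_1d(n)
s = np.linspace(h, 1.0 - h, n)
x_exact = np.where(np.abs(s - 0.4) < 0.1, 10.0, -10.0)
y = solve_state(x_exact, kappa, A)        # noise-free data, for brevity only
x0 = np.zeros(n)                          # zero starting value
u0 = solve_state(x0, kappa, A)
x1 = ivanov_step(x0, u0, y, kappa, rho, A, h)

def misfit(x, x_k=x0, u_k=u0):
    """Linearized data misfit around (x_k, u_k)."""
    K = np.linalg.inv(A + 3.0 * kappa * np.diag(u_k**2))
    return h * np.sum((u_k + K @ (x - x_k) - y) ** 2)
```

Since the starting value is feasible and the step size equals the reciprocal Lipschitz constant of the gradient, the projected gradient iteration does not increase the linearized misfit, so `misfit(x1) <= misfit(x0)` while `x1` stays inside the box. A full reconstruction would repeat both steps and stop by the discrepancy principle.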
According to our convergence result Theorem 1 with $\mathcal{R} = \|\cdot\|_{L^\infty(\Omega)}$, we can expect weak* convergence in $L^\infty(\Omega)$ here. Thus we computed the errors in certain spots within the two homogeneous regions and on their interface,
$$ \mathrm{spot}_1 = (0.5, 0.5), \quad \mathrm{spot}_2 = (-0.4, -0.3), \quad \mathrm{spot}_3 = (-0.4, -0.5), $$
cf. Fig. 1, more precisely, on $\frac1N \times \frac1N$ squares located at these spots, corresponding to the piecewise constant $L^1$ functions with these supports, in order to exemplarily test weak* $L^\infty$ convergence. Additionally we computed $L^1$ errors.

Table 1 provides an illustration of convergence as $\delta$ decreases. For this purpose, we performed five runs on each noise level for each example and list the average errors.

In Fig. 2 we plot the reconstructions for $\kappa = 1$ and $\kappa = 100$. For $\kappa = 1$, the noise levels $\delta \in \{0.01, 0.0333, 0.0667, 0.1\}$ correspond to a percentage of $p \in \{5.6, 18.5, 37.1, 55.6\}$ of the $L^2$ deviation of the exact state from the background state $u_0 = -10^{1/3}$. In case of $\kappa = 100$, where the background state is $u_0 = -0.1^{1/3}$, the corresponding percentages are $p \in \{17.9, 59.7, 119.4, 179.2\}$. For an illustration of the noisy data as compared to the exact ones, see Figs. 3 and 4. Indeed, the box constraints make it possible to cope with relatively large noise levels, even in the rather nonlinear regime with $\kappa = 100$.

Fig. 2 Reconstructions from noisy data with $\delta \in \{0.1, 0.0667, 0.0333, 0.01\}$ (top to bottom) for $\kappa = 1$ (left) and $\kappa = 100$ (right)

Fig. 3 Exact and noisy data ($\delta = 0.1$) for $\kappa = 1$

Fig. 4 Exact and noisy data ($\delta = 0.1$) for $\kappa = 100$

6 Conclusions and remarks

In this paper we have studied convergence of the Tikhonov type, the Ivanov type, and the Morozov type IRGNM with a stopping rule of discrepancy principle type.
To the best of our knowledge, the Ivanov and Morozov IRGNMs have not been studied so far, and for all three (Tikhonov, Ivanov, and Morozov type) IRGNMs, convergence results without source conditions so far use stronger assumptions than the tangential cone condition used here. We also consider discretized versions of the methods and provide discretization error bounds that still guarantee convergence. Moreover, we discuss goal oriented dual weighted residual error estimators that can be used in an adaptive discretization scheme for controlling these discretization error bounds. An inverse source problem for a nonlinear elliptic boundary value problem illustrates our theoretical findings in the special situations of measure valued and $L^\infty$ sources. We also provide some computational results with the Ivanov IRGNM for the case of an $L^\infty$ source. Numerical implementations and tests for a measure valued source, together with adaptive discretization, are subject of ongoing work, based on the approaches from [4–6,18,19]. Future research in this context will be concerned with convergence rates results for the Ivanov and Morozov IRGNMs under source conditions.

Acknowledgements Open access funding provided by University of Klagenfurt. The authors wish to thank Philipp Hungerländer, Alpen-Adria Universität Klagenfurt, for providing us with the Matlab code based on the method from [12]. Moreover, the authors gratefully acknowledge financial support by the Austrian Science Fund FWF under the grants I2271 “Regularization and Discretization of Inverse Problems for PDEs in Banach Spaces” and P30054 “Solving Inverse Problems without Forward Operators” as well as partial support by the Karl Popper Kolleg “Modeling-Simulation-Optimization”, funded by the Alpen-Adria-Universität Klagenfurt and by the Carinthian Economic Promotion Fund (KWF). Moreover, we wish to thank both reviewers for fruitful comments leading to an improved version of the manuscript.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Appendix

Lemma 1 For all $a, b > 0$, $p \ge 1$ and $\gamma, \epsilon \in (0, 1)$
$$ (a + b)^p \le (1 + \gamma)^{p-1} a^p + \Bigl(\frac{1 + \gamma}{\gamma}\Bigr)^{p-1} b^p \tag{82} $$
and, if additionally $a \ge b$, also
$$ (a - b)^p \ge (1 - \epsilon)^{p-1} a^p - \Bigl(\frac{1 - \epsilon}{\epsilon}\Bigr)^{p-1} b^p. \tag{83} $$

Proof The estimates can be obtained by solving the following extremal value problems
$$ C_\gamma = \max_{x>0}\varphi(x), \qquad C_\epsilon = \max_{x>0}\Phi(x), $$
where
$$ \varphi(x) := \bigl((1 + x)^p - (1 + \gamma)^{p-1}\bigr)x^{-p} \quad \text{and} \quad \Phi(x) := \bigl((1 - \epsilon)^{p-1} - (1 - x)^p\bigr)x^{-p}, $$
since for any $\gamma, \epsilon \in (0, 1)$, $\varphi(x) \le C_\gamma$ and $\Phi(x) \le C_\epsilon$ for all $x > 0$ with $x := b/a$, $a, b > 0$, is equivalent to (24). Solving for $C_\gamma$, we have
$$ \varphi'(x) = p\,x^{-(p+1)}\bigl((1 + \gamma)^{p-1} - (1 + x)^{p-1}\bigr) \begin{cases} = 0 \iff x = \gamma,\\ < 0 \ \text{for } x > \gamma,\\ > 0 \ \text{for } x < \gamma, \end{cases} $$
which means that
$$ \max_{x>0}\varphi(x) = \varphi(\gamma) = \Bigl(\frac{1 + \gamma}{\gamma}\Bigr)^{p-1}, $$
so defining $C_\gamma := \bigl(\frac{1+\gamma}{\gamma}\bigr)^{p-1}$ and writing the resulting inequality in terms of $a$ and $b$ we have the desired formula. The formula in (83) is derived analogously.

Lemma 2 Let $k \in \mathbb{N}$, $(d_j)_{1\le j\le k+1} \subseteq [0,\infty)$, $(R_j)_{1\le j\le k+1} \subseteq [0,\infty)$, $\alpha_0, R^\dagger, c, q \ne \theta \in (0,\infty)$. Then
$$ \forall j \in \{0,\ldots,k\}:\quad d_{j+1} + \alpha_0\theta^j R_{j+1} \le q\,d_j + \alpha_0\theta^j R^\dagger + c \tag{84} $$
implies that
$$ d_{k+1} + \alpha_0\theta^k R_{k+1} < q^{k+1} d_0 + \frac{1}{1 - \frac{q}{\theta}}\,\alpha_0\theta^k R^\dagger + \frac{1}{1 - q}\,c. \tag{85} $$

Proof We first show by induction that for all $l \in \{0,\ldots,k\}$
$$ d_{k+1} + \alpha_0\theta^k R_{k+1} \le q^{l+1} d_{k-l} + \Bigl(1 + \frac{q}{\theta} + \cdots + \Bigl(\frac{q}{\theta}\Bigr)^l\Bigr)\alpha_0\theta^k R^\dagger + (1 + q + \cdots + q^l)\,c. \tag{86} $$
Indeed, for $l = 0$, (86) is just (84) with $j = k$.
Suppose that (86) holds for $l$, then using (84) with $j = k - (l + 1)$, we obtain the formula for $l + 1$:
\begin{align*}
d_{k+1} + \alpha_0\theta^k R_{k+1} &\le q^{l+1}\bigl(q\,d_{k-(l+1)} + \alpha_0\theta^{k-(l+1)} R^\dagger + c\bigr) + \Bigl(1 + \frac{q}{\theta} + \cdots + \Bigl(\frac{q}{\theta}\Bigr)^l\Bigr)\alpha_0\theta^k R^\dagger + (1 + q + \cdots + q^l)\,c\\
&= q^{(l+1)+1} d_{k-(l+1)} + \Bigl(1 + \frac{q}{\theta} + \cdots + \Bigl(\frac{q}{\theta}\Bigr)^l + \Bigl(\frac{q}{\theta}\Bigr)^{l+1}\Bigr)\alpha_0\theta^k R^\dagger + (1 + q + \cdots + q^l + q^{l+1})\,c,
\end{align*}
and the induction proof is complete. Hence, setting $l = k$ in (86) and using the geometric series formula, we get the assertion (85).

References

1. Bakushinsky, A.B.: The problem of the convergence of the iteratively regularized Gauss–Newton method. Comput. Math. Math. Phys. 32, 1353–1359 (1992)
2. Bakushinsky, A.B., Kokurin, M.: Iterative Methods for Approximate Solution of Inverse Problems. Kluwer, Dordrecht (2004)
3. Becker, R., Vexler, B.: Mesh refinement and numerical sensitivity analysis for parameter calibration of partial differential equations. J. Comput. Phys. 206, 95–110 (2005)
4. Clason, C., Kunisch, K.: A duality-based approach to elliptic control problems in non-reflexive Banach spaces. ESAIM Control Optim. Calc. Var. 17(1), 243–266 (2011)
5. Clason, C., Kunisch, K.: A measure space approach to optimal source placement. Comput. Optim. Appl. 53(1), 155–171 (2012)
6. Casas, E., Clason, C., Kunisch, K.: Approximation of elliptic control problems in measure spaces with sparse solutions. SIAM J. Control Optim. 50(4), 1735–1752 (2012)
7. Dombrovskaja, I., Ivanov, V.K.: On the theory of certain linear equations in abstract spaces. Sib. Mat. Z. 6, 499–508 (1965)
8. Engl, H., Hanke, M., Neubauer, A.: Regularization of Inverse Problems. Kluwer, Dordrecht (1996)
9. Grasmair, M., Haltmeier, M., Scherzer, O.: The residual method for regularizing ill-posed problems. Appl. Math. Comput. 218, 2693–2710 (2011)
10. Hanke, M.: A regularization Levenberg–Marquardt scheme, with applications to inverse groundwater filtration problems. Inverse Prob. 13, 79–95 (1997)
11. Hohage, T., Werner, F.: Iteratively regularized Newton-type methods for general data misfit functionals and applications to Poisson data. Numer. Math. 123, 745–779 (2013)
12. Hungerländer, P., Rendl, F.: A feasible active set method for strictly convex problems with simple bounds. SIAM J. Opt. 25, 1633–1659 (2015)
13. Ivanov, V.K.: On linear problems which are not well-posed. Dokl. Akad. Nauk SSSR 145, 270–272 (1962)
14. Ivanov, V.K.: On ill-posed problems. Mat. Sb. (N.S.) 61(103), 211–223 (1963)
15. Ivanov, V.K., Vasin, V.V., Tanana, V.P.: Theory of Linear Ill-Posed Problems and Its Applications, Inverse and Ill-Posed Problems Series, VSP (2002)
16. Jin, Q., Zhong, M.: On the iteratively regularized Gauss–Newton method in Banach spaces with applications to parameter identification problems. Numer. Math. 124, 647–683 (2013)
17. Kaltenbacher, B., Hofmann, B.: Convergence rates for the iteratively regularized Gauss–Newton method in Banach spaces. Inverse Prob. 26, 035007 (2010)
18. Kaltenbacher, B., Kirchner, A., Veljović, S.: Goal oriented adaptivity in the IRGNM for parameter identification in PDEs: I. Reduced formulation. Inverse Prob. 30, 045001 (2014)
19. Kaltenbacher, B., Kirchner, A., Vexler, S.: Goal oriented adaptivity in the IRGNM for parameter identification in PDEs II: all-at-once formulations. Inverse Prob. 30, 045002 (2014)
20. Kaltenbacher, B., Neubauer, A., Scherzer, O.: Iterative Regularization Methods for Nonlinear Ill-Posed Problems. Walter de Gruyter, Berlin (2008)
21. Kaltenbacher, B., Schöpfer, F., Schuster, T.: Convergence of some iterative methods for the regularization of nonlinear ill-posed problems in Banach spaces. Inverse Prob. 25, 065003 (2009)
22. Lorenz, D., Worliczek, N.: Necessary conditions for variational regularization schemes. Inverse Prob. 29, 075016 (2013)
23. Morozov, V.A.: Choice of parameter for the solution of functional equations by the regularization method. Dokl. Akad. Nauk SSSR 175, 1225–1228 (1967)
24. Neubauer, A., Ramlau, R.: On convergence rates for quasi-solutions of ill-posed problems. ETNA 41, 81–92 (2014)
25. Scherzer, O.: Convergence criteria of iterative methods based on Landweber iteration for solving nonlinear problems. J. Math. Anal. Appl. 194, 911–933 (1995)
26. Seidman, T.I., Vogel, C.R.: Well posedness and convergence of some regularisation methods for non-linear ill posed problems. Inverse Prob. 5, 227–238 (1989)
27. Tikhonov, A.N., Arsenin, V.Y.: Solutions of Ill-Posed Problems. Wiley, New York (1977)
28. Werner, F.: On convergence rates for iteratively regularized Newton-type methods under a Lipschitz-type nonlinearity condition. J. Inverse Ill Posed Probl. 23, 75–84 (2015)

Convergence and adaptive discretization of the IRGNM Tikhonov and the IRGNM Ivanov method under a tangential cone condition in Banach space

Publisher: Springer Journals
Copyright: Copyright © 2018 by The Author(s)
Subject: Mathematics; Numerical Analysis; Mathematics, general; Theoretical, Mathematical and Computational Physics; Mathematical Methods in Physics; Numerical and Computational Physics, Simulation; Mathematical and Computational Engineering
ISSN: 0029-599X
eISSN: 0945-3245
DOI: 10.1007/s00211-018-0971-5

Abstract

Numer. Math. (2018) 140:449–478
https://doi.org/10.1007/s00211-018-0971-5

Convergence and adaptive discretization of the IRGNM Tikhonov and the IRGNM Ivanov method under a tangential cone condition in Banach space

Barbara Kaltenbacher · Mario Luiz Previatti de Souza

Received: 24 July 2017 / Revised: 27 February 2018 / Published online: 29 May 2018
© The Author(s) 2018

Abstract In this paper we consider the iteratively regularized Gauss–Newton method (IRGNM) in its classical Tikhonov version as well as two further (Ivanov type and Morozov type) versions. In these two alternative versions, regularization is achieved by imposing bounds on the solution or by minimizing some regularization functional under a constraint on the data misfit, respectively. We do so in a general Banach space setting and under a tangential cone condition, while convergence (without source conditions, thus without rates) has so far only been proven under stronger restrictions on the nonlinearity of the operator and/or on the spaces. Moreover, we provide a convergence result for the discretized problem with an appropriate control on the error and show how to provide the required error bounds by goal oriented weighted dual residual estimators. The results are illustrated for an inverse source problem for a nonlinear elliptic boundary value problem, for the cases of a measure valued and of an $L^\infty$ source. For the latter, we also provide numerical results with the Ivanov type IRGNM.

Mathematics Subject Classification 65F22 · 65N20

Supported by the Austrian Science Fund FWF under the Grant I2271 “Regularization and Discretization of Inverse Problems for PDEs in Banach Spaces”.

Barbara Kaltenbacher, barbara.kaltenbacher@aau.at
Mario Luiz Previatti de Souza, mario.previatti@aau.at
Institute of Mathematics, Alpen-Adria-Universität Klagenfurt, Klagenfurt, Austria

1 Introduction

In this paper we consider a nonlinear ill-posed operator equation
$$ F(x) = y, \tag{1} $$
where the possibly nonlinear operator $F : \mathcal{D}(F) \subseteq X \to Y$ with domain $\mathcal{D}(F)$ maps between real Banach spaces $X$ and $Y$. We are interested in the ill-posed situation, i.e., $F$ fails to be continuously invertible, and the data are contaminated with noise, thus regularization has to be applied (see, e.g., [8,27], and references therein).

Throughout this paper we will assume that an exact solution $x^\dagger \in \mathcal{D}(F)$ of (1) exists, i.e., $F(x^\dagger) = y$, and that the noise level $\delta$ in the (deterministic) estimate
$$ \|y - y^\delta\| \le \delta \tag{2} $$
is known.

Partially we will also refer to the formulation of the inverse problem as a system of model and observation equation
\begin{align}
A(x, u) &= 0, \tag{3}\\
C(u) &= y. \tag{4}
\end{align}
Here $A : X \times V \to W^*$ and $C : V \to Y$ are the model and observation operator, so that with the parameter-to-state map $S : X \to V$ satisfying $A(x, S(x)) = 0$ and $F = C \circ S$, (1) is equivalent to the all-at-once formulation (3), (4).

Newton type methods for the solution of nonlinear ill-posed problems (1) have been extensively studied in Hilbert spaces (see, e.g., [2,20] and the references therein) and more recently also in a Banach space setting. In particular, the iteratively regularized Gauss–Newton method [1] can be generalized to a Banach space setting by calculating iterates $x_{k+1}^\delta$ in a Tikhonov type variational form as
$$ x_{k+1}^\delta \in \operatorname*{argmin}_{x\in\mathcal{C}} \|F'(x_k^\delta)(x - x_k^\delta) + F(x_k^\delta) - y^\delta\|^p + \alpha_k\mathcal{R}(x), \tag{5} $$
see, e.g., [11,16,17,21,28], where $p \in [1,\infty)$, $(\alpha_k)_{k\in\mathbb{N}}$ is a sequence of regularization parameters, and $\mathcal{R}$ is some nonnegative regularization functional. Alternatively, one might introduce regularization by imposing some bound $\rho$ on the norm of $x$, or, again, generally, on a regularization functional of $x$
$$ x_{k+1}^\delta \in \operatorname*{argmin}_{x\in\mathcal{C}} \|F'(x_k^\delta)(x - x_k^\delta) + F(x_k^\delta) - y^\delta\| \quad \text{such that } \mathcal{R}(x) \le \rho, \tag{6} $$
which corresponds to Ivanov regularization or the method of quasi solutions, see, e.g., [7,13–15,22,24,26].
A third way of incorporating regularization in a Newton type iteration is Morozov regularization, also called the method of the residuals, see, e.g., [9,22,23]
$$ x_{k+1}^\delta \in \operatorname*{argmin}_{x\in\mathcal{C}} \mathcal{R}(x) \quad \text{such that } \|F'(x_k^\delta)(x - x_k^\delta) + F(x_k^\delta) - y^\delta\| \le \sigma\|F(x_k^\delta) - y^\delta\|, \tag{7} $$
for some $\sigma \in (0, 1)$, where the choice of the bound in the inequality constraint is very much inspired by the inexact Newton type regularization parameter choice in [10].

We restrict ourselves to the norm in $Y$ as a measure of the data misfit, but the analysis could as well be extended to more general functionals $\mathcal{S}$ satisfying certain conditions, as e.g., in [11,28].

Here $\mathcal{C}$ is a set (possibly chosen with convenient properties for carrying out the minimization) containing $x^\dagger$ and being contained in $\mathcal{D}(F)$, such that $F$ satisfies additional conditions on $\mathcal{C}$, see (8), (10) below. If $F$ is defined on all of $X$, then the minimization problem (5) can be posed in an unconstrained way, $\mathcal{C} = X$.

As a restriction on the nonlinearity of the forward operator $F$ we impose the tangential cone condition
$$ \|F(\tilde x) - F(x) - F'(x)(\tilde x - x)\| \le c_{tc}\|F(\tilde x) - F(x)\| \quad \text{for all } \tilde x, x \in B_R \tag{8} $$
(also called Scherzer condition, cf. [25]) for some constant $c_{tc} < 1/3$. Here, for any $r > 0$,
$$ B_r = \{x \in \mathcal{C} : \mathcal{R}(x) \le r\} \tag{9} $$
is a sublevel set of the regularization functional and $R$ will be specified in the convergence result Theorem 1.

Note that the convergence conditions imposed in [11,16,17,21,28] in the situation without source condition, namely local invariance of the range of $F'(x)^*$, are slightly stronger, since this adjoint range invariance is sufficient for (8). However, most probably the gap is not very large, as in those application examples where (8) has been verified, the proof of (8) is actually often done via adjoint range invariance.
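To make the Tikhonov-type iteration (5) concrete, here is a minimal finite-dimensional sketch. It rests on assumptions that are not part of the paper: $X = Y = \mathbb{R}^2$ with Euclidean norms, $p = 2$, $\mathcal{R}(x) = \|x\|^2$, a toy forward operator `F`, a geometric sequence $\alpha_k = \alpha_0\theta^k$, and a discrepancy-type stopping rule with a factor `tau`. With these choices each step reduces to a linear system; all names are hypothetical.

```python
import numpy as np

def F(x):
    """Toy forward operator (assumption for illustration): componentwise x + 0.1 x^3."""
    return x + 0.1 * x**3

def F_prime(x):
    """Jacobian of the toy operator."""
    return np.diag(1.0 + 0.3 * x**2)

def irgnm_tikhonov(y_delta, delta, x0, alpha0=1.0, theta=0.5, tau=2.0, max_iter=100):
    """Gauss-Newton steps with penalty alpha_k * |x|^2, stopped by a discrepancy rule."""
    x = x0.copy()
    for k in range(max_iter):
        if np.linalg.norm(F(x) - y_delta) <= tau * delta:
            return x, k                       # first index with residual <= tau * delta
        J = F_prime(x)
        alpha = alpha0 * theta**k
        # Normal equations of min |J (z - x) + F(x) - y_delta|^2 + alpha |z|^2 over z
        rhs = J.T @ (J @ x - F(x) + y_delta)
        x = np.linalg.solve(J.T @ J + alpha * np.eye(x.size), rhs)
    return x, max_iter

x_true = np.array([1.0, -0.5])
delta = 1e-3
noise = delta * np.array([0.6, -0.8])         # perturbation of Euclidean norm delta
y_delta = F(x_true) + noise
x_rec, k_stop = irgnm_tikhonov(y_delta, delta, np.zeros(2))
```

In this toy setting the residual decays roughly like $\alpha_k$, so the stopping index grows logarithmically in $1/\delta$ when `delta` is decreased.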
In (5), (6), (7), the bounded linear operator $F'(x)$ is not necessarily a Gâteaux or Fréchet derivative of $F$, but just some local linearization (in the sense of (8)), satisfying additionally the weak closedness condition
$$ \forall x \in \mathcal{C},\ (x_n)_{n\in\mathbb{N}} \subseteq \mathcal{C}:\quad \bigl(x_n \xrightarrow{\mathcal{T}_X} \hat x \ \text{ and } \ F'(x)x_n \xrightarrow{\mathcal{T}_Y} y\bigr) \ \Rightarrow\ \bigl(\hat x \in \mathcal{C} \ \text{ and } \ F'(x)\hat x = y\bigr). \tag{10} $$
In here, $\mathcal{T}_X$ and $\mathcal{T}_Y$ are topologies on $X$ and $Y$ (e.g., just the weak or weak* topologies) such that bounded sets in $Y$ are $\mathcal{T}_Y$-compact and the norm in $Y$ is $\mathcal{T}_Y$-lower semicontinuous.

The remainder of this paper is organized as follows. In Sect. 2 we state and prove convergence results in the continuous and discretized setting. Section 3 shows how to actually obtain the required discretization error estimates by a goal oriented weighted dual residual approach and Sect. 4 illustrates the theoretical findings by an inverse source problem for a nonlinear PDE. In Sect. 5 we provide some numerical results for this model problem and Sect. 6 concludes with some remarks.

2 Convergence

In this section we will study convergence of the IRGNM iterates first of all in a continuous setting, then in the situation of having discretized for computational purposes.

The regularization parameters $\alpha_k$, $\rho_k$, $\sigma_k$ are chosen a priori
$$ \alpha_k = \alpha_0\theta^k \quad \text{for some } \theta \in \Bigl(\Bigl(\frac{2c_{tc}}{1 - c_{tc}}\Bigr)^p, 1\Bigr) \tag{11} $$
(note that $\bigl(\frac{2c_{tc}}{1 - c_{tc}}\bigr)^p < 1$ for $c_{tc} < 1/3$),
$$ \rho_k \equiv \rho \ge \mathcal{R}(x^\dagger), \tag{12} $$
and
$$ \sigma \ge \frac{1 + c_{tc}}{\tau} + c_{tc}, \qquad \sigma < 1 - 2c_{tc}, \tag{13} $$
with $\tau$ as in (14), and the iteration is stopped according to the discrepancy principle
$$ k_* = k_*(\delta, y^\delta) = \min\{k \in \mathbb{N}_0 : \|F(x_k^\delta) - y^\delta\| \le \tau\delta\} \tag{14} $$
with some fixed $\tau > 1$ chosen sufficiently large but independent of $\delta$.

Theorem 1 Let $\mathcal{R} : X \to [0,\infty]$ be proper, convex and $\mathcal{T}_X$ lower semicontinuous with $\mathcal{R}(x^\dagger) < \infty$ and let, for all $r \in [\mathcal{R}(x^\dagger), \infty)$ in case of (5), or for all $r \in [\mathcal{R}(x^\dagger), \rho]$ in case of (6), or for $r = \mathcal{R}(x^\dagger)$ in case of (7), the sublevel set (9) be compact with respect to the topology $\mathcal{T}_X$ on $X$. Moreover, let $F$ satisfy (8), (10). Finally, let the family of data $(y^\delta)_{\delta>0}$ satisfy (2).

(i) Then for fixed $\delta$, $y^\delta$, the iterates according to (5)–(7) are well-defined and satisfy
$$ x_k^\delta \in B_R \quad \text{with } R \begin{cases} \text{defined by (23), (19), (20)} & \text{in case of (5)},\\ = \rho & \text{in case of (6)},\\ = \mathcal{R}(x^\dagger) & \text{in case of (7)}, \end{cases} \tag{15} $$
for all $k \le k_*(\delta, y^\delta)$, which denotes the stopping index according to the discrepancy principle (14) with $\tau$ sufficiently large, and this stopping index $k_*(\delta, y^\delta)$ is finite.

(ii) Moreover, for both methods we have $\mathcal{T}_X$-subsequential convergence as $\delta \to 0$, i.e., $(x_{k_*(\delta,y^\delta)}^\delta)_{\delta>0}$ has a $\mathcal{T}_X$-convergent subsequence and the limit of every $\mathcal{T}_X$-convergent subsequence solves (1). If the solution $x^\dagger$ of (1) is unique in $B_R$, then $x_{k_*(\delta,y^\delta)}^\delta \xrightarrow{\mathcal{T}_X} x^\dagger$ as $\delta \to 0$.

(iii) Additionally, $k_*$ satisfies the asymptotics $k_* = O(\log(1/\delta))$.

Proof Existence of minimizers $x_{k+1}^\delta$ of (5)–(7) for fixed $k$, $x_k^\delta$ and $y^\delta$ follows by the direct method of calculus of variations: In all three cases, the cost functional
\begin{align*}
J_k(x) &= \|F'(x_k^\delta)(x - x_k^\delta) + F(x_k^\delta) - y^\delta\|^p + \alpha_k\mathcal{R}(x) && \text{in case of (5)},\\
J_k(x) &= \|F'(x_k^\delta)(x - x_k^\delta) + F(x_k^\delta) - y^\delta\|^2 && \text{in case of (6)},\\
J_k(x) &= \mathcal{R}(x) && \text{in case of (7)},
\end{align*}
is bounded from below and the admissible set
\begin{align*}
X^{ad} &= \mathcal{C} && \text{in case of (5)},\\
X^{ad} &= B_\rho && \text{in case of (6)},\\
X^{ad} &= \{x \in \mathcal{C} : \|F'(x_k^\delta)(x - x_k^\delta) + F(x_k^\delta) - y^\delta\| \le \sigma\|F(x_k^\delta) - y^\delta\|\} && \text{in case of (7)}
\end{align*}
is nonempty (for (6) this follows from $\rho \ge \mathcal{R}(x^\dagger)$ and for (7) from (8), (14) and (13), see (16) below). Hence, there exists a minimizing sequence $(x^l)_{l\in\mathbb{N}} \subseteq X^{ad} \cap B_r$ for
$$ r = J_k(x^\dagger) \ \text{in case of (5)}, \quad r = \rho \ \text{in case of (6)}, \quad r = \mathcal{R}(x^\dagger) \ \text{in case of (7)}, $$
with bounded linearized residuals $\|F'(x_k^\delta)(x^l - x_k^\delta) + F(x_k^\delta) - y^\delta\| \le s$ for
$$ s = J_k(x^\dagger)^{1/p} \ \text{in case of (5), (6)}, \quad s = \sigma\|F(x_k^\delta) - y^\delta\| \ \text{in case of (7)}, $$
and $\lim_{l\to\infty} J_k(x^l) = \inf_{x\in X^{ad}} J_k(x)$. By $\mathcal{T}_X$-compactness of $B_r$, the sequence $(x^l)_{l\in\mathbb{N}}$ has a $\mathcal{T}_X$-convergent subsequence $(x^{l_m})_{m\in\mathbb{N}}$ with limit $\bar x \in B_r$.
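Moreover, $\mathcal{T}_Y$-compactness of norm bounded sets in $Y$ together with $\mathcal{T}_X$-$\mathcal{T}_Y$-closedness of $F'(x_k^\delta)$ and lower $\mathcal{T}_Y$ semicontinuity of the norm in $Y$ implies that in all three cases $J_k(\bar{x}) \le \liminf_{m\to\infty} J_k(x^{l_m}) = \inf_{x\in X^{ad}} J_k(x)$ and $\bar{x} \in X^{ad}$, hence $\bar{x}$ is a minimizer.

Note that (ii) follows from (i) by standard arguments and our assumption on $\mathcal{T}_X$-compactness of $B_R$. Thus it remains to prove (i) and (iii) for the three versions (5), (6), (7) of the IRGNM. For this purpose we are going to show that for every $\delta > 0$, there exists $k_* = k_*(\delta, y^\delta)$ such that $k_* \sim \log(1/\delta)$, and the stopping criterion according to the discrepancy principle $\|F(x^\delta_{k_*(\delta,y^\delta)}) - y^\delta\| \le \tau\delta$ is satisfied. For (5), we also need to show that $\mathcal{R}(x_k^\delta) \le R$ for $k \le k_*(\delta, y^\delta)$, whereas in (6) this automatically holds by (12). The same holds true for (7): If $x_k^\delta \in B_R$, then by (8), (14) and (13) we have

$$\|F'(x_k^\delta)(x^\dagger - x_k^\delta) + F(x_k^\delta) - y^\delta\| \le c_{tc}\|F(x_k^\delta) - y^\delta\| + (1+c_{tc})\delta \le \Big(c_{tc} + \frac{1+c_{tc}}{\tau}\Big)\|F(x_k^\delta) - y^\delta\|, \tag{16}$$

so $x^\dagger$ is admissible, hence $\mathcal{R}(x_{k+1}^\delta) \le \mathcal{R}(x^\dagger)$, i.e., $x_{k+1}^\delta \in B_R$.

We start with the Tikhonov version (5) and carry out an induction proof of the following statement: For all $k \in \{0,\ldots,k_*(\delta,y^\delta)\}$

$$\mathcal{R}(x_k^\delta) \le R \quad\text{and}\quad \forall j \in \{0,\ldots,k-1\}: \ d_{j+1} + \alpha_j R_{j+1} \le q d_j + \alpha_j R^\dagger + C\delta^p, \tag{17}$$

where

$$d_k := 2^{1-p}(1-c_{tc})^p \|F(x_k^\delta) - y^\delta\|^p, \tag{18}$$
$$q := 2^{p-1}\big((1+\gamma)^{p-1} + 1\big)\Big(\frac{c_{tc}}{1-c_{tc}}\Big)^p \in (0,1), \tag{19}$$
$$R_k := \mathcal{R}(x_k^\delta), \qquad R^\dagger := \mathcal{R}(x^\dagger), \qquad C := \Big(\frac{1+\gamma}{\gamma}\Big)^{p-1}(1+c_{tc})^p, \tag{20}$$

for some fixed small $\gamma \in (0,1)$. We will require $q/\theta < 1$, which by definition of $q$ (19) is achievable for $\gamma > 0$ sufficiently small, due to $\theta > \big(\tfrac{2c_{tc}}{1-c_{tc}}\big)^p$, cf. (11). By Lemma 2 (see the "Appendix") the right hand side estimate in (17) implies

$$d_k + \alpha_{k-1} R_k < q^k d_0 + \frac{1}{1-\frac{q}{\theta}}\,\alpha_{k-1} R^\dagger + \frac{1}{1-q}\, C\delta^p. \tag{21}$$

Using the minimality of $x_{k+1}^\delta$ and (2), (8) together with $x^\dagger, x_k^\delta \in B_R$, we have

$$\begin{aligned} &\|F'(x_k^\delta)(x_{k+1}^\delta - x_k^\delta) + F(x_k^\delta) - y^\delta\|^p + \alpha_k \mathcal{R}(x_{k+1}^\delta)\\ &\quad\le \|F'(x_k^\delta)(x^\dagger - x_k^\delta) + F(x_k^\delta) - y^\delta\|^p + \alpha_k \mathcal{R}(x^\dagger)\\ &\quad\le \big(c_{tc}\|F(x_k^\delta) - y^\delta\| + (1+c_{tc})\delta\big)^p + \alpha_k \mathcal{R}(x^\dagger). \end{aligned} \tag{22}$$

From (21) and (14) we infer

$$\Big(2^{1-p}(1-c_{tc})^p - \frac{C}{(1-q)\tau^p}\Big)\|F(x_k^\delta) - y^\delta\|^p \le q^k d_0 + \theta^k \alpha_0 \frac{R^\dagger}{\theta - q}.$$

Using this and again (14) in (22) yields

$$\begin{aligned} \mathcal{R}(x_{k+1}^\delta) &\le \Big(c_{tc} + \frac{1+c_{tc}}{\tau}\Big)^p \Big(2^{1-p}(1-c_{tc})^p - \frac{C}{(1-q)\tau^p}\Big)^{-1} \Big(\frac{q^k d_0}{\theta^k \alpha_0} + \frac{R^\dagger}{\theta - q}\Big) + R^\dagger\\ &\le \Big(\frac13 + \frac{4}{3\tau}\Big)^p \Big(\frac{2}{3^p} - \frac{C}{(1-q)\tau^p}\Big)^{-1} \Big(2^{1-p}\frac{\|F(x_0) - y^\delta\|^p}{\alpha_0} + \frac{R^\dagger}{\theta - q}\Big) + R^\dagger =: R. \end{aligned} \tag{23}$$

On the other hand, since we have established $x_{k+1}^\delta \in B_R$, we can apply (8) to the left hand side of (22) to obtain

$$\begin{aligned} &\|F'(x_k^\delta)(x_{k+1}^\delta - x_k^\delta) + F(x_k^\delta) - y^\delta\|^p + \alpha_k \mathcal{R}(x_{k+1}^\delta)\\ &\quad\ge \big((1-c_{tc})\|F(x_{k+1}^\delta) - y^\delta\| - c_{tc}\|F(x_k^\delta) - y^\delta\|\big)^p + \alpha_k \mathcal{R}(x_{k+1}^\delta), \end{aligned}$$

see also [11, Lemma 5.2] and [21, proof of Theorem 3]. To handle the power $p$ we make use of the following inequalities that can be proven by solving extremal value problems, see the "Appendix"

$$(a+b)^p \le (1+\gamma)^{p-1}a^p + \Big(\frac{1+\gamma}{\gamma}\Big)^{p-1} b^p \quad\text{and}\quad (a-b)^p \ge (1-\varepsilon)^{p-1}a^p - \Big(\frac{1-\varepsilon}{\varepsilon}\Big)^{p-1} b^p, \tag{24}$$

for all $a, b > 0$, $p \ge 1$ and $\gamma, \varepsilon \in (0,1)$, where for the right hand inequality to hold, additionally $a \ge b$ is needed.

Hence, in case $(1-c_{tc})\|F(x_{k+1}^\delta) - y^\delta\| \ge c_{tc}\|F(x_k^\delta) - y^\delta\|$ the following general estimate holds

$$\begin{aligned} &(1-\varepsilon)^{p-1}(1-c_{tc})^p \|F(x_{k+1}^\delta) - y^\delta\|^p + \alpha_k \mathcal{R}(x_{k+1}^\delta)\\ &\quad\le \Big((1+\gamma)^{p-1} + \Big(\frac{1-\varepsilon}{\varepsilon}\Big)^{p-1}\Big) c_{tc}^p \|F(x_k^\delta) - y^\delta\|^p + \alpha_k \mathcal{R}(x^\dagger) + \Big(\frac{1+\gamma}{\gamma}\Big)^{p-1}(1+c_{tc})^p \delta^p, \end{aligned} \tag{25}$$

for $\gamma, \varepsilon \in (0,1)$.

So in order for this recursion to yield geometric decay of $\|F(x_k^\delta) - y^\delta\|$, we need to ensure

$$(1-\varepsilon)^{p-1}(1-c_{tc})^p > \Big((1+\gamma)^{p-1} + \Big(\frac{1-\varepsilon}{\varepsilon}\Big)^{p-1}\Big) c_{tc}^p \tag{26}$$

for a proper choice of $\varepsilon, \gamma \in (0,1)$.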
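To obtain the largest possible (and therefore least restrictive) bound on $c_{tc}$, we rewrite the requirement above as

$$\frac{c_{tc}^p}{(1-c_{tc})^p} < \sup_{\varepsilon,\gamma\in(0,1)} (1-\varepsilon)^{p-1}\Big((1+\gamma)^{p-1} + \Big(\frac{1-\varepsilon}{\varepsilon}\Big)^{p-1}\Big)^{-1} = \sup_{\varepsilon\in(0,1)} \underbrace{(1-\varepsilon)^{p-1}\Big(1 + \Big(\frac{1-\varepsilon}{\varepsilon}\Big)^{p-1}\Big)^{-1}}_{=\varphi(\varepsilon)} = \varphi\big(\tfrac12\big) = 2^{-p},$$

as can be found out by evaluating the derivative of $\varphi$

$$\varphi'(\varepsilon) = -(p-1)(1-\varepsilon)^{p-2}\Big(1 + \Big(\frac{1-\varepsilon}{\varepsilon}\Big)^{p-1}\Big)^{-2}\Big(1 - \Big(\frac{1-\varepsilon}{\varepsilon}\Big)^{p}\Big).$$

Thus we will further on set $\varepsilon = \tfrac12$ and assume that $\gamma > 0$ is sufficiently small so that (26) holds with $\varepsilon = \tfrac12$, i.e., (19). Then, using (11), estimate (25) can be written as

$$d_{k+1} + \alpha_k R_{k+1} \le q d_k + \alpha_0\theta^k R^\dagger + C\delta^p, \tag{27}$$

which we first of all regard as a recursive estimate for $d_k$.

To derive a similar estimate also in the complementary case $(1-c_{tc})\|F(x_{k+1}^\delta) - y^\delta\| < c_{tc}\|F(x_k^\delta) - y^\delta\|$, we use the fact that, for $d_k$ as in (18), this inequality just means

$$d_{k+1} < \Big(\frac{c_{tc}}{1-c_{tc}}\Big)^p d_k$$

and, using (22) and the left hand part of (24),

$$\alpha_k R_{k+1} \le (1+\gamma)^{p-1} c_{tc}^p \|F(x_k^\delta) - y^\delta\|^p + \alpha_k R^\dagger + \Big(\frac{1+\gamma}{\gamma}\Big)^{p-1}(1+c_{tc})^p\delta^p,$$

hence after addition we again get (27) (even with a slightly smaller value of $q := \big(1 + 2^{p-1}(1+\gamma)^{p-1}\big)\big(\tfrac{c_{tc}}{1-c_{tc}}\big)^p$).

Thus in both cases, using Lemma 2 we can conclude that

$$d_{k+1} + \alpha_k R_{k+1} < q^{k+1} d_0 + \frac{1}{1-\frac{q}{\theta}}\,\alpha_k R^\dagger + \frac{1}{1-q}\, C\delta^p. \tag{28}$$

This finishes the induction proof of (17) for all $k \in \{0,\ldots,k_*(\delta,y^\delta)\}$.

We next show that the discrepancy stopping criterion from (14), i.e., $d_k \le \tilde\tau\delta^p$ for $\tilde\tau = 2^{1-p}(1-c_{tc})^p\tau^p$, will be satisfied after finitely many, namely $O(\log(1/\delta))$, steps. For this purpose, note that $\tilde\tau > \frac{C}{1-q}$, provided $\tau$ is chosen sufficiently large, which we assume to be done. Thus, indeed, using (11), (21), we have

$$d_k \le d_k + \alpha_{k-1} R_k < \theta^k\Big(d_0 + \frac{\alpha_0}{\theta - q} R^\dagger\Big) + \frac{C}{1-q}\delta^p, \tag{29}$$

where the right hand side falls below $\tilde\tau\delta^p$ as soon as

$$k \ge (\log 1/\theta)^{-1}\Big(p\log(1/\delta) + \log\Big(d_0 + \frac{\alpha_0}{\theta - q}R^\dagger\Big) - \log\Big(\tilde\tau - \frac{C}{1-q}\Big)\Big) =: \bar{k}(\delta).$$

Thus we get the upper estimate $k_*(\delta, y^\delta) \le \bar{k}(\delta) = O(\log(1/\delta))$.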
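The logarithmic bound just derived can be checked numerically. The following Python sketch is only an illustration under stated assumptions: it iterates the worst case of the recursion (27) (equality, with the nonnegative term $\alpha_k R_{k+1}$ dropped) and compares the first index with $d_k \le \tilde\tau\delta^p$ against $\bar{k}(\delta)$. The function names and all parameter values in the usage lines are hypothetical and not taken from the paper.

```python
import math

def tikhonov_constants(p, c_tc, gamma, tau):
    """Constants q (19), C (20) and tau_tilde for the Tikhonov version."""
    q = 2 ** (p - 1) * ((1 + gamma) ** (p - 1) + 1) * (c_tc / (1 - c_tc)) ** p
    C = ((1 + gamma) / gamma) ** (p - 1) * (1 + c_tc) ** p
    tau_tilde = 2 ** (1 - p) * (1 - c_tc) ** p * tau ** p
    return q, C, tau_tilde

def k_bar(delta, p, theta, alpha0, d0, R_dag, q, C, tau_tilde):
    """Upper bound on the stopping index, as defined after (29)."""
    return (p * math.log(1 / delta)
            + math.log(d0 + alpha0 * R_dag / (theta - q))
            - math.log(tau_tilde - C / (1 - q))) / math.log(1 / theta)

def worst_case_stopping_index(delta, p, theta, alpha0, d0, R_dag,
                              q, C, tau_tilde, k_max=10_000):
    """First k with d_k <= tau_tilde * delta**p when (27) holds with equality."""
    d, k = d0, 0
    while d > tau_tilde * delta ** p and k < k_max:
        d = q * d + alpha0 * theta ** k * R_dag + C * delta ** p
        k += 1
    return k

# Hypothetical sample values (assumptions): p = 2, c_tc = 0.1 < 1/3, gamma = 0.1,
# theta = 0.5 > (2 c_tc / (1 - c_tc))**p, and tau = 8 large enough for tau_tilde > C / (1 - q).
p, c_tc, gamma, theta, tau = 2, 0.1, 0.1, 0.5, 8.0
q, C, tau_tilde = tikhonov_constants(p, c_tc, gamma, tau)
```

For each noise level one can then compare `worst_case_stopping_index(...)` with `math.ceil(k_bar(...))`; halving $\delta$ adds a roughly constant number of steps, in line with $O(\log(1/\delta))$.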
For the Ivanov version (6), it only remains to show finiteness of the stopping index, as boundedness of the $\mathcal{R}$ values by $R = \rho$ holds by definition. Applying the minimality argument with $x^\dagger$ being admissible [cf. (12)] to (6) leads to the special case $p = 1$, $\alpha_k = 0$ in (25)

$$(1-c_{tc})\|F(x_{k+1}^\delta) - y^\delta\| \le 2c_{tc}\|F(x_k^\delta) - y^\delta\| + (1+c_{tc})\delta.$$

Our notation becomes

$$d_k := (1-c_{tc})\|F(x_k^\delta) - y^\delta\|, \qquad q := \frac{2c_{tc}}{1-c_{tc}} \in (0,1), \qquad C := (1+c_{tc}),$$

which gives

$$d_{k+1} \le q d_k + C\delta,$$

and by induction, one can conclude

$$d_k < q^k d_0 + \frac{C}{1-q}\delta,$$

where the right hand side is smaller than $\tilde\tau\delta$ (with $\tilde\tau = (1-c_{tc})\tau$) for all

$$k \ge (\log 1/q)^{-1}\Big(p\log(1/\delta) + \log d_0 - \log\Big(\tilde\tau - \frac{C}{1-q}\Big)\Big) =: \bar{k}(\delta),$$

so that we can again conclude $k_*(\delta, y^\delta) \le \bar{k}(\delta) = O(\log(1/\delta))$.

Finally we consider (7), where boundedness of the $\mathcal{R}$ values by $R = \mathcal{R}(x^\dagger)$ holds by minimality and the fact that $x^\dagger$ is admissible, cf. (16). Geometric decay of the residuals follows by the estimate

$$\sigma\|F(x_k^\delta) - y^\delta\| \ge \|F'(x_k^\delta)(x_{k+1}^\delta - x_k^\delta) + F(x_k^\delta) - y^\delta\| \ge (1-c_{tc})\|F(x_{k+1}^\delta) - y^\delta\| - c_{tc}\|F(x_k^\delta) - y^\delta\| \tag{30}$$

and (13), i.e.,

$$\|F(x_{k+1}^\delta) - y^\delta\| \le q\|F(x_k^\delta) - y^\delta\| \quad\text{with}\quad q = \frac{\sigma + c_{tc}}{1-c_{tc}},$$

so that similarly to above we end up with a logarithmic estimate for $k_*$.

Remark 1 Convergence of $\mathcal{R}(x^\delta_{k_*(\delta,y^\delta)})$ to $\mathcal{R}(x^\dagger)$ as $\delta \to 0$ holds along the $\mathcal{T}_X$ convergent subsequence according to Theorem 1 (ii), first of all for the Morozov and the Ivanov version of the IRGNM, with the choice $\rho = \mathcal{R}(x^\dagger)$ for the latter, since in both cases $\mathcal{R}(x^\delta_{k_*(\delta,y^\delta)}) \le \mathcal{R}(x^\dagger)$ holds for all $\delta$ and $\mathcal{R}$ is $\mathcal{T}_X$ lower semicontinuous. The same holds true also for the Tikhonov version with the alternative choice of $\alpha_k$ such that

$$\underline{\sigma} \le \frac{\|F'(x_k^\delta)(x_{k+1}^\delta(\alpha_k) - x_k^\delta) + F(x_k^\delta) - y^\delta\|}{\|F(x_k^\delta) - y^\delta\|} \le \overline{\sigma}$$

for some constants $\underline{\sigma}$, $\overline{\sigma}$ satisfying $c_{tc} + \frac{1+c_{tc}}{\tau} < \underline{\sigma} < \overline{\sigma} < 1$ in place of (11), as can be seen directly from (22).
If $\mathcal{R}$ is defined by the norm on a space with the Kadets-Klee property, and $\mathcal{T}_X$ is the weak topology of this space, then this implies norm convergence of $x^\delta_{k_*(\delta,y^\delta)}$ to $x^\dagger$ along the same subsequence.

Remark 2 The fact that $x_k^\delta$ stays in $B_R$ [cf. (15)] is crucial for the applicability of the tangential cone condition (8) in these iterates. If the functional $\mathcal{R}$ quantifies some distance to an a priori guess $x_0$ (e.g., $\mathcal{R}(x) = \|x - x_0\|^q$ for some norm $\|\cdot\|$ and some $q > 0$), then $x \in B_R$ with small $R$ means closeness of $x$ to $x_0$ in a certain sense. Thus, the smaller $R$ is, the better (8) might get achievable with some $c_{tc} < \frac13$. On the other hand, making $R$ according to (15) small means closeness of $x_0$ to $x^\dagger$. Thus we deal with local convergence, as typical for Newton type methods.

Now we consider the appearance of discretization errors in the numerical solution of (5), (6) arising from restriction of the minimization to finite dimensional subspaces $X_h^k$ and leading to discretized iterates $x_{k,h}^\delta$ and an approximate version $F_h^k$ of the forward operator, i.e., we consider the discretized version of Tikhonov-IRGNM (5)

$$x_{k+1,h}^\delta \in \operatorname{argmin}_{x \in \mathcal{C}\cap X_h^k} \|{F_h^k}'(x_{k,h}^\delta)(x - x_{k,h}^\delta) + F_h^k(x_{k,h}^\delta) - y^\delta\|^p + \alpha_k \mathcal{R}(x), \tag{31}$$

of Ivanov-IRGNM (6)

$$x_{k+1,h}^\delta \in \operatorname{argmin}_{x \in \mathcal{C}\cap X_h^k} \|{F_h^k}'(x_{k,h}^\delta)(x - x_{k,h}^\delta) + F_h^k(x_{k,h}^\delta) - y^\delta\| \quad\text{such that } \mathcal{R}(x) \le \rho, \tag{32}$$

and of Morozov-IRGNM (7)

$$x_{k+1,h}^\delta \in \operatorname{argmin}_{x \in \mathcal{C}\cap X_h^k} \mathcal{R}(x) \quad\text{such that } \|{F_h^k}'(x_{k,h}^\delta)(x - x_{k,h}^\delta) + F_h^k(x_{k,h}^\delta) - y^\delta\| \le \sigma\|F_h^k(x_{k,h}^\delta) - y^\delta\|, \tag{33}$$

respectively. Moreover, also in the discrepancy principle, the residual is replaced by its actually computable discretized version

$$k_* = k_*(\delta, y^\delta) = \min\{k \in \mathbb{N}_0 : \|F_h^k(x_{k,h}^\delta) - y^\delta\| \le \tau\delta\}.$$
(34) ∗ ∗ 0 h k,h 123 Convergence and adaptive discretization of the IRGNM… 459 We define the auxiliary continuous iterates δ δ δ δ p x ∈ argmin F (x )(x − x ) + F (x ) − y  + α R(x ), (35) k+1 x ∈C k,h k,h k,h δ δ δ δ x ∈ argmin F (x )(x − x ) + F (x ) − y  such that R(x ) ≤ ρ, k+1 x ∈C k,h k,h k,h (36) and δ δ δ δ x ∈ argmin R(x ) such that F (x )(x − x ) + F (x ) − y k+1 x ∈C k,h k,h k,h δ δ ≤ σ F (x ) − y , (37) k,h respectively in order to be able to use minimality, i.e., compare with the continuous exact solution x . For an illustration we refer to [18, Figure 1]. First of all, we assess how large the discretization errors can be allowed to still enable convergence. Later on, in Sect. 3, we will describe how to really obtain such estimates a posteriori and to achieve the prescribed accuracy by adaptive discretization. Corollary 1 Let the assumptions of Theorem 1 be satisfied and assume that the dis- cretization error estimates δ δ δ δ F (x ) − y −F (x ) − y ≤ η (38) k+1 k+1,h k+1 k δ δ δ δ F (x ) − y −F (x ) − y  ≤ ξ (39) h k,h k,h δ δ R(x ) − R(x ) ≤ ζ (40) k,h k (note that no absolute value is needed in (38), (40); moreover, (40) is only be needed for (5) and (7)) hold with k δ δ k δ δ η ≤ c F (x ) − y  ξ ≤ c F (x ) − y ,ζ ≤ ζ. (41) k η k ξ k h k,h h k,h for all k ≤ k (δ, y ) and constants c , c > 0 sufficiently small, ζ> 0. ∗ η ξ δ δ Then the assertions of Theorem 1 remain valid for x in place of x δ δ k (δ,y ),h k (δ,y ) ∗ ∗ with (34) in place of (14) and (42) in place of (23). Proof For the Tikhonov version (31), in order to inductively estimate R(x ), k+1,h given x ∈ B , note that from (43) with k + 1 replaced by k, we get like in (23) that p −1 1 4 2 C R(x ) ≤ + − k+1,h p p 3 3τ 3 (1 − q)τ δ δ p † F (x ) − y R + ζ 1−p 0 † × 2 + + R =: R (42) α θ − q 123 460 B. Kaltenbacher, M. L. P. 
de Souza where 1−p p δ δ p d := 2 (1 − c ) F (x ) − y  , k,h tc k,h tc p−1 p−1 p−1 q ˜ := 2 (1 + γ) + (1 +˜ γ) , 1 − c tc q =˜ q + D ∈ (0, 1), 1 − c δ † † R := R(x ), R := R(x ), k,h k,h p−1 p−1 1 + γ 1 +˜ γ p p C := (1 + c ) , D := (1 − c ) , tc tc γ γ˜ for γ, γ, ˜ c ∈ (0, 1), which are chosen small enough so that q <θ. As before, from δ † the minimality of x and (2), (8)aswellas x ∈ D(F ),wehave k+1 δ δ δ δ δ (1 − c )F (x ) − y − c F (x ) − y  + α R(x ) tc tc k k+1 k,h k+1 δ δ † ≤ c F (x ) − y + (1 + c )δ + α R(x ), tc tc k k,h then using (38), (40), δ δ δ δ δ (1 − c )(F (x ) − y − η ) − c F (x ) − y  + α R(x ) tc k+1 tc k k+1,h k,h k+1,h δ δ † ≤ c F (x ) − y + (1 + c )δ + α R(x ) + α ζ . tc tc k k k+1 k,h Hence, with the same technique as in the proof of Theorem 1,using (24) with = ,wehave k † p d + α R ≤˜qd + α θ (R + ζ ) + C δ + Dη k+1,h k k+1,h k,h 0 k+1 k+1 k † p ≤ qd + α θ (R + ζ ) + C δ , k,h 0 k+1 using (41). From this, by induction we conclude 1 1 k+1 † p d + α R ≤ q d + α (R + ζ) + C δ (43) k+1,h k k+1,h 0 k 1 − 1 − q Hence, by (39), (41), we have the following estimate k δ δ F (x ) − y h k,h 1/p p−1 2 α C 1 k † p ≤ δ + θ d + (R + ζ) + δ , (1 − c ) θ − q 1 − q 1 − c tc ξ 123 Convergence and adaptive discretization of the IRGNM… 461 where the right hand side falls below τδ as soon as α C −1 † k ≥ (log 1/θ ) p log(1/δ) + log d + (R + ζ) − log τ˜ − θ − q 1 − q =: k(δ), 1− p p p C for τ˜ = 2 (1 − c ) (τ (1 − c )) . Note that τ> ˜ , provided τ is chosen tc ξ 1−q sufficiently large, which we assume to be done. That is, we have shown that the discrepancy stopping criterion from (34) will be satisfied after finitely many, namely O(log(1/δ)), steps. On the other hand, the continuous discrepancy at the iterate defined by the dis- cretized discrepancy principle (34)by(39), (41) satisfies δ δ F (x ) − y ≤ τ(1 + c )δ . 
k,h To estimate R(x ), note that according to our notation, from (43), we get, k (δ,y ),h likein(23), that for all k ∈{1,..., k (δ, y )} −1 d R + ζ C C R ≤ θ + 1 + τ˜ − =: R. α θ − q 1 − q 1 − q Now we show finiteness of the stopping index for the discretized Ivanov-IRGNM (32). By minimality of x and (38), for this problem we have k+1 δ δ δ δ (1 − c )F (x ) − y ≤ 2c F (x ) − y + (1 + c )δ + (1 − c )η . tc tc tc tc k+1 k+1,h k,h which with δ δ d := (1 − c )F (x ) − y , k,h tc k,h 2c c tc η q ˜ := , q =˜ q + D ∈ (0, 1), 1 − c 1 − c tc ξ C := (1 + c ), D := (1 − c ), tc tc by induction, (39) and (41)gives 1 1 C k δ δ k F (x ) − y ≤ d + ξ ≤ q d + δ +ˆ τδ, k,h k 0 h k,h 1 − c 1 − c 1 − q tc tc where the right hand side is smaller than τδ for all −1 k ≥ (log 1/q) log(1/δ) + log d − log τ˜ − =: k(δ), 1 − q 123 462 B. Kaltenbacher, M. L. P. de Souza with τ˜ = (1 − c )τ (1 − c ), so that we can again conclude k (δ, y ) ≤ k(δ) = tc ξ ∗ O(log(1/δ)). It remains to show finiteness of the stopping index for the discretized Morozov- δ δ δ IRGNM (33). By minimality of x we have (30) with x replaced by x , thus the k+1 k k,h inequalities (38) and (41) yield σ + c tc δ δ δ δ F (x ) − y ≤   F (x ) − y k+1,h k,h 1 − (1 − c ) tc 1−c then, by (39) and induction k δ δ k δ δ F (x ) − y ≤ q F (x ) − y  (44) h k,h 0,h where σ + c tc q :=   ∈ (0, 1), 1 − (1 − c ) tc 1−c and the right hand side of (44) falls below τδ for all −1 k ≥ (log 1/q) (log(1/δ) + log d − log τ˜) =: k(δ), where τ˜ = τ(1 − c ), and we can again conclude k (δ, y ) ≤ k(δ) = O(log(1/δ)). ξ ∗ Boundedness of the R values for (33)by R(x ) + ζ follows like in the proof of Theorem 1 together with (40), (41). 
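To illustrate how the a priori choice (11) and the discrepancy principle (14) interact in practice, here is a minimal Python sketch of the (undiscretized) Tikhonov version (5) for a scalar toy problem. It is not the authors' implementation. The assumptions are: $p = 2$, $\mathcal{R}(x) = (x - x_0)^2$, $\mathcal{C} = \mathbb{R}$, and a hypothetical forward map $F(x) = x + 0.1x^3$; the values of $\alpha_0$, $\theta$, $\tau$ are sample choices and are not checked against the conditions of Theorem 1.

```python
def irgnm_tikhonov_scalar(F, dF, y_delta, delta, x0,
                          alpha0=1.0, theta=0.5, tau=2.0, max_iter=100):
    """Scalar IRGNM-Tikhonov sketch with alpha_k = alpha0 * theta**k, stopped by
    the discrepancy principle |F(x_k) - y_delta| <= tau * delta."""
    x, k = x0, 0
    while abs(F(x) - y_delta) > tau * delta and k < max_iter:
        J = dF(x)
        alpha_k = alpha0 * theta ** k
        # Closed-form minimiser over z of
        #   (J * (z - x) + F(x) - y_delta)**2 + alpha_k * (z - x0)**2
        s = J * (y_delta - F(x) + J * (x - x0)) / (J * J + alpha_k)
        x = x0 + s
        k += 1
    return x, k

# Hypothetical toy data: exact solution x = 1, perturbed by exactly delta.
F_toy = lambda x: x + 0.1 * x ** 3
dF_toy = lambda x: 1.0 + 0.3 * x ** 2
delta_toy = 1e-3
y_delta_toy = F_toy(1.0) + delta_toy
x_stop, k_stop = irgnm_tikhonov_scalar(F_toy, dF_toy, y_delta_toy, delta_toy, x0=0.0)
```

In this toy setting the residual decays roughly like $\alpha_k$, so the stopping index grows by a fixed number of steps whenever $\delta$ is reduced by a fixed factor.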
3 Error estimators for adaptive discretization The error estimators η , ξ and ζ can be quantified, e.g., by means of a goal oriented k k k dual weighted residual (DWR) approach [3], applied to the minimization problems δ δ δ δ δ p (x ,v , u , u ) ∈ argmin 3 C (u ˜)v + C (u ˜) − y  + α R(x ) k+1,h k,h k+1 k,h (x ,v,u,u ˜)∈C×V δ δ s.t. ∀w ∈ W :A (x , u ˜)(x − x ) + A (x , u ˜)v, w = 0, W ,W x k,h k,h u k,h ∗ ∗ A(x , u ˜), w = 0, A(x , u), w = 0, W ,W W ,W k,h (45) (note that the last constraint is added in order to enable computation of I below) δ δ δ δ δ 2 (x ,v , u , u ) ∈ argmin 3 C (u ˜)v + C (u ˜) − y k+1,h k,h k+1 k,h (x ,v,u,u ˜)∈C×V s.t. R(x ) ≤ ρ , δ δ and ∀w ∈ W :A (x , u ˜)(x − x ) + A (x , u ˜)v, w ∗ = 0, W ,W x k,h k,h u k,h A(x , u ˜), w ∗ = 0, A(x , u), w ∗ = 0, W ,W W ,W k,h (46) 123 Convergence and adaptive discretization of the IRGNM… 463 and δ δ δ δ (x ,v , u , u ) ∈ argmin 3 R(x ) (x ,v,u,u ˜)∈C×V k+1,h k,h k+1 k,h δ δ s.t. C (u ˜)v + C (u ˜) − y ≤ σ C (u ˜) − y , δ δ and ∀w ∈ W :A (x , u ˜)(x − x ) + A (x , u ˜)v, w = 0, W ,W x u k,h k,h k,h ∗ ∗ A(x , u ˜), w = 0, A(x , u), w = 0, W ,W W ,W k,h (47) which are equivalent to (5), (6), and (7), respectively, with k δ k δ I (x,v, u, u ˜) =C (u ˜) − y  , I (x,v, u, u ˜) =C (u) − y  , 1 2 I (x,v, u, u ˜) = R(x ) as quantities of interest [where I is only needed for (5) and (7)]. We assume that C, R and the norms can be evaluated without discretization error, so the discretized versions of I only arise due to discreteness of the arguments. 
Indeed, it is easy to see that the left hand sides of (38) and (39) can be bounded (at least approximately) by k k combinations of I and I , using the triangle inequality: 1 2 δ δ δ δ F (x ) − y −F (x ) − y k+1,h k+1 k+1 k+1 δ δ δ δ δ δ δ δ = I (x ,v , u , u ˜ ) − I (x ,v , u , u ˜ ) k+2 k+1 k+2 k+1 k+2,h k+1,h k+2,h k+1,h 1 1 k δ δ δ δ k δ δ δ δ k+1 − (I (x ,v , u , u ˜ ) − I (x ,v , u , u ˜ )) + R ; (48) 2 k+1 k k+1 k 2 k+1,h k,h k+1,h k,h k δ δ δ δ F (x ) − y −F (x ) − y h k,h k,h k δ δ δ δ k δ δ δ δ = I (x ,v , u , u ˜ ) − I (x ,v , u , u ˜ ), (49) 1 k+1,h k,h k+1,h k,h 1 k+1 k k+1 k k+1 k+1 δ δ k δ δ where we will neglect R =F (x ) − y −F (x ) − y . h k+1,h h k+1,h k+1 It is important to note that I is not equal to I ,see [18]. 1,h 2,h The computation of the a posteriori error estimators η ,ξ ,ζ is done as in [18]. k k k These error estimators can be used within the following adaptive algorithm for error control and mesh refinement: We start on a coarse mesh, solve the discretized opti- mization problem and evaluate the error estimator. Thereafter, we refine the current mesh using local information obtained from the error estimator, reducing the error with respect to the quantity of interest. This procedure is iterated until the value of the error estimator is below the given tolerance (41), cf. [3]. In this case, all the variables x,v, u, u ˜ are subject to a new discretization. For better readability we will partially omit the iteration index k and the discretization index h. The previous iterate x is fixed and not subject to a new discretization. Consider now the cost functional for (45) δ p J (x,v, u ˜) =C (u ˜)v + C (u ˜) − y  + α R(x ) 123 464 B. Kaltenbacher, M. L. P. de Souza and define the Langrangian functional δ δ L(x,v, u, u ˜,λ, μ, ˜ μ) :=J (x,v, u ˜) +A (x , u ˜)(x − x ) + A (x , u ˜)v, λ ∗ W ,W x k k u k ∗ ∗ +A(x , u ˜), μ ˜  +A(x , u), μ , (50) W ,W W ,W neglecting for simplicity (cf. Remark 2) the constraints defined by C. 
The first-order necessary optimality conditions for (45) are given by stationarity for the Lagrangian $L$. Setting $z = (x, v, u, \tilde{u}, \lambda, \tilde\mu, \mu)$, they read

$$L'(z)(dz) = 0, \quad \forall dz \in Z = X \times V \times V \times V \times W \times W \times W$$

and for the discretized problem,

$$L'(z_h)(dz_h) = 0, \quad \forall dz_h \in Z_h = X_h \times V_h \times V_h \times V_h \times W_h \times W_h \times W_h.$$

To derive a posteriori error estimators for the error with respect to the quantities of interest $(I_1, I_2, I_3)$, we introduce auxiliary functionals $M_i$:

$$M_i(z, \bar{z}) = I_i(z) + L'(z)\bar{z}, \quad z, \bar{z} \in Z, \quad i = 1, 2, 3.$$

Let $\tilde{z} = (z, \bar{z}) \in \tilde{Z} = Z \times Z$ and $\tilde{z}_h = (z_h, \bar{z}_h) \in \tilde{Z}_h = Z_h \times Z_h$ be continuous and discrete stationary points of $M_i$ satisfying

$$M_i'(\tilde{z})(d\tilde{z}) = 0, \ \forall d\tilde{z} \in \tilde{Z}, \qquad M_i'(\tilde{z}_h)(d\tilde{z}_h) = 0, \ \forall d\tilde{z}_h \in \tilde{Z}_h,$$

respectively. Then, $z$, $z_h$ are continuous and discrete stationary points of $L$ and there holds $I_i(z) = M_i(\tilde{z})$, $i = 1, 2, 3$. Thus the $z$ part, as computed already during the numerical solution of the minimization problem (45) (or (46)), remains fixed for all $i \in \{1, 2, 3\}$. Moreover, after computing the discrete stationary point $z_h$ for $L$ (e.g., by applying Newton's method), it requires only one more Newton step to compute the $\bar{z}$ coordinate of the stationary point for $M_i$ from

$$L''(z_h)(\bar{z}_h, d\bar{z}_h) = -I_{i,h}'(z_h)\,d\bar{z}_h, \quad \forall d\bar{z}_h \in Z_h.$$

According to [3], there holds

$$I_i(x, v, \tilde{u}) - I_i(x_h, v_h, \tilde{u}_h) = \frac12 M_i'(\tilde{z}_h)(\tilde{z} - \hat{z}_h) + R, \quad \forall \hat{z}_h \in \tilde{Z}_h, \quad i = 1, 2, 3,$$

with a remainder term $R$ of order $O(\|\tilde{z} - \tilde{z}_h\|^3)$ that is therefore neglected. Thus we use

$$I_i^k(z) - I_i^k(z_h) \approx \frac12 M_i'(z_h, \bar{z}_h)(\pi_h \tilde{z}_{i,h} - \tilde{z}_{i,h}) = \varepsilon_i^k,$$

where $\pi_h$ is an operator defined such that $(\pi_h \tilde{z}_{i,h} - \tilde{z}_{i,h})$ approximates the interpolation error as in [18], typically defined by local averaging, to define the estimators $\eta_k$, $\xi_k$, $\zeta_k$ according to the rule

$$\eta_{k+1} = \varepsilon_1^{k+1} + \varepsilon_2^k, \qquad \xi_k = \varepsilon_1^k, \qquad \zeta_k = \varepsilon_3^k; \tag{51}$$

cf. (48), (49).
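The bookkeeping in rule (51) and the tolerance check (41) can be sketched as follows. This is only an illustrative Python fragment: the function names, the list-based storage of the indicators $\varepsilon_i^k$ and the sample numbers are assumptions, and the indicators themselves would have to come from an actual DWR computation.

```python
def assemble_estimators(eps1, eps2, eps3, k):
    """Combine DWR indicators according to (51) for an iteration index k >= 1.

    eps1[k], eps2[k], eps3[k] stand for eps_1^k, eps_2^k, eps_3^k."""
    eta_k = eps1[k] + eps2[k - 1]   # eta_k = eps_1^k + eps_2^(k-1)
    xi_k = eps1[k]                  # xi_k = eps_1^k
    zeta_k = eps3[k]                # zeta_k = eps_3^k
    return eta_k, xi_k, zeta_k

def needs_refinement(eta_k, xi_k, zeta_k, discrete_residual,
                     c_eta, c_xi, zeta_bar):
    """True if at least one of the tolerances in (41) is violated, where
    discrete_residual stands for ||F_h^k(x_{k,h}) - y^delta||."""
    within_tolerance = (eta_k <= c_eta * discrete_residual
                        and xi_k <= c_xi * discrete_residual
                        and zeta_k <= zeta_bar)
    return not within_tolerance

# Hypothetical indicator values for k = 0, 1.
eps1_demo, eps2_demo, eps3_demo = [0.02, 0.01], [0.03, 0.02], [0.1, 0.05]
eta1, xi1, zeta1 = assemble_estimators(eps1_demo, eps2_demo, eps3_demo, k=1)
```

In an adaptive loop, `needs_refinement` would be evaluated after each solve on the current mesh, and the mesh refined locally as long as it returns `True`.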
The estimators obtained by this procedure can be used to trigger local mesh refinement until the requirements (41) are met, cf. [3].

Explicitly, for $p = 2$ (for simplicity) such a stationary point $z = (x, v, u, \tilde{u}, \lambda, \tilde\mu, \mu)$ can be computed by solving the following system of equations (analogously for the discrete stationary point of $L$)

$$-\big(A_x'(x, u)^*\mu + A_x'(x_k^\delta, \tilde{u})^*\lambda\big) \in \alpha_k\,\partial\mathcal{R}(x); \tag{52}$$
$$2\langle C'(\tilde{u})(dv),\, C'(\tilde{u})v + C(\tilde{u}) - y^\delta\rangle + \langle A_u'(x_k^\delta, \tilde{u})(dv), \lambda\rangle = 0, \quad \forall dv \in V; \tag{53}$$
$$\langle A_u'(x, u)(du), \mu\rangle = 0, \quad \forall du \in V; \tag{54}$$
$$\begin{aligned}&\langle A_{xu}''(x_k^\delta, \tilde{u})(x - x_k^\delta, d\tilde{u}) + A_{uu}''(x_k^\delta, \tilde{u})(v, d\tilde{u}), \lambda\rangle + \langle A_u'(x_k^\delta, \tilde{u})(d\tilde{u}), \tilde\mu\rangle\\ &\quad + 2\langle C''(\tilde{u})(d\tilde{u}, v) + C'(\tilde{u})(d\tilde{u}),\, C'(\tilde{u})v + C(\tilde{u}) - y^\delta\rangle = 0, \quad \forall d\tilde{u} \in V;\end{aligned} \tag{55}$$
$$\langle A_x'(x_k^\delta, \tilde{u})(x - x_k^\delta) + A_u'(x_k^\delta, \tilde{u})v, d\lambda\rangle = 0, \quad \forall d\lambda \in W; \tag{56}$$
$$\langle A(x_k^\delta, \tilde{u}), d\tilde\mu\rangle = 0, \quad \forall d\tilde\mu \in W; \tag{57}$$
$$\langle A(x, u), d\mu\rangle = 0, \quad \forall d\mu \in W. \tag{58}$$

Note that (58) is decoupled from the other equations and that if $A_u'(x, u)$ is injective, Eq. (54) implies $\mu = 0$.

Summarizing, since we have a convex minimization problem, after solving a nonlinear system of seven equations to find the minimizer, we need only one more Newton step to compute the error estimators to check whether we need a refinement on the mesh or not.

Regarding the problem (46) related to the Ivanov-IRGNM, we have the Lagrangian functional (50) with the cost functional defined by

$$J(x, v, \tilde{u}) = \|C'(\tilde{u})v + C(\tilde{u}) - y^\delta\|^2 + I_{(-\infty,0]}(\mathcal{R}(x) - \rho),$$

and the indicator functional $I_{(-\infty,0]}(\mathcal{R}(x) - \rho)$ takes the role of a regularization functional. The resulting optimality system is the same as above, cf. (52)–(58), just with (52) replaced by

$$-\big(A_x'(x, u)^*\mu + A_x'(x_k^\delta, \tilde{u})^*\lambda\big) \in \partial I_{(-\infty,0]}(\mathcal{R}(x) - \rho). \tag{59}$$

Similarly for (47) for Morozov-IRGNM, with the cost function

$$J(x, v, \tilde{u}) = \mathcal{R}(x) + \underbrace{I_{(-\infty,0]}\big(\|C'(\tilde{u})v + C(\tilde{u}) - y^\delta\| - \sigma\|C(\tilde{u}) - y^\delta\|\big)}_{=:Q(\tilde{u}, v)},$$

we end up with an optimality system by setting $\alpha_k = 1$ and replacing (53), (55) in (52)–(58) by

$$-A_u'(x_k^\delta, \tilde{u})^*\lambda \in \partial_v Q(\tilde{u}, v) \tag{60}$$
$$-\big(A_{xu}''(x_k^\delta, \tilde{u})(x - x_k^\delta) + A_{uu}''(x_k^\delta, \tilde{u})v\big)^*\lambda - A_u'(x_k^\delta, \tilde{u})^*\tilde\mu \in \partial_{\tilde{u}} Q(\tilde{u}, v) \tag{61}$$

respectively.

Note that the bound on $I_2$ only appears, via (51), in connection to the assumption $\eta_k \le c_\eta\|F_h^k(x_{k,h}^\delta) - y^\delta\|$ for $k \le k_*(\delta, y^\delta)$ in (41). This may be satisfied in practice without refining explicitly with respect to $\eta_k$, but simply by refining with respect to the other error estimators $\xi_k$ (and $\zeta_k$ in the Tikhonov or Morozov case). The fact that $I_{1,h}^k$ and $I_{2,h}^{k-1}$ only differ in the discretization level motivates the assumption that for small $h$, we have $I_{1,h}^k \approx I_{2,h}^{k-1}$ and $\eta_k \approx \xi_k$. Therefore, the algorithm used in actual computations will be built neglecting $I_2$ and hence skipping the constraint $\langle A(x, u), w\rangle_{W^*,W} = 0$, $\forall w \in W$ in (45), (46), (47), which implies a modification of the Lagrangian (50) accordingly.

Therefore, the corresponding optimality system for $p = 2$ in the Tikhonov case is given by

$$-A_x'(x_k^\delta, \tilde{u})^*\lambda \in \alpha_k\,\partial\mathcal{R}(x); \tag{62}$$
$$2\langle C'(\tilde{u})(dv),\, C'(\tilde{u})v + C(\tilde{u}) - y^\delta\rangle + \langle A_u'(x_k^\delta, \tilde{u})(dv), \lambda\rangle = 0, \quad \forall dv \in V; \tag{63}$$
$$\begin{aligned}&\langle A_{xu}''(x_k^\delta, \tilde{u})(x - x_k^\delta, d\tilde{u}) + A_{uu}''(x_k^\delta, \tilde{u})(v, d\tilde{u}), \lambda\rangle + \langle A_u'(x_k^\delta, \tilde{u})(d\tilde{u}), \tilde\mu\rangle\\ &\quad + 2\langle C''(\tilde{u})(d\tilde{u}, v) + C'(\tilde{u})(d\tilde{u}),\, C'(\tilde{u})v + C(\tilde{u}) - y^\delta\rangle = 0, \quad \forall d\tilde{u} \in V;\end{aligned} \tag{64}$$
$$\langle A_x'(x_k^\delta, \tilde{u})(x - x_k^\delta) + A_u'(x_k^\delta, \tilde{u})v, d\lambda\rangle = 0, \quad \forall d\lambda \in W; \tag{65}$$
$$\langle A(x_k^\delta, \tilde{u}), d\tilde\mu\rangle = 0, \quad \forall d\tilde\mu \in W. \tag{66}$$

Note that Eq. (66) is decoupled from the others. Therefore, the strategy is to solve (66) for $\tilde{u}$ first, then solve the linear system (62), (63), (65) for $(x, v, \lambda)$, and finally compute $\tilde\mu$ via the linear equation (64). Here, the system (62), (63), (65) can be interpreted as the optimality conditions for the following problem

$$\begin{aligned}(x_{k+1,h}^\delta, v_{k,h}^\delta) \in \operatorname{argmin}_{(x,v)\in\mathcal{C}\times V}\ &\|C'(\tilde{u})v + C(\tilde{u}) - y^\delta\|^2 + \alpha_k\mathcal{R}(x)\\ \text{s.t. } \forall w \in W:\ &\langle A_x'(x_{k,h}^\delta, \tilde{u})(x - x_{k,h}^\delta) + A_u'(x_{k,h}^\delta, \tilde{u})v, w\rangle_{W^*,W} = 0.\end{aligned}$$
For the Ivanov case, we have to solve (63)–(66) with

$$-A_x'(x_k^\delta, \tilde{u})^*\lambda \in \partial I_{(-\infty,0]}(\mathcal{R}(x) - \rho) \tag{67}$$

in place of (62), hence again (66) is decoupled from the other equations, (64) is linear with respect to $\tilde\mu$ once $(x, v, \lambda)$ has been computed, and the remaining system for $(x, v, \lambda)$ can be interpreted as the optimality conditions for the following problem

$$\begin{aligned}(x_{k+1,h}^\delta, v_{k,h}^\delta) \in \operatorname{argmin}_{(x,v)\in\mathcal{C}\times V}\ &\|C'(\tilde{u})v + C(\tilde{u}) - y^\delta\|^2\\ \text{s.t. } &\mathcal{R}(x) \le \rho,\\ \text{and } \forall w \in W:\ &\langle A_x'(x_{k,h}^\delta, \tilde{u})(x - x_{k,h}^\delta) + A_u'(x_{k,h}^\delta, \tilde{u})v, w\rangle_{W^*,W} = 0.\end{aligned}$$

The Morozov case requires solution of (62) (with $\alpha_k = 1$), (65), (66), (60), (61). Thus again, we first solve (66) for $\tilde{u}$, then the system (62) (with $\alpha_k = 1$), (65), (60), which is the first order optimality condition for

$$\begin{aligned}(x_{k+1,h}^\delta, v_{k,h}^\delta) \in \operatorname{argmin}_{(x,v)\in\mathcal{C}\times V}\ &\mathcal{R}(x)\\ \text{s.t. } &\|C'(\tilde{u})v + C(\tilde{u}) - y^\delta\| \le \sigma\|C(\tilde{u}) - y^\delta\|,\\ \text{and } \forall w \in W:\ &\langle A_x'(x_{k,h}^\delta, \tilde{u})(x - x_{k,h}^\delta) + A_u'(x_{k,h}^\delta, \tilde{u})v, w\rangle_{W^*,W} = 0,\end{aligned}$$

with Lagrange multiplier $\lambda$ for the equality constraint, and finally the (now possibly nonlinear) inclusion (61) for $\tilde\mu$.

Remark 3 Since DWR estimators are based on residuals which are computed in the optimization process, the additional costs for estimation are very low, which makes this approach attractive for our purposes. However, although these error estimators are known to work efficiently in practice (see [3]), they are not reliable, i.e., the conditions $I_i^k(z) - I_i^k(z_h) \le \varepsilon_i^k$, $i = 1, 2, 3$ cannot be guaranteed in a strict sense in the computations, since we neglect the remainder term $R$ and use an approximation for $\tilde{z} - \hat{z}_h$. As our analysis in Theorem 1 is kept rather general, it is not restricted to DWR estimators and would also work with different (e.g., reliable) error estimators.

4 Model examples

We present a model example to illustrate the abstract setting from the previous section.
Consider the following inverse source problem for a semilinear elliptic PDE, where the model and observation equations are given by 3 d − Δu + κu = χ x in Ω ⊂ R , (68) u = 0on ∂Ω, (69) C (u) = u | , y − y  2 ≤ δ, (70) o L (ω ) where χ denotes the extension by zero of a function on ω to a function on all of Ω. ω c We first of all consider Tikhonov regularization and, aiming for a sparsely supported source, therefore use the space of Radon measures M(ω ) as a preimage space X. 1,q −1,q Thus we define the operators A : M(ω ) × W (Ω) −→ W (Ω), A(x , u) = 1,q 3 2 −Δu + κu − χ x, κ ∈ R and the injection C : W (Ω) −→ L (ω ) = Y , q > d, ω o c 0 where Ω is a bounded domain in R with d = 2 or 3, with Lipschitz boundary ∂Ω and ω ,ω ⊂ Ω are the control domain and the observation domain, respectively. c o A monotonicity argument yields well posedness of the above semilinear boundary 1,q value problem, i.e., well-definedness of u ∈ W (Ω) as a solution to the elliptic 123 468 B. Kaltenbacher, M. L. P. de Souza 3 −1,q boundary value problem (68), (69), as long as we can guarantee that u ∈ W (Ω) 1,q 1,q 3r r for any u ∈ W (Ω), i.e., the embeddings W (Ω) → L (Ω) and L (Ω) → 0 0 −1,q W (Ω) are continuous for some r ∈[1, ∞], which (by duality) is the case iff 1,q 3r r W (Ω) embeds continuously both into L (Ω) and L (Ω). By Sobolev’s Embed- ding Theorem, this boils down to the inequalities d d d d 1 − ≥− and 1 − ≥− , q 3r q r which by elementary computations turns out to be equivalent to dq dq ≤ r ≤ , (71) q + d 3(dq − q − d) where the left hand side is larger than one and the denominator on the right hand side is positive due to the fact that for d ≥ 2wehave q > d ≥ d = . Taking the d−1 extremal bounds for q > d—note that the lower bound is increasing and the upper bound is decreasing with q—in (72) we get d d < r < . 
(72) 2 3(d − 2) Thus, as a by-product, we get that for any t ∈[1, t ) there exists q > d such that 1,q W (Ω) continuously embeds into L , with ¯ ¯ t =∞ in case d = 2 and t = 3 in case d = 3 . (73) For the regularization functional R(x ) =x  , the IRGNM-Tikhonov mini- M(ω ) mization step is given by (ignoring h in the notation) δ δ δ δ 2 (x ,v , u ) ∈ argmin v +˜ u − y k+1 k k 1,q L (ω ) 2 o (x ,v,u ˜)∈M(ω )×(W (Ω)) + α x k M(ω ) 1,q 2 δ s.t. ∀w ∈ W (Ω) : (∇v∇w + 3κu ˜ vw)dΩ = wd(x − x ), 0 k Ω ω 3 δ (∇˜ u∇w + κu ˜ w)dΩ = wdx . Ω ω Here and below dΩ and dx denote the integrals with respect to the Lebesgue Ω ω measure and with respect to the measure x, respectively. Therefore, to compute this Gauss–Newton step, one first needs to solve the nonlinear equation 3 δ − Δu ˜ + κu ˜ = χ x (74) c k 123 Convergence and adaptive discretization of the IRGNM… 469 for u ˜ = u , then solve the following optimality system with respect to (x,v,λ) (written in a strong formulation) ∗ ∗ C (ω ) b c λ ≤ α and (x − λ)dx ≤ 0, ∀x ∈ B C (ω ) k b c α δ 2 δ δ −Δλ + 3κ(u ) λ + 2v + 2u = 2χ y k k o δ 2 δ −Δv + 3κ(u ) v = χ (x − x ), k c k which can be interpreted as the optimality system for the minimization problem δ δ δ δ 2 (x ,v ) ∈ argmin u + v − y  + α x 2 k M(ω ) k+1 k k 1,q L (ω ) c (x ,v)∈M(ω )×W (Ω) δ 2 δ s.t. −Δv + 3κ(u ) v = χ (x − x ), (75) k c k with Lagrange multiplier λ for the equality constraint, and finally, compute μ ˜ by solving δ 2 δ δ δ − Δμ ˜ + 3κ(u ) μ ˜ =−6κu vλ − 2(v + u − χ y ). (76) k k k o For carrying out the IRGNM iteration, μ ˜ is not required, but we need it for evaluating the error estimators. For the Ivanov case, we consider the same model and observation equations (68), (69), (70) but now we intend to regularize by imposing L bounds and thus use ∞ 1 −1 the slightly different function space setting, A : L (ω ) × H (Ω) −→ H (Ω), 3 1 2 A(x , u) =−Δu + κu − x, κ ∈ R and the injection C : H (Ω) −→ L (ω ). 
The IRGNM-Ivanov minimization step with the regularization functional R(x ) = x  is given by L (ω ) δ δ δ δ 2 (x ,v , u ) ∈ argmin v +˜ u − y ∞ 1 2 k+1 k k (x ,v,u ˜)∈L (ω )×(H (Ω)) L (ω ) c o s.t. x  ≤ ρ L (ω ) 1 2 δ and ∀w ∈ H (Ω) : (∇v∇w + 3κu ˜ vw)dΩ = w(x − x )dΩ, 0 k Ω ω 3 δ (∇˜ u∇w + κu ˜ wdΩ = wx dΩ. Ω ω For the Gauss–Newton step, one needs to first solve the nonlinear equation (74)for u ˜ = u , and then solve the following optimality system with respect to (x,v,λ) ∗ ∗ L (ω ) x  ≤ ρ and (x − x )λdΩ ≤ 0, ∀x ∈ B L (ω ) c ρ δ 2 δ δ −Δλ + 3κ(u ) λ + 2v + 2u = 2χ y k k o δ 2 δ −Δv + 3κ(u ) v = χ (x − x ), k c k 123 470 B. Kaltenbacher, M. L. P. de Souza which can be interpreted as the optimality system for the minimization problem δ δ δ δ 2 (x ,v ) ∈ argmin 1 u + v − y ∞ 2 k+1 k (x ,v)∈L (ω )×H (Ω) k c L (ω ) 0 o s.t. x  ∞ ≤ ρ L (ω ) δ 2 δ −Δv + 3κ(u ) v = χ (x − x ) (77) k c k with Lagrange multiplier λ for the equality constraint. Finally, μ ˜ is computed from (76). For the IRGNM-Morozov case, using for simplicity the regularization functional 1 2 R(x ) = x  , and leaving the rest of the setting as in the IRGNM-Ivanov case, L (ω ) the step is defined by δ δ δ 2 (x ,v , u ) ∈ argmin x ∞ 1 2 k+1 k k (x ,v,u ˜)∈L (ω )×(H (Ω)) L (ω ) c c δ 2 δ 2 s.t. v +˜ u − y  ≤ σ ˜ u − y 2 2 L (ω ) L (ω ) o o 1 2 δ and ∀w ∈ H (Ω) : (∇v∇w + 3κu ˜ vw)dΩ = w(x − x )dΩ, 0 k Ω ω 3 δ (∇˜ u∇w + κu ˜ wdΩ = wx dΩ. Ω ω So again we first solve (74)for u ˜ = u , then the minimization problem δ δ 2 (x ,v ) ∈ argmin 1 x k+1 k (x ,v)∈L (ω )×H (Ω) c L (ω ) 0 c δ δ 2 δ δ 2 s.t. v + u − y  ≤ σ u − y 2 2 k k L (ω ) L (ω ) o o δ 2 δ −Δv + 3κ(u ) v = χ (x − x ), k c k or actually its first order optimality system λ| = x δ δ 2 δ δ 2 φ ≥ 0 , v + u − y  ≤ σ u − y  , 2 2 k k L (ω ) L (ω ) o o δ δ 2 δ δ 2 φ v + u − y  − σ u − y  = 0 2 2 k k L (ω ) L (ω ) o o δ 2 δ −Δv + 3κ(u ) v = χ (x − x ), k c k δ δ for (x ,v ,φ,λ), and finally, k+1 k δ 2 δ δ δ −Δμ ˜ + 3κ(u ) μ ˜ =−6κu vλ − φ(v + (1 − σ)(u − χ y )). 
k k k o for μ ˜ . For numerically efficient methods to solve the minimization problems (75) and (77) we refer to e.g., [4–6] and the references therein. 123 Convergence and adaptive discretization of the IRGNM… 471 We finally check the tangential cone condition in case ω = Ω and, for simplicity also ω = Ω, in both settings 1,q 1,q X = M(ω ), V = W (Ω) , W = W (Ω) 0 0 (where we will have to restrict ourselves to d = 2) and ∞ 2 1 X = L (ω ) or X = L (ω ), V = W = H (Ω) . c c For this purpose, we use the fact that with the notation F (x˜) =˜ u| , F (x ) = u| , ω ω o o F (x˜) − F (x ) = v| and F (x˜) − F (x ) − F (x )(x˜ − x ) = w| , the functions ω ω o o 1,q v, w ∈ W (Ω) satisfy the homogeneous Dirichlet boundary value problems for the equations 2 2 −Δv + κ(u ˜ +˜uu + u )v =˜x − x 2 2 −Δw + κu w =−κ(u ˜ + 2u)v . Using an Aubin-Nitsche type duality trick, we can estimate the L norm of w via the 1,n adjoint state p ∈ W (Ω), which solves −Δp + κu p = w, with homogeneous Dirichlet boundary conditions, so that by Hölder’s inequality 2 2 2 w =w, (−Δ + κu id)p=(−Δ + κu id)w, p L (Ω) =−κ(u ˜ + 2u)v , p≤ κv ˜ u + 2u m v m p 2 2m L (Ω) L (Ω) L (Ω) m−4 L (Ω) ≤ C κv ˜ u + 2u m v m w , 2 2 L (Ω) L (Ω) L (Ω) L (Ω) where we aim at choosing m ∈[4, ∞], n ∈[1, ∞] such that indeed p 1,n ≤ C w −1,n ≤ C w 2 W (Ω) L (Ω) W (Ω) 2m m 1,n 2 −1,n m−4 and the embeddings V → L (Ω), W (Ω) → L (Ω), L (Ω) → W (Ω) m m are continuous. If we succeed in doing so, we can bound C κ˜ u + 2u v L (Ω) L (Ω) by some constant c , which will be small provided ˜x − x  and hence v is tc X L (Ω) small. Thus, the numbers n, m are limited by the requirements 2m m 1,n m−4 V ⊆ L (Ω) and W (Ω) ⊆ L (Ω) and m ≥ 4 , (78) 2 −1,n L (Ω) ⊆ W (Ω), i.e., by duality, 1,n 2 W (Ω) ⊆ L (Ω) , (79) 123 472 B. Kaltenbacher, M. L. P. 
and the fact that $\kappa u^2 p \in L^o(\Omega)$ should be contained in $W^{-1,n}(\Omega)$ for $u \in V \subseteq L^t(\Omega)$ and $p \in W^{1,n}(\Omega)$, which via Hölder's inequality in

$$
\left(\int_\Omega (u^2 p)^o\, d\Omega\right)^{1/o} \le \|u\|_{L^t(\Omega)}^2\, \|p\|_{L^{\frac{ot}{t-2o}}(\Omega)}
$$

and duality leads to the requirements

$$
W_0^{1,n}(\Omega) \subseteq L^o(\Omega) \ \text{ and } \ V \subseteq L^t(\Omega) \ \text{ and } \ W_0^{1,n}(\Omega) \subseteq L^{\frac{ot}{t-2o}}(\Omega) \ \text{ and } \ o \le \frac{t}{2}. \tag{80}
$$

In case $V = W^{1,q}(\Omega)$ with $q > d$ and $d = 3$, (78) will not work out, since according to (73), $m$ cannot be chosen larger than or equal to four.

In case $V = W^{1,q}(\Omega)$ with $q > d$ and $d = 2$, we can choose, e.g., $t = m = n = 6$, $o = 2$ to satisfy (78), (79), (80) as well as the bound on $t$, $m$ from (73). The same choice is possible in case $V = H_0^1(\Omega)$ with $d \in \{2, 3\}$.

5 Numerical tests

In this section, we provide some numerical illustration of the IRGNM Ivanov method applied to the example from Sect. 4, i.e., each Newton step consists of solving (74) and subsequently (77). For the numerical solution of (74) we apply a damped Newton iteration to the equation $\Phi(\tilde u) = 0$, where

$$
\Phi : H_0^1(\Omega) \to H^{-1}(\Omega), \quad \Phi(\tilde u) = -\Delta\tilde u + \kappa \tilde u^3 - x_k^\delta,
$$

$$
\tilde u^{l+1} = \tilde u^l - \left(-\Delta + 3\kappa(\tilde u^l)^2\right)^{-1}\left(-\Delta\tilde u^l + \kappa(\tilde u^l)^3 - x_k^\delta\right),
$$

which is stopped as soon as $\|\Phi(\tilde u^l)\|_{H^{-1}(\Omega)}$ has been reduced by a factor of $10^{-4}$.

The sources $x$ and states $u$ are discretized by piecewise linear finite elements. Hence, after elimination of the state via the linear equality constraint, (77) becomes a box constrained quadratic program for the discretized version of $x$, which we solve with the method from [12] using the Matlab code mkr_box provided to us by Philipp Hungerländer, Alpen-Adria Universität Klagenfurt. All implementations have been done in Matlab.

We performed test computations on a 2-d domain $\omega_o = \omega_c = \Omega = (-1, 1)^2$, on a regular computational finite element grid consisting of $2 \cdot N \cdot N$ triangles, with $N = 32$. We first of all consider $\kappa = 1$ (below we will also show results with $\kappa = 100$) and the piecewise constant exact source function

$$
x_{ex} = -10 + 20 \cdot 1_B, \tag{81}
$$

where $B$ is the ball of radius 0.2 around $(-0.4, -0.3)$, cf.
Fig. 1, and correspondingly set $\rho = 10$. In order to avoid an inverse crime, we generated the synthetic data on a finer grid and, after projection of $u_{ex}$ onto the computational grid, we added normally distributed random noise of levels $\delta \in \{0.001, 0.01, 0.1\}$ to obtain synthetic data $y^\delta$. In all tests we start with the constant function with value zero for $x_0$. Moreover, we always set $\tau = 1.1$.

Fig. 1 Left: exact source $x_{ex}$; right: locations of spots for testing weak* $L^\infty$ convergence

According to our convergence result Theorem 1 with $\mathcal{R} = \|\cdot\|_{L^\infty(\Omega)}$, we can expect weak* convergence in $L^\infty(\Omega)$ here. Thus we computed the errors in certain spots within the two homogeneous regions and on their interface,

$$
\mathrm{spot}_1 = (0.5, 0.5), \quad \mathrm{spot}_2 = (-0.4, -0.3), \quad \mathrm{spot}_3 = (-0.4, -0.5),
$$

cf. Fig. 1, more precisely, on $\frac{1}{N} \times \frac{1}{N}$ squares located at these spots, corresponding to the piecewise constant $L^1$ functions with these supports, in order to exemplarily test weak* $L^\infty$ convergence. Additionally we computed $L^1$ errors.

Table 1 provides an illustration of convergence as $\delta$ decreases. For this purpose, we performed five runs on each noise level for each example and list the average errors.

Table 1 Convergence as $\delta \to 0$: averaged errors of five test runs with uniform noise

| $\delta$ | $\mathrm{err}_{\mathrm{spot}_1}$ | $\mathrm{err}_{\mathrm{spot}_2}$ | $\mathrm{err}_{\mathrm{spot}_3}$ | $\mathrm{err}_{L^1(\Omega)}$ |
|---|---|---|---|---|
| 0.1000 | 0 | 4.0818 | 8.0043 | 0.0627 |
| 0.0667 | 0.1558 | 3.6454 | 7.8451 | 0.0541 |
| 0.0333 | 0 | 3.0442 | 6.5726 | 0.0370 |
| 0.0100 | 0 | 0 | 3.9091 | 0.0188 |

In Fig. 2 we plot the reconstructions for $\kappa = 1$ and $\kappa = 100$. For $\kappa = 1$, the noise levels $\delta \in \{0.01, 0.0333, 0.0667, 0.1\}$ correspond to percentages $p \in \{5.6, 18.5, 37.1, 55.6\}$ of the L deviation of the exact state from the background state $u_0 = -10^{1/3}$. In case of $\kappa = 100$, where the background state is $u_0 = -0.1^{1/3}$, the corresponding percentages are $p \in \{17.9, 59.7, 119.4, 179.2\}$. For an illustration of the noisy data as compared to the exact ones, see Figs. 3 and 4.
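The damped Newton iteration for the state equation (74) described at the beginning of this section can be sketched as follows. This is only a minimal illustration under simplifying assumptions that are not taken from the paper: a 1-d finite-difference Laplacian on $(-1, 1)$ stands in for the 2-d piecewise linear finite element discretization, the Euclidean norm of the residual stands in for the $H^{-1}(\Omega)$ norm, and the damping is a plain step-halving rule, since the damping strategy is not specified in the text. The names `laplacian_1d` and `solve_state` are hypothetical.

```python
import numpy as np


def laplacian_1d(n):
    """Finite-difference Dirichlet Laplacian (-d^2/ds^2) on (-1, 1), n interior nodes.

    Assumption: a 1-d stand-in for the finite element stiffness matrix.
    """
    h = 2.0 / (n + 1)
    main = 2.0 * np.ones(n)
    off = -np.ones(n - 1)
    return (np.diag(main) + np.diag(off, 1) + np.diag(off, -1)) / h**2


def solve_state(x, kappa, tol=1e-4, max_iter=100):
    """Damped Newton for Phi(u) = A u + kappa * u^3 - x = 0, started at u = 0.

    Stops once the residual norm has been reduced by the factor `tol`
    (Euclidean norm here; the paper measures the residual in H^{-1}).
    Returns the state and the achieved relative residual.
    """
    A = laplacian_1d(x.size)

    def residual(u):
        return A @ u + kappa * u**3 - x

    u = np.zeros(x.size)
    r = residual(u)
    r0 = np.linalg.norm(r)
    if r0 == 0.0:
        return u, 0.0
    for _ in range(max_iter):
        if np.linalg.norm(r) <= tol * r0:
            break
        # Jacobian of Phi at u: A + 3 * kappa * diag(u^2)
        jac = A + np.diag(3.0 * kappa * u**2)
        step = np.linalg.solve(jac, r)
        # Assumed damping rule: halve the step until the residual norm decreases.
        t = 1.0
        while True:
            u_trial = u - t * step
            r_trial = residual(u_trial)
            if np.linalg.norm(r_trial) < np.linalg.norm(r) or t < 1e-10:
                break
            t *= 0.5
        u, r = u_trial, r_trial
    return u, np.linalg.norm(r) / r0
```

A possible call, using a 1-d analogue of the exact source (81), is `s = np.linspace(-1, 1, 65)[1:-1]; x = -10.0 + 20.0 * (np.abs(s + 0.4) < 0.2); u, rel = solve_state(x, kappa=100.0)`. For large $\kappa$ the first full Newton step from $u = 0$ overshoots considerably, which is where the damping matters.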
Indeed, the box constraints make it possible to cope with relatively large noise levels, even in the rather nonlinear regime with $\kappa = 100$.

Fig. 2 Reconstructions from noisy data with $\delta \in \{0.1, 0.0667, 0.0333, 0.01\}$ (top to bottom) for $\kappa = 1$ (left) and $\kappa = 100$ (right)

Fig. 3 Exact and noisy data ($\delta = 0.1$) for $\kappa = 1$

Fig. 4 Exact and noisy data ($\delta = 0.1$) for $\kappa = 100$

6 Conclusions and remarks

In this paper we have studied convergence of the Tikhonov type, the Ivanov type, and the Morozov type IRGNM with a stopping rule of discrepancy principle type. To the best of our knowledge, the Ivanov and Morozov IRGNMs have not been studied so far, and for all three Tikhonov, Ivanov, and Morozov type IRGNMs, convergence results without source conditions have so far used stronger assumptions than the tangential cone condition used here. We also consider discretized versions of the methods and provide discretization error bounds that still guarantee convergence. Moreover, we discuss goal oriented dual weighted residual error estimators that can be used in an adaptive discretization scheme for controlling these discretization error bounds. An inverse source problem for a nonlinear elliptic boundary value problem illustrates our theoretical findings in the special situations of measure valued and $L^\infty$ sources. We also provide some computational results with the Ivanov IRGNM for the case of an $L^\infty$ source. Numerical implementations and tests for a measure valued source, together with adaptive discretization, are the subject of ongoing work, based on the approaches from [4–6,18,19]. Future research in this context will be concerned with convergence rates results for the Ivanov and Morozov IRGNMs under source conditions.

Acknowledgements Open access funding provided by University of Klagenfurt.
The authors wish to thank Philipp Hungerländer, Alpen-Adria Universität Klagenfurt, for providing us with the Matlab code based on the method from [12]. Moreover, the authors gratefully acknowledge financial support by the Austrian Science Fund FWF under the grants I2271 “Regularization and Discretization of Inverse Problems for PDEs in Banach Spaces” and P30054 “Solving Inverse Problems without Forward Operators” as well as partial support by the Karl Popper Kolleg “Modeling-Simulation-Optimization”, funded by the Alpen-Adria-Universität Klagenfurt and by the Carinthian Economic Promotion Fund (KWF). Moreover, we wish to thank both reviewers for fruitful comments leading to an improved version of the manuscript.

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Appendix

Lemma 1 For all $a, b > 0$, $p \ge 1$ and $\gamma, \epsilon \in (0, 1)$

$$
(a + b)^p \le (1 + \gamma)^{p-1} a^p + \left(\frac{1 + \gamma}{\gamma}\right)^{p-1} b^p \tag{82}
$$

and, if additionally $a \ge b$, also

$$
(a - b)^p \ge (1 - \epsilon)^{p-1} a^p - \left(\frac{1 - \epsilon}{\epsilon}\right)^{p-1} b^p . \tag{83}
$$

Proof The estimate in (82) can be obtained by solving the following extremal value problems

$$
C_1 = \max_{x > 0} \phi(x), \quad C_2 = \max_{x > 0} \Phi(x),
$$

where

$$
\phi(x) := \left((1 + x)^p - (1 + \gamma)^{p-1}\right) x^{-p} \quad\text{and}\quad \Phi(x) := \left((1 - \epsilon)^{p-1} - (1 - x)^p\right) x^{-p},
$$

since for any $\gamma, \epsilon \in (0, 1)$, $\phi(x) \le C_1$ and $\Phi(x) \le C_2$ for all $x > 0$ with $x := b/a$, $a, b > 0$ is equivalent to (24).
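As a quick numerical sanity check of the two estimates stated in Lemma 1 (an illustration only, not part of the proof), the following Python sketch samples random admissible values and compares both sides of (82) and (83). The function names and the sampling ranges are assumptions made for this example; the symbol `eps` plays the role of the second parameter in $(0, 1)$ appearing in (83).

```python
import numpy as np


def upper_bound_82(a, b, p, gamma):
    """Right-hand side of (82)."""
    return (1 + gamma) ** (p - 1) * a**p + ((1 + gamma) / gamma) ** (p - 1) * b**p


def lower_bound_83(a, b, p, eps):
    """Right-hand side of (83); meaningful for a >= b."""
    return (1 - eps) ** (p - 1) * a**p - ((1 - eps) / eps) ** (p - 1) * b**p


def check_lemma1(num_samples=10000, seed=0):
    """Test (82) and (83) on random samples; returns two booleans."""
    rng = np.random.default_rng(seed)
    a = rng.uniform(0.1, 10.0, num_samples)
    b = rng.uniform(0.1, 10.0, num_samples)
    p = rng.uniform(1.0, 5.0, num_samples)
    gamma = rng.uniform(0.01, 0.99, num_samples)
    eps = rng.uniform(0.01, 0.99, num_samples)

    # (82): small relative slack to absorb floating-point rounding.
    ok_82 = (a + b) ** p <= upper_bound_82(a, b, p, gamma) * (1 + 1e-12)

    # (83) requires a >= b, so order each sampled pair first.
    hi, lo = np.maximum(a, b), np.minimum(a, b)
    ok_83 = (hi - lo) ** p >= lower_bound_83(hi, lo, p, eps) - 1e-9 * hi**p
    return bool(ok_82.all()), bool(ok_83.all())
```

Equality in (82) is attained for $b/a = \gamma$, which can be checked directly, for instance with `np.isclose((1 + 0.3) ** 3, upper_bound_82(1.0, 0.3, 3.0, 0.3))`.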
Solving for $C_1$, we have

$$
\phi'(x) = p\, x^{-(p+1)} \left((1 + \gamma)^{p-1} - (1 + x)^{p-1}\right)
\begin{cases}
= 0 \iff x = \gamma, \\
< 0 \text{ for } x > \gamma, \\
> 0 \text{ for } x < \gamma,
\end{cases}
$$

which means that

$$
\max_{x > 0} \phi(x) = \phi(\gamma) = \left(\frac{1 + \gamma}{\gamma}\right)^{p-1},
$$

so defining $C_1 := \left(\frac{1+\gamma}{\gamma}\right)^{p-1}$ and writing the resulting inequality in terms of $a$ and $b$ we have the desired formula. The formula in (83) is derived analogously.

Lemma 2 Let $k \in \mathbb{N}$, $(d_j)_{1 \le j \le k+1} \subseteq [0, \infty)$, $(R_j)_{1 \le j \le k+1} \subseteq [0, \infty)$, $\alpha_0, R^\dagger, c, q \ne \theta \in (0, \infty)$. Then

$$
\forall j \in \{0, \ldots, k\}: \quad d_{j+1} + \alpha_0 \theta^{j+1} R_{j+1} \le q\, d_j + \alpha_0 \theta^j R^\dagger + c \tag{84}
$$

implies that

$$
d_{k+1} + \alpha_0 \theta^{k+1} R_{k+1} < q^{k+1} d_0 + \frac{1}{1 - \frac{q}{\theta}}\, \alpha_0 \theta^k R^\dagger + \frac{1}{1 - q}\, c . \tag{85}
$$

Proof We first show by induction that for all $l \in \{0, \ldots, k\}$

$$
d_{k+1} + \alpha_0 \theta^{k+1} R_{k+1} \le q^{l+1} d_{k-l} + \left(1 + \frac{q}{\theta} + \cdots + \left(\frac{q}{\theta}\right)^l\right) \alpha_0 \theta^k R^\dagger + (1 + q + \cdots + q^l)\, c . \tag{86}
$$

Indeed, for $l = 0$, (86) is just (84) with $j = k$. Suppose that (86) holds for $l$; then using (84) with $j = k - (l+1)$, we obtain the formula for $l + 1$:

$$
\begin{aligned}
d_{k+1} + \alpha_0 \theta^{k+1} R_{k+1}
&\le q^{l+1} \left(q\, d_{k-(l+1)} + \alpha_0 \theta^{k-(l+1)} R^\dagger + c\right) \\
&\quad + \left(1 + \frac{q}{\theta} + \cdots + \left(\frac{q}{\theta}\right)^l\right) \alpha_0 \theta^k R^\dagger + (1 + q + \cdots + q^l)\, c \\
&= q^{(l+1)+1} d_{k-(l+1)} + \left(1 + \frac{q}{\theta} + \cdots + \left(\frac{q}{\theta}\right)^l + \left(\frac{q}{\theta}\right)^{l+1}\right) \alpha_0 \theta^k R^\dagger \\
&\quad + (1 + q + \cdots + q^l + q^{l+1})\, c,
\end{aligned}
$$

and the induction proof is complete. Hence, setting $l = k$ in (86) and using the geometric series formula, we get the assertion (85).

References

1. Bakushinsky, A.B.: The problem of the convergence of the iteratively regularized Gauss–Newton method. Comput. Math. Math. Phys. 32, 1353–1359 (1992)
2. Bakushinsky, A.B., Kokurin, M.: Iterative Methods for Approximate Solution of Inverse Problems. Kluwer, Dordrecht (2004)
3. Becker, R., Vexler, B.: Mesh refinement and numerical sensitivity analysis for parameter calibration of partial differential equations. J. Comput. Phys. 206, 95–110 (2005)
4. Clason, C., Kunisch, K.: A duality-based approach to elliptic control problems in non-reflexive Banach spaces. ESAIM Control Optim. Calc. Var. 17(1), 243–266 (2011)
5.
Clason, C., Kunisch, K.: A measure space approach to optimal source placement. Comput. Optim. Appl. 53(1), 155–171 (2012)
6. Casas, E., Clason, C., Kunisch, K.: Approximation of elliptic control problems in measure spaces with sparse solutions. SIAM J. Control Optim. 50(4), 1735–1752 (2012)
7. Dombrovskaja, I., Ivanov, V.K.: On the theory of certain linear equations in abstract spaces. Sib. Mat. Z. 6, 499–508 (1965)
8. Engl, H., Hanke, M., Neubauer, A.: Regularization of Inverse Problems. Kluwer, Dordrecht (1996)
9. Grasmair, M., Haltmeier, M., Scherzer, O.: The residual method for regularizing ill-posed problems. Appl. Math. Comput. 218, 2693–2710 (2011)
10. Hanke, M.: A regularization Levenberg–Marquardt scheme, with applications to inverse groundwater filtration problems. Inverse Prob. 13, 79–95 (1997)
11. Hohage, T., Werner, F.: Iteratively regularized Newton-type methods for general data misfit functionals and applications to Poisson data. Numer. Math. 123, 745–779 (2013)
12. Hungerländer, P., Rendl, F.: A feasible active set method for strictly convex problems with simple bounds. SIAM J. Optim. 25, 1633–1659 (2015)
13. Ivanov, V.K.: On linear problems which are not well-posed. Dokl. Akad. Nauk SSSR 145, 270–272 (1962)
14. Ivanov, V.K.: On ill-posed problems. Mat. Sb. (N.S.) 61(103), 211–223 (1963)
15. Ivanov, V.K., Vasin, V.V., Tanana, V.P.: Theory of Linear Ill-Posed Problems and Its Applications. Inverse and Ill-Posed Problems Series. VSP (2002)
16. Jin, Q., Zhong, M.: On the iteratively regularized Gauss–Newton method in Banach spaces with applications to parameter identification problems. Numer. Math. 124, 647–683 (2013)
17. Kaltenbacher, B., Hofmann, B.: Convergence rates for the iteratively regularized Gauss–Newton method in Banach spaces. Inverse Prob. 26, 035007 (2010)
18. Kaltenbacher, B., Kirchner, A., Veljović, S.: Goal oriented adaptivity in the IRGNM for parameter identification in PDEs: I. Reduced formulation. Inverse Prob. 30, 045001 (2014)
19.
Kaltenbacher, B., Kirchner, A., Vexler, B.: Goal oriented adaptivity in the IRGNM for parameter identification in PDEs II: all-at-once formulations. Inverse Prob. 30, 045002 (2014)
20. Kaltenbacher, B., Neubauer, A., Scherzer, O.: Iterative Regularization Methods for Nonlinear Ill-Posed Problems. Walter de Gruyter, Berlin (2008)
21. Kaltenbacher, B., Schöpfer, F., Schuster, T.: Convergence of some iterative methods for the regularization of nonlinear ill-posed problems in Banach spaces. Inverse Prob. 25, 065003 (2009)
22. Lorenz, D., Worliczek, N.: Necessary conditions for variational regularization schemes. Inverse Prob. 29, 075016 (2013)
23. Morozov, V.A.: Choice of parameter for the solution of functional equations by the regularization method. Dokl. Akad. Nauk SSSR 175, 1225–1228 (1967)
24. Neubauer, A., Ramlau, R.: On convergence rates for quasi-solutions of ill-posed problems. ETNA 41, 81–92 (2014)
25. Scherzer, O.: Convergence criteria of iterative methods based on Landweber iteration for solving nonlinear problems. J. Math. Anal. Appl. 194, 911–933 (1995)
26. Seidman, T.I., Vogel, C.R.: Well posedness and convergence of some regularisation methods for non-linear ill posed problems. Inverse Prob. 5, 227–238 (1989)
27. Tikhonov, A.N., Arsenin, V.Y.: Solutions of Ill-Posed Problems. Wiley, New York (1977)
28. Werner, F.: On convergence rates for iteratively regularized Newton-type methods under a Lipschitz-type nonlinearity condition. J. Inverse Ill Posed Probl. 23, 75–84 (2015)

Journal

Numerische Mathematik, Springer Journals

Published: May 29, 2018
