
Complexity of the relaxed Peaceman–Rachford splitting method for the sum of two maximal strongly monotone operators
Monteiro, Renato; Sim, Chee-Khian
2018-03-20 00:00:00
Comput Optim Appl (2018) 70:763–790
https://doi.org/10.1007/s10589-018-9996-z

Renato D. C. Monteiro (1) · Chee-Khian Sim (2)

Received: 3 November 2016 / Published online: 20 March 2018
© The Author(s) 2018

Abstract This paper considers the relaxed Peaceman–Rachford (PR) splitting method for finding an approximate solution of a monotone inclusion whose underlying operator consists of the sum of two maximal strongly monotone operators. Using general results obtained in the setting of a non-Euclidean hybrid proximal extragradient framework, we extend a previous convergence result on the iterates generated by the relaxed PR splitting method, as well as establish new pointwise and ergodic convergence rate results for the method whenever an associated relaxation parameter is within a certain interval. An example is also discussed to demonstrate that the iterates may not converge when the relaxation parameter is outside this interval.

Keywords Relaxed Peaceman–Rachford splitting method · Strongly monotone operators · Non-Euclidean hybrid proximal extragradient framework

Renato D. C. Monteiro: The work of this author was partially supported by NSF Grant CMMI-1300221.
Chee-Khian Sim: This research was made possible through support by Centre of Operational Research and Logistics, University of Portsmouth.

Chee-Khian Sim, chee-khian.sim@port.ac.uk
Renato D. C. Monteiro, monteiro@isye.gatech.edu
(1) School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0205, USA
(2) Department of Mathematics, University of Portsmouth, Lion Gate Building, Lion Terrace, Portsmouth PO1 3HF, UK

1 Introduction

In this paper, we consider the relaxed Peaceman–Rachford (PR) splitting method for solving the monotone inclusion

$$0 \in (A + B)(u) \qquad (1)$$

where $A : X \rightrightarrows X$ and $B : X \rightrightarrows X$ are maximal $\beta$-strongly monotone (point-to-set) operators for some $\beta \ge 0$ (with the convention that 0-strongly monotone means simply monotone, and $\beta$-strongly monotone with $\beta > 0$ means strongly monotone in the usual sense). Recall that the relaxed PR splitting method is given by

$$x_k = x_{k-1} + \theta\bigl(J_B(2J_A(x_{k-1}) - x_{k-1}) - J_A(x_{k-1})\bigr), \qquad (2)$$

where $\theta > 0$ is a fixed relaxation parameter and $J_T := (I + T)^{-1}$. The special case of the relaxed PR splitting method in which $\theta = 2$ is known as the Peaceman–Rachford (PR) splitting method and the one with $\theta = 1$ is the widely-studied Douglas–Rachford (DR) splitting method. Convergence results for them are studied for example in [1–4,8,13,14,22].

The analysis of the relaxed PR splitting method for the case in which $\beta = 0$ has been undertaken in a number of papers which are discussed in this paragraph. Convergence of the sequence of iterates generated by the relaxed PR splitting method is well-known when $\theta < 2$ (see for example [1,7,14]) and, according to [16], its limiting behavior for the case in which $\theta \ge 2$ is not known. We actually show in Sect. 5.2 that the sequence (2) does not necessarily converge when $\theta \ge 2$. An $O(1/\sqrt{k})$ (strong) pointwise convergence rate result is established in [18] for the relaxed PR splitting method when $\theta \in (0, 2)$. Moreover, when $A = \partial f$ and $B = \partial g$ where $f$ and $g$ are proper lower semi-continuous convex functions, papers [9–11] derive strong pointwise (resp., ergodic) convergence rate bounds for the relaxed PR method when $\theta \in (0, 2)$ (resp., $\theta \in (0, 2]$) under different assumptions on the functions. Assuming only $\beta$-strong monotonicity of $A = \partial f$, where $\beta > 0$, some smoothness property on $f$, and maximal monotonicity of $B$, [16] shows that the relaxed PR splitting method has linear convergence rate for $\theta \in (0, 2 + \tau)$ for some $\tau > 0$.
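To make iteration (2) concrete, the following minimal Python sketch runs the relaxed PR recursion on a toy instance. The instance is an assumption made purely for illustration and is not taken from the paper: $A(x) = a(x - p)$ and $B(x) = b(x - q)$ on $\mathbb{R}^2$ with $a = b = 1$, so that both resolvents have closed forms and both operators are 1-strongly monotone. The helper names `resolvent_affine` and `relaxed_pr` are hypothetical.

```python
import numpy as np


def resolvent_affine(weight, center):
    """Return J_T = (I + T)^{-1} for the toy operator T(x) = weight * (x - center)."""
    def J(x):
        # Solve y + weight * (y - center) = x for y.
        return (x + weight * center) / (1.0 + weight)
    return J


def relaxed_pr(J_A, J_B, x0, theta, num_iters):
    """Run x_k = x_{k-1} + theta * (J_B(2 J_A(x_{k-1}) - x_{k-1}) - J_A(x_{k-1}))."""
    x = np.asarray(x0, dtype=float)
    for _ in range(num_iters):
        u = J_A(x)            # u = J_A(x_{k-1})
        v = J_B(2.0 * u - x)  # v = J_B(2 J_A(x_{k-1}) - x_{k-1})
        x = x + theta * (v - u)
    return x, J_A(x)


# Toy data (assumed for illustration only): a = b = 1.
p = np.array([1.0, -2.0])
q = np.array([3.0, 4.0])
J_A = resolvent_affine(1.0, p)
J_B = resolvent_affine(1.0, q)
u_star = 0.5 * (p + q)  # unique zero of A + B for this instance

for theta in (1.0, 2.0, 3.0, 4.0):
    _, u = relaxed_pr(J_A, J_B, np.zeros(2), theta, num_iters=60)
    print(theta, np.linalg.norm(u - u_star))
```

For this particular instance the recursion is linear with factor $1 - \theta/2$, so the error shrinks for $\theta \in \{1, 2, 3\}$ and only changes sign at $\theta = 4$; this is a property of the toy problem, not a general statement about the method.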
Linear rate of convergence of the relaxed PR splitting method and its two special cases, namely, the DR splitting and PR splitting methods, is established in [2–4,11,15,16,22] under relatively strong assumptions on $A$ and/or $B$ (see also Table 2).

This paper assumes that $\beta \ge 0$, and hence its analysis applies to the case in which both $A$ and $B$ are monotone ($\beta = 0$) and the case in which both $A$ and $B$ are strongly monotone ($\beta > 0$). This paragraph discusses papers dealing with the latter case. Paper [12] establishes convergence of the sequence generated by the relaxed PR splitting method for any $\theta \in (0, 2 + \beta)$ and, under some strong assumptions on $A$ and $B$, establishes its linear convergence rate. We complement the convergence results in [12] by showing that for $\theta = 2 + \beta$, the sequence of iterates generated by the relaxed PR splitting method also converges, and describe an instance showing its nonconvergence when $\theta \ge \min\{2 + 2\beta,\ 2 + \beta + 1/\beta\}$. Moreover, we establish strong pointwise and ergodic convergence rate results (Theorems 4.6 and 4.8) for the relaxed PR splitting method when $\theta \in (0, 2 + \beta)$ and $\theta \in (0, 2 + \beta]$, respectively. Finally, by imposing strong assumptions requiring one of the operators to be strongly monotone and one of them to be Lipschitz (and hence point-to-point), [11,15,16] establish linear convergence rate of the relaxed PR splitting method. As opposed to these papers, the assumptions in [12] and this paper do not imply that the operators $A$ or $B$ are point-to-point.

Our analysis of the relaxed PR splitting method for solving (1) is based on viewing it as an inexact proximal point method, more specifically, as an instance of a non-Euclidean hybrid proximal extragradient (HPE) framework for solving the monotone inclusion problem. The proximal point method, proposed by Rockafellar [29], is a classical iterative scheme for solving the latter problem.
Paper [30] introduces a Euclidean version of the HPE framework which is an inexact version of the proximal point method based on a certain relative error criterion. Iteration-complexities of the latter framework are established in [25] (see also [26]). Generalizations of the HPE framework to the non-Euclidean setting are studied in [17,21,31]. Applications of the HPE framework can be found for example in [19,20,25,26].

This paper is organized as follows. Section 2 describes basic concepts and notation used in the paper. Section 3 discusses the non-Euclidean HPE framework which is used to study the convergence properties of the relaxed PR splitting method in Sects. 4 and 5. Section 4 derives convergence rate bounds for the relaxed Peaceman–Rachford (PR) splitting method. Section 5, which consists of two subsections, discusses a convergence result of the relaxed PR splitting method in the first subsection and provides an example showing that its iterates may not converge when $\theta \ge \min\{2 + 2\beta,\ 2 + \beta + 1/\beta\}$ in the second subsection. Finally, Sect. 6 discusses the numerical performance of the relaxed PR splitting method for solving the weighted Lasso minimization problem. Section 7 gives some concluding remarks.

2 Basic concepts and notation

This section presents some definitions, notation and terminology which will be used in the paper.

We denote the set of real numbers by $\mathbb{R}$ and the set of non-negative real numbers by $\mathbb{R}_+$. Let $f$ and $g$ be functions with the same domain and whose values are in $\mathbb{R}_+$. We write that $f(\cdot) = \Omega(g(\cdot))$ if there exists a constant $K > 0$ such that $f(\cdot) \ge K g(\cdot)$. Also, we write $f(\cdot) = \Theta(g(\cdot))$ if $f(\cdot) = \Omega(g(\cdot))$ and $g(\cdot) = \Omega(f(\cdot))$.

Let $\mathcal{Z}$ be a finite-dimensional real vector space with inner product denoted by $\langle \cdot, \cdot \rangle$ (an example of $\mathcal{Z}$ is $\mathbb{R}^n$ endowed with the standard inner product) and let $\|\cdot\|$ denote an arbitrary seminorm in $\mathcal{Z}$. Its dual (extended) seminorm, denoted by $\|\cdot\|_*$, is defined as $\|\cdot\|_* := \sup\{\langle \cdot, z \rangle : \|z\| \le 1\}$. It is easy to see that

$$\langle z, v \rangle \le \|z\|\,\|v\|_* \quad \forall z, v \in \mathcal{Z}. \qquad (3)$$

The following straightforward result states some basic properties of the dual seminorm associated with a matrix seminorm. Its proof can be found for example in Lemma A.1(b) of [23].

Proposition 2.1 Let $A : \mathcal{Z} \to \mathcal{Z}$ be a self-adjoint positive semidefinite linear operator and consider the seminorm $\|\cdot\|$ in $\mathcal{Z}$ given by $\|z\| = \langle Az, z \rangle^{1/2}$ for every $z \in \mathcal{Z}$. Then, $\mathrm{dom}\,\|\cdot\|_* = \mathrm{Im}(A)$ and $\|Az\|_* = \|z\|$ for every $z \in \mathcal{Z}$.

Given a set-valued operator $T : \mathcal{Z} \rightrightarrows \mathcal{Z}$, its domain is denoted by $\mathrm{Dom}(T) := \{z \in \mathcal{Z} : T(z) \ne \emptyset\}$ and its inverse operator $T^{-1} : \mathcal{Z} \rightrightarrows \mathcal{Z}$ is given by $T^{-1}(v) := \{z : v \in T(z)\}$. The graph of $T$ is defined by $\mathrm{Gr}(T) := \{(z, t) : t \in T(z)\}$. The operator $T$ is said to be monotone if

$$\langle z - z', t - t' \rangle \ge 0 \quad \forall (z, t), (z', t') \in \mathrm{Gr}(T).$$

Moreover, $T$ is maximal monotone if it is monotone and, additionally, if $T'$ is a monotone operator such that $T(z) \subset T'(z)$ for every $z \in \mathcal{Z}$, then $T = T'$. The sum $T + T' : \mathcal{Z} \rightrightarrows \mathcal{Z}$ of two set-valued operators $T, T' : \mathcal{Z} \rightrightarrows \mathcal{Z}$ is defined by $(T + T')(z) := \{t + t' \in \mathcal{Z} : t \in T(z),\ t' \in T'(z)\}$ for every $z \in \mathcal{Z}$. Given a scalar $\varepsilon \ge 0$, the $\varepsilon$-enlargement $T^{[\varepsilon]} : \mathcal{Z} \rightrightarrows \mathcal{Z}$ of a monotone operator $T : \mathcal{Z} \rightrightarrows \mathcal{Z}$ is defined as

$$T^{[\varepsilon]}(z) := \{t \in \mathcal{Z} : \langle t - t', z - z' \rangle \ge -\varepsilon,\ \forall z' \in \mathcal{Z},\ \forall t' \in T(z')\} \quad \forall z \in \mathcal{Z}. \qquad (4)$$

3 A non-Euclidean hybrid proximal extragradient framework

This section discusses the non-Euclidean hybrid proximal extragradient (NE-HPE) framework and describes its associated convergence and iteration complexity results. The results of the section will be used in Sects. 4 and 5 to study the convergence and iteration complexity properties of the relaxed PR splitting method (2).

It contains two subsections. The first one describes a class of distance generating functions introduced in [17]. The second one describes the NE-HPE framework and its corresponding convergence and iteration complexity results.
For the sake of brevity, all the results in this section are stated without proofs, which can be found in Section 3 of the technical report version of this paper (see [24]).

3.1 A class of distance generating functions

We start by introducing a class of distance generating functions (and its corresponding Bregman distances) which is needed for the presentation of the NE-HPE framework in Sect. 3.2.

Definition 3.1 For a given convex set $Z \subset \mathcal{Z}$, a seminorm $\|\cdot\|$ in $\mathcal{Z}$ and scalars $0 < m \le M$, we let $\mathcal{D}_Z(m, M)$ denote the class of real-valued functions $w$ which are differentiable on $Z$ and satisfy

$$w(z') - w(z) - \langle \nabla w(z), z' - z \rangle \ge \frac{m}{2}\|z - z'\|^2 \quad \forall z, z' \in Z, \qquad (5)$$

$$\|\nabla w(z) - \nabla w(z')\|_* \le M \|z - z'\| \quad \forall z, z' \in Z. \qquad (6)$$

A function $w \in \mathcal{D}_Z(m, M)$ is referred to as a distance generating function with respect to the seminorm $\|\cdot\|$ and its associated Bregman distance $dw : Z \times Z \to \mathbb{R}$ is defined as

$$(dw)(z'; z) = (dw)_z(z') := w(z') - w(z) - \langle \nabla w(z), z' - z \rangle \quad \forall z, z' \in Z. \qquad (7)$$

Throughout our presentation, we use the second notation $(dw)_z(z')$ instead of the first one $(dw)(z'; z)$ although the latter one makes it clear that $(dw)$ is a function of two arguments, namely, $z$ and $z'$. Clearly, it follows from (5) that $w$ is a convex function on $Z$ which is in fact $m$-strongly convex on $Z$ whenever $\|\cdot\|$ is a norm.

Note that if the seminorm in Definition 3.1 is a norm, then (5) implies that $w$ is strongly convex on $Z$, in which case the corresponding $dw$ is said to be nondegenerate on $Z$. However, since Definition 3.1 does not necessarily assume that $\|\cdot\|$ is a norm, it admits the possibility of $w$ being not strongly convex on $Z$, or equivalently, $dw$ being degenerate on $Z$.

Finally, some useful relations about the above class of Bregman distances can be found in Sect. 3.1 of the technical report version of this paper (see Lemmas 3.2 and 3.3 of [24]).

3.2 The NE-HPE framework

This subsection describes the NE-HPE framework and its corresponding convergence and iteration complexity results.
Throughout this subsection, we assume that scalars $0 < m \le M$, convex set $Z \subset \mathcal{Z}$, seminorm $\|\cdot\|$ and distance generating function $w \in \mathcal{D}_Z(m, M)$ with respect to $\|\cdot\|$ are given. Our problem of interest in this section is the monotone inclusion problem (MIP)

$$0 \in T(z) \qquad (8)$$

where $T : \mathcal{Z} \rightrightarrows \mathcal{Z}$ is a maximal monotone operator satisfying the following conditions:

(A0) $\mathrm{Dom}(T) \subset Z$;
(A1) the solution set $T^{-1}(0)$ of (8) is nonempty.

We now state a non-Euclidean HPE (NE-HPE) framework for solving the MIP (8) which generalizes its Euclidean counterparts studied in the literature (see for example [25,27,30]).

Framework 1 (An NE-HPE framework for solving (8)).
(0) Let $z_0 \in Z$ and $\sigma \in [0, 1]$ be given, and set $k = 1$;
(1) choose $\lambda_k > 0$ and find $(\tilde z_k, z_k, \varepsilon_k) \in Z \times Z \times \mathbb{R}_+$ such that

$$r_k := \frac{1}{\lambda_k}\nabla (dw)_{z_k}(z_{k-1}) \in T^{[\varepsilon_k]}(\tilde z_k), \qquad (9)$$

$$(dw)_{z_k}(\tilde z_k) + \lambda_k \varepsilon_k \le \sigma (dw)_{z_{k-1}}(\tilde z_k); \qquad (10)$$

(2) set $k \leftarrow k + 1$ and go to step 1.
end

We now make some remarks about Framework 1. First, it does not specify how to find $\lambda_k$ and $(\tilde z_k, z_k, \varepsilon_k)$ satisfying (9) and (10). The particular scheme for computing $\lambda_k$ and $(\tilde z_k, z_k, \varepsilon_k)$ will depend on the instance of the framework under consideration and the properties of the operator $T$. Second, if $w$ is strongly convex on $Z$ and $\sigma = 0$, then (10) implies that $\varepsilon_k = 0$ and $z_k = \tilde z_k$ for every $k$, and hence that $r_k \in T(z_k)$ in view of (9). Therefore, the HPE error conditions (9)–(10) can be viewed as a relaxation of an iteration of the exact non-Euclidean proximal point method, namely,

$$0 \in \frac{1}{\lambda_k}\nabla (dw)_{z_{k-1}}(z_k) + T(z_k).$$

We observe that NE-HPE frameworks have already been studied in [17,21] and [31]. The approach presented in this section differs from these three papers as follows. Assuming that $Z$ is an open convex set, $w$ is continuously differentiable on $Z$ and continuous on its closure, [31] studies a special case of the NE-HPE framework in which $\varepsilon_k = 0$ for every $k$, and presents results on convergence of sequences rather than iteration complexity.
Paper [21] deals with distance generating functions $w$ which do not necessarily satisfy conditions (5) and (6), and as a consequence, obtains results which are more limited in scope, i.e., only an ergodic convergence rate result is obtained for operators with bounded feasible domains (or, more generally, for the case in which the sequence generated by the HPE framework is bounded). Paper [17] introduces the class of distance generating functions $\mathcal{D}_Z(m, M)$ but only analyzes the behavior of an HPE framework for solving inclusions whose operators are strongly monotone with respect to a fixed $w \in \mathcal{D}_Z(m, M)$ (see condition A1 in Section 2 of [17]). This section on the other hand assumes that $w \in \mathcal{D}_Z(m, M)$ but it does not assume any strong monotonicity of $T$ with respect to $w$.

Before presenting the main results about the NE-HPE framework, namely, Theorems 3.3 and 3.4 establishing its pointwise and ergodic iteration complexities, respectively, and Propositions 3.5 and 3.6 showing that $\{z_k\}$ and/or $\{\tilde z_k\}$ approach $T^{-1}(0)$ in terms of the Bregman distance $(dw)$, we have the following result.

Proposition 3.2 For every $k \ge 1$ and $z^* \in T^{-1}(0)$, we have

$$(dw)_{z_{k-1}}(z^*) - (dw)_{z_k}(z^*) - (1 - \sigma)(dw)_{z_{k-1}}(\tilde z_k) \ge \lambda_k\bigl(\langle r_k, \tilde z_k - z^* \rangle + \varepsilon_k\bigr) \ge 0. \qquad (11)$$

As a consequence, the following statements hold:
(a) $\{(dw)_{z_k}(z^*)\}$ is non-increasing;
(b) $\lim_{k \to \infty} \lambda_k\bigl(\langle r_k, \tilde z_k - z^* \rangle + \varepsilon_k\bigr) = 0$;
(c) $(1 - \sigma)\sum_{i=1}^{k} (dw)_{z_{i-1}}(\tilde z_i) \le (dw)_{z_0}(z^*)$.

Proof See the proof of Proposition 3.5 of [24].

For the purpose of stating the convergence rate results below, define

$$(dw)_0 := \inf\{(dw)_{z_0}(z^*) : z^* \in T^{-1}(0)\}. \qquad (12)$$

The following pointwise convergence rate result describes the convergence rate of the sequence $\{(r_k, \varepsilon_k)\}$ of residual pairs associated to the sequence $\{\tilde z_k\}$. Note that its convergence rate bounds are derived on the best residual pair among $(r_i, \varepsilon_i)$ for $i = 1, \ldots, k$ rather than on the last residual pair $(r_k, \varepsilon_k)$.

Theorem 3.3 (Pointwise convergence) Let $(dw)_0$ be as in (12) and assume that $\sigma < 1$. Then, the following statements hold:
(a) if $\underline{\lambda} := \inf \lambda_k > 0$, then for every $k \in \mathbb{N}$ there exists $i \le k$ such that

$$\|r_i\|_* \le M(1 + \sqrt{\sigma})\sqrt{\frac{2(dw)_0}{(1 - \sigma)m}\cdot\frac{\lambda_i^{-1}}{\sum_{j=1}^{k}\lambda_j}} \le \frac{M(1 + \sqrt{\sigma})\sqrt{2(dw)_0}}{\sqrt{(1 - \sigma)m}\,\underline{\lambda}\sqrt{k}},$$

$$\varepsilon_i \le \frac{\sigma (dw)_0}{1 - \sigma}\cdot\frac{1}{\sum_{i=1}^{k}\lambda_i} \le \frac{\sigma (dw)_0}{(1 - \sigma)\underline{\lambda}k};$$

(b) for every $k \in \mathbb{N}$, there exists an index $i \le k$ such that

$$\|r_i\|_* \le M(1 + \sqrt{\sigma})\sqrt{\frac{2(dw)_0}{(1 - \sigma)m}\cdot\frac{1}{\sum_{j=1}^{k}\lambda_j^2}}, \qquad \varepsilon_i \le \frac{\sigma (dw)_0 \lambda_i}{(1 - \sigma)\sum_{j=1}^{k}\lambda_j^2}. \qquad (13)$$

Proof See the proof of Theorem 3.8 of [24].

From now on, we focus on the ergodic convergence rate of the NE-HPE framework. For $k \ge 1$, define $\Lambda_k := \sum_{i=1}^{k}\lambda_i$ and the ergodic sequences

$$\tilde z_k^a = \frac{1}{\Lambda_k}\sum_{i=1}^{k}\lambda_i \tilde z_i, \qquad r_k^a := \frac{1}{\Lambda_k}\sum_{i=1}^{k}\lambda_i r_i, \qquad \varepsilon_k^a := \frac{1}{\Lambda_k}\sum_{i=1}^{k}\lambda_i\bigl(\varepsilon_i + \langle r_i, \tilde z_i - \tilde z_k^a \rangle\bigr). \qquad (14)$$

The following ergodic convergence result describes the association between the ergodic iterate $\tilde z_k^a$ and the residual pair $(r_k^a, \varepsilon_k^a)$, and gives a convergence rate bound on the latter residual pair.

Theorem 3.4 (Ergodic convergence) Let $(dw)_0$ be as in (12). Then, for every $k \ge 1$, we have

$$\varepsilon_k^a \ge 0, \qquad r_k^a \in T^{[\varepsilon_k^a]}(\tilde z_k^a)$$

and

$$\|r_k^a\|_* \le \frac{2M\sqrt{2(dw)_0}}{\sqrt{m}\,\Lambda_k}, \qquad \varepsilon_k^a \le \frac{3M\bigl[2(dw)_0 + \rho_k\bigr]}{m\,\Lambda_k}$$

where

$$\rho_k := \max_{i = 1, \ldots, k} (dw)_{z_i}(\tilde z_i). \qquad (15)$$

Moreover, the sequence $\{\rho_k\}$ is bounded under either one of the following situations:
(a) $\sigma < 1$, in which case

$$\rho_k \le \frac{\sigma (dw)_0}{1 - \sigma}; \qquad (16)$$

(b) $\mathrm{Dom}\,T$ is bounded, in which case

$$\rho_k \le \frac{2M}{m}\bigl[(dw)_0 + D\bigr] \qquad (17)$$

where $D := \sup\{\min\{(dw)_y(y'), (dw)_{y'}(y)\} : y, y' \in \mathrm{Dom}\,T\}$ is the diameter of $\mathrm{Dom}\,T$ with respect to $dw$.

Proof See the proof of Theorem 3.9 of [24].

In the remaining part of this subsection, we state some results about the sequence generated by an instance of the NE-HPE framework. We assume from now on that such instance generates an infinite sequence of iterates, i.e., the instance does not terminate in a finite number of steps and no termination criterion is checked.
Since we are not assuming that the distance generating function $w$ is nondegenerate on $Z$, it is not possible to establish convergence of the sequence $\{z_k\}$ generated by the NE-HPE framework to a solution of (8). However, under some mild assumptions, it is possible to establish that $\{z_k\}$ approaches a point $\tilde z \in T^{-1}(0)$ if the proximity measure used is the actual Bregman distance.

Proposition 3.5 Assume that for some infinite index set $K$ and some $\tilde z \in Z$, we have

$$\lim_{k \in K} (r_k, \varepsilon_k) = (0, 0), \qquad \lim_{k \in K} \tilde z_k = \tilde z. \qquad (18)$$

Then, $\tilde z \in T^{-1}(0) \subset Z$. If, in addition, $\lim_{k \in K} (dw)_{z_k}(\tilde z_k) = 0$, then $\lim_{k \to \infty} (dw)_{z_k}(\tilde z) = 0$.

Proof See the proof of Proposition 3.10 of [24].

Proposition 3.6 Assume that $\sigma < 1$, $\sum_{i=1}^{\infty}\lambda_i^2 = \infty$ and $\{\tilde z_k\}$ is bounded. Then, there exists $\tilde z \in T^{-1}(0) \subset Z$ such that

$$\lim_{k \to \infty} (dw)_{z_k}(\tilde z) = \lim_{k \to \infty} (dw)_{\tilde z_k}(\tilde z) = 0. \qquad (19)$$

Proof See the proof of Proposition 3.11 of [24].

Clearly, if $w$ is a nondegenerate distance generating function, then the results above give sufficient conditions for the sequences $\{z_k\}$ and $\{\tilde z_k\}$ to converge to some $\tilde z \in T^{-1}(0)$.

4 The relaxed Peaceman–Rachford splitting method

This section derives convergence rate bounds for the relaxed Peaceman–Rachford (PR) splitting method for solving the monotone inclusion (1) under the assumption that $A$ and $B$ are maximal $\beta$-strongly monotone operators for any $\beta \ge 0$. More specifically, its pointwise iteration-complexity is obtained in Theorem 4.6 and its ergodic iteration-complexity is derived in Theorem 4.8. These results are obtained as by-products of the corresponding ones (i.e., Theorems 3.3 and 3.4) in Sect. 3.2 and the fact that the relaxed Peaceman–Rachford (PR) splitting method can be viewed as a special instance of the NE-HPE framework.

Throughout this section, we assume that $X$ is a finite-dimensional real vector space with inner product and associated inner product norm denoted by $\langle \cdot, \cdot \rangle_X$ and $\|\cdot\|_X$, respectively.
For a given $\beta \ge 0$, an operator $T : X \rightrightarrows X$ is said to be $\beta$-strongly monotone if

$$\langle w - w', x - x' \rangle_X \ge \beta \|x - x'\|_X^2 \quad \forall (x, w), (x', w') \in \mathrm{Gr}(T).$$

In what follows, we refer to monotone operators as 0-strongly monotone operators. This terminology has the benefit of allowing us to treat both the monotone and strongly monotone case simultaneously.

Throughout this section, we consider the monotone inclusion (1) where $A, B : X \rightrightarrows X$ satisfy the following assumptions:

(B0) for some $\beta \ge 0$, $A$ and $B$ are maximal $\beta$-strongly monotone operators;
(B1) the solution set $(A + B)^{-1}(0)$ is non-empty.

We start by observing that (1) is equivalent to solving the following augmented system of inclusions/equation

$$0 \in \gamma A(u) + u - x, \qquad 0 \in \gamma B(v) + x - v, \qquad 0 = u - v$$

where $\gamma > 0$ is an arbitrary scalar. Another way of writing the above system is as

$$0 \in \gamma A(u) + u - x, \qquad 0 \in \gamma B(v) + v + x - 2u, \qquad 0 = u - v.$$

Note that the first and second inclusions are equivalent to

$$u = u(x) := J_{\gamma A}(x), \qquad v = v(x) := J_{\gamma B}(2u - x) = J_{\gamma B}(2J_{\gamma A}(x) - x) \qquad (20)$$

so that the third equation reduces to

$$0 = u(x) - v(x) = J_{\gamma A}(x) - J_{\gamma B}(2J_{\gamma A}(x) - x).$$

The Douglas–Rachford (DR) splitting method is the iterative procedure $x_k = x_{k-1} + v(x_{k-1}) - u(x_{k-1})$, $k \ge 1$, started from some $x_0 \in X$. It is known that the DR splitting method is an exact proximal point method for some maximal monotone operator [13,14]. Hence, convergence of its sequence of iterates is guaranteed.

This section is concerned with a natural generalization of the DR splitting method, namely, the relaxed Peaceman–Rachford (PR) splitting method with relaxation parameter $\theta > 0$, which iterates as

$$(u_k, v_k) := (u(x_{k-1}), v(x_{k-1})), \qquad x_k = x_k^{\theta} := x_{k-1} + \theta(v_k - u_k) \quad \forall k \ge 1. \qquad (21)$$

We now make a few remarks about the above method. First, it reduces to the DR splitting method when $\theta = 1$, and to the PR splitting method when $\theta = 2$.
Second, it reduces to (2) when $\gamma = 1$ but it is not more general than (2) since (21) is equivalent to (2) with $(A, B)$ replaced by $(\gamma A, \gamma B)$. Third, as presented in (21), it can be viewed as an iterative process in the $(u, v, x)$-space rather than only in the $x$-space as suggested by (2).

Our analysis of the relaxed PR splitting method is based on further exploring the last remark above, i.e., viewing it as an iterative method in the $(u, v, x)$-space. We start by introducing an inclusion which plays an important role in our analysis. For a fixed $\tilde\theta > 0$ and $\gamma > 0$, consider the inclusion

$$0 \in (L_{\tilde\theta} + \gamma C)(z) \qquad (22)$$

where $L_{\tilde\theta} : X \times X \times X \to X \times X \times X$ is the linear map defined as

$$L_{\tilde\theta}(z) = L_{\tilde\theta}(u, v, x) := \begin{bmatrix} (1 - \tilde\theta)I & \tilde\theta I & -I \\ (\tilde\theta - 2)I & (1 - \tilde\theta)I & I \\ I & -I & 0 \end{bmatrix} \begin{pmatrix} u \\ v \\ x \end{pmatrix} \qquad (23)$$

and $C : X \times X \times X \rightrightarrows X \times X \times X$ is the maximal monotone operator defined as

$$C(z) = C(u, v, x) := A(u) \times B(v) \times \{0\}. \qquad (24)$$

It is easy to verify that the inclusion (22) is equivalent to the two systems of inclusions/equation following conditions B0 and B1. Hence, it suffices to solve (22) in order to solve (1). The following simple but useful result explicitly shows the relationship between the solution sets of (22) and (1).

Lemma 4.1 For any $\tilde\theta > 0$, the solution set $(L_{\tilde\theta} + \gamma C)^{-1}(0)$ is given by

$$(L_{\tilde\theta} + \gamma C)^{-1}(0) = \{(u^*, u^*, x^*) : \gamma^{-1}(x^* - u^*) \in A(u^*) \cap (-B(u^*))\} = \{(u^*, u^*, u^* + \gamma a^*) : a^* \in A(u^*),\ -a^* \in B(u^*)\}.$$

As a consequence, if $z^* = (u^*, u^*, x^*) \in (L_{\tilde\theta} + \gamma C)^{-1}(0)$, then $u^* \in (A + B)^{-1}(0)$ and $u^* = J_{\gamma A}(x^*)$.

Proof The conclusion of the lemma follows immediately from the definitions of $L_{\tilde\theta}$ and $C$ in (23) and (24), respectively, and some simple algebraic manipulations.

The key idea of our analysis is to show that the relaxed PR splitting method is actually a special instance of the NE-HPE framework for solving inclusion (22) and then use the results discussed in Sect. 3.2 to derive convergence and iteration-complexity results for it. With this goal in mind, the next result gives a sufficient condition for (22) to be a maximal monotone inclusion.

Proposition 4.2 Assume that $A, B : X \rightrightarrows X$ satisfy B0 and let $\tilde\theta > 0$ be given. Then,
(a) for every $z = (u, v, x) \in X \times X \times X$, $z' = (u', v', x') \in X \times X \times X$, $r \in (L_{\tilde\theta} + \gamma C)(z)$ and $r' \in (L_{\tilde\theta} + \gamma C)(z')$, we have

$$\langle L_{\tilde\theta}(z - z'), z - z' \rangle = (1 - \tilde\theta)\|(u - u') - (v - v')\|_X^2, \qquad (25)$$

$$\langle r - r', z - z' \rangle \ge (1 - \tilde\theta)\|(u - u') - (v - v')\|_X^2 + \gamma\beta\bigl(\|u - u'\|_X^2 + \|v - v'\|_X^2\bigr); \qquad (26)$$

(b) $L_{\tilde\theta} + \gamma C$ is maximal monotone whenever $\tilde\theta \in (0, \theta_0]$ where

$$\theta_0 := 1 + \frac{\gamma\beta}{2}. \qquad (27)$$

Proof (a) Identity (25) follows from the definition of $L_{\tilde\theta}$ in (23). To show inequality (26), assume that $r \in (L_{\tilde\theta} + \gamma C)(z)$ and $r' \in (L_{\tilde\theta} + \gamma C)(z')$. Then, $r = L_{\tilde\theta}(z) + \gamma c$ and $r' = L_{\tilde\theta}(z') + \gamma c'$ for some $c \in C(z)$ and $c' \in C(z')$. Using the definition of $C$ and assumption B0, we easily see that

$$\langle c' - c, z' - z \rangle \ge \beta\bigl(\|u' - u\|_X^2 + \|v' - v\|_X^2\bigr),$$

which together with (25), and the fact that $r = L_{\tilde\theta}(z) + \gamma c$ and $r' = L_{\tilde\theta}(z') + \gamma c'$, imply (26).

(b) Monotonicity of $L_{\tilde\theta} + \gamma C$ is due to the fact that the right hand side of (26) is nonnegative for every $(u, v), (u', v') \in X \times X$ whenever $\tilde\theta \in (0, \theta_0]$. To show that $L_{\tilde\theta} + \gamma C$ is maximal monotone, write $L_{\tilde\theta} + \gamma C = (L_{\tilde\theta} + \gamma \bar C) + \gamma(C - \bar C)$ where $\bar C := \beta(I, I, 0)$. As a consequence of (a) with $(A, B) = \beta(I, I)$ and the definition of $\bar C$, we conclude that $L_{\tilde\theta} + \gamma \bar C$ is a monotone linear operator for every $\tilde\theta \in (0, \theta_0]$. Moreover, Assumption B0 easily implies that $\gamma(C - \bar C)$ is maximal monotone. The statement now follows by noting that the sum of a monotone linear map and a maximal monotone operator is a maximal monotone operator [1,28].

Note that $\theta_0$ in (27) depends on $\gamma$ and $\beta$ and that $\theta_0 = 1$ when $\beta = 0$.

The following technical result states some useful identities and inclusions needed to analyze the sequence generated by the relaxed PR splitting method.
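Before turning to that result, the threshold in Proposition 4.2(b) can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: it takes $X = \mathbb{R}^n$, reads the matrix in (23) as having third block row $(I, -I, 0)$, and replaces $C$ by the linear operator $\beta(I, I, 0)$ used in the proof. The function names `L_matrix` and `min_sym_eig` and the values of `gamma` and `beta` are hypothetical choices, not taken from the paper.

```python
import numpy as np


def L_matrix(theta_t, n=1):
    """Matrix of the linear map in (23) on X x X x X with X = R^n (assumed reading)."""
    I, Z = np.eye(n), np.zeros((n, n))
    return np.block([
        [(1.0 - theta_t) * I, theta_t * I, -I],
        [(theta_t - 2.0) * I, (1.0 - theta_t) * I, I],
        [I, -I, Z],
    ])


def min_sym_eig(theta_t, gamma, beta, n=1):
    """Smallest eigenvalue of the symmetric part of L + gamma * beta * diag(I, I, 0)."""
    I, Z = np.eye(n), np.zeros((n, n))
    C_bar = beta * np.block([[I, Z, Z], [Z, I, Z], [Z, Z, Z]])
    M = L_matrix(theta_t, n) + gamma * C_bar
    # A linear map is monotone exactly when its symmetric part is positive semidefinite.
    return np.linalg.eigvalsh(0.5 * (M + M.T)).min()


gamma, beta = 2.0, 0.5              # illustrative values
theta_0 = 1.0 + gamma * beta / 2.0  # threshold (27)

# Identity (25) at a random point z = (u, v, x), taking z' = 0.
rng = np.random.default_rng(0)
n = 3
u, v, x = rng.standard_normal((3, n))
z = np.concatenate([u, v, x])
lhs = z @ L_matrix(1.3, n) @ z
rhs = (1.0 - 1.3) * np.linalg.norm(u - v) ** 2

print(lhs - rhs)
print(min_sym_eig(theta_0, gamma, beta), min_sym_eig(theta_0 + 0.1, gamma, beta))
```

In this linear model the symmetric part stays positive semidefinite up to $\tilde\theta = \theta_0$ and acquires a negative eigenvalue just beyond it, which is the behaviour the proof of part (b) relies on.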
Lemma 4.3 For a given $x_{k-1} \in X$ and $\tilde\theta > 0$, define

$$\tilde x_k = \tilde x_k^{\tilde\theta} := x_{k-1} + \tilde\theta(v_k - u_k), \qquad \tilde z_k = \tilde z_k^{\tilde\theta} := (u_k, v_k, \tilde x_k) \qquad (28)$$

where $u_k, v_k$ are as in (21), and set

$$a_k := \frac{1}{\gamma}(x_{k-1} - u_k), \qquad b_k := \frac{1}{\gamma}(2u_k - v_k - x_{k-1}). \qquad (29)$$

Then, we have:

$$-(1 - \tilde\theta)u_k - \tilde\theta v_k + \tilde x_k = \gamma a_k \in \gamma A(u_k), \qquad (30)$$

$$(2 - \tilde\theta)u_k - (1 - \tilde\theta)v_k - \tilde x_k = \gamma b_k \in \gamma B(v_k). \qquad (31)$$

As a consequence, we have

$$u_k - v_k = \gamma(a_k + b_k) \in \gamma A(u_k) + \gamma B(v_k), \qquad (32)$$

$$(0, 0, u_k - v_k) = L_{\tilde\theta}(\tilde z_k) + \gamma c_k \in (L_{\tilde\theta} + \gamma C)(\tilde z_k) \qquad (33)$$

where

$$c_k := (a_k, b_k, 0). \qquad (34)$$

Proof Using the definition of $(u(\cdot), v(\cdot))$ in (20), the definition of $(u_k, v_k, x_k)$ in (21), and the definitions of $a_k$ and $b_k$ in (29), we easily see that (30) and (31) hold. The equality and the inclusion in (32) follow by adding (30) and (31). Clearly, (33) follows as an immediate consequence of (30) and (31), definitions (23) and (24), and the definition of $c_k$.

The following result shows that the relaxed PR splitting method with $\theta \in (0, 2\theta_0]$ can be viewed as an inexact instance of the NE-HPE framework for solving (22) where from now on we assume that

$$\tilde\theta := \min\{\theta, \theta_0\}. \qquad (35)$$

Proposition 4.4 Consider the (degenerate) distance generating function given by

$$w(z) = w(u, v, x) = \frac{\|x\|_X^2}{2\theta} \quad \forall z = (u, v, x) \in X \times X \times X \qquad (36)$$

and the sequence $\{z_k = (u_k, v_k, x_k)\}$ generated according to the relaxed PR splitting method (21) with any $\theta > 0$. Also, define the sequences $\{\varepsilon_k\}$, $\{\lambda_k\}$ and $\{r_k\}$ as

$$\varepsilon_k := 0, \qquad \lambda_k := 1, \qquad r_k := \nabla (dw)_{z_k}(z_{k-1}) \quad \forall k \ge 1, \qquad (37)$$

and the sequence $\{\tilde z_k = (u_k, v_k, \tilde x_k)\}$ as in (28) with $\tilde\theta$ given by (35). Then, for every $k \ge 1$, we have:
(a) $r_k = (0, 0, x_{k-1} - x_k)/\theta = (0, 0, u_k - v_k) = \gamma(0, 0, a_k + b_k)$;
(b) $(\lambda_k, z_{k-1})$ and $(z_k, \tilde z_k, \varepsilon_k)$ satisfy (9) with $T = L_{\tilde\theta} + \gamma C$, i.e., $r_k \in (L_{\tilde\theta} + \gamma C)(\tilde z_k)$;
(c) $(\lambda_k, z_{k-1})$ and $(z_k, \tilde z_k, \varepsilon_k)$ satisfy (10) with $\sigma = (\theta/\tilde\theta - 1)^2$ and $w$ as in (36).

As a consequence, the relaxed PR splitting method with $\theta \in (0, 2\theta_0)$ (resp., $\theta = 2\theta_0$) is an NE-HPE instance with respect to the monotone inclusion $0 \in (L_{\tilde\theta} + \gamma C)(z)$ in which $\sigma < 1$ (resp., $\sigma = 1$), $\varepsilon_k = 0$ and $\lambda_k = 1$ for every $k$.

Proof (a) The first identity in (a) follows from (36) and the definition of $r_k$ in (37). The second and third equalities in (a) are due to the second identity in (21) and relation (32), respectively.

(b) This statement follows from (a) and (33).

(c) Using the second identity in (21), relation (36) and the definition of $\tilde x_k$ in (28), we conclude that for any $\theta \in (0, 2\theta_0]$,

$$(dw)_{z_k}(\tilde z_k) = \frac{\|\tilde x_k - x_k\|_X^2}{2\theta} = \left(\frac{\theta}{\tilde\theta} - 1\right)^2 \frac{\|\tilde x_k - x_{k-1}\|_X^2}{2\theta} = \left(\frac{\theta}{\tilde\theta} - 1\right)^2 (dw)_{z_{k-1}}(\tilde z_k)$$

and hence that (10) is satisfied with $\sigma = (1 - \theta/\tilde\theta)^2$. The last conclusion follows from statements (b) and (c), and Proposition 4.2(b).

We now make a remark about the special case of Proposition 4.4 in which $\theta \in (0, \theta_0]$. Indeed, in this case, $\tilde\theta = \theta$, and hence $\sigma = 0$ and $\tilde z_k = z_k$ for every $k \ge 1$. Thus, the relaxed PR splitting method with $\theta \in (0, \theta_0]$ can be viewed as an exact non-Euclidean proximal point method with distance generating function $w$ as in (36) with respect to the monotone inclusion $0 \in T(z) := (L_{\theta} + \gamma C)(z)$. Note also that the latter inclusion depends on $\theta$.

As a consequence of Proposition 4.4, we are now ready to describe the pointwise and ergodic convergence rate for the relaxed PR splitting method. We first endow the space $\mathcal{Z} := X \times X \times X$ with the semi-norm $\|(u, v, x)\| := \|x\|_X$ and hence Proposition 2.1 implies that

$$\|(0, 0, x)\|_* = \|x\|_X. \qquad (38)$$

It is also easy to see that the distance generating function $w$ defined in (36) is in $\mathcal{D}_Z(m, M)$ with respect to $\|\cdot\|$ where $M = m = 1/\theta$ (see Definition 3.1).

Our next goal is to state a pointwise convergence rate bound for the relaxed PR splitting method. We start by stating a technical result which is well-known for the case where $\beta = 0$ (see for example Lemma 2.4 of [18]).
The proof for the general case, i.e., $\beta \ge 0$, is similar and is given in the "Appendix" for the sake of completeness.

Lemma 4.5 Assume that $\theta \in (0, 2\theta_0]$. Then, for every $k \ge 2$, we have $\|\Delta x_k\|_X \le \|\Delta x_{k-1}\|_X$ where $\Delta x_k := x_k - x_{k-1}$.

We now state the pointwise convergence rate result for the relaxed PR splitting method.

Theorem 4.6 Consider the sequence $\{z_k = (u_k, v_k, x_k)\}$ generated by the relaxed PR splitting method with $\theta \in (0, 2\theta_0)$. Then, for every $k \ge 1$ and $z^* = (u^*, u^*, x^*) \in (L_{\tilde\theta} + \gamma C)^{-1}(0)$,

$$a_k + b_k \in A(u_k) + B(v_k), \qquad \gamma\|a_k + b_k\|_X = \|u_k - v_k\|_X \le \frac{2\|x_0 - x^*\|_X}{\sqrt{k\,\theta\,(2\tilde\theta - \theta)}}.$$

Proof The inclusion and the equality in the theorem follow from (32). Since by Proposition 4.4, the relaxed PR splitting method with $\theta \in (0, 2\theta_0)$ is an NE-HPE instance for solving the monotone inclusion $0 \in (L_{\tilde\theta} + \gamma C)(z)$ in which $\sigma = (\theta/\tilde\theta - 1)^2 < 1$, $\varepsilon_k = 0$ and $\lambda_k = 1$ for all $k \ge 1$, it follows from Lemma 4.5, Theorem 3.3, the fact that $M = m = 1/\theta$, and relation (12) that

$$\|r_k\|_* \le \frac{\sqrt{2}\,M\sqrt{(dw)_0}}{\sqrt{m}}\cdot\frac{1 + \sqrt{\sigma}}{\sqrt{1 - \sigma}}\cdot\frac{1}{\sqrt{k}} \le \frac{2\|x_0 - x^*\|_X}{\sqrt{k\,\theta\,(2\tilde\theta - \theta)}}.$$

The inequality of the theorem then follows by Proposition 4.4(a) and relation (38).

Our main goal in the remaining part of this section is to derive ergodic convergence rate bounds for the relaxed PR splitting method for any $\theta \in (0, 2\theta_0]$. We start by stating the following variation of the transportation lemma for maximal $\beta$-strongly monotone operators.

Proposition 4.7 Assume that $T$ is a maximal $\beta$-strongly monotone operator for some $\beta \ge 0$. Assume also that $t_i \in T(u_i)$ for $i = 1, \ldots, k$, and define

$$\bar t_k = \frac{1}{k}\sum_{i=1}^{k} t_i, \qquad \bar u_k = \frac{1}{k}\sum_{i=1}^{k} u_i, \qquad \varepsilon_k = \frac{1}{k}\sum_{i=1}^{k} \langle t_i - \beta u_i, u_i - \bar u_k \rangle_X. \qquad (39)$$

Then, $\varepsilon_k \ge 0$ and $\bar t_k \in T^{[\varepsilon_k]}(\bar u_k)$.

Proof The assumption that $T$ is a maximal $\beta$-strongly monotone operator implies that $T - \beta I$ is maximal monotone. Hence, it follows from the weak transportation formula (see Theorem 2.3 of [5]) applied to $T - \beta I$ that $\varepsilon_k \ge 0$ and $\bar t_k - \beta \bar u_k \in (T - \beta I)^{[\varepsilon_k]}(\bar u_k)$. The result then follows by observing that $(T - \beta I)^{[\varepsilon_k]}(\bar u_k) + \beta \bar u_k \subseteq T^{[\varepsilon_k]}(\bar u_k)$.

In order to state the ergodic iteration complexity bound for the relaxed PR splitting method, we introduce the ergodic sequences

$$\bar u_k = \frac{1}{k}\sum_{i=1}^{k} u_i, \qquad \bar v_k = \frac{1}{k}\sum_{i=1}^{k} v_i, \qquad \bar a_k = \frac{1}{k}\sum_{i=1}^{k} a_i, \qquad \bar b_k = \frac{1}{k}\sum_{i=1}^{k} b_i \qquad (40)$$

and the scalar sequences

$$\varepsilon_k' := \frac{1}{k}\sum_{i=1}^{k} \langle a_i - \beta u_i, u_i - \bar u_k \rangle_X, \qquad \varepsilon_k'' := \frac{1}{k}\sum_{i=1}^{k} \langle b_i - \beta v_i, v_i - \bar v_k \rangle_X. \qquad (41)$$

Theorem 4.8 Assume that $\theta \in (0, 2\theta_0]$ and consider the ergodic sequences above. Then, for every $k \ge 1$ and $z^* = (u^*, u^*, x^*) \in (L_{\tilde\theta} + \gamma C)^{-1}(0)$,

$$\bar a_k \in A^{[\varepsilon_k']}(\bar u_k), \qquad \bar b_k \in B^{[\varepsilon_k'']}(\bar v_k),$$

$$\gamma\|\bar a_k + \bar b_k\|_X = \|\bar u_k - \bar v_k\|_X \le \frac{2\|x_0 - x^*\|_X}{k\theta},$$

$$\varepsilon_k' + \varepsilon_k'' \le \frac{3\bigl(1 + 2(1 - \tilde\theta/\theta)^2\bigr)\|x_0 - x^*\|_X^2}{k\gamma\theta}.$$

Proof The first two inclusions follow from the two inclusions in (30) and (31), relation (40), Assumption B0 and Proposition 4.7. We will now derive the equality and the two inequalities of the theorem using the fact that the relaxed PR splitting method with $\theta \in (0, 2\theta_0]$ is an instance of the NE-HPE method. Letting $\lambda_k = 1$, $\varepsilon_k = 0$ for every $k$ and $\tilde z_k^a$, $r_k^a$ and $\varepsilon_k^a$ be as in (14), we easily see from Proposition 4.4(a) and (40) that

$$r_k^a = (0, 0, \bar u_k - \bar v_k) = \gamma(0, 0, \bar a_k + \bar b_k), \qquad (42)$$

$$\varepsilon_k^a = \frac{1}{k}\sum_{i=1}^{k} \langle r_i, \tilde z_i - \tilde z_k^a \rangle. \qquad (43)$$

We claim that

$$\varepsilon_k^a \ge \gamma(\varepsilon_k' + \varepsilon_k''). \qquad (44)$$

Before proving this claim, we will use it to complete the proof of the theorem. Indeed, using the definition of $w$ in (36), relations (12), (38), (42) and (44), the conclusion of Proposition 4.4, and Theorem 3.4 with $T = L_{\tilde\theta} + \gamma C$, $M = m = 1/\theta$ and $\lambda_k = 1$ for all $k$, we conclude that

$$\gamma\|\bar a_k + \bar b_k\|_X = \|\bar u_k - \bar v_k\|_X = \|r_k^a\|_* \le \frac{2\bigl(2(dw)_0\bigr)^{1/2}}{k\sqrt{\theta}} \le \frac{2\|x_0 - x^*\|_X}{k\theta}$$

and

$$\gamma(\varepsilon_k' + \varepsilon_k'') \le \varepsilon_k^a \le \frac{3\bigl(2(dw)_0 + \rho_k\bigr)}{k} \le \frac{3\bigl(\|x_0 - x^*\|_X^2 + \theta\rho_k\bigr)}{k\theta} \qquad (45)$$

where $\rho_k$ is defined in (15).
Moreover, using (15), the definition of $w$ in (36), the definition of $x_i$ and $\tilde x_i$ in (21) and (28), respectively, the triangle inequality, and Proposition 3.2(a), we conclude that
\[ \rho_k := \max_{i=1,\ldots,k} (dw)_{z_i}(\tilde z_i) = \max_{i=1,\ldots,k} \frac{\|x_i - \tilde x_i\|_X^2}{2\theta} = (1 - \tilde\theta/\theta)^2 \max_{i=1,\ldots,k} \frac{\|x_i - x_{i-1}\|_X^2}{2\theta} \le \frac{2(1 - \tilde\theta/\theta)^2 \|x_0 - x^*\|_X^2}{\theta}. \]
The inequalities of the theorem now follow from the above three relations.

In the remaining part of the proof, we establish our previous claim (44). By Proposition 4.4(a) and relations (33) and (43), we have
\[ k\varepsilon_k^a = \sum_{i=1}^k \langle r_i, \tilde z_i - \tilde z_k^a \rangle = \sum_{i=1}^k \langle L_{\tilde\theta}(\tilde z_i) + \gamma c_i, \tilde z_i - \tilde z_k^a \rangle \tag{46} \]
where $c_i$ is defined in (34). Moreover, we have
\[ \begin{aligned} \sum_{i=1}^k \langle L_{\tilde\theta}(\tilde z_i) + \gamma c_i, \tilde z_i - \tilde z_k^a \rangle &= \sum_{i=1}^k \langle L_{\tilde\theta}(\tilde z_i - \tilde z_k^a), \tilde z_i - \tilde z_k^a \rangle + \gamma \sum_{i=1}^k \langle c_i, \tilde z_i - \tilde z_k^a \rangle \\ &= (1 - \tilde\theta) \sum_{i=1}^k \|(u_i - \bar u_k) - (v_i - \bar v_k)\|_X^2 + \gamma \sum_{i=1}^k \langle c_i, \tilde z_i - \tilde z_k^a \rangle \\ &\ge -\frac{\gamma\beta}{2} \sum_{i=1}^k \|(u_i - \bar u_k) - (v_i - \bar v_k)\|_X^2 + \gamma \sum_{i=1}^k \langle c_i, \tilde z_i - \tilde z_k^a \rangle, \end{aligned} \tag{47} \]
where the second equality follows from (25) and the definitions of $\tilde z_k^a$ in (14), $\tilde z_i$ in (28), and $\bar u_k$ and $\bar v_k$ in (40), and the inequality follows from (27) and the fact that $\tilde\theta \le \theta_0$ in view of (35). Finally, using the definitions of $\tilde z_k^a$ in (14), and $\tilde z_i$ and $c_i$ in Lemma 4.3, and the straightforward relation
\[ -\frac{1}{2} \sum_{i=1}^k \|(u_i - \bar u_k) - (v_i - \bar v_k)\|_X^2 \ge -\sum_{i=1}^k \left( \langle u_i, u_i - \bar u_k \rangle_X + \langle v_i, v_i - \bar v_k \rangle_X \right), \]
we conclude from (46) and (47) that
\[ \varepsilon_k^a \ge \frac{\gamma}{k} \sum_{i=1}^k \left( \langle a_i - \beta u_i, u_i - \bar u_k \rangle_X + \langle b_i - \beta v_i, v_i - \bar v_k \rangle_X \right), \]
and hence that the claim holds in view of (41).

We now make some remarks about the convergence rate bounds obtained in Theorem 4.8. In view of Lemma 4.1, $x^*$ depends on $\gamma$ according to
\[ x^* = \gamma a^* + u^*, \qquad a^* \in A(u^*) \cap -B(u^*). \]
Hence, letting
\[ d_0 := \inf\{ \|x_0 - u^*\|_X : u^* \in (A + B)^{-1}(0) \}, \]
\[ S := \sup\{ \|a^*\|_X : a^* \in A(u^*) \cap -B(u^*),\ u^* \in (A + B)^{-1}(0) \}, \]
and assuming that $S < \infty$, it is easy to see that Theorem 4.8 and (27) imply that the relaxed PR splitting method with $\theta = 2\theta_0$ satisfies
\[ \|\bar a_k + \bar b_k\|_X \le \frac{C_1(\gamma)}{\gamma k}, \qquad \|\bar u_k - \bar v_k\|_X \le \frac{C_1(\gamma)}{k}, \qquad \varepsilon'_k + \varepsilon''_k \le \frac{C_2(\gamma)}{k} \]
where
\[ C_1(\gamma) = C_1(\gamma; \beta, d_0) = \Theta\!\left( \frac{d_0 + \gamma S}{1 + \beta\gamma} \right), \qquad C_2(\gamma) = C_2(\gamma; \beta, d_0) = \Theta\!\left( \frac{(d_0 + \gamma S)^2}{\gamma(1 + \beta\gamma)} \right). \]
When $S/\beta \ge d_0$, then $\gamma = d_0/S$ minimizes both $C_1(\cdot)$ and $C_2(\cdot)$ up to a multiplicative constant, in which case $C_1^* = \Theta(d_0)$, $C_1^*/\gamma = \Theta(S)$ and $C_2^* = \Theta(S d_0)$ where
\[ C_1^* = C_1^*(\beta, d_0) := \inf\{ C_1(\gamma) : \gamma > 0 \}, \qquad C_2^* = C_2^*(\beta, d_0) := \inf\{ C_2(\gamma) : \gamma > 0 \}. \]
Note that this case includes the case in which $\beta = 0$. On the other hand, when $S/\beta < d_0$, then both $C_1$ and $C_2$ are minimized up to a multiplicative constant by any $\gamma \ge d_0/S$, in which case $C_1^* = \Theta(S/\beta)$ and $C_2^* = \Theta(S^2/\beta)$. Clearly, in this case, $C_1^*/\gamma$ converges to zero as $\gamma$ tends to infinity.

Indeed, assume first that $S/\beta \ge d_0$. Then, up to some multiplicative constants, we have
\[ C_1(\gamma) \ge \frac{d_0 + \gamma S}{1 + \beta\gamma} \ge \frac{d_0 + \gamma S}{1 + S\gamma/d_0} = d_0, \]
\[ C_2(\gamma) \ge \frac{(d_0 + \gamma S)^2}{\gamma(1 + \beta\gamma)} \ge \frac{(d_0 + \gamma S)^2}{\gamma(1 + S\gamma/d_0)} = \frac{d_0 (d_0 + \gamma S)}{\gamma} = \frac{d_0^2}{\gamma} + S d_0, \]
and hence that $C_1^* = \Omega(d_0)$ and $C_2^* = \Omega(S d_0)$. Moreover, if $\gamma = d_0/S$, then the assumption $S/\beta \ge d_0$ implies that $\beta\gamma \le 1$, and hence that $C_1^* = O(d_0)$ and $C_2^* = O(S d_0)$. Assume now that $S/\beta < d_0$. Then, up to multiplicative constants, it is easy to see that
\[ C_1(\gamma) \ge \frac{d_0 + \gamma S}{1 + \beta\gamma} \ge \frac{S}{\beta}, \]
\[ C_2(\gamma) \ge \frac{(d_0 + \gamma S)^2}{\gamma(1 + \beta\gamma)} \ge \frac{(S/\beta + \gamma S)^2}{\gamma(1 + \beta\gamma)} = \frac{S^2}{\gamma\beta^2}(1 + \gamma\beta), \]
and hence that $C_1^* = \Omega(S/\beta)$ and $C_2^* = \Omega(S^2/\beta)$. Moreover, if $\gamma \ge d_0/S$, then it is easy to see that $C_1^* = O(S/\beta)$ and $C_2^* = O(S^2/\beta)$.

Based on the above discussion, the choice $\gamma = d_0/S$ is optimal but has the disadvantage that $d_0$ is generally difficult to compute.
One possibility around this difficulty is to use $\gamma = D_0/S$ where $D_0$ is an upper bound on $d_0$.

5 On the convergence of the relaxed PR splitting method

This section discusses some new convergence results about the sequence generated by the relaxed PR splitting method for the case in which $\beta > 0$. It contains two subsections. As observed in the Introduction, [12] already establishes the convergence of the relaxed PR sequence for the case in which $\beta \ge 0$ and $\theta < 2\theta_0$. The first subsection establishes convergence of the relaxed PR sequence for the case in which $\beta > 0$ and $\theta = 2\theta_0$. The second subsection describes an instance showing that the relaxed PR splitting method may diverge when $\beta \ge 0$ and $\theta \ge \min\{2(1 + \gamma\beta),\ 2 + \gamma\beta + 1/(\gamma\beta)\}$. (Here, we assume that $1/0 = \infty$.) Note that this instance, specialized to the case $\beta = 0$, shows that the sequence $\{z_k = (u_k, v_k, x_k)\}$ generated by the relaxed PR splitting method with $\beta = 0$ may diverge for any $\theta \ge 2$, and hence that the convergence result obtained for any $\theta \in (0, 2)$ in [12] cannot be improved.

5.1 Convergence result about the relaxed PR sequence

It is known that the sequence $\{z_k = (u_k, v_k, x_k)\}$ generated by the relaxed PR splitting method with $\theta \in (0, 2\theta_0)$ and $\beta \ge 0$ converges [12]. The main result of this subsection, namely Theorem 5.2, establishes convergence of this sequence for $\theta = 2\theta_0$ when $\beta > 0$. We start by giving a lemma which is used in the proof of Theorem 5.2.

Lemma 5.1 Consider the sequence $\{z_k = (u_k, v_k, x_k)\}$ generated by the relaxed PR splitting method with $\theta \in (0, 2\theta_0]$ and the sequence $\{\tilde z_k = (u_k, v_k, \tilde x_k)\}$ defined in (28). Then, the sequences $\{z_k\}$ and $\{\tilde z_k\}$ are bounded.

Proof The assumption that $\theta \in (0, 2\theta_0]$ together with the last conclusion of Proposition 4.4 imply that the relaxed PR splitting method is an NE-HPE instance with $\sigma \le 1$.
Hence, for any $z^* \in (L_{\tilde\theta} + \gamma C)^{-1}(0)$, it follows from Proposition 3.2(a) that the sequence $\{(dw)_{z_k}(z^*)\}$ is non-increasing where $w$ is the distance generating function given by (36). Clearly, this observation implies that $\{x_k\}$ is bounded. This conclusion together with (20) and the nonexpansiveness of $J_{\gamma A}$, $J_{\gamma B}$ imply that $\{u_k\}$ and $\{v_k\}$ are also bounded. Finally, $\{\tilde x_k\}$ is bounded due to the definition of $\tilde x_k$ in (28), and the boundedness of $\{x_k\}$, $\{u_k\}$ and $\{v_k\}$.

As mentioned at the beginning of this subsection, the convergence of $\{(u_k, v_k)\}$ to some pair $(u^*, u^*)$ where $u^* \in (A + B)^{-1}(0)$ has been established in [12] for the case in which $\beta > 0$ and $\theta < 2\theta_0$. The following result shows that the latter conclusion can also be extended to $\theta = 2\theta_0$.

Theorem 5.2 In addition to Assumption B1, assume that Assumption B0 holds with $\beta > 0$. Then, the sequence $\{z_k = (u_k, v_k, x_k)\}$ generated by the relaxed PR splitting method with $\theta = 2\theta_0$ converges to some point lying in $(L_{\theta_0} + \gamma C)^{-1}(0)$.

Proof We assume that $\theta = 2\theta_0$ and without any loss of generality that $\gamma = 1$. In view of (35), we have $\tilde\theta = \theta_0$. Let $z^* \in (L_{\theta_0} + C)^{-1}(0)$. Then, by Lemma 4.1, we have $z^* = (u^*, u^*, x^*)$ where
\[ u^* = (A + B)^{-1}(0), \qquad x^* - u^* \in A(u^*), \qquad -x^* + u^* \in B(u^*). \tag{48} \]
Since $\tilde\theta = \theta_0$, it follows from Proposition 4.4(b) that $r_k \in (L_{\theta_0} + C)(\tilde z_k)$. This together with the fact that $0 \in (L_{\theta_0} + C)(z^*)$, inequality (26) with $(z, r) = (\tilde z_k, r_k)$ and $(z', r') = (z^*, 0)$, and the fact that $\theta_0 = 1 + \beta/2$, then imply that
\[ \langle r_k, \tilde z_k - z^* \rangle \ge (1 - \theta_0) \|(u_k - u^*) - (v_k - u^*)\|_X^2 + \beta \left( \|u_k - u^*\|_X^2 + \|v_k - u^*\|_X^2 \right) = \frac{\beta}{2} \|u_k + v_k - 2u^*\|_X^2 \ge 0. \tag{49} \]
Since the last conclusion of Proposition 4.4 states that the relaxed PR splitting method with $\theta = 2\theta_0$ is an NE-HPE instance with respect to the monotone inclusion $0 \in (L_{\tilde\theta} + \gamma C)(z)$ in which $\sigma = 1$, $\lambda_k = 1$ and $\varepsilon_k = 0$ for every $k$, it follows from Proposition 3.2(b), (49) and the assumption that $\beta > 0$ that
\[ \lim_{k \to \infty} \langle r_k, \tilde z_k - z^* \rangle = \lim_{k \to \infty} \|u_k + v_k - 2u^*\|_X = 0. \tag{50} \]
By Lemma 5.1, $\{z_k\}$, and hence $\{x_k\}$, is bounded. Therefore, there exist an infinite index set $K$ and $\bar x \in X$ such that $\lim_{k \in K} x_{k-1} = \bar x$, from which we conclude that
\[ \lim_{k \in K} u_k = \bar u := J_A(\bar x), \qquad \lim_{k \in K} v_k = \bar v := J_B(2\bar u - \bar x) \tag{51} \]
in view of (20), (21) and the continuity of the point-to-point maps $J_A$ and $J_B$. Clearly, relations (21), (50) and (51), Proposition 4.4(a), the definitions of the resolvent $J$ following (2) and of $\tilde z_k$ in (28), and the fact that $\tilde\theta = \theta_0$, imply that
\[ \bar u + \bar v = 2u^*, \qquad 2\bar u - \bar v - \bar x \in B(\bar v), \qquad \lim_{k \in K} z_k = (\bar u, \bar v, \bar x + \theta(\bar v - \bar u)), \tag{52} \]
\[ \lim_{k \in K} r_k = (0, 0, \bar u - \bar v), \qquad \lim_{k \in K} \tilde z_k = \tilde z := (\bar u, \bar v, \bar x + \theta_0(\bar v - \bar u)). \tag{53} \]
Clearly, (50) and (53) imply that
\[ 0 = \lim_{k \in K} \langle r_k, \tilde z_k - z^* \rangle = \langle \bar u - \bar v, \bar x + \theta_0(\bar v - \bar u) - x^* \rangle_X = -\theta_0 \|\bar u - \bar v\|_X^2 + \langle \bar u - \bar v, \bar x - x^* \rangle_X. \tag{54} \]
Using the second inclusion in (48), the identity and the inclusion in (52), the $\beta$-strong monotonicity of $B$, and relation (54), we then conclude that
\[ \begin{aligned} \frac{\beta}{4} \|\bar v - \bar u\|_X^2 = \beta \|\bar v - u^*\|_X^2 &\le \langle (2\bar u - \bar v - \bar x) - (u^* - x^*), \bar v - u^* \rangle_X \\ &= \left\langle \tfrac{3}{2}(\bar u - \bar v) - \bar x + x^*, \tfrac{1}{2}(\bar v - \bar u) \right\rangle_X \\ &= \frac{1}{2} \langle \bar x - x^*, \bar u - \bar v \rangle_X - \frac{3}{4} \|\bar v - \bar u\|_X^2 = \left( \frac{\theta_0}{2} - \frac{3}{4} \right) \|\bar v - \bar u\|_X^2. \end{aligned} \]
The latter inequality together with the fact that $\theta_0 = 1 + (\beta/2)$ then imply that $\bar u = \bar v = u^*$ where the last equality is due to the identity in (52). We have thus shown that $\{u_k\}_{k \in K}$ and $\{v_k\}_{k \in K}$ both converge to $u^* = (A + B)^{-1}(0)$. Since $\bar u = \bar v = u^*$, it follows from (52) and (53) that
\[ \lim_{k \in K} r_k = 0, \qquad \lim_{k \in K} z_k = \lim_{k \in K} \tilde z_k = \tilde z = (u^*, u^*, \bar x), \]
\[ \lim_{k \in K} (dw)_{z_k}(\tilde z_k) = \lim_{k \in K} \frac{\|x_k - \tilde x_k\|_X^2}{2\theta} = \frac{\|\bar x - \bar x\|_X^2}{2\theta} = 0. \]
Hence, Proposition 3.5 with $T = L_{\theta_0} + \gamma C$ implies that $\tilde z \in (L_{\theta_0} + \gamma C)^{-1}(0)$ and $0 = \lim_{k \to \infty} (dw)_{z_k}(\tilde z) = \lim_{k \to \infty} \|\bar x - x_k\|_X^2 / (2\theta)$. We thus conclude that $\{z_k\}$ converges to $(u^*, u^*, \bar x) = \tilde z \in (L_{\theta_0} + \gamma C)^{-1}(0)$.

Before ending this subsection, we make two remarks.
First, for a fixed $\tau > 0$, consider the set
\[ R(\tau) := \{ (\beta, \theta) \in \mathbb{R}^2 : \beta > 0,\ 0 < \theta \le 2 + \tau\beta \}. \]
Then, it follows from Theorem 5.2 and the observation in the paragraph preceding it that the sequence generated by the relaxed PR splitting method with relaxation parameter $\theta$ to solve (1) with $A$, $B$ maximal $\beta$-strongly monotone converges for any $(\beta, \theta) \in R(1)$. Second, it follows from the example presented in the next subsection that the above conclusion fails if $R(1)$ is enlarged to the region $R(\tau)$ for any $\tau > 1$.

5.2 Non-convergent instances for $\theta \ge \min\{2 + 2\gamma\beta,\ 2 + \gamma\beta + 1/(\gamma\beta)\}$

By [12] and Theorem 5.2, the sequence $\{x_k\}$ generated by the relaxed PR splitting method converges whenever either $\theta \in (0, 2 + \gamma\beta)$ or $\theta = 2 + \gamma\beta$ and $\beta > 0$. This subsection gives an instance of (1), where $A$, $B$ are maximal $\beta$-strongly monotone, for which the sequence $\{x_k\}$ generated by the relaxed PR splitting method with relaxation parameter $\theta$ does not converge when $\beta \ge 0$ and $\theta \ge \min\{2(1 + \gamma\beta),\ 2 + \gamma\beta + 1/(\gamma\beta)\}$. Recall from (20) and (21) that the relaxed PR splitting method iterates as
\[ x_{k+1} = x_k + \theta \left( J_{\gamma B}(2 J_{\gamma A}(x_k) - x_k) - J_{\gamma A}(x_k) \right) \tag{55} \]
where $\theta > 0$. Without any loss of generality, we assume that $\gamma = 1$ in (55).

We now describe our instance. First, let $X := \tilde X \times \tilde X$ where $\tilde X$ is a finite-dimensional real vector space, and let $A_0, B_0 : X \rightrightarrows X$ be defined as
\[ A_0(\tilde x_1, \tilde x_2) = (0, 0), \qquad B_0(\tilde x_1, \tilde x_2) = N_{\{0\}}(\tilde x_1) \times \{0\}, \qquad \forall x = (\tilde x_1, \tilde x_2) \in \tilde X \times \tilde X \]
where $N_{\{0\}}(\cdot)$ denotes the normal cone operator of the set $\{0\}$. Clearly, $A_0$ and $B_0$ are both maximal monotone operators and
\[ J_{A_0}(\tilde x_1, \tilde x_2) = (\tilde x_1, \tilde x_2), \qquad J_{B_0}(\tilde x_1, \tilde x_2) = (0, \tilde x_2), \qquad \forall x = (\tilde x_1, \tilde x_2) \in \tilde X \times \tilde X. \]
Now define $A := A_0 + \beta I$ and $B := B_0 + \bar\beta I$ where $\bar\beta \ge \beta \ge 0$. It follows that $A$ is a $\beta$-strongly maximal monotone operator and $B$ is a $\bar\beta$-strongly maximal monotone operator.
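
As an illustration only, the instance just defined can be simulated in the simplest setting $\tilde X = \mathbb{R}$ with $\gamma = 1$. The sketch below is not taken from the paper: the function names are hypothetical, and it assumes the closed-form resolvents $J_A(x) = x/(1+\beta)$ and $J_B(x) = (0, x_2/(1+\bar\beta))$, which follow from $A = \beta I$ and $B = N_{\{0\}} \times \{0\} + \bar\beta I$. The helper `divergence_threshold` returns the smallest $\theta$ at which one of the two coordinate multipliers of iteration (55) reaches $-1$.

```python
import math


def relaxed_pr_step(x, theta, beta, beta_bar):
    """One iteration of (55) with gamma = 1 on X = R x R (illustrative sketch).

    Assumes A = beta*I and B = N_{0} x {0} + beta_bar*I, so that
    J_A(x) = x / (1 + beta) and J_B(y) = (0, y2 / (1 + beta_bar)).
    """
    x1, x2 = x
    u1, u2 = x1 / (1.0 + beta), x2 / (1.0 + beta)   # u = J_A(x)
    y2 = 2.0 * u2 - x2                              # second coordinate of 2u - x
    v1, v2 = 0.0, y2 / (1.0 + beta_bar)             # v = J_B(2u - x)
    return (x1 + theta * (v1 - u1), x2 + theta * (v2 - u2))


def divergence_threshold(beta, beta_bar):
    """min{2(1+beta), 2 + 2(1+beta*beta_bar)/(beta+beta_bar)}, with 1/0 = inf."""
    first = 2.0 * (1.0 + beta)
    s = beta + beta_bar
    second = math.inf if s == 0.0 else 2.0 + 2.0 * (1.0 + beta * beta_bar) / s
    return min(first, second)


def run(x0, theta, beta, beta_bar, iters):
    """Apply the relaxed PR step `iters` times starting from x0."""
    x = x0
    for _ in range(iters):
        x = relaxed_pr_step(x, theta, beta, beta_bar)
    return x
```

With $\beta = \bar\beta = 0.5$ the threshold is $3$: running `run((1.0, 1.0), 2.5, 0.5, 0.5, 200)` uses $\theta = 2 + \gamma\beta$ and drives the iterate toward the origin, whereas $\theta = 3$ makes the first coordinate alternate in sign without shrinking.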
Note that the instance we are describing is slightly more general in that $A$ and $B$ have different strong monotonicity parameters. Moreover, for any $x = (\tilde x_1, \tilde x_2) \in X$, it is easy to see that
\[ J_A(x) = J_{\frac{1}{1+\beta} A_0}\!\left( \frac{1}{1+\beta} (\tilde x_1, \tilde x_2) \right) = \frac{1}{1+\beta} (\tilde x_1, \tilde x_2), \]
\[ J_B(x) = J_{\frac{1}{1+\bar\beta} B_0}\!\left( \frac{1}{1+\bar\beta} (\tilde x_1, \tilde x_2) \right) = \frac{1}{1+\bar\beta} (0, \tilde x_2), \]
and hence that
\[ x + \theta \left[ J_B(2 J_A(x) - x) - J_A(x) \right] = \left( \frac{1 + \beta - \theta}{1 + \beta}\, \tilde x_1,\ \left( 1 - \frac{(\beta + \bar\beta)\theta}{(1 + \beta)(1 + \bar\beta)} \right) \tilde x_2 \right). \tag{56} \]
From (55) and (56), we easily see that the sequence $\{x_k\}$ generated by the relaxed PR splitting method diverges whenever
\[ \frac{1 + \beta - \theta}{1 + \beta} \le -1, \qquad \text{or} \qquad 1 - \frac{(\beta + \bar\beta)\theta}{(1 + \beta)(1 + \bar\beta)} \le -1, \]
or equivalently, whenever
\[ \theta \ge \min\left\{ 2(1 + \beta),\ 2 + \frac{2(1 + \beta\bar\beta)}{\beta + \bar\beta} \right\}. \]
Note that when $\bar\beta = \beta$, the above inequality reduces to $\theta \ge \min\{2(1 + \beta),\ 2 + \beta + 1/\beta\}$.

Before ending this subsection, we make two remarks. First, when $\beta = 0$ and hence $A$ is not strongly monotone, the sequence $\{x_k\}$ for the above example diverges for any $\theta \ge 2$ even if $B$ is strongly monotone, i.e., $\bar\beta > 0$. Second, the above example specialized to the case in which $\bar\beta = \beta$ easily shows that the sequence generated by the relaxed PR splitting method does not necessarily converge for any $(\beta, \theta) \in R(\tau)$ if $\tau > 1$ where $R(\tau)$ is defined at the end of Sect. 5.1.

6 Numerical study

This section illustrates the behavior of the relaxed PR splitting method for solving the weighted Lasso minimization problem [16] (see also [6])
\[ \min_{u \in \mathbb{R}^n} f(u) + g(u), \tag{57} \]
where $f(u) := \frac{1}{2}\|Cu - b\|^2$ and $g(u) := \|Wu\|_1$ for every $u \in \mathbb{R}^n$. Our numerical experiments consider instances where $n = 200$, $b \in \mathbb{R}^{300}$ and $C \in \mathbb{R}^{300 \times 200}$ is a sparse data matrix with an average of 10 nonzero entries per row. Each component of $b$ and each nonzero element of $C$ is drawn from a Gaussian distribution with zero mean and unit variance, while $W \in \mathbb{R}^{200 \times 200}$ is a diagonal matrix with positive diagonal elements drawn from a uniform distribution on the interval $[0, 1]$. This setup follows that of [16].
Note that $X = \mathbb{R}^{200}$ and $\|\cdot\|_1$ is the 1-norm on $\mathbb{R}^{200}$. Observe that $f$ is $\alpha$-strongly convex on $\mathbb{R}^{200}$ (a function $f$ is $\alpha$-strongly convex on $X$ if $f - \frac{\alpha}{2}\|\cdot\|^2$ is convex on $X$), where $\alpha = \lambda_{\min}(C^T C)$ is the minimum eigenvalue of $C^T C$. Also, $f$ is differentiable and its gradient is $\kappa$-Lipschitz continuous on $\mathbb{R}^{200}$ where $\kappa = \lambda_{\max}(C^T C)$ is the maximum eigenvalue of $C^T C$. The function $g$ is clearly convex on $\mathbb{R}^{200}$.

We consider solving (57) by applying the relaxed PR splitting method (55) to solve the inclusion (1) with
\[ A = \partial f - \alpha' I, \qquad B = \partial g + \alpha' I, \tag{58} \]
where $0 \le \alpha' \le \alpha = \lambda_{\min}(C^T C)$. Since $A$ (resp., $B$) is $(\alpha - \alpha')$-strongly (resp., $\alpha'$-strongly) maximal monotone, the results developed in Sects. 4 and 5 for the relaxed PR splitting method with $(A, B)$ as above apply with $\beta = \min\{\alpha - \alpha', \alpha'\}$. Our goal in this section is to gain some intuition of how the relaxed PR splitting method performs as $\alpha'$ (and hence $\beta$), $\gamma$ and $\theta$ change. In our numerical experiments, we start the relaxed PR splitting algorithm with $x_0 = 0$ and terminate it when $\|x_{k+1} - x_k\|_X \le 10^{-5}$. The paragraphs below report the results of three experiments.

Table 1 Average number of iterations performed by the relaxed PR splitting method (55) to solve (57) based on the partition (A, B) given by (58) for 4 pairs (γ, α') and 6 different values of θ

θ            γ = 1, α' = 0    γ = 1, α' = α/2    γ = 1/√(ακ), α' = 0    γ = 1/√(ακ), α' = α/2
1            141.79           140.64             60.10                  60.11
1.25         115.96           115.06             48.47                  48.48
1.50         98.31            97.48              40.51                  40.49
1.75         85.33            84.64              34.67                  34.70
2            264.80           75.08              58.54                  42.11
2 + γα/2     > 500            73.25              74.73                  49.60

In the first experiment, we generate 100 random instances of $(C, W, b)$ and we observed that the condition $\lambda_{\min}(C^T C) > 0$ holds for all instances. The relaxed PR splitting method is used to solve these instances of (57) for various values of $\theta$ and with the pair $(\gamma, \alpha')$ taking on the values $(1, 0)$, $(1, \alpha/2)$, $(1/\sqrt{\alpha\kappa}, 0)$ and $(1/\sqrt{\alpha\kappa}, \alpha/2)$.
Note that, by Proposition 3 of [16], the choice $\gamma = 1/\sqrt{\alpha\kappa}$ is optimal for the relaxed PR splitting method when $\alpha' = 0$ and $\theta = 2$. Our results are shown in Table 1. We see from the table that, except when $\theta = 2$ or $\theta = 2 + \gamma\alpha/2$, the average numbers of iterations for $\alpha' = 0$ and $\alpha' = \alpha/2$ are similar. However, when $\theta = 2$ or $\theta = 2 + \gamma\alpha/2$, the choice $\alpha' = \alpha/2$ outperforms the one with $\alpha' = 0$. One possible explanation for this behavior is that, for these two values of $\theta$, the relaxed PR sequence converges when both operators are strongly monotone, while it does not necessarily converge when either one of the operators is only monotone. Note also that the results in the last row of the table confirm the convergence result of the relaxed PR splitting method (see Theorem 5.2) for the case in which $A$ and $B$ are $\beta$-strongly maximal monotone operators with $\beta > 0$ and $\theta = 2 + \gamma\beta$. Finally, our results (the last two rows of the table) suggest that, if $A$ is maximal $\alpha$-strongly monotone and $B$ is only maximal monotone, it might be advantageous to use the relaxed PR splitting method with $0 < \alpha' < \alpha$ (and hence $\beta > 0$) instead of $\alpha' = 0$ (and hence $\beta = 0$).

In our second experiment, we use the relaxed PR splitting method with $(\theta, \gamma)$ equal to $(2, 1)$ and $(2, 1/\sqrt{\alpha\kappa})$, and with $\alpha'$ varying from 0 to $\alpha$, to solve (57) for a randomly generated $(C, W, b)$. In this instance, $\alpha = \lambda_{\min}(C^T C) = 0.3792$ and $\kappa = \lambda_{\max}(C^T C) = 57.6624$. Our results are shown in Fig. 1. We see from Fig. 1 that the number of iterations decreases as $\alpha'$ increases in both cases. These graphs again suggest that it might be advantageous to have $A$ and $B$ maximal $\beta$-strongly monotone with $\beta > 0$. We also observe that as $\alpha'$ approaches $\alpha$, the number of iterations does not increase even though the operator $A$ is losing its strong monotonicity.
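
The iteration used in these experiments can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names and the argument `alpha_p` (standing for $\alpha'$) are hypothetical, $W$ is passed as the vector `w` of its diagonal entries, and the two resolvents are computed in closed form. For $A = \nabla f - \alpha' I$ with $f(u) = \frac{1}{2}\|Cu - b\|^2$, $J_{\gamma A}(x)$ solves $((1 - \gamma\alpha') I + \gamma C^T C) u = x + \gamma C^T b$; for $B = \partial g + \alpha' I$ with $g(u) = \|Wu\|_1$, $J_{\gamma B}(y)$ is a componentwise soft-thresholding of $y$ at level $\gamma w_i$ followed by division by $1 + \gamma\alpha'$.

```python
import numpy as np


def soft_threshold(y, tau):
    """Componentwise soft-thresholding: prox of tau_i * |.| applied to y_i."""
    return np.sign(y) * np.maximum(np.abs(y) - tau, 0.0)


def relaxed_pr_lasso(C, b, w, gamma, theta, alpha_p, tol=1e-5, max_iter=500):
    """Relaxed PR iteration (55) for the partition (58) (illustrative sketch).

    Returns the last v-iterate and the number of iterations performed.
    Starts from x_0 = 0 and stops when ||x_{k+1} - x_k|| <= tol.
    """
    n = C.shape[1]
    # Matrix of the linear system defining J_{gamma A}; it is positive
    # definite whenever alpha_p <= lambda_min(C^T C).
    H = (1.0 - gamma * alpha_p) * np.eye(n) + gamma * (C.T @ C)
    Ctb = C.T @ b
    x = np.zeros(n)
    v = x.copy()
    for k in range(1, max_iter + 1):
        u = np.linalg.solve(H, x + gamma * Ctb)                  # u = J_{gamma A}(x)
        y = 2.0 * u - x
        v = soft_threshold(y, gamma * w) / (1.0 + gamma * alpha_p)  # v = J_{gamma B}(2u - x)
        x_new = x + theta * (v - u)
        if np.linalg.norm(x_new - x) <= tol:
            return v, k
        x = x_new
    return v, max_iter
```

A sweep in the spirit of Table 1 would call `relaxed_pr_lasso` for each pair $(\gamma, \alpha')$ and each $\theta$ on randomly generated $(C, W, b)$ and average the returned iteration counts. Solving the linear system afresh at every iteration keeps the sketch short; a factorization of `H` computed once would be the natural refinement.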
In our third experiment, we performed the same numerical experiments as the ones mentioned above but with $(A, B) = (\partial g + \alpha' I, \partial f - \alpha' I)$ instead of $(A, B) = (\partial f - \alpha' I, \partial g + \alpha' I)$ and note that the results obtained were very similar to the ones reported above. Hence interchanging $A$ and $B$ in the implementation of the relaxed PR splitting method has little impact on its performance.

[Figure 1, panel (A): graph of the number of iterations as α' varies when γ = (1/(ακ))^{1/2}; panel (B): graph of the number of iterations as α' varies when γ = 1. Horizontal axis: α' from 0 to 0.4; vertical axis: number of iterations.]
Fig. 1 Two graphs showing how the number of iterations performed by (55) changes with varying α' using the partition (A, B) given by (58)

7 Concluding remarks

This paper establishes convergence of the sequence of iterates and an $O(1/k)$ ergodic convergence rate bound for the relaxed PR splitting method for any $\theta \in (0, 2 + \gamma\beta]$ by viewing it as an instance of a non-Euclidean HPE framework. It also establishes an $O(1/\sqrt{k})$ pointwise convergence rate bound for it for any $\theta \in (0, 2 + \gamma\beta)$. Furthermore, an example showing that PR iterates do not necessarily converge for $\beta \ge 0$ and $\theta \ge \min\{2(1 + \gamma\beta),\ 2 + \gamma\beta + 1/(\gamma\beta)\}$ is given. Table 2 (resp., Table 3) for the case in which $\beta = 0$ (resp., $\beta > 0$) provides a summary of the convergence rate results known so far for the relaxed PR splitting method when $(A, B) = (\partial f, \partial g)$ for some convex functions $f$ and $g$. However, we observe that some of these results also hold for pairs $(A, B)$ of maximal monotone operators which are not subdifferentials. The term "R-linear" in the tables below stands for linear convergence of the sequence $\{x_k\}$ generated by the relaxed PR splitting method.

Table 2 Summary of convergence rate results for the relaxed PR splitting method on the sum of two convex functions f, g

θ                           Rate        Type          References           Additional conditions on f, g besides convexity
(0, 2)                      O(1/√k)     Pointwise     [18]                 −
(0, 2)                      o(1/√k)     Pointwise     [10]                 −
(0, 2]                      O(1/k)      Ergodic       [10]                 −
(0, 2)                      o(1/k)      Best iterate  [11]                 ∇f or ∇g Lipschitz
1, 2                        −           R-linear      [22]                 f strongly convex and ∇f Lipschitz
(0, 2)                      −           R-linear      [15]                 g strongly convex and ∇f Lipschitz
(0, 2]                      −           R-linear      [11]                 (f or g strongly convex) and (∇f or ∇g Lipschitz)
(0, 2]                      −           R-linear      [12]                 (f or g strongly convex) and ∇f Lipschitz
(0, [·]/(1+δ)), δ ∈ (0, 1)  −           R-linear      [16] (resp., [15])   f (resp., g) strongly convex and ∇f (resp., ∇g) Lipschitz

Table 3 Summary of convergence rate results for the relaxed PR splitting method on the sum of two β-strongly convex functions f, g, with β > 0

θ              Rate        Type        References    Additional conditions on f, g besides strong convexity
(0, 2 + γβ)    O(1/√k)     Pointwise   This paper    −
(0, 2 + γβ]    O(1/k)      Ergodic     This paper    −
(0, 2]         −           R-linear    [12]          ∇f Lipschitz

We observe that our analysis in Sects. 4 and 5, in contrast to the ones in [11, 15, 16, 22], does not impose any regularity condition on $A$ and $B$ such as assuming one of them to be a Lipschitz, and hence point-to-point, operator. Also, if only one of the operators, say $A$, is assumed to be maximal $\beta$-strongly monotone, (1) is equivalent to $0 \in (A' + B')(u)$ where $A' := A - (\beta/2) I$ and $B' := B + (\beta/2) I$ are now both $(\beta/2)$-strongly monotone. Thus, to solve (1), the relaxed PR method with $(A, B)$ replaced by $(A', B')$ can be applied, thereby ensuring convergence of the iterates, as well as pointwise and ergodic convergence rate bounds, for values of $\theta \ge 2$. This idea was tested in our computational results of Sect. 6 where a weighted Lasso minimization problem [6] is solved using the partitions $(A, B)$ and $(A', B')$.
Our conclusion is that the partition $(A', B')$ substantially outperforms the other one for values of $\theta \ge 2$.

Acknowledgements We would like to thank the associate editor for handling the paper and the two anonymous reviewers for providing valuable comments to improve the paper. We further appreciate the suggestion of one of the reviewers that led us to the revised example, in Sect. 5.2, that shows possible nonconvergence of iterates generated by the relaxed PR splitting method.

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Appendix

Proof of Lemma 4.5 To simplify notation, let $\Delta x = \Delta x_k$, $\Delta x^- := \Delta x_{k-1}$, $\Delta u = u_k - u_{k-1}$, $\Delta v = v_k - v_{k-1}$, $\Delta a = a_k - a_{k-1}$, $\Delta b = b_k - b_{k-1}$. Then, it follows from the second identity in (21) and relation (29) that
\[ \Delta x = \Delta x^- + \theta(\Delta v - \Delta u), \qquad \gamma \Delta a = \Delta x^- - \Delta u, \qquad \gamma \Delta b = 2\Delta u - \Delta v - \Delta x^-. \tag{59} \]
Also, the two inclusions in (30) and (31) together with the $\beta$-strong monotonicity of $A$ and $B$ imply that
\[ \langle \Delta a, \Delta u \rangle_X \ge \beta \|\Delta u\|_X^2, \qquad \langle \Delta b, \Delta v \rangle_X \ge \beta \|\Delta v\|_X^2. \]
Combining the last two identities in (59) with the above inequalities, we obtain
\[ \langle \Delta x^- - \Delta u, \Delta u \rangle_X \ge \gamma\beta \|\Delta u\|_X^2, \qquad \langle 2\Delta u - \Delta v - \Delta x^-, \Delta v \rangle_X \ge \gamma\beta \|\Delta v\|_X^2. \]
Adding these two last inequalities and simplifying the resulting expression, we obtain
\[ \langle \Delta x^-, \Delta u - \Delta v \rangle_X + 2 \langle \Delta u, \Delta v \rangle_X \ge (1 + \gamma\beta) \left[ \|\Delta u\|_X^2 + \|\Delta v\|_X^2 \right]. \tag{60} \]
From the first equality in (59), we have
\[ 2\theta \langle \Delta x^-, \Delta u - \Delta v \rangle_X = \|\Delta x^-\|_X^2 - \|\Delta x\|_X^2 + \theta^2 \|\Delta v - \Delta u\|_X^2, \]
which, upon substitution into (60), yields
\[ \|\Delta x^-\|_X^2 - \|\Delta x\|_X^2 \ge 2\theta \left[ \left( 1 + \gamma\beta - \frac{\theta}{2} \right) \left( \|\Delta u\|_X^2 + \|\Delta v\|_X^2 \right) + 2 \left( \frac{\theta}{2} - 1 \right) \langle \Delta u, \Delta v \rangle_X \right]. \]
Note that the right-hand side in the above inequality is greater than or equal to zero if $\theta \in (0, 2\theta_0]$. Hence, if $\theta \in (0, 2\theta_0]$, we have
\[ \|\Delta x\|_X \le \|\Delta x^-\|_X. \]

References

1. Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer, Berlin (2011)
2. Bauschke, H.H., Bello Cruz, J.Y., Nghia, T.T.A., Phan, H.M., Wang, X.: The rate of linear convergence of the Douglas–Rachford algorithm for subspaces is the cosine of the Friedrichs angle. J. Approx. Theory 185, 63–79 (2014)
3. Bauschke, H.H., Bello Cruz, J.Y., Nghia, T.T.A., Phan, H.M., Wang, X.: Optimal rates of linear convergence of relaxed alternating projections and generalized Douglas–Rachford methods for two subspaces. Numer. Algorithms 73, 33–76 (2016)
4. Bauschke, H.H., Noll, D., Phan, H.M.: Linear and strong convergence of algorithms involving averaged nonexpansive operators. J. Math. Anal. Appl. 421, 1–20 (2015)
5. Burachik, R.S., Sagastizábal, C.A., Svaiter, B.F.: ε-enlargements of maximal monotone operators: theory and applications. Reformulation: Nonsmooth, piecewise smooth, semismooth and smoothing methods (Lausanne 1997), vol. 22 of Applied Optimization, pp. 25–43. Kluwer Academic Publishers, Dordrecht, The Netherlands (1999)
6. Candes, E.J., Wakin, M.B., Boyd, S.P.: Enhancing sparsity by reweighted ℓ1 minimization. J. Fourier Anal. Appl. 14(5–6), 877–905 (2008)
7. Combettes, P.L.: Solving monotone inclusions via compositions of nonexpansive averaged operators. Optimization 53(5–6), 475–504 (2004)
8. Combettes, P.L.: Iterative construction of the resolvent of a sum of maximal monotone operators. J. Convex Anal. 16(3), 727–748 (2009)
9. Davis, D.: Convergence rate analysis of the forward-Douglas–Rachford splitting scheme. SIAM J. Optim. 25, 1760–1786 (2015)
10. Davis, D., Yin, W.: Convergence rate analysis of several splitting schemes. arXiv preprint arXiv:1406.4834v3 (2015)
11. Davis, D., Yin, W.: Faster convergence rates of relaxed Peaceman–Rachford and ADMM under regularity assumptions. Math. Oper. Res. 42(3), 783–805 (2017)
12. Dong, Y., Fischer, A.: A family of operator splitting methods revisited. Nonlinear Anal. 72, 4307–4315 (2010)
13. Eckstein, J., Bertsekas, D.P.: On the Douglas–Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math. Program. 55, 293–318 (1992)
14. Facchinei, F., Pang, J.-S.: Finite-Dimensional Variational Inequalities and Complementarity Problems, vol. II. Springer, New York (2003)
15. Giselsson, P.: Tight global linear convergence rate bounds for Douglas–Rachford splitting. J. Fixed Point Theory Appl. 19(4), 2241–2270 (2017)
16. Giselsson, P., Boyd, S.: Linear convergence and metric selection for Douglas–Rachford splitting and ADMM. IEEE Trans. Autom. Control 62, 532–544 (2017)
17. Goncalves, M.L.N., Melo, J.G., Monteiro, R.D.C.: Improved pointwise iteration-complexity of a regularized ADMM and of a regularized non-Euclidean HPE framework. arXiv preprint arXiv:1601.01140v1 (2016)
18. He, B., Yuan, X.: On the convergence rate of Douglas–Rachford operator splitting method. Math. Program. Ser. A 153, 715–722 (2015)
19. He, Y., Monteiro, R.D.C.: Accelerating block-decomposition first-order methods for solving composite saddle-point and two-player Nash equilibrium problems. SIAM J. Optim. 25, 2182–2211 (2015)
20. He, Y., Monteiro, R.D.C.: An accelerated HPE-type algorithm for a class of composite convex–concave saddle-point problems. SIAM J. Optim. 26, 29–56 (2016)
21. Kolossoski, O., Monteiro, R.D.C.: An accelerated non-Euclidean hybrid proximal extragradient-type algorithm for convex-concave saddle-point problems. Preprint (2015)
22. Lions, P.-L., Mercier, B.: Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal. 16(6), 964–979 (1979)
23. Monteiro, R.D.C., Sicre, M.R., Svaiter, B.F.: A hybrid proximal extragradient self-concordant primal barrier method for monotone variational inequalities. SIAM J. Optim. 25(4), 1965–1996 (2015)
24. Monteiro, R.D.C., Sim, C.-K.: Complexity of the relaxed Peaceman–Rachford splitting method for the sum of two maximal strongly monotone operators. http://arxiv.org/abs/1611.03567 (2017). arXiv preprint arXiv:1611.03567v2
25. Monteiro, R.D.C., Svaiter, B.F.: On the complexity of the hybrid proximal extragradient method for the iterates and the ergodic mean. SIAM J. Optim. 20, 2755–2787 (2010)
26. Monteiro, R.D.C., Svaiter, B.F.: Complexity of variants of Tseng's modified F–B splitting and Korpelevich's methods for hemivariational inequalities with applications to saddle-point and convex optimization problems. SIAM J. Optim. 21(4), 1688–1720 (2011)
27. Monteiro, R.D.C., Svaiter, B.F.: Complexity of variants of Tseng's modified F–B splitting and Korpelevich's methods of hemi-variational inequalities with applications to saddle point and convex optimization problems. SIAM J. Optim. 21, 1688–1720 (2011)
28. Rockafellar, R.T.: On the maximality of sums of nonlinear monotone operators. Trans. Am. Math. Soc. 149, 75–88 (1970)
29. Rockafellar, R.T.: Monotone operators and the proximal point algorithm. SIAM J. Control Optim. 14, 877–898 (1976)
30. Solodov, M.V., Svaiter, B.F.: A hybrid approximate extragradient-proximal point algorithm using the enlargement of a maximal monotone operator. Set-Valued Var. Anal. 7, 323–345 (1999)
31. Solodov, M.V., Svaiter, B.F.: An inexact hybrid generalized proximal point algorithm and some new results on the theory of Bregman functions. Math. Oper. Res. 25, 214–230 (2000)
Computational Optimization and Applications
Springer Journals