Generalizations of SIP methods to systems with $$p$$ -structure

Abstract In the present article, we propose two variants of symmetric internal penalty methods for systems with $$p$$-structure. We show stability (a priori estimates) of the methods and derive error estimates. Moreover, we discuss the performance of the schemes and compare them with the local discontinuous Galerkin method. 1. The problem We consider the numerical approximation of a vectorial system of $$p$$-Laplace type, \begin{equation} \label{eq:p-lap} \begin{aligned} -{\text{div}}{\boldsymbol{\mathcal{A}}}(\nabla{\bf u})&={\bf f}\qquad&&\text{in }{\it{\Omega}}, \\ {\bf u}&={\bf u}_D \qquad&&\text{on }\partial {\it{\Omega}}, \end{aligned} \end{equation} (1.1) by means of symmetric internal penalty (SIP) approximations. For the given data $${\bf f}$$ and $${\bf u}_D$$, we seek the unknown vector field $${\bf u}=(u_1,\ldots, u_d)^\top$$ defined on $${\it{\Omega}} \subset {\mathbb{R}}^n$$. Throughout the article we assume that $${\boldsymbol{\mathcal{A}}}\colon {\mathbb{R}}^{d\times n}\to {\mathbb{R}}^{d\times n}$$ possesses a $$\psi$$-potential, where $$\psi$$ has $$(p,\delta)$$-structure (cf. Assumption 2.2) and the relevant example that falls into this class is \begin{equation}\label{eq:fluids} {\boldsymbol{\mathcal{A}}}(\nabla {\bf u}) = (\delta+\left| {{\nabla {\bf u}}} \right|)^{p-2} \nabla{\bf u}, \end{equation} (1.2) with $$ p \in (1,\infty)$$ and $$ \delta\geq 0$$. Discontinuous Galerkin (DG) methods for elliptic problems were introduced in the late 1990s. In recent decades, they have received extensive attention and by now they are well understood and rigorously analysed in the context of linear elliptic problems (cf. Arnold et al., 2002, for the Poisson problem). Prominent examples for DG methods are the local discontinuous Galerkin (LDG) and the SIP methods. In contrast to this, little is known about the treatment of the $$p$$-Laplace problem with DG methods (cf.
Burman & Ern, 2008; Buffa & Ortner, 2009; Diening et al., 2014). On the other hand, it is known from Ebmeyer & Liu (2005) and Diening & Růžička (2007) that finite element solutions of equation (1.1) converge at least with a linear rate to the exact solution at least in the case $${\bf u}_D =\mathbf{0}$$. This convergence rate is optimal for $$p \in (1,\infty)$$ and linear ansatz functions, if the continuous solution has the natural regularity $$\nabla {\bf F}(\nabla {\bf u}) \in L^2({\it{\Omega}})$$. It was shown in Diening & Kreuzer (2008) and Belenki et al. (2012) that the adaptive finite element algorithm for equation (1.1) with piecewise linear ansatz functions converges with an optimal rate to the solution. In this article, we extend the techniques developed for a rigorous analysis of LDG methods to SIP methods. These methods seem, on the one hand, to be best suited for a theoretical analysis in the context of systems with $$p$$-structure and have, on the other hand, good stability and localization properties in numerical experiments. The proved convergence rate for the LDG scheme is optimal for $$p \le 2$$, while for $$p\ge 2$$ it is suboptimal due to technical difficulties (cf. Diening et al., 2014). Clearly, one has many different possibilities to generalize the SIP method for $$p=2$$ to SIP methods for $$p\neq 2$$ (cf. Houston et al., 2005, where the analysis of different IP methods of problem (1.1) with $${\boldsymbol{\mathcal{A}}}(\nabla {\bf u}) = \nabla {\bf u} + (\delta+\left| {{\nabla {\bf u}}} \right|)^{p-2} \nabla{\bf u}$$ heavily depends on the added linear term). We propose two formulations that are motivated by different perceptions of SIP methods in the linear case (cf. Di Pietro & Ern, 2012). We show in this article that the SIP formulation using shifts (cf. Scheme 2.10) of problem (1.1), (1.2) possesses for all $$p \in (1,\infty)$$ an optimal (for linear ansatz functions), linear convergence rate to the exact solution. 
In fact, this is the first theoretical proof that a DG method behaves exactly as the finite element method (FEM) for the whole range $$p \in (1,\infty)$$. Moreover, the second SIP formulation using liftings (cf. Scheme 2.11) of problem (1.1), (1.2) possesses the same convergence properties as the LDG formulation in Diening et al. (2014). The article is organized as follows. In the next section, we introduce the notation, the appropriate function spaces, in particular appropriate spaces for DG functions, the basic assumption on the nonlinear operator and its basic consequences, discrete gradients and our numerical fluxes. Moreover, we propose two different SIP formulations and recall the primal formulation of the LDG method for our problem. Both methods coincide with the classical SIP method for $$p=2$$. In Section 3, we prove stability of the methods, i.e., a priori estimates (cf. Theorems 3.1 and 3.2). In Section 4, we prove error estimates for our problem (cf. Theorems 4.3 and 4.5). These results provide the first convergence rates for SIP methods for problems with $$(p,\delta)$$-structure, like the $$p$$-Laplace equation. In the appendix, we collect various technical results used in the sequel. 2. The SIP schemes: notation and set-up In this section, we introduce different DG formulations of problem (1.1). Before that we introduce the notation we will use, state the precise assumptions on the nonlinear operator and discuss its basic consequences. 2.1 Function spaces We use $$c, C$$ to denote generic constants, which may change from line to line, but not depending on the crucial quantities. Moreover, we write $$f\sim g$$ if and only if there exist constants $$c,C>0$$ such that $$c\, f \le g\le C\, f$$. 
We will use the customary Lebesgue spaces $$(L^p({\it{\Omega}}),\left\| {{\,\cdot\,}} \right\|_p)$$ and Sobolev spaces $$(W^{k,p}({\it{\Omega}}),\left\| {{\,\cdot\,}} \right\|_{k,p})$$, where $${\it{\Omega}} \subset {\mathbb{R}}^n$$ is a bounded, polyhedral domain with Lipschitz continuous boundary $$\partial {\it{\Omega}}=:{\it{\Gamma}}_{\rm D}$$. The space $$W^{1,p}_{{\it{\Gamma}}_{\rm D}}({\it{\Omega}})$$ is the closure of $$C^\infty_0 ({\it{\Omega}})$$ functions in $$W^{1,p}({\it{\Omega}})$$, equipped with the gradient norm $$\left\| {{\nabla\,\cdot\,}} \right\|_p$$. We do not distinguish between function spaces for scalar, vector-valued or tensor-valued functions. However, we will denote vector-valued functions by boldface letters and tensor-valued functions by capital boldface letters. The scalar product between two vectors $${\bf u}$$, $${\bf v}$$ is denoted by $${\bf u} \cdot {\bf v}$$. The scalar product between two tensors $${\bf P}, {\bf Q}$$ is denoted by $${\bf P}: {\bf Q}$$ and we use the notation $$\left| {{{\bf P}}} \right|^2={\bf P} : {\bf P} ^\top$$. The mean value of a locally integrable function $$f$$ over a measurable set $$M \subset {\it{\Omega}}$$ is denoted by $$\left\langle {{f}} \right\rangle_M:= \mathop{{\int\hspace{-0.8em}{-}}}_M f \, {\rm d}x =\frac 1 {|M|}\int_M f \, {\rm d}x$$. Moreover, we use the notation $$({f}, {g}):=\int_{\it{\Omega}} f g\, {\rm d}x$$, whenever the right-hand side is well defined. We will also work with Orlicz and Sobolev–Orlicz spaces (cf. Rao & Ren, 1991). A real convex function $$\psi \,:\, {\mathbb{R}}^{\geq 0} \to {\mathbb{R}}^{\geq 0}$$ is said to be an N-function if $$\psi(0)=0$$, $$\psi(t)>0$$ for $$t>0$$, $$\lim_{t\rightarrow0} \psi(t)/t=0$$, as well as $$\lim_{t\rightarrow\infty} \psi(t)/t=\infty$$. We always assume that $$\psi$$ and the conjugate N-function $$\psi ^*$$ satisfy the $${\it{\Delta}}_2$$-condition. 
We denote by $${\it{\Delta}}_2(\psi)$$ the smallest constant $$K$$ such that $$\psi(2\,t) \leq K\, \psi(t)$$ for all $$t \ge 0$$. We denote by $$L^\psi({\it{\Omega}})$$ and $$W^{1,\psi}({\it{\Omega}})$$ the classical Orlicz and Sobolev–Orlicz spaces, i.e., $$f \in L^\psi({\it{\Omega}})$$ if the modular $$\rho_\psi(f):=\int_{\it{\Omega}} \psi(\left| {{f}} \right|)\, {\rm d}x $$ is finite and $$f \in W^{1,\psi}({\it{\Omega}})$$ if $$f$$ and $$ \nabla f$$ belong to $$L^\psi({\it{\Omega}})$$. When equipped with the Luxemburg norm $$\left\| {{f}} \right\|_{\psi}:= \inf \left\{ {{\lambda >0 {\,\big|\,} \int_{\it{\Omega}} \psi(\left| {{f}} \right|/\lambda)\, {{{\rm d}}}x \le 1}} \right\}$$, the space $$L^\psi({\it{\Omega}})$$ becomes a Banach space. The same holds for the space $$W^{1,\psi}({\it{\Omega}})$$ if it is equipped with the norm $$\left\| {{\cdot }} \right\|_{\psi} +\left\| {{\nabla \cdot}} \right\|_{\psi} $$. Note that the dual space $$(L^\psi({\it{\Omega}}))^*$$ can be identified with the space $$L^{\psi^*}({\it{\Omega}})$$. By $$W^{1,\psi}_{{\it{\Gamma}}_{\rm D}}({\it{\Omega}})$$ we denote the closure of $$C^\infty_0({\it{\Omega}})$$ in $$W^{1,\psi}({\it{\Omega}})$$ and equip it with the gradient norm $$\left\| {{\nabla \cdot}} \right\|_\psi$$. We need the following refined version of the Young inequality: for all $$\epsilon >0$$ there exists $$c_\epsilon>0 $$, depending only on $${\it{\Delta}}_2(\psi),{\it{\Delta}}_2( \psi ^*)<\infty$$, such that for all $$s,t\geq0$$, \begin{align} \label{ineq:young} \begin{split} ts&\leq \epsilon \, \psi(t)+ c_\epsilon \,\psi^*(s), \\ t\, \psi'(s) + \psi'(t)\, s &\le \epsilon \, \psi(t)+ c_\epsilon \,\psi(s). \end{split} \end{align} (2.1) 2.2 Basic properties of the nonlinear operator We now state precisely the assumptions we make for our nonlinear operator $${\boldsymbol{\mathcal{A}}} (\cdot)$$.
For $$t\geq0 $$, we define a special N-function $$\varphi=\varphi_{p,\delta}$$ by   \begin{align} \label{eq:5a} \varphi(t):= \int _0^t \alpha(s)\, {\rm{d}}s\qquad\text{with}\quad \alpha(t) := (\delta +t)^{p-2} t, \end{align} (2.2) where $$p\in (1,\infty)$$ and $$\delta\ge 0$$. The function $$\varphi$$ satisfies, uniformly in $$t$$, the important equivalence   \begin{align} \label{eq:equi1} \varphi''(t)\, t \sim \varphi'(t), \end{align} (2.3) because $$ \min\left\{ {{1,p-1}} \right\}\,(\delta+t)^{p-2} \le \varphi''(t)\leq \max\left\{ {{1,p-1}} \right\}(\delta+t)^{p-2}$$. Moreover, $$\varphi$$ satisfies the $${\it{\Delta}}_2$$-condition with $${\it{\Delta}}_2(\varphi) \leq c\, 2^{\max \left\{ {{2,p}} \right\}}$$ (hence independent of $$\delta$$). This implies that, uniformly with respect to $$t$$, we have   \begin{align} \label{eq:equi2} \varphi'(t)\, t \sim \varphi(t), \end{align} (2.4) with constants depending only on $$p$$. The conjugate function $$\varphi^*$$ satisfies $$\varphi^*(t) \sim (\delta^{p-1} + t)^{p'-2} t^2$$ with $$1= \frac{1}{p} + \frac{1}{p'}$$. Also $$\varphi^*$$ satisfies the $${\it{\Delta}}_2$$-condition with $${\it{\Delta}}_2(\varphi^*) \leq c\,2^{\max \left\{ {{2,p'}} \right\}}$$. A detailed discussion and full proofs can be found in Růžička & Diening (2007) and Diening & Ettwein (2008). Definition 2.1 We say that an N-function $$\psi \in C^1({\mathbb{R}}^\ge)\cap C^2({\mathbb{R}}^>)$$ has $$(p,\delta)$$-structure, with $$p\in (1,\infty)$$ and $$\delta\ge 0$$, if   \begin{alignat*}{2} \psi(t) &\sim \varphi_{p,\delta}(t)\,\qquad &&\textrm{uniformly in $t\ge 0$}, \\ \psi''(t) &\sim \varphi''_{p,\delta}(t)\,\qquad &&\textrm{uniformly in $t> 0$} . \end{alignat*} The constants in these equivalences and $$p$$ are called characteristics of $$\psi$$. 
In this case $$\psi$$ and $$\psi^*$$ satisfy the $${\it{\Delta}}_2$$-condition with $${\it{\Delta}}_2(\psi)\leq c\,2^{\max \left\{ {{2,p}} \right\}}$$ and $$ {\it{\Delta}}_2(\psi^*) \leq c\,2^{\max \left\{ {{2,p'}} \right\}}$$. Moreover, we have, uniformly with respect to $$t$$,   \begin{align} \label{eq:equi2a} \psi'(t)\, t \sim \psi(t), \quad \varphi'(t)\sim \psi'(t), \end{align} (2.5) with constants depending only on the characteristics of $$\psi$$. Assumption 2.2 (Nonlinear operator) We assume that the nonlinear operator $${{\boldsymbol{\mathcal{A}}} \colon {\mathbb{R}}^{d \times n} \to {\mathbb{R}}^{d \times n}}$$ belongs to $$C^0({\mathbb{R}}^{d \times n},{\mathbb{R}}^{d \times n} )\cap C^1({\mathbb{R}}^{d \times n}\setminus \{\mathbf{0}\},{\mathbb{R}}^{d \times n} ) $$ and satisfies $${\boldsymbol{\mathcal{A}}} (\mathbf 0)=\mathbf 0$$. Moreover, we assume that $${\boldsymbol{\mathcal{A}}} $$ possesses a potential $$\psi \in C^1({\mathbb{R}}^\ge)$$, which has $$(p,\delta)$$-structure, i.e., for all $${\bf P} \in {\mathbb{R}}^{d \times n} \setminus \left\{ { \mathbf{0}} \right\} $$ there holds   \begin{align} {\boldsymbol{\mathcal{A}}} ({\bf P}) = \psi'(\left| {{{\bf P}}} \right|) \frac {{\bf P}}{\left| {{{\bf P}}} \right|}. \label{eq:ass_S} \end{align} (2.6) In this case, we say that $${\boldsymbol{\mathcal{A}}}$$ possesses a $$(p,\delta)$$-potential $$\psi$$ and call the characteristics of $$\psi$$ also the characteristics of $${\boldsymbol{\mathcal{A}}}$$. Remark 2.3 We emphasize that the constants in the article depend only on the characteristics of $${\boldsymbol{\mathcal{A}}}$$ but are independent of $$\delta\geq 0$$. Remark 2.4 Note that the spaces $$L^p({\it{\Omega}})$$, $$L^\varphi({\it{\Omega}})$$ and $$L^\psi({\it{\Omega}}),$$ as well as $$W^{1,p}({\it{\Omega}})$$, $$W^{1,\varphi}({\it{\Omega}})$$ and $$W^{1,\psi}({\it{\Omega}})$$, are isomorphic. 
The equivalence of the corresponding norms depends only on $$\delta$$ and the characteristics of $$\psi$$. Closely related to the nonlinear operator $${\boldsymbol{\mathcal{A}}}$$ with $$(p,\delta)$$-potential $$\psi$$ is the function $${\bf F}\colon{\mathbb{R}}^{d \times n} \to {\mathbb{R}}^{d \times n}$$ defined through \begin{align} \begin{aligned} {\bf F}({\bf P})&:= \big (\delta+\left| {{{\bf P}}} \right| \big )^{\frac {p-2}{2}}{{\bf P} }. \end{aligned} \label{eq:def_F} \end{align} (2.7) Another important tool are the shifted N-functions (cf. Růžička & Diening, 2007; Diening & Ettwein, 2008; Diening & Kreuzer, 2008; Belenki et al., 2012; Růžička, 2013). For an N-function $$\psi$$, we define the family of shifted N-functions $$\left\{ {{\psi_a}} \right\}_{a \ge 0}$$ for $$t\geq 0$$ by \begin{align} \label{eq:phi_shifted} \psi_a(t):= \int _0^t \psi_a'(s)\, {\rm{d}}s\qquad\text{with }\quad \psi'_a(t):=\psi'(a+t)\frac {t}{a+t}. \end{align} (2.8) The function $$\psi _a$$ is again an N-function. It follows from Diening & Ettwein (2008, Lemma 26) that the conjugate function of the shifted N-function satisfies for all $$t \ge 0$$, \begin{align} (\psi_a)^*(t)\sim (\psi^*)_{\psi'(a)}(t) , \label{eq:69} \end{align} (2.9) with constants depending only on $${\it{\Delta}}_2(\psi), {\it{\Delta}}_2(\psi^*) $$. In the special case of $$\varphi$$ defined in (2.2), we have $$\varphi_a(t) \sim (\delta+a+t)^{p-2} t^2$$ and also $$(\varphi_a)^*(t) \sim ((\delta+a)^{p-1} + t)^{p'-2} t^2$$. The families $$\left\{ {{\varphi_a}} \right\}_{a \ge 0}$$ and $$\left\{ {{(\varphi_a)^*}} \right\}_{a \ge 0}$$ satisfy the $${\it{\Delta}}_2$$-condition uniformly in $$a \ge 0$$, with $${\it{\Delta}}_2(\varphi_a) \leq c\, 2^{\max \left\{ {{2,p}} \right\}}$$ and $${\it{\Delta}}_2((\varphi_a)^*) \leq c\, 2^{\max \left\{ {{2,p'}} \right\}}$$, respectively.
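The equivalence $$\varphi_a(t) \sim (\delta+a+t)^{p-2} t^2$$ can be observed numerically. The following Python sketch is only an illustration for the model function $$\varphi$$ from (2.2). It evaluates $$\varphi_a$$ from (2.8) by a midpoint quadrature and divides by the comparison function. The function names, the quadrature rule and the number of quadrature points are assumptions made for this example and are not part of the article.

```python
def alpha(t, p, delta):
    """alpha(t) = (delta + t)^(p-2) * t from (2.2); the value at t = 0 is 0."""
    return 0.0 if t == 0.0 else (delta + t) ** (p - 2.0) * t


def phi_shifted(t, a, p, delta, n=4000):
    """Approximate phi_a(t) from (2.8) with the composite midpoint rule.

    The integrand is phi_a'(s) = phi'(a + s) * s / (a + s) with phi' = alpha.
    Midpoints are strictly positive, so a + s > 0 even for a = 0.
    """
    if t == 0.0:
        return 0.0
    ds = t / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * ds
        total += alpha(a + s, p, delta) * s / (a + s)
    return total * ds


def ratio(t, a, p, delta):
    """phi_a(t) divided by the comparison function (delta + a + t)^(p-2) * t^2."""
    return phi_shifted(t, a, p, delta) / ((delta + a + t) ** (p - 2.0) * t * t)


if __name__ == "__main__":
    for p in (1.5, 2.0, 3.0):
        values = [ratio(t, a, p, 0.1) for a in (0.0, 1.0) for t in (0.01, 1.0, 100.0)]
        print(p, min(values), max(values))
```

For this particular $$\varphi$$ one has $$\varphi_a'(t) = (\delta+a+t)^{p-2}\,t$$, and an elementary estimate shows that the ratio stays between $$\min\{1/2, 1/p\}$$ and $$\max\{1/2, 1/p\}$$, independently of $$a$$, $$\delta$$ and $$t$$.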
We also have, uniformly with respect to $$t, a \ge 0$$, \begin{align} \label{eq:equi3} \varphi_a'(t)\, t \sim \varphi_a(t)\sim \psi_a'(t)\, t \sim \psi_a(t), \end{align} (2.10) with constants depending only on the characteristics of $$\psi$$, if $$\psi $$ has $$(p,\delta)$$-structure. Moreover, we have (cf. Růžička & Diening, 2007; Diening & Ettwein, 2008; Diening & Kreuzer, 2008) the following result: Lemma 2.5 (Change of shift) For all $$\beta \in (0,1)$$ there exists $$c_\beta$$ such that for all $${\bf P},{\bf Q} \in {\mathbb{R}}^{d\times n}$$ and all $$t\ge 0$$, \begin{align*} \varphi_{\left| {{{\bf P}}} \right|}(t) &\le c_\beta\, \varphi_{\left| {{{\bf Q}}} \right|}(t) + \beta\, \varphi_{\left| {{{\bf P}}} \right|}(\left| {{{\bf P}-{\bf Q}}} \right|),\\[1mm] \big (\varphi_{\left| {{{\bf P}}} \right|}\big )^*(t) &\le c_\beta\, \big (\varphi_{\left| {{{\bf Q}}} \right|}\big )^*(t) + \beta\, \varphi_{\left| {{{\bf P}}} \right|}(\left| {{{\bf P}-{\bf Q}}} \right|), \\ \varphi_{\left| {{{\bf P}}} \right|}(t) &\le c_\beta\, \varphi_{\left| {{{\bf Q}}} \right|}(t) + \beta\,\left| {{{\bf F}({\bf P}) - {\bf F}({\bf Q})}} \right|^2, \\ \big(\varphi_{\left| {{{\bf P}}} \right|}\big)^*(t) &\le c_\beta\, \big(\varphi_{\left| {{{\bf Q}}} \right|}\big)^*(t) + \beta\,\left| {{{\bf F}({\bf P}) - {\bf F}({\bf Q})}} \right|^2 . \end{align*} Moreover, we have $$\varphi_{\left| {{{\bf Q}}} \right|}(\left| {{{\bf P}-{\bf Q}}} \right|)\sim \varphi_{\left| {{{\bf P}}} \right|}\big (\left| {{{\bf P}-{\bf Q}}} \right|\big )$$. The connection between $${\boldsymbol{\mathcal{A}}}$$, $${\bf F}$$ and $$\left\{ {{\varphi_a}} \right\}_{a \geq 0}$$ is best explained by the following proposition (cf. Růžička & Diening, 2007; Diening & Ettwein, 2008). Proposition 2.6 Let $${\boldsymbol{\mathcal{A}}}$$ satisfy Assumption 2.2, let $$\varphi$$ be defined in (2.2) and let $${\bf F}$$ be defined in (2.7).
Then \begin{align} \big({\boldsymbol{\mathcal{A}}}({\bf P}) - {\boldsymbol{\mathcal{A}}}({\bf Q})\big) :\big({\bf P}-{\bf Q} \big) &\sim \left| {{ {\bf F}({\bf P}) - {\bf F}({\bf Q})}} \right|^2 \label{eq:hammera} \tag{2.11a} \\ &\sim \varphi_{\left| {{{\bf P}}} \right|}(\left| {{{\bf P} - {\bf Q}}} \right|), \label{eq:hammerb} \tag{2.11b} \end{align} uniformly in $${\bf P}, {\bf Q} \in {\mathbb{R}}^{d \times n}$$. Moreover, uniformly in $${\bf Q} \in {\mathbb{R}}^{d \times n}$$, \begin{align} {\boldsymbol{\mathcal{A}}}({\bf Q}) : {\bf Q} \sim \left| {{{\bf F}({\bf Q})}} \right|^2 &\sim \varphi(\left| {{{\bf Q}}} \right|). \end{align} (2.11c) The constants depend only on the characteristics of $${\boldsymbol{\mathcal{A}}}$$. There also holds \begin{alignat}{2} \left| {{{\boldsymbol{\mathcal{A}}}({\bf P}) - {\boldsymbol{\mathcal{A}}}({\bf Q})}} \right| &\sim \varphi'_{\left| {{{\bf P}}} \right|}\big(\left| {{{\bf P} - {\bf Q}}} \right|\big)&&\qquad\forall\,{\bf P}, {\bf Q} \in {{\mathbb{R}}^{d \times n}} , \label{eq:hammere} \tag{2.12} \\ \left| {{{\boldsymbol{\mathcal{A}}}({\bf P})}} \right| &\sim \varphi'\big(\left| {{{\bf P}}} \right|\big)&&\qquad\forall\,{\bf P} \in {{\mathbb{R}}^{d \times n}} . \tag{2.13} \end{alignat} Remark 2.7 In view of the previous proposition we have, for all $${\bf u}, {\bf w} \in W^{1,\varphi}({\it{\Omega}})$$, \begin{align*} ({{\boldsymbol{\mathcal{A}}}(\nabla {\bf u}) \!-\! {\boldsymbol{\mathcal{A}}}(\nabla{\bf w})},{\nabla{\bf u} \!-\! \nabla {\bf w}}) &\sim \left\| {{{\bf F}(\nabla{\bf u}) \!-\! {\bf F}(\nabla{\bf w})}} \right\|_2^2 \,\sim \int_{\it{\Omega}}\! \varphi_{\left| {{\nabla{\bf u}}} \right|}(\left| {{\nabla{\bf u} \!-\! \nabla{\bf w}}} \right|) \,{\rm{d}}x. \end{align*} The constants depend only on the characteristics of $${\boldsymbol{\mathcal{A}}}$$.
The last expression equals the quasi-norm introduced in Barrett & Liu (1994) raised to the power $$\rho = \max \left\{ {{p,2}} \right\}$$. This ensures that our results can also be expressed in terms of the quasi-norm. On the other hand, the last expression, which is a modular of the generalized shifted N-function $$\varphi_{\left| {{\nabla{\bf u}}} \right|}(\cdot)$$, has the advantage that one does not have to distinguish between $$p\ge 2$$ and $$p\le 2$$ and that the theory of Orlicz–Sobolev spaces, in particular traces and embeddings, can be used. Moreover, the quantity $${\bf F}$$ enables us to formulate the natural regularity class of the problem, namely $$\nabla {\bf F}(\nabla {\bf u}) \in L^2({\it{\Omega}})$$ (cf. Giusti, 1994), which is not possible in terms of classical Sobolev spaces. The following important estimate follows directly from (2.12), Young’s inequality (2.1) and (2.11). Lemma 2.8 For all $$\epsilon>0$$, there exists a constant $$c_\epsilon>0$$, depending only on $$\epsilon$$ and the characteristics of $${\boldsymbol{\mathcal{A}}}$$, such that for all vector fields $${\bf u}, {\bf v}, {\bf w} \in W^{1,\varphi}({\it{\Omega}})$$, \begin{align*} &\left( {{{\boldsymbol{\mathcal{A}}}(\nabla{\bf u}) - {\boldsymbol{\mathcal{A}}} (\nabla{\bf v})}, {\nabla{\bf w} - \nabla {\bf v}}} \right) \leq \epsilon\, \left\| {{{\bf F}(\nabla{\bf u}) - {\bf F}(\nabla{\bf v})}} \right\|_2^2 +c_\epsilon\, \left\| {{{\bf F}(\nabla{\bf w}) - {\bf F}(\nabla{\bf v})}} \right\|_2^2 . \end{align*} 2.3 Existence theory Let us briefly discuss the existence and regularity theory for problem (1.1). Assume that $${\boldsymbol{\mathcal{A}}}$$ possesses a $$(p,\delta)$$-potential $$\psi$$. Any given boundary data $${\bf u}_D \in W^{1-\frac 1p,p}({\it{\Gamma}}_{\rm D})$$ can be extended to $${\it{\Omega}} $$. The extension, which is denoted again by $${\bf u}_D$$, belongs to $$W^{1,p}({\it{\Omega}})$$.
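The existence theory for problem (1.1) rests on the monotonicity of $${\boldsymbol{\mathcal{A}}}$$, which is quantified by (2.11a). The following Python sketch evaluates both sides of (2.11a) for the model operator (1.2) on randomly chosen matrices. It is meant as a sanity check only: the helper names, the matrix size and the random sampling are assumptions of this example, and the sketch covers only the model case $$\psi=\varphi$$.

```python
import math
import random


def frob(P):
    """Frobenius norm |P| of a matrix stored as a list of rows."""
    return math.sqrt(sum(x * x for row in P for x in row))


def weighted(P, exponent, delta):
    """Return (delta + |P|)^exponent * P, with the value 0 at P = 0."""
    r = frob(P)
    if r == 0.0:
        return [[0.0 for _ in row] for row in P]
    w = (delta + r) ** exponent
    return [[w * x for x in row] for row in P]


def A(P, p, delta):
    """Model operator (1.2): A(P) = (delta + |P|)^(p-2) P."""
    return weighted(P, p - 2.0, delta)


def F(P, p, delta):
    """F from (2.7): F(P) = (delta + |P|)^((p-2)/2) P."""
    return weighted(P, (p - 2.0) / 2.0, delta)


def diff(P, Q):
    """Entrywise difference P - Q."""
    return [[x - y for x, y in zip(rp, rq)] for rp, rq in zip(P, Q)]


def inner(P, Q):
    """Scalar product P : Q of two matrices."""
    return sum(x * y for rp, rq in zip(P, Q) for x, y in zip(rp, rq))


def monotonicity_pair(P, Q, p, delta):
    """Return ((A(P) - A(Q)) : (P - Q), |F(P) - F(Q)|^2), the two sides of (2.11a)."""
    lhs = inner(diff(A(P, p, delta), A(Q, p, delta)), diff(P, Q))
    rhs = frob(diff(F(P, p, delta), F(Q, p, delta))) ** 2
    return lhs, rhs


def random_matrix(rng, d=2, n=3):
    """A d x n matrix with independent standard normal entries."""
    return [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(d)]


if __name__ == "__main__":
    rng = random.Random(0)
    for p in (1.3, 2.0, 4.0):
        ratios = []
        for _ in range(1000):
            lhs, rhs = monotonicity_pair(random_matrix(rng), random_matrix(rng), p, 0.5)
            ratios.append(lhs / rhs)
        print(p, min(ratios), max(ratios))
```

For $$p=2$$ both quantities coincide with $$\left| {\bf P}-{\bf Q} \right|^2$$. For other values of $$p$$ the printed quotients should stay within a fixed positive range, in line with the statement that the constants in (2.11a) depend only on the characteristics of $${\boldsymbol{\mathcal{A}}}$$.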
Now, one can easily show, using the theory of monotone operators, that for all $$p>1$$, $$\delta\ge 0$$ and all data $${{\bf u}_D\in W^{1-\frac 1p,p}({\it{\Gamma}}_{\rm D})}$$ and $${\bf f}\in L^{p'}({\it{\Omega}})$$, there exists a weak solution $${\bf u} \in W^{1,p}({\it{\Omega}})$$ of problem (1.1), i.e., $${\bf u}-{\bf u}_D \in W^{1,p}_{{\it{\Gamma}}_{\rm D}}({\it{\Omega}})$$ and   \begin{align*} \int_{\it{\Omega}} {\boldsymbol{\mathcal{A}}}(\nabla {\bf u}) : \nabla {\bf z}\, {\rm{d}}x = \int _{\it{\Omega}} {\bf f} \cdot {\bf z}\, {\rm{d}}x \end{align*} is satisfied for all $${\bf z} \in W^{1,p}_{{\it{\Gamma}}_{\rm D}}({\it{\Omega}})$$. Using modular trace and Poincaré inequalities, one obtains the a priori estimate   \begin{align} \label{eq:uapriori} \rho_{\varphi,{\it{\Omega}}}(\nabla {\bf u}) \le c\, \big (\rho_{\varphi,{\it{\Omega}}}(\nabla {\bf u}_D)+\rho_{\varphi^*,{\it{\Omega}}}({\bf f}) \big ) . \end{align} (2.14) It is well known that one can show that weak solutions possess under appropriate assumptions the regularity $${\bf F}({\bf D}{\bf u}) \in W^{1,2}({\it{\Omega}})$$ (cf. Giaquinta & Modica, 1986; Acerbi & Fusco, 1994; Giusti, 1994). 2.4 DG spaces, jumps and averages Let $$\mathcal{T}_h$$ be a family of shape-regular triangulations of our domain $${\it{\Omega}}$$ consisting of $$n$$-dimensional simplices $$K$$ with diameter $$h_K$$ less than $$h$$. For simplicity, we assume in the article that $$h \le 1$$ always. For a simplex $$K \in \mathcal{T}_h$$, we denote by $$\rho_K$$ the supremum of the diameters of inscribed balls. We assume that there exists a constant $$\omega_0$$ independent of $$h$$ and $$K \in \mathcal{T}_h$$ such that $${h_K}{\rho_K^{-1}}\le \omega_0$$. The smallest such constant $$\omega_0$$ is called the chunkiness of $$\mathcal{T}_h$$. Note that, in the following, all constants may depend on the chunkiness $$\omega_0$$ but are independent of $$h$$. 
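For a single element, the quotient $$h_K/\rho_K$$ that enters the chunkiness can be computed directly. The following Python sketch does this for a triangle, i.e., under the assumption $$n=2$$. The function name and the use of the classical formula inradius = area / semi-perimeter are choices made for this example.

```python
import math


def chunkiness_ratio(a, b, c):
    """Return h_K / rho_K for the triangle K with vertices a, b, c in R^2.

    h_K is the diameter of K (its longest edge) and rho_K is the diameter of
    the inscribed circle, i.e. twice the inradius.
    """
    ab = math.dist(a, b)
    bc = math.dist(b, c)
    ca = math.dist(c, a)
    h_K = max(ab, bc, ca)
    semi_perimeter = 0.5 * (ab + bc + ca)
    area = 0.5 * abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]))
    rho_K = 2.0 * area / semi_perimeter
    return h_K / rho_K


if __name__ == "__main__":
    print(chunkiness_ratio((0.0, 0.0), (1.0, 0.0), (0.0, 1.0)))   # right triangle
    print(chunkiness_ratio((0.0, 0.0), (1.0, 0.0), (0.5, 0.01)))  # thin triangle
```

The quotient is invariant under scaling of the triangle and grows as the triangle degenerates, which is why a uniform bound $$\omega_0$$ excludes arbitrarily thin elements.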
Let $$S_K$$ denote the neighbourhood of $$K$$, i.e., the patch $$S_K$$ is the union of all simplices of $$\mathcal{T}_h$$ touching $$K$$. We assume further for our triangulation that the interior of each $$S_K$$ is connected. One easily sees that under these assumptions we get that $$\left| {{K}} \right| \sim \left| {{S_K}} \right|$$ and that the number of simplices in $$S_K$$ and the number of patches to which a simplex belongs are uniformly bounded with respect to $$h>0$$ and $$K \in \mathcal{T}_h$$. We define the faces of $$\mathcal{T}_h$$ as follows: an interior face of $$\mathcal{T}_h$$ is the nonempty interior of $$\partial K \cap \partial K'$$, where $$K, K'$$ are two adjacent elements of $$\mathcal{T}_h$$. For the face $$\gamma:= \partial K \cap \partial K'$$, we use the notation $$S_\gamma:= K \cup K'$$. A boundary face of $$\mathcal{T}_h$$ is the nonempty interior of $$\partial K \cap \partial {\it{\Omega}}$$, where $$K$$ is a boundary element of $$\mathcal{T}_h$$. For the face $$\gamma:= \partial K \cap \partial {\it{\Omega}}$$, we use the notation $$S_\gamma:= K $$. We denote by $${\it{\Gamma}}_I$$ and $${\it{\Gamma}}_{\rm D}$$ the sets of interior and boundary faces, respectively, and put $${\it{\Gamma}}:= {\it{\Gamma}}_I\cup {\it{\Gamma}}_{\rm D} $$. For faces $$\gamma \in {\it{\Gamma}}$$, we denote by $$\gamma_h$$ the diameter of one of the adjacent simplices. We introduce the following notation for integrals related to quantities defined on the triangulation $$\mathcal T_h$$, \begin{align*} \int_{\it{\Gamma}} f \, {\rm{d}}s&:= \sum_{\gamma \in {\it{\Gamma}}} \int_\gamma f \,{\rm{d}}s,\qquad \int_{\it{\Omega}} f \, {\rm{d}}x:= \sum_{K \in \mathcal T_h} \int_K f \,{\rm{d}}x, \end{align*} whenever the right-hand side is well defined. We also extend the notation for modulars to $${\it{\Gamma}}$$ by setting $$ \rho_{\psi,{\it{\Gamma}}}(f):= \int _{\it{\Gamma}} \psi(\left| {{f}} \right|)\, {\rm{d}}s$$ for $$f \in L^\psi({\it{\Gamma}})$$.
We denote by $${\mathcal P}_k(K)$$, with $$k\in {\mathbb{N}}_0$$, the space of scalar, vector-valued or tensor-valued continuous functions, which are polynomials of degree at most $$k$$ on a simplex $$K \in \mathcal{T}_h$$. Given a triangulation of $${\it{\Omega}}$$ with the above properties, given an N-function $$\psi$$ and $$k,m \in {\mathbb{N}}$$ we define \begin{align} \begin{alignedat}{2} {V_h^k} &= &{V_h^k}({\it{\Omega}})&:=\left\{ {{{\bf g} \in L^1 ({\it{\Omega}}) {\,\big|\,} {\bf g}|_K \in {\mathcal P}_k(K)\; \ \forall\, K \in \mathcal{T}_h}} \right\}, \\ {W_{{\rm DG}}}^{m,\psi}&= &\;{W_{{\rm DG}}}^{m,\psi}({\it{\Omega}}) &:= \left\{ {{ {\bf g} \in L^1({\it{\Omega}}) {\,\big|\,} {\bf g}|_K \in W^{m,\psi}(K) \ \forall\, K \in \mathcal{T}_h}} \right\} , \\ {X_h^k} &= &{X_h^k}({\it{\Omega}})&:=\left\{ {{{\bf G} \in L^1 ({\it{\Omega}}) {\,\big|\,} {\bf G}|_K \in {\mathcal P}_k(K)\; \ \forall\, K \in \mathcal{T}_h}} \right\}. \end{alignedat}\label{eq:1} \end{align} (2.15) Note that both $$W^{1,\psi}({\it{\Omega}})\subset {W_{{\rm DG}}}^{1,\psi}({\it{\Omega}})$$ and $${V_h^k}({\it{\Omega}})\subset {W_{{\rm DG}}}^{1,\psi}({\it{\Omega}})$$. In a slight abuse of notation we will also use $${V_h^k}$$ and $${W_{{\rm DG}}}^{1,\psi}$$ to denote the corresponding function spaces of scalar functions. For $${\bf g} \in {W_{{\rm DG}}}^{1,\psi}$$, we denote by $$\nabla_h {\bf g}$$ the local distributional gradient and note that for each $$K \in \mathcal{T}_h$$ the interior trace $${tr}^K_\gamma({\bf g})$$ of $${\bf g}$$ on $$\partial K$$ is well defined. Let $$g, {\bf g}, {\bf G} \in {W_{{\rm DG}}}^{1,\psi}$$.
For interior faces $$\gamma$$ we denote by $$\left[\kern-0.15em\left[ {{g {\bf n}}}\right]\kern-0.15em\right]_\gamma$$, $$\left[\kern-0.15em\left[ {{{\bf g} \otimes {\bf n}}} \right]\kern-0.15em\right]_\gamma$$ and $$\left[\kern-0.15em\left[ {{{\bf G}{\bf n}}} \right]\kern-0.15em\right]_\gamma$$ the normal jump, i.e., the jump of $$g {\bf n}$$, $${\bf g} \otimes {\bf n}$$, $${\bf G} {\bf n}$$, respectively. For example $$\left[\kern-0.15em\left[ {{{\bf G}{\bf n}}} \right]\kern-0.15em\right]_\gamma$$ is defined on an interior face $$\gamma \in {\it{\Gamma}}_I$$ shared by the adjacent elements $$K^-,K^+ \in \mathcal T_h$$ with outer normals $${\bf n}^-$$, $${\bf n}^+$$, respectively, by   \begin{align*} \left[\kern-0.15em\left[ {{{\bf G}\,{\bf n}}} \right]\kern-0.15em\right]_\gamma := {tr}^{K^+}_\gamma({\bf G})\,{\bf n} ^+ +{tr}^{K^-}_\gamma({\bf G})\, {\bf n} ^- . \end{align*} For all interior faces, we denote by $$\left\{ {{\cdot}} \right\}$$ the trace average. For example, $$\left\{ {{{\bf g}}} \right\}$$ is defined on an interior face $$\gamma \in {\it{\Gamma}}_I$$ shared by the adjacent elements $$K^-,K^+ \in \mathcal T_h$$ by   \begin{align*} \left\{ {{{\bf g}}} \right\}_\gamma&:= \frac 12 \big ({tr}^{K^+}_\gamma({\bf g}) + {tr}^{K^-}_\gamma({\bf g})\big ). \end{align*} We omit the index $$\gamma$$ for jumps and averages if there is no danger of confusion. To deal with the Dirichlet boundary data on $${\it{\Gamma}}_{{\rm D}}$$, we need the following construction. Let $${\it{\Omega}}' \supsetneq {\it{\Omega}}$$ be a polyhedral, bounded domain with Lipschitz continuous boundary such that $$ \partial {\it{\Omega}} \setminus \partial{\it{\Omega}}' = \partial {\it{\Omega}}$$, $$ \partial {\it{\Omega}} \cap \partial {\it{\Omega}}' = \emptyset$$. Let $$\mathcal{T}'_h$$ denote an extension of the triangulation $$\mathcal{T}_h$$ to $${\it{\Omega}}'$$, having the same properties as $$\mathcal{T}_h$$ (in particular with a similar chunkiness). 
We extend our notation to this setting by adding a superscript ‘prime’ to it. In particular, we denote by $${\it{\Gamma}}_I'$$, $$S_K'$$ and $$S_\gamma'$$ the interior faces, the neighbourhood of $$K$$ and $$\gamma$$, resp., of $$\mathcal T_h'$$. We define   \begin{align*} {W_{{\rm DG},{\it{\Gamma}}_{\rm D}}}^{1,\psi}({\it{\Omega}}) := \left\{ {{{\bf g} \in {W_{{\rm DG}}}^{1,\psi}({\it{\Omega}}') {\,\big|\,} {\bf g}|_{{\it{\Omega}}' \setminus {\it{\Omega}}} = \mathbf{0}}} \right\}. \end{align*} So functions from $${W_{{\rm DG},{\it{\Gamma}}_{\rm D}}}^{1,\psi}({\it{\Omega}})$$ are elements of $${W_{{\rm DG}}}^{1,\psi}({\it{\Omega}})$$, which are (virtually) extended by zero to $${\it{\Omega}}' \setminus {\it{\Omega}}$$. Therefore, it is very natural to define the jumps and averages of $${\bf g} \in {W_{{\rm DG},{\it{\Gamma}}_{\rm D}}}^{1,\psi}({\it{\Omega}})$$ on $${\it{\Gamma}}_{\rm D}$$ by   \begin{align} \label{eq:jump_bnd} \left[\kern-0.15em\left[ {{{\bf g}\otimes {\bf n}}} \right]\kern-0.15em\right]_\gamma &:= {tr}^{\it{\Omega}}_\gamma ({\bf g}) \otimes {\bf n} , \quad && \left\{ {{{\bf g}}} \right\}_\gamma :={tr}^{\it{\Omega}}_\gamma ({\bf g}) \quad \text{for }\gamma \in {\it{\Gamma}}_{\rm D} . \end{align} (2.16) Let us now define a discrete DG gradient and jump functionals for functions $${\bf g} \in {W_{{\rm DG},{\it{\Gamma}}_{\rm D}}}^{1,\psi}({\it{\Omega}})$$. They were first introduced in Di Pietro & Ern (2010) and Bustinza & Gatica (2004) for $${\bf g}_h \in {V_h^k}$$. 
For every $${\bf g} \in {W_{{\rm DG},{\it{\Gamma}}_{\rm D}}}^{1,\psi}({\it{\Omega}})$$ we define the discrete gradient $${\nabla_{\rm DG}^h} {\bf g} \in {X_h^k}$$ (via Riesz representation) by the relation   \begin{align} \label{eq:nablaDG} \left( {{{\nabla_{\rm DG}^h} {\bf g}}, {{\bf X}_h}} \right)&:= \left( {{\nabla_h {\bf g}}, {{\bf X}_h}} \right) - \left\langle {{\left[\kern-0.15em\left[ {{{\bf g} \otimes {\bf n}}} \right]\kern-0.15em\right]}{\left\{ {{{\bf X}_h}} \right\}}} \right\rangle_{{\it{\Gamma}}} \end{align} (2.17) for all $${\bf X}_h \in {X_h^k}$$. For $${\bf g} \in {W_{{\rm DG},{\it{\Gamma}}_{\rm D}}}^{1,\psi}({\it{\Omega}})$$, we define the jump functional $${\bf R}_h {\bf g} \in {{X_h^k}}$$ (via Riesz representation) by the formula   \begin{align} \label{eq:R} \left( {{{\bf R}_h {\bf g}}, {{\bf X}_h}} \right) &:= \left\langle {{\left[\kern-0.15em\left[ {{{\bf g} \otimes {\bf n}}} \right]\kern-0.15em\right]}{\left\{ {{{\bf X}_h}} \right\}}} \right\rangle_{\it{\Gamma}} \end{align} (2.18) for all $${\bf X}_h \in {X_h^k}$$. With these definitions we have the following pointwise identities for $${\bf g}_h \in {V_h^k}$$:   \begin{align} \label{eq:DGnablaR} {\nabla_{\rm DG}^h} {\bf g}_h &= \nabla_h {\bf g}_h - {\bf R}_h {\bf g}_h, \end{align} (2.19) and for $${\bf g} \in {W_{{\rm DG},{\it{\Gamma}}_{\rm D}}}^{1,\psi}({\it{\Omega}})$$,   \begin{align} \label{eq:DGnablaR2} {\nabla_{\rm DG}^h} {\bf g} &= {{\it{\Pi}}_{\rm DG}} \nabla_h {\bf g} - {\bf R}_h {\bf g}. \end{align} (2.20) In the appendix, the projection $${{\it{\Pi}}_{\rm DG}}$$ is defined and some basic properties are collected. The same is done for the discrete gradient and the jump functionals. 
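The definitions (2.17)–(2.19) become very explicit in one space dimension. The following Python sketch computes the normal jumps, the jump functional $${\bf R}_h$$ and the discrete gradient for scalar piecewise constant functions on a uniform mesh of an interval. All of this is an illustrative assumption and not the setting of the article: $$n=d=1$$, polynomial degree $$k=0$$ for both the function and the test space, and zero extension outside the interval as in (2.16). The function names are hypothetical.

```python
def jumps_1d(g):
    """Normal jumps [[g n]] on the N + 1 faces of a 1D mesh with cell values g.

    Interior face between the left cell K^- and the right cell K^+: the outer
    normals are n^- = +1 and n^+ = -1, so the jump is g^- - g^+.
    Boundary faces use (2.16): trace times the outer normal of the domain.
    """
    n = len(g)
    jumps = [-g[0]]                      # left boundary face, outer normal -1
    for j in range(1, n):
        jumps.append(g[j - 1] - g[j])    # interior faces
    jumps.append(g[n - 1])               # right boundary face, outer normal +1
    return jumps


def lifting_1d(g, h):
    """Cell values of R_h g from (2.18), tested with indicator functions of cells.

    The average of an indicator function is 1/2 on an interior face and 1 on a
    boundary face; dividing by the cell length h gives the cell value.
    """
    n = len(g)
    jumps = jumps_1d(g)
    R = []
    for i in range(n):
        w_left = 1.0 if i == 0 else 0.5
        w_right = 1.0 if i == n - 1 else 0.5
        R.append((w_left * jumps[i] + w_right * jumps[i + 1]) / h)
    return R


def discrete_gradient_1d(g, h):
    """Discrete gradient (2.19); the local gradient of a piecewise constant is 0."""
    return [-r for r in lifting_1d(g, h)]


if __name__ == "__main__":
    n = 10
    h = 1.0 / n
    g = [(i + 0.5) * h for i in range(n)]   # cell midpoint values of u(x) = x
    print(discrete_gradient_1d(g, h))
```

In the interior cells the discrete gradient reduces to a central difference quotient and reproduces the slope of $$u(x)=x$$. In the last cell the jump against the zero extension dominates, which reflects that $$u$$ does not vanish at the right end point.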
We define the semimodulars $$m_{\psi,h}$$ and $$M_{\psi,h}$$ for $${\bf g} \in{W_{{\rm DG},{\it{\Gamma}}_{\rm D}}}^{1,\psi}({\it{\Omega}})$$ by \begin{align} \begin{aligned} m_{\psi,h}({\bf g})&:= h\,\rho_{\psi,{\it{\Gamma}} }(h^{-1}\left[\kern-0.15em\left[ {{{\bf g}\otimes {\bf n}}} \right]\kern-0.15em\right]) , \\ M_{\psi,h}({\bf g})&:= \rho_{\psi,{\it{\Omega}}}(\nabla_h {\bf g}) + m_{\psi,h}({\bf g}) . \end{aligned}\label{def:mh} \end{align} (2.21) Note that for every $${\bf g} \in {W_{{\it{\Gamma}}_{\rm D}}}^{1,\psi}({\it{\Omega}})$$, we have $$m_{\psi,h}({\bf g})=0$$ and $$M_{\psi,h}({\bf g}) = \rho_{\psi,{\it{\Omega}}}(\nabla {\bf g})$$, so $$M_{\psi,h}(\,\cdot\,)$$ is an extension of the modular $$\rho_{\psi,{\it{\Omega}}}(\nabla \,\cdot\,)$$ on $${W_{{\it{\Gamma}}_{\rm D}}}^{1,\psi}$$ to the DG setting. In fact, the semimodular $$M_{\psi,h}$$ is a modular. This is in complete analogy to the case $${W_{{\it{\Gamma}}_{\rm D}}}^{1,\psi}({\it{\Omega}})$$. Remark 2.9 In the special case $$\psi=\varphi,$$ we have due to (2.11c), \begin{align} \begin{aligned}\label{eq:3} m_{\varphi,h}({\bf g})&\sim h\, \left\| {{ {\bf F}(h^{-1}\left[\kern-0.15em\left[ {{{\bf g}\otimes {\bf n}}} \right]\kern-0.15em\right])}} \right\|_{2,{\it{\Gamma}}_I\cup {\it{\Gamma}}_{\rm D}}^2 , \\ M_{\varphi,h}({\bf g})&\sim \left\| {{{\bf F}(\nabla_h {\bf g})}} \right\|^2_{2,{\it{\Omega}}}+ h\,\left\| {{{\bf F}(h^{-1}\left[\kern-0.15em\left[ {{{\bf g}\otimes {\bf n}}} \right]\kern-0.15em\right])}} \right\|_{2,{\it{\Gamma}}_I\cup {\it{\Gamma}}_{\rm D}}^2 . \end{aligned} \end{align} (2.22) 2.5 SIP methods Let us now formulate two SIP formulations of (1.1) under the assumption that $${\boldsymbol{\mathcal{A}}}$$ possesses a $$(p,\delta)$$-potential $$\psi$$.
For given $$a > 0$$, we define the shifted operator $$ {\boldsymbol{\mathcal{A}}}_{a}\colon {\mathbb{R}}^{d\times n} \to {\mathbb{R}}^{d\times n}$$ through $$ {\boldsymbol{\mathcal{A}}}_{a}(\mathbf{0}):=\mathbf{0}$$ and   \begin{align} {\boldsymbol{\mathcal{A}}}_{a} ({\bf P}) := \psi_{a}' (\left| {{{\bf P}}} \right|) \frac {{\bf P}}{\left| {{{\bf P}}} \right|} \qquad \textrm{ $\forall\, {\bf P} \in {\mathbb{R}}^{d \times n} \setminus \left\{ { \mathbf{0}} \right\} $.} \label{def:A_u} \end{align} (2.23) For given $${\bf u} _D \in W^{1,p}({\it{\Omega}})$$, let $${\bf u}_D^* \in W^{1,p}({\it{\Omega}})$$ be some approximation of $${\bf u}_D$$. We have in mind $$ {\bf u}_D^*= {\bf u}$$ or $$ {\bf u}_D^*= {{\it{\Pi}}_{{\rm SZ}}}{\bf u}$$. Recall that we denote by $${\it{\Gamma}}$$ the union of all interior and boundary faces. Scheme 2.10 (SIP-shifted) For the given data $${\bf u}_D\in W^{1-\frac 1p,p}({\it{\Gamma}}_{\rm D})$$, $${{\bf f}\in L^{p'}({\it{\Omega}})}$$ and given $$\alpha>0$$, find $${\bf u}_h \in {V_h^k} $$ such that for all $${\bf z}_h \in {V_h^k}$$,   \begin{align} \label{eq:4} \begin{aligned} &\int_{\it{\Omega}} {\boldsymbol{\mathcal{A}}}(\nabla_h {\bf u}_h) :\nabla_h {\bf z}_h\, {\rm{d}}x -\int_{{\it{\Gamma}}} \left\{ {{{\boldsymbol{\mathcal{A}}}(\nabla_h {\bf u}_h)}} \right\}: \left[\kern-0.15em\left[ {{{\bf z}_h \otimes {\bf n}}} \right]\kern-0.15em\right]\, {\rm{d}}s \\ &\quad -h\, \int_{{\it{\Gamma}}} \left\{ {{{\boldsymbol{\mathcal{A}}}_{\left| {{\nabla_h {\bf u}_h}} \right|}(h^{-1}\left[\kern-0.15em\left[ {{({\bf u}_h -{\bf u}^*_D) \otimes {\bf n}}} \right]\kern-0.15em\right]):\nabla_h {\bf z}_h }} \right\}\, {\rm{d}}s \\ &\quad + \alpha \int_{{\it{\Gamma}}} \left\{ {{{\boldsymbol{\mathcal{A}}}_{\left| {{\nabla_h {\bf u}_h}} \right|}(h^{-1}\left[\kern-0.15em\left[ {{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}} \right]\kern-0.15em\right])}} \right\}: \left[\kern-0.15em\left[ {{{\bf z}_h \otimes {\bf n}}} \right]\kern-0.15em\right]\, {\rm{d}}s =\int 
_{\it{\Omega}} {\bf f} \cdot {\bf z}_h\, {\rm{d}}x . \end{aligned} \end{align} (2.24) Scheme 2.11 (SIP-lifting) For the given data $${\bf u}_D\in W^{1-\frac 1p,p}({\it{\Gamma}}_{\rm D})$$, $${{\bf f}\in L^{p'}({\it{\Omega}})}$$ and given $$\alpha>0$$, find $${\bf u}_h \in {V_h^k} $$ such that for all $${\bf z}_h \in {V_h^k}$$,   \begin{align} & \int_{\it{\Omega}}\! {\boldsymbol{\mathcal{A}}}({\nabla_{\rm DG}^h} {\bf u}_h+ {\bf R}_h{\bf u}_D^*) : {\nabla_{\rm DG}^h} {\bf z}_h\, {\rm{d}}x -\int_{{\it{\Omega}}} {\boldsymbol{\mathcal{A}}}({\bf R}_h( {\bf u}_h-{\bf u}_D^*)): {\bf R}_h {\bf z}_h \, {\rm{d}}x \notag \\ &\quad + \alpha \int_{{\it{\Gamma}}} { {\boldsymbol{\mathcal{A}}} (h^{-1}\left[\kern-0.15em\left[ {{({\bf u}_h - {\bf u}^*_D) \otimes {\bf n}}} \right]\kern-0.15em\right])}: \left[\kern-0.15em\left[ {{{\bf z}_h \otimes {\bf n}}} \right]\kern-0.15em\right]\, {\rm{d}}s =\int _{\it{\Omega}} {\bf f} \cdot {\bf z}_h\, {\rm{d}}x . \label{eq:5} \end{align} (2.25) In view of the a priori estimates (cf. Section 4), one can easily prove by standard methods that there exists a solution $${\bf u}_h$$ of Scheme 2.10 and of Scheme 2.11. Remark 2.12 Note that these schemes generalize the classical SIP scheme, because for $$p=2$$ the formulations (2.24) and (2.25) reduce to the corresponding ones of the SIP method (cf. Di Pietro & Ern, 2012). Note also that the left-hand sides of (2.24) and (2.25) are well defined for $${\bf u}_h \in {W_{{\rm DG}}}^{1,\varphi}({\it{\Omega}}) \cap {W_{{\rm DG}}}^{2,1}({\it{\Omega}})$$, $${\bf z}_h \in {V_h^k} $$. Thus, we can evaluate them for the solution $${\bf u} $$ of problem (1.1). Let us now see if the weak solution of our original problem (1.1) satisfies similar systems. 
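As a brief aside on Remark 2.12, the reduction to the classical SIP method for $$p=2$$ can be checked numerically. The sketch below assumes the model case $$\psi=\varphi$$ with $$\varphi'(t)=(\delta+t)^{p-2}t$$ and the usual shifted function $$\varphi_a'(t)=\varphi'(a+t)\,t/(a+t)$$; neither formula is restated in this part of the text, so both are assumptions. Inserted into (2.23), they give $${\boldsymbol{\mathcal{A}}}_a({\bf P})=(\delta+a+\left|{\bf P}\right|)^{p-2}{\bf P}$$, which is the identity for $$p=2$$ irrespective of the shift $$a$$. The function names are hypothetical.

```python
import numpy as np

def A(P, p, delta):
    """Model operator (1.2): A(P) = (delta + |P|)^(p-2) P, Frobenius norm."""
    P = np.asarray(P, dtype=float)
    norm = np.linalg.norm(P)
    if norm == 0.0:
        return np.zeros_like(P)
    return (delta + norm) ** (p - 2) * P

def A_shifted(P, a, p, delta):
    """Shifted operator (2.23) under the assumed form of phi_a (see lead-in)."""
    P = np.asarray(P, dtype=float)
    norm = np.linalg.norm(P)
    if norm == 0.0:
        return np.zeros_like(P)      # A_a(0) := 0, as in the definition
    return (delta + a + norm) ** (p - 2) * P

P = np.array([[1.0, 2.0], [3.0, 4.0]])
for a in (0.0, 0.7, 5.0):
    # For p = 2 the shift has no effect and the operator is the identity.
    print(a, np.allclose(A_shifted(P, a, p=2.0, delta=0.3), P))
```

For $$p\neq 2$$ the shift changes the operator, which is why the penalty terms in (2.24) depend on $$\left|\nabla_h {\bf u}_h\right|$$, whereas for $$p=2$$ they collapse to the linear terms of the classical SIP method.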
For a sufficiently smooth solution $${\bf u} \in W^{1,\varphi}({\it{\Omega}}) \cap W^{2,1}({\it{\Omega}})$$ of (1.1), we get for all $${\bf z}_h \in {V_h^k}$$,   \begin{align}\label{eq:cont-sol} \begin{aligned} \left( {{{\bf f}}, {{\bf z}_h}} \right) &= \left( {{-{\text{div}} {\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}, {{\bf z}_h}} \right) \\ &= \left( {{{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}, {\nabla_h {\bf z}_h}} \right) - \left\langle {{{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}, {\left[\kern-0.15em\left[ {{{\bf z}_h \otimes {\bf n}}} \right]\kern-0.15em\right]}} \right\rangle_{{\it{\Gamma}}} \\ &= {\left( {{{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}, {\nabla_h {\bf z}_h}} \right) - \left\langle {{{\left\{ {{{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}} \right\}}}, {\left[\kern-0.15em\left[ {{{\bf z}_h \otimes {\bf n}}} \right]\kern-0.15em\right]}} \right\rangle_{{\it{\Gamma}}}} . \end{aligned} \end{align} (2.26) Thus, the solution $${\bf u}$$ of the continuous problem also satisfies (2.24), because the last two terms on the left-hand side of (2.24) vanish for $${\bf u} \in W^{1,\varphi}({\it{\Omega}})$$. 
Using (2.19), the definition of $${{\it{\Pi}}_{\rm DG}}$$ and the definition of the jump functional (2.18), we obtain   \begin{align*} \left( {{{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}, {\nabla_h {\bf z}_h}} \right) &= \left( {{ {\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}, {{\nabla_{\rm DG}^h}{\bf z}_h}} \right) + \left( {{ {\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}, {{\bf R}_h{\bf z}_h}} \right) \\ &= \left( {{ {\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}, {{\nabla_{\rm DG}^h}{\bf z}_h}} \right) + \left( {{{{\it{\Pi}}_{\rm DG}} {\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}, {{\bf R}_h{\bf z}_h}} \right) \\ &= \left( {{{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}, {{\nabla_{\rm DG}^h}{\bf z}_h}} \right) + \left\langle {{\left\{ {{{{\it{\Pi}}_{\rm DG}} {\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}} \right\}}, {\left[\kern-0.15em\left[ {{{\bf z}_h\otimes {\bf n}}} \right]\kern-0.15em\right]}} \right\rangle_{\it{\Gamma}}, \end{align*} and consequently,   \begin{align} \label{eq:cont} \begin{aligned} \left( {{{\bf f}}, {{\bf z}_h}} \right) &= \left( {{{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}, {{\nabla_{\rm DG}^h}{\bf z}_h}} \right) + \left\langle {{\left\{ {{{{\it{\Pi}}_{\rm DG}} {\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}} \right\}-{\left\{ {{{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}} \right\}}}, {\left[\kern-0.15em\left[ {{{\bf z}_h \otimes {\bf n}}} \right]\kern-0.15em\right]}} \right\rangle_{{\it{\Gamma}}} . \end{aligned} \end{align} (2.27) Thus, the solution $${\bf u}$$ of the continuous problem satisfies a system quite different to (2.25), which is caused by the nonlinearity in the elliptic term. 2.6 Primal formulation of an LDG scheme An LDG scheme for problem (1.1) was proposed and analysed in Diening et al. (2014). We assume again that $${\boldsymbol{\mathcal{A}}}$$ possesses a $$(p,\delta)$$-potential $$\psi$$. For later purposes, we recall the primal formulation of the LDG scheme here. 
Scheme 2.13 (LDG) For the given data $${\bf u}_D\in W^{1-\frac 1p,p}({\it{\Gamma}}_{\rm D})$$, $${{\bf f}\in L^{p'}({\it{\Omega}})}$$ and given $$\alpha>0$$, find $${\bf u}_h \in {V_h^k} $$ such that for all $${\bf z}_h \in {V_h^k}$$,   \begin{align} & \int_{\it{\Omega}}\! {\boldsymbol{\mathcal{A}}}({\nabla_{\rm DG}^h} {\bf u}_h+ {\bf R}_h{\bf u}_D^*):{\nabla_{\rm DG}^h} {\bf z}_h\, {\rm{d}}x + \alpha \int_{{\it{\Gamma}}} \left\{ {{ {\boldsymbol{\mathcal{A}}} (h^{-1}\left[\kern-0.15em\left[ {{({\bf u}_h - {\bf u}^*_D) \otimes {\bf n}}} \right]\kern-0.15em\right])}} \right\}: \left[\kern-0.15em\left[ {{{\bf z}_h \otimes {\bf n}}} \right]\kern-0.15em\right]\, {\rm{d}}s \notag \\ &=\int_{\it{\Omega}} {\bf f} \cdot {\bf z}_h\, {\rm{d}}x . \label{eq:6} \end{align} (2.28) In Diening et al. (2014, Theorem 3.2), it was shown that for $$ {\bf u}_D^*= {\bf u}$$ or $$ {\bf u}_D^*= {{\it{\Pi}}_{{\rm SZ}}}{\bf u}$$ and any $$\alpha>0$$, the solution $${\bf u}_h$$ of (2.28) satisfies the following a priori estimate:   \begin{align*} &\int_{\it{\Omega}} \varphi\big(\left| {{{\nabla_{\rm DG}^h} {\bf u}_h+ {\bf R}_h{\bf u}_D^*}} \right|\big) \,{\rm{d}}x +\alpha\, h \int_{{\it{\Gamma}}}\varphi\big(h^{-1} \left| {{\left[\kern-0.15em\left[ {{({\bf u}_h-{\bf u}_D^*)\otimes {\bf n}}} \right]\kern-0.15em\right]}} \right|\big) \,{\rm{d}}s \\ &\quad + \min\{1,\alpha\} \int_{\it{\Omega}} \varphi\big(\left| {{\nabla _h{\bf u}_h}} \right|\big) \,{\rm{d}}x +\min\{1,\alpha\}\int_{\it{\Omega}} \varphi\big(\left| {{{\bf u}_h-{\bf u}}} \right|\big) \,{\rm{d}}x \\ &\le c\, \int_{{\it{\Omega}}} \varphi\big(\left| {{ \nabla{\bf u}}} \right|\big)\, {\rm{d}}x + c\, \int_{{\it{\Omega}}} \varphi^*\big(\left| {{{\bf f}}} \right|\big)\, {\rm{d}}x , \end{align*} with $$c$$ depending only on $$\alpha$$, the characteristics of $${\boldsymbol{\mathcal{A}}}$$ and the chunkiness $$\omega_0$$. 
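The estimates in this article are expressed interchangeably through $$\varphi$$ and $${\bf F}$$, which rests on the equivalence $$\varphi(\left|{\bf P}\right|)\sim\left|{\bf F}({\bf P})\right|^2$$ behind Remark 2.9. The following Python sketch illustrates this equivalence for the model case, assuming $$\varphi(t)=\int_0^t(\delta+s)^{p-2}s\,{\rm{d}}s$$ and $${\bf F}({\bf P})=(\delta+\left|{\bf P}\right|)^{\frac{p-2}{2}}{\bf P}$$. These are the standard definitions associated with (1.2) but are not restated in this part of the text, so they should be read as assumptions; the function names are hypothetical.

```python
import numpy as np

def phi(t, p, delta):
    """phi(t) = int_0^t (delta + s)^(p-2) s ds in closed form (requires p > 1)."""
    def primitive(u):
        # Antiderivative of u^(p-2) (u - delta) after substituting u = delta + s.
        return u ** p / p - delta * u ** (p - 1) / (p - 1)
    return primitive(delta + t) - primitive(delta)

def F_squared(t, p, delta):
    """|F(P)|^2 for |P| = t, with F(P) = (delta + |P|)^((p-2)/2) P."""
    return (delta + t) ** (p - 2) * t ** 2

t = np.logspace(-2, 3, 200)
for p in (1.5, 2.0, 3.0):
    ratio = phi(t, p, delta=1.0) / F_squared(t, p, delta=1.0)
    # The ratio stays between fixed positive constants, uniformly in t.
    print(p, ratio.min(), ratio.max())
```

In this model case an elementary calculation shows that the ratio lies between $$1/2$$ and $$1/p$$, so the hidden constants in (2.22) depend only on $$p$$ and not on $$\delta$$ or on the size of the argument.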
Let $$k \geq 1$$ and let $${\bf u} \in W^{1,\varphi}({\it{\Omega}})$$ be a solution of (1.1) with $${\boldsymbol{\mathcal{A}}}(\nabla {\bf u}) \in {W^{1,\varphi^*}}({\it{\Omega}})$$ and $${\bf F}(\nabla {\bf u}) \in W^{1,2}({\it{\Omega}})$$, and let $${\bf u}_h \in {V_h^k} $$ be a solution of (2.28) for some $$\alpha>0$$. Then, the following error estimates have been shown in Diening et al. (2014, Corollary 4.10). (i) If $${\bf u}_D^* = {{\it{\Pi}}_{{\rm SZ}}}{\bf u}$$, then   \begin{align*} &\left\| {{{\bf F}({\nabla_{\rm DG}^h} {\bf u}_h +{\bf R}_h {\bf u}_D^*) - {\bf F}(\nabla {\bf u})}} \right\|_2^2 + \alpha\,{m_{\varphi,h}}({\bf u}_h-{{\it{\Pi}}_{{\rm SZ}}} {\bf u}) \\ &\leq {c_\alpha\, \begin{cases} h^{2}\, \left\| {{\nabla {\bf F}(\nabla {\bf u})}} \right\|_2^2\,\qquad \;\kern1pt\quad \qquad \qquad \qquad \;\;\quad \textrm{ if } p\le 2 , \\[1mm] h^{p'}\, \big(\left\| {{\nabla {\bf F}(\nabla {\bf u})}} \right\|_2^2+\rho_{\varphi^*,{\it{\Omega}}}(\nabla {\boldsymbol{\mathcal{A}}}(\nabla {\bf u}))\big )\,\qquad \textrm{ if } p\ge 2 . \end{cases}} \end{align*} (ii) If $${\bf u}_D^* = {\bf u}$$, then   \begin{align*} &\left\| {{{\bf F}({\nabla_{\rm DG}^h} {\bf u}_h +{\bf R}_h {\bf u}_D^*) - {\bf F}(\nabla {\bf u})}} \right\|_2^2 + \alpha\,{m_{\varphi,h}}({\bf u}_h-{{\it{\Pi}}_{{\rm SZ}}} {\bf u}) \\ &\leq {c_\alpha \begin{cases} h^{p}\,\big( \left\| {{\nabla {\bf F}(\nabla {\bf u})}} \right\|_2^2 + \rho_{\varphi, {\it{\Omega}}}(\nabla {\bf u})\big)\, \;\;\, \qquad \qquad \qquad \quad \qquad \,\textrm{if } p\le 2 , \\[1mm] h^{p'}\,\big( \left\| {{\nabla {\bf F}(\nabla {\bf u})}} \right\|_2^2 + \rho_{\varphi, {\it{\Omega}}}(\nabla {\bf u}) {+\rho_{\varphi^*,{\it{\Omega}}}(\nabla {\boldsymbol{\mathcal{A}}}(\nabla {\bf u}))} \big)\,\qquad \textrm{ if } p\ge 2 . \end{cases}} \end{align*} The constants depend only on the characteristics of $${\boldsymbol{\mathcal{A}}}$$, the chunkiness $$\omega_0$$ and $$\alpha^{-1}$$. 3. 
A priori estimates In this section, we derive a priori estimates for our SIP schemes. Let us start with Scheme 2.10 with shifts. Theorem 3.1 Let $$ {\bf u}_D^*= {\bf u}$$ or $$ {\bf u}_D^*= {{\it{\Pi}}_{{\rm SZ}}}{\bf u}$$. Then, there exists $$\alpha_0$$ such that for all $$\alpha \ge \alpha_0$$ the solution $${\bf u}_h \in {V_h^k} $$ of Scheme 2.10 satisfies the a priori estimate   \begin{align*} &\int_{\it{\Omega}} \varphi\big(\left| {{\nabla _h{\bf u}_h}} \right|\big) \,{\rm{d}}x +\alpha\, h\int_{{\it{\Gamma}}} \left\{ {{\varphi_{\left| {{\nabla _h{\bf u}_h}} \right|}\big(h^{-1} \left| {{\left[\kern-0.15em\left[ {{({\bf u}_h-{\bf u}_D^*)\otimes {\bf n}}} \right]\kern-0.15em\right]}} \right|\big)}} \right\} \,{\rm{d}}s \\ &\quad +\min\{1,\alpha\}\,h\, \int_{\it{\Gamma}} \varphi\big(h^{-1} \left| {{\left[\kern-0.15em\left[ {{({\bf u}_h-{\bf u}_D^*)\otimes {\bf n}}} \right]\kern-0.15em\right]}} \right|\big) \,{\rm{d}}s +\min\{1,\alpha\}\,\int_{\it{\Omega}} \varphi\big(\left| {{{\bf u}_h-{\bf u}}} \right|\big) \,{\rm{d}}x \\ &\le c\, \int_{{\it{\Omega}}} \varphi\big(\left| {{ \nabla{\bf u}}} \right|\big)\, {\rm{d}}x + c\, \int_{{\it{\Omega}}} \varphi^*\big(\left| {{{\bf f}}} \right|\big)\, {\rm{d}}x , \end{align*} with $$c$$ depending only on $$\alpha$$, the characteristics of $${\boldsymbol{\mathcal{A}}}$$ and the chunkiness $$\omega_0$$. Proof. To prove the assertion, we use in (2.24) the test function $${\bf z}_h={\bf u}_h-{{\it{\Pi}}_{{\rm SZ}}} {\bf u}$$. 
Thus, we get, adding and subtracting appropriate terms,   \begin{align} &\int_{\it{\Omega}}{{\boldsymbol{\mathcal{A}}}(\nabla_h {\bf u}_h )}: {\nabla_h {\bf u}_h }\, {\rm{d}}x\label{eq:apri} \\[-1mm] &\quad + \alpha\, h\int_{\it{\Gamma}}\left\{ {{ {\boldsymbol{\mathcal{A}}_{\lvert{\nabla _h{\bf u}_h}\rvert}}(h^{-1} \left[\kern-0.15em\left[ {{({\bf u}_h -{\bf u}_D^*) \otimes {\bf n}}} \right]\kern-0.15em\right]): h^{-1}\left[\kern-0.15em\left[ {{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}} \right]\kern-0.15em\right]}} \right\}\, {\rm{d}}s\notag \\ &= \int_{\it{\Omega}} {{\bf f}}\cdot ({\bf u}_h-{\bf u} +{\bf u}-{{\it{\Pi}}_{{\rm SZ}}} {\bf u})\, {\rm{d}}x+\int_{\it{\Omega}}{{\boldsymbol{\mathcal{A}}}(\nabla_h{\bf u}_h )}: {\nabla_h{{\it{\Pi}}_{{\rm SZ}}} {\bf u} }\,{\rm{d}}x \notag \\ &\quad +\alpha\,h \int_{\it{\Gamma}} \left\{ {{{\boldsymbol{\mathcal{A}}_{\lvert{\nabla _h{\bf u}_h}\rvert}}(h^{-1} \left[\kern-0.15em\left[ {{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}} \right]\kern-0.15em\right])}} \right\}:{ h^{-1}\left[\kern-0.15em\left[ {{({{\it{\Pi}}_{{\rm SZ}}} {\bf u}-{\bf u}_D^*) \otimes {\bf n}}} \right]\kern-0.15em\right]}\,{\rm{d}}s \notag \\ &\quad + h\,\int_{\it{\Gamma}} \left\{ {{{\boldsymbol{\mathcal{A}}}(\nabla_h{\bf u}_h )}} \right\}:{h^{-1} \left[\kern-0.15em\left[ {{({\bf u}_h-{{\it{\Pi}}_{{\rm SZ}}} {\bf u}) \otimes {\bf n}}} \right]\kern-0.15em\right]}\,{\rm{d}}s \notag \\ &\quad + h\,\int_{\it{\Gamma}} \left\{ {{{\boldsymbol{\mathcal{A}}_{\lvert{\nabla _h{\bf u}_h}\rvert}}(h^{-1} \left[\kern-0.15em\left[ {{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}} \right]\kern-0.15em\right]): \nabla_h{({\bf u}_h-{{\it{\Pi}}_{{\rm SZ}}} {\bf u}) }}} \right\}\,{\rm{d}}s \notag \\ &=: I_1+I_2+I_3+I_4+I_5 .\notag \end{align} (3.1) Using Assumption 2.2, (2.4) and (2.10) we see that the left-hand side of (3.1) is equivalent to   \begin{align} \int_{\it{\Omega}} \varphi(|{\nabla_h {\bf u}_h }|)\,{\rm{d}}x + \alpha\, h \int_{\it{\Gamma}} \left\{ {{\varphi_{\left| {{\nabla _h{\bf 
u}_h}} \right|} (h^{-1} \left| {{\left[\kern-0.15em\left[ {{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}} \right]\kern-0.15em\right]}} \right|})} \right\}\, {\rm{d}}s . \label{eq:lhs1} \end{align} (3.2) Lemma 2.5 and (A.10) imply that (3.2) is an upper bound of   \begin{align} \min(1,\alpha)\, h \int_{\it{\Gamma}} {\varphi (h^{-1} \left| {{\left[\kern-0.15em\left[ {{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}} \right]\kern-0.15em\right]})} \right|}\, {\rm{d}}s, \label{eq:lhs1a} \end{align} (3.3) which thus can be added to (3.2). This, in turn, implies that we can also add   \begin{equation} \label{eq:lhs1b} \min(1,\alpha)\,\int_{\it{\Omega}} \varphi (\left| {{{\bf u}_h-{\bf u}}} \right|)\,{\rm{d}}x \end{equation} (3.4) to (3.2) at the expense of adding $$\rho_{\varphi, {\it{\Omega}}}(\nabla {\bf u})$$ to the right-hand side of (3.1), because   \begin{align} \int_{\it{\Omega}} \!\varphi (\left| {{{\bf u}_h\!-\!{\bf u}}} \right|)\,{\rm{d}}x &\le c \int_{\it{\Omega}} \!\varphi(\left| {{\nabla_h {\bf u}_h-\nabla {\bf u}}} \right|)\,{\rm{d}}x + c \, h \!\int_{\it{\Gamma}} \! {\varphi (h^{-1} \left| {{\left[\kern-0.15em\left[ {{({\bf u}_h -{\bf u})\otimes {\bf n}}} \right]\kern-0.15em\right])}} \right|}\, {\rm{d}}s \notag \\ &\le c \int_{\it{\Omega}} \!\varphi(\left| {{\nabla_h {\bf u}_h}} \right|) +\varphi(\left| {{\nabla {\bf u}}} \right|)\,{\rm{d}}x + c \, h \!\int_{\it{\Gamma}} \!{\varphi (h^{-1} \left| {{\left[\kern-0.15em\left[ {{({\bf u}_h \!-\!{\bf u}^*_D)\otimes {\bf n}})} \right]\kern-0.15em\right]}} \right|}\, {\rm{d}}s \notag \\ &\quad + c\, h \int_{\it{\Gamma}} \! {\varphi (h^{-1} \left| {{\left[\kern-0.15em\left[ {{({\bf u}_D^*-{\bf u})\otimes {\bf n}}} \right]\kern-0.15em\right])}} \right|}\, {\rm{d}}s\label{eq:lhs1c} \\ &\le c \int_{\it{\Omega}} \!\varphi(\left| {{\nabla_h {\bf u}_h}} \right|) +\varphi(\left| {{\nabla {\bf u}}} \right|)\,{\rm{d}}x + c \, h \!\int_{\it{\Gamma}} \! 
{\varphi (h^{-1} \left| {{\left[\kern-0.15em\left[ {{({\bf u}_h \!-\!{\bf u}^*_D)\otimes {\bf n}})} \right]\kern-0.15em\right]}} \right|}\, {\rm{d}}s, \notag \end{align} (3.5) where we used Lemmas A3 and A4. Now we can estimate the terms $$I_i$$, $$i=1,\ldots,5$$ on the right-hand side of (3.1). We estimate with Young’s inequality and (A.17),   \begin{align*} \left| {{I_1}} \right| &\leq \epsilon \rho_{\varphi,{\it{\Omega}}}({\bf u}_h - {\bf u}) + c_{\epsilon} \rho_{\varphi^*,{\it{\Omega}}}({\bf f}) + c\, \rho_{\varphi,{\it{\Omega}}}({\bf u} - {{\it{\Pi}}_{{\rm SZ}}} {\bf u}) \\ &\leq \epsilon \rho_{\varphi,{\it{\Omega}}}({\bf u}_h - {\bf u}) + c_{\epsilon} \rho_{\varphi^*,{\it{\Omega}}}({\bf f}) + c\, \rho_{\varphi,{\it{\Omega}}}( \nabla {\bf u}) , \end{align*} where we also used $$h\le 1$$. Using (2.13), Young’s inequality and (A.17) we get   \begin{align*} \begin{aligned} \left| {{I_2}} \right|&\le {\varepsilon} \,\rho_{\varphi,{\it{\Omega}}}\big({\nabla_h {\bf u}_h }\big) +c_{\varepsilon}\, \rho_{\varphi,{\it{\Omega}}}\big(\nabla_h{{\it{\Pi}}_{{\rm SZ}}}{\bf u}\big) \\ &\le {\varepsilon} \,\rho_{\varphi,{\it{\Omega}}}\big({\nabla_h {\bf u}_h }\big) +c_{\varepsilon} \,\rho_{\varphi,{\it{\Omega}}}\big(\nabla{\bf u}\big) . 
\end{aligned} \end{align*} From (2.23), (2.10) and Young’s inequality, we infer   \begin{align*} \left| {{I_3}} \right|&\le {\varepsilon}\,h \, \alpha\,\int_{\it{\Gamma}} \left\{ {{\varphi_{\left| {{\nabla_h{\bf u}_h}} \right|}(h^{-1}\left| {{ \left[\kern-0.15em\left[ {{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}} \right]\kern-0.15em\right]}} \right|)}} \right\}\, {\rm{d}}s \\ &\quad +c_{\varepsilon} \, h \,\alpha\int_{\it{\Gamma}} \left\{ {{\varphi_{\left| {{\nabla_h{\bf u}_h}} \right|}(h^{-1}\left| {{ \left[\kern-0.15em\left[ {{({\bf u}_D^*-{{\it{\Pi}}_{{\rm SZ}}}{\bf u})\otimes {\bf n}}} \right]\kern-0.15em\right]}} \right|)}} \right\}\, {\rm{d}}s , \end{align*} where the last term is further estimated, using Lemma 2.5, (A.10), Lemma A4 and (A.17), by   \begin{align}\label{eq:beta1} \begin{aligned} &c_{\varepsilon} \, h \,\alpha\int_{\it{\Gamma}} \beta_1\,\left\{ {{\varphi(\left| {{\nabla _h{\bf u}_h}} \right|)}} \right\} + c_{\beta_1} \,\varphi(h^{-1}\left| {{ \left[\kern-0.15em\left[ {{({\bf u}_D^* -{{\it{\Pi}}_{{\rm SZ}}}{\bf u})\otimes {\bf n}}} \right]\kern-0.15em\right]}} \right|) \, {\rm{d}}s \\ &\le \beta_1\, c_{\varepsilon}\,\alpha\,\rho_{\varphi,{\it{\Omega}}}\big(\nabla _h{\bf u}_h \big)+c_{\varepsilon} \, c_{\beta_1} \, \alpha\,\big (m_{\varphi,h}\big({\bf u}_D^* -{\bf u}\big)+ m_{\varphi,h}({\bf u} -{{\it{\Pi}}_{{\rm SZ}}}{\bf u})\big ) \\ &\le \beta_1\, c_{\varepsilon}\,\alpha\,\rho_{\varphi,{\it{\Omega}}}\big(\nabla _h{\bf u}_h \big) +c_{\varepsilon} \, c_{\beta_1}\,\alpha\, \rho_{\varphi,{\it{\Omega}}}(\nabla {\bf u}) , \end{aligned} \end{align} (3.6) with $$\beta_1\in (0,1)$$. 
Using Young’s inequality, (A.10), adding and subtracting $${\bf u}^*_D$$ and $${\bf u}$$, Lemma A4 and (A.17) we get   \begin{align*} \left| {{I_4}} \right|&\le h \, \int_{\it{\Gamma}} {\varepsilon} \left\{ {{\varphi (\left| {{\nabla_h{\bf u}_h}} \right|) }} \right\} + c_{\varepsilon} \, \varphi (h^{-1}\left| {{ \left[\kern-0.15em\left[ {{({\bf u}_h -{{\it{\Pi}}_{{\rm SZ}}}{\bf u})\otimes {\bf n}}} \right]\kern-0.15em\right]}} \right|)\, {\rm{d}}s \\ &\le {\varepsilon} \, c \,\rho_{\varphi,{\it{\Omega}}}\big(\nabla _h{\bf u}_h \big) +c_{\varepsilon} \big( m_{\varphi,h} ({\bf u}_h -{\bf u}^*_D )+ m_{\varphi,h}( {\bf u}_D^* -{\bf u} )+m_{\varphi,h} ({\bf u} -{{\it{\Pi}}_{{\rm SZ}}}{\bf u})\big ) \\ &\le {\varepsilon} \, c \,\rho_{\varphi,{\it{\Omega}}}\big(\nabla _h{\bf u}_h \big) +c_{\varepsilon} \, \rho_{\varphi,{\it{\Omega}}}(\nabla {\bf u}) +c_{\varepsilon} \, m_{\varphi,h}\big ({\bf u}_h -{\bf u}^*_D\big ) , \end{align*} where the last term is further estimated by Lemma 2.5 and (A.10) by   \begin{align}\label{eq:alpha} \begin{aligned} &c_{\varepsilon} \, h \int_{\it{\Gamma}} \beta_2\,\left\{ {{\varphi(\left| {{\nabla _h{\bf u}_h}} \right|)}} \right\}+ c_{\beta_2} \,\left\{ {{\varphi_{\left| {{\nabla _h{\bf u}_h}} \right|}(h^{-1}\left| {{ \left[\kern-0.15em\left[ {{({\bf u}_h-{\bf u}_D^* )\otimes{\bf n}}} \right]\kern-0.15em\right]}} \right|)}} \right\} \, {\rm{d}}s \\ &\le \beta_2\, c_{\varepsilon}\,\rho_{\varphi,{\it{\Omega}}}\big(\nabla _h{\bf u}_h \big)+c_{\varepsilon} \, c_{\beta_2} \, h \int_{\it{\Gamma}} \left\{ {{\varphi_{\left| {{\nabla _h{\bf u}_h}} \right|}(h^{-1}\left| {{ \left[\kern-0.15em\left[ {{({\bf u}_h-{\bf u}_D^*)\otimes {\bf n}}} \right]\kern-0.15em\right]}} \right|)}} \right\} \, {\rm{d}}s , \end{aligned} \end{align} (3.7) with $$\beta_2\in (0,1)$$. 
Finally, using (2.23), (2.10) and Young’s inequality (2.1) we estimate   \begin{align} \begin{aligned}\label{eq:alpha1} \left| {{I_5}} \right|&\le c_{\varepsilon}\,h\int_{\it{\Gamma}} \left\{ {{\varphi_{\left| {{\nabla_h{\bf u}_h}} \right|}(h^{-1}\left| {{ \left[\kern-0.15em\left[ {{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}} \right]\kern-0.15em\right]}} \right|)}} \right\}\, {\rm{d}}s \\ &\quad +{\varepsilon} \, h \int_{\it{\Gamma}} \left\{ {{\varphi_{\left| {{\nabla_h{\bf u}_h}} \right|}(\left| {{ \nabla _h({\bf u}_h -{{\it{\Pi}}_{{\rm SZ}}}{\bf u})}} \right|)}} \right\}\, {\rm{d}}s , \end{aligned} \end{align} (3.8) where the last term is further estimated, using Lemma 2.5, (A.10) and (A.17), by   \begin{align*} &c\,{\varepsilon} \, h \int_{\it{\Gamma}} \left\{ {{\varphi(\left| {{\nabla _h{\bf u}_h}} \right|)}} \right\}+ \left\{ {{\varphi(\left| {{ \nabla _h{({\bf u}_h -{{\it{\Pi}}_{{\rm SZ}}}{\bf u})}}} \right|) }} \right\}\, {\rm{d}}s \\ &\le c\,{\varepsilon}\,\rho_{\varphi,{\it{\Omega}}}\big(\nabla _h{\bf u}_h \big)+c \, \rho_{\varphi,{\it{\Omega}}}\big(\nabla _h{{\it{\Pi}}_{{\rm SZ}}}{\bf u}\big) \\ &\le c\,{\varepsilon}\,\rho_{\varphi,{\it{\Omega}}}\big(\nabla _h{\bf u}_h \big) +c \,\rho_{\varphi,{\it{\Omega}}}(\nabla {\bf u}) . \end{align*} Choosing $$\varepsilon$$ small enough we can absorb all terms with $${\varepsilon}$$ into the corresponding terms in (3.2) and (3.4). Now $$c_{\varepsilon}$$ is fixed, and we choose $$\beta _2 $$ small enough to absorb the terms with $$\beta_2$$ into the corresponding terms in (3.2). Then, we choose $$\alpha$$ large enough to absorb the term with the, by now fixed, number $$c_{\varepsilon} \, c_{\beta_2}$$ in (3.7) and the term with $$c_{\varepsilon} $$ in (3.8) into the corresponding terms in (3.2). Finally, we choose $$\beta_1$$ small enough to absorb the term with $$\beta_1$$ in (3.6) into the corresponding terms in (3.2). This way the assertion of Theorem 3.1 is proved. □ Let us now come to Scheme 2.11 with lifting. 
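Before turning to that result, it may help to see the mechanism behind the absorption argument just used. The proof relies on Young's inequality with a free parameter $${\varepsilon}$$: the $${\varepsilon}$$-term is absorbed into the left-hand side at the price of a constant $$c_{\varepsilon}$$ that grows as $${\varepsilon}\to 0$$. The Python sketch below checks this in the special case $$\delta=0$$, where $$\varphi(t)=t^p/p$$ and $$\varphi^*(s)=s^{p'}/p'$$, so that $$s\,t\le {\varepsilon}\,\varphi(t)+{\varepsilon}^{-1/(p-1)}\varphi^*(s)$$. The restriction to $$\delta=0$$, the explicit constant and the function name are assumptions made for illustration; the article uses the general version (2.1) for N-functions.

```python
import numpy as np

def young_gap(s, t, p, eps):
    """Right-hand side minus left-hand side of the eps-Young inequality.

    Special case delta = 0: phi(t) = t^p / p, phi*(s) = s^p' / p', and
    c_eps = eps^(-1/(p-1)). The returned value should be non-negative.
    """
    p_conj = p / (p - 1.0)                 # conjugate exponent p'
    c_eps = eps ** (-1.0 / (p - 1.0))      # blows up as eps -> 0
    return eps * t ** p / p + c_eps * s ** p_conj / p_conj - s * t

grid = np.logspace(-2, 2, 40)
s, t = np.meshgrid(grid, grid)
for p in (1.5, 2.0, 4.0):
    for eps in (1.0, 0.1, 0.01):
        gap = young_gap(s, t, p, eps)
        # Relative tolerance guards against rounding near the equality case.
        print(p, eps, bool(np.all(gap >= -1e-9 * (1.0 + s * t))))
```

The growth of $$c_{\varepsilon}$$ explains the order of choices in the proof: $${\varepsilon}$$ is fixed first, then $$\beta_2$$, then $$\alpha$$ is taken large enough to dominate the resulting constants, and $$\beta_1$$ is chosen last.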
Theorem 3.2 Let $$ {\bf u}_D^*= {\bf u}$$ or $$ {\bf u}_D^*= {{\it{\Pi}}_{{\rm SZ}}}{\bf u}$$. Then, there exists $$\alpha_0$$ such that for all $$\alpha \ge \alpha_0$$, the solution $${\bf u}_h \in {V_h^k} $$ of Scheme 2.11 satisfies the a priori estimate   \begin{align*} &\int_{\it{\Omega}} \varphi\big(\left| {{{\nabla_{\rm DG}^h} {\bf u}_h + {\bf R}_h {\bf u}_D^*}} \right|\big) \,{\rm{d}}x +\alpha\, h\int_{{\it{\Gamma}}} {\varphi \big(h^{-1} \left| {{\left[\kern-0.15em\left[ {{({\bf u}_h-{\bf u}_D^*)\otimes {\bf n}}} \right]\kern-0.15em\right]}} \right|\big)} \,{\rm{d}}s \\ &\quad +\min\{1,\alpha\}\,\int_{\it{\Omega}} \varphi\big( \left| {{\nabla_h {\bf u}_h}} \right|\big) \,{\rm{d}}x +\min\{1,\alpha\}\,\int_{\it{\Omega}} \varphi\big(\left| {{{\bf u}_h-{\bf u}}} \right|\big) \,{\rm{d}}x \\ &\le c\, \int_{{\it{\Omega}}} \varphi\big(\left| {{ \nabla{\bf u}}} \right|\big)\, {\rm{d}}x + c\, \int_{{\it{\Omega}}} \varphi^*\big(\left| {{{\bf f}}} \right|\big)\, {\rm{d}}x , \end{align*} with $$c$$ depending only on $$\alpha$$, the characteristics of $${\boldsymbol{\mathcal{A}}}$$ and the chunkiness $$\omega_0$$. Proof. To prove the assertion, we use in (2.25) the test function $${\bf z}_h={\bf u}_h-{{\it{\Pi}}_{{\rm SZ}}} {\bf u}$$. Thus, we get, adding and subtracting appropriate terms and using (2.19),   \begin{align} & \int_{\it{\Omega}}\! {\boldsymbol{\mathcal{A}}}({\nabla_{\rm DG}^h} {\bf u}_h+ {\bf R}_h{\bf u}_D^*) : ({\nabla_{\rm DG}^h} {\bf u}_h+ {\bf R}_h{\bf u}_D^*)\, {\rm{d}}x \label{eq:apri-2} \\ &\quad + \alpha \, h\int_{{\it{\Gamma}}} { {\boldsymbol{\mathcal{A}}} (h^{-1}\left[\kern-0.15em\left[ {{({\bf u}_h - {\bf u}^*_D) \otimes {\bf n}}} \right]\kern-0.15em\right])}: h^{-1} \left[\kern-0.15em\left[ {{({\bf u}_h - {\bf u}^*_D) \otimes {\bf n}}} \right]\kern-0.15em\right]\, {\rm{d}}s \notag \\ &= \int_{\it{\Omega}} {{\bf f}}\cdot ({\bf u}_h-{\bf u} +{\bf u}-{{\it{\Pi}}_{{\rm SZ}}} {\bf u})\, {\rm{d}}x \notag \\ &\quad + \int_{\it{\Omega}}\! 
{\boldsymbol{\mathcal{A}}}({\nabla_{\rm DG}^h} {\bf u}_h+ {\bf R}_h{\bf u}_D^*) : \big (\nabla_h {{\it{\Pi}}_{{\rm SZ}}} {\bf u}+ {\bf R}_h({\bf u}_D^*-{{\it{\Pi}}_{{\rm SZ}}}{\bf u})\big )\, {\rm{d}}x \notag \\ &\quad + \alpha\, h\int_{\it{\Gamma}}{ {\boldsymbol{\mathcal{A}}} (h^{-1} \left[\kern-0.15em\left[ {{({\bf u}_h -{\bf u}_D^*) \otimes {\bf n}}} \right]\kern-0.15em\right]): \left[\kern-0.15em\left[ {{h^{-1}({{\it{\Pi}}_{{\rm SZ}}}{\bf u} -{\bf u}_D^*)\otimes {\bf n}}} \right]\kern-0.15em\right]}\, {\rm{d}}s\notag \\ &\quad +\int_{{\it{\Omega}}} {\boldsymbol{\mathcal{A}}}({\bf R}_h( {\bf u}_h-{\bf u}_D^*)): \big ({\bf R}_h ({\bf u}_h -{\bf u}_D^*) +{\bf R}_h ({\bf u}_D^*-{{\it{\Pi}}_{{\rm SZ}}} {\bf u}) \big ) \, {\rm{d}}x \notag \\ &=: I_1+I_2+I_3+I_4 .\notag \end{align} (3.9) Using Assumption 2.2, (2.4) and (2.10) we see that the left-hand side of (3.9) is equivalent to   \begin{align} \begin{aligned} &\int_{\it{\Omega}} \varphi(\left| {{{\nabla_{\rm DG}^h} {\bf u}_h+ {\bf R}_h{\bf u}_D^*}} \right|)\,{\rm{d}}x + \alpha\, h \int_{\it{\Gamma}} {\varphi(h^{-1} \left| {{\left[\kern-0.15em\left[ {{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}} \right]\kern-0.15em\right]}} \right|)}\, {\rm{d}}s \\ &= \rho_{\varphi,{\it{\Omega}}}\big({{\nabla_{\rm DG}^h} {\bf u}_h+ {\bf R}_h{\bf u}_D^* }\big) +\alpha \, m_{\varphi,h}({\bf u}_h -{\bf u}_D^*) . \end{aligned} \label{eq:lhs2} \end{align} (3.10) The identity $$\nabla _h {\bf u}_h= ({\nabla_{\rm DG}^h} {\bf u}_h+ {\bf R}_h{\bf u}_D^*) +{\bf R}_h({\bf u}_h -{\bf u}_D^*)$$ together with (A.13) implies that (3.10) is an upper bound of   \begin{align} \min(1,\alpha) \int_{\it{\Omega}} {\varphi ( \left| {{\nabla _h {\bf u}_h}} \right|)}\, {\rm{d}}x, \label{eq:lhs2a} \end{align} (3.11) which thus can be added to (3.10). In view of (3.5) we can also add (3.4) to (3.10) at the expense of adding $$\rho_{\varphi, {\it{\Omega}}}(\nabla {\bf u})$$ to the right-hand side of (3.9). 
Now we can estimate the terms $$I_i$$, $$i=1,\ldots,4$$ on the right-hand side of (3.9). We estimate with Young’s inequality, (A.17) and $$h \le 1$$,   \begin{align*} \left| {{I_1}} \right| &\leq \epsilon \rho_{\varphi,{\it{\Omega}}}({\bf u}_h - {\bf u}) + c_{\epsilon} \rho_{\varphi^*,{\it{\Omega}}}({\bf f}) + c\, \rho_{\varphi,{\it{\Omega}}}({\bf u} - {{\it{\Pi}}_{{\rm SZ}}} {\bf u}) \\ &\leq \epsilon \rho_{\varphi,{\it{\Omega}}}({\bf u}_h - {\bf u}) + c_{\epsilon} \rho_{\varphi^*,{\it{\Omega}}}({\bf f}) + c\, \rho_{\varphi,{\it{\Omega}}}( \nabla {\bf u}) . \end{align*} Using (2.13), Young’s inequality, the definition of $${\bf u}_D^*$$, (A.17), (A.13) and again (A.17) we get   \begin{align*} \begin{aligned} \left| {{I_2}} \right|&\le {\varepsilon} \rho_{\varphi,{\it{\Omega}}}\big({{\nabla_{\rm DG}^h} {\bf u}_h+ {\bf R}_h{\bf u}_D^* }\big) +c_{\varepsilon}\rho_{\varphi,{\it{\Omega}}}\big(\nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u}\big) +c_{\varepsilon}\rho_{\varphi,{\it{\Omega}}}\big({\bf R}_h({\bf u}-{{\it{\Pi}}_{{\rm SZ}}}{\bf u})\big) \\ &\le {\varepsilon} \,c\,\rho_{\varphi,{\it{\Omega}}}\big({{\nabla_{\rm DG}^h} {\bf u}_h+ {\bf R}_h{\bf u}_D^* }\big) +c_{\varepsilon}\,\rho_{\varphi,{\it{\Omega}}}\big(\nabla {\bf u}\big)+c_{\varepsilon} \, m_{\varphi,h}({\bf u} -{{\it{\Pi}}_{{\rm SZ}}} {\bf u}) \\ &\le {\varepsilon} \,c\,\rho_{\varphi,{\it{\Omega}}}\big({{\nabla_{\rm DG}^h} {\bf u}_h+ {\bf R}_h{\bf u}_D^* }\big) +c_{\varepsilon}\,\rho_{\varphi,{\it{\Omega}}}\big(\nabla {\bf u}\big) . \end{aligned} \end{align*} Using (2.13), Young’s inequality, the definition of $${\bf u}_D^*$$ and (A.17) we get   \begin{align*} \left| {{I_3}} \right|&\le {\varepsilon}\, \alpha \, m_{\varphi,h}({\bf u}_h -{\bf u}_D^*) + \alpha\, c_{\varepsilon} \, m_{\varphi,h}({\bf u} -{{\it{\Pi}}_{{\rm SZ}}} {\bf u}) \\ &\le {\varepsilon}\, \alpha \, m_{\varphi,h}({\bf u}_h -{\bf u}_D^*) + \alpha\, c_{\varepsilon} \rho_{\varphi,{\it{\Omega}}}\big(\nabla {\bf u}\big) . 
\end{align*} Using (2.13), (2.11c), Young’s inequality, the definition of $${\bf u}_D^*$$, (A.13) and (A.17) we get   \begin{align*} \left| {{I_4}} \right| &\le c\, \rho_{\varphi,{\it{\Omega}}}\big({\bf R}_h({\bf u}_h-{\bf u}_D^*) \big) + c\, \rho_{\varphi,{\it{\Omega}}}\big({\bf R}_h({\bf u}-{{\it{\Pi}}_{{\rm SZ}}}{\bf u})\big ) \\ &\le c\, m_{\varphi,h}\big({\bf u}_h-{\bf u}_D^*\big) + c\, m_{\varphi,h}\big({\bf u}-{{\it{\Pi}}_{{\rm SZ}}}{\bf u}\big ) \\ &\le c \, m_{\varphi,h}({\bf u}_h -{\bf u}_D^*)+ c\, \rho_{\varphi,{\it{\Omega}}}\big(\nabla {\bf u}\big) . \end{align*} Choosing $$\varepsilon$$ small enough we can absorb all terms with $${\varepsilon}$$ into the corresponding terms in (3.10) and (3.4). Now $$c_{\varepsilon}$$ is fixed, and we choose $$\alpha $$ large enough to absorb the term $$c \, m_{\varphi,h}({\bf u}_h -{\bf u}_D^*) $$ in the estimate of $$I_4$$ into the corresponding term in (3.10). This proves the assertion of Theorem 3.2. □ 4. Error estimates Before we prove our error estimates, we prove the analogue of the local best approximation property of the Scott–Zhang interpolation operator (cf. Diening & Růžička, 2007, Theorem 5.3) for faces $$\gamma$$ instead of elements $$K$$. Theorem 4.1 Let $${\boldsymbol{\mathcal{A}}}$$ satisfy Assumption 2.2 and let $${{\it{\Pi}}_{{\rm SZ}}} \colon W^{1,1}({\it{\Omega}}) \to {V_h^k}$$ be the Scott–Zhang operator with $$k \ge 1$$. Let $${\bf u} \in W^{1,\varphi}({\it{\Omega}})$$ with $${\bf F}(\nabla {\bf u}) \in W^{1,2}({\it{\Omega}})$$. 
Then, for all $$\gamma \in {\it{\Gamma}}$$ and all adjacent elements $$K \in \mathcal{T}_h$$ we have1  \begin{align} \label{eq:app_V1} \mathop{\int\hspace{-0.8em}{-}}_\gamma \left| {{{\bf F} (\nabla {\bf u}) - {\bf F} (\nabla_h {{\it{\Pi}}_{{\rm SZ}}} {\bf u})}} \right|^2 \,{\rm{d}}s &\leq c\,h_\gamma^2\, \mathop{\int\hspace{-0.8em}{-}}_{S_K} \left| {{\nabla {\bf F}(\nabla {\bf u})}} \right|^2\,{\rm{d}}x, \end{align} (4.1) with $$c$$ depending only on the characteristics of $${\boldsymbol{\mathcal{A}}}$$ and the chunkiness $$\omega_0$$. Proof. For arbitrary $${\bf Q} \in {\mathbb{R}}^{d \times n}$$ we have   \begin{align} &\mathop{\int\hspace{-0.8em}{-}}_\gamma \lvert{{\bf F} (\nabla {\bf u}) - {\bf F}(\nabla_h {{\it{\Pi}}_{{\rm SZ}}} {\bf u})}\rvert^2 \,{\rm{d}}s\label{eq:i-II} \\ &\leq c\, \mathop{\int\hspace{-0.8em}{-}}_\gamma \lvert{{\bf F}(\nabla {\bf u}) - {\bf F}({\bf Q})}\rvert^2 \,{\rm{d}}s + c\, \mathop{\int\hspace{-0.8em}{-}}_\gamma \lvert{{\bf F} (\nabla_h {{\it{\Pi}}_{{\rm SZ}}} {\bf u}) - {\bf F}({\bf Q})}\rvert^2 \,{\rm{d}}s=: (I) + (II) .\notag \end{align} (4.2) Let $${\mathfrak q} \in \mathcal{P}_1(S_K)$$ be such that $$\nabla {\mathfrak q} = {\bf Q}$$. Since $$k\ge 1$$ we have $${{\it{\Pi}}_{{\rm SZ}}} {\mathfrak q} = {\mathfrak q}$$ and consequently $${\bf Q} = \nabla {\mathfrak q} = \nabla_h {{\it{\Pi}}_{{\rm SZ}}} {\mathfrak q}$$. This and Proposition 2.6 imply   \begin{align*} (II)\sim c\, \mathop{\int\hspace{-0.8em}{-}}_\gamma \varphi_{\left| {{{\bf Q}}} \right|}\big(\lvert{\nabla_h{{\it{\Pi}}_{{\rm SZ}}} {\bf u} - {\bf Q}}\rvert \big)\,{\rm{d}}s &= c\, \mathop{\int\hspace{-0.8em}{-}}_\gamma \varphi_{\left| {{{\bf Q}}} \right|}\big(\lvert{\nabla_h {{\it{\Pi}}_{{\rm SZ}}} ({\bf u} -{\mathfrak q})\rvert } \big)\,{\rm{d}}s. 
\end{align*} Thus, inequality (A.10), the continuity of $${{\it{\Pi}}_{{\rm SZ}}}$$ (A.17) and again Proposition 2.6 imply that $$(II)$$ is bounded by   \begin{align} c\, \mathop{\int\hspace{-0.8em}{-}}_K \varphi_{\left| {{{\bf Q}}} \right|}\big(\lvert{\nabla_h {{\it{\Pi}}_{{\rm SZ}}} ({\bf u} -{\mathfrak q})\lvert } \big)\,{\rm{d}}x &\le c\,\mathop{\int\hspace{-0.8em}{-}}_{S_K} \varphi_{\left| {{{\bf Q}}} \right|}\big(\lvert{\nabla ({\bf u} -{\mathfrak q}) \rvert} \big)\,{\rm{d}}x \label{eq:II} \\ &= c\, \mathop{\int\hspace{-0.8em}{-}}_{S_K} \varphi_{\left| {{{\bf Q}}} \right|}\big(\lvert{\nabla {\bf u} - {\bf Q}}\rvert \big)\,{\rm{d}}x\sim \mathop{\int\hspace{-0.8em}{-}}_{S_K} \lvert{{\bf F}(\nabla {\bf u}) - {\bf F}({\bf Q})}\rvert^2 \,{\rm{d}}x . \notag \end{align} (4.3) On the other hand, Lemma A1, used for $$\psi (t)=t^2$$, $$\left| {{K}} \right| \sim \lvert{S_K}\rvert$$ and $$K \subset S_K$$, implies   \begin{align} (I) &\leq c\,\mathop{\int\hspace{-0.8em}{-}}_{K} \lvert{{\bf F}(\nabla {\bf u}) - {\bf F}({\bf Q})}\rvert^2 +h_\gamma^2\,\left| {{\nabla {\bf F}(\nabla {\bf u})}} \right|^2\,{\rm{d}}x \label{eq:I} \\ &\le c\, \mathop{\int\hspace{-0.8em}{-}}_{S_K} \lvert{{\bf F}(\nabla {\bf u}) - {\bf F}({\bf Q})}\rvert^2+h_\gamma^2\,\left| {{\nabla {\bf F}(\nabla {\bf u})}} \right|^2 \,{\rm{d}}x .\notag \end{align} (4.4) Thus, the assertion follows from the estimates (4.2)–(4.4), the surjectivity of $${\bf F}$$ and Poincaré’s inequality in $$W^{1,2}(S_K)$$. □ Corollary 4.2 Under the assumptions of Theorem 4.1 there holds2  \begin{align} \label{eq:app_V1b} h\,\int_{\it{\Gamma}} \left| {{{\bf F} (\nabla {\bf u}) - {\bf F} (\nabla_h {{\it{\Pi}}_{{\rm SZ}}} {\bf u})}} \right|^2 \,{\rm{d}}s &\leq c\,h^2\, \int_{{\it{\Omega}}} \left| {{\nabla {\bf F}(\nabla {\bf u})}} \right|^2\,{\rm{d}}x . \end{align} (4.5) Proof. This follows immediately from (4.1) by summation over $$\gamma \in {\it{\Gamma}}$$ using the properties of the triangulation. 
□ Using (2.24) and (2.26), we get our error equation for Scheme 2.10:   \begin{align}\label{eq:error-s1} \begin{aligned} &\int_{\it{\Omega}} \big({\boldsymbol{\mathcal{A}}}(\nabla_h {\bf u}_h)-{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})\big ):\nabla_h {\bf z}_h\, {\rm{d}}x \\ &\quad + \alpha \int_{{\it{\Gamma}}} \big\{{{\boldsymbol{\mathcal{A}}}_{\lvert{\nabla_h {\bf u}_h}\rvert}(h^{-1}\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}\right]\kern-0.15em\right])}\big\}: \left[\kern-0.15em\left[ {{{\bf z}_h \otimes {\bf n}}} \right]\kern-0.15em\right]\, {\rm{d}}s \\ &=\int_{{\it{\Gamma}}} \big\{{{\boldsymbol{\mathcal{A}}}(\nabla_h {\bf u}_h)- {\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}\big\}: \left[\kern-0.15em\left[ {{{\bf z}_h \otimes {\bf n}}} \right]\kern-0.15em\right]\, {\rm{d}}s \\ &\quad +h\, \int_{{\it{\Gamma}}} \big\{{{\boldsymbol{\mathcal{A}}}_{\lvert{\nabla_h {\bf u}_h}\rvert}(h^{-1}\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}^*_D) \otimes {\bf n}}\right]\kern-0.15em\right]):\nabla_h {\bf z}_h }\big\}\, {\rm{d}}s . \end{aligned} \end{align} (4.6) Based on this, we have the following error estimate: Theorem 4.3 Let $$ {\bf u}_D^*= {\bf u}$$ or $$ {\bf u}_D^*= {{\it{\Pi}}_{{\rm SZ}}}{\bf u}$$. 
Then there exists $$\alpha_0$$ such that for all $$\alpha \ge \alpha_0$$ the solution $${\bf u}_h \in {V_h^k} $$, $$k \ge 1$$ of Scheme 2.10 and a solution $${\bf u} \in W^{1,\varphi}({\it{\Omega}}) $$ of (1.1) with $${\bf F} (\nabla {\bf u}) \in W^{1,2}({\it{\Omega}})$$ satisfy the error estimate   \begin{align*} &\int_{\it{\Omega}} \left| {{{\bf F}(\nabla_h {\bf u}_h ) -{\bf F}(\nabla {\bf u} )}} \right|^2\,{\rm{d}}x + \alpha\, h \int_{\it{\Gamma}} \big\{{\varphi_{\left| {{\nabla _h{\bf u}_h}} \right|} (h^{-1} \lvert{\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}\right]\kern-0.15em\right]}\rvert)}\big\}\, {\rm{d}}s \\ &\le c\, h^2 \int _{\it{\Omega}}\lvert{\nabla {\bf F}(\nabla {\bf u})}\rvert^2\, {\rm{d}}x , \end{align*} with a constant depending only on the characteristics of $${\boldsymbol{\mathcal{A}}}$$, the chunkiness $$\omega_0$$ and $$\alpha^{-1}$$. Remark 4.4 Note that under appropriate assumptions on the data, it is well known that weak solutions $${\bf u}$$ of (1.1) possess the regularity $${\bf F} (\nabla {\bf u}) \in W^{1,2}({\it{\Omega}})$$ (cf. Giaquinta & Modica, 1986; Acerbi & Fusco, 1994; Giusti, 1994). Proof (of Theorem 4.3). 
Using $${\bf z}_h := {\bf u}_h - {{\it{\Pi}}_{{\rm SZ}}}{\bf u}$$ in the error equation (4.6) we get, adding and subtracting appropriate terms,   \begin{align} &\int_{\it{\Omega}} \big({\boldsymbol{\mathcal{A}}}(\nabla_h {\bf u}_h)-{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})\big ):\nabla_h ({\bf u}_h-{\bf u})\, {\rm{d}}x \label{eq:error-1} \\ &\quad + \alpha \,h \int_{{\it{\Gamma}}} \big\{{{\boldsymbol{\mathcal{A}}}_{\lvert{\nabla_h {\bf u}_h}\rvert}(h^{-1}\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}\right]\kern-0.15em\right])}\big\}:h^{-1} \left[\kern-0.15em\left[{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}\right]\kern-0.15em\right]\, {\rm{d}}s \notag \\ &=-\int_{\it{\Omega}} \big({\boldsymbol{\mathcal{A}}}(\nabla_h {\bf u}_h)-{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})\big ):\nabla_h ({\bf u}-{{\it{\Pi}}_{{\rm SZ}}}{\bf u})\, {\rm{d}}x \notag \\ &\quad +h\, \int_{{\it{\Gamma}}} \big\{{{\boldsymbol{\mathcal{A}}}_{\lvert{\nabla_h {\bf u}_h}\rvert}(h^{-1}\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}^*_D) \otimes {\bf n}}\right]\kern-0.15em\right]):\nabla_h ({\bf u}_h-{{\it{\Pi}}_{{\rm SZ}}}{\bf u}) }\big\}\, {\rm{d}}s \notag \\ &\quad +h \int_{{\it{\Gamma}}} \big\{{{\boldsymbol{\mathcal{A}}}(\nabla_h {\bf u}_h)- {\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}\big\}:h^{-1} \left[\kern-0.15em\left[{({\bf u}_h-{{\it{\Pi}}_{{\rm SZ}}}{\bf u}) \otimes {\bf n}}\right]\kern-0.15em\right]\, {\rm{d}}s \notag \\ &\quad - \alpha \, h\int_{{\it{\Gamma}}} \big\{{{\boldsymbol{\mathcal{A}}}_{\lvert{\nabla_h {\bf u}_h}\rvert}(h^{-1}\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}\right]\kern-0.15em\right])}\big\}:h^{-1} \left[\kern-0.15em\left[{({\bf u}_D^* -{{\it{\Pi}}_{{\rm SZ}}}{\bf u})\otimes {\bf n}}\right]\kern-0.15em\right]\, {\rm{d}}s \notag \\ &=: I_1+I_2+I_3+I_4 .\notag \end{align} (4.7) In view of Proposition 2.6 used for $${\boldsymbol{\mathcal{A}}}$$ and $${\boldsymbol{\mathcal{A}}}_{\left| {{\nabla _h{\bf u}_h}} \right|}$$, we see that the 
two terms on the left-hand side are equivalent to   \begin{align} \int_{\it{\Omega}} \left| {{{\bf F}(\nabla_h {\bf u}_h ) -{\bf F}(\nabla {\bf u} )}} \right|^2\,{\rm{d}}x + \alpha\, h \int_{\it{\Gamma}} \big\{{\varphi_{\left| {{\nabla _h{\bf u}_h}} \right|} (h^{-1} \lvert{\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}\right]\kern-0.15em\right]}\rvert)}\big\}\, {\rm{d}}s . \label{eq:lhs-e1} \end{align} (4.8) Using Lemma 2.8 we get   \begin{align} \label{eq:er-i1} \left| {{I_1}} \right|&\le {\varepsilon} \int_{\it{\Omega}} \left| {{{\bf F}(\nabla_h {\bf u}_h ) -{\bf F}(\nabla {\bf u} )}} \right|^2\,{\rm{d}}x + c_{\varepsilon} \int_{\it{\Omega}} \left| {{{\bf F}(\nabla {\bf u} ) -{\bf F}(\nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u} )}} \right|^2\,{\rm{d}}x . \end{align} (4.9) Young’s inequality implies   \begin{align} \label{eq:er-i2} \begin{aligned} \left| {{I_2}} \right|&\le c_{\varepsilon} \, h\,\int _{\it{\Gamma}} \big\{{\varphi_{\left| {{\nabla_h {\bf u}_h}} \right|} (h^{-1}\lvert{\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}^*_D) \otimes {\bf n}}\right]\kern-0.15em\right]}\rvert) }\big\}\, {\rm{d}}s \\ &\quad +{\varepsilon} \, h\, \int_{{\it{\Gamma}}} \big\{{\varphi_{\lvert{\nabla_h {\bf u}_h}\rvert} (\lvert{\nabla_h{\bf u}_h- \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u}}\rvert) }\big\} \, {\rm{d}}s . \end{aligned} \end{align} (4.10) Let $$\gamma \in {\it{\Gamma}} $$ be a face adjacent to some element $$K$$. 
Using Lemma 2.5, (A.10), Lemma 2.5 again, Proposition 2.6, Lemma A5, adding and subtracting appropriate terms and Jensen’s inequality we estimate$$^{3}$$  \begin{align} & h \int_{\gamma} {\varphi_{\lvert{{\nabla_h {\bf u}_h}}\rvert} (\lvert{\nabla_h{\bf u}_h- \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u}}\rvert) }\, {\rm{d}}s\notag \\ &\le h\, c \int_{\gamma} {\varphi_{\lvert{\langle{\nabla_h {\bf u}_h}\rangle_K}\rvert} (\lvert{\nabla_h{\bf u}_h- \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u}}\rvert) }+{\varphi_{\lvert{\langle{\nabla_h {\bf u}_h}\rangle_K}\rvert} (\lvert{\nabla_h{\bf u}_h- \langle{\nabla_h {\bf u}_h}\rangle_K}\rvert) }\, {\rm{d}}s\notag \\ &\le c \int_{K} {\varphi_{\lvert{\langle{\nabla_h {\bf u}_h}\rangle_{K}}\rvert} (\lvert{\nabla_h{\bf u}_h- \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u}}\rvert) }+{\varphi_{\lvert{\langle{\nabla_h {\bf u}_h}\rangle_{K}}\rvert} (\lvert{\nabla_h{\bf u}_h- \langle{\nabla_h {\bf u}_h}\rangle_{K}}\rvert) }\, {\rm{d}}x\notag \\ &\le c \int_{K} {\varphi_{\lvert{{\nabla_h {\bf u}_h}}\rvert} (\lvert{\nabla_h{\bf u}_h- \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u}}\rvert) }+{\varphi_{\lvert{{\nabla_h {\bf u}_h}}\rvert} (\lvert{\nabla_h{\bf u}_h- \langle{\nabla_h {\bf u}_h}\rangle_{K}}\rvert) }\, {\rm{d}}x\notag \\ &\le c \int_{K} \lvert{{\bf F} (\nabla_h{\bf u}_h)- {\bf F}( \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u})}\rvert^2 +\lvert{{\bf F} (\nabla_h{\bf u}_h)- {\bf F}( \langle{\nabla_h {\bf u}_h}\rangle_{K}) }\rvert^2\, {\rm{d}}x\notag \\ &\le c \int_{K} \lvert{{\bf F} (\nabla_h{\bf u}_h)- {\bf F}( \nabla {\bf u})}\rvert^2+\lvert{{\bf F} (\nabla{\bf u})- {\bf F}( \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u})}\rvert^2\, {\rm{d}}x\notag \\ &\quad +c \int_K\lvert{{\bf F} (\nabla{\bf u})- \langle{{\bf F}(\nabla {\bf u})}\rangle_{K} }\rvert^2\, {\rm{d}}x . 
\label{eq:er-i2a} \end{align} (4.11) Thus, we get   \begin{align} \left| {{I_2}} \right| &\le c_{\varepsilon} \, h\int _{\it{\Gamma}} \!\big\{{\varphi_{\left| {{\nabla_h {\bf u}_h}} \right|} (h^{-1}\lvert{\left[\kern-0.15em\left[{({\bf u}_h \!-\!{\bf u}^*_D) \otimes {\bf n}}\right]\kern-0.15em\right]\rvert}) }\big\}\, {\rm{d}}s +{\varepsilon} \, c \!\int_{{\it{\Omega}}} \lvert{{\bf F} (\nabla_h{\bf u}_h)- {\bf F}( \nabla {\bf u})}\rvert^2\, {\rm{d}}x \notag \\ &\quad +c \!\int _{\it{\Omega}} \!\lvert{{\bf F} (\nabla{\bf u})\!-\! {\bf F}( \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u})}\rvert^2\, {\rm{d}}x +c\! \sum_{K \in \mathcal T_h}\int_K\lvert{{\bf F} (\nabla{\bf u})\!-\! \langle{{\bf F}(\nabla {\bf u})}\rangle_{K} }\rvert^2\, {\rm{d}}x. \label{eq:er-i2b} \end{align} (4.12) Using (2.12), Young’s inequality, adding and subtracting $${\bf u}_D^*$$, using the definition of $${\bf u}_D^*$$, $$\varphi_{\left| {{\nabla_h {\bf u}_h}} \right|} (\lvert{\nabla_h{\bf u}_h\!-\! \nabla {\bf u}}\rvert) \sim \varphi_{\left| {{\nabla {\bf u}}} \right|} (\lvert{\nabla_h{\bf u}_h\!-\! \nabla {\bf u}}\rvert)$$ and adding and subtracting $$\nabla _h {{\it{\Pi}}_{{\rm SZ}}}{\bf u}$$, we obtain   \begin{align} \left| {{I_3}} \right| &\le c_{\varepsilon} h\!\int _{\it{\Gamma}}\!\! \big\{{\varphi_{\left| {{\nabla_h {\bf u}_h}} \right|} (h^{-1}\lvert{\left[\kern-0.15em\left[{({\bf u}_h \!-\!{{\it{\Pi}}_{{\rm SZ}}}{\bf u}) \otimes {\bf n}}\right]\kern-0.15em\right]\rvert}) }\big\}\, {\rm{d}}s \!+\!{\varepsilon} \, h\! \int_{{\it{\Gamma}}}\!\! \big\{{\varphi_{\lvert{\nabla_h {\bf u}_h}\rvert} (\lvert{\nabla_h{\bf u}_h\!-\! \nabla {\bf u}}\rvert) }\big\} \,{\rm{d}}s \notag \\ &\le c_{\varepsilon} h\!\!\int _{\it{\Gamma}}\!\!\big\{{\varphi_{\left| {{\nabla_h {\bf u}_h}} \right|} (h^{-1}\lvert{\left[\kern-0.15em\left[{({\bf u}_h \!-\!{\bf u}_D^*) \otimes {\bf n}}\right]\kern-0.15em\right]\rvert}) }\big\}\!+\! 
\big\{{\varphi_{\left| {{\nabla_h {\bf u}_h}} \right|} (h^{-1}\lvert{\left[\kern-0.15em\left[{({\bf u}-{{\it{\Pi}}_{{\rm SZ}}}{\bf u}) \otimes {\bf n}}\right]\kern-0.15em\right]\rvert}) }\big\} \,{\rm{d}}s \notag \\ &\quad +{\varepsilon}\, c \,h\! \int_{{\it{\Gamma}}} \!\big\{{\varphi_{\lvert{\nabla {\bf u}}\rvert} (\lvert{\nabla_h{\bf u}_h- \nabla {\bf u}}\rvert) }\big\}\, {\rm{d}}s \notag \\ &\le c_{\varepsilon} h\!\int _{\it{\Gamma}}\!\!\big\{{\varphi_{\left| {{\nabla_h {\bf u}_h}} \right|} (h^{-1}\lvert{\left[\kern-0.15em\left[{({\bf u}_h \!-\!{\bf u}_D^*) \otimes {\bf n}}\right]\kern-0.15em\right]\rvert}) }\big\}\!+\! \big\{{\varphi_{\left| {{\nabla_h {\bf u}_h}} \right|} (h^{-1}\lvert{\left[\kern-0.15em\left[{({\bf u}-{{\it{\Pi}}_{{\rm SZ}}}{\bf u}) \otimes {\bf n}}\right]\kern-0.15em\right]\rvert}) }\big\}\, {\rm{d}}s \notag \\ &\quad +{\varepsilon}\, c \,h\! \int_{{\it{\Gamma}}} \!\big\{{\varphi_{\lvert{\nabla {\bf u}}\rvert} (\lvert{\nabla_h{\bf u}_h- \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u}}\rvert) }\big\}+{\varphi_{\lvert{\nabla {\bf u}}\rvert} (\lvert{\nabla{\bf u}- \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u}\rvert}) }\, {\rm{d}}s \notag \\ &=: J_1+J_2+J_3+J_4 . \label{eq:er-i3} \end{align} (4.13) The term $$J_2$$ is nonzero only for $$\gamma \in {\it{\Gamma}}_{\rm D}$$. 
For $$\gamma \in {\it{\Gamma}}_{\rm D}$$ with $$\gamma \in \partial K$$, we use Lemma 2.5 for $$\beta \in (0,1)$$, the identity   \begin{align} {\bf u} -{{\it{\Pi}}_{{\rm SZ}}} {\bf u}=({\bf u}-{{\it{\Pi}}_{{\rm SZ}}} {\bf u}) - {{\it{\Pi}}_{{\rm SZ}}} ({\bf u}-{{\it{\Pi}}_{{\rm SZ}}} {\bf u}) ,\label{eq:proj} \end{align} (4.14) (A.16) with $${\bf g}=h^{-1}({\bf u}- {{\it{\Pi}}_{{\rm SZ}}} {\bf u})$$, (A.10), Lemma 2.5 for $$\beta_1 \in (0,1)$$, the equivalence $$\varphi_{\lvert{\langle{\nabla_h {\bf u}_h}\rangle_{S_K}}\rvert} (\lvert{\nabla_h{\bf u}_h - \langle{\nabla_h {\bf u}_h}\rangle_{S_K}}\rvert) \sim \varphi_{\lvert{{\nabla_h {\bf u}_h}}\rvert} (\lvert{\nabla_h{\bf u}_h - \langle{\nabla_h {\bf u}_h}\rangle_{S_K}}\rvert) $$, Proposition 2.6, Lemma A5, add and subtract appropriate terms, Jensen’s inequality and obtain   \begin{align} & h\int _\gamma {\varphi_{\left| {{\nabla_h {\bf u}_h}} \right|} (h^{-1}\lvert{\left[\kern-0.15em\left[ {({\bf u}-{{\it{\Pi}}_{{\rm SZ}}}{\bf u}) \otimes {\bf n}}\right]\kern-0.15em\right]}\rvert) }\, {\rm{d}}s \label{eq:er-i3b} \\ &\le h \int_{\gamma} c_\beta\,{\varphi_{\lvert{\langle{\nabla_h {\bf u}_h}\rangle_{S_K}}\rvert} (h^{-1}\lvert{\left[\kern-0.15em\left[{({\bf u} -{{\it{\Pi}}_{{\rm SZ}}}{\bf u}) \otimes {\bf n}}\right]\kern-0.15em\right]}\rvert) }\, {\rm{d}}s \notag \\ &\quad + h \int_{\gamma} \beta\,{\varphi_{\lvert{\langle{\nabla_h {\bf u}_h}\rangle_{S_K}}\rvert} (\lvert{\nabla_h{\bf u}_h - \langle{\nabla_h {\bf u}_h}\rangle_{S_K}}\rvert) } \, {\rm{d}}s \notag \\ &\le \int_{S_K}\!\!c_\beta\, {\varphi_{\lvert{\langle{\nabla_h {\bf u}_h}\rangle_{S_K}}\rvert} (\lvert{\nabla{\bf u} -\nabla_h {{\it{\Pi}}_{{\rm SZ}}} {\bf u}}\rvert) }+c\,\beta\,{\varphi_{\lvert{\langle{\nabla_h {\bf u}_h}\rangle_{S_K}}\rvert} (\lvert{\nabla_h{\bf u}_h - \langle{\nabla_h {\bf u}_h}\rangle_{S_K}}\rvert) }\, {\rm{d}}x \notag \\ &\le \int_{S_K}\!c_\beta\,c_{\beta_1}\, {\varphi_{\lvert{{\nabla {\bf u}}}\rvert} (\lvert{\nabla{\bf u} -\nabla_h 
{{\it{\Pi}}_{{\rm SZ}}} {\bf u}}\rvert) }+c_\beta\,\beta_1\,{\varphi_{\lvert{{\nabla {\bf u}}}\rvert} (\lvert{\nabla{\bf u} - \langle{\nabla_h {\bf u}_h}\rangle_{S_K}}\rvert) }\, {\rm{d}}x \notag \\ &\quad + \int_{S_K}c\,\beta\,{\varphi_{\lvert{{\nabla_h {\bf u}_h}}\rvert} (\lvert{\nabla_h{\bf u}_h \! -\! \langle{\nabla_h {\bf u}_h}\rangle_{S_K}}\rvert) }\, {\rm{d}}x \notag \\ &\le \int_{S_K}c_\beta\,c_{\beta_1}\, \lvert{{\bf F} (\nabla{\bf u})- {\bf F}( \nabla_h {{\it{\Pi}}_{{\rm SZ}}} {\bf u})}\rvert^2 + (c\,\beta+c_\beta\,\beta_1)\,\lvert{{\bf F} (\nabla{\bf u})- \langle{{\bf F}(\nabla {\bf u})}\rangle_{S_K} }\rvert^2 \, {\rm{d}}x\notag \\ &\quad +\int_{S_K} (c\,\beta+c_\beta\,\beta_1)\, \lvert{{\bf F} (\nabla_h{\bf u}_h)- {\bf F}( \nabla {\bf u})}\rvert^2 \, {\rm{d}}x . \notag \end{align} (4.15) Thus, $$J_2$$ is estimated by   \begin{align} \label{eq:er-i2bA} & c_{\varepsilon}\, (c\,\beta+c_\beta\,\beta_1 ) \int_{{\it{\Omega}}} \lvert{{\bf F} (\nabla_h{\bf u}_h)- {\bf F}( \nabla {\bf u})}\rvert^2 \, {\rm{d}}x \\ &\; +c_{\varepsilon}\,c_\beta\,c_{\beta_1}\!\int _{\it{\Omega}} \!\lvert{{\bf F} (\nabla{\bf u})\!-\! {\bf F}( \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u})}\rvert^2\, {\rm{d}}x + c_{\varepsilon}\,c_\beta \sum_{ K \in \mathcal T_h } \int_{S_K}\!\lvert{{\bf F} (\nabla{\bf u})\!-\! \langle{{\bf F}(\nabla {\bf u})}\rangle_{S_K} }\rvert^2\, {\rm{d}}x. \notag \end{align} (4.16) The term $$J_3$$ is treated similarly to (4.11). In fact, let $$\gamma \in {\it{\Gamma}} $$ be a face adjacent to some element $$K$$. 
Using Lemma 2.5, Proposition 2.6, (A.10), Lemma A1, Lemma 2.5, adding and subtracting $$\nabla {\bf u}$$, using Proposition 2.6 and Lemma A5, we estimate   \begin{align} & h \int_{\gamma} {\varphi_{\lvert{{\nabla {\bf u}}}\rvert} (\lvert{\nabla_h{\bf u}_h- \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u}}\rvert) }\, {\rm{d}}s \label{eq:er-i2aa} \\ &\le h\, c \int_{\gamma} {\varphi_{\lvert{\langle{\nabla {\bf u}}\rangle_K}\rvert} (\lvert{\nabla_h{\bf u}_h- \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u}}\rvert)} + {\varphi_{\lvert{\langle{\nabla {\bf u}}\rangle_K}\rvert} (\lvert{\nabla{\bf u}- \langle{\nabla {\bf u}}\rangle_K}\rvert) }\, {\rm{d}}s \notag \\ &\le h\, c \int_{\gamma} {\varphi_{\lvert{\langle{\nabla {\bf u}}\rangle_K}\rvert} (\lvert{\nabla_h{\bf u}_h- \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u}}\rvert)} + \lvert{{\bf F} (\nabla{\bf u})- {\bf F}(\langle{\nabla {\bf u}}\rangle_{K}) }\rvert^2\, {\rm{d}}s \notag \\ &\le c \int_{K} {\varphi_{\lvert{\langle{\nabla{\bf u}}\rangle_{K}}\rvert} (\lvert{\nabla_h{\bf u}_h- \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u}}\rvert) }+ \lvert{{\bf F} (\nabla{\bf u})- {\bf F}(\langle{\nabla {\bf u}}\rangle_{K}) }\rvert^2 + h^2\,\lvert{\nabla {\bf F} (\nabla{\bf u}) }\rvert^2\, {\rm{d}}x \notag \\ &\le c \int_{K} {\varphi_{\lvert{{\nabla {\bf u}}}\rvert} (\lvert{\nabla_h{\bf u}_h- \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u}}\rvert) }+ \lvert{{\bf F} (\nabla{\bf u})- {\bf F}(\langle{\nabla {\bf u}}\rangle_{K}) }\rvert^2 + h^2\,\lvert{\nabla {\bf F} (\nabla{\bf u}) }\rvert^2\, {\rm{d}}x \notag \\ &\le c \int_{K} \lvert{{\bf F} (\nabla_h{\bf u}_h)- {\bf F}( \nabla {\bf u})}\rvert^2+\lvert{{\bf F} (\nabla{\bf u})- {\bf F}( \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u})}\rvert^2\, {\rm{d}}x \notag \\ &\quad +c \int_K\lvert{{\bf F} (\nabla{\bf u})- \langle{{\bf F}(\nabla {\bf u})}\rangle_{K} }\rvert^2 + h^2\,\lvert{\nabla {\bf F} (\nabla{\bf u}) }\rvert^2\, {\rm{d}}x. 
\notag \end{align} (4.17) Thus, $$J_3$$ is estimated by   \begin{align} \label{eq:er-i3bA} &{\varepsilon} \, c \int_{{\it{\Omega}}} \lvert{{\bf F} (\nabla_h{\bf u}_h)- {\bf F}( \nabla {\bf u})}\rvert^2 \, {\rm{d}}x + c\, h^2\int_{\it{\Omega}} \lvert{\nabla {\bf F} (\nabla{\bf u}) }\rvert^2\, {\rm{d}}x \\ &\quad +c \int _{\it{\Omega}} \lvert{{\bf F} (\nabla{\bf u})- {\bf F}( \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u})}\rvert^2\, {\rm{d}}x +c \sum_{K \in \mathcal T_h}\int_K\lvert{{\bf F} (\nabla{\bf u})- \langle{{\bf F}(\nabla {\bf u})}\rangle_{K} }\rvert^2\, {\rm{d}}x . \notag \end{align} (4.18) Corollary 4.2 implies that   \begin{align}\label{eq:er-i4aA} \lvert{J_4}\rvert &\le {\varepsilon} \,c\, \,h^2\, \int_{{\it{\Omega}}} \left| {{\nabla {\bf F}(\nabla {\bf u})}} \right|^2\,{\rm{d}}x . \end{align} (4.19) Finally, Young’s inequality and the definition of $${\bf u}_D^*$$ imply   \begin{align} \label{eq:er-i4} \begin{aligned} \left| {{I_4}} \right|&\le {\varepsilon} \,\alpha\, h\,\int _{\it{\Gamma}} \big\{{\varphi_{\left| {{\nabla_h {\bf u}_h}} \right|} (h^{-1}\lvert{\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}^*_D) \otimes {\bf n}}\right]\kern-0.15em\right]}\rvert) }\big\}\, {\rm{d}}s \\ &\quad +c_{\varepsilon} \,\alpha\, h\, \int_{{\it{\Gamma}}} \big\{{\varphi_{\lvert{\nabla_h {\bf u}_h}\rvert} (h^{-1}\lvert{\left[\kern-0.15em\left[{({\bf u}- {{\it{\Pi}}_{{\rm SZ}}}{\bf u})\otimes {\bf n} }\right]\kern-0.15em\right]}\rvert) }\big\}\, {\rm{d}}s . \end{aligned} \end{align} (4.20) The last term in (4.20) is treated as $$J_2$$ and thus estimated by   \begin{align} \label{eq:er-i4bA} & c_{\varepsilon}\,\alpha\, (c\,\beta+c_\beta\,\beta_1 ) \int_{{\it{\Omega}}} \lvert{{\bf F} (\nabla_h{\bf u}_h)- {\bf F}( \nabla {\bf u})}\rvert^2 \, {\rm{d}}x \\ &\, +\!c_{\varepsilon}\alpha\, c_\beta\,c_{\beta_1}\!\!\int _{\it{\Omega}} \!\lvert{{\bf F} (\nabla{\bf u})\!-\! 
{\bf F}( \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u})}\rvert^2\, {\rm{d}}x + c_{\varepsilon}\alpha\,c_\beta \!\sum_{ K \in \mathcal T_h}\int_{S_K}\!\!\lvert{{\bf F} (\nabla{\bf u})\!-\! \langle{{\bf F}(\nabla {\bf u})}\rangle_{S_K} }\rvert^2 {\rm{d}}x. \notag \end{align} (4.21) Choosing $$\varepsilon$$ small enough, we can absorb all terms with $${\varepsilon}$$ in (4.9), (4.12), (4.18) and (4.20) into the corresponding terms in (4.8). Now $$c_{\varepsilon}$$ is fixed, and we choose $$\alpha $$ large enough to absorb the terms with $$\varphi_{\left| {{\nabla_h {\bf u}_h}} \right|} (h^{-1}\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}^*_D) \otimes {\bf n}}\right]\kern-0.15em\right]) $$ in (4.10) and (4.13) into the corresponding term in (4.8). Then, we choose first $$\beta$$ and then $$\beta_1$$ small enough to absorb the terms with $$\beta$$ and $$\beta_1$$ in (4.16) and (4.21) into the corresponding term in (4.8). This way we arrive at   \begin{align} &\int_{\it{\Omega}} \left| {{{\bf F}(\nabla_h {\bf u}_h ) -{\bf F}(\nabla {\bf u} )}} \right|^2\,{\rm{d}}x + \alpha\, h \int_{\it{\Gamma}} \big\{{\varphi_{\left| {{\nabla _h{\bf u}_h}} \right|} (h^{-1} \lvert{\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}\right]\kern-0.15em\right]}\rvert)}\big\}\, {\rm{d}}s \notag \\ &\le c_\alpha \, \int _{\it{\Omega}} \lvert{{\bf F} (\nabla{\bf u})- {\bf F}( \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u})}\rvert^2\, {\rm{d}}x + c_\alpha \sum_{K \in \mathcal T_h}\int_K\lvert{{\bf F} (\nabla{\bf u})- \langle{{\bf F}(\nabla {\bf u})}\rangle_{K} }\rvert^2\, {\rm{d}}x\notag \\ &\quad + c_\alpha \sum_{K \in \mathcal T_h}\int_{S_{K}}\lvert{{\bf F} (\nabla{\bf u})- \langle{{\bf F}(\nabla {\bf u})}\rangle_{S_{K}} }\rvert^2\, {\rm{d}}x +c_\alpha\, \,h^2\, \int_{{\it{\Omega}}} \left| {{\nabla {\bf F}(\nabla {\bf u})}} \right|^2\,{\rm{d}}x . 
\label{eq:err} \end{align} (4.22) The first three terms on the right-hand side are estimated by $$c\,h^2\,\left\| {{\nabla {\bf F}(\nabla {\bf u})}} \right\|_2^2$$ due to Proposition A2 and Poincaré’s inequality on $$W^{1,2}(K)$$ and $$W^{1,2}(S_{K})$$, respectively (cf. Diening et al., 2010, Theorem 6.5). This proves the assertion of the theorem. □ Let us now turn to Scheme 2.11. Using (2.25) and (2.27), we get our error equation for Scheme 2.11:   \begin{align}\label{eq:error-s2} \begin{aligned} &\int_{\it{\Omega}} \big({\boldsymbol{\mathcal{A}}}({\nabla_{\rm DG}^h} {\bf u}_h +{\bf R}_h {\bf u}_D^*)-{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})\big ):{\nabla_{\rm DG}^h} {\bf z}_h\, {\rm{d}}x \\ &\quad + \alpha \int_{{\it{\Gamma}}} {{\boldsymbol{\mathcal{A}}}(h^{-1}\left[\kern-0.15em\left[ {{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}} \right]\kern-0.15em\right])}: \left[\kern-0.15em\left[ {{{\bf z}_h \otimes {\bf n}}} \right]\kern-0.15em\right]\, {\rm{d}}s \\ &=\int_{{\it{\Gamma}}} \big\{{{{\it{\Pi}}_{\rm DG}}{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})- {\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}\big\}: \left[\kern-0.15em\left[ {{{\bf z}_h \otimes {\bf n}}} \right]\kern-0.15em\right]\, {\rm{d}}s \\ &\quad + \int_{{\it{\Omega}}} {{\boldsymbol{\mathcal{A}}}({\bf R}_h{({\bf u}_h -{\bf u}^*_D) }):{\bf R}_h {\bf z}_h }\, {\rm{d}}x . \end{aligned} \end{align} (4.23) Theorem 4.5 Let $$ {\bf u}_D^*= {\bf u}$$ or $$ {\bf u}_D^*= {{\it{\Pi}}_{{\rm SZ}}}{\bf u}$$. 
Then there exists $$\alpha_0$$ such that for all $$\alpha \ge \alpha_0$$ the solution $${\bf u}_h \in {V_h^k} $$, $$k\ge 1$$ of Scheme 2.11 and a solution $${\bf u} \in W^{1,\varphi}({\it{\Omega}}) $$ of (1.1) with $${\bf F} (\nabla {\bf u}) \in W^{1,2}({\it{\Omega}})$$ and $$\nabla {\boldsymbol{\mathcal{A}}}(\nabla {\bf u}) \in L^{\varphi^*}({\it{\Omega}})$$ satisfy the error estimate   \begin{align*} &\int_{\it{\Omega}} \left| {{{\bf F}({\nabla_{\rm DG}^h} {\bf u}_h +{\bf R}_h {\bf u}_D^*) -{\bf F}(\nabla {\bf u} )}} \right|^2\,{\rm{d}}x + \alpha\, h \int_{\it{\Gamma}} {\varphi(h^{-1} \lvert{\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}\right]\kern-0.15em\right]}\rvert)}\, {\rm{d}}s \\ &\le c\, h^{\min (p,p')} \left ( \int _{\it{\Omega}}\lvert{\nabla {\bf F}(\nabla {\bf u})}\rvert^2 + \varphi(\lvert{\nabla {\bf u}}\rvert) + \varphi^*(\lvert{\nabla {\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}\rvert)\, {\rm{d}}x \right ), \end{align*} with a constant depending only on the characteristics of $${\boldsymbol{\mathcal{A}}}$$, the chunkiness $$\omega_0$$ and $$\alpha^{-1}$$. Proof. 
Using $${\bf z}_h := {\bf u}_h - {{\it{\Pi}}_{{\rm SZ}}}{\bf u}$$ in the error equation (4.23) we get, adding and subtracting appropriate terms,   \begin{align} &\int_{\it{\Omega}} \big({\boldsymbol{\mathcal{A}}}({\nabla_{\rm DG}^h} {\bf u}_h +{\bf R}_h {\bf u}_D^*)-{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})\big ):\big ({\nabla_{\rm DG}^h} {\bf u}_h +{\bf R}_h{\bf u}_D^* -\nabla {\bf u} \big )\, {\rm{d}}x \label{eq:error-2} \\ &\quad + \alpha \,h \int_{{\it{\Gamma}}} {{\boldsymbol{\mathcal{A}}}(h^{-1}\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}\right]\kern-0.15em\right])}:h^{-1} \left[\kern-0.15em\left[{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}\right]\kern-0.15em\right]\, {\rm{d}}s \notag \\ &=\int_{\it{\Omega}} \big({\boldsymbol{\mathcal{A}}}({\nabla_{\rm DG}^h} {\bf u}_h +{\bf R}_h {\bf u}_D^*)-{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})\big ): \big ({\nabla_{\rm DG}^h} {{\it{\Pi}}_{{\rm SZ}}}{\bf u} +{\bf R}_h{\bf u}_D^* -\nabla {\bf u} \big )\, {\rm{d}}x \notag \\ &\quad +\int_{{\it{\Omega}}} {{\boldsymbol{\mathcal{A}}}({\bf R}_h {({\bf u}_h -{\bf u}^*_D)}):{\bf R}_h ({\bf u}_h-{{\it{\Pi}}_{{\rm SZ}}}{\bf u}) }\, {\rm{d}}x \notag \\ &\quad +h \int_{{\it{\Gamma}}} \big\{{{{\it{\Pi}}_{\rm DG}}{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})- {\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}\big\}:h^{-1} \left[\kern-0.15em\left[{({\bf u}_h-{{\it{\Pi}}_{{\rm SZ}}}{\bf u}) \otimes {\bf n}}\right]\kern-0.15em\right]\, {\rm{d}}s \notag \\ &\quad - \alpha \, h\int_{{\it{\Gamma}}} {{\boldsymbol{\mathcal{A}}}(h^{-1}\left[\kern-0.15em\left[ {{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}} \right]\kern-0.15em\right])}:h^{-1} \left[\kern-0.15em\left[{({\bf u}_D^* -{{\it{\Pi}}_{{\rm SZ}}}{\bf u})\otimes {\bf n}}\right]\kern-0.15em\right]\, {\rm{d}}s \notag \\ &=: I_1+I_2+I_3+I_4 .\notag \end{align} (4.24) In view of Proposition 2.6 used for $${\boldsymbol{\mathcal{A}}}$$ we see that the two terms on the left-hand side are equivalent to   \begin{align} \int_{\it{\Omega}} 
\left| {{{\bf F}({\nabla_{\rm DG}^h} {\bf u}_h\! +\!{\bf R}_h {\bf u}_D^*)\! -\!{\bf F}(\nabla {\bf u} )}} \right|^2\,{\rm{d}}x + \alpha\, h \!\int_{\it{\Gamma}} \!\varphi { (h^{-1} \lvert{\left[\kern-0.15em\left[{({\bf u}_h \! -\!{\bf u}_D^*)\otimes {\bf n}}\right]\kern-0.15em\right]}\rvert)}\, {\rm{d}}s . \label{eq:lhs-e2} \end{align} (4.25) Using $${\nabla_{\rm DG}^h} {{\it{\Pi}}_{{\rm SZ}}}{\bf u}+{\bf R}_h {\bf u}_D^*-\nabla {\bf u} = \nabla _h({{\it{\Pi}}_{{\rm SZ}}} {\bf u} -{\bf u}) -{\bf R}_h ({{\it{\Pi}}_{{\rm SZ}}} {\bf u} -{\bf u}_D^*)$$, the definition of $${\bf u}_D^*$$, Young’s inequality and Proposition 2.6 we get   \begin{align} \label{eq:er-i1A} \left| {{I_1}} \right| &\le {\varepsilon} \int_{\it{\Omega}} \left| {{{\bf F}({\nabla_{\rm DG}^h} {\bf u}_h\! +\!{\bf R}_h {\bf u}_D^* ) -{\bf F}(\nabla {\bf u} )}} \right|^2\,{\rm{d}}x \\ &\quad + c_{\varepsilon} \int_{\it{\Omega}}\left| {{{\bf F}(\nabla {\bf u} ) -{\bf F}(\nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u} )}} \right|^2 + \varphi_{\left| {{\nabla {\bf u}}} \right|}(\lvert{{\bf R}_h ({{\it{\Pi}}_{{\rm SZ}}} {\bf u} - {\bf u} )\rvert})\,{\rm{d}}x . \notag \end{align} (4.26) Using the properties of $${\bf R}_h$$, especially (A.13), (A.17), we can proceed as in the proof of Diening et al. (2014, Theorem 4.8) and show that the last integral is bounded by $$c\, h^2\left\| {{ \nabla {\bf F}(\nabla {\bf u})}} \right\|_2^2$$. 
From Young’s inequality, estimate (A.13), adding and subtracting $${\bf u}_D^*$$, using the definition of $${\bf u}_D^*$$ and (A.18) we obtain   \begin{align} \label{eq:er-i2A} \left| {{I_2}} \right|&\le c \int _{\it{\Omega}}{\varphi (\lvert{{\bf R}_h ({\bf u}_h -{\bf u}^*_D) }\rvert) }\, {\rm{d}}x +c \int_{{\it{\Omega}}} {\varphi} (\lvert{{\bf R}_h ({\bf u}_h- {{\it{\Pi}}_{{\rm SZ}}}{\bf u}\rvert)} ) \, {\rm{d}}x \\ &\le c\, h \int _{\it{\Gamma}}{\varphi (h^{-1} \lvert{\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}^*_D)\otimes {\bf n} }\right]\kern-0.15em\right]}\rvert) }\, {\rm{d}}s +c\, h \int_{{\it{\Gamma}}} {\varphi} (h^{-1}\lvert{\left[\kern-0.15em\left[{({\bf u}- {{\it{\Pi}}_{{\rm SZ}}}{\bf u})\otimes {\bf n} }\right]\kern-0.15em\right]} \rvert) \, {\rm{d}}s\notag \\ &\le c\, h \int _{\it{\Gamma}}{\varphi (h^{-1} \lvert{\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}^*_D)\otimes {\bf n} }\right]\kern-0.15em\right]}\rvert)}\, {\rm{d}}s +c\,h^{\min (2,p)} \int_{{\it{\Omega}}} \lvert{\nabla {\bf F}(\nabla {\bf u} )}\rvert^2 + \varphi (\lvert{\nabla {\bf u} }\rvert) \, {\rm{d}}x .\notag \end{align} (4.27) Young’s inequality, adding and subtracting $${\bf u}_D^*$$, using the definition of $${\bf u}_D^*$$ and (A.18) yield   \begin{align} \label{eq:er-i3A} \left| {{I_3}} \right| &\le {\varepsilon} \, h \int _{\it{\Gamma}}{\varphi (h^{-1} \lvert{\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}^*_D)\otimes {\bf n} }\right]\kern-0.15em\right]}\rvert)} + {\varphi} (h^{-1}\lvert{\left[\kern-0.15em\left[{({\bf u}- {{\it{\Pi}}_{{\rm SZ}}}{\bf u})\otimes {\bf n} }\right]\kern-0.15em\right]} \rvert) \, {\rm{d}}s \\ &\quad + c_{\varepsilon} \, h \int _{\it{\Gamma}}\big\{ {\varphi^*(\lvert{{{\it{\Pi}}_{\rm DG}}{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})- {\boldsymbol{\mathcal{A}}}(\nabla {\bf u})\rvert})}\big\}\, {\rm{d}}s \notag \\ &\le {\varepsilon} \, h \int _{\it{\Gamma}}{\varphi (h^{-1} \lvert{\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}^*_D)\otimes {\bf n} 
}\right]\kern-0.15em\right]}\rvert)} \, {\rm{d}}s + c\,h^{\min (2,p)} \int_{{\it{\Omega}}} \lvert{\nabla {\bf F}(\nabla {\bf u} )}\rvert^2 + \varphi (\lvert{\nabla {\bf u} }\rvert) \, {\rm{d}}x \notag \\ &\quad + c_{\varepsilon} \, h \int _{\it{\Gamma}}\big\{ {\varphi^*(\lvert{{{\it{\Pi}}_{\rm DG}}{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})- {\boldsymbol{\mathcal{A}}}(\nabla {\bf u})\rvert})}\big\}\, {\rm{d}}s. \notag \end{align} (4.28) Using (A.11) for each $$\gamma \in {\it{\Gamma}} $$ we estimate the last term by   \begin{align}\label{eq:er-13B} c_{\varepsilon} \int_{\it{\Omega}} \varphi^*(h\, \lvert{\nabla {\boldsymbol{\mathcal{A}}}(\nabla {\bf u})\rvert})\, {\rm{d}}x . \end{align} (4.29) Young’s inequality, the definition of $${\bf u}_D^*$$ and (A.18) yield   \begin{align} \lvert{I_4}\rvert&\le \varepsilon \alpha h\! \int_{\it{\Gamma}}\!\!\varphi(h^{-1}\lvert{\left[\kern-0.15em\left[{({\bf u}_h\!-\!{\bf u}_D^*)\otimes {\bf n}}\right]\kern-0.15em\right]}\rvert)\, {\rm{d}}s \!+\!c_\varepsilon \alpha\, h\! \int_{\it{\Gamma}}\!\!\varphi(h^{-1}\lvert{\left[\kern-0.15em\left[{({\bf u}\!-\!{{\it{\Pi}}_{{\rm SZ}}}{\bf u})\otimes {\bf n}}\right]\kern-0.15em\right]}\rvert)\, {\rm{d}}s \label{eq:er-i4A} \\ &\le \varepsilon \alpha h \! \int_{\it{\Gamma}} \!\varphi(h^{-1}\lvert{\left[\kern-0.15em\left[{({\bf u}_h \!-\!{\bf u}_D^*)\otimes {\bf n}}\right]\kern-0.15em\right]}\rvert) \, {\rm{d}}s +c_\varepsilon \alpha\,\,h^{\min (2,p)}\!\! \int_{{\it{\Omega}}}\! \lvert{\nabla {\bf F}(\nabla {\bf u} )}\rvert^2 \! +\! \varphi (\lvert{\nabla {\bf u}}\rvert ) \, {\rm{d}}x. \end{align} (4.30) Choosing $${\varepsilon} $$ small enough, we can absorb all terms with $${\varepsilon}$$ in (4.26)–(4.30) in (4.25). Then, we choose $$ \alpha$$ large enough to absorb the remaining term in (4.27) with $$\varphi(h^{-1}\left| {{\left[\kern-0.15em\left[{({\bf u}_h-{\bf u}_D^*)\otimes {\bf n}}\right]\kern-0.15em\right]}} \right|) $$ in (4.25). 
This way we arrive at   \begin{align} &\int_{\it{\Omega}} \left| {{{\bf F}({\nabla_{\rm DG}^h} {\bf u}_h+{\bf R}_h{\bf u}_D^* ) -{\bf F}(\nabla {\bf u} )}} \right|^2\,{\rm{d}}x + \alpha\, h \int_{\it{\Gamma}} {\varphi (h^{-1} \lvert{\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}\right]\kern-0.15em\right]}\rvert)}\, {\rm{d}}s \notag \\ &\le c_\alpha \, \int _{\it{\Omega}} \lvert{{\bf F} (\nabla{\bf u})- {\bf F}( \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u})}\rvert^2\, {\rm{d}}x+ c_\alpha \,h^2 \int _{\it{\Omega}} \lvert {\nabla {\bf F} (\nabla{\bf u})}\rvert^2\, {\rm{d}}x \notag \\ &\quad + c_\alpha \,h^{\min (2,p)} \int_{{\it{\Omega}}} \lvert{\nabla {\bf F}(\nabla {\bf u} )}\rvert^2 + \varphi (\lvert{\nabla {\bf u} }\rvert) \, {\rm{d}}x + c_\alpha \int_{\it{\Omega}} \varphi^*(h\, \lvert{\nabla {\boldsymbol{\mathcal{A}}}(\nabla {\bf u}\rvert)})\, {\rm{d}}x . \label{eq:err-s2} \end{align} (4.31) Proposition A2 and   \begin{align*} \varphi^*(h\,t) \le c\, h^{\min(2,p')}\varphi^*(t)\, \end{align*} yield the assertion. □ Remark 4.6 For $${\bf u}_D^* ={{\it{\Pi}}_{{\rm SZ}}} {\bf u}$$, we can improve Theorem 4.5 for $$p\le 2$$. Indeed, one easily sees that in this case the terms with $$\varphi(h^{-1}\left[\kern-0.15em\left[{({\bf u}-{{\it{\Pi}}_{{\rm SZ}}}{\bf u})\otimes {\bf n}}\right]\kern-0.15em\right]) $$ in the estimates of $$I_2$$ and $$I_3$$ do not appear and that $$I_4=0$$. As a consequence, the term with $$\varphi (h\,\nabla^2{\bf u})$$ in (4.31) does not appear. Moreover, as in the proof of Diening et al. (2014, Theorem 4.5), one can show that the last term in (4.28) can be estimated by $$c\, \left\| {{{\bf F} (\nabla{\bf u})- {\bf F}( \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u})}} \right\|_2^2$$. 
Thus, we get instead of (4.31) the estimate   \begin{align*} &\int_{\it{\Omega}} \left| {{{\bf F}({\nabla_{\rm DG}^h} {\bf u}_h +{\bf R}_h {\bf u}_D^*) -{\bf F}(\nabla {\bf u} )}} \right|^2\,{\rm{d}}x + \alpha\, h \int_{\it{\Gamma}} {\varphi(h^{-1} \lvert{\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}\right]\kern-0.15em\right]}\rvert)}\, {\rm{d}}s \\ &\le c_\alpha \, \int _{\it{\Omega}} \lvert{{\bf F} (\nabla{\bf u})- {\bf F}( \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u})}\rvert^2\, {\rm{d}}x+ c_\alpha \,h^2 \int _{\it{\Omega}} \lvert {\nabla {\bf F} (\nabla{\bf u})}\rvert^2\, {\rm{d}}x \notag \\ &\le c_\alpha\, h^{2} \,\Big ( \int _{\it{\Omega}}\lvert{\nabla {\bf F}(\nabla {\bf u})}\rvert^2 \, {\rm{d}}x\Big ) . \end{align*} Let us summarize our error estimates. Theorem 4.3 shows that SIP Scheme 2.10 using shifts possesses an optimal convergence rate for linear ansatz functions in all considered realizations of the boundary values. Thus, it behaves as the classical FEM scheme analysed in Diening & Růžička (2007). This is the first theoretical result that shows optimal convergence rates for all $$p \in (1,\infty)$$. On the other hand, from Theorem 4.5, Remark 4.6 and the results from Diening et al. (2014) (summarized in Section 2.6), we see that the theoretical error analysis gives the same results for LDG Scheme 2.13 and the SIP scheme in the lift formulation Scheme 2.11. In particular, for $$p\le 2$$, the convergence rates for linear elements are optimal. The performance in numerical experiments of all three schemes is discussed in the next section. 5. Numerical experiments In this section, we apply SIP Schemes 2.10 and 2.11, and primal LDG Scheme 2.13 to solve numerically system (1.1) with $${\boldsymbol{\mathcal{A}}}$$ given by (1.2). We approximate the discrete solution $${\bf u}_h$$ of the nonlinear problems (2.24), (2.25) and (2.28) using a Newton scheme with modified Jacobian matrix. 
The modified Jacobian matrix in each Newton step is evaluated by replacing $${\boldsymbol{\mathcal{A}}}$$ by   \begin{equation*} {\boldsymbol{\mathcal{A}}}^\prime_\beta( {\bf Q} ) {\bf P} = ( \delta + \left| {\bf Q} \right| )^{p-2}\, {\bf P} + \beta ( p - 2 ) ( \delta + \left| {\bf Q} \right| )^{p-3} ( {\bf P}, {\bf Q} ) \frac {{\bf Q}}{\left| {\bf Q} \right|} \end{equation*} in (2.24), (2.25) and (2.28). Note that the true Jacobian corresponds to setting $$\beta=1$$ in the last formula. The parameter $$0 \leq \beta \leq 1$$ is chosen adaptively in each Newton step: it is increased if the residual $$\| {\bf r}_{h,i} \|$$ decreases and decreased otherwise. Here, $${\bf r}_{h,i}$$ denotes the update term of the $$i$$th Newton step. The stopping criterion for the Newton scheme is set to $$\| {\bf r}_{h,i} \| < 10^{-8}$$. The linear system emerging in each Newton step is solved using the sparse LU solver umfpack (Davis, 2004). The choice of the $$p$$-dependent penalty parameter $$\alpha$$ used for our experiments is presented in Table 1. 
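The modified derivative above can be illustrated by a minimal Python sketch. It is not the Dune-Fem implementation used for the experiments. It assumes that $$\left| {\bf Q} \right|$$ is the Frobenius norm and $$({\bf P},{\bf Q})$$ the Frobenius inner product. The names `flux`, `modified_jacobian` and `update_beta`, the step size `0.1` for adapting $$\beta$$, and the treatment of $${\bf Q}={\bf 0}$$ are hypothetical choices made only for this illustration.

```python
import numpy as np


def flux(Q, p, delta):
    """A(Q) = (delta + |Q|)^(p-2) Q from (1.2), with |Q| the Frobenius norm."""
    return (delta + np.linalg.norm(Q)) ** (p - 2) * Q


def modified_jacobian(Q, P, p, delta, beta):
    """Apply A'_beta(Q) to the direction P; beta = 1 is the true derivative."""
    nq = np.linalg.norm(Q)
    first = (delta + nq) ** (p - 2) * P
    if nq == 0.0:
        # Assumption of this sketch: Q/|Q| is undefined at Q = 0, so the
        # rank-one term is simply dropped there.
        return first
    inner = np.sum(P * Q)  # Frobenius inner product (P, Q)
    second = beta * (p - 2) * (delta + nq) ** (p - 3) * inner * Q / nq
    return first + second


def update_beta(beta, res_new, res_old, step=0.1):
    """Hypothetical rule: raise beta if the residual dropped, else lower it."""
    beta = beta + step if res_new < res_old else beta - step
    return min(1.0, max(0.0, beta))  # keep 0 <= beta <= 1


if __name__ == "__main__":
    Q = np.array([[1.0, 2.0], [-0.5, 0.3]])
    P = np.array([[0.2, -0.1], [0.4, 1.0]])
    p, delta, eps = 3.0, 1e-3, 1e-6
    # Central difference of A in direction P, compared with beta = 1.
    fd = (flux(Q + eps * P, p, delta) - flux(Q - eps * P, p, delta)) / (2 * eps)
    print(np.max(np.abs(fd - modified_jacobian(Q, P, p, delta, beta=1.0))))
```

For $$\beta=1$$ the finite-difference check should print a value close to zero, which reflects the statement that $$\beta=1$$ gives the true Jacobian. For $$\beta=0$$ only the first term $$(\delta+\left| {\bf Q} \right|)^{p-2}{\bf P}$$ remains.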
Table 1 Choice of the stabilization parameter $$\alpha$$
$$p$$                          1.25  4/3  1.5  5/3  1.8  2    2.25  2.5  3    4
$$\alpha_\text{SIP-shifted}$$  1.5   1.5  1.5  1.5  2.0  2.0  2.0   2.0  2.0  2.5
$$\alpha_\text{LDG}$$          0.5   0.5  0.5  0.5  0.5  0.5  0.5   0.5  0.5  0.9
$$\alpha_\text{SIP-lifting}$$  0.5   0.8  1.2  1.5  1.5  2.0  2.0   2.5  2.5  3.5
For our numerical experiments, we choose $${\it{\Omega}} = [-2,2]^2$$, linear elements, i.e., $$k=1$$, and $$\delta:=10^{-3}$$ as the parameter $$\delta$$ in the operator $${\boldsymbol{\mathcal{A}}}$$. For $$a = 0.01$$, we choose $${\bf f}$$ and $${\bf u}_D$$ such that   \begin{equation} \label{N3} {\bf u}( {\bf x} ) = |{\bf x} |^{a} \begin{pmatrix} x_2 \\ - x_1 \end{pmatrix} \end{equation} (5.1) is a solution of (1.1) and $${\bf F}(\nabla {\bf u} ) \in W^{1,2}({\it{\Omega}})$$. 
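For readers who want to reproduce the test case, the exact solution (5.1) and its gradient can be evaluated as in the following Python sketch. The function names `u_exact` and `grad_u_exact` are hypothetical, and the gradient formula is obtained here by the product rule rather than taken from the article. The right-hand side $${\bf f}$$ would follow by inserting this gradient into (1.1) and (1.2), which is not carried out here.

```python
import numpy as np

A_EXP = 0.01  # exponent a from (5.1)


def u_exact(x, a=A_EXP):
    """u(x) = |x|^a (x_2, -x_1)^T for x != 0."""
    r = np.hypot(x[0], x[1])
    return r ** a * np.array([x[1], -x[0]])


def grad_u_exact(x, a=A_EXP):
    """Matrix (d u_i / d x_j) of u_exact for x != 0, via the product rule."""
    r = np.hypot(x[0], x[1])
    rot = np.array([x[1], -x[0]])
    # grad(|x|^a) = a |x|^(a-2) x, and the Jacobian of (x_2, -x_1) is constant.
    return a * r ** (a - 2) * np.outer(rot, x) + r ** a * np.array([[0.0, 1.0], [-1.0, 0.0]])


if __name__ == "__main__":
    x = np.array([0.7, -1.3])
    print(u_exact(x))
    print(grad_u_exact(x))
```

The Dirichlet data $${\bf u}_D$$ is then the restriction of `u_exact` to $$\partial{\it{\Omega}}$$.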
For a series of triangulations $$\mathcal T_{h_i}$$ with $$h_{i+1} = \frac{h_i}{2}$$, we apply the above Newton scheme to compute the corresponding numerical solution $${\bf u}_{h_i}$$ and the errors   \begin{equation*} e_{{\bf F},h_i} = \left\| {{{\bf F}(\nabla_{h_i} {\bf u}_{h_i} ) - {\bf F}(\nabla {\bf u})}} \right\|_{2}\;\; \text{and} \;\; e_{ \left[\!\left[ {} \right]\!\right], h_i } = h_{i} \int_{\it{\Gamma}} {\varphi(h^{-1}_{i} \left| {{\left[\!\left[ {{({\bf u}_{h_i} -{\bf u}_D^*)\otimes {\bf n}}} \right]\!\right]}} \right|)}\, {\rm{d}}s. \end{equation*} As an estimate of the convergence rates, we use the experimental order of convergence (EOC):   \begin{equation*} {\rm{EOC}}_i( e_{h_i} ) := \frac{\log( e_{h_{i}} / e_{h_{i-1}} ) }{ \log( h_{i} / h_{i-1} )} \end{equation*} for $$i>1$$ and $$e_{h_i}$$ being either the error $$e_{{\bf F},h_i}$$ or $$e_{ \left[\!\left[ {} \right]\!\right], h_i }$$. Schemes 2.11 and 2.13 are based on the evaluation of the term $${\bf R}_h{\bf u}_h$$ in the interior of $${\it{\Omega}}$$. For a given simplex $$K \in \mathcal T_h$$, the restriction of $${\bf R}_{h} {\bf u}_h$$ to $$K$$ is computed by solving locally   \begin{equation*} \left( {{\bf R}_{h}{\bf u}_h},{ {\bf X}_i} \right)_K = \sum_{\gamma \in \partial K} \left\langle {{ \left[\!\left[ {{{\bf u}_h \otimes {\bf n}}} \right]\!\right]},{ {\bf X}_i}} \right\rangle_\gamma \end{equation*} for each basis function $${\bf X}_i$$ of $$\mathcal P_k(K)$$. For different values of $$p$$ and for a series of triangulations with $$h_0 = 1$$, the EOC is computed and presented in Tables 2 and 3 for each of the three methods. In each case, we observe convergence rates of about $$1$$, as predicted by Theorem 4.3, Theorem 4.5 and Diening et al. (2014, Corollary 4.10). 
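The EOC formula can be checked against the reported data with a few lines of Python. The sketch below is only an illustration: the helper name `eoc` is hypothetical, and the input errors are the values of $$e_{{\bf F},h_i}$$ listed in Table 4 for the SIP-shifted scheme with $$p=1.25$$, on meshes with $$h_0=1$$ and $$h_m = h_0/2^m$$.

```python
import math


def eoc(errors, mesh_sizes):
    """EOC_i = log(e_i / e_{i-1}) / log(h_i / h_{i-1}) for i = 1, ..., len(errors) - 1."""
    return [
        math.log(errors[i] / errors[i - 1]) / math.log(mesh_sizes[i] / mesh_sizes[i - 1])
        for i in range(1, len(errors))
    ]


# e_{F,h_i} for the SIP-shifted scheme, p = 1.25, m = 0..5 (taken from Table 4)
errors = [0.013247, 0.007443, 0.004108, 0.00223, 0.001195, 0.000635]
mesh_sizes = [1.0 / 2 ** m for m in range(6)]  # h_0 = 1, h_m = h_0 / 2^m

print([round(r, 2) for r in eoc(errors, mesh_sizes)])
```

Up to rounding of the tabulated errors, the computed rates agree with the $$p=1.25$$ column of the SIP-shifted block in Table 2 (0.83, 0.86, 0.88, 0.9, 0.91).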
Table 2 Experimental order of convergence $${\rm EOC}_i(e_{{\bf F}, h_i})$$ in the case of $$\mathcal{P}_1(K)$$ as local space and $${\bf u}_D^*={\bf u}$$ on $$\partial {\it{\Omega}}$$, $$\mathbf{F}(\nabla\mathbf{u})\in {W}^{1,2}({\it{\Omega}})$$
                                  $$p$$
Scheme       $$\frac{h_0}{2^m}$$  1.25  4/3   1.5   5/3   1.8   2.0   2.25  2.5   3.0   4.0
SIP shifted  $$m=1$$              0.83  0.89  1.0   1.1   1.17  1.04  1.1   1.16  1.23  1.14
SIP shifted  $$m=2$$              0.86  0.86  0.87  0.87  0.88  0.86  0.86  0.87  0.88  0.88
SIP shifted  $$m=3$$              0.88  0.88  0.89  0.89  0.89  0.88  0.89  0.89  0.9   0.9
SIP shifted  $$m=4$$              0.9   0.9   0.9   0.9   0.91  0.9   0.9   0.91  0.91  0.92
SIP shifted  $$m=5$$              0.91  0.91  0.91  0.92  0.92  0.91  0.92  0.92  0.92  0.93
LDG          $$m=1$$              0.71  0.74  0.85  1.06  1.32  1.8   2.45  3.12  3.97  4.23
LDG          $$m=2$$              0.83  0.83  0.83  0.84  0.85  0.86  0.87  0.89  0.96  0.64
LDG          $$m=3$$              0.86  0.86  0.87  0.87  0.88  0.89  0.89  0.9   0.94  0.58
LDG          $$m=4$$              0.88  0.88  0.89  0.89  0.9   0.9   0.91  0.92  0.93  0.72
LDG          $$m=5$$              0.9   0.9   0.9   0.91  0.91  0.91  0.92  0.93  0.94  0.98
SIP lifting  $$m=1$$              0.63  0.64  0.67  0.71  0.75  0.86  1.2   1.76  2.98  4.03
SIP lifting  $$m=2$$              0.83  0.83  0.84  0.84  0.84  0.84  0.86  0.87  0.89  0.89
SIP lifting  $$m=3$$              0.86  0.86  0.87  0.87  0.87  0.87  0.88  0.89  0.91  0.99
SIP lifting  $$m=4$$              0.89  0.89  0.89  0.89  0.89  0.89  0.9   0.91  0.92  0.95
SIP lifting  $$m=5$$              0.9   0.9   0.9   0.9   0.91  0.91  0.91  0.92  0.93  0.95
Table 3 Experimental order of convergence $${\rm EOC}_i(e_{\left[\!\left[ {} \right]\!\right], h_i})$$ in the case of $$\mathcal{P}_1(K)$$ as local space and $${\bf u}_D^*={\bf u}$$ on $$\partial {\it{\Omega}}$$, $$\mathbf{F}(\nabla\mathbf{u})\in {W}^{1,2}({\it{\Omega}})$$
                                  $$p$$
Scheme       $$\frac{h_0}{2^m}$$  1.25  4/3   1.5   5/3   1.8   2.0   2.25  2.5   3.0   4.0
SIP shifted  $$m=1$$              1.05  1.21  1.46  1.65  1.77  1.81  1.98  2.14  2.48  2.91
SIP shifted  $$m=2$$              0.93  0.94  0.94  0.95  0.95  0.94  0.94  0.94  0.95  0.95
SIP shifted  $$m=3$$              0.93  0.93  0.94  0.94  0.95  0.94  0.94  0.95  0.95  0.96
SIP shifted  $$m=4$$              0.93  0.93  0.94  0.94  0.95  0.94  0.94  0.95  0.95  0.96
SIP shifted  $$m=5$$              0.93  0.94  0.94  0.95  0.95  0.95  0.95  0.95  0.96  0.96
LDG          $$m=1$$              0.9   1.06  1.42  1.82  2.16  2.7   3.43  4.21  5.31  6.79
LDG          $$m=2$$              0.92  0.91  0.89  0.87  0.87  0.86  0.87  0.89  1.05  0.53
LDG          $$m=3$$              0.91  0.91  0.9   0.89  0.88  0.88  0.89  0.9   0.97  0.5
LDG          $$m=4$$              0.92  0.91  0.91  0.9   0.9   0.9   0.9   0.91  0.93  0.69
LDG          $$m=5$$              0.92  0.92  0.92  0.91  0.91  0.91  0.92  0.93  0.95  0.99
SIP lifting  $$m=1$$              1.52  1.45  1.3   1.26  1.38  1.68  2.21  2.87  4.38  6.41
SIP lifting  $$m=2$$              1.39  1.29  1.08  0.95  0.9   0.87  0.86  0.85  0.89  0.88
SIP lifting  $$m=3$$              1.25  1.14  0.98  0.92  0.9   0.89  0.88  0.88  0.91  1.02
SIP lifting  $$m=4$$              1.12  1.04  0.95  0.92  0.91  0.9   0.9   0.9   0.92  0.97
SIP lifting  $$m=5$$              1.03  0.98  0.93  0.92  0.92  0.91  0.91  0.91  0.93  0.96
Finally, we compare the performance of the three methods. We compare the overall solver run time (CPU time) per grid refinement against the solver accuracy in terms of the error $$e_{{\bf F},h_i}$$. In Table 4, the error $$e_{{\bf F},h_i}$$, CPU time and Newton iterations are presented for $$p=1.25$$ and $$p=3$$. Additionally, we give a performance plot in Fig. A1, where the average CPU time per Newton step is plotted against the error $$e_{{\bf F},h_i}$$. We observe that the SIP-shifted method takes fewer Newton iterations and the average CPU time per Newton step is smaller compared with the other two methods. This is mainly due to the larger stencil and the computation of the lifting terms for the primal LDG and SIP-lifting methods.
Fig. A1. CPU run time per Newton step for SIP-shifted, LDG and SIP-lifting methods plotted against the error $$e_{{\bf F},h_i}$$ for $$p=1.25$$ and $$p=3$$.
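The quantity shown in Fig. A1 can be approximated from the tabulated data. The following Python sketch assumes that the average CPU time per Newton step is the total CPU time of a run divided by its number of Newton steps, which the text does not state explicitly. It uses the finest-grid rows ($$m=5$$) of Table 4, in the unit of the tabulated CPU times, which the source does not specify. The helper name `cpu_per_step` is hypothetical.

```python
# (total CPU time, Newton steps) on the finest grid m = 5, copied from Table 4
finest = {
    1.25: {"SIP shifted": (13.157, 9), "LDG": (35.2269, 14), "SIP lifting": (48.6188, 14)},
    3.0: {"SIP shifted": (13.825, 10), "LDG": (31.1862, 12), "SIP lifting": (47.214, 13)},
}


def cpu_per_step(cpu_time, steps):
    """Average CPU time per Newton step (assumed: total time / number of steps)."""
    return cpu_time / steps


for p, methods in finest.items():
    for name, (cpu, steps) in methods.items():
        print(f"p={p}: {name:12s} {cpu_per_step(cpu, steps):.3f}")
```

Under this assumption, the SIP-shifted scheme has the smallest cost per step for both values of $$p$$, followed by LDG and then SIP lifting, in line with the observation above.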
Table 4 Error $$e_{{\bf F},h_i}$$, CPU time and number of Newton steps for $$p=1.25$$ and $$p=3.0$$
                             SIP shifted                LDG                        SIP lifting
$$p$$  $$\frac{h_0}{2^m}$$   Error     CPU      Steps   Error     CPU      Steps   Error     CPU      Steps
1.25   $$m=0$$               0.013247  0.0155   11      0.012002  0.0319   14      0.012464  0.0466   13
1.25   $$m=1$$               0.007443  0.0861   11      0.007339  0.2024   14      0.00804   0.2754   13
1.25   $$m=2$$               0.004108  0.3384   10      0.004136  0.8948   14      0.00451   1.0768   13
1.25   $$m=3$$               0.00223   1.1192   10      0.002277  2.4986   14      0.002478  3.1657   13
1.25   $$m=4$$               0.001195  3.7492   10      0.001234  9.0149   14      0.001341  11.8002  13
1.25   $$m=5$$               0.000635  13.157   9       0.000661  35.2269  14      0.000717  48.6188  14
3.0    $$m=0$$               0.03651   0.0097   7       0.292522  0.0172   8       0.127643  0.0324   9
3.0    $$m=1$$               0.015514  0.0529   7       0.018718  0.1107   8       0.016229  0.1828   9
3.0    $$m=2$$               0.008427  0.2517   8       0.009638  0.5578   9       0.008735  0.7937   10
3.0    $$m=3$$               0.004522  0.8694   8       0.005039  1.7787   10      0.004647  2.6673   11
3.0    $$m=4$$               0.002403  3.2579   9       0.00265   7.2401   11      0.002454  10.9343  12
3.0    $$m=5$$               0.001268  13.825   10      0.00138   31.1862  12      0.001288  47.214   13
All numerical experiments are carried out using the Dune-Fem module (cf. Dedner et al., 2010), part of the Dune framework (Bastian et al., 2008a,b; Blatt et al., 2016). The Dune-ALUGrid module (Alkämper et al., 2016) is used as the underlying grid manager. The computations were performed on an Intel Core i7-4770S @ 3.10 GHz desktop machine with about 16 GB of memory.
Appendix
In this appendix, we collect several useful results that are used throughout the article. Throughout the appendix, $$\psi$$ is an N-function such that $$\psi$$ and $$\psi^*$$ satisfy the $${\it {\Delta}}_2$$-condition. Note that the constants in the following subsections can depend on the chunkiness $$\omega_0$$ of $$\mathcal{T}_h$$. All results can be found in Diening & Růžička (2007) and Diening et al. (2014). Let $${{\it {\Pi}}_{{\rm DG}}} \colon L^1({\it {\Omega}}) \to X_h^k({\it {\Omega}})$$ denote the (local) $$L^2$$-projection onto $$X_h^k({\it {\Omega}})$$, i.e.,   \begin{align} \label{eq:PiDG} ({{\it {\Pi}}_{{\rm DG}}} {\bf G}, {\bf X}_h)=({\bf G}, {\bf X}_h)\qquad \forall\, {\bf X}_h \in X_h^k . 
\end{align} (A.1) The projection $${{\it {\Pi}}_{{\rm DG}}}$$ is stable:   \begin{align} \label{eq:PiDGLpsistablelocal} {\int\hspace{-0.8em}{-}}_K \psi(\lvert{{{\it {\Pi}}_{{\rm DG}}} {\bf G}}\rvert)\,{\rm{d}}x &\leq c\, {\int\hspace{-0.8em}{-}}_K \psi(\lvert{{\bf G}}\rvert)\,{\rm{d}}x \end{align} (A.2) and has the following local approximation property:   \begin{align} \label{eq:PiDGapproxpsi} {\int\hspace{-0.8em}{-}}_K \psi(h_K^j \lvert{\nabla^j_h({\bf G} - {{\it {\Pi}}_{{\rm DG}}} {\bf G})}\rvert)\,{\rm{d}}x &\leq c\, {\int\hspace{-0.8em}{-}}_K \psi(h_K^l \lvert{\nabla^l {\bf G}}\rvert)\,{\rm{d}}x \end{align} (A.3) for all $$K \in \mathcal{T}_h$$ and $${\bf G} \in W^{l,\psi}(K)$$ with $$0\leq j \leq l \leq k+1$$. In particular, this implies   \begin{align} \label{eq:PiDGapprox0} \rho_{\psi,{\it {\Omega}}}({\bf G} - {{\it {\Pi}}_{{\rm DG}}} {\bf G}) &\leq c\, \rho_{\psi,{\it {\Omega}}}({\bf G}), \end{align} (A.4)  \begin{align} \label{eq:PiDGapprox1} \rho_{\psi,{\it {\Omega}}}({\bf G} - {{\it {\Pi}}_{{\rm DG}}} {\bf G}) &\leq c\, \rho_{\psi,{\it {\Omega}}}(h\, {\nabla_h {\bf G}}), \end{align} (A.5)  \begin{align} \label{eq:PiDGapprox2} \rho_{\psi,{\it {\Omega}}}(\nabla_h {\bf G} - \nabla_h {{\it {\Pi}}_{{\rm DG}}} {\bf G}) &\leq c\, \rho_{\psi,{\it {\Omega}}}({\nabla_h {\bf G}}), \end{align} (A.6)  \begin{align} \label{eq:PiDGLpsistable} \rho_{\psi,{\it {\Omega}}}({{\it {\Pi}}_{{\rm DG}}} {\bf G}) &\leq c\, \rho_{\psi,{\it {\Omega}}}({\bf G}), \end{align} (A.7)  \begin{align} \label{eq:PiDGLW1psistable} \rho_{\psi,{\it {\Omega}}}(\nabla_h {{\it {\Pi}}_{{\rm DG}}} {\bf G}) &\leq c\, \rho_{\psi,{\it {\Omega}}}(\nabla_h {\bf G}), \end{align} (A.8) for all $${\bf G} \in {W_{{\rm DG}}^{1,\psi}}({\it {\Omega}})$$. For the treatment of the jumps, the following trace theorems are useful. Lemma A1 Let $$K \in \mathcal{T}_h$$ and $$\gamma$$ be a face of $$K$$. 
Then for all $${\bf g} \in W^{1,\psi}(K)$$,   \begin{align}\label{eq:emb} {\int\hspace{-0.8em}{-}}_\gamma \psi(\lvert{{\bf g}}\rvert)\,{\rm{d}}s &\leq c {\int\hspace{-0.8em}{-}}_K \psi(\lvert{{\bf g}}\rvert) \,{\rm{d}}x + c {\int\hspace{-0.8em}{-}}_K \psi(\lvert{h_\gamma \nabla {\bf g}}\rvert) \,{\rm{d}}x \end{align} (A.9) with a constant independent of $$h$$. Note that for all $${\mathfrak q}\in {\mathcal P}_k(K)$$, $$k \in {\mathbb{N}}_0$$, we have   \begin{align} {\int\hspace{-0.8em}{-}}_\gamma \psi(\lvert{{\mathfrak q}}\rvert)\, {\rm{d}}s \le c \, {\int\hspace{-0.8em}{-}}_K \psi(\lvert{{\mathfrak q}}\rvert)\, {\rm{d}}x , \label{eq:pol-trace} \end{align} (A.10) where $$\gamma$$ is a face of some $$K \in \mathcal{T}_h$$. Lemma A1 and (A.3) imply   \begin{align} \label{eq:PiDGapproxmlocal} \begin{aligned} h_\gamma \int_\gamma \psi(h_\gamma^{-1} \lvert{{\bf G} - {{\it {\Pi}}_{{\rm DG}}} {\bf G}}\rvert)\,{\rm{d}}s &\leq c \int_K \psi(\lvert{\nabla {\bf G}}\rvert)\,{\rm{d}}x , \end{aligned} \end{align} (A.11) which yields   \begin{align} \label{eq:PiDGapproxmglobal} m_{\psi,h}({\bf G} - {{\it {\Pi}}_{{\rm DG}}} {\bf G}) &\leq c\,\rho_{\psi,{\it {\Omega}}}(\nabla_h {\bf G}) \end{align} (A.12) for all $${\bf G} \in {W_{{\rm DG}}^{1,\psi}}({\it {\Omega}})$$. The operator $${\bf R}_h$$ defined in (2.18) is called a jump functional or lifting operator. One easily sees that $${\bf R}_h$$ is bounded, i.e.,   \begin{align} \label{eq:Rgammaest} \rho_{\psi,{\it {\Omega}}} ({\bf R}_h {\bf g}) &\leq c\, \sum_{\gamma \in {\it {\Gamma}}} h_\gamma \int_\gamma \psi\big( h_\gamma^{-1} \lvert{\left[\kern-0.15em\left[{{\bf g} \otimes {\bf n}}\right]\kern-0.15em\right]_\gamma}\rvert\big) \,{\rm{d}}s = c\, m_{\psi,h}({\bf g}) \end{align} (A.13) for all $${\bf g} \in {W}_{{\rm DG},{\it {\Gamma}}_{\rm D}}^{1,\psi}({\it {\Omega}})$$. 
Thus, the definition of the discrete gradient in (2.17) implies   \begin{align} \label{eq:nablaDGvsM} \rho_{\psi,{\it {\Omega}}} ({\nabla_{\rm DG}^h} {\bf g}) &\leq c\, \big(\rho_{\psi,{\it {\Omega}}}(\nabla_h {\bf g}) + m_{\psi,h}({\bf g}) \big) = c\, M_{\psi,h}({\bf g}) , \end{align} (A.14)  \begin{align} \label{eq:MvsDGNabla} M_{\psi,h}({\bf g}) &= \rho_{\psi,{\it {\Omega}}} (\nabla_h {\bf g}) + m_{\psi,h}({\bf g}) \leq c\, \big(\rho_{\psi,{\it {\Omega}}}({\nabla_{\rm DG}^h} {\bf g}) + m_{\psi,h}({\bf g}) \big) \end{align} (A.15) for all $${\bf g} \in {W_{{\rm DG},{\it {\Gamma}}_{\rm D}}^{1,\psi}}({\it {\Omega}})$$. The classical Scott–Zhang interpolation operator $${{\it {\Pi}}_{{\rm SZ}}} \colon W^{1,1}({\it {\Omega}}) \to V_h^k$$ was introduced in Scott & Zhang (1990), where its basic properties in Lebesgue and Sobolev spaces are also proved. These properties have been extended to Orlicz and Sobolev–Orlicz spaces in Diening & Růžička (2007). Using the formalism of N-functions, this yields, in particular, the following important result. Proposition A2 Let $$\psi $$ have $$(p,\delta)$$-structure and let $${\bf F}$$ be defined in (2.7). If $${\bf F}(\nabla g) \in W^{1,2}({\it {\Omega}})$$, and in the definition of $${{\it {\Pi}}_{{\rm SZ}}}$$, we have $$k \geq 1$$, then   \begin{align*} \lVert{{\bf F}(\nabla_h {{\it {\Pi}}_{{\rm SZ}}} g) - {\bf F}(\nabla g)}\rVert_2^2 &\le c\, h^2 \, \lVert{\nabla {\bf F}(\nabla g)}\rVert_{2}^2, \end{align*} with $$c$$ depending only on the characteristics of $$\psi$$ and the chunkiness $$\omega_0$$. In Diening et al. (2014), the Scott–Zhang operator has been extended to the DG setting. In fact, a linear, bounded projection $${{\it {\Pi}}_{{\rm SZ}}} \colon {W_{\rm DG}^{1,1}}({\it {\Omega}}') \to V_h^kc({\it {\Omega}}')$$, where $$ V_h^kc({\it {\Omega}}') := V_h^k({\it {\Omega}}') \cap W^{1,1}({\it {\Omega}}')$$, was defined. 
It maps the space $${W_{{\rm DG},{\it {\Gamma}}_{\rm D}}^{1,1}}({\it {\Omega}})$$ into $$V_h^kc({\it {\Omega}}) \cap {W_{{\it {\Gamma}}_{\rm D}}^{1,1}}({\it {\Omega}})$$, where $$ V_h^kc({\it {\Omega}}) := V_h^k({\it {\Omega}}) \cap W^{1,1}({\it {\Omega}})$$, and coincides for functions from $$W^{1,1}({\it {\Omega}})$$ with the classical Scott–Zhang operator. Moreover, it was shown in Diening et al. (2014) that if $$k\ge 1$$ then for all $$K \in \mathcal{T}_h$$, all faces $$\gamma$$ of $$K$$ and all $$g \in W^{1,\psi}({\it {\Omega}})$$ there holds   \begin{align} \label{eq:PiSZapproxmlocala} h_\gamma \int_\gamma \psi(h_\gamma^{-1} \lvert{g - {{\it {\Pi}}_{{\rm SZ}}} g}\rvert)\,{\rm{d}}s &\leq c \int_K \psi(h_\gamma^{-1} \lvert{g - {{\it {\Pi}}_{{\rm SZ}}} g}\rvert)\,{\rm{d}}x + c \int_K \psi(\lvert{\nabla_h (g - {{\it {\Pi}}_{{\rm SZ}}} g)}\rvert)\,{\rm{d}}x\notag \\ &\leq c \int_{S_K} \psi(\lvert{\nabla g}\rvert)\,{\rm{d}}x , \end{align} (A.16)  \begin{align} \label{eq:PiSZapproxpsi3} & m_{\psi,h}(g - {{\it {\Pi}}_{{\rm SZ}}} g)+\rho_{\psi,{\it {\Omega}}}\big(h_K^{-1} (g - {{\it {\Pi}}_{{\rm SZ}}} g) \big) + \rho_{\psi,{\it {\Omega}}}(\nabla_h {{\it {\Pi}}_{{\rm SZ}}} g) \leq c\,\rho_{\psi,{\it {\Omega}}}(\nabla g) . \end{align} (A.17) If additionally $$\psi $$ has $$(p,\delta)$$-structure, and $${\bf F}(\nabla g) \in W^{1,2}({\it {\Omega}})$$, then   \begin{align}\label{lem:err-uD} m_{\psi,h}(g -{{\it {\Pi}}_{{\rm SZ}}} g) & \le c\, h^{\min\{2,p\}}\big( \lVert{\nabla {\bf F}(\nabla g)}\rVert_2^2 + \rho_{\psi, {\it {\Omega}}}(\nabla g) \big) . \end{align} (A.18) We have the following Poincaré inequality in the DG setting. Lemma A3 For all $$g \in {W_{{\rm DG},{\it {\Gamma}}_{\rm D}}^{1,\psi}}({\it {\Omega}})$$,   \begin{align*} \rho_{\psi,{\it {\Omega}}}(g) &\leq c\, M_{\psi,h}\big(\text{diam}({\it {\Omega}})\, g\big), \end{align*} where $$c$$ depends only on $${\it {\Omega}}$$, $${\it {\Omega}}'$$, $${\it {\Delta}}_2(\psi)$$ and $${\it {\Delta}}_2(\psi^*)$$. 
For our choices of $${\bf u}_D^*$$, we can control the jump terms. Lemma A4 Let $$ {\bf u}_D^*= {\bf u}$$ or $$ {\bf u}_D^*= {{\it {\Pi}}_{{\rm SZ}}}{\bf u}$$. Then we have   \begin{align*} m_{\varphi,h}({\bf u}_D^*-{\bf u}) \le c\, \rho_{\varphi,{\it {\Omega}}}(\nabla {\bf u}) . \end{align*} Lemma A5 For all $$K \in \mathcal{T}_h$$ and all functions $${\bf P} \colon {\it {\Omega}} \to {{\mathbb{R}}^{d \times n}}$$,   \begin{align*} {\int\hspace{-0.8em}{-}}_K \lvert{{\bf F}({\bf P}) - \langle{{\bf F}({\bf P})}\rangle_K}\rvert^2\,{\rm{d}}x &\sim {\int\hspace{-0.8em}{-}}_K \lvert{{\bf F}({\bf P}) - {\bf F}(\langle{{\bf P}}\rangle_K)}\rvert^2\,{\rm{d}}x . \end{align*}
References
Acerbi, E. & Fusco, N. (1994) Partial regularity under anisotropic ($$p,q$$) growth conditions. J. Differ. Equ., 107, 46–67.
Alkämper, M., Dedner, A., Klöfkorn, R. & Nolte, M. (2016) The DUNE-ALUGrid module. Arch. Numer. Softw., 4(1), 1–28.
Arnold, D. N., Brezzi, F., Cockburn, B. & Marini, L. D. (2002) Unified analysis of discontinuous Galerkin methods for elliptic problems. SIAM J. Numer. Anal., 39, 1749–1779.
Barrett, J. W. & Liu, W. B. (1994) Quasi-norm error bounds for the finite element approximation of a non-Newtonian flow. Numer. Math., 68, 437–456.
Bastian, P., Blatt, M., Dedner, A., Engwer, C., Klöfkorn, R., Kornhuber, R., Ohlberger, M. & Sander, O. (2008a) A generic grid interface for parallel and adaptive scientific computing. Part II: Implementation and tests in DUNE. Computing, 82, 121–138.
Bastian, P., Blatt, M., Dedner, A., Engwer, C., Klöfkorn, R., Ohlberger, M. & Sander, O. (2008b) A generic grid interface for parallel and adaptive scientific computing. Part I: Abstract framework. Computing, 82, 103–119.
Belenki, L., Diening, L. & Kreuzer, Ch. (2012) Optimality of an adaptive finite element method for the $$p$$-Laplacian equation. IMA J. Numer. Anal., 32, 484–510.
Blatt, M., Burchardt, A., Dedner, A., Engwer, C., Fahlke, J., Flemisch, B., Gersbacher, C., Gräser, C., Gruber, F., Grüninger, C., Kempf, D., Klöfkorn, R., Malkmus, T., Müthing, S., Nolte, M., Piatkowski, M. & Sander, O. (2016) The distributed and unified numerics environment, version 2.4. Arch. Numer. Softw., 4(100), 13–29.
Buffa, A. & Ortner, C. (2009) Compact embeddings of broken Sobolev spaces and applications. IMA J. Numer. Anal., 29, 827–855.
Burman, E. & Ern, A. (2008) Discontinuous Galerkin approximation with discrete variational principle for the nonlinear Laplacian. C. R. Math. Acad. Sci. Paris, 346, 1013–1016.
Bustinza, R. & Gatica, G. N. (2004) A local discontinuous Galerkin method for nonlinear diffusion problems with mixed boundary conditions. SIAM J. Sci. Comput., 26, 152–177.
Davis, T. A. (2004) Algorithm 832: UMFPACK V4.3 - an unsymmetric-pattern multifrontal method. ACM Trans. Math. Softw., 30, 196–199.
Dedner, A., Klöfkorn, R., Nolte, M. & Ohlberger, M. (2010) A generic interface for parallel and adaptive scientific computing: abstraction principles and the DUNE-FEM module. Computing, 90, 165–196.
Di Pietro, D. A. & Ern, A. (2010) Discrete functional analysis tools for discontinuous Galerkin methods with application to the incompressible Navier-Stokes equations. Math. Comput., 79, 1303–1330.
Di Pietro, D. A. & Ern, A. (2012) Mathematical aspects of discontinuous Galerkin methods. Mathématiques & Applications, vol. 69. Berlin: Springer.
Diening, L. & Ettwein, F. (2008) Fractional estimates for non-differentiable elliptic systems with general growth. Forum Math., 20, 523–556.
Diening, L. & Kreuzer, C. (2008) Linear convergence of an adaptive finite element method for the $$p$$-Laplacian equation. SIAM J. Numer. Anal., 46, 614–638.
Diening, L., Kröner, D., Růžička, M. & Toulopoulos, I. (2014) A local discontinuous Galerkin approximation for systems with $$p$$-structure. IMA J. Numer. Anal., 34, 1447–1488.
Diening, L. & Růžička, M. (2007) Interpolation operators in Orlicz–Sobolev spaces. Numer. Math., 107, 107–129.
Diening, L., Růžička, M. & Schumacher, K. (2010) A decomposition technique for John domains. Ann. Acad. Sci. Fenn. Ser. A. I. Math., 35, 87–114.
Ebmeyer, C. & Liu, W. B. (2005) Quasi-norm interpolation error estimates for finite element approximations of problems with $$p$$-structure. Numer. Math., 100, 233–258.
Giaquinta, M. & Modica, G. (1986) Remarks on the regularity of the minimizers of certain degenerate functionals. Manuscripta Math., 57, 55–99.
Giusti, E. (1994) Metodi Diretti nel Calcolo delle Variazioni. Bologna: Unione Matematica Italiana.
Houston, P., Robson, J. & Süli, E. (2005) Discontinuous Galerkin finite element approximation of quasilinear elliptic boundary value problems. I. The scalar case. IMA J. Numer. Anal., 25, 726–749.
Rao, M. M. & Ren, Z. D. (1991) Theory of Orlicz spaces. Pure and Applied Mathematics, Monographs and Textbooks, vol. 146. New York: Marcel Dekker.
Růžička, M. (2013) Analysis of Generalized Newtonian Fluids. Topics in Mathematical Fluid Mechanics. Lecture Notes in Mathematics (Beirao da Veiga H. & Flandoli F. eds), vol. 2073. Heidelberg: Springer, pp. 199–238.
Růžička, M. & Diening, L. (2007) Non-Newtonian Fluids and Function Spaces. Nonlinear Analysis, Function Spaces and Applications (Rákosnik J. ed.), vol. 8. Praha: Institute of Mathematics of the Academy of Sciences of the Czech Republic, pp. 95–144.
Scott, L. R. & Zhang, S. (1990) Finite element interpolation of nonsmooth functions satisfying boundary conditions. Math. Comput., 54, 483–493.
Footnotes
1 Note that the traces are evaluated with respect to $$K$$.
2 On a face $$\gamma \in {\it {\Gamma}}$$, the expression $$\nabla_h {{\it {\Pi}}_{{\rm SZ}}} {\bf u}$$ is to be understood as the evaluation on any adjacent element $$K \in \mathcal T_h$$, i.e., $$\text{tr} _\gamma^K \nabla_h {{\it {\Pi}}_{{\rm SZ}}} {\bf u}$$.
3 Note that functions and traces are evaluated with respect to $$K$$.
© The authors 2017. Published by Oxford University Press on behalf of the Institute of Mathematics and its Applications. All rights reserved.
IMA Journal of Numerical Analysis, Oxford University Press

Generalizations of SIP methods to systems with $$p$$ -structure

Publisher
Oxford University Press
Copyright
© The authors 2017. Published by Oxford University Press on behalf of the Institute of Mathematics and its Applications. All rights reserved.
ISSN
0272-4979
eISSN
1464-3642
D.O.I.
10.1093/imanum/drx040

Abstract In the present article, we propose two variants of symmetric internal penalty methods for systems with $$p$$-structure. We show stability (a priori estimates) of the methods and derive error estimates. Moreover, we discuss the performance of the schemes and compare them with the local discontinuous Galerkin method. 1. The problem We consider the numerical approximation of a vectorial system of $$p$$-Laplace type,   \begin{equation} \label{eq:p-lap} \begin{aligned} -{\text{div}}{\boldsymbol{\mathcal{A}}}(\nabla{\bf u})&={\bf f}\qquad&&\text{in }{\it{\Omega}}, \\ {\bf u}&={\bf u}_D \qquad&&\text{on }\partial {\it{\Omega}}, \end{aligned} \end{equation} (1.1) by means of symmetric internal penalty (SIP) approximations. For the given data $${\bf f}$$ and $${\bf u}_D$$, we seek the unknown vector field $${\bf u}=(u_1,\ldots, u_d)^\top$$ defined on $${\it{\Omega}} \subset {\mathbb{R}}^n$$. Throughout the article we assume that $${\boldsymbol{\mathcal{A}}}\colon {\mathbb{R}}^{d\times n}\to {\mathbb{R}}^{d\times n}$$ possesses a $$\psi$$-potential, where $$\psi$$ has $$(p,\delta)$$-structure (cf. Assumption 2.2) and the relevant example that falls into this class is   \begin{equation}\label{eq:fluids} {\boldsymbol{\mathcal{A}}}(\nabla {\bf u}) = (\delta+\left| {{\nabla {\bf u}}} \right|)^{p-2} \nabla{\bf u}, \end{equation} (1.2) with $$ p \in (1,\infty)$$ and $$ \delta\geq 0$$. Discontinuous Galerkin (DG) methods for elliptic problems were introduced in the late 1990s. In recent decades, they have received extensive attention and by now they are well understood and rigorously analysed in the context of linear elliptic problems (cf. Arnold et al., 2002, for the Poisson problem). Prominent examples for DG methods are the local discontinuous Galerkin (LDG) and the SIP methods. In contrast to this, little is known about the treatment of the $$p$$-Laplace problem with DG methods (cf. Burman & Ern, 2008; Buffa & Ortner, 2009; Diening et al., 2014). 
On the other hand, it is known from Ebmeyer & Liu (2005) and Diening & Růžička (2007) that finite element solutions of equation (1.1) converge at least with a linear rate to the exact solution at least in the case $${\bf u}_D =\mathbf{0}$$. This convergence rate is optimal for $$p \in (1,\infty)$$ and linear ansatz functions, if the continuous solution has the natural regularity $$\nabla {\bf F}(\nabla {\bf u}) \in L^2({\it{\Omega}})$$. It was shown in Diening & Kreuzer (2008) and Belenki et al. (2012) that the adaptive finite element algorithm for equation (1.1) with piecewise linear ansatz functions converges with an optimal rate to the solution. In this article, we extend the techniques developed for a rigorous analysis of LDG methods to SIP methods. These methods seem, on the one hand, to be best suited for a theoretical analysis in the context of systems with $$p$$-structure and have, on the other hand, good stability and localization properties in numerical experiments. The proved convergence rate for the LDG scheme is optimal for $$p \le 2$$, while for $$p\ge 2$$ it is suboptimal due to technical difficulties (cf. Diening et al., 2014). Clearly, one has many different possibilities to generalize the SIP method for $$p=2$$ to SIP methods for $$p\neq 2$$ (cf. Houston et al., 2005, where the analysis of different IP methods of problem (1.1) with $${\boldsymbol{\mathcal{A}}}(\nabla {\bf u}) = \nabla {\bf u} + (\delta+\left| {{\nabla {\bf u}}} \right|)^{p-2} \nabla{\bf u}$$ heavily depends on the added linear term). We propose two formulations that are motivated by different perceptions of SIP methods in the linear case (cf. Di Pietro & Ern, 2012). We show in this article that the SIP formulation using shifts (cf. Scheme 2.10) of problem (1.1), (1.2) possesses for all $$p \in (1,\infty)$$ an optimal (for linear ansatz functions), linear convergence rate to the exact solution. 
In fact, this is the first theoretical proof that a DG method behaves exactly as the finite element method (FEM) for the whole range $$p \in (1,\infty)$$. Moreover, the second SIP formulation using liftings (cf. Scheme 2.11) of problem (1.1), (1.2) possesses the same convergence properties as the LDG formulation in Diening et al. (2014). The article is organized as follows. In the next section, we introduce the notation, the appropriate function spaces, in particular appropriate spaces for DG functions, the basic assumption on the nonlinear operator and its basic consequences, discrete gradients and our numerical fluxes. Moreover, we propose two different SIP formulations and recall the primal formulation of the LDG method for our problem. Both methods coincide with the classical SIP method for $$p=2$$. In Section 3, we prove stability of the methods, i.e., a priori estimates (cf. Theorems 3.1 and 3.2). In Section 4, we prove error estimates for our problem (cf. Theorems 4.3 and 4.5). These results provide the first convergence rates for SIP methods for problems with $$(p,\delta)$$-structure, like the $$p$$-Laplace equation. In the appendix, we collect various technical results used in the sequel. 2. The SIP schemes: notation and set-up In this section, we introduce different DG formulations of problem (1.1). Before that we introduce the notation we will use, state the precise assumptions on the nonlinear operator and discuss its basic consequences. 2.1 Function spaces We use $$c, C$$ to denote generic constants, which may change from line to line, but not depending on the crucial quantities. Moreover, we write $$f\sim g$$ if and only if there exist constants $$c,C>0$$ such that $$c\, f \le g\le C\, f$$. 
We will use the customary Lebesgue spaces $$(L^p({\it{\Omega}}),\left\| {{\,\cdot\,}} \right\|_p)$$ and Sobolev spaces $$(W^{k,p}({\it{\Omega}}),\left\| {{\,\cdot\,}} \right\|_{k,p})$$, where $${\it{\Omega}} \subset {\mathbb{R}}^n$$ is a bounded, polyhedral domain with Lipschitz continuous boundary $$\partial {\it{\Omega}}=:{\it{\Gamma}}_{\rm D}$$. The space $$W^{1,p}_{{\it{\Gamma}}_{\rm D}}({\it{\Omega}})$$ is the closure of $$C^\infty_0 ({\it{\Omega}})$$ in $$W^{1,p}({\it{\Omega}})$$, equipped with the gradient norm $$\left\| {{\nabla\,\cdot\,}} \right\|_p$$. We do not distinguish between function spaces for scalar, vector-valued or tensor-valued functions. However, we will denote vector-valued functions by boldface letters and tensor-valued functions by capital boldface letters. The scalar product between two vectors $${\bf u}$$, $${\bf v}$$ is denoted by $${\bf u} \cdot {\bf v}$$. The scalar product between two tensors $${\bf P}, {\bf Q}$$ is denoted by $${\bf P}: {\bf Q}$$ and we use the notation $$\left| {{{\bf P}}} \right|^2={\bf P} : {\bf P}$$. The mean value of a locally integrable function $$f$$ over a measurable set $$M \subset {\it{\Omega}}$$ is denoted by $$\left\langle {{f}} \right\rangle_M:= \mathop{{\int\hspace{-0.8em}{-}}}_M f \, {\rm d}x =\frac 1 {|M|}\int_M f \, {\rm d}x$$. Moreover, we use the notation $$({f}, {g}):=\int_{\it{\Omega}} f g\, {\rm d}x$$, whenever the right-hand side is well defined. We will also work with Orlicz and Sobolev–Orlicz spaces (cf. Rao & Ren, 1991). A real convex function $$\psi \,:\, {\mathbb{R}}^{\geq 0} \to {\mathbb{R}}^{\geq 0}$$ is said to be an N-function if $$\psi(0)=0$$, $$\psi(t)>0$$ for $$t>0$$, $$\lim_{t\rightarrow0} \psi(t)/t=0$$, as well as $$\lim_{t\rightarrow\infty} \psi(t)/t=\infty$$. We always assume that $$\psi$$ and the conjugate N-function $$\psi ^*$$ satisfy the $${\it{\Delta}}_2$$-condition. 
We denote the smallest constant $$K>0$$ such that $$\psi(2\,t) \leq K\, \psi(t)$$ for all $$t \ge 0$$ by $${\it{\Delta}}_2(\psi)$$. We denote by $$L^\psi({\it{\Omega}})$$ and $$W^{1,\psi}({\it{\Omega}})$$ the classical Orlicz and Sobolev–Orlicz spaces, i.e., $$f \in L^\psi({\it{\Omega}})$$ if the modular $$\rho_\psi(f):=\int_{\it{\Omega}} \psi(\left| {{f}} \right|)\, {\rm d}x $$ is finite and $$f \in W^{1,\psi}({\it{\Omega}})$$ if $$f$$ and $$ \nabla f$$ belong to $$L^\psi({\it{\Omega}})$$. When equipped with the Luxemburg norm $$\left\| {{f}} \right\|_{\psi}:= \inf \left\{ {{\lambda >0 {\,\big|\,} \int_{\it{\Omega}} \psi(\left| {{f}} \right|/\lambda)\, {{{\rm d}}}x \le 1}} \right\}$$, the space $$L^\psi({\it{\Omega}})$$ becomes a Banach space. The same holds for the space $$W^{1,\psi}({\it{\Omega}})$$ if it is equipped with the norm $$\left\| {{\cdot }} \right\|_{\psi} +\left\| {{\nabla \cdot}} \right\|_{\psi} $$. Note that the dual space $$(L^\psi({\it{\Omega}}))^*$$ can be identified with the space $$L^{\psi^*}({\it{\Omega}})$$. By $$W^{1,\psi}_{{\it{\Gamma}}_{\rm D}}({\it{\Omega}})$$ we denote the closure of $$C^\infty_0({\it{\Omega}})$$ in $$W^{1,\psi}({\it{\Omega}})$$ and equip it with the gradient norm $$\left\| {{\nabla \cdot}} \right\|_\psi$$. We need the following refined version of the Young inequality: for all $$\epsilon >0$$ there exists $$c_\epsilon>0 $$, depending only on $${\it{\Delta}}_2(\psi),{\it{\Delta}}_2( \psi ^*)<\infty$$, such that for all $$s,t\geq0$$,   \begin{align} \label{ineq:young} \begin{split} ts&\leq \epsilon \, \psi(t)+ c_\epsilon \,\psi^*(s), \\ t\, \psi'(s) + \psi'(t)\, s &\le \epsilon \, \psi(t)+ c_\epsilon \,\psi(s). \end{split} \end{align} (2.1) 2.2 Basic properties of the nonlinear operator We now state precisely the assumptions we make for our nonlinear operator $${\boldsymbol{\mathcal{A}}} (\cdot)$$. 
For $$t\geq0 $$, we define a special N-function $$\varphi=\varphi_{p,\delta}$$ by   \begin{align} \label{eq:5a} \varphi(t):= \int _0^t \alpha(s)\, {\rm{d}}s\qquad\text{with}\quad \alpha(t) := (\delta +t)^{p-2} t, \end{align} (2.2) where $$p\in (1,\infty)$$ and $$\delta\ge 0$$. The function $$\varphi$$ satisfies, uniformly in $$t$$, the important equivalence   \begin{align} \label{eq:equi1} \varphi''(t)\, t \sim \varphi'(t), \end{align} (2.3) because $$ \min\left\{ {{1,p-1}} \right\}\,(\delta+t)^{p-2} \le \varphi''(t)\leq \max\left\{ {{1,p-1}} \right\}(\delta+t)^{p-2}$$. Moreover, $$\varphi$$ satisfies the $${\it{\Delta}}_2$$-condition with $${\it{\Delta}}_2(\varphi) \leq c\, 2^{\max \left\{ {{2,p}} \right\}}$$ (hence independent of $$\delta$$). This implies that, uniformly with respect to $$t$$, we have   \begin{align} \label{eq:equi2} \varphi'(t)\, t \sim \varphi(t), \end{align} (2.4) with constants depending only on $$p$$. The conjugate function $$\varphi^*$$ satisfies $$\varphi^*(t) \sim (\delta^{p-1} + t)^{p'-2} t^2$$ with $$1= \frac{1}{p} + \frac{1}{p'}$$. Also $$\varphi^*$$ satisfies the $${\it{\Delta}}_2$$-condition with $${\it{\Delta}}_2(\varphi^*) \leq c\,2^{\max \left\{ {{2,p'}} \right\}}$$. A detailed discussion and full proofs can be found in Růžička & Diening (2007) and Diening & Ettwein (2008). Definition 2.1 We say that an N-function $$\psi \in C^1({\mathbb{R}}^\ge)\cap C^2({\mathbb{R}}^>)$$ has $$(p,\delta)$$-structure, with $$p\in (1,\infty)$$ and $$\delta\ge 0$$, if   \begin{alignat*}{2} \psi(t) &\sim \varphi_{p,\delta}(t)\,\qquad &&\textrm{uniformly in $t\ge 0$}, \\ \psi''(t) &\sim \varphi''_{p,\delta}(t)\,\qquad &&\textrm{uniformly in $t> 0$} . \end{alignat*} The constants in these equivalences and $$p$$ are called characteristics of $$\psi$$. 
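The equivalences (2.3) and (2.4) for $$\varphi=\varphi_{p,\delta}$$ can be checked numerically. The following is a minimal sketch, not part of the original article: the function names, the midpoint quadrature and the sample points are illustrative assumptions. It evaluates the ratios $$\varphi''(t)\,t/\varphi'(t)$$ and $$\varphi'(t)\,t/\varphi(t)$$, which should stay between bounds depending only on $$p$$ and not on $$\delta$$.

```python
def alpha(t, p, delta):
    """phi'(t) = alpha(t) = (delta + t)^(p-2) * t, cf. (2.2)."""
    if t == 0.0:
        return 0.0
    return (delta + t) ** (p - 2.0) * t


def phi(t, p, delta, n=2000):
    """phi(t) = int_0^t alpha(s) ds, approximated by the composite midpoint rule.

    The quadrature and the number of cells n are illustrative choices.
    """
    h = t / n
    return h * sum(alpha((i + 0.5) * h, p, delta) for i in range(n))


def phi_second(t, p, delta):
    """phi''(t) = (delta + t)^(p-3) * (delta + (p-1) t), valid for t > 0."""
    return (delta + t) ** (p - 3.0) * (delta + (p - 1.0) * t)


def equivalence_ratios(p, delta, ts):
    """Return the ratios phi''(t) t / phi'(t) and phi'(t) t / phi(t) for t in ts (all t > 0)."""
    r1 = [phi_second(t, p, delta) * t / alpha(t, p, delta) for t in ts]
    r2 = [alpha(t, p, delta) * t / phi(t, p, delta) for t in ts]
    return r1, r2


if __name__ == "__main__":
    for p in (1.5, 2.0, 4.0):
        for delta in (0.0, 1.0):
            r1, r2 = equivalence_ratios(p, delta, [1e-3, 1.0, 1e3])
            print(p, delta, [round(r, 3) for r in r1], [round(r, 3) for r in r2])
```

In this sketch the first ratio equals $$(\delta+(p-1)t)/(\delta+t)$$, so it lies between $$\min\{1,p-1\}$$ and $$\max\{1,p-1\}$$, in agreement with the bound on $$\varphi''$$ stated after (2.3).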
In this case $$\psi$$ and $$\psi^*$$ satisfy the $${\it{\Delta}}_2$$-condition with $${\it{\Delta}}_2(\psi)\leq c\,2^{\max \left\{ {{2,p}} \right\}}$$ and $$ {\it{\Delta}}_2(\psi^*) \leq c\,2^{\max \left\{ {{2,p'}} \right\}}$$. Moreover, we have, uniformly with respect to $$t$$,   \begin{align} \label{eq:equi2a} \psi'(t)\, t \sim \psi(t), \quad \varphi'(t)\sim \psi'(t), \end{align} (2.5) with constants depending only on the characteristics of $$\psi$$. Assumption 2.2 (Nonlinear operator) We assume that the nonlinear operator $${{\boldsymbol{\mathcal{A}}} \colon {\mathbb{R}}^{d \times n} \to {\mathbb{R}}^{d \times n}}$$ belongs to $$C^0({\mathbb{R}}^{d \times n},{\mathbb{R}}^{d \times n} )\cap C^1({\mathbb{R}}^{d \times n}\setminus \{\mathbf{0}\},{\mathbb{R}}^{d \times n} ) $$ and satisfies $${\boldsymbol{\mathcal{A}}} (\mathbf 0)=\mathbf 0$$. Moreover, we assume that $${\boldsymbol{\mathcal{A}}} $$ possesses a potential $$\psi \in C^1({\mathbb{R}}^\ge)$$, which has $$(p,\delta)$$-structure, i.e., for all $${\bf P} \in {\mathbb{R}}^{d \times n} \setminus \left\{ { \mathbf{0}} \right\} $$ there holds   \begin{align} {\boldsymbol{\mathcal{A}}} ({\bf P}) = \psi'(\left| {{{\bf P}}} \right|) \frac {{\bf P}}{\left| {{{\bf P}}} \right|}. \label{eq:ass_S} \end{align} (2.6) In this case, we say that $${\boldsymbol{\mathcal{A}}}$$ possesses a $$(p,\delta)$$-potential $$\psi$$ and call the characteristics of $$\psi$$ also the characteristics of $${\boldsymbol{\mathcal{A}}}$$. Remark 2.3 We emphasize that the constants in the article depend only on the characteristics of $${\boldsymbol{\mathcal{A}}}$$ but are independent of $$\delta\geq 0$$. Remark 2.4 Note that the spaces $$L^p({\it{\Omega}})$$, $$L^\varphi({\it{\Omega}})$$ and $$L^\psi({\it{\Omega}}),$$ as well as $$W^{1,p}({\it{\Omega}})$$, $$W^{1,\varphi}({\it{\Omega}})$$ and $$W^{1,\psi}({\it{\Omega}})$$, are isomorphic. 
The equivalence of the corresponding norms depends only on $$\delta$$ and the characteristics of $$\psi$$. Closely related to the nonlinear operator $${\boldsymbol{\mathcal{A}}}$$ with $$(p,\delta)$$-potential $$\psi$$ is the function $${\bf F}\colon{\mathbb{R}}^{d \times n} \to {\mathbb{R}}^{d \times n}$$ defined through   \begin{align} \begin{aligned} {\bf F}({\bf P})&:= \big (\delta+\left| {{{\bf P}}} \right| \big )^{\frac {p-2}{2}}{{\bf P} }. \end{aligned} \label{eq:def_F} \end{align} (2.7) Further important tools are the shifted N-functions (cf. Růžička & Diening, 2007; Diening & Ettwein, 2008; Diening & Kreuzer, 2008; Belenki et al., 2012; Růžička, 2013). For an N-function $$\psi$$, we define the family of shifted N-functions $$\left\{ {{\psi_a}} \right\}_{a \ge 0}$$ for $$t\geq 0$$ by   \begin{align} \label{eq:phi_shifted} \psi_a(t):= \int _0^t \psi_a'(s)\, {\rm{d}}s\qquad\text{with }\quad \psi'_a(t):=\psi'(a+t)\frac {t}{a+t}. \end{align} (2.8) The function $$\psi _a$$ is again an N-function. It follows from Diening & Ettwein (2008, Lemma 26) that the conjugate function of the shifted N-function satisfies for all $$t \ge 0$$,   \begin{align} (\psi_a)^*(t)\sim (\psi^*)_{\psi'(a)}(t) , \label{eq:69} \end{align} (2.9) with constants depending only on $${\it{\Delta}}_2(\psi), {\it{\Delta}}_2(\psi^*) $$. In the special case of $$\varphi$$ defined in (2.2), we have $$\varphi_a(t) \sim (\delta+a+t)^{p-2} t^2$$ and also $$(\varphi_a)^*(t) \sim ((\delta+a)^{p-1} + t)^{p'-2} t^2$$. The family $$\left\{ {{\varphi_a}} \right\}_{a \ge 0}$$ satisfies the $${\it{\Delta}}_2$$-condition uniformly in $$a \ge 0$$, with $${\it{\Delta}}_2(\varphi_a) \leq c\, 2^{\max \left\{ {{2,p}} \right\}}$$ and $${\it{\Delta}}_2((\varphi_a)^*) \leq c\, 2^{\max \left\{ {{2,p'}} \right\}}$$, respectively. 
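To make the shifted N-functions concrete, the sketch below evaluates $$\varphi_a$$ from (2.8) for $$\varphi=\varphi_{p,\delta}$$ and compares it with $$(\delta+a+t)^{p-2}t^2$$. It is an illustration only: the function names and the midpoint quadrature are assumptions, not taken from the article. Substituting (2.2) into (2.8) gives $$\varphi_a'(t)=(\delta+a+t)^{p-2}\,t$$, so for this particular $$\varphi$$ the shift by $$a$$ acts like replacing $$\delta$$ by $$\delta+a$$.

```python
def phi_prime(t, p, delta):
    """phi'(t) = (delta + t)^(p-2) * t, cf. (2.2)."""
    return (delta + t) ** (p - 2.0) * t if t > 0.0 else 0.0


def shifted_phi_prime(t, a, p, delta):
    """phi_a'(t) = phi'(a + t) * t / (a + t), cf. (2.8)."""
    if t == 0.0:
        return 0.0
    return phi_prime(a + t, p, delta) * t / (a + t)


def shifted_phi(t, a, p, delta, n=2000):
    """phi_a(t) = int_0^t phi_a'(s) ds via the composite midpoint rule (illustrative choice)."""
    h = t / n
    return h * sum(shifted_phi_prime((i + 0.5) * h, a, p, delta) for i in range(n))


def shift_ratio(t, a, p, delta):
    """Ratio phi_a(t) / ((delta + a + t)^(p-2) t^2) for t > 0."""
    return shifted_phi(t, a, p, delta) / ((delta + a + t) ** (p - 2.0) * t * t)


if __name__ == "__main__":
    for a in (0.0, 0.5, 10.0):
        print(a, [round(shift_ratio(t, a, 3.0, 1.0), 4) for t in (0.01, 1.0, 100.0)])
```

The printed ratios stay in a fixed interval as $$a$$ and $$t$$ vary, which is the numerical counterpart of the statement that the equivalence holds uniformly in $$a \ge 0$$.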
We also have, uniformly with respect to $$t, a \ge 0$$,   \begin{align} \label{eq:equi3} \varphi_a'(t)\, t \sim \varphi_a(t)\sim \psi_a'(t)\, t \sim \psi_a(t), \end{align} (2.10) with constants depending only on the characteristics of $$\psi$$, if $$\psi $$ has $$(p,\delta)$$-structure. Moreover, we have (cf. Růžička & Diening, 2007; Diening & Ettwein, 2008; Diening & Kreuzer, 2008) the following result: Lemma 2.5 (Change of shift) For all $$\beta \in (0,1)$$ there exists $$c_\beta$$ such that for all $${\bf P},{\bf Q} \in {\mathbb{R}}^{d\times n}$$ and all $$t\ge 0$$,   \begin{align*} \varphi_{\left| {{{\bf P}}} \right|}(t) &\le c_\beta\, \varphi_{\left| {{{\bf Q}}} \right|}(t) + \beta\, \varphi_{\left| {{{\bf P}}} \right|}(\left| {{{\bf P}-{\bf Q}}} \right|),\\[1mm] \big (\varphi_{\left| {{{\bf P}}} \right|}\big )^*(t) &\le c_\beta\, \big (\varphi_{\left| {{{\bf Q}}} \right|}\big )^*(t) + \beta\, \varphi_{\left| {{{\bf P}}} \right|}(\left| {{{\bf P}-{\bf Q}}} \right|), \\ \varphi_{\left| {{{\bf P}}} \right|}(t) &\le c_\beta\, \varphi_{\left| {{{\bf Q}}} \right|}(t) + \beta\,\left| {{{\bf F}({\bf P}) - {\bf F}({\bf Q})}} \right|^2, \\ \big(\varphi_{\left| {{{\bf P}}} \right|}\big)^*(t) &\le c_\beta\, \big(\varphi_{\left| {{{\bf Q}}} \right|}\big)^*(t) + \beta\,\left| {{{\bf F}({\bf P}) - {\bf F}({\bf Q})}} \right|^2 . \end{align*} Moreover, we have $$\varphi_{\left| {{{\bf Q}}} \right|}(\left| {{{\bf P}-{\bf Q}}} \right|)\sim \varphi_{\left| {{{\bf P}}} \right|}\big (\left| {{{\bf P}-{\bf Q}}} \right|\big )$$. The connection between $${\boldsymbol{\mathcal{A}}}$$, $${\bf F}$$ and $$\left\{ {{\varphi_a}} \right\}_{a \geq 0}$$ is best explained by the following proposition (cf. Růžička & Diening, 2007; Diening & Ettwein, 2008). Proposition 2.6 Let $${\boldsymbol{\mathcal{A}}}$$ satisfy Assumption 2.2, let $$\varphi$$ be defined in (2.2) and let $${\bf F}$$ be defined in (2.7). 
Then   \begin{align} \label{eq:hammera} \big({\boldsymbol{\mathcal{A}}}({\bf P}) - {\boldsymbol{\mathcal{A}}}({\bf Q})\big) :\big({\bf P}-{\bf Q} \big) &\sim \left| {{ {\bf F}({\bf P}) - {\bf F}({\bf Q})}} \right|^2 \\ \label{eq:hammerb} &\sim \varphi_{\left| {{{\bf P}}} \right|}(\left| {{{\bf P} - {\bf Q}}} \right|), \end{align} (2.11a), (2.11b) uniformly in $${\bf P}, {\bf Q} \in {\mathbb{R}}^{d \times n}$$. Moreover, uniformly in $${\bf Q} \in {\mathbb{R}}^{d \times n}$$,   \begin{align} {\boldsymbol{\mathcal{A}}}({\bf Q}) : {\bf Q} \sim \left| {{{\bf F}({\bf Q})}} \right|^2 &\sim \varphi(\left| {{{\bf Q}}} \right|). \end{align} (2.11c) The constants depend only on the characteristics of $${\boldsymbol{\mathcal{A}}}$$. There also holds   \begin{alignat}{2} \label{eq:hammere} \left| {{{\boldsymbol{\mathcal{A}}}({\bf P}) - {\boldsymbol{\mathcal{A}}}({\bf Q})}} \right| &\sim \varphi'_{\left| {{{\bf P}}} \right|}\big(\left| {{{\bf P} - {\bf Q}}} \right|\big)&&\qquad\forall\,{\bf P}, {\bf Q} \in {{\mathbb{R}}^{d \times n}} , \\ \left| {{{\boldsymbol{\mathcal{A}}}({\bf P})}} \right| &\sim \varphi'\big(\left| {{{\bf P}}} \right|\big)&&\qquad\forall\,{\bf P} \in {{\mathbb{R}}^{d \times n}} . \end{alignat} (2.12), (2.13) Remark 2.7 In view of the previous proposition we have, for all $${\bf u}, {\bf w} \in W^{1,\varphi}({\it{\Omega}})$$,   \begin{align*} ({{\boldsymbol{\mathcal{A}}}(\nabla {\bf u}) \!-\! {\boldsymbol{\mathcal{A}}}(\nabla{\bf w})},{\nabla{\bf u} \!-\! \nabla {\bf w}}) &\sim \left\| {{{\bf F}(\nabla{\bf u}) \!-\! {\bf F}(\nabla{\bf w})}} \right\|_2^2 \,\sim \int_{\it{\Omega}}\! \varphi_{\left| {{\nabla{\bf u}}} \right|}(\left| {{\nabla{\bf u} \!-\! \nabla{\bf w}}} \right|) \,{\rm{d}}x. \end{align*} The constants depend only on the characteristics of $${\boldsymbol{\mathcal{A}}}$$. 
The last expression equals the quasi-norm introduced in Barrett & Liu (1994) raised to the power $$\rho = \max \left\{ {{p,2}} \right\}$$. This ensures that our results can also be expressed in terms of the quasi-norm. On the other hand, the last expression, which is a modular of the generalized shifted N-function $$\varphi_{\left| {{\nabla{\bf u}}} \right|}(\cdot)$$, has the advantage that one does not have to distinguish between $$p\ge 2$$ and $$p\le 2$$ and that the theory of Orlicz–Sobolev spaces, in particular traces and embeddings, can be used. Moreover, the quantity $${\bf F}$$ enables us to formulate the natural regularity class of the problem, namely $$\nabla {\bf F}(\nabla {\bf u}) \in L^2({\it{\Omega}})$$ (cf. Giusti, 1994), which is not possible in terms of classical Sobolev spaces. The following important estimate follows directly from (2.12), Young’s inequality (2.1) and (2.11). Lemma 2.8 For all $$\epsilon>0$$, there exists a constant $$c_\epsilon>0$$ depending only on $$\epsilon$$ and the characteristics of $${\boldsymbol{\mathcal{A}}}$$ such that for all vector fields $${\bf u}, {\bf v}, {\bf w} \in W^{1,\varphi}({\it{\Omega}})$$,   \begin{align*} &\left( {{{\boldsymbol{\mathcal{A}}}(\nabla{\bf u}) - {\boldsymbol{\mathcal{A}}} (\nabla{\bf v})}, {\nabla{\bf w} - \nabla {\bf v}}} \right) \leq \epsilon\, \left\| {{{\bf F}(\nabla{\bf u}) - {\bf F}(\nabla{\bf v})}} \right\|_2^2 +c_\epsilon\, \left\| {{{\bf F}(\nabla{\bf w}) - {\bf F}(\nabla{\bf v})}} \right\|_2^2 . \end{align*} 2.3 Existence theory Let us briefly discuss the existence and regularity theory for problem (1.1). Assume that $${\boldsymbol{\mathcal{A}}}$$ possesses a $$(p,\delta)$$-potential $$\psi$$. The given boundary data $${\bf u}_D \in W^{1-\frac 1p,p}({\it{\Gamma}}_{\rm D})$$ can be extended to $${\it{\Omega}} $$. The extension, which is denoted again by $${\bf u}_D$$, belongs to $$W^{1,p}({\it{\Omega}})$$. 
Now, one can easily show, using the theory of monotone operators, that for all $$p>1$$, $$\delta\ge 0$$ and all data $${{\bf u}_D\in W^{1-\frac 1p,p}({\it{\Gamma}}_{\rm D})}$$ and $${\bf f}\in L^{p'}({\it{\Omega}})$$, there exists a weak solution $${\bf u} \in W^{1,p}({\it{\Omega}})$$ of problem (1.1), i.e., $${\bf u}-{\bf u}_D \in W^{1,p}_{{\it{\Gamma}}_{\rm D}}({\it{\Omega}})$$ and   \begin{align*} \int_{\it{\Omega}} {\boldsymbol{\mathcal{A}}}(\nabla {\bf u}) : \nabla {\bf z}\, {\rm{d}}x = \int _{\it{\Omega}} {\bf f} \cdot {\bf z}\, {\rm{d}}x \end{align*} is satisfied for all $${\bf z} \in W^{1,p}_{{\it{\Gamma}}_{\rm D}}({\it{\Omega}})$$. Using modular trace and Poincaré inequalities, one obtains the a priori estimate   \begin{align} \label{eq:uapriori} \rho_{\varphi,{\it{\Omega}}}(\nabla {\bf u}) \le c\, \big (\rho_{\varphi,{\it{\Omega}}}(\nabla {\bf u}_D)+\rho_{\varphi^*,{\it{\Omega}}}({\bf f}) \big ) . \end{align} (2.14) It is well known that, under appropriate assumptions, weak solutions possess the regularity $${\bf F}(\nabla{\bf u}) \in W^{1,2}({\it{\Omega}})$$ (cf. Giaquinta & Modica, 1986; Acerbi & Fusco, 1994; Giusti, 1994). 2.4 DG spaces, jumps and averages Let $$\mathcal{T}_h$$ be a family of shape-regular triangulations of our domain $${\it{\Omega}}$$ consisting of $$n$$-dimensional simplices $$K$$ with diameter $$h_K$$ less than $$h$$. For simplicity, we assume in the article that $$h \le 1$$ always. For a simplex $$K \in \mathcal{T}_h$$, we denote by $$\rho_K$$ the supremum of the diameters of inscribed balls. We assume that there exists a constant $$\omega_0$$ independent of $$h$$ and $$K \in \mathcal{T}_h$$ such that $${h_K}{\rho_K^{-1}}\le \omega_0$$. The smallest such constant $$\omega_0$$ is called the chunkiness of $$\mathcal{T}_h$$. Note that, in the following, all constants may depend on the chunkiness $$\omega_0$$ but are independent of $$h$$. 
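As a small illustration of the shape-regularity condition, the ratio $$h_K/\rho_K$$ can be computed explicitly for a triangle ($$n=2$$). The sketch below is an assumption-laden example, not part of the article: the function name and the choice of test triangles are hypothetical. It uses that $$h_K$$ is the longest edge and that the largest inscribed ball is the incircle, whose radius equals the area divided by the semi-perimeter.

```python
import math


def triangle_chunkiness(a, b, c):
    """Return h_K / rho_K for the triangle with vertices a, b, c given as (x, y) tuples.

    h_K is the diameter of the triangle (its longest edge) and rho_K is the
    diameter of the incircle. Degenerate (zero-area) triangles are not handled.
    """
    la = math.dist(b, c)
    lb = math.dist(a, c)
    lc = math.dist(a, b)
    h_K = max(la, lb, lc)
    s = 0.5 * (la + lb + lc)  # semi-perimeter
    area = 0.5 * abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]))
    rho_K = 2.0 * area / s    # incircle diameter = 2 * (area / s)
    return h_K / rho_K


if __name__ == "__main__":
    print(triangle_chunkiness((0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3.0) / 2.0)))
    print(triangle_chunkiness((0.0, 0.0), (1.0, 0.0), (0.0, 1.0)))
    print(triangle_chunkiness((0.0, 0.0), (1.0, 0.0), (0.5, 0.01)))
```

A nearly flat triangle, as in the last call, produces a large ratio; a family of triangulations containing such elements for $$h \to 0$$ would violate the uniform bound by $$\omega_0$$.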
Let $$S_K$$ denote the neighbourhood of $$K$$, i.e., the patch $$S_K$$ is the union of all simplices of $$\mathcal{T}_h$$ touching $$K$$. We assume further for our triangulation that the interior of each $$S_K$$ is connected. One easily sees that under these assumptions we get that $$\left| {{K}} \right| \sim \left| {{S_K}} \right|$$ and that the number of simplices in $$S_K$$ and the number of patches to which a simplex belongs are uniformly bounded with respect to $$h>0$$ and $$K \in \mathcal{T}_h$$. We define the faces of $$\mathcal{T}_h$$ as follows: an interior face of $$\mathcal{T}_h$$ is the nonempty interior of $$\partial K \cap \partial K'$$, where $$K, K'$$ are two adjacent elements of $$\mathcal{T}_h$$. For the face $$\gamma:= \partial K \cap \partial K'$$, we use the notation $$S_\gamma:= K \cup K'$$. A boundary face of $$\mathcal{T}_h$$ is the nonempty interior of $$\partial K \cap \partial {\it{\Omega}}$$, where $$K$$ is a boundary element of $$\mathcal{T}_h$$. For the face $$\gamma:= \partial K \cap \partial {\it{\Omega}}$$, we use the notation $$S_\gamma:= K $$. By $${\it{\Gamma}}_I$$ and $${\it{\Gamma}}_{\rm D}$$, respectively, we denote the interior and the boundary faces, respectively, and put $${\it{\Gamma}}:= {\it{\Gamma}}_I\cup {\it{\Gamma}}_{\rm D} $$. For faces $$\gamma \in {\it{\Gamma}}$$, we denote by $$\gamma_h$$ the diameter of one of the adjacent simplices. We introduce the following notation for integrals related to quantities defined on the triangulation $$\mathcal T_h$$,   \begin{align*} \int_{\it{\Gamma}} f \, {\rm{d}}s&:= \sum_{\gamma \in {\it{\Gamma}}} \int_\gamma f \,{\rm{d}}s,\qquad \int_{\it{\Omega}} f \, {\rm{d}}x:= \sum_{K \in \mathcal T_h} \int_K f \,{\rm{d}}x, \end{align*} whenever the right-hand side is well defined. We also extend the notation for modulars to $${\it{\Gamma}}$$ by setting $$ \rho_{\psi,{\it{\Gamma}}}(f):= \int _{\it{\Gamma}} \psi(\left| {{f}} \right|)\, {\rm{d}}s$$ for $$f \in L^\psi({\it{\Gamma}})$$. 
We denote by $${\mathcal P}_k(K)$$, with $$k\in {\mathbb{N}}_0$$, the space of scalar, vector-valued or tensor-valued continuous functions, which are polynomials of degree at most $$k$$ on a simplex $$K \in \mathcal{T}_h$$. Given a triangulation of $${\it{\Omega}}$$ with the above properties, given an N-function $$\psi$$ and $$k,m \in {\mathbb{N}}$$ we define   \begin{align} \begin{alignedat}{2} {V_h^k} &= &{V_h^k}({\it{\Omega}})&:=\left\{ {{{\bf g} \in L^1 ({\it{\Omega}}) {\,\big|\,} {\bf g}|_K \in {\mathcal P}_k(K)\; \ \forall\, K \in \mathcal{T}_h}} \right\}, \\ {W_{{\rm DG}}}^{m,\psi}&= &\;{W_{{\rm DG}}}^{m,\psi}({\it{\Omega}}) &:= \left\{ {{ {\bf g} \in L^1({\it{\Omega}}) {\,\big|\,} {\bf g}|_K \in W^{m,\psi}(K) \ \forall\, K \in \mathcal{T}_h}} \right\} , \\ {X_h^k} &= &{X_h^k}({\it{\Omega}})&:=\left\{ {{{\bf G} \in L^1 ({\it{\Omega}}) {\,\big|\,} {\bf G}|_K \in {\mathcal P}_k(K)\; \ \forall\, K \in \mathcal{T}_h}} \right\}. \end{alignedat}\label{eq:1} \end{align} (2.15) Note that both $$W^{1,\psi}({\it{\Omega}})\subset {W_{{\rm DG}}}^{1,\psi}({\it{\Omega}})$$ and $${V_h^k}({\it{\Omega}})\subset {W_{{\rm DG}}}^{1,\psi}({\it{\Omega}})$$. In a slight abuse of notation we will also use $${V_h^k}$$ and $${W_{{\rm DG}}}^{1,\psi}$$ to denote the corresponding function spaces of scalar functions. For $${\bf g} \in {W_{{\rm DG}}}^{1,\psi}$$, we denote by $$\nabla_h {\bf g}$$ the local distributional gradient and note that for each $$K \in \mathcal{T}_h$$ the interior trace $${tr}^K_\gamma({\bf g})$$ of $${\bf g}$$ on $$\partial K$$ is well defined. Let $$g, {\bf g}, {\bf G} \in {W_{{\rm DG}}}^{1,\psi}$$. 
For interior faces $$\gamma$$ we denote by $$\left[\kern-0.15em\left[ {{g {\bf n}}}\right]\kern-0.15em\right]_\gamma$$, $$\left[\kern-0.15em\left[ {{{\bf g} \otimes {\bf n}}} \right]\kern-0.15em\right]_\gamma$$ and $$\left[\kern-0.15em\left[ {{{\bf G}{\bf n}}} \right]\kern-0.15em\right]_\gamma$$ the normal jump, i.e., the jump of $$g {\bf n}$$, $${\bf g} \otimes {\bf n}$$, $${\bf G} {\bf n}$$, respectively. For example $$\left[\kern-0.15em\left[ {{{\bf G}{\bf n}}} \right]\kern-0.15em\right]_\gamma$$ is defined on an interior face $$\gamma \in {\it{\Gamma}}_I$$ shared by the adjacent elements $$K^-,K^+ \in \mathcal T_h$$ with outer normals $${\bf n}^-$$, $${\bf n}^+$$, respectively, by   \begin{align*} \left[\kern-0.15em\left[ {{{\bf G}\,{\bf n}}} \right]\kern-0.15em\right]_\gamma := {tr}^{K^+}_\gamma({\bf G})\,{\bf n} ^+ +{tr}^{K^-}_\gamma({\bf G})\, {\bf n} ^- . \end{align*} For all interior faces, we denote by $$\left\{ {{\cdot}} \right\}$$ the trace average. For example, $$\left\{ {{{\bf g}}} \right\}$$ is defined on an interior face $$\gamma \in {\it{\Gamma}}_I$$ shared by the adjacent elements $$K^-,K^+ \in \mathcal T_h$$ by   \begin{align*} \left\{ {{{\bf g}}} \right\}_\gamma&:= \frac 12 \big ({tr}^{K^+}_\gamma({\bf g}) + {tr}^{K^-}_\gamma({\bf g})\big ). \end{align*} We omit the index $$\gamma$$ for jumps and averages if there is no danger of confusion. To deal with the Dirichlet boundary data on $${\it{\Gamma}}_{{\rm D}}$$, we need the following construction. Let $${\it{\Omega}}' \supsetneq {\it{\Omega}}$$ be a polyhedral, bounded domain with Lipschitz continuous boundary such that $$ \partial {\it{\Omega}} \setminus \partial{\it{\Omega}}' = \partial {\it{\Omega}}$$, $$ \partial {\it{\Omega}} \cap \partial {\it{\Omega}}' = \emptyset$$. Let $$\mathcal{T}'_h$$ denote an extension of the triangulation $$\mathcal{T}_h$$ to $${\it{\Omega}}'$$, having the same properties as $$\mathcal{T}_h$$ (in particular with a similar chunkiness). 
We extend our notation to this setting by adding a superscript ‘prime’ to it. In particular, we denote by $${\it{\Gamma}}_I'$$, $$S_K'$$ and $$S_\gamma'$$ the interior faces, the neighbourhood of $$K$$ and $$\gamma$$, resp., of $$\mathcal T_h'$$. We define   \begin{align*} {W_{{\rm DG},{\it{\Gamma}}_{\rm D}}}^{1,\psi}({\it{\Omega}}) := \left\{ {{{\bf g} \in {W_{{\rm DG}}}^{1,\psi}({\it{\Omega}}') {\,\big|\,} {\bf g}|_{{\it{\Omega}}' \setminus {\it{\Omega}}} = \mathbf{0}}} \right\}. \end{align*} So functions from $${W_{{\rm DG},{\it{\Gamma}}_{\rm D}}}^{1,\psi}({\it{\Omega}})$$ are elements of $${W_{{\rm DG}}}^{1,\psi}({\it{\Omega}})$$, which are (virtually) extended by zero to $${\it{\Omega}}' \setminus {\it{\Omega}}$$. Therefore, it is very natural to define the jumps and averages of $${\bf g} \in {W_{{\rm DG},{\it{\Gamma}}_{\rm D}}}^{1,\psi}({\it{\Omega}})$$ on $${\it{\Gamma}}_{\rm D}$$ by   \begin{align} \label{eq:jump_bnd} \left[\kern-0.15em\left[ {{{\bf g}\otimes {\bf n}}} \right]\kern-0.15em\right]_\gamma &:= {tr}^{\it{\Omega}}_\gamma ({\bf g}) \otimes {\bf n} , \quad && \left\{ {{{\bf g}}} \right\}_\gamma :={tr}^{\it{\Omega}}_\gamma ({\bf g}) \quad \text{for }\gamma \in {\it{\Gamma}}_{\rm D} . \end{align} (2.16) Let us now define a discrete DG gradient and jump functionals for functions $${\bf g} \in {W_{{\rm DG},{\it{\Gamma}}_{\rm D}}}^{1,\psi}({\it{\Omega}})$$. They were first introduced in Di Pietro & Ern (2010) and Bustinza & Gatica (2004) for $${\bf g}_h \in {V_h^k}$$. 
For every $${\bf g} \in {W_{{\rm DG},{\it{\Gamma}}_{\rm D}}}^{1,\psi}({\it{\Omega}})$$ we define the discrete gradient $${\nabla_{\rm DG}^h} {\bf g} \in {X_h^k}$$ (via Riesz representation) by the relation   \begin{align} \label{eq:nablaDG} \left( {{{\nabla_{\rm DG}^h} {\bf g}}, {{\bf X}_h}} \right)&:= \left( {{\nabla_h {\bf g}}, {{\bf X}_h}} \right) - \left\langle {{\left[\kern-0.15em\left[ {{{\bf g} \otimes {\bf n}}} \right]\kern-0.15em\right]}{\left\{ {{{\bf X}_h}} \right\}}} \right\rangle_{{\it{\Gamma}}} \end{align} (2.17) for all $${\bf X}_h \in {X_h^k}$$. For $${\bf g} \in {W_{{\rm DG},{\it{\Gamma}}_{\rm D}}}^{1,\psi}({\it{\Omega}})$$, we define the jump functional $${\bf R}_h {\bf g} \in {{X_h^k}}$$ (via Riesz representation) by the formula   \begin{align} \label{eq:R} \left( {{{\bf R}_h {\bf g}}, {{\bf X}_h}} \right) &:= \left\langle {{\left[\kern-0.15em\left[ {{{\bf g} \otimes {\bf n}}} \right]\kern-0.15em\right]}{\left\{ {{{\bf X}_h}} \right\}}} \right\rangle_{\it{\Gamma}} \end{align} (2.18) for all $${\bf X}_h \in {X_h^k}$$. With these definitions we have the following pointwise identities for $${\bf g}_h \in {V_h^k}$$:   \begin{align} \label{eq:DGnablaR} {\nabla_{\rm DG}^h} {\bf g}_h &= \nabla_h {\bf g}_h - {\bf R}_h {\bf g}_h, \end{align} (2.19) and for $${\bf g} \in {W_{{\rm DG},{\it{\Gamma}}_{\rm D}}}^{1,\psi}({\it{\Omega}})$$,   \begin{align} \label{eq:DGnablaR2} {\nabla_{\rm DG}^h} {\bf g} &= {{\it{\Pi}}_{\rm DG}} \nabla_h {\bf g} - {\bf R}_h {\bf g}. \end{align} (2.20) In the appendix, the projection $${{\it{\Pi}}_{\rm DG}}$$ is defined and some basic properties are collected. The same is done for the discrete gradient and the jump functionals. 
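As a concrete illustration of (2.17)–(2.19), the following minimal sketch computes the jump functional and the discrete gradient in the simplest setting one can write down. The setting is an assumption made for illustration only and is not taken from the article: a scalar function on a uniform one-dimensional mesh of $$(0,1)$$, piecewise constants ($$k=0$$) for both the function and the test space, and the zero extension outside the domain as in (2.16). The names `jump_lifting_p0` and `dg_gradient_p0` are hypothetical. In this case $$\nabla_h g_h = 0$$, and on cells away from the boundary $${\nabla_{\rm DG}^h} g_h = - R_h g_h$$ coincides with the central difference quotient.

```python
import numpy as np


def jump_lifting_p0(g, h):
    """Jump functional R_h g for cell values g on a uniform 1D mesh (k = 0).

    Testing (2.18) with the indicator function of cell j gives
    h * (R_h g)_j = sum over the two faces of cell j of [[g n]] * {X_h}.
    """
    g = np.asarray(g, dtype=float)
    n_cells = g.size
    jumps = np.empty(n_cells + 1)
    jumps[0] = -g[0]              # left boundary face: outer normal -1, cf. (2.16)
    jumps[1:-1] = g[:-1] - g[1:]  # interior face: g^- n^- + g^+ n^+ = left - right
    jumps[-1] = g[-1]             # right boundary face: outer normal +1
    # Average of a cell indicator: 1/2 on interior faces, 1 on boundary faces.
    weights = np.full(n_cells + 1, 0.5)
    weights[0] = weights[-1] = 1.0
    face_terms = weights * jumps
    # Each cell collects its left and right face; divide by the cell mass h.
    return (face_terms[:-1] + face_terms[1:]) / h


def dg_gradient_p0(g, h):
    """Discrete gradient via (2.19); the broken gradient of a P0 function is 0."""
    return -jump_lifting_p0(g, h)


n_cells = 8
h = 1.0 / n_cells
midpoints = (np.arange(n_cells) + 0.5) * h
g = np.sin(np.pi * midpoints)
grad = dg_gradient_p0(g, h)
central = (g[2:] - g[:-2]) / (2.0 * h)
print(np.allclose(grad[1:-1], central))
```

On the two boundary cells the same formula yields a one-sided expression involving the boundary trace, which reflects the zero extension used in (2.16).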
We define the semimodulars $$m_{\psi,h}$$ and $$M_{\psi,h}$$ for $${\bf g} \in{W_{{\rm DG},{\it{\Gamma}}_{\rm D}}}^{1,\psi}({\it{\Omega}})$$ by   \begin{align} \begin{aligned} m_{\psi,h}({\bf g})&:= h\,\rho_{\psi,{\it{\Gamma}} }(h^{-1}\left[\kern-0.15em\left[ {{{\bf g}\otimes {\bf n}}} \right]\kern-0.15em\right]) , \\ M_{\psi,h}({\bf g})&:= \rho_{\psi,{\it{\Omega}}}(\nabla_h {\bf g}) + m_{\psi,h}({\bf g}) . \end{aligned}\label{def:mh} \end{align} (2.21) Note that for every $${\bf g} \in {W_{{\it{\Gamma}}_{\rm D}}}^{1,\psi}({\it{\Omega}})$$, we have $$m_{\psi,h}({\bf g})=0$$ and $$M_{\psi,h}({\bf g}) = \rho_{\psi,{\it{\Omega}}}(\nabla {\bf g})$$, so $$M_{\psi,h}(\,\cdot\,)$$ is an extension of the modular $$\rho_{\psi,{\it{\Omega}}}(\nabla \,\cdot\,)$$ on $${W_{{\it{\Gamma}}_{\rm D}}}^{1,\psi}$$ to the DG setting. In fact, the semimodular $$M_{\psi,h}$$ is a modular. This is in complete analogy to the case $${W_{{\it{\Gamma}}_{\rm D}}}^{1,\psi}({\it{\Omega}})$$. Remark 2.9 In the special case $$\psi=\varphi$$ we have, due to (2.11c),   \begin{align} \begin{aligned}\label{eq:3} m_{\varphi,h}({\bf g})&\sim h\, \left\| {{ {\bf F}(h^{-1}\left[\kern-0.15em\left[ {{{\bf g}\otimes {\bf n}}} \right]\kern-0.15em\right])}} \right\|_{2,{\it{\Gamma}}_I\cup {\it{\Gamma}}_{\rm D}}^2 , \\ M_{\varphi,h}({\bf g})&\sim \left\| {{{\bf F}(\nabla_h {\bf g})}} \right\|^2_{2,{\it{\Omega}}}+ h\,\left\| {{{\bf F}(h^{-1}\left[\kern-0.15em\left[ {{{\bf g}\otimes {\bf n}}} \right]\kern-0.15em\right])}} \right\|_{2,{\it{\Gamma}}_I\cup {\it{\Gamma}}_{\rm D}}^2 . \end{aligned} \end{align} (2.22) 2.5 SIP methods Let us now state two SIP formulations of (1.1) under the assumption that $${\boldsymbol{\mathcal{A}}}$$ possesses a $$(p,\delta)$$-potential $$\psi$$. 
For given $$a > 0$$, we define the shifted operator $$ {\boldsymbol{\mathcal{A}}}_{a}\colon {\mathbb{R}}^{d\times n} \to {\mathbb{R}}^{d\times n}$$ through $$ {\boldsymbol{\mathcal{A}}}_{a}(\mathbf{0}):=\mathbf{0}$$ and   \begin{align} {\boldsymbol{\mathcal{A}}}_{a} ({\bf P}) := \psi_{a}' (\left| {{{\bf P}}} \right|) \frac {{\bf P}}{\left| {{{\bf P}}} \right|} \qquad \textrm{ $\forall\, {\bf P} \in {\mathbb{R}}^{d \times n} \setminus \left\{ { \mathbf{0}} \right\} $.} \label{def:A_u} \end{align} (2.23) For given $${\bf u} _D \in W^{1,p}({\it{\Omega}})$$, let $${\bf u}_D^* \in W^{1,p}({\it{\Omega}})$$ be some approximation of $${\bf u}_D$$. We have in mind $$ {\bf u}_D^*= {\bf u}$$ or $$ {\bf u}_D^*= {{\it{\Pi}}_{{\rm SZ}}}{\bf u}$$. Recall that we denote by $${\it{\Gamma}}$$ the union of all interior and boundary faces. Scheme 2.10 (SIP-shifted) For the given data $${\bf u}_D\in W^{1-\frac 1p,p}({\it{\Gamma}}_{\rm D})$$, $${{\bf f}\in L^{p'}({\it{\Omega}})}$$ and given $$\alpha>0$$, find $${\bf u}_h \in {V_h^k} $$ such that for all $${\bf z}_h \in {V_h^k}$$,   \begin{align} \label{eq:4} \begin{aligned} &\int_{\it{\Omega}} {\boldsymbol{\mathcal{A}}}(\nabla_h {\bf u}_h) :\nabla_h {\bf z}_h\, {\rm{d}}x -\int_{{\it{\Gamma}}} \left\{ {{{\boldsymbol{\mathcal{A}}}(\nabla_h {\bf u}_h)}} \right\}: \left[\kern-0.15em\left[ {{{\bf z}_h \otimes {\bf n}}} \right]\kern-0.15em\right]\, {\rm{d}}s \\ &\quad -h\, \int_{{\it{\Gamma}}} \left\{ {{{\boldsymbol{\mathcal{A}}}_{\left| {{\nabla_h {\bf u}_h}} \right|}(h^{-1}\left[\kern-0.15em\left[ {{({\bf u}_h -{\bf u}^*_D) \otimes {\bf n}}} \right]\kern-0.15em\right]):\nabla_h {\bf z}_h }} \right\}\, {\rm{d}}s \\ &\quad + \alpha \int_{{\it{\Gamma}}} \left\{ {{{\boldsymbol{\mathcal{A}}}_{\left| {{\nabla_h {\bf u}_h}} \right|}(h^{-1}\left[\kern-0.15em\left[ {{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}} \right]\kern-0.15em\right])}} \right\}: \left[\kern-0.15em\left[ {{{\bf z}_h \otimes {\bf n}}} \right]\kern-0.15em\right]\, {\rm{d}}s =\int 
_{\it{\Omega}} {\bf f} \cdot {\bf z}_h\, {\rm{d}}x . \end{aligned} \end{align} (2.24) Scheme 2.11 (SIP-lifting) For the given data $${\bf u}_D\in W^{1-\frac 1p,p}({\it{\Gamma}}_{\rm D})$$, $${{\bf f}\in L^{p'}({\it{\Omega}})}$$ and given $$\alpha>0$$, find $${\bf u}_h \in {V_h^k} $$ such that for all $${\bf z}_h \in {V_h^k}$$,   \begin{align} & \int_{\it{\Omega}}\! {\boldsymbol{\mathcal{A}}}({\nabla_{\rm DG}^h} {\bf u}_h+ {\bf R}_h{\bf u}_D^*) : {\nabla_{\rm DG}^h} {\bf z}_h\, {\rm{d}}x -\int_{{\it{\Omega}}} {\boldsymbol{\mathcal{A}}}({\bf R}_h( {\bf u}_h-{\bf u}_D^*)): {\bf R}_h {\bf z}_h \, {\rm{d}}x \notag \\ &\quad + \alpha \int_{{\it{\Gamma}}} { {\boldsymbol{\mathcal{A}}} (h^{-1}\left[\kern-0.15em\left[ {{({\bf u}_h - {\bf u}^*_D) \otimes {\bf n}}} \right]\kern-0.15em\right])}: \left[\kern-0.15em\left[ {{{\bf z}_h \otimes {\bf n}}} \right]\kern-0.15em\right]\, {\rm{d}}s =\int _{\it{\Omega}} {\bf f} \cdot {\bf z}_h\, {\rm{d}}x . \label{eq:5} \end{align} (2.25) In view of the a priori estimates (cf. Section 3), one can easily prove by standard methods that there exists a solution $${\bf u}_h$$ of Scheme 2.10 and of Scheme 2.11. Remark 2.12 Note that these schemes generalize the classical SIP scheme, because for $$p=2$$ the formulations (2.24) and (2.25) reduce to the corresponding ones of the SIP method (cf. Di Pietro & Ern, 2012). Note also that the left-hand sides of (2.24) and (2.25) are well defined for $${\bf u}_h \in {W_{{\rm DG}}}^{1,\varphi}({\it{\Omega}}) \cap {W_{{\rm DG}}}^{2,1}({\it{\Omega}})$$, $${\bf z}_h \in {V_h^k} $$. Thus, we can evaluate them for the solution $${\bf u} $$ of problem (1.1). Let us now see if the weak solution of our original problem (1.1) satisfies similar systems. 
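The reduction to the classical SIP method for $$p=2$$ mentioned in Remark 2.12 can be checked at the level of the shifted operator (2.23). The sketch below assumes the model case (1.2), i.e., $$\varphi'(t)=(\delta+t)^{p-2}\,t$$, together with the usual definition of the shifted function, $$\varphi_a'(t)=\varphi'(a+t)\,t/(a+t)$$; both are assumptions insofar as the corresponding definitions are not repeated in this passage. Under these assumptions one gets $${\boldsymbol{\mathcal{A}}}_a({\bf P})=(\delta+a+\left| {{{\bf P}}} \right|)^{p-2}\,{\bf P}$$. The function name `shifted_operator` is hypothetical.

```python
import numpy as np


def shifted_operator(P, a, p, delta):
    """A_a(P) = phi_a'(|P|) P / |P| for phi'(t) = (delta + t)**(p - 2) * t."""
    P = np.asarray(P, dtype=float)
    norm = np.linalg.norm(P)  # Frobenius norm |P|
    if norm == 0.0:
        return np.zeros_like(P)  # A_a(0) := 0, as in (2.23)
    # phi_a'(t) = phi'(a + t) * t / (a + t) = (delta + a + t)**(p - 2) * t
    return (delta + a + norm) ** (p - 2.0) * P


P = np.array([[1.0, -2.0], [0.5, 3.0]])
# p = 2: the shift has no effect and A_a is the identity map.
print(np.allclose(shifted_operator(P, a=0.7, p=2.0, delta=0.1), P))
# a = 0: the unshifted model operator (1.2) is recovered (here with p = 3).
A = (0.1 + np.linalg.norm(P)) ** (3.0 - 2.0) * P
print(np.allclose(shifted_operator(P, a=0.0, p=3.0, delta=0.1), A))
```

With $${\boldsymbol{\mathcal{A}}}_a({\bf P})={\bf P}$$ for $$p=2$$, the third term in (2.24) becomes the usual symmetrizing term, the jump tested against the average of $$\nabla_h {\bf z}_h$$, and the fourth term becomes the usual penalty term $$\alpha\, h^{-1}$$ times jump against jump.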
For a sufficiently smooth solution $${\bf u} \in W^{1,\varphi}({\it{\Omega}}) \cap W^{2,1}({\it{\Omega}})$$ of (1.1), we get for all $${\bf z}_h \in {V_h^k}$$,   \begin{align}\label{eq:cont-sol} \begin{aligned} \left( {{{\bf f}}, {{\bf z}_h}} \right) &= \left( {{-{\text{div}} {\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}, {{\bf z}_h}} \right) \\ &= \left( {{{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}, {\nabla_h {\bf z}_h}} \right) - \left\langle {{{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}, {\left[\kern-0.15em\left[ {{{\bf z}_h \otimes {\bf n}}} \right]\kern-0.15em\right]}} \right\rangle_{{\it{\Gamma}}} \\ &= {\left( {{{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}, {\nabla_h {\bf z}_h}} \right) - \left\langle {{{\left\{ {{{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}} \right\}}}, {\left[\kern-0.15em\left[ {{{\bf z}_h \otimes {\bf n}}} \right]\kern-0.15em\right]}} \right\rangle_{{\it{\Gamma}}}} . \end{aligned} \end{align} (2.26) Thus, the solution $${\bf u}$$ of the continuous problem also satisfies (2.24), because the last two terms on the left-hand side of (2.24) vanish for $${\bf u} \in W^{1,\varphi}({\it{\Omega}})$$. 
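The passage from element boundaries to face terms in (2.26) rests on a purely algebraic identity on each interior face. With $${\bf n}^- = -{\bf n}^+$$ one has $$({\bf G}^+{\bf n}^+)\cdot {\bf z}^+ + ({\bf G}^-{\bf n}^-)\cdot {\bf z}^- = \left\{ {{{\bf G}}} \right\} : \left[\kern-0.15em\left[ {{{\bf z}\otimes {\bf n}}} \right]\kern-0.15em\right] + \left[\kern-0.15em\left[ {{{\bf G}{\bf n}}} \right]\kern-0.15em\right] \cdot \left\{ {{{\bf z}}} \right\}$$, and the last term drops out when the normal jump of $${\bf G}$$ vanishes, as is the case for $${\bf G}={\boldsymbol{\mathcal{A}}}(\nabla {\bf u})$$ with a sufficiently smooth $${\bf u}$$. The following sketch checks the identity numerically for random data; all variable names are hypothetical and the dimensions are chosen arbitrarily.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 3, 2                        # G is d x n, z has d components
G_plus, G_minus = rng.normal(size=(2, d, n))
z_plus, z_minus = rng.normal(size=(2, d))
n_plus = rng.normal(size=n)
n_plus /= np.linalg.norm(n_plus)   # unit outer normal of K^+
n_minus = -n_plus                  # outer normal of K^-

# Sum of the two element-boundary contributions on the shared face.
lhs = (G_plus @ n_plus) @ z_plus + (G_minus @ n_minus) @ z_minus

avg_G = 0.5 * (G_plus + G_minus)                                 # {G}
jump_z = np.outer(z_plus, n_plus) + np.outer(z_minus, n_minus)   # [[z (x) n]]
jump_Gn = G_plus @ n_plus + G_minus @ n_minus                    # [[G n]]
avg_z = 0.5 * (z_plus + z_minus)                                 # {z}
rhs = np.sum(avg_G * jump_z) + jump_Gn @ avg_z

print(np.isclose(lhs, rhs))
```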
Using (2.19), the definition of $${{\it{\Pi}}_{\rm DG}}$$ and the definition of the jump functional (2.18), we obtain   \begin{align*} \left( {{{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}, {\nabla_h {\bf z}_h}} \right) &= \left( {{ {\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}, {{\nabla_{\rm DG}^h}{\bf z}_h}} \right) + \left( {{ {\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}, {{\bf R}_h{\bf z}_h}} \right) \\ &= \left( {{ {\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}, {{\nabla_{\rm DG}^h}{\bf z}_h}} \right) + \left( {{{{\it{\Pi}}_{\rm DG}} {\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}, {{\bf R}_h{\bf z}_h}} \right) \\ &= \left( {{{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}, {{\nabla_{\rm DG}^h}{\bf z}_h}} \right) + \left\langle {{\left\{ {{{{\it{\Pi}}_{\rm DG}} {\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}} \right\}}, {\left[\kern-0.15em\left[ {{{\bf z}_h\otimes {\bf n}}} \right]\kern-0.15em\right]}} \right\rangle_{\it{\Gamma}}, \end{align*} and consequently,   \begin{align} \label{eq:cont} \begin{aligned} \left( {{{\bf f}}, {{\bf z}_h}} \right) &= \left( {{{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}, {{\nabla_{\rm DG}^h}{\bf z}_h}} \right) + \left\langle {{\left\{ {{{{\it{\Pi}}_{\rm DG}} {\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}} \right\}-{\left\{ {{{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}} \right\}}}, {\left[\kern-0.15em\left[ {{{\bf z}_h \otimes {\bf n}}} \right]\kern-0.15em\right]}} \right\rangle_{{\it{\Gamma}}} . \end{aligned} \end{align} (2.27) Thus, the solution $${\bf u}$$ of the continuous problem satisfies a system quite different to (2.25), which is caused by the nonlinearity in the elliptic term. 2.6 Primal formulation of an LDG scheme An LDG scheme for problem (1.1) was proposed and analysed in Diening et al. (2014). We assume again that $${\boldsymbol{\mathcal{A}}}$$ possesses a $$(p,\delta)$$-potential $$\psi$$. For later purposes, we recall the primal formulation of the LDG scheme here. 
Scheme 2.13 (LDG) For the given data $${\bf u}_D\in W^{1-\frac 1p,p}({\it{\Gamma}}_{\rm D})$$, $${{\bf f}\in L^{p'}({\it{\Omega}})}$$ and given $$\alpha>0$$, find $${\bf u}_h \in {V_h^k} $$ such that for all $${\bf z}_h \in {V_h^k}$$,   \begin{align} & \int_{\it{\Omega}}\! {\boldsymbol{\mathcal{A}}}({\nabla_{\rm DG}^h} {\bf u}_h+ {\bf R}_h{\bf u}_D^*):{\nabla_{\rm DG}^h} {\bf z}_h\, {\rm{d}}x + \alpha \int_{{\it{\Gamma}}} \left\{ {{ {\boldsymbol{\mathcal{A}}} (h^{-1}\left[\kern-0.15em\left[ {{({\bf u}_h - {\bf u}^*_D) \otimes {\bf n}}} \right]\kern-0.15em\right])}} \right\}: \left[\kern-0.15em\left[ {{{\bf z}_h \otimes {\bf n}}} \right]\kern-0.15em\right]\, {\rm{d}}s \notag \\ &=\int_{\it{\Omega}} {\bf f} \cdot {\bf z}_h\, {\rm{d}}x . \label{eq:6} \end{align} (2.28) In Diening et al. (2014, Theorem 3.2), it was shown that for $$ {\bf u}_D^*= {\bf u}$$ or $$ {\bf u}_D^*= {{\it{\Pi}}_{{\rm SZ}}}{\bf u}$$ and any $$\alpha>0$$, the solution $${\bf u}_h$$ of (2.28) satisfies the following a priori estimate:   \begin{align*} &\int_{\it{\Omega}} \varphi\big(\left| {{{\nabla_{\rm DG}^h} {\bf u}_h+ {\bf R}_h{\bf u}_D^*}} \right|\big) \,{\rm{d}}x +\alpha\, h \int_{{\it{\Gamma}}}\varphi\big(h^{-1} \left| {{\left[\kern-0.15em\left[ {{({\bf u}_h-{\bf u}_D^*)\otimes {\bf n}}} \right]\kern-0.15em\right]}} \right|\big) \,{\rm{d}}s \\ &\quad + \min\{1,\alpha\} \int_{\it{\Omega}} \varphi\big(\left| {{\nabla _h{\bf u}_h}} \right|\big) \,{\rm{d}}x +\min\{1,\alpha\}\int_{\it{\Omega}} \varphi\big(\left| {{{\bf u}_h-{\bf u}}} \right|\big) \,{\rm{d}}x \\ &\le c\, \int_{{\it{\Omega}}} \varphi\big(\left| {{ \nabla{\bf u}}} \right|\big)\, {\rm{d}}x + c\, \int_{{\it{\Omega}}} \varphi^*\big(\left| {{{\bf f}}} \right|\big)\, {\rm{d}}x , \end{align*} with $$c$$ depending only on $$\alpha$$, the characteristics of $${\boldsymbol{\mathcal{A}}}$$ and the chunkiness $$\omega_0$$. 
Let $$k \geq 1$$ and let $${\bf u} \in W^{1,\varphi}({\it{\Omega}})$$ be a solution of (1.1) with $${\boldsymbol{\mathcal{A}}}(\nabla {\bf u}) \in {W^{1,\varphi^*}}({\it{\Omega}})$$ and $${\bf F}(\nabla {\bf u}) \in W^{1,2}({\it{\Omega}})$$, and let $${\bf u}_h \in {V_h^k} $$ be a solution of (2.28) for some $$\alpha>0$$. Then, the following error estimates have been shown in Diening et al. (2014, Corollary 4.10). (i) If $${\bf u}_D^* = {{\it{\Pi}}_{{\rm SZ}}}{\bf u}$$, then   \begin{align*} &\left\| {{{\bf F}({\nabla_{\rm DG}^h} {\bf u}_h +{\bf R}_h {\bf u}_D^*) - {\bf F}(\nabla {\bf u})}} \right\|_2^2 + \alpha\,{m_{\varphi,h}}({\bf u}_h-{{\it{\Pi}}_{{\rm SZ}}} {\bf u}) \\ &\leq {c_\alpha\, \begin{cases} h^{2}\, \left\| {{\nabla {\bf F}(\nabla {\bf u})}} \right\|_2^2\,\qquad \;\kern1pt\quad \qquad \qquad \qquad \;\;\quad \textrm{ if } p\le 2 , \\[1mm] h^{p'}\, \big(\left\| {{\nabla {\bf F}(\nabla {\bf u})}} \right\|_2^2+\rho_{\varphi^*,{\it{\Omega}}}(\nabla {\boldsymbol{\mathcal{A}}}(\nabla {\bf u}))\big )\,\qquad \textrm{ if } p\ge 2 . \end{cases}} \end{align*} (ii) If $${\bf u}_D^* = {\bf u}$$, then   \begin{align*} &\left\| {{{\bf F}({\nabla_{\rm DG}^h} {\bf u}_h +{\bf R}_h {\bf u}_D^*) - {\bf F}(\nabla {\bf u})}} \right\|_2^2 + \alpha\,{m_{\varphi,h}}({\bf u}_h-{{\it{\Pi}}_{{\rm SZ}}} {\bf u}) \\ &\leq {c_\alpha \begin{cases} h^{p}\,\big( \left\| {{\nabla {\bf F}(\nabla {\bf u})}} \right\|_2^2 + \rho_{\varphi, {\it{\Omega}}}(\nabla {\bf u})\big)\, \;\;\, \qquad \qquad \qquad \quad \qquad \,\textrm{if } p\le 2 , \\[1mm] h^{p'}\,\big( \left\| {{\nabla {\bf F}(\nabla {\bf u})}} \right\|_2^2 + \rho_{\varphi, {\it{\Omega}}}(\nabla {\bf u}) {+\rho_{\varphi^*,{\it{\Omega}}}(\nabla {\boldsymbol{\mathcal{A}}}(\nabla {\bf u}))} \big)\,\qquad \textrm{ if } p\ge 2 . \end{cases}} \end{align*} The constants depend only on the characteristics of $${\boldsymbol{\mathcal{A}}}$$, the chunkiness $$\omega_0$$ and $$\alpha^{-1}$$. 3. 
A priori estimates In this section, we derive a priori estimates for our SIP schemes. Let us start with Scheme 2.10 with shifts. Theorem 3.1 Let $$ {\bf u}_D^*= {\bf u}$$ or $$ {\bf u}_D^*= {{\it{\Pi}}_{{\rm SZ}}}{\bf u}$$. Then, there exists $$\alpha_0$$ such that for all $$\alpha \ge \alpha_0$$ the solution $${\bf u}_h \in {V_h^k} $$ of Scheme 2.10 satisfies the a priori estimate   \begin{align*} &\int_{\it{\Omega}} \varphi\big(\left| {{\nabla _h{\bf u}_h}} \right|\big) \,{\rm{d}}x +\alpha\, h\int_{{\it{\Gamma}}} \left\{ {{\varphi_{\left| {{\nabla _h{\bf u}_h}} \right|}\big(h^{-1} \left| {{\left[\kern-0.15em\left[ {{({\bf u}_h-{\bf u}_D^*)\otimes {\bf n}}} \right]\kern-0.15em\right]}} \right|\big)}} \right\} \,{\rm{d}}s \\ &\quad +\min\{1,\alpha\}\,h\, \int_{\it{\Gamma}} \varphi\big(h^{-1} \left| {{\left[\kern-0.15em\left[ {{({\bf u}_h-{\bf u}_D^*)\otimes {\bf n}}} \right]\kern-0.15em\right]}} \right|\big) \,{\rm{d}}s +\min\{1,\alpha\}\,\int_{\it{\Omega}} \varphi\big(\left| {{{\bf u}_h-{\bf u}}} \right|\big) \,{\rm{d}}x \\ &\le c\, \int_{{\it{\Omega}}} \varphi\big(\left| {{ \nabla{\bf u}}} \right|\big)\, {\rm{d}}x + c\, \int_{{\it{\Omega}}} \varphi^*\big(\left| {{{\bf f}}} \right|\big)\, {\rm{d}}x , \end{align*} with $$c$$ depending only on $$\alpha$$, the characteristics of $${\boldsymbol{\mathcal{A}}}$$ and the chunkiness $$\omega_0$$. Proof. To prove the assertion, we use in (2.24) the test function $${\bf z}_h={\bf u}_h-{{\it{\Pi}}_{{\rm SZ}}} {\bf u}$$. 
Thus, we get, adding and subtracting appropriate terms,   \begin{align} &\int_{\it{\Omega}}{{\boldsymbol{\mathcal{A}}}(\nabla_h {\bf u}_h )}: {\nabla_h {\bf u}_h }\, {\rm{d}}x\label{eq:apri} \\[-1mm] &\quad + \alpha\, h\int_{\it{\Gamma}}\left\{ {{ {\boldsymbol{\mathcal{A}}_{\lvert{\nabla _h{\bf u}_h}\rvert}}(h^{-1} \left[\kern-0.15em\left[ {{({\bf u}_h -{\bf u}_D^*) \otimes {\bf n}}} \right]\kern-0.15em\right]): h^{-1}\left[\kern-0.15em\left[ {{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}} \right]\kern-0.15em\right]}} \right\}\, {\rm{d}}s\notag \\ &= \int_{\it{\Omega}} {{\bf f}}\cdot ({\bf u}_h-{\bf u} +{\bf u}-{{\it{\Pi}}_{{\rm SZ}}} {\bf u})\, {\rm{d}}x+\int_{\it{\Omega}}{{\boldsymbol{\mathcal{A}}}(\nabla_h{\bf u}_h )}: {\nabla_h{{\it{\Pi}}_{{\rm SZ}}} {\bf u} }\,{\rm{d}}x \notag \\ &\quad +\alpha\,h \int_{\it{\Gamma}} \left\{ {{{\boldsymbol{\mathcal{A}}_{\lvert{\nabla _h{\bf u}_h}\rvert}}(h^{-1} \left[\kern-0.15em\left[ {{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}} \right]\kern-0.15em\right])}} \right\}:{ h^{-1}\left[\kern-0.15em\left[ {{({{\it{\Pi}}_{{\rm SZ}}} {\bf u}-{\bf u}_D^*) \otimes {\bf n}}} \right]\kern-0.15em\right]}\,{\rm{d}}s \notag \\ &\quad + h\,\int_{\it{\Gamma}} \left\{ {{{\boldsymbol{\mathcal{A}}}(\nabla_h{\bf u}_h )}} \right\}:{h^{-1} \left[\kern-0.15em\left[ {{({\bf u}_h-{{\it{\Pi}}_{{\rm SZ}}} {\bf u}) \otimes {\bf n}}} \right]\kern-0.15em\right]}\,{\rm{d}}s \notag \\ &\quad + h\,\int_{\it{\Gamma}} \left\{ {{{\boldsymbol{\mathcal{A}}_{\lvert{\nabla _h{\bf u}_h}\rvert}}(h^{-1} \left[\kern-0.15em\left[ {{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}} \right]\kern-0.15em\right]): \nabla_h{({\bf u}_h-{{\it{\Pi}}_{{\rm SZ}}} {\bf u}) }}} \right\}\,{\rm{d}}s \notag \\ &=: I_1+I_2+I_3+I_4+I_5 .\notag \end{align} (3.1) Using Assumption 2.2, (2.4) and (2.10) we see that the left-hand side of (3.1) is equivalent to   \begin{align} \int_{\it{\Omega}} \varphi(|{\nabla_h {\bf u}_h }|)\,{\rm{d}}x + \alpha\, h \int_{\it{\Gamma}} \left\{ {{\varphi_{\left| {{\nabla _h{\bf 
u}_h}} \right|} (h^{-1} \left| {{\left[\kern-0.15em\left[ {{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}} \right]\kern-0.15em\right]}} \right|})} \right\}\, {\rm{d}}s . \label{eq:lhs1} \end{align} (3.2) Lemma 2.5 and (A.10) imply that (3.2) is an upper bound of   \begin{align} \min(1,\alpha)\, h \int_{\it{\Gamma}} {\varphi (h^{-1} \left| {{\left[\kern-0.15em\left[ {{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}} \right]\kern-0.15em\right]})} \right|}\, {\rm{d}}s, \label{eq:lhs1a} \end{align} (3.3) which thus can be added to (3.2). This, in turn, implies that we can also add   \begin{equation} \label{eq:lhs1b} \min(1,\alpha)\,\int_{\it{\Omega}} \varphi (\left| {{{\bf u}_h-{\bf u}}} \right|)\,{\rm{d}}x \end{equation} (3.4) to (3.2) at the expense of adding $$\rho_{\varphi, {\it{\Omega}}}(\nabla {\bf u})$$ to the right-hand side of (3.1), because   \begin{align} \int_{\it{\Omega}} \!\varphi (\left| {{{\bf u}_h\!-\!{\bf u}}} \right|)\,{\rm{d}}x &\le c \int_{\it{\Omega}} \!\varphi(\left| {{\nabla_h {\bf u}_h-\nabla {\bf u}}} \right|)\,{\rm{d}}x + c \, h \!\int_{\it{\Gamma}} \! {\varphi (h^{-1} \left| {{\left[\kern-0.15em\left[ {{({\bf u}_h -{\bf u})\otimes {\bf n}}} \right]\kern-0.15em\right])}} \right|}\, {\rm{d}}s \notag \\ &\le c \int_{\it{\Omega}} \!\varphi(\left| {{\nabla_h {\bf u}_h}} \right|) +\varphi(\left| {{\nabla {\bf u}}} \right|)\,{\rm{d}}x + c \, h \!\int_{\it{\Gamma}} \!{\varphi (h^{-1} \left| {{\left[\kern-0.15em\left[ {{({\bf u}_h \!-\!{\bf u}^*_D)\otimes {\bf n}})} \right]\kern-0.15em\right]}} \right|}\, {\rm{d}}s \notag \\ &\quad + c\, h \int_{\it{\Gamma}} \! {\varphi (h^{-1} \left| {{\left[\kern-0.15em\left[ {{({\bf u}_D^*-{\bf u})\otimes {\bf n}}} \right]\kern-0.15em\right])}} \right|}\, {\rm{d}}s\label{eq:lhs1c} \\ &\le c \int_{\it{\Omega}} \!\varphi(\left| {{\nabla_h {\bf u}_h}} \right|) +\varphi(\left| {{\nabla {\bf u}}} \right|)\,{\rm{d}}x + c \, h \!\int_{\it{\Gamma}} \! 
{\varphi (h^{-1} \left| {{\left[\kern-0.15em\left[ {{({\bf u}_h \!-\!{\bf u}^*_D)\otimes {\bf n}})} \right]\kern-0.15em\right]}} \right|}\, {\rm{d}}s, \notag \end{align} (3.5) where we used Lemmas A3 and A4. Now we can estimate the terms $$I_i$$, $$i=1,\ldots,5$$ on the right-hand side of (3.1). We estimate with Young’s inequality and (A.17),   \begin{align*} \left| {{I_1}} \right| &\leq \epsilon \rho_{\varphi,{\it{\Omega}}}({\bf u}_h - {\bf u}) + c_{\epsilon} \rho_{\varphi^*,{\it{\Omega}}}({\bf f}) + c\, \rho_{\varphi,{\it{\Omega}}}({\bf u} - {{\it{\Pi}}_{{\rm SZ}}} {\bf u}) \\ &\leq \epsilon \rho_{\varphi,{\it{\Omega}}}({\bf u}_h - {\bf u}) + c_{\epsilon} \rho_{\varphi^*,{\it{\Omega}}}({\bf f}) + c\, \rho_{\varphi,{\it{\Omega}}}( \nabla {\bf u}) , \end{align*} where we also used $$h\le 1$$. Using (2.13), Young’s inequality and (A.17) we get   \begin{align*} \begin{aligned} \left| {{I_2}} \right|&\le {\varepsilon} \,\rho_{\varphi,{\it{\Omega}}}\big({\nabla_h {\bf u}_h }\big) +c_{\varepsilon}\, \rho_{\varphi,{\it{\Omega}}}\big(\nabla_h{{\it{\Pi}}_{{\rm SZ}}}{\bf u}\big) \\ &\le {\varepsilon} \,\rho_{\varphi,{\it{\Omega}}}\big({\nabla_h {\bf u}_h }\big) +c_{\varepsilon} \,\rho_{\varphi,{\it{\Omega}}}\big(\nabla{\bf u}\big) . 
\end{aligned} \end{align*} From (2.23), (2.10) and Young’s inequality, we infer   \begin{align*} \left| {{I_3}} \right|&\le {\varepsilon}\,h \, \alpha\,\int_{\it{\Gamma}} \left\{ {{\varphi_{\left| {{\nabla_h{\bf u}_h}} \right|}(h^{-1}\left| {{ \left[\kern-0.15em\left[ {{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}} \right]\kern-0.15em\right]}} \right|)}} \right\}\, {\rm{d}}s \\ &\quad +c_{\varepsilon} \, h \,\alpha\int_{\it{\Gamma}} \left\{ {{\varphi_{\left| {{\nabla_h{\bf u}_h}} \right|}(h^{-1}\left| {{ \left[\kern-0.15em\left[ {{({\bf u}_D^*-{{\it{\Pi}}_{{\rm SZ}}}{\bf u})\otimes {\bf n}}} \right]\kern-0.15em\right]}} \right|)}} \right\}\, {\rm{d}}s , \end{align*} where the last term is further estimated, using Lemma 2.5, (A.10), Lemma A4 and (A.17), by   \begin{align}\label{eq:beta1} \begin{aligned} &c_{\varepsilon} \, h \,\alpha\int_{\it{\Gamma}} \beta_1\,\left\{ {{\varphi(\left| {{\nabla _h{\bf u}_h}} \right|)}} \right\} + c_{\beta_1} \,\varphi(h^{-1}\left| {{ \left[\kern-0.15em\left[ {{({\bf u}_D^* -{{\it{\Pi}}_{{\rm SZ}}}{\bf u})\otimes {\bf n}}} \right]\kern-0.15em\right]}} \right|) \, {\rm{d}}s \\ &\le \beta_1\, c_{\varepsilon}\,\alpha\,\rho_{\varphi,{\it{\Omega}}}\big(\nabla _h{\bf u}_h \big)+c_{\varepsilon} \, c_{\beta_1} \, \alpha\,\big (m_{\varphi,h}\big({\bf u}_D^* -{\bf u}\big)+ m_{\varphi,h}({\bf u} -{{\it{\Pi}}_{{\rm SZ}}}{\bf u})\big ) \\ &\le \beta_1\, c_{\varepsilon}\,\alpha\,\rho_{\varphi,{\it{\Omega}}}\big(\nabla _h{\bf u}_h \big) +c_{\varepsilon} \, c_{\beta_1}\,\alpha\, \rho_{\varphi,{\it{\Omega}}}(\nabla {\bf u}) , \end{aligned} \end{align} (3.6) with $$\beta_1\in (0,1)$$. 
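The role of the pair $$\varepsilon$$, $$c_\varepsilon$$ in the estimates of $$I_1$$, $$I_2$$ and $$I_3$$ can be made explicit in the pure power case. The sketch below is an illustration under an assumption that is not the general setting of the article: $$\delta=0$$ and no shift, so that $$\varphi(t)=t^p/p$$ and $$\varphi^*(t)=t^{p'}/p'$$. Young's inequality then reads $$s\,t \le \varepsilon\, s^p/p + \varepsilon^{-(p'-1)}\, t^{p'}/p'$$, i.e., $$c_\varepsilon=\varepsilon^{-(p'-1)}$$ grows as $$\varepsilon$$ decreases. The function name `young_rhs` is hypothetical.

```python
import numpy as np


def young_rhs(s, t, p, eps):
    """eps * s**p / p + eps**(-(p' - 1)) * t**p' / p', where 1/p + 1/p' = 1."""
    p_conj = p / (p - 1.0)
    return eps * s**p / p + eps ** (-(p_conj - 1.0)) * t**p_conj / p_conj


rng = np.random.default_rng(1)
s, t = rng.uniform(0.0, 5.0, size=(2, 1000))
for p in (1.5, 2.0, 4.0):
    for eps in (1.0, 0.1, 0.01):
        bound = young_rhs(s, t, p, eps)
        # Small tolerance guards against rounding near the equality case.
        assert np.all(s * t <= bound * (1.0 + 1e-12) + 1e-12)
```

Since $$c_\varepsilon$$ depends on $$\varepsilon$$ in this way, $$\varepsilon$$ has to be fixed before the remaining parameters are chosen.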
Using Young’s inequality, (A.10), adding and subtracting $${\bf u}^*_D$$ and $${\bf u}$$, Lemma A4 and (A.17) we get   \begin{align*} \left| {{I_4}} \right|&\le h \, \int_{\it{\Gamma}} {\varepsilon} \left\{ {{\varphi (\left| {{\nabla_h{\bf u}_h}} \right|) }} \right\} + c_{\varepsilon} \, \varphi (h^{-1}\left| {{ \left[\kern-0.15em\left[ {{({\bf u}_h -{{\it{\Pi}}_{{\rm SZ}}}{\bf u})\otimes {\bf n}}} \right]\kern-0.15em\right]}} \right|)\, {\rm{d}}s \\ &\le {\varepsilon} \, c \,\rho_{\varphi,{\it{\Omega}}}\big(\nabla _h{\bf u}_h \big) +c_{\varepsilon} \big( m_{\varphi,h} ({\bf u}_h -{\bf u}^*_D )+ m_{\varphi,h}( {\bf u}_D^* -{\bf u} )+m_{\varphi,h} ({\bf u} -{{\it{\Pi}}_{{\rm SZ}}}{\bf u})\big ) \\ &\le {\varepsilon} \, c \,\rho_{\varphi,{\it{\Omega}}}\big(\nabla _h{\bf u}_h \big) +c_{\varepsilon} \, \rho_{\varphi,{\it{\Omega}}}(\nabla {\bf u}) +c_{\varepsilon} \, m_{\varphi,h}\big ({\bf u}_h -{\bf u}^*_D\big ) , \end{align*} where the last term is further estimated by Lemma 2.5 and (A.10) by   \begin{align}\label{eq:alpha} \begin{aligned} &c_{\varepsilon} \, h \int_{\it{\Gamma}} \beta_2\,\left\{ {{\varphi(\left| {{\nabla _h{\bf u}_h}} \right|)}} \right\}+ c_{\beta_2} \,\left\{ {{\varphi_{\left| {{\nabla _h{\bf u}_h}} \right|}(h^{-1}\left| {{ \left[\kern-0.15em\left[ {{({\bf u}_h-{\bf u}_D^* )\otimes{\bf n}}} \right]\kern-0.15em\right]}} \right|)}} \right\} \, {\rm{d}}s \\ &\le \beta_2\, c_{\varepsilon}\,\rho_{\varphi,{\it{\Omega}}}\big(\nabla _h{\bf u}_h \big)+c_{\varepsilon} \, c_{\beta_2} \, h \int_{\it{\Gamma}} \left\{ {{\varphi_{\left| {{\nabla _h{\bf u}_h}} \right|}(h^{-1}\left| {{ \left[\kern-0.15em\left[ {{({\bf u}_h-{\bf u}_D^*)\otimes {\bf n}}} \right]\kern-0.15em\right]}} \right|)}} \right\} \, {\rm{d}}s , \end{aligned} \end{align} (3.7) with $$\beta_2\in (0,1)$$. 
Finally, using (2.23), (2.10) and Young’s inequality (2.1) we estimate   \begin{align} \begin{aligned}\label{eq:alpha1} \left| {{I_5}} \right|&\le c_{\varepsilon}\,h\int_{\it{\Gamma}} \left\{ {{\varphi_{\left| {{\nabla_h{\bf u}_h}} \right|}(h^{-1}\left| {{ \left[\kern-0.15em\left[ {{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}} \right]\kern-0.15em\right]}} \right|)}} \right\}\, {\rm{d}}s \\ &\quad +{\varepsilon} \, h \int_{\it{\Gamma}} \left\{ {{\varphi_{\left| {{\nabla_h{\bf u}_h}} \right|}(\left| {{ \nabla _h({\bf u}_h -{{\it{\Pi}}_{{\rm SZ}}}{\bf u})}} \right|)}} \right\}\, {\rm{d}}s , \end{aligned} \end{align} (3.8) where the last term is further estimated, using Lemma 2.5, (A.10) and (A.17), by   \begin{align*} &c\,{\varepsilon} \, h \int_{\it{\Gamma}} \left\{ {{\varphi(\left| {{\nabla _h{\bf u}_h}} \right|)}} \right\}+ \left\{ {{\varphi(\left| {{ \nabla _h{({\bf u}_h -{{\it{\Pi}}_{{\rm SZ}}}{\bf u})}}} \right|) }} \right\}\, {\rm{d}}s \\ &\le c\,{\varepsilon}\,\rho_{\varphi,{\it{\Omega}}}\big(\nabla _h{\bf u}_h \big)+c \, \rho_{\varphi,{\it{\Omega}}}\big(\nabla _h{{\it{\Pi}}_{{\rm SZ}}}{\bf u}\big) \\ &\le c\,{\varepsilon}\,\rho_{\varphi,{\it{\Omega}}}\big(\nabla _h{\bf u}_h \big) +c \,\rho_{\varphi,{\it{\Omega}}}(\nabla {\bf u}) . \end{align*} Choosing $$\varepsilon$$ small enough we can absorb all terms with $${\varepsilon}$$ into the corresponding terms in (3.2) and (3.4). Now $$c_{\varepsilon}$$ is fixed, and we choose $$\beta _2 $$ small enough to absorb the terms with $$\beta_2$$ into the corresponding terms in (3.2). Then, we choose $$\alpha$$ large enough to absorb the term with the, by now fixed, number $$c_{\varepsilon} \, c_{\beta_2}$$ in (3.7) and the term with $$c_{\varepsilon} $$ in (3.8) into the corresponding terms in (3.2). Finally, we choose $$\beta_1$$ small enough to absorb the term with $$\beta_1$$ in (3.6) into the corresponding terms in (3.2). This way the assertion of Theorem 3.1 is proved. □ Let us now come to Scheme 2.11 with lifting. 
Theorem 3.2 Let $$ {\bf u}_D^*= {\bf u}$$ or $$ {\bf u}_D^*= {{\it{\Pi}}_{{\rm SZ}}}{\bf u}$$. Then, there exists $$\alpha_0$$ such that for all $$\alpha \ge \alpha_0$$, the solution $${\bf u}_h \in {V_h^k} $$ of Scheme 2.11 satisfies the a priori estimate   \begin{align*} &\int_{\it{\Omega}} \varphi\big(\left| {{{\nabla_{\rm DG}^h} {\bf u}_h + {\bf R}_h {\bf u}_D^*}} \right|\big) \,{\rm{d}}x +\alpha\, h\int_{{\it{\Gamma}}} {\varphi \big(h^{-1} \left| {{\left[\kern-0.15em\left[ {{({\bf u}_h-{\bf u}_D^*)\otimes {\bf n}}} \right]\kern-0.15em\right]}} \right|\big)} \,{\rm{d}}s \\ &\quad +\min\{1,\alpha\}\,\int_{\it{\Omega}} \varphi\big( \left| {{\nabla_h {\bf u}_h}} \right|\big) \,{\rm{d}}x +\min\{1,\alpha\}\,\int_{\it{\Omega}} \varphi\big(\left| {{{\bf u}_h-{\bf u}}} \right|\big) \,{\rm{d}}x \\ &\le c\, \int_{{\it{\Omega}}} \varphi\big(\left| {{ \nabla{\bf u}}} \right|\big)\, {\rm{d}}x + c\, \int_{{\it{\Omega}}} \varphi^*\big(\left| {{{\bf f}}} \right|\big)\, {\rm{d}}x , \end{align*} with $$c$$ depending only on $$\alpha$$, the characteristics of $${\boldsymbol{\mathcal{A}}}$$ and the chunkiness $$\omega_0$$. Proof. To prove the assertion, we use in (2.25) the test function $${\bf z}_h={\bf u}_h-{{\it{\Pi}}_{{\rm SZ}}} {\bf u}$$. Thus, we get, adding and subtracting appropriate terms and using (2.19),   \begin{align} & \int_{\it{\Omega}}\! {\boldsymbol{\mathcal{A}}}({\nabla_{\rm DG}^h} {\bf u}_h+ {\bf R}_h{\bf u}_D^*) : ({\nabla_{\rm DG}^h} {\bf u}_h+ {\bf R}_h{\bf u}_D^*)\, {\rm{d}}x \label{eq:apri-2} \\ &\quad + \alpha \, h\int_{{\it{\Gamma}}} { {\boldsymbol{\mathcal{A}}} (h^{-1}\left[\kern-0.15em\left[ {{({\bf u}_h - {\bf u}^*_D) \otimes {\bf n}}} \right]\kern-0.15em\right])}: h^{-1} \left[\kern-0.15em\left[ {{({\bf u}_h - {\bf u}^*_D) \otimes {\bf n}}} \right]\kern-0.15em\right]\, {\rm{d}}s \notag \\ &= \int_{\it{\Omega}} {{\bf f}}\cdot ({\bf u}_h-{\bf u} +{\bf u}-{{\it{\Pi}}_{{\rm SZ}}} {\bf u})\, {\rm{d}}x \notag \\ &\quad + \int_{\it{\Omega}}\! 
{\boldsymbol{\mathcal{A}}}({\nabla_{\rm DG}^h} {\bf u}_h+ {\bf R}_h{\bf u}_D^*) : \big (\nabla_h {{\it{\Pi}}_{{\rm SZ}}} {\bf u}+ {\bf R}_h({\bf u}_D^*-{{\it{\Pi}}_{{\rm SZ}}}{\bf u})\big )\, {\rm{d}}x \notag \\ &\quad + \alpha\, h\int_{\it{\Gamma}}{ {\boldsymbol{\mathcal{A}}} (h^{-1} \left[\kern-0.15em\left[ {{({\bf u}_h -{\bf u}_D^*) \otimes {\bf n}}} \right]\kern-0.15em\right]): \left[\kern-0.15em\left[ {{h^{-1}({{\it{\Pi}}_{{\rm SZ}}}{\bf u} -{\bf u}_D^*)\otimes {\bf n}}} \right]\kern-0.15em\right]}\, {\rm{d}}s\notag \\ &\quad +\int_{{\it{\Omega}}} {\boldsymbol{\mathcal{A}}}({\bf R}_h( {\bf u}_h-{\bf u}_D^*)): \big ({\bf R}_h ({\bf u}_h -{\bf u}_D^*) +{\bf R}_h ({\bf u}_D^*-{{\it{\Pi}}_{{\rm SZ}}} {\bf u}) \big ) \, {\rm{d}}x \notag \\ &=: I_1+I_2+I_3+I_4 .\notag \end{align} (3.9) Using Assumption 2.2, (2.4) and (2.10) we see that the left-hand side of (3.9) is equivalent to   \begin{align} \begin{aligned} &\int_{\it{\Omega}} \varphi(\left| {{{\nabla_{\rm DG}^h} {\bf u}_h+ {\bf R}_h{\bf u}_D^*}} \right|)\,{\rm{d}}x + \alpha\, h \int_{\it{\Gamma}} {\varphi(h^{-1} \left| {{\left[\kern-0.15em\left[ {{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}} \right]\kern-0.15em\right]}} \right|)}\, {\rm{d}}s \\ &= \rho_{\varphi,{\it{\Omega}}}\big({{\nabla_{\rm DG}^h} {\bf u}_h+ {\bf R}_h{\bf u}_D^* }\big) +\alpha \, m_{\varphi,h}({\bf u}_h -{\bf u}_D^*) . \end{aligned} \label{eq:lhs2} \end{align} (3.10) The identity $$\nabla _h {\bf u}_h= ({\nabla_{\rm DG}^h} {\bf u}_h+ {\bf R}_h{\bf u}_D^*) +{\bf R}_h({\bf u}_h -{\bf u}_D^*)$$ together with (A.13) implies that (3.10) is an upper bound of   \begin{align} \min(1,\alpha) \int_{\it{\Omega}} {\varphi ( \left| {{\nabla _h {\bf u}_h}} \right|)}\, {\rm{d}}x, \label{eq:lhs2a} \end{align} (3.11) which thus can be added to (3.10). In view of (3.5) we can also add (3.4) to (3.10) at the expense of adding $$\rho_{\varphi, {\it{\Omega}}}(\nabla {\bf u})$$ to the right-hand side of (3.9). 
Now we can estimate the terms $$I_i$$, $$i=1,\ldots,4$$ on the right-hand side of (3.9). We estimate with Young’s inequality, (A.17) and $$h \le 1$$,   \begin{align*} \left| {{I_1}} \right| &\leq \epsilon \rho_{\varphi,{\it{\Omega}}}({\bf u}_h - {\bf u}) + c_{\epsilon} \rho_{\varphi^*,{\it{\Omega}}}({\bf f}) + c\, \rho_{\varphi,{\it{\Omega}}}({\bf u} - {{\it{\Pi}}_{{\rm SZ}}} {\bf u}) \\ &\leq \epsilon \rho_{\varphi,{\it{\Omega}}}({\bf u}_h - {\bf u}) + c_{\epsilon} \rho_{\varphi^*,{\it{\Omega}}}({\bf f}) + c\, \rho_{\varphi,{\it{\Omega}}}( \nabla {\bf u}) . \end{align*} Using (2.13), Young’s inequality, the definition of $${\bf u}_D^*$$, (A.17), (A.13) and again (A.17) we get   \begin{align*} \begin{aligned} \left| {{I_2}} \right|&\le {\varepsilon} \rho_{\varphi,{\it{\Omega}}}\big({{\nabla_{\rm DG}^h} {\bf u}_h+ {\bf R}_h{\bf u}_D^* }\big) +c_{\varepsilon}\rho_{\varphi,{\it{\Omega}}}\big(\nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u}\big) +c_{\varepsilon}\rho_{\varphi,{\it{\Omega}}}\big({\bf R}_h({\bf u}-{{\it{\Pi}}_{{\rm SZ}}}{\bf u})\big) \\ &\le {\varepsilon} \,c\,\rho_{\varphi,{\it{\Omega}}}\big({{\nabla_{\rm DG}^h} {\bf u}_h+ {\bf R}_h{\bf u}_D^* }\big) +c_{\varepsilon}\,\rho_{\varphi,{\it{\Omega}}}\big(\nabla {\bf u}\big)+c_{\varepsilon} \, m_{\varphi,h}({\bf u} -{{\it{\Pi}}_{{\rm SZ}}} {\bf u}) \\ &\le {\varepsilon} \,c\,\rho_{\varphi,{\it{\Omega}}}\big({{\nabla_{\rm DG}^h} {\bf u}_h+ {\bf R}_h{\bf u}_D^* }\big) +c_{\varepsilon}\,\rho_{\varphi,{\it{\Omega}}}\big(\nabla {\bf u}\big) . \end{aligned} \end{align*} Using (2.13), Young’s inequality, the definition of $${\bf u}_D^*$$ and (A.17) we get   \begin{align*} \left| {{I_3}} \right|&\le {\varepsilon}\, \alpha \, m_{\varphi,h}({\bf u}_h -{\bf u}_D^*) + \alpha\, c_{\varepsilon} \, m_{\varphi,h}({\bf u} -{{\it{\Pi}}_{{\rm SZ}}} {\bf u}) \\ &\le {\varepsilon}\, \alpha \, m_{\varphi,h}({\bf u}_h -{\bf u}_D^*) + \alpha\, c_{\varepsilon} \rho_{\varphi,{\it{\Omega}}}\big(\nabla {\bf u}\big) . 
\end{align*} Using (2.13), (2.11c), Young’s inequality, the definition of $${\bf u}_D^*$$, (A.13) and (A.17) we get   \begin{align*} \left| {{I_4}} \right| &\le c\, \rho_{\varphi,{\it{\Omega}}}\big({\bf R}_h({\bf u}_h-{\bf u}_D^*) \big) + c\, \rho_{\varphi,{\it{\Omega}}}\big({\bf R}_h({\bf u}-{{\it{\Pi}}_{{\rm SZ}}}{\bf u})\big ) \\ &\le c\, m_{\varphi,h}\big({\bf u}_h-{\bf u}_D^*\big) + c\, m_{\varphi,h}\big({\bf u}-{{\it{\Pi}}_{{\rm SZ}}}{\bf u}\big ) \\ &\le c \, m_{\varphi,h}({\bf u}_h -{\bf u}_D^*)+ c\, \rho_{\varphi,{\it{\Omega}}}\big(\nabla {\bf u}\big) . \end{align*} Choosing $$\varepsilon$$ small enough we can absorb all terms with $${\varepsilon}$$ into the corresponding terms in (3.10) and (3.4). Now $$c_{\varepsilon}$$ is fixed, and we choose $$\alpha $$ large enough to absorb the term $$c \, m_{\varphi,h}({\bf u}_h -{\bf u}_D^*) $$ in the estimate of $$I_4$$ into the corresponding term in (3.10). This proves the assertion of Theorem 3.2. □ 4. Error estimates Before we prove our error estimates, we prove the analogue of the local best approximation property of the Scott–Zhang interpolation operator (cf. Diening & Růžička, 2007, Theorem 5.3) for faces $$\gamma$$ instead of elements $$K$$. Theorem 4.1 Let $${\boldsymbol{\mathcal{A}}}$$ satisfy Assumption 2.2 and let $${{\it{\Pi}}_{{\rm SZ}}} \colon W^{1,1}({\it{\Omega}}) \to {V_h^k}$$ be the Scott–Zhang operator with $$k \ge 1$$. Let $${\bf u} \in W^{1,\varphi}({\it{\Omega}})$$ with $${\bf F}(\nabla {\bf u}) \in W^{1,2}({\it{\Omega}})$$. 
Then, for all $$\gamma \in {\it{\Gamma}}$$ and all adjacent elements $$K \in \mathcal{T}_h$$ we have1  \begin{align} \label{eq:app_V1} \mathop{\int\hspace{-0.8em}{-}}_\gamma \left| {{{\bf F} (\nabla {\bf u}) - {\bf F} (\nabla_h {{\it{\Pi}}_{{\rm SZ}}} {\bf u})}} \right|^2 \,{\rm{d}}s &\leq c\,h_\gamma^2\, \mathop{\int\hspace{-0.8em}{-}}_{S_K} \left| {{\nabla {\bf F}(\nabla {\bf u})}} \right|^2\,{\rm{d}}x, \end{align} (4.1) with $$c$$ depending only on the characteristics of $${\boldsymbol{\mathcal{A}}}$$ and the chunkiness $$\omega_0$$. Proof. For arbitrary $${\bf Q} \in {\mathbb{R}}^{d \times n}$$ we have   \begin{align} &\mathop{\int\hspace{-0.8em}{-}}_\gamma \lvert{{\bf F} (\nabla {\bf u}) - {\bf F}(\nabla_h {{\it{\Pi}}_{{\rm SZ}}} {\bf u})}\rvert^2 \,{\rm{d}}s\label{eq:i-II} \\ &\leq c\, \mathop{\int\hspace{-0.8em}{-}}_\gamma \lvert{{\bf F}(\nabla {\bf u}) - {\bf F}({\bf Q})}\rvert^2 \,{\rm{d}}s + c\, \mathop{\int\hspace{-0.8em}{-}}_\gamma \lvert{{\bf F} (\nabla_h {{\it{\Pi}}_{{\rm SZ}}} {\bf u}) - {\bf F}({\bf Q})}\rvert^2 \,{\rm{d}}s=: (I) + (II) .\notag \end{align} (4.2) Let $${\mathfrak q} \in \mathcal{P}_1(S_K)$$ be such that $$\nabla {\mathfrak q} = {\bf Q}$$. Since $$k\ge 1$$ we have $${{\it{\Pi}}_{{\rm SZ}}} {\mathfrak q} = {\mathfrak q}$$ and consequently $${\bf Q} = \nabla {\mathfrak q} = \nabla_h {{\it{\Pi}}_{{\rm SZ}}} {\mathfrak q}$$. This and Proposition 2.6 imply   \begin{align*} (II)\sim c\, \mathop{\int\hspace{-0.8em}{-}}_\gamma \varphi_{\left| {{{\bf Q}}} \right|}\big(\lvert{\nabla_h{{\it{\Pi}}_{{\rm SZ}}} {\bf u} - {\bf Q}}\rvert \big)\,{\rm{d}}s &= c\, \mathop{\int\hspace{-0.8em}{-}}_\gamma \varphi_{\left| {{{\bf Q}}} \right|}\big(\lvert{\nabla_h {{\it{\Pi}}_{{\rm SZ}}} ({\bf u} -{\mathfrak q})\rvert } \big)\,{\rm{d}}s. 
\end{align*} Thus, inequality (A.10), the continuity of $${{\it{\Pi}}_{{\rm SZ}}}$$ (A.17) and again Proposition 2.6 imply that $$(II)$$ is bounded by   \begin{align} c\, \mathop{\int\hspace{-0.8em}{-}}_K \varphi_{\left| {{{\bf Q}}} \right|}\big(\lvert{\nabla_h {{\it{\Pi}}_{{\rm SZ}}} ({\bf u} -{\mathfrak q})\lvert } \big)\,{\rm{d}}x &\le c\,\mathop{\int\hspace{-0.8em}{-}}_{S_K} \varphi_{\left| {{{\bf Q}}} \right|}\big(\lvert{\nabla ({\bf u} -{\mathfrak q}) \rvert} \big)\,{\rm{d}}x \label{eq:II} \\ &= c\, \mathop{\int\hspace{-0.8em}{-}}_{S_K} \varphi_{\left| {{{\bf Q}}} \right|}\big(\lvert{\nabla {\bf u} - {\bf Q}}\rvert \big)\,{\rm{d}}x\sim \mathop{\int\hspace{-0.8em}{-}}_{S_K} \lvert{{\bf F}(\nabla {\bf u}) - {\bf F}({\bf Q})}\rvert^2 \,{\rm{d}}x . \notag \end{align} (4.3) On the other hand, Lemma A1, used for $$\psi (t)=t^2$$, $$\left| {{K}} \right| \sim \lvert{S_K}\rvert$$ and $$K \subset S_K$$, implies   \begin{align} (I) &\leq c\,\mathop{\int\hspace{-0.8em}{-}}_{K} \lvert{{\bf F}(\nabla {\bf u}) - {\bf F}({\bf Q})}\rvert^2 +h_\gamma^2\,\left| {{\nabla {\bf F}(\nabla {\bf u})}} \right|^2\,{\rm{d}}x \label{eq:I} \\ &\le c\, \mathop{\int\hspace{-0.8em}{-}}_{S_K} \lvert{{\bf F}(\nabla {\bf u}) - {\bf F}({\bf Q})}\rvert^2+h_\gamma^2\,\left| {{\nabla {\bf F}(\nabla {\bf u})}} \right|^2 \,{\rm{d}}x .\notag \end{align} (4.4) Thus, the assertion follows from the estimates (4.2)–(4.4), the surjectivity of $${\bf F}$$ and Poincaré’s inequality in $$W^{1,2}(S_K)$$. □ Corollary 4.2 Under the assumptions of Theorem 4.1 there holds2  \begin{align} \label{eq:app_V1b} h\,\int_{\it{\Gamma}} \left| {{{\bf F} (\nabla {\bf u}) - {\bf F} (\nabla_h {{\it{\Pi}}_{{\rm SZ}}} {\bf u})}} \right|^2 \,{\rm{d}}s &\leq c\,h^2\, \int_{{\it{\Omega}}} \left| {{\nabla {\bf F}(\nabla {\bf u})}} \right|^2\,{\rm{d}}x . \end{align} (4.5) Proof. This follows immediately from (4.1) by summation over $$\gamma \in {\it{\Gamma}}$$ using the properties of the triangulation. 
□ Using (2.24) and (2.26), we get our error equation for Scheme 2.10:   \begin{align}\label{eq:error-s1} \begin{aligned} &\int_{\it{\Omega}} \big({\boldsymbol{\mathcal{A}}}(\nabla_h {\bf u}_h)-{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})\big ):\nabla_h {\bf z}_h\, {\rm{d}}x \\ &\quad + \alpha \int_{{\it{\Gamma}}} \big\{{{\boldsymbol{\mathcal{A}}}_{\lvert{\nabla_h {\bf u}_h}\rvert}(h^{-1}\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}\right]\kern-0.15em\right])}\big\}: \left[\kern-0.15em\left[ {{{\bf z}_h \otimes {\bf n}}} \right]\kern-0.15em\right]\, {\rm{d}}s \\ &=\int_{{\it{\Gamma}}} \big\{{{\boldsymbol{\mathcal{A}}}(\nabla_h {\bf u}_h)- {\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}\big\}: \left[\kern-0.15em\left[ {{{\bf z}_h \otimes {\bf n}}} \right]\kern-0.15em\right]\, {\rm{d}}s \\ &\quad +h\, \int_{{\it{\Gamma}}} \big\{{{\boldsymbol{\mathcal{A}}}_{\lvert{\nabla_h {\bf u}_h}\rvert}(h^{-1}\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}^*_D) \otimes {\bf n}}\right]\kern-0.15em\right]):\nabla_h {\bf z}_h }\big\}\, {\rm{d}}s . \end{aligned} \end{align} (4.6) Based on this, we have the following error estimate: Theorem 4.3 Let $$ {\bf u}_D^*= {\bf u}$$ or $$ {\bf u}_D^*= {{\it{\Pi}}_{{\rm SZ}}}{\bf u}$$. 
Then there exists $$\alpha_0$$ such that for all $$\alpha \ge \alpha_0$$ the solution $${\bf u}_h \in {V_h^k} $$, $$k \ge 1$$ of Scheme 2.10 and a solution $${\bf u} \in W^{1,\varphi}({\it{\Omega}}) $$ of (1.1) with $${\bf F} (\nabla {\bf u}) \in W^{1,2}({\it{\Omega}})$$ satisfy the error estimate   \begin{align*} &\int_{\it{\Omega}} \left| {{{\bf F}(\nabla_h {\bf u}_h ) -{\bf F}(\nabla {\bf u} )}} \right|^2\,{\rm{d}}x + \alpha\, h \int_{\it{\Gamma}} \big\{{\varphi_{\left| {{\nabla _h{\bf u}_h}} \right|} (h^{-1} \lvert{\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}\right]\kern-0.15em\right]}\rvert)}\big\}\, {\rm{d}}s \\ &\le c\, h^2 \int _{\it{\Omega}}\lvert{\nabla {\bf F}(\nabla {\bf u})}\rvert^2\, {\rm{d}}x , \end{align*} with a constant depending only on the characteristics of $${\boldsymbol{\mathcal{A}}}$$, the chunkiness $$\omega_0$$ and $$\alpha^{-1}$$. Remark 4.4 Note that under appropriate assumptions on the data, it is well known that weak solutions $${\bf u}$$ of (1.1) possess the regularity $${\bf F} (\nabla {\bf u}) \in W^{1,2}({\it{\Omega}})$$ (cf. Giaquinta & Modica, 1986; Acerbi & Fusco, 1994; Giusti, 1994). Proof (of Theorem 4.3). 
Using $${\bf z}_h := {\bf u}_h - {{\it{\Pi}}_{{\rm SZ}}}{\bf u}$$ in the error equation (4.6) we get, adding and subtracting appropriate terms,   \begin{align} &\int_{\it{\Omega}} \big({\boldsymbol{\mathcal{A}}}(\nabla_h {\bf u}_h)-{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})\big ):\nabla_h ({\bf u}_h-{\bf u})\, {\rm{d}}x \label{eq:error-1} \\ &\quad + \alpha \,h \int_{{\it{\Gamma}}} \big\{{{\boldsymbol{\mathcal{A}}}_{\lvert{\nabla_h {\bf u}_h}\rvert}(h^{-1}\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}\right]\kern-0.15em\right])}\big\}:h^{-1} \left[\kern-0.15em\left[{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}\right]\kern-0.15em\right]\, {\rm{d}}s \notag \\ &=-\int_{\it{\Omega}} \big({\boldsymbol{\mathcal{A}}}(\nabla_h {\bf u}_h)-{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})\big ):\nabla_h ({\bf u}-{{\it{\Pi}}_{{\rm SZ}}}{\bf u})\, {\rm{d}}x \notag \\ &\quad +h\, \int_{{\it{\Gamma}}} \big\{{{\boldsymbol{\mathcal{A}}}_{\lvert{\nabla_h {\bf u}_h}\rvert}(h^{-1}\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}^*_D) \otimes {\bf n}}\right]\kern-0.15em\right]):\nabla_h ({\bf u}_h-{{\it{\Pi}}_{{\rm SZ}}}{\bf u}) }\big\}\, {\rm{d}}s \notag \\ &\quad +h \int_{{\it{\Gamma}}} \big\{{{\boldsymbol{\mathcal{A}}}(\nabla_h {\bf u}_h)- {\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}\big\}:h^{-1} \left[\kern-0.15em\left[{({\bf u}_h-{{\it{\Pi}}_{{\rm SZ}}}{\bf u}) \otimes {\bf n}}\right]\kern-0.15em\right]\, {\rm{d}}s \notag \\ &\quad - \alpha \, h\int_{{\it{\Gamma}}} \big\{{{\boldsymbol{\mathcal{A}}}_{\lvert{\nabla_h {\bf u}_h}\rvert}(h^{-1}\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}\right]\kern-0.15em\right])}\big\}:h^{-1} \left[\kern-0.15em\left[{({\bf u}_D^* -{{\it{\Pi}}_{{\rm SZ}}}{\bf u})\otimes {\bf n}}\right]\kern-0.15em\right]\, {\rm{d}}s \notag \\ &=: I_1+I_2+I_3+I_4 .\notag \end{align} (4.7) In view of Proposition 2.6 used for $${\boldsymbol{\mathcal{A}}}$$ and $${\boldsymbol{\mathcal{A}}}_{\left| {{\nabla _h{\bf u}_h}} \right|}$$, we see that the 
two terms on the left-hand side are equivalent to   \begin{align} \int_{\it{\Omega}} \left| {{{\bf F}(\nabla_h {\bf u}_h ) -{\bf F}(\nabla {\bf u} )}} \right|^2\,{\rm{d}}x + \alpha\, h \int_{\it{\Gamma}} \big\{{\varphi_{\left| {{\nabla _h{\bf u}_h}} \right|} (h^{-1} \lvert{\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}\right]\kern-0.15em\right]}\rvert)}\big\}\, {\rm{d}}s . \label{eq:lhs-e1} \end{align} (4.8) Using Lemma 2.8 we get   \begin{align} \label{eq:er-i1} \left| {{I_1}} \right|&\le {\varepsilon} \int_{\it{\Omega}} \left| {{{\bf F}(\nabla_h {\bf u}_h ) -{\bf F}(\nabla {\bf u} )}} \right|^2\,{\rm{d}}x + c_{\varepsilon} \int_{\it{\Omega}} \left| {{{\bf F}(\nabla {\bf u} ) -{\bf F}(\nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u} )}} \right|^2\,{\rm{d}}x . \end{align} (4.9) Young’s inequality implies   \begin{align} \label{eq:er-i2} \begin{aligned} \left| {{I_2}} \right|&\le c_{\varepsilon} \, h\,\int _{\it{\Gamma}} \big\{{\varphi_{\left| {{\nabla_h {\bf u}_h}} \right|} (h^{-1}\lvert{\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}^*_D) \otimes {\bf n}}\right]\kern-0.15em\right]}\rvert) }\big\}\, {\rm{d}}s \\ &\quad +{\varepsilon} \, h\, \int_{{\it{\Gamma}}} \big\{{\varphi_{\lvert{\nabla_h {\bf u}_h}\rvert} (\lvert{\nabla_h{\bf u}_h- \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u}}\rvert) }\big\} \, {\rm{d}}s . \end{aligned} \end{align} (4.10) Let $$\gamma \in {\it{\Gamma}} $$ be a face adjacent to some element $$K$$. 
Using Lemma 2.5, (A.10), Lemma 2.5 again, Proposition 2.6, Lemma A5, adding and subtracting appropriate terms and Jensen’s inequality we estimate3  \begin{align} & h \int_{\gamma} {\varphi_{\lvert{{\nabla_h {\bf u}_h}}\rvert} (\lvert{\nabla_h{\bf u}_h- \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u}}\rvert) }\, {\rm{d}}s\notag \\ &\le h\, c \int_{\gamma} {\varphi_{\lvert{\langle{\nabla_h {\bf u}_h}\rangle_K}\rvert} (\lvert{\nabla_h{\bf u}_h- \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u}}\rvert) }+{\varphi_{\lvert{\langle{\nabla_h {\bf u}_h}\rangle_K}\rvert} (\lvert{\nabla_h{\bf u}_h- \langle{\nabla_h {\bf u}_h}\rangle_K}\rvert) }\, {\rm{d}}s\notag \\ &\le c \int_{K} {\varphi_{\lvert{\langle{\nabla_h {\bf u}_h}\rangle_{K}}\rvert} (\lvert{\nabla_h{\bf u}_h- \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u}}\rvert) }+{\varphi_{\lvert{\langle{\nabla_h {\bf u}_h}\rangle_{K}}\rvert} (\lvert{\nabla_h{\bf u}_h- \langle{\nabla_h {\bf u}_h}\rangle_{K}}\rvert) }\, {\rm{d}}x\notag \\ &\le c \int_{K} {\varphi_{\lvert{{\nabla_h {\bf u}_h}}\rvert} (\lvert{\nabla_h{\bf u}_h- \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u}}\rvert) }+{\varphi_{\lvert{{\nabla_h {\bf u}_h}}\rvert} (\lvert{\nabla_h{\bf u}_h- \langle{\nabla_h {\bf u}_h}\rangle_{K}}\rvert) }\, {\rm{d}}x\notag \\ &\le c \int_{K} \lvert{{\bf F} (\nabla_h{\bf u}_h)- {\bf F}( \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u})}\rvert^2 +\lvert{{\bf F} (\nabla_h{\bf u}_h)- {\bf F}( \langle{\nabla_h {\bf u}_h}\rangle_{K}) }\rvert^2\, {\rm{d}}x\notag \\ &\le c \int_{K} \lvert{{\bf F} (\nabla_h{\bf u}_h)- {\bf F}( \nabla {\bf u})}\rvert^2+\lvert{{\bf F} (\nabla{\bf u})- {\bf F}( \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u})}\rvert^2\, {\rm{d}}x\notag \\ &\quad +c \int_K\lvert{{\bf F} (\nabla{\bf u})- \langle{{\bf F}(\nabla {\bf u})}\rangle_{K} }\rvert^2\, {\rm{d}}x . 
\label{eq:er-i2a} \end{align} (4.11) Thus, we get   \begin{align} \left| {{I_2}} \right| &\le c_{\varepsilon} \, h\int _{\it{\Gamma}} \!\big\{{\varphi_{\left| {{\nabla_h {\bf u}_h}} \right|} (h^{-1}\lvert{\left[\kern-0.15em\left[{({\bf u}_h \!-\!{\bf u}^*_D) \otimes {\bf n}}\right]\kern-0.15em\right]\rvert}) }\big\}\, {\rm{d}}s +{\varepsilon} \, c \!\int_{{\it{\Omega}}} \lvert{{\bf F} (\nabla_h{\bf u}_h)- {\bf F}( \nabla {\bf u})}\rvert^2\, {\rm{d}}x \notag \\ &\quad +c \!\int _{\it{\Omega}} \!\lvert{{\bf F} (\nabla{\bf u})\!-\! {\bf F}( \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u})}\rvert^2\, {\rm{d}}x +c\! \sum_{K \in \mathcal T_h}\int_K\lvert{{\bf F} (\nabla{\bf u})\!-\! \langle{{\bf F}(\nabla {\bf u})}\rangle_{K} }\rvert^2\, {\rm{d}}x. \label{eq:er-i2b} \end{align} (4.12) Using (2.12), Young’s inequality, adding and subtracting $${\bf u}_D^*$$, using the definition of $${\bf u}_D^*$$, $$\varphi_{\left| {{\nabla_h {\bf u}_h}} \right|} (\lvert{\nabla_h{\bf u}_h\!-\! \nabla {\bf u}}\rvert) \sim \varphi_{\left| {{\nabla {\bf u}}} \right|} (\lvert{\nabla_h{\bf u}_h\!-\! \nabla {\bf u}}\rvert)$$ and adding and subtracting $$\nabla _h {{\it{\Pi}}_{{\rm SZ}}}{\bf u}$$, we obtain   \begin{align} \left| {{I_3}} \right| &\le c_{\varepsilon} h\!\int _{\it{\Gamma}}\!\! \big\{{\varphi_{\left| {{\nabla_h {\bf u}_h}} \right|} (h^{-1}\lvert{\left[\kern-0.15em\left[{({\bf u}_h \!-\!{{\it{\Pi}}_{{\rm SZ}}}{\bf u}) \otimes {\bf n}}\right]\kern-0.15em\right]\rvert}) }\big\}\, {\rm{d}}s \!+\!{\varepsilon} \, h\! \int_{{\it{\Gamma}}}\!\! \big\{{\varphi_{\lvert{\nabla_h {\bf u}_h}\rvert} (\lvert{\nabla_h{\bf u}_h\!-\! \nabla {\bf u}}\rvert) }\big\} \,{\rm{d}}s \notag \\ &\le c_{\varepsilon} h\!\!\int _{\it{\Gamma}}\!\!\big\{{\varphi_{\left| {{\nabla_h {\bf u}_h}} \right|} (h^{-1}\lvert{\left[\kern-0.15em\left[{({\bf u}_h \!-\!{\bf u}_D^*) \otimes {\bf n}}\right]\kern-0.15em\right]\rvert}) }\big\}\!+\! 
\big\{{\varphi_{\left| {{\nabla_h {\bf u}_h}} \right|} (h^{-1}\lvert{\left[\kern-0.15em\left[{({\bf u}-{{\it{\Pi}}_{{\rm SZ}}}{\bf u}) \otimes {\bf n}}\right]\kern-0.15em\right]\rvert}) }\big\} \,{\rm{d}}s \notag \\ &\quad +{\varepsilon}\, c \,h\! \int_{{\it{\Gamma}}} \!\big\{{\varphi_{\lvert{\nabla {\bf u}}\rvert} (\lvert{\nabla_h{\bf u}_h- \nabla {\bf u}}\rvert) }\big\}\, {\rm{d}}s \notag \\ &\le c_{\varepsilon} h\!\int _{\it{\Gamma}}\!\!\big\{{\varphi_{\left| {{\nabla_h {\bf u}_h}} \right|} (h^{-1}\lvert{\left[\kern-0.15em\left[{({\bf u}_h \!-\!{\bf u}_D^*) \otimes {\bf n}}\right]\kern-0.15em\right]\rvert}) }\big\}\!+\! \big\{{\varphi_{\left| {{\nabla_h {\bf u}_h}} \right|} (h^{-1}\lvert{\left[\kern-0.15em\left[{({\bf u}-{{\it{\Pi}}_{{\rm SZ}}}{\bf u}) \otimes {\bf n}}\right]\kern-0.15em\right]\rvert}) }\big\}\, {\rm{d}}s \notag \\ &\quad +{\varepsilon}\, c \,h\! \int_{{\it{\Gamma}}} \!\big\{{\varphi_{\lvert{\nabla {\bf u}}\rvert} (\lvert{\nabla_h{\bf u}_h- \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u}}\rvert) }\big\}+{\varphi_{\lvert{\nabla {\bf u}}\rvert} (\lvert{\nabla{\bf u}- \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u}\rvert}) }\, {\rm{d}}s \notag \\ &=: J_1+J_2+J_3+J_4 . \label{eq:er-i3} \end{align} (4.13) The term $$J_2$$ is nonzero only for $$\gamma \in {\it{\Gamma}}_{\rm D}$$. 
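The reason can be spelled out in one line. The following is a sketch under two assumptions that are not restated in this section: that $${{\it{\Pi}}_{{\rm SZ}}}{\bf u}$$ is continuous across interior faces (as is standard for the Scott–Zhang operator) and that the jump on a boundary face is the trace itself. Since $${\bf u} \in W^{1,\varphi}({\it{\Omega}})$$ has a single-valued trace on every interior face, we get   \begin{align*} \left[\kern-0.15em\left[{({\bf u}-{{\it{\Pi}}_{{\rm SZ}}}{\bf u}) \otimes {\bf n}}\right]\kern-0.15em\right] = {\bf 0} \quad \text{on } \gamma \in {\it{\Gamma}}\setminus {\it{\Gamma}}_{\rm D}, \qquad \text{hence} \qquad \varphi_{\left| {{\nabla_h {\bf u}_h}} \right|} \big(h^{-1}\lvert{\left[\kern-0.15em\left[{({\bf u}-{{\it{\Pi}}_{{\rm SZ}}}{\bf u}) \otimes {\bf n}}\right]\kern-0.15em\right]}\rvert\big) = \varphi_{\left| {{\nabla_h {\bf u}_h}} \right|}(0) = 0 , \end{align*} so that only the faces in $${\it{\Gamma}}_{\rm D}$$ contribute to $$J_2$$.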
For $$\gamma \in {\it{\Gamma}}_{\rm D}$$ with $$\gamma \subset \partial K$$, we use Lemma 2.5 for $$\beta \in (0,1)$$, the identity   \begin{align} {\bf u} -{{\it{\Pi}}_{{\rm SZ}}} {\bf u}=({\bf u}-{{\it{\Pi}}_{{\rm SZ}}} {\bf u}) - {{\it{\Pi}}_{{\rm SZ}}} ({\bf u}-{{\it{\Pi}}_{{\rm SZ}}} {\bf u}) ,\label{eq:proj} \end{align} (4.14) (A.16) with $${\bf g}=h^{-1}({\bf u}- {{\it{\Pi}}_{{\rm SZ}}} {\bf u})$$, (A.10), Lemma 2.5 for $$\beta_1 \in (0,1)$$, the equivalence $$\varphi_{\lvert{\langle{\nabla_h {\bf u}_h}\rangle_{S_K}}\rvert} (\lvert{\nabla_h{\bf u}_h - \langle{\nabla_h {\bf u}_h}\rangle_{S_K}}\rvert) \sim \varphi_{\lvert{{\nabla_h {\bf u}_h}}\rvert} (\lvert{\nabla_h{\bf u}_h - \langle{\nabla_h {\bf u}_h}\rangle_{S_K}}\rvert) $$, Proposition 2.6 and Lemma A5, add and subtract appropriate terms, apply Jensen’s inequality and obtain   \begin{align} & h\int _\gamma {\varphi_{\left| {{\nabla_h {\bf u}_h}} \right|} (h^{-1}\lvert{\left[\kern-0.15em\left[ {({\bf u}-{{\it{\Pi}}_{{\rm SZ}}}{\bf u}) \otimes {\bf n}}\right]\kern-0.15em\right]}\rvert) }\, {\rm{d}}s \label{eq:er-i3b} \\ &\le h \int_{\gamma} c_\beta\,{\varphi_{\lvert{\langle{\nabla_h {\bf u}_h}\rangle_{S_K}}\rvert} (h^{-1}\lvert{\left[\kern-0.15em\left[{({\bf u} -{{\it{\Pi}}_{{\rm SZ}}}{\bf u}) \otimes {\bf n}}\right]\kern-0.15em\right]}\rvert) }\, {\rm{d}}s \notag \\ &\quad + h \int_{\gamma} \beta\,{\varphi_{\lvert{\langle{\nabla_h {\bf u}_h}\rangle_{S_K}}\rvert} (\lvert{\nabla_h{\bf u}_h - \langle{\nabla_h {\bf u}_h}\rangle_{S_K}}\rvert) } \, {\rm{d}}s \notag \\ &\le \int_{S_K}\!\!c_\beta\, {\varphi_{\lvert{\langle{\nabla_h {\bf u}_h}\rangle_{S_K}}\rvert} (\lvert{\nabla{\bf u} -\nabla_h {{\it{\Pi}}_{{\rm SZ}}} {\bf u}}\rvert) }+c\,\beta\,{\varphi_{\lvert{\langle{\nabla_h {\bf u}_h}\rangle_{S_K}}\rvert} (\lvert{\nabla_h{\bf u}_h - \langle{\nabla_h {\bf u}_h}\rangle_{S_K}}\rvert) }\, {\rm{d}}x \notag \\ &\le \int_{S_K}\!c_\beta\,c_{\beta_1}\, {\varphi_{\lvert{{\nabla {\bf u}}}\rvert} (\lvert{\nabla{\bf u} -\nabla_h 
{{\it{\Pi}}_{{\rm SZ}}} {\bf u}}\rvert) }+c_\beta\,\beta_1\,{\varphi_{\lvert{{\nabla {\bf u}}}\rvert} (\lvert{\nabla{\bf u} - \langle{\nabla_h {\bf u}_h}\rangle_{S_K}}\rvert) }\, {\rm{d}}x \notag \\ &\quad + \int_{S_K}c\,\beta\,{\varphi_{\lvert{{\nabla_h {\bf u}_h}}\rvert} (\lvert{\nabla_h{\bf u}_h \! -\! \langle{\nabla_h {\bf u}_h}\rangle_{S_K}}\rvert) }\, {\rm{d}}x \notag \\ &\le \int_{S_K}c_\beta\,c_{\beta_1}\, \lvert{{\bf F} (\nabla{\bf u})- {\bf F}( \nabla_h {{\it{\Pi}}_{{\rm SZ}}} {\bf u})}\rvert^2 + (c\,\beta+c_\beta\,\beta_1)\,\lvert{{\bf F} (\nabla{\bf u})- \langle{{\bf F}(\nabla {\bf u})}\rangle_{S_K} }\rvert^2 \, {\rm{d}}x\notag \\ &\quad +\int_{S_K} (c\,\beta+c_\beta\,\beta_1)\, \lvert{{\bf F} (\nabla_h{\bf u}_h)- {\bf F}( \nabla {\bf u})}\rvert^2 \, {\rm{d}}x . \notag \end{align} (4.15) Thus, $$J_2$$ is estimated by   \begin{align} \label{eq:er-i2bA} & c_{\varepsilon}\, (c\,\beta+c_\beta\,\beta_1 ) \int_{{\it{\Omega}}} \lvert{{\bf F} (\nabla_h{\bf u}_h)- {\bf F}( \nabla {\bf u})}\rvert^2 \, {\rm{d}}x \\ &\; +c_{\varepsilon}\,c_\beta\,c_{\beta_1}\!\int _{\it{\Omega}} \!\lvert{{\bf F} (\nabla{\bf u})\!-\! {\bf F}( \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u})}\rvert^2\, {\rm{d}}x + c_{\varepsilon}\,c_\beta \sum_{ K \in \mathcal T_h } \int_{S_K}\!\lvert{{\bf F} (\nabla{\bf u})\!-\! \langle{{\bf F}(\nabla {\bf u})}\rangle_{S_K} }\rvert^2\, {\rm{d}}x. \notag \end{align} (4.16) The term $$J_3$$ is treated similarly to (4.11). In fact, let $$\gamma \in {\it{\Gamma}} $$ be a face adjacent to some element $$K$$. 
Using Lemma 2.5, Proposition 2.6, (A.10), Lemma A1, Lemma 2.5, adding and subtracting $$\nabla {\bf u}$$, using Proposition 2.6 and Lemma A5, we estimate   \begin{align} & h \int_{\gamma} {\varphi_{\lvert{{\nabla {\bf u}}}\rvert} (\lvert{\nabla_h{\bf u}_h- \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u}}\rvert) }\, {\rm{d}}s \label{eq:er-i2aa} \\ &\le h\, c \int_{\gamma} {\varphi_{\lvert{\langle{\nabla {\bf u}}\rangle_K}\rvert} (\lvert{\nabla_h{\bf u}_h- \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u}}\rvert)} + {\varphi_{\lvert{\langle{\nabla {\bf u}}\rangle_K}\rvert} (\lvert{\nabla{\bf u}- \langle{\nabla {\bf u}}\rangle_K}\rvert) }\, {\rm{d}}s \notag \\ &\le h\, c \int_{\gamma} {\varphi_{\lvert{\langle{\nabla {\bf u}}\rangle_K}\rvert} (\lvert{\nabla_h{\bf u}_h- \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u}}\rvert)} + \lvert{{\bf F} (\nabla{\bf u})- {\bf F}(\langle{\nabla {\bf u}}\rangle_{K}) }\rvert^2\, {\rm{d}}s \notag \\ &\le c \int_{K} {\varphi_{\lvert{\langle{\nabla{\bf u}}\rangle_{K}}\rvert} (\lvert{\nabla_h{\bf u}_h- \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u}}\rvert) }+ \lvert{{\bf F} (\nabla{\bf u})- {\bf F}(\langle{\nabla {\bf u}}\rangle_{K}) }\rvert^2 + h^2\,\lvert{\nabla {\bf F} (\nabla{\bf u}) }\rvert^2\, {\rm{d}}x \notag \\ &\le c \int_{K} {\varphi_{\lvert{{\nabla {\bf u}}}\rvert} (\lvert{\nabla_h{\bf u}_h- \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u}}\rvert) }+ \lvert{{\bf F} (\nabla{\bf u})- {\bf F}(\langle{\nabla {\bf u}}\rangle_{K}) }\rvert^2 + h^2\,\lvert{\nabla {\bf F} (\nabla{\bf u}) }\rvert^2\, {\rm{d}}x \notag \\ &\le c \int_{K} \lvert{{\bf F} (\nabla_h{\bf u}_h)- {\bf F}( \nabla {\bf u})}\rvert^2+\lvert{{\bf F} (\nabla{\bf u})- {\bf F}( \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u})}\rvert^2\, {\rm{d}}x \notag \\ &\quad +c \int_K\lvert{{\bf F} (\nabla{\bf u})- \langle{{\bf F}(\nabla {\bf u})}\rangle_{K} }\rvert^2 + h^2\,\lvert{\nabla {\bf F} (\nabla{\bf u}) }\rvert^2\, {\rm{d}}x. 
\notag \end{align} (4.17) Thus, $$J_3$$ is estimated by   \begin{align} \label{eq:er-i3bA} &{\varepsilon} \, c \int_{{\it{\Omega}}} \lvert{{\bf F} (\nabla_h{\bf u}_h)- {\bf F}( \nabla {\bf u})}\rvert^2 \, {\rm{d}}x + c\, h^2\int_{\it{\Omega}} \lvert{\nabla {\bf F} (\nabla{\bf u}) }\rvert^2\, {\rm{d}}x \\ &\quad +c \int _{\it{\Omega}} \lvert{{\bf F} (\nabla{\bf u})- {\bf F}( \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u})}\rvert^2\, {\rm{d}}x +c \sum_{K \in \mathcal T_h}\int_K\lvert{{\bf F} (\nabla{\bf u})- \langle{{\bf F}(\nabla {\bf u})}\rangle_{K} }\rvert^2\, {\rm{d}}x . \notag \end{align} (4.18) Corollary 4.2 implies that   \begin{align}\label{eq:er-i4aA} \lvert{J_4}\rvert &\le {\varepsilon} \,c\, \,h^2\, \int_{{\it{\Omega}}} \left| {{\nabla {\bf F}(\nabla {\bf u})}} \right|^2\,{\rm{d}}x . \end{align} (4.19) Finally, Young’s inequality and the definition of $${\bf u}_D^*$$ imply   \begin{align} \label{eq:er-i4} \begin{aligned} \left| {{I_4}} \right|&\le {\varepsilon} \,\alpha\, h\,\int _{\it{\Gamma}} \big\{{\varphi_{\left| {{\nabla_h {\bf u}_h}} \right|} (h^{-1}\lvert{\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}^*_D) \otimes {\bf n}}\right]\kern-0.15em\right]}\rvert) }\big\}\, {\rm{d}}s \\ &\quad +c_{\varepsilon} \,\alpha\, h\, \int_{{\it{\Gamma}}} \big\{{\varphi_{\lvert{\nabla_h {\bf u}_h}\rvert} (h^{-1}\lvert{\left[\kern-0.15em\left[{({\bf u}- {{\it{\Pi}}_{{\rm SZ}}}{\bf u})\otimes {\bf n} }\right]\kern-0.15em\right]}\rvert) }\big\}\, {\rm{d}}s . \end{aligned} \end{align} (4.20) The last term in (4.20) is treated as $$J_2$$ and thus estimated by   \begin{align} \label{eq:er-i4bA} & c_{\varepsilon}\,\alpha\, (c\,\beta+c_\beta\,\beta_1 ) \int_{{\it{\Omega}}} \lvert{{\bf F} (\nabla_h{\bf u}_h)- {\bf F}( \nabla {\bf u})}\rvert^2 \, {\rm{d}}x \\ &\, +\!c_{\varepsilon}\alpha\, c_\beta\,c_{\beta_1}\!\!\int _{\it{\Omega}} \!\lvert{{\bf F} (\nabla{\bf u})\!-\! 
{\bf F}( \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u})}\rvert^2\, {\rm{d}}x + c_{\varepsilon}\alpha\,c_\beta \!\sum_{ K \in \mathcal T_h}\int_{S_K}\!\!\lvert{{\bf F} (\nabla{\bf u})\!-\! \langle{{\bf F}(\nabla {\bf u})}\rangle_{S_K} }\rvert^2 {\rm{d}}x. \notag \end{align} (4.21) Choosing $$\varepsilon$$ small enough, we can absorb all terms with $${\varepsilon}$$ in (4.9), (4.12), (4.18) and (4.20) into the corresponding terms in (4.8). Now $$c_{\varepsilon}$$ is fixed, and we choose $$\alpha $$ large enough to absorb the terms with $$\varphi_{\left| {{\nabla_h {\bf u}_h}} \right|} (h^{-1}\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}^*_D) \otimes {\bf n}}\right]\kern-0.15em\right]) $$ in (4.10) and (4.13) into the corresponding term in (4.8). Then, we choose first $$\beta$$ and then $$\beta_1$$ small enough to absorb the terms with $$\beta$$ and $$\beta_1$$ in (4.16) and (4.21) into the corresponding term in (4.8). This way we arrive at   \begin{align} &\int_{\it{\Omega}} \left| {{{\bf F}(\nabla_h {\bf u}_h ) -{\bf F}(\nabla {\bf u} )}} \right|^2\,{\rm{d}}x + \alpha\, h \int_{\it{\Gamma}} \big\{{\varphi_{\left| {{\nabla _h{\bf u}_h}} \right|} (h^{-1} \lvert{\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}\right]\kern-0.15em\right]}\rvert)}\big\}\, {\rm{d}}s \notag \\ &\le c_\alpha \, \int _{\it{\Omega}} \lvert{{\bf F} (\nabla{\bf u})- {\bf F}( \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u})}\rvert^2\, {\rm{d}}x + c_\alpha \sum_{K \in \mathcal T_h}\int_K\lvert{{\bf F} (\nabla{\bf u})- \langle{{\bf F}(\nabla {\bf u})}\rangle_{K} }\rvert^2\, {\rm{d}}x\notag \\ &\quad + c_\alpha \sum_{K \in \mathcal T_h}\int_{S_{K}}\lvert{{\bf F} (\nabla{\bf u})- \langle{{\bf F}(\nabla {\bf u})}\rangle_{S_{K}} }\rvert^2\, {\rm{d}}x +c_\alpha\, \,h^2\, \int_{{\it{\Omega}}} \left| {{\nabla {\bf F}(\nabla {\bf u})}} \right|^2\,{\rm{d}}x . 
\label{eq:err} \end{align} (4.22) The first three terms on the right-hand side are estimated by $$c\,h^2\,\left\| {{\nabla {\bf F}(\nabla {\bf u})}} \right\|_2^2$$ due to Proposition A2 and Poincaré’s inequality on $$W^{1,2}(K)$$ and $$W^{1,2}(S_{K})$$, respectively (cf. Diening et al., 2010, Theorem 6.5). This proves the assertion of the theorem. □ Let us now turn to Scheme 2.11. Using (2.25) and (2.27), we get our error equation for Scheme 2.11:   \begin{align}\label{eq:error-s2} \begin{aligned} &\int_{\it{\Omega}} \big({\boldsymbol{\mathcal{A}}}({\nabla_{\rm DG}^h} {\bf u}_h +{\bf R}_h {\bf u}_D^*)-{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})\big ):{\nabla_{\rm DG}^h} {\bf z}_h\, {\rm{d}}x \\ &\quad + \alpha \int_{{\it{\Gamma}}} {{\boldsymbol{\mathcal{A}}}(h^{-1}\left[\kern-0.15em\left[ {{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}} \right]\kern-0.15em\right])}: \left[\kern-0.15em\left[ {{{\bf z}_h \otimes {\bf n}}} \right]\kern-0.15em\right]\, {\rm{d}}s \\ &=\int_{{\it{\Gamma}}} \big\{{{{\it{\Pi}}_{\rm DG}}{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})- {\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}\big\}: \left[\kern-0.15em\left[ {{{\bf z}_h \otimes {\bf n}}} \right]\kern-0.15em\right]\, {\rm{d}}s \\ &\quad + \int_{{\it{\Omega}}} {{\boldsymbol{\mathcal{A}}}({\bf R}_h{({\bf u}_h -{\bf u}^*_D) }):{\bf R}_h {\bf z}_h }\, {\rm{d}}x . \end{aligned} \end{align} (4.23) Theorem 4.5 Let $$ {\bf u}_D^*= {\bf u}$$ or $$ {\bf u}_D^*= {{\it{\Pi}}_{{\rm SZ}}}{\bf u}$$. 
Then there exists $$\alpha_0$$ such that for all $$\alpha \ge \alpha_0$$ the solution $${\bf u}_h \in {V_h^k} $$, $$k\ge 1$$ of Scheme 2.11 and a solution $${\bf u} \in W^{1,\varphi}({\it{\Omega}}) $$ of (1.1) with $${\bf F} (\nabla {\bf u}) \in W^{1,2}({\it{\Omega}})$$ and $$\nabla {\boldsymbol{\mathcal{A}}}(\nabla {\bf u}) \in L^{\varphi^*}({\it{\Omega}})$$ satisfy the error estimate   \begin{align*} &\int_{\it{\Omega}} \left| {{{\bf F}({\nabla_{\rm DG}^h} {\bf u}_h +{\bf R}_h {\bf u}_D^*) -{\bf F}(\nabla {\bf u} )}} \right|^2\,{\rm{d}}x + \alpha\, h \int_{\it{\Gamma}} {\varphi(h^{-1} \lvert{\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}\right]\kern-0.15em\right]}\rvert)}\, {\rm{d}}s \\ &\le c\, h^{\min (p,p')} \left ( \int _{\it{\Omega}}\lvert{\nabla {\bf F}(\nabla {\bf u})}\rvert^2 + \varphi(\lvert{\nabla {\bf u}}\rvert) + \varphi^*(\lvert{\nabla {\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}\rvert)\, {\rm{d}}x \right ), \end{align*} with a constant depending only on the characteristics of $${\boldsymbol{\mathcal{A}}}$$, the chunkiness $$\omega_0$$ and $$\alpha^{-1}$$. Proof. 
Using $${\bf z}_h := {\bf u}_h - {{\it{\Pi}}_{{\rm SZ}}}{\bf u}$$ in the error equation (4.23) we get, adding and subtracting appropriate terms,   \begin{align} &\int_{\it{\Omega}} \big({\boldsymbol{\mathcal{A}}}({\nabla_{\rm DG}^h} {\bf u}_h +{\bf R}_h {\bf u}_D^*)-{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})\big ):\big ({\nabla_{\rm DG}^h} {\bf u}_h +{\bf R}_h{\bf u}_D^* -\nabla {\bf u} \big )\, {\rm{d}}x \label{eq:error-2} \\ &\quad + \alpha \,h \int_{{\it{\Gamma}}} {{\boldsymbol{\mathcal{A}}}(h^{-1}\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}\right]\kern-0.15em\right])}:h^{-1} \left[\kern-0.15em\left[{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}\right]\kern-0.15em\right]\, {\rm{d}}s \notag \\ &=\int_{\it{\Omega}} \big({\boldsymbol{\mathcal{A}}}({\nabla_{\rm DG}^h} {\bf u}_h +{\bf R}_h {\bf u}_D^*)-{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})\big ): \big ({\nabla_{\rm DG}^h} {{\it{\Pi}}_{{\rm SZ}}}{\bf u} +{\bf R}_h{\bf u}_D^* -\nabla {\bf u} \big )\, {\rm{d}}x \notag \\ &\quad +\int_{{\it{\Omega}}} {{\boldsymbol{\mathcal{A}}}({\bf R}_h {({\bf u}_h -{\bf u}^*_D)}):{\bf R}_h ({\bf u}_h-{{\it{\Pi}}_{{\rm SZ}}}{\bf u}) }\, {\rm{d}}x \notag \\ &\quad +h \int_{{\it{\Gamma}}} \big\{{{{\it{\Pi}}_{\rm DG}}{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})- {\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}\big\}:h^{-1} \left[\kern-0.15em\left[{({\bf u}_h-{{\it{\Pi}}_{{\rm SZ}}}{\bf u}) \otimes {\bf n}}\right]\kern-0.15em\right]\, {\rm{d}}s \notag \\ &\quad - \alpha \, h\int_{{\it{\Gamma}}} {{\boldsymbol{\mathcal{A}}}(h^{-1}\left[\kern-0.15em\left[ {{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}} \right]\kern-0.15em\right])}:h^{-1} \left[\kern-0.15em\left[{({\bf u}_D^* -{{\it{\Pi}}_{{\rm SZ}}}{\bf u})\otimes {\bf n}}\right]\kern-0.15em\right]\, {\rm{d}}s \notag \\ &=: I_1+I_2+I_3+I_4 .\notag \end{align} (4.24) In view of Proposition 2.6 used for $${\boldsymbol{\mathcal{A}}}$$ we see that the two terms on the left-hand side are equivalent to   \begin{align} \int_{\it{\Omega}} 
\left| {{{\bf F}({\nabla_{\rm DG}^h} {\bf u}_h\! +\!{\bf R}_h {\bf u}_D^*)\! -\!{\bf F}(\nabla {\bf u} )}} \right|^2\,{\rm{d}}x + \alpha\, h \!\int_{\it{\Gamma}} \!\varphi { (h^{-1} \lvert{\left[\kern-0.15em\left[{({\bf u}_h \! -\!{\bf u}_D^*)\otimes {\bf n}}\right]\kern-0.15em\right]}\rvert)}\, {\rm{d}}s . \label{eq:lhs-e2} \end{align} (4.25) Using $${\nabla_{\rm DG}^h} {{\it{\Pi}}_{{\rm SZ}}}{\bf u}+{\bf R}_h {\bf u}_D^*-\nabla {\bf u} = \nabla _h({{\it{\Pi}}_{{\rm SZ}}} {\bf u} -{\bf u}) -{\bf R}_h ({{\it{\Pi}}_{{\rm SZ}}} {\bf u} -{\bf u}_D^*)$$, the definition of $${\bf u}_D^*$$, Young’s inequality and Proposition 2.6 we get   \begin{align} \label{eq:er-i1A} \left| {{I_1}} \right| &\le {\varepsilon} \int_{\it{\Omega}} \left| {{{\bf F}({\nabla_{\rm DG}^h} {\bf u}_h\! +\!{\bf R}_h {\bf u}_D^* ) -{\bf F}(\nabla {\bf u} )}} \right|^2\,{\rm{d}}x \\ &\quad + c_{\varepsilon} \int_{\it{\Omega}}\left| {{{\bf F}(\nabla {\bf u} ) -{\bf F}(\nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u} )}} \right|^2 + \varphi_{\left| {{\nabla {\bf u}}} \right|}(\lvert{{\bf R}_h ({{\it{\Pi}}_{{\rm SZ}}} {\bf u} - {\bf u} )\rvert})\,{\rm{d}}x . \notag \end{align} (4.26) Using the properties of $${\bf R}_h$$, especially (A.13), (A.17), we can proceed as in the proof of Diening et al. (2014, Theorem 4.8) and show that the last integral is bounded by $$c\, h^2\left\| {{ \nabla {\bf F}(\nabla {\bf u})}} \right\|_2^2$$. 
From Young’s inequality, estimate (A.13), adding and subtracting $${\bf u}_D^*$$, using the definition of $${\bf u}_D^*$$ and (A.18) we obtain   \begin{align} \label{eq:er-i2A} \left| {{I_2}} \right|&\le c \int _{\it{\Omega}}{\varphi (\lvert{{\bf R}_h ({\bf u}_h -{\bf u}^*_D) }\rvert) }\, {\rm{d}}x +c \int_{{\it{\Omega}}} {\varphi} (\lvert{{\bf R}_h ({\bf u}_h- {{\it{\Pi}}_{{\rm SZ}}}{\bf u})}\rvert ) \, {\rm{d}}x \\ &\le c\, h \int _{\it{\Gamma}}{\varphi (h^{-1} \lvert{\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}^*_D)\otimes {\bf n} }\right]\kern-0.15em\right]}\rvert) }\, {\rm{d}}s +c\, h \int_{{\it{\Gamma}}} {\varphi} (h^{-1}\lvert{\left[\kern-0.15em\left[{({\bf u}- {{\it{\Pi}}_{{\rm SZ}}}{\bf u})\otimes {\bf n} }\right]\kern-0.15em\right]} \rvert) \, {\rm{d}}s\notag \\ &\le c\, h \int _{\it{\Gamma}}{\varphi (h^{-1} \lvert{\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}^*_D)\otimes {\bf n} }\right]\kern-0.15em\right]}\rvert)}\, {\rm{d}}s +c\,h^{\min (2,p)} \int_{{\it{\Omega}}} \lvert{\nabla {\bf F}(\nabla {\bf u} )}\rvert^2 + \varphi (\lvert{\nabla {\bf u} }\rvert) \, {\rm{d}}x .\notag \end{align} (4.27) Young’s inequality, adding and subtracting $${\bf u}_D^*$$, using the definition of $${\bf u}_D^*$$ and (A.18) yield   \begin{align} \label{eq:er-i3A} \left| {{I_3}} \right| &\le {\varepsilon} \, h \int _{\it{\Gamma}}{\varphi (h^{-1} \lvert{\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}^*_D)\otimes {\bf n} }\right]\kern-0.15em\right]}\rvert)} + {\varphi} (h^{-1}\lvert{\left[\kern-0.15em\left[{({\bf u}- {{\it{\Pi}}_{{\rm SZ}}}{\bf u})\otimes {\bf n} }\right]\kern-0.15em\right]} \rvert) \, {\rm{d}}s \\ &\quad + c_{\varepsilon} \, h \int _{\it{\Gamma}}\big\{ {\varphi^*(\lvert{{{\it{\Pi}}_{\rm DG}}{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})- {\boldsymbol{\mathcal{A}}}(\nabla {\bf u})\rvert})}\big\}\, {\rm{d}}s \notag \\ &\le {\varepsilon} \, h \int _{\it{\Gamma}}{\varphi (h^{-1} \lvert{\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}^*_D)\otimes {\bf n} 
}\right]\kern-0.15em\right]}\rvert)} \, {\rm{d}}s + c\,h^{\min (2,p)} \int_{{\it{\Omega}}} \lvert{\nabla {\bf F}(\nabla {\bf u} )}\rvert^2 + \varphi (\lvert{\nabla {\bf u} }\rvert) \, {\rm{d}}x \notag \\ &\quad + c_{\varepsilon} \, h \int _{\it{\Gamma}}\big\{ {\varphi^*(\lvert{{{\it{\Pi}}_{\rm DG}}{\boldsymbol{\mathcal{A}}}(\nabla {\bf u})- {\boldsymbol{\mathcal{A}}}(\nabla {\bf u})\rvert})}\big\}\, {\rm{d}}s. \notag \end{align} (4.28) Using (A.11) for each $$\gamma \in {\it{\Gamma}} $$ we estimate the last term by   \begin{align}\label{eq:er-13B} c_{\varepsilon} \int_{\it{\Omega}} \varphi^*(h\, \lvert{\nabla {\boldsymbol{\mathcal{A}}}(\nabla {\bf u})\rvert})\, {\rm{d}}x . \end{align} (4.29) Young’s inequality, the definition of $${\bf u}_D^*$$ and (A.18) yield   \begin{align} \lvert{I_4}\rvert&\le \varepsilon \alpha h\! \int_{\it{\Gamma}}\!\!\varphi(h^{-1}\lvert{\left[\kern-0.15em\left[{({\bf u}_h\!-\!{\bf u}_D^*)\otimes {\bf n}}\right]\kern-0.15em\right]}\rvert)\, {\rm{d}}s \!+\!c_\varepsilon \alpha\, h\! \int_{\it{\Gamma}}\!\!\varphi(h^{-1}\lvert{\left[\kern-0.15em\left[{({\bf u}\!-\!{{\it{\Pi}}_{{\rm SZ}}}{\bf u})\otimes {\bf n}}\right]\kern-0.15em\right]}\rvert)\, {\rm{d}}s \label{eq:er-i4A} \\ &\le \varepsilon \alpha h \! \int_{\it{\Gamma}} \!\varphi(h^{-1}\lvert{\left[\kern-0.15em\left[{({\bf u}_h \!-\!{\bf u}_D^*)\otimes {\bf n}}\right]\kern-0.15em\right]}\rvert) \, {\rm{d}}s +c_\varepsilon \alpha\,\,h^{\min (2,p)}\!\! \int_{{\it{\Omega}}}\! \lvert{\nabla {\bf F}(\nabla {\bf u} )}\rvert^2 \! +\! \varphi (\lvert{\nabla {\bf u}}\rvert ) \, {\rm{d}}x. \end{align} (4.30) Choosing $${\varepsilon} $$ small enough, we can absorb all terms with $${\varepsilon}$$ in (4.26)–(4.30) in (4.25). Then, we choose $$ \alpha$$ large enough to absorb the remaining term in (4.27) with $$\varphi(h^{-1}\left| {{\left[\kern-0.15em\left[{({\bf u}_h-{\bf u}_D^*)\otimes {\bf n}}\right]\kern-0.15em\right]}} \right|) $$ in (4.25). 
This way we arrive at   \begin{align} &\int_{\it{\Omega}} \left| {{{\bf F}({\nabla_{\rm DG}^h} {\bf u}_h+{\bf R}_h{\bf u}_D^* ) -{\bf F}(\nabla {\bf u} )}} \right|^2\,{\rm{d}}x + \alpha\, h \int_{\it{\Gamma}} {\varphi (h^{-1} \lvert{\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}\right]\kern-0.15em\right]}\rvert)}\, {\rm{d}}s \notag \\ &\le c_\alpha \, \int _{\it{\Omega}} \lvert{{\bf F} (\nabla{\bf u})- {\bf F}( \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u})}\rvert^2\, {\rm{d}}x+ c_\alpha \,h^2 \int _{\it{\Omega}} \lvert {\nabla {\bf F} (\nabla{\bf u})}\rvert^2\, {\rm{d}}x \notag \\ &\quad + c_\alpha \,h^{\min (2,p)} \int_{{\it{\Omega}}} \lvert{\nabla {\bf F}(\nabla {\bf u} )}\rvert^2 + \varphi (\lvert{\nabla {\bf u} }\rvert) \, {\rm{d}}x + c_\alpha \int_{\it{\Omega}} \varphi^*(h\, \lvert{\nabla {\boldsymbol{\mathcal{A}}}(\nabla {\bf u})}\rvert)\, {\rm{d}}x . \label{eq:err-s2} \end{align} (4.31) Proposition A2 and   \begin{align*} \varphi^*(h\,t) \le c\, h^{\min(2,p')}\varphi^*(t) \end{align*} yield the assertion. □ Remark 4.6 For $${\bf u}_D^* ={{\it{\Pi}}_{{\rm SZ}}} {\bf u}$$, we can improve Theorem 4.5 for $$p\le 2$$. Indeed, one easily sees that in this case the terms with $$\varphi(h^{-1}\left[\kern-0.15em\left[{({\bf u}-{{\it{\Pi}}_{{\rm SZ}}}{\bf u})\otimes {\bf n}}\right]\kern-0.15em\right]) $$ in the estimates of $$I_2$$ and $$I_3$$ do not appear and that $$I_4=0$$. As a consequence, the term with $$\varphi (h\,\nabla^2{\bf u})$$ in (4.31) does not appear. Moreover, as in the proof of Diening et al. (2014, Theorem 4.5), one can show that the last term in (4.28) can be estimated by $$c\, \left\| {{{\bf F} (\nabla{\bf u})- {\bf F}( \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u})}} \right\|_2^2$$. 
Thus, we get instead of (4.31) the estimate   \begin{align*} &\int_{\it{\Omega}} \left| {{{\bf F}({\nabla_{\rm DG}^h} {\bf u}_h +{\bf R}_h {\bf u}_D^*) -{\bf F}(\nabla {\bf u} )}} \right|^2\,{\rm{d}}x + \alpha\, h \int_{\it{\Gamma}} {\varphi(h^{-1} \lvert{\left[\kern-0.15em\left[{({\bf u}_h -{\bf u}_D^*)\otimes {\bf n}}\right]\kern-0.15em\right]}\rvert)}\, {\rm{d}}s \\ &\le c_\alpha \, \int _{\it{\Omega}} \lvert{{\bf F} (\nabla{\bf u})- {\bf F}( \nabla_h {{\it{\Pi}}_{{\rm SZ}}}{\bf u})}\rvert^2\, {\rm{d}}x+ c_\alpha \,h^2 \int _{\it{\Omega}} \lvert {\nabla {\bf F} (\nabla{\bf u})}\rvert^2\, {\rm{d}}x \notag \\ &\le c_\alpha\, h^{2} \,\Big ( \int _{\it{\Omega}}\lvert{\nabla {\bf F}(\nabla {\bf u})}\rvert^2 \, {\rm{d}}x\Big ) . \end{align*} Let us summarize our error estimates. Theorem 4.3 shows that SIP Scheme 2.10 using shifts possesses an optimal convergence rate for linear ansatz functions in all considered realizations of the boundary values. Thus, it behaves as the classical FEM scheme analysed in Diening & Růžička (2007). This is the first theoretical result that shows optimal convergence rates for all $$p \in (1,\infty)$$. On the other hand, from Theorem 4.5, Remark 4.6 and the results from Diening et al. (2014) (summarized in Section 2.6), we see that the theoretical error analysis gives the same results for LDG Scheme 2.13 and the SIP scheme in the lift formulation Scheme 2.11. In particular, for $$p\le 2$$, the convergence rates for linear elements are optimal. The performance in numerical experiments of all three schemes is discussed in the next section. 5. Numerical experiments In this section, we apply SIP Schemes 2.10 and 2.11, and primal LDG Scheme 2.13 to solve numerically system (1.1) with $${\boldsymbol{\mathcal{A}}}$$ given by (1.2). We approximate the discrete solution $${\bf u}_h$$ of the nonlinear problems (2.24), (2.25) and (2.28) using a Newton scheme with modified Jacobian matrix. 
The modified Jacobian matrix in each Newton step is evaluated by replacing the derivative of $${\boldsymbol{\mathcal{A}}}$$ by   \begin{equation*} {\boldsymbol{\mathcal{A}}}^\prime_\beta( {\bf Q} ) {\bf P} = ( \delta + \left| {{ {\bf Q} }} \right| )^{p-2} {\bf P} + \beta( p - 2 ) ( \delta + \left| {{ {\bf Q} }} \right| )^{p-3} ( {\bf P}, {\bf Q} ) \frac {{\bf Q}}{\left| {{{\bf Q}}} \right|} \end{equation*} in (2.24), (2.25) and (2.28). Note that the true Jacobian corresponds to setting $$\beta=1$$ in the last formula. The parameter $$0 \leq \beta \leq 1$$ is chosen adaptively in each Newton step: it is increased if the residual $$\| {\bf r}_{h,i} \|$$ decreases and decreased otherwise. Here, $${\bf r}_{h,i}$$ denotes the update term of the $$i$$th Newton step. The stopping criterion for the Newton scheme is $$\| {\bf r}_{h,i} \| < 10^{-8}$$. The linear system arising in each Newton step is solved using the sparse LU solver umfpack (Davis, 2004). The choice of the $$p$$-dependent penalty parameter $$\alpha$$ used for our experiments is presented in Table 1. 
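The role of $$\beta$$ in the modified Jacobian can be checked numerically. The following Python sketch is an illustration under stated assumptions, not the implementation used for the experiments: matrices are stored as flat lists, $$\left| {\bf Q} \right|$$ is taken to be the Frobenius norm, $$( {\bf P}, {\bf Q} )$$ the Frobenius inner product, and the function names `frob`, `A`, `A_prime_beta` and `central_difference` are hypothetical.

```python
import math


def frob(Q):
    """Frobenius norm |Q| of a matrix stored as a flat list of floats."""
    return math.sqrt(sum(q * q for q in Q))


def A(Q, p, delta):
    """A(Q) = (delta + |Q|)^(p-2) Q, cf. (1.2)."""
    scale = (delta + frob(Q)) ** (p - 2)
    return [scale * q for q in Q]


def A_prime_beta(Q, P, p, delta, beta):
    """Modified derivative A'_beta(Q) P; requires Q != 0.

    beta = 1 gives the true directional derivative of A at Q,
    beta = 0 drops the second (rank-one) term.
    """
    nq = frob(Q)
    inner = sum(a * b for a, b in zip(P, Q))  # (P, Q)
    s1 = (delta + nq) ** (p - 2)
    s2 = beta * (p - 2) * (delta + nq) ** (p - 3) * inner / nq
    return [s1 * pj + s2 * qj for pj, qj in zip(P, Q)]


def central_difference(Q, P, p, delta, eps=1e-6):
    """Central finite difference of A at Q in direction P."""
    plus = A([q + eps * d for q, d in zip(Q, P)], p, delta)
    minus = A([q - eps * d for q, d in zip(Q, P)], p, delta)
    return [(a - b) / (2 * eps) for a, b in zip(plus, minus)]
```

Comparing `A_prime_beta(Q, P, p, delta, 1.0)` with `central_difference(Q, P, p, delta)` for some nonzero `Q` gives a quick consistency check of the formula with $$\beta=1$$.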
Table 1 Choice of the stabilization parameter $$\alpha$$
$$p$$  1.25  4/3  1.5  5/3  1.8  2  2.25  2.5  3  4
$$\alpha_\text{SIP-shifted}$$  1.5  1.5  1.5  1.5  2.0  2.0  2.0  2.0  2.0  2.5
$$\alpha_\text{LDG}$$  0.5  0.5  0.5  0.5  0.5  0.5  0.5  0.5  0.5  0.9
$$\alpha_\text{SIP-lifting}$$  0.5  0.8  1.2  1.5  1.5  2.0  2.0  2.5  2.5  3.5

For our numerical experiments, we choose $${\it{\Omega}} = [-2,2]^2$$, linear elements, i.e., $$k=1$$, and $$\delta:=10^{-3}$$ as the parameter $$\delta$$ in the operator $${\boldsymbol{\mathcal{A}}}$$. For $$a = 0.01$$, we choose $${\bf f}$$ and $${\bf u}_D$$ such that   \begin{equation} \label{N3} {\bf u}( {\bf x} ) = |{\bf x} |^{a} \begin{pmatrix} x_2 \\ - x_1 \end{pmatrix} \end{equation} (5.1) is a solution of (1.1) and $${\bf F}(\nabla {\bf u} ) \in W^{1,2}({\it{\Omega}})$$. 
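For readers who want to reproduce the test problem, the exact solution (5.1) and its gradient can be written down explicitly. The sketch below is an illustration only; the function names `u_exact` and `grad_u_exact` are assumptions, and the gradient formulas come from differentiating (5.1) by hand (product and chain rule), not from the article.

```python
import math

A_EXP = 0.01  # exponent a from the text


def u_exact(x1, x2, a=A_EXP):
    """u(x) = |x|^a (x2, -x1)^T, cf. (5.1)."""
    ra = math.hypot(x1, x2) ** a
    return (ra * x2, -ra * x1)


def grad_u_exact(x1, x2, a=A_EXP):
    """Gradient of u for x != 0.

    Row i holds the partial derivatives of the i-th component of u
    with respect to x1 and x2.
    """
    r = math.hypot(x1, x2)
    ra = r ** a
    c = a * r ** (a - 2)
    return ((c * x1 * x2, c * x2 * x2 + ra),
            (-c * x1 * x1 - ra, -c * x1 * x2))
```

The diagonal entries of the gradient cancel, so this particular $${\bf u}$$ is divergence-free away from the origin; the right-hand side $${\bf f}$$ would then be obtained by inserting `grad_u_exact` into (1.2) and taking the divergence.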
For a series of triangulations $$\mathcal T_{h_i}$$ with $$h_{i+1} = \frac{h_i}{2}$$, we apply the above Newton scheme to compute the corresponding numerical solution $${\bf u}_{h_i}$$ and the errors   \begin{equation*} e_{{\bf F},h_i} = \left\| {{{\bf F}(\nabla_{h_i} {\bf u}_{h_i} ) - {\bf F}(\nabla {\bf u})}} \right\|_{2}\;\; \text{and} \;\; e_{ \left[\!\left[ {} \right]\!\right], h_i } = h_{i} \int_{\it{\Gamma}} {\varphi(h^{-1}_{i} \left| {{\left[\!\left[ {{({\bf u}_{h_i} -{\bf u}_D^*)\otimes {\bf n}}} \right]\!\right]}} \right|)}\, {\rm{d}}s. \end{equation*} As an estimate of the convergence rates, we use the experimental order of convergence (EOC):   \begin{equation*} {\rm{EOC}}_i( e_{h_i} ) := \frac{\log( e_{h_{i}} / e_{h_{i-1}} ) }{ \log( h_{i} / h_{i-1} )} \end{equation*} for $$i\ge 1$$ and $$e_{h_i}$$ being either the error $$e_{{\bf F},h_i}$$ or $$e_{ \left[\!\left[ {} \right]\!\right], h_i }$$. Schemes 2.11 and 2.13 are based on the evaluation of the term $${\bf R}_h{\bf u}_h$$ in the interior of $${\it{\Omega}}$$. For a given simplex $$K \in \mathcal T_h$$, the restriction of $${\bf R}_{h} {\bf u}_h$$ to $$K$$ is computed by solving locally   \begin{equation*} \left( {{\bf R}_{h}{\bf u}_h},{ {\bf X}_i} \right)_K = \sum_{\gamma \in \partial K} \left\langle {{ \left[\!\left[ {{{\bf u}_h \otimes {\bf n}}} \right]\!\right]},{ {\bf X}_i}} \right\rangle_\gamma \end{equation*} for each basis function $${\bf X}_i$$ of $$\mathcal P_k(K)$$. For different values of $$p$$ and for a series of triangulations with $$h_0 = 1$$, the EOC is computed and presented in Tables 2 and 3 for each of the three methods. In each case, we observe convergence rates of about $$1$$, as predicted by the theoretical results Theorem 4.3, Theorem 4.5 and Diening et al. (2014, Corollary 4.10). 
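The EOC formula can be evaluated directly from a list of errors. The following Python sketch is an illustration, not the authors' code; the function name `eoc` is an assumption, and the sample errors are the values of $$e_{{\bf F},h_i}$$ for the SIP-shifted method with $$p=1.25$$ as reported in Table 4, with $$h_0=1$$ and $$h_{i+1}=h_i/2$$.

```python
import math


def eoc(errors, mesh_sizes):
    """Experimental order of convergence between consecutive levels.

    Entry i-1 of the result is log(e_i / e_{i-1}) / log(h_i / h_{i-1}).
    """
    return [
        math.log(errors[i] / errors[i - 1])
        / math.log(mesh_sizes[i] / mesh_sizes[i - 1])
        for i in range(1, len(errors))
    ]


# Mesh sizes h_0 / 2^m for m = 0, ..., 5 with h_0 = 1.
mesh_sizes = [1.0 / 2 ** m for m in range(6)]

# e_F values, SIP shifted, p = 1.25 (taken from Table 4).
errors = [0.013247, 0.007443, 0.004108, 0.00223, 0.001195, 0.000635]

rates = eoc(errors, mesh_sizes)
print([round(r, 2) for r in rates])
```

Rounded to two digits, the computed rates are 0.83, 0.86, 0.88, 0.90 and 0.91, which agrees with the SIP-shifted entries for $$p=1.25$$ in Table 2.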
Table 2 Experimental order of convergence: $${\rm EOC}_i(e_{{\bf F}, h_i})$$ in the case of $$\mathcal{P}_1(K)$$ as local space and $${\bf u}_D^*={\bf u}$$ on $$\partial {\it{\Omega}}$$, $$\mathbf{F}(\nabla\mathbf{u})\in {W}^{1,2}({\it{\Omega}})$$
Method  $$\frac{h_0}{2^m}$$  $$p=1.25$$  $$p=4/3$$  $$p=1.5$$  $$p=5/3$$  $$p=1.8$$  $$p=2.0$$  $$p=2.25$$  $$p=2.5$$  $$p=3.0$$  $$p=4.0$$
SIP shifted  $$m=1$$  0.83  0.89  1.0  1.1  1.17  1.04  1.1  1.16  1.23  1.14
SIP shifted  $$m=2$$  0.86  0.86  0.87  0.87  0.88  0.86  0.86  0.87  0.88  0.88
SIP shifted  $$m=3$$  0.88  0.88  0.89  0.89  0.89  0.88  0.89  0.89  0.9  0.9
SIP shifted  $$m=4$$  0.9  0.9  0.9  0.9  0.91  0.9  0.9  0.91  0.91  0.92
SIP shifted  $$m=5$$  0.91  0.91  0.91  0.92  0.92  0.91  0.92  0.92  0.92  0.93
LDG  $$m=1$$  0.71  0.74  0.85  1.06  1.32  1.8  2.45  3.12  3.97  4.23
LDG  $$m=2$$  0.83  0.83  0.83  0.84  0.85  0.86  0.87  0.89  0.96  0.64
LDG  $$m=3$$  0.86  0.86  0.87  0.87  0.88  0.89  0.89  0.9  0.94  0.58
LDG  $$m=4$$  0.88  0.88  0.89  0.89  0.9  0.9  0.91  0.92  0.93  0.72
LDG  $$m=5$$  0.9  0.9  0.9  0.91  0.91  0.91  0.92  0.93  0.94  0.98
SIP lifting  $$m=1$$  0.63  0.64  0.67  0.71  0.75  0.86  1.2  1.76  2.98  4.03
SIP lifting  $$m=2$$  0.83  0.83  0.84  0.84  0.84  0.84  0.86  0.87  0.89  0.89
SIP lifting  $$m=3$$  0.86  0.86  0.87  0.87  0.87  0.87  0.88  0.89  0.91  0.99
SIP lifting  $$m=4$$  0.89  0.89  0.89  0.89  0.89  0.89  0.9  0.91  0.92  0.95
SIP lifting  $$m=5$$  0.9  0.9  0.9  0.9  0.91  0.91  0.91  0.92  0.93  0.95

Table 3 Experimental order of convergence: $${\rm EOC}_i(e_{\left[\!\left[ {} \right]\!\right], h_i})$$ in the case of $$\mathcal{P}_1(K)$$ as local space and $${\bf u}_D^*={\bf u}$$ on $$\partial {\it{\Omega}}$$, $$\mathbf{F}(\nabla\mathbf{u})\in {W}^{1,2}({\it{\Omega}})$$
Method  $$\frac{h_0}{2^m}$$  $$p=1.25$$  $$p=4/3$$  $$p=1.5$$  $$p=5/3$$  $$p=1.8$$  $$p=2.0$$  $$p=2.25$$  $$p=2.5$$  $$p=3.0$$  $$p=4.0$$
SIP shifted  $$m=1$$  1.05  1.21  1.46  1.65  1.77  1.81  1.98  2.14  2.48  2.91
SIP shifted  $$m=2$$  0.93  0.94  0.94  0.95  0.95  0.94  0.94  0.94  0.95  0.95
SIP shifted  $$m=3$$  0.93  0.93  0.94  0.94  0.95  0.94  0.94  0.95  0.95  0.96
SIP shifted  $$m=4$$  0.93  0.93  0.94  0.94  0.95  0.94  0.94  0.95  0.95  0.96
SIP shifted  $$m=5$$  0.93  0.94  0.94  0.95  0.95  0.95  0.95  0.95  0.96  0.96
LDG  $$m=1$$  0.9  1.06  1.42  1.82  2.16  2.7  3.43  4.21  5.31  6.79
LDG  $$m=2$$  0.92  0.91  0.89  0.87  0.87  0.86  0.87  0.89  1.05  0.53
LDG  $$m=3$$  0.91  0.91  0.9  0.89  0.88  0.88  0.89  0.9  0.97  0.5
LDG  $$m=4$$  0.92  0.91  0.91  0.9  0.9  0.9  0.9  0.91  0.93  0.69
LDG  $$m=5$$  0.92  0.92  0.92  0.91  0.91  0.91  0.92  0.93  0.95  0.99
SIP lifting  $$m=1$$  1.52  1.45  1.3  1.26  1.38  1.68  2.21  2.87  4.38  6.41
SIP lifting  $$m=2$$  1.39  1.29  1.08  0.95  0.9  0.87  0.86  0.85  0.89  0.88
SIP lifting  $$m=3$$  1.25  1.14  0.98  0.92  0.9  0.89  0.88  0.88  0.91  1.02
SIP lifting  $$m=4$$  1.12  1.04  0.95  0.92  0.91  0.9  0.9  0.9  0.92  0.97
SIP lifting  $$m=5$$  1.03  0.98  0.93  0.92  0.92  0.91  0.91  0.91  0.93  0.96

Finally, we compare the performance of the three methods. We compare the overall solver run time (CPU time) per grid refinement against the solver accuracy in terms of the error $$e_{{\bf F},h_i}$$. In Table 4, the error $$e_{{\bf F},h_i}$$, CPU time and Newton iterations are presented for $$p=1.25$$ and $$p=3$$. Additionally, we give a performance plot in Fig. A1, where the average CPU time per Newton step is plotted against the error $$e_{{\bf F},h_i}$$. We observe that the SIP-shifted method takes fewer Newton iterations and the average CPU time per Newton step is smaller compared with the other two methods. This is mainly due to the larger stencil and the computation of the lifting terms for the primal LDG and SIP-lifting methods.

Fig. A1. CPU run time per Newton step for SIP-shifted, LDG and SIP-lifting methods plotted against the error $$e_{{\bf F},h_i}$$ for $$p=1.25$$ and $$p=3$$.
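The quantity plotted in Fig. A1, the average CPU time per Newton step, can be recovered from the CPU times and step counts listed in Table 4. The Python sketch below is an illustration only; the function name `time_per_step` is an assumption, and the data are the $$p=1.25$$ entries of Table 4 in the units used there.

```python
def time_per_step(cpu_times, steps):
    """Average CPU time per Newton step on each refinement level."""
    return [c / s for c, s in zip(cpu_times, steps)]


# p = 1.25, refinement levels m = 0, ..., 5 (values from Table 4).
sip_shifted = time_per_step(
    [0.0155, 0.0861, 0.3384, 1.1192, 3.7492, 13.157],
    [11, 11, 10, 10, 10, 9],
)
ldg = time_per_step(
    [0.0319, 0.2024, 0.8948, 2.4986, 9.0149, 35.2269],
    [14, 14, 14, 14, 14, 14],
)
sip_lifting = time_per_step(
    [0.0466, 0.2754, 1.0768, 3.1657, 11.8002, 48.6188],
    [13, 13, 13, 13, 13, 14],
)

for m, row in enumerate(zip(sip_shifted, ldg, sip_lifting)):
    print(m, ["%.4f" % t for t in row])
```

For these data the SIP-shifted method has the smallest time per step on every level, followed by LDG and then SIP lifting, in line with the observation above.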
Table 4 $$e_{{\bf F},h_i}$$, CPU time and number of Newton steps for $$p=1.25$$ and $$p=3.0$$
$$p$$  $$\frac{h_0}{2^m}$$  SIP shifted: $$e_{{\bf F},h_i}$$  SIP shifted: CPU  SIP shifted: Steps  LDG: $$e_{{\bf F},h_i}$$  LDG: CPU  LDG: Steps  SIP lifting: $$e_{{\bf F},h_i}$$  SIP lifting: CPU  SIP lifting: Steps
$$p=1.25$$  $$m=0$$  0.013247  0.0155  11  0.012002  0.0319  14  0.012464  0.0466  13
$$p=1.25$$  $$m=1$$  0.007443  0.0861  11  0.007339  0.2024  14  0.00804  0.2754  13
$$p=1.25$$  $$m=2$$  0.004108  0.3384  10  0.004136  0.8948  14  0.00451  1.0768  13
$$p=1.25$$  $$m=3$$  0.00223  1.1192  10  0.002277  2.4986  14  0.002478  3.1657  13
$$p=1.25$$  $$m=4$$  0.001195  3.7492  10  0.001234  9.0149  14  0.001341  11.8002  13
$$p=1.25$$  $$m=5$$  0.000635  13.157  9  0.000661  35.2269  14  0.000717  48.6188  14
$$p=3.0$$  $$m=0$$  0.03651  0.0097  7  0.292522  0.0172  8  0.127643  0.0324  9
$$p=3.0$$  $$m=1$$  0.015514  0.0529  7  0.018718  0.1107  8  0.016229  0.1828  9
$$p=3.0$$  $$m=2$$  0.008427  0.2517  8  0.009638  0.5578  9  0.008735  0.7937  10
$$p=3.0$$  $$m=3$$  0.004522  0.8694  8  0.005039  1.7787  10  0.004647  2.6673  11
$$p=3.0$$  $$m=4$$  0.002403  3.2579  9  0.00265  7.2401  11  0.002454  10.9343  12
$$p=3.0$$  $$m=5$$  0.001268  13.825  10  0.00138  31.1862  12  0.001288  47.214  13

All numerical experiments are carried out using the Dune-Fem module (cf. Dedner et al., 2010), part of the Dune framework (Bastian et al., 2008a,b; Blatt et al., 2016). The Dune-ALUGrid module (Alkämper et al., 2016) is used as the underlying grid manager. The computations were performed on an Intel Core i7-4770S @ 3.10 GHz desktop machine with about 16 GB of memory.

Appendix In this appendix, we collect several useful results used throughout the article. Throughout the appendix, $$\psi$$ is an N-function such that $$\psi$$ and $$\psi^*$$ satisfy the $${\it {\Delta}}_2$$-condition. Note that the constants in the following subsections can depend on the chunkiness $$\omega_0$$ of $$\mathcal{T}_h$$. All results can be found in Diening & Růžička (2007) and Diening et al. (2014). Let $${{\it {\Pi}}_{{\rm DG}}} \colon L^1({\it {\Omega}}) \to X_h^k({\it {\Omega}})$$ denote the (local) $$L^2$$-projection onto $$X_h^k({\it {\Omega}})$$, i.e.,   \begin{align} \label{eq:PiDG} ({{\it {\Pi}}_{{\rm DG}}} {\bf G}, {\bf X}_h)=({\bf G}, {\bf X}_h)\qquad \forall\, {\bf X}_h \in X_h^k . 
\end{align} (A.1) The projection $${{\it {\Pi}}_{{\rm DG}}}$$ is stable:   \begin{align} \label{eq:PiDGLpsistablelocal} {\int\hspace{-0.8em}{-}}_K \psi(\lvert{{{\it {\Pi}}_{{\rm DG}}} {\bf G}}\rvert)\,{\rm{d}}x &\leq c\, {\int\hspace{-0.8em}{-}}_K \psi(\lvert{{\bf G}}\rvert)\,{\rm{d}}x \end{align} (A.2) and has the following local approximation property:   \begin{align} \label{eq:PiDGapproxpsi} {\int\hspace{-0.8em}{-}}_K \psi(h_K^j \lvert{\nabla^j_h({\bf G} - {{\it {\Pi}}_{{\rm DG}}} {\bf G})}\rvert)\,{\rm{d}}x &\leq c\, {\int\hspace{-0.8em}{-}}_K \psi(h_K^l \lvert{\nabla^l {\bf G}}\rvert)\,{\rm{d}}x \end{align} (A.3) for all $$K \in \mathcal{T}_h$$ and $${\bf G} \in W^{l,\psi}(K)$$ with $$0\leq j \leq l \leq k+1$$. In particular, this implies   \begin{align} \label{eq:PiDGapprox0} \rho_{\psi,{\it {\Omega}}}({\bf G} - {{\it {\Pi}}_{{\rm DG}}} {\bf G}) &\leq c\, \rho_{\psi,{\it {\Omega}}}({\bf G}), \end{align} (A.4)  \begin{align} \label{eq:PiDGapprox1} \rho_{\psi,{\it {\Omega}}}({\bf G} - {{\it {\Pi}}_{{\rm DG}}} {\bf G}) &\leq c\, \rho_{\psi,{\it {\Omega}}}(h\, {\nabla_h {\bf G}}), \end{align} (A.5)  \begin{align} \label{eq:PiDGapprox2} \rho_{\psi,{\it {\Omega}}}(\nabla_h {\bf G} - \nabla_h {{\it {\Pi}}_{{\rm DG}}} {\bf G}) &\leq c\, \rho_{\psi,{\it {\Omega}}}({\nabla_h {\bf G}}), \end{align} (A.6)  \begin{align} \label{eq:PiDGLpsistable} \rho_{\psi,{\it {\Omega}}}({{\it {\Pi}}_{{\rm DG}}} {\bf G}) &\leq c\, \rho_{\psi,{\it {\Omega}}}({\bf G}), \end{align} (A.7)  \begin{align} \label{eq:PiDGLW1psistable} \rho_{\psi,{\it {\Omega}}}(\nabla_h {{\it {\Pi}}_{{\rm DG}}} {\bf G}) &\leq c\, \rho_{\psi,{\it {\Omega}}}(\nabla_h {\bf G}), \end{align} (A.8) for all $${\bf G} \in {W_{{\rm DG}}^{1,\psi}}({\it {\Omega}})$$. For the treatment of the jumps, the following trace theorems are useful. Lemma A1 Let $$K \in \mathcal{T}_h$$ and $$\gamma$$ be a face of $$K$$. 
Then for all $${\bf g} \in W^{1,\psi}(K)$$,   \begin{align}\label{eq:emb} {\int\hspace{-0.8em}{-}}_\gamma \psi(\lvert{{\bf g}}\rvert)\,{\rm{d}}s &\leq c {\int\hspace{-0.8em}{-}}_K \psi(\lvert{{\bf g}}\rvert) \,{\rm{d}}x + c {\int\hspace{-0.8em}{-}}_K \psi(\lvert{h_\gamma \nabla {\bf g}}\rvert) \,{\rm{d}}x \end{align} (A.9) with a constant independent of $$h$$. Note that for all $${\mathfrak q}\in {\mathcal P}_k(K)$$, $$k \in {\mathbb{N}}_0$$, we have   \begin{align} {\int\hspace{-0.8em}{-}}_\gamma \psi(\lvert{{\mathfrak q}}\rvert)\, {\rm{d}}s \le c \, {\int\hspace{-0.8em}{-}}_K \psi(\lvert{{\mathfrak q}}\rvert)\, {\rm{d}}x , \label{eq:pol-trace} \end{align} (A.10) where $$\gamma$$ is a face of some $$K \in \mathcal{T}_h$$. Lemma A1 and (A.3) imply   \begin{align} \label{eq:PiDGapproxmlocal} \begin{aligned} h_\gamma \int_\gamma \psi(h_\gamma^{-1} \lvert{{\bf G} - {{\it {\Pi}}_{{\rm DG}}} {\bf G}}\rvert)\,{\rm{d}}s &\leq c \int_K \psi(\lvert{\nabla {\bf G}}\rvert)\,{\rm{d}}x , \end{aligned} \end{align} (A.11) which yields   \begin{align} \label{eq:PiDGapproxmglobal} m_{\psi,h}({\bf G} - {{\it {\Pi}}_{{\rm DG}}} {\bf G}) &\leq c\,\rho_{\psi,{\it {\Omega}}}(\nabla_h {\bf G}) \end{align} (A.12) for all $${\bf G} \in {W_{{\rm DG}}^{1,\psi}}({\it {\Omega}})$$. The operator $${\bf R}_h$$ defined in (2.18) is called a jump functional or lifting operator. One easily sees that $${\bf R}_h$$ is bounded, i.e.,   \begin{align} \label{eq:Rgammaest} \rho_{\psi,{\it {\Omega}}} ({\bf R}_h {\bf g}) &\leq c\, \sum_{\gamma \in {\it {\Gamma}}} h_\gamma \int_\gamma \psi\big( h_\gamma^{-1} \lvert{\left[\kern-0.15em\left[{{\bf g} \otimes {\bf n}}\right]\kern-0.15em\right]_\gamma}\rvert\big) \,{\rm{d}}s = c\, m_{\psi,h}({\bf g}) \end{align} (A.13) for all $${\bf g} \in {W}_{{\rm DG},{\it {\Gamma}}_{\rm D}}^{1,\psi}({\it {\Omega}})$$. 
Thus, the definition of the discrete gradient in (2.17) implies   \begin{align} \label{eq:nablaDGvsM} \rho_{\psi,{\it {\Omega}}} ({\nabla_{\rm DG}^h} {\bf g}) &\leq c\, \big(\rho_{\psi,{\it {\Omega}}}(\nabla_h {\bf g}) + m_{\psi,h}({\bf g}) \big) = c\, M_{\psi,h}({\bf g}) , \end{align} (A.14)  \begin{align} \label{eq:MvsDGNabla} M_{\psi,h}({\bf g}) &= \rho_{\psi,{\it {\Omega}}} (\nabla_h {\bf g}) + m_{\psi,h}({\bf g}) \leq c\, \big(\rho_{\psi,{\it {\Omega}}}({\nabla_{\rm DG}^h} {\bf g}) + m_{\psi,h}({\bf g}) \big) \end{align} (A.15) for all $${\bf g} \in {W_{{\rm DG},{\it {\Gamma}}_{\rm D}}^{1,\psi}}({\it {\Omega}})$$. The classical Scott–Zhang interpolation operator $${{\it {\Pi}}_{{\rm SZ}}} \colon W^{1,1}({\it {\Omega}}) \to V_h^k$$ was introduced in Scott & Zhang (1990), where the basic properties in Lebesgue and Sobolev spaces are also proved. These properties have been extended to Orlicz and Sobolev–Orlicz spaces in Diening & Růžička (2007). Using the formalism of N-functions, this yields in particular the following important result. Proposition A2 Let $$\psi $$ have $$(p,\delta)$$-structure and let $${\bf F}$$ be defined in (2.7). If $${\bf F}(\nabla g) \in W^{1,2}({\it {\Omega}})$$, and in the definition of $${{\it {\Pi}}_{{\rm SZ}}}$$, we have $$k \geq 1$$, then   \begin{align*} \lVert{{\bf F}(\nabla_h {{\it {\Pi}}_{{\rm SZ}}} g) - {\bf F}(\nabla g)}\rVert_2^2 &\le c\, h^2 \, \lVert{\nabla {\bf F}(\nabla g)}\rVert_{2}^2, \end{align*} with $$c$$ depending only on the characteristics of $$\psi$$ and the chunkiness $$\omega_0$$. In Diening et al. (2014), the Scott–Zhang operator has been extended to the DG setting. In fact, a linear, bounded projection $${{\it {\Pi}}_{{\rm SZ}}} \colon {W_{\rm DG}^{1,1}}({\it {\Omega}}') \to V_h^kc({\it {\Omega}}')$$, where $$ V_h^kc({\it {\Omega}}') := V_h^k({\it {\Omega}}') \cap W^{1,1}({\it {\Omega}}')$$, was defined. 
It maps the space $${W_{{\rm DG},{\it {\Gamma}}_{\rm D}}^{1,1}}({\it {\Omega}})$$ into $$V_h^kc({\it {\Omega}}) \cap {W_{{\it {\Gamma}}_{\rm D}}^{1,1}}({\it {\Omega}})$$, where $$ V_h^kc({\it {\Omega}}) := V_h^k({\it {\Omega}}) \cap W^{1,1}({\it {\Omega}})$$, and coincides for functions from $$W^{1,1}({\it {\Omega}})$$ with the classical Scott–Zhang operator. Moreover, it was shown in Diening et al. (2014) that if $$k\ge 1$$ then for all $$K \in \mathcal{T}_h$$, all faces $$\gamma$$ of $$K$$ and all $$g \in W^{1,\psi}({\it {\Omega}})$$ there holds   \begin{align} \label{eq:PiSZapproxmlocala} h_\gamma \int_\gamma \psi(h_\gamma^{-1} \lvert{g - {{\it {\Pi}}_{{\rm SZ}}} g}\rvert)\,{\rm{d}}s &\leq c \int_K \psi(h_\gamma^{-1} \lvert{g - {{\it {\Pi}}_{{\rm SZ}}} g}\rvert)\,{\rm{d}}x + c \int_K \psi(\lvert{\nabla_h (g - {{\it {\Pi}}_{{\rm SZ}}} g)}\rvert)\,{\rm{d}}x\notag \\ &\leq c \int_{S_K} \psi(\lvert{\nabla g}\rvert)\,{\rm{d}}x , \end{align} (A.16)  \begin{align} \label{eq:PiSZapproxpsi3} & m_{\psi,h}(g - {{\it {\Pi}}_{{\rm SZ}}} g)+\rho_{\psi,{\it {\Omega}}}\big(h_K^{-1} (g - {{\it {\Pi}}_{{\rm SZ}}} g) \big) + \rho_{\psi,{\it {\Omega}}}(\nabla_h {{\it {\Pi}}_{{\rm SZ}}} g) \leq c\,\rho_{\psi,{\it {\Omega}}}(\nabla g) . \end{align} (A.17) If additionally $$\psi $$ has $$(p,\delta)$$-structure, and $${\bf F}(\nabla g) \in W^{1,2}({\it {\Omega}})$$, then   \begin{align}\label{lem:err-uD} m_{\psi,h}(g -{{\it {\Pi}}_{{\rm SZ}}} g) & \le c\, h^{\min\{2,p\}}\big( \lVert{\nabla {\bf F}(\nabla g)}\rVert_2^2 + \rho_{\psi, {\it {\Omega}}}(\nabla g) \big) . \end{align} (A.18) We have the following Poincaré inequality in the DG setting. Lemma A3 For all $$g \in {W_{{\rm DG},{\it {\Gamma}}_{\rm D}}^{1,\psi}}({\it {\Omega}})$$,   \begin{align*} \rho_{\psi,{\it {\Omega}}}(g) &\leq c\, M_{\psi,h}\big(\text{diam}({\it {\Omega}})\, g\big), \end{align*} where $$c$$ depends only on $${\it {\Omega}}$$, $${\it {\Omega}}'$$, $${\it {\Delta}}_2(\psi)$$ and $${\it {\Delta}}_2(\psi^*)$$. 
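For illustration only, the modulars $$\rho_{\psi,{\it {\Omega}}}(\nabla_h g)$$, $$m_{\psi,h}(g)$$ and $$M_{\psi,h}(g)$$ from (A.13)–(A.15) can be evaluated directly in one space dimension. The following Python sketch is not taken from the article. It assumes the model N-function $$\psi(t)=\int_0^t (\delta+s)^{p-2}s\,{\rm d}s$$ (a potential matching (1.2)), treats both endpoints of the interval as Dirichlet faces, so that the boundary jump is the boundary value of $$g$$, and takes $$h_\gamma$$ as the average of the adjacent cell sizes. The names `psi` and `dg_modulars` are hypothetical.

```python
import numpy as np


def psi(t, p=1.5, delta=1e-3):
    """Assumed model N-function: psi(t) = int_0^t (delta + s)^(p-2) s ds, p > 1."""
    t = np.asarray(t, dtype=float)

    def antiderivative(u):
        # Antiderivative of u^(p-2) * (u - delta) with respect to u.
        return u**p / p - delta * u**(p - 1) / (p - 1)

    return antiderivative(delta + t) - antiderivative(delta)


def dg_modulars(nodes, left, right, p=1.5, delta=1e-3):
    """Return (rho(grad_h g), m(g), M(g)) for a piecewise linear DG function in 1D.

    nodes : mesh points x_0 < ... < x_N
    left  : value of g at the left end of each cell (length N)
    right : value of g at the right end of each cell (length N)
    """
    nodes = np.asarray(nodes, dtype=float)
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    h = np.diff(nodes)

    # Broken gradient is constant on each cell: integrate psi(|grad|) cellwise.
    rho_grad = np.sum(h * psi(np.abs((right - left) / h), p, delta))

    # Faces are points in 1D: two boundary faces plus the interior nodes.
    # Assumption: boundary jump = boundary value (Dirichlet faces).
    jumps = np.concatenate(([left[0]], left[1:] - right[:-1], [right[-1]]))
    # Assumption: h_gamma = mean of neighbouring cell sizes (cell size on the boundary).
    h_face = np.concatenate(([h[0]], 0.5 * (h[:-1] + h[1:]), [h[-1]]))

    # m(g) = sum over faces of h_gamma * psi(|jump| / h_gamma); the face integral
    # is a point evaluation in one dimension.
    m = np.sum(h_face * psi(np.abs(jumps) / h_face, p, delta))
    return rho_grad, m, rho_grad + m


if __name__ == "__main__":
    # Piecewise constant function: 0 on (0, 1/2), 1 on (1/2, 1).
    print(dg_modulars([0.0, 0.5, 1.0], [0.0, 1.0], [0.0, 1.0], p=2.0, delta=0.0))
```

For $$p=2$$ and $$\delta=0$$ the assumed $$\psi$$ reduces to $$t^2/2$$, so in this sketch $$m_{\psi,h}(g)$$ becomes the sum of squared jumps divided by $$2h_\gamma$$, i.e. the familiar quadratic interior penalty term.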
For our choices of $${\bf u}_D^*$$, we can control the jump terms. Lemma A4 Let $$ {\bf u}_D^*= {\bf u}$$ or $$ {\bf u}_D^*= {{\it {\Pi}}_{{\rm SZ}}}{\bf u}$$. Then we have   \begin{align*} m_{\varphi,h}({\bf u}_D^*-{\bf u}) \le c\, \rho_{\varphi,{\it {\Omega}}}(\nabla {\bf u}) . \end{align*} Lemma A5 For all $$K \in \mathcal{T}_h$$ and all functions $${\bf P} \colon {\it {\Omega}} \to {{\mathbb{R}}^{d \times n}}$$,   \begin{align*} {\int\hspace{-0.8em}{-}}_K \lvert{{\bf F}({\bf P}) - \langle{{\bf F}({\bf P})}\rangle_K}\rvert^2\,{\rm{d}}x &\sim {\int\hspace{-0.8em}{-}}_K \lvert{{\bf F}({\bf P}) - {\bf F}(\langle{{\bf P}}\rangle_K)}\rvert^2\,{\rm{d}}x . \end{align*}

References

Acerbi, E. & Fusco, N. (1994) Partial regularity under anisotropic ($$p,q$$) growth conditions. J. Differ. Equ., 107, 46–67.
Alkämper, M., Dedner, A., Klöfkorn, R. & Nolte, M. (2016) The DUNE-ALUGrid module. Arch. Numer. Softw., 4(1), 1–28.
Arnold, D. N., Brezzi, F., Cockburn, B. & Marini, L. D. (2002) Unified analysis of discontinuous Galerkin methods for elliptic problems. SIAM J. Numer. Anal., 39, 1749–1779.
Barrett, J. W. & Liu, W. B. (1994) Quasi-norm error bounds for the finite element approximation of a non-Newtonian flow. Numer. Math., 68, 437–456.
Bastian, P., Blatt, M., Dedner, A., Engwer, C., Klöfkorn, R., Kornhuber, R., Ohlberger, M. & Sander, O. (2008a) A generic grid interface for parallel and adaptive scientific computing. Part II: Implementation and tests in DUNE. Computing, 82, 121–138.
Bastian, P., Blatt, M., Dedner, A., Engwer, C., Klöfkorn, R., Ohlberger, M. & Sander, O. (2008b) A generic grid interface for parallel and adaptive scientific computing. Part I: Abstract framework. Computing, 82, 103–119.
Belenki, L., Diening, L. & Kreuzer, Ch. (2012) Optimality of an adaptive finite element method for the $$p$$-Laplacian equation. IMA J. Numer. Anal., 32, 484–510.
Blatt, M., Burchardt, A., Dedner, A., Engwer, C., Fahlke, J., Flemisch, B., Gersbacher, C., Gräser, C., Gruber, F., Grüninger, C., Kempf, D., Klöfkorn, R., Malkmus, T., Müthing, S., Nolte, M., Piatkowski, M. & Sander, O. (2016) The distributed and unified numerics environment, version 2.4. Arch. Numer. Softw., 4(100), 13–29.
Buffa, A. & Ortner, C. (2009) Compact embeddings of broken Sobolev spaces and applications. IMA J. Numer. Anal., 29, 827–855.
Burman, E. & Ern, A. (2008) Discontinuous Galerkin approximation with discrete variational principle for the nonlinear Laplacian. C. R. Math. Acad. Sci. Paris, 346, 1013–1016.
Bustinza, R. & Gatica, G. N. (2004) A local discontinuous Galerkin method for nonlinear diffusion problems with mixed boundary conditions. SIAM J. Sci. Comput., 26, 152–177.
Davis, T. A. (2004) Algorithm 832: UMFPACK V4.3 - an unsymmetric-pattern multifrontal method. ACM Trans. Math. Softw., 30, 196–199.
Dedner, A., Klöfkorn, R., Nolte, M. & Ohlberger, M. (2010) A generic interface for parallel and adaptive scientific computing: abstraction principles and the DUNE-FEM module. Computing, 90, 165–196.
Di Pietro, D. A. & Ern, A. (2010) Discrete functional analysis tools for discontinuous Galerkin methods with application to the incompressible Navier-Stokes equations. Math. Comput., 79, 1303–1330.
Di Pietro, D. A. & Ern, A. (2012) Mathematical aspects of discontinuous Galerkin methods. Mathématiques & Applications, vol. 69. Berlin: Springer.
Diening, L. & Ettwein, F. (2008) Fractional estimates for non-differentiable elliptic systems with general growth. Forum Math., 20, 523–556.
Diening, L. & Kreuzer, C. (2008) Linear convergence of an adaptive finite element method for the $$p$$-Laplacian equation. SIAM J. Numer. Anal., 46, 614–638.
Diening, L., Kröner, D., Růžička, M. & Toulopoulos, I. (2014) A local discontinuous Galerkin approximation for systems with $$p$$-structure. IMA J. Numer. Anal., 34, 1447–1488.
Diening, L. & Růžička, M. (2007) Interpolation operators in Orlicz–Sobolev spaces. Numer. Math., 107, 107–129.
Diening, L., Růžička, M. & Schumacher, K. (2010) A decomposition technique for John domains. Ann. Acad. Sci. Fenn. Ser. A. I. Math., 35, 87–114.
Ebmeyer, C. & Liu, W. B. (2005) Quasi-norm interpolation error estimates for finite element approximations of problems with $$p$$-structure. Numer. Math., 100, 233–258.
Giaquinta, M. & Modica, G. (1986) Remarks on the regularity of the minimizers of certain degenerate functionals. Manuscripta Math., 57, 55–99.
Giusti, E. (1994) Metodi Diretti nel Calcolo delle Variazioni. Bologna: Unione Matematica Italiana.
Houston, P., Robson, J. & Süli, E. (2005) Discontinuous Galerkin finite element approximation of quasilinear elliptic boundary value problems. I. The scalar case. IMA J. Numer. Anal., 25, 726–749.
Rao, M. M. & Ren, Z. D. (1991) Theory of Orlicz spaces. Pure and Applied Mathematics, Monographs and Textbooks, vol. 146. New York: Marcel Dekker.
Růžička, M. (2013) Analysis of Generalized Newtonian Fluids. Topics in Mathematical Fluid Mechanics. Lecture Notes in Mathematics (Beirao da Veiga H. & Flandoli F. eds), vol. 2073. Heidelberg: Springer, pp. 199–238.
Růžička, M. & Diening, L. (2007) Non-Newtonian Fluids and Function Spaces. Nonlinear Analysis, Function Spaces and Applications (Rákosnik J. ed.), vol. 8. Praha: Institute of Mathematics of the Academy of Sciences of the Czech Republic, pp. 95–144.
Scott, L. R. & Zhang, S. (1990) Finite element interpolation of nonsmooth functions satisfying boundary conditions. Math. Comput., 54, 483–493.

Footnotes

1 Note that the traces are evaluated with respect to $$K$$.
2 On a face $$\gamma \in {\it {\Gamma}}$$, the expression $$\nabla_h {{\it {\Pi}}_{{\rm SZ}}} {\bf u}$$ is to be understood as the evaluation on any adjacent element $$K \in \mathcal T_h$$, i.e., $$\text{tr} _\gamma^K \nabla_h {{\it {\Pi}}_{{\rm SZ}}} {\bf u}$$.
3 Note that functions and traces are evaluated with respect to $$K$$.

© The authors 2017. Published by Oxford University Press on behalf of the Institute of Mathematics and its Applications. All rights reserved.

Journal

IMA Journal of Numerical Analysis, Oxford University Press

Published: Sep 8, 2017
