# Total variation diminishing schemes in optimal control of scalar conservation laws

Total variation diminishing schemes in optimal control of scalar conservation laws Abstract Optimal control problems subject to a nonlinear scalar conservation law, the state system, are studied. Such problems are challenging at both the continuous level and the discrete level since the control-to-state operator poses difficulties such as nondifferentiability. Therefore, discretization of the control problem has to be designed with care to provide stability and convergence. Here, the discretize-then-optimize approach is pursued, and the state is removed by the solution of the underlying state system, thus providing a reduced control problem. An adjoint calculus is then applied for computing the reduced gradient in a gradient-related descent scheme for solving the optimization problem. The time discretization of the underlying state system relies on total variation diminishing Runge–Kutta (TVD-RK) schemes, which guarantee stability, best possible order and convergence of the discrete adjoint to its continuous counterpart. While interesting in its own right, it also influences the quality and accuracy of the numerical reduced gradient and, thus, the accuracy of the numerical solution. In view of these demands, it is proven that providing a state scheme that is a strongly stable TVD-RK method is enough to ensure stability of the discrete adjoint state. It is also shown that requiring strong stability for both the discrete state and adjoint is too strong, confining one to a first-order method, regardless of the number of stages employed in the TVD-RK scheme. Given such a discretization, we further study order conditions for the discrete adjoint such that the numerical approximation is of the best possible order. Also, convergence of the discrete adjoint state towards its continuous counterpart is studied. In particular, it is shown that such a convergence result hinges on a regularity assumption at final time for a tracking-type objective in the control problem. Finally, numerical experiments validate our theoretical findings. 1. Introduction We are considering an optimal control problem of type \begin{equation*} \min_{u \in U_\text{ad}} \int_{\mathbb{R}} G(y(x,T)) \, \textrm{d} x, \end{equation*} subject to a scalar conservation law, i.e., \begin{equation*} \begin{array}{rcll} y_t + f(y)_x &=& 0 & \text{in } \mathbb{R} \times \mathbb{R}_+, \\ y(x,0) &=& u(x) & \text{in } \mathbb{R}. \end{array} \end{equation*} Here $$U_\text{ad}$$ is called the admissible set, and it is assumed to be nonempty, convex and closed. The state $$y(x,t)$$ is considered to be the entropic (weak) solution of the scalar conservation law and $$u(x),$$ the control, is the initial data of the partial differential equation (PDE). Although the definition of the optimal control problem seems to be simple, the PDE constraint (conservation law) poses severe difficulties for the analysis of such problems, both at the continuous level and at the discrete level. The major problem is the possible formation of a shock in the state $$y(x,t)$$ at a finite time even for very smooth initial data $$u(x)$$, when the flux function $$f(\cdot)$$ is nonlinear. Moreover, it is easy to show through examples that the control-to-state map is not Gâteaux differentiable when shocks are present. This poses a significant problem for obtaining the derivative of the cost functional of a (control) reduced version of the underlying optimal control problem. Luckily, a generalized definition of the derivative, called the ‘shift derivative’, for the control-to-state map has been derived by Ulbrich (2001, 2002), which implies Fréchet differentiability of the cost functional. Such differentiability results enable us to compute the derivative of the cost functional using an adjoint approach. In Section 2, we state the underlying optimal control in a rigorous context by recalling weak and entropic solutions of scalar conservation laws and their properties as well as the concept of shift differentiability. The difficulties that arise from the nature of the PDE are also reflected at the discrete level, i.e., one should discretize the problem with care. Monotone schemes, which we recall in Section 3, are among successful discretizations for conservation laws and their theory is well understood. We use monotone discretizations in space to obtain a semidiscrete formulation and then we discretize in time using a total variation diminishing (TVD) Runge–Kutta (RK) scheme. TVD-RK methods are a class of RK methods that guarantee, under quite mild assumptions, that the discrete solution is total variation stable (also called ‘strong stability preserving (SSP)’ methods). We then obtain the fully discrete optimal control problem by discretizing the objective functional. Similar to the continuous level, one can obtain the derivative of the cost functional using the adjoint calculus. The properties of the discrete adjoint are intimately related to the discretization of the discrete state. The TVD-RK method for the discrete state can be characterized by two sets of positive coefficients $$\{ \alpha_{ij} \}$$ and $$\{ \beta_{ij} \}$$. We show that the corresponding discrete adjoint is also obtained by a TVD-RK method, where the coefficients are ‘conjugates’ of $$\{ \alpha_{ij} \}$$ and $$\{ \beta_{ij} \}$$. Therefore, we will study in Section 4 the stability and the approximation of the discrete adjoint. In particular, we discover the following properties: Proposition 4.4: Imposing strong stability on both discrete state and discrete adjoint is too strong and it results in a first-order time discretization. Theorem 4.5: Imposing strong stability on the discrete state is enough to give stability of the discrete adjoint. Theorem 4.6 and Theorem 4.7: Any two-stage second-order TVD-RK method for the discrete state results in a second-order adjoint approximation. Any three-stage third-order TVD-RK method for the discrete state results in a second-order adjoint approximation. Hager (2000) showed that for certain third-order RK methods, the resulting discrete adjoint is only second order. Theorem 4.7 shows that this is the case for the class of TVD-RK methods. The study of the differentiability of the control-to-state map when shocks are present was started by Bressan & Guerra (1997). The authors show that this map, in general, is not differentiable. A similar problem formulated in terms of a minimization task was studied by James & Sepúlveda (1999), where the derivative of the objective functional was obtained when the state is smooth and does not contain shocks. Later Ulbrich (2001, 2002) analysed the shift differentiability of the so-called control-to-state operator and showed Fréchet differentiability of the reduced objective functional in the presence of shocks. Moreover, an adjoint procedure to compute the mentioned derivative was introduced and analysed. For the nonlinear conservation law (2.1), it has been observed that not all RK methods can ensure TVD properties of the approximation, i.e., oscillations occur near discontinuities (see, e.g., the example in Gottlieb & Shu, 1998, Section 2). Shu & Osher (1988) constructed a class of RK methods that ensures the approximation is TVD, the so-called TVD-RK methods. The main idea is to use convex combinations of the forward Euler method to construct a high-order approximation. If the Euler step is stable in some (semi)norm, then under some mild conditions the convex combination of the Euler steps is stable too. Order conditions are also derived by Shu & Osher (1988) for second and third orders with two and three stages, respectively. We should also remark that the derivation of such conditions remains formal as often the solution of the hyperbolic problem does not possess the required regularity. It is proven by Gottlieb & Shu (1998) that a fourth-order method with four stages does not exist. However, a fourth-order five-stage method was discovered by Spiteri & Ruuth (2002). Ruuth & Spiteri (2002) also showed that methods beyond fourth order of any number of stages do not exist. For the numerical treatment of such optimization problems and for a particular objective functional, Giles (2003) showed that the classical Lax–Friedrichs scheme leads to a discrete adjoint that converges to a wrong solution. Later, Giles & Ulbrich (2010a,b) and Ulbrich (2001) showed under a restrictive time step that the classical Lax–Friedrichs scheme yields a convergent discrete adjoint. Higher-order discretizations based on relaxation of the conservation law were also introduced and studied by Banda & Herty (2012). We finish this brief review of the literature by recalling an alternating descent method introduced by Castro et al. (2008) as a solution technique for such optimization problems. 2. Optimal control of scalar conservation laws An abstract optimal control problem can be formulated as $$\min\quad J(y) \quad \text{subject to} \quad S(u) = y, \quad u \in U_\text{ad},$$ (P) where $$y$$ is called the state variable and $$u$$ is the control variable, with the latter belonging to an admissible set $$U_\text{ad}$$. The objective functional is denoted by $$J(\cdot)$$, and it is assumed to be differentiable. The control and state variables are related through a control-to-state map $$S(\cdot)$$, which can be regarded as a solution operator of the underlying PDE. Obviously, the control-to-state $$S(\cdot)$$ influences existence and uniqueness of the optimal control problem as well as optimality conditions that characterize the optimal solution. In this work, we consider $$S(\cdot)$$ to be the solution operator of the following one-dimensional scalar conservation law (Cauchy problem): $$\begin{array}{rcll} y_t + f(y)_x &=& 0 & \text{in } \mathbb{R} \times \mathbb{R}_+, \\ y(x,0) &=& u(x) & \text{in } \mathbb{R}, \end{array}$$ (2.1) where $$u \in \textrm{L}^\infty(\mathbb{R})$$ is the initial data with compact support in a bounded interval $$K \subset \mathbb{R}$$ and $$y(x,t)$$ is the conserved variable. This motivates us to define the admissible set $$U_{\text{ad}}$$ by $$U_\text{ad} := \left\{ u \in \textrm{L}^\infty(\mathbb{R}) : \text{supp}(u) {\subset} K, \Vert {{u}} \Vert_{\textrm{BV}(\mathbb{R})} \leq C \right\}\!,$$ (2.2) where $$C>0$$ is a constant, $$\text{supp}(\cdot)$$ denotes the support of a function, i.e., $$\text{supp}(u) = \{ x \in \mathbb{R} : u(x) \not = 0 \}$$, and $$\textrm{L}^\infty(\mathbb{R})$$ is the Lebesgue space of essentially bounded functions on $$\mathbb{R}$$ with norm $$\Vert {{\cdot}} \Vert_{\textrm{L}^\infty(\mathbb{R})}$$. In this article, we assume that the so-called flux function $$f(\cdot)$$ is in $$\textrm{C}^2(\mathbb{R})$$ and satisfies $$f'' > 0$$. A particular example is $$f(y) = \frac{1}{2} y^2$$ which gives rise to the inviscid Burgers equation. Concerning the objective functional we study the so-called tracking type functional, i.e., $$J(y) := \int_{\mathbb{R}} G(y(x,T)) \, \textrm{d} x,$$ (2.3) where $$T>0$$ is a final time. A common example is $$G(y(x,T)) := |y(x,T)-y_\text{obs}(x)|^2$$ with $$y_\text{obs} \in \textrm{L}^2(\mathbb{R})$$ given. Classical solutions of (2.1) can be constructed by the method of characteristics. However, due to the possible nonlinearity of the flux function $$f$$, classical solutions break down in finite time even for very smooth initial data. Therefore, we consider generalized (weak) solutions of (2.1) in integral form: the function $$y \in \textrm{L}^\infty(\mathbb{R} \times \mathbb{R}_+)$$ is called a weak solution of (2.1) if it satisfies $$\int_{\mathbb{R} \times \mathbb{R}_+} y \, \varphi_t + f(y) \, \varphi_x \, \textrm{d} x \,\textrm{d} t + \int_{\mathbb{R}} u(x) \varphi(x,0) \, \textrm{d} x = 0 \quad \forall\,\varphi \in \textrm{C}^\infty_c(\mathbb{R} \times \mathbb{R}_+),$$ (2.4) where $$\textrm{C}^{\infty}_c(\mathbb{R} \times \mathbb{R}_+)$$ represents the space of infinitely differentiable functions with compact supports in $$\mathbb{R} \times \mathbb{R}_+$$, with $$\mathbb{R}_+ = [0,+\infty)$$. There might be more than one weak solution for given initial data. For instance, consider the following example: let $$f(y)=\frac{1}{2} y^2$$, and $$u(x) = 0$$ for $$x<0$$ and $$u(x) = 1$$ for $$x\geq0,$$ which yields discontinuous initial data. Then the following two weak solutions are known: \begin{equation*} y_1(x,t) = \begin{cases} 0, & x < \frac{1}{2}t, \\ 1, & x\geq \frac{1}{2}t, \end{cases} \quad \text{and} \quad y_2(x,t) = \begin{cases} 0, & x<0, \\ x/t, & 0 \leq x < t, \\ 1, & t \leq x. \end{cases} \end{equation*} The lack of uniqueness can be overcome by picking the physically relevant or entropy solution. Therefore, we impose an extra constraint on the weak solution: a function $$y \in \textrm{L}^\infty(\mathbb{R} \times \mathbb{R}_+)$$ is called an entropy solution (in the sense of Kružkov, 1970) if it satisfies $$\int_{\mathbb{R} \times \mathbb{R}_+} \eta(y) \, \varphi_t + q(y) \, \varphi_x \, \textrm{d} x \, \textrm{d} t + \int_{\mathbb{R}} \eta(u(x)) \varphi(x,0) \, \textrm{d} x \geq 0 \quad \forall\, \varphi \in \textrm{C}^\infty_c(\mathbb{R} \times \mathbb{R}_+), \, \varphi \geq 0,$$ (2.5) where $$\eta(y) := |y-k|$$ and $$q(y) = {\text{sign}}(y-k) (f(y)-f(k))$$ for all $$k\in \mathbb{R}$$. One can show that if $$y$$ satisfies (2.5) then it is also a weak solution in the sense of (2.4) (see, e.g., Chechkin & Goritsky, 2009, (Section 5.5). Kružkov then shows that such an entropy solution is indeed unique. For the following result, we introduce $$\textrm{BV}(\mathbb{R})$$, the space of the functions of bounded variations on $$\mathbb{R}$$, i.e., $$u \in \textrm{BV}(\mathbb{R})$$ iff $$u \in \textrm{L}^1(\mathbb{R})$$ and $$\textrm{TV}(u) := \sup \{ \int_{\mathbb{R}} u \, p' \, \textrm{d} x : p \in \textrm{C}^1_c(\mathbb{R}), |p| \leq 1 \text{ a.e. in } \mathbb{R} \}$$. Here ‘a.e.’ stands for ‘almost everywhere’. Endowed with $$\Vert {{u}} \Vert_{\textrm{BV}(\mathbb{R})} = \Vert {{u}} \Vert_{\textrm{L}^1(\mathbb{R})} + \textrm{TV}(u)$$, it is a Banach space. The following theorem is due to Kružkov (see Kružkov, 1970). We refer the reader to Evans (2010, Section 11.4.3) for a proof. Theorem 2.1 Let $$u \in \textrm{L}^\infty(\mathbb{R})$$ in (2.5); then there exists a unique entropy solution $$y$$ of (2.5) that satisfies $$\Vert {{y(\cdot,t)}} \Vert_{\textrm{L}^\infty(\mathbb{R})} \leq \Vert {{u}} \Vert_{\textrm{L}^\infty(\mathbb{R})}$$ for all $$t > 0$$. Moreover, let, $$y_1$$ and $$y_2$$ be two entropy solutions corresponding to initial data $$u_1, u_2 \in \textrm{L}^\infty(\mathbb{R}) \cap \textrm{L}^1(\mathbb{R})$$, respectively. Then we have \begin{equation*} \Vert {{y_1(\cdot,t) - y_2(\cdot,t)}} \Vert_{\textrm{L}^1(\mathbb{R})} \leq \Vert {{u_1 - u_2}} \Vert_{\textrm{L}^1(\mathbb{R})} \quad \forall\, t>0. \end{equation*} Finally, if $$u \in \textrm{BV}(\mathbb{R})$$ then $$\textrm{TV}(y(\cdot,t)) \leq \textrm{TV}(u)$$ for all $$t>0$$. The notion of entropy solutions enables us to state our optimal control problem in a meaningful way. Therefore, our optimal control problem (P) is completed by setting the solution operator $$S(\cdot)$$ to be the entropy solution of the conservation law at time $$T$$. We will see in Section 2.1 that $$S(\cdot)$$ is not Gâteaux differentiable when discontinuity (shocks) appear in the state $$y$$. Therefore, a different approach in defining a derivative of the control-to-state map, called a shift derivative, is derived that leads to Fréchet differentiability of the objective functional. Using this setting, existence of a minimizer can be shown for the underlying problem (see Appendix A). We also refer to Castro et al. (2008) and Ulbrich (1999) for a more general case. Uniqueness is, however, not guaranteed since we can construct control examples that lead to the same state that minimizes the objective functional: indeed, consider the following controls: \begin{equation*} u_1 = \begin{cases} 1, & -2 \leq x < 0, \\[2pt] -1, & 0 \leq x \leq 2, \\[2pt] 0 & \text{otherwise}, \end{cases} \quad u_2 = \begin{cases} 1, & -2 \leq x < -1, \\[2pt] -x, & -1 \leq x \leq 1, \\[2pt] -1, & 1 \leq x \leq 2, \\[2pt] 0 & \text{otherwise}, \end{cases} \end{equation*} for the Burgers equation, i.e., $$f(y)=\frac{1}{2}y^2$$. We fix the final time $$T=2$$. One can construct the corresponding entropy solution for each initial data. A direct calculation by the method of characteristics shows that at time $$t\geq T$$ both initial data give \begin{equation*} y(x,t) = \begin{cases} \frac{1}{t}(x + 2), & -2 \leq x \leq 0, \\[2pt] \frac{1}{t}(-x+2), & 0 < x \leq 2, \\[2pt] 0 & \text{otherwise}. \end{cases} \end{equation*} Setting $$y_\text{obs} := y(x,T)$$, we then have $$J(S(u_1)) = J(S(u_2)) = 0$$, i.e., two optimal controls for (P). Remark 2.2 (Flux identification problem) A different, yet similar optimal control problem of conservation laws can be formulated in which the control variable is the flux function and the initial data are fixed. More precisely , we look for a flux function $$f \in \mathcal{F}_\text{ad}$$, where $$\mathcal{F}_\text{ad}$$ is the admissible set, minimizing a given objective functional. For instance, $$f$$ may have a closed analytical form, perhaps transforming the optimal control problem into $$\mathbb{R}^n$$. In this case, the solution operator is defined as $$y = S(f)$$. The existence of the minimum can be proved by a continuity result due to Lucier (1986), \begin{equation*} \Vert {{ S(\,f)(\cdot,t) - S(g)(\cdot,t) }} \Vert_{\textrm{L}^1(\mathbb{R})} \leq t \Vert {{ f- g }} \Vert_\text{Lip} \, \Vert {{ u }} \Vert_{\textrm{BV}}, \end{equation*} and an assumption on the compactness of $$\mathcal{F}_\text{ad}$$ (see James & Sepúlveda, 1999, Section 2.2, for details). Uniqueness is again not guaranteed in general (see the discussion in James & Sepúlveda, 1999). 2.1 Shift differentiability and adjoint calculus As we have seen, the control-to-state operator is a delicate object and needs special care for the forward problem (2.1) to be well posed. We have also seen that although the optimal control problem (P) admits a minimizer, uniqueness cannot be expected in general. We now turn our attention to characterizing such a minimum. For the numerical treatment, first-order characterizations of such minimizers are of importance. This is our next subject of study. A first step in deriving optimal conditions for the problem is investigating differentiability of the objective functional. Note that the objective functional defined by (2.3) is well behaved, in the sense that $$G$$ is sufficiently smooth and allows existence of a solution and only the solution operator $$S(\cdot)$$ may cause problems. Thus, using the chain rule, we must investigate differentiability of $$S(\cdot)$$. It turns out that in the presence of shocks, $$S(\cdot)$$ is not Gâteaux differentiable as we illustrate in the following example borrowed from Bressan & Guerra (1997). Example 2.3 Suppose the initial data are $$u_\epsilon = (1+\epsilon) x \cdot \mathbf{1}_{[0,1]}(x)$$, where $$\epsilon \in \mathbb{R}$$ and $$\mathbf{1}_{[0,1]}(x)$$ denotes the indicator function of the interval $$[0,1]$$. Consider the Burgers equation. We can construct the entropy solution by the method of characteristics: \begin{equation*} y_\epsilon(x,t) = \frac{(1+\epsilon)x}{1+(1+\epsilon)t} \cdot \mathbf{1}_{[0,\sqrt{1+(1+\epsilon)t}]}(x). \end{equation*} Note that the shock position now depends on the perturbation, i.e., $$x_s(t) = \sqrt{1+(1+\epsilon)t}$$, and at time $$t=0$$ the derivative exists \begin{equation*} v(x) = \lim_{\epsilon \rightarrow 0} \epsilon^{-1} ( u_\epsilon - u_0 ) = x \cdot \mathbf{1}_{[0,1]}(x). \end{equation*} However, at time $$t>0,$$ there is no such derivative $$v(x) \in \textrm{L}^1(\mathbb{R})$$. In fact $$\epsilon^{-1}(y_\epsilon(\cdot,t) - y_0(\cdot,t) )$$ converges as $$\epsilon \rightarrow 0$$ in the sense of distributions to $$\frac{x}{(1+t)^2}\cdot \mathbf{1}_{[0,\sqrt{1+t}]} + \frac{t}{2(t+1)} \delta_{\sqrt{1+t}},$$ (2.6) where $$\delta_{x}$$ is the Dirac delta function located at $$x$$. Note that in the previous example, the distributional derivative has a continuous part and a singular part due to a shift in the shock location. Moreover, the perturbed shock location, $$x_s(t) = \sqrt{1+(1+\epsilon)t}$$, is differentiable with respect to the perturbation and the solution, $$y_\epsilon(x,t)$$, vary differentiably in the left and right of the shock. This motivates a first-order approximation of (2.6) introduced in Ulbrich (2001, 2002). Let $$u \in U \subset \mathcal{U}$$ where $$U$$ is an open set and $$\mathcal{U}$$ a Banach space. Moreover, $$u$$ is such that $$y_u := S(u)$$ has bounded variation and its support is in $$K \subset \mathbb{R}$$. For the sake of presentation, suppose, for instance, that $$u$$ has compact support, contains a shock at $$x_s(0)$$ and is a $$\textrm{C}^1$$ function on either side of the shock. Suppose the shock remains in the solution up to time $$T$$ and its position is given by $$x_s(t)$$. We perturb $$u$$ by $$\delta u \in U$$ such that $$y_{u+\delta u} := S(u+\delta u) \in \textrm{L}^\infty(\mathbb{R}) \cap \textrm{BV}(\mathbb{R})$$. Then we can define a first-order approximation of $$y_{u+\delta u} - y_u$$ by the shift variation \begin{equation*} \delta S (\delta y, s) := \delta y + {\text{sign}}(s) [\![{y(x_s(T),T)}]\!] \mathbf{1}_{[x_s(T), x_s(T)+s]} \in \textrm{L}^1(K), \end{equation*} where $$[\![{y(x_s(T),T)}]\!] = y(x_s^{-}(T),T) - y(x_s^{+}(T),T)$$ and $$(\delta y, s) \in \textrm{L}^1(K) \times \mathbb{R}$$ depends linearly on $$\delta u$$. Here $$\delta y$$ corresponds to the variation of the solution in the continuous part and $$s$$ is a linear approximation of the shock shift (e.g., in Example 2.3, $$x_s(t)$$ is differentiable with respect to $$\epsilon$$). More precisely we suppose that there exists a bounded linear operator \begin{equation*} T(u) \in \mathcal{L}(\mathcal{U}, \textrm{L}^r(K) \times \mathbb{R}), \quad r \in (1,\infty], \end{equation*} such that $$(\delta y, s) := T(u)\cdot \delta u$$. We say $$y_u$$ is shift differentiable at $$u$$ if \begin{equation*} \lim_{\delta u \rightarrow 0} \Vert {{\delta u}} \Vert_{\mathcal{U}}^{-1} \cdot \Vert {{ y_{u+\delta u} - y_u - \delta S(T(u)\cdot \delta u) }} \Vert_{\textrm{L}^1(K)}^{} = 0. \end{equation*} It is proven that shift differentiability implies Fréchet differentiability of the objective functional (see, e.g., Ulbrich, 2001, Section 3.2.2). Moreover, it is shown in Ulbrich (2001, Theorem 3.3.2) that, under some technical assumptions, entropy solutions are continuously shift differentiable and therefore the objective functional is Fréchet differentiable. Although the shift differentiability result is a useful tool in proving differentiability of the objective functional using sensitivities, it is often more convenient to obtain the objective functional’s derivative using an adjoint calculus. For the conservation laws with smooth solution, James and Sepùlveda derived such an adjoint-based derivative; see James & Sepúlveda (1999, Section 2.4] and the references therein. Later Ulbrich generalized the adjoint calculus to the case where the solution contains shocks. Here, we only state the result and refer the interested reader to Ulbrich (2001, 2002) for details. For a formal derivation, see Giles & Ulbrich (2010a). Suppose the perturbation $$\delta u$$ has the same structure as $$u$$, i.e., it contains a shock discontinuity at $$x_s(0)$$ and is piecewise $$\textrm{C}^1$$ on either side of the shock. Then the derivative of $$\mathcal{J}(u) := J(S(u))$$ in the direction of the $$\delta u$$ (perturbation in the initial data) is given by $$\mathcal{J}'(u) \delta u = \int_{\mathbb{R}} p(x,0) \, \delta u(x) \, \textrm{d} x,$$ (2.7) where $$p(x,t)$$ is the adjoint variable that satisfies the following backward equation with final-time condition: $$\begin{array}{rcll} p_t + f'(y) p_x &=& 0 & \text{in } \mathbb{R} \times (0,T), \\ p(x,T) &=& G'(y(x,T)) & \text{in } \mathbb{R}. \end{array}$$ (2.8) In the case $$y(x,t)$$ contains a shock which travels along $$x_s(t)$$, we need to impose an interior boundary condition for the adjoint along $$x_s(t)$$: $$p( x_s(t), t) = \frac{[\![{G(y)}]\!] }{ [\![{y}]\!] } \Big|_{(x_s(T),T)} \quad \forall\, t \in [0,T].$$ (2.9) The adjoint equation (2.8) is backward in time and is a nonconservative hyperbolic PDE with final datum and has been studied by many authors (see, e.g., James & Sepúlveda, 1999; Ulbrich, 2003). Let us first consider the case when $$p(x,T) \in \textrm{Lip}_{{\rm loc}}(\mathbb{R})$$, e.g., shocks at the final time are smoothed in the objective functional. A one-sided Lipschitz-continuity condition (OSLC) on $$f'(y)$$, i.e., \begin{equation*} \text{ess} \sup_{x\not=z} \left( \frac{f'(y(x,t)) - f'(y(z,t))}{x-z} \right)^+ \leq m(t), \end{equation*} where $$m(t) \in \textrm{L}^1(0,T)$$, guarantees that the generalized backward characteristics starting from time $$T$$ do not intersect. Then the existence of at least one Lipschitz continuous solution can be guaranteed. However, uniqueness is not ensured (see Conway, 1967). Uniqueness can be proved for the so-called reversible solutions which we briefly recall. Let us define the set $$\mathcal{E}$$ as the set of exceptional solutions, i.e., Lipschitz continuous solutions of (2.8) where $$p(x,T)=0$$. Then the support of the exceptional solutions is defined as \begin{equation*} \mathcal{V}_e := \left\{ (x,t) \in \mathbb{R} \times (0,T) : \exists\, p \in \mathcal{E}, p(x,t) \not = 0 \right\}\!. \end{equation*} A reversible solution is then a Lipschitz continuous solution of (2.8) that is locally constant in $$\mathcal{V}_e$$ (see Bouchut & James, 1998, Definition 4.1.4). We should mention that reversible solutions can also be defined for discontinuous Borel functions as end data $$p(x,T)$$ that are pointwise everywhere the limit of a bounded sequence $$(p_n^T)\in \textrm{C}^{0,1}(\mathbb{R})$$, i.e., bounded in $$\textrm{C}(\mathbb{R})\cap W^{1,1}_{{\rm loc}}(\mathbb{R})$$ (see Ulbrich, 2003). In this case, reversible solutions can be defined as broad solutions along the generalized backward characteristics, which automatically ensures the internal boundary condition (2.9), if the end data (2.9) are used at $$(x_s(T),T)$$. The following theorem states properties of the reversible solution and is proved in Ulbrich (2003, Theorem 14) for the general case of hyperbolic balance laws. For simplicity of the presentation, we adapt the result in Ulbrich (2003, Theorem 14) to our setting (see also Bouchut & James, 1998). For the case of discontinuous end data, we refer the reader to Ulbrich (2003, Corollary 15). Theorem 2.4 Let $$f'(y(x,t)) \in \textrm{L}^\infty(\mathbb{R} \times (0,T))$$ and satisfy OSLC. Then the following holds: for all $$p(x,T) \in \textrm{C}^{0,1}(\mathbb{R})$$ there exists a unique reversible solution $$p$$ of (2.8). Moreover, $$p \in \textrm{C}^{0,1}(\mathbb{R} \times [0,T])$$ and solves (2.8) a.e. on $$\mathbb{R} \times (0,T)$$. Finally, for all $$t \in [0,T]$$, $$z_1 < z_2$$ and $$0 \leq s < \hat{s} \leq T$$ with \begin{align*} I &:= [z_1,z_2], \quad I_s^{\hat{s}} := [s,\hat{s}] \times I, \\ J &:= \left[z_1 - (T-t) \Vert {{\,f'(y)}} \Vert_{\textrm{L}^\infty(\mathbb{R} \times (0,T))}, z_2 + (T-t) \Vert {{\,f'(y)}} \Vert_{\textrm{L}^\infty(\mathbb{R} \times (0,T))} \right]\!, \\ J_t^T &:= [t,T] \times J, \end{align*} the following estimates hold: \begin{align*} \Vert {{p(\cdot,t)}} \Vert_{B(I)} &\leq \Vert {{p(\cdot,t)}} \Vert_{B(J)}, \\ \Vert {{p_x(\cdot,t)}} \Vert_{\textrm{L}^1(I)} &\leq \Vert {{p_x(\cdot,t)}} \Vert_{\textrm{L}^1(J)}, \\ \Vert {{p_t}} \Vert_{\textrm{L}^1(I_s^{\hat{s}})} &\leq (\hat{s}-s) \Vert {{\,f'(y)}} \Vert_{\textrm{L}^\infty(I_s^{\hat{s}})} \Vert {{p_x}} \Vert_{\textrm{L}^\infty(s,\hat{s};\textrm{L}^1(I))}, \end{align*} where $$B(I)$$ is the Banach space of the bounded functions equipped with the $$\sup$$-norm. We now demonstrate the use of the adjoint calculus by the following example. Example 2.5 Consider the Burgers equation, i.e., $$f(y) := \frac{1}{2} y^2$$, on the domain $${\it{\Omega}} = (-1,1)$$, and the initial data is set to be \begin{equation*} u(x) := \begin{cases} 1, & -1\leq x<0, \\ -1, & 0 < x \leq 1, \end{cases} \end{equation*} with a shock discontinuity at $$x=0$$. For the boundary conditions we set $$y(-1,t)=1$$ and $$y(1,t)=-1$$ for $$t>0$$. It is easy to see that the entropic solution equals the initial data, i.e., $$y(x,t) = u(x)$$ for all $$t>0$$. Hence, the shock position remains at $$x=0$$, i.e., $$x_s(t)=0$$ for $$t>0$$. We now compute the adjoint state using (2.8) and (2.9). For this purpose, let $$G(y) := \frac{1}{2} |y|^2$$. Then we have $$[\![{G(y)}]\!]|_{t=T}=0$$ and $$G'(y) = y$$. Then for the final-time condition of the adjoint equation, we obtain \begin{equation*} p(x,T):= \begin{cases} 1, & -1 \leq x < 0, \\ -1, & 0 < x \leq 1, \end{cases} \quad \text{and} \quad p(0,T) = 0, \end{equation*} and for the ‘interior’ boundary condition we have $$p(0,t) = 0$$ for $$t \in [0,T]$$. We fix $$T=\frac{1}{2}$$. By the method of characteristics we can construct $$p(x,t)$$ for $$t<T$$. More precisely, along all straight lines $$x(t)$$ with the derivative $$\dot{x}(t) = y(x(t),t)$$, the adjoint state is constant. This implies that, at $$t=0$$, we have \begin{equation*} p(x,0) = \begin{cases} 1, & -1 < x < -\frac{1}{2}, \\[2pt] 0, & -\frac{1}{2} \leq x \leq \frac{1}{2}, \\[2pt] -1, & -\frac{1}{2} < x < 1. \end{cases} \end{equation*} See also Fig. 1. Fig. 1. View largeDownload slide Construction of the adjoint using the method of characteristics in the space-time domain. Note that the adjoint has a constant value in the grey area due to the discontinuity of the state at time $$T$$. Fig. 1. View largeDownload slide Construction of the adjoint using the method of characteristics in the space-time domain. Note that the adjoint has a constant value in the grey area due to the discontinuity of the state at time $$T$$. 3. Numerical methods In this section, we describe how we discretize the optimal control problem (P). In this article, we follow the discretize-then-differentiate approach, i.e., we first fully discretize the optimization problem and then derive the optimality conditions for the resulting finite-dimensional optimization problem. In particular, we should consider in detail the discretization of the conservation law (2.1). It is important that one makes sure that the discretization of the forward problem converges to the unique entropy solution, as well as the inherited adjoint discretization to the continuous adjoint state. We refer the reader to LeVeque (1990) for an introduction to numerical methods for such PDEs. Since we are interested in studying how time discretization, using TVD-RK methods, influences the quality of the overall scheme, we first discretize the conservation law in space and then in time by a TVD-RK method. 3.1 Spatial discretization Let us first partition the domain, say $$\mathbb{R}$$ with nonoverlapping intervals, $$I_j := (x_{j-1/2},x_{j+1/2}]$$, where $$x_{j-1/2} < x_{j+1/2}$$ for all $$j \in \mathbb{Z}$$. The so-called mesh size is denoted by $$h_j := x_{j+1/2}-x_{j-1/2}$$. We denote the semidiscrete approximation at time $$t$$ by \begin{equation*} {\boldsymbol {y}}(t) := (..., y_{j-1}(t),y_{j}(t),y_{j+1}(t), ... ) \in \ell^\infty(\mathbb{Z}). \end{equation*} More precisely, $$y_j(t) \in \mathbb{R}$$ is an approximation to the cell average of the true solution, i.e., \begin{equation*} y_j(t) \approx \frac{1}{h_j} \int_{x_{j-1/2}}^{x_{j+1/2}} y(x,t) \, \textrm{d} x. \end{equation*} We then discretize in space using a conservative scheme, by choosing a numerical flux function$$\hat{f}(y_j(t),y_{j+1}(t))$$ that approximates $$f( y(x_{j+1/2},t))$$. Then the semidiscrete numerical scheme is obtained by solving the following ordinary differential equation (ODE) in time: $$\frac{\textrm{d}}{\textrm{d} t} y_j(t) + \frac{1}{h_j} \left( \hat{f}(y_j(t),y_{j+1}(t)) - \hat{f}(y_{j-1}(t),y_{j}(t)) \right) = 0 \quad \forall\, j \in \mathbb{Z}.$$ (3.1) The simplest time discretization for (3.1) is given by the forward Euler scheme. For this purpose, we partition the time direction into time slabs $$t_n$$ for $$n \in \mathbb{Z}_+$$, where $$t_{n} < t_{n+1}$$. For simplicity we assume a uniform time step, i.e., $$t_{n+1}-t_n = k$$ for all $$n \in \mathbb{Z}_+$$ and similarly a uniform mesh size, i.e., $$h_j = h$$ for all $$j \in \mathbb{Z}$$. Then the fully discrete system using the forward Euler discretization reads $$y_j^{n+1} = y_{j}^n - \frac{k}{h} \left( \hat{f}(y_j^n,y_{j+1}^n) - \hat{f}(y_{j-1}^n,y_{j}^n) \right) \quad \forall\, j \in \mathbb{Z}, n \in \mathbb{Z}_+,$$ (3.2) where $${\boldsymbol {y}}^n := (..., y_{j-1}^n,y_j^n,y_{j+1}^n,...)$$ is an approximation to $${\boldsymbol {y}}(t_n)$$. Later, in Section 4, we discretize (3.1) by a high-order TVD-RK method. The choice of the numerical flux $$\hat{f}(\cdot,\cdot)$$ is crucial since it affects convergence properties of the method. Note that not only are we interested in convergence to a weak solution, but also we require convergence to the entropy solution. Hence, we require that the method satisfy a discrete version of (2.5) as well as other properties such as $$\textrm{L}^1$$-contraction and total variation diminishing. In fact, monotone schemes satisfy such requirements and converge to the entropy solution (see, e.g., LeVeque, 1990, Chapter 15). We, however, need yet another condition: the numerical scheme should be differentiable. This enables us to derive optimality conditions at the discrete level. Motivated by the differentiability issue addressed above, the following numerical fluxes are used in this article: $$\begin{array}{rcll} \hat{f}_\text{LF}(a,b) &:=& \frac{1}{2} (\,f(a)+f(b)) - \frac{\gamma}{2} \frac{h}{k} (b-a) & \text{Lax-Friedrichs (LF)}, \\ \hat{f}_\text{EO}(a,b) &:=& f(0) + \int_{0}^{a} f'(s)^+ \, \textrm{d} s + \int_{0}^{b} f'(s)^- \, \textrm{d} s & \quad\text{Engquist-Osher (EO)}, \end{array}$$ (3.3) where $$\gamma \in (0,1)$$ and we define $$f'(s)^+ := \max(0, f'(s))$$ and $$f'(s)^- \,{:=}\, \min(0, f'(s))$$. We note that the classical Lax–Friedrichs method uses $$\gamma =1$$. However, due to the stability requirements for the discrete adjoint, we impose $$\gamma < 1$$ (see Section 3.3). The Lax–Friedrichs scheme is monotone provided that the time step satisfies the CFL condition $$\frac{k}{h} \sup_{|y| \leq M} |f'(y)| \leq \gamma,$$ (3.4) where $$M = \max_x |u(x)|$$. We define the total variation seminorm for $${\boldsymbol {y}}$$ by \begin{equation*} \left|{ {\boldsymbol {y}} }\right|_\text{TV} := \sum_{j=-\infty}^\infty | y_{j+1} - y_{j} |. \end{equation*} It is well known that monotone schemes are also TVD (see LeVeque, 1990, Chapter 15.7). Therefore, we have $$\left|{ {\boldsymbol {y}}^{n+1} }\right|_\text{TV} \leq \left|{ {\boldsymbol {y}}^{n} }\right|_\text{TV}$$. Giles & Ulbrich (2010a,b) proved that for the Lax–Friedrichs method, provided $$k \leq \gamma \cdot h^{2-q}$$ for $$0<q<1$$, both forward and adjoint approximations converge to their respective continuous counterparts. This is due to adding more grid points near the shock position as the mesh is refined. This, however, results in a very restrictive time-step requirement. Since the support of the initial data is assumed to be in a bounded set, recall (2.2), and it is known that the solution of the conservation law has a finite speed of propagation, we can consider the problem on a bounded domain, denoted by $${\it{\Omega}}$$ only. Therefore, the discretization can be written using finite-dimensional operators. Let us suppose that $${\it{\Omega}} = (a,b)$$ is partitioned into $$N$$ elements, $$I_j$$ for $$\,j=1,...,N$$, with $$x_{1/2} = a$$ and $$x_{N+1/2} = b$$. Then we can define a finite-dimensional space $$V_h := \mathbb{R}^N$$ as approximation space, and we denote the approximation at time $$t_n$$ by $${\boldsymbol {y}}_h^n = (y^n_1, y^n_2, ..., y^n_N) \in V_h$$. This allows us to express the underlying scheme using a nonlinear discrete operator, $$F_h: V_h \rightarrow V_h$$, which is defined by $$\left[ F_h({\boldsymbol {w}}) \right]_j := \hat{f}(w_j,w_{j+1}) - \hat{f}(w_{j-1},w_{j}) \quad \forall\, j=1,...,N, \quad {\boldsymbol {w}} \in V_h.$$ (3.5) The fully discrete version of (3.2) with forward Euler time integration can then be written as $${\boldsymbol {y}}_h^{n+1} = {\boldsymbol {y}}_h^n - \frac{k}{h} F_h({\boldsymbol {y}}_h^n), \quad {\boldsymbol {y}}_h^{0} = {\boldsymbol {u}}_h.$$ (3.6) Differentiability properties of $$F_h(\cdot)$$ will be exploited later in Section 3.2 for deriving an adjoint discretization. Now we state Lax–Friedrichs and Engquist–Osher differentiability in the following proposition whose proof we defer to Appendix A. Proposition 3.1 Let the flux function be $$f(\cdot) \in \textrm{C}^2$$. Then the respective $$F_h({\boldsymbol {w}})$$ for the Lax–Friedrichs and Engquist-Osher schemes at $${\boldsymbol {w}} \in V_h$$ is Fréchet differentiable, i.e., there exists a linear bounded operator $$F_h'({\boldsymbol {w}}) : V_h \rightarrow V_h$$ such that in direction $${\boldsymbol {v}} \in V_h$$ we have $$[F_h'({\boldsymbol {w}}) {\boldsymbol v}]_j = g_{j,j+1} - g_{j-1,j}$$ with \begin{equation*} \begin{array}{ll} g_{j,j+1}^\text{LF} := \frac{1}{2} \left[ f'(w_{j+1})v_{j+1}+f'(w_j)v_{j} \right] - \frac{\gamma}{2}\frac{h}{k}(v_{j+1}-v_j) & \text{(LF)}, \\ g_{j,j+1}^\text{EO} := \frac{1}{2} \left[ f'(w_{j+1})v_{j+1}+f'(w_j)v_{j} \right] - \frac{1}{2}( |f'(w_{j+1})| v_{j+1} - |f'(w_{j})|v_j) & \text{(EO)}, \end{array} \end{equation*} for $$i=j,...,N$$. Moreover, for their transposes we have \begin{equation*} \begin{array}{ll} \left[ F_h'({\boldsymbol {w}})^\top {\boldsymbol v} \right]_j = \gamma \frac{h}{k} v_j - \frac{1}{2} \left( \gamma \frac{h}{k} + f'(w_j) \right) v_{j+1} - \frac{1}{2} \left( \gamma \frac{h}{k} - f'(w_j)\right) v_{j-1} & \text{(LF)}, \\ \left[F_h'({\boldsymbol {w}})^\top {\boldsymbol v} \right]_j = |f'(w_j)| v_j - \frac{1}{2} \left( |f'(w_j)| + f'(w_j) \right) v_{j+1} - \frac{1}{2} \left( |f'(w_j)| - f'(w_j)\right) v_{j-1} & \text{(EO)}. \end{array} \end{equation*} We will see in Section 3.2 how properties of $$F_h(\cdot)$$ influence the discrete adjoint variable. In particular, we are interested in TVD properties of the discrete adjoint. Remark 3.2 We emphasize that our choice of the Lax–Friedrichs scheme or the Engquist–Osher scheme in this article is due to theoretical considerations for stability and convergence of the discrete adjoint in Section 4.1 only. More precisely, we can choose any scheme with a differentiable flux function that satisfies the SSP requirement. To fully benefit from the high-order TVD-RK discretizations used in this article, one should consider using high-order spatial discretizations as well. For the latter see, e.g., Sanders (1988) and Zhang & Shu (2010). Furthermore, for the convergence of the discrete adjoint in Section 4.3, we require that the forward discretization converges to the entropy solution, representing another requirement in the choice of the scheme (see, e.g., Osher & Tadmor, 1988; Qiu & Shu, 2008) for higher-order methods satisfying entropy conditions. 3.2 Discrete optimal control problem and adjoint calculus In this section, we state the discrete optimization problem, derive the discrete adjoint and study its properties when a forward Euler time integration is employed together with the spatial discretization discussed in Section 3.1. As before, we denote the final time by $$T$$. For simplicity, we partition the time direction in a way that there exists $$n_T$$ such that $$T = n_T \cdot k$$. In order to ease the notation, we concatenate approximations $${\boldsymbol {y}}_h^n$$ at different times $$n=1,...,n_T$$ to obtain $${\boldsymbol {y}}_h \in (V_h)^{n_T+1}$$: $${\boldsymbol {y}}_h := ({\boldsymbol {y}}_h^0,{\boldsymbol {y}}_h^1, {\boldsymbol {y}}_h^2, ..., {\boldsymbol {y}}_h^{n_T})^\top.$$ (3.7) The discretized objective functional is then given by \begin{equation*} J_h^{}( {\boldsymbol {y}}_h ) := \sum_{j=1}^{N} h \, G({y}_j^{n_T}), \end{equation*} where $${\boldsymbol {y}}_h$$ is obtained by (3.6). We then consider the following finite-dimensional optimal control problem $$\min J_h^{}( {\boldsymbol {y}}_h ) \quad \text{subject to} \quad S_h^{}({\boldsymbol {u}}^{}_h) = {\boldsymbol {y}}_h^{n_T}, \quad {\boldsymbol {u}}_h \in V_h \cap U_\text{ad},$$ (DP) where $$S_h(\cdot)$$ is the discrete control-to-state operator that is defined by successive ($$n_T$$ times) application of (3.6) to $${\boldsymbol {u}}_h$$. For deriving the discrete adjoint, it is more convenient to consider (3.6) as a constraint instead of $$S_h(\cdot)$$. For this purpose, we define the equality constraint at time $$t_n$$ by $$L_h^n : (V_h)^{n_T+1} \rightarrow V_h$$ with \begin{equation*} L_h^{n}( {\boldsymbol {y}}_h^{} ) := - {\boldsymbol {y}}_h^{n} + {\boldsymbol {y}}_h^{n-1} - \frac{k}{h} F_h( {\boldsymbol {y}}_h^{n-1} ) \quad \text{for}\,n=1,...,n_T, \end{equation*} and $$L_h^0({\boldsymbol {y}}_h,{\boldsymbol u}_h) = -{\boldsymbol y}_h^0 + {\boldsymbol u}_h$$. We then collect all time-step contributions and state the discrete constraint as $$L_h({\boldsymbol y}_h,{\boldsymbol u}_h) = 0$$, where $$L_h : (V_h)^{n_T+1} \times V_h \rightarrow (V_h)^{n_T+1}$$ with \begin{equation*} L_h^{}({\boldsymbol y}_h^{},{\boldsymbol u}_h^{}) = \left( L_h^0({\boldsymbol y}_h^{},{\boldsymbol u}_h^{}), L_h^1({\boldsymbol y}_h^{}), L_h^2({\boldsymbol y}_h^{}), \cdots, L_h^{n_T}({\boldsymbol y}_h^{}) \right)^\top. \end{equation*} Then we define the Lagrangian for the finite-dimensional problem (DP) by \begin{equation*} \mathcal{L}({\boldsymbol {y}}_h^{},{\boldsymbol {p}}_h^{},{\boldsymbol {u}}_h^{}) := J_h^{}({\boldsymbol {y}}_h^{}) + h \, {\boldsymbol p}_h^\top \cdot L_h^{}({\boldsymbol y}_h^{},{\boldsymbol u}_h^{}), \end{equation*} where $${\boldsymbol p}_h \in (V_h)^{n_T+1}$$ is enumerated like $${\boldsymbol y}_h$$ in (3.7). Recall that from Proposition 3.1 we know that $$F_h(\cdot)$$ is differentiable when Lax–Friedrichs and Engquist–Osher schemes are used. Differentiability of $$F_h(\cdot)$$ implies differentiability of $$L_h(\cdot,\cdot)$$ with respect to the first argument. More precisely, a direct calculation shows that this derivative, denoted by $$L_{{\boldsymbol y}_h}'(\cdot,\cdot)$$, has a lower-triangular structure: \begin{equation*} L_{{\boldsymbol y}_h}'({\boldsymbol y}_h,{\boldsymbol u}_h) = \left[ \begin{array}{ccccc} -I & & & & \\ I - \frac{k}{h} F_h'({\boldsymbol y}_h^{1}) & -I & & & \\ & \ddots & \ddots & & \\ & & I - \frac{k}{h} F_h'({\boldsymbol y}_h^{n_T-1}) & -I & \\ & & & I - \frac{k}{h} F_h'({\boldsymbol y}_h^{n_T}) & -I \end{array} \right]\!, \end{equation*} and therefore $$L_{{\boldsymbol y}_h}'(\cdot,\cdot)$$ is invertible. As a consequence, the constraint $$L_h^n({\boldsymbol y}_h)=0$$ satisfies the linear independence constraint qualification (LICQ) at any feasible point of (DP). Hence, at a solution pair $$(y_h,u_h)$$ there exists a Lagrange multiplier (adjoint state) $$p_h$$. Now we can compute the discrete derivative of the objective functional using a discrete adjoint calculus. The concatenated discrete adjoint equation is given by $${\nabla J_h({\boldsymbol y}_h)} + h \, L'_{{\boldsymbol y}_h}({\boldsymbol y}_h,{\boldsymbol u}_h)^\top {\boldsymbol p}_h = 0.$$ (3.8) Note that $$L'_{{\boldsymbol y}_h}({\boldsymbol y}_h,{\boldsymbol u}_h)^\top$$ is upper triangular, and for a fixed $${\boldsymbol y}_h$$ one can solve (3.8) successively from $${\boldsymbol p}_h^{n_T}$$ to $${\boldsymbol p}_h^{0}$$, i.e., backward in time similar to the continuous adjoint; see (2.8). From the definition of the discrete objective function we get \begin{equation*} {\nabla J_h({\boldsymbol y}_h)} = \left( {\boldsymbol {0}}, {\boldsymbol {0}}, \cdots, {\boldsymbol {0}}, h \, G'({\boldsymbol y}_h^{n_T}) \right)^\top, \end{equation*} and therefore $${\boldsymbol p}_h^{n_T} = G'({\boldsymbol y}_h^{n_T})$$, which is the discrete counterpart for the final-time condition in (2.8). For $$n = n_T, ..., 0$$ we have $${\boldsymbol p}_h^{n_T} = G'({\boldsymbol y}_h^{n_T}), \quad {\boldsymbol p}_h^{n} = \left[ I - \frac{k}{h} F_h'({\boldsymbol y}_h^{n+1})^\top \right] {\boldsymbol p}_h^{n+1} \quad \text{for } n=(n_T-1), ..., 0,$$ (3.9) which is the discrete counterpart of the adjoint PDE (2.8). The gradient of the discrete reduced objective function in the direction of $$\delta {\boldsymbol u}_h \in V_h$$ is obtained by $${\nabla \mathcal{J}_h({\boldsymbol u}_h)^\top} \delta {\boldsymbol u}_h = h \sum_{j=1}^N p_j^0 \, \delta u_j^{}.$$ (3.10) Remark 3.3 (Convergence of the discrete adjoint) Note, however, that at the discrete level, the interior boundary condition (2.9) does not appear. Hence, a natural question is to ask whether the discrete adjoint converges to the continuous one or not. It is shown in Giles (2003) through numerical experiments that the adjoint obtained from the Lax–Friedrichs scheme converges to a wrong solution as the mesh is refined. It is also observed that if numerical diffusion is chosen such that the shock in $${\boldsymbol y}_h^{n_T}$$ spreads into more cells, then the discrete adjoint converges. This is proven by Giles & Ulbrich (2010a,b). We are not aware of a similar result for the Engquist–Osher scheme. However, it has been shown by Ulbrich (2001) that for the case where the end data is Lipschitz continuous, e.g., using a smoothed end state in the tracking functional, the discrete adjoint converges to the continuous adjoint. This motivates the use of two separate solvers for the discrete forward and adjoint problems, e.g., using a TVDRK-MUSCL scheme, when the end data are smoothed. We should mention that this approach results in an inexact discrete adjoint. 3.3 Stability of the discrete adjoint We would like to examine monotonicity of the discrete adjoint by checking whether or not it is TVD. The discrete adjoint, satisfying (3.9), can be written in a simplified form for analysis as $$p_j^n = A_{j,0}^{} \, p_j^{n+1} + A_{j,1}^{} \, p_{j+1}^{n+1} + A_{j,-1}^{} \, p_{j-1}^{n+1},$$ (3.11) where \begin{equation*} A_{j,l}^\text{LF} := \begin{cases} \frac{\gamma}{2} - \frac{k}{2h} f'(w_j) & \text{for } l=-1, \\ 1 - \gamma & \text{for } l=0, \\ \frac{\gamma}{2} + \frac{k}{2h} f'(w_j) & \text{for } l=1, \end{cases} \quad A_{j,l}^\text{EO} := \begin{cases} \frac{k}{2h} \left( |f'(w_j)| - f'(w_j) \right) & \text{for } l=-1, \\ 1 - \frac{k}{h} |f'(w_j)| & \text{for } l=0, \\ \frac{k}{2h} \left( |f'(w_j)| + f'(w_j) \right) & \text{for } l=1. \end{cases} \end{equation*} Note that, provided the CFL condition (3.4) is satisfied, we have $$A_{j,l}^\text{LF} \geq 0$$ and $$A_{j,l}^\text{EO} \geq 0$$. Moreover, observe that by construction we have \begin{equation*} \sum_{l=-1}^{1} A_{j,l} = 1. \end{equation*} Taking absolute values on both sides in (3.11) and using the above properties, we obtain \begin{equation*} |p_j^n| \leq A_{j,0}^{} |p_j^{n+1}| + A_{j,1}^{} |p_{j+1}^{n+1}| + A_{j,-1}^{} |p_{j-1}^{n+1}| \leq \max_{l=-1,0,1} |p_{j+l}^{n+1}| \leq \Vert {{ {\boldsymbol p}_h^{n+1} }} \Vert_{\infty}, \end{equation*} which implies $$\textrm{L}^\infty$$ stability. We now show that the discrete adjoint scheme with a forward Euler time discretization is also TVD. We first rewrite (3.11) in the form $$p_j^{n} = p_{j}^{n+1} + B_{j,0}^{} (p_{j}^{n+1} - p_{j-1}^{n+1}) + B_{j,1}^{} (p_{j+1}^{n+1} - p_{j}^{n+1}),$$ (3.12) with \begin{equation*} B_{j,l}^\text{LF} := \begin{cases} - \frac{\gamma}{2} + \frac{k}{2h} f'(w_j) & \text{for } l=0, \\ \frac{\gamma}{2} + \frac{k}{2h} f'(w_j) & \text{for } l=1, \end{cases} \quad B_{j,l}^\text{EO} := \begin{cases} - \frac{k}{2h} |f'(w_j)| + \frac{k}{2h} f'(w_j) & \text{for } l=0, \\ \frac{k}{2h} |f'(w_j)| + \frac{k}{2h} f'(w_j) & \text{for } l=1. \end{cases} \end{equation*} Then Harten’s lemma (see Harten, 1983) guarantees TVD properties of the discrete adjoint scheme. Lemma 3.4 (Harten’s lemma) Suppose a finite difference scheme can be written as \begin{equation*} w_j = v_j + B_{j,0} \cdot (v_j - v_{j-1}) + B_{j,1} \cdot (v_{j+1}-v_{j}), \end{equation*} where $$B_{j,0}$$ and $$B_{j,1}$$ are arbitrary nonlinear functions of $$v_j, v_{j+1}, v_{j-1}$$ and satisfy \begin{equation*} B_{j,0} \leq 0, \quad B_{j,1} \geq 0, \quad B_{j,1} - B_{j+1,0} \leq 1. \end{equation*} Then we have $$\left|{{\boldsymbol w}_h}\right|_\text{TV} \leq\left|{{\boldsymbol v}_h}\right|_\text{TV}$$. Observe that in our case, the above conditions on $$B_{j,0}$$ and $$B_{j,1}$$ are satisfied provided a $$(1-\gamma)$$-CFL condition for Lax–Friedrichs and a $$\frac{1}{2}$$-CFL condition for Engquist–Osher are satisfied, respectively, i.e., $$\frac{k^\text{LF}}{h} \sup_{|y| \leq M} |f'(y)| \leq (1-\gamma), \quad \frac{k^\text{EO}}{h} \sup_{|y| \leq M} |f'(y)| \leq \frac{1}{2}.$$ (3.13) This is the reason for the choice of $$\gamma \in (0,1)$$ in the Lax–Friedrichs scheme. Obviously the optimal CFL condition is \begin{equation*} \text{CFL}^* = \max_{\gamma \in (0,1)} \min\{\gamma,1-\gamma\} = \frac{1}{2}. \end{equation*} Lemma 3.4 now guarantees that the discrete adjoint obtained from forward Euler time discretization is TVD, i.e., $$\left|{ {\boldsymbol p}_h^{n} }\right|_\text{TV} \leq \left| {\boldsymbol p}_h^{n+1} \right|_\text{TV}.$$ (3.14) Note that this property is shared by the continuous adjoint in Theorem 2.4. 4. SSP time discretizations In this section, we examine the effect of using RK methods for discretizing the underlying problem instead of using the forward Euler method. A TVD-RK method is defined by convex combinations of forward Euler steps that are parametrized by two sets of coefficients: $$\{ \alpha_{ij} \}$$ and $$\{ \beta_{ij} \}$$ for $$i=1,...,s$$ and $$j=0,...,(s-1)$$, where $$s$$ is the number of RK stages. A TVD-RK time stepping is then defined as follows: Set the initial stage: $${\boldsymbol w}^{(0)} = {\boldsymbol y}_h^n$$. Compute for each stage $$i = 1, \ldots, s$$, $${\boldsymbol w}^{(i)} = \sum_{j=0}^{i-1} \alpha_{ij} {\boldsymbol w}^{(j)} - \frac{k}{h} \, \beta_{ij} F_h({\boldsymbol w}^{(j)}).$$ (4.1) Set the next time-step approximation: $${\boldsymbol y}^{n+1}_h = {\boldsymbol w}^{(s)}$$. Moreover, we impose the following constraints over $$\{ \alpha_{ij} \}$$ and $$\{ \beta_{ij} \}$$: $$\alpha_{ij}, \beta_{ij} \geq 0 \quad \sum_{j=0}^{i-1} \alpha_{ij} = 1, \quad \left( \beta_{ij} \not= 0 \implies \alpha_{ij}\not=0 \right)\!.$$ (4.2) The following result is due to Shu & Osher (1988) and shows that the TVD-RK method is stable if the forward Euler (3.6) is stable. We call such methods strong stability preserving (SSP) since we have stability with respect to stage variables, too. Proposition 4.1 Suppose the time step $$k$$ is chosen such that the forward Euler discretization is stable, i.e., $$\left\Vert{ {\boldsymbol w} - \frac{k}{h} \frac{\beta_{ij}}{\alpha_{ij}} F_h({\boldsymbol w}) }\right\Vert \leq {\Vert {{\boldsymbol w}} \Vert} \quad \forall\, {\boldsymbol w} \in V_h,$$ (4.3) for all $$i = 1, \ldots, s$$, $$j=0,\ldots,(s-1)$$ and $$\alpha_{ij} \not=0$$. Here, $${\Vert {\cdot} \Vert}$$ is a non-negative homogeneous convex function, e.g., a norm or a seminorm. Moreover, assume that the conditions in (4.2) are satisfied. Then for the TVD-RK method we have \begin{equation*} {\Vert {{\boldsymbol w}^{(i)}} \Vert} \leq \max_{j=0,\ldots,(i-1)} {\Vert { {\boldsymbol w}^{(j)} } \Vert} \quad \text{for} i=1,\ldots,s, \end{equation*} and consequently $${\Vert { {\boldsymbol y}^{n+1} } \Vert} \leq {\Vert { {\boldsymbol y}^{n} } \Vert}$$. To highlight the technical differences between the proof technique of Shu & Osher (1988), which relies on $$\sum_{j=0}^{i-1} \alpha_{ij} = 1$$, the property not available for the discrete adjoint in Section 4.1, we display a short proof. Proof. First observe that if $$\alpha_{ij}=0$$ then the contribution of $$w^{(j)}$$ is zero. So we can rewrite (4.1) as \begin{equation*} {\boldsymbol w}^{(i)} = \sum_{\{ j : \alpha_{ij} \not= 0 \}} \alpha_{ij} \left( {\boldsymbol w}^{(j)} - \frac{k}{h} \, \frac{\beta_{ij}}{\alpha_{ij}} F_h\left({\boldsymbol w}^{(j)}\right) \right) \quad \forall\, i=1,\ldots, s. \end{equation*} We then take $${\Vert {\cdot} \Vert}$$ from both sides and use convexity as well as our assumption on the stability of the Euler step: for all $$i=1,\ldots,s$$ we have \begin{equation*} {\Vert {{\boldsymbol w}^{(i)}} \Vert} \leq \sum_{\{ j : \alpha_{ij} \not= 0 \}} \alpha_{ij} {\Vert { {\boldsymbol w}^{(j)} } \Vert} \leq \left( \max_{j=0,\ldots,(i-1)} {\Vert { {\boldsymbol w}^{(j)} } \Vert} \right) \left( \sum_{\{\, j : \alpha_{ij} \not= 0 \}} \alpha_{ij} \right) = \max_{j=0,\ldots,(i-1)} {\Vert { {\boldsymbol w}^{(j)} } \Vert}, \end{equation*} where we also used positivity of $$\alpha_{ij},\beta_{ij}$$ and $$\sum_{j=0}^{i-1} \alpha_{ij} = 1$$. The proof is completed by induction. □ Let us denote the forward Euler time step by $$k_\text{FE}$$. Then the stable TVD-RK time step is bounded by $$k \leq \left( \min_{\alpha_{ij}, \beta_{ij} \not = 0} \frac{\alpha_{ij}}{\beta_{ij}} \right) k_\text{FE}.$$ (4.4) Therefore, one can optimize the coefficients $$\alpha_{ij}$$ and $$\beta_{ij}$$ to maximize the constant $$\min_{\alpha_{ij}, \beta_{ij} \not = 0} \frac{\alpha_{ij}}{\beta_{ij}}$$. In Table 1, we show such optimal TVD-RK methods of two and three stages. Table 1 Table of coefficients for TVD-RK methods of order $$2$$ and $$3$$ Order $$\alpha_{ij}$$ $$\beta_{ij}$$ $$\min \alpha_{ij}/\beta_{ij}$$ 2 1 1 1 1/2 1/2 0 1/2 3 1 1 1 3/4 1/4 0 1/4 1/3 0 2/3 0 0 2/3 Order $$\alpha_{ij}$$ $$\beta_{ij}$$ $$\min \alpha_{ij}/\beta_{ij}$$ 2 1 1 1 1/2 1/2 0 1/2 3 1 1 1 3/4 1/4 0 1/4 1/3 0 2/3 0 0 2/3 Table 1 Table of coefficients for TVD-RK methods of order $$2$$ and $$3$$ Order $$\alpha_{ij}$$ $$\beta_{ij}$$ $$\min \alpha_{ij}/\beta_{ij}$$ 2 1 1 1 1/2 1/2 0 1/2 3 1 1 1 3/4 1/4 0 1/4 1/3 0 2/3 0 0 2/3 Order $$\alpha_{ij}$$ $$\beta_{ij}$$ $$\min \alpha_{ij}/\beta_{ij}$$ 2 1 1 1 1/2 1/2 0 1/2 3 1 1 1 3/4 1/4 0 1/4 1/3 0 2/3 0 0 2/3 We now rewrite our discrete optimization problem (DP) using a TVD-RK time discretization. First we redefine $${\boldsymbol y}_h^n$$ to be suitable for the TVD-RK method: the collection of all stage approximations at time-slab $$t_n$$ is given by \begin{equation*} {\boldsymbol y}_h^n := ({\boldsymbol y}_h^{n,0}, {\boldsymbol y}_h^{n,1}, \ldots, {\boldsymbol y}_h^{n,s} )^\top \in W_h := (V_h)^{s+1}, \end{equation*} where $${\boldsymbol y}_h^{n,l}$$ is the approximation at time $$t_n$$, at stage $$l = 0, \ldots, s$$. Then we concatenate contributions from all time steps to get \begin{equation*} {\boldsymbol y}_h := ({\boldsymbol y}_h^0, {\boldsymbol y}_h^1, \ldots, {\boldsymbol y}_h^{n_T})^\top \in (W_h)^{n_T+1}. \end{equation*} Let us denote the forward Euler operator by $$H_{ij}: V_h \rightarrow V_h$$ with \begin{equation*} H_{ij}({\boldsymbol v}_h) := \begin{cases} {\boldsymbol v}_h - \frac{k}{h} \frac{\beta_{ij}}{\alpha_{ij}} F_h({\boldsymbol v}_h) & \text{if } \alpha_{ij} > 0, \\ 0 & \text{if } \alpha_{ij} = 0, \end{cases} \end{equation*} for all $${\boldsymbol v}_h \in V_h$$, which is differentiable with derivative in the direction of $${\boldsymbol u}_h \in V_h$$ given by \begin{equation*} H_{ij}'({\boldsymbol v}_h) {\boldsymbol u}_h := \begin{cases} \left[ I - \frac{k}{h} \frac{\beta_{ij}}{\alpha_{ij}} F_h'({\boldsymbol v}_h) \right] {\boldsymbol u}_h & \text{if } \alpha_{ij} > 0, \\ 0 & \text{if } \alpha_{ij} = 0. \end{cases} \end{equation*} Then the equality constraint generated by TVD-RK scheme at time $$t_n$$ and stage $$l$$ is denoted by $$L_h^{n,l}: (W_h)^{n_T+1} \rightarrow W_h$$ with \begin{align*} L_h^{0,0}({\boldsymbol y}_h,{\boldsymbol u}_h) &:= -{\boldsymbol y}_h^{0,0} + {\boldsymbol u}_h,\\ L_h^{n,0}({\boldsymbol y}_h) &:= -{\boldsymbol y}_h^{n,0} + {\boldsymbol y}_h^{n-1,s},\\ L_h^{n,l}({\boldsymbol y}_h) &:= -{\boldsymbol y}_h^{n,l} + \sum_{j=0}^{l-1} \alpha_{lj} H_{lj}({\boldsymbol y}_h^{n,j}) \quad \text{for } n=1,\ldots,n_T \quad \text{and } l=1,\ldots,s. \end{align*} The fully discrete scheme, i.e., spatial discretization with the TVD-RK method, can be expressed as $$L_h({\boldsymbol y}_h,{\boldsymbol u}_h) = 0$$ where \begin{equation*} L_h({\boldsymbol y}_h,{\boldsymbol u}_h) = \left( L_h^{0,0}({\boldsymbol y}_h,{\boldsymbol u}_h), L_h^{0,1}({\boldsymbol y}_h), \ldots, L_h^{0,s}({\boldsymbol y}_h), \ldots, L_h^{n_T,s}({\boldsymbol y}_h) \right)^\top \in (W_h)^{n_T+1}. \end{equation*} As before we define the Lagrangian function by \begin{equation*} \mathcal{L}({\boldsymbol y}_h^{},{\boldsymbol p}_h^{},{\boldsymbol u}_h^{}) := J_h^{}({\boldsymbol y}_h^{}) + h \, {\boldsymbol p}_h^\top \cdot L_h^{}({\boldsymbol y}_h^{},{\boldsymbol u}_h^{}), \end{equation*} where $${\boldsymbol p}_h^{} \in W_h^{n_T+1}$$ has the same structure as $${\boldsymbol y}_h$$. Note that since $$H_{ij}(\cdot)$$ is differentiable we can conclude that $${L_h}(\cdot,\cdot)$$ is also differentiable with respect to the first argument. Moreover, similar to the case with forward Euler step, it has again a lower-triangular structure: \begin{equation*} L_{{\boldsymbol y}_h}'({\boldsymbol y}_h,{\boldsymbol u}_h) = \left[ \begin{array}{ccccc} G({\boldsymbol y}_h^{0}) & & & & \\ K & G({\boldsymbol y}_h^{1}) & & & \\ & \ddots & \ddots & & \\ & & K & G({\boldsymbol y}_h^{n_T-1}) & \\ & & & K & G({\boldsymbol y}_h^{n_T}) \end{array} \right]\!, \end{equation*} where $$G(\cdot)$$ is the linearized contribution of the TVD-RK scheme, i.e., $$G({\boldsymbol y}_h^n) := \left[ \begin{array}{ccccc} -I & & & & \\[4pt] \alpha_{10} H_{10}'({\boldsymbol y}_h^{n,0}) & -I & & & \\[4pt] \alpha_{20} H_{20}'({\boldsymbol y}_h^{n,0}) & \alpha_{21} H_{21}'({\boldsymbol y}_h^{n,1}) & -I & & \\[4pt] & \vdots & & \ddots & \\[4pt] \alpha_{s,0} H_{s,0}'({\boldsymbol y}_h^{n,0}) & & & \alpha_{s,s-1} H_{s,s-1}'({\boldsymbol y}_h^{n,s-1}) & -I \end{array} \right]\!,$$ (4.5) and $$K$$ links the approximation from the previous time step to the current one: $$K := \left[ \begin{array}{cccc} O & \cdots & O & I \\ O & \cdots & O & O \\ \vdots & & \vdots & \vdots \\ O & \cdots & O & O \end{array} \right]\!.$$ (4.6) As before, the discrete adjoint satisfies, \begin{equation*} {\nabla J_h({\boldsymbol y}_h)} + h L_{{\boldsymbol y}_h}({\boldsymbol y}_h,{\boldsymbol u}_h)^\top {\boldsymbol p}_h = 0, \end{equation*} which is again backward in time. As before, the above constraint satisfies LICQ. Let us illustrate the above adjoint equation with an example. Example 4.2 (Case $$s=2$$) Consider the discrete adjoint for a TVD-RK discretization with two stages. Let $$n < n_T$$. Then the discrete adjoint satisfies the following backward in time linear system: \begin{equation*} \begin{array}{rrrcl} {\boldsymbol p}_h^{n,0} & -\ \alpha_{10} H_{10}'({\boldsymbol y}_h^{n,0})^\top {\boldsymbol p}_h^{n,1} & - \alpha_{20} H_{20}'({\boldsymbol y}_h^{n,0})^\top {\boldsymbol p}_h^{n,2} & = & {\boldsymbol 0}, \\[3pt] & {\boldsymbol p}_h^{n,1} & -\ \alpha_{21} H_{21}'({\boldsymbol y}_h^{n,1})^\top {\boldsymbol p}_h^{n,2} & = & {\boldsymbol 0}, \\[3pt] & & {\boldsymbol p}_h^{n,2} & = & {\boldsymbol p}_h^{n+1,0}, \end{array} \end{equation*} which is a TVD-RK scheme with coefficients $${\alpha_{ij}^\star}$$ and $${\beta_{ij}^\star}$$. In fact, the TVD-RK coefficients for the discrete adjoint equation are related to the TVD-RK of the discrete state as \begin{equation*} \begin{array}{cc} \alpha_{10}^* = \alpha_{21}, & \\ \alpha_{20}^* = \alpha_{20}, & \alpha_{21}^* = \alpha_{10}, \end{array} \quad \begin{array}{cc} \beta_{10}^* = \beta_{21}, & \\ \beta_{20}^* = \beta_{20}, & \beta_{21}^* = \beta_{10}. \end{array} \end{equation*} Observe the way the coefficients of the TVD-RK scheme are transformed by this conjugation process. We refer to the coefficients of the adjoint TVD-RK scheme as conjugate coefficients. This transformation in the table of adjoint TVD-RK coefficients might pose some restrictions on the choice of the TVD-RK method in the first place. We conclude this section by the following proposition. Proposition 4.3 Suppose we discretize-then-optimize the problem (P) and a TVD-RK time discretization is used for the discrete state variable with coefficients $$\alpha_{ij}$$ and $$\beta_{ij}$$. Then the discrete adjoint is also obtained by a TVD-RK method with coefficients $$\alpha_{ij}^*$$ and $$\beta_{ij}^*$$ such that $$\alpha_{ij}^* = \alpha_{s-j,s-i}, \quad \beta_{ij}^* = \beta_{s-j,s-i} \text{ for } i=1,\ldots,s \text{ and } j=0,\ldots,(s-1).$$ (4.7) 4.1 Stability of the discrete adjoint Given the result of Proposition 4.3, our first idea is to impose strong stability on both the discrete state and the adjoint variables, i.e., imposing (4.2) on $$\{ \alpha_{ij}, \beta_{ij} \}$$ and $$\{ \alpha^*_{ij}, \beta^*_{ij} \}$$. In other words, the following conditions on $$\{ \alpha_{ij}, \beta_{ij} \}$$ should hold: $$\alpha_{ij}, \beta_{ij} \geq 0, \quad \sum_{j=0}^{i-1} \alpha_{ij} = 1, \quad \sum_{i=j+1}^{s} \alpha_{ij} = 1, \quad \left( \beta_{ij} \not= 0 \implies \alpha_{ij}\not=0 \right)\!.$$ (4.8) This, however, turns out to be too strong as the following proposition clarifies. Proposition 4.4 Suppose we discretize-then-optimize the problem (P) with a TVD-RK method. If we require strong stability for both discrete state and discrete adjoint, then the TVD-RK method is at most first order. More precisely, the TVD-RK coefficients are \begin{equation*} \alpha_{i,j} = \left\{ \begin{array}{ll} 1 & \text{if } j=i-1, \\ 0 & \text{otherwise}, \end{array} \right. \quad \beta_{i,j} = \left\{ \begin{array}{ll} \beta_{i,i-1} & \text{if } j=i-1, \\ 0 & \text{otherwise}, \end{array} \right. \end{equation*} which gives a concatenation of forward Euler steps. Proof. We need to identify sets of $$\{\alpha_{ij}\}$$ and $$\{\beta_{ij}\}$$ that satisfy (4.8). Thus, all coefficients need to be non-negative. Let $$i=1$$, then $$\alpha_{10} = 1$$. Now let $$j=0$$ and observe that $$\sum_{i=j+1}^s \alpha_{ij} = 1$$. This implies $$\alpha_{i,0} = 0$$ for all $$i = 2, \ldots, s$$ since all $$\alpha_{ij} \geq 0$$ and $$\alpha_{10}=1$$. Now let $$i=2$$ and observe that $$\sum_{j=0}^{i-1} \alpha_{ij} =1$$. However, we just showed that $$\alpha_{20}=0$$, which implies $$\alpha_{21}=1$$. We now repeat the same process by letting $$j=1$$ and consider the constraint $$\sum_{i=j+1}^s \alpha_{ij}=1$$. Continuing, we conclude that $$\alpha_{i,i-1} =1$$ and the other coefficients are zero. Since $$\alpha_{i,j}=0$$ for $$j<i-1$$ we conclude from the last requirement of (4.8) that $$\beta_{ij} = 0$$ for $$j<i-1$$. Therefore, the only free parameters are $$\beta_{i,i-1}$$. However, such a TVD-RK scheme is equivalent to the concatenation of Euler steps, instead of a combination. Finally, the concatenation of Euler steps yields a first-order method. □ As we shall see next, imposing (4.8) is not necessary. Let us consider a two-stage TVD-RK that ensures strong stability only for the state discretization. That is, TVD-RK satisfies only (4.2). Moreover, suppose that the time step $$k$$ is chosen such that adjoint stability of the forward Euler step for discrete adjoint holds. For instance, for the the discrete adjoint from Example 4.2 we have \begin{equation*} \begin{array}{rcl} {\Vert {{\boldsymbol p}^{n,1}} \Vert} &\leq& \alpha_{21} {\Vert { {\boldsymbol p}^{n,2} } \Vert}, \\[3pt] {\Vert {{\boldsymbol p}^{n,0}} \Vert} &\leq& \alpha_{20} {\Vert { {\boldsymbol p}^{n,2} } \Vert} + \alpha_{10} {\Vert { {\boldsymbol p}^{n,1} } \Vert}, \end{array} \end{equation*} where $${\Vert {\cdot} \Vert}$$ is a non-negative homogeneous convex function, e.g., a seminorm or a norm. For strong stability of the TVD-RK scheme, we would take the maximum of (semi)norms up to the $$(i-1)\text{st}$$ stage and use the assumption that the sum of the coefficients in each stage equals $$1$$. However, here we do not have $$\alpha_{20} + \alpha_{10} = 1$$. Instead, we can substitute the first inequality into the second and obtain \begin{equation*} {\Vert {{\boldsymbol p}^{n,0}} \Vert} \leq ( \alpha_{20} + \alpha_{21} \, \alpha_{10} ) {\Vert { {\boldsymbol p}^{n,2} } \Vert} = {\Vert { {\boldsymbol p}^{n,2} } \Vert}, \end{equation*} which holds true since $$\alpha_{10}=1$$ and $$\alpha_{20} + \alpha_{21} = 1$$. This shows that for two-stage TVD-RK methods we have stability at each time step, which is weaker than stability at each stage; observe that in Proposition 4.1 stability is achieved at each stage and therefore at each time step. We can generalize this observation to an arbitrary $$s$$-stage TVD-RK method. Theorem 4.5 Suppose the state equation is discretized with an SSP TVD-RK method. Moreover, suppose that $$k$$ is chosen such that it ensures forward Euler stability for both the discrete state and adjoint. Then the discrete adjoint is stable in each time step for an arbitrary $$s$$-stage method., i.e., \begin{equation*} {\Vert {{\boldsymbol p}^{n,0}} \Vert} \leq {\Vert {{\boldsymbol p}^{n,s}} \Vert}, \end{equation*} where $${\Vert {\cdot} \Vert}$$ is a non-negative homogeneous convex function, e.g., a seminorm or a norm. Proof. Since we require Euler step stability, we have for each stage \begin{equation*} {\Vert { {\boldsymbol p}^{n,\ell}} \Vert} \leq \sum_{i=\ell+1}^s \alpha_{i\ell} \, {\Vert {{\boldsymbol p}^{n,i}} \Vert} \quad \text{for } \ell = 0, \ldots, s-1. \end{equation*} Let $$\ell=0$$ and recall that $$\alpha_{10}=1$$. Then, using the above inequality, we have \begin{equation*} {\Vert { {\boldsymbol p}^{n,0}} \Vert} \leq \sum_{i=1}^s \alpha_{i0} \, {\Vert {{\boldsymbol p}^{n,i}} \Vert} = {\Vert {{\boldsymbol p}^{n,1}} \Vert} + \sum_{i=2}^s \alpha_{i0} {\Vert {{\boldsymbol p}^{n,i}} \Vert} \leq \sum_{i=2}^s ( \alpha_{i1} + \alpha_{i0} ) {\Vert {{\boldsymbol p}^{n,i}} \Vert}. \end{equation*} Now isolating the term with $$i=2$$ and noting that $$\alpha_{21} +\alpha_{20}=1$$ we have \begin{equation*} {\Vert { {\boldsymbol p}^{n,0}} \Vert} \leq {\Vert { {\boldsymbol p}^{n,2} } \Vert} + \sum_{i=3}^{s} (\alpha_{i1}+\alpha_{i0}) {\Vert {{\boldsymbol p}^{n,i}} \Vert} \leq \sum_{i=3}^s (\alpha_{i2}+\alpha_{i1} + \alpha_{i0}) {\Vert {{\boldsymbol p}^{n,i}} \Vert}. \end{equation*} We repeat this procedure to obtain $${\Vert { {\boldsymbol p}^{n,0}} \Vert} \leq \sum_{i=\ell'}^s \left( \sum_{j=0}^{\ell'-1} \alpha_{ij} \right) {\Vert { {\boldsymbol p}^{n,i} } \Vert}$$ for all $$\ell' = 1, \ldots, s$$. We then choose $$\ell' = s$$ and obtain $${\Vert {{\boldsymbol p}^{n,0}} \Vert} \leq (\sum_{j=0}^{s-1} \alpha_{sj}) {\Vert {{\boldsymbol p}^{n,s}} \Vert} = {\Vert {{\boldsymbol p}^{n,s}} \Vert}$$, which completes the proof. □ Theorem 4.5 shows that any $$s$$-stage TVD-RK method for the discrete state yields a stable TVD-RK method for the discrete adjoint. However, the discrete adjoint is proved to be stable at each time step instead of each stage. 4.2 Order conditions for the discrete adjoint In this section, we study approximation properties of the scheme for the discrete adjoint. We focus on deriving order conditions for the discrete adjoint scheme. For this purpose, extra conditions on the TVD-RK method applied to the state discretization are needed to ensure high accuracy of the adjoint scheme. We also check which methods provide optimal CFL constants. For simplicity of the notation, we consider TVD-RK methods with a conjugate coefficient table as in Proposition 4.3 for the following linear problem: $$\dot{{\boldsymbol p}}(t) + R(t) {\boldsymbol p}(t) = 0, \quad {\boldsymbol p}(T) = {\boldsymbol p}_T \in V_h.$$ (4.9) Here $$R(t)$$ is a linear operator and defined as $$R(t) := -\frac{1}{h} F_h'({\boldsymbol y}(t))^\top,$$ (4.10) where $${\boldsymbol y}(t)$$ is the solution of the semidiscrete problem (3.1), i.e., $$\dot{{\boldsymbol y}}(t)=-\frac{1}{h} F_h({\boldsymbol y}(t))$$. For completeness we state derivatives of $$R(\cdot)$$: \begin{align} \dot{R}(t) &= \frac{1}{h^2} F''_h({\boldsymbol y}(t))^\top F_h({\boldsymbol y}(t)), \notag\\[-2pt] \ddot{R}(t) &= -\frac{1}{h^3} \left[ F''_h({\boldsymbol y}(t))^\top F'_h({\boldsymbol y}(t))^\top F_h({\boldsymbol y}(t)) + F_h'''({\boldsymbol y}(t))^\top F_h({\boldsymbol y}(t))^2 \right]\!. \end{align} (4.11) Observe that $$R(t)$$ at the $$\ell$$th stage is approximated by $$\frac{-1}{h} F_h'({\boldsymbol y}_h^{n,\ell})$$; see for instance Example 4.2 and the definition of $$H'_{i\ell}({\boldsymbol y}_h^{n,\ell})$$. Since we consider the local error of the TVD-RK method in the time interval $$[t_n,t_{n+1}]$$, we let $${\boldsymbol y}_h^{n,0} = {\boldsymbol y}(t_n)$$. For the analysis of the TVD-RK method, we need approximation properties of $$\frac{-1}{h} F_h'({\boldsymbol y}_h^{n,\ell})$$. A direct calculation gives \begin{align} -\frac{1}{h} F'_h({\boldsymbol y}_h^{n,0})^\top &= R(t_{n+1}) - k \dot{R}(t_{n+1}) + \frac{1}{2} k^2 \ddot{R}(t_{n+1}) + \mathcal{O}(k^3), \\[-2pt] \end{align} (4.12) \begin{align} -\frac{1}{h} F'_h({\boldsymbol y}_h^{n,1})^\top &= R(t_{n+1}) - (1-\beta_{10}) k \dot{R}(t_{n+1}) \notag\\[-2pt] &\quad - \frac{1}{2} \frac{k^2}{h^3} (1-\beta_{10})^2 F'''_h({\boldsymbol y}(t_{n+1}))^\top F_h^2({\boldsymbol y}(t_{n+1})) \notag\\[-2pt] &\quad - \frac{1}{2} \frac{k^2}{h^3} (1 - 2 \beta_{10} ) F_h''({\boldsymbol y}(t_{n+1}))^\top F_h'({\boldsymbol y}(t_{n+1}))^\top F_h({\boldsymbol y}(t_{n+1})) + \mathcal{O}(k^3) \end{align} (4.13) and \begin{align} -\frac{1}{h} F'_h({\boldsymbol y}_h^{n,2})^\top &= R(t_{n+1}) - (1- \psi) k \dot{R}(t_{n+1}) \notag\\[-2pt] &- \frac{1}{2} \frac{k^2}{h^3} (1- \psi)^2 F'''_h({\boldsymbol y}(t_{n+1}))^\top F_h^2({\boldsymbol y}(t_{n+1})) \notag\\[-2pt] & - \frac{1}{2} \frac{k^2}{h^3} (1 - 2 \psi + 2 \beta_{21} \beta_{10}) F_h''({\boldsymbol y}(t_{n+1}))^\top F_h'({\boldsymbol y}(t_{n+1}))^\top F_h({\boldsymbol y}(t_{n+1})) \notag\\[-2pt] & +\mathcal{O}(k^3), \end{align} (4.14) where $$\psi := \beta_{20} + \beta_{21} + \alpha_{21} \beta_{10}.$$ (4.15) Note that since the inner stages of the RK are of low order, we have first-order approximation of $$R(\cdot)$$ in (4.13) and (4.14). A Taylor expansion of the solution of (4.9) at time $$t_{n+1}$$ with the time step $$(-k)$$ yields \begin{align} {\boldsymbol p}(t_{n}) &= {\boldsymbol p}(t_{n+1}) + k R(t_{n+1}) {\boldsymbol p}(t_{n+1}) + \frac{k^2}{2} \left[ R^2(t_{n+1}) - \dot{R}(t_{n+1}) \right] {\boldsymbol p}(t_{n+1}) \notag\\ &\quad - \frac{k^3}{6} \left[ 2 \dot{R}(t_{n+1}) R(t_{n+1}) + R(t_{n+1}) \dot{R}(t_{n+1}) - R^3(t_{n+1}) - \ddot{R}(t_{n+1}) \right] {\boldsymbol p}(t_{n+1}) \notag\\ &\quad +\mathcal{O}(k^4). \end{align} (4.16) We will compare the TVD-RK method with conjugate coefficients against the Taylor expansion in (4.16). 4.2.1 Two-stage methods We now study the approximation properties of a two-stage scheme. In this vein, Shu & Osher (1988) derived order conditions for TVD-RK methods with two stages. As before, let $$\{\alpha_{ij}\}, \{\beta_{ij}\}$$ be the coefficients of a TVD-RK scheme for the state equation. Then the method is second order if it satisfies (4.2) and the order conditions $$\begin{array}{rcl} \alpha_{21}\beta_{10}+\alpha_{10}\beta_{21} + \beta_{20} &=& 1, \\[2pt] \beta_{10} \beta_{21} &=& \frac{1}{2}. \end{array}$$ (4.17) We need to find extra conditions on $$\{\alpha_{ij}\}$$ and $$\{\beta_{ij}\}$$ to obtain a second-order approximation for the discrete adjoint too. For the TVD-RK method with a conjugate coefficient table applied to (4.9) we have (see Example 4.2) \begin{equation*} \begin{array}{rcl} {\boldsymbol p}_h^{n,0} &=& \left[ \alpha_{10} + k \beta_{10} R(t_{n}) \right] {\boldsymbol p}_h^{n,1} + \left[ \alpha_{20} + k \beta_{20} R(t_{n}) \right] {\boldsymbol p}_h^{n,2}, \\[2pt] {\boldsymbol p}_h^{n,1} &=& \left[ \alpha_{21} + k \beta_{21} R(t_{n+1}) - k^2 \beta_{21}(1-\beta_{10}) \dot{R}(t_{n+1}) \right] {\boldsymbol p}_h^{n,2} + \mathcal{O}(k^3). \end{array} \end{equation*} Eliminating $${\boldsymbol p}_h^{n,1}$$, we simplify the above equations and obtain \begin{align} {\boldsymbol p}_h^{n,0} &= (\alpha_{20}^{} + \alpha_{10}^{}\alpha_{21}^{}) {\boldsymbol p}_h^{n,2} + (\alpha_{10} \beta_{21} + \alpha_{21} \beta_{10} + \beta_{20} ) k R(t_{n+1}) {\boldsymbol p}_h^{n,2} \notag\\ & \quad + \left[ \beta_{10}\beta_{21} R^2(t_{n+1}) - (\beta_{20}+\alpha_{21}\beta_{10} + \alpha_{10} \beta_{21}(1-\beta_{10})) \dot{R}(t_{n+1}) \right] k^2 {\boldsymbol p}_h^{n,2} + \mathcal{O}(k^3). \end{align} (4.18) Recall that $$\alpha_{10} =1$$ and $$\alpha_{20}+\alpha_{21} = 1$$. Then using these facts in the above expansion we get \begin{align} {\boldsymbol p}_h^{n,0} &= {\boldsymbol p}_h^{n,2} + ( \alpha_{10} \beta_{21} + \alpha_{21} \beta_{10} + \beta_{20} ) k R(t_{n+1}) {\boldsymbol p}_h^{n,2} \notag\\ &\quad + \left[ \beta_{10}\beta_{21} R^2(t_{n+1}) - (\beta_{20}+\alpha_{21}\beta_{10} + \alpha_{10}\beta_{21}(1-\beta_{10})) \dot{R}(t_{n+1}) \right] k^2 {\boldsymbol p}_h^{n,2} + \mathcal{O}(k^3). \end{align} (4.19) We compare (4.19) to the Taylor expansion of the exact solution (4.16) and require the following conditions on the coefficients: $$\begin{array}{rcl} \alpha_{21}\beta_{10}+\alpha_{10}\beta_{21} + \beta_{20} &=& 1, \\[2pt] \beta_{10} \beta_{21} &=& \frac{1}{2}, \\[2pt] \beta_{20} + \alpha_{21} \beta_{10} + \alpha_{10}\beta_{21}(1-\beta_{10}) &=& \frac{1}{2}. \end{array}$$ (4.20) Note that the first and second conditions are satisfied due to (4.17). Moreover, the first two conditions imply that the third condition is automatically satisfied. Therefore any second-order TVD-RK method applied to the forward problem results in a second-order approximation of the discrete adjoint. The above arguments prove the following theorem. Theorem 4.6 Suppose a second-order two-stage TVD-RK method is used to discretize the state equation. Then the corresponding TVD-RK method for the adjoint equation is consistent and is second-order. Moreover, the optimal CFL constant is $$1$$. 4.2.2 Three-stage methods In this section, we analyse approximation properties of the third-order three-stages TVD-RK methods. The following conditions should be satisfied to ensure a third-order discrete state: \begin{align} \alpha_{32} &= 1 - \alpha_{31} - \alpha_{30}, \\[-2pt] \end{align} (4.21) \begin{align} \beta_{32} &= \frac{3 \beta_{10} - 2}{6 \psi (\beta_{10} - \psi)}, \\[-2pt] \end{align} (4.22) \begin{align} \beta_{21} &= \frac{1}{6 \beta_{10} \beta_{32} }, \\[-2pt] \end{align} (4.23) \begin{align} \beta_{31} &= \frac{ \frac{1}{2} - \alpha_{32} \beta_{10} \beta_{21} - \psi \beta_{32} }{\beta_{10}}, \\[-2pt] \end{align} (4.24) \begin{align} \beta_{30} &= 1 - \alpha_{31} \beta_{10} - \alpha_{32} \psi - \beta_{31} - \beta_{32}, \\[-2pt] \end{align} (4.25) \begin{align} \beta_{20} &= \psi - \alpha_{21} \beta_{10} - \beta_{21}. \end{align} (4.26) The free parameters are $$\alpha_{21}, \alpha_{30}, \alpha_{31}, \beta_{10}$$ and $$\psi$$; see the discussion in Shu & Osher (1988) for details. The same analysis as in Section 4.2.1 and employing inner stages approximation properties (4.12)–(4.14) and comparing with coefficients of the Taylor expansion in (4.16) yields that the following conditions must be satisfied for the discrete adjoint: $$\begin{array}{rcll} \alpha_{31} \beta_{10} + \alpha_{21} \alpha_{32} \beta_{10} + \alpha_{32} (\beta_{20} + \beta_{21}) + \beta_{30} + \beta_{31} + \beta_{32} &=&1 & \text{(first order)}, \end{array}$$ (4.27) \begin{align} \left. \begin{array}{rcl} \alpha_{32} \beta_{10} \beta_{21} + \beta_{10} \beta_{31} + \alpha_{21} \alpha_{32} \beta_{10} + \beta_{32} (\beta_{20} + \beta_{21}) &=& \frac{1}{2} \quad \text{(for}\,R^2) \\[4pt] \alpha_{31} \beta_{10} + \alpha_{21} \alpha_{32} \beta_{10} + \alpha_{32} \beta_{20} + \beta_{30} + \beta_{32} (1-\psi) \\[4pt] + (\alpha_{32} \beta_{21} + \beta_{31})(1-\beta_{10}) &=& \frac{1}{2} \quad (\text{for}\,\dot{R}) \end{array} \right\} & \text{(second order)}, \end{align} (4.28) \begin{align} \left. \begin{array}{rcl} \beta_{10} \beta_{21} \beta_{32} &=& \frac{1}{6} \quad (\text{for} R^3) \\[4pt] \beta_{20}\beta_{32} + \beta_{10}\beta_{31} + (1-\beta_{10})\beta_{21}\beta_{32} + \beta_{10} \beta_{21} \alpha_{32} + \beta_{10} \beta_{32} \alpha_{21} &=& \frac{1}{3} \quad (\text{for} \dot{R} R) \\[4pt] (1-\psi) \left( \beta_{32}\beta_{20} + \beta_{10}\alpha_{21}\beta_{32} + \beta_{32}\beta_{21} \right) \quad \\ + (1-\beta_{10}) \left( \beta_{21}\beta_{10}\alpha_{32} + \beta_{10} \beta_{31} \right) &=& \frac{1}{6} \quad (\text{for}\,R\dot{R}) \\[4pt] \alpha_{31} \beta_{10} + \alpha_{21} \alpha_{32} \beta_{10} + \alpha_{32} \beta_{20} + (\alpha_{32} \beta_{21} + \beta_{31}) (1-\beta_{10})^2 \\ + \beta_{30} + \beta_{32} (1-\psi)^2 &=& \frac{1}{3} \quad (\text{for}\,F'''F^2) \\[4pt] \alpha_{31} \beta_{10} + \alpha_{21} \alpha_{32} \beta_{10} + \alpha_{32} \beta_{20} + (\alpha_{32} \beta_{21} + \beta_{31}) (1-2\beta_{10}) \\ + \beta_{30} + \beta_{32} (1-2\psi+2 \beta_{21}\beta_{10}) &=& \frac{1}{3} \quad (\text{for}\,F''F'F) \end{array} \right\} & \text{(third order)}. \end{align} (4.29) One can show that if conditions (4.21)–(4.26) are satisfied then (4.27)–(4.29) are automatically satisfied except for the term associated to $$R \dot{R}$$: let us define $$A := (1-\psi) \left( \beta_{32}\beta_{20} + \beta_{10}\alpha_{21}\beta_{32} + \beta_{32}\beta_{21} \right) + (1-\beta_{10}) \left( \beta_{21}\beta_{10}\alpha_{32} + \beta_{10} \beta_{31} \right)$$. Then we have \begin{equation*} A = (1-\psi) \psi \beta_{32} + (1-\beta_{10})\left(\tfrac{1}{2} - \psi \beta_{32} \right) = \tfrac{1}{2} + \psi (\psi - \beta_{10}) \beta_{32} - \tfrac{1}{2} \beta_{10} = \tfrac{1}{2} + \tfrac{1}{2} \beta_{10} - \tfrac{1}{6} - \tfrac{1}{2} \beta_{10} = \tfrac{1}{3}, \end{equation*} which cannot be equal to $$\frac{1}{6}$$ for any choice of parameters. Therefore the discrete adjoint TVD-RK method is at most second order. The above arguments prove the following theorem. Theorem 4.7 Suppose a third-order three-stage TVD-RK method is used to discretize the state equation. Then the corresponding TVD-RK method for the adjoint equation is consistent and is at most second order. Moreover, the optimal CFL constant is $$1$$. 4.2.3 Fifth-stage (fourth-order) method We have already mentioned that a fourth-order TVD-RK method with four stages does not exist (see Gottlieb & Shu, 1998; Ruuth & Spiteri, 2002). Using a nonlinear programming computer code, a fourth-order with five-stage TVD-RK method was found in Spiteri & Ruuth (2002, Appendix B). It has been shown in Hager (2000, Proposition 6.1) that when a general four-stage fourth-order RK method is applied to the forward problem of an optimal control problem of ODEs, then the corresponding discrete adjoint is also automatically fourth order. Inspired by this result, we would like to check whether or not the fourth-order with five-stage TVD-RK method of Ruuth & Spiteri (2002) generates a fourth-order discrete adjoint too. To check the order conditions for this particular TVD-RK method, we use Butcher’s table of the mentioned TVD-RK method and check whether the order conditions in Hager (2000, Table 1) are satisfied. A direct calculation shows that the discrete adjoint is only second order. This is, however, not surprising, since the TVD-RK method is obtained by nonlinear programming (with order conditions for the forward problem as constraints). 4.3 Convergence of the TVD-RK method In this section, we discuss convergence of the discrete adjoint obtained from a TVD-RK method. We follow the framework established in Ulbrich (2001, Chapter 6.4). We consider the case when the end data of the continuous adjoint is smooth, e.g., $$p(x,T) \in {\textrm{Lip}}_\text{loc}({\mathbb{R}})$$, and the space discretization is either the Lax–Friedrichs or Engquist–Osher method, which yields a monotone scheme when combined with a forward Euler method in time (see, e.g., LeVeque, 1990, Chapter 15.7). Consider the TVD-RK schemes with either Lax–Friedrichs or Engquist–Osher discretization and suppose the time step $$k$$ is chosen such that the forward Euler discretizations (4.3) are monotone and their corresponding discrete adjoint scheme is TVD (see (3.13), (4.4)). The first ingredient for the proof is to show that the TVD-RK scheme is again a monotone scheme for the forward problem, i.e., given any initial data $$y_j^n, w_j^n$$, at time $$t_n$$, we have \begin{equation*} y_j^n \geq w_j^n \quad \forall\, j \implies y_j^{n+1} \geq w_j^{n+1} \quad \forall\, j. \end{equation*} Suppose that until the $$\ell$$th stage we have $$y_j^{n,i} \geq w_j^{n,i}$$ for all $$i=1,\ldots,\ell$$. Since $$y_j^{n,\ell+1}$$ is equal to a convex combination of forward Euler methods and each forward Euler method is a monotone scheme, we can conclude that $$y_j^{n,\ell+1} \geq w_j^{n,\ell+1}$$. The second ingredient is to show that after eliminating the inner stages we have a conservative scheme. Let us demonstrate this for a two-stage method: observe that the application of an Euler step at the first stage can be written as $$y_j^{n,1} = \mathcal{H}(y_{j-1}^{n,0},y_{j}^{n,0},y_{j+1}^{n,0}) := \alpha_{10}^{} y_{j}^{n,0} - \beta_{10}^{} \frac{k}{h} \left[ \hat{f}(y_j^{n,0},y_{j+1}^{n,0}) - \hat{f}(y_{j-1}^{n,0},y_{j}^{n,0}) \right]\!.$$ (4.30) Then the second stage can be written in the conservative form $$y_j^{n+1} = y_j^{n} - \frac{k}{h} \left[ \tilde{f}(y_{j-1}^{n},y_{j}^{n},y_{j+1}^{n},y_{j+2}^{n}) - \tilde{f}(y_{j-2}^{n}, y_{j-1}^{n},y_{j}^{n},y_{j+1}^{n}) \right]\!,$$ (4.31) where the numerical flux is defined, after eliminating the inner stage, by \begin{equation*} \tilde{f}(y_{j-1},y_{j},y_{j+1},y_{j+2}) := (\alpha_{21}\beta_{10}+\beta_{20}) \hat{f}(y_j,y_{j+1}) + \alpha_{10}\beta_{21} \hat{f}(\mathcal{H}(y_{j-1},y_j,y_{j+1}), \mathcal{H}(y_{j},y_{j+1},y_{j+2})). \end{equation*} We state the convergence result in the following proposition. Proposition 4.8 Suppose the final end data of the continuous adjoint satisfies $$p(x,T) \in {\textrm{C}}^{0,1}({\mathbb{R}})$$. Moreover, suppose that the space discretization is either the Lax–Friedrichs or Engquist–Osher method and that the time is discretized using a TVD-RK method. Let the time step $$k= ch$$ be chosen such that the forward Euler discretizations (4.3) are monotone and their corresponding discrete adjoint scheme is TVD, see (3.13), (4.4). Then the discrete adjoint is convergent to the unique reversible solution as $$k = ch \rightarrow 0$$, i.e., \begin{equation*} p_h \rightarrow p \quad \text{in } B([0,T];{\textrm{L}}^r(I)), \end{equation*} where $$B([0,T];{\textrm{L}}^r(I))$$ is the space of bounded functions equipped with the $$\sup$$-norm and with values in $${\textrm{L}}^r(I)$$ for $$r \in [1,\infty)$$ and $$I := (-R,R)$$ for all $$R>0$$. Proof. We give the main steps of the proof from Ulbrich (2001, Chapter 6.4). First, we eliminate the inner stages and write the TVD-RK method in a conservative form (see (4.31) for the two-stage method): \begin{equation*} y_j^{n+1} = y_j^{n} - \frac{k}{h} \left[ \tilde{f}_{j+1/2}^n - \tilde{f}_{j-1/2}^{n} \right]\!, \quad \tilde{f}_{j+1/2}^{n} := \tilde{f}(y_{j-K+1}^n,\ldots,y_{j+K}^n). \end{equation*} Therefore, it is of the form Ulbrich (2001, Equation (6.9)). Then the discrete adjoint can be written as $$p_j^{n} = p_j^{n+1} + \frac{k}{h} \sum_{i=1-K}^{K} a_{j-i+1/2,i}^{n} \, (p_{j-i+1}^{n+1} - p_{j-i}^{n+1}),$$ (4.32) where $$a_{j-i+1/2,i}^{n} = \partial_{y_i}\tilde{f}_{j-i+1/2}^n$$. For the $${\textrm{L}}^\infty$$-stability, it is more convenient to write the discrete adjoint scheme in the form \begin{equation*} p_j^n = \sum_{i=-K}^{K} B_{j,i}^n p_{j-i}^{n+1}, \end{equation*} where $$B_{j,-K}^n = \frac{k}{h} a^n_{j+K-\frac{1}{2},1-K}$$, $$B_{j,K}^n = - \frac{k}{h} a^n_{j-K+\frac{1}{2},K}$$ and \begin{equation*} B_{j,i}^n = \delta_{0,i} + \frac{k}{h} \left( a^n_{j-i-\frac{1}{2},i+1} - a^n_{j-i+\frac{1}{2},i}\right)\!, \quad -K < i < K, \end{equation*} where $$\delta_{0,i}$$ is the Kronecker delta. For the stability of the total variation, however, it is more convenient to write the discrete adjoint scheme in the form $$(p_{j+1}^{n}-p_{j}^{n}) = \sum_{i=-K}^{K} C_{j,i}^{n} \left(p_{j-i+1}^{n+1}-p_{j-i}^{n+1}\right)\!,$$ (4.33) where $$C_{j,-K}^n = \frac{k}{h} a^n_{j+K+\frac{1}{2},1-K}$$, $$C_{j,K}^n = - \frac{k}{h} a^n_{j-K+\frac{1}{2},K}$$ and \begin{equation*} C_{j,i}^n = \delta_{0,i} + \frac{k}{h} \left( a^n_{j-i+\frac{1}{2},i+1} - a^n_{j-i+\frac{1}{2},i}\right)\!, \quad -K < i < K. \end{equation*} Monotonicity of the TVD-RK method applied to the forward problem ensures that $$B_{j,i}^n \geq 0$$ (see Ulbrich, 2001, Lemma 6.4.2). Moreover, the TVD-RK adjoint scheme is TVD by Theorem 4.5 and maps by (4.32) constant values $$p_{j-K}^{n+1}=\cdots=p_{j+K}^{n+1}=c$$ to the same value $$p_j^n=c$$. Hence, Harten (1983, Theorem 2.1) yields that the TVD-RK adjoint scheme is monotonicity perserving and thus (4.33) implies $$C_{j,i}^n \geq 0$$. This shows that condition (1) in Ulbrich (2001, Chapter 6.4.1, condition (1)) is satisfied. It remains to show that assumptions (D1), (D2) and (D3) of Ulbrich (2001, Chapter 6.4.1) hold. (D1) is consistency of the numerical flux, which holds in our case; observe for instance that $$\tilde{f}(y,y,y,y) = f(y)$$ in (4.30) since $$\alpha_{21}\beta_{10} + \alpha_{10}\beta_{21} + \beta_{20} = 1$$. (D2) is the convergence of the discrete state to the entropy solution of the state, which holds as our discretization is TVD. (D3) is an OSLC condition which holds for Lax–Friedrichs and Engquist–Osher discretization (see Ulbrich, 2001, Lemma 6.5.2 and Lemma 6.5.5 for details). Then we can apply Ulbrich (2001, Theorem 6.4.4 and Theorem 6.4.6) to show convergence of the discrete adjoint to the unique reversible solution. □ 5. Numerical experiments In this section, we perform numerical experiments on TVD-RK methods for computing the discrete adjoint state. We show through numerical experiments that the adjoint scheme obtained from discretization of the forward problem using a TVD-RK method is stable. Let us consider the configuration of Example 2.5. The domain is set to be $${\it{\Omega}} = (-1,1)$$, the flux function $$f(y) := \frac{1}{2}y^2$$ and $$u(x)=1$$ for $$x \in [-1,0)$$ and $$u(x)=-1$$ for $$x \in [0,1]$$. We set the function $$G(y(x,T)):= \frac{1}{2} |y(x,T)|^2$$. For the boundary condition we choose $$y(-1,t) = 1$$ and $$y(1,t)=-1$$. We then use the Lax–Friedrichs and Engquist–Osher schemes to compute $$\{ {\boldsymbol y}_h^n \}_{n=1}^{n_T}$$ with the time step \begin{equation*} k = \frac{1}{2} \left( \min_{\alpha_{ij},\beta_{ij}\not = 0} \frac{\alpha_{ij}}{\beta_{ij}} \right) \gamma \, h, \end{equation*} where $$\gamma = \frac{1}{2}$$ is the optimal CFL constant for both the forward and adjoint discretization as discussed in Section 3.3. Here, as before, $$h$$ is the mesh parameter and is inversely proportional to the number of cells $$N$$. We use the TVD-RK methods of Table 1. As shown in Theorem 4.5, the discrete adjoint is also stable provided the TVD-RK method for the forward problem is SSP. Moreover, we expect that the discrete adjoint approximates the continuous adjoint at $$t=0$$, \begin{equation*} p(x,0) = \begin{cases} 1, & -1 < x < -\frac{1}{2}, \\[2pt] 0, & -\frac{1}{2} \leq x \leq \frac{1}{2}, \\[2pt] -1, & -\frac{1}{2} < x < 1, \end{cases} \end{equation*} even though the discrete adjoint scheme does not impose the ‘interior’ boundary condition (2.9). In Fig. 2, we observe that for both the Engquist–Osher and Lax–Friedrichs methods with a two-stage TVD-RK method, we obtain a stable TVD discrete adjoint. Note that in the interval $$x \in (-\frac{1}{2},\frac{1}{2})$$, the discrete adjoint has the correct value, i.e., $$p(x,0) = 0$$, and the shock location is correct as well. Moreover, the discrete adjoint converges as we refine the mesh. Identical results are obtained using a three-stage TVD-RK method. The reason for this is that the Lax–Friedrichs and Engquist–Osher methods are low-order methods, while the time discretization is high order and the leading error is due to the spatial discretization. Fig. 2. View largeDownload slide Discrete adjoint computed using Engquist–Osher and Lax–Friedrichs schemes with two-stage TVD-RK method. Fig. 2. View largeDownload slide Discrete adjoint computed using Engquist–Osher and Lax–Friedrichs schemes with two-stage TVD-RK method. In Fig. 3, we observe that the total variation of the discrete adjoint at any time $$t<T$$ deviates from the final-time discrete adjoint, i.e., $${\boldsymbol p}_h^{n_T}$$, only up to machine precision. This implies that the TVD-RK method for the discrete adjoint is TV stable, even though it is not SSP. Fig. 3. View largeDownload slide The difference between total variation of the discrete adjoint at time $$t_n$$ and total variation of the final-time discrete adjoint. Fig. 3. View largeDownload slide The difference between total variation of the discrete adjoint at time $$t_n$$ and total variation of the final-time discrete adjoint. 5.1 Giles’ test case In this section, we perform numerical experiments on an optimal control problem that was first proposed by Giles (2003). Let us choose the setting of the problem of the previous section except for the objective functional which we now choose to be $$G(y) := y^5 - y$$. Note that since $$y(x,T) = 1$$ for $$x \in [-1,0)$$ and $$y(x,T) = -1$$ for $$x \in (0,1]$$, we have $$G'(y(x,T)) = 4$$ for $$x \in [-1,1]\setminus \{0\}$$. Moreover, since there is a shock in the solution at time $$T$$ at $$x=0$$ we should impose an ‘interior’ boundary condition for the adjoint state. Since $${[\![{ G(y(x,T)) }]\!]} = 0$$, we should set the ‘interior’ boundary condition to \begin{equation*} p(0,t) = 0 \quad \forall\, t \in [0,T]. \end{equation*} Therefore the adjoint state at time $$t=0$$ reads \begin{equation*} p(x,0) = \begin{cases} 4, & -1 < x < -\frac{1}{2}, \\ 0, & -\frac{1}{2} \leq x \leq \frac{1}{2}, \\ 4, & -\frac{1}{2} < x < 1. \end{cases} \end{equation*} If we choose the discretize-then-optimize approach, we do not impose such an ‘interior’ boundary condition for the discrete adjoint. It was first observed in Giles (2003) that the Lax–Friedrichs scheme provides a discrete adjoint that converges to a wrong adjoint (see Remark 3.3). We perform the numerical experiment for the Engquist–Osher scheme with a two-stage TVD-RK method and the number of cells $$N=800$$. In Fig. 4, we observe that the discrete adjoint has the correct value for $$x \in [-1,-\frac{1}{2}) \cup (\frac{1}{2},1]$$, but it has a wrong value for $$x \in [-\frac{1}{2},\frac{1}{2}]$$, i.e., $$p_h(x,0) = 0.25$$ for $$x \in [-\frac{1}{2}, \frac{1}{2}]$$. The wrong value in this region does not improve under refinement and the approximation converges to the value $$0.25$$ in this region. Note that the final-time condition $$p_h(0,T)$$ has a Dirac delta shape at $$x=0$$ due to the nonlinearity of the objective functional with the value of $$0.25$$. This is precisely the value that is transported backward in time. Fig. 4. View largeDownload slide The discrete adjoint computed using Engquist–Osher (left) and Lax–Friedrichs (right) schemes and a two-stage TVD-RK method for Giles’ test case. Fig. 4. View largeDownload slide The discrete adjoint computed using Engquist–Osher (left) and Lax–Friedrichs (right) schemes and a two-stage TVD-RK method for Giles’ test case. It is shown by Giles & Ulbrich (2010a,b) that the Lax–Friedrichs scheme converges to the correct adjoint provided a restrictive time step of type $$k = \mathcal{O}(h^{2-q})$$, for $$0<q<1$$; see Fig. 4 (right). Observe in Fig. 5 that the Lax–Friedrichs method converges in the shock funnel with $$k=\mathcal{O}(h^{1.2})$$. Fig. 5. View largeDownload slide Convergence of the discrete adjoint computed using the Lax–Friedrich method. Fig. 5. View largeDownload slide Convergence of the discrete adjoint computed using the Lax–Friedrich method. Such a time step increases diffusion in the scheme and allows more grid points to enter the shock and, hence, it leads to convergence of the discrete adjoint to the correct solution. This, however, is not possible with the classical definition of the Engquist–Osher scheme as such a diffusion is not present. In Fig. 6, we plot the deviation of the total variation of the discrete adjoint from the total variation of $$p_h(x,T)$$. Observe that we have $$p_h(x,t) \leq p_h(x,T)$$ for $$t \in [0,T]$$ up to machine precision, which agrees with the theoretical stability result for TVD-RK methods. Fig. 6. View largeDownload slide The difference between total variation of the discrete adjoint at time $$t_n$$ and total variation of the final time discrete adjoint for Giles’ test case. Fig. 6. View largeDownload slide The difference between total variation of the discrete adjoint at time $$t_n$$ and total variation of the final time discrete adjoint for Giles’ test case. 5.2 Convergence of the discrete adjoint We now validate the convergence result of Proposition 4.8 where we use mollification in order to guarantee that $$p(x,T) \in {\textrm{Lip}}_\text{loc}({\mathbb{R}})$$. For mollifying the end data, we use the smoothing kernel \begin{equation*} \varphi(x) = \begin{cases} \frac{1}{Z} \exp(-\frac{1}{1-x^2}), & -1 < x < 1, \\ 0 & \text{otherwise}, \end{cases} \end{equation*} where $$Z \in {\mathbb{R}}^+ \setminus \{0\}$$ is a normalization constant such that $$\int_{\mathbb{R}} \varphi(x) {\textrm{d}} x = 1$$. Given a positive constant $$\epsilon$$, we then denote the mollified final state by $$y_{\varphi,\epsilon}(x,T)$$ and define it by \begin{equation*} y_{\varphi,\epsilon}(x,T) := \int_{\mathbb{R}} \frac{1}{\epsilon} \varphi\left(\frac{z}{\epsilon} \right) y(x-z,T) \,{\textrm{d}} z. \end{equation*} The objective functional is given as before by $$J(y) := \int_{\mathbb{R}} G(y_{\varphi,\epsilon}(x,T)) {\textrm{d}} x$$ with $$G(y) = y^5 - y$$, and we consider the Burgers equation with prescribed boundary and initial conditions as in the previous section. Since $$y_{\varphi,\epsilon}$$ is smooth, we conclude that $$p(x,T) = G'(y_{\varphi,\epsilon}(x,T)) \in {\textrm{Lip}}_\text{loc}({\mathbb{R}})$$. Therefore, in the shock funnel we should obtain $$p(x,t) = G'(y_{\varphi,\epsilon}(0,T)) = -1$$ since $$y_{\varphi,\epsilon}(0,T) = 0$$. This yields that for the continuous adjoint at $$t=0$$, we have \begin{equation*} p(x,0) = -1 \quad \text{for } x \in \left[-\frac12, \frac12\right]\!, \end{equation*} for any given $$\epsilon$$. We discretize the mollification function by averaging on the mesh, i.e., $$\mathbf{\varphi} := (\varphi_1, \ldots, \varphi_N)$$ where \begin{equation*} \varphi_j := \frac{1}{h_j} \int_{x_{j-1/2}}^{x_{j+1/2}} \frac{1}{\epsilon} \varphi \left( \frac{x}{\epsilon} \right) \,{\textrm{d}} x \quad \forall\, j \in \mathbb{Z}, \end{equation*} and the discrete mollified state is given by \begin{equation*} y_{j,\varphi,\epsilon}^{n_T} = \sum_{j'} h_j^{}\, \varphi_j^{} y_{j-j'}^{n_T} \quad \forall\, j \in \mathbb{Z}. \end{equation*} For the numerical experiment, we choose, as before, the Engquist–Osher scheme and a two-stage TVD-RK method for the forward problem with $$N=800$$ and the time step is chosen as $$k = 0.25 h$$. In Fig. 7 (left), we plot the discrete adjoint obtained from the Engquist–Osher scheme with smoothed end data. As we observe, in the shock funnel at $$t=0$$ we recover the correct discrete adjoint. Fig. 7. View largeDownload slide Discrete adjoint at $$t=0$$ (left) and convergence of the discrete adjoint to its continuous counterpart in the shock funnel (right). Fig. 7. View largeDownload slide Discrete adjoint at $$t=0$$ (left) and convergence of the discrete adjoint to its continuous counterpart in the shock funnel (right). In Fig. 7 (right) we plot the convergence of the discrete adjoint in the shock funnel. Note that due to the smoothing effect of the mollifier, we recover convergence of the discrete adjoint in the shock funnel. This validates numerically the result of Proposition 4.8. Since we use a first-order scheme in space (for theoretical reasons), we do not expect to obtain an overall convergence rate of more than $$1$$. However, in Fig. 7 (right) we observe a slight improvement in the convergence rate due to the TVD-RK method. 5.3 A numerical optimal control problem In this section, we solve an optimization task using the gradient information obtained from a discrete adjoint. We set the objective functional to \begin{equation*} J(y) := \frac{1}{2} \int_{-1}^{1} |y(x,T) - y_\text{obs}(x)|^2 \,{\textrm{d}} x, \end{equation*} with the final time $$T = \frac{1}{2}$$, \begin{equation*} y_\text{obs}(x) := \begin{cases} 2x - \frac{1}{2}, & \frac{1}{4} \leq x \leq \frac{3}{4}, \\ 0 & \text{otherwise}, \end{cases} \end{equation*} and, Burgers flux function $$f(y) = \frac{1}{2}y^2$$. Given the current control $$u_{j}$$ (at iteration $$j$$), we compute the discrete adjoint at time $$T=0$$, i.e., $$p_j(x,0)$$, and choose $$\delta u = - \eta_j p_j(x,0),$$ where $$\eta_j \in {\mathbb{R}}_+$$ is a parameter to ensure $$J(y_h(u_{j+1})) < J(y_h(u_{j}))$$. Then the updated control at iteration $$j+1$$ reads $$u_{j+1}(x) = u_{j}(x) - \eta_j \, p_j(x,0).$$ (5.1) We choose $$\eta_j$$ by checking the Armijo condition and a back-tracking procedure, i.e., $$J(y_h(u_{j+1})) \leq J(y_h(u_{j})) - c_\text{opt} \eta_j {\Vert {p_j(x,0)} \Vert}_{{\textrm{L}}^2(-1,1)}^2,$$ (5.2) where $$c_\text{opt} \in (0,1)$$ is the Armijo constant. If the above Armijo condition is not satisfied we choose a smaller $$\alpha$$ by \begin{equation*} \alpha_\text{new} := \rho\, \alpha_\text{old}, \end{equation*} where $$0 < \rho < 1$$ and recheck (5.2). In this numerical experiment, we choose $$\rho = 0.95$$, $$c_\text{opt} = 0.5$$ and the initial $$\alpha = 0.5$$. As initial guess we fix \begin{equation*} u_0(x) := \begin{cases} (\frac{3}{4}+x)(\frac{1}{2}-x), & -\frac{3}{4} \leq x \leq \frac{1}{2}, \\ 0 & \text{otherwise}. \end{cases} \end{equation*} We choose the Engquist–Osher method for spatial discretization and Heun’s second-order TVD-RK method. The time step is chosen as $$k = 0.25 h,$$ and the stopping criterion is taken to be $${\Vert {\nabla \mathcal{J}_h(u_h)} \Vert}_{{\textrm{L}}^2({\it{\Omega}})} = {\Vert {p_j(x,0)} \Vert}_{{\textrm{L}}^2({\it{\Omega}})} \leq {\tt tol}$$. In our experiments we set $${\tt tol}= 10^{-2}$$ and $$10^{-4}$$. In Fig. 8, we observe that the numerical algorithm seems to converge to the true solution. That is, it captures correctly the shock location at $$x=\frac{3}{4}$$ and also the rarefaction. There are numerical artefacts at $$x=\frac{3}{4}$$ which vanish as we reduce the tolerance $$\tt tol$$. The corresponding initial guess for the control variable and the final control variable are plotted in Fig. 9. Fig. 8. View largeDownload slide The state variable $$y_h(x,T)$$ obtained from optimization algorithm with $${\tt tol}=10^{-2}$$ and $${\tt tol}=10^{-4}$$. Fig. 8. View largeDownload slide The state variable $$y_h(x,T)$$ obtained from optimization algorithm with $${\tt tol}=10^{-2}$$ and $${\tt tol}=10^{-4}$$. Fig. 9. View largeDownload slide The control variable $$u(x)$$ obtained from optimization algorithm with $${\tt tol}=10^{-2}$$ and $${\tt tol}=10^{-4}$$. Fig. 9. View largeDownload slide The control variable $$u(x)$$ obtained from optimization algorithm with $${\tt tol}=10^{-2}$$ and $${\tt tol}=10^{-4}$$. 6. Conclusion In this article, we studied TVD-RK methods for the numerical treatment of the optimal control problems in a discretize-then-optimize approach. We have shown that a TVD-RK discretization of the state equation yields a TVD-RK method for the discrete adjoint with a conjugate coefficient table. We then showed that requiring strong stability for both discrete state and adjoint is too strong and results in a first-order method. Luckily, imposing SSP properties for the discrete state is sufficient to obtain stability of the discrete adjoint. This result holds for an arbitrary $$s$$-stage TVD-RK method. Moreover, thanks to the linearity of the adjoint equation, the TVD-RK method for the discrete adjoint is consistent. We also studied the approximation properties of the discrete adjoint and showed that for a second-order two-stage method applied to the forward problem, we also obtain a second-order discrete adjoint. However, for the third-order three-stage method applied, we obtain only a second-order discrete adjoint. Inner stages of our theoretical results were finally illustrated by numerical experiments. We would like to finish this conclusion by mentioning that the convergence of the discrete adjoint to the continuous adjoint is an interesting question on its own. Funding German Research Foundation DFG through the collaborative research center SFB-TRR 154 under projects A02 and B02 and by the Research Center MATHEON through project C-SE5 and D-OT1 funded by the Einstein Center for Mathematics, Berlin. Appendix A. Existence of a minimizer Proposition A1 Let $$G(y) = \frac{1}{2} |y(x,T) - y_\text{obs}(x)|^2$$. Then the optimal control problem subject to the conservation law (2.1) has a solution in the admissible set $$U_\text{ad}$$. Proof. We begin with the continuity of the objective functional. Suppose $$y(x,t)$$ and $$w(x,t)$$ are the entropic solution of (2.1) with the initial data $$u(x) \in U_\text{ad}$$ and $$v(x) \in U_\text{ad}$$, respectively. Then we have \begin{equation*} | J(y) - J(w) | = \frac{1}{2} \left| \int_{\mathbb{R}} ( y - w )(y + w - 2 y_{\text{obs}} ) \right| \leq {\Vert { y - w } \Vert}_{{\textrm{L}}^1({\mathbb{R}})} {\Vert { y + w - 2 y_{\text{obs}} } \Vert}_{{\textrm{L}}^\infty({\mathbb{R}})}. \end{equation*} Then by $${\textrm{L}}^1$$-contraction of the entropic solutions we have \begin{equation*} \begin{array}{rcl} | J(y) - J(w) | &\leq& {\Vert { y(\cdot,T) - w(\cdot,T) } \Vert}_{{\textrm{L}}^1({\mathbb{R}})} ({\Vert {y(\cdot,T)} \Vert}_{{\textrm{L}}^\infty({\mathbb{R}})} + {\Vert {w(\cdot,T)} \Vert}_{{\textrm{L}}^\infty({\mathbb{R}})} + c) \\ &\leq& {\Vert { u - v } \Vert}_{{\textrm{L}}^1({\mathbb{R}})} ({\Vert {u} \Vert}_{{\textrm{L}}^\infty({\mathbb{R}})} + {\Vert {v} \Vert}_{{\textrm{L}}^\infty({\mathbb{R}})} + c) \\ &\leq& C {\Vert { u - v } \Vert}_{{\textrm{L}}^1({\mathbb{R}})}, \end{array} \end{equation*} where in the last step we used the $${\textrm{L}}^\infty$$-stability of the entropic solutions and the fact that $$u$$ and $$v$$ are in $$U_\text{ad}$$ and therefore we have a uniform bound on the $${\textrm{L}}^\infty$$-norm. Observe that $$J(y) \geq 0$$ and therefore a minimizing sequence denoted by $$\{u_i\}$$ exists. Since $$U_\text{ad}$$ is a compact set in $${\textrm{L}}^1$$, one can obtain a subsequence denoted by $$\{ u_{i_j} \}$$ that converges strongly in $${\textrm{L}}^1({\mathbb{R}})$$ to $$u^\star \in U_\text{ad}$$ as $$j \rightarrow \infty$$. Then using continuity of the objective functional we have \begin{equation*} \inf_{u \in U_\text{ad}} J(y) = \lim_{j \rightarrow \infty} J(y(u_{i_j})) = J(y(u^\star)) \quad \text{for } u^\star \in U_\text{ad}, \end{equation*} which shows existence of the minimizer. □ We would like to mention also that for a more general admissible set, e.g., $$U_\text{ad} := \left\{ u \in {\textrm{L}}^\infty({\mathbb{R}}) : \text{supp}(u) \in K, {\Vert {u} \Vert}_{{\textrm{L}}^\infty({\mathbb{R}})} \leq C \right\}\!,$$ (A.1) and an assumption on the uniform convexity of the flux function $$f(\cdot)$$, one can also obtain an existence result. We refer the reader to Castro et al. (2008) for the proof. Appendix B. Proof of Proposition 3.1 Proof. Since $$f(\cdot)$$ is $${\textrm{C}}^2$$ we have $$f(w+v) = f(w) + f'(w)v + \frac{1}{2} f''(z) v^2$$ for $$w,v \in {\mathbb{R}}$$ and some $$z \in (w,w+v)$$. For the Lax–Friedrichs flux, a direct calculation shows \begin{equation*} \hat{f}(w_j + v_j, w_{j+1} + v_{j+1}) = \hat{f}(w_j,w_{j+1}) + g_{i+1,i}^\text{LF} + \mathcal{O}(v_j^2) + \mathcal{O}(v_{j+1}^2) + \mathcal{O}(v_{j-1}^2). \end{equation*} For Engquist–Osher, we have \begin{equation*} \hat{f}(w_j + v_j, w_{j+1} + v_{j+1}) = \hat{f}(w_j,w_{j+1}) + \int_{w_j}^{w_j+v_j} f'(s)^+ \, {\textrm{d}} s + \int_{w_{j+1}}^{w_{j+1} + v_{j+1}} f'(s)^- \, {\textrm{d}} s. \end{equation*} Note that we can write $$f'(x)^+ = \frac{1}{2} (\,f'(x) + |f'(x)|)$$ and $$f'(x)^- = \frac{1}{2} (\,f'(x) - |f'(x)|)$$. Recall the definition of $$g_{j,j+1}^\text{EO}$$ in the proposition and define the residual $$r({\boldsymbol w},{\boldsymbol v}) : V_h \times V_h \rightarrow V_h$$ by $$[r({\boldsymbol w},{\boldsymbol v})]_j:= [F_h({\boldsymbol w}+{\boldsymbol v})]_j - [F_h({\boldsymbol w})]_j - [F_h'({\boldsymbol w}) {\boldsymbol v}]_j$$. Then we have $$[r({\boldsymbol w},{\boldsymbol v})]_j = q_{j,j+1} - q_{j-1,j}$$ where \begin{equation*} q_{j,j+1} = \int_{w_j}^{w_j+v_j} f'(s)^+ - f'(w_j)^+ \, {\textrm{d}} s + \int_{w_{j+1}}^{w_{j+1}+v_{j+1}} f'(s)^- - f'(w_{j+1})^- \, {\textrm{d}} s. \end{equation*} For the first term on the right-hand side we have \begin{equation*} \left| \int_{w_j}^{w_j+v_j} f'(s)^+ - f'(w_j)^+ \, {\textrm{d}} s \right| \leq \max_{z \in (w_j,w_j+v_j)} |f'(z)^+ - f'(w_j)^+ | \cdot |v_j| \leq C \, v_j^2, \end{equation*} since $$f'(x)^+$$ is Lipschitz continuous. The proof for the second term of $$q_{j,j+1}$$ is similar. Hence, we showed that for both Lax–Friedrichs and Engquist–Osher fluxes we have $$[r({\boldsymbol w},{\boldsymbol v})]_j = \mathcal{O}(v_j^2) + \mathcal{O}(v_{j+1}^2) + \mathcal{O}(v_{j-1}^2)$$, respectively. Taking the $$\ell^1$$-norm of $$r({\boldsymbol w},{\boldsymbol v})$$ one obtains \begin{equation*} {\Vert {r({\boldsymbol w},{\boldsymbol v})} \Vert}_{\ell^1} \leq C {\Vert { {\boldsymbol v} } \Vert}_{\ell^2}^2 \leq C {\Vert { {\boldsymbol v} } \Vert}_{\ell^1}^2, \end{equation*} where $$C$$ is independent of the mesh parameter. Multiplying both sides by $$h$$ gives \begin{equation*} {\Vert {r(w,v)} \Vert}_{{\textrm{L}}^1({{\it{\Omega}}})} \leq C {\Vert { v } \Vert}_{{\textrm{L}}^1({{\it{\Omega}}})} \, {\Vert { {\boldsymbol v}} \Vert}_{\ell^1}, \end{equation*} which shows that $${\Vert {r(w,v)} \Vert}_{{\textrm{L}}^1({{\it{\Omega}}})} / {\Vert { v } \Vert}_{{\textrm{L}}^1({{\it{\Omega}}})} \rightarrow 0$$ uniformly when $${\boldsymbol v}$$ converges to zero. This completes the proof for the derivative. A direct calculation yields the formula for the transpose. □ References Banda, M. K. & Herty, M. ( 2012 ) Adjoint IMEX-based schemes for the numerical solution of optimal control problems governed by conservation laws. Hyperbolic Problems—Theory, Numerics and Applications ( Tatsien L. & Jiang S. eds), vol. 2. Ser. Contemp. Appl. Math. CAM , vols. 17 & 18 . Singapore : World Scientific Publishing , pp. 297 – 303 . Google Scholar CrossRef Search ADS Bouchut, F. & James, F. ( 1998 ) One-dimensional transport equations with discontinuous coefficients. Nonlinear Anal. , 32 , 891 – 933 . Google Scholar CrossRef Search ADS Bressan, A. & Guerra, G. ( 1997 ) Shift-differentiability of the flow generated by a conservation law. Discrete Contin. Dyn. Syst. , 3 , 35 – 58 . Castro, C., Palacios, F. & Zuazua, E. ( 2008 ) an alternating descent method for the optimal control of the inviscid Burgers equation in the presence of shocks. Math. Models Methods Appl. Sci. , 18 , 369 – 416 . Google Scholar CrossRef Search ADS Chechkin, G. A. & Goritsky, A. Y. ( 2009 ) S. N. Kruzhkov’s lectures on first-order quasilinear PDEs. Analytical and Numerical Aspects of Partial Differential Equations ( Emmrich E. & Wittbold P. eds). Berlin : Walter de Gruyter , pp. 1 – 67 .( Translated from the Russian by Boris P. Andreianov. ), https://www.degruyter.com/viewbooktoc/product/40206. Conway, E. D. ( 1967 ) Generalized solutions of linear differential equations with discontinuous coefficients and the uniqueness question for multidimensional quasilinear conservation laws. J. Math. Anal. Appl. , 18 , 238 – 251 . Google Scholar CrossRef Search ADS Evans, L. C. ( 2010 ) Partial Differential Equations . Graduate Studies in Mathematics , vol. 19 , 2nd edn. Providence, RI : American Mathematical Society , pp. xxii+749 . Giles, M. B. ( 2003 ) Discrete adjoint approximations with shocks. Hyperbolic Problems: Theory, Numerics, Applications ( Hou T. Y. & Tadmor E. eds), Heidelberg, Berlin : Springer , pp. 185 – 194 . Google Scholar CrossRef Search ADS Giles, M. & Ulbrich, S. ( 2010a ) Convergence of linearized and adjoint approximations for discontinuous solutions of conservation laws. Part 1: linearized approximations and linearized output functionals. SIAM J. Numer. Anal. , 48 , 882 – 904 . Google Scholar CrossRef Search ADS Giles, M. & Ulbrich, S. ( 2010b ) Convergence of linearized and adjoint approximations for discontinuous solutions of conservation laws. Part 2: adjoint approximations and extensions. SIAM J. Numer. Anal. , 48 , 905 – 921 . Google Scholar CrossRef Search ADS Gottlieb, S. & Shu, C.-W. ( 1998 ) Total variation diminishing Runge-Kutta schemes. Math. Comp. , 67 , 73 – 85 . Google Scholar CrossRef Search ADS Hager, W. W. ( 2000 ) Runge-Kutta methods in optimal control and the transformed adjoint system. Numer. Math. , 87 , 247 – 282 . Google Scholar CrossRef Search ADS Harten, A. ( 1983 ) High resolution schemes for hyperbolic conservation laws. J. Comput. Phys. , 49 , 357 – 393 . Google Scholar CrossRef Search ADS James, F. & Sepúlveda, M. ( 1999 ) Convergence results for the flux identification in a scalar conservation law. SIAM J. Control Optim. , 37 , 869 – 891 ( electronic ). Google Scholar CrossRef Search ADS Kružkov, S. N. ( 1970 ) First order quasilinear equations with several independent variables. Math. Sb. (N.S.) , 81 , 228 – 255 . LeVeque, R. J. ( 1990 ) Numerical Methods for Conservation Laws . Lectures in Mathematics ETH Zürich. Basel : Birkhäuser , pp. x+214 . Google Scholar CrossRef Search ADS Lucier, B. J. ( 1986 ) A moving mesh numerical method for hyperbolic conservation laws. Math. Comp. , 46 , 59 – 69 . Google Scholar CrossRef Search ADS Osher, S. & Tadmor, E. ( 1988 ) On the convergence of difference approximations to scalar conservation laws. Math. Comp. , 50 , 19 – 51 . Google Scholar CrossRef Search ADS Qiu, J.-M. & Shu, C.-W. ( 2008 ) Convergence of Godunov-type schemes for scalar conservation laws under large time steps. SIAM J. Numer. Anal. , 46 , 2211 – 2237 . Google Scholar CrossRef Search ADS Ruuth, S. J. & Spiteri, R. J. ( 2002 ) Two barriers on strong-stability-preserving time discretization methods. J. Scient. Comp. , 17 , 211 – 220 . Google Scholar CrossRef Search ADS Sanders, R. ( 1988 ) A third-order accurate variation nonexpansive difference scheme for single nonlinear conservation laws. Math. Comp. , 51 , 535 – 558 . Google Scholar CrossRef Search ADS Shu, C.-W. & Osher, S. ( 1988 ) Efficient implementation of essentially nonoscillatory shock-capturing schemes. J. Comput. Phys. , 77 , 439 – 471 . Google Scholar CrossRef Search ADS Spiteri, R. J. & Ruuth, S. J. ( 2002 ) A new class of optimal high-order strong-stability-preserving time discretization methods. SIAM J. Numer. Anal. , 40 , 469 – 491 ( electronic ). Google Scholar CrossRef Search ADS Ulbrich, S. ( 1999 ) On the existence and approximation of solutions for the optimal control of nonlinear hyperbolic conservation laws. Optimal Control of Partial Differential Equations: International Conference in Germany (Chemnitz, 1998) ( Hoffmann, K.-H. Leugering, G. Tröltzsch F. & Caesar S. eds). Internat. Ser. Numer. Math., vol. 133. Basel : Birkhäuser, pp. 287 – 299 . Google Scholar CrossRef Search ADS Ulbrich, S. ( 2001 ) Optimal Control of Nonlinear Hyperbolic Conservation Laws with Source Terms . Habilitation Thesis, Munich, Germany : Technische Universität München. Ulbrich, S. ( 2002 ) A sensitivity and adjoint calculus for discontinuous solutions of hyperbolic conservation laws with source terms. SIAM J. Control Optim. , 41 , 740 – 797 ( electronic ). Google Scholar CrossRef Search ADS Ulbrich, S. ( 2003 ) Adjoint-based derivative computations for the optimal control of discontinuous solutions of hyperbolic conservation laws. Systems Control Lett. , 48 , 313 – 328 . Google Scholar CrossRef Search ADS Zhang, X. & Shu, C.-W. ( 2010 ) A genuinely high order total variation diminishing scheme for one-dimensional scalar conservation laws. SIAM J. Numer. Anal. , 48 , 772 – 795 . Google Scholar CrossRef Search ADS © The authors 2017. Published by Oxford University Press on behalf of the Institute of Mathematics and its Applications. All rights reserved. http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png IMA Journal of Numerical Analysis Oxford University Press

# Total variation diminishing schemes in optimal control of scalar conservation laws

, Volume Advance Article – Dec 14, 2017
36 pages

/lp/ou_press/total-variation-diminishing-schemes-in-optimal-control-of-scalar-rUe0Vv0j0E
Publisher
Oxford University Press
ISSN
0272-4979
eISSN
1464-3642
D.O.I.
10.1093/imanum/drx073
Publisher site
See Article on Publisher Site

### Abstract

Abstract Optimal control problems subject to a nonlinear scalar conservation law, the state system, are studied. Such problems are challenging at both the continuous level and the discrete level since the control-to-state operator poses difficulties such as nondifferentiability. Therefore, discretization of the control problem has to be designed with care to provide stability and convergence. Here, the discretize-then-optimize approach is pursued, and the state is removed by the solution of the underlying state system, thus providing a reduced control problem. An adjoint calculus is then applied for computing the reduced gradient in a gradient-related descent scheme for solving the optimization problem. The time discretization of the underlying state system relies on total variation diminishing Runge–Kutta (TVD-RK) schemes, which guarantee stability, best possible order and convergence of the discrete adjoint to its continuous counterpart. While interesting in its own right, it also influences the quality and accuracy of the numerical reduced gradient and, thus, the accuracy of the numerical solution. In view of these demands, it is proven that providing a state scheme that is a strongly stable TVD-RK method is enough to ensure stability of the discrete adjoint state. It is also shown that requiring strong stability for both the discrete state and adjoint is too strong, confining one to a first-order method, regardless of the number of stages employed in the TVD-RK scheme. Given such a discretization, we further study order conditions for the discrete adjoint such that the numerical approximation is of the best possible order. Also, convergence of the discrete adjoint state towards its continuous counterpart is studied. In particular, it is shown that such a convergence result hinges on a regularity assumption at final time for a tracking-type objective in the control problem. Finally, numerical experiments validate our theoretical findings. 1. Introduction We are considering an optimal control problem of type \begin{equation*} \min_{u \in U_\text{ad}} \int_{\mathbb{R}} G(y(x,T)) \, \textrm{d} x, \end{equation*} subject to a scalar conservation law, i.e., \begin{equation*} \begin{array}{rcll} y_t + f(y)_x &=& 0 & \text{in } \mathbb{R} \times \mathbb{R}_+, \\ y(x,0) &=& u(x) & \text{in } \mathbb{R}. \end{array} \end{equation*} Here $$U_\text{ad}$$ is called the admissible set, and it is assumed to be nonempty, convex and closed. The state $$y(x,t)$$ is considered to be the entropic (weak) solution of the scalar conservation law and $$u(x),$$ the control, is the initial data of the partial differential equation (PDE). Although the definition of the optimal control problem seems to be simple, the PDE constraint (conservation law) poses severe difficulties for the analysis of such problems, both at the continuous level and at the discrete level. The major problem is the possible formation of a shock in the state $$y(x,t)$$ at a finite time even for very smooth initial data $$u(x)$$, when the flux function $$f(\cdot)$$ is nonlinear. Moreover, it is easy to show through examples that the control-to-state map is not Gâteaux differentiable when shocks are present. This poses a significant problem for obtaining the derivative of the cost functional of a (control) reduced version of the underlying optimal control problem. Luckily, a generalized definition of the derivative, called the ‘shift derivative’, for the control-to-state map has been derived by Ulbrich (2001, 2002), which implies Fréchet differentiability of the cost functional. Such differentiability results enable us to compute the derivative of the cost functional using an adjoint approach. In Section 2, we state the underlying optimal control in a rigorous context by recalling weak and entropic solutions of scalar conservation laws and their properties as well as the concept of shift differentiability. The difficulties that arise from the nature of the PDE are also reflected at the discrete level, i.e., one should discretize the problem with care. Monotone schemes, which we recall in Section 3, are among successful discretizations for conservation laws and their theory is well understood. We use monotone discretizations in space to obtain a semidiscrete formulation and then we discretize in time using a total variation diminishing (TVD) Runge–Kutta (RK) scheme. TVD-RK methods are a class of RK methods that guarantee, under quite mild assumptions, that the discrete solution is total variation stable (also called ‘strong stability preserving (SSP)’ methods). We then obtain the fully discrete optimal control problem by discretizing the objective functional. Similar to the continuous level, one can obtain the derivative of the cost functional using the adjoint calculus. The properties of the discrete adjoint are intimately related to the discretization of the discrete state. The TVD-RK method for the discrete state can be characterized by two sets of positive coefficients $$\{ \alpha_{ij} \}$$ and $$\{ \beta_{ij} \}$$. We show that the corresponding discrete adjoint is also obtained by a TVD-RK method, where the coefficients are ‘conjugates’ of $$\{ \alpha_{ij} \}$$ and $$\{ \beta_{ij} \}$$. Therefore, we will study in Section 4 the stability and the approximation of the discrete adjoint. In particular, we discover the following properties: Proposition 4.4: Imposing strong stability on both discrete state and discrete adjoint is too strong and it results in a first-order time discretization. Theorem 4.5: Imposing strong stability on the discrete state is enough to give stability of the discrete adjoint. Theorem 4.6 and Theorem 4.7: Any two-stage second-order TVD-RK method for the discrete state results in a second-order adjoint approximation. Any three-stage third-order TVD-RK method for the discrete state results in a second-order adjoint approximation. Hager (2000) showed that for certain third-order RK methods, the resulting discrete adjoint is only second order. Theorem 4.7 shows that this is the case for the class of TVD-RK methods. The study of the differentiability of the control-to-state map when shocks are present was started by Bressan & Guerra (1997). The authors show that this map, in general, is not differentiable. A similar problem formulated in terms of a minimization task was studied by James & Sepúlveda (1999), where the derivative of the objective functional was obtained when the state is smooth and does not contain shocks. Later Ulbrich (2001, 2002) analysed the shift differentiability of the so-called control-to-state operator and showed Fréchet differentiability of the reduced objective functional in the presence of shocks. Moreover, an adjoint procedure to compute the mentioned derivative was introduced and analysed. For the nonlinear conservation law (2.1), it has been observed that not all RK methods can ensure TVD properties of the approximation, i.e., oscillations occur near discontinuities (see, e.g., the example in Gottlieb & Shu, 1998, Section 2). Shu & Osher (1988) constructed a class of RK methods that ensures the approximation is TVD, the so-called TVD-RK methods. The main idea is to use convex combinations of the forward Euler method to construct a high-order approximation. If the Euler step is stable in some (semi)norm, then under some mild conditions the convex combination of the Euler steps is stable too. Order conditions are also derived by Shu & Osher (1988) for second and third orders with two and three stages, respectively. We should also remark that the derivation of such conditions remains formal as often the solution of the hyperbolic problem does not possess the required regularity. It is proven by Gottlieb & Shu (1998) that a fourth-order method with four stages does not exist. However, a fourth-order five-stage method was discovered by Spiteri & Ruuth (2002). Ruuth & Spiteri (2002) also showed that methods beyond fourth order of any number of stages do not exist. For the numerical treatment of such optimization problems and for a particular objective functional, Giles (2003) showed that the classical Lax–Friedrichs scheme leads to a discrete adjoint that converges to a wrong solution. Later, Giles & Ulbrich (2010a,b) and Ulbrich (2001) showed under a restrictive time step that the classical Lax–Friedrichs scheme yields a convergent discrete adjoint. Higher-order discretizations based on relaxation of the conservation law were also introduced and studied by Banda & Herty (2012). We finish this brief review of the literature by recalling an alternating descent method introduced by Castro et al. (2008) as a solution technique for such optimization problems. 2. Optimal control of scalar conservation laws An abstract optimal control problem can be formulated as $$\min\quad J(y) \quad \text{subject to} \quad S(u) = y, \quad u \in U_\text{ad},$$ (P) where $$y$$ is called the state variable and $$u$$ is the control variable, with the latter belonging to an admissible set $$U_\text{ad}$$. The objective functional is denoted by $$J(\cdot)$$, and it is assumed to be differentiable. The control and state variables are related through a control-to-state map $$S(\cdot)$$, which can be regarded as a solution operator of the underlying PDE. Obviously, the control-to-state $$S(\cdot)$$ influences existence and uniqueness of the optimal control problem as well as optimality conditions that characterize the optimal solution. In this work, we consider $$S(\cdot)$$ to be the solution operator of the following one-dimensional scalar conservation law (Cauchy problem): $$\begin{array}{rcll} y_t + f(y)_x &=& 0 & \text{in } \mathbb{R} \times \mathbb{R}_+, \\ y(x,0) &=& u(x) & \text{in } \mathbb{R}, \end{array}$$ (2.1) where $$u \in \textrm{L}^\infty(\mathbb{R})$$ is the initial data with compact support in a bounded interval $$K \subset \mathbb{R}$$ and $$y(x,t)$$ is the conserved variable. This motivates us to define the admissible set $$U_{\text{ad}}$$ by $$U_\text{ad} := \left\{ u \in \textrm{L}^\infty(\mathbb{R}) : \text{supp}(u) {\subset} K, \Vert {{u}} \Vert_{\textrm{BV}(\mathbb{R})} \leq C \right\}\!,$$ (2.2) where $$C>0$$ is a constant, $$\text{supp}(\cdot)$$ denotes the support of a function, i.e., $$\text{supp}(u) = \{ x \in \mathbb{R} : u(x) \not = 0 \}$$, and $$\textrm{L}^\infty(\mathbb{R})$$ is the Lebesgue space of essentially bounded functions on $$\mathbb{R}$$ with norm $$\Vert {{\cdot}} \Vert_{\textrm{L}^\infty(\mathbb{R})}$$. In this article, we assume that the so-called flux function $$f(\cdot)$$ is in $$\textrm{C}^2(\mathbb{R})$$ and satisfies $$f'' > 0$$. A particular example is $$f(y) = \frac{1}{2} y^2$$ which gives rise to the inviscid Burgers equation. Concerning the objective functional we study the so-called tracking type functional, i.e., $$J(y) := \int_{\mathbb{R}} G(y(x,T)) \, \textrm{d} x,$$ (2.3) where $$T>0$$ is a final time. A common example is $$G(y(x,T)) := |y(x,T)-y_\text{obs}(x)|^2$$ with $$y_\text{obs} \in \textrm{L}^2(\mathbb{R})$$ given. Classical solutions of (2.1) can be constructed by the method of characteristics. However, due to the possible nonlinearity of the flux function $$f$$, classical solutions break down in finite time even for very smooth initial data. Therefore, we consider generalized (weak) solutions of (2.1) in integral form: the function $$y \in \textrm{L}^\infty(\mathbb{R} \times \mathbb{R}_+)$$ is called a weak solution of (2.1) if it satisfies $$\int_{\mathbb{R} \times \mathbb{R}_+} y \, \varphi_t + f(y) \, \varphi_x \, \textrm{d} x \,\textrm{d} t + \int_{\mathbb{R}} u(x) \varphi(x,0) \, \textrm{d} x = 0 \quad \forall\,\varphi \in \textrm{C}^\infty_c(\mathbb{R} \times \mathbb{R}_+),$$ (2.4) where $$\textrm{C}^{\infty}_c(\mathbb{R} \times \mathbb{R}_+)$$ represents the space of infinitely differentiable functions with compact supports in $$\mathbb{R} \times \mathbb{R}_+$$, with $$\mathbb{R}_+ = [0,+\infty)$$. There might be more than one weak solution for given initial data. For instance, consider the following example: let $$f(y)=\frac{1}{2} y^2$$, and $$u(x) = 0$$ for $$x<0$$ and $$u(x) = 1$$ for $$x\geq0,$$ which yields discontinuous initial data. Then the following two weak solutions are known: \begin{equation*} y_1(x,t) = \begin{cases} 0, & x < \frac{1}{2}t, \\ 1, & x\geq \frac{1}{2}t, \end{cases} \quad \text{and} \quad y_2(x,t) = \begin{cases} 0, & x<0, \\ x/t, & 0 \leq x < t, \\ 1, & t \leq x. \end{cases} \end{equation*} The lack of uniqueness can be overcome by picking the physically relevant or entropy solution. Therefore, we impose an extra constraint on the weak solution: a function $$y \in \textrm{L}^\infty(\mathbb{R} \times \mathbb{R}_+)$$ is called an entropy solution (in the sense of Kružkov, 1970) if it satisfies $$\int_{\mathbb{R} \times \mathbb{R}_+} \eta(y) \, \varphi_t + q(y) \, \varphi_x \, \textrm{d} x \, \textrm{d} t + \int_{\mathbb{R}} \eta(u(x)) \varphi(x,0) \, \textrm{d} x \geq 0 \quad \forall\, \varphi \in \textrm{C}^\infty_c(\mathbb{R} \times \mathbb{R}_+), \, \varphi \geq 0,$$ (2.5) where $$\eta(y) := |y-k|$$ and $$q(y) = {\text{sign}}(y-k) (f(y)-f(k))$$ for all $$k\in \mathbb{R}$$. One can show that if $$y$$ satisfies (2.5) then it is also a weak solution in the sense of (2.4) (see, e.g., Chechkin & Goritsky, 2009, (Section 5.5). Kružkov then shows that such an entropy solution is indeed unique. For the following result, we introduce $$\textrm{BV}(\mathbb{R})$$, the space of the functions of bounded variations on $$\mathbb{R}$$, i.e., $$u \in \textrm{BV}(\mathbb{R})$$ iff $$u \in \textrm{L}^1(\mathbb{R})$$ and $$\textrm{TV}(u) := \sup \{ \int_{\mathbb{R}} u \, p' \, \textrm{d} x : p \in \textrm{C}^1_c(\mathbb{R}), |p| \leq 1 \text{ a.e. in } \mathbb{R} \}$$. Here ‘a.e.’ stands for ‘almost everywhere’. Endowed with $$\Vert {{u}} \Vert_{\textrm{BV}(\mathbb{R})} = \Vert {{u}} \Vert_{\textrm{L}^1(\mathbb{R})} + \textrm{TV}(u)$$, it is a Banach space. The following theorem is due to Kružkov (see Kružkov, 1970). We refer the reader to Evans (2010, Section 11.4.3) for a proof. Theorem 2.1 Let $$u \in \textrm{L}^\infty(\mathbb{R})$$ in (2.5); then there exists a unique entropy solution $$y$$ of (2.5) that satisfies $$\Vert {{y(\cdot,t)}} \Vert_{\textrm{L}^\infty(\mathbb{R})} \leq \Vert {{u}} \Vert_{\textrm{L}^\infty(\mathbb{R})}$$ for all $$t > 0$$. Moreover, let, $$y_1$$ and $$y_2$$ be two entropy solutions corresponding to initial data $$u_1, u_2 \in \textrm{L}^\infty(\mathbb{R}) \cap \textrm{L}^1(\mathbb{R})$$, respectively. Then we have \begin{equation*} \Vert {{y_1(\cdot,t) - y_2(\cdot,t)}} \Vert_{\textrm{L}^1(\mathbb{R})} \leq \Vert {{u_1 - u_2}} \Vert_{\textrm{L}^1(\mathbb{R})} \quad \forall\, t>0. \end{equation*} Finally, if $$u \in \textrm{BV}(\mathbb{R})$$ then $$\textrm{TV}(y(\cdot,t)) \leq \textrm{TV}(u)$$ for all $$t>0$$. The notion of entropy solutions enables us to state our optimal control problem in a meaningful way. Therefore, our optimal control problem (P) is completed by setting the solution operator $$S(\cdot)$$ to be the entropy solution of the conservation law at time $$T$$. We will see in Section 2.1 that $$S(\cdot)$$ is not Gâteaux differentiable when discontinuity (shocks) appear in the state $$y$$. Therefore, a different approach in defining a derivative of the control-to-state map, called a shift derivative, is derived that leads to Fréchet differentiability of the objective functional. Using this setting, existence of a minimizer can be shown for the underlying problem (see Appendix A). We also refer to Castro et al. (2008) and Ulbrich (1999) for a more general case. Uniqueness is, however, not guaranteed since we can construct control examples that lead to the same state that minimizes the objective functional: indeed, consider the following controls: \begin{equation*} u_1 = \begin{cases} 1, & -2 \leq x < 0, \\[2pt] -1, & 0 \leq x \leq 2, \\[2pt] 0 & \text{otherwise}, \end{cases} \quad u_2 = \begin{cases} 1, & -2 \leq x < -1, \\[2pt] -x, & -1 \leq x \leq 1, \\[2pt] -1, & 1 \leq x \leq 2, \\[2pt] 0 & \text{otherwise}, \end{cases} \end{equation*} for the Burgers equation, i.e., $$f(y)=\frac{1}{2}y^2$$. We fix the final time $$T=2$$. One can construct the corresponding entropy solution for each initial data. A direct calculation by the method of characteristics shows that at time $$t\geq T$$ both initial data give \begin{equation*} y(x,t) = \begin{cases} \frac{1}{t}(x + 2), & -2 \leq x \leq 0, \\[2pt] \frac{1}{t}(-x+2), & 0 < x \leq 2, \\[2pt] 0 & \text{otherwise}. \end{cases} \end{equation*} Setting $$y_\text{obs} := y(x,T)$$, we then have $$J(S(u_1)) = J(S(u_2)) = 0$$, i.e., two optimal controls for (P). Remark 2.2 (Flux identification problem) A different, yet similar optimal control problem of conservation laws can be formulated in which the control variable is the flux function and the initial data are fixed. More precisely , we look for a flux function $$f \in \mathcal{F}_\text{ad}$$, where $$\mathcal{F}_\text{ad}$$ is the admissible set, minimizing a given objective functional. For instance, $$f$$ may have a closed analytical form, perhaps transforming the optimal control problem into $$\mathbb{R}^n$$. In this case, the solution operator is defined as $$y = S(f)$$. The existence of the minimum can be proved by a continuity result due to Lucier (1986), \begin{equation*} \Vert {{ S(\,f)(\cdot,t) - S(g)(\cdot,t) }} \Vert_{\textrm{L}^1(\mathbb{R})} \leq t \Vert {{ f- g }} \Vert_\text{Lip} \, \Vert {{ u }} \Vert_{\textrm{BV}}, \end{equation*} and an assumption on the compactness of $$\mathcal{F}_\text{ad}$$ (see James & Sepúlveda, 1999, Section 2.2, for details). Uniqueness is again not guaranteed in general (see the discussion in James & Sepúlveda, 1999). 2.1 Shift differentiability and adjoint calculus As we have seen, the control-to-state operator is a delicate object and needs special care for the forward problem (2.1) to be well posed. We have also seen that although the optimal control problem (P) admits a minimizer, uniqueness cannot be expected in general. We now turn our attention to characterizing such a minimum. For the numerical treatment, first-order characterizations of such minimizers are of importance. This is our next subject of study. A first step in deriving optimal conditions for the problem is investigating differentiability of the objective functional. Note that the objective functional defined by (2.3) is well behaved, in the sense that $$G$$ is sufficiently smooth and allows existence of a solution and only the solution operator $$S(\cdot)$$ may cause problems. Thus, using the chain rule, we must investigate differentiability of $$S(\cdot)$$. It turns out that in the presence of shocks, $$S(\cdot)$$ is not Gâteaux differentiable as we illustrate in the following example borrowed from Bressan & Guerra (1997). Example 2.3 Suppose the initial data are $$u_\epsilon = (1+\epsilon) x \cdot \mathbf{1}_{[0,1]}(x)$$, where $$\epsilon \in \mathbb{R}$$ and $$\mathbf{1}_{[0,1]}(x)$$ denotes the indicator function of the interval $$[0,1]$$. Consider the Burgers equation. We can construct the entropy solution by the method of characteristics: \begin{equation*} y_\epsilon(x,t) = \frac{(1+\epsilon)x}{1+(1+\epsilon)t} \cdot \mathbf{1}_{[0,\sqrt{1+(1+\epsilon)t}]}(x). \end{equation*} Note that the shock position now depends on the perturbation, i.e., $$x_s(t) = \sqrt{1+(1+\epsilon)t}$$, and at time $$t=0$$ the derivative exists \begin{equation*} v(x) = \lim_{\epsilon \rightarrow 0} \epsilon^{-1} ( u_\epsilon - u_0 ) = x \cdot \mathbf{1}_{[0,1]}(x). \end{equation*} However, at time $$t>0,$$ there is no such derivative $$v(x) \in \textrm{L}^1(\mathbb{R})$$. In fact $$\epsilon^{-1}(y_\epsilon(\cdot,t) - y_0(\cdot,t) )$$ converges as $$\epsilon \rightarrow 0$$ in the sense of distributions to $$\frac{x}{(1+t)^2}\cdot \mathbf{1}_{[0,\sqrt{1+t}]} + \frac{t}{2(t+1)} \delta_{\sqrt{1+t}},$$ (2.6) where $$\delta_{x}$$ is the Dirac delta function located at $$x$$. Note that in the previous example, the distributional derivative has a continuous part and a singular part due to a shift in the shock location. Moreover, the perturbed shock location, $$x_s(t) = \sqrt{1+(1+\epsilon)t}$$, is differentiable with respect to the perturbation and the solution, $$y_\epsilon(x,t)$$, vary differentiably in the left and right of the shock. This motivates a first-order approximation of (2.6) introduced in Ulbrich (2001, 2002). Let $$u \in U \subset \mathcal{U}$$ where $$U$$ is an open set and $$\mathcal{U}$$ a Banach space. Moreover, $$u$$ is such that $$y_u := S(u)$$ has bounded variation and its support is in $$K \subset \mathbb{R}$$. For the sake of presentation, suppose, for instance, that $$u$$ has compact support, contains a shock at $$x_s(0)$$ and is a $$\textrm{C}^1$$ function on either side of the shock. Suppose the shock remains in the solution up to time $$T$$ and its position is given by $$x_s(t)$$. We perturb $$u$$ by $$\delta u \in U$$ such that $$y_{u+\delta u} := S(u+\delta u) \in \textrm{L}^\infty(\mathbb{R}) \cap \textrm{BV}(\mathbb{R})$$. Then we can define a first-order approximation of $$y_{u+\delta u} - y_u$$ by the shift variation \begin{equation*} \delta S (\delta y, s) := \delta y + {\text{sign}}(s) [\![{y(x_s(T),T)}]\!] \mathbf{1}_{[x_s(T), x_s(T)+s]} \in \textrm{L}^1(K), \end{equation*} where $$[\![{y(x_s(T),T)}]\!] = y(x_s^{-}(T),T) - y(x_s^{+}(T),T)$$ and $$(\delta y, s) \in \textrm{L}^1(K) \times \mathbb{R}$$ depends linearly on $$\delta u$$. Here $$\delta y$$ corresponds to the variation of the solution in the continuous part and $$s$$ is a linear approximation of the shock shift (e.g., in Example 2.3, $$x_s(t)$$ is differentiable with respect to $$\epsilon$$). More precisely we suppose that there exists a bounded linear operator \begin{equation*} T(u) \in \mathcal{L}(\mathcal{U}, \textrm{L}^r(K) \times \mathbb{R}), \quad r \in (1,\infty], \end{equation*} such that $$(\delta y, s) := T(u)\cdot \delta u$$. We say $$y_u$$ is shift differentiable at $$u$$ if \begin{equation*} \lim_{\delta u \rightarrow 0} \Vert {{\delta u}} \Vert_{\mathcal{U}}^{-1} \cdot \Vert {{ y_{u+\delta u} - y_u - \delta S(T(u)\cdot \delta u) }} \Vert_{\textrm{L}^1(K)}^{} = 0. \end{equation*} It is proven that shift differentiability implies Fréchet differentiability of the objective functional (see, e.g., Ulbrich, 2001, Section 3.2.2). Moreover, it is shown in Ulbrich (2001, Theorem 3.3.2) that, under some technical assumptions, entropy solutions are continuously shift differentiable and therefore the objective functional is Fréchet differentiable. Although the shift differentiability result is a useful tool in proving differentiability of the objective functional using sensitivities, it is often more convenient to obtain the objective functional’s derivative using an adjoint calculus. For the conservation laws with smooth solution, James and Sepùlveda derived such an adjoint-based derivative; see James & Sepúlveda (1999, Section 2.4] and the references therein. Later Ulbrich generalized the adjoint calculus to the case where the solution contains shocks. Here, we only state the result and refer the interested reader to Ulbrich (2001, 2002) for details. For a formal derivation, see Giles & Ulbrich (2010a). Suppose the perturbation $$\delta u$$ has the same structure as $$u$$, i.e., it contains a shock discontinuity at $$x_s(0)$$ and is piecewise $$\textrm{C}^1$$ on either side of the shock. Then the derivative of $$\mathcal{J}(u) := J(S(u))$$ in the direction of the $$\delta u$$ (perturbation in the initial data) is given by $$\mathcal{J}'(u) \delta u = \int_{\mathbb{R}} p(x,0) \, \delta u(x) \, \textrm{d} x,$$ (2.7) where $$p(x,t)$$ is the adjoint variable that satisfies the following backward equation with final-time condition: $$\begin{array}{rcll} p_t + f'(y) p_x &=& 0 & \text{in } \mathbb{R} \times (0,T), \\ p(x,T) &=& G'(y(x,T)) & \text{in } \mathbb{R}. \end{array}$$ (2.8) In the case $$y(x,t)$$ contains a shock which travels along $$x_s(t)$$, we need to impose an interior boundary condition for the adjoint along $$x_s(t)$$: $$p( x_s(t), t) = \frac{[\![{G(y)}]\!] }{ [\![{y}]\!] } \Big|_{(x_s(T),T)} \quad \forall\, t \in [0,T].$$ (2.9) The adjoint equation (2.8) is backward in time and is a nonconservative hyperbolic PDE with final datum and has been studied by many authors (see, e.g., James & Sepúlveda, 1999; Ulbrich, 2003). Let us first consider the case when $$p(x,T) \in \textrm{Lip}_{{\rm loc}}(\mathbb{R})$$, e.g., shocks at the final time are smoothed in the objective functional. A one-sided Lipschitz-continuity condition (OSLC) on $$f'(y)$$, i.e., \begin{equation*} \text{ess} \sup_{x\not=z} \left( \frac{f'(y(x,t)) - f'(y(z,t))}{x-z} \right)^+ \leq m(t), \end{equation*} where $$m(t) \in \textrm{L}^1(0,T)$$, guarantees that the generalized backward characteristics starting from time $$T$$ do not intersect. Then the existence of at least one Lipschitz continuous solution can be guaranteed. However, uniqueness is not ensured (see Conway, 1967). Uniqueness can be proved for the so-called reversible solutions which we briefly recall. Let us define the set $$\mathcal{E}$$ as the set of exceptional solutions, i.e., Lipschitz continuous solutions of (2.8) where $$p(x,T)=0$$. Then the support of the exceptional solutions is defined as \begin{equation*} \mathcal{V}_e := \left\{ (x,t) \in \mathbb{R} \times (0,T) : \exists\, p \in \mathcal{E}, p(x,t) \not = 0 \right\}\!. \end{equation*} A reversible solution is then a Lipschitz continuous solution of (2.8) that is locally constant in $$\mathcal{V}_e$$ (see Bouchut & James, 1998, Definition 4.1.4). We should mention that reversible solutions can also be defined for discontinuous Borel functions as end data $$p(x,T)$$ that are pointwise everywhere the limit of a bounded sequence $$(p_n^T)\in \textrm{C}^{0,1}(\mathbb{R})$$, i.e., bounded in $$\textrm{C}(\mathbb{R})\cap W^{1,1}_{{\rm loc}}(\mathbb{R})$$ (see Ulbrich, 2003). In this case, reversible solutions can be defined as broad solutions along the generalized backward characteristics, which automatically ensures the internal boundary condition (2.9), if the end data (2.9) are used at $$(x_s(T),T)$$. The following theorem states properties of the reversible solution and is proved in Ulbrich (2003, Theorem 14) for the general case of hyperbolic balance laws. For simplicity of the presentation, we adapt the result in Ulbrich (2003, Theorem 14) to our setting (see also Bouchut & James, 1998). For the case of discontinuous end data, we refer the reader to Ulbrich (2003, Corollary 15). Theorem 2.4 Let $$f'(y(x,t)) \in \textrm{L}^\infty(\mathbb{R} \times (0,T))$$ and satisfy OSLC. Then the following holds: for all $$p(x,T) \in \textrm{C}^{0,1}(\mathbb{R})$$ there exists a unique reversible solution $$p$$ of (2.8). Moreover, $$p \in \textrm{C}^{0,1}(\mathbb{R} \times [0,T])$$ and solves (2.8) a.e. on $$\mathbb{R} \times (0,T)$$. Finally, for all $$t \in [0,T]$$, $$z_1 < z_2$$ and $$0 \leq s < \hat{s} \leq T$$ with \begin{align*} I &:= [z_1,z_2], \quad I_s^{\hat{s}} := [s,\hat{s}] \times I, \\ J &:= \left[z_1 - (T-t) \Vert {{\,f'(y)}} \Vert_{\textrm{L}^\infty(\mathbb{R} \times (0,T))}, z_2 + (T-t) \Vert {{\,f'(y)}} \Vert_{\textrm{L}^\infty(\mathbb{R} \times (0,T))} \right]\!, \\ J_t^T &:= [t,T] \times J, \end{align*} the following estimates hold: \begin{align*} \Vert {{p(\cdot,t)}} \Vert_{B(I)} &\leq \Vert {{p(\cdot,t)}} \Vert_{B(J)}, \\ \Vert {{p_x(\cdot,t)}} \Vert_{\textrm{L}^1(I)} &\leq \Vert {{p_x(\cdot,t)}} \Vert_{\textrm{L}^1(J)}, \\ \Vert {{p_t}} \Vert_{\textrm{L}^1(I_s^{\hat{s}})} &\leq (\hat{s}-s) \Vert {{\,f'(y)}} \Vert_{\textrm{L}^\infty(I_s^{\hat{s}})} \Vert {{p_x}} \Vert_{\textrm{L}^\infty(s,\hat{s};\textrm{L}^1(I))}, \end{align*} where $$B(I)$$ is the Banach space of the bounded functions equipped with the $$\sup$$-norm. We now demonstrate the use of the adjoint calculus by the following example. Example 2.5 Consider the Burgers equation, i.e., $$f(y) := \frac{1}{2} y^2$$, on the domain $${\it{\Omega}} = (-1,1)$$, and the initial data is set to be \begin{equation*} u(x) := \begin{cases} 1, & -1\leq x<0, \\ -1, & 0 < x \leq 1, \end{cases} \end{equation*} with a shock discontinuity at $$x=0$$. For the boundary conditions we set $$y(-1,t)=1$$ and $$y(1,t)=-1$$ for $$t>0$$. It is easy to see that the entropic solution equals the initial data, i.e., $$y(x,t) = u(x)$$ for all $$t>0$$. Hence, the shock position remains at $$x=0$$, i.e., $$x_s(t)=0$$ for $$t>0$$. We now compute the adjoint state using (2.8) and (2.9). For this purpose, let $$G(y) := \frac{1}{2} |y|^2$$. Then we have $$[\![{G(y)}]\!]|_{t=T}=0$$ and $$G'(y) = y$$. Then for the final-time condition of the adjoint equation, we obtain \begin{equation*} p(x,T):= \begin{cases} 1, & -1 \leq x < 0, \\ -1, & 0 < x \leq 1, \end{cases} \quad \text{and} \quad p(0,T) = 0, \end{equation*} and for the ‘interior’ boundary condition we have $$p(0,t) = 0$$ for $$t \in [0,T]$$. We fix $$T=\frac{1}{2}$$. By the method of characteristics we can construct $$p(x,t)$$ for $$t<T$$. More precisely, along all straight lines $$x(t)$$ with the derivative $$\dot{x}(t) = y(x(t),t)$$, the adjoint state is constant. This implies that, at $$t=0$$, we have \begin{equation*} p(x,0) = \begin{cases} 1, & -1 < x < -\frac{1}{2}, \\[2pt] 0, & -\frac{1}{2} \leq x \leq \frac{1}{2}, \\[2pt] -1, & -\frac{1}{2} < x < 1. \end{cases} \end{equation*} See also Fig. 1. Fig. 1. View largeDownload slide Construction of the adjoint using the method of characteristics in the space-time domain. Note that the adjoint has a constant value in the grey area due to the discontinuity of the state at time $$T$$. Fig. 1. View largeDownload slide Construction of the adjoint using the method of characteristics in the space-time domain. Note that the adjoint has a constant value in the grey area due to the discontinuity of the state at time $$T$$. 3. Numerical methods In this section, we describe how we discretize the optimal control problem (P). In this article, we follow the discretize-then-differentiate approach, i.e., we first fully discretize the optimization problem and then derive the optimality conditions for the resulting finite-dimensional optimization problem. In particular, we should consider in detail the discretization of the conservation law (2.1). It is important that one makes sure that the discretization of the forward problem converges to the unique entropy solution, as well as the inherited adjoint discretization to the continuous adjoint state. We refer the reader to LeVeque (1990) for an introduction to numerical methods for such PDEs. Since we are interested in studying how time discretization, using TVD-RK methods, influences the quality of the overall scheme, we first discretize the conservation law in space and then in time by a TVD-RK method. 3.1 Spatial discretization Let us first partition the domain, say $$\mathbb{R}$$ with nonoverlapping intervals, $$I_j := (x_{j-1/2},x_{j+1/2}]$$, where $$x_{j-1/2} < x_{j+1/2}$$ for all $$j \in \mathbb{Z}$$. The so-called mesh size is denoted by $$h_j := x_{j+1/2}-x_{j-1/2}$$. We denote the semidiscrete approximation at time $$t$$ by \begin{equation*} {\boldsymbol {y}}(t) := (..., y_{j-1}(t),y_{j}(t),y_{j+1}(t), ... ) \in \ell^\infty(\mathbb{Z}). \end{equation*} More precisely, $$y_j(t) \in \mathbb{R}$$ is an approximation to the cell average of the true solution, i.e., \begin{equation*} y_j(t) \approx \frac{1}{h_j} \int_{x_{j-1/2}}^{x_{j+1/2}} y(x,t) \, \textrm{d} x. \end{equation*} We then discretize in space using a conservative scheme, by choosing a numerical flux function$$\hat{f}(y_j(t),y_{j+1}(t))$$ that approximates $$f( y(x_{j+1/2},t))$$. Then the semidiscrete numerical scheme is obtained by solving the following ordinary differential equation (ODE) in time: $$\frac{\textrm{d}}{\textrm{d} t} y_j(t) + \frac{1}{h_j} \left( \hat{f}(y_j(t),y_{j+1}(t)) - \hat{f}(y_{j-1}(t),y_{j}(t)) \right) = 0 \quad \forall\, j \in \mathbb{Z}.$$ (3.1) The simplest time discretization for (3.1) is given by the forward Euler scheme. For this purpose, we partition the time direction into time slabs $$t_n$$ for $$n \in \mathbb{Z}_+$$, where $$t_{n} < t_{n+1}$$. For simplicity we assume a uniform time step, i.e., $$t_{n+1}-t_n = k$$ for all $$n \in \mathbb{Z}_+$$ and similarly a uniform mesh size, i.e., $$h_j = h$$ for all $$j \in \mathbb{Z}$$. Then the fully discrete system using the forward Euler discretization reads $$y_j^{n+1} = y_{j}^n - \frac{k}{h} \left( \hat{f}(y_j^n,y_{j+1}^n) - \hat{f}(y_{j-1}^n,y_{j}^n) \right) \quad \forall\, j \in \mathbb{Z}, n \in \mathbb{Z}_+,$$ (3.2) where $${\boldsymbol {y}}^n := (..., y_{j-1}^n,y_j^n,y_{j+1}^n,...)$$ is an approximation to $${\boldsymbol {y}}(t_n)$$. Later, in Section 4, we discretize (3.1) by a high-order TVD-RK method. The choice of the numerical flux $$\hat{f}(\cdot,\cdot)$$ is crucial since it affects convergence properties of the method. Note that not only are we interested in convergence to a weak solution, but also we require convergence to the entropy solution. Hence, we require that the method satisfy a discrete version of (2.5) as well as other properties such as $$\textrm{L}^1$$-contraction and total variation diminishing. In fact, monotone schemes satisfy such requirements and converge to the entropy solution (see, e.g., LeVeque, 1990, Chapter 15). We, however, need yet another condition: the numerical scheme should be differentiable. This enables us to derive optimality conditions at the discrete level. Motivated by the differentiability issue addressed above, the following numerical fluxes are used in this article: $$\begin{array}{rcll} \hat{f}_\text{LF}(a,b) &:=& \frac{1}{2} (\,f(a)+f(b)) - \frac{\gamma}{2} \frac{h}{k} (b-a) & \text{Lax-Friedrichs (LF)}, \\ \hat{f}_\text{EO}(a,b) &:=& f(0) + \int_{0}^{a} f'(s)^+ \, \textrm{d} s + \int_{0}^{b} f'(s)^- \, \textrm{d} s & \quad\text{Engquist-Osher (EO)}, \end{array}$$ (3.3) where $$\gamma \in (0,1)$$ and we define $$f'(s)^+ := \max(0, f'(s))$$ and $$f'(s)^- \,{:=}\, \min(0, f'(s))$$. We note that the classical Lax–Friedrichs method uses $$\gamma =1$$. However, due to the stability requirements for the discrete adjoint, we impose $$\gamma < 1$$ (see Section 3.3). The Lax–Friedrichs scheme is monotone provided that the time step satisfies the CFL condition $$\frac{k}{h} \sup_{|y| \leq M} |f'(y)| \leq \gamma,$$ (3.4) where $$M = \max_x |u(x)|$$. We define the total variation seminorm for $${\boldsymbol {y}}$$ by \begin{equation*} \left|{ {\boldsymbol {y}} }\right|_\text{TV} := \sum_{j=-\infty}^\infty | y_{j+1} - y_{j} |. \end{equation*} It is well known that monotone schemes are also TVD (see LeVeque, 1990, Chapter 15.7). Therefore, we have $$\left|{ {\boldsymbol {y}}^{n+1} }\right|_\text{TV} \leq \left|{ {\boldsymbol {y}}^{n} }\right|_\text{TV}$$. Giles & Ulbrich (2010a,b) proved that for the Lax–Friedrichs method, provided $$k \leq \gamma \cdot h^{2-q}$$ for $$0<q<1$$, both forward and adjoint approximations converge to their respective continuous counterparts. This is due to adding more grid points near the shock position as the mesh is refined. This, however, results in a very restrictive time-step requirement. Since the support of the initial data is assumed to be in a bounded set, recall (2.2), and it is known that the solution of the conservation law has a finite speed of propagation, we can consider the problem on a bounded domain, denoted by $${\it{\Omega}}$$ only. Therefore, the discretization can be written using finite-dimensional operators. Let us suppose that $${\it{\Omega}} = (a,b)$$ is partitioned into $$N$$ elements, $$I_j$$ for $$\,j=1,...,N$$, with $$x_{1/2} = a$$ and $$x_{N+1/2} = b$$. Then we can define a finite-dimensional space $$V_h := \mathbb{R}^N$$ as approximation space, and we denote the approximation at time $$t_n$$ by $${\boldsymbol {y}}_h^n = (y^n_1, y^n_2, ..., y^n_N) \in V_h$$. This allows us to express the underlying scheme using a nonlinear discrete operator, $$F_h: V_h \rightarrow V_h$$, which is defined by $$\left[ F_h({\boldsymbol {w}}) \right]_j := \hat{f}(w_j,w_{j+1}) - \hat{f}(w_{j-1},w_{j}) \quad \forall\, j=1,...,N, \quad {\boldsymbol {w}} \in V_h.$$ (3.5) The fully discrete version of (3.2) with forward Euler time integration can then be written as $${\boldsymbol {y}}_h^{n+1} = {\boldsymbol {y}}_h^n - \frac{k}{h} F_h({\boldsymbol {y}}_h^n), \quad {\boldsymbol {y}}_h^{0} = {\boldsymbol {u}}_h.$$ (3.6) Differentiability properties of $$F_h(\cdot)$$ will be exploited later in Section 3.2 for deriving an adjoint discretization. Now we state Lax–Friedrichs and Engquist–Osher differentiability in the following proposition whose proof we defer to Appendix A. Proposition 3.1 Let the flux function be $$f(\cdot) \in \textrm{C}^2$$. Then the respective $$F_h({\boldsymbol {w}})$$ for the Lax–Friedrichs and Engquist-Osher schemes at $${\boldsymbol {w}} \in V_h$$ is Fréchet differentiable, i.e., there exists a linear bounded operator $$F_h'({\boldsymbol {w}}) : V_h \rightarrow V_h$$ such that in direction $${\boldsymbol {v}} \in V_h$$ we have $$[F_h'({\boldsymbol {w}}) {\boldsymbol v}]_j = g_{j,j+1} - g_{j-1,j}$$ with \begin{equation*} \begin{array}{ll} g_{j,j+1}^\text{LF} := \frac{1}{2} \left[ f'(w_{j+1})v_{j+1}+f'(w_j)v_{j} \right] - \frac{\gamma}{2}\frac{h}{k}(v_{j+1}-v_j) & \text{(LF)}, \\ g_{j,j+1}^\text{EO} := \frac{1}{2} \left[ f'(w_{j+1})v_{j+1}+f'(w_j)v_{j} \right] - \frac{1}{2}( |f'(w_{j+1})| v_{j+1} - |f'(w_{j})|v_j) & \text{(EO)}, \end{array} \end{equation*} for $$i=j,...,N$$. Moreover, for their transposes we have \begin{equation*} \begin{array}{ll} \left[ F_h'({\boldsymbol {w}})^\top {\boldsymbol v} \right]_j = \gamma \frac{h}{k} v_j - \frac{1}{2} \left( \gamma \frac{h}{k} + f'(w_j) \right) v_{j+1} - \frac{1}{2} \left( \gamma \frac{h}{k} - f'(w_j)\right) v_{j-1} & \text{(LF)}, \\ \left[F_h'({\boldsymbol {w}})^\top {\boldsymbol v} \right]_j = |f'(w_j)| v_j - \frac{1}{2} \left( |f'(w_j)| + f'(w_j) \right) v_{j+1} - \frac{1}{2} \left( |f'(w_j)| - f'(w_j)\right) v_{j-1} & \text{(EO)}. \end{array} \end{equation*} We will see in Section 3.2 how properties of $$F_h(\cdot)$$ influence the discrete adjoint variable. In particular, we are interested in TVD properties of the discrete adjoint. Remark 3.2 We emphasize that our choice of the Lax–Friedrichs scheme or the Engquist–Osher scheme in this article is due to theoretical considerations for stability and convergence of the discrete adjoint in Section 4.1 only. More precisely, we can choose any scheme with a differentiable flux function that satisfies the SSP requirement. To fully benefit from the high-order TVD-RK discretizations used in this article, one should consider using high-order spatial discretizations as well. For the latter see, e.g., Sanders (1988) and Zhang & Shu (2010). Furthermore, for the convergence of the discrete adjoint in Section 4.3, we require that the forward discretization converges to the entropy solution, representing another requirement in the choice of the scheme (see, e.g., Osher & Tadmor, 1988; Qiu & Shu, 2008) for higher-order methods satisfying entropy conditions. 3.2 Discrete optimal control problem and adjoint calculus In this section, we state the discrete optimization problem, derive the discrete adjoint and study its properties when a forward Euler time integration is employed together with the spatial discretization discussed in Section 3.1. As before, we denote the final time by $$T$$. For simplicity, we partition the time direction in a way that there exists $$n_T$$ such that $$T = n_T \cdot k$$. In order to ease the notation, we concatenate approximations $${\boldsymbol {y}}_h^n$$ at different times $$n=1,...,n_T$$ to obtain $${\boldsymbol {y}}_h \in (V_h)^{n_T+1}$$: $${\boldsymbol {y}}_h := ({\boldsymbol {y}}_h^0,{\boldsymbol {y}}_h^1, {\boldsymbol {y}}_h^2, ..., {\boldsymbol {y}}_h^{n_T})^\top.$$ (3.7) The discretized objective functional is then given by \begin{equation*} J_h^{}( {\boldsymbol {y}}_h ) := \sum_{j=1}^{N} h \, G({y}_j^{n_T}), \end{equation*} where $${\boldsymbol {y}}_h$$ is obtained by (3.6). We then consider the following finite-dimensional optimal control problem $$\min J_h^{}( {\boldsymbol {y}}_h ) \quad \text{subject to} \quad S_h^{}({\boldsymbol {u}}^{}_h) = {\boldsymbol {y}}_h^{n_T}, \quad {\boldsymbol {u}}_h \in V_h \cap U_\text{ad},$$ (DP) where $$S_h(\cdot)$$ is the discrete control-to-state operator that is defined by successive ($$n_T$$ times) application of (3.6) to $${\boldsymbol {u}}_h$$. For deriving the discrete adjoint, it is more convenient to consider (3.6) as a constraint instead of $$S_h(\cdot)$$. For this purpose, we define the equality constraint at time $$t_n$$ by $$L_h^n : (V_h)^{n_T+1} \rightarrow V_h$$ with \begin{equation*} L_h^{n}( {\boldsymbol {y}}_h^{} ) := - {\boldsymbol {y}}_h^{n} + {\boldsymbol {y}}_h^{n-1} - \frac{k}{h} F_h( {\boldsymbol {y}}_h^{n-1} ) \quad \text{for}\,n=1,...,n_T, \end{equation*} and $$L_h^0({\boldsymbol {y}}_h,{\boldsymbol u}_h) = -{\boldsymbol y}_h^0 + {\boldsymbol u}_h$$. We then collect all time-step contributions and state the discrete constraint as $$L_h({\boldsymbol y}_h,{\boldsymbol u}_h) = 0$$, where $$L_h : (V_h)^{n_T+1} \times V_h \rightarrow (V_h)^{n_T+1}$$ with \begin{equation*} L_h^{}({\boldsymbol y}_h^{},{\boldsymbol u}_h^{}) = \left( L_h^0({\boldsymbol y}_h^{},{\boldsymbol u}_h^{}), L_h^1({\boldsymbol y}_h^{}), L_h^2({\boldsymbol y}_h^{}), \cdots, L_h^{n_T}({\boldsymbol y}_h^{}) \right)^\top. \end{equation*} Then we define the Lagrangian for the finite-dimensional problem (DP) by \begin{equation*} \mathcal{L}({\boldsymbol {y}}_h^{},{\boldsymbol {p}}_h^{},{\boldsymbol {u}}_h^{}) := J_h^{}({\boldsymbol {y}}_h^{}) + h \, {\boldsymbol p}_h^\top \cdot L_h^{}({\boldsymbol y}_h^{},{\boldsymbol u}_h^{}), \end{equation*} where $${\boldsymbol p}_h \in (V_h)^{n_T+1}$$ is enumerated like $${\boldsymbol y}_h$$ in (3.7). Recall that from Proposition 3.1 we know that $$F_h(\cdot)$$ is differentiable when Lax–Friedrichs and Engquist–Osher schemes are used. Differentiability of $$F_h(\cdot)$$ implies differentiability of $$L_h(\cdot,\cdot)$$ with respect to the first argument. More precisely, a direct calculation shows that this derivative, denoted by $$L_{{\boldsymbol y}_h}'(\cdot,\cdot)$$, has a lower-triangular structure: \begin{equation*} L_{{\boldsymbol y}_h}'({\boldsymbol y}_h,{\boldsymbol u}_h) = \left[ \begin{array}{ccccc} -I & & & & \\ I - \frac{k}{h} F_h'({\boldsymbol y}_h^{1}) & -I & & & \\ & \ddots & \ddots & & \\ & & I - \frac{k}{h} F_h'({\boldsymbol y}_h^{n_T-1}) & -I & \\ & & & I - \frac{k}{h} F_h'({\boldsymbol y}_h^{n_T}) & -I \end{array} \right]\!, \end{equation*} and therefore $$L_{{\boldsymbol y}_h}'(\cdot,\cdot)$$ is invertible. As a consequence, the constraint $$L_h^n({\boldsymbol y}_h)=0$$ satisfies the linear independence constraint qualification (LICQ) at any feasible point of (DP). Hence, at a solution pair $$(y_h,u_h)$$ there exists a Lagrange multiplier (adjoint state) $$p_h$$. Now we can compute the discrete derivative of the objective functional using a discrete adjoint calculus. The concatenated discrete adjoint equation is given by $${\nabla J_h({\boldsymbol y}_h)} + h \, L'_{{\boldsymbol y}_h}({\boldsymbol y}_h,{\boldsymbol u}_h)^\top {\boldsymbol p}_h = 0.$$ (3.8) Note that $$L'_{{\boldsymbol y}_h}({\boldsymbol y}_h,{\boldsymbol u}_h)^\top$$ is upper triangular, and for a fixed $${\boldsymbol y}_h$$ one can solve (3.8) successively from $${\boldsymbol p}_h^{n_T}$$ to $${\boldsymbol p}_h^{0}$$, i.e., backward in time similar to the continuous adjoint; see (2.8). From the definition of the discrete objective function we get \begin{equation*} {\nabla J_h({\boldsymbol y}_h)} = \left( {\boldsymbol {0}}, {\boldsymbol {0}}, \cdots, {\boldsymbol {0}}, h \, G'({\boldsymbol y}_h^{n_T}) \right)^\top, \end{equation*} and therefore $${\boldsymbol p}_h^{n_T} = G'({\boldsymbol y}_h^{n_T})$$, which is the discrete counterpart for the final-time condition in (2.8). For $$n = n_T, ..., 0$$ we have $${\boldsymbol p}_h^{n_T} = G'({\boldsymbol y}_h^{n_T}), \quad {\boldsymbol p}_h^{n} = \left[ I - \frac{k}{h} F_h'({\boldsymbol y}_h^{n+1})^\top \right] {\boldsymbol p}_h^{n+1} \quad \text{for } n=(n_T-1), ..., 0,$$ (3.9) which is the discrete counterpart of the adjoint PDE (2.8). The gradient of the discrete reduced objective function in the direction of $$\delta {\boldsymbol u}_h \in V_h$$ is obtained by $${\nabla \mathcal{J}_h({\boldsymbol u}_h)^\top} \delta {\boldsymbol u}_h = h \sum_{j=1}^N p_j^0 \, \delta u_j^{}.$$ (3.10) Remark 3.3 (Convergence of the discrete adjoint) Note, however, that at the discrete level, the interior boundary condition (2.9) does not appear. Hence, a natural question is to ask whether the discrete adjoint converges to the continuous one or not. It is shown in Giles (2003) through numerical experiments that the adjoint obtained from the Lax–Friedrichs scheme converges to a wrong solution as the mesh is refined. It is also observed that if numerical diffusion is chosen such that the shock in $${\boldsymbol y}_h^{n_T}$$ spreads into more cells, then the discrete adjoint converges. This is proven by Giles & Ulbrich (2010a,b). We are not aware of a similar result for the Engquist–Osher scheme. However, it has been shown by Ulbrich (2001) that for the case where the end data is Lipschitz continuous, e.g., using a smoothed end state in the tracking functional, the discrete adjoint converges to the continuous adjoint. This motivates the use of two separate solvers for the discrete forward and adjoint problems, e.g., using a TVDRK-MUSCL scheme, when the end data are smoothed. We should mention that this approach results in an inexact discrete adjoint. 3.3 Stability of the discrete adjoint We would like to examine monotonicity of the discrete adjoint by checking whether or not it is TVD. The discrete adjoint, satisfying (3.9), can be written in a simplified form for analysis as $$p_j^n = A_{j,0}^{} \, p_j^{n+1} + A_{j,1}^{} \, p_{j+1}^{n+1} + A_{j,-1}^{} \, p_{j-1}^{n+1},$$ (3.11) where \begin{equation*} A_{j,l}^\text{LF} := \begin{cases} \frac{\gamma}{2} - \frac{k}{2h} f'(w_j) & \text{for } l=-1, \\ 1 - \gamma & \text{for } l=0, \\ \frac{\gamma}{2} + \frac{k}{2h} f'(w_j) & \text{for } l=1, \end{cases} \quad A_{j,l}^\text{EO} := \begin{cases} \frac{k}{2h} \left( |f'(w_j)| - f'(w_j) \right) & \text{for } l=-1, \\ 1 - \frac{k}{h} |f'(w_j)| & \text{for } l=0, \\ \frac{k}{2h} \left( |f'(w_j)| + f'(w_j) \right) & \text{for } l=1. \end{cases} \end{equation*} Note that, provided the CFL condition (3.4) is satisfied, we have $$A_{j,l}^\text{LF} \geq 0$$ and $$A_{j,l}^\text{EO} \geq 0$$. Moreover, observe that by construction we have \begin{equation*} \sum_{l=-1}^{1} A_{j,l} = 1. \end{equation*} Taking absolute values on both sides in (3.11) and using the above properties, we obtain \begin{equation*} |p_j^n| \leq A_{j,0}^{} |p_j^{n+1}| + A_{j,1}^{} |p_{j+1}^{n+1}| + A_{j,-1}^{} |p_{j-1}^{n+1}| \leq \max_{l=-1,0,1} |p_{j+l}^{n+1}| \leq \Vert {{ {\boldsymbol p}_h^{n+1} }} \Vert_{\infty}, \end{equation*} which implies $$\textrm{L}^\infty$$ stability. We now show that the discrete adjoint scheme with a forward Euler time discretization is also TVD. We first rewrite (3.11) in the form $$p_j^{n} = p_{j}^{n+1} + B_{j,0}^{} (p_{j}^{n+1} - p_{j-1}^{n+1}) + B_{j,1}^{} (p_{j+1}^{n+1} - p_{j}^{n+1}),$$ (3.12) with \begin{equation*} B_{j,l}^\text{LF} := \begin{cases} - \frac{\gamma}{2} + \frac{k}{2h} f'(w_j) & \text{for } l=0, \\ \frac{\gamma}{2} + \frac{k}{2h} f'(w_j) & \text{for } l=1, \end{cases} \quad B_{j,l}^\text{EO} := \begin{cases} - \frac{k}{2h} |f'(w_j)| + \frac{k}{2h} f'(w_j) & \text{for } l=0, \\ \frac{k}{2h} |f'(w_j)| + \frac{k}{2h} f'(w_j) & \text{for } l=1. \end{cases} \end{equation*} Then Harten’s lemma (see Harten, 1983) guarantees TVD properties of the discrete adjoint scheme. Lemma 3.4 (Harten’s lemma) Suppose a finite difference scheme can be written as \begin{equation*} w_j = v_j + B_{j,0} \cdot (v_j - v_{j-1}) + B_{j,1} \cdot (v_{j+1}-v_{j}), \end{equation*} where $$B_{j,0}$$ and $$B_{j,1}$$ are arbitrary nonlinear functions of $$v_j, v_{j+1}, v_{j-1}$$ and satisfy \begin{equation*} B_{j,0} \leq 0, \quad B_{j,1} \geq 0, \quad B_{j,1} - B_{j+1,0} \leq 1. \end{equation*} Then we have $$\left|{{\boldsymbol w}_h}\right|_\text{TV} \leq\left|{{\boldsymbol v}_h}\right|_\text{TV}$$. Observe that in our case, the above conditions on $$B_{j,0}$$ and $$B_{j,1}$$ are satisfied provided a $$(1-\gamma)$$-CFL condition for Lax–Friedrichs and a $$\frac{1}{2}$$-CFL condition for Engquist–Osher are satisfied, respectively, i.e., $$\frac{k^\text{LF}}{h} \sup_{|y| \leq M} |f'(y)| \leq (1-\gamma), \quad \frac{k^\text{EO}}{h} \sup_{|y| \leq M} |f'(y)| \leq \frac{1}{2}.$$ (3.13) This is the reason for the choice of $$\gamma \in (0,1)$$ in the Lax–Friedrichs scheme. Obviously the optimal CFL condition is \begin{equation*} \text{CFL}^* = \max_{\gamma \in (0,1)} \min\{\gamma,1-\gamma\} = \frac{1}{2}. \end{equation*} Lemma 3.4 now guarantees that the discrete adjoint obtained from forward Euler time discretization is TVD, i.e., $$\left|{ {\boldsymbol p}_h^{n} }\right|_\text{TV} \leq \left| {\boldsymbol p}_h^{n+1} \right|_\text{TV}.$$ (3.14) Note that this property is shared by the continuous adjoint in Theorem 2.4. 4. SSP time discretizations In this section, we examine the effect of using RK methods for discretizing the underlying problem instead of using the forward Euler method. A TVD-RK method is defined by convex combinations of forward Euler steps that are parametrized by two sets of coefficients: $$\{ \alpha_{ij} \}$$ and $$\{ \beta_{ij} \}$$ for $$i=1,...,s$$ and $$j=0,...,(s-1)$$, where $$s$$ is the number of RK stages. A TVD-RK time stepping is then defined as follows: Set the initial stage: $${\boldsymbol w}^{(0)} = {\boldsymbol y}_h^n$$. Compute for each stage $$i = 1, \ldots, s$$, $${\boldsymbol w}^{(i)} = \sum_{j=0}^{i-1} \alpha_{ij} {\boldsymbol w}^{(j)} - \frac{k}{h} \, \beta_{ij} F_h({\boldsymbol w}^{(j)}).$$ (4.1) Set the next time-step approximation: $${\boldsymbol y}^{n+1}_h = {\boldsymbol w}^{(s)}$$. Moreover, we impose the following constraints over $$\{ \alpha_{ij} \}$$ and $$\{ \beta_{ij} \}$$: $$\alpha_{ij}, \beta_{ij} \geq 0 \quad \sum_{j=0}^{i-1} \alpha_{ij} = 1, \quad \left( \beta_{ij} \not= 0 \implies \alpha_{ij}\not=0 \right)\!.$$ (4.2) The following result is due to Shu & Osher (1988) and shows that the TVD-RK method is stable if the forward Euler (3.6) is stable. We call such methods strong stability preserving (SSP) since we have stability with respect to stage variables, too. Proposition 4.1 Suppose the time step $$k$$ is chosen such that the forward Euler discretization is stable, i.e., $$\left\Vert{ {\boldsymbol w} - \frac{k}{h} \frac{\beta_{ij}}{\alpha_{ij}} F_h({\boldsymbol w}) }\right\Vert \leq {\Vert {{\boldsymbol w}} \Vert} \quad \forall\, {\boldsymbol w} \in V_h,$$ (4.3) for all $$i = 1, \ldots, s$$, $$j=0,\ldots,(s-1)$$ and $$\alpha_{ij} \not=0$$. Here, $${\Vert {\cdot} \Vert}$$ is a non-negative homogeneous convex function, e.g., a norm or a seminorm. Moreover, assume that the conditions in (4.2) are satisfied. Then for the TVD-RK method we have \begin{equation*} {\Vert {{\boldsymbol w}^{(i)}} \Vert} \leq \max_{j=0,\ldots,(i-1)} {\Vert { {\boldsymbol w}^{(j)} } \Vert} \quad \text{for} i=1,\ldots,s, \end{equation*} and consequently $${\Vert { {\boldsymbol y}^{n+1} } \Vert} \leq {\Vert { {\boldsymbol y}^{n} } \Vert}$$. To highlight the technical differences between the proof technique of Shu & Osher (1988), which relies on $$\sum_{j=0}^{i-1} \alpha_{ij} = 1$$, the property not available for the discrete adjoint in Section 4.1, we display a short proof. Proof. First observe that if $$\alpha_{ij}=0$$ then the contribution of $$w^{(j)}$$ is zero. So we can rewrite (4.1) as \begin{equation*} {\boldsymbol w}^{(i)} = \sum_{\{ j : \alpha_{ij} \not= 0 \}} \alpha_{ij} \left( {\boldsymbol w}^{(j)} - \frac{k}{h} \, \frac{\beta_{ij}}{\alpha_{ij}} F_h\left({\boldsymbol w}^{(j)}\right) \right) \quad \forall\, i=1,\ldots, s. \end{equation*} We then take $${\Vert {\cdot} \Vert}$$ from both sides and use convexity as well as our assumption on the stability of the Euler step: for all $$i=1,\ldots,s$$ we have \begin{equation*} {\Vert {{\boldsymbol w}^{(i)}} \Vert} \leq \sum_{\{ j : \alpha_{ij} \not= 0 \}} \alpha_{ij} {\Vert { {\boldsymbol w}^{(j)} } \Vert} \leq \left( \max_{j=0,\ldots,(i-1)} {\Vert { {\boldsymbol w}^{(j)} } \Vert} \right) \left( \sum_{\{\, j : \alpha_{ij} \not= 0 \}} \alpha_{ij} \right) = \max_{j=0,\ldots,(i-1)} {\Vert { {\boldsymbol w}^{(j)} } \Vert}, \end{equation*} where we also used positivity of $$\alpha_{ij},\beta_{ij}$$ and $$\sum_{j=0}^{i-1} \alpha_{ij} = 1$$. The proof is completed by induction. □ Let us denote the forward Euler time step by $$k_\text{FE}$$. Then the stable TVD-RK time step is bounded by $$k \leq \left( \min_{\alpha_{ij}, \beta_{ij} \not = 0} \frac{\alpha_{ij}}{\beta_{ij}} \right) k_\text{FE}.$$ (4.4) Therefore, one can optimize the coefficients $$\alpha_{ij}$$ and $$\beta_{ij}$$ to maximize the constant $$\min_{\alpha_{ij}, \beta_{ij} \not = 0} \frac{\alpha_{ij}}{\beta_{ij}}$$. In Table 1, we show such optimal TVD-RK methods of two and three stages. Table 1 Table of coefficients for TVD-RK methods of order $$2$$ and $$3$$ Order $$\alpha_{ij}$$ $$\beta_{ij}$$ $$\min \alpha_{ij}/\beta_{ij}$$ 2 1 1 1 1/2 1/2 0 1/2 3 1 1 1 3/4 1/4 0 1/4 1/3 0 2/3 0 0 2/3 Order $$\alpha_{ij}$$ $$\beta_{ij}$$ $$\min \alpha_{ij}/\beta_{ij}$$ 2 1 1 1 1/2 1/2 0 1/2 3 1 1 1 3/4 1/4 0 1/4 1/3 0 2/3 0 0 2/3 Table 1 Table of coefficients for TVD-RK methods of order $$2$$ and $$3$$ Order $$\alpha_{ij}$$ $$\beta_{ij}$$ $$\min \alpha_{ij}/\beta_{ij}$$ 2 1 1 1 1/2 1/2 0 1/2 3 1 1 1 3/4 1/4 0 1/4 1/3 0 2/3 0 0 2/3 Order $$\alpha_{ij}$$ $$\beta_{ij}$$ $$\min \alpha_{ij}/\beta_{ij}$$ 2 1 1 1 1/2 1/2 0 1/2 3 1 1 1 3/4 1/4 0 1/4 1/3 0 2/3 0 0 2/3 We now rewrite our discrete optimization problem (DP) using a TVD-RK time discretization. First we redefine $${\boldsymbol y}_h^n$$ to be suitable for the TVD-RK method: the collection of all stage approximations at time-slab $$t_n$$ is given by \begin{equation*} {\boldsymbol y}_h^n := ({\boldsymbol y}_h^{n,0}, {\boldsymbol y}_h^{n,1}, \ldots, {\boldsymbol y}_h^{n,s} )^\top \in W_h := (V_h)^{s+1}, \end{equation*} where $${\boldsymbol y}_h^{n,l}$$ is the approximation at time $$t_n$$, at stage $$l = 0, \ldots, s$$. Then we concatenate contributions from all time steps to get \begin{equation*} {\boldsymbol y}_h := ({\boldsymbol y}_h^0, {\boldsymbol y}_h^1, \ldots, {\boldsymbol y}_h^{n_T})^\top \in (W_h)^{n_T+1}. \end{equation*} Let us denote the forward Euler operator by $$H_{ij}: V_h \rightarrow V_h$$ with \begin{equation*} H_{ij}({\boldsymbol v}_h) := \begin{cases} {\boldsymbol v}_h - \frac{k}{h} \frac{\beta_{ij}}{\alpha_{ij}} F_h({\boldsymbol v}_h) & \text{if } \alpha_{ij} > 0, \\ 0 & \text{if } \alpha_{ij} = 0, \end{cases} \end{equation*} for all $${\boldsymbol v}_h \in V_h$$, which is differentiable with derivative in the direction of $${\boldsymbol u}_h \in V_h$$ given by \begin{equation*} H_{ij}'({\boldsymbol v}_h) {\boldsymbol u}_h := \begin{cases} \left[ I - \frac{k}{h} \frac{\beta_{ij}}{\alpha_{ij}} F_h'({\boldsymbol v}_h) \right] {\boldsymbol u}_h & \text{if } \alpha_{ij} > 0, \\ 0 & \text{if } \alpha_{ij} = 0. \end{cases} \end{equation*} Then the equality constraint generated by TVD-RK scheme at time $$t_n$$ and stage $$l$$ is denoted by $$L_h^{n,l}: (W_h)^{n_T+1} \rightarrow W_h$$ with \begin{align*} L_h^{0,0}({\boldsymbol y}_h,{\boldsymbol u}_h) &:= -{\boldsymbol y}_h^{0,0} + {\boldsymbol u}_h,\\ L_h^{n,0}({\boldsymbol y}_h) &:= -{\boldsymbol y}_h^{n,0} + {\boldsymbol y}_h^{n-1,s},\\ L_h^{n,l}({\boldsymbol y}_h) &:= -{\boldsymbol y}_h^{n,l} + \sum_{j=0}^{l-1} \alpha_{lj} H_{lj}({\boldsymbol y}_h^{n,j}) \quad \text{for } n=1,\ldots,n_T \quad \text{and } l=1,\ldots,s. \end{align*} The fully discrete scheme, i.e., spatial discretization with the TVD-RK method, can be expressed as $$L_h({\boldsymbol y}_h,{\boldsymbol u}_h) = 0$$ where \begin{equation*} L_h({\boldsymbol y}_h,{\boldsymbol u}_h) = \left( L_h^{0,0}({\boldsymbol y}_h,{\boldsymbol u}_h), L_h^{0,1}({\boldsymbol y}_h), \ldots, L_h^{0,s}({\boldsymbol y}_h), \ldots, L_h^{n_T,s}({\boldsymbol y}_h) \right)^\top \in (W_h)^{n_T+1}. \end{equation*} As before we define the Lagrangian function by \begin{equation*} \mathcal{L}({\boldsymbol y}_h^{},{\boldsymbol p}_h^{},{\boldsymbol u}_h^{}) := J_h^{}({\boldsymbol y}_h^{}) + h \, {\boldsymbol p}_h^\top \cdot L_h^{}({\boldsymbol y}_h^{},{\boldsymbol u}_h^{}), \end{equation*} where $${\boldsymbol p}_h^{} \in W_h^{n_T+1}$$ has the same structure as $${\boldsymbol y}_h$$. Note that since $$H_{ij}(\cdot)$$ is differentiable we can conclude that $${L_h}(\cdot,\cdot)$$ is also differentiable with respect to the first argument. Moreover, similar to the case with forward Euler step, it has again a lower-triangular structure: \begin{equation*} L_{{\boldsymbol y}_h}'({\boldsymbol y}_h,{\boldsymbol u}_h) = \left[ \begin{array}{ccccc} G({\boldsymbol y}_h^{0}) & & & & \\ K & G({\boldsymbol y}_h^{1}) & & & \\ & \ddots & \ddots & & \\ & & K & G({\boldsymbol y}_h^{n_T-1}) & \\ & & & K & G({\boldsymbol y}_h^{n_T}) \end{array} \right]\!, \end{equation*} where $$G(\cdot)$$ is the linearized contribution of the TVD-RK scheme, i.e., $$G({\boldsymbol y}_h^n) := \left[ \begin{array}{ccccc} -I & & & & \\[4pt] \alpha_{10} H_{10}'({\boldsymbol y}_h^{n,0}) & -I & & & \\[4pt] \alpha_{20} H_{20}'({\boldsymbol y}_h^{n,0}) & \alpha_{21} H_{21}'({\boldsymbol y}_h^{n,1}) & -I & & \\[4pt] & \vdots & & \ddots & \\[4pt] \alpha_{s,0} H_{s,0}'({\boldsymbol y}_h^{n,0}) & & & \alpha_{s,s-1} H_{s,s-1}'({\boldsymbol y}_h^{n,s-1}) & -I \end{array} \right]\!,$$ (4.5) and $$K$$ links the approximation from the previous time step to the current one: $$K := \left[ \begin{array}{cccc} O & \cdots & O & I \\ O & \cdots & O & O \\ \vdots & & \vdots & \vdots \\ O & \cdots & O & O \end{array} \right]\!.$$ (4.6) As before, the discrete adjoint satisfies, \begin{equation*} {\nabla J_h({\boldsymbol y}_h)} + h L_{{\boldsymbol y}_h}({\boldsymbol y}_h,{\boldsymbol u}_h)^\top {\boldsymbol p}_h = 0, \end{equation*} which is again backward in time. As before, the above constraint satisfies LICQ. Let us illustrate the above adjoint equation with an example. Example 4.2 (Case $$s=2$$) Consider the discrete adjoint for a TVD-RK discretization with two stages. Let $$n < n_T$$. Then the discrete adjoint satisfies the following backward in time linear system: \begin{equation*} \begin{array}{rrrcl} {\boldsymbol p}_h^{n,0} & -\ \alpha_{10} H_{10}'({\boldsymbol y}_h^{n,0})^\top {\boldsymbol p}_h^{n,1} & - \alpha_{20} H_{20}'({\boldsymbol y}_h^{n,0})^\top {\boldsymbol p}_h^{n,2} & = & {\boldsymbol 0}, \\[3pt] & {\boldsymbol p}_h^{n,1} & -\ \alpha_{21} H_{21}'({\boldsymbol y}_h^{n,1})^\top {\boldsymbol p}_h^{n,2} & = & {\boldsymbol 0}, \\[3pt] & & {\boldsymbol p}_h^{n,2} & = & {\boldsymbol p}_h^{n+1,0}, \end{array} \end{equation*} which is a TVD-RK scheme with coefficients $${\alpha_{ij}^\star}$$ and $${\beta_{ij}^\star}$$. In fact, the TVD-RK coefficients for the discrete adjoint equation are related to the TVD-RK of the discrete state as \begin{equation*} \begin{array}{cc} \alpha_{10}^* = \alpha_{21}, & \\ \alpha_{20}^* = \alpha_{20}, & \alpha_{21}^* = \alpha_{10}, \end{array} \quad \begin{array}{cc} \beta_{10}^* = \beta_{21}, & \\ \beta_{20}^* = \beta_{20}, & \beta_{21}^* = \beta_{10}. \end{array} \end{equation*} Observe the way the coefficients of the TVD-RK scheme are transformed by this conjugation process. We refer to the coefficients of the adjoint TVD-RK scheme as conjugate coefficients. This transformation in the table of adjoint TVD-RK coefficients might pose some restrictions on the choice of the TVD-RK method in the first place. We conclude this section by the following proposition. Proposition 4.3 Suppose we discretize-then-optimize the problem (P) and a TVD-RK time discretization is used for the discrete state variable with coefficients $$\alpha_{ij}$$ and $$\beta_{ij}$$. Then the discrete adjoint is also obtained by a TVD-RK method with coefficients $$\alpha_{ij}^*$$ and $$\beta_{ij}^*$$ such that $$\alpha_{ij}^* = \alpha_{s-j,s-i}, \quad \beta_{ij}^* = \beta_{s-j,s-i} \text{ for } i=1,\ldots,s \text{ and } j=0,\ldots,(s-1).$$ (4.7) 4.1 Stability of the discrete adjoint Given the result of Proposition 4.3, our first idea is to impose strong stability on both the discrete state and the adjoint variables, i.e., imposing (4.2) on $$\{ \alpha_{ij}, \beta_{ij} \}$$ and $$\{ \alpha^*_{ij}, \beta^*_{ij} \}$$. In other words, the following conditions on $$\{ \alpha_{ij}, \beta_{ij} \}$$ should hold: $$\alpha_{ij}, \beta_{ij} \geq 0, \quad \sum_{j=0}^{i-1} \alpha_{ij} = 1, \quad \sum_{i=j+1}^{s} \alpha_{ij} = 1, \quad \left( \beta_{ij} \not= 0 \implies \alpha_{ij}\not=0 \right)\!.$$ (4.8) This, however, turns out to be too strong as the following proposition clarifies. Proposition 4.4 Suppose we discretize-then-optimize the problem (P) with a TVD-RK method. If we require strong stability for both discrete state and discrete adjoint, then the TVD-RK method is at most first order. More precisely, the TVD-RK coefficients are \begin{equation*} \alpha_{i,j} = \left\{ \begin{array}{ll} 1 & \text{if } j=i-1, \\ 0 & \text{otherwise}, \end{array} \right. \quad \beta_{i,j} = \left\{ \begin{array}{ll} \beta_{i,i-1} & \text{if } j=i-1, \\ 0 & \text{otherwise}, \end{array} \right. \end{equation*} which gives a concatenation of forward Euler steps. Proof. We need to identify sets of $$\{\alpha_{ij}\}$$ and $$\{\beta_{ij}\}$$ that satisfy (4.8). Thus, all coefficients need to be non-negative. Let $$i=1$$, then $$\alpha_{10} = 1$$. Now let $$j=0$$ and observe that $$\sum_{i=j+1}^s \alpha_{ij} = 1$$. This implies $$\alpha_{i,0} = 0$$ for all $$i = 2, \ldots, s$$ since all $$\alpha_{ij} \geq 0$$ and $$\alpha_{10}=1$$. Now let $$i=2$$ and observe that $$\sum_{j=0}^{i-1} \alpha_{ij} =1$$. However, we just showed that $$\alpha_{20}=0$$, which implies $$\alpha_{21}=1$$. We now repeat the same process by letting $$j=1$$ and consider the constraint $$\sum_{i=j+1}^s \alpha_{ij}=1$$. Continuing, we conclude that $$\alpha_{i,i-1} =1$$ and the other coefficients are zero. Since $$\alpha_{i,j}=0$$ for $$j<i-1$$ we conclude from the last requirement of (4.8) that $$\beta_{ij} = 0$$ for $$j<i-1$$. Therefore, the only free parameters are $$\beta_{i,i-1}$$. However, such a TVD-RK scheme is equivalent to the concatenation of Euler steps, instead of a combination. Finally, the concatenation of Euler steps yields a first-order method. □ As we shall see next, imposing (4.8) is not necessary. Let us consider a two-stage TVD-RK that ensures strong stability only for the state discretization. That is, TVD-RK satisfies only (4.2). Moreover, suppose that the time step $$k$$ is chosen such that adjoint stability of the forward Euler step for discrete adjoint holds. For instance, for the the discrete adjoint from Example 4.2 we have \begin{equation*} \begin{array}{rcl} {\Vert {{\boldsymbol p}^{n,1}} \Vert} &\leq& \alpha_{21} {\Vert { {\boldsymbol p}^{n,2} } \Vert}, \\[3pt] {\Vert {{\boldsymbol p}^{n,0}} \Vert} &\leq& \alpha_{20} {\Vert { {\boldsymbol p}^{n,2} } \Vert} + \alpha_{10} {\Vert { {\boldsymbol p}^{n,1} } \Vert}, \end{array} \end{equation*} where $${\Vert {\cdot} \Vert}$$ is a non-negative homogeneous convex function, e.g., a seminorm or a norm. For strong stability of the TVD-RK scheme, we would take the maximum of (semi)norms up to the $$(i-1)\text{st}$$ stage and use the assumption that the sum of the coefficients in each stage equals $$1$$. However, here we do not have $$\alpha_{20} + \alpha_{10} = 1$$. Instead, we can substitute the first inequality into the second and obtain \begin{equation*} {\Vert {{\boldsymbol p}^{n,0}} \Vert} \leq ( \alpha_{20} + \alpha_{21} \, \alpha_{10} ) {\Vert { {\boldsymbol p}^{n,2} } \Vert} = {\Vert { {\boldsymbol p}^{n,2} } \Vert}, \end{equation*} which holds true since $$\alpha_{10}=1$$ and $$\alpha_{20} + \alpha_{21} = 1$$. This shows that for two-stage TVD-RK methods we have stability at each time step, which is weaker than stability at each stage; observe that in Proposition 4.1 stability is achieved at each stage and therefore at each time step. We can generalize this observation to an arbitrary $$s$$-stage TVD-RK method. Theorem 4.5 Suppose the state equation is discretized with an SSP TVD-RK method. Moreover, suppose that $$k$$ is chosen such that it ensures forward Euler stability for both the discrete state and adjoint. Then the discrete adjoint is stable in each time step for an arbitrary $$s$$-stage method., i.e., \begin{equation*} {\Vert {{\boldsymbol p}^{n,0}} \Vert} \leq {\Vert {{\boldsymbol p}^{n,s}} \Vert}, \end{equation*} where $${\Vert {\cdot} \Vert}$$ is a non-negative homogeneous convex function, e.g., a seminorm or a norm. Proof. Since we require Euler step stability, we have for each stage \begin{equation*} {\Vert { {\boldsymbol p}^{n,\ell}} \Vert} \leq \sum_{i=\ell+1}^s \alpha_{i\ell} \, {\Vert {{\boldsymbol p}^{n,i}} \Vert} \quad \text{for } \ell = 0, \ldots, s-1. \end{equation*} Let $$\ell=0$$ and recall that $$\alpha_{10}=1$$. Then, using the above inequality, we have \begin{equation*} {\Vert { {\boldsymbol p}^{n,0}} \Vert} \leq \sum_{i=1}^s \alpha_{i0} \, {\Vert {{\boldsymbol p}^{n,i}} \Vert} = {\Vert {{\boldsymbol p}^{n,1}} \Vert} + \sum_{i=2}^s \alpha_{i0} {\Vert {{\boldsymbol p}^{n,i}} \Vert} \leq \sum_{i=2}^s ( \alpha_{i1} + \alpha_{i0} ) {\Vert {{\boldsymbol p}^{n,i}} \Vert}. \end{equation*} Now isolating the term with $$i=2$$ and noting that $$\alpha_{21} +\alpha_{20}=1$$ we have \begin{equation*} {\Vert { {\boldsymbol p}^{n,0}} \Vert} \leq {\Vert { {\boldsymbol p}^{n,2} } \Vert} + \sum_{i=3}^{s} (\alpha_{i1}+\alpha_{i0}) {\Vert {{\boldsymbol p}^{n,i}} \Vert} \leq \sum_{i=3}^s (\alpha_{i2}+\alpha_{i1} + \alpha_{i0}) {\Vert {{\boldsymbol p}^{n,i}} \Vert}. \end{equation*} We repeat this procedure to obtain $${\Vert { {\boldsymbol p}^{n,0}} \Vert} \leq \sum_{i=\ell'}^s \left( \sum_{j=0}^{\ell'-1} \alpha_{ij} \right) {\Vert { {\boldsymbol p}^{n,i} } \Vert}$$ for all $$\ell' = 1, \ldots, s$$. We then choose $$\ell' = s$$ and obtain $${\Vert {{\boldsymbol p}^{n,0}} \Vert} \leq (\sum_{j=0}^{s-1} \alpha_{sj}) {\Vert {{\boldsymbol p}^{n,s}} \Vert} = {\Vert {{\boldsymbol p}^{n,s}} \Vert}$$, which completes the proof. □ Theorem 4.5 shows that any $$s$$-stage TVD-RK method for the discrete state yields a stable TVD-RK method for the discrete adjoint. However, the discrete adjoint is proved to be stable at each time step instead of each stage. 4.2 Order conditions for the discrete adjoint In this section, we study approximation properties of the scheme for the discrete adjoint. We focus on deriving order conditions for the discrete adjoint scheme. For this purpose, extra conditions on the TVD-RK method applied to the state discretization are needed to ensure high accuracy of the adjoint scheme. We also check which methods provide optimal CFL constants. For simplicity of the notation, we consider TVD-RK methods with a conjugate coefficient table as in Proposition 4.3 for the following linear problem: $$\dot{{\boldsymbol p}}(t) + R(t) {\boldsymbol p}(t) = 0, \quad {\boldsymbol p}(T) = {\boldsymbol p}_T \in V_h.$$ (4.9) Here $$R(t)$$ is a linear operator and defined as $$R(t) := -\frac{1}{h} F_h'({\boldsymbol y}(t))^\top,$$ (4.10) where $${\boldsymbol y}(t)$$ is the solution of the semidiscrete problem (3.1), i.e., $$\dot{{\boldsymbol y}}(t)=-\frac{1}{h} F_h({\boldsymbol y}(t))$$. For completeness we state derivatives of $$R(\cdot)$$: \begin{align} \dot{R}(t) &= \frac{1}{h^2} F''_h({\boldsymbol y}(t))^\top F_h({\boldsymbol y}(t)), \notag\\[-2pt] \ddot{R}(t) &= -\frac{1}{h^3} \left[ F''_h({\boldsymbol y}(t))^\top F'_h({\boldsymbol y}(t))^\top F_h({\boldsymbol y}(t)) + F_h'''({\boldsymbol y}(t))^\top F_h({\boldsymbol y}(t))^2 \right]\!. \end{align} (4.11) Observe that $$R(t)$$ at the $$\ell$$th stage is approximated by $$\frac{-1}{h} F_h'({\boldsymbol y}_h^{n,\ell})$$; see for instance Example 4.2 and the definition of $$H'_{i\ell}({\boldsymbol y}_h^{n,\ell})$$. Since we consider the local error of the TVD-RK method in the time interval $$[t_n,t_{n+1}]$$, we let $${\boldsymbol y}_h^{n,0} = {\boldsymbol y}(t_n)$$. For the analysis of the TVD-RK method, we need approximation properties of $$\frac{-1}{h} F_h'({\boldsymbol y}_h^{n,\ell})$$. A direct calculation gives \begin{align} -\frac{1}{h} F'_h({\boldsymbol y}_h^{n,0})^\top &= R(t_{n+1}) - k \dot{R}(t_{n+1}) + \frac{1}{2} k^2 \ddot{R}(t_{n+1}) + \mathcal{O}(k^3), \\[-2pt] \end{align} (4.12) \begin{align} -\frac{1}{h} F'_h({\boldsymbol y}_h^{n,1})^\top &= R(t_{n+1}) - (1-\beta_{10}) k \dot{R}(t_{n+1}) \notag\\[-2pt] &\quad - \frac{1}{2} \frac{k^2}{h^3} (1-\beta_{10})^2 F'''_h({\boldsymbol y}(t_{n+1}))^\top F_h^2({\boldsymbol y}(t_{n+1})) \notag\\[-2pt] &\quad - \frac{1}{2} \frac{k^2}{h^3} (1 - 2 \beta_{10} ) F_h''({\boldsymbol y}(t_{n+1}))^\top F_h'({\boldsymbol y}(t_{n+1}))^\top F_h({\boldsymbol y}(t_{n+1})) + \mathcal{O}(k^3) \end{align} (4.13) and \begin{align} -\frac{1}{h} F'_h({\boldsymbol y}_h^{n,2})^\top &= R(t_{n+1}) - (1- \psi) k \dot{R}(t_{n+1}) \notag\\[-2pt] &- \frac{1}{2} \frac{k^2}{h^3} (1- \psi)^2 F'''_h({\boldsymbol y}(t_{n+1}))^\top F_h^2({\boldsymbol y}(t_{n+1})) \notag\\[-2pt] & - \frac{1}{2} \frac{k^2}{h^3} (1 - 2 \psi + 2 \beta_{21} \beta_{10}) F_h''({\boldsymbol y}(t_{n+1}))^\top F_h'({\boldsymbol y}(t_{n+1}))^\top F_h({\boldsymbol y}(t_{n+1})) \notag\\[-2pt] & +\mathcal{O}(k^3), \end{align} (4.14) where $$\psi := \beta_{20} + \beta_{21} + \alpha_{21} \beta_{10}.$$ (4.15) Note that since the inner stages of the RK are of low order, we have first-order approximation of $$R(\cdot)$$ in (4.13) and (4.14). A Taylor expansion of the solution of (4.9) at time $$t_{n+1}$$ with the time step $$(-k)$$ yields \begin{align} {\boldsymbol p}(t_{n}) &= {\boldsymbol p}(t_{n+1}) + k R(t_{n+1}) {\boldsymbol p}(t_{n+1}) + \frac{k^2}{2} \left[ R^2(t_{n+1}) - \dot{R}(t_{n+1}) \right] {\boldsymbol p}(t_{n+1}) \notag\\ &\quad - \frac{k^3}{6} \left[ 2 \dot{R}(t_{n+1}) R(t_{n+1}) + R(t_{n+1}) \dot{R}(t_{n+1}) - R^3(t_{n+1}) - \ddot{R}(t_{n+1}) \right] {\boldsymbol p}(t_{n+1}) \notag\\ &\quad +\mathcal{O}(k^4). \end{align} (4.16) We will compare the TVD-RK method with conjugate coefficients against the Taylor expansion in (4.16). 4.2.1 Two-stage methods We now study the approximation properties of a two-stage scheme. In this vein, Shu & Osher (1988) derived order conditions for TVD-RK methods with two stages. As before, let $$\{\alpha_{ij}\}, \{\beta_{ij}\}$$ be the coefficients of a TVD-RK scheme for the state equation. Then the method is second order if it satisfies (4.2) and the order conditions $$\begin{array}{rcl} \alpha_{21}\beta_{10}+\alpha_{10}\beta_{21} + \beta_{20} &=& 1, \\[2pt] \beta_{10} \beta_{21} &=& \frac{1}{2}. \end{array}$$ (4.17) We need to find extra conditions on $$\{\alpha_{ij}\}$$ and $$\{\beta_{ij}\}$$ to obtain a second-order approximation for the discrete adjoint too. For the TVD-RK method with a conjugate coefficient table applied to (4.9) we have (see Example 4.2) \begin{equation*} \begin{array}{rcl} {\boldsymbol p}_h^{n,0} &=& \left[ \alpha_{10} + k \beta_{10} R(t_{n}) \right] {\boldsymbol p}_h^{n,1} + \left[ \alpha_{20} + k \beta_{20} R(t_{n}) \right] {\boldsymbol p}_h^{n,2}, \\[2pt] {\boldsymbol p}_h^{n,1} &=& \left[ \alpha_{21} + k \beta_{21} R(t_{n+1}) - k^2 \beta_{21}(1-\beta_{10}) \dot{R}(t_{n+1}) \right] {\boldsymbol p}_h^{n,2} + \mathcal{O}(k^3). \end{array} \end{equation*} Eliminating $${\boldsymbol p}_h^{n,1}$$, we simplify the above equations and obtain \begin{align} {\boldsymbol p}_h^{n,0} &= (\alpha_{20}^{} + \alpha_{10}^{}\alpha_{21}^{}) {\boldsymbol p}_h^{n,2} + (\alpha_{10} \beta_{21} + \alpha_{21} \beta_{10} + \beta_{20} ) k R(t_{n+1}) {\boldsymbol p}_h^{n,2} \notag\\ & \quad + \left[ \beta_{10}\beta_{21} R^2(t_{n+1}) - (\beta_{20}+\alpha_{21}\beta_{10} + \alpha_{10} \beta_{21}(1-\beta_{10})) \dot{R}(t_{n+1}) \right] k^2 {\boldsymbol p}_h^{n,2} + \mathcal{O}(k^3). \end{align} (4.18) Recall that $$\alpha_{10} =1$$ and $$\alpha_{20}+\alpha_{21} = 1$$. Then using these facts in the above expansion we get \begin{align} {\boldsymbol p}_h^{n,0} &= {\boldsymbol p}_h^{n,2} + ( \alpha_{10} \beta_{21} + \alpha_{21} \beta_{10} + \beta_{20} ) k R(t_{n+1}) {\boldsymbol p}_h^{n,2} \notag\\ &\quad + \left[ \beta_{10}\beta_{21} R^2(t_{n+1}) - (\beta_{20}+\alpha_{21}\beta_{10} + \alpha_{10}\beta_{21}(1-\beta_{10})) \dot{R}(t_{n+1}) \right] k^2 {\boldsymbol p}_h^{n,2} + \mathcal{O}(k^3). \end{align} (4.19) We compare (4.19) to the Taylor expansion of the exact solution (4.16) and require the following conditions on the coefficients: $$\begin{array}{rcl} \alpha_{21}\beta_{10}+\alpha_{10}\beta_{21} + \beta_{20} &=& 1, \\[2pt] \beta_{10} \beta_{21} &=& \frac{1}{2}, \\[2pt] \beta_{20} + \alpha_{21} \beta_{10} + \alpha_{10}\beta_{21}(1-\beta_{10}) &=& \frac{1}{2}. \end{array}$$ (4.20) Note that the first and second conditions are satisfied due to (4.17). Moreover, the first two conditions imply that the third condition is automatically satisfied. Therefore any second-order TVD-RK method applied to the forward problem results in a second-order approximation of the discrete adjoint. The above arguments prove the following theorem. Theorem 4.6 Suppose a second-order two-stage TVD-RK method is used to discretize the state equation. Then the corresponding TVD-RK method for the adjoint equation is consistent and is second-order. Moreover, the optimal CFL constant is $$1$$. 4.2.2 Three-stage methods In this section, we analyse approximation properties of the third-order three-stages TVD-RK methods. The following conditions should be satisfied to ensure a third-order discrete state: \begin{align} \alpha_{32} &= 1 - \alpha_{31} - \alpha_{30}, \\[-2pt] \end{align} (4.21) \begin{align} \beta_{32} &= \frac{3 \beta_{10} - 2}{6 \psi (\beta_{10} - \psi)}, \\[-2pt] \end{align} (4.22) \begin{align} \beta_{21} &= \frac{1}{6 \beta_{10} \beta_{32} }, \\[-2pt] \end{align} (4.23) \begin{align} \beta_{31} &= \frac{ \frac{1}{2} - \alpha_{32} \beta_{10} \beta_{21} - \psi \beta_{32} }{\beta_{10}}, \\[-2pt] \end{align} (4.24) \begin{align} \beta_{30} &= 1 - \alpha_{31} \beta_{10} - \alpha_{32} \psi - \beta_{31} - \beta_{32}, \\[-2pt] \end{align} (4.25) \begin{align} \beta_{20} &= \psi - \alpha_{21} \beta_{10} - \beta_{21}. \end{align} (4.26) The free parameters are $$\alpha_{21}, \alpha_{30}, \alpha_{31}, \beta_{10}$$ and $$\psi$$; see the discussion in Shu & Osher (1988) for details. The same analysis as in Section 4.2.1 and employing inner stages approximation properties (4.12)–(4.14) and comparing with coefficients of the Taylor expansion in (4.16) yields that the following conditions must be satisfied for the discrete adjoint: $$\begin{array}{rcll} \alpha_{31} \beta_{10} + \alpha_{21} \alpha_{32} \beta_{10} + \alpha_{32} (\beta_{20} + \beta_{21}) + \beta_{30} + \beta_{31} + \beta_{32} &=&1 & \text{(first order)}, \end{array}$$ (4.27) \begin{align} \left. \begin{array}{rcl} \alpha_{32} \beta_{10} \beta_{21} + \beta_{10} \beta_{31} + \alpha_{21} \alpha_{32} \beta_{10} + \beta_{32} (\beta_{20} + \beta_{21}) &=& \frac{1}{2} \quad \text{(for}\,R^2) \\[4pt] \alpha_{31} \beta_{10} + \alpha_{21} \alpha_{32} \beta_{10} + \alpha_{32} \beta_{20} + \beta_{30} + \beta_{32} (1-\psi) \\[4pt] + (\alpha_{32} \beta_{21} + \beta_{31})(1-\beta_{10}) &=& \frac{1}{2} \quad (\text{for}\,\dot{R}) \end{array} \right\} & \text{(second order)}, \end{align} (4.28) \begin{align} \left. \begin{array}{rcl} \beta_{10} \beta_{21} \beta_{32} &=& \frac{1}{6} \quad (\text{for} R^3) \\[4pt] \beta_{20}\beta_{32} + \beta_{10}\beta_{31} + (1-\beta_{10})\beta_{21}\beta_{32} + \beta_{10} \beta_{21} \alpha_{32} + \beta_{10} \beta_{32} \alpha_{21} &=& \frac{1}{3} \quad (\text{for} \dot{R} R) \\[4pt] (1-\psi) \left( \beta_{32}\beta_{20} + \beta_{10}\alpha_{21}\beta_{32} + \beta_{32}\beta_{21} \right) \quad \\ + (1-\beta_{10}) \left( \beta_{21}\beta_{10}\alpha_{32} + \beta_{10} \beta_{31} \right) &=& \frac{1}{6} \quad (\text{for}\,R\dot{R}) \\[4pt] \alpha_{31} \beta_{10} + \alpha_{21} \alpha_{32} \beta_{10} + \alpha_{32} \beta_{20} + (\alpha_{32} \beta_{21} + \beta_{31}) (1-\beta_{10})^2 \\ + \beta_{30} + \beta_{32} (1-\psi)^2 &=& \frac{1}{3} \quad (\text{for}\,F'''F^2) \\[4pt] \alpha_{31} \beta_{10} + \alpha_{21} \alpha_{32} \beta_{10} + \alpha_{32} \beta_{20} + (\alpha_{32} \beta_{21} + \beta_{31}) (1-2\beta_{10}) \\ + \beta_{30} + \beta_{32} (1-2\psi+2 \beta_{21}\beta_{10}) &=& \frac{1}{3} \quad (\text{for}\,F''F'F) \end{array} \right\} & \text{(third order)}. \end{align} (4.29) One can show that if conditions (4.21)–(4.26) are satisfied then (4.27)–(4.29) are automatically satisfied except for the term associated to $$R \dot{R}$$: let us define $$A := (1-\psi) \left( \beta_{32}\beta_{20} + \beta_{10}\alpha_{21}\beta_{32} + \beta_{32}\beta_{21} \right) + (1-\beta_{10}) \left( \beta_{21}\beta_{10}\alpha_{32} + \beta_{10} \beta_{31} \right)$$. Then we have \begin{equation*} A = (1-\psi) \psi \beta_{32} + (1-\beta_{10})\left(\tfrac{1}{2} - \psi \beta_{32} \right) = \tfrac{1}{2} + \psi (\psi - \beta_{10}) \beta_{32} - \tfrac{1}{2} \beta_{10} = \tfrac{1}{2} + \tfrac{1}{2} \beta_{10} - \tfrac{1}{6} - \tfrac{1}{2} \beta_{10} = \tfrac{1}{3}, \end{equation*} which cannot be equal to $$\frac{1}{6}$$ for any choice of parameters. Therefore the discrete adjoint TVD-RK method is at most second order. The above arguments prove the following theorem. Theorem 4.7 Suppose a third-order three-stage TVD-RK method is used to discretize the state equation. Then the corresponding TVD-RK method for the adjoint equation is consistent and is at most second order. Moreover, the optimal CFL constant is $$1$$. 4.2.3 Fifth-stage (fourth-order) method We have already mentioned that a fourth-order TVD-RK method with four stages does not exist (see Gottlieb & Shu, 1998; Ruuth & Spiteri, 2002). Using a nonlinear programming computer code, a fourth-order with five-stage TVD-RK method was found in Spiteri & Ruuth (2002, Appendix B). It has been shown in Hager (2000, Proposition 6.1) that when a general four-stage fourth-order RK method is applied to the forward problem of an optimal control problem of ODEs, then the corresponding discrete adjoint is also automatically fourth order. Inspired by this result, we would like to check whether or not the fourth-order with five-stage TVD-RK method of Ruuth & Spiteri (2002) generates a fourth-order discrete adjoint too. To check the order conditions for this particular TVD-RK method, we use Butcher’s table of the mentioned TVD-RK method and check whether the order conditions in Hager (2000, Table 1) are satisfied. A direct calculation shows that the discrete adjoint is only second order. This is, however, not surprising, since the TVD-RK method is obtained by nonlinear programming (with order conditions for the forward problem as constraints). 4.3 Convergence of the TVD-RK method In this section, we discuss convergence of the discrete adjoint obtained from a TVD-RK method. We follow the framework established in Ulbrich (2001, Chapter 6.4). We consider the case when the end data of the continuous adjoint is smooth, e.g., $$p(x,T) \in {\textrm{Lip}}_\text{loc}({\mathbb{R}})$$, and the space discretization is either the Lax–Friedrichs or Engquist–Osher method, which yields a monotone scheme when combined with a forward Euler method in time (see, e.g., LeVeque, 1990, Chapter 15.7). Consider the TVD-RK schemes with either Lax–Friedrichs or Engquist–Osher discretization and suppose the time step $$k$$ is chosen such that the forward Euler discretizations (4.3) are monotone and their corresponding discrete adjoint scheme is TVD (see (3.13), (4.4)). The first ingredient for the proof is to show that the TVD-RK scheme is again a monotone scheme for the forward problem, i.e., given any initial data $$y_j^n, w_j^n$$, at time $$t_n$$, we have \begin{equation*} y_j^n \geq w_j^n \quad \forall\, j \implies y_j^{n+1} \geq w_j^{n+1} \quad \forall\, j. \end{equation*} Suppose that until the $$\ell$$th stage we have $$y_j^{n,i} \geq w_j^{n,i}$$ for all $$i=1,\ldots,\ell$$. Since $$y_j^{n,\ell+1}$$ is equal to a convex combination of forward Euler methods and each forward Euler method is a monotone scheme, we can conclude that $$y_j^{n,\ell+1} \geq w_j^{n,\ell+1}$$. The second ingredient is to show that after eliminating the inner stages we have a conservative scheme. Let us demonstrate this for a two-stage method: observe that the application of an Euler step at the first stage can be written as $$y_j^{n,1} = \mathcal{H}(y_{j-1}^{n,0},y_{j}^{n,0},y_{j+1}^{n,0}) := \alpha_{10}^{} y_{j}^{n,0} - \beta_{10}^{} \frac{k}{h} \left[ \hat{f}(y_j^{n,0},y_{j+1}^{n,0}) - \hat{f}(y_{j-1}^{n,0},y_{j}^{n,0}) \right]\!.$$ (4.30) Then the second stage can be written in the conservative form $$y_j^{n+1} = y_j^{n} - \frac{k}{h} \left[ \tilde{f}(y_{j-1}^{n},y_{j}^{n},y_{j+1}^{n},y_{j+2}^{n}) - \tilde{f}(y_{j-2}^{n}, y_{j-1}^{n},y_{j}^{n},y_{j+1}^{n}) \right]\!,$$ (4.31) where the numerical flux is defined, after eliminating the inner stage, by \begin{equation*} \tilde{f}(y_{j-1},y_{j},y_{j+1},y_{j+2}) := (\alpha_{21}\beta_{10}+\beta_{20}) \hat{f}(y_j,y_{j+1}) + \alpha_{10}\beta_{21} \hat{f}(\mathcal{H}(y_{j-1},y_j,y_{j+1}), \mathcal{H}(y_{j},y_{j+1},y_{j+2})). \end{equation*} We state the convergence result in the following proposition. Proposition 4.8 Suppose the final end data of the continuous adjoint satisfies $$p(x,T) \in {\textrm{C}}^{0,1}({\mathbb{R}})$$. Moreover, suppose that the space discretization is either the Lax–Friedrichs or Engquist–Osher method and that the time is discretized using a TVD-RK method. Let the time step $$k= ch$$ be chosen such that the forward Euler discretizations (4.3) are monotone and their corresponding discrete adjoint scheme is TVD, see (3.13), (4.4). Then the discrete adjoint is convergent to the unique reversible solution as $$k = ch \rightarrow 0$$, i.e., \begin{equation*} p_h \rightarrow p \quad \text{in } B([0,T];{\textrm{L}}^r(I)), \end{equation*} where $$B([0,T];{\textrm{L}}^r(I))$$ is the space of bounded functions equipped with the $$\sup$$-norm and with values in $${\textrm{L}}^r(I)$$ for $$r \in [1,\infty)$$ and $$I := (-R,R)$$ for all $$R>0$$. Proof. We give the main steps of the proof from Ulbrich (2001, Chapter 6.4). First, we eliminate the inner stages and write the TVD-RK method in a conservative form (see (4.31) for the two-stage method): \begin{equation*} y_j^{n+1} = y_j^{n} - \frac{k}{h} \left[ \tilde{f}_{j+1/2}^n - \tilde{f}_{j-1/2}^{n} \right]\!, \quad \tilde{f}_{j+1/2}^{n} := \tilde{f}(y_{j-K+1}^n,\ldots,y_{j+K}^n). \end{equation*} Therefore, it is of the form Ulbrich (2001, Equation (6.9)). Then the discrete adjoint can be written as $$p_j^{n} = p_j^{n+1} + \frac{k}{h} \sum_{i=1-K}^{K} a_{j-i+1/2,i}^{n} \, (p_{j-i+1}^{n+1} - p_{j-i}^{n+1}),$$ (4.32) where $$a_{j-i+1/2,i}^{n} = \partial_{y_i}\tilde{f}_{j-i+1/2}^n$$. For the $${\textrm{L}}^\infty$$-stability, it is more convenient to write the discrete adjoint scheme in the form \begin{equation*} p_j^n = \sum_{i=-K}^{K} B_{j,i}^n p_{j-i}^{n+1}, \end{equation*} where $$B_{j,-K}^n = \frac{k}{h} a^n_{j+K-\frac{1}{2},1-K}$$, $$B_{j,K}^n = - \frac{k}{h} a^n_{j-K+\frac{1}{2},K}$$ and \begin{equation*} B_{j,i}^n = \delta_{0,i} + \frac{k}{h} \left( a^n_{j-i-\frac{1}{2},i+1} - a^n_{j-i+\frac{1}{2},i}\right)\!, \quad -K < i < K, \end{equation*} where $$\delta_{0,i}$$ is the Kronecker delta. For the stability of the total variation, however, it is more convenient to write the discrete adjoint scheme in the form $$(p_{j+1}^{n}-p_{j}^{n}) = \sum_{i=-K}^{K} C_{j,i}^{n} \left(p_{j-i+1}^{n+1}-p_{j-i}^{n+1}\right)\!,$$ (4.33) where $$C_{j,-K}^n = \frac{k}{h} a^n_{j+K+\frac{1}{2},1-K}$$, $$C_{j,K}^n = - \frac{k}{h} a^n_{j-K+\frac{1}{2},K}$$ and \begin{equation*} C_{j,i}^n = \delta_{0,i} + \frac{k}{h} \left( a^n_{j-i+\frac{1}{2},i+1} - a^n_{j-i+\frac{1}{2},i}\right)\!, \quad -K < i < K. \end{equation*} Monotonicity of the TVD-RK method applied to the forward problem ensures that $$B_{j,i}^n \geq 0$$ (see Ulbrich, 2001, Lemma 6.4.2). Moreover, the TVD-RK adjoint scheme is TVD by Theorem 4.5 and maps by (4.32) constant values $$p_{j-K}^{n+1}=\cdots=p_{j+K}^{n+1}=c$$ to the same value $$p_j^n=c$$. Hence, Harten (1983, Theorem 2.1) yields that the TVD-RK adjoint scheme is monotonicity perserving and thus (4.33) implies $$C_{j,i}^n \geq 0$$. This shows that condition (1) in Ulbrich (2001, Chapter 6.4.1, condition (1)) is satisfied. It remains to show that assumptions (D1), (D2) and (D3) of Ulbrich (2001, Chapter 6.4.1) hold. (D1) is consistency of the numerical flux, which holds in our case; observe for instance that $$\tilde{f}(y,y,y,y) = f(y)$$ in (4.30) since $$\alpha_{21}\beta_{10} + \alpha_{10}\beta_{21} + \beta_{20} = 1$$. (D2) is the convergence of the discrete state to the entropy solution of the state, which holds as our discretization is TVD. (D3) is an OSLC condition which holds for Lax–Friedrichs and Engquist–Osher discretization (see Ulbrich, 2001, Lemma 6.5.2 and Lemma 6.5.5 for details). Then we can apply Ulbrich (2001, Theorem 6.4.4 and Theorem 6.4.6) to show convergence of the discrete adjoint to the unique reversible solution. □ 5. Numerical experiments In this section, we perform numerical experiments on TVD-RK methods for computing the discrete adjoint state. We show through numerical experiments that the adjoint scheme obtained from discretization of the forward problem using a TVD-RK method is stable. Let us consider the configuration of Example 2.5. The domain is set to be $${\it{\Omega}} = (-1,1)$$, the flux function $$f(y) := \frac{1}{2}y^2$$ and $$u(x)=1$$ for $$x \in [-1,0)$$ and $$u(x)=-1$$ for $$x \in [0,1]$$. We set the function $$G(y(x,T)):= \frac{1}{2} |y(x,T)|^2$$. For the boundary condition we choose $$y(-1,t) = 1$$ and $$y(1,t)=-1$$. We then use the Lax–Friedrichs and Engquist–Osher schemes to compute $$\{ {\boldsymbol y}_h^n \}_{n=1}^{n_T}$$ with the time step \begin{equation*} k = \frac{1}{2} \left( \min_{\alpha_{ij},\beta_{ij}\not = 0} \frac{\alpha_{ij}}{\beta_{ij}} \right) \gamma \, h, \end{equation*} where $$\gamma = \frac{1}{2}$$ is the optimal CFL constant for both the forward and adjoint discretization as discussed in Section 3.3. Here, as before, $$h$$ is the mesh parameter and is inversely proportional to the number of cells $$N$$. We use the TVD-RK methods of Table 1. As shown in Theorem 4.5, the discrete adjoint is also stable provided the TVD-RK method for the forward problem is SSP. Moreover, we expect that the discrete adjoint approximates the continuous adjoint at $$t=0$$, \begin{equation*} p(x,0) = \begin{cases} 1, & -1 < x < -\frac{1}{2}, \\[2pt] 0, & -\frac{1}{2} \leq x \leq \frac{1}{2}, \\[2pt] -1, & -\frac{1}{2} < x < 1, \end{cases} \end{equation*} even though the discrete adjoint scheme does not impose the ‘interior’ boundary condition (2.9). In Fig. 2, we observe that for both the Engquist–Osher and Lax–Friedrichs methods with a two-stage TVD-RK method, we obtain a stable TVD discrete adjoint. Note that in the interval $$x \in (-\frac{1}{2},\frac{1}{2})$$, the discrete adjoint has the correct value, i.e., $$p(x,0) = 0$$, and the shock location is correct as well. Moreover, the discrete adjoint converges as we refine the mesh. Identical results are obtained using a three-stage TVD-RK method. The reason for this is that the Lax–Friedrichs and Engquist–Osher methods are low-order methods, while the time discretization is high order and the leading error is due to the spatial discretization. Fig. 2. View largeDownload slide Discrete adjoint computed using Engquist–Osher and Lax–Friedrichs schemes with two-stage TVD-RK method. Fig. 2. View largeDownload slide Discrete adjoint computed using Engquist–Osher and Lax–Friedrichs schemes with two-stage TVD-RK method. In Fig. 3, we observe that the total variation of the discrete adjoint at any time $$t<T$$ deviates from the final-time discrete adjoint, i.e., $${\boldsymbol p}_h^{n_T}$$, only up to machine precision. This implies that the TVD-RK method for the discrete adjoint is TV stable, even though it is not SSP. Fig. 3. View largeDownload slide The difference between total variation of the discrete adjoint at time $$t_n$$ and total variation of the final-time discrete adjoint. Fig. 3. View largeDownload slide The difference between total variation of the discrete adjoint at time $$t_n$$ and total variation of the final-time discrete adjoint. 5.1 Giles’ test case In this section, we perform numerical experiments on an optimal control problem that was first proposed by Giles (2003). Let us choose the setting of the problem of the previous section except for the objective functional which we now choose to be $$G(y) := y^5 - y$$. Note that since $$y(x,T) = 1$$ for $$x \in [-1,0)$$ and $$y(x,T) = -1$$ for $$x \in (0,1]$$, we have $$G'(y(x,T)) = 4$$ for $$x \in [-1,1]\setminus \{0\}$$. Moreover, since there is a shock in the solution at time $$T$$ at $$x=0$$ we should impose an ‘interior’ boundary condition for the adjoint state. Since $${[\![{ G(y(x,T)) }]\!]} = 0$$, we should set the ‘interior’ boundary condition to \begin{equation*} p(0,t) = 0 \quad \forall\, t \in [0,T]. \end{equation*} Therefore the adjoint state at time $$t=0$$ reads \begin{equation*} p(x,0) = \begin{cases} 4, & -1 < x < -\frac{1}{2}, \\ 0, & -\frac{1}{2} \leq x \leq \frac{1}{2}, \\ 4, & -\frac{1}{2} < x < 1. \end{cases} \end{equation*} If we choose the discretize-then-optimize approach, we do not impose such an ‘interior’ boundary condition for the discrete adjoint. It was first observed in Giles (2003) that the Lax–Friedrichs scheme provides a discrete adjoint that converges to a wrong adjoint (see Remark 3.3). We perform the numerical experiment for the Engquist–Osher scheme with a two-stage TVD-RK method and the number of cells $$N=800$$. In Fig. 4, we observe that the discrete adjoint has the correct value for $$x \in [-1,-\frac{1}{2}) \cup (\frac{1}{2},1]$$, but it has a wrong value for $$x \in [-\frac{1}{2},\frac{1}{2}]$$, i.e., $$p_h(x,0) = 0.25$$ for $$x \in [-\frac{1}{2}, \frac{1}{2}]$$. The wrong value in this region does not improve under refinement and the approximation converges to the value $$0.25$$ in this region. Note that the final-time condition $$p_h(0,T)$$ has a Dirac delta shape at $$x=0$$ due to the nonlinearity of the objective functional with the value of $$0.25$$. This is precisely the value that is transported backward in time. Fig. 4. View largeDownload slide The discrete adjoint computed using Engquist–Osher (left) and Lax–Friedrichs (right) schemes and a two-stage TVD-RK method for Giles’ test case. Fig. 4. View largeDownload slide The discrete adjoint computed using Engquist–Osher (left) and Lax–Friedrichs (right) schemes and a two-stage TVD-RK method for Giles’ test case. It is shown by Giles & Ulbrich (2010a,b) that the Lax–Friedrichs scheme converges to the correct adjoint provided a restrictive time step of type $$k = \mathcal{O}(h^{2-q})$$, for $$0<q<1$$; see Fig. 4 (right). Observe in Fig. 5 that the Lax–Friedrichs method converges in the shock funnel with $$k=\mathcal{O}(h^{1.2})$$. Fig. 5. View largeDownload slide Convergence of the discrete adjoint computed using the Lax–Friedrich method. Fig. 5. View largeDownload slide Convergence of the discrete adjoint computed using the Lax–Friedrich method. Such a time step increases diffusion in the scheme and allows more grid points to enter the shock and, hence, it leads to convergence of the discrete adjoint to the correct solution. This, however, is not possible with the classical definition of the Engquist–Osher scheme as such a diffusion is not present. In Fig. 6, we plot the deviation of the total variation of the discrete adjoint from the total variation of $$p_h(x,T)$$. Observe that we have $$p_h(x,t) \leq p_h(x,T)$$ for $$t \in [0,T]$$ up to machine precision, which agrees with the theoretical stability result for TVD-RK methods. Fig. 6. View largeDownload slide The difference between total variation of the discrete adjoint at time $$t_n$$ and total variation of the final time discrete adjoint for Giles’ test case. Fig. 6. View largeDownload slide The difference between total variation of the discrete adjoint at time $$t_n$$ and total variation of the final time discrete adjoint for Giles’ test case. 5.2 Convergence of the discrete adjoint We now validate the convergence result of Proposition 4.8 where we use mollification in order to guarantee that $$p(x,T) \in {\textrm{Lip}}_\text{loc}({\mathbb{R}})$$. For mollifying the end data, we use the smoothing kernel \begin{equation*} \varphi(x) = \begin{cases} \frac{1}{Z} \exp(-\frac{1}{1-x^2}), & -1 < x < 1, \\ 0 & \text{otherwise}, \end{cases} \end{equation*} where $$Z \in {\mathbb{R}}^+ \setminus \{0\}$$ is a normalization constant such that $$\int_{\mathbb{R}} \varphi(x) {\textrm{d}} x = 1$$. Given a positive constant $$\epsilon$$, we then denote the mollified final state by $$y_{\varphi,\epsilon}(x,T)$$ and define it by \begin{equation*} y_{\varphi,\epsilon}(x,T) := \int_{\mathbb{R}} \frac{1}{\epsilon} \varphi\left(\frac{z}{\epsilon} \right) y(x-z,T) \,{\textrm{d}} z. \end{equation*} The objective functional is given as before by $$J(y) := \int_{\mathbb{R}} G(y_{\varphi,\epsilon}(x,T)) {\textrm{d}} x$$ with $$G(y) = y^5 - y$$, and we consider the Burgers equation with prescribed boundary and initial conditions as in the previous section. Since $$y_{\varphi,\epsilon}$$ is smooth, we conclude that $$p(x,T) = G'(y_{\varphi,\epsilon}(x,T)) \in {\textrm{Lip}}_\text{loc}({\mathbb{R}})$$. Therefore, in the shock funnel we should obtain $$p(x,t) = G'(y_{\varphi,\epsilon}(0,T)) = -1$$ since $$y_{\varphi,\epsilon}(0,T) = 0$$. This yields that for the continuous adjoint at $$t=0$$, we have \begin{equation*} p(x,0) = -1 \quad \text{for } x \in \left[-\frac12, \frac12\right]\!, \end{equation*} for any given $$\epsilon$$. We discretize the mollification function by averaging on the mesh, i.e., $$\mathbf{\varphi} := (\varphi_1, \ldots, \varphi_N)$$ where \begin{equation*} \varphi_j := \frac{1}{h_j} \int_{x_{j-1/2}}^{x_{j+1/2}} \frac{1}{\epsilon} \varphi \left( \frac{x}{\epsilon} \right) \,{\textrm{d}} x \quad \forall\, j \in \mathbb{Z}, \end{equation*} and the discrete mollified state is given by \begin{equation*} y_{j,\varphi,\epsilon}^{n_T} = \sum_{j'} h_j^{}\, \varphi_j^{} y_{j-j'}^{n_T} \quad \forall\, j \in \mathbb{Z}. \end{equation*} For the numerical experiment, we choose, as before, the Engquist–Osher scheme and a two-stage TVD-RK method for the forward problem with $$N=800$$ and the time step is chosen as $$k = 0.25 h$$. In Fig. 7 (left), we plot the discrete adjoint obtained from the Engquist–Osher scheme with smoothed end data. As we observe, in the shock funnel at $$t=0$$ we recover the correct discrete adjoint. Fig. 7. View largeDownload slide Discrete adjoint at $$t=0$$ (left) and convergence of the discrete adjoint to its continuous counterpart in the shock funnel (right). Fig. 7. View largeDownload slide Discrete adjoint at $$t=0$$ (left) and convergence of the discrete adjoint to its continuous counterpart in the shock funnel (right). In Fig. 7 (right) we plot the convergence of the discrete adjoint in the shock funnel. Note that due to the smoothing effect of the mollifier, we recover convergence of the discrete adjoint in the shock funnel. This validates numerically the result of Proposition 4.8. Since we use a first-order scheme in space (for theoretical reasons), we do not expect to obtain an overall convergence rate of more than $$1$$. However, in Fig. 7 (right) we observe a slight improvement in the convergence rate due to the TVD-RK method. 5.3 A numerical optimal control problem In this section, we solve an optimization task using the gradient information obtained from a discrete adjoint. We set the objective functional to \begin{equation*} J(y) := \frac{1}{2} \int_{-1}^{1} |y(x,T) - y_\text{obs}(x)|^2 \,{\textrm{d}} x, \end{equation*} with the final time $$T = \frac{1}{2}$$, \begin{equation*} y_\text{obs}(x) := \begin{cases} 2x - \frac{1}{2}, & \frac{1}{4} \leq x \leq \frac{3}{4}, \\ 0 & \text{otherwise}, \end{cases} \end{equation*} and, Burgers flux function $$f(y) = \frac{1}{2}y^2$$. Given the current control $$u_{j}$$ (at iteration $$j$$), we compute the discrete adjoint at time $$T=0$$, i.e., $$p_j(x,0)$$, and choose $$\delta u = - \eta_j p_j(x,0),$$ where $$\eta_j \in {\mathbb{R}}_+$$ is a parameter to ensure $$J(y_h(u_{j+1})) < J(y_h(u_{j}))$$. Then the updated control at iteration $$j+1$$ reads $$u_{j+1}(x) = u_{j}(x) - \eta_j \, p_j(x,0).$$ (5.1) We choose $$\eta_j$$ by checking the Armijo condition and a back-tracking procedure, i.e., $$J(y_h(u_{j+1})) \leq J(y_h(u_{j})) - c_\text{opt} \eta_j {\Vert {p_j(x,0)} \Vert}_{{\textrm{L}}^2(-1,1)}^2,$$ (5.2) where $$c_\text{opt} \in (0,1)$$ is the Armijo constant. If the above Armijo condition is not satisfied we choose a smaller $$\alpha$$ by \begin{equation*} \alpha_\text{new} := \rho\, \alpha_\text{old}, \end{equation*} where $$0 < \rho < 1$$ and recheck (5.2). In this numerical experiment, we choose $$\rho = 0.95$$, $$c_\text{opt} = 0.5$$ and the initial $$\alpha = 0.5$$. As initial guess we fix \begin{equation*} u_0(x) := \begin{cases} (\frac{3}{4}+x)(\frac{1}{2}-x), & -\frac{3}{4} \leq x \leq \frac{1}{2}, \\ 0 & \text{otherwise}. \end{cases} \end{equation*} We choose the Engquist–Osher method for spatial discretization and Heun’s second-order TVD-RK method. The time step is chosen as $$k = 0.25 h,$$ and the stopping criterion is taken to be $${\Vert {\nabla \mathcal{J}_h(u_h)} \Vert}_{{\textrm{L}}^2({\it{\Omega}})} = {\Vert {p_j(x,0)} \Vert}_{{\textrm{L}}^2({\it{\Omega}})} \leq {\tt tol}$$. In our experiments we set $${\tt tol}= 10^{-2}$$ and $$10^{-4}$$. In Fig. 8, we observe that the numerical algorithm seems to converge to the true solution. That is, it captures correctly the shock location at $$x=\frac{3}{4}$$ and also the rarefaction. There are numerical artefacts at $$x=\frac{3}{4}$$ which vanish as we reduce the tolerance $$\tt tol$$. The corresponding initial guess for the control variable and the final control variable are plotted in Fig. 9. Fig. 8. View largeDownload slide The state variable $$y_h(x,T)$$ obtained from optimization algorithm with $${\tt tol}=10^{-2}$$ and $${\tt tol}=10^{-4}$$. Fig. 8. View largeDownload slide The state variable $$y_h(x,T)$$ obtained from optimization algorithm with $${\tt tol}=10^{-2}$$ and $${\tt tol}=10^{-4}$$. Fig. 9. View largeDownload slide The control variable $$u(x)$$ obtained from optimization algorithm with $${\tt tol}=10^{-2}$$ and $${\tt tol}=10^{-4}$$. Fig. 9. View largeDownload slide The control variable $$u(x)$$ obtained from optimization algorithm with $${\tt tol}=10^{-2}$$ and $${\tt tol}=10^{-4}$$. 6. Conclusion In this article, we studied TVD-RK methods for the numerical treatment of the optimal control problems in a discretize-then-optimize approach. We have shown that a TVD-RK discretization of the state equation yields a TVD-RK method for the discrete adjoint with a conjugate coefficient table. We then showed that requiring strong stability for both discrete state and adjoint is too strong and results in a first-order method. Luckily, imposing SSP properties for the discrete state is sufficient to obtain stability of the discrete adjoint. This result holds for an arbitrary $$s$$-stage TVD-RK method. Moreover, thanks to the linearity of the adjoint equation, the TVD-RK method for the discrete adjoint is consistent. We also studied the approximation properties of the discrete adjoint and showed that for a second-order two-stage method applied to the forward problem, we also obtain a second-order discrete adjoint. However, for the third-order three-stage method applied, we obtain only a second-order discrete adjoint. Inner stages of our theoretical results were finally illustrated by numerical experiments. We would like to finish this conclusion by mentioning that the convergence of the discrete adjoint to the continuous adjoint is an interesting question on its own. Funding German Research Foundation DFG through the collaborative research center SFB-TRR 154 under projects A02 and B02 and by the Research Center MATHEON through project C-SE5 and D-OT1 funded by the Einstein Center for Mathematics, Berlin. Appendix A. Existence of a minimizer Proposition A1 Let $$G(y) = \frac{1}{2} |y(x,T) - y_\text{obs}(x)|^2$$. Then the optimal control problem subject to the conservation law (2.1) has a solution in the admissible set $$U_\text{ad}$$. Proof. We begin with the continuity of the objective functional. Suppose $$y(x,t)$$ and $$w(x,t)$$ are the entropic solution of (2.1) with the initial data $$u(x) \in U_\text{ad}$$ and $$v(x) \in U_\text{ad}$$, respectively. Then we have \begin{equation*} | J(y) - J(w) | = \frac{1}{2} \left| \int_{\mathbb{R}} ( y - w )(y + w - 2 y_{\text{obs}} ) \right| \leq {\Vert { y - w } \Vert}_{{\textrm{L}}^1({\mathbb{R}})} {\Vert { y + w - 2 y_{\text{obs}} } \Vert}_{{\textrm{L}}^\infty({\mathbb{R}})}. \end{equation*} Then by $${\textrm{L}}^1$$-contraction of the entropic solutions we have \begin{equation*} \begin{array}{rcl} | J(y) - J(w) | &\leq& {\Vert { y(\cdot,T) - w(\cdot,T) } \Vert}_{{\textrm{L}}^1({\mathbb{R}})} ({\Vert {y(\cdot,T)} \Vert}_{{\textrm{L}}^\infty({\mathbb{R}})} + {\Vert {w(\cdot,T)} \Vert}_{{\textrm{L}}^\infty({\mathbb{R}})} + c) \\ &\leq& {\Vert { u - v } \Vert}_{{\textrm{L}}^1({\mathbb{R}})} ({\Vert {u} \Vert}_{{\textrm{L}}^\infty({\mathbb{R}})} + {\Vert {v} \Vert}_{{\textrm{L}}^\infty({\mathbb{R}})} + c) \\ &\leq& C {\Vert { u - v } \Vert}_{{\textrm{L}}^1({\mathbb{R}})}, \end{array} \end{equation*} where in the last step we used the $${\textrm{L}}^\infty$$-stability of the entropic solutions and the fact that $$u$$ and $$v$$ are in $$U_\text{ad}$$ and therefore we have a uniform bound on the $${\textrm{L}}^\infty$$-norm. Observe that $$J(y) \geq 0$$ and therefore a minimizing sequence denoted by $$\{u_i\}$$ exists. Since $$U_\text{ad}$$ is a compact set in $${\textrm{L}}^1$$, one can obtain a subsequence denoted by $$\{ u_{i_j} \}$$ that converges strongly in $${\textrm{L}}^1({\mathbb{R}})$$ to $$u^\star \in U_\text{ad}$$ as $$j \rightarrow \infty$$. Then using continuity of the objective functional we have \begin{equation*} \inf_{u \in U_\text{ad}} J(y) = \lim_{j \rightarrow \infty} J(y(u_{i_j})) = J(y(u^\star)) \quad \text{for } u^\star \in U_\text{ad}, \end{equation*} which shows existence of the minimizer. □ We would like to mention also that for a more general admissible set, e.g., $$U_\text{ad} := \left\{ u \in {\textrm{L}}^\infty({\mathbb{R}}) : \text{supp}(u) \in K, {\Vert {u} \Vert}_{{\textrm{L}}^\infty({\mathbb{R}})} \leq C \right\}\!,$$ (A.1) and an assumption on the uniform convexity of the flux function $$f(\cdot)$$, one can also obtain an existence result. We refer the reader to Castro et al. (2008) for the proof. Appendix B. Proof of Proposition 3.1 Proof. Since $$f(\cdot)$$ is $${\textrm{C}}^2$$ we have $$f(w+v) = f(w) + f'(w)v + \frac{1}{2} f''(z) v^2$$ for $$w,v \in {\mathbb{R}}$$ and some $$z \in (w,w+v)$$. For the Lax–Friedrichs flux, a direct calculation shows \begin{equation*} \hat{f}(w_j + v_j, w_{j+1} + v_{j+1}) = \hat{f}(w_j,w_{j+1}) + g_{i+1,i}^\text{LF} + \mathcal{O}(v_j^2) + \mathcal{O}(v_{j+1}^2) + \mathcal{O}(v_{j-1}^2). \end{equation*} For Engquist–Osher, we have \begin{equation*} \hat{f}(w_j + v_j, w_{j+1} + v_{j+1}) = \hat{f}(w_j,w_{j+1}) + \int_{w_j}^{w_j+v_j} f'(s)^+ \, {\textrm{d}} s + \int_{w_{j+1}}^{w_{j+1} + v_{j+1}} f'(s)^- \, {\textrm{d}} s. \end{equation*} Note that we can write $$f'(x)^+ = \frac{1}{2} (\,f'(x) + |f'(x)|)$$ and $$f'(x)^- = \frac{1}{2} (\,f'(x) - |f'(x)|)$$. Recall the definition of $$g_{j,j+1}^\text{EO}$$ in the proposition and define the residual $$r({\boldsymbol w},{\boldsymbol v}) : V_h \times V_h \rightarrow V_h$$ by $$[r({\boldsymbol w},{\boldsymbol v})]_j:= [F_h({\boldsymbol w}+{\boldsymbol v})]_j - [F_h({\boldsymbol w})]_j - [F_h'({\boldsymbol w}) {\boldsymbol v}]_j$$. Then we have $$[r({\boldsymbol w},{\boldsymbol v})]_j = q_{j,j+1} - q_{j-1,j}$$ where \begin{equation*} q_{j,j+1} = \int_{w_j}^{w_j+v_j} f'(s)^+ - f'(w_j)^+ \, {\textrm{d}} s + \int_{w_{j+1}}^{w_{j+1}+v_{j+1}} f'(s)^- - f'(w_{j+1})^- \, {\textrm{d}} s. \end{equation*} For the first term on the right-hand side we have \begin{equation*} \left| \int_{w_j}^{w_j+v_j} f'(s)^+ - f'(w_j)^+ \, {\textrm{d}} s \right| \leq \max_{z \in (w_j,w_j+v_j)} |f'(z)^+ - f'(w_j)^+ | \cdot |v_j| \leq C \, v_j^2, \end{equation*} since $$f'(x)^+$$ is Lipschitz continuous. The proof for the second term of $$q_{j,j+1}$$ is similar. Hence, we showed that for both Lax–Friedrichs and Engquist–Osher fluxes we have $$[r({\boldsymbol w},{\boldsymbol v})]_j = \mathcal{O}(v_j^2) + \mathcal{O}(v_{j+1}^2) + \mathcal{O}(v_{j-1}^2)$$, respectively. Taking the $$\ell^1$$-norm of $$r({\boldsymbol w},{\boldsymbol v})$$ one obtains \begin{equation*} {\Vert {r({\boldsymbol w},{\boldsymbol v})} \Vert}_{\ell^1} \leq C {\Vert { {\boldsymbol v} } \Vert}_{\ell^2}^2 \leq C {\Vert { {\boldsymbol v} } \Vert}_{\ell^1}^2, \end{equation*} where $$C$$ is independent of the mesh parameter. Multiplying both sides by $$h$$ gives \begin{equation*} {\Vert {r(w,v)} \Vert}_{{\textrm{L}}^1({{\it{\Omega}}})} \leq C {\Vert { v } \Vert}_{{\textrm{L}}^1({{\it{\Omega}}})} \, {\Vert { {\boldsymbol v}} \Vert}_{\ell^1}, \end{equation*} which shows that $${\Vert {r(w,v)} \Vert}_{{\textrm{L}}^1({{\it{\Omega}}})} / {\Vert { v } \Vert}_{{\textrm{L}}^1({{\it{\Omega}}})} \rightarrow 0$$ uniformly when $${\boldsymbol v}$$ converges to zero. This completes the proof for the derivative. A direct calculation yields the formula for the transpose. □ References Banda, M. K. & Herty, M. ( 2012 ) Adjoint IMEX-based schemes for the numerical solution of optimal control problems governed by conservation laws. Hyperbolic Problems—Theory, Numerics and Applications ( Tatsien L. & Jiang S. eds), vol. 2. Ser. Contemp. Appl. Math. CAM , vols. 17 & 18 . Singapore : World Scientific Publishing , pp. 297 – 303 . Google Scholar CrossRef Search ADS Bouchut, F. & James, F. ( 1998 ) One-dimensional transport equations with discontinuous coefficients. Nonlinear Anal. , 32 , 891 – 933 . Google Scholar CrossRef Search ADS Bressan, A. & Guerra, G. ( 1997 ) Shift-differentiability of the flow generated by a conservation law. Discrete Contin. Dyn. Syst. , 3 , 35 – 58 . Castro, C., Palacios, F. & Zuazua, E. ( 2008 ) an alternating descent method for the optimal control of the inviscid Burgers equation in the presence of shocks. Math. Models Methods Appl. Sci. , 18 , 369 – 416 . Google Scholar CrossRef Search ADS Chechkin, G. A. & Goritsky, A. Y. ( 2009 ) S. N. Kruzhkov’s lectures on first-order quasilinear PDEs. Analytical and Numerical Aspects of Partial Differential Equations ( Emmrich E. & Wittbold P. eds). Berlin : Walter de Gruyter , pp. 1 – 67 .( Translated from the Russian by Boris P. Andreianov. ), https://www.degruyter.com/viewbooktoc/product/40206. Conway, E. D. ( 1967 ) Generalized solutions of linear differential equations with discontinuous coefficients and the uniqueness question for multidimensional quasilinear conservation laws. J. Math. Anal. Appl. , 18 , 238 – 251 . Google Scholar CrossRef Search ADS Evans, L. C. ( 2010 ) Partial Differential Equations . Graduate Studies in Mathematics , vol. 19 , 2nd edn. Providence, RI : American Mathematical Society , pp. xxii+749 . Giles, M. B. ( 2003 ) Discrete adjoint approximations with shocks. Hyperbolic Problems: Theory, Numerics, Applications ( Hou T. Y. & Tadmor E. eds), Heidelberg, Berlin : Springer , pp. 185 – 194 . Google Scholar CrossRef Search ADS Giles, M. & Ulbrich, S. ( 2010a ) Convergence of linearized and adjoint approximations for discontinuous solutions of conservation laws. Part 1: linearized approximations and linearized output functionals. SIAM J. Numer. Anal. , 48 , 882 – 904 . Google Scholar CrossRef Search ADS Giles, M. & Ulbrich, S. ( 2010b ) Convergence of linearized and adjoint approximations for discontinuous solutions of conservation laws. Part 2: adjoint approximations and extensions. SIAM J. Numer. Anal. , 48 , 905 – 921 . Google Scholar CrossRef Search ADS Gottlieb, S. & Shu, C.-W. ( 1998 ) Total variation diminishing Runge-Kutta schemes. Math. Comp. , 67 , 73 – 85 . Google Scholar CrossRef Search ADS Hager, W. W. ( 2000 ) Runge-Kutta methods in optimal control and the transformed adjoint system. Numer. Math. , 87 , 247 – 282 . Google Scholar CrossRef Search ADS Harten, A. ( 1983 ) High resolution schemes for hyperbolic conservation laws. J. Comput. Phys. , 49 , 357 – 393 . Google Scholar CrossRef Search ADS James, F. & Sepúlveda, M. ( 1999 ) Convergence results for the flux identification in a scalar conservation law. SIAM J. Control Optim. , 37 , 869 – 891 ( electronic ). Google Scholar CrossRef Search ADS Kružkov, S. N. ( 1970 ) First order quasilinear equations with several independent variables. Math. Sb. (N.S.) , 81 , 228 – 255 . LeVeque, R. J. ( 1990 ) Numerical Methods for Conservation Laws . Lectures in Mathematics ETH Zürich. Basel : Birkhäuser , pp. x+214 . Google Scholar CrossRef Search ADS Lucier, B. J. ( 1986 ) A moving mesh numerical method for hyperbolic conservation laws. Math. Comp. , 46 , 59 – 69 . Google Scholar CrossRef Search ADS Osher, S. & Tadmor, E. ( 1988 ) On the convergence of difference approximations to scalar conservation laws. Math. Comp. , 50 , 19 – 51 . Google Scholar CrossRef Search ADS Qiu, J.-M. & Shu, C.-W. ( 2008 ) Convergence of Godunov-type schemes for scalar conservation laws under large time steps. SIAM J. Numer. Anal. , 46 , 2211 – 2237 . Google Scholar CrossRef Search ADS Ruuth, S. J. & Spiteri, R. J. ( 2002 ) Two barriers on strong-stability-preserving time discretization methods. J. Scient. Comp. , 17 , 211 – 220 . Google Scholar CrossRef Search ADS Sanders, R. ( 1988 ) A third-order accurate variation nonexpansive difference scheme for single nonlinear conservation laws. Math. Comp. , 51 , 535 – 558 . Google Scholar CrossRef Search ADS Shu, C.-W. & Osher, S. ( 1988 ) Efficient implementation of essentially nonoscillatory shock-capturing schemes. J. Comput. Phys. , 77 , 439 – 471 . Google Scholar CrossRef Search ADS Spiteri, R. J. & Ruuth, S. J. ( 2002 ) A new class of optimal high-order strong-stability-preserving time discretization methods. SIAM J. Numer. Anal. , 40 , 469 – 491 ( electronic ). Google Scholar CrossRef Search ADS Ulbrich, S. ( 1999 ) On the existence and approximation of solutions for the optimal control of nonlinear hyperbolic conservation laws. Optimal Control of Partial Differential Equations: International Conference in Germany (Chemnitz, 1998) ( Hoffmann, K.-H. Leugering, G. Tröltzsch F. & Caesar S. eds). Internat. Ser. Numer. Math., vol. 133. Basel : Birkhäuser, pp. 287 – 299 . Google Scholar CrossRef Search ADS Ulbrich, S. ( 2001 ) Optimal Control of Nonlinear Hyperbolic Conservation Laws with Source Terms . Habilitation Thesis, Munich, Germany : Technische Universität München. Ulbrich, S. ( 2002 ) A sensitivity and adjoint calculus for discontinuous solutions of hyperbolic conservation laws with source terms. SIAM J. Control Optim. , 41 , 740 – 797 ( electronic ). Google Scholar CrossRef Search ADS Ulbrich, S. ( 2003 ) Adjoint-based derivative computations for the optimal control of discontinuous solutions of hyperbolic conservation laws. Systems Control Lett. , 48 , 313 – 328 . Google Scholar CrossRef Search ADS Zhang, X. & Shu, C.-W. ( 2010 ) A genuinely high order total variation diminishing scheme for one-dimensional scalar conservation laws. SIAM J. Numer. Anal. , 48 , 772 – 795 . Google Scholar CrossRef Search ADS © The authors 2017. Published by Oxford University Press on behalf of the Institute of Mathematics and its Applications. All rights reserved.

### Journal

IMA Journal of Numerical AnalysisOxford University Press

Published: Dec 14, 2017

## You’re reading a free preview. Subscribe to read the entire article.

### DeepDyve is your personal research library

It’s your single place to instantly
that matters to you.

over 18 million articles from more than
15,000 peer-reviewed journals.

All for just $49/month ### Explore the DeepDyve Library ### Search Query the DeepDyve database, plus search all of PubMed and Google Scholar seamlessly ### Organize Save any article or search result from DeepDyve, PubMed, and Google Scholar... all in one place. ### Access Get unlimited, online access to over 18 million full-text articles from more than 15,000 scientific journals. ### Your journals are on DeepDyve Read from thousands of the leading scholarly journals from SpringerNature, Elsevier, Wiley-Blackwell, Oxford University Press and more. All the latest content is available, no embargo periods. DeepDyve ### Freelancer DeepDyve ### Pro Price FREE$49/month
\$360/year

Save searches from
PubMed

Create lists to

Export lists, citations