Analysis of extremum value theorems for function spaces in optimal control under numerical uncertainty

Abstract

The extremum value theorem for function spaces plays a central role in optimal control. It is known that computation of optimal control actions and policies is often prone to numerical errors which may be related to computability issues. The current work addresses a version of the extremum value theorem for function spaces under explicit consideration of numerical uncertainties. It is shown that certain function spaces are totally bounded in a suitable sense, i.e., they admit finite approximations up to an arbitrary precision. The proof of this fact is constructive in the sense that it explicitly builds the approximating functions. Consequently, existence of approximate extremal functions is shown. Applicability of the theorem is investigated for finite-horizon optimal control, dynamic programming and adaptive dynamic programming. Some possible computability issues of the extremum value theorem in optimal control are shown on counterexamples.

1. Introduction

Optimal control represents an important part of control theory. Typically, one seeks an optimal function over a state space (also called a control policy) so as to minimize a given cost functional. It is, however, not in general possible to compute optimizing control policies exactly due to limitations of numerical procedures, which may have certain effects on the system behaviour. The current work shows how, under some mild and practicable assumptions, approximate optimal control policies can still be explicitly computed. The proofs are done constructively, i.e., they entail certain ways of computing the objects in question. Constructive results are not unusual in control engineering and are often desired: for instance, Banaschewski & Mulvey (1997) gave a constructive proof of the Stone–Weierstrass theorem, which is used in a number of applications to effectively find approximations to specific functions in a suitable basis. The famous Sontag's formula (Sontag, 1989) is the core of Sontag's constructive proof of Artstein's theorem on non-linear stabilization. Sepulchre et al. (2012) developed this methodology further into a vast variety of constructive methods of finding specific stabilizing controllers.

Going back to the problem of optimal control and related effects which may occur due to numerical uncertainty, consider the following simple example of a discrete-time system whose dynamical behaviour is switched by a binary decision variable $$u$$:   \begin{align} x_{k+1}= \begin{cases} \big(\frac{1}{2} + b\big)x_k, &u_k = 1, \\ \big(\frac{1}{2} + c\big)x_k, &u_k = -1, \end{cases} \quad x_0 = 1, u_k \in \{1, -1\}, \end{align} (1.1), where $$b, c$$ are real numbers which may, e.g., represent some physical quantities. Let an infinite-horizon cost function be defined as   $$ \min_{\{u_k\}_k} \quad J = \sum_{k=0}^{\infty} x_k. $$ Suppose, for the sake of the example, that $$b$$ is zero and $$c$$ is positive. Then, by virtue of the system dynamics (1.1), the optimal control policy is $$u^{\ast }=\{1,1,1, \dots \}$$ and the corresponding optimal state sequence is $$x^{\ast }=\{ 1,\frac{1}{2},\frac{1}{4}, \frac{1}{8}, \dots \}$$. In this case, the optimal cost is $$J^{\ast }=2$$. If, otherwise, $$c$$ is zero and $$b$$ is positive, then the optimal control policy is $$u^{\ast }=\{-1,-1,-1, \dots \}$$, whereas the optimal cost is, again, $$J^{\ast }=2$$.
Thus, if one could find the optimal control policy, i.e., the optimal control action at each time step, then either $$J^{\ast }=\frac{2}{1-2c}=2$$ or $$J^{\ast }=\frac{2}{1-2b}=2$$ by the geometric sum and, hence, either $$b=0$$ or alternatively $$c=0$$. However, in practice, a numerical uncertainty may occur between the exact values of $$b, c$$ and their representations in a computational device, usually as rational numbers. One of the possible simple ways to consider these representations of $$b, c$$ is in the form of Cauchy sequences $$\{b(n)\}_n, \{c(n)\}_n$$ which are regular in the following sense:   \begin{align*} & \forall n,m \in \mathbb{N} \\ & |b(n)-b(m)| \leqslant \tfrac{1}{n} + \tfrac{1}{m}, \\ &\, |c(n)-c(m)| \leqslant \tfrac{1}{n} + \tfrac{1}{m}. \end{align*} In practice, the system (1.1) may contain some particular approximations $$b(n^{\prime}), c(n^{\prime}), n^{\prime} \in \mathbb N$$ where $$n^{\prime}$$ is the precision of the computational device. In the current work, all the proofs are done by working directly with such representations of real numbers, which helps address numerical uncertainty. The said rational approximations may well come from, e.g., a measurement, which always has a finite precision, or from some computational algorithm, such as model identification. Therefore, to computationally check whether $$b=0$$ or alternatively $$c=0$$, the approximations $$b(n), c(n)$$ for all $$n \in \mathbb N$$ would have to be compared. Such an unbounded search is, however, not technically possible. Therefore, different optimal control policies might result depending on the precision, in this case a particular number $$n^{\prime}$$; this effect is illustrated by the sketch below. The same issue may appear when minimizing, e.g., the following particular cost function:   $$ J(u) = \min \{ u^2 + b, (u-1)^2+c \}. $$ For the numbers $$b$$ and $$c$$ as described above, it follows that $$\min J = 0$$. However, if an optimal control action $$u^{\ast }$$ could be computed exactly, such that $$J(u^{\ast })= \min J$$, then either $$u^{\ast } = 0$$ or $$u^{\ast } = 1$$. This would be equivalent to deciding whether $$b$$ or $$c$$ is exactly zero, which is not always technically possible. Such phenomena are typically demonstrated by simple counterexamples (Bishop, 1967), whose more detailed description may be found in the Appendix. Particular examples of peculiar phenomena related to numerical uncertainty and floating-point arithmetic may also be found in (Rump, 2010).

As shown in the example above, optimality in general may fail to be achieved depending on the representation of system parameters. To address these issues, the present work seeks to show the existence of optimal controls in an approximate format by explicitly considering numerical uncertainty. The proofs are done constructively and in the setting of (Bishop & Bridges, 1985) since it offers convenient tools for keeping track of the number representations. The details are given in the next section. It should be noted that, classically, the extremum value theorem states the following:

Theorem 1.1 If a function $$f$$ is continuous on a compact interval $$[a,b]$$, then there exist $$x,y\in [a,b]$$ such that $$f(x)=\sup f$$ and $$f(y)=\inf \,f$$.

Some constructive approaches to Theorem 1.1 were addressed, e.g., in (Berger et al., 2006; Bridges, 2007) with additional assumptions on the function $$f$$. These assumptions are, however, not always easy to verify practically, especially when one wants to apply the theorem to function spaces.
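To make the numerical issue in the introductory example concrete, the following sketch (a minimal Python illustration, not part of the cited constructions) models the numbers $$b, c$$ by hypothetical rational approximation routines and attempts to certify a comparison at a given precision; the names b_approx, c_approx, certified_comparison and the particular value of c are assumptions made purely for illustration.

```python
from fractions import Fraction

# Hypothetical regular Cauchy representations of b and c from the example:
# here b is exactly zero and c is a small positive rational, but the routine
# below only ever sees finite-precision approximations of them.
def b_approx(n: int) -> Fraction:
    return Fraction(0)

def c_approx(n: int) -> Fraction:
    return Fraction(1, 10**6)  # "unknown" small positive value

def certified_comparison(x, y, n: int) -> str:
    """Try to certify x < y or y < x from the n-th approximations.

    Constructively, x < y is witnessed by x(n) < y(n) - 2/n; if neither
    strict inequality is witnessed, the comparison stays undecided at
    this precision.
    """
    gap = Fraction(2, n)
    if x(n) < y(n) - gap:
        return "x < y"
    if y(n) < x(n) - gap:
        return "y < x"
    return "undecided at precision 1/{}".format(n)

for n in (10, 1000, 10**7):
    print(n, certified_comparison(b_approx, c_approx, n))
# The comparison of b and c stays undecided for n up to about 2 * 10**6, so
# the switching decision in (1.1) cannot be certified at lower precisions
# and may flip once the device precision n' becomes large enough.
```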
Instead of strengthening the conditions of the theorem, an approximate format is considered in the present work, which is sufficient for practical applications of optimal control. To achieve this, it is shown that certain function spaces admit finite approximations. The proof is based on constructing finite approximations explicitly. Another consequence of the new results is a characterization of what can, at best, be achieved in general when addressing optimal control. The major implication is a theoretical limit on what any numerical algorithm can achieve: exact optimal control policies are not achievable in general. The new result demonstrates the principal possibility of computing approximate optimal control policies up to a prescribed accuracy if the optimization problem satisfies certain conditions. As will be shown in the case study of Section 4, the said conditions are practicable. The next section discusses the important preliminaries needed to prove the main theorem of Section 3.

2. Preliminaries

In this section, the definitions and some basic technical results necessary for derivation of the new approximate extremum value theorem for function spaces are recalled. For a comprehensive description, refer, for example, to Bishop & Bridges (1985); Bridges & Richman (1987); Bridges & Vita (2007); Schwichtenberg (2012); Ye (2011). A real number $$x$$ in the current work will be characterized by its rational approximations in the following regular Cauchy sequence format:   $$ \forall n,m\in\mathbb{N},|x(n)-x(m)|\leqslant\frac{1}{n}+\frac{1}{m}, $$where $$x(n)$$ is some operation that produces the $$n-$$th rational approximation to $$x$$. The inequalities on real numbers are defined as follows:   \begin{align*} (x\leqslant y)\triangleq & \forall n\in\mathbb{N},x(n)\leqslant y(n)+\frac{2}{n},\\ (x<y)\triangleq & \exists n\in\mathbb{N},x(n)<y(n)-\frac{2}{n}. \end{align*} In the second definition, the number $$n$$ is also called a witness. Such objects are said to certify the respective formulas and can be used by computational devices. Further, the maximum of two real numbers is defined as follows: $$\max \left \{ x,y\right \} (n)\triangleq \max \left \{ x(n),y(n)\right \}$$. The basic properties of it can be proven, but in general it cannot be decided whether $$\max \left \{ x,y\right \} =x$$ or $$\max \left \{ x,y\right \} =y$$. However, the following simple technical lemma can be easily proven:

Lemma 2.1 For any two real numbers $$x,y$$ satisfying $$x\leqslant y$$, it follows that $$\max \left \{ x,y\right \} = y$$.

Proof. It suffices to show that   $$ \forall n,|\max\left\{ x(n),y(n)\right\} -y(n)|\leqslant\frac{2}{n}. $$ From the condition of the lemma, $$\forall n\in \mathbb{N},x(n)\leqslant y(n)+\frac{2}{n}$$, which implies $$\forall n\in \mathbb{N},\max \left \{ x(n),y(n)\right \} \leqslant y(n)+\frac{2}{n}$$. On the other hand, $$\forall n\in \mathbb{N},\max \left \{ x(n),y(n)\right \} \geqslant y(n)\geqslant y(n)-\frac{2}{n}$$, and the result follows.

Remark 2.1 Properties of the minimum are derived similarly.

A metric space $$\left (X,\rho \right )$$ is a set $$X$$ together with an operation $$\rho :X\times X\rightarrow \mathbb{R}$$ that satisfies the usual axioms of a metric.
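The representation of reals by regular Cauchy sequences and the definition of the maximum above admit a direct computational reading. The following sketch (an illustration under the stated definitions; the helper names are assumptions introduced here) models a real as a map from a precision index to a rational approximation and checks the conclusion of Lemma 2.1 at several precisions.

```python
from fractions import Fraction
from typing import Callable

# A constructive real is modelled as a map n -> n-th rational approximation,
# assumed to satisfy the regularity condition |x(n) - x(m)| <= 1/n + 1/m.
Real = Callable[[int], Fraction]

def real_from_fraction(q: Fraction) -> Real:
    return lambda n: q                 # a constant sequence is trivially regular

def real_max(x: Real, y: Real) -> Real:
    # max{x, y}(n) := max{x(n), y(n)}, exactly as defined above
    return lambda n: max(x(n), y(n))

def leq_witnessed_at(x: Real, y: Real, n: int) -> bool:
    # The defining condition of x <= y checked at a single precision level n:
    # x(n) <= y(n) + 2/n.  (The full relation quantifies over all n.)
    return x(n) <= y(n) + Fraction(2, n)

# Illustration of Lemma 2.1: for x <= y, max{x, y} and y agree up to 2/n.
x = real_from_fraction(Fraction(1, 3))
y = real_from_fraction(Fraction(1, 2))
m = real_max(x, y)
for n in (1, 10, 100, 1000):
    assert leq_witnessed_at(x, y, n)
    assert abs(m(n) - y(n)) <= Fraction(2, n)
```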
A metric space $$\left (X,\rho \right )$$ is totally bounded if for all natural $$k$$, there exists a finite set of unequal points $$\left \{ x_{1}, \dots , x_{n}\right \} \subset X$$ such that for any $$x\in X$$, there exists an $$x_{i}\in \left \{ x_{1}, \dots , x_{n}\right \}$$ with $$\rho \left (x,x_{i}\right )\leqslant \frac{1}{k}$$. Such a finite set is also called a $$\frac{1}{k}-$$approximation to $$X$$. A subset $$A$$ of a metric space $$\left (X,\rho \right )$$ is located if it is non-empty and for any $$x$$ in $$X$$, the metric $$\rho \left (x,A\right )\triangleq \inf \left \{ \rho (x,y):y\in A\right \}$$ can be effectively computed. A totally bounded subset of a metric space is also located (see Proposition 2.2.9 in (Bridges & Vita, 2007)). A metric between two subsets $$A$$ and $$B$$ is defined as $$\rho \left (A,B\right )\triangleq \inf \left \{ \rho (x,y):x\in A,y\in B\right \} $$.

A (uniformly continuous) function from a totally bounded metric space $$\left (X,\rho \right )$$ to a metric space $$\left (Y,\sigma \right )$$ is a pair consisting of an operation $$x\mapsto f(x),x\in X$$ and an operation $$\omega :\mathbb{Q}\rightarrow \mathbb{Q}$$ called modulus of (uniform) continuity such that:   $$ \forall\varepsilon\in\mathbb{Q},\forall x,y\in X,\rho(x,y)\leqslant\omega(\varepsilon)\implies\sigma(\,f(x),f(y))\leqslant\varepsilon. $$ A function is Lipschitz continuous if $$\,\forall x,y\in X,\sigma (\,f(x),f(y))\leqslant L\cdot \rho (x,y)$$ for some rational $$L>0$$. The set $$\mathscr{F}$$ of (all) uniformly continuous functions from a totally bounded metric space $$\left (X,\rho \right )$$ to a metric space $$\left (Y,\sigma \right )$$ together with the metric $$\tau (\,f,g)\triangleq \underset{x\in X}{\sup }\,\sigma (\,f(x),g(x))$$ for any $$f,g\in \mathscr{F}$$ is called the function space from $$X$$ to $$Y$$. A function space $$\mathscr{F}$$ is equicontinuous if there exists a common modulus of continuity for all $$f$$ in $$\mathscr{F}$$. Further, $$\mathscr{F}$$ is the space of uniformly Lipschitz and uniformly bounded functions whenever there exists a common Lipschitz constant $$L$$ and, respectively, a common bound $$K\in \mathbb{Q},K>0$$ such that $$\forall f\in \mathscr{F},\|\,f\|\leqslant K$$. A uniformly continuous functional $$F$$ on a totally bounded function space $$\mathscr{F}$$ is an operation $$f\mapsto F[\,f]\in \mathbb{R}$$ with a modulus of continuity $$\alpha $$ such that $$|F[\,f]-F[g]|\leqslant \frac{1}{k}$$ whenever $$\tau (\,f,g)\leqslant \alpha \big (\frac{1}{k}\big )$$.

The symbol $$x^{i}$$ denotes the $$i-$$th coordinate of the point $$x$$ in $$\mathbb{R}^{n}$$. The two common norms on $$\mathbb{R}^{n}$$ are the $$d_{2}$$–norm: $$\|x\|_{2}=\big (\sum _{i=1}^{n}\left (x^{i}\right )^{2}\big )^{\frac{1}{2}}$$, and the $$d_{\infty }$$–norm (or maximum norm): $$\|x\|_{\infty }=\max _{i}\big |x^{i}\big |$$. The subscripts ‘$$2$$’ and ‘$$\infty $$’ may be omitted whenever the type of the norm is clear from the context. The corresponding metric between any two points $$x,y$$ is defined as $$\|x-y\|$$. For the metrics $$d_{2}$$ and $$d_{\infty }$$, the following holds: $$\|\bullet \|_{\infty }\leqslant \|\bullet \|_{2}\leqslant \sqrt{n}\|\bullet \|_{\infty }$$. A real space $$\mathbb{R}^{n}$$ with the metric $$d_{\infty }$$ will be also denoted as $$\left (\mathbb{R}^{n},d_{\infty }\right )$$.
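Since the domain $$X$$ is totally bounded, the function-space metric $$\tau$$ can be estimated from a finite $$\frac{1}{k}$$–approximation, with an error controlled by a common modulus of continuity. The following sketch (a numerical illustration in floating-point arithmetic rather than rational approximations; the example functions and grid are assumptions) demonstrates this on $$X=[0,1]$$.

```python
import numpy as np

# Estimate tau(f, g) = sup_x |f(x) - g(x)| from a finite approximation (grid)
# of a totally bounded domain.  If f and g share the modulus of continuity
# omega, replacing the supremum by a maximum over a grid of spacing h incurs
# an error of at most 2*eps whenever h <= omega(eps).
def tau_estimate(f, g, grid):
    return max(abs(f(x) - g(x)) for x in grid)

# Two 1-Lipschitz functions on X = [0, 1], i.e. omega(eps) = eps for both.
f = lambda x: abs(x - 0.3)
g = lambda x: 0.5 * x
grid = np.linspace(0.0, 1.0, 101)   # spacing h = 0.01
print(tau_estimate(f, g, grid))     # within 0.01 of the true sup distance
```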
A (rational) closed ball $$\bar{\mathscr{B}}(b,K)$$ in $$\mathbb{R}^{n}$$ with a radius $$K\in \mathbb{Q},K>0$$ centred at $$b\in \mathbb{Q}^{n}$$ is the set $$\left \{ x:x\in \mathbb{R}^{n}\land \|x-b\|\leqslant K\right \} $$. For example, with the $$d_{\infty }$$–metric, $$\bar{\mathscr{B}}(b,K)$$ is effectively a hypercube with $$2^{n}$$ vertices with rational coordinates. On the reals $$\mathbb{R}$$, it is a compact interval. Clearly, a closed ball $$\bar{\mathscr{B}}(b,K)$$ is located. A regular partition with a step $$\delta =\frac{K}{k},k\in \mathbb{N}$$ on a closed ball $$\bar{\mathscr{B}}(0,K)$$ in $$\mathbb{R}^{n}$$ is a finite set of points $$\left \{ b_{i}\right \} _{i=1}^{N}\subset \bar{\mathscr{B}}(0,K),N\in \mathbb{N}$$ with the coordinates satisfying all the combinations of the form $$b_{i}^{j}:=\pm n_{ij}\delta ,n_{ij}\in \left \{ 0, \dots , k\right \} $$. Notice that the number $$N$$ of partition points depends on the dimension $$n$$ and the step $$\delta $$. For instance, a regular partition on $$\bar{\mathscr{B}}\big (\frac{1}{2},\frac{1}{2}\big)$$ in $$\mathbb{R}$$—which is the unit interval $$\left [0,1\right ]$$—with a step $$\delta =\frac{1}{4}$$ is the finite set of points $$\big \{ 0,\frac{1}{4},\frac{1}{2},\frac{3}{4},1\big \} $$. Clearly, regular partitions on non-trivial closed balls exist and they may be considered as witnesses for total boundedness. That is, for any given precision, there exists a regular partition such that any point of the ball is close to some point of the partition up to that precision. The following simple technical result can be easily proven:

Lemma 2.2 Let $$P=\left (p_{1}, \dots , p_{N}\right )$$ be a regular partition with a step $$\delta =\frac{K}{k},k\in \mathbb{N}$$ on a closed ball $$\bar{\mathscr{B}}(0,K)\subset \left (\mathbb{R}^{n},d_{\infty }\right )$$. Then, for any $$x \in \bar{\mathscr{B}}(0,K)$$, there exists a closed ball $$\bar{\mathscr{B}}\left (p_{i},\delta \right ),p_{i}\in P$$ such that $$x\in \bar{\mathscr{B}}\left (p_{i},\delta \right )$$.

Proof. Since $$x$$ is a tuple of $$n$$ real numbers $$\big (x^{1}, \dots , x^{n}\big )$$, an $$m-$$th rational approximation to $$x$$ is a tuple of rational numbers $$x(m):=\left (x^{1}(m), \dots , x^{n}(m)\right )$$. Indeed, for any $$m^{\prime}\in \mathbb{N}$$, $$\|x(m)-x(m^{\prime})\|=\max _{i}\big |x^{i}(m)-x^{i}(m^{\prime})\big |\leqslant \frac{1}{m}+\frac{1}{m^{\prime}}$$ since $$\big |x^{i}(m)-x^{i}(m^{\prime})\big |\leqslant \frac{1}{m}+\frac{1}{m^{\prime}}$$ for all $$i\in \left \{ 1, \dots , n\right \}$$. Let $$m:=\lceil \frac{4}{\delta }\rceil $$; then $$\|x(m)-x\|\leqslant \frac{\delta }{2}$$. Compute all distances $$\big |\big |x(m)-p_{i}\big |\big |,p_{i}\in P$$. If $$\big |\big |x(m)-p_{i}\big |\big |<\frac{\delta }{2}$$, then $$x\in \bar{\mathscr{B}}\left (p_{i},\delta \right )$$. If there is more than one such ball, pick the one with the smallest index. If $$\big |\big |x(m)-p_{j}\big |\big |=\frac{\delta }{2}$$ for some indices $$j=j_{1}, \dots , j_{L}$$, then pick the smallest such $$j$$ and conclude that $$x\in \bar{\mathscr{B}}\left (p_{j},\delta \right )$$.

Remark 2.2 In the current constructive setting, it cannot in general be decided whether a point of a real metric space $$\left (\mathbb{R}^{n},d_{\infty }\right )$$ belongs to a subset $$A$$ or to a subset $$B$$ if $$A\cap B$$ has a dimension less than $$n$$.
However, comparison of a real number with a non-trivial interval is decidable in the following sense: it can be decided whether a real number in the interval $$\left [a,b\right ]\subset \mathbb{R}$$ belongs to a non-trivial interval $$I_{1}\subseteq \left [a,b\right ]$$ or to $$I_{2}\subseteq \left [a,b\right ]$$ with $$I_{1}\cup I_{2}=\left [a,b\right ]$$, provided that $$I_{1}\cap I_{2}$$ is a non-trivial interval. In the lemma above, this fact is generalized to overlapping hypercubes. Further, an important result, called the constructive Arzelà–Ascoli lemma, which is due to (Bishop & Bridges, 1985, p. 100), is recalled:

Lemma 2.3 Let $$\mathscr{F}$$ be an equicontinuous function space from a totally bounded metric space $$\left (X,\rho \right )$$ to a metric space $$\left (Y,\sigma \right )$$. Suppose that for any finite $$\frac{1}{k}$$–approximation $$\left \{ x_{1}, \dots , x_{N}\right \} $$ to $$X$$, the set $$A:=\left \{ \left (\,f\left (x_{1}\right ), \dots , \,f\left (x_{N}\right )\right ):f\in \mathscr{F}\right \} \subset Y^{N}$$ is totally bounded. Then, $$\mathscr{F}$$ is totally bounded.

Proof. Let $$\omega $$ be the continuity modulus for the function space $$\mathscr{F}$$. Let $$k\in \mathbb{N}$$, and let $$\left \{ x_{1}, \dots , x_{N}\right \} $$ be an $$\omega \big (\frac{1}{3k}\big )$$–approximation to $$X$$. By assumption, the set $$A$$ is totally bounded. Let $$\left \{ \,f_{1}, \dots , f_{M}\right \} $$ be a set of functions in $$\mathscr{F}$$ such that the points $$a_{i}:=\left (\,f_{i}\left (x_{1}\right ), \dots , f_{i}\left (x_{N}\right )\right ),i=1, \dots , M$$ form a finite $$\frac{1}{4k}$$–approximation to $$A$$. Then, for an arbitrary $$f\in \mathscr{F}$$, it follows that there exists an $$f_{i}$$ such that $$\sum _{j=1}^{N}\sigma \left (\,f_{i}\left (x_{j}\right ),f\left (x_{j}\right )\right )\leqslant \frac{1}{4k}$$. For an arbitrary $$x\in X$$, there exists an $$x_{j}$$ such that $$\rho \left (x,x_{j}\right )\leqslant \omega \left (\frac{1}{3k}\right )$$. Then,   \begin{align*} & \sigma\left(\,f_{i}(x),\,f(x)\right) \leqslant \sigma\left(\,f_{i}(x),\,f_{i}\left(x_{j}\right)\right)+\sigma\left(\,f_{i}\left(x_{j}\right),\,f\left(x_{j}\right)\right)+\sigma\left(\,f\left(x_{j}\right),f\left(x\right)\right) \leqslant \dfrac{1}{3k}+\dfrac{1}{4k}+\dfrac{1}{3k}. \end{align*} It follows that $$\tau \left (\,f_{i},f\right )\leqslant \frac{1}{k}$$. Therefore, $$\left \{\, f_{1}, \dots , f_{M}\right \} $$ is a finite $$\frac{1}{k}-$$approximation to $$\mathscr{F}$$ whence $$\mathscr{F}$$ is totally bounded.

3. Main results

Based on the preliminaries of the previous section, the new result on approximate optimal control policies can be derived. This is done in two steps. First, it is shown that certain function spaces, which represent the sets of admissible control policies, admit finite approximations provided that they satisfy certain assumptions which are, however, applicable in practice. Second, existence of approximate extrema of uniformly continuous functionals on such spaces is established.

3.1 Finite approximations to function spaces

The central theorem of this section is stated as follows:

Theorem 3.1 Let $$\mathscr{F}$$ be the space of uniformly Lipschitz and uniformly bounded real-valued functions on a totally bounded metric space $$\left (X,\rho \right )$$. Then $$\mathscr{F}$$ is totally bounded.

Proof. Let $$X_{0}$$ be a finite subset of $$X$$ consisting of unequal points $$\left \{ x_{1}, \dots , x_{N}\right \},$$ $$N\in \mathbb{N}$$. Suppose that $$L$$ and $$K$$ are the uniform Lipschitz constant and uniform bound for $$\mathscr{F}$$, respectively.
First, show that the subset   $$ Y:=\left\{ \left(\,f\left(x_{1}\right), \dots, f\left(x_{N}\right)\right):f\in\mathscr{F}\right\} $$ of $$\mathbb{R}^{N}$$ with the product metric is totally bounded. To this end, let $$P=\left (p_{1}, \dots , p_{M}\right ),M\in \mathbb{N}$$ be a regular partition of $$\left [-K,K\right ]$$ with a step $$\delta :=\frac{1}{k},k\in \mathbb{N}$$. Let $$f$$ be any function from the function space $$\mathscr{F}$$ and fix some arbitrary $$n\in \mathbb{N}$$. Construct a piece-wise linear function $$\varphi :X\longrightarrow \mathbb{R}$$ such that $$\forall x,y\in X,|\varphi (x)-\varphi (y)|\leqslant L\rho (x,y)$$ and $$\forall x_{i}\in X_{0},\big |\,f\left (x_{i}\right )-\varphi \left (x_{i}\right )\big |\leqslant \frac{1}{nN}$$. By the product metric, the latter condition would imply that $$\big |\big |\left (\,f\left (x_{1}\right ), \dots , f\left (x_{N}\right )\right )-\left (\varphi \left (x_{1}\right ), \dots , \varphi \left (x_{N}\right )\right )\big |\big |\leqslant \frac{1}{n}$$. First, the image of $$\varphi $$ on $$X_{0}$$ is constructed inductively. By Lemma 2.2, for any $$x\in X$$ and $$f\in \mathscr{F}$$, there exists a $$p_{i}\in P$$ such that $$\big |\,f(x)-p_{i}\big |\leqslant \delta $$. Suppose that $$f\left (x_{1}\right ),f\left (x_{2}\right )$$ are within some closed balls $$\bar{\mathscr{B}}\left (p_{j_{1}},\delta \right ),\bar{\mathscr{B}}\left (p_{j_{2}},\delta \right ),j_{1},j_{2}\in \left \{ 1, \dots , M\right \} $$, respectively. Let $$\varphi \left (x_{1}\right ):=p_{j_{1}}$$. Observe that since $$\big |\,f\left (x_{1}\right )-f\left (x_{2}\right )\big |\leqslant L\rho \left (x_{1},x_{2}\right )$$, it follows that $$\big |p_{j_{1}}-p_{j_{2}}\big |\leqslant L\rho \left (x_{1},x_{2}\right )+2\delta $$. Notice that $$p_{j_{1}}$$ and $$p_{j_{2}}$$ are rational numbers. It can, therefore, be assumed that either $$p_{j_{1}}-p_{j_{2}}>2\delta $$, or $$p_{j_{1}}-p_{j_{2}}<-2\delta $$, or $$\big |p_{j_{1}}-p_{j_{2}}\big |\leqslant 2\delta $$. The first two cases are analogous whence one may assume that $$p_{j_{1}}-p_{j_{2}}>2\delta $$. Let $$\varphi \left (x_{2}\right ):=p_{j_{2}}+2\delta $$. This setting ensures the Lipschitz condition. Indeed,   \begin{align*} & \big|\varphi\left(x_{1}\right)-\varphi\left(x_{2}\right)\big|=p_{j_{1}}-p_{j_{2}}-2\delta \leqslant \big|p_{j_{1}}-p_{j_{2}}\big|-2\delta\leqslant L\rho\left(x_{1},x_{2}\right). \end{align*} On the other hand, since $$f\left (x_{2}\right )\in \bar{\mathscr{B}}\left (p_{j_{2}},\delta \right )$$ and by the setting of $$\varphi \left (x_{2}\right )$$, it follows that $$f\left (x_{2}\right )\in \bar{\mathscr{B}}\left (\varphi \left (x_{2}\right ),\delta +2\delta \right )$$. If $$\big |p_{j_{2}}-p_{j_{1}}\big |\leqslant 2\delta $$, then setting $$\varphi \left (x_{2}\right ):=p_{j_{2}}$$ ensures the same conditions. Suppose now that, at the step $$i$$, $$f\left (x_{i}\right )\in \bar{\mathscr{B}}\left (\varphi \left (x_{i}\right ),\delta +2(i-1)\delta \right )$$. Assume that $$f\left (x_{i+1}\right )\in \bar{\mathscr{B}}\left (p_{j_{i+1}},\delta \right ),j_{i+1}\in \left \{ 1, \dots , M\right \} $$. Following exactly the same procedure, one may pick the next value of $$\varphi $$ so that the approximation radius grows by $$2\delta $$ whereas the Lipschitz condition is satisfied. After the step $$N$$, it holds that $$f\left (x_{N}\right )\in \bar{\mathscr{B}}\left (\varphi \left (x_{N}\right ),\delta +2(N-1)\delta \right )$$.
Therefore, setting $$k$$ equal to $$2nN^{2}$$ ensures $$\big |\big |\left (\,f\left (x_{1}\right ), \dots , f\left (x_{N}\right )\right )-\left (\varphi \left (x_{1}\right ), \dots , \varphi \left (x_{N}\right )\right )\big |\big |\leqslant \frac{1}{n}$$. Now, extend $$\varphi $$ to the whole space $$X$$. To this end, let   \begin{align*} & \psi(x):= \frac{1}{2}\left[\max_{i}\left(\varphi\left(x_{i}\right)-L\rho(x,x_{i})\right)+\min_{i}\left(\varphi\left(x_{i}\right)+L\rho(x,x_{i})\right)\right], \quad\forall x\in X. \end{align*} Lemma 2.1 implies that $$\psi (x_{j})=\varphi (x_{j})$$. This follows from the fact that $$\forall i,\varphi (x_{i})-L\rho (x_{j},x_{i})\leqslant \varphi (x_{j})\leqslant \varphi (x_{i})+L\rho (x_{j},x_{i})$$, with equality at $$i=j$$. To see that $$\psi (x)$$ is an $$L$$–Lipschitz function, observe that $$\rho (x,x_{i})$$ is a $$1$$–Lipschitz function of $$x$$ whence $$\varphi \left (x_{i}\right )-L\rho (x,x_{i})$$ is an $$L$$–Lipschitz function of $$x$$ for each $$i$$. Therefore, $$\max _{i}\left (\varphi \left (x_{i}\right )-L\rho (x,x_{i})\right )$$ and $$\min _{i}\left (\varphi \left (x_{i}\right )+L\rho (x,x_{i})\right )$$ are uniformly continuous functions with the same Lipschitz constant $$L$$, and the factor $$\frac{1}{2}$$ ensures that their average $$\psi (x)$$ is $$L$$–Lipschitz as well. Let $$\varphi (x):=\max \{\min \{\psi (x),K\},-K\}$$. Due to the properties of the minimum and maximum (Bishop & Bridges, 1985, p. 23), it follows that $$\varphi $$ is an $$L$$–Lipschitz continuous function satisfying $$\varphi (x)\leqslant K$$ and $$\varphi (x)\geqslant -K$$. Thus, $$\varphi $$ belongs to the function space $$\mathscr{F}$$ and approximates $$f$$ at the points of $$X_{0}$$ arbitrarily closely. Since $$\varphi $$ is uniquely defined by its values at $$X_{0}$$, the values $$\left \{ \varphi \left (x_{i}\right ):x_{i}\in X_{0}\right \} $$ lie in a finite set of rationals, and the distances between the function values at each two points of $$X_{0}$$ have fixed bounds, there are finitely many such functions. Further, since $$f$$ was arbitrary, it follows that $$Y$$ is totally bounded. By the constructive Arzelà–Ascoli Lemma 2.3, the function space $$\mathscr{F}$$ is totally bounded.

Corollary 3.1 Let $$\mathscr{F}$$ be the space of uniformly Lipschitz and uniformly bounded functions from a totally bounded metric space $$\left (X,\rho \right )$$ to $$\mathbb{R}^{m},m\in \mathbb{N}$$ with the $$d_{\infty }$$–metric. Then $$\mathscr{F}$$ is totally bounded.

Proof. The proof amounts to the same procedure, as in the proof of the theorem, done for each coordinate separately since   \begin{align*} & \forall i=1, \dots, m,\big|x^{i}-y^{i}\big| \leqslant \varepsilon \iff \|x-y\|_{\infty}=\max_{i}\big|x^{i}-y^{i}\big| \leqslant \varepsilon \end{align*}for any $$x$$ and $$y$$ in $$\left (\mathbb{R}^{m},d_{\infty }\right )$$ and $$\varepsilon>0$$.

Remark 3.1 The same result applies if there is a uniform bound and uniform Lipschitz constant for each dimension separately: $$\exists \left (K_{1}, \dots , K_{m}\right ),\forall f\in \mathscr{F},\forall x\in X,|\,f^{i}(x)|\leqslant K_{i},i=1, \dots , m$$ and $$\exists \left (L_{1}, \dots , L_{m}\right ),$$ $$\forall f\in \mathscr{F},\forall x,y\in X,\big |\,f^{i}(x)-f^{i}(y)\big |\leqslant L_{i}\rho (x,y),i=1, \dots , m$$. The proof is by rescaling of the hypercuboid with the side lengths $$\left (2K_{1}, \dots , 2K_{m}\right )$$ centred at the origin.
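The extension step in the proof above is itself directly computable. The following sketch (a floating-point illustration rather than the rational-arithmetic construction of the proof; the sample points and values are assumptions) extends values prescribed on finitely many points to an $$L$$–Lipschitz function on $$\left (\mathbb{R}^{n},d_{\infty }\right )$$ and clips it to the uniform bound $$K$$.

```python
import numpy as np

# Given values phi_i prescribed on finitely many points x_i (consistent with a
# Lipschitz constant L), extend to an L-Lipschitz function on the whole space via
#   psi(x) = 1/2 [ max_i (phi_i - L*rho(x, x_i)) + min_i (phi_i + L*rho(x, x_i)) ]
# and clip the result to the uniform bound K.  rho is the d_infty metric here.
def extend_lipschitz(points, values, L, K):
    points = np.asarray(points, dtype=float)
    values = np.asarray(values, dtype=float)

    def phi(x):
        dists = np.max(np.abs(points - np.asarray(x, dtype=float)), axis=-1)
        lower = np.max(values - L * dists)   # McShane-type lower extension
        upper = np.min(values + L * dists)   # Whitney-type upper extension
        return float(np.clip(0.5 * (lower + upper), -K, K))

    return phi

# Usage: values of f(x) = |x1| sampled at three points of the square [-1, 1]^2
pts = [(0.0, 0.0), (1.0, 0.5), (-1.0, -0.5)]
vals = [0.0, 1.0, 1.0]
phi = extend_lipschitz(pts, vals, L=1.0, K=1.0)
print(phi((0.5, 0.0)))   # an L-Lipschitz interpolant evaluated off the sample set
```

The two inner expressions are the classical lower and upper Lipschitz extensions; their average, exactly as in the proof, is again $$L$$–Lipschitz and reproduces the prescribed values at the sample points.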
Corollary 3.2 Let $$\mathscr{\bar{C}}^{1}$$ be the space of uniformly bounded functions from a compact set $$X\subset \mathbb{R}^{n},n\in \mathbb{N}$$ to $$\mathbb{R}^{m},m\in \mathbb{N}$$ with the $$d_{\infty }$$–metric, and suppose that the derivatives of the functions in $$\mathscr{\bar{C}}^{1}$$ are uniformly bounded. Then $$\mathscr{\bar{C}}^{1}$$ is totally bounded.

Proof. Let $$\mathscr{F}$$ denote the space of uniformly Lipschitz and uniformly bounded functions as in Corollary 3.1. Fix an arbitrary function $$g$$ from $$\mathscr{\bar{C}}^{1}$$. Clearly, $$g$$ is a function in $$\mathscr{F}$$ whence $$\mathscr{\bar{C}}^{1}\subset \mathscr{F}$$. The converse is not true. It suffices to show that $$\mathscr{\bar{C}}^{1}$$ is dense in $$\mathscr{F}$$. Fix an arbitrary function $$f\in \mathscr{F}$$ and a number $$k\in \mathbb{N}$$. Following the construction as in the proof of Theorem 3.1, one can derive a piece-wise linear function $$\varphi :\mathbb{R}^{n}\rightarrow \left (\mathbb{R}^{m},d_{\infty }\right )$$ that approximates $$f$$ on $$X$$ up to the precision $$\frac{1}{2k}$$. Further, one can construct a smooth function $$\varphi _{2k}$$ that approximates $$\varphi $$ up to the precision $$\frac{1}{2k}$$ and has the same Lipschitz constant (see details in the Appendix). Therefore, $$\varphi _{2k}$$ approximates $$f$$ up to the precision $$\frac{1}{k}$$. Therefore, the set $$\mathscr{\bar{C}}^{1}$$, as a dense subset of a totally bounded set $$\mathscr{F}$$, is itself totally bounded (Bridges & Richman, 1987, p. 28).

Corollary 3.3 Let $$\mathscr{\bar{C}}^{N},N\in \mathbb{N}$$ be the space of uniformly bounded functions from a compact set $$X\subset \mathbb{R}^{n},n\in \mathbb{N}$$ to $$\mathbb{R}^{m},m\in \mathbb{N}$$ with the $$d_{\infty }$$–metric, and suppose that the derivatives of the functions in $$\mathscr{\bar{C}}^{N}$$ up to order $$N$$ are uniformly bounded. Then $$\mathscr{\bar{C}}^{N}$$ is totally bounded.

3.2 Approximate extrema

In this section, based on the construction of approximating functions of the previous section, the new constructive version of the approximate extremum value theorem for function spaces is stated. The implication of it is that, under certain assumptions, approximate control policies may be computed up to a prescribed accuracy.

Theorem 3.2 Let $$\mathscr{F}$$ be the space of uniformly Lipschitz and uniformly bounded functions from a totally bounded metric space $$\left (X,\rho \right )$$ to $$\mathbb{R}$$, and let $$J$$ be a uniformly continuous functional from $$\mathscr{F}$$ to $$\mathbb{R}$$. Then, for any $$k\in \mathbb{N}$$, there exists an $$f\in \mathscr{F}$$ such that $$J[\,f]-\frac{1}{k}\leqslant \inf J$$.

Proof. Since $$\mathscr{F}$$ is totally bounded by Theorem 3.1 and $$J$$ is uniformly continuous, $$\inf J$$ exists. Let $$\alpha $$ be the continuity modulus of $$J$$ and $$\mathscr{F}_{0}=\left \{\, f_{1}, \dots , f_{N}\right \} $$ be an $$\alpha \big (\frac{1}{8k}\big )-$$approximation to $$\mathscr{F}$$. Consider all finitely many $$\left \{ J\left [\,f_{i}\right ](8k)\right \},i=1, \dots , N$$. Let $$J [\,f_{j} ](8k)$$ be the smallest one, choosing the smallest such index $$j$$ if there is more than one. Observe that $$\big |J[\,f_{j} ]-J[\,f_{j} ](8k)\big |\leqslant \frac{1}{4k}$$ and, for any $$f\in \mathscr{F}$$, $$\tau (\,f_{j},f)\leqslant \alpha \big (\frac{1}{8k}\big )\implies \big |J[\,f]-J[\,f_{j} ]\big |\leqslant \frac{1}{4k}$$ whence $$J[\,f_{j} ](8k)-\frac{1}{2k}\leqslant J[\,f](8k)$$.
Therefore, $$J[\,f_{j} ](8k)-\frac{1}{2k}\leqslant J[\,f]$$ and consequently $$J[\,f_{j} ]-\frac{1}{k}\leqslant J[\,f]$$ for any $$f$$ with $$\tau (\,f_{j},f)\leqslant \alpha \big (\frac{1}{8k}\big )$$. For an arbitrary $$f\in \mathscr{F}$$, there exists an $$f_{i}\in \mathscr{F}_{0}$$ with $$\tau (\,f_{i},f)\leqslant \alpha \big (\frac{1}{8k}\big )$$; since $$J[\,f_{i}](8k)\geqslant J[\,f_{j}](8k)$$ by the choice of $$j$$, the same bound follows. Since $$f$$ is arbitrary, $$J[\,f_{j} ]-\frac{1}{k}$$ is a lower bound of $$J$$ and so, in particular, $$\inf J\geqslant J[\,f_{j} ]-\frac{1}{k}$$.

Corollary 3.4 Let $$\mathscr{F}$$ be the space of uniformly Lipschitz and uniformly bounded functions from a totally bounded metric space $$\left (X,\rho \right )$$ to $$\mathbb{R}$$, and let $$J$$ be a uniformly continuous functional from $$\mathscr{F}$$ to $$\mathbb{R}^{m},m\in \mathbb{N}$$ with the $$d_{\infty }$$–metric. Then, for any $$k\in \mathbb{N}$$, there exists an $$f\in \mathscr{F}$$ such that $$J[\,f]-\frac{1}{k}\leqslant \inf J$$.

Remark 3.2 To compute an extremal function which yields a $$\frac{1}{k}$$-infimum of $$J$$, where $$k$$ describes the specified precision, construct all possible piece-wise linear functions over the regular partition of step $$\frac{1}{2 k N^2}$$, where $$N$$ is as in Theorem 3.2, preserving the common Lipschitz constant. Smoothen the constructed functions, if necessary, as per Corollary 3.2. Then, choose an $$f_j$$ with the smallest rational approximation $$J[\,f_j](8k)$$; by the proof of Theorem 3.2, it satisfies $$J[\,f_j]-\frac{1}{k}\leqslant \inf J$$.

Remark 3.3 Theorem 3.2 describes what a numerical algorithm can guarantee in the worst case. Various numerical approaches exist and they may be numerically fast, but the best one can expect in general is the result as in the statement of the theorem. The implication for optimal control is that exact optimality may fail to be achieved in general. Instead, approximate optimal control policies can be effectively computed, provided that the system and the cost function satisfy the assumptions in the statement of the theorem. These assumptions are, however, practicable as justified by the physical nature of control problems and demonstrated in the next section on finite-horizon optimal control, dynamic programming (DP) and adaptive dynamic programming (ADP).

Remark 3.4 An analogous statement holds for the supremum.

4. Case study: optimal control

In this section, the derived version of the constructive extremum value theorem is discussed in application to finite-horizon optimal control, DP and ADP.

4.1 Finite-horizon optimal control

Classical theorems of existence of extremal solutions to functional optimization problems essentially rely on the Bolzano–Weierstrass theorem that every bounded sequence has a convergent subsequence. One first shows that the function space in question is compact, and applies the sequential compactness argument. There is, unfortunately, no constructive way to find a convergent subsequence. Therefore, approximate solutions are investigated in this section. Recall the problem of minimization of the following cost functional:   \begin{equation} J[u]:=\varphi\left(x\left(t_{1}\right)\right)+\int\limits _{t_{0}}^{t_{1}}\mathscr{L}(x(t),u(x(t)),t)\ \textrm{d}t \end{equation} (4.1)subject to $$\dot{x}(t):=f(x(t),u(t),t),x\left (t_{0}\right )=x_{0}$$. Here, $$\mathscr L$$ is the running cost, or Lagrangian, which is usually a positive-definite function of $$x, u, t$$. Assume that the state space $$X\subset \left (\mathbb{R}^{n},d_{\infty }\right ),n\in \mathbb{N}$$ is compact. With the $$d_{\infty }$$–metric, two states $$x$$ and $$y$$ are close whenever their respective components $$x^{i},i=1, \dots , n$$ and $$y^{i},i=1, \dots , n$$ are close.
Therefore, a state trajectory $$x(t)$$ is uniformly continuous whenever each state component $$x^{i}(t)$$ is uniformly continuous. It can be assumed that $$u(x)\in \left (\mathbb{R}^{m},d_{\infty }\right ),m\in \mathbb{N}$$ for any $$x\in X$$. Let $$\mathscr{U}$$ denote the set of admissible control policies, i.e., those which yield state trajectories within $$X$$. In (4.1), the starting time $$t_{0}\in \mathbb{Q}$$ and the final time $$t_{1}\in \mathbb{Q}$$ are assumed fixed. Suppose that $$f:X\times U\times \mathbb{R}\rightarrow X$$ satisfies the Lipschitz condition for $$x$$ and $$u$$ on $$X\times U$$ in the following sense:   \begin{equation} \|\,f(x,u,t)-f(y,v,t)\|\leqslant L_{f}\max\left\{ \|x-y\|,\|u-v\|\right\} \end{equation} (4.2)for some rational $$L_{f}>0$$. Then, the constructive theorem of existence and uniqueness of solutions of the initial value problem $$\dot{x}(t)=f(x(t),u(t),t),x\left (t_{0}\right )=x_{0},t\in \left [t_{0},t_{1}\right ]$$ applies (Schwichtenberg, 2012, 12.4) provided that $$u(t)$$ is continuous. Further, $$\varphi $$ is assumed to be uniformly continuous on $$X\times X$$. The Lagrangian should be also uniformly continuous on $$X\times U$$:   \begin{align*} & \exists\omega_{\mathscr{L}}:\mathbb{Q}\rightarrow\mathbb{Q},\forall k\in\mathbb{N},\forall x,y\in X,\forall u,v\in U,\forall t\in\left[t_{0},t_{1}\right], \\ &\max\left\{ \|x-y\|,\|u-v\|\right\} \leqslant\omega_{\mathscr{L}}\left(\frac{1}{k}\right) \implies \big|\mathscr{L}(x,u,t)-\mathscr{L}(y,v,t)\big|\leqslant\frac{1}{k}. \end{align*} Consider two control policies $$u(x),v(x),x\in X$$ in $$\mathscr{U}$$ and the respective state trajectories:   \begin{align*} x_{u}(\tau)&=\int\limits _{t_{0}}^{\tau}f(x(t),u(x(t)),t)\ \textrm{d}t,\\ x_{v}(\tau)&=\int\limits _{t_{0}}^{\tau}f(x(t),v(x(t)),t)\ \textrm{d}t \end{align*}for an arbitrary $$\tau \in \left [t_{0},t_{1}\right ]$$. It follows that   \begin{align*} & \bigg|\bigg|\int\limits _{t_{0}}^{\tau}\left(f(x(t),u(x(t)),t)-f(x(t),v(x(t)),t)\right)\ \textrm{d}t\bigg|\bigg| \\ &\quad\leqslant \sup_{t_{0}\leqslant\tau\leqslant t_{1}}\|f(x(t),u(x(t)),t)-f(x(t),v(x(t)),t)\|\cdot\left(\tau-t_{0}\right) \\ &\quad\leqslant L_{f}\cdot\underset{x\in X}{\sup}\|u(x)-v(x)\|\cdot\left(\tau-t_{0}\right) \\ &\quad\leqslant L_{f}\cdot\|u-v\|\cdot\left(t_{1}-t_{0}\right). \end{align*} Therefore, if $$\underset{x\in X}{\sup }\|u(x)-v(x)\|\leqslant \frac{1}{k},k\in \mathbb{N}$$, which is to say that $$\|u-v\|\leqslant \frac{1}{k}$$, then $$\big |\big |x_{u}-x_{v}\big |\big |\leqslant L_{f}\frac{t_{1}-t_{0}}{k}$$. Consequently, for $$k\geqslant L_{f}\left (t_{1}-t_{0}\right )$$, and $$\|u-v\|\leqslant \omega _{\mathscr{L}}\left (\frac{1}{k}\right )$$ it follows that $$\big |\mathscr{L}\left (x_{u}(t),u(t),t\right )-\mathscr{L}\left (x_{v}(t),v(t),t\right )\big |\leqslant \frac{1}{k}$$. If $$k<L_{f}\left (t_{1}-t_{0}\right )$$, then $$\big |\mathscr{L}\left (x_{u}(t),u(t),t\right )-\mathscr{L}\left (x_{v}(t),v(t),t\right )\big |\leqslant L_{f}\frac{t_{1}-t_{0}}{k}$$ whence the continuity modulus is easily derived. Further, it holds that   \begin{align*} & \left|\int\limits _{t_{0}}^{t_{1}}\left(\mathscr{L}(x(t),u(x(t)),t)-\mathscr{L}(x(t),v(x(t)),t)\right)\ \textrm{d}t\right|\leqslant \frac{t_{1}-t_{0}}{k}. 
\end{align*} If $$\mathscr{U}$$ is a located subset of the space of uniformly bounded and uniformly Lipschitz functions on $$X$$ with a uniform bound and uniform Lipschitz constant, respectively (which can physically be dictated by the fact that any control policy has limits on its magnitude and rate of change with respect to the state), then, by Theorem 3.1 and Lemma 4.3 in (Ye, 2011), $$\mathscr{U}$$ is totally bounded. Since $$J$$ is a uniformly continuous functional from $$\mathscr{U}$$ to $$\mathbb{R}$$, by Theorem 3.2, for any $$k\in \mathbb{N}$$, there exists a control policy $$u^{*}(x),x\in X$$ such that $$J\left [u^{*}\right ]-\frac{1}{k}\leqslant J[u]$$ for any other control policy $$u(x),x\in X$$. This implies that control policies which yield approximate optima of the cost functional can be effectively computed, provided that the controls are bounded in magnitude and rate of change, which is satisfied in practice due to their physical nature. The next section considers infinite-horizon optimal control in the framework of DP.

4.2 Dynamic programming

In DP, optimization problems in the following form are considered:   \begin{equation} \sup_{u\in U}\left\{ r\left(x,u\right)+\gamma V(\,f(x,u))\right\},\quad\forall x\in X. \end{equation} (4.3) In (4.3), $$u$$ is taken from a totally bounded set $$U\subset \mathbb{R}^{m},m\in \mathbb{N}$$. In the case when each component $$u^{i},i=1, \dots , m$$ is an independent function, the $$d_{\infty }$$–metric may be assumed on $$\mathbb{R}^{m}$$. The function $$r:X\times U\longrightarrow \mathbb{R}$$ is a positive-definite utility function (or running cost) that describes the instantaneous cost, whereas $$V$$ is the value function that describes the cumulative cost. The functions $$r$$ and $$V$$ are assumed to be bounded on their domains. The parameter $$0<\gamma <1$$ is called the discounting factor. Finally, the DP operator is introduced:   \begin{equation} T[V](x):=\sup_{u\in U}\left\{ r\left(x,u\right)+\gamma V(\,f(x,u))\right\},\quad\forall x\in X. \end{equation} (4.4) The operator $$T$$ acts on the space of continuous and bounded functions. The Hamilton–Jacobi–Bellman equation is defined by the fixed point of $$T$$. The natural question is whether $$T$$ yields continuous functions, whether the extrema of $$r\left (x,u\right )+\gamma V(f(x,u))$$ over $$U$$ exist and whether they are continuous in $$x$$ in a certain sense. The answer is given by Berge's Theorem of the Maximum (Berge, 1963, p. 115). It shows what kind of continuity of the extrema is preserved if the optimization problem is continuous in a certain sense. First, recall the definition of hemi-continuity that generalizes the notion of continuity to multi-functions, i.e., functions from a set to the power set of another set. A multi-function is called compact-valued if its values are compact sets.

Definition 4.1 (Upper hemi-continuity) Let $$\varGamma :X\rightarrow U$$ be a multi-function such that $$\varGamma (x)$$ is closed for all $$x$$ in $$X$$. Then, $$\varGamma $$ is called upper hemi-continuous at $$x\in X$$ if for any sequence $$\left \{ x_{n}\right \} _{n}$$ in $$X$$, $$u$$ in $$U$$ and sequence $$\left \{ u_{n}\right \} _{n}$$ such that $$u_{n}\in \varGamma \left (x_{n}\right )$$, it follows that   $$ \left(\lim_{n\rightarrow\infty}x_{n}=x\land\lim_{n\rightarrow\infty}u_{n}=u\right)\implies u\in\Gamma(x). $$
Definition 4.2 (Lower hemi-continuity) A multi-function $$\Gamma :X\rightarrow U$$ is called lower hemi-continuous at $$x\in X$$ if for any sequence $$\left \{ x_{n}\right \} _{n}$$ in $$X$$ such that $$\lim _{n\rightarrow \infty }x_{n}=x$$ and any $$u\in \Gamma (x)$$, there exists a subsequence $$\left \{ x_{nk}\right \} _{k}\subset \left \{ x_{n}\right \} _{n}$$ such that there exist $$u_{k}\in \Gamma \left (x_{nk}\right )$$ with $$\lim _{k\rightarrow \infty }u_{k}=u$$.

Theorem 4.1 (Theorem of the Maximum) Let $$f$$ be a jointly continuous function from a product of two metric spaces $$\left (X,\rho \right )\times \left (U,\sigma \right )$$ to $$\mathbb{R}$$, and $$\Gamma :X\rightarrow U$$ be a compact-valued upper and lower hemi-continuous multi-function. Then, the function $$h(x):=\underset{u\in \Gamma (x)}{\sup }f(x,u)$$ is continuous, and $$u^{*}(x):=\arg \underset{u\in \Gamma (x)}{\sup }f(x,u)$$ is a non-empty, compact-valued, upper hemi-continuous multi-function.

In the setting of (4.3), the multi-function $$\Gamma $$ is taken as a constant multi-function $$\Gamma (x)\equiv U$$ with the assumption that any control action is available at any state. The proof of Theorem 4.1 essentially uses the classical extremum value theorem that is not valid constructively. A constructive analysis of the Maximum Theorem has been done by Tanaka (2012). To prove a constructive version of Theorem 4.1, Tanaka introduces the notion of a function with sequentially locally at most one maximum. Such a condition is, however, hard to verify in practice. To summarize, the full statement of Theorem 4.1 cannot be established constructively in general, and exact extrema of (4.3) cannot be found. However, a weaker result holds:

Proposition 4.1 Let $$f$$ be a uniformly continuous function from a product of two totally bounded metric spaces $$\left (X,\rho \right )\times \left (U,\sigma \right )$$ to $$\mathbb{R}$$. Then, the function $$h(x):=\underset{u\in U}{\sup}\, f(x,u)$$ is uniformly continuous.

Proof. Fix any $$x\in X$$ and $$k\in \mathbb{N}$$. By an argument analogous to the proof of Theorem 3.2, there exists $$u^{*}\in U$$ such that $$f\left (x,u^{*}\right )+\frac{1}{k}\geqslant \underset{u\in U}{\sup }\,f(x,u)$$. Let $$\omega $$ be the continuity modulus of $$f$$ and fix any $$y\in X$$ such that $$\rho (x,y)\leqslant \omega \big (\frac{1}{k}\big )$$. It follows that $$\forall u\in U,|f(x,u)-f(y,u)|\leqslant \frac{1}{k}$$. In particular, $$|\,f\left (x,u^{*}\right )-f\left (y,u^{*}\right )|\leqslant \frac{1}{k}$$ whence $$f\left (y,u^{*}\right )+\frac{1}{k}\geqslant f\left (x,u^{*}\right )$$. Therefore, $$f\left (y,u^{*}\right )+\frac{2}{k}\geqslant \underset{u\in U}{\sup }\,f(x,u)$$. From the continuity condition, it follows that $$f(y,u)\leqslant f(x,u)+\frac{1}{k}$$ whence $$f\left (y,u^{*}\right )+\frac{3}{k}\geqslant f(y,u)$$ for all $$u\in U$$. It means that $$f\left (y,u^{*}\right )+\frac{3}{k}\geqslant \underset{u\in U}{\sup }\,f(y,u)$$. Finally, $$ |\underset{u\in U}{\sup }\,f(x,u)-\underset{u\in U}{\sup }\,f(y,u) |\leqslant |\,f\left (x,u^{*}\right )-f\left (y,u^{*}\right )|+\frac{2}{k}\leqslant \frac{3}{k}$$. It follows that $$h$$ is a uniformly continuous function with the modulus $$\frac{1}{k}\mapsto \omega \big (\frac{1}{3k}\big )$$.

It can be proven that $$T$$ is indeed an operator that sends uniformly continuous functions to uniformly continuous functions provided that $$r,V,f$$ are uniformly continuous functions. In turn, uniformly continuous functions on totally bounded sets are bounded.
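The computational content of Proposition 4.1 is that the inner optimization in (4.3) and (4.4) can be carried out approximately over a finite approximation of $$U$$. The sketch below (a floating-point illustration; the particular $$f$$ and the grids are assumptions introduced here) approximates $$h(x)=\sup _{u\in U}f(x,u)$$ by a maximum over such a grid.

```python
import numpy as np

# Approximate h(x) = sup_{u in U} f(x, u) over a totally bounded control set
# U by maximising over a finite 1/k-approximation (a uniform grid on
# U = [-1, 1]).  By uniform continuity of f in u, the grid maximum is within
# a known tolerance of the true supremum, and Proposition 4.1 guarantees
# that x -> h(x) inherits a modulus of uniform continuity from f.
def approx_sup(f, x, u_grid):
    return max(f(x, u) for u in u_grid)

# A uniformly continuous f on X x U = [0, 1] x [-1, 1]; here the true
# supremum is h(x) = x, attained at u = x / 2.
f = lambda x, u: x - (u - 0.5 * x) ** 2
u_grid = np.linspace(-1.0, 1.0, 201)          # spacing 0.01
for x in (0.0, 0.5, 1.0):
    print(x, approx_sup(f, x, u_grid))        # close to h(x) = x
```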
Denote the set of uniformly continuous functions from $$X$$ to $$\mathbb{R}$$ by $$\mathscr{V}$$. Assuming the supremum norm on $$\mathscr{V}$$, it is easy to show that $$T$$ is a contraction mapping, and the standard proof is already constructive. First, observe that $$r\left (x,u\right )+\gamma V(\,f(x,u))\leqslant r\left (x,u\right )+\gamma W(\,f(x,u))$$ whenever $$V(x)\leqslant W(x)$$ for any $$x$$ in $$X$$. Therefore,   \begin{align*} & \sup_{u\in U}\left\{ r\left(x,u\right)+\gamma V(\,f(x,u))\right\} \leqslant \sup_{u\in U}\left\{ r\left(x,u\right)+\gamma W(\,f(x,u))\right\} . \end{align*} Consequently, $$T[V]\leqslant T[W]$$. This constitutes the monotonicity of $$T$$, which is the first of Blackwell's sufficient conditions for a contraction mapping (Blackwell, 1965). The second condition requires that $$T$$ be discounting:   \begin{align*} T[V+a](x)=\sup_{u\in U}\left\{ r\left(x,u\right)+\gamma V(\,f(x,u))+\gamma a\right\} = T[V](x)+\gamma a \end{align*}for any $$a\geqslant 0$$. It follows that $$T$$ is a contraction mapping with a modulus $$\gamma $$. By the Banach fixed point theorem (Palais, 2007), $$T$$ has a unique fixed point. The proof of the theorem is essentially constructive and provided by the algorithm   $$ V_{n}:=T\left[V_{n-1}\right],n\in\mathbb{N} $$starting from an arbitrary uniformly continuous function $$V_{0}$$, from which it immediately follows that $$\left \{ V_{n}\right \} _{n}$$ is a Cauchy sequence with the modulus defined from   $$ \big|\big|T^{n}\left[V_{0}\right]-V^{*}\big|\big|\leqslant\dfrac{\gamma^{n}}{1-\gamma}\big|\big|T\left[V_{0}\right]-V_{0}\big|\big|, $$where $$T^{n}[V]=\underbrace{T[T[ \dots T[V]]]}_{n\textrm{ times}}$$ and $$V^{*}$$ is the fixed point. Thus, $$V_{n}$$ converges to $$V^{*}$$ uniformly and $$V^{*}$$ is in turn a uniformly continuous function. The only difference from the classical theorem is that $$\mathscr{V}$$ must be shown non-empty, i.e., some $$V_{0}\in \mathscr{V}$$ must be exhibited. Using this fact, one can perform value iteration starting from any $$V_{0}\in \mathscr{V}$$ and stopping at some $$V_{n}$$ such that a convergence criterion $$\big |\big |V_{n}-V^{*}\big |\big |\leqslant \varepsilon $$ is satisfied. Notice that the number of steps $$n$$ can be directly determined from the desired accuracy $$\varepsilon $$ via the above estimate. Having found a suitable approximation $$V_{n}$$, an optimal control policy is of interest. In the constructive setting, a control policy must be a uniformly continuous function. In real-world applications, each control action has physical limits on magnitude and rate of change. Thus, one may assume that the space of admissible control policies $$\mathscr{U}$$ is a located subset of the space of uniformly bounded and uniformly Lipschitz functions on $$X$$ with a common bound and Lipschitz constant, respectively. By Theorem 3.1, the latter space is totally bounded, and since $$\mathscr{U}$$ is located, it is also totally bounded (Ye, 2011, Lemma 4.3). To cope with the problem of continuity of extrema in the state variable, it is suggested to consider the following relaxed optimization problem for $$V_{n}$$:   \begin{equation} \sup_{u\in\mathscr{U}}\inf_{x\in X}\left\{ r\left(x,u(x)\right)+\gamma V_{n}(\,f(x,u(x)))\right\} . \end{equation} (4.5) It follows that $$J:\mathscr{U}\rightarrow \mathbb{R}$$ defined by   $$ J[u]:=\inf_{x\in X}\left\{ r\left(x,u(x)\right)+\gamma V_{n}(\,f(x,u(x)))\right\} $$is a uniformly continuous functional since $$r,V_{n},f,\inf $$ are uniformly continuous.
By Theorem 3.2, for any $$k\in \mathbb{N}$$, there exists a control policy $$u^{*}\in \mathscr{U}$$ such that   \begin{align*} & \inf_{x\in X}\left\{ r\left(x,u^{*}(x)\right)+\gamma V_{n}(f(x,u^{*}(x)))\right\} +\frac{1}{k}\geqslant \inf_{x\in X}\left\{ r\left(x,u(x)\right)+\gamma V_{n}(\,f(x,u(x)))\right\} \end{align*}for any control policy $$u$$. The difference between (4.5) and (4.3) lies in the way the performance mark is defined. In the latter case, $$r\left (x,u\right )+\gamma V(\,f(x,u))$$ is optimized in all states $$x$$, while in the former, the ‘worst’ state is optimized. The optimization problem (4.5) is thus milder than the original one, but it is still appropriate for a variety of practical applications.

4.3 ADP

ADP is a variant of DP that is suitable for real-time optimization problems. It may be considered as a reinforcement learning technique (Sutton & Barto, 1998) in the sense that it uses an iterative procedure of updating a so-called actor, which produces a control policy, according to a critic, which represents the value function. The value function in the framework of ADP is commonly subject to approximation since exact optimal solutions may not be achievable (Lewis & Syrmos, 1995). In this regard, neural networks are widely used as approximators (Bertsekas & Tsitsiklis, 1995; Werbos, 1990, 1992). For recent surveys on ADP, refer, e.g., to Balakrishnan et al. (2008) and Ferrari et al. (2011). It is common to consider ADP in application to discrete-time systems of the form $$x_{k+1}:=f\left (x_{k},u_{k}\right ),k\in \mathbb{N}$$ or even affine in control: $$x_{k+1}:=f\left (x_{k}\right )+g\left (x_{k}\right )u_{k},k\in \mathbb{N}$$. It is also assumed that $$f(0)=g(0)=0$$, and that there exists a control policy $$u$$ such that for all initial conditions $$x_{0}\in X$$, $$x_{k}\rightarrow 0$$ as $$k\rightarrow \infty $$. ADP usually addresses the following infinite-horizon optimization problem:   \begin{equation} \inf_{u}\sum_{l=k}^{\infty}r\left(x_{l},u\left(x_{l}\right)\right),\quad\forall x_{k}\in X. \end{equation} (4.6) Al Tamimi et al. (2008) have provided a convergence analysis of an ADP algorithm for affine-in-control systems. Assuming a utility function in the form $$r\left (x,u\right ):=q\left (x\right )+u^{T}Ru$$ with $$q$$ being a positive-definite function, and given an arbitrary $$\mathscr{C}^{\infty }\left (X\subset \mathbb{R}^{n},\mathbb{R}\right )-$$function $$V_{0}\left (x\right )$$ such that $$\forall x\in X,0\leqslant V_{0}\left (x\right )\leqslant Q\left (x\right ),$$ perform the following iterations starting with $$i:=0$$ for all $$x_{k}\in X$$:   \begin{eqnarray} u_{i}\left(x_{k}\right): & = & \arg\inf_{u}\left\{ r\left(x_{k},u\right)+V_{i}\left(\,f\left(x_{k},u\right)\right)\right\}, \end{eqnarray} (4.7)  \begin{eqnarray} V_{i+1}\left(x_{k}\right): & = & r\left(x_{k},u_{i}\left(x_{k}\right)\right)+V_{i}\left(\,f\left(x_{k},u_{i}\left(x_{k}\right)\right)\right),\\ i: & = & i+1.\nonumber \end{eqnarray} (4.8) Al Tamimi et al. showed that this algorithm converges to the solution of the Hamilton–Jacobi–Bellman equation   \begin{eqnarray} V^{*}\left(x_{k}\right) & = & \inf_{u}\left\{ r\left(x_{k},u\right)+V^{*}\left(\,f\left(x_{k},u\right)\right)\right\}, \end{eqnarray} (4.9)  \begin{eqnarray} u^{*}\left(x_{k}\right) & = & \arg\inf_{u}\left\{ r\left(x_{k},u\right)+V^{*}\left(\,f\left(x_{k},u\right)\right)\right\}, \end{eqnarray} (4.10)  \begin{eqnarray} & & \forall x_{k}\in X \end{eqnarray} (4.11)as $$i\rightarrow \infty $$.
The proof essentially uses the classical monotone convergence theorem that states that a sequence of real numbers converges whenever it is bounded and monotone. Consequently, no estimate on the number of iterations can be given for a prescribed accuracy $$\big |\big |V_{i}-V^{*}\big |\big |$$. Another subtle point that is hard to justify from both the constructive and practical viewpoints is the assumption that (4.7) can be solved in terms of a closed-form expression. That is generally impossible. An exception is, e.g., the linear quadratic regulator, which provides a closed-form solution for linear systems. Liu & Wei (2014) introduced a similar proof technique, as in (Al Tamimi et al., 2008), for a policy iteration algorithm: starting with $$i:=0$$ and any continuous control policy $$u_{0}$$ such that $$u_{0}(0)=0$$, the state trajectory $$x_{k}\rightarrow 0$$ under $$u_{0}$$, and (4.6) converges, perform the following iterations:   \begin{eqnarray} V_{i}\left(x\right): & = & r\left(x,u_{i}\left(x\right)\right)+V_{i}\left(\,f\left(x,u_{i}\left(x\right)\right)\right), \end{eqnarray} (4.12)  \begin{eqnarray} u_{i}\left(x_{k}\right): & = & \arg\inf_{u}\left\{ r\left(x_{k},u\right)+V_{i-1}\left(\,f\left(x_{k},u\right)\right)\right\}, \end{eqnarray} (4.13)  \begin{eqnarray} & & \forall x_{k}\in X \\ i: & = & i+1.\nonumber \end{eqnarray} (4.14) Notice the difference in iteration indices for the value function and the control policy. Again, the proof of convergence uses the monotone convergence theorem and the assumption that (4.13) can be solved in terms of a closed-form expression. To cope with this problem, Al-Tamimi and Liu suggest using neural-network-based approximators for the value function and the control policy. Unfortunately, no convergence proof has been given for such an approximate setting (Liu & Wei, 2014, p. 632). An alternative approach has been proposed by Heydari (2014). Instead of approximating the control policy, Heydari has shown that the first-order necessary condition for an extremum   $$ u_{i}(x)=-\frac{1}{2}R^{-1}g^{T}(x)\frac{\partial V_{i}\left(\,f(x)+g(x)u_{i}(x)\right)}{\partial x},\quad\forall x\in X $$is a fixed-point equation provided that all the functions in question are $$\mathscr{C}^{\infty }$$. Under appropriate conditions on the matrix norms of $$R^{-1}$$ and/or $$g(x)$$, it can be shown that the mapping $$F:\mathscr{C}^{\infty }\left (X,\mathbb{R}^{m}\right )\rightarrow \mathscr{C}^{\infty }\left (X,\mathbb{R}^{m}\right )$$ defined by   \begin{equation} F[u](x):=-\dfrac{1}{2}R^{-1}g^{T}\left(x\right)\frac{\partial V_{i}(\,f(x)+g(x)u(x))}{\partial x} \end{equation} (4.15)is a contraction. The assumption that the control policy at each iteration is a smooth function is consistent with our argumentation in Section 4.2 and, provided with a uniform bound and Lipschitz constant, leads to total boundedness of the space of control policies by Corollary 3.3. However, the first-order condition for an extremum is not sufficient to claim that the infimum of (4.8) is attained at each iteration. Currently, one can decouple (4.7) and (4.8), iterate the value function and then claim existence of an approximate optimal control policy for an alternative performance mark (4.5). Relaxing the continuity condition by considering measurable functions, and investigating other performance marks, such as Lebesgue integrals, may be of interest for future research.
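To summarize the computational content of Sections 4.2 and 4.3, the following sketch (a simplified grid-based illustration in floating-point arithmetic, not the cited neural-network-based ADP schemes; the system, cost and grids are assumptions) performs value iteration with the cost-minimizing form of the DP operator, using a finite grid of the control set as an approximate infimum in the spirit of Theorem 3.2 and a number of sweeps fixed in advance from the contraction estimate of Section 4.2.

```python
import numpy as np

# Grid-based value iteration V <- T[V] for a scalar system x_{k+1} = f(x_k, u_k)
# with running cost r and discount gamma; the minimisation over u is taken
# over a finite grid of the control set (an approximate infimum).
gamma = 0.9
x_grid = np.linspace(-1.0, 1.0, 41)
u_grid = np.linspace(-1.0, 1.0, 21)

f = lambda x, u: np.clip(0.8 * x + 0.2 * u, -1.0, 1.0)   # keeps the state in X
r = lambda x, u: x ** 2 + 0.1 * u ** 2

def bellman_backup(V):
    # T[V](x) = min_u { r(x, u) + gamma * V(f(x, u)) }, with V interpolated
    # between the grid points of x_grid.
    Q = np.array([[r(x, u) + gamma * np.interp(f(x, u), x_grid, V)
                   for u in u_grid] for x in x_grid])
    return Q.min(axis=1), u_grid[Q.argmin(axis=1)]

# The number of sweeps is fixed a priori from the contraction bound
# ||T^n[V0] - V*|| <= gamma^n / (1 - gamma) * ||T[V0] - V0||.
V = np.zeros_like(x_grid)
for _ in range(100):
    V, policy = bellman_backup(V)

mid = len(x_grid) // 2                 # index of the state x = 0
print(V[mid], policy[mid])             # approximate V*(0) and a greedy control at 0
```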
5. Conclusions

The present work is highlighted in the following points:

• A new constructive proof of the approximate extremum value theorem for function spaces is suggested. The methodological approach of the proof takes into account the numerical uncertainty which is related to limitations of real number representations in a computational device.

• The functions forming finite approximations to the respective function spaces are constructed explicitly. In particular, it was shown that the sets of uniformly bounded and uniformly Lipschitz functions on a totally bounded set are totally bounded, by explicit construction of the approximating functions.

• As stated in Remark 3.3, the result of Theorem 3.2 is, in general, the best that any numerical procedure can achieve. This implies that optimality in optimal control problems may in general be achieved only approximately.

• Applications of the theorem to finite-horizon optimal control, DP and ADP are addressed. It is shown that, under the stated assumptions, whose practicability is discussed, approximate optimal control policies can be effectively computed up to a prescribed accuracy of the approximate optima of the cost functional.

References

Al Tamimi, A., Lewis, F. L. & Abu Khalaf, M. (2008) Discrete-time nonlinear HJB solution using approximate dynamic programming: convergence proof. IEEE Trans. Syst. Man Cybern. Part B (Cybernetics), 38, 943–949.

Balakrishnan, S., Ding, J. & Lewis, F. L. (2008) Issues on stability of ADP feedback controllers for dynamical systems. IEEE Trans. Syst. Man Cybern. Part B (Cybernetics), 38, 913–917.

Banaschewski, B. & Mulvey, C. (1997) A constructive proof of the Stone–Weierstrass theorem. J. Pure Appl. Algebra, 116, 25–40.

Berge, C. (1963) Topological Spaces: Including a Treatment of Multi-valued Functions, Vector Spaces, and Convexity. Mineola, New York: Dover Publications, Inc.

Berger, J., Bridges, D. & Schuster, P. (2006) The fan theorem and unique existence of maxima. J. Symbolic Logic, 71, 713–720.

Bertsekas, D. P. & Tsitsiklis, J. N. (1995) Neuro-dynamic programming: an overview. Proceedings of the 34th IEEE Conference on Decision and Control, vol. 1. IEEE, pp. 560–564.

Bishop, E. (1967) Foundations of Constructive Analysis, vol. 60. New York: McGraw-Hill.

Bishop, E. & Bridges, D. (1985) Constructive Analysis, vol. 279. Berlin: Springer.

Blackwell, D. (1965) Discounted dynamic programming. Ann. Math. Stat., 36, 226–235.

Bridges, D. & Richman, F. (1987) Varieties of Constructive Mathematics, vol. 97. Cambridge: Cambridge University Press.

Bridges, D. & Vita, L. (2007) Techniques of Constructive Analysis. Universitext. New York: Springer.

Bridges, D. S. (2007) Constructing local optima on a compact interval. Arch. Math. Logic, 46, 149–154.

Ferrari, S., Sarangapani, J. & Lewis, F. L. (2011) Special issue on approximate dynamic programming and reinforcement learning. J. Control Theory Appl., 9, 309.

Heydari, A. (2014) Revisiting approximate dynamic programming and its convergence. IEEE Trans. Cybern., 44, 2733–2743.

Lewis, F. L. & Syrmos, V. L. (1995) Optimal Control. Hoboken, New Jersey: John Wiley & Sons.

Liu, D. & Wei, Q. (2014) Policy iteration adaptive dynamic programming algorithm for discrete-time nonlinear systems. IEEE Trans. Neural Netw. Learn. Syst., 25, 621–634.
Liu, D. & Wei, Q. (2014) Policy iteration adaptive dynamic programming algorithm for discrete-time nonlinear systems. IEEE Trans. Neural Netw. Learn. Syst., 25, 621–634.
Palais, R. S. (2007) A simple proof of the Banach contraction principle. J. Fixed Point Theory Appl., 2, 221–223.
Rump, S. (2010) Verification methods: rigorous results using floating-point arithmetic. Acta Numerica, 19, 287–449.
Schwichtenberg, H. (2012) Constructive Analysis with Witnesses. Manuscript.
Sepulchre, R., Jankovic, M. & Kokotovic, P. (2012) Constructive Nonlinear Control. London: Springer Science & Business Media.
Sontag, E. (1989) A "universal" construction of Artstein's theorem on nonlinear stabilization. Syst. Control Lett., 13, 117–123.
Sutton, R. S. & Barto, A. G. (1998) Reinforcement Learning: An Introduction, vol. 1. Cambridge, MA, USA: MIT Press.
Tanaka, Y. (2012) On the maximum theorem: a constructive analysis. Int. J. Comp. and Math. Sciences, 6, 173–175.
Turing, A. M. (1936) On computable numbers, with an application to the Entscheidungsproblem. Proc. London Math. Soc., 42, 230–265.
Werbos, P. (1990) A menu of designs for reinforcement learning over time. Neural Networks for Control. Cambridge, MA, USA: MIT Press, pp. 67–95.
Werbos, P. J. (1992) Approximate dynamic programming for real-time control and neural modeling. Handb. Intelligent Cont.: Neural Fuzzy Adaptive Approaches, 15, 493–525.
Ye, F. (2011) Strict Finitism and the Logic of Mathematical Applications, vol. 355. Dordrecht: Springer.

Appendix

A.1 Smooth approximations

For the sake of completeness, some technical details of smooth approximation are discussed in this appendix. First, consider the following real-valued, non-analytic $$\mathscr{C}^{\infty }$$–function on $$\mathbb{R}$$:   $$ \sigma(t)=\left\{ \begin{array}{ll} \mathrm{e}^{-\frac{1}{t^{2}}} & t>0,\\ 0 & t\leqslant0. \end{array}\right. $$ Define a $$\mathscr{C}^{\infty }$$ bump function $$\vartheta :\mathbb{R}^{n}\rightarrow \mathbb{R},n\in \mathbb{N}$$ by $$\vartheta (x):=a\cdot \sigma \left (1-\|x\|^{2}\right )$$, where   $$ a:=\left(\int\limits _{\,\mathbb{R}^{n}}\sigma\left(1-\|x\|^{2}\right)\ \textrm{d}x\right)^{-1}. $$ Then, the support of $$\vartheta $$ lies within the unit ball $$\mathscr{B}(0,1)$$, i.e., $$\left \{ x\in \mathbb{R}^{n}:\vartheta (x)>0\right \} \subset \mathscr{B}(0,1)$$. Clearly, $$\forall x\in \mathbb{R}^{n},\vartheta (x)\geqslant 0$$ and $$\int _{\mathbb{R}^{n}}\vartheta (x)\ \textrm{d}x=1$$. Now, let $$\vartheta _{k}(x):=k^{n}\vartheta (kx)$$ for $$k\in \mathbb{N}$$. It follows that $$\int _{\mathbb{R}^{n}}\vartheta _{k}(x)\ \textrm{d}x=k^{n}\int _{\mathbb{R}^{n}}\frac{1}{k^{n}}\vartheta (kx)\ \textrm{d}(kx)=1$$. Let $$f:\mathbb{R}^{n}\rightarrow \mathbb{R}$$ be an $$L$$–Lipschitz function; it may be assumed that $$L=1$$ without loss of generality. Clearly, $$f$$ is locally integrable, i.e., integrable on any compact subset of $$\mathbb{R}^{n}$$, since it is Lipschitz continuous on any such subset.
Since $$\vartheta _{k}$$ is compactly supported, define a $$\mathscr{C}^{\infty }$$ function $$f_{k}$$ by convolution as follows:   \begin{align*} f_{k}(x)&=\int\limits _{\mathbb{R}^{n}}f(\chi)\vartheta_{k}(x-\chi)\ \textrm{d}\chi=\int\limits _{\mathbb{R}^{n}}f(x-\chi)\vartheta_{k}(\chi)\ \textrm{d}\chi\\ &=k^{n}\int\limits _{\mathscr{B}\left(0,\frac{1}{k}\right)}f(x-\chi)\vartheta(k\chi)\ \textrm{d}\chi=\int\limits _{\mathscr{B}\left(0,1\right)}f\left(x-\dfrac{\chi}{k}\right)\vartheta(\chi)\ \textrm{d}\chi. \end{align*} It follows that   \begin{align*} f_{k}(x)-f_{k}(y)=\int\limits _{\mathbb{R}^{n}}\big(\,f(x-\chi)-f(y-\chi)\big)\vartheta_{k}(\chi)\ \textrm{d}\chi. \end{align*} Since $$\forall \chi \in \mathbb{R}^{n},\big |\,f(x-\chi )-f(y-\chi )\big |\vartheta _{k}(\chi )\leqslant \|x-y\|\vartheta _{k}(\chi )$$, it holds that   \begin{align*} \big|\,f_{k}(x)-f_{k}(y)\big|&\leqslant\int\limits _{\mathbb{R}^{n}}\big|\,f(x-\chi)-f(y-\chi)\big|\vartheta_{k}(\chi)\ \textrm{d}\chi\\&\leqslant \|x-y\|\int\limits _{\mathbb{R}^{n}}\vartheta_{k}(\chi)\ \textrm{d}\chi=\|x-y\|. \end{align*} Finally, since $$f(x)=f\left (x\right )\cdot 1=f\left (x\right )\cdot \int \limits _{\mathscr{B}(0,1)}\vartheta (\chi )\ \textrm{d}\chi =$$$$\int \limits _{\mathscr{B}(0,1)}f\left (x\right )\vartheta (\chi )\ \textrm{d}\chi $$, it follows that   \begin{align*} \big|\,f_{k}(x)-f(x)\big|&=\left|\int\limits _{\;\mathscr{B}(0,1)}f\left(x-\dfrac{\chi}{k}\right)\vartheta(\chi)\ \textrm{d}\chi-f(x)\right|\\ &=\left|\int\limits _{\;\mathscr{B}(0,1)}f\left(x-\dfrac{\chi}{k}\right)\vartheta(\chi)\ \textrm{d}\chi-\int\limits _{\mathscr{B}(0,1)}f\left(x\right)\vartheta(\chi)\ \textrm{d}\chi\right|\\&= \left|\int\limits _{\;\mathscr{B}(0,1)}\left(\,f\left(x-\dfrac{\chi}{k}\right)-f(x)\right)\vartheta(\chi)\ \textrm{d}\chi\right|\\&\leqslant \sup_{\chi\in\mathscr{B}(0,1)}\Big|\Big|\dfrac{\chi}{k}\Big|\Big|\int\limits _{\mathscr{B}\left(0,1\right)}\vartheta(\chi)\ \textrm{d}\chi=\dfrac{1}{k}. \end{align*} Thus, $$f_{k}$$ is a $$\mathscr{C}^{\infty }$$, $$1$$–Lipschitz function that approximates $$f$$ uniformly up to the precision $$\frac{1}{k}$$.
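As a numerical illustration of this construction, consider the following one-dimensional sketch (an assumption-laden example, not part of the paper): it mollifies the $$1$$–Lipschitz test function $$f(x)=|x|$$ with the bump kernel $$\vartheta $$ and checks the bound $$\sup_{x}\big|\,f_{k}(x)-f(x)\big|\leqslant \frac{1}{k}$$ numerically; the quadrature grid and the choice of $$f$$ are assumptions made for the example.

```python
import numpy as np

# One-dimensional illustration of the mollification f_k = f * theta_k from
# this appendix, applied to the 1-Lipschitz test function f(x) = |x|.

def sigma(t):
    # sigma(t) = exp(-1/t^2) for t > 0 and 0 otherwise (non-analytic, C-infinity).
    t = np.asarray(t, dtype=float)
    safe = np.where(t > 0.0, t, 1.0)
    return np.where(t > 0.0, np.exp(-1.0 / safe ** 2), 0.0)

def integrate(y, x):
    # Elementary trapezoidal quadrature, sufficient for this illustration.
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

# Bump kernel theta supported on [-1, 1], normalized to integrate to one.
chi = np.linspace(-1.0, 1.0, 4001)
theta = sigma(1.0 - chi ** 2)
theta /= integrate(theta, chi)

def f(x):
    return np.abs(x)

def mollify(x, k):
    # f_k(x) = int_{B(0,1)} f(x - chi/k) theta(chi) dchi (after the change of variables).
    return integrate(f(x - chi / k) * theta, chi)

xs = np.linspace(-2.0, 2.0, 401)
for k in (2, 5, 10, 50):
    err = max(abs(mollify(x, k) - f(x)) for x in xs)
    print(f"k = {k:3d}:  sup|f_k - f| ~= {err:.4f}   (bound 1/k = {1 / k:.4f})")
```

The observed error stays below $$\frac{1}{k}$$, and the smoothed function inherits the Lipschitz constant of $$f$$, in agreement with the estimates above.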
A.2 Brouwerian counter-examples

The two examples in Section 1, also called Brouwerian counter-examples, demonstrate the computational inability of finding exact optimal control actions and policies in general. The possible computational problems with addressing optimal control may be related to the so-called principles of omniscience (Bishop & Bridges, 1985, p. 11). One of them, the limited principle of omniscience (LPO), reads as follows: Definition A.1 (LPO) For any binary sequence $$\{a_i\}_i$$, the following holds: either $$a_i = 0$$ for all $$i$$, or there is a $$k$$ with $$a_k = 1$$. LPO may be related to the inability of a computer to perform an unbounded search and decide exactly whether a given real number is non-zero or exactly zero, which may, in turn, be related to Turing's halting problem (Turing, 1936). Let $$\{ a_i \}_i$$ be a binary sequence with at least one entry equal to $$1$$ at a place which is not known a priori. Let the numbers $$b, c$$ be defined as follows:   $$ b = \frac{1}{4} \sum_{i=0}^{\infty} \frac{1}{i+1} a_{2i+1}, \qquad c = \frac{1}{4} \sum_{i=1}^{\infty} \frac{1}{i+1} a_{2i}. $$ If it were known that $$b=0$$, then one could deduce that all the odd-indexed entries of $$\{a_i\}$$ are zero and, therefore, since at least one entry must be $$1$$, there exists an index $$2N$$ such that $$a_{2N}=1$$, i.e.,   $$ b=0 \; \Rightarrow \; \forall i, \; a_{2i+1} =0 \; \Rightarrow \; \exists N. \; a_{2N}=1. $$ Similarly, if $$c=0$$, then some odd-indexed entry of $$\{a_i\}$$ must be $$1$$, i.e.,   $$ c=0 \; \Rightarrow \; \forall i, \; a_{2i} =0 \; \Rightarrow \; \exists N. \; a_{2N+1}=1. $$ In this minimalistic scenario, the appearance of a $$1$$ can be described by the condition $$b=0 \lor c=0$$, which implies LPO. Consider, for instance, the case of the cost function $$J(u) = \min \{ u^2 + b, (u-1)^2+c \}$$. If an optimal control action $$u^{\ast }$$ could be computed exactly, such that $$J(u^{\ast })= \min J$$, then either $$u^{\ast } \geqslant \frac{1}{3}$$ or $$u^{\ast } \leqslant \frac{2}{3}$$. If $$u^{\ast } \geqslant \frac{1}{3}$$, then   $$ (u^{\ast})^2 +b \geqslant \frac{1}{9}+b> 0, $$ whence $$(u^{\ast } -1)^2 +c =0 \Rightarrow c=0$$. If $$u^{\ast } \leqslant \frac{2}{3}$$, then   $$ (u^{\ast}-1)^2 +c \geqslant \frac{1}{9}+c> 0, $$ whence $$(u^{\ast })^2 +b =0 \Rightarrow b=0$$. Therefore, the equivalence to LPO is shown.
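To make the computational content of this counter-example explicit, the following hypothetical sketch (not from the paper; the helper names and the placement of the single $$1$$ in the sequence are assumptions) works only with finite rational approximations of $$b$$ and $$c$$ and computes a $$\frac{1}{k}$$–approximate minimizer of $$J$$ by a finite search, in the spirit of Theorem 3.2 and Remark 3.2. No finite precision suffices to decide whether $$b=0$$ or $$c=0$$, so the selected action may change as the precision grows, while approximate optimality with respect to the available approximations is retained.

```python
from fractions import Fraction

# Hypothetical illustration of J(u) = min{u^2 + b, (u-1)^2 + c} when b and c
# are only available through rational approximations; the binary sequence is
# assumed to contain a single 1 at an (a priori unknown) position.

def b_approx(n, ones_at=()):
    # n-th rational approximation of b = (1/4) * sum_{i>=0} a_{2i+1} / (i+1).
    return Fraction(1, 4) * sum(Fraction(1, i + 1)
                                for i in range(n) if 2 * i + 1 in ones_at)

def c_approx(n, ones_at=()):
    # n-th rational approximation of c = (1/4) * sum_{i>=1} a_{2i} / (i+1).
    return Fraction(1, 4) * sum(Fraction(1, i + 1)
                                for i in range(1, n) if 2 * i in ones_at)

def J(u, b, c):
    return min(u * u + b, (u - 1) ** 2 + c)

def approx_minimizer(k, b, c):
    # J is 2-Lipschitz in u on [0, 1], so a grid of step 1/(4k) yields a point
    # whose cost is within 1/k of the infimum of J over [0, 1].
    grid = [Fraction(j, 4 * k) for j in range(4 * k + 1)]
    return min(grid, key=lambda u: J(u, b, c))

# The single 1 sits at the odd position 7, so the exact values are b = 1/16 and
# c = 0, and the exact minimizer is u* = 1; at low precision the search cannot
# see this and may return u = 0 instead.
for n in (2, 10, 200):
    b, c = b_approx(n, ones_at={7}), c_approx(n, ones_at={7})
    u = approx_minimizer(20, b, c)
    print(f"n = {n:3d}: b(n) = {b}, c(n) = {c}, u = {u}, J(u) = {J(u, b, c)}")
```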

© The Author(s) 2018. Published by Oxford University Press on behalf of the Institute of Mathematics and its Applications. All rights reserved. This article is published and distributed under the terms of the Oxford University Press Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model).
The convergence proof of Al Tamimi et al. (2008) essentially uses the classical monotone convergence theorem, which states that a sequence of real numbers converges whenever it is bounded and monotone. Consequently, no estimate of the number of iterations can be given for a prescribed accuracy $$\big |\big |V_{i}-V^{*}\big |\big |$$. Another subtle point that is hard to justify from both the constructive and the practical viewpoint is the assumption that (4.7) can be solved in terms of a closed-form expression. That is generally impossible; an exception is, e.g., the linear quadratic regulator, which yields a closed-form solution for linear systems. Liu & Wei (2014) introduced a proof technique similar to that of Al Tamimi et al. (2008) for a policy iteration algorithm: starting with $$i:=0$$ and any continuous control policy $$u_{0}$$ such that $$u_{0}(0)=0$$, the state trajectory $$x_{k}\rightarrow 0$$ under $$u_{0}$$, and (4.6) converges, perform the following iterations:   \begin{eqnarray} V_{i}\left(x\right): & = & r\left(x,u_{i}\left(x\right)\right)+V_{i}\left(\,f\left(x,u_{i}\left(x\right)\right)\right), \end{eqnarray} (4.12)  \begin{eqnarray} u_{i}\left(x_{k}\right): & = & \arg\inf_{u}\left\{ r\left(x_{k},u\right)+V_{i-1}\left(\,f\left(x_{k},u\right)\right)\right\},\quad\forall x_{k}\in X, \end{eqnarray} (4.13)  \begin{eqnarray} i: & = & i+1. \end{eqnarray} (4.14) Notice the difference in iteration indices for the value function and the control policy. Again, the proof of convergence uses the monotone convergence theorem and the assumption that (4.13) can be solved in terms of a closed-form expression. To cope with this problem, Al Tamimi et al. and Liu & Wei suggest using neural-network-based approximators for the value function and the control policy. Unfortunately, no convergence proof has been given for such an approximate setting (Liu & Wei, 2014, p. 632). An alternative approach has been proposed by Heydari (2014). Instead of approximating the control policy, Heydari has shown that the first-order necessary condition for an extremum   $$ u_{i}(x)=-\frac{1}{2}R^{-1}g^{T}(x)\frac{\partial V_{i}\left(\,f(x)+g(x)u_{i}(x)\right)}{\partial x},\quad\forall x\in X $$ is a fixed-point equation, provided that all the functions in question are $$\mathscr{C}^{\infty }$$. By an appropriate choice of the matrix norm of $$R^{-1}$$ and/or $$g(x)$$, it can be shown that the mapping $$F:\mathscr{C}^{\infty }\left(X,\mathbb{R}^{m}\right)\rightarrow \mathscr{C}^{\infty }\left(X,\mathbb{R}^{m}\right)$$ defined by   \begin{equation} F[u](x):=-\dfrac{1}{2}R^{-1}g^{T}\left(x\right)\frac{\partial V_{i}(\,f(x)+g(x)u(x))}{\partial x} \end{equation} (4.15) is a contraction. The assumption that the control policy at each iteration is a smooth function is consistent with our argumentation in Section 4.2 and, provided with a uniform bound and Lipschitz constant, leads to total boundedness of the space of control policies by Corollary 3.3. However, the first-order condition for an extremum is not sufficient to claim that the infimum of (4.8) is attained at each iteration. Currently, one can decouple (4.7) and (4.8), iterate the value function and then claim existence of an approximate optimal control policy for the alternative performance mark (4.5). Relaxing the continuity condition by considering measurable functions, and investigating other performance marks, such as Lebesgue integrals, may be of interest for future research.
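To close this section, Heydari's fixed-point view of the policy update (4.15) can be illustrated by iterating the map pointwise. The sketch below uses a hypothetical value function $$V_{i}$$, drift $$f$$, input matrix $$g$$ and weight $$R$$, together with a finite-difference gradient; the contraction property of $$F$$ is assumed rather than verified.

```python
import numpy as np

# Sketch of the policy update (4.15) as a fixed-point iteration at a single state x.
# V_i, f, g and R below are hypothetical stand-ins; the contraction property of F
# is assumed here, not verified.
R_inv = np.linalg.inv(np.array([[1.0]]))             # R^{-1}, here with R = 1 and m = 1
f = lambda x: 0.8 * x                                # drift f(x) (assumed)
g = lambda x: np.array([[0.1], [0.2]])               # input matrix g(x), shape (n, m)
V_i = lambda x: float(x @ x)                         # current value function V_i (assumed)

def grad_V(x, h=1e-6):
    """Central finite-difference approximation of dV_i/dx at the point x."""
    e = np.eye(len(x))
    return np.array([(V_i(x + h * e[j]) - V_i(x - h * e[j])) / (2 * h)
                     for j in range(len(x))])

def policy_update(x, num_iters=50):
    """Iterate u <- F[u](x) = -1/2 R^{-1} g(x)^T dV_i/dx evaluated at f(x) + g(x) u."""
    u = np.zeros(1)                                  # initial policy value at x
    for _ in range(num_iters):
        x_next = f(x) + g(x) @ u                     # successor state f(x) + g(x) u
        u = -0.5 * R_inv @ g(x).T @ grad_V(x_next)   # one application of F, cf. (4.15)
    return u

u_at_x = policy_update(np.array([0.5, -0.3]))
```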
5. Conclusions

The present work is highlighted in the following points. A new constructive proof of the approximate extremum value theorem for function spaces is suggested. The methodological approach of the proof takes into account the numerical uncertainty related to the limitations of real number representations in a computational device. The functions forming finite approximations to the respective function spaces are constructed explicitly; in particular, it was shown that the sets of uniformly bounded and uniformly Lipschitz functions on a totally bounded set are totally bounded, by explicit construction of approximating functions. As stated in Remark 3.3, any numerical procedure may in general achieve at best the result of Theorem 3.2, which implies that optimality in optimal control problems may in general be achieved only approximately. Applications of the theorem to finite-horizon optimal control, DP and ADP are addressed. It is shown that, under the stated assumptions, whose practicability is discussed, approximate optimal control policies can be effectively computed up to a prescribed accuracy on the approximate optima of the cost functional.

References

Al Tamimi, A., Lewis, F. L. & Abu Khalaf, M. (2008) Discrete-time nonlinear HJB solution using approximate dynamic programming: convergence proof. IEEE Trans. Syst. Man Cybern. Part B (Cybernetics), 38, 943–949.
Balakrishnan, S., Ding, J. & Lewis, F. L. (2008) Issues on stability of ADP feedback controllers for dynamical systems. IEEE Trans. Syst. Man Cybern. Part B (Cybernetics), 38, 913–917.
Banaschewski, B. & Mulvey, C. (1997) A constructive proof of the Stone–Weierstrass theorem. J. Pure Appl. Algebra, 116, 25–40.
Berge, C. (1963) Topological Spaces: Including a Treatment of Multi-valued Functions, Vector Spaces, and Convexity. Mineola, New York: Dover Publications, Inc.
Berger, J., Bridges, D. & Schuster, P. (2006) The fan theorem and unique existence of maxima. J. Symbolic Logic, 71, 713–720.
Bertsekas, D. P. & Tsitsiklis, J. N. (1995) Neuro-dynamic programming: an overview. Proceedings of the 34th IEEE Conference on Decision and Control, vol. 1. IEEE, pp. 560–564.
Bishop, E. (1967) Foundations of Constructive Analysis, vol. 60. New York: McGraw-Hill.
Bishop, E. & Bridges, D. (1985) Constructive Analysis, vol. 279. Berlin: Springer.
Blackwell, D. (1965) Discounted dynamic programming. Ann. Math. Stat., 36, 226–235.
Bridges, D. & Richman, F. (1987) Varieties of Constructive Mathematics, vol. 97. Cambridge: Cambridge University Press.
Bridges, D. & Vita, L. (2007) Techniques of Constructive Analysis. Universitext. New York: Springer.
Bridges, D. S. (2007) Constructing local optima on a compact interval. Arch. Math. Logic, 46, 149–154.
Ferrari, S., Sarangapani, J. & Lewis, F. L. (2011) Special issue on approximate dynamic programming and reinforcement learning. J. Control Theory Appl., 9, 309.
Heydari, A. (2014) Revisiting approximate dynamic programming and its convergence. IEEE Trans. Cybern., 44, 2733–2743.
Lewis, F. L. & Syrmos, V. L. (1995) Optimal Control. Hoboken, New Jersey: John Wiley & Sons.
Liu, D. & Wei, Q. (2014) Policy iteration adaptive dynamic programming algorithm for discrete-time nonlinear systems. IEEE Trans. Neural Netw. Learn. Syst., 25, 621–634.
Palais, R. S. (2007) A simple proof of the Banach contraction principle. J. Fixed Point Theory Appl., 2, 221–223.
Rump, S. (2010) Verification methods: rigorous results using floating-point arithmetic. Acta Numerica, 19, 287–449.
Schwichtenberg, H. (2012) Constructive Analysis with Witnesses. Manuscript.
Sepulchre, R., Jankovic, M. & Kokotovic, P. (2012) Constructive Nonlinear Control. London: Springer Science & Business Media.
Sontag, E. (1989) A “universal” construction of Artstein’s theorem on nonlinear stabilization. Syst. Control Lett., 13, 117–123.
Sutton, R. S. & Barto, A. G. (1998) Reinforcement Learning: An Introduction, vol. 1. Cambridge, MA, USA: MIT Press.
Tanaka, Y. (2012) On the maximum theorem: a constructive analysis. Int. J. Comp. and Math. Sciences, 6, 173–175.
Turing, A. M. (1936) On computable numbers, with an application to the Entscheidungsproblem. Proc. London Math. Soc., 42, 230–265.
Werbos, P. (1990) A menu of designs for reinforcement learning over time. Neural Networks for Control. Cambridge, MA, USA: MIT Press, pp. 67–95.
Werbos, P. J. (1992) Approximate dynamic programming for real-time control and neural modeling. Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches, 15, 493–525.
Ye, F. (2011) Strict Finitism and the Logic of Mathematical Applications, vol. 355. Dordrecht: Springer.

Appendix

A.1 Smooth approximations

For the sake of completeness, some technical details of smooth approximation are discussed in this appendix. First, consider the following real-valued non-analytic $$\mathscr{C}^{\infty }$$-function on $$\mathbb{R}$$:   $$ \sigma(t)=\left\{ \begin{array}{ll} \mathrm{e}^{-\frac{1}{t^{2}}} & t>0,\\ 0 & t\leqslant0. \end{array}\right. $$ Define a $$\mathscr{C}^{\infty }$$ bump function $$\vartheta :\mathbb{R}^{n}\rightarrow \mathbb{R},n\in \mathbb{N}$$, by $$\vartheta (x):=a\cdot \sigma \left(1-\|x\|^{2}\right)$$, where   $$ a:=\left(\int\limits _{\,\mathbb{R}^{n}}\sigma\left(1-\|x\|^{2}\right)\ \textrm{d}x\right)^{-1}. $$ Then, the support of $$\vartheta $$ lies within the unit ball $$\mathscr{B}(0,1)$$, i.e., $$\left \{ x\in \mathbb{R}^{n}:\vartheta (x)>0\right \} \subset \mathscr{B}(0,1)$$. Clearly, $$\forall x\in \mathbb{R}^{n},\vartheta (x)\geqslant 0$$ and $$\int _{\mathbb{R}^{n}}\vartheta (x)\ \textrm{d}x=1$$. Now, let $$\vartheta _{k}(x):=k^{n}\vartheta (kx)$$ for $$k\in \mathbb{N}$$. It follows that $$\int _{\mathbb{R}^{n}}\vartheta _{k}(x)\ \textrm{d}x=k^{n}\int _{\mathbb{R}^{n}}\frac{1}{k^{n}}\vartheta (kx)\ \textrm{d}(kx)=1$$. Let $$f:\mathbb{R}^{n}\rightarrow \mathbb{R}$$ be an $$L$$-Lipschitz function; it may be assumed that $$L=1$$ without loss of generality. Clearly, $$f$$ is locally integrable, i.e., integrable on any compact subset of $$\mathbb{R}^{n}$$, since it is Lipschitz continuous on that subset.
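As a quick numerical sanity check of the construction above (in dimension $$n=1$$ and purely as an assumed illustration), one can verify that the rescaled mollifiers $$\vartheta_{k}$$ integrate to one and are supported in $$\mathscr{B}(0,1/k)$$ before they are used in the convolution below.

```python
import numpy as np

# Numerical check of the mollifier construction for n = 1 (illustrative only).
sigma = lambda t: np.where(t > 0, np.exp(-1.0 / np.where(t > 0, t, 1.0) ** 2), 0.0)

x = np.linspace(-1.5, 1.5, 300001)
dx = x[1] - x[0]
a = 1.0 / np.sum(sigma(1.0 - x ** 2) * dx)       # normalization constant a (Riemann sum)
theta = lambda y: a * sigma(1.0 - y ** 2)        # bump function, support in B(0, 1)
theta_k = lambda y, k: k * theta(k * y)          # theta_k(y) = k^n * theta(k y), here n = 1

for k in (1, 2, 5, 10):
    mass = np.sum(theta_k(x, k) * dx)            # should be close to 1 for each k
    print(f"k = {k:2d}: integral of theta_k = {mass:.6f}")
```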
Since $$\vartheta _{k}$$ is compactly supported, define a $$\mathscr{C}^{\infty }$$ function $$f_{k}$$ by convolution as follows:   \begin{align*} f_{k}(x)&=\int\limits _{\mathbb{R}^{n}}f(\chi)\vartheta_{k}(x-\chi)\ \textrm{d}\chi=\int\limits _{\mathbb{R}^{n}}f(x-\chi)\vartheta_{k}(\chi)\ \textrm{d}\chi\\ &=k^{n}\int\limits _{\mathscr{B}\left(0,\frac{1}{k}\right)}f(x-\chi)\vartheta(k\chi)\ \textrm{d}\chi=\int\limits _{\mathscr{B}\left(0,1\right)}f\left(x-\dfrac{\chi}{k}\right)\vartheta(\chi)\ \textrm{d}\chi. \end{align*} It follows that   \begin{align*} f_{k}(x)-f_{k}(y)=\int\limits _{\mathbb{R}^{n}}\big(\,f(x-\chi)-f(y-\chi)\big)\vartheta_{k}(\chi)\ \textrm{d}\chi. \end{align*} Since $$\forall \chi \in \mathbb{R}^{n},\big |\,f(x-\chi )-f(y-\chi )\big |\vartheta _{k}(\chi )\leqslant \|x-y\|\vartheta _{k}(\chi )$$, it holds that   \begin{align*} \big|\,f_{k}(x)-f_{k}(y)\big|&\leqslant\int\limits _{\mathbb{R}^{n}}\big|\,f(x-\chi)-f(y-\chi)\big|\vartheta_{k}(\chi)\ \textrm{d}\chi\\&\leqslant \|x-y\|\int\limits _{\mathbb{R}^{n}}\vartheta_{k}(\chi)\ \textrm{d}\chi=\|x-y\|. \end{align*} Finally, since $$f(x)=f\left (x\right )\cdot 1=f\left (x\right )\cdot \int \limits _{\mathscr{B}(0,1)}\vartheta (\chi )\ \textrm{d}\chi =$$$$\int \limits _{\mathscr{B}(0,1)}f\left (x\right )\vartheta (\chi )\ \textrm{d}\chi $$, it follows that   \begin{align*} \big|\,f_{k}(x)-f(x)\big|&=\left|\int\limits _{\;\mathscr{B}(0,1)}f\left(x-\dfrac{\chi}{k}\right)\vartheta(\chi)\ \textrm{d}\chi-f(x)\right|\\ &=\left|\int\limits _{\;\mathscr{B}(0,1)}f\left(x-\dfrac{\chi}{k}\right)\vartheta(\chi)\ \textrm{d}\chi-\int\limits _{\mathscr{B}(0,1)}f\left(x\right)\vartheta(\chi)\ \textrm{d}\chi\right|\\&= \left|\int\limits _{\;\mathscr{B}(0,1)}\left(\,f\left(x-\dfrac{\chi}{k}\right)-f(x)\right)\vartheta(\chi)\ \textrm{d}\chi\right|\\&\leqslant \sup_{\chi\in\mathscr{B}(0,1)}\Big|\Big|\dfrac{\chi}{k}\Big|\Big|\int\limits _{\mathscr{B}\left(0,1\right)}\vartheta(\chi)\ \textrm{d}\chi=\dfrac{1}{k}. \end{align*}
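The estimate $$\sup_{x}\big|\,f_{k}(x)-f(x)\big|\leqslant 1/k$$ can also be checked numerically. The sketch below, again for $$n=1$$ and with the hypothetical $$1$$-Lipschitz test function $$f(x)=|x|$$, approximates the convolution over $$\mathscr{B}(0,1)$$ by a Riemann sum.

```python
import numpy as np

# Numerical check of sup|f_k - f| <= 1/k for the 1-Lipschitz test function f(x) = |x|
# (n = 1; the test function and grids are chosen for illustration only).
chi = np.linspace(-1.0, 1.0, 2001)               # integration grid over B(0, 1)
dchi = chi[1] - chi[0]
sigma = lambda t: np.where(t > 0, np.exp(-1.0 / np.where(t > 0, t, 1.0) ** 2), 0.0)
a = 1.0 / np.sum(sigma(1.0 - chi ** 2) * dchi)   # normalization of the bump function
theta = a * sigma(1.0 - chi ** 2)                # theta sampled on the chi grid
f = np.abs                                       # 1-Lipschitz test function

def f_k(x, k):
    """f_k(x) = integral over B(0, 1) of f(x - chi/k) * theta(chi) d chi (Riemann sum)."""
    return np.sum(f(x - chi / k) * theta * dchi)

for k in (1, 2, 5, 10):
    err = max(abs(f_k(x, k) - f(x)) for x in np.linspace(-1.0, 1.0, 201))
    print(f"k = {k:2d}: sup-error = {err:.4f}  (bound 1/k = {1.0 / k:.4f})")
```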
A.2 Brouwerian counter-examples

The two examples in Section 1, also called Brouwerian counter-examples, demonstrate that exact optimal control actions and policies cannot, in general, be computed. The possible computational problems in optimal control may be related to the so-called principles of omniscience (Bishop & Bridges, 1985, p. 11). One of them, the limited principle of omniscience (LPO), states the following.

Definition A.1 (LPO) For any binary sequence $$\{a_i\}_i$$, the following holds: either $$a_i = 0$$ for all $$i$$, or there is a $$k$$ with $$a_k = 1$$.

LPO may be related to the inability of a computer to perform an unbounded search and to decide exactly whether a given real number is non-zero or exactly zero, which may in turn be related to Turing's halting problem (Turing, 1936). Let $$\{ a_i \}_i$$ be a binary sequence with at least one $$1$$ at a place which is not known a priori. Let the numbers $$b, c$$ be defined as follows:   $$ b = \frac{1}{4} \sum_{i=0}^{\infty} \frac{1}{i+1} a_{2i+1}, \quad c = \frac{1}{4} \sum_{i=1}^{\infty} \frac{1}{i+1} a_{2i}. $$ If it were known that $$b=0$$, then one could deduce that all the odd entries of $$\{a_i\}$$ are zero and, therefore, since at least one entry must be $$1$$, there exists an index $$2N$$ such that $$a_{2N}=1$$, i.e.,   $$ b=0 \; \Rightarrow \; \forall i, \; a_{2i+1} =0 \; \Rightarrow \; \exists N. \; a_{2N}=1. $$ Similarly, if $$c=0$$, then some odd entry of $$\{a_i\}$$ must be $$1$$, i.e.,   $$ c=0 \; \Rightarrow \; \forall i, \; a_{2i} =0 \; \Rightarrow \; \exists N. \; a_{2N+1}=1. $$ In this minimalistic scenario, the appearance of a $$1$$ can be described by the condition $$b=0 \lor c=0$$, which implies LPO. Consider, for instance, the case of the cost function $$J(u) = \min \{ u^2 + b, (u-1)^2+c \}$$. If an optimal control action $$u^{\ast }$$ could be computed exactly, such that $$J(u^{\ast })= \min J$$, then either $$u^{\ast } \geqslant \frac{1}{3}$$ or $$u^{\ast } \leqslant \frac{2}{3}$$. If $$u^{\ast } \geqslant \frac{1}{3}$$, then   $$ (u^{\ast})^2 +b \geqslant \frac{1}{9}+b> 0, $$ whence $$(u^{\ast } -1)^2 +c =0 \Rightarrow c=0$$. If $$u^{\ast } \leqslant \frac{2}{3}$$, then   $$ (u^{\ast}-1)^2 +c \geqslant \frac{1}{9}+c> 0, $$ whence $$(u^{\ast })^2 +b =0 \Rightarrow b=0$$. Therefore, equivalence to LPO is shown.
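As a closing illustration of the last argument (not taken from the paper), the sketch below builds the partial sums of $$b$$ and $$c$$ from a finite prefix of $$\{a_i\}_i$$ and returns a provisional minimizer of $$J(u)=\min\{u^{2}+b,(u-1)^{2}+c\}$$; a single later term of the sequence can overturn the choice, which reflects the constructive obstacle described above.

```python
from fractions import Fraction

# Illustration only: deciding where the exact minimizer of
# J(u) = min{u^2 + b, (u - 1)^2 + c} lies amounts to comparing b and c,
# and no finite prefix of {a_i} settles that comparison in general.
def partial_b_c(prefix):
    """Partial sums of b = (1/4) sum_{i>=0} a_{2i+1}/(i+1) and c = (1/4) sum_{i>=1} a_{2i}/(i+1)."""
    b = sum(Fraction(a, 4 * (i + 1)) for i, a in enumerate(prefix[1::2]))   # odd-indexed entries
    c = sum(Fraction(a, 4 * (i + 2)) for i, a in enumerate(prefix[2::2]))   # even-indexed entries from a_2
    return b, c

def provisional_minimizer(prefix):
    """Candidate minimizer based on the prefix only: u = 0 if b <= c, else u = 1."""
    b, c = partial_b_c(prefix)
    return 0 if b <= c else 1

print(provisional_minimizer([0] * 21))           # all zeros so far: the choice is arbitrary
print(provisional_minimizer([0] * 21 + [1]))     # a 1 at the odd index 21 overturns it
```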
