TY  - JOUR
AU  - Harbrecht, Helmut
AB  - We compare the cost complexities of two approximation schemes for functions that live on the product domain |$\varOmega _1\times \varOmega _2$| of sufficiently smooth domains |$\varOmega _1\subset \mathbb{R}^{n_1}$| and |$\varOmega _2\subset \mathbb{R}^{n_2}$|, namely the singular value / Karhunen–Loève decomposition and the sparse grid representation. We assume that appropriate finite element methods with associated orders |$r_1$| and |$r_2$| of accuracy are given on the domains |$\varOmega _1$| and |$\varOmega _2$|, respectively. This setting reflects practical needs, since often black-box solvers are used in numerical simulation, which restrict the freedom in the choice of the underlying discretization. We compare the cost complexities of the associated singular value decomposition and the associated sparse grid approximation. It turns out that, in this situation, the approximation by the sparse grid is always equal or superior to the approximation by the singular value decomposition. The results in this article improve and generalize those from the study by Griebel & Harbrecht (2014, Approximation of bi-variate functions. Singular value decomposition versus sparse grids. IMA J. Numer. Anal., 34, 28–54). In particular, we consider the approximation of functions from generalized isotropic and anisotropic Sobolev spaces.

1. Introduction

With this article, we intend to refine the results that have been achieved in the study by Griebel & Harbrecht (2014), where we were concerned with the comparison of low-rank approximation methods and sparse grid methods for bivariate functions. This is a relevant setting since many problems in science and engineering lead to problems on the product |$\varOmega _1 \times \varOmega _2$| of two domains |$\varOmega _1 \subset \mathbb{R}^{n_1}$| and |$\varOmega _2\subset \mathbb{R}^{n_2}$|.
For example, radiosity models and radiative transfer (Widmer et al., 2008), space-time formulations of parabolic problems (Griebel & Oeltz, 2007), phase space problems (Balescu, 1997), biscale homogenization (Cioranescu et al., 2008), as well as correlation equations (Deb et al., 2001) fit into this setting. We refer the reader to the study by Griebel & Harbrecht (2014) for a more comprehensive discussion of these problems and further references. The representation of functions on product domains by low-rank approximation is also the fundamental idea of reduced basis methods and model order reduction, see Rozza et al. (2008), Hesthaven et al. (2016), Quarteroni et al. (2016) and the references therein. Similarly, in uncertainty quantification, the spatial variable and the stochastic variable are defined on different domains. In general, after inserting the Karhunen–Loève decomposition of the underlying random field, one arrives at a parametric problem posed on the product of the physical domain and a high- or even infinite-dimensional parameter domain, see Ghanem & Spanos (1991) and Le Maître & Knio (2010), for example. All the aforementioned problems are directly given on the product of two domains. Furthermore, for some of these as well as for many other problems, the domains themselves are products of lower-dimensional domains. Then, the domain of an |$n$|-dimensional problem with, for instance, |$n$| being some power of two can be split into the product of two domains of dimension |$n/2$|, which can recursively be further split until a terminal situation (a one-dimensional domain or a truly higher-dimensional but nontensor product domain) is reached. Related representation methods have been considered in Hackbusch & Kühn (2009), Oseledets & Tyrtyshnikov (2009), Bebendorf (2011) or Hackbusch (2012).
Here, one should note that hierarchical tensor formats, such as the hierarchical Tucker format or the tensor train format, can be computed by means of a truncated singular value decomposition for each dimension separation step, cf. Grasedyck (2010). An alternative approach would be a two-dimensional sparse grid approximation in each separation step. Then, the recursive application would yield an |$n$|-dimensional sparse grid. This motivates us to consider the simple case of two domains |$\varOmega _1$| and |$\varOmega _2$| only. Our analysis then also covers a single bisection step in the above-mentioned recursion. Our setting is as follows. We suppose that we are given fixed sequences of nested trial spaces $$\begin{equation}V_0^{(i)}\subset V_1^{(i)}\subset V_2^{(i)}\subset\cdots \subset L^2(\varOmega_i),\quad i=1,2\end{equation}$$ (1.1) on the individual subdomains, which consist of ansatz functions of approximation orders |$r_1$| and |$r_2$|, respectively (see Subsection 2.1 for the precise properties of the ansatz spaces under consideration). We hence first fix the discretization and then compare the resulting algorithms. This reflects practical needs, since often black-box codes have to be used due to the implementational complexity of the underlying problems. Note at this point that our assumption is thus fundamentally different from the setting in approximation theory, where a function class is fixed and the best algorithm is sought, compare Novak & Woźniakowski (2008), Novak & Woźniakowski (2010) and Novak & Woźniakowski (2012). It is also different from the universality point of view, where one aims at algorithms that are almost optimal for a wide range of function classes, see Babuška (1968) and Motornyj (1974), for example.
Having the trial spaces (1.1) at hand, we can either apply the truncated singular value decomposition $$f_M(\mathbf{x},\mathbf{y}) := \sum_{\ell=1}^M \sqrt{\lambda_\ell}\varphi_\ell(\mathbf{x})\psi_\ell(\mathbf{y}), \quad\mathbf{x}\in\varOmega_1,\quad\mathbf{y}\in\varOmega_2$$ or the generalized sparse grid approach $$\widehat{f}_J(\mathbf{x},\mathbf{y}) := \sum_{j_1/\sigma+j_2\sigma\leqslant J}\sum_{k_1\in\nabla_{j_1}^{(1)}} \sum_{k_2\in\nabla_{j_2}^{(2)}} \beta_{(j_1,k_1),(j_2,k_2)} \xi_{j_1,k_1}^{(1)}(\mathbf{x})\xi_{j_2,k_2}^{(2)}(\mathbf{y}), \quad\mathbf{x}\in\varOmega_1,\quad\mathbf{y}\in\varOmega_2$$ to represent a given function |$f\in L^2(\varOmega _1\times \varOmega _2)$| in an efficient way. In the first representation, |$\{\varphi _\ell \}_{\ell =1}^M$| and |$\{\psi _\ell \}_{\ell =1}^M$| are sets of orthonormal functions. They are a priori unknown, can in general not be derived analytically and thus need to be approximated in the ansatz spaces |$\{V_j^{(i)}\}$|. In other words, the approximation involves, in most applications, both a truncation after |$M$| terms and an approximate computation of the singular values and the associated left and right singular vectors. In the second representation, |$\sigma>0$| is an appropriately chosen parameter and |$\big\{\xi _{j,k}^{(i)}\big\}_{k\in \nabla _j^{(i)}, j\in \mathbb{N}}$| are in general multilevel or wavelet bases associated with the trial spaces, where the index |$j$| refers to the level of resolution and the index |$k$| refers to the locality of the basis function (the precise definition will be given in Section 4). In order to decide which approximation should be implemented for treating problems on product domains, we need to know the pros and cons of both methods. The main improvement of our theory in comparison to the study by Griebel & Harbrecht (2014) concerns the approximative truncated singular value decomposition.
Namely, it turned out that it is not optimal to directly approximate the singular values and the eigenfunctions of the function under consideration. In this article, we therefore proceed differently; we first apply an |$L^2$|-projection on an appropriately chosen full tensor product space (cf. Subsection 3.2), which enables us to prove sharp estimates on the decay of the eigenvalues of the singular value decomposition (cf. Subsection 3.3). We then truncate the discrete singular value decomposition afterwards to directly derive sharp error estimates in the trace norm, while an approximation of the continuous eigenfunctions is no longer needed (cf. Subsection 3.4). With the improvements achieved here for the approximative truncated singular value decomposition, we are now able to correctly predict the observations made in the numerical examples found in the study by Griebel & Harbrecht (2014). For our comparison, we consider the smoothness of the function |$f$| to be measured in isotropic and anisotropic Sobolev norms. We then want to compare the cost complexity to reach an approximation with a prescribed accuracy for the truncated singular value decomposition and the sparse grid approach. One result of this article is then as follows: in order to approximate a function |$f \in H^p(\varOmega _1\times \varOmega _2)$| in |$L^2(\varOmega _1 \times \varOmega _2)$| with accuracy |$\varepsilon>0$|, we have to spend |$\mathscr{O} (\varepsilon ^{-q})$| degrees of freedom with $$q_{svd} = \frac{\min\{n_1,n_2\}}{p} + \max\left\{\frac{\max\{n_1,n_2\}}{p}, \frac{n_1}{r_1},\frac{n_2}{r_2}\right\}$$ for the approximation by the truncated singular value decomposition and with $$q_{sg} = \max\left\{\frac{n_1+n_2}{p},\frac{n_1}{r_1},\frac{n_2}{r_2}\right\}$$ for the general sparse grid method with associated parameter |$\sigma = n_1/n_2$| (a precise definition is given in Section 4), see also the study by Griebel & Harbrecht (2013).
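Both exponents are elementary to evaluate. The following sketch (a hypothetical helper, not code from the paper) computes |$q_{svd}$| and |$q_{sg}$| for sample choices of |$n_1$|, |$n_2$|, |$p$|, |$r_1$| and |$r_2$|:

```python
# Hypothetical helpers (not from the paper): evaluate the two cost exponents
# q_svd and q_sg stated above for dimensions n1, n2, smoothness p and
# finite element approximation orders r1, r2.

def q_svd(n1, n2, p, r1, r2):
    """Exponent for the approximative truncated singular value decomposition."""
    return min(n1, n2) / p + max(max(n1, n2) / p, n1 / r1, n2 / r2)

def q_sg(n1, n2, p, r1, r2):
    """Exponent for the generalized sparse grid with sigma = n1 / n2."""
    return max((n1 + n2) / p, n1 / r1, n2 / r2)

if __name__ == "__main__":
    # example: n1 = n2 = 3, piecewise linear elements (r = 2), smoothness p = 4
    print(q_svd(3, 3, 4.0, 2, 2))  # 0.75 + 1.5 = 2.25
    print(q_sg(3, 3, 4.0, 2, 2))   # 1.5
```

For these sample parameters, the sparse grid needs |$\mathscr{O}(\varepsilon^{-1.5})$| degrees of freedom versus |$\mathscr{O}(\varepsilon^{-2.25})$| for the singular value decomposition.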
Since it always holds $$\frac{\min\{n_1,n_2\}}{p}+\max\left\{\frac{\max\{n_1,n_2\}}{p}, \frac{n_1}{r_1},\frac{n_2}{r_2}\right\}\geqslant \max\left\{\frac{n_1+n_2}{p},\frac{n_1}{r_1},\frac{n_2}{r_2}\right\},$$ we deduce that the approximation by the sparse grid method is equal or—in case of smooth functions |$f\in H^p(\varOmega _1\times \varOmega _2)$| with |$\frac{p}{\max \{n_1,n_2\}}> \min \big\{\frac{r_1}{n_1},\frac{r_2}{n_2}\big\}$|—even superior to the approximation by the singular value decomposition, at least for our setting where we also consider the approximation of the eigenfunctions. For example, assume for the sake of simplicity that |$n = n_1 = n_2$| and |$r = r_1 = r_2$|. Then, if |$p\geqslant 2r$|, the sparse grid approach approximates a bivariate function in a cost complexity that is essentially the same as for a univariate function, i.e., with |$\mathscr{O} (\varepsilon ^{-n/r})$| cost. In contrast, the approximative truncated singular value decomposition realizes this complexity only if the function to be approximated is analytic, i.e., in the limit case |$p=\infty$|. In the situation |$p>r$|, the cost complexity of the singular value decomposition is indeed inferior to that of the sparse grid approach, while for |$p\leqslant r$| the cost complexities of the sparse grid approach and the approximative truncated singular value decomposition both coincide with that of the approximation in the full tensor product space. We refine in addition the aforementioned findings by considering in Section 5 more general isotropic and anisotropic Sobolev spaces. Also in these cases, the asymptotic superiority of the sparse grid approach can be established. Recall here again that we fixed the underlying discretization via (1.1) and now compare the resulting associated algorithms. We would like to mention that our results generalize those of Temlyakov (1986, 1988, 1992a,b, 1993).
On the one hand, we consider the approximation of arbitrary functions on domains |$\varOmega _1\subset \mathbb{R}^{n_1}$| and |$\varOmega _2\subset \mathbb{R}^{n_2}$| instead of the approximation of periodic functions on the unit cube. On the other hand, we consider the approximation by piecewise polynomial ansatz functions of fixed order (i.e., the |$h$|-approximation of functions) instead of the approximation by trigonometric polynomials (i.e., the |$p$|-approximation of functions). Nonetheless, the results coincide for cubic domains and ansatz functions of sufficiently high order. The mathematical technique, however, to derive our results is completely different from the one used by Temlyakov. The remainder of this article is organized as follows: in Section 2, we give a short introduction to multilevel approximation. In Section 3, we describe the singular value decomposition of a bivariate function on |$\varOmega _1\times \varOmega _2$| and discuss its approximation properties in detail. Section 4 gives the basics of the so-called general sparse grid approximation of a bivariate function on |$\varOmega _1\times \varOmega _2$| and presents its error rates and cost complexities. In Section 5, we compare the two approximations and make some final remarks. Throughout this article, the notion ‘essential’ in connection with the complexity estimates means ‘up to logarithmic terms’. Moreover, to avoid the repeated use of generic but unspecified constants, we denote by |$C \lesssim D$| that |$C$| is bounded by a multiple of |$D$| independently of parameters, which |$C$| and |$D$| may depend on. Obviously, |$C \gtrsim D$| is defined as |$D \lesssim C$|, and |$C \sim D$| as |$C \lesssim D$| and |$C \gtrsim D$|.

2. Preliminaries

2.1 Approximation on the subdomains

Let |$\varOmega \subset \mathbb{R}^{n}$| be a sufficiently smooth, bounded domain. In general, one uses finite elements to approximate functions in |$L^2(\varOmega )$|.
In the present article, we focus on the common |$h$|-method, i.e., on finite elements of fixed approximation order. Then, particularly for applying multiscale techniques, one has a sequence of nested trial spaces $$\begin{equation}V_0\subset V_1\subset V_2\subset\ldots \subset L^2(\varOmega)\end{equation}$$ (2.1) such that $$L^2(\varOmega) = \overline{\bigcup_{j\in\mathbb{N}_0}V_j},$$ which is called a multiscale analysis. Each space |$V_j$| is defined by a single scale basis |$\varPhi _j = \{\phi _{j,k}\},$| i.e., |$V_j = \operatorname{span}\{\phi _{j,k}: k\in \varDelta _j\}$|, where |$\varDelta _j$| denotes a suitable index set with cardinality |$\#\varDelta _j\sim 2^{nj}$|. We say that the trial spaces have (approximation) order |$r\in \mathbb{N}$| if $$\begin{equation}r = \sup\left\{s\in\mathbb{R}: \inf_{v_j\in V_j}\|v-v_j\|_{L^2(\varOmega)} \lesssim h_j^s\|v\|_s\ \textrm{for all}\ v\in H^s(\varOmega)\right\},\end{equation}$$ (2.2) where the quantity |$h_j\sim 2^{-j}$| corresponds to the mesh width associated with the subspace |$V_j$| on |$\varOmega$|. Note that the integer |$r>0$| refers in general to the maximal order of polynomials that are locally contained in |$V_j$|. Equation (2.2) implies that a given function |$v\in H^p(\varOmega )$|, |$0\leqslant p\leqslant r$|, can be approximated in |$V_j$| at a rate |$h_j^p$|, i.e., the associated |$L^2$|-orthogonal projection |$Q_j:L^2(\varOmega )\to V_j$| satisfies $$\begin{equation}\|(I-Q_j)v\|_{L^2(\varOmega)} \lesssim h_j^p\|v\|_{H^p(\varOmega)}, \quad 0\leqslant p\leqslant r.\end{equation}$$ (2.3) Thus, when we approximate a function |$v\in H^p(\varOmega )$| with |$0\leqslant p\leqslant r$| by uniform mesh refinement, we obtain the rate |$h_j^p$| according to (2.3).
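The rate in (2.3) can be observed numerically. As an illustrative sketch (not from the paper), take |$\varOmega=(0,1)$|, piecewise constant elements (order |$r=1$|) and the smooth function |$v(x)=\sin(\pi x)$|; the |$L^2$|-projection error then halves whenever the mesh width is halved:

```python
# Numerical illustration of (2.3) for piecewise constants (order r = 1):
# the L^2(0,1) projection error of a smooth function decays like h_j.
import numpy as np

def l2_projection_error(n_cells, n_quad=200):
    """L^2 error of projecting v(x) = sin(pi x) onto piecewise constants."""
    err2 = 0.0
    h = 1.0 / n_cells
    for k in range(n_cells):
        # midpoint quadrature nodes inside the k-th cell
        x = (k + (np.arange(n_quad) + 0.5) / n_quad) * h
        v = np.sin(np.pi * x)
        mean = v.mean()                      # cell average = L^2 projection
        err2 += ((v - mean) ** 2).mean() * h  # midpoint rule on the cell
    return np.sqrt(err2)

if __name__ == "__main__":
    e32, e64 = l2_projection_error(32), l2_projection_error(64)
    print(e64 / e32)  # close to 0.5, i.e., first-order convergence
```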
Since the mesh size and the number of unknowns in |$V_j$| are related by |$\dim (V_j)\sim 2^{jn}\sim h_j^{-n}$|, we deduce that $$\begin{equation}N\sim\varepsilon^{-n/p}\end{equation}$$ (2.4) unknowns have to be spent to achieve an approximation error |$\varepsilon$|. The best possible rate |$N^{-r/n}$| is achieved if |$p=r$|, that is, if |$v\in H^r(\varOmega )$|.

2.2 Kolmogorov’s |$n$|-width and full tensor product spaces

For our subsequent analysis of the approximation of bivariate functions in |$L^2(\varOmega _1\times \varOmega _2)$|, we shall fix the definitions, properties and cost complexities individually for each subdomain |$\varOmega _i\subset \mathbb{R}^{n_i}$|, |$i=1,2$|. That is, we fix two multiscale analyses $$\begin{equation}V_0^{(i)}\subset V_1^{(i)}\subset V_2^{(i)}\subset\cdots \subset L^2(\varOmega_i),\quad i=1,2,\end{equation}$$ (2.5) which are assumed to provide the approximation orders |$r_1$| and |$r_2$|, respectively. We start our discussion with the approximation of a given bivariate function |$f\in L^2(\varOmega _1\times \varOmega _2)$| in full tensor product spaces |$V_{j_1}^{(1)}\otimes V_{j_2}^{(2)}$|. To measure the smoothness of bivariate functions, we define the Sobolev space of dominating mixed derivatives by $$H_{mix}^{s_1,s_2}(\varOmega_1\times\varOmega_2) := H^{s_1}(\varOmega_1)\otimes H^{s_2}(\varOmega_2)$$ and set $$H_{iso}^{s_1,s_2}(\varOmega_1\times\varOmega_2) := H_{mix}^{s_1,0}(\varOmega_1\times\varOmega_2) \cap H_{mix}^{0,s_2}(\varOmega_1\times\varOmega_2).$$ Note that, in case of |$p=p_1=p_2$|, the space |$H_{iso}^{p,p}(\varOmega _1\times \varOmega _2)$| coincides with the standard isotropic Sobolev space |$H^p(\varOmega _1\times \varOmega _2)$|, i.e., it holds $$H^p(\varOmega_1\times\varOmega_2)= H_{iso}^{p,p}(\varOmega_1\times\varOmega_2).$$ Let |$f\in H_{iso}^{p_1,p_2}(\varOmega _1\times \varOmega _2)$|.
Then, for the |$L^2$|-orthogonal projections onto |$V_{j_1}^{(1)}$| and |$V_{j_2}^{(2)}$|⁠, respectively, we obtain $$\begin{equation}\begin{aligned} \left\|\left(I-Q_{j_1}^{(1)}\otimes I\right)f\right\|_{L^2(\varOmega_1\times\varOmega_2)} &\lesssim 2^{-j_1\min\{p_1,r_1\}}\|\,f\|_{H_{mix}^{\min\{p_1,r_1\},0}(\varOmega_1\times\varOmega_2)},\\ \left\|\left(I-I\otimes Q_{j_2}^{(2)}\right)f\right\|_{L^2(\varOmega_1\times\varOmega_2)} &\lesssim 2^{-{j_2}\min\{p_2,r_2\}}\|\,f\|_{H_{mix}^{0,\min\{p_2,r_2\}}(\varOmega_1\times\varOmega_2)}.\end{aligned}\end{equation}$$ (2.6) Using standard tensor product arguments leads thus to $$\begin{align*}&\left\|\left(I-Q_{j_1}^{(1)}\otimes Q_{j_2}^{(2)}\right)f\right\|_{L^2(\varOmega_1\times\varOmega_2)}\\ &\qquad\lesssim 2^{-j_1\min\{p_1,r_1\}}\|\,f\|_{H_{mix}^{\min\{p_1,r_1\},0}(\varOmega_1\times\varOmega_2)} + 2^{-{j_2}\min\{p_2,r_2\}}\|\,f\|_{H_{mix}^{0,\min\{p_2,r_2\}}(\varOmega_1\times\varOmega_2)}\\ &\qquad\lesssim (2^{-j_1\min\{p_1,r_1\}}+2^{-j_2\min\{p_2,r_2\}}) \|\,f\|_{H_{iso}^{\min\{p_1,r_1\},\min\{p_2,r_2\}}(\varOmega_1\times\varOmega_2)}.\end{align*}$$ The optimum choice is to equilibrate the errors, since the approximation errors are additive while the costs are multiplicative. This means that $$\begin{equation}j_1\min\{p_1,r_1\}\sim j_2\min\{p_2,r_2\},\end{equation}$$ (2.7) which implies, in view of |$\dim \big(V_{j_1}^{(1)}\big)\sim 2^{j_1n_1}$| and |$\dim \big(V_{j_2}^{(2)}\big)\sim 2^{j_2n_2}$|⁠, the cost complexity $$\begin{equation}\operatorname{dof}_{iso}(\varepsilon) = \varepsilon^{-\frac{n_1}{\min\{p_1,r_1\}}} \varepsilon^{-\frac{n_2}{\min\{p_2,r_2\}}}\end{equation}$$ (2.8) to achieve a desired approximation error |$\varepsilon$|⁠. If |$p_1\leqslant r_1$| and |$p_2\leqslant r_2$|⁠, this is known to be Kolmogorov’s |$n$|-width for Sobolev balls in the space |$H_{iso}^{p_1,p_2}(\varOmega _1 \times \varOmega _2)$|⁠, see the study by Kolmogorov (1936). 
Hence, the cost complexity (2.8) is sharp in this case, which means there is asymptotically no better representation possible. Nonetheless, if |$p_1>r_1$| or |$p_2>r_2$|, then (2.8) is not sharp anymore and we can approximate better in |$H_{iso}^{p_1,p_2}(\varOmega _1\times \varOmega _2)$| than by just using the full tensor product space. The methods we discuss in this article are the approximative truncated singular value decomposition in Section 3 and the sparse grid in Section 4. The question we address is as follows: given a function |$f\in H_{iso}^{p_1,p_2} (\varOmega _1\times \varOmega _2)$|, where |$p_1,p_2>0$| are arbitrary and where trial spaces with approximation orders |$r_1$| and |$r_2$|, respectively, are used in both approaches, which algorithm provides the cheaper approximation?

3. Singular value decomposition

3.1 Definition and mapping properties

We intend to numerically represent functions |$f\in L^2(\varOmega _1\times \varOmega _2)$| on tensor product domains |$\varOmega _1\times \varOmega _2$| in an efficient way. One way to solve this approximation problem is to use an ansatz by means of tensor products, which separates the variables |$\mathbf{x}\in \varOmega _1$| and |$\mathbf{y}\in \varOmega _2$|. We first consider the approximation $$\begin{equation}f(\mathbf{x},\mathbf{y})\approx f_M(\mathbf{x},\mathbf{y}) = \sum_{\ell=1}^M \alpha_\ell\varphi_\ell(\mathbf{x})\psi_\ell(\mathbf{y})\end{equation}$$ (3.1) with certain coefficients |$\alpha _\ell \in \mathbb{R}$| and normalized functions |$\varphi _\ell \in L^2(\varOmega _1)$| and |$\psi _\ell \in L^2(\varOmega _2)$|. Such an approximation is called low-rank approximation.
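In the discrete setting, where |$f$| is sampled on a tensor product grid, a low-rank approximation of type (3.1) corresponds to a low-rank approximation of the sample matrix, and truncating the matrix singular value decomposition after |$M$| terms yields a Frobenius error equal to the square root of the sum of the squared discarded singular values. A minimal numpy sketch with an illustrative choice of |$f$|:

```python
# Discrete counterpart of the low-rank ansatz (3.1): truncate the matrix SVD
# of sampled function values after M terms. The function f below is only an
# illustrative example.
import numpy as np

x = np.linspace(0.0, 1.0, 60)
y = np.linspace(0.0, 1.0, 80)
F = 1.0 / (1.0 + np.add.outer(x, y) ** 2)      # samples f(x_i, y_j)

U, s, Vt = np.linalg.svd(F, full_matrices=False)
M = 5
F_M = U[:, :M] @ np.diag(s[:M]) @ Vt[:M, :]    # rank-M truncation

err = np.linalg.norm(F - F_M)                  # Frobenius norm of the error
tail = np.sqrt(np.sum(s[M:] ** 2))             # discarded singular values
print(err, tail)                               # identical up to rounding
```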
It is well known (see, e.g., Schmidt, 1907 or Loève, 1978) that, with respect to the number |$M$| of terms, the best possible representation of a function |$f\in L^2(\varOmega _1\times \varOmega _2)$| in the |$L^2$|-sense is given by the Karhunen–Loève / singular value decomposition. Then, the coefficients |$\alpha _\ell = \sqrt{\lambda _\ell }$| are given by the square roots of the eigenvalues of the below-defined integral operator (3.2) with kernel (3.3). As shown in Subsection 3.2, the truncation error (in terms of |$M$|) of the series (3.1) is related to the smoothness of the function |$f$| to be approximated. As a byproduct of this estimate, we can infer the decay of the eigenvalues in Subsection 3.3. In Subsection 3.4, we finally consider the numerical treatment of (3.1). Besides determining the coefficients |$\{\alpha _\ell \}_{\ell \in \mathbb{N}}$|, a numerical scheme needs to approximate the functions |$\{\varphi _\ell \}_{\ell \in \mathbb{N}}$| and |$\{\psi _\ell \}_{\ell \in \mathbb{N}}$| in appropriate trial spaces |$V_{j_1}^{(1)}$| and |$V_{j_2}^{(2)}$|, respectively, up to an accuracy corresponding to that of (3.1). Recall that the trial spaces that we consider are elements of the multiscale analyses (2.5) that have the approximation orders |$r_1$| and |$r_2$|, respectively.
To derive the singular value decomposition, we shall consider the integral operator $$\mathscr{S}:L^2(\varOmega_1)\to L^2(\varOmega_2), \quad u\mapsto (\mathscr{S}u)(\mathbf{y}) := \int_{\varOmega_1} f(\mathbf{x},\mathbf{y})u(\mathbf{x})\operatorname{d}\!\mathbf{x}.$$ Its adjoint is $$\mathscr{S}^\star:L^2(\varOmega_2)\to L^2(\varOmega_1), \quad u\mapsto (\mathscr{S}^\star u)(\mathbf{x}) := \int_{\varOmega_2} f(\mathbf{x},\mathbf{y})u(\mathbf{y})\operatorname{d}\!\mathbf{y}.$$ To obtain the low-rank representation (3.1), we need to compute the eigenvalues of the integral operator $$\begin{equation}\mathscr{K}=\mathscr{S}^\star\mathscr{S}:L^2(\varOmega_1)\to L^2(\varOmega_1),\quad u\mapsto (\mathscr{K}u)(\mathbf{x}) := \int_{\varOmega_1} k(\mathbf{x},\mathbf{x^{\prime}})u(\mathbf{x^{\prime}})\operatorname{d}\!\mathbf{x^{\prime}}\end{equation}$$ (3.2) whose kernel function is given by $$\begin{equation}k(\mathbf{x},\mathbf{x^{\prime}}) = \int_{\varOmega_2} f(\mathbf{x},\mathbf{y}) f(\mathbf{x^{\prime}},\mathbf{y}) \operatorname{d}\!\mathbf{y} \in L^2(\varOmega_1\times\varOmega_1).\end{equation}$$ (3.3) This is a Hilbert–Schmidt kernel. Thus, the associated integral operator |$\mathscr{K}$| is compact. Moreover, since |$\mathscr{K}$| is self-adjoint, there exists a decomposition into eigenpairs |$(\lambda _\ell , \varphi _\ell )$|⁠, i.e., $$\mathscr{K}\varphi_\ell = \lambda_\ell \varphi_\ell\quad\textrm{for all}\ \ell\in\mathbb{N},$$ with non-negative eigenvalues |$\lambda _1\geqslant \lambda _2 \geqslant \cdots \geqslant \lambda _m\to 0$| and eigenfunctions |$\{\varphi _\ell \}_{\ell \in \mathbb{N}}$|⁠, which constitute an orthonormal basis in |$L^2(\varOmega _1)$|⁠. 
We now define for all |$\ell \in \mathbb{N}$| with |$\lambda _\ell> 0$| the function |$\psi _\ell \in L^2(\varOmega _2)$| by $$\begin{equation}\psi_\ell(\mathbf{y}) = \frac{1}{\sqrt{\lambda_\ell}} (\mathscr{S}\varphi_\ell)(\mathbf{y}) = \frac{1}{\sqrt{\lambda_\ell}} \int_{\varOmega_1} f(\mathbf{x},\mathbf{y})\varphi_\ell(\mathbf{x})\operatorname{d}\!\mathbf{x}.\end{equation}$$ (3.4) This constitutes a second sequence of orthonormal functions since $$\begin{align*}(\psi_k,\psi_\ell)_{L^2(\varOmega_2)} &=\frac{1}{\sqrt{\lambda_k\lambda_\ell}} (\mathscr{S}\varphi_k,\mathscr{S}\varphi_\ell)_{L^2(\varOmega_2)} =\frac{1}{\sqrt{\lambda_k\lambda_\ell}} (\mathscr{K}\varphi_k,\varphi_\ell)_{L^2(\varOmega_1)}\\ &=\frac{\lambda_k}{\sqrt{\lambda_k\lambda_\ell}} (\varphi_k,\varphi_\ell)_{L^2(\varOmega_1)} = \delta_{k,\ell}.\end{align*}$$ If |$\lambda _\ell = 0$| for some |$\ell \in \mathbb{N}$|⁠, we can extend this collection of functions properly to obtain an orthonormal basis |$\{\psi _\ell \}_{\ell \in \mathbb{N}}$| of |$L^2(\varOmega _2)$|⁠. Due to $$\begin{equation}\sqrt{\lambda_\ell}\varphi_\ell(\mathbf{x}) = \frac{1}{\sqrt{\lambda_\ell}} (\mathscr{S}^\star\mathscr{S}\varphi_\ell)(\mathbf{x}) = (\mathscr{S}^\star\psi_\ell)(\mathbf{x}) = \int_{\varOmega_2} f(\mathbf{x},\mathbf{z})\psi_\ell(\mathbf{z})\operatorname{d}\!\mathbf{z}\end{equation}$$ (3.5) for all |$\mathbf{x}\in \varOmega _1$| and |$\ell \in \mathbb{N}$|⁠, we finally obtain the representation $$\begin{equation}f(\mathbf{x},\mathbf{y}) = \sum_{\ell=1}^\infty \sqrt{\lambda_\ell}\varphi_\ell(\mathbf{x})\psi_\ell(\mathbf{y}).\end{equation}$$ (3.6) With (3.4) and (3.5), this equation is easily verified by testing with the orthonormal basis |$\{\varphi _k\otimes \psi _\ell \}_{k,\ell \in \mathbb{N}}$| of |$L^2(\varOmega _1\times \varOmega _2)$|⁠. 
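The construction (3.4)–(3.6) has a direct discrete analogue, sketched below for an illustrative sampled kernel (quadrature weights are dropped for simplicity): the matrix |$K=FF^{\top}$| plays the role of |$\mathscr{K}=\mathscr{S}^\star\mathscr{S}$|, its eigenpairs yield the |$\varphi_\ell$|, the |$\psi_\ell$| follow from (3.4), and the truncated expansion reproduces the best rank-|$M$| approximation of |$F$|:

```python
# Discrete sketch of (3.4)-(3.6): eigenpairs of K = F F^T yield phi_l, the
# functions psi_l follow from (3.4), and sum_l sqrt(lambda_l) phi_l psi_l^T
# reproduces the best rank-M approximation of F. Illustrative data only.
import numpy as np

x = np.linspace(0.0, 1.0, 50)
y = np.linspace(0.0, 1.0, 70)
F = np.exp(-np.abs(np.subtract.outer(x, y)))   # samples f(x_i, y_j)

K = F @ F.T                                    # discrete analogue of (3.2)
lam, phi = np.linalg.eigh(K)                   # ascending eigenvalues
lam, phi = lam[::-1], phi[:, ::-1]             # largest eigenvalues first

M = 8
psi = F.T @ phi[:, :M] / np.sqrt(lam[:M])      # discrete analogue of (3.4)

gram = psi.T @ psi                             # should be the identity
F_M = (phi[:, :M] * np.sqrt(lam[:M])) @ psi.T  # truncated expansion (3.6)

err = np.linalg.norm(F - F_M)
tail = np.sqrt(np.sum(np.clip(lam[M:], 0.0, None)))
print(np.abs(gram - np.eye(M)).max(), abs(err - tail))
```

The second sequence comes out orthonormal, exactly as verified for |$(\psi_k,\psi_\ell)_{L^2(\varOmega_2)}$| above, and the truncation error again equals the square root of the discarded eigenvalues.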
Remark 3.1 The adjoint kernel |$\widetilde{k}(\cdot ,\cdot )$| is just obtained by interchanging |$\varOmega _1$| and |$\varOmega _2$|, i.e., $$\widetilde{k}(\mathbf{y},\mathbf{y^{\prime}}) = \int_{\varOmega_1} f(\mathbf{x},\mathbf{y}) f(\mathbf{x},\mathbf{y^{\prime}}) \operatorname{d}\!\mathbf{x} \in L^2(\varOmega_2\times\varOmega_2).$$ Then, one has the integral operator $$\widetilde{\mathscr{K}}=\mathscr{S}\mathscr{S}^\star: L^2(\varOmega_2)\to L^2(\varOmega_2),\quad u\mapsto (\widetilde{\mathscr{K}}u)(\mathbf{y}) := \int_{\varOmega_2} \widetilde{k}(\mathbf{y},\mathbf{y^{\prime}})u(\mathbf{y^{\prime}})\operatorname{d}\!\mathbf{y^{\prime}}.$$ Again there exists a decomposition into eigenpairs $$\widetilde{\mathscr{K}} \widetilde\varphi_\ell = \widetilde\lambda_\ell \widetilde\varphi_\ell, \quad\ell\in\mathbb{N},$$ with non-negative eigenvalues |$\widetilde \lambda _1\geqslant \widetilde \lambda _2\geqslant \cdots \geqslant \widetilde \lambda _m\to 0$| and eigenfunctions |$\widetilde \varphi _\ell \in L^2(\varOmega _2)$|. We also obtain a second sequence of orthonormal functions |$\widetilde \psi _\ell \in L^2(\varOmega _1)$| analogously to (3.4). The functions |$\{\widetilde \varphi _\ell \}_{\ell \in \mathbb{N}}$| and |$\{\widetilde \psi _\ell \}_{\ell \in \mathbb{N}}$| will be the same as before, but now their roles are exchanged. Moreover, the eigenvalues |$\lambda _\ell$| and |$\widetilde \lambda _\ell$| of |$\mathscr{K}$| and |$\widetilde{\mathscr{K}}$| coincide.

3.2 Truncation error

We shall now give improved estimates on the decay rate of the eigenvalues of the integral operator |$\mathscr{K}=\mathscr{S}^\star \mathscr{S}$| with kernel (3.3). To this end, assume that |$f\in H_{mix}^{p,0}(\varOmega _1\times \varOmega _2)$|.
We introduce new finite element spaces |$U_M\subset L^2(\varOmega _1)$|, which consist of |$M$| discontinuous, piecewise polynomial functions of total degree |$\lceil p\rceil$| on a quasi-uniform triangulation of |$\varOmega _1$| with mesh width |$h_M\sim M^{-1/n_1}$|. Then, due to the Bramble–Hilbert lemma (see, e.g., Braess, 2001 or Brenner & Scott, 2008), given a function |$w\in H^p(\varOmega_1)$|, the |$L^2$|-orthogonal projection |$P_M:L^2(\varOmega _1)\to U_M$| satisfies $$\begin{equation}\|(I-P_M)w \|_{L^2(\varOmega_1)} \leqslant c_p M^{-p/n_1}\|w\|_{H^p(\varOmega_1)}\end{equation}$$ (3.7) uniformly in |$M$|. For the approximation of |$f(\mathbf{x},\mathbf{y})$| in the first variable, i.e., $$f_M(\mathbf{x},\mathbf{y}) := ((P_M\otimes I)f)(\mathbf{x},\mathbf{y}),$$ we obtain the following approximation result in |$U_M$|, see also the study by Harbrecht et al. (2015). Theorem 3.2 Let |$\lambda_1\geqslant\lambda_2\geqslant\cdots\geqslant 0$| be the eigenvalues of the operator |$\mathscr{K}=\mathscr{S}^\star\mathscr{S}$| and |$\lambda^{M}_{1}\geqslant \lambda^{M}_{2}\geqslant \cdots\geqslant \lambda^{M}_{M}\geqslant 0$| those of |$\mathscr{K}_{M} := P_M \mathscr{K}P_M$|. Then, it holds $$\|\,f-f_M\|_{L^2(\varOmega_1\times\varOmega_2)}^2 = \operatorname{trace}\mathscr{K}-\operatorname{trace}\mathscr{K}_M$$ and therefore $$\begin{equation}\|\,f-f_M\|_{L^2(\varOmega_1\times\varOmega_2)}^2 = \sum_{\ell=1}^{M}\left(\lambda_\ell-\lambda_{\ell}^M\right)+\sum_{\ell=M+1}^\infty\lambda_\ell.\end{equation}$$ (3.8) Proof. Let |$\{\theta_k\}_{k\in\mathbb{N}}$| be an orthonormal basis of |$L^2(\varOmega_1)$| such that either |$\theta_k \in \operatorname{img} P_M$| or |$\theta_k \in \operatorname{img}(I - P_M)$| holds. This implies |$(\mathscr{S}(I-P_{M})\theta_{k}, \mathscr{S}P_{M}\theta_{k})_{L^{2}(\varOmega_{2})} = 0$| for all |$k\in \mathbb{N}$|.
We thus arrive at $$\|\,f-f_M\|_{L^2(\varOmega_1\times\varOmega_2)}^2 = \int_{\varOmega_2} \sum_{\ell=1}^\infty \left((\,f-f_M)(\cdot,\mathbf{y}), \theta_\ell\right)_{L^2(\varOmega_1)}^2\operatorname{d}\!\mathbf{y}.$$ Due to the fact that |$I-P_M$| is an |$L^2$|-orthogonal projection, we have $$\left((\,f-f_M)(\cdot,\mathbf{y}),\theta_\ell\right)_{L^2(\varOmega_1)} = \left(\,f(\cdot,\mathbf{y}),(I-P_M)\theta_\ell\right)_{L^2(\varOmega_1)}\!.$$ Hence, it holds that $$\|\,f-f_M\|_{L^2(\varOmega_1\times\varOmega_2)}^2 = \sum_{\ell=1}^\infty \int_{\varOmega_2} \left(\,f(\cdot,\mathbf{y}),(I-P_M)\theta_\ell\right)_{L^2(\varOmega_1)}^2\operatorname{d}\!\mathbf{y}.$$ Inserting next the definition of |$\mathscr{S}$|, we obtain $$\|\,f-f_M\|_{L^2(\varOmega_1\times\varOmega_2)}^2 = \sum_{\ell=1}^\infty \|\mathscr{S}(I-P_M)\theta_\ell\|_{L^2(\varOmega_2)}^2 = \sum_{\ell=1}^\infty \|\mathscr{S}\theta_\ell\|_{L^2(\varOmega_2)}^2 - \sum_{\ell=1}^\infty \|\mathscr{S}P_M\theta_\ell\|_{L^2(\varOmega_2)}^2.$$ Finally, by using $$\operatorname{trace}\mathscr{K}=\sum_{\ell=1}^\infty(\mathscr{K}\theta_\ell,\theta_\ell)_{L^2(\varOmega_1)} = \sum_{\ell=1}^\infty\|\mathscr{S}\theta_\ell\|_{L^2(\varOmega_2)}^2$$ and $$\operatorname{trace}\mathscr{K}_M = \sum_{\ell=1}^\infty(\mathscr{K}_M\theta_\ell,\theta_\ell)_{L^2(\varOmega_1)} = \sum_{\ell=1}^\infty\|\mathscr{S}P_M\theta_\ell\|_{L^2(\varOmega_2)}^2,$$ we conclude the assertion. By combining this theorem with the approximation estimate (3.7), we can obviously bound the trace error by $$\begin{equation}0\leqslant\operatorname{trace}\mathscr{K}-\operatorname{trace} \mathscr{K}_M \lesssim M^{-\frac{2p}{n_1}}\|\,f\|_{H_{mix}^{p,0}(\varOmega_1\times\varOmega_2)}^2.\end{equation}$$ (3.9) This estimate now allows us to prove the following result on the truncation of the singular value decomposition after |$M$| terms. Theorem 3.3 Let |$f\in H_{mix}^{p,0}(\varOmega _1\times \varOmega _2)$|.
Then, it holds $$\begin{equation}\left\|\,f-\sum_{\ell=1}^M\sqrt{\lambda_\ell} (\varphi_\ell\otimes\psi_\ell)\right\|_{L^2(\varOmega_1\times\varOmega_2)} \lesssim M^{-\frac{p}{n_1}}\|\,f\|_{H_{mix}^{p,0}(\varOmega_1\times\varOmega_2)}.\end{equation}$$ (3.10) Proof. Due to the orthonormality of the sequences |$\{\varphi _\ell \}$| and |$\{\psi _\ell \}$| in |$L^2(\varOmega _1)$| and |$L^2(\varOmega _2)$|, respectively, the error when truncating the singular value decomposition after |$M$| terms is given by $$\begin{align*}\left\|\,f-\sum_{\ell=1}^M\sqrt{\lambda_\ell} (\varphi_\ell\otimes\psi_\ell)\right\|_{L^2(\varOmega_1\times\varOmega_2)}^2 = \left\|\sum_{\ell=M+1}^\infty\sqrt{\lambda_\ell} (\varphi_\ell\otimes\psi_\ell)\right\|_{L^2(\varOmega_1\times\varOmega_2)}^2 = \sum_{\ell=M+1}^\infty \lambda_\ell.\end{align*}$$ In view of Theorem 3.2 and (3.9), since |$\lambda _\ell \geqslant \lambda _{\ell }^M$| for all |$\ell \in \{1,\ldots ,M\}$| (see, e.g., Babuška & Osborn, 1991), we immediately arrive at the following estimate: $$\begin{equation}\sum_{\ell=M+1}^\infty\lambda_\ell\lesssim M^{-\frac{2p}{n_1}}\|\,f\|_{H_{mix}^{p,0}(\varOmega_1\times\varOmega_2)}^2.\end{equation}$$ (3.11) According to Theorem 3.3, we only need the smoothness of |$f$| in the first coordinate to derive estimate (3.10). Since the eigenvalues of the integral operator |$\mathscr{K}$| and its adjoint |$\widetilde{\mathscr{K}}$| are the same, we can also exploit any smoothness of |$f$| in the second coordinate, if provided, by interchanging the roles of |$\varOmega _1$| and |$\varOmega _2$| in the above proof. We thus obtain the following corollary.
Corollary 3.4 For |$f\in H_{iso}^{p_1,p_2}(\varOmega _1\times \varOmega _2)$|, it holds $$\begin{equation}\left\|\,f-\sum_{\ell=1}^M\sqrt{\lambda_\ell} (\varphi_\ell\otimes\psi_\ell)\right\|_{L^2(\varOmega_1\times\varOmega_2)} \lesssim M^{-\max\left\{\frac{p_1}{n_1},\frac{p_2}{n_2}\right\}} \|\,f\|_{H_{iso}^{p_1,p_2}(\varOmega_1\times\varOmega_2)}.\end{equation}$$ (3.12) Altogether, in order to ensure the bound $$\begin{equation}\left\|\,f-\sum_{\ell=1}^M\sqrt{\lambda_\ell} (\varphi_\ell\otimes\psi_\ell)\right\|_{L^2(\varOmega_1\times\varOmega_2)} \lesssim\varepsilon\end{equation}$$ (3.13) on the truncation error of the singular value decomposition, we need, as a consequence of Theorem 3.3 and Corollary 3.4, to choose the expansion degree |$M$| in accordance with $$\begin{equation}M\sim\varepsilon^{-\min\left\{\frac{n_1}{p_1},\frac{n_2}{p_2}\right\}}\end{equation}$$ (3.14) if |$f\in H_{iso}^{p_1,p_2}(\varOmega _1\times \varOmega _2)$|. Note that in the situation |$p=p_1=p_2$|, which means |$f\in H^p(\varOmega _1\times \varOmega _2)$|, the truncation rank is simply given by |$M\sim \varepsilon ^{-\frac{\min \{n_1,n_2\}}{p}}$|. 3.3 Decay of the eigenvalues Having Corollary 3.4 at hand, we can give a bound on the decay of the singular values. Note that this estimate improves the bound $$\lambda_\ell \lesssim \ell^{-2\max\left\{\frac{p_1}{n_1},\frac{p_2}{n_2}\right\}} \quad\textrm{as}\,\,\, \ell\to\infty$$ found in the study by Griebel & Harbrecht (2014), resulting from the application of the min–max principle of Courant–Fischer, by an additive factor |$1$| in the exponent. In particular, the new bound below is now sharp (see, for instance, the specific examples in the study by Griebel & Harbrecht, 2014). Moreover, it coincides with the bound that was derived in the studies by Dölz et al. (2017) and Griebel & Li (2017) by different techniques.
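On the discrete level, the identity used in this proof — the squared |$L^2$|-truncation error equals the tail sum of the eigenvalues |$\lambda_\ell=\sigma_\ell^2$| — can be checked directly with a matrix singular value decomposition. The following numpy sketch is purely illustrative (the random matrix `F` stands in for a bivariate function sampled on a tensor product grid; sizes and rank are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
F = rng.standard_normal((40, 30))   # stand-in for f sampled on a tensor product grid

U, s, Vt = np.linalg.svd(F, full_matrices=False)
M = 10

# rank-M truncation: keep the M dominant singular triplets
F_M = U[:, :M] * s[:M] @ Vt[:M, :]

# squared Frobenius (discrete L^2) error equals the tail sum of
# the eigenvalues lambda_l = s_l^2 of the associated Gram matrix
err2 = np.linalg.norm(F - F_M, "fro") ** 2
tail = np.sum(s[M:] ** 2)
assert np.isclose(err2, tail)
```

By the Eckart–Young theorem, this truncation is moreover the best rank-|$M$| approximation, which mirrors the optimality of the truncated expansion above.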
Proposition 3.5 Consider |$f\in H_{iso}^{p_1,p_2}(\varOmega _1\times \varOmega _2)$| with associated kernel |$k$| from (3.3) and associated integral operator |$\mathscr{K}$| from (3.2). Then, the eigenvalues |$\{\lambda _\ell \}_{\ell \in \mathbb{N}}$| of |$\mathscr{K}$| decay like $$\lambda_\ell \lesssim \ell^{-2\max\left\{\frac{p_1}{n_1},\frac{p_2}{n_2}\right\}-1} \quad\textrm{as}\,\, \ell\to\infty.$$ Proof. Since the sequence |$\{\lambda _\ell \}$| decreases monotonically, it holds on the one hand $$\sum_{m=1}^{2^{k+1}-1} \lambda_m = \lambda_1 + (\lambda_2+\lambda_3) + \dots + (\lambda_{2^k}+\lambda_{2^k+1} + \dots+\lambda_{2^{k+1}-1}) \leqslant \sum_{n=0}^k 2^n \lambda_{2^n},$$ and on the other hand $$\sum_{m=1}^{2^k} \lambda_m = \lambda_1 + \lambda_2+(\lambda_3+\lambda_4) + \dots + (\lambda_{2^{k-1}+1}+\lambda_{2^{k-1}+2} + \dots+\lambda_{2^k}) \geqslant\frac{1}{2}\sum_{n=0}^k 2^n \lambda_{2^n}.$$ This is the well-known Cauchy condensation test. Applied to the tails of the series, the same blocking argument implies $$\sum_{m=2^k}^{\infty} \lambda_m\sim\sum_{n=k}^\infty 2^n \lambda_{2^n}.$$ For arbitrary |$k\in \mathbb{N}$|, in view of (3.11) and its counterpart with respect to the second variable, we conclude with |$\beta :=2\max \big \{\frac{p_1}{n_1},\frac{p_2}{n_2}\big \}$| that $$\sum_{n=k}^\infty 2^n\lambda_{2^n} \sim\sum_{m=2^k}^{\infty}\lambda_m \lesssim 2^{-\beta k}.$$ Keeping only the first term of the sum on the left-hand side, this leads to $$2^{(\beta+1)k}\lambda_{2^k} = 2^{\beta k}\,2^k\lambda_{2^k} \lesssim 2^{\beta k}\sum_{m=2^k}^\infty \lambda_m\lesssim 1$$ uniformly in |$k\in \mathbb{N}$|. Therefore, |$2^{(\beta +1)n} \lambda _{2^n}$| is uniformly bounded, which immediately implies that |$\ell ^{\beta +1}\lambda _\ell$| is also uniformly bounded, since $$\ell^{\beta+1}\lambda_\ell \leqslant 2^{(\beta+1)(n+1)}\lambda_{2^{n}} \lesssim 2^{(\beta+1)n}\lambda_{2^n} \ \textrm{for all}\ 2^n\leqslant\ell < 2^{n+1}.$$ 3.4 Numerical approximation In Corollary 3.4, we used an exact description of the eigenfunctions. In practice, however, such a description is not available.
Instead, the eigenvalues |$\{\lambda _\ell \}_{\ell =1}^M$| and eigenfunctions |$\{\varphi _\ell \}_{\ell =1}^M$| and |$\{\psi _\ell \}_{\ell =1}^M$| need to be computed approximately in the finite element spaces that have been introduced in Subsection 2.1. For functions |$f\in H_{iso}^{p_1,p_2}(\varOmega _1\times \varOmega _2)$| with |$p_1\leqslant r_1$| and |$p_2\leqslant r_2$|, we already know that the full tensor product space yields the best possible approximation. Indeed, the singular values then do not decay fast enough to benefit from additional compression. Hence, we shall assume |$p_1>r_1$| or |$p_2>r_2$| in the subsequent discussion. In contrast to the study by Griebel & Harbrecht (2014), we do not aim at a direct approximation of the eigenfunctions, since then their oscillations would have to be taken into account, which does not yield optimal estimates. Therefore, in the following, we first consider the projection of |$f$| onto a suitable full tensor product ansatz space and afterwards perform a projection onto the |$M$| dominant eigenpairs of the projected function. When choosing the levels of refinement |$j_1$| and |$j_2$| for the projections |$Q_{j_1}^{(1)}$| and |$Q_{j_2}^{(2)}$| in (2.6), we obtain |$N_\varphi =\dim \big(V_{j_1}^{(1)}\big) \sim 2^{j_1n_1}$| and |$N_\psi =\dim \big(V_{j_2}^{(2)}\big)\sim 2^{j_2n_2}$| degrees of freedom, respectively.
Then, for |$\mathscr{K}_{N_\varphi }:= Q_{j_1}^{(1)}\mathscr{K}Q_{j_1}^{(1)}$| with eigenvalues |$\lambda _{1,N_\varphi }\geqslant \lambda _{2,N_\varphi } \geqslant \cdots \geqslant \lambda _{N_\varphi ,N_\varphi }\geqslant 0$|, it holds in complete analogy to Theorem 3.2 $$\begin{equation}\begin{aligned} \left\|\left(I-Q_{j_1}^{(1)}\otimes I\right)f\right\|_{L^2(\varOmega_1\times\varOmega_2)}^2 &= \operatorname{trace}\mathscr{K}-\operatorname{trace}\mathscr{K}_{N_\varphi}\\ &= \sum_{\ell=1}^{N_\varphi}(\lambda_\ell-\lambda_{\ell,N_\varphi}) + \sum_{\ell=N_\varphi+1}^\infty\lambda_\ell\\ &\lesssim 2^{-2j_1\min\{p_1,r_1\}}\|\,f\|_{H_{mix}^{\min\{p_1,r_1\},0}(\varOmega_1\times\varOmega_2)}^2.\end{aligned}\end{equation}$$ (3.15) We emphasize again that |$\lambda _\ell \geqslant \lambda _{\ell ,N_\varphi }$| holds for all |$\ell \in \{1,2,\ldots ,N_\varphi \}$|. For |$M\leqslant N_\varphi$|, let |$P_M^{(1)}$| denote the projection onto the |$M$| dominant eigenpairs |$(\lambda _{1,N_\varphi },\varphi _{1,N_\varphi }), \ldots ,(\lambda _{M,N_\varphi }, \varphi _{M,N_\varphi })$| of |$\mathscr{K}_{N_\varphi }$|, i.e., $$P_M^{(1)}:L^2(\varOmega_1)\to L^2(\varOmega_1),\quad g\mapsto P_M^{(1)} g = \sum_{k=1}^M (g,\varphi_{k,N_\varphi})_{L^2(\varOmega_1)}\varphi_{k,N_\varphi}.$$ Then, we have $$\begin{equation}\begin{aligned} &\left\|\left(I-P_M^{(1)}\otimes I\right)f\right\|_{L^2(\varOmega_1\times\varOmega_2)}^2\\ &\qquad= \operatorname{trace}\mathscr{K}-\operatorname{trace} P_M^{(1)}\mathscr{K}_{N_\varphi}P_M^{(1)}\\ &\qquad= \sum_{\ell=1}^M(\lambda_\ell-\lambda_{\ell,N_\varphi}) +\sum_{\ell=M+1}^\infty\lambda_\ell\\ &\qquad\lesssim 2^{-2j_1\min\{p_1,r_1\}}\|\,f\|_{H_{mix}^{\min\{p_1,r_1\},0}(\varOmega_1\times\varOmega_2)}^2 +M^{-2\max\left\{\frac{p_1}{n_1},\frac{p_2}{n_2}\right\}}\|\,f\|_{H_{iso}^{p_1,p_2}(\varOmega_1\times\varOmega_2)}^2,\end{aligned}\end{equation}$$ (3.16) where we used (3.11), together with its counterpart with respect to the second variable, and
$$0\leqslant\sum_{\ell=1}^M\left(\lambda_\ell-\lambda_{\ell,N_\varphi}\right) \leqslant\sum_{\ell=1}^{N_\varphi}\left(\lambda_\ell-\lambda_{\ell,N_\varphi}\right) \lesssim 2^{-2j_1\min\{r_1,p_1\}}\|\,f\|_{H_{mix}^{\min\{r_1,p_1\},0}(\varOmega_1\times\varOmega_2)}^2$$ in accordance with (3.15). We proceed in complete analogy in the second variable. For the projection $$P_M^{(2)}:L^2(\varOmega_2)\to L^2(\varOmega_2),\quad g\mapsto P_M^{(2)} g = \sum_{k=1}^M (g,\psi_{k,N_\psi})_{L^2(\varOmega_2)}\psi_{k,N_\psi}$$ onto the |$M\leqslant N_\psi$| dominant eigenpairs |$(\widetilde{\lambda }_{1,N_\psi },\psi _{1,N_\psi }), \ldots ,(\widetilde{\lambda }_{M,N_\psi },\psi _{M,N_\psi })$| of |$\widetilde{\mathscr{K}}_{N_\psi } := Q_{j_2}^{(2)}\widetilde{\mathscr{K}}Q_{j_2}^{(2)}$|, we obtain $$\begin{equation}\begin{aligned} &\left\|\left(I-I\otimes P_M^{(2)}\right)f\right\|_{L^2(\varOmega_1\times\varOmega_2)}^2\\ &\qquad\lesssim 2^{-2j_2\min\{p_2,r_2\}}\|\,f\|_{H_{mix}^{0,\min\{p_2,r_2\}}(\varOmega_1\times\varOmega_2)}^2 +M^{-2\max\left\{\frac{p_1}{n_1},\frac{p_2}{n_2}\right\}}\|\,f\|_{H_{iso}^{p_1,p_2}(\varOmega_1\times\varOmega_2)}^2.\end{aligned}\end{equation}$$ (3.17) We now choose the levels |$j_1$| and |$j_2$| such that (2.7) holds.
Then, by combining (3.16) and (3.17), we arrive for any |$M\leqslant \min \{N_\varphi , N_\psi \}$| at $$\begin{aligned} &\left\|\left(I-P_M^{(1)}\otimes P_M^{(2)}\right)f\right\|_{L^2(\varOmega_1\times\varOmega_2)}^2\\ &\qquad\lesssim 2^{-2j_1\min\{p_1,r_1\}}\|\,f\|_{H_{iso}^{\min\{p_1,\,r_1\},\min\{p_2,\,r_2\}}(\varOmega_1\times\varOmega_2)}^2 +M^{-2\max\left\{\frac{p_1}{n_1},\frac{p_2}{n_2}\right\}}\|\,f\|_{H_{iso}^{p_1,p_2}(\varOmega_1\times\varOmega_2)}^2.\end{aligned}$$ Note that the expression |$\big(P_M^{(1)}\otimes P_M^{(2)}\big)f$| corresponds to the desired series expansion $$\left(P_M^{(1)}\otimes P_M^{(2)}\right)f = \sum_{\ell=1}^M \alpha_\ell \left(\varphi_{\ell,N_\varphi}\otimes\psi_{\ell,N_\psi}\right)\!,$$ where the coefficients |$\{\alpha _\ell \}$| are given by $$\alpha_\ell := (\,f,\varphi_{\ell,N_\varphi}\otimes\psi_{\ell,N_\psi} )_{L^2(\varOmega_1\times\varOmega_2)}\quad\textrm{for all}\ \ell\in\{1,2,\ldots,M\}.$$ The approximation error is already fixed by the projection onto the full tensor product space. Hence, it remains to equilibrate the truncation error, which is induced by |$M$|, and the projection error, which is related to the degrees of freedom |$N_\varphi$| and |$N_\psi$|, respectively. This implies the choice $$M\sim\varepsilon^{-\min\left\{\frac{n_1}{p_1},\frac{n_2}{p_2}\right\}},\quad N_\varphi\sim\varepsilon^{-\frac{n_1}{\min\{p_1,r_1\}}},\quad N_\psi\sim\varepsilon^{-\frac{n_2}{\min\{p_2,r_2\}}}.$$ In particular, the assumption |$p_1>r_1$| or |$p_2>r_2$| implies both |$M\leqslant N_\varphi$| and |$M\leqslant N_\psi$|.
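The mechanism behind (3.15) and (3.16) — projecting the operator can only decrease each eigenvalue, and the trace deficit equals the squared projection error — has an exact matrix analogue. In the following sketch (random matrices serve as illustrative stand-ins and are not the paper's construction), `S` plays the role of the discretized operator |$\mathscr{S}$| and `Q` that of the projection |$Q_{j_1}^{(1)}$|:

```python
import numpy as np

rng = np.random.default_rng(1)
S = rng.standard_normal((50, 50))          # stand-in for the discretized operator S
K = S.T @ S                                # K = S*S, symmetric positive semidefinite

# orthogonal projector Q onto a 20-dimensional "coarse" subspace
Qmat, _ = np.linalg.qr(rng.standard_normal((50, 20)))
Q = Qmat @ Qmat.T

K_N = Q @ K @ Q                            # projected operator Q K Q

lam = np.sort(np.linalg.eigvalsh(K))[::-1]
lam_N = np.sort(np.linalg.eigvalsh(K_N))[::-1]

# eigenvalue domination: lambda_l >= lambda_{l,N} for every l
assert np.all(lam - lam_N >= -1e-9)

# trace identity: trace K - trace K_N = ||S(I - Q)||_F^2
gap = np.trace(K) - np.trace(K_N)
assert np.isclose(gap, np.linalg.norm(S @ (np.eye(50) - Q), "fro") ** 2)
```

The trace identity is exact here because |$Q$| is an orthogonal projector, so |$\operatorname{trace}(QKQ)=\operatorname{trace}(KQ)$| and the deficit collapses to |$\operatorname{trace}\big((I-Q)K(I-Q)\big)$|.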
Thus, in view of the cost |$M\cdot \max \{N_\varphi ,N_\psi \}$|, we obtain the following result: Theorem 3.6 The number of degrees of freedom, which is needed to approximate a function |$f \in H_{iso}^{p_1,p_2}(\varOmega _1\times \varOmega _2)$| by the singular value decomposition approach (3.1) to a prescribed accuracy |$\varepsilon$|, is $$\begin{equation}\operatorname{dof}_{svd}(\varepsilon)\sim\varepsilon^{-\min\left\{\frac{n_1}{p_1},\frac{n_2}{p_2}\right\}} \varepsilon^{-\max\left\{\frac{n_1}{\min\{p_1,r_1\}},\frac{n_2}{\min\{p_2,r_2\}}\right\}}.\end{equation}$$ (3.18) Remark 3.7 Note that if |$p_1\leqslant r_1$| and |$p_2\leqslant r_2$|, the cost complexity (3.18) also covers the situation when the full tensor product space realizes Kolmogorov's |$n$|-width by means of |$M=\min \{N_\varphi ,N_\psi \}$|. We may hence drop the restriction |$p_1>r_1$| or |$p_2>r_2$|, which has been made at the beginning of this subsection. Moreover, the cost complexity (3.18) does not improve even if |$f$| possesses mixed Sobolev regularity in the sense of |$f \in H_{mix}^{p_1,p_2}(\varOmega _1\times \varOmega _2)$|. We emphasize that the estimate (3.18) includes neither the work to be spent on computing the singular values nor that on the eigenfunctions. Here, a naive approach would result in a cost complexity of order |$M\cdot N^2$|, where |$N:=\max \{N_\varphi ,N_\psi \}$|; the use of fast methods for nonlocal operators would result in an almost linear or even linear complexity per eigenpair. Note that, in any case, at least linear complexity |$\mathscr{O}(M\cdot N)$| is required, which is indeed achieved with modern algorithms, see, e.g., Dahmen et al. (2008), Dai et al. (2008) and the references cited therein. Therefore, our analysis is based on the best possible situation for the approximate truncated singular value decomposition, and our later comparison will be fair in this respect.
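The exponent in (3.18) is simply the sum of the truncation exponent for |$M$| and the larger of the two projection exponents for |$N_\varphi$| and |$N_\psi$|. A small sketch (with illustrative parameter values, chosen only for demonstration) assembles |$\operatorname{dof}_{svd}=M\cdot\max\{N_\varphi,N_\psi\}$| from the three choices above and checks it against (3.18):

```python
import math

# illustrative parameters: dimensions n_i, smoothness p_i, FE orders r_i
n1, n2, p1, p2, r1, r2 = 2, 3, 3.0, 4.0, 2.0, 2.0

def dof_svd(eps):
    M = eps ** -min(n1 / p1, n2 / p2)            # truncation rank
    N_phi = eps ** -(n1 / min(p1, r1))           # dofs on Omega_1
    N_psi = eps ** -(n2 / min(p2, r2))           # dofs on Omega_2
    return M * max(N_phi, N_psi)

# exponent predicted by (3.18)
q_svd = min(n1 / p1, n2 / p2) + max(n1 / min(p1, r1), n2 / min(p2, r2))

for eps in (1e-1, 1e-2, 1e-3):
    assert math.isclose(dof_svd(eps), eps ** -q_svd, rel_tol=1e-9)
```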
Remark 3.8 In uncertainty quantification, the truncated singular value decomposition of the random field under consideration is called the Karhunen–Loève approximation. In particular, one then has higher regularity only in the spatial variable, i.e., one considers functions |$f\in H_{mix}^{0,p}(\varOmega _1\times \varOmega _2)$|, where |$\varOmega _1$| reflects the sample space and |$\varOmega _2$| the physical domain. Of course, the truncation estimate (3.13) remains true with |$p_1=0$| and |$p_2=p$|, but an approximation of the eigenfunctions with respect to the sample space is not possible anymore. Instead, one prescribes a certain distribution of the random variables. This involves a modeling step, see Ghanem & Spanos (1991) or Le Maître & Knio (2010) for details. 4. Sparse grids Based on the multiscale analyses (2.5) on each individual subdomain, one naturally obtains a second method to approximate functions in tensor product spaces: by choosing complementary spaces $$W_j^{(i)} = \operatorname{span}\left\{\xi_{j,k}^{(i)}:k\in\nabla_j^{(i)} := \varDelta_j^{(i)}\setminus\varDelta_{j-1}^{(i)}\right\}, \quad i=1,2,$$ such that $$V_j^{(i)} = W_j^{(i)}\oplus V_{j-1}^{(i)}, \quad V_0^{(i)} = W_0^{(i)},$$ we can define the so-called general sparse grid space, see the study by Griebel & Harbrecht (2013), $$\begin{equation}\widehat{\mathbf{V}}_J^\sigma := \bigoplus_{j_1\sigma+j_2/\sigma\leqslant J} W_{j_1}^{(1)}\otimes W_{j_2}^{(2)},\end{equation}$$ (4.1) where |$\sigma>0$| is a given parameter.
Thus, a function |$\widehat f_J \in \widehat{\mathbf{V}}_J^\sigma$| is represented as $$\begin{equation}\widehat{f}_J(\mathbf{x},\mathbf{y})= \sum_{j_1\sigma+j_2/\sigma\leqslant J}\sum_{k_1\in\nabla_{j_1}^{(1)}} \sum_{k_2\in\nabla_{j_2}^{(2)}} \beta_{(j_1,k_1),(j_2,k_2)} \xi_{j_1,k_1}^{(1)}(\mathbf{x})\xi_{j_2,k_2}^{(2)}(\mathbf{y}).\end{equation}$$ (4.2) Sparse grids can be constructed via hierarchical bases, interpolets and wavelet-like bases (see, e.g., Zenger, 1991; DeVore et al., 1998; Strömberg, 1998; Griebel & Knapek, 2009) or even directly by finite elements in terms of frames (see, e.g., Griebel, 1994; Griebel & Oswald, 1994; Harbrecht et al., 2008). They can also be built from Fourier, Chebyshev, Legendre or similar global polynomial systems, depending on the respective situation. This results in the so-called hyperbolic cross methods. For a survey on sparse grids, we refer the reader to Bungartz & Griebel (2004) and the references therein. The dimension of the general sparse grid space |$\widehat{\mathbf{V}}_J^\sigma$| is essentially equal to the dimension of the finest univariate finite element spaces that enter its construction, i.e., it is essentially equal to the value of |$\max \big\{\dim V_{J/\sigma }^{(1)}, \dim V_{J\sigma }^{(2)}\big\}$|. Nevertheless, by considering smoothness in terms of mixed Sobolev spaces, its approximation power is essentially the same as in the full tensor product space. More precisely, in accordance with the study by Griebel & Harbrecht (2013), we have the following result: Lemma 4.1 The sparse grid space |$\widehat{\mathbf{V}}_J^\sigma$| possesses $$\dim\widehat{\mathbf{V}}_J^\sigma\sim\begin{cases} 2^{J\max\{n_1/\sigma,n_2\sigma\}},&\textrm{if}\,\, n_1/\sigma\not= n_2\sigma,\\ 2^{Jn_2\sigma}J,&\textrm{if}\,\, n_1/\sigma= n_2\sigma,\end{cases}$$ degrees of freedom.
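The dimension count of Lemma 4.1 can be checked by brute force: sum the block sizes |$\dim W_{j_1}^{(1)}\cdot\dim W_{j_2}^{(2)}\sim 2^{j_1n_1}2^{j_2n_2}$| over the index set |$j_1\sigma+j_2/\sigma\leqslant J$| and compare with |$2^{J\max\{n_1/\sigma,n_2\sigma\}}$|. A sketch with an illustrative choice |$n_1=2$|, |$n_2=1$|, |$\sigma=1$| (so that |$n_1/\sigma\neq n_2\sigma$|):

```python
def sparse_grid_dofs(J, n1, n2, sigma):
    """Sum of block dimensions 2^(j1*n1) * 2^(j2*n2) over the
    general sparse grid index set j1*sigma + j2/sigma <= J."""
    total = 0
    jmax = int(J * max(sigma, 1.0 / sigma)) + 1   # generous search range
    for j1 in range(jmax + 1):
        for j2 in range(jmax + 1):
            if j1 * sigma + j2 / sigma <= J:
                total += 2 ** (j1 * n1) * 2 ** (j2 * n2)
    return total

n1, n2, sigma = 2, 1, 1.0
for J in range(4, 10):
    dofs = sparse_grid_dofs(J, n1, n2, sigma)
    predicted = 2 ** (J * max(n1 / sigma, n2 * sigma))
    # growth matches 2^(J*max{n1/sigma, n2*sigma}) up to a moderate constant
    assert 1.0 <= dofs / predicted <= 4.0
```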
Moreover, for a given function |$f\in H_{mix}^{p_1,p_2}(\varOmega _1\times \varOmega _2)$| with |$0< p_1\leqslant r_1$| and |$0< p_2\leqslant r_2$|, the general sparse grid space |$\widehat{\mathbf{V}}_J^\sigma$| realizes the corresponding approximation rate derived in the study by Griebel & Harbrecht (2013). Writing the cost complexities of the two approaches as |$\operatorname{dof}_{svd}(\varepsilon )\sim \varepsilon ^{-q_{svd}}$| and |$\operatorname{dof}_{sg}(\varepsilon )\sim \varepsilon ^{-q_{sg}}$|, it always holds |$q_{svd}\geqslant q_{sg}$|. If |$p_1>r_1$| and |$p_2>r_2$|, then it even holds |$q_{svd}>q_{sg}$|. This means that the approximation by properly balanced sparse grids is superior to the approximation by the singular value decomposition in the function class |$H_{iso}^{p_1,p_2} (\varOmega _1\times \varOmega _2)$|. We emphasize that a change from the isotropic Sobolev space |$H_{iso}^{p_1,p_2}(\varOmega _1 \times \varOmega _2)$| to the anisotropic Sobolev space |$H_{mix}^{p_1,p_2}(\varOmega _1\times \varOmega _2)$| only improves the rate |$q_{sg}$| in accordance with $$q_{sg}=\max\left\{\frac{n_1}{\min\{p_1,r_1\}},\frac{n_2}{\min\{p_2,r_2\}}\right\}$$ while |$q_{svd}$| is kept unchanged. In the situation of low regularity, that is, if |$p_1\leqslant r_1$| and |$p_2\leqslant r_2$|, the approximate truncated singular value decomposition coincides with the approximation in the full tensor product space |$V_{j_1}^{(1)}\otimes V_{j_2}^{(2)}$|, where |$j_1$| and |$j_2$| are related by (2.7). This full tensor product space is known to realize Kolmogorov's |$n$|-width |$\varepsilon ^{-\frac{n_1}{p_1}- \frac{n_2}{p_2}}$| for Sobolev balls in the space |$H_{iso}^{p_1,p_2} (\varOmega _1\times \varOmega _2)$|. The sparse grid approach would essentially also give Kolmogorov's |$n$|-width, but has in general larger constants. In the case of the sparse grid approach, we envision further improvements by the use of local adaptivity, which would increase its performance even more. In the case of the singular value decomposition, the truncation length is determined by the smoothness of the function under consideration and is thus fixed. Therefore, improvements for the truncated singular value decomposition can only be achieved by a more efficient representation of the eigenfunctions. Future work is needed to study the cost complexity in the higher-dimensional setting.
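The comparison of the two cost exponents can be made concrete numerically. The sketch below reads |$q_{svd}$| off (3.18) and uses the mixed-regularity sparse grid rate |$q_{sg}$| quoted above; the parameter values are illustrative only. Since |$q_{svd}$| adds the strictly positive truncation exponent on top of |$q_{sg}$|, the sparse grid exponent is the smaller one in this setting:

```python
def q_svd(n1, n2, p1, p2, r1, r2):
    # exponent of dof_svd(eps) from (3.18)
    return min(n1 / p1, n2 / p2) + max(n1 / min(p1, r1), n2 / min(p2, r2))

def q_sg_mix(n1, n2, p1, p2, r1, r2):
    # sparse grid exponent for mixed Sobolev regularity, as quoted above
    return max(n1 / min(p1, r1), n2 / min(p2, r2))

# illustrative parameters with p_i > r_i in both variables
params = dict(n1=3, n2=2, p1=4.0, p2=3.0, r1=2.0, r2=2.0)
assert q_svd(**params) > q_sg_mix(**params)   # sparse grids need fewer dofs
```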
In particular, the comparison of the recursive application of the singular value decomposition in a hierarchical tensor format and the sparse grid approach would be relevant for practical applications. For example, our results immediately extend the results of the study by Schneider & Uschmajew (2014) on hierarchical tensor formats to nonperiodic functions on general product domains, since we have now been able to prove the same decay rate of the singular values as in the study by Temlyakov (1993) in the case of periodic functions. For further results concerning the cost complexity of high-dimensional tensor approximation, we refer the reader to the studies by Bachmayr & Dahmen (2015, 2016) and the references therein. Acknowledgements This research was done while M.G. held a John von Neumann Visiting Professorship at the Technische Universität München in the summer term 2017. The hospitality of the Fakultät für Mathematik of the Technische Universität München and especially of the host, Prof. Dr Folkmar Bornemann, is gratefully acknowledged. H.H. acknowledges the support by the Swiss National Science Foundation (SNSF) through the project ‘|$\mathscr{H}$|-matrix based first and second moment analysis’ (grant 200021_156101). Footnotes 1 One may also consider |$\varOmega \subset \mathbb{R}^{n+1}$| to be a sufficiently smooth manifold. 2 We refer the reader to the study by Stewart (1993) for a comprehensive historical overview of the singular value decomposition. 3 The present proof relies on an approximation argument. The new finite element spaces |$\{U_M\}$| are introduced to obtain the optimal convergence rate with |$N$| degrees of freedom. References Babuška, I. (1968) Über universal optimale Quadraturformeln. Apl. Mat., 13, 304–338, 388–404. Babuška, I. & Osborn, J. (1991) Eigenvalue problems. Handbook of Numerical Analysis, vol. 2. Amsterdam: North-Holland, pp. 641–784. Bachmayr, M.
& Dahmen, W. (2015) Adaptive near-optimal rank tensor approximation for high-dimensional operator equations. Found. Comput. Math., 15, 839–898. Bachmayr, M. & Dahmen, W. (2016) Adaptive low-rank methods for problems on Sobolev spaces with error control in |$L_2$|. ESAIM Math. Model. Numer. Anal., 50, 1107–1136. Balescu, R. (1997) Statistical Dynamics, Matter out of Equilibrium. London: Imperial College Press. Bebendorf, M. (2011) Adaptive cross approximation of multivariate functions. Constr. Approx., 34, 149–179. Braess, D. (2001) Finite Elements. Theory, Fast Solvers, and Applications in Solid Mechanics. Cambridge: Cambridge University Press. Brenner, S. & Scott, L. (2008) The Mathematical Theory of Finite Element Methods. Berlin: Springer. Bungartz, H. J. & Griebel, M. (2004) Sparse grids. Acta Numer., 13, 147–269. Cioranescu, D., Damlamian, A. & Griso, G. (2008) The periodic unfolding method in homogenization. SIAM J. Math. Anal., 40, 1585–1620. Dahmen, W., Rohwedder, T., Schneider, R. & Zeiser, A. (2008) Adaptive eigenvalue computation: complexity estimates. Numer. Math., 110, 277–312. Dai, X., Xu, J. & Zhou, A. (2008) Convergence and optimal complexity of adaptive finite element eigenvalue computations. Numer. Math., 110, 313–355. Deb, M. K., Babuška, I. & Oden, J. T. (2001) Solution of stochastic partial differential equations using Galerkin finite element techniques. Comput. Methods Appl. Mech. Eng., 190, 6359–6372.
DeVore, R., Konyagin, S. & Temlyakov, V. (1998) Hyperbolic wavelet approximation. Constr. Approx., 14, 1–26. Dölz, J., Harbrecht, H. & Schwab, C. (2017) Covariance regularity and |$\mathscr{H}$|-matrix approximation for rough random fields. Numer. Math., 135, 1045–1071. Ghanem, R. G. & Spanos, P. D. (1991) Stochastic Finite Elements: A Spectral Approach. New York: Springer. Grasedyck, L. (2010) Hierarchical singular value decomposition of tensors. SIAM J. Matrix Anal. Appl., 31, 2029–2054. Griebel, M. (1994) Multilevelmethoden als Iterationsverfahren über Erzeugendensystemen. Teubner Skripten zur Numerik. Stuttgart: B. G. Teubner. Griebel, M. & Harbrecht, H. (2013) On the construction of sparse tensor product spaces. Math. Comput., 82, 975–994. Griebel, M. & Harbrecht, H. (2014) Approximation of bi-variate functions: singular value decomposition versus sparse grids. IMA J. Numer. Anal., 34, 28–54. Griebel, M. & Knapek, S. (2009) Optimized general sparse grid approximation spaces for operator equations. Math. Comput., 78, 2223–2257. Griebel, M. & Li, G. (2017) On the decay rate of the singular values of bivariate functions. SIAM J. Numer. Anal., 56, 974–993. Griebel, M. & Oeltz, D. (2007) A sparse grid space-time discretization scheme for parabolic problems. Computing, 81, 1–34. Griebel, M. & Oswald, P. (1994) On additive Schwarz preconditioners for sparse grid discretizations. Numer. Math., 66, 449–463.
Hackbusch, W. (2012) Tensor Spaces and Numerical Tensor Calculus. Berlin-Heidelberg: Springer. Hackbusch, W. & Kühn, S. (2009) A new scheme for the tensor representation. J. Fourier Anal. Appl., 15, 706–722. Harbrecht, H., Peters, M. & Siebenmorgen, M. (2015) Efficient approximation of random fields for numerical applications. Numer. Linear Algebra Appl., 22, 596–617. Harbrecht, H., Schneider, R. & Schwab, C. (2008) Multilevel frames for sparse tensor product spaces. Numer. Math., 110, 199–220. Hesthaven, J. S., Rozza, G. & Stamm, B. (2016) Certified Reduced Basis Methods for Parametrized Partial Differential Equations. SpringerBriefs in Mathematics. Cham: Springer. Kolmogorov, A. (1936) Über die beste Annäherung von Funktionen einer gegebenen Funktionenklasse. Ann. Math., 37, 107–110. Le Maître, O. P. & Knio, O. M. (2010) Spectral Methods for Uncertainty Quantification. With Applications to Computational Fluid Dynamics. Scientific Computation. Dordrecht-Heidelberg-London-New York: Springer. Loève, M. (1978) Probability Theory, vol. 2 & 3. New York: Springer. Motornyj, V. (1974) On the best quadrature formula of the form |$\sum _{k=1}^n p_k f(x_k)$| for some classes of periodic differentiable functions. Izv. Akad. Nauk SSSR Ser. Mat., 38, 583–614. Novak, E. & Woźniakowski, H. (2008) Tractability of Multivariate Problems: Linear Information. EMS Tracts in Mathematics. Zürich: European Mathematical Society. Novak, E. & Woźniakowski, H. (2010) Tractability of Multivariate Problems: Standard Information for Functionals. EMS Tracts in Mathematics.
Zürich: European Mathematical Society. Novak, E. & Woźniakowski, H. (2012) Tractability of Multivariate Problems: Standard Information for Operators. EMS Tracts in Mathematics. Zürich: European Mathematical Society. Oseledets, I. V. & Tyrtyshnikov, E. E. (2009) Breaking the curse of dimensionality, or how to use SVD in many dimensions. SIAM J. Sci. Comput., 31, 3744–3759. Quarteroni, A., Manzoni, A. & Negri, F. (2016) Reduced Basis Methods for Partial Differential Equations: An Introduction. Cham: Springer. Rozza, G., Huynh, D. B. P. & Patera, A. T. (2008) Reduced basis approximation and a posteriori error estimation for affinely parametrized elliptic coercive partial differential equations: application to transport and continuum mechanics. Arch. Comput. Methods Eng., 15, 229–275. Schmidt, E. (1907) Zur Theorie der linearen und nichtlinearen Integralgleichungen. I. Teil: Entwicklung willkürlicher Funktionen nach Systemen vorgeschriebener. Math. Ann., 63, 433–476. Schneider, R. & Uschmajew, A. (2014) Approximation rates for the hierarchical tensor format in periodic Sobolev spaces. J. Complexity, 30, 56–71. Stewart, G. W. (1993) On the early history of the singular value decomposition. SIAM Rev., 35, 551–566. Strömberg, J. (1998) Computation with wavelets in higher dimensions. Proceedings of the International Congress of Mathematicians (Berlin, 1998), Documenta Mathematica, extra vol. 3, pp. 523–532. Temlyakov, V. N. (1986) Approximation of functions with bounded mixed derivative. Tr. Mat. Inst. Steklova, 178 (in Russian); Proc. Steklov Inst. Math., 1 (English translation). Temlyakov, V. N.
(1988) Estimates of best bilinear approximations of periodic functions. Tr. Mat. Inst. Steklova, 181, 250–267 (in Russian); Proc. Steklov Inst. Math., 4, 181–293 (English translation). Temlyakov, V. N. (1992a) Bilinear approximation and related questions. Tr. Mat. Inst. Steklova, 194, 229–248 (in Russian); Proc. Steklov Inst. Math., 4, 245–265 (English translation). Temlyakov, V. N. (1992b) Estimates for the best bilinear approximations of functions and approximation numbers of integral operators. Mat. Zametki, 51, 125–134 (in Russian); Math. Notes, 51, 510–517 (English translation). Temlyakov, V. N. (1993) Approximation of Periodic Functions. Commack, New York: Nova Science Publishers. Widmer, G., Hiptmair, R. & Schwab, C. (2008) Sparse adaptive finite elements for radiative transfer. J. Comput. Phys., 227, 6071–6105. Zenger, C. (1991) Sparse grids. Parallel Algorithms for Partial Differential Equations. Proceedings of the 6th GAMM-Seminar, Kiel/Germany, 1990. Notes on Numerical Fluid Mechanics, vol. 31. Braunschweig: Vieweg, pp. 241–251. © The Author(s) 2018. Published by Oxford University Press on behalf of the Institute of Mathematics and its Applications. All rights reserved. This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model) TI - Singular value decomposition versus sparse grids: refined complexity estimates JF - IMA Journal of Numerical Analysis DO - 10.1093/imanum/dry039 DA - 2019-10-16 UR - https://www.deepdyve.com/lp/oxford-university-press/singular-value-decomposition-versus-sparse-grids-refined-complexity-lauH50jehB SP - 1652 VL - 39 IS - 4 DP - DeepDyve ER -