# On the use of Hahn’s asymptotic formula and stabilized recurrence for a fast, simple and stable Chebyshev–Jacobi transform

On the use of Hahn’s asymptotic formula and stabilized recurrence for a fast, simple and stable... Abstract We describe a fast, simple and stable transform of Chebyshev expansion coefficients to Jacobi expansion coefficients, and its inverse based on the numerical evaluation of Jacobi expansions at the Chebyshev–Lobatto points. This is achieved via decomposition of Hahn’s interior asymptotic formula into a small sum of diagonally scaled discrete sine and cosine transforms and the use of stable recurrence relations. It is known that the Clenshaw–Smith algorithm is not uniformly stable on the entire interval of orthogonality. Therefore, Reinsch’s modification is extended for Jacobi polynomials and employed near the endpoints to improve numerical stability. 1. Introduction Chebyshev expansions:   pN(x)=∑n=0NcnchebTn(x),x∈[−1,1], (1.1) where $$T_n(\cos\theta) = \cos(n\theta)$$ are ubiquitous in numerical analysis, approximation theory and pseudo-spectral methods for their near-best approximation, fast evaluation via the discrete cosine transform and fast linear algebra for Chebyshev spectral methods, among the many other properties that facilitate their convenient use (see e.g. Mason & Handscomb (2002); Olver et al. (2010); Trefethen (2012)). Jacobi expansions:   pN(x)=∑n=0NcnjacPn(α,β)(x),x∈[−1,1],α,β>−1, (1.2) also have useful properties. The Jacobi polynomials are orthogonal with respect to $$L^2([-1,1],w^{(\alpha,\beta)}(x){\rm\,d}x)$$, where $$w^{(\alpha,\beta)}(x) = (1-x)^\alpha(1+x)^\beta$$ is the Jacobi weight. Jacobi expansions are therefore useful in pseudo-spectral methods, where it is more natural to measure the error in Jacobi-weighted Hilbert spaces (see Li & Shen (2010)). Also Wimp et al. (1997) show that the Jacobi-weighted finite Hilbert and Cauchy transforms are diagonalized by Jacobi polynomials. For $$N\in\mathbb{N}$$, define $$\boldsymbol{\theta}_N^{\rm cheb}$$ as the vector of $$N+1$$ equally spaced angles:   [θNcheb]n=πnN,n=0,…,N, (1.3) and the vector of $$N+1$$ Chebyshev–Lobatto points $${\bf x}_N^{\rm cheb} = \cos\boldsymbol{\theta}_N^{\rm cheb}$$. We express the vectors of the evaluation of the expansion (1.1) and (1.2) at $${\bf x}_N^{\rm cheb}$$ as the equality of the matrix-vector products:   pN(xNcheb)=TN(xNcheb)cNcheb=PN(α,β)(xNcheb)cNjac, (1.4) where the entries of the matrices are   [TN(xNcheb)]i,j=Ti−1([xNcheb]j−1),[PN(α,β)(xNcheb)]i,j=Pi−1(α,β)([xNcheb]j−1). (1.5) We define the forward Chebyshev–Jacobi transform as   cNcheb=TN(xNcheb)−1PN(α,β)(xNcheb)cNjac, (1.6) and the inverse Chebyshev–Jacobi transform by   cNjac=PN(α,β)(xNcheb)−1TN(xNcheb)cNcheb. (1.7) 1.1 Previous work on computing Legendre, Gegenbauer and Jacobi expansion coefficients The origins of the method proposed and analyzed in this paper start with the fast eigenfunction transform of Orszag (1986). The novelty of his approach, which is improved by Mori et al. (1999), is that, for large $$N$$, the matrix $${\bf P}_N^{(0,0)}({\bf x}_N^{\rm cheb})$$ is well approximated by a small sum of diagonally scaled discrete cosine transforms of type-I (DCT-I’s) and discrete sine transforms of type-I (DST-I’s). However, by not accounting for the region in the $$N$$-$$x$$ plane where the matrix significantly differs from the interior asymptotics, their initial advances were unstable. Families of orthogonal polynomials are related by the so-called connection coefficients (Andrews et al., 1998, p. 357). The connection coefficients fill in a lower-triangular matrix that allows conversion between two different families of orthogonal polynomials. Alpert & Rokhlin (1991) leverage the asymptotically smooth functions which define the connection coefficients between Chebyshev and Legendre polynomials for an $${\mathcal O}(N\log N)$$ hierarchical approach to the Chebyshev–Legendre transform. This hierarchical approach has been extended by Keiner (2009) for expansions in Gegenbauer polynomials. When transforming polynomial expansions of analytic functions, an alternative approach to the hierarchical decomposition of the connection coefficients can be used. With geometric decay in the coefficients of both the source expansion and the target expansion, the algebraic off-diagonal decay of the connection coefficients has been used by Cantero & Iserles (2012), and Wang & Huybrechs (2014) for $${\mathcal O}(N\log N+MN)$$ Gegenbauer and Jacobi expansion coefficients of analytic functions, where $$M\in\mathbb{N}$$ is a parameter. In principle, the hierarchical approach of Alpert & Rokhlin (1991) can be adapted to the Jacobi connection coefficients for an $${\mathcal O}(N\log N)$$ algorithm. However, this approach will also be saddled with the same expensive pre-computation of the hierarchical matrix. Instead, we extend the approach of Hale & Townsend (2014) by developing a fast and numerically stable evaluation of Jacobi polynomials at the Chebyshev–Lobatto points. This approach does not have expensive pre-computation, nor does it require analyticity of the function underlying the expansion. Indeed, the transform produces high absolute accuracy for expansion coefficients of a function with any regularity $${\mathcal C}^\rho[-1,1]$$, $$\rho\ge0$$. In exchange, we accept an asymptotically slower algorithm. Hale & Townsend (2014) advocate for a modification of the approach of Mori et al. (1999) based on block partitioning of the matrix $${\bf P}_N^{(0,0)}({\bf x}_N^{\rm cheb})$$ into an $${\mathcal O}(\log N/\log\log N)$$ number of partitions within which the interior asymptotics of the Legendre polynomials are guaranteed to be accurate and the remainder of the matrix is evaluated via recurrence relations. The balancing of operations between stable fast transforms in the blocks with the recurrence relations leads to the complexity $${\mathcal O}(N\log^2N/\log\log N)$$. While asymptotically slower than the hierarchical decomposition of Alpert & Rokhlin (1991), Hale and Townsend advocate that the partitioning algorithm is a practical alternative with a smaller setup cost. Hale & Townsend (2014) leave behind a mystery regarding the discrepancy in the numerically computed error in the coefficients and the theoretical estimates based on model coefficients. In particular, they show that for Legendre coefficients $$[{\bf c}_N^{\rm leg}]_n = {\mathcal O}(n^{-r})$$, and for some $$r\in\mathbb{R}$$, the sup-norm in applying $${\bf P}_N^{(0,0)}({\bf x}_N^{\rm cheb})$$ is asymptotically:   ‖PN(0,0)(xNcheb)cNleg‖∞={O(N1−r),r<1,O(log⁡N),r=1,O(1),r>1, asN→∞, (1.8) but in their numerical experiments they observed larger errors:   Observed  Error={O(N32−r/log⁡N),r=0,12,1,O(1),r=32, asN→∞. (1.9) For the Chebyshev–Legendre transform and more generally for the Chebyshev–Jacobi transform, this mystery is solved here by an extension of Reinsch’s modification of the Clenshaw–Smith algorithm to the Jacobi polynomials. It is known that the Clenshaw–Smith algorithm is not uniformly stable on the entire interval of orthogonality, i.e., the error bound of the recurrence relation is spatially dependent. In particular, the loss of accuracy near the endpoints of the interval $$[-1,1]$$ is significant. Reinsch suggested a modification of Clenshaw’s algorithm near the endpoints; the modification is extended by Levrie & Piessens (1985) to the Clenshaw–Smith algorithm for Legendre, ultraspherical, and Laguerre polynomials; and here, we extend it to Jacobi polynomials. 1.2 General definitions and properties The Gamma function is defined for all $$\Re z>0$$ by Abramowitz & Stegun (1965):   Γ(z)=∫0∞xz−1e−xdx, (1.10) and it is analytically continued to $$z\in\mathbb{C}\setminus\{-\mathbb{N}_0\}$$ by the property $${\it{\Gamma}}(z+1) = z{\it{\Gamma}}(z)$$. The Pochhammer symbol is then defined by Abramowitz & Stegun (1965):   (x)n=Γ(x+n)Γ(x), (1.11) and the beta function is defined similarly by Abramowitz & Stegun (1965):   B(x,y)=Γ(x)Γ(y)Γ(x+y). (1.12) Jacobi polynomials have the Rodrigues formula (Olver et al., 2010, §18.5):   Pn(α,β)(x)=(−1)n2nn!(1−x)−α(1+x)−βdndxn((1−x)α(1+x)β(1−x2)n); (1.13) their values at $$x=\pm1$$ are known:   Pn(α,β)(1)=(n+αn),Pn(α,β)(−1)=(−1)n(n+βn); (1.14) and, they satisfy the symmetry relation:   Pn(α,β)(x)=(−1)nPn(β,α)(−x). (1.15) Their three-term recurrence relation is given by:   Pn+1(α,β)(x)=(Anx+Bn)Pn(α,β)(x)−CnPn−1(α,β(x),P−1(α,β)(x)=0,P0(α,β)(x)=1, (1.16) where the recurrence coefficients are given by (Olver et al., 2010 §18.9.2):   An =(2n+α+β+1)(2n+α+β+2)2(n+1)(n+α+β+1), (1.17)  Bn =(α2−β2)(2n+α+β+1)2(n+1)(n+α+β+1)(2n+α+β), (1.18)  Cn =(n+α)(n+β)(2n+α+β+2)(n+1)(n+α+β+1)(2n+α+β). (1.19) The relation between Jacobi polynomials of differing parameters,   (α+β+2n+1)Pn(α,β)(x)=(α+β+n+1)Pn(α,β+1)(x)+(α+n)Pn−1(α,β+1)(x), (1.20) combined with the symmetry relation (1.15), allows for integer-valued increments and decrements of parameters with linear complexity in the degree. Lemma 1.1 (Wang & Huybrechs (2014)) Assume that   Pn(γ,δ)(x)=∑k=0ncn,k(α,β,γ,δ)Pk(α,β)(x). (1.21) Then the coefficients $$c_{n,k}^{(\alpha,\beta,\gamma,\delta)}$$ are given by   cn,k(α,β,γ,δ) =(n+γ+δ+1)k(k+γ+1)n−k(2k+α+β+1)Γ(k+α+β+1)(n−k)!Γ(2k+α+β+2) ×3F2(k−n,n+k+γ+δ+1,k+α+1k+γ+1,2k+α+β+2;1), (1.22) where $$\,_3F_2$$ is a generalized hypergeometric function (Olver et al., 2010 §16.2.1). 2. The forward transform: Jacobi to Chebyshev In this section, we extend the algorithm of Hale & Townsend (2014) for the Chebyshev–Legendre transform to the Chebyshev–Jacobi transform by deriving a fast algorithm to compute   cNcheb=TN(xNcheb)−1PN(α,β)(xNcheb)cNjac.$${\bf T}_N({\bf x}_N^{\rm cheb})$$ is a diagonally scaled DCT-I that can be applied and inverted in $${\mathcal O}(N\log N)$$ operations. 2.1 Interior asymptotics of Jacobi polynomials The interior asymptotics of Jacobi polynomials are given by Hahn (1980). Given $$M\in\mathbb{N}_0$$:   Pn(α,β)(cos⁡θ)=∑m=0M−1Cn,mα,βfm(θ)+Rn,Mα,β(θ), (2.1) where   Cn,mα,β =22n−m+α+β+1B(n+α+1,n+β+1)π(2n+α+β+2)m, (2.2)  fm(θ) =∑l=0m(12+α)l(12−α)l(12+β)m−l(12−β)m−ll!(m−l)!cos⁡θn,m,lsinl+α+12⁡(θ2)cosm−l+β+12⁡(θ2), (2.3)  θn,m,l =12(2n+α+β+m+1)θ−(α+l+12)π2, (2.4) and where $$x=\cos\theta$$. For $$\theta\in(\tfrac{\pi}{3},\tfrac{2\pi}{3})$$, the summation converges as $$M\to\infty$$, and for $$(\alpha,\beta)\in(-\tfrac{1}{2},\tfrac{1}{2}]^2$$, and for $$n\ge 2$$, the remainder $$R_{n,M}^{\alpha,\beta}(\theta)$$ is described in §2.2. Rewriting $$\theta_{n,m,l}$$ as   θn,m,l =12(2n+α+β+m+1)θ−(α+l+12)π2, (2.5)   =nθ+(α+β+m+1)θ2−(α+l+12)π2, (2.6)   =nθ−θm,l, (2.7) allows us to insert the cosine addition formula into the asymptotic formula (2.1). The result is   Pn(α,β)(cos⁡θ)=∑m=0M−1(um(θ)cos⁡nθ+vm(θ)sin⁡nθ)Cn,mα,β+Rn,Mα,β(θ), (2.8) where   um(θ) =∑l=0m(12+α)l(12−α)l(12+β)m−l(12−β)m−ll!(m−l)!cos⁡θm,lsinl+α+12⁡(θ2)cosm−l+β+12⁡(θ2), (2.9)  vm(θ) =∑l=0m(12+α)l(12−α)l(12+β)m−l(12−β)m−ll!(m−l)!sin⁡θm,lsinl+α+12⁡(θ2)cosm−l+β+12⁡(θ2). (2.10) Since in (2.8), $$\cos n\theta$$ and $$\sin n\theta$$ are the only terms that depend simultaneously and inextricably on both $$n$$ and $$\theta$$, the matrix $${\bf P}_N^{(\alpha,\beta)}({\bf x}_N^{\rm cheb})$$ of (1.4) can be expressed in the compact form:   PN(α,β)(xNcheb)ASY=∑m=0M−1(Dum(θNcheb)TN(xNcheb)+Dvm(θNcheb)sin⁡(θNcheb[0,…,N]⊤))DCn,mα,β+RMα,β(θNcheb). (2.11) Here, $${\bf D}_{u_m(\boldsymbol{\theta}_N^{\rm cheb})}$$ and $${\bf D}_{v_m(\boldsymbol{\theta}_N^{\rm cheb})}$$ denote diagonal matrices whose entries correspond to $$u_m$$ and $$v_m$$ evaluated at the equally spaced angles $$\boldsymbol{\theta}_N^{\rm cheb}$$, $${\bf D}_{C_{n,m}^{\alpha,\beta}}$$ is the diagonal matrix whose entries consist of $$C_{n,m}^{\alpha,\beta}$$ for $$n=0,\ldots,N$$, and $${\bf R}_M^{\alpha,\beta}(\boldsymbol{\theta}_N^{\rm cheb})$$ is the matrix of remainders. Since $${\bf T}_N({\bf x}_N^{\rm cheb})$$ is a diagonally scaled DCT-I and $$\sin(\boldsymbol{\theta}_N^{\rm cheb} [0,\ldots,N]^\top)$$ is a diagonally scaled DST-I bordered by zeros, the matrix $${\bf P}_N^{(\alpha,\beta)}({\bf x}_N^{\rm cheb})^{\rm ASY}$$ can be applied in $${\mathcal O}(M N\log N + M^2N)$$ operations. However, for low degree or for $$\theta\approx0$$ or $$\theta\approx\pi$$, the approximation of $${\bf P}_N^{(\alpha,\beta)}({\bf x}_N^{\rm cheb})$$ by $${\bf P}_N^{(\alpha,\beta)}({\bf x}_N^{\rm cheb})^{\rm ASY}$$ incurs an unacceptably large error. Therefore, we restrict the applicability of the matrix $${\bf P}_N^{(\alpha,\beta)}({\bf x}_N^{\rm cheb})^{\rm ASY}$$ to the region in the $$n$$-$$\theta$$ plane, where the remainder is guaranteed to be below a tolerance $$\varepsilon$$ and we use recurrence relations to stably fill in1 the remaining entries of the matrix $${\bf P}_N^{(\alpha,\beta)}({\bf x}_N^{\rm cheb})$$. 2.2 Partitioning the matrix $${\bf P}_N^{(\alpha,\beta)}({\bf x}_N^{\rm cheb})$$ For $$\theta\in(0,\pi)$$, and for every $$m\in\mathbb{N}_0$$, let $$g_m(\theta)$$ be the positive function:   gm(θ)=∑l=0m(12+α)l(12−α)l(12+β)m−l(12−β)m−ll!(m−l)!cosl−m−β−12⁡(θ2)sinl+α+12⁡(θ2). (2.12) Then, for $$(\alpha,\beta)\in(-\tfrac{1}{2},\tfrac{1}{2}]^2$$ and for $$n\ge 2$$, the remainder in (2.1) is bounded by:2  |Rn,Mα,β(θ)|<2Cn,Mα,βgM(θ). (2.13) For large $$n$$, the following leading order asymptotics are valid:   2Cn,Mα,β =22n−M+α+β+2πΓ(n+α+1)Γ(n+β+1)Γ(2n+α+β+M+2), (2.14)   ∼22n−M+α+β+2π2πn+α(n+αe)n+α2πn+β(n+βe)n+β2π2n+α+β+M+1(2n+α+β+M+1e)2n+α+β+M+1, (2.15)   ∼122M−1πnM+12,asn→∞. (2.16) Therefore, if we set the remainder to $$\varepsilon$$, this will define a curve in the $$n$$-$$\theta$$ plane for every $$M$$ given by   n≈⌊(ε22M−1πgM(θ))−1M+12⌋. (2.17) Since $$g_m(\theta)\in{\mathcal C}(0,\pi)$$ and   limθ→0+gm(θ)=limθ→π−gm(θ)=+∞, (2.18) the Weierstrass extreme value theorem ensures the existence of a global minimizer   θ^= argminθ∈(0,π)⁡ gm(θ). (2.19) Therefore, we find the discrete global minimizer:   θ¯= argminθ∈θNchebgm(θ)≈θ^, (2.20) and collect contiguous angles such that the error in evaluating the asymptotic expansion is guaranteed to be below $$\varepsilon$$. Following Hale & Townsend (2014), we define $$n_M\in\mathbb{N}_0$$:   nM:=⌊(ε22M−1πgM(π/2))−1M+12⌋, (2.21) and we set:   αN:=min(1log⁡(N/nM),1/2)andK:=⌈log⁡(N/nM)log⁡(1/αN)⌉. (2.22) For $$k=0,\ldots,K$$, while $$j_k := \alpha_N^kN > n_M$$, we compute the indices $$i_k^1,i_k^2$$ within which the remainder falls below the tolerance $$\varepsilon$$ and whose angles bound the discrete global minimizer $$\bar{\theta}$$:   [Rjk,Mα,β(θNcheb)]i<ε∀i∈{ik1,…,ik2},[θNcheb]ik1<θ¯<[θNcheb]ik2. (2.23) We require the following lemma. Lemma 2.1 Let $$(\alpha,\beta)\in(-\tfrac{1}{2},\tfrac{1}{2}]^2$$. Then for every $$M\ge2$$:   Cn+1,Mα,β≤Cn,Mα,β. (2.24) Proof. We have   inf(α,β)∈(−12,12]2(α+β)=−1,maxα∈(−12,12]α=12,maxβ∈(−12,12]β=12. (2.25) Using (2.2)   Cn+1,Mα,β=(2n+2α+2)(2n+2β+2)Cn,Mα,β(2n+α+β+M+3)(2n+α+β+M+2)≤(2n+3)2Cn,Mα,β(2n+M+2)(2n+M+1)≤Cn,Mα,β. (2.26) □ Lemma 2.1 guarantees that if $$M\,{\ge}\,2$$, the bound on the magnitude of the remainder $$|R_{n,M}^{\alpha,\beta}(\theta)| \,{<} 2C_{n,M}^{\alpha,\beta}g_m(\theta)$$ is a nonincreasing function of $$n$$. Therefore, the determination of the indices ensures the accuracy of the asymptotic formula within the rectangles $$[[\boldsymbol{\theta}_N^{\rm cheb}]_{i_k^1},[\boldsymbol{\theta}_N^{\rm cheb}]_{i_k^2}]\times[j_{k},j_{k-1}]$$, for $$k=1,\ldots,K$$, as depicted in Fig. 1. Fig. 1. View largeDownload slide The absolute error in $${\bf P}_N^{(1/8,3/8)}({\bf x}_N^{\rm cheb})$$ using the asymptotic formula (2.1) with $$M = 7$$ (left) and $$M=13$$ (right). In both plots the colour denotes the absolute error on a logarithmic scale; the curves represent the approximate region of accuracy of the asymptotic formula to $$\varepsilon\approx 2.2204\times10^{-16}$$ determined by (2.17), whereas the boxes denote the numerically determined indices $$i_k^1,i_k^2,j_k$$ for $$k=0,1,2$$ (left) and $$k=0,1,2,3$$ (right), such that the remainder is certainly below $$\varepsilon$$. Fig. 1. View largeDownload slide The absolute error in $${\bf P}_N^{(1/8,3/8)}({\bf x}_N^{\rm cheb})$$ using the asymptotic formula (2.1) with $$M = 7$$ (left) and $$M=13$$ (right). In both plots the colour denotes the absolute error on a logarithmic scale; the curves represent the approximate region of accuracy of the asymptotic formula to $$\varepsilon\approx 2.2204\times10^{-16}$$ determined by (2.17), whereas the boxes denote the numerically determined indices $$i_k^1,i_k^2,j_k$$ for $$k=0,1,2$$ (left) and $$k=0,1,2,3$$ (right), such that the remainder is certainly below $$\varepsilon$$. Lastly, for $$k=1,\ldots,K$$, we define   PN(α,β)(xNcheb)ASY,k=diag⁡(00:ik1−1,1ik1:ik2,0ik2+1:N)PN(α,β)(xNcheb)ASYdiag⁡(00:jk−1,1jk:jk−1,0jk−1+1:N), (2.27) to be the matrices of the asymptotic formula (2.11) within $$[[\boldsymbol{\theta}_N^{\rm cheb}]_{i_k^1},[\boldsymbol{\theta}_N^{\rm cheb}]_{i_k^2}]\times[j_k,j_{k-1}]$$, and:   [PN(α,β)(xNcheb)REC]i,j={PN(α,β)(xNcheb)i,j,i<ik1 or i>ik2,j<jk−1,k=1,…,K,PN(α,β)(xNcheb)i,j,iK1≤i≤iK2,j<jK,0,otherwise,  (2.28) which is computed via recurrence relations. Then, the numerically stable formula   PN(α,β)(xNcheb)=PN(α,β)(xNcheb)REC+∑k=1KPN(α,β)(xNcheb)ASY,k, (2.29) can be computed in $${\mathcal O}(N\log^2N/\log\log N)$$ operations. For detailed leading order estimates, see Appendix A. 2.3 Error analysis for model coefficients Consider a set of coefficients satisfying $$[{\bf c}_N^{\rm jac}]_n = {\mathcal O}(n^{-r})$$, for some $$r\in\mathbb{R}$$. We can estimate the sup-norm of the error in the forward transform (1.6) by estimating the error in applying the matrix $${\bf P}_N^{(\alpha,\beta)}({\bf x}_N^{\rm cheb})$$. Define $$D_N^r := \operatorname{diag}(1,1^r,\ldots,N^r)$$, then   ‖PN(α,β)(xNcheb)cNjac‖∞≤‖PN(α,β)(xNcheb)DN−r‖∞‖DnrcNjac‖∞. (2.30) Since   maxx∈xNcheb|Pn(α,β)(x)|=max{|Pn(α,β)(1)|,|Pn(α,β)(−1)|}=max{(n+αn),(n+βn)}=(n+max{α,β}n), (2.31) we can estimate the first term in (2.30) as follows:   ‖PN(α,β)(xNcheb)DN−r‖∞ =O(1+∑n=1N(n+max{α,β}n)n−r),asN→∞, (2.32)   =O(HN,r−max{α,β}),asN→∞, (2.33) where $$H_{N,r}$$ are the generalized harmonic numbers (Graham et al., 1989). Using their asymptotics,   ‖PN(α,β)(xNcheb)cNjac‖∞={O(N1+max{α,β}−r),r<1+max{α,β},O(log⁡N),r=1+max{α,β},O(1),r>1+max{α,β}.  (2.34) In the case $$(\alpha,\beta)=(0,0)$$, the error analysis reduces to that of the Chebyshev–Legendre transform derived in Hale & Townsend (2014). 3. The inverse transform: Chebyshev to Jacobi It is impractical to compute the inverse transform (1.7):   cNjac=PN(α,β)(xNcheb)−1TN(xNcheb)cNcheb, directly due to the occurrence of the inverse of the matrix $${\bf P}_N^{(\alpha,\beta)}({\bf x}_N^{\rm cheb}){}$$. Instead, following Hale & Townsend (2014), we use the transpose of the asymptotic formula (2.11) in conjunction with the integral definition of the Jacobi coefficients:   [cNjac]n=1Anα,β∫−11pN(x)Pn(α,β)(x)w(α,β)(x)dx,n=0,…,N, (3.1) where $$p_N(x)$$ is defined by (1.1), and where $$\mathscr{A}_n^{\alpha,\beta}$$, defined by Olver et al. (2010, §18.3.1), are the orthonormalization constants of the Jacobi polynomials:   Anα,β=∫−11Pn(α,β)(x)2w(α,β)(x)dx=2α+β+1Γ(n+α+1)Γ(n+β+1)(2n+α+β+1)Γ(n+α+β+1)n!. (3.2) Since $$\deg(p_N(x))\le N$$, its product with $$P_N^{(\alpha,\beta)}(x)$$ will be integrated exactly by the $$2N+1$$-point Clenshaw–Curtis quadrature rule with the Jacobi weight $$w^{(\alpha,\beta)}(x)$$. 3.1 Clenshaw–Curtis quadrature Clenshaw–Curtis quadrature is a quadrature rule (Waldvogel, 2003; Sommariva, 2013) whose nodes are the $$N+1$$ Chebyshev–Lobatto points $${\bf x}_N^{\rm cheb}$$. Given a continuous weight function $$w(x)\in{\mathcal C}(-1,1)$$ and $$\mathbb{P}_N$$, the space of algebraic polynomials of degree at most $$N$$, the weight vector $${\bf w}_N$$ is designed by the equality:   ∫−11f(x)w(x)dx=wN⊤f(xNcheb),∀f∈PN. (3.3) With the modified Chebyshev moments of the weight function $$w(x)$$:   μn=∫−11Tn(x)w(x)dx,n=0,…,N, (3.4) the weights $${\bf w}_N$$ can be determined via the formula:   [wN]n=1−12(δ0,n+δN,n)N{μ0+(−1)nμN+2∑k=1N−1μkcos⁡[πkn/N]}, (3.5) where $$\delta_{k,j}$$ is the Kronecker delta (Abramowitz & Stegun, 1965). Due to this representation, the $${\mathcal O}(N\log N)$$ computation of the weights from modified Chebyshev moments is achieved via a diagonally scaled DCT-I. For the Jacobi weight, the modified Chebyshev moments are known explicitly (Piessens, 1987):   μn(α,β)=∫−11Tn(x)w(α,β)(x)dx=2α+β+1B(α+1,β+1)3F2(n,−n,α+112,α+β+2;1), (3.6) where $$\,_3F_2$$ is a generalized hypergeometric function (Olver et al., 2010, §16.2.1). Using Sister Celine’s technique (Rainville, 1960,§127) or induction (Xiang et al., 2014), a recurrence relation can be derived for the modified moments:   μ0(α,β)=2α+β+1B(α+1,β+1),μ1(α,β)=β−αα+β+2μ0(α,β), (3.7)  (α+β+n+2)μn+1(α,β)+2(α−β)μn(α,β)+(α+β−n+2)μn−1(α,β)=0,forn>0. (3.8) It is known that for $$\alpha>\beta$$ and $$\beta = -\tfrac{1}{2} + \mathbb{N}_0$$ or for $$\alpha<\beta$$ and $$\alpha = -\tfrac{1}{2} + \mathbb{N}_0$$, neither forward nor backward recurrence is stable. This has been addressed by Xiang et al. (2014) by transforming the initial value problem into a boundary value problem with a sufficiently accurate asymptotic expansion for $$\mu_N^{(\alpha,\beta)}$$ and subsequent use of Oliver’s algorithm (Oliver (1968)), i.e., the LU decomposition of a tridiagonal matrix. However, the recurrence relation is stable in the forward direction in the half-open square $$(\alpha,\beta)\in(-\tfrac{1}{2},\tfrac{1}{2}]^2$$, and in light of the linear complexity of integer-valued decrements, Oliver’s algorithm is not required in the present context. Once the modified Chebyshev moments $$\mu_n^{(\alpha,\beta)}$$ are computed, the Clenshaw–Curtis weights $${\bf w}_N^{(\alpha,\beta)}$$ follow via a diagonally scaled DCT-I. 3.2 The transpose of the asymptotic formula Since $$\deg(p_N(x)P_n^{(\alpha,\beta)}(x))\le 2N$$, the $$2N+1$$-point, Clenshaw–Curtis quadrature rule yields:   [cNjac]n=1Anα,β(w2N(α,β))⊤(pN(x2Ncheb)Pn(α,β)(x2Ncheb)),n=0,…,N, (3.9) where $$p_N(x)$$ is defined by (1.1) and where $$\mathscr{A}_n^{\alpha,\beta}$$ is given by (3.2). If we let the vector $$[{\bf s}_{2N}]_n = (\mathscr{A}_n^{\alpha,\beta})^{-1}$$ for $$n=0,\ldots,2N$$, then we can rewrite this in matrix form:   cNjac=[IN+1|0N]Ds2NP2N(α,β)(x2Ncheb)⊤Dw2N(α,β)T2N(x2Ncheb)⊤[IN+10N]cNcheb, (3.10) where $${\bf D}_{{\bf s}_{2N}}$$ and $${\bf D}_{{\bf w}_{2N}^{(\alpha,\beta)}}$$ denote diagonal matrices whose entries correspond to $${\bf s}_{2N}$$ and $${\bf w}_{2N}^{(\alpha,\beta)}$$, respectively. Clenshaw–Curtis quadrature allows us to express the Jacobi coefficients in terms of transposed matrices rather than inverse matrices. In order to complete our formulation, we use the transpose of (2.11), given by   P2N(α,β)(x2Ncheb)ASY,⊤=∑m=0M−1DCn,mα,β(T2N(x2Ncheb)⊤Dum(θ2Ncheb)+sin⁡(θ2Ncheb[0,…,2N]⊤)⊤Dvm(θ2Ncheb))+RMα,β(θ2Ncheb)⊤. (3.11) Given the ordering of the points $${\bf x}_{2N}^{\rm cheb}$$, the transposed DCT-I $${\bf T}_{2N}({\bf x}_{2N}^{\rm cheb})^\top$$ and the transposed DST-I bordered by zeros $$\sin(\boldsymbol{\theta}_{2N}^{\rm cheb}[0,\ldots,2N]^\top)$$ are symmetric, allowing for the same implementation as the forward Chebyshev–Jacobi transform. Similarly, the constants $$n_M$$, $$\alpha_N$$, $$K$$, and the indices $$i_k^1$$, $$i_k^2$$ and $$j_k$$ can be computed as in the forward transform, with the substitution $$N\to 2N$$. Therefore, the transpose of the asymptotic formula, combined with recurrence relations, can be used for a numerically stable partition and evaluation of the inverse transform. 3.3 Error analysis for model coefficients Consider a set of coefficients satisfying $$[{\bf c}_N^{\rm cheb}]_n = {\mathcal O}(n^{-r})$$, for some $$r\in\mathbb{R}$$. We can estimate the sup-norm of the error in the inverse transform (1.7) by estimating the error in the transpose formula (3.10). Using $$D_N^r := \operatorname{diag}(1,1^r,\ldots,N^r)$$ again, then:   ‖cNjac‖∞ =‖[IN+1|0N]Ds2NP2N(α,β)(x2Ncheb)⊤Dw2N(α,β)T2N(x2Ncheb)⊤[IN+10N]cNcheb‖∞ (3.12)   ≤‖Ds2NP2N(α,β)(x2Ncheb)⊤‖∞‖Dw2N(α,β)‖∞‖T2N(x2Ncheb)⊤[DN−r0N]‖∞‖DNrcNcheb‖∞. (3.13) Using the bound on the Jacobi polynomials (2.31), we can formulate the asymptotics of the sup-norm involving the transposed matrix $${\bf P}_{2N}^{(\alpha,\beta)}({\bf x}_{2N}^{\rm cheb})^\top$$ and its diagonal scaling. Since $$\mathscr{A}_n^{\alpha,\beta} = {\mathcal O}(N^{-1})$$, as can be seen from (3.2), we have   ‖Ds2NP2N(α,β)(x2Ncheb)⊤‖∞ ≤C‖D2N1P2N(α,β)(x2Ncheb)⊤‖∞,for some C>0, (3.14)   ≤Cmaxn∈{0,…,2N}∑x∈x2Ncheb(δn,0+n)|Pn(α,β)(x)|, (3.15)   ≤Cmaxn∈{0,…,2N}2N(δn,0+n)(n+max{α,β}n), (3.16)   =O(N2+max{α,β}),asN→∞. (3.17) For the Clenshaw–Curtis quadrature weights,   ‖Dw2N(α,β)‖∞=O(N−1),asN→∞, (3.18) as can be seen by (3.5).3 Due to the symmetry of $${\bf T}_{2N}({\bf x}_{2N}^{\rm cheb})$$, we can conclude that   ‖T2N(x2Ncheb)⊤[DN−r0N]‖∞=1+HN,r, (3.19) or   ‖cNjac‖∞={O(N2+max{α,β}−r),r<1,O(N1+max{α,β}log⁡N),r=1,O(N1+max{α,β}),r>1.  (3.20) This growth rate appears larger than our numerical experiments suggest, and this can be attributed to the overestimation of $$\left\| {\bf D}_{{\bf s}_{2N}} {\bf P}_{2N}^{(\alpha,\beta)}({\bf x}_{2N}^{\rm cheb})^\top\right\|_\infty$$: the Jacobi polynomials are significantly smaller than the maximum of their endpoints for the majority of the interior of $$[-1,1]$$. However, without a useful envelope function, we report what can only be an overestimate. 4. Design and implementation As Hale & Townsend (2014) remark, the partitioning implies that the algorithm is trivially parallelized. However, of more immediate concern is the application of the same transform to multiple sets of expansion coefficients. Analogous to the Fastest Fourier Transform in the West (FFTW) of Frigo & Johnson (2005), we divide the computation into Part 1 (Planification) and Part 2 (Execution): (1) Planification (i) computation of the partitioning indices, (ii) planification of the in-place DCT-I and DST-I, (iii) computation of the recurrence coefficients, (iv) allocation of temporary arrays, and (V) computation of the modified weights and orthonormality constants (inverse only). (2) Execution (i) computation of the diagonal matrices $${\bf D}_{u_m(\boldsymbol{\theta}_N^{\rm cheb})}$$, $${\bf D}_{v_m(\boldsymbol{\theta}_N^{\rm cheb})}$$, and $${\bf D}_{C_{n,m}^{\alpha,\beta}}$$, (ii) application of the DCT-I and DST-I, and (iii) execution of the recurrence relations. Since Part 1 is only dependent on the degree and the Jacobi parameters, it is reusable. Therefore, the results of Part 1 are stored in an object called a ChebyshevJacobiPlan. Analogous to FFTW, applying the ChebyshevJacobiPlan to a vector results in execution of Part 2. While it is beneficial to divide the computation as such, the construction of a ChebyshevJacobiPlan is not orders of magnitude larger than the execution, as is the case for other schemes using hierarchical or other complex data structures; our numerical experiments suggest an approximate gain on the order of $$10\%$$. However, the reduction of memory allocation alone could be important in memory-sensitive applications. 4.1 Computational issues Consider the Stirling series for the gamma function (Olver et al., 2010, §5.11.10) on $$z\in\mathbb{R}^+$$:   Γ(z)=2πzz−12e−z(SN(z)+RN(z)),SN(z)=∑n=0N−1anzn,RN(z)≤(1+ζ(N))Γ(N)(2π)N+1zN. (4.1) The sequence $$\{a_n\}_{n\ge0}$$ is defined by the ratio of sequences A001163 and A001164 of Sloane (2016), and $$\zeta$$ is the Riemann zeta function (Olver et al., 2010 §25). Table 1 shows the necessary and sufficient number of terms required of the Stirling series such that $$\tfrac{R_N(z)}{S_N(z)} < \tfrac{\varepsilon}{20} \approx 1.1102\times10^{-17}$$. Taking rounding errors into account, the effect is a relative error below machine precision $$\varepsilon \approx 2.2204\times10^{-16}$$ in double precision arithmetic. Table 1 Number of terms such that $$\tfrac{R_N(z)}{S_N(z)} < \tfrac{\varepsilon}{20} \approx 1.1102\times10^{-17}$$ in double precision arithmetic $$z\ge$$  3275  591  196  92  53  35  26  $$\quad$$$$N$$  4  5  6  7  8  9  10  $$z\ge$$  20  17  14  12  11  10  9  $$\quad$$$$N$$  11  12  13  14  15  16  17  $$z\ge$$  3275  591  196  92  53  35  26  $$\quad$$$$N$$  4  5  6  7  8  9  10  $$z\ge$$  20  17  14  12  11  10  9  $$\quad$$$$N$$  11  12  13  14  15  16  17  Define $$S_{\varepsilon}(z): [9,\infty)\to\mathbb{R}$$ by the truncated Stirling series $$S_N(z)$$ with necessary and sufficient $$N$$ for relative error below $$\varepsilon$$, as determined by Table 1. The coefficients $$C_{n,m}^{\alpha,\beta}$$ of (2.2) can be stably computed by forward recurrence in $$n$$ and $$m$$:   Cn,mα,β=(n+α)(n+β)Cn−1,mα,β(n+α+β+m+12)(n+α+β+m2),andCn,mα,β=Cn,m−1α,β2(2n+α+β+m+1). (4.2) However, to determine the indices $$i_k^1$$ and $$i_k^2$$ for the partitioning of the matrix $${\bf P}_N^{(\alpha,\beta)}({\bf x}_N^{\rm cheb})$$, use of an asymptotic formula is more efficient. Here, we adapt the approach of Hale & Townsend (2013, §3.2.3 & §3.3.1) with suitable modifications. In terms of $$S_\varepsilon(z)$$ defined above, the coefficients $$C_{n,m}^{\alpha,\beta}$$ can be expressed as   Cn,mα,β =22n−m+α+β+12π(n+α+1)n+α+12e−n−α−1(n+β+1)n+β+12e−n−β−1π(2n+m+α+β+2)2n+m+α+β+32e−2n−m−α−β−2 ×Sε(n+α+1)Sε(n+β+1)Sε(2n+m+α+β+2), (4.3)   =em4mπ(1+α−β−m2n+α+β+m+2)n+α+12(1+β−α−m2n+α+β+m+2)n+β+12 ×1nm+12(1+α+β+m+22n)m+12Sε(n+α+1)Sε(n+β+1)Sε(2n+m+α+β+2). (4.4) In (4.4), the terms resembling $$(1+x)^y$$ can be computed stably and efficiently by $$\exp(y\operatorname{log1p} x)$$, where $$\operatorname{log1p}$$ calls the natural logarithm $$\log(1+x)$$ for large arguments and its Taylor series for small arguments. So long as $$n+\min\{\alpha,\beta\}\ge8$$, the asymptotic formula (4.4) for the coefficients $$C_{n,m}^{\alpha,\beta}$$ can be used for a fast and stable numerical evaluation, and the downward recurrence of (4.2) supplies $$C_{n,m}^{\alpha,\beta}$$ for the handful of remaining values. To compute the $$\mathscr{A}_n^{\alpha,\beta}$$ of (3.2), the asymptotic expansion derived by Bühring (2000) can be used. However, a remainder estimate is not reported and instead we use the same technique as for the computation of the coefficients $$C_{n,m}^{\alpha,\beta}$$:   Anα,β =2α+β+12n+α+β+1(n+α+1)n+α+12(n+β+1)n+β+12(n+α+β+1)n+α+β+12(n+1)n+12Sε(n+α+1)Sε(n+β+1)Sε(n+α+β+1)Sε(n+1), (4.5)   =2α+β+12n+α+β+1(1−βn+α+β+1)n2+α+14(1−αn+α+β+1)n2+β+14 ×(1+αn+1)n2+14(1+βn+1)n2+14Sε(n+α+1)Sε(n+β+1)Sε(n+α+β+1)Sε(n+1). (4.6) Similar to (4.4), (4.6) can be computed stably and efficiently for $$n+\min\{\alpha,\beta,\alpha+\beta,0\}\ge8$$. Note as well the symmetry in both (4.4) and (4.6) upon the substitution $$\alpha\leftrightarrow\beta$$. 4.2 Reinsch’s modification of forward orthogonal polynomial recurrence and the Clenshaw–Smith algorithm In order to evaluate $${\bf P}_N^{(\alpha,\beta)}({\bf x}_N^{\rm cheb})^{\rm REC}$$ and its transpose, recurrence relations are required. Here, we review the recurrence relations for orthogonal polynomials, and derive new relations for stabilized evaluation near the boundary of the interval of orthogonality for Jacobi polynomials. Let an orthogonal polynomial sequence $$\pi_n(x)$$ be defined by the three-term recurrence relation (Olver et al., 2010, §18.9.1):   πn+1(x)=(Anx+Bn)πn(x)−Cnπn−1(x),π−1(x)=0,π0(x)=1. (4.7) The Clenshaw–Smith algorithm writes the sum,   pN(x)=∑n=0Ncnπn(x), (4.8) via an inhomogeneous recurrence relation involving the adjoint of (4.7) as follows: Algorithm 1 (Clenshaw, 1955; Smith, 1965) (1) Set:   uN+1(x)=uN+2(x)=0. (4.9) (2) For $$n=N,N-1,\ldots,0$$:   un(x)=(Anx+Bn)un+1(x)−Cn+1un+2(x)+cn. (4.10) (3) Then:   pN(x)=u0(x). (4.11) After Clenshaw’s original error analysis, it was Gentleman (1969) who first drew attention to the susceptibility of larger rounding errors near the ends of the interval $$[-1,1]$$. In Gentleman’s paper, Reinsch proposed (unpublished) a stabilizing modification, with the error analysis of the modification performed by Oliver (1977). Levrie & Piessens (1985) derive Reinsch’s modification of the Clenshaw–Smith algorithm for Legendre, ultraspherical and Laguerre polynomials. They also derive Reinsch’s modification to the forward orthogonal polynomial recurrence (4.7) for Chebyshev, Legendre, ultraspherical, Jacobi and Laguerre polynomials. Here, we review Reinsch’s modification with more general notation than that of Levrie & Piessens (1985) and extend Reinsch’s modified Clenshaw–Smith algorithm to Jacobi polynomials. Formally, we define the ratio   rnf(x):=πn+1(x)πn(x),forn≥0, (4.12) such that at the point $$x_0$$  rnf(x0)=Anx0+Bn−Cnrn−1f(x0)−1, (4.13) or isolating for $$B_n$$  Bn=rnf(x0)+Cnrn−1f(x0)−1−Anx0. (4.14) Substituting this relationship for $$B_n$$ into the forward recurrence (4.7), we obtain the modified version: Algorithm 2 (1) Set:   π0(x)=1,d0(x)=0. (4.15) (2) For $$n\ge0$$:   dn+1(x) =(An(x−x0)πn(x)+Cndn(x))rnf(x0)−1, (4.16)  πn+1(x) =(πn(x)+dn+1(x))rnf(x0). (4.17) Consider the homogeneous adjoint three-term recurrence:   vn(x)=(Anx+Bn)vn+1(x)−Cn+1vn+2(x),v0(x)=0,v1(x)=1. (4.18) Formally, we define the ratio   rnb(x):=vn+1(x)vn(x),forn>0, (4.19) such that at the point $$x_0$$  rnb(x0)−1=Anx0+Bn−Cn+1rn+1b(x0), (4.20) or isolating for $$B_n$$  Bn=rnb(x0)−1+Cn+1rn+1b(x0)−Anx0. (4.21) Substituting this relationship for $$B_n$$ into the Clenshaw–Smith algorithm, we obtain the modified version: Algorithm 3 (1) Set:   uN+1(x)=dN+1(x)=0. (4.22) (2) For $$n=N,N-1,\ldots,1$$,   dn(x) =(An(x−x0)un+1(x)+Cn+1dn+1(x)+cn)rnb(x0), (4.23)  un(x) =(un+1(x)+dn(x))rnb(x0)−1. (4.24) (3) Then,   pN(x)=A0(x−x0)u1(x)+C1d1(x)+c0. (4.25) The stability of the modified forward recurrence and the modified Clenshaw–Smith algorithm near $$x_0$$ is derived from the geometric damping induced by $$x-x_0$$ and the avoidance of cancellation errors. However, the naïve implementation of the two-term recurrence relations for the ratios (4.13) and (4.20) contains precisely the cancellation errors we were hoping to avoid. Therefore, to complete the stable implementation of the scheme, we require stable evaluation of the ratios $$r^f(x_0)$$ and $$r^b(x_0)$$. For Jacobi polynomials, due to (1.14) and the two-term recurrence of binomials, the ratios $$r_n^f(\pm1)$$ defined by (4.12) are trivial:   rnf(1)=n+α+1n+1,andrnf(−1)=−n+β+1n+1. (4.26) Fortunately, we can also prove the following: Lemma 4.1 For Jacobi polynomials, the ratios $$r_n^b(\pm1)$$ defined by (4.19) are as follows:   rnb(1)=n+1n(α+β+n+1)(α+β+2n)(n+β)(α+β+2n+2),  and  rnb(−1)=−n+1n(α+β+n+1)(α+β+2n)(n+α)(α+β+2n+2). (4.27) Proof. One need only insert the ratios into the relationship (4.20). □ Figure 2 shows the relative error in evaluating $$P_{10,000}^{(0,0)}(\cos\theta)$$ at 10,001 equally spaced angles using the six described algorithms. In Fig. 2, the terms $$x\pm1$$ are computed accurately with the trigonometric identities $$x+1=2\cos^2(\tfrac{\theta}{2})$$ and $$x-1 = -2\sin^2(\tfrac{\theta}{2})$$. While variations in $$\alpha$$ and $$\beta$$ will change the accuracy of all six recurrence relations, we practically take the unmodified algorithms to be more accurate in $$\frac{\pi}{4} < \theta < \frac{3\pi}{4}$$, and the modifications otherwise as the perturbations in the breakpoints are asymptotically of lower order as $$N\to\infty$$. Fig. 2. View largeDownload slide The relative error in evaluating $$P_{10,000}^{(0,0)}(\cos\theta)$$ at 10,001 equally spaced angles using: left, the three-term recurrence relation and Reinsch’s forward modifications for the ends of the interval $$[0,\pi]$$; right, the Clenshaw–Smith algorithm and Reinsch’s backward modifications for the ends of the interval $$[0,\pi]$$. Colour figures available online. Fig. 2. View largeDownload slide The relative error in evaluating $$P_{10,000}^{(0,0)}(\cos\theta)$$ at 10,001 equally spaced angles using: left, the three-term recurrence relation and Reinsch’s forward modifications for the ends of the interval $$[0,\pi]$$; right, the Clenshaw–Smith algorithm and Reinsch’s backward modifications for the ends of the interval $$[0,\pi]$$. Colour figures available online. Note that six algorithms are required in total: the Clenshaw–Smith algorithm and Reinsch’s modifications near $$\pm1$$ for the forward Chebyshev–Jacobi transform, and forward orthogonal polynomial recurrence, and again Reinsch’s modifications near $$\pm1$$ for the inverse Chebyshev–Jacobi transform. 5. Numerical discussion and outlook In principle, the connection coefficients (1.22) are able to provide reference solutions for the maximum absolute error. But, in practice, the naïve algorithm’s quadratic complexity limits the applicability to below about $$N=10^4$$. Therefore, in Fig. 3, we plot the maximum absolute error in transforming Chebyshev expansion coefficients to Jacobi expansion coefficients and back for coefficients simulating an irregular function, and for coefficients simulating a continuous function. The error is similar for the forward–inverse composition. Figure 4 shows the execution time of the forward and inverse transforms in line with the predicted asymptotic complexity $${\mathcal O}(N\log^2N/\log\log N)$$. These calculations are performed on a Mid 2014 Macbook Pro with a $$2.8{\rm GHz}$$ Intel Core i7-4980HQ processor and $$16{\rm GB}$$$$1600{\rm MHz}$$ DDR3 memory. Our implementation (Slevinsky, 2017FastTransforms.jl) in the Julia programming language is freely available online. Fig. 3. View largeDownload slide Maximum absolute error of the inverse Chebyshev–Jacobi transform of the forward Chebyshev–Jacobi transform. Left: For coefficients simulating an irregular function $$[{\bf c}_N]_n \sim U(0,1)$$. Right: For coefficients simulating a continuous function $$[{\bf c}_N]_n \sim U(-1,1)n^{-2}$$. In both plots, the numbers labeling the solid lines refer to different Jacobi parameters and the dashed black lines are asymptotic estimates on the error based on the error analyses. The results are an average over $$10$$ executions. Fig. 3. View largeDownload slide Maximum absolute error of the inverse Chebyshev–Jacobi transform of the forward Chebyshev–Jacobi transform. Left: For coefficients simulating an irregular function $$[{\bf c}_N]_n \sim U(0,1)$$. Right: For coefficients simulating a continuous function $$[{\bf c}_N]_n \sim U(-1,1)n^{-2}$$. In both plots, the numbers labeling the solid lines refer to different Jacobi parameters and the dashed black lines are asymptotic estimates on the error based on the error analyses. The results are an average over $$10$$ executions. Fig. 4. View largeDownload slide Execution time for the Chebyshev–Jacobi transform. Left: The forward Chebyshev–Jacobi transform. Right: The inverse Chebyshev–Jacobi transform. In both plots, the numbers labeling the solid lines refer to the parameter $$M$$ and the dashed black line is the same, showing that the inverse transform takes about twice the time. The results are an average over $$10$$ executions performed with uniformly distributed parameters $$\alpha\sim U(-\tfrac{1}{2},\tfrac{1}{2})$$ and $$\beta\sim U(-\tfrac{1}{2},\tfrac{1}{2})$$. Fig. 4. View largeDownload slide Execution time for the Chebyshev–Jacobi transform. Left: The forward Chebyshev–Jacobi transform. Right: The inverse Chebyshev–Jacobi transform. In both plots, the numbers labeling the solid lines refer to the parameter $$M$$ and the dashed black line is the same, showing that the inverse transform takes about twice the time. The results are an average over $$10$$ executions performed with uniformly distributed parameters $$\alpha\sim U(-\tfrac{1}{2},\tfrac{1}{2})$$ and $$\beta\sim U(-\tfrac{1}{2},\tfrac{1}{2})$$. Composition of the forward and inverse transforms allows for the transform between expansions in Jacobi polynomials of differing parameters. Additionally, use of a nonuniform discrete cosine transform (NDCT) (e.g. Hale & Townsend (2016)) could allow for fast evaluation at the Gauss–Jacobi nodes. However, an efficient NDCT requires points to be close to the Chebyshev points of the first kind, and the inequalities on the zeros of the Jacobi polynomials (Olver et al., 2010,§18.16) seem to be overestimates. The performance of an NDCT may be better in practice than can be currently estimated theoretically. One potential area of application is the extension of the fast and well-conditioned spectral method for solving singular integral equations of Slevinsky & Olver (2017) to polygonal boundaries. Elliptic partial differential equations have angle-dependent algebraic singularities in the densities on polygonal boundaries. It is conjectured that working in the more exotic bases of Jacobi polynomials and Jacobi functions of the second kind can lead to banded representations of singular integral operators defined on polygonal boundaries. Since the integer-valued increments are required for Jacobi parameters beyond $$(\alpha,\beta)\in(-\tfrac{1}{2},\tfrac{1}{2}]^2$$, the method proposed and analyzed here cannot be used for exceedingly large parameters. This is consistent with nonuniformity of Hahn’s asymptotics (2.1) in $$\alpha$$ and $$\beta$$. Therefore, this Chebyshev–Jacobi transform cannot be used for a fast spherical harmonics transform. There are certain parameter régimes where the complexity can be reduced. These are detailed in Appendix B. Townsend et al. (2016) describe a new approach to fast polynomial transforms based on exploiting the structure of certain connection coefficients matrices. In this framework, it is also possible to perform certain Chebyshev–Jacobi and Jacobi–Jacobi transforms. Acknowledgments This paper is dedicated to the celebration of Nick Trefethen on his $$60^{\rm th}$$ birthday and his inspirational contributions to numerical analysis. I wish to thank the referees for their constructive remarks. Funding This work was supported by the Natural Sciences and Engineering Research Council of Canada. A. Appendix: Complexity of $$\boldsymbol{{\bf P}_N^{(\alpha,\beta)}({\bf x}_N^{\rm cheb})^{\rm REC}}$$ In this section, we derive refined estimates on the complexity of applying the matrix $${\bf P}_N^{(\alpha,\beta)}({\bf x}_N^{\rm cheb})^{\rm REC}$$. By artificially partitioning the matrix into rectangular regions, we need to estimate Hale & Townsend, 2014, §3.3),   ∑k=1K−1αNkN(ik+11−ik1+ik2−ik+12), (A.1) to leading order. Fortunately, as $$\theta\to0$$ or $$\theta\to\pi$$, $$g_M(\theta)$$ is its own asymptotic expansion. For brevity, we derive the leading order asymptotics of $$i_k^1$$ and deduce those of $$i_k^2$$ by symmetry. To leading order,   gM(θ)∼gM1(θ)=(12+α)M(12−α)MM!1sinM+α+12⁡θ2,asθ→0. (A.2) Then, to determine the leading order estimate of $$i_k^1$$,   ε∼2Cjk,Mα,βgM1(ik1πN+1), (A.3) or isolating for $$i_k^1$$,   ik1∼⌊2(N+1)πsin−1⁡(((12+α)M(12−α)MεM!22M−1πjkM+12)1M+α+12)⌋,asN→∞. (A.4) Using the fact that $$\sin^{-1}x \sim x$$ as $$x\to0$$, we find,   ik1=O(N×jk−M+12M+α+12)=O(NαM+α+12×αN−kM+12M+α+12),asN→∞. (A.5) Therefore, the sum involving $$i_{k+1}^1$$ and $$i_k^1$$ is, to leading order:   ∑k=1K−1αNkN(ik+11−ik1) =O(NM+2α+12M+α+12∑k=1K−1(αN−M+12M+α+12−1)αNkαM+α+12), (A.6)   =O(KNM+2α+12M+α+12αNM−α+12M+α+12),asN→∞. (A.7) By the symmetry in $$\alpha\leftrightarrow\beta$$ and $$\theta\leftrightarrow\pi-\theta$$, we have   ∑k=1K−1αNkN(ik2−ik+12)=O(KNM+2β+12M+β+12αNM−β+12M+β+12),asN→∞. (A.8) Therefore, the simplified estimate $${\mathcal O}(N\log^2 N/\log\log N)$$ is a local expansion near $$(\alpha,\beta)\approx(0,0)$$, and we observe in Fig. 4 that it holds over $$(\alpha,\beta)\in(-\tfrac{1}{2},\tfrac{1}{2}]^2$$ in practice, so long as $$M\ge5$$. B Appendix: Jacobi parameters resulting in reduced complexity B.1 $$\alpha=\beta=\lambda-\frac{1}{2}$$ In the case that $$\alpha=\beta=\lambda-\tfrac{1}{2}$$, we are a normalization away from the ultraspherical or Gegenbauer polynomials. These asymptotics are given by (Olver et al., 2010 §18.15):   Pn(λ−12,λ−12)(cos⁡θ)=∑m=0M−1Cn,mλcos⁡θn,mλsinm+λ⁡θ+Rn,Mλ(θ). (B.1) Here, we have   Cn,mλ =2λΓ(n+λ+12)πΓ(n+λ+1)(λ)m(1−λ)m2mm!(n+λ+1)m, (B.2)  θn,mλ =(n+m+λ)θ−(m+λ)π2=nθ−(m+λ)(π2−θ), (B.3) and $$x=\cos\theta$$. The coefficients $$C_{n,m}^\lambda$$ can be computed by the recurrence:   Cn,mλ=(λ+m−1)(m−λ)2m(n+λ+m)Cn,m−1λ,Cn,0λ=2λΓ(n+λ+12)πΓ(n+λ+1). (B.4) So long as $$\lambda\in(0,1)$$, the error is bounded by   |Rn,Mλ(θ)|<2Cn,MλsinM+λ⁡θ. (B.5) Therefore, if we set the error to $$\varepsilon$$, this will define a curve in the $$n$$-$$\theta$$ plane for every $$M$$ and $$\lambda$$ given by,   n≈nMλsinM+λM+12⁡θ,nMλ=⌊(επ2MM!2λ+1(λ)M(1−λ)M)−1M+12⌋. (B.6) B.2 $$\alpha=\frac{1}{2}$$ If $$\alpha = \tfrac{1}{2}$$, then the summations in the functions $$f_m(\theta)$$ collapse:   fm(θ)=(12+β)m(12−β)mm!cos⁡θn,m,0sin⁡(θ2)cosm+β+12⁡(θ2). (B.7) B.3 $$\beta=\frac{1}{2}$$ If $$\beta = \tfrac{1}{2}$$, then the summations in the functions $$f_m(\theta)$$ collapse:   fm(θ)=(12+α)m(12−α)mm!cos⁡θn,m,msinm+α+12⁡(θ2)cos⁡(θ2). (B.8) Footnotes 1The Clenshaw–Smith algorithm (Clenshaw, 1955; Smith, 1965) for evaluation of polynomials in orthogonal polynomial bases is used rather than explicitly filling in the matrix. 2The validity of Hahn (1980, Satz 5) can be extended from the open square $$(\alpha,\beta)\in(-\tfrac{1}{2},\tfrac{1}{2})^2$$ to the half-open square $$(\alpha,\beta)\in(-\tfrac{1}{2},\tfrac{1}{2}]^2$$ by substituting the error-free equality $$(1-z)^{-0} = 1$$ for Hilfssatz 2 of §15, and by substituting Hilfssatz 2 for Hilfssatz 3 of §15 in the proof of Satz 5 contained in §16. 3Intuitively, since the Clenshaw–Curtis weights must sum to a constant, and not one weight is of paramount importance, this can only occur if they all decay uniformly with $${\mathcal O}(N^{-1})$$. References Abramowitz M. & Stegun I. A. ( 1965) Handbook of Mathematical Functions . New York: Dover. Alpert B. K. & Rokhlin V. ( 1991) A fast algorithm for the evaluation of Legendre expansions. SIAM J. Sci. Stat. Comput. , 12, 158– 179. Google Scholar CrossRef Search ADS   Andrews G. E. Askey R. & Roy R. ( 1998) Special Functions . Cambridge, UK: Cambridge University Press. Bühring W. ( 2000) An asymptotic expansion for a ratio of products of gamma functions. Internat. J. Math. & Math. Sci. , 24, 505– 510. Google Scholar CrossRef Search ADS   Cantero M. J. & Iserles A. ( 2012) On rapid computation of expansions in ultraspherical polynomials. SIAM J. Numer. Anal. , 50, 307– 327. Google Scholar CrossRef Search ADS   Clenshaw C. W. ( 1955) A note on the summation of Chebyshev series. Math. Comp. , 9, 118– 120. Google Scholar CrossRef Search ADS   Frigo M. & Johnson S. G. ( 2005) The design and implementation of FFTW3. Proc. IEEE , 93, 216– 231. Google Scholar CrossRef Search ADS   Gentleman W. M. ( 1969) An error analysis of Goertzel’s (Watt’s) method for computing Fourier coefficients. Comput. J. , 12, 160– 164. Google Scholar CrossRef Search ADS   Graham R. L. Knuth D. E. & Patashnik O. ( 1989) Concrete Mathematics, A Foundation for Computer Science , 2nd edn. Reading, MA: ddison-Wesley. Hahn E. ( 1980) Asymptotik bei Jacobi-polynomen und Jacobi-funktionen. Math. Z. , 171, 201– 226. Google Scholar CrossRef Search ADS   Hale N. & Townsend A. ( 2013) Fast and accurate computation of Gauss–Legendre and Gauss–Jacobi quadrature nodes and weights. SIAM J. Sci. Comput. , 35, A652– A674. Google Scholar CrossRef Search ADS   Hale N. & Townsend A. ( 2014) A fast, simple, and stable Chebyshev–Legendre transform using an asymptotic formula. SIAM J. Sci. Comput. , 36, A148– A167. Google Scholar CrossRef Search ADS   Hale N. & Townsend A. ( 2016) A fast FFT-based discrete Legendre transform. IMA J. Numer. Anal. , 36, 1670– 1684. Google Scholar CrossRef Search ADS   Keiner J. ( 2009) Computing with expansions in Gegenbauer polynomials. SIAM J. Sci. Comput. , 31, 2151– 2171. Google Scholar CrossRef Search ADS   Levrie P. & Piessens R. ( 1985) A note on the evaluation of orthogonal polynomials using recurrence relations. Technical Report 74.  Katholieke Universiteit Leuven. Li H. & Shen J. ( 2010) Optimal error estimates in Jacobi-weighted Sobolev spaces for polynomial approximations on the triangle. Math. Comp. , 79, 1621– 1646. Google Scholar CrossRef Search ADS   Mason J. C. & Handscomb D. C. ( 2002) Chebyshev Polynomials . Boca Raton, FL: CRC Press. Google Scholar CrossRef Search ADS   Mori A. Suda R. & Sugihara M. ( 1999) An improvement on Orszag’s fast algorithm for Legendre polynomial transform. Trans. Info. Process. Soc. Japan , 40, 3612– 3615. Oliver J. ( 1968) The numerical solution of linear recurrence relations. Numer. Math. , 11, 349– 360. Google Scholar CrossRef Search ADS   Oliver J. ( 1977) An error analysis of the modified Clenshaw method for evaluating Chebyshev and Fourier series. IMA J. App. Math. , 20, 379– 391. Google Scholar CrossRef Search ADS   Olver F. W. J. Lozier D. W. Boisvert R. F. & Clark C. W. (eds) ( 2010) NIST Handbook of Mathematical Functions . Cambridge, UK: Cambridge University Press. Orszag S. A. ( 1986) Fast eigenfunction transforms. Science and Computers . New York: Academic Press, pp. 13– 30. Piessens R. ( 1987) Numerical Integration , vol. 203 of NATO ASI Series. Netherlands: Springer, chapter 2, Modified Clenshaw–Curtis Integration and Applications to Numerical Computation of Integral Transforms. Rainville E. ( 1960) Special Functions . New York, NY: MacMillan. Slevinsky R. M. ( 2017) https://github.com/MikaelSlevinsky/FastTransforms.jl. GitHub.. Accessed 9 January 2017. Slevinsky R. M. & Olver S. ( 2017) A fast and well-conditioned spectral method for singular integral equations. J. Comp. Phys. , 332, 290– 315. Google Scholar CrossRef Search ADS   Sloane N. J. A. ( 2016) The On-Line Encyclopedia of Integer Sequences. http://oeis.org. Accessed 9 January 2017. Smith F. J. ( 1965) An algorithm for summing orthogonal polynomial series and their derivatives with applications to curve-fitting and interpolation. Math. Comp. , 19, 33– 36. Google Scholar CrossRef Search ADS   Sommariva A. ( 2013) Fast construction of Fejér and Clenshaw–Curtis rules for general weight functions. Comp. Math. Appl. , 65, 682– 693. Google Scholar CrossRef Search ADS   Townsend A. Webb M. & Olver S. ( 2016) Fast polynomial transforms based on Toeplitz and Hankel matrices. arXiv:1604.07486. Trefethen L. N. ( 2012) Approximation Theory and Approximation Practice . Philadelphia, PA: SIAM. Waldvogel J. ( 2003) Fast construction of the Fejér and Clenshaw–Curtis quadrature rules. BIT Numer. Math. , 43, 001– 018. Google Scholar CrossRef Search ADS   Wang H. & Huybrechs D. ( 2014) Fast and accurate computation of Jacobi expansion coefficients of analytic functions. arXiv:1404.2463v1. Wimp J. McCabe P. & Connor J. N. L. ( 1997) Computation of Jacobi functions of the second kind for use in nearside–farside scattering theory. J. Comp. Appl. Math. , 82, 447– 464. Google Scholar CrossRef Search ADS   Xiang S. He G. & Wang H. ( 2014) On fast and stable implementation of Clenshaw–Curtis and Fejér-type quadrature rules. Abst. Appl. Anal. , 2014, 10. © The authors 2017. Published by Oxford University Press on behalf of the Institute of Mathematics and its Applications. All rights reserved. http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png IMA Journal of Numerical Analysis Oxford University Press

# On the use of Hahn’s asymptotic formula and stabilized recurrence for a fast, simple and stable Chebyshev–Jacobi transform

, Volume 38 (1) – Jan 1, 2018
23 pages

/lp/ou_press/on-the-use-of-hahn-s-asymptotic-formula-and-stabilized-recurrence-for-m06XNdCDUZ
Publisher
Oxford University Press
ISSN
0272-4979
eISSN
1464-3642
D.O.I.
10.1093/imanum/drw070
Publisher site
See Article on Publisher Site

### Abstract

Abstract We describe a fast, simple and stable transform of Chebyshev expansion coefficients to Jacobi expansion coefficients, and its inverse based on the numerical evaluation of Jacobi expansions at the Chebyshev–Lobatto points. This is achieved via decomposition of Hahn’s interior asymptotic formula into a small sum of diagonally scaled discrete sine and cosine transforms and the use of stable recurrence relations. It is known that the Clenshaw–Smith algorithm is not uniformly stable on the entire interval of orthogonality. Therefore, Reinsch’s modification is extended for Jacobi polynomials and employed near the endpoints to improve numerical stability. 1. Introduction Chebyshev expansions:   pN(x)=∑n=0NcnchebTn(x),x∈[−1,1], (1.1) where $$T_n(\cos\theta) = \cos(n\theta)$$ are ubiquitous in numerical analysis, approximation theory and pseudo-spectral methods for their near-best approximation, fast evaluation via the discrete cosine transform and fast linear algebra for Chebyshev spectral methods, among the many other properties that facilitate their convenient use (see e.g. Mason & Handscomb (2002); Olver et al. (2010); Trefethen (2012)). Jacobi expansions:   pN(x)=∑n=0NcnjacPn(α,β)(x),x∈[−1,1],α,β>−1, (1.2) also have useful properties. The Jacobi polynomials are orthogonal with respect to $$L^2([-1,1],w^{(\alpha,\beta)}(x){\rm\,d}x)$$, where $$w^{(\alpha,\beta)}(x) = (1-x)^\alpha(1+x)^\beta$$ is the Jacobi weight. Jacobi expansions are therefore useful in pseudo-spectral methods, where it is more natural to measure the error in Jacobi-weighted Hilbert spaces (see Li & Shen (2010)). Also Wimp et al. (1997) show that the Jacobi-weighted finite Hilbert and Cauchy transforms are diagonalized by Jacobi polynomials. For $$N\in\mathbb{N}$$, define $$\boldsymbol{\theta}_N^{\rm cheb}$$ as the vector of $$N+1$$ equally spaced angles:   [θNcheb]n=πnN,n=0,…,N, (1.3) and the vector of $$N+1$$ Chebyshev–Lobatto points $${\bf x}_N^{\rm cheb} = \cos\boldsymbol{\theta}_N^{\rm cheb}$$. We express the vectors of the evaluation of the expansion (1.1) and (1.2) at $${\bf x}_N^{\rm cheb}$$ as the equality of the matrix-vector products:   pN(xNcheb)=TN(xNcheb)cNcheb=PN(α,β)(xNcheb)cNjac, (1.4) where the entries of the matrices are   [TN(xNcheb)]i,j=Ti−1([xNcheb]j−1),[PN(α,β)(xNcheb)]i,j=Pi−1(α,β)([xNcheb]j−1). (1.5) We define the forward Chebyshev–Jacobi transform as   cNcheb=TN(xNcheb)−1PN(α,β)(xNcheb)cNjac, (1.6) and the inverse Chebyshev–Jacobi transform by   cNjac=PN(α,β)(xNcheb)−1TN(xNcheb)cNcheb. (1.7) 1.1 Previous work on computing Legendre, Gegenbauer and Jacobi expansion coefficients The origins of the method proposed and analyzed in this paper start with the fast eigenfunction transform of Orszag (1986). The novelty of his approach, which is improved by Mori et al. (1999), is that, for large $$N$$, the matrix $${\bf P}_N^{(0,0)}({\bf x}_N^{\rm cheb})$$ is well approximated by a small sum of diagonally scaled discrete cosine transforms of type-I (DCT-I’s) and discrete sine transforms of type-I (DST-I’s). However, by not accounting for the region in the $$N$$-$$x$$ plane where the matrix significantly differs from the interior asymptotics, their initial advances were unstable. Families of orthogonal polynomials are related by the so-called connection coefficients (Andrews et al., 1998, p. 357). The connection coefficients fill in a lower-triangular matrix that allows conversion between two different families of orthogonal polynomials. Alpert & Rokhlin (1991) leverage the asymptotically smooth functions which define the connection coefficients between Chebyshev and Legendre polynomials for an $${\mathcal O}(N\log N)$$ hierarchical approach to the Chebyshev–Legendre transform. This hierarchical approach has been extended by Keiner (2009) for expansions in Gegenbauer polynomials. When transforming polynomial expansions of analytic functions, an alternative approach to the hierarchical decomposition of the connection coefficients can be used. With geometric decay in the coefficients of both the source expansion and the target expansion, the algebraic off-diagonal decay of the connection coefficients has been used by Cantero & Iserles (2012), and Wang & Huybrechs (2014) for $${\mathcal O}(N\log N+MN)$$ Gegenbauer and Jacobi expansion coefficients of analytic functions, where $$M\in\mathbb{N}$$ is a parameter. In principle, the hierarchical approach of Alpert & Rokhlin (1991) can be adapted to the Jacobi connection coefficients for an $${\mathcal O}(N\log N)$$ algorithm. However, this approach will also be saddled with the same expensive pre-computation of the hierarchical matrix. Instead, we extend the approach of Hale & Townsend (2014) by developing a fast and numerically stable evaluation of Jacobi polynomials at the Chebyshev–Lobatto points. This approach does not have expensive pre-computation, nor does it require analyticity of the function underlying the expansion. Indeed, the transform produces high absolute accuracy for expansion coefficients of a function with any regularity $${\mathcal C}^\rho[-1,1]$$, $$\rho\ge0$$. In exchange, we accept an asymptotically slower algorithm. Hale & Townsend (2014) advocate for a modification of the approach of Mori et al. (1999) based on block partitioning of the matrix $${\bf P}_N^{(0,0)}({\bf x}_N^{\rm cheb})$$ into an $${\mathcal O}(\log N/\log\log N)$$ number of partitions within which the interior asymptotics of the Legendre polynomials are guaranteed to be accurate and the remainder of the matrix is evaluated via recurrence relations. The balancing of operations between stable fast transforms in the blocks with the recurrence relations leads to the complexity $${\mathcal O}(N\log^2N/\log\log N)$$. While asymptotically slower than the hierarchical decomposition of Alpert & Rokhlin (1991), Hale and Townsend advocate that the partitioning algorithm is a practical alternative with a smaller setup cost. Hale & Townsend (2014) leave behind a mystery regarding the discrepancy in the numerically computed error in the coefficients and the theoretical estimates based on model coefficients. In particular, they show that for Legendre coefficients $$[{\bf c}_N^{\rm leg}]_n = {\mathcal O}(n^{-r})$$, and for some $$r\in\mathbb{R}$$, the sup-norm in applying $${\bf P}_N^{(0,0)}({\bf x}_N^{\rm cheb})$$ is asymptotically:   ‖PN(0,0)(xNcheb)cNleg‖∞={O(N1−r),r<1,O(log⁡N),r=1,O(1),r>1, asN→∞, (1.8) but in their numerical experiments they observed larger errors:   Observed  Error={O(N32−r/log⁡N),r=0,12,1,O(1),r=32, asN→∞. (1.9) For the Chebyshev–Legendre transform and more generally for the Chebyshev–Jacobi transform, this mystery is solved here by an extension of Reinsch’s modification of the Clenshaw–Smith algorithm to the Jacobi polynomials. It is known that the Clenshaw–Smith algorithm is not uniformly stable on the entire interval of orthogonality, i.e., the error bound of the recurrence relation is spatially dependent. In particular, the loss of accuracy near the endpoints of the interval $$[-1,1]$$ is significant. Reinsch suggested a modification of Clenshaw’s algorithm near the endpoints; the modification is extended by Levrie & Piessens (1985) to the Clenshaw–Smith algorithm for Legendre, ultraspherical, and Laguerre polynomials; and here, we extend it to Jacobi polynomials. 1.2 General definitions and properties The Gamma function is defined for all $$\Re z>0$$ by Abramowitz & Stegun (1965):   Γ(z)=∫0∞xz−1e−xdx, (1.10) and it is analytically continued to $$z\in\mathbb{C}\setminus\{-\mathbb{N}_0\}$$ by the property $${\it{\Gamma}}(z+1) = z{\it{\Gamma}}(z)$$. The Pochhammer symbol is then defined by Abramowitz & Stegun (1965):   (x)n=Γ(x+n)Γ(x), (1.11) and the beta function is defined similarly by Abramowitz & Stegun (1965):   B(x,y)=Γ(x)Γ(y)Γ(x+y). (1.12) Jacobi polynomials have the Rodrigues formula (Olver et al., 2010, §18.5):   Pn(α,β)(x)=(−1)n2nn!(1−x)−α(1+x)−βdndxn((1−x)α(1+x)β(1−x2)n); (1.13) their values at $$x=\pm1$$ are known:   Pn(α,β)(1)=(n+αn),Pn(α,β)(−1)=(−1)n(n+βn); (1.14) and, they satisfy the symmetry relation:   Pn(α,β)(x)=(−1)nPn(β,α)(−x). (1.15) Their three-term recurrence relation is given by:   Pn+1(α,β)(x)=(Anx+Bn)Pn(α,β)(x)−CnPn−1(α,β(x),P−1(α,β)(x)=0,P0(α,β)(x)=1, (1.16) where the recurrence coefficients are given by (Olver et al., 2010 §18.9.2):   An =(2n+α+β+1)(2n+α+β+2)2(n+1)(n+α+β+1), (1.17)  Bn =(α2−β2)(2n+α+β+1)2(n+1)(n+α+β+1)(2n+α+β), (1.18)  Cn =(n+α)(n+β)(2n+α+β+2)(n+1)(n+α+β+1)(2n+α+β). (1.19) The relation between Jacobi polynomials of differing parameters,   (α+β+2n+1)Pn(α,β)(x)=(α+β+n+1)Pn(α,β+1)(x)+(α+n)Pn−1(α,β+1)(x), (1.20) combined with the symmetry relation (1.15), allows for integer-valued increments and decrements of parameters with linear complexity in the degree. Lemma 1.1 (Wang & Huybrechs (2014)) Assume that   Pn(γ,δ)(x)=∑k=0ncn,k(α,β,γ,δ)Pk(α,β)(x). (1.21) Then the coefficients $$c_{n,k}^{(\alpha,\beta,\gamma,\delta)}$$ are given by   cn,k(α,β,γ,δ) =(n+γ+δ+1)k(k+γ+1)n−k(2k+α+β+1)Γ(k+α+β+1)(n−k)!Γ(2k+α+β+2) ×3F2(k−n,n+k+γ+δ+1,k+α+1k+γ+1,2k+α+β+2;1), (1.22) where $$\,_3F_2$$ is a generalized hypergeometric function (Olver et al., 2010 §16.2.1). 2. The forward transform: Jacobi to Chebyshev In this section, we extend the algorithm of Hale & Townsend (2014) for the Chebyshev–Legendre transform to the Chebyshev–Jacobi transform by deriving a fast algorithm to compute   cNcheb=TN(xNcheb)−1PN(α,β)(xNcheb)cNjac.$${\bf T}_N({\bf x}_N^{\rm cheb})$$ is a diagonally scaled DCT-I that can be applied and inverted in $${\mathcal O}(N\log N)$$ operations. 2.1 Interior asymptotics of Jacobi polynomials The interior asymptotics of Jacobi polynomials are given by Hahn (1980). Given $$M\in\mathbb{N}_0$$:   Pn(α,β)(cos⁡θ)=∑m=0M−1Cn,mα,βfm(θ)+Rn,Mα,β(θ), (2.1) where   Cn,mα,β =22n−m+α+β+1B(n+α+1,n+β+1)π(2n+α+β+2)m, (2.2)  fm(θ) =∑l=0m(12+α)l(12−α)l(12+β)m−l(12−β)m−ll!(m−l)!cos⁡θn,m,lsinl+α+12⁡(θ2)cosm−l+β+12⁡(θ2), (2.3)  θn,m,l =12(2n+α+β+m+1)θ−(α+l+12)π2, (2.4) and where $$x=\cos\theta$$. For $$\theta\in(\tfrac{\pi}{3},\tfrac{2\pi}{3})$$, the summation converges as $$M\to\infty$$, and for $$(\alpha,\beta)\in(-\tfrac{1}{2},\tfrac{1}{2}]^2$$, and for $$n\ge 2$$, the remainder $$R_{n,M}^{\alpha,\beta}(\theta)$$ is described in §2.2. Rewriting $$\theta_{n,m,l}$$ as   θn,m,l =12(2n+α+β+m+1)θ−(α+l+12)π2, (2.5)   =nθ+(α+β+m+1)θ2−(α+l+12)π2, (2.6)   =nθ−θm,l, (2.7) allows us to insert the cosine addition formula into the asymptotic formula (2.1). The result is   Pn(α,β)(cos⁡θ)=∑m=0M−1(um(θ)cos⁡nθ+vm(θ)sin⁡nθ)Cn,mα,β+Rn,Mα,β(θ), (2.8) where   um(θ) =∑l=0m(12+α)l(12−α)l(12+β)m−l(12−β)m−ll!(m−l)!cos⁡θm,lsinl+α+12⁡(θ2)cosm−l+β+12⁡(θ2), (2.9)  vm(θ) =∑l=0m(12+α)l(12−α)l(12+β)m−l(12−β)m−ll!(m−l)!sin⁡θm,lsinl+α+12⁡(θ2)cosm−l+β+12⁡(θ2). (2.10) Since in (2.8), $$\cos n\theta$$ and $$\sin n\theta$$ are the only terms that depend simultaneously and inextricably on both $$n$$ and $$\theta$$, the matrix $${\bf P}_N^{(\alpha,\beta)}({\bf x}_N^{\rm cheb})$$ of (1.4) can be expressed in the compact form:   PN(α,β)(xNcheb)ASY=∑m=0M−1(Dum(θNcheb)TN(xNcheb)+Dvm(θNcheb)sin⁡(θNcheb[0,…,N]⊤))DCn,mα,β+RMα,β(θNcheb). (2.11) Here, $${\bf D}_{u_m(\boldsymbol{\theta}_N^{\rm cheb})}$$ and $${\bf D}_{v_m(\boldsymbol{\theta}_N^{\rm cheb})}$$ denote diagonal matrices whose entries correspond to $$u_m$$ and $$v_m$$ evaluated at the equally spaced angles $$\boldsymbol{\theta}_N^{\rm cheb}$$, $${\bf D}_{C_{n,m}^{\alpha,\beta}}$$ is the diagonal matrix whose entries consist of $$C_{n,m}^{\alpha,\beta}$$ for $$n=0,\ldots,N$$, and $${\bf R}_M^{\alpha,\beta}(\boldsymbol{\theta}_N^{\rm cheb})$$ is the matrix of remainders. Since $${\bf T}_N({\bf x}_N^{\rm cheb})$$ is a diagonally scaled DCT-I and $$\sin(\boldsymbol{\theta}_N^{\rm cheb} [0,\ldots,N]^\top)$$ is a diagonally scaled DST-I bordered by zeros, the matrix $${\bf P}_N^{(\alpha,\beta)}({\bf x}_N^{\rm cheb})^{\rm ASY}$$ can be applied in $${\mathcal O}(M N\log N + M^2N)$$ operations. However, for low degree or for $$\theta\approx0$$ or $$\theta\approx\pi$$, the approximation of $${\bf P}_N^{(\alpha,\beta)}({\bf x}_N^{\rm cheb})$$ by $${\bf P}_N^{(\alpha,\beta)}({\bf x}_N^{\rm cheb})^{\rm ASY}$$ incurs an unacceptably large error. Therefore, we restrict the applicability of the matrix $${\bf P}_N^{(\alpha,\beta)}({\bf x}_N^{\rm cheb})^{\rm ASY}$$ to the region in the $$n$$-$$\theta$$ plane, where the remainder is guaranteed to be below a tolerance $$\varepsilon$$ and we use recurrence relations to stably fill in1 the remaining entries of the matrix $${\bf P}_N^{(\alpha,\beta)}({\bf x}_N^{\rm cheb})$$. 2.2 Partitioning the matrix $${\bf P}_N^{(\alpha,\beta)}({\bf x}_N^{\rm cheb})$$ For $$\theta\in(0,\pi)$$, and for every $$m\in\mathbb{N}_0$$, let $$g_m(\theta)$$ be the positive function:   gm(θ)=∑l=0m(12+α)l(12−α)l(12+β)m−l(12−β)m−ll!(m−l)!cosl−m−β−12⁡(θ2)sinl+α+12⁡(θ2). (2.12) Then, for $$(\alpha,\beta)\in(-\tfrac{1}{2},\tfrac{1}{2}]^2$$ and for $$n\ge 2$$, the remainder in (2.1) is bounded by:2  |Rn,Mα,β(θ)|<2Cn,Mα,βgM(θ). (2.13) For large $$n$$, the following leading order asymptotics are valid:   2Cn,Mα,β =22n−M+α+β+2πΓ(n+α+1)Γ(n+β+1)Γ(2n+α+β+M+2), (2.14)   ∼22n−M+α+β+2π2πn+α(n+αe)n+α2πn+β(n+βe)n+β2π2n+α+β+M+1(2n+α+β+M+1e)2n+α+β+M+1, (2.15)   ∼122M−1πnM+12,asn→∞. (2.16) Therefore, if we set the remainder to $$\varepsilon$$, this will define a curve in the $$n$$-$$\theta$$ plane for every $$M$$ given by   n≈⌊(ε22M−1πgM(θ))−1M+12⌋. (2.17) Since $$g_m(\theta)\in{\mathcal C}(0,\pi)$$ and   limθ→0+gm(θ)=limθ→π−gm(θ)=+∞, (2.18) the Weierstrass extreme value theorem ensures the existence of a global minimizer   θ^= argminθ∈(0,π)⁡ gm(θ). (2.19) Therefore, we find the discrete global minimizer:   θ¯= argminθ∈θNchebgm(θ)≈θ^, (2.20) and collect contiguous angles such that the error in evaluating the asymptotic expansion is guaranteed to be below $$\varepsilon$$. Following Hale & Townsend (2014), we define $$n_M\in\mathbb{N}_0$$:   nM:=⌊(ε22M−1πgM(π/2))−1M+12⌋, (2.21) and we set:   αN:=min(1log⁡(N/nM),1/2)andK:=⌈log⁡(N/nM)log⁡(1/αN)⌉. (2.22) For $$k=0,\ldots,K$$, while $$j_k := \alpha_N^kN > n_M$$, we compute the indices $$i_k^1,i_k^2$$ within which the remainder falls below the tolerance $$\varepsilon$$ and whose angles bound the discrete global minimizer $$\bar{\theta}$$:   [Rjk,Mα,β(θNcheb)]i<ε∀i∈{ik1,…,ik2},[θNcheb]ik1<θ¯<[θNcheb]ik2. (2.23) We require the following lemma. Lemma 2.1 Let $$(\alpha,\beta)\in(-\tfrac{1}{2},\tfrac{1}{2}]^2$$. Then for every $$M\ge2$$:   Cn+1,Mα,β≤Cn,Mα,β. (2.24) Proof. We have   inf(α,β)∈(−12,12]2(α+β)=−1,maxα∈(−12,12]α=12,maxβ∈(−12,12]β=12. (2.25) Using (2.2)   Cn+1,Mα,β=(2n+2α+2)(2n+2β+2)Cn,Mα,β(2n+α+β+M+3)(2n+α+β+M+2)≤(2n+3)2Cn,Mα,β(2n+M+2)(2n+M+1)≤Cn,Mα,β. (2.26) □ Lemma 2.1 guarantees that if $$M\,{\ge}\,2$$, the bound on the magnitude of the remainder $$|R_{n,M}^{\alpha,\beta}(\theta)| \,{<} 2C_{n,M}^{\alpha,\beta}g_m(\theta)$$ is a nonincreasing function of $$n$$. Therefore, the determination of the indices ensures the accuracy of the asymptotic formula within the rectangles $$[[\boldsymbol{\theta}_N^{\rm cheb}]_{i_k^1},[\boldsymbol{\theta}_N^{\rm cheb}]_{i_k^2}]\times[j_{k},j_{k-1}]$$, for $$k=1,\ldots,K$$, as depicted in Fig. 1. Fig. 1. View largeDownload slide The absolute error in $${\bf P}_N^{(1/8,3/8)}({\bf x}_N^{\rm cheb})$$ using the asymptotic formula (2.1) with $$M = 7$$ (left) and $$M=13$$ (right). In both plots the colour denotes the absolute error on a logarithmic scale; the curves represent the approximate region of accuracy of the asymptotic formula to $$\varepsilon\approx 2.2204\times10^{-16}$$ determined by (2.17), whereas the boxes denote the numerically determined indices $$i_k^1,i_k^2,j_k$$ for $$k=0,1,2$$ (left) and $$k=0,1,2,3$$ (right), such that the remainder is certainly below $$\varepsilon$$. Fig. 1. View largeDownload slide The absolute error in $${\bf P}_N^{(1/8,3/8)}({\bf x}_N^{\rm cheb})$$ using the asymptotic formula (2.1) with $$M = 7$$ (left) and $$M=13$$ (right). In both plots the colour denotes the absolute error on a logarithmic scale; the curves represent the approximate region of accuracy of the asymptotic formula to $$\varepsilon\approx 2.2204\times10^{-16}$$ determined by (2.17), whereas the boxes denote the numerically determined indices $$i_k^1,i_k^2,j_k$$ for $$k=0,1,2$$ (left) and $$k=0,1,2,3$$ (right), such that the remainder is certainly below $$\varepsilon$$. Lastly, for $$k=1,\ldots,K$$, we define   PN(α,β)(xNcheb)ASY,k=diag⁡(00:ik1−1,1ik1:ik2,0ik2+1:N)PN(α,β)(xNcheb)ASYdiag⁡(00:jk−1,1jk:jk−1,0jk−1+1:N), (2.27) to be the matrices of the asymptotic formula (2.11) within $$[[\boldsymbol{\theta}_N^{\rm cheb}]_{i_k^1},[\boldsymbol{\theta}_N^{\rm cheb}]_{i_k^2}]\times[j_k,j_{k-1}]$$, and:   [PN(α,β)(xNcheb)REC]i,j={PN(α,β)(xNcheb)i,j,i<ik1 or i>ik2,j<jk−1,k=1,…,K,PN(α,β)(xNcheb)i,j,iK1≤i≤iK2,j<jK,0,otherwise,  (2.28) which is computed via recurrence relations. Then, the numerically stable formula   PN(α,β)(xNcheb)=PN(α,β)(xNcheb)REC+∑k=1KPN(α,β)(xNcheb)ASY,k, (2.29) can be computed in $${\mathcal O}(N\log^2N/\log\log N)$$ operations. For detailed leading order estimates, see Appendix A. 2.3 Error analysis for model coefficients Consider a set of coefficients satisfying $$[{\bf c}_N^{\rm jac}]_n = {\mathcal O}(n^{-r})$$, for some $$r\in\mathbb{R}$$. We can estimate the sup-norm of the error in the forward transform (1.6) by estimating the error in applying the matrix $${\bf P}_N^{(\alpha,\beta)}({\bf x}_N^{\rm cheb})$$. Define $$D_N^r := \operatorname{diag}(1,1^r,\ldots,N^r)$$, then   ‖PN(α,β)(xNcheb)cNjac‖∞≤‖PN(α,β)(xNcheb)DN−r‖∞‖DnrcNjac‖∞. (2.30) Since   maxx∈xNcheb|Pn(α,β)(x)|=max{|Pn(α,β)(1)|,|Pn(α,β)(−1)|}=max{(n+αn),(n+βn)}=(n+max{α,β}n), (2.31) we can estimate the first term in (2.30) as follows:   ‖PN(α,β)(xNcheb)DN−r‖∞ =O(1+∑n=1N(n+max{α,β}n)n−r),asN→∞, (2.32)   =O(HN,r−max{α,β}),asN→∞, (2.33) where $$H_{N,r}$$ are the generalized harmonic numbers (Graham et al., 1989). Using their asymptotics,   ‖PN(α,β)(xNcheb)cNjac‖∞={O(N1+max{α,β}−r),r<1+max{α,β},O(log⁡N),r=1+max{α,β},O(1),r>1+max{α,β}.  (2.34) In the case $$(\alpha,\beta)=(0,0)$$, the error analysis reduces to that of the Chebyshev–Legendre transform derived in Hale & Townsend (2014). 3. The inverse transform: Chebyshev to Jacobi It is impractical to compute the inverse transform (1.7):   cNjac=PN(α,β)(xNcheb)−1TN(xNcheb)cNcheb, directly due to the occurrence of the inverse of the matrix $${\bf P}_N^{(\alpha,\beta)}({\bf x}_N^{\rm cheb}){}$$. Instead, following Hale & Townsend (2014), we use the transpose of the asymptotic formula (2.11) in conjunction with the integral definition of the Jacobi coefficients:   [cNjac]n=1Anα,β∫−11pN(x)Pn(α,β)(x)w(α,β)(x)dx,n=0,…,N, (3.1) where $$p_N(x)$$ is defined by (1.1), and where $$\mathscr{A}_n^{\alpha,\beta}$$, defined by Olver et al. (2010, §18.3.1), are the orthonormalization constants of the Jacobi polynomials:   Anα,β=∫−11Pn(α,β)(x)2w(α,β)(x)dx=2α+β+1Γ(n+α+1)Γ(n+β+1)(2n+α+β+1)Γ(n+α+β+1)n!. (3.2) Since $$\deg(p_N(x))\le N$$, its product with $$P_N^{(\alpha,\beta)}(x)$$ will be integrated exactly by the $$2N+1$$-point Clenshaw–Curtis quadrature rule with the Jacobi weight $$w^{(\alpha,\beta)}(x)$$. 3.1 Clenshaw–Curtis quadrature Clenshaw–Curtis quadrature is a quadrature rule (Waldvogel, 2003; Sommariva, 2013) whose nodes are the $$N+1$$ Chebyshev–Lobatto points $${\bf x}_N^{\rm cheb}$$. Given a continuous weight function $$w(x)\in{\mathcal C}(-1,1)$$ and $$\mathbb{P}_N$$, the space of algebraic polynomials of degree at most $$N$$, the weight vector $${\bf w}_N$$ is designed by the equality:   ∫−11f(x)w(x)dx=wN⊤f(xNcheb),∀f∈PN. (3.3) With the modified Chebyshev moments of the weight function $$w(x)$$:   μn=∫−11Tn(x)w(x)dx,n=0,…,N, (3.4) the weights $${\bf w}_N$$ can be determined via the formula:   [wN]n=1−12(δ0,n+δN,n)N{μ0+(−1)nμN+2∑k=1N−1μkcos⁡[πkn/N]}, (3.5) where $$\delta_{k,j}$$ is the Kronecker delta (Abramowitz & Stegun, 1965). Due to this representation, the $${\mathcal O}(N\log N)$$ computation of the weights from modified Chebyshev moments is achieved via a diagonally scaled DCT-I. For the Jacobi weight, the modified Chebyshev moments are known explicitly (Piessens, 1987):   μn(α,β)=∫−11Tn(x)w(α,β)(x)dx=2α+β+1B(α+1,β+1)3F2(n,−n,α+112,α+β+2;1), (3.6) where $$\,_3F_2$$ is a generalized hypergeometric function (Olver et al., 2010, §16.2.1). Using Sister Celine’s technique (Rainville, 1960,§127) or induction (Xiang et al., 2014), a recurrence relation can be derived for the modified moments:   μ0(α,β)=2α+β+1B(α+1,β+1),μ1(α,β)=β−αα+β+2μ0(α,β), (3.7)  (α+β+n+2)μn+1(α,β)+2(α−β)μn(α,β)+(α+β−n+2)μn−1(α,β)=0,forn>0. (3.8) It is known that for $$\alpha>\beta$$ and $$\beta = -\tfrac{1}{2} + \mathbb{N}_0$$ or for $$\alpha<\beta$$ and $$\alpha = -\tfrac{1}{2} + \mathbb{N}_0$$, neither forward nor backward recurrence is stable. This has been addressed by Xiang et al. (2014) by transforming the initial value problem into a boundary value problem with a sufficiently accurate asymptotic expansion for $$\mu_N^{(\alpha,\beta)}$$ and subsequent use of Oliver’s algorithm (Oliver (1968)), i.e., the LU decomposition of a tridiagonal matrix. However, the recurrence relation is stable in the forward direction in the half-open square $$(\alpha,\beta)\in(-\tfrac{1}{2},\tfrac{1}{2}]^2$$, and in light of the linear complexity of integer-valued decrements, Oliver’s algorithm is not required in the present context. Once the modified Chebyshev moments $$\mu_n^{(\alpha,\beta)}$$ are computed, the Clenshaw–Curtis weights $${\bf w}_N^{(\alpha,\beta)}$$ follow via a diagonally scaled DCT-I. 3.2 The transpose of the asymptotic formula Since $$\deg(p_N(x)P_n^{(\alpha,\beta)}(x))\le 2N$$, the $$2N+1$$-point, Clenshaw–Curtis quadrature rule yields:   [cNjac]n=1Anα,β(w2N(α,β))⊤(pN(x2Ncheb)Pn(α,β)(x2Ncheb)),n=0,…,N, (3.9) where $$p_N(x)$$ is defined by (1.1) and where $$\mathscr{A}_n^{\alpha,\beta}$$ is given by (3.2). If we let the vector $$[{\bf s}_{2N}]_n = (\mathscr{A}_n^{\alpha,\beta})^{-1}$$ for $$n=0,\ldots,2N$$, then we can rewrite this in matrix form:   cNjac=[IN+1|0N]Ds2NP2N(α,β)(x2Ncheb)⊤Dw2N(α,β)T2N(x2Ncheb)⊤[IN+10N]cNcheb, (3.10) where $${\bf D}_{{\bf s}_{2N}}$$ and $${\bf D}_{{\bf w}_{2N}^{(\alpha,\beta)}}$$ denote diagonal matrices whose entries correspond to $${\bf s}_{2N}$$ and $${\bf w}_{2N}^{(\alpha,\beta)}$$, respectively. Clenshaw–Curtis quadrature allows us to express the Jacobi coefficients in terms of transposed matrices rather than inverse matrices. In order to complete our formulation, we use the transpose of (2.11), given by   P2N(α,β)(x2Ncheb)ASY,⊤=∑m=0M−1DCn,mα,β(T2N(x2Ncheb)⊤Dum(θ2Ncheb)+sin⁡(θ2Ncheb[0,…,2N]⊤)⊤Dvm(θ2Ncheb))+RMα,β(θ2Ncheb)⊤. (3.11) Given the ordering of the points $${\bf x}_{2N}^{\rm cheb}$$, the transposed DCT-I $${\bf T}_{2N}({\bf x}_{2N}^{\rm cheb})^\top$$ and the transposed DST-I bordered by zeros $$\sin(\boldsymbol{\theta}_{2N}^{\rm cheb}[0,\ldots,2N]^\top)$$ are symmetric, allowing for the same implementation as the forward Chebyshev–Jacobi transform. Similarly, the constants $$n_M$$, $$\alpha_N$$, $$K$$, and the indices $$i_k^1$$, $$i_k^2$$ and $$j_k$$ can be computed as in the forward transform, with the substitution $$N\to 2N$$. Therefore, the transpose of the asymptotic formula, combined with recurrence relations, can be used for a numerically stable partition and evaluation of the inverse transform. 3.3 Error analysis for model coefficients Consider a set of coefficients satisfying $$[{\bf c}_N^{\rm cheb}]_n = {\mathcal O}(n^{-r})$$, for some $$r\in\mathbb{R}$$. We can estimate the sup-norm of the error in the inverse transform (1.7) by estimating the error in the transpose formula (3.10). Using $$D_N^r := \operatorname{diag}(1,1^r,\ldots,N^r)$$ again, then:   ‖cNjac‖∞ =‖[IN+1|0N]Ds2NP2N(α,β)(x2Ncheb)⊤Dw2N(α,β)T2N(x2Ncheb)⊤[IN+10N]cNcheb‖∞ (3.12)   ≤‖Ds2NP2N(α,β)(x2Ncheb)⊤‖∞‖Dw2N(α,β)‖∞‖T2N(x2Ncheb)⊤[DN−r0N]‖∞‖DNrcNcheb‖∞. (3.13) Using the bound on the Jacobi polynomials (2.31), we can formulate the asymptotics of the sup-norm involving the transposed matrix $${\bf P}_{2N}^{(\alpha,\beta)}({\bf x}_{2N}^{\rm cheb})^\top$$ and its diagonal scaling. Since $$\mathscr{A}_n^{\alpha,\beta} = {\mathcal O}(N^{-1})$$, as can be seen from (3.2), we have   ‖Ds2NP2N(α,β)(x2Ncheb)⊤‖∞ ≤C‖D2N1P2N(α,β)(x2Ncheb)⊤‖∞,for some C>0, (3.14)   ≤Cmaxn∈{0,…,2N}∑x∈x2Ncheb(δn,0+n)|Pn(α,β)(x)|, (3.15)   ≤Cmaxn∈{0,…,2N}2N(δn,0+n)(n+max{α,β}n), (3.16)   =O(N2+max{α,β}),asN→∞. (3.17) For the Clenshaw–Curtis quadrature weights,   ‖Dw2N(α,β)‖∞=O(N−1),asN→∞, (3.18) as can be seen by (3.5).3 Due to the symmetry of $${\bf T}_{2N}({\bf x}_{2N}^{\rm cheb})$$, we can conclude that   ‖T2N(x2Ncheb)⊤[DN−r0N]‖∞=1+HN,r, (3.19) or   ‖cNjac‖∞={O(N2+max{α,β}−r),r<1,O(N1+max{α,β}log⁡N),r=1,O(N1+max{α,β}),r>1.  (3.20) This growth rate appears larger than our numerical experiments suggest, and this can be attributed to the overestimation of $$\left\| {\bf D}_{{\bf s}_{2N}} {\bf P}_{2N}^{(\alpha,\beta)}({\bf x}_{2N}^{\rm cheb})^\top\right\|_\infty$$: the Jacobi polynomials are significantly smaller than the maximum of their endpoints for the majority of the interior of $$[-1,1]$$. However, without a useful envelope function, we report what can only be an overestimate. 4. Design and implementation As Hale & Townsend (2014) remark, the partitioning implies that the algorithm is trivially parallelized. However, of more immediate concern is the application of the same transform to multiple sets of expansion coefficients. Analogous to the Fastest Fourier Transform in the West (FFTW) of Frigo & Johnson (2005), we divide the computation into Part 1 (Planification) and Part 2 (Execution): (1) Planification (i) computation of the partitioning indices, (ii) planification of the in-place DCT-I and DST-I, (iii) computation of the recurrence coefficients, (iv) allocation of temporary arrays, and (V) computation of the modified weights and orthonormality constants (inverse only). (2) Execution (i) computation of the diagonal matrices $${\bf D}_{u_m(\boldsymbol{\theta}_N^{\rm cheb})}$$, $${\bf D}_{v_m(\boldsymbol{\theta}_N^{\rm cheb})}$$, and $${\bf D}_{C_{n,m}^{\alpha,\beta}}$$, (ii) application of the DCT-I and DST-I, and (iii) execution of the recurrence relations. Since Part 1 is only dependent on the degree and the Jacobi parameters, it is reusable. Therefore, the results of Part 1 are stored in an object called a ChebyshevJacobiPlan. Analogous to FFTW, applying the ChebyshevJacobiPlan to a vector results in execution of Part 2. While it is beneficial to divide the computation as such, the construction of a ChebyshevJacobiPlan is not orders of magnitude larger than the execution, as is the case for other schemes using hierarchical or other complex data structures; our numerical experiments suggest an approximate gain on the order of $$10\%$$. However, the reduction of memory allocation alone could be important in memory-sensitive applications. 4.1 Computational issues Consider the Stirling series for the gamma function (Olver et al., 2010, §5.11.10) on $$z\in\mathbb{R}^+$$:   Γ(z)=2πzz−12e−z(SN(z)+RN(z)),SN(z)=∑n=0N−1anzn,RN(z)≤(1+ζ(N))Γ(N)(2π)N+1zN. (4.1) The sequence $$\{a_n\}_{n\ge0}$$ is defined by the ratio of sequences A001163 and A001164 of Sloane (2016), and $$\zeta$$ is the Riemann zeta function (Olver et al., 2010 §25). Table 1 shows the necessary and sufficient number of terms required of the Stirling series such that $$\tfrac{R_N(z)}{S_N(z)} < \tfrac{\varepsilon}{20} \approx 1.1102\times10^{-17}$$. Taking rounding errors into account, the effect is a relative error below machine precision $$\varepsilon \approx 2.2204\times10^{-16}$$ in double precision arithmetic. Table 1 Number of terms such that $$\tfrac{R_N(z)}{S_N(z)} < \tfrac{\varepsilon}{20} \approx 1.1102\times10^{-17}$$ in double precision arithmetic $$z\ge$$  3275  591  196  92  53  35  26  $$\quad$$$$N$$  4  5  6  7  8  9  10  $$z\ge$$  20  17  14  12  11  10  9  $$\quad$$$$N$$  11  12  13  14  15  16  17  $$z\ge$$  3275  591  196  92  53  35  26  $$\quad$$$$N$$  4  5  6  7  8  9  10  $$z\ge$$  20  17  14  12  11  10  9  $$\quad$$$$N$$  11  12  13  14  15  16  17  Define $$S_{\varepsilon}(z): [9,\infty)\to\mathbb{R}$$ by the truncated Stirling series $$S_N(z)$$ with necessary and sufficient $$N$$ for relative error below $$\varepsilon$$, as determined by Table 1. The coefficients $$C_{n,m}^{\alpha,\beta}$$ of (2.2) can be stably computed by forward recurrence in $$n$$ and $$m$$:   Cn,mα,β=(n+α)(n+β)Cn−1,mα,β(n+α+β+m+12)(n+α+β+m2),andCn,mα,β=Cn,m−1α,β2(2n+α+β+m+1). (4.2) However, to determine the indices $$i_k^1$$ and $$i_k^2$$ for the partitioning of the matrix $${\bf P}_N^{(\alpha,\beta)}({\bf x}_N^{\rm cheb})$$, use of an asymptotic formula is more efficient. Here, we adapt the approach of Hale & Townsend (2013, §3.2.3 & §3.3.1) with suitable modifications. In terms of $$S_\varepsilon(z)$$ defined above, the coefficients $$C_{n,m}^{\alpha,\beta}$$ can be expressed as   Cn,mα,β =22n−m+α+β+12π(n+α+1)n+α+12e−n−α−1(n+β+1)n+β+12e−n−β−1π(2n+m+α+β+2)2n+m+α+β+32e−2n−m−α−β−2 ×Sε(n+α+1)Sε(n+β+1)Sε(2n+m+α+β+2), (4.3)   =em4mπ(1+α−β−m2n+α+β+m+2)n+α+12(1+β−α−m2n+α+β+m+2)n+β+12 ×1nm+12(1+α+β+m+22n)m+12Sε(n+α+1)Sε(n+β+1)Sε(2n+m+α+β+2). (4.4) In (4.4), the terms resembling $$(1+x)^y$$ can be computed stably and efficiently by $$\exp(y\operatorname{log1p} x)$$, where $$\operatorname{log1p}$$ calls the natural logarithm $$\log(1+x)$$ for large arguments and its Taylor series for small arguments. So long as $$n+\min\{\alpha,\beta\}\ge8$$, the asymptotic formula (4.4) for the coefficients $$C_{n,m}^{\alpha,\beta}$$ can be used for a fast and stable numerical evaluation, and the downward recurrence of (4.2) supplies $$C_{n,m}^{\alpha,\beta}$$ for the handful of remaining values. To compute the $$\mathscr{A}_n^{\alpha,\beta}$$ of (3.2), the asymptotic expansion derived by Bühring (2000) can be used. However, a remainder estimate is not reported and instead we use the same technique as for the computation of the coefficients $$C_{n,m}^{\alpha,\beta}$$:   Anα,β =2α+β+12n+α+β+1(n+α+1)n+α+12(n+β+1)n+β+12(n+α+β+1)n+α+β+12(n+1)n+12Sε(n+α+1)Sε(n+β+1)Sε(n+α+β+1)Sε(n+1), (4.5)   =2α+β+12n+α+β+1(1−βn+α+β+1)n2+α+14(1−αn+α+β+1)n2+β+14 ×(1+αn+1)n2+14(1+βn+1)n2+14Sε(n+α+1)Sε(n+β+1)Sε(n+α+β+1)Sε(n+1). (4.6) Similar to (4.4), (4.6) can be computed stably and efficiently for $$n+\min\{\alpha,\beta,\alpha+\beta,0\}\ge8$$. Note as well the symmetry in both (4.4) and (4.6) upon the substitution $$\alpha\leftrightarrow\beta$$. 4.2 Reinsch’s modification of forward orthogonal polynomial recurrence and the Clenshaw–Smith algorithm In order to evaluate $${\bf P}_N^{(\alpha,\beta)}({\bf x}_N^{\rm cheb})^{\rm REC}$$ and its transpose, recurrence relations are required. Here, we review the recurrence relations for orthogonal polynomials, and derive new relations for stabilized evaluation near the boundary of the interval of orthogonality for Jacobi polynomials. Let an orthogonal polynomial sequence $$\pi_n(x)$$ be defined by the three-term recurrence relation (Olver et al., 2010, §18.9.1):   πn+1(x)=(Anx+Bn)πn(x)−Cnπn−1(x),π−1(x)=0,π0(x)=1. (4.7) The Clenshaw–Smith algorithm writes the sum,   pN(x)=∑n=0Ncnπn(x), (4.8) via an inhomogeneous recurrence relation involving the adjoint of (4.7) as follows: Algorithm 1 (Clenshaw, 1955; Smith, 1965) (1) Set:   uN+1(x)=uN+2(x)=0. (4.9) (2) For $$n=N,N-1,\ldots,0$$:   un(x)=(Anx+Bn)un+1(x)−Cn+1un+2(x)+cn. (4.10) (3) Then:   pN(x)=u0(x). (4.11) After Clenshaw’s original error analysis, it was Gentleman (1969) who first drew attention to the susceptibility of larger rounding errors near the ends of the interval $$[-1,1]$$. In Gentleman’s paper, Reinsch proposed (unpublished) a stabilizing modification, with the error analysis of the modification performed by Oliver (1977). Levrie & Piessens (1985) derive Reinsch’s modification of the Clenshaw–Smith algorithm for Legendre, ultraspherical and Laguerre polynomials. They also derive Reinsch’s modification to the forward orthogonal polynomial recurrence (4.7) for Chebyshev, Legendre, ultraspherical, Jacobi and Laguerre polynomials. Here, we review Reinsch’s modification with more general notation than that of Levrie & Piessens (1985) and extend Reinsch’s modified Clenshaw–Smith algorithm to Jacobi polynomials. Formally, we define the ratio   rnf(x):=πn+1(x)πn(x),forn≥0, (4.12) such that at the point $$x_0$$  rnf(x0)=Anx0+Bn−Cnrn−1f(x0)−1, (4.13) or isolating for $$B_n$$  Bn=rnf(x0)+Cnrn−1f(x0)−1−Anx0. (4.14) Substituting this relationship for $$B_n$$ into the forward recurrence (4.7), we obtain the modified version: Algorithm 2 (1) Set:   π0(x)=1,d0(x)=0. (4.15) (2) For $$n\ge0$$:   dn+1(x) =(An(x−x0)πn(x)+Cndn(x))rnf(x0)−1, (4.16)  πn+1(x) =(πn(x)+dn+1(x))rnf(x0). (4.17) Consider the homogeneous adjoint three-term recurrence:   vn(x)=(Anx+Bn)vn+1(x)−Cn+1vn+2(x),v0(x)=0,v1(x)=1. (4.18) Formally, we define the ratio   rnb(x):=vn+1(x)vn(x),forn>0, (4.19) such that at the point $$x_0$$  rnb(x0)−1=Anx0+Bn−Cn+1rn+1b(x0), (4.20) or isolating for $$B_n$$  Bn=rnb(x0)−1+Cn+1rn+1b(x0)−Anx0. (4.21) Substituting this relationship for $$B_n$$ into the Clenshaw–Smith algorithm, we obtain the modified version: Algorithm 3 (1) Set:   uN+1(x)=dN+1(x)=0. (4.22) (2) For $$n=N,N-1,\ldots,1$$,   dn(x) =(An(x−x0)un+1(x)+Cn+1dn+1(x)+cn)rnb(x0), (4.23)  un(x) =(un+1(x)+dn(x))rnb(x0)−1. (4.24) (3) Then,   pN(x)=A0(x−x0)u1(x)+C1d1(x)+c0. (4.25) The stability of the modified forward recurrence and the modified Clenshaw–Smith algorithm near $$x_0$$ is derived from the geometric damping induced by $$x-x_0$$ and the avoidance of cancellation errors. However, the naïve implementation of the two-term recurrence relations for the ratios (4.13) and (4.20) contains precisely the cancellation errors we were hoping to avoid. Therefore, to complete the stable implementation of the scheme, we require stable evaluation of the ratios $$r^f(x_0)$$ and $$r^b(x_0)$$. For Jacobi polynomials, due to (1.14) and the two-term recurrence of binomials, the ratios $$r_n^f(\pm1)$$ defined by (4.12) are trivial:   rnf(1)=n+α+1n+1,andrnf(−1)=−n+β+1n+1. (4.26) Fortunately, we can also prove the following: Lemma 4.1 For Jacobi polynomials, the ratios $$r_n^b(\pm1)$$ defined by (4.19) are as follows:   rnb(1)=n+1n(α+β+n+1)(α+β+2n)(n+β)(α+β+2n+2),  and  rnb(−1)=−n+1n(α+β+n+1)(α+β+2n)(n+α)(α+β+2n+2). (4.27) Proof. One need only insert the ratios into the relationship (4.20). □ Figure 2 shows the relative error in evaluating $$P_{10,000}^{(0,0)}(\cos\theta)$$ at 10,001 equally spaced angles using the six described algorithms. In Fig. 2, the terms $$x\pm1$$ are computed accurately with the trigonometric identities $$x+1=2\cos^2(\tfrac{\theta}{2})$$ and $$x-1 = -2\sin^2(\tfrac{\theta}{2})$$. While variations in $$\alpha$$ and $$\beta$$ will change the accuracy of all six recurrence relations, we practically take the unmodified algorithms to be more accurate in $$\frac{\pi}{4} < \theta < \frac{3\pi}{4}$$, and the modifications otherwise as the perturbations in the breakpoints are asymptotically of lower order as $$N\to\infty$$. Fig. 2. View largeDownload slide The relative error in evaluating $$P_{10,000}^{(0,0)}(\cos\theta)$$ at 10,001 equally spaced angles using: left, the three-term recurrence relation and Reinsch’s forward modifications for the ends of the interval $$[0,\pi]$$; right, the Clenshaw–Smith algorithm and Reinsch’s backward modifications for the ends of the interval $$[0,\pi]$$. Colour figures available online. Fig. 2. View largeDownload slide The relative error in evaluating $$P_{10,000}^{(0,0)}(\cos\theta)$$ at 10,001 equally spaced angles using: left, the three-term recurrence relation and Reinsch’s forward modifications for the ends of the interval $$[0,\pi]$$; right, the Clenshaw–Smith algorithm and Reinsch’s backward modifications for the ends of the interval $$[0,\pi]$$. Colour figures available online. Note that six algorithms are required in total: the Clenshaw–Smith algorithm and Reinsch’s modifications near $$\pm1$$ for the forward Chebyshev–Jacobi transform, and forward orthogonal polynomial recurrence, and again Reinsch’s modifications near $$\pm1$$ for the inverse Chebyshev–Jacobi transform. 5. Numerical discussion and outlook In principle, the connection coefficients (1.22) are able to provide reference solutions for the maximum absolute error. But, in practice, the naïve algorithm’s quadratic complexity limits the applicability to below about $$N=10^4$$. Therefore, in Fig. 3, we plot the maximum absolute error in transforming Chebyshev expansion coefficients to Jacobi expansion coefficients and back for coefficients simulating an irregular function, and for coefficients simulating a continuous function. The error is similar for the forward–inverse composition. Figure 4 shows the execution time of the forward and inverse transforms in line with the predicted asymptotic complexity $${\mathcal O}(N\log^2N/\log\log N)$$. These calculations are performed on a Mid 2014 Macbook Pro with a $$2.8{\rm GHz}$$ Intel Core i7-4980HQ processor and $$16{\rm GB}$$$$1600{\rm MHz}$$ DDR3 memory. Our implementation (Slevinsky, 2017FastTransforms.jl) in the Julia programming language is freely available online. Fig. 3. View largeDownload slide Maximum absolute error of the inverse Chebyshev–Jacobi transform of the forward Chebyshev–Jacobi transform. Left: For coefficients simulating an irregular function $$[{\bf c}_N]_n \sim U(0,1)$$. Right: For coefficients simulating a continuous function $$[{\bf c}_N]_n \sim U(-1,1)n^{-2}$$. In both plots, the numbers labeling the solid lines refer to different Jacobi parameters and the dashed black lines are asymptotic estimates on the error based on the error analyses. The results are an average over $$10$$ executions. Fig. 3. View largeDownload slide Maximum absolute error of the inverse Chebyshev–Jacobi transform of the forward Chebyshev–Jacobi transform. Left: For coefficients simulating an irregular function $$[{\bf c}_N]_n \sim U(0,1)$$. Right: For coefficients simulating a continuous function $$[{\bf c}_N]_n \sim U(-1,1)n^{-2}$$. In both plots, the numbers labeling the solid lines refer to different Jacobi parameters and the dashed black lines are asymptotic estimates on the error based on the error analyses. The results are an average over $$10$$ executions. Fig. 4. View largeDownload slide Execution time for the Chebyshev–Jacobi transform. Left: The forward Chebyshev–Jacobi transform. Right: The inverse Chebyshev–Jacobi transform. In both plots, the numbers labeling the solid lines refer to the parameter $$M$$ and the dashed black line is the same, showing that the inverse transform takes about twice the time. The results are an average over $$10$$ executions performed with uniformly distributed parameters $$\alpha\sim U(-\tfrac{1}{2},\tfrac{1}{2})$$ and $$\beta\sim U(-\tfrac{1}{2},\tfrac{1}{2})$$. Fig. 4. View largeDownload slide Execution time for the Chebyshev–Jacobi transform. Left: The forward Chebyshev–Jacobi transform. Right: The inverse Chebyshev–Jacobi transform. In both plots, the numbers labeling the solid lines refer to the parameter $$M$$ and the dashed black line is the same, showing that the inverse transform takes about twice the time. The results are an average over $$10$$ executions performed with uniformly distributed parameters $$\alpha\sim U(-\tfrac{1}{2},\tfrac{1}{2})$$ and $$\beta\sim U(-\tfrac{1}{2},\tfrac{1}{2})$$. Composition of the forward and inverse transforms allows for the transform between expansions in Jacobi polynomials of differing parameters. Additionally, use of a nonuniform discrete cosine transform (NDCT) (e.g. Hale & Townsend (2016)) could allow for fast evaluation at the Gauss–Jacobi nodes. However, an efficient NDCT requires points to be close to the Chebyshev points of the first kind, and the inequalities on the zeros of the Jacobi polynomials (Olver et al., 2010,§18.16) seem to be overestimates. The performance of an NDCT may be better in practice than can be currently estimated theoretically. One potential area of application is the extension of the fast and well-conditioned spectral method for solving singular integral equations of Slevinsky & Olver (2017) to polygonal boundaries. Elliptic partial differential equations have angle-dependent algebraic singularities in the densities on polygonal boundaries. It is conjectured that working in the more exotic bases of Jacobi polynomials and Jacobi functions of the second kind can lead to banded representations of singular integral operators defined on polygonal boundaries. Since the integer-valued increments are required for Jacobi parameters beyond $$(\alpha,\beta)\in(-\tfrac{1}{2},\tfrac{1}{2}]^2$$, the method proposed and analyzed here cannot be used for exceedingly large parameters. This is consistent with nonuniformity of Hahn’s asymptotics (2.1) in $$\alpha$$ and $$\beta$$. Therefore, this Chebyshev–Jacobi transform cannot be used for a fast spherical harmonics transform. There are certain parameter régimes where the complexity can be reduced. These are detailed in Appendix B. Townsend et al. (2016) describe a new approach to fast polynomial transforms based on exploiting the structure of certain connection coefficients matrices. In this framework, it is also possible to perform certain Chebyshev–Jacobi and Jacobi–Jacobi transforms. Acknowledgments This paper is dedicated to the celebration of Nick Trefethen on his $$60^{\rm th}$$ birthday and his inspirational contributions to numerical analysis. I wish to thank the referees for their constructive remarks. Funding This work was supported by the Natural Sciences and Engineering Research Council of Canada. A. Appendix: Complexity of $$\boldsymbol{{\bf P}_N^{(\alpha,\beta)}({\bf x}_N^{\rm cheb})^{\rm REC}}$$ In this section, we derive refined estimates on the complexity of applying the matrix $${\bf P}_N^{(\alpha,\beta)}({\bf x}_N^{\rm cheb})^{\rm REC}$$. By artificially partitioning the matrix into rectangular regions, we need to estimate Hale & Townsend, 2014, §3.3),   ∑k=1K−1αNkN(ik+11−ik1+ik2−ik+12), (A.1) to leading order. Fortunately, as $$\theta\to0$$ or $$\theta\to\pi$$, $$g_M(\theta)$$ is its own asymptotic expansion. For brevity, we derive the leading order asymptotics of $$i_k^1$$ and deduce those of $$i_k^2$$ by symmetry. To leading order,   gM(θ)∼gM1(θ)=(12+α)M(12−α)MM!1sinM+α+12⁡θ2,asθ→0. (A.2) Then, to determine the leading order estimate of $$i_k^1$$,   ε∼2Cjk,Mα,βgM1(ik1πN+1), (A.3) or isolating for $$i_k^1$$,   ik1∼⌊2(N+1)πsin−1⁡(((12+α)M(12−α)MεM!22M−1πjkM+12)1M+α+12)⌋,asN→∞. (A.4) Using the fact that $$\sin^{-1}x \sim x$$ as $$x\to0$$, we find,   ik1=O(N×jk−M+12M+α+12)=O(NαM+α+12×αN−kM+12M+α+12),asN→∞. (A.5) Therefore, the sum involving $$i_{k+1}^1$$ and $$i_k^1$$ is, to leading order:   ∑k=1K−1αNkN(ik+11−ik1) =O(NM+2α+12M+α+12∑k=1K−1(αN−M+12M+α+12−1)αNkαM+α+12), (A.6)   =O(KNM+2α+12M+α+12αNM−α+12M+α+12),asN→∞. (A.7) By the symmetry in $$\alpha\leftrightarrow\beta$$ and $$\theta\leftrightarrow\pi-\theta$$, we have   ∑k=1K−1αNkN(ik2−ik+12)=O(KNM+2β+12M+β+12αNM−β+12M+β+12),asN→∞. (A.8) Therefore, the simplified estimate $${\mathcal O}(N\log^2 N/\log\log N)$$ is a local expansion near $$(\alpha,\beta)\approx(0,0)$$, and we observe in Fig. 4 that it holds over $$(\alpha,\beta)\in(-\tfrac{1}{2},\tfrac{1}{2}]^2$$ in practice, so long as $$M\ge5$$. B Appendix: Jacobi parameters resulting in reduced complexity B.1 $$\alpha=\beta=\lambda-\frac{1}{2}$$ In the case that $$\alpha=\beta=\lambda-\tfrac{1}{2}$$, we are a normalization away from the ultraspherical or Gegenbauer polynomials. These asymptotics are given by (Olver et al., 2010 §18.15):   Pn(λ−12,λ−12)(cos⁡θ)=∑m=0M−1Cn,mλcos⁡θn,mλsinm+λ⁡θ+Rn,Mλ(θ). (B.1) Here, we have   Cn,mλ =2λΓ(n+λ+12)πΓ(n+λ+1)(λ)m(1−λ)m2mm!(n+λ+1)m, (B.2)  θn,mλ =(n+m+λ)θ−(m+λ)π2=nθ−(m+λ)(π2−θ), (B.3) and $$x=\cos\theta$$. The coefficients $$C_{n,m}^\lambda$$ can be computed by the recurrence:   Cn,mλ=(λ+m−1)(m−λ)2m(n+λ+m)Cn,m−1λ,Cn,0λ=2λΓ(n+λ+12)πΓ(n+λ+1). (B.4) So long as $$\lambda\in(0,1)$$, the error is bounded by   |Rn,Mλ(θ)|<2Cn,MλsinM+λ⁡θ. (B.5) Therefore, if we set the error to $$\varepsilon$$, this will define a curve in the $$n$$-$$\theta$$ plane for every $$M$$ and $$\lambda$$ given by,   n≈nMλsinM+λM+12⁡θ,nMλ=⌊(επ2MM!2λ+1(λ)M(1−λ)M)−1M+12⌋. (B.6) B.2 $$\alpha=\frac{1}{2}$$ If $$\alpha = \tfrac{1}{2}$$, then the summations in the functions $$f_m(\theta)$$ collapse:   fm(θ)=(12+β)m(12−β)mm!cos⁡θn,m,0sin⁡(θ2)cosm+β+12⁡(θ2). (B.7) B.3 $$\beta=\frac{1}{2}$$ If $$\beta = \tfrac{1}{2}$$, then the summations in the functions $$f_m(\theta)$$ collapse:   fm(θ)=(12+α)m(12−α)mm!cos⁡θn,m,msinm+α+12⁡(θ2)cos⁡(θ2). (B.8) Footnotes 1The Clenshaw–Smith algorithm (Clenshaw, 1955; Smith, 1965) for evaluation of polynomials in orthogonal polynomial bases is used rather than explicitly filling in the matrix. 2The validity of Hahn (1980, Satz 5) can be extended from the open square $$(\alpha,\beta)\in(-\tfrac{1}{2},\tfrac{1}{2})^2$$ to the half-open square $$(\alpha,\beta)\in(-\tfrac{1}{2},\tfrac{1}{2}]^2$$ by substituting the error-free equality $$(1-z)^{-0} = 1$$ for Hilfssatz 2 of §15, and by substituting Hilfssatz 2 for Hilfssatz 3 of §15 in the proof of Satz 5 contained in §16. 3Intuitively, since the Clenshaw–Curtis weights must sum to a constant, and not one weight is of paramount importance, this can only occur if they all decay uniformly with $${\mathcal O}(N^{-1})$$. References Abramowitz M. & Stegun I. A. ( 1965) Handbook of Mathematical Functions . New York: Dover. Alpert B. K. & Rokhlin V. ( 1991) A fast algorithm for the evaluation of Legendre expansions. SIAM J. Sci. Stat. Comput. , 12, 158– 179. Google Scholar CrossRef Search ADS   Andrews G. E. Askey R. & Roy R. ( 1998) Special Functions . Cambridge, UK: Cambridge University Press. Bühring W. ( 2000) An asymptotic expansion for a ratio of products of gamma functions. Internat. J. Math. & Math. Sci. , 24, 505– 510. Google Scholar CrossRef Search ADS   Cantero M. J. & Iserles A. ( 2012) On rapid computation of expansions in ultraspherical polynomials. SIAM J. Numer. Anal. , 50, 307– 327. Google Scholar CrossRef Search ADS   Clenshaw C. W. ( 1955) A note on the summation of Chebyshev series. Math. Comp. , 9, 118– 120. Google Scholar CrossRef Search ADS   Frigo M. & Johnson S. G. ( 2005) The design and implementation of FFTW3. Proc. IEEE , 93, 216– 231. Google Scholar CrossRef Search ADS   Gentleman W. M. ( 1969) An error analysis of Goertzel’s (Watt’s) method for computing Fourier coefficients. Comput. J. , 12, 160– 164. Google Scholar CrossRef Search ADS   Graham R. L. Knuth D. E. & Patashnik O. ( 1989) Concrete Mathematics, A Foundation for Computer Science , 2nd edn. Reading, MA: ddison-Wesley. Hahn E. ( 1980) Asymptotik bei Jacobi-polynomen und Jacobi-funktionen. Math. Z. , 171, 201– 226. Google Scholar CrossRef Search ADS   Hale N. & Townsend A. ( 2013) Fast and accurate computation of Gauss–Legendre and Gauss–Jacobi quadrature nodes and weights. SIAM J. Sci. Comput. , 35, A652– A674. Google Scholar CrossRef Search ADS   Hale N. & Townsend A. ( 2014) A fast, simple, and stable Chebyshev–Legendre transform using an asymptotic formula. SIAM J. Sci. Comput. , 36, A148– A167. Google Scholar CrossRef Search ADS   Hale N. & Townsend A. ( 2016) A fast FFT-based discrete Legendre transform. IMA J. Numer. Anal. , 36, 1670– 1684. Google Scholar CrossRef Search ADS   Keiner J. ( 2009) Computing with expansions in Gegenbauer polynomials. SIAM J. Sci. Comput. , 31, 2151– 2171. Google Scholar CrossRef Search ADS   Levrie P. & Piessens R. ( 1985) A note on the evaluation of orthogonal polynomials using recurrence relations. Technical Report 74.  Katholieke Universiteit Leuven. Li H. & Shen J. ( 2010) Optimal error estimates in Jacobi-weighted Sobolev spaces for polynomial approximations on the triangle. Math. Comp. , 79, 1621– 1646. Google Scholar CrossRef Search ADS   Mason J. C. & Handscomb D. C. ( 2002) Chebyshev Polynomials . Boca Raton, FL: CRC Press. Google Scholar CrossRef Search ADS   Mori A. Suda R. & Sugihara M. ( 1999) An improvement on Orszag’s fast algorithm for Legendre polynomial transform. Trans. Info. Process. Soc. Japan , 40, 3612– 3615. Oliver J. ( 1968) The numerical solution of linear recurrence relations. Numer. Math. , 11, 349– 360. Google Scholar CrossRef Search ADS   Oliver J. ( 1977) An error analysis of the modified Clenshaw method for evaluating Chebyshev and Fourier series. IMA J. App. Math. , 20, 379– 391. Google Scholar CrossRef Search ADS   Olver F. W. J. Lozier D. W. Boisvert R. F. & Clark C. W. (eds) ( 2010) NIST Handbook of Mathematical Functions . Cambridge, UK: Cambridge University Press. Orszag S. A. ( 1986) Fast eigenfunction transforms. Science and Computers . New York: Academic Press, pp. 13– 30. Piessens R. ( 1987) Numerical Integration , vol. 203 of NATO ASI Series. Netherlands: Springer, chapter 2, Modified Clenshaw–Curtis Integration and Applications to Numerical Computation of Integral Transforms. Rainville E. ( 1960) Special Functions . New York, NY: MacMillan. Slevinsky R. M. ( 2017) https://github.com/MikaelSlevinsky/FastTransforms.jl. GitHub.. Accessed 9 January 2017. Slevinsky R. M. & Olver S. ( 2017) A fast and well-conditioned spectral method for singular integral equations. J. Comp. Phys. , 332, 290– 315. Google Scholar CrossRef Search ADS   Sloane N. J. A. ( 2016) The On-Line Encyclopedia of Integer Sequences. http://oeis.org. Accessed 9 January 2017. Smith F. J. ( 1965) An algorithm for summing orthogonal polynomial series and their derivatives with applications to curve-fitting and interpolation. Math. Comp. , 19, 33– 36. Google Scholar CrossRef Search ADS   Sommariva A. ( 2013) Fast construction of Fejér and Clenshaw–Curtis rules for general weight functions. Comp. Math. Appl. , 65, 682– 693. Google Scholar CrossRef Search ADS   Townsend A. Webb M. & Olver S. ( 2016) Fast polynomial transforms based on Toeplitz and Hankel matrices. arXiv:1604.07486. Trefethen L. N. ( 2012) Approximation Theory and Approximation Practice . Philadelphia, PA: SIAM. Waldvogel J. ( 2003) Fast construction of the Fejér and Clenshaw–Curtis quadrature rules. BIT Numer. Math. , 43, 001– 018. Google Scholar CrossRef Search ADS   Wang H. & Huybrechs D. ( 2014) Fast and accurate computation of Jacobi expansion coefficients of analytic functions. arXiv:1404.2463v1. Wimp J. McCabe P. & Connor J. N. L. ( 1997) Computation of Jacobi functions of the second kind for use in nearside–farside scattering theory. J. Comp. Appl. Math. , 82, 447– 464. Google Scholar CrossRef Search ADS   Xiang S. He G. & Wang H. ( 2014) On fast and stable implementation of Clenshaw–Curtis and Fejér-type quadrature rules. Abst. Appl. Anal. , 2014, 10. © The authors 2017. Published by Oxford University Press on behalf of the Institute of Mathematics and its Applications. All rights reserved.

### Journal

IMA Journal of Numerical AnalysisOxford University Press

Published: Jan 1, 2018

## You’re reading a free preview. Subscribe to read the entire article.

### DeepDyve is your personal research library

It’s your single place to instantly
that matters to you.

over 18 million articles from more than
15,000 peer-reviewed journals.

All for just $49/month ### Explore the DeepDyve Library ### Search Query the DeepDyve database, plus search all of PubMed and Google Scholar seamlessly ### Organize Save any article or search result from DeepDyve, PubMed, and Google Scholar... all in one place. ### Access Get unlimited, online access to over 18 million full-text articles from more than 15,000 scientific journals. ### Your journals are on DeepDyve Read from thousands of the leading scholarly journals from SpringerNature, Elsevier, Wiley-Blackwell, Oxford University Press and more. All the latest content is available, no embargo periods. DeepDyve ### Freelancer DeepDyve ### Pro Price FREE$49/month
\$360/year

Save searches from
PubMed

Create lists to

Export lists, citations

Abstract access only

18 million full-text articles

Print

20 pages / month

PDF Discount

20% off