TY - JOUR AU - Barden, D. AB - One of the fundamental differences between the Central Limit Theorem for empirical Fréchet means obtained in [Kendall and Le, ‘Limit theorems for empirical Fréchet means of independent and non-identically distributed manifold-valued random variables’, Braz. J. Probab. Stat. 25 (2011) 323–352] and that for empirical Euclidean means lies in the assumption that the probability measure of the cut locus of the true Fréchet mean is zero. In [Hotz and Huckemann, ‘Intrinsic means on the circle: uniqueness, locus and asymptotics’, Preprint, 2011, arXiv:1108.2141v1], the authors show that, in the case of a circle, this assumption holds automatically. This paper shows that it holds for any complete and connected Riemannian manifold, provided that there are at least two minimal geodesics between the Fréchet mean and any point in its cut locus, and that, in fact, the result generalizes to local minima of the $$p$$-energy function of a finite measure. 1. Introduction Fréchet means, as a generalization of Euclidean means, of probability measures on a metric space have been widely used for statistical analysis of non-Euclidean data. A point $$x_0$$ in a metric space $$\boldsymbol {M}$$ with distance function $$d$$ is called a Fréchet mean of the probability measure $$\mu$$ on $$\boldsymbol {M}$$ if the corresponding energy function $$F_\mu$$, defined by \[F_\mu (x)=\frac {1}{2}\int _{ \boldsymbol {M} }d(x,y)^2\,d\mu (y),\] is finite and attains its global minimum at $$x_0$$. Such a point is also sometimes referred to as a Riemannian barycentre, or a centre of mass, of a finite measure $$\mu$$ in analysis, where the concept also plays an important role (cf. [10]). Influenced by the structure of the underlying manifold, Fréchet means, unlike their Euclidean counterparts, exhibit many challenging probabilistic and statistical features. In the last 10 years or so, many properties of Fréchet means have been investigated and understood. 
However, due to the limitations of the geometric tools employed, for general results it is often necessary to restrict the support of the underlying probability measure (cf. [1, 3, 12, 15]), although it is sometimes possible to weaken such assumptions for particular Riemannian manifolds or for certain types of probability measure (cf. [2, 8, 9, 11, p. 211; 19]). The weakest assumption on a probability measure for general manifolds is the one made in [13] when deriving the central limit type of theorem for its Fréchet means. This requires that the probability measure of the cut locus of a Fréchet mean be zero. In [9], the authors show that this requirement holds automatically if the manifold is a circle where, of course, the cut locus is a single non-conjugate point. The aim of this paper is to show that it holds in general. For this, we first investigate a property of the distance function at cut loci in the next section. In Section 3, we obtain a necessary technical result of a combinatorial nature before proceeding, in Section 4, to the main result that, if $$\mu$$ is a finite measure on a Riemannian manifold and if $$x_0$$ is a local minimum of the more general $$p$$-energy function defined there, then the measure, under $$\mu$$, of the cut locus of $$x_0$$ is zero under very mild conditions. 2. Directional derivatives of the distance function on cut loci Throughout this paper, we assume that $$\boldsymbol {M}$$ is an $$m$$-dimensional complete and connected Riemannian manifold and that $$d$$ is the distance function on $$\boldsymbol {M}$$ induced by its Riemannian metric. For each $$x\in \boldsymbol {M}$$, denote by $$C(x)$$ its cut locus in $$\boldsymbol {M}$$, that is, the locus of points $$y$$ such that any shortest geodesic from $$x$$ to $$y$$ ceases, beyond $$y$$, to minimize the distance from $$x$$. 
Recall also that a point $$y$$ in $$\boldsymbol {M}$$ is called conjugate to $$x$$ if there is a vector $$v$$ in $$\tau _x({ \boldsymbol {M} })$$, the tangent space to $$\boldsymbol {M}$$ at $$x$$, with $$y=\exp _x(v)$$ and the exponential map $$\exp _x$$ at $$x$$ is singular at $$v$$. Such a conjugate point $$y$$ is called a first conjugate point along the geodesic $$\exp _x(tv)$$ if $$\exp _x$$ is regular at all $$tv$$, $$0<t<1$$. Denote by $$Q(x)$$ the set of first conjugate points of $$x$$ in $$C(x)$$. Lemma 1 Let $$p>0$$. For any $$x\in \boldsymbol {M}$$ and any $$y\in C(x)\setminus Q(x)$$, there is a finite number $$k\geqslant 2$$ for which there are distinct unit tangent vectors, $$v_1,\ldots ,v_k$$, in $$\tau _x({ \boldsymbol {M} })$$ such that for any $$v\in \tau _x({ \boldsymbol {M} })$$ \[v(d(x,y)^p)=-p\,r^{p-1}\max _{1\leqslant j\leqslant k}\langle v_j,v\rangle ,\] where $$r=d(x,y)$$. Note that, in fact, $$k$$ is the number of the minimal geodesics between $$x$$ and $$y$$ and the $$v_j$$ are their initial unit tangent vectors. Proof Since $$y\in C(x)\setminus Q(x)$$, it follows that $$x\in C(y)\setminus Q(y)$$ and then (cf. [4, 16]) there is a finite integer $$k\geqslant 2$$ for which there are $$k$$ distinct minimal geodesics $${\{ }\gamma _j(t)\mid 1\leqslant j\leqslant k{\} }$$ from $$y$$ to $$x$$ with $$\gamma _j(1)=x$$. Moreover, there are neighbourhoods $$V$$ of $$x$$ and $${\cal {{U}}}_j$$ of each $$u_j=\gamma _j'(0)$$ in $$\tau _y({ \boldsymbol {M} })$$ with $$\exp _y\!|_{{\cal {{U}}}_j}$$ a diffeomorphism onto $$V$$, so giving smooth functions \[\phi _j: V \longrightarrow {\Bbb {R}};\quad x' \longmapsto \| \left (\exp _y|_{{\cal {{U}}}_j}\right )^{-1}(x') \|,\] so that $$\phi _j(x)=\|u_j\|=r$$ and $$d(x',y)=\min _{1\leqslant j\leqslant k}\phi _j(x')$$ for $$x'\in V$$. For $$j=1,\ldots ,k$$, let $$v_j=-\gamma _j'(1)/r\in \tau _x({ \boldsymbol {M} })$$. 
Then, the neighbourhood $$V$$ of $$x$$ is divided into $$k$$ ‘chambers’ \[C_i=\left\{ x'\in V\mid d(x',y)=\phi _i(x')=\min _{1\leqslant j\leqslant k}\phi _j(x')\right\} ,\quad 1\leqslant i\leqslant k,\] by the hypersurfaces $$P_{i_1i_2}={\{ }x'\in V\mid \phi _{i_1}(x')=\phi _{i_2}(x'){\} }$$. For any particular $$v\in \tau _x({ \boldsymbol {M} })$$ and all sufficiently small $$t>0$$, $$\exp _x(tv)$$ must lie in one such chamber, say, $$C_{j_0}$$. Noting that $$\phi _j(x)=d(x,y)$$, that $$\phi _j$$ is smooth near $$x$$ and that the vector $$v_j=-\gamma _j'(1)/r$$ is the initial unit tangent vector of the geodesic from $$x$$ to $$y$$ and so $$v_j=-{\mathrm {grad}}\,\phi _j(x)$$, we have \[\begin {array}{rl} v(d(x,y)^p)&=\lim _{t\downarrow 0}\frac {1}{t}{\{ }d(\exp _x(tv),y)^p-d(x,y)^p{\} }\\ &=\lim _{t\downarrow 0}\frac {1}{t}{\{ }\phi _{j_0}(\exp _x(tv))^p-\phi _{j_0}(x)^p{\} }\\ &=\lim _{t\downarrow 0}\frac {1}{t}\left\{ \min _{1\leqslant j\leqslant k}{\{ }\phi _j(\exp _x(tv))^p-\phi _j(x)^p{\} }\right\} \\ &=\min _{1\leqslant j\leqslant k}\lim _{t\downarrow 0}\frac {1}{t}{\{ }\phi _j(\exp _x(tv))^p-\phi _j(x)^p{\} }\\ &=p\min _{1\leqslant j\leqslant k}\langle \phi _j(x)^{p-1}\,{\mathrm {grad}}\,\phi _j(x),\,v\rangle \\ &=-pr^{p-1}\max _{1\leqslant j\leqslant k}\langle v_j,v\rangle , \end {array}\] as required. Lemma 1 covers a wide class of manifolds and the excluded set $$Q(x)$$ is at least metrically insignificant. Nevertheless, the proof cannot be extended to include points $$y$$ of $$Q(x)$$ since, for example, the functions $$\phi _j$$ defined there require $$\exp _x$$ to be a diffeomorphism onto a neighbourhood of $$y$$. Consequently, the result cannot be applied even to the case when $$\boldsymbol {M}$$ is a 2-sphere. 
Indeed, the local structure at a first conjugate point of the cut locus is generally significantly more complicated than that at any other point of the cut locus and is little understood apart from Warner's paper [20], and even that excludes what he terms singular conjugate points. Nevertheless, for our results, it is still possible to give a proof, necessarily more intricate, that is valid at all cut points. Lemma 2 Let $$p>0$$. Suppose that, for $$x\in \boldsymbol {M}$$, $$y\in C(x)$$ with $$d(x,y)=r$$. Then, for any $$v\in \tau _x({ \boldsymbol {M} })$$, \[v(d(x,y)^p)=-pr^{p-1}\sup _{v'\in {\cal {{V}}}_y}\langle v',v\rangle ,\] where $${\cal {{V}}}_y={\{ }v'\in \tau _x({ \boldsymbol {M} })\mid \|v'\|=1{\mathrm {\ and\ }}\exp _x(rv')=y{\} }$$. Note that the set $${\{ }\exp _x(tv')\mid v'\in {\cal {{V}}}_y, 0\leqslant t\leqslant r{\} }$$ comprises all possible distinct unit speed minimal geodesics from $$x$$ to $$y$$. Note also that, for $$y\notin C(x)\cup {\{ }x{\} }$$, $${\cal {{V}}}_y$$ contains a single element $$v'=-{\mathrm {grad}}_1\,d(x,y)$$, where grad$$_1$$ denotes the gradient with respect to the first variable, and that for such $$y$$, when $$p=2$$, the above formula can be obtained directly by applying the first variation formula. Proof Since \[v(d(x,y)^p)=\lim _{t\downarrow 0}\frac {1}{t}{\{ }d(\exp _x(tv),y)^p-d(x,y)^p{\} }{,}\] for any $$v\in \tau _x({ \boldsymbol {M} })$$, it is sufficient to show that \[\lim _{t\downarrow 0}\frac {1}{t}{\{ }d(\exp _x(tv),y)-d(x,y){\} }=-\sup _{v'\in {\cal {{V}}}_y}\langle v',v\rangle .\] (1) For this, we write $$p(t)=\exp _x(tv)$$. First, for any $$v'\in {\cal {{V}}}_y$$ and $$s\in (0,1)$$ let $$z=\exp _x(srv')$$, so that $$z$$ lies on a minimal geodesic between $$x$$ and $$y$$ and precedes the cut point $$y$$. 
Then, $$d(x,z)$$ is smooth at $$x$$ and so \[v(d(x,z))=\langle {\mathrm {grad}}_1(d(x,z)),v\rangle =-\langle v',v\rangle .\] However, for $$t>0$$ \[\begin {array}{rl} d(p(t),y)-d(x,y)&=d(p(t),y)-d(y,z)-d(z,x)\\ &\leqslant d(p(t),z)-d(x,z){,} \end {array}\] by the triangle inequality, so that \[\limsup _{t\downarrow 0}\frac {d(p(t),y)-d(x,y)}{t}\leqslant \lim _{t\downarrow 0}\frac {d(p(t),z)-d(x,z)}{t}=v(d(x,z))=-\langle v',v\rangle .\] The left-hand side being independent of $$v'$$ implies that \[\limsup _{t\downarrow 0}\frac {d(p(t),y)-d(x,y)}{t}\leqslant -\sup _{v'\in {\cal {{V}}}_y}\langle v',v\rangle .\] (2) On the other hand, we may work in a convex $$\epsilon$$-neighbourhood $$V$$ of $$x$$ on which the sectional curvature is bounded below by $$-\kappa ^2$$. Taking $$t$$ sufficiently small for $$p(t)$$ to lie in $$V$$, let $$u_1(t)=-\dot p(t)/\|\dot p(t)\|=-\dot p(t)/\|v\|$$ be the initial unit tangent vector of the minimal geodesic from $$p(t)$$ to $$x$$ which, by the convexity, necessarily lies in $$V$$. Then choose $$u_2(t)$$ to be the initial unit tangent vector to a minimal geodesic from $$p(t)$$ to $$y$$ and $$z(t)=\exp _{p(t)}(su_2(t))$$ with again $$s$$ sufficiently small for $$z(t)$$ and, hence, the minimal geodesics from $$z(t)$$ to $$p(t)$$ and $$x$$ to lie in $$V$$. Denoting the lengths of sides of the geodesic triangle determined by $$x$$, $$p(t)$$ and $$z(t)$$ by $$a=t\|v\|=d(x,p(t))$$, $$b=s=d(p(t),z(t))$$ and $$c=d(z(t),x)$$, consider a geodesic triangle $$\Delta$$ of side-lengths $$a$$, $$b$$ and $$c$$ in the simply connected space of constant curvature $$-\kappa ^2$$. The angle $$\theta \in [0,\pi ]$$ between the sides with lengths $$a$$ and $$b$$ of $$\Delta$$ satisfies \[\cosh (\kappa c)=\cosh (\kappa a)\cosh (\kappa b)-\sinh (\kappa a)\sinh (\kappa b)\cos \theta .\] By the Toponogov comparison theorem, $$\theta$$ is bounded above by the angle between $$u_1(t)$$ and $$u_2(t)$$ at $$p(t)$$. 
This gives \[\sinh (\kappa a)\sinh (\kappa b)\langle u_1(t),u_2(t)\rangle \leqslant \cosh (\kappa a)\cosh (\kappa b)- \cosh (\kappa c).\] Taking Taylor expansions, we get \[2ab\langle u_1(t),u_2(t)\rangle \leqslant a^2+b^2-c^2+O(t^3)+O(s^3).\] (3) However, $$a^2+b^2-c^2\leqslant a^2+2\,b\,(b-c)$$ and here \[b-c=d(y,p(t))-d(y,z(t))-d(z(t),x)\leqslant d(p(t),y)-d(x,y).\] So that taking $$s=t^{2/3}$$ and letting $$t\downarrow 0$$, from (3), we obtain \[\liminf _{t\downarrow 0}\langle u_1(t),u_2(t)\rangle \leqslant \frac {1}{\|v\|}\liminf _{t\downarrow 0}\frac {d(p(t),y)-d(x,y)}{t}.\] (4) Now, let $$\alpha \in [0,\pi ]$$ satisfy $$\liminf _{t\downarrow 0}\langle u_1(t),u_2(t)\rangle =\cos \alpha$$ and choose a sequence $$t_n\downarrow 0$$ such that \[\lim _{n\rightarrow \infty }\langle u_1(t_n),u_2(t_n)\rangle =\cos \alpha .\] Since the unit vectors $$u_2(t)$$ lie in a compact subset of the tangent bundle $$\tau ({ \boldsymbol {M} })$$, taking a subsequence if necessary, we may assume that $$u_2(t_n)$$ converges to a unit vector $$\tilde v$$ in $$\tau _x({ \boldsymbol {M} })$$ as $$n\rightarrow \infty$$. By the continuity of exp on $$\tau ({ \boldsymbol {M} })$$, $$\exp _x(r\tilde v)=y$$ so that $$\tilde v\in {\cal {{V}}}_y$$ and, as $$\lim _{n\rightarrow \infty }u_1(t_n)=-v/\|v\|$$, \[\left \langle \tilde v,-\frac {v}{\|v\|}\right \rangle =\left \langle \lim _{n\rightarrow \infty } u_1(t_n),\lim _{n\rightarrow \infty }u_2(t_n)\right \rangle =\cos \alpha .\] Hence, by (4), we have that \[\liminf _{t\downarrow 0}\frac {d(p(t),y)-d(x,y)}{t}\geqslant -\langle \tilde v,v\rangle \geqslant -\sup _{v'\in {\cal {{V}}}_y}\langle v',v\rangle .\] This, together with (2), gives the required result (1). For the following special case, the directional derivative takes the same form as where the distance function is smooth. Corollary 1 Let$$p>0$$. Suppose that, for$$x\in \boldsymbol {M},$$$$y\in C(x)$$with$$d(x,y)=r$$and that there is just one minimal geodesic from$$x$$to$$y$$. 
Then, for any$$v\in \tau _x({ \boldsymbol {M} }),$$ \[v(d(x,y)^p)=-p\,r^{p-2}\langle \exp _x^{-1}(y),v\rangle .\] 3. A technical lemma For the proof of our main result in the next section, in addition to the directional derivatives obtained in the previous section, we shall require the following result for certain pairs of vectors in the tangent space $$\tau _{x_0}({ \boldsymbol {M} })$$. Lemma 3 Let$$\boldsymbol {V}$$be a given vector space of dimension at least$$2$$. For at least two and possibly countably many$$i\geqslant 1$$let$$v_{ij},$$$$j=1,2,$$be two distinct non-zero vectors in$$\boldsymbol {V}$$of equal length. Write \[w_{1j}=v_{1j}\quad {\mathrm {and}}\quad w_{2j}=\sum _{i>1}v_{ij},\ j=1,2,\]assuming convergence of the sum in the countable case, and \[w=w_{11}+w_{21}+w_{12}+w_{22},\quad w_1=w_{11}+w_{21},\quad w_2=w_{11}+w_{22}.\]Assume that$$w\neq 0$$. Then, subject to possibly re-ordering the sequence$${\{ }(v_{i1},v_{i2})\mid i\geqslant 1{\} },$$renaming the members of various pairs$$(v_{i1},v_{i2})$$and consequently re-defining the vectors$$w_{ij}$$and$$w_i,$$there exists$$v\in \boldsymbol {V}$$with$$\langle w_i,v\rangle \langle w-w_i,\,v\rangle <0$$for$$i=1$$or$$2$$. Proof Note first that, if $$w_i$$ is not a multiple of $$w$$, since $$w\neq 0$$, then there is a $$v\in \boldsymbol {V}$$ such that $$\langle w,v\rangle =0$$ and $$\langle w_i,v\rangle \neq 0$$ and then $$\langle w_i,v\rangle \langle w-w_i,v\rangle <0$$. On the other hand, if $$w_i=\lambda _iw$$ for $$i=1,2$$, then both $$w_{21}$$ and $$w_{22}$$ lie in $$\Pi =\langle w,w_{11}\rangle$$ and hence so too does $$w_{12}=w-w_{11}-w_{21}-w_{22}$$. Since $$v_{i1}$$ and $$v_{i2}$$ are distinct vectors of the same length, they are linearly independent unless $$v_{i1}=-v_{i2}$$. However, if $$v_{i1}=-v_{i2}$$ for all $$i$$, then $$w-w_1=-w_1$$ and so $$w=0$$, contrary to the hypothesis. 
Thus, re-ordering the sequence of pairs of vectors if necessary, we may assume that $$w_{11}$$ and $$w_{12}$$ are linearly independent so that $$\Pi$$ is two-dimensional. Then, for any non-zero $$\tilde v\in \Pi$$ orthogonal to $$w$$, we have $$\langle w_i,\tilde v\rangle =0$$ for $$i=1,2$$, so \[\langle w_{11},\tilde v\rangle =-\langle w_{21},\tilde v\rangle =-\langle w_{22},\tilde v\rangle =\langle w_{12},\tilde v\rangle .\] As $$\|w_{11}\|=\|w_{12}\|$$ and $$w_{11}\neq w_{12}$$, this implies that $$w_{11}$$ and $$w_{12}$$ must have equal and opposite projections on $$w$$. So, without loss of generality, we can write $$w_{11}=w_0+v_0$$ and $$w_{12}=-w_0+v_0$$ with $$v_0\in \langle \tilde v\rangle$$ and $$w_0\in \langle w\rangle$$. Moreover, $$w_0\neq 0$$ or we would have $$w_{11}=w_{12}$$, so that we may write $$w_0=\alpha ^{-1}w$$ for non-zero $$\alpha \in \mathbb R$$. Then, for some $$\beta \in \mathbb R$$, we have $$w_{21}=\beta w_0-v_0$$. Thus, as $$\sum w_{ij}=w=\alpha w_0$$, we see that if both $$w_1$$ and $$w_2$$ are linearly dependent on $$w,$$ then $$w_{ij}$$, $$i,j=1,2$$, satisfy \[\begin {array}{rl} w_{11} &= w_0 +v_0,\quad w_{12}=-w_0+v_0, \\ w_{21} &= \beta w_0-v_0,\quad w_{22}=(\alpha -\beta )w_0-v_0, \end {array}\] (5) with $$\alpha \neq 0$$; see Figure 1. Note that $$w_0$$, $$v_0$$ and so $$\alpha$$ here are determined by $$w_{11}$$ and $$w_{12}$$. If now, for some $$i>1$$, we re-order the pair of vectors $$(v_{i1},v_{i2})$$ as $$(v_{i2},v_{i1})$$, then this, with $$w$$, $$w_{11}$$ and $$w_{12}$$ unchanged, replaces $$w_{21}$$ and $$w_{22}$$ of our first choice by $$\tilde w_{21}=w_{21}+v_{i2}-v_{i1}$$ and $$\tilde w_{22}=w_{22}+v_{i1}-v_{i2}$$, respectively, and similarly replaces $$w_1$$ and $$w_2$$ by $$\tilde w_1$$ and $$\tilde w_2$$. 
Then, the above argument shows that either at least one of $$\tilde w_i$$ is not a multiple of $$w$$ and we obtain the required vector $$v$$ for this new choice; or equations (5) hold for $$\tilde w_{2i}$$ with some $$\tilde \beta$$ in place of $$\beta$$. The latter implies that $$v_{i1}-v_{i2}$$ is $$(\beta -\tilde \beta )w_0$$. Hence, the projections of $$v_{i1}$$ and $$v_{i2}$$ on $$v_0$$ are equal and, since they have the same norm but are not equal, their projections on $$w$$ must be equal and opposite. Finally, if we were unable to obtain the required vector $$v$$ by swapping the members of any pair of the vectors, then we would be able to re-define $$w_{21}$$ to be the sum of all $$v_{ij}$$ with $$i>1$$ and $$v_{ij}$$ having non-negative projection on $$w$$; $$w_{22}$$ would then be the similar sum with all projections on $$w$$ non-positive. Then $$\langle w_{21}+w_{22},w\rangle =0$$ and so, since we already have $$\langle w_{11}+w_{12},w\rangle =0$$, $$\|w\|=0$$, contradicting the assumption that $$w\neq 0$$. Figure 1. Relationship among the vectors $$w_{ij}$$ in the plane $$\Pi$$. 4. The main result For a given finite measure $$\mu$$ on $$\boldsymbol {M}$$ and $$p\geqslant 1$$, let \[F_{\mu ,p}(x)=\frac {1}{p}\int _{ \boldsymbol {M} }d(x,y)^p\,d\mu (y){,}\] and assume that $$F_{\mu ,p}$$ is finite on $$\boldsymbol {M}$$. When $$p=2$$, $$F_{\mu ,p}$$ is the energy function of $$\mu$$ and, if it achieves a global minimum at $$x_0$$, then $$x_0$$ is called a Fréchet mean of $$\mu$$. Hence, we shall call $$F_{\mu ,p}$$ the $$p$$-energy function of $$\mu$$ and refer to $$x_0$$, where $$F_{\mu ,p}$$ achieves a global minimum, as a Fréchet $$p$$-mean of $$\mu$$. This generalized $$p$$-mean has been studied in [1, 3]. In particular, when $$p=1$$, $$x_0$$ is the Riemannian median of $$\mu$$ studied in [22]. 
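As an informal illustration of the $$p$$-energy function just defined, the following minimal sketch evaluates $$F_{\mu ,p}$$ for a discrete measure on the unit circle, where the distance is arc length and the cut locus of a point is its antipode. The function names (`circle_dist`, `p_energy`, `grid_minimiser`), the sample points and the grid search are assumptions made for this example only; they are not part of the paper.

```python
import math

def circle_dist(a, b):
    """Arc-length distance between two angles on the unit circle."""
    d = abs(a - b) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)

def p_energy(x, points, weights, p):
    """F_{mu,p}(x) = (1/p) * sum_i mu({y_i}) * d(x, y_i)^p for a discrete mu."""
    return sum(w * circle_dist(x, y) ** p for y, w in zip(points, weights)) / p

def grid_minimiser(points, weights, p, n=3600):
    """Grid angle in [0, 2*pi) with the smallest p-energy (crude search)."""
    grid = [2.0 * math.pi * k / n for k in range(n)]
    return min(grid, key=lambda x: p_energy(x, points, weights, p))

# Hypothetical data: three unit masses clustered near angle 0.
points = [0.2, 2.0 * math.pi - 0.2, 0.5]
weights = [1.0, 1.0, 1.0]

x0 = grid_minimiser(points, weights, p=2)

# On the circle, the cut locus C(x0) is the single antipodal point.
cut_point = (x0 + math.pi) % (2.0 * math.pi)
mass_on_cut_locus = sum(
    w for y, w in zip(points, weights) if circle_dist(y, cut_point) < 1e-9
)
```

For these clustered points and $$p=2$$, the grid minimiser lies close to the ordinary average of the angles $$0.2$$, $$-0.2$$ and $$0.5$$, and no mass sits at the antipode of the minimiser, which is consistent with the theorem proved in this section.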
In this section, we investigate the measure, under $$\mu$$, of the cut locus $$C(x_0)$$ of $$x_0$$ when $$F_{\mu ,p}$$ achieves a local minimum at $$x_0$$. Our main result, stated in the following theorem, says that, under certain mild conditions, for $$x_0$$ to be a local minimum of $$F_{\mu ,p}$$, $$\mu$$ can only carry zero mass on $$C(x_0)$$. Theorem 1 Assume that$$\mu$$is a finite measure on$$\boldsymbol {M}$$with finite$$p$$-energy function$$F_{\mu ,p}$$and that$$F_{\mu ,p}$$achieves a local minimum at$$x_0$$. Assume further that, for any$$x\in C(x_0),$$there are at least two minimal geodesics from$$x_0$$to$$x$$. If$$p>1,$$or if$$p=1$$and$$\mu ({\{ }x_0{\} })=0,$$then$$\mu (C(x_0))=0$$. Note that, if $$00$$. Proof We only need to consider the case when $$C(x_0)\neq \emptyset$$. Since $$C(x)$$ has co-dimension at least 1, it follows that, if $$\mu$$ is absolutely continuous with respect to the volume measure, then $$\mu (C(x))=0$$. Moreover, any probability measure can be decomposed as the sum of an absolutely continuous measure and discrete measure. Thus, it is sufficient to consider the case that, when restricted to $$C(x_0)$$, $$\mu$$ is a discrete measure. For this, we assume that there is a set $$Y={\{ }y_i\mid i\geqslant 1{\} }\subseteq C(x_0)$$ containing at most a countable number of distinct points such that $$\mu (C(x_0)\setminus Y)=0$$. Then, for $$p\geqslant 1$$, we can re-express $$F_{\mu ,p}(x_0)$$ as \[F_{\mu ,p}(x_0)=\frac {1}{p}\int _{{ \boldsymbol {M} }\setminus C(x_0)}d(x_0,y)^pd\mu (y)+ \frac {1}{p}\sum _{i\geqslant 1}d(x_0,y_i)^p\mu ({\{ }y_i{\} }).\] Since $$p\geqslant 1$$, for any tangent vector $$v\in \tau _y({ \boldsymbol {M} })$$, \[\lim _{t\downarrow 0}\frac {d(\exp _y(tv),y)^p}{t}=\lim _{t\downarrow 0}t^{p-1}\|v\|^p = \left\{ \begin {array}{ll}0&{\mathrm {if\ }}p>1{,} \\ \|v\|&{\mathrm {if\ }}p=1, \end {array}\right .\]$$d(x,y)^p$$ is differentiable at $$x=y$$ if $$p>1$$. 
It follows that, for any tangent vector $$v\in \tau _{x_0}({ \boldsymbol {M} })$$, \[\begin {array}{rl} v(F_{\mu ,p}(x_0)) &= \delta _{p,1}\|v\|\mu ({\{ }x_0{\} })+ \dfrac {1}{p}\displaystyle \sum _{i\geqslant 1} v(d(x_0,y_i)^p)\mu ({\{ }y_i{\} })\\ &\quad -\left \langle \int _{{ \boldsymbol {M} }\setminus (C(x_0)\cup {\{ }x_0{\} })}d(x_0,y)^{p-2}\exp _{x_0}^{-1}(y)\,d\mu (y),v\right \rangle {,} \end {array}\] (6) where $$\delta _{i,j}$$ is the Kronecker delta and, in the integrand, $$\exp _{x_0}$$ is restricted to the initial segments of rays in $$\tau _{x_0}({ \boldsymbol {M} })$$ that map to geodesics that do not reach $$C(x_0)$$ and, hence, on which $$\exp ^{-1}_{x_0}$$ is well defined. Under the assumption that $$\mu ({\{ }x_0{\} })=0$$ when $$p=1$$, we may omit the first term on the right-hand side. Since $$F_{\mu ,p}$$ achieves a local minimum at $$x_0,$$ it follows that, $$\forall v\in \tau _{x_0}({ \boldsymbol {M} })$$, $$v(F_{\mu ,p}(x_0))\geqslant 0$$. In particular, for any $$v_1\neq 0$$ in $$\tau _{x_0}({ \boldsymbol {M} })$$ and $$v_2=-v_1$$, we must have both $$v_1(F_{\mu ,p}(x_0))\geqslant 0$$ and $$v_2(F_{\mu ,p}(x_0))\geqslant 0$$. Hence, adding these inequalities and applying the expression (6) for such $$v_1$$ and $$v_2$$, we have \[\sum _{i\geqslant 1}v_1(d(x_0,y_i)^p) \mu ({\{ }y_i{\} })+ \sum _{i\geqslant 1}v_2(d(x_0,y_i)^p)\mu ({\{ }y_i{\} })\geqslant 0.\] (7) Note that, since $$d(x_0,y_i)^p$$ is not differentiable at $$x_0$$ under the given assumptions, although $$v_2=-v_1$$, we do not necessarily have $$v_2(d(x_0,y_i)^p)=-v_1(d(x_0,y_i)^p)$$. 
By the assumptions and Lemma 2, for each $$y_i$$, there is a subset $${\cal {{V}}}_i$$ of $$\tau _{x_0}({ \boldsymbol {M} })$$ containing at least two non-zero, distinct vectors of equal length $$d(x_0,y_i)^{p-1}$$ such that, renaming the $$v'$$ of Lemma 2, \[v(d(x_0,y_i)^p)=-p\sup _{v'\in {\cal {{V}}}_i}\langle v',v\rangle ;\] equivalently, for any $$v\in \tau _{x_0}({ \boldsymbol {M} })$$, \[(-v)(d(x_0,y_i)^p)=p\inf _{v'\in {\cal {{V}}}_i}\langle v',v\rangle .\] For each $$i\geqslant 1,$$ write $${\cal {{V}}}^*_i=\mu ({\{ }y_i{\} })\,{\cal {{V}}}_i$$. Then, if $$\mu ({\{ }y_i{\} })\neq 0$$, then all the $$v'$$ in $${\cal {{V}}}^*_i$$ are non-zero, distinct and of equal length for each given $$i$$. Hence, for all $$v$$ in $$\tau _{x_0}({ \boldsymbol {M} })$$, \[\frac {1}{p}\left (\sum _{i\geqslant 1}v(d(x_0,y_i)^p)\mu ({\{ }y_i{\} })\right )= -\sum _{i\geqslant 1}\sup _{v'\in {\cal {{V}}}^*_i}\langle v',v\rangle =-\sup _{{ \boldsymbol {v} }'\in {\cal {{V}}}^* }\left \langle \sum _{i\geqslant 1}v'_i,v\right \rangle ,\] where $${\cal {{V}}}^* = {\{ }{ \boldsymbol {v} }=(v_1,v_2,\ldots )\mid v_i\in {\cal {{V}}}^*_i{\} }$$. We now show that, for any $${ \boldsymbol {v} }_1\neq { \boldsymbol {v} }_2\in {\cal {{V}}}^* $$, there exists $$v\in \tau _{x_0}({ \boldsymbol {M} })$$ such that \[\left \langle \sum _{i\geqslant 1}v_{i1},v\right \rangle >0\quad {\mathrm {and}} \quad \left \langle \sum _{i\geqslant 1}v_{i2},v\right \rangle <0,\] (8) where $${ \boldsymbol {v} }_j=(v_{1j},v_{2j},\ldots )$$, $$j=1,2$$. This, in particular, implies that there is a vector $$v\in \tau _{x_0}({ \boldsymbol {M} })$$ such that \[\sup _{{ \boldsymbol {v} }\in {\cal {{V}}}^* }\left \langle \sum _{i\geqslant 1}v_i,\,v\right \rangle >0\quad {\mathrm {and}}\quad \inf _{{ \boldsymbol {v} }\in {\cal {{V}}}^* }\left \langle \sum _{i\geqslant 1}v_i,v\right \rangle <0,\] which, taking $$v_1=v$$ and $$v_2=-v$$, contradicts (7). 
To prove (8), we first note that, if $$Y$$ contains just one point $$y_1$$, in particular if dim$$({ \boldsymbol {M} })=1$$, then the existence of $$v$$ satisfying (8) can be verified by taking $$v=v_{11}-v_{12}$$. Hence, we may assume that dim$$({ \boldsymbol {M} })\geqslant 2$$ and $$|Y|\geqslant 2$$, and apply Lemma 3, for which we choose, for each $$i\geqslant 1$$, two, of possibly many, distinct $$v_{ij}\in {\cal {{V}}}^*_i$$, $$j=1,2$$. Define the corresponding $$w_{ij}$$, $$w_k$$ and $$w$$ for $$i\geqslant 1$$ and $$j,k=1,2$$ as in Lemma 3. Then, if $$w\neq 0$$, then the existence of $$v$$ satisfying (8) follows directly from the lemma. If $$w=0$$, since $$v_{11}\neq v_{12}$$, then without loss of generality we may assume that $$w_1=v_{11}+w_{21}\neq 0$$. Then, by choosing $$v=w_1=\sum _{i\geqslant 1}v_{i1}$$, we have \[\langle w_{11}+w_{21},v\rangle =\|v\|^2>0\quad {\mathrm {and}}\quad \langle w_{12}+w_{22},v\rangle =-\langle w_{11}+w_{21},v\rangle <0.\] The extra condition that $$\mu ({\{ }x_0{\} })=0$$ in the case $$p=1$$ in the theorem is necessary. To see this, take the following four points on the unit circle $${ \boldsymbol {S} }^1$$ in $${\Bbb {R}}^2$$ with $$y>0$$: \[p_1=(x,y), \quad p_2=(-x,y), \quad p_3=(0,1), \quad p_4=(0,-1).\] Consider the finite measure $$\mu$$ supported on these four points with $$\mu ({\{ }p_1{\} })=\mu ({\{ }p_2{\} })>0$$ and $$\mu ({\{ }p_3{\} })>\mu ({\{ }p_4{\} })>0$$ and the measure $$\tilde \mu$$ with $$\tilde \mu ({\{ }p_1{\} })=\tilde \mu ({\{ }p_2{\} })=\mu ({\{ }p_1{\} })$$, $$\tilde \mu ({\{ }p_3{\} })=\mu ({\{ }p_3{\} })-\mu ({\{ }p_4{\} })$$ and $$\tilde \mu ({\{ }p_4{\} })=0$$. Then, for all $$p\in { \boldsymbol {S} }^1$$, $$F_{\mu ,1}(p)=F_{\tilde \mu ,1}(p)+ \pi \,\mu ({\{ }p_4{\} })$$ since $$d(p,p_3)+d(p,p_4)=\pi$$, so that $$F_{\mu ,1}$$ and $$F_{\tilde \mu ,1}$$ attain their local and global minima at the same points. 
However, $$d(p,p_1)+d(p,p_2)$$ is least when $$p$$ lies on the shorter arc between $$p_1$$ and $$p_2$$ and is then independent of the position of $$p$$. Thus, $$F_{\tilde \mu ,1}$$ achieves its global minimum at $$p_3$$ and hence so too does $$F_{\mu ,1}$$. This shows that $$\mu$$ is a measure with positive weight on the cut locus $${\{ }p_4{\} }$$ of its unique Fréchet 1-mean $$p_3$$. An immediate consequence of the theorem is that the measure-dependent condition required for the central limit theorems obtained in [13] may be replaced by a universal geometric condition on the underlying manifold. Rather than asking that the probability measure of the cut locus of its Fréchet mean be zero, it is sufficient to require that there be at least two minimal geodesics to each point of its cut locus. This latter condition is automatically satisfied on many standard manifolds, such as spheres of constant curvature, orthogonal groups, Grassmannians and real and complex projective spaces. On the other hand, the derivation of the limit theorems given in [13] can be regarded as a form of generalization of the ‘delta-method’ in statistics for deriving an asymptotic distribution for a function of a sequence of random variables that are convergent in distribution. Hence, for example, the three remaining assumptions for the i.i.d. case in Corollary 3 there resemble those required for general central limit theorems for M-estimators. Thus, replacing the measure-dependent condition on the cut locus gives the optimal version of the results in [13]. The proof of the theorem clearly cannot directly be extended to the case where $$x$$ is a cut point of $$x_0$$ with only one minimal geodesic between $$x_0$$ and $$x$$. Indeed, the result of Corollary 1 makes it clear that the behaviour of the directional derivative at such a cut point is identical to that at the non-cut points of $$x_0$$. 
Although we can neither prove it nor produce a counterexample, our conjecture, led by this observation, is that two or more geodesics to each cut point are necessary for our result. Nevertheless, the following corollaries do not require that constraint since, when there is only one minimal geodesic between $$x$$ and $$x_0$$, $$\exp _{x_0}^{-1}(x)$$ is well defined. Corollary 2 Assume that $$\mu$$ is a finite measure on $$\boldsymbol {M}$$ with finite $$p$$-energy function $$F_{\mu ,p}$$ and that $$F_{\mu ,p}$$ achieves a local minimum at $$x_0$$. Then, $$\exp ^{-1}_{x_0}$$ is well defined almost surely under $$\mu$$. Thus, the results in [13], together with Corollary 2, imply that the limiting distribution of the empirical Fréchet mean is closely linked to that of the empirical Euclidean mean of $$\exp _{x_0}^{-1}(X)$$, where $$X$$ is a random variable on $$\boldsymbol {M}$$ with distribution $$\mu$$. The role that the curvature of $$\boldsymbol {M}$$ plays in the limiting distribution of Fréchet means, in addition to that reflected in $$\exp ^{-1}_{x_0}$$, is captured via the expected value of the random linear operator on $$\tau _{x_0}({ \boldsymbol {M} })$$ given by \[H_{x_0,X}:v\longmapsto -(D_v\exp ^{-1}_{x_0}(X))(x_0).\] In particular, irrespective of its global geometric structure, as long as $$\boldsymbol {M}$$ is flat, for example, $${\Bbb {R}}^m$$, $${ \boldsymbol {S} }^1$$, a flat torus $${\Bbb {R}}^m/\Gamma$$, $$\ldots$$, in which case $$H$$ is the identity operator, the limiting distribution for the Fréchet 2-mean is always identical with the limiting distribution of the empirical Euclidean mean of $$\exp _{x_0}^{-1}(X)$$. Moreover, as in the Euclidean case, the probability measure $$\mu$$ plays a role in the limiting distribution only via the expectations of the two operators appearing in its covariance matrix. The $$p$$-energy function $$F_{\mu ,p}$$ is clearly a continuous function as long as it is finite. 
The following result shows that it is also differentiable at its local minima and generalizes a similar well-known result for complete and simply connected Riemannian manifolds of non-positive curvature (cf. [14, p. 109]), which links a local minimum of $$F_{\mu ,p}$$ with the Euclidean $$p$$-mean of the push-forward measure of $$\mu$$ by the inverse exponential map at the local minimum. Corollary 3 Assume that$$\mu$$is a finite measure on$$\boldsymbol {M}$$with finite$$p$$-energy function$$F_{\mu ,p}$$and that$$F_{\mu ,p}$$achieves a local minimum at$$x_0$$. Then,$$F_{\mu ,p}$$is differentiable at$$x_0$$with \[{\mathrm {\rm grad}}(F_{\mu ,p}(x_0))=-\int _{ \boldsymbol {M} }d(x_0,y)^{p-2}\exp _{x_0}^{-1}(y)\,d\mu (y)=0.\] Proof Using the result of Corollary 1, a modification of the proof of the theorem to include the possibility that there is just one minimal geodesic between $$x_0$$ and one or more of its cut points shows that, for any $$v\in \tau _{x_0}({ \boldsymbol {M} })$$, $$v(F_{\mu ,p}(x_0))=0$$ so that $$F_{\mu ,p}$$ is differentiable with zero derivative at $$x_0$$. Moreover, it also shows that, for any $$v\in \tau _{x_0}({ \boldsymbol {M} })$$, \[v(F_{\mu ,p}(x_0)) = -\left \langle \int _{{ \boldsymbol {M} }}d(x_0,y)^{p-2}\exp ^{-1}_{x_0}(y)\,d\mu (y),v\right \rangle .\] Since $$v(F_{\mu ,p}(x_0))=\langle {\mathrm {grad}}(F_{\mu ,p}(x_0)),v\rangle$$, the required expression follows. Barycentres in a proper Alexandrov space of curvature bounded below have been studied in [18]. However, although the distance function on any complete locally compact Alexandrov space of either non-positive curvature or non-negative curvature has a similar property, for any two points in the space, to that given in Lemma 2 (cf. [6, Corollary 4.5.7]), the fact that our proof of the theorem uses the linear structure of the tangent space of a Riemannian manifold implies that it cannot be generalized directly to study Barycentres in Alexandrov spaces. 
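The four-point example on the circle given after the proof of Theorem 1, which shows that the condition $$\mu ({\{ }x_0{\} })=0$$ is needed when $$p=1$$, can be checked numerically. The sketch below is only an illustration under stated assumptions: points of $${ \boldsymbol {S} }^1$$ are represented by angles, the half-angle `a` between $$p_3$$ and $$p_1$$, the weights and the helper names (`circle_dist`, `energy_1`) are hypothetical choices, and a grid search with one-sided difference quotients stands in for the exact argument.

```python
import math

def circle_dist(a, b):
    """Arc-length distance between two angles on the unit circle."""
    d = abs(a - b) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)

def energy_1(x, points, weights):
    """F_{mu,1}(x) = sum_i mu({y_i}) * d(x, y_i) for a discrete measure mu."""
    return sum(w * circle_dist(x, y) for y, w in zip(points, weights))

# Angles of p1=(x,y), p2=(-x,y), p3=(0,1), p4=(0,-1); 'a' is a hypothetical
# half-angle in (0, pi/2) fixing where p1 and p2 sit on the upper half circle.
a = 0.7
p1, p2 = math.pi / 2 - a, math.pi / 2 + a
p3, p4 = math.pi / 2, 3 * math.pi / 2
points = [p1, p2, p3, p4]
# Hypothetical weights with mu(p1) = mu(p2) > 0 and mu(p3) > mu(p4) > 0.
weights = [1.0, 1.0, 0.8, 0.5]

# Crude global search over a grid that contains the angle of p3.
n = 3600
grid = [2.0 * math.pi * k / n for k in range(n)]
x_min = min(grid, key=lambda x: energy_1(x, points, weights))

# One-sided difference quotients at p3 in the two opposite directions.
t = 1e-6
base = energy_1(p3, points, weights)
d_plus = (energy_1(p3 + t, points, weights) - base) / t
d_minus = (energy_1(p3 - t, points, weights) - base) / t

# Mass that mu places on the cut locus {p4} of the minimiser p3.
mass_on_cut_locus = weights[3]
```

With these choices, the grid minimiser coincides with $$p_3$$ and both one-sided quotients are close to $$\mu ({\{ }p_3{\} })-\mu ({\{ }p_4{\} })>0$$, in line with the first two terms of (6), while the cut locus $${\{ }p_4{\} }$$ of $$p_3$$ still carries positive mass.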
Acknowledgements We are indebted to the referee for helpful suggestions to improve our presentation. References 1 Afsari B., ‘Riemannian $$L^{p}$$ center of mass: existence, uniqueness, and convexity’, Proc. Amer. Math. Soc. 139 (2011) 655–673. 2 Afsari B., Tron R., Vidal R., ‘On the convergence of gradient descent for finding the Riemannian center of mass’, SIAM J. Control Optim. 51 (2013) 2230–2260. 3 Arnaudon M., Dombry C., Phan A., Yang L., ‘Stochastic algorithms for computing means of probability measures’, Stochastic Process. Appl. 122 (2012) 1437–1455. 4 Barden D., Le H., ‘Some consequences of the nature of the distance function on the cut locus in a Riemannian manifold’, J. London Math. Soc. 56 (1997) 369–383. 5 Bishop R. L., ‘Decomposition of cut loci’, Proc. Amer. Math. Soc. 65 (1977) 133–136. 6 Burago D., Burago Y., Ivanov S., A course in metric geometry (American Mathematical Society, Providence, RI, 2001). 7 Burago Y., Gromov M., Perelman G., ‘A.D. Alexandrov spaces with curvature bounded below’, Russian Math. Surveys 47 (1992) 1–58. 8 Charlier B., ‘Necessary and sufficient condition for existence of a Fréchet mean on the circle’, Preprint, 2011, arXiv:1109.1986v2. 9 Hotz T., Huckemann S., ‘Intrinsic means on the circle: uniqueness, locus and asymptotics’, Preprint, 2011, arXiv:1108.2141v1. 10 Karcher H., ‘Riemannian center of mass and mollifier smoothing’, Comm. Pure Appl. Math. 30 (1977) 509–541. 11 Kendall D. G., Barden D., Carne T. K., Le H., Shape and shape theory (Wiley, New York, 1999). 12 Kendall W. S., ‘Probability, convexity, and harmonic maps with small image I: uniqueness and fine existence’, Proc. London Math. Soc. 61 (1990) 371–406. 13 Kendall W. S., Le H., ‘Limit theorems for empirical Fréchet means of independent and non-identically distributed manifold-valued random variables’, Braz. J. Probab. Stat. 25 (2011) 323–352. 14 Kobayashi S., Nomizu K., Foundations of differential geometry, II (Wiley-Interscience, New York, 1969). 15 Le H., ‘Estimation of Riemannian barycentres’, LMS J. Comput. Math. 7 (2004) 193–200. 16 Le H., Barden D., ‘Itô correction terms for the radial parts of semimartingales on manifolds’, Probab. Theory Related Fields 101 (1995) 133–146. 17 Margerin C. M., ‘General conjugate loci are not closed’, Differential geometry: Riemannian geometry (Los Angeles, CA, 1990), Proceedings of Symposia in Pure Mathematics 54, Part 3 (eds Greene R., Yau S. T.; American Mathematical Society, Providence, RI, 1993) 465–478. 18 Ohta S., ‘Barycenters in Alexandrov spaces of curvature bounded below’, Adv. Geom. 14 (2012) 571–587. 19 Pennec X., ‘Intrinsic statistics on Riemannian manifolds: basic tools for geometric measurements’, J. Math. Imaging Vision 25 (2006) 127–154. 20 Warner F. W., ‘The conjugate locus of a Riemannian manifold’, Amer. J. Math. 87 (1965) 575–604. 21 Weinstein A. D., ‘The cut locus and conjugate locus of a Riemannian manifold’, Ann. of Math. 87 (1968) 29–41. 22 Yang L., ‘Riemannian median and its estimation’, LMS J. Comput. Math. 13 (2010) 461–479. © 2014 London Mathematical Society TI - On the measure of the cut locus of a Fréchet mean JO - Bulletin of the London Mathematical Society DO - 10.1112/blms/bdu025 DA - 2014-04-08 UR - https://www.deepdyve.com/lp/oxford-university-press/on-the-measure-of-the-cut-locus-of-a-fr-chet-mean-3yuEpOTWNe SP - 698 EP - 708 VL - 46 IS - 4 DP - DeepDyve ER -