© 2018 Australian Statistical Publishing Association Inc. Published by John Wiley & Sons Australia Pty Ltd.
Aust. N. Z. J. Stat. 60(1), 2018, 115–131 doi: 10.1111/anzs.12214
Kernel excess mass test for multimodality
Seoul National University
In this paper we propose a new statistical procedure for testing the multimodality of an
underlying distribution. Peter Hall developed an innovative idea of calibrating the null
distribution for the excess mass test statistic using the empirical distribution function. We
find that the qualitative characteristics of a smooth underlying distribution function regarding the
number of modes are barely preserved in the excess mass functional by the non-smooth
empirical distribution function. Instead of the empirical distribution function, we propose
to use a kernel distribution function estimator. We derive the limiting distribution of the
resulting test statistic under strong unimodality, based on which we apply the calibration
idea to the proposed test statistic to obtain a cut-off value. Our numerical study suggests
that the calibrated kernel excess mass test has greater power than other existing methods.
We also illustrate the use of the proposed method in a case study in astronomy which
supports an assumption on a physical property of minor planets in the solar system.
Key words: calibration; dip test statistic; empirical distribution function; kernel methods; uni-
Inference on the multimodality of a population density is essential in model-based
clustering; see, for example, Chan & Hall (2010). Several nonparametric procedures have
been proposed for testing the unimodality of an underlying distribution. Three well-known
and popular methods are the bandwidth test (Silverman 1981), the dip test (Hartigan &
Hartigan 1985) and the excess mass test (Müller & Sawitzki 1991). The bandwidth test is
known to be asymptotically inaccurate as its exact significance level is different from the
nominal one even as the sample size tends to infinity. It is also sensitive to outliers that
produce spurious bumps in the tail of a kernel density estimator, so that the method suffers
from a loss of power. The dip and excess mass tests based on the empirical distribution
function of the observed data have similar difficulties. In this paper we propose a new
test statistic that seems more promising than these existing test statistics.
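To make the bandwidth test more concrete, the following Python sketch approximates the critical bandwidth on which it is based, that is, the smallest bandwidth at which a Gaussian kernel density estimate has at most k modes. It is an illustration only and not the procedure proposed in this paper. The function names (kde_on_grid, count_modes, critical_bandwidth), the grid-based mode count, the tolerance and the simulated samples are all assumptions introduced here, and the bootstrap calibration of the test is not shown.

```python
import numpy as np


def kde_on_grid(data, h, grid):
    """Evaluate a Gaussian kernel density estimate with bandwidth h on a grid."""
    z = (grid[:, None] - data[None, :]) / h
    return np.exp(-0.5 * z ** 2).sum(axis=1) / (data.size * h * np.sqrt(2.0 * np.pi))


def count_modes(data, h, grid_size=2048):
    """Count strict local maxima of the estimate on an evenly spaced grid.

    This is a grid approximation (an assumption of this sketch): bumps
    narrower than the grid spacing can be missed.
    """
    pad = 3.0 * h
    grid = np.linspace(data.min() - pad, data.max() + pad, grid_size)
    f = kde_on_grid(data, h, grid)
    inner = f[1:-1]
    return int(np.sum((inner > f[:-2]) & (inner > f[2:])))


def critical_bandwidth(data, k=1, tol=1e-3):
    """Approximate the smallest h for which the estimate has at most k modes.

    Bisection relies on the fact that, for the Gaussian kernel, the number
    of modes does not increase as h grows.
    """
    lo = 0.0
    hi = float(data.max() - data.min())
    while count_modes(data, hi) > k:  # widen the bracket if needed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count_modes(data, mid) <= k:
            hi = mid
        else:
            lo = mid
    return hi


# Simulated samples (hypothetical data, used only for illustration).
rng = np.random.default_rng(0)
unimodal = rng.normal(0.0, 1.0, size=200)
bimodal = np.concatenate([rng.normal(-3.0, 1.0, size=100),
                          rng.normal(3.0, 1.0, size=100)])

h_uni = critical_bandwidth(unimodal)
h_bi = critical_bandwidth(bimodal)
print("critical bandwidth, unimodal sample:", h_uni)
print("critical bandwidth, bimodal sample:", h_bi)
```

A large critical bandwidth means that heavy smoothing is needed to remove the extra modes, which is taken as evidence against unimodality. The same count_modes helper can be used to see how a single outlying observation adds a small bump in the tail of the estimate at moderate bandwidths.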
The dip test statistic is defined as the minimal sup-distance between the empirical
distribution function $\hat{F}$ and the class of all unimodal distribution functions. For a distribution
function $F$, let $d(F)$ denote the ‘dip functional’ defined by $d(F) = \inf_{G \in \mathcal{G}} \sup_{x} |F(x) - G(x)|$,
where $\mathcal{G}$ is the class of all unimodal distributions. The dip statistic is simply $d(\hat{F})$, and when
Author to whom correspondence should be addressed.
Seoul National University, e-mail: firstname.lastname@example.org
Acknowledgements. We thank the Associate Editor and two referees for various constructive comments
that helped us improve signiﬁcantly the earlier version of the paper. This work was supported by the
National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (No.