K‐means clustering: A half‐century synthesis


Publisher: Wiley
Copyright: 2006 The British Psychological Society
ISSN: 0007-1102
eISSN: 2044-8317
DOI: 10.1348/000711005X48266
PMID: 16709277

Abstract

This paper synthesizes the results, methodology, and research conducted concerning the K‐means clustering method over the last fifty years. The K‐means method is first introduced, various formulations of the minimum variance loss function and alternative loss functions within the same class are outlined, and different methods of choosing the number of clusters and initialization, variable preprocessing, and data reduction schemes are discussed. Theoretical statistical results are provided, and various extensions of K‐means using different metrics or modifications of the original algorithm are given, leading to a unifying treatment of K‐means and some of its extensions. Finally, several future studies are outlined that could enhance the understanding of numerous subtleties affecting the performance of the K‐means method.
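
For readers skimming the abstract, the minimum variance criterion referred to above is, in its standard textbook form (not quoted from the paper),

    \min_{C_1, \ldots, C_K} \sum_{k=1}^{K} \sum_{x_i \in C_k} \lVert x_i - \bar{x}_k \rVert^2,

where \bar{x}_k denotes the centroid (mean vector) of cluster C_k. The classic batch (Lloyd-style) algorithm alternates between assigning each observation to its nearest centroid and recomputing the centroids, as in the following minimal NumPy sketch (an illustrative implementation under standard assumptions, not code from the paper; the function name and defaults are arbitrary):

    import numpy as np

    def kmeans(X, k, n_iter=100, seed=0):
        """Minimal Lloyd-style K-means: alternate assignment and centroid update."""
        rng = np.random.default_rng(seed)
        centroids = X[rng.choice(len(X), size=k, replace=False)]  # random initialization
        for _ in range(n_iter):
            # Assign each point to its nearest centroid (squared Euclidean distance).
            dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
            labels = dists.argmin(axis=1)
            # Recompute each centroid as the mean of its assigned points;
            # keep the old centroid if a cluster happens to be empty.
            new_centroids = np.array([
                X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
                for j in range(k)
            ])
            if np.allclose(new_centroids, centroids):
                break
            centroids = new_centroids
        return labels, centroids

As the abstract notes, the choice of K, the initialization of the centroids, and any variable preprocessing or data reduction all materially affect which local minimum of this criterion the procedure reaches.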

Journal

British Journal of Mathematical and Statistical Psychology (Wiley)

Published: May 1, 2006

