Binary clustering with missing data

Publisher: Wiley
Copyright: © 1993 Wiley Subscription Services, Inc., A Wiley Company
ISSN: 8755-0024
eISSN: 1099-0747
DOI: 10.1002/asm.3150090105

Abstract

A clustering method is presented for analysing multivariate binary data with missing values. When not all values are observed, Govaert³ has studied the relations between clustering methods and statistical models. The author has shown how the identification of a mixture of Bernoulli distributions with the same parameter for all clusters and for all variables corresponds to a clustering criterion which uses L1 distance characterizing the MNDBIN method (Marchetti⁸). He first generalized this model by selecting parameters which can depend on variables and finally by selecting parameters which can depend both on variables and on clusters. We use the previous models to derive a clustering method adapted to missing data. This method optimizes a criterion by a standard iterative partitioning algorithm which removes the necessity either to ignore objects or to substitute the missing data. We study several versions of this algorithm and, finally, a brief account is given of the application of this method to some simulated data.
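
The abstract does not give the algorithm itself, so the following is only a minimal sketch of one plausible reading of the simplest case (an L1 criterion with binary cluster centers). It assumes that missing entries, encoded here as `None`, are simply skipped when distances and centers are computed, so that no object is dropped and no value is imputed. The function names, the tie-breaking rule and the stopping rule are illustrative assumptions, not the paper's specification.

```python
from typing import List, Optional, Tuple

Row = List[Optional[int]]  # each entry is 0, 1, or None (missing); an assumed encoding


def observed_l1(row: Row, center: List[int]) -> int:
    """L1 distance between a row and a binary center, over observed entries only."""
    return sum(abs(x - c) for x, c in zip(row, center) if x is not None)


def update_center(rows: List[Row], previous: List[int]) -> List[int]:
    """Majority vote per variable over observed entries; ties keep the previous value."""
    center = []
    for j, old in enumerate(previous):
        observed = [r[j] for r in rows if r[j] is not None]
        ones = sum(observed)
        zeros = len(observed) - ones
        center.append(1 if ones > zeros else 0 if zeros > ones else old)
    return center


def cluster_binary(
    data: List[Row], init_centers: List[List[int]], max_iter: int = 100
) -> Tuple[List[int], List[List[int]], int]:
    """Alternate assignment and center updates until the partition stops changing.

    Assumes max_iter >= 1. Returns (labels, centers, criterion), where the
    criterion is the total observed-entry L1 distance to the assigned centers.
    """
    centers = [list(c) for c in init_centers]
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: nearest center by L1 distance on observed entries.
        new_labels = [
            min(range(len(centers)), key=lambda k: observed_l1(row, centers[k]))
            for row in data
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: recompute each center from its current members.
        centers = [
            update_center([r for r, l in zip(data, labels) if l == k], centers[k])
            for k in range(len(centers))
        ]
    criterion = sum(observed_l1(r, centers[l]) for r, l in zip(data, labels))
    return labels, centers, criterion


# Toy data with missing values (illustrative only, not from the paper).
data = [
    [1, 1, None, 0],
    [1, None, 1, 0],
    [0, 0, 0, None],
    [None, 0, 0, 1],
]
labels, centers, criterion = cluster_binary(data, [[1, 1, 1, 0], [0, 0, 0, 1]])
print(labels, centers, criterion)
```

Skipping missing entries means objects with many missing values contribute fewer terms to the criterion. The models in which parameters depend on variables, or on both variables and clusters, would replace the plain L1 distance with a weighted one and are not covered by this sketch.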

Journal

Applied Stochastic Models and Data Analysis (Wiley)

Published: Mar 1, 1993
