Ensemble methods often produce effective classifiers by learning a set of base classifiers from a diverse collection of training sets. In this paper, we present a system, voting on classifications from imputed learning sets (VCI), that produces such diverse training sets by randomly removing a small percentage of attribute values from the original training set and then using an imputation technique to replace those values. VCI then runs a learning algorithm on each of these imputed training sets to produce a set of base classifiers. The final prediction on a novel instance is the plurality classification produced by these classifiers. We investigate various imputation techniques, including the state-of-the-art Bayesian multiple imputation (BMI) and expectation maximisation (EM). Our empirical results show that VCI predictors, particularly those using BMI and EM as imputers, significantly improve classification accuracy over conventional classifiers, especially on datasets that are originally incomplete; moreover, VCI significantly outperforms bagging predictors and imputation-helped machine learners.
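The VCI procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: column-mean imputation stands in for BMI and EM, a nearest-centroid learner stands in for the base learning algorithm, and all function names and parameters (`remove_values`, `mean_impute`, `vci_fit`, `vci_predict`, `n_classifiers`, `fraction`, `seed`) are hypothetical choices made for illustration. It also assumes numeric attributes and a complete novel instance at prediction time.

```python
import random
from collections import Counter

def remove_values(X, fraction, rng):
    """Copy X, setting each value to None with probability `fraction`."""
    return [[None if rng.random() < fraction else v for v in row] for row in X]

def mean_impute(X):
    """Replace each None with the mean of the observed values in its column.
    Assumption: a simple stand-in for the BMI and EM imputers."""
    means = []
    for j in range(len(X[0])):
        observed = [row[j] for row in X if row[j] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(row)] for row in X]

def train_nearest_centroid(X, y):
    """Assumed base learner: store one centroid per class label."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    return {label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in groups.items()}

def predict_nearest_centroid(model, x):
    """Return the label whose centroid is closest to x (squared Euclidean)."""
    return min(model, key=lambda label: sum((a - b) ** 2
                                            for a, b in zip(x, model[label])))

def vci_fit(X, y, n_classifiers=11, fraction=0.05, seed=0):
    """Train one base classifier per perturbed-and-imputed training set."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_classifiers):
        X_imputed = mean_impute(remove_values(X, fraction, rng))
        models.append(train_nearest_centroid(X_imputed, y))
    return models

def vci_predict(models, x):
    """Plurality vote over the base classifiers' predictions."""
    votes = Counter(predict_nearest_centroid(m, x) for m in models)
    return votes.most_common(1)[0][0]

# Toy data: two well-separated classes.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
y = ["a", "a", "a", "b", "b", "b"]
models = vci_fit(X, y, n_classifiers=5, fraction=0.1, seed=1)
```

Because values that are already missing (`None`) are preserved by `remove_values` and filled by `mean_impute`, the same sketch accepts a training set that is originally incomplete. Using an odd number of base classifiers reduces the chance of tied votes in two-class problems.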
International Journal of Information and Decision Sciences – Inderscience Publishers
Published: Jan 1, 2009