Get 20M+ Full-Text Papers For Less Than $1.50/day. Start a 14-Day Trial for You or Your Team.

Learn More →

HIOC: a hybrid imputation method to predict missing values in medical datasets

HIOC: a hybrid imputation method to predict missing values in medical datasets Decision support systems developed using machine learning classifiers have become a valuable tool in predicting various diseases. However, the performance of these systems is adversely affected by the missing values in medical datasets. Imputation methods are used to predict these missing values. In this paper, a new imputation method called hybrid imputation optimized by the classifier (HIOC) is proposed to predict missing values efficiently.Design/methodology/approachThe proposed HIOC is developed by using a classifier to combine multivariate imputation by chained equations (MICE), K nearest neighbor (KNN), mean and mode imputation methods in an optimum way. Performance of HIOC has been compared to MICE, KNN, and mean and mode methods. Four classifiers support vector machine (SVM), naive Bayes (NB), random forest (RF) and decision tree (DT) have been used to evaluate the performance of imputation methods.FindingsThe results show that HIOC performed efficiently even with a high rate of missing values. It had reduced root mean square error (RMSE) up to 17.32% in the heart disease dataset and 34.73% in the breast cancer dataset. Correct prediction of missing values improved the accuracy of the classifiers in predicting diseases. It increased classification accuracy up to 18.61% in the heart disease dataset and 6.20% in the breast cancer dataset.Originality/valueThe proposed HIOC is a new hybrid imputation method that can efficiently predict missing values in any medical dataset. http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png International Journal of Intelligent Computing and Cybernetics Emerald Publishing

HIOC: a hybrid imputation method to predict missing values in medical datasets

Loading next page...
 
/lp/emerald-publishing/hioc-a-hybrid-imputation-method-to-predict-missing-values-in-medical-00ST4sj0GR
Publisher
Emerald Publishing
Copyright
© Emerald Publishing Limited
ISSN
1756-378X
DOI
10.1108/ijicc-03-2021-0042
Publisher site
See Article on Publisher Site

Abstract

Decision support systems developed using machine learning classifiers have become a valuable tool in predicting various diseases. However, the performance of these systems is adversely affected by the missing values in medical datasets. Imputation methods are used to predict these missing values. In this paper, a new imputation method called hybrid imputation optimized by the classifier (HIOC) is proposed to predict missing values efficiently.Design/methodology/approachThe proposed HIOC is developed by using a classifier to combine multivariate imputation by chained equations (MICE), K nearest neighbor (KNN), mean and mode imputation methods in an optimum way. Performance of HIOC has been compared to MICE, KNN, and mean and mode methods. Four classifiers support vector machine (SVM), naive Bayes (NB), random forest (RF) and decision tree (DT) have been used to evaluate the performance of imputation methods.FindingsThe results show that HIOC performed efficiently even with a high rate of missing values. It had reduced root mean square error (RMSE) up to 17.32% in the heart disease dataset and 34.73% in the breast cancer dataset. Correct prediction of missing values improved the accuracy of the classifiers in predicting diseases. It increased classification accuracy up to 18.61% in the heart disease dataset and 6.20% in the breast cancer dataset.Originality/valueThe proposed HIOC is a new hybrid imputation method that can efficiently predict missing values in any medical dataset.

Journal

International Journal of Intelligent Computing and CyberneticsEmerald Publishing

Published: Oct 4, 2021

Keywords: Imputation methods; Multivariate imputation by chained equations; KNN imputation; Mode imputation; Mean imputation; Hybrid imputation

References