A comparison of single imputation and multiple imputation methods for missing data in different oncogene expression profiles

Publisher
Taylor & Francis
Copyright
© 2022 International Biometric Society – Chinese Region
ISSN
2470-9379
eISSN
2470-9360
DOI
10.1080/24709360.2021.2023805

Abstract

This study evaluated the performance of the multiple-imputation (MI) method for missing data in gene expression profiles, across different datasets and percentages of missing values, in comparison with 3 single-imputation (SI) methods. Using 3 gene expression profile datasets from human colon cancer, non-small cell lung cancer, and lymph cancer, different deletion rates and different numbers of imputations for MI were compared. The imputation and clustering performance of each method was evaluated using the normalized root mean square error (NRMSE) and the gene clustering accuracy (F value). For all 4 methods, the NRMSE gradually increased as the percentage of missing values in the 3 datasets increased, whereas the F value gradually decreased. In all datasets and at every percentage of missing values, the NRMSEs of MI were consistently lower than those of the 3 SI methods, and the F value of MI was the highest. The NRMSEs of MI gradually decreased as the number of imputations increased and increased as the variability in the original datasets increased, and the datasets imputed by MI showed the best clustering results. The results suggest that applying MI develops and enriches imputation-model approaches and provides a solid foundation for subsequently establishing imputation strategies for gene expression profiles with missing data.
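
The evaluation design described in the abstract (delete values at a chosen rate, impute them, then score the result with the NRMSE) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the abstract does not state the exact NRMSE formula, so one common definition is used (RMSE over the deleted entries divided by the standard deviation of their true values). The expression matrix is synthetic, the function names are hypothetical, and row-mean filling only stands in for an SI method; it is not necessarily one of the 3 SI methods compared in the paper.

```python
import numpy as np

def nrmse(true_values, imputed_values, mask):
    """NRMSE over the entries flagged in `mask` (True = value was deleted).

    Assumed definition: RMSE of the imputed entries divided by the
    standard deviation of the true values at those same entries.
    """
    diff = imputed_values[mask] - true_values[mask]
    return np.sqrt(np.mean(diff ** 2)) / np.std(true_values[mask])

def delete_at_random(data, rate, rng):
    """Return a copy of `data` with roughly `rate` of its entries set to NaN, plus the mask."""
    mask = rng.random(data.shape) < rate
    incomplete = data.copy()
    incomplete[mask] = np.nan
    return incomplete, mask

def row_mean_impute(incomplete):
    """Illustrative single imputation: fill each gene (row) with its observed mean."""
    filled = incomplete.copy()
    row_means = np.nanmean(incomplete, axis=1, keepdims=True)
    nan_positions = np.isnan(filled)
    filled[nan_positions] = np.broadcast_to(row_means, filled.shape)[nan_positions]
    return filled

rng = np.random.default_rng(0)
expression = rng.normal(size=(200, 30))  # synthetic genes x samples matrix, not real data

for rate in (0.05, 0.10, 0.20):
    incomplete, mask = delete_at_random(expression, rate, rng)
    completed = row_mean_impute(incomplete)
    print(f"deletion rate {rate:.2f}: NRMSE = {nrmse(expression, completed, mask):.3f}")
```

Scoring only the deleted entries keeps the observed values, which every method reproduces exactly, from diluting the error. An MI method would be scored the same way after its multiple completed datasets have been combined into one.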

Journal

Biostatistics & Epidemiology, Taylor & Francis

Published: Jan 2, 2022

Keywords: Gene expression profiles; missing data; multiple imputation; clustering analysis
