International Scientific and Vocational Studies Journal

A Comparison of Five Methods for Missing Value Imputation in Data Sets

Authors: Pınar Cihan

Volume 2, Issue 2, 2018, Pages 80-85

Subjects: Computer Science, Information Systems

Keywords: Missing value imputation, K-nearest neighbor, Singular value decomposition, Bayesian principal component analysis, MissForest

Abstract: Missing values in data sets prevent accurate analysis, so the correct imputation of missing values has attracted considerable attention from researchers in recent years. This paper compares some of the most reliable and up-to-date methods for imputing missing values. Imputation has a very high priority because it affects subsequent pre-processing, data analysis, classification, clustering, and similar steps. Root mean square error (RMSE), classification accuracy, and execution time are used to evaluate the performance of five of the most popular methods (mean, k-nearest neighbors, singular value decomposition, Bayesian principal component analysis, and missForest). When the RMSE and classification accuracy values of the methods were compared, the missForest method was observed to outperform the other methods on all data sets.
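The RMSE-based evaluation mentioned in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: it covers only two of the five methods (mean and a simple k-nearest-neighbor variant), uses only the Python standard library, and the function names (`mean_impute`, `knn_impute`, `rmse`) and the toy data are assumptions made for illustration. The idea is to hide known values, impute them, and measure the error on the hidden cells only.

```python
import math

def mean_impute(rows):
    """Replace each None with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]

def knn_impute(rows, k=2):
    """Replace each None with the average of that column over the k nearest rows.

    Distance is the root mean squared difference over the columns that are
    observed in both rows (an assumed, simplified distance for this sketch).
    """
    def dist(a, b):
        shared = [(x - y) ** 2 for x, y in zip(a, b)
                  if x is not None and y is not None]
        return math.sqrt(sum(shared) / len(shared)) if shared else math.inf

    filled = [list(r) for r in rows]
    for i, r in enumerate(rows):
        for j, v in enumerate(r):
            if v is None:
                # Candidate donors: other rows where column j is observed.
                donors = sorted(
                    (dist(r, other), other[j])
                    for m, other in enumerate(rows)
                    if m != i and other[j] is not None
                )[:k]
                filled[i][j] = sum(val for _, val in donors) / len(donors)
    return filled

def rmse(truth, imputed, hidden_cells):
    """Root mean square error over the cells that were artificially hidden."""
    errs = [(truth[i][j] - imputed[i][j]) ** 2 for i, j in hidden_cells]
    return math.sqrt(sum(errs) / len(errs))

# Toy data (made up for illustration): hide two cells, then compare methods.
truth = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
hidden = [(0, 1), (3, 0)]
masked = [list(r) for r in truth]
for i, j in hidden:
    masked[i][j] = None

print("mean RMSE:", rmse(truth, mean_impute(masked), hidden))
print("kNN  RMSE:", rmse(truth, knn_impute(masked, k=1), hidden))
```

Because the error is computed only on cells whose true values are known, the same harness could in principle wrap any imputation method, which is what makes RMSE usable as a common yardstick across methods.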


CITE
BibTeX
@article{2018, title={A Comparison of Five Methods for Missing Value Imputation in Data Sets}, volume={2}, number={2}, publisher={International Scientific and Vocational Studies Journal}, author={Pınar Cihan}, year={2018}, pages={80–85} }
APA
Cihan, P. (2018). A Comparison of Five Methods for Missing Value Imputation in Data Sets. International Scientific and Vocational Studies Journal, 2(2), 80–85.
MLA
Cihan, Pınar. "A Comparison of Five Methods for Missing Value Imputation in Data Sets." International Scientific and Vocational Studies Journal, vol. 2, no. 2, 2018, pp. 80–85.