International Scientific and Vocational Studies Journal
Authors: Pınar Cihan
Subjects: Computer Science, Information Systems
Keywords: Missing value imputation, K-nearest neighbor, Singular value decomposition, Bayesian principal component analysis, MissForest
Abstract: Missing values in data sets prevent accurate analysis. The correct imputation of missing values has therefore become a focus of research attention in recent years. This paper compares the most reliable and up-to-date estimation methods for imputing missing values. Imputation has a very high priority because it affects the steps that follow it, such as pre-processing, data analysis, classification and clustering. Root mean square error (RMSE), classification accuracy and execution time are used to evaluate the performance of five of the most popular methods: mean imputation, k-nearest neighbors, singular value decomposition, Bayesian principal component analysis and missForest. When the RMSE and classification accuracy values of the methods were compared, it was observed that the missForest method outperformed the other methods on all datasets.
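
As an illustration of the RMSE criterion named in the abstract, the following minimal sketch hides known cells in a small complete table, fills them by mean imputation (the simplest of the five methods), and scores the result only on the hidden cells. The toy data, the choice of masked cells and the helper names `mean_impute` and `imputation_rmse` are assumptions made for this example; they are not taken from the paper and do not reproduce its experimental protocol.

```python
import math


def mean_impute(rows):
    """Replace None entries with the mean of the observed values in the same column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]


def imputation_rmse(truth, imputed, masked):
    """Root mean square error computed only over the cells that were masked out."""
    squared = [(truth[i][j] - imputed[i][j]) ** 2 for i, j in masked]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical complete data set (assumption for illustration only).
truth = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]

# Cells (row, column) to hide, simulating missing values.
masked = [(0, 1), (3, 0)]
with_missing = [row[:] for row in truth]
for i, j in masked:
    with_missing[i][j] = None

imputed = mean_impute(with_missing)
rmse = imputation_rmse(truth, imputed, masked)
print(f"RMSE of mean imputation on masked cells: {rmse:.4f}")
```

The same scoring function could in principle be applied to the output of any other imputation method, which is what makes RMSE on masked cells a convenient basis for comparing methods on equal terms.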