International Journal of Informatics and Applied Mathematics
Authors: ["Karima BENHAMZA", "Nadjette BENHAMIDA", "Mohamed Ilyes BOURAHDOUN", "Bilel BOUDJAHEM"]
Topics: -
DOI:10.53508/ijiam.1118198
Keywords: Medical big data, Missing data, Imputation, K-means, Regression
Abstract: Compared with traditional datasets, medical data pose several hidden challenges. In particular, missing values for certain attributes present a major challenge for data mining researchers trying to support correct medical decisions. This paper proposes a hybrid scheme that combines the k-means method with regression analysis. Combining these two analytical methods makes it possible to find the best distributional model of the numerical data in space and helps to predict missing values. Applied to medical data (a diabetes dataset), the proposed model predicts the missing values with a low error rate, which is considered very satisfactory.
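The abstract does not say how k-means and regression are combined, so the following is only a minimal sketch of one plausible reading: cluster the complete records with k-means, fit a separate least-squares line inside each cluster, and fill each missing value with the line of the nearest cluster. The function names (`kmeans_1d`, `nearest`, `fit_line`, `impute`), the single-predictor setup, and the toy data are all assumptions made for illustration. They are not taken from the paper or from the diabetes dataset.

```python
from statistics import mean


def nearest(centroids, x):
    """Index of the centroid closest to the scalar x."""
    return min(range(len(centroids)), key=lambda i: abs(x - centroids[i]))


def kmeans_1d(xs, k, iters=50):
    """Plain Lloyd's k-means on scalar values; returns k centroids."""
    s = sorted(xs)
    # Deterministic initialisation: spread starting centroids over the sorted values.
    centroids = [s[(2 * i + 1) * len(s) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            groups[nearest(centroids, x)].append(x)
        # An empty group keeps its previous centroid.
        new = [mean(g) if g else c for g, c in zip(groups, centroids)]
        if new == centroids:
            break
        centroids = new
    return centroids


def fit_line(points):
    """Ordinary least squares for y = a * x + b on a list of (x, y) pairs."""
    mx = mean([x for x, _ in points])
    my = mean([y for _, y in points])
    sxx = sum((x - mx) ** 2 for x, _ in points)
    if sxx == 0:
        return 0.0, my  # all x equal: fall back to the mean of y
    a = sum((x - mx) * (y - my) for x, y in points) / sxx
    return a, my - a * mx


def impute(rows, k=2):
    """Fill missing y values (None) in a list of (x, y) pairs.

    Assumes at least one complete row; x is always observed.
    """
    complete = [(x, y) for x, y in rows if y is not None]
    centroids = kmeans_1d([x for x, _ in complete], k)
    clusters = [[] for _ in range(k)]
    for x, y in complete:
        clusters[nearest(centroids, x)].append((x, y))
    # Clusters with fewer than two points fall back to a global regression.
    global_line = fit_line(complete)
    lines = [fit_line(c) if len(c) >= 2 else global_line for c in clusters]
    filled = []
    for x, y in rows:
        if y is None:
            a, b = lines[nearest(centroids, x)]
            y = a * x + b
        filled.append((x, y))
    return filled


# Toy data (invented): two regimes, y = 2x for small x and y = 100 - x for large x.
rows = [(1, 2), (2, 4), (3, None), (4, 8), (5, 10),
        (20, 80), (21, 79), (22, None), (23, 77), (24, 76)]
print(impute(rows, k=2))
```

On this toy input the two missing values are filled from different local lines, which a single global regression could not do. A real application would cluster on several attributes and would need to choose k and validate the error on held-out values.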