Asian Pacific Journal of Health Sciences
Authors: Atul Kumar Ramotra, Vibhakar Mansotra
Topics: -
Keywords: Classification, Clustering, Data mining, Diabetes prediction, Principal component analysis, Synthetic Minority Over-sampling Technique
Abstract: Early prediction of diabetes is very important because the disease can become life-threatening for patients in its later stages. In this paper, a hybrid framework for the prediction of type-2 diabetes is developed. In the first step, the imbalanced dataset is balanced using the Synthetic Minority Over-sampling Technique (SMOTE). Then, the data are clustered using k-means, and all incorrectly clustered entries and outliers are removed. Principal component analysis is then used to reduce the dimensionality of the dataset. In the final step, classification is performed using logistic regression (LR), naïve Bayes, support vector machine, and k-nearest neighbors classifiers. Experimental analysis shows that the proposed hybrid model achieves an accuracy of 98.96% using LR. The results are validated using 10-fold cross-validation.
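
The four steps described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation. It assumes scikit-learn and NumPy, uses synthetic data from `make_classification` in place of the paper's dataset, and replaces SMOTE with a simplified hand-written interpolation (`smote_like`). The helper names, the choice of two clusters, the 95% variance threshold for PCA, and the way "incorrectly clustered" rows are identified are all assumptions. Outlier removal is not shown because the abstract does not say how it is done.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler


def smote_like(X, y, minority=1, k=5, seed=0):
    """Simplified SMOTE (hypothetical helper): interpolate between minority
    points and their nearest minority neighbours until classes are balanced."""
    rng = np.random.default_rng(seed)
    X_min = X[y == minority]
    n_new = int(np.sum(y != minority) - len(X_min))
    if n_new <= 0:
        return X, y
    # Ask for k + 1 neighbours because each point is its own nearest neighbour.
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    neighbours = nn.kneighbors(X_min, return_distance=False)[:, 1:]
    base = rng.integers(0, len(X_min), size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    synthetic = X_min[base] + gap * (X_min[picked] - X_min[base])
    return np.vstack([X, synthetic]), np.concatenate([y, np.full(n_new, minority)])


def drop_misclustered(X, y, seed=0):
    """Keep only rows whose k-means cluster (k=2, an assumption) agrees with
    their class label (hypothetical helper)."""
    clusters = KMeans(n_clusters=2, n_init=10, random_state=seed).fit_predict(X)
    # Cluster ids are arbitrary, so flip them if that agrees better with the labels.
    if np.mean(clusters == y) < 0.5:
        clusters = 1 - clusters
    keep = clusters == y
    return X[keep], y[keep]


# Synthetic, imbalanced stand-in data (roughly 80% / 20%); not the paper's dataset.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=4, n_redundant=0,
    n_clusters_per_class=1, weights=[0.8, 0.2], class_sep=1.5, random_state=0,
)
X = StandardScaler().fit_transform(X)

X_bal, y_bal = smote_like(X, y)                       # step 1: balance classes
X_clean, y_clean = drop_misclustered(X_bal, y_bal)    # step 2: cluster and filter
X_pca = PCA(n_components=0.95).fit_transform(X_clean) # step 3: reduce dimensions
scores = cross_val_score(                             # step 4: LR, 10-fold CV
    LogisticRegression(max_iter=1000), X_pca, y_clean, cv=10
)
print(f"10-fold accuracy: {scores.mean():.4f}")
```

The accuracy printed here reflects the synthetic data and the assumptions above, so it is not comparable to the 98.96% reported in the abstract. Swapping `LogisticRegression` for `GaussianNB`, `SVC`, or `KNeighborsClassifier` from scikit-learn would cover the other three classifiers the abstract lists.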