Diabetic Disease Classifier Based on Three Machine Learning Models
Diabetes is widely recognized as a growing epidemic that affects nearly every country, age group, and economy in the world. This worrying trend calls for an immediate response. The healthcare industry regularly produces vast volumes of complex data from a variety of sources, including electronic patient records, medical reports, hospital devices, and billing systems. Traditional approaches cannot handle and interpret the data generated by healthcare transactions because the data are too complex and voluminous. With the rapid growth of the technology, machine learning has been applied to many areas of medicine and healthcare. The aim of this research is to help medical professionals diagnose whether a patient is diabetic by applying machine learning algorithms and evaluating the results to find the best algorithm for predicting diabetes. Support Vector Machine (SVM), K-Nearest Neighbours (KNN), and Random Forest (RF) are implemented in this research. The accuracy score is used as the performance measure for each model. Of the models evaluated, SVC Linear (the SVM with a linear kernel) obtains the highest accuracy score, 78.62%. The proposed models can be useful in medical practice or in assisting medical professionals in making treatment decisions.
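The comparison described above can be illustrated with a minimal sketch. It assumes scikit-learn as the library and uses a synthetic dataset as a stand-in for patient data; the dataset shape, the train/test split, the feature scaling, and every hyperparameter below are assumptions made for illustration rather than the settings used in this research, so the printed accuracies will not match the reported 78.62%.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a diabetes dataset (assumption: 8 numeric
# features and a binary diabetic / not diabetic label).
X, y = make_classification(
    n_samples=768, n_features=8, n_informative=5, random_state=0
)

# Hold out 20% of the samples for testing (assumed split ratio).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# The three model families named in the abstract. SVM and KNN are
# distance-based, so their inputs are standardized first.
models = {
    "SVC Linear": make_pipeline(StandardScaler(), SVC(kernel="linear")),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Fit each model and record its accuracy on the held-out test set.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

# Select the model with the highest accuracy score.
best_model = max(scores, key=scores.get)

for name, acc in scores.items():
    print(f"{name}: {acc:.2%}")
print("Best model:", best_model)
```

Accuracy is the fraction of test samples classified correctly. On an imbalanced medical dataset it can be worth reporting alongside other measures, although accuracy is the only measure named in the abstract.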