Diabetic Disease Classifier Based on Three Machine Learning Models

  • Mohd Faaizie Darmawan, Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, Perak Branch Tapah Campus, Perak, Malaysia
  • ‘Ayuni Zamri, Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, Perak Branch Tapah Campus, Perak, Malaysia
  • Shahirah Mohamed Hatim, Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, Perak Branch Tapah Campus, Perak, Malaysia
  • Ahmad Firdaus Ahmad Firdaus, Faculty of Computing, College of Computing and Applied Sciences, Universiti Malaysia Pahang, Pahang, Malaysia
  • Mohd Zamri Osman, Faculty of Computing, College of Computing and Applied Sciences, Universiti Malaysia Pahang, Pahang, Malaysia

Abstract

Diabetes is widely acknowledged as a growing epidemic that affects nearly every country, age group, and economy in the world. This worrying trend requires an immediate response. The healthcare industry regularly produces vast volumes of complex data from a variety of sources, including electronic patient records, medical reports, hospital devices, and billing systems. Traditional approaches cannot handle and interpret the data generated by healthcare transactions because the data are too complex and too voluminous. With the rapid growth of the technology, machine learning has been applied to many areas of healthcare. The aim of this research is to help medical professionals diagnose whether a patient is diabetic or not by applying machine learning algorithms and evaluating the results to find the best algorithm for predicting diabetes. Support Vector Machine (SVM), K-Nearest Neighbours (KNN), and Random Forest (RF) are implemented in this research. The accuracy score is used as the performance measure for each model. Of the models evaluated, SVC Linear obtained the highest accuracy score, 78.62%. The proposed models are valuable for use in medical practice and for assisting medical professionals in making treatment decisions.
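
The evaluation described in the abstract (train a classifier, predict on held-out patients, report the accuracy score) can be illustrated with a minimal sketch. The abstract does not state the dataset, features, train/test split, or hyperparameters, so everything below is an assumption made for illustration: the data are synthetic, the function names (accuracy_score, knn_predict, make_toy_data, split_train_test) are hypothetical, and only KNN is shown because it is the simplest of the three models to write in plain Python. The sketch is not the authors' implementation and does not reproduce the reported 78.62%.

```python
import math
import random
from collections import Counter


def accuracy_score(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def knn_predict(train_X, train_y, x, k=5):
    """Label x by majority vote among its k nearest training points."""
    neighbours = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


def make_toy_data(n_per_class=100, seed=0):
    """Synthetic two-feature data (assumption: NOT the paper's dataset)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((0, (0.0, 0.0)), (1, (2.0, 2.0))):
        for _ in range(n_per_class):
            X.append((rng.gauss(centre[0], 1.0), rng.gauss(centre[1], 1.0)))
            y.append(label)
    return X, y


def split_train_test(X, y, test_fraction=0.3, seed=0):
    """Shuffle the indices and hold out a fraction of the rows for testing."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    n_test = int(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return (
        [X[i] for i in train_idx], [y[i] for i in train_idx],
        [X[i] for i in test_idx], [y[i] for i in test_idx],
    )


# Label 1 stands in for "diabetic", label 0 for "not diabetic".
X, y = make_toy_data()
X_train, y_train, X_test, y_test = split_train_test(X, y)

# Predict every held-out row, then score the predictions.
y_pred = [knn_predict(X_train, y_train, row, k=5) for row in X_test]
acc = accuracy_score(y_test, y_pred)
print(f"KNN accuracy on held-out synthetic data: {acc:.2%}")
```

Comparing several models under this scheme amounts to repeating the predict-and-score step for each classifier on the same held-out rows and keeping the one with the highest accuracy score.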

Published
2022-11-15
How to Cite
DARMAWAN, Mohd Faaizie et al. Diabetec Disease Classifier Based on Three Machine Learning Models. Mathematical Sciences and Informatics Journal, [S.l.], v. 3, n. 2, p. 25-34, nov. 2022. ISSN 2735-0703. Available at: <https://myjms.mohe.gov.my/index.php/mij/article/view/20118>. Date accessed: 07 dec. 2022. doi: https://doi.org/10.24191/mij.v3i2.20118.
Section
Articles
