ABSTRACT
Diabetes is a common chronic condition, and current prediction methods generally perform poorly. This article proposes a machine learning-based method for diabetes prediction that enables early detection. Three supervised machine learning techniques are evaluated: Random Forest (RF), Linear Support Vector Machine (LSVM), and K-Nearest Neighbors (KNN). We use the PIMA Indian Diabetes dataset from the UCI repository to measure the accuracy and area under the curve (AUC) of each technique. Random Forest outperforms the other two algorithms in predicting diabetes risk, with an AUC of 94.02% and an accuracy of 83.67%. This contribution is important for healthcare workers, since it can help predict the disease early so that it can be treated promptly.
