ABSTRACT

As a country's population grows, so does its healthcare burden. Diabetes has become very common, in part because of today's modern lifestyle. It is a group of metabolic disorders in which the blood sugar level stays above a threshold value for an extended period. For such a disease, early prediction can help control its severity and avoid treatment costs that become very expensive later. Machine learning provides well-organized techniques for extracting knowledge from diagnostic medical datasets collected from patients. Using a correlation function and a heat map in Python, we identify which features are most effective for predicting the output accurately. The main focus of this work is to predict the severity level of diabetes at an early stage using machine learning classification techniques, namely K-Nearest Neighbor, Support Vector Machine, Naive Bayes, Logistic Regression, and Random Forest, each evaluated with a confusion matrix. Finally, we compare these classifiers on accuracy, precision, recall, and the receiver operating characteristic (ROC) curve to determine which technique gives the best predictions.
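
The comparison metrics named above all follow from the confusion matrix. The following is a minimal sketch of that relationship, not the code used in this work. It assumes binary labels (1 = diabetic, 0 = non-diabetic). The function names and the label lists are hypothetical and chosen only for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, with 1 as the positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1  # diabetic, predicted diabetic
        elif pred == 1 and truth == 0:
            fp += 1  # non-diabetic, predicted diabetic
        elif pred == 0 and truth == 1:
            fn += 1  # diabetic, predicted non-diabetic
        else:
            tn += 1  # non-diabetic, predicted non-diabetic
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, and recall from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Made-up labels for illustration only; these are not results from the study.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Applying the same function to each classifier's predictions on a shared test set would give one row per technique for a side-by-side comparison. The ROC curve additionally needs predicted scores rather than hard labels, so it is left out of this sketch.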