ABSTRACT

Machine learning is widely recommended as an approach to the early detection of breast cancer. Datasets that describe features extracted from mammograms and other imaging techniques make it possible to train models that aid early detection. Early diagnosis supports timely access to cancer treatment, thereby improving patients' quality of life.

This chapter presents an approach to early breast cancer prognosis based on classification with ten machine learning techniques: Logistic Regression (LR), K-Nearest Neighbors (KNN), Decision Tree, Random Forest, Naive Bayes (NB), AdaBoost Tree, Support Vector Machine (SVM), Radial Basis Function (RBF), Quadratic Discriminant Analysis (QDA), and Multilayer Perceptron, applied to the Wisconsin Breast Cancer (original) dataset. The dataset features are categorized into three sub-classes (mean, standard error, and worst), and the performance parameters are computed using Jupyter Notebook as the simulation tool.

Based on the comparative analysis, the Support Vector Machine (SVM), which achieves around 95 percent accuracy and the best recall value, is presented as the preferred choice for modeling. The model is validated on the testing dataset with a minimal number of false-positive results.
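
To make the performance parameters named above concrete, the following is a minimal sketch of how accuracy, recall, and the false-positive count could be computed from true and predicted labels. The function name binary_metrics, the label encoding (1 = malignant, 0 = benign), and the sample labels are assumptions made for illustration only; they are not taken from the chapter's experiments.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, recall, and false-positive count for binary labels.

    Assumption: `positive` marks the malignant class (hypothetical encoding).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # malignant correctly flagged
        elif pred == positive:
            fp += 1  # benign wrongly flagged as malignant
        elif truth == positive:
            fn += 1  # malignant case missed
        else:
            tn += 1  # benign correctly cleared

    accuracy = (tp + tn) / len(y_true)
    # Recall is the share of truly malignant cases that the model detects.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "recall": recall, "false_positives": fp}


# Hypothetical labels, for illustration only (1 = malignant, 0 = benign).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In a screening setting, recall matters because a missed malignant case (false negative) delays treatment, while the false-positive count reflects how many benign cases would be flagged unnecessarily.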