ABSTRACT

Early detection is critical to successful cancer treatment and, in most cases, improves the chances of a positive treatment outcome. Traditionally, early detection has relied on education for early diagnosis and on screening; with data increasingly available, machine learning approaches are now being adopted to improve detection. Specifically, 12 supervised machine learning algorithms, namely kNN, Logistic Regression, Neural Network, Naïve Bayes, Decision Tree, Random Forest, ID3, CHAID, CRT, Support Vector Machines, AdaBoost, and Stochastic Gradient Descent, were applied to prognosis survey data to develop a framework for early detection without screening. Using both symptoms and predisposing factors in developing the framework provided a basis for identifying possible lung cancer cases from the information that an individual provides to the doctor. Most cancer types are detected at advanced stages, so there is a need to detect them as early as possible. The machine learning approach can assist doctors in predicting the likely cancer status of an individual based on symptom descriptions and exposure to predisposing factors. Based on these predictions, doctors can recommend further screening and early treatment for the individual.