ABSTRACT

Biomarkers are critical for predicting disease incidence and for evaluating medical indications and treatment outcomes in complex diseases such as cancer, Alzheimer’s, and Parkinson’s. Biomarker identification helps distinguish normal biological processes from pathogenic ones in healthy and diseased individuals. Epidemiological data indicate that cancer accounts for nearly one in six deaths globally. Machine learning approaches are widely used in transcriptomic data analysis and complement experimental methods by reducing the resources, time, and cost required for microarray assays. Previous studies have revealed the need to review the work done on cancer biomarker identification, as well as the challenges and limitations of applying supervised machine learning algorithms and models to transcriptomic data. This chapter examines the various supervised machine learning methods applied to the identification and prediction of biomarkers associated with cancer. The major limiting factors for supervised machine learning on cancer transcriptomic data are high dimensionality and data scarcity.