The Machine Learning Project | Hands on Data Science for Biologists Us

ABSTRACT

Different Machine Learning algorithms have different performances for datasets. Finding the best algorithm for a dataset has to be straightforward. For a real-world data science project, the dataset has to be trained through various algorithms for determining the best-suited model for the aforementioned dataset. In this chapter, a dataset of genes associated with an age-related disorder will be used for training various ML algorithms learned from previous chapters. The algorithms were evaluated on the test set to compare their performances. Moreover, the concept of positive unlabeled learning will be introduced in an example where we have a well defined positive dataset, but we have no true negative datasets. The classification of genes mostly falls in this category.