ABSTRACT

Bioinformatics is increasingly data-driven. Given the growing volume of molecular genetics and genomics data and the computational complexity inherent in analyzing it, the most promising research efforts in the field depend heavily on algorithms and techniques borrowed from data mining and related disciplines such as machine learning and data science. This chapter introduces some of the data mining methods most commonly used in bioinformatics research, with examples showing how they can be put into practice using software tools freely available online. It also gives an overview of some of the most widely used public repositories of genomics data. The introductory concepts and practical examples are designed so that a reader with a biology background and no prior knowledge of data mining can understand how these methods work and how they are applied to contemporary bioinformatics problems.