ABSTRACT

The term “bioinformatics” was first introduced in 1970 by Ben Hesper and Paulien Hogeweg to indicate the study of information processing in biotic systems (Hesper and Hogeweg 1970). Since the late 1980s, bioinformatics has mostly been used to refer to the computational analysis of biological data (Hogeweg 2011), and it has revolutionized biology. Bioinformatics is an interdisciplinary field that combines computer science, statistics, mathematics, and biology. Present-day biological research is highly data driven, with data generated by microarray gene expression measurements, high-throughput sequencing technologies, and proteomic mass spectrometry, to name a few examples. Bioinformaticists develop critical computational tools for organizing, visualizing, storing, and analyzing the data obtained (Kesh 2004). As these large-scale data generation methods become more accessible to researchers around the world, the biological questions that may be addressed become more advanced, and the amount and complexity of biological data increase. For example, the speed and throughput of DNA sequencing platforms have increased faster than the computational power required to analyze the resulting data (Carlson 2003), making research projects increasingly dependent on innovative bioinformatics solutions. This gap is known as the “bioinformatics bottleneck.” Bioinformatics has become an intrinsic component of molecular biological research, in which it plays an important role in the large-scale analysis of sequence data. These sequences may represent proteins (amino acids) or genes and genomes (DNA), and are stored in large databases, maintained

11.1 Introduction
11.2 Databases