ABSTRACT

Summary
7.1 What Is Bioinformatics?
7.2 Database Design
7.2.1 Data Storage beyond the Spreadsheet: Laboratory Information Management Systems (LIMS)
7.2.2 Data Modeling
7.2.2.1 The Flat Format
7.2.2.2 The Hierarchical Format
7.2.2.3 The Relational Database Management System (RDBMS)
7.3 “Natural” Keys in Biological Databases
7.3.1 Genome Location as Reference
7.3.2 Genes as References
7.3.3 Genetic Reference Populations
7.3.3.1 Genetic Correlations and Reference Populations
7.3.3.2 GRPs for Mapping QTLs
7.3.4 Gene Sets as References
7.3.4.1 Gene Ontology
7.3.4.2 The Interactome
7.4 Applications
7.4.1 Sequence Analysis Scoring Matrices and Phylogeny
7.4.1.1 Scoring Matrices
7.4.1.2 Motif Search and Alignment
7.4.1.3 Structure Analysis
7.4.2 QTL Candidate Gene Selection
7.4.3 Microarray, Proteomic, and Other High-Throughput Gene Set Analysis
7.4.4 Text Mining
7.4.5 Integrating the Genome and the Phenome for Systems-Level Bioinformatics
7.5 Toward a Bioinformatics of Behavior
Acknowledgments
References

This chapter provides a brief introduction to biological databases with two major purposes: first, to familiarize readers with the structure and design of databases for use in their own laboratories, and second, to present examples of public biological databases and the approaches that have grown from early bioinformatic methods.