ABSTRACT

Bioinformatics methods aim to extract information about biological functions and mechanisms from computer-held biological data. This chapter introduces the fundamental ideas underpinning data management techniques and their use, and examines some of the techniques for dealing with the particular problems that arise in managing biological data. A database is a collection of data stored in a computer, in which storage of and access to the data are controlled by specialized software called a database management system (DBMS). Whatever relational DBMS product is used, the fundamental requirement for any user is to know how to access the data in the database. Scientists developing their own databases additionally need to know how to design and create a relational database. The chapter outlines the key concepts of data management techniques: accessing a database, designing a database, overcoming performance problems, and accessing remote data. There are two distinct but related aspects to designing a database: logical design and physical design.