ABSTRACT

The hereditary information of all living organisms is carried by deoxyribonucleic acid (DNA) molecules, which consist of two complementary chains of nucleotides wound into a double-helical structure. The ability of molecular biologists to rapidly sequence these nucleotides has created many new areas of biomedical research and driven the development of sequencing techniques and technologies. DNA sequencing has now become a standard laboratory technique and has led to the sequencing of the complete genomes of numerous organisms. Most notable was the Human Genome Project, which mapped 30,000 functional human genes and sequenced approximately 3 billion DNA base pairs (van Ommen, 2002). These breakthrough sequencing results have paved the way for the availability of tremendous amounts of data. Moreover, the ever-expanding data have necessitated an exhaustive, reliable and reproducible application of skills for the storage and interpretation of datasets. These skills, when applied to the domain of functional genomics, are described as bioinformatics. The Human Genome Project has made available genetic data such as DNA sequences, physical maps, genetic maps, gene polymorphisms, protein structures, gene expression profiles and protein interaction effects (Kohane and Butte, 2002). The collection of these huge datasets has further produced an urgent need for systematic quantitative analysis.