ABSTRACT

Over the last few decades, molecular biology and the equipment available for research in this field have advanced significantly. As a result, large sequencing projects, in which the entire genetic sequence of an organism is obtained, are now routine. Today, several bacterial genomes, as well as those of some simple eukaryotes (e.g., Saccharomyces cerevisiae, or baker’s yeast) and more complex eukaryotes (Caenorhabditis elegans and Drosophila), have been sequenced. The Human Genome Project is another example of rapidly growing research that provides a huge amount of data. These large amounts of sequence data can be used for

• Analysis of the organization of genes and genomes and their evolutionary processes
• Prediction of the function of newly identified genes
• Protein sequence prediction from DNA sequence
• Identifying regulatory factors of a specific gene or RNA
• Identification of mutations that cause diseases, and others

Because this information can help scientists discover or predict unknown features, bioinformatics provides methods for recording, annotating, searching/retrieving, analyzing, and storing nucleic acid sequences (genes and RNAs), protein sequences, and structural information.