ABSTRACT

Computer software designed for the analysis of nucleic acid and protein sequence data is an indispensable tool in molecular biology and genetic engineering research. Concerted efforts in genome sequencing and gene discovery have produced large amounts of DNA sequence data and led to the development of databases containing enormous numbers of records. However, the majority of these sequences correspond to so-called hypothetical genes or gene fragments that code for proteins of unknown function, or to DNA sequences and motifs with unknown regulatory capabilities. Hence, scientists face the immense task of analyzing these sequence data to identify useful genes and genetic elements.