ABSTRACT

Function prediction from sequence combines highly efficient sequencing techniques with sophisticated computational searches against millions of gene and protein sequences. Bioinformatics seeks to elucidate the relationships between biological sequence, three-dimensional structure and the accompanying functions, and then to use this knowledge for predictive purposes. For homologous proteins with easily recognizable sequence similarity, this type of prediction is based on the 'similar sequence, similar structure, similar function' paradigm, so that functional and evolutionary information can be inferred from sequence comparisons. Both experimental and computational methods are most sensitive at the protein level, and distantly related sequences are easier to detect when they are compared as protein translations rather than as nucleotide sequences. Larger proteins are modular in nature: their structural units, the protein domains, can be covalently linked to generate multi-domain proteins. Sequence analysis techniques provide important tools for the prediction of biochemical function, but show clear limitations in predicting a protein's context-dependent functions and its role in one or more biological processes.