ABSTRACT

This chapter discusses algorithms for learning and prediction with hidden Markov models (HMMs). A Markov model describes a sequence of states that a given system might go through. The HMMs commonly used to describe protein families have a specific architecture designed to reflect the basic evolutionary processes of deletion, insertion, and substitution. These models compensate to some extent for their low order by building domain-specific information about the source into the model structure. The core of the model specifies the types of residues expected at each position along the sequences of family members, and in doing so it effectively encodes higher-order dependencies between residues. The chapter also discusses a related class of models, called variable order Markov models, from the perspective of codes and compression.