ABSTRACT

This chapter is designed as a general introduction to sequence analysis for scientists and software developers who wish to write their own computer programs to study nucleic acid sequences. It is also intended for practically minded readers who would like to understand the fundamental principles of the algorithms used in sequence research and the reasons specific methods were preferred over the alternatives. The unifying paradigm in this respect is the idea of pragmatic inference: an organized ensemble of protocols [9-12] that, on the one hand, allows one to construct materially adequate generalizations based upon instances of observable facts (validated induction) and, on the other hand, generates predictions of novel, potentially observable facts that cannot be attained by induction alone (unverified discoveries).* More specific methods for making biological predictions (primarily frequency count analysis) are outlined in considerable detail. However, the coverage of sequence alignment and database searches (including pattern-matching algorithms) is reduced to a minimum because these topics are fully addressed in other chapters of this volume [see the chapters by Crochemore and Sagot (3), Heringa (4), Taylor (6), and Lisacek (8)].