Aligning Sequences with LAGAN | 20 | Biological Sequence Analysis Usin

ABSTRACT

In this chapter, we use SeqAn to re-implement the basic functionality of the widely used software tool Lagan by Brudno et al. (2003).

13.1 The Lagan Algorithm

Lagan is a tool for aligning two long sequences a_1 ... a_n and b_1 ... b_m using a seed chaining approach (see Section 8.6). The procedure works in four steps, illustrated in Figure 44; line numbers refer to Algorithm 37:

(1) Finding Seeds (lines 2 to 6): For a given length q = q_max, all common q-grams of a_1 ... a_n and b_1 ... b_m are found, e.g., by using a q-gram index (Section 11.2), and then combined into a set D of seeds by local chaining (Algorithm 24 on page 174), using the seed extension mode Chaos (Table 23 on page 176). If no common q-grams are found, q is decreased until a minimal bound q_min is reached.
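
The q-gram search with a decreasing q can be sketched in plain C++ as follows. This is a minimal illustration under stated assumptions, not the SeqAn implementation: a hash map stands in for the q-gram index, and the names QGramHit, commonQGrams, and findSeedCandidates are hypothetical. The subsequent local chaining and Chaos extension of the hits into the seed set D is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical record for one common q-gram: it starts at posA in a and at
// posB in b and has the given length.
struct QGramHit {
    std::size_t posA;
    std::size_t posB;
    std::size_t length;
};

// Returns all exact common q-grams of a and b for one fixed q.
// A hash map from q-gram to its start positions in a stands in for a
// real q-gram index (assumption made for this sketch).
std::vector<QGramHit> commonQGrams(const std::string& a,
                                   const std::string& b,
                                   std::size_t q) {
    std::vector<QGramHit> hits;
    if (q == 0 || a.size() < q || b.size() < q)
        return hits;

    // Index every q-gram of a by its start position.
    std::unordered_map<std::string, std::vector<std::size_t>> index;
    for (std::size_t i = 0; i + q <= a.size(); ++i)
        index[a.substr(i, q)].push_back(i);

    // Look up every q-gram of b and report all matching positions in a.
    for (std::size_t j = 0; j + q <= b.size(); ++j) {
        auto it = index.find(b.substr(j, q));
        if (it == index.end())
            continue;
        for (std::size_t i : it->second)
            hits.push_back({i, j, q});
    }
    return hits;
}

// Starts with q = qMax and decreases q until common q-grams are found or
// q would drop below qMin. Returns an empty vector if nothing is found.
std::vector<QGramHit> findSeedCandidates(const std::string& a,
                                         const std::string& b,
                                         std::size_t qMax,
                                         std::size_t qMin) {
    assert(qMin <= qMax);
    for (std::size_t q = qMax; q >= qMin && q > 0; --q) {
        std::vector<QGramHit> hits = commonQGrams(a, b, q);
        if (!hits.empty())
            return hits;
    }
    return {};
}
```

For example, calling findSeedCandidates("ACGTACGT", "TTACGTTT", 6, 3) finds no common 6-gram, so q drops to 5, where the single common 5-gram TACGT is reported at position 3 in the first sequence and position 1 in the second (0-based).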