ABSTRACT

Read mapping is the process of aligning Next-Generation Sequencing (NGS) reads to a reference genome. Many NGS applications require read mapping, including genome variation calling, transcriptome analysis, transcription factor binding site calling, epigenetic mark calling, and metagenomics. Alignment accuracy affects the performance of these applications. This chapter discusses computational techniques that address both the accuracy and the efficiency of alignment. It gives an overview of the read mapping problem and simple brute-force solutions, then presents different methods for solving the problem depending on whether gaps are allowed in the read alignment. These methods fall into two classes: the seed-and-extension approach and the filtering approach. The chapter also introduces the mapQ score and discusses the computational challenges of solving the read mapping problem. It also includes exercises related to NGS read mapping.