ABSTRACT

Sequencing goes hand in hand with computational analysis. Effective translation of the accumulating high-throughput sequence data into meaningful biomedical knowledge and applications relies on its interpretation. High-throughput sequence analyses are made possible only by intelligent computational systems designed specifically to decipher the meaning of the complex world of nucleotides. Most of the data obtained with state-of-the-art next-generation sequencers are in the form of short reads. Hence, analysis and interpretation of these data encounter several challenges, including those associated with base calling, sequence alignment and assembly, and variant calling. Often the data output per run is beyond the capacity of a common desktop computer to handle, so a high-performance computer cluster becomes a necessity for efficient genome-seq data analysis. These challenges have led to the development of innovative computational tools and bioinformatics approaches to facilitate data analysis and clinical translation. Although de novo genome-seq is in full swing for sequencing new genomes of animals, plants, and bacteria, this chapter covers only human genome-seq data analysis, in which newly sequenced human genome data are aligned to the human reference genome. Here, we highlight some genome-seq applications, summarize typical genome-seq data analysis procedures, and demonstrate both command-line interface-based and graphical user interface (GUI)-based genome-seq data analysis pipelines.