ABSTRACT

This chapter covers quality checking of raw reads, that is, of FASTQ files [1]. Once reads have been aligned to a reference genome, additional quality metrics can be investigated based on the location information, as discussed in Chapter 6. These include coverage uniformity along transcripts, saturation of sequencing depth, ribosomal RNA content, and read distribution between exons, introns, and intergenic regions. Finally, once aligned reads have been counted per gene, sample relations and batch effects can be visualized with heatmaps and PCA plots. This experiment-level quality control is discussed in conjunction with statistical testing in Chapter 8.