ABSTRACT

The second half discusses quantitation of gene expression, which is an integral part of most RNA-seq studies. In principle, counting the reads that map to a transcript provides a direct way to estimate its abundance, but in practice several complications need to be taken into account. Eukaryotic genes typically produce several transcript isoforms via alternative splicing and alternative promoter usage. However, quantitation at the transcript level is not trivial with short reads, because transcript isoforms often share common or overlapping exons. Furthermore, the coverage along transcripts is not uniform because of mappability issues and biases introduced in library preparation. Because of these complications, expression is often estimated at the gene level or the exon level instead. However, gene-level counts are not optimal for differential expression analysis of genes that undergo isoform switching, because the number of reads a transcript yields depends on its length, so a shift between a short and a long isoform changes the gene-level count even when the total number of transcripts stays the same. This issue is described in more detail in Chapter 8 in the context of differential expression analysis.
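
The length effect behind the isoform-switching problem can be illustrated with a small calculation. The following Python sketch is a toy model, not a quantitation method: the function name, the isoform lengths, the copy numbers and the sequencing depth are all hypothetical values chosen for illustration, and it assumes that every transcript copy yields reads in direct proportion to its length.

```python
def expected_gene_count(isoform_lengths, isoform_copies, reads_per_kb_per_copy=10):
    """Toy model: expected gene-level read count, assuming each transcript
    copy yields reads in proportion to its length (hypothetical depth)."""
    return sum(
        length * copies * reads_per_kb_per_copy // 1000
        for length, copies in zip(isoform_lengths, isoform_copies)
    )

# Hypothetical gene with a short (1 kb) and a long (4 kb) isoform.
lengths = [1000, 4000]

# Both conditions contain 100 transcript copies in total;
# only the isoform proportions are switched.
condition_a = [90, 10]
condition_b = [10, 90]

count_a = expected_gene_count(lengths, condition_a)
count_b = expected_gene_count(lengths, condition_b)

print("Gene-level count, condition A:", count_a)
print("Gene-level count, condition B:", count_b)
```

Under these assumptions the gene-level count rises from 1300 to 3700 reads although the total number of transcript copies is unchanged, so a count-based test could report differential expression where only the isoform usage differs.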