ABSTRACT

RNA-Seq is a powerful technology for analyzing transcriptomes that is predicted to replace microarrays [1]. Leveraging recent advances in sequencing technology, RNA-Seq experiments produce millions of relatively short reads from the ends of cDNAs derived from fragments of sample RNA. The reads produced can be used for a number of transcriptome analyses, including transcript quantification [2-7], differential expression testing [8,9], reference-based gene annotation [6,10], and de novo transcript assembly [11,12]. In this paper we focus on the task of transcript quantification, that is, the estimation of relative abundances at both the gene and isoform levels. After sequencing, the quantification task typically involves two steps: (1) mapping the reads to a reference genome or transcript set, and (2) estimating gene and isoform abundances from the read mappings.
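
To make step (2) concrete, the following is a minimal sketch in Python of a naive count-based estimate. It is not the method of this paper. It assumes that step (1) has already assigned every read to exactly one transcript, which real data rarely satisfies because reads often map to several isoforms or genes. The function name `estimate_abundances`, the input dictionaries, and the toy transcript and gene identifiers are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def estimate_abundances(read_mappings, transcript_lengths, transcript_to_gene):
    """Naive relative-abundance estimate from uniquely mapped reads.

    Assumption (illustrative only): each read maps to exactly one transcript.
    """
    # Count the reads assigned to each transcript.
    counts = Counter(read_mappings.values())

    # Normalize counts by transcript length, since longer transcripts
    # yield more fragments, and hence more reads, at equal abundance.
    rates = {t: counts.get(t, 0) / length
             for t, length in transcript_lengths.items()}

    total = sum(rates.values())
    if total == 0:
        raise ValueError("no reads mapped to any transcript")

    # Isoform-level relative abundances sum to 1.
    isoform = {t: rate / total for t, rate in rates.items()}

    # Gene-level abundance is the sum over the gene's isoforms.
    gene = {}
    for t, fraction in isoform.items():
        g = transcript_to_gene[t]
        gene[g] = gene.get(g, 0.0) + fraction

    return isoform, gene


# Toy input: four reads, two genes, three isoforms (all names hypothetical).
read_mappings = {"r1": "tA1", "r2": "tA1", "r3": "tA2", "r4": "tB1"}
transcript_lengths = {"tA1": 1000, "tA2": 500, "tB1": 1000}
transcript_to_gene = {"tA1": "gA", "tA2": "gA", "tB1": "gB"}

isoform, gene = estimate_abundances(
    read_mappings, transcript_lengths, transcript_to_gene
)
print(isoform)
print(gene)
```

In this toy input, tA1 receives twice as many reads as tA2 but is also twice as long, so the two isoforms end up with equal estimated abundance. Handling reads that map to more than one transcript requires a statistical model rather than simple counting, which this sketch deliberately leaves out.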