ABSTRACT

Cap Analysis of Gene Expression (CAGE) elucidates aspects of the transcriptome that were previously inaccessible on a genome-wide and unbiased scale, opening avenues to find novel global mechanisms of transcriptional regulation. The sequencing of a CAGE library is the entry point to the computational analysis. Aligning similar tag sequences to each other enables the estimation of errors for library and sequencing quality control. The annotation of CAGE sequence tags to known transcripts forms the basis for all subsequent data analysis. The complex structure of CAGE tags and their expression on a genome-wide scale make an intuitive understanding difficult. Because the CAGE technology is unbiased, improvements in the quality of available genomes directly benefit existing CAGE libraries, which can be remapped to an updated genome version. With the advent of individual genomes, CAGE tags can be mapped not only to reference genomes but also to each individual genome separately.