ABSTRACT

This chapter analyzes publicly available transcriptomic data from microarray and RNA-seq experiments. It discusses how to normalize both types of data and the methods for conducting transcriptome-wide association studies. The chapter illustrates a typical use of limma in the analysis of Gene Expression Omnibus (GEO) datasets, following the example of the Alzheimer’s disease study GSE63061. It also covers the normalization of the GSE63061 study, for which non-normalized data are available on its GEO webpage. RNA-seq is a high-throughput technology that scans the transcriptome by sequencing the RNA content of a biological sample. RNA is randomly cut into small fragments called reads, which are sequenced and mapped to a reference genome. The chapter explains how gene-level counts can be used for differential expression profiling to help explain phenotypic differences between subjects. Count data must be normalized across genes, since the number of counts that map to a given gene depends on sequencing depth, gene length, and GC content.