ABSTRACT

Over a decade ago, sequencing of random cDNA clones to obtain expressed sequence tags (ESTs) took root as a new approach for gene discovery in forest trees. An EST is the read from a single sequencing reaction on a cDNA fragment or clone, usually from one of its termini (3’- or 5’-end). Loblolly pine was among the very first trees to be targeted (Allona et al. 1998). The approach had been successfully applied to several model organisms and, as costs began to decrease, it was applied to diverse organisms, including crop plants. Considering the exceedingly large size of conifer genomes (ranging from 20 to 30 gigabases), only a small fraction of which likely contains protein-coding information, it was acknowledged that sequencing numerous cDNA clones was an efficient approach to characterizing the portions of the genome that could most rapidly contribute to our understanding of tree biology, as well as provide new tools for tree improvement (Kirst et al. 2003). To the extent that cDNA clones represent faithful DNA copies of a majority of the RNA transcripts from a given tissue sample, deep sequencing of the cDNA pool has the potential to reveal a significant fraction of all of the coding sequences that contribute to the growth and development of the tissue. To the extent that sampling can be extended to as many tissues and stages of development as possible, a catalog of all the actively expressed sequences in the organism (the transcriptome) may be defined. Until DNA sequencing technologies have improved enough to make characterization of very large genomes cost-efficient, the transcriptome will represent the window through which the conifer genome is most frequently accessed. Today, the transcriptome is at the heart of many basic and applied endeavors in conifer genomics.