ABSTRACT

As sequencing technologies become more affordable, more common, and indispensable to medicine, agriculture, biotechnology, and the environmental sciences, a surge in sequence data is inevitable in the coming years. This chapter describes the emergence of modern gene sequencing technologies and the associated challenges in data storage, sharing, sequence assembly, gene annotation, sequence analysis, and visualization of annotation data. After the introduction, the chapter describes various genome annotation techniques, namely genome assembly, repeat identification, gene annotation (including gene ontology), identification of non-coding genes, and protein sequence annotation (including domains, pathways, interactions, and other features). These post-sequencing annotation tasks are carried out with a variety of bioinformatics tools and packages. Those discussed in this chapter provide an overview of the gene annotation tools available for analyzing data from modern genome sequencing technologies.