ABSTRACT

Genome sequence data provide valuable insights into many aspects of biological science. The explosion of genome sequencing activity, and the need to categorize and extract information from the resulting large data sets, has led to the rapidly expanding field of bioinformatics. Bioinformatics tools allow us to define genes, infer metabolic pathways, and compare organisms, all of which provide insights into biological processes. The need to streamline and improve existing tools, and to develop new tools that can extract useful information from these vast data repositories, becomes more pressing as additional genomes are sequenced. As we progress into the genomics era, such data will become more commonplace and their interpretation more complex. This chapter seeks to (i) summarize common bioinformatic approaches to the analysis and interpretation of primary genome sequence data and (ii) provide examples of data generated using these approaches, drawn from published genome analyses of plant-pathogenic bacteria. Such data serve as a starting point from which a reasonable subset of candidate genes can be defined and targeted for more precise genetic and biochemical analyses. It is hoped that this review will give new researchers insight into the methodologies of genome analysis and provide a framework within which others can interpret the conclusions derived from these analyses.