ABSTRACT

In this chapter, the authors concentrate on two distinct but related approaches: (1) acquisition of the genomic sequence, physical map, and related biological matrix for a specific organism, and (2) data integration and analysis for a single family of proteins for which the available information spans the full range of bioinformation. Those devoting resources and effort to the Human Genome Initiative (HGI) and its prototype genome projects should not lose sight of the long-term goals beyond sequencing and mapping. The first approach, genomic analysis of a specific organism, is typified by the authors' involvement with the Center for Prokaryote Genome Analysis and by their interest in other genome sequencing efforts that serve as prototypes for evaluating the requirements, limitations, and potential for success of the HGI. The authors also develop two parallel methods, both of which generate a substructure library from observed protein structures, based on the linear distance plot and a pattern recognition implementation that uses dynamic programming.