ABSTRACT

Software architecture used in discovery informatics is characterized by its diversity. It is rare to find an environment used in target identification and validation that consists solely of commercial off-the-shelf software. In most cases the research environment consists of a variety of commercial, open-source, and locally developed software packages. HTML and Web technologies are often used as a mechanism to integrate these disparate environments. One aspect of research IT is its data-centric nature, often involving a wide variety of data types: unstructured and semistructured text, and structured data held in relational and object-oriented databases. This heterogeneous collection of software and data, often seen in drug-discovery informatics, can pose significant challenges for administration and management. The research process itself is one of experimentation, iteration, and hypothesis testing. In this environment the analysis tools in use can be changed or augmented frequently. This process requires that the applied software architecture be flexible enough to accommodate rapid and frequent changes. Ultimately the requirements for the integration of informatics software and data are driven by the research process.