ABSTRACT

The rate of growth of available information about biomedical and biological systems is exponential. In fact, available information has already gone far beyond the human ability to synthesize, analyze, and predict. The biomedical information torrent is now a continuously growing body of exceedingly intricate knowledge about complex, large dynamic systems. Against this background, one of the most urgent challenges facing the biomedical research community, with direct impact on drug development, is to develop approaches for analyzing this extremely large amount of data to discover patterns (1) or useful information that may be present in the data but not apparent through simple inspection. This integration aspect has always been a challenge in drug development, where data are continuously gathered at a variety of scales and sizes throughout the preclinical and clinical development programs. Recently, academic and health sciences research has also become aware of the importance of this very challenge. The National Institutes of Health (NIH) Roadmap for medical research in the 21st century states that:

Today’s biomedical researcher routinely generates an amount of data that would fill multiple compact discs, each containing billions of bytes of data. [A byte is approximately the amount of information contained in an individual letter of type on this page.] There is no way to manage these data by hand. What researchers need are computer programs and other tools to evaluate, combine, and visualize these data. In some cases, these tools will greatly benefit from the awesome strength of supercomputers or the combined power of many smaller machines in a coordinated way but, in other cases, these tools will be used on modern personal computers and workstations (2).