ABSTRACT

Advances in both combinatorial chemistry and laboratory automation have made high-throughput screening of compounds a common methodology, and this has led to the wide availability of libraries of molecules for drug discovery [5-7]. A better understanding of the chemistry behind small-molecule docking, together with examples of ligands that bind to different targets and the explosion of available tertiary and quaternary protein structures, has enabled in silico modeling of small molecules to become standard practice in both academic and commercial laboratories. However, the commercial successes of this approach have been limited primarily to "me-too" versions of successfully marketed pharmaceuticals. The statin drug family, which lowers cholesterol by targeting the enzyme HMG-CoA reductase, is a primary example of this: there are more than 10 marketed statins, all of which share structural similarity in the HMG moiety. Under this mode of thinking, designing a successful new drug relies on identifying compounds directed against known targets rather than against new human disease gene targets.

Docking methods attempt to identify the optimal binding position, orientation, and molecular interactions between a ligand and a target macromolecule. Virtual screens, run on large local computers or cloud-based resources, can screen compound structure libraries with millions of entries and yield hit rates several orders of magnitude greater than those of empirical, bench-based screening at a fraction of the cost. Focused target and small-molecule libraries can be generated based on specific properties of small molecules, on the presence of a specific substructure (e.g., a binding pocket or active site), or on combinations of both. Structural fingerprinting can extend this concept by encoding a three-dimensional structure as a bitwise representation of the presence or absence of particular substructures within a molecule. Algorithms can also be used to cluster structures within a chemical library, and other metrics allow the selection of small-molecule subsets filtered by desired properties to give drug-like, lead-like, or fragment-like compound representations for in silico screening.
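The fingerprinting and clustering ideas described above can be illustrated with a minimal sketch. It assumes a tiny, hypothetical list of substructure keys and toy molecules (`mol_a`, `mol_b`, `mol_c`) that are not taken from any real library; it compares fingerprints with the Tanimoto coefficient and groups them with a simple leader-style clustering pass. Production workflows would normally rely on a cheminformatics toolkit rather than hand-written keys.

```python
from typing import Dict, List, Set

# Hypothetical substructure keys, for illustration only. Real fingerprints
# typically use hundreds to thousands of keys or hashed features.
SUBSTRUCTURE_KEYS = [
    "carboxylic_acid", "hydroxyl", "aromatic_ring",
    "amide", "fluorine", "lactone",
]


def fingerprint(present: Set[str]) -> int:
    """Encode presence/absence of each key as one bit of an integer."""
    bits = 0
    for i, key in enumerate(SUBSTRUCTURE_KEYS):
        if key in present:
            bits |= 1 << i
    return bits


def tanimoto(a: int, b: int) -> float:
    """Tanimoto coefficient: shared on-bits divided by total on-bits."""
    union = bin(a | b).count("1")
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return bin(a & b).count("1") / union


def leader_cluster(fps: Dict[str, int], threshold: float) -> List[List[str]]:
    """Assign each molecule to the first cluster whose leader (first
    member) is at least `threshold` similar; otherwise start a new one."""
    clusters: List[List[str]] = []
    for name, fp in fps.items():
        for cluster in clusters:
            if tanimoto(fp, fps[cluster[0]]) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


# Toy library: the substructure assignments are assumptions, not real data.
library = {
    "mol_a": fingerprint({"carboxylic_acid", "hydroxyl", "aromatic_ring", "fluorine"}),
    "mol_b": fingerprint({"carboxylic_acid", "hydroxyl", "aromatic_ring", "amide"}),
    "mol_c": fingerprint({"lactone", "hydroxyl"}),
}

clusters = leader_cluster(library, threshold=0.5)
print(clusters)
```

In this toy library, `mol_a` and `mol_b` share three of five distinct on-bits and fall into one cluster, while `mol_c` shares only one and starts its own. The threshold of 0.5 is an arbitrary choice for the example; the result of leader-style clustering also depends on the order in which molecules are visited.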