ABSTRACT

In the past decade, the pharmaceutical industry has recognized the growing importance of methods used in the early stage of the drug discovery process traditionally referred to as the hit-to-lead optimization phase. In particular, knowledge-based approaches have emerged and evolved to address a multitude of significant issues such as biological activity profile, metabolism, pharmacokinetics, toxicity, and lead- and druglikeness [1]. In this chapter, we focus on the development and application of knowledge-based methods related to classification algorithms in virtual screening (VS) programs. Compound classification methods, which correlate molecular properties with specific activities, play a significant role in modern VS strategies. Because the current drug discovery paradigm holds that mass random synthesis and screening do not necessarily provide a sufficiently large number of high-quality leads, such computational technologies are in great industrial demand. The most typical application of classification algorithms is the identification of compounds with desired target-specific activity, which constitutes an essential part of the VS approach. Statistical methods can be applied to process the results of high-throughput screening (HTS) or known literature data and to develop predictive models of biological activity. These models can then be used to select screening candidates from virtual databases. However, achieving the desired specificity and activity alone is not sufficient to produce high-quality clinical candidates. Accordingly, druglikeness, a favorable ADME-Tox profile (see Section 6.11), solubility, and pharmacokinetic and metabolic characteristics should be taken into consideration as early as possible. This trend strongly influences contemporary VS programs. It is usually realized through the development and implementation of various special filters in screening tools.
Such filters allow the prediction of compound characteristics significant for particular bioscreening purposes.