ABSTRACT

Department of Psychiatry and Psychotherapy, University Hospital of Erlangen-Nuremberg, Friedrich-Alexander-University Erlangen, Germany

18.1 Introduction
18.2 Molecular Structure Formats for Chemoinformatics
18.3 Installation of the PaDEL Extension for RapidMiner
18.4 Applications and Capabilities of the PaDEL Extension
18.5 Examples of Computer-aided Predictions
18.6 Calculation of Molecular Properties
18.7 Generation of a Linear Regression Model
18.8 Example Workflow
18.9 Summary
Acknowledgment
Bibliography

The Pharmaceutical Data Exploration Laboratory (PaDEL) is a software suite developed at the National University of Singapore that calculates a variety of properties and fingerprints of molecules from their molecular structure [1]. In chemoinformatics, these properties are called molecular descriptors, since each one represents a molecular property by a numerical value. The aim of this use case is to demonstrate how easily a chemoinformatic prediction model, which relates a number of calculated molecular descriptors to a biological property, can be established. Models based on simple molecular descriptors can save valuable resources by replacing or supporting experiments.