ABSTRACT
Interdisciplinary computational approaches that combine statistics, computer science, medicine,
chemoinformatics, and biology are becoming highly valuable for drug discovery and development.
Data mining and machine learning methods are increasingly used to analyze the growing
volumes of structured and unstructured biomedical and biological data from many sources,
including hospitals, laboratories, pharmaceutical companies, and even social media.
These data may include sequencing and gene expression data, drug molecular structures, protein
and drug interaction networks, clinical trial and electronic patient records, patient behavior and
self-reported data on social media, regulatory monitoring data, and biomedical literature.