ABSTRACT

This chapter describes the use of machine learning (ML) to study molecules in conjunction with molecular dynamics (MD) computer simulations. The theory-driven approach of MD is complemented by the data-driven strategy of ML. MD allows the study of systems in atomic or near-atomic detail by solving Newton’s equations of motion in small time steps. The most time-consuming part of this approach is the computation of the forces, which can be carried out via density functional theory (DFT) and accelerated by several orders of magnitude by combining DFT with ML. Choosing an appropriate method for MD is always a trade-off between accuracy and efficiency. An even more efficient technique is to describe the interatomic interactions using a simple mathematical form with fixed parameters fitted to experimental data. Efficiency can be enhanced further by lowering the resolution, that is, by grouping atoms into superatoms and using coarse-grained (CG) models. Here, ML may be employed to infer CG forces from atomistic models or to backmap CG configurations to atomistic ones. The output of MD simulations of proteins is often analyzed using ML techniques, ranging from principal component analysis to neural networks, which efficiently reduce the dimensionality of the data.
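
As an illustration of what solving Newton’s equations of motion in small time steps means in practice, the following minimal sketch integrates a single particle in one dimension. It rests on several assumptions that are not taken from the chapter: the velocity Verlet scheme is used as one common choice of integrator, the harmonic force is a toy stand-in for the expensive force evaluation (DFT, an ML model, or a fitted analytical form), and all function names, parameters, and reduced units are hypothetical.

```python
import math


def harmonic_force(x, k=1.0):
    """Toy force F = -k x. In a real MD code this call is the costly step
    (DFT, an ML surrogate, or a classical analytical form)."""
    return -k * x


def velocity_verlet(x, v, force, mass=1.0, dt=0.01, n_steps=1000):
    """Propagate position x and velocity v with the velocity Verlet scheme.

    Returns a list of (x, v) tuples, including the initial state.
    """
    f = force(x)
    trajectory = [(x, v)]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # one force evaluation per step
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        trajectory.append((x, v))
    return trajectory


def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the toy oscillator."""
    return 0.5 * mass * v * v + 0.5 * k * x * x


# Start at x = 1 with zero velocity and integrate up to t = n_steps * dt = 10.
trajectory = velocity_verlet(x=1.0, v=0.0, force=harmonic_force)
x_final, v_final = trajectory[-1]
energy_drift = abs(total_energy(x_final, v_final) - total_energy(1.0, 0.0))
```

For this toy system the exact solution is x(t) = cos(t), so the final position can be compared with cos(10), and the total energy should stay nearly constant for a sufficiently small time step. The loop calls the force function once per step, which is why the cost of that call dominates a simulation.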