ABSTRACT

Targeted learning methods build machine-learning-based estimators of parameters defined as features of the probability distribution of the data, while also providing influence-curve-based or bootstrap-based confidence intervals. The theory offers a general template for creating targeted maximum likelihood estimators for a given data structure, nonparametric or semiparametric statistical model, and parameter mapping. The targeted learning framework for efficient estimation was introduced nearly a decade ago [62], following key advances in which efficient influence curves were used as estimating functions for effect estimation [27-30,59] and unified loss-based machine learning methods were developed for fitting infinite-dimensional parameters of the probability distribution of the data [54,58]. Targeted maximum likelihood estimation (TMLE) built on the loss-based super learning system so that lower-dimensional parameters could be targeted, removing the remaining bias for the (low-dimensional) target feature of the probability distribution. TMLE also represents the first class of estimators that provide inference in concert with the use of machine learning.