Targeted Learning with Application to Health Care Research

doi:10.1201/9781351061223-12

ABSTRACT

Targeted learning (TL) is a paradigm for transforming data into reliable, actionable knowledge. The TL estimation roadmap describes a series of steps for expressing the substantive goal in terms of a parameter of the distribution of the data, and then estimating that statistical parameter from the data. TL relies on two core analytic methodologies. Super learning (SL) is a data-adaptive ensemble machine learning algorithm for predictive modeling. Targeted minimum loss-based estimation (TMLE) is an efficient, doubly robust, semi-parametric substitution estimator. TL is built upon a theoretical foundation that incorporates machine learning into statistical analysis while preserving the ability to quantify uncertainty in the parameter estimate. TL has been applied in a broad range of health care settings, including risk prediction and estimation of causal and non-causal effects. This chapter describes the roadmap and offers a primer on SL and TMLE in the context of point treatment and longitudinal data analyses. Applications to mortality risk prediction, variable importance ranking, estimation of point treatment effects, and estimation of the parameters of a marginal structural model for longitudinal data illustrate how TL is currently being used in practice. Promising directions for future research are also described.