Health care utilization routinely generates vast amounts of data from sources such as electronic medical records, insurance claims, vital signs, and patient-reported outcomes. Predicting health outcomes with data modeling approaches is an emerging field that can reveal important insights into disproportionate spending patterns. This book presents data-driven methods, especially machine learning, for understanding and approaching the high utilizer problem, using the example of a large public insurance program. It describes important goals for data-driven approaches across different aspects of the high utilizer problem and identifies challenges uniquely posed by this problem.

Key Features:

  • Introduces basic elements of health care data, especially administrative claims data, including disease codes, procedure codes, and drug codes
  • Provides tailored supervised and unsupervised machine learning approaches for understanding and predicting high utilizers
  • Presents descriptive data-driven methods for the high utilizer population
  • Identifies best-fitting linear and tree-based regression models that account for patients’ acute and chronic condition loads and demographic characteristics

Table of Contents:

  • Chapter 1 (6 pages)
  • Chapter 2: Overview of Health Care Data (8 pages)
  • Chapter 3: Machine Learning Modeling from Health Care Data (13 pages)
  • Chapter 4: Descriptive Analysis of High Utilizers (18 pages)
  • Chapter 5: Residuals Analysis for Identifying High Utilizers (18 pages)
  • Chapter 6: Machine Learning Results for High Utilizers (21 pages)
  • Chapter 7 (2 pages)