ABSTRACT

This chapter gives a brief introduction to methods in high-dimensional statistics. Structured regression methods provide powerful tools for the high-dimensional data analysis problems encountered in climate science. The chapter considers the task of predicting climate variables over land regions using climate variables measured over oceans. It illustrates the Sparse Group Lasso (SGL), which uses a hierarchical norm regularizer to encode sparsity arising from the natural grouping within covariates. Experiments show that SGL achieves improved prediction accuracy and interpretable model selection compared to ordinary least squares. The chapter also considers the problem of combining global climate models (GCMs) in order to improve predictions. GCM skills are estimated for each spatial location based on their performance in reproducing historical observations at that location. The chapter formulates the problem as a multitask learning (MTL) problem and provides a framework for learning relationships among tasks using structure learning methods while estimating the model skills.