Data and Basic Modeling

doi:10.1201/9781003402848-2

ABSTRACT

Machine learning is the process of learning computer models from relationships in data. Two major paradigms of machine learning supported by mlr3 are supervised learning and unsupervised learning. In supervised learning, datasets consist of features and a target (or label) that we try to predict from the features. In contrast, unsupervised learning identifies groupings in the data based on features alone, without a target. Machine learning models are trained on data, can make predictions on new data, and can be evaluated with respect to their predictive quality.

This chapter introduces the building blocks of mlr3: Tasks, which are computational representations of machine learning problems; Learners, which are machine learning algorithms; and Measures, which are metrics for evaluating performance. We focus on how to use these building blocks; only basic machine learning theory is given. Tasks are introduced first, including the relevant methods and fields for accessing and filtering data. Learners are covered next, including how to load a Learner, train it on a subset of the data in a Task, and make predictions on a different subset. Measures are covered at the end of the chapter.
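
As a rough preview of how the three building blocks fit together, the following is a minimal sketch rather than the chapter's own example. The choice of the built-in "iris" task, the "classif.rpart" learner, the "classif.acc" measure, the seed, and the split ratio are all illustrative assumptions that do not come from the text above.

```r
library(mlr3)

# Task: a built-in classification problem (assumed here for illustration).
task = tsk("iris")

# Learner: a classification tree from the rpart package (illustrative choice).
learner = lrn("classif.rpart")

# Split the row ids of the task into disjoint training and test sets.
# The seed and the ratio are arbitrary values chosen for this sketch.
set.seed(1)
split = partition(task, ratio = 0.67)

# Train on one subset of the task and predict on a different subset.
learner$train(task, row_ids = split$train)
prediction = learner$predict(task, row_ids = split$test)

# Measure: classification accuracy, scored on the held-out predictions.
measure = msr("classif.acc")
accuracy = prediction$score(measure)
print(accuracy)
```

The same pattern should carry over to other tasks, learners, and measures by changing the keys passed to `tsk()`, `lrn()`, and `msr()`; a regression task would need a regression learner and a regression measure.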