ABSTRACT

In this chapter, the authors briefly describe the first two datasets, one on the sinking of the RMS Titanic and one on apartment prices, together with the results of exploratory analyses. The sinking of the RMS Titanic is one of the deadliest maritime disasters in history: over 1500 people died as a consequence of the collision with an iceberg. Predicting house prices is a common exercise in machine-learning courses. Apartment-prices data are provided in the apartments dataset, which is available in the dalex library. Note that apartments is an artificial dataset created to illustrate and explain differences between random forest and linear regression. However, the dataset is constructed so that these two very different models offer almost exactly the same overall accuracy of predictions. To simplify working with such different models, the dalex library wraps them in objects of class Explainer, which provide, in a uniform way, all the functions necessary for working with models.