ABSTRACT

This chapter covers the basics of programming and writing functions in R, along with sets, logic, and probability, and uses these tools to construct machine learning algorithms that can classify and predict. Throughout, the chapter refers to the inputs as predictor variables and to the output as the target variable. Different practitioners develop their own vocabularies, so the terms “estimation” and “prediction” are often used interchangeably, and the same looseness applies to “classify” and “predict”. There does seem to be agreement, though, that classification refers to the case where the output variable is categorical, while estimation (or prediction) refers to the case where the output variable is numeric. The Naive Rule is probably the simplest possible algorithm. The NaiveRule() function requires two arguments from the user: the output variable in the dataframe, which is Class in HouseVotes, and the name of the dataframe.
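As a rough illustration of the Naive Rule mentioned above, the following R sketch assumes the rule ignores all predictor variables and assigns every record to the most frequent class of the output variable. The function name naive_rule_sketch(), its argument order, and the toy data frame are hypothetical and are not taken from the chapter; the chapter's own NaiveRule() may differ in signature and return value.

```r
# Hypothetical sketch of a majority-class rule; NOT the chapter's NaiveRule().
# Assumption: the rule predicts the most frequent class for every record.
naive_rule_sketch <- function(data, target) {
  stopifnot(is.data.frame(data), target %in% names(data))

  # Count how often each class occurs (missing values are excluded by default).
  counts <- table(data[[target]])

  # which.max() returns the first maximum, so ties go to the first class listed.
  majority <- names(counts)[which.max(counts)]

  list(
    prediction = majority,
    proportion = as.numeric(counts[majority]) / sum(counts)
  )
}

# Made-up stand-in for a data frame with a categorical Class column.
toy_votes <- data.frame(
  Class = factor(c("democrat", "democrat", "republican",
                   "democrat", "republican")),
  V1    = c("y", "n", "y", "n", "y")
)

result <- naive_rule_sketch(toy_votes, "Class")
print(result$prediction)  # majority class in the toy data
print(result$proportion)  # share of records that belong to that class
```

The returned proportion is the accuracy the rule would achieve on the same data, which is why a rule of this kind is commonly used as a baseline that any more elaborate classifier should beat.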