ABSTRACT

In many applications, one must invest effort or money to acquire the data and other information required for machine learning and data mining. Careful selection of the information to acquire can substantially improve generalization performance per unit cost. The costly-information scenario that has received the most research attention (see Chapter 10) has come to be called “active learning”; it focuses on choosing the instances for which target values (labels) will be acquired for training. However, machine learning applications involve a variety of other kinds of information that may need to be acquired.