Prediction is quite often the ultimate goal of temporal data mining. Because of this close relationship with data mining, a whole chapter is dedicated to the subject of temporal prediction. Temporal prediction has applications in such diverse areas as financial forecasting, meteorology, seismology, and medical disease detection. For example, a company may be interested in predicting its average closing stock price for next month. In another example, a doctor would like to predict the reaction of his patients to a new diabetes medication, particularly the duration of hypoglycemic episodes. The first example falls into the area of time series prediction, which is also known as time series forecasting. Note that the terms prediction and forecasting will be used interchangeably in the remainder of the chapter. The second example falls into the area of event prediction. The differences between time series prediction and event prediction are summarized below:
• Problem statement: In univariate time series forecasting, the problem is to predict the value of a variable at a future time that is a multiple of a given time interval. For example, a company would like to predict its sales for next month. In event prediction, we would like to predict the occurrence of an event, the number of occurrences of an event, or the duration of an event, given the existence of certain conditions. For example, a doctor has just put one of his epileptic patients on a new drug that is very effective but that might cause a severe migraine attack upon therapy initiation. The doctor would like to predict the duration of this attack given the age of the patient and the number and duration of the patient's epileptic attacks in the last year. In this book, we will only examine event prediction problems that deal with the prediction of an event's duration. The reason is that an event's duration can be modeled as a continuous variable, so we can predict it with linear regression, which is one of the most widely available regression methods. The occurrence of an event, in contrast, is modeled with a dichotomous categorical variable, for which the appropriate type of regression is logistic regression [Fel09]. The number of occurrences of an event is modeled with a count variable, for which the appropriate type of regression is Poisson regression [Orm09].
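
To make the idea of predicting an event's duration with linear regression concrete, the following is a minimal sketch in Python. It is not taken from the book: the data are invented for illustration and deliberately noise-free so the result can be checked by hand, and the names `attacks`, `duration_hours`, `fit_simple_linear_regression`, and `predict` are hypothetical. For brevity it uses a single predictor (the number of epileptic attacks in the last year), whereas the doctor's problem above would also include the patient's age and the duration of the attacks.

```python
# Hypothetical, made-up data for illustration only (not from the book).
# attacks: number of epileptic attacks in the last year, one value per patient.
# duration_hours: observed duration of the migraine attack, in hours.
attacks = [2, 4, 6, 8, 10]
duration_hours = [3.0, 5.0, 7.0, 9.0, 11.0]


def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x_new):
    """Predict the continuous target (event duration) for a new predictor value."""
    return intercept + slope * x_new


intercept, slope = fit_simple_linear_regression(attacks, duration_hours)
predicted = predict(intercept, slope, 7)

print(f"duration = {intercept:.2f} + {slope:.2f} * attacks")
print(f"predicted duration for 7 attacks: {predicted:.2f} hours")
```

Because the target here is a continuous duration, an ordinary least-squares line is a natural fit. If the target were instead a yes/no occurrence or a count of occurrences, the same predictors would be fed to a logistic or Poisson regression, respectively, as noted above.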