Chapter 4

Dichotomous and frequency data joint estimation: John C. Whitehead


Introduction

Discrete choice revealed and stated preference data can be used to elicit information about participation in an activity or market. Revealed and stated preference frequency data can be used to elicit information about the intensity of consumption of environment-related goods and services such as recreation trips and seafood meals, among other possibilities. One limitation of revealed preference data arises when measuring environmental values under changed environmental conditions. For example, improvements in environmental quality might shift the recreation demand curve. Stated preference data can be used to better estimate these demand shifts and the resulting changes in environmental values. In these situations, stated preference data is typically layered into (i.e. stacked with) revealed preference data, and standard econometric models can be used.

Discrete choice data is elicited with closed-ended questions. For example, respondents could be asked a question that requires a yes or no answer, such as: “Do you drink filtered water from the tap?” Frequency data can be elicited in surveys with open-ended or closed-ended questions. For example, respondents could be asked an open-ended question: “How many recreation trips did you take last year to . . .?” Similar closed-ended questions could be asked where the answers are given as categories, and the resulting data could be treated as continuous. However, researchers should pay attention to the form and distribution of the dependent variable, as several econometric models can be applied to frequency data, such as ordinary least squares, Tobit, Poisson, negative binomial and interval data regression models.

Combining dichotomous and frequency data involves stacking observations. For example, consider a choice situation where Y is the dependent (i.e. choice) variable and X is the independent variable:

$$Y_j = A_j + B_j X_j + E_j \tag{4.1}$$

where A and B are coefficients to be estimated, E is the error term and j = r, s indicates revealed and stated preference data, respectively. In other words, separate revealed and stated preference models would involve two equations:

$$
\begin{aligned}
Y_r &= A_r + B_r X_r + E_r \\
Y_s &= A_s + B_s X_s + E_s
\end{aligned} \tag{4.2}
$$
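To see how the two equations can be combined once the observations are stacked, the following is one possible pooled specification. It is a sketch that is not written out in the text above, and it assumes a dummy variable SP equal to 0 for revealed preference rows and 1 for stated preference rows:

```latex
Y = A_r + (A_s - A_r)\,SP + B_r X + (B_s - B_r)\,(SP \times X) + E
```

Setting SP = 0 recovers the revealed preference equation, and setting SP = 1 recovers the stated preference equation, with E equal to E_r or E_s accordingly. Under this sketch, the coefficients on SP and on the interaction SP × X measure how the stated preference intercept and slope differ from their revealed preference counterparts.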

Data with two values, with various categorical labels attached, are dichotomous (i.e. binary) choice data:

$$
Y_j = \begin{cases} 1 & \text{yes} \\ 0 & \text{no} \end{cases} \tag{4.3}
$$


Respondents may reveal their behavior or state their intentions to participate in a recreation activity, averting behavior or some other type of behavior. Data with a large number of potential values without categorical labels attached are frequency data. For example, count data consists of non-negative integers:

$$Y_j = 0, 1, 2, 3, \ldots \tag{4.4}$$

With both dichotomous and frequency data, analysis proceeds after stacking rows of data in the same columns. In Figure 4.1 we provide an illustration with continuous variables. The sample data includes three respondents with identification (ID) numbers 1001, 1002, and 1003. The SP variable is a dummy variable that indicates whether the data source is revealed preference (SP = 0) or stated preference (SP = 1). Each respondent has one revealed preference observation and two stated preference observations. The dependent variable, Y, consists of integers. There are three independent variables. X1 is observed for both revealed and stated preference scenarios, say the travel cost associated with recreation trips. X2 might be a hypothetical scenario treatment variable that is only collected in the stated preference survey. The revealed preference scenario is associated with a constant value, X2 = 1, while the stated preference scenario values range up to 5. The SPX1 variable is the SP variable interacted with X1.
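The stacking layout described above can be sketched in a few lines of Python. This is only an illustration: the function names `make_row` and `stack_observations` are hypothetical, and every Y, X1 and X2 value is invented for the example rather than taken from Figure 4.1. Only the respondent IDs, the column names, the one-RP-plus-two-SP structure, and the rule X2 = 1 for revealed preference rows come from the text.

```python
def make_row(resp_id, sp, y, x1, x2):
    """Build one stacked row; SPX1 is the SP dummy interacted with X1."""
    return {"ID": resp_id, "SP": sp, "Y": y, "X1": x1, "X2": x2, "SPX1": sp * x1}


def stack_observations(respondents):
    """Stack revealed (SP = 0) and stated (SP = 1) rows in the same columns."""
    rows = []
    for resp in respondents:
        rp = resp["rp"]
        # Revealed preference row: X2 is held at its constant value of 1.
        rows.append(make_row(resp["id"], sp=0, y=rp["y"], x1=rp["x1"], x2=1))
        # Stated preference rows: X2 varies with the hypothetical scenario.
        for scenario in resp["sp"]:
            rows.append(make_row(resp["id"], sp=1, y=scenario["y"],
                                 x1=scenario["x1"], x2=scenario["x2"]))
    return rows


# Invented values for illustration only (not the data in Figure 4.1).
respondents = [
    {"id": 1001, "rp": {"y": 4, "x1": 20},
     "sp": [{"y": 3, "x1": 30, "x2": 2}, {"y": 6, "x1": 20, "x2": 5}]},
    {"id": 1002, "rp": {"y": 0, "x1": 55},
     "sp": [{"y": 0, "x1": 65, "x2": 3}, {"y": 1, "x1": 55, "x2": 4}]},
    {"id": 1003, "rp": {"y": 10, "x1": 8},
     "sp": [{"y": 8, "x1": 18, "x2": 2}, {"y": 12, "x1": 8, "x2": 5}]},
]

stacked = stack_observations(respondents)

# Print the stacked data as a simple fixed-width table.
columns = ["ID", "SP", "Y", "X1", "X2", "SPX1"]
print(" ".join(f"{c:>5}" for c in columns))
for row in stacked:
    print(" ".join(f"{row[c]:>5}" for c in columns))
```

Because SPX1 is zero on every revealed preference row, a model estimated on the stacked rows can let the effect of X1 differ between the two data sources through that single column.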