ABSTRACT

A frequency distribution shows how often a variable takes on each of its possible values. Stata's tabulate command creates frequency distribution tables for a single variable or for combinations of variables. Bar graphs are an excellent way to visualize distributions, and Stata has two commands for producing them: graph bar and histogram. Pie charts also display the distribution of categorical data, and they work best when the variable takes on a small number of values. The summarize command produces a range of summary statistics, including the mean, standard deviation, minimum, and maximum. To perform a comparison of means test, use Stata's ttest command. A regression estimates the linear relationship (its direction and magnitude) between a dependent (outcome) variable and one or more independent (potentially causal) variables. The regression equation is found by minimizing the sum of the squared differences between the observed values of the dependent variable and the values predicted by the equation.
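
As a minimal sketch of how these commands fit together, the do-file fragment below runs each one in turn. It assumes Stata's bundled auto example dataset (loaded with sysuse) and that dataset's variables rep78, foreign, mpg, and weight. The choice of dataset and variables is illustrative only.

```stata
* Load the example dataset that ships with Stata (an assumption for illustration)
sysuse auto, clear

* Frequency distribution for one variable, then a two-way table
tabulate rep78
tabulate rep78 foreign

* Bar graph of counts per category (counts nonmissing mpg within each rep78 value)
graph bar (count) mpg, over(rep78)

* Histogram of a continuous variable, with frequencies on the y-axis
histogram mpg, frequency

* Pie chart for a categorical variable with few values
graph pie, over(rep78)

* Summary statistics; the detail option adds percentiles, skewness, and kurtosis
summarize mpg weight, detail

* Comparison of means: mpg for domestic versus foreign cars
ttest mpg, by(foreign)

* Linear regression of mpg on weight and foreign, fit by least squares
regress mpg weight foreign
```

Each graph command replaces the previous graph window, so when running the fragment as a whole it may help to add a name() option to each graph (for example, name(bar1, replace)) to keep them all open.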