ABSTRACT

So far we have considered the analysis of data measured on at least an ordinal scale (e.g. a ranking of tree leaf condition) or on an interval/ratio scale (e.g. strontium-90 activity in becquerels per litre). We have used nominal variables (i.e. data which are placed in categories) only as a way of separating samples (e.g. males from females, urban from suburban, clean from polluted). We now need some way of analysing nominal scale data in their own right. Suppose that when we surveyed trees on polluted and clean sites (Chapter 4) we had been unable to give each tree a rank score of condition and were merely able to say whether it was in good or poor condition. For each tree we would then have obtained a nominal measurement (good or poor) rather than a score from 1 to 6 (where good condition was 6). This would be our second nominal variable, because we also recorded site type (clean or polluted). How, then, would we analyse these data? We could compare the number of trees in good condition and the number of trees in poor condition on the two types of site using frequency analysis. There are two main types of frequency analysis: tests for association and tests for goodness of fit. The choice of test depends on whether we are examining the association of one measured frequency distribution with another (e.g. association of tree condition with site type), or whether we are testing one measured frequency distribution against a known theoretical distribution. The term 'theoretical distribution' sounds quite grand, but it could simply be the expectation of equal frequencies of events (e.g. a sex ratio of equal numbers of males and females).
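The two kinds of frequency analysis can be sketched with a short example. This is a minimal illustration only: all counts are hypothetical (invented here, not survey results), the function names are made up for this sketch, and the chi-square statistic is assumed as the measure of disagreement between observed and expected frequencies, which the text above does not itself specify.

```python
def expected_counts(table):
    """Expected cell counts if rows and columns were unrelated (no association)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def flatten(table):
    """Turn a table (list of rows) into one flat list of cells."""
    return [cell for row in table for cell in row]


# Test for association: tree condition against site type.
# Hypothetical counts. Rows: clean, polluted. Columns: good, poor.
trees = [[30, 10],
         [15, 25]]
trees_expected = expected_counts(trees)
association_stat = chi_square(flatten(trees), flatten(trees_expected))

# Goodness of fit: one observed distribution against a theoretical one.
# Hypothetical counts of males and females, tested against an equal sex ratio.
sexes = [28, 22]
sexes_expected = [sum(sexes) / len(sexes)] * len(sexes)
fit_stat = chi_square(sexes, sexes_expected)

print("Expected tree counts under no association:", trees_expected)
print("Association statistic:", round(association_stat, 3))
print("Goodness-of-fit statistic:", round(fit_stat, 3))
```

In the association case the expected frequencies are derived from the row and column totals of the data themselves, whereas in the goodness-of-fit case they come from the theoretical distribution (here, equal frequencies). A larger statistic indicates a greater departure of the observed from the expected frequencies.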