ABSTRACT

Example 5A: 1, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 8, 8

In this case 5 is the median, the value that falls in the exact center of the distribution. If there is an even number of observations, the 50th percentile lies between the two most central values, and the median is the average of those two central scores. For example, in the following set of numbers the median (represented by the underlined gap) is located between the two center values (ten data points lie above and ten lie below this center):

Example 5B: 20, 22, 23, 24, 24, 24, 25, 25, 25, 25, __ 26, 26, 26, 27, 27, 28, 28, 28, 29, 30

The calculation of the median would be:

$$\text{Median} = \frac{25 + 26}{2} = 25.5$$
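The positional rule for the median can be checked with a short Python sketch. The helper name `median_by_position` is hypothetical and introduced only for illustration; it takes the middle value when the count is odd and averages the two central values when the count is even, using the data from Examples 5A and 5B.

```python
from statistics import median

# Data copied from Examples 5A and 5B in the text
example_5a = [1, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 8, 8]
example_5b = [20, 22, 23, 24, 24, 24, 25, 25, 25, 25,
              26, 26, 26, 27, 27, 28, 28, 28, 29, 30]


def median_by_position(values):
    """Return the median using the positional rule (hypothetical helper)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single central value
        return ordered[mid]
    # Even count: average of the two central values
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_by_position(example_5a))  # odd number of observations (23)
print(median_by_position(example_5b))  # even number of observations (20)

# The standard-library function follows the same rule
print(median(example_5a), median(example_5b))
```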

The median value is a better estimate of the center of a distribution than the mode. However, it is neither affected by, nor representative of, extreme values in the sample distribution. For example, consider the following two samples:

Example 5C - Tablet weights in milligrams:

Sample 1: 36, 45, 48, 50, 50, 51, 51, 53, 54
Sample 2: 47, 48, 49, 50, 50, 51, 52, 57, 68

Even though both samples have the same median (50 mg), Sample 1 appears to have more observations that are relatively small and Sample 2 more observations that are relatively large. The two samples appear to be different, yet both produce the same median. If possible, a measure of the center for a given distribution should consider all data points, including the extremes (e.g., 36 and 68).

At the same time, this inability to be affected by extreme values is also one of the advantages of using the median as a measure of the center. The median is a robust statistic and is not affected by any one observation. As will be seen in Chapter 23, an outlier, or atypical data point, can strongly affect the arithmetic center of the distribution, especially in small samples. The median is insensitive to these extreme values.

The median is a relative measure, in that it is defined by its position in relation to the other ordered values in a set of data points. In certain cases it may be desirable to describe a particular value with respect to its position relative to the other values. The most effective way to do this is in terms of its percentile location (the percent of observations that the data point exceeds):

$$\text{Percentile} = \frac{\text{number of values less than the given value}}{\text{total number of values}} \times 100$$
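Both ideas in this passage can be checked numerically with the Example 5C data. The sketch below compares the median with the arithmetic mean for the two samples, then computes a percentile location as the percent of observations that a value exceeds. The function name `percentile_location` and the choice to count only values strictly below the given value are assumptions made for illustration.

```python
from statistics import mean, median

# Tablet weights (mg) copied from Example 5C
sample_1 = [36, 45, 48, 50, 50, 51, 51, 53, 54]
sample_2 = [47, 48, 49, 50, 50, 51, 52, 57, 68]


def percentile_location(values, x):
    """Percent of observations in `values` strictly below `x` (hypothetical helper)."""
    below = sum(1 for v in values if v < x)
    return 100 * below / len(values)


for name, sample in (("Sample 1", sample_1), ("Sample 2", sample_2)):
    # The medians agree, while the means are pulled toward the extreme values
    print(name, "median:", median(sample), "mean:", round(mean(sample), 1))

# Percentile location of 57 mg within Sample 2 (7 of 9 values lie below it)
print(round(percentile_location(sample_2, 57), 1))
```

The medians coincide while the means fall on opposite sides of 50 mg, which is the sense in which the median ignores the extreme values 36 and 68.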

For example, consider Table 4.2, where 30 tetracycline capsules were placed in ranked order from smallest to largest. If one were interested in the percentile for 252 mg (the 24th value in the ranked order, so 23 values fall below it), the calculation would be:

$$\text{Percentile} = \frac{23}{30} \times 100 = 76.7$$

Thus, 252 mg represents the 77th percentile for the data presented in Table 4.2. Percentiles can also be used to describe variability in a distribution, especially one that is skewed in one direction. In this case, the measure would be the interquartile range (IQR), or interrange (the distance between the 25th and the 75th percentiles).
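As a rough illustration of the interquartile range, the sketch below applies Python's `statistics.quantiles` to the Example 5B data. Using this function is an assumption made for illustration: software packages apply different interpolation rules when locating the 25th and 75th percentiles, so the cut points may differ slightly from a hand calculation.

```python
from statistics import quantiles

# Data copied from Example 5B in the text
example_5b = [20, 22, 23, 24, 24, 24, 25, 25, 25, 25,
              26, 26, 26, 27, 27, 28, 28, 28, 29, 30]

# n=4 divides the data into quartiles and returns the three cut points:
# the 25th percentile, the 50th percentile (median), and the 75th percentile
q1, q2, q3 = quantiles(example_5b, n=4)

# Interquartile range: distance between the 25th and 75th percentiles
iqr = q3 - q1

print("25th percentile:", q1)
print("median:", q2)
print("75th percentile:", q3)
print("IQR:", iqr)
```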