ABSTRACT

Over the past 50 years, multiple studies have attempted to model the development of world or Olympic records, for instance for running events. Deakin (1967) explored progress in the mile record, Chatterjee and Chatterjee (1982) the 100, 200, 400, and 800 meter at the Olympic Games, Blest (1996) running distances from 100 meter to the marathon, and more recently Nevill and Whyte (2005) distances from 800 meter to the marathon for male and female runners. Modelling world records through time attracts attention from different perspectives. First, all studies are inspired by the apparent problems in the analysis of world records. In terms of the time used to complete an event, for instance, we only observe nonpositive changes. These changes can be very infrequent: world records sometimes survive for about 20 to 25 years. In some instances, however, improvements are substantial, leading to extreme values in the distribution of the first difference of the series. Technological innovation in sports gear can also shift the human frontier; an example is the clap skate in speed skating (see Kuper and Sterken, 2003, and Stefani, Chapter 3 of this volume). A
second element in the analysis is the interest in the ultimate human performance. Given the development of the world record up to now, can we predict the fastest time ever? And thirdly, can we compare contemporaneous performances? How does the men’s 10000 meter world record compare to the men’s 5000 meter record? Is there a phase difference in the development of records across events? And can women outperform men in the far future?

Observing the time series of the world record of a well-developed sports event
reveals an inverted S-shape pattern (we treat all developments of records as monotonically declining time series of the time used to complete an event). In the early phase of the development of running events, competition is not fierce and amateurism dominates. Around the inflection point the rate of progress is large, because more athletes get involved, more professional help is available, rewards become more visible, etc. After this rapid development phase there is a phase of saturation: it is hard to improve the record, and only on a few occasions is a highly talented individual able to break it. For some sports events we do not observe such a shape, because development is much faster through cross-fertilization (e.g., the 3000 meter steeplechase for women, which has become an official race distance only recently). For some events observations are typically available for only some pieces of the curve. This sometimes leads to the use of simple piecewise linear techniques to model the development of world records. These linear approximations are computationally attractive, but theoretically poor. Nevill and Whyte (2005), for example, contribute to the debate on whether linear approximations are helpful in describing world records. Tatem, Guerra, Atkinson, and Hay (2004) use linear approximations of the development of the best male and female 100 meter sprints at the Olympic Games and conclude that in 2156 women will run as fast as men. Whipp and Ward (1992) also employed a linear approximation of the marathon times of men and women and predicted that female marathon runners would run as fast as men in 1998. Of course, linear approximations cannot be correct in the long run: a declining linear trend eventually implies negative record times. Wainer, Njue, and Palmer (2000) argue that women show similar development patterns of world records, but lag the male equivalents. Their conclusion is that the growth rates of record times are equal for men and women.
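
The objection to linear approximations can be made concrete with a little arithmetic. The sketch below is only an illustration: the intercepts and slopes are hypothetical values chosen so that both lines give plausible 100 meter times around the year 2000, and they are not the estimates of Tatem et al. (2004) or of any other study cited here. The function names are likewise assumptions introduced for this example.

```python
def linear_record(year, intercept, slope):
    """Record time (seconds) predicted by a straight-line trend."""
    return intercept + slope * year


def year_reaching_zero(intercept, slope):
    """Year in which a declining linear trend predicts a record of zero."""
    if slope >= 0:
        raise ValueError("only a declining trend (slope < 0) reaches zero")
    return -intercept / slope


def crossing_year(trend_a, trend_b):
    """Year in which two linear trends predict the same record time."""
    (intercept_a, slope_a), (intercept_b, slope_b) = trend_a, trend_b
    if slope_a == slope_b:
        raise ValueError("parallel trends never cross")
    return (intercept_b - intercept_a) / (slope_a - slope_b)


# Hypothetical (intercept, slope) pairs, chosen only for illustration.
# They are NOT estimates from any of the studies cited in the text.
MEN = (30.0, -0.010)    # gives 10.0 s in the year 2000
WOMEN = (41.0, -0.015)  # gives 11.0 s in the year 2000

if __name__ == "__main__":
    print("trends cross near year", round(crossing_year(MEN, WOMEN)))
    print("men's trend hits zero near year", round(year_reaching_zero(*MEN)))
    print("men's predicted time in 3100:", round(linear_record(3100, *MEN), 2))
```

With these made-up coefficients the two lines cross a couple of centuries from now, and the same straight line for men reaches zero some centuries later and is negative afterwards. The crossing year and the impossible negative times come from the same extrapolation, which is why a bounded nonlinear form is preferable.
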
Our main aim, though, is to find more biologically sound and statistically robust nonlinear (S-shaped) functions that provide a superior fit of the development of world records (see Nevill and Whyte, 2005). Besides finding the best fit, most studies try to gain insight into the upper limit of (running) speed or the lower limit of the time needed to complete an event. For some functional forms, a limit value can be derived from the properties of the function. For instance, Kuper and Sterken (2003) give detailed lower limits of the time needed in skating events, while Nevill and Whyte (2005) give predicted peak world records for the men’s 800 meter, 1500 meter, mile, 5000 meter, 10000 meter, and marathon and the women’s 800 meter and 1500 meter. For skating, the predictions of future world records are highly dependent on the “no-technical progress” assumption. Contrary to running, speed skating is a technology-intensive sport (skates, ice rinks, clothing), so that shocks to technological progress are visible in the improvements
of world records. Ultimate human performances are thus conditional on this typically hard-to-predict factor.

In this paper we contribute to the literature on modelling world records. We focus
on running events, since these events are well-developed, highly competitive, not strongly affected by hard-to-predict technical innovations, and typically have a long history. For instance, for the one mile record we have official data since 1865 (for men). First, we approximate the historical lower bounds of the time needed to complete running events using single-event historical data. After a careful selection and discussion of alternative specifications, we apply the Gompertz curve (Gompertz, 1825). The Gompertz curve is a relatively simple curve that allows for an asymmetric S-shape. From these specifications we compute the implied lower bounds in the limit as time goes to infinity. Second, these lower bounds are compared in a cross-sectional setting to find the relationship between time and distance. Finally, we use one event, the one mile run, to assess the forecasting performance of our methodology.

The set-up of the chapter is as follows. First, we discuss the existing methodology
to model world records in Section 2.2. In Section 2.3 we present different functional forms; on the basis of this analysis we choose the Gompertz model. Section 2.4 describes the development of world records in running. In Section 2.5 we present the results of fitting the Gompertz curve to the 100, 200, 400, 800, 1500, 5000, and 10000 meter and marathon events for men and women. In Section 2.6 we test the robustness of the methods by relating the limit values implied by the Gompertz curves to distance in the well-known log-log model of distance and time (see Section 2.2 for a description). We summarize and conclude in Section 2.7.
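
To give a feel for why a Gompertz-type curve yields a finite lower bound, the following minimal sketch may help. It assumes one common four-parameter declining form, T(t) = lower + (upper - lower) * (1 - exp(-exp(-rate * (t - midpoint)))); this parametrization, the function name, and all parameter values are assumptions made for illustration and need not match the specification estimated later in the chapter.

```python
import math


def gompertz_record(year, lower, upper, rate, midpoint):
    """Declining Gompertz-type curve for a record time.

    The curve tends to `upper` far in the past and to `lower` far in the
    future; `midpoint` is the year of the inflection point and `rate`
    controls how quickly the record falls around it.
    """
    return lower + (upper - lower) * (
        1.0 - math.exp(-math.exp(-rate * (year - midpoint)))
    )


# Hypothetical parameter values in seconds and years, for illustration only;
# they are not estimates for any actual event.
PARAMS = dict(lower=220.0, upper=270.0, rate=0.03, midpoint=1940.0)

if __name__ == "__main__":
    for year in (1865, 1900, 1940, 2000, 2100, 3000):
        print(year, round(gompertz_record(year, **PARAMS), 3))

    # Asymmetry: at the inflection point only a share exp(-1) of the total
    # decline (upper - lower) has been realised, not one half.
    at_midpoint = gompertz_record(PARAMS["midpoint"], **PARAMS)
    share = (PARAMS["upper"] - at_midpoint) / (PARAMS["upper"] - PARAMS["lower"])
    print("share of decline completed at inflection:", round(share, 3))
```

In this form the implied limit is simply the `lower` parameter: the predicted time keeps falling but never drops below it, in contrast to a straight line. Under this assumed parametrization the inflection point is reached after a share exp(-1) of the total decline, which is the asymmetry that distinguishes the Gompertz curve from a symmetric logistic S-shape.
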