E The baseball players’ historical example | 23

ABSTRACT

Efron and Morris [98] brilliantly exemplified via an interesting sport dataset that the James-Stein estimator ([286]) based on each player’s first 45 at bats does perform better at predicting subsequent performance than their observed averages. In this appendix1, we revisit this example as a simple introduction to hierarchical modeling. The batting successes yi of 18 major league baseball players (i = 1, ..., 18) in their first 45 at bats of the 1970 season are given in the first column of Table E.1. The batting average defined as the ratio of a player’s hits yi to his at bats (45 in this example), is one of the best acknowledged of all baseball statistics. They are given in the second column of Table E.1 and play the role of estimates µˆi =

yi 45 for the true µi, i.e., the unknown skill of

player i and the empirical score µˆi has for long been considered as the classical best bet for the underlying true averages µi. In addition, the two last columns provide a 90% Bayesian posterior credible interval for µi using a noninformative Uniform prior for µi with the Binomial model Yi ∼ Binomial(µi, 45) showing that a large uncertainty is attached to these estimates.