ABSTRACT

Questions 1-4 can be addressed in two major ways: hypothesis testing and confidence intervals. In hypothesis testing, we propose a hypothesis, such as "aluminum alloy A has a larger tensile strength than aluminum alloy B." We gather some data and then test this hypothesis statistically to decide whether to accept or reject it at a certain level of significance, often 5%. Here, 5% means that we accept about a 5% chance of rejecting a hypothesis that is actually true.

Using the confidence interval method on the same problem, we gather some data and compute the difference in the mean values of tensile strength, together with a confidence interval that extends on either side of that difference. For example, the difference in average values (A − B) might be 10,673 psi, and the confidence interval might be ±2,473 psi. Confidence intervals always carry an associated probability, often 95%. These statements mean that our best estimate is that alloy A has a tensile strength 10,673 psi larger than that of alloy B, and that we are 95% sure the true difference lies somewhere between 8,200 and 13,146 psi. The phrase "95% sure" means that if we use this procedure routinely in our experimental work, the interval will contain the true value 95% of the time (19 times out of 20). About 5% of the time we will be wrong. While 95% confidence is often used, we can choose any percentage we wish, but with an important trade-off. If we insist on being wrong only, say, 1% of the time, then for the same sample size the confidence interval must be wider. If we want both a higher confidence level and a narrower confidence interval, then we must use larger sample sizes, which usually costs time and money.
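The arithmetic and the trade-off described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a large-sample normal approximation for the interval on a difference of two means (small samples would normally call for a t-based interval), and the standard deviations and sample sizes are hypothetical values chosen for the example, not data from the text. The function name diff_ci_half_width is likewise made up for this sketch.

```python
from math import sqrt
from statistics import NormalDist


def diff_ci_half_width(sd_a, sd_b, n_a, n_b, confidence=0.95):
    """Half-width of a normal-approximation confidence interval
    for the difference between two sample means."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    # Standard error of the difference of two independent means.
    std_err = sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
    return z * std_err


# Interval endpoints from the example in the text (psi).
difference = 10_673
half_width = 2_473
lower = difference - half_width  # 8200
upper = difference + half_width  # 13146

# Hypothetical standard deviations (psi) and sample sizes, for illustration only.
SD_A = 3000.0
SD_B = 3000.0

hw_95 = diff_ci_half_width(SD_A, SD_B, 10, 10, confidence=0.95)
hw_99 = diff_ci_half_width(SD_A, SD_B, 10, 10, confidence=0.99)
hw_99_large = diff_ci_half_width(SD_A, SD_B, 40, 40, confidence=0.99)

print(f"Interval from the text: {lower} to {upper} psi")
print(f"95% half-width, n = 10 per alloy: {hw_95:.0f} psi")
print(f"99% half-width, n = 10 per alloy: {hw_99:.0f} psi")
print(f"99% half-width, n = 40 per alloy: {hw_99_large:.0f} psi")
```

With the sample size held fixed, moving from 95% to 99% confidence widens the interval. Quadrupling the sample size halves the half-width, which in this sketch is enough to make the 99% interval narrower than the original 95% one.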