Agency Statistical Consulting

Christopher W. Ryan, MD, MS, MSPH

Helping those in public service get the most from their data

I hate mowing the lawn. I'd much rather shovel snow. That will come around again soon enough, I suppose. But I still have a few mowings left this season.

The other day I was out there mowing, and my thoughts turned to statistics. (This happens a lot.) Specifically, I thought about confidence intervals, prediction intervals, and the automatic cutoff switch on my mower. Modern power mowers in the US have a switch in the handle that must be closed in order for the mower to run. Let go of the handle, the engine stops, and the blade stops whirling. This is an important safety feature, to keep people from getting their hand cut off by reaching down under the mower or into the discharge chute to remove a clog. As you can imagine, the sooner the blade stops whirling, the better.

Which brings us to confidence intervals versus prediction intervals. The companies that make lawn mowers might be interested in how long it takes for the blade to stop spinning once the handle switch is released. They could measure that in a large sample of mowers of a particular model. It's unlikely that all the mowers would have exactly the same shutoff time. There's a lot of variation among all those moving parts. So we'd find a variety of stopping times, maybe within milliseconds of each other. We could calculate a sample mean. That would be an estimate for the mean stopping time in the whole population of mowers of that model---even those we did not measure. We could also measure the variation amongst that sample of mowers---the standard deviation. From those two numbers, we could calculate a confidence interval for the mean stopping time. The company could then say, "We are pretty confident that the mean stopping time for our mowers is between this (the lower end of the interval) and that (the upper end of the interval)." That's a good thing to know. It's a quantifiable statement about the mean stopping time in a large population of mowers, based on measurements from a small sample of mowers.

Conceptually, a confidence interval for a mean runs from

(estimate of population mean) minus (uncertainty in the estimated population mean)

to

(estimate of population mean) plus (uncertainty in the estimated population mean)
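If you like to see things in code, here is a minimal sketch in Python of that calculation, using made-up stopping times (in milliseconds) rather than any real measurements:

    import numpy as np
    from scipy import stats

    # Made-up stopping times, in milliseconds, for a sample of 10 mowers
    times = np.array([212, 198, 205, 220, 201, 208, 195, 216, 203, 210])

    n = len(times)
    mean = times.mean()                 # estimate of the population mean
    sd = times.std(ddof=1)              # sample standard deviation
    se = sd / np.sqrt(n)                # uncertainty in the estimated mean
    t = stats.t.ppf(0.975, df=n - 1)    # multiplier for a 95% interval

    print(f"95% confidence interval for the mean stopping time: "
          f"{mean - t * se:.1f} to {mean + t * se:.1f} ms")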

But a confidence interval is not the end of the story. As a customer, I don't buy "a sample of lawn mowers." I buy one lawn mower. And a safety-conscious customer might be interested in the stopping time of the one particular mower they are about to buy. For that, an interval for the mean stopping time of a population of mowers doesn't help. What's needed is a prediction interval. A prediction interval is an estimate of the stopping time of any single mower off that same assembly line, along with an indication of the uncertainty around that estimate (because there is always uncertainty).

Unlike a confidence interval, a prediction interval involves two uncertainties. First, you are using sample data to estimate the population mean, so you aren't exactly sure what the population mean is. Then, whatever that population mean truly is (you don't and can't know for sure), there is uncertainty about where the stopping time of any particular mower will fall on either side of that estimated mean.

Conceptually, a prediction interval for one mower's stopping time runs from

(estimate of population mean) minus (uncertainty in the estimated population mean) minus (uncertainty of any one mower around the estimated population mean)

to

(estimate of population mean) plus (uncertainty in the estimated population mean) plus (uncertainty of any one mower around the estimated population mean)

Notice the "extra" uncertainty built into the prediction interval: the variation of any one mower around the estimated population mean. Prediction intervals for one mower's stopping time will always be wider than confidence intervals for the mean stopping time of, say, 100 mowers.
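Here is the same sketch extended to a prediction interval, again with made-up numbers, so you can see the extra term and the wider interval side by side:

    import numpy as np
    from scipy import stats

    # Same made-up stopping times (milliseconds) as in the sketch above
    times = np.array([212, 198, 205, 220, 201, 208, 195, 216, 203, 210])
    n, mean, sd = len(times), times.mean(), times.std(ddof=1)
    t = stats.t.ppf(0.975, df=n - 1)

    ci_half = t * sd / np.sqrt(n)           # uncertainty in the estimated mean only
    pi_half = t * sd * np.sqrt(1 + 1 / n)   # plus the mower-to-mower variation

    print(f"95% CI for the mean stopping time: {mean - ci_half:.1f} to {mean + ci_half:.1f} ms")
    print(f"95% PI for one new mower:          {mean - pi_half:.1f} to {mean + pi_half:.1f} ms")

The prediction interval comes out wider, because it carries both sources of uncertainty.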

Why this matters

You may be interested in whether a measurement on one particular "thing" is out of the ordinary: unusually high or low, or out of step with the other "things." Are 7 opioid overdoses today an outbreak? Did this crew spend an unusually long time on scene with a trauma patient? Was this patient's length of stay unusual? In situations like that, you look at the measurement from the possibly anomalous "thing" and compare it with the measurements from the other or preceding "things." But you want to compare it with a prediction interval, not a confidence interval. Prediction intervals (for one thing's measurement) end up being wider than confidence intervals (for the mean measurement on a group of things). So if you get concerned that one thing's measurement falls outside a confidence interval, you may be sounding a false alarm. That measurement on that one day, or that one patient, or that one incident may be well inside the bounds of the prediction interval, and not anomalous at all.
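As a rough illustration, here is the same kind of calculation applied to the overdose example, with invented daily counts, treating them as roughly normal just to keep the sketch simple:

    import numpy as np
    from scipy import stats

    # Invented daily overdose counts from the preceding weeks
    history = np.array([3, 6, 2, 5, 8, 3, 4, 7, 2, 5, 4, 6])
    today = 7

    n, mean, sd = len(history), history.mean(), history.std(ddof=1)
    t = stats.t.ppf(0.975, df=n - 1)

    ci_half = t * sd / np.sqrt(n)           # interval for the mean daily count
    pi_half = t * sd * np.sqrt(1 + 1 / n)   # interval for any single day's count

    print(f"Today's count: {today}")
    print(f"95% CI for the mean daily count: {mean - ci_half:.1f} to {mean + ci_half:.1f}")
    print(f"95% PI for a single day's count: {mean - pi_half:.1f} to {mean + pi_half:.1f}")

With these invented numbers, today's count of 7 falls outside the confidence interval but inside the prediction interval, so by itself it is not obviously anomalous.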

