When speaking on the accuracy of data, I often quote statistics professor George Box, who taught us that statistics are based on models, and "all models are wrong, but some are useful."
What does this mean for those of us who conduct statistical research? The most important lesson is that we must understand the limitations of the data and of models.
Consider Cochise County's population. According to the Arizona Department of Economic Security, the county's population in 2005 was 131,790. According to the U.S. Census Bureau, it was 126,106.
So whose data are correct? Well, there's a high probability both are wrong. Both DES and the Census Bureau use models to estimate population. These models differ from each other, and both are most likely wrong. In truth, it's almost certain that no one knows the true population of Cochise County.
But does this mean the population estimates aren't useful? Certainly not. The census population of Cochise County as of 2000 was 117,755. While it's probable this number is also wrong, it's based on a very good model and is likely very close to the actual population that year. Both DES and the Census Bureau use the Census 2000 figure as a base from which to estimate the population in subsequent years.
According to DES, the county's population grew by 14,035 people, or 11.9 percent, between 2000 and 2005. According to the Census Bureau, it grew by only 8,351, or 7.1 percent. Based on these two estimates, it's reasonable to conclude the county's population increased between 7 and 12 percent. The true increase could fall somewhat outside that range, but it's unlikely to have been much less than 7 percent or much more than 12 percent.
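For readers who want to check the arithmetic, the growth figures above can be reproduced from the three population numbers quoted in this column. The short sketch below is only an illustration; the function and variable names are my own and are not taken from DES or the Census Bureau.

```python
# Population figures quoted in the column.
CENSUS_2000 = 117_755
ESTIMATES_2005 = {"DES": 131_790, "Census Bureau": 126_106}


def growth(base, later):
    """Return (absolute change, percent change) from base to later."""
    change = later - base
    return change, 100 * change / base


# Growth since the 2000 census under each agency's 2005 estimate.
results = {
    source: growth(CENSUS_2000, population)
    for source, population in ESTIMATES_2005.items()
}

for source, (change, pct) in results.items():
    print(f"{source}: +{change:,} people ({pct:.1f} percent)")

# How far apart the two 2005 estimates are, in people.
gap = ESTIMATES_2005["DES"] - ESTIMATES_2005["Census Bureau"]
print(f"Gap between the two 2005 estimates: {gap:,} people")
```

The two agencies start from the same 2000 base, so the entire gap between them comes from how each model estimates change after the census.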
According to DES, the county's unemployment rate in 2006 was 4.7 percent. In 2005 it was 4.9 percent. Does this mean we know the exact number of people who were unemployed? Not really. DES calculates unemployment rates based on a model. And as Professor Box taught us, all models are wrong.
But some models are useful. Cochise County's unemployment rate probably was about 4.9 percent in 2005 and about 4.7 percent in 2006. More importantly, the rate probably did drop from 2005 to 2006, probably in the neighborhood of two-tenths of a percentage point. This last observation illustrates one of the most useful characteristics of models: They give us a pretty good idea of the direction and magnitude of change.
We might not know exactly how many people are unemployed or the exact proportion of the labor force they represent. But we can get a pretty good idea of whether unemployment is going up or down, by about how much, and at what point we should become concerned.
According to DES, Benson's unemployment rate fell from 8.8 to 8.4 percent between 2005 and 2006. To compute the unemployment rates for cities and towns, DES uses a model that differs from the one it uses for state- and county-level unemployment.
Unfortunately, the model DES uses for cities and towns isn't quite as useful. The model doesn't account for the surge in residential construction in Benson, the opening of the new Wal-Mart, or other local factors. Instead, the DES model presumes the ratio of Benson's unemployment to the county's unemployment is the same as it was in 2000. DES then produces an estimate for Benson based on trends seen at the county level. Because local factors like these are left out, Benson's unemployment rate in 2006 was probably quite a bit lower than the DES estimate.
But DES uses the same model to estimate unemployment for Sierra Vista. In this case, the model is probably more useful. That's because Sierra Vista accounts for such a large share of Cochise County's population, labor force, employment, and economic activity. Thus, employment conditions in Sierra Vista have a large impact on trends at the county level. In this case, the same model has different levels of usefulness depending on its application. It's much more useful for Sierra Vista, and even Douglas, than for the smaller communities of Huachuca City, Tombstone, Bisbee, Benson, and Willcox.
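To see why a fixed-ratio model can miss local changes, consider the simplified sketch below. It is not the actual DES procedure: it applies a constant town-to-county ratio directly to the unemployment rate, and the base-year rates (9.0 and 5.0 percent) are hypothetical values chosen only for illustration. The county rates of 4.9 and 4.7 percent are the ones quoted earlier in this column.

```python
def town_rate(county_rate, base_town_rate, base_county_rate):
    """Estimate a town's rate by scaling the county rate.

    The scaling factor is the town-to-county ratio observed in a base
    year, and it is held fixed for every later year (simplifying
    assumption for this sketch).
    """
    ratio = base_town_rate / base_county_rate
    return county_rate * ratio


# Hypothetical base-year rates, for illustration only.
BASE_TOWN = 9.0
BASE_COUNTY = 5.0

# County rates quoted in the column.
est_2005 = town_rate(4.9, BASE_TOWN, BASE_COUNTY)
est_2006 = town_rate(4.7, BASE_TOWN, BASE_COUNTY)

print(f"Town estimate, 2005: {est_2005:.2f} percent")
print(f"Town estimate, 2006: {est_2006:.2f} percent")

# The town's estimate changes by exactly the same proportion as the
# county's rate, because nothing local ever enters the calculation.
print(f"Town change:   {est_2006 / est_2005 - 1:.2%}")
print(f"County change: {4.7 / 4.9 - 1:.2%}")
```

Under this kind of model, a town's estimate can move only when the county's figure moves. That is a reasonable approximation for a community that drives the county's numbers, and a weak one for a small town whose local conditions have diverged since the base year.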
In 2006, the median price of a home sold in the Sierra Vista area was $205,500, according to CER estimates. To determine this, we looked at each home sold that was listed on the Multiple Listing Service, and then computed the median price. But not all homes sold were listed on MLS. And there's no guarantee that all realtors entered every price correctly. Nonetheless, this is a very useful model, and it's unlikely the true median home price differed substantially from $205,500.
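One reason the median holds up well against data-entry mistakes can be shown with a small example. The sale prices below are hypothetical, not actual MLS records; they are chosen so the median matches the figure quoted above, and one of them is then deliberately mis-entered with an extra zero.

```python
from statistics import mean, median

# Hypothetical sale prices (not actual MLS data).
clean_prices = [150_000, 180_000, 205_500, 230_000, 310_000]

# Same list, but 230,000 was mistakenly entered as 2,300,000.
typo_prices = [150_000, 180_000, 205_500, 2_300_000, 310_000]

clean_median = median(clean_prices)
typo_median = median(typo_prices)
clean_mean = mean(clean_prices)
typo_mean = mean(typo_prices)

print(f"Median without the error: {clean_median:,}")
print(f"Median with the error:    {typo_median:,}")
print(f"Mean without the error:   {clean_mean:,.0f}")
print(f"Mean with the error:      {typo_mean:,.0f}")
```

In this example the single bad entry leaves the median unchanged while nearly tripling the average. That will not hold for every possible error, but it illustrates why a median is a sensible summary when some records may be missing or wrong.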
We see and hear statistics in practically every aspect of our lives. While many of them are useful, it's important to keep in mind their limitations. As a researcher, when I hear a statistic I think is important, the first question I ask is how it was derived.
If you have any questions on the local economy, please contact the CER at 515-5486 or email the Center at cer@cochise.edu. Check out the Center's website at www.cochise.edu/cer.