Thursday, December 8, 2011

And you call yourselves statisticians

It's been a long time since I wrote, but with school winding down, I thought I'd have a little discussion about econometrics. This morning, I came across a Gallup poll (via Greg Mankiw's blog) on the difference in education between the top 1% of earners and the bottom 99%. The idea is to attribute some of the discrepancy in earnings to differences in educational attainment.

As reported, the top 1% are far more likely to have gone to college and to hold a graduate degree than the 99%, a group so large that its statistics closely resemble the population average. This seems like a legitimate theory to explain at least some of the income inequality. But looks can be deceiving; the poll's conclusions aren't based on sound statistical methods.

The main problem here is how the data are broken up. Given the popularity of Occupy Wall Street and the 99% movement, the split between the top 1% of earners and the bottom 99% may be politically interesting. However, this is poor motivation to split the data at that point. If we were to split it instead at 2% and 98%, would we see a statistically significant difference from Gallup's initial findings? What about 3% and 97%?
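To make the cutoff question concrete, here is a rough sketch in Python. Everything in it is my own assumption: the function names `degree_gap` and `synthetic_records` are made up, and the data are simulated, not Gallup's. It simply recomputes the raw gap in degree rates between the top group and everyone else as the cutoff moves, with no significance test attached.

```python
import random


def degree_gap(records, top_share):
    """Degree rate of the top `top_share` of earners minus that of everyone else.

    `records` is a list of (income, has_degree) pairs.
    """
    ranked = sorted(records, key=lambda r: r[0], reverse=True)
    n_top = max(1, round(len(ranked) * top_share))
    top, rest = ranked[:n_top], ranked[n_top:]

    def rate(group):
        return sum(1 for _, has_degree in group if has_degree) / len(group)

    return rate(top) - rate(rest)


def synthetic_records(n=10_000, seed=0):
    """Simulated (income, has_degree) pairs. Purely illustrative, NOT real survey data."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        has_degree = rng.random() < 0.3  # assumed share of degree holders
        # Assumed log-normal incomes, shifted upward for degree holders.
        income = rng.lognormvariate(10.5 + (0.5 if has_degree else 0.0), 0.8)
        records.append((income, has_degree))
    return records


if __name__ == "__main__":
    data = synthetic_records()
    for share in (0.01, 0.02, 0.03, 0.05, 0.10):
        print(f"top {share:.0%}: gap in degree rate = {degree_gap(data, share):.3f}")
```

If the gap barely changes as the cutoff slides from 1% to 2% to 3%, then the 1% line isn't telling us anything special about where the population actually divides.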

What Gallup should have done, were they equipped with the proper statistical motivation, is let the data determine the split first (that is, find groups of earners that are statistically significantly different from one another), then test the differences in demographics (education, party lines, religion, whatever) between those groups. This sort of threshold model would help them find some number of groups (five, for example, instead of two) of similar individuals within the population, then test the differences between the groups.
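As a minimal sketch of the idea, assuming the simplest possible case of a single threshold, one could search over every candidate income cutoff and keep the one that leaves the two groups most internally similar in education. The function name `best_threshold` and the toy data are hypothetical, and a real threshold model would also estimate the number of breaks and test each one for significance, which this does not do.

```python
def best_threshold(records, min_group=2):
    """Find the single income cutoff that best separates degree holders from non-holders.

    `records` is a list of (income, has_degree) pairs. Returns (cutoff, sse), where
    `cutoff` is the lowest income in the upper group and `sse` is the within-group
    sum of squared errors of the 0/1 degree indicator. Returns None if no split
    satisfies `min_group`.
    """
    ranked = sorted(records, key=lambda r: r[0])
    y = [1.0 if has_degree else 0.0 for _, has_degree in ranked]
    n = len(y)
    total = sum(y)

    best = None
    left_sum = 0.0
    for i in range(1, n):
        left_sum += y[i - 1]  # the lower group is ranked[:i]
        if i < min_group or n - i < min_group:
            continue
        if ranked[i - 1][0] == ranked[i][0]:
            continue  # cannot split between two identical incomes
        right_sum = total - left_sum
        # For a 0/1 variable, the SSE around the group mean is sum - sum^2 / count.
        sse = (left_sum - left_sum ** 2 / i) + (right_sum - right_sum ** 2 / (n - i))
        if best is None or sse < best[1]:
            best = (ranked[i][0], sse)
    return best


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    toy = [(10, False), (20, False), (30, False), (40, True), (50, True), (60, True)]
    print(best_threshold(toy))
```

Applying the same search again within each resulting group is one way to end up with more than two groups, though where to stop splitting is itself a statistical question.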

Instead, Gallup split the data via a politically motivated line and came up with -- you guessed it -- politically motivated conclusions. And they ought to know better.

Had they tested the data with the proper statistical tools, perhaps they'd actually come to a conclusion that might help illuminate the underlying reasons for differences in education between the rich and poor. But these results fuel argument rather than fostering any serious intellectual conversation. For shame, Gallup. For shame.