Tuesday, November 26, 2013

One More Time on Subsamples (Ignore Those Polls! Addendum)

Last week I took someone to task for placing too much weight on Gallup's reports of Latino voters' supposedly disproportionate turn away from Barack Obama. One week more, and...it's disappearing even more. In the Gallup weekly reports, Obama trickled down to 40% approval overall after three weeks at 41%, but his approval among Hispanic citizens is up for the third week in a row and now stands at 54%, 14 points higher than his overall rating. That's consistent with where he was back in September (that is, it's down, but not down more than the overall dip).

It's certainly possible that there's something going on here, but it's also possible, and perhaps a bit more likely, that the earlier dip never actually happened. It was just a bit of random variation.

I wanted to post on this not to beat up on the same thing, but because there's one additional point I didn't mention, which is actually pretty important. Gallup publishes, in their weekly reports, a colossal 41 different splits. Look: flip a single coin 1000 times, and if heads comes up 600 times, something is probably going on. But what you're doing here is walking into a coin-flipping factory, with some coins flipped 1000 times, and some 100 times, and some only 10 times, and yeah, if all you are doing is looking for oddball results then you're going to find what you're looking for.
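To put rough numbers on the coin-flipping factory, here is a minimal Python sketch. It is an illustration under stated assumptions, not anything Gallup publishes: the function names are made up for this example, the coins are fair, and the 41 splits are treated as independent tests at a 5% false-alarm rate each (real crosstabs overlap, so the true figure will differ).

```python
from math import comb


def tail_prob(n_flips: int, min_heads: int) -> float:
    """Exact probability that a fair coin shows at least min_heads heads in n_flips."""
    favorable = sum(comb(n_flips, k) for k in range(min_heads, n_flips + 1))
    return favorable / 2 ** n_flips


def prob_at_least_one_false_alarm(n_splits: int, alpha: float = 0.05) -> float:
    """Chance that at least one of n_splits independent tests looks 'significant'
    at level alpha purely by chance (independence is an assumption here)."""
    return 1 - (1 - alpha) ** n_splits


if __name__ == "__main__":
    # 60% heads is wildly unlikely over 1000 flips, but routine over 10 flips.
    print(f"P(>=600 heads in 1000 flips) = {tail_prob(1000, 600):.2e}")
    print(f"P(>=6 heads in 10 flips)     = {tail_prob(10, 6):.3f}")
    # With 41 splits to pick from, an oddball result somewhere is the expected outcome.
    print(f"P(at least one fluke in 41 splits) = {prob_at_least_one_false_alarm(41):.3f}")
```

Under those assumptions, the chance that at least one of the 41 splits throws up a fluke is roughly 88%, which is why a single eye-catching crosstab is such weak evidence on its own.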

And so, like clockwork, we have another reporter tweeting out that "Obama's approval has dropped the sharpest among Eastern wealthy post-grads." Give it a rest, folks: just don't trust these blips in the crosstabs. There's no way of knowing whether they're real or not. 


  1. Please tell me that tweet was meant as parody.

  2. Replies
    1. Due for re-launch in early 2014: http://www.usatoday.com/story/money/business/2013/11/19/fivethirtyeight-staffing/3644841/

  3. Speaking of Silver and offensive tweets, in the linked poll the decline in approval among Easterners (-9), the wealthy (-9), and postgrads (-8) all fell well outside the reported MOE (+/-2). Of course, the reported MOE is not the "real" MOE, due to the "unreported" data on the cutting room floor.

    Curious though: if the data swamp causes us to disregard polling results 4 or so times as large as the MOE - why do we pay any attention to any polls, at all, ever?

    1. From the article: "For results based on the total sample of national adults, one can say with 95% confidence that the margin of sampling error is ±2 percentage points." That's the margin of error for the survey of everyone, and there's no dispute that Obama's popularity is going down overall. The results for the subgroup analysis are not based on the total sample; that's what makes them subgroup analyses. The margins of error will be much higher.
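The point in that last reply can be checked with the textbook margin-of-error formula. This is a rough sketch only: it assumes simple random sampling and the worst case of a 50% proportion, the subgroup sizes are hypothetical (Gallup's actual subgroup counts are not given here), and `margin_of_error` is a name made up for this example.

```python
from math import sqrt


def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Approximate 95% margin of sampling error for a proportion p estimated
    from n respondents, assuming a simple random sample."""
    return z * sqrt(p * (1 - p) / n)


if __name__ == "__main__":
    # 2401 respondents is the sample size that gives +/-2 points under this formula;
    # the smaller sizes are hypothetical stand-ins for subgroups.
    for label, n in [("full sample", 2401), ("subgroup", 400), ("small subgroup", 200)]:
        print(f"{label:>14} (n={n}): +/-{100 * margin_of_error(n):.1f} points")
```

The margin shrinks only with the square root of the sample size, so a subgroup one-twelfth the size of the full sample has a margin about three and a half times as wide.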

