Monday, May 9, 2011

I Think He's Talking About RBIs

Nate Silver finally got around to part three of his interesting series on early-season polling in presidential nomination contests, but unfortunately I'm on the road (Wesleyan!) and can't give it the attention right now that it deserves.

I'm highly suspicious, however, that his conclusion won't hold up. Silver says:
[I]t’s simply quite wrong to suggest (as some smart people have) that early primary polls are meaningless. Instead, they have a reasonable amount of predictive power. 
Okay, I need to explain the title of this item: RBIs. Baseball analysts discovered some years ago that for a hitter, Runs Batted In were basically explained by slugging percentage and opportunities. There were, certainly, hitters who were "RBI guys," but that just means having a good SLG and staying in the lineup (that is, the skill of health); the rest is all about the teammates, and the way the manager constructs the lineup. Now, if you have nothing else then RBIs will tell you quite a bit, but if you know SLG and the other stuff RBIs won't tell you anything.

I don't know whether early polling numbers are exactly like RBIs, but there's certainly a big chunk of that going on in Silver's data, at least at first glance. Silver resists controlling for what he calls objective factors (I agree with him about subjective factors), but I think in fact that's what's driving most of the numbers here, on both ends. On the high end: sitting presidents get renominated, and sitting VPs get nominated. Al Gore 2000, George H.W. Bush 1988, Jimmy Carter 1980, and Gerald Ford 1976 didn't get nominated because they had good poll numbers; they got nominated because people in those positions control (other) important resources. I think one can push that a little further, but again I agree one doesn't want to veer into the subjective. But basically we know that endorsements matter, and it's likely that money raised matters (although both are complex).

Then on the bottom end, Silver is throwing into the mix a bunch of candidates who have no chance regardless of polling numbers, because they don't have the normal qualifications for the office. The problem for Tom Tancredo and Duncan Hunter in 2008, for example, wasn't that they were getting trounced in the polls; it's that Members of the House without other qualifications aren't serious presidential contenders. But since most candidates like that start with little support and then go nowhere, including them (especially since there are quite a few) as if they started with the same chance as more serious contenders is going to magnify the (possible?) illusion that you need to be doing well in the polls at this point.

The point here is that to the extent that polling is measuring something that the (other) objective indicators miss, then it really does matter. But if all it's telling us is that this candidate is the sitting VP, and that one has lapped the field in money raised, and that those two are backbencher Members of the House or unelected interest group leaders...well, then it's just telling us things that we already know.
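To make that distinction concrete, here is a minimal sketch using made-up data, not Silver's. It assumes a single hypothetical "strength" score standing in for the objective factors (office held, money, endorsements), and it builds both the early polls and the eventual result from that score plus independent noise. In other words, it deliberately constructs the pure RBI case, where polls carry no information of their own. The names `strength`, `polls`, and `result` are illustrative assumptions.

```python
import math
import random


def pearson(a, b):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(p, y, x):
    """Correlation between p and y after controlling for x."""
    r_py = pearson(p, y)
    r_px = pearson(p, x)
    r_yx = pearson(y, x)
    return (r_py - r_px * r_yx) / math.sqrt((1 - r_px ** 2) * (1 - r_yx ** 2))


# Synthetic candidates: everything here is invented for illustration.
rng = random.Random(2011)
n = 5000
strength = [rng.gauss(0, 1) for _ in range(n)]    # hypothetical objective factors
polls = [s + rng.gauss(0, 1) for s in strength]   # early polls = strength + noise
result = [s + rng.gauss(0, 1) for s in strength]  # outcome = strength + other noise

raw = pearson(polls, result)
partial = partial_corr(polls, result, strength)

print(f"polls vs. result, raw: {raw:.2f}")
print(f"polls vs. result, controlling for strength: {partial:.2f}")
```

By construction the raw correlation should land somewhere around 0.5, which looks like "a reasonable amount of predictive power," while the partial correlation should sit close to zero. That doesn't show real polls behave this way; it only shows that a healthy raw correlation can't by itself distinguish the two cases.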

Or, to put it another way, if a really famous clown shows up and pretends to be running for president a year before Iowa and scores some decent poll numbers, what we want to know is which counts more: the numbers, or that he's a clown? I'm quite confident that the clown part -- the objective side of the clown part, starting with the lack of elective or other high government experience -- is far more important.

So what we're left with is that we have something that may be RBIs, and may not be, but looking at it in isolation isn't going to tell us anything. I'm a big Nate Silver fan, but I don't think he's cracked this one yet, at all; I'm hoping he goes back for a 4th part to his series and gets a little closer to it.

2 comments:

  1. 1) Yes, except there are those players that could really sniff out an RBI. Take Manny Ramirez. He got RBIs because he got opportunities, but I wager the predicted value for RBIs from him based on SLG and quality of teammates was a consistent underestimate (back when he was a useful player, so, pre-2008). The guy just didn't care at all about his AB if nobody was on base, and you could see it. Overall, though, absolutely right...there are just some outliers.

    2) As for the RBI analogy to early polls....hmm. Actually, I think the better analogy is to Ws. You can be a legitimate candidate and poll in * land, but it's relatively rare: just like getting a win when you give up 5 runs.

    Now, we have to enter in the question about other factors and endogeneity. Should you control for them? If you're forecasting, you only want to drop out those high polling numbers if they don't end up winning. So, all the incumbents/clear favorites: from a forecasting POV, it doesn't so much matter WHY they're polling up: that factor is also present for the rest of the race, so it still has predictive power. But, then we end up in the subjective rub you and Nate find yourselves in. At what point do we put our thumb on the scale with Trump? And what makes it legitimate to do so for Trump, but not for Bachmann or Palin? (The "elective office" thing doesn't work: Perot did plenty fine without it. Yes, it's not a primary, but the point is: people WILL vote for a crazy billionaire).

    Honestly, I don't know the answer to this question...from a forecasting POV. I prefer to do causal modeling, which would completely ignore early polling, since we know it's endogenous.

  2. Actually Manny is a useful example of the "yes but" factor.

    He may have tried harder with runners on. His situational stats bear this out to an extent.

    And yet the simple RBI estimator (using nothing more than at bats with runners on, overall batting average and overall slugging percentage) underestimates his RBI totals by around 5 runs per year (and a remarkably consistent 5 runs a year).

    As Jonathan says, it's mostly about opportunity and power. You can identify other factors but they aren't all that important (at least not generally)
