Outliers

There has been a meme spreading among conservative commentators that pollsters are getting their models entirely wrong when it comes to predicting the actual breakdown of voters in the upcoming election. It’s an interesting thesis, and one we can attempt to analyze using pWP as a test.

Any use of statistics must involve some kind of logical framework, a qualitative analysis, or else you’re just data mining (data mining is bad because random patterns will emerge in large data sets purely by chance). So what is our qualitative analysis of the 2012 presidential election? Do we, based on our experiences with past elections and our knowledge of the two candidates, think this election is basically an automatic win for one side or the other?

No, of course not. We know what those races look like. At least, those of us who remember Bob Dole do. We can say with confidence, without using any polls or census data whatsoever, that Keith Ellison will win re-election. The same goes for Nancy Pelosi, and for the many other matchups that are basically decided long before anyone starts campaigning (excepting tail events).

The current presidential election is not one of these races, unless you’re a mesmerized partisan. This knowledge in hand, we can look at pWP to help us find those polls that appear to be outliers because they suggest one candidate or another has a substantial lead (probability of winning) far beyond what our qualitative analysis would predict.

In the following graph, I have divided up all the polls taken since Romney picked Paul Ryan as his running mate into five categories based on pWP. Any poll that gives either candidate a pWP of 90% or higher is an outlier. Any poll between 40% and 60% pWP is a tossup, and the rest lean either Obama or Romney.
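The five buckets can be sketched as a small function. This is only an illustration: the Romney-side outlier cutoff (Obama pWP of 10% or lower) and the handling of the exact boundary values are assumptions, since the post states only the 90% and 40%-60% thresholds.

```python
def classify_poll(obama_pwp: float) -> str:
    """Return the category for one poll, given Obama's pWP in percent (0-100).

    Assumption: a Romney outlier mirrors an Obama outlier, i.e. Obama pWP <= 10.
    Assumption: exactly 40 and exactly 60 count as tossups.
    """
    if not 0 <= obama_pwp <= 100:
        raise ValueError("pWP must be between 0 and 100")
    if obama_pwp >= 90:
        return "Outlier Obama"
    if obama_pwp > 60:
        return "Lean Obama"
    if obama_pwp >= 40:
        return "Tossup"
    if obama_pwp > 10:
        return "Lean Romney"
    return "Outlier Romney"
```

For example, `classify_poll(25)` lands in the Lean Romney bucket, while `classify_poll(95)` is an Obama outlier.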

As everyone can see, there are a lot more outliers for Obama than Romney. But it should be noted, most of Obama’s outliers are clustered around the convention, and Romney’s outlier also dates to his convention. Still, at least three polls recently have been outliers for Obama, depending on the arbitrary point where we decide “positive convention coverage” ended. If we remove the outliers and look just at the polls working within our qualitative framework, Obama is still looking good. If you split the tossup polls, and add the Lean Obama polls, and divide by the total number of polls that aren’t outliers, you get a 67%, let’s call it ‘propensity,’ for Obama to win. And how about that, more pWP convergence.
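The propensity arithmetic described above can be sketched as follows. The poll counts in the example are made up for illustration; they are not the counts behind the 67% figure, which are not listed in the post.

```python
def propensity(lean_obama: int, tossup: int, lean_romney: int) -> float:
    """Share of non-outlier polls credited to Obama.

    Tossup polls are split evenly (half a poll each); outlier polls are
    assumed to have been dropped before counting.
    """
    total = lean_obama + tossup + lean_romney
    if total == 0:
        raise ValueError("need at least one non-outlier poll")
    return (lean_obama + 0.5 * tossup) / total


# Hypothetical counts: 3 Lean Obama, 4 Tossup, 3 Lean Romney.
example = propensity(lean_obama=3, tossup=4, lean_romney=3)
print(f"{example:.0%}")
```

With those hypothetical counts the two sides are symmetric, so the propensity comes out to an even split.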

Current Obama pWP: 67%

There’s a lot of variation and ‘noise’ apparent in the graph, but the average Obama pWP for all polls since Ryan was picked as Romney’s #2 is 67%, and Obama’s average pWP for polls conducted in the last week is 67%. And that’s convergence. And I like convergence.

Obama Aggregate pWP

Obama’s high of near 85% pWP after the DNC has fallen to about 73% since then. In the graph I’ve labeled this “Middle East Turmoil/Romney Gaffe” but I could have just as easily labeled it “post-DNC” since we can’t be exactly sure what is causing what. The end result is Obama is now well-removed from the positive convention coverage and his pWP is higher now than before. And in politics, the best place to be, always, is ahead.

Obama pWP Update

The pollsters, other than Rasmussen and Gallup, took some time off after the conventions, so there’s a hole in the data (I don’t want to over-sample Rasmussen, so I only record Rasmussen’s moving average every third day or so, except during high-leverage events). Here’s a graph showing Obama’s level of support in the polls, and corresponding pWP:

I added some linear trendlines, not because I believe in any way whatsoever that a linear model makes any sense, but because it does help show the relationship between Obama’s level of support in the polls and pWP (small movements in support make big changes in pWP).
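A minimal sketch of why small movements in support can swing a win probability so much. The exact pWP formula is not given in these posts, so this assumes a simple normal approximation: the win probability is the normal CDF of the lead divided by the standard deviation of the lead. Both the function names and the 3-point standard deviation are hypothetical choices for illustration.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def pwp_from_lead(lead_points: float, lead_sd_points: float) -> float:
    """Assumed model: probability that the true lead is above zero."""
    if lead_sd_points <= 0:
        raise ValueError("standard deviation must be positive")
    return normal_cdf(lead_points / lead_sd_points)


# Hypothetical standard deviation of the lead, in percentage points.
ASSUMED_LEAD_SD = 3.0

for lead in (-2, -1, 0, 1, 2):
    print(f"lead {lead:+d} pts -> pWP {pwp_from_lead(lead, ASSUMED_LEAD_SD):.0%}")
```

Under this assumed model, the curve is steepest near a tied race, so a point or two of movement in the polls moves the probability by many points.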

It will be a couple of days before I have all the data, but it looks like Obama’s aggregate pWP has settled back down into the 65% level.

Undecided Voters and other Chthonic Specters

One of the weaknesses of pWP is the fact it doesn’t take into consideration undecided voters. The basic assumption of my pWP model is that in close elections, the undecided voters split rather evenly; otherwise they break toward the leading candidate. The main exception is when there’s an obvious macro-event (recession, war, hyperinflation, the candidate murders someone, etc.). In those situations the undecideds break away from the incumbent, or the murderer, unless it’s a good macro-event, in which case the opposite is true.

Thus, I have no problem ignoring undecided voters when calculating pWP.

Rasmussen, and other pollsters, have ways of trying to gauge how undecided voters are leaning. Currently Rasmussen has Romney leading Obama 47%-45% (giving Obama a very low 25% pWP). However, when they include leaners, the race is much tighter:

The Rasmussen Reports daily Presidential Tracking Poll for Monday shows Mitt Romney attracting support from 47% of voters nationwide, while President Obama earns 45% of the vote. Four percent (4%) prefer some other candidate, and four percent (4%) are undecided. See daily tracking history.

When “leaners” are included, the race is tied with both Obama and Romney at 48%. Leaners are those who are initially uncommitted to the two leading candidates but lean towards one of them when asked a follow-up question.

Leaners make the race a toss-up. They break 3:1 for Obama. But what can we say about the 4% who are truly undecided? What can we say about the energy of the leaners? Are they likely to make it to the polls? What about the four percent who are looking at voting third party? What percentage of them will decide on the lesser of two evils? (From my own experience, many of those who intend to vote third party in the weeks before an election will decide on one of the major candidates on election day.)
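The 3:1 figure follows directly from the two sets of Rasmussen numbers quoted above. A short worked check (the helper name is just for illustration):

```python
def leaner_gains(without_leaners: dict, with_leaners: dict) -> dict:
    """Points each candidate gains once leaners are included."""
    return {name: with_leaners[name] - without_leaners[name]
            for name in without_leaners}


# Figures quoted from the Rasmussen tracking poll, in percent.
without_leaners = {"Obama": 45, "Romney": 47}
with_leaners = {"Obama": 48, "Romney": 48}

gains = leaner_gains(without_leaners, with_leaners)
# Obama picks up 3 points and Romney picks up 1, so leaners break 3:1.
ratio = gains["Obama"] / gains["Romney"]
print(gains, ratio)
```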

I just don’t have good answers to many of these questions. My hunch is the undecided voters will break the president’s way if there’s no double-dip recession. The current aggregate pWP for Obama is 65%, which is outside of ‘toss-up’ territory. Based on pWP, I would predict an Obama victory.

If the electorate is really 48%-48%, as Rasmussen’s poll including leaners suggests, then the undecideds may not be important at all. Voter turnout will be the tiebreaker. Voter turnout is summoned by three things: macro-events, party-organized GOTV, and voter enthusiasm. In terms of party GOTV, I think Obama wins. In terms of macro-events, Obama wins. In terms of voter enthusiasm, I think Romney rides high on the anti-Obama sentiment.

In other words, Obama is still in control of this race.

DNC and pWP

Most of the polls have now been published, so I can give the pWPA (political Win Probability Added) for the DNC:

As you can see from the graph, Obama regained all the ground lost after Romney’s RNC bump, and then some. The DNC increased Obama’s pWP by a hefty 50%. Looking at the raw polls, averaging levels of support reported therein, the DNC added 1% support to Obama, and erased 0.5% support from Romney. You’d think 1.5% wouldn’t be such a big deal, but 1.5% is the typical standard deviation for a large poll. Thus, Obama added a full standard deviation as a buffer between him and Romney, after already having a healthy overall lead before the conventions started.

The next high-leverage event is the first debate, to be held October 3rd. Between now and then I’ll be looking at pWP among the tossup states listed in RealClearPolitics’ map. We should also have enough polls to see how long-lasting the convention effect happens to be, in terms of pWP.

Intrade on Obama

I have to wait two more days before I can give my final analysis on the conventions and their effect on pWP in this POTUS race. However, I did want to post something on Intrade.

Intrade, if you didn’t know, is a futures market where you can buy or sell contracts based on the outcomes of future events. The contracts are priced from 0 dollars to 100 dollars (a contract pays $100 if the event comes to pass, zero dollars if it does not). Thus, an event that traders think is unlikely to happen will be cheap; an event very likely to happen will be expensive. It’s a real market with millions of dollars at stake, and it should be taken seriously.

Here is the most recent chart on the Obama to Win contract:

As we can see, except for a small dip in June, the Obama contract has been trading between $55 and just over $60. So, according to the market, Obama has a 60% chance to win re-election (or put more accurately, members in this market are willing to risk $60 for a $100 payoff, a profit of $40, on Obama’s re-election).

This is in stark contrast to my pWP, which had Obama at a 68% chance to win re-election before the conventions (it currently sits over 80%). The conventions produced almost no effect on traders (and they’re probably right about this; still, it’s an interesting factoid).

Traders think Obama has a good chance to win, but they’re not as confident in the contract as the polls suggest they should be. So the question is: what information does the market have that we don’t? Based only on the polls, Obama’s contract should be trading higher. Either the market believes the pollsters are doing their job wrong, or that independents and undecideds will break slightly toward Romney, or that there’s a likelihood future events will shift the race in Romney’s direction.

Based on my pWP, the Obama contract is a buy, and Romney’s (sitting just under 40 bucks) is a sell. If you believe there will be more bad economic news, you might want to think about buying into Romney now. If you think (as I do) that the economic news will continue to be flat (neither positive nor negative), Obama’s contract looks pretty good.
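The buy/sell reasoning can be sketched as an expected-value comparison between the contract price and a probability you believe in. This is a simplified illustration: it ignores fees, the time value of money, and the spread, and it uses the rounded prices ($60 and $40) and the pre-convention 68% pWP mentioned above.

```python
PAYOUT = 100.0  # dollars paid per contract if the event happens


def implied_probability(price: float) -> float:
    """Win probability implied by a contract price."""
    return price / PAYOUT


def expected_profit(price: float, win_probability: float) -> float:
    """Expected profit in dollars from buying one contract at `price`,
    if you believe the event happens with `win_probability`."""
    return win_probability * PAYOUT - price


obama_edge = expected_profit(price=60.0, win_probability=0.68)
romney_edge = expected_profit(price=40.0, win_probability=0.32)
print(f"Obama contract:  {obama_edge:+.2f} per contract")
print(f"Romney contract: {romney_edge:+.2f} per contract")
```

A positive expected profit marks the contract as a buy under your own probability estimate, and a negative one marks it as a sell. The conclusion is only as good as the probability you plug in.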

pWP Convention Effect

Despite what I would label as lukewarm coverage for Mitt Romney during the RNC, the convention and its four days of media coverage (hurricane and all) produced a very large and positive shift in political Win Probability (pWP) for the Republican Nominee.

Here is a graph showing Obama’s level of support, and his calculated pWP, per each poll I recorded:

I calculate everything in terms of Obama’s pWP and level of support, as he is the incumbent. Whenever I do calculations for other races, I’ll maintain the same convention.

In order to get a clearer picture of how events affect pWP, we have to aggregate the data, post hoc, based on what is, in my opinion, the primary driver of news coverage over the periods in question. Here is the graph of aggregate pWP for Obama:

Some notes on the aggregated data:

– The RNC was worth a solid 34.6 pWPA (political Win Probability Added). By this I mean, based strictly on the polls, Romney’s chance of winning the election increased from just under 30% to almost 70%. It more than doubled.

– Obviously, the DNC, if traditional wisdom holds true, should help Obama in the same way. I think the conventions are low leverage events, so their effect should be fleeting. We’ll get a good chance to see if I’m right in the coming weeks. It will also be interesting to see if Obama can recapture all of his pre-RNC pWP.

– In the initial polling in the summer, after Romney had secured the nomination and before he picked Paul Ryan, Obama’s pWP was around 60%. Since FDR, presidential incumbents win 62% of the time, or 60% (depending on whether one counts only presidents who previously won an election, or also those who succeeded to office after the death or resignation of the previous president); of all the presidents seeking reelection since Washington, 58% of them win. That’s a nice convergence of current data and historical data.

– The group of polls around the time after Paul Ryan was picked as the VP candidate showed a positive movement for Romney. Then, after a while, it looks like a strong TV ad campaign starting around the time of the Olympics helped Obama, leading to some very high pWP numbers just before the RNC.

– I do not weight the polls based on sample size when aggregating. My reasoning is simple: I do not know a bad sample from a good sample, and I don’t want to weight a poorly constructed poll with a large sample size more than a properly constructed poll with a small sample size. My assumption is the sampling errors will cancel themselves out as more polls are added. (I also do not discriminate among polls. I use all Likely Voter polls, regardless of the bias of the polling institution, with the assumption the errors, or bias, will cancel out.)

– This post is long enough. I’ll talk about some of my methods, and the strengths and weaknesses thereof, in a later post.
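The aggregation and pWPA ideas in the notes above can be sketched like this: take the unweighted average pWP of the polls in each period, then subtract the "before" aggregate from the "after" aggregate. The per-poll numbers below are invented for illustration and are not the recorded data.

```python
from statistics import mean


def aggregate_pwp(poll_pwps: list) -> float:
    """Unweighted mean of per-poll pWP values (percent).

    Every poll counts equally, regardless of sample size or pollster.
    """
    if not poll_pwps:
        raise ValueError("need at least one poll")
    return mean(poll_pwps)


def pwpa(before_polls: list, after_polls: list) -> float:
    """Win Probability Added: change in aggregate pWP across an event."""
    return aggregate_pwp(after_polls) - aggregate_pwp(before_polls)


# Hypothetical per-poll pWP values for one candidate, in percent.
before_event = [30, 25, 35]
after_event = [60, 70, 65]
print(pwpa(before_event, after_event))
```

Because the average is unweighted, one large poll and one small poll move the aggregate by the same amount, which matches the choice described in the notes.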

pWP Update

Here is all my current data (Obama support, Romney Support, Obama pWP):

Conventional wisdom proves correct again: conventions provide a healthy bounce for the candidate. I’ll post more on this specifically, but what I want people to notice is how relatively small shifts in the support each candidate gets can make huge shifts in pWP.

Chip Cravaack pWP

A DCCC Poll in the 8th Congressional District (Nolan vs. Cravaack) has Nolan up 45-44 with a MOE of 4.9. It’s a push poll, which is bad, but it’s a current poll (Aug 30) in the only interesting federal race in the state. The pWP for Cravaack is 40% (that is, he has a 40% chance of being re-elected, assuming the poll is valid) with 11% of respondents undecided (or ‘other’). I have a majority of the undecideds moving against Chip Cravaack. So Chip has a lot of work to do to keep his seat.