Politics, like baseball, has a lot of readily available data that is typically used incorrectly. Polls are a prime example. When a poll says “Candidate A leads Candidate B by three points, plus or minus five points,” what does this really mean?
I’ve tackled this issue before. In those posts I point out that any poll is simply reporting the probability distributions of two pieces of data. From these distributions we can gain insight into the minds of the voters over a certain period of time. We can also find out the probability one candidate will beat another, if the election were held in the present.
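To make the idea concrete, here is a minimal sketch of how a single poll can be turned into a win probability. It is not necessarily the exact calculation behind the graph in this post: it assumes a simple random sample, a normal approximation, and a two-way comparison, and the function name `win_probability` and the poll numbers are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def win_probability(share_a, share_b, sample_size):
    """Approximate chance that A is truly ahead of B, given one poll.

    Shares are fractions (0.44 means 44%). Assumes a simple random
    sample and a normal approximation; real polls use weighting that
    this sketch ignores.
    """
    lead = share_a - share_b
    # Variance of the difference between two shares that come from
    # the same sample (multinomial sampling).
    variance = (share_a + share_b - lead ** 2) / sample_size
    return NormalDist().cdf(lead / sqrt(variance))


# Hypothetical poll: A at 44%, B at 41%, 400 respondents,
# which is roughly a "plus or minus five points" poll.
print(round(win_probability(0.44, 0.41, 400), 2))  # about 0.74
```

Under these assumptions, a three-point lead inside a five-point margin of error is not a "statistical tie": the leader would still win roughly three times out of four.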
Using the WPA (Win Probability Added) model, borrowed from sabermetrics (see FanGraphs), I decided to create a way to visualize the real information found in polls, win probability, without the “margin of error” confusion.
This graph (data approximated, but based on available polls) shows Mark Dayton’s moving Win Probability over the course of this election:
(The Y-Axis represents the probability Dayton wins, the X-Axis corresponds to the approximate date of the poll, by month.)
As can clearly be seen, Dayton enjoyed a very high Win Probability in July and August, just before the state primary date. At this time, all three DFL gubernatorial candidates were running ads opposing Emmer and some astroturf groups were also running anti-Emmer ads.
Emmer did not respond to those ads, waiting until after the primary before starting his strategic media campaign.
The result was an instantaneous change in the election dynamic. The election quickly went from a guaranteed DFL win to a toss-up that leans DFL.
Graphs like this will help politicos see how strategic elements change a campaign, what different events do to campaigns, the quality and predictive power of different polling institutions and the quality of a campaign’s get-out-the-vote (GOTV) effort.
I can even envision pWPA (political Win Probability Added) stats for all the different players, tactics and strategies used during the course of an election.
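A rough sketch of how pWPA could be tallied, assuming it is simply the change in win probability from one poll to the next (the same idea as WPA in baseball). The function name `pwpa` and the timeline are hypothetical, not the data behind the graph above.

```python
def pwpa(win_probs):
    """Change in win probability between consecutive polls."""
    return [
        round(later - earlier, 3)
        for earlier, later in zip(win_probs, win_probs[1:])
    ]


# Made-up timeline of a candidate's win probability by month.
months = ["July", "August", "September"]
win_probs = [0.95, 0.93, 0.60]

# Pair each change with the month in which it showed up.
for month, change in zip(months[1:], pwpa(win_probs)):
    print(f"{month}: {change:+.2f}")
# August: -0.02
# September: -0.33
```

Crediting each change to a specific ad buy, debate or strategy is the hard part, and this sketch does not attempt it.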
-There is another element of baseball’s Win Probability stat that I’d like to incorporate into this: a leverage index. Unfortunately, I’m not entirely sure how to do it in a way that isn’t post hoc. Obviously, elections become much more leveraged the closer they get to Election Day. For presidential elections, the debates are another source of high leverage. It is very clear from the above graph that summertime is very low-leverage, with most voters not caring or paying attention to details, as small changes in strategy produce huge changes in pWPA.
-This graphic is a bit more complicated when more than two candidates have a chance to win. I’m not sure how best to put data like that into a graph. Luckily, these races are so rare it’s basically unnecessary.
– I was working on this post before the latest Strib/Minnesota Poll came out showing Emmer about four standard deviations from Dayton’s total. I think their methods are clearly mistaken (Mitch Berg has a post on their methods), simply because the poll puts the probability of a Dayton win near 100%. That is anomalous compared to several recent polls, and there’s no event or change in strategy to explain the movement. After doing some adjustments to the numbers (adding 5.5% to the GOP total, adjusting Horner’s total down to 13% and splitting the “other” tally), I get a rough Win Probability for Dayton of 75%, which is much more realistic.
– I really, really, really hope nobody else has done this. I did a Google search and didn’t find anything, so I’m claiming it as my own (despite the fact WPA has been used in baseball for decades).
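For readers who want to check the standard-deviation arithmetic in the Strib poll note above, here is a small sketch that converts between a lead measured in standard deviations and a win probability. It assumes the lead is normally distributed; the helper names are made up for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def win_probability_from_z(z):
    """Win probability for a lead of z standard deviations."""
    return standard_normal.cdf(z)


def z_from_win_probability(p):
    """Lead, in standard deviations, implied by a win probability."""
    return standard_normal.inv_cdf(p)


print(f"{win_probability_from_z(4):.5f}")     # 0.99997
print(f"{z_from_win_probability(0.75):.2f}")  # 0.67
```

So a four-standard-deviation lead really does imply a near-certain win, while a 75% Win Probability corresponds to a lead of only about two-thirds of a standard deviation.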