Some data on the upcoming 2004 Presidential election.
(1) Thomas F. Schaller, writing in the 15 June 2004 Washington Spectator, "The Electoral College Seen By Many as the Deplorable College:"
Based on 2003 Census estimates, the 16 states almost universally agreed to be "battlegrounds" and the five that some also consider competitive (Colorado, Delaware, Maine, Louisiana and Tennessee) make up just 40 percent of the national population and 39 percent (212) of the electors. That means three out of every five voters will be mostly ignored between now and November 2.
(2) A second article by a different author lists only fifteen battleground states:
Arkansas
Florida
Iowa
Michigan
Minnesota
Missouri
New Hampshire
New Mexico
Ohio
Oregon
Pennsylvania
Washington
West Virginia
Wisconsin
(3) A third, more informative article on battleground states at the Wall Street Journal (21 June 2004) agrees that there are sixteen, but puts Tennessee in the category of a definite battleground, unlike Schaller.
And here's the latest polling data in those states:
So my sources don't completely agree on the battleground states. Taking the union of all the states mentioned by at least one source as a battleground, I get 21 states:
Arkansas
Colorado
Delaware
Florida
Iowa
Louisiana
Maine
Michigan
Minnesota
Missouri
Nevada
New Hampshire
New Mexico
Ohio
Oregon
Pennsylvania
Tennessee
Washington
West Virginia
Wisconsin
That leaves 51-21 = 30 "uncontested" states (D. C. is a "state" here). So who is supposed to win in those? This graphic (also from the WSJ article) seems to have the answers:
It's time to start whacking on this data in Mathematica. First, a list with all the data so far, including the number of electoral votes per state:
eVotes = {{"Alabama", 9, "Bush"}, {"Alaska", 3, "Bush"}, {"Arizona", 10, "Battleground"}, {"Arkansas", 6, "Battleground"}, {"California", 55, "Kerry"}, {"Colorado", 9, "Battleground"}, {"Connecticut", 7, "Kerry"}, {"Delaware", 3, "Battleground"}, {"D. C.", 3, "Kerry"}, {"Florida", 27, "Battleground"}, {"Georgia", 15, "Bush"}, {"Hawaii", 4, "Kerry"}, {"Idaho", 4, "Bush"}, {"Illinois", 21, "Battleground"}, {"Indiana", 11, "Bush"}, {"Iowa", 7, "Battleground"}, {"Kansas", 6, "Bush"}, {"Kentucky", 8, "Bush"}, {"Louisiana", 9, "Battleground"}, {"Maine", 4, "Battleground"}, {"Maryland", 10, "Kerry"}, {"Massachusetts", 12, "Kerry"}, {"Michigan", 17, "Battleground"}, {"Minnesota", 10, "Battleground"}, {"Mississippi", 6, "Bush"}, {"Missouri", 11, "Battleground"}, {"Montana", 3, "Bush"}, {"Nebraska", 5, "Bush"}, {"Nevada", 5, "Battleground"}, {"New Hampshire", 4, "Battleground"}, {"New Jersey", 15, "Kerry"}, {"New Mexico", 5, "Battleground"}, {"New York", 31, "Kerry"}, {"North Carolina", 15, "Bush"}, {"North Dakota", 3, "Bush"}, {"Ohio", 20, "Battleground"}, {"Oklahoma", 7, "Bush"}, {"Oregon", 7, "Battleground"}, {"Pennsylvania", 21, "Battleground"}, {"Rhode Island", 4, "Kerry"}, {"South Carolina", 8, "Bush"}, {"South Dakota", 3, "Bush"}, {"Tennessee", 11, "Battleground"}, {"Texas", 34, "Bush"}, {"Utah", 5, "Bush"}, {"Vermont", 3, "Kerry"}, {"Virginia", 13, "Bush"}, {"Washington", 11, "Battleground"}, {"West Virginia", 5, "Battleground"}, {"Wisconsin", 10, "Battleground"}, {"Wyoming", 3, "Bush"}};
So what's the battleground?
battleground = Select[eVotes, #[[3]] == "Battleground" &]
Out[63]= {{"Arizona", 10, "Battleground"}, {"Arkansas", 6, "Battleground"}, {"Colorado", 9, "Battleground"}, {"Delaware", 3, "Battleground"}, {"Florida", 27, "Battleground"}, {"Illinois", 21, "Battleground"}, {"Iowa", 7, "Battleground"}, {"Louisiana", 9, "Battleground"}, {"Maine", 4, "Battleground"}, {"Michigan", 17, "Battleground"}, {"Minnesota", 10, "Battleground"}, {"Missouri", 11, "Battleground"}, {"Nevada", 5, "Battleground"}, {"New Hampshire", 4, "Battleground"}, {"New Mexico", 5, "Battleground"}, {"Ohio", 20, "Battleground"}, {"Oregon", 7, "Battleground"}, {"Pennsylvania", 21, "Battleground"}, {"Tennessee", 11, "Battleground"}, {"Washington", 11, "Battleground"}, {"West Virginia", 5, "Battleground"}, {"Wisconsin", 10, "Battleground"}}
The starting situation:
Bush 161
Kerry 144
Battleground 233
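Totals like these can be computed directly from a list shaped like eVotes. Here is a minimal sketch, assuming rows of the form {state, votes, category}. The names votesFor and sample are hypothetical ones introduced here, and sample is just four rows copied from eVotes so the snippet runs on its own:

```mathematica
(* Hypothetical helper: total electoral votes in one category.
   Assumes each row looks like {state, votes, category}. *)
votesFor[data_List, category_String] :=
  Total[Select[data, #[[3]] == category &][[All, 2]]]

(* Four rows copied from eVotes, only to make the example self-contained *)
sample = {{"Alabama", 9, "Bush"}, {"Alaska", 3, "Bush"},
   {"Arizona", 10, "Battleground"}, {"California", 55, "Kerry"}};

votesFor[sample, "Bush"]           (* 12 *)
votesFor[sample, "Battleground"]   (* 10 *)
```

Called on the full eVotes list with "Kerry" and "Battleground", the same function should reproduce the 144 and 233 figures. Since the three categories must add up to 538, that leaves 538 - 144 - 233 = 161 for Bush.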
There are 538 total Electoral College votes, and 270 are needed to win. To make some guesses about who might win, I need to come up with a way to convert each battleground state's polling data into a probability of victory in that state. I can then exhaustively enumerate all 2^21 possible battleground outcomes and sum over their probabilities to get some idea who's going to win. That's not too difficult a calculation (roughly 2 million possible outcomes).
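One way the exhaustive sum might look is sketched below. This is not a finished method: it assumes the per-state probabilities are already known and that states are independent of one another. The name winProbability is hypothetical, and the 50/50 odds and the threshold of 41 in the toy call are placeholders, not polling results:

```mathematica
(* Hypothetical helper: probability that a candidate reaches the threshold.
   base   = electoral votes assumed safe for the candidate
   states = list of {votes, pWin} pairs, one per battleground state
   ASSUMPTION: state outcomes are independent of one another. *)
winProbability[base_, states_List, threshold_] :=
  Module[{n = Length[states], votes, probs, wins, total = 0},
    votes = states[[All, 1]];
    probs = states[[All, 2]];
    Do[
      (* binary digits of k pick one outcome: 1 = candidate wins that state *)
      wins = IntegerDigits[k, 2, n];
      If[base + wins . votes >= threshold,
        (* probability of exactly this pattern of wins and losses *)
        total += Times @@ (wins probs + (1 - wins) (1 - probs))],
      {k, 0, 2^n - 1}];
    total]

(* Toy check: three states with placeholder 50/50 odds and a made-up threshold *)
winProbability[0, {{27, 1/2}, {20, 1/2}, {21, 1/2}}, 41]   (* 1/2 *)
```

In the toy call, four of the eight outcomes (any two states, or all three) reach 41 votes, and each outcome has probability 1/8, hence 1/2. With 21 states the loop runs 2^21 = 2,097,152 times.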
So, how to convert the polling data for a state into a probability of victory? [Note to self: read this PDF document on the meaning of "margin of error" (from the American Statistical Association), which has the following useful warning (see its pg 10)]:
A misleading feature of most current media stories on political polls is that they report the margin of error associated with the proportion favoring one candidate, not the margin of error of the lead of one candidate over another. To illustrate the problem, suppose one poll finds that Mr. Jones has 45 percent support, Ms. Smith has 41 percent support, 14 percent are undecided, and there is a 3 percent margin of error for each category.
If we note that Mr. Jones might have anywhere from 42 percent to 48 percent support in the voting population and Ms. Smith might have anywhere from 38 percent to 44 percent support, then it would not be terribly surprising for another poll to report anything from a 10-point lead for Mr. Jones (such as 48 percent to 38 percent) to a 2-point lead for Ms. Smith (such as 44 percent to 42 percent).
In more technical terms, a law of probability dictates that the difference between two uncertain proportions (e.g., the lead of one candidate over another in a political poll in which both are estimated) has more uncertainty associated with it than either proportion alone.
Accordingly, the margin of error associated with the lead of one candidate over another should be larger than the margin of error associated with a single proportion, which is what media reports typically mention (thus the need to keep your eye on what’s being estimated!).
Until media organizations get their reporting practices in line with actual variation in results across political polls, a rule of thumb is to multiply the currently reported margin of error by 1.7 to obtain a more accurate estimate of the margin of error for the lead of one candidate over another. Thus, a reported 3 percent margin of error becomes about 5 percent and a reported 4 percent margin of error becomes about 7 percent when the size of the lead is being considered.
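The rule of thumb is easy to try out in Mathematica. The sketch below also shows one possible way to turn a lead into a probability, but that part is an assumption rather than anything the ASA text prescribes. It treats the true lead as normally distributed around the polled lead and reads the margin of error as a 95% interval (1.96 standard deviations). The names leadMargin and leadProbability are hypothetical:

```mathematica
(* Rule of thumb from the ASA text: the margin of error of the lead is
   about 1.7 times the reported per-candidate margin. *)
leadMargin[reported_] := 1.7 reported

(* ASSUMPTION (not from the ASA text): the true lead is normally
   distributed around the polled lead, and the margin of error is a
   95% interval, i.e. 1.96 standard deviations. *)
leadProbability[lead_, reported_] :=
  Module[{sigma = leadMargin[reported]/1.96},
    (1 + Erf[lead/(sigma Sqrt[2])])/2]

leadMargin[3]            (* 5.1, "about 5 percent" *)
leadMargin[4]            (* 6.8, "about 7 percent" *)
leadProbability[4, 3]    (* roughly 0.94 *)
```

The last call is the Jones versus Smith example: a 4-point lead (45 to 41) with a reported 3 percent margin of error. Under these assumptions Jones would actually be ahead about 94 percent of the time.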
[to be continued...]