Saturday, November 17, 2012

Did Obama Cheat? How to Answer the Question


There are 15 states with photo ID requirements for voting. Mr. Obama lost in all of them. In places with the weakest controls, specifically counties in Florida, Ohio, Colorado, and Pennsylvania, he generally drew turnout in the 90% or greater range and won with better than 95% of the vote.

Losers tend to look for external explanations, and a lot of conservatives looking at numbers like those from Florida's St. Lucie County (where Mr. Obama got 247,713 votes from only 175,554 registered voters) are starting to question the legitimacy of the electoral results as reported. That's not good news for democracy, because the system works only if we trust it -- and having the majority write off a largely GOP minority who think the results were rigged serves nobody. Not even Democrats.

So what we need is an independent means of testing the electoral result.

The traditional way of doing this is, of course, to assume legitimacy, then gather anecdotal evidence of vote-cheating, promote that to sworn testimony through court proceedings, and hope for a conclusion from the adversarial process this generates. That's how we now know, for example, that Stevens was falsely prosecuted and Coleman beat Franken.

Great. Except that both Franken and Begich hold office, both voted for ObamaCare, and both will get generous federal pensions. Basically, the traditional approach may be effective, but it's also politically pointless...and inappropriate in today's context anyway, given that we need something a whole lot quicker. We need something, in fact, that can give us a clear result in time to decide whether there's a case to be made for asking the college of electors to overturn the nominal result when they vote on December 17.

One idea that might work would be to compare the results of an honest poll to the nominal results obtained in the election and then decide what the odds are that the differences, if any, reflect electoral fraud.

Given that there have been hundreds of polls, the most public of which roughly predicted the reported electoral outcomes, this may seem like a dumb idea. But it may not be so dumb -- and for two main reasons:

First: internal GOP polling based on face-to-face voter contact consistently contradicted the D+7 or more results predicted by the major media polls and subsequently demonstrated in the nominal electoral outcome.

This shouldn't happen; theory predicts that internal polls based on face-to-face interviews should produce better results than panel studies or public media polls -- and until the 2008 presidential election, they generally did.

Second: The major media pollsters face non-response problems severe enough to render their results indefensible by normal statistical practices and standards. In consequence, those in charge of analyzing and reporting the data typically apply complicated, but ultimately guesstimated, weighting formulas -- lending a scientific patina to what are really intuitive judgments about sub-sample weights. And the closer the population proportions they're trying to estimate get to 50:50, the more these judgments affect the outcome, and the more risk the people making them take on.
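To make that concrete, here's a minimal sketch in Python -- every number in it is invented -- showing how two defensible-sounding turnout models, applied to the same raw interviews, move the headline estimate by about three points:

```python
# All numbers invented for illustration: measured support for candidate A
# within each party sub-sample, plus two rival turnout (weighting) models.
support_for_A = {"dem": 0.92, "rep": 0.07, "ind": 0.48}

def weighted_estimate(turnout_model):
    """Re-weight the sub-samples to an assumed party share of the electorate."""
    return sum(turnout_model[p] * support_for_A[p] for p in support_for_A)

d_plus_7 = {"dem": 0.39, "rep": 0.32, "ind": 0.29}  # a D+7 electorate
even     = {"dem": 0.35, "rep": 0.35, "ind": 0.30}  # an evenly split electorate

print(f"A's share under D+7 weighting:  {weighted_estimate(d_plus_7):.1%}")
print(f"A's share under even weighting: {weighted_estimate(even):.1%}")
# Roughly 52% versus 49%: the same interviews, a different winner.
```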

As a result, the closer the contest is in reality, the more pressure these guys are under to follow the leader -- for competing pollsters to recursively adjust their weightings to invisibly move their results toward a consensus position.

Notice that this isn't conspiracy; it's a natural consequence of the costs and complexities of public polling in today's business environment. But it is an exploitable consequence, because someone who wants to drive the public media polling consensus toward a predetermined outcome need only lean on one of the market leaders to cause all of them to drift toward the intended conclusion.

Gallup published a poll close to what the GOP internals showed; the DOJ then joined what should have been a nuisance lawsuit against the company; and Gallup subsequently adjusted its weightings to bring its results into line. We know these events happened; we do not know whether they're related.

A public, national, face-to-face voter poll, using a simple set of questions and academically defensible statistical methods, would go a long way in clearing up the questions here. If the results strongly support the nominal electoral outcomes, we can be fairly confident, for example, that the election was broadly fair, that the GOP internal polls were wrong, and that the major media pollsters behaved honorably and correctly throughout.

If, on the other hand, our hypothetical national poll produces results that differ significantly from the nominal election results, it will largely rehabilitate the GOP internal results, cast significant doubt on the legitimacy of Mr. Obama's claimed victory, and probably cause at least one of the major media pollsters to rethink its methods.

Notice, however, that this type of audit survey is a first cut at the problem -- more to see whether there is a real problem than to address it. Ultimately, only the more traditional methods will let us deal with issues like those raised by the disenfranchisement of (mostly GOP) service personnel and the enfranchisement of (mostly Democrat) illegals.

With that in mind, let's look at the mechanics of actually doing it -- but bear in mind, please, that there are many different ways of doing this, and what I'm suggesting here is intended to be illustrative, not prescriptive.

First we need to establish the universe: the target group whose population proportions we want to estimate. Since we're interested only in people whose votes were counted, and we often don't know who they are, we'll start with the list of all registered voters -- a list we build by combining the rolls from all jurisdictions without eliminating duplicates.

Next we need to establish the question. Since we want to know what percentage of registered voters voted for each presidential slate, the two core questions are:

1. Did you vote?

2. For which presidential slate?

If the nominal election results are broadly correct:
all of our respondents should be reachable at the addresses listed for them;
about 58% of our sample should report having voted;
about 30% of our sample should claim to have voted for Mr. Obama; and
about 28% of our sample should claim to have voted for Mr. Romney.

Any significant differences from that distribution will indicate fraud. Normal statistical methods can then be used to quantify both the likelihood and the significance of the fraud.
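As a sketch of what those normal statistical methods might look like -- here a chi-square goodness-of-fit test, with all observed counts invented for illustration -- we can ask how likely it is that the poll and the official tallies describe the same electorate:

```python
# Chi-square goodness-of-fit: do the poll's answers match the distribution
# the official results imply? The observed counts below are hypothetical.
from scipy.stats import chisquare

n = 1900  # completed interviews

# Expected counts if the nominal results hold:
# ~30% voted Obama, ~28% voted Romney, ~42% did not vote.
expected = [0.30 * n, 0.28 * n, 0.42 * n]

# Observed counts from our hypothetical audit poll (must sum to n).
observed = [520, 585, 795]

stat, p_value = chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square = {stat:.1f}, p = {p_value:.2g}")
# A tiny p-value says the poll and the official tallies are very unlikely
# to describe the same electorate; a large one supports the nominal result.
```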

Next we need a sampling methodology -- in this case, we'll number our combined lists from one to whatever and use the computer's pseudo-random number capability to pick our interviewees.
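In code, that draw might look like the following minimal sketch; the file name, the one-registrant-per-line format, and the fixed seed are all assumptions made for illustration:

```python
# Number the combined registration list and let the PRNG pick interviewees.
import random

with open("combined_voter_rolls.txt") as f:
    roll = f.read().splitlines()       # the numbered universe: index = line

sample_size = 400                      # initial stage; see the sizing discussion below
rng = random.Random(20121106)          # a fixed seed keeps the draw auditable
picks = rng.sample(range(len(roll)), k=sample_size)   # distinct indices, no repeats

interviewees = [roll[i] for i in picks]
```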

The big issue is sample size. The determinants for this are:
Population or universe size: with well over 100 million registered voters, the population is large enough that the statistical rules for sampling from an effectively infinite population apply -- the finite-population correction is negligible.

How confident do we want to be that the population proportion estimates we produce are in the same ballpark as the answer we would get if we talked to everybody and tabulated those results?

The traditional confidence level targeted by pollsters is 95% -- meaning that if you repeated the sampling 20 times, you'd expect the samples drawn to reflect the population proportions, within the stated margin, 19 times (see the simulation sketch after this list).

How close do we need to get to the real proportion? I.e., how big a ballpark can we live with?

The traditional polling answer is plus or minus 5% -- largely, incidentally, because pre-computer-age pollsters rejoiced in a happy coincidence: pairing that precision with a 95% confidence level let clients add the two numbers to 100% and conclude they understood something, while yielding an easily memorized, easily squared z-statistic (1.96, close enough to 2) that works out to an even number.
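A quick Monte Carlo sketch shows what pairing 95% confidence with a plus-or-minus 5% margin means in practice; the 30% true proportion below is an arbitrary choice for illustration:

```python
# Draw many samples of 400 from a population whose true proportion is 30%,
# and count how often the sample estimate lands within 5 points of the truth.
import random

TRUE_P, N, MARGIN, TRIALS = 0.30, 400, 0.05, 10_000
rng = random.Random(1)

hits = sum(
    abs(sum(rng.random() < TRUE_P for _ in range(N)) / N - TRUE_P) <= MARGIN
    for _ in range(TRIALS)
)
print(f"estimates within the margin: {hits / TRIALS:.1%}")
# Expect roughly 95% or better -- better here because the +/-5% margin is
# calibrated for the worst case of a 50:50 split, not a 30:70 one.
```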

In our case, the basic 95% confidence that our estimate is within 5% of the real number, given no more than 3% non-response, requires a sample of about 400; raising that to gain 99% confidence that our estimate is correct to within 3% requires a sample of about 1,900; and going for 99% confidence that our estimates are within 1% of reality would require a sample size of about 17,000.
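Those figures fall out of the standard sample-size formula for estimating a proportion, n = z^2 * p(1-p) / e^2, taking p = 0.5 as the conservative worst case and padding for 3% non-response. A quick sketch:

```python
# Standard sample size for estimating a proportion, padded for non-response.
import math

def sample_size(z, margin, p=0.5, nonresponse=0.03):
    n = z**2 * p * (1 - p) / margin**2
    return math.ceil(n / (1 - nonresponse))

print(sample_size(1.960, 0.05))   # 95% confidence, +/-5%  -> about 400
print(sample_size(2.576, 0.03))   # 99% confidence, +/-3%  -> about 1,900
print(sample_size(2.576, 0.01))   # 99% confidence, +/-1%  -> about 17,000
```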

So what we'll do is draw 17,000 names, draw 1,900 names from that list, and finally produce an initial sub-sample of 400. We'll then interview those first 400, decide whether it's worth continuing, and, if so, continue to at least 1,840 successful interviews from the list of 1,900 names. Finally, we will re-evaluate before either stopping or proceeding to the full 17,000.
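One way to realize that staged design -- sketched here with a placeholder roll size and an arbitrary seed -- is to make each stage a prefix of a single full draw, so that later stages extend the earlier interviews rather than replacing them:

```python
# Each stage is a prefix of one 17,000-name draw, so moving from 400 to
# 1,900 to 17,000 only ever adds interviews.
import random

ROLL_SIZE = 120_000_000               # placeholder for the combined rolls
rng = random.Random(20121217)         # arbitrary fixed seed, for auditability

full_draw = rng.sample(range(ROLL_SIZE), k=17_000)
stage_one = full_draw[:400]           # the initial interviews
stage_two = full_draw[:1_900]         # continue to here if stage one warrants it
```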

Our next decisions involve interview methods. Since we cannot tolerate much non-response and do not want the interviewer to bias the result, we're going to:
cold call on the doorstep,
dispatch two interviewers on each call, and
ask only two questions:
did you vote? and
if so, for which presidential slate?

Teams will not ask the respondent to say which slate was preferred but will, instead, hand the respondent two cards or other tokens with instructions to keep one while placing the other in a box or other container proffered by the interviewer.

This process will be extremely expensive -- far more so than the telephone interviews conducted by the major media pollsters. As a very rough first guesstimate: getting the infrastructure in place quickly enough for the result to be meaningful and then carrying out the first 400 interviews will run upwards of $400,000, with total costs rising to the $2 million range if it is necessary to interview the full 17,000 sample.
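As a sanity check on that guesstimate, a simple fixed-plus-marginal cost model -- both figures below are invented to fit the numbers above, not actual quotes -- reproduces the two totals:

```python
# Fixed infrastructure cost plus a marginal cost per completed interview.
FIXED = 360_000        # setup: recruiting, training, travel, tabulation
PER_INTERVIEW = 96     # marginal cost of one two-interviewer doorstep visit

for n in (400, 1_900, 17_000):
    print(f"{n:>6} interviews: ${FIXED + n * PER_INTERVIEW:,}")
# ->    400 interviews: $398,400   (the "upwards of $400,000" first stage)
# -> 17,000 interviews: $1,992,000 (the "$2 million range" full sample)
```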

What we're really proposing here is a first audit of the election result, and regardless of outcome, its primary value is in reducing uncertainty.

Right now, saying Obama cheated is about as credible as saying he didn't, so a positive result will go a long way to debunking various destructive conspiracy theories and thus contribute to the smooth functioning of American democracy. Conversely, a negative result will form a strong basis for multiple legal and political actions aimed at delegitimizing an illegitimate president.

Bottom line? Knowing is better than not knowing. So who's got three million bucks?


Source: Paul Murphy is the pseudonym used by a retired IT consultant now living in beautiful Lethbridge, Alberta.
