jrtom: (Default)
[personal profile] jrtom
A friend's blog asks which of the following two studies of exit polls should be regarded as reliable:

"this guy says the exit polls couldn't have been off so far by chance" (Stephen Freeman, UPenn)

"these guys say the difference is insignificant" (CalTech/MIT Voting Technology Project)

Now, I'm not much of a statistician; I understand some things about probability and I speak a few distributions, but I'm not accustomed to doing confidence analyses, so I'd have to do a lot of scut work to check the analyses themselves.

Instead, I can provide a higher-level meta-analysis, which might be of interest.

The Caltech/MIT analysis doesn't really give me much confidence in its results: the authors don't go through their methods, they don't say exactly what data they're working with or where they got it, and they don't really address possible confounding factors very well. The whole analysis strikes me as kind of shallow, and there isn't enough there for me to be able to check their work. I also have the vague feeling that they basically bought the "exit polls were skewed" argument coming out the door.

By contrast, Freeman's analysis (after a brief skim) seems to have none of these problems. I can't guarantee that he didn't screw something up, but at least I feel that I could back-check him if I wanted to take the time.

Update: Robin has given me permission to post our e-mail discussion about this. I'll be posting my future responses, if any, in the comments.



it's dawning on me slowly that the math might be right on both--the first thing the yes-it's-significant guy tells us is that he's using now-vanished figures from the first day on CNN, which were later "corrected" in incomprehensible fashion. Likely, the Caltech-MIT team is using the "corrected" numbers and trying to figure out what all the fuss is about.

Which means, of course, now I want a source on that "correction" process and what in God's name the function of that procedure is supposed to be--other than making the network look like it was predicting things accurately all along, no matter what the results might be.






I don't know what the correction process was, but I think that you have the general motive right: to make the final results look like the counted votes. This is not necessarily as sinister as it seems, though.

I think that there are a couple of things going on here, which are complicating the discussion (not ours, but the wider debate).

(1) Some people think of exit polls as a check on the election results (i.e., if the polls differ significantly from the counted votes, then there may be a problem with the vote-counting process). If you believe this, then clearly you want to keep the exit poll numbers and the vote counts completely separate.

Others think of exit polls as a way to estimate what the final vote count is going to be. This is presumably the stance of the news organizations, which put a premium on getting news out ASAP. In that case, you want to gradually blend the vote data in with the exit poll data, so that you can call the state as soon as possible (and presumably, once all the vote data is in, you discard the exit poll data entirely, since it has served its purpose as you see it).

(2) A lot of people seem to be basing their analyses on early exit poll results; this is problematic because (a) there might be systematic bias, e.g., early voters might tend to be Democrats, and (b) the sample sizes are obviously not very big. In order to really tell whether there was a discrepancy, as I observed above, we'd need to know the *final, raw* exit polls.
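To make point (2)(b) concrete, here's a rough sketch of how much the significance of a given poll/count gap depends on sample size. This is not either paper's actual method, and the shares and sample sizes are made up; it also treats the exit poll as a simple random sample, which real exit polls (cluster samples) are not, so real error bars would be wider:

```python
import math

def poll_discrepancy_z(p_poll, p_count, n):
    """Z-score for the gap between an exit poll share and the counted
    share, treating the poll as a simple random sample of size n."""
    se = math.sqrt(p_count * (1 - p_count) / n)  # standard error under the count
    return (p_poll - p_count) / se

def two_sided_p(z):
    """Two-sided normal p-value, via the complementary error function."""
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical numbers: the same 2-point gap (52% poll vs. 50% count)
# goes from unremarkable to wildly improbable as n grows.
for n in (500, 2000, 10000):
    z = poll_discrepancy_z(0.52, 0.50, n)
    print(n, round(z, 2), round(two_sided_p(z), 4))
```

So whether a discrepancy "could have happened by chance" hinges directly on which sample sizes (early partial tallies vs. final full polls) an analysis plugs in.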

If you do a Google search on "exit poll correction" you'll find some articles on this.
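For what it's worth, the "gradually blend in the vote data" idea from (1) could be as simple as a linear weighting. This is pure guesswork on my part about what such a procedure might look like; the networks' actual "correction" process is exactly what we don't know:

```python
def blended_estimate(poll_share, count_share, frac_counted):
    """Weight the counted share more heavily as more of the vote comes in.
    A hypothetical scheme, not the networks' documented procedure."""
    return frac_counted * count_share + (1 - frac_counted) * poll_share

# As the count comes in, the published "exit poll" number drifts toward
# the tally -- which is why a blended figure will echo whatever the
# official count says, right or wrong.
for frac in (0.0, 0.5, 1.0):
    print(frac, blended_estimate(0.52, 0.48, frac))
```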





Points well taken. Two things to add: first, it's not so much that I think vote-checking is "the purpose" of exit polls, but it's certainly "an obvious utility" of them. And yes, since they're only samples, they should never be regarded as substitutes for the tallies themselves--but their accuracy should be regarded as more or less whatever it has been historically. Most parties to the argument seem to place that accuracy pretty high.

And second, as for what you say about the networks: I'd rather they took the extra step of keeping those sets of numbers separate, adding a third set to blend the two if they like as an index of progress. But you do make clear why they'd want such a number to exist.

But in that case, anybody who purports to be analyzing the exit polls, after the fact, is culpable for adopting those altered numbers and referring to them as exit poll figures. They aren't; they're some other thing, somebody's composite. You could argue that the networks are still implicated, for using a term that really isn't accurate, and archiving these mongrel numbers under the title "exit polls" when the dust has settled. But certainly, academic statisticians going back to examine the figures later must be held responsible for using those numbers instead of the original data.

And when the question at hand is precisely the accuracy of the real, official tallies, this offense moves from being a head-smacking oversight to being simple academic dishonesty. No matter what fool thing happened to the official tallies, such "blended" polls will automatically tend to echo and legitimize it. Call these altered numbers "exit polls," especially without explaining, and your work loses its legitimacy.

It should probably be reiterated that I'm only speculating that the Caltech-MIT team is using the "blended" numbers. But if it turns out they are, man, throw their analysis right in the circular file.


(no subject)

Date: 19 November 2004 00:09 (UTC)
From: [identity profile] joedecker.livejournal.com
This (http://center.grad.upenn.edu/center/get.cgi?item=freemanexitpoll) is the UPenn site for the Freeman paper, which has been updated since the copy Truthout is currently linking. I have not read either yet; I provide it entirely in the interest of best information.

(no subject)

Date: 19 November 2004 09:15 (UTC)
From: [identity profile] jrtom.livejournal.com
Thanks for the link. I've now read the whole thing and I don't see any holes in the arguments. The author doesn't draw any definite conclusions, but I'm at least now reasonably satisfied that he was not using early exit poll data. (Assuming that the source had been updated regularly, that is.)

(no subject)

Date: 19 November 2004 10:27 (UTC)
From: [identity profile] joedecker.livejournal.com
*nods* I haven't had my whack at reading the papers carefully, but I'm looking forward to making the time. Thanks!

more links

Date: 19 November 2004 12:19 (UTC)
From: [identity profile] jrtom.livejournal.com
http://ucdata.berkeley.edu/new_web/VOTE2004/index.html: another study, this one analyzing the Florida results. Haven't looked at it at all yet.

A Wired article ("Researchers: Florida Vote Fishy" (http://www.wired.com/news/evote/0,2645,65757,00.html)) reports on the above study, and some reactions to it.
