Tuesday, November 07, 2006

 

Election 2006 - Is There Sampling Bias in Public Opinion Polls?

Before one can draw conclusions from data, one must grasp the meaning of the inferential statistics that point to those conclusions. Before that, one must grasp what those inferential statistics are testing, namely the descriptive statistics. Before that, one must obtain the data. And before that, one must have a method of obtaining a sample from the population.

This is why one of the first stories statistics students are told is that of the 1948 presidential election. The election mattered for more than keeping Earl Warren out of the vice presidency and thus leaving him available for the Supreme Court. More important, the polls absolutely blew it. The public opinion polls of the time all leaned towards Dewey, so much so that the infamous "Dewey Defeats Truman" headline went to press before the results were final, only to be proven wrong.

How did the pollsters blow it? Quite simply, they took a biased sample. The usual telling is that they relied on telephone polls at a time when people who owned telephones were generally wealthier, and thus more likely to vote Republican. (Strictly speaking, that is the story of the 1936 Literary Digest poll, which drew on telephone directories and automobile registrations; the 1948 polls went wrong mainly through quota sampling and by ending their polling weeks before Election Day.) Since then, controlling for party affiliation (Republican, Democrat, Independent, Other) and other characteristics when drawing a sample from the whole population has made polling more reliable. Also, as the telephone became more commonplace (basically within the next decade), it became a more reliable source of data.
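To make the phone-ownership problem concrete, here is a rough sketch with made-up numbers. The group names, the respondent counts, and the population shares are all assumptions chosen for illustration, not historical figures. It compares the raw result of a poll that reaches mostly wealthier households with the result after reweighting each group to its share of the electorate (a simple form of post-stratification).

```python
# Hypothetical poll: counts and shares are illustrative assumptions only.
# respondents[group] = (people reached, of whom say they will vote Republican)
respondents = {
    "wealthier": (480, 312),      # over-sampled because more of them own phones
    "less wealthy": (420, 168),
}

# Assumed share of each group in the actual electorate.
population_share = {"wealthier": 0.30, "less wealthy": 0.70}


def raw_share(resp):
    """Republican share if every respondent simply counts once."""
    total = sum(n for n, _ in resp.values())
    republican = sum(rep for _, rep in resp.values())
    return republican / total


def poststratified_share(resp, shares):
    """Weight each group's Republican rate by its share of the electorate."""
    return sum(shares[group] * rep / n for group, (n, rep) in resp.items())


raw = raw_share(respondents)
weighted = poststratified_share(respondents, population_share)
print(f"raw: {raw:.3f}, reweighted: {weighted:.3f}")
```

With these invented numbers the raw poll shows a Republican majority while the reweighted estimate does not. The reweighting only helps if the phone owners within each group vote like the rest of that group, which is itself an assumption.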

Ever since Caller ID became popular, though, polling by phone has struck me as odd. Pollsters, like telemarketers, show up as "unavailable." My parents don't answer telemarketing or survey calls, so understandably more calls have to be made to fill a sample. The curious question, then, is whether people with Caller ID fall into any particular demographics. Further, whether Caller ID is used as a call-screening tool varies from person to person. My thoughts:

1) The elderly would have a greater representation in these polls because they are more likely to answer every phone call, not to mention that they are likely retired and at home.

2) Perhaps there is a correlation between intelligence and Caller ID use patterns; given that a call labeled "unavailable" is most likely some guy trying to sell you something, it's probably best not to answer such calls.

3) Polling organizations would put themselves in an interesting situation if they identified themselves as such on Caller ID. Given the high volume of calls that Caller ID screening makes necessary, picking up the phone and participating could in effect become a form of voluntary response. So we assume this does not happen, and the call is simply filed under "unavailable."

4) Since cell phone numbers are not in the phone book, those numbers will not be reached. This will essentially exclude college students as well as other young people who have cell phones but no land line phone.

5) Effects 1 and 4 may cancel each other out, depending on voter tendencies.

6) Since the internet is not a good place to conduct a poll, perhaps a gas station or a supermarket would be more suitable. Yet even this could be complicated. In Richmond, VA, the grocery store with the highest market share is closed on Sunday and does not sell alcohol. Polling would have to occur at several grocery store locations in order to get an accurate read of the population.

7) Major League Baseball's ratings have gone down because of at least one of the following: the advent of Caller ID (and baseball fans all use their Caller IDs to block annoyances), or Fox taking over broadcasting the postseason.
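Points 1, 4, and 5 can be sketched with a little arithmetic. The age groups, response rates, and candidate preferences below are invented for illustration; "candidate A" is a placeholder. The idea is that over-reaching the elderly and under-reaching the young only cancel if those two groups happen to vote alike.

```python
def true_share(groups):
    """Candidate A's share among everyone, regardless of who answers."""
    return sum(pop * pref for pop, _, pref in groups.values())


def expected_poll_share(groups):
    """Candidate A's expected share among those who actually answer."""
    answering = sum(pop * rate for pop, rate, _ in groups.values())
    for_a = sum(pop * rate * pref for pop, rate, pref in groups.values())
    return for_a / answering


# groups[name] = (population share, response rate, share preferring A)
# All values are hypothetical.
vote_alike = {
    "young": (0.20, 0.05, 0.60),    # mostly cell phones, rarely reached
    "middle": (0.50, 0.20, 0.45),
    "elderly": (0.30, 0.30, 0.60),  # at home and answers most calls
}

# Same reach, but now the elderly lean the other way from the young.
vote_apart = dict(vote_alike, elderly=(0.30, 0.30, 0.40))

for label, groups in [("alike", vote_alike), ("apart", vote_apart)]:
    bias = expected_poll_share(groups) - true_share(groups)
    print(f"young and elderly vote {label}: bias = {bias:+.3f}")
```

In the first scenario the two effects offset and the poll lands on the true share; in the second it understates candidate A by about three points. Which scenario is closer to reality is exactly the "depending on voter tendencies" caveat.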
