The Literary Digest conducted a survey for the 1936 Presidential election between Franklin D. Roosevelt and Alfred E. Landon. Over 3 million people responded to the poll, which predicted a Landon victory, 57% to Roosevelt's 43%. George Gallup conducted a much smaller poll of roughly 300,000 participants, and his poll predicted a win for Roosevelt.
Gallup was ridiculed for pitting such a small sample of people on the street against a magazine poll of three million participants, yet his prediction proved correct.
The flaw? The Literary Digest had polled by telephone, and in 1936 those who could afford a phone were mostly members of the upper classes, and primarily Republican.
As the example above shows, a poll can easily be skewed when participants are not selected carefully. Convenience samples - groups of people who happen to be close at hand - can be used, but with caution. Standing in front of a library, a sporting event, or a church will produce similar responses, because the people who frequent such places share common traits. Website polls are skewed because only certain people use the internet and frequent the particular website.
Random sampling is your best bet for a valid poll. If your intention is to gauge the opinion of the average person, using a sample to generalize to a larger population, then a random sample is your goal. This requires effort to avoid the systematic errors described above, either by finding neutral places to draw samples (phones are now near ubiquitous, but phone polls still systematically miss the homeless, for example), or by sampling from a fair distribution of potentially statistically slanted sources.
If you are collecting participants by phone, grab the phone book and select names at random. If you are approaching people on the street, make sure to frequent different streets and neighborhoods.
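Drawing names at random from a phone book can be sketched in a few lines. This is a minimal illustration, not a production sampling plan; the `phone_book` list is a hypothetical stand-in for a real directory.

```python
import random

# Hypothetical directory standing in for a real phone book.
phone_book = [f"Resident {i}" for i in range(1, 10001)]

random.seed(42)  # fixed seed only so the illustration is reproducible

# Draw 500 distinct names with equal probability, without replacement.
sample = random.sample(phone_book, 500)
```

Because `random.sample` draws without replacement, no household is contacted twice, and every listed name has the same chance of selection.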
The selection of a sample must mirror the demographic mix of the complete population whose opinion the sample is supposed to reflect.
Political opinion research that endeavors to predict election outcomes generally defines the complete population as those who vote. Various national polls cite samples such as "adults nationwide", "registered voters", or "likely voters". In this instance the only sample population that meets the criteria of a valid sample is "likely voters": most "adults nationwide" do not vote, and only approximately 50% of "registered voters" bother to do so.
Pollsters use a variety of methods to achieve a valid sample of "likely voters", some of them proprietary. Interview screening for "likely voters" can be problematic since there is a tendency for individuals to lie about their voting habits - they might not bother to vote, but they are hesitant to admit they aren't doing their civic duty. In some locales an individual's voting history (that is, if they voted in an election, not how they voted) is a public record. In this case an individual's response about their voting history can be verified by the public record.
Samples used to predict election outcomes must not just reflect the demographic characteristics of the population, but must also reflect the tendencies of voter turnout in the area sampled. That is, a sample can be geographically stratified by voter turnout. Turnout models take into account the history of voter turnout in the area subsets of a sample (many times at the precinct level). Voter turnout models are predictive, and in many cases, opinion research firms use proprietary models, hence the different poll results when various firms poll the same contest.
Contacting individuals in the sample for phone interviews can be problematic. In many instances commercial vendors can provide voter files with telephone matches. The match rate is generally around 60% (in some instances it is lower). Manually checking the phone numbers, while labor intensive, can increase the match rate. Still, a significant number of potential respondents are eliminated (and this injects additional sampling error) due to inadequate phone matches in a voter file database. In addition, certain demographic groups tend to have unlisted phone numbers. A new twist to this problem is the increasing number of households which do not have a land line telephone and rely instead on cellular telephone service.
Another method used to access households with unlisted telephone numbers is to utilize one plus dialing. That is, phone numbers in an exchange are selected randomly and then the final digit is increased by one. In this fashion unlisted phone numbers can be included in the sample. Unfortunately, this also includes business phone numbers and fax machines, as well as non-registered individuals and non-voters. Interview screening for "likely voters" is crucial here.
A political opinion research interview, usually done by phone, has several components.
The interviewer will call a household, depending on a demographic quota (for instance, male or female), and might ask for "the youngest male over the age of 18 who is present at this time".
The interviewer will then ask a series of questions to determine if the interviewee is a "likely voter." If that is not the case the phone call will be terminated.
Questions are asked in a fashion and order to facilitate completion of the interview and keep the interviewee from terminating the call.
Immediately after the screening questions the interviewer might ask the following question to gauge voter sentiment: "Generally speaking, do you feel that things here in this part of XXXXX are going in the right direction, or do you feel things are off on the wrong track?"
The first set of questions might ask about individuals active in politics to gauge their name recognition. The interviewer might state: "I am going to read you the names of a few people active in politics. I'd like you to rate your feelings toward each as either very positive, somewhat positive, neutral, somewhat negative, or very negative. If you don't know the name, just say so." These questions are rotated top to bottom and bottom to top for every interview so as not to inject "rating" bias.
The second set of questions might consist of issues rated on a Likert Scale (strongly agree, agree, neutral, disagree, strongly disagree), which can be used to gauge the intensity of voters' views on those issues. The series of issue questions is rotated top to bottom and bottom to top for every interview so as not to inject "question fatigue" bias.
A third set of questions might contain "head to head" or contest components, along the lines of "If the election for X were held today, would you vote for Candidate A or Candidate B?" The order of the candidates is rotated at every interview so as not to inject first-answer bias into the results.
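The rotation of candidate order can be sketched as a simple alternation keyed to the interview number. The function name and candidate labels here are illustrative, not part of any real polling system.

```python
def candidate_order(interview_index, candidates=("Candidate A", "Candidate B")):
    """Alternate the order in which candidates are read, interview by
    interview, so neither name always benefits from being heard first."""
    if interview_index % 2 == 0:
        return candidates
    return tuple(reversed(candidates))

# Even-numbered interviews hear A first, odd-numbered interviews hear B first.
```

Across many interviews this evens out any advantage a candidate gains from being named first.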
The final set of questions is usually demographic. Interviewees tend not to want to answer these early in an interview, but once they've invested the time and effort answering the earlier sets of questions they are usually more amenable to responding. This category can contain questions on party affiliation, ideology, household income, age, religious affiliation, and union affiliation.
An interviewee who hangs up or declines to be interviewed is termed a "refusal". Refusal rates above 50% are problematic for a survey. Interviewers strictly adhere to the interview script, but they also have assurance scripts to help prompt answers from reluctant interviewees.
Good interview scripts are designed to leave the respondents in the dark about who commissioned the survey. If interviewees know who a political poll is being done for, they tend to answer in the manner they believe the client wants. This would obviously inject additional error.
Margin of Error
The margin of error in a poll is expressed as a percentage and is 1.96 times the standard error of the estimated proportion. This corresponds to the 95% confidence level. That is, if the methodology is the same, the results will be within the margin of error 95 times out of 100.
Where n = sample size, the margin of error is 1.96 times the square root of the product of the percentages of answers A and B divided by the sample size.
1.96 x sqrt((A% x B%)/n)
For example - with a sample of 500 completed interviews, if candidate A is at 50% and candidate B is at 50% the formula is:
1.96 x (sqrt((50% x 50%)/500)) = 4.38% margin of error
Note the diminishing returns involved in larger samples. For instance, doubling the sample size (and doubling the number of interviews and work) does not halve the margin of error:
1.96 x (sqrt((50% x 50%)/1000)) = 3.10% margin of error (rounded up)
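The formula above can be written as a small function, which makes the diminishing returns easy to see. This is a minimal sketch of the textbook formula; the function name is my own.

```python
import math

def margin_of_error(pct_a, pct_b, n, z=1.96):
    """95% margin of error (as a percentage) for a two-way split.

    pct_a, pct_b: the two answers' shares as percentages (e.g. 50, 50).
    n: number of completed interviews.
    z: 1.96 corresponds to the 95% confidence level.
    """
    return z * math.sqrt((pct_a / 100) * (pct_b / 100) / n) * 100

moe_500 = margin_of_error(50, 50, 500)    # approximately 4.38
moe_1000 = margin_of_error(50, 50, 1000)  # approximately 3.10
```

Doubling the sample shrinks the margin of error only by a factor of the square root of two, matching the worked examples above.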
This is one reason why the large sample size of the Literary Digest poll did not guarantee greater accuracy, though the real flaw was in selecting a non-representative sample.
Likert scale responses can be expressed as ratios to indicate the salience and intensity of opinions on issues questions.
Strongly Agree = A, Agree = B, Neutral/Don't Know = C, Disagree = D, Strongly Disagree = E
A/B = Positive Intensity (Ratio)
A ratio of 1 indicates an equal number of respondents agree and strongly agree - above 1 indicates stronger opinions - below 1 indicates less strongly held opinions.
E/D = Negative Intensity (Ratio)
A ratio of 1 indicates an equal number of respondents disagree and strongly disagree - above 1 indicates stronger opinions - below 1 indicates less strongly held opinions.
(A + E)/(B + C + D) = Issue Intensity Ratio
A ratio of 1 indicates the extreme responses balance the less strongly held and neutral responses - below 1 indicates less intensity - above 1 indicates higher intensity.
(A + B + C)/(C + D + E) = Consensus Ratio
A ratio of 1 indicates a balance (of sorts) - above 1 indicates a tendency toward agreement with the statement - below 1 indicates a tendency toward disagreement with the statement.
A + B + D + E = General Salience
Expressed as a percentage of the respondents - a higher salience (up to 100) is an indication of higher general awareness of the issue - low salience is an indication of lower general awareness of an issue.
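The ratios above translate directly into code. This is a sketch of the definitions just given; the function name and the example counts are hypothetical.

```python
def likert_ratios(a, b, c, d, e):
    """Intensity and salience measures from Likert response counts.

    a..e: counts of Strongly Agree, Agree, Neutral/Don't Know,
    Disagree, and Strongly Disagree responses.
    """
    total = a + b + c + d + e
    return {
        "positive_intensity": a / b,                    # A/B
        "negative_intensity": e / d,                    # E/D
        "issue_intensity": (a + e) / (b + c + d),       # (A+E)/(B+C+D)
        "consensus": (a + b + c) / (c + d + e),         # (A+B+C)/(C+D+E)
        "salience_pct": 100 * (a + b + d + e) / total,  # A+B+D+E as % of all
    }

# Hypothetical survey of 390 respondents:
ratios = likert_ratios(120, 100, 50, 80, 40)
```

Here positive intensity exceeds 1 (more "strongly agree" than "agree"), while negative intensity is below 1, indicating the opposition is less firmly held.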
Political campaigns rarely release internal poll results. One of the tenets of political opinion research is that publicly announced results must be open to examination. Generally no competent campaign will release its polling results - it would be tantamount to handing the campaign's resources and strategy to the opposition. In cases when campaign poll results are released the actual data is usually fragmentary and has probably been leaked.
News organizations also commission political opinion research surveys. What one reads about in the newspapers or sees on the evening news are usually the "top numbers" - the superficial percentages which fit the horse race mentality of the media.
Campaign professionals are more interested in the crosstabulations.
Crosstabulations juxtapose the results of two or more survey questions into a format that allows for detailed analysis of the results and for targeting of demographic groups. For instance, a crosstabulation splitting out the responses of male and female respondents on an issue question might reveal a "gender gap" or even, with the Likert Scale, reveal differences in intensity on an issue. Then, if a demographic group with a significant number of undecided voters has a distinct opinion on an issue which is compatible with the candidate's, the campaign will target that group.
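A crosstabulation is, at bottom, a tally over pairs of answers. This minimal sketch uses made-up responses to show how a gender-by-choice table is built; real analyses use far larger samples and dedicated statistical tools.

```python
from collections import Counter

# Hypothetical responses: (gender, answer to a head-to-head question).
responses = [
    ("F", "Candidate A"), ("F", "Candidate A"), ("F", "Undecided"),
    ("M", "Candidate B"), ("M", "Candidate A"), ("M", "Candidate B"),
]

# Each (gender, answer) pair becomes one cell of the crosstab.
crosstab = Counter(responses)
```

Reading the cells side by side (female vs. male support for each candidate) is what surfaces patterns such as a gender gap.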
Truisms and Myths
The general public can have misconceptions about the methodologies, uses, practicality, and fairness of political opinion research. This is due in part to the "horse race" mentality of the media. Nevertheless, there are a number of truisms and myths associated with polls.
- "Undecideds" tend to break to the challenger, especially as the day of the election approaches.
- A poll is a snapshot of opinion in time, not an indication of opinions set in stone.
- Publicly released political poll data generally comes from media sources or independent organizations which have commissioned them, not from campaigns.
- The only poll that matters is on the day of the election (if that's the only poll a national campaign plans on using, they're either liars or fools or both).
- The current administration doesn't rely on or use polls.