Abstract

Although survey research is a young field relative to many scientific domains, it has already experienced three distinct stages of development. In the first era (1930–1960), the founders of the field invented the basic components of the design of data collection and the tools to produce the statistical information from surveys. As they were inventing the method, they were also building the institutions that conduct surveys in the private, academic, and government sectors. The second era (1960–1990) witnessed a vast growth in the use of the survey method. This growth was aided by the needs of the U.S. federal government to monitor the effects of investments in human and physical infrastructure, the growth of the quantitative social sciences, and the use of quantitative information to study consumer behaviors. The third era (1990 and forward) witnessed declines in survey participation rates, the growth of alternative modes of data collection, the weakening of sampling frames, and the growth of continuously produced process data from digital systems in all sectors, but especially those emanating from the Internet. Throughout each era, survey research methods adapted to changes in society and exploited new technologies when they proved valuable to the field.

Survey research is a relatively young field. Although the systematic observation of social and economic phenomena has roots deep in history, it was only in the 1940s that we saw the organization of sampling frames with universal coverage, probability sampling, structured questions, and statistical inference to finite populations with measurable sampling error.

However, at this moment in survey research, uncertainty reigns. Participation rates in household surveys are declining throughout the developed world. Surveys seeking high response rates are experiencing crippling cost inflation. Traditional sampling frames that have been serviceable for decades are fraying at the edges. Alternative sources of statistical information, from volunteer data collections, administrative records, and Internet-based data, are propelled by new technology.

To be candid, these issues are not new to 2011, but have been building over the past 30 years or so. However, it is not uncommon to have thoughtful survey researchers discussing what lies in the future and whether key components of the basic paradigm of sample surveys might be subject to rethinking.

Ironically, this introspection arises at a time of unprecedented use of empirical information to guide important decisions in business, government, and the natural and social sciences. If there was a “war” between hunch and “gut feeling” on the one hand, and statistics, counts, and quantitative estimates on the other, the war was won by statistics. Sample surveys produce a good portion of those numbers. It seems that the demand for survey statistics has never been higher.

So, whither goest survey research? Perhaps a review of the past may be helpful. This essay offers comments on three distinct modern eras in the field, with the hopes that it might illuminate one or more ways forward. I claim no more than a personal viewpoint, and I suspect that my fellow survey professionals may have different takes on these issues. My remarks admittedly have in mind surveys of the U.S. household population, not those of businesses, employees, or members of an organization.

It is clear that Public Opinion Quarterly’s history is an accurate chronicle of this past; in some sense, these are my thoughts on what will likely appear in the journal's pages in the mid-term future.

1930–1960—The Era of Invention

Sampling statisticians view the 1930–1940 period as the practical start of their profession. Neyman's article in 1934 convincingly presented evidence that probability sampling offered bias-free estimates and measurable sampling errors. The early founders of that field in the United States told of the excitement of young statisticians in Ames, Iowa, and Washington, DC, studying his paper, bringing Neyman over to visit, and teaching each other its implications. There is even an oft-repeated story of their using the meeting table in the U.S. Secretary of Agriculture's office (then a department that was a hotbed of survey development) to meet after the workday to discuss new developments.

The sampling frames used for surveys in this era were often area-based. Pieces of land using civil jurisdictions like counties and census-based geographies like enumeration districts were the sampling units in stratified, multistage samples. The initial frame thus offered complete (theoretical) coverage if every person could be associated uniquely to one and only one piece of land. The residence of the person (with linking rules that attempted to define a single residence for each person) was used to assign people to geography. Alternative frames in parts of the United States were so-called city directories, produced by the private sector and paid for by businesses through advertising in the directory. Finally, some surveys used telephone directories or membership/subscriber lists, but these were limited to special populations or efforts that often abandoned both the desire for a universal frame and probability sampling.

As Converse (2009) notes, there were other developments that paralleled the statistical ones—those on the measurement side. Three streams of inquiry of the American public were gradually coming together. First, building on the “man in the street” interviews, journalists were discovering the power of supplementing individual quotes with percentages of many people who held a viewpoint on an issue. Asking the same question of many seemed important. Second, Gallup, Roper, and Lazarsfeld realized the power of measuring attitudes. The movement from qualitative unstructured interviews to structured questions fit both the era of mass assembly and measurement developments in psychology. Likert's 1929 dissertation noted that practices in intelligence measurement could be applied to survey interviews, and the five-point scale was justified. The structured interview was never really a settled issue, and debates about the open versus closed forms of questions flourished. Third, the U.S. government desperately needed information about the economy and then later the status of a society in wartime.

The data collection modes were dominantly face-to-face interviewing and mailed questionnaires, with telephone surveys entering the private sector companies toward the end of the era. The interviewer labor force tended to be female. Early interviewers told stories of the care they exercised in traveling to remote sampling areas, making sure to visit a local minister's home to seek a place to stay, thus assuring the locals that the interviewer was not a “working woman” of a type undesired by the local culture.

On the nonresponse side of sample surveys, these were truly the glory years. Although there was concern about gaining the participation of reluctant respondents, few of the major problems we face now with survey participation existed. With response rates often over 90 percent, there was more concern with contact rates than refusal rates. Founders of the field told stories of the first decision to compute response rates as a tool to record how completely the sample was measured. Resistance to the added work of the new tally came in the form of “What would you do with the number after you have it?” (personal communication, Leslie Kish, circa 1980).
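The distinction drawn above between contact rates and refusal rates can be illustrated with a small calculation. This is a deliberately simplified sketch: the function name, the three outcome categories, and the counts are assumptions made for illustration, and the formulas ignore complications such as cases of unknown eligibility and partial interviews that formal outcome-rate definitions address.

```python
def outcome_rates(interviews, refusals, noncontacts):
    """Simplified outcome rates for a sample of eligible cases.

    Assumes every sampled case is eligible and ends in exactly one of
    three outcomes: completed interview, refusal, or never contacted.
    """
    eligible = interviews + refusals + noncontacts
    if eligible == 0:
        raise ValueError("sample has no cases")
    return {
        # Share of the sample that was actually measured.
        "response_rate": interviews / eligible,
        # Share of the sample that was reached at all.
        "contact_rate": (interviews + refusals) / eligible,
        # Share of the sample that was reached but declined.
        "refusal_rate": refusals / eligible,
    }


# Hypothetical counts for a survey with a response rate above 90 percent,
# where most of the shortfall comes from noncontact rather than refusal.
rates = outcome_rates(interviews=920, refusals=20, noncontacts=60)
```

With counts like these, raising the response rate depends far more on reaching the uncontacted cases than on persuading the few who refuse, which is the situation the text describes for this era.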

With limited sampling frames and the dominant mode of data collection requiring face-to-face presence, the number of surveys was small (relative to today). The novelty of the event may have produced a beneficial curiosity on the part of selected persons, an increased perceived value based on its scarcity.

It was during this era that all of the basic tools we use as a field were invented. Many of the inventors have passed on, but my memory is that they shared an attribute of creativity and quick thinking, a pragmatism that eschewed theory when it did not solve practical problems. They were high-energy folks, broad thinkers, believing that they were creating a tool for the betterment of society. I note that Gallup and Rae's early book, The Pulse of Democracy, made the case that the survey was a key tool for hearing the voice of the people (in 1940). At about the same time, in a U.S. government publication, Likert noted that one of the risks of the increased size of the New Deal government was that centralized bureaucrats would lose touch with the people—an unusually ideological statement in an official publication (1940). His solution, presented after a few pages of ideology, was the sample survey, destined to be an efficient way to keep the bureaucrats aware of the wishes of the populace.

The sample survey also fit quite well with the rise of the consumer and service sectors. The need for a feedback loop from products and services to maximize profit became an accepted ethic. Sample surveys filled the need.

The organizational leadership of the field consisted of the intellectual inventors of the field. Scientists who created the methods and those who used the methods to study American society were making the key decisions about the development of their organizations and the field itself.

1960–1990—The Era of Expansion

This era saw the ubiquitous dissemination of the telephone as a communication medium—first in the homes of the rich, soon to the vast majority of urban areas. This technology offered sampling frames, provided by the then-monopoly national telephone company. One frame consisted of the list of the first six digits of all telephone numbers, to which sampling statisticians appended four-digit random numbers to create a probability sample of telephone numbers. Another frame consisted of listed telephone numbers in directories.
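The first of these frames lends itself to a short sketch of how a probability sample of telephone numbers could be generated. The function name `rdd_sample`, the use of Python's `random` module, and the two six-digit prefixes are assumptions for illustration only; the sketch shows the basic idea of appending random four-digit suffixes, not any organization's production procedure.

```python
import random


def rdd_sample(prefixes, n, seed=None):
    """Draw n distinct 10-digit numbers from a frame of six-digit prefixes.

    Each prefix stands for the 10,000 possible numbers that begin with it.
    A prefix is chosen at random, a random four-digit suffix is appended,
    and duplicates are rejected, so every number in the frame has the same
    chance of selection.
    """
    prefixes = list(dict.fromkeys(prefixes))  # drop repeated prefixes
    for prefix in prefixes:
        if len(prefix) != 6 or not prefix.isdigit():
            raise ValueError(f"not a six-digit prefix: {prefix!r}")
    if n > len(prefixes) * 10_000:
        raise ValueError("sample size exceeds the number of possible numbers")

    rng = random.Random(seed)
    selected = set()
    while len(selected) < n:
        prefix = rng.choice(prefixes)
        suffix = rng.randrange(10_000)  # 0 through 9999
        selected.add(f"{prefix}{suffix:04d}")
    return sorted(selected)


# Hypothetical prefixes (area code plus exchange), made up for the example.
sample_numbers = rdd_sample(["734555", "515555"], n=5, seed=1)
```

Many numbers generated this way are not working household numbers, so they must be dialed and screened before interviewing can begin.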

In the first era of the survey field, the impacts of technology change were small relative to those in the years that constitute the second era. Although the first computer was used in the 1950 U.S. decennial census, it was the 1960s that saw its near ubiquitous use in surveys. Technology entered surveys at the “back end,” the processing of the individual data collected from questionnaires. Holes punched in rigid cards denoting a unique answer to a question were readable by various mechanical devices over the years, and such machines produced the tables of numbers that were generated by surveys. By the mid-1960s, survey researchers were routinely using computers to do the statistical computations.

By the late 1960s, the use of computers had migrated to an earlier step, that of data collection, as computers connected to data entry devices were used to present the questions on a monitor and receive the answers as entered by telephone interviewers, a practice known as Computer Assisted Telephone Interviewing (CATI). The developments took place not in the academic or government sector, but in the private sector (Freeman 1983). It was that sector that moved to telephone interviewing for one-time surveys faster than the scientific or government sector, which continued to have concerns about the coverage of the household population by telephone frames.

The 1960s saw a large expansion of federal government funding of social programs, benefiting the social sciences. With the rise of research funding in the social sciences, more and more sample surveys took place. Survey research centers grew throughout the country's universities. The University of Michigan used the Detroit Area Study, an annual sample survey of Detroit households, to teach many cohorts of graduate students the rudiments of the method. Other campuses launched similar training. Because the capital investment to conduct telephone surveys was low, many small companies and academic centers grew, almost all of them adopting some CATI software, on networked personal computers toward the end of this era.

Government surveys flourished during this time, permitting the growth of the federal contract sector of surveys (e.g., Westat, Research Triangle Institute). Private sector surveys grew also, with increasing linkage between customer survey statistics and management action. The government and academic sector (and through the Roper Center, the public opinion survey sector) made microdata available for secondary analysts. This archiving function permitted the rise of quantitative methods in many social science disciplines.

Sample design remained an important function in large survey organizations in the government and academic sector, but the rise of the telephone survey reduced its role in many smaller organizations. Sample designs morphed from stratified element samples, to cluster-based sampling methods (Waksberg 1978), and eventually to combinations of listed and unlisted number designs. Private sector firms emerged to sell telephone samples to survey organizations, reducing the need for on-site sampling statisticians within the survey organization.

On the measurement side, this was the era when cognitive psychological theories of comprehension, memory, and processing were applied to question wording and questionnaire construction (Sudman, Bradburn, and Schwarz 1996). This work unlocked some of the mysteries of why small wording changes could affect responses, under what circumstances the order of questions altered respondent behavior, and how conversational norms played a role in survey measurement error.

This era also saw the increasing concern about response rates in household surveys, both within the profession and in public media (Dougherty 1973; Rothenberg 1990). The telephone mode brought with it a set of norms that permitted respondents to easily terminate the interview (by “hanging up” the phone). Hence, the rate of “partial” interviews rose, with attendant concerns about missing data on individual questions, and norms shifted to shorter instruments on the telephone relative to face-to-face interviews.

Increasing nonresponse rates were not handled well in the framework of direct estimates (e.g., means and totals incorporating case weights) and classical probability sampling. The classical sampling texts used throughout this era (Cochran 1984; Kish 1965; Hansen, Hurwitz, and Madow 1953) suggest post-stratification weighting of cases as an adjustment. With such weights, meeting the theoretical requirements for the elimination of nonresponse bias was clearly an unmeasurable hope on the part of the analyst. Hence, this era saw increasing debate about classical survey estimation (Smith 1976). Most came to agree that other approaches were better suited to errors of coverage of the sampling frame or nonresponse. Formal statistical models increasingly became a focus. The seeds of this shift were planted (Rubin 1987) at the end of this era.
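The post-stratification adjustment referred to above can be sketched in a few lines. The function names, the two age cells, and all counts are assumptions made for illustration. The sketch assumes known population counts for each cell and removes nonresponse bias only if respondents and nonrespondents within a cell are alike on the survey variable, a condition the analyst usually cannot verify.

```python
def poststratification_weights(respondent_counts, population_counts):
    """Weight per respondent in each cell: population count / respondent count."""
    weights = {}
    for cell, n_respondents in respondent_counts.items():
        if n_respondents == 0:
            raise ValueError(f"no respondents in cell {cell!r}")
        weights[cell] = population_counts[cell] / n_respondents
    return weights


def weighted_mean(records, weights):
    """Weighted mean of y over (cell, y) records, using each cell's weight."""
    total_weight = sum(weights[cell] for cell, _ in records)
    return sum(weights[cell] * y for cell, y in records) / total_weight


# Hypothetical population of 1,000 split evenly between two cells, with
# the "young" cell underrepresented among the six respondents.
population = {"young": 500, "old": 500}
records = [("young", 1), ("young", 1),
           ("old", 0), ("old", 0), ("old", 0), ("old", 1)]
counts = {"young": 2, "old": 4}

weights = poststratification_weights(counts, population)
unweighted = sum(y for _, y in records) / len(records)
adjusted = weighted_mean(records, weights)
```

In this made-up case the unweighted mean is 0.5, while the adjusted mean is 0.625, because the underrepresented cell is weighted back up to its population share.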

These years saw an even larger breach between government and academic surveys on the one hand, and private sector surveys on the other. Lower propensities to respond in the population led government and academic researchers to increase efforts to contact and persuade sample persons. Their costs began to skyrocket as a simple function of their response rates. Private sector surveys, except for those in the regulated media measurement domain, turned increasingly to placing quotas on sociodemographic groups followed by post-collection weighting.

The founders of the field entered their retirement years during this period. The new leadership was heterogeneous in its focus. In the academic sector, leadership sometimes moved from the scientists who invented the method to those who were more interested in the analysis of survey data. Small academic survey centers were increasingly led by non-faculty members. In this era, the founders of private sector survey companies were often replaced first by protégés and then by those more interested in growing the company than advancing the method. Management as a skill seemed to be increasingly valued in the government survey sector, as well.

1990 to the Present—“Designed Data” Supplemented by “Organic Data”

Walled subdivisions, locked apartment buildings, telephone answering machines, telephone caller ID, and a host of other access impediments for survey researchers grew in this era. Response rates continued to deteriorate. Those household surveys devoted to high response rates experienced continuous inflation of costs due to increased effort to contact and interview the public. Face-to-face interviews continued to decline in volume, often limited to the first wave of longitudinal surveys.

The traditional telephone survey frames declined in coverage because of the rise of mobile phone numbers; the geographical location of a person became less well predicted by his/her telephone, as portability of numbers across areas was deregulated. Thus, local area studies on the phone faced new coverage error problems using such frames. Data assemblers in the private sector offered address lists, partially based on the U.S. Postal Service lists. These permitted sample designs using area frames to reduce costs from listing of addresses at the last stage of sampling. They are also increasingly used in telephone sampling as auxiliary frames.

Technology offered new communication media (e.g., mobile phones and the Internet), which increasingly changed the day-to-day lives of Americans. In addition to sampling frame issues, the mobile phones appeared to be linked to different user behavior. Caller-identification features of mobile phones allowed people to screen out calls effortlessly. Since the mobile phone also lent itself to short but frequent interactions with others, longer telephone interviews appeared to be inappropriate to the medium. Mobile telephone response rates appeared to be lower than line telephone response rates (AAPOR 2008).

The rise of the Internet re-energized research in self-administered instruments, with renewed vigor in studies of formatting and visual presentation (Redline and Dillman 2002; Couper 2008). Because Web use was not universal, Web surveys were often combined with other modes in multi-mode survey designs. Mailed correspondence with URLs for Web response was sent to a sample from an area frame or address frames. Simultaneously, the assignment of phone numbers, especially mobile phone numbers, meant that the efficiency of using sampling frames based on the first six digits of the telephone number declined (Tucker, Lepkowski, and Piercarski 2002). Mobile phone numbers were most often linked to individual people, not to full households, as with line phones. Regulations preventing the use of machine-directed dialing of mobile numbers forced rethinking of how interviewers accessed sample telephone numbers.

The Internet offers very low per-respondent costs relative to other modes; it offers the same within-instrument consistency checks that CATI and CAPI offer; it offers the promise of questions enhanced with video content; and it offers very, very fast turnaround of data records. When timeliness and cost advantages are so clear, the problems of the absence of a sampling frame are ignored by those parts of the profession whose users demand fast, cheap statistics.

It is unsurprising, therefore, that volunteer Internet panels arose as a tool. If clients want to maximize the number of data records per dollar under heavy time constraints, these designs are attractive. While a data set from a volunteer Internet panel may resemble that from the same questionnaire on a probability sample, and selection quotas assure balance on chosen sample attributes, results of the two methods often prove incomparable (Yeager et al. 2011). Relying on volunteering to generate a statistical microcosm of a large diverse population ignores all of the lessons of the first era of survey research. Such designs work well until they don't; there is little theory undergirding their key features.

The Internet and the technologies producing large databases, in addition, have an impact on data about the American public. We're entering a world where data will be the cheapest commodity around, simply because society has created systems that automatically track transactions of all sorts. For example, Internet search engines build data sets with every entry; Twitter generates tweet data continuously; traffic cameras digitally count cars; scanners record purchases; radio frequency identification (RFID) tags feed databases on the movement of packages and equipment; and Internet sites capture and store mouse clicks. Collectively, society is assembling data on massive amounts of its behaviors. Indeed, if you think of these processes as an ecosystem, the ecosystem is self-measuring in increasingly broad scope. We might label these data as “organic,” a now-natural feature of this ecosystem.

However, information is produced from data by users. Data streams have no meaning until they are used. The user finds meaning in data by bringing questions to the data and finding their answers in the data. An old quip notes that a thousand monkeys at typewriters will eventually produce the complete works of Shakespeare. (For younger readers, typewriters were early word-processing hardware.) The monkeys produce “data” with every keystroke. Only we, as “users,” identify the Shakespearean content. Data without a user are merely the jumbled-together shadows of a past reality.

For decades, the survey profession created “designed data” in contrast to “organic data.” The questions we ask of households create data with a pre-specified purpose, with a use in mind. Indeed, designed data through surveys and censuses are often created by the users. This means that the ratio of information to data (for those uses) is very high, relative to much organic data. Direct estimates are made from each data item—no need to search for a Shakespearean sonnet within the masses of data. However, with “designed data” operations increasingly being rejected by sample persons, the obvious question is how survey-based designed data might be useful in this new, organic-data-rich world.

What has changed in the current era is that the volume of organic data produced as auxiliary to the Internet and other systems now swamps the volume of designed data. In 2004, the monthly traffic on the Internet exceeded 1 exabyte or 1 billion gigabytes. The risk of confusing data with information has grown exponentially. We must collectively figure out the role of organic data in extracting useful information about society. Hence, we see developments like “Google Flu” (Dukic, Lopes, and Polson 2009), which tries to predict the course of flu epidemics. We see the Google price index (Varian and Choi 2009) or the Billion Prices Project at MIT (bpp.mit.edu), both of which scrape price data from Internet sales sites to measure price inflation. Such data have near zero marginal cost in some cases and are always very timely relative to the events they measure.

Looking Back in Order to See Forward

Reviewing this history in this manner yields conclusions and speculations. The twin trends of a) falling response rates producing b) higher costs of data collection in surveys are unsustainable. The new modes of data collection (e.g., mobile phones, Internet surveys) appear to offer substitute methods of responding among the cooperative rather than strong appeals to those who would reject the old modes of responding. Although there appears to be a broad consensus among survey methodologists that we are moving to a future of mixed-mode surveys, the current available mix does not solve the problem of falling response rates in a permanent way.

The challenge to the survey profession is to discover how to combine designed data with organic data, to produce resources with the most efficient information-to-data ratio. This means we need to learn how surveys can be designed to incorporate transaction data continuously produced by the Internet and other systems in useful ways. Combining data sources to produce new information not contained in any single source is the future. I suspect that the biggest payoff will lie in new combinations of designed data and organic data, not in one type alone.

To continue the monkey-typewriter metaphor, the Internet and other computer data systems are like typewriters that have an unknown set of keys disabled. Some keys are missing, but we don't know which ones are missing. They're not capturing all behaviors in the society, just some. The Shakespearean library may or may not be the result of the monkeys pounding on the keys. In contrast to the beauty of the Bard's words, we may find only pedestrian jingles and conclude that that's as good as it gets. We need designed data for the missing keys; then we need to piece them together with the masses of organic data from the present keys.

The ingredients of this new era are clear in all sectors of the survey profession. Many are engaged in augmenting data records from surveys with existing records on the same persons. These efforts include supplementing health survey data with Medicare records or Social Security records, with permission of the respondents; adding customer purchase records to satisfaction survey data; and appending longitudinal records of voting participation to pre-election survey responses. Some are engaged in combining data sources statistically, without the benefit of a micro-link among records. These efforts commonly combine a survey rich in variables but small in sample size with one that is rich in sample size but lean in variables (Schenker, Raghunathan, and Bondarenko 2010). Others are engaged in gathering data on ecological units in which sample persons live, to expand the data set. Still others are designing paradata observations to add to sample records (both respondents and nonrespondents).

A still uncharted challenge of this age, relative to the first two eras, is that a much larger portion of the digital data on persons is held as proprietary by private sector companies. Whereas the survey profession has developed a set of norms to share anonymized data for secondary analysts, most of the personal data assembled by private concerns (from drivers’ licenses, credit reports, subscriber and member data) are sold as a product for secondary commercial uses. Although Internet website content can be scraped from Web pages by anyone, search data are treated differently. Economic transaction data are held privately. Thus, although we may be living in a time where there are more digital data stored on each of us, ironically they may come to be used not to understand society (see Savage and Burrows 2007 for a parallel commentary), but instead to maximize benefits only to those who can purchase them. This possibility would no doubt trouble both Gallup and Likert.

Summary

The statistical and conceptual framework that led to the enormous growth of the survey research field depended on various societal prerequisites. First, the public had to be accessible to survey contact attempts by unknown researchers. This implied that anyone in the society had to have ways to contact all persons in the society. Strangers could not be immediately feared; evidence of their affiliation had to be believed. Second, once contacted, the public had to be willing to be measured. The public had to generally value the common-good nature of statistical information or, at least, the social interaction inherent in the measurement. Third, the survey researcher had to possess universal frames for the population studied. Here, the three eras have shown a robust use of the address of the residence as a sampling frame element, albeit with changing sources of data. Fourth, the cost and timeliness of the original sample surveys had to rival those of other data sources. Now, however, the relative cost advantages of de novo measurement of a probability sample relative to organic data on a full population have radically changed. These new data sources are automatic auxiliaries to everyday behaviors; they are produced at very low cost.

Survey research is not dying; it is changing. The self-report sample survey provides insights into the thoughts, aspirations, and behaviors of large populations in ways that data tracking naturally occurring behaviors are unlikely ever to capture. The survey method has strengths and deficits that are reflections of the society that it measures; the very act of speaking candidly to a stranger is governed by norms that can and do change. Survey research has always adapted, and must always adapt, to those changes.

References

AAPOR Cell Phone Task Force. 2008. American Association for Public Opinion Research, http://www.aapor.org/uploads/Final_AAPOR_Cell_Phone_TF_report_041208.pdf
Cochran, William G. 1984. Sampling Techniques. 3rd ed. New York: Wiley and Sons.
Converse, Jean. 2009. Survey Research in the United States: Roots and Emergence, 1890–1960. Piscataway, NJ: Transaction Publishers.
Couper, Mick. 2008. Designing Effective Web Surveys. New York: Cambridge University Press.
Dougherty, Philip H. 1973. “Advertising: Research Problems; Pritzker in Meeting with McCall's Chief's New Fashion Magazine Joy from Johnnie Walker.” New York Times, October 9, p. 75.
Dukic, Vanja M., Hedibert F. Lopes, and Nick Polson. 2009. “Tracking Flu Epidemics Using Google Flu Trends and Particle Learning.”
Freeman, Howard. 1983. “Special Issue on the Emergence of Computer-Assisted Survey Research.” Sociological Methods and Research 12(2).
Gallup, George, and Saul Rae. 1940. The Pulse of Democracy. New York: Simon and Schuster.
Hansen, Morris, William Hurwitz, and William Madow. 1953. Sample Survey Methods and Theory (two volumes). New York: Wiley and Sons.
Kish, Leslie. 1965. Survey Sampling. New York: Wiley and Sons.
Likert, Rensis. 1929. A Technique for the Measurement of Attitudes. Ph.D. dissertation. New York: Columbia University.
Likert, Rensis. 1940. “Democracy in Agriculture—Why and How?” Farmers in a Changing World: The Yearbook of Agriculture, 1940. Washington, DC: U.S. Superintendent of Documents (pg. 994–1002).
Neyman, Jerzy. 1934. “On the Two Different Aspects of the Representative Method: The Method of Stratified Sampling and the Method of Purposive Selection.” Journal of the Royal Statistical Society 97(4) (pg. 558–625).
Redline, Cleo, and Don Dillman. 2002. “The Impact of Alternative Visual Designs on Respondents’ Performances with Branching Instructions in Self-Administered Questionnaires.” In Survey Nonresponse, edited by Robert Groves, Don Dillman, John Eltinge, and Roderick Little. New York: Wiley and Sons (pg. 179–193).
Rothenberg, Randall. 1990. “Surveys Proliferate, but Responses Dwindle.” New York Times, October 5.
Rubin, Donald. 1987. Multiple Imputation for Nonresponse in Surveys. New York: Wiley and Sons.
Savage, Mike, and Roger Burrows. 2007. “The Coming Crisis of Empirical Sociology.” Sociology 41(5) (pg. 885–99).
Schenker, Nathaniel, Trivellore Raghunathan, and Irina Bondarenko. 2010. “Improving on Analyses of Self-Reported Data in a Large-Scale Health Survey by Using Information from an Examination-Based Survey.” Statistics in Medicine 29(5) (pg. 533–45).
Smith, T. M. F. 1976. “The Foundations of Survey Sampling: A Review.” Journal of the Royal Statistical Society, A 139 (pg. 183–204).
Sudman, Seymour, Norman Bradburn, and Norbert Schwarz. 1996. Thinking About Answers: The Application of Cognitive Process to Survey Methodology. San Francisco: Jossey-Bass.
Tucker, Clyde, James Lepkowski, and Linda Piercarski. 2002. “The Current Efficiency of List-Assisted Telephone Sampling Designs.” Public Opinion Quarterly 66(3) (pg. 321–38).
Varian, Hal R., and Hyunyoung Choi. 2009. “Predicting the Present with Google Trends.”
Waksberg, Joseph. 1978. “Sampling Methods for Random Digit Dialing.” Journal of the American Statistical Association 73(361) (pg. 40–46).
Yeager, David S., Jon A. Krosnick, LinChiat Chang, Harold S. Javitz, Matthew S. Levendusky, Alberto Simpser, and Rui Wang. 2011. “Comparing the Accuracy of RDD Telephone Surveys and Internet Surveys Conducted with Probability and Non-Probability Samples.” Public Opinion Quarterly 75 (pg. 709–47).