Stephanie Coffey, Olga Maslovskaya, Cameron McPhee, Recent Innovations and Advances in Mixed-Mode Surveys, Journal of Survey Statistics and Methodology, Volume 12, Issue 3, June 2024, Pages 507–531, https://doi.org/10.1093/jssam/smae025
Abstract
The use of mixed-mode surveys has grown rapidly in recent years, due to both technological advances and the COVID-19 pandemic. The increased mixing of modes (and the adoption of newer digital modes like web and text messaging) necessitates an evaluation of the impact of these newer designs on survey errors and costs, as well as new techniques for disaggregating and adjusting for nonresponse and measurement errors. This special issue highlights recent innovations, applications, and evaluations of mixed-mode survey designs and identifies areas where additional research is required.
Many high-quality social, establishment, and other surveys around the world have adopted mixed-mode survey designs as a measure to address rising costs. Additionally, the COVID-19 pandemic forced many organizations to adopt different innovations in mixed-mode surveys to continue data collection when face-to-face interviewing was not possible. This paper summarizes assessments, methodological evaluations, and statistical developments of recent mixed-mode survey innovations to assess whether these innovations have a postpandemic future and, if so, in which contexts.
1. INTRODUCTION
Survey research has undergone significant shifts over the last two decades. One of the most notable shifts has been in the mode of data collection. In March 2020, the coronavirus disease 2019 (COVID-19) pandemic effectively halted all in-person survey data collection (Mead et al. 2020; Census Bureau 2020; Census Bureau 2021; Maslovskaya et al. 2022) and stopped other operations requiring centralized space such as telephone and mail centers. Many survey organizations were forced to pivot to new modes so that survey data collection could continue during the pandemic, which spanned most of 2020 and 2021, though the resumption of field activities was highly variable (Census Bureau 2021; Assistant Secretary of Planning and Evaluation [ASPE] 2021). Organizations sometimes made radical departures from their typical data collection operations. Examples of these changes included: the use of short message service (SMS) or text messaging, both as a mode of contact as well as a mode of data collection (Andreadis 2020; Soszynski and Bliss 2023); the use of alternative contact protocols, such as knock-to-nudge, in which face-to-face interviewers attempted to persuade sample members to take part in a later online or telephone interview (Kastberg and Siegler 2022); and the use of off-the-shelf video conferencing software such as Zoom, Teams, Skype, or in-house developed software to conduct a survey interview remotely after recruiting a household through more standard protocols such as mail, web, or phone (Sanchez et al. 2022; Sanchez and Spencer 2023; Hanson et al. 2023; Endres et al. 2023).
Yet, this shift in March 2020 reflected a change in survey modes that had begun much earlier, as detailed in the 2019 AAPOR Task Force Report on Transitions from Telephone Surveys to Self-Administered and Mixed-Mode Surveys (Olson et al. 2019). Rapid changes in technology and the survey environment, including increasing rates of internet use, mobile ownership, and cell phone-only households (Blumberg and Luke 2022; The Office of Communications [OFCOM] of the United Kingdom 2022); increasing comfort with online methods for interaction; continuously declining contact and response rates in the telephone mode (Daikeler et al. 2020; Kastberg and Siegler 2022; Krieger et al. 2023); and increasing refusal rates (Luiten et al. 2020), all catalyzed a shift to using multiple modes of data collection in household surveys. As a few years have passed since Olson et al. (2019), an updated assessment of whether and how changes to data collection modes affect survey errors and costs is needed, focusing on contemporary surveys in the context where internet access is almost universal, and the mix of modes almost always includes a self-administered online mode of data collection.
With this backdrop, the special issue on Innovations in Mixed-Mode Surveys was announced in December 2022. We leverage the Total Survey Error framework (Groves 1989) to consider the contribution of each paper in this issue, as well as other contemporaneous research, to the understanding of how mixed-mode surveys relate to selection and measurement errors specifically, as well as how cost drives the use of multiple modes. Each paper in this special issue, as well as recent evidence, helps us answer the following research questions:
What have we learned as a field about mixed-mode surveys’ impacts on sources of selection error, measurement error, adjustment methods, and cost in the last five years?
What areas of inquiry still need to be addressed and investigated?
This issue presents 13 important contributions, covering key topic areas in mixed-mode surveys. Details of all papers in this special issue are found in table 1.
Table 1.

| Paper titles and authors | Surveys employed | Samples and frames | Target populations | Modes included |
|---|---|---|---|---|
| *Papers focused on tailoring and mixing of modes or contact strategies, with a focus on improving respondent sample composition* | | | | |
| | The German Family Demography Panel Study (FReDA) | Two-stage address-based sample (ABS): stratified sample of municipalities; addresses sampled from selected municipalities | Adults 18–49 in Germany | Web and mail self-administered (SA) modes; different mode sequences |
| | German General Social Survey (ALLBUS) | Multistage cluster sample from population registry with oversample of East German households | Adults 18+ in Germany living in private households | Web and mail SA modes; concurrent versus sequential designs |
| | American Family Health Survey (AFHS) | Two-phase ABS: sample of addresses for screener; random selection of eligible household respondent | Households with adults 18–49 in the United States; adults 18–49 | Web and mail SA modes; concurrent versus sequential designs, tailored contact materials |
| | National Training, Education, and Workforce Survey (NTEWS) | Stratified sample of American Community Survey (ACS) respondents (all eligible members of responding housing units and group quarter respondents) | Noninstitutionalized individuals aged 16–75 who are not enrolled in high school and are living in the United States and its territories | Web and mail SA modes; single mode versus concurrent versus sequential designs |
| | Consumer Assessment of Healthcare Providers and Systems (CAHPS) Hospice Survey | Sample from list of patients and caregivers in participating hospices | Hospice informal caregivers (family members or friends of patients) | Web and mail SA and computer-assisted telephone interviewing (CATI) response modes; different mode sequences |
| *Papers focused on introducing text messaging and email as new modes for contact or data collection* | | | | |
| | Understanding Society, the United Kingdom Household Longitudinal Survey (UKHLS) | Two-stage stratified ABS (Great Britain); single-stage stratified ABS (Northern Ireland) | Households in the United Kingdom | SMS text and mail invitations; web SA and CATI response modes; sequential design |
| | National Survey of Fishing, Hunting, and Wildlife-Associated Recreation (FHWAR) | Two-phase ABS: sample of addresses for screener; sample of eligible addresses with oversample in areas with high hunting participation | Individuals 16+ in the United States who are participating in fishing, hunting, or wildlife recreation activities | SMS, email, and mail invitations; web and paper SA and inbound CATI response modes |
| | Principal Follow-Up Survey to the National Teacher and Principal Survey (NTPS) | Stratified sample of principal respondents to the NTPS | Principals in public and private schools in the United States | SMS, email, and mail invitations; web and mail SA and SMS response modes |
| *Papers focused on adaptive designs and optimizations* | | | | |
| | California Health Interview Survey (CHIS) | ABS sample of households; one adult selected via next birthday method | Households in California; individuals 18+ | Mail-push-to-web SA and CATI response modes |
| | National Survey of College Graduates (NSCG) | Stratified sample of respondents to the ACS (all eligible members of responding housing units and group quarter respondents) | Noninstitutionalized individuals aged 16–75 who have at least a bachelor's degree and are living in the United States and its territories | Web and mail SA and CATI response modes |
| | Dutch Health Survey (HS) and Dutch Labor Force Survey (LFS) | Two-stage population registry sample: stratified sample of municipalities; persons sampled from selected municipalities (HS); two-stage clustered ABS: clustered sample of addresses within geographic regions (LFS) | Households in the Netherlands: age 12+ (HS); and age 16–64 (LFS) | Web SA and face-to-face (F2F) response modes; single versus sequential design |
| *Papers focused on postcollection adjustment and estimation* | | | | |
| | Community Life Survey (CLS) | Clustered stratified ABS (new samples) with random selection of sample adult; respondents to prior CLS (follow-up sample) | Adults in England, age 16+ living in a private residence | Web SA and F2F response modes; single versus sequential design |
| | Jordan Arab Barometer | Stratified area probability sample of households; one adult selected via Kish grid | Adults in Jordan, age 18+ | F2F and CATI response modes |
The following section presents the current survey landscape. We then discuss recent evidence reported in the literature and findings presented in this special issue in the areas of recruitment and contact modes and their impact on selection and nonresponse errors, measurement errors in mixed-mode designs, mode effects, and adjustment methods. Finally, we discuss costs in mixed-mode surveys and conclude with areas for further research.
2. THE SURVEY LANDSCAPE
In the lead-up to the pandemic, the survey landscape was evolving beyond one-mode-fits-all designs (Olson et al. 2019). Survey organizations were leveraging multiple frames to mitigate coverage errors in any single frame. Additionally, increased (although not ubiquitous) linking of auxiliary data to sampling frames for improvements throughout the survey lifecycle enabled enhanced or targeted sampling (Barron et al. 2015) and targeted messaging in contact materials or data collection operations (Jackson et al. 2023). Surveys had begun using different modes for contact and administration, for example, by sending a letter asking the sample member to respond to a web survey (i.e., push-to-web approach; Parast et al. 2019). These enhancements can help reduce coverage, sampling, and nonresponse errors, but equally important, they can reduce survey costs. Auxiliary information enables targeted sampling that would otherwise require a costly screener operation; assists in matching the right survey materials to a sample member to obtain a response sooner and decrease follow-up costs; and helps identify which sample members require high-cost operations, like field follow-ups.
While the focus of this overview is on household surveys, establishment surveys had also been shifting to mixed-mode collection, often with an emphasis on web collection, long before the pandemic began in 2020, as evidenced by the decline in the availability of printable forms and in the use (or option) of response via mail or fax. However, in the context of surveys in which an individual or household, rather than an establishment, is the unit of analysis, the COVID-19 pandemic period of 2020–2021 jump-started research that Olson et al. (2019) foreshadowed.
The papers in this special issue reflect research on and usage of a larger variety of modes than was available in the past, as digital modes of contact and administration became more common during the pandemic. Looking across the special issue, several patterns emerged:
- About half of the papers discuss surveys sampled from other surveys, highlighting the continued use of enriched frames or samples for conducting or evaluating data collection operations. The other half discusses surveys selected from address-based sampling (ABS) frames or population registers, while no submissions focused on the use of multiple frames.
- All but one of the papers include web as a response mode, and in surveys that incorporated a sequential design, web was typically the initial mode.
- Most of the papers use experimental or observational designs, underscoring the need for empirical evidence of the impacts and benefits of new modes or mixes of modes.
- When papers discuss experimental testing of new modes, they include SMS text messaging or email, reflecting the increased interest in technological modes for contact or administration during the pandemic.
- The papers that discuss statistical methods for estimating or adjusting for mode effects in the presence of selection effects report mixed findings, indicating the need for more research into the development and evaluation of methods, including the identification of methods that work best for different survey designs or different measurement conditions.
- Three papers discuss balancing costs and errors, either implicitly or explicitly, leveraging sequential mixed-mode designs attempting to achieve that balance by either decreasing effort in later, more costly, modes or allocating limited cases to more expensive modes.
- Most papers analyze general population surveys; some investigate specific population groups such as hospice informal caregivers, school principals, and participants in wildlife recreation activities; but no papers in this special issue investigate establishment surveys.
- The papers use data from surveys collected in five countries (Germany, Jordan, the Netherlands, the United Kingdom, and the United States).
The mixed-mode studies featured in this special issue, and the increased use of new and innovative methods in general, represent a potentially promising approach to mixed-mode data collection beyond the pandemic. Relatively little is still known about these innovations, their effects on data quality, nonresponse bias, and measurement errors, and the adjustment techniques needed to ensure valid inference from mixed-mode data. It is therefore important to consider their postpandemic future in mixed-mode survey design and implementation.
3. RECRUITMENT STRATEGIES, CONTACT MODES, AND THEIR IMPACT ON SELECTION ERROR AND SURVEY NONRESPONSE
Many studies that had previously used interviewer-administered modes were forced to shift to self-administered methods or are currently preparing for a transition to self-administered modes (e.g., the European Social Survey). Those already using self-administered modes, fully or partially, embraced the opportunity to experiment with new techniques in the rapidly changing environment. Specifically, in this special issue, experimental designs attempted to answer one of two questions: (i) which contact and response modes, offered in what sequence, will not only achieve the highest response rate but also deliver the most representative responding sample and the highest data quality? and (ii) are there new modes, like SMS, that can be incorporated into the survey process to help in this effort, with minimal cost?
Several of the articles included in this special issue, including Christmann et al. (2024), Tolpadi et al. (2024), and Heimel et al. (2024), find strong evidence that a sequential approach, where respondents are pushed first to web and then later offered an alternative mode of contact (often paper or telephone), leads to the highest response rates and most balanced samples. Additionally, this sequencing approach seems consistently cost-effective. Notably, different studies find that a single-mode, paper-only treatment is less effective than it once was (Medway and Fulton 2012; McPhee et al. 2018; Jackson et al. 2023); further, Heimel et al. (2024) report the paper-only protocol to be most sensitive to demographic variability in response patterns. An outstanding question is whether this finding is true for all population subgroups. There also seems to be evidence that when recruitment is focused on younger adults, a single-mode, web-only design may be comparable to the sequential mixed-mode web-to-paper design (Heimel et al. 2024). However, this remains an open area for study.
Within this special issue, several submissions explored the use of text message survey invitations and reminders as well as the use of SMS as a response mode, but the effectiveness of this technology remains underresearched. For example, in this special issue, Christian et al. (2024) test the efficacy of using text messaging as a contact mode by experimenting with sequencing of text messages and mailing contacts and with the time text messages are sent to respondents. They report that text message reminders, especially early ones, are effective at increasing completion rates and that the sequencing of text messages did not affect response behavior or data quality across different subgroups. Yet, when Cabrera-Álvarez and Lynn (2024) experimentally test the addition of text messages to a sequential web-CATI mixed-mode longitudinal study in the United Kingdom, they find no significant impact of text message invitations or reminders on the overall response rate or the response by web. Their analysis reveals that specific subgroups unique to the longitudinal survey context (i.e., panel members who had not provided an email or had changed their address since the last wave, as well as panel members with an irregular response pattern, who had been more reluctant to participate in the past) benefit from the additional SMS invite and/or reminder.
Text messaging and SMS, though, are not only a mechanism for contacting and inviting sampled individuals into a survey through a URL but can also be used as a mode of survey administration. Spiegelman et al. (2024) in this special issue experiment with supplementing a paper questionnaire with two electronic modes of data collection: emails with a direct link to complete an online survey and invitations by text to complete a short survey via text messages (two-way SMS). The authors report that adding either electronic mode significantly increases response rates and reduces time to survey completion compared to offering only a paper mode. Furthermore, the response rates obtained through the SMS survey are higher than those obtained through the online survey.
Current survey research is also moving beyond asking “what works” to asking “what works for which population subgroups?” While we are beginning to accept that a “one-size-fits-all” design is not the best way to gain cooperation from all subgroups in target populations, we have not yet formulated a consistently successful approach, if one exists. Broadly, this discussion fits into the framework of Responsive and Adaptive Survey Design, including “static” adaptive designs, which tailor protocols based only on information available prior to data collection, and “dynamic” adaptive and responsive designs, which also leverage information and paradata available during data collection (Groves and Heeringa 2006; Schouten et al. 2017; Chun et al. 2018).
Evaluations of both adaptive and responsive designs have increased in recent years alongside increased pressure to control costs, but overall findings are mixed. Thus far, researchers have found limited success with designs that use a priori model-based targeting to assign cases to different sets of features. For example, Jackson et al. (2023) and two papers in this special issue, Zhang et al. (2023) and Asimov and Blohm (2024), attempt to improve response outcomes and sample balance by treating cases differently based on information appended to the sample frame. Jackson et al. (2023) employed machine-learning models to predict the sensitivity of household-level response behavior to the offered response mode and varied the initial offered mode based on the predictions. As Jackson et al. (2023) note, however, such model-based predictions demand that prior data exist with which to train the models and that the design phase allows for the time needed to build and test such models. Therefore, rather than building new predictive models, Zhang et al. (2023) in this special issue leverage the Esri Tapestry segmentation (Esri 2020) to tailor both mode sequencing and invitation materials to align with hypothesized preferences, and Asimov and Blohm (2024) explore whether mode sequencing can be tailored based on sample frame indicators of potential mode preference within the German General Social Survey. All three of these “static” adaptive designs are premised on the idea that offering materials appropriate to hypothesized preferences of sampled cases could improve the response of particular groups and/or reduce costs by using more expensive materials only when required.
Neither Jackson et al. (2023) nor Zhang et al. (2023) find an overall impact on response rates or the resulting sample composition. Asimov and Blohm (2024) report evidence that a tailored design could slightly increase response rates compared to a sequential web-to-mail design, but not compared to a concurrent web-and-mail design. The lack of success could be due to a variety of factors: the auxiliary data available at the time of targeting may be insufficient; the models used to create subgroups may not discriminate successfully; or the materials used for targeted interventions may not drive participation as expected. There remain, therefore, many unanswered questions regarding the effectiveness, efficiency, and cost of tailoring survey designs.
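To make the mechanics of such static targeting concrete, the sketch below shows one way a frame-based model might be used to assign an initial contact protocol before fielding begins. It is a minimal illustration under assumed inputs, not the procedure used in any of the papers above; the frame variables, file names, model, and cutoff are all hypothetical.

```python
# Minimal sketch of a "static" adaptive design: assign the initial offered
# mode from a model estimated on prior-round data. All variable names, file
# names, and the 0.6 cutoff are hypothetical and for illustration only.
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Prior-round data: frame variables plus an indicator of whether the case
# ultimately responded by web when web was offered first.
prior = pd.read_csv("prior_round_frame.csv")
features = ["pct_broadband_in_tract", "median_age_in_tract", "urban_flag"]
model = LogisticRegression().fit(prior[features], prior["responded_by_web"])

# Current sample frame: predict web-response propensity and tailor the
# first contact accordingly (push-to-web vs. concurrent web and paper).
frame = pd.read_csv("current_sample_frame.csv")
frame["p_web"] = model.predict_proba(frame[features])[:, 1]
frame["initial_protocol"] = frame["p_web"].apply(
    lambda p: "push_to_web" if p >= 0.6 else "concurrent_web_and_paper"
)
frame.to_csv("mode_assignments.csv", index=False)
```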
Recent implementations of dynamic adaptive and responsive designs have shown modest, but more consistent, successes. Furthermore, these designs often intervene at the case level, rather than the subgroup level. Coffey et al. (2020) implemented a dynamic adaptive design in a sequential multimode survey conducted via web, paper, and outbound telephone. During data collection, higher resource modes were withheld or introduced based on the over- or underrepresentation of key subgroups, achieving similar data quality with lower response rates and data collection costs. More recently, several papers have used stopping rules that reallocate resources away from stopped cases to other cases that will have greater impacts on data quality. Past work examining stopping rules in single-mode, face-to-face surveys through both simulation and experimental data collection (e.g., Einarsson et al. 2023; Wagner et al. 2023) generally finds the potential to control costs while improving, or having little effect on, sample characteristics and/or estimates. In this special issue, Coffey and Elliott (2024) also implement a responsive design experiment with a stopping rule in a sequential mixed-mode survey. Cases are stopped in the treatment group to minimize data collection costs and survey error in a single survey item, achieving cost reductions with no impact on quality in the treatment group versus the control group. Also in this special issue, Jackson et al. (2024) implement a dynamic adaptive design in a sequential mixed-mode survey using web followed by outbound telephone. The authors use a stopping rule based on near-real-time estimates of response propensity, with thresholds that vary by the predicted demographic characteristics of cases, also finding cost savings. More work is needed on the development of intervention rules and the models that underpin these types of designs: effective implementation of adaptive and responsive designs relies on models for response propensity, survey outcomes or data quality, and cost, as well as infrastructure to regularly estimate these models and implement these complex designs.
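As an illustration of the general shape of such a case-level rule, the sketch below flags active nonrespondents whose estimated propensity is low and whose subgroup is already well represented, so that the next, more expensive, mode is withheld from them. It is a schematic under assumed inputs and hypothetical thresholds, not the rule implemented by Coffey and Elliott (2024) or Jackson et al. (2024).

```python
# Schematic of a dynamic stopping rule in a sequential mixed-mode design.
# Inputs (propensities, subgroup representation ratios) are assumed to be
# refreshed from paradata during collection; thresholds are hypothetical.
import pandas as pd

def flag_cases_to_stop(active, propensity_floor=0.05, representation_target=1.0):
    """Return True for active nonrespondents to drop from the follow-up mode.

    active must contain:
      p_hat            - current estimated response propensity for the case
      subgroup_r_ratio - subgroup's current response rate divided by its
                         target share (>= 1 means already well represented)
    """
    low_propensity = active["p_hat"] < propensity_floor
    well_represented = active["subgroup_r_ratio"] >= representation_target
    return low_propensity & well_represented

# Example: effort freed by stopped cases can be reallocated to the remainder.
active_cases = pd.DataFrame({
    "case_id": [101, 102, 103],
    "p_hat": [0.02, 0.20, 0.03],
    "subgroup_r_ratio": [1.10, 0.80, 0.95],
})
active_cases["stop"] = flag_cases_to_stop(active_cases)
print(active_cases)  # only case 101 meets both conditions
```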
4. MEASUREMENT ERRORS AND CONSIDERATIONS FOR MIXED-MODE SURVEY DESIGN
Mode-specific measurement errors refer to differences in responses that a respondent provides when questions are presented in different modes. Surprisingly, there were not many submissions to this special issue about mode-specific measurement effects in mixed-mode designs, although we solicited submissions on testing and evaluating mixed-mode design features for questionnaires, assessing risks of measurement effects when transitioning a questionnaire from one mode to another, or exploring methods for reducing mode effects through effective questionnaire design. The primary challenge in identifying the effects of modes on measurement is distinguishing the measurement effect from the selection effect; this complication may have limited possible submissions in this area.
Most papers in this special issue analyze mixed-mode designs that incorporate a web mode, with many employing combinations of self-administered and interviewer-administered modes. In particular, several papers start with a web survey request and move to a telephone follow-up (table 1). Consequently, this section focuses on new developments in mixed-mode surveys, particularly those incorporating a web mode of data collection or combinations of self-administered and interviewer-administered modes, and best practices in the area. Given the lack of submissions in this special issue and numerous underexplored aspects that may differentially impact measurement, this section outlines areas for further research, with a particular emphasis on the relationship between survey design, survey mode, and measurement error.
Recent evidence confirms that the magnitude of mode differences tends to vary by question type, emphasizing the importance of considering question characteristics during the design stage. While factual demographic and behavioral questions are generally less affected by mode, attitudinal questions may exhibit larger measurement effects in mixed-mode designs (Burkill et al. 2016; Hox et al. 2017; Villar and Fitzgerald 2017; Huskinson et al. 2021). Kibuchi et al. (2024) report in this special issue that attitudinal and multicategory questions show larger measurement effects in the context of mixed-mode designs than behavioral and binary questions. Other question characteristics may be subject to mode effects in mixed-mode surveys including questions with many response categories (Cornick et al. 2022); sensitive topics (e.g., alcohol consumption); subjective questions such as well-being, self-rated health, and financial status (Goodman et al. 2022); and questions that might be subject to primacy effects in visual modes (e.g., paper and online) and recency effects in aural modes (e.g., face-to-face and telephone) (Cernat and Revilla 2020). More work is needed on question characteristics and measurement outcomes that are most sensitive to mode effects for different combinations of modes.
Long questionnaires may increase respondent burden, especially in self-administered modes. However, when combining self- and interviewer-administered modes, longer questionnaires may be necessary to obtain the needed information for the survey (Wolf et al. 2021). In this special issue, Asimov and Blohm (2024), Cabrera-Álvarez and Lynn (2024), Jackson et al. (2024), Kibuchi et al. (2024), and Yu et al. (2024) administer questionnaires that would traditionally be considered long (all between 36 and 60 minutes), whereas Spiegelman et al. (2024) and Tolpadi et al. (2024) focus on a short questionnaire. The length of questionnaires used in the studies in this special issue is consistent with other past work demonstrating that long self-completion questionnaires might not be associated with negative outcomes (Huskinson et al. 2021; Hanson and Fitzgerald 2022; Emery et al. 2023; Maslovskaya et al. 2024a, 2024b), suggesting that length may not be as much of a concern when combining self- and interviewer-administered modes. One approach to reduce the length of a survey is to create modules that are iteratively administered to respondents, reducing the length of any single survey request, but increasing the total number of survey requests to a potential respondent. None of the studies in this special issue used modularization, and past research shows mixed results in its implementation (Peytchev and Peytcheva 2017; Andreadis and Kartsounidou 2020; Peytchev et al. 2020; Toepoel and Lugtig 2022). Nevertheless, we think that more evidence is needed to investigate both nonresponse and measurement errors in longer versus shorter mixed-mode questionnaires.
Many web survey respondents complete questionnaires on mobile devices, making web surveys also mixed-device surveys (Toepoel and Lugtig 2015). In this special issue, Spiegelman et al. (2024) and Christian et al. (2024) report increased response rates when text messages were used in recruitment or administration, although Cabrera-Álvarez and Lynn (2024) do not find this pattern. Earlier studies have found that mobile device respondents provide less accurate answers and shorter answers to open-ended questions, and suffer from higher break-off rates than respondents using PCs or laptops for survey completion, but the evidence for other indicators is mixed (Bosch and Maslovskaya 2023). We were surprised to see little consideration of data quality from mixed-device respondents in submissions to the special issue; more research in mixed-mode and mixed-device studies is needed.
Video interviewing is a relatively new mode of interviewer-administered data collection, which has potential to combine the benefits of face-to-face interviewing with reduced costs and can be successfully offered within mixed-mode designs (West et al. 2022). Little is still known about this mode of data collection in the context of mixed-mode surveys; measurement comparability and other aspects of data quality still need to be explored. We encourage those who use video interviewing to plan methodological evaluations of its efficacy.
Mixed-mode designs can pose additional challenges when collecting complex measurements, such as occupational coding, cognitive assessments, or consent to auxiliary data collection. None of the papers in this special issue addressed complex measurements in a mixed-mode context. One open question in mixed-mode surveys is how to collect data that had traditionally been field-coded by an interviewer or coded by office staff, such as self-reported industry and occupation. In a web questionnaire, respondents could be asked to perform the coding themselves in a self-completion mode (e.g., select their occupation from a list of occupations or answer a series of questions about occupations) or to report their industry and occupation through open-ended answers (Peycheva et al. 2021; Wilson 2021). Other complex measurements, such as measures of cognition, differ between online and interviewer-administered modes as well as across web devices (Ofstedal et al. 2021), as do consent rates for linkage to administrative data and for biological measurements (Sakshaug et al. 2017; Thornby et al. 2018; Jäckle et al. 2022; Kumari et al. 2023). How comparable these complex measurements are when asked in noncomputerized modes or when mixing modes of data collection requires additional research.
One complication when designing mixed-mode surveys that include self-administered modes has to do with the potential respondent’s capacity to read and understand the questionnaire. Best practices suggest that mixed-mode surveys should accommodate respondents with lower cognitive abilities, education levels, and internet literacy to ensure measurement equivalence between different modes and different respondents (Wilson and Dickinson 2021; Cornick et al. 2022). How best to ensure accessibility and inclusivity of questionnaires designed for mixed-mode surveys requires additional research.
Further investigation into mode-specific measurement effects in mixed-mode surveys, and into methods for reducing mode effects through effective questionnaire design, is needed. Implementing experimental designs and parallel data collections can provide valuable insights into addressing measurement differences in mixed-mode surveys.
5. SURVEY ESTIMATION, MODE EFFECTS, AND ADJUSTMENT METHODS
Once survey data are collected, combining data collected via multiple modes also requires careful consideration. Coverage, nonresponse, and measurement effects may differentially affect estimates overall or for subgroups. An estimate that is subject to pure selection effects may adequately “eliminate” mode differences through weighting adjustments. Alternatively, an estimate that is subject to pure measurement effects may first be adjusted so that response data from one mode are aligned with response data from a preferred mode, followed by weighting. In practice, however, errors almost certainly appear in combination and are cumulative (Tourangeau 2020): there are likely compositional differences in which sample members respond to a particular mode, as well as in what sample members report across different modes.
The issue is that these errors are confounded—once a sample member responds in one mode, we typically do not observe the counterfactual. As a result, identifying and adjusting for mode effects requires external information or additional techniques beyond standard adjustments. Several approaches to handling mode effects exist, including adjusting responses to a benchmark or “gold standard” dataset, bridge surveys that conduct the same survey in multiple modes, repeated measurements for the same respondents in different modes, and statistical models that can account for mode differences (Olson et al. 2019; Schouten et al. 2022; Maslovskaya et al. 2023).
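To see the confounding concretely, consider an illustrative two-mode design with a web mode and a face-to-face (f2f) mode. The observed difference in respondent means can be written as the sum of a selection component and a measurement component:

```latex
% Observed mode difference decomposed into selection and measurement parts.
% \bar{y}_m^{(m)} is the mean for respondents who answered in mode m, measured
% in mode m; \bar{y}_{web}^{(f2f)} is the counterfactual mean the web
% respondents would have produced under face-to-face measurement.
\begin{align*}
\bar{y}_{\mathrm{web}}^{(\mathrm{web})} - \bar{y}_{\mathrm{f2f}}^{(\mathrm{f2f})}
  &= \underbrace{\left(\bar{y}_{\mathrm{web}}^{(\mathrm{f2f})} - \bar{y}_{\mathrm{f2f}}^{(\mathrm{f2f})}\right)}_{\text{selection effect}}
   + \underbrace{\left(\bar{y}_{\mathrm{web}}^{(\mathrm{web})} - \bar{y}_{\mathrm{web}}^{(\mathrm{f2f})}\right)}_{\text{measurement effect}}
\end{align*}
```

The counterfactual term, the mean web respondents would have produced under face-to-face measurement, is exactly what is not observed once each case responds in a single mode, which is why benchmark data, bridge or reinterview designs, or modeling assumptions are needed to separate the two components.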
There are often practical limitations to which approach is available for a given application. Benchmarking to external gold standards requires those external data to exist, which is frequently not the case. Bridge surveys and repeated measurement surveys both require additional data collection and can be expensive to implement. Statistical methods for adjusting and combining data from multiple modes can bypass the need for either external or additional data collection but come with their own drawbacks in that they are typically applied after data collection when the collected data are unchangeable, are often complex, and may require assumptions that are difficult to validate.
These challenges mean that there is no agreed-upon “best” or easy-to-implement method for diagnosing or adjusting data from multiple modes. Recent publications, including papers in this special issue, offer some empirical findings and the direction of further methodological development. In this special issue, Kibuchi et al. (2024) leverage data collected from three samples in parallel. The same questionnaire was fielded to all three samples over the same time period, but the contact and collection modes were different: one sample received face-to-face interviews, one received a mail push-to-web strategy, and one sample was administered a web survey as a follow-up to a face-to-face interview. Using propensity matching to identify comparable survey subgroups, the authors find small but persistent mode differences across the 133 items included in the questionnaire, with larger differences in comparisons of the face-to-face sample to either of the web samples.
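For readers less familiar with this approach, the sketch below illustrates the general idea of a propensity-matched mode comparison: model the probability of belonging to one mode's sample from covariates observed in both samples, match respondents across samples on that score, and compare item means over the matched sets. It is a simplified, hypothetical illustration, not Kibuchi et al.'s (2024) exact procedure; the variable names are assumptions.

```python
# Minimal sketch of a propensity-matched comparison of a face-to-face sample
# and a web sample. Column names ('f2f', covariates, the survey item) are
# hypothetical placeholders.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

def matched_mode_difference(df, item, covariates):
    """Mean difference on `item` between f2f respondents and their nearest
    web matches on the estimated propensity of being in the f2f sample.

    df must contain a 0/1 'f2f' indicator, the covariates, and the item.
    """
    ps_model = LogisticRegression().fit(df[covariates], df["f2f"])
    df = df.assign(ps=ps_model.predict_proba(df[covariates])[:, 1])

    f2f = df[df["f2f"] == 1]
    web = df[df["f2f"] == 0]
    # Nearest-neighbour matching (with replacement) on the propensity score.
    idx = np.abs(web["ps"].to_numpy()[None, :] -
                 f2f["ps"].to_numpy()[:, None]).argmin(axis=1)
    matched_web = web.iloc[idx]
    return f2f[item].mean() - matched_web[item].mean()
```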
Additionally, in this special issue, Schouten et al. (2024) extend past work to consider how reinterview surveys can reduce, as well as permit estimation of, mode effects in surveys. Through simulation on two large national surveys in the Netherlands, they find that reinterviews would be beneficial for reducing mode effects in one survey, but not the other. The authors offer a framework for determining not just whether reinterviews are necessary, but assuming that they are, how to allocate the survey workload between cases being reinterviewed in a second mode and cases receiving follow-up in the second mode after not responding in the first mode. This framework accounts for expected mode differences, expected response behavior, and survey budget.
Other research extends the use of statistical models for mode effects in different directions. These methods include a latent variable approach to estimating the underlying “true” value of a survey response, “free” of mode effects (Sakshaug et al. 2021), and fractional imputation to generate “counterfactuals” of response in the mode(s) not selected by the sample member (Park et al. 2016; She and Wu 2018). Other model-based methods combine data from multiple modes where the mode of response and measurement effects are assumed to be correlated with the actual response value (similar to a missing not at-random nonresponse mechanism) (Pfefferman and Preminger 2021). In this special issue, Yu et al. (2024) add to the existing literature on model-based methods by developing statistical decision rules to determine whether and how to combine data from multiple modes. The authors compare simple decision rules based on mode-specific means and variances as well as more sophisticated Bayesian methods for model averaging. The recommended methods are sensitive to mode effects, resulting in wide confidence intervals when mode-specific estimates and candidate model parameters are far apart, but those confidence intervals narrow when those differences decline. Much of the work on new statistical models for mode effects in survey data is simulated on designed (as opposed to collected) data or applied to special populations, requiring additional research.
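As a toy example of the simple end of that spectrum (and not one of the rules evaluated by Yu et al. 2024), a decision rule based only on mode-specific means and variances might pool estimates by inverse-variance weighting only when the mode difference is small relative to its standard error:

```latex
% Toy decision rule for combining estimates from modes 1 and 2, with means
% \bar{y}_1, \bar{y}_2 and variances v_1, v_2; c is a user-chosen threshold
% (e.g., a standard normal critical value).
\begin{align*}
z &= \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{v_1 + v_2}}, \qquad
\hat{\theta}_{\text{pooled}} = \frac{\bar{y}_1/v_1 + \bar{y}_2/v_2}{1/v_1 + 1/v_2},\\[4pt]
\hat{\theta} &=
\begin{cases}
\hat{\theta}_{\text{pooled}} & \text{if } |z| \le c \ \text{(modes judged comparable)},\\
\bar{y}_{\text{ref}} & \text{otherwise (fall back to a designated reference mode)}.
\end{cases}
\end{align*}
```

Rules of this kind are inherently sensitive to the size of the mode difference, consistent with the behavior of the confidence intervals described above.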
Methodological research focused on identifying and adjusting for mode effects in mixed-mode surveys is accumulating. Data collection designs can be used to identify and potentially correct for mode effects. Importantly, these findings confirm that mode effects do exist and that we need to explore methods for identifying and correcting them. These techniques show promise for understanding the risk of mode effects, but more work is needed to understand the robustness and ease of application of these methods to a variety of topics and estimates, as mode effects are estimate-specific. Statistical techniques also need further development and application to a variety of survey settings and classes of estimates (e.g., means, totals, regression coefficients). The statistical models and underlying assumptions needed for different types of estimates may render these methods differentially useful. Further work in this area will help our field understand the relative performance and impact of various statistical adjustment methods, whether specific methods are more appropriate for different survey designs or estimators, and whether the recommended methods are user-friendly.
6. SURVEY COSTS
No discussion of mixed-mode surveys would be complete without addressing costs. The COVID-19 pandemic may have accelerated the adoption of mixed-mode surveys, but concerns about decreasing participation rates accompanied by increasing data collection costs have been a motivator for much longer. While historically, the literature has often considered the costs of a particular data collection feature, such as incentives, or sets of features, such as incentives and mailings, as part of experimental evaluations (Mercer et al. 2015; Williams et al. 2018), detailed research about survey costs is still relatively rare.
New research on survey costs is emerging. A recent typology of cost metrics that provides common ground on which different survey organizations can report costs (without divulging proprietary details) identifies various actual, estimated, and proxy cost indicators (Olson et al. 2021). Importantly, different types of cost indicators may be available and/or useful for different modes of data collection, which makes examining costs in mixed-mode surveys particularly challenging. In an application of the typology to single-mode face-to-face and telephone surveys, Wagner et al. (2023) construct proxy cost indicators, exploring correlations between those indicators and actual survey costs for four different surveys. To our knowledge, a similar analysis of alternative cost indicators and survey costs in self-administered and mixed-mode surveys has not yet been conducted.
There is also a growing literature on cost models themselves, and the use of those models to inform data collection decisions. Researchers are using a broad set of models to estimate data collection costs (Wagner 2019; Wagner et al. 2020, 2021), and then leveraging those models to target data collection features (including modes), often as part of adaptive and responsive designs (van Berkel et al. 2020; Einarsson et al. 2023; Wagner et al. 2023).
In this special issue, three papers explicitly discuss data collection costs as a motivator for innovation. Jackson et al. (2024) evaluate a pseudo-experimental dynamic adaptive design in which outbound telephone calling effort was stopped for cases that were less likely to respond and in strata that were not typically underrepresented. This targeting methodology is implemented explicitly to control costs, and the authors report a 14 percent reduction in the average number of calls per case in the treated sample. Coffey and Elliott (2024) evaluate a responsive design that minimized a function of survey costs and errors. By incorporating response propensity, case-level contributions to bias and variance, and data collection costs into the objective function, the responsive design identified cases that were less likely to respond, had minimal impacts on bias and variance, had high estimated data collection costs, or some combination of the three. The treatment group subject to optimization had 10 percent lower data collection costs than the control group, with no practical impact on the key survey estimate (self-reported salary) or on response rates. In the mode effect minimization strategy discussed in Schouten et al. (2024), a survey’s data collection budget is incorporated into an optimization function that allocates resources between reinterview in a second mode for cases that responded to a first mode and follow-up in the second mode for cases that did not respond to the first mode. Importantly, this allocation strategy assumes that data collection budgets will remain level, and so it explicitly addresses the fact that effort must be reallocated away from some nonresponding cases to allow for reinterview of other cases.
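The general shape of such cost-error objectives can be written compactly. The expression below is a generic sketch, not the exact objective used by Coffey and Elliott (2024) or Schouten et al. (2024): it selects the set of active cases that should receive further effort by trading off their predicted remaining cost against the estimated error of the key estimate.

```latex
% Generic cost-error objective: \mathcal{A} is the set of active cases,
% \hat{c}_i the predicted remaining data collection cost for case i,
% \hat{\theta}_S the key estimate expected if only cases in S receive further
% effort (via their response propensities), and \lambda the cost-error weight.
\[
\min_{S \subseteq \mathcal{A}} \;
  \sum_{i \in S} \hat{c}_i
  \;+\; \lambda \left[ \widehat{\mathrm{bias}}^{\,2}\!\left(\hat{\theta}_S\right)
  + \widehat{\mathrm{var}}\!\left(\hat{\theta}_S\right) \right]
\]
```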
Several other papers in this special issue include an assessment of costs as one of several examined outcomes, even if cost was not an explicit motivator. Asimov and Blohm (2024) find that the cost per completed interview is higher in a concurrent web-and-mail design than in a sequential web-to-mail design, and that designs where modes are tailored to specific age groups show slightly higher costs than either of the nontailored designs. Christmann et al. (2024) find that the cost per completed survey increases relatively linearly with the number of mailings that included a paper questionnaire. Heimel et al. (2024) evaluate the costs of five different mixed-mode designs and find that the web-only design has the lowest cost per completed interview, although the cost per completed interview was comparable for the sequential push-to-web design that does not offer a paper option until the fifth contact.
While several papers included in the special issue examined cost as part of their analysis, more research is needed. It is important to improve the capture and understanding of cost-related data, be it paradata, more granular information about cost-driving activities, or even relative aggregate costs between different activities (e.g., bulk mailing costs versus email notification costs). It is also critical to understand the relationship between predicted and actual costs as well as the relationship between costs and data quality.
7. DISCUSSION
Mixed-mode collection is the “new normal” in surveys, with adoption accelerated by the pandemic and facilitated by new and emerging technology. Mixed-mode methodologies offer alternatives to cost-prohibitive single-mode designs, with recent research focusing on their specific consequences (e.g., representativity, mode effects). This special issue includes a snapshot of some of the research that has grown out of these changes and challenges, including experimentation focused on mixed-mode sequencing, the use of new technology like SMS, new statistical techniques for estimation in the context of mixed-mode surveys, model-driven adaptive and responsive design approaches aimed at cost reduction, and the enhancement of sample frame information to improve sampling efficiency and tailor operational strategies.
Many areas of research related to mixed-mode surveys remain unexplored. We encourage survey researchers to continue investigating how sample frames can be augmented, enhanced, and combined to improve mixed-mode survey research. There is potential to augment sample frames, particularly ABS and registration-based frames, with auxiliary data to not only improve targeted operations but also enhance and complement the survey data. For example, the utility of appending GIS-based location information to survey frames, including distance metrics and regional and environmental characteristics, remains largely unexplored in this context. Additionally, as our models and designs grow increasingly sophisticated, we need to be cognizant of the trade-off between the improved precision, representation, and data quality that tailored adaptive or responsive designs promise and the implementation complexity and cost they incur.
More research is also warranted on within-household sampling methods in the context of mixed-mode surveys. While in-person studies and older landline telephone studies were often able to roster households and conduct randomized within-household respondent selection, mixed-mode administration complicates this step. Most mixed-mode studies employ quasi- and nonprobability-based methods, such as asking for the person with the most recent birthday or allowing any adult to complete the survey (Olson et al. 2019; Nicolaas 2022). Quasi- and nonprobability approaches can reduce respondent burden and increase response rates but may increase bias. The impact on survey estimates of incorrect selection of individuals within households in self-completion modes should be further explored. Additionally, in self-administered modes, researchers rely on the respondent to follow the selection instructions. This often includes asking the initial household respondent to hand off the survey to another household member, which can lead to an increased risk of respondent self-selection and survey nonresponse (Olson et al. 2019).
The risk of measurement error is potentially greater in self-administered and mixed-mode designs. Additionally, as online technology evolves and changes, the profile of measurement issues is likely to change as well. It is crucial that researchers continue to explore the sources of measurement errors and their interactions with survey mode. Specifically, the effects of question characteristics, questionnaire length, “mobile-first” design, collection of complex measures, and requests for auxiliary tasks on data quality, sample representativeness, and respondent accessibility are not yet fully understood. Similarly, more work is needed on estimating and adjusting for mode effects in mixed-mode surveys and on understanding which statistical adjustment methods are appropriate for different survey designs or estimators.
Ensuring accessibility and inclusivity in general population surveys is important, as it is crucial to hear the voices of all population subgroups. This includes not only demographic and socio-demographic subgroups but also those who face accessibility challenges (e.g., those with visual impairment, deafness, neurodivergence, and other conditions). Tailoring data collection approaches to meet individual needs, including offering different modes in mixed-mode designs, helps increase the accessibility and inclusivity of surveys (Robinson 2024).
Finally, we need to continue to thoughtfully incorporate cost models into our research on different mixed-mode designs and methodologies. As cost is often a key driver in the design process, we need a good understanding of the relationship between design levers and costs. Being able to successfully estimate the costs of various data collection features, and to understand how spending on those features increases (or decreases) survey errors, will allow survey teams to design higher-quality, cost-efficient surveys.
In summary, transitioning surveys to mixed-mode designs has led to many successes for survey researchers. However, there are still areas for further research and investigation that need urgent attention to ensure that mixed-mode surveys produce high-quality, cost-effective data.
The contributions to this special issue have improved the understanding of different aspects of the mixed-mode survey landscape. We would like to thank all authors, reviewers, and particularly the editors-in-chief of JSSAM, Kristen Olson and Katherine Jenny Thompson, who have supported us through this process and who made this special issue possible.