Mengyu Li, Gaofei Li, Sijia Yang, Correction by distraction: how high-tempo music enhances medical experts’ debunking TikTok videos, Journal of Computer-Mediated Communication, Volume 29, Issue 5, September 2024, zmae007, https://doi.org/10.1093/jcmc/zmae007
Abstract
The spread of multimodal coronavirus disease 2019 (COVID-19) misinformation on social media poses considerable public health risks. Yet limited research has addressed the efficacy of citizen-contributed, multimodal debunking messages, especially the roles of audiovisual structural features. In a between-subject online experiment, we assessed the impacts of misleading TikTok videos promoting the false claim that COVID-19 vaccines cause infertility and compared the effectiveness of debunking videos from medical experts vs. laypeople. We independently varied the presence of background music. Results showed that while misleading TikTok videos increased misperceptions, most debunking videos effectively countered such misinformation. Notably, compared with laypeople’s testimonial corrections, expert didactic videos benefited more from incorporating high-tempo background music, primarily through the suppression of counterarguing rather than through enhanced encoding. These findings underscore the importance of considering audiovisual structural features, such as background music, as well as the cognitive pathway through distracted counterarguing, in future research on multimodal misinformation and correction.
Lay Summary
The spread of multimodal coronavirus disease 2019 (COVID-19) misinformation on social media poses public health risks. However, we do not know much about whether citizen-contributed debunking video messages can help correct health-related misinformation, nor the roles of specific message features such as background music. To answer these questions, we conducted an online experiment where participants viewed misleading TikTok videos about COVID-19 vaccines causing infertility and then watched correction videos from experts or regular users. We also varied whether these debunking videos included high- vs. low-tempo background music. We found that most correction videos effectively corrected the false information. Notably, expert videos with fast-paced music were particularly successful in reducing counterarguments that criticized the correction messages, which in turn helped improve such correction messages’ persuasiveness. Our study underscores the importance of well-crafted multimedia corrections, including the role of background music, in combatting false information on platforms like TikTok.
Since the outbreak of the coronavirus disease 2019 (COVID-19) pandemic, social media have played a dual role in information integrity (Southwick et al., 2021): catalyzing the spread of misinformation related to COVID-19 vaccination on one hand, and helping health organizations and professionals inform the public on the other (Baghdadi et al., 2023; Yang et al., 2023). COVID-19 misinformation had a great impact on people’s mental health (CDC, 2023; Xiao & Torok, 2020), and threatened to delay preventative behaviors such as vaccination, masking, and social distancing (Loomba et al., 2021; Neely et al., 2022). Meanwhile, the rise of short video platforms such as YouTube and TikTok (Matsa, 2022) also provides a fertile ground for the rapid spread of COVID-19 misinformation: For instance, 124 TikTok videos containing misleading COVID-19 vaccine information have garnered over 20 million views and 339,000 shares (O’Connor, 2021). Furthermore, research suggests that information presented in video format is perceived as more convincing and credible (Sundar et al., 2021; Wittenberg et al., 2021).
Therefore, it is crucial to investigate message features within short videos that can help to counteract the misperceptions resulting from multimodal health misinformation on short video platforms. Yet, unlike research on text-based misinformation on established social media platforms, much less research has examined multimodal misinformation and its correction, despite evidence showing that videos are effective in both transmitting (e.g., Vraga et al., 2022) and debunking misinformation (Bhargava et al., 2023; Yousuf et al., 2021).
Given the growing popularity of short-video platforms such as TikTok, pro-vaccine content from non-institutional sources, including medical experts, public health professionals, and ordinary citizens, is on the rise (Baghdadi et al., 2023; Kampen et al., 2022; Yang et al., 2023). Since this citizen-led online vaccine promotion also includes dedicated debunking efforts (Micallef et al., 2020; Vraga & Bode, 2020), research is needed to evaluate the effectiveness of such citizen-led fact-checking in countering health-related misinformation.
Additionally, background music is a common audio structural feature (Fisher et al., 2018; Lang, 2006; Lee & Lang, 2015) and a crucial part of storytelling (Medina-Serrano et al., 2020) on TikTok, yet its impact on the processing and reception of debunking videos remains underexplored. Existing research shows mixed results on music's role in user engagement. Some studies show background music is widely used to increase the appeal of short videos, including fact-checking videos, to attract audience engagement (e.g., Lu & Shen, 2023), while others suggest an opposite pattern in COVID-19 vaccine videos (e.g., Yang et al., 2023). Therefore, investigating the role of music in cognitive processing (Fisher et al., 2018; Rodero, 2019) and misinformation debunking can help advance the theorization of multimodal correction while producing practical message design principles to optimize health misinformation debunking videos. Drawing upon the Limited Capacity Model of Motivated Mediated Message Processing (LC4MP; Lang, 2006), the psychological reactance theory (PRT; Brehm & Brehm, 1981; Dillard & Shen, 2005), and recent attempts to integrate these two frameworks (Clayton, 2022; Clayton et al., 2018), we experimentally examine how background music might affect the efficacy of TikTok debunking videos, particularly those created by medical experts who often adopt a didactic style (Law, 2021) to address health-related misinformation.
Specifically, based on the best available scientific evidence, we focus on the false claim that COVID-19 vaccines lead to infertility as a case study. Our study seeks to address the following questions: (a) how would misleading and debunking videos respectively affect misbelief acceptance; (b) whether background music would influence the effects of debunking videos that differ by type (i.e., expert didactic corrections vs. layperson testimonials); (c) what cognitive mechanisms can account for background music’s differential effects (i.e., reducing counterarguing vs. increasing encoding). Our results highlight the importance of audiovisual structural features and the theoretical relevance of the integrated LC4MP-PRT framework in future research on multimodal misinformation correction.
Multimodal COVID-19 misinformation on social media
Ever since the COVID-19 pandemic began, scholars and public health professionals have recognized the serious challenges posed by widespread health misinformation (WHO, 2020), which we define as publicly available claims about health promotion and/or disease control that are (a) currently false because they contradict the consensus among experts/institutions that adhere to scientific principles and/or the best available scientific evidence at the time (e.g., randomized controlled trials and systematic reviews), (b) inaccurate due to their use of incomplete evidence, or (c) unsubstantiated due to a lack of evidence (Nan et al., 2023; Southwell et al., 2022). Concerns exist that misinformation might trigger negative emotions (Xiao & Torok, 2020), promote unproven or even life-threatening treatments (e.g., injecting bleach), and hinder compliance with recommended mitigation behaviors (Vigdor, 2020). The idea that misinformation circulating on social media may jeopardize the integrity of the public health information environment is by no means new (Sylvia Chou et al., 2020). In recent years, video-based platforms such as YouTube and TikTok have amplified misinformation production and propagation, in part by making it easy to employ audiovisual message features (Kim & Chen, 2022). However, research on health misinformation and its corrections still predominantly focuses on text or still images on established platforms such as Facebook and Twitter (e.g., Pennycook & Rand, 2021). Joining a burgeoning literature on multimodal misinformation (Sundar et al., 2021; Vraga et al., 2022), our first goal is to assess the pernicious effects of COVID-19 misinformation packaged in a short-video format on a popular video-based platform, TikTok.
TikTok allows users to post short videos typically ranging from several seconds to a few minutes. It stands as one of the most rapidly expanding social media platforms globally. While health-related topics on TikTok have received significant user engagement (Schaeffer, 2019), they also raise concerns for information quality. For instance, between January and March 2022, 27% of TikTok videos about COVID-19 were found to contain incorrect, misleading, or incomplete information (Southwick et al., 2021). Given TikTok’s soaring popularity, the potential for TikTok to promote health misperceptions necessitates attention from both communication and public health researchers.
This study focuses on TikTok videos that falsely claim that COVID-19 vaccination causes infertility. Studies have shown that misinformation exposure is positively associated with vaccine hesitancy (e.g., Loomba et al., 2021), defined by WHO (n.d.) as reluctance or refusal to vaccinate despite availability. Despite guidelines recommending vaccination for women who are pregnant or intend to become pregnant (Centers for Disease Control and Prevention, 2023), this group exhibits higher vaccine hesitancy than the general population (Townsel et al., 2021; Yasmin et al., 2021). It has also been shown that fertility concerns draw substantial public interest (Diaz et al., 2021), and fear of reduced fertility is a significant obstacle to vaccination among the unvaccinated U.S. population (Diaz et al., 2022). Therefore, it is important to examine whether exposure to misinformation linking COVID-19 vaccines to fertility complications or miscarriages contributes to such misperceptions.
Prior research shows that multimodal misinformation influences individuals’ perceptions more strongly than its textual counterpart. This is partly because it is often seen as more realistic and credible due to the indexicality and realism heuristics, and because it tends to evoke stronger emotional responses (Sundar et al., 2021; Weikmann & Lecheler, 2022). Given these findings, we expect that TikTok videos promoting the claim falsely linking COVID-19 vaccination to infertility will increase misbeliefs.
H1: Exposure to TikTok videos promoting the false link between COVID-19 vaccination and infertility will increase this misbelief.
“Whole-of-Society” debunking: medical experts as online citizen fact-checkers
Recently, the U.S. Surgeon General Dr. Vivek H. Murthy called for a “whole-of-society” approach to tackle the threats from health misinformation, urging average citizens and medical experts to collaborate with local communities, technology companies, governments, and journalists in their efforts to combat misinformation (Office of the Surgeon General, 2021). Understandably, most existing research on misinformation correction focuses on top-down institutional players (e.g., the governments and professional fact-checkers), given their institutional responsibilities in protecting information integrity (Walter et al., 2020). That said, for a “whole-of-society” approach, research on health misinformation needs to expand its scope to consider grassroots fact-checking initiatives, where community members collaborate to challenge and correct misinformation.
Citizen fact-checking (Micallef et al., 2020) emphasizes the importance of bottom-up corrections undertaken by ordinary citizens in collectively safeguarding public information integrity. In this study, we conceptualize online misinformation debunking undertaken by medical experts as a part of citizen fact-checking, given that misinformation correction on social media is typically beyond their institutional responsibilities. Medical experts, including medical scientists, practicing clinicians, and non-clinician healthcare professionals, remained one of the most trusted sources of health information during the pandemic, a time when public trust in other institutions waned (Funk & Gramlich, 2020). In several recent studies, expert endorsements helped correct misbeliefs about vaccination (Van der Linden et al., 2015; Zhang et al., 2021), including those related to COVID-19 vaccines (Ronzani et al., 2022). Therefore, encouraging medical experts to engage in online citizen fact-checking is an essential component of implementing the “whole-of-society” approach. Furthermore, we included and tested debunking videos from laypeople, a group that makes up a much larger share of online citizen fact-checkers.
Two reasons justify the efforts to better understand the effectiveness of citizen fact-checking, particularly those undertaken by medical experts. First, institutional fact-checkers’ reach and impacts are limited by the declining trust in institutional sources, the ease of selective exposure and dismissal of information deemed contradictory to individuals’ pre-existing opinions, and a general lack of interest in truth discernment (Bakshy et al., 2015; Thorson & Wells, 2016). Conversely, correction messages from misinformation-susceptible social media users’ own existing online networks are more likely to bypass psychological or algorithmic filtering that would otherwise restrict message exposure (Bakshy et al., 2015; Margolin et al., 2017). Furthermore, citizen fact-checkers can engage one-on-one with the disbelievers and tailor their responses accordingly, a level of personalized interaction that institutional messengers may find challenging to achieve.
Second, citizen fact-checking has already taken root on social media. In one recent large-scale study on COVID-19 misinformation correction on Twitter, citizen fact-checkers generated a substantially greater volume of corrective tweets than professional fact-checkers (Micallef et al., 2020). A national online survey in early 2020 revealed that about one-third of participants had witnessed others being corrected, while nearly one-quarter reported engaging in corrective behavior themselves (Bode & Vraga, 2021). During the pandemic, medical experts have taken unprecedented steps to participate in online misinformation debunking (Ohlheiser, 2020), using their expertise to communicate about critically important medical issues to the public (Southerton, 2021). Empirical evidence evaluating the effectiveness of such citizen fact-checking practices led by medically trained online influencers can thus provide much-needed data to inform other medical experts about the value of joining this collective effort.
However, citizen fact-checking’s effectiveness, particularly when deploying multimodal videos on platforms like TikTok, remains understudied. To fill this gap, we tested whether debunking videos created by medical experts vs. by laypeople can effectively correct multimodal misinformation about COVID-19 vaccination and infertility. By encouraging the use of text overlays, graphics, and sound, multimodal platforms such as TikTok are designed to facilitate the creation and dissemination of audiovisual content, hence holding the potential to amplify citizen fact-checking’s reach and impact. Packaging information in different modalities into a coherent video can also help present complex arguments and make otherwise cognitively taxing information more accessible and easier to understand (Cohn, 2019). Multimodal videos are thus typically more powerful than their textual and static-visual counterparts (Josephson et al., 2020).
Regarding multimodal correction messages, prior research has yielded promising findings concerning the efficacy of video-format correction. Notably, Bhargava et al. (2023) conducted an experiment demonstrating that corrective TikTok videos successfully countered COVID-19 false claims. A recent study by Yousuf et al. (2021) found that debunking videos not only helped reduce misconceptions about the side effects of COVID-19 vaccines, but also improved vaccine-related knowledge, to a degree larger than their textual counterparts. Relatedly, Young et al. (2018) reported the superior efficacy of video-formatted corrections over long-form textual versions (e.g., FactCheck.org articles), with the effect primarily mediated through increased message attention and reduced confusion. Therefore, we anticipate TikTok videos, either produced by medical experts or laypeople, to effectively debunk misinformation.
H2: Compared with the misinformation-only condition, exposure to TikTok correction videos, regardless of type, will reduce the misbelief that getting COVID-19 vaccines leads to infertility.
Admittedly, not all correction messages are created equal (Walter et al., 2020; Walter & Murphy, 2018). Videos by medical expert TikTokers, who often use a didactic style, may not yield optimal outcomes. That said, limited research has investigated how structural message features, such as background music, may influence the effectiveness of fact-checking videos. The following section details the theoretical rationale for why high- vs. low-tempo background music might moderate correction success, with varying impacts on debunking TikTok videos produced by medical experts vs. by laypeople.
Reducing counterarguing or boosting encoding: cognitive pathways by which background music may influence the effectiveness of correction videos
TikTok videos frequently leverage background music for engagement (Lu & Shen, 2023; Lundy, 2023). However, despite the ubiquitous presence of music featured in online videos, its role in multimodal misinformation correction remains understudied. Drawing on the LC4MP (Fisher et al., 2018; Lang, 2006; Lee & Lang, 2015) and the PRT (Brehm & Brehm, 1981; Dillard & Shen, 2005), we propose that adding high-tempo background music may enhance the efficacy of expert didactic corrections through two potential cognitive mechanisms: reducing counterarguing or boosting encoding. To explain our theoretical rationale, we first discuss how the performance of expert didactic debunking videos may vary depending upon the presence of high-tempo background music. We then discuss why this pattern may differ for layperson testimonial videos. Despite a wide range of message features and persuasive appeals that could be studied, we focus on expert didactic vs. layperson testimonial videos, as both represent important forms of citizen fact-checking popular on TikTok (Micallef et al., 2020; Ohlheiser, 2020; Southerton, 2021).
Debunking videos posted by medical experts during the pandemic (Ohlheiser, 2020; Southerton, 2021) typically adopt a didactic approach (Law, 2021), featuring facts, scientific and statistical evidence, and at times data visualizations. According to two meta-analyses, whether didactic messages outperform testimonial and narrative messages in producing desirable health-related persuasive outcomes is contingent on factors such as behavior type (e.g., prevention vs. cessation), sample characteristics, and persuasive goal (Xu, 2023; Zebregs et al., 2015). Furthermore, didactic messages are more likely to trigger counterarguing than those presented in narrative or testimonial styles (Krakow et al., 2018; Ratcliff & Sun, 2020), potentially leading the message recipients to discredit and refute the key arguments presented in the correction video (Gardner & Leshner, 2016; Green, 2006; Moyer-Gusé, 2008; Ratcliff & Sun, 2020). One plausible reason for this vulnerability to counterarguing is the argumentative nature of didactic persuasive messages, whose emphasis on logic, reason, and evidence (Billig, 1996) often renders them more likely to be perceived as a threat to one’s freedom (Moyer-Gusé, 2008). Conversely, narrative persuasion often evokes emotional responses and identification with characters, making it less susceptible to logical counterarguments (Margolin, 2021). Through the lens of PRT, this salient freedom threat can induce psychological reactance, including the cognitive component of counterarguing, along with the affective component of anger (Brehm & Brehm, 1981; Dillard & Shen, 2005). Supporting this theoretical expectation, Ratcliff and Sun (2020) found that didactic messages overall produce more counterarguing than narratives. PRT further suggests that counterarguing in turn can motivate attempts to restore threatened freedom, which may lead recipients to derogate the persuasive message (Brehm & Brehm, 1981; Dillard & Shen, 2005).
Such defensive processing is likely to weaken the correction’s effectiveness. For example, while some scholars demonstrated the advantage of well-crafted, evidence-based scientific consensus messages to communicate contentious topics such as climate change (Van der Linden et al., 2015), other findings show that didactic messages could backfire, leading to increased resistance to correction and worse persuasive outcomes, especially among the misinformed (e.g., Ma & Nan, 2019). Therefore, expert didactic videos might produce less effective correction through the indirect pathway of increased counterarguing.
Given that induced counterarguing could undermine the efficacy of expert didactic debunking videos, we ask whether any structural message features could either directly mitigate counterarguing or compensate for its negative effect through alternative mechanisms, such as enhanced cognitive resource allocation to processing the main message. We argue that the inclusion of background music, especially high-tempo instrumental music without lyrics, may provide this benefit. As shown in Figure 1, we propose two competing mechanisms: first, the correction-by-distraction hypothesis, where high-tempo music is expected to automatically reduce the cognitive resources available for counterarguing, leading to reduced counterarguing and then improved correction success; second, the correction-by-recognition hypothesis, where high-tempo music improves correction by creating a “net” gain in cognitive resource allocation and enhancing the encoding of the key arguments in expert didactic videos. The correction-by-distraction hypothesis does not preclude the possibility that encoding might be worsened in parallel to counterarguing, as high-tempo music could take away cognitive resources allocated to processing the main message the same way it disrupts counterarguing. Even so, these two competing mechanisms can be empirically distinguished, because the directions of the indirect effect through recognition would be opposite: negative or non-significant under the correction-by-distraction hypothesis vs. positive under the correction-by-recognition hypothesis.

Figure 1. The proposed theoretical framework: The indirect effect through counterarguing vs. recognition.
Note. Two sets of conditions regarding musical tempo were examined: (a) high-tempo music vs. low-tempo music, and (b) high-tempo music vs. no music.
According to LC4MP, cognitive resources available for information processing are limited. Automatic allocation of cognitive resources is influenced by two primary processes: (a) the degree to which message stimuli activate the recipient’s appetitive (e.g., portrayals of opportunities for survival) and/or aversive (e.g., signals for danger and threat) motivational systems, and (b) the presence of Orienting Eliciting Structural Features (OESFs) that could automatically elicit orienting responses. LC4MP further posits that although a modest activation of the aversive system may increase cognitive resources allocated to message encoding and storage, a hyper-charged aversive system would shift them away from the focal message toward either internal defensive processing (e.g., counterarguing, a “fight” response) or withdrawal to prepare the body to deal with the looming threat (e.g., avoidance, a “flight” response). Earlier work on LC4MP did not specify the conditions under which “fight” vs. “flight” defensive processing would occur, nor did it provide an analytical strategy to differentiate these two processes. Recent integration of LC4MP and PRT (Clayton, 2022; Clayton et al., 2018, 2020) addresses this gap. One recent study (Clayton et al., 2020) found that adding dogmatic (vs. suggestive) language in anti-vaping PSAs led to increased counterarguing, slower secondary task reaction time, and worse recognition accuracy. This pattern aligns with the expectation that counterarguing, as a form of defensive message processing, can deplete cognitive resources required for other cognitive tasks. This theoretical integration extends LC4MP’s explanatory scope from information processing (Cappella, 2006) to persuasive outcomes following “fight” responses.
Following this logic, if counterarguing reduces corrective messages’ persuasiveness by over-consuming limited cognitive resources, this process should be mitigated or even reversed if some message features could automatically pull cognitive resources away from counterarguing. Music, known to elicit orienting responses in message recipients to influence automatic allocation of limited cognitive resources (Fisher et al., 2018; Lang, 2006; Lee & Lang, 2015), could be one such feature. Incorporating music into radio ads can improve self-reported attention and recognition (Rodero, 2019). High-paced music was found to increase arousal measured by skin conductance level (Dillman Carpentier & Potter, 2007). While prior research testing the integrated theoretical framework of LC4MP and PRT focused on content features (e.g., dogmatic language) likely to activate counterarguing (Clayton et al., 2018, 2020), we conducted a novel test of whether high-tempo background music can act as a distraction that suppresses counterarguing and consequently improves correction.
As a placebo test, we compared expert didactic corrections with debunking videos created by laypeople featuring personal stories and testimonials. Since narratives can reduce defensive processing (Ratcliff & Sun, 2020), there should be less counterarguing available for suppression to begin with. Thus, for layperson testimonial videos, we expect weaker correction-by-distraction effects of high-tempo background music. Operationally, this expectation amounts to an interaction effect, where a debunking video with high-tempo background music (vs. the low-tempo or no music versions of the same debunking video) should impact counterarguing—and, thereby, misbelief—to a greater degree if this video deploys expert didactic correction than layperson testimonials. Although prior research has not yet extensively tested the effects of high-tempo background music on counterarguing, its general distraction effect has been well-documented (Fraser & Bradford, 2013; Oakes & North, 2006). Furthermore, Jeong and Hwang (2015) found that music videos uniquely interfered with counterarguing without impairing overall message comprehension. Therefore, we put forth the following hypotheses to lay out the correction-by-distraction argument. Notably, the core mediator of interest is counterarguing, although a parallel negative indirect effect through reduced recognition remains possible. Theoretically, high-tempo music is as likely to deplete cognitive resources needed for processing the debunking video as to disrupt counterarguing.
H3: There will be an interaction effect between debunking video type and musical tempo such that adding high-tempo music (vs. low-tempo music or no music) will reduce counterarguing to a larger degree for expert didactic videos than for layperson testimonial videos.
H4: There will be an interaction effect between debunking video type and musical tempo such that adding high-tempo music (vs. low-tempo music or no music) will improve misbelief correction to a larger degree for expert didactic videos than for layperson testimonial videos.
H5: The effect of high-tempo background music on misbelief correction is mediated through reduced counterarguing, and this indirect effect will be stronger for expert didactic videos than for layperson testimonial videos.
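H3 through H5 can be read together as a first-stage moderated mediation model. The equations below are an illustrative sketch only; the notation is an assumption introduced here, not the study's reported specification. Let $X$ indicate high-tempo music (1) vs. the comparison music condition (0), $W$ indicate expert didactic (1) vs. layperson testimonial (0) videos, $M$ denote counterarguing, and $Y$ denote misbelief.

```latex
\begin{align}
M &= a_0 + a_1 X + a_2 W + a_3 XW + \varepsilon_M \\
Y &= b_0 + c'_1 X + c'_2 W + c'_3 XW + b_1 M + \varepsilon_Y \\
\text{Indirect effect of } X \text{ at } W &= (a_1 + a_3 W)\, b_1 \\
\text{Difference between video types} &= a_3 b_1
\end{align}
```

Under this sketch, H3 corresponds to $a_3 < 0$ (high-tempo music reduces counterarguing more for expert videos). H4 concerns the $XW$ coefficient in a model of $Y$ that omits $M$. H5 corresponds to a negative indirect effect $(a_1 + a_3 W)\,b_1$ that is larger in magnitude when $W = 1$, which, given $b_1 > 0$, amounts to $a_3 b_1 < 0$.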
Alternatively, if high-paced music can automatically attract cognitive resources, would such “net” gain in allocation spill over to enhance correction message encoding and storage? Indeed, studies show high-tempo music can attract people’s attention (Kellaris et al., 1993) and facilitate information encoding and recognition (Mohling, 2015). Following these studies, one would expect high-tempo background music to improve the encoding and recognition of verbal information presented in the debunking video, which should, in turn, enhance the success of correction. This correction-by-recognition argument would not predict a moderation effect of video type, as both expert didactic and layperson testimonial videos would be equally likely to benefit from this “net” gain in overall cognitive resource allocation. Although theoretically plausible, this argument is inconsistent with the extensive prior research showing that music, particularly music with frequent changes in tempo and timbre, acts as a distraction that worsens encoding and recall (Fraser & Bradford, 2013; Oakes & North, 2006). Given the mixed findings, we pose two research questions regarding the correction-by-recognition mechanism, particularly the indirect effects through recognition.
RQ1: Will high-tempo music (vs. low-tempo music and no music) affect message recognition differently for expert didactic vs. layperson testimonial videos?
RQ2: (a) Will message recognition mediate the effects of high-tempo background music (vs. low-tempo music and no music) on misbelief correction? (b) Will this indirect effect differ between expert didactic videos and layperson testimonial videos?
Method
Participants
We recruited U.S. adults who were, or whose partner was, currently pregnant or breastfeeding, or who planned to have babies soon, because they were more likely than the general public to be concerned about the impacts of COVID-19 vaccines on fertility. An initial sample of 937 participants who met our eligibility criteria and provided consent was recruited from Lucid in August 2022 and then randomized to conditions. To check the success of randomization, we tested whether covariates would predict condition assignment. We compared a null model predicting condition assignment with a full model including all pre-treatment covariates (see “Measures” section for details). The non-significant results (chi-square = 118.31, p = .129) support balanced randomization in our experiment with regard to pre-treatment covariates.1 Since only 64 participants (6.8%) did not complete the questionnaire, we used listwise deletion and obtained an analytical sample of 873 completes. The majority were women (75.8%) and self-identified as White (64.03%); the average age was 29.62 (SD = 6.57). The modal annual household income was between $20,001 and $60,000 (47.54%), and slightly less than half of the sample (43.3%) held a high school diploma, its equivalent, or a lower degree as their highest level of education. More participants identified as Democrats (34.82%) than Republicans (24.74%). In subsequent analyses, we included covariates in all regression models to improve the precision of statistical estimation.
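The randomization check compares two nested models of condition assignment. The text reports a chi-square statistic but does not name the model family or the test, so the expression below is a sketch that assumes a standard likelihood-ratio comparison; $\ell$ denotes a maximized log-likelihood and $k$ the number of estimated parameters, both symbols introduced here for illustration.

```latex
\[
\chi^2 = -2\left[\ell_{\text{null}} - \ell_{\text{full}}\right],
\qquad
df = k_{\text{full}} - k_{\text{null}}
\]
```

Under the null hypothesis that no covariate predicts assignment, this statistic is asymptotically chi-square distributed with $df$ degrees of freedom. A non-significant result, such as the reported $p = .129$, gives no evidence that the covariates jointly predict the assigned condition.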
Materials
To mitigate concerns about case-category confounding (Slater et al., 2015), we selected and edited 15 videos endorsing the vaccine-infertility misinformation, 20 expert didactic correction videos, and 15 videos featuring layperson testimonials for correction, all from TikTok through keyword-based scraping and manual screening. Expert didactic videos were produced and posted by medical experts and healthcare professionals such as researchers, doctors, and nurses. In these videos, they debunked the misinformation with facts, the latest scientific evidence, and sometimes data visualizations; importantly, they did not use any personal stories. In contrast, layperson testimonials shared laypeople's positive experiences with COVID-19 vaccination during pregnancy. All videos were edited to range between 50 and 90 s. We only selected correction videos that did not include any original background music.
We further edited 15 unique pieces of instrumental background music, with no lyrics, to create a low-tempo version (edited via the software Audacity to have 60–90 beats per minute, or bpm) and a corresponding high-tempo version (120–160 bpm) for each piece. To enhance external validity, these original audio tracks were obtained from a different set of TikTok videos debunking COVID-19 misinformation. For each correction video, we randomly selected one piece of instrumental background music and added the two edited versions to this correction video to create its respective high- vs. low-tempo background music versions. Crossing the video type factor with the music tempo factor, we developed a total of 105 unique videos across all correction stimuli conditions.
Procedures
This study employed a 2 (debunking video type: expert didactic vs. layperson testimonials) × 3 (music tempo: no music vs. low-tempo vs. high-tempo) between-participants design, plus two set-aside control conditions (i.e., misinformation only, questionnaire only). After the screening questions and before randomization, participants answered questions about their vaccine uptake, their perceived COVID-19 vaccine effectiveness and safety, their experience with COVID-19, and their TikTok use in general. Participants assigned to the questionnaire-only condition did not view any stimuli and went directly to the remaining questionnaire assessing outcomes. Those randomized to the misinformation-only condition were first exposed to one randomly selected TikTok video claiming that COVID-19 vaccines lead to infertility before reporting their misbeliefs and demographics.
Participants assigned to the six correction conditions watched the misinformation video first before they were shown one debunking video. There were six versions of debunking videos, varying in video types and the presence of background music: expert didactic correction or layperson testimonials, accompanied by either no music, low-tempo music, or high-tempo music. These two factors were fully crossed. After viewing the debunking video, participants answered questions measuring the degree of counterarguing and their perceptions of the video. Lastly, participants proceeded to complete the recognition task to measure short-term encoding of the correction videos, their vaccine-infertility misperception, and demographics.
Measures
Manipulation check
To verify that expert didactic and layperson testimonial videos indeed differed in whether they featured personal stories and experiences, participants in the six correction conditions were asked how much they agreed that the video they watched “describes details about someone’s personal experiences related to COVID-19 vaccination,” “someone's thoughts and feelings on how COVID-19 vaccination affected their own pregnancy experiences,” and “the main character featured in this video appears to be a pregnant woman or a mom who just delivered a baby” (response options anchored by 1 = strongly disagree and 5 = strongly agree, α = 0.81, M = 3.26, SD = 0.99). The results confirmed that compared to participants who watched expert didactic videos, those exposed to layperson testimonial videos reported higher agreement (F(1, 621) = 119.8, p < .001). Furthermore, we collected additional data from a separate online sample (N = 82), using the same screening criteria, to more directly test whether expert didactic vs. layperson testimonial videos varied in perceived didacticism (see Supplementary Materials B for details). Results confirmed that participants exposed to layperson testimonial videos reported lower levels of didacticism compared to those who watched expert didactic videos (b = −1.00, t(60) = −4.67, p < .001). Because the tempo of the music was manipulated objectively through software that only changed bpm, a manipulation check was not necessary (O'Keefe, 2003).
Misbelief acceptance
Misperceptions about COVID-19 vaccines and infertility were measured by four questions asking how much participants agree or disagree with the following: “COVID-19 vaccine could lead to birth defects/will cause infertility/will be harmful for people who are pregnant or breastfeeding/will be harmful for people who plan to have a child.” (1 = strongly disagree to 5 = strongly agree). Responses were averaged to form a single score, with higher values indicating more misbelief (α = 0.91, M = 3.06, SD = 1.06).
Counterarguing
To measure counterarguing, participants were asked how much they criticized the video, were skeptical of what was being said in the video, and thought about points that went against what was being said in the video (Gardner & Leshner, 2016), while they were watching it (1 = not at all to 5 = very much). Responses were averaged before analysis (α = 0.82, M = 2.71, SD = 1.21).
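The scale construction described above (averaging items and reporting Cronbach's α) can be illustrated with a minimal sketch. The response matrix is hypothetical and is not the study's data; the function name `cronbach_alpha` is likewise an assumption of this example.

```python
from statistics import mean, pvariance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of participants, each a list of item responses."""
    k = len(rows[0])                                   # number of items
    item_vars = [pvariance([r[j] for r in rows]) for j in range(k)]
    total_var = pvariance([sum(r) for r in rows])      # variance of summed scores
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical 5-point responses from five participants on three items.
responses = [[1, 2, 1], [3, 3, 4], [5, 4, 5], [2, 2, 3], [4, 5, 4]]

scale_scores = [mean(r) for r in responses]   # one averaged score per participant
alpha = cronbach_alpha(responses)
print(scale_scores, round(alpha, 2))
```

The same variance estimator is used for the items and the total, so the ratio, and therefore α, does not depend on whether population or sample variance is chosen.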
Recognition of correction videos
Given the manipulation of background music, we focus on recognition of the audio tracks in the debunking videos. The recognition task consisted of 10 audio segments, with half extracted from the displayed debunking video and the other half from past videos posted by the same TikToker to ensure comparable vocal characteristics. Each audio segment was around 5–7 s long. For each segment, participants were asked to decide whether it was from the video they just watched. We followed the literature on signal detection (Macmillan & Creelman, 2004) and calculated each participant’s d-prime score to assess recognition accuracy. We first calculated the hit rate (i.e., rate of identifying correct information) and false-alarm rate (i.e., rate of identifying incorrect information) for each participant. We then computed the d-prime score (M = 4.80, SD = 5.98) by taking the difference between the two z-scores.
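As a rough illustration of the d-prime computation described above, the sketch below converts one participant's hit and false-alarm rates to z-scores and takes their difference. The log-linear correction for rates of 0 or 1 is an assumption of this sketch (the article does not say how extreme rates were handled), and the example counts are hypothetical.

```python
from statistics import NormalDist

def d_prime(hits, n_old, false_alarms, n_new):
    """Return d' = z(hit rate) - z(false-alarm rate).

    Assumption of this sketch: a log-linear correction (add 0.5 to each count
    and 1 to each trial total) keeps rates away from 0 and 1, where the
    z-transform is undefined.
    """
    hit_rate = (hits + 0.5) / (n_old + 1)
    fa_rate = (false_alarms + 0.5) / (n_new + 1)
    z = NormalDist().inv_cdf          # inverse standard normal CDF
    return z(hit_rate) - z(fa_rate)

# Hypothetical participant: recognized 4 of 5 segments from the watched video
# and wrongly endorsed 1 of 5 segments from other videos.
score = d_prime(hits=4, n_old=5, false_alarms=1, n_new=5)
print(round(score, 2))  # 1.35
```

Higher scores indicate better discrimination between segments that were and were not in the watched video; a score of zero indicates chance-level recognition.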
Covariates
A set of covariates were included in reported analyses to improve estimation efficiency: demographics (age, gender, race, ethnicity, education level, income, party identification, and rurality), epistemic beliefs (Garrett & Weeks, 2017), participants’ TikTok use, previous experience with encountering fact-checking, participants’ COVID-19-related experiences, COVID-19 vaccination status, perceived safety and effectiveness of COVID-19 vaccines (Nan et al., 2012), and perceived susceptibility to COVID-19. Measurement details and descriptive statistics can be found in Supplementary Materials C.
Statistical analysis
For H1 and H2, we performed multiple regression analyses with robust standard errors to test the differences between the questionnaire-only control condition, the misinformation-only control condition, and the six correction conditions. To test H3, H4, and RQ1, multiple regression with robust standard errors was carried out on the subset including only the six correction conditions to test the interaction between video types and background music conditions on misbelief, counterarguing, and recognition. In these regression analyses, covariates were added to increase precision in statistical estimation.
Regarding H5 and RQ2, we fitted and tested path models using the lavaan package in R, with 10,000 bootstrapped samples to quantify estimation uncertainties. The two mediators (counterarguing and recognition) were allowed to co-vary with each other and were regressed on the two dummies representing the contrasts against the high-tempo music condition. Then, the outcome variable misbelief was regressed on the two mediators as well as the music tempo contrasts. The dummy representing the contrast between the two video types was specified to moderate both the a-paths (from music tempo conditions to the two mediators) and the c-paths (direct effects of music tempo conditions on the outcome). Covariates were added to predict both the mediators and the outcome to improve causal inference (Imai et al., 2010).
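To make the logic of the moderated a-paths concrete, the following sketch shows how a conditional indirect effect and an index of moderated mediation are formed from path coefficients in a generic first-stage moderated mediation model. The symbols `a1`, `a3`, `b`, and `w`, and all coefficient values, are hypothetical and are not estimates from this study; in a bootstrap, these products would be recomputed on each resample.

```python
def conditional_indirect(a1, a3, b, w):
    """Indirect effect of X on Y through M at moderator value w,
    assuming M = a1*X + a2*W + a3*X*W + ... and Y = b*M + ..."""
    return (a1 + a3 * w) * b

def moderated_mediation_index(a3, b):
    """Change in the indirect effect per one-unit change in the moderator."""
    return a3 * b

# Hypothetical path coefficients, chosen only for illustration.
a1, a3, b = 0.10, 0.40, 0.25

effect_w0 = conditional_indirect(a1, a3, b, w=0)   # moderator dummy = 0
effect_w1 = conditional_indirect(a1, a3, b, w=1)   # moderator dummy = 1
index = moderated_mediation_index(a3, b)
print(effect_w0, effect_w1, index)
```

With a dichotomous moderator, the index equals the difference between the two conditional indirect effects, which is why a confidence interval for the index that excludes zero indicates that the indirect effect differs between the two groups.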
Results
Descriptive statistics of key variables for each of the eight conditions are presented in Table 1. As illustrated in Figure 2, compared with the questionnaire-only condition, participants in the misinformation-only control condition reported a higher level of misbelief (questionnaire-only vs. misinformation-only, b = −0.26, 95% CI −0.45, −0.06, p = .010), thus supporting H1. Regarding the effectiveness of corrections, most corrections succeeded in decreasing the misbelief as compared to the misinformation-only condition: expert didactic videos with high-tempo background music had the largest estimated coefficient (b = −0.56, 95% CI −0.78, −0.34, p < .001), followed by layperson testimonial videos with low-tempo music (b = −0.36, 95% CI −0.62, −0.10, p = .006), expert didactic videos with low-tempo music (b = −0.30, 95% CI −0.51, −0.09, p = .006), layperson testimonial videos with no music (b = −0.30, 95% CI −0.52, −0.08, p = .008), and expert didactic videos with no music (b = −0.25, 95% CI −0.46, −0.04, p = .029). The only exception was the layperson testimonial videos with high-tempo music (b = −0.19, 95% CI −0.39, 0.01, p = .062). H2 was partially supported. Supplementary Materials D presents a detailed breakdown of the estimated differences between the six correction conditions.

Effects of misleading vs. debunking TikTok videos on the misbelief that COVID-19 vaccines lead to infertility.
Notes. Contrasts with the misinformation-only condition from multiple regression analyses with robust standard errors. The outcome was misbelief acceptance (5-point scale, higher values indicating more misperceptions). Covariates were included.
*p < .05; **p < .01; ***p < .001.
Condition | n | Misbelief Mean | Misbelief SD | Counterarguing Mean | Counterarguing SD | Recognition Mean | Recognition SD |
---|---|---|---|---|---|---|---|
No message condition | 146 | 3.12 | 1.01 | NA | NA | NA | NA |
Misinformation condition | 104 | 3.33 | 1.00 | NA | NA | NA | NA |
Expert didactic and high-tempo music | 102 | 2.70 | 1.12 | 2.69 | 1.20 | 4.38 | 6.52 |
Expert didactic and low-tempo music | 121 | 3.05 | 1.09 | 2.69 | 1.19 | 4.29 | 5.72 |
Expert didactic and no music | 105 | 3.12 | 1.02 | 2.99 | 1.21 | 4.49 | 5.45 |
Layperson testimonial and high-tempo music | 109 | 3.21 | 1.01 | 2.77 | 1.18 | 4.60 | 6.05 |
Layperson testimonial and low-tempo music | 88 | 2.96 | 1.15 | 2.63 | 1.32 | 5.82 | 5.93 |
Layperson testimonial and no music | 98 | 2.95 | 1.02 | 2.46 | 1.14 | 5.52 | 6.17 |

NA = Not Applicable
Next, we examined interaction effects between video types (expert didactic vs. layperson testimonial) and music tempo (high-tempo vs. low-tempo vs. no music) on misbelief (H4), counterarguing (H3), and recognition (RQ1). Below, we focus on conditional models that adjust for covariates; results from unconditional models excluding covariates were similar. Numeric results for both sets of models are summarized in Table 2. The conditional effects of high-tempo background music by video type are visualized in Figure 3 to facilitate interpretation. Note that this figure uses 84% CIs, so that non-overlapping CIs can be interpreted as indicating that the two conditional main effects differ significantly at the p < .05 level, two-tailed (Maghsoodloo & Huang, 2010).
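The rationale for 84% CIs can be checked with a short calculation. The sketch below assumes two independent estimates with equal standard errors and a normal sampling distribution (simplifying assumptions of this example, not claims about the study's models) and asks what two-tailed p-value corresponds to two intervals that just touch.

```python
from math import sqrt
from statistics import NormalDist

norm = NormalDist()

def p_when_cis_just_touch(ci_level, se_a=1.0, se_b=1.0):
    """Two-tailed p for the difference between two independent estimates
    whose CIs of the given level just touch (normal approximation)."""
    z_ci = norm.inv_cdf(0.5 + ci_level / 2)     # CI half-width in SE units
    gap = z_ci * (se_a + se_b)                  # distance between the estimates
    z_diff = gap / sqrt(se_a**2 + se_b**2)      # test statistic for the difference
    return 2 * (1 - norm.cdf(z_diff))

p84 = p_when_cis_just_touch(0.84)
p95 = p_when_cis_just_touch(0.95)
print(round(p84, 3), round(p95, 3))
```

Under these assumptions, 84% intervals that just touch correspond to p of roughly .047, whereas 95% intervals that just touch correspond to a much smaller p (below .01), which is why non-overlap of 95% CIs is an overly conservative visual test.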

Video type moderating the effects of musical tempo on misbelief, counterarguing, and recognition.
Notes. Error bars represent 84% CIs for visual clarity of comparing between groups (see Maghsoodloo & Huang, 2010).
Interaction effects between musical tempo and video type on misbelief, counterarguing, and recognition
N = 623 | Misbelief b (95% CI): Unconditional | Misbelief b (95% CI): Conditional | Counterarguing b (95% CI): Unconditional | Counterarguing b (95% CI): Conditional | Recognition b (95% CI): Unconditional | Recognition b (95% CI): Conditional |
---|---|---|---|---|---|---|
With covariates or not | No | Yes | No | Yes | No | Yes |

Notes. Multiple regression analyses were conducted with robust standard errors, including both unconditional models (without covariates) and conditional models (with covariates). The outcomes were misbelief acceptance, counterarguing, and recognition, where higher values signify more misperceptions, counterarguing, and recognition, respectively.
*p < .05; **p < .01; ***p < .001.
First, regarding misbelief, the effects of high-tempo music on misbelief correction were moderated by video type, with larger correction observed for expert didactic videos than for layperson testimonial videos (low- vs. high-tempo music, b_interaction = 0.44, 95% CI 0.10, 0.78, p = .012; no music vs. high-tempo music, b_interaction = 0.41, 95% CI 0.11, 0.72, p = .008), as shown in Table 2. H4 was supported. When probing the conditional main effects of high-tempo music, we found that it significantly helped reduce misbelief only for expert didactic videos (vs. low-tempo music, b = −0.28, 95% CI −0.45, −0.12, p = .031; vs. no music, b = −0.33, 95% CI −0.49, −0.17, p = .013; the Holm method was applied to adjust for multiple comparisons), as presented in Figure 3. Second, regarding counterarguing, adding high-tempo background music (vs. the original no-music version) reduced counterarguing to a larger degree for expert didactic videos than for layperson testimonial videos (setting the high-tempo music conditions as the baseline, b_interaction = 0.49, 95% CI 0.06, 0.91, p = .024). We did not observe similar interaction effects when the comparison was with the low-tempo music conditions. H3 was thus partially supported. Despite significant interaction effects, the conditional main effects of high-tempo music on counterarguing were non-significant regardless of video type. Lastly, to address RQ1, we found that video type did not significantly moderate the effects of high-tempo background music on recognition, regardless of the comparison group. Nor did we find any significant conditional main effects.
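For readers unfamiliar with the Holm adjustment mentioned above, a minimal sketch of the step-down procedure follows. The raw p-values are hypothetical and are not taken from the study, and the function name `holm_adjust` is an assumption of this example.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])   # smallest p first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted p-values must not decrease as raw p-values increase.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

raw = [0.010, 0.040, 0.030]   # hypothetical raw p-values, not from the study
print(holm_adjust(raw))
```

Compared with a plain Bonferroni correction, which multiplies every p-value by the number of tests, the Holm method applies progressively smaller multipliers and is therefore never less powerful while still controlling the family-wise error rate.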
Finally, results from the moderated mediation analyses are visualized in Figure 4. The model demonstrated a good fit: χ2(93) = 628.62, p < .001; CFI = 1.00; TLI = 1.00; RMSEA = 0.00; SRMR = 0.00. Comparing correction videos with high-tempo background music with their no-music counterparts, the mediational pathway (high-tempo music reducing misbelief through mitigating counterarguing) was marginally significant for expert didactic videos (indirect effect = 0.05, bootstrapped 95% CI −0.01, 0.12), and this indirect effect was significantly larger than that for layperson testimonial videos (moderated mediation index = 0.10, bootstrapped 95% CI 0.01, 0.20). However, this indirect effect was non-significant when the comparison group was low-tempo music videos, regardless of video type. Nor did we find significant moderated mediation effects for this comparison. H5 was partially supported. To address RQ2, we tested the indirect effect through recognition and whether video type moderated this mediational pathway. The results were all non-significant. Therefore, the evidence partially supported the correction-by-distraction hypothesis but not the correction-by-recognition hypothesis.

Moderated mediation analyses: video type moderating the indirect effects of background music tempo on misbelief through counterarguing vs. through recognition.
Notes. Structural equation modeling was conducted with bootstrapping (resamples = 10,000) to test the moderated mediation path model. Error bars represent 84% CIs for visual clarity of comparing between groups (see Maghsoodloo & Huang, 2010).
Discussion
Through an online experiment using a large and realistic stimuli pool from the popular short-video social media platform TikTok, we demonstrated that videos promoting the false claim that COVID-19 vaccines lead to infertility indeed increased related misperceptions among a national sample of participants for whom vaccination and infertility were likely to be an issue of concern. This highlights the importance of studying multimodal misinformation on increasingly popular short-video platforms (Lu & Shen, 2023; Yang et al., 2023). Furthermore, while a substantial number of the most-viewed TikTok videos related to COVID-19 vaccination are pro-vaccine and created by non-institutional sources including medical experts, public health professionals, and ordinary social media users (Kampen et al., 2022; Yang et al., 2023), few studies have examined the effectiveness of such citizen fact-checking efforts. Echoing a recent study (Bhargava et al., 2023), we showed that citizen-contributed debunking videos are largely effective, irrespective of the source and the persuasive strategies tested in this study. These findings confirmed the value of citizen fact-checking (Micallef et al., 2020) as an integral part of the “whole-of-society” approach to combating health misinformation online.
Although medical professionals often lack institutional support or incentives to create TikTok content for health promotion, including misinformation correction, our study builds on encouraging evidence showing that videos by medical “influencers” tend to garner more engagement (e.g., likes, comments, shares, and followers) than laypeople-produced videos (Kampen et al., 2022; Yang et al., 2023). Our study demonstrates that medical TikTokers’ debunking videos are not only likely to excel in the competition for attention online but can also effectively correct misperceptions. Because their relative advantages over laypeople’s debunking videos depend upon features such as high-tempo background music, future research is warranted to identify other message features that can enhance the effectiveness of debunking messages by medical experts and public health professionals. We also encourage future research to investigate the potential of citizen fact-checking to complement, and potentially enhance, institutional fact-checking at the macro level.
Theoretically, our study highlights the need to explore how message structural features such as audiovisual elements, in addition to content appeals, may influence multimodal debunking messages. We also identified distracted counterarguing as one promising cognitive mechanism, grounded in LC4MP and PRT, to account for the effects of such structural features. As research on mis-/disinformation debunking expands to multimodal messages (Lu & Shen, 2023; Peng et al., 2023), it is crucial to theorize and empirically examine how audiovisual features affect the message recipient’s motivational systems, cognitive resource allocation, and ultimately the success of debunking efforts. For example, one recent study has shown that the presence of background music in COVID-19 vaccine TikTok videos correlated with fewer shares (Yang et al., 2023). Our study found that high-tempo background music can facilitate correction, albeit largely restricted to expert didactic videos. Taken together, this growing body of evidence begins to paint a fuller picture of the roles of background music in TikTok debunking videos, unpacking and differentiating its impacts on share-worthiness vs. persuasiveness. Since TikTok offers a suite of tools (Lundy, 2023; Yang et al., 2023) to easily incorporate audiovisual features (e.g., readily available sound effects, audio track clips from an audio library, lip-synching, duetting, and stitching), there are plenty of opportunities to research their implications for online engagement and misinformation debunking effectiveness.
More importantly, our findings demonstrate that incorporating high-tempo background music enhances expert didactic debunking videos more than layperson testimonials. Furthermore, this relative advantage was mediated by reduced counterarguing rather than increased encoding, which contributes to recent theorization integrating LC4MP with PRT (Clayton, 2022; Clayton et al., 2018, 2020). Recent work has shown that freedom-threatening content features in health promotional messages (e.g., dogmatic language) can undermine persuasiveness by inducing psychological reactance, which exhibits the expected psychophysiological pattern (e.g., skin conductance, corrugator muscle activation, heart rate deceleration) of an activated aversive motivational system and defensive message processing (Clayton, 2022; Clayton et al., 2020). Our study suggests that employing high-tempo background music in expert didactic videos (vs. layperson testimonial videos) reversed or at least mitigated this process to a higher degree, largely by distracting viewers from counterarguing, the cognitive component of psychological reactance. In other words, unlike prior work that has manipulated content features to activate both counterarguing and anger, we focused on the cognitive component and employed a structural message feature, namely high-tempo background music, to suppress counterarguing. If reactance is indeed central to the defensive processing of multimodal correction messages, its suppression, particularly reduced counterarguing, should enhance correction success, especially for expert didactic videos.
Although this study focuses on the cognitive aspect of reactance, we encourage future research to include the affective component of anger and psychophysiological measures, to better characterize the “fight” response in defensive message processing (Clayton, 2022; Clayton et al., 2020). Furthermore, based on prior comparisons of didactic to testimonial messages (Gardner & Leshner, 2016; Green, 2006; Moyer-Gusé, 2008; Ratcliff & Sun, 2020), we expect didactic debunking videos to have induced a greater level of perceived freedom threat compared to testimonials, although this assumption warrants further empirical validation.
Notably, in our study, high-tempo music did not improve message encoding in expert debunking videos (i.e., the correction-by-recognition hypothesis), which seems contradictory to past research where background music was found to increase attention and improve information encoding (Allan, 2006; Kellaris et al., 1993). One possibility is that fast-paced background music re-directed attention away from the core arguments in the correction message towards processing the music, resulting in fewer available cognitive resources for message encoding (Shevy & Hung, 2013). Another speculation has to do with the measurement for recognition—in this study, we focused on the recognition of audio clips, which might be particularly susceptible to interference from background music. Perhaps the background music had indeed improved message encoding, much of which was conveyed by visual elements in TikTok videos. Although focusing on audio recognition is common in past research (Clayton et al., 2018, 2020), we encourage future research to employ alternative measures for visual information.
Taken as a whole, our data support the correction-by-distraction hypothesis but not the correction-by-recognition hypothesis. Although the correction-by-distraction hypothesis does not preclude the possibility that high-tempo music might have reduced cognitive resources for processing the main message along with those for counterarguing, in our data this possibility did not materialize. Two important stimuli and design features might help explain the overall null effects on recognition and why the encoding of the main persuasive message was not impaired. First, the extensive verbal and visual elements of the debunking videos were attention-grabbing and likely preserved the requisite attention for processing, despite distraction from background music. Second, all background music we tested was instrumental, without lyrics; past research (Salamé & Baddeley, 1989) suggests that such music is less distracting and disruptive than vocal music. This aligns with the findings by Jeong and Hwang (2015), where music videos impaired counterarguing but not comprehension. Similarly, Pool et al. (2000, 2003) found that foreign-language music videos in the background did not adversely affect students’ performance on either paper-and-pencil or memorization assignments. Admittedly, given the absence of a vocal music condition, we cannot empirically test this speculation, which we encourage future research to tackle.
This study has several limitations. First, our study cannot isolate the independent effects of information presentation style (i.e., didactic vs. testimonials) from source credibility (i.e., medical experts vs. laypeople). That said, it is worth reiterating that the core audiovisual feature of interest to this study, that is, the presence of background music, was independently and exogenously manipulated. In the main analyses, we treated video type (i.e., expert didactic videos vs. layperson testimonial videos) as a moderator. Additional analyses further revealed significant simple main effects of video type on counterarguing when no music was presented: expert didactic videos still induced slightly more counterarguing than layperson testimonials (see Supplementary Materials E), even though the higher credibility associated with medical experts (Reiter et al., 2020) could have mitigated psychological reactance. This tendency to induce counterarguing was mitigated by adding high-tempo music, where these two types of videos no longer differed. Furthermore, it was also in the high-tempo music condition that expert didactic videos outperformed layperson testimonials in correcting misperceptions (see Supplementary Materials E), consistent with the correction-by-distraction hypothesis. Therefore, we believe the advantages of testing a large pool of realistic TikTok videos high in external validity outweigh the limitation of mixing presentation style with source. Future research can explore additional moderators at both the message and the individual levels using a multilevel modeling approach (e.g., Southwell, 2005) to further clarify the scope conditions influencing the effects of music.
Second, causal validity can be compromised in mediation analyses (Imai et al., 2010). In this study, we made efforts to reduce potential confounding through including a wide range of pre-treatment covariates (Imai et al., 2010). Although complete elimination of confounding is impossible given our current mediation analyses, our main conclusions are drawn from the general pattern across several analyses, including main and interaction effects (e.g., H1–H4) that are not susceptible to this causal inference challenge. This implies that although our data supported reduced counterarguing as one plausible mediating pathway that was more prominent for expert didactic debunking videos, we cannot rule out all other possible mediating mechanisms. Per reviewers’ suggestions, we explored the plausibility of perceived usefulness and para-social identification as mediators, since background music may increase overall positive message evaluation (Kellaris et al., 1993; North et al., 2016) and perceived identification with the music creator (LaMarre et al., 2012; Maher et al., 2013), which could in turn enhance debunking effectiveness. The results were non-significant (see Supplementary Materials F). We encourage future research to explore other mediating mechanisms, such as emotional processing.
Conclusion
During the COVID-19 pandemic, TikTok emerged as a battleground where misinformation and citizen-contributed correction videos compete for public attention and engagement. Building upon recent observational studies, we experimentally studied the impacts of TikTok videos endorsing the false claim that COVID-19 vaccines lead to infertility. We also examined the efficacy of multimodal debunking videos varying in type (medical experts using a more didactic style vs. laypeople employing testimonials) and in a structural audiovisual feature (high- vs. low-tempo background music). Our results showed that high-tempo background music enhanced the effectiveness of expert didactic debunking videos to a greater degree than that of layperson testimonial videos. This effect was mediated through reduced counterarguing rather than through improved encoding. As research on misinformation correction expands to consider multimodal corrections, our findings demonstrate the relevance of LC4MP and PRT as valuable theoretical frameworks and underscore the importance of considering structural message features such as background music.
Supplementary material
Supplementary material is available online at Journal of Computer-Mediated Communication.
Data availability
All message stimuli, the questionnaire used in the study, replication datasets, and R code can be found in the Open Science Framework online repository for this study at https://osf.io/b5zxa/.
Funding
This work was supported by funding from the John S. and James L. Knight Foundation (Award Number: MSN231314) and the National Science Foundation through the Convergence Accelerator Track F: Course Correct: Precision Guidance Against Misinformation (Agency Tracking Number: 2230692; Award Number: MSN 266268). Additional support for this research was provided by the Office of the Vice Chancellor for Research and Graduate Education at the University of Wisconsin-Madison with funding from the Wisconsin Alumni Research Foundation (Award Number: MSN231886) awarded to SY.
Conflicts of interest: None declared.
Notes
Further analysis of covariates’ variance across conditions is presented in Supplementary Materials A.
References
Author notes
All authors contributed equally to this work.

