Noah Castelo, Johannes Boegershausen, Christian Hildebrand, Alexander P Henkel, Understanding and Improving Consumer Reactions to Service Bots, Journal of Consumer Research, Volume 50, Issue 4, December 2023, Pages 848–863, https://doi.org/10.1093/jcr/ucad023
Abstract
Many firms are beginning to replace customer service employees with bots, from humanoid service robots to digital chatbots. Using real human–bot interactions in lab and field settings, we study consumers’ evaluations of bot-provided service. We find that service evaluations are more negative when the service provider is a bot versus a human—even when the provided service is identical. This effect is explained by consumers’ belief that service automation is motivated by firm benefits (i.e., cutting costs) at the expense of customer benefits (such as service quality). The effect is eliminated when firms share the economic surplus derived from automation with consumers through price discounts. The effect is reversed when service bots provide unambiguously superior service to human employees—a scenario that may soon become reality. Consumers’ default reactions to service bots are therefore largely negative but can be equal to or better than reactions to human service providers if firms can demonstrate how automation benefits consumers.
INTRODUCTION
Service bots—defined as autonomous systems such as physical robots and digital chatbots that interact with, communicate with, and deliver service to customers (Wirtz et al. 2018)—are used to provide customer service in many industries, including retail, restaurants, hotels, and hospitals (Dass 2017; Nguyen 2016; Simon 2015). In 2022, the market value of physical robots for automating customer service amounted to $9.3 billion worldwide (Statista 2022), while the chatbot market was valued at $3.8 billion in 2021 (Mordor Intelligence 2021). Furthermore, the use of service bots increased during the COVID-19 pandemic as a means to reduce human-to-human contact (Lee 2021). However, several robotics firms have recently paused or ceased operations (Hoffman 2019), including SoftBank's Pepper unit (Nussey 2021), and the chatbot industry has faced similar cycles of hype and disappointment (CB Insights 2021). Given this mixed evidence, it remains unclear what factors contribute to the success or failure of service bots in the marketplace.
In this article, we study consumers’ perceptions of service bots and the firms that use them. We find that consumers believe that the use of service bots is motivated by cost cutting and profit maximization for the firm at the expense of the customer experience. This perception has negative implications for firms using service bots, including decreased satisfaction with the service and less willingness to patronize and share positive word of mouth about the firm. These effects emerge for both physical and digital service bots. We further explore boundary conditions in which this belief may not hold and in which customers’ evaluations of bot-provided service might equal or even exceed evaluations of service provided by humans. Our findings suggest that the negative effect of using service bots on consumers’ perceptions of firms and the service they provide can be attenuated by passing on the cost savings derived from using service bots to consumers in the form of lower prices. Furthermore, service bots that provide unambiguously superior service compared to human employees can reverse these effects, resulting in higher service evaluations relative to a human employee.
Using real consumer–bot interactions, we contribute to the emerging literature on consumer reactions to automation in commercial settings (Bergner, Hildebrand, and Häubl 2023; Luo et al. 2019; Rust and Huang 2012) by documenting how consumers perceive firms that use bots to automate customer service, how those perceptions shape service evaluations, and how the negative effects of service automation can be attenuated and reversed. In addition to these substantive contributions, we contribute to a theoretical understanding of consumer reactions to service automation by identifying consumer perceptions of the firm using service automation as a critical aspect that has received limited attention in prior consumer research. Existing research in this area has focused predominantly on how perceptions of the automation technology itself shape consumer reactions. However, the use of new technologies does not occur in a vacuum: consumers are likely to make inferences about the firm using technologies such as service bots and its motivation for doing so. Our work highlights the importance of consumers’ attributions for why a firm might be introducing a service automation technology and how these perceptions, in turn, shape service evaluations and related firm outcomes.
RELATED RESEARCH
Research comparing the use of service bots to the use of humans in commercial contexts has been accelerating in recent years. We conducted a literature review by consulting the reference lists from four recent literature reviews (Blaurock et al. 2022; Burton, Stein, and Jensen 2020; Mende et al. 2019; Van Pinxteren, Pluymaekers, and Lemmink 2020) and by extracting records from the Web of Science based on the terms “robot*” and “management OR psychology OR business,” searching for articles in which humans and service bots were directly compared to each other. We identified several important shortcomings in this literature. First, there are very few studies of consumers’ actual, interactive experiences with bots—most studies use hypothetical scenarios in which participants are asked to imagine interacting with a bot, which may not capture consumers’ real experiences with bots. Second, we are aware of no research on how the use of bots affects perceptions of the firm using them, which we argue is a key factor in shaping consumer reactions to service bots. Third, there are few studies of managerially relevant interventions for improving perceptions of bots and the firms using them. Finally, there are few field studies in commercial settings. Table W1 in the web appendix summarizes our literature review. Our research fills these gaps in the literature, contributing both external validity and managerial relevance to the study of consumer reactions to service bots.
Interestingly, despite the growing popularity of the "algorithm aversion" concept in which consumers sometimes prefer to rely on humans rather than on bots (Burton, Stein, and Jensen 2020), our literature review also established that there is no consensus regarding whether consumers react more positively to humans compared to bots. Some articles document such an effect (Luo et al. 2019; Mende et al. 2019), while others find the opposite pattern (Hoorn and Winter 2018; Yueh et al. 2020) and still others observe ambivalent reactions (Desideri et al. 2019). This preference is highly context dependent, shaped by factors such as the nature of the task being automated (Castelo, Bos, and Lehmann 2019; Longoni and Cian 2022). One practical lesson from this literature review is therefore to avoid over-generalizing algorithm aversion, recognizing instead that consumers sometimes react more positively to humans than to bots, while at other times the exact opposite is true.
The closest published research to our own uses a field study to test the effectiveness of voice-based chatbots in a sales context (Luo et al. 2019). This research finds that bots can close sales of financial products as effectively as human agents unless consumers are made aware that the bot is not a human. However, the intervention implied by this finding is to withhold the bot's identity, an approach that may violate the General Data Protection Regulation in the European Union and the B.O.T. ("Bolstering Online Transparency") bill in California (Butterworth 2018; DiResta 2019), which require disclosing a bot's identity and/or require that consumers be able to opt out of having their data processed by an algorithm. Alternative interventions to improve consumer reactions to service bots are therefore needed. Furthermore, while bots might also be used for sales roles, customer service is the predominant application area for bots in the marketplace (Kannan and Bernoff 2019), which is our focus here.
In sum, despite the fast-paced development of the emerging literature on service bots, it remains unclear how the use of service bots (especially in externally valid customer service settings) affects customers' perceptions of the firm using the bots and consumers' service evaluations, as well as what can be done to improve these outcomes.
HYPOTHESIS DEVELOPMENT
Prior conceptual work has suggested that deploying automation involves a direct tradeoff between firm-benefiting outcomes such as service productivity and customer-oriented outcomes such as service quality (Rust and Huang 2012). We are not aware of any research examining how cost-cutting attributions vary depending on whether service is provided by a human or by a machine—that is, whether a firm seems more motivated by cost cutting and less by providing high-quality service. Nevertheless, some adjacent literature streams offer relevant insights.
Specifically, research has found that consumers perceive profit-driven corporate behavior as being in conflict with social good, highlighting the role of a firm’s perceived intentions in shaping evaluations of related outcomes (Bhattacharjee, Dana, and Baron 2017). Prior work suggests that people hold unfavorable intuitive lay beliefs about corporations (Jago and Pfeffer 2019; Rai and Diermeier 2015). For-profit firms are stereotypically perceived as being high in competence but low in warmth (Aaker, Vohs, and Mogilner 2010), further suggesting that firm intentions are typically perceived as being self-interested rather than benevolent. These general perceptions of firm motivations suggest that consumers are likely to perceive the use of service bots as being motivated by firm benefits more so than by customer benefits. Specifically, we hypothesize that a firm using bots to automate customer service will be seen as motivated by cutting costs at the expense of the customer experience, relative to a firm using human employees to provide customer service.
Note that we do not assume that cost cutting and improving the customer experience are (in reality) the polar ends of a single dimension, such that cutting costs necessarily implies a worse customer experience. Indeed, in some cases, firms may adopt a dual emphasis strategy—focusing on customer satisfaction and cost reduction simultaneously (Rust, Moorman, and Dickson 2002). While such a dual emphasis may be successful, enabling high-quality customer service and cost cutting simultaneously (Wirtz and Zeithaml 2018), pursuing such a strategy is often challenging (Rust et al. 2002), tends to be less financially rewarding in the short run (Mittal et al. 2005), and may require sizeable investments to be effective (Mithas and Rust 2016). Thus, the two goals often represent a tradeoff for firms. We propose that consumers intuitively perceive these two goals to be inversely related, such that a firm using service bots (relative to a firm that does not) is perceived as caring more about cutting costs and less about improving the customer experience. However, we will test whether directly refuting this intuition via passing cost savings on to the customer can be effective at improving reactions to service bots.
Perceiving a firm as prioritizing cost cutting at the expense of the customer experience should negatively impact a range of consequential variables, such as consumers’ service evaluations and willingness to recommend the firm to others. We therefore measure a range of closely related outcomes, including customer satisfaction, word of mouth intentions, and interest in patronizing the firm in question. This is in line with prior conceptual and empirical work suggesting that these firm outcomes often have similar antecedents and consequences (Zeelenberg and Pieters 2004; Zeithaml, Berry, and Parasuraman 1996). Consumer satisfaction with a firm’s service goes hand in hand with spreading positive word of mouth and a higher likelihood of patronizing the firm. For simplicity of exposition, we will refer to this set of firm outcomes as “service evaluation” while recognizing that satisfaction, word of mouth, and patronage are conceptually distinct but closely related outcomes.
Importantly, our proposed mechanism, cost-cutting attributions, should hold even when bots and humans provide identical service: consumers should still evaluate bot-provided service less favorably, even if the bot delivers human-level service, because the firm will still be perceived as caring more about its own welfare (i.e., cost savings and profit maximization) than about the customer. Several of our studies therefore hold service quality constant across conditions, merely altering customers' beliefs regarding the nature of the service provider (human vs. bot). This suggests that our central finding should be robust to improvements in technology until these improvements are so substantial that the bot-provided service offers a clearly superior customer experience compared to the human service provider—a boundary condition that we also test and that may soon become a reality with the advent of large language models that perform many text-based tasks at human levels (OpenAI 2023). Similarly, our proposed mechanism should hold regardless of the bot's specific features, including whether the bot is physically embodied like the Pepper service bot or digitally instantiated like a chatbot. Whereas a great deal of research explores how design factors such as anthropomorphism affect consumer reactions to bots (Blut et al. 2021), our key findings should be robust to such design factors, and we view this as an important contribution toward establishing the generalizability of our results.
We also propose a way of mitigating these negative reactions. If negative effects on service evaluation are driven (in part) by the perception that using service bots is motivated by cost cutting at the customer’s expense, managers may be able to attenuate these effects by communicating or demonstrating that the motivation for using service bots is to benefit the consumer directly. Thus, in addition to supporting our proposed mechanism through mediation analyses, we also provide process evidence by testing interventions that refute cost-cutting attributions.
We do this in two ways. First, we test whether providing a price discount to consumers who receive automated service can eliminate the gap in service evaluations between human- and bot-provided service. This approach is intended to counteract the belief that service bots are used to cut costs to benefit the firm at the expense of consumers by also providing consumers with a direct financial benefit. Insofar as consumers attribute the use of service bots (but not human employees) to cost-cutting motivations, providing a discount should specifically refute these attributions and improve evaluations of bot-provided service more than human-provided service. Second, we provide further process evidence by imagining a future in which service bots provide unambiguously superior service relative to human employees (i.e., much faster and just as accurate or effective). The recent release of chatbots based on large language models such as OpenAI's ChatGPT and GPT-4 suggests that this may indeed be possible soon. Bots that clearly outperform human service employees could also refute the cost-cutting attribution by making the customer benefits of service bots salient during the service experience. We test whether these two approaches are sufficient to improve evaluations of service bots to the level of human employees or even beyond.
Our central hypotheses are that (a) customers evaluate service provided by bots less favorably than service provided by humans, even when bots and humans provide identical service, (b) these effects are driven in part by the perception that the firm's motivation for using service bots is to cut costs at the expense of the customer experience, (c) these effects can be attenuated when firms share some of the economic surplus from service automation with consumers via discounts, and (d) these effects can be reversed when bot-provided service is unambiguously better than human-provided service.
OVERVIEW OF STUDIES
We report six studies testing consumers' reactions to firms' use of service bots across a broad range of settings, participant samples, and industries. All data and code are available at https://osf.io/vm6ax/. Full stimuli are included in the web appendix. We replicate our key findings across lab and field studies, real human–bot interactions, and scenario studies, all with large samples. Our use of field research and real human–bot interactions reflects the prioritization of real-world marketplace phenomena (van Heerde et al. 2021), with interventions to improve consumer reactions to service bots geared toward managerial practice.
We begin with a pilot study showing that consumers perceive service automation to be motivated by firm benefits (i.e., cost cutting) at the expense of customer benefits. The remaining studies build on this key finding, demonstrating the negative consequences that firms face by automating customer service using bots. Study 1 shows that actual customers of a coffee shop evaluate the service provided by a widely used physical bot less favorably than when the service was provided by a human. Study 2 replicates the field study in an online context, providing evidence for cost-cutting attributions as a mechanism. Study 3 shifts to digital bots to establish that the focal effect generalizes from physical to digital service bots and provides a critical test of the proposed cost-cutting mechanism. This study demonstrates that providing a discount to customers who interact with a bot (versus a human service employee) eliminates the gap in evaluations of bot- and human-provided service. Studies 4a and 4b examine consumer reactions to a bot that is significantly faster than—and just as competent as—a human employee, since such a bot can provide a salient benefit to customers rather than enriching only the firm at the expense of the customer.
PILOT STUDY
Our pilot study provides initial evidence of consumers’ beliefs about what motivates firms’ use of service bots. We hypothesized that benefits to the firm, such as cost-cutting, would be seen as the primary motivation for using bots and that perceived benefits to the firm would be negatively correlated with perceived benefits to customers.
Method
Two hundred eighty-seven undergraduate students at a large Canadian university completed this study in exchange for course credit (55% female, Mage = 20). Participants first read some general information on how firms are using technology to automate customer-facing jobs, such as chatbots providing customer service online and robots working as cashiers in cafés. These examples were purely descriptive to avoid potential demand effects (see web appendix for the complete stimuli used in this and all other studies). Participants then indicated their agreement with statements assessing the belief that such automation is primarily intended to benefit consumers (“the main goal of most automation initiatives or projects is to improve the customer experience,” “the main goal of most automation initiatives or projects is to improve service quality”; r = 0.68) and the belief that automation is predominantly intended to benefit the firm (“the main goal of most automation initiatives or projects is to cut the firm’s costs,” “the main goal of most automation initiatives or projects is to maximize profits”; r = 0.57). This study also included a subsequent ancillary between-subjects manipulation. We describe additional measures and corresponding results in the web appendix.
Results and Discussion
As illustrated in figure 1, participants believed that the use of service automation is intended to benefit the firm (M = 7.98, SD = 1.23) more than consumers (M = 5.48, SD = 2.14, t(286) = 16.10, p < .001, dz = 0.95). The belief that the goal of automation is to benefit the firm was negatively correlated with the belief that the goal of automation is to benefit consumers (r = −0.16, p = .008).

FIGURE 1
CONSUMERS' PERCEPTIONS OF THE MOTIVATION FOR SERVICE AUTOMATION
Notes.—The boxplot represents the interquartile range (IQR) and median of consumers' perceptions of firms' goals for service automation (i.e., customer- vs. firm-focused motives). Whiskers extend to 1.5 IQR beyond the lower and upper quartiles. Dots represent individual observations.
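To make the pilot analysis concrete, the following minimal Python sketch shows how the paired comparison, the within-subject effect size (Cohen's dz), and the correlation between the two beliefs could be computed. The authors' own analyses (available at the OSF link above) were not necessarily run this way; the file and column names below are hypothetical.

```python
# Illustrative sketch only; "pilot_study.csv", "firm_benefit", and
# "customer_benefit" are hypothetical names, not the authors' actual files.
import pandas as pd
from scipy import stats

df = pd.read_csv("pilot_study.csv")
firm = df["firm_benefit"]      # perceived firm benefit (e.g., cost cutting)
cust = df["customer_benefit"]  # perceived customer benefit (e.g., service quality)

# Paired t-test: firm benefit vs. customer benefit, rated by the same participants
t_stat, p_val = stats.ttest_rel(firm, cust)

# Cohen's dz for a within-subject comparison: mean difference / SD of differences
diff = firm - cust
dz = diff.mean() / diff.std(ddof=1)

# Correlation between the two beliefs (negative in the pilot data)
r, p_r = stats.pearsonr(firm, cust)

print(f"t = {t_stat:.2f}, p = {p_val:.3f}, dz = {dz:.2f}, r = {r:.2f}")
```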
Using a finite mixture model approach (see web appendix for full details), we explored whether some consumers might perceive a win-win from service automation. However, both the two- and three-component mixture solutions led to consistently negative parameter estimates for the firm-benefit coefficient. While the effect was weaker for some latent mixture components or clusters, the relationship was consistently negative, in line with our theorizing.
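The web appendix describes the authors' finite mixture specification; as a rough, hedged analogue, the sketch below fits Gaussian mixtures with two and three components to the two perception measures and inspects the implied within-component slope of customer benefit on firm benefit. File and column names are again hypothetical, and this is not the authors' model.

```python
# Rough analogue of a latent-class exploration; not the authors' specification.
import pandas as pd
from sklearn.mixture import GaussianMixture

df = pd.read_csv("pilot_study.csv")  # hypothetical file name
X = df[["firm_benefit", "customer_benefit"]].to_numpy()

for k in (2, 3):
    gm = GaussianMixture(n_components=k, n_init=20, random_state=1).fit(X)
    for j in range(k):
        cov = gm.covariances_[j]
        # Implied within-component slope of customer benefit on firm benefit
        slope = cov[0, 1] / cov[0, 0]
        print(f"k={k}, component {j}: weight={gm.weights_[j]:.2f}, slope={slope:.2f}")
```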
The extent to which this belief is accurate is not entirely clear, given that common goals for developing customer service chatbots include both cost cutting and improving the customer experience (Kannan and Bernoff 2019). In informal interviews with CMOs, marketing managers, and CX specialists, we found heterogeneity regarding firms’ motives for using service bots. For example, some managers reported that bot usage was primarily motivated by cost savings. Others reported a dual emphasis on both realizing cost savings and improving the customer experience (e.g., cutting costs by automating simpler customer requests to free up additional resources for more labor-intensive customer requests). Importantly, regardless of firms’ true intention, the results from our pilot study suggest that consumers believe that automating customer service is mostly motivated by benefits to the firm as opposed to the customer and that increases in firm benefits come at the expense of customer benefits.
STUDY 1
To provide evidence for our predictions in an actual service setting, we conducted a field study in collaboration with a coffee shop in which customers ordered their coffee either from a human employee or from a Pepper robot. Pepper is a humanoid service robot used by over 2,000 companies globally for a variety of customer service roles (SoftBank Robotics 2020). This study allows us to study real consumer reactions to service automation in a naturalistic setting. To our knowledge, this is one of the first field studies comparing consumers’ reactions to a humanoid service robot versus a human service provider in a naturalistic commercial context.
Method
Participants and Procedures
One hundred and nine participants (38% female, Mage = 34) participated in this field study. The study took place at a coffee shop on a business campus in the Netherlands. All data were collected on four days over two consecutive weeks. Due to the logistics involved with the supervision of the service robot, the bot condition was conducted first and the human condition afterward, each on two separate days (i.e., Monday and Tuesday). Data collection took place in coordination with the coffee shop, ensuring that no special events were scheduled during the study. On all four days, the study was run from 9 am until 2 pm.
The coffee shop was usually operated by two service employees: one taking customer orders and another preparing the coffee. For the duration of the field study, the first employee was replaced either with Pepper or with a confederate. As the Pepper robot had never been used by this coffee shop before our study, it was custom programmed for our context. To ensure consistency across conditions, in the human condition, we replaced the regular employee with a confederate—an experienced barista who was specifically trained on the service script for the study. The same service script was used by both the robot and the human confederate to ensure consistent levels of service and competency. On rare occasions where Pepper faced technical issues or did not recognize speech commands (in approximately 1 out of 10 cases), customers were approached and asked to place their order with a colleague instead and were not included in the study. All customers who were assigned to the human barista condition were included in the study. The function and script of the second employee were constant across conditions.
One at a time, participants in the bot condition were greeted by the robot, whose screen displayed the coffee menu. Customers ordered verbally as they would from a human employee and did not use Pepper’s touchscreen. The orders were of similar complexity across conditions. However, when customers wanted to order something other than what was displayed on the menu, the robot redirected them to a human service employee, and hence, they could not participate in the study. This was the case in approximately one out of five interactions. For all other customers, at the end of the ordering interaction, the robot announced that another employee (i.e., the human employee) would prepare the beverage and take the payment.
In the human condition, customers ordered with the human confederate, following the identical process as described above. After receiving their order, participants were asked to fill out a short pen-and-paper questionnaire about their service experience.
Measures
Customers first provided their service evaluation (“I would recommend this coffee bar to a friend or colleague” and “I am satisfied with my service experience today” on 7-point scales anchored at “strongly disagree” and “strongly agree,” r = 0.63).
Because reactions to service bots may depend on customers’ prior attitudes toward new technologies, we measured participants’ attitudes toward technology using two items from the Affinity for Technology Scale (Fleming and Artis 2010): “I have a generally positive attitude towards new technologies,” and “I am generally enthusiastic about new technologies,” r = 0.80. Finally, we measured participants’ age, gender (0 = female; 1 = male), and frequency of visiting the coffee shop (1 = “At least daily,” 2 = “At least weekly,” 3 = “At least monthly,” 4 = “Less than monthly”).
Results and Discussion
Participants' service evaluations were significantly worse in the bot condition (M = 5.34, SD = 1.28) than in the human condition (M = 5.81, SD = 0.91, t(105) = 2.21, p = .029, d = 0.43). Two participants did not complete both items used to measure service evaluations and are excluded from this analysis (including these participants based on the single item yields substantively identical results). We estimated a linear model to test whether this effect was robust when controlling for technology affinity, age, gender, and frequency of visiting the shop. The effect of condition remained significant and consistent in magnitude, b = −0.58, SE = 0.22, t(101) = 2.62, p = .010. We also observed a marginally significant positive effect of technology affinity (b = 0.19, SE = 0.11, t(101) = 1.71, p = .09) and a significant effect of gender (b = −0.77, SE = 0.23, t(101) = 3.38, p = .001) such that male consumers' evaluations were less positive (M = 5.38, SD = 1.18) than those of female consumers (M = 5.91, SD = 0.94, t(105) = 2.42, p = .017, d = 0.48). No other effects were significant (all ps > .42).
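A minimal sketch of the robustness regression reported above, assuming hypothetical column names and a 0/1 condition dummy (0 = human barista, 1 = Pepper robot); the authors ran their analyses with their own code, available on OSF.

```python
# Sketch only: "study1_field.csv" and all column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("study1_field.csv")

# Service evaluation regressed on condition plus the measured covariates
model = smf.ols(
    "service_eval ~ condition + tech_affinity + age + gender + visit_freq",
    data=df,
).fit()
print(model.summary())
```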
To the best of our knowledge, this is one of the first field studies documenting consumer reactions to service robots based on a real service interaction and using an experimental paradigm to assess causality. The results suggest that customers react relatively negatively to one of the most widely used service robots in the marketplace, with negative consequences for the firm employing them. The following studies examine the mechanism underlying these effects and address potential alternative explanations. For example, there are inevitable differences in the service provided by humans and physical robots (i.e., facial expressions, tone of voice, etc.), which we could not control for, and the use of Pepper in this coffee shop was a significant novelty that regular customers would have noticed (note, however, that the above effects did not vary as a function of visitor frequency). Subsequent studies will therefore address these alternative explanations.
STUDY 2
The objective of study 2 is to test whether cost-cutting attributions underlie the effects observed in study 1. This study was pre-registered: https://aspredicted.org/k9u4f.pdf.
Method
Participants and Procedure
Four hundred and one American participants were recruited from Prolific Academic (50% female, Mage = 34) and randomly assigned to either a human service or a robotic service condition. Participants saw one of two videos, filmed from a first-person perspective, that showed the process of ordering a coffee from either a service robot (i.e., the Pepper robot used in study 1) or a human employee, mirroring the paradigm of study 1. We asked participants to imagine ordering coffee in the same way as the person in the video.
Measures
Participants first provided their service evaluation (“Overall, I would be satisfied with this service experience,” “I would come back to this coffee shop myself,” and “I would recommend this coffee shop to a friend or colleague,” α = 0.95).
Next, we measured cost-cutting attributions using three items (“This company is cutting costs at the customers’ expense,” “This company is trying to increase its profits by reducing service quality,” and “This company is prioritizing its own financial interests over the best interests of its customers,” α = 0.89). Finally, we measured technology affinity with a single item (“I have a generally positive attitude towards new technologies”) to test whether this variable interacts with condition.
Results and Discussion
Service evaluation was significantly worse in the bot condition (M = 4.09, SD = 1.81) than in the human condition (M = 5.30, SD = 1.27, t(399) = 7.67, p < .001, d = 0.77). A linear model on service evaluation with a condition dummy (0 = human, 1 = robot) confirmed that this effect remained significant (b = −1.23, SE = 0.15, t(396) = 8.20, p < .001) when controlling for gender (0 = female/non-binary, 1 = male), age, and technology affinity. Only technology affinity had a significant, positive effect on service evaluation (b = 0.48, SE = 0.07, t(396) = 6.48, p < .001). No other effects were significant (all ps > .57). Given that we had more power than in study 1, we probed the interaction between condition and technology affinity in a separate regression (b = −0.33, SE = 0.15, t(395) = 2.29, p = .022). A subsequent floodlight analysis suggested that the condition effect was significant for all participants except those at the highest level of technology affinity; for these 58 participants answering at the top end of the scale, the effect remained negative but was no longer significant (b = −0.51, SE = 0.41, t(56) = 1.23, p > .22).
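The floodlight analysis evaluates the conditional effect of condition at different levels of technology affinity. The sketch below, under hypothetical file and column names, estimates the moderated regression and computes the conditional effect and its standard error at each observed affinity level; it is a simplified stand-in for the authors' floodlight procedure, not their code.

```python
# Simplified floodlight-style sketch; "study2.csv" and column names are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("study2.csv")
m = smf.ols("service_eval ~ condition * tech_affinity", data=df).fit()
cov = m.cov_params()

for x in np.sort(df["tech_affinity"].unique()):
    # Conditional effect of condition (bot vs. human) at affinity level x
    eff = m.params["condition"] + m.params["condition:tech_affinity"] * x
    var = (cov.loc["condition", "condition"]
           + 2 * x * cov.loc["condition", "condition:tech_affinity"]
           + x ** 2 * cov.loc["condition:tech_affinity", "condition:tech_affinity"])
    se = np.sqrt(var)
    print(f"affinity = {x}: effect = {eff:.2f}, SE = {se:.2f}, t = {eff / se:.2f}")
```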
Testing the proposed mechanism via cost-cutting attributions, we found that the firm was perceived to be prioritizing cost cutting over the customer experience significantly more in the bot condition (M = 4.19, SD = 1.51) than in the human condition (M = 2.83, SD = 1.30, t(399) = 9.64, p < .001, d = 0.96). Cost-cutting attributions, in turn, predicted service evaluation, b = −0.67, SE = 0.04, t(399) = 16.00, p < .001. Finally, we estimated the full model, wherein cost-cutting attributions remained a significant predictor of service evaluation (b = −0.62, SE = 0.05, t(398) = 13.43, p < .001) and attenuated the direct effect of condition (b = −0.36, SE = 0.14, t(398) = 2.48, p = .013). Our estimation of the indirect effect using 10,000 bootstrap replications produced a significant indirect effect of condition on service evaluation via cost-cutting attributions (a × b = −0.85, SE = 0.11, 95% CI = −1.07 to −0.64), thus supporting cost-cutting attributions as a mediator.
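The indirect effect reported above (a × b with a bootstrap confidence interval) can be sketched as follows; this is an illustrative percentile-bootstrap implementation under hypothetical column names, not the authors' actual estimation code.

```python
# Percentile-bootstrap mediation sketch; file and column names are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("study2.csv")
rng = np.random.default_rng(42)
n = len(df)
indirect = []

for _ in range(10_000):
    boot = df.iloc[rng.integers(0, n, n)]
    # a path: condition (0 = human, 1 = robot) -> cost-cutting attributions
    a = smf.ols("cost_cutting ~ condition", data=boot).fit().params["condition"]
    # b path: cost-cutting attributions -> service evaluation, controlling for condition
    b = smf.ols("service_eval ~ cost_cutting + condition", data=boot).fit().params["cost_cutting"]
    indirect.append(a * b)

lo, hi = np.percentile(indirect, [2.5, 97.5])
print(f"indirect effect = {np.mean(indirect):.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```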
These results support the field study’s key finding and show that perceived cost-cutting motivations can account for a significant portion of the negative effects of service automation on service evaluations. Thus far, all our studies have compared a real, physical bot to a human employee. This means that our observed effects could be driven, at least in part, by differences in the service provided by the human and the bot. For example, the bot, of course, has a different physical appearance from the human, which may contribute to differences in service evaluations. Thus, we hold everything about the service encounter constant across the bot and human conditions in our subsequent studies, varying only participants’ beliefs about whether they are interacting with a bot or a human.
STUDY 3
The objectives of this study are three-fold. First, we sought to address the possibility that the previously observed effects might be driven by differences in service quality provided by the Pepper bot compared to the human employees. Study 3 holds the entire service interaction constant across conditions and only varies participants’ beliefs about whether the service provider is human or a bot. Second, we show that the effects replicate in digital settings, confirming the generalizability of our effects to non-physical service bots (i.e., chatbots). Third, we test an intervention for attenuating these effects: if consumers believe that the use of service bots is motivated by cost cutting at the customer’s expense, this belief should be weakened when a firm passes cost savings derived from service bots on to customers, which should improve evaluations of bot-provided service. In effect, we test whether attributions of cost cutting can be attenuated (and service evaluation improved) when firms provide financial benefits to consumers by passing on cost savings in the form of lower prices. We expected an interaction between service type and discounts such that the effect of discounts would be stronger when service is provided by a bot than by a human, insofar as the use of bots but not humans is attributed to cost-cutting goals. This study was pre-registered: https://aspredicted.org/q82hv.pdf.
Method
Participants
We recruited 2,403 American participants from Prolific (50% female, Mage = 38). In line with our preregistration, we excluded 46 participants whose chat logs and reports alluded to issues with the bot (e.g., chats gone awry) for a final sample of 2,357. All results also hold in the full sample without excluding these participants. The large sample size allows us to reliably detect an interaction such that discounts are more effective at improving service evaluations and reducing cost-cutting attributions in the bot compared to the human employee condition. Our sample size was informed by the recommendations of Giner-Sorolla (2018) to be able to detect not only a "knockout" interaction but also "attenuated" interactions.
Procedure
This study used a 2 (service provider: human vs. bot) × 2 (discount provided: yes vs. no) between-subjects design. All participants were asked to evaluate a coffee shop's chat-based ordering platform as if they were a customer wanting to order coffee from a company called BARISTA. Participants were forwarded to a website on which they engaged in a chat to order coffee. In reality, all chat interactions were with a chatbot that we designed for this study, but we told participants that they would be chatting either with a bot or with a human employee. To increase the believability of the manipulation, the chatbot's responses were delayed by several seconds to make it seem more human-like. Participants in the discount conditions were told that they would receive a 20% discount on their order because they were using the chat-based ordering platform. Common estimates of cost savings obtained by using chatbots are up to 30% (VentureBeat 2020), such that a 20% discount could be a realistic proposition for firms from an economic perspective. In the chat, participants went through a typical coffee order sequence, indicating their preferred size, type of roast, and whether they wanted milk and/or sugar.
Measures
All participants then returned to the survey, where they completed a three-item measure of service evaluation with BARISTA (the same three items as in previous studies; α = 0.94) and the same three-item measure of cost-cutting attributions (α = 0.85). We also measured perceived service value using four items from Sweeney and Soutar's (2001) service value scale with an additional reverse-coded item (i.e., “overpriced,” α = 0.83), to control for potential differences in value perceptions. We measured gender, age, and technology affinity, as in the previous study, to be used as covariates. All subsequent analyses are based on all 2,357 participants who did not experience technical issues with the chatbot. As pre-registered, we also repeated and replicated our analyses only among the subsample of participants who correctly recalled whether they received a discount (N = 2,290). All results hold after controlling for gender, age, and technology affinity, but for the sake of brevity, we do not discuss these results further. Web appendix B contains these and other supplementary analyses.
Results and Discussion
A regression analysis with service evaluation as the dependent variable and service type (0 = human service employee, 1 = service bot), discount (0 = absent, 1 = present), and their interaction as independent variables replicated the negative effect of bots on service evaluation (b = −0.44, SE = 0.10, t(2353) = 4.49, p < .001). The effect of discount was not significant (b = 0.10, SE = 0.10, t(2353) = 0.97, p > .33). Most importantly, and consistent with our theorizing, these effects were qualified by a significant interaction in the expected direction (b = 0.33, SE = 0.14, t(2353) = 2.33, p = .02). To decompose the interaction (figure 2A), we examined the effect of discounts in the bot and human condition with planned contrasts. In the bot condition, providing a discount significantly improved service evaluations (M = 5.18, SD = 1.62) compared to when no discount was provided (M = 4.76, SD = 1.76, t(1177) = 4.26, p < .001, d = 0.25). However, when served by a human service employee, service evaluations were similar regardless of whether the firm provided a discount (M = 5.29, SD = 1.67) or not (M = 5.20, SD = 1.70, t(1176) = 0.97, p > .33, d = 0.06). In addition, we found that providing a discount improved the evaluations of bots (M = 5.18, SD = 1.62) to a level comparable to human service without a discount (M = 5.20, SD = 1.70, t(1171) = 0.20, p > .84, d = 0.01). Providing a discount when service bots are used can therefore eliminate the difference in service evaluations between bot- and human-provided service.

FIGURE 2
Moderating effect of discounts on service evaluations (A) and cost-cutting attributions (B) for service bots versus human employees
Notes.—The graphs display the means with standard errors around the mean.
Cost-cutting attributions showed a similar pattern of results (figure 2B). Replicating our previous studies, using bots (vs. human employees) significantly increased cost-cutting attributions (b = 0.51, SE = 0.09, t(2353) = 5.53, p < .001). The effect of discount was not significant (b = −0.02, SE = 0.09, t(2353) = 0.18, p > .85). Importantly, the interaction between service type and discount was negative and significant (b = −0.39, SE = 0.13, t(2353) = 2.94, p = .003). Again, we examined the effect of discounts in the bot and human condition with planned contrasts to decompose the interaction. In the bot condition, providing a discount led to significantly lower cost-cutting attributions (M = 3.58, SD = 1.59) compared to when no discount was provided (M = 3.99, SD = 1.65, t(1177) = 4.28, p < .001, d = 0.25). In contrast, cost-cutting attributions were similar in the human service employee condition, regardless of whether the firm provided a discount (M = 3.46, SD = 1.57) or not (M = 3.47, SD = 1.58, t(1176) = 0.18, p > .85, d = 0.01). In addition, we found that providing a discount reduced the cost-cutting attributions for firms using bots (M = 3.58, SD = 1.59) to a level comparable to when the firm used human service without a discount (M = 3.47, SD = 1.58, t(1171) = 1.17, p > .24, d = 0.07).
To test our mechanism via cost-cutting attributions and the conditional effect of discounts on service evaluations, we conducted a moderated mediation analysis with service evaluation as the dependent variable, service type (0 = human employee; 1 = service bot) as the independent variable, cost-cutting attributions as the mediator, and discount (0 = absent, 1 = present) as the moderator of the effect of service type on cost-cutting attributions. Most importantly, the index of moderated mediation was significant with a confidence interval excluding zero (index = 0.17, SE = 0.06, 95% CI: 0.06–0.29; 10,000 bootstrap replications). Specifically, we found that without discounts, cost-cutting attributions were a significant mediator of the negative effect of service bot (vs. human employee) on service evaluations (a × b = −0.23, SE = 0.04, 95% CI: −0.31 to −0.15). In contrast, in the discount condition, the indirect effect of service bot (vs. human employee) via cost-cutting attributions was significantly smaller and not significantly different from zero (a × b = −0.06, SE = 0.04, 95% CI: −0.14 to 0.03). In addition, the index of moderated mediation via cost-cutting attributions on service evaluation remained significant after entering service value as a parallel mediator and controlling for all covariates (i.e., gender, age, and technological affinity; index = 0.11, SE = 0.04, 95% CI: 0.03 to 0.19; 10,000 bootstrap replications). Service value was moderately correlated with both cost-cutting attributions (r = −0.28, p < .001) and service evaluation (r = 0.47, p < .001).
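With a binary moderator on the a path, the index of moderated mediation equals the difference between the two conditional indirect effects, which reduces to the a-path interaction coefficient multiplied by the b path. The bootstrap sketch below illustrates that logic under hypothetical file and column names; it is not the authors' estimation code.

```python
# Bootstrap sketch of the index of moderated mediation; names are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("study3.csv")
rng = np.random.default_rng(7)
n = len(df)
index = []

for _ in range(10_000):
    boot = df.iloc[rng.integers(0, n, n)]
    # a path: service_type (0 = human, 1 = bot), moderated by discount (0/1)
    a_fit = smf.ols("cost_cutting ~ service_type * discount", data=boot).fit()
    # b path: cost-cutting attributions -> service evaluation
    b_fit = smf.ols("service_eval ~ cost_cutting + service_type * discount", data=boot).fit()
    # Difference in indirect effects across discount levels = a-interaction * b
    index.append(a_fit.params["service_type:discount"] * b_fit.params["cost_cutting"])

lo, hi = np.percentile(index, [2.5, 97.5])
print(f"index of moderated mediation = {np.mean(index):.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```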
The findings of study 3 identify a boundary condition such that the gap in service evaluations between human and bot service is eliminated if firms pass on some of the financial benefits from service automation to consumers (i.e., the bot/discount condition was not significantly lower for any of the outcome measures than the human/no discount condition). The improved evaluations of bot-provided service are driven, at least in part, by a reduction in cost-cutting attributions.
Finally, the findings of study 3 also demonstrate that the negative effects of service bots on service evaluation occur not only for physical robots, but also when firms use chatbots. The current findings also rule out the possibility that the effects can be explained by differences in the service interaction itself, as participants were interacting with the same service provider and using identical scripts in both the human and chatbot condition. The effects of using service bots appear to be robust across physical and digital bots. Whereas previous research has focused on how various features and characteristics of bots differentially affect consumer reactions (Blut et al. 2021), our results show that very different kinds of bots can lead to similar reactions.
STUDY 4A
A key part of our theorizing is that consumers evaluate bot-provided service more negatively than human-provided service because they perceive firms that use bots as focused on cutting costs at the expense of the customer experience. We have demonstrated that even when bots and humans provide identical levels of service (as in study 3), this perception holds. However, it is possible that bots will improve so much that they will provide service that is unambiguously better than most human-provided service. In this scenario, the perception that firms are cutting costs at the expense of customers should be less likely to hold since customers would quickly realize how they benefit from automated service (e.g., by receiving significantly faster or more efficient service). We therefore tested this possibility by creating a chatbot that is unambiguously superior to its human counterpart. Study 4a tests the effects of such a bot on the same measures used in prior studies while also ruling out alternative explanations. Study 4b tests the effect of such a bot on behavioral outcomes.
Method
Participants and Procedure
We recruited 597 American participants from Prolific Academic (34% female, Mage = 24). The procedure was similar to that of study 3: participants chatted with a bot that was labeled either as a human or as a bot, but we changed the context from ordering coffee to changing a mobile phone plan. Participants were randomly assigned to one of three conditions (slow human, slow bot, hyper-fast bot). As in study 3, all participants actually chatted with a bot.
In the slow human condition, we told participants that they would be chatting with a human customer service employee. In the slow bot condition, we told participants that they would be chatting with a customer service chatbot. In both the slow human and slow bot conditions, the bot delayed its responses to participants by approximately 60 seconds. In the hyper-fast bot condition, we told participants that they would be chatting with a customer service chatbot and eliminated any response delay such that the bot responded to each message instantly. In all three conditions, the bot was able to change the mobile phone plan as requested, so participants were always able to complete the task successfully; this rules out the possibility that differences between conditions can be attributed to misunderstandings or ineffective communication.
Measures
After participants completed the chat, we measured service evaluation (α = 0.95) and cost-cutting attributions (α = 0.91) using the same three-item scales as in the previous studies. We also examined participants' sentiment toward the service experience provided by the firm. We used the tidytext package in R for data processing and the sentimentr package for calculating sentiment scores (based on both positively and negatively valenced words). Finally, we assessed participants' rating of their overall experience with the firm in the form of a star rating (1–5).
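The authors scored sentiment in R with tidytext and sentimentr; as a rough, hedged analogue in Python, the sketch below scores free-text responses with NLTK's VADER analyzer. This uses a different lexicon and algorithm, so its scores would not match the reported values, and the file and column names are hypothetical.

```python
# Python analogue of lexicon-based sentiment scoring; not the authors' R pipeline.
import nltk
import pandas as pd
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon")  # one-time download of the VADER lexicon

df = pd.read_csv("study4a.csv")  # hypothetical file with a free-text "response" column
sia = SentimentIntensityAnalyzer()

# Compound score ranges from -1 (most negative) to +1 (most positive)
df["sentiment"] = df["response"].fillna("").apply(
    lambda text: sia.polarity_scores(text)["compound"]
)
print(df.groupby("condition")["sentiment"].describe())
```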
To further assess whether the current effects are robust to a range of control measures and alternative accounts, we also assessed consumers’ general attitude toward chatbots with three items (e.g., “Chatbots that can deal with customers’ requests could be useful,” α = 0.93), their experience of discomfort when interacting with chatbots (three items, e.g., “Using chatbots to interact with customers makes me uncomfortable,” α = 0.78), the extent of anthropomorphism (five items, e.g., “The service interface I used has a mind of its own,” α = 0.91), and how effective participants perceived the overall communication to change their current mobile plan (three items, e.g., “My problem was appropriately solved,” α = 0.94).
Results and Discussion
Service evaluations were worst in the slow bot condition (M = 2.87, SD = 1.34), significantly better in the slow human condition (M = 3.82, SD = 1.63, t = 6.69, Tukey-adjusted p < .001, d = 0.63), and best in the hyper-fast bot condition (M = 5.92, SD = 1.22, t = 14.95, Tukey-adjusted p < .001, d = 1.47, vs. the slow human; figure 3A). The contrast between the slow human and the slow bot replicates the main effect observed in previous studies, in which the identical level of service is evaluated more positively when provided by a human than by a bot. The contrast between the hyper-fast bot and the slow human indicates that it is indeed possible for customer evaluations of bot-provided service to be more positive than those of human-provided service, when the bot is unambiguously superior.

FIGURE 3
Hyper-fast bots boost service evaluation (A) and reduce cost-cutting attributions (B)
Notes.—The graph shows the means and standard errors around the mean.
We observed the corresponding pattern for the cost-cutting attributions (figure 3B), which were lowest in the hyper-fast bot condition (M = 3.04, SD = 1.38), higher in the slow human condition (M = 4.39, SD = 1.71, t = 8.29, Tukey-adjusted p < .001, d = 0.86), and highest in the slow bot condition (M = 4.98, SD = 1.74, t = 3.65, Tukey-adjusted p < .001, d = 0.34, compared to the slow human). This replicates our core effect in which the firm seems to prioritize cost cutting at the customers' expense when providing the same level of service with a bot versus a human and confirms that an unambiguously superior bot can reverse this perception. Based on 10,000 bootstrap replications, cost-cutting attributions emerged as a significant mediator for the effect of the hyper-fast bot relative to the human employee (a × b = 0.30, SE = 0.04, 95% CI: 0.22–0.39) and the effect of the slow bot relative to the human employee (a × b = −0.20, SE = 0.06, 95% CI: −0.34 to −0.09).
We observed the same pattern of results on star rating and consumer sentiment scores. Star rating (M = 2.21, SD = 1.18) and sentiment scores (M = −0.21, SD = 0.51) were lowest in the slow bot condition, significantly higher in the slow human condition (firm rating: M = 2.78, SD = 1.38, t = 4.41, Tukey-adjusted p < .001, d = 0.44; sentiment: M = −0.07, SD = 0.46, t = 2.94, Tukey-adjusted p < .001, d = 0.30), and highest in the hyper-fast bot condition (firm rating: M = 4.31, SD = 0.86, t = 13.17, Tukey-adjusted p < .001, d = 1.33; sentiment: M = 0.45, SD = 0.55, t = 10.25, Tukey-adjusted p < .001, d = 1.03).
Finally, we assessed to what extent these results are robust to a range of control measures. As summarized in table 1, a series of regression analyses revealed that both the condition effect and the mechanism via cost-cutting attributions changed only in magnitude, not in direction or significance, after controlling for consumers' general attitude toward chatbots (model 3; b = 0.18, SE = 0.03, t(592) = 6.50, p < .001), discomfort when interacting with chatbots in this service setting (model 4; b = 0.05, SE = 0.04, t(587) = 1.29, p = .20), the extent of anthropomorphism (model 4; b = 0.20, SE = 0.04, t(587) = 5.47, p < .001), and perceived communication effectiveness in the service task (model 4; b = 0.23, SE = 0.03, t(587) = 6.53, p < .001). The indirect effect via cost-cutting attributions also remained robust and significant when controlling for each control measure in the previously reported mediation model (a × b = 0.08, SE = 0.02, 95% CI = 0.04–0.13).
TABLE 1
REGRESSION MODELS PREDICTING SERVICE EVALUATION (STUDY 4A)

| Predictor | Model 1 | Model 2 | Model 3 | Model 4 |
| --- | --- | --- | --- | --- |
| Intercept | 3.82 [3.62, 4.02], p < .001 | 5.40 [5.06, 5.73], p < .001 | 4.02 [3.49, 4.54], p < .001 | 1.97 [1.13, 2.80], p < .001 |
| Slow service bot | −0.94 [−1.22, −0.67], p < .001 | −0.73 [−0.99, −0.48], p < .001 | −0.72 [−0.96, −0.47], p < .001 | −0.66 [−0.89, −0.43], p < .001 |
| Hyper-fast service bot | 2.10 [1.83, 2.38], p < .001 | 1.62 [1.36, 1.89], p < .001 | 1.55 [1.29, 1.81], p < .001 | 1.47 [1.21, 1.73], p < .001 |
| Cost-cutting attribution | | −0.36 [−0.42, −0.30], p < .001 | −0.27 [−0.34, −0.20], p < .001 | −0.16 [−0.23, −0.09], p < .001 |
| Attitude toward chatbots | | | 0.18 [0.13, 0.23], p < .001 | 0.12 [0.06, 0.17], p < .001 |
| Discomfort with chatbots | | | | 0.05 [−0.03, 0.13], p = .199 |
| Anthropomorphism | | | | 0.20 [0.13, 0.27], p < .001 |
| Communication effectiveness | | | | 0.23 [0.16, 0.29], p < .001 |
| Age | | | | 0.00 [−0.01, 0.01], p = .556 |
| Gender (female) | | | | 0.02 [−0.17, 0.21], p = .801 |
| Observations | 597 | 597 | 597 | 597 |
| R2 / R2 adjusted | 0.454 / 0.452 | 0.548 / 0.545 | 0.578 / 0.575 | 0.631 / 0.625 |

Notes.—Cells show unstandardized estimates with 95% confidence intervals and p-values. The service-bot contrasts use the human employee condition as the reference level.
In summary, we find evidence that a broad range of measures influences consumers' service experience in the expected direction (e.g., a positive effect of anthropomorphism) but that the current findings cannot be explained merely by anthropomorphism, general attitudes toward chatbots, discomfort with chatbots, or the effectiveness of the communication itself. We next demonstrate the downstream behavioral outcomes of these effects.
STUDY 4B
Study 4b was designed to provide further evidence of the effects of bot-provided service on a variety of behavioral outcomes. Specifically, we build on the paradigm employed in study 4a and examine the downstream impact on behavioral outcomes with either short-term implications for firms (e.g., sign-ups for a newsletter provided by the firm) or long-term implications (e.g., brand-switching behavior and sign-ups for the firm's loyalty program).
Method
Participants and Procedure
We recruited 571 American participants from Prolific Academic (64% female, MAge=37). The procedure was identical to the previous study.
Measures
This study focused on assessing behavioral outcomes. First, within the chat interaction itself, participants were asked whether they wanted to sign up for the company’s newsletter. Second, also within the chat, participants were asked whether they wanted to join the company’s loyalty program. Third, in the survey following the chat, we asked participants whether they wanted to stay with the brand or switch to a different brand. We also measured service evaluation (α = 0.94), star rating, cost-cutting attributions, and demographics as in study 4a. As the patterns on these measures were identical to those reported earlier, we focus on service evaluation and the behavioral outcomes.
Results and Discussion
Service evaluation followed the same pattern seen in study 4a: it was lowest in the slow bot condition (M = 3.94, SD = 1.70), significantly higher in the human condition (M = 4.52, SD = 1.83; t = 3.47, Tukey-adjusted p = .002, d = 0.33), and highest in the hyper-fast bot condition (M = 5.89, SD = 1.24; t = 8.39, Tukey-adjusted p < .001, d = 1.31, vs. the slow bot condition).
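For readers who want to reproduce this kind of pairwise comparison, the following is a minimal sketch in Python with statsmodels and pandas of Tukey-adjusted contrasts and a pooled-SD Cohen’s d; the data frame and column names (`service_eval`, `condition`) are assumptions for illustration rather than the authors’ actual code.

```python
import numpy as np
import pandas as pd
from statsmodels.stats.multicomp import pairwise_tukeyhsd


def compare_conditions(df: pd.DataFrame) -> None:
    """Tukey-adjusted pairwise comparisons of the service evaluation composite."""
    tukey = pairwise_tukeyhsd(
        endog=df["service_eval"],   # mean of the service evaluation items
        groups=df["condition"],     # e.g., 'slow_bot', 'human', 'hyper_fast_bot'
        alpha=0.05,
    )
    print(tukey.summary())          # mean differences with Tukey-adjusted p-values


def cohens_d(a: pd.Series, b: pd.Series) -> float:
    """Cohen's d using the pooled standard deviation."""
    pooled_var = (((len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1))
                  / (len(a) + len(b) - 2))
    return float((a.mean() - b.mean()) / np.sqrt(pooled_var))
```

Calling `compare_conditions(df)` prints each pairwise mean difference with its adjusted p-value, and `cohens_d()` applied to any two condition subsets returns the corresponding effect size.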
Subsequently, we analyzed the three behavioral outcomes as a composite measure (i.e., staying with the brand, newsletter sign-up, and joining the loyalty program; range: 0–3, M = 0.91, SD = 0.85). Given that this outcome is a count variable, we used a negative binomial regression with the human condition as the baseline and one dummy each for the slow and hyper-fast bot conditions. As expected, behavioral outcomes for the firm were less positive in the slow bot than in the human condition (b = −0.40, SE = 0.12, χ2(1) = 10.46, p = .001). In contrast, outcomes for the firm were more positive in the hyper-fast bot than in the human condition (b = 0.37, SE = 0.10, χ2(1) = 13.36, p < .001). We also ran three separate logistic regressions, again using the human condition as the baseline, to examine each of the three behaviors separately. The effect was strongest for staying with (vs. switching from) the brand (bslow bot = −0.84, SE = 0.21, χ2(1) = 15.75, p < .001; bhyper-fast bot = 0.56, SE = 0.22, χ2(1) = 6.73, p < .01). The effect remained directionally consistent but was weaker for both newsletter sign-ups (bslow bot = −0.58, SE = 0.41, χ2(1) = 2.03, p = .15; bhyper-fast bot = 0.98, SE = 0.30, χ2(1) = 10.56, p = .001) and joining the loyalty program (bslow bot = −0.31, SE = 0.29, χ2(1) = 1.15, p = .28; bhyper-fast bot = 0.74, SE = 0.19, χ2(1) = 9.35, p = .002). Replicating the previous studies, we found that cost-cutting attributions (α = 0.89) mediated these effects.
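The count and binary analyses described above could be implemented along the following lines. This is a hedged sketch that assumes dummy-coded condition indicators (`slow_bot`, `hyper_fast_bot`, with the human condition as the omitted baseline) and 0/1 columns for the three behaviors; all names are placeholders rather than the authors’ variables, and the negative binomial is fit here as a GLM with a fixed dispersion parameter, which is one of several reasonable specifications.

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

BEHAVIORS = ["stay_with_brand", "newsletter_signup", "loyalty_signup"]  # 0/1 columns


def analyze_behaviors(df: pd.DataFrame) -> None:
    """Count model for the composite behavior score plus one logit per behavior."""
    # Composite count outcome (0-3); `slow_bot` and `hyper_fast_bot` are dummies,
    # so the human condition is absorbed into the intercept.
    df = df.assign(composite=df[BEHAVIORS].sum(axis=1))
    nb = smf.glm(
        "composite ~ slow_bot + hyper_fast_bot",
        data=df,
        family=sm.families.NegativeBinomial(),
    ).fit()
    print(nb.summary())

    # Separate logistic regressions for each behavioral outcome.
    for outcome in BEHAVIORS:
        logit = smf.logit(f"{outcome} ~ slow_bot + hyper_fast_bot", data=df).fit(disp=False)
        print(outcome, logit.params[["slow_bot", "hyper_fast_bot"]].round(2).to_dict())
```

A Poisson GLM would be a simpler alternative if the 0–3 composite showed little overdispersion; the negative binomial family is used here only because the outcome is a bounded count.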
Jointly, these results illustrate the downstream consequences of employing slow versus hyper-fast chatbots. Relying on slow bots not only decreases service evaluation but also leads to less favorable brand outcomes. In contrast, using hyper-fast bots can improve service evaluation, reduce brand switching, and increase sign-ups for loyalty programs and newsletters.
GENERAL DISCUSSION
This research demonstrates that, at the current level of technological development, consumers evaluate service provided by physical and digital bots less favorably than service provided by human employees—even if the service quality is identical. Consumers attribute the use of service bots largely to cost-cutting motivations: they see service bots as saving money for the firm at the expense of the customer experience. The silver lining for firms is that they might be able to attenuate this effect with price discounts and fully reverse the effect by using bots that unambiguously outperform human employees.
Implications for Theory
The findings of this research contribute to a better theoretical understanding of consumer reactions to service bots. Beyond identifying a novel main effect of service automation on firm-level outcomes, we shed light on why this effect occurs by measuring and manipulating a firm’s perceived motivation to cut costs at the customer’s expense. This focus on the firm, and on customers’ perceptions of firms’ motives for automation, is novel in this area of research and broadens the theoretical scope with which researchers can build further insights into how and why consumers react to automation. Our findings contribute to the literature on algorithm aversion, showing that customer service is a domain in which consumers prefer humans over algorithms. However, if the service provided by bots is notably superior, as in studies 4a and 4b, we find a reversal of this effect. These findings also contribute to the literature on product anthropomorphism: our main effects and process variables showed consistent patterns across both physical and digital bots, illustrating that increasing a bot’s human likeness (from digital to physical) does not necessarily affect consequential outcomes such as service evaluations. Indeed, we show that our outcome variables are robust to different levels of product anthropomorphism. Together, these results identify obstacles to the successful deployment of service bots across the service value chain and suggest ethically appropriate interventions to overcome these obstacles.
Our findings are also relevant to the broader phenomenon of service automation, of which service bots are one specific instantiation. Service automation also includes more familiar technologies such as automatic teller machines at banks, self-serve check-out kiosks at grocery stores, and self-serve check-in kiosks at airports. Of course, each of these forms of service automation differs from the others, such that the generalizability of our findings is constrained by the idiosyncratic features of certain automation technologies. For example, the more familiar forms mentioned above are not interactive in the same ways that service bots are; they are not anthropomorphized, and consumers often self-select into using those technologies rather than human employees, which is not the case with most service bots. At the same time, there are also areas of overlap between service bots and older forms of service automation, most obviously the replacement of human employees with machines. Consumers’ perceptions of firms’ motivation for using service automation are likely relevant in shaping service evaluation regardless of the technology being used. We therefore suggest that our findings are relevant to understanding consumer reactions to many forms of service automation beyond service bots.
Finally, demonstrating that cost-cutting attributions are an important mechanism through which the use of automation impacts consumers’ service evaluations and related behavioral outcomes provides a further theoretical contribution. While much of the literature on consumer responses to automation focuses on specific psychological mechanisms, such as uniqueness neglect (Longoni, Bonezzi, and Morewedge 2019) and identity threat (Mende et al. 2019), our results suggest that a more basic process of attribution (i.e., how consumers perceive firms’ motivations for using new technologies) also contributes meaningfully to outcomes that firms value. Our work shows that attributing firms’ strategic decisions, such as deploying bots, to cost cutting undermines service evaluations. Future research might also explore whether a firm’s brand positioning (e.g., a no-frills positioning, Henkel et al. 2018) interacts with the firm’s decision to automate service in shaping cost-cutting attributions.
Implications for Practice
Our findings provide clear guidance for firms already deploying service bots as well as those considering their deployment in customer service. Firms are well advised to carefully monitor consumer reactions to service bots and the potentially detrimental effects on firm perception, from lower service evaluation to reduced willingness to patronize and re-patronize. Although firms might introduce bot-based service automation with good intentions (e.g., automating routine tasks to free up resources for handling more complex customer requests), the findings of this research demonstrate that, at this time, service bots send a market signal that the firm is focused more on cost cutting than on maintaining or improving the customer experience. Thus, while automating customer-facing service processes carries significant cost-cutting potential, greater investments in tracking and analyzing consumer sentiment are warranted. Although Amazon Go, an automated convenience store operated by Amazon, did not use service bots specifically, the current findings are consistent with the negative buzz about the profit orientation and cost-cutting focus that consumers perceived there, as opposed to the “frictionless shopping experience” Amazon advertised (Graham 2020).
Another important managerial insight is that the negative effects of automating service on service evaluations can be mitigated. Passing on some of the cost savings derived from automation via discounts may be one way to ensure that both the firm and its customers benefit from the use of service bots. Furthermore, cases in which service bots can already provide unambiguously superior service relative to human service employees should be identified and targeted for automation. Firms that offer service bots in a B2B context already offer different tiers of chatbot sophistication (e.g., https://web.archive.org/web/20230421/https://www.ada.cx/pricing); our results suggest that implementing basic bots that are unlikely to exceed human performance is misguided. Even matching human performance seems insufficient; bots that unambiguously outperform most humans may be required to preserve customer satisfaction when automating service. Thus, the careful selection of clearly defined tasks in which bots demonstrably outperform humans has the greatest potential to realize cost savings without undermining service evaluations.
There are indeed examples of firms already applying these ideas. For example, Amazon uses a great deal of automation while also providing discounted prices and prioritizing customer service. Likewise, Expedia has implemented highly effective chatbots that can change and cancel flight bookings much more quickly than human employees, although they sometimes cannot understand more complex requests. Recently released chatbots based on large language models such as GPT-4 outperform most humans at many complex tasks, including the LSAT and the uniform bar exam (OpenAI 2023), and may therefore be able to outperform many human customer service employees. Unfortunately, many firms seem to simply replace service employees with sub-par bots without making other improvements that would preserve positive service evaluations. This is reflected in widespread dissatisfaction with chatbots: one analysis of 35,000 customer service chatbot interactions found that 66% of those interactions were rated 1 out of 5 on customer satisfaction (Crolic et al. 2022). Thus, while most existing applications of customer service chatbots seem to fail, our results suggest that investing in high-quality chatbots that can quickly and accurately resolve common issues is worthwhile, and this strategy is indeed pursued by a few select firms (Christison 2022; Zierau et al. 2022). Of course, this strategy may also put more human jobs at risk of automation, making increased political attention to technological unemployment all the more important.
Limitations and Constraints on Generalizability
Managerial practice and consumer behaviors surrounding service bots are dynamic and context dependent. Our research is limited by a reliance on North American and Western European participants and may thus not generalize to all geographic or cultural contexts in which the use of service bots may be more common or more positively perceived. However, research in a Chinese context has identified effects similar to those we observe here (Luo et al. 2019).
The implications of these findings are also likely to shift along with ongoing technological progress. As service bots become more widely used and familiar to consumers, perceptions of these technologies may become more positive. Our findings in studies 4a and 4b suggest that the rapid ongoing improvements in these bots’ quality relative to humans are also likely to have significant positive effects on consumers’ perceptions of and reactions to service bots.
Conclusion
Even a decade ago, the idea of providing complex customer service via machines seemed like science fiction. The fact that this idea is now a reality constitutes a fundamental shift in how firms interact with consumers in the marketplace. Many firms have rushed to take advantage of this new opportunity, sometimes with limited success (Nussey 2021). Our work sheds light on both the promise and peril of using bots for customer service and provides a parsimonious, theory-driven framework to understand and influence how consumers perceive firms that use service bots.
DATA COLLECTION INFORMATION
The data for the pilot study were collected at the Alberta School of Business Behavioral Research Lab by research assistants under the supervision of the first author in October and November 2019. Data collection for study 1 took place at the coffee shop on the Brightlands Campus in Heerlen and was overseen by the fourth author in June 2019. The first author managed the data collection for study 2 using Prolific Academic in November 2021. The third author managed the data collection for studies 3, 4a, and 4b using Prolific Academic between January 2022 and January 2023. All authors jointly analyzed the data. The data and syntax for all studies are available on the Open Science Framework at https://osf.io/vm6ax/.
Author notes
Noah Castelo ([email protected]) is an assistant professor at the Alberta School of Business, University of Alberta, Edmonton, AB, Canada.
Johannes Boegershausen ([email protected]) is an assistant professor of marketing at the Rotterdam School of Management, Erasmus University, Rotterdam, The Netherlands.
Christian Hildebrand ([email protected]) is a professor of marketing analytics at the Institute of Behavioral Science and Technology, University of St. Gallen, Switzerland.
Alexander P. Henkel ([email protected]) is an assistant professor at the Open University of the Netherlands, Heerlen, The Netherlands.
The authors thank the JCR review team for their constructive comments, particularly Klaus Wertenbroch who was an extraordinarily helpful and supportive associate editor.
They are also grateful to the R Lab participants at the Alberta School of Business, seminar participants at the Rotterdam School of Management and the Sauder School of Business, and to Stefano Puntoni for their helpful comments. Funding for this research was provided by the Canadian government’s Social Sciences and Humanities Research Council (Insight Grant # 435-2020-0547 to Noah Castelo), by the Province of Limburg, The Netherlands (grant # SAS-2020-03117 to Alexander P. Henkel), the Swiss National Science Foundation (# 189005 to Christian Hildebrand), and by the Erasmus Research Institute of Management (to Johannes Boegershausen). Supplementary materials are included in the web appendix accompanying the online version of this article.