-
PDF
- Split View
-
Views
-
Cite
Cite
Michael Moutoussis, Joe Barnby, Anais Durand, Megan Croal, Laura Dilley, Robb B Rutledge, Liam Mason, Impressions about harm are formed rapidly and then refined, modulated by serotonin, Social Cognitive and Affective Neuroscience, Volume 19, Issue 1, 2024, nsae078, https://doi.org/10.1093/scan/nsae078
- Share Icon Share
Abstract
Attributing motives to others is a crucial aspect of mentalizing, can be biased by prejudice, and is affected by common psychiatric disorders. It is therefore important to understand in depth the mechanisms underpinning it. Toward improving models of mentalizing motives, we hypothesized that people quickly infer whether other’s motives are likely beneficial or detrimental, then refine their judgment (classify-refine). To test this, we used a modified Dictator game, a game theoretic task, where participants judged the likelihood of intent to harm vs. self-interest in economic decisions. Toward testing the role of serotonin in judgments of intent to harm, we delivered the task in a week-long, placebo vs. citalopram study. Computational model comparison provided clear evidence for the superiority of classify-refine models over traditional ones, strongly supporting the central hypothesis. Further, while citalopram helped refine attributions about motives through learning, it did not induce more positive initial inferences about others’ motives. Finally, model comparison indicated a minimal role for racial bias within economic decisions for the large majority of our sample. Overall, these results support a proposal that classify-refine social cognition is adaptive, although relevant mechanisms of serotonergic antidepressant action will need to be studied over longer time spans.
Introduction
Relationships with others play a key role in well-being and social harmony; to wit, mental health suffers when important relationships deteriorate. Clinically, perceiving others as harmful is central in conditions ranging from post-traumatic stess disorder (PTSD) (assaults) to social anxiety (humiliation) to paranoia (conspiracy). It is thus important to understand in depth the ways in which perceptions of others, especially regarding under-privileged groups (Kaiser Trujillo et al. 2022, Singh et al. 2022), may interact with mental well-being.
The “Bayesian Brain Hypothesis” holds that confidence in beliefs, even those characterizing paranoia or PTSD, is updated by brains (approximately) implementing Bayes’ rule. How strong one “belief” is depends on conviction in related ones, so beliefs become organized into interconnected hierarchies, ranging from simple predictions regarding sensory data to abstract expectations about self and others, encoding high-level features from the environment. Belief-based models account well for data related to harm attribution, over and above associative learning models (Barnby et al. 2020, 2022), but much remains unclear. For example, understanding the neuro-computational basis of polarized attributions about others is at an early stage (Brown et al. 2022, Story et al. 2024).
The psychological literature has traditionally cast polarized thinking as maladaptive, contributing to us–them biases. However, taking inspiration from recent modeling work (Story et al. 2024), here we ask whether polarized attributions may originate in useful adult cognition. We propose a “classify-refine” hypothesis, whereby people first (1) attempt to rapidly classify others’ attributes as “beneficial vs. detrimental” to themselves, and subsequently (2) to refine beliefs about others through learning. This may be highly adaptive: quickly telling friend from foe, computationally easy, yet allowing one to refine one’s beliefs, and go beyond black-and-white thinking. Current understanding contains limitations: We (Hopkins et al. 2021) proposed a model which, despite its advantages, gives a poor account of people’s fast initial learning. Pike (Bone et al. 2021) addressed this with an exponentially reducing learning rate; however, this lacks a mechanistic account of why this may be the case. We tested new “classify-refine” models that used only one beneficial and one detrimental state along each dimension of attribution (overviewed in Fig. 2), against previously successful, “classic” learning models which employ a fine-grained range along each dimension.
Social experiments based on game theory have been useful in modeling cognitive processes of inter-personal inference (Raihani and Bell 2017, Greenburgh et al. 2019, Barnby et al. 2020, 2022). We extend this work toward a more ecologically valid task for probing brain mechanisms behind attributions, based on the repeated Dictator Task (Barnby et al. 2020, 2022). Here, a “dictator” (an on-screen partner of the participant) decides how to split a sum of money between themselves and a “receiver” (the participant), who must accept the split. The motivation of the “dictator” is undisclosed, but receivers rate, for each economic exchange, the extent to which decisions may have been motivated by harmful intent vs. the extent to which they were motivated by self-interest.
Work with this attribution task has not yet studied some important social variables. One variable thought to influence beliefs in real life is race or ethnicity. For example, the implicit association task has consistently demonstrated implicit race-related biases, which frequently but not invariably influence behavior (Maina et al. 2018). Toward querying a possible role of race in attributions of intent, here we implemented the task using photos of either black or white individuals to portray “dictators.”
The neurobiological mechanisms of making attributions are uncertain, but may involve serotonin, which is important for social cognition. Higher levels are associated with pro-sociality, possibly via multiple mechanisms (Crockett 2009). One mechanism may be reducing attribution of intent-to-harm to others, through higher 5HT transporter availability (Yang et al. 2007). In depression, increased hostility is associated with interpreting ambiguous behaviors as hostile (Smith et al. 2016), while Selective Serotonin Reuptake Inhibitors (SSRIs) reduce patient hostility in those responding to psychopharmacotherapy (Bian et al. 2024). During citalopram therapy, relationship factors are associated with improvements in depressive symptoms (Joseph et al. 2011). We thus hypothesized that serotonin may reduce attributions of harm intent. To test this, we administered the repeated dictator task before and after subchronic citalopram treatment, while looking for changes in detailed cognitive mechanisms through computational modeling.
Increased attribution of harmful intent is implicated in anti-black racism (Stjohn and Healdmoore 1995). Since then, the implicit association task has consistently demonstrated implicit race-related biases in the population, which often but not invariably influence behavior (Maina et al. 2018). Strikingly, negative emotionality, which can be ameliorated by SSRIs, has been reported to mediate the impact of adverse events on bias against out-groups (She et al. 2022). As we increased the validity of our study by including diverse photos of either black or white individuals to portray “dictators” in the task, it became important to model and study the interaction between the effect of SSRIs, of potential race bias, and harm intent. At the same time, by no means do we oversimplify racism as rooted in neurotransmitter deficiencies.
In summary, the present study aimed to shed light on the neurocognitive basis for how individuals gauge harm vs. self-interest using Bayesian modeling of a repeated Dictator task that we call the Sharing Game. This generated data for testing whether in interpersonal situations, individuals initially quickly classify others’ attributes as beneficial vs. detrimental to themselves, and subsequently refine and update their beliefs (“classify-refine hypothesis”). We employed two experimental manipulations to test two preregistered predictions. First, we manipulated serotonin levels by randomizing participants to receive citalopram vs. placebo. We predicted that citalopram would result in attributing more beneficent motives. Second, we evaluated whether race plays a role in evaluations of harm vs. self-interest by using photos of white and black “dictators.” We predicted that more negative attributions would be made for out-group others (Moutoussis et al. 2022) i.e. those identifying as white would make more negative attributions for non-whites than whites, while non-whites would evaluate non-whites more positively than whites.
Materials and methods
Sample
Healthy UK residents were recruited from a University College London (UCL) subject pool. They gave informed consent to participate in a week-long citalopram 20 mg vs. placebo study, approved by the UCL Ethics Committee, ID 19601/001. Participants had no history of psychiatric or neurological disorder and agreed not to become intoxicated by drugs or alcohol during the study. Seventy-four participants were enrolled (44/74 self-identified female, 29/74 male, 0/74 other, 1 missing). In all, 42 participants were randomized to citalopram. The commonest ethnicities were white and Chinese. The sample was highly educated, young adult (median age = 25), and of low income (Supplementary Fig. S2).
Task
In our Sharing Game task (Fig. 1a), a development of the repeated dictator task (Barnby et al. 2020, 2022), we increased task length by one block, we collected more data per trial, and tested a new set of computational models. In so doing we aimed to increase the stability and sensitivity of tasks, toward improving the assessment of individual differences, including drug effects. Participants thus saw four Dictators, who made either fair (5:5) or unfair (10:0) splits of 10 pence of fictive money between themselves and the participant (2 Dictators 80% fair, 2 Dictators 20% fair). Participants reported their expectations about the fairness level of the Dictator, and then made attributions of motivation along two salient dimensions: harm-intent (HI) and self-interest (SI). (Fig. 1). The dictators were portrayed by photos of women of varying age (young adult to middle-aged) and ethnicity (black vs white), purchased under license for public use from www.shutterstock.com.

One trial of the revised iterated dictator task, a.k.a. “Sharing game.” (a) Expectation stage. Participants were asked to estimate the probability that this Dictator would split two coins fairly. (b) Having observed the split (not shown here), participants had to infer the likelihood of SI and HI motivating the Dictator. Question order was randomized across trials. The expectation stage preceded attributions within a trial. Computationally, it is best thought of as the result of belief updates formed on the basis of the observations made so far, especially in the trial before. Unlike many economic games, we displayed ecologically valid Dictator images, and asked whether they elicited motivational attributions more beneficial or detrimental to the participant.
A total of 73 participants completed the Sharing Game at least once; 66 completed it again 1 week later. They faced each dictator just once, for 12 consecutive trials. A post-experiment survey probed awareness of the drug and its subjective effects. Of the 42 citalopram participants, 25 experienced subjective drug effects (all minor) and correctly guessed that they took SSRI.
Modeling
All models had a simple hidden Markov process (HMM) as a central “learning core,” implemented as a one-level, 12-trial HMM. Each model had three “reporting” processes, generating fairness expectation, HI attribution, and SI attribution reports. All models were derived from a published, successful Bayesian model (Barnby et al. 2022). Here we used the active-inference framework, which naturally accommodates the process of fast classification, accompanied by slower belief refinement (Smith et al. 2022). We now describe the key features of the models (Fig. 2). The parameters are explained in Table 1, and their place in the different models is compared in Fig. 4. Detailed explanations and equations follow in the Supplementary data, where Supplementary Fig. S1 exemplifies the workings of the classify-refine model.
Parameter (abbreviation) . | Meaning . | Role in winning model . |
---|---|---|
Replication parameters (analogous to published model) | ||
initHarmInt (pHI0) | Central tendency of initial, or prior, beliefs over attribution of HI | Included |
initSelfInt (pSI0) | Central tendency of initial, or prior, beliefs over attribution of SI | Included |
IntentAttrEv (dEv) | Certainty of prior belief in positive or negative Intent Attribution. “d” refers to the active inference convention for prior beliefs over states, “Ev” for amount of evidence. | Included |
evidRatio (EvRat) | Ratio of initial evidence over HI vs. Selfish Intent | Excluded: ratio = 1 in winning model. |
decisPrec (alphaPrec) | Baseline decision precision, or inverse-temperature (=mean of prior on expected free energy precision) | Included |
wHarmInt (wH) | Weight of HI in Dictator’s policy fairness | Included |
wSelfInt (wS) | Weight of Selfish Intent Intent in Dictator’s policy fairness | Included |
fairnessB (w0) | Bias (intercept) in Dictator’s policy fairness function | Included |
λother | Learning rate measure from one dictator to the next | Included |
Exploratory parameters (newly introduced) | ||
typingConf, (aEv) | Confidence certainty of prior belief over typing the level of niceness of “positive” and “negative” character classes. “a” refers to the active inference symbol for the likelihood map, “Ev” for amount of evidence | Included |
learnRetn (ω) | Extent of learning retention from trial to trial | Included |
POCbias | Bias in expecting more positive or negative attributions for non-white Dictators. | Excluded: bias = 0 in winning model |
Parameter (abbreviation) . | Meaning . | Role in winning model . |
---|---|---|
Replication parameters (analogous to published model) | ||
initHarmInt (pHI0) | Central tendency of initial, or prior, beliefs over attribution of HI | Included |
initSelfInt (pSI0) | Central tendency of initial, or prior, beliefs over attribution of SI | Included |
IntentAttrEv (dEv) | Certainty of prior belief in positive or negative Intent Attribution. “d” refers to the active inference convention for prior beliefs over states, “Ev” for amount of evidence. | Included |
evidRatio (EvRat) | Ratio of initial evidence over HI vs. Selfish Intent | Excluded: ratio = 1 in winning model. |
decisPrec (alphaPrec) | Baseline decision precision, or inverse-temperature (=mean of prior on expected free energy precision) | Included |
wHarmInt (wH) | Weight of HI in Dictator’s policy fairness | Included |
wSelfInt (wS) | Weight of Selfish Intent Intent in Dictator’s policy fairness | Included |
fairnessB (w0) | Bias (intercept) in Dictator’s policy fairness function | Included |
λother | Learning rate measure from one dictator to the next | Included |
Exploratory parameters (newly introduced) | ||
typingConf, (aEv) | Confidence certainty of prior belief over typing the level of niceness of “positive” and “negative” character classes. “a” refers to the active inference symbol for the likelihood map, “Ev” for amount of evidence | Included |
learnRetn (ω) | Extent of learning retention from trial to trial | Included |
POCbias | Bias in expecting more positive or negative attributions for non-white Dictators. | Excluded: bias = 0 in winning model |
Parameter (abbreviation) . | Meaning . | Role in winning model . |
---|---|---|
Replication parameters (analogous to published model) | ||
initHarmInt (pHI0) | Central tendency of initial, or prior, beliefs over attribution of HI | Included |
initSelfInt (pSI0) | Central tendency of initial, or prior, beliefs over attribution of SI | Included |
IntentAttrEv (dEv) | Certainty of prior belief in positive or negative Intent Attribution. “d” refers to the active inference convention for prior beliefs over states, “Ev” for amount of evidence. | Included |
evidRatio (EvRat) | Ratio of initial evidence over HI vs. Selfish Intent | Excluded: ratio = 1 in winning model. |
decisPrec (alphaPrec) | Baseline decision precision, or inverse-temperature (=mean of prior on expected free energy precision) | Included |
wHarmInt (wH) | Weight of HI in Dictator’s policy fairness | Included |
wSelfInt (wS) | Weight of Selfish Intent Intent in Dictator’s policy fairness | Included |
fairnessB (w0) | Bias (intercept) in Dictator’s policy fairness function | Included |
λother | Learning rate measure from one dictator to the next | Included |
Exploratory parameters (newly introduced) | ||
typingConf, (aEv) | Confidence certainty of prior belief over typing the level of niceness of “positive” and “negative” character classes. “a” refers to the active inference symbol for the likelihood map, “Ev” for amount of evidence | Included |
learnRetn (ω) | Extent of learning retention from trial to trial | Included |
POCbias | Bias in expecting more positive or negative attributions for non-white Dictators. | Excluded: bias = 0 in winning model |
Parameter (abbreviation) . | Meaning . | Role in winning model . |
---|---|---|
Replication parameters (analogous to published model) | ||
initHarmInt (pHI0) | Central tendency of initial, or prior, beliefs over attribution of HI | Included |
initSelfInt (pSI0) | Central tendency of initial, or prior, beliefs over attribution of SI | Included |
IntentAttrEv (dEv) | Certainty of prior belief in positive or negative Intent Attribution. “d” refers to the active inference convention for prior beliefs over states, “Ev” for amount of evidence. | Included |
evidRatio (EvRat) | Ratio of initial evidence over HI vs. Selfish Intent | Excluded: ratio = 1 in winning model. |
decisPrec (alphaPrec) | Baseline decision precision, or inverse-temperature (=mean of prior on expected free energy precision) | Included |
wHarmInt (wH) | Weight of HI in Dictator’s policy fairness | Included |
wSelfInt (wS) | Weight of Selfish Intent Intent in Dictator’s policy fairness | Included |
fairnessB (w0) | Bias (intercept) in Dictator’s policy fairness function | Included |
λother | Learning rate measure from one dictator to the next | Included |
Exploratory parameters (newly introduced) | ||
typingConf, (aEv) | Confidence certainty of prior belief over typing the level of niceness of “positive” and “negative” character classes. “a” refers to the active inference symbol for the likelihood map, “Ev” for amount of evidence | Included |
learnRetn (ω) | Extent of learning retention from trial to trial | Included |
POCbias | Bias in expecting more positive or negative attributions for non-white Dictators. | Excluded: bias = 0 in winning model |

Essentials of classic vs. classify-refine models. (a) In classic models, each of a fine-grained range of states maps to a specific policy. For example, the worst possible HI always maps to an “unfair splitting” polity. Participants have to infer which of these many states obtains. (b) In the new model, classification into coarse-grained values occurs, i.e. only detrimental and beneficial, but what these mean for each partner is refined through learning. (c) Beliefs about states form the “core” of the generative model, updated at each trial. (d) Returns seen at each trial. (e) Participants rated their expectation of fair or unfair split before observing the return, and afterwards reported their (updated) attributions. Although they consider binary states, they have uncertainty about which obtains, so they still report a graded score about how likely a particular attribution is.
Agents inferred the type of others over an array of SI × HI states, utilizing independently parametrized priors over each dimension (Supplementary Table S1). Each “type” (SI, HI) determined a probability of fair-split through a logistic function (Supplementary Eq. S1). Observed splits allowed agents to invert this model and update their beliefs. Beliefs held at the learning “core” produced “fairness predictions,” “HI reports” and “SI reports” via three simple active-inference modules.
In classic models, HI and SI grids had six bins each, with fixed values, covering the possible range; but for the classify-refine models, the central core only had two states along each attribution dimension (Fig. 2b), giving four combinations. These, however, were not of fixed value, but could be adjusted via learning. Initially, each was taken to include low (5%) and high (95%) values of each attribute, so as to represent the psychological meaning of “beneficial” and “detrimental” and span the range of each attribute. Crucially, uncertainty over the location of the bins was parametrized by a typing confidence parameter aEv (Supplementary Table S1). Beliefs about location of the “beneficial” and “detrimental” states were refined by crediting the evidence for “fair” or “unfair” split at each trial in proportion to the (posterior) belief that said state underlay the trial (Dorfman et al. 2019).
Ethnicity did not directly affect learning, but modulated the beliefs fed from the learning to the reporting processes. That is, the core HMM learned without taking race into account, but its output to the response modules was subject to a bias parameter. The magnitude and direction of this bias was fitted individually, allowing modeling of the bias in any direction—including being positively biased about an out-group.
Model fitting
We fitted participants’ expectations about how the Dictator would split the sum, and their attributions of the likely SI and harm-intending motives of the Dictator, using models parameterized as per Table 1. Maximum-a-posteriori (MAP) fitting with weakly informative priors defined over native parameter space was used (Moutoussis et al. 2018). The sum-log-likelihood at the MAP estimate was then used to calculate Bayesian Information Criterion (BIC; corrected for small samples) and Akaike Information Criterion for each participant. In cases where a parameter was fitted across the whole participant sample, we modified the complexity penalty associated with that parameter to be consistent with BIC/2 being an approximation to log model evidence. Gradient-descent methods (including matlab fmincon and SPM spm_dcm_mdp; (SPM development team 2022, MATLAB 2023) encountered problems with local minima; hence, we used adaptive grid-search optimization with multiple initial conditions.
Regression analyses
Model fitting mostly resulted in approximately normal parameter distributions in transformed space, where regression analyses were performed. Sometimes, however, outliers occurred. We therefore first used robust regression rlm in R (R Core Team 2021), and we report hypothesis tests based on the t-value of the coefficient in question, e.g. rlm(pHI0_follow-up ∼ pHI0_baseline + drug_group + gender + subjective_socioecon_status). We then performed the equivalent ordinary least squares regression using lm and identified points with Cook’s distance > 1 and/or standardized residual distance > 3 from the theoretically predicted on Q-Q plots. We excluded these suspect datapoints from further analyses. After these few outliers were excluded, we performed longitudinal analyses using linear mixed effects models.
Results
Refined learning accompanies fast classification
In our baseline sample, the classify-refine model was superior to the classical learning model (Fig. 3a, median BIC 372.21 vs. 446.69, Wilcoxon rank sum P < 1E-5), strongly supporting our hypothesis that individuals rapidly classify others’ attributes as beneficial vs. detrimental to themselves, followed by refining their attributions. People refined their attributions more slowly if they had a higher typing confidence parameter, aEv. This quantified the initial amount of evidence underpinning their beliefs that each type of partner would follow a specific policy (a-matrix; see Table 1 and “Materials and Methods” section). Further model comparisons performed on the baseline data indicated the necessity of including learning from one dictator to the next (λother in Table 1) and imperfect memory for retaining learning from trial to trial (ω). We then fixed one parameter at a time, in order to discover more parsimonious models, or replicate our previous successful models as per preregistration. The necessity for all parameters was replicated, except the need for separate uncertainties over prior Harm and Selfishness intent [analogous to uSI0 = uPI0 in our previous model, and replicating novel work (Barnby et al. 2024)]. The only remarkable correlation between parameters was between initHarmInt and initSelfInt (baseline r = 0.438, P uncorr. = .000170, follow-up r = 0.428, P uncorr. = .000372, Supplementary Fig. S8).

Key model comparison results. Red lines = BIC equality. (a) Classify-refine models clearly outperform classic learning. For 64/73 participants, the BIC difference is 6 or more (i.e. to the left of gray, diff = + 6 gray line; conventionally at least modest evidence. Right gray line is diff = −6). (b and c) Race/ethnicity bias. (b) The model without person-of-color bias gave a more parsimonious fit for 63 out of 73 participants (dots above the red, equal-BIC line). For 10/73 participants, the model including a POC bias parameter gave an advantage of at least 6 BIC points (‘modest evidence’). (c) (inset) Histogram of the difference, showing a main peak for the simpler no-bias model and an upper tail for the more complex model.
Coalitional factors: lower subjective status may attenuate attributions of harm intent
To examine the preregistered hypothesis that race stereotypes would modulate attributions, we examined the effect of a parameter that shifted the effective attributions as they were communicated from the “core belief” module to the “Dictator response function” (Fig. 2a and b), based on the apparent ethnicity of the Dictator.
Happily, we found strong evidence against our hypothesis that including this bias parameter regarding people of color (POC) would improve model fit for most participants. The model including apparent ethnicity had a median BIC of 372.2, vs. for 366.2 for not including ethnicity, Wilcox. P = 8.5E-5). However, 10 of 73 participants appeared better fit by a model including POC bias (Fig. 2b and c).
In preregistration, we hypothesized that across-participant coalitional threat, operationalized as perceiving oneself as of lower socioeconomic rank, would increase HI attributions—but we found evidence for the opposite. Average harm attributions increased with Subjective SES (SSES) score, beta = 0.109, SE = 0.048, P = .030. This was unchanged controlling for testing wave and drug [HIAv ∼ SSES + wave*drug + (1|participant): P = .034, beta = 0.109, SE = 0.050].
The success of the winning model replicated between baseline and follow-up waves
A winning model over the baseline sample motivated the hypothesis that the same model would be the best at follow-up. This hypothesis held, as the same model had the best total BIC in the follow-up testing. Model comparison at baseline vs. follow-up is shown at Fig. 4.

Median model fit measures (BIC) for classify-refine models. (a) The best model at both baseline and follow-up (b) had two parameters fewer than the full model—POC_bias and ratio of prior certainties for HI vs. SI. BIC reduced with testing wave, mixed-effects analysis BIC ∼ wave + (∼1 | participant) P = .023. Note the same scales. (c and d) Overall, the same model was the best when fitted only to the attribution (excluding the expectation) data. This was a much less stable model (cf. log-likelihoods in Supplementary Fig. S5), which probably accounts for why it was slightly worse than its main competitor shown here at baseline (c), but considerably better at follow-up (d).
We then examined the hypothesis that each parameter of the model would show stability, i.e. that it would be correlated between baseline and follow-up. To do this, we regressed the follow-up values of the parameters on the baseline ones, controlling for group allocation (placebo vs. SSRI). We found evidence for stability of the following parameters: initHarmInt (P = .0096, adj. R2 = 0.096 excl. one outlier; see“Materials and Methods” section), decision noise decisPrec (P = .0024, adj. R2 =0.11), prior attribution certainty typingConf (P = .00011, adj. R2 = 0.21 excl. one outlier), bias fairnessB (P = 1.5E-5, adj. R2 = 0.253 excl. one outlier; Note this is not bias about ethnicity but propensity to attribute beneficence), memory learnRet (P = .00122, R2 = 0.132), and learning over dictators λother (P = .012, R2 =0.071). Other parameters were poorly correlated (pS0: P = .194, R2 =0.064; dEv: P = .731, adj. R2 = −0.032 excl. one outlier; wH: P = .806, R2 =0.010; Ws:P = .436, R2 =−0.022). As often found, model fit was the most stable measure (log-likelihood: P = 2.17E-6, adj. R2 = 0.287) and improved on re-testing (See Supplementary Fig. S6, B. vs. A.)
Including predictions along with attributions improves task and model stability
We then fitted the winning model only to the harmful intent and SI attributions (i.e. like the previously published version of the task). In support of preregistered hypothesis C, fewer measures were correlated at conventional levels of significance, and baseline values generally explained less variance of the follow-up ones. As expected, model fit was most stable (log likelihood: P = .0074, adj. R2 = 0.085), and the overall bias fairnessB, decision noise deciPrec, and learning over dictators λother also showed significant correlations (w0: P = .012, adj. R2 = 0.106; alphaPrec: P = .00202, adj. R2 = 0.122; λother: P = .0208, adj. R2 = 0.055; Supplementary Fig. S5). However, the other measures did not attain conventional significance by this simple measure (P-value for initHarmInt: 0.235, initSelfInt: 0.90, aEv: 0.814, typingConf: 0.108, wHarmInt: 0.228, wSelfInt: 0.948, and learnRet: 0.155).
Citalopram may reduce typing confidence, thus enhancing refinement of views
We first assessed the stability of the measure of average attribution levels for an individual per testing wave, as we preregistered the hypothesis that citalopram would reduce this measure levels across waves. These measures showed good stability, as shown by correlating follow-up attributions with baseline ones, while controlling for the effect of citalopram. Average HI attributions, HIAv, correlated at P = .0015, beta = 0.442, R2 = 0.135. Average Self-interest, SIAv, showed P= .0023, beta = 0.409, R2 = 0.152. Predictions (predAv) showed P = 1.17E-5, beta = 0.437, R2 = 0.249.
We used linear mixed effects analysis [measure ∼ wave*drug + (1 | participant)] to assess the effect of citalopram. We found no evidence that citalopram affected HI attributions, P = .11 for the wave*drug term for measure = HIAv. There was modest evidence that SI attributions were enhanced by citalopram (wave*drug for SIAv: beta = 0.269, SE = 0.134, P =.0484). Average expectation was unaffected, P for predAv = 0.48.
We found a possible novel effect of citalopram on the confidence of typing characters, typingConf. Otherwise, modeling measures were consistent with the analysis of average attributions, i.e. there was a modest effect of SSRI increasing initSelfInt (pSI0: beta = 1.32, SE = 0.62, P = .037), consistent with SIAv above. The novel effect was a decrease typingConf by citalopram (aEv: wave*drug P = .023, beta = −1.18, SE = 0.50). This would enhance the attribution-refining process. Notably, this was on a background of initSelfInt reducing with wave (pS0: P = .014, beta = −2.612, SE = 1.033), unlike initHarmInt (pH0: P = .153). The wave*drug effect for other measures was unremarkable (P values: LL: 0.729, initHarmInt: 0.217, dEv: 0.743, alphaPrec: 0.7638, fairnessB: 0.568, wHarmInt: 0.979, wS: 0.11, ω: 0.303, λother: 0.898).
Guessing which group participants were in was not associated with initSelfInt (P= .24), SIAv (P= .60), or typingConf (P = .76); hence, expectation effects are unlikely to account for our findings.
Psychometric findings and the effect of citalopram
We administered the Hypomanic Personality Scale (HPS) at baseline, and the Patient Health Questionnaire, Rosenberg Self Esteem, Mood and Anxiety Symptom Questionnaire, Perceived Stress Scale, Generalized Anxiety Disorder—7, and the Achievement Motivation Inventory at both baseline and follow-up. We examined the dependence of follow-up scores on citalopram treatment, controlling for baseline scores, gender, SSES, and HPS. Citalopram had no statistically significant effect on any of the scores (Supplementary Fig. S4).
In order to reduce dimensionality and increase sensitivity, we performed a factor analysis on the baseline state-like symptoms (i.e. not the HPS), resulting in two factors on the basis of parallel analysis and scree plot (Supplementary Fig. S6). We tested the validity and stability of this by deriving factor scores on baseline and follow-up based on the baseline loadings only. The first factor scores, “anxious depression” correlated r = 0.791, P < 1E-10. The second factor scores, “stressful amotivation,” correlated at r = 0.803, P < 1E-10, both showing good stability. Anxious depression was not affected by SSRI [anxDep ∼ wave*drug + (1 | participant) gave P = .759] but stressful amotivation marginally improved, P = .052, beta = −0.25. Exploratory linear regressions found no evidence that variance in psychometric ratings, or changes in these ratings, was related to model parameters.
Discussion
We sought to refine our understanding of the computational basis of attributing motives and examined the role of serotonin and ethnicity in such attributions. We tested whether SSRI treatment may promote attributions of more beneficent motives to others; and whether factors often postulated to recruit subtle coalitional dynamics, namely apparent race and subjective socioeconomic status, affected attribution of motives. Importantly, we found strong evidence in favor of our hypothesized “classify-refine” models, which postulated that participants rapidly classified others’ attributes as positive or negative, and then sought to refine attributions. One week’s treatment with citalopram did not result in more magnanimous attributions (Moutoussis et al. 2022), but exploratory evidence suggested that it rendered the process of belief refinement more flexible. Contrary to our hypotheses regarding ethnicity, apparent race did not affect positive or negative attributions, consistent with most of our participants’ decision-making being unbiased in this respect. Notably, low subjective socioeconomic status reduced rather than increased attributions of harmful intent.
In line with our preregistered modeling hypotheses, we replicated the set of features, or parameters, needed to model the data according to our previous work—with one exception (Barnby et al. 2022). That is, prior belief uncertainty over HI and Self-interest could be condensed into a single uncertainty parameter. This is important, as the replication entails a task in a different population, a laboratory-based pharmacology experiment, with added task features of naturalistic photos of the “partner,” and questions about expected frequency of fair outcomes (Fig. 1). We found good evidence that the version of the task including these new questions was more stable, or reliable, than the original—as we predicted [(Moutoussis et al. 2022), main Hypothesis C]. Our close replication of a robust correlation between initial beliefs in HI and SI (Barnby et al. 2022) suggests two things. First, that there is a general propensity to attribute beneficent vs. detrimental motives. Second, that future studies should consider their commonality (harm-benefit intent) and contrast (other-self focus). Here we used active-inference, which offers a natural framework for our models, but our modeling findings can be implemented in other frameworks (like reinforcement learning) too.
The winning classify-refine learning shares much with posterior-belief-based credit allocation models (Dorfman et al. 2019), and has been considered in some detail by Story and co-authors in the context of “black and white thinking” (Story et al. 2024). Here, we suggest that classify-refine updating may be a useful socio-cognitive strategy, rather than suboptimal black-and-white thinking. It is advantageous to rapidly distinguish friend from foe, and then do more justice to the true nature of others. It may thus be an example of benign “thinking fast and slow” (Kahneman 2011). Classify-refine may also explain the fast-then-slow social learning shown by Bone, Pike and co-workers, and others (Bone et al. 2021, Hopkins et al. 2021). Bone et al. fitted their data with associative models wherein learning rates decayed in novel ways, partially inspiring the present work. Future research should compare our (Bayesian) classify-refine models with associative learning employing multiple learning processes, by which the brain may approximate Bayesian inference. But how might classify-refine relate to the well-recognized, maladaptive polarized thinking? This could be about the failure of the “refine” process, either due to excessive “typing confidence” leading to persistent stereotyping, or by circular causation induced by tit-for-tat strategies.
Citalopram treatment did not increase attribution of more prosocial motives, either in terms of increasing the proportion of fair returns that participants expected, their average HI and SI attributions, nor the parameters pHI0 and pSI0. Indeed, there was weak evidence for a relative increase in average SI and pSI0 (both at the 0.05 level uncorrected). Citalopram did not affect our healthy participants’ psychometric measures either, so the possibility remains that it may only ameliorate truly depressive attributions, or over a longer time course. We found intriguing exploratory evidence—much in need of replication—that citalopram may reduce certainty over character types, allowing for faster learning of others’ true nature, and that it may reduce stress-induced amotivation.
Still, our computational modeling hints at neurobiological mechanisms yet to be established. This is important, as SSRIs are thought to have prosocial effects in patients who benefit from them (Bian et al. 2024), but the specific mechanisms involved are unclear, and very complex (Kiser et al. 2012, Duerler et al. 2022). Serotonin may interact with existing expectations to affect changes in dopaminergic transmission, associated with changes in social anxiety (Hjorth et al. 2021). Thus improvements in slow social learning may be particularly salient for depression sufferers, who have higher harm-attribution expectations than healthy people. We thus speculate that during the treatment of depression with SSRIs, a synergy of psychological factors (expectations and learning) with changes in serotonergic neurotransmission may be required to build more benign priors about the social world. This may be implemented by changes in dopaminergic transmission (Barnby et al. 2024) in the basal ganglia, more or less specific to social contexts.
Important calls have been made in recent years toward a translational neuroscience sensitive to its own biases, one which is to elucidate and help mitigate prejudice (Iyer 2022, Kaiser Trujillo et al. 2022, Singh et al. 2022). Our findings about the role of race were unexpected, and in a sense cause for celebration. Models that included a parameter that would change motivational attributions depending on the ethnicity of the partner, a parameter which was able to capture any direction of such effect at the individual level, were rejected as they did not improve model fit—although a few individuals may be prone to this bias, and they may have a disproportionate impact on disadvantaged individuals. Our findings contrast with literature showing affective biases against people of color (Davies and Turnbull 2011) and are consistent with decision-making findings where appropriately educated participants behaved fairly toward POC others, despite having low-level affective biases (Correll et al. 2007). Lower socioeconomic self-ranking was associated, again contrary to our hypotheses, with lower attributions of harm intent, a positive finding in the sense of lack of evidence for excessive mistrust in this group. We note that our participants were predominantly young, highly educated women from BAME backgrounds, the most common ethnicity being Chinese. Groups with a different experience of social disadvantage, or indeed advantage, may attribute motives differently. Although, therefore, our sample is not representative of the UK population, our inability to detect signs of attribution bias and socioeconomic-based suspiciousness are cause for measured celebration.
In terms of limitations, our study did not involve participants with clinically significant symptoms, nor did citalopram have an effect on their mood and anxiety, rendering moot the question of whether the attributional problems of depression are ameliorated by SSRIs. It would be important, therefore, for longer, clinical studies to include social-cognitive tasks such as the one used here. Similarly, the happy absence of detectable biases related to low subjective socioeconomic status or ethnicity needs replication in samples more representative of the general population.
In conclusion, classify-refine models appear to hold much promise for social neuroscience, have normalizing implications for “black-and-white” thinking, are well served by the active inference framework, and have learning-relevant parameters that may depend on serotonin function. SSRIs appear to have little effect on the motives and expectations healthy people have of each other. Importantly, computational neuroscience studies should be equipped to interrogate burning issues such as social inequality or racial biases, but, as importantly, also to elucidate mechanisms behind “the glass being half full.”
Acknowledgements
We wish to thank Giles Story, Alex Pike, Kate Button, Janina Hoffman, Katie Hobbs, Ryan Smith, and the WCHN Methods group for their support.
Data availability
The data underlying this article are available in the public github directory https://github.com/mmoutou/Classify-refine_Sharing_game.
Author contributions
Michael Moutoussis (Investigation, data curation (supporting), review & editing (equal), conceptualization, formal analysis, methodology, software, writing—original draft preparation [all lead]), Joe Barnby (Conceptualization, formal analysis, methodology, software, writing—original draft preparation [all supporting], review & editing [equal]), Anais Durand (Investigation, data curation [equal], writing—original draft preparation (supporting), review & editing [equal]), Megan Croal (Investigation, data curation [equal], writing—original draft preparation [supporting], review & editing [equal]), Laura Dilley (Supervision, writing [supporting], review & editing [equal]), Robb B. Rutledge (Methodology, project administration, supervision, writing—original draft preparation [all supporting] review & editing [equal]), funding acquisition [lead]), Liam Mason (Conceptualization, methodology, writing—original draft preparation, investigation, data curation, funding acquisition [all supporting], review & editing [equal], project administration, supervision [both lead]).
Supplementary data
Supplementary data is available at SCAN online.
Conflict of interest
M.C. is a current employee of and holds options in COMPASS Pathfinder Ltd, but this work is unrelated to COMPASS Pathfinder. None of the other authors has any financial or intellectual competing interests in connection with the manuscript to declare.
Funding
The Department of Imaging Neuroscience is supported by a Platform Grant from the Wellcome Trust, and its Max-Planck center for computational neuroscience and ageing is supported by UCL and the Max-Plan Society. R.B.R. is supported by a NARSAD Young Investigator Award from the Brain & Behavior Research Foundation, P&S Fund. L.M. is supported by a Medical Research Council Clinician Scientist Fellowship (MR/S006613/1).