Nociceptin Receptor Antagonism Modulates Electrophysiological Markers of Reward Learning

Using in vivo electrophysiology and a reward-learning task reverse-translated from humans, this study tested the hypothesis that a nociceptin (NOP) receptor antagonist (J-113397) would potentiate behavioral and electrophysiological markers of reward learning in rats. Relative to vehicle, the NOP antagonist modulated electrophysiological markers of reward processing but did not affect response bias toward a more frequently rewarded stimulus. This proof-of-concept study provides initial insights into the effects of NOP receptor antagonism on reward learning, which are consistent with previous findings suggesting that this mechanism is a promising antidepressant target.


INTRODUCTION
Anhedonia, the reduced reactivity to rewards, is a cardinal phenotype of major depressive disorder (MDD) (Pizzagalli, 2022). Nociceptin/orphanin FQ peptide (NOP) and its receptor (NOPR) have been implicated in various domains affected by MDD (e.g., learning, stress regulation, hedonic responses) (Gavioli et al., 2021). Of particular relevance, NOPR activation inhibits dopamine, leading to reduced motivated/hedonic behaviors (Gavioli et al., 2021), whereas NOPR blockade has antidepressant-like effects in rodents (Redrobe et al., 2002; Rizzi et al., 2007). While valuable, rodent assays probing anhedonic behaviors typically differ substantially from human tasks, which hinders translation. To fill this gap, we developed functionally identical versions of the probabilistic reward task (PRT) to objectively quantify reward responsiveness in humans and laboratory animals (Pizzagalli et al., 2005; Kangas et al., 2020). By unevenly distributing rewards between 2 difficult-to-discriminate stimuli, the task assesses the subject's ability to develop a response bias (i.e., a preference for the more frequently rewarded stimulus). Critically, individuals with MDD, and specifically those with anhedonia, show a reduced response bias (Pizzagalli, 2022).
Recently, we recorded local field potentials (LFPs) from 2 key brain reward nodes, the anterior cingulate cortex (ACC) and nucleus accumbens (NAc), in rats performing the PRT (Iturra-Mena et al., 2023). We reported that 3 electrophysiological markers linked to reward processing in humans could be reliably detected in rats: an event-related potential (ERP) deflection 250-500 milliseconds after reward, as well as power increases in the delta (1-5 Hz) and alpha/beta (9-17 Hz) bands for rewarded trials. Consistent with human findings implicating delta and beta oscillations in reward prediction errors (i.e., when outcomes are better than expected) (HajiHosseini et al., 2012; Cavanagh, 2015; Marco-Pallares et al., 2015), delta (200-600 milliseconds) and beta (100-200 milliseconds) power in the ACC and NAc were largest after reward feedback, particularly for the less frequently rewarded stimulus (i.e., the largest reward prediction error). Building on these findings, we tested whether a single dose of the NOP antagonist J-113397 would potentiate response bias and electrophysiological markers of reward learning.

Procedures
We conducted secondary analyses of Iturra-Mena et al. (2023), focusing on pharmacological effects. Eleven rats (5 females) were trained on a rodent touchscreen PRT (Kangas et al., 2020), with 100 trials divided across 3 blocks (blocks 1 and 2: n = 33 each; block 3: n = 34). Following PRT training, we implanted each animal with electrodes in the ACC/Cg2 (AP: +1.2, ML: +0.8, DV: −3.0) and NAc (AP: +1.2, ML: +0.8, DV: −7.0) for LFP recordings. In each testing/recording session, subjects were injected with either vehicle or J-113397 (10 mg/kg) 15 minutes before PRT testing. In this initial, proof-of-concept study, only 1 dose (10 mg/kg) was selected, following Genovese and Dobre (2017), who demonstrated that 7.5-mg/kg and 20-mg/kg doses safely mitigated stress-related behavioral effects without causing behavioral disruption. Thus, we deemed a 10-mg/kg dose safe/suitable. ERP analyses and wavelet frequency-decomposition were performed as described (Iturra-Mena et al., 2023). Electrophysiological variables were computed time-locked to rewarded vs nonrewarded stimuli separately for the stimulus associated with more (rich) vs less (lean) frequent rewards. The current research was approved by the McLean Hospital's Institutional Animal Care and Use Committee.

Figure 1. [...] (Kangas et al., 2020). Bottom panels: accuracy calculated as the percentage of correct responses (left) and reaction time (right) measured as time to make a response (seconds). The PRT elicited the intended preference for the stimulus paired with more frequent reward (log b), without fluctuations in task difficulty (log d) or reaction time throughout the task for all conditions. Contrary to our hypotheses, J-113397 did not potentiate the animal's preference for the more frequently rewarded stimulus (log b). (B) Grand average of the feedback-locked ERP for rewarded (blue) and nonrewarded (red) trials, and the difference between them (gray), separated by stimulus type. A feedback-related positivity was observed as a negative deflection at 250-500 milliseconds after feedback in the ACC and NAc local field potentials. (C) Amplitude values for the ERP 250-500 milliseconds after feedback in ACC and NAc for all correct rewarded and nonrewarded trials, separated by stimulus type: lean (white circle) and rich (black circle). A significant treatment × reward feedback × stimulus type interaction emerged for the ACC, and post hoc tests clarified that the interaction was driven by a more positive deflection to nonrewarded lean trials for J-113397 than vehicle, albeit at a trend level (P = .059). Data presented as mean ± SEM; n = 11. ACC: Anterior Cingulate Cortex; NAc: Nucleus Accumbens; NOP: Nociceptin/orphanin FQ peptide; PRT: Probabilistic Reward Task.

[...] A main effect of J-113397 on delta power was found, with overall lower delta power for J-113397 relative to vehicle. Similarly, a treatment × reward feedback × stimulus type interaction for ACC 9 to 17-Hz power emerged. Follow-up analyses clarified that this difference was driven by a higher 9 to 17-Hz power for J-113397 relative to vehicle exclusively for lean rewarded trials (P = .021). Main effects and interactions are presented with letters R (reward/nonreward feedback), S (stimulus), R × S (interaction), and T (treatment), with asterisks according to their statistical significance. Data presented as mean ± SEM; n = 11. ACC: Anterior Cingulate Cortex; LFP: Local Field Potentials; NAc: Nucleus Accumbens; NOP: Nociceptin/orphanin FQ peptide.
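For readers unfamiliar with the response bias (log b) and discriminability (log d) measures used in the PRT, the sketch below shows how they are commonly computed from trial counts in the PRT literature (Pizzagalli et al., 2005). It is a minimal illustration, not the analysis code used in this study: the function name, the example counts, and the 0.5 added to every cell (a common convention to avoid division by zero) are assumptions.

```python
import math


def prt_log_b_log_d(rich_correct, rich_incorrect,
                    lean_correct, lean_incorrect, correction=0.5):
    """Return (log_b, log_d) for one subject and one block.

    Inputs are trial counts. `correction` is added to every cell so that
    an empty cell does not produce a division by zero (assumed convention).
    """
    rc = rich_correct + correction
    ri = rich_incorrect + correction
    lc = lean_correct + correction
    li = lean_incorrect + correction

    # Response bias: positive when the subject favors the rich stimulus.
    log_b = 0.5 * math.log10((rc * li) / (ri * lc))
    # Discriminability: higher when both stimuli are identified correctly.
    log_d = 0.5 * math.log10((rc * lc) / (ri * li))
    return log_b, log_d


# Hypothetical counts for one block (not data from this study).
log_b, log_d = prt_log_b_log_d(rich_correct=15, rich_incorrect=2,
                               lean_correct=11, lean_incorrect=6)
print(f"log b = {log_b:.3f}, log d = {log_d:.3f}")
```

With these definitions, a subject who responds identically to both stimuli has log b = 0, whereas a subject who is more accurate on the rich stimulus and misclassifies lean stimuli as rich has log b > 0.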

Statistical Analysis
For drug-vehicle comparisons, we performed 3-way repeated-measures ANOVAs on feedback-locked amplitude/power values from correct trials, entering reward feedback (rewarded/nonrewarded), stimulus type (lean/rich), and treatment (J-113397/vehicle) as repeated measures. For significant triple interactions, follow-up 2-way ANOVAs were performed to disentangle effects, followed by Šídák's test for multiple comparisons. For response bias, a 2-way repeated-measures ANOVA with treatment and block (1, 2, 3) as factors was conducted.
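Because each factor in the 3-way design has 2 levels, every effect has 1 degree of freedom, and its repeated-measures F statistic equals the squared one-sample t statistic of a per-subject contrast score. The sketch below illustrates this for the treatment × reward feedback × stimulus type interaction, together with the Šídák adjustment for m comparisons. It is a minimal illustration under stated assumptions, not the software or code used in this study: the function names, the 0/1 level coding, the data layout, and the example values are all hypothetical.

```python
import math
from itertools import product
from statistics import mean, stdev


def interaction_contrast(cells):
    """Triple-interaction contrast for one subject.

    `cells[(treatment, feedback, stimulus)]` holds the subject's mean
    amplitude/power for that condition; each level is coded 0 or 1
    (hypothetical coding).
    """
    total = 0.0
    for t, f, s in product((0, 1), repeat=3):
        sign = (1 if t else -1) * (1 if f else -1) * (1 if s else -1)
        total += sign * cells[(t, f, s)]
    return total


def interaction_f(per_subject_cells):
    """Return (F, (df1, df2)) for the 2 x 2 x 2 within-subject interaction.

    With 1 numerator df, F is the squared one-sample t of the contrast
    scores. Requires at least 2 subjects with non-identical scores.
    """
    scores = [interaction_contrast(c) for c in per_subject_cells]
    n = len(scores)
    t_stat = mean(scores) / (stdev(scores) / math.sqrt(n))
    return t_stat ** 2, (1, n - 1)


def sidak_adjust(p, m):
    """Sidak-adjusted P value for one of m comparisons."""
    return 1.0 - (1.0 - p) ** m


# Hypothetical data for 3 subjects (not data from this study): a common
# baseline plus a subject-specific boost in a single condition.
subjects = []
for boost in (0.8, 1.5, 1.1):
    cells = {key: 2.0 for key in product((0, 1), repeat=3)}
    cells[(1, 1, 0)] += boost
    subjects.append(cells)

f_value, dfs = interaction_f(subjects)
print(f"F{dfs} = {f_value:.2f}")
print(f"Sidak-adjusted P for P = .01, m = 4: {sidak_adjust(0.01, 4):.4f}")
```

Converting F to a P value requires the F distribution, which is not in the Python standard library; a statistics package would normally handle that step along with the full ANOVA table.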

Response Bias
Contrary to our hypotheses, the treatment × block ANOVA revealed no effects involving treatment (P > .99) (Figure 1A).

DISCUSSION
Our findings provide initial evidence that a single administration of a NOP antagonist modulated electrophysiological markers of reward learning without affecting behavior. While these preliminary findings suggest that electrophysiological markers might be especially sensitive in detecting effects of NOP antagonism, replications in larger samples are warranted. Notably, in addition to a general reduction in delta power in NAc (irrespective of stimulus type and reward delivery) by J-113397, the most specific drug effect was observed for the feedback-locked 9 to 17-Hz frequency band, in which ACC power was significantly larger for J-113397 relative to vehicle in response to rewarded lean trials. These findings are intriguing in light of human findings showing that beta power is potentiated by delivery of unexpected (low probability) rewards (HajiHosseini et al., 2012; Marco-Pallares et al., 2015), which in the PRT correspond to rewarded lean trials. Because unexpected rewards have been linked to dopaminergic signaling, these findings raise the possibility that NOP antagonism might potentiate dopaminergic signaling to salient cues. Future dose-response studies are warranted to test this speculation and to evaluate the promise of NOP antagonism to reverse anhedonic phenotypes.