Patients with anxiety disorders rely on bilateral dlPFC activation during verbal working memory

Abstract One of the hallmarks of anxiety disorders is impaired cognitive control, affecting working memory (WM). The dorsolateral prefrontal cortex (dlPFC) is critical for WM; however, it is still unclear how dlPFC activity relates to WM impairments in patients. Forty-one healthy volunteers and 32 anxiety (general and/or social anxiety disorder) patients completed the Sternberg WM paradigm during safety and unpredictable shock threat. On each trial, a series of letters was presented, followed by brief retention and response intervals. On low- and high-load trials, subjects retained the series (five and eight letters, respectively) in the original order, while on sort trials, subjects rearranged the series (five letters) in alphabetical order. We sampled the blood oxygenation level–dependent activity during retention using a bilateral anatomical dlPFC mask. Compared to controls, patients showed increased reaction time during high-load trials, greater right dlPFC activity and reduced dlPFC activity during threat. These results suggest that WM performance for patients and controls may rely on distinct patterns of dlPFC activity with patients requiring bilateral dlPFC activity. These results are consistent with reduced efficiency of WM in anxiety patients. This reduced efficiency may be due to an inefficient allocation of dlPFC resources across hemispheres or a decreased overall dlPFC capacity.


Introduction
Anxiety disorders are one of the most commonly diagnosed classes of mental disorders, affecting one in five individuals in a given year (Kessler and Chiu, 2005). One of the hallmarks of these conditions is an inability to focus attention (First et al., 2012), which makes it difficult for anxious participants to perform attentionally demanding tasks (Eysenck et al., 2007), like those that require manipulating items in working memory (WM) (Vytal et al., 2012. One critical region thought to be important for WM manipulation is the dorsolateral prefrontal cortex (dlPFC) (Balderston et al., 2016b). Using threat of shock to induce anxiety is one approach to studying the relationship between anxiety and cognition (Robinson et al., 2013;Grillon et al., 2019); however, additional research is needed to determine whether this approach can shed light on the cognitive symptoms seen in anxious patients and inform the identification of treatment targets for anxiety disorders (Balderston et al., 2020). Therefore, the purpose of this study is to examine the WM-related dlPFC activity in anxiety patients and controls and to examine whether this activity is affected by manipulations in anxiety state.
Although there are few WM studies conducted in anxiety patients, the results often suggest processing deficits in WMrelated regions including the dlPFC. For instance, patients with generalized anxiety disorder (GAD) show reduced left dlPFC activation during WM maintenance Jeong, 2015 2017;Moon et al., 2016) but greater activation in WM-related regions when presented with emotional distractors (Moon and Jeong, 2015;Park et al., 2016). Patients with GAD also show reduced activation in frontal, parietal and cerebellar regions important for WM maintenance compared to controls during a WM suppression task (Diwadkar et al., 2017). Patients with post-traumatic stress disorder (PTSD) have reduced default mode network deactivation during the 3-back task (Landré et al., 2012) and reduced frontoparietal activity during memory updating (Shaw et al., 2009). Patients with anxious major depressive disorder show reduced left prefrontal beta-band desynchronization during the n-back task (Ionescu et al., 2015). Together these results suggest that deficits in WM-related processing may be a dimensional indicator of anxiety, cutting across anxiety disorders.
Despite these deficits in WM-related processing, anxiety patients are often able to perform at a similar level of accuracy as healthy controls, although at the cost of slower response, suggesting the implementation of compensatory mechanisms to maintain performance. A prime candidate region for this compensatory processing is the right dlPFC. For instance, highanxious subjects show elevated right dlPFC activity during WM maintenance compared to low-anxious subjects (Basten et al., 2012). Similarly, healthy subjects show elevated right dlPFC activity during complex cognitive tasks performed under threat of shock (Oei et al., 2012;Balderston et al., 2017bBalderston et al., 2017c. However, this argument is inconsistent with neuromodulatory data targeting the right dlPFC, showing that excitatory repetitive transcranial magnetic stimulation (rTMS) increases anxiety (Balderston et al., 2020) while inhibitory rTMS reduces anxiety (Chen et al., 2013). In addition, retention interval activity is often left-lateralized, especially when verbal stimuli are used (Altamura et al., 2010;Rottschy et al., 2012), making the functional significance of these anxiety-related right dlPFC changes unclear. The key question then is whether right dlPFC contributes to or detracts from cognitive control in anxious patients and whether this right dlPFC activity could be a dimensional indicator of anxiety, consistent with the National Institute of Mental Health (NIMH) research domain criteria initiative (Insel et al., 2010;Insel, 2014).
Accordingly, we recruited a heterogeneous sample of patients meeting criteria for anxiety disorder and compared left and right dlPFC activity during retention interval between these patients and controls during periods of safety and threat. We chose to (i) contrast left and right blood oxygenation leveldependent (BOLD) activity because state, trait and clinical anxiety have been shown to impact task-related right dlPFC activity (Basten et al., 2011;Balderston et al., 2016bBalderston et al., 2017b), but the Sternberg task is known to preferentially activate the left dlPFC (Altamura et al., 2010;Rottschy et al., 2012) and (ii) vary the cognitive load of the task by adding a WM manipulation component (letter sorting), in the hope of revealing cognitive control deficits in the patients, characterized by increased reaction time (RT) (Eysenck et al., 2007;Cornwell et al., 2011). Finally, since previous research has shown that state anxiety can negatively impact WM performance and WM-related dlPFC activity (Vytal et al., 2012Balderston et al., 2016b), we chose to investigate these relationships during periods of safety and threat. We also expected that these cognitive control deficits would be accompanied by an increase in right dlPFC activity in patients compared to controls. However, given that excitatory neuromodulation to the right dlPFC can be anxiogenic and inhibitory neuromodulation to the right dlPFC can be anxiolytic, it is currently unclear whether the right dlPFC contributes to or detracts from cognitive control. Accordingly, it is currently unclear whether right dlPFC activity should be positively or negatively correlated with performance. A positive correlation would suggest that right dlPFC contributes to cognitive control, maintaining performance at the level of the controls, albeit less efficiently. In contrast, a negative correlation would suggest that the right dlPFC detracts from cognitive control and contributes to the deficits seen in performance.

Participants
Participants were recruited from the Washington DC metro area via flyers, advertisements and postings on listservs (see Supplemental Table 1 for demographic information). All participants were screened by a trained clinician who administered a structured clinical interview (First et al., 2012). Subjects also completed the Beck Depression Inventory (Beck et al., 1996) and the Beck Anxiety Inventory (Beck et al., 1988). Subjects were included if they were (i) between 18 and 50 years of age, (ii) physically healthy and (iii) able to understand the instructions in English. Subjects were excluded if they had (i) neurological or other physical issues or were taking medications that may have impacted the study; (ii) alcohol/substance dependence, met criteria for an Axis 1 psychiatric disorder (healthy subjects); (iii) a first-degree relative with a psychotic disorder or (iv) magnetic resonance imaging (MRI) contraindications. Additionally, patients were included if they met criteria for one or more of the following disorders: GAD, social anxiety disorder (SAD), PTSD, panic disorder (PD) or specific phobia. A full list of the inclusion/exclusion criteria can be found at https://clinicaltrials.gov/ct2/show/NCT00047853. We recruited 33 patients (26 women) with an average age of 30.41 ± 7.44 years. One subject (GAD/SAD) was lost due to scanner failure. Of the remaining patients, 12 had a primary diagnosis of GAD, 7 for SAD, 11 for comorbid GAD/SAD and 2 for comorbid GAD/SAD/PD. We recruited 46 healthy control subjects (26 women) with an average age of 26.8 ± 4.41 years. Patients were on average significantly older than controls, so age was included as a covariate where appropriate below [t(71) = 2.821; P = 0.006]. Three control subjects were excluded for motion during the functional magnetic resonance imaging (fMRI) scans, and two withdrew from the study. All subjects gave written informed consent approved by the NIMH Combined Neuroscience Institutional Review Board and were compensated for their time.

Procedure
Sternberg WM task. On each trial, subjects were sequentially presented a series of five or eight letters, followed by a brief retention period (see Figure 1). Prior to the letter series, subjects saw a 1 s fixation that indicated the trial type (low = 'maintain five letters', high = 'maintain eight letters', sort = 'sort five letters'). On low and high 'maintain' trials (low = five letters; high = eight letters), subjects rehearsed the series in order. An association test for the term 'dlPFC' on the site neurosynth.org (Yarkoni et al., 2011) was used to generate the bilateral mask.
On 'sort' trials (five letters), subjects rearranged the letters in alphabetical order. Following the retention interval, subjects were presented with a letter and a number, and made a forced choice button press indicating whether the position of the letter in the original (low and high trials) or alphabetical (sort trials) series matched the number.
There were four runs, each with 26 trials. Half of the trials were matches, meaning the position of the letter correctly matched the number shown, and half were mismatches. Low, high and sort trials were randomly shuffled within alternating blocks of safe and threat (two blocks each per run). The safe and threat blocks were signaled using colored banners with the words 'Safe' and 'Threat' that were present on the screen for the duration of the block. Two threat trials per run included a shock presentation and were discarded. On each trial, the fixation period lasted 1 s, the encoding period lasted between 3 and 5.5 s, the retention period lasted between 3 and 5.5 s and the response period lasted 3 s. The duration of the encoding and retention intervals was jittered across trials to allow for deconvolution of the BOLD response to the separate intervals. However, there remained some collinearity between the regressors (mean correlation = 0.42), suggesting that some variability was shared across intervals (mean r 2 = 0.177). However, it should be noted that partial correlation coefficients were used to model the BOLD response, so the results do not include this shared variability. The total trial length was 20 s, and the intertrial interval varied based on the duration of the encoding and retention intervals. After each run, subjects were asked to rate their anxiety during the safe and threat periods on a scale from 1 (not anxious) to 10 (extremely anxious; see Supplemental Table 1) (Balderston et al., 2016a).
Shock. The shock stimulus was a 200 Hz train of stimulation delivered for 100 ms to the right wrist using a constant current stimulator (Digitimer #DS7A, Ft. Lauderdale, FL). Two 11-mm disposable Ag/AgCl electrodes (Biopac Item number EL508; Goleta, CA), spaced ∼2 cm apart delivered the shock. Intensity was determined at the start of the experiment using an individualized thresholding procedure. Subjects rated each shock on a scale from 1 (not uncomfortable) to 10 (uncomfortable but tolerable), and shocks were delivered throughout the experiment at the level that subjects rated as their level 10 (see Supplemental  Table 1). Subjects also rated the shock after each run on the same scale, and values did not substantially change in these post-run ratings (see Supplemental Table 1).
Performance analysis. For both accuracy and RT, we performed a 2 (group: patient vs control) by 2 (condition: safe vs threat) by model (GLM) that included the following regressors of no interest: baseline (polynomial estimates), six motion parameters and their derivatives, the non-BOLD component time series, shock onsets and button presses. The GLM also included regressors of interest corresponding to the encoding, retention and response interval of the Sternberg trials. These intervals were modeled as variable duration blocks independently for the different conditions (i.e. safe vs threat, low vs high vs sort). fMRI analysis. The resulting beta maps from the first-level GLM were then analyzed using an a priori region of interest (ROI) approach focused on the left and right dlPFC, and an exploratory whole-brain voxelwise approach. For the ROI analysis, we used NeuroSynth (Yarkoni et al., 2011) to generate a bilateral mask of the dlPFC by searching the term 'dlPFC', saving the resulting association test map and extracting the two primary clusters corresponding to the left and right dlPFC. We chose the anatomical search term 'dlPFC', as opposed to a functional search term like 'working memory' for two reasons. First, we wanted the selection of the ROI to be independent of the factors in the experiment. Second, we were primarily interested in dlPFC activity, specifically because it is a common therapeutic target for neuromodulation (O'Reardon et al., 2007). We then averaged the beta values for the retention interval across voxels within each dlPFC ROI and performed a 2 (group: patient vs control) by 2 (condition: safe vs threat) by 3 (load: low vs high vs sort) by 2 (hemisphere: left vs right) mixed model ANOVA on the values. Note that we chose an ROI approach for the focus of this paper because we wanted to characterize the activity in the dlPFC across all experimental conditions in the current study. Given the large number of factors in this study, the design contains a large number of degrees of freedom, and thus a large number of distinct voxelwise maps in the omnibus analysis. Accordingly, it can be difficult to interpret the results from main effects and higher-order interactions in overlapping but nonidentical regions.
In addition to the a priori ROI analysis of the dlPFC, we also conducted exploratory voxelwise analyses at the whole-brain level. As before, we extracted the betas from the first-level GLM corresponding to the encoding, retention and response intervals and then performed three 2 (group: patient vs control) by 2 (condition: safe vs threat) by 3 (load: low vs high vs sort) mixed model ANOVAs on the values for each interval. We used cluster thresholding based on 10 000 Monte Carlo simulations to correct for multiple comparisons (Forman et al., 1995). We chose a two-tailed voxelwise P-value of 0.001, used a non-Gaussian autocorrelation function that better approximates BOLD data to estimate the smoothness of the residuals (Cox et al., 2017) and clustered voxels with adjoining faces and edges. The result was a minimum cluster size of 41, 3-mm isotropic voxels. Interactions were decomposed using post hoc t-tests, and the results are reported in Supplemental Tables 2-4.

Accuracy
For accuracy, we performed a group by condition by load mixed model ANOVA (see Table 1 and Figure 2A). There was a significant main effect of load [F(2142) = 18.419; P < 0.001; partial eta 2 = 0.21], but no other main effects or interactions (Ps > 0.05). To characterize this main effect, we performed paired sample t-tests for each of the three possible load comparisons and found that accuracy decreased significantly from low to sort to high [low > sort: t(72) = 3.074; P = 0.003; d = 0.36; sort > high: t(72) = 3.03; P = 0.003; d = 0.35; low > high: t(72) = −6.736; P < 0.001; d = 0.79].

Reaction time
For RT, we performed a group by condition by load mixed model ANOVA (see Table 1 and Figure 2B). As with accuracy, we found a significant main effect for load [F(2142) = 30.557; P < 0.001; partial eta 2 = 0.3]. In addition, we found a significant load by group interaction [F(2142) = 3.489; P = 0.033; partial eta 2 = 0.05]. However, there were no other significant main effects or interactions (Ps > 0.05). With age as a covariate, the load by group interaction is no longer significant [F(2142) = 2.577; P = 0.08; partial eta 2 = 0.01].
To characterize the load main effect, we performed paired sample t-tests for each of the three possible load comparisons and found that RT increased significantly from low to sort to high [low < sort: t(72) = 3.672; P < 0.001; d = 0.43; sort < high: t(72) = −3.236; d = 0.38; P = 0.002; low < high: t(72) = 9.113; P < 0.001; d = 1.07]. To characterize the load by group interaction, we performed t-tests comparing RT for controls vs patients for each level of load. We found that patients were significantly slower than controls for high trials [t (71)

dlPFC BOLD
To examine retention interval activity in the dlPFC, we averaged the beta values within ROIs for the left and right dlPFC (see Table 2 and Figure 1C) and performed a group by condition by load by hemisphere mixed model ANOVA on the values. We found significant effects for the following factors However, it should be noted (as shown below) that the main effect of hemisphere is driven primarily by the control group. Group interaction results are similar if age is included as a covariate.
To characterize these effects, we plotted the corresponding comparisons in Figure 3 and conducted post hoc t-tests to examine the directionality of the effects. First, for the main effect of load ( Figure 3A), we observed an increase in dlPFC BOLD from low to high [t(72) = 3.36; P = 0.001; d = 0.4] and high to  showed the fastest reaction time on low trials, and a slower reaction time on high trials than on sort trials. In addition, patients were significantly slower than controls on the high trials. Bars represent mean ± SEM. Circles represent individual data points. *Pairwise P-value < 0.05. For the hemisphere main effect, we observed significantly more left compared to right dlPFC activity (see Figure 3B and F-test above). For the condition by load interaction, we observed significantly less dlPFC activity during threat compared to safe for the sort condition  Figure 3D). For the group by condition interaction, we observed significantly less dlPFC activity during threat compared to safe for the patients [t(31) = 2.635; P = 0.013; d = 0.46] but not the control [t(40) = −0.833; P = 0.41; d = −0.13] subjects (see Figure 3E). Finally, for the group by hemisphere interaction, we observed significantly less activity in the right compared to left dlPFC for the controls [t(40) = 3.957; P < 0.001; d = 0.62] but not the patients [t(31) = 0.96; P = 0.344; d = 0.17; see Figure 3F].
Importantly, these results seem to be specific to WM maintenance. When we run the same group by condition by load by hemisphere mixed model ANOVA on the values extracted for the encoding and response windows, we see no significant main effects or interactions for group or condition (Ps > 0.05).
To understand whether this right dlPFC activation contributes to or detracts from performance, we correlated right dlPFC activity during the sort condition with accuracy (see Figure 4). Consistent with the idea that the right dlPFC contributes to cognitive control, especially in patients, we found a significant positive correlation between right dlPFC activity and accuracy for patients

Exploratory whole-brain BOLD
The two main goals of the whole-brain voxelwise analysis were to (i) dissociate BOLD activity to the encoding, retention and response intervals and (ii) dissociate BOLD activity associated with WM maintenance and manipulation. Accordingly, we modeled the encoding, retention and response intervals separately with variable duration blocks and conducted three separate group by condition by load mixed model ANOVAs corresponding to the betas for each interval (see Tables 2-4). To characterize the load manipulation, we also conducted two orthogonal planned comparisons for each interval that captured responses related to task difficulty (low vs high and sort; Supplemetal Figure 5A, C, E) and WM manipulation (high vs sort; Supplemental Figure 5B, D, F). As can be seen from Supplemental Figure 5, we observed distinct patterns of BOLD activity for the two comparisons across the encoding, retention and response intervals. Importantly, we observed the most robust dlPFC activity during the retention interval for both comparisons, consistent with the ROI analysis.
Aside from the load main effects, there were several other patterns worth noting. Threat tended to decrease activity in several midline regions including the cuneus, the middle cingulate and the medial prefrontal cortex during the encoding  2013; Torrisi et al., 2016;Balderston et al., 2017aBalderston et al., 2017b. Together, these results support the use of the Sternberg task as a method for isolating responses to unique WM processes and suggest that threat interactions with WM may be most robust during encoding and retrieval of items in WM.

Discussion
Little is known about mechanisms underlying WM impairment in anxiety patients as there is a dearth of investigation on this topic (Moon et al., 2016;Park et al., 2016). In this study, we used the Sternberg WM paradigm to assess retention interval activity in anxiety patients and healthy controls during periods of safety and shock threat. Our two primary findings were that (i) anxiety patients were slower than controls when task demands were high and (ii) anxiety patients recruited bilateral dlPFC activation, while controls were able to complete the task using primarily the left dlPFC. These results are consistent with the attentional control theory, which posits that anxiety patients can overcome attention control deficits and perform at similar levels as controls because of compensatory neural processes (Eysenck et al., 2007). Interestingly, right dlPFC BOLD activity was positively correlated with accuracy during the sort condition for patients but not controls. This is puzzling, given that excitatory neuromodulation of the right dlPFC has been shown to be anxiogenic (Balderston et al., 2020), while inhibitory neuromodulation of the right dlPFC has been shown to be anxiolytic (Chen et al., 2013).
While it would be tempting to conclude that anxiety patients suffer from a deficiency in left dlPFC processing, which is overcome by recruitment of the right dlPFC, our current results alone are not sufficient to support this conclusion. For this, we would need to show significantly less left and significantly more right dlPFC activity in the patients compared to the controls. Unfortunately, our current study is not sufficiently powered to demonstrate these effects. Even so, support for this left hemisphere deficit hypothesis comes from electroencephalography (EEG) data showing hypoactivation in the left hemisphere for both anxiety and depression, which is frequently comorbid with anxiety (Davidson et al., 1990;Davidson, 2004). Importantly, these data form the basis of the therapeutic application of transcranial magnetic stimulation (TMS) in individuals with depression (O'Reardon et al., 2007;Chen et al., 2013). The most common application of TMS for depression is to boost left dlPFC activity through left-lateralized excitatory stimulation (i.e. high frequency or intermittent theta burst stimulation) (O'Reardon et al., 2007). Interestingly, our current results may also provide insight into the mechanism of action of the anxiolytic effects of inhibitory rTMS to the right dlPFC (Chen et al., 2013). If indeed right dlPFC recruitment is a compensatory mechanism for diminished capacity in the left dlPFC, perhaps inhibiting right dlPFC activity forces the left dlPFC to work harder leading to plastic changes and improved functioning. If this is the case, one should expect a gradual shift in dlPFC activity over the course of right-lateralized inhibitory rTMS dlPFC treatment, normalizing the pattern of dlPFC during the retention interval for these patients. However, it should be noted that inhibitory rTMS to the right dlPFC may impair WM performance in anxiety patients, a possibility that should be considered in future trials. Future work should test this hypothesis. Although not included in the current sample, patients with PTSD have been shown to benefit from both excitatory (Boggio et al., 2010) and inhibitory (Kozel et al., 2018) right dlPFC stimulation, suggesting that additional work is needed to show the generality of the aforementioned hypothesis.
Behaviorally, we observed increased RTs in patients compared to controls for the high-load condition, which was also the most difficult. However, it should be noted that after controlling for age, this effect is only a trend. Like the dlPFC results, these behavioral results are consistent with the attentional control theory (Eysenck et al., 2007), suggesting that high-anxious individuals process information less efficiently, requiring more time to perform at the same level of accuracy (Lee, 1999;Richards et al., 2000). Again, it would be tempting to attribute this slower RT to the bilateral dlPFC effects that we observed in the patients. However, the increased RT in the patients was greatest for the high-load condition, while dlPFC activity was strongest for the sort condition. While this data alone should not rule out a dlPFC/RT connection, it should also be noted that there were no significant correlations between RT and dlPFC activity in the patients or the controls. Anecdotally, it often seems as if the anxiety patients have a stronger desire to do well on the task, which can lead to more effort on difficult trials. Although we did not formally test this hypothesis in the current work, there is experimental evidence that cognitive performance anxiety can lead to increases in RT during WM tasks (Angelidis et al., 2019).
In addition to slower RTs and less efficient dlPFC processing, it is also known that depression and anxiety patients have trouble filtering distractors from WM (Meconi et al., 2013;Stout et al., 2013Stout et al., , 2015Stout et al., , 2017Qi et al., 2014;Park et al., 2016). In the current study, patients showed reduced dlPFC activity during threat, which may reflect arousal-related impairments in top-down control that could affect distractor suppression (Stout et al., 2020). This filtering difficulty has been repeatedly shown using spatial WM tasks that rely on contralateral delay activity (CDA), an event-related potential measured in parietal regions using EEG (Vogel and Machizawa, 2004). In a typical task, subjects must study and retain the spatial configuration of items in one visual hemifield over the other. Importantly, EEG signal power in the contralateral hemispheres is linearly related to the number of attended items in the visual display (Vogel and Machizawa, 2004). Importantly, when distractors are also presented in the visual display, highly anxious individuals show CDA responses that scale not only with the number of items to be attended but also with the number of items to be ignored (Qi et al., 2014;Stout et al., 2013Stout et al., 2015. This effect is especially pronounced when the items carry a negative emotional valence (Meconi et al., 2013;Park et al., 2016;Stout et al., 2017). Similarly, highly anxious individuals have difficulty in task switching (Gustavson et al., 2017) and updating items in WM (Gustavson and Miyake, 2016), which could be interpreted as an inability to suppress previously task-relevant data similar to the deficits seen with task-irrelevant distractors. It is currently unclear how this failure to filter distractors relates to our current findings. However, there is structural connectivity evidence to suggest a potential link. A recent study looked at the relationship between frontoparietal connectivity and WM capacity and found that those with a strong connectivity had higher WM capacity than those with weak connectivity (Ekman et al., 2016). One possible explanation is that the deficient filtering is again driven by diminished left dlPFC capacity and/or diminished left dlPFC/parietal connectivity. It may be possible to test this hypothesis by administering a course of therapeutic rTMS to the left or right dlPFC and measuring CDA as a function of both target and distractor load.
In addition to the group effects, we also observed threatrelated reductions in dlPFC activity during sort trials. Notably, previous research with the Sternberg task has shown that the sort condition engages WM manipulation processes that drive greater dlPFC activity (Altamura et al., 2007), an effect that we replicate here (i.e. sort > high in dlPFC). This suggests that threat impairs WM manipulation processes in the dlPFC. Indeed, there have been a number of studies investigating the effect of threat on WM, primarily using the n-back WM task (Vytal et al., 2012Clarke and Johnstone, 2013;Patel et al., 2016;Ernst et al., 2016). The major findings from these studies are that (i) threat reduces accuracy, especially on low-load WM trials (Vytal et al., 2012Patel et al., 2016) and (ii) threat reduces overall dlPFC processing (Clarke and Johnstone, 2013). Although, we did not observe an effect of threat on accuracy in the current work, these results replicate our previous work using the Sternberg WM task during threat (Balderston et al., 2016a), suggesting that either the dlPFC effects shown here are below the threshold needed to cause a behavioral effect, or that the behavioral effects observed with the n-back task may be due to some other aspects of the task. Given that the n-back task features constant encoding, maintenance and retrieval, it is difficult to relate performance to neural activity during distinct phases of the task. Future work using adaptations of the n-back suited for event-related fMRI analyses might provide a better understanding of the mechanisms mediating threat-related performance deficits on this task (Fales et al., 2008).

Strengths and limitations
The primary strengths of this study are that we directly compared dlPFC BOLD in a relatively large (for fMRI) sample of unmedicated anxiety patients and healthy control subjects. Another strength of this study is that we used a version of the Sternberg WM paradigm that was optimized for fMRI, allowing us to dissociate encoding, retention and retrieval processes and show retention-specific differences across groups.
The main limitation of the current work is that we have a heterogeneous group of anxiety patients, most of them comorbid and meeting criteria for GAD, SAD and PD with some degree of comorbidity. Although it is beyond the scope of the current work to distinguish between these anxiety subpopulations, our results are similar if we look at those that meet either the criteria for GAD or SAD. Future studies should be conducted with adequate samples of patients meeting criteria for each of these disorders to identify any disorder-specific differences in retention-interval BOLD responses.
Another limitation is that our sample size, albeit large, may not have been sufficient to detect significant correlations between accuracy and left dlPFC activity in the patients or controls. Accordingly, it would be preliminary (and largely inconsistent with the literature) to suggest that left dlPFC activity does not contribute verbal WM. However, we were able to detect a significant correlation between right dlPFC activity and accuracy in the anxiety patients, suggesting that activity in this region is particularly important for these individuals. In other words, while previous research suggests that the left dlPFC may be necessary for successful WM performance generally (Altamura et al., 2010;Rottschy et al., 2012), our research provides indirect correlational (i.e. fMRI) evidence that left dlPFC activity may not be sufficient for successful WM performance in anxiety patients.
Additionally, although we jittered the onsets of the maintenance and retrieval periods, there were still correlations between the regressors. This is impossible to avoid because the encoding, maintenance and retrieval periods must be presented in order for the task to function correctly. However, it is important to note that the data reported from the manuscript come from the partial correlations between the regressors and the BOLD time series, so the results do not include this shared variability. This means that even though we are reducing the sensitivity of our analysis, we still maintain specificity.
One final limitation is that we did not objectively measure the anxiety level (e.g. through startle potentiation) of our subjects to assess the effectiveness of the threat manipulation. However, it should be noted that threat of unpredictable shock has been repeatedly used in our laboratory and others as a robust anxiogenic experimental condition (Charney et al., 1984;Lang et al., 1990;Grillon et al., 1994;Morgan et al., 1995;Böcker et al., 2004;Grillon, 2008;Balderston et al., 2015Balderston et al., , 2016a).

Conclusions
The current work suggests that anxiety patients process items in WM differently than healthy controls, which may prove to be a key avenue for future treatments for these individuals. Future studies using noninvasive neuromodulation may be able to correct this bilateral retention interval processing by either boosting left dlPFC excitability or attenuating right dlPFC excitability, an approach that has already been shown to be effective in treatments for major depressive disorder (O'Reardon et al., 2007;Dell'Osso et al., 2015).