The Cognitive Effects of Antidepressants in Major Depressive Disorder: A Systematic Review and Meta-Analysis of Randomized Clinical Trials

Background: Cognitive dysfunction is often present in major depressive disorder (MDD). Several clinical trials have noted a pro-cognitive effect of antidepressants in MDD. The objective of the current systematic review and meta-analysis was to assess the pooled efficacy of antidepressants on various domains of cognition in MDD.
Methods: Trials published prior to April 15, 2015, were identified through searching the Cochrane Central Register of Controlled Trials, PubMed, Embase, PsycInfo, ClinicalTrials.gov, and relevant review articles. Data from randomized clinical trials assessing the cognitive effects of antidepressants were pooled to determine standardized mean differences (SMD) using a random-effects model.
Results: Nine placebo-controlled randomized trials (2 550 participants) evaluating the cognitive effects of vortioxetine (n = 728), duloxetine (n = 714), paroxetine (n = 23), citalopram (n = 84), phenelzine (n = 28), nortriptyline (n = 32), and sertraline (n = 49) were identified. Antidepressants had a positive effect on psychomotor speed (SMD 0.16; 95% confidence interval [CI] 0.05–0.27; I² = 46%) and delayed recall (SMD 0.24; 95% CI 0.15–0.34; I² = 0%). The effect on cognitive control and executive function did not reach statistical significance. Of note, after removal of vortioxetine from the analysis, statistical significance was lost for psychomotor speed. Eight head-to-head randomized trials comparing the effects of selective serotonin reuptake inhibitors (SSRIs; n = 371), selective serotonin and norepinephrine reuptake inhibitors (SNRIs; n = 25), tricyclic antidepressants (TCAs; n = 138), and norepinephrine and dopamine reuptake inhibitors (NDRIs; n = 46) were identified. No statistically significant difference in cognitive effects was found when pooling results from head-to-head trials of SSRIs, SNRIs, TCAs, and NDRIs. Significant limitations were the heterogeneity of results, limited number of studies, and small sample sizes.
Conclusions: Available evidence suggests that antidepressants have a significant positive effect on psychomotor speed and delayed recall.


Introduction
Major depressive disorder (MDD) is a highly prevalent and disabling illness affecting more than 350 million people worldwide (Kessler et al., 2006). The World Health Organization (2012) has recognized MDD as the leading cause of disability, causing significant and often chronic functional impairment. Cognitive dysfunction is a key feature subserving the functional impairment associated with MDD (Baune et al., 2010). Several cognitive domains, including executive function, attention, memory, processing speed, and psychomotor skills, are affected during both symptomatic and "remitted" phases of MDD (Marazziti et al., 2010; Lee et al., 2012; Bora et al., 2013; Bortolato et al., 2014). Given the significant and persistent functional impairment mediated by cognitive dysfunction, increased attention is being given to this domain in the treatment of MDD (Bortolato et al., 2014).
Several investigators have studied the cognitive effects of various antidepressants (Keefe et al., 2014); however, the majority of these studies were limited by small sample sizes, absence of placebo controls, a lack of pre-specification of cognition as a primary outcome, and insufficient statistical analytic approaches to parse direct versus indirect effects (e.g. path analysis; McIntyre et al., 2013). Many studies have reported a positive effect of various antidepressants on cognition, yielding a statistically significant difference between groups receiving treatment versus placebo; however, quantification of the overall and relative magnitude of effect (e.g. the pooled standardized mean difference [SMD]) of all currently available antidepressants on cognition has yet to be conducted. Notably, Keefe et al. (2014) conducted a systematic review on the cognitive effects of pharmacotherapy in MDD, in which they calculated the effect size for all studies reviewed; however, they did not meta-analytically quantify pooled effect sizes of the cognitive effects of antidepressants. In addition, since the publication of their review, several new clinical trials have been published that have primarily sought to determine the effects of antidepressants on cognitive function (Katona et al., 2012; McIntyre et al., 2014; Robinson et al., 2014; Soczynska et al., 2014; Gorlyn et al., 2015; Mahableshwarkar et al., 2015). Therefore, the primary objective of the current systematic review and meta-analysis is to assess the overall effect of antidepressants on cognition in MDD as determined in placebo-controlled trials. As a secondary objective, the relative efficacy of mechanistically diverse antidepressants on cognitive function will be compared based on effect sizes calculated from comparative head-to-head trials.
The pertinence of this review is two-fold: (1) given that cognition is becoming an increasingly important target in the treatment of MDD, knowledge regarding the effect size of currently available therapies is essential; and (2) with the increased pursuit of novel therapeutic strategies targeting cognition in MDD, a benchmark of effect size should be established.

Search Methods for Identification of Trials
The PubMed, PsycInfo, Cochrane, and Embase databases were searched from inception to April 15, 2015. The PubMed search was limited to human studies, including clinical trials, observational studies, meta-analyses, and review articles written in the English language using the following search string: (major depressive disorder OR unipolar depression) AND (cognitive function OR cognitive impairment OR cognitive dysfunction OR executive function OR executive dysfunction OR memory OR attention). Various combinations of additional search terms were used to search for additional articles in all four databases (search terms listed in Supplementary Material). Reference lists from identified articles were manually searched for additional relevant studies. All identified articles were screened by two independent reviewers (Drs Rosenblat and Kakar) for inclusion in qualitative and quantitative analyses. Where there was disagreement on inclusion, consensus was reached through discussion.

Inclusion Criteria
1. Human studies with participants over the age of 18 (no upper limit) with a diagnosis of MDD as defined by the Diagnostic and Statistical Manual or International Classification of Disease criteria (no restrictions on edition used);
2. Randomized clinical trial of antidepressants with the primary mechanism of action being monoamine modulation in one or more of the following categories: selective serotonin reuptake inhibitors (SSRIs), selective serotonin and norepinephrine reuptake inhibitors (SNRIs), norepinephrine and dopamine reuptake inhibitors (NDRIs), serotonin antagonist and reuptake inhibitors, noradrenergic and specific serotonergic antidepressants, tricyclic antidepressants (TCAs), and multimodal antidepressants (e.g. vortioxetine);
3. Cognition was assessed using standardized and validated measures;
4. Data were provided to allow for calculation of effect size (where insufficient data were provided in the article, the authors were contacted to obtain the required data); and
5. Manuscript is written in English.

Exclusion Criteria
Excluded study descriptions and reasons for exclusion are summarized in Supplementary Table 1.
1. Unpublished data or conference abstracts;
2. Open-label trials and observational studies;
3. Studies using healthy controls, instead of placebo-controlled MDD patients, to determine effect (some of these studies are discussed in the qualitative analysis, but were not included in the quantitative analysis);
4. Clear methodological flaws, such as lack of randomization or large variance between treatment and placebo groups in baseline characteristics and/or psychometric measures (included in qualitative review but not quantitative analysis);
5. Multiple reports from the same data set (e.g. only the original study was included to prevent overweighting of one data set);
6. Studies explicitly including participants with other psychiatric or neurologic diagnoses such as bipolar disorder, schizophrenia, schizoaffective disorder, attention deficit hyperactivity disorder, or dementia;
7. Studies explicitly including participants using concomitant cholinesterase inhibitors or stimulants; and
8. While studies using TCAs were included, trials assessing the effects of tianeptine were excluded, as the primary mechanism of action of tianeptine is now believed to be via glutamate modulation (Nickel et al., 2003).

Data Extraction and Statistical Analysis
Using standardized data extraction forms, data were extracted from included studies by two independent reviewers (Drs Rosenblat and Kakar) to systematically evaluate study characteristics, risks of bias, and cognitive testing results required for the calculation of effect size. Final cognitive scores of treatment versus placebo were used for the analysis, as recommended by the Cochrane Handbook for Systematic Reviews of Interventions, except where large pre-treatment differences in cognitive scores were identified; for these studies the change from baseline was compared instead to prevent skewing of results. Where mean and/or standard deviation values were not reported, these were calculated based on reported confidence intervals (CI) or p-values. Where inadequate information was reported to calculate means and standard deviations, the values essential for determining Cohen's d effect size/SMD, the study authors were contacted directly for this additional data. For two studies (Georgotas et al., 1989; Raskin et al., 2007), only means were reported and the original authors could not provide standard deviation values. For these studies, the average standard deviation was extrapolated from other studies using the same cognitive test and was utilized for SMD calculations.
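As a rough illustration of how a standard deviation can be recovered from a reported confidence interval, the sketch below applies the usual back-calculation for a single group mean: the CI width is divided by twice the critical value to give the standard error, which is then multiplied by the square root of the sample size. The function name and the example numbers are hypothetical and are not taken from any included study. A normal critical value is assumed, which is only reasonable for larger samples; small samples would call for a t-distribution critical value instead.

```python
import math
from statistics import NormalDist


def sd_from_ci(lower, upper, n, level=0.95):
    """Back-calculate the SD of a group mean from its confidence interval.

    Assumption: the CI was built with a normal critical value
    (large-sample case), i.e. CI = mean +/- z * SD / sqrt(n).
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% CI
    se = (upper - lower) / (2 * z)             # standard error of the mean
    return se * math.sqrt(n)                   # SD = SE * sqrt(n)


# Hypothetical example: a mean test score reported with 95% CI 48.0 to 52.0, n = 100
sd = sd_from_ci(48.0, 52.0, 100)
print(round(sd, 1))
```

With these made-up inputs the recovered standard deviation is a little over 10, which is what one would expect from a CI half-width of 2 points in a sample of 100.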
Pooling of effect sizes and tests of heterogeneity were conducted using Review Manager 5.3 (Copenhagen: The Nordic Cochrane Centre, The Cochrane Collaboration) software using a random-effects model. Effect sizes were expressed as Cohen's d (0.2 = small, 0.5 = medium, and 0.8 = large) and calculated as the SMD in post-treatment neuropsychological performance between antidepressant and placebo for placebo-controlled trials, and between one antidepressant and another for comparative head-to-head trials. Samples were not sub-grouped into responders and non-responders, as an insufficient number of studies reported responder sub-grouped analysis; rather, the mean effect for all subjects was included for effect size calculations.
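For readers less familiar with the effect size used here, the following minimal sketch computes Cohen's d from group summary statistics using a pooled standard deviation, together with a common large-sample approximation of its variance. It is an illustration only and not the Review Manager implementation, which may additionally apply a small-sample correction that is omitted here. The function names and input values are hypothetical.

```python
import math


def cohens_d(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Return (d, var_d): the SMD with pooled SD and its approximate variance."""
    pooled_sd = math.sqrt(
        ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    )
    d = (mean_t - mean_c) / pooled_sd
    # Large-sample approximation of the sampling variance of d
    var_d = (n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c))
    return d, var_d


def magnitude(d):
    """Label |d| with the conventional thresholds (0.2, 0.5, 0.8)."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"


# Hypothetical post-treatment scores: treatment 53 (SD 10, n = 100), placebo 50 (SD 10, n = 100)
d, var_d = cohens_d(53.0, 10.0, 100, 50.0, 10.0, 100)
print(round(d, 2), round(var_d, 4), magnitude(d))
```

The variance returned alongside d is what an inverse-variance pooling step would use to weight each study.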
Neuropsychological testing from included studies was pooled based on the cognitive domain being tested (see Strauss and Spreen, 2006, for a review of cognitive tests and domains). Results for placebo-controlled trials were only pooled for domains wherein two or more studies evaluating the same domain were identified. In placebo-controlled trials with multiple antidepressant groups, separate effect sizes were calculated with respect to the one common placebo control group. Pooled effect sizes were calculated separately for each antidepressant, then pooled to calculate the overall effect size of all antidepressants included. Individual agents or studies were subsequently removed from the pooled sample to determine if removal of any one specific agent or study could significantly alter the overall effect size. In addition, for trials assessing psychomotor speed, an additional subgroup analysis was conducted, separating studies including subjects with a mean age of less than versus greater than 65 years.
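The pooling and removal steps described above can be sketched as follows. This is a simplified illustration that assumes the DerSimonian-Laird estimator for the between-study variance, which is one common random-effects method; it is not a reproduction of the Review Manager output. All labels, effect sizes, and variances are hypothetical.

```python
import math


def pool_random_effects(effects, variances):
    """DerSimonian-Laird random-effects pooled estimate with a 95% CI."""
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return pooled, pooled - 1.96 * se, pooled + 1.96 * se


def leave_one_out(labels, effects, variances):
    """Re-pool the data with each agent or study removed in turn."""
    results = {}
    for i, label in enumerate(labels):
        rest_e = effects[:i] + effects[i + 1:]
        rest_v = variances[:i] + variances[i + 1:]
        results[label] = pool_random_effects(rest_e, rest_v)
    return results


# Hypothetical per-agent SMDs and variances
labels = ["agent A", "agent B", "agent C"]
effects = [0.30, 0.10, 0.05]
variances = [0.01, 0.01, 0.01]

overall = pool_random_effects(effects, variances)
sensitivity = leave_one_out(labels, effects, variances)
print("all agents:", [round(x, 3) for x in overall])
for label, result in sensitivity.items():
    print("without", label, [round(x, 3) for x in result])
```

In this made-up data set the overall confidence interval narrowly excludes zero, while the interval obtained after removing agent A includes zero, which is the kind of pattern a removal analysis is designed to reveal.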
For studies directly comparing antidepressants' relative effects on cognition without a placebo group, effect size was determined in a similar manner, except by replacing the placebo group with the comparator antidepressant, effectively determining an effect size in relation to the first antidepressant.
Critical values for pooled effect sizes were set at 0.05. Homogeneity in effect sizes was tested using the Q statistic (Chi²) for each cognitive domain and each antidepressant. Heterogeneity was quantified using the I² statistic, where 25% = small, 50% = moderate, and 75% = high heterogeneity (Higgins et al., 2003).
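The relationship between the two heterogeneity statistics can be made concrete with a short sketch. Q is the weighted sum of squared deviations of each study effect from the inverse-variance pooled effect, and I² expresses the share of Q in excess of its degrees of freedom, truncated at zero. The function names and the input values below are hypothetical.

```python
def cochran_q(effects, variances):
    """Return (Q, df) using inverse-variance weights."""
    w = [1 / v for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w, effects))
    return q, len(effects) - 1


def i_squared(q, df):
    """I² in percent: 100 * (Q - df) / Q, truncated at zero."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


# Hypothetical SMDs and variances for three studies
q, df = cochran_q([0.30, 0.10, 0.05], [0.01, 0.01, 0.01])
i2 = i_squared(q, df)
print(round(q, 2), df, round(i2, 1))
```

Whenever Q is smaller than its degrees of freedom, the truncation yields an I² of 0%, meaning the observed spread is no larger than chance alone would produce.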

Assessment of Bias
The risk of bias was assessed for all clinical trials included in the quantitative analysis. As per recommendations in the Cochrane Handbook for Systematic Reviews of Interventions, bias was assessed based on the following five domains: sequence generation (e.g. based on description of randomization), allocation concealment, blinding of outcome assessors, intention-to-treat, and for-profit bias. Risk of bias was designated as high if described protocols were concerning for bias in a given domain or if description of the domain was omitted from the primary text and primary authors could not provide clarification when contacted. For example, if sequence generation methods were not explicitly described and the study author could not provide clarification when contacted, this domain would be labeled as high risk. Where an adequate protocol was described for a given domain, it would be labeled low risk.
To assess publication bias, a funnel plot was created using Review Manager 5.3 Software for forest plots with greater than five studies included. An Egger Test was not conducted, as greater than ten studies are required in accordance with the Cochrane Review Handbook; the current analysis had a maximum of nine studies included in any given forest plot.

Search Results and Study Characteristics
Electronic database searches yielded a total of 1 084 articles (Figure 1). A manual review of reference lists and suggested studies from experts in the field revealed an additional 23 potentially relevant articles. Titles and abstracts were screened, yielding 45 articles for which the full text was reviewed for inclusion. Of these studies, 25 were found appropriate to be included in the qualitative review, of which 17 were included for the quantitative analysis. Demographic information, the antidepressant studied, and cognitive testing for each study included in the quantitative analysis are summarized in Table 1.
In addition to the studies included in the quantitative analysis, eight studies were identified for qualitative review. These clinical trials were excluded from the quantitative analysis due to their study design (observational studies, open label studies, lack of placebo controls, or appropriate comparative group) and/or agents used (e.g. non-monoaminergic agents); however, these studies were still deemed to be noteworthy within the scope of this review and are summarized separately in Supplementary Table 1.
Of interest, when removing vortioxetine from the pooled SMD, the effect size was no longer statistically significant compared to placebo (SMD 0.08; 95% CI -0.02 to 0.18; p = 0.13) and the heterogeneity was small (Chi² = 4.10; p = 0.85; I² = 0%). Also, with the removal of TCAs, the pooled effect size remained unchanged.
A subgroup analysis comparing studies with subjects with a mean age greater than 65 versus less than 65 was also conducted, as shown in Figure 3. For studies with subjects older than 65, the SMD was 0.10 (95% CI 0.00 to 0.21; p = 0.06) as compared to 0.23 (95% CI 0.04 to 0.43; p = 0.02) in subjects younger than 65, suggestive of a greater positive effect in subjects under 65; however, the difference between subgroups was not statistically significant (p = 0.24). A funnel plot to assess for publication bias was also conducted, as shown in Figure 4.
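The subgroup comparison reported above can be approximately reconstructed from the summary figures alone. The sketch below recovers each subgroup's standard error from its 95% CI and applies a z-test to the difference between the two pooled SMDs. It assumes symmetric normal-theory intervals and independent subgroups, and it is not the exact procedure used by Review Manager. Because the published CIs are rounded, the result can only approximate the reported p-value. Function names are hypothetical.

```python
import math
from statistics import NormalDist


def se_from_ci(lower, upper, level=0.95):
    """Standard error implied by a symmetric normal-theory CI."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return (upper - lower) / (2 * z)


def subgroup_difference_p(est_a, ci_a, est_b, ci_b):
    """Two-sided p-value for the difference between two independent estimates."""
    se_a = se_from_ci(*ci_a)
    se_b = se_from_ci(*ci_b)
    z = (est_b - est_a) / math.sqrt(se_a ** 2 + se_b ** 2)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Values as reported in the text: older than 65 versus younger than 65
p = subgroup_difference_p(0.10, (0.00, 0.21), 0.23, (0.04, 0.43))
print(round(p, 2))
```

The reconstructed p-value comes out at roughly 0.25, in line with the reported p = 0.24 once rounding of the published intervals is taken into account.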

Effect on Delayed Recall
Four placebo-controlled trials (Raskin et al., 2007; Katona et al., 2012; McIntyre et al., 2014; Robinson et al., 2014) evaluated the effect of antidepressants on delayed recall using RAVLT. Of these studies, one study (Katona et al., 2012) evaluated two agents in parallel compared to placebo, providing a total of five independent effect sizes to pool, including evaluation of vortioxetine (n = 2) and duloxetine (n = 3). The pooled effect size of both antidepressants (n = 989) versus placebo (n = 616) was 0.24 (95% CI 0.15 to 0.34; p < 0.00001), indicative of a small, yet statistically significant, positive effect (Figure 7). Heterogeneity was found to be low, with I² = 0% (p = 0.86). Subgroup analysis revealed a slightly greater pooled SMD for duloxetine (SMD = 0.25) compared to vortioxetine (SMD = 0.24); however, the difference was not statistically significant (p = 0.9).
The pooled SMD of all SSRIs/SNRIs (n = 140) versus TCAs (n = 138) was 0.33 (95% CI -0.11 to 0.78) in favor of SSRIs/SNRIs; however, the effect was not statistically significant (p = 0.14). Heterogeneity was moderate, with I² = 64% (p = 0.04). Given that cognitive tests evaluating different domains of memory were utilized, this may have been a cause of heterogeneity. Notably, when removing one study evaluating venlafaxine versus dothiepin (Trick et al., 2004), which appeared to be divergent from the other studies, I² became 0% and the SMD rose to 0.58 (95% CI 0.31 to 0.84; p < 0.00001) in favor of SSRIs/SNRIs (Figure 8). Of note, both venlafaxine and dothiepin were dosed twice daily in this study, which negatively affected the quality of sleep (Trick et al., 2004).

Bias of Included Studies
Assessment of bias is summarized in Table 3. All included studies were found to have adequate sequence generation and concealment. Risk of bias for blinded outcome assessment was low in all studies except for one (Levkovitz et al., 2002). Risk of bias based on intention-to-treat analysis was variable between studies as shown in Table 3. Several included studies (Finkel et al., 1999; Bondareff et al., 2000; Newhouse et al., 2000; Ferguson et al., 2003; Trick et al., 2004; Raskin et al., 2007; Culang et al., 2009; Katona et et al., 2015) were identified to be high risk of for-profit bias, given that pharmaceutical companies provided funding for these studies. Publication bias was assessed using a funnel plot, as shown in Figure 4. A funnel plot was only created for placebo-controlled trials assessing psychomotor speed, as all other forest plots had small numbers of studies and as such a funnel plot would be an inappropriate test. Qualitative assessment of the funnel plot revealed no obvious signs of publication bias; however, the limited number of studies greatly limited the interpretation of the funnel plot. Also of note, an Egger's test could not be performed, as greater than 10 studies are required for this test to be used according to the Cochrane Review Handbook.

Discussion
The current meta-analysis identified nine placebo-controlled trials assessing the cognitive effects of antidepressants. Pooled effect sizes based on SMD revealed that overall antidepressants have a small positive effect on psychomotor speed and delayed recall; however, the positive effect on cognitive control and executive function was not statistically significant. Other cognitive domains could not be meaningfully assessed due to the lack of comparability of cognitive testing between studies. Of note, the high level of heterogeneity and small number of studies identified in pooling cognitive effects is a major limitation of the current study, which may greatly limit the interpretation of the determined effects. Among the antidepressants assessed under the condition of a placebo-controlled trial, vortioxetine appeared to have the largest effect size on psychomotor speed, executive control, and cognitive control, while duloxetine had the greatest effect on delayed recall.
Subgroup analysis comparing subjects greater than versus less than 65 revealed a greater positive effect in subjects under the age of 65; however, there was no statistically significant difference between age groups. The pathophysiology of cognitive dysfunction associated with MDD may differ in the geriatric population and, as such, a variable effect of antidepressants on cognition may be expected in this group.
Studies directly comparing SSRIs/SNRIs to TCAs were also identified (Bondareff et al., 2000; Levkovitz et al., 2002; Trick et al., 2004; Culang-Reinlieb et al., 2012); however, there was large heterogeneity in cognitive testing, preventing pooling of effect size for a single domain. Domains of memory were thus combined, suggesting SSRIs/SNRIs have a more positive effect on memory compared to TCAs; however, the effect was not statistically significant. A high degree of heterogeneity was identified in this analysis, potentially caused by the pooling of results from different domains of memory. Therefore, the results of this pooled effect size may be invalid; however, cognitive dysfunction secondary to TCA use has long been attributed to the anticholinergic effects of TCAs (Baune and Renger, 2014; Bortolato et al., 2014; Keefe et al., 2014).
Two studies (Soczynska et al., 2014; Gorlyn et al., 2015) suggested that SSRIs/SNRIs have an equivalent effect to NDRIs on working memory; however, the pooled effect size was based on a small number of participants and therefore may have been underpowered to detect a difference between these groups.
Sertraline appeared to have a greater effect on psychomotor speed when directly compared to fluoxetine in two separate trials (Finkel et al., 1999;Newhouse et al., 2000). This result alone might not be very clinically relevant; however, it suggests that Gorlyn

Limitations
A major limitation of the current meta-analysis was the high level of heterogeneity of cognitive testing used in the identified clinical trials. This heterogeneity in testing greatly limited the comparison and pooling of data. Therefore, the current meta-analysis could not elucidate the relative effect of all antidepressants across disparate cognitive domains and instead was limited to including only a subset of antidepressants for the domains of psychomotor speed, cognitive control, executive control, and delayed recall.
Another limitation of the current study was the moderate level of heterogeneity identified when pooling SMD effect sizes. The heterogeneity may have been caused by the pooling of studies using different antidepressants with different mechanisms of action, including studies with different durations of treatment and different age groups, as shown in Table 1.
Another significant limitation was the highly variable number of subjects pooled for each antidepressant. More specifically, in placebo-controlled trials, vortioxetine and duloxetine were heavily weighted, as these trials had much higher numbers of participants. Therefore, when pooling results for all antidepressants, the majority of the effect size was determined by the effect of vortioxetine and duloxetine. Further, with removal of vortioxetine from the pooled sample, statistical significance was lost.
Lastly, a limitation of all studies assessing cognitive function is the gap in understanding the correlation between results of cognitive testing and functional outcomes. While the current study has shown a small positive effect on psychomotor speed and delayed recall as measured by cognitive testing, the precise functional meaning of this remains largely unknown (Baune et al., 2010; Bortolato et al., 2014).

Conclusion
Due to the known persistence of cognitive dysfunction during remission (Bora et al., 2013) and demonstrated small positive effect of antidepressants on delayed recall and psychomotor speed, the investigation of other cognitively enhancing agents to be used adjunctively to current antidepressants in the MDD population is merited.
The current study also highlighted the considerable difficulty of appropriately comparing cognitive clinical trials, owing to the currently high level of heterogeneity of cognitive testing. Therefore, improved standardization of cognitive testing, with efforts made to evaluate every domain separately in every study, would be greatly beneficial. As well, a combination of both self-report and objective cognitive testing may aid in the understanding of subjective cognitive complaints in MDD.
Future studies using cognitive function as a pre-specified primary outcome are needed, as the majority of studies discussed were evaluating cognition as a secondary outcome. As well, studies should include placebo controls due to the expected improvement in cognitive testing seen with repeat testing (e.g. practice effect). Adequate statistical testing to allow for path analysis, and thus determination of direct and indirect effects of antidepressants on cognition, should be considered for future studies.