The acute effects of aerobic exercise on sleep in patients with unipolar depression: a randomized controlled trial

Abstract
Study Objectives: Insomnia increases the risk of negative disease trajectory, relapse, and suicide in patients with depression. We aimed to investigate the effects of a single bout of aerobic exercise, performed after 02:00 pm, on the subsequent night's sleep in patients with depression.
Methods: The study was designed as a two-arm parallel-group, randomized, outcome assessor-blinded, controlled, superiority trial. Patients between 18 and 65 years of age with a primary diagnosis of unipolar depression were included. The intervention was a single 30-minute bout of moderate aerobic exercise. The control group sat and read for 30 minutes. The primary outcome was sleep efficiency measured by polysomnography. Secondary outcomes were other polysomnographic variables, subjective sleep quality, daytime sleepiness, mood states, and adverse events.
Results: Ninety-two patients were randomized to the exercise (N = 46) or control group (N = 46). There were no clinically relevant differences at baseline. Intent-to-treat ANCOVA of follow-up sleep efficiency, adjusted for baseline levels and minimization factors, did not detect a significant effect of the allocation (β = −0.93, p = 0.59). There was no evidence for significant differences between the two groups in any other objective or subjective sleep outcomes, daytime sleepiness, or adverse events. The intervention had an immediate positive effect on mood states, including depressiveness (β = −0.40, p = 0.003).
Conclusions: To the best of our knowledge, this is the first trial to study the effects of a single bout of aerobic exercise on sleep in patients with depression. Aerobic exercise had no effect on sleep efficiency but had a strong beneficial effect on mood and did not increase adverse outcomes. These results add to the growing body of evidence that, contrary to sleep hygiene recommendations, exercise after 02:00 pm is not detrimental for sleep.
Clinical Trial Registration: ClinicalTrials.gov, https://clinicaltrials.gov/ct2/show/NCT03673397. Protocol registered on September 17, 2018.


Introduction
Insomnia is a core symptom of unipolar depression, which critically predicts depression onset, trajectory, and recurrence. Insomnia is defined as difficulty initiating or maintaining sleep or early morning awakening, and it is often accompanied by daytime impairments [1]. A meta-analysis has shown that insomnia more than doubles the risk for depression [2]. This risk might be mainly driven by difficulties initiating sleep, as was suggested by a recent network outcome analysis [3]. Epidemiological studies have found insomnia prevalence rates of up to 90% in patients with depression [4]. Longitudinal studies have repeatedly shown a bidirectional link between insomnia and depression [5]. Insomnia symptoms negatively affect the disease trajectory [6]. They reduce the responsiveness to psychotherapy [7], pharmacotherapy, or a combination of these [8,9]. They also seem to increase the risk of developing treatment-resistant depression [10], suicidal behavior [11], as well as myocardial infarction [12]. Sleep complaints are one of the most common symptoms after remission [13]. Residual insomnia is problematic because insomnia is a prodromal symptom [14], thereby increasing the risk for depression relapse [15]. Considering the evidence presented above, there is a need to develop additional treatments for insomnia in patients with depression.
Moderate aerobic exercise might be a viable candidate as an adjuvant therapy for insomnia in patients with depression. Aerobic exercise is a rhythmic activity that involves large muscle groups and that primarily uses aerobic energy-producing systems. Moderate aerobic exercise refers to intensities of 55%-69% maximal heart rate, 40%-59% heart rate reserve, or 11-13 rate of perceived exertion [16]. Regular moderate aerobic exercise has positive effects on subjective sleep quality in patients with depression, as we have demonstrated in a recent network meta-analysis [17]. These effects are particularly strong when aerobic exercise is combined with treatment as usual. Regular moderate aerobic exercise also improves symptoms of depression [18]. Furthermore, chronic aerobic exercise improves cardiorespiratory fitness in patients with depression [19]. The effect on fitness is pertinent because depression increases the risk of coronary heart disease and myocardial infarction [20]. Current general sleep hygiene guidelines recommend regular exercise before 02:00 pm [21]. The authors of this guideline argue that vigorous exercise before bedtime causes the release of endorphins, which can delay the onset of sleep [21]. This caveat on timing severely limits the feasibility of exercise interventions. Moreover, epidemiological data [22] and a meta-analysis of randomized controlled trials in healthy individuals [23] have shown that there is no adverse effect of an exercise bout after 02:00 pm on sleep. This discrepancy reflects the lack of evidence concerning the acute effects of moderate exercise on sleep, especially in patients with depression.
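As an illustration of the intensity ranges quoted above, the following minimal sketch converts the percentage bands into beats per minute. It assumes the standard heart rate reserve (Karvonen) formula, target = resting HR + fraction × (maximal HR − resting HR). The resting and maximal heart rates in the example are hypothetical values and not data from this trial.

```python
def target_from_reserve(hr_rest, hr_max, fraction):
    """Karvonen formula: resting HR plus a fraction of the heart rate reserve."""
    return hr_rest + fraction * (hr_max - hr_rest)


def moderate_zone_reserve(hr_rest, hr_max):
    """(lower, upper) bpm for 40%-59% of heart rate reserve."""
    return (target_from_reserve(hr_rest, hr_max, 0.40),
            target_from_reserve(hr_rest, hr_max, 0.59))


def moderate_zone_max(hr_max):
    """(lower, upper) bpm for 55%-69% of maximal heart rate."""
    return (0.55 * hr_max, 0.69 * hr_max)


# Hypothetical patient: resting HR 60 bpm, maximal HR 180 bpm.
low_hrr, high_hrr = moderate_zone_reserve(60, 180)
low_max, high_max = moderate_zone_max(180)
print(f"HRR zone: {low_hrr:.1f}-{high_hrr:.1f} bpm")
print(f"%HRmax zone: {low_max:.1f}-{high_max:.1f} bpm")
```

The two definitions do not yield identical bands for a given person, which is one reason the source lists them as alternatives.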
The primary goal of this trial was, therefore, to investigate the effects of a single bout of moderate aerobic exercise on the subsequent night's sleep efficiency in patients with depression. For the duration of the exercise bout, we chose 30 min, corresponding to the recommendation of the American College of Sports Medicine and the American Heart Association for daily physical activity with beneficial health effects [24]. Secondary goals were to investigate the intervention's effect on other polysomnographic outcomes, adverse outcomes, daytime sleepiness, and mood. We hypothesized that the intervention would improve (1) sleep efficiency, (2) sleep continuity, (3) sleep architecture, (4) subjective sleep quality, (5) daytime sleepiness, and (6) mood. As an exploratory outcome, we expected no evidence for an effect on the frequency and the severity of adverse events.

Trial design
This was a two-arm parallel-group, randomized, outcome assessor-blinded, controlled, superiority trial. The trial took place in the psychosomatic in-patient rehabilitation unit of the OBERWAID AG, a rehabilitation clinic in St. Gallen, Switzerland. The Ethics Committee East Switzerland, St. Gallen, Switzerland, approved the study protocol (EKOS 18/089). We prospectively registered this trial in the ClinicalTrials.gov registry on September 17, 2018 (NCT03673397). A detailed study protocol that clearly states the study's rationale is available [25]. There were no amendments and no deviations from the protocol. This report follows the CONSORT guideline for randomized controlled trials [26]. The data underlying this article are available in the Harvard Dataverse at https://doi.org/10.7910/DVN/WASN36 and will be shared at reasonable request to the corresponding author.

Screening
Patients admitted to the in-patient psychosomatic rehabilitation unit of the OBERWAID clinic were screened for inclusion. The trial took place in the first 5 days of the patient's psychosomatic in-patient rehabilitation (see Figure 1). The first author or another representative of the OBERWAID AG obtained written informed consent from participants. Eligibility criteria are listed in Table 1. We provide a detailed rationale for the inclusion and exclusion criteria in the study protocol [25].
After patients provided informed consent, we formally screened them. The screening included a consultation with an experienced psychiatrist and a full history and medical examination by an experienced internist. Patients with undiagnosed sleep apnea were excluded based on the baseline polysomnography.

Graded exercise testing
Patients fulfilling all eligibility criteria assessed thus far (the sleep apnea criterion is determined later by polysomnography, see Figure 1) performed sub-maximal graded exercise testing on a bicycle ergometer (ergoselect 200, Ergoline, Bitz, Germany). We determined the anaerobic threshold according to the method of Dickhuth et al. [27] using a specialized software program (Ergonizer, Freiburg, Germany). A detailed description of the graded exercise testing can be found in the study protocol [25].

Patient characterization
We administered multiple questionnaires to characterize patients at baseline. We assessed somatic multimorbidity with the Patient Health Questionnaire Somatic Symptom Scale (PHQ-15) [28] and the Modified Cumulative Illness Rating Scale (CIRS) [29]. The PHQ-15 is a self-administered questionnaire with 15 items. It measures the severity of somatic symptoms (e.g. back pain) within the previous 4 weeks on a three-point Likert scale (0 = not bothered at all to 2 = bothered a lot). The symptoms in this questionnaire account for more than 90% of physical complaints reported in outpatient settings and its validity has been demonstrated [28]. The CIRS provides physician-rated scores of multimorbidity. It measures the severity of symptoms over 14 organ systems (e.g. heart) on a five-point scale (0 = no problem to 4 = extremely severe problem). Inter-rater agreement and validity have been demonstrated [29]. We measured depressive symptom severity with the Patient Health Questionnaire-9 (PHQ-9) [30]. The nine symptom items (e.g. "Feeling down, depressed, or hopeless") are scored on a four-point Likert scale (0 = not at all to 3 = nearly every day). The psychometric properties, including the validity of the cutoffs from mild to severe depression, have been demonstrated [30]. Anxiety was assessed using the Hospital Anxiety and Depression Scale (HADS) [31]. This questionnaire measures anxiety with seven items (e.g. "I get sudden feelings of panic"), each on a four-point Likert scale (e.g. 0 = not at all to 3 = most of the time). Psychometric properties, including diagnostic test accuracy, have been demonstrated [31,32]. Stress was assessed with the Perceived Stress Scale [33]. This questionnaire operationalizes stress as the degree to which life is experienced as unpredictable, uncontrollable, and overloaded in the past month. The ten items (e.g. "In the last month, how often have you felt nervous and stressed?") are scored on a five-point Likert scale (0 = never to 4 = very often). The psychometric properties have been demonstrated [33]. Sleep reactivity was assessed with the Ford Insomnia Response to Stress Test [34,35]. The nine items of this self-report questionnaire assess the likelihood of sleep disturbances in response to stressful situations (e.g. "How likely is it for you to have difficulty sleeping after an argument") on a four-point Likert scale (1 = not very likely to 4 = very likely). Reliability and validity have been demonstrated [36]. Dysfunctional sleep-related thoughts and attitudes were assessed with the Short form of the Dysfunctional Beliefs and Attitudes about Sleep Scale (DBAS) [37]. The sixteen items (e.g. "I am worried that I may lose control over my ability to sleep.") are rated on a Likert scale (0 = strongly disagree to 10 = strongly agree). The reliability and validity of this questionnaire have been demonstrated [37]. Chronotype was assessed using the Morningness-Eveningness Questionnaire [38]. The 19 multiple-choice questions (four- or five-point scale) assess sleep habits and propensity for performance throughout the day. The sum score (range: 16 to 86) can be translated into chronotype (<42, evening type; 42-58, neither; >58, morning type). Adequate psychometric properties have been demonstrated [39,40]. We measured chronic daytime sleepiness using the Epworth Sleepiness Scale [41]. The likelihood of dozing off in eight daily situations (e.g. watching television) is assessed on a four-point Likert scale (0 = would never doze to 3 = high chance of dozing). The reliability and validity have been demonstrated [42]. Subjective sleep disturbance was measured with the Pittsburgh Sleep Quality Index [43,44]. This 18-item scale assesses subjective sleep quality, sleep latency, sleep duration, habitual sleep efficiency, sleep disturbances, use of sleeping medication, and daytime dysfunction. The sum score with a cutoff value of ≥5 has been shown to distinguish good and poor sleepers [45]. Since patients cannot be blinded in exercise trials and considering the importance of patient preference and satisfaction [46,47], we also assessed credibility and expectancy on day three (i.e. before randomization) using two items (adapted from [48,49]): "At this point, how logical does the therapy offered to you seem?" and "At this point, how successfully do you think this treatment will be in reducing your insomnia symptoms?" Patients rated the credibility and expectancy items on a four-point Likert scale (1 = not at all to 4 = very).

Table 1. Inclusion and exclusion criteria
*Hypnotic agents are defined as follows: orexin receptor agonists, benzodiazepine receptor agonists, sedating antidepressants, neuroleptics, benzodiazepines, melatonin agonists, heterocyclics, anticonvulsants, over the counter sleep aids (sedating antihistamines, melatonin, L-tryptophan, valerian), and cannabinoids.
† Absolute and relative contraindications are based on ACSM's Guidelines for Exercise Testing and Prescription.
ICD-10, International classification of diseases, version 10; BMI, body mass index.

Baseline assessments
We assessed multiple objective and subjective sleep outcomes at baseline. We performed the baseline polysomnography on the night before the intervention to exclude patients with at least moderate sleep apnea (oxygen desaturation index ≥ 15) and assess potential first-night effects. We recorded polysomnographic data with the SOMNOscreen™ plus RC (Somnomedics, Randersacker, Germany) using the following montage: one EEG channel (Fp2-A1, 512 Hz), two EOG channels (512 Hz), one EMG channel (512 Hz), one ECG channel (modified lead II, 512 Hz), thoracic respiratory effort channel (inductance plethysmography belt, 32 Hz), finger photoplethysmography (nondominant arm, 128 Hz), body position (stored every 30 s), movement (32 Hz), and ambient light (stored every 30 s). The validity of this montage for the assessment of sleep stages has previously been demonstrated [50]. Polysomnography recordings were scored independently by two trained scorers according to the American Academy of Sleep Medicine (AASM) guidelines [51]. All polysomnographic variables were calculated according to the same guidelines [51].
Both scorers have demonstrated good agreement with the gold standard ratings in the AASM inter-scorer program [52]. Their average agreement with the gold standard was 88% and 84%, respectively, which is above average [53]. Scorers were blinded to allocation, time points, and each other's ratings. The subjective sleep quality of the baseline night was measured upon awakening with the revised Schlaffragebogen A [54], a German sleep questionnaire recommended by guidelines [55]. This self-report questionnaire contains 25 items that load onto five factors: sleep quality, recuperation after sleep, calmness before sleep, exhaustion before sleep, and nocturnal psychosomatic symptoms. Internal consistency, factor structure, and validity have been demonstrated [54].

Randomization
Once eligibility was confirmed by baseline polysomnography, patients were randomly allocated to one of the two groups using minimization (see Figure 1). We used a nondeterministic unweighted minimization algorithm [56] with a random element of 0.8. The allocation ratio was 1:1. We used minimization to increase the probability of balanced groups across the following predictive factors: sex, age, depression severity (PHQ-9 score), and subjective sleep quality (PSQI score). Allocation concealment consisted of four aspects: (1) requesting randomization after baseline measurement, (2) using a random element, (3) requesting allocation for participants by four different study nurses, and (4) not disclosing full details of minimization to study nurses in accordance with the SPIRIT guideline [57]. Further details, including the rationale for the selection of minimization factors, are provided in Section 1 of the Supplementary Material.
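The general idea of unweighted minimization with a random element can be sketched as follows. This is a minimal illustration and not the trial's actual implementation, which is described in reference [56] and the Supplementary Material. It assumes that each minimization factor has already been reduced to categorical levels (the categories below are hypothetical), that imbalance is measured by unweighted marginal totals, and that ties are resolved by simple randomization.

```python
import random

ARMS = ("exercise", "control")


def minimize(new_patient, allocated, p_preferred=0.8, rng=random):
    """Allocate one patient by unweighted minimization with a random element.

    new_patient: dict mapping factor name to categorical level.
    allocated:   list of (patient_dict, arm) for patients already allocated.
    p_preferred: probability of choosing the arm that minimizes imbalance.
    """
    # Marginal totals: per arm, count earlier patients who share a level
    # with the new patient, summed over all factors with equal weight.
    totals = {arm: 0 for arm in ARMS}
    for patient, arm in allocated:
        for factor, level in new_patient.items():
            if patient.get(factor) == level:
                totals[arm] += 1

    if totals[ARMS[0]] == totals[ARMS[1]]:
        return rng.choice(ARMS)  # tie: simple randomization (assumption)

    preferred = min(ARMS, key=totals.get)
    other = ARMS[1] if preferred == ARMS[0] else ARMS[0]
    return preferred if rng.random() < p_preferred else other


# Hypothetical example with two already-allocated patients.
earlier = [
    ({"sex": "f", "age": "<40", "phq9": "moderate", "psqi": "poor"}, "exercise"),
    ({"sex": "m", "age": "<40", "phq9": "severe", "psqi": "poor"}, "control"),
]
newcomer = {"sex": "f", "age": "<40", "phq9": "moderate", "psqi": "poor"}
print(minimize(newcomer, earlier))
```

In this example the newcomer shares four levels with the exercise arm and two with the control arm, so the control arm is preferred and is chosen with probability 0.8. The random element keeps the next allocation from being fully predictable.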

Intervention and control condition
The exercise and control interventions started at approximately 04:45 pm. Patients allocated to the intervention group performed a single bout of supervised aerobic exercise on a bicycle ergometer (ergoselect 200, Ergoline, Bitz, Germany). The intervention began with a 5-minute warm-up period in which the intensity was gradually increased. Thereafter, patients exercised at an intensity of 80% of the individual anaerobic threshold (i.e. as defined by graded exercise testing) for 30 min (i.e. as recommended by guidelines [24]). We recorded average power (in watts) and heart rate (Polar H7 chest strap, Polar OY, Finland) as well as perceived exertion in the 5th, 15th, and 30th min in the exercise group. Patients allocated to the control group were asked to sit and read magazines while the intervention group was exercising. All patients completed a mood questionnaire (Befindlichkeitsskala) [58] directly before and after the intervention. This questionnaire consists of 40 adjectives (e.g. cheerful, sad) with a five-point Likert scale (1 = not at all to 5 = very much) to indicate the experience of these adjectives in the present moment. Items load onto eight subscales (with five items each): activity, elation, contemplation, calmness, fatigue, depression, anger, and excitement. We also administered six Likert-scaled (1 = not at all to 5 = very much) questions on adverse outcomes (pain, dizziness, cardiovascular symptoms, respiratory symptoms, nausea, and "other") immediately after the intervention.
We took multiple measures to offset the risk of performance bias that is inherent to exercise trials. Patients were instructed not to perform any other physical exercise except their daily activities. All patients wore an accelerometer (vivofit 2, Garmin, Schaffhausen, Switzerland) on their nondominant wrist on the days before and after the sleep assessments. The validity of this accelerometer has been demonstrated [59]. The accelerometer data allowed us to gauge potential contamination through other physical activity. Moreover, the rules and schedules of the in-patient rehabilitation clinic (e.g. timing of meals, consumption of multimedia, and alcohol) limit the variability of many behavioral aspects and ancillary treatments which could influence sleep.

Follow-up assessments
We repeated objective and subjective sleep assessments at follow-up identically to baseline (see Figure 1). We also administered the adverse outcomes questions upon awakening. Lastly, the Stanford Sleepiness Scale [60] was administered four times (08:00 am, 12:00 pm, 04:00 pm, 08:00 pm) on the day after the intervention to assess excessive daytime sleepiness.

Outcomes
Polysomnographic sleep efficiency was the primary outcome. We chose a polysomnographic variable because the inability to blind patients against allocation in exercise trials increases the risk of a detection bias for patient-reported outcomes. Patients with depression have difficulties initiating and maintaining sleep, and they also show early morning awakening [61]. Sleep efficiency is an appropriate polysomnographic measure to capture these sleep problems. We defined multiple secondary outcomes which should help to inform clinical decision-making. Secondary polysomnographic outcomes were: total sleep time (TST), sleep onset latency (SOL), wake after sleep onset (WASO), number of awakenings (NA), stage one sleep (N1; in percent of TST), stage two sleep (N2; in percent of TST), stage three sleep (N3; in percent of TST), non-REM sleep (in percent of TST), REM sleep (in percent of TST), REM-sleep latency (minutes), and stage shift index (stage changes per hour). Secondary subjective outcomes were subjective sleep quality, daytime sleepiness, mood states, and adverse events. The secondary outcomes of presleep arousal and nocturnal autonomic cardiovascular modulation prespecified in the protocol will be published in a different paper.
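To make the outcome definitions concrete, the sketch below derives several of the listed variables from a scored hypnogram. It is an illustration under stated assumptions and not the scoring software used in the trial: the hypnogram is assumed to be a list of 30-second epochs from lights off to lights on, with hypothetical stage labels "W", "N1", "N2", "N3", and "R". Sleep efficiency is computed as TST in percent of time in bed.

```python
EPOCH_MIN = 0.5  # one 30-second scoring epoch, expressed in minutes


def sleep_summary(hypnogram):
    """Summarize a hypnogram given as a list of per-epoch stage labels."""
    asleep = [stage != "W" for stage in hypnogram]
    time_in_bed = len(hypnogram) * EPOCH_MIN
    tst = sum(asleep) * EPOCH_MIN

    if tst == 0:
        # No sleep scored: latency spans the whole recording.
        return {"TST": 0.0, "SOL": time_in_bed, "WASO": 0.0, "SE": 0.0}

    onset = asleep.index(True)  # first epoch scored as any sleep stage
    sol = onset * EPOCH_MIN
    waso = sum(1 for s in hypnogram[onset:] if s == "W") * EPOCH_MIN

    summary = {
        "TST": tst,                       # total sleep time, minutes
        "SOL": sol,                       # sleep onset latency, minutes
        "WASO": waso,                     # wake after sleep onset, minutes
        "SE": 100 * tst / time_in_bed,    # sleep efficiency, percent
    }
    # Stage proportions in percent of TST.
    for stage in ("N1", "N2", "N3", "R"):
        summary[stage + "_pct"] = 100 * hypnogram.count(stage) * EPOCH_MIN / tst
    return summary


# Hypothetical 100-minute recording.
night = ["W"] * 20 + ["N1"] * 10 + ["N2"] * 100 + ["W"] * 10 + ["N3"] * 40 + ["R"] * 20
print(sleep_summary(night))
```

Because sleep efficiency falls with a long sleep onset latency, with wake after sleep onset, and with early final awakening, it summarizes all three complaints named in the paragraph above in a single number.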

Statistical methods
We analyzed the primary outcome using an ANCOVA model. We used baseline sleep efficiency and minimization factors [62] as covariates, allocation as the independent variable, and follow-up sleep efficiency as the dependent variable. First, we checked the statistical prerequisites. If residuals were heteroscedastic, we used heteroscedasticity-consistent estimation of the covariance matrix (HC3) [63]. We used intent-to-treat analysis to reduce attrition bias. We replaced missing values using multiple imputation [64]. Sensitivity analyses for the primary outcome were performed to gauge the influence of several factors: influential data points, per-protocol analysis (to reflect optimal adherence to treatment), missing data (analysis of complete data only), and chronotypes. All analyses were performed using the software R, version 3.6.3 [65].
Sample size calculation was performed according to the procedure defined by Borm et al. [66]. The expected treatment effect is based on the work of Passos et al. [67], which found a standardized mean difference in polysomnographic sleep efficiency of 0.53 (a detailed rationale for the choice of this effect size is provided in the protocol [25]). With a power of 0.8 and a two-sided alpha of 0.05, 57 subjects would be required for each group using a t-test. According to the method of Borm et al., this sample size can be multiplied by a "design factor" of (1 − ρ²), where ρ is the correlation coefficient between baseline and follow-up outcome [66]. We used a conservative estimate and let ρ = 0.5, resulting in a "design factor" of 0.75 (1 − 0.5² = 0.75). Hence the sample size needed per group is 43 (57 × 0.75 = 42.75). We anticipated a dropout rate of approximately 7% (half the average dropout rate of trials investigating the chronic effects of exercise in patients with depression [68]). Therefore, we calculated the total sample size to be 92 (2 × 43 × 1.07 ≈ 92).
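The arithmetic of this calculation can be retraced with a short sketch. The per-group t-test sample size of 57 is taken as given from the text rather than recomputed. The rounding rules are assumptions chosen to match the reported figures: the per-group size is rounded up, and the dropout-inflated total is rounded to the nearest integer.

```python
import math


def ancova_sample_size(n_per_group_ttest, rho, dropout_rate):
    """Apply the (1 - rho^2) design factor and inflate for dropout.

    n_per_group_ttest: per-group sample size required for a t-test.
    rho:               assumed baseline/follow-up correlation.
    dropout_rate:      anticipated proportion of dropouts.
    """
    design_factor = 1 - rho ** 2
    # Round up so the per-group size does not fall below the requirement.
    n_per_group = math.ceil(n_per_group_ttest * design_factor)
    # Nearest integer reproduces the reported total (assumption).
    total = round(2 * n_per_group * (1 + dropout_rate))
    return design_factor, n_per_group, total


factor, per_group, total = ancova_sample_size(57, rho=0.5, dropout_rate=0.07)
print(factor, per_group, total)  # 0.75 43 92
```

Note that the product 2 × 43 × 1.07 is 92.02, so rounding up instead of rounding to the nearest integer would give 93 rather than the reported 92.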
The choice of statistical analysis for secondary outcomes was based on the number of assessments for each outcome. We calculated the effect of allocation using ANCOVA models (as outlined for the primary outcome) for all outcomes assessed at baseline and follow-up. We analyzed acute daytime sleepiness (four measurements) using a two-way repeated-measures ANOVA with Benjamini-Hochberg [69] corrected post hoc paired sample t-tests. We assessed adverse outcomes using Mann-Whitney-U tests. The threshold for statistical significance was set at p ≤ 0.05. We did not adjust secondary analyses for multiple testing, and thus these should be considered exploratory.
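For readers unfamiliar with the Benjamini-Hochberg procedure, the following minimal sketch computes adjusted p-values for a small family of tests. It illustrates the general step-up procedure and is not the R code used in this trial. The four p-values are hypothetical and merely mirror the four sleepiness measurement times.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical post hoc p-values for the four measurement times.
raw = [0.01, 0.04, 0.03, 0.20]
print(benjamini_hochberg(raw))
```

A test is then declared significant when its adjusted p-value does not exceed the chosen threshold, which controls the false discovery rate within that family of tests.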

Results
Four hundred and forty-eight patients were screened for inclusion between September 2018 and January 2020 (see Figure 2). The most frequent reason for exclusion was the use of hypnotics (48%), followed by exercise contraindications (13%), and not being diagnosed with unipolar depression (8%). Ninety-two patients met eligibility criteria and were allocated to moderate aerobic exercise (N = 46) or the control condition (N = 46). Baseline characteristics of the study sample are summarized in Table 2. Demographic and clinical characteristics were well balanced at baseline. Four patients did not complete the study (two in each group). Dropouts did not seem to differ from completers at baseline. In addition to the dropouts, two polysomnographic measurements at follow-up failed (one in each group), and five patients did not complete the subjective sleep questionnaires. Inter-rater reliability was good (Cohen's Kappa: wake: 0.82; N1: 0.49; N2: 0.68; N3: 0.73; REM: 0.79). The intervention was implemented as planned: the mean rate of perceived exertion was 13.6 (SD = 1.6), and the mean percent of age-predicted maximal heart rate over the course of the intervention was 70.6% (SD = 6.8%), see Supplementary Figures S1 and S2. There was no evidence to suggest that average daily steps differed between the groups, F(1.82, 144.14) = 0.08, p = 0.9.
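The stage-wise Cohen's Kappa values reported above can be illustrated with a minimal sketch. It assumes that agreement for a given stage is computed by recoding each epoch as "this stage" versus "any other stage" for both scorers; the source does not state the exact procedure, so this is an assumption. The epoch labels below are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equally long sequences of categorical ratings."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from the two raters' marginal frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when both raters use one category")
    return (observed - expected) / (1 - expected)


def stage_kappa(rater_a, rater_b, stage):
    """Kappa for one stage versus all other stages (assumed recoding)."""
    return cohens_kappa([s == stage for s in rater_a],
                        [s == stage for s in rater_b])


# Hypothetical epoch-by-epoch scorings from two scorers.
scorer_1 = ["W", "W", "N2", "N2"]
scorer_2 = ["W", "N2", "N2", "N2"]
print(stage_kappa(scorer_1, scorer_2, "W"))  # 0.5
```

Kappa corrects raw percent agreement for the agreement expected by chance, which is why a frequently confused stage such as N1 can show a low value even when overall agreement is high.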
Intent-to-treat ANCOVA of follow-up sleep efficiency, adjusted for pre-intervention levels and minimization factors, did not detect a significant effect of the allocation (see Table 3 and Figure 3). The coefficient for allocation is the difference between the mean change scores of each group.
This finding was robust in all sensitivity analyses. Prespecified sensitivity analyses included per-protocol (i.e. only including patients who reported a rate of perceived exertion (RPE) value of ≥13 at the end of the exercise intervention) and complete-case (i.e. not using imputed data) analyses. Furthermore, we identified one influential data point (based on Cook's distance and DFBETA), clearly visible in Figure 3. Excluding this observation did not alter the primary ANCOVA results or any of the other aforementioned sensitivity analyses. There were no significant interaction effects of chronotype (β = 0.20, 95% CI = −0.08 to 0.47, p = 0.19), expectancy (β = −2.33, 95% CI = −8.43 to 3.79, p = 0.45), and credibility (β = −4.17, 95% CI = −12.54 to 4.19, p = 0.32) with allocation. There was no evidence that either the rate of perceived exertion (r_s = 0.05, p = 0.76) or the average percent of age-predicted maximal heart rate (r_s = −0.16, p = 0.30) was associated with sleep efficiency in the intervention group at follow-up.
Steps on day four (measured by accelerometer) did not predict sleep efficiency at follow-up in the ANCOVA model (β = −0.10, 95% CI: −0.46 to 0.25, p = 0.56; for ease of interpretation, step count was divided by 1,000).
There was no evidence for an effect of allocation on any other objectively or subjectively measured sleep outcomes. The effects of allocation on polysomnographic and subjective sleep outcomes are summarized in Tables 4 and 5. There was no evidence for an effect of allocation on daytime sleepiness over time, F(3, 222) = 1.15, p = 0.33. Post-hoc tests assessing the effects of allocation were not significant (see Supplementary Figure S3).
Next, we investigated the effects of exercise on mood. All subscales of the mood questionnaire showed good internal consistency (Cronbach's alpha > 0.8) except the subscale contemplativeness (Cronbach's alpha: 0.47 and 0.64 for pre- and post-intervention, respectively). Hence, we did not further analyze the items of the subscale contemplativeness. ANCOVAs of post-intervention mood, adjusted for pre-intervention levels and minimization factors, showed that exercise consistently improved mood. Patients in the intervention group reported higher levels of activation (β = 0.85, 95% CI: 0.
We did not find evidence for an effect of allocation on adverse outcomes. There were no serious adverse events in either group. We aggregated the questions on adverse outcomes since they showed satisfactory internal consistency at both time points (Cronbach's alpha: 0.76 and 0.73). The number of reported symptoms immediately after the intervention did not differ between the groups. However, immediately after the intervention there was a trend toward lower symptom severity in the intervention group (median = 0.29) compared to the control group (median = 0.46; r = 0.19, p = 0.08). Upon awakening on the day after the intervention, there was no evidence to suggest that the groups differed in terms of adverse outcome frequency or intensity.
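The internal consistency coefficient used for the mood subscales and the adverse outcome questions can be sketched as follows. This is a minimal illustration of the standard Cronbach's alpha formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of the sum score), and not the code used in the trial. The response matrix is hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha; `rows` holds one list of item scores per respondent."""
    k = len(rows[0])  # number of items in the scale
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the per-respondent sum score.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses of four patients to a five-item subscale (1 to 5).
responses = [
    [4, 5, 4, 4, 5],
    [2, 2, 3, 2, 2],
    [3, 3, 3, 4, 3],
    [1, 2, 1, 1, 2],
]
print(round(cronbach_alpha(responses), 2))
```

Values close to 1 indicate that the items rise and fall together across patients, which is the usual justification for aggregating them into one score.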

Discussion
The main goal of our trial was to investigate the effects of 30 min of moderate aerobic exercise in patients with depression on the subsequent night's sleep efficiency measured by polysomnography. Secondary goals were to assess the effects on other objectively and subjectively measured sleep outcomes, mood, and adverse effects.
We did not find evidence to suggest that a single bout of moderate aerobic exercise improves polysomnographically or subjectively measured sleep outcomes. The absence of evidence for an effect of allocation was very consistent. Neither the sensitivity analyses of the primary outcome nor the analyses of the secondary polysomnographic or subjective sleep outcomes found a significant effect. To the best of our knowledge, this is the first trial to study the acute effects of a single bout of aerobic exercise on sleep in patients with depression. Trials investigating the effects of a single bout of exercise on objectively and subjectively measured sleep in patients with insomnia, however, have found equivocal evidence. While two trials [70,71] found no effect of exercise on sleep, three trials found some positive effects on objectively measured sleep [67,72,73]. Passos et al. found beneficial effects on sleep onset latency, total sleep time, and sleep efficiency [67]. Li-Jung et al. observed positive effects on actigraphically measured sleep latency and sleep efficiency [73]. Morita et al. showed a reduction in stage shifts during the entire night as well as a reduction in stage shifts, arousal index, and wake stages during the second half of the night following exercise performed in the morning [72]. The different findings cannot be readily explained by moderating factors such as intensity, duration, or timing of the exercise. The only trial in which exercise was implemented during the morning hours revealed a significant effect [72]. Exercising in the afternoon or evening produced mixed results, with some trials finding beneficial effects [67,73] and others finding no effect [70-72].
There are several possible explanations for why we did not find evidence for an effect of exercise on sleep in this trial. First, issues of internal consistency might have played a role. However, our study design makes this explanation unlikely for the following reasons. We can rule out contamination from other physical activity since step count did not differ between groups and was not a significant covariate. There were no differences between the groups at baseline. Moreover, the in-patient rehabilitation setting limits the variability of behavioral aspects that can influence sleep. Second, the trial might have been underpowered due to inappropriate assumptions. Our sample size calculation was based on an effect size for aerobic exercise found in patients with insomnia (a detailed rationale of the sample size calculation can be found in the protocol [25]). However, in our study, polysomnographic outcomes were within the range of healthy individuals [74,75]. This finding is most likely due to the exclusion of patients who regularly used hypnotics, thus excluding patients with severe insomnia. Reported effect sizes of aerobic exercise on sleep in healthy individuals are smaller [76] than the effect size we used for our sample size calculation. Hence, it is possible that our trial was underpowered to detect a significant difference. Third, a single bout of aerobic exercise might not affect sleep in patients with depression. Superiority trials cannot provide evidence for the absence of an effect. Thus, a non-inferiority trial is needed to provide evidence for the second or third explanation.
The immediate effects of the exercise intervention on mood states were consistently positive. Negative mood states decreased, notably including depressiveness, and positive mood states increased. Findings on the acute effects of moderate aerobic exercise on mood states in patients with depression have been equivocal to a certain degree. While all trials have found positive effects for at least some mood subscales, only some have found positive effects on all mood subscales. Our findings are consistent with those of Niedermeier et al. [77], who also found increased positive and decreased negative mood states with large effect sizes. Stark et al. [78], Frühauf et al. [79], Bartholomew et al. [80], and Legrand et al. [81] found positive effects on some but not all mood states. Of note, the trial of Stark et al. [78] implemented the same questionnaire as we did, but the intervention lasted 60 min. They found significant and substantial beneficial effects for all subscales except anger. The inconsistencies between the studies mentioned above are likely due to small sample sizes and different outcome questionnaires, which measure slightly different facets of mood.
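Both the possible underpowering of this trial and the small samples of earlier mood trials come down to the same arithmetic: the smaller the true standardized effect, the more participants are needed to detect it. The following is a minimal sketch of that relationship, assuming a two-sided comparison of two equal groups and a normal approximation. The effect sizes are hypothetical illustrations, not the values used in the protocol [25] or reported in [76], and the function names are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants per group for a two-sided, two-group comparison.

    Normal approximation: n = 2 * ((z_(1-alpha/2) + z_power) / d) ** 2.
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


def approximate_power(effect_size, n_group, alpha=0.05):
    """Approximate power for a given per-group sample size (same approximation)."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return _STD_NORMAL.cdf(effect_size * sqrt(n_group / 2) - z_alpha)


# Hypothetical standardized effect sizes, chosen only for illustration.
# 46 per group matches the number randomized to each arm in this trial.
for d in (0.8, 0.5, 0.3):
    print(f"d = {d}: about {n_per_group(d)} per group; "
          f"power with 46 per group is roughly {approximate_power(d, 46):.2f}")
```

Under this approximation, halving the assumed effect size roughly quadruples the required sample. An effect-size assumption taken from patients with insomnia can therefore leave a trial underpowered if participants sleep more like healthy individuals.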
Adverse outcome severity immediately after the intervention tended to be slightly lower in the intervention group. Although this was a nonsignificant trend and the effect size was small, it is important to note that there is no evidence that exercise increased pain, dizziness, nausea, or cardiovascular and respiratory symptoms. Adverse events are a central aspect of clinical decision-making [46]. However, adverse effects (i.e., undesirable symptoms or outcomes temporally associated with an intervention) are underreported in exercise [82] and sleep [46] trials. To the authors' knowledge, there are no trials on the acute effects of exercise that included patients with depression and reported adverse outcomes. The meta-analysis of Krogh et al. [83] analyzed the chronic effects of exercise in patients with depression. Only approximately 10% and 30% of the included trials reported data on serious and nonserious adverse events, respectively. Based on these limited data, Krogh et al. found that allocation to exercise interventions was associated with a lower risk of nonserious but an increased risk of serious adverse events [83]. The meta-analysis of Niemeijer et al. found no evidence of an increased risk for nonserious adverse events in the psychiatric subgroup [82]. Thus, our study helps to close the gap in the literature concerning the adverse effects of exercise.
This study has several strengths. We took several measures to minimize the risk of bias. These include using minimization (a restricted randomization technique) as well as solid allocation concealment (selection bias), blinding scorers of polysomnographic data (detection bias), quantifying contamination through other physical activity (performance bias), and intent-to-treat analysis (attrition bias). We also carefully selected secondary outcomes that help to inform clinicians, patients, and policymakers. Both scorers have demonstrated good agreement with gold standard ratings of the AASM interscorer program [52]. The inter-rater agreement in this trial was within the range reported by other sleep centers [84,85]. Furthermore, the absence of evidence for an effect of allocation on sleep outcomes and the strong positive effect on the different mood subscales are very consistent.
Limitations include the restricted external validity and the limited polysomnographic montage. The inclusion of patients with psychiatric and somatic comorbidities enhanced the external validity of this study. However, we excluded many of the screened patients because of hypnotic use. Although this increased internal validity, it limited external validity. These limits should be considered when using the findings to inform clinical practice. It is unclear whether the present findings are transferable to patients who regularly use hypnotics. We also made a conscious trade-off between feasibility (reduced polysomnographic montage) and the resulting loss of information. The reduced montage did not allow us to analyze EEG microarchitecture. The strengths and limitations highlighted above point to interesting avenues for future research.
The findings of our study have several scientific implications worth mentioning. Future trials can improve our understanding of the effects of exercise on sleep in patients with depression in many ways. Effectiveness trials could compare the acute effects of different interventions commonly used in in-patient or outpatient treatment settings (e.g., relaxation or mindfulness interventions, light therapy). Importantly, these trials should include patients who use hypnotics. A particularly promising line of investigation is whether exercise is an effective add-on treatment to psychotherapy, pharmacotherapy, or both. The effect of exercise timing is also of interest. The timing of exercise throughout the day seems to alter the effects of exercise on sleep in healthy individuals [76]. This finding might be partially explained by the different effects morning and evening exercise have on melatonin secretion at 10:00 pm [86] and on circadian phase shifts (the latter also depend on chronotype) [87]. Any trial on the effects of exercise in patients with depression should systematically collect and report data on adverse effects. Non-inferiority trials could show that exercise does not increase the risk of adverse outcomes. Despite the remaining research questions, this trial can improve clinical decision-making.
These findings can inform clinical practice in multiple ways. A single bout of moderate aerobic exercise can be expected to improve mood states and is unlikely to have harmful effects on sleep or other symptoms. In addition, patients can expect positive effects on sleep (and many other outcomes, including depressiveness) if they continue to exercise over multiple weeks or months. Our findings also add to the growing body of evidence that exercise performed after 02:00 pm does not reduce sleep quality. This body of evidence is in contrast to current sleep hygiene recommendations [21]. Meta-analyses in healthy populations of all ages have also consistently confirmed that exercise after 02:00 pm either has no effect on sleep or even improves it [23,76,88]. Trials focusing on patients with insomnia have found similar results, although there are far fewer trials available [67,70–73]. This is relevant to therapeutic settings where exercise interventions are commonly also implemented after 02:00 pm.

Conclusions
In conclusion, this trial suggests that a single bout of moderate aerobic exercise strongly improves mood, but it found no evidence for an effect on the subsequent night's sleep or on adverse outcomes. To the best of our knowledge, this is the first trial to study the effects of a single bout of exercise on sleep in patients with depression. Additional non-inferiority trials are needed to confirm that moderate aerobic exercise neither negatively affects sleep nor increases the risk of adverse outcomes.
Inclusion criteria
• ≥18 and ≤65 years old
• Primary diagnosis of depression (F32, F33) without psychotic episode according to ICD-10
Exclusion criteria
• Regular use of hypnotic agents* (patients are included if no hypnotic agents were taken 2 weeks before study participation)
• Factors precluding exercise testing or training†
• Use of beta-blockers (except Carvedilol & Nebivolol)
• Use of opioids
• History of epilepsy
• Restless legs syndrome defined by ≥7 points on the Restless Legs Screening Questionnaire [35]
• Moderate or severe sleep apnea defined by an oxygen desaturation index (using 4% criterion) ≥ 15 in the baseline polysomnography
• Morbid adiposity with BMI > 40

The internal consistency of the sleep questionnaire subscales was adequate (Cronbach's alpha: sleep quality = 0.87; recuperation after sleep = 0.93; mental balance before sleep = 0.90; exhaustion before sleep = 0.68; nocturnal psychosomatic symptoms = 0.65).
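For readers less familiar with the statistic, the sketch below shows how Cronbach's alpha is commonly computed from item-level responses. The item scores are invented for illustration and are not trial data, and the function name is an assumption of this sketch.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items, each a list of per-respondent scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Invented responses of five respondents to three items of one subscale.
example_items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
print(f"alpha = {cronbach_alpha(example_items):.2f}")
```

Alpha approaches 1 when the items of a subscale rise and fall together across respondents, and it drops when items vary independently of one another.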

Figure 3. Baseline and follow-up sleep efficiency by allocation.

Table 3. ANCOVA table for intent-to-treat analysis of sleep efficiency at follow-up

Table 5. Coefficients of exercise allocation in ANCOVA models predicting subjective sleep outcomes. Allocation was coded as follows: 1 = control, 2 = exercise. All models used baseline values of the outcome as well as minimization factors (sex, age, PHQ9 score, and PSQI score) as covariates and allocation as the independent variable. The coefficient for allocation is the difference of the mean change score in the exercise group compared to the control group.

Table 4. Coefficients of exercise allocation in ANCOVA models predicting polysomnographic outcomes. All models used baseline values of the outcome as well as minimization factors (sex, age, PHQ9 score, and PSQI score) as covariates and allocation as the independent variable. The coefficient for allocation is the difference of the mean change score in the exercise group compared to the control group. N1, stage one sleep; N2, stage two sleep; N3, stage three sleep; REM, rapid eye movement sleep; TST, total sleep time.
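The table notes describe the same model structure throughout: the follow-up value regressed on the baseline value, the minimization factors, and allocation. As a rough illustration of why the allocation coefficient can be read as an adjusted group difference, the sketch below fits a reduced version of such a model by ordinary least squares. Several assumptions here are not from the trial: the data are synthetic and noise-free, the minimization factors are omitted for brevity, and the helper names (`solve`, `ols`) are hypothetical. Allocation uses the 1 = control, 2 = exercise coding stated in the note to Table 5.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic, noise-free example (NOT trial data): baseline sleep efficiency in %,
# allocation coded 1 = control, 2 = exercise, and an arbitrary planted effect.
baseline = [78.0, 82.0, 85.0, 88.0, 90.0, 93.0, 80.0, 86.0]
allocation = [1, 2, 1, 2, 1, 2, 2, 1]
PLANTED_EFFECT = -1.0
follow_up = [20.0 + 0.8 * b + PLANTED_EFFECT * a
             for b, a in zip(baseline, allocation)]

# Design matrix columns: intercept, baseline value, allocation.
design = [[1.0, b, float(a)] for b, a in zip(baseline, allocation)]
intercept, slope, effect = ols(design, follow_up)
print(f"allocation coefficient = {effect:.3f}")
```

Because allocation increases by one unit from control to exercise, the fitted coefficient equals the baseline-adjusted difference between groups (exercise minus control). The analyses reported in the tables additionally adjusted for sex, age, PHQ9 score, and PSQI score.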