Effects of Sleep Deprivation on Performance: A Meta-Analysis

Summary: To quantitatively describe the effects of sleep loss, we used meta-analysis, a technique relatively new to the sleep research field, to mathematically summarize data from 19 original research studies. Results of our analysis of 143 study coefficients and a total sample size of 1,932 suggest that, overall, sleep deprivation strongly impairs human functioning. Moreover, we found that mood is more affected by sleep deprivation than either cognitive or motor performance and that partial sleep deprivation has a more profound effect on functioning than either long-term or short-term sleep deprivation. In general, these results indicate that the effects of sleep deprivation may be underestimated in some narrative reviews, particularly those concerning the effects of partial sleep deprivation.

Meta-analytic reviews, because of their mathematical nature, tend to be fairly objective and consistent. In addition, meta-analysis has several statistical advantages. Since each individual study represents a sample taken from a larger population, sample results may not always match those of the population (i.e. a sampling error). Mathematically averaging across studies minimizes the influence of sampling error since the high and low random deviations tend to balance out. Moreover, since some studies may be based on a relatively small sample, problems with low power are avoided since no formal significance testing is done at the individual study level. (In effect, all individual samples are combined into one large sample which should be largely representative of the general population.)
While meta-analysis has been gaining popularity in other fields, such as personnel management, clinical psychology and education (e.g. 4-6), it has yet to gain widespread acceptance in the sleep research community. To date, only four articles have reported meta-analyses of primary sleep studies. Benca et al. (7) reviewed sleep patterns in psychiatric disorders. Hudson et al. (8) looked at polysomnographic measures in good and bad sleep. Knowles and MacLean (9) assessed age-related changes in sleep. Lastly, Koslowsky and Babkoff (10) examined the effect of total sleep deprivation on work-paced and self-paced task performance.
Moreover, consistent with the pattern observed in other fields, some of the first meta-analyses to appear have been somewhat limited in scope. In regard to sleep deprivation, there are a number of potentially important moderator variables which could be taken into account. For example, there are three types of measures commonly used to assess the effects of sleep deprivation: cognitive performance, motor performance and mood. And, there may be additional variables operating within each of these measures which may further change the effects of deprivation on functioning.
Some evidence does, in fact, suggest that performance varies on different types of cognitive tasks. In a comprehensive survey of the sleep deprivation literature, Johnson (11) concluded that the results from studies using accuracy as the performance variable depended on the type of cognitive measure (e.g. logical reasoning, mental addition, visual search tasks, word memory tasks). In addition, other narrative reviewers (e.g. 12) have suggested that the length and pacing of cognitive tasks may affect performance.
Similarly, with motor performance measures, the use of different tasks could affect results. In a review of the effect of sleep loss on exercise, Martin (13) concluded that the effect of sleep deprivation depends on the type and length of the motor task. For example, while several studies suggest that exercise is not adversely affected by sleep deprivation (e.g. 14-17), others report that performance on certain endurance tasks is decremented (e.g. 18). In an extensive review of the sleep deprivation and exercise performance literature, VanHelder and Radomski (19) concluded that sleep deprivation up to 72 hours does not affect muscle strength or reaction but does decrease time to exhaustion.
One factor that may affect all three measures is the length of sleep deprivation. Naitoh (20), for example, found that sleep deprivation of less than 46 hours is usually too short to have a substantial effect on either cognitive or motor tasks. Other researchers, however, have reported performance decrements at sleep loss durations of less than 45 hours (e.g. 21). While mood appears to be decremented by sleep deprivation (e.g. 12,22-24), it is unclear whether different types of deprivation differentially impact mood.
The purpose of this study is to use the meta-analytic technique to provide a comprehensive, quantitative analysis of the effects of sleep deprivation on functioning. Our work extends that of Koslowsky and Babkoff (10) in that we evaluate a number of additional moderator variables. Specifically, we categorize and separately analyze the measures as being either mood assessment, motor task performance or cognitive task performance. Then, we further differentiate both the motor and cognitive task performance categories according to the length and complexity of the tasks. In addition, our analysis utilizes data from both partial and total sleep deprivation studies.

Decision rules
We established the following criteria for inclusion in our analysis. First, enough information had to be provided to allow computation of an effect size statistic (explained in the next section) for each dependent measure. In most cases this meant direct reporting of the means (or a clear enough graph such that the means could be estimated) and standard deviations for the sleep-deprived and non-sleep-deprived groups. In cases where results were expressed only as a t or a 1 df F, an effect size statistic was computed through a statistical transformation (see 2).
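The transformation mentioned above can be illustrated with a minimal sketch. It assumes the common independent-groups conversion d = t * sqrt(1/N_E + 1/N_C), and that a one-degree-of-freedom F is first converted with t = sqrt(F). The function names and the `sign` argument are hypothetical, and the exact procedure taken from reference 2 may differ, particularly for repeated-measures designs.

```python
import math


def d_from_t(t: float, n_exp: int, n_ctrl: int) -> float:
    """Convert an independent-groups t statistic to a standardized mean difference d.

    Assumption: the t was computed with a pooled standard deviation.
    """
    return t * math.sqrt(1.0 / n_exp + 1.0 / n_ctrl)


def d_from_f(f: float, n_exp: int, n_ctrl: int, sign: int = -1) -> float:
    """Convert a 1-df F statistic to d.

    F discards direction, so the caller supplies the sign
    (-1 when the sleep-deprived group did worse, +1 when it did better).
    """
    if f < 0:
        raise ValueError("F statistics cannot be negative")
    return sign * d_from_t(math.sqrt(f), n_exp, n_ctrl)
```

Because F carries no direction, the sign has to be read from the reported group means before the value is coded.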
Second, the study had to involve short-term total sleep deprivation (≤45 hours), long-term total sleep deprivation (>45 hours), or partial sleep deprivation (sleep period of <5 hours in a 24-hour period). The determination of short-term and long-term sleep deprivation followed the criteria set by Koslowsky and Babkoff (10). The decision to use a sleep duration of <5 hours as the criterion for partial sleep deprivation was made after reviewing a number of studies which allowed the subjects to sleep for short periods of time in a 24-hour period and then selecting a number that reflected a natural cutoff point.
Third, the study had to use either a cognitive performance task, a motor performance task or a mood scale as the dependent measure. In particular, cognitive performance tasks (e.g. logical reasoning tasks, mental addition tasks, Torrance tests) had to be either ≤6 minutes in duration or ≥10 minutes in duration. Motor performance tasks (e.g. serial reaction time, treadmill walking, manual dexterity tasks) had to be either ≤3 minutes in duration or ≥8 minutes in duration. The above times for cognitive and motor tasks were based on our review of a large number of studies and appeared to represent natural separations that could differentiate short from long durations. Finally, if subjects performed multiple tasks of the same type, all data had to be reported. If a study reported results only for positive or statistically significant effects and not for negative or nonsignificant effects, then it was dropped.
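The inclusion cutoffs above can be expressed as a small classification sketch. The function and parameter names are hypothetical, and the short motor-task cutoff of 3 minutes is an assumption about the intended value; tasks falling between the short and long cutoffs are treated as not classifiable.

```python
from typing import Optional

# (short_max_minutes, long_min_minutes) per task kind; the motor short cutoff is assumed.
TASK_CUTOFFS = {"cognitive": (6.0, 10.0), "motor": (3.0, 8.0)}


def classify_deprivation(total_hours_awake: Optional[float] = None,
                         sleep_per_24h: Optional[float] = None) -> Optional[str]:
    """Return 'partial', 'short-term', 'long-term', or None if the design does not qualify."""
    # Partial deprivation: some sleep, but under 5 hours in a 24-hour period.
    if sleep_per_24h is not None and 0 < sleep_per_24h < 5:
        return "partial"
    # Total deprivation: split at 45 hours of continuous wakefulness.
    if total_hours_awake is not None:
        return "short-term" if total_hours_awake <= 45 else "long-term"
    return None


def classify_task_length(kind: str, minutes: float) -> Optional[str]:
    """Return 'short', 'long', or None when the duration falls between the cutoffs."""
    short_max, long_min = TASK_CUTOFFS[kind]
    if minutes <= short_max:
        return "short"
    if minutes >= long_min:
        return "long"
    return None
```

For example, `classify_task_length("cognitive", 8)` returns `None`, reflecting that an 8-minute cognitive task falls in the gap between the two duration categories.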
Of the 56 primary studies located, 37 were rejected because they did not meet the above criteria (a complete bibliography of the studies not used in this meta-analysis is available on request). Of these 37 articles, 29 did not present data in a fashion that would allow us to compute an effect size (e.g. 21,25-28), seven did not use a design that met our qualifications for type of task or type of deprivation (e.g. 29-31) and one presented only positive results (32). We attempted to directly contact the authors of these articles for more information, but a majority either could not be contacted or could not locate the data in question.

Coding of study information
The final data set was formed from the remaining 19 primary journal articles (33-51). A special coding sheet was developed to capture information from these studies. The information coded is listed in Table 1. Effect size statistics, which indicate how many standard deviation units the experimental group was different from the control group, were computed using the methodology outlined by Hunter and Schmidt (2). The formula for the effect size statistic, d, is shown below, where $\bar{X}_E$ is the mean of the experimental group, $\bar{X}_C$ is the mean of the control group and $S_W$ is the standard deviation pooled across both groups.

$$d = \frac{\bar{X}_E - \bar{X}_C}{S_W} \tag{1}$$

The formula for computing the pooled standard deviation in the effect size formula is shown below, where $N_E$ and $S_E$ are the sample size and standard deviation, respectively, for the experimental group and $N_C$ and $S_C$ are the sample size and standard deviation for the control group.

$$S_W = \sqrt{\frac{(N_E - 1)S_E^2 + (N_C - 1)S_C^2}{N_E + N_C - 2}} \tag{2}$$

Sleep, Vol. 19, No. 4, 1996
Careful attention was paid to the sign of the effect size during coding, since d values mathematically can be positive or negative. Studies were uniformly coded such that a negative d represented situations where the experimental (i.e. sleep-deprived) group did worse on the dependent measure than the control group, while a positive d represented situations where the experimental group did better than the control group.
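As a minimal sketch of the effect size computation and sign convention described above, the following assumes the usual pooled-variance form of the within-group standard deviation. The function names and the `higher_is_better` flag are hypothetical conveniences for illustration, not part of the original coding sheet.

```python
import math


def pooled_sd(n_e: int, s_e: float, n_c: int, s_c: float) -> float:
    """Standard deviation pooled across the experimental and control groups."""
    numerator = (n_e - 1) * s_e ** 2 + (n_c - 1) * s_c ** 2
    return math.sqrt(numerator / (n_e + n_c - 2))


def effect_size_d(mean_e: float, mean_c: float,
                  n_e: int, s_e: float, n_c: int, s_c: float,
                  higher_is_better: bool = True) -> float:
    """Effect size d, coded so that a negative value means the
    experimental (sleep-deprived) group did worse than the control group."""
    d = (mean_e - mean_c) / pooled_sd(n_e, s_e, n_c, s_c)
    # For measures where larger scores are worse (e.g. reaction time),
    # flip the sign to keep the coding uniform.
    return d if higher_is_better else -d
```

With this coding, a deprived group scoring 40 against a control mean of 50 (both with a standard deviation of 10) yields d = -1.0.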
In total we were able to code 143 d values representing 1,932 subjects from the 19 primary studies. (Since most sleep deprivation studies used more than one measure of performance, these data are not totally independent of each other. Such a situation is common in meta-analysis.) This subject pool represents a broad range of subjects including both genders and a wide age range. The 19 primary studies were divided approximately equally between the three sleep deprivation categories: four in short-term sleep deprivation (37,42,43,45), six in long-term sleep deprivation (33,35,46,47,50,51), three in both short-term and long-term sleep deprivation (34,36,38) and six in partial sleep deprivation (39-41,44,48,49). Such a data set is large and provides the opportunity to do meaningful analyses.
The reliability of the coding process was assessed by having two independent researchers code all 19 studies. The correlation between the raters was 0.60 for type of sleep deprivation, 1.00 for type of dependent measure, 0.94 for type of task, 0.78 for task duration, 0.93 for sample size and 0.98 for effect size. The lowest interrater reliability was seen for type of sleep deprivation. This was due to the initial analysis not clearly distinguishing between short-term and partial sleep deprivation. With the second coding, this distinction was clarified, which changed some of the previously coded short-term deprivations to partial sleep deprivations. All disagreements across all coding criteria were investigated by the two researchers and resolved. These results indicate that information could be coded reliably from the studies.

Meta-analytic computations
The actual computations for the meta-analyses were performed using a SAS (SAS Institute, 1990) PROC MEANS program developed by Huffcutt et al. (52) that takes the d values from the various studies and combines them mathematically. The result is an estimate of the average effect size across the studies (i.e. the average number of standard deviations the experimental distribution was offset from the control distribution) and the variability observed around this average. All computations are done weighting by sample size, since studies based on a larger sample are more stable than those based on a smaller sample (2,3).
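The sample-size weighting described above can be sketched as follows. This is not the SAS program of Huffcutt et al.; it is a minimal Python illustration that assumes the weighted mean is the sum of N times d divided by the sum of N, and that the reported variability is the N-weighted standard deviation of the d values around that mean, with no correction for sampling error. The function name is hypothetical.

```python
import math
from typing import Sequence, Tuple


def weighted_meta(ds: Sequence[float], ns: Sequence[int]) -> Tuple[float, float]:
    """Return the sample-size-weighted mean and standard deviation of effect sizes."""
    if len(ds) != len(ns) or not ds:
        raise ValueError("ds and ns must be non-empty and of equal length")
    total_n = sum(ns)
    # Each coefficient contributes in proportion to its sample size.
    mean_d = sum(n * d for d, n in zip(ds, ns)) / total_n
    var_d = sum(n * (d - mean_d) ** 2 for d, n in zip(ds, ns)) / total_n
    return mean_d, math.sqrt(var_d)
```

With two invented coefficients, d = -1.0 (N = 10) and d = -2.0 (N = 30), the weighted mean is -1.75, closer to the larger study than the unweighted mean of -1.5.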
It should be noted that Huffcutt et al.'s program does not provide any tests of statistical significance. Formal significance testing is typically not done in a meta-analysis, as the meta-analytic procedures were developed to avoid the problems and limitations associated with significance testing (2,3). Moreover, since sampling errors tend to be averaged out when combining across studies, results of a meta-analysis are thought to represent direct estimates of the strength of a relationship in the population. The average effect size represents the overall strength of a relationship, while the variability around the average reflects the degree to which other variables moderate the relationship. (Therefore, high variability does not imply a lack of an effect but rather that the strength of the effect depends strongly on other variables.)

Analyses
Our first goal was to assess the overall effect of sleep deprivation. More specifically, we attempted to answer two questions. First, at a general level, how well do experimental subjects do, on average, relative to control subjects in a sleep deprivation study? And second, how constant (i.e. stable) are the effects of sleep deprivation across different study designs?
To answer these questions we conducted a meta-analysis of all 143 effect sizes collectively. The mean d score from this analysis indicated the average number of standard deviations that the sleep-deprived group was different from the non-sleep-deprived group, collapsing across all the different study design characteristics. The standard deviation in d scores was a reflection of the extent to which design characteristics affected the magnitude of the difference between deprived and nondeprived subjects.
A second goal was to investigate specifically how the effects of sleep deprivation vary according to the two most prominent study design characteristics, the type of deprivation and the type of measure. Our attempt here was to assess quantitatively how much difference each of these two characteristics makes in terms of the effects of sleep deprivation. To assess the influence of type of deprivation, we separated the studies into the three main categories (short-term, long-term and partial sleep deprivation) and conducted a separate meta-analysis for each category. We then looked to see if there was a difference in the mean d scores across the three categories. [As Hunter and Schmidt (2) noted, the more different the individual means are from the overall mean, the more influence that characteristic has on the strength of the experimental effect.] Similarly, we separated the studies into the three categories of dependent measure (cognitive task performance, motor task performance and mood scales) and conducted a separate meta-analysis for each category. Finally, we conducted a meta-analysis in which we combined across the two prominent design characteristics to assess whether the effects of deprivation for a particular dependent measure changed depending on the type of sleep deprivation (i.e. an interaction effect).
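The moderator analysis described above amounts to running the same weighted averaging within each category. A minimal sketch, assuming each coded coefficient is a (category, d, N) record; the function name and record layout are hypothetical, and the numbers in the usage note are invented for illustration, not taken from the primary studies.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def subgroup_means(records: Iterable[Tuple[str, float, int]]) -> Dict[str, float]:
    """Sample-size-weighted mean d for each moderator category."""
    weighted_sum: Dict[str, float] = defaultdict(float)
    total_n: Dict[str, int] = defaultdict(int)
    for category, d, n in records:
        weighted_sum[category] += n * d
        total_n[category] += n
    return {cat: weighted_sum[cat] / total_n[cat] for cat in weighted_sum}
```

For instance, the invented records `[("partial", -2.0, 10), ("partial", -3.0, 30), ("short-term", -1.0, 20)]` give a weighted mean of -2.75 for the partial category and -1.0 for the short-term category. Passing a combined key such as `"cognitive/partial"` as the category would extend the same sketch to the crossed (interaction) analysis.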
A third goal was to do a supplemental assessment of whether performance on cognitive tasks changed according to either the type of task (simple vs. complex) and/or the task length (short vs. long). Thus, we took the cognitive task studies which were already sorted according to the type of deprivation, further sorted them into simple and complex task categories and conducted a separate meta-analysis for each resulting combination. Then, within each type of deprivation, we re-sorted the studies into short and long duration categories and conducted separate meta-analyses for each resulting combination. These procedures allowed us to assess whether, for a particular type of deprivation, performance on a cognitive task depended on the type and/or length of the task.
Lastly, a fourth and similar goal was to assess whether performance on motor tasks varied according to either the type of task (simple vs. complex) or the length of the task (short vs. long). As described in the preceding paragraph, studies already separated by type of deprivation were then further separated by type of task and length of task.
In closing, the above analyses were designed to follow a "hierarchical" strategy, as is typically done in a meta-analytic investigation (2). In particular, we started at an overall summary level and then progressively made the analyses more and more specific. Two comments should be noted in regard to this strategy. First, unlike cognitive and motor task performance, we did not break mood measures down by additional features such as length and complexity. Conceptually, such features were not as meaningful with mood measures as they were with cognitive and motor tasks. And second, the number of studies being analyzed at any one time progressively decreased as the data set was split into more and more subcategories. Naturally, the smaller the number of studies in a given category, the more tentative the results become.

RESULTS
Results of the overall analysis of all 143 coefficients combined are presented in the top portion of Table 2.
Abbreviations used: d, average effect size; SD(d), standard deviation of effect sizes; N, number of study coefficients in the analysis; TSS, total sample size from those coefficients.
a Averages and standard deviations were computed using sample size weighting.
As shown, the mean effect size collapsing across all study characteristics was -1.37, indicating that the sleep-deprived subjects performed at a level 1.37 standard deviations lower than the performance level of the non-sleep-deprived subjects. The difference of 1.37 standard deviations between the two distributions is graphically illustrated in Fig. 1. In more pragmatic terms, such a finding suggests that a person at the 50th percentile in the deprived group (shown as the dark dot in Fig. 1) performs roughly equivalent to a person at the 9th percentile in the nondeprived group. [This is based on the assumption that both the deprived and nondeprived groups roughly approximate a z distribution. Percentiles were obtained from Minium et al. (53).] The relatively large standard deviation across the effect sizes (2.08) suggests that study design characteristics do make a considerable difference in terms of how deprived subjects perform relative to nondeprived subjects.
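The percentile interpretation above can be checked with a short sketch. It assumes, as the text does, that both groups are approximately normally distributed with equal spread, so the median deprived subject sits d standard deviations from the control mean. The function name is hypothetical.

```python
from statistics import NormalDist


def percentile_in_control(d: float) -> float:
    """Percentile of the control distribution matched by the median
    subject of a group offset by d standard deviations."""
    return 100.0 * NormalDist().cdf(d)
```

For d = -1.37 this gives a value of roughly 8.5, which rounds to the 9th percentile reported above.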
Results of the meta-analyses for the two most prominent study design characteristics, the type of deprivation and the type of measure, are also presented in Table 2. As shown, the type of deprivation does appear to make a difference; partial sleep deprivation appeared to have a considerably greater overall impact on subjects than either short-term or long-term deprivation. In terms of type of measure, sleep deprivation in general appeared to have the least effect on motor tasks, a greater effect on cognitive tasks, and an even greater effect on mood. [However, the average effect size for motor tasks is still considered to be a large experimental effect. For reference, an average effect size of 0.20 is considered to be a small experimental effect, 0.50 is considered medium, and 0.80 or greater is considered to be large (54).]
Results for type of dependent measure crossed with type of deprivation are shown in Table 3. These results in general suggest an interaction between these two design characteristics. For motor performance tasks, the means were fairly close across all three types of deprivation, suggesting that performance on motor tasks is relatively unaffected by the type of deprivation. For cognitive performance tasks, the means were dissimilar, with performance being considerably more decremented with partial sleep deprivation than with either short-term or long-term deprivation. Similarly, mood appeared to be much more affected by partial deprivation than by long-term deprivation. (There were no studies of the effect of short-term deprivation on mood in the final data set.)
Results of the supplemental analyses of cognitive performance tasks are presented in Table 4. For short-term deprivation, performance on complex and long tasks was considerably more decremented than on simple and short tasks, respectively. For long-term deprivation, opposite results were found, with performance being considerably worse on short tasks than on long tasks and slightly worse on simple tasks than on complex tasks. For partial deprivation, subjects did worse on tasks that were simple than on those that were complex and worse on tasks that were longer. However, the relatively small samples involved make these findings much more tentative.
TSS, total sample size from those coefficients. a Averages and standard deviations were computed using sample size weighting. b Data for "Overall" are from Table 3.
Results of the supplemental analyses of motor performance tasks are presented in Table 5. As shown, there were no studies involving complex motor tasks in the final data set. For length of task, performance was worse on long tasks for all three types of deprivation. Once again, the relatively small samples involved make these findings tentative.

DISCUSSION
Our results confirm that sleep deprivation has a significant effect on human functioning. By quantitatively combining across primary studies, we found that the mean level of functioning of sleep-deprived subjects was comparable to that of only the 9th percentile of non-sleep-deprived subjects (i.e. a 1.37 standard deviation difference between the distributions). Although most of the sleep research community may concur with these results, there are a surprising number of scientists outside the sleep research field who have concluded that sleep deprivation has no profound effect on performance and only a marginal effect on mood. For example, many widely known professionals outside the sleep research field writing introductory texts in psychology and physiological psychology have stated that the effects of sleep deprivation on human functioning are minimal (55-59).
Another major finding of our investigation was that the effects of sleep deprivation vary according to two key moderator variables. First, we found a substantial difference across the three dependent measures. Specifically, we found that cognitive performance was more affected by sleep deprivation than motor performance and that mood was much more affected than either cognitive or motor performance. It is important to note, however, that even on motor tasks the sleep-deprived subjects performed considerably worse than the non-sleep-deprived subjects. This pattern of differences among the three types of dependent measures is not surprising and is consistent with the viewpoints of many sleep researchers (1,11,19,24). That mood was more influenced than the objective performance measures is likewise unsurprising. Since mood is usually assessed using self-reporting methodology, it is possible that the subjects could be overestimating the effect of sleep deprivation on their mood. However, it is important to note that on average, the sleep-deprived subjects reported mood ratings that were over 3 standard deviations worse than those of non-sleep-deprived subjects. While part of these differences could be attributable to self-reporting error, it is likely that sleep deprivation has a negative effect on mood.
Second, we found a substantial difference across the three types of sleep deprivation. Unexpectedly, partial sleep deprivation had a much stronger overall effect on the dependent measures than either short-term or long-term sleep deprivation. On average, partially sleep-deprived subjects performed at a level 2 standard deviations below that of the non-sleep-deprived subjects, compared to about a 1 standard deviation difference for both long-term and short-term deprivation.
In addition, we found an interaction between the two key moderator variables, length of sleep deprivation and type of dependent measure. We found that detriments on motor task performance were relatively constant across the three types of sleep deprivation. In contrast, cognitive performance and mood were considerably more decremented under partial sleep deprivation than under long-term or short-term deprivation. Narrative reviews clearly do not indicate such an overwhelming decrement in performance due to partial sleep deprivation. For example, two reviews of the effects of sleep deprivation reported mixed findings from a variety of partial sleep deprivation studies (1,60). Similarly, more recent reviews concluded that the effects of partial sleep loss on medical residents' performance were inconclusive (12,24).
It is possible that the difference in methodology between the narrative reviews and our quantitative analysis could account for the disagreement on the effects of partial sleep deprivation. Alternatively, the disagreement could be attributable to differences in the studies reviewed. For example, four of the six partial sleep deprivation studies in our meta-analysis used medical residents as subjects, and three of these studies specifically used medical-related tasks as dependent measures. It is possible that these tasks were more easily affected by sleep deprivation than more traditional cognitive and motor tasks. However, given the magnitude of the differences between partially deprived and control subjects, it is unlikely that these methodological concerns could account for all of the decrement found in cognitive tasks and mood.
One clear direction for future research is to address why partial sleep deprivation may have such a pronounced effect on mood and cognitive performance. For example, partial sleep deprivation may alter certain circadian rhythm effects on performance and mood. While total sleep deprivation has been found to interact with circadian rhythms (61,62), few studies have investigated the effects of partial sleep deprivation on circadian rhythms. In addition, partial sleep deprivation may be similar to fragmented sleep in that subjects in both cases obtain at least some sleep. Since sleep fragmentation has been shown to significantly decrease performance and mood (63,64), it is possible that the effects of partial sleep deprivation more closely resemble those of sleep fragmentation than those of total sleep deprivation. Furthermore, partial sleep deprivation could have a unique effect on certain psychological variables. Decreased interest and attention, for example, are thought to be two prominent variables related to total sleep deprivation (65) and could be investigated with partial sleep deprivation. Similarly, partial sleep deprivation could have certain physiological effects that are either different or more pronounced than those of total sleep deprivation. Although numerous studies have been conducted on physiological changes following total sleep deprivation (e.g. 66,67), few studies have specifically investigated physiological changes following partial sleep deprivation. In sum, the effects of partial sleep deprivation need to be more thoroughly investigated, particularly since partial sleep loss is a relatively common condition in our society.
There are several limitations that should be noted about our investigation. First, we could not use a number of the primary studies that we found because they did not meet our established criteria. Although the meta-analytic technique does not require that all possible literature be utilized, it is important that coverage of the literature not be systematically biased. In our case, there is no a priori reason to assume that the articles we rejected were different in any systematic way from the articles that we used. Second, a general concern about the meta-analytic technique is that it combines across data that may be inherently positive. This same point, however, can be made concerning narrative reviews. In both cases, the reviewers are simply evaluating published data. Also, this concern may not be as valid in this particular meta-analysis since a number of the studies that we used actually included nonsignificant data. Third, we were not able to draw robust conclusions from our final level of analysis, which examined the influence of task length and complexity on performance. Such analyses may become possible in the future as more primary studies become available. Lastly, other moderator variables, such as age or gender, may influence the interpretation of the effects of sleep deprivation. Similarly, these variables could be investigated as more studies become available.
Nonetheless, these results allow us to draw two major conclusions. First, sleep deprivation has a substantial effect on mood and motor and cognitive performance in humans. And, second, partial sleep deprivation has a greater negative effect on mood and cognitive performance than either short-term or long-term sleep deprivation.