Abstract

The Trail Making Test Part B (TMT-B) is widely used in clinical and research settings as a measure of executive function. Standard administration allows a maximal time score (i.e., floor score) of 300 s. This practice potentially masks performance variability among cognitively impaired individuals who cannot complete the task. For example, performances that are nearly complete receive the same 300-s score as a performance of only a few moves. Such performance differences may have utility in research and clinical settings. To address this, we propose a new TMT-B efficiency metric designed to capture clinically relevant performance variability below the standard administration floor. Our metric takes into account time, correct moves, and errors of commission and omission. We demonstrate that the metric has concurrent validity, permits statistical analysis of performances that fall below the test floor, and captures clinically relevant performance variability missed by alternative methods.

Introduction

The Trail Making Test (TMT; Army Individual Test Battery, 1944; Reitan, 1955, 1958; Reitan & Wolfson, 1993) is among the most widely used neuropsychological tests in clinical and research settings (Rabin, Barr, & Burton, 2005). The test was originally developed in 1938 as “Partington's Pathways” (Partington, Leiter, & Graydon, 1949) and was used as part of the Army Individual Test Battery as a task of general intelligence (Tombaugh, 2004; War Department, Adjutant General's Office, 1944). It gained acceptance in civilian clinical neuropsychological practice as an indicator of organic brain damage after further development by Reitan (1955, 1958) and by inclusion in the widely used Halstead battery (Reitan & Wolfson, 1985). This pencil-and-paper test consists of two consecutively administered parts. In Part A (TMT-A), the examinee uses a pencil to sequentially connect the numbers 1–25, which are scattered about the page. TMT-A is generally viewed as a test of psychomotor processing speed involving perceptual tracking (Strauss, Sherman, & Spreen, 2006). TMT-B is similar but involves alternating alphanumeric sequencing (i.e., 1-A-2-B-3-C… L-13) and is generally considered to be a test of executive function, specifically, rapid cognitive set switching and divided attention (Strauss et al., 2006). Lesion and imaging studies indicate that TMT-B performance, especially errors, is related to frontal brain systems, particularly dorsolateral prefrontal regions (Davidson, Gao, Mason, Winocur, & Anderson, 2008; Moll, de Oliveira-Souza, Moll, Bramati, & Andreiuolo, 2002; Stuss et al., 2001; Yochim, Baldo, Nelson, & Delis, 2007; Zakzanis, Mraz, & Graham, 2005). A recent study, however, showed that TMT-B performance did not differ significantly between a group of patients with frontal lesions and a group with nonfrontal brain lesions, suggesting that the test is sensitive to brain dysfunction but perhaps not specifically to frontal executive dysfunction (Chan et al., 2015).

In both TMT-A and TMT-B, the examinee's score is determined by completion time in seconds. Standardized administration commonly permits 180 s to complete Part A (Strauss et al., 2006) and 300 s to complete Part B (e.g., Heaton, Miller, Taylor, & Grant, 2004). Participants who are unable to complete the test in this time frame receive the maximum time as their raw score. The Heaton normative system (2004), however, permits the examinee to progress beyond 300 s if he or she were “making good progress, were likely to finish soon, and/or, in the examiner's judgment, would be likely to experience additional stress if they were not permitted to complete the task” (Heaton et al., 2004, p. 10). Examiners point out errors immediately when they occur and the clock continues to run while examinees correct them with some guidance. Correcting errors costs time, thereby driving completion times higher. In most of the widely used scoring systems (e.g., Heaton et al., 2004), standard scores correspond to completion time only and do not take into account errors.

Alternatively derived TMT scores have been proposed in an effort to capture interpretive information in addition to completion time. One focus has been on quantifying the executive switching component of TMT-B relative to speed and visual search. These scores include a TMT ratio score of TMT-B/TMT-A (Arbuthnott & Frank, 2000) and a TMT difference score of TMT-B − TMT-A (Arbuthnott & Frank, 2000; Giovagnoli et al., 1996). Arbuthnott and Frank (2000) showed that the TMT-B − TMT-A difference score was not associated with switching reaction time, whereas TMT-B/TMT-A ratio scores > 3 were suggestive of set-switching impairment.

Other investigators have examined the utility of TMT-B errors as a marker of impairment. Nadler and Ryan (1984) found that sequencing errors on TMT-B had 100% accuracy in identifying “organic chronic schizophrenics” (due to hyponatremia). The Delis–Kaplan Executive Function System (Delis, Kaplan, & Kramer, 2001) uses a modified TMT-B task. D-KEFS errors are recorded and normed. Amieva and colleagues (1998) found that inhibition errors (a combination of sequencing and perseverative errors) occurred in 67% of a sample of patients diagnosed with Alzheimer's disease (AD), whereas inhibition errors occurred in only 24% of healthy controls.

Mahurin and colleagues (2006) identify three types of errors on TMT-A and TMT-B:

  1. Sequencing or tracking errors: Proceeding to the wrong letter or number on TMT-B or to the wrong number on TMT-A.

  2. Perseverative errors: Proceeding without alphanumeric alternation (i.e., number–number or letter–letter rather than number–letter–number–letter).

  3. Proximity errors: Proceeding to an incorrect target with close physical proximity to the last correct target on either TMT-A or TMT-B.

Using these definitions, Mahurin and colleagues (2006) examined TMT-A and TMT-B performance in healthy controls and in groups of patients with depression and schizophrenia. The TMT was administered according to standard procedures, except that patients were not corrected when they committed errors, in order to obtain greater independence between completion time and errors. Results showed that relative to healthy controls, TMT-B completion time was significantly longer in the depressed group and was longer still in the schizophrenia group. The schizophrenia group committed significantly more errors than both the depressed and control groups, whereas the number of errors did not differ significantly between the depressed and control groups. Moreover, TMT-B error rate was associated with tests of mental tracking ability, whereas completion time was related to both mental tracking ability and processing speed.

Ruffolo, Guilmette, and Willis (2000) examined TMT-A and TMT-B completion time and errors in four groups: (a) patients with mild to moderate/severe traumatic brain injury, (b) patients with suspect effort, (c) experimental malingerers, and (d) healthy controls. Completion time scores differentiated the malingerers and those with suspect effort from the patients with head injury. Errors were fairly common among the healthy controls with 12% producing at least one error on TMT-A and 35% producing at least one error on TMT-B. Error rates did not differ between the healthy control and head injured groups for either TMT-A or TMT-B. Error rates were inflated on TMT-B for the suspect effort and experimental malingerer groups. The authors concluded that performance errors lacked diagnostic utility in patients with head injury but might be useful in detecting malingering. In contrast, Ashendorf and colleagues (2008) found that TMT-B completion time and errors provided independently meaningful information about executive function in healthy controls, patients with mild cognitive impairment and patients with AD. Further, they demonstrated that diagnostic classification improved using a combination of TMT-B completion time and error score compared with use of either time or error alone.

In other contexts, such as severe head injury or dementia, the standard procedure of assigning the maximum TMT-B score of 300 s to examinees who cannot complete the task masks considerable performance variability. For example, an examinee who completes the task slowly but accurately and hits the 300-s mark at “J” presumably has better executive function, particularly cognitive switching ability, than an examinee who is unable to progress past “C.” Assigning the test's floor score of 300 s to both examinees potentially obscures performance variability that may improve our ability to distinguish differences in severity of the examinees' executive dysfunction. Assigning the 300-s score may also be particularly troublesome in research settings, such as in studies of AD, where a high proportion of participants may fail to complete the task according to standard administration rules yet retain varying degrees of the cognitive abilities tapped by the test. In this situation, the TMT-B score distribution asymptotes at 300 s, thereby decreasing variability and limiting its utility in parametric statistical analyses. The loss of performance variability resulting from the 300-s scoring convention holds irrespective of whether the 300-s score is converted to a standard score for clinical analysis or not (e.g., use of nonconverted raw scores for statistical analysis in research). The standard score in a normative system such as that of Heaton and colleagues (2004) would be at the performance floor, thus causing the same masking of performance variability and limiting differentiation of severity of executive dysfunction.

In developing their comprehensive norms, Heaton and colleagues (2004) prorated TMT-B (TMT-Bpr) raw scores in cases where the examinee had not completed the task by the 300-s time limit. This TMT-Bpr score is computed by dividing 300 s by the number of circles completed. This yields a “time per circle” value that is then multiplied by 25 (the total number of circles on the TMT-B test): 

\[
\text{TMT-Bpr} = \frac{300\ \text{s}}{\text{number of completed circles}} \times 25
\]
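The proration described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic as stated in the text; the function name `tmt_b_prorated` and its parameter names are hypothetical and are not taken from Heaton and colleagues (2004).

```python
TOTAL_CIRCLES = 25     # total circles on the TMT-B sheet, per the text
TIME_LIMIT_S = 300.0   # standard time limit in seconds


def tmt_b_prorated(circles_completed: int, time_limit_s: float = TIME_LIMIT_S) -> float:
    """Prorated TMT-B score: time per completed circle, scaled to all 25 circles."""
    if not 1 <= circles_completed <= TOTAL_CIRCLES:
        raise ValueError("circles_completed must be between 1 and 25")
    time_per_circle = time_limit_s / circles_completed
    return time_per_circle * TOTAL_CIRCLES


# An examinee who reached 20 circles at the limit: 300 / 20 * 25 = 375.0
print(tmt_b_prorated(20))
# An examinee who reached only 10 circles: 300 / 10 * 25 = 750.0
print(tmt_b_prorated(10))
```

Note that the prorated value depends only on the number of circles reached; errors do not enter the computation.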

Here, we propose a new TMT-B scoring metric called “TMT-B efficiency” or “TMT-Be” that is designed to lower the basal level of the test. We developed our TMT-Be metric in the context of a research study of patients with early to moderate–severe AD. The intent of the metric is to “lower the floor” of the test, thereby providing clinically meaningful information about the cognitive status of individuals who fail to complete the task in 300 s. TMT-Be takes into account the examinee's completion time and the number of correct sequencing “moves,” errors of commission, errors of omission, and unattempted items (due to premature termination). Various terms could be used to name this metric. We chose “efficiency score” as a reasonable term that captures the performance components used to compute the metric (see Method) and, for those who cannot complete the task in 300 s, addresses the question of “how capable is this impaired examinee of switching cognitive set rapidly and correctly?” In other words, how “efficient” is the examinee at the TMT-B task?

We hypothesized that our metric would correlate strongly with the standard TMT-B (TMT-Bs) metric and extend the score range (i.e., “lower the floor”) among examinees who receive the maximum time of 300 s on the standard scoring. We also hypothesized that the lower floor would permit statistical analysis of TMT-B performance in individuals who receive the 300-s TMT-B score. We anticipate that our new TMT-Be metric will provide useful clinical information.

Method

Participants

The sample for this report consisted of 59 participants (18 men and 41 women) who completed TMT-A and TMT-B as part of their participation in a broader research project at the Butler Hospital Memory and Aging Program in Providence, RI. The broader project (Investigation of Myelin Loss Associated with Alzheimer's Dementia; Alzheimer's Association, NIRG-09-131008; SD, Principal Investigator) was aimed at using a new magnetic resonance imaging (MRI) method to measure myelin integrity and its relation to cognitive function across the spectrum of cognitively healthy aging to moderate–severe AD. Participants in this study underwent MRI and completed a battery of tests including TMT-A and TMT-B. Participants were assigned to one of four groups: healthy controls, mild cognitive impairment, mild AD, and moderate–severe AD. Group membership was determined using a combination of Mini Mental State Examination (MMSE) and Clinical Dementia Rating Scale (CDR) scores with the following criteria, which are similar to those in the Alzheimer's Disease Neuroimaging Initiative study (http://adni-info.org/Scientists/ADNIGrant/ProtocolSummary.aspx):

  • Controls: MMSE ≥ 27 and CDR = 0.0.

  • Mild cognitive impairment: MMSE = 24–30 and CDR = 0.5.

  • Mild AD: MMSE = 19–26 and CDR = 0.5 or 1.0.

  • Moderate–severe AD: MMSE ≤ 19 and CDR > 1.0.

The grouping criteria listed above deliberately overlap. This was intended to minimize the likelihood of misclassification of individual patients (who were well known to our clinical service) due to a single outlier score on the MMSE or an overly positive or negative caregiver report on the CDR at the time of study entry. All cases were reviewed for assignment to disease stage by a neuropsychologist and a board-certified neurologist. Cases that fell into more than one category were assigned by consensus based on additional information contained in their clinical record at the Memory and Aging Program, with input from a second neuropsychologist to resolve ties.

Note that CDR raters obtained certification through the Washington University School of Medicine in St. Louis, Knight Alzheimer's Disease Research Center training protocol (http://alzheimer.wustl.edu/cdr/).
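To make the overlap among the grouping criteria concrete, the following minimal Python sketch returns every group whose MMSE and CDR criteria a participant satisfies. The function name `candidate_groups` and the group labels are hypothetical; the sketch covers only the screening thresholds listed in this section, not the clinical consensus process used to resolve cases that meet more than one set of criteria.

```python
from typing import List


def candidate_groups(mmse: int, cdr: float) -> List[str]:
    """Return all groups whose MMSE/CDR criteria are met (criteria deliberately overlap)."""
    groups = []
    if mmse >= 27 and cdr == 0.0:
        groups.append("control")
    if 24 <= mmse <= 30 and cdr == 0.5:
        groups.append("mild cognitive impairment")
    if 19 <= mmse <= 26 and cdr in (0.5, 1.0):
        groups.append("mild AD")
    if mmse <= 19 and cdr > 1.0:
        groups.append("moderate-severe AD")
    return groups


# MMSE 25 with CDR 0.5 satisfies two sets of criteria, so the case
# would need to be resolved by clinical consensus:
print(candidate_groups(25, 0.5))  # ['mild cognitive impairment', 'mild AD']
print(candidate_groups(29, 0.0))  # ['control']
```

Returning a list rather than a single label mirrors the text: a case that matches more than one category is flagged for review instead of being assigned automatically.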

Exclusionary criteria for the broader project included metabolic disease interfering with cognitive functioning, evidence of significant cerebrovascular disease (e.g., cortical infarctions, multiple lacunar lesions, extensive leukoaraiosis, or diagnosis of vascular dementia), uncontrolled major depressive disorder, primary psychotic disorder, and significant sensorimotor disability.

Participants were included in the TMT-B analysis reported here if their TMT-B performance contained at least one correct move regardless of the participant's performance on the practice items. The sample for the current TMT-Be study consisted of 17 healthy controls and 42 patients with presumed AD pathology, which included 25 individuals with mild cognitive impairment, 14 with mild AD, and 3 with moderate–severe AD.

The Institutional Review Boards at Butler Hospital and Brown University approved the study. All participants provided written informed consent in accordance with Institutional Review Board-approved procedures.

Procedures

Neuropsychological assessment

All participants underwent a battery of standardized neuropsychological tests administered according to standard procedures. The battery was designed to cover all major cognitive domains. The battery emphasized tests of attention, processing speed, and executive function—cognitive abilities thought to be closely linked to cerebral myelin integrity. The TMT-B was administered according to standard instructions (Reitan & Wolfson, 1993; Strauss et al., 2006, p. 656) following administration of TMT-A. The maximum time score (or floor score) on the TMT-B was 300 s. This score was given to participants who were unable to complete the task within 300 s. This included participants who completed the task in excess of 300 s as well as those who terminated the test prematurely due to frustration or confusion and those with obvious inability to continue with the task.

TMT-Be metric

The TMT-Be metric is computed post hoc, after TMT-Bs administration. Our formula was derived to capture the ratio of correct moves (Mc) to commission errors (Ec) while accounting for time efficiency, omission errors (Eo), and unattempted moves (Mu; i.e., moves that were not attempted because the task was terminated early). Higher TMT-Be scores indicate worse performance.

We defined TMT-B errors to be errors of commission (i.e., any of the three error types defined by Mahurin and colleagues, 2006, and listed above) plus errors of omission (i.e., gaps in sequencing, such as 1-A-2-B [skip] 4-D). We did not count very unusual behaviors as errors, such as intrusions (e.g., writing words and drawing pictures) or other extraneous output that could be pathognomonic for severe impairment or suggestive of malingering or deliberate disdain for the test. Additionally, our proposed TMT-Be metric incorporates errors but does not analyze them or ascribe clinical significance by error type. Rather, our metric treats all errors equally.

Time efficiency was defined as a ratio of total time (T) to Mc. TMT-B is completed correctly and without error in a total of 24 moves (i.e., 1-A is move one, A-2 is move two, 2-B is move three, etc.). Our efficiency formula requires a minimum of one correct move (1 ≤ Mc ≤ 24) to ensure both that the task is acquired and that the T/Mc ratio does not become mathematically undefined. The formula also requires no more than 23 errors of commission (Ec < 24); otherwise, the TMT-Be value becomes uninterpretable. Specifically, the term Mc/(24 − Ec) becomes undefined when Ec = 24 and negative when Ec > 24, thereby driving the TMT-Be value lower (i.e., suggesting better performance) when in fact the examinee's performance is quite impaired. The sum of omission errors (Eo) plus unattempted moves (Mu) may range from 0 to 23 because 24 moves are possible and at least one correct move is required to derive TMT-Be. Time may range up to 300 s. Given these conditions, the TMT-Be formula was derived as:

\[
\text{TMT-Be} = \left[\frac{M_c}{24 - E_c} \times \frac{T}{M_c}\right] + (E_o + M_u)
\]

All three terms and their operands are set mathematically such that they increase with poorer performance due to either increasing errors of omission or commission, more unattempted moves, or decreasing correct moves. Thus, the three terms operate to increase the overall value of the TMT-Be metric. The formula could be arranged differently to produce scores that decrease with poorer performance.

The first term in the brackets represents “move efficiency.” It captures the ratio of correct moves to total possible moves minus errant moves (i.e., errors of commission). As noted earlier, the formula requires Mc ≥ 1. That is, the examinee must complete at least one correct move; otherwise, the bracketed first term becomes mathematically undefined rendering the TMT-Be metric uninterpretable. The constant value in the denominator (i.e., 24) is the number of correct moves on the test. It would be simpler to omit the constant (i.e., Mc/Ec) but the term would become mathematically undefined when Ec = 0.

The “move efficiency” term acquires a value of 1 when the examinee completes all 24 possible moves correctly with no errors of commission. The value will decrease depending on the number of correct moves and errors of commission. Consider, for example, two examinees who both make 10 correct moves (i.e., progressed to “1-A… E-6”) but one examinee does so without commission error, whereas the other commits four such errors. In the first case, the value of the term is 10/(24 − 0) = 0.42, whereas the value rises to 10/(24 − 4) = 0.50 for the second examinee indicating less optimal or less efficient performance. A more extreme hypothetical example is the case when an examinee achieves 24 correct moves but also produces 23 errors of commission. In this case, the value of the term rises to 24/(24 − 23) = 24.

The second term in the brackets represents “time efficiency.” It captures the amount of time (in seconds) per correct move. As noted earlier, the numerator can take on values from 24 to 300 s. A value of 300 s is assigned to examinees who take 300 s or longer to complete the task or who terminate early. The value for this term in an “optimal” performance is 1, indicating that the examinee correctly executed all 24 moves within 24 s. Smaller values are mathematically possible with faster errorless performances of <1 s per move, but this is unlikely in practice. The value of the term increases as efficiency decreases (i.e., completion time rises or the number of correct moves falls). For example, the term acquires a value of 12.5 s/Mc when all 24 correct moves are completed at 300 s (i.e., 300/24 = 12.5), whereas the value rises to 30 s/Mc for an individual who terminates the test after achieving 10 correct moves (i.e., 300/10 = 30), consistent with less efficient performance. The formula multiplies the first and second terms based on our conceptualization that these factors interact during test performance.
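The two bracketed terms can be checked numerically with a small Python sketch that reproduces the worked values given in the text. The function names `move_efficiency` and `time_efficiency` are hypothetical labels chosen for illustration; the range checks follow the constraints stated in the Method (Mc of at least 1 and Ec below 24).

```python
TOTAL_MOVES = 24  # moves needed to complete TMT-B without error, per the text


def move_efficiency(correct_moves: int, commission_errors: int) -> float:
    """First bracketed term: Mc / (24 - Ec)."""
    if not 1 <= correct_moves <= TOTAL_MOVES:
        raise ValueError("correct_moves (Mc) must be between 1 and 24")
    if not 0 <= commission_errors < TOTAL_MOVES:
        raise ValueError("commission_errors (Ec) must be between 0 and 23")
    return correct_moves / (TOTAL_MOVES - commission_errors)


def time_efficiency(time_s: float, correct_moves: int) -> float:
    """Second bracketed term: seconds per correct move, T / Mc."""
    if not 1 <= correct_moves <= TOTAL_MOVES:
        raise ValueError("correct_moves (Mc) must be between 1 and 24")
    return time_s / correct_moves


# Worked values from the text:
print(round(move_efficiency(10, 0), 2))  # 0.42
print(move_efficiency(10, 4))            # 0.5
print(move_efficiency(24, 23))           # 24.0
print(time_efficiency(300, 24))          # 12.5
print(time_efficiency(300, 10))          # 30.0
```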

The final term is the number of omitted moves. It is the sum of errors of omission plus unattempted moves. The value for this term for an optimal performance is 0. Clearly, errors of omission and unattempted items lead to higher overall TMT-Be values again showing that higher TMT-Be scores indicate worse efficiency.

The TMT-Be formula can be algebraically simplified to a less complex computation formula: 

\[
\text{TMT-Be} = \frac{T}{24 - E_c} + (E_o + M_u)
\]
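The simplification follows from cancelling the number of correct moves in the product of the two bracketed terms. A step-by-step sketch, assuming the constraints stated earlier (at least one correct move and fewer than 24 errors of commission), is:

\[
\begin{aligned}
\text{TMT-Be} &= \left[\frac{M_c}{24 - E_c} \times \frac{T}{M_c}\right] + (E_o + M_u) \\
&= \frac{M_c \, T}{(24 - E_c)\, M_c} + (E_o + M_u) \\
&= \frac{T}{24 - E_c} + (E_o + M_u), \qquad M_c \ge 1,\; E_c < 24 .
\end{aligned}
\]

In this form the number of correct moves no longer appears explicitly, so it is reflected in the score only through the time taken and through the count of omitted and unattempted moves.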

The TMT-Be metric is dimensionless. Two salient scores are useful for illustration. An errorless performance in 24 s produces a “benchmark” TMT-Be score of 1; faster errorless performances would produce scores lower than 1. An errorless performance at the time ceiling of 300 s produces a “boundary” TMT-Be score of 12.5; increased errors of commission or omission and unattempted items will result in higher TMT-Be values. The Appendix contains TMT-Be scores across a range of hypothetical TMT-Bs performances.
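As an illustration, the computation formula and its stated constraints can be sketched in Python. The function name `tmt_be` and its parameter names are hypothetical, and the example performances are invented for illustration rather than drawn from the study data. In particular, mapping "stopped at J" to 19 correct moves and "stopped at C" to 5 correct moves is an assumption based on counting positions in the 1-A-2-B sequence.

```python
TOTAL_MOVES = 24    # moves needed to complete TMT-B without error
TIME_CAP_S = 300.0  # times of 300 s or longer are scored as 300 s


def tmt_be(time_s: float, correct_moves: int, commission_errors: int = 0,
           omission_errors: int = 0, unattempted_moves: int = 0) -> float:
    """TMT-Be = T / (24 - Ec) + (Eo + Mu); higher values mean worse efficiency."""
    if time_s <= 0:
        raise ValueError("time_s (T) must be positive")
    if not 1 <= correct_moves <= TOTAL_MOVES:
        raise ValueError("correct_moves (Mc) must be between 1 and 24")
    if not 0 <= commission_errors < TOTAL_MOVES:
        raise ValueError("commission_errors (Ec) must be between 0 and 23")
    omitted = omission_errors + unattempted_moves
    if omission_errors < 0 or unattempted_moves < 0 or omitted > TOTAL_MOVES - 1:
        raise ValueError("Eo + Mu must be between 0 and 23")
    capped_time = min(time_s, TIME_CAP_S)
    return capped_time / (TOTAL_MOVES - commission_errors) + omitted


print(tmt_be(24, 24))    # benchmark score: 1.0
print(tmt_be(300, 24))   # boundary score: 12.5

# Two hypothetical examinees who both receive 300 s under standard scoring:
print(tmt_be(300, 19, unattempted_moves=5))   # stopped at "J": 17.5
print(tmt_be(300, 5, unattempted_moves=19))   # stopped at "C": 31.5
```

The last two calls show the intended effect of the metric: performances that are indistinguishable at the 300-s floor receive different TMT-Be values. The sketch does not check that correct, omitted, and unattempted moves sum to a consistent total, since the text does not state such a rule.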

Statistical Analyses

To maximize statistical power, we collapsed all patients into a single patient group and compared this group with the healthy controls. We used Pearson chi-square analysis for comparison of categorical variables. We used Spearman rank-order bivariate correlation to determine strength of association between TMT-Bs and TMT-Be and other clinical and cognitive variables. Group comparison of continuous variables was performed using parametric methods (e.g., Student's t tests and one-way analysis of variance) or nonparametric methods (e.g., Mann–Whitney U test) as appropriate. Analyses were performed using IBM SPSS® Statistics (IBM Corp., Released 2013, IBM SPSS Statistics for Windows, Version 22, Armonk, NY).

Results

The sample characteristics by group are presented in Table 1. The healthy control and patient groups did not differ significantly by gender (χ² = 0.549, ns) or handedness (χ² = 0.764, ns). There was a statistical trend toward the patient group being older than the control group [t(57) = −1.92, p = .059]. The control group had significantly more years of education than the patient group [t(57) = 2.82, p = .007].

Table 1.

Demographic variables for healthy control and patient groups

| Variable | Healthy controls (n = 17) | Patients (n = 42) |
| --- | --- | --- |
| Gender: male (n, %) | 4 (23.5%) | 14 (33.3%) |
| Gender: female (n, %) | 13 (76.5%) | 28 (66.7%) |
| Handedness: right (n, %) | 14 (82.4%) | 38 (90.5%) |
| Handedness: left (n, %) | 3 (17.6%) | 4 (9.5%) |
| Age (mean ± SD, range), years | 74.47 ± 6.48, 58–86 | 78.10 ± 6.58, 55–87 |
| Education (mean ± SD, range), years | 16.18 ± 2.63, 12–20 | 13.45 ± 3.60, 5–20 |

Notes: Bold font indicates statistical significance (p < .05). Italics indicate trend-level results (i.e., 0.05 < p < .10). See text for further information.

In the whole sample (N = 59), the mean scores were 159.73 s (standard deviation [SD] = 93.13 s) for TMT-Bs and 10.12 (SD = 9.97) for TMT-Be. Table 2 lists the TMT-Bs time scores by group. Shapiro–Wilk tests revealed that the distributions of TMT-Bs and TMT-Be considered in the entire sample deviated significantly from normality (W = 0.835 for TMT-Bs and 0.748 for TMT-Be; p < .001 for both). Table 2 lists skewness and kurtosis by group. Group analysis using the Shapiro–Wilk test revealed that the deviation from normality lay entirely within the patient group for both TMT-Bs and TMT-Be (W = 0.840 for TMT-Bs, W = 0.819 for TMT-Be; p < .001 for both). In the control group, the Shapiro–Wilk test was nonsignificant for both TMT-Bs (W = 0.935, p = .263) and TMT-Be (W = 0.920, p = .148). As expected, Mann–Whitney tests showed that the patient group achieved significantly higher scores (i.e., worse performance) than the healthy controls on TMT-Bs (U = 79.00, p < .001) and on TMT-Be (U = 77.50, p < .001). In the patient group, 14 participants achieved the modal score of 300 s (see Table 2). Figure 1 shows a histogram with superimposed normal curve for TMT-Bs and TMT-Be for the entire sample. Note that the bimodal histogram pattern seen for TMT-Bs (i.e., spikes at TMT-Bs of 100 and 300 s) is absent for TMT-Be. These differences in score distribution demonstrate that the TMT-Be metric captures performance variability that is masked when 300 s is assigned to all participants who fail to complete TMT-Bs in the allowed time or terminate prematurely. Note that the large peak at 300 s in the left histogram, representing standard scoring, has been distributed more broadly in the right histogram, representing the efficiency metric. Also, compared with the histogram for TMT-Bs, the superimposed normal curve for TMT-Be is more platykurtic, owing to the long tail for scores greater than the aforementioned “boundary” score of 12.50.

Table 2.

TMT-Bs, TMT-Be, and other cognitive test scores

| Measure | Healthy controls (n = 17) | Patients (n = 42) |
| --- | --- | --- |
| TMT-Bs | 79.06 ± 26.57 | 192.38 ± 90.58 |
| TMT-Bs (median) | 79 | 167 |
| TMT-Bs (mode) | 79 (n = 3) | 300 (n = 14) |
| TMT-Bs skewness | 0.807 | 0.093 |
| TMT-Bs kurtosis | 0.711 | −1.70 |
| TMT-Be | 3.38 ± 1.22 | 12.85 ± 10.66 |
| TMT-Be (median) | 3.29 | 7.25 |
| TMT-Be (mode) | 3.29 (n = 3) | 2.88 (n = 2), 6.91 (n = 2) |
| TMT-Be skewness | 0.909 | 1.03 |
| TMT-Be kurtosis | 0.713 | −0.27 |
| MMSE | 29.29 ± 0.99 | 24.31 ± 4.33 |
| CDR-SB | 0.06 ± 0.17 | 2.71 ± 2.38 |
| DRS-2 I/P | 36.59 ± 1.28 | 32.00 ± 5.42 |
| COWAT | 38.53 ± 12.28 | 26.98 ± 12.32 |
| BNT | 56.41 ± 3.99 | 44.85 ± 12.55 |
| HVLT-R TL | 27.00 ± 4.98 | 14.51 ± 6.11 |

Notes: Values are mean ± SD except for median, mode, skewness, and kurtosis entries as noted; TMT-Bs values are in seconds. TMT-B = Trail Making Test Part B; TMT-Bs = standard TMT-B; TMT-Be = TMT-B efficiency; MMSE = Mini Mental State Examination; CDR-SB = Clinical Dementia Rating Sum of Boxes; DRS-2 I/P = Dementia Rating Scale-2, Initiation/Perseveration subtest; COWAT = Controlled Oral Word Association Test; HVLT-R TL = Hopkins Verbal Learning Test—Revised Total Learning; BNT = Boston Naming Test. n for DRS-2 I/P, COWAT, HVLT-R, and BNT = 41 and differ from n = 42 for other variables due to missing data. There was no single modal score for TMT-Be in patient group. Bold font indicates statistical significance (p < .05). See text for further information.

Fig. 1.

Histogram with normal curve for standard TMT-B (TMT-Bs in seconds) and TMT-B efficiency (TMT-Be) (n = 59). (The extension of the normal curves beyond the actual highest and lowest TMT-Bs and TMT-Be scores is a graphical artifact inherent in SPSS.)

Table 2 also lists the group means and SDs for two ratings of global cognitive status. The MMSE (Folstein, Folstein, & McHugh, 1975) is a widely used, coarse objective cognitive screening measure. The Clinical Dementia Rating Scale Sum of Boxes (CDR-SB) score is a clinician-administered, structured-interview rating of overall dementia severity. Table 2 also lists group means and SDs for other cognitive measures included in the test battery, including two additional measures of executive function: the Dementia Rating Scale-2 Initiation/Perseveration Scale (DRS-2 I/P; Jurica, Leitten, & Mattis, 2001) and the Controlled Oral Word Association Test (COWAT; Benton & Hamsher, 1989); one measure of memory performance: the Hopkins Verbal Learning Test—Revised, Total Learning Score (HVLT-R TL; Benedict, Schretlen, Groninger, & Brandt, 1998); and one measure of confrontation naming: the Boston Naming Test (BNT; Kaplan, Goodglass, & Weintraub, 1983). We used Student's t tests to compare the patients with controls on these measures. As expected, the patient group showed significantly poorer standing compared with the healthy controls on each of these measures [MMSE: t(50.06) = 7.02, p < .001; CDR-SB: t(41.98) = −7.21, p < .001; DRS-2 I/P: t(49.22) = 5.09, p < .001; COWAT: t(56) = 3.13, p = .003; BNT: t(53.86) = 5.29, p < .001; HVLT-R TL: t(56) = 7.46, p < .001]. Note: The assumption of equal variances (Levene's test) was not tenable for MMSE, CDR-SB, DRS-2 I/P, and BNT. Accordingly, the reported degrees of freedom and t values for MMSE, CDR-SB, DRS-2 I/P, and BNT are for equal variances not assumed. Degrees of freedom for COWAT and HVLT-R TL = 56 due to missing data.

Convergent validity of the TMT-Be metric was explored using Spearman rank-order bivariate correlation against the TMT-Bs score. TMT-Be was strongly correlated with TMT-Bs in the overall sample (r = .99, p < .001), as well as in the healthy control and patient groups considered separately (r = .98, p < .001 in each group). These strong correlations suggest good concurrent validity and are expected because the TMT-Bs score is incorporated into the TMT-Be metric (see the formula derivation earlier).
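One way to see why a near-perfect rank-order correlation is expected is to consider the special case of a completed, errorless performance. The following is an illustrative sketch derived from the computation formula, not an analysis reported in the study. Setting the error and unattempted-move counts to zero gives:

\[
E_c = E_o = M_u = 0 \;\Longrightarrow\; \text{TMT-Be} = \frac{T}{24 - 0} + 0 = \frac{T}{24}.
\]

Because T/24 is a strictly increasing function of T, any set of such performances has the same rank order on TMT-Be as on TMT-Bs, and the Spearman correlation within that subset equals 1. The two metrics can diverge in rank only when errors or unattempted moves are present, which includes performances that receive the 300-s floor score.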

Criterion validity was explored using Spearman rank-order bivariate correlation with global markers of cognitive status collected as part of the study. In the entire sample, TMT-Bs was significantly and positively correlated with CDR-SB (r = .66, p < .001) and significantly and negatively correlated with MMSE, DRS-2 I/P, COWAT, BNT, and HVLT-R TL (lowest correlation of r = −.58 for COWAT to highest of r = −.74 for HVLT-R TL; all p < .001). In the entire sample, TMT-Be was also significantly and positively correlated with CDR-SB (r = .66, p < .001) and significantly and negatively correlated with all other measures (lowest correlation of r = −.58 for COWAT to highest of r = −.75 for HVLT-R TL; all p < .001). Table 3 shows the correlations between the TMT-B scores and cognitive measures by group. Results show the two TMT-B metrics to have similar patterns and strengths of correlation with these measures of specific cognitive abilities. Figure 2 shows the correlations of TMT-Bs and TMT-Be with MMSE and CDR-SB.

Table 3.

TMT-Bs and TMT-Be Spearman rank-order correlation with cognitive status

| Measure | Healthy controls: TMT-Bs | Healthy controls: TMT-Be | Patients: TMT-Bs | Patients: TMT-Be |
| --- | --- | --- | --- | --- |
| MMSE | −.029 (.911) | −.086 (.743) | .595 (<.001) | .619 (<.001) |
| CDR-SB | .374 (.139) | .374 (.140) | .461 (.002) | .454 (.003) |
| DRS-2 I/P | −.105 (.687) | −.105 (.688) | .532 (<.001) | .534 (<.001) |
| COWAT | −.018 (.944) | −.011 (.966) | .610 (<.001) | .604 (<.001) |
| BNT | −.153 (.557) | −.096 (.714) | .537 (<.001) | .552 (<.001) |
| HVLT-R TL | −.319 (.213) | −.356 (.161) | .651 (<.001) | .655 (<.001) |

Note: Values are r (p); bold indicates statistically significant at α = 0.05.

For patient group n = 41 for DRS-2 I/P, COWAT, HVLT-R, and BNT due to missing data. TMT-B = Trail Making Test Part B; TMT-Bs = standard TMT-B; TMT-Be = TMT-B efficiency; MMSE = Mini Mental State Examination; CDR-SB = Clinical Dementia Rating Sum of Boxes; DRS-2 I/P = Dementia Rating Scale-2, Initiation/Perseveration subtest; COWAT = Controlled Oral Word Association Test; HVLT-R TL = Hopkins Verbal Learning Test—Revised Total Learning; BNT = Boston Naming Test.

Fig. 2.

Correlation of TMT-B metrics (TMT-Bs score is in seconds) with MMSE and CDR-SB (n = 59). MMSE = Mini Mental State Examination; CDR-SB = Clinical Dementia Rating Scale Sum of Boxes.

For comparison, we computed the TMT-Bpr score described by Heaton and colleagues (2004) for the 14 participants who obtained a TMT-Bs score of 300 s. Comparing TMT-Be with the Heaton prorated score is appropriate because both scores share the objective of providing useful clinical information about test performance in individuals who do not complete the task within the 300-s time limit. Table 4 lists the Spearman rank-order correlations of TMT-Be and TMT-Bpr with the cognitive measures. TMT-Bpr correlated strongly, negatively, and significantly with MMSE score. Correlations between TMT-Bpr and other cognitive measures were weak and not statistically significant. There were no statistically significant correlations between TMT-Be and the cognitive tests, but moderate trend-level (i.e., .05 < p < .10) correlations were found between TMT-Be and MMSE, as well as between TMT-Be and BNT.

Table 4.

Spearman rank-order correlations between TMT-Be and TMT-Bpr with cognitive status in individuals with TMT-Bs = 300 (n = 14)

Test            TMT-Be          TMT-Bpr
MMSE            .499 (.069)     .710 (.004)
CDR-SB          −.068 (.817)    .182 (.533)
DRS-2 I/P       −.392 (.186)    −.260 (.390)
COWAT           −.014 (.964)    −.014 (.964)
BNT             .503 (.079)     −.415 (.158)
HVLT-R TL       −.050 (.872)    −.042 (.892)

Notes: Values are r (p); bold font indicates statistical significance (p < .05). Italics indicate trend-level results (i.e., .05 < p < .10). TMT-B = Trail Making Test Part B; TMT-Be = TMT-B efficiency; TMT-Bpr = prorated TMT-B; TMT-Bs = standard TMT-B; MMSE = Mini Mental State Examination; CDR-SB = Clinical Dementia Rating Sum of Boxes; DRS-2 I/P = Dementia Rating Scale-2, Initiation/Perseveration subtest; COWAT = Controlled Oral Word Association Test; HVLT-R TL = Hopkins Verbal Learning Test—Revised Total Learning; BNT = Boston Naming Test. n = 14 for MMSE and CDR-SB and n = 13 for other measures due to missing data. The correlation and p value for the relationship between COWAT and TMT-Be and TMT-Bpr are actually the same.

The scatter plots in the top portion of Fig. 3 show the association of TMT-Be with MMSE and CDR-SB for the 14 participants who had a TMT-Bs score of 300 s. The bottom portion of the figure shows these associations for TMT-Bpr. Comparison of the top (TMT-Be) and bottom (TMT-Bpr) portions of the figure shows that TMT-Be scores are more evenly distributed across the entire range of scores achieved in this small sample, whereas TMT-Bpr scores tend to cluster around 500 s with a few scores far in excess of this value.

Fig. 3.

Correlation of TMT-B efficiency (TMT-Be) and prorated TMT-B (TMT-Bpr) with MMSE and CDR-SB for participants with standard TMT-B (TMT-Bs) = 300 s (n = 14). MMSE = Mini Mental State Examination; CDR-SB = Clinical Dementia Rating Scale Sum of Boxes.

Discussion

According to standard test administration procedures, individuals who fail to complete TMT-B in the standard allotted time are all given the same maximum time score of 300 s. This performance “floor” score, however, potentially masks considerable variability among those who cannot complete the task. Individuals who nearly complete the task in 300 s with few or no errors are given the same score as individuals who are unable to complete more than a few moves.

We report here on a new TMT-B scoring metric, TMT-Be, which captures such performance variability. The metric is derived from the standard TMT-B administration. It takes into account the TMT-Bs time score as well as errors of omission and commission. We examined the initial validity of the TMT-Be metric in a research sample consisting of cognitively healthy elderly controls and patients with presumed AD ranging from mild cognitive impairment to moderate–severe dementia.

Our results show TMT-Be and TMT-Bs to be strongly correlated, indicating that the new metric has good concurrent validity. In the healthy control group, TMT-Be correlated moderately and significantly with CDR-SB and HVLT-R TL. In the patient group, TMT-Be correlated moderately and significantly with CDR-SB and all cognitive measures analyzed. Taken together, these results indicate that TMT-Be has good criterion validity. Importantly, among participants who achieved a TMT-Bs score of 300 s, TMT-Be correlated significantly with MMSE and DRS-2 I/P but not with CDR-SB or other measures. Correlations between TMT-Bs and these measures cannot be computed in this subgroup of participants because all TMT-Bs scores were identical (i.e., 300 s).

This latter finding illustrates the potential utility of the TMT-Be metric: the metric captures performance variability at the TMT-B floor that would be lost using standard scoring. This result clearly shows that TMT-Be permits statistical analysis of individuals who cannot perform the task in the allotted time and demonstrates that it has utility for use in research with significantly impaired participants. The result also suggests that the TMT-Be metric may have clinical utility; accordingly, we are currently evaluating its use as a means of staging severity of executive impairment.
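The point that correlations with TMT-Bs cannot be computed when every score sits at the 300-s floor can be illustrated with a short Python sketch. It is not the analysis code used in the study: the rank-correlation helper is a plain textbook implementation (Pearson correlation of average ranks), and the six data points are hypothetical values chosen only to show the behavior.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the ranks; returns None when either variable is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        return None  # no rank variance, so the coefficient is undefined
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    return sxy / (sxx * syy) ** 0.5


# Hypothetical noncompleters (illustrative values only, not study data).
mmse = [24, 21, 19, 17, 14, 12]
tmt_bs = [300, 300, 300, 300, 300, 300]        # everyone at the floor
tmt_be = [16.5, 20.5, 22.3, 28.5, 32.5, 35.5]  # efficiency scores still vary

rho_floor = spearman_rho(tmt_bs, mmse)       # None: constant scores carry no rank information
rho_efficiency = spearman_rho(tmt_be, mmse)  # defined, because TMT-Be differs across examinees
print(rho_floor, rho_efficiency)
```

Because the made-up TMT-Be values are perfectly ordered against MMSE, the second coefficient is at its extreme; real data would give a weaker value, but the contrast with the undefined floor-score correlation is the same.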

The reason for the lack of significant correlation between TMT-Be and CDR-SB among the 14 participants who obtained a TMT-Bs score of 300 s is unclear. The lack of significant correlation between CDR-SB and TMT-Be at higher TMT-Be scores could be related to differences in what the two measures assess and to changes in the relationship between the measures with increasing dementia severity. TMT-B assesses a patient's cognitive set-switching and divided attention abilities, whereas CDR-SB assesses cognitive status more broadly and functional status based on both assessment of the patient and interview with the caregiver. Thus, it is possible that functional status on the CDR could be much more variable among those who “fail” TMT-B compared with those who complete it. The lack of correlation in the more cognitively impaired patients may instead reflect an interaction between dementia severity and the relationship between an objective measure of cognitive functioning (TMT) and a subjective, observer-based measure of functional status (CDR). Measurement error on the CDR at higher levels of dementia could be another factor contributing to the decoupling of its correlation with TMT-Be. Lastly, it is possible that the lack of significant correlation is due to the small number of participants who could not complete TMT-B in 300 s.

In each of the healthy control and patient groups, TMT-Be showed identical patterns and comparable strengths of correlation with other measures. This result further supports the comparability of TMT-Be to TMT-Bs in terms of its relationship to other cognitive measures. Fewer significant correlations between TMT-Be and other measures were found in the small subgroup of individuals who obtained a TMT-Bs score of 300 s. This likely reflects the limited power of this small sample. However, we cannot rule out that the association between the TMT-Be metric and other measures differs among individuals who obtain a TMT-Bs score of 300 s. Further research is needed to evaluate this possibility.

As discussed in the Introduction, other researchers have proposed alternative TMT-B scoring methods (e.g., Arbuthnott & Frank, 2000; Giovagnoli et al., 1996) designed to capture component processes of TMT-B task completion (e.g., speed vs. visual scanning vs. cognitive switching). Still others have examined the utility of errors or subtypes of errors as indicators of impairment as a complement or alternative to the standard time-to-completion score (e.g., Amieva et al., 1998; Delis et al., 2001; Mahurin et al., 2006; Nadler & Ryan, 1984).

Our metric is designed to capture performance variability at the floor of the measure rather than to examine component TMT-B processes or facilitate error analysis. We are unaware of another TMT-B scoring metric that captures such variability. We included errors of commission and omission and unattempted items in our metric. Inclusion of errors and unattempted items is key to capturing performance variability in noncompleters because they all receive the same 300-s score. Accordingly, consideration of errors and unattempted items is the only way to differentiate the severity of examinees' underlying executive dysfunction. Individuals who fail the task may make errors of commission such as proceeding to the wrong number or letter or repeatedly connecting letters or numbers. Including errors of omission and unattempted moves is a means of capturing the performance difference between an examinee who nearly completes the task with a few unexecuted moves and an examinee who completes only a few moves, clearly cannot perform the task, and either terminates prematurely or futilely persists until 300 s is reached. Apart from quantifying errors of commission or omission, our metric does not involve further error analysis or characterization for interpretive reasons. The use of our metric does not, however, preclude further error analysis or other alternative scoring to provide complementary information. Moreover, future research may reveal that more fine-grained error analysis is, in fact, helpful.

An alternative score of particular relevance is the TMT-Bpr score posited by Heaton and colleagues (2004). This score has relevance because it too provides potentially useful clinical information when TMT-Bs scores exceed 300 s. Comparison of both TMT-Bpr and TMT-Be scores shows that both have merit for capturing performance variability across individuals who cannot complete the task in 300 s. In our analysis, TMT-Be and TMT-Bpr differed somewhat in their correlations with other measures in the subgroup of 14 participants who had TMT-Bs = 300 s. TMT-Bpr correlated strongly and significantly with MMSE, whereas the correlation between MMSE and TMT-Be approached, but fell short of statistical significance. The reason for this difference is unclear but two possibilities seem plausible. Firstly, the correlation between TMT-Be and MMSE was moderate and on the cusp of statistical significance, suggesting the possibility that the lack of significance was due to the small sample size. Secondly, the differing results could reflect the impact of score distribution characteristics of TMT-Be and TMT-Bpr on the correlation statistic. Figure 3 shows that TMT-Bpr scores are relatively unevenly distributed across the range of scores with most scores hovering around 500 s and a few more extreme values. In contrast, TMT-Be scores are more evenly distributed across their range. The distribution of TMT-Bpr scores produces a steeper regression line compared with the more evenly distributed TMT-Be scores, hence contributing to the higher TMT-Bpr correlation value.

TMT-Bpr can produce the same score in individuals who perform quite differently. Consider, for example, that Examinee 1 produces a TMT-B sequence with 19 correctly completed circles (i.e., 18 correct moves) with 3 errors of commission before time expires as shown below:

Examinee 1 TMT-B sequence:

1—A—2—B—3—C—4—D—5—E—6—F—7—G—8—H—9—[10]—9—I—[K]—I—10—[11] | 300 s

 Where:

 [ ] = incorrect circle

 redir = redirected by examiner (after the incorrect circles [10] and [K])

 | = 300-s time limit reached

 errors per Mahurin et al. (2006): 9–10 = perseverative; I–K = sequencing; 10–11 = perseverative

This performance results in a prorated raw score of 394.7 s (i.e., TMT-Bpr = 300 s/19 circles completed × 25 total circles = 394.7 s). Now consider that Examinee 2 produces an errorless TMT-B sequence with 19 completed circles (i.e., gets to circle ‘10’) when time expires (i.e., 1-A-2-B-3-C-4-D-5-E-6-F-7-G-8-H-9-I-10). In this situation, Examinee 1 and Examinee 2 obtain the same TMT-Bpr score (394.7 s) even though Examinee 1 appeared to have greater difficulty maintaining the alternating response set, as evidenced by the errors. (Note that the Heaton manual does not provide explicit instructions on how errors are incorporated into the prorated score. We have counted only correct circles in these examples.) Considering our TMT-Be formula, both examinees produce 18 correct moves and 6 unattempted moves, but Examinee 1 produces 3 errors of commission. Accordingly, Examinee 1 achieves a TMT-Be score of 20.29 (i.e., TMT-Be = 300 s/(24 − 3 Ec) + (0 Eo + 6 Mu) = 20.29) and Examinee 2 achieves a TMT-Be score of 18.50 (i.e., TMT-Be = 300 s/(24 − 0 Ec) + (0 Eo + 6 Mu) = 18.50). The higher TMT-Be score for Examinee 1 captures the more errant performance compared with Examinee 2, whereas the TMT-Bpr scores do not reflect this potentially clinically meaningful information.
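The arithmetic in this comparison can be reproduced with a minimal Python sketch. The two functions simply restate the calculations as they are written out in the example (time per correct circle times 25 for TMT-Bpr; T/(24 − Ec) + (Eo + Mu) for TMT-Be). The function and variable names are our own illustrative choices, and the sketch is not a substitute for the formula derivation given earlier in the paper or for the Heaton manual.

```python
TOTAL_CIRCLES = 25  # circles on the TMT-B sheet (1-A-2-B ... L-13)
TOTAL_MOVES = 24    # moves needed to connect all 25 circles


def tmt_bpr(time_s, correct_circles):
    """Prorated score as used in the example: time per correct circle times 25."""
    return time_s / correct_circles * TOTAL_CIRCLES


def tmt_be(time_s, commission_errors, omission_errors, unattempted_moves):
    """Efficiency score as used in the example: T / (24 - Ec) + (Eo + Mu)."""
    return time_s / (TOTAL_MOVES - commission_errors) + (omission_errors + unattempted_moves)


# Examinee 1: 19 correct circles, 3 commission errors, 6 unattempted moves at 300 s.
examinee_1 = {"bpr": tmt_bpr(300, 19), "be": tmt_be(300, 3, 0, 6)}
# Examinee 2: same progress with no errors.
examinee_2 = {"bpr": tmt_bpr(300, 19), "be": tmt_be(300, 0, 0, 6)}

for label, scores in (("Examinee 1", examinee_1), ("Examinee 2", examinee_2)):
    print(f"{label}: TMT-Bpr = {scores['bpr']:.1f} s, TMT-Be = {scores['be']:.2f}")
```

Running the sketch shows identical prorated scores for the two examinees and a higher efficiency score for the examinee who made commission errors, which is the contrast the example is meant to draw.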

It is also reasonable to consider that the TMT-B sequences used in the foregoing example actually come from the same examinee tested at different time points. In this situation, the TMT-Bpr scores would be the same, whereas the TMT-Be score has the advantage of reflecting the presumed decline in cognitive set-switching ability experienced by the examinee between test administrations (even though both performances are severely impaired normatively). These examples highlight the advantage of TMT-Be over TMT-Bpr for capturing possibly meaningful clinical information about the cognitive functioning of two impaired individuals or the same individual at different time points.

One other advantage of our TMT-Be metric is that it can be computed in situations where the task is terminated prematurely, for example, when the examinee loses the response set, is unable to regain it despite redirection from the examiner, and the examiner honors their request to terminate the test. This might occur in clinical practice to avoid damaging rapport by requiring an examinee to work to the time limit when it becomes clear early in the task that the examinee is too cognitively impaired to manage it (e.g., examinee is unable to progress beyond G despite knowing the alphabet fluently). The Heaton manual is unclear as to how such situations map onto the TMT-Bpr score and whether it can be interpreted. The TMT-Be metric, however, could be used in this situation because it accounts for unattempted moves.

The current results were obtained in the context of elderly individuals with and without presumed AD. Further validation of the TMT-Be metric in a larger sample and in other conditions will more fully characterize its stability and utility. The data for this study were not collected explicitly for the purpose of developing and testing this metric, and so we did not gather specific measures to further test congruent and discriminant validity. However, the two metrics are simply different ways of quantifying the same performance and so should measure the same underlying construct. Support for this is seen in the strong association between TMT-Bs and TMT-Be, and their similar pattern of correlations with other cognitive measures suggests that the two measures have similar congruent and discriminant validity. As noted, we did not characterize or analyze specific error types other than the broad characterization of errors of commission or omission. Future studies of the TMT-Be metric might determine whether more fine-grained error analysis contributes to its research or clinical utility. One potential limitation of the utility of TMT-Be is that its interpretation as an indicator of cognitive status may be flawed in individuals who fail TMT-B (i.e., obtain TMT-Bs scores of 300 s) due to lack of familiarity with the alphabet. Researchers and clinicians using the TMT-Be metric should verify alphabet knowledge in individuals who fail TMT-B. Also, our control group had a significantly higher level of education than the patient group, raising the possibility that education effects could have influenced the results. Lastly, the greatest utility of the TMT-Be metric is for use in individuals who have cognitive impairment, particularly those who fail TMT-B. The extent to which the metric provides incremental validity or interpretive power for TMT-B completers is uncertain.

In summary, we present a new metric for scoring TMT-B performance that is designed to capture performance variability among individuals who fail to complete the test in the standard allotted time. Our results show the metric correlates strongly with the TMT-Bs metric and correlates similarly with other cognitive measures in healthy controls and individuals with presumed AD. This metric has potential utility in research and clinical settings to capture variability otherwise lost to floor effects.

Funding

This work was supported by Alzheimer's Association (NIRG-09-131008): Investigation of Myelin Loss Associated with Alzheimer's Dementia; SD, Principal Investigator; and the Department of Veterans Affairs. The contents of this paper do not represent the views of the Department of Veterans Affairs or the United States Government.

Conflict of Interest

None declared.

Appendix

Table A1.

Simulated data. TMT-Be scores derived from hypothetical TMT-B performances

Mc    Ec    Eo + Mu    Mt    T      TMT-Be

Task completers with different times
24    0     0          24    40     1.67
24    0     0          24    60     2.50
24    0     0          24    80     3.33
24    0     0          24    100    4.17
24    0     0          24    120    5.00
24    0     0          24    140    5.83
24    0     0          24    160    6.67
24    0     0          24    180    7.50
24    0     0          24    200    8.33
24    0     0          24    300    12.50

Maximum time (300 s) with different levels of completion
24    0     0          24    300    12.50
20    0     4          20    300    16.50
16    0     8          16    300    20.50
12    0     12         12    300    24.50
8     0     16         8     300    28.50
4     0     20         4     300    32.50
2     0     22         2     300    34.50
1     0     23         1     300    35.50

Maximum time (300 s) with different levels of completion and commission errors
24    0     0          24    300    12.50
24    1     0          25    300    13.04
24    2     0          26    300    13.64
24    3     0          27    300    14.29
16    0     8          16    300    20.50
16    1     8          17    300    21.04
16    2     8          18    300    21.64
16    3     8          19    300    22.29
8     0     16         8     300    28.50
8     1     16         9     300    29.04
8     2     16         10    300    29.64
8     3     16         11    300    30.29
4     0     20         4     300    32.50
4     1     20         5     300    33.04
4     2     20         6     300    33.64
4     3     20         7     300    34.29
2     0     22         2     300    34.50
2     1     22         3     300    35.04
2     2     22         4     300    35.64
2     3     22         5     300    36.29
1     0     23         1     300    35.50
1     1     23         2     300    36.04
1     2     23         3     300    36.64
1     3     23         4     300    37.29

Notes: Mc = correct moves; Ec = errors of commission; Eo + Mu = errors of omission plus unattempted moves; Mt = total moves; T = time; TMT-Be = TMT-B efficiency score.
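The simulated scores in Table A1 can be regenerated with a short Python sketch. It assumes the calculation used in the worked examples in the Discussion, TMT-Be = T/(24 − Ec) + (Eo + Mu), and it further assumes that every move not made correctly is counted as omitted or unattempted (Eo + Mu = 24 − Mc) and that total moves are Mt = Mc + Ec. The helper names are hypothetical and chosen only for illustration.

```python
TOTAL_MOVES = 24  # moves needed to connect all 25 TMT-B circles


def tmt_be(time_s, commission_errors, omitted_or_unattempted):
    """TMT-Be as applied in the worked examples: T / (24 - Ec) + (Eo + Mu)."""
    return time_s / (TOTAL_MOVES - commission_errors) + omitted_or_unattempted


def simulate_row(correct_moves, commission_errors, time_s):
    """Build one table row; assumes moves not made correctly count as Eo + Mu."""
    eo_mu = TOTAL_MOVES - correct_moves
    return {
        "Mc": correct_moves,
        "Ec": commission_errors,
        "Eo+Mu": eo_mu,
        "Mt": correct_moves + commission_errors,  # assumed: total = correct + commission
        "T": time_s,
        "TMT-Be": round(tmt_be(time_s, commission_errors, eo_mu), 2),
    }


# Completers: all 24 moves correct, varying completion time.
completers = [simulate_row(24, 0, t) for t in (40, 60, 80, 100, 120, 140, 160, 180, 200, 300)]
# Maximum time with different levels of completion and 0 to 3 commission errors.
noncompleters = [simulate_row(mc, ec, 300) for mc in (24, 16, 8, 4, 2, 1) for ec in range(4)]

for row in completers + noncompleters:
    print(row)
```

Under these assumptions the score rises with longer times, with fewer correct moves, and with each added commission error, which is the ordering the table is intended to convey.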

References

Amieva, H., Lafont, S., Auriacombe, S., Rainville, C., Orgogozo, J. M., Dartigues, J. F., et al. (1998). Analysis of error types in the trail making test evidences an inhibitory deficit in dementia of the Alzheimer type. Journal of Clinical and Experimental Neuropsychology, 20, 280–285.
Arbuthnott, K., & Frank, J. (2000). Trail making test, part B as a measure of executive control: Validation using a set-switching paradigm. Journal of Clinical and Experimental Neuropsychology, 22, 518–528.
Army Individual Test Battery. (1944). Manual of directions and scoring. Washington, DC: War Department, Adjutant General's Office.
Ashendorf, L., Jefferson, A. L., O'Connor, M. K., Chaisson, C., Green, R. C., & Stern, R. A. (2008). Trail Making Test errors in normal aging, mild cognitive impairment, and dementia. Archives of Clinical Neuropsychology, 23, 129–137.
Benedict, R., Schretlen, D., Groninger, L., & Brandt, J. (1998). Hopkins Verbal Learning Test—Revised: Normative data and analysis of inter-form and test-retest reliability. The Clinical Neuropsychologist, 12, 43–55.
Benton, A., & Hamsher, K. (1989). Multilingual aphasia examination. Iowa City, IA: AJA Associates.
Chan, E., MacPherson, S. E., Robinson, G., Turner, M., Lecce, F., Shallice, T., et al. (2015). Limitations of the trail making test part-B in assessing frontal executive dysfunction. Journal of the International Neuropsychological Society, 21, 169–174.
Davidson, P. S., Gao, F. Q., Mason, W. P., Winocur, G., & Anderson, N. D. (2008). Verbal fluency, trail making, and Wisconsin Card Sorting Test performance following right frontal lobe tumor resection. Journal of Clinical and Experimental Neuropsychology, 30, 18–32.
Delis, D. C., Kaplan, E., & Kramer, J. H. (2001). Delis-Kaplan executive function system (D-KEFS). San Antonio, TX: The Psychological Corporation.
Folstein, M., Folstein, E., & McHugh, P. (1975). Mini-Mental State Exam: A practical method for grading the cognitive status of patients. Journal of Psychiatric Research, 12, 189–198.
Giovagnoli, A. R., Del Pesce, M., Mascheroni, S., Simoncelli, M., Laiacona, M., & Capitani, E. (1996). Trail making test: Normative values from 287 normal adult controls. Italian Journal of Neurological Sciences, 17, 305–309.
Heaton, R. K., Miller, S. W., Taylor, M. J., & Grant, I. (2004). Revised comprehensive norms for an expanded Halstead Reitan battery: Demographically adjusted neuropsychological norms for African Americans and Caucasian adults. Lutz, FL: Psychological Assessment Resources.
Jurica, P. J., Leiten, C. L., & Mattis, S. (2001). Dementia Rating Scale-2 (DRS-2). Lutz, FL: Psychological Assessment Resources.
Kaplan, E., Goodglass, H., & Weintraub, S. (1983). The Boston Naming Test (2nd ed.). Philadelphia: Lea and Febiger.
Mahurin, R. K., Velligan, D. I., Hazleton, B., Mark Davis, J., Eckert, S., & Miller, A. L. (2006). Trail making test errors and executive function in schizophrenia and depression. Clinical Neuropsychology, 20, 271–288.
Moll, J., de Oliveira-Souza, R., Moll, F. T., Bramati, I. E., & Andreiuolo, P. A. (2002). The cerebral correlates of set-shifting: An fMRI study of the trail making test. Arquivos de Neuro-psiquiatria, 60, 900–905.
Nadler, I. M., & Ryan, T. T. (1984). The Trail Making Test and Canter Background Interference Procedure in screening for organicity in chronic schizophrenia: A preliminary report. Perceptual and Motor Skills, 59, 403–406.
Partington, J., Leiter, J. E., & Graydon, R. (1949). Partington's Pathways Test. Psychological Service Center Journal, 1, 11–20.
Rabin, L. A., Barr, W. B., & Burton, L. A. (2005). Assessment practices of clinical neuropsychologists in the United States and Canada: A survey of INS, NAN, and APA Division 40 members. Archives of Clinical Neuropsychology, 20, 33–65.
Reitan, R. M. (1955). An investigation of the validity of Halstead's measures of biological intelligence. Archives of Neurology & Psychiatry, 73, 28–35.
Reitan, R. M. (1958). Validity of the Trail Making Test as an indicator of organic brain damage. Perceptual and Motor Skills, 8, 271–276.
Reitan, R. M., & Wolfson, D. (1985). The Halstead-Reitan Neuropsychological Test Battery. Tucson, AZ: Neuropsychology Press.
Reitan, R. M., & Wolfson, D. (1993). The Halstead-Reitan Neuropsychological Test Battery: Theory and clinical interpretation. Tucson, AZ: Neuropsychology Press.
Ruffolo, L. F., Guilmette, T. J., & Willis, G. W. (2000). Comparison of time and error rates on the trail making test among patients with head injuries, experimental malingerers, patients with suspect effort on testing, and normal controls. Journal of Clinical Neuropsychology, 14, 223–230.
Strauss, E., Sherman, E. M. S., & Spreen, O. (2006). A compendium of neuropsychological tests: Administration, norms, and commentary (3rd ed.). New York: Oxford.
Stuss, D. T., Bisschop, S. M., Alexander, M. P., Levine, B., Katz, D., & Izukawa, D. (2001). The Trail Making Test: A study in focal lesion patients. Psychological Assessment, 13, 230–239.
Tombaugh, T. N. (2004). Trail Making Test A and B: Normative data stratified by age and education. Archives of Clinical Neuropsychology, 19, 203–214.
Yochim, B., Baldo, J., Nelson, A., & Delis, D. C. (2007). D-KEFS Trail Making Test performance in patients with lateral prefrontal cortex lesions. Journal of the International Neuropsychological Society, 13, 704–709.
Zakzanis, K. K., Mraz, R., & Graham, S. J. (2005). An fMRI study of the Trail Making Test. Neuropsychologia, 43, 1878–1886.