Abstract

The Children's Category Test (CCT) is a widely used measure of problem solving with adequate psychometric properties. Yet, Shriver and Vacc (n.d.) were fairly critical of the CCT in The Mental Measurements Yearbook and highlighted its limitations. Thus, to explore the clinical validity of the widely used CCT-Level 2 (CCT-2) version, results of that test were analyzed post hoc in a sample of 265 children with mixed etiology referred for neuropsychological testing at a private outpatient laboratory. Overall, the CCT-2 correctly classified 57.7% of the sample, with 72.2% accuracy in classifying the Neuropsychologically Normal Clinical Comparison group but only 54% for the Brain Injured group. When the Brain Injury group was subdivided, predictive power fell further to 27.2%, with the best predictions coming for the Mental Retardation (MR) group (58.3%) and the lowest for the Learning Disorder NOS group (2.5%). The current findings suggest that the CCT-2 should be applied clinically with caution.

Clinical Validity of the Children's Category Test

The Halstead Category Test (HCT; Halstead, 1947) has been in use for neuropsychological assessment of children and adults for over 60 years (Choca, Laatsch, Wetzel, & Agresti, 1997). The adult Category Test remains one of the most widely used neuropsychological tests ever developed for both clinical neuropsychological (Butler, Retzlaff, & Vanderploeg, 1991; Rabin, Barr, & Burton, 2005) and forensic neuropsychological (Lees-Haley, Smith, Williams, & Dunn, 1996) assessment. Its longevity is attributed to its unparalleled sensitivity in the detection of brain injury and neurological disease processes (Adams & Grant, 1986; Allen, Goldstein, & Mariano, 1999; Mercer, Harrell, Miller, Childs, & Rocker, 1997), with reported accuracy of upward of 94% (Reitan, 1955).

The Children's Category Test (CCT; Boll, 1993) is an abbreviated booklet version derived from the Intermediate HCT (Reed, Reitan, & Kløve, 1965) for use with children aged 5 through 16. Boll's version is separated into Level 1 for children aged 5–8 (CCT-1) and Level 2 for children aged 9–16 (CCT-2). The CCT is a test of problem solving (Nesbit-Greene & Donders, 2002) that was standardized on a large representative sample (N = 920) and has adequate internal consistency, criterion validity, and psychometric properties (Boll, 1993). Yet, Shriver and Vacc (n.d.) were fairly critical of the CCT in The Mental Measurements Yearbook and highlighted its limitations. Specifically, they noted that little, if any, validation research exists documenting the ability of the CCT to differentiate normal from neuropsychologically compromised children. Furthermore, the validity research that is available is largely restricted to traumatic brain injury (TBI; Allen, Knatz, & Mayfield, 2006; Donders & Giroux, 2005; Donders & Nesbit-Greene, 2004; Hoffman, 1998; Hoffman, Donders, & Thompson, 2000; Miller & Donders, 2003). To date, only one study has evaluated the sensitivity of the CCT-2 among a mixed sample of neurologically impaired children (Bello, Allen, & Mayfield, 2008). The authors concluded that, as in previous studies involving the adult and child versions of the HCT, the CCT-2 is a multidimensional instrument. However, although their results suggest that the CCT-2 is a psychometrically sound instrument, it does not appear to be particularly sensitive to either structural brain damage or neurodevelopmental disorders such as attention deficit/hyperactivity disorder (ADHD), and the exact reason for this lack of sensitivity is unclear. As such, they caution against using the CCT-2's composite T-score, factor scores, and subtest scores for clinical or research applications that aim to make inferences about abstraction and problem-solving abilities.

The aim of the present study was multifocal. First and foremost, we hoped to resolve the paradox between the sound psychometric properties of the CCT-2, along with its assumed high degree of sensitivity to brain damage, and its inconsistent ability to discriminate brain-injured from non-brain-injured children in clinical practice. Based on previous work with the adult Category Test (e.g., Reitan, 1955), we hypothesized that the CCT-2 would discriminate between brain-injured and non-brain-injured participants with at least 80% accuracy. Second, this study attempted to improve the accuracy of the CCT-2 in discriminating between brain-injured and normal participants by exploring its underlying constructs to see whether different factor solutions improve sensitivity. Based on the works of Donders (1999) and Nesbit-Greene and Donders (2002), we expected to find a three-factor solution differentially evaluating the abilities to count, reason proportionally, and determine the odd variable within a configuration. Third, to further validate the psychometric properties of the instrument, the CCT-2 was compared with the Wechsler Intelligence Scale for Children-Third Edition (WISC-III; Wechsler, 1991). More specifically, we assessed whether the CCT-2 T-scores and factor scores demonstrate a significant relationship with the WISC-III Full-Scale IQ (FSIQ), the Verbal IQ (VIQ), the Performance IQ (PIQ), the WISC-III General Ability Index (GAI), and the WISC-III Index scores (Verbal Comprehension Index [VCI], Perceptual Organization Index [POI], Freedom from Distractibility Index [FDI], and Processing Speed Index [PSI]). The WISC-III was chosen because it is a well-established measure of psychometric intelligence and neurocognitive functioning. According to Carr, Sweet, and Rossini (1986), a comparison of widely used tests such as the Wechsler intelligence tests to newly developed tests “can provide clinicians with a frame of reference within which to judge the relative merits of newer tests” (p. 354). We hypothesized that the CCT-2 and the WISC-III IQ and Index scores would demonstrate significant convergent validity, and that the WISC-III scores would have a statistically significant inverse relationship with the total error T-score as well as the factor scores: as the number of total CCT-2 errors increases, showing greater impairment in neurocognitive functioning, the WISC-III scores would decrease, showing a similar pattern of reduced neurocognitive functioning. This hypothesis is based on the established principle that different test instruments measuring “g” are correlated. Furthermore, as seen in previous studies (e.g., Shore, Shore, & Phil, 1971), we expected the CCT-2 T-score to correlate best with the POI, as both are assumed measures of fluid intelligence.

Method

Participants

A sample of 265 children of mixed gender was retrospectively selected from a database of outpatient pediatric patients assessed at a private neuropsychological practice over the course of 15 years. Referrals for testing came from a variety of sources, including regional pediatric medical centers and regional school districts.

Participants were assigned to five groups based on the types of impairment commonly seen among school-based referrals for neuropsychological testing. Group 1 comprised children with a diagnosis of learning disorder (LD), not otherwise specified, or ADHD (LD-NOS = 33 and ADHD = 7; n = 40). Group 2 comprised children with a diagnosis of verbally based LD (LD-Reading = 47; LD-Writing = 6; LD-CAPD [central auditory processing disorder] = 28; n = 81). Group 3 comprised children with a diagnosis of MR (this group was predominantly in the mild MR range; n = 48). Group 4 comprised children with Structural Brain Injury (SBI) documented in medical records (n = 42), including congenital brain impairment (CBI = 27), TBI (TBI = 6; all participants experienced a closed head trauma of mild severity), seizure disorder (Sz = 7), and cerebral vascular accident (CVA = 2; both participants experienced a left hemisphere hemorrhagic stroke). Group 5 was a Neuropsychologically Normal Clinical Comparison group (NC; n = 54), whose test scores were within normal limits for all domains tested and who had no psychiatric diagnoses. The overall results were verified by two independent neuropsychologists who determined the absence of pathology. All participants had completed a full neuropsychological evaluation that included the administration of the CCT-2 and the WISC-III within the same referral period, and all were aged 9–16 at the time of the neuropsychological assessment. Demographic data for the total sample and the NC group are presented in Table 1, and those for the brain injury subgroups are presented in Table 2. The participants' mean age was 11.92 years (SD = 2.11), 62.6% were male, 64.9% were Caucasian, 32.1% were African American, and 3% were mixed Hispanic and Asian. The mean WISC-III FSIQ for the sample (n = 265) was 87.50 with a standard deviation of 17.47.

Table 1.

Demographic and IQ data for total sample and Neuropsychologically Normal Clinical Comparison groups

| Variables | Total | Neuropsychologically Normal Clinical Comparison group |
| --- | --- | --- |
| Age (mean [SD]) | 11.58 (2.11) | 11.96 (2.18) |
| WISC-III FSIQ (mean [SD]) | 87.50 (17.47) | 103.94 (12.28) |
| Gender (n [%]) | | |
| Male | 166 (62.6) | 37 (68.5) |
| Female | 99 (37.4) | 17 (31.5) |
| Ethnicity (n [%]) | | |
| Caucasian | 172 (64.9) | 38 (70.3) |
| African American | 85 (32.1) | 15 (27.7) |
| Other | 8 (3.0) | 1 (2.0) |

Note: WISC = Wechsler Intelligence Scale for Children.

Table 2.

Demographic and IQ data for brain injury groups

| Variables | LD | VLD | MR | SBI |
| --- | --- | --- | --- | --- |
| Age (mean [SD]) | 11.58 (2.11) | 11.92 (2.11) | 12.08 (2.14) | 12.02 (2.20) |
| WISC-III FSIQ (mean [SD]) | 84.73 (11.58) | 89.60 (12.79) | 64.23 (7.21) | 90.55 (15.17) |
| Gender (n [%]) | | | | |
| Male | 23 (57.5) | 47 (58.0) | 30 (62.5) | 29 (62.5) |
| Female | 17 (42.5) | 34 (42.0) | 18 (37.5) | 13 (31.0) |
| Ethnicity (n [%]) | | | | |
| Caucasian | 26 (65.0) | 53 (65.4) | 22 (45.8) | 33 (78.6) |
| AA | 13 (32.5) | 22 (27.2) | 26 (54.2) | 9 (21.4) |
| Other | 1 (2.5) | 6 (7.4) | 0 (0) | 0 (0) |

Notes: WISC = Wechsler Intelligence Scale for Children; LD = Learning Disorder; VLD = Verbal Learning Disorder; MR = Mental Retardation; SBI = Structural Brain Injury; AA = African American; FSIQ = Full-Scale IQ.

Instruments

The CCT (Boll, 1993) is an individually administered instrument designed to assess non-verbal learning and memory, concept formation, and problem solving. The CCT-2 (Boll, 1993) is for use with children between the ages of 9–16 and has six subtests and 83 total items. The test items are constructed on a single sorting principle, with the exception of the last subtest, which requires the child to remember and reapply the conceptual rules from the previous subtests. After each answer, the child is required to identify the sorting principle and apply it to the subsequent items within the given subtest. Immediate feedback is provided in the form of “yes” or “no,” allowing the child to adapt and learn the given sorting principle.

The present study focuses exclusively on the CCT-2, which employs a flipbook with one discrete stimulus item per page. The child responds by pointing to or verbally indicating 1, 2, 3, or 4, which are printed on a response card (Boll, 1993). The average internal consistency of this measure is 0.86, its average standard error of measurement is 3.74, and its test–retest reliability is 0.70, suggesting adequate reliability (Boll, 1993).
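As a consistency check, the reported standard error of measurement can be recovered from the reported internal consistency using the classical test theory relation SEM = SD × sqrt(1 − reliability). The sketch below assumes that the SEM is expressed in T-score units (SD = 10); that assumption and the function name are ours and are not taken from the test manual.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Classical test theory: SEM = SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


# Assumption: scores are on the T-score metric (mean 50, SD 10).
T_SCORE_SD = 10.0
sem = standard_error_of_measurement(T_SCORE_SD, reliability=0.86)
print(f"SEM = {sem:.2f}")  # SEM = 3.74
```

Under these assumptions the formula reproduces the reported value of 3.74.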

The WISC-III is an individually administered measure of psychometric intelligence that yields an FSIQ, VIQ, and PIQ. Individual index scores are also generated: the VCI, POI, FDI, and PSI. Optionally, the GAI can be calculated by summing the VCI and POI scores. The WISC-III has an excellent standardization sample (2,200 participants aged 6–16 years) that closely matches the 1988 U.S. population on the stratification variables of race, ethnicity, geographic region, socioeconomic status, and gender.

Reliability coefficients are high for the VIQ (r = .95), PIQ (r = .91), and FSIQ (r = .96) scores, somewhat lower for the index scores (VCI = 0.94; POI = 0.90; FDI = 0.87; and PSI = 0.85), and lower still for the subtests (median average internal reliability across the ages = 0.78, range = 0.69–0.87).

Procedure

The CCT-2 and the WISC-III (Wechsler, 1991) were routinely administered to all pediatric patients referred for evaluation at a private clinic as part of a semi-flexible neuropsychological test battery. A licensed psychologist administered all testing, and all materials were de-identified prior to entry into a Statistical Package for the Social Sciences (SPSS) database.

The CCT-2 and WISC-III were administered and scored in a standardized manner. All neuropsychological evaluations were requested by interested parties to ascertain the neurocognitive functioning of each pediatric patient. Diagnostic categorization was based on the child's overall performance across neuropsychological testing and, for applicable participants, a review of available medical histories. Once appropriate candidates were selected for inclusion in this study, all diagnoses were re-evaluated without the CCT. Two different neuropsychologists examined the relevant data and arrived at independent diagnostic conclusions. In the event of conflicting diagnoses, provisions were made for a third neuropsychologist to evaluate the contested data and provide a diagnosis. If the third neuropsychologist arrived at a different conclusion than the two primary diagnosticians, the participant was to be removed from the study. No conflicts were found among the neuropsychologists with respect to diagnosis, and all appropriate participants were retained for this study.

Data Analysis

The total number of errors on each subtest and the total number of errors across subtests on the CCT-2 (along with their age-corrected T-scores) and the WISC-III Indexes were used in all statistical analyses.

To determine the underlying constructs and latent structure of the CCT-2, a principal factor analysis was computed using the raw error scores for each subtest. Based on previous research, it was expected that the factors would correlate with each other, and to maximize the variance of the loadings on the factors, varimax rotation was applied. The number of factors was determined using the Kaiser–Guttman criterion and previous research, which included factors with Eigenvalues approximating 1 or greater (Bello et al., 2008). Once the factor structures were identified, their factor scores were calculated and their sensitivity to brain damage evaluated through a discriminant analysis comparing the larger Brain Injury group as well as the smaller brain injury subgroups with the NC group. A similar discriminant analysis was computed for the CCT-2 T-scores. Correlations were also examined between the CCT-2 factor scores, the T-scores, and the WISC-III Index scores, including FSIQ and the GAI to further examine the criterion validity of the instrument.

Results

Principal Component Analyses

In keeping with the assumed multidimensionality of the Category Test (e.g., Allen, Goldstein, & Mariano, 1999), a principal component analysis (PCA) of the CCT-2 (Boll, 1993) raw subtest errors was computed for the entire sample. A second PCA was completed without Subtests 1 and 2, given that they were markedly skewed and kurtotic. Additionally, previous research indicates that these subtests contribute little to the overall sensitivity of the CCT-2, as few children miss any of the items (Bello et al., 2008; Donders, 1999; Nesbit-Greene & Donders, 2002). Error rates for Subtest 1 ranged from 0 to 5 with a mean of 0.5 and a standard deviation of 1. Error rates for Subtest 2 ranged from 0 to 5 with a mean of 0.28 and a standard deviation of 0.76. The error means and standard deviations of both subtests are consistent with the literature at large (e.g., Bello et al., 2008). The results of the two analyses are presented in Table 3, which contains the rotated component matrix, Eigenvalues, and percent variance accounted for by each factor.

Table 3.

Principal component analysis for the CCT-2

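| CCT-2 subtests | Factor 1 (a) | Factor 2 (a) | Factor 3 (a) | Factor 1 (b) | Factor 2 (b) |
| --- | --- | --- | --- | --- | --- |
| Subtest 1 | 0.335 | 0.096 | 0.603 | — | — |
| Subtest 2 | −0.10 | 0.010 | 0.884 | — | — |
| Subtest 3 | −0.040 | 0.943 | 0.011 | −0.027 | 0.942 |
| Subtest 4 | 0.896 | −0.026 | −0.006 | 0.899 | −0.053 |
| Subtest 5 | 0.685 | 0.399 | 0.208 | 0.737 | 0.393 |
| Subtest 6 | 0.505 | 0.756 | 0.122 | 0.529 | 0.750 |
| % variance | 39.98 | 18.15 | 16.61 | 40.82 | 40.16 |
| Eigenvalue (c) | 2.34 | 1.09 | 1.00 | 2.23 | 1.00 |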

Note: CCT-2 = Children's Category Test-Level 2.

(a) Total variance accounted for = 74.75% (Subtests 1 and 2 included).

(b) Total variance accounted for = 80.98% (Subtests 1 and 2 excluded).

(c) All Eigenvalues are rounded to two decimal places.

When all the subtests were included and the Eigenvalue threshold was set at 1 or above, a two-factor solution was identified. This factor structure accounted for 58.14% of the total variance. However, a third factor with an Eigenvalue of 0.997 was also present, and, given its close approximation of 1 and examination of the scree plot, it was included in the factor solution. The new three-factor solution, consistent with previous findings (Bello et al., 2008; Donders, 1999), accounted for 74.75% of the total variance. When Subtests 1 and 2 were excluded from the PCA, two factors were identified, which accounted for 80.98% of the total variance.
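The percent variance attributed to each rotated factor in the two-factor solution can be approximately recovered from the loadings in Table 3: the squared loadings in a column are summed and divided by the number of subtests entered. The sketch below is illustrative only; the variable and function names are ours, and small discrepancies from the tabled values are expected because the published loadings are rounded to three decimals.

```python
# Rotated loadings for the solution that excludes Subtests 1 and 2 (Table 3).
LOADINGS_TWO_FACTOR = {
    "Subtest 3": (-0.027, 0.942),
    "Subtest 4": (0.899, -0.053),
    "Subtest 5": (0.737, 0.393),
    "Subtest 6": (0.529, 0.750),
}


def percent_variance(loadings):
    """Percent of total variance per factor: sum of squared loadings / number of variables."""
    rows = list(loadings.values())
    n_variables = len(rows)
    n_factors = len(rows[0])
    return [
        100.0 * sum(row[j] ** 2 for row in rows) / n_variables
        for j in range(n_factors)
    ]


per_factor = percent_variance(LOADINGS_TWO_FACTOR)
for index, value in enumerate(per_factor, start=1):
    print(f"Factor {index}: {value:.2f}% of variance")
print(f"Total: {sum(per_factor):.2f}%")
```

With the tabled loadings, the two factors account for roughly 40.8% and 40.2% of the variance, which sums to approximately the 80.98% reported for this solution.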

The PCA including all subtests revealed three factors. The first factor, previously named the “Proportional Reasoning” Factor (PRF; Bello et al., 2008), which requires the participant to determine the proportion of the item that has solid parts, was composed primarily of Subtests 4 and 5 with a smaller contribution from Subtest 6. The second factor, previously named the “Oddity Reasoning” Factor (ORF; Bello et al., 2008), which requires the individual to identify the sequential position of the non-matching figure, was identified primarily by Subtest 3 and to a lesser extent Subtest 6. The third factor, previously referred to as the “Count” factor (Johnstone, Holland, & Hewett, 1997), which requires the individual to count either the Roman numeral or the number of figures on the card, was identified by Subtests 1 and 2. In the two-factor analysis, Subtest 6 loaded on both Factors 1 and 2. This is expected, given that this subtest reviews principles required to successfully complete the preceding subtests.

To determine the construct validity of the CCT-2, Pearson's correlation coefficients were calculated between the CCT-2 T-score and factor scores and the WISC-III IQ and Index scores. The PSI score was not included in the comparisons because of the exclusion of the Symbol Search subtest, which is not used in computing the FSIQ.

Correlations with IQ

Correlations of the CCT-2 T-scores and factor scores with the IQ and Index scores from the WISC-III are presented in Table 4. The entire sample (n = 265) was used in the comparison. The T-score was best correlated with the FSIQ (r = .55, p < .01, r² = .30), followed by the GAI (r = .54, p < .01, r² = .29). Of the indexes, the POI demonstrated the highest correlation with the CCT-2 Composite T-scores (r = .52, p < .01, r² = .27), followed by the VCI (r = .47, p < .01, r² = .22) and then the FDI (r = .41, p < .01, r² = .17).
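The r² values reported alongside each correlation are coefficients of determination, that is, the proportion of variance shared by the two measures. A minimal sketch of that arithmetic for the Composite T-score correlations follows; the dictionary and function names are ours.

```python
def shared_variance(r: float) -> float:
    """Coefficient of determination: proportion of variance two measures share."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    return r * r


# Correlations between the CCT-2 Composite T-score and WISC-III scores (Table 4).
T_SCORE_CORRELATIONS = {"FSIQ": 0.55, "GAI": 0.54, "POI": 0.52, "VCI": 0.47, "FDI": 0.41}

for measure, r in T_SCORE_CORRELATIONS.items():
    print(f"{measure}: r = {r:.2f}, r^2 = {shared_variance(r):.2f}")
```

Read this way, even the largest correlation (FSIQ) corresponds to about 30% shared variance.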

Table 4.

Correlations among the CCT and the WISC-III

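| Measure | PRF | ORF | T-score | Raw Score |
| --- | --- | --- | --- | --- |
| FSIQ | −.48* | −.42* | .55* | −.52* |
| GAI | −.47* | −.41* | .54* | −.51* |
| POI | −.47* | −.39* | .52* | −.50* |
| VCI | −.40* | −.36* | .47* | −.44* |
| FDI | −.39* | −.27* | .41* | −.39* |
| PRF | 1.00 | .59* | −.86* | .91* |
| ORF | .59* | 1.00 | −.83* | .85* |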

Notes: CCT = Children's Category Test; WISC = Wechsler Intelligence Scale for Children; PRF = Proportional Reasoning Factor; ORF = Oddity Reasoning Factor; T-Score = Composite T-score for the CCT; Raw Score = Raw Error Scores for the CCT; FSIQ = Full-Scale IQ for the WISC-III; GAI = General Ability Index for the WISC-III; POI = Perceptual Organization Index for the WISC-III; VCI = Verbal Comprehension Index for the WISC-III; FDI = Freedom from Distractibility Index for the WISC-III.

*p < .01.

Of the two CCT-2 factors, the PRF had the higher correlations with IQ. The PRF was best correlated with the FSIQ (r = −.48, p < .01, r² = .23), followed by the GAI (r = −.47, p < .01, r² = .22), the POI (r = −.47, p < .01, r² = .22), the VCI (r = −.40, p < .01, r² = .16), and the FDI (r = −.39, p < .01, r² = .15). The ORF was best correlated with the FSIQ (r = −.42, p < .01, r² = .18), followed by the GAI (r = −.41, p < .01, r² = .17), the POI (r = −.39, p < .01, r² = .15), the VCI (r = −.36, p < .01, r² = .13), and the FDI (r = −.27, p < .01, r² = .07). The two factors were moderately correlated with each other (r = .59, p < .01, r² = .35) and highly correlated with the Total Error T-scores (PRF: r = −.86, p < .01, r² = .74; ORF: r = −.83, p < .01, r² = .69). To determine the criterion validity of the CCT-2, the three primary measures (Composite T-score, PRF, and ORF) were compared using analysis of variance (ANOVA) and a discriminant analysis.

Sensitivity to Neurological Injury

A discriminant analysis was performed for the entire sample by separating the participants into either the Brain Injury or NC group and using group membership as the dependent variable (DV), with Age, Total Errors, the Composite T-score, the PRF, and the ORF as the predictor variables.

Univariate ANOVA revealed that the Brain Injury and NC groups differed significantly in Total Raw Error score, F(1, 263) = 14.00, p < .01; Composite T-score, F(1, 263) = 13.85, p < .01; PRF score, F(1, 263) = 10.72, p = .01; and ORF score, F(1, 263) = 11.03, p < .01; but not in Age, F(1, 278) = 0.33, p = .57.

A single discriminant function was calculated. The value of this function was significantly different for the Brain Injury and NC groups (χ² = 14.656, df = 5, p = .01). The correlations between the predictor variables and the discriminant function suggested that the CCT-2 Composite T-score, Total Raw Error score, and the two factor scores were good predictors for group classification. Overall, the discriminant function successfully predicted outcomes for 57.7% of cases, with accurate predictions made for 72.2% of the NC group and 54% of neurologically impaired participants. The Total Raw Error score demonstrated the highest correlation with the function (r = .96, p < .01, r² = .92), followed by the Composite T-score (r = −.95, p < .01, r² = .90), the ORF (r = .85, p < .01, r² = .72), and then the PRF (r = .83, p < .01, r² = .69).
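The overall hit rate is the average of the per-group hit rates weighted by group size. The following sketch shows that arithmetic with the group sizes given in the Participants section (54 NC and 265 − 54 = 211 Brain Injury participants); the function name is ours, and the per-group rates are the rounded percentages reported above rather than raw classification counts.

```python
def overall_accuracy(groups):
    """Weighted hit rate from (group_size, proportion_correct) pairs."""
    total = sum(size for size, _ in groups)
    correct = sum(size * rate for size, rate in groups)
    return correct / total


TOTAL_N = 265
NC_N = 54
nc_group = (NC_N, 0.722)                      # 72.2% of NC correctly classified
brain_injury_group = (TOTAL_N - NC_N, 0.54)   # 54% of Brain Injury correctly classified

accuracy = overall_accuracy([nc_group, brain_injury_group])
print(f"Overall accuracy: {100 * accuracy:.1f}%")  # Overall accuracy: 57.7%
```

Because the Brain Injury group is almost four times larger than the NC group, the overall figure sits much closer to 54% than to 72.2%.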

To determine the discriminant power of the Total Error T-score and the two factor scores, independent discriminant analyses were completed for each. The Composite T-score alone correctly classified 59.3% of the NC group and 60.2% of the brain injured into their respective groups. The PRF, composed of Subtests 4, 5, and 6, also correctly classified 59.3% of the NC group, but only 55.9% of the brain injured. The ORF, composed of Subtests 3 and 6, also correctly classified 59.3% of the NC group and only 55% of the brain injured into their respective groups.

The CCT-2 is purported by its author to be “a valuable measure of learning ability” that “is increasingly being applied to school populations” (Boll, 1993, p. iii). As such, a more refined post hoc analysis was conducted to explore whether the CCT-2 has differential sensitivity to common disorders within the referral population. The larger brain injury group was divided into four subgroups (LD, Verbal LD [VLD], MR, and SBI), which were compared with one another and with the NC group. An ANOVA was used to evaluate differences in their means on the T-score and factor scores (Table 5). The Composite T-score difference was calculated first. The equality of variances in an assumed normal distribution was verified with Levene's test, F(4, 260) = 1.28, p = .28. The ANOVA revealed a significant effect of group membership on the CCT-2 Composite T-score, F(4, 260) = 10.25, p < .01. Tukey's HSD post hoc test showed a significant difference between the LD group and the MR group (p = .02). The MR group also differed significantly from the SBI (p < .01), NC (p < .01), and VLD (p < .01) groups. There was no significant difference between the LD group and the SBI (p = 1.00), NC (p = .07), or VLD (p = .68) groups. No significant differences were found between the SBI group and the NC (p = .11) or VLD (p = .83) groups either. The NC group was not significantly different from the VLD group (p = .45).

Table 5.

Analysis of variance: Brain Injury subgroups and NC group

| Measure | Total (n = 265) | LD-NOS | MR | SBI | NC | VLD | F (df) | p |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PRF | 15.79 (7.44) | 16.55 (7.32) | 20.23 (6.70) | 15.43 (7.66) | 12.89 (6.20) | 14.91 (7.47) | 7.39 (4, 260) | <.01 |
| ORF | 11.63 (6.45) | 12.03 (6.69) | 15.02 (5.10) | 12.43 (6.82) | 9.07 (5.71) | 10.70 (6.50) | 6.57 (4, 260) | <.01 |
| CTS | 45.88 (11.32) | 45.05 (10.72) | 38.04 (8.68) | 45.60 (11.01) | 50.87 (9.50) | 47.74 (11.96) | 10.25 (4, 260) | <.01 |
| Subtest 1 | 0.32 (0.75) | 0.55 (1.20) | 1.06 (1.37) | 0.36 (1.03) | 0.15 (0.45) | 0.42 (0.95) | 5.60 (4, 260) | <.01 |
| Subtest 2 | 0.32 (0.75) | 0.55 (1.11) | 0.44 (0.92) | 0.21 (0.42) | 0.20 (0.74) | 0.26 (0.52) | 1.91 (4, 260) | .11 |
| Subtest 3 | 6.83 (4.43) | 7.20 (4.54) | 8.40 (4.06) | 7.45 (4.92) | 5.60 (4.12) | 6.22 (4.27) | 3.32 (4, 260) | .01 |
| Subtest 4 | 5.54 (3.35) | 5.65 (3.33) | 6.71 (3.50) | 5.55 (3.62) | 4.93 (3.03) | 5.19 (3.22) | 2.19 (4, 260) | .07 |
| Subtest 5 | 5.46 (3.12) | 6.08 (3.02) | 6.90 (2.88) | 4.90 (2.99) | 4.48 (3.18) | 5.25 (3.04) | 4.96 (4, 260) | <.01 |
| Subtest 6 | 4.80 (2.73) | 4.83 (2.69) | 6.63 (2.03) | 4.98 (2.68) | 3.48 (2.26) | 4.48 (2.89) | 10.02 (4, 260) | <.01 |

Notes: Values are means with standard deviations in parentheses. PRF = Proportional Reasoning Factor; ORF = Oddity Reasoning Factor; CTS = Composite T-score for the Children's Category Test; LD-NOS = Learning Disorder NOS; MR = Mental Retardation; SBI = Structural Brain Injury; NC = Neuropsychologically Normal Clinical Comparison group; VLD = Verbal Learning Disorder.
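The omnibus F for the Composite T-score can be approximately recovered from summary statistics alone, using the group sizes given in the Participants section and the means and standard deviations in Table 5. The sketch below is a worked check under the usual one-way ANOVA decomposition, not a reanalysis of the raw data; the function and variable names are ours, and rounding in the tabled values limits its precision.

```python
def anova_from_summary(groups):
    """One-way ANOVA F from (n, mean, sd) triples; returns (F, df_between, df_within)."""
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n
    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    # Within-groups sum of squares: rebuilt from each group's sample variance.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Composite T-score (CTS) row of Table 5: (n, mean, SD) per group.
CTS_GROUPS = [
    (40, 45.05, 10.72),  # LD-NOS
    (48, 38.04, 8.68),   # MR
    (42, 45.60, 11.01),  # SBI
    (54, 50.87, 9.50),   # NC
    (81, 47.74, 11.96),  # VLD
]

f_ratio, df_between, df_within = anova_from_summary(CTS_GROUPS)
print(f"F({df_between}, {df_within}) = {f_ratio:.2f}")
```

With these inputs the calculation returns a value close to the reported F(4, 260) = 10.25, which suggests that the tabled means, standard deviations, and group sizes are mutually consistent for this measure.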

A separate ANOVA was calculated for the same groups and the PRF. The equality of variances in an assumed normal distribution was again verified with Levene's test, L(4, 260) = 0.41, p = .80. A significant effect of group membership on the CCT-2 PRF was found, F(4, 260) = 7.39, p < .01. Tukey's HSD post hoc test revealed a significant difference between the MR group and the SBI (p = .01), NC (p < .01), and VLD (p = .01) groups. There were no significant differences between the LD group and the MR (p = .11), SBI (p = .95), or VLD (p = .76) groups. No significant differences were found between the SBI group and the NC (p = .41) or VLD (p = 1.00) groups either. The NC group was not significantly different from the VLD group (p = .48).

A third ANOVA was computed for the ORF. Levene's test indicated that the variances were not equal, L(4, 260) = 3.67, p = .01. Nevertheless, a significant effect of group membership on the CCT-2 ORF scores was found, F(4, 260) = 6.57, p < .01. Given the unequal variances, the Games–Howell post hoc test was applied to examine differences between the groups. It revealed significant differences between the MR group and the NC (p < .01) and VLD (p = .01) groups. No significant differences were found between the LD group and the MR (p = .15), SBI (p = 1.00), NC (p = .17), or VLD (p = .84) groups; between the MR group and the SBI group (p = .27); between the SBI group and the NC (p = .87) or VLD (p = .66) groups; or between the NC group and the VLD group (p = .54).

Having found significant differences among some of the groupings, a post hoc discriminant analysis was performed with diagnosis as the DV and the Composite T-score, the PRF, and the ORF as the predictor variables. A total of 265 cases were analyzed. Three functions were found, with only one having absolute correlations between each variable and the discriminant function. The value of this function was significantly different for the five diagnostic groups (χ² = 40.78, df = 12, p < .01). Overall, the discriminant function successfully predicted outcome for 27.2% of cases. The best prediction was for the MR group, which was correctly classified 58.3% of the time, followed by the NC (50%), SBI (28.6%), VLD (4.9%), and LD (2.5%) groups.

An independent discriminant function using only the Composite T-score successfully predicted outcome for 26% of the cases. Individuals identified as MR were correctly classified 58.3% of the time, followed by the NC group, which was correctly classified 51.9% of the time. The LD, SBI, and VLD groups were correctly classified 12.5%, 9.5%, and 4.9% of the time, respectively. A discriminant function using the PRF scores similarly had poor predictive power, with an overall classification rate of 27.5%. The MR and NC groups had the highest predictive percentages, 56.3% and 59.3%, respectively. The LD and VLD groups were correctly classified 15% and 9.9% of the time, whereas no individuals were classified as having SBI. A discriminant function using the ORF score correctly classified 24.9% of the participants. The MR and NC groups had the highest predictive percentages, 62.5% and 57.4%, respectively. The SBI and VLD groups were correctly classified 4.8% and 3.7% of the time. No member of the LD group was correctly classified.
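To put the five-group hit rate of the combined discriminant function (27.2%) in context, it can be set beside two conventional chance reference points computed from the group sizes alone: the proportional chance criterion (the sum of squared group proportions) and the maximum chance criterion (the share of the largest group). These reference points were not part of the original analysis; the sketch below, including its names, is offered only as an illustration and uses the rounded per-group percentages reported above.

```python
GROUP_SIZES = {"LD": 40, "VLD": 81, "MR": 48, "SBI": 42, "NC": 54}
# Per-group hit rates for the combined discriminant function (T-score, PRF, ORF).
HIT_RATES = {"LD": 0.025, "VLD": 0.049, "MR": 0.583, "SBI": 0.286, "NC": 0.50}


def observed_accuracy(sizes, hit_rates):
    """Overall hit rate: per-group hit rates weighted by group size."""
    total = sum(sizes.values())
    return sum(sizes[group] * hit_rates[group] for group in sizes) / total


def proportional_chance(sizes):
    """Expected hit rate when cases are assigned at random in proportion to group size."""
    total = sum(sizes.values())
    return sum((n / total) ** 2 for n in sizes.values())


def maximum_chance(sizes):
    """Hit rate obtained by assigning every case to the largest group."""
    return max(sizes.values()) / sum(sizes.values())


observed = observed_accuracy(GROUP_SIZES, HIT_RATES)
print(f"Observed:            {100 * observed:.1f}%")
print(f"Proportional chance: {100 * proportional_chance(GROUP_SIZES):.1f}%")
print(f"Maximum chance:      {100 * maximum_chance(GROUP_SIZES):.1f}%")
```

Under these assumptions the weighted per-group rates reproduce the reported 27.2%, which falls between the proportional chance figure (about 21.6%) and the maximum chance figure (about 30.6%).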

Discussion

The CCT-2 (Boll, 1993) was evaluated in a mixed sample of neurologically impaired school-aged children commonly referred for outpatient neuropsychological testing. As in previous studies, the CCT-2 was found to have multiple factors, further supporting its multidimensionality (e.g., Nesbit-Greene & Donders, 2002). Three factors were found. Factor 1, which appears to measure the ability to perceive part–whole relationships and was previously labeled the PRF (Bello et al., 2008), was defined primarily by Subtests 4 and 5. Factor 2, which appears to measure the ability to detect sequential discontinuity and was previously labeled the ORF (Bello et al., 2008), was defined primarily by Subtest 3. Factor 3, the Counting Factor, was defined primarily by Subtests 1 and 2. However, given the skewed and kurtotic nature of both subtests and the fact that only 12% of the children committed two or more errors on Subtest 1, and 3% on Subtest 2, they were excluded from further analysis. Subtest 6 demonstrated a moderate relationship with the two remaining factors, which is not surprising, since knowledge of the preceding subtests is required for its successful completion. The findings of the present study are consistent with those of Bello and colleagues (2008), in that the factor loadings of the different subtests appear to be consistent across a variety of clinical populations, in particular those typically referred for neuropsychological testing secondary to academic concerns. Further, the inclusion or exclusion of Subtests 1 and 2 does not drastically alter the factor loadings of the remaining subtests, which consistently loaded on the PRF and ORF.

The validity of the CCT-2 was further supported by strong correlations with the WISC-III IQ and Index scores. Significant correlations were found between the Composite T-score and, in order from highest to lowest, the FSIQ, GAI, POI, VCI, and FDI of the WISC-III; correlations ranged from r = .55 to .41 (r² = .30–.17). A similar pattern emerged for the PRF and the WISC-III scores, with correlations ranging from r = −.48 to −.39 (r² = .23–.15). The ORF had significant but lower correlations along the same lines, ranging from r = −.42 to −.27 (r² = .18–.07). There was a moderate correlation between the PRF and the ORF (r = .59, r² = .35), and both had strong correlations with the Composite T-score, r = −.86 (r² = .74) and r = −.83 (r² = .69), respectively. These findings are largely consistent with previous research (e.g., Bello et al., 2008) and with our study hypothesis that the CCT-2 would correlate more strongly with the POI of the WISC-III than with other index scores, given the perceptual nature of both instruments. However, given the range of r² values observed, the CCT-2 should not be considered a non-VIQ measure.
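The r² values quoted alongside each correlation are the squared correlations, that is, the proportion of variance two scores share. As a minimal sketch (the function name `shared_variance` and the dictionary labels are illustrative and not part of the study), the reported values can be re-derived from the r values in the paragraph above:

```python
# Pearson r values as reported in the text; the labels are illustrative only.
reported_r = {
    "Composite T-score vs. WISC-III (strongest)": 0.55,
    "Composite T-score vs. WISC-III (weakest)": 0.41,
    "PRF vs. WISC-III (strongest)": -0.48,
    "PRF vs. WISC-III (weakest)": -0.39,
    "ORF vs. WISC-III (strongest)": -0.42,
    "ORF vs. WISC-III (weakest)": -0.27,
    "PRF vs. ORF": 0.59,
    "PRF vs. Composite T-score": -0.86,
    "ORF vs. Composite T-score": -0.83,
}


def shared_variance(r: float) -> float:
    """Return r squared, the proportion of variance shared by two scores."""
    return r * r


for label, r in reported_r.items():
    print(f"{label}: r = {r:+.2f}, r^2 = {shared_variance(r):.2f}")
```

Squaring removes the sign, so the negative correlations for the error-based factor scores yield positive shared-variance values, matching the figures given in parentheses in the text.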

The CCT has repeatedly demonstrated sound psychometric properties (Bello et al., 2008). As such, the CCT-2's discriminant power was evaluated and found to be approximately 60% overall. When the Composite T-score and factor scores were combined, they correctly identified 72% of the NC group but only 54% of the Brain Injury group. Individually, the Composite T-score correctly classified 59% of the NC group and 60% of the Brain Injury group, while the two factors similarly classified approximately 59% of the NC group and 55% of brain-injured participants into their respective categories. These findings suggest that the CCT-2's sensitivity to brain injury is quite limited. Whereas other studies have warned against using the Composite T-score as the predictive measure (Nesbit-Greene & Donders, 2002), we find that of the three measures evaluated (PRF, ORF, and Composite T-score), the Composite T-score is the best overall predictor. Nevertheless, its predictive value was modest at best.

The individual subgroups of the Brain Injury group were also evaluated. The ability of the CCT-2 to discriminate between the different forms of brain injury was examined via a discriminant analysis that included four groups from the larger brain injury sample and the NC group. The predictive power of the CCT-2 Composite T-score and factor scores declined significantly, with overall predictive power falling close to the 25% range. The Composite T-score continued to be a better measure of group classification than the factor scores, but all three variables were similarly poor predictors of group membership. These findings are not surprising, given the literature on the adult HCT, which finds the measure to be sensitive but not necessarily specific to brain injury typology (Choca et al., 1997). Further, while the CCT-2 demonstrated increased predictive power as the Composite T-score and raw error scores deviated further from the mean, the test fails to reach levels similar to those of the adult HCT, which is upward of 94% accurate in classifying participants into their respective categories (e.g., Reitan, 1955). Overall, the CCT demonstrates poor predictive power and should be used with caution and in conjunction with other measures of brain injury and severity in making formal determinations of impairment. Within the present mixed sample of brain-injured children, the factor scores were not particularly sensitive to type of brain injury above and beyond the Composite T-score. While there may be validity in the use of factor scores in TBI populations (Nesbit-Greene & Donders, 2002), they do not appear to add to the Composite T-score in the present sample.

The Intermediate Category Test, from which the CCT was derived, is by all accounts a “downward extension of the adult Category Test” (Livingston, Gray, & Haak, 1996, p. 851). While the adult Category Test is a highly useful and sensitive measure of brain dysfunction, the CCT does not appear to share these same qualities. These dissimilar findings may be attributed to the nature of brain development, in particular the maturation of higher order executive functions, which continue to develop into the third decade of life (e.g., Huttenlocher, 1979). Conklin, Luciana, Hooper, and Yarger (2007) discuss how executive functions supported by the frontal lobes “best differentiate humans from their closest primate relatives but also adult humans from children and adolescents” (p. 104). Other studies have reported similar findings of continuing improvement through adolescence and into adulthood on functions mediated by the frontal lobes, such as working memory (Conklin et al., 2007), problem solving (Walsh, Pennington, & Grossier, 1991), and executive control (e.g., Luciana, Conklin, Hooper, & Yarger, 2005). The application of adult standards to children is tenuous at best. It is important to avoid arbitrarily applying knowledge attained from adult neuropsychological research to children without first substantiating its validity and applicability for this population. To bolster diagnostic accuracy and assist the evaluator in drawing appropriate conclusions about brain integrity, the use of psychometrically sound and clinically validated test instruments is essential.

Limitations of the present investigation should be considered before making decisions about clinical populations. First and foremost, convenience sampling of referred outpatient children was used, which resulted in more patients with less severe injuries than would be likely in other settings where the CCT-2 might be used, such as rehabilitation hospitals or extended recovery care settings. Further, the clinical comparison group was derived from a subset of children referred for testing but found to be neuropsychologically within normal limits. This sample may not accurately reflect a normal population: regardless of their test findings, these children were referred for testing, possibly because of academic difficulties or emotional problems, and therefore are not simply a random sample of the normal population. However, compared with the demographic data provided in the CCT-2 administration manual, the present NC group approximated the standardization sample quite well. Further, despite those limitations, the present study is the first to have a control sample from within the same referral environment with which to make comparisons on CCT-2 test performance. Additionally, the sample is consistent with the type of referrals common to practices that routinely employ the CCT-2 in making academically relevant diagnoses. This is important, given that the CCT is generally marketed as a tool for the detection of school-based learning difficulties (Boll, 1993).

Future research should continue to evaluate the clinical validity of the CCT-2 in a variety of settings, including rehabilitation hospitals and extended recovery units, with children of variable brain injury etiologies. In summary, although the CCT-2 is a psychometrically sound instrument, it is not particularly sensitive to brain injury. Further, the factor scores, while important in TBI populations, do not appear to improve the diagnostic power of the CCT and should not be considered independently of the Composite T-score. Given the CCT-2's limited predictive power, this test should not be used independently of other brain-behavior measures in decision making regarding impairments in the context of school-based referrals for learning disability testing.

Conflict of Interest

None declared.

References

Adams, K., & Grant, I. (1986). Influence of premorbid risk factors on neuropsychological performance in alcoholics. Journal of Clinical and Experimental Neuropsychology, 8, 362-370.
Allen, D. N., Goldstein, G., & Mariano, E. (1999). Is the Halstead Category Test a multidimensional instrument? Journal of Clinical and Experimental Neuropsychology: Official Journal of the International Neuropsychological Society, 21(2), 237-244.
Allen, D. N., Knatz, D. T., & Mayfield, J. (2006). Validity of the Children's Category Test-level 1 in a clinical sample with heterogeneous forms of brain dysfunction. Archives of Clinical Neuropsychology: The Official Journal of the National Academy of Neuropsychologists, 21(7), 711-720.
Bello, D. T., Allen, D. N., & Mayfield, J. (2008). Sensitivity of the Children's Category Test Level 2 to brain dysfunction. Archives of Clinical Neuropsychology, 23, 329-339.
Boll, T. J. (1993). Children's Category Test. San Antonio, TX: Psychological Corporation.
Butler, M., Retzlaff, P., & Vanderploeg, R. (1991). Neuropsychological test usage. Professional Psychology: Research and Practice, 22, 510-512.
Carr, M. A., Sweet, J. J., & Rossini, E. (1986). Diagnostic validity of the Luria-Nebraska Neuropsychological Battery-Children's Version. Journal of Consulting and Clinical Psychology, 54(3), 354-358.
Choca, J. P., Laatsch, L., Wetzel, L., & Agresti, A. (1997). The Halstead Category Test: A fifty year perspective. Neuropsychology Review, 7(2), 61-75.
Conklin, H. M., Luciana, M., Hooper, C. J., & Yarger, R. S. (2007). Working memory performance in typically developing children and adolescents: Behavioral evidence of protracted frontal lobe development. Developmental Neuropsychology, 31(1), 103-128.
Donders, J. (1999). Latent structure of the children's category test at two age levels in the standardization sample. Journal of Clinical and Experimental Neuropsychology, 21(2), 279-282.
Donders, J., & Giroux, A. (2005). Discrepancies between the California Verbal Learning Test: Children's version and the Children's Category Test after pediatric traumatic brain injury. Journal of the International Neuropsychological Society, 11(4), 386-391.
Donders, J., & Nesbit-Greene, K. (2004). Predictors of neuropsychological test performance after pediatric traumatic brain injury. Assessment, 11(4), 275-284.
Halstead, W. C. (1947). Brain and intelligence: A quantitative study of the frontal lobes. Chicago: University of Chicago Press.
Hoffman, N. M. (1998). Neuropsychological assessment of memory and executive functioning in a pediatric traumatic brain injury population. Dissertation Abstracts International, Section B: The Sciences and Engineering, 59, 4-35.
Hoffman, N., Donders, J., & Thompson, E. H. (2000). Novel learning abilities after traumatic head injury in children. Archives of Clinical Neuropsychology: The Official Journal of the National Academy of Neuropsychologists, 15(1), 47-58.
Huttenlocher, P. R. (1979). Synaptic density in human frontal cortex - Developmental changes and effects of aging. Brain Research, 163(2), 195-205.
Johnstone, B., Holland, D., & Hewett, J. E. (1997). The construct validity of the category test: Is it a measure of reasoning or intelligence? Psychological Assessment, 9(1), 28-33.
Lees-Haley, P., Smith, H., Williams, C., & Dunn, J. (1996). Forensic neuropsychological test usage: An empirical survey. Archives of Clinical Neuropsychology, 11, 45-51.
Livingston, R. B., Gray, R. M., & Haak, R. A. (1996). Factor analysis of the Intermediate Category Test. Perceptual and Motor Skills, 83(3), 851-855.
Luciana, M., Conklin, H. M., Hooper, C. J., & Yarger, R. S. (2005). The development of nonverbal working memory and executive control processes in adolescence. Child Development, 76, 697-712.
Mercer, W., Harrell, E., Miller, D., Childs, H., & Rocker, D. (1997). Performance of brain-injured versus healthy adults in three versions of the Category Test. The Clinical Neuropsychologist, 11, 174-179.
Miller, L. J., & Donders, J. (2003). Prediction of educational outcome after pediatric traumatic brain injury. Rehabilitation Psychology, 48(4), 237-241.
Nesbit-Greene, K., & Donders, J. (2002). Latent structure of the Children's Category Test after pediatric traumatic head injury. Journal of Clinical and Experimental Neuropsychology, 24(2), 194-199.
Rabin, L. A., Barr, W. B., & Burton, L. A. (2005). Assessment practices of clinical neuropsychologists in the United States and Canada: A survey of INS, NAN, and APA division 40 members. Archives of Clinical Neuropsychology, 25(8), 33-65.
Reed, H. B. C., Jr., Reitan, R. M., & Kløve, H. (1965). Influence of cerebral lesions on psychological test performances of older children. Journal of Consulting Psychology, 29(3), 247-251.
Reitan, R. M. (1955). Investigation of the validity of Halstead's measures of biological intelligence. A.M.A. Archives of Neurology and Psychiatry, 73(1), 28-35.
Shore, C., Shore, H., & Pihl, R. O. (1971). Correlations between performance on the Category Test and the Wechsler Adult Intelligence Scale. Perceptual and Motor Skills, 32(1), 70.
Shriver, M. D., & Vacc, N. A. (n.d.). Review of the Children's Category Test. The Thirteenth Mental Measurements Yearbook. Retrieved July 26, 2008, from EBSCOHost Mental Measurements Yearbook Database.
Walsh, M. C., Pennington, B. F., & Grossier, D. B. (1991). A normative-developmental study of executive functions: A window on prefrontal function in children. Developmental Neuropsychology, 7, 131-149.
Wechsler, D. (1991). Wechsler Intelligence Scale for Children-Third Edition. San Antonio, TX: Psychological Corporation.