Abstract

Embedded validity measures can screen for possible noncredible performance, but there is a paucity of literature on youth with neurological disorders. The purpose of this study was to examine the California Verbal Learning Test, Children's Version (CVLT-C) recognition discriminability (RD) score as an embedded validity marker in a sample of youth with neurological diagnoses. Youth between 5 and 16 years old (N = 294; mean age = 11.3, SD = 3.4) completed the CVLT-C and the Test of Memory Malingering (TOMM). Overall, 5.4% (n = 16) scored below the established cutoff on the TOMM; they were younger, had lower intellectual abilities, and performed worse on nearly all CVLT-C scores than those who scored above the TOMM cutoff. Using the CVLT-C RD cutoff of z ≤ −0.5 (Baker et al., 2014), our sample had a sensitivity of .81 and specificity of .67. Using z ≤ −3.0 provided sensitivity of .44 with specificity of .90. A lower cutoff score of z ≤ −3.0 for the CVLT-C RD is necessary in youth with neurological diagnoses.

Introduction

Determining the validity of obtained data is an important component of neuropsychological assessments with children and adolescents (Bush et al., 2005; Heilbronner, Sweet, Morgan, Larrabee, & Millis, 2009; Kirkwood, in press). Stand-alone performance validity tests (PVTs) have the primary, and often sole, purpose of detecting noncredible responding. Embedded PVTs are designed to detect possible noncredible performance on a cognitive measure and are a direct indicator of credibility on that specific test (versus being a proxy). Despite clinicians using a combination of stand-alone and embedded PVTs for neuropsychological assessments, the majority of research has focused on the former and not the latter (Brooks, in press; Kirkwood, 2012; Kirkwood, in press).

Of the few studies on embedded PVTs for use with children, most have examined performance on a verbal attention/working memory test. Reliable Digit Span (RDS; Greiffenstein, Baker, & Gola, 1994) scores of 6 or less have been investigated in youth, with literature supporting adequate sensitivity and specificity (51% and 92%, respectively) in adolescents with mild traumatic brain injury (mTBI; Araujo et al., 2014; Kirkwood, Hargrave, & Kirk, 2011). When using an RDS cut-off of 6 or less in children with epilepsy (Welsh, Bender, Whitman, Vasserman, & MacAllister, 2012), only 65% of the sample passed this embedded PVT; sensitivity was 100% and specificity was 71%. Alternative cut-off scores were suggested for the RDS in youth with epilepsy, including scores of 4 or less, which provide sensitivity of 60% and specificity of 89%. The Digit Span age-adjusted scaled score (Wechsler, 2003) has also been examined as an embedded PVT. In youth with mTBI, a Digit Span age-adjusted scaled score ≤5 had sensitivity of 51% and specificity of 96% for noncredible test performance (Kirkwood et al., 2011). When evaluating the Digit Span scaled score as an embedded PVT in children with various academic and behavioral problems, a lower cut-off score of 4 or less was needed to establish the desired level of specificity (≥90%), resulting in 43%–44% sensitivity (Loughan, Perna, & Hertza, 2012; Perna, Loughan, Hertza, & Segraves, 2014).

There is also limited research on embedded PVTs for episodic memory tests. Perna and colleagues (2014) suggested limited utility of a verbal memory recall > recognition discrepancy on the Children's Memory Scale (Cohen, 1997) as an embedded PVT due to sensitivity of 11% (when holding specificity at 90%). The recognition discriminability score from the California Verbal Learning Test, Children's Version (CVLT-C; Delis, Kramer, Kaplan, & Ober, 1994) has some evidence for use as a marker of performance validity. Baker, Connery, Kirk, and Kirkwood (2014) found that only the CVLT-C recognition discriminability score was predictive of noncredible performance in adolescents who had sustained a mild traumatic brain injury, and demonstrated that a cut-off score of z ≤ −0.5 would provide sensitivity of 55% and specificity of 91%. However, as with the RDS and Digit Span scaled score, there is a concern that using the same CVLT-C cut-off score in children with neurological disorders will result in unacceptably high false-positive rates. To date, this has not been explored.

The purpose of the present study was to examine whether the CVLT-C recognition discriminability score established by Baker and colleagues (2014) can be used as an embedded PVT in children and adolescents with neurological disorders. Based on prior research with the RDS (Welsh et al., 2012) and Digit Span scaled score (Loughan et al., 2012; Perna et al., 2014), it was hypothesized that (a) the CVLT-C recognition discriminability score established as an embedded PVT in a sample of youth with mTBI would not be an appropriate cut-off score for use in youth with neurological diagnoses and (b) a lower cut-off score would be needed to achieve specificity ≥0.90.

Methods

Participants

Participants included 294 consecutively referred children and adolescents between the ages of 5 and 16 years who underwent neuropsychological assessments at a tertiary care hospital. Patients had diagnoses that included epilepsy, stroke, hydrocephalus, and other neurological disorders (e.g., encephalitis). Diagnoses were made by neurologists or neurosurgeons. Portions of this same case series have been previously published but for different research questions (Brooks, 2012; Brooks, Sherman, & Krol, 2012; Ploetz, Mazur-Mosiewicz, Kirkwood, Sherman, & Brooks, 2014). None of these participants were seeking disability. All data were collected with the approval of the University of Calgary Conjoint Health Research Ethics Board.

Measures

All participants were administered the CVLT-C (Delis et al., 1994) as a core part of their neuropsychological assessment. The CVLT-C is a standardized word-list task that evaluates verbal learning, free recall, and recognition memory in children and adolescents. In addition, all participants were administered the Test of Memory Malingering (TOMM; Tombaugh, 1996), a forced-choice visual recognition stand-alone PVT with evidence for its use in pediatric samples (Brooks et al., 2012; Brooks, in press; Donders, 2005; Loughan & Perna, 2014; Perna & Loughan, 2013; Ploetz et al., 2014; Welsh et al., 2012). The TOMM consists of two learning trials (Trial 1 and Trial 2) and a delayed retention trial (Trial 3). For the purpose of this investigation, performance was coded as above or below the cut-off score based on the TOMM manual (Tombaugh, 1996) and previous research with pediatric neurology patients (Brooks et al., 2012). Data from two subtests of the Wechsler Intelligence Scale for Children, Fourth Edition (WISC-IV; Wechsler, 2003), namely Vocabulary and Matrix Reasoning, were also included to estimate participants' intellectual abilities based on formulas provided in Sattler (2001).

Analyses

Two groups were formed based on TOMM performance: one above the cutoff and one below the cutoff. For the purpose of this study, performance above the TOMM cutoff is referred to as possible credible performance and performance below the TOMM cutoff as possible noncredible performance; however, credibility is ultimately a clinical determination that must be made in light of all available data for any patient. Subsequent group comparisons of age, estimated intellectual abilities, and CVLT-C variables used one-way analyses of variance (ANOVA) when parametric methods were appropriate (i.e., nonsignificant Levene's test of homogeneity of variance) and the Mann–Whitney U test when they were not (i.e., significant Levene's test). Secondary comparisons of CVLT-C performance across the diagnostic groups were also performed. Alpha was corrected to p < .01 for multiple comparisons. Cohen's d effect sizes were calculated for each variable across the two groups, interpreted as small if d ≥ 0.20, medium if d ≥ 0.50, and large if d ≥ 0.80. Sensitivity and specificity of the CVLT-C recognition discriminability z score were computed with classification based on TOMM performance as the reference criterion and classification based on a CVLT-C recognition discriminability z score ≤−0.5 as the outcome criterion. Sensitivity and specificity were also calculated for additional CVLT-C recognition discriminability z-score cutoffs, with the goal of achieving specificity of at least 0.90. Positive predictive values (PPV; the probability that a score at or below the cutoff reflects truly noncredible performance) and negative predictive values (NPV; the probability that a score above the cutoff reflects truly credible performance) were also calculated.
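The sensitivity/specificity calculation described above can be sketched as follows (a minimal illustration of the classification logic, not the authors' actual analysis code; the function name and toy data are hypothetical):

```python
def classification_stats(tomm_noncredible, cvltc_rd_z, cutoff=-0.5):
    """Sensitivity and specificity of a CVLT-C recognition discriminability
    cutoff, using TOMM pass/fail as the reference criterion.

    tomm_noncredible: list of bools (True = scored below the TOMM cutoff)
    cvltc_rd_z: list of CVLT-C recognition discriminability z scores
    """
    flagged = [z <= cutoff for z in cvltc_rd_z]  # outcome criterion
    tp = sum(f and t for f, t in zip(flagged, tomm_noncredible))
    fn = sum((not f) and t for f, t in zip(flagged, tomm_noncredible))
    fp = sum(f and (not t) for f, t in zip(flagged, tomm_noncredible))
    tn = sum((not f) and (not t) for f, t in zip(flagged, tomm_noncredible))
    sensitivity = tp / (tp + fn)  # flagged among the truly noncredible
    specificity = tn / (tn + fp)  # not flagged among the credible
    return sensitivity, specificity
```

The same routine can be rerun with progressively lower `cutoff` values to search for the cutoff that first reaches the a priori specificity target of 0.90.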

Results

Demographic information is presented in Table 1 for the 294 participants. The sample had an average age of 11.3 years (SD = 3.4), had parents with, on average, some college/university education, was evenly split between males and females, and was mostly Caucasian. Over half of the sample had a primary diagnosis of epilepsy, just over one-quarter had stroke (19.4%) or hydrocephalus (8.5%), and just over 20% had other neurological diagnoses (e.g., encephalitis, neurocutaneous syndromes without epilepsy, genetic disorders). The mean estimated intellectual level for the total sample was 86.1 (SD = 17.7, range = 46–135). Performances on the CVLT-C for the total sample, as well as by diagnostic group, are presented in Table 2. None of the comparisons of CVLT-C scores across diagnostic groups was statistically significant (all ps > .05).

Table 1.

Demographic information for participants

Demographics N = 294 
Age (years) M = 11.3, SD = 3.4, range = 5.2–16.9 
Parent education (years) 
 Mother M = 14.1, SD = 2.3, range = 4–20 
 Father M = 14.0, SD = 2.8, range = 3–20 
Sex 
 Male N = 147; 50% 
 Female N = 147; 50% 
Ethnicity 
 Caucasian N = 255; 86.7% 
 Other N = 39; 13.3% 
Diagnosis 
 Epilepsy N = 153; 52.0% 
 Stroke N = 57; 19.4% 
 Hydrocephalus N = 25; 8.5% 
 General/mixed neurological N = 59; 20.1% 
Table 2.

Performance on CVLT-C in diagnostic groups

| | Total sample | Epilepsy | Stroke | Hydrocephalus | General neurology |
|---|---|---|---|---|---|
| N | 294 | 153 | 57 | 25 | 59 |
| CVLT-C Trials 1–5 | 44.6 (12.3) | 44.6 (12.3) | 43.5 (13.5) | 42.3 (9.6) | 46.9 (12.0) |
| CVLT-C short delay free recall | −0.6 (1.2) | −0.6 (1.2) | −0.7 (1.2) | −1.0 (1.1) | −0.4 (1.2) |
| CVLT-C short delay cued recall | −0.6 (1.3) | −0.7 (1.4) | −0.7 (1.3) | −0.7 (1.3) | −0.6 (1.3) |
| CVLT-C long delay free recall | −0.7 (1.3) | −0.7 (1.3) | −0.8 (1.4) | −1.1 (1.2) | −0.4 (1.2) |
| CVLT-C long delay cued recall | −0.6 (1.3) | −0.6 (1.3) | −0.7 (1.4) | −0.6 (1.2) | −0.4 (1.3) |
| CVLT-C recognition hits | −0.4 (1.3) | −0.4 (1.2) | −0.2 (1.1) | −0.8 (1.5) | −0.3 (1.5) |
| CVLT-C recognition discriminability | −0.6 (1.6) | −0.7 (1.7) | −0.4 (1.7) | −0.7 (1.3) | −0.3 (1.5) |
| Percent falling below cut-off score on TOMM | 5.4% | 3.3% | 14.0% | 0.0% | 5.1% |

Notes: With the exception of the bottom row, all values represent means (SD). All of the group comparisons are nonsignificant.

Based on TOMM performance, 16 youth (5.4%) fell below the established cut-off score (see Table 2; percent below this cutoff by diagnostic group: epilepsy, 3.3%; stroke, 14.0%; hydrocephalus, 0.0%; general neurology, 5.1%). Those with possible noncredible performance were younger (medium effect size), had lower estimated intellectual abilities (large effect size), and performed worse on nearly all CVLT-C scores (medium to very large effect sizes; see Table 3). There was not a statistically significant difference in the CVLT-C recognition hits z score between the possible credible and possible noncredible groups, although the medium effect size may suggest that power was a limiting factor.

Table 3.

Comparisons of those who have TOMM performance either above or below the cut-off score

| | Above cutoff on validity test (possible credible) | Below cutoff on validity test (possible noncredible) | F value (or Mann–Whitney U*) | p value | Cohen's d effect size |
|---|---|---|---|---|---|
| N | 278 | 16 | | | |
| Age | 11.5 (3.4) | 9.2 (3.7) | 6.91 | .009 | 0.68 |
| Intellectual abilities | 86.8 (17.4) | 71.0 (16.5) | 8.68 | .004 | 0.91 |
| CVLT-C Trials 1–5 | 45.2 (12.1) | 35.4 (12.2) | 9.76 | .002 | 0.80 |
| CVLT-C short delay free recall | −0.6 (1.2) | −1.4 (1.0) | 7.14 | .008 | 0.69 |
| CVLT-C short delay cued recall | −0.6 (1.3) | −1.9 (1.3) | 15.49 | <.001 | 1.01 |
| CVLT-C long delay free recall | −0.6 (1.3) | −1.8 (1.3) | 11.90 | .001 | 0.89 |
| CVLT-C long delay cued recall | −0.5 (1.3) | −1.9 (1.2) | 20.00 | <.001 | 1.15 |
| CVLT-C recognition hits | −0.3 (1.2) | −1.1 (1.8) | 1717.00* | .118 | 0.57 |
| CVLT-C recognition discriminability | −0.5 (1.5) | −2.4 (2.0) | 941.00* | <.001 | 1.27 |

Notes: Intellectual abilities are based on an estimated FSIQ using two subtests (Vocabulary and Matrix Reasoning) and information provided in Sattler (2001) for short-form estimation. Intellectual abilities are presented as a standard score (mean = 100, SD = 15). CVLT-C Trials 1–5 is presented as a T score (mean = 50, SD = 10). All remaining CVLT-C scores are presented as z scores (mean = 0, SD = 1). *Due to significant differences in variances (i.e., significant Levene's test for homogeneity of variance), Mann–Whitney U tests were completed for these group comparisons.

In this sample, 40.1% had a CVLT-C recognition discriminability z score ≤−0.5 (percent below this cutoff by diagnostic group: epilepsy, 43.8%; stroke, 35.1%; hydrocephalus, 44.0%; general neurology, 33.9%). Classification statistics for the CVLT-C recognition discriminability z score are presented in Table 4. Using the previously established cutoff of z ≤ −0.5, sensitivity was 0.88 and specificity was 0.41. Because there was an a priori goal to establish a cut-off score with specificity ≥0.90, calculations with additional CVLT-C recognition discriminability z scores were conducted. As shown in Table 4, a CVLT-C recognition discriminability cut-off score of z ≤ −3.0 provided a sensitivity of 0.44 with a specificity of 0.90. When considering base rates of possible noncredible effort of 5% and 10% (rates consistent with pediatric neurology samples), the PPV would be 19% and 33%, respectively (see Table 4). In addition, the NPV would be 97% and 94%, respectively.
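The base-rate-adjusted predictive values follow directly from Bayes' theorem. A quick sketch (a hypothetical helper, not code from the study) reproduces the figures reported for the z ≤ −3.0 cutoff:

```python
def predictive_values(sensitivity, specificity, base_rate):
    """PPV and NPV for a given base rate of noncredible performance.

    PPV = P(noncredible | flagged); NPV = P(credible | not flagged),
    computed via Bayes' theorem from sensitivity, specificity, and base rate.
    """
    ppv = (sensitivity * base_rate) / (
        sensitivity * base_rate + (1 - specificity) * (1 - base_rate))
    npv = (specificity * (1 - base_rate)) / (
        specificity * (1 - base_rate) + (1 - sensitivity) * base_rate)
    return ppv, npv

# z <= -3.0 cutoff: sensitivity .44, specificity .90
ppv5, npv5 = predictive_values(0.44, 0.90, 0.05)    # ~0.19, ~0.97
ppv10, npv10 = predictive_values(0.44, 0.90, 0.10)  # ~0.33, ~0.94
```

Note how strongly the low base rate pulls the PPV down even at high specificity, which is why the NPV remains far more informative in these samples.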

Table 4.

Classification statistics for CVLT-C recognition discriminability z scores

| CVLT-C recognition discriminability z score | Sensitivity | Specificity | PPV (5% base rate) | NPV (5% base rate) | PPV (10% base rate) | NPV (10% base rate) | PPV (15% base rate) | NPV (15% base rate) |
|---|---|---|---|---|---|---|---|---|
| 0.0 | 88 | 59 | 10 | 99 | 19 | 98 | 27 | 96 |
| −0.5 | 81 | 62 | 10 | 98 | 19 | 97 | 28 | 95 |
| −1.0 | 81 | 68 | 12 | 99 | 22 | 97 | 31 | 95 |
| −1.5 | 69 | 79 | 15 | 98 | 27 | 96 | 37 | 93 |
| −2.0 | 56 | 84 | 15 | 97 | 27 | 94 | 37 | 92 |
| −2.5 | 44 | 88 | 16 | 97 | 29 | 93 | 39 | 90 |
| **−3.0** | **44** | **90** | **19** | **97** | **33** | **94** | **44** | **90** |
| −3.5 | 38 | 93 | 22 | 97 | 37 | 93 | 48 | 89 |
| −4.0 | 31 | 94 | 21 | 96 | 36 | 92 | 47 | 89 |
| −4.5 | 25 | 96 | 23 | 96 | 39 | 92 | 51 | 88 |
| −5.0 | 25 | 97 | 29 | 96 | 46 | 92 | 58 | 88 |

Notes: All values represent percentages. Base rate refers to level of possible noncredible performance. Bolded values correspond to the cutoff score of z ≤ −3.0, which provides the best balance of sensitivity and specificity.

In the present sample, over half of the participants had a primary diagnosis of epilepsy (mean estimated intellectual level = 86.1, SD = 18.5, range = 48–135). Participants with epilepsy (n = 153) were not significantly different from those without epilepsy (n = 141) on age [F(1,292) = 0.87, p = .351], estimated intellectual level [F(1,247) = 0.002, p = .961], or any CVLT-C score (all ps > .05). The number of participants with epilepsy who were flagged as possibly noncredible on the TOMM (n = 5, 3.3%) was not significantly lower than in the nonepilepsy sample [n = 11, 7.8%; χ2(1) = 2.93, p = .087]. Participants with epilepsy who had possible noncredible performance on the TOMM were significantly younger (U = 71.5, p = .002) and performed worse on CVLT-C long delay cued recall [F(1,151) = 7.31, p = .008] and CVLT-C recognition discriminability [F(1,151) = 7.25, p = .008]. Using the CVLT-C recognition discriminability z score ≤−0.5, participants with epilepsy had a sensitivity of 1.0 and a specificity of 0.58. A cut-off score of z ≤ −3.0 on the CVLT-C recognition discriminability resulted in a similar sensitivity of 0.40 with specificity of 0.89 (note: a cut-off score of z ≤ −3.5 had sensitivity of 0.20 with specificity of 0.91). Considering a base rate of possible noncredible effort of 5% in children with epilepsy (consistent with existing research in these samples), the PPV would be 16% and the NPV would be 97% with a cut-off score of z ≤ −3.0.

Discussion

Determining performance validity should be considered for every neuropsychological assessment. Despite evidence for using stand-alone PVTs in pediatric samples with cognitive deficits (Brooks, 2012; Brooks et al., 2012; Brooks, in press; Carone, 2008; Courtney, Dinkins, Allen, & Kuroski, 2003; DeRight & Carone, 2015; Donders, 2005; Green & Flaro, 2003; Kirk et al., 2011; Kirkwood, 2012; Loughan & Perna, 2014; MacAllister, Nakhutina, Bender, Karantzoulis, & Carlson, 2009; Ploetz et al., 2014), the evidence for using embedded markers of validity in youth who have neurological diagnoses is lacking. Baker and colleagues (2014) proposed that the recognition discriminability index from the CVLT-C could be used as an embedded indicator of performance validity. Based on a z score ≤−0.5 in a sample of youth with an mTBI, these authors reported a respectable sensitivity of 55% with 91% specificity. The purpose of the present study was to determine whether the CVLT-C recognition discriminability score can be used as an embedded marker of performance validity in youth with neurological diagnoses.

Unfortunately, a cut-off score of z ≤ −0.5 on the CVLT-C recognition discriminability in the present sample of youth with neurological diagnoses resulted in high sensitivity (88%) but insufficient specificity (41%). The high-sensitivity/low-specificity values are likely the result of having just over 40% of this sample with a recognition discriminability z score of ≤−0.5 but only 5% being identified as possible noncredible on the stand-alone PVT. When considering different cut-off scores for recognition discriminability, it was found that z ≤ −3.0 provided sensitivity of 44% with the desired level of specificity ≥90%. When conducting the analyses with only the epilepsy sample, this same cut-off score for recognition discriminability (z ≤ −3.0) provided sensitivity of 40% with specificity at 89%.

Clearly, the use of the CVLT-C recognition discriminability z score ≤ −0.5 as an embedded indicator of validity with youth who have neurological diagnoses could be inappropriate and misleading. Instead, a lower cut-off score of z ≤ −3.0 should be used with these patient populations, especially if similar to the present sample in demographics, diagnoses, and intellectual functioning. It has been previously demonstrated using other embedded markers of performance validity, notably the RDS and the Digit Span scaled score, that cut-off scores appropriate for youth with mTBI (Kirkwood et al., 2011) may not be appropriate for children and adolescents with different diagnoses (Loughan et al., 2012; Perna et al., 2014; Welsh et al., 2012). As a result, cut-off scores that are established using a clinical population are necessary and are most applicable to that population.

Welsh and colleagues (2012) cautioned that embedded measures may not have as much utility in pediatric patient populations with significant cognitive problems (e.g., epilepsy) because they are likely to produce higher false-positive rates. It is true that PVT performance can be significantly correlated with intellectual levels, so worse cognitive problems/lower intellectual levels must be considered when interpreting any PVT (see discussion in Brooks, in press). In the above-mentioned studies, the sample of youth with mTBI had a mean intellectual level of 103.5 (SD = 12.6; Kirkwood et al., 2011) compared with 87.0 (SD = 20.0) in the epilepsy sample (Welsh et al., 2012) and 89.9 (SD = 18.1) in the children with academic and behavioral problems (Loughan et al., 2012; Perna et al., 2014). In both of the latter studies, lower cut-off scores were necessary in order to maintain desired classification rates (e.g., specificity ≥0.90). It is possible to use embedded PVTs in pediatric populations with neurological diagnoses and lesser cognitive abilities, but it is important to establish diagnostic- and developmentally-appropriate cut-off scores through research.

There are some limitations to consider regarding these data. First, this was a sample of patients evaluated through a clinical service within a tertiary care hospital, so the results may not apply to all pediatric patients who have similar diagnoses or to patients with different demographics. For example, in children who have a similar diagnosis but higher intellectual abilities, the new cut-off score of z ≤ −3.0 could be too stringent and could miss true positives. Second, the classification of data into possible credible or possible noncredible performance was based solely on the TOMM cut-off score. It is possible that TOMM performance did not capture all youth who truly provided noncredible performances on the assessment, although there is increasing evidence that this stand-alone PVT can be used in this population (Brooks et al., 2012; Brooks, in press; DeRight & Carone, 2015; Donders, 2005; Kirk et al., 2011; Kirkwood, 2012; Loughan & Perna, 2014; MacAllister et al., 2009; Ploetz et al., 2014). Determining whether performance is credible is ultimately a matter of clinical judgment, which must consider all of the available data; it is therefore not appropriate to automatically treat a score below any cutoff as synonymous with probable noncredible performance. Third, the subsample with TOMM performance below the cut-off score was small, which could affect the stability of the classification rates and additional analyses. Replication of these findings is recommended. However, it is noteworthy that the number of participants with possible noncredible performance in the present study (n = 16) is higher than in several other studies of embedded validity markers (e.g., Loughan et al., 2012, n = 7 identified as noncredible; Perna et al., 2014, n = 9; Welsh et al., 2012, n = 5).
Fourth, some of the participants who were flagged on the TOMM for scoring below the established cut-off score could have been false positives. For example, 8 of the 16 youth (two with epilepsy, four with perinatal stroke, one with hydrocephalus, and one with a chromosomal abnormality) were suspected to be false positives based on the clinician's report.

Along with stand-alone PVTs, embedded PVTs are an important component of neuropsychological assessments with children and adolescents. However, applying cut-off scores from one clinical sample to another clinical sample, particularly when there may be substantial and important differences (e.g., one sample may be more cognitively impacted than another), may be an inappropriate application of these PVTs and may result in higher-than-desired false-positive rates. In the present study with a sample of children and adolescents with neurological disorders, there is evidence that a CVLT-C recognition discriminability z score ≤−3.0 can potentially be used to help determine possible noncredible performance on testing with this population.

Conflict of Interest

Neither of the authors has a financial interest in the tests used in this study. BLB receives funding from a test publisher (Psychological Assessment Resources, Inc.) for a learning and memory battery with an embedded validity indicator (ChAMP™; Sherman & Brooks, in press) and a co-normed stand-alone performance validity test (MVP™; Sherman & Brooks, in press), royalties from Oxford University Press for a book that reviews validity measures in youth (Pediatric Forensic Neuropsychology, 2012), and in-kind test credits for research from a computerized test publisher (CNS Vital Signs).

Acknowledgements

The authors thank Dr. Helen Carlson for her assistance with data management, as well as (alphabetically) Hussain Daya, Courtney Habina, Andrea Jubinville, Christianne Laliberté, Lonna Mitchell, Emily Tam, and Julie Wershler for data entry. The authors also thank the families who come to the Alberta Children's Hospital and agree to participate in research.

References

Araujo, G. C., Antonini, T. N., Monahan, K., Gelfius, C., Klamar, K., Potts, M., et al. (2014). The relationship between suboptimal effort and post-concussion symptoms in children and adolescents with mild traumatic brain injury. The Clinical Neuropsychologist, 28(5), 786–801.

Baker, D. A., Connery, A. K., Kirk, J. W., & Kirkwood, M. W. (2014). Embedded performance validity indicators within the California Verbal Learning Test, Children's Version. The Clinical Neuropsychologist, 28(1), 116–127.

Brooks, B. L. (2012). Victoria Symptom Validity Test performance in children and adolescents with neurological disorders. Archives of Clinical Neuropsychology, 27(8), 858–868.

Brooks, B. L. (in press). Validity testing during pediatric clinical neuropsychological evaluations with medical populations. In M. W. Kirkwood (Ed.), Validity testing in the assessment of children and adolescents. New York: Guilford Press.

Brooks, B. L., Sherman, E. M., & Krol, A. L. (2012). Utility of TOMM Trial 1 as an indicator of effort in children and adolescents. Archives of Clinical Neuropsychology, 27(1), 23–29.

Bush, S. S., Ruff, R. M., Troster, A. I., Barth, J. T., Koffler, S. P., Pliskin, N. H., et al. (2005). Symptom validity assessment: Practice issues and medical necessity. NAN Policy & Planning Committee. Archives of Clinical Neuropsychology, 20(4), 419–426.

Carone, D. A. (2008). Children with moderate/severe brain damage/dysfunction outperform adults with mild-to-no brain damage on the Medical Symptom Validity Test. Brain Injury, 22(12), 960–971.

Cohen, M. J. (1997). Children's Memory Scale. San Antonio, TX: The Psychological Corporation.

Courtney, J. C., Dinkins, J. P., Allen, L. M., III, & Kuroski, K. (2003). Age related effects in children taking the Computerized Assessment of Response Bias and Word Memory Test. Child Neuropsychology, 9(2), 109–116.

Delis, D. C., Kramer, J. H., Kaplan, E., & Ober, B. A. (1994). California Verbal Learning Test—Children's Version. San Antonio, TX: The Psychological Corporation.

DeRight, J., & Carone, D. A. (2015). Assessment of effort in children: A systematic review. Child Neuropsychology, 21(1), 1–24.

Donders, J. (2005). Performance on the Test of Memory Malingering in a mixed pediatric sample. Child Neuropsychology, 11(2), 221–227.

Green, P., & Flaro, L. (2003). Word Memory Test performance in children. Child Neuropsychology, 9(3), 189–207.

Greiffenstein, M., Baker, W., & Gola, T. (1994). Validation of malingered amnesia measures with a large clinical sample. Psychological Assessment, 6, 218–224.

Heilbronner, R. L., Sweet, J. J., Morgan, J. E., Larrabee, G. J., & Millis, S. R. (2009). American Academy of Clinical Neuropsychology Consensus Conference Statement on the neuropsychological assessment of effort, response bias, and malingering. The Clinical Neuropsychologist, 23(7), 1093–1129.

Kirk, J. W., Harris, B., Hutaff-Lee, C. F., Koelemay, S. W., Dinkins, J. P., & Kirkwood, M. W. (2011). Performance on the Test of Memory Malingering (TOMM) among a large clinic-referred pediatric sample. Child Neuropsychology, 17(3), 242–254.

Kirkwood, M. W. (2012). Overview of tests and techniques to detect negative response bias in children. In E. M. S. Sherman & B. L. Brooks (Eds.), Pediatric forensic neuropsychology (pp. 136–161). New York: Oxford University Press.

Kirkwood, M. W. (Ed.). (in press). Validity testing in the assessment of children and adolescents. New York: Guilford Publishing.

Kirkwood, M. W., Hargrave, D. D., & Kirk, J. W. (2011). The value of the WISC-IV Digit Span subtest in detecting noncredible performance during pediatric neuropsychological examinations. Archives of Clinical Neuropsychology, 26(5), 377–384.

Loughan, A. R., & Perna, R. (2014). Performance and specificity rates in the Test of Memory Malingering: An investigation into pediatric clinical populations. Applied Neuropsychology: Child, 3(1), 26–30.

Loughan, A. R., Perna, R., & Hertza, J. (2012). The value of the Wechsler Intelligence Scale for Children-Fourth Edition Digit Span as an embedded measure of effort: An investigation into children with dual diagnoses. Archives of Clinical Neuropsychology, 27(7), 716–724.

MacAllister, W. S., Nakhutina, L., Bender, H. A., Karantzoulis, S., & Carlson, C. (2009). Assessing effort during neuropsychological evaluation with the TOMM in children and adolescents with epilepsy. Child Neuropsychology, 15(6), 521–531.

Perna, R. B., & Loughan, A. R. (2013). Children and the Test of Memory Malingering: Is one trial enough? Child Neuropsychology, 19(4), 438–447.

Perna, R., Loughan, A. R., Hertza, J., & Segraves, K. (2014). The value of embedded measures in detecting suboptimal effort in children: An investigation into the WISC-IV Digit Span and CMS verbal memory subtests. Applied Neuropsychology: Child, 3(1), 45–51.

Ploetz, D., Mazur-Mosiewicz, A., Kirkwood, M. W., Sherman, E. M. S., & Brooks, B. L. (2014). Performance on the Test of Memory Malingering in children with neurological conditions. Child Neuropsychology.

Sattler, J. M. (2001). Assessment of children: Cognitive applications (4th ed.). La Mesa, CA: Author.

Sherman, E. M. S., & Brooks, B. L. (in press). Child and Adolescent Memory Profile (ChAMP). Lutz, FL: Psychological Assessment Resources, Inc.

Tombaugh, T. (1996). Test of Memory Malingering (TOMM). North Tonawanda, NY: Multi-Health Systems.

Wechsler, D. (2003). Wechsler Intelligence Scale for Children (4th ed.). San Antonio, TX: The Psychological Corporation.

Welsh, A. J., Bender, H. A., Whitman, L. A., Vasserman, M., & MacAllister, W. S. (2012). Clinical utility of Reliable Digit Span in assessing effort in children and adolescents with epilepsy. Archives of Clinical Neuropsychology, 27(7), 735–741.