Abstract

A Rey-Osterrieth Complex Figure Test (ROCFT) equation incorporating copy and recognition was found to be useful in detecting negative response bias in neuropsychological assessments (ROCFT Effort Equation; Lu, P. H., Boone, K. B., Cozolino, L., & Mitchell, C. (2003). Effectiveness of the Rey-Osterrieth Complex Figure Test and the Meyers and Meyers recognition trial in the detection of suspect effort. Clinical Neuropsychologist, 17, 426–440). In the current cross validation of this equation, the credible patient group (n = 146; 124 with equation data) outperformed the noncredible group (n = 157; 115 with equation data) on copy, 3-min recall, total recognition correct, and the Effort Equation, but the latter was most effective in classifying subjects. A cut-off of ≤50 maintained specificity of 90% and achieved sensitivity of 80%. Results of the current cross validation provide corroboration that the ROCFT Effort Equation is an effective measure of neurocognitive response bias.

The Rey-Osterrieth Complex Figure Test (ROCFT) is used to evaluate visuospatial constructional ability and visual memory (Lezak, 2005). Subjects are instructed to copy the figure, and then to draw the figure again from memory; the number and delay length of the recall trials have varied (see Mitrushina, Boone, Razani, & D'Elia, 2005). Recognition trials are also available (Fastenau, 2002; Meyers & Meyers, 1995).

In an attempt to develop an effective embedded measure of neurocognitive response bias from the ROCFT, Lu, Boone, Cozolino, and Mitchell (2003) utilized archived ROCFT and Meyers and Meyers recognition trial data on 58 noncredible subjects and credible neuropsychology clinic patients (30 without memory impairment, 23 with verbal memory impairment, and 17 with visual memory impairment). Group comparisons revealed significant differences on direct copy, immediate recall/3-min recall, and recognition scores (obtained after immediate/3-min recall), with the noncredible group displaying significantly lower performances on the immediate recall and copy scores than the verbal memory impaired and nonmemory impaired clinical patient groups, and significantly lower recognition scores than all three clinical groups. Additionally, examination of the recognition trial data showed that noncredible patients endorsed “atypical recognition errors” (i.e., details that did not resemble any of those from the figure) at a significantly higher rate than credible clinic patients. The most effective indicator of performance invalidity was the following formula, titled the ROCFT Effort Equation: 

[ROCFT Effort Equation (Lu et al., 2003); formula not reproduced in this copy]

A cut-off of ≤47 achieved sensitivity of 76% and specificity of 91% for the comparison groups combined. This equation was found to outperform the Meyers and Volbrecht (1998) Memory Error Patterns (MEPs) for identifying suspect effort; when applied to the Lu and colleagues (2003) sample, MEPs obtained sensitivity of 26%–50% and specificity of only 52%–100%.
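The equation itself does not appear in this copy of the text. As a minimal sketch, the form below is the one commonly attributed to Lu and colleagues (2003): the copy score plus three times the difference between recognition true positives and atypical recognition errors. That form is an assumption here rather than something the surrounding text confirms, although it is consistent with the maximum Effort Equation score of 72 reported in Table 3 (36 copy points plus 3 × 12 recognition items). The function and parameter names are hypothetical.

```python
def rocft_effort_equation(copy_score, true_positives, atypical_errors):
    """Assumed form: copy + 3 * (recognition true positives - atypical errors).

    copy_score: ROCFT copy score (0 to 36).
    true_positives: correctly circled recognition items (0 to 12).
    atypical_errors: endorsed atypical recognition foils (0 or more).
    """
    if not 0 <= copy_score <= 36:
        raise ValueError("copy score must be between 0 and 36")
    if not 0 <= true_positives <= 12:
        raise ValueError("true positives must be between 0 and 12")
    if atypical_errors < 0:
        raise ValueError("atypical errors cannot be negative")
    return copy_score + 3 * (true_positives - atypical_errors)


def below_cutoff(score, cutoff=47):
    """True when the score falls at or below the cut-off (default: the Lu et al. value of 47)."""
    return score <= cutoff


# Hypothetical protocol: copy 24.5, 6 true positives, 1 atypical error.
example = rocft_effort_equation(24.5, 6, 1)
print(example, below_cutoff(example))
```

The cut-off is passed as a parameter because the studies discussed in this paper apply different values (≤45, ≤47, ≤50).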

Blaskewitz, Merten, and Brockhaus (2009) subsequently analyzed ROCFT data from a full-effort group (41 noncompensation-seeking German neurological clinic patients) and an insufficient-effort group (44 compensation-seeking Turkish immigrants). Subjects were administered the ROCFT according to the Meyers and Meyers (1995) format (which involved administration of the recognition trial after a 30-min delayed trial), and protocols were scored by a single rater (the first author). Despite a large mismatch in educational level (mean of 15.3 years in the credible subjects and 5.5 years in the noncredible subjects), both groups scored similarly on the immediate and delayed recall trials. However, the noncredible group significantly underperformed relative to the credible group on copy and recognition trials. When the Meyers and Meyers criteria for performance invalidity (based on rare mistakes and analysis of MEPs) were applied, specificity was 78%, with an associated sensitivity of 50%; when only the MEPs were examined, specificity was 85% and sensitivity was 48%. Application of the Lu and colleagues (2003) equation to the sample (cut-off of ≤45) resulted in specificity of 95% and a sensitivity of 52%. Thus, the Lu and colleagues (2003) equation again outperformed the MEP approach, although sensitivity was lower than documented in the original Lu and colleagues (2003) validation study.

The Lu and colleagues (2003) ROCFT Effort Equation is a convenient performance validity measure in that it is derived from data that are generally part of a standard neuropsychological battery, and therefore does not require any extra administration time. However, findings regarding equation sensitivity have been variable, perhaps related to differences across studies in test administration, cut-offs, and sample sizes and characteristics. The purpose of the present study was to re-investigate the effectiveness of the ROCFT Effort Equation in a large known-groups sample.

Method

Subjects

Patients were referred for neuropsychological assessment to the Harbor-UCLA Medical Center Department of Psychiatry Outpatient Neuropsychology Service or the private practice of the second author from 2001 to 2010; none of these subjects were included in the Lu and colleagues (2003) initial validation study. Patients evaluated in the former setting were primarily referred by treating psychiatrists or neurologists for diagnostic clarification, case management, and/or determination of appropriateness for disability compensation. Patients tested in the latter setting were either evaluated in the context of litigation or at the request of private disability carriers. IRB approval to examine the archival data was obtained from the hospital-affiliated research institute (Los Angeles Biomedical Institute). All participants were fluent in English and most were native English-speakers.

Credible Patients

The 146 patients assigned to the credible group had no motive to feign cognitive symptoms (not in litigation or disability-seeking), and failed ≤1 performance validity tests (PVTs; tests and cut-offs shown in Table 1; multiple failed scores from the same test were counted as a single PVT failure). Patients who failed one PVT were retained in the sample because research shows that failure on a single indicator among several is not unusual in credible populations (Victor, Boone, Serpa, Beuhler, & Ziegler, 2009). Participants with an IQ of <70 or diagnoses of dementia were excluded due to evidence that these groups fail PVTs at a high rate despite performing to true ability (Dean, Victor, Boone, & Arnold, 2008; Dean, Victor, Boone, Philpott, & Hess, 2009); retention of these subjects results in cut-off scores that have unacceptably poor sensitivity. Final diagnoses (i.e., determined by history and cognitive test results) are listed in Table 2, and demographic data are provided in Table 3.

Table 1.

Cut-offs and references for performance validity measures used for group assignment

| Measure | Reference | Cut-off |
|---|---|---|
| Rey 15 plus recognition | Boone, Salazar, Lu, Warner-Chacon, and Razani (2002) | Combination score <20 |
| Dot Counting Test | Boone, Lu, and Herzberg (2002b) | E-score ≥17 |
| b Test | Boone, Lu, and Herzberg (2002a) | E-score ≥170 |
| WAIS-III Digit span | Babikian, Boone, Lu, and Arnold (2006) | Age-corrected scaled score ≤5, or Reliable digit span ≤6, or Mean time to repeat 3 digits forward >2″ |
| Rey Word Recognition | Nitch, Boone, Wen, Arnold, and Alfano (2006) | Total recognized ≤5 for men/≤7 for women, or Combination equation ≤9 |
| Rey Auditory Verbal Learning Test (RAVLT) | Boone, Lu, and Wen (2005) | Effort Equation ≤12 |
| Finger Tapping (dominant hand) | Arnold and colleagues (2005) | Men ≤35 and women ≤28 |
| Warrington Recognition Memory Test – Words | Kim and colleagues (2010) | Words ≤42 |
Table 2.

Frequencies of diagnoses by group

Diagnosis Credible (n = 146) Noncredible (n = 157) 
Anoxia 
Anxiety/panic disorder 
Asperger's syndrome (R/o) 
Attention deficit disorder (including R/O) 
Bipolar disorder 
Brain tumor 
Chronic pain 
Cognitive disorder NOS 
Dementia (vascular) 
Depression 26 15 
Electrical injury 
Epilepsy 12 
HIV 
Hydrocephalus 
Klinefelter syndrome 
Learning disability (probable) 25 12 
Liver disease (end stage) 
Meningitis 
Mental retardation 
Multiple sclerosis 
Obsessive Compulsive Disorder 
Psychosis 12 28 
Post traumatic stress disorder 
Somatoform disorder 14 
Stroke 
Syncopal Episodes 
Substance abuse 
Toxic exposure 
Traumatic brain injury 
 Mild 33 
 Moderate 
 Severe 14 
Table 3.

Demographic characteristics and Rey-Osterrieth Complex Figure Test scores by group

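| | Credible (n = 146) | Noncredible (n = 157) | t | p | Effect size |
|---|---|---|---|---|---|
| Age | 42.73 ± 13.71 | 44.22 ± 10.77 | −1.05 | .294 | 0.11 |
| Range | 15–75 | 17–67 | | | |
| Years of education | 13.27 ± 2.80 | 12.49 ± 2.87 | 2.41 | .016 | 0.28 |
| Range | 3–20 | 6–21 | | | |
| Gender | 75m/71f | 92m/65f | | | |
| Ethnicity | | | | | |
| Caucasian | 78 (53.4%) | 49 (31.2%) | | | |
| Hispanic | 31 (21.2%) | 28 (17.8%) | | | |
| African American | 17 (11.6%) | 59 (37.6%) | | | |
| Asian | 8 (5.5%) | 10 (6.4%) | | | |
| Native American | 3 (2.1%) | 3 (1.9%) | | | |
| Middle Eastern | 7 (4.8%) | 5 (3.2%) | | | |
| Other | 2 (1.4%) | 3 (1.9%) | | | |
| ROCFT scores | | | | | |
| Copy | 31.35 ± 3.86 | 24.59 ± 8.32 | 8.91 | <.001 | 1.75 |
| Range | 12–36 | 5–36 | | | |
| n | 145 | 152 | | | |
| 3-min recall | 17.06 ± 6.21 | 11.16 ± 5.68 | 8.54 | <.001 | 0.95 |
| Range | 3–33 | 0–29 | | | |
| n | 145 | 151 | | | |
| Recognition Correct | 9.23 ± 1.79 | 5.91 ± 2.49 | 11.98 | <.001 | 1.86 |
| Range | 5–12 | 0–11 | | | |
| n | 124 | 118 | | | |
| Effort Equation | 59.13 ± 6.45 | 41.30 ± 12.71 | 13.83 | <.001 | 2.76 |
| Range | 38–72 | 2–65 | | | |
| n | 124 | 115 | | | |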

Noncredible Subjects

The 157 patients in this group met the Slick, Sherman, and Iverson (1999) criteria for probable malingered neurocognitive dysfunction. All were involved in litigation or seeking to obtain disability benefits for reported symptoms at the time of the evaluation, and showed evidence of noncredible cognitive performance on at least two PVTs (shown in Table 1).

In reference to the Slick and colleagues (1999) Criterion D, it was important to exclude from the noncredible group potential subjects who were truly low functioning and failed PVTs due to their actual conditions rather than response bias. Unfortunately, the exclusion criteria used for the credible group (i.e., IQ < 70 and presenting diagnoses of dementia) could not be employed because these data are not accurate in an unknown percentage of compensation-seeking subjects. Studies have shown that noncredible compensation-seekers obtain much lower IQ scores than do credible patient groups without motive to feign (e.g., Bianchini, Mathias, Greve, Houston, & Crouch, 2001) because the former are not performing to true ability on the IQ measures. Likewise, noncredible patients can be incorrectly assigned diagnoses of dementia when they do not perform to true ability on memory testing.

The approach we used to confirm appropriateness of assignment to the noncredible group (in addition to presence of motive to feign and failure on at least two separate PVTs) was to check for a mismatch between low cognitive scores and evidence of normal function in activities of daily living (ADLs; e.g., dementia-level memory scores in a subject who lives independently and handles own finances). If such a mismatch was present, the subject was retained in the noncredible group. However, if individuals had verifiable evidence of low cognitive function outside of the evaluation context that could account for their PVT failure (e.g., not able to live independently, had never held employment or been able to drive, had a guardian or conservator, etc.), they were excluded from the noncredible group.
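The group assignment rules described for the credible and noncredible samples can be summarized in a short sketch. This is an illustration of the stated criteria only, not the authors' procedure in code; the function names, argument names, and the "excluded" label are hypothetical, and the clinical judgments (motive to feign, verified low functioning) are reduced to boolean inputs.

```python
def count_pvt_failures(failed_scores):
    """Count failed PVTs from (test_name, score_name) pairs.

    Multiple failed scores from the same test count as one PVT failure.
    """
    return len({test_name for test_name, _score_name in failed_scores})


def assign_group(has_motive, failed_scores, iq_below_70=False,
                 dementia=False, verified_low_function=False):
    """Return 'credible', 'noncredible', or 'excluded' under the stated criteria."""
    failures = count_pvt_failures(failed_scores)
    if not has_motive:
        # Credible group: no motive to feign, at most one failed PVT,
        # and no IQ < 70 or dementia diagnosis.
        if failures <= 1 and not iq_below_70 and not dementia:
            return "credible"
        return "excluded"
    # Noncredible group: motive to feign, at least two failed PVTs, and no
    # verifiable low functioning outside the evaluation that could explain
    # the failures. IQ and dementia labels are deliberately not used here.
    if failures >= 2 and not verified_low_function:
        return "noncredible"
    return "excluded"


# Two failed Digit Span scores count once, so this subject has two failures.
failed = [("Digit Span", "ACSS"), ("Digit Span", "RDS"), ("b Test", "E-score")]
print(count_pvt_failures(failed), assign_group(True, failed))
```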

Presenting diagnoses (i.e., the conditions claimed by the patients at the time of evaluation) are reproduced in Table 2, and demographic information is contained in Table 3.

Instrument/Procedures

The ROCFT had been administered as part of a comprehensive neuropsychological test battery according to the procedures outlined by Lu and colleagues (2003). Specifically, patients were shown the ROCFT figure and instructed to copy it onto a sheet of blank 8.5 × 11-in. paper. Following the copy trial, the figure and the copy were removed from view, and approximately 3 min later, patients were asked to draw the figure from memory on another blank sheet of paper. Following the recall task, the majority of participants (124 credible subjects and 118 noncredible subjects) were presented with the Meyers and Meyers (1995) recognition trial, consisting of 4 pages that contain 12 partial designs from the figure interspersed with 12 distractor items that were not part of the original figure. The participants were asked to circle the designs that were part of the original figure.

All protocols were scored according to the procedure described in Lezak (2005) in which up to two points are given (one point for correct placement and one point for correct reproduction; half point awarded to poorly placed and distorted elements) for each of 18 details for a total of 36 points. All protocols were scored by a single experienced rater (K.B.B.). This rater's scoring has been found to have high inter-rater reliability with that of another experienced neuropsychologist (see Boone, Lesser, Hill-Gutierrez, Berman, & D'Elia, 1993). Further, the mean copy score for the youngest age group (normal individuals aged 45 to 59) in the Boone and colleagues (1993) study is highly similar to that reported by Meyers and Meyers (1995) in their normal sample who averaged 35 years of age (i.e., 34.2 versus 34.6), which suggests that scoring approaches are comparable.
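The point scheme above can be sketched as follows. This is a simplified illustration and not a substitute for the published scoring criteria; the names are hypothetical, and the rule that an absent or unrecognizable detail earns zero points is an assumption that the text does not state.

```python
def score_detail(present, accurate, well_placed):
    """Score one of the 18 ROCFT details on the 0 to 2 point scale."""
    if not present:
        return 0.0  # assumption: absent or unrecognizable details earn no credit
    if accurate and well_placed:
        return 2.0  # correct reproduction and correct placement
    if accurate or well_placed:
        return 1.0  # one of the two criteria met
    return 0.5      # distorted and poorly placed, but recognizable


def score_protocol(details):
    """Sum detail scores; details is a list of 18 (present, accurate, well_placed) tuples."""
    if len(details) != 18:
        raise ValueError("the ROCFT has 18 scorable details")
    return sum(score_detail(*detail) for detail in details)


# Hypothetical drawing: 10 full-credit details, 4 partial, 2 distorted and misplaced, 2 absent.
drawing = ([(True, True, True)] * 10 + [(True, True, False)] * 4
           + [(True, False, False)] * 2 + [(False, False, False)] * 2)
print(score_protocol(drawing))
```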

The four scores used for statistical analyses were: (i) copy score; (ii) immediate/3-min recall score; (iii) true positive recognition score (total number of correct figures circled); and (iv) Lu and colleagues (2003) Effort Equation.

Results

As shown in Table 3, credible and noncredible groups did not significantly differ in age but did differ in education, although by <1 year; gender distribution was generally comparable between groups.

T-test comparisons revealed highly significant group differences on all ROCFT scores, with the credible group outperforming the noncredible group (see Table 3).
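The text does not state which t-test variant or effect size formula was used. As a rough check under stated assumptions, a pooled-variance t statistic computed from the Table 3 summary statistics, and the mean difference divided by the credible group's standard deviation, come close to the reported copy-score values (t = 8.91, effect size = 1.75). Both choices are inferences from the numbers, not methods the authors describe, and the function names are hypothetical.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic with pooled variance, from summary statistics."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error


def effect_size_reference_sd(mean1, mean2, sd_reference):
    """Mean difference scaled by one group's SD (here, the credible group's)."""
    return (mean1 - mean2) / sd_reference


# ROCFT copy row of Table 3: credible 31.35 ± 3.86 (n = 145),
# noncredible 24.59 ± 8.32 (n = 152).
t_copy = pooled_t(31.35, 3.86, 145, 24.59, 8.32, 152)
d_copy = effect_size_reference_sd(31.35, 24.59, 3.86)
print(round(t_copy, 2), round(d_copy, 2))
```

Scaling by the credible group's SD rather than a pooled SD gives larger values when the noncredible group is much more variable, as it is here.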

Cut-offs were selected for ROCFT scores that achieved at least 90% specificity. As can be seen in Table 4, a ROCFT copy score cut-off of <26 identified 47% of noncredible subjects, while an immediate/3-min recall cut-off of <10 was associated with 42% sensitivity. In contrast, a cut-off of <7 applied to true positive recognitions reached 61% sensitivity, and the ROCFT Effort Equation, using the cut-off of ≤47, detected at least two-thirds (67.8%) of noncredible subjects. The cut-off for the equation could actually be raised to ≤50 while still maintaining at least 90% specificity, thereby increasing sensitivity to 80%.
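The selection rule described above (the cut-off with the highest sensitivity among those that hold specificity at 90% or better) can be sketched as follows. The Effort Equation rows are copied from Table 4; the function names are hypothetical, and the short score lists used for illustration are invented rather than study data.

```python
def sensitivity(noncredible_scores, cutoff):
    """Proportion of noncredible subjects scoring at or below the cut-off."""
    return sum(score <= cutoff for score in noncredible_scores) / len(noncredible_scores)


def specificity(credible_scores, cutoff):
    """Proportion of credible subjects scoring above the cut-off."""
    return sum(score > cutoff for score in credible_scores) / len(credible_scores)


# Effort Equation rows from Table 4: (cut-off, specificity %, sensitivity %).
EFFORT_EQUATION_ROWS = [
    (37, 100.0, 33.0), (45, 98.4, 59.1), (46, 98.4, 60.9), (47, 96.0, 67.8),
    (48, 96.0, 70.4), (50, 90.3, 80.0), (53, 80.6, 88.7),
]


def best_cutoff(rows, min_specificity=90.0):
    """Pick the row with the highest sensitivity among rows meeting the specificity floor."""
    eligible = [row for row in rows if row[1] >= min_specificity]
    return max(eligible, key=lambda row: row[2])


print(best_cutoff(EFFORT_EQUATION_ROWS))  # (50, 90.3, 80.0)

# Invented scores, for illustration only.
print(sensitivity([40, 45, 52, 60], 50), specificity([48, 55, 60, 70], 50))
```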

Table 4.

Specificity and sensitivity data for Rey-Osterrieth Complex Figure Test scores

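| Score | Cut-off | Specificity (%) | Sensitivity (%) |
|---|---|---|---|
| Copy | | n = 145 | n = 152 |
| | ≤11 | 100.0 | 7.9 |
| | ≤19 | 98.6 | 23.0 |
| | ≤23.0 | 95.2 | 34.2 |
| | ≤25.5 | 91.7 | 47.4 |
| | ≤26 | 89.0 | 52.0 |
| | ≤27 | 84.8 | 57.9 |
| 3-min delay | | n = 145 | n = 151 |
| | ≤2.0 | 100.0 | 5.3 |
| | ≤3.0 | 98.6 | 7.3 |
| | ≤8.0 | 95.2 | 35.1 |
| | ≤9.5 | 91.0 | 41.7 |
| | ≤10.0 | 88.3 | 45.0 |
| | ≤11.0 | 83.4 | 53.6 |
| Recognition true positives | | n = 124 | n = 118 |
| | ≤4 | 100.0 | 24.6 |
| | ≤5 | 98.4 | 43.2 |
| | ≤6 | 92.7 | 61.0 |
| | ≤7 | 79.8 | 73.7 |
| Effort Equation | | n = 124 | n = 115 |
| | ≤37 | 100.0 | 33.0 |
| | ≤45 | 98.4 | 59.1 |
| | ≤46 | 98.4 | 60.9 |
| | ≤47 | 96.0 | 67.8 |
| | ≤48 | 96.0 | 70.4 |
| | ≤50 | 90.3 | 80.0 |
| | ≤53 | 80.6 | 88.7 |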

In the noncredible group, there was no significant association between the ROCFT Effort Equation and age (r = .13, p = .21), but a significant relationship was present with education (r = .25, p = .007). Because education accounted for <7% of test score variance, it was not further considered in statistical analyses. In contrast, in the credible group, there was no significant correlation between Effort Equation scores and education (r = .10, p = .25), but a small significant correlation was observed with age (r = −.22, p = .02). Age accounted for <5% of test score variance and also was not further considered in statistical computations. T-test comparisons showed no significant differences in ROCFT Effort Equation scores between credible men and women (p = .27) or between noncredible men and women (p = .78).
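The variance figures above follow from squaring the correlation coefficients. A brief check, with a hypothetical function name:

```python
def variance_explained(r):
    """Percentage of variance shared between two variables, from Pearson's r."""
    return r ** 2 * 100


# Correlations reported in the text.
for label, r in [("education, noncredible group", 0.25), ("age, credible group", -0.22)]:
    print(label, round(variance_explained(r), 2))
```

This gives 6.25% for education in the noncredible group and about 4.84% for age in the credible group, in line with the stated bounds of <7% and <5%.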

The credible patients who fell below the Effort Equation cut-off of ≤50 were examined for commonalities that might identify populations at risk for failure on the indicator despite performance to true capability. Table 5 summarizes the demographic, diagnostic, and IQ data for these 12 subjects. Seven were men and five were women, and mean education was 12.75 years, only slightly less than the average of 13.27 years in the credible sample as a whole. Five were Caucasian, four were Hispanic, two were African American, and one was Middle Eastern; two subjects spoke English as a second language (out of a total of 23 credible subjects who were non-native English speakers). Visual memory scores available from WMS-III Visual Reproduction II revealed that the mean scaled score in the credible subjects who fell below the ROCFT Effort Equation cut-off was 7.9 (low average) and only slightly lower than the mean scaled score for the credible population as a whole (8.5, SD = 2.56); only one credible subject failing the ROCFT Effort Equation obtained an impaired score on Visual Reproduction II (scaled score = 4). Thus, gender, educational level, ethnicity/culture, language characteristics, and visual memory skill did not appear to be particular risk factors for failure on the Effort Equation.

Table 5.

Demographic, diagnostic, and IQ data for credible patients falling below Effort Equation cut-off of ≤50

| Age | Gender | Education | Ethnicity/language | Diagnosis | Special education | PIQ | FSIQ |
|---|---|---|---|---|---|---|---|
| 27 | Male | 15 | Middle Eastern/ESL | Seizures | Yes | 87 | 95 |
| 33 | Female | 11 | Caucasian | Depression | No | 75 | 82 |
| 43 | Male | 14 | Hispanic | Stroke | No | 84 | 89 |
| 43 | Male | 13 | African American | Learning disability | Yes | 76 | 77 |
| 46 | Female | 10 | Hispanic | Learning disability | No | 73 | 72 |
| 48 | Male | 14 | Caucasian | Learning disability | No | 87 | 82 |
| 51 | Female | 12 | African American | Multiple sclerosis | No | 78 | 72 |
| 53 | Female | 14 | Hispanic | Bipolar | Yes | 102 | 100 |
| 58 | Male | 18 | Hispanic/ESL | Severe TBI | No | 76 | 80 |
| 58 | Female | 10 | Caucasian | Learning disability | No | 98 | 89 |
| 62 | Male | 13 | Caucasian | Somatoform | No | 94 | 91 |
| 64 | Male | | Caucasian | Substance abuse | No | 105 | 104 |

Note: ESL = English as a second language; TBI = Traumatic brain injury.

In contrast, despite the only modest correlation between age and Effort Equation scores in the credible sample as a whole, the mean age for the credible subsample falling below the ROCFT Effort Equation cut-off of ≤50 was 51.3 years, when compared with 42.73 for the credible sample as a whole. Additionally, mean Full Scale IQ was 86.08 and mean Performance IQ was 86.25, when compared with a Full Scale IQ of 96.15 and Performance IQ of 95.30 in the entire credible group. In terms of diagnosis, four subjects who fell below the cut-off had histories of probable learning disability (20% of the probable learning disabled sample, n = 20 with Effort Equation data; three of the subjects falling below the cut-off had histories of special education placement), but the remainder of the diagnoses in this subgroup were heterogeneous.

Thus, the only discernable pattern that emerged was that credible subjects failing the cut-off tended to be of older age and lower intellectual level, and to have histories of learning problems in school, although this latter characteristic may have been a function of lowered intelligence; the mean IQ of the four subjects with learning problems who failed the ROCFT was 80.0 (range of 72–89).

Fifteen patients in the entire credible sample were older than age 59, and of the 12 who had ROCFT Effort Equation data, two obtained ROCFT Equation scores of 50 (83.3% specificity); all remaining scores were higher. Lowering the cut-score to ≤49 resulted in 100% specificity in this group.
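The specificity figure for this older subgroup is simple count arithmetic: two of the 12 subjects fall at or below the cut-off. A minimal sketch, with a hypothetical function name:

```python
def specificity_from_counts(n_credible, n_false_positives):
    """Specificity (%) given the number of credible subjects and how many fall at or below the cut-off."""
    return 100 * (n_credible - n_false_positives) / n_credible


# Cut-off of <=50: two of 12 older credible subjects scored exactly 50.
print(round(specificity_from_counts(12, 2), 1))  # 83.3
# Cut-off of <=49: neither of those subjects is flagged.
print(round(specificity_from_counts(12, 0), 1))  # 100.0
```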

In credible subjects of low average intelligence (i.e., 80–89; n = 28, all with Effort Equation data), decreasing the cut-off to ≤48 resulted in at least 90% specificity, and in subjects with borderline intelligence (i.e., 70–79; n = 15, all with Effort Equation data), lowering the cut-off to ≤47 reached 100% specificity. Adjusting the cut-off to ≤48 achieved at least 90% specificity in the 20 credible subjects with histories of learning problems who had ROCFT Effort Equation data.
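The subgroup-specific cut-offs reported here and in the preceding paragraph can be collected into a small lookup. This is a sketch only: the function and parameter names are hypothetical, and taking the lowest applicable cut-off when a patient falls into more than one subgroup is an assumption that the study does not address. Scores of IQ < 70 are rejected because such patients were excluded from the credible sample.

```python
DEFAULT_CUTOFF = 50  # cut-off for the credible sample as a whole


def adjusted_cutoff(age=None, iq=None, learning_problems=False):
    """Return an Effort Equation cut-off (fail if score <= cut-off) for a patient profile."""
    if iq is not None and iq < 70:
        raise ValueError("cut-offs were not derived for IQ < 70")
    candidates = [DEFAULT_CUTOFF]
    if age is not None and age > 59:
        candidates.append(49)  # patients older than 59
    if learning_problems:
        candidates.append(48)  # history of learning problems
    if iq is not None and 80 <= iq <= 89:
        candidates.append(48)  # low average intelligence
    if iq is not None and 70 <= iq <= 79:
        candidates.append(47)  # borderline intelligence
    # Assumption: when several adjustments apply, use the most conservative one.
    return min(candidates)


print(adjusted_cutoff(age=62, iq=75))
```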

The largest diagnostic subgroups in the noncredible sample were mild traumatic brain injury (n = 24 with Effort Equation data), psychosis (n = 19 with Effort Equation data), and depression (n = 13 with Effort Equation data). Using the ROCFT cut-off of ≤50, sensitivity rates for these three subgroups were 83.3%, 63.2%, and 69.2%, respectively.
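The subgroup sensitivity rates can be converted back to approximate counts of detected subjects. The counts below are inferred from the reported percentages and subgroup sizes; they are not reported in the text, and the names are hypothetical.

```python
# (subgroup size with Effort Equation data, reported sensitivity % at cut-off <= 50)
SUBGROUPS = {
    "mild traumatic brain injury": (24, 83.3),
    "psychosis": (19, 63.2),
    "depression": (13, 69.2),
}


def implied_detected(n, sensitivity_pct):
    """Nearest whole number of subjects at or below the cut-off implied by a sensitivity rate."""
    return round(n * sensitivity_pct / 100)


for name, (n, pct) in SUBGROUPS.items():
    print(name, implied_detected(n, pct), "of", n)
```

These work out to 20 of 24, 12 of 19, and 9 of 13 subjects, which shows how few cases separate the subgroup rates.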

Discussion

The ROCFT was originally developed to assess visual constructional skill and visual memory (Lezak, 2005), but subsequently a performance validity indicator derived from the test was found to be useful in detecting negative response bias (Lu et al., 2003). Lu and colleagues (2003) observed that an equation incorporating ROCFT copy and Meyers and Meyers (1995) recognition trial scores achieved 76% sensitivity at ≥90% specificity in a fairly large “real world” sample (70 credible and 58 noncredible patients). The current study accessed archival ROCFT data for 146 credible patients (124 with Effort Equation data) and 157 noncredible patients (115 with Effort Equation data) obtained from 2001 and later (with none included in the Lu et al. publication). In the current study, the noncredible subjects scored significantly lower on all four ROCFT scores: copy, immediate/3-min recall, total correct on recognition trial, and the ROCFT Effort Equation. However, the Effort Equation was most effective in classifying subjects; a cut-off of ≤47 achieved similar sensitivity to the Lu and colleagues (2003) study (i.e., nearly 68%) while maintaining specificity of 96.0%. The cut-off could actually be raised to ≤50 while still maintaining specificity of ≥90%, thereby increasing sensitivity to 80%. Application of the ROCFT Effort Equation cut-off of ≤50 individually to the largest subgroups of noncredible subjects revealed sensitivity rates of >80% in the subjects claiming mild traumatic brain injury, and between 60% and 70% in patients claiming psychosis or depression, suggesting that the ROCFT Effort Equation is particularly effective in the context of claimed mild traumatic brain injury.

Thus, the current data provide corroboration of the Lu and colleagues (2003) finding that the ROCFT Effort Equation is successful in detecting performance invalidity. However, some patient groups appear to be at a slightly increased risk for false-positive identification on the task, namely those of older age, lower intelligence (low average and borderline), and history of learning problems in school. Minor adjustments to the cut-off allow adequate protection of these groups; specificity of at least 90% was achieved by lowering the cut-off to ≤49 in older patients, to ≤48 in individuals with histories of learning problems or low average intelligence, and to ≤47 for individuals of borderline intelligence. Of note, subjects with IQ scores <70 and dementia diagnoses were excluded from the credible sample because these populations exhibit high failure rates on PVTs despite performing with adequate effort (Dean et al., 2008, 2009). Their inclusion in the sample would have required selection of very lenient cut-offs (to maintain 90% specificity), which would have severely impacted test sensitivity. Because these types of patients were excluded, the cut-offs recommended by the current data cannot be used in the differential of actual versus feigned dementia and extremely low IQ.

A recent study in Germany (Blaskewitz et al., 2009) reported somewhat lower sensitivity for the ROCFT Effort Equation (52%), but this appears to be due to use of a lower cut-off (≤45). The authors concluded that embedded ROCFT performance validity indicators were not sufficiently effective due to the relatively low sensitivity rates. However, the equation achieved 95% specificity with a cut-off of ≤45, and it is likely that the authors could have raised the cut-off and still obtained specificity of ≥90%, while increasing test sensitivity. Unfortunately, the authors do not report specificity and sensitivity rates for higher Effort Equation cut-scores.

In comparing the results of the Blaskewitz and colleagues (2009) and current studies, we are struck by the similarity of findings, despite the fact that Blaskewitz and colleagues employed a different test administration format and their patient groups were of differing ethnicity and education levels from those in the current study. In the present study, the ethnically diverse noncredible group, who averaged 12.5 years of education and were tested in English in California, obtained mean scores of 24.6 for the copy trial, 11.2 for the immediate/3-min recall trial, and 41.3 for the Effort Equation. The Blaskewitz and colleagues (2009) noncredible sample, who were Turkish immigrants in Germany with an average of 5.5 years of education, achieved a mean copy score of 21.9, a mean immediate recall score of 11.2, and a mean Effort Equation score of 40.6. Sensitivity rates were also similar when the same cut-score was employed; specifically, using a cut-off of ≤45, sensitivity was 52% in the Blaskewitz and colleagues (2009) sample and 59.1% in the current sample. Thus, these data suggest that noncredible subjects of very different educational and cultural backgrounds perform similarly on the ROCFT scores.

Examination of the Blaskewitz and colleagues (2009) credible sample (German neurology patients with an average of 15.3 years of education) suggests that they may have been more cognitively impaired than the credible patients in the current study. While MMSE scores ranged from 26 to 30 in the German sample, mean ROCFT copy and recall scores were lower than those for credible subjects in the current study. Specifically, the German sample averaged 28.7 for the copy trial and 12.5 on immediate/3-min delay, when compared with mean scores of 31.4 and 17.1, respectively, in the current credible sample. Thus, the German patients, as a group, appear to have had impairments in constructional skills and visual memory. Blaskewitz and colleagues did not appear to exclude patients with dementia, amnestic disorder, or IQ < 70; in fact 44% of their subjects had cerebrovascular pathology, and testing of three-quarters of them occurred within the acute or subacute phase of illness (median of 5 days post injury/illness onset). It is reassuring that such high specificity levels for the Effort Equation were obtained in that study, despite evidence of a high rate of constructional and visual memory deficit in those credible subjects.

Additionally, Blaskewitz and colleagues (2009) applied the ROCFT Effort Equation to a different test administration format than that employed by the Lu and colleagues (2003) and current studies. Specifically, they administered an immediate/3-min delay, and then a 30-min delay, after which the recognition trial was given, yet they still achieved specificity and sensitivity rates similar to those found in the Lu and colleagues (2003) and current study (in which the recognition trial followed the immediate/3-min delay). We could not locate any studies investigating comparability of recognition scores obtained after a single immediate/3-min delay versus after immediate/3-min plus 30-min delays, but given that immediate/3-min and 30-min recall scores differ by ≤2 points within patient and normal samples (see Meyers & Meyers, 1995), recognition scores would not be expected to display a different pattern. Taken together, these data allow us to tentatively conclude that the Lu and colleagues (2003) equation can likely be imported for use in those examinations employing the recognition trial after a 30-min delay.

In conclusion, results of the current study provide evidence that the ROCFT Effort Equation is an effective measure of response bias in clinical neuropsychological assessments, particularly in those subjects claiming residuals from mild traumatic brain injury, although slight adjustments to cut-offs may be required for older individuals or those with histories and/or independent evidence of premorbid lowered intellectual level.

Conflict of Interest

Some subjects were obtained from the forensic practice of the second author.

References

Arnold, G., Boone, K. B., Lu, P., Dean, A., Wen, J., & Nitch, S. (2005). Sensitivity and specificity of finger tapping test scores for the detection of suspect effort. Journal of Clinical and Experimental Neuropsychology, 19, 105–120.
Babikian, T., Boone, K., Lu, P., & Arnold, G. (2006). Sensitivity and specificity of various digit span scores in the detection of suspect effort. Clinical Neuropsychologist, 20, 145–159.
Bianchini, K. J., Mathias, C. W., Greve, K. W., Houston, R. J., & Crouch, J. A. (2001). Classification accuracy of the Portland Digit Recognition Test in traumatic brain injury. The Clinical Neuropsychologist, 15, 461–470.
Blaskewitz, N., Merten, T., & Brockhaus, R. (2009). Detection of suboptimal effort with the Rey Complex Figure Test and Recognition Trial. Journal of Applied Neuropsychology, 16, 54–61.
Boone, K. B., Lesser, I. M., Hill-Gutierrez, E., Berman, N. G., & D'Elia, L. F. (1993). Rey-Osterrieth complex figure performance in healthy, older adults: Relationship to age, education, sex, and IQ. Clinical Neuropsychologist, 7, 22–28.
Boone, K. B., Lu, P. H., & Herzberg, D. (2002a). b test. Los Angeles, CA: Western Psychological Services.
Boone, K. B., Lu, P. H., & Herzberg, D. (2002b). Rey Dot Counting Test. Los Angeles, CA: Western Psychological Services.
Boone, K. B., Lu, P., & Wen, J. (2005). Comparison of various RAVLT scores in the detection of noncredible memory performance. Archives of Clinical Neuropsychology, 20, 310–319.
Boone, K. B., Salazar, X., Lu, P., Warner-Chacon, K., & Razani, J. (2002). The Rey 15-item Recognition Trial: A technique to enhance sensitivity of the Rey 15-item Memorization Test. Journal of Clinical and Experimental Neuropsychology, 24, 561–573.
Dean, A. C., Victor, T. L., Boone, K. B., & Arnold, G. (2008). The relationship of IQ to effort test performance. The Clinical Neuropsychologist, 22, 705–722.
Dean, A. C., Victor, T., Boone, K. B., Philpott, L., & Hess, R. (2009). Dementia and effort test performance. The Clinical Neuropsychologist, 23, 133–152.
Fastenau, P. S. (2002). The Extended Complex Figure Test (ECFT). Los Angeles: Western Psychological Services.
Kim, M. S., Boone, K. B., Marion, S. D., Amano, S., Cottingham, M. E., Ziegler, E. A., et al. (2010). The Warrington Recognition Memory Test for Words as a measure of response bias: Total score and response time cutoffs developed on ‘real world’ credible and noncredible subjects. Archives of Clinical Neuropsychology, 25, 60–70.
Lezak, M. D. (2005). Neuropsychological assessment (3rd ed.). New York: Oxford University Press.
Lu, P. H., Boone, K. B., Cozolino, L., & Mitchell, C. (2003). Effectiveness of the Rey-Osterrieth Complex Figure Test and the Meyers and Meyers recognition trial in the detection of suspect effort. Clinical Neuropsychologist, 17, 426–440.
Meyers, J., & Meyers, K. (1995). Rey Complex Figure and Recognition Trial: Professional manual. Odessa, FL: Psychological Assessment Resources.
Meyers, J. E., & Volbrecht, M. (1998). Validation of memory error patterns on the Rey Complex Figure and Recognition Trial. Applied Neuropsychology, 5, 120–131.
Mitrushina, M., Boone, K. B., Razani, J., & D'Elia, L. F. (2005). Handbook of Normative Data for Neuropsychological Assessment (2nd ed.). New York, NY: Oxford University Press.
Nitch, S., Boone, K. B., Wen, J., Arnold, G., & Alfano, K. (2006). The utility of the Rey Word Recognition Test in the detection of suspect effort. Clinical Neuropsychologist, 20, 873–887.
Slick, D. J., Sherman, E. M. S., & Iverson, G. L. (1999). Diagnostic criteria for malingered neurocognitive dysfunction: Proposed standards for clinical practice and research. The Clinical Neuropsychologist, 13, 545–561.
Victor, T., Boone, K. B., Serpa, G., Beuhler, J., & Ziegler, E. A. (2009). Interpreting the meaning of multiple effort test failure. The Clinical Neuropsychologist, 23, 297–313.