Abstract

The current study sought to report the base rates of Symptom Validity Test (SVT) failure in an active duty military sample as well as to compare the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS) Effort Index (EI) to stand-alone measures of symptom validity. SVT failure varied from previous studies and even among different subgroups in the current sample, ranging from 8% to 30%. The RBANS EI demonstrated modest sensitivity in the detection of suboptimal effort when compared with stand-alone SVTs. Although the index appears to add some utility to the detection of suboptimal effort, sole use of the EI as a measure of symptom validity could conceivably result in an unnecessarily high rate of false negatives.

Introduction

The import of effort testing as an essential element of neuropsychological assessment has gained increasing acceptance (Bush et al., 2005). With the literature indicating that the qualitative analysis of effort is tenuous at best (Faust, Hart, Guilmette, & Arkes, 1988; Heaton, Smith, Lehman, & Vogt, 1978), various quantitative measures have been devised to assess response bias. These measures take one of two forms: (1) indices embedded in standard neuropsychological tests (e.g., Reliable Digit Span [Greiffenstein, Baker, & Gola, 1994] and Forced Choice on the California Verbal Learning Test-II [Delis, Kramer, Kaplan, & Ober, 2000]) or (2) stand-alone instruments designed and standardized to evaluate symptom validity (e.g., Test of Memory Malingering [TOMM; Tombaugh, 1996] and Word Memory Test [Green, 2003]). The intuitive value of quantitatively assuring the reliability of patients' scores is augmented by the research literature, which suggests that effort can account for more variance in neuropsychological test scores than severity of neurological insult (Constantinou, Bauer, Ashendorf, Fisher, & McCaffrey, 2005; Green, 2007; Green, Rohling, Lees-Haley, & Allen, 2001; Stevens, Friedel, Mehren, & Merten, 2008).

The Repeatable Battery for the Assessment of Neuropsychological Status (RBANS; Randolph, 1998) is a psychometrically sound and well-validated screening measure of cognitive functioning. In an attempt to craft an internal validity indicator for the RBANS, Silverberg, Wertheimer, and Fichtenberg (2007) constructed an embedded Effort Index (EI) based on RBANS tasks that have demonstrated utility as measures of effort/response bias. Based on the rationale that combination scores, which include multiple (nonredundant) measures, can improve the psychometric qualities of effort measures, Silverberg and colleagues constructed a weighted scoring system based on examinees' raw scores on the RBANS Digit Span and Word List Recognition subtests that can range from 0 to 12. An initial validation study was conducted with a neurologically and psychiatrically heterogeneous sample to determine the frequency of various raw score cut points on the specified subtests. Based on these results the authors concluded that an EI score of >3 should be considered “suspicious” given the infrequency seen in their sample of patients referred for neuropsychological assessment. In a second validation study, cut scores were determined after the examination of the following five groups: (1) clinical mild traumatic brain injury (mTBI), (2) clinical malingering, (3) simulated-naïve malingerers, (4) simulated-coached malingerers, and (5) controls. Based on these data, the authors concluded that “a more liberal cut-off score of 1 is optimal to discriminate between examinees with actual versus falsely alleged cognitive impairment owing to post-acute mild TBI” (p. 851). Silverberg and colleagues reported that 86%–96% of examinees in groups thought to be withholding effort were identified as such.

To the authors' knowledge, only two other published studies have attempted to validate the RBANS EI. First, Hook, Marquine, and Hoelzle (2009) sought to examine the utility of the EI in a sample of medically ill, nonlitigating older adults. The authors found that 31% of their sample was classified as demonstrating suspect effort with the more conservative RBANS EI cut score of >3. Hook and colleagues found a significant relationship between RBANS EI scores and both the RBANS total and Mini-Mental State Examination (MMSE; Folstein et al., 1975) scores. The authors then divided their sample into cognitively impaired and noncognitively impaired groups based on RBANS total and MMSE scores. While the samples were small, in the noncognitively impaired group no patients scored >3, while roughly 83% of the cognitively impaired group scored above this threshold. Interestingly, an examination of the reported data shows that when utilizing the EI cut score of 1, approximately 56% of the noncognitively impaired group scored in the suspect effort range. The authors concluded that the Silverberg and colleagues guidelines may not be useful in a cognitively impaired medically ill geriatric sample and called for further research on the measure.

Second, a recent article by Barker, Horner, and Bachman (2010) found modest predictive utility for the RBANS EI in a sample of geriatric patients seen in a memory disorders clinic. Participants were classified into suspect or probable good effort groups by both TOMM performance and clinical consensus. The authors, however, acknowledged that the TOMM is not maximally sensitive to suboptimal effort and the employment of other tests of symptom validity may have improved classification. It was reported that an EI cut score of >3 was best for this sample as it yielded a specificity of approximately 85% with a sensitivity of 64%. Based on the modest operating characteristics demonstrated by the EI, the researchers encouraged its use as a supplement to additional measures of symptom validity.

To the authors' knowledge, no published study outside of the work by Barker and colleagues has attempted to determine the concordance of the RBANS EI with a validated measure of symptom validity. While Silverberg and colleagues (2007) apparently administered the TOMM during one of their validation studies, classification statistics were not reported in their paper. Hook and colleagues (2009) called for such an investigation in a sample similar to theirs; however, given the general paucity of such research, this inquiry appears justified in any patient group. One aim of the current study is to compare classifications derived from the RBANS EI to established stand-alone measures of symptom validity. While the Silverberg and colleagues study showed good EI sensitivity to suboptimal effort, previous research has suggested that the utility of the tasks used in the EI to gauge effort and the EI itself are limited (e.g., Axelrod, Fichtenberg, Millis, & Wertheimer, 2006; Barker et al., 2010; Heinly, Greve, Bianchini, Love, & Brennan, 2005). As such, we predicted that the RBANS EI would demonstrate low concordance and limited sensitivity relative to validated instruments designed to measure effort.

Another aim of the current study is to describe the base rates of SVT failure in a military sample. To the authors' knowledge three studies have sought to describe base rates in similar samples. Armistead-Jehle (2010) found a 58% failure rate on the Medical Symptom Validity Test (MSVT; Green, 2004) in 45 U.S. veterans referred clinically for evaluation of possible post-concussive symptoms. Second, Whitney and colleagues (2009) administered the MSVT to a sample of 23 combat veterans reporting mTBI referred for neuropsychological testing within a Veterans Affairs Medical Center (VAMC). The sample was comprised of nine individuals still enrolled in active duty service and 14 who had recently been discharged. Whitney and colleagues observed a 17% failure rate on the MSVT, with all of those failing (n = 4) still on active duty service. Third, Nelson and colleagues (2010) evaluated various stand-alone and embedded effort measures as a function of forensic and research contexts in a sample of 119 Operations Iraqi Freedom (OIF) and Enduring Freedom (OEF) and non-OIF/OEF veterans. The veterans in the forensic group evidenced elevated rates of insufficient effort relative to veterans in the research group. Specifically, 59%, 16%, and 9% of the forensic group (defined by veterans involved in a compensation and pension [C&P] evaluation) demonstrated insufficient effort on the Victoria Symptom Validity Test (VSVT; Slick, Hopp, Straus, & Spellacy, 1996), CVLT-II Forced Choice, and the Rey-15 Item and Recognition Test (FIT; Boone, Salazar, Lu, Warner-Chacon, & Razani, 2002), respectively. Conversely, those in the research group evidenced a rate of failure of 8% on the VSVT, 3% on the CVLT-II Forced Choice, and 0% on the FIT. The authors argue that the context of the neuropsychological evaluation (i.e., forensic vs. research) is a salient variable in considering SVT performance, whereas the patient cohort (i.e., OIF/OEF vs. non-OIF/OEF) is not.

The Whitney and colleagues (2009) study then appears to be the only investigation in the extant literature that included active duty military service members in an examination of symptom validity testing. As such, given that the current study would consist exclusively of a similar sample (i.e., service members still completing their military service), the base rate of SVT failure was expected to most closely approximate this investigation. Taken together the dual aims of the current work are to provide information on select SVT characteristics in a unique patient sample that has yet to be exclusively studied.

Method

Participants

The study included 85 U.S. active duty military service members evaluated by the first author in an outpatient neuropsychology clinic located at Midwest U.S. Army Health Center between June 2009 and August 2010. The sample was primarily comprised of individuals reporting a history of mTBI/concussion and/or various mental health conditions to include Post-Traumatic Stress Disorder and Unipolar Depressive Disorder. Regarding neurological diagnoses, 72 (84.7%) of the participants reported a history of at least one mTBI/concussion as defined by the American Congress of Rehabilitation Medicine criteria (ACRM, 1993). At the time of evaluation, all patients were at least 1 month out from their injury, with most patients experiencing their injuries more than a year before the assessment. One patient (1.2%) had a previous brain injury that met criteria for moderate severity. It was determined on evaluation that nine (10.6%) of the patients, despite the clinical referral for neuropsychological evaluation, did not meet the minimum criteria for an mTBI. Of the remaining three patients, two (2.4%) had primary diagnoses of chronic pain related to orthopedic issues and one (1.2%) was diagnosed with narcolepsy. Regarding psychiatric diagnoses, 38 (44.7%) of the participants had an Anxiety Disorder/PTSD diagnosis, 13 (15.3%) had a depressive disorder diagnosis, 12 (14.1%) had comorbid diagnoses of anxiety and depression, 2 (2.4%) had an adjustment disorder diagnosis, 1 (1.2%) had a V-code diagnosis, and 18 (21.2%) had no psychiatric diagnosis. Very few members of the sample had any discernable motivation for secondary gain. No participants were known to be pending any litigation. Seven of the evaluations (8.2%) were conducted in the context of a medical evaluation board and as such there may have been some incentive for symptom exaggeration in a very small minority of the sample.

The average age of the sample was 34.0 years (SD = 6.7) with an average education of 15.0 years (SD = 2.1). All participants spoke English fluently and the testing was conducted in English. The majority of the sample was men (90.6%). Ethnic breakdown was as follows: Caucasian (80.0%), African American (11.8%), Hispanic (5.9%), and Native American (2.3%). Regarding branch of service, 83.5% of the sample was active duty Army, 5.9% Army Reserve, 4.7% Army National Guard, 3.5% Marine Corps, and 2.4% Navy. Of the sample, 54.4% were officers and 45.6% were enlisted. The disproportionate number of military officers in the current sample is secondary to the Army installation from which the sample was derived, which houses a nearly year-long officer training course titled Intermediate-Level Education (ILE). This academically rigorous course is attended by mid-level officers and aims to prepare them to operate as field-grade commanders and staff officers in full-spectrum Army, joint, interagency, and multinational environments. Approximately 45% of the sample was enrolled in this course at the time of testing. This retrospective analysis of clinical data was approved by the Institutional Review Board at Madigan Army Medical Center.

Procedures and Measures

All patients were tested by the first author or a trained neuropsychology technician under the supervision of the first author. Participants were referred for evaluation by the health center's primary care or mental health clinics or after a positive concussion screening completed during the in-processing session for the ILE students. Participants were administered a screening battery of neuropsychological tests, which included the measures outlined below. Prior to all evaluations, the patients gave consent for the assessment and were instructed to provide their best effort across the tests administered. The patients were never alerted to the types of SVTs employed or the order within test administration. SVT administration was scattered throughout the screening battery. The examiner remained in the room at all times throughout the evaluations.

The TOMM (Tombaugh, 1996) is a forced choice recognition task involving the presentation of 50 line drawings across two trials. After each trial, the examinee is shown a target picture paired with a foil and asked to select the target item. According to the TOMM manual (Tombaugh, 1996), the measure showed 100% specificity and 82% sensitivity in a simulator study (p. 15). Validation within a clinical sample, however, demonstrated a 27% false-positive rate and 73% specificity in patients with dementia (p. 13). The full psychometric properties of the TOMM are outlined in the test manual (Tombaugh, 1996). Failure on this measure was determined as specified in the manual based on the individual's score on the second trial. The optional retention trial was not administered in the current study secondary to the time demands present in the clinical environment from which these data were collected.

The MSVT (Green, 2004) is a brief automated verbal memory screening with several subtests designed to measure verbal memory and response consistency. Ten easy-to-remember word pairs, each representing a single common object (e.g., Ballpoint-Pen), are shown across two trials. Afterwards, two forced choice recognition subtests are administered and the measure is concluded with Paired Associates and Free Recall trials. Failure of this measure was determined as specified in the test manual. In addition to data presented in the manual, a number of studies have demonstrated the utility of this measure in the discrimination between those with genuine memory impairment and those simulating impairment in a range of patient samples (e.g., Chafetz, 2008; Howe & Loring, 2009; Singhal, Green, Ashaye, Shankar, & Gill, 2009; see Carone, 2009, for review). For instance, specificity of the MSVT in patients with possible dementia has been reported as 91% and 100% (Howe & Loring, 2009; Singhal, Green, Ashaye, Shankar, & Gill, 2009). Sensitivity has been reported to be approximately 97% in simulator studies (Green, 2004).

The Nonverbal MSVT (NV-MSVT; Green, 2008) is a brief automated nonverbal memory screening with several subtests designed to measure nonverbal memory and response consistency. Ten artist-drawn colored images, each representing an intuitive pair of items (e.g., a baseball and a baseball bat), are shown across two trials. Afterwards, a series of forced choice trials with varying degrees of difficulty are presented. These are followed by a Paired Associates recall subtest and the measure is concluded with a Free Recall task. Failure of this measure was determined as specified in the test manual. According to the manual, the NV-MSVT was reported to have 100% specificity in good effort volunteers, 95% specificity in patients diagnosed with dementia, and 72.5% sensitivity to poor effort in simulators. The sensitivity and specificity of the NV-MSVT as a measure of symptom validity have been further validated in a number of other studies (e.g., Green, Flaro, Brockhaus, & Montijo, 2010; Henry, Merten, Wolf, & Harth, 2009; Singhal et al., 2009; see Wagner & Howe, 2010, for review).

The RBANS (Randolph, 1998) is a brief battery comprised of 12 subtests that yields five index scores and a total summary score. The psychometric properties and clinical utility of this measure have been well established (e.g., McKay, Casey, Wertheimer, & Fichtenberg, 2007; Moser & Shatz, 2002). Silverberg and colleagues (2007) constructed the RBANS EI by converting raw scores from the Digit Span and Word List Recognition subtests to weighted scores each ranging from 0 to 6. These two weighted scores are then summed to arrive at the EI score, which can range from 0 to 12. According to Silverberg and colleagues, while the initial RBANS EI study cited a cut score of >3 as “suspicious,” a later validation study determined that a cut score of 1 should be used in patients with post-acute mTBI. Classification statistics for both cut scores were computed in the current analysis.
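The EI scoring logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the raw-to-weighted conversion published by Silverberg and colleagues (2007) is not reproduced here, so the function takes already-weighted scores as input, and the name `effort_index` and the returned keys are hypothetical.

```python
def effort_index(digit_span_weighted: int, word_list_recognition_weighted: int) -> dict:
    """Sum two weighted subtest scores (each 0-6) into an EI score (0-12).

    Assumption: callers supply weighted scores; the raw-to-weighted
    conversion from Silverberg et al. (2007) is not reproduced here.
    """
    inputs = (
        ("digit_span_weighted", digit_span_weighted),
        ("word_list_recognition_weighted", word_list_recognition_weighted),
    )
    for name, value in inputs:
        if not 0 <= value <= 6:
            raise ValueError(f"{name} must be between 0 and 6, got {value}")

    ei = digit_span_weighted + word_list_recognition_weighted
    return {
        "ei": ei,
        "fails_cut_1": ei >= 1,   # more liberal cut score (post-acute mTBI)
        "fails_cut_3": ei > 3,    # more conservative "suspicious" cut score
    }


# Hypothetical weighted scores, for illustration only.
for ds, wl in [(0, 0), (2, 1), (3, 2)]:
    print((ds, wl), effort_index(ds, wl))
```

Returning both flags at once mirrors the analysis, in which every examinee is classified under each cut score.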

Results

The various stand-alone measures of effort garnered differing failure rates, with the MSVT showing the highest rate of failure at 20%, followed by the NV-MSVT failure rate of 15% and the TOMM failure rate of 11%. Specific subtest scores for the various stand-alone effort measures are listed in Tables 1–3. In total, eight participants (9.4%) failed all three of the stand-alone measures. Employing a cut score of ≥1, the RBANS EI was failed by 14% of the sample. When the RBANS EI cut score was elevated to >3, it was failed by 7% of the sample.

Table 1.

MSVT scores in 85 Active Duty Military Personnel

| Group | MSVT score | Mean (% correct) | Standard deviation | Range |
| --- | --- | --- | --- | --- |
| Pass MSVT (n = 68) | IR | 99.6 | 1.3 | 95 to 100 |
| | DR | 98.6 | 2.8 | 90 to 100 |
| | CNS | 98.4 | 2.9 | 90 to 100 |
| | PA | 95.9 | 10.4 | 50 to 100 |
| | FR | 77.0 | 12.8 | 45 to 100 |
| Fail MSVT (n = 17) | IR | 85.0 | 8.1 | 70 to 100 |
| | DR | 75.0 | 13.1 | 50 to 90 |
| | CNS | 75.9 | 11.1 | 50 to 85 |
| | PA | 68.8 | 19.6 | 40 to 100 |
| | FR | 48.8 | 15.2 | 25 to 85 |
| | Easy Subtests | 78.5 | 9.3 | 60 to 90 |
| | Hard Subtests | 58.8 | 15.9 | 35 to 87.5 |
| | Easy-Hard Subtests | 19.5 | 12.2 | −9.2 to 37.5 |

Notes: MSVT = Medical Symptom Validity Test; IR = Immediate Recognition; DR = Delayed Recognition; CNS = Consistency; PA = Paired Associate; FR = Free Recall. Easy-Hard subtest differences in those passing MSVT were not reported as this statistic is only examined in individuals who fail the easy subtests (Green, 2004).

Table 2.

TOMM scores in 85 Active Duty Military Personnel

| Group | TOMM score | Mean (% correct) | Standard deviation | Range |
| --- | --- | --- | --- | --- |
| Pass TOMM (n = 75) | 1st Trial | 46.6 | 4.3 | 32–52 |
| | 2nd Trial | 49.8 | 0.7 | 45–50 |
| Fail TOMM (n = 10) | 1st Trial | 33.2 | 6.8 | 25–48 |
| | 2nd Trial | 38.8 | 5.5 | 31–44 |

Note: TOMM = Test of Memory Malingering.

Table 3.

NV-MSVT scores in 85 Active Duty Military Personnel

| Group | NV-MSVT score | Mean (% correct) | Standard deviation | Range |
| --- | --- | --- | --- | --- |
| Pass NV-MSVT (n = 72) | IR | 100.0 | 0.0 | 100–100 |
| | DR | 98.5 | 3.6 | 75–100 |
| | CNS | 98.5 | 3.6 | 75–100 |
| | DRA | 95.8 | 6.2 | 75–100 |
| | DRV | 97.4 | 5.0 | 80–100 |
| | PA | 99.9 | 1.2 | 90–100 |
| | FR | 76.7 | 13.0 | 40–100 |
| Fail NV-MSVT (n = 13) | IR | 97.7 | 4.4 | 85–100 |
| | DR | 83.9 | 11.4 | 65–100 |
| | CNS | 83.9 | 11.8 | 60–100 |
| | DRA | 75.8 | 13.8 | 55–100 |
| | DRV | 74.6 | 12.7 | 50–100 |
| | PA | 91.5 | 14.6 | 50–100 |
| | FR | 51.3 | 19.1 | 20–85 |

Notes: NV-MSVT = Nonverbal Medical Symptom Validity Test; IR = Immediate Recognition; DR = Delayed Recognition; CNS = Consistency; DRA = Delayed Recognition Archetypes; DRV = Delayed Recognition Variations; PA = Paired Associate; FR = Free Recall.

The phi correlations between the three stand-alone effort measures and the RBANS EI with cut scores of ≥1 and >3 are given in Table 4. As expected, given the relatively low base rate of failure across all instruments, there was a high degree of correlation; however, in general, correlations among the stand-alone measures were higher than correlations between the stand-alone measures and the RBANS EI. Classification statistics between the three stand-alone effort measures and the RBANS EI with cut scores of ≥1 and >3 can be found in Tables 5 and 6, respectively. The sensitivity of the RBANS EI with a cut score of ≥1 ranged between 0.53 and 0.62 depending on the SVT used. Specificity was high across the three stand-alone SVTs examined, ranging between 0.92 and 0.96. Positive Predictive Power (PPP) was at near chance levels for the NV-MSVT (0.51) and TOMM (0.58), with a modest improvement on the MSVT (0.77). Negative Predictive Power (NPP) was high, ranging between 0.89 and 0.94. The modest sensitivity of the RBANS EI dropped notably when a cut score of >3 was employed, with sensitivity ranging from 0.24 to 0.31. Specificity remained high with the RBANS EI cut score of >3, ranging from 0.96 to 0.97. Across the stand-alone measures, the associated PPP was also generally limited with the higher cut score. The NPP of the RBANS EI relative to the three stand-alone measures declined to a degree with the higher cut score, ranging from 0.84 to 0.91.
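For readers who wish to see how predictive power follows from sensitivity, specificity, and base rate, the standard Bayesian relationships can be sketched as below. This is a minimal illustration, not the authors' analysis code; the function name `classification_stats` is hypothetical, and the example inputs are the rounded MSVT values reported for the EI cut score of ≥1 (sensitivity 0.53, specificity 0.96, base rate 0.20).

```python
def classification_stats(sens: float, spec: float, base_rate: float) -> dict:
    """Derive PPP, NPP, hit rate, and LR+ from sensitivity, specificity, base rate."""
    if spec >= 1.0:
        raise ValueError("LR+ is undefined when specificity equals 1")

    # Expected proportions of the whole sample in each cell of the 2x2 table.
    true_pos = sens * base_rate
    false_neg = (1 - sens) * base_rate
    true_neg = spec * (1 - base_rate)
    false_pos = (1 - spec) * (1 - base_rate)

    return {
        "ppp": true_pos / (true_pos + false_pos),   # P(criterion fail | EI fail)
        "npp": true_neg / (true_neg + false_neg),   # P(criterion pass | EI pass)
        "hit_rate": true_pos + true_neg,            # overall agreement
        "lr_pos": sens / (1 - spec),                # positive likelihood ratio
    }


# Rounded values reported for EI >= 1 against the MSVT.
stats = classification_stats(sens=0.53, spec=0.96, base_rate=0.20)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

Because the inputs are rounded to two decimals, recomputed values may differ slightly in the last digit from statistics derived from raw counts.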

Table 4.

Correlations between MSVT, NV-MSVT, TOMM, and the RBANS EI with cut scores of ≥1 and >3 (n = 85)

 
| Measure | 1 | 2 | 3 | 4 | 5 |
| --- | --- | --- | --- | --- | --- |
| 1. MSVT | 1.00 | | | | |
| 2. NV-MSVT | .768 | 1.00 | | | |
| 3. TOMM | .548 | .758 | 1.00 | | |
| 4. RBANS EI ≥1 | .557 | .579 | .481 | 1.00 | |
| 5. RBANS EI >3 | .322* | .393* | .327 | .680 | 1.00 |

Notes: MSVT = Medical Symptom Validity Test; NV-MSVT = Nonverbal Medical Symptom Validity Test; TOMM = Test of Memory Malingering; RBANS EI = Repeatable Battery for the Assessment of Neuropsychological Status Effort Index. Unless indicated, all correlations are p < .001.

*p < .01.
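As an aside on how a phi correlation between two pass/fail measures is obtained, the coefficient can be computed directly from a 2 × 2 cross-classification. The sketch below is illustrative only. The marginal totals (17 MSVT failures and 13 NV-MSVT failures among 85 examinees) come from the tables above, but the individual cell counts are an assumption chosen to be consistent with those marginals; they were not reported in this study. The function name `phi_coefficient` is hypothetical.

```python
from math import sqrt


def phi_coefficient(both_fail: int, only_a_fail: int, only_b_fail: int, both_pass: int) -> float:
    """Phi correlation for two dichotomous (pass/fail) measures A and B."""
    a, b, c, d = both_fail, only_a_fail, only_b_fail, both_pass
    denominator = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (a * d - b * c) / denominator


# ASSUMED cell counts (A = MSVT, B = NV-MSVT); only the marginals
# 17 + 68 = 85 and 13 + 72 = 85 are taken from the reported tables.
phi = phi_coefficient(both_fail=12, only_a_fail=5, only_b_fail=1, both_pass=67)
print(f"phi = {phi:.3f}")
```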

Table 5.

Classification statistics of RBANS EI cut score of ≥1 using MSVT, NV-MSVT, and TOMM as poor effort criteria

| Criterion SVT | Sens | Spec | PPP^a | NPP^a | Hit rate | LR+ |
| --- | --- | --- | --- | --- | --- | --- |
| MSVT | 0.53 | 0.96 | 0.77 | 0.89 | 0.87 | 13.3 |
| NV-MSVT | 0.62 | 0.92 | 0.51 | 0.94 | 0.89 | 7.8 |
| TOMM | 0.60 | 0.92 | 0.58 | 0.93 | 0.88 | 7.5 |

Notes: RBANS EI = Repeatable Battery for the Assessment of Neuropsychological Status Effort Index; MSVT = Medical Symptom Validity Test; NV-MSVT = Nonverbal Medical Symptom Validity Test; TOMM = Test of Memory Malingering; Sens = Sensitivity; Spec = Specificity; PPP = Positive Predictive Power; NPP = Negative Predictive Power; LR+ = Positive Likelihood Ratio. Base Rate MSVT = 0.20, Base Rate NV-MSVT = 0.15, Base Rate TOMM = 0.12.

^a Base rate calculated from symptom validity test failure in present sample.

Table 6.

Classification statistics of RBANS EI cut score of >3 using MSVT, NV-MSVT, and TOMM as poor effort criteria

| Criterion SVT | Sens | Spec | PPP^a | NPP^a | Hit rate | LR+ |
| --- | --- | --- | --- | --- | --- | --- |
| MSVT | 0.24 | 0.97 | 0.66 | 0.84 | 0.82 | 8.0 |
| NV-MSVT | 0.31 | 0.97 | 0.65 | 0.89 | 0.87 | 10.3 |
| TOMM | 0.30 | 0.96 | 0.51 | 0.91 | 0.88 | 7.5 |

Notes: RBANS EI = Repeatable Battery for the Assessment of Neuropsychological Status Effort Index; MSVT = Medical Symptom Validity Test; NV-MSVT = Nonverbal Medical Symptom Validity Test; TOMM = Test of Memory Malingering; Sens = Sensitivity; Spec = Specificity; PPP = Positive Predictive Power; NPP = Negative Predictive Power; LR+ = Positive Likelihood Ratio. Base Rate MSVT = 0.20, Base Rate NV-MSVT = 0.15, Base Rate TOMM = 0.12.

^a Base rate calculated from symptom validity test failure in present sample.

Fig. 1 shows the receiver operating characteristic (ROC) curve analysis with EI cut scores of ≥1 and >3 when employing the MSVT as an external criterion (see Swets, Dawes, & Monahan, 2000, for a review of ROC curve and area under the curve [AUC] analyses). The MSVT was utilized since it has the most research in patient samples similar to the current sample. Under this methodology, the ROC curve plots the rate of false positives against the rate of true positives. As the overall model tracks the EI cut point of ≥1 nearly perfectly, it was not included in this figure (i.e., it would be overwritten by the curve for the EI cut point of ≥1). The AUC in this model represents how well the measure predicts effort, with larger areas representing more accurate prediction of poor effort. The ROC curve for the EI ≥1 cut point shows fair discriminability in this analysis (AUC = 0.743, SE = 0.079, 95% CI = 0.588–0.898), whereas the EI >3 cut point shows poor discriminability (AUC = 0.603, SE = 0.084, 95% CI = 0.438–0.768). Overall, a better tradeoff between true positives (i.e., sensitivity) and false positives (i.e., 1-specificity) is achieved with an EI cut point of ≥1 in this sample.
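When a ROC curve is drawn for a single dichotomous cut point, it consists of two straight segments joining (0, 0), (1 − specificity, sensitivity), and (1, 1), and the area beneath it reduces to the mean of sensitivity and specificity. The sketch below illustrates this relationship. It assumes that each cut point is treated as a binary classifier, which the source does not state explicitly, and the function name `single_cutpoint_auc` is hypothetical. The inputs are the rounded sensitivity and specificity values against the MSVT from Tables 5 and 6.

```python
def single_cutpoint_auc(sens: float, spec: float) -> float:
    """Area under a two-segment ROC curve through (0,0), (1-spec, sens), (1,1)."""
    fpr = 1 - spec
    # Triangle under the first segment, from (0, 0) to (fpr, sens).
    left_area = 0.5 * fpr * sens
    # Trapezoid under the second segment, from (fpr, sens) to (1, 1).
    right_area = 0.5 * (1 - fpr) * (sens + 1)
    # The sum simplifies algebraically to (sens + spec) / 2.
    return left_area + right_area


for label, sens, spec in [("EI >= 1", 0.53, 0.96), ("EI > 3", 0.24, 0.97)]:
    print(f"{label}: AUC = {single_cutpoint_auc(sens, spec):.3f}")
```

With rounded inputs, the results fall within a few thousandths of the AUC values reported above, which is the level of agreement one would expect if rounding of sensitivity and specificity is the only source of difference.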

Fig. 1.

ROC curves for RBANS EI cut points ≥1 and >3 with MSVT as external criterion.

Discussion

Of current interest in the field of neuropsychology is the establishment of valid and reliable indices of effort. Of related interest is the reporting of base rates of failure on such effort measures among different patient populations. The current study attempted to address both areas by evaluating the performances on stand-alone effort measures in an active duty military sample and then comparing performances on these measures to an embedded index of effort in a commonly employed neuropsychological screening measure. The overall base rate of stand-alone SVT failure ranged from 20% on the MSVT to 15% on the NV-MSVT to 11% on the TOMM. Investigations suggest that the TOMM is less sensitive to poor effort than the MSVT (Blaskewitz, Merten, & Kathmann, 2008) and the NV-MSVT (Armistead-Jehle & Gervais, 2010; Green, 2011) and as such the MSVT failure rate of 20% and the NV-MSVT failure rate of 15% may be more accurate estimates for this sample.

The MSVT failure rate in the current sample differed notably from that reported by Armistead-Jehle (2010), who reported a 58% failure rate in a sample of U.S. veterans being evaluated clinically for possible post-concussion syndrome. Such differences are potentially attributable to the dissimilar nature of the samples used and the different motivation for symptom presentation across the veteran and active duty military healthcare environments. The current MSVT failure rate was also slightly higher than the 17% reported in the Whitney and colleagues (2009) study. The Whitney and colleagues sample was comprised of 9 individuals still enrolled in active duty service and 14 who had recently been discharged, with all of those failing the MSVT (n = 4) still on active duty service at the time of testing. If one divided the Whitney and colleagues sample into active duty and veteran subsamples, the failure rate for active duty would have been 44%. It is, however, of potential import to bear in mind from where this sample of active duty service members was drawn. The Whitney and colleagues study was conducted in a VAMC polytrauma unit, and while this was not reported in the article, the nature of these service members' injuries and conditions may have precluded a return to active duty status. This dynamic may have altered symptom presentation relative to the current sample that was on active duty and had no obvious intentions of changing this status in the near future. Nevertheless, the notion that SVT failure rates may vary as a function of more nuanced subgroup membership is broached.

Applying this logic to the current study, within the active duty sample, there was a great deal of dissimilarity between SVT failure base rates of different subgroups. Roughly half of the current sample consisted of mid-level military officers attending a nearly year-long training course titled ILE. This course is designed to prepare those who intend to operate as field-grade commanders and staff officers in full-spectrum Army, joint armed services, interagency, and multinational environments. Although the course is required of all Army Operations Career Field Officers at the rank of Major (O-4), most individuals who attend the course have made the decision to remain in the armed services at least until they are eligible for retirement after 20 years of service. As such, the ILE subsample is not considered a good cross-section of the active duty population in terms of age, education, or rank (ILE subsample average age = 37.0 years [SD = 4.1], average education = 16.8 [SD = 0.9] and all achieved the rank of Major [O-4]).

The non-ILE student subsample might be considered more representative of the active duty population as it consisted of a greater range in terms of age, education, and rank (non-ILE subsample average age = 31.6 years [SD = 7.4], average education = 13.7 [SD = 1.7], modal rank Sergeant [E-5] with all ranks from Private [E-1] to Colonel [O-6] represented). In this subsample (n = 47), there was a 30% failure rate on the MSVT (n = 14), a 21% failure rate on the NV-MSVT (n = 10), and a 15% failure rate on the TOMM (n = 7). Conversely, among ILE students (total n = 38) there was only an 8% failure rate on each of the three measures (n = 3). Consequently, the base rate of SVT failure differs markedly between the subsamples available in the present data.
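The subgroup percentages above follow directly from the reported counts, as the short sketch below shows. It is a worked check rather than part of the original analysis; the dictionary layout and the function name `failure_rates` are hypothetical, and the ILE counts assume that three examinees failed each of the three measures, which is how the reported "n = 3" is read here.

```python
# Failure counts per subgroup as reported in the text.
subgroups = {
    "non-ILE": {"n": 47, "failures": {"MSVT": 14, "NV-MSVT": 10, "TOMM": 7}},
    # Assumption: "n = 3" applies to each measure separately.
    "ILE": {"n": 38, "failures": {"MSVT": 3, "NV-MSVT": 3, "TOMM": 3}},
}
TESTS = ("MSVT", "NV-MSVT", "TOMM")


def failure_rates(n: int, failures: dict) -> dict:
    """Convert failure counts into percentages of a group of size n."""
    return {test: 100.0 * count / n for test, count in failures.items()}


rates = {name: failure_rates(g["n"], g["failures"]) for name, g in subgroups.items()}

# Pool the two subgroups to recover the whole-sample counts.
total_n = sum(g["n"] for g in subgroups.values())
total_failures = {t: sum(g["failures"][t] for g in subgroups.values()) for t in TESTS}
overall = failure_rates(total_n, total_failures)

for name, group_rates in rates.items():
    print(name, {t: f"{r:.1f}%" for t, r in group_rates.items()})
print("overall", total_failures, {t: f"{r:.1f}%" for t, r in overall.items()})
```

Pooling the two subgroups recovers the whole-sample failure counts shown in Tables 1 to 3 (17, 13, and 10 of 85).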

These results are in contrast to the Nelson and colleagues (2010) finding that reported evaluation context, but not participant cohort, was of import in the SVT performance of their veteran sample. These authors found no difference in SVT performances between OIF/OEF and non-OIF/OEF veterans (i.e., cohort), but a notable difference in SVT results between veterans in forensic and research groups (i.e., context). In the current study the SVT performance between different cohorts (i.e., ILE vs. non-ILE participants both seen in a clinical context) was remarkably different, suggesting that cohort may be a factor in SVT performance at least in the active duty military population.

Examination of the currently available studies thus demonstrates a wide variety of SVT failure rates across various military and veteran samples. As suggested above, this could be a function of subgroup membership within the overall military and veteran environments that in turn may impact motivation for symptom presentation. In an effort to report accurate base rates across these populations, future work should continue to describe SVT performances based on distinct definitions of the samples being studied.

Within the veterans' healthcare environment, future efforts could, for instance, continue the work of Nelson and colleagues (2010) in examining individuals evaluated in various contexts (i.e., forensic/C&P vs. clinical vs. research evaluations). As an ancillary point, it is of potential interest that arguably the most sensitive measure used in the Nelson and colleagues study (i.e., the VSVT) demonstrated a failure rate in their forensic group comparable to the failure rate of the MSVT in the Armistead-Jehle (2010) study, which evaluated veterans seen in a clinical/non-C&P context (59% and 58%, respectively). This comparison could be taken to suggest that veteran samples seen in a clinical context more closely match those seen in forensic rather than research settings. This would bolster the assertion made by Armistead-Jehle in his 2010 paper that, secondary to the nature of the Veterans Health Affairs system (which affords veterans the opportunity to make service connection claims at any time), the “external incentive to appear more compromised than one might objectively be is potentially ubiquitous in this system” (p. 58). Within the active duty population, various subgroups defined by both different contexts (e.g., service members pending medical evaluation boards vs. those not pending such boards) and cohorts (e.g., Field Operation Units vs. Combat Support Units, Senior vs. Junior Enlisted) may demonstrate different rates of SVT failure and thus warrant future research efforts.

Another goal of the current paper was to compare the RBANS EI to well-established stand-alone measures of effort. At both cut scores employed in the current paper, the RBANS EI demonstrated high specificity when compared with the included stand-alone SVTs. Although there are limitations to examining the specificity of the EI in the current sample, which was composed of high functioning individuals (see below), these data suggest that in such a sample the RBANS EI does not misclassify normal to high functioning individuals as providing poor effort.

In contrast, despite efforts to maximize the sensitivity of the RBANS EI by using the most liberal cut score possible (i.e., ≥1), the index still demonstrated only modest sensitivity to poor effort when various stand-alone effort measures were employed as the external criterion. The current study was thus unable to extend the findings of Silverberg and colleagues (2007) supporting the use of the RBANS EI as a viable sole measure of effort. Correlational analysis also demonstrated generally higher correlations among the stand-alone SVTs than between the RBANS EI and the stand-alone measures. If one assumes that failure on symptom validity testing is suggestive of potentially suboptimal effort, the current data, in conjunction with the work of Barker and colleagues (2010), can be taken to indicate that the RBANS EI alone is not capable of achieving sensitivity to poor effort at the level of stand-alone SVTs. The current data provide support for the conclusions of Barker and colleagues, who state that the EI can add some utility as an adjunctive measure of symptom validity when administered in conjunction with other measures of effort.
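For readers less familiar with the terms, the sketch below illustrates how the sensitivity and specificity of an embedded index are computed when failure on a stand-alone SVT serves as the external criterion. The scores and pass/fail values are hypothetical and invented purely for illustration (they are not the study data), and the function names are assumptions of this sketch; only the liberal cut score of ≥1 is taken from the text.

```python
def flag_poor_effort(ei_score, cut=1):
    """Return True when an Effort Index score meets or exceeds the cut score."""
    return ei_score >= cut


def sensitivity_specificity(index_flags, criterion_flags):
    """Compare index flags with criterion (stand-alone SVT) failures.

    Sensitivity = flagged criterion failures / all criterion failures.
    Specificity = unflagged criterion passes / all criterion passes.
    """
    if len(index_flags) != len(criterion_flags):
        raise ValueError("both sequences must describe the same participants")
    pairs = list(zip(index_flags, criterion_flags))
    true_pos = sum(1 for idx, crit in pairs if crit and idx)
    false_neg = sum(1 for idx, crit in pairs if crit and not idx)
    true_neg = sum(1 for idx, crit in pairs if not crit and not idx)
    false_pos = sum(1 for idx, crit in pairs if not crit and idx)
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("need at least one criterion failure and one pass")
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# HYPOTHETICAL example data: ten participants, not drawn from the study.
ei_scores = [0, 2, 0, 1, 0, 0, 0, 3, 0, 0]
svt_failed = [False, True, True, False, False, True, False, True, False, False]

ei_flags = [flag_poor_effort(score, cut=1) for score in ei_scores]
sens, spec = sensitivity_specificity(ei_flags, svt_failed)
```

In this invented example the index flags half of the criterion failures while flagging few of those who passed, which is the general pattern of modest sensitivity alongside high specificity described above.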

Silverberg and colleagues (2007) argue that while the National Academy of Neuropsychology statement on effort testing (Bush et al., 2005) concludes that actuarial indicators of effort must be included in all neuropsychological evaluations, “the practice is not feasible, since it would require the additional administration of a lengthy SVT” (p. 850). The MSVT and NV-MSVT can each be administered in roughly 10–12 min (excluding the required delay between recognition trials, during which other testing can be completed). Consequently, although some additional time would be necessary to administer measures specifically designed to gauge effort, such an allocation of time would seem worthwhile, as embedded indices used in isolation do not appear to be as sensitive to suboptimal effort as stand-alone measures.

Among the limitations of the current study is the relative homogeneity of the sample with regard to select variables. It is acknowledged that the current sample was exclusively military in nature and included few female participants. As such, future research will need to validate these RBANS EI findings in nonmilitary samples as well as with women. Second, owing to small cell sizes and the resulting lack of statistical power, it was not possible to explore SVT failure base rates as a function of specific diagnoses (e.g., PTSD, Major Depressive Disorder, Post Concussive Syndrome). Future work may wish to examine SVT failure across these different diagnostic groups. Third, given that the current sample was composed of relatively high functioning individuals, the specificity of the RBANS EI could not be adequately assessed. Fourth, the current study could not employ a simulator model. Simulator designs permit a higher degree of confidence that poor effort is actually shown, and it would be of interest to know how sensitive any SVT is to poor effort in such groups. Future research with more severely impaired individuals (e.g., those with dementia diagnoses) will be necessary to better evaluate the specificity of the EI, and simulator models will also aid in further assessing the sensitivity of the index. Fifth, while the authors attempted to include a variety of psychometrically sound and commonly used stand-alone SVTs, it is acknowledged that other such measures exist (e.g., WMT and VSVT). Future research may wish to incorporate these measures in an effort to extend the current findings. Future studies with known group samples could also potentially employ multiple regression techniques to ascertain the incremental validity of the RBANS EI in comparison to stand-alone effort measures. Finally, the aim of the current paper was not to identify any individuals as potentially malingering, but rather to provide information on test characteristics within a unique patient sample. However, confidence in the current findings could be bolstered by employing more stringent criteria for poor effort (e.g., Slick et al., 1999) in the designation of group classification.

In sum, the current study provided SVT failure base rates in an active duty military population, with the indication that various subsamples demonstrated different failure rates. Additionally, this study showed relatively poor sensitivity of the RBANS EI when stand-alone effort measures were employed as the external criterion. Although the RBANS EI may serve to provide limited information on response bias, the current data fail to support its use as the sole measure of effort within a neuropsychological battery.

Conflict of Interest

None declared.

References

American College of Rehabilitation Medicine. Definition of mild traumatic brain injury. Journal of Head Trauma Rehabilitation, 1993, vol. 8 (pg. 86-87).
Armistead-Jehle P. Symptom validity test performance in U.S. Veterans referred for evaluation of mild TBI. Applied Neuropsychology, 2010, vol. 17 (pg. 52-59).
Armistead-Jehle P., Gervais R. O. Sensitivity of the Test of Memory Malingering (TOMM) and the Nonverbal Medical Symptom Validity Test (NV-MSVT): A replication study. Applied Neuropsychology, 2010.
Axelrod B. N., Fichtenberg N. L., Millis S. R., Wertheimer J. C. Detecting incomplete effort with Digit Span from the Wechsler Adult Intelligence Scale-Third Edition. The Clinical Neuropsychologist, 2006, vol. 20 (pg. 513-523).
Barker M. D., Horner M. D., Bachman D. L. Embedded indices of effort in the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS) in a geriatric sample. The Clinical Neuropsychologist, 2010, vol. 24 (pg. 1064-1077).
Blaskewitz N., Merten T., Kathmann N. Performance of children on symptom validity tests: TOMM, MSVT, and FIT. Archives of Clinical Neuropsychology, 2008, vol. 23 (pg. 379-391).
Boone K. B., Salazar Z., Lu P., Warner-Chacon K., Razani J. The Rey 15-Item Recognition Trial: A technique to enhance sensitivity of the Rey 15-Item Memorization Test. Journal of Clinical and Experimental Neuropsychology, 2002, vol. 24 (pg. 561-573).
Bush S. S., Ruff R. M., Troster A. I., Barth J. T., Koffler S. P., Pliskin N. H., et al. Symptom validity assessment: Practice issues and medical necessity NAN Policy and Planning Committee. Archives of Clinical Neuropsychology, 2005, vol. 20 (pg. 419-426).
Carone D. A. Test review of the Medical Symptom Validity Test (2009). Applied Neuropsychology, 2009, vol. 16 (pg. 309-311).
Chafetz M. D. Malingering on the social security disability consultative exam: Predictors and base rates. The Clinical Neuropsychologist, 2008, vol. 22 (pg. 529-546).
Constantinou M., Bauer L., Aahendorf L., Fisher J. M., McCaffrey R. J. Is poor performance on recognition memory effort measures indicative of generalized poor performance on neuropsychological tests? Archives of Clinical Neuropsychology, 2005, vol. 20 (pg. 191-198).
Delis D. C., Dramer J. H., Kaplan E., Ober B. A. California Verbal Learning Test-Second Edition manual (CVLT-II), 2000. San Antonio, TX: The Psychological Corporation.
Faust D., Hart K., Guilmette T., Arkes H. R. Neuropsychologist's capacity to detect adolescent malingering. Professional Psychology: Research and Practice, 1988, vol. 19 (pg. 508-515).
Folstein M. F., Folstein S. E., McHugh R. R. ‘Mini-Mental State’: A practical method for grading the cognitive state of patients for the clinician. Journal of Psychiatric Research, 1975, vol. 12 (pg. 189-198).
Green P. Green's Word Memory Test for Windows: User's manual, 2003. Edmonton, Canada: Green's Publishing.
Green P. Green's Medical Symptom Validity Test (MSVT) for Microsoft Windows: User's manual, 2004. Edmonton, Canada: Green's Publishing.
Green P. The pervasive influence of effort on neuropsychological tests. Physical Medicine and Rehabilitation Clinics of North America, 2007, vol. 18 (pg. 43-68).
Green P. Manual for the Nonverbal Medical Symptom Validity Test, 2008. Edmonton, Canada: Green's Publishing.
Green P. Comparison between the Test of Memory Malingering (TOMM) and the Nonverbal Medical Symptom Validity Test (NV-MSVT) in adults with disability claims. Applied Neuropsychology, 2011, vol. 18 (pg. 18-26).
Green P., Flaro F., Brockhaus R., Montijo J. Performance on the WMT, MSVT, & NV-MSVT in children with developmental disabilities and in adults with mild traumatic brain injury. In Reynolds C. R., Horton A. (Eds.), Detection of malingering during head injury litigation, 2010, 2nd ed. New York: Plenum Press.
Green P., Rohling M. L., Lees-Haley P. R., Allen L. M. Effort has a greater effect on test scores than severe brain injury in compensation claimants. Brain Injury, 2001, vol. 15 (pg. 1045-1060).
Greiffenstein M. F., Baker J. W., Gola T. Validation of malingered amnesia measures with a large clinical sample. Psychological Assessment, 1994, vol. 6 (pg. 218-224).
Heaton R. K., Smith H. H., Lehman A. W., Vogt A. T. Prospects for faking believable deficits on neuropsychological testing. Journal of Consulting and Clinical Psychology, 1978, vol. 46 (pg. 892-900).
Heinly M., Greve K., Bianchini K., Love J., Brennan A. WAIS Digit Span based indicators of malingered cognitive dysfunction. Assessment, 2005, vol. 12 (pg. 429-444).
Henry M., Merten T., Wolf S., Harth S. Nonverbal Medical Symptom Validity Test performance in elderly healthy adults and clinical neurology patients. Journal of Clinical and Experimental Neuropsychology, 2009, vol. 8 (pg. 1-10).
Hook J. N., Marquine M. J., Hoelzle J. B. Repeatable Battery for the Assessment of Neuropsychological Status effort index performance in a medically ill geriatric sample. Archives of Clinical Neuropsychology, 2009, vol. 24 (pg. 231-235).
Howe L. L., Loring D. W. Classification accuracy and predictive ability of the Medical Symptom Validity Test's dementia profile and general memory impairment profile. The Clinical Neuropsychologist, 2009, vol. 23 (pg. 329-342).
McKay C., Casey J. E., Wertheimer J., Fichtenberg N. L. Reliability and validity of the RBANS in a traumatic brain injured sample. Archives of Clinical Neuropsychology, 2007, vol. 22 (pg. 91-98).
Moser R. S., Shatz P. Enduring effects of concussion in young athletes. Archives of Clinical Neuropsychology, 2002, vol. 17 (pg. 91-100).
Nelson N. W., Hoelzle J. B., McGuire K. A., Ferrier-Auerbach A. G., Charlesworth M. J., Sponheim S. R. Evaluation context impacts neuropsychological performance of OEF/OIF veterans with reported combat-related concussion. Archives of Clinical Neuropsychology, 2010, vol. 25 (pg. 713-723).
Randolph C. Repeatable Battery for the Assessment of Neuropsychological Status manual, 1998. San Antonio, TX: The Psychological Corporation.
Silverberg N. D., Wertheimer J. C., Fichtenberg N. L. An effort index for the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS). The Clinical Neuropsychologist, 2007, vol. 21 (pg. 841-854).
Singhal A., Green P., Ashaye K., Shankar K., Gill D. High specificity of the medical symptom validity test in patients with very severe memory impairment. Archives of Clinical Neuropsychology, 2009, vol. 24 (pg. 721-728).
Slick D. J., Hopp G., Strauss E., Spellacy F. J. Victoria Symptom Validity Test: Efficiency for detection feigned memory impairment and relationship to neuropsychological tests and MMPI-2 validity scales. Journal of Clinical and Experimental Neuropsychology, 1996, vol. 18 (pg. 911-922).
Slick D. J., Sherman E. M. S., Iverson G. L. Diagnostic criteria for malingering neurocognitive dysfunction: Proposed standards for clinical practice and research. The Clinical Neuropsychologist, 1999, vol. 13 (pg. 545-561).
Stevens A., Friedel E., Mehren G., Merten T. Malingering and uncooperativeness in psychiatric and psychological assessment: Prevalence and effects in a German sample of claimants. Psychiatry Research, 2008, vol. 157 (pg. 191-200).
Swets J. A., Dawes R. M., Monahan J. Better decisions through science. Scientific American, 2000, vol. 283 (pg. 82-87).
Tombaugh T. N. The Test of Memory Malingering, 1996. Los Angeles: Western Psychology Corporation.
Wager J. G., Howe L. L. S. Nonverbal Medical Symptom Validity Test: Try faking now! Applied Neuropsychology, 2010, vol. 17 (pg. 305-309).
Whitney A. W., Shepard P. H., Williams A. L., Davis J. J., Adams K. M. The Medical Symptom Validity Test in the evaluation of Operation Iraqi Freedom/Operation Enduring Freedom soldiers: A preliminary study. Archives of Clinical Neuropsychology, 2009, vol. 24 (pg. 145-152).

Author notes

The views, opinions, and/or findings contained in this article are those of the authors and should not be construed as an official Department of the Army position, policy or decision unless so designated by other official documentation.