Abstract

The study examined Symptom Validity Test (SVT) performance in a sample of military service members on active orders as a function of evaluation context. Service members were assessed in the context of either a pending disability evaluation (Medical Evaluation Board; MEB) or a non-MEB/clinical evaluation. Overall, 41.8% of the sample failed the Word Memory Test; however, significantly more individuals in the MEB group (54%) failed the measure relative to the non-MEB/clinical group (35%). Regardless of group membership, SVT performance had a notable impact on neurocognitive test scores as measured by effect sizes. SVT performance was less strongly associated with self-reported psychological symptoms as gauged by the Personality Assessment Inventory. The current results are discussed in light of previous research on SVT performance in veteran and active duty samples.

Introduction

In recent years, the significance of Symptom Validity Testing as a necessary element of neuropsychological assessment has gained increasing acceptance (Bush et al., 2005; Heilbronner et al., 2009). Such a position is substantiated, in large part, by the accumulating number of studies demonstrating that effort has a more pervasive influence on neurocognitive testing than do clinical or demographic factors (Constantinou, Bauer, Ashendorf, Fisher, & McCaffrey, 2005; Fox, 2011; Green, Rohling, Lees-Haley, & Allen, 2001; Green, 2007; Lange, Iverson, Brooks, & Rennison, 2010; Moss, Jones, Forkis, & Quinn, 2003; West, Curtis, Greve, & Bianchini, 2010). For instance, Meyers, Volbrecht, Axelrod, and Reinsch-Boothby (2011) demonstrated that 50% of the variance in a full neuropsychological test battery was accounted for by failure on a comprehensive system of internal Symptom Validity Tests (SVTs). In another study, Stevens, Friedel, Mehren, and Merten (2008) demonstrated that effort (as measured by various stand-alone measures) accounted for 35% of the variance of performance in domains of processing speed, memory, and intelligence.

The issue of SVT failure and the subsequent suggestion of suboptimal effort are relatively new to U.S. military and veteran populations, at least in regard to the published literature. Armistead-Jehle (2010) found a 58% failure rate on the Medical SVT (MSVT; Green, 2004) in 45 U.S. veterans referred clinically for the evaluation of possible post-concussive symptoms. Axelrod and Schutte (2010) administered the MSVT to a mixed clinical sample of 286 veterans. These authors reported that 47% of the sample failed the MSVT based on “easy” subtest performance. Young, Sawyer, Roper, and Baughman (2012) examined a sample of 259 veterans completing the Word Memory Test (WMT; Green, 2003). The investigators noted a 44% failure rate across the sample, with a substantially higher failure rate in those completing disability relative to clinical evaluations. Whitney, Shepard, Williams, Davis, and Adams (2009) administered the MSVT to a sample of 23 combat veterans reporting mild traumatic brain injury (TBI) referred for neuropsychological testing within a Veterans Affairs Medical Center (VAMC). The sample comprised nine individuals still on active duty and 14 who had recently been discharged. Whitney and colleagues observed a 17% failure rate on the MSVT, with all of those failing (n = 4) still on active duty.

A recent study by Lange, Pancholi, Bhagwat, Anderson-Barnes, and French (2012) evaluated service members with a history of mild TBI who passed and failed SVTs and service members with a history of severe TBI who passed SVTs. The authors reported that those with a history of mild TBI who failed effort measures performed worse on the majority of neurocognitive measures relative to those with a history of mild TBI who passed SVTs and those with a history of severe TBI who passed SVTs. These authors also noted a strong relationship between SVT failure and elevations in self-reported psychological symptoms. As such, the notable impact of SVT performance on neuropsychological testing and self-reported psychological symptoms was highlighted in a military sample. Lange and colleagues reported a 19% failure rate in study participants with a history of mild TBI, although they acknowledged that, secondary to issues related to sample selection and group categorization, this number may be an underestimate.

Nelson and colleagues (2010) evaluated various stand-alone and embedded effort measures as a function of forensic and research contexts in a sample of 119 Operation Iraqi Freedom/Operation Enduring Freedom (OIF/OEF) and non-OIF/OEF veterans. The veterans in the forensic group (defined as veterans involved in a compensation and pension [C&P] evaluation conducted for the purpose of establishing disability ratings) evidenced elevated rates of insufficient effort relative to veterans in the research group. Specifically, 59%, 16%, and 9% of the forensic group demonstrated insufficient effort on the Victoria Symptom Validity Test (VSVT; Slick, Hopp, Strauss, & Spellacy, 1996), the California Verbal Learning Test-Second Edition (CVLT-II) Forced Choice (Delis, Kramer, Kaplan, & Ober, 2000), and the Rey 15-Item Test with Recognition trial (FIT; Boone, Salazar, Lu, Warner-Chacon, & Razani, 2002), respectively. Conversely, those in the research group evidenced a failure rate of 8% on the VSVT, 3% on the CVLT-II Forced Choice, and 0% on the FIT. The authors argued that the context of the neuropsychological evaluation (i.e., forensic vs. research) is a salient variable in considering SVT performance, whereas the patient cohort (i.e., OIF/OEF vs. non-OIF/OEF) is not.

Depending on the stand-alone SVT employed, Armistead-Jehle and Hansen (2011) found an 11%–20% failure rate in their sample of 85 active duty military service members. However, when this sample was divided into those attending a nearly year-long military training for field-grade officers and those who were not, the SVT failure rates showed a notable difference. The former group showed an 8% failure rate across stand-alone SVTs, whereas the latter group's failure rate ranged from 15% to 30% depending on the stand-alone SVT examined. These authors argued that, given the wide range of SVT failure rates reported across various military and veteran samples, underlying variables related to subgroup membership and/or examination context should be evaluated.

In addition to the relationship between SVTs and cognitive test performances, a number of studies have examined the impact of SVTs on the self-report of psychological symptoms. However, research in this area has been less consistent in its findings compared with studies examining the relationship between cognitive measures of effort and ability. Several studies have demonstrated that failure on SVTs is associated with elevations across validity and clinical scales on measures of psychological symptoms (Gervais, Ben-Porath, Wygant, & Green, 2008; Gervais, Rohling, Green, & Ford, 2004; Iverson, Lange, Brooks, & Rennison, 2010; Larrabee, 2003a, 2003b; Suhr, Hammers, Dobbins-Buckland, Zimak, & Hughes, 2008; Wygant et al., 2007). Specific to a veteran sample, Whitney, Davis, Shepard, and Herman (2008) evaluated Minnesota Multiphasic Personality Inventory-2 (MMPI-2; Butcher et al., 2001) validity scales in relation to Test of Memory Malingering (TOMM; Tombaugh, 1996) performances in 46 male veterans seen for outpatient neuropsychological evaluation. The researchers found that select MMPI-2 validity scales were associated with TOMM performance (i.e., the Response Bias Scale [RBS], Henry-Heilbronner Index [HHI], and Back Infrequency [Fb]), whereas other scales were not (i.e., Infrequency [F], Infrequency-Psychopathology [Fp], the Infrequency Post-Traumatic Stress Disorder scale, and the Fake Bad Scale [FBS]). In another study examining 174 veterans, Young, Kearns, and Roper (2011) found that select MMPI-2 validity scales (RBS, FBS, F, Fp, and HHI) were associated with failure on the WMT.

The Lange and colleagues (2012) study outlined above also examined the relationship between MSVT performances and Personality Assessment Inventory (PAI) profiles. The researchers showed that individuals with a history of mild TBI who failed the MSVT had higher elevations on the majority of PAI clinical scales as well as select validity scales relative to individuals with mild and severe TBIs who passed the MSVT. Finally, a recent study by Jones, Ingram, and Ben-Porath (2012) demonstrated that failure on a variety of SVTs was associated with significant linear increases in all MMPI-2-RF (Ben-Porath & Tellegen, 2008) over-reporting validity scales and most clinical scales in a sample of 501 active duty military members. Effect sizes were large across all validity scales when comparing groups who failed no SVTs and those who failed three SVTs.

In contrast to these studies, a handful of other investigations have not found a relationship between SVT failure and psychological symptom reporting (Demakis, Gervais, & Rohling, 2008; Haggerty, Frazier, Busch, & Naugle, 2007; Sumanti, Boone, Savadnick, & Gorsuch, 2006). Specific to a veteran sample, Armistead-Jehle (2010) found no difference between veterans with a history of mild TBI who passed and failed the MSVT on several PAI validity scales, including the Negative Impression Management (NIM) scale, Rogers Discriminant Function, and Malingering Index.

The current paper has several aims. The first is to expand the findings of Nelson and colleagues (2010) and Armistead-Jehle and Hansen (2011) by evaluating the base rates of SVT failure in two groups of active duty military members: those seen for neuropsychological evaluation in the context of a pending Medical Evaluation Board (MEB) and those evaluated in a non-MEB/clinical context. We hypothesized that the base rate of SVT failure in those pending an MEB would be significantly higher than in those seen in a clinical context. Next, the study aimed to extend the research on the impact of cognitive effort measures on neuropsychological testing by analyzing the effect sizes of SVT performance on the neuropsychological test battery. Given that previous research has consistently demonstrated the notable impact of cognitive effort measures on neurocognitive test performance, we predicted medium to large effect sizes across neurocognitive testing. The final aim of the study was to evaluate the relationship between cognitive SVTs and self-report measures of psychological symptoms. To our knowledge, the only studies to examine this relationship in an active duty military sample are Lange and colleagues (2012) and Jones, Ingram, and Ben-Porath (2012). Based on their conclusions, we predicted that those who failed cognitive effort measures would demonstrate significant elevations on psychological self-report measures.

Methods

Participants

The study included a convenience sample of 335 U.S. military service members on active duty orders assessed by means of a neuropsychological evaluation in an outpatient TBI clinic located at a Southeast U.S. Army Medical Facility between July 2009 and August 2011. The average age and education of the sample were 32.3 (SD = 7.7) and 12.85 (SD = 1.7) years, respectively. Ethnicity of the sample was predominantly Caucasian (77.3%), with 11.3% African American, 7.8% Hispanic, and 3.6% other. The preponderance of the sample was men (95.5%). The majority of the sample was active duty Army (77.3%), with a subsection being Army National Guard or Army Reserve Soldiers on active orders (18.8%). The remainder of the sample (1.5%) consisted of active duty members of the Navy, Air Force, or Marines. With regard to military grade, 29.3% were lower enlisted (i.e., E1–E4), 64.5% were senior enlisted (i.e., E5–E9), and 6.3% were officers. Regarding injury status, as data for this retrospective study came from a TBI clinic, over 98% had a history of TBI, with the vast majority (i.e., over 95%) classified as mild (ACRM, 1993). The remaining few cases had a history of other diagnoses that may have impacted neurocognitive functioning, such as heat stroke and possible transient ischemic attack. Of those with a history of TBI, the average time since injury was 30.5 months (SD = 25.8). Regarding psychiatric conditions, a number of individuals carried primary Axis I diagnoses of Post-Traumatic Stress Disorder/other anxiety (58.6%), Adjustment Disorder (11.9%), or Depression (7.5%). No individuals had a current diagnosis of any psychotic disorder.

All patients were tested by the second author or a trained neuropsychology technician under the supervision of the second author. For select analyses, participants were divided into two groups: (a) those in the process of an MEB and (b) those seen in a non-MEB/clinical context. An MEB is an evaluative process that a service member undergoes when he or she may no longer meet the retention standards of his or her branch of the U.S. Military. If the service member is deemed to no longer meet retention standards, he or she can be medically separated from service, with this separation typically being associated with benefits. As a service member is considered for medical separation, the entirety of any current or past diagnoses is thoroughly evaluated, such that secondary diagnoses are also considered. Consequently, while the primary reason for a potential medical discharge may not be related to neuropsychological functioning (e.g., an orthopedic injury or psychiatric diagnosis), service members can undergo a neuropsychological evaluation as part of their medical board if there is any history or indication of an insult that may have affected brain functioning. In the current study, MEB referrals were sent for neuropsychological evaluation through administrative channels in the Physical Evaluation Board process. Participants were referred for non-MEB/clinical evaluation by the health center's primary care or behavioral health clinics. Prior to all evaluations, the patients gave consent for the assessment and were instructed to provide their best effort across the tests administered. The examiner remained in the room at all times throughout the evaluations. All participants spoke English fluently and the testing was conducted in English. This retrospective analysis of clinical data was approved by the Institutional Review Board at Walter Reed National Military Medical Center.

Measures

Patients were administered a fixed-flexible battery of neuropsychological tests, with specific measures chosen as a result of the clinical nature of the evaluation and the temporal restrictions sometimes associated with assessment conducted in the clinical environment; consequently, not every participant was administered exactly the same battery of measures. Neurocognitive measures included the following: select subtests of the Wechsler Adult Intelligence Scale-Fourth Edition (WAIS-IV; Wechsler, 2008; Digit Span, Arithmetic, Coding, Symbol Search, Vocabulary, Similarities, Information, Block Design, Matrix Reasoning, and Visual Puzzles); select subtests of the Wechsler Memory Scale-Fourth Edition (WMS-IV; Wechsler, 2009; Logical Memory I and II, Verbal Paired Associates I and II, Designs I and II, Visual Reproduction I and II, Spatial Addition, and Symbol Span); the Booklet Category Test (DeFilippis & McCampbell, 1997); the Stroop Color and Word Test (Golden & Freshwater, 2002); the Tower of London-Drexel University: Second Edition (TOL-DX:2; Culbertson & Zillmer, 2005); Delis–Kaplan Executive Function System (D-KEFS) Trail Making Conditions 2, 3, and 4 (Delis, Kaplan, & Kramer, 2001); the Test of Variables of Attention (TOVA; Greenberg, 2007); Controlled Oral Word Association (COWA) and Animal Naming; the Grooved Pegboard (Lafayette Instrument, 2003); and Grip Strength (Lafayette Instrument, 2004). Normative scores for the WAIS-IV, WMS-IV, Booklet Category Test, Stroop Color and Word Test, TOL-DX:2, D-KEFS Trail Making Test, and TOVA were obtained from the respective manuals. The Grooved Pegboard and Grip Strength tests were administered according to the standard instructions from their respective test manuals and scored using the meta-analytic norms from Mitrushina, Boone, Razani, and D'Elia (2005). Verbal fluency task norms were obtained from Gladsjo and colleagues (1999).

The WMT (Green, 2003) is a computer-administered verbal memory test with multiple subtests designed to assess verbal memory, effort, and response consistency. Twenty semantically related word pairs are presented twice for examinees to learn. Directly after the learning trials, Immediate Recognition (IR) memory is tested. Following a delay, Delayed Recognition (DR) memory is assessed. This is then followed by a series of subtests comprising Multiple Choice (MC), Paired Associate (PA), and Free Recall (FR) formats. In addition to these five subtests, a consistency (CNS) score is calculated to gauge recall consistency across select trials. Failure on the WMT was defined as performance below the cut score on the IR, DR, and/or CNS subtests as defined in the test manual. The IR, DR, and CNS subtests are considered the initial measures of symptom validity and have been termed the “easy subtests.” Because of their difficulty relative to the easy subtests, the MC, PA, and FR subtests are referred to as the “hard subtests.” When one or more easy subtests are failed, further analysis of the complete WMT profile and presenting clinical factors is recommended to determine whether genuine memory impairment or suboptimal effort is more likely. When at least one easy subtest is failed, the mean of the hard subtests is subtracted from the mean of the easy subtests. The magnitude of this difference is high (i.e., ≥30) in true impairment but low in simulators. The examination of the easy–hard subtest difference has been found to discriminate adequately between those with and without genuine memory impairment, with genuine memory impairment often found in patient samples diagnosed with dementia or other conditions that prohibit independent living. Further, order violations (i.e., superior performance on relatively difficult subtests compared with easier ones) have also been shown to differentiate between patients with genuine memory impairment and those offering suboptimal effort (Chafetz, 2008; Green, 2003; Howe & Loring, 2009). The interpretive instructions of the WMT manual dictate that performances below the specified cut scores on the IR, DR, and CNS subtests be evaluated for a Genuine Memory Impairment Profile (GMIP). For those who meet the GMIP criteria, the individual's functional capabilities related to safe independent living should be evaluated to determine whether a GMIP would be reasonable. Any participants meeting these GMIP criteria were removed from analysis. A number of studies have demonstrated the utility of the WMT in discriminating between those with genuine memory impairment and those simulating impairment in a range of patient samples (e.g., Green, Lees-Haley, & Allen, 2002; Hartman, 2002; Wynkoop & Denney, 2005).
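To make the decision flow above concrete, the following is a minimal sketch of that logic in Python, not the WMT manual's scoring algorithm: the 82.5% cut score is an illustrative assumption (authoritative values come from the manual; Green, 2003), the ≥30-point easy–hard criterion follows the text above, and the function itself is hypothetical.

```python
# Minimal sketch of the WMT decision flow described above; not the manual's
# algorithm. CUT is an assumed value for illustration only.

CUT = 82.5       # assumed failure threshold (% correct) for the easy subtests
GMIP_GAP = 30.0  # easy-hard difference suggesting genuine impairment (see text)

def classify_wmt(ir, dr, cns, mc, pa, fr):
    """All arguments are subtest scores in percent correct (0-100)."""
    easy = [ir, dr, cns]  # Immediate Recognition, Delayed Recognition, Consistency
    hard = [mc, pa, fr]   # Multiple Choice, Paired Associate, Free Recall
    if all(score > CUT for score in easy):  # no easy subtest failed
        return "pass"
    diff = sum(easy) / 3 - sum(hard) / 3
    # Genuine impairment tends to produce a large easy-hard difference;
    # simulators fail the easy subtests too, so their difference stays small.
    if diff >= GMIP_GAP:
        return "review for GMIP (check functional independence)"
    return "fail"

# The mean profile of the fail group in Table 2 yields a difference of ~25,
# below the GMIP threshold, and is therefore classified as "fail".
print(classify_wmt(75.6, 71.4, 68.0, 53.1, 51.5, 35.3))
```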

The PAI (Morey, 1991) is an actuarial measure of personality and emotional functioning that consists of 344 items answered in a 4-point Likert format, rendering 22 non-overlapping scales. The measure includes four primary validity scales (Inconsistency, Infrequency, Positive Impression Management, and NIM) and several supplemental validity scales. Two supplemental validity scales, the Malingering Index and Rogers Discriminant Function, were employed in the current study. The PAI also renders 18 clinical scales and 31 clinical subscales. Clinical scales of interest for the current evaluation, patterned after those of the Lange and colleagues (2012) investigation, included: Somatic Complaints, Anxiety, Anxiety-Related Disorders, Depression, Mania, Paranoia, Schizophrenia, Borderline Features, Antisocial Features, Alcohol Problems, Drug Problems, Aggression, Suicidal Ideation, and Stress. The psychometric properties of the PAI have been well established (see, e.g., Kurtz & Blais, 2007; Morey, 2003).

Procedures

In an effort to reduce the number of pairwise comparisons and therefore limit the potential influence of family-wise error, individual neurocognitive measures of ability were grouped into domains to arrive at an average composite T-score (see Table 1 for the composition of each domain). In order to assess the overall impact of SVT performance on neurocognitive test performances and self-reported symptoms, the participants were first divided into two groups: those who passed the WMT (n = 195) and those who failed the measure (n = 140). Neurocognitive domain scores as well as PAI scores were compared between these two groups. To assess the impact of SVT performance among those pending a medical board (n = 117) and those seen in a non-MEB/clinical context (n = 218), the sample was divided into these two groups and performances on neurocognitive and self-report inventories were again compared as a function of WMT performance. Given the multiple pairwise comparisons, a Bonferroni correction was applied. For the neurocognitive domains, α was set at 0.008. For the PAI comparisons, α was set at 0.003. All statistical procedures were completed using SPSS® version 17.0.
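The corrected α levels presumably follow from dividing a family-wise α of .05 by the number of comparisons in each family (the six neurocognitive domains of Table 1; the 20 PAI scales listed in Tables 6–8):

$$
\alpha_{\text{corrected}} = \frac{\alpha_{\text{FW}}}{m}: \qquad \frac{.05}{6} \approx .008, \qquad \frac{.05}{20} = .0025 \approx .003.
$$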

Table 1.

Composition of neurocognitive domains

Attention/working memory: Digit Span (a), Arithmetic (a), Spatial Addition (b), Symbol Span (b), TOVA Commission Errors, TOVA Omission Errors, TOVA RT Variability, Stroop Color Word

Memory: Logical Memory I (b), Logical Memory II (b), Verbal Paired Associates I (b), Verbal Paired Associates II (b), Designs I (b), Designs II (b), Visual Reproduction I (b), Visual Reproduction II (b)

Processing speed: Coding (a), Symbol Search (a), D-KEFS Trails Condition 2, D-KEFS Trails Condition 3, D-KEFS Trails Condition 4, TOVA RT, Stroop Word Reading, Stroop Color Reading

Visual reasoning: Block Design (a), Matrix Reasoning (a), Visual Puzzles (a), Booklet Category Test, Tower of London

Verbal reasoning: Similarities (a), Vocabulary (a), Information (a), COWA, Animal Naming

Motor skills: Grooved Pegboard Dominant Hand, Grooved Pegboard Non-Dominant Hand, Grip Strength Dominant Hand, Grip Strength Non-Dominant Hand

Notes: TOVA = Test of Variables of Attention; RT = Response Time; D-KEFS = Delis–Kaplan Executive Function System; COWA = Controlled Oral Word Association.

(a) Wechsler Adult Intelligence Scale-IV.

(b) Wechsler Memory Scale-IV.

Results

When the entire sample was divided into groups based only on WMT performance, there were no significant differences in age (t = 0.07, p = .95, d = 0.01), education (t = 0.54, p = .59, d = 0.06), months tested post-injury (t = 1.65, p = .10, d = 0.18), gender (men: 98% fail, 92% pass, χ2 = 3.07, p = .08), rank (χ2 = 3.26, p = .20), or ethnicity (χ2 = 2.69, p = .44). When the sample was divided into those evaluated in the context of an MEB and those evaluated in a non-MEB/clinical context, there was also no difference in terms of age (t = 0.72, p = .47, d = 0.08), education (t = 0.79, p = .43, d = 0.09), gender (men: 96% MEB, 94% non-MEB, χ2 = 0.95, p = .31), rank (χ2 = 0.37, p = .83), ethnicity (χ2 = 2.49, p = .48), or WAIS-IV Full Scale Intelligence Quotient (t = 1.5, p = .14, d = 0.16). There was, however, a significant difference in time since injury, with those undergoing an MEB having a greater time since injury (in months; M = 35.2, SD = 29.1) relative to those being evaluated in a clinical context (M = 28.0, SD = 23.5), t(317) = 2.38, p = .02, d = 0.27. Such a difference is logical given that the MEB process can take time to unfold.
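For the two-group comparisons above, Cohen's d was presumably recovered from the t statistic in the standard way; using the full group sizes (117 and 218) as an approximation for the time-since-injury comparison:

$$
d = t\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = 2.38\sqrt{\frac{1}{117} + \frac{1}{218}} \approx 0.27,
$$

which reproduces the reported effect size.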

WMT subtest scores among those who passed and failed the measure are reported in Table 2. In total, 41.8% of the sample (140 service members) failed the WMT. Individuals performing below the specified cut scores on the IR, DR, and CNS subtests were evaluated for a GMIP. For those meeting the GMIP criteria, functional capabilities in terms of Activities of Daily Living and Instrumental Activities of Daily Living were reviewed to determine whether a GMIP would be reasonable for the individual. One patient from the fail group met these criteria and was removed from analysis. In the failing group, the average difference between the easy and the hard subtests was 25.1. Order violations (i.e., higher scores on more difficult tasks relative to easier ones) were noted in 63 (45%) of the patients who failed the WMT. The WMT profiles of those failing the measure were thus more similar to those of individuals simulating impairment than to those with genuine memory deficits.

Table 2.

WMT scores in 335 U.S. military service members

Group    WMT subtest    Mean (% correct)    SD    Range
Pass WMT (N = 195) IR 95.9 4.2 85–100 
DR 95.6 4.3 85–100 
CNS 93.7 7.4 85–100 
MC 86.2 13.0 35–100 
PA 84.4 14.2 40–100 
FR 54.7 14.3 15–90 
Easy subtests 95.1 4.1 83.3–100 
Hard subtests 75.1 12.1 37.5–95.0 
Fail WMT (N = 140) IR 75.6 12.5 30–100 
DR 71.4 13.5 27.5–95 
CNS 68.0 11.0 40–90 
MC 53.1 17.4 15–95 
PA 51.5 18.0 5–100 
FR 35.3 11.4 12.5–65 
Easy subtests 71.7 10.8 38.3–91.7 
Hard subtests 46.6 14.1 15.0–79.2 
Easy–hard subtests 25.1 8.3 4.2–48.3 

Notes: WMT = Word Memory Test; IR = Immediate Recognition; DR = Delayed Recognition; CNS = Consistency; MC = Multiple Choice; PA = Paired Associate; FR = Free Recall. Easy–hard subtest differences in those passing the WMT were not reported as this statistic is only examined in individuals who fail the easy subtests (Green, 2003).

When the sample was divided into MEB and non-MEB/clinical evaluation contexts, there was a significant difference in WMT failure rates. Of the 117 members of the sample undergoing an MEB, 63 (53.8%) failed the WMT. Of the 218 members of the sample completing neuropsychological evaluation for non-MEB/clinical purposes, 77 (35.3%) failed the WMT (χ2 = 10.7, p = .001). As time since injury was significantly different between the evaluation context groups, the influence of this variable was examined via logistic regression. Although evaluation context was significant in predicting WMT performance, B = −0.71, Wald (1) = 8.52, p < .01, time since injury was not a significant variable in the model, B = −0.01, Wald (1) = 1.57, p = .21. As such, when accounting for time since injury, the MEB group continued to fail the WMT at a significantly higher rate relative to the non-MEB/clinical group, model χ2(2, N = 335) = 11.3, p < .01. Table 3 summarizes the descriptive statistics, analysis of variance results, and effect sizes (Cohen, 1988) for the domain-specific neuropsychological test scores in those passing and failing the WMT. Tables 4 and 5 show the same comparisons for the MEB and non-MEB/clinical subsamples. Across all comparisons, the effect sizes for the attention and processing speed composite scores were large (range d = 0.89–1.17). The effect sizes for the memory domain were large in the entire sample (d = 0.85), as well as for the MEB subsample (d = 1.12). A medium effect size was found in the non-MEB/clinical subsample (d = 0.69).
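To make the two-step analysis concrete, the following is a minimal sketch in Python rather than the SPSS 17.0 procedures actually used. The failure counts and group sizes are taken from the text, but the subject-level data are synthetic and the variable names (meb, months, fail) are hypothetical stand-ins for the non-public clinic dataset.

```python
# Sketch of the two-step analysis reported above, run on synthetic data;
# the original analyses were conducted in SPSS 17.0.
import numpy as np
from scipy.stats import chi2_contingency
import statsmodels.api as sm

# Step 1: 2 x 2 failure-rate comparison (rows: MEB, non-MEB; cols: fail, pass)
table = np.array([[63, 117 - 63],
                  [77, 218 - 77]])
chi2, p, dof, expected = chi2_contingency(table, correction=False)
print(f"chi2 = {chi2:.1f}, p = {p:.3f}")  # ~10.7, consistent with the text

# Step 2: logistic regression of WMT failure on evaluation context and
# time since injury (values drawn to match the reported group statistics)
rng = np.random.default_rng(0)
meb = np.r_[np.ones(117), np.zeros(218)]           # 1 = MEB context
months = np.where(meb == 1,
                  rng.normal(35.2, 29.1, 335),
                  rng.normal(28.0, 23.5, 335)).clip(min=0)
fail = np.r_[rng.random(117) < 63 / 117,
             rng.random(218) < 77 / 218].astype(int)

X = sm.add_constant(np.column_stack([meb, months]))  # intercept, context, time
fit = sm.Logit(fail, X).fit(disp=0)
print(fit.summary())  # Wald tests per predictor; model chi-square via the LLR
```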

Table 3.

Neuropsychological domain composite scores as a function of WMT performance in all participants

Composite                WMT pass: Mean (SD), n    WMT fail: Mean (SD), n    df    F-value    p-value    Effect size
Attention Composite 45.3 (6.3) 176 39.6 (5.8) 104 1, 278 56.7 <.001 0.94 
Processing Speed Composite 46.5 (6.3) 158 39.0 (7.9) 90 1, 246 66.7 <.001 1.05 
Memory Composite 50.2 (5.8) 180 45.0 (6.4) 114 1, 292 52.5 <.001 0.85 
Verbal Reasoning Composite 46.6 (5.9) 176 44.8 (5.6) 98 1, 272 5.8 .016 0.31 
Visual Reasoning Composite 50.6 (6.6) 189 48.1 (6.2) 123 1, 310 11.3 .001 0.39 
Motor Skills Composite 45.1 (7.6) 160 41.2 (8.2) 113 1, 271 16.2 <.001 1.02 

Notes: All composite scores are T-scores. WMT = Word Memory Test; SD = Standard Deviation; effect sizes computed using Cohen's d. Bonferroni corrected α = 0.008.
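As a worked check on the tabled values, assuming the standard pooled-SD formulation of Cohen's d:

$$
d = \frac{\bar{X}_{\text{pass}} - \bar{X}_{\text{fail}}}{s_p}, \qquad s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}},
$$

the attention composite in Table 3 gives $s_p = \sqrt{(175 \cdot 6.3^2 + 103 \cdot 5.8^2)/278} \approx 6.1$ and $d = (45.3 - 39.6)/6.1 \approx 0.93$, consistent with the tabled 0.94 given rounding of the means and SDs.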

Table 4.

Neuropsychological domain composite scores as a function of WMT performance in participants completing MEB evaluations

Composite                WMT pass: Mean (SD), n    WMT fail: Mean (SD), n    df    F-value    p-value    Effect size
Attention Composite 44.3 (6.6) 48 38.8 (5.6) 49 1, 96 19.9 <.001 0.90 
Processing Speed Composite 44.9 (5.6) 45 36.8 (8.0) 42 1, 85 30.2 <.001 1.17 
Memory Composite 50.9 (6.1) 50 44.2 (5.9) 52 1, 100 31.5 <.001 1.12 
Verbal Reasoning Composite 46.6 (5.3) 50 45.9 (5.6) 49 1, 97 0.3 .550 0.13 
Visual Reasoning Composite 50.7 (6.3) 52 47.7 (6.6) 54 1, 104 6.0 .016 0.46 
Motor Skills Composite 42.7 (7.9) 46 40.4 (7.7) 46 1, 90 2.0 .159 0.29 

Notes: All composite scores are T-scores. WMT = Word Memory Test; SD = Standard Deviation; effect sizes computed using Cohen's d. Bonferroni corrected α = 0.008.

Table 5.

Neuropsychological domain composite scores as a function of WMT performance in participants completing non-MEB/clinical evaluations

Composite                WMT pass: Mean (SD), n    WMT fail: Mean (SD), n    df    F-value    p-value    Effect size
Attention Composite 45.7 (6.2) 128 40.3 (5.9) 55 1, 181 29.4 <.001 0.89 
Processing Speed Composite 47.1 (6.5) 113 40.9 (7.4) 48 1, 159 28.1 <.001 0.89 
Memory Composite 50.0 (5.7) 130 45.7 (6.7) 62 1, 190 21.2 <.001 0.69 
Verbal Reasoning Composite 46.6 (6.1) 126 43.7 (5.3) 49 1, 173 8.2 .005 0.51 
Visual Reasoning Composite 50.6 (6.7) 137 48.5 (5.9) 69 1, 204 4.9 .027 0.33 
Motor Skills Composite 46.0 (7.3) 114 41.7 (8.6) 67 1, 179 12.9 <.001 0.54 

Notes: All composite scores are T-scores. WMT = Word Memory Test; SD = Standard Deviation; effect sizes computed using Cohen's d. Bonferroni corrected α = 0.008.

Regarding self-reported psychological symptoms, Table 6 summarizes the descriptive and inferential statistics, including effect sizes, for the individual PAI scales and validity subscales of interest in those passing and failing the WMT. The PAI sample size was 331, with 194 passing the WMT and 137 failing the WMT. Tables 7 and 8 show the same comparisons for the MEB and non-MEB/clinical subsamples. Given the multiple comparisons, a Bonferroni correction was applied with the resulting α set at 0.003. When the entire sample was evaluated, the NIM and Malingering Index scores were significantly higher in the WMT fail group, with medium and small effect sizes, respectively. Nine clinical scales showed significant elevations in the WMT fail group relative to those who passed the WMT. Among those completing an MEB evaluation, there was no difference between those passing and failing the WMT on validity indices. There was also little divergence on clinical scales, with only two scales showing significant differences as a function of WMT performance. Among study participants being evaluated in a non-MEB/clinical context, those who failed the WMT had significantly higher scores on the NIM and Malingering Index scales than those who passed, with medium effect sizes. Seven of the PAI clinical scales examined were also elevated in the WMT fail group and demonstrated medium effect sizes.

Table 6.

PAI T-scores as a function of WMT performance in all study participants

 WMT pass (mean [SD]) WMT fail (mean [SD]) F-value p-value Effect size 
PAI Inconsistency 51.5 (9.2) 50.9 (8.6) 0.33 .56 0.07 
PAI Infrequency 52.7 (9.2) 53.8 (9.0) 1.07 .30 0.12 
PAI NIM 58.2 (12.6) 64.8 (14.5) 18.6 <.001 0.49 
PAI PIM 47.7 (11.7) 43.2 (10.8) 12.7 <.001 0.40 
PAI Somatization 62.9 (10.7) 70.4 (10.9) 38.6 <.001 0.69 
PAI Anxiety 60.9 (13.8) 69.7 (12.6) 35.7 <.001 0.67 
PAI Anxiety-Related Disorder 60.6 (14.4) 71.7 (12.3) 50.7 <.001 0.83 
PAI Depression 65.6 (14.5) 74.5 (13.2) 32.8 <.001 0.64 
PAI Mania 54.6 (9.8) 56.2 (9.8) 2.1 .14 0.16 
PAI Paranoia 60.6 (13.3) 66.0 (13.3) 13.0 <.001 0.41 
PAI Schizophrenia 60.9 (14.1) 69.3 (13.4) 31.5 <.001 0.61 
PAI Borderline Features 59.5 (13.3) 65.8 (11.6) 20.1 <.001 0.50 
PAI Antisocial Features 60.0 (11.4) 58.0 (13.3) 2.2 .14 0.16 
PAI Alcohol Problems 49.7 (9.0) 49.9 (10.7) 0.03 .88 0.02 
PAI Drug Problems 49.0 (7.2) 47.7 (6.6) 3.1 .08 0.19 
PAI Aggression 60.2 (14.5) 66.7 (14.4) 16.4 <.001 0.45 
PAI Suicidal Ideation 50.1 (11.5) 52.3 (13.1) 2.8 .10 0.18 
PAI Stress 54.9 (11.3) 59.2 (11.8) 11.3 .001 0.37 
PAI Malingering Index 54.4 (12.0) 59.1 (14.7) 10.3 .002 0.35 
PAI Rogers Discriminant Function 50.1 (11.4) 50.0 (11.2) 0.01 .90 0.01 

Notes: PAI = Personality Assessment Inventory; SD = Standard Deviation. df for all comparisons = 1, 329. Effect sizes computed using Cohen's d. Bonferroni corrected α = 0.003.

Table 7.

PAI T-scores as a function of WMT performance in study participants completing MEB evaluations

 WMT pass (mean [SD]) WMT fail (mean [SD]) F-value p-value Effect size 
PAI Inconsistency 52.5 (8.2) 50.1 (7.4) 2.7 .10 0.31 
PAI Infrequency 54.9 (10.0) 53.9 (9.3) 0.3 .57 0.10 
PAI NIM 63.2 (16.5) 65.6 (14.4) 0.7 .40 0.15 
PAI PIM 46.2 (13.4) 42.6 (11.6) 2.4 .13 0.29 
PAI Somatization 68.1 (9.6) 71.6 (10.7) 3.6 .06 0.34 
PAI Anxiety 64.9 (10.3) 72.6 (12.3) 9.6 .002 0.69 
PAI Anxiety-Related Disorder 65.9 (15.3) 74.0 (12.3) 9.7 .002 0.58 
PAI Depression 72.1 (15.6) 77.9 (12.7) 4.9 .03 0.41 
PAI Mania 55.4 (10.6) 57.8 (9.6) 1.5 .22 0.24 
PAI Paranoia 64.2 (15.1) 67.9 (12.7) 2.0 .16 0.27 
PAI Schizophrenia 65.3 (16.4) 71.4 (12.6) 5.6 .02 0.42 
PAI Borderline Features 62.7 (15.2) 67.9 (12.2) 4.3 .04 0.38 
PAI Antisocial Features 56.2 (12.2) 58.9 (14.0) 1.2 .28 0.21 
PAI Alcohol Problems 51.4 (11.1) 50.2 (12.6) 0.3 .58 0.10 
PAI Drug Problems 49.2 (6.7) 47.4 (5.8) 2.4 .12 0.29 
PAI Aggression 62.9 (14.5) 67.5 (14.5) 2.9 .09 0.32 
PAI Suicidal Ideation 55.4 (16.9) 61.2 (11.9) 0.02 .88 0.40 
PAI Stress 57.8 (12.1) 61.2 (11.9) 2.3 .13 0.28 
PAI Malingering Index 59.6 (15.4) 60.0 (12.8) 0.3 .86 0.03 
PAI Rogers Discriminant Function 50.9 (12.3) 51.4 (12.8) 0.5 .82 0.04 

Notes: PAI = Personality Assessment Inventory; SD = Standard Deviation. df for all comparisons = 1, 113. Effect sizes computed using Cohen's d. Bonferroni corrected α = 0.003.

Table 8.

PAI T-scores as a function of WMT performance in study participants completing non-MEB/clinical evaluations

 WMT pass (mean [SD]) WMT fail (mean [SD]) F-value p-value Effect size 
PAI Inconsistency 51.2 (9.5) 51.5 (9.4) 0.1 .74 0.03 
PAI Infrequency 51.9 (8.7) 53.7 (8.8) 2.1 .15 0.21 
PAI NIM 56.4 (10.3) 64.1 (14.8) 20.2 <.001 0.60 
PAI PIM 48.3 (10.8) 43.7 (10.1) 9.2 .003 0.44 
PAI Somatization 60.9 (10.4) 69.3 (11.0) 30.8 <.001 0.78 
PAI Anxiety 59.3 (13.2) 67.4 (12.5) 19.2 <.001 0.63 
PAI Anxiety-Related Disorder 58.5 (13.6) 69.8 (13.8) 33.9 <.001 0.82 
PAI Depression 63.0 (13.3) 71.8 (13.0) 19.3 <.001 0.67 
PAI Mania 54.3 (9.5) 55.0 (9.8) 33.9 .63 0.07 
PAI Paranoia 59.2 (12.3) 64.4 (13.7) 21.5 .005 0.40 
PAI Schizophrenia 58.9 (12.8) 67.5 (13.7) 20.9 <.001 0.65 
PAI Borderline Features 58.2 (12.4) 64.0 (10.8) 11.8 .001 0.70 
PAI Antisocial Features 55.9 (11.2) 57.3 (12.8) 0.7 .41 0.12 
PAI Alcohol Problems 49.1 (8.0) 49.8 (9.0) 0.3 .61 0.08 
PAI Drug Problems 49.0 (7.5) 47.9 (7.2) 1.1 .30 0.15 
PAI Aggression 59.1 (14.4) 66.0 (14.4) 11.4 .001 0.48 
PAI Suicidal Ideation 48.0 (7.7) 50.2 (10.3) 3.2 .08 0.24 
PAI Stress 53.7 (10.8) 57.5 (11.5) 5.9 .02 0.34 
PAI Malingering Index 52.4 (9.7) 58.4 (16.1) 11.6 .001 0.45 
PAI Rogers Discriminant Function 49.8 (11.1) 48.9 (12.1) 0.4 .56 0.08 

Notes: PIM = Positive Impression Management; PAI = Personality Assessment Inventory; SD = Standard Deviation; df for all comparisons = 1, 214. Effect sizes computed using Cohen's d. Bonferroni corrected α = 0.003.

Discussion

The current study sought to extend the existing literature on Symptom Validity Testing in the military context. Overall, approximately 42% of the sample failed the WMT. However, a significantly higher failure rate was revealed in those undergoing a neuropsychological evaluation as part of a pending MEB (54%), relative to those who were seen in a non-MEB/clinical context (35%). Such a difference highlights the importance of the immediate evaluation context when considering the likelihood of SVT failure in an active duty sample. The non-MEB/clinical group's failure rate is not dissimilar to that of the Armistead-Jehle and Hansen (2011) study, which showed up to a 30% failure rate in active duty service members who were not involved in high-level officer training. Moreover, Nelson and colleagues (2010) demonstrated a 59% failure rate on the VSVT (Slick et al., 1996) in veterans involved in a forensic-type C&P evaluation, and Young, Kearns, and Roper (2011) reported a 71% failure rate on the WMT in C&P evaluations. These groups may be very comparable with the current sample of active service members undergoing an MEB evaluation, as proximal secondary gain can reasonably be considered a factor in both groups. Armistead-Jehle (2010) showed a 58% failure rate in veterans seen in a clinical context. However, it has been highlighted that in the current VA system (where a claim for service connection can be made at any time), the distinction between clinical and forensic-type evaluations can be difficult to make (Armistead-Jehle, 2010; Armistead-Jehle & Hansen, 2011). As such, the failure rate exceeding 50% seen across veteran samples and the current group of active duty service members immediately involved in an MEB may be a reasonable baseline in these types of evaluations.

As with previous research, the current study demonstrated notable effect sizes across neurocognitive ability measures as a function of SVT performance. In the sample as a whole, the magnitude of these effect sizes was most notable across measures of attention, processing speed, and memory. Although significant differences were still demonstrated on measures of verbal and visual reasoning, effect sizes were smaller. Such a finding suggests that measures tapping attention, processing speed, and memory are more susceptible to the effects of suboptimal effort. When the sample was divided into those completing an MEB evaluation and those seen in a non-MEB/clinical context, effect sizes remained prominent across the attention, processing speed, and memory composite scores. In the non-MEB/clinical group, significant differences and small to medium effect sizes were also evidenced across the verbal and visual reasoning composite scores. In the MEB group, differences on the verbal reasoning and motor skills composites did not reach significance.

The failure to note an influence of effort on measures of crystallized intelligence in those failing effort measures was also reported by Whitney, Shepard, Mariner, Mossbarger, and Herman (2010). These authors demonstrated that Wechsler Test of Adult Reading (PsychCorp, 2001) scores were not influenced by SVT performance in a group of military veterans. The reason for such a relationship is not entirely clear. As suggested by Whitney and colleagues, it may be that examinees perceive such tasks as measures of “intelligence” and are thus more highly motivated to perform well on them. Future studies employing simulated malingerers may aid in clarifying this question. Regarding motor skills, given that the average scores for both those failing and passing the WMT in the MEB subsample were in the low average range and the sample sizes of those administered measures of motor skills were relatively small, this finding may be secondary to limited variability and a Type II error. Regardless of these nuanced differences, the main finding that performances across neurocognitive ability measures in an active duty military sample are greatly impacted by SVT performance is consistent with the findings of Lange and colleagues (2012). The current study can also be taken to extend these findings, as individuals evaluated in the Lange and colleagues work were seen on average 3.9 months post-injury, whereas the average time since injury in the current sample (comprised predominantly of patients with a history of mild TBI) was 30.5 months.

There appears to be a wide range of SVT failure rates across the various studies of veteran and active duty military samples. The search for underlying factors that influence this variability may be of import when seeking to accurately quantify base rates. In conjunction with the Nelson and colleagues (2010) and Young and colleagues (2012) studies, the current findings suggest that among veterans and active duty service members assessed in the immediate context of a potential compensation-seeking examination (i.e., MEB or C&P), SVT failure base rates exceed 50%. However, base rates in active duty samples seen in non-MEB/clinical contexts appear lower, ranging between 19% and 35% (Armistead-Jehle & Hansen, 2011; Lange et al., 2012). As such, when considering a military population, efforts to make a distinction between veteran and active duty samples, as well as between active duty samples undergoing and not undergoing an MEB, may be of import. Future research in this area will be necessary to verify this conclusion, which is admittedly based on relatively few studies.

Regarding self-reported psychological symptoms, in the sample as a whole those who failed the WMT had significantly elevated NIM and Malingering Index scale scores, with medium and small effect sizes, respectively. Nine clinical scales showed significantly higher scores in the WMT fail group, with effect sizes generally in the medium range. In those undergoing an MEB, there were no differences on PAI validity scales as a function of WMT performance; however, both the NIM and Malingering Index were elevated relative to the non-MEB/clinical sample who passed the WMT. While such a finding may be taken to suggest some degree of over-reporting in those service members who present for an evaluation in the context of a pending medical board, it is important to note that mean NIM and Malingering Index scores were below the cut scores suggested by Morey (1991, 2003) to be indicative of over-reporting. However, the large standard deviations across these scores (Table 7) suggest a notable degree of variance. Among those seen in a non-MEB/clinical context, there were significant differences across PAI validity scales, with the indication that those who failed the WMT were prone to over-report. In this group, seven clinical scales met statistical significance between the WMT pass and fail groups, with generally medium effect sizes demonstrated.

As a whole, these data could be taken to suggest that those who fail cognitive effort measures may be prone to exaggerate self-reported psychological symptoms as well. This conclusion would be consistent with Lange and colleagues (2012) and Jones and colleagues (2012), as well as a number of studies conducted with civilian samples (i.e., Gervais et al., 2004, 2008; Iverson et al., 2010; Larrabee, 2003a, 2003b; Suhr et al., 2008; Wygant et al., 2007). However, other researchers have not found such an association between cognitive effort measure performance and self-reported psychological symptoms (Armistead-Jehle, 2010; Demakis et al., 2008; Haggerty et al., 2007; Sumanti et al., 2006). It is of import to note that within the current data, the effect sizes of WMT performance on cognitive ability measures were generally much greater than those on psychological symptom self-report measures. Consequently, given the somewhat conflicting literature in this area, one might conclude that failure on cognitive effort measures is much more strongly related to cognitive ability measure performance than to the exaggeration of psychological symptoms. Such a statement appears to be supported by the myriad data consistently showing a strong association between SVT failure and poor neurocognitive performance (e.g., Constantinou et al., 2005; Fox, 2011; Green et al., 2001; Green, 2007; Lange et al., 2010; Moss et al., 2003; West et al., 2010), relative to the somewhat discrepant literature comparing poor SVT performance to the exaggeration of psychological symptoms. This statement is also consistent with the concerns articulated by Bush and colleagues (2005), indicating that invalid performance on measures of cognitive symptom validity does not allow for an a priori conclusion that personality test results are also invalid. Future research may be able to identify select subgroups where this relationship is more clearly defined.

Limitations of the current study include select issues of statistical power and external validity. More specifically, the sample was not a diagnostically pure group in reference to neurological insult or psychiatric diagnosis, and there were insufficient cell sizes for select diagnoses to evaluate test performances as a function of this variable. Future investigations evaluating how diagnostic variations may have influenced results would be useful to this line of research. Additionally, the vast majority of this sample was male and Caucasian, which can limit external validity. Replication of the current findings in military samples of greater demographic heterogeneity will be of importance. Another limitation of the current study was the reliance on a single SVT to determine adequate effort. Although the WMT is generally considered a sensitive instrument, the use of multiple SVTs in future research could yield more robust findings.

In sum, the current study extends the research literature in the domain of effort testing in the military population. The data highlight the difference in SVT failure rates between neuropsychological examinations in MEB and non-MEB/clinical contexts (with WMT failure at 54% and 35%, respectively) and thus provide support for the Nelson and colleagues (2010) study demonstrating the importance of the evaluation context in SVT failure rates. The pervasive influence of effort on neurocognitive test performance was also demonstrated. In the current study, this finding was consistent across the MEB and non-MEB/clinical subsamples and overall was in concert with Lange and colleagues (2012) and a number of previous investigations. Although the current data tended to show significant differences in PAI validity and clinical scales between those who failed and passed the WMT, the effect sizes were smaller relative to those on neurocognitive measures.

The study provides additional support for the importance of Symptom Validity Testing in this patient population. Given the apparent base rates of SVT failure in military samples involved in both medical boards and clinical care, suboptimal effort on neuropsychological testing within this population appears to be a relatively frequent occurrence. Those who fail SVTs are likely to have falsely depressed performances on measures of neurocognitive ability and may have artificially elevated scores on measures of psychological and emotional functioning. Such results could easily lead to inaccurate diagnoses, unnecessary treatment recommendations, and ultimately misguided clinical conclusions. Routine administration of Symptom Validity Testing can substantially aid in the reduction of these potential clinical missteps.

Conflict of Interest

None declared.

References

American Congress of Rehabilitation Medicine. (1993). Definition of mild traumatic brain injury. Journal of Head Trauma Rehabilitation, 8, 86–87.
Armistead-Jehle, P. (2010). Symptom validity test performance in U.S. veterans referred for evaluation of mild TBI. Applied Neuropsychology, 17, 52–59.
Armistead-Jehle, P., & Hansen, C. L. (2011). Comparison of the Repeatable Battery for the Assessment of Neuropsychological Status Effort Index and Stand-Alone Symptom Validity Tests in a military sample. Archives of Clinical Neuropsychology, 26, 592–601.
Axelrod, B. N., & Schutte, C. (2010). Analysis of the dementia profile on the Medical Symptom Validity Test. The Clinical Neuropsychologist, 24(5), 873–881.
Ben-Porath, Y. S., & Tellegen, A. (2008). MMPI-2-RF (Minnesota Multiphasic Personality Inventory-2-Restructured Form) manual for administration, scoring, and interpretation. Minneapolis, MN: University of Minnesota Press.
Boone, K. B., Salazar, X., Lu, P., Warner-Chacon, K., & Razani, J. (2002). The Rey 15-Item Recognition Trial: A technique to enhance sensitivity of the Rey 15-Item Memorization Test. Journal of Clinical and Experimental Neuropsychology, 24, 561–573.
Bush, S. S., Ruff, R. M., Troster, A. I., Barth, J. T., Koffler, S. P., Pliskin, N. H., et al. (2005). Symptom validity assessment: Practice issues and medical necessity. NAN Policy and Planning Committee. Archives of Clinical Neuropsychology, 20, 419–426.
Butcher, J. N., Graham, J. R., Ben-Porath, Y. S., Tellegen, A., Dahlstrom, W. G., & Kaemmer, B. (2001). MMPI-2: Manual for administration, scoring, and interpretation. Minneapolis, MN: University of Minnesota Press.
Chafetz, M. (2008). Malingering on the social security disability consultative exam: Predictors and base rates. The Clinical Neuropsychologist, 22, 529–546.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.
Constantinou, M., Bauer, L., Ashendorf, L., Fisher, J. M., & McCaffrey, R. J. (2005). Is poor performance on recognition memory effort measures indicative of generalized poor performance on neuropsychological tests? Archives of Clinical Neuropsychology, 20, 191–198.
Culbertson, W. C., & Zillmer, E. A. (2005). Tower of London-Drexel University-2nd Edition technical manual. North Tonawanda, NY: Multi-Health Systems.
DeFilippis, N. A., & McCampbell, E. (1997). The Booklet Category Test-Second Edition: Professional manual. Lutz, FL: Psychological Assessment Resources.
Delis, D. C., Kaplan, E., & Kramer, J. H. (2001). Delis-Kaplan Executive Function System. San Antonio, TX: Harcourt Assessment.
Delis, D. C., Kramer, J. H., Kaplan, E., & Ober, B. A. (2000). California Verbal Learning Test-Second Edition manual (CVLT-II). San Antonio, TX: Psychological Corporation.
Demakis, G. J., Gervais, R. O., & Rohling, M. L. (2008). The effect of failure on cognitive and psychological symptom validity tests in litigants with symptoms of post-traumatic stress disorder. The Clinical Neuropsychologist, 22, 879–895.
Fox, D. D. (2011). Symptom validity test failure indicates invalidity of neuropsychological tests. The Clinical Neuropsychologist, 25, 488–495.
Gervais, R. O., Ben-Porath, Y. S., Wygant, D. B., & Green, P. (2008). Differential sensitivity of the Response Bias Scale (RBS) and MMPI-2 validity scales to memory complaints. The Clinical Neuropsychologist, 22, 1061–1079.
Gervais, R. O., Rohling, M. L., Green, P., & Ford, W. (2004). A comparison of WMT, CARB, and TOMM failure rates in non-head injury disability claimants. Archives of Clinical Neuropsychology, 19, 475–487.
Gladsjo, J. A., Schuman, C. C., Evans, J. D., Peavy, G. M., Miller, S. W., & Heaton, R. K. (1999). Norms for letter and category fluency: Demographic corrections for age, education, and ethnicity. Assessment, 6, 147–178.
Golden, C. J., & Freshwater, S. M. (2002). The Stroop Color and Word Test: A manual for clinical and experimental uses. Wood Dale, IL: Stoelting.
Green, P. (2003). Green's Word Memory Test for Windows: User's manual. Edmonton, Canada: Green's Publishing.
Green, P. (2004). Green's Medical Symptom Validity Test (MSVT) for Microsoft Windows: User's manual. Edmonton, Canada: Green's Publishing.
Green, P. (2007). The pervasive influence of effort on neuropsychological tests. Physical Medicine and Rehabilitation Clinics of North America, 18, 43–68.
Green, P., Lees-Haley, P. R., & Allen, L. M., III. (2002). The Word Memory Test and the validity of neuropsychological test scores. Journal of Forensic Neuropsychology, 2, 97–124.
Green, P., Rohling, M. L., Lees-Haley, P. R., & Allen, L. M., III. (2001). Effort has a greater effect on test scores than severe brain injury in compensation claimants. Brain Injury, 15, 1045–1060.
Greenberg, L. M. (2007). The Test of Variables of Attention (Version 7.3) [Computer software]. Los Alamitos, CA: The TOVA Company.
Haggerty, K. A., Frazier, T. W., Busch, R. M., & Naugle, R. I. (2007). Relationships among Victoria Symptom Validity Test indices and Personality Assessment Inventory validity scales in a large clinical sample. The Clinical Neuropsychologist, 21, 917–928.
Hartman, D. E. (2002). The unexamined lie is a lie worth fibbing: Neuropsychological malingering and the Word Memory Test. Archives of Clinical Neuropsychology, 17, 709–714.
Heilbronner, R. L., Sweet, J. J., Morgan, J. E., Larrabee, G. J., Millis, S. R., & Conference Participants. (2009). American Academy of Clinical Neuropsychology consensus conference statement on the neuropsychological assessment of effort, response bias, and malingering. The Clinical Neuropsychologist, 23, 1093–1129.
Howe, L. L., & Loring, D. W. (2009). Classification accuracy and predictive ability of the Medical Symptom Validity Test's dementia profile and general memory impairment profile. The Clinical Neuropsychologist, 23, 329–342.
Iverson, G. L., Lange, R. T., Brooks, B. L., & Rennison, V. L. (2010). ‘Good old days’ bias following mild traumatic brain injury. The Clinical Neuropsychologist, 24, 17–37.
Jones, A. M., Ingram, V., & Ben-Porath, Y. S. (2012). Scores on the MMPI-2-RF scales as a function of increasing levels of failure on cognitive symptom validity tests in a military sample. The Clinical Neuropsychologist, 26, 790–815.
Kurtz, J. E., & Blais, M. A. (2007). Introduction to the special issue on the Personality Assessment Inventory. Journal of Personality Assessment, 88, 1–4.
Lafayette Instrument. (2003). Grooved Pegboard user's manual. Lafayette, IN: Lafayette Instrument Company.
Lafayette Instrument. (2004). Hand Dynamometer user instructions. Lafayette, IN: Lafayette Instrument Company.
Lange, R. T., Iverson, G. L., Brooks, B. L., & Rennison, V. L. (2010). Influence of poor effort on self-reported symptoms and neurocognitive test performance following mild traumatic brain injury. Journal of Clinical and Experimental Neuropsychology, 32, 961–972.
Lange, R. T., Pancholi, S., Bhagwat, A., Anderson-Barnes, V., & French, L. M. (2012). Influence of poor effort on neuropsychological test performance in U.S. military personnel following mild traumatic brain injury. Journal of Clinical and Experimental Neuropsychology, 34, 453–466.
Larrabee, G. J. (2003). Exaggerated MMPI-2 symptom report in personal injury litigants with malingered neurocognitive deficit. Archives of Clinical Neuropsychology, 18, 673–686.
Larrabee, G. J. (2003). Exaggerated pain report in litigants with malingered neurocognitive dysfunction. The Clinical Neuropsychologist, 17, 395–401.
Meyers, J. E., Volbrecht, M., Axelrod, B. N., & Reinsch-Boothby, L. (2011). Embedded symptom validity tests and overall neuropsychological test performance. Archives of Clinical Neuropsychology, 26, 8–15.
Mitrushina, M., Boone, K. B., Razani, J., & D'Elia, L. F. (2005). Handbook of normative data for neuropsychological assessment (2nd ed.). New York: Oxford University Press.
Morey, L. C. (1991). Personality Assessment Inventory. Odessa, FL: Psychological Assessment Resources.
Morey, L. C. (2003). Essentials of PAI assessment. Hoboken, NJ: Wiley.
Moss, A., Jones, C., Fokias, D., & Quinn, D. (2003). The mediating effects of effort upon the relationship between head injury severity and cognitive functioning. Brain Injury, 17, 377–387.
Nelson, N. W., Hoelzle, J. B., McGuire, K. A., Ferrier-Auerbach, A. G., Charlesworth, M. J., & Sponheim, S. R. (2010). Evaluation context impacts neuropsychological performance of OEF/OIF veterans with reported combat-related concussion. Archives of Clinical Neuropsychology, 25, 713–723.
PsychCorp. (2001). Wechsler Test of Adult Reading (WTAR). San Antonio, TX: Author.
Slick, D. J., Hopp, G., Strauss, E., & Spellacy, F. J. (1996). Victoria Symptom Validity Test: Efficiency for detecting feigned memory impairment and relationship to neuropsychological tests and MMPI-2 validity scales. Journal of Clinical and Experimental Neuropsychology, 18, 911–922.
Stevens, A., Friedel, E., Mehren, G., & Merten, T. (2008). Malingering and uncooperativeness in psychiatric and psychological assessment: Prevalence and effects in a German sample of claimants. Psychiatry Research, 157, 191–200.
Suhr, J., Hammers, D., Dobbins-Buckland, K., Zimak, E., & Hughes, C. (2008). The relationship of malingering test failure to self-reported symptoms and neuropsychological findings in adults referred for ADHD evaluation. Archives of Clinical Neuropsychology, 23, 521–530.
Sumanti, M., Boone, K. B., Savodnik, I., & Gorsuch, R. (2006). Noncredible psychiatric and cognitive symptoms in a workers’ compensation ‘stress’ claim sample. The Clinical Neuropsychologist, 22, 1080–1092.
Tombaugh, T. N. (1996). The Test of Memory Malingering. Toronto: Multi-Health Systems.
Wechsler, D. (2008). Wechsler Adult Intelligence Scale-Fourth Edition. San Antonio, TX: Pearson Assessment.
Wechsler, D. (2009). Wechsler Memory Scale-Fourth Edition. San Antonio, TX: Pearson.
West, L. K., Curtis, K. L., Greve, K. W., & Bianchini, K. J. (2010). Memory in traumatic brain injury: The effects of severity and effort on the Wechsler Memory Scale-III. Journal of Neuropsychology, 5, 114–125.
Whitney, K. A., Davis, J. J., Shepard, P. H., & Herman, S. M. (2008). Utility of the Response Bias Scale (RBS) and other MMPI-2 validity scales in predicting TOMM performance. Archives of Clinical Neuropsychology, 23, 777–786.
Whitney, K. A., Shepard, P. H., Mariner, J., Mossbarger, B., & Herman, S. M. (2010). Validity of the Wechsler Test of Adult Reading (WTAR): Effort considered in a clinical sample of U.S. military veterans. Applied Neuropsychology, 17, 196–204.
Whitney, K. A., Shepard, P. H., Williams, A. L., Davis, J. J., & Adams, K. M. (2009). The Medical Symptom Validity Test in the evaluation of Operation Iraqi Freedom/Operation Enduring Freedom soldiers: A preliminary study. Archives of Clinical Neuropsychology, 24, 145–152.
Wygant, D. B., Sellbom, M., Ben-Porath, Y. S., Stafford, K. P., Freeman, D. B., & Heilbronner, R. L. (2007). The relation between symptom validity testing and MMPI-2 scores as a function of forensic evaluation context. Archives of Clinical Neuropsychology, 22, 489–499.
Wynkoop, T. F., & Denney, R. L. (2005). Test review: Green's Word Memory Test (WMT) for Windows. Journal of Forensic Neuropsychology, 4, 101–105.
Young, J. C., Kearns, L. A., & Roper, B. L. (2011). Validation of the MMPI-2 Response Bias Scale and Henry-Heilbronner Index in a U.S. veteran population. Archives of Clinical Neuropsychology, 26, 194–204.
Young, J. C., Sawyer, R. J., Roper, B. L., & Baughman, B. C. (2012). Expansion and reexamination of Digit Span effort indices in the WAIS-IV. The Clinical Neuropsychologist, 26, 147–159.

Author notes

The views, opinions, and/or findings contained in this article are those of the authors and should not be construed as an official Department of the Army position, policy, or decision unless so designated by other official documentation.