Abstract

This brief report evaluated the classification statistics of the Advanced Clinical Solutions (ACS) embedded measures of symptom validity in relation to the Word Memory Test (WMT), a well-established standalone measure of effort, in a sample of active duty military service members with a history of mild traumatic brain injury. Results demonstrated that, relative to the WMT, the ACS embedded measures had adequate specificity but lacked sensitivity. This result is consistent with previous studies demonstrating poor sensitivity for embedded effort measures relative to standalone Symptom Validity Tests.

Introduction

Within the field of clinical neuropsychology, the objective establishment of response validity has become an increasingly researched topic. Because effort can account for more variance in neuropsychological test scores than the severity of neurological insult (Constantinou, Bauer, Ashendorf, Fisher, & McCaffrey, 2005; Fox, 2011; Green, 2007; Green, Rohling, Lees-Haley, & Allen, 2001; Meyers, Volbrecht, Axelrod, & Reinsch-Boothby, 2011; Stevens, Friedel, Mehren, & Merten, 2008), identifying optimal methods for assessing test-taker effort is important. The existing research literature suggests that qualitative analysis of effort is tenuous at best (Faust, Hart, Guilmette, & Arkes, 1988; Heaton, Smith, Lehman, & Vogt, 1978), and as such various quantitative measures have been devised to assess this construct. In general, these measures fall into one of two forms: (a) standalone instruments designed and standardized to evaluate effort (e.g., Test of Memory Malingering [TOMM; Tombaugh, 1996]; Word Memory Test [WMT; Green, 2003]) or (b) indices embedded in neuropsychological ability tests.

Examples of embedded indices include several subtests from the most recent revision of the Wechsler Memory Scale, Fourth Edition (WMS-IV; Wechsler, 2009). From this battery, three embedded measures can be calculated: Visual Reproduction Recognition (VR-rec), Logical Memory Recognition (LM-rec), and Verbal Paired Associates Recognition (VPA-rec). The associated Advanced Clinical Solutions (ACS; Pearson, 2009a) package adds to these three subtests a forced-choice memory test (Word Choice Test [WCT]) and the Reliable Digit Span (RDS; Greiffenstein, Baker, & Gola, 1994), an embedded effort measure from the Wechsler Adult Intelligence Scale, Fourth Edition (WAIS-IV; Wechsler, 2008). A recent study by Miller and colleagues (2011) examined, via an analog design, the utility of these embedded indices and the WCT in discriminating healthy adults coached to simulate traumatic brain injury (TBI) from examinees with poor cognitive test performances secondary to actual TBI. A series of binary logistic regression and receiver operating characteristic (ROC) analyses were employed to determine the predictive utility of the ACS. The strongest model tested included all five ACS measures and demonstrated an area under the curve (AUC) of 0.95. Although a single-variable model including only the WCT was also statistically reliable, it was less precise in predicting group membership. Miller and colleagues concluded that the five ACS measures collectively, as well as the WCT individually, demonstrated utility in identifying suboptimal effort.
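For readers less familiar with this analytic approach, the sketch below illustrates how a binary logistic regression over several effort scores can be summarized with an AUC. It uses simulated, hypothetical data and generic scikit-learn calls; it is not Miller and colleagues' actual analysis or dataset.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Hypothetical data: five effort scores per examinee (columns standing in
# for RDS, LM-rec, VPA-rec, VR-rec, and WCT); y = 1 marks coached simulators.
n = 200
X = rng.normal(size=(n, 5))
y = (X.mean(axis=1) + 0.5 * rng.normal(size=n) < 0).astype(int)

# Fit the five-predictor model and compute the AUC of its predicted
# probabilities (Miller et al. report an AUC of 0.95 for their full model).
model = LogisticRegression().fit(X, y)
auc = roc_auc_score(y, model.predict_proba(X)[:, 1])
print(f"AUC = {auc:.2f}")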

As embedded effort measures typically require little or no additional test administration time, the practical appeal of these instruments is obvious. However, a handful of recent studies have suggested that, relative to standalone measures of effort, various embedded measures have limited sensitivity. For instance, Armistead-Jehle and Hansen (2011) found that, in comparison with the TOMM, Non-Verbal Medical Symptom Validity Test (NV-MSVT; Green, 2008), and Medical Symptom Validity Test (MSVT; Green, 2004), the Repeatable Battery for the Assessment of Neuropsychological Status (Randolph, 1998) Effort Index (EI; Silverberg, Wertheimer, & Fichtenberg, 2007) demonstrated sensitivities of 0.60, 0.62, and 0.53, respectively, with an EI cut of ≥1 in their sample of active duty military service members. With an EI cut of >3, sensitivities all dropped below 0.32.

Another recent study, by Miele, Gunner, Lynch, and McCaffrey (2012), represents a more comprehensive approach to embedded versus standalone Symptom Validity Test (SVT) comparisons. The authors evaluated four standalone SVTs and 17 embedded validity indices in a sample of 50 examinees seen for neuropsychological evaluation in a medico-legal context. Results suggested that although the RDS had the best classification accuracy of the included embedded indices, no single embedded measure had a sensitivity over 0.62. The authors concluded that all indices "tended to either over-identify optimal effort or under-identify suboptimal effort" (p. 21) and that, as such, the employment of embedded indices in the absence of standalone SVTs was not supported.

The aim of the current study was to extend the literature in the area of symptom validity testing by evaluating the classification statistics of several ACS embedded measures of symptom validity relative to the WMT, a well-established standalone measure of effort. Given the previous literature in this area, we predicted that the ACS system of embedded measures would demonstrate adequate specificity, but poor sensitivity.

Methods

Participants

The study consisted of a convenience sample of 280 U.S. military service members on active duty orders with a history of mild TBI as defined by the American Congress of Rehabilitation Medicine (1993). The average age and education of the sample were 32.0 (SD = 7.7) and 12.8 (SD = 1.7) years, respectively. The sample was predominantly Caucasian (76.4%), with 11.8% African American, 2.8% Hispanic, and 3.6% other, and was composed largely of men (95.7%). All patients were tested by the second author, or by a trained technician under the supervision of the second author, as part of a neuropsychological consult completed at an outpatient TBI clinic located at a Southeast U.S. Army Medical Facility between July 2009 and August 2011. Prior to all evaluations, the patients gave consent for the assessment and were instructed to provide their best effort across the tests administered. The examiner remained in the room at all times throughout the evaluations. Although the standardized WMT instructions indicate that the examiner is to leave the testing room during select subtests, this was not logistically feasible in the current data collection. However, several previous papers have been published with this slight alteration to test administration (e.g., Armistead-Jehle & Hansen, 2011; Singhal, Green, Ashaye, Shankar, & Gill, 2009), and the test creator/publisher (Dr. Paul Green) has asserted that an examiner can remain in the room without detrimental effects to the test administration so long as the examiner does not interact with the examinee or give feedback during the select subtests (P. Green, personal communication, 15 January 2013). All participants spoke English fluently, and the testing was conducted in English. This retrospective analysis of clinical data was approved by the Institutional Review Board at Walter Reed National Military Medical Center.

Measures

Participants were administered a battery of neuropsychological tests, which included the WAIS-IV, WMS-IV, and WMT. As part of the ACS package, embedded effort measures were calculated based on raw scores from specific subtests within the WAIS-IV and WMS-IV. The specific embedded effort subtests included the RDS (Greiffenstein, Baker, & Gola, 1994), LM-rec, VPA-rec, and VR-rec. Raw scores from these tests are used to compare the individual's performance with various clinical and non-clinical samples from the WAIS-IV and WMS-IV validation studies. The ACS materials provide base rates (e.g., ≤2%, ≤5%, ≤10%, ≤15%) for these individual raw scores, as well as base rates for the number of scores below defined cutoffs. In this way, it is possible to compare the examinee's performance with various groups of interest in terms of how many scores fell below a specific cutoff. The ACS Administration and Scoring Manual specifies that, of the five effort scores, at least three must be available to provide a meaningful analysis of the individual's performance. Although specific cutoffs for poor effort are not listed, the ACS Clinical and Interpretative Manual (Pearson, 2009b) suggests that two or more scores below either the 10% or 15% base rate cutoff be interpreted as a sign of possible suboptimal effort (p. 101). Of note, the WCT is also an effort test included in the ACS package; however, this measure was not administered, owing to its similarity to the WMT (i.e., both use a two-alternative forced-choice format to assess recognition of a word list).

In addition to the embedded measures from the ACS, the participants were administered the WMT (Green, 2003). The WMT is a computer-administered verbal memory test with multiple subtests designed to assess verbal memory, effort, and response consistency. Performances on the WMT were classified as either passing or failing based on the criteria published in the test manual. A number of studies have demonstrated the utility of the WMT in discriminating between those with genuine memory impairment and those simulating impairment across a range of patient samples (e.g., Green, Lees-Haley, & Allen, 2002; Hartman, 2002; Wynkoop & Denney, 2005).

Procedures

As noted above, the ACS manual does not provide specific cut scores indicative of suboptimal effort; rather, it suggests that when two or more of the subtest scores fall below the 10% or 15% base rate cutoff, the examinee may have offered suboptimal effort. As such, classification statistics for the ACS subtests were computed at both the 10% and 15% base rates, with suboptimal effort on the ACS defined as two or more scores on the RDS, LM-rec, VPA-rec, and VR-rec at or below the 10th or 15th percentile base rate. The ACS manual offers different clinical groups from which to select appropriate base rates; given that the study sample comprised individuals with a history of concussion, the ACS TBI clinical group was employed for comparison.
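To make this decision rule concrete, the following is a minimal sketch of the two-or-more-failures flag applied in this study. The function name, input structure, and example percentiles are illustrative assumptions; in practice, the base-rate percentile for each raw score is read from the ACS tables for the TBI clinical group.

# Minimal sketch (illustrative, not the ACS scoring software): flag possible
# suboptimal effort when two or more of the four embedded scores fall at or
# below the chosen base rate cutoff.
def acs_flag(base_rate_percentiles, cutoff=10, min_failures=2):
    """base_rate_percentiles maps each subtest (RDS, LM-rec, VPA-rec,
    VR-rec) to the ACS base-rate percentile of its raw score."""
    failures = sum(1 for pct in base_rate_percentiles.values() if pct <= cutoff)
    return failures >= min_failures  # True = possible suboptimal effort

# Hypothetical examinee: RDS and VPA-rec fall at or below the 10% base rate.
scores = {"RDS": 5, "LM-rec": 25, "VPA-rec": 10, "VR-rec": 50}
print(acs_flag(scores, cutoff=10))  # True (two failures)
print(acs_flag(scores, cutoff=5))   # False (only RDS at/below the 5% level)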

Results

Across the sample, 106 cases (37.9%) failed the WMT, whereas 18 (6.4%) failed the ACS subtests at the 10% base rate level and 23 (8.2%) at the 15% base rate level. As outlined in Table 1, at the 10% base rate level, 173 cases (62%) passed both the WMT and the ACS subtests, with 17 cases (6%) failing both measures. Eighty-nine cases (32%) failed the WMT but passed the ACS subtests. There was one case that passed the WMT but failed the ACS subtests (0.4%). At the 15% base rate level, 170 cases (61%) passed both the WMT and the ACS subtests, with 19 cases (7%) failing both measures. Eighty-seven cases (31%) failed the WMT but passed the ACS subtests at the 15% base rate level. There were four cases that passed the WMT but failed the ACS subtests (1%). Greater detail related to the base rate of WMT failure in this sample can be found in Armistead-Jehle and Buican (2012).

Table 1.

WMT and ACS performances at the 10% and 15% base rate levels (n = 280)

                     Fail WMT   Pass WMT                        Fail WMT   Pass WMT
Fail ACS at 10%a        17           1       Fail ACS at 15%a      19           4
Pass ACS at 10%a        89         173       Pass ACS at 15%a      87         170

Notes: ACS = Advanced Clinical Solutions; WMT = Word Memory Test.

a Defined as failing two or more of the following at the specified base rate: RDS, LM-rec, VPA-rec, and VR-rec.

Classification statistics for the WMT and ACS measures are presented in Table 2. Because the ACS manual identifies the 10% and 15% base rate levels as potential indications of suboptimal effort, these scores are of particular interest. At both the 10% and 15% levels, specificity of the ACS measures was high, but sensitivity was rather low. Additionally, positive predictive power and positive likelihood ratios were high, but negative predictive power and negative likelihood ratios were relatively low. Table 2 also outlines the classification statistics for the 5%, 25%, and 50% base rate levels of failure on two or more ACS embedded measures. Although specificity at the 25% base rate level remained high at 0.94, sensitivity was still depressed at 0.27. Sensitivity increased notably at the 50% base rate level, but specificity was then limited at 0.58.

Table 2.

Classification statistics for the ACS subtests at the 5%, 10%, 15%, 25%, and 50% base rate levels using the WMT as suboptimal effort criterion

          Sens   Spec   PPP    NPP    Hit Rate   LR+    LR−
ACS
  5%      0.11   0.99   0.87   0.64   0.66       19.7   1.12
  10%a    0.16   0.99   0.91   0.66   0.68       16.0   1.18
  15%a    0.18   0.98   0.85   0.66   0.68       7.80   1.19
  25%a    0.27   0.94   0.73   0.68   0.69       4.33   1.29
  50%a    0.71   0.58   0.51   0.77   0.63       1.69   1.99

Notes: PPP and NPP were calculated using the prevalence rate of WMT failure in the sample (i.e., 0.38). Sens = Sensitivity; Spec = Specificity; PPP = Positive Predictive Power; NPP = Negative Predictive Power; LR+ = Positive Likelihood Ratio; LR− = Negative Likelihood Ratio.

a Defined as failing two or more of the following at the specified base rate: RDS, LM-rec, VPA-rec, and VR-rec.
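As a worked illustration of the table notes, the prevalence-adjusted predictive power values follow the standard Bayesian forms. Plugging in the 10% base-rate cells from Table 1, with sensitivity, specificity, and prevalence rounded to two decimals (which appears to be how the tabled values were derived), recovers the published figures:

\[
\mathrm{Sens} = \frac{17}{17 + 89} \approx 0.16, \qquad
\mathrm{Spec} = \frac{173}{173 + 1} \approx 0.99, \qquad
\mathrm{LR}^{+} = \frac{\mathrm{Sens}}{1 - \mathrm{Spec}} \approx \frac{0.16}{0.01} = 16.0
\]

\[
\mathrm{PPP} = \frac{\mathrm{Sens} \cdot p}{\mathrm{Sens} \cdot p + (1 - \mathrm{Spec})(1 - p)} = \frac{0.16 \times 0.38}{0.16 \times 0.38 + 0.01 \times 0.62} \approx 0.91
\]

\[
\mathrm{NPP} = \frac{\mathrm{Spec}\,(1 - p)}{\mathrm{Spec}\,(1 - p) + (1 - \mathrm{Sens})\,p} = \frac{0.99 \times 0.62}{0.99 \times 0.62 + 0.84 \times 0.38} \approx 0.66
\]

Of note, the tabled LR− values are consistent with \(\mathrm{Spec}/(1 - \mathrm{Sens})\) (e.g., 0.99/0.84 ≈ 1.18 at the 10% level), the reciprocal of the more common \((1 - \mathrm{Sens})/\mathrm{Spec}\) form.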

Discussion

This brief empirical report sought to explore the classification statistics of the ACS embedded effort measures in relation to a well-established standalone instrument. Results echo previous research demonstrating that embedded measures offer high specificity but low sensitivity relative to standalone SVTs (Armistead-Jehle & Hansen, 2011; Miele et al., 2012). The data further suggest that the best balance between sensitivity and specificity is achieved by employing the 25% base rate rather than the manual's recommended 10% or 15%; however, even with this change, sensitivity remains rather limited. Given the relatively high base rate of WMT failure in the current sample, an alternative way to examine the classification statistics is to consider the positive and negative predictive power values; here, the 10% base rate appears to be optimal. Future studies could further examine the most favorable ACS base rates, which, given the dissimilar rates of SVT failure across clinical and forensic samples, may differ as a function of the evaluation context. Regardless of the specific classification statistics examined, the current findings demonstrate that although the embedded ACS effort measures may provide some indication of suboptimal effort on the part of the examinee, from a clinical point of view they should not be employed as the sole measures of effort within a neuropsychological battery.

Among the limitations of the current study was the reliance on a single standalone SVT as the criterion variable. Although the WMT is considered among the most sensitive measures of suboptimal effort, the inclusion of additional standalone measures would provide more robust results. Next, although this study met the ACS manual's requirement that at least three of the five effort subtests be administered, the current data did not include the WCT. According to the recent study by Miller and colleagues (2011), the WCT is perhaps the most sensitive measure in the ACS package, and any future study seeking to evaluate the classification accuracy of the ACS effort measures would be improved by its inclusion. The current study, based on pre-existing clinical data, did not include this measure because it shares a number of commonalities with the criterion measure employed (i.e., the WMT). Future studies pairing the WCT with other verbally based standalone SVTs (such as the WMT or MSVT) could also assess its classification properties. Additionally, the ACS manual provides several clinical and non-clinical groups from which to derive base rate percentages. In the current study, we chose to employ the TBI group; however, it is of note that this comparison group was not identical to the study sample in terms of TBI severity or demographic composition. Finally, the current sample was composed predominantly of Caucasian male military service members, and as such the external validity of the findings may be limited. Despite these limitations, the current data replicate previous studies demonstrating the limited sensitivity of embedded effort measures relative to standalone instruments designed to gauge respondent effort on neuropsychological testing.

Conflict of Interest

None declared.

Acknowledgements

The views, opinions, and/or findings contained in this article are those of the authors and should not be construed as an official Department of the Army position, policy, or decision unless so designated by other official documentation.

References

American Congress of Rehabilitation Medicine. (1993). Definition of mild traumatic brain injury. Journal of Head Trauma Rehabilitation, 8, 86–87.

Armistead-Jehle, P., & Buican, B. (2012). Evaluation context and symptom validity test performance in a U.S. military sample. Archives of Clinical Neuropsychology, 27, 828–839.

Armistead-Jehle, P., & Hansen, C. L. (2011). Comparison of the Repeatable Battery for the Assessment of Neuropsychological Status Effort Index and Stand-Alone Symptom Validity Tests in a military sample. Archives of Clinical Neuropsychology, 26, 592–601.

Constantinou, M., Bauer, L., Ashendorf, L., Fisher, J. M., & McCaffrey, R. J. (2005). Is poor performance on recognition memory effort measures indicative of generalized poor performance on neuropsychological tests? Archives of Clinical Neuropsychology, 20, 191–198.

Faust, D., Hart, K., Guilmette, T., & Arkes, H. R. (1988). Neuropsychologists' capacity to detect adolescent malingering. Professional Psychology: Research and Practice, 19, 508–515.

Fox, D. D. (2011). Symptom validity test failure indicates invalidity of neuropsychological tests. The Clinical Neuropsychologist, 25, 488–495.

Green, P. (2003). Green's Word Memory Test for Windows: User's manual. Edmonton, Canada: Green's Publishing.

Green, P. (2004). Green's Medical Symptom Validity Test (MSVT) for Microsoft Windows: User's manual. Edmonton, Canada: Green's Publishing.

Green, P. (2007). The pervasive influence of effort on neuropsychological tests. Physical Medicine and Rehabilitation Clinics of North America, 18, 43–68.

Green, P. (2008). Manual for the Nonverbal Medical Symptom Validity Test. Edmonton, Canada: Green's Publishing.

Green, P., Lees-Haley, P. R., & Allen, L. M., III. (2002). The Word Memory Test and the validity of neuropsychological test scores. Journal of Forensic Neuropsychology, 2, 97–124.

Green, P., Rohling, M. L., Lees-Haley, P. R., & Allen, L. M., III. (2001). Effort has a greater effect on test scores than severe brain injury in compensation claimants. Brain Injury, 15, 1045–1060.

Greiffenstein, M. F., Baker, W. J., & Gola, T. (1994). Validation of malingered amnesia measures with a large clinical sample. Psychological Assessment, 6, 218–224.

Hartman, D. E. (2002). The unexamined lie is a lie worth fibbing: Neuropsychological malingering and the Word Memory Test. Archives of Clinical Neuropsychology, 17, 709–714.

Heaton, R. K., Smith, H. H., Lehman, A. W., & Vogt, A. T. (1978). Prospects for faking believable deficits on neuropsychological testing. Journal of Consulting and Clinical Psychology, 46, 892–900.

Meyers, J. E., Volbrecht, M., Axelrod, B. N., & Reinsch-Boothby, L. (2011). Embedded symptom validity tests and overall neuropsychological test performance. Archives of Clinical Neuropsychology, 26, 8–15.

Miele, A. S., Gunner, J. H., Lynch, J. K., & McCaffrey, R. J. (2012). Are embedded validity indices equivalent to free-standing symptom validity tests? Archives of Clinical Neuropsychology, 27, 10–22.

Miller, J. B., Millis, S. R., Rapport, L. J., Bashem, J. R., Hanks, R. A., & Axelrod, B. N. (2011). Detection of insufficient effort using the Advanced Clinical Solutions for the Wechsler Memory Scale, Fourth Edition. The Clinical Neuropsychologist, 25, 160–172.

Pearson. (2009a). Advanced Clinical Solutions for the WAIS-IV and WMS-IV: Administration and scoring manual. San Antonio, TX: Author.

Pearson. (2009b). Advanced Clinical Solutions for the WAIS-IV and WMS-IV: Clinical and interpretative manual. San Antonio, TX: Author.

Randolph, C. (1998). Repeatable Battery for the Assessment of Neuropsychological Status manual. San Antonio, TX: The Psychological Corporation.

Silverberg, N. D., Wertheimer, J. C., & Fichtenberg, N. L. (2007). An effort index for the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS). The Clinical Neuropsychologist, 21, 841–854.

Singhal, A., Green, P., Ashaye, K., Shankar, K., & Gill, D. (2009). High specificity of the Medical Symptom Validity Test in patients with very severe memory impairment. Archives of Clinical Neuropsychology, 24, 721–728.

Stevens, A., Friedel, E., Mehren, G., & Merten, T. (2008). Malingering and uncooperativeness in psychiatric and psychological assessment: Prevalence and effects in a German sample of claimants. Psychiatry Research, 157, 191–200.

Tombaugh, T. N. (1996). The Test of Memory Malingering. Toronto: Multi-Health Systems.

Wechsler, D. (2008). Wechsler Adult Intelligence Scale–Fourth Edition. San Antonio, TX: Pearson.

Wechsler, D. (2009). Wechsler Memory Scale–Fourth Edition. San Antonio, TX: Pearson.

Wynkoop, T. F., & Denney, R. L. (2005). Test review: Green's Word Memory Test (WMT) for Windows. Journal of Forensic Neuropsychology, 4, 101–105.