Abstract

The ability of the Response Bias Scale (RBS) and the Henry–Heilbronner Index (HHI), along with several other MMPI-2 validity scales, to predict performance on two separate stand-alone symptom validity tests, the Test of Memory Malingering (TOMM) and the Medical Symptom Validity Test (MSVT), was examined. Findings from this retrospective data analysis of outpatients seen within a Veterans Affairs medical center (N = 194) showed that group differences between those passing and failing the TOMM were largest for the RBS (d = 0.79), HHI (d = 0.75), and Infrequency (F; d = 0.72). Group differences between those passing and failing the MSVT were largest for the HHI (d = 0.83), RBS (d = 0.80), and F (d = 0.78). Regression analyses showed that the RBS accounted for the most variance in TOMM scores (20%), whereas the HHI accounted for the most variance in MSVT scores (26%). Nonetheless, due to unacceptably low positive and negative predictive values, caution is warranted in using either one of these indices in isolation to predict performance invalidity.

Introduction

In recent years, several new validity scales have been derived from the Minnesota Multiphasic Personality Inventory 2 (MMPI-2; Butcher et al., 2001). These validity scales were developed with the intention of identifying symptom exaggeration in test-takers. At least two scales have been specifically developed by selecting MMPI-2 items that distinguished between individuals passing versus failing stand-alone cognitive symptom validity tests. These scales are the Henry–Heilbronner Index (HHI; Henry, Heilbronner, Mittenberg, & Enders, 2006) and the Response Bias Scale (RBS; Gervais, Ben-Porath, Wygant, & Green, 2007).

When examined independently of the HHI, the RBS has shown impressive predictive validity in identifying individuals suspected of, or feigning, negative response bias (e.g., Gervais, Ben-Porath, Wygant, & Sellboom, 2010; Gervais et al., 2007; Jones, Ingram, & Ben-Porath, 2012; Lange, Sullivan, & Scott, 2010; Sullivan & Elliott, 2012; Wygant et al., 2010). Though less often studied, the HHI has also shown promise when examined independently of the RBS (e.g., Henry et al., 2006; Henry, Heilbronner, Mittenberg, Enders, & Stanczak, 2008). Both scales have also performed well when examined concurrently. For example, among patients referred for the evaluation of head injury in the context of litigation, both scales demonstrated utility in discriminating the probable negative response bias group (n = 37) from the presumed valid group (n = 42), with an area under the curve of 0.82 for the RBS and 0.73 for the HHI (Dionysus, Denney, & Halfaker, 2011). However, as in the latter study, the method used to define the negative response bias group in studies concurrently examining the HHI and the RBS has included the ill-advised practice of allowing failure on a single embedded symptom validity test to satisfy at least part of the entry criteria into the “invalid performance group.” This practice is problematic because failure on any one embedded symptom validity test has been shown to be fairly common in clinical samples, with 41% of a credible group (i.e., defined by performance on stand-alone symptom validity tests) failing at least one embedded measure in a well-designed study (Victor, Boone, Serpa, Beuhler, & Zielger, 2009). A separate methodological difficulty is seen in at least two other studies that simultaneously examined the HHI and the RBS. Both showed support for the HHI and the RBS, but neither employed any symptom validity test, embedded or stand-alone, to systematically define group membership (Tsushima, Geling, & Fabrigas, 2011; Young & Gross, 2011).

To date, only three studies have directly compared the RBS and the HHI in their ability to predict failure on stand-alone, rather than embedded, cognitive symptom validity tests. As shown in Table 1, results of these studies varied. In a study particularly relevant to the current investigation, the RBS and the traditional Infrequency (F) scale were the best predictors of symptom validity test failure among a variety of MMPI-2 scales studied. Specifically, the largest effect sizes for the score difference between outpatient military veterans passing versus failing the Word Memory Test (WMT; Green, 2003; Green, Allen, & Astner, 1996; Green & Astner, 1995), a stand-alone symptom validity test, occurred on the RBS (d = 0.40) and the traditional F scale (d = 0.40) (Young, Kearns, & Roper, 2011). Although both the RBS and the F scale were significant predictors of WMT failure, regression analyses indicated that neither showed incremental validity over the other.

Table 1.

Studies comparing the ability of RBS and HHI to predict failure on stand-alone symptom validity tests

| N^a | Scale | Cohen's d | Sensitivity | Specificity (%) | PPV^b | NPV^b | Description of adequate effort group | Description of questionable effort group | Age (mean, years) | Authors |
|---|---|---|---|---|---|---|---|---|---|---|
| 89/85 | RBS; HHI; F; Fb; Fp; FBS | 0.40; 0.29; 0.40; 0.29; 0.38; 0.31 | ≥17 = 0.32; ≥19 = 0.24 | ≥17 = 0.81; ≥19 = 0.91 | ≥17 = 0.61; ≥19 = 0.71 | ≥17 = 0.55; ≥19 = 0.55 | Outpatient military veterans routinely referred for neuropsychological evaluation who passed the WMT | Outpatient military veterans routinely referred for neuropsychological evaluation who failed the WMT | 48 | Young and colleagues (2011) |
| 171/117 | RBS; HHI; F; Fb; Fp; FBS; FBS-r; Fs | 1.05; 1.16; 0.76; 0.61; 0.25; 0.98; 0.99; 0.78 | n/a | n/a | n/a | n/a | Active duty military members who completed and passed both the TOMM (Tombaugh, 1996) and the VSVT (Slick et al., 1996) or one if only one was given, as part of an evaluation of head, blast, or heat injuries or brain disease | Active duty military members who completed and failed either the TOMM (Tombaugh, 1996) or the VSVT (Slick et al., 1996) as part of an evaluation of head, blast, or heat injuries or brain disease | 31 | Jones and Ingram (2011) |
| 24/22 | RBS; HHI; F; Fb; Fp; FBS; Fptsd | 0.98; 0.90; 0.51; 0.65; 0.59; 0.23; 0.52 | ≥17 = 0.50; ≥19 = 0.23 | ≥17 = 0.92; ≥19 = 0.96 | ≥17 = 0.81; ≥19 = 0.80 | ≥17 = 0.72; ≥19 = 0.64 | Outpatient military veterans suspected of potential malingering who passed the TOMM (Tombaugh, 1996) | Outpatient military veterans suspected of potential malingering who failed the TOMM (Tombaugh, 1996) | 48 | Whitney and colleagues (2008) |

Notes: PPV = Positive Predictive Value; NPV = Negative Predictive Value; n/a = not provided; TOMM = Test of Memory Malingering; RBS = Response Bias Scale; Fp = Infrequency Psychopathology; HHI = Henry–Heilbronner Index; VSVT = Victoria Symptom Validity Test; WMT = Word Memory Test; F = Infrequency; Fb = Infrequency-Back; FBS = Symptom Validity Scale.

^a Reported Ns are for adequate/questionable effort participants.

^b Values calculated using various base rates; not specified in the Young and colleagues (2011) study.

In contrast, a separate study of active duty military members found that the HHI (d = 1.16) and, to a lesser extent, the RBS (d = 1.05) showed the largest effect sizes for differences between the groups displaying adequate versus inadequate effort on symptom validity tests, including the Victoria Symptom Validity Test (VSVT; Slick, Hopp, Strauss, & Spellacy, 1996) and the Test of Memory Malingering (TOMM; Tombaugh, 1996) (Jones & Ingram, 2011). Using non-traditional statistical analyses, the authors of the latter study reported that the HHI, RBS, Symptom Validity Scale (FBS), and Symptom Validity Scale from the MMPI-2 Restructured Form (MMPI-2 RF; Ben-Porath & Tellegen, 2008) outperformed the F family of scales on a variety of indices of classification accuracy.

In a third study, it was shown that the RBS and, to a certain extent, the HHI were superior to other MMPI-2 validity scales (i.e., F, Infrequency-Back [Fb], Infrequency-Psychopathology [Fp], FBS, and the Infrequency Posttraumatic Stress Disorder Scale [Fptsd; Elhai, Ruggiero, Frueh, Beckham, Gold, & Feldman, 2002]) in predicting failure on the TOMM within an outpatient Veterans Affairs medical center population (Whitney, Davis, Shepard, & Herman, 2008). In the latter study, the RBS (d = 0.98) and the HHI (d = 0.90) demonstrated the largest effect sizes for differences between groups passing versus failing the TOMM. The RBS was most consistently correlated with performances on the various trials of the TOMM. Regression analyses showed the RBS incrementally contributed to each MMPI-2 validity scale in predicting TOMM performance, with the exception of the HHI, where there was only a trend for a significant contribution. In no case did the addition of an MMPI-2 validity scale incrementally contribute to the RBS in predicting TOMM performance.

Similar to the current investigation, all three of the above-described studies examined either active or veteran U.S. military members. The goal of these studies was to examine the ability of various MMPI-2 scales to predict “performance invalidity” on neuropsychological testing using at least one stand-alone symptom validity test. However, because performance invalidity was operationally defined as performance on stand-alone symptom validity tests that differed from study to study, it is difficult to discern whether the differences in study findings were due to differences among the samples, which were quite similar, or due to differences in the symptom validity test employed. To more closely examine the latter issue, in the present investigation, two separate stand-alone cognitive symptom validity tests were administered to the same sample and their relationships with MMPI-2 validity scales were examined separately. Thus, the primary goal of the present research was to examine the ability of various MMPI-2 validity scales, including the HHI, RBS, F, Fb, Fp, and FBS, to predict failure on the Medical Symptom Validity Test (MSVT; Green, 2004) and the TOMM, respectively. To my knowledge, the current study represents the only investigation that has concurrently examined the HHI and the RBS using two separate stand-alone symptom validity tests. The study design allows a comparison of the unique relationships between the HHI and the RBS and each of the two stand-alone symptom validity tests. This design will allow clinicians and researchers to better understand the differential predictive ability of the HHI and the RBS as they relate to two commonly employed free-standing symptom validity tests that may be employed alone or in combination in clinical and research practice. Based largely on the findings from the Young and colleagues (2011) study examining the ability of MMPI-2 validity scales to predict performance on the WMT, which is essentially a longer version of the MSVT, it was hypothesized that the RBS would be the scale that demonstrated the largest difference between groups of individuals who passed the MSVT versus those who failed it. Based on the Whitney and colleagues (2008) study employing the TOMM, the same was predicted to be true for that measure. In addition, exploratory hierarchical logistic regression analyses were planned to determine which MMPI-2 validity scales added incremental validity to one another in predicting pass/fail on the MSVT and the TOMM, respectively. MMPI-2 scales to be selected for entry in the regression analyses were those that showed the largest effect sizes between groups passing and failing the MSVT and the TOMM, respectively.

Prior to proceeding to a more detailed discussion of the methods used in the current study, a brief discussion of the development and validation of each of the aforementioned validity scales is presented.

Response Bias Scale

The RBS (Gervais et al., 2007) is one of only two known scales to have been developed on the basis of independent performances on cognitive effort measures. Specifically, the RBS was derived by empirically selecting 28 items from the MMPI-2 that discriminated between non-head-injury disability claimants (N = 1,212) passing or failing three well-validated symptom validity measures, including the WMT (Green, 2003; Green & Astner, 1995; Green et al., 1996), the Computerized Assessment of Response Bias (CARB; Allen, Conder, Green, & Cox, 1997), and/or the TOMM. More than one study has demonstrated the superiority of the RBS over other MMPI-2 validity scales, including the FBS (Lees-Haley, English, & Glenn, 1991), in discriminating symptom validity test performance in patients claiming disability or with significant secondary gain issues (Gervais et al., 2007; Nelson, Sweet, & Heilbronner, 2007; Wygant et al., 2010).

Henry–Heilbronner Index

The HHI (Henry et al., 2006) is the second of the only two known scales to have been developed on the basis of independent performances on cognitive effort measures. It shares three items with the RBS. The HHI was derived by empirically selecting 15 items from the 43-item FBS and the 17-item Shaw and Mathews' Pseudoneurologic Scale (PNS; Shaw & Mathews, 1965) that were most highly correlated with categorical membership in a non-malingering versus a probable malingering group. The probable malingering group consisted of individuals failing at least one performance validity measure, including the TOMM, WMT, CARB, or VSVT and meeting Slick, Sherman, and Iverson's (1999) criteria for either definite or probable malingered neurocognitive dysfunction. In their initial validation study, the scale developers showed that the HHI was superior to FBS and PNS in the identification of symptom exaggeration in personal injury litigants and disability claimants (N = 45) compared with non-litigating head-injured controls (N = 74).

Infrequency, Fb, and Fp

The F scale is one of the four original validity scales developed by the authors of the MMPI (Hathaway & McKinley, 1951; for review, see Graham, 1990). The 60 items of the F scale were selected by choosing those endorsed in a particular direction by fewer than 10% of normal controls. The F scale was designed to identify individuals who approach the MMPI in a way that is different from that intended by the test authors.

Unlike the F scale items, all of which appear before item 362, the Fb items appear later in the booklet, all after item 280, and were included to detect abnormal responding in the second half of the test. The Fb scale was originally developed for the experimental booklet used in the normative data collection for the MMPI-2 (Butcher, Dahlstrom, Graham, Tellegen, & Kaemmer, 1989). Like the F scale, the 40-item Fb scale is composed of items that were endorsed in a particular direction by fewer than 10% of normal controls.

The Fp scale was “designed to detect infrequent responding in settings characterized by relatively high base rates of psychopathology and psychological distress” (Arbisi & Ben-Porath, 1995, p. 424). In addition to being infrequently endorsed in the MMPI-2 normative sample, the 27 items included in the Fp scale were selected because they were also infrequently endorsed by psychiatric inpatients. Notably, over half of the psychiatric inpatient sample was drawn from individuals hospitalized at a Veterans Affairs medical center.

The FBS (Lees-Haley et al., 1991) was originally developed to identify invalid symptom presentation in the context of litigation. However, the scale is now included in the standard scoring materials for the MMPI-2 and it enjoys much wider use among clinicians. Items were selected for the FBS by empirical and post hoc rational analyses. Forty-three items were chosen that reflected the exaggeration of post-injury emotional distress and the minimization of pre-injury personality problems (Greiffenstein, Fox, & Lees-Haley, 2007). A recent meta-analysis (N = 3,663) found a large grand effect size for the FBS (0.96) and reported that the FBS performed as well as, if not superior to, other validity scales (including F, Fb, and Fp, discussed above) in distinguishing between groups of individuals who were likely over-reporting symptoms versus those who were not (Nelson, Sweet, & Demakis, 2006).

Method

Participants

Data were collected from the files of 194 outpatients who were consecutively referred to the author for neuropsychological testing within a VA Medical Center. The patients in the current study did not overlap with those examined in the previous study by Whitney and colleagues (2008). Referral sources included the Psychiatry Ambulatory Care Clinic, OIF/OEF Clinic, Primary Medical Clinics and the Neurology clinic. No patients were diagnosed with mental retardation. All patients were either active duty or veteran soldiers primarily referred for neuropsychological evaluation to assess the potential presence of cognitive dysfunction, not primarily to assess for the presence of psychiatric disorder. Consecutive referrals were reviewed for cases that were administered the MSVT, TOMM, and MMPI-2 as part of the clinical neuropsychological evaluation. Generally speaking, all patients referred for testing are given these instruments. Exceptions include those patients to whom the MMPI-2 is not given due to advanced age (i.e., ≥80 years old), low reading level, or obvious and severe cognitive impairment consistent with moderate to severe dementia.

At the time of this study, participants' medical records were retrospectively reviewed. Excluded from the sample were 13 patients who produced invalid MMPI-2 protocols due to elevated scores on True Response Inconsistency or Variable Response Inconsistency (i.e., scores ≥80 on either scale). No participants exceeded the cut-off on the Cannot Say scale of the MMPI-2. Referral sources included the Psychiatry Ambulatory Care Clinic (37%), Primary Care (35%), Neurology (13%), Polytrauma Clinic (6%), Compensation and Pension Clinic (5%), and Other (4%). Thus, only 5% of the patients were referred from the Compensation and Pension Clinic and were being evaluated in association with an active claim for injury through the Veterans Benefits Administration. Patients from other referral sources were referred solely for clinical reasons. However, because “many neuropsychological evaluations conducted within the general clinical framework of Veterans Affairs healthcare may be impacted by patient concerns regarding the attainment and/or maintenance of disability,” it is difficult to estimate how many patients had direct incentive to underperform (Young et al., 2011, p. 195). Reasons for referral were quite diverse. Approximately 31% (n = 60/194) of the sample was referred due to potential symptoms of remote traumatic brain injury (TBI), with most (n = 45/60) of those individuals reporting symptoms consistent with potential mild head injury and a minority (n = 15/60) reporting symptoms consistent with potential moderate to severe TBI (Malec, Brown, Leibson, Flaada, & Mandrekar, 2007). Another 21% (n = 40/194) of the sample was referred with a history of major neurological problem(s), including stroke (n = 12/40), epilepsy (n = 3/40), Parkinson's disease (n = 3/40), and dementia requiring capacity evaluation (n = 3/40).

Nearly half of the total sample (48%) was referred due to memory or concentration problems of unknown etiology. Nearly half of the latter individuals carried only a primary psychiatric diagnosis (n = 46/94), most commonly anxiety/depression (n = 37/46), possibly co-occurring with other disorders. Only one of these patients was referred for psychosis, and a minority of patients (n = 9/46) were referred with posttraumatic stress disorder (PTSD) as part of the referral question. Among the 48% of the sample referred due to memory or concentration problems of unknown etiology, 7% (n = 7/94) had no psychiatric or neurological problems, while 32% (n = 30/94) had co-morbid psychiatric and minor medical diagnoses that could potentially affect cognitive functioning, most commonly depression/anxiety co-occurring with a medical disorder, such as hypertension, sleep apnea, transient ischemic attack, or coronary artery disease. Twelve percent of those referred due to memory or concentration problems of unknown etiology (n = 11/94) carried only a minor medical diagnosis potentially causing cognitive difficulties, the same as or similar to those previously mentioned.

In terms of patient demographics, age of participants ranged from 21 to 77 years old, with a mean age of 50.67 years (SD = 13.21). Highest year of education completed by participants ranged from 7 to 20, with a mean level of 12.82 years (SD = 2.58). In terms of gender, 181 (93.3%) participants were men and 13 (6.7%) were women. One hundred sixty-six of 194 participants (85.6%) were Caucasian, 26 participants (13.4%) were African American, and 1 participant (0.5%) was Hispanic.

Measures and Procedures

The MMPI-2 (Butcher et al., 2001) and the free-standing symptom validity tests were administered as part of the larger clinical neuropsychological evaluation. The MMPI-2 is commonly cited as the most widely used personality test in the world (Graham, 2005). It is a self-report questionnaire consisting of 567 true/false personality-type statements. Validity scales used in the current analysis included those that are a standard part of the scoring software for the MMPI-2 within Veterans Affairs medical centers (F, Fb, and Fp) and those that are more recently developed and require hand scoring (FBS, HHI, and RBS).

The TOMM (Tombaugh, 1996) is a recognition memory task in which 50 line drawings of common objects are presented during two learning trials that are each followed by forced-choice recognition trials. An optional forced-choice recognition trial can be administered following a 15-min delay. In the present study, all three trials were always administered, and patients were considered to have failed the TOMM if they performed below cut-offs specified in the manual on Trial 2 or the Retention Trial. Although the TOMM is historically well-validated in a number of populations, including persons with fibromyalgia and chronic pain and/or depression, neurological patients, community dwelling geriatric individuals, college student normal controls and simulators, persons simulating TBI, actual brain injury litigants, and persons with depression (Ashendorf, Constantinou, & McCaffrey, 2004; Iverson, Le Page, Koehler, Shojania, & Badii, 2007; Rees, Tombaugh, Gansler, & Moczynski, 1998; Tombaugh, 1997), research findings suggest that its rate of false positive diagnoses of “performance invalidity” is unacceptably high in persons with dementia, falling at ∼75% in a study conducted by Teichner and Wagner (2004). Research also suggests that the sensitivity of the TOMM is low, with more than twice as many disability claimants demonstrating implausible profiles on a separate stand-alone symptom validity test (i.e., Nonverbal MSVT; Green, 2008) as failed the TOMM (Armistead-Jehle & Gervais, 2011; Green, 2011). Thus, owing to the aforementioned concerns regarding sensitivity and specificity of the TOMM, and because there is no gold standard for symptom validity testing, with researchers expressing concern about the lack of concordance among existing measures (Axelrod & Schutte, 2011), the decision was made to also compare scores on the MMPI-2 validity scales to a separate symptom validity test, the MSVT (Green, 2004).

The MSVT (Green, 2004) is a computerized and largely automated screen for verbal memory impairment with built-in effort testing. The test measures the patient's ability to learn a series of semantically related word pairs, which are presented twice at the beginning of the test session. Following the presentation of word pairs, four trials are administered, resulting in five test scores: immediate recognition (IR), delayed recognition (DR), consistency (CNS), paired-associates (PA), and free recall (FR). Participants' scores on the MSVT were analyzed according to criteria outlined in the Advanced Interpretation (AI) Program (Green, 2009). The AI Program uses profile analysis based upon a variety of normative databases to categorize test-takers into three basic groups: those who pass the MSVT, those who fail the MSVT due to poor effort, and those who fail the MSVT with a Genuine Memory Impairment Profile (GMIP). The main criteria that qualify an individual for the GMIP are that they (1) fail the MSVT by scoring below cut-offs on IR, DR, or CNS, but (2) score an average of 20 points higher on the easy subtests than the hard subtests, (3) exhibit no scores below chance, and (4) evidence clinical correlates of disability.

Twenty-seven patients were identified as having met all criteria for the GMIP except for criterion number 4, “evidence of clinical correlates of disability.” Thus, their data were retained in the fail MSVT sample. Incidentally, all of these individuals also failed the TOMM. Individuals satisfying all four criteria for the GMIP were identified as having a GMIP and were excluded from the MSVT analyses (n = 5) because, due to their advanced cognitive impairment, they were believed to represent a distinct group from those who passed the MSVT. Incidentally, all five of these excluded patients passed the TOMM.

All patients whose data were excluded due to having a GMIP evidenced clinical correlates of disability. Many only narrowly failed the MSVT. The youngest of these individuals was a 59-year-old gentleman, whose age fell at the 73rd percentile for the study sample as a whole. This individual had an abnormal MRI due to more than one cerebrovascular accident. He was living with his sister, who addressed most of his needs (MSVT: IR, 95; DR, 85; CNS, 80; PA, 80; FR, 50). Another patient whose data were excluded was a 67-year-old gentleman with no neuroimaging on file who was diagnosed with multiple medical problems, not the least of which included PTSD, depression, and chronic pain associated with several documented physical injuries. He was also non-compliant with treatment for Type II diabetes at the time of testing. He was 90% service connected, indicating that he received 90% of the maximum monthly benefit from the Veterans Benefits Administration for his injuries (MSVT: IR, 95; DR, 90; CNS, 85; PA, 80; FR, 40). The third excluded veteran was a 77-year-old gentleman with Parkinson's disease and an abnormal CT scan showing mild cerebral volume loss. His wife addressed most of his needs for him (MSVT: IR, 90; DR, 60; CNS, 60; PA, 50; FR, 25). The fourth gentleman whose data were excluded was a 66-year-old individual with an abnormal CT scan due to chronic small vessel disease. His wife also addressed most of his needs for him (MSVT: IR, 85; DR, 85; CNS, 100; PA, 80; FR, 35). The fifth and final patient who was excluded due to a GMIP was a 67-year-old gentleman diagnosed with PTSD and depression who was recently treated for prostate cancer. He was taking at least one medication associated with negative cognitive effects, including hydrocodone/APAP (MSVT: IR, 75; DR, 80; CNS, 65; PA, 50; FR, 20).
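As an illustration only, GMIP criterion 2 (the easy-hard difference) can be checked against the five score profiles reported above. The sketch below assumes that IR, DR, and CNS are the "easy" subtests and that PA and FR are the "hard" subtests, and it reads "an average of 20 points higher" as a difference of at least 20 points; neither detail is spelled out in the text, and the AI Program itself may compute the difference differently. The function name is hypothetical.

```python
from statistics import mean

# MSVT percent-correct scores of the five excluded patients, as reported above.
profiles = [
    {"IR": 95, "DR": 85, "CNS": 80, "PA": 80, "FR": 50},
    {"IR": 95, "DR": 90, "CNS": 85, "PA": 80, "FR": 40},
    {"IR": 90, "DR": 60, "CNS": 60, "PA": 50, "FR": 25},
    {"IR": 85, "DR": 85, "CNS": 100, "PA": 80, "FR": 35},
    {"IR": 75, "DR": 80, "CNS": 65, "PA": 50, "FR": 20},
]

# Assumption: which subtests count as "easy" versus "hard".
EASY = ("IR", "DR", "CNS")
HARD = ("PA", "FR")


def easy_hard_difference(profile):
    """Mean of the easy subtests minus mean of the hard subtests."""
    easy_mean = mean(profile[name] for name in EASY)
    hard_mean = mean(profile[name] for name in HARD)
    return easy_mean - hard_mean


differences = [round(easy_hard_difference(p), 1) for p in profiles]
# Assumption: criterion 2 is met when the difference is at least 20 points.
meets_criterion_2 = [d >= 20 for d in differences]

print(differences)
print(meets_criterion_2)
```

Under these assumptions, all five reported profiles show an easy-hard difference above 20 points, with the first patient's profile the closest to the threshold.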

Research has shown that the MSVT is an extremely easy memory test that is passed even by individuals with significant cognitive difficulties. For example, a recent study showed that the MSVT was passed by at least 95% of children with moderate-to-severe brain injury/dysfunction (e.g., TBI, stroke) and/or developmental disabilities (n = 38; Carone, 2008). Research has also demonstrated that the MSVT has low false positive rates in cases involving memory impairment (Howe, Anderson, Kaufman, Sachs, & Loring, 2007; Singhal, Green, Ashaye, Shankar, & Gill, 2009). With respect to agreement with more well-established measures, Green, Montijo, and Brockhaus (2011) recently reported that the WMT and the MSVT showed 100% agreement in classifying patients as showing a GMIP.

Statistical Analyses

Except where indicated, statistical analyses were calculated using SPSS, Version 10.1 (SPSS, Chicago, IL, USA). Initial analyses consisted of conducting independent t-tests to compare group differences in MMPI-2 validity scale scores (RBS, F, Fb, Fp, FBS, and HHI) among persons either failing or passing the TOMM and the MSVT, respectively. Alpha was set at 0.05 for all other analyses. Cohen's d effect sizes along with 95% confidence intervals (CIs) for these group differences were calculated using a computerized program provided by Devilly (2004). The aforementioned analyses all focused on a dichotomous distinction between groups (i.e., MSVT pass or fail). Given the additional variability inherent in continuous TOMM and MSVT scores, an alternative view of the relationship between the TOMM and MSVT scores, respectively, and the MMPI-2 validity scales was created by computing the two-tailed Pearson correlations between all trials on the TOMM and the MSVT, respectively, and the various MMPI-2 validity scales.
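For readers unfamiliar with the effect size statistic, the following is a minimal sketch of Cohen's d with a pooled standard deviation and an approximate 95% CI. It is not the Devilly (2004) program used in the study, whose exact formulas are not described here; the large-sample standard error below is a common approximation and is an assumption. The group sizes (45 failing and 149 passing the TOMM) come from the present sample, but the means and standard deviations are hypothetical values chosen for illustration.

```python
import math


def cohens_d(mean_fail, sd_fail, n_fail, mean_pass, sd_pass, n_pass):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = (
        (n_fail - 1) * sd_fail ** 2 + (n_pass - 1) * sd_pass ** 2
    ) / (n_fail + n_pass - 2)
    return (mean_fail - mean_pass) / math.sqrt(pooled_var)


def d_confidence_interval(d, n1, n2, z=1.96):
    """Approximate CI for d (assumed large-sample standard error)."""
    se = math.sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))
    return d - z * se, d + z * se


# Hypothetical scale scores: fail group M = 13, SD = 5; pass group M = 10, SD = 5.
d = cohens_d(13.0, 5.0, 45, 10.0, 5.0, 149)
ci_low, ci_high = d_confidence_interval(d, 45, 149)

print(round(d, 2), round(ci_low, 2), round(ci_high, 2))
```

With these made-up inputs, d is 0.60 and the interval runs from roughly 0.26 to 0.94, which shows how wide the CI remains when one group has only 45 members.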

Following the methodology of Arbisi and Ben-Porath (1995), Gervais and colleagues (2007), and Young and colleagues (2011), hierarchical regression analyses were conducted to evaluate the incremental validity of the RBS compared with other MMPI-2 validity scales in discriminating persons who failed the TOMM from those who passed it. The same analyses were conducted to evaluate the incremental validity of the RBS and the HHI compared with other validity scales in discriminating persons who failed the MSVT from those who passed it. Whereas Arbisi and Ben-Porath (1995) and Gervais and colleagues (2007) used linear regression analyses, like Young and colleagues (2011) and Jones and Ingram (2011), the present study employed logistic regression due to the suitability of logistic regression for dichotomous dependent variables.

For the TOMM, in the first set of regression analyses, RBS was entered into the first step and then HHI, F, or Fb, respectively, was entered into the second step. In a second set of analyses, HHI was entered in the first step and RBS, F, or Fb, respectively, was entered into the second step. In a third set of analyses, F was entered into the first step and then RBS, HHI, or Fb, respectively, was entered into the second step. In these analyses, the usefulness of each MMPI-2 scale in incrementally contributing to the RBS, HHI, or F in predicting TOMM performance was evaluated by looking at the significance of the χ2-change statistic.

For the MSVT, in the first set of regression analyses, HHI was entered in the first step and then F, RBS, or FBS was entered in the second step. In a second set of analyses, RBS was entered in the first step and HHI, F, or FBS was entered in the second step. In the third set of analyses, F was entered in the first step and HHI, RBS, or FBS was entered in the second step. In these analyses, the usefulness of each MMPI-2 scale in incrementally contributing to HHI, RBS, or F in predicting MSVT performance was evaluated by looking at the significance of the χ2-change statistic.
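The χ2-change statistic used in these two-step analyses can be sketched as follows. This is an illustration rather than the SPSS procedure itself: it assumes the usual likelihood-ratio definition, in which the change statistic is twice the gain in log-likelihood from step 1 to step 2, evaluated against a chi-square distribution with one degree of freedom because one scale is added per step. The log-likelihood values are hypothetical, and the function names are not from any statistics package.

```python
import math


def chi_square_change(loglik_step1, loglik_step2):
    """Likelihood-ratio statistic for adding predictors at step 2."""
    return 2.0 * (loglik_step2 - loglik_step1)


def p_value_df1(chi_sq):
    """Upper-tail p-value for chi-square with 1 df.

    A chi-square variable with 1 df is a squared standard normal,
    so P(X > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(chi_sq / 2.0))


# Hypothetical log-likelihoods: step 1 (one scale), step 2 (second scale added).
chi_sq = chi_square_change(-95.0, -92.0)
p = p_value_df1(chi_sq)

print(round(chi_sq, 2), round(p, 4))
```

In this made-up case the added scale would count as a significant incremental contribution at the 0.05 level; a χ2-change below about 3.84 would not.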

Receiver operating characteristic (ROC) curve analyses were used to evaluate the usefulness of varying RBS and HHI cut-offs in predicting MSVT and TOMM failure, respectively. As part of the ROC analysis, the sensitivity and the specificity of the RBS and the HHI at various cut-offs were examined. Following the ROC analysis, positive and negative predictive values (PPV and NPV) were calculated. As explained by O'Bryant and Lucas (2006), the PPV refers to the likelihood that a person has condition X (i.e., performance invalidity as evidenced by TOMM failure) given positive findings on test Y (i.e., meets or exceeds RBS cut-off score) (Glaros & Kline, 1988; McCaffrey, Palav, O'Bryant, & Labarge, 2003). The NPV is defined as the likelihood that the person does not have condition X (i.e., is not demonstrating performance invalidity evidenced by passing the TOMM) given a negative finding on test Y (i.e., scores below RBS cut-off) (Glaros & Kline, 1988; McCaffrey et al., 2003). Both the PPV and the NPV were calculated using the formulas presented in O'Bryant and Lucas (2006). An estimated base rate of the condition in question (in this case, performance invalidity as evidenced by MSVT or TOMM failure, respectively) is needed to calculate the PPV and the NPV. For the present study, a base rate of 28% was employed as it represents the average of the percentage of groups actually failing the MSVT (32%) and the TOMM (23%) in the present study.
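The predictive-value calculation can be illustrated with a short sketch. It assumes the standard Bayesian formulas for PPV and NPV given sensitivity, specificity, and base rate; the function names are hypothetical, and the example inputs are the RBS ≥18 values for the TOMM (sensitivity = 0.18, specificity = 0.93) together with the 28% base rate used in this study.

```python
def ppv(sensitivity: float, specificity: float, base_rate: float) -> float:
    """Positive predictive value: P(condition present | positive test)."""
    true_pos = sensitivity * base_rate
    false_pos = (1.0 - specificity) * (1.0 - base_rate)
    return true_pos / (true_pos + false_pos)


def npv(sensitivity: float, specificity: float, base_rate: float) -> float:
    """Negative predictive value: P(condition absent | negative test)."""
    true_neg = specificity * (1.0 - base_rate)
    false_neg = (1.0 - sensitivity) * base_rate
    return true_neg / (true_neg + false_neg)


# RBS cut-off of >= 18 predicting TOMM failure, 28% base rate
print(f"PPV = {ppv(0.18, 0.93, 0.28):.2f}")
print(f"NPV = {npv(0.18, 0.93, 0.28):.2f}")
```

Under these assumptions the sketch gives values that round to the PPV of 0.50 and NPV of 0.74 reported for this cut-off.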

Results

Group Differences (as Shown by t-Tests and Effect Sizes) and Pearson Correlations

Test of Memory Malingering

Results of the t-tests showed that, of the MMPI-2 validity scales evaluated (RBS, F, Fb, Fp, FBS, and HHI), significant differences between the group who passed the TOMM (N = 149) and the group who failed the TOMM (N = 45) were shown for all scales (Table 2). The effect sizes for these group differences are considered small to large (Lipsey, 1990) and ranged from 0.36 to 0.79. The largest effect sizes were shown for the RBS (d = 0.79), HHI (d = 0.75), and F (d = 0.72). As expected, persons failing the TOMM scored higher, indicating more performance invalidity, than those passing the TOMM on all validity indices. Results of the Pearson correlations showed that all MMPI-2 validity scales studied were significantly correlated with Trial 1, Trial 2, and the Retention Trial of the TOMM (Table 3).

Table 2.

Group differences in MMPI-2 validity scales among groups passing and failing the TOMMa

         Pass TOMM (N = 149)   Fail TOMM (N = 45)   t(192)             Cohen's d
Scale    M       SD            M       SD           t      p-value     d       95% CI
RBS      11.4    4.1           14.6    4.0          4.6    .000        0.79    0.45–1.13
F        67.8    18.9          82.1    20.8         4.4    .000        0.72    0.38–1.06
Fb       66.2    22.2          82.4    25.1         4.2    .000        0.68    0.34–1.02
Fp       53.2    13.1          61.5    17.0         3.5    .001        0.55    0.21–0.88
FBS      20.6    5.6           23.8    5.8          3.4    .001        0.56    0.22–0.90
HHI      8.1     3.7           10.7    3.2          4.3    .000        0.75    0.41–1.09

Notes: TOMM = Test of Memory Malingering; F = Infrequency; Fb = Infrequency-Back; Fp = Infrequency Psychopathology; FBS = Symptom Validity Scale; RBS = Response Bias Scale; HHI = Henry–Heilbronner Index. RBS, FBS, and HHI are raw scores. F, Fb, and Fp are T-scores.

aThere are five more cases included in the TOMM analyses than in the MSVT analyses. These five cases were excluded from the MSVT analyses because they met criteria for the GMIP on the AI Program of the MSVT.
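The effect sizes in Table 2 can be approximately reproduced from the tabled means and standard deviations. The sketch below is an assumption about the computation rather than a description of the software the authors used: the tabled d values appear consistent with dividing the mean difference by the root mean square of the two group standard deviations, and the confidence intervals appear consistent with a common large-sample standard error for d. Function names are hypothetical, and small discrepancies are expected because the tabled means and SDs are rounded.

```python
import math


def cohens_d(m_pass: float, sd_pass: float, m_fail: float, sd_fail: float) -> float:
    """Cohen's d using the root mean square of the two SDs (assumed formula)."""
    pooled_sd = math.sqrt((sd_pass ** 2 + sd_fail ** 2) / 2.0)
    return (m_fail - m_pass) / pooled_sd


def d_confidence_interval(d: float, n_pass: int, n_fail: int,
                          z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% CI for d from a large-sample standard error (assumed)."""
    se = math.sqrt((n_pass + n_fail) / (n_pass * n_fail)
                   + d ** 2 / (2.0 * (n_pass + n_fail)))
    return d - z * se, d + z * se


# RBS row of Table 2: pass M = 11.4, SD = 4.1 (N = 149); fail M = 14.6, SD = 4.0 (N = 45)
d_rbs = cohens_d(11.4, 4.1, 14.6, 4.0)
low, high = d_confidence_interval(d_rbs, 149, 45)
print(f"d = {d_rbs:.2f}, 95% CI = {low:.2f} to {high:.2f}")
```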

Table 3.

Correlations between the free-standing symptom validity scales and the MMPI-2 validity scales

                        RBS      F        Fb       Fp       FBS      HHI
TOMM Trial 1            −0.42a   −0.39a   −0.38a   −0.21a   −0.36a   −0.40a
TOMM Trial 2            −0.34a   −0.29a   −0.28a   −0.18b   −0.22a   −0.28a
TOMM Retention Trial    −0.35a   −0.31a   −0.30a   −0.24a   −0.23a   −0.29b
MSVT Easy               −0.35a   −0.32a   −0.29a   −0.20b   −0.27a   −0.33a
MSVT Hard               −0.31a   −0.31a   −0.27a   −0.18b   −0.23a   −0.27b

Notes: TOMM = Test of Memory Malingering; MSVT = Medical Symptom Validity Test; F = Infrequency; Fb = Infrequency-Back; Fp = Infrequency Psychopathology; FBS = Symptom Validity Scale; RBS = Response Bias Scale; HHI = Henry–Heilbronner Index. RBS, FBS, and HHI are raw scores. F, Fb, and Fp are T-scores.

aCorrelation is significant at the .01 level.

bCorrelation is significant at the .05 level.

Medical Symptom Validity Test

Results of the t-tests showed that, of the MMPI-2 validity scales evaluated (RBS, F, Fb, Fp, FBS, and HHI), significant differences between the group who passed the MSVT (N = 129) and the group who failed the MSVT (N = 60) were shown for all scales (Table 4). The effect sizes for these group differences are considered moderate to large (Lipsey, 1990) and ranged from 0.45 to 0.83. The largest effect sizes were shown for the HHI (d = 0.83), RBS (d = 0.80), and F scale (d = 0.78). As expected, persons failing the MSVT scored higher, indicating more negative response bias, than those passing the MSVT on all validity indices. Results of the Pearson correlations showed that all MMPI-2 validity scales studied were significantly correlated with both the easy and the hard subtests of the MSVT (Table 3).

Table 4.

Group differences in MMPI-2 validity scales among groups passing and failing the MSVT using the AI Programa

         Pass MSVT (N = 129)   Fail MSVT (N = 60)   t(187)               Cohen's d
Scale    M       SD            M       SD           t-value   p-value    d-value   95% CI
RBS      11.1    4.1           14.3    3.9          5.0       .000       0.80      0.48–1.12
F        66.4    18.2          81.8    21.0         5.1       .000       0.78      0.48–1.10
Fb       64.5    20.8          80.6    25.5         4.6       .000       0.69      0.38–1.01
Fp       53.3    13.4          60.0    16.4         2.5       .012       0.45      0.14–0.76
FBS      20.1    5.8           24.0    5.0          4.5       .000       0.72      0.41–1.04
HHI      7.78    3.8           10.6    2.9          5.1       .000       0.83      0.52–1.15

Notes: MSVT = Medical Symptom Validity Test; F = Infrequency; Fb = Infrequency-Back; Fp = Infrequency Psychopathology; FBS = Symptom Validity Scale; RBS = Response Bias Scale; HHI = Henry–Heilbronner Index. RBS, FBS, and HHI are raw scores. F, Fb, and Fp are T-scores.

aThere are five more cases included in the TOMM analyses than in the MSVT analyses. These five cases were excluded from the MSVT analyses because they met criteria for the GMIP on the AI Program of the MSVT.

Hierarchical Logistic Regression Analyses

Test of Memory Malingering

MMPI-2 scales selected for entry in the regression analyses were those that showed the largest effect sizes between groups passing and failing the TOMM (i.e., RBS, HHI, F, and Fb). Table 5 shows the results of these analyses, which indicate that no MMPI-2 validity scale tested added incremental validity to the RBS in predicting pass versus fail on the TOMM. Similarly, no MMPI-2 validity scale tested added incremental validity to the HHI in predicting pass versus fail on the TOMM. Both the HHI and the RBS added incremental validity to F and Fb in differentiating the groups who passed versus failed the TOMM.

Table 5.

Hierarchical logistic regression analyses: Predicting TOMM pass/fail with MMPI-2 validity scales

Model                 Model χ2 (df)    χ2-change   R2      R2-change
RBS entered first
  RBS                 20.09 (1)***                 .149
  HHI                 22.48 (2)***     2.39        .165    .016
  F                   21.48 (2)***     1.39        .158    .009
  Fb                  23.94 (2)***     0.64        .167    .018
HHI entered first
  HHI                 18.66 (1)***                 .139
  RBS                 22.48 (2)***     3.82        .165    .026
  F                   22.18 (2)***     3.51        .163    .024
  Fb                  21.63 (2)***     2.97        .159    .020
F entered first
  F                   16.80 (1)***                 .125
  RBS                 21.48 (2)***     4.67*       .158    .033
  HHI                 22.18 (2)***     5.38*       .163    .038
  Fb                  17.85 (2)***     1.05        .133    .008
Fb entered first
  Fb                  15.32 (1)***                 .115
  RBS                 21.57 (2)***     6.25*       .159    .044
  HHI                 16.08 (2)***     6.31*       .159    .044
  F                   17.85 (2)***     2.52        .133    .018

Notes: TOMM = Test of Memory Malingering; HHI = Henry–Heilbronner Index; F = Infrequency; Fb = Infrequency-Back; RBS = Response Bias Scale. RBS and HHI are raw scores. F and Fb are T-scores.

***p < .001.

*p < .05.

Medical Symptom Validity Test

MMPI-2 scales selected for entry in the regression analyses were those that showed the largest effect sizes between groups passing and failing the MSVT (i.e., HHI, RBS, F, and FBS). Table 6 shows the results of these analyses, which indicate that F, but not RBS or FBS, added incremental validity to the HHI in predicting pass versus fail on the MSVT. On the other hand, only the HHI added incremental validity to the RBS in differentiating the groups who passed versus failed the MSVT. Finally, the HHI and the FBS, but not the RBS, added incremental validity to F in differentiating the groups.

Table 6.

Hierarchical logistic regression analyses: Predicting MSVT pass/fail using the AI Program with MMPI-2 validity scales

Model                 Model χ2 (df)    χ2-change   R2      R2-change
HHI entered first
  HHI                 25.90 (1)***                 .179
  RBS                 28.54 (2)***     2.64        .196    .017
  F                   30.65 (2)***     4.75*       .210    .031
  FBS                 26.15 (2)***     2.47        .181    .002
RBS entered first
  RBS                 23.30 (1)***                 .163
  HHI                 28.54 (2)***     5.24*       .196    .033
  F                   26.95 (2)***     3.65        .186    .023
  FBS                 27 97 182 19
F entered first
  F                   23.24 (1)***                 .162
  HHI                 30.65 (2)***     7.41**      .210    .048
  RBS                 26.95 (2)***     3.71        .186    .024
  FBS                 28.36 (2)***     5.12*       .195    .033

Notes: MSVT = Medical Symptom Validity Test; HHI = Henry–Heilbronner Index; F = Infrequency; FBS = Symptom Validity Scale; RBS = Response Bias Scale. HHI, RBS, and FBS are raw scores. F is a T-score.

***p < .001.

**p < .01.

*p < .05.

ROC, Sensitivity, Specificity, and Predictive Power Analyses

Test of Memory Malingering

As shown by ROC analysis, the area under the curve of 0.719 (95% CI: 0.63–0.804) suggests that the predictive information captured by the RBS score was reasonably good. Using the RBS cut-off score of ≥17, suggested by Gervais and colleagues (2007), resulted in a somewhat low specificity of 0.88 and a sensitivity of 0.22 (Table 7). A more reasonable false positive rate (0.07) was found when using a slightly higher cut-off score of ≥18 (specificity = 0.93). At the cut-off of ≥18, the sensitivity of the RBS (0.18) was similar to that obtained with a cut-off score of ≥17 (0.22) and was slightly lower than that found by Gervais and colleagues (2007), who reported sensitivity values ranging from 0.25 to 0.29 when using a cut-off of ≥17 in their samples. As can be seen in Table 7, using a cut-off of ≥18 on the RBS in the present study resulted in a PPV of 0.50 and an NPV of 0.74.

Table 7.

Classification accuracy of various RBS and HHI cut-offs in predicting TOMM pass/fail

Cutoff    Sensitivity   Specificity   PPVa    NPVa
RBS
  16      0.38          0.81          0.44    0.77
  17      0.22          0.88          0.42    0.74
  18      0.18          0.93          0.50    0.74
  19      0.16          0.97          0.67    0.75
  20      0.11          0.98          0.68    0.74
  21      0.09          0.99          0.78    0.74
HHI
  12      0.44          0.77          0.43    0.78
  13      0.36          0.87          0.52    0.78
  14      0.13          0.95          0.50    0.74
  15      0.02          0.99          0.44    0.72

Notes: Prevalence of TOMM failure was 0.23. RBS = Response Bias Scale; HHI = Henry–Heilbronner Index; PPV = positive predictive value; NPV = negative predictive value.

aA 28% base rate of performance invalidity was used in these calculations.

Also, as shown by the ROC analysis, the area under the curve of 0.711 (95% CI: 0.627–0.796) suggests that predictive information captured by the HHI score was reasonably good. Henry and colleagues (2006) suggested that a score of ≥8 on the HHI warrants consideration for a “pseudosomatic” pattern of item endorsement. However, in the present sample, while a cut-off of ≥8 resulted in excellent sensitivity (0.87), it showed very poor specificity (0.57). In contrast, a much higher cut-off of ≥14 resulted in an acceptable level of specificity (0.95) while still retaining a small amount of sensitivity (0.13). As can be seen in Table 7, using a cut-off of 14 on the HHI in the present study resulted in a PPV of 0.50 and an NPV of 0.74.

Medical Symptom Validity Test

As shown by ROC analysis, the area under the curve of 0.701 (95% CI: 0.624–0.779) suggests that the predictive information captured by the RBS score was reasonably good. Using the RBS cut-off score of ≥17, suggested by Gervais and colleagues (2007), resulted in a somewhat low specificity of 0.88 and a sensitivity of 0.20 (Table 8). A more reasonable false positive rate (0.06) was found when using a slightly higher cut-off score of ≥18 (specificity = 0.94). At the cut-off of ≥18, the sensitivity of the RBS (0.17) was similar to that obtained with a cut-off score of ≥17 (0.20) and was somewhat lower than that found by Gervais and colleagues (2007), who reported sensitivity values ranging from 0.25 to 0.29 when using a cut-off of ≥17 in their samples. As can be seen in Table 8, using a cut-off of ≥18 on the RBS in the present study resulted in a PPV of 0.52 and an NPV of 0.74.

Table 8.

Classification accuracy of various RBS and HHI cut-offs in predicting MSVT pass/fail using the AI Program

Cutoff    Sensitivity   Specificity   PPVa    NPVa
RBS
  16      0.35          0.82          0.43    0.76
  17      0.20          0.88          0.39    0.74
  18      0.17          0.94          0.52    0.74
  19      0.15          0.98          0.74    0.75
  20      0.12          0.99          0.82    0.74
  21      0.10          1.0           1.0     0.74
HHI
  12      0.42          0.79          0.44    0.78
  13      0.28          0.87          0.46    0.76
  14      0.13          0.96          0.56    0.74
  15      0.03          1.0           1.0     0.73

Notes: Prevalence of MSVT failure was 0.32. RBS = Response Bias Scale; HHI = Henry–Heilbronner Index; PPV = positive predictive value; NPV = negative predictive value.

aA 28% base rate of performance invalidity was used in these calculations.

For the HHI, as shown by the ROC analysis, the area under the curve of 0.718 (95% CI: 0.643–0.794) suggests that predictive information captured by the HHI score was reasonably good. Henry and colleagues (2006) suggested that a score of ≥8 on the HHI warrants consideration for a “pseudosomatic” pattern of item endorsement. However, in the present sample, while a cut-off of ≥8 resulted in excellent sensitivity (0.83), it showed very poor specificity (0.54). In contrast, a much higher cut-off of ≥14 resulted in an acceptable level of specificity (0.96) while still retaining a small amount of sensitivity (0.13). As can be seen in Table 8, using a cut-off of ≥14 on the HHI in the present study resulted in a PPV of 0.56 and an NPV of 0.74.

Discussion

Similar to the three previously published studies which compared the ability of the RBS and the HHI to predict performance invalidity on neuropsychological testing, the current study used a retrospective chart review to examine this issue among either active or veteran U.S. military members. While results of the previously published studies varied with regard to whether or not the RBS or the HHI was superior in its ability to predict performance invalidity, it is notable that different symptom validity tests were used as the criterion in these studies. To more closely examine whether the differences in study findings were due to differences among the samples, which were quite similar, or due to differences in the symptom validity test employed, the present investigation examined patient performances on two separate stand-alone symptom validity tests administered during the same session. Specifically, the primary goal of the present research was to examine the ability of various MMPI-2 validity scales, including the HHI, RBS, F, Fb, Fp, and FBS, to predict failure on the MSVT and the TOMM, respectively. Overall, findings of the current study suggest that the RBS and the HHI are superior to other MMPI-2 validity scales in predicting performance invalidity on free-standing symptom validity tests. However, the choice of free-standing validity test employed, even within the same sample, results in differing relative contributions of the RBS and the HHI in predicting symptom validity test outcome.

Specifically, results showed that while nearly all MMPI-2 validity scales showed small-to-medium and significant correlations with performances on both the TOMM and the MSVT, the MMPI-2 validity scales differed in the magnitude of difference between groups passing and failing these measures (Tables 2 and 4). For the TOMM, group differences between those passing and failing were largest for the RBS (d = 0.79), HHI (d = 0.75), and F (d = 0.72). Similarly, for the MSVT, group differences between those passing and failing were largest for the HHI (d = 0.83), RBS (d = 0.80), and F (d = 0.78). As expected, groups failing the TOMM and the MSVT, respectively, scored higher on these scales than those who passed.

Regression analyses showed that the RBS accounted for the most variance in TOMM scores (20%), while the HHI accounted for the most variance in MSVT scores (26%). No MMPI-2 validity scale tested added incremental validity to either RBS or HHI, respectively, in predicting pass versus fail on the TOMM. Both HHI and RBS added incremental validity to F in differentiating the groups who passed versus failed the TOMM. On the MSVT, only F added incremental validity to the HHI in predicting pass versus fail. Only the HHI added incremental validity to the RBS in differentiating the groups who passed versus failed the MSVT. Finally, the HHI and the FBS, but not the RBS, added incremental validity to F in differentiating the groups.

With regard to the RBS, classification analyses suggested that a cut-off of ≥18 minimized false positives on both the TOMM and the MSVT (specificity = 0.93 and 0.94, respectively) while retaining a reasonable sensitivity of 0.18 on the TOMM and 0.17 on the MSVT, which is fairly typical in the practice of symptom validity testing. These data suggest that, in terms of the PPV, using an RBS cut-off of ≥18, a clinician would have a 52% probability of being correct in suspecting that a patient would fail the MSVT with non-credible performance. This figure is slightly lower for the TOMM, where using a cut-off of ≥18 on the RBS would give the clinician only a 50% chance of being correct in suspecting failure. In terms of the NPV, the same clinician would have a 74% probability of being correct in not suspecting a patient of possible TOMM or MSVT failure, respectively, given below cut-off performance on the RBS. According to Gervais, Ben-Porath, Wygant, and Green's (2008) published data, an RBS cut-off score of ≥18 corresponds to an RBS T-score of ≥105. In their published guidelines, the latter places the test-taker's score into the highest T-score interpretive category of T = 100+, which, by their description, suggests that “memory complaints are exaggerated” (p. 14).

With regard to the HHI, classification analyses suggested that a cut-off of ≥14 minimized false positives for both the MSVT and the TOMM (specificity = 0.96 and 0.95, respectively) while still retaining a small amount of sensitivity (0.13). These data suggest that, in terms of the PPV, using an HHI cut-off of ≥14, a clinician would have a 56% probability of being correct in suspecting that a patient would fail the MSVT with non-credible performance. This figure is slightly lower for the TOMM, where using a cut-off of ≥14 on the HHI would give the clinician only a 50% chance of being correct in suspecting failure. In terms of the NPV, the same clinician would have a 74% probability of being correct in not suspecting a patient of possible MSVT or TOMM failure, respectively, given below cut-off performance on the HHI.

It is possible that the slight increases in the PPVs when using the MSVT versus the TOMM are attributable to the increased sensitivity of the former measure. In the current study, 23 patients who passed the TOMM failed the MSVT, while only three patients who passed the MSVT failed the TOMM. Of those 23 patients who failed the MSVT, but passed the TOMM, four exceeded the cut-off on the RBS (≥18) and three exceeded the cut-off on the HHI (≥14). Thus, it is possible that some patients who passed the TOMM, but exceeded the RBS and HHI cut-offs, may actually be exhibiting performance invalidity that went undetected by the TOMM but was detected by the MSVT. The reverse may also be true, but to a lesser extent, as one of the three patients who passed the MSVT but failed the TOMM also exceeded cut-offs on both the HHI and the RBS.

As a whole, the results of the study suggest that, while both HHI and RBS may have some use in predicting failure on stand-alone symptom validity tests and, thus, potentially arousing suspicion of performance invalidity, they, like other embedded symptom validity measures, are best not used in isolation. Previous research into the interpretation of multiple embedded symptom validity measures among clinical groups has shown that 41% of a credible (i.e., defined by performance on free-standing symptom validity tests) group of patients failed one or more of four embedded symptom validity measures (Victor et al., 2009). The accuracy of predicting credible versus non-credible performance was reported to be superior when using any pairwise (i.e., two or more) failure combination of the embedded symptom validity tests (sensitivity 84% and specificity 94%) when compared with using one test by itself (sensitivity 95% and specificity 53%) or with using any three-test failure combination (sensitivity 52% and specificity 99%), where sensitivity was substantially lowered.

Thus, it is likely that using the HHI or the RBS in combination with another embedded or stand-alone symptom validity test would increase the accuracy of identifying performance invalidity. In the present study, only six individuals failed both the HHI and the RBS using the adjusted cut-offs (i.e., RBS ≥ 18 and HHI ≥ 14). Post hoc analyses show that using failure on both to predict MSVT performance yields an unimpressive sensitivity of 7% and a specificity of 98%. However, using the cut-offs suggested by the test developers (RBS ≥ 17 and HHI ≥ 8) yielded different results. Specifically, post hoc analyses show that 28 individuals failed both the HHI and the RBS using the original cut-offs. Under such circumstances, sensitivity was much higher, falling at 22%, while specificity remained near acceptable levels, falling at 88%. As the current sample of patients also completed the Digit Span task of the Wechsler Adult Intelligence Scale-Fourth Edition (Wechsler, 2008), it was also possible to follow the example provided by Victor and colleagues (2009) of employing multiple symptom validity tests and requiring failure on at least two of these to predict performance invalidity. Specifically, post hoc analyses showed that, when failure was required on at least two of three embedded symptom validity tests in this sample (i.e., RBS ≥ 17, HHI ≥ 8, or Reliable Digit Span ≤ 7), the sensitivity for predicting TOMM failure was impressive, falling at 43%, while the specificity was slightly low, falling at 83%. In predicting MSVT failure, the sensitivity of this approach was also impressive, falling at 39%, while specificity remained low, falling at 84%. Future research may address the utility of using these scales in combination and/or with other embedded and stand-alone symptom validity measures to predict performance invalidity.
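The two-of-three decision rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' procedure: the function names are hypothetical, the default cut-offs are the test developers' RBS and HHI values and the Reliable Digit Span value named in the text, and the five example cases are invented purely to show how sensitivity and specificity would be tallied against a stand-alone criterion.

```python
def fails_two_of_three(rbs: int, hhi: int, rds: int,
                       rbs_cut: int = 17, hhi_cut: int = 8, rds_cut: int = 7) -> bool:
    """Flag a case when at least two of three embedded indicators are failed.

    RBS and HHI are failed at or above their cut-offs; Reliable Digit Span
    (RDS) is failed at or below its cut-off.
    """
    failures = [rbs >= rbs_cut, hhi >= hhi_cut, rds <= rds_cut]
    return sum(failures) >= 2


def sensitivity_specificity(flags: list[bool], failed_criterion: list[bool]) -> tuple[float, float]:
    """Sensitivity and specificity of the flags against a pass/fail criterion."""
    tp = sum(f and c for f, c in zip(flags, failed_criterion))
    tn = sum((not f) and (not c) for f, c in zip(flags, failed_criterion))
    n_fail = sum(failed_criterion)
    n_pass = len(failed_criterion) - n_fail
    return tp / n_fail, tn / n_pass


# Hypothetical cases: (RBS, HHI, RDS, failed stand-alone test?)
cases = [(19, 12, 6, True), (10, 9, 9, True), (17, 5, 7, False),
         (8, 4, 10, False), (12, 7, 8, False)]
flags = [fails_two_of_three(r, h, d) for r, h, d, _ in cases]
sens, spec = sensitivity_specificity(flags, [c[3] for c in cases])
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Raising any of the cut-offs, or requiring all three failures, trades sensitivity for specificity in the same way the text describes for the adjusted RBS and HHI cut-offs.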

Although the TOMM and the MSVT were used to classify individuals as demonstrating performance invalidity in the present study, it should be emphasized that the diagnosis of invalid presentation, especially if malingering is in question, is a clinical judgment that cannot be made on the results of symptom validity tests alone, but must be made in consideration of other psychometric, behavioral, and collateral data (Slick et al., 1999). It is also notable that the use of different free-standing symptom validity tests, such as the CARB or the WMT, may have resulted in findings that differed from those presented herein. A further limitation of this study is that the participants were primarily Caucasian men, all of whom were receiving neuropsychological services at a Veterans Affairs medical center. It is possible that the results of this study might not generalize well to the general population, where the base rate of performance invalidity may be lower. Holding sensitivity and specificity constant, a decrease in the base rate of performance invalidity would result in a concomitant decrease in PPVs, and an increase in NPVs, for the RBS and the HHI. Thus, future research may address the applicability of these findings to various patient populations and settings.

Conflict of Interest

None declared.

References

Allen
L.
Conder
R. L.
Green
P.
Cox
D. R.
CARB '97 manual for the Computerized Assessment of Response Bias
 , 
1997
Durham, NC
CogniSyst
Arbisi
P. A.
Ben-Porath
Y. S.
An MMPI-2 infrequent response scale for use with psychopathological populations: The infrequency-psychopathology scale, F(p)
Psychological Assessment
 , 
1995
, vol. 
7
 
4
(pg. 
424
-
431
)
Armistead-Jehle
P.
Gervais
R. O.
Sensitivity of the Test of Memory Malingering and the Nonverbal Medical Symptom Validity Test: A replication study
Applied Neuropsychology
 , 
2011
, vol. 
18
 
4
(pg. 
284
-
290
)
Ashendorf
L.
Constantinou
M.
McCaffrey
R. J.
The effects of depression and anxiety on the TOMM in community dwelling older adults
Archives of Clinical Neuropsychology
 , 
2004
, vol. 
19
 (pg. 
125
-
130
)
Axelrod
B. N.
Schutte
C.
Concurrent validity of three forced-choice measures of symptom validity
Applied Neuropsychology
 , 
2011
, vol. 
18
 
1
(pg. 
27
-
33
)
Ben-Porath
Y. S.
Tellegan
A.
Minnesota Multiphasic Personality Inventory-2 Restructured Form: Manual for administration and scoring
 , 
2008
Minneapolis, MN
University of Minnesota Press
Butcher
J. N.
Dahlstrom
W. G.
Graham
J. R.
Tellegen
A.
Kaemmer
B.
Minnesota Multiphasic Personality Inventory (MMPI-2). Manual for administration and scoring
 , 
1989
Minneapolis
University of Minnesota Press
Butcher
J. N.
Graham
J. R.
Ben-Porath
Y. S.
Tellegen
A.
Dahlstrom
W. G.
Kaemmer
B.
MMPI-2: Manual for administration and scoring
 , 
2001
Rev. ed.
Minneapolis
University of Minnesota Press
Carone
D.
Children with moderate/severe brain damage/dysfunction outperform adults with mild-to-no brain damage on the Medical Symptom Validity Test
Brain Injury
 , 
2008
, vol. 
22
 
12
(pg. 
960
-
971
)
Devilly
G. J.
Effect Size Generator for Windows: Version 2.3
 , 
2004
Australia
Centre for Neuropsychology, Swinburne University
Dionysus
K. E.
Denney
R. L.
Halfaker
D. A.
Detecting negative response bias with the Fake Bad Scale, Response Bias Scale, and Henry-Heilbronner Index of the Minnesota Multiphasic Personality Inventory-2
Archives of Clinical Neuropsychology
 , 
2011
, vol. 
26
 (pg. 
81
-
88
)
Elhai
J. D.
Ruggiero
K. J.
Frueh
B. C.
Beckham
J. C.
Gold
P. B.
Feldman
M. E.
The Infrequency-Posttraumatic Stress Disorder Scale (Fptsd) for the MMPI-2: Development and initial validation with veterans presenting with combat-related PTSD
Journal of Personality Assessment
 , 
2002
, vol. 
79
 
3
(pg. 
531
-
549
)
Gervais
R. O.
Ben-Porath
Y. S.
Wygant
D. B.
Green
P.
Development and validation of a Response Bias Scale (RBS) for the MMPI-2
Assessment
 , 
2007
, vol. 
14
 (pg. 
196
-
208
)
Gervais
R. O.
Ben-Porath
Y. S.
Wygant
D. B.
Green
P.
Differential sensitivity of the Response Bias Scale (RBS) and MMPI-2 validity scales to memory complaints
The Clinical Neuropsychologist
 , 
2008
, vol. 
22
 
6
(pg. 
1061
-
1079
)
Gervais
R. O.
Ben-Porath
Y. S.
Wygant
D. B.
Sellboom
M.
Incremental validity of the MMPI-2-RF over-reporting scales and RBS in assessing the veracity of memory complaints
Archives of Clinical Neuropsychology
 , 
2010
, vol. 
25
 
4
(pg. 
274
-
84
)
Glaros
A. G.
Kline
R. B.
Understanding the accuracy of tests with cutting scores: The sensitivity, specificity, and predictive value model
Journal of Clinical Psychology
 , 
1988
, vol. 
44
 (pg. 
1013
-
1023
)
Graham
J. R.
MMPI-2: Assessing personality and psychopathology
 , 
1990
New York
Oxford University Press
Graham
J. R.
MMPI-2: Assessing personality and psychopathology
 , 
2005
4th ed.)
New York
Oxford University Press
Green
P.
Green's Word Memory Test for Windows: User's manual
 , 
2003
Edmonton, Canada
Green's Publishing
Green
P.
Green's Medical Symptom Validity Test (MSVT) for Windows: User's manual
 , 
2004
Edmonton, Canada
Green's Publishing
Green
P.
Manual for the Nonverbal Medical Symptom Validity Test
 , 
2008
Edmonton, Canada
Green's Publishing
Green
P.
The Advanced Interpretation Program for the WMT, MSVT, NV-MSVT, and MCI
 , 
2009
Edmonton, Canada
Green's Publishing
Green
P.
Comparison between the Test of Memory Malingering (TOMM) and the Nonverbal Medical Symptom Validity Test in adults with disability claims
Applied Neuropsychology
 , 
2011
, vol. 
18
 
1
(pg. 
18
-
26
)
Green
P.
Allen
L.
Astner
K.
The Word Memory Test: A user's guide to the oral and computer-administered forms, U. S. Version 1.1.
 , 
1996
Durham, NC
CogniSyst
Green
P.
Astner
K.
Manual for the Oral Word Memory Test
 , 
1995
Edmonton, Canada
Neurobehavioural Associates
Green
P.
Montijo
J.
Brockhaus
R.
High specificity of the Word Memory Test and Medical Symptom Validity Test in groups with severe verbal memory impairment
Applied Neuropsychology
 , 
2011
, vol. 
18
 
2
(pg. 
86
-
94
)
Greiffenstein
M. F.
Fox
D.
Lees-Haley
P. R.
Boone
K.
The MMPI-2 Fake Bad Scale in detection of noncredible brain injury claims
Assessment of feigned cognitive impairment: A neuropsychological perspective
 , 
2007
New York
Guilford Publications
(pg. 
210
-
235
)
Hathaway
S. R.
McKinley
J. C.
The Minnesota multiphasic personality inventory
 , 
1951
Rev. ed.
New York
Psychological Corporation
Henry, G. K., Heilbronner, R. L., Mittenberg, W., & Enders, C. (2006). The Henry-Heilbronner Index: A 15-item empirically derived MMPI-2 subscale for identifying probable malingering in personal injury litigants and disability claimants. The Clinical Neuropsychologist, 20, 786–797.
Henry, G. K., Heilbronner, R. L., Mittenberg, W., Enders, C., & Stanczak, S. R. (2008). Comparison of the Lees-Haley Fake Bad Scale, Henry-Heilbronner Index, and restructured clinical scale 1 in identifying noncredible symptom reporting. The Clinical Neuropsychologist, 22(5), 919–929.
Howe, L. L. S., Anderson, A. M., Kaufman, D. A. S., Sachs, B. C., & Loring, D. W. (2007). Characterization of the Medical Symptom Validity Test in evaluation of clinically referred memory disorders clinic patients. Archives of Clinical Neuropsychology, 22, 753–761.
Iverson, G. L., Le Page, J., Koehler, B. E., Shojania, K., & Badii, M. (2007). Test of Memory Malingering (TOMM) scores are not affected by chronic pain or depression in patients with fibromyalgia. The Clinical Neuropsychologist, 21(3), 532–546.
Jones, A., & Ingram, M. V. (2011). A comparison of selected MMPI-2 and MMPI-2-RF validity scales in assessing effort on cognitive tests in a military sample. The Clinical Neuropsychologist, 25(7), 1207–1227.
Jones, A., Ingram, M. V., & Ben-Porath, Y. S. (2012). Scores on the MMPI-2 RF scales as a function of increasing levels of failure on cognitive symptom validity tests in a military sample. The Clinical Neuropsychologist, 26(5), 790–815.
Lange, R. T., Sullivan, K. A., & Scott, C. (2010). Comparison of MMPI-2 and PAI validity indicators to detect feigned depression and PTSD symptom reporting. Psychiatry Research, 176(2–3), 229–235.
Lees-Haley, P., English, L. T., & Glenn, W. J. (1991). A fake bad scale on the MMPI-2 for personal injury claimants. Psychological Reports, 68, 203–210.
Lipsey, M. W. (1990). Design sensitivity. Newbury Park, CA: Sage.
Malec, J. F., Brown, A. W., Leibson, C. L., Flaada, J. T., & Mandrekar, J. N. (2007). The Mayo classification system for traumatic brain injury severity. Journal of Neurotrauma, 24, 1417–1424.
McCaffrey, R. J., Palav, A., O'Bryant, S. E., & Labarge, A. S. (2003). Practitioner's guide to symptom base rates in clinical neuropsychology. New York: Plenum.
Nelson, N. W., Sweet, J. J., & Demakis, G. J. (2006). Meta-analysis of the MMPI-2 Fake Bad Scale: Utility in forensic practice. The Clinical Neuropsychologist, 20, 39–58.
Nelson, N. W., Sweet, J. J., & Heilbronner, R. L. (2007). Examination of the new MMPI-2 Response Bias Scale (Gervais): Relationship with MMPI-2 validity scales. Journal of Clinical and Experimental Neuropsychology, 29(1), 67–72.
O'Bryant, S. E., & Lucas, J. A. (2006). Estimating the predictive value of the Test of Memory Malingering: An illustrative example for clinicians. The Clinical Neuropsychologist, 20, 533–540.
Rees, L. M., Tombaugh, T. N., Gansler, D. A., & Moczynski, N. P. (1998). Five validation experiments of the Test of Memory Malingering (TOMM). Psychological Assessment, 10, 10–20.
Shaw, D. J., & Matthews, C. G. (1965). Differential MMPI performance of brain damaged versus pseudoneurologic groups. Journal of Clinical Psychology, 21, 405–408.
Singhal, A., Green, P., Ashaye, K., Shankar, K., & Gill, D. (2009). High specificity of the Medical Symptom Validity Test in patients with very severe memory impairment. Archives of Clinical Neuropsychology, 24, 721–728.
Slick, D. J., Hopp, G., Strauss, E., & Spellacy, F. J. (1996). Victoria Symptom Validity Test: Efficiency for detecting feigned memory impairment and relationship to neuropsychological tests and MMPI-2 validity scales. Journal of Clinical and Experimental Neuropsychology, 18, 911–922.
Slick, D. J., Sherman, E. M. S., & Iverson, G. L. (1999). Diagnostic criteria for malingered neurocognitive dysfunction: Proposed standards for clinical practice and research. The Clinical Neuropsychologist, 13(4), 545–561.
Sullivan, K. A., & Elliott, C. (2012). An investigation of the validity of the MMPI-2 response bias scale using an analog simulation design. The Clinical Neuropsychologist, 26(1), 160–176.
Teichner, G., & Wagner, M. T. (2004). The Test of Memory Malingering (TOMM): Normative data from cognitively intact, cognitively impaired, and elderly patients with dementia. Archives of Clinical Neuropsychology, 19(3), 455–464.
Tombaugh, T. N. (1996). TOMM: Test of Memory Malingering. North Tonawanda, NY: Multi-Health Systems.
Tombaugh, T. N. (1997). The Test of Memory Malingering (TOMM): Normative data from cognitively intact and cognitively impaired individuals. Psychological Assessment, 9, 260–268.
Tsushima, W. T., Geling, O., & Fabrigas, J. (2011). Comparison of MMPI-2 validity scale scores of personal injury litigants and disability claimants. The Clinical Neuropsychologist, 25(8), 1403–1414.
Victor, T. L., Boone, K. B., Serpa, G., Buehler, J., & Ziegler, E. A. (2009). Interpreting the meaning of multiple symptom validity test failure. The Clinical Neuropsychologist, 23(2), 297–313.
Wechsler, D. A. (2008). Wechsler Adult Intelligence Scale – IV. San Antonio, TX: Psychological Corporation.
Whitney, K. A., Davis, J. J., Shepard, P. H., & Herman, S. M. (2008). Utility of the Response Bias Scale (RBS) and other MMPI-2 validity scales in predicting TOMM performance. Archives of Clinical Neuropsychology, 23(7–8), 777–786.
Wygant, D. B., Sellbom, M., Gervais, R. O., Ben-Porath, Y. S., Stafford, K. P., & Freeman, D. B. (2010). Further validation of the MMPI-2 and MMPI-2-RF Response Bias Scale: Findings from disability and criminal forensic settings. Psychological Assessment, 22(4), 745–756.
Young, J. C., & Gross, A. M. (2011). Detection of response bias and noncredible performance in adult attention-deficit/hyperactivity disorder. Archives of Clinical Neuropsychology, 26, 165–175.
Young, J. C., Kearns, L. A., & Roper, B. L. (2011). Validation of the MMPI-2 response bias scale and Henry–Heilbronner Index in a U.S. veteran population. Archives of Clinical Neuropsychology, 26, 194–204.