Abstract

Objective

This research examined cutoff scores for MMPI-2 and MMPI-2-RF validity scales specifically developed to assess non-credible reporting of cognitive and/or somatic symptoms. The validity scales examined included the Response Bias Scale (RBS), the Symptom Validity Scales (FBS, FBS-r), Infrequent Somatic Responses scale (Fs), and the Henry–Heilbronner Indexes (HHI, HHI-r).

Method

Cutoffs were developed by comparing a psychometrically defined non-malingering group with three psychometrically defined malingering groups (probable, probable to definite, and definite malingering) and a group that combined all malingering groups. The participants in this research were drawn from a military sample consisting largely of patients with traumatic brain injury (mostly mild traumatic brain injury).

Results

Cutoffs with specificities of at least 0.90 are provided, along with the corresponding sensitivities, predictive values, and likelihood ratios.

Conclusions

RBS had the largest mean effect size (d) when the malingering groups were compared to the non-malingering group (d range = 1.23–1.58).

Introduction

There is a body of evidence (Nelson, Sweet, & Heilbronner, 2007; Gervais, Ben-Porath, Wygant, & Green, 2008; Whitney, Davis, Shepard, & Herman, 2008; Gervais, Ben-Porath, Wygant, Green, & Sellbom, 2010; Nelson, Hoelzle, Sweet, Arbisi, & Demakis, 2010; Jones & Ingram, 2011; Youngjohn, Wershba, Stevenson, Sturgeon, & Thomas, 2011; Peck et al., 2013) demonstrating that Symptom Validity Tests (SVTs) specifically designed to assess the validity or credibility of self-reported cognitive and/or somatic symptoms perform better in a variety of clinical and forensic settings than the validity scales that assess over-reporting of emotional distress or psychopathology on the Minnesota Multiphasic Personality Inventory-2 (MMPI-2; Butcher, Dahlstrom, Graham, Tellegen, & Kaemmer, 1989) and the related but restructured set of scales on the MMPI-2-RF (Ben-Porath & Tellegen, 2008). Results of a principal component analysis in a military sample by Jones and Ingram (2011) also indicate that the Cognitive-Somatic Validity Scales and validity scales related to psychopathology load on distinctly different components. Because of the differences in content and the apparent superiority of the cognitive-somatic scales in the context of neuropsychological evaluations, additional research applying these scales in a military sample is indicated. Cutoff scores have clinical utility and have been established in a variety of settings for the MMPI-2 and MMPI-2-RF Cognitive-Somatic SVTs (C-S SVTs), but no research has specifically addressed cutoffs to predict malingering for these scales in a military sample. The purpose of this research is to examine cutoffs that may be useful for diagnosing psychometrically determined malingering at three levels (Probable [PM], Probable to Definite [PDM], and Definite [DM]) and in a group that combines all malingering groups (CM).

The scales that assess non-credible reporting of psychopathology or emotional distress include the MMPI-2 Infrequency (F), Back F (FB), and Infrequency Psychopathology (FP) scales and the corresponding MMPI-2-RF scales (F-r and FP-r). The C-S SVTs on the MMPI-2 include the Fake Bad Scale (FBS; Lees-Haley, English, & Glenn, 1991), now named the Symptom Validity Scale, and the Response Bias Scale (RBS; Gervais, Ben-Porath, Wygant, & Green, 2007). The MMPI-2-RF revised version of the MMPI-2 FBS (FBS-r) retains 30 of the original 43 FBS items. RBS remains intact on the MMPI-2-RF. The MMPI-2-RF also includes a scale specifically designed to assess over-reporting of somatic complaints, the Infrequent Somatic Responses scale (Fs).

Another C-S SVT, the Henry–Heilbronner Index (HHI; Henry, Heilbronner, Mittenberg, & Enders, 2006), was developed for the MMPI-2, and a revised but shorter version, the HHI-r (11 vs. 15 items), was recently developed for the MMPI-2-RF (Henry, Heilbronner, Algina, & Kaya, 2013). Neither the HHI nor the HHI-r has been adopted by the publisher of the MMPI-2 and MMPI-2-RF. Jones and Ingram (2011) demonstrated that in a military sample FBS, RBS, FBS-r, and HHI performed better than the MMPI-2 F-family of scales assessing psychopathology in predicting performance credibility on neurocognitive tests. Because HHI performed the best in many respects in this sample, it is included in the current research.

FBS was the first C-S SVT adopted by the publisher of the MMPI-2. The items on this scale were selected “rationally on a content basis using unpublished frequency counts of malingerers’ MMPI test responses and observations of personal injury malingerers” (Lees-Haley, English, & Glenn, 1991; p. 204). Items were selected that were thought to assess simultaneously exaggerated post-injury distress and under-reporting of pre-incident personality problems. A meta-analysis of FBS was completed in 2006 (Nelson, Sweet, & Demakis, 2006), and it was updated in 2010 (Nelson, Hoelzle, Sweet, Arbisi, & Demakis, 2010). The authors of this meta-analysis concluded that FBS differentiated groups as well as, and at times better than, other MMPI-2 validity scales, including all of the F-family scales that assess psychopathology. This included TBI patients. A review of cutoff scores for FBS in multiple settings and groups (TBI, nontraumatic brain disease, psychiatric patients, and other groups) by Greiffenstein, Fox, and Lees-Haley (2007) concluded that a cutoff of ≥23 (specificity = 0.90) justifies concern about the validity of self-reported symptoms. A specificity of 0.90 has been considered a minimally acceptable level to establish cutoff scores (Victor, Boone, Serpa, Buehler, & Ziegler, 2009; Larrabee, 2012a) and is used throughout the current research to examine cutoff scores. Greiffenstein and coworkers indicated that one should be mindful of moderating variables, such as head injury severity, when using FBS cutoff scores; however, they indicate that in cases of head injury with negative radiologic findings, as is the case in the sample used for the current research, cutoff scores of 23–24 are grounds for suspecting exaggeration. A cutoff of 23 for men and 26 for women suggests possible malingering for the MMPI-2 (Ben-Porath, Graham, & Tellegen, 2009), which corresponds to a T-score of 80 for both cutoffs.
Greiffenstein and coworkers found that a raw score of 30 or greater never or rarely produced false-positive errors, which corresponds to an MMPI-2 T-score of about 100 for men and 90 for women.

Research subsequent to that of Greiffenstein, Fox, and Lees-Haley (2007) found that slightly higher cutoffs might be needed to meet recommended minimal levels of specificity. However, this research may not have used well-differentiated groups with respect to possible malingering status, and FBS may not be as sensitive or specific to a diagnosis of malingering in this case. For example, Tsushima, Geling, and Fabrigas (2011) compared a sample of predominantly mTBI litigating and/or compensation-seeking patients and non-litigating patients primarily with emotional or behavioral problems, health conditions, or both that were suspected to be influenced by psychological factors. No validity tests were used to classify patients into either group. Tsushima and coworkers found an FBS cutoff of ≥25 produced a specificity of 0.91. Dionysus, Denney, and Halfaker (2011) also found a cutoff of ≥25 was necessary to obtain a specificity of at least 0.90 in a sample of litigating or disability-seeking head-injured patients who failed at least one performance validity test (PVT) and a group of head-injured litigating patients who failed no PVTs. PVTs are designed to assess the credibility of performance on neurocognitive tests and not self-reported symptoms as do SVTs (Larrabee, 2012a). The possibility of exaggeration still exists in the latter group in the research by Dionysus and coworkers given that they were litigating. Peck et al. (2013) examined three groups: a valid TBI group, a psychogenic non-epileptic seizure group, and an invalid TBI group, which was classified as PM based on Slick, Sherman, and Iverson (1999) criteria. They found an FBS cutoff ≥27 produced a false-positive classification rate of 7% with ≤1 failure on PVTs in their valid TBI group. However, not only did some in this presumed valid group fail a PVT, the authors indicate that participants in the presumed valid group had substantial external incentives.
Thus, the cutoff established in this research might have been somewhat high because there were possible malingerers in this presumed valid group. Research (Fox, 2011) suggests that failure on even a single PVT can invalidate expected brain–behavior relationships that underlie neurocognitive test interpretations, and research by Proto et al. (2014) concluded that failure on even one PVT should raise concerns about performance validity, especially in individuals with mTBI. Thus, failure on one PVT may raise questions about possible malingering or at least non-credible performance on neuropsychological tests for other reasons, such as a Somatic Symptom Disorder (American Psychiatric Association, 2013) perhaps in the form of a Cogniform Disorder (Delis & Wetter, 2007).

The revision of FBS on the MMPI-2-RF (FBS-r) correlates in the upper 90s with scores on the MMPI-2 version of the scale in samples that included large numbers of individuals who failed SVTs (Tellegen & Ben-Porath, 2008). Jones and Ingram (2011) found a correlation of 0.95 between FBS and FBS-r in a military sample. With respect to cutoffs, Ben-Porath and Tellegen (2011) indicate that a T-score of 80–99 (raw score range 17–23) suggests possible over-reporting of memory complaints, and a T-score ≥ 100 (raw score = 24) suggests likely over-reporting of memory complaints and limits the interpretability of the MMPI-2-RF Cognitive Complaints scale (COG). Tarescavage, Wygant, Gervais, and Ben-Porath (2013) found that the same T-score (≥100) was necessary to differentiate an incentive only vs. a probable to definite Malingered Neurocognitive Disorder (MND) group based on Slick, Sherman, and Iverson (1999) criteria in a non–head-injured sample. This is slightly higher than the cutoff of ≥21 that Schroeder et al. (2012) found in differentiating a litigating group that failed Slick and coworkers criteria from a group of mostly mTBI patients who passed these criteria.

RBS was initially developed by Gervais (2005) but later modified. The items for both the original and modified versions are on both the MMPI-2 and MMPI-2-RF. RBS was developed using an empirical keying methodology to detect symptom complaints associated with cognitive response bias and over-reporting in forensic neuropsychological and disability assessment settings. Regression analysis was used to identify MMPI-2 items that predicted failure on PVTs. RBS consists of 28 MMPI-2 items that discriminated between non–head-injured disability claimants who passed or failed tests designed to detect effort on cognitive tests (Gervais, Ben-Porath, Wygant, & Green, 2007).

The initial validation research for the revised RBS (Gervais, Ben-Porath, Wygant, & Green, 2007) involved mostly non–head-injured disability claimants. This research suggested that RBS was more accurate in detecting inadequate effort on tests of cognitive functioning than the MMPI-2 F-family scales that assess psychopathology in this type of forensic disability assessment setting. This initial research also found that cutoffs of ≥16 and ≥17 produced specificities of 0.89 and 0.95, respectively, in classifying failure on the Medical Symptom Validity Test (MSVT; Green, 2004), Word Memory Test (WMT; Green, 2003), or both. Three studies have examined cutoffs using criteria developed by Slick, Sherman, and Iverson (1999). Peck et al. (2013) found a cutoff of ≥16 was adequate in their valid TBI group, which is the same as the cutoff that Schroeder et al. (2012) found comparing their TBI patients who met or failed Slick and coworkers criteria. Tarescavage, Wygant, Gervais, and Ben-Porath (2013) found a T-score of ≥100 (raw score ≥ 17) comparing their sample of non–head-injured incentive only and PM/DM group.

In samples using military veterans, similar or just slightly higher cutoffs have been found. Whitney, Davis, Shepard, and Herman (2008) found a cutoff of ≥17 was adequate (specificity = 0.92) in predicting failure on the TOMM in a mixed (TBI, neurologic disorders, psychiatric disorders) Veterans Administration clinical sample. Whitney (2013), using a larger sample from her practice, found a similar cutoff of ≥18 predicted failure on the TOMM and MSVT at specificities of 0.93 and 0.94, respectively. Young, Kearns, and Roper (2011) found a cutoff of 19 was needed to obtain acceptable specificity (0.91) in distinguishing pass–fail status on the WMT in a sample of military veterans in recent, current, or upcoming compensation evaluations.

Other research using non-military samples has found slightly lower cutoff scores. Wygant et al. (2010) found a cutoff of ≥15 (T = 90; specificity = 0.91) was adequate in classifying disability claimants who failed either the WMT or TOMM vs. a group of claimants who passed both PVTs. Dionysus, Denney, and Halfaker (2011) found that similar RBS cutoffs of ≥14 or ≥15 both produced a specificity of 0.93, but the ≥14 cutoff had better sensitivity. Using criterion groups based on litigation status (not validity tests), Tsushima, Geling, and Fabrigas (2011) found a cutoff of ≥13 was necessary to obtain a specificity of at least 0.90. The results of their receiver operating characteristic and area under the curve analysis indicated that RBS outperformed F, FP, FBS, and HHI in identifying the litigating patients.

The MMPI-2-RF technical manual (Tellegen & Ben-Porath, 2008) indicates that Fs consists of 16 items that were endorsed by 25% or less of women and men in several large samples of medical patients and a chronic pain patient sample. The total sample included over 55,000 patients. It is described as a scale that assesses over-reporting of somatic symptoms (Ben-Porath & Tellegen, 2008). However, it has been shown to be as sensitive to memory complaints as FBS-r (Gervais, Ben-Porath, Wygant, Green, & Sellbom, 2010). The technical manual suggests that T-scores in the range of 80–99 (raw score range 5–7) indicate possible over-reporting of somatic problems, and scores ≥100 (raw score ≥ 8) indicate over-reporting of somatic problems and possible invalidity of scores on the MMPI-2-RF Somatic Scales. Tarescavage, Wygant, Gervais, and Ben-Porath (2013) found that a T-score of 100 (raw score ≥ 8) had acceptable specificity when comparing their incentive only and PM/DM group. Schroeder et al. (2012) found a slightly lower raw score of ≥6 was sufficient to differentiate their TBI patients.

HHI is a subset of items empirically derived from FBS and the pseudoneurologic scale (PNS) developed by Shaw and Matthews (1965). The items on the HHI were derived by comparing personal injury litigants and disability claimants (79% with mTBI; 5% with moderate-to-severe TBI) who met Slick, Sherman, and Iverson (1999) criteria for PM with non-litigating mTBI (85%) and moderate-to-severe (15%) head-injured controls (Henry, Heilbronner, Mittenberg, & Enders, 2006). As a result of their research comparing the FBS, PNS, and HHI, Henry and coworkers concluded that HHI, which they termed a “pseudosomatic factor,” was a purer measure of “somatic malingering” than the FBS and PNS. They found that a cutoff of ≥8 had an acceptable classification accuracy of 0.86 with a specificity of 0.89. A cutoff of 9 produced a specificity of 0.95, and a score ≥ 13 was associated with a positive predictive value (PPV) of 100%.

Subsequent HHI research suggests that higher cutoffs may be needed to obtain acceptable specificity. Dionysus, Denney, and Halfaker (2011), also using Slick, Sherman, and Iverson (1999) criteria and a sample of head-injured patients, found a cutoff of ≥12 (specificity = 0.93) was needed to differentiate a group of probable to definite malingerers and a group of head-injured patients who passed validity tests and did not meet Slick and coworkers criteria for malingering. Other research using less stringent criteria to form groups has found similar or slightly higher cutoffs. Tsushima, Geling, and Fabrigas (2011) found a cutoff of ≥11 was needed to differentiate litigating from clinical patients to obtain a specificity of 0.90. Young, Kearns, and Roper (2011) found a cutoff of ≥14 was necessary to differentiate a group of compensation-seeking vs. non–compensation-seeking military veterans at a specificity of 0.85. They found a nonsignificant relationship between HHI and the WMT and did not calculate cutoff scores for HHI. The relatively higher cutoff in differentiating the compensation-seeking and non–compensation-seeking patients may be because more participants in the non–compensation-seeking group (53.4%) failed the WMT than did participants in the compensation-seeking group (46.5%; Heilbronner & Henry, 2013). In addition, almost all individuals in a Veterans Administration sample, as in an active duty military sample, have potential incentive to malinger. Thus, the rate of failure on the WMT in both groups as well as potential incentive to malinger could have, as Heilbronner and Henry point out, artificially inflated the HHI cutoffs needed to obtain acceptable specificity. However, Whitney (2013) found that the same cutoff of ≥14 was necessary to obtain specificities of 0.95 and 0.96 to differentiate those passing or failing the Test of Memory Malingering (TOMM; Tombaugh, 1996) and MSVT (Green, 2004).
Earlier research by Whitney, Davis, Shepard, and Herman (2008) using a smaller sample in the same setting found a similar cutoff of ≥13 was needed to differentiate those who passed or failed the TOMM (specificity = 0.92). Henry, Heilbronner, Algina, and Kaya (2013) found that a cutoff of ≥7 for HHI-r produced a sensitivity of 0.69 and a specificity of 0.93 in differentiating personal injury and disability litigants vs. non-litigating head-injured patients. This initial research suggests the HHI-r may have some promise.

Materials and Methods

Participants

The sample used in this research is the same as used by Jones (2013a, 2013b), and the methodology in the current research parallels that used in the previous research. All participants were active duty military members who were consecutive referrals to the author and evaluated in the brain injury medicine or neuropsychology services at two army medical centers. The participants completed a neuropsychological evaluation that included the MMPI-2 and at least one PVT. Participants with TRIN or VRIN ≥80 were excluded. A cannot say scale (CNS) > 18 was also used as an exclusion criterion; however, no one had a CNS score >17. The initial sample consisted of 495 participants; after exclusion criteria for the MMPI-2 were applied, the final sample consisted of 462 participants.

After the non-malingering (NM) and three malingering groups (PM, PDM, and DM) were formed (groups are described in the following), the final sample consisted of 300 participants. There were 145 participants in the NM group and 155 in the three malingering groups. The sample was 82.7% men; the mean age was 31.6 years (SD = 9.0), and the mean education level was 13.1 years (SD = 2.0). The ethnic distribution was 72.6% Caucasian, 16.7% Afro-American, 10.3% Hispanic, and 0.4% was Asian or other. There were no significant differences in age or education or in the distribution of gender or ethnicity across the comparison groups.

The majority of the participants were evaluated for closed head injuries, blast exposure, and heat injuries, or some combination of these injuries. Because of the retrospective nature of the data, the exact distribution of the severity and nature of head injury was not available for the final sample. However, data were available for about two-thirds of the initial sample but unfortunately not linked to individual participants. About 90% of the sample experienced closed head, blast, or heat injuries or some combination of the three. Approximately 75% of these injuries were estimated to be mild, 19% were moderate, and 6% were severe. About 10% of the sample was evaluated for brain disease (e.g., multiple sclerosis, epilepsy, Huntington's disease). Criteria used to judge the severity of TBI were based in large part on the Department of Veterans Affairs/Department of Defense consensus-based classification of closed TBI severity (Department of Veteran Affairs, 2009). However, information for each criterion was not always available (e.g., Glasgow Coma Scale ratings and brain-imaging studies at time of injury), so the severity ratings were primarily based on criteria related to length of loss of consciousness, length of alteration of consciousness (e.g., feeling dazed, disoriented, confused, or difficulty mentally tracking events), and length of post-traumatic amnesia.

PVT Cutoff Scores

Five PVTs were used for this research to establish comparison groups. The PVTs included two embedded and three freestanding tests. The freestanding tests included the Victoria Symptom Validity Test (VSVT; Slick, Hopp, Strauss, & Thompson, 1997), the TOMM (Tombaugh, 1996), and the WMT (Green, 2003). The embedded PVTs included the Effort Index (EI) for the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS; Randolph, 1998) and Reliable Digit Span (RDS; Babikian, Boone, Lu, & Arnold, 2006). Greater detail concerning the characteristics of the validity tests, such as the nature of the stimuli and administration procedures, is not presented here in the interest of maintaining test security and deterring coaching. The stand-alone PVTs were administered and scored by computer using standard instructions. Cutoffs for determining failure on the WMT were based on the test manual, i.e., ≤82.5% on IR, DR, or CNS. The RBANS EI was calculated by procedures described by Silverberg, Wertheimer, and Fichtenberg (2007). The cutoff for determining failure on the RBANS EI was based on the research by Armistead-Jehle and Hansen (2011) for a military sample. The cutoff used (EI1) had the highest sensitivity while maintaining a very low false-positive rate. A cutoff of ≤7 was used for RDS (Babikian et al.; Greiffenstein, Baker, & Gola, 1994; Jasinski, Berry, Shandera, & Clark, 2011). For the TOMM, the cutoffs used were ≤43 for Trial 1 and ≤49 for the other two trials. The cutoffs all had PPVs of 0.90 or greater for a base rate of 0.40 for the PM, PDM, and DM groups used in the research by Jones (2013a). That research suggested a base rate of 0.41 for PM, based on failure of two or more validity tests, in the military sample it used. The use of these nonstandard cutoffs for the TOMM is consistent with the findings of Stenclik, Miele, Silk-Eglit, Lynch, and McCaffrey (2013).
They concluded that their research supported the use of a cutoff of ≤39 for Trial 1 and a cutoff of <49 for Trial 2 and the Retention Trial in a sample of mTBI patients. Cutoffs used for the VSVT for the current research were ≤20, ≤18, and ≤41 for the Easy, Hard, and Total Scores, respectively. Research by Jones (2013b) indicated these cutoffs for the Hard and Total scores produced PPVs of at least 0.90 at a base rate of 0.40 for the PM and PDM groups. The cutoff for the Easy items for the PM group had a PPV of 0.88. Cutoffs of <18 and <41 for the Hard and Total scores, respectively, were found to have the best classification accuracy for mTBI patients in the recent research by Silk-Eglit, Lynch, and McCaffrey (2016). In general, the research cited earlier by Stenclik and coworkers and Silk-Eglit and coworkers, as well as the cutoffs established in the Jones research on the TOMM and VSVT in a predominantly mTBI sample and used in the current research, support the use of nonstandard cutoffs. They are also consistent with cutoffs established in other research, as reviewed in the Jones research, using a variety of other samples (e.g., Grote et al., 2000; Greve, Bianchini, & Doane, 2006; Greve et al., 2006; Macciocchi, Seel, Alderson, & Godsall, 2006; Loring, Larrabee, Lee, & Meador, 2007; Greve, Etherton, Ord, Bianchini, & Curtis, 2009).

Comparison Groups

Four groups were used to establish cutoff scores for the MMPI-2 and MMPI-2-RF C-S SVTs. The PM group was based on failure of exactly two PVTs. The PDM group was based on failure of three or more PVTs, and the DM group was composed of participants whose performance on the VSVT was significantly below chance. All participants who performed significantly below chance on the WMT or TOMM also scored significantly below chance on the VSVT; three participants scored below chance on the WMT and four on the TOMM. The fourth malingering group included all participants in the PM, PDM, and DM groups (combined malingering group). The group thought not to be malingering (NM group) failed no PVTs administered to them and were administered at least two PVTs.
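The grouping rules described above can be sketched as a small decision function. This is an illustrative reading only: the function name, its arguments, and the assumption that below-chance VSVT performance takes precedence over the PVT failure count are not stated in this form in the text.

```python
from typing import Optional


def classify_group(n_administered: int, n_failed: int,
                   vsvt_below_chance: bool) -> Optional[str]:
    """Assign a comparison group from PVT results (hypothetical helper).

    Assumption: a significantly below-chance VSVT score places a participant
    in DM regardless of how many PVTs were failed.
    """
    if vsvt_below_chance:
        return "DM"    # definite malingering
    if n_failed >= 3:
        return "PDM"   # probable to definite malingering
    if n_failed == 2:
        return "PM"    # probable malingering: exactly two failures
    if n_failed == 0 and n_administered >= 2:
        return "NM"    # non-malingering: no failures on at least two PVTs
    return None        # e.g., one failure, or fewer than two PVTs given


# The combined malingering (CM) group is the union of PM, PDM, and DM.
def in_combined_group(group: Optional[str]) -> bool:
    return group in {"PM", "PDM", "DM"}
```

Under this reading, a participant with exactly one PVT failure, or with no failures but only one PVT administered, falls into none of the comparison groups.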

The formation of the PM, PDM, and DM groups was based in large part on the simplification and refinements suggested by Larrabee, Greiffenstein, Greve, and Bianchini (2007) and Larrabee (2008) of the Slick, Sherman, and Iverson (1999) criteria that have been widely used to diagnose MND. The Slick and coworkers criteria use a multidimensional and multimethod approach including evidence from neuropsychological testing and evidence from self-report. Larrabee and coworkers concluded that the Slick and coworkers criteria related to evidence from neuropsychological testing could be modified and simplified to allow multiple psychometric findings to define different levels of malingering regardless of other Slick and coworkers criteria. Larrabee (2008) stated that failure on two independent PVTs provides “strong” evidence for a diagnosis of PM (post-test probability = .90+), and failure on three PVTs provides “very strong” evidence of probable, if not definite, malingering. Larrabee and coworkers state that failure on three well-validated validity indicators appears to be associated with “100% probability of malingering and is statistically equivalent to definite MND…” (p. 357). However, they also indicate that although this is associated with 100% probability of malingering, scores at this level are not conceptually equivalent to the active avoidance of correct answers associated with significantly worse-than-chance performance on two-alternative forced-choice testing, which would indicate conscious intent and DM.
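To make "significantly worse-than-chance performance" on a two-alternative forced-choice test concrete, the following is a minimal sketch of a one-tailed exact binomial test. The function name, the 48-item test length, and the 0.05 threshold used in the example are illustrative assumptions, not the scoring procedure of any test used in this research.

```python
from math import comb


def below_chance_p(n_correct: int, n_items: int, p_chance: float = 0.5) -> float:
    """One-tailed exact binomial probability of scoring n_correct or fewer
    by guessing alone on a forced-choice test with n_items trials."""
    return sum(
        comb(n_items, k) * p_chance**k * (1 - p_chance) ** (n_items - k)
        for k in range(n_correct + 1)
    )


# Hypothetical example: 16 correct out of 48 two-choice items.
p = below_chance_p(16, 48)
significantly_below_chance = p < 0.05  # threshold is an assumption
```

A small probability indicates that the score is unlikely to arise from random responding, which is why such scores are taken to imply that correct answers were actively avoided.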

Of the 462 participants available after MMPI-2 exclusion criteria were applied, 395 “completed” at least two validity tests or failed the VSVT significantly below chance. Of these 395 participants, 155 “failed” at least two PVTs (or VSVT below chance) and could be used to form the three malingering groups. The PM group was composed of 83 participants, and for this group, 62 of 71 failed the VSVT, 68 of 73 failed the TOMM, 10 of 50 failed the EI, 16 of 23 failed RDS, and 10 of 12 failed the WMT. Of the 44 individuals who met criteria for inclusion in the PDM group, 42 of 43 failed the VSVT, 43 of 43 failed the TOMM, 37 of 41 failed the EI, 10 of 12 failed RDS, and 10 of 10 failed the WMT. The DM group was composed of 28 participants who failed the VSVT at significantly below chance levels. The NM group was composed of 145 participants; 100 were administered the TOMM, 119 the VSVT, 56 RDS, 12 the WMT, and the RBANS EI was scored for 78 participants.

Data Analysis

The data analysis proceeded in four steps. First, it was necessary to establish that the validity measures were independent (i.e., not redundant) to ensure the validity of the composition of the three groups composed of those thought to be malingering. Larrabee (2008) and Nelson et al. (2003) used correlational analysis across the full range of scores to establish validity test independence. However, it can be argued that for the current research the primary concern is establishing whether there is an association between passing or failing validity tests within a comparison group, not the association between the full range of scores on each validity test. This is important because, for example, the PM group was composed of participants who failed exactly two validity tests in any combination. If the two failed tests were redundant, then failing both would be equivalent to failing one test. However, when nonredundant tests are used, there is greater certainty that the individuals who were placed in the PM group were correctly classified, i.e., failed two nonredundant validity tests. This reasoning also applies to the PDM group.

To establish independence for the PVTs used in this research, a chi-square analysis was completed to assess the association between the pass–fail status for each of the five PVTs used to establish the malingering comparison groups. Of the 462 participants in the total sample, 153 failed two or more PVTs and could be used for chi-square analysis. Two of the 155 participants in the comparison groups failed only the VSVT below chance, i.e., they did not fail two PVTs. The second analysis involved comparing the means of the three comparison groups in terms of standardized units. This allowed for comparison with other research reporting effect sizes. The third analysis involved an examination of classification accuracy in terms of sensitivity and specificity for a range of cutoffs for the C-S SVTs. Classification statistics were calculated for cutoffs with specificities ≥.90, and the calculations terminated when specificity reached 1.0 or when the maximum raw score for the scale was reached. The sensitivity of a test is the proportion of people with a COI (condition of interest), in the case of this research malingering, who have a positive result (true positives). The specificity of a test is the proportion of people without the COI who have a negative result (1 − specificity = false-positive rate). Sensitivity and specificity inform us of the accuracy of a test and not the probability that someone has the COI.
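The cutoff search described in the third analysis can be sketched as follows. The function name, the toy scores, and the convention that a score at or above the cutoff counts as a positive result are illustrative assumptions; the sketch is not the analysis code used in this research.

```python
def sweep_cutoffs(nm_scores, mal_scores, max_raw, min_spec=0.90):
    """Return (cutoff, sensitivity, specificity) for each raw-score cutoff
    whose specificity is at least min_spec, stopping once specificity
    reaches 1.0 or the maximum raw score is reached."""
    rows = []
    for cutoff in range(0, max_raw + 1):
        # Specificity: proportion of the non-malingering group scoring below the cutoff.
        spec = sum(s < cutoff for s in nm_scores) / len(nm_scores)
        # Sensitivity: proportion of the malingering group scoring at or above the cutoff.
        sens = sum(s >= cutoff for s in mal_scores) / len(mal_scores)
        if spec >= min_spec:
            rows.append((cutoff, sens, spec))
            if spec == 1.0:
                break
    return rows


# Toy data (hypothetical raw scores, not from the study sample).
nm = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
mal = [8, 9, 10, 11, 12]
table = sweep_cutoffs(nm, mal, max_raw=12)
```

With these toy scores, the first cutoff reaching a specificity of 0.90 is ≥10, and the sweep stops at ≥11, where specificity reaches 1.0. Sensitivity drops as the cutoff rises, which is the trade-off the cutoff tables are meant to expose.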

The final analyses for the current research included calculation of positive predictive values (PPVs), negative predictive values (NPVs), and the likelihood ratios for positive (LR+) and negative (LR−) test results. Predictive values (PVs) and LRs are important in establishing “post-test probabilities” of having the COI. The PPV of a test is defined as the proportion of people with a positive test result who actually have the COI. The NPV of a test is the proportion of people with a negative test result who do not have the COI. PVs are dependent on the prevalence or base rate of the COI (see Crawford, Garthwaite, and Betkowska (2009) for formulae for calculations of PVs). PVs for base rates ranging from 0.10 to 0.50 are provided in the current research and provide information about the probability of malingering at a given base rate.

LR+ is defined as the probability of an individual with the COI having a positive test divided by the probability of an individual without the COI having a positive test (true-positive rate/false-positive rate, i.e., sensitivity/(1 − specificity)). For example, if 80% of patients with the COI have a positive test (true positive) and only 6% of those who do not have the COI have a positive test (false positive), then the LR+ for the ability of the test to detect the COI is approximately 13 (80%/6% ≈ 13.3). This indicates that a person with the COI is about 13 times more likely to have a positive test than a person who does not have the COI. If the probability of having a positive test were the same in those with and without the COI, then the LR would be 1, which would indicate that the test is not useful in differentiating the two groups. Individuals with a COI should be much more likely to have an abnormal test result than individuals without the COI.

The likelihood ratio for a negative test (LR−) is defined as the probability of an individual with the COI having a negative test divided by the probability of an individual without the COI having a negative test ((1 − sensitivity)/specificity). An LR− greater than 1 indicates that a negative test is more likely to occur in people with the COI than in people without the COI. An LR− less than 1 means that a negative test is less likely to occur in people with the COI compared to people without the COI (Grimes & Schulz, 2005; Bowden & Loring, 2009). An example of the calculation of post-test probabilities of a COI can be found in Jones (2013b), and online calculators are readily available to simplify the process. The Fagan (1975) nomogram, which is also readily available online or in textbooks (e.g., Sackett, Haynes, Guyatt, & Tugwell, 1991), can also be used to provide a fast and easy estimate of a post-test probability.
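The definitions above translate directly into arithmetic. The sketch below uses the standard Bayes-rule expressions for predictive values and the odds form for post-test probability; the function names are illustrative assumptions, and it is not a reproduction of the Crawford et al. (2009) formulae or of any particular online calculator.

```python
import math


def predictive_values(sensitivity, specificity, base_rate):
    """Return (PPV, NPV) for a given base rate of the condition of interest."""
    true_pos = sensitivity * base_rate
    false_pos = (1 - specificity) * (1 - base_rate)
    true_neg = specificity * (1 - base_rate)
    false_neg = (1 - sensitivity) * base_rate
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-); LR+ is infinite when there are no false positives."""
    false_pos_rate = 1 - specificity
    lr_pos = math.inf if false_pos_rate == 0 else sensitivity / false_pos_rate
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


def post_test_probability(base_rate, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    if math.isinf(likelihood_ratio):
        return 1.0
    post_odds = base_rate / (1 - base_rate) * likelihood_ratio
    return post_odds / (1 + post_odds)


# Worked example from the text: 80% true positives, 6% false positives.
lr_pos, lr_neg = likelihood_ratios(0.80, 0.94)
ppv, npv = predictive_values(0.80, 0.94, base_rate=0.40)
print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}")
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f} at a base rate of 0.40")
print(f"Post-test probability after a positive test = "
      f"{post_test_probability(0.40, lr_pos):.2f}")
```

For a positive result, the post-test probability obtained from the LR+ and the PPV at the same base rate are the same quantity, which is why the two routes agree.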

Results

The results of the chi-square analysis using failure on two or more PVTs to assess redundancy (Table 1) indicate that there are no statistically significant associations between passing or failing the PVTs used in this research for those who failed PVTs. However, some of the analyses had very small cell sizes or empty cells. In an attempt to address this problem, an additional analysis (Table 2) was completed using participants who failed one or more PVTs rather than the two or more PVTs used in the initial analysis. Although some associations were statistically significant in this second analysis, the results also indicate that PVT redundancy is not a problem. Where there are statistically significant associations (TOMM and EI, VSVT and EI, TOMM and RDS), the amount of shared variance (ϕ²) was small: 3%, 5%, and 14%, respectively. Hinkle, Wiersma, and Jurs (2003) suggest that correlations in the range of 0.00–0.30 (0–9% shared variance) indicate little if any correlation between variables and that correlations in the range of 0.30–0.50 (9–25% shared variance) indicate low correlations. Based on the two chi-square analyses, it appears that the relationship between the PVTs is minimal, and the formation of the malingering groups is valid, i.e., not based on redundant PVTs.

Table 1.

Chi-square analysis for validity test independence: two or more failed performance validity tests

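| PVT pair | χ² | ϕ | pᵃ | N |
|---|---|---|---|---|
| TOMM × VSVT | 0.347 | −0.05 | .556 | 128 |
| TOMM × EI | 0.750 | 0.09 | .387 | 99 |
| TOMM × WMT | 0.186 | −0.10 | .666 | 20 |
| TOMM × RDS | ᵇ | – | – | 20 |
| VSVT × EI | 2.37 | −0.15 | .124 | 107 |
| VSVT × WMT | 1.13 | 0.22 | .289 | 24 |
| VSVT × RDS | 0.134 | −0.09 | .714 | 15 |
| EI × WMT | 0.046 | −0.05 | .830 | 21 |
| EI × RDS | 0.444 | −0.33 | .505 | |
| WMT × RDS | 0.133 | −0.58 | .248 | |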

Note: VSVT = Victoria Symptom Validity Test, EI = Repeatable Battery for the Assessment of Neuropsychological Status Effort Index, WMT = Word Memory Test, RDS = Reliable Digit Span, TOMM = Test of Memory Malingering.

ᵃThe probability levels are based on χ². There is no difference in the pattern of significant and nonsignificant results when compared to Fisher's Exact Test.

ᵇValues could not be computed because failure on the TOMM is a constant, i.e., of the 20 participants who met inclusion criteria for this analysis, 14 failed and 6 passed RDS, and all participants failed the TOMM.

Table 2.

Chi-square analysis for validity test independence: one or more failed performance validity test

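| PVT pair | χ² | ϕ | pᵃ | N |
|---|---|---|---|---|
| TOMM × VSVT | 0.387 | −0.05 | .534 | 180 |
| TOMM × EI | 4.18 | 0.17 | .041 | 138 |
| TOMM × WMT | 0.538 | 0.14 | .463 | 26 |
| TOMM × RDS | 7.70 | −0.37 | .006 | 57 |
| VSVT × EI | 6.98 | 0.22 | .008 | 151 |
| VSVT × WMT | 0.534 | 0.12 | .465 | 36 |
| VSVT × RDS | 0.201 | −0.08 | .654 | 30 |
| EI × WMT | 0.411 | 0.12 | .521 | 28 |
| EI × RDS | 0.900 | 0.32 | .343 | |
| WMT × RDS | 1.74 | −0.47 | .187 | |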

Note: VSVT = Victoria Symptom Validity Test, EI = Repeatable Battery for the Assessment of Neuropsychological Status Effort Index, WMT = Word Memory Test, RDS = Reliable Digit Span, TOMM = Test of Memory Malingering.

ᵃThe probability levels are based on χ². There is no difference in the pattern of significant and nonsignificant results when compared to Fisher's Exact Test.
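For a 2 × 2 table, the magnitude of ϕ can be recovered from the reported χ² and N, and its square gives the proportion of shared variance discussed in the Results. The sketch below applies this standard relationship to the three statistically significant pairs in Table 2; the function name is an assumption, and the sign of ϕ cannot be recovered from χ² alone.

```python
import math


def phi_from_chi_square(chi_square, n):
    """Magnitude of the phi coefficient for a 2 x 2 table: sqrt(chi2 / N)."""
    return math.sqrt(chi_square / n)


# (chi-square, N) for the significant associations reported in Table 2.
significant_pairs = {
    ("TOMM", "EI"): (4.18, 138),
    ("VSVT", "EI"): (6.98, 151),
    ("TOMM", "RDS"): (7.70, 57),
}

for (test_a, test_b), (chi_square, n) in significant_pairs.items():
    phi = phi_from_chi_square(chi_square, n)
    shared_variance = phi ** 2  # proportion of variance shared by the two PVTs
    print(f"{test_a} x {test_b}: |phi| = {phi:.3f}, "
          f"shared variance = {shared_variance:.1%}")
```

The resulting values are consistent with the tabled ϕ coefficients and with shared variance of roughly 3%, 5%, and 14%.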

The effect size analysis (Table 3) indicates that there were large effect sizes, based on Cohen's (1988) general recommendations for classifying standardized effect sizes, for all C-S SVTs across all malingering groups. RBS performed best overall based on the mean effect size for the PM, PDM, and DM groups. Based on Ferguson's (2009) criteria for effect size (practically significant = 0.41; moderate = 1.15; strong = 2.70), only RBS had at least moderate effect sizes for all malingering groups. All scales had moderate effect sizes for the PDM and DM groups (range, 1.15–2.01) based on Ferguson's criteria.

Table 3.

Descriptive statistics and effect sizes for the probable malingering (PM), probable to definite (PDM), definite malingering (DM), and combined groups vs. non-malingering (NM) groups for the MMPI-2 and MMPI-2-RF Cognitive-Somatic Validity scales

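| Scale | NM (N = 145) M | NM SD | PM (N = 83) M | PM SD | PDM (N = 44) M | PDM SD | DM (N = 28) M | DM SD | Combinedᵃ M | Combinedᵃ SD | d: NM vs. PM | d: NM vs. PDM | d: NM vs. DM | d: NM vs. Combined | Mean dᵇ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RBS | 8.59 | 4.08 | 13.35 | 3.87 | 14.80 | 4.03 | 16.61 | 3.39 | 14.35 | 4.01 | 1.19 | 1.53 | 2.01 | 1.42 | 1.58 |
| FBS | 16.24 | 5.38 | 21.69 | 5.38 | 24.25 | 6.59 | 24.89 | 5.83 | 22.99 | 5.96 | 1.01 | 1.41 | 1.59 | 1.19 | 1.34 |
| FBS-r | 10.48 | 4.01 | 14.60 | 3.99 | 16.93 | 4.97 | 17.50 | 4.02 | 15.79 | 4.46 | 1.03 | 1.52 | 1.75 | 1.25 | 1.43 |
| Fs | 2.10 | 2.04 | 4.06 | 2.67 | 4.73 | 2.99 | 5.86 | 3.04 | 4.57 | 2.90 | 0.85 | 1.15 | 1.69 | 0.98 | 1.23 |
| HHI | 5.52 | 3.68 | 9.35 | 3.27 | 11.09 | 3.10 | 11.93 | 1.82 | 10.31 | 3.18 | 1.08 | 1.57 | 1.86 | 1.40 | 1.50 |
| HHI-r | 3.91 | 2.76 | 6.75 | 2.48 | 8.11 | 2.24 | 8.71 | 1.51 | 7.49 | 2.40 | 1.07 | 1.59 | 1.84 | 1.39 | 1.50 |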

Note: RBS = Response Bias Scale, FBS = MMPI-2 Symptom Validity Scale, FBS-r = MMPI-2-RF Symptom Validity Scale, Fs = Infrequent Somatic Responses Scale, HHI = MMPI-2 Henry–Heilbronner Index, HHI-r = MMPI-2-RF Henry–Heilbronner Index.

ᵃThis group is composed of all participants in the PM, PDM, and DM groups (N = 155).

ᵇMean effect size for the PM, PDM, and DM groups.
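The effect sizes in Table 3 can be approximated from the reported means, standard deviations, and group sizes. The sketch below assumes Cohen's d was computed with a pooled standard deviation; the source does not state which variant was used, so this is an assumption, and the function name is illustrative.

```python
import math


def cohens_d(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Cohen's d for two independent groups using the pooled standard deviation."""
    pooled_variance = ((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / (n_1 + n_2 - 2)
    return (mean_2 - mean_1) / math.sqrt(pooled_variance)


# RBS descriptive statistics from Table 3: (mean, SD, N).
rbs_non_malingering = (8.59, 4.08, 145)
rbs_malingering_groups = {
    "PM": (13.35, 3.87, 83),
    "PDM": (14.80, 4.03, 44),
    "DM": (16.61, 3.39, 28),
}

rbs_d = {
    group: cohens_d(*rbs_non_malingering, *stats)
    for group, stats in rbs_malingering_groups.items()
}
mean_d = sum(rbs_d.values()) / len(rbs_d)

for group, d in rbs_d.items():
    print(f"RBS, NM vs. {group}: d = {d:.2f}")
print(f"Mean d across PM, PDM, and DM = {mean_d:.2f}")
```

Under the pooled-SD assumption, the computed values fall within about 0.01 of the tabled RBS effect sizes; small discrepancies are expected because the inputs are themselves rounded.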

The results for cutoff scores for the C-S SVTs with a minimum specificity of 0.90 are provided in Tables 4–9. Calculation of cutoffs was terminated when specificity reached 1.0 or when the maximum score on the scale was reached. These tables provide statistics related to test accuracy, PVs, and LRs. It should be noted that a value of ∞ appears in the LR column for some cutoffs. This is because the LR is undefined, and is reported as ∞, when the denominator in its calculation is zero. For example, for the cutoff of ≥21 for RBS in the Definite Malingering group (Table 4), LR+ = Sensitivity/(1 − Specificity), which results in LR+ = 0.14/(1 − 1) or LR+ = 0.14/0. It should also be noted that the calculation of LRs was based on sensitivities and specificities to five decimal places, not the rounded values provided in these tables. Sensitivities and specificities were calculated at three decimal places. Sensitivities, specificities, and PPVs will be discussed here, and LRs will be explored in the Discussion section. PPVs discussed in this section will be based primarily on a base rate of 0.40, which is similar to estimates of malingering found in litigating TBI patients and in military and other samples (Mittenberg, Patton, Canyock, & Condit, 2002; Larrabee, 2007; Larrabee, Millis, & Meyers, 2009; Jones, 2013b).
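The two notes above, about ∞ and about rounding, can be illustrated with one row of Table 4. The counts in the sketch below are not reported in the source; they are assumptions reconstructed from the rounded sensitivity and specificity for RBS ≥ 15 in the Definite Malingering comparison (N = 28 and N = 145) and are used only to show why an LR computed from the tabled two-decimal values differs slightly from the tabled LR.

```python
import math


def lr_positive(sensitivity, specificity):
    """LR+ = sensitivity / (1 - specificity); infinite when specificity is 1."""
    false_pos_rate = 1 - specificity
    return math.inf if false_pos_rate == 0 else sensitivity / false_pos_rate


# Assumed counts (not reported in the source), chosen to match the rounded
# values 0.71 and 0.93 for RBS >= 15 in the Definite Malingering comparison.
true_positives, n_definite_malingering = 20, 28
true_negatives, n_non_malingering = 135, 145

lr_unrounded = lr_positive(
    true_positives / n_definite_malingering,
    true_negatives / n_non_malingering,
)
lr_rounded = lr_positive(0.71, 0.93)

print(f"LR+ from unrounded proportions: {lr_unrounded:.2f}")
print(f"LR+ from two-decimal values:    {lr_rounded:.2f}")

# With no false positives (specificity = 1.0), LR+ has a zero denominator.
print(f"LR+ at specificity 1.0: {lr_positive(0.14, 1.0)}")
```

Under these assumed counts, the unrounded calculation rounds to the tabled LR+ of 10.4, whereas the two-decimal inputs give a somewhat lower value, so readers recomputing LRs from the tables should expect small differences of this kind.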

Table 4.

Sensitivity, specificity, positive/negative predictive value (PPV/NPV), and likelihood ratios for cutoff scores for the Response Bias Scale (RBS)

| Cutoffᵃ | Sensitivity (95% CI) | Specificity (95% CI) | PPV, base rate 0.10 | PPV, 0.20 | PPV, 0.30 | PPV, 0.40 | PPV, 0.50 | NPV, base rate 0.10 | NPV, 0.20 | NPV, 0.30 | NPV, 0.40 | NPV, 0.50 | LR+ (95% CI) | LR− (95% CI) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Definite Malingering | | | | | | | | | | | | | | |
| RBS ≥ 15 | 0.71 (0.51–0.86) | 0.93 (0.87–0.96) | 0.54 | 0.72 | 0.82 | 0.87 | 0.91 | 0.97 | 0.93 | 0.88 | 0.83 | 0.77 | 10.4 (5.4–19.7) | 0.31 (0.17–0.55) |
| RBS ≥ 16 | 0.57 (0.37–0.75) | 0.94 (0.89–0.97) | 0.54 | 0.72 | 0.82 | 0.87 | 0.91 | 0.95 | 0.90 | 0.84 | 0.77 | 0.69 | 10.4 (4.9–21.8) | 0.45 (0.30–0.70) |
| RBS ≥ 17 | 0.43 (0.25–0.63) | 0.96 (0.91–0.98) | 0.54 | 0.72 | 0.82 | 0.87 | 0.91 | 0.94 | 0.87 | 0.80 | 0.72 | 0.63 | 10.4 (4.2–25.3) | 0.60 (0.43–0.82) |
| RBS ≥ 18 | 0.39 (0.22–0.59) | 0.96 (0.91–0.98) | 0.51 | 0.70 | 0.80 | 0.86 | 0.90 | 0.93 | 0.86 | 0.79 | 0.70 | 0.61 | 9.5 (3.8–23.6) | 0.63 (0.47–0.85) |
| RBS ≥ 19 | 0.25 (0.11–0.45) | 0.97 (0.93–0.99) | 0.50 | 0.69 | 0.80 | 0.86 | 0.90 | 0.92 | 0.84 | 0.75 | 0.66 | 0.56 | 9.1 (2.8–28.9) | 0.77 (0.62–0.96) |
| RBS ≥ 20 | 0.25 (0.11–0.45) | 0.98 (0.93–0.99) | 0.57 | 0.75 | 0.84 | 0.89 | 0.92 | 0.92 | 0.84 | 0.75 | 0.66 | 0.57 | 12.1 (3.3–43.9) | 0.77 (0.62–0.95) |
| RBS ≥ 21 | 0.14 (0.05–0.34) | 1.0 (0.97–1.0) | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.91 | 0.82 | 0.73 | 0.64 | 0.54 | ∞ | 0.86 (0.74–1.0) |
| Probable to Definite Malingering | | | | | | | | | | | | | | |
| RBS ≥ 15 | 0.55 (0.39–0.69) | 0.93 (0.87–0.96) | 0.47 | 0.66 | 0.77 | 0.84 | 0.89 | 0.95 | 0.89 | 0.83 | 0.75 | 0.67 | 7.9 (4.1–15.2) | 0.49 (0.35–0.68) |
| RBS ≥ 16 | 0.50 (0.35–0.65) | 0.94 (0.89–0.97) | 0.50 | 0.69 | 0.80 | 0.86 | 0.90 | 0.94 | 0.88 | 0.82 | 0.74 | 0.65 | 9.0 (4.3–18.9) | 0.53 (0.39–0.71) |
| RBS ≥ 17 | 0.34 (0.21–0.50) | 0.96 (0.91–0.98) | 0.48 | 0.67 | 0.78 | 0.85 | 0.89 | 0.93 | 0.85 | 0.77 | 0.69 | 0.59 | 8.2 (3.4–20.0) | 0.69 (0.56–0.85) |
| RBS ≥ 18 | 0.23 (0.12–0.38) | 0.96 (0.91–0.98) | 0.38 | 0.58 | 0.70 | 0.79 | 0.85 | 0.92 | 0.83 | 0.74 | 0.65 | 0.55 | 5.5 (2.1–14.3) | 0.81 (0.69–0.95) |
| RBS ≥ 19 | 0.14 (0.06–0.28) | 0.97 (0.93–0.99) | 0.35 | 0.55 | 0.68 | 0.77 | 0.83 | 0.91 | 0.82 | 0.72 | 0.63 | 0.53 | 4.9 (1.5–16.7) | 0.89 (0.79–1.0) |
| RBS ≥ 20 | 0.09 (0.03–0.23) | 0.98 (0.94–0.99) | 0.33 | 0.52 | 0.65 | 0.75 | 0.81 | 0.91 | 0.81 | 0.72 | 0.62 | 0.52 | 4.4 (1.0–18.9) | 0.93 (0.85–1.0) |
| RBS ≥ 21 | 0.07 (0.02–0.20) | 1.0 (0.97–1.0) | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.91 | 0.81 | 0.71 | 0.62 | 0.52 | ∞ | 0.93 (0.86–1.0) |
| Probable Malingering | | | | | | | | | | | | | | |
| RBS ≥ 15 | 0.43 (0.33–0.55) | 0.93 (0.87–0.96) | 0.41 | 0.61 | 0.73 | 0.81 | 0.86 | 0.94 | 0.87 | 0.79 | 0.71 | 0.62 | 6.3 (3.3–12.0) | 0.61 (0.50–0.73) |
| RBS ≥ 16 | 0.33 (0.23–0.44) | 0.94 (0.89–0.97) | 0.40 | 0.60 | 0.72 | 0.80 | 0.85 | 0.93 | 0.85 | 0.77 | 0.68 | 0.58 | 5.9 (2.8–12.4) | 0.71 (0.61–0.83) |
| RBS ≥ 17 | 0.20 (0.13–0.31) | 0.96 (0.91–0.98) | 0.35 | 0.55 | 0.68 | 0.77 | 0.83 | 0.92 | 0.83 | 0.74 | 0.64 | 0.55 | 4.9 (2.0–12.1) | 0.83 (0.74–0.93) |
| RBS ≥ 18 | 0.14 (0.08–0.24) | 0.96 (0.91–0.98) | 0.28 | 0.47 | 0.60 | 0.70 | 0.78 | 0.91 | 0.82 | 0.72 | 0.63 | 0.53 | 3.5 (1.4–9.0) | 0.89 (0.82–0.98) |
| RBS ≥ 19 | 0.08 (0.04–0.17) | 0.97 (0.93–0.99) | 0.25 | 0.43 | 0.57 | 0.67 | 0.75 | 0.91 | 0.81 | 0.71 | 0.61 | 0.52 | 3.1 (0.9–10.1) | 0.94 (0.88–1.0) |
| RBS ≥ 20 | 0.04 (0.01–0.11) | 0.98 (0.94–0.99) | 0.16 | 0.30 | 0.43 | 0.54 | 0.64 | 0.90 | 0.80 | 0.70 | 0.60 | 0.50 | 1.7 (0.36–8.5) | 0.98 (0.94–1.0) |
| RBS ≥ 21 | 0.01 (0.00–0.07) | 1.0 (0.97–1.0) | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.90 | 0.80 | 0.70 | 0.60 | 0.50 | ∞ | 0.98 (0.96–1.0) |
| Combined Malingering Groups | | | | | | | | | | | | | | |
| RBS ≥ 15 | 0.52 (0.43–0.60) | 0.93 (0.87–0.96) | 0.45 | 0.65 | 0.76 | 0.83 | 0.88 | 0.95 | 0.88 | 0.82 | 0.74 | 0.66 | 7.5 (4.0–13.9) | 0.52 (0.44–0.61) |
| RBS ≥ 16 | 0.42 (0.34–0.50) | 0.94 (0.89–0.97) | 0.45 | 0.65 | 0.76 | 0.83 | 0.88 | 0.94 | 0.87 | 0.79 | 0.71 | 0.62 | 7.6 (3.8–15.3) | 0.61 (0.54–0.70) |
| RBS ≥ 17 | 0.28 (0.22–0.36) | 0.96 (0.91–0.98) | 0.43 | 0.63 | 0.75 | 0.82 | 0.87 | 0.92 | 0.84 | 0.76 | 0.67 | 0.57 | 6.9 (3.0–15.6) | 0.75 (0.68–0.83) |
| RBS ≥ 18 | 0.21 (0.15–0.29) | 0.96 (0.91–0.98) | 0.37 | 0.57 | 0.69 | 0.78 | 0.84 | 0.92 | 0.83 | 0.74 | 0.65 | 0.55 | 5.1 (2.2–11.9) | 0.82 (0.76–0.89) |
| RBS ≥ 19 | 0.13 (0.08–0.19) | 0.97 (0.93–0.99) | 0.34 | 0.54 | 0.66 | 0.75 | 0.82 | 0.91 | 0.82 | 0.72 | 0.63 | 0.53 | 4.7 (1.6–13.4) | 0.90 (0.84–0.95) |
| RBS ≥ 20 | 0.09 (0.05–0.15) | 0.98 (0.94–0.99) | 0.32 | 0.52 | 0.65 | 0.74 | 0.81 | 0.91 | 0.81 | 0.72 | 0.62 | 0.52 | 4.4 (1.3–14.9) | 0.93 (0.88–0.98) |
| RBS ≥ 21 | 0.05 (0.02–0.10) | 1.0 (0.97–1.0) | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.90 | 0.81 | 0.71 | 0.61 | 0.51 | ∞ | 0.95 (0.91–0.98) |

ᵃT-score equivalents for raw scores: 15 = 92, 16 = 97, 17 = 101, 18 = 105, 19 = 109, 20 = 114, 21 = 118.

Table 5.

Sensitivity, specificity, positive/negative predictive value (PPV/NPV), and likelihood ratios for cutoff scores for the MMPI-2 Symptom Validity Scale (FBS)

| Cutoff | Sensitivity (95% CI) | Specificity (95% CI) | PPV, base rate 0.10 | PPV, 0.20 | PPV, 0.30 | PPV, 0.40 | PPV, 0.50 | NPV, base rate 0.10 | NPV, 0.20 | NPV, 0.30 | NPV, 0.40 | NPV, 0.50 | LR+ (95% CI) | LR− (95% CI) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Definite Malingering | | | | | | | | | | | | | | |
| FBS ≥ 25 | 0.43 (0.25–0.63) | 0.92 (0.87–0.96) | 0.39 | 0.59 | 0.71 | 0.79 | 0.85 | 0.94 | 0.87 | 0.79 | 0.71 | 0.62 | 5.6 (2.8–11.5) | 0.62 (0.45–0.85) |
| FBS ≥ 26 | 0.39 (0.22–0.59) | 0.97 (0.92–0.99) | 0.56 | 0.74 | 0.83 | 0.88 | 0.92 | 0.93 | 0.86 | 0.79 | 0.70 | 0.61 | 11.4 (4.3–30.3) | 0.63 (0.47–0.85) |
| FBS ≥ 27/28 | 0.32 (0.17–0.52) | 0.99 (0.95–1.0) | 0.72 | 0.85 | 0.91 | 0.94 | 0.96 | 0.93 | 0.85 | 0.77 | 0.69 | 0.59 | 23.3 (5.3–102.1) | 0.69 (0.53–0.89) |
| FBS ≥ 29 | 0.32 (0.17–0.52) | 0.99 (0.96–1.0) | 0.84 | 0.92 | 0.95 | 0.97 | 0.98 | 0.93 | 0.85 | 0.77 | 0.69 | 0.59 | 46.6 (6.1–353.4) | 0.68 (0.53–0.88) |
| FBS ≥ 30 | 0.18 (0.07–0.38) | 0.99 (0.96–1.0) | 0.74 | 0.87 | 0.92 | 0.95 | 0.96 | 0.92 | 0.83 | 0.74 | 0.64 | 0.55 | 25.9 (3.1–213.3) | 0.83 (0.70–0.98) |
| FBS ≥ 31 | 0.18 (0.07–0.38) | 1.0 (0.97–1.0) | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.92 | 0.83 | 0.74 | 0.65 | 0.55 | ∞ | 0.82 (0.69–0.98) |
| Probable to Definite Malingering | | | | | | | | | | | | | | |
| FBS ≥ 25 | 0.52 (0.37–0.67) | 0.92 (0.87–0.96) | 0.43 | 0.63 | 0.75 | 0.82 | 0.87 | 0.95 | 0.89 | 0.82 | 0.74 | 0.66 | 6.9 (3.7–13.0) | 0.52 (0.38–0.70) |
| FBS ≥ 26 | 0.50 (0.35–0.65) | 0.97 (0.92–0.99) | 0.62 | 0.78 | 0.86 | 0.91 | 0.94 | 0.95 | 0.89 | 0.82 | 0.74 | 0.66 | 14.5 (5.8–36.0) | 0.52 (0.39–0.70) |
| FBS ≥ 27 | 0.41 (0.27–0.57) | 0.99 (0.95–1.0) | 0.77 | 0.88 | 0.93 | 0.95 | 0.97 | 0.94 | 0.87 | 0.80 | 0.71 | 0.63 | 29.7 (7.2–122.9) | 0.60 (0.47–0.77) |
| FBS ≥ 28 | 0.34 (0.21–0.50) | 0.99 (0.95–1.0) | 0.73 | 0.86 | 0.91 | 0.94 | 0.96 | 0.93 | 0.86 | 0.78 | 0.69 | 0.60 | 24.7 (5.9–103.9) | 0.67 (0.54–0.83) |
| FBS ≥ 29 | 0.27 (0.15–0.43) | 0.99 (0.96–1.0) | 0.81 | 0.91 | 0.94 | 0.96 | 0.98 | 0.92 | 0.85 | 0.76 | 0.67 | 0.58 | 39.5 (5.3–295.7) | 0.73 (0.61–0.88) |
| FBS ≥ 30 | 0.23 (0.12–0.38) | 0.99 (0.96–1.0) | 0.79 | 0.89 | 0.93 | 0.96 | 0.97 | 0.92 | 0.84 | 0.75 | 0.66 | 0.56 | 33.0 (4.3–250.4) | 0.78 (0.66–0.91) |
| FBS ≥ 31 | 0.20 (0.10–0.36) | 1.0 (0.97–1.0) | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.92 | 0.83 | 0.75 | 0.65 | 0.56 | ∞ | 0.80 (0.68–0.92) |
| Probable Malingering | | | | | | | | | | | | | | |
| FBS ≥ 25 | 0.27 (0.18–0.38) | 0.92 (0.87–0.96) | 0.28 | 0.47 | 0.60 | 0.70 | 0.78 | 0.92 | 0.83 | 0.75 | 0.65 | 0.56 | 3.5 (1.8–6.8) | 0.80 (0.70–0.91) |
| FBS ≥ 26 | 0.23 (0.15–0.34) | 0.97 (0.92–0.99) | 0.42 | 0.62 | 0.74 | 0.82 | 0.87 | 0.92 | 0.83 | 0.75 | 0.65 | 0.56 | 6.6 (2.6–17.1) | 0.80 (0.71–0.90) |
| FBS ≥ 27 | 0.23 (0.15–0.34) | 0.99 (0.95–1.0) | 0.65 | 0.81 | 0.88 | 0.92 | 0.94 | 0.92 | 0.84 | 0.75 | 0.66 | 0.56 | 16.6 (4.0–69.5) | 0.78 (0.70–0.88) |
| FBS ≥ 28 | 0.17 (0.10–0.27) | 0.99 (0.95–1.0) | 0.58 | 0.75 | 0.84 | 0.89 | 0.92 | 0.91 | 0.83 | 0.73 | 0.64 | 0.54 | 12.2 (2.8–52.5) | 0.84 (0.76–0.93) |
| FBS ≥ 29 | 0.16 (0.09–0.26) | 0.99 (0.96–1.0) | 0.72 | 0.85 | 0.91 | 0.94 | 0.96 | 0.91 | 0.82 | 0.73 | 0.64 | 0.54 | 27.7 (3.0–170.5) | 0.85 (0.77–0.93) |
| FBS ≥ 30 | 0.08 (0.04–0.17) | 0.99 (0.96–1.0) | 0.58 | 0.75 | 0.84 | 0.89 | 0.92 | 0.91 | 0.81 | 0.72 | 0.62 | 0.52 | 12.2 (1.5–97.7) | 0.92 (0.86–0.98) |
| FBS ≥ 31 | 0.04 (0.01–0.11) | 1.0 (0.97–1.0) | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.90 | 0.81 | 0.71 | 0.61 | 0.51 | ∞ | 0.96 (0.92–1.0) |
| Combined Malingering Groups | | | | | | | | | | | | | | |
| FBS ≥ 25 | 0.37 (0.29–0.45) | 0.92 (0.87–0.96) | 0.35 | 0.55 | 0.67 | 0.76 | 0.83 | 0.93 | 0.85 | 0.77 | 0.69 | 0.59 | 4.8 (2.6–8.9) | 0.68 (0.61–0.77) |
| FBS ≥ 26 | 0.34 (0.26–0.42) | 0.97 (0.92–0.99) | 0.52 | 0.71 | 0.81 | 0.87 | 0.91 | 0.93 | 0.85 | 0.77 | 0.69 | 0.59 | 9.7 (4.0–23.7) | 0.69 (0.62–0.77) |
| FBS ≥ 27 | 0.30 (0.23–0.38) | 0.99 (0.95–1.0) | 0.70 | 0.84 | 0.90 | 0.93 | 0.95 | 0.93 | 0.85 | 0.77 | 0.68 | 0.58 | 21.5 (5.3–87.0) | 0.71 (0.64–0.79) |
| FBS ≥ 28 | 0.25 (0.18–0.32) | 0.99 (0.95–1.0) | 0.66 | 0.81 | 0.88 | 0.92 | 0.95 | 0.92 | 0.84 | 0.75 | 0.66 | 0.57 | 17.8 (4.4–72.3) | 0.77 (0.70–0.84) |
| FBS ≥ 29 | 0.22 (0.16–0.29) | 0.99 (0.96–1.0) | 0.78 | 0.89 | 0.93 | 0.95 | 0.97 | 0.92 | 0.84 | 0.75 | 0.66 | 0.56 | 31.8 (4.4–229.4) | 0.79 (0.72–0.85) |
| FBS ≥ 30 | 0.14 (0.09–0.21) | 0.99 (0.96–1.0) | 0.69 | 0.84 | 0.90 | 0.93 | 0.95 | 0.91 | 0.82 | 0.73 | 0.63 | 0.54 | 20.6 (2.8–150.7) | 0.86 (0.81–0.92) |
| FBS ≥ 31 | 0.11 (0.07–0.17) | 1.0 (0.97–1.0) | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.91 | 0.82 | 0.72 | 0.63 | 0.53 | ∞ | 0.89 (0.84–0.94) |

Table 6.

Sensitivity, specificity, positive/negative predictive value (PPV/NPV), and likelihood ratios for cutoff scores for the MMPI-2-RF Symptom Validity Scale (FBS-r)

| Cutoffᵃ | Sensitivity (95% CI) | Specificity (95% CI) | PPV, base rate 0.10 | PPV, 0.20 | PPV, 0.30 | PPV, 0.40 | PPV, 0.50 | NPV, base rate 0.10 | NPV, 0.20 | NPV, 0.30 | NPV, 0.40 | NPV, 0.50 | LR+ (95% CI) | LR− (95% CI) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Definite Malingering | | | | | | | | | | | | | | |
| FBS-r ≥ 17 | 0.57 (0.37–0.75) | 0.92 (0.86–0.95) | 0.43 | 0.63 | 0.75 | 0.82 | 0.87 | 0.95 | 0.90 | 0.83 | 0.76 | 0.68 | 6.9 (3.7–13.0) | 0.47 (0.30–0.72) |
| FBS-r ≥ 18 | 0.39 (0.22–0.59) | 0.96 (0.91–0.98) | 0.51 | 0.70 | 0.80 | 0.86 | 0.90 | 0.93 | 0.86 | 0.79 | 0.70 | 0.61 | 9.5 (3.8–23.6) | 0.63 (0.47–0.85) |
| FBS-r ≥ 19 | 0.29 (0.14–0.49) | 0.97 (0.93–0.99) | 0.53 | 0.72 | 0.81 | 0.87 | 0.91 | 0.92 | 0.84 | 0.76 | 0.67 | 0.58 | 10.4 (3.3–32.1) | 0.73 (0.58–0.93) |
| FBS-r ≥ 20 | 0.29 (0.14–0.49) | 0.98 (0.94–0.99) | 0.60 | 0.77 | 0.85 | 0.90 | 0.93 | 0.93 | 0.85 | 0.76 | 0.67 | 0.58 | 13.8 (3.9–48.9) | 0.73 (0.58–0.92) |
| FBS-r ≥ 21 | 0.21 (0.09–0.41) | 1.0 (0.97–1.0) | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.92 | 0.84 | 0.75 | 0.66 | 0.56 | ∞ | 0.79 (0.65–0.98) |
| Probable to Definite Malingering | | | | | | | | | | | | | | |
| FBS-r ≥ 17 | 0.59 (0.43–0.73) | 0.92 (0.86–0.95) | 0.44 | 0.64 | 0.75 | 0.83 | 0.88 | 0.95 | 0.90 | 0.84 | 0.77 | 0.69 | 7.1 (3.9–12.9) | 0.45 (0.31–0.64) |
| FBS-r ≥ 18 | 0.55 (0.39–0.69) | 0.96 (0.91–0.98) | 0.60 | 0.77 | 0.85 | 0.90 | 0.93 | 0.95 | 0.89 | 0.83 | 0.76 | 0.68 | 13.2 (5.8–30.2) | 0.47 (0.34–0.66) |
| FBS-r ≥ 19 | 0.41 (0.27–0.57) | 0.97 (0.93–0.99) | 0.62 | 0.79 | 0.86 | 0.91 | 0.94 | 0.94 | 0.87 | 0.79 | 0.71 | 0.62 | 14.8 (5.3–41.5) | 0.61 (0.48–0.78) |
| FBS-r ≥ 20 | 0.36 (0.23–0.52) | 0.98 (0.94–0.99) | 0.66 | 0.81 | 0.88 | 0.92 | 0.95 | 0.93 | 0.86 | 0.78 | 0.70 | 0.61 | 17.6 (5.4–57.5) | 0.65 (0.52–0.81) |
| FBS-r ≥ 21 | 0.20 (0.10–0.36) | 1.0 (0.97–1.0) | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.92 | 0.83 | 0.75 | 0.65 | 0.56 | ∞ | 0.80 (0.68–0.92) |
| Probable Malingering | | | | | | | | | | | | | | |
| FBS-r ≥ 17 | 0.30 (0.21–0.41) | 0.92 (0.86–0.95) | 0.29 | 0.48 | 0.61 | 0.71 | 0.78 | 0.92 | 0.84 | 0.75 | 0.66 | 0.57 | 3.6 (1.9–6.9) | 0.76 (0.66–0.88) |
| FBS-r ≥ 18 | 0.23 (0.15–0.34) | 0.96 (0.91–0.98) | 0.38 | 0.58 | 0.71 | 0.79 | 0.85 | 0.92 | 0.83 | 0.74 | 0.65 | 0.55 | 5.5 (2.3–13.3) | 0.80 (0.71–0.90) |
| FBS-r ≥ 19 | 0.17 (0.10–0.27) | 0.97 (0.93–0.99) | 0.40 | 0.60 | 0.72 | 0.80 | 0.86 | 0.91 | 0.82 | 0.73 | 0.64 | 0.54 | 6.1 (2.1–18.0) | 0.85 (0.78–0.94) |
| FBS-r ≥ 20 | 0.14 (0.08–0.24) | 0.98 (0.94–0.99) | 0.43 | 0.63 | 0.75 | 0.82 | 0.87 | 0.91 | 0.82 | 0.73 | 0.63 | 0.53 | 7.0 (2.0–24.1) | 0.87 (0.80–0.95) |
| FBS-r ≥ 21 | 0.11 (0.05–0.20) | 1.0 (0.97–1.0) | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.91 | 0.82 | 0.72 | 0.63 | 0.53 | ∞ | 0.89 (0.83–0.96) |
| Combined Malingering Groups | | | | | | | | | | | | | | |
| FBS-r ≥ 17 | 0.43 (0.35–0.51) | 0.92 (0.86–0.95) | 0.37 | 0.57 | 0.69 | 0.78 | 0.84 | 0.94 | 0.87 | 0.79 | 0.71 | 0.62 | 5.2 (3.0–9.2) | 0.62 (0.54–0.71) |
| FBS-r ≥ 18 | 0.35 (0.27–0.43) | 0.96 (0.91–0.98) | 0.49 | 0.68 | 0.78 | 0.85 | 0.89 | 0.93 | 0.85 | 0.77 | 0.69 | 0.60 | 8.4 (3.7–19.0) | 0.68 (0.61–0.76) |
| FBS-r ≥ 19 | 0.26 (0.19–0.34) | 0.97 (0.93–0.99) | 0.51 | 0.70 | 0.80 | 0.86 | 0.90 | 0.92 | 0.84 | 0.75 | 0.66 | 0.57 | 9.4 (3.4–25.5) | 0.76 (0.70–0.84) |
| FBS-r ≥ 20 | 0.23 (0.17–0.31) | 0.98 (0.94–0.99) | 0.55 | 0.73 | 0.83 | 0.88 | 0.92 | 0.92 | 0.84 | 0.75 | 0.66 | 0.56 | 11.2 (3.5–35.7) | 0.78 (0.72–0.86) |
| FBS-r ≥ 21 | 0.15 (0.10–0.22) | 1.0 (0.97–1.0) | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.91 | 0.83 | 0.73 | 0.64 | 0.54 | ∞ | 0.85 (0.79–0.90) |

ᵃT-score equivalents for raw scores: 17 = 80, 18 = 83, 19 = 86, 20 = 89, 21 = 92.

Table 7.

Sensitivity, specificity, positive/negative predictive value (PPV/NPV), and likelihood ratios for cutoff scores for the MMPI-2-RF Infrequent Somatic Responses Scale (Fs)

| Cutoffᵃ | Sensitivity (95% CI) | Specificity (95% CI) | PPV, base rate 0.10 | PPV, 0.20 | PPV, 0.30 | PPV, 0.40 | PPV, 0.50 | NPV, base rate 0.10 | NPV, 0.20 | NPV, 0.30 | NPV, 0.40 | NPV, 0.50 | LR+ (95% CI) | LR− (95% CI) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Definite Malingering | | | | | | | | | | | | | | |
| Fs ≥ 6 | 0.46 (0.28–0.66) | 0.94 (0.88–0.97) | 0.45 | 0.65 | 0.76 | 0.83 | 0.88 | 0.94 | 0.88 | 0.80 | 0.72 | 0.64 | 7.5 (3.5–15.8) | 0.57 (0.40–0.81) |
| Fs ≥ 7 | 0.39 (0.22–0.59) | 0.97 (0.93–0.99) | 0.61 | 0.78 | 0.86 | 0.90 | 0.93 | 0.94 | 0.86 | 0.79 | 0.71 | 0.62 | 14.2 (4.9–41.5) | 0.62 (0.46–0.84) |
| Fs ≥ 8 | 0.29 (0.14–0.49) | 0.99 (0.95–1.0) | 0.69 | 0.84 | 0.90 | 0.93 | 0.95 | 0.93 | 0.85 | 0.76 | 0.67 | 0.58 | 20.7 (4.6–92.4) | 0.72 (0.57–0.92) |
| Fs ≥ 9 | 0.18 (0.07–0.38) | 0.99 (0.95–1.0) | 0.59 | 0.76 | 0.85 | 0.90 | 0.93 | 0.92 | 0.83 | 0.74 | 0.64 | 0.55 | 12.9 (2.6–63.4) | 0.83 (0.70–0.99) |
| Fs ≥ 10 | 0.08 (0.01–0.28) | 1.0 (0.97–1.0) | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.91 | 0.81 | 0.72 | 0.62 | 0.52 | ∞ | 0.92 (0.82–1.0) |
| Probable to Definite Malingering | | | | | | | | | | | | | | |
| Fs ≥ 6 | 0.36 (0.23–0.52) | 0.94 (0.88–0.97) | 0.39 | 0.59 | 0.72 | 0.80 | 0.85 | 0.93 | 0.86 | 0.77 | 0.69 | 0.60 | 5.9 (2.8–12.3) | 0.68 (0.54–0.85) |
| Fs ≥ 7 | 0.25 (0.14–0.41) | 0.97 (0.93–0.99) | 0.50 | 0.69 | 0.79 | 0.86 | 0.90 | 0.92 | 0.84 | 0.75 | 0.66 | 0.56 | 9.1 (3.0–27.0) | 0.77 (0.65–0.92) |
| Fs ≥ 8 | 0.18 (0.09–0.33) | 0.99 (0.95–1.0) | 0.59 | 0.76 | 0.85 | 0.90 | 0.93 | 0.92 | 0.83 | 0.74 | 0.64 | 0.55 | 13.2 (2.9–59.8) | 0.83 (0.72–0.95) |
| Fs ≥ 9 | 0.09 (0.03–0.23) | 0.99 (0.95–1.0) | 0.42 | 0.62 | 0.74 | 0.81 | 0.87 | 0.91 | 0.81 | 0.72 | 0.62 | 0.52 | 6.6 (1.2–34.8) | 0.92 (0.84–1.0) |
| Fs ≥ 10 | 0.07 (0.02–0.20) | 1.0 (0.97–1.0) | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.91 | 0.81 | 0.72 | 0.62 | 0.52 | ∞ | 0.93 (0.86–1.0) |
| Probable Malingering | | | | | | | | | | | | | | |
| Fs ≥ 6 | 0.25 (0.17–0.36) | 0.94 (0.88–0.97) | 0.31 | 0.51 | 0.64 | 0.73 | 0.80 | 0.92 | 0.83 | 0.75 | 0.65 | 0.56 | 4.1 (2.0–8.5) | 0.80 (0.70–0.90) |
| Fs ≥ 7 | 0.20 (0.13–0.31) | 0.97 (0.93–0.99) | 0.45 | 0.65 | 0.76 | 0.83 | 0.88 | 0.92 | 0.83 | 0.74 | 0.65 | 0.55 | 7.4 (2.6–21.3) | 0.82 (0.73–0.91) |
| Fs ≥ 8 | 0.12 (0.06–0.21) | 0.99 (0.95–1.0) | 0.49 | 0.68 | 0.79 | 0.85 | 0.90 | 0.91 | 0.82 | 0.72 | 0.63 | 0.53 | 8.7 (2.0–38.9) | 0.89 (0.82–0.97) |
| Fs ≥ 9 | 0.10 (0.05–0.19) | 0.99 (0.95–1.0) | 0.43 | 0.63 | 0.75 | 0.82 | 0.87 | 0.91 | 0.81 | 0.72 | 0.62 | 0.52 | 7.0 (1.5–32.1) | 0.92 (0.85–0.98) |
| Fs ≥ 10 | 0.04 (0.01–0.12) | 1.0 (0.97–1.0) | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.90 | 0.81 | 0.71 | 0.61 | 0.51 | ∞ | 0.96 (0.92–1.0) |
| Combined Malingering Groups | | | | | | | | | | | | | | |
| Fs ≥ 6 | 0.32 (0.25–0.40) | 0.94 (0.88–0.97) | 0.37 | 0.57 | 0.69 | 0.78 | 0.84 | 0.93 | 0.85 | 0.76 | 0.68 | 0.58 | 5.2 (2.7–10.2) | 0.72 (0.65–0.81) |
| Fs ≥ 7 | 0.25 (0.19–0.33) | 0.97 (0.93–0.99) | 0.50 | 0.69 | 0.79 | 0.86 | 0.90 | 0.92 | 0.84 | 0.75 | 0.66 | 0.57 | 9.1 (3.3–24.9) | 0.77 (0.70–0.84) |
| Fs ≥ 8 | 0.17 (0.11–0.24) | 0.99 (0.95–1.0) | 0.57 | 0.75 | 0.84 | 0.89 | 0.92 | 0.91 | 0.83 | 0.73 | 0.64 | 0.54 | 12.2 (2.9–50.3) | 0.84 (0.79–0.91) |
| Fs ≥ 9 | 0.11 (0.07–0.17) | 0.99 (0.95–1.0) | 0.47 | 0.66 | 0.77 | 0.84 | 0.89 | 0.91 | 0.82 | 0.72 | 0.62 | 0.53 | 8.0 (1.9–33.8) | 0.90 (0.85–0.95) |
| Fs ≥ 10 | 0.05 (0.03–0.11) | 1.0 (0.97–1.0) | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.90 | 0.81 | 0.71 | 0.61 | 0.51 | ∞ | 0.95 (0.91–0.98) |

a. T-score equivalents for raw scores: 6 = 91, 7 = 99, 8 = 107, 9 = 115, 10 = 120.

Table 8.

Sensitivity, specificity, positive/negative predictive value (PPV/NPV), and likelihood ratios for cutoff scores for the MMPI-2 Henry–Heilbronner Index (HHI)

Cutoff Test Accuracy (95% CI) PPV for Select Base Rates NPV for Select Base Rates Likelihood Ratios (95% CI) 
 Sensitivity Specificity 0.10 0.20 0.30 0.40 0.50 0.10 0.20 0.30 0.40 0.50 Positive Test Negative Test 
Definite Malingering 
HHI ≥ 12 0.61 (0.41–0.78) 0.94 (0.89–0.97) 0.55 0.73 0.83 0.88 0.92 0.96 0.91 0.85 0.78 0.71 11.0 (5.3–23.0) 0.42 (0.26–0.66) 
HHI ≥ 13 0.36 (0.19–0.56) 0.97 (0.92–0.99) 0.54 0.72 0.82 0.88 0.91 0.93 0.86 0.78 0.69 0.60 10.4 (3.8–28.0) 0.67 (0.50–0.88) 
HHI ≥ 14 0.18 (0.07–0.38) 0.99 (0.96–1.0) 0.74 0.86 0.92 0.94 0.96 0.92 0.83 0.74 0.64 0.55 25.9 (3.1–213.3) 0.83 (0.70–0.98) 
HHI = 15 0.11 (0.03–0.29) 1.0 (0.97–1.0) 1.0 1.0 1.0 1.0 1.0 0.91 0.82 0.72 0.63 0.53 ∞ 0.89 (0.79–1.0) 
Probable to Definite Malingering 
HHI ≥ 12 0.57 (0.41–0.71) 0.94 (0.89–0.97) 0.53 0.72 0.82 0.87 0.91 0.95 0.90 0.84 0.77 0.69 10.3 (5.0–21.2) 0.46 (0.33–0.64) 
HHI ≥ 13 0.36 (0.23–0.52) 0.97 (0.92–0.99) 0.54 0.73 0.82 0.88 0.91 0.93 0.86 0.78 0.69 0.60 10.5 (4.1–27.2) 0.66 (0.53–0.82) 
HHI ≥ 14 0.20 (0.10–0.36) 0.99 (0.96–1.0) 0.76 0.88 0.93 0.95 0.97 0.92 0.83 0.74 0.65 0.56 29.7 (3.9–227.7) 0.80 (0.69–0.93) 
HHI = 15 0.11 (0.04–0.25) 1.0 (0.97–1.0) 1.0 1.0 1.0 1.0 1.0 0.91 0.82 0.72 0.63 0.53 ∞ 0.89 (0.80–0.99) 
Probable Malingering 
HHI ≥ 12 0.28 (0.19–0.39) 0.94 (0.89–0.97) 0.36 0.56 0.68 0.77 0.83 0.92 0.84 0.75 0.66 0.57 5.0 (2.4–10.7) 0.77 (0.67–0.87) 
HHI ≥ 13 0.23 (0.15–0.34) 0.97 (0.92–0.99) 0.43 0.63 0.74 0.82 0.87 0.92 0.83 0.75 0.65 0.56 6.6 (2.6–17.1) 0.80 (0.71–0.90) 
HHI ≥ 14 0.10 (0.05–0.19) 0.99 (0.96–1.0) 0.60 0.77 0.85 0.90 0.93 0.91 0.81 0.72 0.62 0.52 14.0 (1.8–109.8) 0.91 (0.85–0.98) 
HHI = 15 0.07 (0.03–0.16) 1.0 (0.97–1.0) 1.0 1.0 1.0 1.0 1.0 0.91 0.81 0.72 0.62 0.52 ∞ 0.93 (0.87–0.99) 
Combined Malingering Groups 
HHI ≥ 12 0.42 (0.34–0.50) 0.94 (0.89–0.97) 0.46 0.66 0.77 0.84 0.88 0.94 0.87 0.79 0.71 0.62 7.6 (3.8–15.3) 0.61 (0.54–0.70) 
HHI ≥ 13 0.29 (0.22–0.37) 0.97 (0.92–0.99) 0.49 0.68 0.79 0.85 0.90 0.92 0.84 0.76 0.67 0.58 8.4 (3.4–20.6) 0.73 (0.66–0.81) 
HHI ≥ 14 0.14 (0.09–0.21) 0.99 (0.96–1.0) 0.69 0.84 0.90 0.93 0.95 0.91 0.82 0.73 0.63 0.54 20.6 (2.8–150.7) 0.86 (0.81–0.92) 
HHI = 15 0.09 (0.05–0.15) 1.0 (0.97–1.0) 1.0 1.0 1.0 1.0 1.0 0.91 0.81 0.72 0.62 0.52 ∞ 0.91 (0.87–0.96) 
Table 9.

Sensitivity, specificity, positive/negative predictive value (PPV/NPV), and likelihood ratios for cutoff scores for the MMPI-2-RF Henry–Heilbronner Index (HHI-r)

Cutoff Test accuracy (95% CI) PPV for select base rates NPV for select base rates Likelihood ratios (95% CI) 
 Sensitivity Specificity 0.10 0.20 0.30 0.40 0.50 0.10 0.20 0.30 0.40 0.50 Positive test Negative test 
Definite Malingering 
HHI-r ≥ 9 0.57 (0.37–0.75) 0.93 (0.87–0.96) 0.48 0.67 0.78 0.85 0.89 0.95 0.90 0.84 0.76 0.68 8.3 (4.2–16.3) 0.46 (0.30–0.71) 
HHI-r ≥ 10 0.32 (0.17–0.52) 0.97 (0.92–0.99) 0.51 0.70 0.80 0.86 0.90 0.93 0.85 0.77 0.68 0.59 9.3 (3.4–25.7) 0.70 (0.54–0.91) 
HHI-r = 11 0.11 (0.03–0.29) 0.99 (0.96–1.0) 0.63 0.79 0.87 0.91 0.94 0.91 0.82 0.72 0.63 0.53 15.5 (1.7–144.0) 0.90 (0.79–1.0) 
Probable to Definite Malingering 
HHI-r ≥ 9 0.50 (0.35–0.65) 0.93 (0.87–0.96) 0.44 0.64 0.76 0.83 0.88 0.94 0.88 0.81 0.74 0.65 7.3 (3.7–14.1) 0.54 (0.40–0.72) 
HHI-r ≥ 10 0.30 (0.17–0.45) 0.97 (0.92–0.99) 0.49 0.68 0.79 0.85 0.90 0.92 0.85 0.76 0.67 0.58 8.6 (3.2–22.7) 0.73 (0.60–0.88) 
HHI-r = 11 0.14 (0.06–0.28) 0.99 (0.96–1.0) 0.68 0.83 0.89 0.93 0.95 0.91 0.82 0.73 0.63 0.53 19.8 (2.4–159.9) 0.87 (0.77–0.98) 
Probable Malingering 
HHI-r ≥ 9 0.25 (0.17–0.36) 0.93 (0.87–0.96) 0.29 0.48 0.61 0.71 0.79 0.92 0.83 0.74 0.65 0.55 3.7 (1.8–7.4) 0.80 (0.71–0.91) 
HHI-r ≥ 10 0.13 (0.07–0.23) 0.97 (0.92–0.99) 0.30 0.49 0.63 0.72 0.80 0.91 0.82 0.72 0.63 0.53 3.8 (1.4–10.7) 0.90 (0.83–0.98) 
HHI-r = 11 0.10 (0.05–0.19) 0.99 (0.96–1.0) 0.61 0.78 0.86 0.90 0.93 0.91 0.82 0.72 0.62 0.52 14.0 (1.8–109.8) 0.91 (0.85–0.98) 
Combined Malingering Groups 
HHI-r ≥ 9 0.38 (0.30–0.46) 0.93 (0.87–0.96) 0.38 0.58 0.70 0.79 0.85 0.93 0.86 0.78 0.69 0.60 5.5 (2.9–10.4) 0.67 (0.59–0.75) 
HHI-r ≥ 10 0.21 (0.15–0.29) 0.97 (0.92–0.99) 0.41 0.61 0.73 0.81 0.86 0.92 0.83 0.74 0.65 0.55 6.2 (2.5–15.4) 0.82 (0.75–0.89) 
HHI-r = 11 0.11 (0.07–0.17) 0.99 (0.96–1.0) 0.64 0.80 0.87 0.91 0.94 0.91 0.82 0.72 0.63 0.53 15.9 (2.1–118.0) 0.90 (0.85–0.95) 

With respect to cutoff scores based on specificities, an RBS cutoff of ≥15 produced a specificity of 0.93 for all malingering groups, and a cutoff of ≥21 resulted in perfect specificity for all groups. An RBS cutoff of ≥15 had the highest sensitivity (0.71) for the DM group, and this was the highest sensitivity of all the C-S SVTs examined in this research. None of the RBS cutoffs resulted in a PPV of at least 0.90 at a base rate of 0.40, except for a cutoff of ≥21. The PPV at a cutoff of ≥21 was 1.0 for all malingering groups and at all base rates. This follows from the specificity of 1.0: with no false positives, the PPV is necessarily 1.0 and the LR+ is infinitely large.
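As an illustration only (it is not part of the original analysis), the relationship between sensitivity, specificity, and the likelihood ratios can be sketched in Python. The function name `likelihood_ratios` is hypothetical. The inputs are the rounded values reported in the text and in Table 7, so the results may differ slightly from tabled values that were presumably computed from unrounded data.

```python
import math


def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a dichotomized test result.

    LR+ = sensitivity / (1 - specificity)
    LR- = (1 - sensitivity) / specificity
    """
    false_positive_rate = 1.0 - specificity
    # With perfect specificity there are no false positives, so LR+ is infinite.
    lr_pos = math.inf if false_positive_rate == 0 else sensitivity / false_positive_rate
    # Guard against division by zero when specificity is 0.
    lr_neg = math.inf if specificity == 0 else (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# RBS >= 15 in the DM group (rounded values quoted in the text).
rbs_lr_pos, rbs_lr_neg = likelihood_ratios(0.71, 0.93)

# Fs >= 10 in the DM group (Table 7): specificity of 1.0 gives an infinite LR+.
fs_lr_pos, fs_lr_neg = likelihood_ratios(0.08, 1.0)

print(round(rbs_lr_pos, 2), round(rbs_lr_neg, 2))
print(fs_lr_pos, round(fs_lr_neg, 2))
```

The explicit check for a false-positive rate of zero is a design choice of this sketch: it returns infinity rather than raising a division error, which mirrors the ∞ entries in the tables.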

An FBS cutoff of ≥25 resulted in a specificity of 0.92 across all groups, and a cutoff of ≥31 resulted in perfect specificity for all groups. An FBS cutoff of ≥25 produced the highest sensitivity (0.52), which was found for the PDM group. PPVs of at least 0.92 were found for scores ≥27 across all groups at a 0.40 base rate, except for the PDM group, where there was a PPV of 0.91 for a cutoff of ≥26.

For FBS-r, the cutoffs with a specificity of at least 0.90 ranged from ≥17 (0.92) to ≥21 (1.0), and the highest sensitivity (0.59) was for the PDM group (cutoff ≥17). A cutoff of ≥20 for the DM group and a cutoff of ≥18 for the PDM group had PPVs of 0.90. No other cutoffs had a PPV of at least 0.90, except for a cutoff of ≥21, which had a specificity of 1.0 for all groups, resulting in perfect PPVs.

The cutoff scores for Fs ranged from ≥6 to ≥10, with specificities greater than or equal to 0.94, and the highest sensitivity (0.46) was for the DM group at a cutoff of ≥6. At a base rate of 0.40, PPVs greater than or equal to 0.90 were found for cutoffs ≥7 for the DM group and ≥8 for the PDM group (except for a cutoff of ≥9 for the PDM group; PPV = 0.81). A cutoff of ≥10 resulted in perfect specificities for all groups.

For HHI, cutoffs ranged from ≥12 to 15 (15 is the maximum score for HHI), with specificities across all groups greater than or equal to 0.94. A cutoff of 15 had specificities of 1.0 for all groups. The highest sensitivity (0.61) occurred for a cutoff of ≥12 for the DM group. At a base rate of 0.40, cutoffs ≥14 produced PPVs ≥0.90 in all groups.

The cutoffs for HHI-r ranged from ≥9 to 11 (11 is the maximum score for HHI-r). Specificities were 0.93 or greater for all cutoffs in all groups, and no cutoffs had a specificity of 1.0. The highest specificity (0.99) was for a cutoff of 11 in all groups. The highest sensitivity was 0.57 for the DM group (cutoff ≥9). At a base rate of 0.40, PPVs of 0.90 or greater occurred only for a cutoff of 11.

Discussion

This research examined the performance of MMPI-2 and MMPI-2-RF C-S SVTs in a military sample of mostly closed head-injured patients with mTBI. The results indicate that RBS had the largest “mean” effect size (d = 1.58) in distinguishing the NM group from the three malingering groups used for this research. This was followed by HHI and HHI-r; d was 1.50 for both scales. The lowest mean effect size was for Fs (d = 1.23), as might be expected because it was designed to assess over-reporting of somatic symptoms and, unlike the other C-S SVTs, was not specifically designed to assess malingering or non-credible performance on tests of cognitive functioning. These results suggest that there is not much difference in the C-S SVTs based on this metric, and that they all have utility. For the CM group, the results indicated that RBS, HHI, and HHI-r performed in a very similar fashion (d range = 1.39–1.42), with RBS having the largest effect size. Research by Jones, Ingram, and Ben-Porath (2012) found that RBS also had the largest effect size (d = 1.69) when comparing a group with no PVT failures vs. a group with three failures. This was followed by FBS and then Fs. In general, the results demonstrate a dose–response relationship between the number of PVTs failed and the mean values and related effect size for each of the C-S SVTs. This finding helps establish the construct validity of the scales examined in this research as well as the validity of the method used to establish the different levels of malingering (Bianchini, Curtis, & Greve, 2006).

Cutoffs for the SVTs used in this research were established for specificities that ranged from at least 0.90 to 1.00 or until the maximum score for the scale was reached. For RBS in the current sample, the cutoff for a minimum specificity of 0.90 was ≥15 across the three malingering groups. Past research, as reviewed earlier, using a variety of criteria (e.g., individuals differing in litigation status, failing a single PVT vs. passing, malingering based on the criteria of Slick and coworkers vs. incentive only) has found cutoffs ranging from ≥13 to ≥19 for a minimum specificity of 0.90. Research using a mixed sample of veterans (not active duty military members) and using failing vs. passing one PVT as a criterion found that cutoffs in the range of ≥17 to ≥19 were needed to meet the specificity criterion. Research using head-injured samples (Schroeder et al., 2012; Peck et al., 2013) similar to the current sample found that a cutoff of ≥16 was needed to meet the specificity criterion. Thus, other research in comparable samples found cutoffs similar to that found in the current research when using a specificity of 0.90.

For FBS in the current sample, the cutoff for a minimum specificity of 0.90 was ≥25 across the three malingering groups. As reviewed earlier, other research using different samples and methodologies and a specificity of at least 0.90 found cutoffs that ranged from ≥23 to ≥27. The cutoff for the current research is therefore consistent with this range. With respect to FBS-r, the current research found that a cutoff of ≥17 was necessary to reach the minimally accepted specificity. This is at the low end of the range of scores (17–24) that Ben-Porath and Tellegen (2011) recommend for cutoffs suggesting possible over-reporting of memory complaints, but much lower than the cutoff (24) that Tarescavage, Wygant, Gervais, and Ben-Porath (2013) found necessary to differentiate an incentive only vs. a probable to definite MND group at criterion specificity in a non–head-injured sample. A cutoff of ≥21 had perfect specificity and a resulting PPV of 1.0 for FBS-r in the current research.

A cutoff of ≥6 for Fs in the current research had a specificity of 0.94 across the malingering groups, which compares with past research and recommendations of cutoffs in the range of 5–8 suggesting over-reporting of somatic problems. Tarescavage, Wygant, Gervais, and Ben-Porath (2013) found that a T-score of 100 (raw score ≥ 8) had acceptable specificity (0.90) when comparing their incentive only and probable/definite malingering groups. This compares with a specificity of 0.99 for the PDM group in the current research.

Initial research for HHI found cutoffs of about 8 or 9 produced acceptable specificities. However, subsequent research has generally found that somewhat higher cutoffs (≥12) are required to obtain acceptable specificity. This is consistent with the current research, which found the same cutoff is needed to obtain specificity of at least 0.90. Initial research for HHI-r found a cutoff of ≥7 produced acceptable specificity. A slightly higher cutoff of ≥9 was required in the current research. Of note, the correlation between HHI and HHI-r is 0.97 (p < .001) for the total sample used in this research (N = 462).

This research established cutoff scores based on a specificity of 0.90 (so comparisons could be made with past research), but this may not be the best criterion or method to establish cutoffs in predicting the probability of a COI. Crawford, Garthwaite, and Betkowska (2009) indicate that sensitivity informs us of the probability of a positive test given that an individual has the COI, but, in practice, neuropsychologists need to know the probability of the COI given that an individual obtained a positive test. They state that it is this inverse probability that is crucial. Similarly, they state that specificity tells us the probability of a negative test, given that an individual is free of the COI. However, in practice neuropsychologists are again interested in the inverse probability, i.e., the probability of the absence of the COI when a negative test result is obtained. In this same vein, Millis and Volinsky (2001) point out that although high sensitivity and specificity are desirable properties of tests, they cannot alone answer the “fundamental diagnostic question” (p. 818), which is, given a positive test result, what is the probability that the patient has the COI. Similarly, Akobeng (2006) states that although sensitivity and specificity are important measures of the diagnostic accuracy of a test, they are of no practical value when the task is to estimate the probability of disease in an individual patient, i.e., when we want to know who “really” has the COI (in our case malingering).
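The inverse probabilities described above follow from Bayes' theorem. The sketch below is illustrative only and is not taken from the original analysis; the function name `predictive_values` is hypothetical, and the example inputs are the rounded sensitivity and specificity for HHI ≥ 12 in the DM group (Table 8) at an assumed base rate of 0.40.

```python
def predictive_values(sensitivity, specificity, base_rate):
    """Return (PPV, NPV) for a given base rate using Bayes' theorem.

    PPV = P(COI present | positive test)
    NPV = P(COI absent  | negative test)
    """
    true_pos = sensitivity * base_rate
    false_pos = (1.0 - specificity) * (1.0 - base_rate)
    true_neg = specificity * (1.0 - base_rate)
    false_neg = (1.0 - sensitivity) * base_rate

    # Return NaN when a predictive value is undefined (no positive or no negative results).
    ppv = true_pos / (true_pos + false_pos) if (true_pos + false_pos) > 0 else float("nan")
    npv = true_neg / (true_neg + false_neg) if (true_neg + false_neg) > 0 else float("nan")
    return ppv, npv


# HHI >= 12, DM group: sensitivity 0.61, specificity 0.94, base rate 0.40.
ppv, npv = predictive_values(0.61, 0.94, 0.40)
print(round(ppv, 3), round(npv, 3))
```

Because the inputs are rounded to two decimals, the results are close to, but not necessarily identical with, the tabled PPV and NPV.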

Predictive values inform the clinician of the probability of actually having or not having the COI but are dependent on the base rate of the COI. Greve, Bianchini, and Doane (2006) argue that a PPV of 0.51 or greater would be sufficient to meet the standard of “more probable than not” when malingering is an issue in a legal context. Past research has found that the base rate of malingering in mTBI cases (Larrabee, 2007; Larrabee, Millis, & Meyers, 2009) is about 40%; this is similar to the research by Jones (2013a), which found a base rate of 0.41 in the military sample used for the current research. At a base rate of 0.40, all of the cutoff scores presented in this research exceeded the standard suggested by Greve and coworkers for all C-S SVTs and for all malingering groups. However, it can be argued that the standard to diagnose malingering should in many cases be much greater than the 0.51 PPV suggested by Greve and coworkers. A false diagnosis of malingering can have significant consequences. If a military member is court-martialed for malingering, then the soldier could face dishonorable discharge, forfeiture of all pay and allowances, and confinement for several years. In this case, a much higher level of diagnostic certainty should apply. A PPV of at least 0.90 should probably be used, especially if other test results or other information do not support a diagnosis of malingering. Other evidence of malingering may involve a lack of consistency between neuropsychological test results and severity of brain injury or lack of consistency between the individual's clinical presentation and evidence related to daily functioning. 
For example, it is sometimes found that a military member who is suspected of malingering and who also performs in a significantly impaired fashion on neuropsychological evaluation has recent performance reports that indicate he or she is performing military duties in a very satisfactory manner. The issues of inconsistency and other issues related to enhancing diagnostic certainty are addressed in more detail by Larrabee (2003, 2012b) and are also discussed by Slick, Sherman, and Iverson (1999). A lower PPV and associated cutoff could be used in cases where a diagnosis other than malingering is of primary consideration, e.g., in a treatment setting where a Somatic Symptom Disorder, similar to a Cogniform Disorder (Delis & Wetter, 2007), is the primary diagnostic consideration. In this case, if there are no legal implications and there is other information that is consistent with the diagnosis, such as lack of a known external incentive, then a lower PPV may be appropriate. In general, PPVs used in the decision-making process may vary depending on base rates, diagnostic issues involved, the context of the evaluation, as well as other information available when the final decision is made.
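To illustrate how the choice of a PPV standard interacts with the base rate, the following sketch solves Bayes' theorem for the smallest base rate at which a cutoff reaches a target PPV. It is an illustration under stated assumptions, not a procedure used in this research: the function name `min_base_rate_for_ppv` is hypothetical, sensitivity is assumed to be greater than zero, and the example uses the rounded values for HHI-r ≥ 9 in the CM group (Table 9).

```python
def min_base_rate_for_ppv(sensitivity, specificity, target_ppv):
    """Smallest base rate at which the PPV reaches target_ppv.

    Derived by solving PPV = s*b / (s*b + f*(1 - b)) for b,
    where s is sensitivity and f = 1 - specificity.
    Assumes sensitivity > 0.
    """
    false_positive_rate = 1.0 - specificity
    numerator = target_ppv * false_positive_rate
    denominator = sensitivity * (1.0 - target_ppv) + numerator
    if denominator == 0:
        return float("nan")  # undefined, e.g. target of 1.0 with perfect specificity
    return numerator / denominator


# HHI-r >= 9, CM group: sensitivity 0.38, specificity 0.93.
needed_for_051 = min_base_rate_for_ppv(0.38, 0.93, 0.51)  # "more probable than not"
needed_for_090 = min_base_rate_for_ppv(0.38, 0.93, 0.90)  # stricter standard

print(round(needed_for_051, 2), round(needed_for_090, 2))
```

Under these rounded inputs, the 0.51 standard is met at base rates well below 0.40, whereas the 0.90 standard would require a base rate above 0.40, which is consistent with the tabled PPV for this cutoff falling between the two standards.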

LRs are at times reported in the neuropsychological literature on cutoffs for validity tests, but their interpretive significance is often ignored. An overlooked utility of LRs is that they can be used to provide a greater range of cutoff scores for calculating post-test probabilities of a COI when “nonredundant” validity tests are used (Grimes & Schulz, 2005; Larrabee, 2008). Research has suggested that many validity tests used in neuropsychological evaluations are not redundant in military and other samples (Nelson et al., 2003; Young, Kearns, & Roper, 2011; Berthelson, Mulchan, Odland, Miller, & Mittenberg, 2013; Jones, 2013a; Jones, 2013b). An example is provided here of how nonredundant validity tests can be used to enhance diagnostic certainty and, perhaps more importantly, how they may be utilized at cutoffs with specificities much lower than those dictated by the generally accepted criterion of a specificity of 0.90. An RBS cutoff of ≥15 is required for a specificity of ≥0.90 in the current research for all groups. A much lower RBS cutoff of ≥10 (calculated for the current sample for the CM group) results in a specificity of only 0.61 but a very good sensitivity of 0.88. The LR+ for an RBS cutoff of ≥10 for the current sample is 2.27, and this produces a PPV of 0.60 and a post-test probability of 0.60 at a base rate of 0.40 when calculated using the LR+. However, if this rather low LR+ of 2.27 is combined with another uncorrelated validity test with a modest LR+ of 6.17 (post-test probability = 0.80), then the chained LR+ becomes 14 (2.27 × 6.17 = 14.0), which results in a post-test probability of malingering in the CM group of 0.90 at a base rate of 0.40. In general, an LR+ greater than 10 has a large effect on increasing the post-test probabilities of having the COI (Hayden & Brown, 1999). 
In this example, a lower cutoff for RBS results in a higher sensitivity but much lower specificity, but when combined with another modest LR+ for an uncorrelated validity test, the result is a post-test probability that can be quite useful in the diagnostic process. Bowden and Loring (2009) point out that most neuropsychological tests have a wide range of scores and that important diagnostic information may be lost when validity test scores are reduced to the simple dichotomy of a “positive” or “negative” diagnosis that underlies sensitivity and specificity analysis. So, combining likelihood ratios allows us to utilize individual test results without having to choose an arbitrary cutoff to dichotomize the results into a positive or negative result.
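The chaining arithmetic in the example above can be reproduced with a short sketch. It is illustrative only: the function name `post_test_probability` is hypothetical, and multiplying the LRs assumes, as stated in the text, that the validity tests are uncorrelated (nonredundant). The base rate is converted to odds, multiplied by each LR in turn, and converted back to a probability.

```python
def post_test_probability(base_rate, *likelihood_ratios):
    """Post-test probability after applying one or more likelihood ratios.

    Assumes the tests are independent, so their LRs can be multiplied.
    base_rate must be strictly between 0 and 1.
    """
    odds = base_rate / (1.0 - base_rate)  # pre-test odds
    for lr in likelihood_ratios:
        odds *= lr                        # chain each likelihood ratio
    return odds / (1.0 + odds)            # convert odds back to a probability


base_rate = 0.40
rbs_only = post_test_probability(base_rate, 2.27)          # RBS >= 10 alone
second_only = post_test_probability(base_rate, 6.17)       # second test alone
combined = post_test_probability(base_rate, 2.27, 6.17)    # chained LR+ of about 14

print(round(rbs_only, 2), round(second_only, 2), round(combined, 2))
```

Working in odds rather than probabilities keeps each step a simple multiplication, which is why the LRs of several nonredundant tests can be combined in any order.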

This research used a psychometric basis to establish malingering, which may have particular value in military settings. Traditionally, the identification of incentives has been deemed important if not necessary to establish malingering; however, this may be especially difficult in a military population. In the military, the incentives to feign are numerous, but the negative consequences of being discovered malingering are potentially quite significant. This combination of incentives and possible negative consequences can create a complex situation when considering malingering. The motivations to feign cognitive and other medical problems include, among others, being removed from arduous duties, avoiding deployment, obtaining discharge from the military, and obtaining disability compensation, including compensation for traumatic injury that can result in large lump-sum payments of up to a hundred thousand dollars (cf. Howe, 2010). The consequences of being discovered “faking” problems include court-martial and imprisonment. So, military members may be particularly careful to conceal their attempts to malinger. They may also be particularly adept at presenting a convincing set of symptoms because of the extensive education programs the military provides related to the symptoms of TBI. In addition, it has been observed that individuals who are placed in special units (e.g., the Warrior Transition Battalion) to ensure and coordinate care are sometimes coached by other members in these units about how to “beat” medical evaluations to ensure their goals are achieved. This makes the use of validity tests and psychometrically based methods to assess malingering, irrespective of incentives, a useful starting point in the diagnostic process in this population.

A limitation of this research is that the sample consisted of military members, who were largely men. In addition, the sample consisted of TBI patients, most of whom experienced mTBI, and it was not possible to compare groups of different TBI severity and/or other pathology. This could limit generalization to other populations; however, the results should generalize to patient populations in large military medical centers and other samples that share similar demographics and similar diagnoses. In any case, it should be noted that the cutoffs found in this research were similar to cutoffs found in other non-military samples. With respect to the sample being predominantly men, past research has suggested that women tend to score about two points higher than men on FBS (Greiffenstein, Fox, & Lees-Haley, 2007) irrespective of clinical status. Lee, Graham, Sellbom, and Gervais (2012) also found that women had slightly higher scores than men (d = 0.29), but they also found no evidence of clinically meaningful bias in the prediction of SVT failure using FBS scores for men and women. For the current sample, the number of women was somewhat limited (n = 50), but the difference between mean values was not significant. For men, the mean was 11.13 (SD = 4.7), and for women, the mean was 11.66 (SD = 4.3; p = .25, df = 419). In any case, this should be examined in future research.

A methodological limitation of this research involves the varying number of validity tests administered to the participants. It would have been preferable to administer a consistent battery of validity tests to all participants; however, this was not possible for this retrospective research. The lack of a consistent battery presents an issue in the formation of the PM group, because participants could have been assigned to the PDM group if additional tests were administered and failed. This would not affect specificities for this group; however, sensitivities could be affected. Even if a consistent battery were administered, this may not ensure that an individual would be correctly classified. The administration of a sufficiently large number of tests may eventually result in the failure of three or more validity tests. Given this issue, the CM group probably provides the best estimates of cutoff scores and related statistics for any level of malingering.

Conflict of Interest

None declared.

References

Akobeng, A. K. (2006). Understanding diagnostic tests 1: sensitivity, specificity and predictive values. Acta Paediatrica, 96, 338–341.
American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Arlington, VA: American Psychiatric Publishing.
Armistead-Jehle, P., & Hansen, C. L. (2011). Comparison of the Repeatable Battery for the Assessment of Neuropsychological Status effort index and stand-alone symptom validity tests in a military sample. Archives of Clinical Neuropsychology, 26, 592–601.
Babikian, T., Boone, K. B., Lu, P., & Arnold, G. (2006). Sensitivity and specificity of various digit span scores in the detection of suspect effort. The Clinical Neuropsychologist, 20, 145–159.
Ben-Porath, Y. S., & Tellegen, A. (2008). Minnesota Multiphasic Personality Inventory-2 restructured form: manual for administration and scoring. Minneapolis, MN: University of Minnesota Press.
Ben-Porath, Y. S., & Tellegen, A. (2011). Minnesota Multiphasic Personality Inventory-2 restructured form: manual for administration and scoring—Response Bias Scale (RBS) supplement. Minneapolis, MN: University of Minnesota Press.
Ben-Porath, Y. S., Tellegen, A., & Graham, J. R. (2009). The MMPI-2 Symptom Validity Scale (FBS). Minneapolis, MN: University of Minnesota Press.
Berthelson, L., Mulchan, S. S., Odland, A. P., Miller, L. J., & Mittenberg, W. (2013). False positive diagnosis of malingering due to use of multiple tests. Brain Injury, 27, 909–916.
Bianchini, K. J., Curtis, K. L., & Greve, K. W. (2006). Compensation and malingering in TBI: a dose-response relationship. The Clinical Neuropsychologist, 20, 831–847.
Bowden, S. C., & Loring, D. W. (2009). The diagnostic utility of multiple-level likelihood ratios. Journal of the International Neuropsychological Society, 15, 769–776.
Butcher, J. N., Dahlstrom, W. G., Graham, J. R., Tellegen, A., & Kaemmer, B. (1989). Minnesota Multiphasic Personality Inventory-2 (MMPI-2): manual for administration and scoring. Minneapolis, MN: University of Minnesota Press.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.
Crawford, J. R., Garthwaite, P. H., & Betkowska, K. (2009). Bayes’ theorem and diagnostic tests in neuropsychology: interval estimates for posttest probabilities. The Clinical Neuropsychologist, 23, 624–644.
Delis, D. C., & Wetter, S. R. (2007). Cogniform disorder and cogniform condition: proposed diagnoses for excessive cognitive symptoms. Archives of Clinical Neuropsychology, 22, 589–604.
Department of Veterans Affairs. (2009). VA/DoD clinical practice guideline for management of concussion/mild traumatic brain injury. Washington, DC: Department of Veterans Affairs.
Dionysus, K. E., Denney, R. L., & Halfaker, D. A. (2011). Detecting negative response bias with the Fake Bad Scale, Response Bias Scale, and Henry-Heilbronner Index of the Minnesota Multiphasic Personality Inventory-2. Archives of Clinical Neuropsychology, 26, 81–88.
Fagan, T. J. (1975). Nomogram for Bayes theorem. New England Journal of Medicine, 293, 257.
Ferguson, C. J. (2009). An effect size primer: a guide for clinicians and researchers. Professional Psychology: Research and Practice, 40, 532–538.
Fox, D. F. (2011). Symptom validity test failure indicates invalidity of neuropsychological tests. The Clinical Neuropsychologist, 3, 488–495.
Gervais, R. O. (2005). Development of an empirically derived response bias scale for the MMPI-2. Paper presented at the Annual MMPI-2 Symposium and Workshops, Ft. Lauderdale, FL, USA.
Gervais
,
R. O.
,
Ben-Porath
,
Y.
,
Wygant
,
D.
, &
Green
,
P.
(
2007
).
Development and validation of a Response Bias Scale for the MMPI-2
.
Assessment
 ,
14
,
196
208
.
Gervais
,
R. O.
,
Ben-Porath
,
Y.
,
Wygant
,
D.
, &
Green
,
P.
(
2008
).
Differential sensitivity of the Response Bias Scale (RBS) and MMPI-2 Validity scales to memory complaints
.
The Clinical Neuropsychologist
 ,
3
,
1
19
.
Gervais
,
R. O.
,
Ben-Porath
,
Y.
,
Wygant
,
D.
,
Green
,
P.
, &
Sellbom
M.
(
2010
).
Incremental validity of the MMPI-2-RF over-reporting Scales and RBS in assessing the veracity of memory complaints
.
Archives of Clinical Neuropsychology
 ,
25
,
274
284
.
Green
,
P.
(
2003
).
Green's Word Memory Test for Windows: user's manual
 .
Edmonton, Canada
:
Green's Publishing
.
Green
,
P.
(
2004
).
Green's Medical Symptom Validity Test (MSVT) for Windows: user's manual
 .
Edmonton, Canada
:
Green's Publishing
.
Greiffenstein
,
M. F.
,
Baker
,
W. J.
, &
Gola
,
T.
(
1994
).
Validation of malingered amnesia measures with a large clinical sample
.
Psychological Assessment
 ,
6
,
218
224
.
Greiffenstein
,
M. F.
,
Fox
,
D.
, &
Lees-Haley
,
P. R.
(
2007
). The MMPI-2 Fake Bad Scale in detection of noncredible brain injury claims. In
Boone
K. B.
(Ed.),
Assessment of feigned cognitive impairment
  (pp.
210
235
).
New York, NY
:
The Guildford Press
.
Greve
,
K. W.
,
Bianchini
,
K. J.
, &
Doane
,
B. M.
(
2006
).
Classification accuracy of the test of memory malingering in traumatic brain injury: results of known-groups analysis
.
Journal of Clinical and Experimental Neuropsychology
 ,
28
,
1176
1190
.
Greve
,
K. W.
,
Bianchini
,
K. J.
,
Black
,
F. W.
,
Heinly
,
M. T.
,
Love
,
J. M.
,
Swift
,
D. A.
, et al
. (
2006
).
Classification accuracy of the test of memory malingering in persons reporting exposure to environmental and industrial toxins: results of a known-groups analysis
.
Archives of Clinical Neuropsychology
 ,
21
,
439
438
.
Greve
,
K. W.
,
Etherton
,
J. L.
,
Ord
,
J.
,
Bianchini
,
K. J.
, &
Curtis
,
K. L.
(
2009
).
Detecting malingered pain-related disability: classification accuracy of the Test of Memory Malingering
.
The Clinical Neuropsychologist
 ,
23
,
1250
1271
.
Grimes
,
D. A.
,
Schulz
,
K. F.
(
2005
).
Refining clinical diagnosis with likelihood ratios
.
Lancent
 ,
365
,
1500
1505
.
Grote
,
C. L.
,
Kooker
,
E. K.
,
Garron
,
D. C.
,
Nyenhuis
,
D. L.
,
Smith
,
C. A.
, &
Mattingly
,
M. L.
(
2000
).
Performance of compensation seeking and non-compensation seeking samples on the Victoria Symptom Validity Test: cross validation and extension of a standardization study
.
Journal of Clinical and Experimental Neuropsychology
 ,
22
,
709
719
.
Hayden
,
S. R.
, &
Brown
,
M. D.
(
1999
).
Likelihood ratios: a powerful tool for incorporating the results of diagnostic tests into clinical decision making
.
Annals of Emergency Medicine
 ,
33
,
575
580
Heilbronner
,
R. L.
, &
Henry
,
G. K.
(
2013
). Psychological assessment of symptom magnification in mild traumatic brain injury cases. In
Carone
D. A.
, &
Bush
S. S.
(Eds.),
Mild traumatic brain injury: symptom validity assessment and malingering
 . (pp.
183
202
.).
New York, NY
:
Springer Publishing Company
.
Henry
,
G. K.
,
Heilbronner
,
R. L.
,
Algina
,
J.
, &
Kaya
,
Y.
(
2013
).
Derivation of the MMPI-2-RF Henry-Heilbronner Index-r (HHI-r) scale
.
The Clinical Neuropsychologist
 ,
27
,
509
513
.
Henry
,
G. K.
,
Heilbronner
,
R. L.
,
Mittenberg
,
W.
, &
Enders
,
C.
(
2006
).
The Henry-Heilbronner Index: a 15 item empirically derived MMPI-2 subscale for identifying probable malingering in personal injury litigants and disability claims
,
The Clinical Neuropsychologist
 ,
20
,
786
797
.
Hinkle
,
D. E.
,
Wiersma
,
W.
, &
Jurs
,
S. G.
(
2003
).
Applied statistics for the behavioral Sciences
  (5th ed.).
Boston
:
Houghton Mifflin
.
Howe
,
L. L.
(
2010
).
Giving context to post-deployment post-concussive-like symptoms: blast related potential mild traumatic brain injury and comorbidities
.
The Clinical Neuropsychologist
 ,
23
,
1315
1337
.
Jasinski
,
L. J.
,
Berry
,
D. T.
,
Shandera
,
A. L.
, &
Clark
,
J. A.
(
2011
).
Use of the Wechsler Adult Intelligence Scale Digit Span subtest for malingering detection: a meta-analytic review
.
Journal of Clinical and Experimental Neuropsychology
 ,
33
,
300
314
.
Jones
,
A.
(
2013
a).
Test of Memory malingering: cutoff scores for psychometrically defined malingering groups in a military sample
.
The Clinical Neuropsychologist
 ,
27
,
1043
1059
.
Jones
,
A.
(
2013
b).
Victory Symptom Validity Test: cutoff scores for psychometrically defined malingering groups in a military sample
.
The Clinical Neuropsychologist
 ,
27
,
1373
1394
.
Jones
,
A.
, &
Ingram
,
M. V.
(
2011
).
A comparison of selected MMPI-2 and MMPI-2-RF validity scales in assessing effort on cognitive tests in a military sample
.
The Clinical Neuropsychologist
 ,
25
,
1207
1227
.
Jones
,
A.
,
Ingram
,
M. V.
, &
Ben-Porath
,
Y. S.
(
2012
).
Scores on the MMPI-2-RF scales as a function of increasing levels of failure on cognitive symptom validity tests in a military sample
.
The Clinical Neuropsychologist
 ,
26
,
790
815
.
Larrabee
,
G. J.
(
2003
).
Detection of malingering using atypical performance patterns on standard neuropsychological tests
.
The Clinical Neuropsychologist
 ,
17
,
410
425
.
Larrabee
,
G. J.
(
2007
). Malingering, research designs, and base rates. In
Larrabee
G. J.
(Ed.),
Assessment of malingered neuropsychological deficits
  (pp.
3
13
).
New York
:
Oxford University Press
.
Larrabee
,
G. J.
(
2008
).
Aggregation across multiple indicators improves the detection of malingering: relationship to likelihood ratios
.
The Clinical Neuropsychologist
 ,
22
,
666
679
.
Larrabee
,
G. J.
(
2012
a).
Performance validity and symptom validity in neuropsychological assessment
.
Journal of the International Neuropsychological Society
 ,
18
,
625
630
.
Larrabee
,
G. J.
(
2012
b). A scientific approach to forensic neuropsychology. In
Larrabee
G. J.
(Ed.),
Forensic neuropsychology: a scientific approach
 .
New York
:
Oxford University Press
.
Larrabee
,
G. J.
,
Greiffenstein
,
M. F.
,
Greve
,
K. W.
, &
Bianchini
,
K. J.
(
2007
). Refining diagnostic criteria for malingering. In
Larrabee
G. J.
(Ed.),
Assessment of malingered neuropsychological deficits
  (pp.
334
371
).
New York, NY
:
Oxford University Press
.
Larrabee
,
G. J.
,
Millis
,
S. R.
, &
Meyers
,
J. E.
(
2009
).
40 plus or minus 10, a new magical number: reply to Russell
.
The Clinical Neuropsychologist
 ,
23
,
746
753
.
Lee
,
T. T.
,
Graham
,
J. R.
,
Sellbom
,
M.
, &
Gervais
,
R. O.
(
2012
).
Examining the potential for gender bias in the prediction of symptom validity test failure by MMPI-2 symptom validity scale scores
.
Psychological Assessment
 ,
24
,
618
627
.
Lees-Haley
,
P. R.
,
English
,
L. T.
, &
Glenn
W. J.
(
1991
).
A Fake Bad scale on the MMPI-2 for personal injury claimants
,
Psychological Reports
 ,
68
,
203
210
.
Loring
,
D. W.
,
Larrabee
,
G. J.
,
Lee
,
G. P.
, &
Meador
,
K. J.
(
2007
).
Victoria Symptom Validity Test performance in a heterogeneous clinical sample
.
The Clinical Neuropsychologist
 ,
21
,
522
531
.
Macciocchi
,
S. N.
,
Seel
,
R. T.
,
Alderson
,
A.
, &
Godsall
,
R.
(
2006
).
Victoria Symptom Validity Test performance in acute severe traumatic brain injury: implications for test interpretation
.
Archives of Clinical Neuropsychology
 ,
21
,
395
404
.
Millis
,
S. R.
, &
Volinsky
,
C. T.
(
2001
).
Assessment of response bias in mild head injury: beyond malingering tests
.
Journal of Clinical and Experimental Neuropsychology
 ,
23
,
809
828
.
Mittenberg
,
W.
,
Patton
,
C.
,
Canyock
,
E. M.
, &
Condit
,
D. C.
(
2002
).
Base rates of malingering and symptom exaggeration
.
Journal of Clinical and Experimental Neuropsychology
 ,
24
,
1094
1102
.
Nelson
,
N. W.
,
Boone
,
K.
,
Dueck
,
A.
,
Wagener
,
L.
,
Lu
,
P.
, &
Grills
,
C.
(
2003
).
Relationships between eight measures of suspect effort
.
The Clinical Neuropsychologist
 ,
17
,
263
272
.
Nelson
,
N. W.
,
Hoelzle
,
J. B.
,
Sweet
,
J.
,
Arbisi
,
P. A.
, &
Demakis
,
G.
(
2010
).
Updated meta-analysis of the MMPI-2 Symptom Validity Scale (FBS): verified utility in forensic practice
.
The Clinical Neuropsychologist
 ,
24
,
701
724
.
Nelson
,
N. W.
,
Sweet
,
J.
, &
Demakis
,
G.
(
2006
).
Meta-analysis of the MMPI-2 Fake Bad Scale: utility in forensic practice
.
The Clinical Neuropsychologist
 ,
20
,
39
58
.
Nelson
,
N. W.
,
Sweet
,
J.
, &
Heilbronner
,
R. L.
(
2007
).
Examination of the new MMPI-2 Response Bias Scale relationship with MMPI-2 validity scales
.
Journal of Clinical and Experimental Neuropsychology
 ,
29
(1)
,
67
72
.
Peck
,
C. P.
,
Schroeder
,
R. W.
,
Heinrichs
,
R. J.
,
Vondran
,
E. J.
,
Brockman
,
C. J.
,
Webster
,
B. K.
, et al
. (
2013
).
Differences in MMPI-2 FBS and RBS scores in brain injury, probable malingering, and conversion disorder groups: a preliminary study
.
Clinical Neuropsychology
 ,
27
,
693
707
.
Proto
,
D. A.
,
Pastorek
,
N. J.
,
Miller
,
B. I.
,
Romesser
,
J. M.
,
Sim
,
A. H.
, &
Linck
,
J. F.
(
2014
).
The dangers of failing one or more performance validity tests in individuals claiming mild traumatic brain injury-related postconcussive symptoms
.
Archives of Clinical Neuropsychology
 ,
29
,
614
624
.
Randolph
,
C.
(
1998
).
Repeatable battery for the assessment of neuropsychological status (RBANS)
 .
San Antonio, TX
:
Harcourt: The Psychological Corporation
.
Sackett
,
D. L.
,
Haynes
,
R. B.
,
Guyatt
,
G. H.
, &
Tugwell
,
P.
(
1991
).
Clinical epidemiology: a basic science for clinical medicine
 .
Boston, MA
:
Little, Brown, and Company
.
Schroeder
,
R. W.
,
Baade
L. E.
,
Peck
C. P.
,
VonDran
,
E. J.
,
Brockman
C. J.
,
Webster
,
B. K.
, et al
. (
2012
).
Validation of MMPI-2-RF validity scales in criterion group neuropsychological samples
.
The Clinical Neuropsychologist
 ,
26
,
129
146
.
Shaw
,
D. J.
&
Matthews
,
C. G.
(
1965
).
Differential MMPI performance of brain damaged versus pseudoneurologic groups
.
Journal of Clinical Psychology
 ,
21
,
405
408
.
Silk-Eglit
,
S. M.
,
Lynch
,
J. K.
, &
Mccaffrey
,
R. J.
(
2016
).
Validation of Victoria Symptom Validity Test cutoff scores among mild traumatic brain injury litigants using a known-groups design
.
Archives of Clinical Neuropsychology
 , Advance on line publication. .
Silverberg
,
N. D.
,
Wertheimer
,
J. C.
, &
Fichtenberg
,
N. L.
(
2007
).
An effort index for the repeatable battery for the assessment of neuropsychological status (RBANS)
.
The Clinical Neuropsychologist
 ,
21
,
841
854
.
Slick
,
D. J.
,
Hopp
,
G.
,
Strauss
,
E.
, &
Thompson
,
G. B.
(
1997
).
Victoria Symptom Validity Test: professional manual
 .
Odessa, FL
:
Psychological Assessment Resources, Inc
.
Slick
,
D. J.
,
Sherman
,
E. M.
, &
Iverson
,
G. L.
(
1999
).
Diagnostic criteria for malingered neurocognitive dysfunction: proposed standards for clinical practice and research
.
The Clinical Neuropsychologist
 ,
13
,
545
561
.
Stenclik
,
J. H.
,
Miele
,
A. S.
,
Silk-Eglit
,
G.
,
Lynch
,
J. K.
, &
McCaffrey
,
R. J.
(
2013
).
Can the sensitivity and specificity of the TOMM be increased with differential cutoff scores
.
Applied Neuropsychology: Adult
 ,
20
,
243
248
.
Tarescavage
,
A. M.
,
Wygant
,
D. B.
,
Gervais
,
R. O.
, &
Ben-Porath
,
Y. S.
(
2013
).
Association between the MMPI-2 Restructured Form (MMPI-2-RF) and malingered neurocognitive dysfunction among non-head injury disability claimants
.
The Clinical Neuropsychologist
 ,
27
,
313
335
.
Tellegen
,
A.
, &
Ben-Porath
,
Y. S.
(
2008
).
Minnesota Multiphasic Personality Inventory-2 restructured form: technical manual
 .
Minneapolis, MN
:
University of Minnesota Press
.
Tombaugh
,
T. N.
(
1996
).
Test of Memory Malingering (TOMM)
 .
New York
:
Multi-Health Systems Inc
.
Tsushima
,
W. T.
,
Geling
O.
, &
Fabrigas
J.
(
2011
).
Comparison of MMP-2 validity scales scores of personal injury litigants and disability claimants
.
The Clinical Neuropsychologist
 ,
25
,
1403
1414
.
Victor
,
T. L.
,
Boone
,
K. B.
,
Serpa
,
J. G.
,
Buehler
,
J.
, &
Ziegler
,
E. A.
(
2009
).
Interpreting the meaning of multiple symptom validity test failure
.
The Clinical Neuropsychologist
 ,
23
,
297
313
.
Whitney
,
K. A.
(
2013
).
Predicting test of memory malingering and Medical Symptom Validity Test failure within a Veterans Affairs Medical Center: use of Response Bias Scale and the Henry-Heilbronner Index
.
Archives of Clinical Neuropsychology
 ,
28
,
222
235
.
Whitney
,
K. A.
,
Davis
,
J. J.
,
Shepard
,
P. H.
, &
Herman
,
S. M.
(
2008
).
Utility of the Response Bias Scale and other MMPI-2 validity scales in predicting TOMM performance
.
Archives of Clinical Neuropsychology
 ,
2
,
8
22
.
Wygant
,
D. B.
,
Sellbom
,
M.
,
Gervais
,
R. O.
,
Ben-Porath
,
Y. S.
,
Stafford
,
K. P.
,
Freeman
,
D. B.
, et al
. (
2010
).
Further validation of the MMPI-2 and MMPI-2-RF Response Bias Scale: findings from disability and criminal forensic settings
.
Psychological Assessment
 ,
22
,
745
756
.
Young
,
J. C.
,
Kearns
,
L. A
., &
Roper
,
B. L.
(
2011
).
Validation of the MMPI-2 Response Bias Scale and Henry-Heilbronner Index in a U.S. veteran population
.
Archives of Clinical Neuropsychology
 ,
26
,
194
204
.
Youngjohn
,
J. R.
,
Wershba
,
R.
,
Stevenson
,
M.
,
Sturgeon
,
J.
,
Thomas
,
M. L.
(
2011
).
Independent validation of the MMPI-2-RF somatic/cognitive validity scales in TBI litigants tested for effort
.
The Clinical Neuropsychologist
 ,
25
,
463
476
.

Author notes

This research was approved through the Institutional Review Board at Womack Army Medical Center. The views expressed herein are those of the author and do not reflect the official policy of the Department of the Army, Department of Defense, or the U.S. Government.