Abstract

Objective.

This research examined cutoff scores for the Effort Index (EI), an embedded measure of performance validity, for the Repeatable Battery for the Assessment of Neuropsychological Status. EI cutoffs were explored for an active-duty military sample composed mostly of patients with traumatic brain injury.

Method.

Four psychometrically defined malingering groups (definite malingering, probable to definite malingering, probable malingering, and a combined group) were formed based on the number of validity tests failed.

Results.

Excellent specificities (0.97 or greater) were found for all cutoffs examined (EI ≥ 1 to EI ≥ 3). Excellent sensitivities (0.80 to 0.89) were also found for the definite malingering group. Sensitivities were 0.49 or below for the other groups. Positive and negative predictive values and likelihood ratios indicated that the EI cutoffs were much stronger for ruling-in than for ruling-out malingering. Analyses indicated that the validity tests used to form the malingering groups were uncorrelated, which supports the validity of the group assignments.

Conclusions.

Cutoffs were similar to those found in other research using samples composed predominantly of head-injured individuals.

The Repeatable Battery for the Assessment of Neuropsychological Status (RBANS; Randolph, 1998, 2012) is a widely used neuropsychological test originally developed to assess dementia in elderly populations. However, its use has been expanded to many other patient groups, including individuals with stroke, multiple sclerosis, Parkinson's disease, psychosis, traumatic brain injury, and concussed athletes (Beatty, 2004; Beatty et al., 2003; Holzer et al., 2007; Larson, Kirschener, Bode, Heineman, & Goodman, 2005; Lippa, Hawes, Jokic, & Caroselli, 2013; McCay, Casey, Wertheimer, & Fichtenberg, 2007; Moser & Schatz, 2002). The battery consists of five indexes, which assess Attention, Language, Visuospatial/Constructional Abilities, Immediate Memory, and Delayed Memory. A Total Index score is calculated from all of the cognitive domains assessed. According to Barth, Isler, Helmick, Wingler, and Jaffee (2010), the RBANS was selected for neuropsychological assessment in military battlefield or deployed situations because of its ease of use, portability, short administration time, and equivalent alternate forms. In addition, the RBANS has an embedded measure, the Effort Index (EI; Silverberg, Wertheimer, & Fichtenberg, 2007), to assess the credibility of performance on the test. Barth et al. state that military psychologists are encouraged to use EI as well as other validity tests in their interpretation of neuropsychological test results.

EI is computed from the RBANS list recognition and digit span subtests (Silverberg et al., 2007), which have been found to be relatively insensitive to cognitive dysfunction. The raw scores on these two subtests are combined using a weighted scoring system, resulting in a score that can range from 0 to 12, with higher scores indicating increasing levels of noncredible performance. In their initial study of the EI, Silverberg et al. reported that scores above 3 were rare in the general population and should be considered suspicious of noncredible performance on the RBANS. This initial research also suggested that a cutoff of ≥1 was optimal in discriminating between participants with actual vs. falsely alleged cognitive dysfunction owing to post-acute mild traumatic brain injury (mTBI). Overall classification accuracy (86.9%) was optimal at a cutoff of ≥1 when comparing the mTBI group with three other malingering groups (clinical malingerers, simulated-naive malingerers, and simulated-coached malingerers).
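For readers who want the mechanics, the following Python sketch illustrates how an embedded index of this kind can be computed from the two subtest raw scores. The lookup tables and function names are placeholders for illustration only; the actual conversion weights are those published by Silverberg et al. (2007) and are not reproduced here.

# Placeholder weight tables: {minimum raw score: weighted points}.
# These are NOT the published EI weights.
LIST_RECOGNITION_POINTS = {20: 0, 18: 1, 16: 2, 14: 3, 0: 6}
DIGIT_SPAN_POINTS = {9: 0, 7: 1, 6: 2, 5: 3, 0: 6}

def subtest_points(raw_score: int, table: dict) -> int:
    """Weighted points for the highest table key that raw_score meets or exceeds."""
    return table[max(k for k in table if raw_score >= k)]

def effort_index(list_recognition_raw: int, digit_span_raw: int) -> int:
    """Sum of the two weighted subtest scores (0-12; higher = less credible)."""
    return (subtest_points(list_recognition_raw, LIST_RECOGNITION_POINTS)
            + subtest_points(digit_span_raw, DIGIT_SPAN_POINTS))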

Research on EI cutoffs in samples of older adult and geriatric patients generally advises against using EI, or urges significant caution when using the originally suggested cutoffs (e.g., Barker, Horner, & Bachman, 2010; Duff et al., 2011; Hook, Marquine, & Hoelzle, 2009), because of the high false-positive rates found in cases of dementia at these cutoffs. To overcome the false-positive rate for EI in patients with “true” amnesia, Novitski, Steele, Karantzoulis, and Randolph (2012) developed a new Effort Scale (ES) for the RBANS. ES uses the list recognition and digit span subtests of the EI but also includes other free recall measures on the RBANS. They provide evidence that ES is more sensitive and specific in a sample of patients with amnestic disorders. However, recent research has questioned the use of both EI and ES in individuals with dementia (Burton, Enright, O'Connell, Lanting, & Morgan, 2015). Recent research has also demonstrated that EI performed better than ES in a sample of younger disability litigants (M age = 42.5) undergoing forensic evaluation (Crighton, Wygant, Holt, & Granacher, 2015). The data available for the current research did not allow for the calculation of ES, so a comparison could not be made with that research.

There is limited research examining EI cutoff scores in military samples. However, two studies have examined EI in samples somewhat similar to the current research (Armistead-Jehle & Hansen, 2011; Young, Baughman, & Roper, 2012). The study by Young et al. used a non-geriatric sample of military veterans. They sought to determine whether the recommended cutoff of >3 produced the best balance of sensitivity and specificity. They found that a cutoff of >3 had strong specificity (0.94) but limited sensitivity (0.31) when predicting pass–fail status on the Word Memory Test (WMT; Green, 2003). A cutoff of >2 improved sensitivity (0.54), but specificity fell to 0.81, which they considered below a preferred specificity of 0.90. The only cutoff that produced a positive predictive value (PPV) of at least 0.90 was >3. The sample that Young et al. used consisted primarily of individuals with psychiatric disorders. The current research used an active-duty sample composed mostly of individuals with traumatic brain injury, mostly mTBI.

The research by Armistead-Jehle and Hansen (2011) used an active-duty military sample consisting primarily of individuals who reported a history of mTBI (84.7%) and/or various mental health conditions. They used pass–fail status on the Test of Memory Malingering (TOMM; Tombaugh, 1996), the Medical Symptom Validity Test (MSVT; Green, 2004), and the Nonverbal Medical Symptom Validity Test (NV-MSVT; Green, 2008) to examine EI cutoffs. The cutoffs they examined were ≥1 and >3. The sensitivities for the validity tests they examined ranged from 0.24 for the MSVT to 0.62 for the NV-MSVT, and specificities ranged from 0.92 for the TOMM to 0.97 for the MSVT and NV-MSVT. A receiver operating characteristic analysis was performed for the MSVT because they thought it had more research using samples similar to the one used in their research. They concluded that a cutoff of ≥1 had the best tradeoff between true positives and false positives.

The current study is similar to the Armistead-Jehle and Hansen (2011) research in that both studies used a mostly head-injured sample. However, the current research employs a larger sample that is more representative of Army demographics. The demographics of the sample used in the current research (discussed in more detail later) are more similar to the demographics of the Army at large in terms of education, age, and distribution of military ranks.

Method

Sample

The sample used in the current research is the same as that used by Jones (2013a, 2013b), and the methodology in the current research parallels that used in the previous research. This research was approved by the Institutional Review Board at Womack Army Medical Center. The reader is referred to the previous research for more detail concerning the sample composition. In general, all participants were active-duty military members. The majority of the sample experienced mTBI, and almost all had potential incentive(s) to malinger. The participants were living independently and were evaluated by the author as outpatients in the brain injury medicine or neuropsychology service at two army medical centers. The participants completed a neuropsychological evaluation that included the Minnesota Multiphasic Personality Inventory-2 (MMPI-2; Butcher, Dahlstrom, Graham, Tellegen, & Kaemmer, 1989) and at least one performance validity test (PVT). The initial sample consisted of 495 participants; after exclusion criteria for the MMPI-2 were applied (CNS > 18, TRIN or VRIN ≥ 80), the sample consisted of 462 participants. The sample was 88% male; the mean age was 31.4 years (SD = 8.8), and the mean education level was 13.2 years (SD = 2.0). The ethnic distribution was 73% Caucasian, 16% African American, 10% Hispanic, and <1% Asian or other. The distribution of military rank was 89% enlisted and 11% officers.

The participants were consecutive referrals to the author. Because of the retrospective nature of the data, the exact distribution of the severity and nature of head injury was not available for the sample. However, data were available for about two-thirds of the initial sample, although unfortunately not linked to individual participants. About 90% of the subsample with information on severity and mechanism of injury experienced closed head injury, blast injury, or heat stroke/exhaustion, or some combination of the three. Approximately 75% of these injuries were estimated to be mild, 19% moderate, and 6% severe. About 10% of the sample was evaluated for brain disease (e.g., multiple sclerosis, epilepsy, Huntington's disease). Criteria used to judge the severity of TBI were based in large part on the Department of Veterans Affairs/Department of Defense consensus-based classification of closed TBI severity (Department of Veteran Affairs, 2009). However, information for each criterion was not always available (e.g., Glasgow Coma Scale ratings and brain-imaging studies at time of injury), so the severity ratings were primarily based on criteria related to length of loss of consciousness, length of alteration of consciousness (e.g., feeling dazed, disoriented, confused, or having difficulty mentally tracking events), and length of post-traumatic amnesia. The severity ratings were based on the participants' self-report and/or review of medical records, when available, that documented the nature and characteristics of the injury. Given the exigencies related to care provided in an area of combat, the documentation of the nature and characteristics of the injury and the type of care provided for injuries that occurred in theater was at times limited in scope or nonexistent. In addition, the description of the nature and characteristics of the injury provided by the participants often occurred several years after the injury/injuries and generally not less than several months after the injury. The descriptions of injuries were also at times likely influenced by the various motivations of the participants when evaluated, especially those who demonstrated evidence of questionable performance and/or symptom validity at the time of evaluation. It was not uncommon to find discrepancies, especially embellishment of the characteristics of injury, between self-reports at the time of evaluation and information documented in medical records in such cases. Consequently, the severity ratings are best estimates influenced by the several sources of error inherent in research in this population.

Measures and Validity Test Cutoff Scores

A combination of PVTs and Symptom Validity Tests (SVTs) was used to establish comparison groups for the current research. PVTs assess the credibility of actual test performance, while SVTs assess the accuracy of self-reported neurocognitive, somatic, and psychological symptoms and problems (Larrabee, 2012). In addition to SVTs (discussed below), three freestanding PVTs and one embedded PVT were used to form the malingering groups. The freestanding tests included the Victoria Symptom Validity Test (VSVT; Slick, Hopp, Strauss, & Thompson, 1997), the TOMM, and the WMT. The embedded PVT was Reliable Digit Span (RDS), for which a cutoff of ≤7 was used (Jasinski, Berry, Shandera, & Clark, 2011; Mathias, Greve, Bianchini, Houston, & Crouch, 2002). The stand-alone PVTs were administered and scored by computer using standard instructions. Cutoffs for determining failure on the WMT were based on the test manual. For the TOMM, the cutoffs used were ≤43 for Trial 1 and ≤49 for the other two trials. These cutoffs all had PPVs of 0.90 or greater at a base rate of 0.40 for the probable malingering (PM), probable to definite malingering (PDM), and definite malingering (DM) groups used in research by Jones (2013a). These malingering groups and a nonmalingering (NM) group paralleled the groups (described below) used in the current research. The Jones research suggested a base rate of 0.41 for PM based on failure of two or more PVTs. This base rate for malingering and/or noncredible performance on cognitive tests is similar to what has been found in other research using military and civilian samples (Armistead-Jehle & Hansen, 2011; Lange et al., 2012; Larrabee, 2003, 2007; Larrabee, Millis, & Myers, 2009; Mittenberg, Patton, Canyock, & Condit, 2002). Cutoffs used for the VSVT for the current research were ≤20, ≤18, and ≤41 for the Easy, Hard, and Total scores, respectively. Research by Jones (2013b) indicates that these cutoffs for the Hard and Total scores produced PPVs of at least 0.90 at a base rate of 0.40 for a PM and PDM group. The cutoff for the Easy items for the PM group had a PPV of 0.88. It is important to note that the cutoffs established in the Jones research on the TOMM and VSVT and used in the current research are consistent with cutoffs established in recent research using a variety of other samples (Greve, Bianchini, Black, et al., 2006; Greve, Bianchini, & Doane, 2006; Greve, Etherton, Ord, Bianchini, & Curtis, 2009; Grote et al., 2000; Loring, Larrabee, Lee, & Meador, 2007; Macciocchi, Seel, Alderson, & Godsall, 2006). The research by Jones, as well as the research reviewed in the Jones manuscripts related to the TOMM and VSVT, strongly indicates that the standard cutoffs suggested by the respective manuals are too conservative, not only in a military sample but in other populations as well.

The SVTs used in this research were the Cognitive-Somatic Symptom Validity Tests (CS-SVTs) from the MMPI-2 and the MMPI-2-RF (Ben-Porath & Tellegen, 2008). The CS-SVTs from the MMPI-2 and MMPI-2-RF (scored from the MMPI-2) are designed to assess noncredible reporting or over-reporting of somatic and/or cognitive problems. The scales from the MMPI-2 included the Fake Bad Scale (FBS; Lees-Haley, English, & Glenn, 1991), now named the Symptom Validity Scale, and the Response Bias Scale (RBS; Gervais, Ben-Porath, Wygant, & Green, 2007). The MMPI-2-RF has a revised version of the MMPI-2 FBS (FBS-r), which retains 30 of the original 43 FBS items. RBS remains intact on the MMPI-2-RF. The MMPI-2-RF Infrequent Somatic Responses scale (Fs) was also used. Although Fs was designed to assess over-reporting of somatic symptoms, it has been shown to be sensitive to complaints of memory problems (Gervais, Ben-Porath, Wygant, Green, & Sellbom, 2010). A principal component analysis by Jones and Ingram (2011) also found that Fs loaded higher on a component that consisted of FBS, HHI, and RBS than on a component composed of the traditional F-family of scales. Other CS-SVTs used for the current research included the Henry–Heilbronner Index (HHI; Henry, Heilbronner, Mittenberg, & Enders, 2006), which was developed for the MMPI-2, and a revised, shorter version, the HHI-r, which was recently developed for the MMPI-2-RF (Henry, Heilbronner, Algina, & Kaya, 2013). Neither the HHI nor the HHI-r has been adopted by the publisher of the MMPI-2 and MMPI-2-RF. However, Jones and Ingram demonstrated that in a military sample HHI performed as well as the best performing CS-SVT scale (RBS) and better than the other CS-SVT scales (FBS, FBS-r, and Fs) in predicting adequate vs. inadequate effort on tests of cognitive functioning.

The cutoffs used for RBS, FBS-r, and Fs were based on guidelines presented in the interpretive manuals for the MMPI-2-RF. A T-score of >100 was used for RBS (raw score ≥17), FBS-r (raw score ≥24), and Fs (raw score ≥8) because scores below a T-score of 100 may suggest other explanations for elevated scores, including significant emotional dysfunction (RBS) or significant and/or multiple medical conditions (FBS-r and Fs), according to Ben-Porath and Tellegen (2008, 2011). The cutoff for FBS was a raw score ≥25. The review of the literature by Greiffenstein, Fox, and Lees-Haley (2007) suggested that an FBS score ≥23 justifies concern about the validity of self-reported symptoms, but subsequent research (e.g., Peck et al., 2013; Tsushima, Geling, & Fabrigas, 2011) has found that higher scores are needed to obtain acceptable specificity. Dionysus, Denney, and Halfaker (2011) found that an FBS score ≥25 was necessary to obtain a specificity of at least 0.90 in differentiating a group of probable to definite malingerers who met Slick, Sherman, and Iverson (1999) criteria for malingering from a group of head-injured patients who did not meet these criteria. The Dionysus et al. sample is the most similar to the sample used in the current research, so this cutoff was used for this research. With respect to HHI, a cutoff of ≥12 was used in the current research. Henry and colleagues (2006) suggested that an HHI cutoff of ≥8 had acceptable classification accuracy in differentiating personal injury litigants and disability claimants who met Slick et al. criteria for PM from nonlitigating head-injured controls with mTBI (85%) or moderate-to-severe injuries (15%). Subsequent research has suggested that higher cutoffs (≥11 to ≥14) may be needed to satisfactorily differentiate individuals who pass or fail PVTs (Tsushima et al., 2011; Whitney, 2013; Young, Kerns, & Roper, 2011). The research by Dionysus et al. found that a cutoff of ≥12 was needed to differentiate their groups. Because of the similarities between the Dionysus et al. sample and the sample used in the current research, this cutoff was used. The cutoffs used in this research for FBS and HHI, based in large part on the Dionysus et al. (2011) research, are identical to cutoffs found in research in progress that is establishing cutoffs in a military sample for these and other cognitive-somatic validity scales on the MMPI-2 and MMPI-2-RF. The cutoffs for FBS and HHI had a specificity of at least 0.90 for malingering groups that paralleled the malingering groups used in the current research.
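The failure rules described above can be summarized in a short sketch. The cutoffs are those stated in the text; the variable and dictionary key names are illustrative, and WMT failure, which is scored against its manual, is passed in as a precomputed flag.

def pvt_failures(scores: dict) -> dict:
    """Pass/fail flags for each PVT (True = failed), using the cutoffs above."""
    return {
        "RDS": scores["rds"] <= 7,
        "TOMM": (scores["tomm_trial1"] <= 43
                 or scores["tomm_trial2"] <= 49
                 or scores["tomm_retention"] <= 49),
        "VSVT": (scores["vsvt_easy"] <= 20
                 or scores["vsvt_hard"] <= 18
                 or scores["vsvt_total"] <= 41),
        "WMT": scores["wmt_failed_per_manual"],  # scored against the WMT manual
    }

def svt_failure(scores: dict) -> bool:
    """Failure on any cognitive-somatic SVT scale counts as a single failure."""
    return (scores["rbs_raw"] >= 17        # RBS T > 100
            or scores["fbs_raw"] >= 25
            or scores["fbs_r_raw"] >= 24   # FBS-r T > 100
            or scores["fs_raw"] >= 8       # Fs T > 100
            or scores["hhi_raw"] >= 12)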

Group Assignment

The formation of the malingering groups was based on the simplification and refinements of the Slick et al. (1999) criteria suggested by Larrabee, Greiffenstein, Greve, and Bianchini (2007) and Larrabee (2008). Although the Slick et al. criteria have been validated and widely used to diagnose Malingered Neurocognitive Dysfunction (MND), Larrabee et al. concluded that the Slick et al. criteria related to evidence from neuropsychological testing could be modified and simplified. They argued that multiple positive findings on independent (uncorrelated) psychometric indicators could be used to define different levels of malingering regardless of other Slick et al. criteria. Larrabee (2008) stated that failure on two independent validity tests provides strong evidence for a diagnosis of PM (post-test probability = 0.90+), and failure on three validity tests provides very strong evidence of probable, if not definite, malingering. Larrabee et al. stated that failure on three independent and well-validated validity indicators appears to be associated with “100% probability of malingering and is statistically equivalent to definite MND. . .” (p. 357). However, they also indicate that although this is associated with 100% probability of malingering, scores at this level are not conceptually equivalent to the active avoidance of correct answers associated with significantly worse-than-chance performance on two-alternative forced-choice testing. They argue that significantly below-chance performance would indicate conscious intent and definite malingering.

Consistent with the arguments advanced by Larrabee and colleagues (2007), three malingering groups (PM, PDM, and DM) were formed based on the number of PVTs and SVTs failed. For PVTs consisting of more than one trial (TOMM, VSVT, and WMT), failure of the test overall was operationalized as failure of one or more trials of the test. All SVTs were scored for all participants, and failure on any of the SVTs was counted as one failure. The correlations between the SVTs for the full (N = 462) sample were examined to assess redundancy among these scales, and this analysis suggested substantial overlap between the SVTs. The mean correlations reported below are based only on rs between RBS, FBS, Fs, and HHI, because the correlations between FBS and FBS-r (r = .95) and between HHI and HHI-r (r = .97) for the final sample suggested that they were highly redundant. All correlations between the SVTs were significant at p = .001. The mean correlations ranged from 0.66 (Fs vs. the other three SVTs) to 0.76 (HHI vs. the other SVTs). The mean correlation for all SVTs was 0.72. The mean correlations between the SVTs, excluding Fs, ranged from 0.76 to 0.81 with a grand mean of 0.78. It was expected that Fs might have a somewhat lower correlation with the other SVTs because it was not developed using disability samples and/or PVTs as criteria for selecting items, as were the other SVTs. The items on Fs were selected on a rare-symptoms rationale, i.e., somatic symptoms that are infrequently endorsed by patients in treatment for medical problems. Nevertheless, the correlations between Fs and the other scales were at least moderate to high (0.59–0.70). In general, the correlations between the SVTs indicate that they are largely redundant, and it is prudent not to count failure on the SVTs separately.
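A minimal sketch of this redundancy check is shown below. The column names and the randomly generated values are placeholders so that the snippet runs; the study's actual SVT raw scores would be substituted.

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
shared = rng.normal(size=200)                    # shared "over-reporting" signal
svt = pd.DataFrame({scale: shared + rng.normal(scale=0.7, size=200)
                    for scale in ["RBS", "FBS", "Fs", "HHI"]})

r = svt.corr()                                   # Pearson correlation matrix
mean_r_per_scale = (r.sum() - 1) / (len(r) - 1)  # drop the diagonal r = 1.0
grand_mean_r = mean_r_per_scale.mean()
print(r.round(2))
print(mean_r_per_scale.round(2))
print(round(grand_mean_r, 2))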

As indicated, the malingering groups formed for the current research paralleled the groups formed by Jones (2013a, 2013b) and used the same sample. To form the malingering groups and the NM group, a participant had to have been administered at least two validity tests. All participants who performed significantly below chance on the WMT or TOMM also scored significantly below chance on the VSVT; 3 scored below chance on the WMT and 4 on the TOMM. Of the 462 participants in the sample after MMPI-2 exclusion criteria were applied, 459 were administered at least two validity tests. Of these 459 participants, 203 completed the RBANS and met inclusion criteria, as described below, for the NM group and the three malingering groups.

Five comparison groups were formed to establish EI cutoffs. The participants in the NM group were administered at least two validity tests, as indicated, and failed none. The PM group was based on failure of exactly two validity tests; the PDM group was based on failure of three or more validity tests; and the DM group was composed of 19 participants whose performance on the VSVT was significantly below chance. Below-chance performance was defined as ≤8 on any VSVT trial (cf. Binder, Larrabee, & Millis, 2014). None of the participants in the PM or PDM group overlapped with the participants in the DM group. A fifth group, a combined malingering group (CM), included all of the participants in the three malingering groups in order to assess the operating characteristics of EI cutoffs for all participants regardless of level of malingering. Table 1 provides a summary of the number of validity tests administered for each group and which validity tests were failed by each group. There were no significant differences in age or education or in the distribution of gender, ethnicity, or military rank across the comparison groups.
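The group-assignment rules just described can be expressed compactly. The function name and arguments below are illustrative; they assume that PVT and SVT failures have already been scored as described in the preceding section.

from typing import Optional

def assign_group(n_tests_administered: int, n_tests_failed: int,
                 vsvt_below_chance: bool) -> Optional[str]:
    """Assign a participant to a comparison group using the rules above.

    n_tests_failed counts PVT failures plus at most one failure for the
    cognitive-somatic SVT scales; vsvt_below_chance is True if any VSVT
    trial was <= 8.
    """
    if n_tests_administered < 2:
        return None                      # too few validity tests administered
    if vsvt_below_chance:
        return "DM"                      # definite malingering
    if n_tests_failed >= 3:
        return "PDM"                     # probable to definite malingering
    if n_tests_failed == 2:
        return "PM"                      # probable malingering
    if n_tests_failed == 0:
        return "NM"                      # nonmalingering
    return None                          # exactly one failure: not classified

# The combined malingering (CM) group is the union of the PM, PDM, and DM groups.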

Table 1.

Number of validity tests administered and type of validity tests administered for each comparison group

Comparison group / n / Number of validity tests administered / Number of each PVT failed (TOMM, VSVT, WMT, RDS, SVT)
Nonmalingering 76 15 55 
Probable 47 40 34 28 27 
Probable-Definite 61 52 54 59 13 61 
Definite 19 14 17 19 19 
Combined 127 106 18 105 106 20 107 

Note: TOMM = Test of Memory Malingering; VSVT = Victoria Symptom Validity Test; WMT = Word Memory Test; RDS = Reliable Digit Span; SVT = failure on one or more of the MMPI-2 or MMPI-2 RF Cognitive-Somatic validity scales.

Analytic Plan

The data analysis for this research first included an examination of the mean EI scores for each comparison group and an examination of effect sizes. The effect sizes were calculated using Glass' Δ and Cohen's d. Glass' Δ is considered the better index of effect size when variances are not homogeneous, which is the case for this research (details are discussed in the Results). As Ellis (2010) and Olejnik and Algina (2000) indicate, if variances are roughly the same, then it is reasonable to assume that they are estimating a common population standard deviation. In this case, it is reasonable to pool the standard deviations and calculate Cohen's d as the index of effect size. However, these authors also state that if the standard deviations of the groups differ, then the homogeneity of variance assumption is violated and pooling the standard deviations is not appropriate. When this is the case, they argue that Glass' Δ should be used, based on the variance that is likely untainted by extraneous factors, i.e., the best estimate of the population variance.
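A brief sketch of the two effect-size indices, assuming NumPy arrays of EI scores for each group (the array names in the comment are placeholders):

import numpy as np

def glass_delta(group: np.ndarray, control: np.ndarray) -> float:
    """Glass' delta: mean difference scaled by the control (NM) group's SD."""
    return (group.mean() - control.mean()) / control.std(ddof=1)

def cohens_d(group1: np.ndarray, group2: np.ndarray) -> float:
    """Cohen's d using the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    pooled_var = (((n1 - 1) * group1.var(ddof=1) + (n2 - 1) * group2.var(ddof=1))
                  / (n1 + n2 - 2))
    return (group1.mean() - group2.mean()) / np.sqrt(pooled_var)

# Homogeneity of variance can be checked with Levene's test (see Results), e.g.:
# from scipy.stats import levene
# W, p = levene(nm_ei, pm_ei, pdm_ei, dm_ei)   # placeholder arrays of EI scores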

A second analysis was completed to ensure that the PVTs were independent (i.e., not correlated). This is important because, for example, the PM group was composed of participants who failed exactly two validity tests in any combination but did not fail the VSVT at a below-chance level. If the two tests that were failed were redundant (highly correlated), then failing both would be equivalent to failing one test. However, when nonredundant tests are used, there is greater certainty that the individuals who were placed in the PM group were correctly classified. Larrabee (2008) and Nelson et al. (2003) used correlational analysis across the full range of scores to establish validity test independence. However, it can be argued that for the current research the primary concern is establishing whether there is an association between passing or failing a validity test within a comparison group, not the association between the full range of scores on each validity test. Thus, one wants to know whether the validity tests failed by those in each of the malingering groups were or were not redundant. To establish independence for the PVTs used in this research, a χ2 analysis was completed to assess the association between the pass–fail status for each of the validity tests used to establish the malingering comparison groups.
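The pass–fail association test can be sketched as follows. The snippet assumes boolean failure vectors restricted to the participants who took both tests; turning off the Yates continuity correction is an assumption here, since the text does not specify whether it was applied.

import numpy as np
from scipy.stats import chi2_contingency

def passfail_association(fail_a: np.ndarray, fail_b: np.ndarray):
    """Chi-square test and phi coefficient for two pass/fail (boolean) vectors."""
    table = np.array([
        [np.sum(~fail_a & ~fail_b), np.sum(~fail_a & fail_b)],
        [np.sum(fail_a & ~fail_b), np.sum(fail_a & fail_b)],
    ])
    # correction=False gives the plain (uncorrected) chi-square statistic.
    chi2, p, _, _ = chi2_contingency(table, correction=False)
    # phi is Pearson's r computed on the two dichotomous variables.
    phi = np.corrcoef(fail_a.astype(int), fail_b.astype(int))[0, 1]
    return chi2, phi, p, int(table.sum())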

The third analysis involved an examination of classification accuracy in terms of sensitivity and specificity for a range of cutoffs for EI. Because past research has suggested that an EI cutoff as low as 1 can provide an adequate specificity of at least 90% (Larrabee, 2012; Victor, Boone, Serpa, Buehler, & Ziegler, 2009), cutoffs were examined starting with ≥1 and continuing until a specificity of 1.0 was reached. Sensitivity and specificity inform us of the accuracy of a test but not the probability that someone has a condition of interest (COI). However, they can be combined to calculate post-test probabilities of a COI, i.e., the probability of malingering. Thus, the current research included calculation of PPVs and negative predictive values (NPVs), as well as the likelihood ratios for positive (LR+) and negative (LR−) test results. PVs provide post-test probabilities at different base rates, and LRs can be combined to establish post-test probabilities of a COI.
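The classification and post-test statistics reported in the Results follow directly from sensitivity, specificity, and an assumed base rate. A minimal sketch is shown below; the example uses the rounded figures reported later for the DM group at EI ≥ 1, so the outputs differ slightly from Table 5, which is computed from unrounded inputs.

def classification_stats(sens: float, spec: float, base_rate: float) -> dict:
    """LR+, LR-, PPV, and NPV from sensitivity, specificity, and a base rate."""
    lr_pos = sens / (1 - spec) if spec < 1 else float("inf")
    lr_neg = (1 - sens) / spec
    ppv = (sens * base_rate) / (sens * base_rate + (1 - spec) * (1 - base_rate))
    npv = (spec * (1 - base_rate)) / (spec * (1 - base_rate) + (1 - sens) * base_rate)
    return {"LR+": lr_pos, "LR-": lr_neg, "PPV": ppv, "NPV": npv}

# Rounded DM-group values at EI >= 1 (sensitivity 0.89, specificity 0.97),
# base rate 0.40: LR+ ~ 29.7, LR- ~ 0.11, PPV ~ 0.95, NPV ~ 0.93.
print(classification_stats(0.89, 0.97, 0.40))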

Results

The means, standard deviations, and effect sizes for the comparison groups are reported in Table 2. Levene's test of homogeneity of variance indicated a significant difference in the variances between the four primary comparison groups (NM, PM, PDM, and DM), F(3, 199) = 48.9, p = .001; thus, they do not appear to be estimates of a common variance. As discussed earlier, Glass' Δ is the better indicator of standardized effect sizes in this case. Based on Glass' Δ, the effect sizes are extremely large, indicating substantial differences between the NM group and the malingering groups. Cohen's d suggested moderate-to-large effect sizes.

Table 2.

Descriptive statistics and effect sizes for the probable malingering (PM), probable to definite (PDM), definite malingering, and combined malingering group (CM) vs. nonmalingering (NM) group for the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS) Effort Index (EI)

Group: NM (n = 76); PM (n = 47); PDM (n = 61); DM (n = 19); CM (N = 127)

EI, M (SD): NM 0.05 (0.322); PM 0.66 (1.11); PDM 1.43 (1.91); DM 3.47 (2.89); CM 1.45 (2.06)

Effect size, Glass' Δ (NM vs.): PM 1.89; PDM 4.29; DM 10.6; CM 4.35

Effect size, Cohen's d (NM vs.): PM 0.55; PDM 0.72; DM 1.18; CM 0.85

Tables 3 and 4 show the results of the χ2 analysis that examined the independence of the validity tests used to form the comparison groups utilized to establish cutoff scores. The strength of the relationships between the validity tests in this research for those who failed either two (Table 3) or one or more validity tests (Table 4) is provided by φ, which is equivalent to Pearson's r when computed for two dichotomous variables. Only one correlation (RDS/TOMM) was significant in Table 3, and that correlation (−0.40) suggested minimal shared variance (16%). The cell sizes were small for some of the correlations in Table 3, so failure of one or more validity tests was computed to increase the cell sizes. Two of the correlations (VSVT/TOMM and WMT/TOMM) in Table 4 were significant (.17 and .36), but again the shared variance is minimal (3% and 13%). These results indicated that the validity tests are not redundant and that the composition of the comparison groups is valid.

Table 3.

χ2 analysis for validity test independence: two failed validity tests

SVT vs. TOMM: χ2 = 2.45, φ = −0.11, p = .118, N = 195
SVT vs. VSVT: χ2 = 2.40, φ = −0.12, p = .118, N = 172
SVT vs. WMT: χ2 = 0.330, φ = −0.11, p = .566, N = 28
SVT vs. RDS: χ2 = 1.80, φ = −0.20, p = .179, N = 46
TOMM vs. VSVT: χ2 = 0.777, φ = −0.07, p = .378, N = 153
TOMM vs. WMT: χ2 = 0.489, φ = −0.15, p = .484, N = 22
TOMM vs. RDS: χ2 = 6.31, φ = −0.40, p = .012, N = 40
VSVT vs. WMT: χ2 = 1.05, φ = 0.19, p = .306, N = 28
VSVT vs. RDS: χ2 = 0.729, φ = 0.08, p = .729, N = 19
WMT vs. RDS: χ2 = 0.133, φ = −0.58, p = .248

Notes: SVT = failure on one or more of the MMPI-2 or MMPI-2 RF cognitive-somatic validity scales; TOMM = Test of Memory Malingering; VSVT = Victoria Symptom Validity Test; WMT = Word Memory Test; RDS = Reliable Digit Span.

*The probability levels for all comparisons are based on χ2. There is no difference in the pattern of significant and nonsignificant results when compared with Fisher's exact test.

Table 4.

χ2 analysis for validity test independence: one or more failed validity tests

SVT vs. TOMM: χ2 = 2.65, φ = −0.10, p = .103, n = 268
SVT vs. VSVT: χ2 = 0.086, φ = 0.02, p = .770, n = 236
SVT vs. WMT: χ2 = 0.589, φ = 0.12, p = .443, n = 43
SVT vs. RDS: χ2 = 0.531, φ = −0.08, p = .466, n = 84
TOMM vs. VSVT: χ2 = 5.80, φ = 0.17, p = .016, n = 192
TOMM vs. WMT: χ2 = 3.84, φ = 0.36, p = .050, n = 29
TOMM vs. RDS: χ2 = 2.76, φ = −0.21, p = .097, n = 61
VSVT vs. WMT: χ2 = 2.50, φ = 0.25, p = .114, n = 39
VSVT vs. RDS: χ2 = 0.058, φ = 0.04, p = .809, n = 36
WMT vs. RDS: χ2 = 1.74, φ = −0.47, p = .187

Notes: SVT = failure on one or more of the MMPI-2 or MMPI-2 RF cognitive-somatic validity scales; TOMM = Test of Memory Malingering; VSVT = Victoria Symptom Validity Test; WMT = Word Memory Test; RDS = Reliable Digit Span.

* The probability levels for all comparisons are based on χ2. There is no difference in the pattern of significant and nonsignificant results when compared with Fisher's exact test.

Table 5 shows the results of the classification and diagnostic statistics. With respect to cutoff scores, an EI of ≥1 produced a specificity of 0.97 or greater for the malingering groups. Sensitivities for this cutoff ranged from 0.30 for the PM group to 0.89 for the DM group. The CM group had an adequate sensitivity of 0.48 at this cutoff. At a base rate of 0.40, PPVs for all groups and all cutoffs were ≥0.90, except for EI ≥ 1 and EI ≥ 2 for the PM group, which were 0.87 and 0.86, respectively. LRs+ were >10 for all cutoffs for all malingering groups, which generates large effects on post-test probabilities calculated from LRs+ (Hayden & Brown, 1999); the exception was EI ≥ 1 for the PM group, for which the LR+ was 8.9, generating a moderate effect on the post-test probability of malingering. LRs− ranged from 0.52 to 0.89 for the PDM and PM groups, which produced small effects on post-test probabilities, and from 0.11 to 0.20 for the DM group, which produced moderate effects on post-test probabilities.

Table 5.

Sensitivity, specificity, positive/negative predictive value (PPV/NPV), and likelihood ratios for cutoff scores for the RBANS EI

Format for each cutoff: sensitivity (95% CI); specificity (95% CI); PPV at base rates 0.10, 0.20, 0.30, 0.40, 0.50; NPV at base rates 0.10, 0.20, 0.30, 0.40, 0.50; LR+ (95% CI); LR− (95% CI).

Definite malingering
EI ≥ 1: 0.89 (0.67–0.99); 0.97 (0.90–1.0); PPV 0.75, 0.87, 0.92, 0.95, 0.97; NPV 0.99, 0.97, 0.96, 0.93, 0.90; LR+ 27.7 (7.0–109.4); LR− 0.11 (0.03–0.40)
EI ≥ 2: 0.89 (0.65–0.99); 0.97 (0.90–1.0); PPV 0.75, 0.87, 0.92, 0.95, 0.96; NPV 0.99, 0.97, 0.95, 0.93, 0.90; LR+ 27.6 (7.0–108.8); LR− 0.11 (0.03–0.42)
EI ≥ 3: 0.80 (0.44–0.97); 1.0 (0.94–1.0); PPV 1.0, 1.0, 1.0, 1.0, 1.0; NPV 0.98, 0.95, 0.92, 0.88, 0.83; LR+ ∞; LR− 0.20 (0.06–0.69)

Probable to definite malingering
EI ≥ 1: 0.49 (0.36–0.62); 0.97 (0.90–1.0); PPV 0.64, 0.80, 0.88, 0.92, 0.94; NPV 0.94, 0.88, 0.82, 0.74, 0.66; LR+ 16.5 (4.1–66.1); LR− 0.52 (0.41–0.67)
EI ≥ 2: 0.42 (0.28–0.56); 0.97 (0.90–1.0); PPV 0.61, 0.77, 0.86, 0.90, 0.93; NPV 0.94, 0.87, 0.79, 0.71, 0.62; LR+ 13.9 (3.4–56.5); LR− 0.60 (0.48–0.76)
EI ≥ 3: 0.33 (0.20–0.48); 1.0 (0.94–1.0); PPV 1.0, 1.0, 1.0, 1.0, 1.0; NPV 0.93, 0.85, 0.78, 0.69, 0.60; LR+ ∞; LR− 0.67 (0.55–0.82)

Probable malingering
EI ≥ 1: 0.30 (0.17–0.45); 0.97 (0.90–1.0); PPV 0.53, 0.71, 0.81, 0.87, 0.91; NPV 0.93, 0.85, 0.76, 0.67, 0.58; LR+ 8.9 (2.1–38.0); LR− 0.72 (0.60–0.87)
EI ≥ 2: 0.27 (0.15–0.42); 0.97 (0.90–1.0); PPV 0.50, 0.69, 0.79, 0.86, 0.90; NPV 0.92, 0.84, 0.76, 0.66, 0.57; LR+ 10.7 (2.6–45.1); LR− 0.76 (0.63–0.91)
EI ≥ 3: 0.11 (0.03–0.25); 1.0 (0.94–1.0); PPV 1.0, 1.0, 1.0, 1.0, 1.0; NPV 0.91, 0.82, 0.72, 0.63, 0.53; LR+ ∞; LR− 0.89 (0.80–1.0)

Combined malingering groups
EI ≥ 1: 0.48 (0.39–0.57); 0.97 (0.90–1.0); PPV 0.64, 0.80, 0.87, 0.91, 0.94; NPV 0.94, 0.88, 0.81, 0.74, 0.65; LR+ 16.1 (4.1–63.8); LR− 0.54 (0.45–0.64)
EI ≥ 2: 0.43 (0.35–0.53); 0.97 (0.90–1.0); PPV 0.62, 0.78, 0.86, 0.91, 0.94; NPV 0.94, 0.87, 0.80, 0.72, 0.63; LR+ 14.4 (3.6–57.5); LR− 0.59 (0.50–0.69)
EI ≥ 3: 0.29 (0.20–0.39); 1.0 (0.94–1.0); PPV 1.0, 1.0, 1.0, 1.0, 1.0; NPV 0.93, 0.85, 0.77, 0.68, 0.58; LR+ ∞; LR− 0.71 (0.62–0.81)

Note: PPVs, NPVs, and LRs are calculated from 4 to 5 decimal places and not from the rounded sensitivity and specificity values that appear here.

Discussion

This research examined cutoff scores for the RBANS EI in an active-duty military sample, most of whom had experienced mTBI. Four psychometrically established malingering groups were used to establish the cutoff scores. In general, the current research found that EI ≥ 1 produced very good specificities (0.97) for the malingering groups, and the specificity for EI ≥ 3 was 1.0 for all groups. As expected, sensitivities increased with increasing levels of certainty of malingering. Excluding the CM group, they ranged from a low of 0.11 (EI ≥ 3) for the PM group to a high of 0.89 (EI ≥ 1) for the DM group. As also expected, the CM group had sensitivities that were similar to those of the PDM group. The CM group is important because it provides classification and diagnostic statistics regardless of the level of malingering. Stated another way, it provides classification and diagnostic statistics for individuals who fail at least two validity tests or who fail the VSVT at below-chance levels. The results with respect to sensitivities for the CM group and the other groups argue against the use of EI alone to establish the validity of test results. For the CM and PDM groups, ∼50% of malingerers may not be identified at a cutoff of ≥1, and 70% of probable malingerers will not be accurately identified at the same cutoff. As Armistead-Jehle and Hansen (2011) and Barth and colleagues (2010) recommend, EI should be used in conjunction with other validity tests to aid in making clinical decisions.

It is important to remember that sensitivity provides the probability of a positive test result given that a person is known to be malingering, not the probability of malingering given a positive test result. Likewise, specificity provides the probability of a negative test result for a person who is not malingering, but it does not provide the probability of not malingering given a negative test result. The probability of malingering given positive and negative test results (post-test probabilities) is provided by the PPVs and NPVs, which can be generated from likelihood ratios. Crawford, Garthwaite, and Betkowska (2009) provide examples of the calculation of PVs, and they also provide a computer program to calculate interval estimates for PVs. For the current research, PPVs were ≥0.87 for all cutoffs and all groups at a base rate of 0.40, and 83% of the PPVs were ≥0.91. PPVs were ≥0.91 for the CM group for all cutoffs at this base rate. NPVs tended to be lower than PPVs at a base rate of 0.40. For the CM group, they ranged from 0.68 to 0.74. They ranged from 0.88 to 0.93 for the DM group and from 0.63 to 0.74 for the other groups. The results generally indicate that EI is better at predicting malingering than at identifying those who are not malingering.

LRs+ for the current research ranged from 8.9 to 27.7, indicating large effects on post-test probabilities of malingering. In general, an LR+ >10 has a large effect on increasing the post-test probability of a diagnosis (Hayden & Brown, 1999). On the other hand, the size of the LRs− found in this research indicates that they will generate small-to-moderate effects on the post-test probabilities of the COI in a patient with a negative test. As would be expected, the results with respect to LRs parallel the results for the PVs and indicate that EI is better at ruling-in than ruling-out a diagnosis. In clinical situations, and especially in forensic settings where malingering is an issue, ruling-in a diagnosis with a high level of certainty is generally considered the more important of the two issues. With respect to post-test probabilities derived from LRs+, the lowest LR+ for the CM group was 14.4 at EI ≥ 2, which at a base rate of 0.40 results in a post-test probability of 0.91. An important feature of LRs is that they can be combined, or chained, across independent diagnostic or validity tests to calculate post-test probabilities. Chaining is accomplished by simply multiplying the LRs of two or more independent validity tests (Grimes & Schulz, 2005; Larrabee, 2008). For example, if two independent validity tests that have LRs+ of 3.0 and 4.3 are combined, then the chained LR+ would be 12.9, which would produce a post-test probability of 0.90. If used singly, LRs+ of 3.0 and 4.3 would have only small effects on the post-test probability (0.67 and 0.74, respectively). A post-test probability of 0.90 might be considered a minimally acceptable probability when ruling-in malingering. Research by Jones (2013a, 2013b) provides LRs for the VSVT and TOMM, which can be chained with the results of the current research to calculate post-test probabilities. An example of the calculation of post-test probabilities of a COI using LRs can be found in Jones (2013b), and online calculators are readily available to simplify the process.
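A short sketch of the chaining calculation, reproducing the worked example above, is shown here; the function name is illustrative, and a base rate of 0.40 is assumed for the pre-test odds.

def post_test_probability(base_rate: float, *likelihood_ratios: float) -> float:
    """Chain one or more independent LRs onto the pre-test odds (Bayes' rule)."""
    odds = base_rate / (1 - base_rate)
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1 + odds)

print(round(post_test_probability(0.40, 3.0), 2))       # 0.67 (single test)
print(round(post_test_probability(0.40, 4.3), 2))       # 0.74 (single test)
print(round(post_test_probability(0.40, 3.0, 4.3), 2))  # 0.90 (chained, LR+ = 12.9)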

Comparisons of cutoffs with past research are somewhat difficult because of the different methods, samples, and validity tests used. Much of the past research has used geriatric samples, which differ considerably from the much younger and healthier military sample used for the current research. Armistead-Jehle and Hansen (2011) completed the only previous research that used an active-duty army sample largely composed of individuals with mTBI. However, there are important differences between their research and the current research, including the setting where the research was conducted, the demographics of the samples, and the method used to establish cutoffs. The Armistead-Jehle and Hansen sample was composed primarily of military officers (54.4%) enrolled in a training course to advance their careers, which suggests the possibility of reduced motivation to malinger. For the current research, 11% were officers and 89% were enlisted members, which compares with 16.4% officers and 83.6% enlisted for the entire army (Department of Defense, 2012). The participants in the current research were evaluated in army medical centers on bases where the possibility of deployment and other difficult duties is more typical of the army in general. There were significant differences (p ≤ .001) in age (30.5 ± 8.7 vs. 34.0 ± 6.7) and education (13.0 ± 1.9 vs. 15.0 ± 2.1) between the current sample and the Armistead-Jehle and Hansen sample. The average age for all active-duty members is 28.7, which is closer to the age of the current sample. It was not possible to determine education in years from the DoD report, but 79% of soldiers have a high school diploma, GED, or some college. These comparisons suggest that the current sample is probably more representative of the population normally seen in TBI treatment centers in the army. With respect to differences in methods, the current research used failure on at least two validity tests as a minimal criterion for malingering, whereas Armistead-Jehle and Hansen used the TOMM, MSVT, and NV-MSVT individually to examine cutoff scores for “poor effort.” Despite the many differences in settings, samples, and methods, the results of this research and that of Armistead-Jehle and Hansen suggest that EI ≥ 1 has very good classification and diagnostic statistics in both samples, which suggests that an EI cutoff of ≥1 may be a robust finding.

Beyond examining cutoffs, the research by Armistead-Jehle and Hansen (2011) also examined the base rate of validity test failure in their military sample. They found a base rate of failure on the MSVT and NV-MSVT of 15–20%, and they found a failure rate of 12% on the second trial of the TOMM at the manual's recommended cutoff of ≤44. Using this same cutoff for the TOMM, the failure rate in the current research on the second trial of the TOMM is 30% (106 of 356 failed the TOMM). Using a cutoff of ≤48, the failure rate on the TOMM is 39%. A cutoff of ≤48 had a specificity of 1.0 in research by Jones (2013a) for three malingering groups (PM, PDM, and DM) that are comparable with the malingering groups used in the current research. The TOMM failure rate is 47% using a cutoff of ≤49, which had a PPV of ≥0.90 in the Jones research for all malingering groups at a base rate of 0.40. Of note, post hoc analysis suggests a base rate of malingering of about 47% (219 of 462) for the current sample, based on failure of two or more validity tests (with EI included in the number of failed validity tests) or failure of the VSVT below chance. The failure rate on validity tests tends to be higher in the current sample than in the Armistead-Jehle and Hansen research, which is likely due in part to the setting where the research was conducted. In general, it is expected that base rates will vary depending on such factors as the setting, the tests used, the number of tests administered, and the cutoffs employed to establish base rates. In any case, there is a need for continuing research in military and nonmilitary samples to examine base rates and to confirm the cutoffs examined in this research and in the research by Armistead-Jehle and Hansen.

Additional research is also needed to establish cutoffs in other non-geriatric samples. Research suggests that in geriatric samples higher cutoffs may be needed to achieve acceptable classification statistics (e.g., Duff et al., 2011). There is also a need for research to ensure that accurate and current estimates of base rates of malingering are available. Base rates are likely to change over time, especially in military samples. When at war, the base rate of malingering in a military setting may increase. However, during times of peace, and perhaps when civilian employment opportunities are limited, military members may want to remain in the military and may minimize medical problems, including neuropsychological problems. Under these conditions, the base rate of malingering may decrease. Current and accurate estimates of base rates will help ensure that post-test probabilities based on predictive values and LRs can be accurately calculated and used to support accurate diagnoses.

The psychometric basis for establishing malingering used in the current research may be particularly useful in military samples because external incentives, an important criterion in the Slick et al. (1999) criteria for diagnosing malingering, are omnipresent and at times difficult to establish with certainty. Incentives to malinger in the military include, among others, obtaining a medical discharge from the military with extra benefits, avoiding deployment and other unpleasant duties, obtaining medical care in the Veterans Administration medical system after discharge, and obtaining disability compensation for TBI (cf. Howe, 2010). The use of the military medical system to advance such goals (e.g., avoiding combat) may be especially likely in cases of TBI. This may be in part a result of the significant efforts on the part of the military to educate military members about the symptoms of TBI and to encourage them to seek treatment for TBI. These efforts may unintentionally not only provide a means for those who have an incentive to advance their goals but also educate members about symptoms that could be presented at the time of evaluation.

Although a research-supported psychometric basis was used in the formation of the malingering groups for the current research, it is not argued here that the results of validity tests should be used in isolation to establish a diagnosis in a clinical situation. The results of the current research provide cutoffs and the classification statistics necessary to calculate post-test probabilities at different levels of psychometrically defined malingering for any base rate of interest. This information can be very useful when used with the total body of information available to the clinician during the diagnostic process. This of course includes such things as a review of available medical records and a complete history that covers the presence of incentives, the nature of the injury, the course of symptoms, and consistency in self-reports, as well as collateral information and the relative costs of making a false-positive or false-negative diagnosis. The total body of information should be used regardless of the diagnosis under consideration. The importance and relevance of particular information available during the diagnostic process may vary with the diagnoses under consideration, e.g., malingering vs. Cogniform Disorder (Delis & Wetter, 2007). The accurate diagnosis of malingering and other conditions is important not only in forensic settings but also to ensure appropriate treatment for military members regardless of diagnosis.

A limitation of this research is that the sample was largely male and military, which could limit the generalization of cutoff scores. Additional research is needed to establish whether there are gender differences, although none have been evident in research to date. Another consideration in generalizing and interpreting the results may be the diagnostic composition of the sample. The sample for this research consisted largely of individuals with TBI (mostly mTBI) evaluated in a large military medical center, and the research was meant to generalize to similar populations. Although the research was not designed specifically to examine cutoff scores for an mTBI sample, an EI ≥ 1 proved to have very good classification and diagnostic statistics and is the same cutoff that Armistead-Jehle and Hansen (2011) found in their sample, which consisted primarily of military mTBI patients. Silverberg and colleagues (2007) also found that a cutoff of ≥1 was optimal in discriminating between participants with actual vs. falsely alleged cognitive dysfunction related to post-acute mTBI, with optimal overall classification accuracy (86.9%) when comparing the mTBI group with three other malingering groups (clinical malingerers, simulated-naive malingerers, and simulated-coached malingerers). So, although the current research was not specifically intended to establish cutoffs for a TBI sample, a cutoff of ≥1 seems to be robust in samples consisting largely of mTBI patients. Research suggests that higher cutoffs are likely required for adequate classification and diagnostic statistics in some groups of older adult and geriatric patients. However, the current and other research suggests that younger, independently living individuals can be classified accurately with respect to credible and noncredible performance on neuropsychological evaluation using the cutoffs established by the current research.

Conflict of Interest

None declared.

References

Armistead-Jehle
P.
,
Hansen
C. L.
(
2011
).
Comparison of the Repeatable Battery for the Assessment of Neuropsychological Status Effort Index and stand-alone Symptom Validity Tests in a military sample
.
Archives of Clinical Neuropsychology
 ,
26
,
592
601
.
Barker
M. D.
,
Horner
M. D.
,
Bachman
D. L.
(
2010
).
Embedded indices of effort in the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS) in a geriatric sample
.
The Clinical Neuropsychologist
 ,
18
,
1064
1077
.
Barth
J. T.
,
Isler
W. C.
,
Helmick
K. M.
,
Wingler
I. M.
,
Jaffee
M. S.
(
2010
). In
Kennedy
C.
,
Moore
J.
(Eds.),
Acute Battlefield Assessment of concussion/mild TBI and return to duty evaluations in military neuropsychology
  (pp.
127
174
).
New York
:
Springer Publishing Company, LLC
.
Beatty
W. W.
(
2004
).
RBANS analysis of verbal memory in multiple sclerosis
.
Archives of Clinical Neuropsychology
 ,
19
,
825
834
.
Beatty
W. W.
,
Ryder
K. A.
,
Gontkovsky
S. T.
,
Scott
J. G.
,
McSwan
K. L.
,
Bharucha
K. J.
(
2003
).
Analyzing, the subcortical dementia of Parkinson's disease using the RBANS
.
Archives of Clinical Neuropsychology
 ,
18
,
509
520
.
Ben-Porath
Y. S.
,
Tellegen
A.
(
2008
).
MMPI-2-RF (Minnesota Multiphasic Personality Inventory-2 Restructured Form): Manual for administration, scoring, and interpretation
 .
Minneapolis, MN
:
University of Minnesota Press
.
Ben-Porath
Y. S.
,
Tellegen
A.
(
2011
).
MMPI-2-RF (Minnesota Multiphasic Personality Inventory-2 Restructured Form): Manual for administration, scoring, and interpretation
 .
Minneapolis, MN
:
University of Minnesota Press
.
Binder
L. M.
,
Larrabee
G. J.
,
Millis
S. R.
(
2014
).
Intent to fail: Significance testing of forced choice test results
.
The Clinical Neuropsychologist
 ,
28
,
1366
1375
.
Burton
R. L.
,
Enright
J.
,
O'Connell
M. E.
,
Lanting
S.
,
Morgan
D.
(
2015
).
RBANS embedded measures of suboptimal effort in dementia: Effort Scale has a lower failure rate than the Effort Index
.
Archives of Clinical Neuropsychology
 ,
30
,
1
6
.
Butcher
J. N.
,
Dahlstrom
W. G.
,
Graham
J. R.
,
Tellegen
A.
,
Kaemmer
B.
(
1989
).
Minnesota Multiphasic Personality Inventory-2 (MMPI-2): Manual for administration and scoring
 .
Minneapolis, MN
:
University of Minnesota Press
.
Crawford
J. R.
,
Garthwaite
P. H.
,
Betkowska
K.
(
2009
).
Bayes' theorem and diagnostic tests in neuropsychology: Interval estimates for posttest probabilities
.
The Clinical Neuropsychologist
 ,
23
,
624
644
.
Crighton
A. H.
,
Wygant
D. B.
,
Holt
K. R.
,
Granacher
R. P.
(
2015
).
Embedded Effort Scales in the Repeatable Battery for the Assessment of Neuropsychological Status: Do they detect neurocognitive malingering?
Archives of Clinical Neuropsychology
 ,
30
,
181
185
.
Delis
D. C.
,
Wetter
S. R.
(
2007
).
Cogniform Disorder and Cogniform Condition: Proposed diagnoses for excessive cognitive symptoms
.
Archives of Clinical Neuropsychology
 ,
22
,
589
604
.
Department of Defense
. (
2012
).
2012 demographics report: Profile of the Military Community
 .
Washington, DC
:
Office of the Deputy Assistant Secretary of Defense
.
Department of Veteran Affairs
. (
2009
).
VA/dod clinical practice guideline for management of concussion/mild traumatic brain injury
 .
Washington, DC
:
Department of Veteran Affairs
.
Dionysus, K. E., Denney, R. L., & Halfaker, D. A. (2011). Detecting negative response bias with the Fake Bad Scale, Response Bias Scale, and Henry-Heilbronner Index of the Minnesota Multiphasic Personality Inventory-2. Archives of Clinical Neuropsychology, 26, 81–88.
Duff, K., Spering, C. C., O'Bryant, S. E., Beglinger, L. J., Moser, D. J., Bayless, J. D., et al. (2011). The RBANS Effort Index: Base rates in geriatric samples. Applied Neuropsychology, 18, 11–17.
Ellis, P. D. (2010). The essential guide to effect sizes. New York: Cambridge University Press.
Gervais, R. O., Ben-Porath, Y., Wygant, D., & Green, P. (2007). Development and validation of a Response Bias Scale for the MMPI-2. Assessment, 14, 196–208.
Gervais, R. O., Ben-Porath, Y., Wygant, D., Green, P., & Sellbom, M. (2010). Incremental validity of the MMPI-2-RF over-reporting scales and RBS in assessing the veracity of memory complaints. Archives of Clinical Neuropsychology, 25, 274–284.
Green, P. (2003). Manual for the Word Memory Test for Microsoft Windows. Edmonton: Green's Publishing.
Green, P. (2004). Test manual for the Medical Symptom Validity Test. Edmonton: Green's Publishing.
Green, P. (2008). Test manual for the Nonverbal Medical Symptom Validity Test. Edmonton: Green's Publishing.
Greiffenstein, M. F., Fox, D., & Lees-Haley, P. R. (2007). The MMPI-2 Fake Bad Scale in detection of noncredible brain injury claims. In K. Boone (Ed.), Assessment of feigned cognitive impairment: A neuropsychological perspective (pp. 210–235). New York: Guilford Publications.
Greve, K. W., Bianchini, K. J., Black, F. W., Heinly, M. T., Love, J. M., Swift, D. A., et al. (2006). Classification accuracy of the Test of Memory Malingering in persons reporting exposure to environmental and industrial toxins: Results of a known-groups analysis. Archives of Clinical Neuropsychology, 21, 439–448.
Greve, K. W., Bianchini, K. J., & Doane, B. M. (2006). Classification accuracy of the Test of Memory Malingering in traumatic brain injury: Results of a known-groups analysis. Journal of Clinical and Experimental Neuropsychology, 28, 1176–1190.
Greve, K. W., Etherton, J. L., Ord, J., Bianchini, K. J., & Curtis, K. L. (2009). Detecting malingered pain-related disability: Classification accuracy of the Test of Memory Malingering. The Clinical Neuropsychologist, 23, 1250–1271.
Grimes, D. A., & Schulz, K. F. (2005). Refining clinical diagnosis with likelihood ratios. Lancet, 365, 1500–1505.
Grote, C. L., Kooker, E. K., Garron, D. C., Nyenhuis, D. L., Smith, C. A., & Mattingly, M. L. (2000). Performance of compensation seeking and non-compensation seeking samples on the Victoria Symptom Validity Test: Cross validation and extension of a standardization study. Journal of Clinical and Experimental Neuropsychology, 22, 709–719.
Hayden, S. R., & Brown, M. D. (1999). Likelihood ratios: A powerful tool for incorporating the results of diagnostic tests into clinical decision making. Annals of Emergency Medicine, 33, 575–580.
Henry, G. K., Heilbronner, R. L., Algina, J., & Kaya, Y. (2013). Derivation of the MMPI-2-RF Henry-Heilbronner Index-r (HHI-r) scale. The Clinical Neuropsychologist, 27, 509–513.
Henry, G. K., Heilbronner, R. L., Mittenberg, W., & Enders, C. (2006). The Henry-Heilbronner Index: A 15-item empirically derived MMPI-2 subscale for identifying probable malingering in personal injury litigants and disability claimants. The Clinical Neuropsychologist, 20, 786–797.
Holzer, L., Chinet, L., Jaugey, L., Plancherel, B., Sofia, C., Halfon, O., et al. (2007). Detection of cognitive impairment with the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS) in adolescents with psychotic symptoms. Schizophrenia Research, 95, 48–53.
Hook, J. N., Marquine, M. J., & Hoelzle, J. B. (2009). Repeatable Battery for the Assessment of Neuropsychological Status Effort Index performance in a medically ill geriatric sample. Archives of Clinical Neuropsychology, 24, 231–235.
Howe, L. L. (2010). Giving context to post-deployment post-concussive-like symptoms: Blast-related potential mild traumatic brain injury and comorbidities. The Clinical Neuropsychologist, 23, 1315–1337.
Jasinski, L. J., Berry, D. T., Shandera, A. L., & Clark, J. A. (2011). Use of the Wechsler Adult Intelligence Scale Digit Span subtest for malingering detection: A meta-analytic review. Journal of Clinical and Experimental Neuropsychology, 33, 300–314.
Jones, A., & Ingram, M. V. (2011). A comparison of selected MMPI-2 and MMPI-2-RF validity scales in assessing effort on cognitive tests in a military sample. The Clinical Neuropsychologist, 25, 1207–1227.
Jones, A. (2013a). Test of Memory Malingering: Cutoff scores for psychometrically defined malingering groups in a military sample. The Clinical Neuropsychologist, 27, 1043–1059.
Jones, A. (2013b). Victoria Symptom Validity Test: Cutoff scores for psychometrically defined malingering groups in a military sample. The Clinical Neuropsychologist, 27, 1373–1394.
Lange, R. T., Brickell, T. A., French, L. M., Merritt, V. C., Bhagwat, A., Pancholi, S., et al. (2012). Neuropsychological outcome from uncomplicated mild, complicated mild, and moderate traumatic brain injury in U.S. military personnel. Archives of Clinical Neuropsychology, 27, 480–494.
Larrabee, G. J. (2003). Detection of malingering using atypical performance patterns on standard neuropsychological tests. The Clinical Neuropsychologist, 17, 410–425.
Larrabee, G. J. (2007). Malingering, research designs, and base rates. In G. J. Larrabee (Ed.), Assessment of malingered neuropsychological deficits (pp. 3–13). New York: Oxford University Press.
Larrabee, G. J. (2008). Aggregation across multiple indicators improves the detection of malingering: Relationship to likelihood ratios. The Clinical Neuropsychologist, 22, 666–679.
Larrabee, G. J. (2012). Performance validity and symptom validity in neuropsychological assessment. Journal of the International Neuropsychological Society, 18, 625–630.
Larrabee, G. J., Greiffenstein, M. F., Greve, K. W., & Bianchini, K. J. (2007). Refining diagnostic criteria for malingering. In G. J. Larrabee (Ed.), Assessment of malingered neuropsychological deficits (pp. 334–371). New York, NY: Oxford University Press.
Larrabee, G. J., Millis, S. R., & Myers, J. E. (2009). 40 plus or minus 10, a new magical number: Reply to Russell. The Clinical Neuropsychologist, 23, 841–849.
Larson, E. B., Kirschener, K., Bode, R., Heineman, A., & Goodman, R. (2005). Construct and predictive validity of the Repeatable Battery for the Assessment of Neuropsychological Status in the evaluation of stroke patients. Journal of Clinical and Experimental Neuropsychology, 27, 16–32.
Lees-Haley, P. R., English, L. T., & Glenn, W. J. (1991). A Fake Bad Scale on the MMPI-2 for personal injury claimants. Psychological Reports, 68, 203–210.
Lippa, S. M., Hawes, S., Jokic, E., & Caroselli, J. S. (2013). Sensitivity of the RBANS to acute traumatic brain injury and length of post-traumatic amnesia. Brain Injury, 27, 689–695.
Loring, D. W., Larrabee, G. J., Lee, G. P., & Meador, K. J. (2007). Victoria Symptom Validity Test performance in a heterogeneous clinical sample. The Clinical Neuropsychologist, 21, 522–531.
Macciocchi, S. N., Seel, R. T., Alderson, A., & Godsall, R. (2006). Victoria Symptom Validity Test performance in acute severe traumatic brain injury: Implications for test interpretation. Archives of Clinical Neuropsychology, 21, 395–404.
Mathias, C. W., Greve, K. W., Bianchini, K. J., Houston, R. J., & Crouch, J. A. (2002). Detecting malingered neurocognitive dysfunction using the Reliable Digit Span in traumatic brain injury. Assessment, 9, 301–318.
McCay, C., Casey, J., Wertheimer, J., & Fichtenberg, N. (2007). Reliability and validity of the RBANS in a traumatic brain injured sample. Archives of Clinical Neuropsychology, 22, 91–98.
Mittenberg, W., Patton, C., Canyock, E. M., & Condit, D. C. (2002). Base rates of malingering and symptom exaggeration. Journal of Clinical and Experimental Neuropsychology, 24, 1094–1102.
Moser, M. S., & Schatz, P. (2002). Enduring effects of concussion in youth athletes. Archives of Clinical Neuropsychology, 17, 91–100.
Nelson, N. W., Boone, K., Dueck, A., Wagener, L., Lu, P., & Grills, C. (2003). Relationships between eight measures of suspect effort. The Clinical Neuropsychologist, 17, 263–272.
Novitski, J., Steele, S., Karantzoulis, S., & Randolph, C. (2012). The Repeatable Battery for the Assessment of Neuropsychological Status Effort Scale. Archives of Clinical Neuropsychology, 27, 190–195.
Olejnik, S., & Algina, J. (2000). Measures of effect size for comparative studies: Applications, interpretations, and limitations. Contemporary Educational Psychology, 25, 241–286.
Peck, C. P., Schroeder, R. W., Heinrichs, R. J., Vondran, E. J., Brockman, C. J., Webster, B. K., et al. (2013). Differences in MMPI-2 FBS and RBS scores in brain injury, probable malingering, and conversion disorder groups: A preliminary study. The Clinical Neuropsychologist, 27, 693–707.
Randolph, C. (1998). Repeatable Battery for the Assessment of Neuropsychological Status (RBANS). San Antonio, TX: The Psychological Corporation.
Randolph, C. (2012). Repeatable Battery for the Assessment of Neuropsychological Status Update (RBANS Update). San Antonio, TX: The Psychological Corporation.
Silverberg, N. D., Wertheimer, J. C., & Fichtenberg, N. L. (2007). An Effort Index for the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS). The Clinical Neuropsychologist, 21, 841–854.
Slick, D. J., Hopp, G., Strauss, E., & Thompson, G. B. (1997). Victoria Symptom Validity Test: Professional manual. Odessa, FL: Psychological Assessment Resources.
Slick, D. J., Sherman, E. M., & Iverson, G. L. (1999). Diagnostic criteria for malingered neurocognitive dysfunction: Proposed standards for clinical practice and research. The Clinical Neuropsychologist, 13, 545–561.
Tombaugh, T. N. (1996). Test of Memory Malingering (TOMM). New York: Multi-Health Systems.
Tsushima, W. T., Geling, O., & Fabrigas, J. (2011). Comparison of MMPI-2 validity scale scores of personal injury litigants and disability claimants. The Clinical Neuropsychologist, 25, 1403–1414.
Victor, T. L., Boone, K. B., Serpa, J. G., Buehler, J., & Ziegler, E. A. (2009). Interpreting the meaning of multiple Symptom Validity Test failure. The Clinical Neuropsychologist, 23, 297–313.
Whitney, K. A. (2013). Predicting Test of Memory Malingering and Medical Symptom Validity Test failure within a Veterans Affairs Medical Center: Use of Response Bias Scale and the Henry-Heilbronner Index. Archives of Clinical Neuropsychology, 28, 222–235.
Young, J. C., Baughman, B. C., & Roper, B. L. (2012). Validation of the Repeatable Battery for the Assessment of Neuropsychological Status-Effort Index in a veteran sample. The Clinical Neuropsychologist, 26, 688–699.
Young, J. C., Kearns, L. A., & Roper, B. L. (2011). Validation of the MMPI-2 Response Bias Scale and Henry-Heilbronner Index in a U.S. veteran population. Archives of Clinical Neuropsychology, 26, 194–204.

Author notes

The views expressed herein are those of the author and do not reflect the official policy of the Department of the Army, Department of Defense, or the U.S. Government.