Abstract

During the last decades, symptom validity has become an important topic in the neuropsychological and psychiatric literature with respect to how it relates to malingering, factitious disorder, and somatoform complaints. We conducted a survey among neuropsychologists (N = 515) from six European countries (Germany, Italy, Denmark, Finland, Norway, and the Netherlands). We queried the respondents about the tools they used to evaluate symptom credibility in clinical and forensic assessments and other issues related to symptom validity testing (SVT). Although the majority of the respondents demonstrated technical knowledge about symptom validity, a sizeable minority of the respondents relied on outdated notions (e.g., the idea that clinicians can determine symptom credibility based on intuitive judgment). There is little consensus among neuropsychologists on how to instruct patients when they are administered SVTs and how to handle test failure. Our findings indicate that the issues regarding how to administer and communicate the SVT results to patients warrant systematic research.

Introduction

The scientific knowledge on symptom validity in neuropsychological assessments has increased rapidly during the last decades (Sweet & Guidotti Breting, 2013). As a result, dedicated detection methods, which are referred to as symptom validity tests (SVTs), have been developed (for an overview, see Boone, 2007). In the context of this paper, the term “SVT” is used broadly to include stand-alone performance validity tests, embedded indicators of symptom validity, and self-report measures of negative response bias. SVTs enable neuropsychologists to understand the prevalence of non-credible symptoms in various settings (e.g., injury claims and work-related disability claims) and their impact on standard psychological tests and questionnaires (e.g., Boone, 2013; Iverson, 2006; Rohling, Green, Allen, & Iverson, 2002). It is now widely acknowledged that non-credible symptoms occur on a non-trivial scale in forensic assessments, with estimates ranging from 25% to 45% depending on the context (i.e., criminal vs. civil litigation), the sample studied, and the SVTs used (e.g., Chafetz, Prentkowski, & Rao, 2011; Gervais, Rohling, Green, & Ford, 2004; Miele, Gunner, Lynch, & McCaffrey, 2012; Van Hout, Schmand, & Wekking, 2006).

One of the major tasks of neuropsychologists is to determine the objective presence of cognitive deficits. Consequently, most SVTs focus on the credibility of cognitive test performance. Studies on SVTs have demonstrated that cognitive underperformance (i.e., performing below one's cognitive abilities) has a substantial impact on the scores obtained from standard cognitive tests (e.g., memory tests; Fox, 2011; Meyers, Volbrecht, Axelrod, & Reinsch-Boothby, 2011; Rohling et al., 2002). In a sample of 1,307 patients with various disorders (e.g., mild traumatic brain injury, multiple sclerosis, and stroke) who underwent medico-legal assessment, Green (2007) noted that SVT failure explained more of the variance in the test scores than the severity of the sustained brain injury. In a similar vein, Bianchini, Curtis, and Greve (2006) observed a relationship between the monetary incentive at stake in a litigation procedure and the frequency of SVT failure: the higher the amount of money to be gained from the legal procedure, the more examinees exhibited a non-credible test performance.

Traditionally, research on the detection and prevalence of non-credible symptoms has focused on forensic evaluations in which external gains were obvious. Recently, however, attention has shifted toward non-credible symptoms in clinical (i.e., non-forensic) assessments. For example, Locke, Smigielski, Powell, and Stevens (2008) reported that up to 22% of patients who had acquired brain injury and were referred for outpatient treatment in a rehabilitation center failed an SVT. In their sample of patients who had medically unexplained symptoms and were referred for consultation to a neurology department, Kemp and colleagues (2008) noted that 11% of these patients failed at least two of six SVTs. Similarly, Dandachi-FitzGerald, Ponds, Peters, and Merckelbach (2011) found that 8% of a large sample of psychiatric outpatients failed two SVTs.

The accumulated body of SVT research has led to an influential position paper by the National Academy of Neuropsychology that states the following: “The assessment of symptom validity is an essential part of a neuropsychological evaluation. The clinician should be prepared to justify a decision not to assess symptom validity as part of a neuropsychological evaluation” (Bush et al., 2005, p. 421). Several survey studies have addressed whether and how neuropsychologists adopt this position when they perform a neuropsychological assessment. In a widely cited study, Mittenberg, Patton, Canyock, and Condit (2002) were the first to specifically address the estimates from 144 neuropsychologists of the base rates of non-credible symptoms across various settings (i.e., personal injury litigation, disability or worker's compensation claims, criminal assessments, and medical or psychiatric assessments not involving litigation) and across a range of different diagnoses (e.g., mild head injury, fibromyalgia or chronic fatigue, and depressive disorders). The authors also collected information on the methods that the neuropsychologists used to substantiate their diagnostic impression of non-credible symptoms. The neuropsychologists reported high base rates of non-credible symptoms in forensic settings (i.e., 30% in personal injury cases, 33% in disability or worker's compensation claims, and 23% in criminal cases) and lower rates in clinical settings (i.e., 8% in medical or psychiatric cases).

Slick, Tan, Strauss, and Hultsch (2004) surveyed 24 neuropsychologists with expertise in civil litigation (e.g., financial compensation claims and personal injury litigation) and found that the majority of these experts (∼79%) reported using at least one dedicated SVT in every assessment. The Test of Memory Malingering (TOMM; Tombaugh, 1996) and the Rey 15-Item Test (FIT; Rey, 1958) were the most frequently employed SVTs. In cases with strong suspicion of exaggeration or malingering, most experts indicated that they often or always adapted the routine of the assessment by encouraging the examinees to provide their best effort (88%). In their professional communication, nearly half of the experts (46%) stated that they often or always used the term “malingering” in suspected cases of exaggerated or malingered deficits. Most experts reportedly communicated that the test data were invalid (92%) and inconsistent with the severity of the injury (96%).

Building on the surveys conducted by Mittenberg and colleagues (2002) and Slick and colleagues (2004), Sharland and Gfeller (2007) surveyed a general sample of 188 neuropsychologists. Compared with the results reported by Mittenberg and colleagues, Sharland and Gfeller found a lower estimated base rate of malingering in civil litigation procedures (i.e., a median of 20%) and a comparable estimated base rate of malingering in non-litigation evaluations (i.e., a median of 5%). The respondents used an SVT in their assessments (25% always) less frequently than the experts surveyed by Slick and colleagues (2004). Their survey did not distinguish between the use of SVTs in clinical and forensic assessments. In agreement with Slick and colleagues (2004), the TOMM and the FIT were the most frequently used measures for the determination of cognitive underperformance. Additionally, Sharland and Gfeller included two validity scales of the Minnesota Multiphasic Personality Inventory-2 (MMPI-2; i.e., the F–K ratio and the fake bad scale), which were frequently used as measures for the detection of non-credible symptoms. Similar to the survey conducted by Mittenberg and colleagues (2002), Sharland and Gfeller found that most respondents relied on subjective impressions to determine symptom validity. In both surveys, the severity of the cognitive impairments and the inconsistency in cognitive test performance were the most frequently reported indicators of non-credible symptoms.

In a more recent study, McCarter, Walton, Brooks, and Powell (2009) surveyed the practices and beliefs of 130 neuropsychologists in the UK. In contrast to previous surveys, these authors distinguished between the use of SVTs in forensic and clinical neuropsychological assessments and gathered information on the arguments that respondents considered valid for using or not using SVTs. As expected, SVTs were more frequently used in forensic assessments (59% always) than in clinical assessments (15% always). The most frequently used stand-alone SVTs were the TOMM, the FIT, and the Word Memory Test (WMT; Green, 2003). The most frequently endorsed reasons for the use of SVTs were evidence from the scientific literature and the necessity to validate other test results. One third of the respondents, however, stated that they saw no need to administer SVTs because non-credible symptoms were obvious in the standard test results and the presentation of the examinee.

Apart from the survey conducted by McCarter and colleagues (2009) and as far as SVTs are concerned, information on the practices and beliefs of European neuropsychologists is lacking. Given that differences may exist between the Anglo-Saxon countries and the continental Western European countries with regard to how neuropsychologists practice their profession and their expert role in legal proceedings, we cannot extrapolate the findings of previous surveys and apply them to the practices and beliefs of Western European neuropsychologists.

Thus, we conducted a survey among neuropsychologists in six European countries. The aim of the survey was to gain insight into how neuropsychologists systematically address the issue of non-credible symptoms in forensic and clinical assessments, what type of tests and methods they employ, how they address non-credible symptoms during the assessment, and how they describe non-credible symptoms in their professional communications. Similar to the survey conducted by McCarter and colleagues (2009), we aimed to determine the arguments that neuropsychologists offer in favor of and against the use of SVTs in their assessments.

Method

Procedure

The chairs of the European Societies of Neuropsychology were contacted during the second meeting of the Federation of the European Societies of Neuropsychology (FESN) in September 2010 in Amsterdam or by email. They were asked whether they were willing to forward an email with a link to an online survey to the members of their respective societies. In total, 12 societies were contacted. Six societies either did not respond or considered the study not feasible. Six societies agreed to cooperate: the German, Finnish, Danish, Norwegian, Italian, and Dutch neuropsychological societies. We provided all chairs with a draft introductory email that they could adapt for their societies. The survey was available in four languages: German, English, Italian, and Dutch. The chairs of the Scandinavian societies (Finland, Denmark, and Norway) indicated that their members are sufficiently fluent in English that a translation of the questionnaire into their native languages was unnecessary.

We collected data between December 2010 and May 2012. We analyzed the data for the total sample and separately for each society. We focus on the patterns evident in the total sample and briefly discuss instances in which the societies differed saliently in how they responded to the items.

Survey

We based most of our questions on previous survey research on SVTs to facilitate comparisons (e.g., McCarter et al., 2009; Mittenberg et al., 2002; Sharland & Gfeller, 2007; Slick et al., 2004). The survey contained 30 questions that addressed five areas: (1) background information of the respondents (7–9 items); (2) how often the respondents used specific SVTs, the type of SVTs employed, the instructions given to examinees before administering SVTs, and how SVT failures were handled during the assessments (8–11 items); (3) how often the respondents believed that they had encountered non-credible symptom reports and malingering in their own assessments in 2010 and how often they thought non-credible symptom reports and malingering had occurred in clinical and forensic assessments (4–6 items); (4) how the respondents communicated SVT failures in their reports (2 items); and (5) the arguments that the respondent believed were valid for administering or not administering SVTs (2 items). The respondents took approximately 20 min to complete the survey. A copy of the questionnaire can be obtained from the first author.

Respondents

In total, 807 respondents started the survey. Of these, 292 participants (36.2%) did not complete the questionnaire. We obtained complete records from 515 respondents and included these in the analyses. The response rate of the six societies was as follows: 6% in Denmark (n = 20), 12% in Finland (n = 53) and Norway (n = 38), 14% in Germany (n = 211), 16% in the Netherlands (n = 144), and 25% in Italy (n = 49).

Results

Work-Related Background Information of Survey Respondents

Table 1 shows the demographic characteristics of the survey respondents. Almost all of the respondents were psychologists (96.3%), and most had more than 10 years of work experience (57.9%). The four highest ranked work settings were rehabilitation center (45.4%), somatic health care (37.9%), research (31.3%), and mental health care (26.6%); respondents worked in a median of two settings.

Table 1.

Demographics of respondents (N = 515)

Professional title^a (% respondents) 
 Psychologist 96.3 
 Physician 3.7 
 Other 0.6 
Years of work experience (% respondents) 
 0–5 18.3 
 5–10 23.9 
 >10 57.9 
Work setting (% respondents) 
 Rehabilitation center 45.4 
 Somatic health care 37.9 
 Research 31.3 
 Mental health care 26.6 
 Own practice 25.6 
 Teaching 24.9 
 Forensic setting 20.8 
 Other 6.2 
 Nursing home 5.0 
Neuropsychological assessments in 2010 (number of respondents [%]) 
 Clinical 492 (95.5%) 
 Forensic 283 (55.0%) 
Estimated number of neuropsychological assessments in 2010^b (median [range]) 
 Clinical (n = 492) 70 (1–1,200) 
 Forensic (n = 283) 10 (1–200) 
Context medico-legal assessments (% respondents) 
 Criminal forensic 11.7 
 Civil forensic 55.8 
 Work-related disability claim 86.2 
 Other 14.1 
Who does the testing (% respondents) 
 You alone 65.6 
 You and psychology assistant 22.5 
 Psychology assistant only 11.8 

^a Three respondents held both a medical and a psychological degree; therefore, the percentages do not add up to 100.

^b Calculated over those respondents who conducted at least one assessment in 2010.

Almost all of the respondents (95.5%) had conducted clinical neuropsychological assessments in 2010, with a median of 70 assessments (range 1–1,200); of these respondents, the majority (95.3%) had conducted five or more assessments. In 2010, 55.0% of the respondents had conducted forensic neuropsychological assessments, with a median of 10 assessments (range 1–200) and a median of 10 years of experience in performing forensic evaluations (range 0–36 years). Most forensic assessments involved work-related disability claims (86.2%), followed by civil forensic procedures (e.g., personal injury claims; 55.8%). Only 11.7% of the forensic assessments were conducted in the context of a criminal forensic procedure, and 14.1% involved other contexts, such as the determination of mental competency. Two thirds of the respondents (65.6%) conducted the testing themselves, 22.5% performed the testing assisted by a psychology assistant, and only a few (11.8%) indicated that the testing was conducted by a psychology assistant alone.

Methods Used to Determine Symptom Validity

As expected, the use of SVTs is mainly determined by the context in which the assessment takes place (Table 2). Few respondents reported using an SVT often (8.7%) or always (11.8%) in clinical assessments, whereas a narrow majority reported using an SVT often (14.1%) or always (44.9%) in forensic assessments. Table 2 highlights the considerable differences in the use of SVTs between the respondents in the six countries. The majority of the Dutch and Norwegian respondents (70% and 69%, respectively) indicated that they always included an SVT in a forensic assessment. This result is more in line with that of the experts in the survey conducted by Slick and colleagues (79%). In contrast, only a minority of the respondents of the remaining four neuropsychological societies reported always using an SVT in forensic assessments, with remarkably low frequencies in Italy (22%) and Finland (14%). This finding is surprising because it has long been recognized that symptom validity in forensic assessments must be formally assessed (e.g., Bush et al., 2005).

Table 2.

Frequency of inclusion of symptom validity tests in clinical and forensic assessments for the total sample and each country separately

 In every case or almost every case (>95%) | In a majority of the cases (>50%) | In a fair minority of the cases, between 20% and 50% | In less than 20% of the cases | Very rarely, <5% | Never 
Total sample 
 Clinical (n = 492) 11.8 8.7 11.4 17.5 33.5 17.1 
 Forensic (n = 283) 44.9 14.1 9.2 9.5 12.7 9.5 
Germany 
 Clinical (n = 198) 7.6 5.6 9.6 16.7 44.9 15.7 
 Forensic (n = 155) 44.5 16.8 7.6 9.7 12.9 5.8 
The Netherlands 
 Clinical (n = 138) 13.8 12.3 21.7 23.2 21.0 8.0 
 Forensic (n = 47) 70.2 10.6 4.3 6.4 6.4 2.1 
Italy 
 Clinical (n = 49) 6.1 4.1 2.0 16.3 26.5 44.9 
 Forensic (n = 32) 21.9 15.6 9.4 9.4 12.5 31.3 
Norway 
 Clinical (n = 37) 29.7 24.3 8.1 10.8 21.6 5.4 
 Forensic (n = 16) 68.8 18.8 6.3 6.3 
Finland 
 Clinical (n = 53) 13.7 5.9 11.8 35.5 33.3 
 Forensic (n = 22) 13.6 9.1 22.7 31.8 22.7 
Denmark 
 Clinical (n = 19) 15.8 5.3 15.8 15.8 42.1 5.3 
 Forensic (n = 11) 36.4 9.1 27.3 9.1 9.1 9.1 

As in the previous surveys conducted by Mittenberg and colleagues (2002) and Sharland and Gfeller (2007), the respondents could list all of the methods they used to determine symptom validity (Table 3; a comparison of the three surveys is shown in Table 4). The most frequently used methods were discrepancies between records, self-reporting, and observed behavior; severity of cognitive impairment inconsistent with the condition; and a pattern of cognitive impairment inconsistent with the condition. SVTs were used by a minority of the respondents: stand-alone SVTs were used often or always by 42% of the respondents, scores below empirical cutoffs on embedded measures by 22%, and validity scales on objective personality tests by 19%. Although the rank order varied slightly, the similarity between the methods reported in the survey conducted by Sharland and Gfeller and those found in this survey is noticeable. In both surveys, the first five methods all relied on subjective clinical judgment, whereas the more objective SVTs ranked last. The same pattern was obtained by Mittenberg and colleagues, with the exception that scores below empirical cutoffs on forced-choice tests ranked third in their survey.

Table 3.

Methods used to determine symptom validity

 Percentage of respondents (N = 515)
 Never | Rarely | Sometimes | Often | Always 
Discrepancies between records, self-reports, and observed behavior 2.5 8.5 19.6 29.9 39.4 
Severity of cognitive impairment inconsistent with condition 3.7 10.1 20.8 28.5 36.9 
Pattern of cognitive test performance inconsistent with condition 4.3 8.7 22.1 27.8 37.1 
Implausible self-reported symptoms in interview 6.0 13.0 23.9 27.8 29.3 
Implausible changes in test scores across repeated examinations 8.5 21.4 26.4 22.7 21.0 
Stand-alone symptom validity tests 23.3 19.8 15.0 17.7 24.3 
Scores below empirical cutoffs on embedded measures of effort 32.2 25.2 20.8 13.4 8.3 
Validity scales on objective personality tests 35.1 28.2 17.3 12.2 7.2 
Scores on empirically derived discriminant function analyses indicative of poor effort 58.8 19.0 11.7 5.6 4.9 
Table 4.

Comparison of indicators of symptom invalidity between the current survey, Mittenberg and colleagues (2002), and Sharland and Gfeller (2007)

Indicators, ordered from most frequently used to least frequently used:

Current survey:
1. Discrepancies among records, self-report, and observed behavior
2. Severity of cognitive impairment inconsistent with condition
3. Pattern of cognitive test performance inconsistent with condition
4. Implausible self-reported symptoms in interview
5. Implausible changes in test scores across repeated examinations
6. Stand-alone symptom validity tests
7. Scores below empirical cutoffs on embedded measures of effort
8. Validity scales on objective personality tests
9. Scores on empirically derived discriminant function analyses indicative of poor effort

Mittenberg and colleagues (2002):
1. Severity of cognitive impairment inconsistent with condition
2. Pattern of cognitive test performance inconsistent with condition
3. Scores below empirical cutoffs on forced choice tests
4. Discrepancies among records, self-report, and observed behavior
5. Implausible self-reported symptoms in interview
6. Scores below empirical cutoffs on other malingering tests
7. Implausible changes in test scores across repeated examinations
8. Scores above validity scale cutoffs on objective personality tests
9. Scores below chance on forced choice tests

Sharland and Gfeller (2007):
1. Severity of cognitive impairment inconsistent with condition
2. Pattern of cognitive test performance inconsistent with condition
3. Discrepancies among records, self-report, and observed behavior
4. Implausible self-reported symptoms in interview
5. Implausible changes in test scores across repeated examinations
6. Scores below empirical cutoffs on forced choice tests
7. Scores below empirical cutoffs on measures specific to the assessment of effort/malingering (e.g., Validity Indicator Profile)
8. Validity scales on objective personality tests (e.g., MMPI-2)
9. Scores below empirical cutoffs on embedded measures of effort
10. Scores on empirically derived discriminant function analyses indicative of poor effort

These results appear to imply that neuropsychologists give more weight to their own subjective judgment than to validated instruments, despite scientific evidence that such impressions are insufficiently reliable for the detection of non-credible symptoms.

As shown in Table 5, the top four stand-alone SVTs were the Amsterdam Short-Term Memory Test (ASTM; Schmand & Lindeboom, 2005), the Rey FIT, the TOMM, and the WMT (Green, 2003). Approximately one in five respondents (20.6%) indicated never using a stand-alone SVT. The analyses per country revealed some minor variations. First, in Germany, a German-language SVT, the Testbatterie zur Forensischen Neuropsychologie, ranked first. Second, in Italy and the Scandinavian countries, the ASTM was rarely mentioned because this test is not available in the respective native languages.

Table 5.

Frequency of use of tests and procedures to determine symptom validity

 Entire sample (N = 515) 
Stand-alone symptom validity tests (%) 
 None 20.6 
 Amsterdam Short-Term Memory Test 32.4 
 Rey 15-Item Test 30.7 
 Test of Memory Malingering 30.3 
 Word Memory Test 24.1 
 Testbatterie zur Forensischen Neuropsychologie 16.9 
 Dot Counting Test 7.2 
 Aggravations- und Simulationstest 6.6 
 Medical Symptom Validity Test 2.5 
 Coin-in-the-Hand Test 2.3 
 Portland Digit Recognition Test 1.4 
 Non-Verbal Medical Symptom Validity Test 1.4 
b-test 1.4 
 Victorian Symptom Validity Test 1.2 
 Morel Emotional Numbing Test 1.0 
 Computerized Assessment of Response Bias 0.6 
 Validity Indicator Profile 0.2 
Embedded indicators within standard tests (%) 
 None 21.0 
 Auditory Verbal Learning Test 52.4 
 Trail Making Test 48.3 
 Rey Complex Figure Test and Recognition Trial 38.6 
 Stroop Color Word Test 38.3 
 Wisconsin Card Sorting Test 31.7 
 Raven Progressive Matrices 24.5 
 California Verbal Learning Test 23.1 
 Concentration Endurance Test d2 20.6 
 Reliable Digit Span 15.0 
 Line Orientation Test 8.7 
 Reliable Spatial Span 6.4 
Validity questionnaires and scales (%) 
 None 57.7 
 Minnesota Multiphasic Personality Inventory-2 28.7 
 Structured Inventory of Malingered Symptomatology 13.2 
 Personality Assessment Inventory 3.3 

We also asked the respondents which standard tests with adapted procedures or cutoffs (embedded indicators) they used. The Auditory Verbal Learning Test, the Trail Making Test, and the Rey Complex Figure Test and Recognition Trial were named most frequently. Finally, the MMPI-2 validity scales and the Structured Inventory of Malingered Symptomatology (Widows & Smith, 2005) were the scales most often named for detecting the over-reporting of symptoms (28.7% and 13.2%, respectively). More than half of the respondents (57.7%) indicated that they had never used any scale for the detection of non-credible symptoms.

Estimated Base Rates of Non-Credible Symptoms and Malingering

The respondents estimated that 10% of the examinees they had seen in clinical assessments in 2010 and 15% of the examinees they had seen in forensic assessments in 2010 made an insufficient effort for whatever reason (e.g., being unwilling to undergo the assessment, wanting to adopt the patient role, or malingering). Only in 4% of the clinical assessments and in 10% of the forensic assessments did the respondents believe that the insufficient effort was due to malingering. In general, the respondents estimated that malingering occurred in 10% of clinical assessments and 20% of forensic assessments. Surprisingly, these general estimates are higher than the respondents' estimates of malingering in their own practices.

Management and Professional Communication of SVT Failure

The respondents were asked how they instruct examinees before testing. As shown in Table 6, the vast majority of the respondents encouraged examinees to provide their best effort (76% often or always in clinical assessments and 83% often or always in forensic assessments). There appears to be a consensus that examinees in clinical assessments are not warned in advance that measures for the detection of non-credible symptoms are included (69% very rarely or never). The respondents, however, were split as to whether or not they warn their forensic examinees: one quarter warned (almost) every examinee, whereas almost one quarter never warned their examinees beforehand.

Table 6.

Instructing examinees in clinical and forensic assessments before commencing testing

 Percentage responding
 In (almost) every case (>95%) | In a majority of the cases (>50%) | In a fair minority, between 20% and 50% | In less than 20% of the cases | Very rarely, <5% | Never 
Prior to commencing testing, do you specifically encourage examinees to give their best effort? 
 In clinical assessments 51.2 24.8 9.1 6.9 6.3 1.6 
 In forensic assessments 67.5 15.5 5.7 2.5 6.0 2.8 
Prior to commencing testing, do you give examinees any type of warning regarding the fact that psychological tests may be sensitive to poor effort, exaggeration, or faking of deficits? 
 In clinical assessments 6.7 5.7 8.3 10.0 31.1 38.2 
 In forensic assessments 25.4 12.0 9.2 11.0 18.0 24.4 

Subsequently, the respondents were asked to consider only those cases in which they were highly suspicious or certain that an examinee was exaggerating or feigning cognitive deficits and to indicate on a 5-point Likert scale (ranging from never to always) how often they had endorsed a certain action. Approximately two thirds of the respondents stated that they had encouraged the examinee to provide their best effort, and half stated that they had administered additional SVTs in such cases (Table 7). Almost 40% directly confronted the examinee or warned them to provide their best effort. One third reportedly continued the assessment with no change in routine. A few respondents indicated that they had terminated the examination earlier than planned. As can be seen in Table 8, the rank order of endorsed actions largely corresponded to that reported in the survey conducted by Slick and colleagues (2004). The main exception was that the experts in the survey conducted by Slick and colleagues more frequently continued the examination without any change in routine. A possible explanation is that the survey conducted by Slick and colleagues focused more narrowly on experts and their forensic practice, whereas we posed the question of handling non-credible symptoms in a general way and did not distinguish between the clinical and forensic practices of the respondents. It appears likely that neuropsychologists give special attention to the potential for non-credible symptoms in forensic practice (e.g., in the selection of their test battery and in an a priori strategy in response to SVT failure). In other words, neuropsychologists might handle non-credible symptoms more proactively in forensic than in clinical routines. When looking at the proportion of respondents endorsing each statement, the statement “encourage the examinee to give good effort” was less frequently endorsed in our survey than in the Slick and colleagues survey (68% vs. 88%; Fisher's exact p < .05). The same holds true for the statement “continue the examination without any change in routine” (Fisher's exact p < .01).
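
For transparency, comparisons like these can be approximately reproduced from the published percentages and sample sizes. The following minimal sketch in Python reconstructs the 2 × 2 table for the “encourage good effort” contrast (68% of N = 515 in the current survey vs. 88% of N = 24 in Slick and colleagues) by rounding the reported proportions to counts; it illustrates the test rather than reproducing the original analysis script.

```python
from scipy.stats import fisher_exact

# Endorsement ("often" or "always") of encouraging the examinee to give
# good effort. Counts are reconstructed by rounding the reported
# percentages, so the p-value may differ slightly from the raw-data value.
current_yes, current_n = round(0.68 * 515), 515  # current survey
slick_yes, slick_n = round(0.88 * 24), 24        # Slick et al. (2004)

table = [
    [current_yes, current_n - current_yes],  # endorsed vs. not endorsed
    [slick_yes, slick_n - slick_yes],
]
odds_ratio, p = fisher_exact(table)
print(f"odds ratio = {odds_ratio:.2f}, p = {p:.3f}")
# The paper reports Fisher's exact p < .05 for this contrast.
```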

Table 7.

Handling of examinees with evidence of or suspected exaggeration or malingering of cognitive deficits

 Percentage of respondents
 Never | Rarely | Sometimes | Often | Always 
Considering only those cases where you were highly suspicious or certain that an examinee was exaggerating or feigning cognitive deficits, how often did you: 
 Encourage examinee to give good effort 6.2 7.8 18.1 36.5 31.5 
 Administer additional SVTs 22.1 10.7 15.9 26.6 24.7 
 Directly confront or warn to give good effort 17.1 18.4 25.4 27.6 11.5 
 Continue with no change 29.3 23.7 16.9 16.5 13.6 
 Terminate earlier than planned 35.5 31.3 26.4 6.2 0.6 
When examinees obtain test results indicative of exaggerated deficits or malingering, how do you express this opinion in a report or professional communication? How often do you say that: 
 Test results are inconsistent with severity of injury 11.8 21.9 – 50.3 15.9 
 No firm conclusions can be drawn 8.2 31.5 – 44.3 16.1 
 Test data are invalid 18.8 31.6 – 36.2 13.5 
 Test results suggest or indicate malingering 29.3 32.6 – 28.5 9.5 
 Test results suggest or indicate exaggeration 34.4 28.5 – 27.0 10.1 

Notes: SVT = symptom validity testing. The answer option “Test data are invalid” was erroneously not given in the German survey (n = 211).

Table 8.

Ways to handle non-credible symptoms during an assessment, comparing the current survey with the survey of Slick and colleagues (2004)

Actions, ordered from most to least frequently endorsed (often or always):

Current survey:
1. Encourage examinee to give good effort (68%)
2. Administer additional symptom validity tests (51%)
3. Directly confront or warn examinee to give good effort (39%)
4. Continue the examination with no change in routine (30%)
5. Terminate the examination earlier than planned (7%)

Slick and colleagues (2004):
1. Encourage examinee to give good effort (88%)
2. Continue the examination with no change in routine (75%)
3. Administer additional symptom validity tests (71%)
4. Directly confront or warn examinee to give good effort (25%)
5. Terminate the examination earlier than planned (17%)

Table 7 provides information on how the respondents communicate non-credible symptoms professionally. The most frequently used phrases are the following: “the test results are inconsistent with the severity of the injury,” “no firm conclusions can be drawn,” and “the test data are invalid.” Most respondents indicated that they rarely or never used the terms “exaggeration” or “malingering” in their reports.

Table 9 provides an overview of the responses to this question in the surveys conducted by Slick and colleagues (2004) and Sharland and Gfeller (2007) and in the current survey. The broad pattern is that the majority of the respondents reportedly communicated that the test results were inconsistent with the severity of injury, whereas a minority endorsed the statement that the test results suggested or indicated malingering. In contrast, there is disagreement with regard to the statement that the test results suggested exaggeration: whereas 91% of the respondents in the survey conducted by Sharland and Gfeller and 83% of the respondents in the survey conducted by Slick and colleagues indicated that they often or always made this statement, only 37% did so in the current survey (Fisher's exact p's < .05). Another notable difference is that the rates of the most endorsed communication statements were lower in this study than in the previous studies. For example, 66% of the respondents in the present survey stated that they often or always reported that the test results were inconsistent with the severity of the injury, whereas 96% of the respondents in the study conducted by Slick and colleagues and 85% of the respondents in the study conducted by Sharland and Gfeller endorsed this statement (Fisher's exact p's < .05). In fact, Fisher's exact tests showed that the proportion of respondents endorsing a communication statement was significantly higher in both the Slick and colleagues survey and the Sharland and Gfeller survey than in our survey, with two exceptions: there was no statistical difference between the current survey and that of Slick and colleagues in the proportion endorsing the statement that “test results suggest or indicate malingering” (Fisher's exact p = .52), and no difference between the current survey and either previous survey in the proportion endorsing the statement that “no firm conclusions can be drawn” (both Fisher's exact p's > .10). There is no obvious explanation for this finding. However, the difference does suggest that the respondents in the current survey were more divided on how best to communicate information on non-credible symptoms.

Table 9.

Comparison of communication of suspect test results between the current survey, Slick and colleagues (2004), and Sharland and Gfeller (2007)

Rank order of the most frequently endorsed communication statements:

Current survey:
1. Test results are inconsistent with severity of injury (66%)
2. No firm conclusions can be drawn (60%)
3. Test data are invalid (50%)
4. Test results suggest or indicate malingering (38%)
5. Test results suggest or indicate exaggeration (37%)

Slick and colleagues (2004):
1. Test results are inconsistent with severity of injury (96%)
2. Test data are invalid (92%)
3. Test results suggest or indicate exaggeration (83%)
4. No firm conclusions can be drawn (58%)
5. Test results suggest or indicate malingering (46%)

Sharland and Gfeller (2007):
1. Test results suggest or indicate exaggeration (91%)
2. Test results are inconsistent with severity of injury (85%)
3. No firm conclusions can be drawn (66%)
4. Test data are invalid (59%)
5. Test results suggest or indicate malingering (29%)

Note: The answer option “Test data are invalid” was erroneously not given in the German survey (n = 211).

Arguments for and Against the Administration of SVTs

Comparable to the study conducted by McCarter and colleagues (2009), we asked the respondents to list all of the arguments they felt were valid for and against the use of SVTs (Table 10). The most important arguments in favor of the use of SVTs were the following: reading of the literature, the necessity to validate other test results, recommendations from professional bodies, and personal experience with patients who exaggerate symptoms or deficits. These findings are fairly similar to those of the survey conducted by McCarter and colleagues (2009). Applying the two-sample Z-test for proportions, the proportions of respondents in the McCarter and colleagues survey endorsing the statements “my reading of the literature” and “because I might be criticized if I didn't do so” were significantly higher than in the current survey (both Z's > 2, both p's < .05). The opposite holds true for the statement “to cover my back” (Z = 3.06, p < .01).
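
As an illustration of the two-sample Z-test for proportions used for these contrasts, the short sketch below recomputes one comparison from the reported percentages and sample sizes (63% of N = 515 vs. 73% of N = 130 for “my reading of the literature”). Small deviations from the tabled Z-values are expected because the published percentages are rounded.

```python
from math import sqrt

def two_sample_z(p1, n1, p2, n2):
    """Two-sample Z-test for independent proportions, using the pooled
    proportion to estimate the standard error under H0: p1 == p2."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# "My reading of the literature": current survey vs. McCarter et al. (2009)
z = two_sample_z(0.63, 515, 0.73, 130)
print(f"Z = {z:.2f}")  # about -2.1; Table 10 reports |Z| = 2.20
```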

Table 10.

Arguments thought valid for administering or not administering symptom validity tests, comparing the current survey with that of McCarter and colleagues (2009)

 Current survey (N = 515; % respondents) McCarter and colleagues (N = 130; % respondents) Z 
Arguments for 
 My reading of the literature 63 73 2.20* 
 Necessary to validate other test results 59 60 0.12 
 Recommendation from professional bodies 59 53 1.07 
 My own experience of people exaggerating deficits or symptoms 56 58 0.44 
 Many claimants exaggerate deficits or symptoms 34 30 0.94 
 To cover my back 31 18 3.06** 
 Because I might be criticized if I did not do it 20 31 2.59** 
 Many clinical patients exaggerate deficits or symptoms 11 15 1.43 
 Because lawyers insist on it 1.25 
 None n/a – 
Arguments against 
 In the case of obvious severe cognitive impairment an SVT is not valid 47 n/a – 
 Exaggeration/poor effort is usually obvious in the pattern of traditional test scores 25 29 0.88 
 Clinical cases rarely exaggerate or malinger so it is not necessary in that setting 23 26 0.68 
 Exaggeration/poor effort is usually obvious in their presentation 23 29 1.55 
 None 21 n/a – 
 Insufficient time 19 27 2.04* 
 Too many genuine patients or claimants are wrongly classified by these tests 15 13 0.65 
 SVTs are unreliable 15 22 2.02* 
 Cost of effort testing 0.40 
 Identification of low effort might jeopardize your relationship with a patient 0.47 
 So few people exaggerate or malinger that it is not worthwhile testing for it 0.43 
 It is not the psychologists’ role to determine malingering or veracity 1.52 

Note: SVT = symptom validity testing.

*p < .05.

**p < .01.

The most endorsed argument for not using SVTs was the presence of obvious severe cognitive impairment. This is not an argument against the use of SVTs per se but rather against the uncritical application of empirically determined cutoff points. Other arguments against SVTs that were mentioned relatively often relate to the notions that clinical cases rarely exaggerate or malinger, that exaggeration is usually obvious in the other test results or in the presentation of the examinee, and that there is insufficient time to incorporate an SVT in the assessment. These notions are difficult to maintain in light of the current scientific literature on clinical judgment in general and symptom validity assessment in particular. Compared with the current survey, a significantly higher proportion of respondents in the McCarter and colleagues survey endorsed the arguments “insufficient time” and “SVTs are unreliable” for not using SVTs (both Z's > 1.96, both p's < .05).

Discussion

We conducted a survey among neuropsychologists in six European countries to gain insight into the beliefs and practices of continental Western European neuropsychologists. Before discussing the findings, we must address several limitations. First, not all of the Western European neuropsychological societies participated in our survey. At best, the results are representative of the six European countries that did participate (Germany, Italy, Denmark, Finland, Norway, and the Netherlands). Second, the response rate in our survey was low, although the total number of responses (N = 515) exceeded the numbers described in previous surveys (Mittenberg et al., N = 144; Slick et al., N = 24; Sharland & Gfeller, N = 188; McCarter et al., N = 130). The low response rate can likely be attributed in part to our approach: we elected to approach all of the members of the neuropsychology associations in the six countries, and not all of the members of these societies engage in neuropsychological assessments. A low response rate harbors the risk of non-response bias, which negatively affects the representativeness of the data for the whole target group, in this case the neuropsychologists in the six countries. Although non-response bias occurs only when respondents differ from non-respondents, we have no means to determine whether this is the case in our survey and to what extent non-response bias affects the results. To counter this drawback and to place our data into perspective, we made comparisons wherever possible with data obtained from similar surveys conducted in North America and the UK.

With these limitations in mind, we will discuss the main patterns that emerged from the data. The results showed that Western European neuropsychologists are aware of the occurrence of non-credible symptom reports in both forensic and clinical assessments. The general estimates of the prevalence of malingering (10% in clinical assessments and 20% in forensic assessments) corresponded broadly to the data from empirical studies (e.g., Boone, 2013; Dandachi-FitzGerald et al., 2011; Gervais et al., 2004; Kemp et al., 2008; Miele et al., 2012), although the estimate for forensic assessments was at the lower end of the range reported in empirical studies.

Our survey did reveal an interesting discrepancy between these general estimates of malingering and the estimates for the respondents' own practices (4% in clinical assessments and 10% in forensic assessments). The respondents appeared to believe that their colleagues encountered malingering examinees more often than they themselves did. One possible explanation for this finding is that neuropsychologists are inclined to believe what examinees say during the interview and show in their presentation and test scores; thus, neuropsychologists overestimate their own malingering detection abilities. Clinicians' overestimation of their own ability to detect fraudulent communication (i.e., believing their detection is better than that of their colleagues) has been documented previously (cf. Hall & Pritchard, 1996). Another explanation is that malingering is a term with strong dichotomous (present vs. absent) and negative (honest vs. fraudulent; authentic patient vs. malingerer) connotations, whereas neuropsychologists often attempt to maintain a more nuanced view of their own patients. A more nuanced conception of malingering than the current definition in the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV-TR; American Psychiatric Association, 2000) appears more appropriate for describing the complex nature of deceitful behavior (Berry & Nelson, 2010; Merckelbach, Jelicic, & Pieters, 2011; Merckelbach & Merten, 2012).

The high estimates of non-credible symptom reports argue for the systematic determination of the validity of the obtained diagnostic data in neuropsychological assessments. However, comparable with the findings of the earlier studies conducted by Mittenberg and colleagues (2002) and Sharland and Gfeller (2007), we found a discrepancy between the acknowledgment that a sizeable minority of examinees present non-credible symptoms and the use of objective detection methods. Most respondents indicated that they based their judgments on discrepancies between records, self-reports, and observed behavior and on a severity of cognitive impairment inconsistent with the condition, whereas only a minority systematically based their judgment on SVTs. The limited use of SVTs in forensic assessments (i.e., in the total sample, only 44.9% always included an SVT) is particularly troublesome. The body of knowledge on symptom credibility has grown to such a degree that reliance on subjective judgment alone is incompatible with evidence-based practice. In forensic assessments in particular, a determination of symptom credibility must be based on objective information, or the clinical impression must at least be substantiated with objective information.

In line with the previous surveys, the TOMM, the FIT, and the WMT were among the most widely used stand-alone SVTs. In contrast to the surveys conducted on other continents, the ASTM ranked high in our survey. The difference in the use of this particular verbal SVT can most likely be attributed to the Dutch origin of the test; an English translation did not become available until 2005. Remarkably, the FIT shows up in every survey as one of the most widely used SVTs, even though research has shown that this test has high specificity but low sensitivity (Nitch & Glassmire, 2007). A number of factors, such as familiarity with the test (it is one of the oldest SVTs), ease of administration and scoring, and free availability, might play a role in its popularity.

Another striking finding is that fewer than one in four respondents stated that they often or always used embedded indicators to determine the validity of the obtained test scores. Although research shows that embedded measures are often less sensitive than stand-alone SVTs for the detection of underperformance (e.g., Miele et al., 2012), acceptable sensitivity and specificity rates (84% and 94%, respectively) were found by Victor, Boone, Serpa, Buehler, and Ziegler (2009) when using a “two-failure rule” (i.e., failure of any two embedded indicators of underperformance). Moreover, embedded measures have certain advantages over stand-alone SVTs. The first and obvious benefit is that these measures do not require additional time or money; they are automatically available. Another reason for using embedded indicators is that these measures appear to be less sensitive to the effects of preparation and coaching (Suhr & Gunstad, 2000).
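
To make the practical meaning of such accuracy figures concrete, the sketch below applies Bayes' theorem to the sensitivity (.84) and specificity (.94) reported for the two-failure rule. The base rates are illustrative assumptions (roughly spanning the clinical and forensic estimates discussed earlier), not values taken from Victor and colleagues (2009).

```python
def positive_predictive_value(sensitivity, specificity, base_rate):
    """Probability that an examinee who fails the two-failure rule is a
    true case of underperformance, given an assumed base rate."""
    true_pos = sensitivity * base_rate
    false_pos = (1 - specificity) * (1 - base_rate)
    return true_pos / (true_pos + false_pos)

# Sensitivity/specificity from Victor et al. (2009); base rates are
# illustrative assumptions only.
for base_rate in (0.05, 0.10, 0.20):
    ppv = positive_predictive_value(0.84, 0.94, base_rate)
    print(f"base rate {base_rate:.0%}: PPV = {ppv:.2f}")
# base rate 5%: PPV = 0.42; 10%: PPV = 0.61; 20%: PPV = 0.78
```

Even with these accuracy figures, a positive finding at a low base rate is far from conclusive, which is one reason why failed validity indicators should be weighed together with other sources of evidence rather than interpreted in isolation.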

The problem of coaching is not a trivial point, as illustrated by two surveys conducted among attorneys who dealt with personal injury and work-related compensation claims. Wetter and Corrigan (1995) found that almost half of the attorneys considered it their duty to inform their clients of the validity scales within a neuropsychological evaluation. In the survey conducted by Essig, Mittenberg, Petersen, Strauman, and Cooper (2001), the majority of attorneys (75%) reported spending between 15 min and 1 h preparing their clients for a neuropsychological evaluation. Unsurprisingly, searching the Internet appears to be a common strategy for finding information about symptoms and about deception detection measures (Bauer & McCaffrey, 2006; Dandachi-FitzGerald & Merckelbach, 2013).

In addition, more than half of the respondents indicated never using symptom validity measures that index the over-reporting of symptoms. The empirical evidence, however, shows that non-credible symptom reporting is not a one-dimensional phenomenon (e.g., Dandachi-FitzGerald & Merckelbach, 2013; Nelson, Sweet, Berry, Bryant, & Granacher, 2007). Some claimants over-report psychological symptoms while performing to the best of their cognitive abilities on neuropsychological tests, and vice versa. Consequently, to determine the validity of both the reported symptoms and the cognitive test scores, both types of SVTs should be included in the assessment.
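To make this two-dimensional logic explicit, the sketch below shows one minimal way of combining the outcomes of the two types of measures into a validity profile; the function and its inputs are hypothetical placeholders, not validated decision rules.

    # Hypothetical sketch of a two-dimensional validity profile.
    # The inputs stand for (a) failure on a performance validity test and
    # (b) elevation on a self-report over-reporting scale; the names and
    # the implied cutoffs are placeholders, not validated decision rules.
    def validity_profile(pvt_failed: bool, over_report_elevated: bool) -> str:
        if pvt_failed and over_report_elevated:
            return "non-credible test performance and over-reported symptoms"
        if pvt_failed:
            return "non-credible test performance, credible symptom report"
        if over_report_elevated:
            return "over-reported symptoms, credible test performance"
        return "no indication of invalidity on either dimension"

    # Example: a claimant with an elevated over-reporting scale who
    # nevertheless performs to the best of his or her abilities.
    print(validity_profile(pvt_failed=False, over_report_elevated=True))

The point of the sketch is simply that omitting either type of measure leaves one of the two dimensions unexamined.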

Regarding the administration of SVTs, the survey revealed disagreement among the respondents on whether or not to warn patients in advance during forensic assessments. This observation is not surprising given the debate in the published literature. The argument most frequently raised against warning is that it might lead to more subtle malingering attempts (Youngjohn, Lees-Haley, & Binder, 1999). However, the published empirical studies report divergent results on the effects of warning. Although several researchers found that warning and coaching have an effect, particularly on stand-alone SVTs (e.g., Suhr & Gunstad, 2000), others have reported that validity measures are robust against coaching (e.g., Jelicic, Ceunen, Peters, & Merckelbach, 2011). These divergent results may be explained by differences in the type of instrument under investigation and in the nature and intensity of the coaching and warning. An important argument in favor of warning is that, from an ethical standpoint, examinees should be well informed so that they can provide full consent for their evaluation (e.g., Iverson, 2006). In conclusion, more research appears warranted on the question of to what extent examinees can be informed without jeopardizing detection strategies for negative response bias. In the end, both ethical considerations and the potential deterrent effects of warning must be weighed.

With regard to the overall results of the current study, the following broad conclusion appears justified: although Western European neuropsychologists acknowledge the occurrence of non-credible symptoms and are knowledgeable about the various SVTs, they place too much weight on subjective clinical judgment. Consequently, empirically validated methods are used insufficiently. As McCarter and colleagues (2009, p. 1057) stated, “the results suggest a lack of appreciation of the demonstrated increase in accuracy obtained when an objective assessment is used in conjunction with experience.” There is still a gap between the scientific consensus that SVTs should be included in every forensic assessment (Bush et al., 2005; Heilbronner et al., 2009) and the actual practice of neuropsychologists in Western Europe. In our opinion, an important step in bridging this gap between science and practice is to make symptom validity assessment an integral part of neuropsychology training.

A factor that may further hamper a more systematic use of SVTs emerged from the answers to the questions of how best to respond when patients fail SVTs and how to communicate the results to them. First recommendations have been published (e.g., Carone, Iverson, & Bush, 2010). It seems vital that scientific progress in the field of symptom validity not be limited to improving the techniques used to detect non-credible symptoms; it should also encompass conceptual issues that give meaning to non-credible symptoms (i.e., self-deception vs. other-deception; e.g., Merckelbach & Merten, 2012) and determine the best way to provide feedback and treatment advice to patients who exhibit a negative response bias in an examination. Scientific data on these topics are emerging (e.g., Suchy, Chelune, Franchow, & Thorgussen, 2012) and will hopefully lead to more concrete, practical guidelines for neuropsychologists.

Conflict of Interest

None declared.

Acknowledgements

We thank the chairs of the six European Neuropsychological Societies for their valuable assistance in distributing the survey to the members of their society: Professor S. Cappa (Italy), Dr H. Niemann (Germany), Dr J. Andersen (Denmark), Dr E. Hessen (Norway), Dr L. Hokkanen (Finland), and Professor G. Vingerhoets (the Netherlands). We also thank Professor L. Fasotti and Dr S. Zago for their help with the Italian translation of the survey.

References

American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders (4th ed., text rev.). Washington, DC: Author.
Bauer, L., & McCaffrey, R. J. (2006). Coverage of the Test of Memory Malingering, Victoria Symptom Validity Test, and Word Memory Test on the Internet: Is test-security threatened? Archives of Clinical Neuropsychology, 21, 121–126.
Berry, D. T. R., & Nelson, N. W. (2010). DSM-5 and malingering: A modest proposal. Psychological Injury and Law, 3, 295–303.
Bianchini, K. J., Curtis, K. L., & Greve, K. W. (2006). Compensation and malingering in traumatic brain injury: A dose-response relationship? The Clinical Neuropsychologist, 20, 831–847.
Boone, K. B. (2007). Assessment of feigned cognitive impairment: A neuropsychological perspective. New York: The Guilford Press.
Boone, K. B. (2013). Clinical practice of forensic neuropsychology: An evidence-based approach. New York: The Guilford Press.
Bush, S. S., Ruff, R. M., Tröster, A. I., Barth, J. J., Koffler, S. P., Pliskin, N. H., et al. (2005). Symptom validity assessment: Practice issues and medical necessity, NAN Policy and Planning Committee. Archives of Clinical Neuropsychology, 20, 419–426.
Carone, D. A., Iverson, G. L., & Bush, S. S. (2010). A model for approaching and providing feedback to patients regarding invalid test performance in clinical neuropsychological evaluations. The Clinical Neuropsychologist, 24, 759–778.
Chafetz, M. D., Prentkowski, E., & Rao, A. (2011). To work or not to work: Motivation (not low IQ) determines symptom validity test findings. Archives of Clinical Neuropsychology, 26, 306–313.
Dandachi-FitzGerald, B., & Merckelbach, H. (2013). Feigning ≠ feigning a memory deficit: The Medical Symptom Validity Test as an example. Journal of Experimental Psychopathology, 4, 46–53.
Dandachi-FitzGerald, B., Ponds, R. W. H. M., Peters, M. J. V., & Merckelbach, H. (2011). Cognitive underperformance and symptom over-reporting in a mixed psychiatric sample. The Clinical Neuropsychologist, 25, 812–828.
Essig, S. M., Mittenberg, W., Petersen, R. S., Strauman, S., & Cooper, J. T. (2001). Practices in forensic neuropsychology: Perspectives of neuropsychologists and trial attorneys. Archives of Clinical Neuropsychology, 16, 271–291.
Fox, D. D. (2011). Symptom validity test failure indicates invalidity of neuropsychological tests. The Clinical Neuropsychologist, 25, 488–495.
Gervais, R. D., Rohling, M. L., Green, P., & Ford, W. (2004). A comparison of the WMT, CARB, and TOMM failure rates in non-head injury disability claimants. Archives of Clinical Neuropsychology, 19, 475–487.
Green, P. (2003). Word Memory Test for Windows: User's manual and program. Edmonton, Canada: Author.
Green, P. (2007). The pervasive influence of effort on neuropsychological tests. Physical Medicine and Rehabilitation Clinics of North America, 18, 43–68.
Hall, H. V., & Pritchard, D. A. (1996). Detecting malingering and deception: Forensic distortion analysis (FDA). Delray Beach, FL: St. Lucie Press.
Heilbronner, R. L., Sweet, J. J., Morgan, J. E., Larrabee, G. J., Millis, S. R., & Conference Participants. (2009). American Academy of Clinical Neuropsychology consensus conference statement on the neuropsychological assessment of effort, response bias, and malingering. The Clinical Neuropsychologist, 23, 1093–1129.
Iverson, G. L. (2006). Ethical issues associated with the assessment of exaggeration, poor effort, and malingering. Applied Neuropsychology, 13, 77–90.
Jelicic, M., Ceunen, E., Peters, M. J. V., & Merckelbach, H. (2011). Detecting coached feigning using the Test of Memory Malingering (TOMM) and the Structured Inventory of Malingered Symptomatology (SIMS). Journal of Clinical Psychology, 67, 850–855.
Kemp, S., Coughlan, A. K., Rowbottom, C., Wilkinson, K., Teggart, V., & Baker, G. (2008). The base rate of effort test failure in patients with medically unexplained symptoms. Journal of Psychosomatic Research, 65, 319–325.
Locke, D. E. C., Smigielski, J. S., Powell, M. R., & Stevens, S. R. (2008). Effort issues in post-acute outpatient acquired brain injury rehabilitation seekers. NeuroRehabilitation, 23, 273–281.
McCarter, R. J., Walton, N. H., Brooks, D. N., & Powell, G. E. (2009). Effort testing in contemporary UK neuropsychological practice. The Clinical Neuropsychologist, 23, 1050–1066.
Merckelbach, H., Jelicic, M., & Pieters, M. (2011). The residual effect of feigning: How intentional faking may evolve into a less conscious form of symptom reporting. Journal of Clinical and Experimental Neuropsychology, 33, 131–139.
Merckelbach, H., & Merten, T. (2012). A note on cognitive dissonance and malingering. The Clinical Neuropsychologist, 26, 1217–1229.
Meyers, J. E., Volbrecht, M., Axelrod, B. N., & Reinsch-Boothby, L. (2011). Embedded symptom validity tests and overall neuropsychological test performance. Archives of Clinical Neuropsychology, 26, 8–15.
Miele, A. S., Gunner, J. H., Lynch, J. K., & McCaffrey, R. J. (2012). Are embedded validity indices equivalent to free-standing symptom validity tests? Archives of Clinical Neuropsychology, 27, 10–22.
Mittenberg, W., Patton, C., Canyock, E. M., & Condit, D. C. (2002). Base rates of malingering and symptom exaggeration. Journal of Clinical and Experimental Neuropsychology, 24, 1094–1102.
Nelson, N. W., Sweet, J. J., Berry, D. T. R., Bryant, F. B., & Granacher, R. P. (2007). Response validity in forensic neuropsychology: Exploratory factor analytic evidence of distinct cognitive and psychological constructs. Journal of the International Neuropsychological Society, 13, 440–449.
Nitch, S. R., & Glassmire, D. M. (2007). Non-forced-choice measures to detect noncredible cognitive performance. In K. B. Boone (Ed.), Assessment of feigned cognitive impairment: A neuropsychological perspective (pp. 79–102). New York: The Guilford Press.
Rey, A. (1958). L'examen clinique en psychologie [The clinical assessment in psychology]. Paris: Presses Universitaires de France.
Rohling, M. L., Green, P., Allen, L. M., & Iverson, G. L. (2002). Depressive symptoms and neurocognitive test scores in patients passing symptom validity tests. Archives of Clinical Neuropsychology, 17, 205–222.
Schmand, B., & Lindeboom, J. (2005). Amsterdam Short-Term Memory Test: Manual. Leiden, The Netherlands: PITS.
Sharland, M. J., & Gfeller, J. D. (2007). A survey of neuropsychologists' beliefs and practices with respect to the assessment of effort. Archives of Clinical Neuropsychology, 22, 213–223.
Slick, D. J., Tan, J. E., Strauss, E. H., & Hultsch, D. F. (2004). Detecting malingering: A survey of experts' practices. Archives of Clinical Neuropsychology, 19, 465–473.
Suchy, Y., Chelune, G., Franchow, E. I., & Thorgussen, S. R. (2012). Confronting patients about insufficient effort: The impact on subsequent symptom validity and memory performance. The Clinical Neuropsychologist, 26, 1296–1311.
Suhr, J. A., & Gunstad, J. (2000). The effects of coaching on the sensitivity and specificity of malingering measures. Archives of Clinical Neuropsychology, 15, 415–424.
Sweet, J. J., & Guidotti Breting, L. M. (2013). Symptom validity test research: Status and clinical implications. Journal of Experimental Psychopathology, 4, 6–19.
Tombaugh, T. N. (1996). Test of Memory Malingering (TOMM). North Tonawanda, NY: Multi-Health Systems.
Van Hout, M. S. E., Schmand, B., Wekking, E. M., & Deelman, B. G. (2006). Cognitive functioning in patients with suspected chronic toxic encephalopathy: Evidence for neuropsychological disturbances after controlling for insufficient effort. Journal of Neurology, Neurosurgery and Psychiatry, 77, 296–303.
Victor, T. L., Boone, K. B., Serpa, J. G., Buehler, J., & Ziegler, E. A. (2009). Interpreting the meaning of multiple symptom validity test failure. The Clinical Neuropsychologist, 23, 297–313.
Wetter, M. W., & Corrigan, S. K. (1995). Providing information to clients about psychological tests: A survey of attorneys' and law students' attitudes. Professional Psychology: Research and Practice, 26, 474–477.
Widows, M. R., & Smith, G. P. (2005). Structured Inventory of Malingered Symptomatology. Odessa, FL: Psychological Assessment Resources.
Youngjohn, J. R., Lees-Haley, P. R., & Binder, L. M. (1999). Comment: Warning malingerers produces more sophisticated malingering. Archives of Clinical Neuropsychology, 14, 511–515.