## Abstract

Objective:

The Hayling Sentence Completion Test evaluates the ability to inhibit an automatic response. It has also been suggested for the assessment of orbitofrontal cortex function. The aim of the study was to develop a Spanish version of the Hayling test and to obtain normative data.

Method:

Responses to 60 sentences from 50 healthy controls were used to develop the task. Additionally, 185 healthy controls aged between 18 and 99 years were examined with the test in order to obtain normative data. The overlapping interval strategy was used to maximize the sample size. Age- and education-adjusted scores were obtained using linear regression analysis.

Results:

Age and educational level had a significant effect on the different scores. Good internal consistency and inter-rater reliability were observed.

Conclusions:

We provide normative data adjusted for age and education. Our results enable the use of this test for clinical and research purposes in the field of neuropsychological assessment.

## Introduction

Neuropsychological assessment is an essential tool in the diagnosis of neurological and psychiatric disorders. Executive functioning encompasses the most complex behaviors, including the ability to respond to novel situations. Most neuropsychological tests that assess executive abilities have been associated with dorsolateral prefrontal cortex function (Egner & Hirsch, 2005; Hagen et al., 2014; Lazeron et al., 2000; Müller et al., 2014; Yuan & Raz, 2014).

However, in several diseases of the frontal lobe, the orbitofrontal cortex is the region affected earliest. Indeed, during the progression of frontotemporal dementia, impairment of the orbitofrontal cortex occurs earlier than impairment of the dorsolateral cortex (Fernández-Matarrubia, Matías-Guiu, Moreno-Ramos, & Matías-Guiu, 2014; Hornberger et al., 2010).

The Hayling Sentence Completion Test has been suggested as a useful neuropsychological tool for assessing the function of the orbitofrontal cortex (Hornberger, Geng, & Hodges, 2011; Volle et al., 2012). It comprises two tasks: in the first, the subject has to complete a sentence with a word clearly suggested by the meaning of the first part of the sentence (e.g., “We eat soup with … spoon”); in the second, the subject has to produce a word that is unrelated to the sentence (e.g., “We eat soup with … building”). The test was initially developed as a sensitive tool for frontal dysfunction (Burgess & Shallice, 1996). Since then, it has been adapted to different populations and has proved useful in the diagnosis of frontotemporal dementia, Alzheimer's disease, amyotrophic lateral sclerosis, Parkinson's disease, Gilles de la Tourette syndrome, bipolar disorder, and schizophrenia, among others (Bellevile, Rouleau, & Van der Linden, 2006; Bouquet, Bonnaud, & Gil, 2003; Eddy, Rizzo, & Cavanna, 2009; Chan et al., 2012). Moreover, short versions of the test have been included in some screening tests, such as the Institute of Cognitive and Behavioural Frontal Screening test (Torralva, Roca, Gleichgerrcht, Lopez, & Manes, 2009) or the Edinburgh Cognitive and Behavioural ALS Screen for amyotrophic lateral sclerosis (Abrahams, Newton, Niven, Foley, & Bak, 2014).

The neural substrate of inhibition in the Hayling task has been studied with neuroimaging, raising the question of whether this test serves as a reliable measure of orbitofrontal function. In this regard, performance has been correlated with orbitofrontal atrophy in frontotemporal dementia (Hornberger et al., 2011). Also, in an activation study using positron emission tomography with labeled water (H₂¹⁵O), verbal suppression in the second part of the test was associated with the middle and inferior frontal gyri as well as the orbitofrontal cortex (Collette et al., 2001). More recently, a study of patients with frontal lobe damage linked Hayling test performance to the right lateral prefrontal cortex (Robinson et al., 2015). Overall, these studies confirm the usefulness and sensitivity of the Hayling test for the assessment of frontal function, although the specific frontal regions associated with the test may differ between studies. Performance of the Hayling task involves several cognitive processes, including the initiation of a behavior, suppression of an automatic response, the development of a strategy, verbal generation, and the maintenance of an appropriate strategy across trials. Thus, the execution of the test probably requires a network involving the interaction of several frontal regions (Hornberger & Bertoux, 2015). In this regard, the impairment of specific processes has been associated more closely with particular regions, such as the medial rostral prefrontal cortex with initiation and the orbitofrontal cortex with errors in suppression (Volle et al., 2012).

However, cognitive functioning is influenced by demographic and social factors, among others. Thus, normative data for neuropsychological tests are required to describe the cognitive status of a subject, especially with aging (Lezak, Howieson, Bigler, & Tranel, 2012). Concerning the Hayling test, several studies have observed the influence of demographic factors on test performance, so normative data are necessary for an appropriate interpretation of results (Bielak, Mansueti, Strauss, & Dixon, 2006; Tournier, Postal, & Mathey, 2014). In this regard, Bielak et al. included 432 healthy subjects (301 women) from British Columbia, Canada, aged between 53 and 90 years, with 15.23 ± 2.86 years of formal education. In that study, notable effects of age were observed in the two tasks of the test, especially in the second task; in contrast, education had a minor effect and gender a non-significant one.

To our knowledge, there are no suitable or standardized versions of the Hayling test for the Spanish population that take into account its cultural and linguistic characteristics. For this reason, our aim was to develop the test and provide normative data for the Spanish population.

## Methods

### Description of the Test

The test consists of 30 sentences and is divided into two sections (A and B) with 15 sentences in each part. The examiner reads a sentence but leaves out the last word, and the subject then provides a word that suitably completes the sentence. In the first part, the last word has to be appropriate to the context (response initiation); in the second part, the word has to be unrelated to the context of the sentence (response suppression). Two main variables are obtained: time (response latency) and score (adequacy of the response). Because completion of the first part is an overlearned task, and the second part requires suppression of an automatic response set up in part A, the difference, or the ratio, between part B and part A is considered an indicator of executive function.

### Development of the Test

A set of sentences lacking the last word but eliciting an automated response was selected. We obtained these sentences from previous versions in other languages and from new phrases designed by the authors. Then, 50 cognitively healthy subjects performed the test, and we chose the 30 sentences with the most homogeneous responses (all with the same response in >90% of cases). We conducted a pilot study in 20 cognitively healthy subjects to ensure that the test was understandable and applicable. With these results, we obtained a final version (Supplementary material online).

The test includes the presentation of 30 incomplete sentences with omission of the last word. The examiner reads the sentences and the subject/patient has to supply the last word. In part A, the individuals have to complete the sentence with a word that is appropriate or rational (e.g., “Bees produce … honey”). Answers are classified as either correct (0 points) or incorrect (1 point). Thus, a higher score (more errors) implies a lower performance.

Part B also consists of 15 sentences without the last word, each of which should elicit an automated response. However, the subject now has to suppress the automated response and provide any other word unrelated to the context of the sentence. Responses are scored in three bands: the expected completion (3 points), a semantically related or opposite word (1 point), or a word not related to the context (0 points).

The ideal answer in this part of the test is a single word that is not related (N: non-related) to the context of the sentence. Such responses may be classified according to the strategy used by the subject/patient: NR (non-related room), in which the subject provides a word for something found in the room where the evaluation is performed (e.g., “The weather in winter is … computer”); NL (non-related last), in which the word is not related to the sentence but is semantically related to the previous response (if the previous response was apple, an example of an NL word would be “The weather in winter is … banana”); NB (non-related both), when both the NR and NL conditions are met (if the previous response was chair, an NB response would be “The weather in winter is … table,” since it is semantically related to chair, both being furniture, and it can be found in the room); and finally N, in which the word is not related to the context of the sentence and does not meet any of the previous criteria (e.g., “The weather in winter is … car”). All these subgroups are scored with 0 points.

A word that reasonably completes the phrase (e.g., “The weather in winter is … cold”) is the automated response. It violates the instructions for the task and is therefore scored with 3 points.

The subject may also answer with a word that is somehow related to the context of the sentence, which is scored with 1 point. There are different subgroups for this type of response: Opposite (O, the word is the opposite of the expected one, for example “The weather in winter is … warm”); Semantically A (SA, the word is clearly semantically related to the context of the sentence: “The weather in winter is … snow”); Semantically B (SB, the word is semantically related to the context of the sentence, but the relationship is weaker, for example “The weather in winter is … scarf”); and Semantically C (SC, the word fits only very loosely at the end of the sentence, and the final meaning is ridiculous or obscene).

Latencies and scores of all sentences in each part are summed, yielding four scores: time part A, score part A, time part B, and score part B. We then used the ratio between B and A as the primary endpoint, for both score and time. The final score in part A should be 0 in almost all subjects, because it is an automated and overlearned response. For this reason, to be able to calculate the ratio between scores B and A, we added 10 points to both scores beforehand.
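The scoring rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names and the category code `EXPECTED` (for the automatic completion scored with 3 points) are hypothetical labels, while the other codes (O, SA, SB, SC, N, NR, NL, NB) and the 10-point offset come from the text.

```python
from typing import Iterable

# Points per Part B response category, following the description in the text.
# "EXPECTED" is a hypothetical label for the automatic completion (3 points).
PART_B_POINTS = {
    "EXPECTED": 3,                        # failure to suppress the automatic word
    "O": 1, "SA": 1, "SB": 1, "SC": 1,    # opposite or semantically related
    "N": 0, "NR": 0, "NL": 0, "NB": 0,    # unrelated to the sentence
}


def score_part_a(correct: Iterable[bool]) -> int:
    """Part A error score: 0 points per correct answer, 1 per incorrect answer."""
    return sum(0 if ok else 1 for ok in correct)


def score_part_b(categories: Iterable[str]) -> int:
    """Part B score: sum of the points of each response category."""
    return sum(PART_B_POINTS[c] for c in categories)


def score_ratio(score_a: int, score_b: int, offset: int = 10) -> float:
    """Score ratio (B + 10)/(A + 10); the offset avoids dividing by a zero A score."""
    return (score_b + offset) / (score_a + offset)


def time_ratio(time_a: float, time_b: float) -> float:
    """Time ratio B/A from the summed response latencies (seconds)."""
    return time_b / time_a


# Example: no errors in part A; in part B, 9 unrelated words,
# 3 semantically related words and 3 automatic completions.
part_a = score_part_a([True] * 15)
part_b = score_part_b(["N"] * 9 + ["SA"] * 3 + ["EXPECTED"] * 3)
print(part_a, part_b, score_ratio(part_a, part_b))
```

With these inputs the part B score is 12 and the score ratio is (12 + 10)/(0 + 10) = 2.2. Higher ratios indicate poorer suppression.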

### Normative Data

In order to obtain normative data, a standardized protocol was administered to all participants. It included a complete questionnaire on demographic and clinical data, the Mini-Mental State Examination (Blesa et al., 2001), the Clinical Dementia Rating (Morris, 1993), and the Functional Activities Questionnaire (Pfeffer et al., 1982). The Hayling test was then administered. The protocol was administered in a single session lasting approximately 20 min. The Hospital Ethics Committee approved the research protocol, and informed consent was obtained from all subjects.

Participants were healthy volunteers or healthy relatives of patients from two regions in Spain, Madrid and Barcelona. Inclusion criteria were (1) age between 18 and 99 years; (2) absence of any cognitive or functional impairment, as measured by three different scales: Mini-Mental State Examination score adjusted for age and education >24 (Blesa et al., 2001), Clinical Dementia Rating (CDR) of 0 (Morris, 1993), and Functional Activities Questionnaire (FAQ) score of 0 (Pfeffer et al., 1982); and (3) Spanish as native language. Exclusion criteria were (1) neurological disease, (2) systemic disease potentially associated with cognitive impairment, (3) psychiatric disease, (4) substance abuse, and (5) auditory impairment that might jeopardize the administration of the test.

### Statistical Analysis

Statistical analysis was performed using IBM SPSS® Statistics 20.0. Internal consistency was measured using Cronbach's α, and inter-rater reliability with the intraclass correlation coefficient. Owing to the homogeneity of responses in part A, these parameters were calculated for the time of part A and for part B. The effects of age, years of education, and gender were assessed using Pearson's correlation coefficient (r) and the coefficient of determination (r²).
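For readers unfamiliar with the internal-consistency statistic, the following sketch shows the textbook formula for Cronbach's α, α = k/(k − 1) × (1 − Σ item variances / variance of the total). It is not the SPSS procedure used in the study; the function name and the layout of the input (one list of values per sentence, one value per participant) are assumptions made for illustration.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(items: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha; items[i][j] is the value of item i for participant j."""
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    # Total per participant across all items.
    totals = [sum(values) for values in zip(*items)]
    # Sum of the sample variances of the individual items.
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Toy data: latencies (s) of 2 sentences for 4 participants.
print(cronbach_alpha([[1, 2, 3, 4], [2, 1, 4, 3]]))
```

In this toy example each item has variance 5/3 and the totals have variance 16/3, so α = 2 × (1 − (10/3)/(16/3)) = 0.75.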

The overlapping interval strategy was used to maximize the sample size. This strategy is the same as the one used in the Mayo Older American Normative Studies and the Spanish Multicenter Normative Studies (Pauker, 1988; Peña-Casanova et al., 2009). In our study, 13 overlapping age intervals of 11 years (e.g., 45–55) were created to provide norms for a 5-year age range around the midpoint (for instance, age range 48–52, midpoint 50). Successive intervals were built accordingly, except for the extreme ages in the sample. Subsequently, raw scores were converted to percentile ranks within each age distribution, and then to scaled scores (from 2 to 18; mean 10, SD 3). Graphical representation was used to test the normality of the distribution of each score. Linear regression analysis was used to explore the relationship of the scores with years of education and gender. The following formula was used to estimate the scaled scores adjusted by age and education:

$\mathrm{SSAE} = \mathrm{SSA} - \left(\beta \times [\mathrm{Education} - 12]\right),$

where SSAE is the scaled score adjusted by age and education; SSA is the scaled score adjusted only by age; β is the regression coefficient for education; and Education is the number of years of formal education.
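The adjustment formula can be written as a short Python function. This is a minimal sketch, not the authors' code: the function names are hypothetical, the β values are those given in the notes to Tables 4 and 5, and truncating the result to an integer is an assumption, chosen because it reproduces the worked example in the Results (an age-adjusted scaled score of 7 becomes 8 with 6 years of education and 5 with 18 years).

```python
import math

BETA_TIME = 0.201    # regression coefficient from the note to Table 4 (B/A time)
BETA_SCORE = 0.241   # regression coefficient from the note to Table 5 (score ratio)
REFERENCE_YEARS = 12  # education level at which no adjustment is applied


def adjust_for_education(ssa: int, education_years: float, beta: float) -> float:
    """SSAE = SSA - beta * (Education - 12), before any rounding."""
    return ssa - beta * (education_years - REFERENCE_YEARS)


def adjusted_scaled_score(ssa: int, education_years: float, beta: float) -> int:
    """Integer SSAE; truncation (floor) is an assumption, see the lead-in."""
    return math.floor(adjust_for_education(ssa, education_years, beta))


# Worked example from the Results: age-adjusted scaled score of 7 on the score ratio.
print(adjusted_scaled_score(7, 6, BETA_SCORE))   # 6 years of education
print(adjusted_scaled_score(7, 18, BETA_SCORE))  # 18 years of education
```

Because β is positive, subjects with fewer than 12 years of education receive a higher adjusted score, and subjects with more than 12 years a lower one.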

## Results

### Sample and Psychometric Properties

One hundred and eighty-five participants were included (57.8% women). Mean age was 57.73 ± 16.28 years old and mean formal education was 12.07 ± 5.10 years. Eighteen subjects (9.7%) had hypertension, 8 (4.3%) diabetes mellitus, 1 (0.5%) coronary heart disease, 3 (1.6%) arrhythmia, and 3 (1.6%) history of cancer.

Regarding psychometric properties, Cronbach's α was 0.846 (time A), 0.797 (time B), and 0.839 (score B). The intraclass correlation coefficient (inter-rater reliability) was 0.873 (part B).

### Normative data

In part A, mean score was 0.07 ± 0.29 and mean time was 8.54 ± 3.91 s. In part B, mean score was 6.91 ± 4.78, and mean time was 38.45 ± 20.80 s. Age had an effect on time A (r = .329, p < .01), time B (r = .438, p < .01), and score B (r = .373, p < .01), but not on score A (r = .104, p = .157). Education showed an effect on time B (r = −.397, p < .01) and score B (r = −.464, p < .01), but not on time A (r = −.107, p = .147) or score A (r = .018, p = .811). In contrast, no correlation was observed between gender and any part of the test (Table 1). Correlation analysis between the Hayling ratio B/A (for time and score) and educational level and age showed that age correlated with score B/A (r = .366, p < .01), but not with time B/A (r = .075, p = .311). Education showed a significant effect on both score B/A (r = −.463, p < .01) and time B/A (r = −.232, p = .002).

Table 1. Correlation (r) and determination (r²) coefficients

| Measure | Age r | Age r² | Age p | Education r | Education r² | Education p | Gender r | Gender r² | Gender p |
|---|---|---|---|---|---|---|---|---|---|
| Hayling A (score) | .104 | .010 | .157 | −.018 | .000 | .811 | −.094 | .008 | .206 |
| Hayling A (time) | .329 | .108 | <.01 | −.107 | .011 | .147 | .006 | .000 | .933 |
| Hayling B (score) | .373 | .139 | <.01 | −.464 | .215 | <.01 | .049 | .002 | .510 |
| Hayling B (time) | .438 | .191 | <.01 | −.397 | .157 | <.01 | .088 | .007 | .234 |
| Hayling (B + 10)/(A + 10) (score) | .366 | .133 | <.01 | −.463 | .214 | <.01 | .060 | .003 | .415 |
| Hayling B/A (time) | .075 | .005 | .311 | −.232 | .053 | .002 | .117 | .013 | .114 |

Age-adjusted scaled scores are shown in Tables 2 and 3. Although age was not significantly correlated with time B/A, scoring was also grouped according to age ranges to follow the same procedure for the B/A score and the procedure performed in other studies regarding normalization of cognitive tests. In these tables, we provide normative data adjusted for age: for instance, a 69-year-old subject with a Hayling score (B + 10)/(A + 10) of 2.22 corresponds to an age-adjusted scaled score of 7. Likewise, adjustment for the years of education is provided in Tables 4 and 5. If the patient has 6 years of formal education, the age and education adjusted score would be 8, but if the subject has 18 years of education, the scaled score would be 5.
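The lookup step of this worked example can be sketched as follows. The code is only an illustration: the function and variable names are hypothetical, the bands are transcribed from the 68–72 age column of Table 3, and the scaled-score labels follow the row order of that table (2 to 18), which agrees with the example in the text (a ratio of 2.22 at age 69 corresponds to a scaled score of 7).

```python
from typing import Optional

# Age column 68-72 of Table 3, (B + 10)/(A + 10) score:
# (scaled score, lowest ratio, highest ratio). Rows shown as a dash are omitted.
BANDS_68_72 = [
    (5, 2.48, 2.81), (6, 2.36, 2.47), (7, 2.21, 2.35), (8, 2.11, 2.20),
    (9, 2.00, 2.10), (10, 1.87, 1.99), (11, 1.61, 1.86), (12, 1.45, 1.60),
    (13, 1.22, 1.44), (14, 1.10, 1.21),
]
WORST_CUTOFF = 2.82  # ratio >= 2.82 -> scaled score 2
BEST_CUTOFF = 1.10   # ratio <  1.10 -> scaled score 18


def scaled_score_68_72(ratio: float) -> Optional[int]:
    """Age-adjusted scaled score for a subject aged 68-72 (higher is better)."""
    r = round(ratio, 2)  # the table is printed with two decimals
    if r >= WORST_CUTOFF:
        return 2
    if r < BEST_CUTOFF:
        return 18
    for scaled, low, high in BANDS_68_72:
        if low <= r <= high:
            return scaled
    return None  # value falls in a gap of the table


print(scaled_score_68_72(2.22))  # 69-year-old subject from the worked example
```

Note that larger ratios map to lower scaled scores, since a high B/A ratio reflects more suppression errors.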

Table 2. Age-adjusted scores for the Hayling B/A (time). Columns give the age range (years) to which each norm applies; the last three rows describe the overlapping normative samples.

| Scaled score | Percentile range | 20–27 | 28–32 | 33–37 | 38–42 | 43–47 | 48–52 | 53–57 | 58–62 | 63–67 | 68–72 | 73–77 | 78–82 | 83–90 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | <1 | >11.0 | >8.86 | >6.03 | >9.37 | >12.29 | >26.91 | >29.18 | >29.52 | >13.02 | >11.06 | >11.51 | >11.29 | >9.79 |
| 3 | 1 | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 4 | 2 | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 5 | 3–5 | — | — | — | — | — | 12.50–26.91 | 19.00–29.18 | 20.00–29.52 | 10.20–13.02 | 9.50–11.06 | 11.30–11.51 | — | — |
| 6 | 6–10 | 9.5–11.0 | 6.90–8.86 | — | — | 8.00–12.29 | 7.30–12.49 | 8.00–18.99 | 8.00–19.99 | 7.30–10.19 | 8.75–9.49 | 9.43–11.29 | 9.60–11.29 | — |
| 7 | 11–18 | 6.9–9.4 | 6.17–6.89 | 5.86–6.03 | 6.60–9.37 | 6.45–7.99 | 6.40–7.29 | 6.52–7.99 | 5.50–7.99 | 6.70–7.29 | 7.20–8.74 | 7.68–9.42 | 7.98–9.59 | 7.67–9.79 |
| 8 | 19–28 | 5.90–6.89 | 5.74–6.16 | 5.67–5.85 | 6.02–6.59 | 5.94–6.44 | 5.25–6.39 | 4.86–6.51 | 4.68–5.49 | 4.87–6.69 | 6.00–7.19 | 6.34–7.67 | 6.26–7.97 | 6.92–7.66 |
| 9 | 29–40 | 4.4–5.85 | 4.80–5.73 | 5.27–5.66 | 4.80–6.01 | 4.73–5.93 | 4.66–5.24 | 4.20–4.85 | 4.20–4.67 | 4.30–4.86 | 5.24–5.99 | 5.60–6.33 | 5.44–6.25 | 6.16–6.91 |
| 10 | 41–59 | 3.48–4.39 | 3.33–4.79 | 4.01–5.26 | 3.10–4.79 | 3.67–4.72 | 3.72–4.65 | 3.44–4.19 | 3.58–4.19 | 3.64–4.29 | 4.20–5.23 | 4.64–5.59 | 4.63–5.43 | 4.58–6.15 |
| 11 | 60–71 | 3.11–3.47 | 2.73–3.32 | 2.84–4.00 | 2.40–3.09 | 3.10–3.66 | 3.11–3.71 | 2.89–3.43 | 3.16–3.57 | 3.27–3.63 | 3.72–4.19 | 4.06–4.63 | 4.05–4.62 | 3.94–4.57 |
| 12 | 72–81 | 2.36–3.10 | 2.42–2.72 | 2.48–2.83 | 2.25–2.39 | 2.30–3.09 | 2.43–3.10 | 2.31–2.88 | 2.67–3.15 | 2.69–3.26 | 3.37–3.71 | 3.73–4.05 | 3.60–4.04 | 2.80–3.93 |
| 13 | 82–89 | 2.15–2.35 | 2.22–2.41 | 2.38–2.47 | 1.40–2.24 | 1.43–2.29 | 1.85–2.42 | 2.00–2.30 | 2.25–2.66 | 2.19–2.68 | 2.70–3.36 | 3.09–3.72 | 2.98–3.59 | 1.93–2.79 |
| 14 | 90–94 | <2.03–2.14 | 2.06–2.21 | 2.29–2.37 | 1.28–1.39 | 1.27–1.42 | 1.42–1.84 | 1.58–1.99 | 1.83–2.24 | 1.83–2.18 | 2.32–2.69 | 2.36–3.08 | 1.80–2.97 | 1.82–1.92 |
| 15 | 95–97 | — | 2.02–2.05 | — | — | 1.20–1.26 | 1.25–1.41 | 1.47–1.57 | 1.57–1.82 | 1.57–1.82 | 2.14–2.31 | 1.68–2.35 | 1.48–1.79 | — |
| 16 | 98 | — | — | — | — | — | 1.20–1.24 | 1.42–1.46 | 1.54–1.56 | 1.54–1.56 | 2.12–2.14 | 1.48–1.68 | — | — |
| 17 | 99 | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 18 | >99 | <2.02 | <2.02 | <2.29 | <1.28 | <1.20 | <1.20 | <1.42 | <1.54 | <1.54 | <2.12 | <1.48 | <1.48 | <1.82 |
| Age range | | 18–30 | 25–35 | 30–40 | 35–45 | 40–50 | 45–55 | 50–60 | 55–65 | 60–70 | 65–75 | 70–80 | 75–85 | 80–90 |
| Age midpoint | | 25 | 30 | 35 | 40 | 45 | 50 | 55 | 60 | 65 | 70 | 75 | 80 | 85 |
| Sample size | | 16 | 20 | 14 | 15 | 32 | 41 | 46 | 44 | 44 | 40 | 43 | 28 | 14 |

Table 3. Age-adjusted scores for the Hayling (B + 10)/(A + 10) (scoring). Columns give the age range (years) to which each norm applies; the last three rows describe the overlapping normative samples.

| Scaled score | Percentile range | 20–27 | 28–32 | 33–37 | 38–42 | 43–47 | 48–52 | 53–57 | 58–62 | 63–67 | 68–72 | 73–77 | 78–82 | 83–90 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | <1 | ≥2.50 | ≥2.49 | ≥2.15 | ≥2.30 | ≥2.34 | ≥2.38 | ≥2.36 | ≥2.30 | ≥2.74 | ≥2.82 | ≥2.78 | ≥2.72 | ≥2.8 |
| 3 | 1 | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 4 | 2 | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 5 | 3–5 | — | 2.45–2.48 | — | — | — | 2.26–2.37 | 2.30–2.35 | — | 2.36–2.73 | 2.48–2.81 | 2.51–2.77 | — | — |
| 6 | 6–10 | 2.33–2.49 | 2.25–2.45 | 2.05–2.14 | 2.25–2.29 | 2.21–2.33 | 2.00–2.25 | 2.21–2.29 | 2.23–2.29 | 2.30–2.35 | 2.36–2.47 | 2.42–2.50 | 2.45–2.71 | — |
| 7 | 11–18 | 2.19–2.32 | 2.02–2.24 | 2.00–2.04 | 2.01–2.24 | 1.71–2.20 | 1.80–1.99 | 2.00–2.20 | 2.10–2.22 | 2.21–2.29 | 2.21–2.35 | 2.21–2.41 | 2.33–2.44 | 2.53–2.79 |
| 8 | 19–28 | 1.87–2.18 | 1.80–2.01 | 1.58–1.99 | 1.80–2.00 | 1.56–1.70 | 1.61–1.79 | 1.80–1.99 | 1.81–2.09 | 1.93–2.20 | 2.11–2.20 | 2.11–2.20 | 2.17–2.32 | 2.41–2.52 |
| 9 | 29–40 | 1.56–1.82 | 1.50–1.79 | 1.50–1.57 | 1.60–1.79 | 1.47–1.55 | 1.60–1.51 | 1.63–1.79 | 1.61–1.80 | 1.61–1.92 | 2.00–2.10 | 2.00–2.10 | 2.11–2.16 | 2.30–2.40 |
| 10 | 41–59 | 1.50–1.55 | 1.36–1.49 | 1.42–1.49 | 1.30–1.59 | 1.31–1.46 | 1.40–1.50 | 1.40–1.62 | 1.35–1.60 | 1.31–1.60 | 1.87–1.99 | 1.91–1.99 | 1.99–2.10 | 2.11–2.29 |
| 11 | 60–71 | 1.30–1.49 | 1.21–1.35 | 1.31–1.41 | 1.16–1.29 | 1.21–1.30 | 1.29–1.39 | 1.30–1.39 | 1.21–1.34 | 1.26–1.30 | 1.61–1.86 | 1.71–1.90 | 1.91–1.98 | 2.03–2.10 |
| 12 | 72–81 | 1.12–1.29 | 1.10–1.20 | 1.28–1.30 | 1.09–1.15 | 1.11–1.20 | 1.20–1.28 | 1.11–1.29 | 1.11–1.20 | 1.11–1.25 | 1.45–1.60 | 1.51–1.70 | 1.64–1.90 | 1.88–2.02 |
| 13 | 82–89 | 1.10–1.11 | — | 1.16–1.27 | 1.00–1.08 | 1.00–1.10 | 1.00–1.19 | 1.00–1.10 | 1.00–1.10 | 1.00–1.10 | 1.22–1.44 | 1.36–1.50 | 1.40–1.63 | 1.60–1.87 |
| 14 | 90–94 | — | — | 1.10–1.15 | — | — | — | — | 0.95–0.99 | 0.95–0.99 | 1.10–1.21 | 1.15–1.35 | 1.13–1.39 | 1.20–1.59 |
| 15 | 95–97 | — | — | — | — | — | — | — | 0.92–0.94 | 0.92–0.94 | — | 1.10–1.14 | 1.00–1.12 | — |
| 16 | 98 | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 17 | 99 | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 18 | >99 | <1.10 | <1.10 | <1.10 | <1.00 | <1.00 | <1.00 | <1.00 | <0.92 | <0.92 | <1.10 | <1.10 | <1.00 | <1.20 |
| Age range | | 18–30 | 25–35 | 30–40 | 35–45 | 40–50 | 45–55 | 50–60 | 55–65 | 60–70 | 65–75 | 70–80 | 75–85 | 80–90 |
| Age midpoint | | 25 | 30 | 35 | 40 | 45 | 50 | 55 | 60 | 65 | 70 | 75 | 80 | 85 |
| Sample size | | 16 | 20 | 14 | 15 | 32 | 41 | 46 | 44 | 44 | 40 | 43 | 28 | 14 |

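Table 4. Adjustment for education for B/A (time). Rows: age-adjusted scaled score; columns: years of education.

| Scaled score | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 4 | 4 | 4 | 3 | 3 | 3 | 3 | 3 | 2 | 2 | 2 | 2 | 2 | 1 | 1 | 1 | 1 | 0 | 0 |
| 3 | 5 | 5 | 5 | 4 | 4 | 4 | 4 | 4 | 3 | 3 | 3 | 3 | 3 | 2 | 2 | 2 | 2 | 1 | 1 |
| 4 | 6 | 6 | 6 | 5 | 5 | 5 | 5 | 5 | 4 | 4 | 4 | 4 | 4 | 3 | 3 | 3 | 3 | 2 | 2 |
| 5 | 7 | 7 | 7 | 6 | 6 | 6 | 6 | 6 | 5 | 5 | 5 | 5 | 5 | 4 | 4 | 4 | 4 | 3 | 3 |
| 6 | 8 | 8 | 8 | 7 | 7 | 7 | 7 | 7 | 6 | 6 | 6 | 6 | 6 | 5 | 5 | 5 | 5 | 4 | 4 |
| 7 | 9 | 9 | 9 | 8 | 8 | 8 | 8 | 8 | 7 | 7 | 7 | 7 | 7 | 6 | 6 | 6 | 6 | 5 | 5 |
| 8 | 10 | 10 | 10 | 9 | 9 | 9 | 9 | 9 | 8 | 8 | 8 | 8 | 8 | 7 | 7 | 7 | 7 | 6 | 6 |
| 9 | 11 | 11 | 11 | 10 | 10 | 10 | 10 | 10 | 9 | 9 | 9 | 9 | 9 | 8 | 8 | 8 | 8 | 7 | 7 |
| 10 | 12 | 12 | 12 | 11 | 11 | 11 | 11 | 11 | 10 | 10 | 10 | 10 | 10 | 9 | 9 | 9 | 9 | 8 | 8 |
| 11 | 13 | 13 | 13 | 12 | 12 | 12 | 12 | 12 | 11 | 11 | 11 | 11 | 11 | 10 | 10 | 10 | 10 | 9 | 9 |
| 12 | 14 | 14 | 14 | 13 | 13 | 13 | 13 | 13 | 12 | 12 | 12 | 12 | 12 | 11 | 11 | 11 | 11 | 10 | 10 |
| 13 | 15 | 15 | 15 | 14 | 14 | 14 | 14 | 14 | 13 | 13 | 13 | 13 | 13 | 12 | 12 | 12 | 12 | 11 | 11 |
| 14 | 16 | 16 | 16 | 15 | 15 | 15 | 15 | 15 | 14 | 14 | 14 | 14 | 14 | 13 | 13 | 13 | 13 | 12 | 12 |
| 15 | 17 | 17 | 17 | 16 | 16 | 16 | 16 | 16 | 15 | 15 | 15 | 15 | 15 | 14 | 14 | 14 | 14 | 13 | 13 |
| 16 | 18 | 18 | 18 | 17 | 17 | 17 | 17 | 17 | 16 | 16 | 16 | 16 | 16 | 15 | 15 | 15 | 15 | 14 | 14 |
| 17 | 19 | 19 | 19 | 18 | 18 | 18 | 18 | 18 | 17 | 17 | 17 | 17 | 17 | 16 | 16 | 16 | 16 | 15 | 15 |
| 18 | 20 | 20 | 20 | 19 | 19 | 19 | 19 | 19 | 18 | 18 | 18 | 18 | 18 | 17 | 17 | 17 | 17 | 16 | 16 |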

Note: β = 0.201.

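Table 5. Adjustment for education for (B + 10)/(A + 10) (score). Rows: age-adjusted scaled score; columns: years of education.

| Scaled score | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 4 | 4 | 4 | 4 | 3 | 3 | 3 | 3 | 2 | 2 | 2 | 2 | 2 | 1 | 1 | 1 | 1 | 0 | 0 |
| 3 | 5 | 5 | 5 | 5 | 4 | 4 | 4 | 4 | 3 | 3 | 3 | 3 | 3 | 2 | 2 | 2 | 2 | 1 | 1 |
| 4 | 6 | 6 | 6 | 6 | 5 | 5 | 5 | 5 | 4 | 4 | 4 | 4 | 4 | 3 | 3 | 3 | 3 | 2 | 2 |
| 5 | 7 | 7 | 7 | 7 | 6 | 6 | 6 | 6 | 5 | 5 | 5 | 5 | 5 | 4 | 4 | 4 | 4 | 3 | 3 |
| 6 | 8 | 8 | 8 | 8 | 7 | 7 | 7 | 7 | 6 | 6 | 6 | 6 | 6 | 5 | 5 | 5 | 5 | 4 | 4 |
| 7 | 9 | 9 | 9 | 9 | 8 | 8 | 8 | 8 | 7 | 7 | 7 | 7 | 7 | 6 | 6 | 6 | 6 | 5 | 5 |
| 8 | 10 | 10 | 10 | 10 | 9 | 9 | 9 | 9 | 8 | 8 | 8 | 8 | 8 | 7 | 7 | 7 | 7 | 6 | 6 |
| 9 | 11 | 11 | 11 | 11 | 10 | 10 | 10 | 10 | 9 | 9 | 9 | 9 | 9 | 8 | 8 | 8 | 8 | 7 | 7 |
| 10 | 12 | 12 | 12 | 12 | 11 | 11 | 11 | 11 | 10 | 10 | 10 | 10 | 10 | 9 | 9 | 9 | 9 | 8 | 8 |
| 11 | 13 | 13 | 13 | 13 | 12 | 12 | 12 | 12 | 11 | 11 | 11 | 11 | 11 | 10 | 10 | 10 | 10 | 9 | 9 |
| 12 | 14 | 14 | 14 | 14 | 13 | 13 | 13 | 13 | 12 | 12 | 12 | 12 | 12 | 11 | 11 | 11 | 11 | 10 | 10 |
| 13 | 15 | 15 | 15 | 15 | 14 | 14 | 14 | 14 | 13 | 13 | 13 | 13 | 13 | 12 | 12 | 12 | 12 | 11 | 11 |
| 14 | 16 | 16 | 16 | 16 | 15 | 15 | 15 | 15 | 14 | 14 | 14 | 14 | 14 | 13 | 13 | 13 | 13 | 12 | 12 |
| 15 | 17 | 17 | 17 | 17 | 16 | 16 | 16 | 16 | 15 | 15 | 15 | 15 | 15 | 14 | 14 | 14 | 14 | 13 | 13 |
| 16 | 18 | 18 | 18 | 18 | 17 | 17 | 17 | 17 | 16 | 16 | 16 | 16 | 16 | 15 | 15 | 15 | 15 | 14 | 14 |
| 17 | 19 | 19 | 19 | 19 | 18 | 18 | 18 | 18 | 17 | 17 | 17 | 17 | 17 | 16 | 16 | 16 | 16 | 15 | 15 |
| 18 | 20 | 20 | 20 | 20 | 19 | 19 | 19 | 19 | 18 | 18 | 18 | 18 | 18 | 17 | 17 | 17 | 17 | 16 | 16 |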

Note: β = 0.241.

## Discussion

In this study, we present a version of the Hayling Sentence Completion Test adapted to the Spanish population. The prior selection of the 30 sentences with the most homogeneous responses ensures that all items included in the test are associated with a highly automatic response. The test showed good internal consistency and good inter-rater reliability.

Age had an effect on the different scores and times of the test. Older age explained 10.8% of the variance of part A (time), 19.1% of the variance of part B (time), and 13.9% of the variance of part B (scoring). These findings are consistent with previous research regarding the performance of the Hayling test in healthy controls (Bielak et al., 2006; Tournier et al., 2014). In contrast, neither age nor educational level influenced part A (scoring), which supports the view that the sentences included elicit automatic responses well known to all age and education groups.

Regarding the influence of educational level, there was a negative correlation between years of education and part B (for both score and latency times). In this regard, 15.7% of the variance in part B (time) and 21.5% in part B (score) was explained by the educational level (Table 1). Our results contrast with the findings of Bielak et al. in Canada, where education was only weakly associated with part A (time), but not with other parts of the test. However, the study of Bielak et al. only included participants between 53 and 90 years, and the range of education was narrower than in our sample.

Interestingly, age showed a positive correlation with score B/A, but not with time B/A. Working memory plays an important role in the production of an inhibitory response, entailing lower response times and better scores in the Hayling task (Stenbäck, Hällgren, Lyxell, & Larsby, 2015). As age increases, working memory and attention abilities decrease (Gazzaley, Cooney, Rissman, & D'Esposito, 2005; Prakash et al., 2009). However, our results for score B/A suggest a real decline in inhibition (verbal suppression) with age that is not explained solely by a general slowing of responses. This supports the observation of a decline in executive function with aging (Brennan, Welsh, & Fisher, 1997; Keys & White, 2000; Wecker, Kramer, Wisniewski, Delis, & Kaplan, 2000; Wecker, Kramer, Hallam, & Delis, 2005).

Overall, our results reaffirm the need for normative data in neuropsychological assessment and, in particular, in the interpretation of the Hayling test (Bielak et al., 2006; Matías-Guiu et al., 2015; Tournier et al., 2014). In contrast, gender did not seem to significantly affect test performance (Bielak et al., 2006).

The study has some limitations. Firstly, the Hayling test is based on several sentences with a highly automatic response. Knowledge of the last word may differ in Spanish-speaking populations from other countries, so specific studies are necessary to obtain normative data in those settings. Furthermore, we used years of formal schooling to estimate the level of education, as has been done in other normative studies in our country (Peña-Casanova et al., 2009). However, some research suggests that other measures, such as reading level, may be a better proxy for the level of education (Sayegh, Arentoft, Thaler, Dean, & Thames, 2014).

In conclusion, our study provides normative data for a new version of the Hayling Sentence Completion Test for the Spanish population. The influence of demographic factors, especially age and education, is confirmed, and we suggest the possibility of using the Hayling test for both clinical and research purposes. Further studies are necessary to corroborate the usefulness of this test in specific patient populations and to study its neural basis.

## Supplementary material

Supplementary material is available at Archives of Clinical Neuropsychology online.

## References

Abrahams, S., Newton, J., Niven, E., Foley, J., & Bak, T. H. (2014). Screening for cognition and behavior changes in ALS. Amyotrophic Lateral Sclerosis & Frontotemporal Degeneration, 15, 9–14.

Bellevile, S., Rouleau, N., & Van der Linden, M. (2006). Use of the Hayling task to measure inhibition of prepotent responses in normal aging and Alzheimer's disease. Brain and Cognition, 62, 113–119.

Bielak, A. A., Mansueti, L., Strauss, E., & Dixon, R. A. (2006). Performance on the Hayling and Brixton tests in older adults: Norms and correlates. Archives of Clinical Neuropsychology, 21, 141–149.

Blesa, R., Pujol, M., Aguilar, M., Santacruz, P., Bertran-Serra, I., Hernández, G., et al.; NORMACODEM Group. (2001). Clinical validity of the “mini-mental state” for Spanish speaking communities. Neuropsychologia, 39, 1150–1157.

Bouquet, C. A., Bonnaud, V., & Gil, R. (2003). Investigation of supervisory attentional system functions in patients with Parkinson's disease using the Hayling task. Journal of Clinical and Experimental Neuropsychology, 25, 751–760.

Brennan, M., Welsh, M. C., & Fisher, C. B. (1997). Aging and executive function skills: An examination of a community-dwelling older adult population. Perceptual and Motor Skills, 84, 1187–1197.

Burgess, P. W., & Shallice, T. (1996). Response suppression, initiation and strategy use following frontal lobe lesions. Neuropsychologia, 34, 263–272.

Chan, K. K. S., Xu, J. Q., Liu, K. C. M., Hui, C. L. M., Wong, G. H. Y., & Chen, E. Y. H. (2012). Executive function in first-episode schizophrenia: A three-year prospective study of the Hayling Sentence Completion Test. Schizophrenia Research, 135, 62–67.

Collette, F., Van der Linden, M., Delfiore, G., Degueldre, C., Luxen, A., & Salmon, E. (2001). The functional anatomy of inhibition processes investigated with the Hayling task. Neuroimage, 14, 258–267.

Eddy, C. M., Rizzo, R., & Cavanna, A. E. (2009). Neuropsychological aspects of Tourette syndrome: A review. Journal of Psychosomatic Research, 67, 503–513.

Egner, T., & Hirsch, J. (2005). The neural correlates and functional integration of cognitive control in a Stroop task. Neuroimage
,
24
,
539
547
.
Fernández-Matarrubia
M.
,
Matías-Guiu
J. A.
,
Moreno-Ramos
T.
,
Matías-Guiu
J.
(
2014
).
Behavioral variant frontotemporal dementia: Clinical and therapeutic approaches
.
Neurologia
,
29
,
464
472
.
Gazzaley
A.
,
Cooney
J. W.
,
Rissman
J.
,
D'Esposito
M.
(
2005
).
Top-down suppression deficit underlies working memory impairment in normal aging
.
Nature Neuroscience
,
8
,
1298
1300
.
Hagen
K.
,
Ehlis
A. C.
,
Haeussinger
F. B.
,
Heinzel
S.
,
Dresler
T.
,
Mueller
L. D.
et al
. (
2014
).
Activation during the Trail Making Test measured with functional near-infrared spectroscopy in healthy elderly subjects
.
Neuroimage
,
85
,
583
591
.
Hornberger
M.
,
Bertoux
M.
(
2015
).
Right lateral prefrontal cortex—specificity for inhibition or strategy use?
Brain
,
138
,
833
835
.
Hornberger
M.
,
Geng
J.
,
Hodges
J. R.
(
2011
).
Convergent grey and white matter evidence of orbitofrontal cortex changes related to disinhibition in behavioral variant frontotemporal dementia
.
Brain
,
134
,
2502
2512
.
Hornberger
M.
,
Savage
S.
,
Hsieh
S.
,
Mioshi
E.
,
Piguet
O.
,
Hodges
J. R.
(
2010
).
Orbitofrontal dysfunction discriminates behavioral variant frontotemporal dementia from Alzheimer's disease
.
Dementia and Geriatric Cognitive Disorders
,
30
,
547
552
.
Keys
B. A.
,
White
D. A.
(
2000
).
Exploring the relationship between age, executive abilities, and psychomotor speed
.
Journal of the International Neuropsychological Society
,
6
,
76
82
.
Lazeron
R. H.
,
Rombouts
S. A.
,
Machielsen
W. C.
,
Scheltens
P.
,
Witter
M. P.
,
Uylings
H. B.
et al
. (
2000
).
Visualizing brain activation during planning: The tower of London test adapted for functional MR imaging
.
,
21
,
1407
1414
.
Lezak
M. D.
,
Howieson
D. B.
,
Bigler
E. D.
,
Tranel
D.
(
2012
).
Neuropsychological assessment
(
5th ed.
).
New York
:
Oxford University Press
.
Matías-Guiu
J. A.
,
R.
,
Escudero
G.
,
Pérez-Pérez
J.
,
Cortés
A.
,
Morenas-Rodríguez
E.
et al
. (
2015
).
Validation of the Spanish version of Addenbrooke's Cognitive Examination III for diagnosing dementia
.
Neurologia
,
30
,
545
551
.
Morris
J. C.
(
1993
).
The CDR: Current version and scoring rules
.
Neurology
,
43
,
2412
2413
.
Müller
L. D.
,
Guhn
A.
,
Zeller
J. B.
,
Biehl
S. C.
,
Dresler
T.
,
Hahn
T.
et al
. (
2014
).
Neural correlates of a standardized version of the trail making test in young and elderly adults: A functional near-infrared spectroscopy study
.
Neuropsychologia
,
56
,
271
279
.
Pauker
J. D.
(
1988
).
Constructing overlapping cell tables to maximize the clinical usefulness of normative test data: Rationale and an example from neuropsychology
.
Journal of Clinical Psychology
,
44
,
930
933
.
Peña-Casanova
J.
,
Blesa
R.
,
Aguilar
M.
,
Gramunt-Fombuena
N.
,
Gómez-Ansón
B.
,
Oliva
R.
et al
. (
2009
).
Spanish Multicenter Normative Studies (NEURONORMA Project): Methods and sample characteristics
.
Archives of Clinical Neuropsychology
,
24
,
307
319
.
Pfeffer
R. I.
,
Kurosaki
T. T.
,
Harrah
C. H.
,
Chance
J. M.
,
Filos
S.
(
1982
).
Measurement of functional activities in older adults in the community
.
Journal of Gerontology
,
3
,
323
329
.
Prakash
R. S.
,
Erickson
K. I.
,
Colcombe
S. J.
,
Kim
J. S.
,
Voss
M. W.
,
Kramer
A. F.
(
2009
).
Age-related differences in the involvement of the prefrontal cortex in attentional control
.
Brain and Cognition
,
71
,
328
335
.
Robinson
G. A.
,
Cipolotti
L.
,
Walker
D. G.
,
Biggs
V.
,
Bozzali
M.
,
Shallice
T.
(
2015
).
Verbal suppression and strategy use: A role for the right lateral prefrotal cortex?
Brain
,
138
,
1084
1096
.
Sayegh
P.
,
Arentoft
A.
,
Thaler
N. S.
,
Dean
A. C.
,
Thames
A. D.
(
2014
).
Quality of education predicts performance on the Wide Range Achievement Test-4th Edition Word Reading subtest
.
Archives of Clinical Neuropsychology
,
29
,
731
736
.
Stenbäck
V.
,
Hällgren
M.
,
Lyxell
B.
,
Larsby
B.
(
2015
).
The Swedish Hayling task, and its relation to working memory, verbal ability, and speech-recognition-in-noise
.
Scandinavian Journal of Psychology
,
56
,
264
272
.
Torralva
T.
,
Roca
M.
,
Gleichgerrcht
E.
,
Lopez
P.
,
Manes
F.
(
2009
).
INECO Frontal Screening (IFS): A brief, sensitive, and specific tool to assess executive function in dementia
.
Journal of the International Neuropsychological Society
,
15
,
777
786
.
Tournier
I.
,
Postal
V.
,
Mathey
S.
(
2014
).
.
Archives of Gerontology and Geriatrics
,
59
,
599
606
.
Volle
E.
,
de Lacy Costello
A.
,
Coates
L. M.
,
McGuire
C.
,
Towgood
K.
,
Gilbert
S.
et al
. (
2012
).
Dissociation between verbal response initiation and suppression after prefrontal lesions
.
Cerebral Cortex
,
22
,
2428
2440
.
Wecker
N. S.
,
Kramer
J. H.
,
Hallam
B. J.
,
Delis
D. C.
(
2005
).
Mental flexibility: Age effects on switching
.
Neuropsychology
,
19
,
345
352
.
Wecker
N. S.
,
Kramer
J. H.
,
Wisniewski
A.
,
Delis
D. C.
,
Kaplan
E.
(
2000
).
Age effects on executive ability
.
Neuropsychology
,
14
,
409
414
.
Yuan
P.
,
Raz
N.
(
2014
).
Prefrontal cortex and executive functions in healthy adults: A meta-analysis of structural neuroimaging studies
.
Neuroscience and Biobehavioral Reviews
,
42
,
180
192
.