Abstract

The International Shopping List Test (ISLT) was developed specifically to assess verbal list learning and memory in people from different language and cultural backgrounds. In this paper, we describe three studies that examined the sensitivity and reliability of the ISLT in assessing verbal list learning and memory impairment in English-speaking people with mild Alzheimer's disease (AD) and evaluated whether measures of retention-weighted recall (RWR) provided greater sensitivity and/or reliability relative to conventional list learning performance measures (e.g., free recall). In Study 1, we compared ISLT performance between patients with AD and matched controls and found that AD patients showed a large magnitude impairment on all ISLT performance measures (Cohen's d values >2). The RWR measure was more sensitive to detecting AD-related impairment than the free recall measure for Trial 1, although the most sensitive measures of the ISLT were free recall from Trial 3 and delayed recall. In Study 2, we compared RWR and free recall measures between 10- and 12-word versions of the ISLT, but found no difference between performance measures for the different list lengths. In Study 3, we evaluated test–retest reliabilities of the different outcome measures derived from the ISLT and found that measures of free recall had higher reliabilities than the RWR measures. Taken together, results of these studies suggest that measures of total free recall during learning trials and delayed recall from the 12-word version of the ISLT provide the greatest sensitivity to detecting verbal list learning and memory impairment in AD and that this task shows good test–retest reliability.

Introduction

Alzheimer's disease (AD) is the most prevalent form of dementia worldwide (Alzheimer Association, 2007). In the initial or mild severity stage, clinically classified AD is often characterized by substantial (i.e., performance 2 SD below that of matched controls) impairment in memory accompanied by impairment in language, visuospatial abilities, executive function, and praxis, as well as impaired activities of daily living (Stopford, Snowden, Thompson, & Neary, 2008). As the disease progresses into the moderate stage of severity, cognitive impairment increases in severity and is often associated with reduced functioning and adaptive behavior (e.g., impairment of activities of daily living). Neuropsychiatric symptoms are also prominent and include significant mood changes, agitation, and psychosis (Cummings, 2004). Neurobiological studies have shown that the memory impairments that characterize AD reflect pathophysiological changes to the hippocampus and medial temporal lobe (Braak, Braak, & Bohl, 1993; Pike et al., 2007).

There is strong evidence suggesting that a reduction in cognitive function, particularly memory, emerges long before AD is diagnosed clinically (Backman, Jones, Berger, & Laukka, 2005; Knopman et al., 2003; Sperling et al., 2010; Twamley, Ropacki, & Bondi, 2006). Because deficits in episodic memory are the core feature of memory impairment in AD (Fischer et al., 2007; Fleisher et al., 2007; Palmer, Backman, Winbland, & Fratiglioni, 2008), tests of episodic memory reliably differentiate cognitively normal individuals from those with AD, even during the early stages of the disorder (Backman et al., 2005; Maruff, Collie, Darby, Weaver-Cargin, & Masters, 2004). One episodic memory paradigm shown to be highly sensitive to memory impairment in AD is verbal list learning (Gavett et al., 2009; Jungwirth et al., 2009). On tests of verbal list learning, individuals are instructed to remember a list of words (typically between 10 and 16 words) that are presented one at a time, often over successive trials (Lezak, Howieson, & Loring, 2004). Various measures may then be derived from these tests to examine cognitive processes such as verbal list acquisition (learning) and retention (delayed recall).

An important limitation of the majority of verbal list learning tests is that they have been developed in Western European and North American cultures with the words selected on the basis of specific linguistic criteria, such as semantic category, familiarity, frequency of use, length, and imaginability (Brandt & Benedict, 2001; Delis, Kramer, Kaplan, & Ober, 1987, 2000). These criteria make it difficult to translate word lists for use in countries that do not use the same linguistic conventions as those in Western Europe and North America (Ardila, 2005; Greenfield, 1997; Lim et al., 2009). Recently, we developed the International Shopping List Test (ISLT), a verbal list learning test that allows construction of equivalent versions designed specifically for different language and cultural groups (Lim et al., 2009). The ISLT retained the word list presentation format of other commonly used verbal learning and memory tests by asking examinees to remember a list of items that they could obtain from their local shops (markets/stores/bazaar, but hereafter termed “stores”). The word stimuli for the ISLT consist only of items that had been rated by different individuals from the intended cultural or language group as being “very easy” or “easy” to obtain in local stores. Hence, across cultural/language groups, the stimuli were equivalent with respect to their concreteness, frequency, imaginability, and cultural relevance (Lim et al., 2009). We found that in young adults, performance on Chinese, French, Malay, and English versions of the ISLT was equivalent when expressed as measures of total free recall, trial-by-trial free recall, delayed recall, primacy and recency effects, and speed with which words were recalled (Lim et al., 2009). The ISLT was then used to evaluate verbal memory in individuals with schizophrenia who spoke American English (Pietrzak et al., 2009). Results of this study revealed that, relative to healthy controls, the magnitude of memory impairment in individuals with schizophrenia was equivalent when measured using the ISLT and the Hopkins Verbal Learning Test-Revised (HVLT-R; Brandt & Benedict, 2001). Results of this study also demonstrated that scores on the ISLT and HVLT-R were highly correlated (r > .8) in both healthy adults and in people with schizophrenia.

A large body of literature has examined the nature of verbal memory impairment in AD using verbal list learning tests such as the Rey Auditory Verbal Learning Test (RAVLT; Rey, 1964); California Verbal Learning Test (CVLT-II; Delis et al., 1987, 2000), and the HVLT-R (Brandt & Benedict, 2001). These studies have shown that, compared with healthy age-matched controls, verbal memory impairment in AD presents as a pronounced reduction in the number of words recalled (Gavett et al., 2009; Greenway et al., 2006; Hogervorst et al., 2002; Kulansky et al., 2004); slower rates of learning across trials (Greenway et al., 2006); increased rates of forgetting (Antonelli Incalzi, Capparella, Gema, Marra, & Carbonin, 1995; Estevez-Gonzalez, Kulisevsky, Boltes, Otermin, & Garcia-Sanchez, 2003; Gavett et al., 2009; Greenway et al., 2006; Jungwirth et al., 2009); and reduced primacy but relatively intact recency scores (Bayley et al., 2000; Buschke et al., 2006; Gainotti & Marra, 1994; Gainotti, Monteleone, Parlato, & Carlomagno, 1989; Spinnler & Della Sala, 1988).

In studies of memory in AD, performance measures from verbal list learning tests have generally been scored and analyzed separately. However, a recent study by Buschke and colleagues (2006) found that combining indices of primacy with total recall increased the sensitivity of a verbal list learning test (e.g., the memory test from the Telephone Interview for Cognitive Status (TICS); Brandt, Spencer, & Folstein, 1988) to AD-related memory impairment when compared with total recall scores alone. In this study, the retention-weighted recall (RWR) score was computed by assigning words recalled from the beginning of the list a greater weight in the total score than those recalled from the end of the list (Buschke et al., 2006). This greater sensitivity of the RWR score was inferred to have occurred because it amplified the contribution of the selective reduction in primacy that occurs in AD to the memory performance measure. However, given that patients with AD generally perform poorly on verbal list learning tests, their scores tend to be uniformly low and there is decreased variability between individuals. This is especially the case for the TICS memory test, which consists of a single trial of 10 words. The increased sensitivity of the RWR might also have occurred because the weighting procedure improved the metric properties of the outcome score for the TICS memory test by increasing the range of possible values for this score. This would, in turn, likely increase the variability between performance scores of different patients, which might be scored equivalently on a scale with a smaller range. Given this possibility, this finding deserves replication, especially when the rationale and method for computation can be applied to performance on any verbal list learning test.

While the use of the shopping list format in the ISLT has allowed the construction of linguistically and culturally equivalent versions, it also gives rise to one important difference between the ISLT and other verbal list learning tests. Specifically, the shopping list format restricts the items on each list to a single semantic category (i.e., common shopping items). Thus, the words used on all language versions of the ISLT are highly interrelated. In contrast, tests like the CVLT-II, HVLT-R, RAVLT, and TICS memory test deliberately use sets of unrelated words (e.g., RAVLT), or words (typically three or four) grouped within multiple (typically three or four) semantic categories (e.g., CVLT-II, HVLT-R). Furthermore, the semantic categories themselves are selected to be different from one another. Studies of verbal list learning have shown that successful recall is related to the use of semantic clustering techniques and that individuals with poor recall often rely on serial recall techniques (Delis, Freeland, Kramer, & Kaplan, 1988). The use of a semantic organizing strategy during recall is assumed to indicate that the same semantic organization occurred when the words were learned. Incorporating to-be-learned words into existing semantic networks allows for greater depth of processing than when the information that is learned is relevant only to the test at hand (e.g., Craik & Tulving, 1975). Thus, the use of a single semantic category on the ISLT may actually make word recall easier than on verbal list learning tests such as the RAVLT, CVLT-II, and HVLT-R, all of which deliberately limit the relatedness of their respective stimulus words. Consequently, the use of a single semantic category for stimulus words on the ISLT may result in it having lower sensitivity to detecting verbal memory impairment in AD, and decrements in ISLT performance may be qualitatively different from those seen on other verbal list learning tests.

The aim of the set of studies described in this report was to examine the nature and magnitude of impairment in performance on the English version of the ISLT in individuals with clinically diagnosed AD compared with matched controls.

In Study 1, we compared the ISLT performance in these groups using ISLT measures of free recall, rate of learning, rate of forgetting, primacy, recency, and RWR. The ability of these different performance measures to identify individual cases of memory impairment was also compared by estimating the sensitivity of each while controlling specificity at over 90%. The hypothesis for Study 1 was that patients with AD would show a large general impairment across all ISLT performance measures. In Study 2, we evaluated whether the sensitivity of the ISLT depended on the number of words used in the list, by comparing performance on 10- and 12-word versions of the ISLT. The hypothesis for Study 2 was that the sensitivity of the ISLT to memory impairment in AD would be different for the 10- and 12-word list versions of the ISLT. Finally, in Study 3, we evaluated the test–retest reliability of the ISLT performance measures over a 1-month period in patients with AD. The hypothesis for Study 3 was that ISLT performance measures that showed good sensitivity to detecting verbal learning and memory impairment in AD in Study 1 would show high test–retest reliability in AD.

Study 1

Method

Participants

The control group consisted of 156 English-speaking adults aged 60 or older who were recruited from an ongoing study of healthy cognitive aging in Melbourne, Australia (Darby et al., in press; Fredrickson et al., 2010; Pietrzak et al., in press). All of these individuals had been recruited through radio advertisements and had been assessed on the Detection (simple reaction time), Identification (choice reaction time), One-Card Learning (visual learning), and One-Back (visual working memory) tasks from the CogState brief battery on five occasions over 24 months. All performance data on the CogState battery were reviewed by a board certified behavioral neurologist (DD). Individuals who did not show any evidence of decline in performance on the test battery over the 24-month period were then reviewed and assessed in person by the same neurologist. Individuals without memory complaints; without evidence of any cognitive, neurologic, or psychiatric symptoms; and who met the inclusion/exclusion criteria described below were recruited into the current study. Consequently, all of the control participants recruited to the current study were free of memory impairment and did not have any memory complaints. None of these individuals had performed the ISLT previously, and the ISLT was not administered as part of the healthy cognitive aging study.

A total of 27 English-speaking individuals with mild AD (hereafter referred to as “AD”) were recruited from ongoing studies in AD clinics in major metropolitan hospitals in Melbourne, Australia (e.g., Pike et al., 2007; Rowe et al., 2007). Although diagnoses were rendered by board certified neurologists or geriatricians working at these hospitals, all met uniform inclusion/exclusion criteria. Specific inclusion criteria were as follows: meet diagnostic criteria for probable AD as defined by the National Institute of Neurological and Communicative Disorders and Stroke–Alzheimer's Disease and Related Disorders Association (NINCDS-ADRDA; McKhann et al., 1984); MRI brain scan supporting the clinical diagnosis of AD; score of 18–26 on the Mini-Mental State Examination (MMSE); Rosen-Modified Hachinski Ischemic score of <4; Clinical Dementia Rating (CDR) scale sum of boxes score of 1; and taking cholinesterase inhibitor(s) at the time of assessment and had been doing so for 6 months or longer. The neuropsychological test battery used to inform the diagnosis of AD in each patient was determined by clinical neuropsychologists at the different hospitals. Data from these assessments were not available for use in the current study. Thus, for all patients, the diagnosis of AD had been made at least 6 months prior to enrolment in the current study. Exclusion criteria were any one or more of the following: a neurological disease other than AD that might affect cognition; a major psychiatric disorder, systemic illness, or symptoms that could affect the patient's ability to complete the study; a score on the Geriatric Depression Scale ≥10; or use of an anticonvulsant, antiparkinsonian, anticoagulant, narcotic, or immunosuppressive medications within 3 months prior to assessment. None of the individuals in this group had performed the ISLT in the past. Demographic and clinical characteristics of the control and AD groups are shown in Table 1.

Table 1.

Participant characteristics

| Measure | Control group | AD group for Study 1 | AD group for Study 2 | AD group for Study 3ᵃ |
|---|---|---|---|---|
| n | 156 | 27 | 30 | 50 |
| Age | 73.1 (9.5) | 73.5 (5.9) | 70.8 (8.7) | 69.1 (9.6) |
| Gender (% men) | 53.4 | 58.1 | 55.6 | 65.1 |
| Years of education | 12.1 (2.9) | 11.7 (3.2) | 11.4 (3.7) | 10.9 (3.7) |
| MMSE | 29.1 (.9) | 22.3 (3.3) | 21.8 (3.8) | 22.1 (5.9) |
| CDR-SOB | | 4.9 (2.3) | 4.7 (2.5) | 4.2 (3.1) |
| NART IQ | 107.2 (12.3) | 108.9 (11.3) | 109.1 (10.2) | 101.3 (10.4) |
| Depression score (GDS) | 1.5 (2.0) | 1.7 (2.1) | 1.8 (2.4) | 2.1 (1.9) |

Notes: Data are shown as group mean (±SD). AD = Alzheimer's disease; MMSE = Mini-Mental State Examination; CDR-SOB = Clinical Dementia Rating scale sum of boxes; NART IQ = National Adult Reading Test estimated IQ; GDS = Geriatric Depression Scale.

ᵃ Included all 27 individuals with AD from Study 1 and 23 individuals with AD from Study 2.

Measures

The ISLT is a 3-trial, 12-word list learning test of verbal memory. The items on the shopping list are adapted so that they are relevant to different cultural or language groups (Lim et al., 2009). The presentation of words and recording of responses is controlled by a computer. The computer software presents the words to the experimenter, and the experimenter reads these to the participant as soon as they are displayed on the screen. The participant is not shown the computer screen or the words. Items for each administration of the ISLT are selected randomly from a pool of 128 words and are presented in the same order across the three trials. Stimuli are presented to the test administrator at a rate of one word per 2 s. At the completion of each learning trial, participants are asked to recall as many words as they can remember. As participants recall words from the list, these words are marked by the test administrator on the computer screen. Any words offered that were not read as part of the list are also noted (i.e., intrusions), as are the number of words repeated more than once (i.e., repetitions). A delayed recall trial is also administered in which, after a delay of 10–15 min, participants are asked to recall as many of the words from the list as they can remember. During the delay period, other measures such as the MMSE and National Adult Reading Test (NART) were administered. ISLT outcome measures and their computations are summarized in Table 2.

Table 2.

ISLT performance measures used in the current study

| ISLT performance measure | Definition | Scale |
|---|---|---|
| Free recall | Number of words recalled correctly on each of the three trials and the delayed recall trial | Possible values range from 0 to 12 |
| Total free recall | Sum of words recalled on Trials 1–3 | Possible values range from 0 to 36 |
| Retention-weighted recall | Sum of retention-weighted recall scores (e.g., List Length − Presentation Position + 1) | Possible values range from 1 to 78 |
| Delayed recall score | Number of words recalled after a delay | Possible values range from 0 to 12 |
| Primacy | Sum of words recalled from the first four positions on the list on each of the trials | Possible values range from 0 to 12 |
| Recency | Sum of words recalled from the last four positions on the list on each of the trials | Possible values range from 0 to 12 |
| Intrusion | Number of non-list words recalled on Trials 1–3 | Possible values are ≥0 |
| Repetition | Number of times words are recalled more than once on Trials 1–3 | Possible values are ≥0 |

Note: ISLT = International Shopping List Test.
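The per-trial measures defined in Table 2 can be illustrated with a minimal Python sketch. It is not the ISLT software: the function names, the example words, and the handling of edge cases (for instance, counting each repeated utterance as one repetition and counting a repeated non-list word only once as an intrusion) are assumptions made for illustration. Primacy and recency are computed here for a single trial; Table 2 sums them over the three learning trials.

```python
from typing import Dict, Iterable, Sequence


def position_weight(list_length: int, position: int) -> int:
    """Weight for a 1-based presentation position: List Length - Position + 1."""
    return list_length - position + 1


def score_trial(presented: Sequence[str], recalled: Iterable[str]) -> Dict[str, int]:
    """Score one list-learning trial (illustrative, not the published scoring code)."""
    n = len(presented)
    position = {word: i + 1 for i, word in enumerate(presented)}
    seen = set()
    correct = set()
    intrusions = 0
    repetitions = 0
    for word in recalled:
        if word in seen:
            # Assumption: every repeated utterance counts as one repetition.
            repetitions += 1
            continue
        seen.add(word)
        if word in position:
            correct.add(word)
        else:
            intrusions += 1
    return {
        "free_recall": len(correct),
        # Early list positions carry larger weights than late positions.
        "rwr": sum(position_weight(n, position[w]) for w in correct),
        "primacy": sum(1 for w in correct if position[w] <= 4),
        "recency": sum(1 for w in correct if position[w] > n - 4),
        "intrusions": intrusions,
        "repetitions": repetitions,
    }


if __name__ == "__main__":
    # Hypothetical 12-item list; these are not actual ISLT stimuli.
    shopping_list = ["milk", "bread", "rice", "eggs", "soap", "tea",
                     "salt", "sugar", "apples", "butter", "onions", "flour"]
    print(score_trial(shopping_list, ["milk", "flour", "milk", "cheese"]))
```

In this sketch, recalling all 12 words yields the maximum retention-weighted score of 78, and a trial with no correct words scores 0.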

Procedure

For the AD group, the ISLT was administered as part of the workup for inclusion into the current study. Data from the ISLT did not inform the diagnosis of AD, as this had been made as part of the patients' clinical examinations at least 6 months prior to their being administered the ISLT in the current study. The ISLT, MMSE, and NART were administered to all study participants in a quiet room in a medical clinic or office by an experienced psychometrist who had undergone training on administration of the ISLT by a board registered clinical neuropsychologist.

Data Analysis

ISLT performance in AD

Serial position curves were constructed for the free recall data for controls. The free recall and RWR scores for each trial were compared between groups by submitting these data to a series of group × trial analyses of variance (ANOVAs). The magnitude of differences between the groups for the measures of recall and RWR were expressed as a measure of effect size (Cohen's d) for each trial. Cohen's d is the difference between group means expressed as a proportion of their pooled standard deviation, and the computation of the pooled standard deviation takes into account the standard deviation and sample size from each group (Cohen, 1988). To compare the performance between groups, the recall score for each subject for each trial was submitted to a group × trial multivariate profile analysis (Tabachnick & Fidell, 2001). When the analysis indicated that the test for parallelism was significant, the segments reflecting difference in performance from Trial 1 to Trial 2, Trial 2 to Trial 3, and Trial 3 to delayed recall were compared using a series of independent t-tests. The total recall score (i.e., sum of recall on each trial), primacy scores and recency scores were compared between groups using an independent t-test. Because data for the measure of total intrusions and repetitions did not meet the assumptions for parametric analysis, median performance was compared between groups using a series of Mann–Whitney U-tests (Table 3). In order to minimize the risk of Type I error, the criterion for statistical significance for all analyses was set to .01 and only statistically significant differences or associations with effect sizes greater than those considered as small in magnitude (e.g., d ≥ 0.20; Cohen, 1988) were interpreted.
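As an illustration of the effect size described above, the following Python sketch computes Cohen's d from group summary statistics using a pooled standard deviation weighted by each group's degrees of freedom. The function names are hypothetical, and the exact pooling formula used in the study is assumed rather than confirmed by the text.

```python
import math


def pooled_sd(sd1: float, n1: int, sd2: float, n2: int) -> float:
    """Pooled SD, weighting each group's variance by its degrees of freedom."""
    return math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def cohens_d(mean1: float, sd1: float, n1: int,
             mean2: float, sd2: float, n2: int) -> float:
    """Difference between group means in units of the pooled SD."""
    return (mean1 - mean2) / pooled_sd(sd1, n1, sd2, n2)


if __name__ == "__main__":
    # Trial 1 free recall summary statistics as reported in Table 3
    # (controls: n = 156; AD: n = 27).
    d = cohens_d(6.8, 1.7, 156, 3.4, 2.3, 27)
    print(round(d, 2))
```

Applied to the rounded Trial 1 free recall values, this formula gives a d of roughly 1.9, close to the tabled value of 1.8; small discrepancies are expected when working from rounded summary statistics.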

Table 3.

Group mean or median performance, criterion value for sensitivity, and percent of patients with AD with abnormal scores for each of the outcome measures from the International Shopping List Test

| Measure | Trial | Control (mean [SD]) | AD (mean [SD]) | t | d | 10th percentile cut-score from controls | Number of AD patients with score < control 10th percentile | 5th percentile cut-score from controls | Number of AD patients with score < control 5th percentile |
|---|---|---|---|---|---|---|---|---|---|
| Free recall | Trial 1 | 6.8 (1.7) | 3.4 (2.3) | 9.2 | 1.8 | 5.0 | 21 (77.8%) | 4.0 | 16 (59.3%) |
| | Trial 2 | 9.4 (1.6) | 4.9 (2.6) | 12.1 | 2.4 | 7.0 | 20 (74.1%) | 6.7 | 20 (74.1%) |
| | Trial 3 | 10.4 (1.5) | 5.1 (2.5) | 12.1 | 3.0 | 8.0 | 23 (85.2%) | 7.0 | 21 (77.8%) |
| | Delayed recall | 9.1 (2.3) | 1.7 (2.2) | 14.4 | 3.1 | 6.4 | 26 (96.3%) | 5.0 | 22 (81.5%) |
| | Total free recall | 26.6 (3.9) | 13.4 (6.8) | 9.9 | 3.0 | 21.0 | 22 (81.5%) | 19.0 | 21 (77.8%) |
| Retention-weighted recall | Trial 1 | 49.7 (12.8) | 22.32 (16.3) | 9.5 | 2.1 | 35.0 | 20 (80%) | 32.7 | 19 (76%) |
| | Trial 2 | 70.2 (15.4) | 34.7 (19.9) | 10.4 | 2.2 | 50.0 | 21 (80.8%) | 43.0 | 17 (65.4%) |
| | Trial 3 | 79.2 (17.4) | 35.3 (20.7) | 11.6 | 2.4 | 57.8 | 20 (76.9%) | 52.8 | 20 (76.9%) |
| Primacy | | 9.8 (1.4) | 3.9 (3.3) | 15.4 | 3.2 | 7.0 | 24 (88.9%) | 6.0 | 20 (74%) |
| Recency | | 8.7 (1.9) | 6.4 (3.8) | 4.65 | 1.1 | 6.0 | 9 (33.3%) | 5.0 | 9 (33.5%) |
| Intrusions (NP) | | 0 (0–5) | 0.17 (0–11) | 1,105.5 | | 2.6 | 11 (40.7%) | 3.3 | 9 (33.3%) |
| Repetitions (NP) | | 0.02 (0–5) | 0.02 (0–11) | 1,812 | | 6.6 | 3 (11.1%) | 8.0 | 1 (3.7%) |

Notes: AD = Alzheimer's disease; ISLT = International Shopping List Test. For each ISLT performance measure, the table shows the group mean (±SD) for the AD and healthy control groups, the magnitude and statistical significance of group differences, the criterion value at the 10th and 5th percentiles in the control group, and the number and percentage of participants from the AD group with scores less than these criteria for abnormality. NP indicates that group data are summarized with median and range statistics and compared statistically using the Mann–Whitney U-test. For all comparisons, the level of statistical significance was p < .001.

Impairment in ISLT performance in individual patients with AD

To compare the sensitivity of different outcome measures from the ISLT in detecting abnormal memory performance in AD, normal ranges for performance were computed for criteria that were based on scores that would classify performance as abnormal with 90% specificity (i.e., scores less than the 10th percentile for the control group) and 95% specificity (i.e., scores less than 5th percentile for the control group). For each ISLT outcome measure, the number of AD patients with performance beyond the criterion value was identified and defined as the sensitivity of that measure to AD-related impairment in verbal learning and memory.
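The classification rule described above can be sketched in Python as follows. This is an illustrative sketch only: the function names are hypothetical, the percentile uses linear interpolation (one of several common definitions, and not necessarily the one used in the study), and the scores in the example are invented rather than study data.

```python
from typing import Sequence, Tuple


def percentile(values: Sequence[float], pct: float) -> float:
    """Percentile by linear interpolation between order statistics."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    rank = (pct / 100) * (len(ordered) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


def sensitivity_at_cut(control_scores: Sequence[float],
                       patient_scores: Sequence[float],
                       control_percentile: float = 10.0) -> Tuple[float, int, float]:
    """Return (cut-score, number of patients below it, proportion below it).

    A cut at the control 10th percentile corresponds to roughly 90%
    specificity; a cut at the 5th percentile to roughly 95%.
    """
    cut = percentile(control_scores, control_percentile)
    n_abnormal = sum(1 for score in patient_scores if score < cut)
    return cut, n_abnormal, n_abnormal / len(patient_scores)


if __name__ == "__main__":
    # Invented delayed-recall scores, for illustration only.
    controls = [5, 6, 7, 8, 9, 10, 10, 11, 11, 12, 12]
    patients = [2, 3, 5, 6, 7]
    print(sensitivity_at_cut(controls, patients, control_percentile=10.0))
```

The sketch treats lower scores as worse, which suits the recall measures. For measures on which higher values indicate poorer performance (intrusions, repetitions), the comparison would need to be reversed and the cut taken from the upper tail of the control distribution.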

Results and Discussion

ISLT performance in AD

Figure 1a shows the profile of performance across ISLT trials for the recall scores in the AD and the control groups. The profile analysis indicated a significant effect of levels, F(1, 181) = 233.6, p < .001; a significant effect of flatness, Wilks' lambda = 0.36, F(3, 179) = 105.63, p < .001; and a significant departure from parallelism, Wilks' lambda = 0.73, F(3, 179) = 22.12, p < .001. Post hoc tests comparing segments of the learning curve indicated that improvement in performance from Trial 1 to Trial 2 was greater in the control group than in the AD group, t(180) = 3.53, p = .001, d = 0.74; that improvement in performance from Trial 2 to Trial 3 was greater in the control group than in the AD group, t(180) = 2.41, p = .02, d = 0.49; and that decline in performance between Trial 3 and the delayed recall trial was less pronounced in the control group than in the AD group, t(30.9) = 4.00, p < .001, d = 1.05. Figure 1a also shows group mean performance in free recall score on each trial.

Fig. 1.

Differences in performance in recall (a) and RWR (b) between the AD group and the control group in each trial. Note that error bars represent the standard error of the mean and are presented to guide comparison of performance between groups, not within groups over trials.

Table 3 shows the group mean and standard deviation in the control and AD groups for each measure of the ISLT. As expected, controls recalled significantly more words than the AD group on Trials 1–3, as well as on the delayed recall trial. Consequently, the total free recall score was significantly greater in the controls than in the AD group. Figure 1b shows the group mean RWR scores across the three learning trials on the ISLT in the control and the AD groups. ANOVA indicated a significant effect of trial, Wilks' lambda = 0.58, F(2, 177) = 63.62, p < .001; a significant effect of group, F(1, 178) = 176.51, p < .001; and a significant trial × group interaction, Wilks' lambda = 0.92, F(2, 177) = 8.03, p < .001. Post hoc decomposition of the interaction with t-tests (summarized in Table 3) revealed that the performance of the controls was significantly better than that of the AD group on Trials 1–3. Primacy scores were significantly higher in the controls than in the AD group; however, there were no significant differences in recency scores between the control and the AD groups. Finally, the AD group made significantly more intrusions than controls, but the groups did not differ with respect to the number of repetitions.

Impairment in ISLT performance in individual patients with AD

Table 3 also shows the number and percentage of participants from the AD group with abnormal scores on the different ISLT metrics with the criteria for performance set so that the specificity levels were 90% and 95%. At criteria that provided 90% specificity, the greatest sensitivity detected was for the delayed recall trial (96.3%), followed by Trial 3 free recall (85.2%) and total free recall (81.5%). The RWR measure had better sensitivity than the free recall measure for Trial 1 (RWR, 80%; free recall, 77.8%) and Trial 2 (RWR, 80.8%; free recall, 74.1%), although both of these measures had less sensitivity than the total free recall and delayed recall scores (Table 3). Increasing specificity to 95% reduced the sensitivity measures slightly, although the pattern of differences was equivalent.

Results of Study 1 indicated that patients with AD performed worse on the ISLT than controls and that the magnitude of impairment on all measures was, by convention, very large (Cohen, 1988). Further, performance scores for the delayed recall, Trial 3 free recall, and total free recall scores showed excellent sensitivity to detecting verbal memory impairment in AD when criteria for impairment were set to retain specificity at 90% and 95%. Unlike in the Buschke and colleagues (2006) study, RWR scores were not more sensitive to memory impairment in AD than the total free recall or delayed recall scores. However, because Buschke and colleagues (2006) used the 10-word single-trial list learning task from the TICS, the benefit of the weighted recall over the conventional recall measure may have occurred because the TICS word list is shorter than the ISLT word list. Therefore, the aim of Study 2 was to determine the extent to which RWR measures and their component measures of primacy and total free recall were modulated by list lengths of 10 and 12 words.

Study 2

Method

Participants

A total of 30 older adults with AD were recruited from memory clinics in major metropolitan hospitals. Inclusion/exclusion criteria were identical to those described in Study 1 and none of the participants in Study 2 participated in Study 1. The patients for Study 2 were also matched to those from Study 1 with respect to age, gender, years of education, premorbid IQ, MMSE score, and CDR sum of boxes score. All patients were taking cholinesterase inhibitors (Table 1).

Procedure

The procedure was the same as that described in Study 1, except that the ISLT word list contained only 10 words.

Data analysis

In order to account for differences in the possible scores for the 10- and 12-word versions of the ISLT, each participant's RWR score was standardized by dividing it by the maximum possible score (i.e., 55 for the 10-word list and 78 for the 12-word list). To standardize free recall scores for each trial and total free recall, scores were expressed as percentages of the words per trial (10 or 12) or total words (30 or 36). Free recall and RWR measures were compared between 10- and 12-word versions of the ISLT by submitting data for each trial to a series of 2 (list length) × 3 (trial) repeated-measures ANOVAs. Significant effects involving list length were decomposed using independent t-tests, and measures of effect size (Cohen's d) were computed for all differences. Total free recall and primacy scores were compared between list lengths using independent t-tests. As in Study 1, to minimize the risk of Type I error, the criterion for statistical significance was set to .01 and only statistically significant differences or associations with effect sizes greater than those considered as small in magnitude (e.g., d ≥ 0.20; Cohen, 1988) were interpreted.
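The standardization step can be sketched as follows. The maxima of 55 and 78 come from the text; the closed form n(n + 1)/2 used here is only an observation that reproduces both values and is not a formula stated in this section. The function names are hypothetical.

```python
def max_rwr(list_length):
    """Maximum possible RWR score for a list of the given length.
    Assumption: n(n + 1) / 2, which yields 55 for 10 words and 78 for 12."""
    return list_length * (list_length + 1) // 2


def standardized_rwr(raw_rwr, list_length):
    """RWR expressed as a proportion of the maximum possible score."""
    return raw_rwr / max_rwr(list_length)


def percent_recall(words_recalled, words_possible):
    """Free recall expressed as a percentage of the words available
    (10 or 12 per trial; 30 or 36 for total free recall)."""
    return 100.0 * words_recalled / words_possible


print(max_rwr(10), max_rwr(12))
print(standardized_rwr(39, 12), percent_recall(6, 12))
```

Expressing both measures relative to their maxima places the 10- and 12-word versions on a common scale before they are compared.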

Results and Discussion

Figure 2a shows the group mean standardized free recall performance on the 10- and 12-word versions of the ISLT. Results of the repeated-measures ANOVA indicated a significant effect of trial, Wilks' lambda = 0.26, F(3, 45) = 43.61, p < .001, but no significant effect of list length, F(1, 47) = 2.8, p = .10, and no trial × list length interaction that met the .01 criterion, Wilks' lambda = 0.80, F(3, 45) = 3.8, p = .02. Total free recall scores also did not differ significantly between groups, t(55) = 0.41, p = .11, d = 0.12.

Fig. 2.

Differences in performance in recall (a) and RWR (b) between the 10-word list and 12-word list groups in each trial. Note that error bars represent the standard error of the mean and are presented to guide comparison of performance between groups, not within groups over trials.

Figure 2b displays the group mean RWR scores across the three learning trials on the ISLT in the groups who received the 10- and 12-word versions of the test. Results of a repeated-measures ANOVA of these scores indicated a significant effect of trial, Wilks' lambda = 0.67, F(2, 51) = 12.83, p < .001, but no significant effect of list length, F(1, 52) = 0.12, p = .77, and no significant trial × list length interaction, Wilks' lambda = 0.92, F(2, 51) = 2.21, p = .12. There were no significant differences in primacy scores between the 10-item list (M = 0.47, SD = 0.27) and the 12-item list (M = 0.40, SD = 0.28), t(55) = 1.01, p = .34, d = 0.27.
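For readers who wish to see how an effect size of this kind is obtained from summary statistics, the following is a minimal sketch of Cohen's d with a pooled standard deviation. The group sizes of 30 (10-word list) and 27 (12-word list) are an assumption inferred from the reported degrees of freedom, t(55), and are not stated in this section. Because the inputs are rounded means and SDs, the output approximates the reported values and need not match them exactly.

```python
import math


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference between two independent groups."""
    return (m1 - m2) / pooled_sd(sd1, n1, sd2, n2)


def t_from_d(d, n1, n2):
    """Independent-samples t statistic implied by d and the group sizes."""
    return d / math.sqrt(1 / n1 + 1 / n2)


# Primacy scores as reported; group sizes are assumed (see lead-in).
d = cohens_d(0.47, 0.27, 30, 0.40, 0.28, 27)
print(round(d, 2), round(t_from_d(d, 30, 27), 2))
```

Under these assumptions the difference is small in magnitude, in line with the conclusion that primacy did not differ meaningfully between list lengths.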

Results of Study 2 suggested that there was no difference between the 10- and 12-item versions of the ISLT on any of the outcome measures generated by this test, including RWR. Restrictions in range and reduced variability are known to reduce reliability estimates for measures of intelligence and cognition (Cohen & Swerdlik, 2009). In the current study, scores on the RWR measure showed greater variability about the mean (i.e., greater SDs) than the delayed recall score or total free recall score. This may suggest that the RWR measure has greater reliability than measures of total recall and that it may have greater utility in measuring change in verbal memory function over time in individuals with AD. To evaluate this possibility, the aim of Study 3 was to determine the test–retest reliability of the different ISLT performance measures in individuals with AD. Because change in memory performance over time is characteristic of AD (Lezak et al., 2004), test–retest reliability was assessed over a brief retest interval (approximately 1 month).

Study 3

Method

Participants

A total of 50 older adults with AD underwent assessment on the ISLT on two occasions; 27 of these individuals were drawn from the sample studied in Study 1 (and therefore the data from Study 1 contributed to that from the first assessment in the current study). In addition, 23 participants from the 30 recruited for Study 2 agreed to undergo further baseline and follow-up assessments with the 12-word ISLT after they had finished their commitment to Study 2. The average time from participation in Study 2 to the baseline for Study 3 was 42 days (SD = 9.5 days; range = 35–56 days). Reasons for individuals from Study 2 not participating in Study 3 were: planned travel (n = 3); planned or ongoing family commitments (n = 2); lack of interest (n = 1); and negative perceptions of the assessment (n = 1). Table 1 shows the characteristics of the sample that participated in Study 3.

Procedure

The procedure for Study 3 was the same as that described in Study 1. All individuals had provided baseline data on the ISLT as part of their assessment (i.e., the 27 participants from Study 1) or in a re-assessment 6 weeks after their participation in Study 2 (i.e., the 23 participants from Study 2). All 50 participants then underwent a further assessment on the ISLT which was conducted an average of 26 days (SD = 11 days; range = 20–45 days) after their baseline administration of the 12-word ISLT. For all re-assessments, parallel versions of the 12-word ISLT were employed.

Data analysis

Outcome measures were compared between the first and second assessments using paired t-tests. Test–retest reliability was estimated using Pearson's r. Where data did not meet the assumptions for parametric analysis, Kruskal–Wallis tests and Spearman's r were computed to compare assessments and estimate reliability, respectively. As in Studies 1 and 2, the level required for statistical significance was .01 and only statistically significant differences or associations with d ≥ 0.20 were interpreted.
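The two parametric statistics named above can be sketched in a few lines. This is an illustration of the standard formulas for Pearson's r and the paired t statistic, not the authors' analysis code, and the scores are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between paired scores from two assessments."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def paired_t(x, y):
    """Paired t statistic for the mean within-person difference (x - y)."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    sd_d = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))
    return mean_d / (sd_d / math.sqrt(n))


# Hypothetical total free recall scores at two assessments.
assessment_1 = [12, 15, 9, 14, 11, 17, 10, 13]
assessment_2 = [11, 16, 9, 13, 12, 16, 9, 14]
print(round(pearson_r(assessment_1, assessment_2), 2))
print(round(paired_t(assessment_1, assessment_2), 2))
```

A high r together with a paired t near zero is the pattern expected of a stable measure: individuals keep their rank order and the group mean does not shift between assessments.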

Results and Discussion

Group means for each outcome measure on each assessment are summarized in Table 4. No statistically significant differences in performance between the first and the second assessment were found for any of the ISLT outcome measures, and all effect sizes for the difference between assessments were <0.20. Estimates of test–retest reliability were positive and statistically significant for the free recall and RWR measures on each trial, for the total free recall score, and for the measure of primacy. In general, test–retest correlations were greater for the free recall scores than for the RWR scores.

Table 4.

Group mean (SD) performance on the ISLT performance measures at two assessments conducted 1 month apart

| Measure | Trial | Assessment 1 (M [SD]) | Assessment 2 (M [SD]) | r | t or Kruskal–Wallis |
| --- | --- | --- | --- | --- | --- |
| Free recall | Trial 1 | 3.43 (2.13) | 2.99 (1.98) | .56* | 1.11 |
| | Trial 2 | 4.78 (2.35) | 4.54 (2.17) | .67* | 1.12 |
| | Trial 3 | 5.13 (2.50) | 5.18 (2.1) | .73* | 0.92 |
| | Delayed recall (NP) | 1.98 (2.31) | 1.87 (1.95) | .45* | 0.72 |
| Total free recall | | 13.27 (4.47) | 12.94 (3.78) | .85* | 0.44 |
| Retention-weighted recall | Trial 1 | 5.93 (5.23) | 5.37 (3.74) | .55* | −0.45 |
| | Trial 2 | 6.54 (4.14) | 6.70 (3.33) | .56* | 1.31 |
| | Trial 3 | 23.43 (17.4) | 24.35 (17.56) | .67* | −0.021 |
| Primacy | | 3.18 (1.9) | 3.20 (1.8) | .44* | 0.61 |
| Recency | | 5.63 (1.82) | 5.52 (1.9) | .26 | 2.11 |
| Intrusions (NP) | | 1 (0–9) | 1 (0–8) | .21 | 0.34 |
| Repetitions (NP) | | 0 (0–2) | 0 (0–2) | .17 | 0.41 |

Notes: ISLT = International Shopping List Test; NP = non-parametric analysis conducted using Kruskal–Wallis for comparison and Spearman's r for test–retest correlation.

*Statistically significant association, p < .01.

General Discussion

Results of this set of three studies support our hypothesis that individuals with AD would show impairment on all performance measures of the ISLT. Relative to controls, individuals with AD learned the ISLT word list at a slower rate and recalled fewer words on each trial (Fig. 1a). They also forgot more words than control participants when asked to recall the word list that had been learned previously after a brief delay (i.e., the delayed recall trial). As such, total words recalled were significantly lower in the AD group. Importantly, the magnitude of the impairment observed on each ISLT measure was, by convention, very large (i.e., Cohen's d values >2; Cohen, 1988).

Large magnitude impairments relative to controls were also observed across each of the three ISLT learning trials when performance was expressed as RWR using the computation of Buschke and colleagues (2006). As expected, the AD group also showed lower primacy effects than healthy adults, although there was no impairment in the AD group for measures of recency. Finally, patients with AD made more intrusions than controls, but did not differ with respect to repetitions. When data were considered at the level of individual patients with AD, the sensitivity of all ISLT measures was high (greater than ∼60%) when specificity of classification was set at either 90% or 95% (Table 3). The delayed recall and Trial 3 free recall measures of the ISLT were the most sensitive to memory impairment in AD, yielding sensitivity estimates of approximately 90%. Taken together, these results suggest that the ISLT is sensitive to detecting impairment in verbal memory in AD, even though the words for this list are drawn from a single semantic category.

The nature of the impairment observed on the ISLT in the AD groups who participated in this set of studies is consistent with that observed previously in AD on the basis of performance on other tests of verbal learning and memory (i.e., HVLT, CVLT-II). For example, prior studies have shown that, compared with matched controls, patients with AD show slower rates of acquiring verbal information across learning trials (Estevez-Gonzalez et al., 2003; Greenway et al., 2006), forget newly learned information faster (Antonelli Incalzi et al., 1995; Estevez-Gonzalez et al., 2003; Greenway et al., 2006; Jungwirth et al., 2009) and have lower total recall scores (Gaines, Shapiro, Alt, & Benedict, 2006; Hogervorst et al., 2002; Kulansky et al., 2004). Individuals with AD have also been found to show reduced primacy effects but intact recency (Bayley et al., 2000; Buschke et al., 2006; Massman, Delis, & Butters, 1993), as well as more intrusions (Greenway et al., 2006). Finally, a number of studies have shown delayed recall to be the most sensitive measure of impairment in verbal list learning in AD (Antonelli Incalzi et al., 1995; Backman et al., 2005; Greenway et al., 2006; Jungwirth et al., 2009). Results of the current study suggest that performance deficits on the ISLT are qualitatively and quantitatively similar to those observed previously for other tests of verbal list learning used to assess verbal learning and memory dysfunction in individuals with AD. For example, a recent meta-analysis by Gavett and associates (2009) estimated the sensitivity of measures of immediate recall (i.e., total recall) and delayed recall from verbal list learning tests to AD-related memory impairment to range between 0.60 and 0.95, with specificities that ranged from 0.80 to 1.0. These generally high estimates reflect the substantial verbal memory impairment that is characteristic of AD.

The results of the current study did not support the hypothesis that the combination of measures of primacy with free recall increased the sensitivity of a verbal list learning test to detecting memory impairment in AD (e.g., Buschke et al., 2006). More specifically, in the study by Buschke and colleagues (2006), expression of performance in terms of RWR improved the sensitivity to AD-related memory impairment of the TICS memory test (e.g., d = 1.52) when compared with the total free recall score (e.g., d = 1.08). Impaired primacy in AD is thought to represent an inability to encode newly learned verbal information into long-term memory (Massman et al., 1993). As such, patients with AD tend to rely on working memory processes to recall items from the word list and may therefore be more likely to recall items presented toward the end of the list (Capitani, Sala, Logie, & Spinnler, 1992). In the current study, the combination of measures of primacy and recall (i.e., RWR score) did not show the greatest sensitivity to memory impairment in AD. Instead, total free recall, delayed recall, and free recall on Trial 3 all yielded larger effect sizes, as well as higher sensitivity. One reason for the discrepancy in findings between the current study and those of Buschke and colleagues (2006) may be related to different lengths of the word lists used in the studies, as the ISLT employs a 12-word list, while the TICS employs a 10-word list. Additional research is needed to examine this possibility.

In Study 2, we investigated the possibility that the failure to replicate the results of Buschke and colleagues (2006) was explained by their use of a 10-item list. In other words, sensitivity of RWR may be greater for shorter lists which impose a reduced load on memory when compared with longer lists. However, results of Study 2 showed no differences in RWR scores (adjusted for list length) between 10- and 12-word versions of the ISLT in patients with AD. Interestingly, there was also no difference between the 10- and 12-item versions of the memory test for the adjusted free recall scores on any of the individual trials or on the delayed recall trial (Fig. 2a), the total free recall score, or number of intrusions. Therefore, lower memory loads could not account for greater sensitivity of the RWR measure as reported by Buschke and colleagues (2006).

Another explanation for the reduced sensitivity of the RWR measure is that the ISLT, like most conventional verbal list learning tests, is comprised of multiple trials, whereas the verbal memory test from the TICS, used in the Buschke and colleagues (2006) study, is comprised of only a single trial. Interestingly, when the comparison of sensitivities is restricted to the first trial of the ISLT, the RWR measure (d = 2.06) did provide a magnitude of impairment larger than the number of words recalled (d = 1.83), thereby demonstrating greater sensitivity to detecting verbal memory impairment in AD. A possible explanation for this finding is that the impaired encoding of new verbal information in AD means that there is little improvement in memory performance despite repeated exposure to the same word list. This aspect of memory performance has been noted previously in AD (Estevez-Gonzalez et al., 2003; Greenway et al., 2006) and is evident in the relatively flat learning curve for the AD group shown in Fig. 1a. Unlike patients with AD, controls continued to benefit from exposure to the list and consequently their performance improved over trials, thereby accentuating differences between the groups. Therefore, if only data from the first trial of the ISLT are considered, results are consistent with the earlier findings of Buschke and colleagues (2006). Specifically, they suggest that for single-trial verbal list learning tests, the RWR measure does improve the sensitivity of outcome measures to detecting verbal memory impairment in individuals with AD. However, for verbal list learning tests with multiple trials, conventional measures of performance (total free recall and delayed recall) appear to provide the greatest sensitivity to detecting verbal memory impairment in AD, at least for the ISLT as presented here.

In Study 3, test–retest reliability for the ISLT was highest on the total free recall scores (r = .85), followed by the free recall scores on each of the trials (range of rs = .56–.73). These estimates of test–retest reliability were similar to those observed for other verbal list learning tests in individuals with AD (Benedict, Schretlen, Groninger, & Brandt, 1998; Woods, Delis, Scott, Kramer, & Holdnack, 2006). Thus, even though the RWR for a single learning trial was more sensitive to memory impairment in AD than simple recall, it was not more reliable. In sum, these results suggest that the ISLT, a measure of verbal list learning that uses a single semantic category, is stable and reliable, at least in the short term (i.e., 1 month) and that RWR does not enhance test–retest reliability over conventional measures of free recall. Importantly, test–retest statistics computed in patients with AD cannot be generalized to healthy adults, but these estimates do provide useful information about the extent to which this type of test can be administered repeatedly to people with AD.

It is well known that subtle memory decline in AD occurs even before a clinical diagnosis is warranted (Backman et al., 2005; Knopman et al., 2003; Sperling et al., 2010; Twamley et al., 2006) and that with the progression of the disease, verbal memory impairment increases in severity in AD. Accordingly, some researchers have proposed that the assessment of memory deficits should be based on the individual pattern of decline on the same task over time, rather than norm-referenced performance (Collie et al., 2001). This individualized approach to memory assessment also controls for other factors that contribute to variability within individuals and has been applied successfully in studies of mild cognitive impairment (Maruff et al., 2004; Weaver Cargin, Maruff, Collie, Shafiq-Antonaci, & Masters, 2007). Hence, the verbal memory deficits in patients with AD could be further investigated by examining individual performance on the ISLT over time, employing the measures validated in this study, including free recall, RWR, primacy, and intrusions.

One limitation of the current study was that the NART, a measure of premorbid IQ that requires one to read a list of words, was administered between the learning and delayed recall trials of the ISLT. It is possible that interspersing a verbal task between these trials may have increased the number of intrusion errors in the AD group during the delayed recall trial; nevertheless, inspection of this possibility indicated that the intrusions recorded were not words from the NART. A second limitation of this study was that it was primarily cross-sectional in design and, where memory performance was measured prospectively, this was done over a very short retest interval of 1 month. A third limitation of this study was that even though the ISLT was developed for the assessment of verbal memory across cultures, the current study investigated performance only in English-speaking people with AD. Thus, the current data do not support the use of the ISLT for detecting AD-related memory impairment in languages other than English. However, as previous research has shown that it is possible to minimize cultural biases in verbal list learning tests using the shopping list method (Lim et al., 2009), we can proceed to examine the sensitivity of the ISLT to detecting AD-related verbal learning and memory impairments in other linguistic and cultural groups. Finally, the continued sensitivity of the ISLT to detecting AD-related verbal memory impairment despite its use of words from a single semantic category warrants further research. When considered from a broader perspective, the evidence now suggests that the sensitivity of verbal list learning tests to AD is the same whether they have no semantic categories (i.e., RAVLT), one semantic category (ISLT), or multiple semantic categories (HVLT-R, CVLT-II). These tests have not been compared directly, so it is possible that there are very small differences in their sensitivity to detecting AD-related verbal memory impairment. However, it would be useful to challenge the hypothesis that semantic grouping is not important to verbal list learning in AD.

In conclusion, results of this study suggest that the use of the ISLT, a verbal list learning task comprised of a single semantic category that may be administered to individuals from a broad range of language and cultural backgrounds, did not diminish the sensitivity of this task to detecting verbal memory impairment in individuals with AD. Compared with healthy older adults, individuals with AD performed significantly worse on all measures of this test. Measures of total free recall and delayed recall were the most sensitive to detecting verbal memory impairment in this sample of individuals with AD. A measure of RWR also showed good sensitivity, especially for the first trial. Results of this study are consistent with the literature on episodic memory impairment in AD and suggest that the English version of the ISLT is a simple, valid, and sensitive tool that may be useful in detecting verbal list learning and memory impairment in individuals with AD. Importantly, this task provides a measure of verbal list learning and memory that has the potential to be tailored to specific cultural groups and afford meaningful comparison between them.

Funding

This work was funded by an unrestricted educational grant from Pfizer Australia.

Conflict of Interest

Paul Maruff and David Darby are full-time employees of CogState Ltd, the company that distributes the International Shopping List Test.

References

Alzheimer Association. (2007). Alzheimer's disease facts and figures.

Antonelli Incalzi, R., Capparella, O., Gema, A., Marra, C., & Carbonin, P. (1995). Effects of aging and of Alzheimer's disease on verbal memory. Journal of Clinical and Experimental Neuropsychology, 17, 580–589.

Ardila, A. (2005). Cultural values underlying psychometric cognitive testing. Neuropsychology Review, 15, 185–195.

Backman, L., Jones, S., Berger, A., & Laukka, E. J. (2005). Cognitive impairment in preclinical Alzheimer's disease: A meta-analysis. Neuropsychology, 19, 520–531.

Bayley, P. J., Salmon, D. P., Bondi, M. W., Bui, B. K., Olichney, J., Delis, D. C., et al. (2000). Comparison of the serial position effect in very mild Alzheimer's disease, mild Alzheimer's disease and amnesia associated with electroconvulsive therapy. Journal of the International Neuropsychological Society, 6, 290–298.

Benedict, R. H. B., Schretlen, D., Groninger, L., & Brandt, J. (1998). The Hopkins Verbal Learning Test – Revised: Normative data and analysis of inter-form and test-retest reliability. The Clinical Neuropsychologist, 12, 43–55.

Braak, H., Braak, E., & Bohl, J. (1993). Staging of Alzheimer-related cortical destruction. European Journal of Neurology, 60, 1077–1081.

Brandt, J., & Benedict, R. H. B. (2001). Hopkins Verbal Learning Test-Revised. Odessa, FL: PAR.

Brandt, J., Spencer, M., & Folstein, M. (1988). The Telephone Interview for Cognitive Status. Neuropsychiatry, Neuropsychology, and Behavioural Neurology, 1, 111–117.

Buschke, H., Sliwinski, M. J., Kuslanski, G., Katz, M., Verghese, J., & Lipton, R. B. (2006). Retention weighted recall improves discrimination of Alzheimer's disease. Journal of the International Neuropsychological Society, 12, 436–440.

Capitani, E., Sala, S. D., Logie, R. H., & Spinnler, H. (1992). Recency, primacy, and memory: Reappraising and standardizing the serial position curve. Cortex, 28, 315–342.

Cohen, J. (1988). Statistical power analysis for the behavioural sciences (2nd ed.). New York: Academic Press.

Cohen, R. J., & Swerdlik, M. (2009). Psychological testing and assessment: An introduction to test and measurement (7th ed.). New York: McGraw Hill Higher Education.

Collie, A., Maruff, P., Shafiq-Antonacci, R., Smith, M., Hallup, M., Schoefield, P. R. A., et al. (2001). Memory decline in healthy older people. Neurology, 56, 1533–1538.

Craik, F. I., & Tulving, E. (1975). Depth of processing and the retention of words in episodic memory. Journal of Experimental Psychology: General, 104, 268–294.

Cummings, J. L. (2004). Alzheimer's Disease. The New England Journal of Medicine, 351, 56–67.

Darby, D., Pietrzak, R. H., Fredrickson, J., Woodward, M., Moore, L., Fredrickson, A., et al. (in press). Intra-individual cognitive decline using a brief computerized cognitive screening test. Alzheimer's and Dementia.

Delis, D. C., Freeland, J., Kramer, J. H., & Kaplan, E. (1988). Integrating clinical assessment with cognitive neuroscience: Construct validation of the California Verbal Learning Test. Journal of Consulting and Clinical Psychology, 56, 123–130.

Delis, D. C., Kramer, J. H., Kaplan, E., & Ober, B. A. (1987). The California Verbal Learning Test. Adult Version (Research ed.). San Antonio, TX: Psychological Corporation.

Delis, D. C., Kramer, J. H., Kaplan, E., & Ober, B. A. (2000). California Verbal Learning Test – Second Edition (CVLT-II). San Antonio, TX: Psychological Corporation.

Estevez-Gonzalez, A., Kulisevsky, J., Boltes, A., Otermin, P., & Garcia-Sanchez, C. (2003). Rey Verbal Learning Test is a useful tool for differential diagnosis in the preclinical phase of Alzheimer's disease: Comparison with mild cognitive impairment and normal aging. International Journal of Geriatric Psychiatry, 18, 1021–1028.

Fischer, P., Jungwirth, S., Zehetmayer, S., Weissgram, S., Hoenigschnabl, S., Gelpi, E., et al. (2007). Conversion from subtypes of mild cognitive impairment to Alzheimer dementia. Neurology, 68, 288–291.

Fleisher, A. S., Sowell, B. B., Taylor, C., Gamst, A. C., Petersen, R. C., & Thal, L. J., for the Alzheimer's Disease Cooperative Study. (2007). Clinical predictors of progression to Alzheimer disease in amnesic mild cognitive impairment. Neurology, 68, 1588–1595.

Fredrickson, J., Maruff, P., Woodward, M., Moore, L., Fredrickson, A., Sach, J., et al. (2010). Evaluation of the usability of a brief computerized cognitive screening test in older people for epidemiological studies. Neuroepidemiology, 34, 65–75.

Gaines, J. J., Shapiro, A., Alt, M., & Benedict, R. H. B. (2006). Semantic clustering indexes for the Hopkins Verbal Learning Test-Revised: Initial exploration in elder control and dementia groups. Applied Neuropsychology, 13, 213–222.

Gainotti, G., & Marra, C. (1994). Some aspects of memory disorders clearly distinguish dementia of the Alzheimer's type from depressive pseudo-dementia. Journal of Clinical and Experimental Neuropsychology, 16, 65–78.

Gainotti, G., Monteleone, D., Parlato, E., & Carlomagno, S. (1989). Verbal memory disorders in Alzheimer's disease and multi-infarct dementia. Journal of Neurolinguistics, 4, 327–345.

Gavett, B. E., Poon, S. J., Ozonoff, A., Jefferson, A. L., Nair, A. K., Green, R. C., et al. (2009). Diagnostic utility of the NAB List Learning test in Alzheimer's disease and amnestic mild cognitive impairment. Journal of the International Neuropsychological Society, 15, 121–129.

Greenfield, P. M. (1997). You can't take it with you: Why ability assessments don't cross cultures. American Psychologist, 52, 1115–1124.

Greenway, M. C., Lacritz, L. H., Binegar, D., Weiner, M. F., Lipton, A., & Cullum, C. M. (2006). Patterns of verbal memory performance in mild cognitive impairment, Alzheimer's disease, and normal aging. Cognitive and Behavioural Neurology, 19, 79–84.

Hogervorst, E., Combrinck, M., Lapuerta, P., Rue, J., Swales, K., & Budge, M. (2002). The Hopkins Verbal Learning Test and screening for dementia. Dementia and Geriatric Cognitive Disorders, 13, 13–20.

Jungwirth, S., Zehetmayer, S., Bauer, P., Weissgram, S., Tragl, K. H., & Fischer, P. (2009). Prediction of Alzheimer dementia with short neuropsychological instruments. Journal of Neural Transmission, 116, 1513–1521.

Knopman, D., Parisi, J. E., Salvati, A., Floriach-Robert, M., Boeve, B. F., Ivnik, R. J., et al. (2003). Neuropathology of cognitively normal elderly. Journal of Neuropathology and Experimental Neurology, 62, 1087–1095.

Kulansky, G., Katz, M., Verghese, J., Hall, C. B., Lapuerta, P., LaRuffa, G., et al. (2004). Detecting dementia with the Hopkins Verbal Learning Test and the Mini-Mental State examination. Archives of Clinical Neuropsychology, 19, 89–104.

Lezak, M. D., Howieson, D. B., & Loring, D. W. (2004). Neuropsychological assessment (4th ed.). New York: Oxford University Press.

Lim, Y. Y., Prang, K. H., Cysique, L., Pietrzak, R. H., Snyder, P. J., & Maruff, P. (2009). A method for cross-cultural adaptation of a verbal memory assessment. Behavior Research Methods, 41, 1190–1200.

Maruff, P., Collie, A., Darby, D., Weaver-Cargin, J., & Masters, C. (2004). Subtle memory decline over 12 months in mild cognitive impairment. Dementia and Geriatric Cognitive Disorders, 18, 342–348.

Massman, P. J., Delis, D. C., & Butters, N. (1993). Does impaired primacy recall equal impaired long-term storage? Serial position effects in Huntington's disease and Alzheimer's disease. Developmental Neuropsychology, 9, 1–15.

McKhann, G., Drachman, D., Folstein, M., Katzman, R., Price, D., & Stadlan, E. M. (1984). Clinical diagnosis of Alzheimer's disease: Report of the NINCDS-ADRDA Work Group under the auspices of Department of Health and Human Services Task Force on Alzheimer's Disease. Neurology, 34, 939–944.

Palmer, K., Backman, L., Winbland, B., & Fratiglioni, L. (2008). Detection of Alzheimer's disease and dementia in the preclinical phase: Population-based cohort study. British Medical Journal, 326, 1–5.

Pietrzak, R. H., Maruff, P., Woodward, M., Fredrickson, J., Fredrickson, A., Krystal, J. H., et al. (in press). Mild worry symptoms are associated with decline in learning and memory in healthy older adults: A 2-year prospective cohort study. American Journal of Geriatric Psychiatry.

Pietrzak, R. H., Olver, J., Norman, T., Piskulic, D., Maruff, P., & Snyder, P. J. (2009). A comparison of the Cogstate Schizophrenia Battery and the Measurement and Treatment Research to Improve Cognition in Schizophrenia (MATRICS) battery in assessing cognitive impairment in chronic schizophrenia. Journal of Clinical and Experimental Neuropsychology, 31, 848–859.

Pike, K. E., Savage, G., Villemagne, V. L., Ng, S., Moss, S. A., Maruff, P., et al. (2007). Beta-amyloid imaging and memory in non-demented individuals: Evidence for preclinical Alzheimer's disease. Brain, 130, 837–844.

Rey, A. (1964). L'examen clinique en psychologie. Paris: Presses Universitaires de France.

Rowe, C. C., Ng, S., Ackermann, U., Gong, S. J., Pike, K., Savage, G., et al. (2007). Imaging beta-amyloid burden in aging and dementia. Neurology, 68, 1718–1725.

Sperling, R., Dickerson, B., Pihlajamaki, M., Vannini, P., LaViolette, P., Vitolo, O., et al. (2010). Functional alterations in memory networks in early Alzheimer's disease. Neuromolecular Medicine, 12, 27–43.

Spinnler, H., & Della Sala, S. (1988). The role of clinical neuropsychology in the neurological diagnosis of Alzheimer's disease. Journal of Neurology, 235, 258–271.

Stopford, C. L., Snowden, J. S., Thompson, J. C., & Neary, D. (2008). Variability in cognitive presentation of Alzheimer's disease. Cortex, 44, 185–195.

Tabachnick, B. G., & Fidell, L. S. (2001). Using multivariate analysis (4th ed.). Boston: Allyn and Bacon.

Twamley, E. W., Ropacki, S. A., & Bondi, M. W. (2006). Neuropsychological and neuroimaging changes in preclinical Alzheimer's disease. Journal of the International Neuropsychological Society, 12, 707–735.

Weaver Cargin, J., Maruff, P., Collie, A., Shafiq-Antonaci, R., & Masters, C. (2007). Decline in verbal memory in non-demented older adults. Journal of Clinical and Experimental Neuropsychology, 29, 706–718.

Woods, S. P., Delis, D. C., Scott, J. C., Kramer, J. H., & Holdnack, J. A. (2006). The California Verbal Learning Test-second edition: Test-retest reliability, practice effects, and reliable change indices for the standard and alternate forms. Archives of Clinical Neuropsychology, 21, 413–420.