Context: Subclinical hypothyroidism (SCH) and cognitive dysfunction are both common in the elderly and have been linked. It is important to determine whether T4 replacement therapy in SCH confers cognitive benefit.

Objective: Our objective was to determine whether administration of T4 replacement to achieve biochemical euthyroidism in subjects with SCH improves cognitive function.

Design and Setting: We conducted a double-blind placebo-controlled randomized controlled trial in the context of United Kingdom primary care.

Patients: Ninety-four subjects aged 65 yr and over (57 females, 37 males) with SCH were recruited from a population of 147 identified by screening.

Intervention: T4 or placebo was given at an initial dosage of one tablet of either placebo or 25 μg T4 per day for 12 months. Thyroid function tests were performed at 8-weekly intervals, with dosage adjusted in one-tablet increments to achieve TSH within the reference range for subjects in the treatment arm. Fifty-two subjects received T4 (31 females, 21 males; mean age 73.5 yr, range 65–94 yr); 42 subjects received placebo (26 females, 16 males; mean age 74.2 yr, range 66–84 yr).

Main Outcome Measures: Mini-Mental State Examination, Middlesex Elderly Assessment of Mental State (covering orientation, learning, memory, numeracy, perception, attention, and language skills), and Trail-Making A and B were administered.

Results: Eighty-two percent and 84% in the T4 group achieved euthyroidism at 6- and 12-month intervals, respectively. Cognitive function scores at baseline and 6 and 12 months were as follows: Mini-Mental State Examination T4 group, 28.26, 28.9, and 28.28, and placebo group, 28.17, 27.82, and 28.25 [not significant (NS)]; Middlesex Elderly Assessment of Mental State T4 group, 11.72, 11.67, and 11.78, and placebo group, 11.21, 11.47, and 11.44 (NS); Trail-Making A T4 group, 45.72, 47.65, and 44.52, and placebo group, 50.29, 49.00, and 46.97 (NS); and Trail-Making B T4 group, 110.57, 106.61, and 96.67, and placebo group, 131.46, 119.13, and 108.38 (NS). Linear mixed-model analysis demonstrated no significant changes in any of the measures of cognitive function over time and no between-group difference in cognitive scores at 6 and 12 months.

Conclusions: This RCT provides no evidence for treating elderly subjects with SCH with T4 replacement therapy to improve cognitive function.

Subclinical hypothyroidism (defined as an elevated serum TSH concentration in the presence of a normal serum free T4 concentration) is a common condition with prevalence ranging from 2.9% in a recent United Kingdom population-based study of subjects aged 65 yr and over (1) to 9% in a U.S. population of 18 yr and over (2). Whether subclinical hypothyroidism should be treated continues to be controversial (3–5).

Cognitive dysfunction is also common and is recognized as an important public health problem, with up to 16.8% of older adults having cognitive impairment of sufficient severity to affect completion of everyday tasks but that is not associated with a formal diagnosis of dementia (6). Hypothyroidism has traditionally been regarded as a cause of reversible cognitive impairment; indeed, recent functional magnetic resonance imaging evidence [not from a randomized controlled trial (RCT)] suggests reversible domain-specific cognitive defects in hypothyroidism (7). However, the extent of cognitive dysfunction associated with subclinical hypothyroidism is unclear (8). Also unclear is the sensitivity of the adult brain to changes in thyroid function and the potential reversibility of these changes with treatment (9). In addition, cognitive function has been reported as impaired compared with population norms in patients receiving T4 replacement treatment for hypothyroidism (10). Recent evidence from a large (n = 5865) cross-sectional community-based study of subjects aged 65 yr and over measuring cognitive function has, however, shown no association between subclinical thyroid dysfunction and cognition or depression (11), although a smaller study of 1047 United Kingdom subjects aged 64 yr and over did show correlation between TSH values and cognitive performance (12). Detailed reviews of the association between cognitive dysfunction and thyroid dysfunction have come to no definitive conclusions (4, 13).

There have been few previous RCTs of the effects of T4 therapy on cognitive function in subjects with subclinical hypothyroidism, and their results are conflicting [see Biondi and Cooper (4) and Table 1]. Nystrom et al. (14) reported improvement in cognitive function in four of 17 women who completed a double-blind randomized crossover trial. Jaeschke et al. (15) in an RCT of nine men and 28 women with subclinical hypothyroidism reported an improvement in composite memory score with T4 treatment. Samuels et al. (16) recently performed an RCT of subclinical hypothyroidism (induced by reducing the dose of T4 replacement) in 19 females and reported significant reduction in working memory at the end of the subclinical hypothyroidism phase compared with that measured at the end of the euthyroid phase. Lastly, Jorde et al. (17) in the largest RCT reported to date (69 subjects, 37 men and 32 women) performed an extensive battery of cognitive function tests and found no differences with thyroid status, although after secondary analyses, they did report that serum TSH was negatively correlated with the Trail-Making test A, and serum free T4 concentration was positively associated with the word association test (both at <1% significance level).

Improvement of impaired cognitive function in older adults could have an important impact on continued independence and quality of life as well as produce cost savings in terms of the maintenance of independent living. Thus, it is important to determine whether T4 therapy in subjects with subclinical hypothyroidism confers any cognitive benefit. We therefore report on a double-blind RCT of the effects of T4 replacement on cognitive function in elderly subjects (aged 65 yr and over) identified as having subclinical hypothyroidism in a community screening study. To our knowledge, this is the largest study to measure cognitive function in an RCT of T4 replacement (registered as ISRCTN 23090699) to date.

TABLE 1.

RCTs of effects on cognitive function of thyroxine replacement in subclinical hypothyroidism

| Author | Protocol | Cognitive function tests | Number of subjects | Age (yr) | Sampling frame | Entry criteria and TSH values | Results |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Nystrom et al. (14) | Randomized to T4 or placebo (50 μg for 2 wk, 100 μg for 2 wk, 150 μg for 22 wk), then crossed over | Identical forms test, Bingley’s memory test, and reaction time to combined light and sound stimuli | 20 females | 51–73 | Population, Sweden | Subclinical hypothyroidism; mean TSH 7.7 (sd 3.7), range 2.9–16.3 | 4 of the 17 women improved on T4 |
| Jaeschke et al. (15) | Randomized to T4 or placebo (25 μg for 1 month, 50 μg for 1 month, then dose titrated at monthly intervals against TSH to achieve biochemical euthyroidism); total length 40 wk | Composite psychomotor speed: subsuming word fluency, digit symbol substitution, and Trail Making; composite memory score: subsuming logical memory and word learning | 37 (12 females, 6 males in active arm; 16 females, 3 males in placebo arm) | ≥55 (mean 68 in both arms) | Not defined | Subclinical hypothyroidism; active arm (n = 18) mean TSH 12.1 (sd 6.8), range 6–32.4; placebo arm (n = 19) mean TSH 9.4 (sd 3.1), range 6–16.5 | Composite psychomotor speed scores, NS; composite memory score improved in treatment group [mean change in z-score 0.58 (P = 0.01; 95% CI = 0.14–1.03)] |
| Jorde et al. (17) | Randomized to T4 or placebo in increasing doses (50 μg for 6 wk, 100 μg for 6 wk, then dose titrated at 3-monthly intervals according to TSH to achieve euthyroidism); total length of treatment 52 wk | Attention and working memory: digit span forward, digit span backward, Seashore rhythm test; psychomotor/cognitive speed: Trail Making Test A, Stroop test parts 1 and 2, Digit Symbol test; memory: verbal recall, visual recall, word list test; language: controlled word association test; cognitive flexibility/executive function: Stroop test part 3, Trail Making Test B; speed of information processing: California Computerised Assessment Package; intelligence: vocabulary (Wechsler Adult Intelligence Scale); composite cognitive score | 69 (17 females, 19 males in active arm; 15 females, 18 males in placebo arm) | 29–75 (mean 61.6 in T4 arm, 63 in placebo arm) | Population study, Tromso, Norway | Subclinical hypothyroidism; active arm (n = 36) mean TSH 5.81 (sd 1.76); placebo arm (n = 33) mean TSH 5.32 (sd 1.25) | No significant differences in any tests |
| Samuels et al. (16) | Induced subclinical hypothyroidism by reducing dose of T4 by average 45% for a total of 12 wk; randomized to intervention arm or placebo then crossed over at 12 wk; total length of induced biochemical subclinical hypothyroidism 12 wk | Declarative memory: paragraph recall and complex figure test; working memory: N-back test, subject-ordered pointing, and digit span backwards | 19 females | 20–70 | Outpatients | Hypothyroid subjects euthyroid on stable dose of T4 replacement | Significant reduction in working memory when subclinical hypothyroid |

NS, Not significant.

Subjects and Methods

Design overview

This was a double-blind, placebo-controlled RCT of 12 months of titrated T4 replacement to achieve euthyroidism in elderly community-living subjects of both sexes, recruited from a cross-sectional study of the prevalence of thyroid dysfunction.

Setting and participants

Patients with subclinical hypothyroidism were recruited from a community-based cross-sectional study describing the prevalence of thyroid dysfunction in the elderly and defining the association of cognitive function and depression with thyroid function (1, 11). Details of recruitment for the prevalence study are as previously published (1), but in brief, all subjects were aged 65 yr or over and were registered with one of 20 family practices in the greater Birmingham area of the United Kingdom. Subjects on T4 therapy or antithyroid medications or those who had had recent treatment for hyperthyroidism were excluded from the prevalence study, as were those unable to provide informed consent. The overall prevalence of subclinical hypothyroidism (defined as a TSH concentration >5.5 mU/liter with a free T4 concentration of 9–20 pmol/liter) in this population was 2.9%. Subjects identified as having subclinical hypothyroidism over the 2-yr period of recruitment into the prevalence study were invited to enter the RCT.

Randomization and interventions

Eligible subjects were randomly allocated to receive 12 months of either T4 treatment or placebo using a blocked randomization list (block size = 10) produced in advance and held by A.R. Eligible subjects were randomized, and prescriptions were prepared. Subjects were then invited to attend the trial clinic at which consent was obtained. The prescribed medication (T4 or placebo) was then dispensed immediately at the clinic visit to avoid the inconvenience of re-attendance for the medication to be supplied. Subjects had thyroid function tests (TSH and free T4) performed at 8-weekly intervals.
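As an illustration only, a blocked randomization list with block size 10 can be sketched as a sequence of permuted blocks. The equal 5:5 split within each block, the helper name `blocked_randomization_list`, and the fixed seed are assumptions made for this sketch; the trial report states only that a blocked list with block size 10 was produced in advance.

```python
import random


def blocked_randomization_list(n_blocks, block_size=10, arms=("T4", "placebo"), seed=0):
    """Return a permuted-block allocation list (hypothetical helper).

    Assumption: equal allocation to each arm within every block,
    which the trial report does not state explicitly.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)  # fixed seed so the same list can be regenerated
    per_arm = block_size // len(arms)
    allocations = []
    for _ in range(n_blocks):
        block = [arm for arm in arms for _ in range(per_arm)]
        rng.shuffle(block)  # random order of allocations within the block
        allocations.extend(block)
    return allocations


# 15 blocks of 10 give 150 allocations, enough for the 147 eligible subjects.
allocation_list = blocked_randomization_list(n_blocks=15)
```

Because subjects in this trial were randomized before consent was obtained, balance in such a list need not carry over to the groups that finally consented (52 T4 vs. 42 placebo).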

After each blood test, the results were reviewed by a clinician blinded to the patient randomization. The number of tablets (25 μg of T4 or placebo) per day was adjusted in one-tablet increments (the initial dosage being one tablet of placebo or 25 μg of T4 per day) with the aim of achieving and then maintaining a serum TSH within the reference range (0.4–5.5 mU/liter) for subjects in the T4 treatment arm. Subjects receiving placebo had the dose of placebo adjusted in the same way (i.e. the number of tablets was increased or reduced depending on thyroid function tests, but the patient received inactive tablets) to maintain double blinding. In this way, we aimed to achieve a similar experience of being prescribed single or multiple tablets in both the treatment and the placebo groups. All subjects received a new supply of T4 or placebo along with new dose instructions after every test. The patients and all those who had direct contact with them, including those who performed cognitive function testing, were blinded to treatment allocation.
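The titration step described above can be sketched as a simple rule applied at each 8-weekly review. This is a hypothetical reading, not the trial's documented algorithm: the exact decision rule (one tablet up when TSH is above the reference range, one tablet down when it is below), the floor of zero tablets, and the function names are all assumptions.

```python
TSH_LOWER = 0.4  # mU/liter, lower reference limit quoted in the text
TSH_UPPER = 5.5  # mU/liter, upper reference limit quoted in the text


def next_tablet_count(current_tablets, tsh, min_tablets=0):
    """Return tablets per day after one review (hypothetical rule).

    Assumption: one tablet more if TSH is above the reference range,
    one tablet fewer if it is below, unchanged otherwise.
    The min_tablets floor is also an assumption.
    """
    if tsh > TSH_UPPER:
        return current_tablets + 1
    if tsh < TSH_LOWER:
        return max(min_tablets, current_tablets - 1)
    return current_tablets


def daily_t4_dose_micrograms(tablets, tablet_strength=25):
    """Convert tablets per day to micrograms of T4 (25-microgram tablets)."""
    return tablets * tablet_strength
```

Under these assumptions, a subject starting on one tablet with a TSH of 6.6 mU/liter would move to two tablets (50 μg/d in the T4 arm); the same tablet count would be dispensed as inactive tablets in the placebo arm to preserve blinding.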

Outcomes and follow-up

All patients had cognitive function tests performed at entry to the trial and at 6 and 12 months after entry (aiming for a minimum of 6 months of biochemically euthyroid state for those who were receiving T4 therapy). Any subjects receiving placebo and whose thyroid function tests on two consecutive occasions showed overt hypothyroidism (defined as TSH concentration above the upper limit of the reference range with a free T4 concentration below the lower limit) were removed from the trial and referred to their family doctor for consideration of treatment with T4.
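The biochemical definitions used in this trial can be expressed as a small classifier, shown here as a minimal sketch. The thresholds are the reference ranges quoted in the text (TSH 0.4–5.5 mU/liter, free T4 9–20 pmol/liter); the function names, the "euthyroid" rule requiring both values in range, and the catch-all "other" label are assumptions for illustration.

```python
TSH_RANGE = (0.4, 5.5)        # mU/liter, reference range quoted in the text
FREE_T4_RANGE = (9.0, 20.0)   # pmol/liter, reference range quoted in the text

OVERT = "overt hypothyroidism"
SUBCLINICAL = "subclinical hypothyroidism"


def classify_thyroid_status(tsh, free_t4):
    """Classify one pair of results using the trial's stated definitions."""
    tsh_low, tsh_high = TSH_RANGE
    ft4_low, ft4_high = FREE_T4_RANGE
    if tsh > tsh_high and ft4_low <= free_t4 <= ft4_high:
        return SUBCLINICAL
    if tsh > tsh_high and free_t4 < ft4_low:
        return OVERT
    if tsh_low <= tsh <= tsh_high and ft4_low <= free_t4 <= ft4_high:
        return "euthyroid"  # assumption: both values within range
    return "other"          # assumption: everything else is left unlabeled


def meets_withdrawal_rule(results):
    """True if two consecutive (tsh, free_t4) results show overt hypothyroidism.

    In the trial this rule applied to subjects receiving placebo.
    """
    statuses = [classify_thyroid_status(tsh, ft4) for tsh, ft4 in results]
    return any(a == b == OVERT for a, b in zip(statuses, statuses[1:]))
```

For example, a TSH of 6.6 mU/liter with a free T4 of 12.9 pmol/liter is classed as subclinical hypothyroidism, whereas two successive results with raised TSH and free T4 below 9 pmol/liter would trigger the withdrawal rule.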

Cognitive function tests performed

Trained research nurses administered cognitive function tests, using alternative forms for Middlesex Elderly Assessment of Mental State (MEAMS), Trail Making, and Speed and Capacity of Language Processing test (SCOLP) on retesting to reduce any learning effect. For all subjects, the following tests were completed: 1) Folstein Mini-Mental State Examination (MMSE) (18), which is widely used to determine cognitive status in clinical and research settings; 2) MEAMS (19), which was developed as a screening test to detect gross impairment of specific cognitive skills in elderly persons and systematically surveys the major areas of cognitive performance; 3) SCOLP (22), a measure of speed of processing, accounting for premorbid intelligence; and 4) Trail-Making Test (Parts A and B), a well-established psychomotor test of executive function. Longer times to complete the test indicate greater cognitive dysfunction; the difference score, calculated as the difference in times taken to complete the simpler Part A and more complex Part B (Part B − Part A), can be used to control for the effect of motor speed on performance to give a more accurate measure of executive function than the performance on either part alone (23). In the younger elderly (aged 65–75 yr), average times are 41 sec for Part A and 104 sec for Part B (24), whereas for those aged 76–85 yr, they are 60 and 153 sec, respectively (25). Aspects covered by MEAMS include orientation, learning, memory, numeracy, perception, attention, and language skills. Both MMSE and MEAMS comprise a range of tasks that all elderly persons without cognitive impairment should be able to complete, regardless of intelligence. MMSE provides a score of 0–30 (20, 21). Subtests in MEAMS can be used alone or a combined score produced (range 0–12). In both tests, higher scores indicate less dysfunction. For SCOLP, a higher discrepancy score indicates greater cognitive dysfunction, with a score of 4 or more indicating a severe problem.
The Hospital Anxiety and Depression Scale (HADS) (26) was also completed, because depression is potentially confounded with cognitive dysfunction (27); scores range from 0 to 21, with a score of 11 or above indicating caseness, that is, a probable diagnosis of depression.
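Two of the scoring rules described above are simple enough to sketch in code: the Trail-Making difference score (Part B − Part A) and the HADS caseness threshold of 11. The function names are hypothetical and the range check is an added safeguard, not part of the published scoring instructions.

```python
def trail_making_difference(part_a_seconds, part_b_seconds):
    """Difference score (Part B minus Part A), in seconds.

    Larger values suggest poorer executive function after
    allowing for motor speed.
    """
    return part_b_seconds - part_a_seconds


def hads_depression_case(score):
    """True if a HADS depression score (0 to 21) meets the caseness threshold of 11."""
    if not 0 <= score <= 21:
        raise ValueError("HADS depression score must be between 0 and 21")
    return score >= 11
```

Applied to the average times quoted for those aged 76–85 yr (60 sec for Part A and 153 sec for Part B), the difference score is 93 sec.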

Statistical analysis and power calculations

Of the 168 subjects identified with subclinical hypothyroidism in the cross-sectional study of thyroid function and cognition (11), 147 were eligible for recruitment into the RCT. Of these, 94 subjects (64% of those eligible) consented to enter the RCT (see Consort flow chart, Fig. 1). Power calculations performed before the prevalence study indicated that 400 patients randomized to T4 or placebo would be sufficient to detect a 0.75-U difference in MMSE (sd = 2.2) (28) and a 1-U difference in HADS (sd = 3.07) (29) with 90% power at the 5% significance level. With the 94 subjects recruited, we retained sufficient power (90%) to identify a 1.5-U difference in MMSE (sd = 2.0) and a 2-U difference in HADS (sd = 2.5).
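The sample sizes quoted above can be checked approximately with the standard normal-approximation formula for comparing two means, n per group = 2(z_{1−α/2} + z_{power})² sd² / δ². The sketch below is an assumption about the type of calculation involved; the paper does not state which formula or software was used, and the function name `n_per_group` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sd, power=0.90, alpha=0.05):
    """Approximate subjects per arm for a two-sided, two-sample comparison of means.

    Normal approximation with equal group sizes (an assumption).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_power = z.inv_cdf(power)          # quantile corresponding to the target power
    return ceil(2 * ((z_alpha + z_power) * sd / delta) ** 2)


planned = {"MMSE": n_per_group(0.75, 2.2), "HADS": n_per_group(1.0, 3.07)}
achieved = {"MMSE": n_per_group(1.5, 2.0), "HADS": n_per_group(2.0, 2.5)}
```

Under this approximation, the original targets need about 181 (MMSE) and 199 (HADS) subjects per group, close to the 400 in total quoted above, whereas the larger differences need about 38 and 33 per group, fewer than the 94 subjects recruited.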

Fig. 1.

Consort flow chart of recruitment and treatment allocation.

Data were analyzed on an intention-to-treat basis, and outcome data were obtained for 82 of 94 subjects at 6 months and 85 of 94 subjects at 12 months. Baseline characteristics were compared across the groups, including age, gender, and deprivation. Deprivation was measured by the Index of Multiple Deprivation 2004 (30), a composite proxy measure of deprivation based on each patient’s place of residence (postcode/zip code). Cognitive function tests (MEAMS, MMSE, SCOLP, and Trail Making A and B) and depression score (HADS) were compared between the intervention and placebo groups at the 6- and 12-month follow-up using linear mixed modeling. Treatment alone and in combination with time were considered as fixed effects, with baseline measurement as a covariate, time (6 and 12 months) as a repeated factor, and participants as the random factor. Statistical analyses were performed using SPSS software, version 14.0 (SPSS Inc., Chicago, IL) and SAS version 9.1 (SAS Institute Inc., Cary, NC).
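One way to write down a model of the kind described above is given below, as a sketch rather than the exact specification fitted. Here Y_ij is the score of subject i at follow-up j (6 or 12 months), B_i the baseline score, G_i the treatment group (T4 or placebo), and T_j the time point. The main effect of time and the simple random-intercept structure are assumptions; the text specifies only treatment and treatment-by-time as fixed effects, baseline as a covariate, time as a repeated factor, and participant as the random factor.

```latex
\[
Y_{ij} = \beta_0 + \beta_1 B_i + \beta_2 G_i + \beta_3 T_j + \beta_4 \,(G_i \times T_j) + u_i + \varepsilon_{ij},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2), \quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
\]
```

In this formulation, β_2 captures the between-group difference in follow-up scores adjusted for baseline, and β_4 captures whether that difference changes between 6 and 12 months.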

Biochemical assessment of thyroid function

Serum free T4 and TSH were measured by chemiluminescent immunoassay (Advia Centaur; Siemens Healthcare Diagnostics UK, Camberley, Surrey, UK). The laboratory reference range for free T4 was 9.0–20.0 pmol/liter with an interassay coefficient of variation of 8.2–9.8% over the range 8.2–54.9 pmol/liter. Serum TSH had a reference range of 0.4–5.5 mU/liter with an interassay coefficient of variation of 4.4–10.9% over the range 0.41–24.5 mU/liter. The lower limit of reporting for the TSH assay was 0.1 mU/liter, with a mean functional sensitivity of 0.02 mU/liter as quoted by the manufacturer. Serum free T4 and TSH concentrations were determined in all subjects on each occasion. The reference ranges employed for free T4 and TSH were those recommended by the manufacturer and used in other studies of this cohort and studies from our unit (1, 11, 31).

Results

Table 2 presents baseline characteristics by randomization arm and also for those patients declining to participate in the trial. A total of 94 subjects (57 females and 37 males) consented to recruitment into the trial. Differential consent rates between those randomized to treatment or placebo resulted in 52 subjects (mean age 73.5 yr, range 65–94 yr) receiving T4 replacement and 42 subjects (mean age 74.2 yr, range 66–84 yr) receiving placebo. No substantial baseline differences were observed between the randomized groups. No significant differences were found between those willing and those refusing to take part in the trial: mean difference in age 0.6 yr [95% confidence interval (CI) = −1.4 to 2.7], HADS (depression) 0.4 (−0.4 to 1.4), MEAMS −0.2 (−0.5 to 0.1), and MMSE −0.4 (−1.1 to 0.3).

TABLE 2.

Baseline characteristics

| Characteristic | Declined to participate (n = 53) | Randomized: T4 (n = 52) | Randomized: placebo (n = 42) |
| --- | --- | --- | --- |
| Demographics | | | |
| Age (yr), mean (sd) | 74.4 (6.3) | 73.5 (6.2) | 74.2 (5.2) |
| Sex, male, n (%) | 17 (32) | 21 (40) | 16 (38) |
| IMD 2004, mean (sd) | 19.5 (10.9) | 18.1 (11.7) | 23.4 (13.6) |
| Smoker, n (%) | 3 (6)^a | 1 (2) | 2 (5) |
| Alcohol, n (%) | 28 (55)^a | 33 (63) | 21 (50) |
| Systolic blood pressure (mm Hg), mean (sd) | 144.4 (20.6)^a | 147.8 (17.0) | 142.1 (15.8) |
| Diastolic blood pressure (mm Hg), mean (sd) | 77.5 (11.9)^a | 79.9 (10.7) | 79.4 (9.4) |
| TSH, median (IQR) | 7.1 (6.2–9.7) | 6.6 (6.0–8.5) | 6.6 (5.9–8.3) |
| Free T4, median (IQR) | 12.8 (11.5–13.5) | 12.9 (11.7–13.7) | 12.4 (11.4–13.2) |
| Major medical diagnoses, n (%) | | | |
| Cancer | 3 (6) | 1 (2) | 2 (5) |
| Endocrine disease | 3 (6) | 4 (8) | 1 (2) |
| Gastrointestinal disease | 2 (4) | 1 (2) | 0 (0) |
| Hypertension | 23 (43) | 24 (46) | 17 (40) |
| Neurological disease | 2 (4) | 1 (2) | 1 (2) |
| Psychiatric disease | 5 (9) | 4 (8) | 3 (7) |
| Pulmonary disease | 1 (2) | 6 (12) | 5 (12) |
| Renal disease | 1 (2) | 0 (0) | 1 (2) |
| Rheumatic disease | 3 (6) | 0 (0) | 1 (2) |
| Vascular disease | 7 (13) | 4 (8) | 3 (7) |
| Depression and cognitive function tests, mean (sd) | | | |
| HADS depression score | 3.6 (2.9) | 3.4 (2.6) | 3.0 (2.5) |
| MEAMS | 11.3 (1.2) | 11.7 (0.5)^a | 11.3 (1.1) |
| MMSE | 27.8 (2.4) | 28.2 (2.0)^a | 28.2 (2.0) |
| SCOLP | NA | 1.7 (2.6) | 0.3 (3.5) |
| Trail Making A | NA | 53.3 (51.5) | 48.1 (18.5) |
| Trail Making B | NA | 121.6 (109.6)^a | 134.6 (133.8) |
| Trail Making B−A | NA | 68.4 (95.8)^a | 86.5 (123.3) |

Figures are mean (sd) unless stated otherwise. Alcohol and smoking are self-reported, and blood pressure was measured using standard equipment. IQR, Interquartile range; NA, not available.

^a Based on 51 respondents.

Of the 94 patients randomized, 72 (77%) completed treatment (Fig. 1). Cognitive function was assessed in 82 participants (87%) at 6 months and in 85 (90%) at 12 months; of those completing the 12-month cognitive function tests, five (6%) did not complete a 6-month test. Changes in thyroid function test results over time for each group are shown in Table 3. The median dose of T4 in the group receiving T4 replacement while biochemically euthyroid was 50 μg/d (interquartile range, 50–75 μg/d).

TABLE 3.

Thyroid function over time

| Group | Baseline TSH [median (IQR), range] | Baseline free T4 [median (IQR), range] | 6 months TSH [median (IQR), range] | 6 months free T4 [median (IQR), range] | 6 months proportion in euthyroid range | 12 months TSH [median (IQR), range] | 12 months free T4 [median (IQR), range] | 12 months proportion in euthyroid range |
|---|---|---|---|---|---|---|---|---|
| T4 | 6.6 (6–8.5), 5.6–28.9 | 12.9 (11.7–13.7), 9.4–16.8 | 4.0 (2.7–4.6), 0.8–20.6 | 15.4 (14.9–17.4), 9.5–19.4 | 82.2% | 3.7 (2.8–4.9), 0.2–6.9 | 16.2 (14.2–17.3), 12.8–24.8 | 84.4% |
| Placebo | 6.65 (5.9–8.3), 5.6–20.5 | 12.45 (11.4–13.2), 9.6–16.7 | 6.4 (5.0–8.5), 1.2–19.0 | 12.5 (11.2–14.2), 9.6–21.1 | 34.5% | 5.45 (3.9–9.2), 0.9–17.3 | 12.85 (11.4–14.4), 9.7–22.2 | 50.0% |

Significant difference in TSH level between the placebo and T4 groups at both 6 and 12 months (Mann-Whitney U test z = 5.1, P < 0.0001; z = 3.8, P = 0.0002).

Depression scores did not change significantly over time in either group, and there were no significant between-group differences (Table 4). The between-group difference in adjusted mean depression score was 0.28 (95% CI, −0.56 to 1.11) at 6 months and 0.18 (95% CI, −0.64 to 1.00) at 12 months.

TABLE 4.

Depression and cognitive function scores over time

| Score | Group | Baseline, mean (se) | 6 months: n | 6 months: mean (se) | 6 months: adjusted(a) mean (se) | 6 months: group difference between adjusted means (95% CI) | 12 months: n | 12 months: mean (se) | 12 months: adjusted(a) mean (se) | 12 months: group difference between adjusted means (95% CI) | P value(b) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| HADS | T4 | 3.38 (0.37) | 49 | 3.92 (0.40) | 3.78 (0.27) | 0.28 (−0.56–1.11) | 49 | 3.61 (0.36) | 3.55 (0.27) | 0.18 (−0.64–1.00) | 0.82 |
| | Placebo | 2.88 (0.45) | 33 | 3.18 (0.43) | 3.50 (0.32) | | 36 | 3.31 (0.47) | 3.37 (0.31) | | |
| MEAMS | T4 | 11.72 (0.13) | 46 | 11.67 (0.07) | 11.56 (0.09) | −0.02 (−0.32–0.27) | 46 | 11.78 (0.07) | 11.67 (0.09) | 0.07 (−0.21–0.36) | 0.57 |
| | Placebo | 11.21 (0.16) | 32 | 11.47 (0.16) | 11.58 (0.11) | | 36 | 11.44 (0.20) | 11.60 (0.11) | | |
| MMSE | T4 | 28.26 (0.30) | 48 | 28.90 (0.19) | 29.00 (0.38) | 1.19 (0.01–2.36) | 46 | 28.28 (0.29) | 28.24 (0.38) | 0.03 (−1.12–1.17) | 0.18 |
| | Placebo | 28.17 (0.36) | 33 | 27.82 (0.91) | 27.82 (0.45) | | 36 | 28.25 (0.37) | 28.22 (0.43) | | |
| SCOLP | T4 | 1.91 (0.46) | 49 | 1.43 (0.38) | 1.04 (0.30) | 0.21 (−0.71–1.14) | 49 | 1.69 (0.43) | 1.29 (0.30) | −0.44 (0.46–1.36) | 0.59 |
| | Placebo | 0.71 (0.57) | 33 | 0.39 (0.54) | 0.82 (0.36) | | 36 | 0.22 (0.58) | 0.84 (0.35) | | |
| Trail Making A | T4 | 45.72 (2.32) | 49 | 47.65 (3.28) | 46.54 (2.62) | −3.33 (−11.39–4.73) | 48 | 44.52 (2.62) | 45.33 (2.63) | −1.44 (−9.42–6.53) | 0.52 |
| | Placebo | 50.29 (2.81) | 33 | 49.00 (4.82) | 49.87 (3.11) | | 36 | 46.97 (3.55) | 46.78 (3.05) | | |
| Trail Making B | T4 | 110.57 (15.89) | 49 | 106.61 (8.73) | 104.24 (7.71) | −14.00 (−37.83–9.82) | 48 | 96.67 (9.62) | 100.65 (7.75) | −13.46 (−37.15–10.22) | 0.95 |
| | Placebo | 131.46 (19.72) | 32 | 119.13 (18.56) | 118.24 (9.21) | | 34 | 108.38 (14.12) | 114.11 (9.07) | | |
| Trail Making B−A | T4 | 65.76 (14.81) | 49 | 58.96 (6.42) | 57.97 (6.75) | −11.01 (−31.98–9.96) | 48 | 52.15 (6.53) | 54.55 (6.80) | −12.72 (−33.50–8.06) | 0.86 |
| | Placebo | 82.63 (18.38) | 32 | 70.31 (14.21) | 68.97 (8.14) | | 34 | 63.76 (11.72) | 67.27 (7.97) | | |

a Adjusted by baseline score.

b Group by time interaction.

There were no significant changes in any of the measures of cognitive function over time. Inclusion of depression as a covariate did not alter these findings. Results remained the same when smoking status was included as a covariate in the analyses.

The differences in mean change between the groups are presented in Table 4. Restricting the analysis to those subjects in the T4 group who had achieved at least 6 months of euthyroidism by the time of the 12-month cognitive function test did not alter these findings [i.e. 37 of 45 patients (82.2%) with available data in the T4 group were euthyroid at 6 months; no significant differences were found between these T4 patients and the placebo group for any of the outcomes at 12 months]. Furthermore, multiple imputation of missing data demonstrated the robustness of these findings.
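As an arithmetic check on Table 4, each group difference should equal the T4 adjusted mean minus the placebo adjusted mean, up to rounding of the published two-decimal values. The sketch below is illustrative only and is not the trial's analysis code: it copies five 6-month rows from Table 4, and the names `ADJUSTED_MEANS_6M`, `group_difference`, and `check_table` are assumptions introduced here.

```python
# Adjusted means (T4, placebo) and reported group difference at 6 months,
# copied from Table 4. All names are illustrative, not from the trial's code.
ADJUSTED_MEANS_6M = {
    "HADS": (3.78, 3.50, 0.28),
    "MEAMS": (11.56, 11.58, -0.02),
    "MMSE": (29.00, 27.82, 1.19),
    "Trail Making A": (46.54, 49.87, -3.33),
    "Trail Making B": (104.24, 118.24, -14.00),
}


def group_difference(t4_mean: float, placebo_mean: float) -> float:
    """Return the T4-minus-placebo difference between adjusted means."""
    return round(t4_mean - placebo_mean, 2)


def check_table(rows: dict, tolerance: float = 0.015) -> dict:
    """Map each score to (recomputed difference, agrees with reported value)."""
    result = {}
    for score, (t4_mean, placebo_mean, reported) in rows.items():
        diff = group_difference(t4_mean, placebo_mean)
        # Allow for rounding of the published two-decimal means
        result[score] = (diff, abs(diff - reported) <= tolerance)
    return result


if __name__ == "__main__":
    for score, (diff, ok) in check_table(ADJUSTED_MEANS_6M).items():
        print(f"{score}: recomputed difference {diff:+.2f}, matches table: {ok}")
```

A discrepancy of 0.01 between a recomputed and a published value, as for MMSE, is what one would expect if the published difference was calculated from unrounded adjusted means.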

Discussion

In this double-blind RCT of T4 replacement therapy in subjects with subclinical hypothyroidism aged 65 yr and over, we have shown no significant improvement in tests of cognitive function in those receiving T4 replacement compared with those receiving placebo. The majority (82%) of subjects receiving T4 replacement were biochemically euthyroid by the time of the first cognitive function test at 6 months, and 84% were euthyroid by the time of their 12-month test. We have therefore shown no improvement in cognitive function either in the short term, i.e. when euthyroidism is achieved, or after a substantial period of euthyroidism.

There have been few previous RCTs of T4 replacement in subclinical hypothyroidism and its effect on cognitive function. Previous studies have recruited primarily from specialist or unspecified populations (15, 16) and have been based on smaller samples (14–16). Some have been restricted to females (14, 16), or the length of T4 treatment has been restricted to 12 wk (16). Our present study recruited an unselected sample from the community and included both men and women in a proportion similar to the community at large [39% of our subjects were male, compared with 54% males in Jorde’s study (17) and 24% males in Jaeschke’s study (15)]. We provided T4 replacement for a substantial period [52 wk, as did Jorde et al. (17); all the other RCTs (14–16) had shorter periods of T4 therapy], with a large majority of those treated with T4 in our study achieving euthyroidism. T4 dosage was titrated (i.e. modified according to repeated thyroid function tests) in our study and in those by Jaeschke et al. (15) and Jorde et al. (17) but not in those by Nystrom et al. (14) and Samuels et al. (16). This is also the largest RCT of T4 replacement in subclinical hypothyroidism in the literature to date, being 36% larger than the previous largest study (17), and our population was also significantly older than that reported by Jorde et al. (17) (mean age in our study 73.8 yr, sd 5.8 yr; mean age in the Jorde study 62.3 yr, sd 11.9 yr).
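The size and age comparisons in the paragraph above can be reproduced from the quoted summary figures. This is a rough sketch, not the authors' analysis: it assumes that the quoted means and standard deviations describe all 94 subjects in this trial and all 69 subjects in the study by Jorde et al. (17), and it uses a Welch-type statistic with a normal approximation for the P value. The function names are introduced here for illustration.

```python
from math import sqrt
from statistics import NormalDist


def relative_size_increase(n_new: int, n_old: int) -> float:
    """Percentage by which n_new exceeds n_old."""
    return 100.0 * (n_new - n_old) / n_old


def welch_z(mean1, sd1, n1, mean2, sd2, n2):
    """Welch-type statistic from summary data, with a two-sided normal-approximation P."""
    se = sqrt(sd1**2 / n1 + sd2**2 / n2)  # unequal-variance standard error
    z = (mean1 - mean2) / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p


if __name__ == "__main__":
    # 94 subjects in this trial versus 69 in Jorde et al. (17)
    print(f"Size increase: {relative_size_increase(94, 69):.0f}%")
    # Mean age 73.8 yr (sd 5.8) versus 62.3 yr (sd 11.9); group sizes are an assumption
    z, p = welch_z(73.8, 5.8, 94, 62.3, 11.9, 69)
    print(f"Age difference statistic: {z:.1f}, two-sided P = {p:.2g}")
```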

Three previous small RCTs (14–16) reported marginal differences in cognitive function in the T4-treated group compared with controls, but our study of 94 subjects did not. Our findings are consistent with Jorde’s study (17) of 69 subjects, in which no differences in cognitive function were found after an extensive battery of cognitive function tests.

In a study of induced subclinical hypothyroidism secondary to a T4 dose reduction, Samuels et al. (16) reported a reduction in working memory function during the time subjects were subclinically hypothyroid compared with when they were euthyroid, in contrast to our findings. This difference may reflect a chance finding, or the induction of subclinical hypothyroidism in subjects already on replacement therapy for overt hypothyroidism in that study, as opposed to the correction of subclinical hypothyroidism in the other reported RCTs, including ours.

We defined hypothyroidism according to our laboratory’s reference ranges and aimed for the achievement of biochemical euthyroidism using the same ranges; as Biondi and Cooper (4) recently commented, “there is no consensus on the thyroid hormone and thyrotropin cut-off values at which treatment should be contemplated.”

Limitations of the study

Of the 147 potential participants, 53 declined to take part, but there were no significant differences in general characteristics or in baseline measures of MMSE, MEAMS, or HADS between those who agreed to take part and those who refused. Another 15 individuals could not be entered into the trial because they were identified too late in the screening study for 12 months of trial participation to have been supported. Only one of our subjects demonstrated cognitive dysfunction before the study, with a MEAMS score of 6 (mean MEAMS scores being 11.72 and 11.21 at baseline in the T4 and placebo groups, respectively; a score of 10 and upward is in the normal range), although even within the normal population reference ranges, there could have been improvement if T4 replacement was effective in improving cognitive function as distinct from cognitive dysfunction. It is possible that those who declined to participate had a greater level of dysfunction than participants that was not assessed by baseline testing and that use of a healthier group has reduced the potential to demonstrate improvement, although in the light of similar baseline scores in MMSE and MEAMS, any such difference would have to be subtle or highly localized. It is also possible that use of more subtle tests of higher executive function [see Burgess et al. (32)] may have revealed lesser differences in cognitive function than MMSE, MEAMS, SCOLP, or Trail Making, although the clinical significance of very minor deficiencies in higher cognitive functioning is unclear.

Patients eligible for entry into RCTs are typically randomized after consent is obtained. However, in this study, due to logistical (prescribing) issues, eligible patients’ names were randomized before consent was sought. This resulted in an imbalance between the group sizes due to different consent rates between those randomized to placebo and those randomized to treatment. There was also a lower than expected prevalence of subclinical hypothyroidism in the study from which the subjects for the RCT were recruited, so the number of subjects treated was less than anticipated, although we retained sufficient power (90%) to identify a 1.5-U difference in MMSE (sd = 2.0), a 0.6-U difference in MEAMS (sd = 0.8), and a 2-U difference in HADS (sd = 2.5).

Stable patients retaking the MMSE on a second occasion improve their score by an average of 0.9 U (18, 33), so clinically important differences must be above this level. The minimally important clinical difference (MICD) in elderly patients with a transitory cognitive deficit (delirium) has been calculated as a negative change of 2 U or a positive change of 3 U (33). For a sample of elderly patients being tested with a longer time interval (1.5 yr), the MICD has been estimated to be between 2 and 4 U (21). One study has shown that the MICD on the HADS for patients with depression associated with chronic obstructive pulmonary disease is 1.68 U (34). No previous research has calculated the MICD for the MEAMS, but a 1-U difference would indicate failing one subtest, which would be clinically significant (19).

Among those taking placebo, 50% normalized their TSH values at either 6 or 12 months; however, no significant differences in cognitive function tests were found within the placebo group between those who normalized their TSH and those who did not at either 6 or 12 months of follow-up. The prescribing strategy adopted for the two groups was devised to be logistically feasible and to use a single algorithm, to reduce human error. Because many patients in the T4 group stabilized on low doses of T4, the number of changes of dosage was marginally greater in the placebo group (Mann-Whitney U test; z = 1.9; P = 0.06). However, given that patients were aware that different dose tablets were being used, we consider that such differences are unlikely to have influenced subjects’ responses and are of limited relevance.
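The power figures quoted earlier in this section (for example, a 1.5-U difference in MMSE with sd = 2.0) can be approximated with a standard two-sample normal calculation. The following is a minimal sketch under stated assumptions, not the authors' original calculation: it assumes a two-sided 5% significance level, equal standard deviations in both arms, and the randomized group sizes of 52 and 42. The name `two_sample_power` is introduced here for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(delta: float, sd: float, n1: int, n2: int,
                     alpha: float = 0.05) -> float:
    """Approximate power of a two-sided two-sample z test for a mean difference delta."""
    norm = NormalDist()
    se = sd * sqrt(1.0 / n1 + 1.0 / n2)       # standard error of the difference
    z_crit = norm.inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    shift = abs(delta) / se                   # standardized true difference
    # Probability of rejecting in either tail when the true difference is delta
    return (1.0 - norm.cdf(z_crit - shift)) + norm.cdf(-z_crit - shift)


if __name__ == "__main__":
    # (target difference, assumed sd) as quoted in the text
    targets = {"MMSE": (1.5, 2.0), "MEAMS": (0.6, 0.8), "HADS": (2.0, 2.5)}
    for score, (delta, sd) in targets.items():
        power = two_sample_power(delta, sd, n1=52, n2=42)
        print(f"{score}: approximate power {power:.2f}")
```

Under these assumptions the approximation gives power above 90% for all three target differences, in line with the figure quoted in the text. Using the smaller numbers of subjects actually assessed at follow-up would give lower values.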

Recruitment into the study was based on a single thyroid function test; as is well known, thyroid function tests vary, and many return to normal from outside the reference range (35). There is therefore a possibility that ineligible euthyroid patients may have been recruited, increasing the chance of a type 2 error and thus reducing the power of the study to detect an effect on cognitive function.

Conclusion

Our previous cross-sectional survey showed no association between serum TSH concentration and cognitive function (11), an important finding that when combined with the presently reported lack of effect of T4 treatment of subclinical hypothyroidism on cognitive function suggests that the case for treating subjects with subclinical hypothyroidism with replacement therapy to improve their cognitive state should not be accepted. Our results therefore militate against screening for thyroid dysfunction in community-based elderly subjects if the aim of such screening is to identify patients to treat to improve cognitive function. Finally, the potential adverse consequences of T4 replacement need consideration: there is some evidence that untreated elevated TSH is associated with longevity (36, 37) (although the mechanism is unclear) and that subclinical hyperthyroidism (which can be due to T4 therapy) may be associated with increased mortality (38). Thus, careful consideration is indicated before treating subclinical hypothyroidism, an essentially biochemical abnormality, by instituting T4 hormone replacement therapy, when such treatment may also be harmful.

Acknowledgments

We acknowledge E. Kidney of the School of Health and Population Sciences, College of Medical and Dental Sciences, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK, who assisted with data collection, and P. Clark, consultant clinical scientist at the University Hospital Birmingham National Health Service Foundation Trust, Edgbaston, Birmingham, UK, who led the laboratory analyses.

The research costs for the RCT were funded by the Health Foundation (http://www.health.org.uk/) and the service support costs from the Support for Science allocations to the Midlands Research Practices Consortium (MidReC). Both T4 and identical placebo tablets were provided at no cost by Goldshield Pharmaceuticals (Croydon, Surrey, UK). The funding sources had no direct influence on the design and conduct of the study; collection, management, analysis, and interpretation of the data; or preparation, review, or approval of the manuscript. Trial support was provided by the Primary Care Clinical Research and Trials Unit. S.W. was funded by a Department of Health Career Scientist Award during the period of the study.

All subjects gave written informed consent for the study, which was approved by the multicenter research ethics committee of Scotland (MREC/01/0/24). This RCT is registered as ISRCTN 23090699.

Disclosure Summary: The authors declare there are no conflicts of interest in relation to this study.

For editorial see page 3611

Abbreviations: CI, Confidence interval; HADS, Hospital Anxiety and Depression Scale; MEAMS, Middlesex Elderly Assessment of Mental State; MICD, minimally important clinical difference; MMSE, Mini-Mental State Examination; RCT, randomized controlled trial; SCOLP, Speed and Capacity of Language Processing test.

1. Wilson S, Parle J, Roberts L, Roalfe A, Hobbs FD, Clark P, Sheppard MC, Gammage M, Pattison HM, Franklyn JA; Birmingham Elderly Thyroid Study Team 2006 Prevalence of subclinical thyroid dysfunction and its relation to socioeconomic deprivation in the elderly: a community-based cross-sectional survey. J Clin Endocrinol Metab 91:4809–4816

2. Canaris GJ, Manowitz NR, Mayor G, Ridgway EC 2000 The Colorado thyroid disease prevalence study. Arch Intern Med 160:526–534

3. Villar HC, Saconato H, Valente O, Atallah AN 2007 Thyroid hormone replacement for subclinical hypothyroidism. Cochrane Database Syst Rev 3:CD003419

4. Biondi B, Cooper DS 2008 The clinical significance of subclinical thyroid dysfunction. Endocr Rev 29:76–131

5. Pham CB, Shaughnessy AF 2008 Should we treat subclinical hypothyroidism? BMJ 337:a834

6. Graham JE, Rockwood K, Beattie BL, Eastwood R, Gauthier S, Tuokko H, McDowell I 1997 Prevalence and severity of cognitive impairment with and without dementia in an elderly population. Lancet 349:1793–1796

7. Zhu DF, Wang ZX, Zhang DR, Pan ZL, He S, Hu XP, Chen XC, Zhou JN 2006 fMRI revealed neural substrate for reversible working memory dysfunction in subclinical hypothyroidism. Brain 129:2923–2930

8. Baldini IM, Vita A, Mauri MC, Amodei V, Carrisi M, Bravin S, Cantalamessa L 1997 Psychopathological and cognitive features in subclinical hypothyroidism. Prog Neuropsychopharmacol Biol Psychiatry 21:925–935

9. Dugbartey AT 1998 Neurocognitive aspects of hypothyroidism. Arch Intern Med 158:1413–1418

10. Wekking EM, Appelhof BC, Fliers E, Schene AH, Huyser J, Tijssen JG, Wiersinga WM 2005 Cognitive function and wellbeing in euthyroid patients on thyroxine therapy for primary hypothyroidism. Eur J Endocrinol 153:747–753

11. Roberts LM, Pattison H, Roalfe A, Franklyn J, Wilson S, Hobbs FD, Parle JV 2006 Is subclinical thyroid dysfunction in the elderly associated with depression or cognitive dysfunction? Ann Intern Med 145:573–581

12. Hogervorst E, Huppert F, Matthews FE, Brayne C 2008 Thyroid function and cognitive decline in the MRC funded cognitive function and ageing study. Psychoneuroendocrinology 33:1013–1022

13. Davis JD, Tremont G 2007 Neuropsychiatric aspects of hypothyroidism and treatment reversibility. Minerva Endocrinol 32:49–65

14. Nystrom E, Caidahl K, Fager G, Wikkelso P, Lundberg A, Lindstedt G 1988 A double-blind cross-over 12-month study of l-thyroxine treatment of women with ‘subclinical’ hypothyroidism. Clin Endocrinol (Oxf) 29:63–76

15. Jaeschke R, Guyatt G, Gerstein H, Patterson C, Molloy W, Cook D, Harper S, Griffith L, Carbotte R 1996 Does treatment with l-thyroxine influence health status in middle-aged and older adults with subclinical hypothyroidism? J Gen Intern Med 11:744–749

16. Samuels MH, Schuff KG, Carlson NE, Carello P, Janowsky JS 2007 Health status, mood and cognition in experimentally induced subclinical hypothyroidism. J Clin Endocrinol Metab 92:2545–2551

17. Jorde R, Waterloo K, Storhaug H, Nyrnes A, Sundsfjord J, Jenssen TG 2006 Neuropsychological function and symptoms in subjects with subclinical hypothyroidism and the effect of thyroxine treatment. J Clin Endocrinol Metab 91:145–153

18. Folstein MF, Folstein SE, McHugh PR 1975 “Mini-mental state.” A practical method for grading the cognitive state of patients for the clinician. J Psychiatr Res 12:189–198

19. Golding E 1989 The Middlesex Elderly Assessment of Mental State. Bury St. Edmunds, UK: Thames Valley Test

20. Crum RM, Anthony JC, Bassett SS, Folstein MF 1993 Population-based norms for the Mini-Mental State Examination by age and educational level. JAMA 269:2386–2391

21. Hensel A, Angermeyer MC, Riedel-Heller SG 2007 Measuring cognitive change in older adults: reliable change indices for the Mini-Mental State Examination. J Neurol Neurosurg Psychiatry 78:1298–1303

22. Medical Research Council 1992 The speed and capacity of language-processing test. Bury St. Edmunds, UK: Thames Valley Test

23. Kortte KB, Horner MD, Windham WK 2002 The trail making test, part B: cognitive flexibility or ability to maintain set? Appl Neuropsychol 9:106–109

24. Ernst J 1987 Neuropsychological problem-solving skills in the elderly. Psychol Aging 2:363–365

25. Van Gorp WG, Satz P, Mitrushina M 1990 Neuropsychological processes associated with normal aging. Dev Neuropsychol 6:279–290

26. Zigmond AS, Snaith RP 1983 The Hospital Anxiety and Depression Scale. Acta Psychiatr Scand 67:361–370

27. Godin O, Dufouil C, Ritchie K, Dartigues JF, Tzourio C, Pérès K, Artero S, Alpérovitch A 2007 Depressive symptoms, major depressive episode and cognition in the elderly: the three-city study. Neuroepidemiology 28:101–108

28. Bravo G, Hébert R 1997 Age and education specific reference values for the mini-mental and modified mini-mental state examinations derived from a non-demented elderly population. Int J Geriatr Psychiatry 12:1008–1018

29. Flint AJ, Rifat SL 1996 Validation of the hospital anxiety and depression scale as a measure of severity of geriatric depression. Int J Geriatr Psychiatry 11:991–994

30. Office of the Deputy Prime Minister 2004 Indices of deprivation 2004: summary (revised). http://webarchive.nationalarchives.gov.uk/+;/http://www.communities.gov.uk/index.asp?id=1128444 (accessed July 11th 2006)

31. Boelaert K, Horacek J, Holder RL, Watkinson JC, Sheppard MC, Franklyn JA 2006 Serum thyrotropin concentration as a novel predictor of malignancy in thyroid nodules investigated by fine-needle aspiration. J Clin Endocrinol Metab 91:4295–4301

32. Burgess PW, Alderman N, Forbes C, Costello A, Coates LM, Dawson DR, Anderson ND, Gilbert SJ, Dumontheil I, Channon S 2006 The case for the development and use of “ecologically valid” measures of executive function in experimental and clinical neuropsychology. J Int Neuropsychol Soc 12:194–209

33. O'Keeffe ST, Mulkerrin EC, Nayeem K, Varughese M, Pillay I 2005 Use of serial Mini-Mental State Examinations to diagnose and monitor delirium in elderly hospital patients. J Am Geriatr Soc 53:867–870

34. Puhan MA, Frey M, Büchi S, Schünemann HJ 2008 The minimal important difference of the hospital anxiety and depression scale in patients with chronic obstructive pulmonary disease. Health Qual Life Outcomes 6:46

35. Parle JV, Franklyn JA, Cross KW, Jones SC, Sheppard MC 1991 Prevalence and follow-up of abnormal thyrotropin (TSH) concentrations in the elderly in the United Kingdom. Clin Endocrinol (Oxf) 34:77–83

36. Gussekloo J, van Exel E, de Craen AJ, Meinders AE, Frölich M, Westendorp RG 2004 Thyroid status, disability and cognitive function, and survival in old age. JAMA 292:2591–2599

37. Atzmon G, Barzilai N, Hollowell JG, Surks MI, Gabriely I 2009 Extreme longevity is associated with increased serum thyrotropin. J Clin Endocrinol Metab 94:1251–1254

38. Haentjens P, Van Meerhaeghe A, Poppe K, Velkeniers B 2008 Subclinical thyroid dysfunction and mortality: an estimate of relative and absolute excess all-cause mortality based on time-to-event data from cohort studies. Eur J Endocrinol 159:329–341