Comprehension of acoustically degraded speech in Alzheimer’s disease and primary progressive aphasia

Abstract

Successful communication in daily life depends on accurate decoding of speech signals that are acoustically degraded by challenging listening conditions. This process presents the brain with a demanding computational task that is vulnerable to neurodegenerative pathologies. However, despite recent intense interest in the link between hearing impairment and dementia, comprehension of acoustically degraded speech in these diseases has been little studied. Here we addressed this issue in a cohort of 19 patients with typical Alzheimer's disease and 30 patients representing the three canonical syndromes of primary progressive aphasia (non-fluent/agrammatic variant primary progressive aphasia; semantic variant primary progressive aphasia; logopenic variant primary progressive aphasia), compared with 25 healthy age-matched controls. As a paradigm for the acoustically degraded speech signals of daily life, we used noise-vocoding: synthetic division of the speech signal into frequency channels constituted from amplitude-modulated white noise, such that fewer channels convey less spectrotemporal detail, thereby reducing intelligibility. We investigated the impact of noise-vocoding on recognition of spoken three-digit numbers and used psychometric modelling to ascertain the threshold number of noise-vocoding channels required for 50% intelligibility by each participant. Associations of noise-vocoded speech intelligibility threshold with general demographic, clinical and neuropsychological characteristics and with regional grey matter volume (defined by voxel-based morphometry of patients' brain images) were also assessed. Mean noise-vocoded speech intelligibility threshold was significantly higher in all patient groups than in healthy controls, and significantly higher in Alzheimer's disease and logopenic variant primary progressive aphasia than in semantic variant primary progressive aphasia (all P < 0.05).
In a receiver operating characteristic analysis, vocoded intelligibility threshold discriminated Alzheimer’s disease, non-fluent variant and logopenic variant primary progressive aphasia patients very well from healthy controls. Further, this central hearing measure correlated with overall disease severity but not with peripheral hearing or clear speech perception. Neuroanatomically, after correcting for multiple voxel-wise comparisons in predefined regions of interest, impaired noise-vocoded speech comprehension across syndromes was significantly associated (P < 0.05) with atrophy of left planum temporale, angular gyrus and anterior cingulate gyrus: a cortical network that has previously been widely implicated in processing degraded speech signals. Our findings suggest that the comprehension of acoustically altered speech captures an auditory brain process relevant to daily hearing and communication in major dementia syndromes, with novel diagnostic and therapeutic implications.


Introduction
Successful communication in the world at large depends on our ability to understand spoken messages under non-ideal listening conditions. In our daily lives, we are required to interpret speech that is acoustically degraded in a wide variety of ways: we regularly conduct conversations over background noise, adapt to suboptimal telephone and video connections and interpret unfamiliar accents.3 Because speech signals are critical for communication, decoding of degraded speech is generally the most functionally relevant index of hearing ability in daily life.6-8 Hearing impairment has recently been identified as a major risk factor for dementia and a driver of cognitive decline and disability.4,9,10 While most studies addressing this linkage have focused on peripheral hearing function measured using the detection of pure tones,4,11,12 mounting evidence suggests that measures of central hearing (auditory brain) function, and in particular the comprehension of degraded speech signals, may be more pertinent.6,8,13,14 Large cohort studies have identified impaired comprehension of degraded messages as a harbinger of dementia.7,15,16 Further, both Alzheimer's disease and primary progressive aphasia (PPA) syndromes impair comprehension of non-native accents,25-28 sinewave speech29,30 and noise-interrupted speech,31 suggesting that neurodegenerative pathologies impair the processing of degraded speech signals more generally. However, the neural mechanisms responsible, the types of speech degradation that are implicated in everyday listening and the effects of different neurodegenerative pathologies have not yet been fully clarified. There are several grounds on which the processing of degraded speech may be especially vulnerable to neurodegenerative pathologies.5 Neuroanatomically, the processing of degraded speech signals engages distributed neural networks in perisylvian, prefrontal and posterior temporo-parietal cortices: these same brain networks are targeted preferentially in PPA, particularly the non-fluent/agrammatic variant and logopenic variant syndromes.5,29,32,33 Computationally, the comprehension of degraded speech signals depends on precise yet dynamic integration of information across neural circuitry4,5,8,34,35 and neurodegenerative pathologies are likely to blight these computations early and profoundly.
One widely used technique for altering speech signals experimentally is noise-vocoding, whereby a speech signal is divided digitally into discrete frequency bands ('channels'), each filled with white noise and modulated by the amplitude envelope of the original signal in that band.36 This procedure degrades the spectral content of the speech signal while preserving its overall, longer-range temporal structure. The intelligibility of the noise-vocoded speech signal can be controlled parametrically: fewer channels convey less spectral detail, yielding less intelligible speech. Noise-vocoding simulates the acoustic characteristics of a cochlear implant, and noise-vocoded speech per se will not be encountered by most listeners in everyday life. However, among various alternative methods,5 noise-vocoding has certain attributes that make it attractive as a model paradigm for studying the effects of disease on the processing of degraded speech more generally.
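In outline, the channel-vocoding procedure described above can be sketched as follows. This is an illustrative Python reconstruction, not the study's MATLAB implementation: the logarithmic channel spacing, the 30 Hz envelope smoothing and the band-limited noise carriers are assumptions (the actual synthesis parameters are given in the Supplementary material).

```python
import numpy as np

def noise_vocode(signal, fs, n_channels, fmin=100.0, fmax=8000.0, env_cutoff=30.0):
    """Simplified noise-vocoder sketch: split into frequency channels,
    extract each channel's amplitude envelope, and use it to modulate
    band-limited white noise. Parameter values are illustrative only."""
    n = len(signal)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spec = np.fft.rfft(signal)
    # Log-spaced channel edges (an assumption approximating cochlear spacing)
    edges = np.logspace(np.log10(fmin), np.log10(fmax), n_channels + 1)
    rng = np.random.default_rng(0)
    out = np.zeros(n)
    # Moving-average kernel as a crude lowpass for envelope extraction
    win = max(1, int(fs / env_cutoff))
    kernel = np.ones(win) / win
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (freqs >= lo) & (freqs < hi)
        # Band-limited speech: inverse FFT of the masked spectrum
        band = np.fft.irfft(spec * mask, n)
        # Amplitude envelope: rectify, then smooth
        env = np.convolve(np.abs(band), kernel, mode="same")
        # Band-limited white-noise carrier for the same channel
        noise_spec = np.fft.rfft(rng.standard_normal(n))
        carrier = np.fft.irfft(noise_spec * mask, n)
        out += env * carrier
    # Match the RMS level of the original signal
    rms_in = np.sqrt(np.mean(signal ** 2))
    rms_out = np.sqrt(np.mean(out ** 2))
    return out * (rms_in / rms_out) if rms_out > 0 else out
```

Decreasing `n_channels` coarsens the spectral sampling of the envelope information, which is the manipulation used here to titrate intelligibility.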
As an exemplar of acoustic degradation based on reduction of spectral information, noise-vocoding is likely to capture auditory brain processes engaged by a variety of daily listening scenarios that require decoding of 'noisy' speech signals (for example, a poor telephone or video-conferencing line, or a speaker with a heavy cold).37-42 In contrast to speech-in-noise techniques (which mix a speech signal with extraneous background sound), noise-vocoding degrades the intrinsic features of the speech signal. It therefore opens a window on auditory perceptual and cognitive processes complementary to those engaged in processing sound scenes (following a speech signal against competing background noise). Comprehension of noise-vocoded speech is likely to be more dependent on auditory object (phonemic) decoding than selective attention: indeed, perceptual and electrophysiological processing of noise-vocoded speech and acoustically degraded conspecific call sounds has been demonstrated in non-human primates,43-47 suggesting that noise-vocoding may engage a fundamental neural integrative mechanism for decoding vocal signals in primate auditory cortex. Further, noise-vocoding offers the substantial advantage of generating a quantifiable intelligibility threshold for the degraded speech signal, based on the number of vocoding channels. This potentially allows a more sensitive, graded and robust determination of deficit, enabling comparisons between diseases, tracking of disease evolution and, potentially, assessment of the impact of therapeutic interventions.
Noise-vocoding has previously been applied in a joint behavioural and magnetoencephalographic (MEG) study of non-fluent/agrammatic variant PPA (nfvPPA), to assess the brain mechanisms that mediate comprehension of degraded speech in the context of relatively focal cerebral atrophy.48 This work showed that patients with nfvPPA rely more on cross-modal cues to disambiguate vocoded speech signals, and have inflexible predictive decoding mechanisms, instantiated in left inferior frontal cortex. However, noise-vocoding has not been exploited as a tool to compare degraded speech perception across different neurodegenerative syndromes. More generally, the cognitive and neuroanatomical mechanisms that mediate the processing of degraded speech, and their clinical relevance across this disease spectrum, remain poorly defined.
Here, using noise-vocoding, we evaluated the comprehension of acoustically degraded spoken messages in cohorts of patients with typical Alzheimer's disease and with all major syndromes of PPA, referenced to healthy older listeners. We assessed how the understanding of noise-vocoded speech was related to other demographic and disease characteristics. We further assessed the structural neuroanatomical associations of the noise-vocoded speech intelligibility threshold in Alzheimer's disease and PPA, using voxel-based morphometry (VBM) on patients' brain magnetic resonance images. Based on available evidence with noise-vocoded48 and other degraded speech stimuli (e.g. speech-in-noise16 and sinewave speech29) in Alzheimer's disease and PPA patients, we hypothesized that both Alzheimer's disease and PPA patients would have elevated thresholds for comprehending noise-vocoded speech compared with healthy controls, and that this deficit would be more severe in nfvPPA and logopenic variant PPA (lvPPA) than in other neurodegenerative syndromes. We further hypothesized that elevated noise-vocoded intelligibility threshold (as an index of impaired comprehension of degraded speech) would be correlated over the combined patient cohort with regional grey matter atrophy in left posterior superior temporal, inferior parietal and inferior frontal cortices: a network of brain areas previously implicated in the processing of noise-vocoded speech in the healthy brain36-42 and targeted early and relatively selectively by neurodegenerative pathology in Alzheimer's disease and PPA.49

Materials and methods

Participants
Nineteen patients with typical amnestic Alzheimer's disease, eight patients with lvPPA, 10 patients with nfvPPA and 12 patients with semantic variant PPA (svPPA) were recruited via a specialist cognitive clinic. All patients fulfilled consensus clinical diagnostic criteria with compatible brain MRI profiles and had clinically mild-to-moderate disease.50,51 No patients with pathogenic mutations were included.
Twenty-five healthy older control participants with no history of neurological or psychiatric disorders were recruited from the Dementia Research Centre volunteer database. All participants had a comprehensive general neuropsychological assessment (Table 1). None had a history of otological disease other than presbycusis; participants assessed in person at the research centre had pure tone audiometry, following a previously described procedure (details in Supplementary material).
Owing to the Covid-19 pandemic, some data for this study were collected remotely (Supplementary material). We have described the design and implementation of our remote neuropsychological assessment protocol elsewhere.52 All participants gave informed consent to take part in the study. Ethical approval was granted by the UCL-NHNN Joint Research Ethics Committees, in accordance with Declaration of Helsinki guidelines.

Creation of experimental stimuli
Lists of 50 different three-digit numbers (of the form 'five hundred and eighty-seven'; examples in Supplementary material) were recorded by two young adult female speakers in a Standard Southern British English accent with neutral prosody. They were recorded in Audacity (v2.2.3), using a condenser microphone with a pop-shield in a sound-proof booth. Speech recordings were noise-vocoded using MATLAB® (vR2019b) (https://uk.mathworks.com/) to generate acoustically altered stimuli with prescribed levels of degraded intelligibility (see Supplementary Fig. 1 for spectrograms). Details concerning the synthesis of noise-vocoded stimuli are provided in the Supplementary material. The vocoding intelligibility threshold for younger normal listeners is typically around three to four 'channels'36; in this experiment, we noise-vocoded the speech recordings with 1 to 24 channels, sampling at each integer number of channels within this range to ensure that we could accurately capture even markedly abnormal psychometric functions in the patient cohort.
The final stimulus list comprised 100 different spoken three-digit numbers: four unvocoded (clear speech) and 96 noise-vocoded, with four stimuli for each number of channels ranging from 1 to 24.

Experimental procedure
The stimuli were administered binaurally in a quiet room via Audio-Technica ATH-M50X headphones at a comfortable fixed listening level (at least 70 dB). Data for 30 participants were collected remotely via video link during the Covid-19 pandemic (Table 1 and Supplementary material).
To familiarize the participants with the experimental procedure, they were first asked to repeat five three-digit numbers (not included in the experimental session) that were spoken by the experimenter. Prior to presentation of the experimental stimuli, participants were advised that the numbers they heard would vary in how difficult they were to understand, but that they should guess the number even if uncertain. Stimuli were presented in order of progressively decreasing channel number (intelligibility): first clear speech, then from 24 vocoding channels down to one vocoding channel. On each experimental trial, the task was to repeat the number (or as many of the three digits as the participant could identify). Participants were allowed to write down the numbers they heard rather than speaking them if preferred; in scoring, we accepted the intended target digit as correct, even if imperfectly articulated. Responses were recorded for offline analysis. During the experiment, no feedback about performance was given and no time limits were imposed.

Analysis of clinical and behavioural data
Data were analysed in MATLAB® (vR2019b) and in R® (v4). For continuous demographic and neuropsychological data, participant groups were compared using ANOVA or Kruskal-Wallis tests (depending on normality of the data); categorical data were compared using Fisher's exact tests. Performance profiles in seven healthy control participants who performed the experiment both in person and subsequently remotely were very similar, justifying the combination of participants tested in person and remotely in the main analysis (Supplementary material). An alpha of 0.05 was adopted as the threshold for statistical significance on all tests.
Identification of noise-vocoded spoken numbers was scored according to the number of digits correct for each three-digit number (e.g. if the target number was '587' and the participant responded '585', they would score two points on that trial). As three digits were presented on every trial, this scoring yielded a total of 12 (4 × 3) data-points for each vocoding channel number, for each participant. As the perceptual effect of the number of vocoding channels scales nonlinearly (e.g. the increase in intelligibility for normal listeners is much greater between two and four channels than between 20 and 24 channels), we applied a logarithmic (base 2) transformation to the channel number. The resulting data were then modelled using a Weibull sigmoid, a widely used function for fitting logarithmically scaled data.53 Individual participant and group mean psychometric curves were created for each diagnostic group using the MATLAB psignifit package. This package employs beta-binomial models that account for overdispersion of the fitted psychometric function, due (for example) to wide variation among individual patients.53

For each function, we report the following parameters: the binaural noise-vocoded speech intelligibility threshold (the number of vocoding channels at which 50% identification of noise-vocoded numbers was achieved); the slope of the function at the threshold point; lambda (the lapse rate, or proportion of incorrect responses at the maximum performance asymptote); gamma (the guess rate, or proportion of correct responses at the minimum performance level); and eta (a measure of overdispersion). As the data were not normally distributed, we used nonparametric Kruskal-Wallis tests to analyse psychometric parameters. Where the omnibus test was significant, we conducted Dunn's tests for pairwise comparisons between participant groups. We assessed the relationship of noise-vocoded speech intelligibility threshold to forward digit span over the whole patient cohort, using Spearman's correlation; here, digit span provides a metric of each patient's overall ability to repeat (hear, hold in short-term memory and articulate) natural spoken numbers. We further used Spearman's correlation to assess, over the combined patient cohort, the relationship of intelligibility threshold to general demographic (age, sex) and clinical [symptom duration, Mini-Mental State Examination (MMSE) score] variables, executive performance [Wechsler Abbreviated Scale of Intelligence (WASI) Matrices] and measures of auditory perceptual function [pure tone audiometry, phonemic pair discrimination on the Psycholinguistic Assessment of Language Processing in Aphasia (PALPA)-3 subtest] (Supplementary material).
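To make the psychometric modelling concrete, the sketch below fits a Weibull sigmoid to proportion-correct scores on a log2 channel axis and inverts it at 50% correct. It is a simplified, hypothetical stand-in for the MATLAB psignifit fit: it uses a least-squares grid search rather than psignifit's Bayesian beta-binomial machinery, and the fixed guess/lapse rates and grid ranges are assumptions.

```python
import numpy as np

def weibull(x, alpha, beta, gamma=0.0, lam=0.02):
    """Weibull psychometric function evaluated on x = log2(channels):
    rises from the guess rate gamma to the asymptote 1 - lam."""
    return gamma + (1 - gamma - lam) * (1 - np.exp(-(x / alpha) ** beta))

def fit_threshold(channels, prop_correct, gamma=0.0, lam=0.02):
    """Least-squares grid search over (alpha, beta), then read off the
    channel number at which the fitted curve crosses 50% correct."""
    x = np.log2(np.asarray(channels, dtype=float))
    y = np.asarray(prop_correct, dtype=float)
    best, best_err = None, np.inf
    for alpha in np.linspace(0.5, 4.5, 81):       # threshold scale, log2 units
        for beta in np.linspace(0.5, 10.0, 96):   # slope parameter
            err = np.sum((weibull(x, alpha, beta, gamma, lam) - y) ** 2)
            if err < best_err:
                best, best_err = (alpha, beta), err
    alpha, beta = best
    # Invert psi(x) = 0.5 analytically for the 50% intelligibility threshold
    target = (0.5 - gamma) / (1 - gamma - lam)
    x50 = alpha * (-np.log(1 - target)) ** (1 / beta)
    return 2 ** x50  # back from log2 units to channel number
```

A higher returned value means the listener needed more spectral detail (more channels) to reach 50% identification, i.e. worse degraded-speech comprehension.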
Finally, receiver operating characteristic (ROC) curves were derived to assess the overall diagnostic utility of noise-vocoded speech comprehension in distinguishing each patient group from healthy controls. The binary classifier used was the 50% speech intelligibility threshold obtained from each psychometric function. The area under the ROC curve (AUC) was calculated for each syndromic group using parametric estimates in the pROC R package.54,55
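For a single continuous marker such as the intelligibility threshold, the AUC reduces to the probability that a randomly chosen patient scores above a randomly chosen control (the Mann-Whitney statistic). A minimal sketch of that computation follows; note the study itself used parametric AUC estimates from the pROC R package, so this is illustrative only.

```python
def roc_auc(patient_thresholds, control_thresholds):
    """Nonparametric AUC for discriminating patients from controls using the
    50% intelligibility threshold as the classifier score (higher = worse).
    Counts the fraction of patient/control pairs correctly ordered,
    with ties scored 0.5 (Mann-Whitney U / (n_p * n_c))."""
    wins = 0.0
    for p in patient_thresholds:
        for c in control_thresholds:
            if p > c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(patient_thresholds) * len(control_thresholds))
```

An AUC of 1.0 indicates perfect separation of the two groups; 0.5 indicates chance-level discrimination.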

Brain image acquisition and analysis
Volumetric brain magnetic resonance images were acquired for 25 patients on a 3 T Siemens Prisma MRI scanner, using a 32-channel phased-array head coil and a T1-weighted sagittal 3D magnetization-prepared rapid gradient echo (MPRAGE) sequence (echo time = 2.9 ms, inversion time = 900 ms, repetition time = 2200 ms), with dimensions 256 mm × 256 mm × 208 mm and voxel size 1.1 mm × 1.1 mm × 1.1 mm.
For the VBM analysis, patients' brain images were first preprocessed and normalized to MNI space using SPM12 software (http://www.fil.ion.ucl.ac.uk/spm/software/spm12/) and the DARTEL toolbox with default parameters, running under MATLAB R2014b. Images were smoothed using a 6-mm full-width at half-maximum (FWHM) Gaussian kernel. To control for individual differences in total (pre-morbid) brain size, total intracranial volume was calculated for each participant by summing white matter, grey matter and CSF volumes after segmentation.56 An explicit brain mask was created using a previously designed automatic mask-creation strategy.57 A study-specific mean brain template image, upon which to overlay statistical parametric maps, was created by warping all patients' native-space whole-brain images to the final DARTEL template and using the ImCalc function to generate an average of these images.

Table 2 Psychometric function parameters for comprehension of noise-vocoded speech in each participant group
We assessed grey matter associations of noise-vocoded speech intelligibility threshold over the combined patient cohort. Voxel-wise grey matter intensity was modelled as a function of performance threshold in a multiple regression design, incorporating age, total intracranial volume and diagnostic group membership as covariates. Statistical parametric maps were generated using an initial cluster-defining threshold (P < 0.001) and assessed at peak-level significance threshold P < 0.05, after family-wise error (FWE) correction for multiple voxel-wise comparisons within five separate predefined regions of interest, specified during the design of the study and based on previously published work on degraded speech perception in the healthy brain and in neurodegenerative disease. These regions, which together constitute a distributed neural network processing degraded speech signals, comprised left planum temporale,38,39 left angular gyrus,40-42 left anterior superior temporal gyrus,40,58,59 left inferior frontal gyrus40,48,58 and left cingulate gyrus.40,60 Anatomical volumes were derived from Harvard-Oxford cortical maps61 and are shown in Supplementary Fig. 3.
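The voxel-wise regression design described above can be sketched in a few lines of linear algebra. This is an illustrative stand-in for SPM's multiple regression machinery (no smoothing, masking or FWE correction is shown), and the function and variable names are hypothetical.

```python
import numpy as np

def voxelwise_t(gm, threshold, age, tiv, group):
    """t-statistic for the intelligibility-threshold regressor at each voxel,
    in a design like the paper's: grey matter ~ threshold + age + TIV +
    diagnostic-group covariates. gm is an (n_subjects, n_voxels) array."""
    n = len(threshold)
    groups = sorted(set(group))
    # One indicator column per diagnostic group (these span the intercept)
    dummies = np.column_stack(
        [[1.0 if g == k else 0.0 for g in group] for k in groups]
    )
    X = np.column_stack([threshold, age, tiv, dummies])
    pinv = np.linalg.pinv(X)
    beta = pinv @ gm                      # (n_regressors, n_voxels)
    resid = gm - X @ beta
    dof = n - np.linalg.matrix_rank(X)
    sigma2 = (resid ** 2).sum(axis=0) / dof
    c = np.zeros(X.shape[1])
    c[0] = 1.0                            # contrast on the threshold regressor
    var_c = c @ pinv @ pinv.T @ c         # c' (X'X)^-1 c via the pseudoinverse
    return (c @ beta) / np.sqrt(sigma2 * var_c)
```

A strongly negative t-value at a voxel indicates that higher (worse) intelligibility thresholds are associated with lower grey matter intensity there, after adjusting for the covariates.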

Data availability
The data that support the findings of this study are available on request from the corresponding author. The data are not publicly available because they contain information that could compromise the privacy of research participants.

Results

General participant group characteristics
Participant groups did not differ significantly in sex distribution, handedness or years of formal education (all P > 0.05; Table 1). Patient groups differed significantly in age (P = 0.04), with the Alzheimer's disease (z = 2.22, P = 0.03), lvPPA (z = 2.47, P = 0.01) and nfvPPA (z = 2.75, P = 0.01) groups being older on average than the svPPA group. Patient groups did not differ in mean symptom duration (P = 0.09) but did differ in MMSE score [H(3) = 11.3, P = 0.01; Table 1], the Alzheimer's disease group performing worse than the nfvPPA (z = −3.22, P = 0.001) and svPPA (z = −2.10, P = 0.04) groups. General neuropsychological profiles were in keeping with the syndromic diagnosis for each patient group (Table 1). Pure tone audiometry (in the participant subcohort assessed in person) revealed no substantial peripheral hearing deficits nor any significant differences between participant groups. Basic speech discrimination (assessed using the PALPA-3) did not differ significantly from the healthy control group for any of the PPA syndromic groups.

Experimental behavioural data
Psychometric parameters for the participant groups are presented in Table 2. Individual and mean psychometric functions and data-points for the noise-vocoded speech intelligibility threshold are presented in Fig. 1. Additional data-point plots of the slope at 50% correct and the lapse rate are presented in Supplementary Fig. 2. ROC curves for the patient groups versus the healthy control group are shown in Fig. 2. Exclusion of two upper-bound outliers on speech intelligibility threshold (>97.5 quantile) in parallel analyses left the results qualitatively unaltered. Results from the full dataset are accordingly reported below; parallel analyses with outliers removed are reported in the Supplementary material.
There was a significant main effect of diagnostic group on noise-vocoded speech intelligibility threshold [H(4) = 38.48, P < 0.001].
In post hoc pairwise group comparisons versus healthy controls, mean intelligibility threshold was significantly elevated in all patient groups: in the lvPPA (z = 4.48, P < 0.001), nfvPPA (z = 3.97, P < 0.001), Alzheimer's disease (z = 5.08, P < 0.001) and svPPA (z = 2.23, P = 0.03) groups. Comparing patient groups, intelligibility threshold was significantly elevated in the Alzheimer's disease (z = 2.07, P = 0.04) and lvPPA (z = 2.27, P = 0.02) groups compared with the svPPA group. There was no significant effect of diagnostic group on the slope of the psychometric function (P = 0.347). There was a significant main effect of diagnostic group on the lapse rate, lambda [H(4) = 16.03, P = 0.003]. In post hoc pairwise group comparisons versus healthy controls, the lapse rate was significantly higher (more errors made at maximum performance) in all patient groups: in the lvPPA (z = 2.68, P = 0.007), Alzheimer's disease (z = 2.61, P = 0.009), nfvPPA (z = 3.27, P = 0.001) and svPPA (z = 2.31, P = 0.02) groups. There were no significant differences between patient groups for lapse rate. There was a significant main effect of diagnostic group on the guess rate, gamma [H(4) = 16.49, P = 0.002]. In post hoc pairwise group comparisons, gamma (i.e. the rate of correct answers at minimum performance) was significantly higher in the healthy control group than in any patient group (P < 0.05). There was no significant effect of diagnostic group on eta (overdispersion of the data) of the psychometric function (P = 0.118). Group effect sizes (Table 2) were large for intelligibility threshold, lapse rate and gamma, but small for other psychometric parameters.62,63 Individual variability in psychometric parameters within participant groups was substantial (Fig. 1 and Table 2). Most pertinently, variation in noise-vocoded speech intelligibility threshold was wider in the Alzheimer's disease group than in healthy controls and most marked in the lvPPA and nfvPPA groups.

Neuroanatomical data
Statistical parametric maps of grey matter regions associated with speech intelligibility threshold are shown in Fig. 3 and local maxima are summarized in Table 3. Correlation plots for each significant peak voxel with speech intelligibility threshold are shown in Supplementary Fig. 5.

Discussion
Here we have shown that perception of acoustically degraded (noise-vocoded) speech is impaired in patients with Alzheimer's disease and PPA syndromes relative to healthy older listeners and, further, that it stratifies syndromes: impairment was most severe in lvPPA and nfvPPA, and significantly more severe in Alzheimer's disease than in svPPA. Intelligibility threshold for noise-vocoded speech did not correlate with measures of pure tone detection or phoneme discrimination in clear speech, suggesting that the deficit does not simply reflect a problem with peripheral hearing or elementary speech perception. Individual noise-vocoded speech intelligibility thresholds varied widely within the Alzheimer's disease, lvPPA and nfvPPA groups. Our findings suggest that elevation of the noise-vocoded speech intelligibility threshold in these dementia syndromes captures a central auditory impairment potentially relevant to difficulties in diverse everyday listening situations requiring the decoding of acoustically altered speech signals.
Neuroanatomically, impaired noise-vocoded speech comprehension across dementia syndromes was underpinned by atrophy of left planum temporale, angular gyrus and anterior cingulate gyrus. This cortical network has been shown to be critical for processing speech signals under a range of noisy, daily listening conditions.5,32,33,42,66 Planum temporale is likely to play a fundamental role in the deconvolution of complex sound patterns and the engagement of neural representations corresponding to phonemes and other auditory objects,38,39,67 while the angular gyrus has been implicated in the higher-order processing of degraded speech signals.68-70 Both regions are targeted in Alzheimer's disease, lvPPA and nfvPPA71-74 and have been particularly implicated in the pathogenesis of impaired speech perception in these diseases.29,30,32,75 The anterior cingulate cortex operates in concert with these more posterior cortical hubs to decode spoken messages under challenging listening conditions,40,60 with a more general role in cognitive control and in allocating attentional resources to salient stimuli.66,76,77 Reduced activation of the anterior cingulate cortex during tracking of information in degraded speech signals has been demonstrated in nfvPPA and svPPA.33 These neuroanatomical considerations suggest that the mechanisms of impaired noise-vocoded speech intelligibility are likely to differ between neurodegenerative syndromes, in keeping with the dissociable processes involved in phoneme recognition.

Noise-vocoding fundamentally reduces the availability of acoustic cues that define phonemes as auditory objects: impaired recognition of these degraded auditory objects could in principle result from deficient encoding of acoustic features, damaged object-level representations (the auditory analogue of 'apperceptive' deficits in the visual domain) or impaired top-down, predictive disambiguation based on stored knowledge about speech signal characteristics. In Alzheimer's disease and lvPPA, a core deficit of object-level representations has been demonstrated neuropsychologically and electrophysiologically using other procedures that alter acoustic detail in phonemes and non-verbal sounds31,33,78,79; it is therefore plausible that an analogous apperceptive deficit may have impacted the recognition of noise-vocoded phonemes in the Alzheimer's disease and lvPPA groups here. In nfvPPA, one previous MEG study of noise-vocoded speech perception has foregrounded the role of inflexible top-down predictive decoding mechanisms (i.e. inappropriately 'precise' stored expectations about incoming speech signals, leading to delayed disambiguation of degraded speech), instantiated in frontal cortex.48 However, this is a clinically, neuroanatomically and neuropathologically diverse syndrome, and involvement of posterior superior temporal cortex engaged in early auditory pattern analysis may constitute a 'second hit' to phoneme recognition.33,78,80,81 In svPPA, the elevated noise-vocoded intelligibility threshold is a priori more likely to reflect reduced activation of semantic mechanisms engaged in the predictive disambiguation of degraded speech signals; indeed, comprehension of other kinds of acoustically degraded speech signals by patients with svPPA has previously been shown to be sensitive to semantic predictability and to engage anterior cingulate cortex.29,31,33

Increasing intelligibility threshold was correlated with digit span over the combined patient cohort. This suggests that verbal working memory limitations may be integrally related to impaired processing of degraded speech, consistent with previous work highlighting the role of working memory in speech perception, particularly in older adults.82,83 As working memory demands did not vary across trials and numbers of vocoding channels, the principal driver of intelligibility threshold is likely to have been the level of acoustic alteration in the speech signal. On the other hand, all patient groups showed an increased lapse rate (i.e. errors unrelated to the stimulus level53) at higher vocoding channel numbers (i.e. for minimally noise-vocoded speech signals approaching clear speech). This echoes previous work demonstrating that active listening can be abnormal in lvPPA and nfvPPA even for clear speech and other sounds in quiet.75,84 As lapse rate was also correlated with digit span, this suggests that reduced working memory may influence performance at the upper asymptote, potentially interacting with top-down mechanisms engaged in the predictive processing of speech.48

Figure 3 Statistical parametric maps of regional grey matter atrophy associated with elevated noise-vocoded speech intelligibility threshold in the combined patient cohort. Maps are rendered on sagittal sections of the group mean T1-weighted magnetic resonance image in MNI space, thresholded at P < 0.001 uncorrected for multiple voxel-wise comparisons for display purposes, and masked using the prespecified neuroanatomical regions of interest (as used in the small-volume corrections) that contained voxels significant at P < 0.05 after FWE correction. The colour bar (right) codes voxel-wise t-values. All sections are through the left cerebral hemisphere; the plane of each section is indicated by the corresponding MNI coordinate (mm).

Indeed, frontal processes are likely to play a broader role in the disambiguation of degraded speech signals, including the allocation of attentional and executive resources,85 in keeping with the observed correlation here between noise-vocoded speech intelligibility threshold and WASI Matrices score. Taken together, the present findings corroborate the profiles of deficit previously documented in Alzheimer's disease and PPA syndromes for comprehension of sinewave speech and phonemic restoration in noise-interrupted speech.29,31 Our findings further suggest that markers of noise-vocoded speech comprehension may have diagnostic and biomarker potential. The ROC analysis of the noise-vocoded intelligibility threshold measure (Fig. 2) suggests that it would constitute an 'excellent' clinical test (corresponding to AUC > 0.9) for discriminating patients with Alzheimer's disease, lvPPA and nfvPPA from healthy older individuals.65 However, the small sample sizes do need to be taken into consideration when interpreting the ROC analysis. Additionally, the noise-vocoded intelligibility threshold was correlated with overall disease severity (MMSE score) in the patient cohort. These findings build on a growing body of work suggesting that markers of 'central' hearing (auditory cognition) may sensitively signal the functional integrity of cortical regions that are vulnerable to Alzheimer's disease and other neurodegenerative pathologies.5,8,16 The results of this study could further motivate the development of tailored strategies to help manage the hearing difficulties experienced by people with dementia in various daily-life contexts and environments.
This study has limitations that suggest directions for further work. Our noise-vocoding paradigm (based on a step-wise linear progression through channel numbers) was not optimally efficient; an adaptive staircase procedure would reduce testing time and allow individual thresholds to be captured without administering uninformative trials at higher channel numbers.87,88 Using another kind of speech degradation (sinewave transformation), we have previously shown that pharmacological and perceptual learning effects may operate in Alzheimer's disease and PPA syndromes.29,30 To establish how noise-vocoded speech perception and its modulatory factors relate to neural circuit integrity in Alzheimer's disease and PPA, functional neuroimaging using techniques such as functional MRI and MEG will be required to capture the dynamic network connectivity engaged by these processes and the neural mechanisms that represent and decode vocoded speech sounds. Furthermore, whilst a direct comparison across sensory modalities was beyond the scope of the present study, the perceptual processing deficit presented here in the auditory domain may extend to other sensory domains, such as vision.89 It would be of particular interest to assess whether crossmodal sensory cues can be used to help disambiguate degraded speech signals in patients with Alzheimer's disease and PPA.
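The adaptive alternative mentioned above could be sketched as a simple 2-down/1-up staircase over the number of vocoding channels. This is a hypothetical illustration, not the procedure used in this study; `respond` stands in for a scored trial at a given channel number:

```python
def staircase(respond, start=16, step=1, n_reversals=6, lo=1, hi=16):
    """Minimal 2-down/1-up adaptive staircase (illustrative sketch).

    respond(level) returns True for a correct trial at that number of
    vocoding channels. Two consecutive correct trials make the task
    harder (fewer channels); one error makes it easier (more channels).
    The estimate is the mean level at the last few reversals."""
    level, streak, direction = start, 0, -1
    reversals = []
    while len(reversals) < n_reversals:
        if respond(level):
            streak += 1
            if streak == 2:                  # two correct -> step down
                streak = 0
                if direction == +1:          # change of direction
                    reversals.append(level)
                direction = -1
                level = max(lo, level - step)
        else:                                # one error -> step up
            streak = 0
            if direction == -1:
                reversals.append(level)
            direction = +1
            level = min(hi, level + step)
    return sum(reversals) / len(reversals)
```

For a deterministic listener who is correct whenever six or more channels are presented, the staircase oscillates between five and six channels and returns an estimate between those levels, without ever probing the uninformative upper range.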
From a clinical perspective, this work should be taken forward in several ways. The group sizes here were relatively small: the noise-vocoding paradigm should be extended to larger patient cohorts, which (given the comparative rarity of PPA) will likely entail multicentre collaboration. Besides corroborating the present group findings, assessment of larger cohorts would allow characterization of the sources of the wide individual variation within diagnostic groups. There is also a need for prospective, longitudinal studies, both to assess how markers of degraded speech perception relate to disease course and to determine how early such markers may signal underlying neurodegenerative pathology. Auditory measures based on degraded speech comprehension would be well suited to future digital applications and potentially to large-scale screening of populations at risk of incident Alzheimer's disease, as well as to use as outcome measures in clinical trials of pharmacotherapies and non-pharmacological interventions.8,16 Our work adds to a growing body of evidence that central hearing problems may emerge as early and/or prominent symptoms in dementia syndromes.8 Improved awareness and understanding of these issues among healthcare professionals such as audiologists and neurologists could inform the care, management and counselling of patients. Older hearing aid users at risk of dementia are likely to be particularly vulnerable to impaired central mechanisms of degraded speech comprehension, given that the quality of incoming acoustic information in this setting is already compromised.
The key next step, however, will be to establish how well measures of degraded speech comprehension, based not solely on noise-vocoding but also on other ethologically relevant adverse speech listening tests, correlate with daily-life hearing and communication in Alzheimer's disease and other neurodegenerative diseases, using both currently standardized symptom questionnaires and bespoke instruments developed to capture functional hearing disability in dementia. We have previously shown that pure tone audiometry alone is a poor predictor of everyday hearing,90 while degraded speech performance may have better predictive value in patients with dementia.91 There would be considerable clinical value in a quantifiable index of degraded speech perception that could serve as a proxy and predictor of daily-life hearing function and disability in major dementias: comprehension of noise-vocoded speech is a promising candidate. The link between hearing impairment and dementia continues to be debated but presents a major opportunity for earlier diagnosis and intervention. Our findings suggest that the perception of degraded (noise-vocoded) speech quantifies central hearing functions beyond sound detection in dementia and stratifies major dementia syndromes. This central hearing index may constitute a proxy for the communication difficulties experienced by patients with Alzheimer's disease and PPA under challenging listening conditions in daily life. We hope that this work will motivate further studies to define the diagnostic and therapeutic scope of central hearing measures based on degraded speech perception in these diseases.

The table shows significant negative associations between regional grey matter volume and intelligibility threshold for noise-vocoded speech, based on the voxel-based morphometric analysis of brain magnetic resonance images for the combined patient cohort. Coordinates of peaks (local maxima) are in MNI standard space. Local maxima shown were significant (P < 0.05) after family-wise error (FWE) correction for multiple voxel-wise comparisons within the pre-specified anatomical regions of interest (see text and Supplementary Fig. 2).

Comprehension of degraded speech in dementia BRAIN 2023: 146; 4065-4076 | 4073
Values are based on mean psychometric functions for each participant group (see text and Supplementary Fig. 2); mean (standard deviation, SD) and confidence intervals (CI) for each parameter are shown. Threshold indicates 50% intelligibility of noise-vocoded spoken numbers; slope indicates the slope of the psychometric function at this threshold point; lambda (lapse rate) indicates the rate of incorrect responses at the maximum performance level; gamma (guess rate) indicates the rate of correct responses at the minimum performance level; eta (overdispersion) indicates scaling of extra variance (a value near 0 indicates that the data are essentially binomially distributed, while values near 1 indicate severely overdispersed data). The η²H (eta-squared) parameter measures the effect size of the omnibus test for each parameter and is expressed as a proportion ranging from 0 to 1, with higher values representing larger effect sizes. Significant differences (P < 0.05) between patient groups and the healthy older control group are shown in bold; *significantly lower in the svPPA group than the other patient groups (all P < 0.05). AD = patient group with typical Alzheimer's disease; Controls = healthy older control group; lvPPA = patient group with logopenic variant primary progressive aphasia; nfvPPA = patient group with non-fluent/agrammatic variant primary progressive aphasia; svPPA = patient group with semantic variant primary progressive aphasia.
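For reference, these parameters enter the standard psychometric-function form used in this kind of modelling (a sketch in the conventional Wichmann-Hill parameterization; the choice of core function F, e.g. a logistic or Weibull function of the number of channels x, is an assumption here, not specified by the table):

```latex
\psi(x) = \gamma + (1 - \gamma - \lambda)\, F(x;\, \alpha, \beta)
```

where α is the threshold parameter, β governs the slope, γ is the guess rate (lower asymptote) and λ the lapse rate (which caps the upper asymptote at 1 − λ); the overdispersion parameter η scales the extra-binomial variance of the observed proportions correct.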

Figure 1 Beeswarm plots of individual participants' speech intelligibility threshold and psychometric curves for comprehension of noise-vocoded speech within each diagnostic group. (A) Group speech intelligibility threshold values correspond to the number of vocoding channels in the speech stimulus at which 50% intelligibility of spoken numbers was achieved. Dashed lines represent the mean for each group. (B-F) The y-axis shows the percentage of digits identified correctly (from a total of 12 digits) at each noise-vocoding level; the x-axis shows the number of vocoding channels, plotted on a log scale. (B) Combined psychometric curves of all healthy control participants, with the bold line indicating the mean [curves have been fitted through values (coloured dots) representing the mean score correct across individual participants in that group at each noise-vocoding level]. (C) Combined psychometric curves of all the participants with Alzheimer's disease (AD), with the bold line indicating the mean (as in B). (D) Combined psychometric curves of all the participants with logopenic variant primary progressive aphasia (lvPPA), with the bold line indicating the mean (as in B). (E) Combined psychometric curves of all the participants with non-fluent variant primary progressive aphasia (nfvPPA), with the bold line indicating the mean (as in B). (F) Combined psychometric curves of all the participants with semantic variant primary progressive aphasia (svPPA), with the bold line indicating the mean (as in B).