Tract-specific MRI measures explain learning and recall differences in multiple sclerosis

Abstract Cognitive difficulties are common and a key concern for people with multiple sclerosis. Advancing knowledge of the role of white matter pathology in multiple sclerosis-related cognitive impairment is essential as both occur early in the disease with implications for early intervention. Consequently, this cross-sectional study asked whether quantifying the relationships between lesions and specific white matter structures could better explain co-existing cognitive differences than whole brain imaging measures. Forty participants with relapse-onset multiple sclerosis underwent cognitive testing and MRI at 3 Tesla. They were classified as cognitively impaired (n = 24) or unimpaired (n = 16) and differed across verbal fluency, learning and recall tasks corrected for intelligence and education (corrected P-values = 0.007–0.04). The relationships between lesions and white matter were characterized across six measures: conventional voxel-based T2 lesion load, whole brain tractogram load (lesioned volume/whole tractogram volume), whole bundle volume, bundle load (lesioned volume/whole bundle volume), Tractometry (diffusion-tensor and high angular resolution diffusion measures sampled from all bundle streamlines) and lesionometry (diffusion measures sampled from streamlines traversing lesions only). The tract-specific measures were extracted from corpus callosum segments (genu and isthmus), striato-prefrontal and -parietal pathways, and the superior longitudinal fasciculi (sections I, II and III). White matter measure-task associations demonstrating at least moderate evidence against the null hypothesis (Bayes Factor threshold < 0.2) were examined using independent t-tests and covariate analyses (significance level P < 0.05). 
Tract-specific measures were significant predictors (all P-values < 0.05) of task-specific clinical scores and diminished the significant effect of group as a categorical predictor in Story Recall (isthmus bundle load), Figure Recall (right striato-parietal lesionometry) and Design Learning (left superior longitudinal fasciculus III volume). Lesion load explained the difference in List Learning, whereas Letter Fluency was not associated with any of the imaging measures. Overall, tract-specific measures outperformed the global lesion and tractogram load measures. Variation in regional lesion burden translated to group differences in tract-specific measures, which in turn, attenuated differences in individual cognitive tasks. The structural differences converged in temporo-parietal regions with particular influence on tasks requiring visuospatial-constructional processing. We highlight that measures quantifying the relationships between tract-specific structure and multiple sclerosis lesions uncovered associations with cognition masked by overall tract volumes and global lesion and tractogram loads. These tract-specific white matter quantifications show promise for elucidating the relationships between neuropathology and cognition in multiple sclerosis.


Introduction
Cognitive impairment affects 50-75% of people with multiple sclerosis [1][2][3] and is associated with adverse clinical outcomes and reduced quality of life. [4][5][6] Whilst cognitive impairment in multiple sclerosis has been subject to increasing attention in the literature over recent decades, more work is needed to characterize cognitive phenotypes and particularly their relationships to neuropathology. [6][7][8] Indeed, uncovering the precise neurobiological bases of cognitive deficits has been identified as a key research priority in multiple sclerosis. 8 Cognitive impairment in multiple sclerosis is typically characterized by impaired information processing speed and memory, but a variety of tasks across many cognitive domains have demonstrated differences in multiple sclerosis cohorts compared to healthy controls. 1,2,6-10 Even meta-analyses comprising large numbers of cases do not always agree on which types of tasks and/or cognitive domains demonstrate the most sensitivity in discriminating multiple sclerosis-related cognitive impairment. 9,11,12 The variation in cognitive profiles 7,12,13 perhaps mirrors the wide variation in the nature and location of neuropathology between individuals. 7,14,15 However, the lack of agreed methodology for cognitive testing and classification of cognitive impairment are key challenges in unravelling the pathological correlates of cognitive impairment in multiple sclerosis. 8 Another challenge is posed by the varied approaches to MRI, which is currently the best biomarker for both diagnosing and monitoring disease activity in multiple sclerosis. 16 In keeping with the clinico-radiological paradox, 17 traditional white matter measures such as T1 or T2 lesion loads correlate only modestly with cognition in multiple sclerosis 18 and have not always performed well in predicting specific cognitive functions. 19,20
While global measures such as lesion loads are valuable in clinical trials, these are not expected to explain inter-individual cognitive differences. 18 Imaging approaches that uncover regional or structure-specific variation in pathology may perform better at providing evidence for clinical-pathological correlation. 21 As the locus of pathology in the early stages of multiple sclerosis is predominantly in white matter [22][23][24] with cognitive deficits also apparent early in the disease, 3,5,25 white matter pathology may be a driver of cognitive deficits amenable to targeted rehabilitative or restorative interventions. 8,[26][27][28] Studies investigating the relevance of white matter pathology to cognition in multiple sclerosis have demonstrated the importance of lesion location, 29 abnormalities in 'diffusely abnormal' and 'normal-appearing' white matter, 7,30 and regional as well as tract-based microstructural abnormalities. [31][32][33][34][35] Where diffusion-weighted imaging has been used, conventional diffusion-tensor measures such as fractional anisotropy (FA) predominate, with voxel-based analysis (e.g. tract-based spatial statistics) often employed. 33 As demonstrated by recent approaches to tractography, 21,34-38 more advanced approaches allow for enhanced anatomical and microstructural specificity. [39][40][41][42][43][44] Combining state-of-the-art diffusion-MRI methods with examination of specific white matter structures and cognitive constructs (rather than composites from multiple cognitive domains) presents a novel opportunity for detailed characterization of the relationships between white matter structure and cognition in multiple sclerosis. 8,18 We aimed to address these unmet needs by investigating whether individual differences in tract-specific measures can account for individual differences in performance on specific cognitive tasks between people with multiple sclerosis who are classified as cognitively impaired or unimpaired. 
We used novel methods 45 to characterize individual tract macro- and microstructure and examined their performance, versus more global measures (e.g. conventional T2-weighted lesion load), in explaining cognitive group differences. We hypothesize that tract-specific volumetric and/or microstructural measures will perform better than the global imaging measures in explaining specific cognitive differences. Furthermore, we hypothesize that white matter measures differentially affected by lesion burden between groups may show some relative specificity in explaining performances in individual cognitive tasks.

Estimate of sample size
The sample was derived from a larger clinical study into long-standing multiple sclerosis 46 and a priori sample size was calculated from data (n = 26) involving a similarly defined cohort. 47 Bester et al. 47 examined tract-specific microstructural differences and compared those cognitively impaired (n = 10) and cognitively preserved (n = 16) using the same method for classifying cognitive impairment as applied in Tallantyre et al. 46 They reported significant differences between groups in FA (P = 0.02) and mean diffusivity (MD, P = 0.01) within the splenium of the corpus callosum. 47 Using the published means and standard deviations, the effect sizes were calculated, which were large (FA, Hedges' g = 1.2 and MD, Hedges' g = 1.3) with the impaired group demonstrating lower FA and higher MD.
These data were used to estimate sample size (with G*Power; www.gpower.hhu.de) for using a t-test to detect a white matter structural difference between cognitively impaired or unimpaired groups. Using the parameters of power 0.8, the smaller effect size 1.2, and alpha 0.05 for a two-tailed hypothesis, the minimum sample size derived for each group was n = 12. Similarly, the lower correlation (r = 0.52, P = 0.008) between corpus callosum (genu) FA and a cognitive task (verbal learning) reported by Bester et al. was used to estimate the sample size needed to examine structure-cognitive task correlations. Using the same power and alpha as above, the minimum whole sample size needed was n = 26.
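As a rough cross-check of the sample size estimates above, the following sketch uses textbook normal approximations rather than the exact noncentral distributions that G*Power evaluates. The function names are hypothetical and this is not the calculation performed in the study; under these approximations the results are expected to fall within one participant of the reported values (n = 12 per group and n = 26 overall).

```python
import math
from statistics import NormalDist


def _z_sum(alpha, power):
    """Sum of the critical z value (two-tailed) and the z value for the target power."""
    z = NormalDist()
    return z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)


def n_per_group_two_sample(effect_size, alpha=0.05, power=0.8):
    """Normal-approximation group size for a two-sample comparison of means.

    effect_size is a standardized mean difference (e.g. Hedges' g).
    """
    return math.ceil(2 * (_z_sum(alpha, power) / effect_size) ** 2)


def n_for_correlation(r, alpha=0.05, power=0.8):
    """Fisher z approximation of the total sample size needed to detect a correlation r."""
    return math.ceil((_z_sum(alpha, power) / math.atanh(r)) ** 2 + 3)


# Inputs taken from the text: smaller effect size 1.2, lower correlation 0.52.
group_n = n_per_group_two_sample(1.2)
total_n = n_for_correlation(0.52)
print(group_n, total_n)
```

The small discrepancies from the G*Power figures reflect the approximation only; the exact routines account for the heavier tails of the t distribution and the exact sampling distribution of r.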

Participants
From 60 people with longstanding relapse-onset multiple sclerosis, 46 40 consented to undergo MRI within 12 months of their clinical assessment. Problematic clinical confounds (e.g. fatigue, depression, drug effects) were minimized by recruiting participants who demonstrated relatively low levels of disability and had never received disease modifying treatments. [48][49][50][51] The study was approved by the Wales Research Ethics Committee 2 (16/WA/0051) and is in keeping with the principles of the Declaration of Helsinki.

Data pre-processing and derivation of white matter measures
Brain masks were extracted from the fluid attenuated inversion recovery (FLAIR) images using FSL bet. 53 White matter lesion masks were semi-automatically delineated using 3D T2 and FLAIR images by a trained technician (blinded to the purpose of the study) using NeuROI (www.nottingham.ac.uk/research/groups/clinicalneurology/neuroi.aspx).
Diffusion data were denoised 54 and corrected for subject motion and distortion. 55 To align anatomical images with diffusion data, the FLAIR images were warped to the upsampled b = 0 images from the diffusion MRI dataset using ANTs. 56 The same transformation matrix was then applied to the brain mask and lesion mask. Diffusion tensors were generated using iteratively weighted least squares in MRtrix 57 using only the b = 1200 s/mm² data, followed by the derivation of FA, MD and radial diffusivity (RD) maps (Fig. 1). In addition, the total apparent fibre density (AFD) was derived from fibre orientation distribution functions 58,59 obtained from multi-shell multi-tissue constrained spherical deconvolution 39 using a single group response function. The number of fibre orientations (NuFO) in each voxel was also extracted. 60 Finally, rotationally invariant spherical harmonics (RISH) features 61 were derived for each subject using the b = 2400 s/mm² shell (0th and 2nd orders only, RISH0 and RISH2, respectively). Briefly, RISH features capture the signal energy at a given shell and therefore, the higher the b-value, the more specific to intra-axonal space the RISH features are. Both orders capture different microstructural tissue properties; RISH0 captures the isotropic component of the signal, while RISH2 captures the variance in the signal, and therefore deviations from isotropy.
For each dataset, automated white matter tract segmentation was performed using TractSeg 42 to obtain the following bilateral bundles of interest (Fig. 2): corpus callosum (TractSeg sections 2 [genu] and 6 [isthmus]), striato-prefrontal pathway, striato-parietal pathway and superior longitudinal fasciculus (SLF, TractSeg sections I-III). These bundles were chosen in line with the white matter regions often associated with neuropathological burden in multiple sclerosis, 7,37,62 the types of fibres affected by its pathology, 35,63,64 and to achieve some coverage of anterior and posterior regions across commissural, projection, and association pathways. For each bundle, 2000 streamlines were generated. The microstructural measures were then averaged in each bundle. A whole brain tractogram was also derived for further analysis by concatenating all TractSeg outputs.
Next, white matter measures were derived (Fig. 3) according to Chamberland et al. 45 These include (i) conventional T2 lesion load; (ii) whole brain tractogram load; and (iii) bundle load. In addition, (iv) a lesion-specific approach to the Tractometry 43,65 framework was employed, where the diffusion MRI measures are sampled only within the portion of streamlines traversing a lesion. To reduce the high dimensionality of the data, two Tractometry factors were created from the tract-specific microstructural measures, 66 one derived from whole bundle streamlines and the other from the streamlines affected by lesions only (henceforth referred to as the lesionometry factor; Fig. 3D).

[Figure caption, partially recovered] RISH features (RISH0 and RISH2) were derived using the highest b-value and represent the isotropic and anisotropic energy, respectively. Diffusion-tensor measures (RD, MD and FA) were derived using the lowest b-value. HARDI measures like the apparent fibre density (AFD) and the number of unique fibre orientations per voxel (NuFO) were derived from constrained spherical deconvolution. Hyperintense T2 lesions (FLAIR) are highlighted across the maps (purple outline).
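The load and lesionometry definitions can be made concrete with a toy example. The sketch below assumes that a streamline is represented as a list of voxel coordinates, that the "lesioned volume" is the set of voxels visited by streamlines passing through at least one lesion voxel, and that lesionometry averages a measure along the full length of those streamlines. These representations and all function names are illustrative assumptions, not the lesionometry toolbox itself.

```python
def voxels_of(streamlines):
    """Unique voxels visited by a collection of streamlines."""
    return {voxel for line in streamlines for voxel in line}


def lesioned_streamlines(streamlines, lesion_voxels):
    """Streamlines that pass through at least one lesion voxel."""
    return [line for line in streamlines
            if any(voxel in lesion_voxels for voxel in line)]


def bundle_load(streamlines, lesion_voxels):
    """Volume of lesion-traversing streamlines over whole bundle volume (0 to 1)."""
    total = voxels_of(streamlines)
    hit = voxels_of(lesioned_streamlines(streamlines, lesion_voxels))
    return len(hit) / len(total) if total else 0.0


def mean_along(streamlines, measure_map):
    """Average a voxel-wise measure over every voxel sample of the streamlines."""
    samples = [measure_map[voxel] for line in streamlines for voxel in line]
    return sum(samples) / len(samples)


# Toy bundle: three parallel streamlines of four voxels each (hypothetical data).
bundle = [[(x, y, 0) for x in range(4)] for y in range(3)]
lesion = {(2, 0, 0)}  # a single lesion voxel crossed by the first streamline

# Hypothetical FA map: 0.5 in normal-appearing tissue, 0.2 inside the lesion.
fa_map = {voxel: 0.5 for voxel in voxels_of(bundle)}
fa_map[(2, 0, 0)] = 0.2

load = bundle_load(bundle, lesion)
tractometry_fa = mean_along(bundle, fa_map)  # all bundle streamlines
lesionometry_fa = mean_along(lesioned_streamlines(bundle, lesion), fa_map)
print(load, tractometry_fa, lesionometry_fa)
```

Under the same assumptions, a whole brain tractogram load would follow from calling `bundle_load` on the concatenation of all bundles. The toy example also shows why lesionometry can be more sensitive than whole bundle averaging: the lesion's contribution is diluted by far fewer unaffected samples.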

Behavioural measures
The sample was characterized using typical demographics, disease duration, years in education, Test of Premorbid Functioning UK 68 [...]

[Figure caption, partially recovered] On the right, 3D oblique view of the corona radiata and the superior longitudinal fasciculi in the vicinity of the lesion. (A) Fibre orientation distribution functions (fODFs) of the centrum semiovale (red arrow) show continuity inside and around the lesion, suggesting preserved structural organization. 67 On the right, multiple major pathways intersect and traverse the lesion (CST, corticospinal tract; Cg, Cingulum; CC, corpus callosum; SLF, superior longitudinal fasciculus). (B) A lesion adjacent to the striato-prefrontal connection (ST_PREF) shows preserved fibre orientations (red arrow) allowing reconstruction of the SLF-III pathway.

The cognitive battery has been previously detailed 46 and the same classification of cognitive impairment was applied (≥2 test scores ≤5th percentile against test normative data). However, only a subset of tasks from the original battery were considered for inclusion; those testing cognitive domains often impaired in multiple sclerosis (i.e. information processing speed, working memory, learning, recall, language and cognitive flexibility) 7 :

• Verbal Fluency (Letter and Category) and Colour-Word Interference Test Conditions 3 and 4 (Delis-Kaplan Executive Function System). 77
To examine if cognitive differences can be explained by variation in structural white matter properties, only those cognitive tasks demonstrating significant (P < 0.05) differences between cognitively impaired and unimpaired groups following corrections for estimated IQ and years in education were included. The cognitive measures retained were Story Recall, Figure

Statistical analyses
All statistical analyses were completed using SPSS Statistics 26. The raw scores from the cognitive measures were used across analyses. Effect sizes are presented as d for t-tests, r for correlations, and partial eta squared (ηp²) for analyses of covariance and interpreted as large (LES), medium (MES) or small (SES). 78
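For a single effect in an analysis of covariance, partial eta squared can be recovered from the F statistic and its degrees of freedom as (F × df_effect) / (F × df_effect + df_error). The short sketch below applies this standard identity to F values reported in the results; the function name is hypothetical and the calculation is a reader's check, not part of the SPSS workflow.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F statistics for the effect of group as reported in the results section.
reported = {
    "Letter Fluency": (5.147, 1, 36),
    "Story Recall": (6.359, 1, 35),
    "Design Learning": (4.549, 1, 36),
}

effect_sizes = {task: round(partial_eta_squared(*stats), 3)
                for task, stats in reported.items()}
print(effect_sizes)
```

The recovered values agree with the effect sizes printed alongside each F statistic, which is a convenient way to confirm that symbols damaged in transcription have been read correctly.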

Tractometry factors
For the whole bundle Tractometry factors, the mean FA, MD, RD, AFD, RISH0, RISH2 and NuFO values underwent principal component analysis. 66 The measures were standardized, the factors derived based on eigenvalues greater than 1, and any small coefficients were suppressed (absolute value below 0.3). The same method was used to create the lesionometry factors, but as derived from the within-bundle streamlines affected by lesions only (Fig. 3D). To ensure the data were suitable for principal component analysis, the Kaiser-Meyer-Olkin Measure of Sampling Adequacy (all ≥0.7) and Bartlett's Test of Sphericity (all P < 0.0005) were applied.
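The factor derivation can be sketched in a few steps: standardize the seven measures, form their correlation matrix, keep components with eigenvalues greater than 1, and suppress loadings below 0.3 in absolute value. The example below is a minimal pure-Python illustration on simulated data; the simulated loadings, the power-iteration solver and all function names are assumptions made for the example, and SPSS's own routines were used in the study.

```python
import math
import random

MEASURES = ["FA", "MD", "RD", "AFD", "RISH0", "RISH2", "NuFO"]
# Assumed direction of each measure relative to a latent "integrity" score
# (MD and RD rise as integrity falls); illustrative only.
SIGNS = [1, -1, -1, 1, 1, 1, 1]


def simulate(n=40, seed=0):
    """Hypothetical data: one latent factor plus noise, one column per measure."""
    rng = random.Random(seed)
    latent = [rng.gauss(0, 1) for _ in range(n)]
    return [[s * 0.9 * f + 0.4 * rng.gauss(0, 1) for f in latent] for s in SIGNS]


def standardize(columns):
    """Z-score each column using the sample standard deviation."""
    out = []
    for col in columns:
        mean = sum(col) / len(col)
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / (len(col) - 1))
        out.append([(x - mean) / sd for x in col])
    return out


def correlation_matrix(z_columns):
    """Correlation matrix of already standardized columns."""
    n = len(z_columns[0])
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in z_columns]
            for ci in z_columns]


def eigenpairs(matrix, iterations=1000, seed=1):
    """Approximate eigenpairs of a symmetric matrix by power iteration with deflation."""
    p = len(matrix)
    work = [row[:] for row in matrix]
    rng = random.Random(seed)
    pairs = []
    for _ in range(p):
        vec = [rng.uniform(-1, 1) for _ in range(p)]
        for _ in range(iterations):
            nxt = [sum(work[i][j] * vec[j] for j in range(p)) for i in range(p)]
            norm = math.sqrt(sum(x * x for x in nxt))
            if norm < 1e-12:
                break
            vec = [x / norm for x in nxt]
        norm = math.sqrt(sum(x * x for x in vec))
        vec = [x / norm for x in vec]
        value = sum(vec[i] * work[i][j] * vec[j] for i in range(p) for j in range(p))
        pairs.append((value, vec))
        for i in range(p):  # remove the component just found
            for j in range(p):
                work[i][j] -= value * vec[i] * vec[j]
    return pairs


def factor_loadings(value, vec, suppress_below=0.3):
    """Component loadings, with small coefficients suppressed (shown as None)."""
    loadings = [x * math.sqrt(value) for x in vec]
    return [l if abs(l) >= suppress_below else None for l in loadings]


columns = standardize(simulate())
pairs = eigenpairs(correlation_matrix(columns))
retained = [pair for pair in pairs if pair[0] > 1.0]  # eigenvalue-greater-than-1 rule
loadings = factor_loadings(*retained[0])
print(len(retained), dict(zip(MEASURES, loadings)))
```

Because the sign of a principal component is arbitrary, only the relative signs of the loadings are interpretable; in the simulated data the diffusivity measures load in the opposite direction to FA by construction.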

Inclusion of cognitive measures
The cognitive measures considered for inclusion were compared between those cognitively impaired (n = 24) and those unimpaired (n = 16) using independent two-tailed t-tests. The tasks differing between groups (P < 0.05) then underwent corrections for estimated IQ and years of education using analyses of covariance. The tasks still demonstrating a significant difference (P < 0.05) between groups following these corrections were included.

Primary analyses
The white matter measures from the bilateral tracts included were reduced into single measures by averaging the left and right values. Consequently, there were seven bundles (corpus callosum sections 2 and 6, striato-prefrontal pathway, striato-parietal pathway, and SLF sections I, II and III) with four white matter measures associated with each tract (bundle volume, bundle load, whole bundle Tractometry factor and lesionometry factor). These white matter measures alongside lesion load and tractogram load were correlated against each task resulting in 30 correlations per task. Owing to the number of comparisons, whole sample (n = 40) Bayesian Pearson correlation was used to derive Bayes factors (BFs) to indicate the degree of evidence against the null hypothesis for each correlation (tolerance = 0.0001; maximum iterations = 2000; Monte Carlo samples = 10 000). At least moderate evidence (BF 0.1-0.33) for the alternative hypothesis was desired with the threshold for inclusion in subsequent analyses set to BF < 0.2 so that any relationships would be closer to demonstrating strong (BF < 0.1), as opposed to anecdotal (BF > 0.33), evidence for the alternative hypothesis (note in SPSS evidence for null hypothesis are values >1 and evidence for alternative hypothesis are values <1). 79 Only the surviving relationships per task were examined further to understand the nature of the relationships and to determine if these accounted for the task differences.

[Figure 3 caption, partially recovered] [...] whole-brain tractogram load, whereby the volume of white matter intersecting with lesions is used as the numerator, (C) bundle load, which is the volume intersecting lesions divided by entire volume (example bundle: arcuate fasciculus) and (D) lesion-based Tractometry (lesionometry), where diffusion-MRI measures are sampled along the entire length of the streamlines that intersect lesions. Measures A-C range from 0 (0% interaction between lesions and white matter) to 1 (100% interaction between lesions and white matter). Visualization was performed using FiberNavigator. 76
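The screening step can be summarized as enumerating the 30 measures per task and retaining those whose Bayes factor falls below 0.2. The sketch below illustrates this with the three Design Learning Bayes factors reported in the results plus one hypothetical value (0.25) added to show a relationship with moderate evidence that would still be excluded. The labels, function names and the treatment of the category boundaries are assumptions for illustration.

```python
from itertools import product

BUNDLES = ["CC genu", "CC isthmus", "striato-prefrontal", "striato-parietal",
           "SLF I", "SLF II", "SLF III"]
TRACT_MEASURES = ["bundle volume", "bundle load",
                  "Tractometry factor", "lesionometry factor"]
GLOBAL_MEASURES = ["lesion load", "tractogram load"]


def all_measures():
    """Seven bundles x four tract-specific measures, plus two global measures."""
    tract = [f"{bundle} {measure}" for bundle, measure in product(BUNDLES, TRACT_MEASURES)]
    return tract + GLOBAL_MEASURES


def evidence_label(bf):
    """Evidence category, with values below 1 favouring the alternative hypothesis."""
    if bf < 0.1:
        return "strong"
    if bf <= 0.33:
        return "moderate"
    if bf < 1:
        return "anecdotal"
    return "favours null"


def retained(bayes_factors, threshold=0.2):
    """Keep only relationships whose Bayes factor is below the inclusion threshold."""
    return {name: bf for name, bf in bayes_factors.items() if bf < threshold}


design_learning = {
    "CC genu bundle volume": 0.054,             # reported in the results
    "SLF III bundle volume": 0.002,             # reported in the results
    "striato-prefrontal bundle volume": 0.067,  # reported in the results
    "SLF I bundle load": 0.25,                  # hypothetical, for illustration only
}

survivors = retained(design_learning)
print(len(all_measures()), sorted(survivors))
```

Setting the threshold inside the moderate band means that a relationship can show moderate evidence and still be screened out, which is the conservative behaviour described in the text.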
Analyses of covariance were used to gauge the impact of the retained white matter measures on the task-specific group differences. In each model, the cognitive task was the dependent variable, group the independent variable, and estimated IQ and years of education were covariates alongside the task-specific white matter measures. As the focus was on covariate effects, any heterogeneity of regression slopes or group-covariate interactions were of interest. The aim was to determine if the white matter measures (covariates) would diminish the effect of group as a significant categorical predictor of task scores. The significance level adopted for the influence of the covariates and for the effect of group was P < 0.05. In addition to graphical linearity checks, Levene's Test of Equality of Error Variances was used to gauge any violations of the assumption of homogeneity of variances (P > 0.05), and covariate collinearity checks (r < 0.8) were applied when individual covariate effects were examined. Supplemental analyses conducted to elucidate the results included bivariate Pearson's correlations and independent t-tests.

Data availability
TractSeg is available at github.com/MIC-DKFZ/TractSeg. MRtrix is available at www.mrtrix.org/. FiberNavigator is available at github.com/chamberm/fibernavigator. The lesionometry toolbox will be available upon publication at github.com/chamberm/Lesionometry. The anonymized data and code used in the reported analyses are available on reasonable request from the first author.

Sample
The 40 participants were 27 women and 13 men, mean age 58 years (range 44-78) and mean disease duration 27 years (range . Median Expanded Disability Status Scale at clinical assessment was 2.5 (range 0-6.0). Sample characteristics by group and whether these differed statistically are presented in Table 1.
Whole brain lesion load was significantly higher in the cognitively impaired group (Table 1)  between groups. The greatest lesion frequency difference appeared to localize to the right temporoparietal region (Fig. 4).

Primary analyses
The datasets were complete across all tasks (n = 40) except Story Recall in which one datapoint was missing (n = 39).

Letter fluency
The Letter Fluency score difference between groups was moderate following corrections for estimated IQ and years in education [F(1,36) = 5.147, P = 0.029, ηp² = 0.125]. However, as all white matter measure correlations demonstrated BFs in favour of the null hypothesis (BFs ranging from 2.4 to 8.1), no further analyses were conducted. These results suggest the white matter measures included did not account for the difference in Letter Fluency scores between groups.

Story recall
The Story Recall score difference between groups was large following corrections for estimated IQ and years in education [F(1,35) = 6.359, P = 0.016, ηp² = 0.154]. Of the white matter measures examined at whole sample level, only the isthmus bundle load correlation (r = −0.448, BF = 0.138, P = 0.004, MES) reached the BF threshold (see Fig. 5A for bundle and tractogram load illustrations and 5B for correlation scatterplot). When entered as a covariate in an analysis of covariance [...]

Note. Estimated IQ and years of education were corrected for across analyses. IQ score (m = 100; SD = 15); EDSS <4 (low physical disability); MSIS29 (the higher the score, the greater the impact; score range 29-145); BDI II (scores 0-13 indicate minimal depression); Fatigue Severity <4 (unproblematic fatigue).

[...] (Fig. 5B), there were no outliers when lesion load values were examined by group. There was no group-lesion load interaction (P = 0.290) with the correlations with List Learning similar across groups (impaired: r = −0.420, P = 0.041; unimpaired: r = −0.424, P = 0.102). Lesion load was, however, higher in the cognitively impaired group than in the unimpaired (Table 1).

Figure recall
The Figure [...] Multiple correlations reached the BF threshold (Table 2). Despite being significant predictors of variance in Figure Recall scores (white matter measure effects P < 0.05), lesion load, tractogram load, the SLF II measures, and the striato-prefrontal measures did not attenuate the significant score difference between groups (all effects of group remained P < 0.05). The bilaterally averaged SLF II and striato-prefrontal measures did not mask any significant unilateral differences between groups in these measures. In contrast, isthmus bundle load did mitigate the significant difference in Figure Recall [...]

Note. The counterintuitive negative correlations of the striato-prefrontal Tractometry and lesionometry measures in the cognitively unimpaired, and the striato-prefrontal interactions, were owing to two cases (these were not outliers but when excluded the interactions disappeared). BL, bundle load; L, load; Lmtry, lesionometry; SLF, superior longitudinal fasciculus; St-Pref, striato-prefrontal pathway; St-Par, striato-parietal pathway; Tmtry, Tractometry. *P < 0.05, **P < 0.01, ***P < 0.0005.
The SLF I measures were all significant predictors (P < 0.05), but did not eradicate the group difference in scores until inputted together [effect of group F(1,33) = 3.987, P = 0.054, ηp² = 0.108, MES]. These measures did not mask any significant unilateral differences between groups in these measures. Similarly, the bilaterally averaged striato-parietal measures (Table 2) were significant predictors (P < 0.05), but did not attenuate the effect of group until inputted together [F(1,33) = 3.988, P = 0.054, ηp² = 0.108, MES]. However, both striato-parietal bundle load (impaired mean: 0.507; unimpaired mean: 0.324) and lesionometry (impaired mean: −0.244; unimpaired mean: 0.367) differed between groups (Table 2). These differences were underpinned by significant unilateral differences in right striato-parietal bundle load [impaired mean: 0.526, unimpaired mean: 0.318; t(38) = 2. [...]

As tract-specific microstructural measures were meaningful here, these measures are provided from the whole right striato-parietal bundle and its lesionometry. With the former, the groups differed significantly on two measures, MD and RD (Table 3). Regarding the latter, the groups differed on four measures, MD, RD, AFD, and RISH0. All differences were in the anticipated directions i.e. MD and RD higher, and AFD and RISH0 lower, in the cognitively impaired group. The whole bundle factor loadings were as follows: RD (−0. [...]

Design Learning
The Design Learning score difference between groups was moderate following corrections for estimated IQ and years in education [F(1,36) = 4.549, P = 0.040, ηp² = 0.112]. The correlations that reached the BF threshold were with genu bundle volume (r = 0.486, BF = 0.054, P = 0.001, MES), SLF III bundle volume (r = 0.602, BF = 0.002, P < 0.0005, LES), and striato-prefrontal bundle volume (r = 0.477, BF = 0.067, P = 0.002, MES). When entered alongside estimated IQ and years in education as covariates, neither genu nor striato-prefrontal bundle volumes (despite being significant predictors P < 0.05) accounted for the significant difference in scores between groups (effect of group P = 0.024 and P = 0.015, respectively). The groups did not differ in these white matter measures, but the whole sample correlations again appeared driven by the cognitively impaired group (genu impaired: r = 0.623, P = 0.001, unimpaired: r = 0.002, P = 0.994; striato-prefrontal impaired: [...]

When the SLF III bundle volume was entered as a covariate alongside estimated IQ and years of education, it was a unique predictor [F(1,35) = 10.074, P = 0.003, [...]

Note. MD and RD group means were higher, and FA, AFD, NuFO, RISH0, and RISH2 lower, in the cognitively impaired group than in those unimpaired. * significant at P < 0.05.

Discussion
The mapping of region-specific and tract-specific structural differences with cognitive differences in multiple sclerosis remains underutilized for recognizing individual differences, predicting functional outcomes, optimizing disease monitoring, and supporting treatment options.
Here we have demonstrated that, in a cohort with longstanding multiple sclerosis, relatively low physical disability and functional impact (Table 1), four out of five cognitive differences were explained by variation in white matter measures. Furthermore, three cognitive differences were better explained by tract-specific than global measures. We have demonstrated that variation in pathological burden reflected by both macro- and microstructural tract-specific measures underpinned these cognitive group differences.
Despite the groups being well matched across demographic and clinical variables, cognitive differences were demonstrated along with greater lesion burden in those classed as cognitively impaired (Table 1). Indeed, whilst tractogram load did not differ between groups, on average 47% of the tractogram was affected by lesions in the cognitively impaired group in contrast to the 34% in those unimpaired. Overall, these global measures were less meaningful as predictors of cognitive differences than tract-specific measures, but tractogram load nonetheless supplements lesion load by allowing for convenient quantification and illustration of the structural network interacting with lesions, beyond that afforded by conventional 2D slice-based visualizations (Fig. 5A).

Verbal tasks
None of the white matter imaging measures selected for this study accounted for the difference in Letter Fluency between groups. The group difference in List Learning, in turn, was the only one explained by a global imaging measure (lesion load), which is in keeping with previous studies linking verbal learning with lesion load in relapsing-remitting multiple sclerosis. 80,81 In contrast, both verbal and visual immediate recall score differences were accounted for by the isthmus of the corpus callosum. It appears the pathological burden within the isthmus (mean bundle load 60%, n = 40) was the key driver of the adverse impact on these task scores as opposed to the isthmus being the most critical structure supporting these functions. While temporoparietal regions are involved in recall 82 and the isthmus likely contributes, visual recall was better predicted by the striato-parietal pathways and verbal recall has previously been associated with other white matter tracts [83][84][85] not considered in this study.

Visual tasks
Right striato-parietal lesionometry and isthmus bundle load were the only measures that, by themselves, mitigated the group difference in Figure Recall. The mean bundle loads of both these tracts differed between groups (isthmus: impaired 70%, unimpaired 45%; right striato-parietal: impaired 53%, unimpaired 32%) suggesting a critical role for difference in tract-specific pathological burden in explaining the group difference in Figure Recall. In addition to bundle load, Tractometry and lesionometry featured particularly in association with Figure Recall. In fact, the lesionometry correlations were the highest among the SLF I, SLF II, and striato-parietal measures surviving the BF threshold (Table 2). Therefore, the interaction of lesions with the microstructure of the streamlines traversing them was especially relevant to Figure Recall performance. Having reduced multiple microstructural measures using principal component analysis, we uncovered neuropathological effects on measures such as AFD and RISH0 (alongside RD and MD) in lesionometry, which have not been studied as extensively as, for example, FA in multiple sclerosis, but which differed between groups and were among the highest loading measures across factors meaningful for visual recall (Table 3).
Visual learning, in turn, was particularly associated with striato-prefrontal, genu, and SLF III whole bundle volumes, with only the latter, and the left SLF III volume specifically, eradicating the significant score difference between groups. There were no group differences in SLF III bundle loads and so the group differences in the bilaterally averaged, and left, SLF III volumes may reflect neuropathology extending beyond the streamlines directly interacting with lesions. Together these results suggest that the pathological burden on tracts converging in parietal regions was particularly meaningful in accounting for group differences in visuospatial-constructional tasks i.e. Figure Recall and Design Learning. This, in turn, is congruent with previous associations between posterior temporal and/or parietal regions and performance in these types of task. 86,87

Key contributions
Periventricular and posterior regions have often been associated with neuropathological burden in multiple sclerosis with some having demonstrated structural differences between multiple sclerosis groups in these regions previously, particularly within the posterior corpus callosum. 7,47,64 However, we have enhanced these previous findings by linking variation in several structures to differences in specific cognitive functions. Our results have demonstrated that both (task-relevant) regional white matter and tract-specific structural reserve are important in understanding cognitive differences in multiple sclerosis, even in samples considered to have low disease impact (Table 1). We highlight that quantifying structural reserve benefits from specificity not provided by global measures such as lesion load or tractogram load, or even whole bundle measures, which can mask the relationships between individual tracts and specific cognitive differences. 
Furthermore, considering the differences in white matter measure correlations between groups across the visual tasks, the potential cumulative effects of tract-specific neuropathological variation across task-relevant structures may underpin these group differences by relatively better structural reserve facilitating better functional (i.e. cognitive) reserve. This, in turn, may mask relevant structure-function relationships in those cognitively unimpaired. [88][89][90] Whilst it is important to acknowledge the relatively small group sizes, this pattern of differing correlations was repeated across all measure-visual task relationships with the cognitively impaired group driving the strengths of the whole sample correlations. It is worth noting that despite only a few white matter measures demonstrating statistically significant group differences, across the tracts included in this study, all measure group means favoured the cognitively unimpaired group with only two exceptions (SLF II bundle volume and genu lesionometry).

Limitations
Whilst we have demonstrated that cognitive group differences can be explained by neuropathology in white matter, we are not suggesting that cortical influences can be discounted. The potential concomitant effects derived from cortical structures are unknown, and future work including cortical variables alongside these measures may elucidate their relationships and respective contributions to behaviour. Another consideration is that among the white matter measure-cognitive task relationships that did not reach the BF threshold, there may have been some meaningful unilateral correlations that were masked by the averaging of the right and left values from the striato-cortical and SLF pathways. Owing to asymmetries in lesion distributions across bilateral tracts, future work could uncover further unilateral tract-specific influences on particular cognitive tasks. A further limitation is the approach taken to classify cognitive impairment as this allowed for there to be 'normal' and 'impaired' performances in the impaired and unimpaired groups, respectively, at the level of the individual task. However, this was somewhat mitigated by the focus being on tasks that differed between groups. Last, as the bundles were selected owing to traversing regions often associated with neuropathology in multiple sclerosis, rather than according to the neuroanatomy associated with specific cognitive tasks, it is not known if other white matter structures may have contributed to performance differences.

Conclusion
These results highlight the potential for diffusion-weighted MRI to disentangle the relationships between regional neuropathology and performance on specific cognitive tasks in multiple sclerosis. The benefits for clinical practice include the ability of this approach to better measure eligibility and the effects of targeted therapeutic and rehabilitative interventions, allowing also for greater precision in measurement of outcomes. It may be that the measures examined in this study can offer superior pathological specificity to clinically relevant processes, in which case new treatment effects can be uncovered and quantified. However, it is not yet clear when one measure may demonstrate better sensitivity than another and future work could help inform how to optimize their application.