Detection of β-amyloid positivity in Alzheimer’s Disease Neuroimaging Initiative participants with demographics, cognition, MRI and plasma biomarkers

Abstract
The in vivo gold standard for the ante-mortem assessment of brain β-amyloid pathology is currently β-amyloid positron emission tomography or cerebrospinal fluid measures of β-amyloid42 or the β-amyloid42/β-amyloid40 ratio. The widespread acceptance of a biomarker classification scheme for the Alzheimer’s disease continuum has ignited interest in more affordable and accessible approaches to detect Alzheimer’s disease β-amyloid pathology, a process that often slows down the recruitment into, and adds to the cost of, clinical trials. Recently, there has been considerable excitement concerning the value of blood biomarkers. Leveraging multidisciplinary data from cognitively unimpaired participants and participants with mild cognitive impairment recruited by the multisite biomarker study of Alzheimer’s Disease Neuroimaging Initiative, here we assessed to what extent plasma β-amyloid42/β-amyloid40, neurofilament light and phosphorylated-tau at threonine-181 biomarkers detect the presence of β-amyloid pathology, and to what extent the addition of clinical information such as demographic data, APOE genotype, cognitive assessments and MRI can assist plasma biomarkers in detecting β-amyloid-positivity. Our results confirm plasma β-amyloid42/β-amyloid40 as a robust biomarker of brain β-amyloid-positivity (area under curve, 0.80–0.87). Plasma phosphorylated-tau at threonine-181 detected β-amyloid-positivity only in the cognitively impaired with a moderate area under curve of 0.67, whereas plasma neurofilament light did not detect β-amyloid-positivity in either group of participants. Clinical information as well as MRI-score independently detected positron emission tomography β-amyloid-positivity in both cognitively unimpaired and impaired (area under curve, 0.69–0.81).
Clinical information, particularly APOE ε4 status, enhanced the performance of plasma biomarkers in the detection of positron emission tomography β-amyloid-positivity by 0.06–0.14 units of area under curve for cognitively unimpaired, and by 0.21–0.25 units for cognitively impaired; and further enhancement of these models with an MRI-score of β-amyloid-positivity yielded an additional improvement of 0.04–0.11 units of area under curve for cognitively unimpaired and 0.05–0.09 units for cognitively impaired. Taken together, these multi-disciplinary results suggest that when combined with clinical information, plasma phosphorylated-tau at threonine-181 and neurofilament light biomarkers, and an MRI-score could effectively identify β-amyloid+ cognitively unimpaired and impaired (area under curve, 0.80–0.90). Yet, when the MRI-score is considered in combination with clinical information, plasma phosphorylated-tau at threonine-181 and plasma neurofilament light have minimal added value for detecting β-amyloid-positivity. Our systematic comparison of β-amyloid-positivity detection models identified effective combinations of demographics, APOE, global cognition, MRI and plasma biomarkers. Promising minimally invasive and low-cost predictors such as plasma biomarkers of β-amyloid42/β-amyloid40 may be improved by age and APOE genotype.


Introduction
Alzheimer's disease (AD), pathologically defined as the presence of plaques of β-amyloid (Aβ) protein, neurofibrillary tangles of tau protein and neurodegeneration (DeTure and Dickson, 2019), is the major cause of cognitive decline and dementia (2020). Currently, no treatment is approved that has been demonstrated to slow the progress of AD (Aisen, 2019). Historically, AD was diagnosed clinically through neurological and neuropsychological examinations to assess memory impairment and other thinking skills, judge functional abilities and identify behaviour changes, and exclude other causes than AD that could account for the dementia (McKhann et al., 2011). The 'gold-standard' method to confirm the presence of AD pathology is pathological examination of brains at autopsy (DeTure and Dickson, 2019). Since the turn of the century, the ability to diagnose AD pathology in living people has been made possible by the development of radioligands for Aβ positron emission tomographic (PET) scans (Klunk et al., 2004; Schilling et al., 2016) and tau PET scans (Marquie et al., 2015; Leuzy et al., 2019), magnetic resonance imaging (MRI) for neurodegeneration (Frisoni et al., 2010) and analysis of cerebrospinal fluid (CSF) for Aβ and tau species (Blennow, 2004; Holtzman, 2011). This has led to an in vivo biological framework of AD including Aβ, tau and neurodegeneration, based on the so-called A/T/N system (Jack et al., 2018). Indeed, the descriptive A/T/N system places Aβ+ individuals firmly on the AD continuum, whereas individuals with Aβ− profiles are considered either normal or possessing non-AD pathologic changes (Jack et al., 2018). Many trials, particularly the ones enrolling patients in earlier stages of disease, are therefore using either Aβ PET imaging or CSF Aβ42 levels as a critical step in clinical trial cohort enrichment (Sperling et al., 2014; Honig et al., 2018). Despite these advances, PET scans are quite expensive and not universally accessible.
Although lumbar punctures are very safe (Peskind et al., 2009), there continues to be reluctance to CSF sample collection in the patient and professional population (Moulder et al., 2017). Therefore, there has been great interest in developing low cost, minimally invasive methods to detect AD Aβ pathology compared to PET scans and/or CSF as the 'gold standard'. Many publications (reviewed in Ashford et al.) have evaluated the role of demographics (Insel et al., 2016; Tosun et al., 2016; Jansen et al., 2018; Buckley et al., 2019; Ko et al., 2019; Maserejian et al., 2019), APOE ε4 (de Rojas et al., 2018; Jansen et al., 2018; Ten Kate et al., 2018; Ba et al., 2019; Buckley et al., 2019), cognition (Mielke et al., 2012; Burnham et al., 2014; Kandel et al., 2015; Burnham et al., 2016; Insel et al., 2016; Kim et al., 2018; Lee et al., 2018; Ba et al., 2019; Brunet et al., 2019; Maserejian et al., 2019; Ansart et al., 2020) and MRI measures (Tosun et al., 2013, 2014, 2016; Ten Kate et al., 2018; Petrone et al., 2019; Ansart et al., 2020; Ezzati et al., 2020) to detect AD Aβ pathology. More recently, there has been considerable excitement concerning the value of assays of plasma Aβ species and related proteins (Burnham et al., 2014, 2016; Kaneko et al., 2014; Fandos et al., 2017; Ovod et al., 2017; Park et al., 2017; de Rojas et al., 2018; Nakamura et al., 2018; Verberk et al., 2018; Westwood et al., 2018; Chatterjee et al., 2019; Chen et al., 2019; Goudey et al., 2019; Lin et al., 2019; Palmqvist et al., 2019a,b; Park et al., 2019; Perez-Grijalba et al., 2019; Vergallo et al., 2019), species of plasma tau, including phosphorylated tau (p-tau) forms (Mielke et al., 2018; Palmqvist et al., 2019b; Barthélemy et al., 2020; Janelidze et al., 2020a; Karikari et al., 2020; Palmqvist et al., 2020; Thijssen et al., 2020) and plasma neurofilament light (NfL) (Palmqvist et al., 2019b; Thijssen et al., 2020) to detect AD Aβ pathology.
The first reports of reproducible high precision, high accuracy tests of plasma Aβ42/Aβ40 indicated high sensitivity and specificity for Aβ plaques as measured by mass spectrometry (Ovod et al., 2017; Nakamura et al., 2018). Subsequently, plasma measures of p-tau at residues 181 (Mielke et al., 2018) and 217 (Barthélemy et al., 2020; Palmqvist et al., 2020) indicated good performance relative to Aβ plaques and tau tangles. The performance of these tests is being evaluated and has been shown to detect PET Aβ-positivity conversion (Schindler et al., 2019), to be associated with cognitive decline and to correlate with AD pathology (Janelidze et al., 2020a). If proven useful, these alternative approaches to detect AD Aβ pathology may play an important role in drug discovery and in accelerating identification of risk factors for AD with greater precision.
For optimal and generalizable operationalization of such imputation approaches for the presence of AD Aβ pathology, it is important to assess the independent and added value of each class of predictors (e.g. demographics, APOE ε4, cognition, plasma biomarkers, MRI, etc.) and the differences in their classification performances at different clinical stages. The Alzheimer's Disease Neuroimaging Initiative (ADNI) is a large, multi-site, longitudinal study aimed at validating biomarkers for AD clinical trials. ADNI participants have Aβ PET scans, lumbar punctures for CSF and blood drawn for plasma studies, therefore allowing for a head-to-head comparison. This study specifically aimed to assess (i) to what extent plasma Aβ42/Aβ40, NfL and p-tau181 biomarkers detect the presence of AD Aβ pathology (i.e. Aβ-positivity); (ii) to what extent the addition of demographic data, APOE genotype and cognitive assessments and (iii) MRI can assist plasma biomarkers in detecting Aβ-positivity and (iv) to what extent the stage of clinical diagnosis affects these relationships.

Study design
Data used in the preparation of this article were obtained from the ADNI database (adni.loni.usc.edu). The ADNI was launched in 2003 as a public-private partnership, led by Principal Investigator Michael W. Weiner, MD. The primary goal of ADNI has been to test whether serial MRI, PET, other biological markers and clinical and neuropsychological assessment can be combined to measure the progression of mild cognitive impairment (MCI) and early AD. Up-to-date information is available at www.adni-info.org.

Cohort
Subjects of this study were ADNI participants with known PET Aβ status and with plasma biomarker assessments for p-tau181 and NfL, clinical assessments and structural MRI within 6 months of Aβ PET imaging. A subset of the main study cohort also had plasma biomarker assessment for Aβ42/Aβ40. The primary focus of this study was to assess imputation of Aβ positivity from single time-point observations of clinical, neuroimaging and plasma biomarker data; therefore, a cross-sectional study design was used. Although longitudinal biomarkers, neuroimaging and clinical data are available for many ADNI participants, we considered only data from the first time-point with complete clinical, neuroimaging and biomarker assessments for each participant to avoid circular model training and assessment. Clinical assessment closest in time to Aβ PET imaging was used to define cognitively unimpaired (CU) and cognitively impaired (CI) diagnostic groups. The diagnostic criteria for ADNI participants were described previously (Petersen et al., 2010). Participant selection was made a priori from all ADNI subjects based on the availability of complete cross-sectional data as of 30 June 2020.

PET Aβ status
Mean tracer uptake in the cerebellar grey and white matter was computed and used as reference to generate whole-brain standardized uptake value ratio (SUVR) maps of florbetapir PET scans (Jagust et al., 2015). A composite region-of-interest consisting of middle frontal, anterior cingulate, posterior cingulate, inferior parietal, precuneus, supramarginal, middle temporal and superior temporal regions was used to compute a global SUVR for florbetapir. A threshold of SUVR ≥ 1.11 for florbetapir (Landau et al., 2013) was then used to determine the status of PET Aβ.
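The thresholding step can be illustrated with a minimal Python sketch. It assumes regional uptake values are already available as plain numbers, and it averages the composite regions without weighting; the weighting scheme, the function names and the dictionary keys are illustrative assumptions, not the ADNI processing pipeline. Only the region list and the 1.11 cut-off come from the text.

```python
from statistics import mean

# Composite regions and cut-off as listed in the text (Landau et al., 2013).
COMPOSITE_REGIONS = [
    "middle_frontal", "anterior_cingulate", "posterior_cingulate",
    "inferior_parietal", "precuneus", "supramarginal",
    "middle_temporal", "superior_temporal",
]
SUVR_THRESHOLD = 1.11


def global_suvr(regional_uptake, reference_uptake):
    """Mean composite uptake divided by reference-region uptake.

    Assumption: an unweighted mean over whichever composite regions are
    present in `regional_uptake`; the real pipeline may weight by volume.
    """
    if reference_uptake <= 0:
        raise ValueError("reference uptake must be positive")
    values = [regional_uptake[r] for r in COMPOSITE_REGIONS if r in regional_uptake]
    if not values:
        raise ValueError("no composite regions supplied")
    return mean(values) / reference_uptake


def is_amyloid_positive(suvr, threshold=SUVR_THRESHOLD):
    """PET amyloid status: positive when SUVR is at or above the threshold."""
    return suvr >= threshold
```

For example, `is_amyloid_positive(global_suvr({"precuneus": 1.3, "posterior_cingulate": 1.2}, 1.0))` evaluates the rule on a two-region toy input.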

Demographics data
Age at florbetapir PET imaging, sex and years of education were included as demographic characteristics of each participant.

Apolipoprotein E genotyping
APOE genotyping was done by the ADNI Genetics Core using DNA from blood samples as detailed in Supplementary Material. APOE ε4 carrier status was considered as a predictor of Aβ-positivity in this study.

Global cognitive assessments
ADNI participants were assessed with a wide spectrum of clinical and cognitive tests. In this study, we limited the global cognitive assessments to the Clinical Dementia Rating-Sum of Boxes (CDR-SB), the Alzheimer's Disease Assessment Scale-Cognitive subscale 13-item (ADAS-Cog) and the Mini-Mental State Examination (MMSE) based on a 30-point questionnaire.

Plasma sample collection
Blood samples were obtained by venipuncture in EDTA tubes for plasma by following the ADNI protocol (Kang et al., 2015). Within 60 min, the samples were centrifuged at 3000 r.p.m. at room temperature, aliquoted and stored at −80 °C. Samples underwent two freeze/thaws. Further details are provided in Supplementary Material.

Plasma Aβ42 and Aβ40
Plasma Aβ isoform concentrations were determined using immunoprecipitation combined with liquid chromatography tandem mass spectrometry (LC-MS/MS) as described previously (Ovod et al., 2017). Plasma aliquots were thawed at 21 °C/800 RPM for 10 min and centrifuged at 21 °C/10 000 RCF for 5 min prior to immunoprecipitation. Targeted Aβ42 and Aβ40 isoforms were immunoprecipitated with an anti-Aβ mid-domain antibody (HJ5.1) using a KingFisher (Thermo) automated immunoprecipitation platform. Immuno-enriched fractions were subsequently digested with Lys-N protease, generating Aβ28-42 and Aβ28-40 species, which were measured by LC-MS/MS (Ovod et al., 2017). Absolute Aβ isoform concentrations were determined with a ¹⁵N-labelled internal standard for each isoform. The total levels of Aβ42 and Aβ40 were used to calculate the Aβ42/Aβ40 ratio.

Plasma p-tau181
Plasma p-tau181 was analysed by the Single-molecule array (Simoa) technique (Quanterix, Billerica, MA), using an assay developed in the Clinical Neurochemistry Laboratory, University of Gothenburg, Sweden. The assay uses a combination of two monoclonal antibodies (Tau12 and AT270) and measures N-terminal to mid-domain forms of p-tau181. Calibrators were run as duplicates, whereas plasma samples were measured in singlicate. All the available samples were analysed in a single batch.

Plasma NfL
Plasma NfL was analysed by the Simoa technique (Quanterix, Billerica, MA). The assay uses a combination of monoclonal antibodies, and purified bovine NfL as a calibrator. Calibrators were run as triplicates, whereas plasma samples were measured in singlicate. All the available samples were analysed in a single batch.

MRI-score for Aβ-positivity
3T MRI data included a 3D MP-RAGE or IR-SPGR T1-weighted acquisition, as described online (http://adni.loni.usc.edu/methods/documents/mri-protocols). We employed a previously proposed methodology for assessing brain Aβ positivity status (Lang et al., 2019). Briefly, 972 ADNI patients with structural MRI scans and with known Aβ status based on either CSF or Aβ PET imaging were used to train a deep learning model. The deep learning model training cohort included individuals at different clinical stages (CU, subjective memory complaint, early/late MCI and dementia), but excluded the participants of this study with plasma biomarker data. The method yields a probabilistic score of Aβ-positivity between 0 and 1.

Statistical analysis
All analyses were performed on CU and CI data separately.
Demographic, clinical and biomarker characteristic differences between Aβ+ and Aβ− participants were described using the two-sample t-test and the χ² test for continuous and categorical variables, respectively.
Demographic characteristics (age, sex and years of education), APOE genotype, cognitive scores (MMSE, ADAS-Cog, and CDR-SB), plasma Aβ42/Aβ40, p-tau181 and NfL levels, and derived MRI-score were used as inputs to construct random forest (RF) classifiers to detect Aβ-positivity using florbetapir PET status as the ground truth. The RF approach was pre-selected based on previously reported classification performance (Fernández-Delgado et al., 2014) and the flexibility of RF models in handling a mixture of numerical (age, years of education, cognitive scores, plasma levels and MRI-score) and categorical (sex and APOE genotype) features. A reference RF classifier was constructed from demographics and cognitive scores, referred to hereafter as the clinical information. A second reference RF classifier was also constructed from MRI-score alone. To assess the added value of each class of variables (i.e. clinical, plasma and MRI classes), additional RF classifiers were constructed from (i) each plasma marker alone, (ii) each plasma marker jointly with clinical features, (iii) MRI-score jointly with clinical features and (iv) each plasma marker jointly with clinical features and MRI-score.
The RF model construction was repeated 10 times using different random seeds, and the average model performance was reported. Each data set (CU and CI data sets) was randomly divided into training and test data sets, using non-overlapping 80%/20% split. Each data set used the same partitioning for all classifiers for an unbiased comparison between classifiers (Vanschoren et al., 2012). The models were built on each training split, and the performance on the test data sets was evaluated, and this process was repeated 10 times. Performance was presented as mean and standard deviation over the model runs. We generated sensitivity-specificity curves based on model classifications on the test data. For each sensitivity-specificity curve, we also computed the area under curve (AUC) values. A confidence interval of 95% was chosen. AUC of two classifiers was compared with DeLong test (DeLong et al., 1988). Additionally, we computed accuracy, sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) on each set of model classifications at classifier probability cut-off of 0.5.
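The evaluation protocol in this paragraph can be sketched in plain Python. The sketch below generates one seeded 80%/20% partition per repeat (to be reused for every classifier), computes accuracy, sensitivity, specificity, PPV and NPV at a probability cut-off of 0.5, and computes AUC through its rank (Mann-Whitney) interpretation. The function names are illustrative, treating a probability exactly equal to the cut-off as positive is an assumption, and the DeLong comparison is not reproduced here.

```python
import random


def repeated_splits(n_samples, n_repeats=10, test_fraction=0.2, base_seed=0):
    """One (train, test) index partition per repeat, reproducible by seed."""
    n_test = round(n_samples * test_fraction)
    splits = []
    for repeat in range(n_repeats):
        order = list(range(n_samples))
        random.Random(base_seed + repeat).shuffle(order)
        splits.append((sorted(order[n_test:]), sorted(order[:n_test])))
    return splits


def classification_metrics(y_true, y_prob, cutoff=0.5):
    """Confusion-matrix metrics at a fixed probability cut-off."""
    tp = fp = tn = fn = 0
    for truth, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= cutoff else 0  # assumption: ties count as positive
        if predicted == 1 and truth == 1:
            tp += 1
        elif predicted == 1 and truth == 0:
            fp += 1
        elif predicted == 0 and truth == 0:
            tn += 1
        else:
            fn += 1

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def auc(y_true, y_prob):
    """Probability that a random positive scores above a random negative."""
    positives = [p for t, p in zip(y_true, y_prob) if t == 1]
    negatives = [p for t, p in zip(y_true, y_prob) if t == 0]
    if not positives or not negatives:
        raise ValueError("both classes are required to compute AUC")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))
```

Calling `repeated_splits` once per data set and passing the same list of partitions to every classifier keeps the comparison between classifiers on identical test cases.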
Finally, for RF models with multiple variables, the mean decrease in accuracy caused by a variable was determined based on the out-of-bag error estimates. The more the accuracy of the RF decreases due to the exclusion of a single variable, the more important that variable was deemed for the classification of the data.
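The idea behind a mean-decrease-in-accuracy ranking can be sketched with a generic permutation procedure. The Python example below is a simplified stand-in and not the out-of-bag computation of an RF implementation: it shuffles one feature at a time on a fixed evaluation set and records the average drop in accuracy of an arbitrary `predict` function. The function names, the row-as-dictionary data layout and the number of repeats are assumptions made for illustration.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows for which predict(row) equals the label."""
    return sum(predict(row) == label for row, label in zip(rows, labels)) / len(labels)


def permutation_importance(predict, rows, labels, features, n_repeats=20, seed=0):
    """Mean drop in accuracy after shuffling each feature across rows."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importance = {}
    for feature in features:
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[feature] for row in rows]
            rng.shuffle(shuffled)
            # Copy each row with only the chosen feature replaced.
            permuted = [dict(row, **{feature: value}) for row, value in zip(rows, shuffled)]
            drops.append(baseline - accuracy(predict, permuted, labels))
        importance[feature] = sum(drops) / n_repeats
    return importance
```

A feature that the classifier ignores yields an importance of zero under this procedure, whereas a feature the classifier relies on yields a positive value.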
The main analyses reported below with PET Aβ-positivity as the gold standard for Aβ-positivity were repeated with CSF Aβ-positivity and the results are shown in Supplementary Fig. 1. Results from another secondary analysis are also shown in Supplementary Fig. 2, in which each classifier model was considered in a sub-sample constrained to the plasma Aβ42/Aβ40 cohort where all relevant data were available, yielding a fixed sample size across all classifier models. Finally, the main analyses were repeated by restricting clinical information to age and APOE genotype (Supplementary Fig. 3).
All analyses were performed using the R language and environment for statistical computing version 4.0.1 (R Foundation for Statistical Computing).

Data availability
Data used in this study has been made publicly available by the ADNI in the Laboratory of Neuro Imaging (LONI) database.

Results
The results of plasma Aβ42/Aβ40 for nine CU and nine CI participants failed quality control at measurement. No outliers (i.e. values more than 4 standard deviations from the mean) were detected in the plasma Aβ42/Aβ40 measurements. Samples from three CU and one CI participants were measured below the lower limit of quantification of 1.0 pg/ml for plasma p-tau181. We identified an additional five CU and five CI participants with outlier plasma p-tau181 values, who were excluded from subsequent analyses. Analytical sensitivity for plasma NfL was <1.0 pg/ml, and no sample contained NfL levels in plasma below the limit of detection, but 5 CUs and 11 CIs were excluded from our analyses due to outlier plasma NfL values. Participants with dementia were excluded for two main reasons. First, 91% of the AD participants (n = 235) with plasma NfL and plasma p-tau181 biomarker data were PET Aβ-positive. An unbiased classification performance analysis with a prevalence of 91% Aβ-positivity would have required a sample size of >500 (Hanczar et al., 2010). Second, cross-sectional plasma Aβ42/Aβ40 data were available only for non-demented participants. The final main study cohort was composed of 333 CU and 519 CI elderly individuals. Participant characteristics are reported in Table 1.
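The outlier rule mentioned above (values more than 4 standard deviations from the mean) can be written as a short Python sketch. It assumes the mean and sample standard deviation are computed once over all values, including any outliers, and that the rule is applied in a single pass; the text does not specify these details, and the function name is illustrative.

```python
from statistics import mean, stdev


def split_outliers(values, n_sd=4.0):
    """Separate values lying more than n_sd sample SDs from the mean.

    Assumption: a single pass using the mean and SD of the full input.
    """
    if len(values) < 2:
        return list(values), []
    centre = mean(values)
    spread = stdev(values)
    kept, outliers = [], []
    for value in values:
        if spread > 0 and abs(value - centre) > n_sd * spread:
            outliers.append(value)
        else:
            kept.append(value)
    return kept, outliers
```

With small samples a single extreme value inflates the standard deviation enough that it may not exceed the 4 SD bound, so the rule is only informative for reasonably large groups.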
In brief, 33% of CU participants in the main study cohort were PET Aβ+. The frequency of APOE ε4 allele was higher among Aβ+ CUs compared to Aβ− CUs. Compared to Aβ− CUs, Aβ+ CUs were older with fewer females and had significantly fewer years of education, greater CDR-SB and ADAS-Cog scores, as well as greater plasma NfL levels (Fig. 1). Plasma p-tau181 levels were marginally higher in Aβ+ CUs compared to Aβ− CUs (P = 0.057). When controlled for age differences, Aβ− CUs and Aβ+ CUs did not differ in ADAS-Cog scores and plasma NfL levels. Demographic and clinical characteristics of CUs in the plasma Aβ42/Aβ40 sub-cohort did not differ from those of the main study CUs. Within the plasma Aβ42/Aβ40 sub-cohort, Aβ+ CUs had lower plasma Aβ42/Aβ40 compared to Aβ− CUs (Fig. 1; P < 10⁻⁶).
In total, 57% of CI participants in the main study cohort were PET Aβ+. Aβ+ CIs were older than Aβ− CIs with fewer years of education and a higher frequency of APOE ε4 allele. Compared to Aβ− CIs, Aβ+ CIs had greater clinical symptoms, with lower MMSE and higher CDR-SB and ADAS-Cog scores. Aβ+ CIs had significantly higher plasma p-tau181 and plasma NfL levels than Aβ− CIs (Fig. 1). Aβ− versus Aβ+ CI group differences in clinical scores and plasma levels were significant after controlling for age differences. Compared to the CIs in the main study cohort, CIs in the plasma Aβ42/Aβ40 sub-cohort had lower symptom severity (i.e. mean CDR-SB of 1.4 versus 0.7 with P < 10⁻¹⁵ and mean ADAS-Cog of 9.2 versus 7.8 with P = 0.002) and lower plasma NfL levels (39.5 versus 34.5 pg/ml with P = 0.01). Within the plasma Aβ42/Aβ40 sub-cohort, Aβ+ CIs had significantly lower plasma Aβ42/Aβ40 compared to Aβ− CIs (Fig. 1; P < 10⁻¹⁰).
Performance of classifiers differentiating Aβ+ and Aβ− CU participants is shown and summarized in Figs. 2a and 3a and Table 2. A classifier constructed with only clinical information (i.e. demographics, APOE ε4 carrier status and global cognitive assessments) and a classifier constructed with only the MRI-score had similar performances (i.e. DeLong P = 0.06) with an accuracy of 67-68% in differentiating Aβ+ CUs and Aβ− CUs (Supplementary Fig. 4). Of these two classifiers, the MRI-score yielded better AUC (0.74 versus 0.69) reflected in higher NPV of MRI-score (76% versus 68%) and poor sensitivity of clinical information (3% versus 46%). When considered alone and together, plasma p-tau181 and plasma NfL did not differentiate Aβ+ and Aβ− CUs better than chance (Table 2; column (A)). In contrast, plasma Aβ42/Aβ40 alone differentiated Aβ+ CUs from Aβ− CUs with an accuracy of 72%, a PPV of 69% and an NPV of 76%, yielding an AUC of 0.80. The overall performance of the plasma Aβ42/Aβ40 only classifier was similar to the performance of a classifier using MRI-score and clinical information jointly (i.e. AUC of 0.80; DeLong P = 0.53), with plasma Aβ42/Aβ40 having slightly better PPV (69% versus 65%), whereas the multi-disciplinary MRI-score and clinical information jointly had slightly better accuracy (i.e. 75% versus 72%) and NPV (i.e. 78% versus 76%). All three plasma biomarkers jointly differentiated Aβ+ CU and Aβ− CU at an improved accuracy of 77%, a PPV of 77% and an NPV of 80%, yielding an AUC of 0.83, but this was not significantly different from the performance of the plasma Aβ42/Aβ40 alone classification (DeLong P = 0.09). When combined with clinical information (Table 2; column (B)), the predictive performance of the plasma p-tau181 and plasma NfL improved but not beyond the performance of the classifier constructed from clinical information alone (i.e. DeLong P = 0.18 and P = 0.08, respectively).
Adding clinical information to the plasma Aβ42/Aβ40 classifier yielded better differentiation of Aβ+ CU and Aβ− CU cases with an accuracy of 79%, PPV of 77%, NPV of 81% and an AUC of 0.86, with the greatest improvement in the PPV compared to plasma Aβ42/Aβ40 only and clinical information only classifiers. Further enhancing plasma NfL and plasma p-tau181 with the MRI-score in addition to the clinical information improved classification accuracy by 5-8%, PPV by 13-22%, NPV by 8-11% and AUC by 0.10-0.14 (DeLong P < 10⁻¹⁴ and P < 10⁻²¹, respectively) but this was not better than the classifier constructed with the MRI-score and clinical information (i.e. DeLong P = 0.08 and P = 0.46, respectively) or the classifier based on plasma Aβ42/Aβ40 only (i.e. DeLong P = 0.07 and P = 0.88, respectively) as reported in Table 2 (column (C)). Of the three plasma biomarkers, Aβ42/Aβ40 in combination with the MRI-score and clinical information performed the best with an accuracy of 83% and AUC of 0.90, with a well-balanced PPV of 84% and NPV of 83%, which was significantly better than the performance of Aβ42/Aβ40 alone (i.e. DeLong P < 10⁻⁴) or in combination with clinical information (i.e. DeLong P = 0.02).
The full classifier model including all three plasma biomarkers, the MRI-score and clinical information had an accuracy of 82%, with a PPV of 90% and NPV of 79%. However, this was not significantly different from the classifier model with plasma Aβ42/Aβ40, MRI-score and clinical information (DeLong P = 0.61), suggesting minimal added value of plasma NfL and plasma p-tau181. The most significant variables in a decreasing order of importance based on mean decrease in accuracy analysis were plasma Aβ42/Aβ40, MRI-score, APOE ε4 status, MMSE, years of education and sex.
Performance of classifiers differentiating Aβ+ and Aβ− CI participants is shown and summarized in Figs. 2b and 3b and Table 3. Both clinical information-based and MRI-score-based classifiers performed moderately well in differentiating Aβ+ and Aβ− CIs with an AUC of 0.81 and 0.76, accuracy of 74% and 67%, PPV of 76% and 70% and NPV of 73% and 63% (Supplementary Fig. 4), respectively. The MRI-score together with clinical information achieved an AUC of 0.88, with an accuracy of 81%, PPV of 82% and NPV of 80%, performing significantly better than clinical information only (DeLong P < 10⁻¹⁵) or MRI-score only (DeLong P < 10⁻³⁹) models. In contrast to CU data, both plasma Aβ42/Aβ40 and plasma p-tau181, but not plasma NfL, separately detected Aβ-positivity in CIs with an average accuracy of 77% and 58%, PPV of 79% and 63%, NPV of 76% and 52%, yielding AUCs of 0.87 and 0.64, respectively. Enhancement with clinical information improved performance metrics of plasma p-tau181 and NfL, but not plasma Aβ42/Aβ40, classifiers by 15-23% (Table 3; column (B)). Plasma p-tau181 enhanced with clinical information performed similarly to plasma Aβ42/Aβ40. When further enhanced with the MRI-score in addition to the clinical information, classifier performance metrics for both plasma p-tau181 and plasma NfL increased by an additional 3-8%, with plasma p-tau181 performing slightly better with an accuracy of 82%, PPV of 83% and NPV of 82% (Table 3; column (C)). Similarly, the MRI-score enhanced classification performance of plasma Aβ42/Aβ40 more than clinical information (DeLong P < 10⁻⁴), reaching an AUC of 0.94 with an accuracy of 86%, PPV of 86% and NPV of 88%. The full classifier model, including all three plasma biomarkers, MRI-score, and clinical information achieved an AUC of 0.92 and an accuracy of 86%, with a PPV of 88% and NPV of 86%. This was not significantly different from the classifier model with plasma Aβ42/Aβ40, MRI-score and clinical information (DeLong P = 0.31), suggesting minimal added value of plasma NfL and plasma p-tau181. The most significant variables in a decreasing order of importance based on mean decrease in accuracy analysis were plasma Aβ42/Aβ40, MRI-score, APOE ε4 allele, age and CDR-SB.

Figure caption fragments:

Aβ− CI, n = 40; Aβ+ CI, n = 46). Plasma p-tau181 and NfL data included 852 individuals (Aβ− CU, n = 224; Aβ+ CU, n = 109; Aβ− CI, n = 230; Aβ+ CI, n = 289). Unpaired two-samples t-test uncorrected significance levels at ****P < 0.00001; ***P < 0.0001; **P < 0.001; ns: P ≥ 0.5. CU, cognitively unimpaired elderly; CI, elderly individuals with mild cognitive impairment.

Area under the curve (AUC) estimates with ±2 × standard variation error bars from cross-validation iterations are shown for classifiers constructed separately and jointly with demographic information (age, sex and years of education), APOE, clinical scores, plasma biomarkers (Aβ42/Aβ40, p-tau181 and NfL) and structural MRI-score when predicting Aβ-positivity using florbetapir PET as the ground truth in the ADNI study (n = 333 CUs and n = 519 CIs). To assess the added value of each class of variables (i.e. clinical, plasma and MRI classes), additional RF classifiers were constructed from (i) each plasma marker alone, (ii) each plasma marker jointly with clinical features, (iii) MRI-score jointly with clinical features and (iv) each plasma marker jointly with clinical features and MRI-score. Models including plasma Aβ42/Aβ40 were tested and validated in a cohort of n = 87 CUs and n = 86 CIs due to limited availability of plasma Aβ42/Aβ40 data. Error bars indicate union of 95% CIs from cross-validation iterations.

of Aβ positivity prediction in an ADNI cohort of (A) cognitively unimpaired (CU) and (B) cognitively impaired (CI) elderly individuals. Optimized ROC curves for classifiers constructed separately and jointly with demographic information (age, sex and years of education), APOE, clinical scores, plasma biomarkers (Aβ42/Aβ40, p-tau181 and NfL), and structural MRI-score when predicting Aβ-positivity using florbetapir PET as the ground truth in the ADNI study (n = 333 CUs and n = 519 CIs). To assess the added value of each class of variables (i.e. clinical, plasma and MRI classes), additional RF classifiers were constructed from (i) each plasma marker alone, (ii) each plasma marker jointly with clinical features, (iii) MRI-score jointly with clinical features and (iv) each plasma marker jointly with clinical features and MRI-score. Models including plasma Aβ42/Aβ40 were tested and validated in a cohort of n = 87 CUs and n = 86 CIs due to limited availability of plasma Aβ42/Aβ40 data. Error bars indicate union of 95% CIs from cross-validation iterations. The confidence interval includes the axis y = x, suggesting that the classifier was not better than chance.

Discussion
The major findings of this multi-centre biomarker study were (i) of the three plasma biomarkers, when considered separately, Ab 42 /Ab 40 consistently differentiated PET Abpositivity status both in CU and in CI participants, with a slightly better performance in CIs, whereas plasma p-tau181 showed moderate value for differentiating PET Ab-positivity status in CI participants, and plasma NfL lacked Ab-positivity stratification value both in CU and in CI participants; (ii) clinical information, dominated by APOE e4 status and education in CU participants, and by APOE e4 status and age in CI participants, as well as MRI-score independently differentiated PET AbÀ and Abþ both in CU and in CI participants; (iii) clinical information enhanced the performance of plasma biomarkers in differentiating PET AbÀ and Abþ participants by 0.06-0.14 units of AUC for CUs, and by 0.21-0.25 units for CIs and (iv) further enhancement of these models with an MRI-score yielded an additional improvement of 0.04-0.11 units of AUC for CUs and 0.05-0.09 units for CIs. Taken together, the results recapitulate plasma Ab 42 /Ab 40 as a robust biomarker of brain Abpositivity and suggest that when combined with clinical information, plasma p-tau181 and NfL biomarkers, and an MRI-score, could effectively identify Abþ individuals with expected greater accuracy in the symptomatic individuals. Interestingly, when the MRI-score is considered in combination with clinical information, plasma p-tau181 and plasma NfL have minimal added value for brain Ab-positivity stratification in this multi-centre ADNI cohort of CU and CI participants.

Plasma Aβ42/Aβ40 detects PET Aβ-positivity
The first major finding was that plasma Aβ42/Aβ40 was a robust biomarker of PET Aβ-positivity independent of clinical diagnosis, whereas plasma p-tau181 detected PET Aβ-positivity only in CIs with a moderate accuracy, and plasma NfL lacked value for stratification of PET Aβ+ and PET Aβ− cases both in CU and in CI cohorts. It should be noted that this finding was replicated when the modelling and testing of all classifiers were repeated on the plasma Aβ42/Aβ40 sub-cohort to mitigate the potential influence of sample size and sub-cohort characteristics in comparisons of classifiers (Supplementary Fig. 2). Of the three plasma biomarkers considered in this study, Aβ42/Aβ40 has been the most extensively studied in the literature. Recent studies, particularly the ones using highly sensitive mass spectrometry, have repeatedly reported a strong correlation between plasma Aβ42/Aβ40 and the gold-standard CSF and PET Aβ measures (Janelidze et al., 2016; Ovod et al., 2017; Nakamura et al., 2018; Schindler et al., 2019). Consistent with our findings, plasma Aβ42/Aβ40, especially when combined with age and APOE ε4 status, has been shown to accurately stratify Aβ+ individuals (e.g. AUC, 0.80-0.85) in the AD continuum (Palmqvist et al., 2019b; Schindler et al., 2019). The slightly superior performance of plasma Aβ42/Aβ40 in this study (cf. Supplementary Fig. 3) compared to the previous reports of 0.79-0.82 AUC for the detection of Aβ-positivity in CU participants (Fandos et al., 2017; de Rojas et al., 2018; Chatterjee et al., 2019) and 0.90 AUC for CIs (Lin et al., 2019) might be due to the high molecular specificity and detection sensitivity of the LC-MS/MS technique used to analyse the ADNI plasma samples.
This observation is consistent with the notion that the different assays for plasma Aβ42/Aβ40 may have different precision and, in particular, mass spectrometry-based assays compared to immunoassays might be more accurate and robust in measuring levels of plasma Aβ species as a biomarker of brain Aβ (Zetterberg, 2019). Another factor contributing to the high performance of the Aβ42/Aβ40 ratio, as compared with single biomarkers, is that between-individual differences in basal 'total' Aβ secretion are compensated for in the ratio, by dividing by Aβ40, whereas such differences in plasma NfL and p-tau181 levels, MRI measures or cognitive abilities might introduce variability in these measures.
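The normalization argument can be illustrated with a small Python sketch. The concentrations below are hypothetical values chosen only to make the point, not ADNI data, and the helper names are assumptions. The sketch relies on the established direction of the effect, namely that a lower plasma Aβ42/Aβ40 ratio is associated with brain Aβ-positivity.

```python
# Hypothetical plasma concentrations (pg/mL); "high secretors" B and C
# have 1.5x the Abeta40 level of "low secretors" A and D.
participants = [
    {"id": "A", "amyloid_positive": False, "ab40": 200.0, "ab42": 20.0},
    {"id": "B", "amyloid_positive": False, "ab40": 300.0, "ab42": 30.0},
    {"id": "C", "amyloid_positive": True, "ab40": 300.0, "ab42": 24.0},
    {"id": "D", "amyloid_positive": True, "ab40": 200.0, "ab42": 16.0},
]


def separates(values, labels):
    """True if every amyloid-positive value lies below every negative value."""
    pos = [v for v, y in zip(values, labels) if y]
    neg = [v for v, y in zip(values, labels) if not y]
    return max(pos) < min(neg)


labels = [p["amyloid_positive"] for p in participants]
raw_ab42 = [p["ab42"] for p in participants]
ratios = [p["ab42"] / p["ab40"] for p in participants]

# Raw Abeta42 is confounded by overall secretion: positive participant C (24)
# has a higher level than negative participant A (20).
raw_separates = separates(raw_ab42, labels)

# Dividing by Abeta40 removes the secretion difference (0.08 vs 0.10).
ratio_separates = separates(ratios, labels)
```

In this toy setting the raw Aβ42 level fails to separate the groups while the ratio separates them cleanly; real data are noisier, so the ratio reduces rather than eliminates this source of variability.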

Plasma p-tau181 presented a more complex picture as a candidate biomarker of brain Aβ-positivity
Assays for the quantification of plasma p-tau181 were developed only very recently (Zetterberg and Blennow, 2020) and are still under extensive investigation to fully understand the role of different plasma tau species as peripheral markers of AD pathophysiology. Compared to the limited number of previously reported evaluations of plasma p-tau181 as a biomarker of brain Aβ-positivity in other research and clinical cohorts (Mielke et al., 2018; Palmqvist et al., 2019b; Barthélemy et al., 2020; Janelidze et al., 2020a; Karikari et al., 2020; Thijssen et al., 2020), ADNI plasma p-tau181 levels measured by the Simoa assay differentiated between PET Aβ+ and PET Aβ− ADNI CI participants with an inferior accuracy (AUC, 0.64). Furthermore, this biomarker had no stratification value for PET Aβ-positivity within the ADNI CU participants (AUC, 0.55). The addition of clinical information to this base model increased AUC for the classification of Aβ+ versus Aβ− by 0.14, to 0.69, in CUs and by 0.21, to 0.85, in CIs. The subsequent addition of an MRI-score to this model further increased AUC for the classification of Aβ+ versus Aβ− by 0.11, to 0.80, in CUs and by 0.05, to 0.90, in CIs, bringing its classification performance to a clinically acceptable level. Potential sources of the discrepancy between our results and those of other groups may include differences in the plasma analysis assays, diagnostic composition and demographic characteristics of the study cohorts, methodology used to determine ground-truth brain Aβ-positivity status and data analytics. One of the earliest plasma p-tau181 studies on a Meso Scale Discovery (MSD) platform reported plasma p-tau181 as a good biomarker of elevated brain Aβ, with an AUC of 0.7 in CU and 0.85 in MCI participants in its discovery cohort, but this study lacked internal validation or replication in an external validation cohort (Mielke et al., 2018).
Another study (Barthélemy et al., 2020) reported high specificity of plasma p-tau181, measured by a highly sensitive mass spectrometry assay, for Aβ plaque pathology in their discovery cohort (n = 34; including clinically diagnosed CU, MCI and AD individuals) and then replicated their findings with an AUC of 0.72 to differentiate Aβ− and Aβ+ individuals in an independent replication cohort of CUs, MCIs and ADs (n = 92) but the performance within CU only or MCI only sub-cohorts was not statistically significant. Similarly, a larger multi-cohort study which included individuals with various clinical diagnoses including CU, MCI and AD reported a stepwise increase in plasma p-tau181 levels, measured on the MSD platform, with both Aβ-positivity and cognitive impairment and achieved an AUC of 0.81 in differentiating Aβ− and Aβ+ individuals, which was increased to 0.84 with the addition of plasma Aβ42/Aβ40 (Janelidze et al., 2020a).
The age of cohort participants may also influence the ability of plasma p-tau181 to detect Aβ-positivity status. For instance, a multi-cohort study used the Simoa assay to measure plasma p-tau181 in four different cohorts and found that the plasma p-tau181 biomarker discriminated Aβ+ CU older adults and individuals with CI from Aβ− CU older adults and young adults with an AUC of 0.76-0.88 across cohorts. However, the CU older adults in this study were on average 10 years younger than ADNI participants, raising the question of age-dependent sensitivity of plasma p-tau181 to AD-related Aβ pathology. Similarly, another small cohort study of CU and CI participants, who were on average 13 years younger than ADNI participants, reported excellent AUCs of 0.86 in CUs and 0.94 in CIs for differentiating PET Aβ+ from PET Aβ− participants with plasma p-tau181 levels, although these results were not internally validated or replicated in an external cohort (Thijssen et al., 2020). It is highly likely that younger Aβ+ participants have greater pathophysiological changes than the older ADNI participants in response to Aβ toxicity, which might be a driving factor for increased plasma p-tau181 levels. Indeed, it is well established that younger individuals who are Aβ+ have more brain tau deposition than older individuals who are Aβ+ (Schöll et al., 2017). Furthermore, previous studies found that the strong correlations between plasma p-tau181 and Aβ PET are often in the Aβ+ but not in Aβ− individuals (Janelidze et al., 2020a) and that increased plasma p-tau181 levels might be initiated by accumulation of Aβ beyond the positivity threshold, and continue to increase as Aβ further accumulates in the brain even during early stages of tau pathology as measured by Braak & Braak staging at autopsy or tau PET during life (Janelidze et al., 2020a; Karikari et al., 2020).
Evidence from these recent studies together with the stronger association of plasma p-tau181 with brain Aβ burden in younger cohorts might suggest that plasma p-tau181 is unlikely to be a direct measure of Aβ pathology but instead a marker of tau pathology. Our finding that plasma p-tau181 has moderate stratification accuracy for PET Aβ-positivity only at the symptomatic disease stage suggests that p-tau181 detects Aβ-positivity only once a significant tau pathology, which is closely associated with symptoms, is detectable.

Plasma NfL was a poor biomarker of PET Aβ-positivity
The relatively poor performance of plasma NfL in differentiating Aβ+ and Aβ− ADNI individuals, either symptomatic or asymptomatic, is largely consistent with the previous literature. Previous studies found no evidence that plasma NfL was related to Aβ or tau pathology as measured by PET or even synaptic dysfunction as measured by fluorodeoxyglucose-PET imaging, repeatedly emphasizing that plasma NfL is more likely to be a marker of all-cause neurodegeneration (Mattsson et al., 2019; Mielke et al., 2019; Janelidze et al., 2020a; Thijssen et al., 2020). Finally, our finding that plasma p-tau181 and plasma NfL did not improve Aβ-positivity stratification accuracy above and beyond plasma Aβ42/Aβ40 was consistent with previous studies on other AD research cohorts (Palmqvist et al., 2019b).

Clinical information and MRI-score independently differentiated PET Aβ+ and Aβ− ADNI individuals
To date, the most common candidate predictors considered for Aβ-positivity were age, APOE genotype and measures of cognition, largely because they are easier to collect with widely available standardized protocols. Of these, age has been the most common predictor of elevated brain Aβ followed by the APOE genotype (reviewed in Ashford et al. (2021)), consistent with the notion that after advanced age, APOE ε4 genotype is a major risk factor for developing AD (Payami et al., 1997). Consistent with the prior knowledge, age and APOE genotype were important predictors of Aβ-positivity for ADNI CU and CI participants (cf. Supplementary Fig. 3). In the main analyses, we observed that the ability of clinical information to differentiate Aβ+ and Aβ− participants improved, especially in terms of sensitivity, with increasing severity of clinical diagnosis. Indeed, measures of global cognition, such as MMSE and CDR-SB, had greater influence in the classifier model for Aβ-positivity within the CI participants. Consistent with our findings, accumulating evidence suggests that elevated Aβ is associated with risk of cognitive worsening and may indicate a pre-symptomatic stage of disease (Roe et al., 2013; Donohue et al., 2017). As the relationships between cognition and AD biomarkers are expected to be subtle, the global measures of cognition may have insufficient sensitivity among CUs to reliably detect pre-symptomatic expression of Aβ pathology, as reflected in our results with extremely low sensitivity of clinical information in detecting Aβ-positivity in CUs.
The MRI-score of brain Aβ alone stratified Aβ+ and Aβ− participants with an AUC of 0.74 in ADNI CUs and an AUC of 0.76 in ADNI CIs with a substantially increased sensitivity. When combined with clinical information, the MRI-score performed as well as, or, in CIs, even better than, the best performing plasma biomarker, Aβ42/Aβ40. Although structural T1-weighted MRI is not a molecular imaging modality directly targeting quantification of protein accumulation in the brain, MRI has been a gold standard for neurodegeneration (Jack et al., 2004). The evidence for a relationship between Aβ deposition and neurodegeneration has been previously demonstrated in very early AD and even in asymptomatic individuals (Chételat et al., 2010). In a similar manner to plasma p-tau181, the value of the MRI-score for Aβ-positivity might be a reflection of neurodegenerative processes due to Aβ toxicity, yet we observed that the MRI-score outperformed plasma p-tau181. Brain Aβ deposition has a spatially distinct signature of cortical atrophy (Chételat et al., 2010; Tosun et al., 2011) and MRI-based correlates of brain Aβ deposition compared to plasma analytes might have the advantage of capturing this spatial information. Furthermore, although structural T1-weighted imaging has been traditionally considered to reveal fat and water distribution and distinguish tissue types, cellular changes associated with neuropathology might also influence the MRI contrast as well as the MRI intensity quality, such as the grey value distribution, texture features and spatial heterogeneity (Sørensen et al., 2016; Feng et al., 2019; Ranjbar et al., 2019). Our results also suggest that deep learning, the computational approach used in this study to construct MRI-scores, might efficiently quantify Aβ toxicity from structural MRI because of its high automatic feature learning and visual pattern recognition abilities (LeCun et al., 2015).

Both clinical information and MRI-score enhanced performance of plasma biomarkers in identifying PET Aβ-positivity
One interesting observation was that although, when combined with clinical information and MRI-score, plasma p-tau181 and NfL biomarkers could effectively identify Aβ+ symptomatic individuals, plasma p-tau181 and plasma NfL did not contribute to the detection of brain Aβ above and beyond the classification power of clinical information and MRI-score jointly, particularly in CUs. This is a particularly important criterion in the selection of candidate cost-effective and rapid screening tools for broad implementation in clinical and drug trial settings. Demographics and global cognitive measures are an integral part of the clinical assessment. MRI has long played a role in inclusion and exclusion criteria in patient recruitment and ruling out other causes of cognitive symptoms (Frisoni et al., 2010). Furthermore, MRI has been routinely acquired in clinical trials to identify and monitor adverse events (Cash et al., 2014). Plasma biomarkers, therefore, should have a classification ability as good as or better than clinical information and MRI separately and in combination in order to be a practical non-invasive screener.
Our results in this ADNI study, although limited to CU and CI participants, suggest that plasma Aβ42/Aβ40 but not plasma p-tau181 and plasma NfL might have added value in screening for brain Aβ-positivity. It is also important to emphasize that plasma assays target brain-derived proteins that are present at extremely low concentrations in the peripheral circulation and originate not only in the brain but also in almost all peripheral cells (Roher et al., 2009). What plasma Aβ measures mean biologically and to what extent the variances seen in plasma Aβ levels reflect brain pathology, especially in the CU and CI clinical groups in which greater heterogeneity in comorbid conditions is expected, are questions that still warrant further investigation. These limitations may make the use of the plasma Aβ biomarkers to predict AD pathology more difficult at the individual level. Despite the inferior performance of plasma p-tau181 in detecting AD Aβ-positivity observed in this ADNI cohort, this biomarker may have different utility. Plasma p-tau181 can be robustly measured in plasma and is highly specific for AD pathology (Mielke et al., 2018), making it an attractive screening tool for brain Aβ and tau pathologies jointly as required for A/T/N biomarker profiling (Jack et al., 2018) linked to differential trajectories of disease progression (Altomare et al., 2019; Jack et al., 2019; Ebenau et al., 2020). Further studies are warranted to better understand the behaviour of plasma p-tau181 as a biomarker of the burden of the disease at different disease stages (Lantero . Given that Aβ-positivity assessment using either CSF or PET is independent of clinical diagnosis, clinical stage-dependent classifier performance might be a concern if these plasma biomarkers are operationalized in clinical practice.
In our analysis, a similar clinical diagnosis-dependent gradual increase in classification performance was observed in the Aβ-positivity classifiers constructed with clinical information and, to a lesser extent, with the MRI-score.

Study design-specific considerations
There are multiple strengths to the study including the large sample size, well-characterized participants, and availability of plasma analytes, Aβ PET imaging and structural MRI, all assessed within a short period of time. A limitation of this in vivo study was the use of Aβ PET as the gold standard for brain Aβ-positivity rather than the true gold standard of neuropathology. A limitation of plasma analyte comparisons is that different techniques were used, namely Simoa for p-tau181 and NfL and LC-MS/MS for Aβ42/Aβ40. Despite the superior specificity, mass spectrometry has the disadvantage of being more expensive and requiring more expertise than immunoassays, which are easily analysed by laboratories that routinely run blood tests. Another limitation of the study is the potential pre-analytical variability since the blood samples were collected at multiple ADNI sites. Although the collection site as a categorical variable had no significant effect on ADNI plasma levels, multi-centre studies of plasma analytes still require further investigation for standardization of protocols to reduce measurement variability (Rozga et al., 2019). We should also note that this study was limited to plasma p-tau181. Other blood immunoassays targeting tau species, specifically the very recently reported plasma pTau-217, might be promising biomarkers for AD Aβ pathology (Janelidze et al., 2020b). Finally, we should further emphasize that this study is based on a convenience cohort where the degree of true population representation is not known. Most notably, about 47% of CU and 19% of CI ADNI participants who were CSF p-tau positive were PET Aβ−, suggesting non-AD aetiology of their tau pathology that might have particularly impacted the observed plasma p-tau181 levels (Benussi et al., 2020).
Additionally, the PPV and NPV performance of the classifier models considered in this study was limited by the prevalence of PET Aβ-positivity in the selected ADNI cohort and may not be directly comparable to other studies with different PET Aβ-positivity prevalence.
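The dependence of PPV and NPV on prevalence follows directly from Bayes' rule. The Python sketch below is an illustration under stated assumptions rather than a re-analysis: the function name and the sensitivity, specificity and prevalence values are hypothetical and are not taken from the ADNI results.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test with fixed sensitivity and specificity
    applied to a population with the given prevalence of positivity."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Same hypothetical classifier (80% sensitivity, 80% specificity) evaluated
# in a balanced cohort and in a cohort with 20% amyloid-positivity.
ppv_balanced, npv_balanced = predictive_values(0.8, 0.8, 0.5)
ppv_low_prev, npv_low_prev = predictive_values(0.8, 0.8, 0.2)
```

With these illustrative inputs the PPV falls from 0.80 in the balanced cohort to 0.50 at 20% prevalence, while the NPV rises from 0.80 to about 0.94, even though the classifier itself is unchanged. Sensitivity, specificity and AUC are therefore the more portable quantities when comparing studies.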

Conclusion
In summary, the in vivo gold standard for brain Aβ burden assessment is currently Aβ PET or lumbar puncture for CSF Aβ42 (Tapiola et al., 2009; Palmqvist et al., 2016). The widespread acceptance of a biomarker classification scheme for the AD continuum (Jack et al., 2018) has ignited interest in more affordable and accessible approaches to detect AD Aβ pathology, a process that often slows down the recruitment into, and adds to the cost of, clinical trials. To this end, our systematic comparison of Aβ-positivity stratification models that use minimally invasive and low-cost measures identified demographics, APOE genotype, global cognitive measures, MR imaging, plasma Aβ measures, plasma p-tau181 and plasma NfL biomarkers, some alone and some in combination, as promising Aβ-positivity classifiers. Advances in ultrasensitive assays for plasma analytes as well as in computational classifier techniques combining multidisciplinary information further promise to reduce the difficulty and cost of screening participants with AD Aβ pathology.

Funding
This work is funded by the National Institutes of Health Grant U01 AG024904.

Conflicts of interest
Dr Jack serves on an independent data monitoring board for Roche, has consulted for and served as a speaker for Eisai, and consulted for Biogen, but he receives no personal compensation from any commercial entity. He receives research support from NIH and the Alexander Family Alzheimer's Disease Research Professorship of the Mayo Clinic.
Dr Saykin reports grants from NIH Grants, non-financial support from Eli Lilly/Avid Radiopharmaceuticals, other from Bayer Oncology, grants and other from Arkley BioTek, personal fees and other from Springer Nature, outside the submitted work.
Dr Shaw reports grants from NIA, during the conduct of the study. Dr Jagust reports personal fees from Genentech, personal fees from Biogen, personal fees from Novartis, personal fees from Bioclinica, personal fees from Grifols, personal fees from Curasen, outside the submitted work.
Dr Aisen reports grants from Janssen, grants from NIA, grants from FNIH, grants from Alzheimer's Association, grants from Eisai, personal fees from Merck, personal fees from Biogen, personal fees from Roche, personal fees from Lundbeck, personal fees from Proclara, personal fees from Immunobrain Checkpoint, outside the submitted work.