Validation of the Klinrisk chronic kidney disease progression model in the FIDELITY population

ABSTRACT
Background: Chronic kidney disease (CKD) affects >800 million individuals worldwide and is often underrecognized. Early detection, identification and treatment can delay disease progression. Klinrisk is a proprietary risk prediction model that uses common laboratory data to predict CKD progression. We aimed to externally validate the Klinrisk model for prediction of CKD progression in FIDELITY (a prespecified pooled analysis of two finerenone phase III trials in patients with CKD and type 2 diabetes). In addition, we sought to identify evidence of an interaction between treatment and risk.
Methods: The validation cohort included all participants in FIDELITY up to 4 years. The primary and secondary composite outcomes included a ≥40% decrease in estimated glomerular filtration rate (eGFR) or kidney failure, and a ≥57% decrease in eGFR or kidney failure. Discrimination was assessed using the area under the receiver operating characteristic curve (AUC). Calibration plots were produced by decile, comparing observed with predicted risk.
Results: At time horizons of 2 and 4 years, 993 and 1795 patients experienced a primary outcome event, respectively. The model predicted the primary outcome accurately, with an AUC of 0.81 at 2 years and 0.86 at 4 years. Calibration was appropriate at both 2 and 4 years, with Brier scores of 0.067 and 0.115, respectively. No evidence of interaction between treatment and risk was identified for the primary composite outcome (P = .31).
Conclusions: Our findings demonstrate the accuracy and utility of a laboratory-based prediction model for early identification of patients at the highest risk of CKD progression.


INTRODUCTION
Chronic kidney disease (CKD) affects >800 million individuals worldwide [1] and is often underrecognized, with the prevalence of undiagnosed stage 3 CKD estimated to be 61.6-95.5% [2]. At advanced stages, when diagnosis is more common [3], the majority of kidney function is lost [4] and the window for disease-modifying therapy is narrow. Clinical practice guidelines recommend that kidney function be regularly monitored in patients with diabetes, and early detection and risk stratification are prioritized to ensure early diagnosis and treatment [4-6].
Klinrisk is a proprietary risk prediction model that uses a single time point measure of routinely collected laboratory data to predict CKD progression [7]. The model has been externally validated and found to be highly accurate for predictions evaluated up to 5 years for the composite outcome of a 40% decrease in estimated glomerular filtration rate (eGFR) or kidney failure [7]. Although the model is promising, further external validation in independently derived datasets remains important.
FInerenone in chronic kiDney diseasE and type 2 diabetes: Combined FIDELIO-DKD and FIGARO-DKD Trial programme analYsis (FIDELITY) is a prespecified pooled analysis of the complementary phase III FInerenone in reducing kiDnEy faiLure and dIsease prOgression in Diabetic Kidney Disease (FIDELIO-DKD) and FInerenone in reducinG cArdiovascular moRtality and mOrbidity in Diabetic Kidney Disease (FIGARO-DKD) trials. These investigated the efficacy of the non-steroidal mineralocorticoid receptor antagonist finerenone on cardiovascular (CV) and kidney outcomes in patients with CKD and type 2 diabetes (T2D). To date, FIDELITY provides one of the largest clinical trial datasets of patients (N = 13 026) with early-to-late stages of CKD and T2D [8-10].
This analysis aimed to externally validate the Klinrisk model for the prediction of a primary composite kidney outcome of a ≥40% decrease in eGFR or kidney failure (defined as end-stage kidney disease or an eGFR <15 ml/min/1.73 m²) as well as a secondary composite kidney outcome of a ≥57% decrease in eGFR or kidney failure, up to 4 years post randomization in FIDELITY. Additionally, this analysis aimed to identify categories of risk that correspond with the greatest net benefit for targeted treatment intervention in patients with CKD and T2D.

METHODS
Validation cohort
The dataset included all participants in the FIDELITY prespecified pooled analysis, which combined individual patient-level data from the FIDELIO-DKD (NCT02540993) and FIGARO-DKD (NCT02545049) phase III, multicentre, double-blind trials. Eligible participants were adults (≥18 years) with CKD [urine albumin:creatinine ratio (UACR) ≥3.4 to <33.9 mg/mmol and eGFR ≥25 to ≤90 ml/min/1.73 m², or UACR ≥33.9 to ≤565.6 mg/mmol and eGFR ≥25 ml/min/1.73 m²] and T2D, receiving maximum tolerated doses of renin-angiotensin system therapy with a serum potassium level ≤4.8 mmol/l. Key exclusion criteria included diagnosis of symptomatic chronic heart failure with reduced ejection fraction (i.e. a class IA recommendation for mineralocorticoid receptor antagonist treatment) [8-10]. In the FIDELIO-DKD and FIGARO-DKD studies, the composite of time to kidney failure (defined as end-stage kidney disease [initiation of long-term dialysis for ≥90 days or kidney transplant] or eGFR <15 ml/min/1.73 m²), a sustained ≥40% decrease in eGFR from baseline or kidney death was the prespecified primary and secondary kidney composite outcome, respectively [9, 10]. The composite of time to a sustained ≥57% decrease in eGFR (equivalent to doubling of serum creatinine), kidney failure or kidney death was a prespecified secondary outcome in both trials [8]. A sustained decrease of ≥40% or ≥57% in eGFR was defined as a decrease from baseline maintained over ≥4 weeks [8].
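The stated equivalence between a ≥57% decrease in eGFR and a doubling of serum creatinine can be checked with a short calculation. The sketch below assumes that eGFR scales with serum creatinine raised to the power -1.209, which is the exponent of the 2009 CKD-EPI creatinine equation for creatinine values above its sex-specific threshold; the text specifies only that the CKD-EPI equation was used, so the choice of version is an assumption, as are the function names. Age is held constant.

```python
# ASSUMPTION: exponent of the CKD-EPI 2009 creatinine equation for serum
# creatinine above the sex-specific threshold; not stated in the text.
CKD_EPI_2009_EXPONENT = -1.209


def egfr_decline_for_creatinine_ratio(ratio: float,
                                      exponent: float = CKD_EPI_2009_EXPONENT) -> float:
    """Fractional eGFR decrease when serum creatinine is multiplied by `ratio`."""
    return 1.0 - ratio ** exponent


def creatinine_ratio_for_egfr_decline(decline: float,
                                      exponent: float = CKD_EPI_2009_EXPONENT) -> float:
    """Multiplicative rise in serum creatinine that yields a given fractional eGFR decrease."""
    return (1.0 - decline) ** (1.0 / exponent)


# Doubling of creatinine corresponds to roughly a 57% eGFR decrease.
print(round(egfr_decline_for_creatinine_ratio(2.0), 3))
# A 40% eGFR decrease corresponds to roughly a 1.5-fold rise in creatinine.
print(round(creatinine_ratio_for_egfr_decline(0.40), 2))
```

Under this assumption, the 40% threshold used in the primary composite outcome captures smaller changes in kidney function than the 57% threshold, which is why it accrues events faster.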

Description of the model
A full description of the development and external validation of the Klinrisk model has been reported previously, and the model was developed and validated in compliance with the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) checklist [7]. The machine learning model is a random survival forest for right-censored data, fitted with the R package randomForestSRC (Fast Unified Random Forests for Survival, Regression, and Classification). The model was developed using variables identified in previous datasets of patients with CKD or at risk of CKD, trained using laboratory-linked administrative data from Manitoba, Canada, between 1 April 2006 and 31 December 2018, and externally validated using the Alberta Health database from Alberta, Canada, for the outcomes of kidney failure (defined as eGFR <10 ml/min/1.73 m²) or a ≥40% decrease in eGFR. Time horizons of 2, 3 and 4 years were selected to stay close to the overall median follow-up time within the FIDELITY dataset (3.0 years) [8].

Variables
Variables included in the model were age (years), sex (male or female), eGFR (ml/min/1.73 m²), UACR (mg/mmol) and the results of 18 other laboratory analyses from chemistry panels, including creatinine, urea, sodium, potassium, bicarbonate and others such as bilirubin, and values from complete blood cell count panels including haemoglobin and platelet count.
The primary composite outcome included a ≥40% decrease in eGFR, kidney failure (defined as end-stage kidney disease [initiation of long-term dialysis for ≥90 days or kidney transplant] or eGFR <15 ml/min/1.73 m²) or kidney death, and the secondary composite outcome included a ≥57% decrease in eGFR, kidney failure or kidney death.

Statistical analysis
The FIDELITY baseline characteristics were summarized with descriptive statistics.
Model performance was evaluated using the area under the receiver operating characteristic curve (AUC), a measure of discrimination, and the Brier score, a measure of the accuracy of probabilistic predictions, annually at 1-4 years and at the median follow-up time. Thresholds were determined based on the distribution of the data and previous publications in the field [11, 12]. Higher AUC values indicate stronger discrimination, that is, higher concordance between observed events and model-predicted risk, whereas a lower Brier score indicates more accurate probabilistic predictions. Calibration plots were produced by decile, comparing observed risk with model-predicted risk. Both discrimination and calibration were evaluated in the overall population and stratified by treatment assignment (finerenone versus placebo). Kidney Disease: Improving Global Outcomes (KDIGO) heatmap categories were used as the reference standard.
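The three performance measures can be illustrated with a minimal sketch. It assumes a binary outcome at a fixed time horizon and ignores censoring; the estimators used in the analysis itself are not specified in the text and may have accounted for censored follow-up. All function names and the toy data are illustrative.

```python
from typing import List, Sequence, Tuple


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Probability that a random event case is ranked above a random non-event case (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and observed outcome."""
    return sum((s - y) ** 2 for y, s in zip(y_true, y_score)) / len(y_true)


def calibration_by_group(y_true: Sequence[int], y_score: Sequence[float],
                         n_groups: int = 10) -> List[Tuple[float, float]]:
    """Sort by predicted risk, split into n_groups, return (mean predicted, observed fraction) per group."""
    pairs = sorted(zip(y_score, y_true))
    n = len(pairs)
    out = []
    for g in range(n_groups):
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if chunk:  # skip empty groups when n < n_groups
            mean_pred = sum(s for s, _ in chunk) / len(chunk)
            observed = sum(y for _, y in chunk) / len(chunk)
            out.append((mean_pred, observed))
    return out


# Toy data (hypothetical, not from FIDELITY).
outcomes = [0, 0, 1, 1]
predicted = [0.1, 0.4, 0.35, 0.8]
print(auc(outcomes, predicted))
print(brier_score(outcomes, predicted))
print(calibration_by_group(outcomes, predicted, n_groups=2))
```

With `n_groups=10`, each returned pair corresponds to one point on a decile calibration plot; points near the diagonal indicate agreement between predicted and observed risk.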
Risk groups were formed based on the predicted risk from Klinrisk of a ≥40% decrease in eGFR or kidney failure. Four groups were defined by the quartiles of predicted risk (Q1: minimum to 25th percentile; Q2: 25th percentile to median; Q3: median to 75th percentile; and Q4: 75th percentile to maximum). Incidence rates were estimated for the risk categories and expressed per 100 patient-years. The interaction between the predicted risk and treatment assignment effects on the outcome was evaluated with Cox proportional hazards regression analysis.
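The quartile grouping and incidence rate calculation can be sketched as follows. The `Patient` record, the function names and the toy data are hypothetical; groups are assigned by rank so that they are of equal size, which splits tied predictions arbitrarily and may differ from the percentile cut-offs used in the analysis.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Patient:
    predicted_risk: float   # model-predicted probability of the outcome
    event: bool             # True if the composite outcome occurred
    followup_years: float   # time at risk until event or censoring


def assign_quartiles(patients: List[Patient]) -> List[str]:
    """Label each patient Q1..Q4 by rank of predicted risk (equal-sized groups)."""
    order = sorted(range(len(patients)), key=lambda i: patients[i].predicted_risk)
    labels = [""] * len(patients)
    for rank, i in enumerate(order):
        labels[i] = f"Q{rank * 4 // len(patients) + 1}"
    return labels


def incidence_per_100_py(patients: List[Patient], labels: List[str]) -> Dict[str, float]:
    """Events divided by total patient-years in each group, scaled to 100 patient-years."""
    events: Dict[str, int] = {}
    years: Dict[str, float] = {}
    for p, q in zip(patients, labels):
        events[q] = events.get(q, 0) + int(p.event)
        years[q] = years.get(q, 0.0) + p.followup_years
    return {q: 100.0 * events[q] / years[q] for q in sorted(events)}


# Toy cohort (hypothetical values, not FIDELITY data).
cohort = [
    Patient(0.02, False, 3.0), Patient(0.05, False, 3.0),
    Patient(0.10, False, 3.0), Patient(0.12, True, 2.0),
    Patient(0.20, True, 1.0), Patient(0.25, False, 3.0),
    Patient(0.40, True, 1.5), Patient(0.60, True, 0.5),
]
print(incidence_per_100_py(cohort, assign_quartiles(cohort)))
```

A well-stratifying model should produce incidence rates that rise from Q1 to Q4, as in the toy output.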

RESULTS
To assess the risk categorization of the model within the FIDELITY dataset, incidence rates for the primary composite outcome were examined by risk group, defined by the range of predicted risk from Q1 (minimum to 25th percentile) to Q4 (75th percentile to maximum), as shown in Fig. 2 and Table 3. Incidence rates for the primary composite outcome corresponded with the assigned risk group (i.e. lower for low-risk groups and higher for high-risk groups). Incidence rates were lower in the finerenone [...] 1). In addition, no major deviation was observed in calibration across age groups (Supplementary Fig. 2).

DISCUSSION
In this external validation of the Klinrisk model, we demonstrate that the model, originally developed in the general population, accurately predicted CKD progression events in patients with CKD and T2D. This analysis provides a comprehensive validation of the Klinrisk model in a well-characterized, global clinical trial population across all stages of CKD with centrally adjudicated clinical events. These findings add to the external validity of the model and highlight the need for studies on its clinical implementation.
The Klinrisk model was developed using a cohort of community-dwelling adults with a mean age of 59 years, mean eGFR of 82 ml/min/1.73 m² and median UACR of 1.1 mg/mmol. Although the majority of this population had early-stage CKD, a significant proportion were disease-free at baseline, and the overall event rate was only 7.8% at 5 years [7]. In contrast, the FIDELITY dataset represented a population at high risk of CKD progression, with a mean age of 65 years, mean eGFR of 58 ml/min/1.73 m² and median UACR of 58 mg/mmol [8]. Despite these differences, the Klinrisk model appears to both discriminate and calibrate well in the FIDELITY population.
Predictive models for CKD progression have been tested in similar populations. In the Action to Control CardiOvascular Risk in Diabetes (ACCORD) trial, a time-varying Cox model demonstrated good discrimination [AUC 0.745 (95% CI 0.723-0.763)] and calibration [Brier score 0.0923 (95% CI 0.0873-0.0965)] performance in patients with CKD and T2D [13]. More recently, a proprietary machine learning model (KidneyIntelX) was tested in a subpopulation of the CANagliflozin cardioVascular Assessment Study (CANVAS) trial, and KidneyIntelX successfully risk stratified a large multinational external cohort for risk of CKD progression [14]. A new regression-based model from the CKD Prognosis Consortium, for patients with or without diabetes with recently developed CKD (eGFR ≥60 ml/min/1.73 m²), also included data from randomized trials and observational studies, finding good discrimination in a pooled validation [15].
Our current work on the Klinrisk model builds on previous and ongoing work on the Kidney Failure Risk Equation (KFRE), which uses the same principles of applying routinely collected laboratory data to predict CKD progression [16]. However, there are several notable differences. First, the KFRE is not accurate in early-stage CKD [17]. High-risk patients with early-stage disease (e.g. a 50-year-old male with an eGFR of 80 ml/min/1.73 m² and UACR of 100 mg/mmol) would have a low kidney failure risk using the KFRE (0.2% at 5 years) but a very high risk of progression defined by a ≥40% decrease in eGFR (25.8% by the Klinrisk model). As such, identification of these patients as high risk early in their disease course and appropriate treatment with renin-angiotensin-aldosterone system inhibitors, sodium-glucose co-transporter-2 inhibitors and finerenone could reduce their lifetime risk of kidney failure. Recent work by investigators from the United Kingdom suggested that automatic identification of individuals at high risk of CKD progression and CV risk is important for improving CKD care [18]. The investigators performed a systematic review and realist synthesis to develop an integrated model of intervention mechanisms to improve the delivery of CKD care. Their model suggested that automatic detection of high-risk cases in primary care, alongside education, can improve kidney and CV outcomes when integrated into existing workflows. The Klinrisk model, with its use of routinely collected laboratory data and ability to integrate with both laboratory data and electronic medical records, can fill this implementation gap and improve outcomes in primary care, where most CKD cases reside. In most countries, interventions targeted at primary care are the only feasible path to improving CKD-related population health because nephrology resources are limited.
We compared the performance of the Klinrisk model with the KDIGO heatmap and the KFRE instead of the KidneyIntelX model (as we did not have access to the biomarkers required for KidneyIntelX) and found evidence of substantial improvement in discrimination. The KDIGO heatmap can be seen as a standard reference tool in clinical practice; however, the Klinrisk model outperforms the KDIGO heatmap in prediction accuracy. This is because the heatmap provides population-based risk categories [4], which are necessary, but insufficient, for individual risk stratification. Heterogeneity in the predicted risk of adverse outcomes has been found within each category, with overlap between categories [19]. In each heatmap risk category, there can be large variations in the factors associated with the risk of kidney failure and CKD progression [20]. Models that provide absolute risk at the individual level can go beyond the heatmap and identify patients at the greatest risk.
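To make the contrast with individual-level prediction concrete, the sketch below maps eGFR and UACR to a heatmap risk category. The cut-offs and colour-band labels follow the standard KDIGO classification (GFR categories G1 to G5, albuminuria categories A1 to A3) as we understand it and are not taken from this analysis; the function names are illustrative.

```python
def gfr_category(egfr: float) -> str:
    """KDIGO GFR category from eGFR in ml/min/1.73 m²."""
    if egfr >= 90:
        return "G1"
    if egfr >= 60:
        return "G2"
    if egfr >= 45:
        return "G3a"
    if egfr >= 30:
        return "G3b"
    if egfr >= 15:
        return "G4"
    return "G5"


def albuminuria_category(uacr_mg_mmol: float) -> str:
    """KDIGO albuminuria category from UACR in mg/mmol."""
    if uacr_mg_mmol < 3:
        return "A1"
    if uacr_mg_mmol <= 30:
        return "A2"
    return "A3"


# Heatmap cells: each (G, A) pair maps to one of four risk bands.
HEATMAP = {
    "G1": {"A1": "low", "A2": "moderately increased", "A3": "high"},
    "G2": {"A1": "low", "A2": "moderately increased", "A3": "high"},
    "G3a": {"A1": "moderately increased", "A2": "high", "A3": "very high"},
    "G3b": {"A1": "high", "A2": "very high", "A3": "very high"},
    "G4": {"A1": "very high", "A2": "very high", "A3": "very high"},
    "G5": {"A1": "very high", "A2": "very high", "A3": "very high"},
}


def kdigo_risk(egfr: float, uacr_mg_mmol: float) -> str:
    """Heatmap risk band for a given eGFR and UACR."""
    return HEATMAP[gfr_category(egfr)][albuminuria_category(uacr_mg_mmol)]


# The example patient discussed above: eGFR 80, UACR 100 mg/mmol.
print(kdigo_risk(80, 100))
```

Every patient falling in the same cell receives the same label regardless of age or other laboratory values, which is the within-category heterogeneity that an individual-level model is intended to resolve.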
There was no evidence of interaction between treatment and risk for the primary outcome in the current analysis (P = .31). This suggests all patients included in FIDELITY, regardless of CKD stage, had a similar benefit, on a relative scale, from the intervention. This is consistent with data from sodium-glucose co-transporter-2 inhibitor trials in diabetes and heart failure [21]. However, because the relative benefit is similar across risk groups, the absolute benefit would be larger in individuals at higher baseline risk.
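The relationship between a constant relative effect and a risk-dependent absolute effect can be shown with a short worked example. It assumes proportional hazards, under which survival on treatment equals baseline survival raised to the power of the hazard ratio. The hazard ratio of 0.8 and the baseline risks are hypothetical values chosen for illustration, not estimates from FIDELITY.

```python
def treated_risk(baseline_risk: float, hazard_ratio: float) -> float:
    """Risk at a fixed horizon on treatment, assuming proportional hazards: S1 = S0 ** HR."""
    return 1.0 - (1.0 - baseline_risk) ** hazard_ratio


def absolute_risk_reduction(baseline_risk: float, hazard_ratio: float) -> float:
    """Difference between untreated and treated risk at the same horizon."""
    return baseline_risk - treated_risk(baseline_risk, hazard_ratio)


HYPOTHETICAL_HR = 0.8  # illustrative only

for risk in (0.05, 0.30):
    arr = absolute_risk_reduction(risk, HYPOTHETICAL_HR)
    print(f"baseline risk {risk:.0%}: absolute risk reduction {arr:.1%}")
```

With the same hazard ratio, the patient with a 30% baseline risk gains several times the absolute risk reduction of the patient with a 5% baseline risk, which is the rationale for prioritizing treatment by predicted risk.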
There are several implications of our findings. From a clinical perspective, the Klinrisk model can be used to identify patients with early-stage CKD and T2D but a high risk of progression, who can be treated accordingly to delay or prevent progression to kidney failure. Currently, most of these patients are unaware of their disease and are underdiagnosed. From a research perspective, the model may be useful for identifying intermediate- or high-risk patients for inclusion in clinical trials to enhance event rates with 2-3 years of follow-up. From a health services perspective, payers can use the model to identify patients with the highest CKD-specific costs, and who may benefit from early treatment. Although we did not evaluate the association between risk and cost of care, findings from the KFRE strongly support an association between CKD progression risk score and cost of care [22].
This analysis has several limitations. The FIDELITY population consists of patients from two randomized clinical trials and is therefore highly selected. As a result, further analyses using other large observational datasets should be considered. In addition, our model only predicts the progression of CKD (assessed by the primary composite outcome of a ≥40% decrease in eGFR, kidney failure or kidney death), whereas many patients with CKD and T2D are at higher risk of CV events than kidney failure [23-25]. This is important because finerenone has clear CV benefits that may or may not be distributed along the same risk categories as CKD progression benefits [8]. Therefore, refitting of the model to predict CV events or validation of a de novo CV model is needed to predict cardiorenal risk and potential cardiorenal benefit from therapy in this population. Furthermore, the severity of outcomes in the primary composite outcome differed (e.g. eGFR decrease versus kidney failure); however, it was not possible to explore these separately due to the study designs of the FIDELIO-DKD and FIGARO-DKD trials. Despite this, kidney composite outcomes were primarily driven by a ≥40% decrease in eGFR because patients with an eGFR of <25 ml/min/1.73 m² were excluded from FIDELITY. Given that FIDELITY includes a large population of patients with CKD and T2D across the spectrum of CKD severity, using this dataset to develop a prognostic model to estimate CKD progression and response to treatment should be considered in the future. Moreover, the Klinrisk model requires commonly ordered laboratory data, which, although accessible prospectively, may not be available retrospectively. However, the initial validation studies confirmed the model can be highly accurate with missing data by using imputation algorithms with a preserved ability to score as long as >7 of 20 laboratory variables are present. Although the model is externally validated, additional studies are needed to address the impact of the model predictions and associated clinical decision support on quality and processes of CKD care and downstream outcomes. In addition, the Klinrisk model contains 20 routinely collected laboratory variables. Future iterations of the model could be simplified to require fewer variables without sacrificing model performance. Finally, cost-effectiveness analyses that compare the benefit of a risk-based treatment strategy with the current standard of care will be needed prior to widespread adoption of prediction tools.
In conclusion, our findings demonstrate the accuracy and potential utility of a machine learning risk-prediction model, Klinrisk, based on routinely collected laboratory data for identifying patients at the highest risk of CKD progression early in their course of disease. Prospective studies implementing these models in electronic health records and laboratory information systems to identify and treat patients with high-risk CKD and T2D are needed.

Figure 1: Calibration plots for the Klinrisk prediction model for the primary composite outcome (eGFR decrease ≥40% or kidney failure) at (A) 2, (B) 3 and (C) 4 years after randomization in the FIDELITY population. RFSRC: Random Forests for Survival, Regression, and Classification.

Figure 2: Histogram of predicted risk by the Klinrisk prediction model for the primary composite outcome (eGFR decrease ≥40% or kidney failure) in the FIDELITY population over 3 years. Patients were divided into 10 groups using the deciles of the predicted risk from the model. Thresholds were determined based on the distribution of the data and previous publications in the field [11, 12]. IQR: interquartile range; min: minimum; max: maximum; Q: quantile; SD: standard deviation.

Table 1: Baseline demographics and laboratory values of the FIDELITY population by treatment
a Blood eGFR measured using the CKD-EPI equation. b Serum or plasma. c Preferred term in MedDRA version 23.1. d MedDRA Labelling Groupings term in MedDRA version 23.1. ALP: alkaline phosphatase; CKD-EPI: Chronic Kidney Disease Epidemiology Collaboration; GGT: gamma-glutamyl transferase; MedDRA: Medical Dictionary for Regulatory Activities; Q: quantile; SD: standard deviation.