ABSTRACT

Background

Interstitial inflammation and peritubular capillaritis are observed in many diseases on native and transplant kidney biopsies. A precise and automated evaluation of these histological criteria could help stratify patients’ kidney prognoses and facilitate therapeutic management.

Methods

We used a convolutional neural network to evaluate those criteria on kidney biopsies. A total of 423 kidney samples from various diseases were included; 83 kidney samples were used for the neural network training, 106 for comparing manual annotations on limited areas to automated predictions, and 234 to compare automated and visual gradings.

Results

The precision, recall and F-score for leukocyte detection were 81%, 71% and 76%, respectively. For peritubular capillary detection, the precision, recall and F-score were 82%, 83% and 82%, respectively. There was a strong correlation between the predicted and observed grading of total inflammation, and between the predicted and observed grading of capillaritis (r = 0.89 and r = 0.82, respectively, both P < .0001). The areas under the receiver operating characteristic curves for the prediction of pathologists’ Banff total inflammation (ti) and peritubular capillaritis (ptc) scores were at least 0.94 and 0.86, respectively. The kappa coefficients between the visual and the neural networks' scores were 0.74, 0.78 and 0.68 for ti ≥1, ti ≥2 and ti ≥3, and 0.62, 0.64 and 0.79 for ptc ≥1, ptc ≥2 and ptc ≥3, respectively. In a subgroup of patients with immunoglobulin A nephropathy, the inflammation severity was highly correlated with kidney function at biopsy on univariate and multivariate analyses.

Conclusion

We developed a tool using deep learning that scores the total inflammation and capillaritis, demonstrating the potential of artificial intelligence in kidney pathology.

KEY LEARNING POINTS

What was known:

  • Total interstitial inflammation and peritubular capillaritis are observed in many native and transplant kidney diseases.

  • These lesions are frequently evaluated for diagnostic, severity and prognostic purposes, but these histological evaluations suffer from a lack of precision and reproducibility between pathologists.

  • This work aimed at automating and standardizing the grading of total interstitial inflammation and peritubular capillaritis with a convolutional neural network.

This study adds:

  • We developed and evaluated a tool that effectively segments leukocytes and peritubular capillaries on Masson's trichrome–stained kidney samples.

  • The convolutional neural networks' predictions for the total inflammation and the peritubular capillaritis scores were close to those of trained kidney pathologists.

  • This deep learning tool could also provide more precise predictions, such as leukocyte density, which is closely related to the kidney function of patients with immunoglobulin A nephropathy.

Potential impact:

  • A more homogenized evaluation of total inflammation and capillaritis could help stratify patients’ kidney prognoses and guide therapeutic management.

  • If the prognostic impact of these automatic evaluations is confirmed, we can imagine the rise of dedicated histological classifications in native and transplant kidney biopsies.

  • The interstitial leukocyte density, which can be routinely calculated with the tool, might limit the impact of fibrosis and edema on the inflammation assessment and might be a stronger marker of interstitial inflammation.

INTRODUCTION

Interstitial inflammation is defined by kidney leukocyte infiltration sometimes associated with peritubular capillaritis and/or tubulitis [1]. Interstitial inflammation is the main lesion in tubulo-interstitial nephritis (TIN) diseases but can also be observed in other situations such as graft rejections and glomerulonephritis [2–5]. These histological lesions can lead to tubular dysfunction, fibrosis and kidney failure [1, 2, 6]. As an example, the grade of interstitial inflammation in immunoglobulin A (IgA) nephropathy (IgAN) is linked to the risk of disease progression [7–10]. Thus, in addition to its diagnostic function, interstitial inflammation is frequently evaluated in many diseases for prognostic purposes [11, 12].

Total interstitial inflammation can be graded as the percentage of affected cortical area but, to limit its lack of reproducibility, a semi-quantitative grading is frequently chosen instead [8, 13–15]. The total inflammation (ti) and peritubular capillaritis (ptc) scores of the Banff classification currently represent one of the main standardizing methods [5, 16–18]. Nevertheless, these semi-quantitative evaluations still suffer from poor to moderate interrater reliability [14, 19, 20]. A more precise and reproducible evaluation could help target patients at risk of disease progression and guide therapeutic management [19].

Artificial intelligence has led to many advances in kidney pathology. Our team and others have previously shown that convolutional neural networks can automate the measurement of several quantitative histological criteria including interstitial fibrosis, tubular atrophy and mean glomerular density [21, 22]. Thanks to its high reproducibility, deep learning limits inter-observer variability and allows exhaustive and precise segmentations [14, 15, 23–26]. This high precision also allows us to refine the quantification of histological abnormalities and to measure histologic criteria that are virtually impossible to assess routinely by a pathologist. This work aims at automating the grading of total interstitial inflammation and peritubular capillaritis with a convolutional neural network in Masson's trichrome–stained kidney samples.

MATERIALS AND METHODS

Population

Kidney samples were obtained from the university hospitals of Dijon between January 2009 and January 2023, and from Besançon between January 2016 and January 2020. Several types of kidney samples were included:

  • kidney biopsies with either acute or chronic TIN, primary IgAN, IgA vasculitis or minimal change disease;

  • transplant kidney biopsies at the Dijon center or protocol transplant biopsies performed within the first year of transplantation at the Besançon center;

  • non-tumor kidney samples from total nephrectomies for cancer.

Biopsies with tumor lesions were excluded. Patients had to be 14 years of age or older and give oral consent for research purposes. This work was approved by the local ethics committee and complied with the Declaration of Helsinki.

Clinico-biological data on the day of the biopsy were retrospectively collected and included age, sex, history of diabetes, hypertension, use of renin–angiotensin system inhibitors, serum creatinine level and proteinuria. The estimated glomerular filtration rate (eGFR) was calculated using the Chronic Kidney Disease Epidemiology Collaboration formula. To evaluate clinical correlations in homogeneous populations, patients with IgAN or a transplant were subsequently evaluated in secondary analyses.

Training, test and application cohorts

Kidney samples from 423 patients were divided into three independent groups (Fig. 1).

Figure 1:

Training, test and application cohorts. MCD, minimal change disease.

The training cohort's purpose was to train the neural network on recognizing leukocytes and peritubular capillaries in limited areas. This group consisted of 83 kidney samples from Dijon including 43 transplant biopsies, 16 IgAN, 9 IgA vasculitis, 7 TIN, 5 nephrectomies and 3 minimal change diseases.

The test cohort's purpose was to validate the neural network detection performances in limited areas. It compared manual annotations with the network’s predictions. This group consisted of 36 samples from Dijon (17 IgAN, 5 TIN, 5 minimal change diseases, 5 nephrectomies and 4 IgA vasculitis), and 70 samples from Besançon (60 protocol transplant biopsies and 10 IgAN) for external validation.

The application cohort's purpose was to compare the automated and visual gradings on whole biopsies and nephrectomy samples. This cohort consisted of 234 kidney samples from Dijon and Besançon including 89 IgAN, 46 TIN, 20 minimal change diseases, 20 IgA vasculitis biopsies, 19 nephrectomies, and 40 transplant biopsies (32 graft rejection biopsies and 8 normal biopsies).

Histological analyses

Biopsies were formalin-fixed, paraffin-embedded, cut into 2-μm sections, and stained with blue or green Masson's trichrome. Transplant biopsies from the application cohort were stained with unpolarized Sirius Red, and the resulting images were visually analyzed and compared with the neural networks' assessment of fibrosis. The slides were read, analyzed and annotated blindly to patients’ medical histories. The digitization of the biopsy slides was performed using the Hamamatsu scanner (model C9600-12) with a 200× lens, and a resolution of 454 nm/pixel. Biopsies were inferred at a 25× zoom and nephrectomy samples at a 100× zoom. Images were analyzed and manually annotated by two trained nephropathologists using ASAP annotation software (ASAP, Netherlands). The gold standard was defined by the mean of the two pathologists’ evaluations.

Algorithms

Training and evaluations were carried out on a PC equipped with a Titan RTX graphics card (Nvidia, CA, USA; 24 GB VRAM). The convolutional neural network was Mask R-CNN Inception ResNet V2, implemented in Python using TensorFlow and Keras. The implementation is based on the existing Mask R-CNN GitHub repository and is available on GitHub [27]. We previously developed two preliminary training steps with this neural network [21]. The first training consisted of isolating the cortical area from the capsule, the medulla and the background. Within the cortical area, the second training consisted of recognizing the following structures: sclerotic and non-sclerotic glomeruli, healthy and atrophic tubules, arteries and veins.

In the current study, after this pre-processing step, we used an image cleaned of the cortical structures previously detected by the second training. Thus, the areas of interest contained almost exclusively interstitial areas. This method could enhance the detection accuracy by limiting the number of histological structures encountered by the network. This third algorithm was based on the segmentation of leukocytes and peritubular capillaries. Within the application cohort, the three algorithms were automatically and sequentially executed.

Neural network training and testing

No biopsy that had been used to train the previous algorithms was used in the test and application cohorts. For training and testing, several regions at, respectively, a ×200 zoom and a ×400 zoom were randomly selected. A preprocessing phase used the annotations selected by previous algorithms. The image was then sliced into small vignettes of 1024 × 1024 pixels with at least 33% overlap between adjacent vignettes. This resulted in 902 vignettes (95 different regions), which were used for the training of the neural network. A total of 20 840 leukocytes and 7962 peritubular capillaries were manually annotated for the training. To artificially enhance the training data, the images were randomly rotated by 90° at each epoch (reiteration of training). The neural network was trained for 600 epochs. A total of 116 different regions and 718 vignettes were used for the test cohort. A total of 3599 leukocytes and 2310 peritubular capillaries were manually annotated for the test.
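The slicing step described above can be illustrated with a minimal sketch. It is not the authors' code: the function names `tile_origins`, `tile_boxes` and `rotate90` are hypothetical, and the stride is an assumption chosen so that adjacent 1024-pixel vignettes share at least 33% of their side.

```python
import math

def tile_origins(length, tile=1024, min_overlap=0.33):
    """Start offsets of tiles covering `length` pixels along one axis."""
    if length <= tile:
        return [0]  # a single tile; the caller must pad the image if needed
    # Largest stride that still leaves `min_overlap` of shared pixels.
    stride = tile - math.ceil(tile * min_overlap)  # 686 px with the defaults
    origins = list(range(0, length - tile, stride))
    origins.append(length - tile)  # last tile is flush with the far edge
    return origins

def tile_boxes(width, height, tile=1024, min_overlap=0.33):
    """(left, top, right, bottom) boxes of all vignettes of an image."""
    return [(x, y, x + tile, y + tile)
            for y in tile_origins(height, tile, min_overlap)
            for x in tile_origins(width, tile, min_overlap)]

def rotate90(grid):
    """Rotate a 2D list clockwise by 90 degrees (augmentation step)."""
    return [list(row) for row in zip(*grid[::-1])]

# Hypothetical 3000 x 2000 pixel region, for illustration only.
boxes = tile_boxes(3000, 2000)
print(len(boxes), boxes[0], boxes[-1])
```

Because the last vignette of each row and column is shifted back to stay inside the image, its overlap with the previous vignette can exceed 33%, which remains compatible with the "at least 33%" constraint.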

Lesions grading

The cortical area without annotation by the second algorithm was considered an interstitial area. The number of annotations per category and the area of each category were obtained. The percentage of total inflammation was the ratio of total leukocyte area to cortical area. The number of peritubular capillaries containing leukocytes as well as the number of leukocytes found within them were evaluated. Total inflammation, capillaritis and interstitial fibrosis were classified according to the ti, ptc and interstitial fibrosis (ci) scores of the latest version of the Banff classification (Supplementary Methods) [5]. The percentages of total inflammation and interstitial fibrosis were visually evaluated in steps of 5%. The predicted percentage of interstitial fibrosis corresponded to the ratio between the surface of the areas that were not annotated by the second neural network and the total cortical area [21]. We estimated leukocyte interstitial density based on glomerular density assessment methods (Supplementary Methods) [28]. This density was only evaluated on the interstitial area. For patients with primary IgAN, the MEST-C score was assessed by pathologists [29, 30].
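As an illustration of the ratio defined above, the following sketch computes the predicted percentage of total inflammation and converts it into a grade. The function names and the input areas are hypothetical; the cut-offs are those reported with Fig. 5 for predicted ti scores, and treating a value equal to a cut-off as reaching the grade is an assumption.

```python
from bisect import bisect_right

def total_inflammation_percent(leukocyte_area, cortical_area):
    """Percentage of the cortical area covered by segmented leukocytes."""
    if cortical_area <= 0:
        raise ValueError("cortical_area must be positive")
    return 100.0 * leukocyte_area / cortical_area

def grade_from_cutoffs(value, cutoffs):
    """Number of ascending cut-offs reached by `value` (0 to len(cutoffs))."""
    # Assumption: a value equal to a cut-off counts as reaching it.
    return bisect_right(sorted(cutoffs), value)

# Cut-offs (in %) for predicted ti1, ti2 and ti3, as reported with Fig. 5.
PREDICTED_TI_CUTOFFS = (16.5, 20.1, 30.4)

# Hypothetical areas in mm2, for illustration only.
percent = total_inflammation_percent(leukocyte_area=2.0, cortical_area=8.0)
ti_grade = grade_from_cutoffs(percent, PREDICTED_TI_CUTOFFS)
print(percent, ti_grade)
```

Keeping the cut-offs as a parameter makes it possible to apply either the visual Banff thresholds or thresholds recalibrated on predicted values.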

Statistical analysis

Quantitative data were expressed as mean and standard deviation. Semi-quantitative data were expressed as numbers and percentages. The correlation between two quantitative variables was calculated using the Spearman test. Multiple linear regression was used to assess the effect of several quantitative variables on a target variable (eGFR). Student's t-test or the Mann–Whitney U test was used to compare two quantitative variables, depending on whether the distribution was normal or not. The performance of the neural networks was evaluated by precision, recall, F-score and intersection over union (IOU) (Supplementary Methods). Inter-observer variability was assessed with Cohen's Kappa (ĸ) test. A ĸ <0.40 was considered poor, 0.40–0.59 moderate, 0.60–0.79 substantial and ≥0.80 major. Receiver operating characteristics (ROC) curves were constructed for the prediction of ti and ptc scores. As the range of predicted inflammation was smaller than the observed range, the Youden index was used to determine the thresholds with the best sensitivity and specificity for predicted ti scores. Statistical analyses were performed using GraphPad PRISM 6.01 software (GraphPad Software, La Jolla, CA, USA) and IBM SPSS 23 software (IBM, Chicago, IL, USA).
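The threshold selection mentioned above can be sketched as follows. This is a generic illustration of the Youden index (J = sensitivity + specificity − 1), not the procedure of the statistical software used in the study. The function name, the example data and the rule "positive when the score is greater than or equal to the cut-off" are assumptions.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity) maximizing the Youden index."""
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    best = None
    for cutoff in sorted(set(scores)):
        # Assumption: a sample is predicted positive when score >= cutoff.
        tp = sum(1 for s, y in zip(scores, labels) if y and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if not y and s < cutoff)
        sens, spec = tp / positives, tn / negatives
        j = sens + spec - 1
        if best is None or j > best[0]:  # ties keep the lowest cut-off
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]

# Hypothetical predicted inflammation (%) and observed status (1 = ti >= 1).
scores = [5, 8, 12, 15, 18, 22, 30, 41]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
print(youden_cutoff(scores, labels))  # (15, 1.0, 0.75)
```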

RESULTS

Population

The clinical, biological and histological data of the patients from the three cohorts are described in Table 1. Of the 423 samples included, 251 (59%) were native kidney biopsies, 144 (34%) were transplant biopsies and 29 (7%) were nephrectomy samples. Most native kidney biopsies involved IgAN (n = 138/252, 55%). Among the 58 patients with interstitial nephritis, the etiology was unknown in 22 (38%) patients, autoimmune in 13 (22%) patients, drug-induced in 13 (22%) patients, toxic in 9 (16%) patients and Hantavirus in 1 (2%) patient. Seventy-nine (55%) normal biopsies, 19 (13%) acute antibody-mediated, 19 (13%) acute T-cell-mediated, 11 (8%) chronic T-cell-mediated, 5 (3%) borderline, 6 (4%) viral nephritis, 3 (2%) mixed acute rejections and 2 (1%) chronic mixed rejections were included in the transplant biopsies.

Table 1:

The clinical, biological and histological data of the patients from the three cohorts.

Data | All patients (n = 423) | Training cohort (n = 83) | Test cohort (n = 106) | Application cohort (n = 234)
Age (years) | 53 ± 18 | 54 ± 17 | 55 ± 15 | 52 ± 19
Male sex | 281 (66) | 58 (70) | 72 (68) | 151 (65)
Diabetes mellitus | 84 (20) | 14 (17) | 24 (23) | 46 (20)
Hypertension | 300 (71) | 62 (75) | 91 (86) | 147 (63)
Native kidney biopsy | 251 (59) | 35 (42) | 41 (39) | 175 (75)
  TIN | 58 (14) | 7 (8) | 5 (5) | 46 (19)
  IgAN | 132 (31) | 16 (19) | 27 (25) | 89 (38)
  IgA vasculitis | 33 (8) | 9 (11) | 4 (4) | 20 (9)
  Minimal change disease | 28 (7) | 3 (4) | 5 (5) | 20 (8)
Kidney transplant | 144 (34) | 43 (52) | 60 (56) | 40 (17)
  Antibody-mediated rejection (a) | 24 (6) | 3 (4) | 6 (6) | 15 (6)
  T-cell-mediated/borderline rejection (a) | 40 (9) | 10 (12) | 10 (9) | 20 (9)
  Viral nephritis | 6 (2) | 3 (4) | 3 (3) | 0 (0)
Nephrectomy | 29 (7) | 5 (6) | 5 (5) | 19 (8)
Serum creatinine at biopsy (mg/dL) | 2.1 ± 1.8 | 2.0 ± 1.4 | 1.9 ± 1.4 | 2.2 ± 2.1
eGFR at biopsy (mL/min/1.73 m²) | 53 ± 34 | 52 ± 33 | 51 ± 30 | 55 ± 37
Proteinuria at biopsy (g/day) | 1.9 ± 2.6 | 1.3 ± 1.5 | 1.2 ± 1.8 | 2.4 ± 3.0
Mean interstitial fibrosis (%) (b) | 22 ± 17 | 20 ± 17 | 19 ± 15 | 25 ± 17
  ci1 (b) | 144 (34) | 41 (49) | 64 (60) | 101 (43)
  ci2 (b) | 101 (24) | 13 (16) | 13 (12) | 75 (32)
  ci3 (b) | 36 (9) | 7 (8) | 8 (8) | 21 (9)
Mean total inflammation (%) (b) | 27 ± 22 | 24 ± 22 | 24 ± 20 | 30 ± 22
  ti1 (b) | 99 (23) | 19 (23) | 29 (27) | 51 (22)
  ti2 (b) | 114 (27) | 19 (23) | 23 (22) | 72 (31)
  ti3 (b) | 72 (17) | 13 (16) | 16 (15) | 43 (18)
  ptc1 (b) | 97 (23) | 22 (27) | 33 (31) | 42 (18)
  ptc2 (b) | 124 (29) | 19 (23) | 26 (24) | 79 (34)
  ptc3 (b) | 82 (19) | 8 (10) | 8 (8) | 66 (28)

Quantitative data are expressed as mean ± standard deviation and semi-quantitative data as numbers (%).

(a) Including mixed rejections.

(b) Data from the region of interest trained and/or analyzed.

Detection accuracy

Among limited cortical areas from the 106 kidney samples of the test cohort, the neural network’s predictions were compared with pathologists’ segmentations. Regarding leukocyte detection, the precision, recall, F-score and IOU were 81%, 71%, 76% and 52%, respectively. Regarding peritubular capillaries detection, the precision, recall, F-score and IOU were 82%, 83%, 82% and 70%, respectively. The most common errors were a lack of leukocyte detection in the most inflammatory areas and granulomas. Endothelial and fibroblast cells were sometimes labeled as leukocytes, and some biopsy borders were mistaken for capillaries (Supplementary data, Fig. S1).
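The metrics reported above follow standard definitions, which can be sketched as below. The exact matching rules between predicted and annotated objects are given in the Supplementary Methods and are not reproduced here, so the function names and the pixel-set representation of masks are assumptions. The last lines check that the reported F-scores are consistent with the reported precision and recall.

```python
def detection_metrics(tp, fp, fn):
    """Precision, recall and F-score from true/false positives and negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f_score_from(precision, recall)

def f_score_from(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def mask_iou(pred_pixels, true_pixels):
    """Intersection over union of two masks given as sets of (row, col)."""
    pred, true = set(pred_pixels), set(true_pixels)
    union = pred | true
    return len(pred & true) / len(union) if union else 1.0

# Consistency check with the values reported in the text.
print(round(100 * f_score_from(0.81, 0.71)))  # leukocytes: 76
print(round(100 * f_score_from(0.82, 0.83)))  # peritubular capillaries: 82
```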

Evaluation of total interstitial inflammation and capillaritis

In the application cohort, neural networks' lesions grading was compared with that of the pathologists on 215 kidney biopsies and 19 nephrectomy samples (Fig. 2). Mean percentages of total inflammation and fibrosis were 30 ± 22% and 25 ± 17%, respectively. Respectively, 68 (29%), 51 (22%), 72 (31%) and 43 (18%) samples had a ti0, ti1, ti2 and ti3 score, and 47 (20%), 42 (18%), 79 (34%) and 66 (28%) samples had a ptc0, ptc1, ptc2 and ptc3 score.

Figure 2:

Masson's trichrome–stained kidney biopsy of a patient with tubulointerstitial nephritis before and after neural networks inferences. (A) Biopsy at a ×25 zoom. (B) Biopsy after the pre-processing steps, image cleaned of cortical structures previously detected by the first pieces of training. (C–F) Biopsy after the third training inference at ×25 (C), ×100 (D), ×200 (E) and ×400 (F) zooms. Leukocytes are artificially colorized in red and peritubular capillaries in green. Scale bars: 50 µm.

The three neural networks’ mean total inference time was 44 ± 28 min per biopsy. The tool detected a mean of 2665 ± 1647 peritubular capillaries, 7138 ± 6060 leukocytes in the interstitial area and 1026 ± 773 leukocytes inside the peritubular capillaries. The mean interstitial leukocyte density was 26 161 ± 9477 leukocytes/mm². There was a strong correlation between the predicted and observed percentage of total inflammation (r = 0.89, P < .0001) (Fig. 3), and between the predicted and observed degree of capillaritis (r = 0.82, P < .0001). The predicted percentage of inflammation was also associated with both the observed and predicted percentages of fibrosis (r = 0.77 and r = 0.89, respectively, both P < .0001). Neural network predictions based on pathologist scores and kidney diseases are shown in Fig. 4.

Figure 3:

Correlation between the predicted and observed total inflammation. Mean values of pathologists’ evaluations were used for the total cortical inflammation observed.

Figure 4:

Neural network predictions of total cortical inflammation and leukocyte capillary infiltration depending on pathologists’ scores (A, B) and kidney disease etiologies (C, D). Pathologists 1 (orange) and 2 (yellow) scores are presented in (A) and (B). Tukey box plots.

The areas under the ROC curves for the prediction of pathologists’ ti ≥1, ti ≥2 and ti3 with the neural networks’ percentage of inflammation were 0.96 [95% confidence interval (CI) 0.94–0.98], 0.95 (95% CI 0.92–0.97) and 0.94 (95% CI 0.91–0.97), respectively (with all P < .0001). The areas under the ROC curves for the prediction of pathologists’ ptc ≥1, ptc ≥2 and ptc3 with the neural networks’ leukocyte count in the most affected capillary were 0.92 (95% CI 0.88–0.97), 0.86 (95% CI 0.82–0.92) and 0.96 (95% CI 0.94–0.99), respectively (with all P < .0001) (Fig. 5). The ĸ coefficients between the visual and the neural networks' scores were respectively 0.74, 0.78 and 0.68 for ti ≥1, ti ≥2 and ti ≥3 (with all P < .0001), and 0.62, 0.64 and 0.79 for ptc ≥1, ptc ≥2 and ptc ≥3 (with all P < .0001). The ĸ coefficients between the two pathologists’ scores were, respectively, 0.84, 0.83 and 0.82 for ti ≥1, ti ≥2 and ti ≥3 (with all P < .0001), and 0.57, 0.61 and 0.51 for ptc ≥1, ptc ≥2 and ptc ≥3 (with all P < .0001).
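The agreement analysis above relies on Cohen's ĸ computed on dichotomized scores (for example ti ≥1 versus ti <1). A minimal sketch is given below, using the interpretation bands stated in the Statistical analysis section. The function names and the example ratings are hypothetical, and treating ĸ = 0.80 itself as "major" is an assumption.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa between two equally long sequences of categories."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from the marginal frequencies of each rater.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)

def agreement_label(kappa):
    """Interpretation bands used in this article."""
    if kappa < 0.40:
        return "poor"
    if kappa < 0.60:
        return "moderate"
    if kappa < 0.80:
        return "substantial"
    return "major"  # assumption: 0.80 itself is counted as major

# Hypothetical dichotomized scores (1 = ti >= 1) for ten biopsies.
visual = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
network = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
kappa = cohen_kappa(visual, network)
print(round(kappa, 2), agreement_label(kappa))
```

In this example the raters agree on 8 of 10 biopsies while chance agreement is 0.5, giving ĸ = 0.60.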

Figure 5:

ROC curves for the prediction of pathologists’ ti (A–C) and ptc (D–F) scores with the predicted percentage of cortical inflammation and leukocyte count in the most affected capillary. Optimal cut-off values for ti scores were obtained with the Youden index. The optimal cortical inflammation cut-offs were 16.5% (sensitivity of 83%, specificity of 98%) for ti1, 20.1% (sensitivity of 89%, specificity of 93%) for ti2 and 30.4% (sensitivity of 91%, specificity of 89%) for ti3. AUC, area under the curve.

Patients with IgAN

Patients in the application cohort with primary IgAN were analyzed to assess the association of predicted histological data with baseline clinico-biological characteristics. Patients’ characteristics at biopsy are described in Table 2.

Table 2:

Characteristics of IgAN patients from the application cohort.

Data | Patients (N = 89)
Age (years) | 49 ± 19
Male sex | 70 (79)
Diabetes mellitus | 11 (12)
Hypertension | 59 (66)
Renin–angiotensin–aldosterone blockers | 56 (63)
Serum creatinine level (mg/dL) | 2.3 ± 2.4
eGFR at biopsy (mL/min/1.73 m²) | 59 ± 39
Proteinuria (g/day) | 2.8 ± 3.0
  M1 | 31 (35)
  E1 | 46 (51)
  S1 | 59 (66)
  T1 | 34 (38)
  T2 | 8 (9)
  C1 | 4 (4)
  C2 | 16 (18)
Interstitial fibrosis predicted (%) | 34 ± 9
Total cortical inflammation predicted (%) | 27 ± 13
  ti1 (predicted) | 8 (9)
  ti2 (predicted) | 30 (34)
  ti3 (predicted) | 19 (21)
Leukocytes in the most affected capillary (cells) | 7 ± 3
  ptc1 (predicted) | 9 (10)
  ptc2 (predicted) | 59 (66)
  ptc3 (predicted) | 15 (17)
Leukocyte density predicted (cells/mm²) | 26 374 ± 8288

Quantitative data are expressed as mean ± standard deviation and semi-quantitative data as numbers (%).

M, mesangial hypercellularity score; E, endocapillary hypercellularity score; S, segmental sclerosis score; T, tubular atrophy/interstitial fibrosis score; C, crescent score.

In univariate analysis, the predicted percentage of total inflammation, cortical fibrosis, capillaritis score and mean leukocyte density were all associated with baseline eGFR (Fig. 6). The other factors associated with eGFR were age, hypertension, proteinuria, and M, S and C status. In multiple linear regression, only interstitial leukocyte density, age and percentage of interstitial fibrosis were associated with baseline eGFR, with β scores, respectively, of –0.36 (95% CI –0.71, –0.01, P = .042), –0.45 (95% CI –0.63, –0.27, P < .0001) and –0.47 (95% CI –0.86, –0.08, P = .019) (Table 3).

Figure 6:

Correlation between eGFR at biopsy and neural networks predictions in IgAN application cohort patients.

Table 3:

Data of IgAN patients associated with eGFR at biopsy.

Data (N = 89) | Univariate (a): r (95% CI) | P-value | Model 1 (b): Beta (95% CI) | P-value | Model 2 (c): Beta (95% CI) | P-value | Model 3 (d): Beta (95% CI) | P-value
Age (per year) | –0.64 (–0.75, –0.49) | <.001 | –0.45 (–0.63, –0.26) | <.001 | –0.45 (–0.63, –0.27) | <.001 | –0.45 (–0.63, –0.27) | <.001
Male sex | –0.03 (–0.25, 0.18) | .882 | | | | | |
Hypertension | –0.38 (–0.55, –0.18) | <.001 | –0.07 (–0.24, 0.10) | .419 | –0.09 (–0.25, 0.08) | .299 | –0.09 (–0.26, 0.08) | .292
Diabetes mellitus | –0.14 (–0.34, 0.08) | .118 | | | | | |
Renin–angiotensin–aldosterone blockers | –0.10 (–0.31, 0.12) | .343 | | | | | |
Proteinuria (per 0.1 g/day) | –0.36 (–0.53, –0.15) | <.001 | –0.03 (–0.17, 0.11) | .668 | –0.02 (–0.16, 0.12) | .770 | –0.01 (–0.15, 0.14) | .906
M1 | –0.34 (–0.51, –0.13) | .001 | –0.07 (–0.21, 0.09) | .287 | –0.07 (–0.22, 0.08) | .344 | –0.08 (–0.23, 0.07) | .270
E1 | –0.16 (–0.36, 0.06) | .131 | | | | | |
S1 | –0.23 (–0.42, –0.01) | .035 | 0.03 (–0.12, 0.18) | .653 | 0.03 (–0.11, 0.18) | .655 | 0.03 (–0.12, 0.18) | .679
Predicted cortical fibrosis (per %) | –0.68 (–0.78, –0.54) | <.001 | –0.18 (–0.46, 0.09) | .197 | –0.32 (–0.49, –0.15) | <.001 | –0.47 (–0.86, –0.08) | .019
C >1 | –0.19 (–0.38, 0.02) | .080 | | | | | |
Predicted total cortical inflammation (per %) | –0.67 (–0.77, –0.53) | <.001 | –0.28 (–0.55, 0.00) | .054 | | | 0.24 (–0.33, 0.81) | .396
Predicted ptc >2 | –0.21 (–0.41, –0.01) | .046 | 0.10 (–0.05, 0.25) | .207 | 0.14 (–0.01, 0.30) | .072 | 0.16 (–0.00, 0.32) | .055
Predicted leukocyte density (per cell/mm²) | –0.47 (–0.62, –0.28) | <.001 | | | –0.23 (–0.40, –0.06) | .007 | –0.36 (–0.71, –0.01) | .042

(a) Spearman test.

(b) Linear regression Model 1 with age, hypertension, proteinuria, M1, S1, predicted cortical fibrosis, predicted total cortical inflammation, predicted ptc >2.

(c) Linear regression Model 2 with age, hypertension, proteinuria, M1, S1, predicted cortical fibrosis, predicted ptc >2, predicted leukocyte density.

(d) Linear regression Model 3 with age, hypertension, proteinuria, M1, S1, predicted cortical fibrosis, predicted total cortical inflammation, predicted ptc >2, predicted leukocyte density.

M, mesangial hypercellularity score; E, endocapillary hypercellularity score; S, sclerosis score; C, crescent score.

P-values of the factors statistically associated with the eGFR are bolded.


Transplant biopsies

Patients from the application cohort with transplant biopsies were analyzed to assess the predictive ability of the neural networks for transplant inflammation and fibrosis. Patients’ characteristics are described in Table 4. The area under the ROC curve for the neural networks’ percentage of total inflammation to predict cellular rejection was 0.84 (95% CI 0.71–0.96, P < .0001). The area under the ROC curve for the neural networks’ leukocyte count in the most affected capillary to predict humoral rejection was 0.70 (95% CI 0.54–0.86, P = .036). To evaluate the accuracy of fibrosis detection by the second neural network, we compared its results with those of visual evaluation of Sirius Red–stained biopsies (Supplementary data, Fig. S2). In the 30 biopsies with enough material to be stained with Sirius Red, the mean fibrosis percentage was 34 ± 24% with Sirius Red and 32 ± 12% with the neural networks. The κ coefficients between the Sirius Red and the neural networks’ scores were 0.81, 0.61 and 0.60 for ci ≥1, ci ≥2 and ci3, respectively (all P < .0001).
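The areas under the ROC curve reported above summarize how well a continuous network output separates biopsies with and without rejection. As a minimal sketch, the AUC can be computed as the proportion of (rejection, no rejection) pairs in which the rejection biopsy has the higher score, with ties counted as half. The function name and the eight data points below are hypothetical; the statistical software actually used by the authors is not described in this section.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive case outscores a negative one.

    `labels` holds 1 for positive cases (e.g. rejection) and 0 otherwise.
    Tied scores contribute half a point, as in the Mann-Whitney U statistic.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical illustration only (not study data).
predicted_inflammation_pct = [62, 45, 30, 55, 12, 30, 8, 20]
cellular_rejection = [1, 1, 1, 1, 0, 0, 0, 0]
auc = roc_auc(predicted_inflammation_pct, cellular_rejection)
```

Here 15.5 of the 16 possible pairs are correctly ordered (one pair is tied at 30%), so the sketch returns 0.96875.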

Table 4:

Characteristics of patients from the application cohort with a transplant biopsy.

| Data | Patients (N = 40) |
| --- | --- |
| Age (years) | 51 ± 14 |
| Male sex | 23 (58) |
| Diabetes mellitus | 14 (35) |
| Hypertension | 30 (75) |
| Serum creatinine level (mg/dL) | 253 ± 149 |
| eGFR at biopsy (mL/min/1.73 m²) | 31 ± 19 |
| Proteinuria (g/day) | 1.4 ± 2.1 |
| Graft rejection at biopsy | 32 (80) |
| Antibody-mediated rejection | 12 (30) |
| T-cell-mediated rejection | 16 (40) |
| Mixed rejection | 3 (8) |
| Borderline rejection | 1 (3) |
| Percentage of non–globally sclerotic glomeruli (%) | 89 ± 19 |
| Interstitial fibrosis (%) | 34 ± 21 |
| Inflammation outside fibrosis area (%) | 32 ± 24 |
| Inflammation in fibrosis area (%) | 42 ± 26 |
| Total cortical inflammation (%) | 38 ± 23 |
| ti1 | 3 (8) |
| ti2 | 18 (45) |
| ti3 | 11 (28) |
| ptc1 | 3 (8) |
| ptc2 | 9 (23) |
| ptc3 | 19 (48) |
| Total cortical inflammation predicted (%) | 38 ± 18 |
| Interstitial fibrosis predicted (%) | 32 ± 13 |
| ti1 (predicted) | 0 |
| ti2 (predicted) | 8 (20) |
| ti3 (predicted) | 20 (50) |
| Leukocytes in the most affected capillary (cell) | 9 ± 7 |
| ptc1 (predicted) | 4 (10) |
| ptc2 (predicted) | 10 (25) |
| ptc3 (predicted) | 18 (45) |

Categorical data are expressed as numbers (%) and continuous data as mean ± standard deviation.


DISCUSSION

We developed and evaluated a tool that automatically grades total inflammation and peritubular capillaritis. The tool detected interstitial leukocytes and peritubular capillaries with good precision and recall. Agreement between pathologists and the network was substantial for both ti and ptc scores. The tool's predictions were associated with baseline eGFR in IgAN patients and with the diagnosis of rejection in transplant biopsies.
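Agreement between visual and automated grades is reported in this article as Cohen's kappa on binarized Banff thresholds (for example ti ≥1, ti ≥2 and ti ≥3). The sketch below shows that calculation in pure Python: the ordinal grades of both raters are dichotomized at a threshold, and observed agreement is then corrected for the agreement expected by chance. The function names and the ten pairs of grades are hypothetical and are not study data.

```python
def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical ratings."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal frequency per category.
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use one category")
    return (observed - expected) / (1.0 - expected)


def at_least(grades, threshold):
    """Dichotomize ordinal grades, e.g. ti >= 1."""
    return [g >= threshold for g in grades]


# Hypothetical Banff ti grades for ten biopsies (not study data).
pathologist_ti = [0, 1, 2, 3, 2, 0, 1, 3, 2, 0]
network_ti = [0, 1, 2, 3, 3, 0, 0, 3, 2, 1]

kappa_ti1 = cohen_kappa(at_least(pathologist_ti, 1), at_least(network_ti, 1))
```

In this toy example the raters agree on 8 of 10 biopsies at the ti ≥1 threshold; with a chance agreement of 0.58, kappa is 0.22/0.42, roughly 0.52.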

A total of 423 samples of various diseases from two different centers were included. This sample diversity and the external validation may reduce the impact of image alterations linked to variability in conditioning, staining protocols and digitization. Included biopsies contained near-normal tissue, such as in minimal change disease and nephrectomy samples, as well as highly inflammatory tissue, such as in T-cell-mediated rejection and TIN, and tissue with an intermediate level of inflammation, such as in IgAN and borderline rejection [7, 8]. This heterogeneity in inflammation severity could help the tool generalize to many kidney diseases.

Few studies have focused on automated quantification of the interstitial area with classical unlabeled stains. Hermsen et al. developed a tool that is able to estimate the interstitial surface [22]. Even if this surface was correlated with pathologists’ ti scores (r = 0.71), this reflected only the statistical association between interstitial fibrosis/edema and total inflammation, as also described in the current study (r = 0.77). Our tool provided a more reliable measurement of total inflammation through leukocyte segmentation. Yi et al. developed a similar leukocyte detection algorithm in a kidney transplant study, but their correlation with the total inflammation score was limited by the low number of biopsies with significant inflammation [31]. Our tool had a good ability to assess ti scores, with all areas under the ROC curve >0.94, and a strong correlation between the predicted and observed percentages of inflammation (r = 0.89). This accuracy could partly be explained by the high number of leukocytes annotated for training. This number of objects was higher than in most nephropathology studies with deep learning [22, 32–34]. We used the same deep learning network as Yi et al. and our previous work [21, 31]. This convolutional neural network was designed for the segmentation of small objects, as it was pre-trained on the recognition of cell nuclei [35].

In the study by Jayapandian et al., periodic acid–Schiff was judged to be the best stain for peritubular capillary segmentation because it delineates vascular membranes better [32]. Their F-score of 81% was close to ours with Masson's trichrome (82%). Our pretreatment step, which removes tubular, vascular and glomerular structures, may explain these similar scores despite the slightly lower number of capillaries used for training [32]. An evaluation with periodic acid–Schiff stain would probably have helped to generalize our results, as it is likely the most commonly used stain for kidney histological analysis. We used Masson's trichrome in our study because our previously published neural networks for evaluating fibrosis were trained and evaluated only on this stain [21, 36]. In many institutions, Masson's trichrome is the preferred stain for quantifying the matrix [37–39]. It allows screening for interstitial fibrosis and inflammation, and helps with the recognition of interstitial edema (which has a pale green or blue appearance) [40]. However, the evaluation of fibrosis by Masson's trichrome varies between pathology centers because the dyes are sensitive to the duration of formalin fixation [37]. Other stains, such as unpolarized Sirius Red and collagen III immunohistochemistry, are believed to be more specific for interstitial fibrosis as they bind to collagen fibers [37, 39]. However, these techniques are time-consuming, expensive and less widely available. Moreover, the assessment of fibrosis by our neural networks does not depend on the color intensity of the staining: it corresponds to the cortical area not recognized as tubular, glomerular or vascular structures. Similar training based on instance segmentation with these stains would therefore probably not improve fibrosis recognition. Among the transplant biopsies, we observed good agreement between our neural network and unpolarized Sirius Red.
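The fibrosis estimate described above is a residual area: the part of the cortex that the network does not assign to tubules, glomeruli or vessels. The sketch below shows that arithmetic under simple assumptions, namely that the segmented areas are available in a common unit (mm² here) and do not overlap. The function name, its parameters and the example areas are hypothetical; the published implementation may compute this quantity differently.

```python
def residual_interstitial_pct(cortex_area, tubule_area, glomerulus_area, vessel_area):
    """Percentage of cortex not recognized as tubule, glomerulus or vessel.

    All areas must share one unit. No stain intensity enters the calculation,
    only the segmented areas.
    """
    if cortex_area <= 0:
        raise ValueError("cortex_area must be positive")
    structures = tubule_area + glomerulus_area + vessel_area
    # Clamp at zero in case segmentation masks slightly exceed the cortex mask.
    residual = max(cortex_area - structures, 0.0)
    return 100.0 * residual / cortex_area


# Hypothetical areas in mm^2 (not study data).
pct = residual_interstitial_pct(
    cortex_area=4.0, tubule_area=2.4, glomerulus_area=0.3, vessel_area=0.1
)
```

With these illustrative areas, 1.2 of 4.0 mm² is left unassigned, giving a residual of 30%.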

To our knowledge, capillaritis has not previously been evaluated with deep learning. We observed substantial diagnostic accuracy. Most errors were due to endothelial cells identified as leukocytes and to missed detections in the most inflammatory areas. A larger training set including more highly inflammatory TIN biopsies could improve these results. Nonetheless, our reproducibility for ptc scores was higher than that observed between trained pathologists in this and another study [14].
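The two error types mentioned above map onto the detection metrics reported in the abstract: endothelial cells labeled as leukocytes are false positives and lower precision, while missed leukocytes are false negatives and lower recall. A minimal sketch of the standard definitions follows; the function names and the object counts are hypothetical, whereas the two precision and recall pairs used in the usage note are those stated in the abstract.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def detection_metrics(true_positives, false_positives, false_negatives):
    """Return (precision, recall, F-score) from object-level detection counts."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall, f_score(precision, recall)


# Hypothetical counts (not study data): 82 correct detections,
# 18 false alarms and 17 missed objects.
precision, recall, f1 = detection_metrics(82, 18, 17)
```

Applied to the values given in the abstract, `f_score(0.81, 0.71)` is about 0.757 and `f_score(0.82, 0.83)` is about 0.825, consistent with the reported F-scores of 76% for leukocytes and 82% for peritubular capillaries.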

As previously described, baseline kidney function in IgAN was linked to the severity of interstitial inflammation [7–10]. CD20-positive B cells are thought to form the main cell population infiltrating the interstitial area in IgAN [41]. These B lymphocytes promote fibrosis, inflammation and kidney destruction through their secretion of cytokines and chemokines [42]. In its initial definition, the MEST-C classification did not retain the evaluation of the interstitial infiltrate because of its low reproducibility among pathologists [43]. In univariate analysis, the predicted percentage of total inflammation was strongly correlated with eGFR. This suggests that a precise measurement of inflammation is needed to accurately reflect disease severity. Unlike evaluations by different pathologists, the predictions of deep learning tools are highly reproducible [19]. If the prognostic impact of interstitial inflammation in IgAN is confirmed, this evaluation could be standardized by deep learning in a dedicated classification. Of note, in multivariate analysis, the MEST-C components other than the fibrosis evaluation were no longer associated with kidney function at biopsy. We also evaluated a new interstitial inflammation criterion, the interstitial leukocyte density. This evaluation might limit the impact of fibrosis and edema on the inflammation assessment. This density cannot be calculated routinely by a pathologist as it requires a comprehensive assessment of cortical, glomerular, vascular and leukocyte areas. In the multivariate analysis, leukocyte density was more strongly associated with eGFR than the percentage of total inflammation. Thus, this precise evaluation may be a stronger marker of interstitial inflammation than the estimated surface of interstitial inflammation relative to the cortical area. Nonetheless, this work did not assess whether leukocyte density is more closely related to kidney prognosis than the percentage of total inflammation. As the purpose of this study was to automate the grading of histological markers, we did not study patients’ kidney function over time. These prognostic performances need to be evaluated in another study with a larger number of IgAN patients.

The percentage of predicted cortical inflammation and the leukocyte count in the most affected capillary were associated with the diagnoses of T-cell-mediated and antibody-mediated rejection, respectively. However, as the number of normal biopsies in this kidney transplant subgroup was low, and as rejection diagnoses do not depend solely on these histological lesions (they also rely on, among others, C4d staining, glomerulitis, tubulitis, vascular lesions and donor-specific antibodies), the predictive values should be interpreted with caution [5]. Since our tool is not a rejection classifier, unlike that of Kers et al., its diagnostic capacity is also necessarily lower [44].
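The two quantities discussed above, the percentage of inflamed cortex and the leukocyte count in the most affected capillary, are the inputs to the Banff ti and ptc grades. The sketch below maps them to grades using the standard Banff cut-offs (ti: <10%, 10–25%, 26–50%, >50% of cortex; ptc: at least 10% of cortical peritubular capillaries involved, with a maximum of 3–4, 5–10 or >10 leukocytes in the most affected capillary). It is an assumption that these cut-offs are applied directly to the network outputs; the thresholds actually used are described in the Methods and may differ. The function and parameter names are hypothetical.

```python
def banff_ti(inflamed_cortex_pct):
    """Banff total inflammation grade from the percentage of inflamed cortex."""
    if inflamed_cortex_pct < 10:
        return 0
    if inflamed_cortex_pct <= 25:
        return 1
    if inflamed_cortex_pct <= 50:
        return 2
    return 3


def banff_ptc(max_leukocytes_in_capillary, involved_capillary_fraction):
    """Banff peritubular capillaritis grade.

    `involved_capillary_fraction` is the fraction (0-1) of cortical peritubular
    capillaries containing at least one leukocyte; below 0.10 the grade is 0.
    """
    if involved_capillary_fraction < 0.10 or max_leukocytes_in_capillary < 3:
        return 0
    if max_leukocytes_in_capillary <= 4:
        return 1
    if max_leukocytes_in_capillary <= 10:
        return 2
    return 3


# Illustrative inputs only, chosen near the cohort means in Table 4.
ti_grade = banff_ti(38)
ptc_grade = banff_ptc(max_leukocytes_in_capillary=9, involved_capillary_fraction=0.3)
```

Under these assumed cut-offs, 38% inflamed cortex maps to ti2 and 9 leukocytes in the most affected capillary maps to ptc2.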

This study had some limitations. First, the application cohort was designed to cover a wide range of inflammatory lesions, which led to a higher proportion of biopsy samples with inflammatory and fibrotic lesions than in the other groups and, consequently, to fewer normal transplant biopsies. Second, our tool provides only a total inflammation score and therefore cannot separately evaluate inflammation inside and outside fibrotic areas, as the i-IFTA and i scores do [5, 45]. Even though the ti score seems to better reflect the patient's kidney prognosis, another study is needed to delineate these areas as well as tubulitis lesions [18, 46, 47].

While our tool could reduce costs and save time, it cannot distinguish leukocyte subclasses. Even polymorphonuclear leukocytes were not separated from the others, as labeling uncertainty was too high at the available image resolution. Hermsen et al. previously carried out a deep learning study on transplant biopsies using multiplex immunohistochemistry to classify leukocyte subclasses [33]. This method allows multiple immunostainings to be carried out sequentially on the same section. However, labeling techniques are time-consuming and expensive. Although a comparison with immunohistochemistry could have provided additional validation, this assessment would have been carried out on a different section from the one stained with Masson's trichrome. Immunohistochemistry would therefore not have ensured that the recognized cells were indeed leukocytes and not fibroblasts or endothelial cells.

In conclusion, we developed a tool using deep learning that scores the total inflammation and capillaritis, demonstrating the potential of artificial intelligence in kidney pathology.

FUNDING

This work was funded by the NEPHRIN-APJ2019 (Appel d'offre jeunes chercheurs) GIRCI EST (47755 euros) (Mathieu Legendre).

AUTHORS’ CONTRIBUTIONS

A.Jacq, G.T., M.P., A.Jaugey, E.M., L.M., J.-M.R. and M.L. were responsible for conception, analysis and interpretation of data. A.Jacq, E.M., A.Jaugey and M.L. drafted the article. A.Jacq, G.T., L.M. and M.L. were responsible for histological digitization and/or analysis. Deep learning algorithm programming and evaluations were carried out by A.Jaugey, M.P., M.A. and P.B. M.A., P.B., D.C., J.B., D.D., T.C., M.C., M.F.V. and S.F. helped with data acquisition and analysis. C.T. provided intellectual content of critical importance to the work described. All authors gave final approval of the version to be published.

DATA AVAILABILITY STATEMENT

The data underlying this article will be shared on reasonable request to the corresponding author. The neural networks are freely available (https://github.com/AdrienJaugey/Mask-R-CNN-Inference-Tool).

CONFLICT OF INTEREST STATEMENT

None declared.

REFERENCES

1. Joyce E, Glasner P, Ranganathan S et al. Tubulointerstitial nephritis: diagnosis, treatment, and monitoring. Pediatr Nephrol 2017;32:577–87.

2. Praga M, González E. Acute interstitial nephritis. Kidney Int 2010;77:956–61.

3. Chang A, Clark MR, Ko K. Cellular aspects of the pathogenesis of lupus nephritis. Curr Opin Rheumatol 2021;33:197–204.

4. Almaani S, Meara A, Rovin BH. Update on lupus nephritis. Clin J Am Soc Nephrol 2017;12:825–35.

5. Loupy A, Haas M, Roufosse C et al. The Banff 2019 Kidney Meeting report (I): updates on and clarification of criteria for T cell– and antibody-mediated rejection. Am J Transplant 2020;20:2318–31.

6. Heilman RL, Devarapalli Y, Chakkera HA et al. Impact of subclinical inflammation on the development of interstitial fibrosis and tubular atrophy in kidney transplant recipients. Am J Transplant 2010;10:563–70.

7. Myllymäki JM, Honkanen TT, Syrjänen JT et al. Severity of tubulointerstitial inflammation and prognosis in immunoglobulin A nephropathy. Kidney Int 2007;71:343–8.

8. Rankin AJ, Kipgen D, Geddes CC et al. Assessment of active tubulointerstitial nephritis in non-scarred renal cortex improves prediction of renal outcomes in patients with IgA nephropathy. Clin Kidney J 2019;12:348–54.

9. Soares MF, Genitsch V, Chakera A et al. Relationship between renal CD68+ infiltrates and the Oxford Classification of IgA nephropathy. Histopathology 2019;74:629–37.

10. Pei G, Zeng R, Han M et al. Renal interstitial infiltration and tertiary lymphoid organ neogenesis in IgA nephropathy. Clin J Am Soc Nephrol 2014;9:255–64.

11. Gomes MF, Mardones C, Xipell M et al. The extent of tubulointerstitial inflammation is an independent predictor of renal survival in lupus nephritis. J Nephrol 2021;34:1897–905.

12. Yu F, Wu L-H, Tan Y et al. Tubulointerstitial lesions of patients with lupus nephritis classified by the 2003 International Society of Nephrology and Renal Pathology Society system. Kidney Int 2010;77:820–9.

13. Wilhelmus S, Cook HT, Noël L-H et al. Interobserver agreement on histopathological lesions in class III or IV lupus nephritis. Clin J Am Soc Nephrol 2015;10:47–53.

14. Smith B, Cornell LD, Smith M et al. A method to reduce variability in scoring antibody-mediated rejection in renal allografts: implications for clinical trials - a retrospective study. Transpl Int 2019;32:173–83.

15. Oni L, Beresford MW, Witte D et al. Inter-observer variability of the histological classification of lupus glomerulonephritis in children. Lupus 2017;26:1205–11.

16. Loupy A, Mengel M, Haas M. Thirty years of the International Banff Classification for allograft pathology: the past, present, and future of kidney transplant diagnostics. Kidney Int 2022;101:678–91.

17. Hakroush S, Tampe D, Korsten P et al. Bowman's capsule rupture links glomerular damage to tubulointerstitial inflammation in ANCA-associated glomerulonephritis. Clin Exp Rheumatol 2021;39:27–31.

18. Mengel M, Reeve J, Bunnag S et al. Scoring total inflammation is superior to the current Banff inflammation score in predicting outcome and the degree of molecular disturbance in renal allografts. Am J Transplant 2009;9:1859–67.

19. Becker JU, Mayerich D, Padmanabhan M et al. Artificial intelligence and machine learning in nephropathology. Kidney Int 2020;98:65–75.

20. Furness PN, Taub N, Assmann KJM et al. International variation in histologic grading is large, and persistent feedback does not improve reproducibility. Am J Surg Pathol 2003;27:805–10.

21. Marechal E, Jaugey A, Tarris G et al. Automatic evaluation of histological prognostic factors using two consecutive convolutional neural networks on kidney samples. Clin J Am Soc Nephrol 2022;17:260–70.

22. Hermsen M, de Bel T, Boer M et al. Deep learning-based histopathologic assessment of kidney tissue. J Am Soc Nephrol 2019;30:1968–79.

23. Barisoni L, Troost JP, Nast C et al. Reproducibility of the NEPTUNE descriptor-based scoring system on whole-slide images and histologic and ultrastructural digital images. Mod Pathol 2016;29:671–84.

24. Sato N, Uchino E, Kojima R et al. Evaluation of kidney histological images using unsupervised deep learning. Kidney Int Rep 2021;6:2445–54.

25. Zee J, Hodgin JB, Mariani LH et al. Reproducibility and feasibility of strategies for morphologic assessment of renal biopsies using the Nephrotic Syndrome Study Network Digital Pathology Scoring System. Arch Pathol Lab Med 2018;142:613–25.

26. Barisoni L, Gimpel C, Kain R et al. Digital pathology imaging as a novel platform for standardization and globalization of quantitative nephropathology. Clin Kidney J 2017;10:176–87.

27. Mask R-CNN for Object Detection and Segmentation [Internet]. 2022 [cited 2022 Oct 25]. Available from: https://github.com/matterport/Mask_RCNN (25 October 2022, date last accessed).

28. Issa N, Lopez CL, Denic A et al. Kidney structural features from living donors predict graft failure in the recipient. J Am Soc Nephrol 2020;31:415–23.

29. Trimarchi H, Barratt J, Cattran DC et al. Oxford Classification of IgA nephropathy 2016: an update from the IgA Nephropathy Classification Working Group. Kidney Int 2017;91:1014–21.

30. Lv J, Shi S, Xu D et al. Evaluation of the Oxford Classification of IgA nephropathy: a systematic review and meta-analysis. Am J Kidney Dis 2013;62:891–9.

31. Yi Z, Salem F, Menon MC et al. Deep learning identified pathological abnormalities predictive of graft loss in kidney transplant biopsies. Kidney Int 2022;101:288–98.

32. Jayapandian CP, Chen Y, Janowczyk AR et al. Development and evaluation of deep learning-based segmentation of histologic structures in the kidney cortex with multiple histologic stains. Kidney Int 2021;99:86–101.

33. Hermsen M, Volk V, Bräsen JH et al. Quantitative assessment of inflammatory infiltrates in kidney transplant biopsies using multiplex tyramide signal amplification and deep learning. Lab Invest 2021;101:970–82.

34. Ligabue G, Pollastri F, Fontana F et al. Evaluation of the classification accuracy of the kidney biopsy direct immunofluorescence through convolutional neural networks. Clin J Am Soc Nephrol 2020;15:1445–54.

35. He K, Gkioxari G, Dollár P et al. Mask R-CNN. IEEE Trans Pattern Anal Mach Intell 2020;42:386–97.

36. Jaugey A, Maréchal E, Tarris G et al. Deep learning automation of MEST-C classification in IgA nephropathy. Nephrol Dial Transplant 2023;gfad039.

37. Farris AB, Alpers CE. What is the best way to measure renal fibrosis?: A pathologist's perspective. Kidney Int Suppl 2014;4:9–15.

38. Moreso F, Lopez M, Vallejos A et al. Serial protocol biopsies to quantify the progression of chronic transplant nephropathy in stable renal allografts. Am J Transplant 2001;1:82–8.

39. Farris AB, Adams CD, Brousaides N et al. Morphometric and visual evaluation of fibrosis in renal biopsies. J Am Soc Nephrol 2011;22:176–86.

40. Cathro HP, Shen SS, Truong LD. Diagnostic histochemistry in medical diseases of the kidney. Semin Diagn Pathol 2018;35:360–9.

41. Heller F, Lindenmeyer MT, Cohen CD et al. The contribution of B cells to renal interstitial inflammation. Am J Pathol 2007;170:457–68.

42. Zheng N, Xie K, Ye H et al. TLR7 in B cells promotes renal inflammation and Gd-IgA1 synthesis in IgA nephropathy. JCI Insight 2020;5:136965.

43. Working Group of the International IgA Nephropathy Network and the Renal Pathology Society, Roberts ISD, Cook HT et al. The Oxford classification of IgA nephropathy: pathology definitions, correlations, and reproducibility. Kidney Int 2009;76:546–56.

44. Kers J, Bülow RD, Klinkhammer BM et al. Deep learning-based classification of kidney transplant pathology: a retrospective, multicentre, proof-of-concept study. Lancet Digit Health 2022;4:e18–26.

45. Roufosse C, Simmonds N, Clahsen-van Groningen M et al. A 2018 reference guide to the Banff Classification of renal allograft pathology. Transplantation 2018;102:1795–814.

46. Bancu I, Hernández-Gallego A, López-Alvárez D et al. Prognostic value of modified banff score in the evolution of renal function. Transplant Proc 2016;48:2903–5.

47. Sablik KA, Clahsen-van Groningen MC, Damman J et al. Banff lesions and renal allograft survival in chronic-active antibody mediated rejection. Transpl Immunol 2019;56:101213.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)
