Plasma biomarkers to detect prevalent, or predict progressive, HIV-1-associated tuberculosis

Background
The risk of HIV-1 infected individuals developing TB is high while both prognostic and diagnostic tools remain insensitive. The predictive performance of plasma biomarkers to identify HIV-1 infected individuals likely to progress to active disease is unknown.


Methods
Thirteen preselected analytes were determined from QuantiFERON® Gold in-tube (QFT) plasma samples in 421 HIV-1 infected persons recruited within the screening and enrolment phases of a randomised controlled trial of isoniazid preventive therapy. Blood for QFT was obtained pre-randomisation. Individuals were classified into prevalent TB, incident TB and controls. Comparisons between groups, supervised learning methods and weighted correlation network analyses were applied utilising the unstimulated and background-corrected plasma analyte concentrations.


Results
Unstimulated samples showed higher analyte concentrations in prevalent and incident TB compared to controls. The largest differences were seen for CXCL10, IL-2, IL-1α and TGF-α. Predictive model analysis using unstimulated analytes discriminated best between controls and prevalent TB (area under the curve, AUC = 0·9), reasonably between incident and prevalent TB (AUC > 0·8), but poorly between controls and incident TB. Unstimulated IL-2 and IFN-γ were ranked at or near the top for all comparisons except that between controls and incident TB. Models using background-adjusted values performed poorly.


Conclusions
Single plasma biomarkers are unlikely to distinguish between disease states in HIV-1 co-infected individuals and combinations of biomarkers are required. The ability to detect prevalent TB is potentially important, as no blood test hitherto has suggested utility to detect prevalent TB amongst HIV-1 co-infected persons.


INTRODUCTION
HIV-1-associated tuberculosis (TB) is a health burden in Africa, despite implementation of the WHO directly observed short course therapy strategy (DOTS) for TB, the widespread roll-out of combination antiretroviral therapy (ART) and guidelines to increase provision of isoniazid preventive therapy (IPT). The risk of developing TB in HIV-1-infected people exceeds that in HIV-1 uninfected people even after accounting for risk reduction by ART. [1][2][3] HIV-1 co-infection impacts TB specific immune responses, causing false negative results in immune tests of TB sensitisation. Currently, testing for latent TB infection (LTBI) is performed by either Tuberculin skin testing (TST) or interferon gamma release assays (IGRA). HIV-1 infection impairs the sensitivity of both of these tests, [4][5][6] underestimating those who would benefit from IPT, particularly amongst those newly commencing ART, leading to vulnerable patients potentially being untreated.
The role of IPT to decrease the risk of TB in HIV-1-infected people is well-recognised. 7 A randomised controlled trial (RCT) of IPT plus ART versus ART alone for the prevention of TB in a large well characterised group of HIV-1-infected persons conducted in Khayelitsha, South Africa (ART-IPT Study), established that TST or IGRA negative HIV-infected persons on ART also benefit from IPT. 8 Thus, current tests for latent TB imperfectly identify those likely to benefit from IPT. It is not known whether tests evaluating an extended spectrum of biomarkers in plasma would enhance risk stratification of infected individuals, as studies assessing biomarkers other than IFN-γ to predict active disease or potential benefit from IPT are lacking. There is an urgent need for tests that can distinguish prevalent TB from LTBI, as the fear of treating TB with IPT monotherapy is a major barrier to IPT implementation.
Here we investigated the predictive performance of 13 preselected analytes determined from QuantiFERON® Gold in-tube plasma samples from HIV-1 infected persons screened for the ART-IPT Study.
Patients were grouped into those with prevalent (active) TB screened out prior to RCT randomisation; those who developed TB after randomisation and during longitudinal follow up (incident TB); and those who did not develop TB during follow up (controls). Analytes (Table S1) were selected for their relevance in active TB and LTBI based on literature at the time of study design. The hypothesis was that analytes elevated at baseline in patients who either have active TB (and thus should not start IPT) or go on to develop TB during longitudinal follow up could be potential biomarkers predictive of TB.

Study design and participants
This study was conducted within the screening and follow up of a previously reported randomised controlled trial. 8 Informed consent was required from all participants prior to screening. A total of 421 persons were included in this analysis based on IGRA sample availability at screening. These included 51 individuals who developed clinically or microbiologically confirmed TB during the 4-year duration of the study (incident TB). Prevalent TB (defined as sputum culture positive) was identified in 87 individuals at the time of randomisation (when culture results became available); these individuals were referred for treatment and were thereby ineligible for the RCT. A further 283 controls with an IGRA sample available and no signs or symptoms of TB during follow up (median duration of follow up in the main study was 2.5 years) were randomly chosen from those recruited just before and just after the incident and prevalent cases, in the order of recruitment. All samples used in the current analyses were collected before randomisation.

Definition of TB diagnoses
Prior TB refers to a history of diagnosis and treatment for active TB in a health care facility. There is no routine LTBI diagnosis in South Africa using either TST or IGRA, thus all prior TB refers to previously treated active TB. Where incident or prevalent TB was not microbiologically confirmed, TB was diagnosed clinically from a combination of symptoms and chest X-ray.

TST and IGRA
The TST (2 TU RT23 purified-protein-derivative, PPD; Statens Serum Institut, Copenhagen, Denmark) was administered on the volar aspect of the left forearm by personnel trained in its administration. The TST induration was recorded after 48-72h by the ballpoint pen and ruler method. Phlebotomy for IGRA (QuantiFERON®-Gold-in-tube, Qiagen) was performed on the same day that the TST was administered and the blood draw preceded placement of PPD. The IGRA were performed in a Qiagen-accredited laboratory and interpreted according to the manufacturer's guidelines. Laboratory personnel were blinded to TST results, TB symptoms, signs and culture results, while clinicians were blinded to IGRA results.

Luminex multiplex assay for cytokines and chemokines
QuantiFERON®-Gold-in-tube (further referred to as QFT) plasma samples from unstimulated (Nil) and

Statistical analysis
Raw out-of-range (OOR) values were adjusted by replacing all OOR< values (lower than the lowest detectable value) with 0 (1027 instances) and all >OOR values (higher than the highest detectable value) with 10000 pg/mL (top standard, 13 instances), and by adding the mean limit of detection (LOD) per analyte to all plates.
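The adjustment described above can be sketched as follows. This is a minimal illustration and not the study's code: the function name `adjust_oor`, the encoding of out-of-range readings as the strings "OOR<" and ">OOR", and the reading of "adding the mean LOD per analyte" as adding it to every value of that analyte are all assumptions.

```python
TOP_STANDARD_PG_ML = 10000.0  # substituted for readings above the detectable range


def adjust_oor(raw_values, mean_lod):
    """Replace out-of-range flags for one analyte, then add its mean LOD.

    raw_values: readings for a single analyte; each is a number or one of
                the flag strings "OOR<" / ">OOR" (assumed encoding).
    mean_lod:   mean limit of detection for this analyte in pg/mL.
    """
    adjusted = []
    for value in raw_values:
        if value == "OOR<":
            numeric = 0.0  # below the lowest detectable value
        elif value == ">OOR":
            numeric = TOP_STANDARD_PG_ML  # above the highest detectable value
        else:
            numeric = float(value)
        # Assumption: the mean LOD is added to every value of the analyte.
        adjusted.append(numeric + mean_lod)
    return adjusted


# Hypothetical readings for one analyte with a mean LOD of 0.5 pg/mL.
example = adjust_oor(["OOR<", 12.5, ">OOR"], mean_lod=0.5)
```

Under this reading, every adjusted value is strictly positive whenever the mean LOD is positive, so a subsequent log2 transformation is defined for all samples.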
Results are presented for unstimulated (Nil) and stimulated-nil [Ag-Nil] values. Analyte values are nearly always presented on the log2-transformed scale, as log2 pg/mL. Subgroups for analysis included incident TB, prevalent TB, controls and the combined incident and prevalent TB groups, hereafter referred to as TB-combined. Frequency (percent) or median (inter-quartile range) were calculated by group for discrete and continuous values respectively. Sensitivity analyses were undertaken using the subgroup of culture-confirmed incident TB, the subgroup randomised to placebo, and the subgroup of prevalent TB who were smear negative at baseline (Smear-negative). Statistical tests to compare groups were Fisher's exact test or Wilcoxon rank sum test, as appropriate. Throughout, a nominal threshold for statistical significance was set at α = 0·05, and false discovery rate (FDR) correction by Benjamini-Hochberg 9 was applied. These values are reported as p-corrected.
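For readers unfamiliar with the Benjamini-Hochberg procedure, a minimal sketch of the standard step-up adjustment is given here. The function name `benjamini_hochberg` and the example p-values are illustrative only and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four analytes.
raw_p = [0.01, 0.04, 0.03, 0.005]
p_corrected = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in p_corrected]
```

An analyte would then be called significant when its adjusted value falls below the nominal threshold of 0·05.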
Data visualisation was used to clarify differences between and within groups. Weighted correlation network analysis was carried out on the nil and background-corrected analyte levels, stratified by TB status and presented with correlation diagrams. Correlations were estimated using Pearson's correlation of the log2-transformed data.
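The correlation step can be illustrated with a short sketch that computes Pearson's correlation between two analytes after log2 transformation. The function name `pearson_log2` and the input values are hypothetical, and the subsequent network weighting is not shown because its parameters are not given here.

```python
import math


def pearson_log2(x, y):
    """Pearson correlation of two positive-valued series after log2 transform."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least two values")
    log_x = [math.log2(v) for v in x]
    log_y = [math.log2(v) for v in y]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    var_x = sum((a - mean_x) ** 2 for a in log_x)
    var_y = sum((b - mean_y) ** 2 for b in log_y)
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical concentrations (pg/mL) of two analytes in four individuals.
analyte_a = [1.0, 2.0, 4.0, 8.0]
analyte_b = [2.0, 4.0, 8.0, 16.0]
r_same_direction = pearson_log2(analyte_a, analyte_b)
r_opposite_direction = pearson_log2(analyte_a, list(reversed(analyte_b)))
```

In a stratified analysis, this calculation would be repeated for every pair of analytes within each TB status group.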
Supervised learning models were applied to the data to predict class membership (e.g. incident vs prevalent TB) in 2-way classifications. In all cases analyte values were centred and scaled prior to input and all models were carried out with 10-fold cross-validation resampling to estimate classification accuracy. Sampling was stratified by down sampling to ensure balanced class representation in the re-samples. Training for the prediction models utilised a grid approach over model parameters with a specified grid length. Variable importance score (VI), calculated as a scaled beta coefficient, was used as the primary means of determining individual analyte impact on classification outcome. Classification learners assessed included random forests (RF) to set a performance ceiling 10 and elastic-net regularisation (glmnet) 11 for a potentially interpretable and applicable model. Models were selected on the basis of largest minimum unbiased AUC estimate. Lists of analytes and associated VI score are presented. Receiver-operator curves were generated using the predicted vs observed classifications for each independent model and graphed. Additional detailed methods are available in the online supplement.
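Two of the steps above, down-sampling to balance the classes and estimating the AUC from predicted scores, can be sketched in plain Python. This is an illustration under stated assumptions rather than the study's pipeline: the function names `downsample_balanced` and `auc`, the labels and the scores are hypothetical, and the AUC is computed here by the pairwise (Mann-Whitney) definition.

```python
import random


def downsample_balanced(labels, rng):
    """Return sorted sample indices with every class cut to the smallest class size."""
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    smallest = min(len(indices) for indices in by_class.values())
    kept = []
    for indices in by_class.values():
        # Randomly keep `smallest` members of each class, without replacement.
        kept.extend(rng.sample(indices, smallest))
    return sorted(kept)


def auc(scores, labels, positive):
    """Probability that a random positive scores above a random negative (ties count half)."""
    positives = [s for s, lab in zip(scores, labels) if lab == positive]
    negatives = [s for s, lab in zip(scores, labels) if lab != positive]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical unbalanced labels: five controls and two prevalent TB cases.
labels = ["control"] * 5 + ["prevalent"] * 2
kept = downsample_balanced(labels, random.Random(0))
kept_labels = [labels[i] for i in kept]

# Hypothetical model scores for two controls (0) and two cases (1).
example_auc = auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1], positive=1)
```

Down-sampling discards observations from the larger class, so each balanced re-sample in this example contains only four of the seven individuals.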

RESULTS
A total of 421 individuals were included in the analysis, with a median of 61 weeks (IQR: 28-91) to onset of TB in the incident group. Characteristics did not vary between the parent screening population and the analysis sample (Table S2). Between-group characteristics were similar for basic demographics and showed the expected differences for TB disease-related measures that were used as part of the screening or group definition in the RCT, with higher rates of symptoms and of QFT and/or TST positivity in the incident TB and prevalent TB groups than in the control group (Table 1). The proportion with previously treated active TB was higher in the control group compared to the TB-combined, incident TB and prevalent TB groups (48% vs 41% and 32% respectively, overall p = 0·028), and a higher proportion of the controls had previous exposure to ART. The controls also had a lower proportion of individuals with CD4 count <200 cells/µL (40% vs 61%, overall p < 0·001) compared to TB-combined. In the incident TB group, 37% (19/51) were confirmed culture positive during follow up; those not culture confirmed were diagnosed clinically (symptoms and chest X-ray) because diagnosis occurred at a site other than the study site.

Analytes from unstimulated samples
Analyte concentrations from unstimulated samples were higher in the incident and prevalent TB groups compared to controls (Table 2A, Figure 1, Figure S1). CXCL10, IL-2 and TGF-α were highest in the prevalent TB group. CCL4, IFN-α2, IFN-γ, IL-10 and TNF were highest in the incident TB group; although not markedly higher than values seen in the prevalent TB group, the trend may indicate underlying inflammatory processes leading to TB risk. EGF, IL-1α and CD40L were highest in the control group.
Comparison between control and prevalent TB showed that the largest group differences (>50% median difference, Figure 2) were in analytes CXCL10 and IL-2 (highest in prevalent), IL-1α (highest in control), and VEGF (lowest in incident). Statistical comparison between groups indicated CXCL10, IFN-γ, IL-2 and TGF-α remained statistically significant after FDR correction (Table 3A, p-corrected < 0·05) when comparing controls with prevalent TB, with the addition of IFN-α2 when comparing controls with TB-combined. No analytes remained statistically significant when comparing controls with incident TB after FDR correction, and only IL-2 remained significant when comparing prevalent TB with incident TB (Table 3A).

Analytes from stimulated samples corrected for background [Ag-Nil]
A number of analytes had higher unstimulated than stimulated concentrations, so background correction could result in negative median [Ag-Nil] differences (most notably TNF, IL-1α, IL-10, EGF, CCL3; Table 2B, Table 3B, Figure S2). In the control vs prevalent TB and control vs TB-combined comparisons, CXCL10, IFN-γ and IL-2 remained statistically significant after FDR correction. CXCL10 also remained significantly different in the control vs incident TB comparison; however, no analyte remained significant for [Ag-Nil] after FDR correction in the prevalent vs incident TB comparison.
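As a brief illustration of why background correction can yield negative medians, the sketch below subtracts unstimulated (Nil) values from antigen-stimulated values for the same individuals and takes the median. The function name `median_ag_minus_nil` and all numbers are invented for illustration; whether the subtraction in the study was done on the raw or the log2 scale is not assumed here.

```python
from statistics import median


def median_ag_minus_nil(ag_values, nil_values):
    """Median of paired background-corrected values (stimulated minus unstimulated)."""
    if len(ag_values) != len(nil_values):
        raise ValueError("ag_values and nil_values must be paired")
    differences = [ag - nil for ag, nil in zip(ag_values, nil_values)]
    return median(differences)


# Hypothetical paired readings (pg/mL) for three individuals.
nil = [40.0, 55.0, 30.0]
ag = [35.0, 60.0, 20.0]
# Paired differences are -5, 5 and -10, so the median is negative.
corrected_median = median_ag_minus_nil(ag, nil)
```

When the unstimulated concentration exceeds the stimulated one in most individuals, as in this toy example, the median corrected value falls below zero.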

Weighted correlation network analysis
Weighted correlation network analysis on the nil analyte concentrations (Figure 3) demonstrated a cluster of positively correlated analytes including IL-2, IFN-γ and IFN-α2 in controls and, to a lesser extent, in prevalent TB, which was absent in the incident TB group. The second cluster identified included CCL3, CCL4, IL-1α, IL-10 and TNF with varying strengths. While this cluster was present in all three groups, it is notable that the correlation between CCL3 and CCL4 was positive only in the incident TB group (Figure 3). The same analysis on the background-corrected analyte values led to similar identification of an IL-2 and IFN-γ correlation, but this time in association with CXCL10 instead of IFN-α2 (Figure S3). This cluster was present in all three groups.
In the incident TB group only, there was a strong positive correlation between IFN-γ and IFN-α2; this correlation was small and negative in both the control and prevalent TB groups. There were also notable negative correlations in the incident TB group, specifically between IL-10 and IFN-γ.

Predictive models
Predictive model fitting showed the tree-based ensemble models (random forests, RF) to be most accurate, although the more interpretable models (glmnet) also performed reasonably well by comparison on AUC (Figure 4, Table S5, Figure S4). Models using the unstimulated analyte values as input were better able to discriminate between control and prevalent TB (RF AUC = 0·9, glmnet AUC = 0·7) than models using the background-corrected [Ag-Nil] values. Discrimination between incident TB and prevalent TB was also reasonable (AUC > 0·8); however, the models had very poor ability to discriminate between control and incident TB. Model-predicted variable importance scores ranked IL-2 and IFN-γ at or near the top for all comparisons except that between control and incident TB (Table 4A). Models using the [Ag-Nil] values as input performed poorly by comparison, requiring a large number of analytes to reach an overall lower AUC (max AUC < 0·7 for all models) (Table 4B).

Sensitivity analyses
Additional sensitivity analyses were undertaken. (i) The culture-confirmed TB samples (n=106, from the 87 prevalent and 19 incident TB patients) were used in place of TB-combined as a comparison group (Table 1). After FDR adjustment, CXCL10, IFN-γ, IL-2 and TGF-α remained statistically significant in the control vs culture-confirmed comparison of the unstimulated values, while only CXCL10, IFN-γ and IL-2 did so using the [Ag-Nil] values (Table 3). (ii) The subset (n=181) of individuals randomised to placebo (who did not receive IPT) was used to investigate development of incident TB (Table S3). Baseline characteristics in this subgroup were comparable. Of the 334 individuals randomised, 181 (54%) were randomised to placebo; of those, 33/181 (18%) developed incident TB compared to 18/153 (12%) of those randomised to IPT (95% CI for difference: -0·01 to 0·12, p = 0·1016). No analytes remained significant after FDR correction. (iii) The subset of sputum smear-negative prevalent TB patients (n=79, Tables 1 and 3) was compared to controls. After FDR correction, IFN-γ, IL-2 and TGF-α remained statistically significant using the nil analyte values, and CXCL10 and IFN-γ when using the [Ag-Nil] values. (iv) The incident TB group was split by median time to TB diagnosis (≤61 weeks, >62 weeks; analyte values are summarised in Table S4).

DISCUSSION
We evaluated the predictive value of selected analytes as potential biomarkers for HIV-associated TB. This is the first study of this size nested within a 4-year prospective cohort, using samples collected at the screening stage of a large RCT that showed that adding IPT to ART reduced TB incidence in patients with HIV-1 infection. 8 A unique feature of our study is the ability to compare thirteen biomarkers in plasma from prevalent and incident TB patients. Novel TB biomarkers, other than IFN-γ, might significantly contribute to the prediction of prevalent TB and improve multivariate risk prediction for incident TB.
Zak et al 13  Recent work has demonstrated that QuantiFERON supernatants can be useful in identifying combination biomarkers to diagnose pulmonary TB. 15,16 Our findings are in keeping with those described by Chegou et al, although with some differences, as their patient population was HIV-uninfected. 15 Significantly different analyte concentrations between groups, in the unstimulated samples, were seen for CXCL10, IFN-γ, IL-2 and TGF-α when comparing controls with prevalent TB, with the addition of IFN-α2 when comparing controls with TB-combined. The role of CXCL10, with or without IFN-γ, in both adult and paediatric populations has been highlighted previously. 17,18 In our analysis of background-corrected antigen-stimulated samples, CXCL10 also remained statistically significant after correction for multiple comparisons, both between controls and prevalent TB and between controls and incident TB. The finding that unstimulated IL-2 concentrations differentiated prevalent from incident TB after correction for multiple comparisons suggests that IL-2 may be an important marker of TB progression and warrants further investigation, although the median concentrations are low. Our predictive model analysis also ranked unstimulated IL-2 (together with IFN-γ) highly for all comparisons except for control versus incident TB. There is a dynamic relationship between IFN-γ and IL-2 secreting antigen-specific T cells in patients during and after treatment for TB, which is likely driven by antigen load. 22 The better ability of unstimulated plasma analyte concentrations to identify TB risk in these HIV-1 infected patients suggests that underlying inflammatory processes and higher overall background activation might render them more susceptible to progression to TB.
While our study has strengths, in particular its power, it also has limitations, including the limited number of preselected analytes measured, the lack of an external validation cohort, and the sampling, which relied on sample availability and may therefore have introduced bias (e.g. the lack of QFT-indeterminate individuals in the analysis set).
The randomisation to IPT or placebo also affected the outcome of our investigations, since IPT reduced incident TB and thereby reduced the ability of biomarkers to predict it in this cohort. Additionally, on-going TB exposure post-enrolment could have contributed to the inability of these markers to predict incident TB, as baseline plasma biomarkers may have been measured prior to the TB exposure that led to active disease. Additional weaknesses include poor predictive performance in the analysis and unbalanced classes, since down-sampling reduces effective sample size. Classification performance may also be impacted by unmeasured confounding and effect modification, potentially related to the unique setting in South Africa. However, with these limitations in mind, our finding that a combination of plasma biomarkers may be used to detect prevalent TB in HIV-1-infected individuals is potentially important. It is also interesting that unstimulated analytes performed best, potentially avoiding the need to culture cells or whole blood in vitro. Our findings could pave the way towards larger prospective studies evaluating soluble plasma biomarkers that might effectively predict incident TB in HIV-infected patients.

Contributors
MXR, RJW and KAW were involved in the conception and design. MXR, GM, RG, SM, JMK, RJW, KAW were involved in the study implementation. ML, CP did the analysis, with early input from AKC, KAW and MXR. ML, KAW, RJW interpreted the data, provided important intellectual input and wrote the first draft.
All authors read and contributed to the final manuscript.

Figure caption: Standardised differences in unstimulated group medians (log2 pg/mL) between incident TB (grey) or prevalent TB (black) and control for each analyte. Negative differences mean that incident TB or prevalent TB group medians were lower than control group medians.

Figure caption: Receiver operator curves for the primary comparison using nil data. Each curve represents an independent cross-validated predictive model with different tuning parameters. The left panel shows curves estimated using glmnet penalised regression and the right panel shows curves estimated using random forests; each row corresponds to a specific comparison.