Genome-Wide Association Study Identifies Novel Colony Stimulating Factor 1 Locus Conferring Susceptibility to Cryptococcosis in Human Immunodeficiency Virus-Infected South Africans

Abstract
Background. Cryptococcus is the most common cause of meningitis in human immunodeficiency virus (HIV)-infected Africans. Despite universal exposure, only 5%-10% of patients with HIV/acquired immune deficiency syndrome and profound CD4+ T-cell depletion develop disseminated cryptococcosis: host genetic factors may play a role. Prior targeted immunogenetic studies in cryptococcosis have comprised few Africans.
Methods. We analyzed genome-wide single-nucleotide polymorphism (SNP) genotype data from 524 patients of African descent: 243 cases (advanced HIV with cryptococcal antigenemia and/or cryptococcal meningitis) and 281 controls (advanced HIV, no history of cryptococcosis, negative serum cryptococcal antigen).
Results. Six loci upstream of the colony-stimulating factor 1 (CSF1) gene, encoding macrophage colony-stimulating factor (M-CSF), were associated with susceptibility to cryptococcosis at P < 10⁻⁶ and remained significantly associated in a second South African cohort (83 cases; 128 controls). Meta-analysis of the genotyped CSF1 SNP rs1999713 showed an odds ratio for cryptococcosis susceptibility of 0.53 (95% confidence interval, 0.42-0.66; P = 5.96 × 10⁻⁸). Ex vivo functional validation and transcriptomic studies confirmed the importance of macrophage activation by M-CSF in host defence against Cryptococcus in HIV-infected patients and healthy, ethnically matched controls.
Conclusions. This first genome-wide association study of susceptibility to cryptococcosis has identified novel and immunologically relevant susceptibility loci, which may help define novel strategies for prevention or immunotherapy of HIV-associated cryptococcal meningitis.

cryptococcal antigenemia (CRAG), representing early dissemination from the lungs, is approximately 6% in this population [1]. After treatment of both cryptococcosis and underlying HIV, and despite comparable CD4 counts, CRAG-positive individuals have a 12-month mortality rate approximately 3 times greater than that of CRAG-negative controls [5], suggesting that additional host immune factors, beyond those reflected by the CD4 count, may contribute to cryptococcosis susceptibility.
Prior immunogenetic studies performed in CM have studied candidate genes in small populations (n = 100-150) comprising few African individuals [3,7-9]. In the only CM genetic susceptibility study in HIV-positive patients, targeted sequencing of the Fc-γ receptor in a cohort of 164 predominantly Caucasian men (55 HIV-positive with CM; 54 HIV-positive and 55 HIV-negative controls without CM) demonstrated that individuals homozygous for the Fc-γR3A 158V polymorphism had 20-fold increased odds of developing CM [9]. Despite sub-Saharan Africa having a high infectious disease burden, few genome-wide association studies (GWAS) of infectious disease susceptibility have been conducted in people of African descent: published studies include tuberculosis [10] and malaria [11,12]. Specific challenges to GWAS in African populations include higher genetic diversity, low linkage disequilibrium, and more complex genetic structure [13], although, in the long term, these aspects can be exploited for fine mapping of association signals.
In this study, we report on the first GWAS of genetic susceptibility to cryptococcosis in an HIV-infected population, using deoxyribonucleic acid from a discovery cohort of 524 cases and controls of African descent recruited in Cape Town 2005-2014 and a validation cohort of 211 recruited in Johannesburg 2015-2017.

Discovery and Validation Cohort
For the discovery cohort, 243 cases were recruited as part of 4 clinical trials (1 observational, 3 randomized) of HIV-associated CM and a CRAG study in ART-naive adults conducted in Cape Town, South Africa, 2005-2014 [14-18]. Cases had disseminated cryptococcal infection and/or CM as confirmed by positive serum and/or CSF cryptococcal antigen and/or CSF culture. Two hundred eighty-one controls were recruited contemporaneously at the same hospital and referring clinic as the cases and had no history of cryptococcal disease and a negative serum cryptococcal antigen. All cases and controls were HIV-positive adults (age ≥18) with nadir CD4 cell count <100/μL who were ART-naive or within 3 months of starting ART. The validation cohort included 63 cases and 128 controls with CD4 cell count <100/μL recruited as part of a cryptococcal antigen screening study in ART-naive HIV-infected adults in 2015-2017 [19] (Table 1). Twenty cases from a clinical trial of HIV-CM in KwaZulu-Natal were also included in this cohort [16].

Cryptococcus-Specific Transcriptome and Functional Characterization Cohort
Ribonucleic acid sequencing (RNA-seq) was performed on peripheral blood mononuclear cells (PBMCs) from healthy volunteers of self-identified Xhosa ethnicity recruited in Cape Town. The functional characterization cohort included 5 HIV-infected patients of diverse ethnicities recruited at St George's Hospital, London, with CD4 count <200 cells/μL and not on ART within ≤12 months. Healthy donor PBMCs used were obtained from leukocyte cones. Further details of experimental methods and computational analyses are provided in the Supplementary Methods.

Patient Consent Statement
The studies were approved by ethics committees at the University of Cape Town, the University of the Witwatersrand, and the London School of Hygiene and Tropical Medicine. All participants gave written informed consent.

Genotyping and Association Analyses
Five hundred twenty-four cases and controls from the discovery cohort were genotyped using the Illumina HumanOmniExpressExome-8 v1.0 single-nucleotide polymorphism (SNP) chip, an exome-based array with >700 000 genome-wide markers and >240 000 exonic markers. Two hundred eleven samples from the validation cohort were genotyped on the Illumina GSA beadchip GSA MD v1. Samples with a low call rate (≤99%) and variants with a Hardy-Weinberg equilibrium P value ≤0.00001, call rate <0.99, missingness test (GENO > 0.01), or minor allele frequency (MAF) <0.001 were excluded from further analyses. Eleven genetically divergent samples were excluded from the discovery cohort and 6 from the validation cohort. A total of 245 091 variants from 513 discovery samples passed quality control and were analyzed. Variants were aligned to the 1000 Genomes reference, and the data were imputed using the Michigan Imputation Server. Postimputation quality controls were used to remove low-quality (r² ≤ 0.8) imputed variants before further analyses. Association with genetic susceptibility to disseminated cryptococcosis was tested using logistic regression. The P value distribution was assessed using a quantile-quantile (Q-Q) plot, which showed no inflation of the association statistics. Discovery and validation cohort-imputed datasets were subsequently merged, and a combined cohort association analysis was performed on 2 686 126 variants, with the significance threshold set at P < 5 × 10⁻⁶. The impact of top SNPs on gene expression was explored using eQTL information from the HaploReg and Genotype-Tissue Expression (GTEx) databases (see Supplementary Methods). Information on SNP association with annotated genes and variants within 500 kb of each SNP was collated. Genes associated with SNPs with P < 5 × 10⁻³ were included in pathway enrichment and gene ontology analyses.
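The variant-level exclusion thresholds described above can be illustrated with a minimal Python sketch. This is not the authors' pipeline (their tooling is documented in the Supplementary Methods and the linked repository): the `Variant` record, the `passes_qc` helper, and the genotype counts are hypothetical, and a simple 1-degree-of-freedom chi-square test stands in for whichever Hardy-Weinberg test was actually applied.

```python
import math
from dataclasses import dataclass


@dataclass
class Variant:
    """Genotype counts for one biallelic variant (hypothetical record)."""
    rsid: str
    n_aa: int       # samples homozygous for allele A
    n_ab: int       # heterozygous samples
    n_bb: int       # samples homozygous for allele B
    n_missing: int  # samples with no genotype call


def hwe_p_value(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-square (1 df) test of Hardy-Weinberg proportions (assumed test)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic variant: nothing to test
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_qc(v: Variant) -> bool:
    """Apply the thresholds quoted in the text: call rate, MAF, HWE."""
    called = v.n_aa + v.n_ab + v.n_bb
    if called == 0:
        return False
    call_rate = called / (called + v.n_missing)
    freq_a = (2 * v.n_aa + v.n_ab) / (2 * called)
    maf = min(freq_a, 1.0 - freq_a)
    return (
        call_rate >= 0.99                                  # exclude call rate <0.99
        and maf >= 0.001                                   # exclude MAF <0.001
        and hwe_p_value(v.n_aa, v.n_ab, v.n_bb) > 1e-5     # exclude HWE P <=0.00001
    )


# Toy variants (made-up counts, for illustration only).
toy = [
    Variant("snp_ok", 250, 200, 50, 2),
    Variant("snp_hwe_fail", 300, 0, 200, 0),
    Variant("snp_low_call", 250, 200, 50, 10),
    Variant("snp_rare", 999, 1, 0, 0),
]
kept = [v.rsid for v in toy if passes_qc(v)]
print(kept)
```

In practice these filters are usually applied with dedicated genetics software rather than hand-written code; the sketch only makes the direction of each threshold explicit.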
At the CSF1 locus, SNP rs1999713 was hard-called on both genotyping platforms for both cohorts, so we performed a meta-analysis of the discovery and validation cohorts to negate any uncertainty from imputation. We used an allelic fixed-effects model because the effect size and direction were very similar in the discovery and replication cohorts.
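As a hedged illustration of what a fixed-effects allelic meta-analysis of two cohorts involves, the sketch below pools per-cohort log odds ratios by inverse-variance weighting and reports Cochran's Q and I². The allele counts are invented for the example (the per-cohort counts for rs1999713 are not given in the text), and the function names are hypothetical; the authors' actual software may differ.

```python
import math


def allelic_log_or(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Log odds ratio and its standard error from a 2x2 allele-count table."""
    log_or = math.log((case_alt * ctrl_ref) / (case_ref * ctrl_alt))
    se = math.sqrt(1 / case_alt + 1 / case_ref + 1 / ctrl_alt + 1 / ctrl_ref)
    return log_or, se


def fixed_effects(estimates):
    """Inverse-variance fixed-effects pooling of (log_or, se) pairs."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal P value
    # Cochran's Q and the I^2 heterogeneity statistic.
    q = sum(w * (b - pooled) ** 2 for w, (b, _) in zip(weights, estimates))
    df = len(estimates) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return {
        "or": math.exp(pooled),
        "ci_low": math.exp(pooled - 1.959964 * pooled_se),
        "ci_high": math.exp(pooled + 1.959964 * pooled_se),
        "p": p_value,
        "i2": i2,
    }


# Made-up allele counts (alt/ref in cases, alt/ref in controls) per cohort.
discovery = allelic_log_or(110, 370, 190, 356)
validation = allelic_log_or(36, 122, 88, 164)
result = fixed_effects([discovery, validation])
print(result)
```

A fixed-effects model assumes both cohorts estimate the same underlying effect, which is why similar effect sizes and directions in the two cohorts matter for this choice.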

Macrophage Colony-Stimulating Factor Functional Characterization Experiments
The PBMCs from HIV-infected patients (n = 5) and healthy volunteers were pretreated with macrophage-CSF (M-CSF) or anti-M-CSF antibody and cocultured with C neoformans H99 (serotype A reference strain) for 24 hours. Cells were lysed, plated onto fresh SAB agar for 48 hours, and colony-forming units were counted. For the phagocytosis assays, PBMCs were pretreated as described above and then challenged with prelabeled heat-killed C neoformans for 24 hours at 37°C. Cells were then captured on a flow cytometer, and the percentage of cells with internalized Cryptococcus was determined.

RNA Sequencing and Analyses
The PBMCs were stimulated with heat-killed C neoformans (multiplicity of infection = 0.1) for 24 hours. Ribonucleic acid was extracted, and a sequencing library was prepared and sequenced as described in Supplementary Methods. After quality-control measures, reads were mapped to the human reference genome (hg19). Reads were annotated and differentially expressed genes between controls and Cn-treated samples were identified. Genes with significant differential expression were used in gene ontology and pathway analyses.

Availability of Data and Materials
The human SNP array summary datasets and raw RNA-seq data supporting the conclusions of this article are available on figshare at https://figshare.com/s/b953f3192c77cef0be98. The software and analysis steps are detailed at https://github.com/alanmichaelpittman100/Crypto-GWAS.

Genome-Wide Association Analysis
We performed a GWAS of Cryptococcus susceptibility in a discovery cohort of 524 age-, gender-, and CD4 count-matched South African HIV-infected patients: cases with disseminated cryptococcosis (defined as positive serum CRAG and/or CM, n = 243) and controls (n = 281) with no cryptococcosis. The validation cohort comprised 83 cases and 128 controls of African descent (Table 1). After imputation and quality-control measures (Supplementary Figure 1a), ~9.2 million variants from 240 cases and 273 controls (discovery) and 79 cases and 126 controls (validation) were analyzed using regression analysis.
In the discovery cohort, we identified multiple loci associated with susceptibility to cryptococcosis (Figure 1a). Although no individual SNP passed the genome-wide significance threshold of P < 5 × 10⁻⁸, we identified 49 SNPs with P < 10⁻⁵ associated with cryptococcosis (Table 2). Six of the top susceptibility SNPs (P < 7.54 × 10⁻⁶; odds ratio [OR] = 0.49-0.53) were located within 2.5 kb upstream of the CSF1 gene encoding M-CSF (Figure 1b), a cytokine promoting macrophage activation and phagocytosis. The top associated SNP rs1999714 (OR = 0.49; P = 8.39 × 10⁻⁷) was located in a block of linkage disequilibrium (LD) of ~2.5 kb (defined by significant LD, r² > 0.5, of surrounding SNPs with rs1999714) close to the CSF1 gene (Figure 1b). Another top variant, rs12124202 (OR = 0.53; P = 7.54 × 10⁻⁶), was in the gene enhancer region (position GRCh38.p12 chr1: 109 905 601-109 906 901, GeneHancer ID GH01J109905), and other SNPs (including rs1999714) were all close to the CSF1 regulatory region. However, exploring the impact of these candidate SNPs on gene regulation using a number of databases (Supplementary Methods) revealed no expression quantitative trait loci for any of the CSF1 SNPs, including the SNP in the enhancer region of CSF1. Other susceptibility SNPs of potential relevance to Cryptococcus-macrophage interactions included rs6768912 (OR = 1.8; P = 7.56 × 10⁻⁶) in the intronic region of NCEH1 (neutral cholesterol ester hydrolase) and rs7213159 (OR = 1.9; P = 9.79 × 10⁻⁶), a noncoding transcript variant of CSNK1D (casein kinase I). NCEH1 encodes neutral cholesterol ester hydrolase, an enzyme removing cholesterol, which plays a pivotal role in antiviral responses (including to HIV) in macrophages [20]. Gene silencing of the CSNK1D gene has been shown to significantly reduce intracellular mycobacterial load in murine macrophages [21] (Table 2).
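The LD block above is defined by an r² threshold of 0.5. As a reminder of what that statistic measures, here is a minimal sketch that computes r² between two biallelic SNPs from phased haplotype counts. The function name and counts are hypothetical, and real analyses typically estimate haplotype frequencies from unphased genotypes with dedicated software, which this example does not attempt.

```python
def ld_r2(n_AB: int, n_Ab: int, n_aB: int, n_ab: int) -> float:
    """r^2 between two biallelic SNPs from phased haplotype counts.

    D = p(AB) - p(A) * p(B);  r^2 = D^2 / (p(A) p(a) p(B) p(b)).
    """
    n = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n
    p_B = (n_AB + n_aB) / n
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    d = p_AB - p_A * p_B
    return d * d / denom


# Toy haplotype counts (made up): strong, perfect, and no LD.
strong = ld_r2(45, 5, 5, 45)
perfect = ld_r2(50, 0, 0, 50)
none = ld_r2(25, 25, 25, 25)
in_block = strong > 0.5  # the threshold quoted in the text
print(strong, perfect, none, in_block)
```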
To validate findings from our discovery cohort, we performed GWAS in a separate South African cohort of 79 cases and 126 controls. The CSF1 SNPs were independently significant in this smaller cohort (OR = 0.52-0.63; P < .05) (Table 3). In the combined cohort of 319 cases and 399 controls, all 6 CSF1 SNPs remained significantly associated with cryptococcosis susceptibility (Table 3, Figure 1c and d, Supplementary Figure 2). A meta-analysis of the (nonimputed) genotyped CSF1 SNP rs1999713 (present in both discovery and validation cohorts) using a fixed-effects allele model generated an OR of 0.53 (95% confidence interval [CI], 0.42-0.66; P = 5.96 × 10⁻⁸; heterogeneity, I² = 0%, P = .8539) in the combined cohort (Figure 2).
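The relationship between the reported odds ratio, its confidence interval, and the P value can be checked with a short calculation. The sketch below is a back-of-the-envelope consistency check, assuming a Wald-type interval that is symmetric on the log scale; because the published OR and CI are rounded to 2 decimal places, it should land near, but not exactly on, the reported P value. The function name is hypothetical.

```python
import math


def wald_p_from_or_ci(odds_ratio: float, ci_low: float, ci_high: float,
                      z_crit: float = 1.959964):
    """Recover SE, z, and a two-sided P value from an OR and its 95% CI.

    Assumes the interval is exp(log(OR) +/- z_crit * SE).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p_value


# Values reported for rs1999713 in the meta-analysis.
se, z, p = wald_p_from_or_ci(0.53, 0.42, 0.66)
print(f"SE(log OR) = {se:.3f}, z = {z:.2f}, P = {p:.2e}")
```

The recovered P value falls in the 10⁻⁸ range, consistent in magnitude with the reported estimate given the rounding of the inputs.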

Transcriptomics in Healthy Peripheral Blood Mononuclear Cells and Overlap With Genome-Wide Association Study Findings
Using PBMCs from 6 healthy donors of self-identified Xhosa ethnicity, we performed RNA-seq after stimulation with heat-killed C neoformans for 24 hours. Compared with unstimulated PBMCs, 653 genes were significantly up- or down-regulated (fold change >2; adjusted P value <0.05) (Supplementary Table 1). Comparing genes differentially expressed in the RNA-seq experiment with genes associated with significant SNPs (P < 1 × 10⁻³) in the GWAS, we found 38 common genes (Table 4).
Gene ontology analysis of differentially expressed genes in healthy controls identified enrichment of cytokine activity, phagocytosis, complement, and T-cell proliferation (Supplementary Table 2). Pathway analysis of these genes identified enrichment of cytokine-cytokine receptor interaction, complement and coagulation cascades, and Toll-like receptor signaling pathways (Supplementary Table 2). These findings lend further support to the importance of genes involving macrophage activation, differentiation, and phagocytosis, including CSF1, to cryptococcal immune responses in the South African population.
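The selection of differentially expressed genes and their overlap with GWAS-associated genes can be sketched as follows. This is an illustrative outline only: it assumes a Benjamini-Hochberg adjustment (the text does not state which multiple-testing correction was used), the gene-level values are toy numbers rather than study data, and all names other than CSF1 are placeholders.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def differentially_expressed(table, min_abs_log2fc=1.0, max_padj=0.05):
    """Genes with fold change >2 in either direction and adjusted P <0.05."""
    genes = list(table)
    padj = benjamini_hochberg([table[g][1] for g in genes])
    return {
        g for g, q in zip(genes, padj)
        if abs(table[g][0]) > min_abs_log2fc and q < max_padj
    }


# Toy RNA-seq results: gene -> (log2 fold change, raw P value). Made-up values.
rnaseq = {
    "CSF1": (2.1, 0.0001),
    "GENE_A": (1.5, 0.01),
    "GENE_B": (0.4, 0.0005),
    "GENE_C": (-1.8, 0.2),
}
# Toy set of genes near GWAS-associated SNPs. Placeholder names.
gwas_genes = {"CSF1", "GENE_B", "GENE_D"}

de_genes = differentially_expressed(rnaseq)
overlap = de_genes & gwas_genes
print(sorted(de_genes), sorted(overlap))
```

A fold change >2 corresponds to an absolute log2 fold change >1, which is the form most differential expression tools report.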

Functional Characterization in Peripheral Blood Mononuclear Cells From Patients With Advanced Human Immunodeficiency Virus
To further examine the importance of M-CSF in cryptococcal phagocytosis and killing, we performed ex vivo experiments using PBMCs of 5 HIV-infected patients (ART-naive, CD4 count <200 cells/μL). Exogenous M-CSF significantly improved cryptococcal phagocytosis and killing by HIV-infected PBMCs (Figure 3). When M-CSF receptors were blocked with specific antibodies, phagocytosis and fungal killing were similar to those of unstimulated PBMCs, suggesting either incomplete receptor blockade or absence of endogenous M-CSF production in patients (Figure 3).

DISCUSSION
Despite bearing the largest infectious disease burden, African individuals are underrepresented in studies of disease susceptibility [22]. Globally, fungal infections pose a major threat to human health as a result of the expansion of immunosuppressive interventions and the ongoing HIV epidemic [23]. Due to the challenges in recruiting large enough cohorts, the first GWAS in an invasive fungal infection (candidemia) was not published until 2014 [24]. The present study is the first to be conducted for cryptococcosis, taking 12 years (2005-2017) to enroll a total of 735 patients.
Unlike prior targeted sequencing approaches, we took an unbiased, hypothesis-generating approach as used previously for candidemia [24,25], combining GWAS in a clearly defined case-control cohort, backed up by validation in a second cohort, transcriptomics in ethnically matched healthy controls, and functional studies. Although no individual locus reached genome-wide significance, meta-analysis of the nonimputed genotyped CSF1 SNP rs1999713 demonstrated P < 10⁻⁷ (OR = 0.53; 95% CI, 0.42-0.66; P = 5.96 × 10⁻⁸), and the SNP was independently significant in both our discovery and validation cohorts. It is worth noting that this result was obtained in an African population in which GWAS power was limited by extensive genetic diversity and low linkage disequilibrium [13]. Although no SNPs identified lay within coding regions, we identified immunologically plausible upstream genetic variants with potential regulatory roles, notably 5 SNPs in the regulatory region and 1 SNP in the enhancer region of the CSF1 gene encoding M-CSF. Macrophage-CSF induces survival, proliferation, chemotaxis, differentiation, and activation of monocytes/macrophages, including microglia [26,27]. All 6 SNPs were confirmed in the validation cohort, remaining significantly associated with risk of cryptococcosis in the combined cohort. Although we did not have CSF1 genotype data for the healthy controls to link with gene expression, CSF1 was also one of the most highly up-regulated genes upon cryptococcal stimulation of PBMCs from healthy, ethnically matched volunteers, and experiments confirmed the importance of M-CSF in uptake and killing of Cryptococcus by PBMCs from HIV-infected patients.
Table 4 footnotes. Abbreviations: CSF1, colony-stimulating factor 1; GWAS, genome-wide association study; padj, adjusted P value; RNA-seq, ribonucleic acid sequencing. a The top 9 genes, including CSF1, were significantly up-regulated in response to cryptococcal stimulation of peripheral blood mononuclear cells from healthy Xhosa volunteers.
Exogenous M-CSF enhances the anticryptococcal activity of human monocyte-derived macrophages; in a murine model, it enhanced cryptococcal killing and was synergistic with fluconazole [28-30]. Macrophage-CSF is one of the principal regulators of macrophage function [27,31], acting as a potent proliferation signal, increasing blood and tissue macrophage numbers [31-33]. Macrophage-CSF-primed macrophages are typically more phagocytic and less competent at antigen presentation, primed to M2 stimuli [32]; however, M-CSF does not induce a full M2 phenotype, with M-CSF-primed macrophages able to respond to a variety of proinflammatory stimuli including IFN-γ and Toll-like receptor activation [31,32,34,35]. Macrophage-CSF acts synergistically with IFN-γ to drive proinflammatory chemokine production including CCL2 (MCP-1) [31], and it is expressed in a subset of T cells that also express Th1 markers [36]. T-cell-derived M-CSF has been shown to play a crucial role in the control of bloodborne intracellular pathogens [36], and blocking M-CSF increases susceptibility to intracellular infections with Listeria and Mycobacterium tuberculosis [37,38]. The exact role of M-CSF in protective anticryptococcal immune responses in the context of HIV coinfection is unclear. However, extensive data demonstrating the importance of effective alveolar macrophage responses in controlling early cryptococcal infection [6], and the key role of circulating and tissue macrophage/microglial responses during later disseminated disease [39,40], provide a plausible basis for why variations in CSF1 gene expression might impact susceptibility to cryptococcal disease. Of interest, the genotyped CSF1 SNP rs1999713 is common in different populations, with sampled African populations having the lowest MAF at 0.31 (comparable to 0.34 found in our control group) and East Asian populations having the highest MAF at 0.68 (https://gnomad.broadinstitute.org/).
Searching for inherited immune defects in anticryptococcal responses in the context of profound acquired CD4 T-cell depletion might seem paradoxical; yet, given that only a minority of patients with HIV/AIDS develop disseminated cryptococcosis despite presumed ubiquitous exposure, such an approach has the potential to highlight the contribution of other factors, including the central role of macrophage phagocytosis and killing [41]. Macrophages are also infected by HIV, act as its tissue reservoir [42,43], and are involved in trafficking both pathogens to the central nervous system (CNS). We postulate that, in the setting of HIV-cryptococcal coinfection, genotypes rendering macrophages more permissive to uptake and survival of intracellular pathogens are likely to confer susceptibility to disseminated cryptococcosis, either through direct effects on cryptococcal intracellular burden or indirectly through an impact on HIV burden [44]. FcγR polymorphisms identified in prior targeted sequencing studies [8,9] could exert an impact through increased phagocyte cargo (via increased binding and uptake of C neoformans-immune complexes), shown to be associated with CSF fungal burden in HIV-CM [41], and/or through increased immune activation via antibody-dependent cellular cytotoxicity, leading to disruption of the blood-brain barrier or CNS tissue injury [9]. Both M-CSF and the M-CSF receptor have been proposed as targets in the treatment of HIV neurodegenerative disease [45,46], and M-CSF treatments for invasive fungal infections have been investigated in animal models [47,48] and early-stage clinical trials [49].
Our study had several limitations. The relatively small sample size limited our statistical power, and genotype arrays differed for the 2 cohorts. The discovery cohort was genotyped on a chip biased towards European populations, whereas the validation cohort was typed using the newly available global screening array ([GSA] containing multiethnic genome-wide content), making imputation crucial for analysis of the combined cohort. Better designed genotyping chips representing African genetic diversity (such as the GSA and newer arrays under development) will mean less reliance on imputation methods to fill in the gaps in the African genomes. We lacked genotype data on the healthy volunteers that would have allowed us to examine effects of CSF1 genotype on cytokine expression upon cryptococcal stimulation. Furthermore, there was a paucity of eQTL data from African populations on the impact of the upstream variants identified on CSF1 gene expression and M-CSF production: this could be explored in future studies using PBMCs of genotyped individuals. Beyond host genotype, other unaccounted-for factors, such as those associated with environmental cryptococcal exposure, or concurrent opportunistic infections, may have an impact on cryptococcosis susceptibility.
In any GWAS of infectious disease susceptibility, pathogen variation is an additional and usually unaccounted-for element [13]. The completion of large, multisite, African phase III trials in HIV-associated CM provides the opportunity to undertake a larger pan-African GWAS of disease severity and treatment response, developing bioinformatic approaches to integrate host and pathogen genomics with host CSF immune profiling and pathogen virulence phenotyping to determine host and pathogen factors underlying poor clinical outcome [2,50].

CONCLUSIONS
In summary, we have identified and replicated a novel cryptococcosis susceptibility factor in HIV-infected Africans, the importance of which was further confirmed through ex vivo functional immune studies in patients with advanced HIV as well as healthy, ethnically matched controls. Our findings demonstrate that small but well defined GWAS can identify novel and immunologically relevant susceptibility loci for an important cause of mortality in an African population, provided they are replicated and complemented by functional approaches. Identifying a high-risk genotype helps elucidate disease mechanism and has the potential to identify novel strategies for targeted prevention and host-directed immunotherapy.

Supplementary Data
Supplementary materials are available at Open Forum Infectious Diseases online. Consisting of data provided by the authors to benefit the reader, the posted materials are not copyedited and are the sole responsibility of the authors, so questions or comments should be addressed to the corresponding author.
Potential conflicts of interest. T. B. has received speaking fees from Gilead Sciences and Pfizer and research funding from Gilead Sciences unrelated to the submitted work. All authors have submitted the ICMJE Form for Disclosure of Potential Conflicts of Interest. Conflicts that the editors consider relevant to the content of the manuscript have been disclosed.