Genome-wide screen identifies host loci that modulate Mycobacterium tuberculosis fitness in immunodivergent mice

Abstract Genetic differences among mammalian hosts and among strains of Mycobacterium tuberculosis (Mtb) are well-established determinants of tuberculosis (TB) patient outcomes. The advent of recombinant inbred mouse panels and next-generation transposon mutagenesis and sequencing approaches has enabled dissection of complex host–pathogen interactions. To identify host and pathogen genetic determinants of Mtb pathogenesis, we infected members of the highly diverse BXD family of strains with a comprehensive library of Mtb transposon mutants (TnSeq). Members of the BXD family segregate for Mtb-resistant C57BL/6J (B6 or B) and Mtb-susceptible DBA/2J (D2 or D) haplotypes. The survival of each bacterial mutant was quantified within each BXD host, and we identified those bacterial genes that were differentially required for Mtb fitness across BXD genotypes. Mutants that varied in survival among the host family of strains were leveraged as reporters of “endophenotypes,” each bacterial fitness profile directly probing specific components of the infection microenvironment. We conducted quantitative trait loci (QTL) mapping of these bacterial fitness endophenotypes and identified 140 host–pathogen QTL (hpQTL). We located a QTL hotspot on chromosome 6 (75.97–88.58 Mb) associated with the genetic requirement of multiple Mtb genes: Rv0127 (mak), Rv0359 (rip2), Rv0955 (perM), and Rv3849 (espR). Together, this screen reinforces the utility of bacterial mutant libraries as precise reporters of the host immunological microenvironment during infection and highlights specific host–pathogen genetic interactions for further investigation. To enable downstream follow-up for both bacterial and mammalian genetic research communities, all bacterial fitness profiles have been deposited into GeneNetwork.org and added into the comprehensive collection of TnSeq libraries in MtbTnDB.


Introduction
Every pathogenic infection is part of an evolutionary battle between host and invader.In the case of Mycobacterium tuberculosis (Mtb), the causative agent of tuberculosis (TB), evidence tracing back at least 9,000 years tells the tale of a host-pathogen arms race (Hershkovitz et al. 2008), making Mtb one of the most enduring adversaries of the human species.Given the prevailing nature of this challenge, it is no surprise that weaknesses in vital host defenses provide greater opportunity for Mtb infection, allow Mtb to manipulate the nature and magnitude of the host immune response, and worsen patient prognoses (Cooper et al. 1993;Flynn et al. 1993;Cooper et al. 1997;Caruso et al. 1999;Kramnik et al. 2000).In addition, the advent of drug-resistant and multidrug-resistant Mtb strains highlights the imminent danger posed by our microscopic rivals and underscores the importance of bacterial variation as a critical factor in the pathogenesis and continued spread of Mtb (Caws et al. 2008;Hernández-Pando et al. 2012;Gopal et al. 2014).Meanwhile, genetic diversity of both hosts and Mtb strains that interact during infection can give rise to a spectrum of disease outcomes (Smith and Sassetti 2018), enhancing the complexity of TB identification and treatment.To combat a pathogen responsible for nearly 10.6 million infections and 1.6 million deaths annually (WHO 2022), it is therefore vital to dissect the host-pathogen interface.
Mammalian models of TB have been used successfully to identify and mechanistically interrogate TB susceptibility loci identified from human cohorts.For mouse strains that are phenotypically divergent, quantitative trait locus (QTL) mapping studies within genetically tractable and reproducible hosts have enabled the identification of genomic loci that contribute to the magnitude https://doi.org/10.1093/g3journal/jkad147Advance Access Publication Date: 5 July 2023

Mutant Screen Report
of clinically relevant phenotypes (Lavebratt et al. 1999;Kramnik et al. 2000;Mitsos et al. 2000;Mitsos et al. 2003;Yan et al. 2006).The classic inbred strains C57BL/6J (B6) and DBA/2J (D2) lie on opposite ends of the TB susceptibility spectrum, with B6 surviving nearly a year postinfection and D2 succumbing to cachexia and morbidity within months (Medina and North 1998).To capitalize on this diverse disease outcome and just over 6 million divergent SNPs (Wang et al. 2016;Ashbrook et al. 2021;Ashbrook et al. 2022;Sasani et al. 2022), a recombinant inbred panel generated from these 2 founders, known as the BXD, has laid the foundation for understanding host determinants of susceptibility to a variety of diseases through QTL mapping in genetically diverse mice (Taylor et al. 1973;Taylor et al. 1999;Peirce et al. 2004;Ashbrook et al. 2021).
BXD and other recombinant inbred panels have been used to explore the intricate dynamics of TB disease, but often the phenotypes leveraged to understand these dynamics in mice, such as survival time postinfection, body weight, or bacterial burden, are too complex and polygenic to reveal novel biology or enable refined mapping.These "macrophenotypes" materialize from many genetic and environmental factors that can be difficult to mechanistically dissect or precisely map.Within each unique host, bacteria face similarly diverse immunological dynamics and spatiotemporal selective pressures.To contribute to the growing body of work on genetic determinants of TB outcome, quantifying precise bacterial "endophenotypes" underlying disease outcomes is the crucial next step toward elaborating the biological intricacies that govern the host-pathogen interface.Recently employed molecular endophenotypes include expression QTL (Chesler et al. 2005), metabolite QTL (Wu et al. 2014), protein QTL (Chick et al. 2016), and chromatin accessibility QTL (Skelly et al. 2020), but for infectious disease, the most salient endophenotypes lie at the interface of the host and pathogen.
A classic approach to assess the requirement of a bacterial gene for infection is to infect a standard inbred mouse strain, such as B6, with a single knockout Mtb mutant.Expanding on this design, genome-wide mutagenesis enables the generation of single knockout mutants across the Mtb genome, thereby allowing the measurement of Mtb gene requirements under a variety of selective pressures.Transposon (Tn) mutant libraries (TnSeq) have been successfully used to report on variable components of the bacterial infection microenvironment under distinct host and antibiotic pressures (Sassetti and Rubin 2003;Nambi et al. 2015;Mishra et al. 2017;Bellerose et al. 2019).We have recently leveraged TnSeq across genetically diverse Collaborative Cross (CC) mice to map both host-and pathogen-linked loci (Smith et al. 2022).Deep phenotyping of disease traits in over 50 CC strains enabled mapping of 10 tuberculosis immunophenotype QTL (TipQTL), derived from bacterial burden and cytokine quantifications in each host.Measurement of TnSeq mutant fitness during in vivo infection yielded 46 host interacting with pathogen QTL (HipQTL).While 8 founder lines do provide a greater extent of SNP-level diversity in the CC panel than in the BXD panel (Munger et al. 2014), the additional founders can reduce statistical strength for QTL mapping and complicate determination of allele effects for mapped QTL.The 2-state model of the BXD panel allows relatively straightforward identification of significant QTL and downstream allele effects, potentially shortening the path from screen to biological discovery.
To further characterize the BXD phenome (Chesler et al. 2004) and add to the growing repertoire of selective pressures applied to Mtb whole-genome mutant libraries (Jinich et al. 2021), we now leverage the phenotypic divergence and reproducibility of the BXD family and the molecular phenotyping precision of an Mtb Tn mutant library.We infected 19 BXD strains and both parents with a saturated library of Mtb Tn mutants and conducted dual-genome QTL mapping to identify linkages between bacterial fitness profiles and host genetic factors, revealing 140 host genetic loci significantly associated with specific bacterial mutants.Thus, in passing a bacterial Tn mutant library through selection within the immunologically and phenotypically diverse BXD microenvironments, we have conducted a multidimensional host-pathogen screen to identify precise host and bacterial determinants of TB disease.Overall, this study provides a blueprint for the dissection of host-pathogen interactions, which can play a pivotal role in functional annotation of understudied host and bacterial genes.

Ethics statement
All mouse studies were conducted in accordance with the guidelines issued in the Guide for the Care and Use of Laboratory Animals of the National Institutes of Health and the Office of Laboratory Animal Welfare.Animal studies conducted at Duke University Medical School were conducted using protocols approved by the Duke Institutional Animal Care and Use Committee (IACUC) (Animal Welfare Assurance #A221-20-11) in a manner designed to minimize pain and suffering in Mtb-infected animals.Any animal exhibiting signs of severe disease were immediately euthanized in accordance with IACUC-approved endpoints.All mouse studies conducted at the University of Massachusetts Medical School (UMass) were performed using protocols approved by UMass IACUC (Animal Welfare Assurance #A3306-01).

Mouse infections
For aerosol infections, B6 and D2 were infected with ∼50 colony forming units (CFUs) of Mtb H37Rv YFP via aerosol inhalation (Glas-Col) for 6 and 12 weeks.At each timepoint, male and female mice were euthanized in accordance with approved IACUC protocols, and lung and spleen were harvested into PBS-T and homogenized.CFU was quantified by dilution plating onto Middlebrook 7H10 agar supplemented with OADC, 0.2% glycerol, 50 mg/mL carbenicillin, 10 mg/mL amphotericin B, 25 mg/mL polymyxin B, and 20 mg/mL trimethoprim.One lung lobe per mouse was collected into 10% neutral-buffered formalin for hematoxylin and eosin (H&E) staining by the Duke University Research Immunohistology Laboratory.
For TnSeq experiments, 1 × 10 6 CFU of saturated Himar1 Tn mutants (Sassetti et al. 2003) was delivered via intravenous tail vein injection, an infection route that enables consistent delivery of the complete TnSeq library to each mouse.Mice were infected in a single batch.At 4-week postinfection, mice were euthanized, and spleens and lungs were harvested and then homogenized in a FastPrep-24 (MP Biomedicals).CFU was quantified by dilution plating on 7H10 agar with 20 µg/mL kanamycin.For library recovery, ∼1 × 10 6 CFU per mouse was plated on 7H10 agar with 20 µg/ mL kanamycin.After 3 weeks of growth, colonies were harvested by scraping, and genomic DNA was extracted.The relative abundance of each Tn mutant was estimated as described in Long et al. (2015).

Microscopy and damage quantification
H&E-stained lung sections were imaged in bright field at 2× magnification on the Keyence BZ-X800.Images were processed identically within FIJI software (v2.3.0/1.53p) for image clarity.To quantify damage, an artificial neural network-based damage identification model was developed within QuPath (v0.3.2) (Bankhead et al. 2017), as described in Wilburn et al. (2023).Eight images (1 image per experimental group), containing a total of 68 unaffected and 65 affected manual annotations, were utilized for the single purpose of training a pixel classifier, and the remaining 3 images per group were quantified using the model.To prevent bias, lung sections presented in this report were quantified as the closest to the mean damage of each experimental group.

TnSeq analysis
TnSeq libraries were prepared and Tn insertion counts were estimated as previously described (Smith et al. 2022).The H37Rv genome annotation used is NCBI Reference Sequence NC_018143.2.In total, 40 independent Tn libraries were sequenced after recovery from the 19 BXD genotypes and the 2 parent strains, B6 and D2.Two independent Tn libraries were placed under selection within each host genotype.However, in the cases of 2 BXD genotypes, BXD48a and BXD51, only 1 library could be sufficiently recovered for sequencing.Within TRANSIT (DeJesus et al. 2015), betageometric correction was used to normalize insertion mutant counts across all libraries, and pseudocounts were implemented.Insertion counts were totaled for each Mtb gene, and counts were averaged between individual replicate libraries per mouse.Per gene mean values were compared with the grand mean and then log 2 transformed.
To control for multiple hypotheses genome wide, 10,000 permutation tests were performed to generate a resampling distribution used to derive the significance of Tn mutant selection between in vitro and in vivo conditions.An Mtb gene was considered "essential" within a host if mutants lacking that gene experienced a log 2 fold change (LFC) value of −0.5 or lower postselection in that host.To identify Mtb genes whose fitness was sufficiently variable to enable QTL mapping, the dynamic range in LFC across the parental and BXD strains for any given bacterial gene had to be at least 0.63.This value is the minimum LFC range of a mutant that was significantly selected in at least 1 host genotype after resampling (Q < 0.05).Mutants used for QTL mapping also needed to be sufficiently represented in the library, which was accomplished by requiring 4 or more TA Tn insertion sites to be detected for each bacterial gene.Further, each gene included needed to be significantly selected in at least 1 host genotype before resampling (P < 0.05).

Genotyping and QTL mapping
Previously published BXD genotypes containing 7,321 total markers were leveraged for QTL mapping (Wang et al. 2016).Markers that were not diagnostic of genetic differences between B6 and D2 were filtered out resulting in a total of 7,314 markers.Genotype data and phenotype data, including lung and spleen burden and TnSeq mutant fitness profiles, were imported into R statistical software (version 4.1.1)and formatted for QTL mapping within R/qtl2 (version 0.28) (Broman et al. 2019).The leave-one-chromosome-out (LOCO) approach was leveraged to estimate kinship between strains as a covariate for QTL mapping.Because all screened mice were male and infected in 1 batch, no other covariates were used for the initial mapping.Logarithm of odds (LOD) scores were calculated using the linear mixed model in R/qtl2 to establish phenotypic associations with marker loci across the host genome.Trait-wise significance thresholds for QTL were established by 10,000 permutation tests.The significance of gene class overrepresentation among the mapped QTL in comparison with class representation in the whole Mtb genome was established using Fisher's exact test.Chromosomal ideograms of QTL across the mouse genome were generated using R/karyoploteR (Gel and Serra 2017).

B6 and D2 mouse strains experience distinct chronic disease profiles after aerosol infection
B6 and D2, the parent strains of the BXD panel, have divergent responses to Mtb infection (Chackerian and Behar 2003).Canonically considered Mtb resistant, the T H 1-skewed adaptive immune response initiated by B6 mice is capable of restraining Mtb growth after 25 days, resulting in a mean survival time of ∼230 days (Mitsos et al. 2000).D2 mice control early infection with Mtb equally as well as B6 mice until ∼25 days.However, D2 mice experience low IFN-γ-producing T-cell influx to the lung when adaptive immunity should begin, at which point B6 and D2 begin to diverge in bacterial growth restriction (Fig. 1a and b) and lung damage (Fig. 1c-e) (Marquis et al. 2008).D2 macrophages become foamy and dysfunctional, making it difficult to contain and control bacterial growth in the lung (Chackerian and Behar 2003).By 12-week postaerosol infection, D2 lungs exhibit hyperinflammatory and fibrotic responses leading to overt damage (Fig. 1f-h) and an ultimate mean survival time of ∼110 days (Medina and North 1998;Keller et al. 2006).In addition to genotype-specific differences between these BXD parent strains, we also observe a sex effect independent of genotype, demonstrated by higher susceptibility of males of both genotypes (Supplementary Fig. 1a and b) (Tsuyuguchi et al. 2001;Dibbern et al. 2017).Further, we find that in contrast to burden, all groups reduce lung damage by week 12 except for D2 males (Supplementary Fig. 1c-g)."hpQTL" is the name for each host-pathogen QTL of genome-wide significance (P ≤ 0.05)."Trait" refers to the Rv number of the bacterial mutant whose fitness profile identified the hpQTL in the host genome."Gene" is the bacterial gene annotation of the mutant if one is available on Mycobrowser."Chr" is the host chromosome on which the QTL is located, and "Position (Mb)" is the position of the QTL in megabases."LOD" is the maximum association score between the mutant fitness profile and the host locus."Start (Mb)" and "End (Mb)" are the lower and upper boundaries of the Bayesian 95% confidence interval, as calculated by R/qtl2."P-value" denotes the significance of the association between the bacterial mutant fitness and the host genetic locus.Corresponding female sections can be found in Supplementary Fig. 1c and d. e) Lung damage was quantified by QuPath v0.3.2 using an artificial neural network-based damage identification model and reported as a percent of total lung section area (n = 4 per genotype and sex).Lung damage is not significantly different between B6 and D2 at 6-week postinfection by unpaired Student's t-test.f and g) Male B6 and D2 H&E-stained lung sections taken 12-week postinfection, 2× magnification, representative of n = 4 per strain.Corresponding female sections can be found in Supplementary Fig. 1e and f. h) Lung damage, quantified as in e), is significantly different between B6 and D2 at 12-week postinfection by unpaired Student's t-test.

Early clinical macrophenotypes fail to distinguish parental disease outcomes
With over 100 still extant recombinant inbred strains bred from B6 and D2 parental strains, the BXD family is a well-suited mammalian resource to map the phenotypic diversion during Mtb infection (Fig. 2a).To study early disease traits and bacterial endophenotypes that predict differences in TB disease outcomes during chronic infection stages, we intravenously infected the parental B6 and D2 lines and 19 BXD genotypes with a saturated TnSeq library of Tn mutants.The library contains knockouts of every Mtb gene that is not required in vitro, which collectively produces an infection similar to the wild-type infection strain (Sassetti and Rubin 2003;Smith et al. 2022).At 4-week postinfection, the parental B6 and D2 strains demonstrated approximately the same bacterial burden in lung and spleen (Fig. 2b).Moreover, BXD strains did not exhibit sufficiently wide variation in bacterial burden at 4-week postinfection to precisely map the genetic cause of such a complex trait (Fig. 2c; Supplementary Fig. 2a and b).
Compound macrophenotypes, such as CFUs in organ homogenate, are highly polygenic, making it difficult to identify causal genes even with large panels of mice (Lavebratt et al. 1999;Mitsos et al. 2000;Mitsos et al. 2003;Yan et al. 2006).

Tn mutant fitness endophenotypes report on the host immunological microenvironment
To interrogate the biological mechanisms that predict immunodivergent outcomes, we quantified the abundance of each bacterial mutant both before infection with the TnSeq library and after recovery from mouse organs of each genotype, using the LFC in abundance as a quantification of relative mutant fitness in each condition.This LFC fitness value serves as a precise reporter of the host microenvironment and can be used to map host loci that impact Mtb mutant fitness (Fig. 3a).
We identified 3 main classes of bacterial mutants, which are highlighted in Fig. 3b.The first of these classes includes canonical Mtb genes that are essential for in vivo survival, exemplified by bioA.These genes remain essential for Mtb survival in both parental backgrounds and across the BXD strains (Fig. 3b; "essential" profile in red).A second, "differentially required" set of mutants is exemplified in this figure by mbtG; the loss of mbtG did not significantly reduce bacterial fitness within B6 mice, but mbtG mutants experienced an extreme fitness reduction within D2 mice (Fig. 3b; "differentially required" profile in yellow).Further, there is a spectrum of mbtG essentiality across the family, indicating that some variable host factor could modify the fitness of this mutant.In addition to in vivo essential and differentially essential Mtb genes, a third class of Mtb genes proved to be broadly dispensable for Mtb survival in these hosts, imparting an adaptive benefit for Mtb when the gene was disturbed, represented here by glpK (Fig. 3b; "nonessential" profile in green).
When surveying the complete set of Mtb Tn mutants, we find that in comparison with B6 infection, Mtb requires almost twice as many genes to survive within D2 mice (Fig. 3c).This finding suggests that there are detectable immunological pressures being exerted on Mtb within the highly inflammatory D2 line prior to the divergence in B6 and D2 bacterial control and disease outcome.We additionally find that the BXD panel, while recapitulating most B6-and D2-essential genes, reveals additional subsets of genes that are only required within specific BXD strains (Fig. 3d), highlighting the power of this genetically diverse panel to offer novel insights into the biology controlling TB outcome.This spectrum of Mtb gene essentiality is further confirmed when we examine the correlation of the complete TnSeq library fitness between hosts (Fig. 3e).Interestingly, the majority of the strains hierarchically clustered closer to D2 than to B6, indicating that they could be inheriting factors from D2 that put greater immunological pressure on Mtb.

Tn mutant fitness profiles map hpQTL
We aimed to use these bacterial mutants as sensitive reporters of host immunological pressures within the early phase of infection at the onset of adaptive immunity.Using the LFC values collected for each Tn mutant population in the TnSeq library as quantitative endophenotypes (Supplementary Table 1), we conducted QTL mapping.After removing nondynamic genes for quality control (Supplementary Fig. 3a; Mtb genes that are not sufficiently varying in essentiality across the screened BXD genotypes), we found 140 genome-wide significant QTL, defined as P ≤ 0.05 after 10,000 permutations (Table 1).Of these QTL, 33 of the loci had a P-value of ≤0.01 (Fig. 4a).shows the cumulative requirement as each new host strain is added.e) A heatmap depicting the total correlation of the library-wide bacterial fitness within each screened host strain.
conversion of maltose into maltose-1-phosphate as a part of the glycogen biosynthesis pathway that supports construction of the protective outer capsule of Mtb (Li et al. 2014).Of note, TreS, which converts trehalose to maltose upstream of Mak, also maps to a genome-wide significant QTL on chromosome 11 and has done so in previous Mtb infection screens (Smith et al. 2022).rip2 is a putative zinc metalloprotease located in the Mtb cell membrane (Sklar et al. 2010), although very little is known about its function.PerM is an integral membrane protein that is known to play an essential role in enabling bacterial cell division under acid stress in B6 mice (Wang et al. 2019).Here, we report that perM mutants are even more severely attenuated within a D2 background, implying that PerM is differentially impacted by host immunological pressures.EspR is a key regulator of the ESX-1 secretion system (Rosenberg et al. 2011;Blasco et al. 2012), which itself plays a well-established role in mycobacterial virulence.Mutants lacking these bacterial genes map significant QTL within the interval of 75.97-88.58Mb on mouse chromosome 6.The 95% Bayesian confidence intervals of the QTL mapped by rip2 and perM mutants overlap while the confidence intervals of mak and espR mutants overlap separately from the rip2 and perM QTL.To test for the independence of these 2 groups of nearby QTL, we remapped each of the QTL within the chromosome 6 hotspot, iteratively incorporating the haplotype possessed by each BXD strain at each QTL as an additive covariate.Dependent loci will not map if the haplotype of 1 locus is factored into the mapping pipeline as a covariate.Conversely, if a QTL achieves significance by permutation test while the haplotype state at another significant QTL nearby is factored into mapping as a covariate, it suggests that the 2 loci arise from distinct causal genetic factors within the host.These Scan2 independence tests, so named in R/qtl, suggested 2 causal host loci exist within this highly significant QTL hotspot: 1 host factor underlying the rip2 and perM mutant fitness QTL and 1 underlying the mak and espR QTL (Fig. 6a-d).
To identify host gene candidates that may mediate the survival of these 2 groups of bacterial mutants, we defined a pipeline to select for protein-coding genes for which D haplotype has a unique SNP in comparison with B haplotype (Fig. 7a).We passed each of the genes identified by this pipeline through a list of genetic and clinical criteria using independent but complementary data sets from a variety of Mtb-infected mammalian cohorts (Fig. 7b) (Zak et al. 2016;Moreira-Teixeira et al. 2017;Ahmed et al. 2020).From this comparative analysis, we identified a putative host candidate within the region mapped by mak: MAX dimerization protein 1 (Mxd1), which is known to be upregulated in murine bone marrow-derived macrophages after infection with the hypervirulent Mtb HN878 strain (Roy et al. 2018).Further, we found a putative host candidate within the QTL mapped by perM mutants: ancient ubiquitous protein 1 (Aup1).The QTL mapped by the mak Tn mutant was enriched for protein-coding genes with   4) whether the gene is significantly upregulated or downregulated in TB patient blood (Zak et al. 2016), and (5) whether the gene is differentially expressed in mouse lungs across variable host genotypes, Mtb genotypes, and doses (Moreira-Teixeira et al. 2020).If the answer is no, the space remains blank.Green indicates yes, and yellow indicates yes but cautions that the Rhesus Macaque gene is not a high confidence homolog.Only protein-coding genes with a D2 SNP (as per Sanger MGP) that met at least 1 of the criteria are included in this visual.

R. K. Meade et al. | 11
D2-specific SNPs compared with the other QTL within the hotspot.However, the causal locus underlying these QTL may not be protein coding at all.In fact, historical studies of cardiomyopathy in B6 and D2 backcrosses have identified a QTL that was later discovered to be caused by an intronic variant in the tropin-interacting kinase gene Tnni3k (Wheeler et al. 2005;Wheeler et al. 2009), which also conferred susceptibility to viral-induced myocarditis (Wiltshire et al. 2011).There are many examples in literature of noncoding causal variants (Hamilton and Yu 2012), yet there remain compelling candidate genes within the chromosome 6 hotspot that could drive the conditional essentiality of these Mtb genes in vivo.Altogether, we have leveraged complementary screen data to identify plausible host gene candidates underlying these host-pathogen QTL (hpQTL).

Discussion
While it is thought that nearly a quarter of the global population is infected with Mtb (Houben and Dodd 2016), only 5-10% of Mtb infections ultimately result in active TB (Gopal et al. 2014).This paradox has motivated decades of study on human genetic variants that could predict such a lethal prognosis in affected individuals.While Genome-wide association studies (GWAS) among infected populations has identified numerous host loci associated with increased susceptibility to disease, a limited ability to dissect biological mechanism leaves many questions unanswered (Saul et al. 2019).QTL mapping, a parallel systems genetics technique, has been utilized for decades to identify host loci in genetically tractable populations linked to TB susceptibility; Sst1, one of the most well-known susceptibility loci, was first discovered through a QTL mapping study in mice (Kramnik et al. 2000).Despite the widespread use of disease traits, including mean survival time, body weight, and bacterial burden, to identify host determinants of disease progression (Lavebratt et al. 1999;Kramnik et al. 2000;Mitsos et al. 2000;Mitsos et al. 2003;Yan et al. 2006), these phenotypes can fail to dissect immunological features that drive distinct Mtb pathogenesis in diverse hosts.Here, we describe a screen in which we leveraged the reproducibility, host variation, and phenotypic divergence of the BXD family and the molecular precision of the TnSeq library to identify novel axes of host-pathogen interaction during Mtb infection.We demonstrate the late (12-week) phenodivergence of the parental B6 and D2 mice; however, we found that CFU serves as a weak predictor of host outcome at the onset of adaptive immunity 4-week postinfection.By utilizing TnSeq as a refined infection trait, we identified bacterial mutants that could distinguish the resistant B6 and susceptible D2 strains at the early 4-week infection timepoint.Finally, we leveraged the differentially required mutants as traits to conduct QTL mapping and identified 140 genome-wide hpQTL and a QTL hotspot encompassing a cluster of Mtb virulence and cell wall genes.Using this BXD TnSeq platform and a novel candidate prioritization pipeline, we identified viable host gene candidates for future study.This screen highlights the importance of intentionality in design for deep phenotyping platforms to optimally identify the causal factors underlying gene associations.In contrast to complex macrophenotypes, which require hundreds of mice to sufficiently power causal locus identification for highly polygenic traits, molecular endophenotypes, such as Mtb Tn mutant fitness, allow us to learn more about the intricate details of Mtb invasion in the early stages of infection, which is particularly helpful in an experimental framework with fewer host genotypes.The strength of TnSeq as a deep phenotyping platform lies in the redundancy of each Mtb gene knockout; within the TnSeq library, each bacterial gene that is nonessential in vitro was perturbed at 4 or more unique Tn insertion sites, resulting in multiple subpopulations of unique replicates of each Mtb gene knockout.These mutant replicates taken together paint a robust picture of individual Mtb gene requirement in vivo, and multiple host replicates strengthen the reproducibility of each Mtb mutant fitness profile.In contrast with our previous TnSeq study that utilized nearly 70 host genotypes (Smith et al. 2022), this TnSeq study includes 21 unique genotypes and 73 mice, including the parental strains B6 and D2.The greater number of Tn mutant hpQTL identified in this screen demonstrates the power of a 2-state model to segregate allele effects, as opposed to the 8-state model present in the CC recombinant inbred panel.We, however, cannot eliminate the possibility that the smaller number of host genotypes included in this study did not sufficiently limit statistical noise during QTL mapping analyses.
Using bacterial mutant profiles as endophenotypes, we identified 140 genome-wide significant (P ≤ 0.05) associations between Mtb Tn mutant survival and host genetic loci in a diverse cohort, which may contribute to divergence in clinical outcomes between Mtb-resistant and susceptible hosts.These hpQTL and the Tn mutant fitness profiles across the BXD panel have been shared with GeneNetwork.org(Chesler et al. 2004;Parker et al. 2017;Ashbrook et al. 2021), an online compendium of BXD phenotypes.Increasing the precision of bacterial traits creates a rich environment for screen-based hypothesis generation in a smaller host population, thereby lowering the cost through lower animal count and less total infection time.This screen serves as a proof of concept that large host populations are not an absolute necessity to uncover novel and biologically meaningful axes of the hostpathogen interface.Molecular endophenotyping enables a much more accessible experimental platform to assist in our ongoing struggle against Mtb.
From this discovery screen, we have identified several plausible host gene candidates.Variants in Aup1 represent a possible axis for interaction between Mtb and the mammalian host.The acetyltransferase activity of Aup1 has been shown to be exploited by flaviviruses to trigger autophagy of lipid droplets within the cell, which produces ATP for the virus, thereby promoting viral replication (Zhang et al. 2018).Although not much is known about whether Mtb interacts directly with this protein, Mtb is one of the few bacteria capable of generating lipid droplets (Daniel et al. 2011).Host lipid droplets, specifically those derived from foamy macrophages, have been shown to promote the production of defensive cytokines during Mtb infection (Jaisinghani et al. 2018), implicating a potentially intriguing locus for host-pathogen interaction.The other high priority candidate Mxd1 encodes the protein MAD, which competes with MYC to bind with MAX, forming a transcriptional repressor.This mechanism has led Mxd1 to be considered a putative tumor suppressor gene.Mxd1 has been shown to be upregulated in murine bone marrow-derived macrophages upon initial infection with the hypervirulent Mtb HN878 strain (Roy et al. 2018) and to regulate the fitness of murine dendritic cells (Anderson et al. 2020), yet the function of Mxd1 in response to Mtb has yet to be determined.Together, these candidates provide feasible next steps for mechanistic interrogation of these loci.
While the functions of many mycobacterial genes remain unknown (Whitaker et al. 2020), this Tn mutant screen within the BXD family builds directly upon a previous screen in the CC panel (Smith et al. 2022) to further articulate the conditional necessity of many canonically "nonessential" Mtb genes within a spectrum of host microenvironments.A substantial number of Mtb genes known to be nonessential in B6 mice prove to be essential in a subset of these genetically diverse hosts, suggesting that these bacterial genes play an active yet undiscovered role in pathogenesis under specific immunological conditions.These new host conditions have been shared with MtbTnDB (Jinich et al. 2021), a central repository encompassing over 60 published Mtb TnSeq screens for ease of browsing and interscreen comparison.For mycobacterial researchers exploring Mtb genes traditionally thought to be nonessential in vivo, this work challenges such a notion and offers a new model that will provide insight into the host-pathogen dynamic underlying this conditional in vivo essentiality.

Fig. 1 .
Fig. 1.B6 and D2 are phenodeviant in TB susceptibility.a) Log 10 transformed CFUs recovered from lung tissue throughout the course of the aerosol infection (with an infectious dose of ∼40 CFU of Mtb H37Rv).b) Log 10 transformed CFU recovered from spleen throughout infection.c and d) Male B6 and D2 H&E-stained lung sections taken 6-week postinfection, 2× magnification, representative of n = 4 per strain.Corresponding female sections can be found in Supplementary Fig.1c and d. e) Lung damage was quantified by QuPath v0.3.2 using an artificial neural network-based damage identification model and reported as a percent of total lung section area (n = 4 per genotype and sex).Lung damage is not significantly different between B6 and D2 at 6-week postinfection by unpaired Student's t-test.f and g) Male B6 and D2 H&E-stained lung sections taken 12-week postinfection, 2× magnification, representative of n = 4 per strain.Corresponding female sections can be found in Supplementary Fig.1e and f. h) Lung damage, quantified as in e), is significantly different between B6 and D2 at 12-week postinfection by unpaired Student's t-test.

Fig. 2 .
Fig. 2. BXD panel exhibits natural genetic variation, but classical clinical traits cannot map genome-wide QTL at 4-week postinfection.a) The BXD panel is a biparental recombinant inbred panel bred from C57BL/6J (B6) and DBA/2J (D2) and composed of over 100 mosaic strains.b) Lung and spleen burden by host genotype (n = 1-5 per genotype).c) QTL mapping of lung and spleen burden traits across the BXD cohort.Solid threshold represents a genome-wide significance of P = 0.05.Dashed threshold represents P = 0.20.Thresholds were calculated from 10,000 permutation tests.

Fig. 3 .
Fig.3.TnSeq mutant fitness is a sensitive reporter of the factors that drive disease outcome.a) Each BXD and parental strain was infected with 10 6 CFU of Mtb TnSeq library, representative of ∼10 5 independent Mtb Tn mutants.For each mutant, the change between the initial abundance in the TnSeq library and the abundance recovered after 4 weeks of in vivo infection was compared with calculate a quantitative fitness metric (LFC), which could be used downstream for QTL mapping.b) Example plot showing an essential Mtb gene (bioA), a conditionally essential gene (mbtG), and a nonessential gene (glpK) across the BXD panel.c) A Venn diagram depicting the overlap of essential genes between B6, D2, and the screened BXD strains.Mtb genes were deemed essential if the Tn mutant experienced a 1-fold reduction (LFC < −0.5) within a given host strain and were significantly different from in vitro conditions (P < 0.05) in at least 1 host strain across the panel.d) The number of Mtb genes essential (LFC < −0.5) for growth or survival in each diverse mouse strain across the panel (P < 0.05).Salmon (smallest circle) indicates the mutants uniquely required for each host additional strain, and purple (medium circle) shows the cumulative requirement as each new host strain is added.e) A heatmap depicting the total correlation of the library-wide bacterial fitness within each screened host strain.

Fig. 5 .
Fig. 5.A chromosome 6 QTL hotspot highlights a genomic region controlling fitness of multiple bacterial mutants.QTL mapping of Tn mutants lacking a) mak, b) espR, c) perM, and d) rip2.After 10,000 permutation tests, the solid threshold represents P = 0.05, and the dashed threshold represents P = 0.20.e) Chromosome 6 mapping overlap of the 4 Tn mutant fitness profiles that identified the QTL hotspot.f-i) Boxplots representing the fitness of Tn mutants lacking mak, espR, perM, and rip2 within BXD mice with B (navy) or D (lilac) haplotypes at the QTL position.BXD genotypes for which a haplotype state could not be called with 95% confidence were not included.

Fig. 6 .
Fig. 6.Independence tests identify 2 putative causal variants.Scan2 independence tests from R/qtl were used to identify whether more than 1 causal host factor may underlie the chromosome 6 hotspot.The QTL mapped by a) mak, b) espR, c) perM, and d) rip2 Tn mutants were remapped, incorporating the haplotype of the BXD mice at the locus of each other QTL as a covariate in the mapping analysis.The original QTL mapped by each mutant fitness profile are represented as dashed lines.The solid horizontal threshold represents P = 0.05 while the dashed horizontal threshold represents P = 0.20.

Fig. 7 .
Fig. 7.A bioinformatic pipeline highlights putative candidate genes underlying the QTL on chromosome 6.a) A graphic summarizing the bioinformatic pipeline used to prioritize host gene candidates within each QTL.b) A heatmap representative of the per-gene outcome of 5 distinct criteria: (1) whether or not D2 possesses a missense or nonsense mutation in comparison to B6 according to the Wellcome Sanger Institute Mouse Genomes Project (Sanger MGP) (Adams et al. 2015), (2) whether the gene is significantly enriched in genetically diverse mice progressively infected with Mtb (Ahmed et al. 2020), (3) whether the gene is significantly enriched in Rhesus Macaques progressively infected with Mtb (Ahmed et al. 2020), (4) whether the gene is significantly upregulated or downregulated in TB patient blood(Zak et al. 2016), and (5) whether the gene is differentially expressed in mouse lungs across variable host genotypes, Mtb genotypes, and doses(Moreira-Teixeira et al. 2020).If the answer is no, the space remains blank.Green indicates yes, and yellow indicates yes but cautions that the Rhesus Macaque gene is not a high confidence homolog.Only protein-coding genes with a D2 SNP (as per Sanger MGP) that met at least 1 of the criteria are included in this visual.