Genetically Determined Strength of Natural Killer Cells is Enhanced by Adaptive HLA class I Admixture in East Asians

Human natural killer (NK) cells are essential for controlling infection, cancer and fetal development. NK cell functions are modulated by interactions between polymorphic inhibitory killer cell immunoglobulin-like receptors (KIR) and polymorphic HLA-A, -B and -C ligands expressed on tissue cells. All HLA-C alleles encode a KIR ligand and contribute to reproduction and immunity. In contrast, only some HLA-A and -B alleles encode KIR ligands and they focus on immunity. By high-resolution analysis of KIR and HLA-A, -B and -C genes, we show that the Chinese Southern Han are significantly enriched for interactions between inhibitory KIR and HLA-A and -B. This enrichment has had substantial input through population admixture with neighboring populations, who contributed HLA class I haplotypes expressing the KIR ligands B*46:01 and B*58:01, which subsequently rose to high frequency by natural selection. Consequently, over 80% of Southern Han HLA haplotypes encode more than one KIR ligand. Complementing the high number of KIR ligands, the Chinese Southern Han KIR locus combines a high frequency of genes expressing potent inhibitory KIR, with a low frequency of those expressing activating KIR. The Southern Han centromeric KIR region encodes strong, conserved, inhibitory HLA-C specific receptors, and the telomeric region provides a high number and diversity of inhibitory HLA-A and -B specific receptors. In all these characteristics, the Southern Han represent other East Asians, whose NK cell repertoires are thus enhanced in quantity, diversity and effector strength, likely through natural selection for resistance to endemic viral infections.


Introduction
The difference in the proportion of HLA class I haplotypes encoding one versus more than one KIR ligand between the Southern Han and each of the other seven representative populations is statistically significant, as is that between Amerindians and the other populations (two-proportions Z-test, Benjamini-Hochberg corrected p < 0.001, Figure 1C [...] Malay and Filipino). This analysis showed these populations also have a high frequency of HLA class I haplotypes encoding multiple KIR ligands (Figure 1D). Our analysis thus shows that East Asian and South East Asian HLA class I haplotypes encode more ligands for inhibitory KIR than the haplotypes of any other populations.
[...] Among HLA-B*46:01 carriers, the MHC is primarily of South East Asian ancestry (Figure 2B), with carriers having a significantly higher proportion of South East Asian genetic ancestry in the MHC than outside the MHC (p = 2.7^-6), or within the MHC of non-carriers (p = 1.9^-5). Similarly, among HLA-B*58:01 carriers, the MHC region is primarily of Japanese ancestry (Figure 2B), with carriers having a significantly higher proportion of Japanese genetic ancestry within the MHC than outside the MHC (p = 2.4^-4) and compared with non-B*58:01 carriers (p = 6.1^-7). Excluding the MHC, carriers of any of these three alleles show East Asian ancestry along chromosome 6 (Figure 2B). Further supporting the observed population structure as specific to the MHC region, among the three ancestral groups the FST values range from 0.098 to 0.161 for the MHC, compared with 0.012 to 0.017 for the remainder of chromosome 6.
[...] For three of the haplotypes, which include the two most frequent haplotypes in the population, this distinction is statistically significant (pcorr < 0.01). Two of these haplotypes contain B*46:01 and one contains B*58:01 (Figure 2C).
In total, four of five of the HLA-B alleles that encode a KIR ligand and are present on these 10 most frequent haplotypes show increased evidence for admixture in the MHC region. By contrast, neither of the two HLA-A alleles that encode a KIR ligand shows a genetic ancestry within the MHC that differs from the East Asian ancestry throughout chromosome 6. This finding suggests that the number of HLA-B genes encoding KIR ligands was enhanced in Chinese Southern Han by admixture with neighboring or displaced populations. In summary, these findings show that the B*46:01 and B*58:01 alleles are present in Chinese Southern Han through admixture.
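The pairwise comparison of haplotype proportions described above (a two-proportions Z-test followed by Benjamini-Hochberg correction) can be illustrated with a minimal sketch. The source does not state which software performed the test, so the pooled-variance, two-sided form below is an assumption, and all counts and population labels in the example are invented placeholders rather than values from the study.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided, pooled two-proportion z-test. Returns (z, p_value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        raise ValueError("pooled proportion is 0 or 1; the test is undefined")
    z = (p1 - p2) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical counts: (haplotypes with more than one KIR ligand, total haplotypes).
    reference = (160, 200)
    others = {"Population A": (90, 200), "Population B": (70, 180), "Population C": (150, 200)}
    raw = [two_proportion_z_test(*reference, *counts)[1] for counts in others.values()]
    for name, p_adj in zip(others, benjamini_hochberg(raw)):
        print(f"{name}: BH-corrected p = {p_adj:.3g}")
```

Each population is compared against one reference population, and the correction is applied across that family of comparisons. Whether the study corrected across this family or a larger one is not stated in the text.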

Southern Han
To investigate whether the admixed haplotypes were also subject to natural selection, we examined further characteristics of their diversity and distribution. We first measured nucleotide diversity (π) of the genomic sequence within ±500 kb of specific HLA-B alleles (Figure 3A). We found significantly reduced nucleotide diversity of haplotypes containing HLA-B*46 compared to haplotypes containing HLA-B*40 (mean π of B*40 = 2.2 x 10^-3, B*46 = 0.6 x 10^-3, Wilcoxon test, p = 1.24 x 10^-12). We also observed that haplotypes containing B*58 have lower diversity than B*40, but this reduction was not statistically significant (mean π of B*58 = 1.6 x 10^-3, Wilcoxon test, p = 0.12). This reduced diversity suggests that B*46 haplotypes have risen in frequency in the Chinese Southern Han without accumulating mutations. To further explore this finding, we used the iHS statistic, which identifies genomic variants that have increased in frequency recently and rapidly under natural selection, so that their haplotypic background has not yet been diversified by recombination (Voight et al. 2006). We identified a strong signal of recent selection (iHS ≥ 99th percentile) that falls precisely in the MHC of the Chinese Southern Han (Figure 3B).
To investigate the patterns of selection specific to HLA-B*46:01 and B*58:01 haplotypes, we first identified SNPs that characterize those haplotypes and then compared their distribution of iHS values to the distribution of all the SNPs of chromosome 6 (Figure 3C). For B*46:01, the mean absolute iHS of 3.38 was significantly higher than the mean for chromosome 6 of 0.785 (Wilcoxon two-sample test, p = 5.77^-6), as was the mean iHS for B*58:01 (3.43, p = 1.8^-3).
Although the signal for B*58:01 is weaker, there is a more distinct subset of SNPs having extremely high iHS values (>99th percentile, Figure 3C), which could indicate recent selection of an older haplotype, but it was not possible from our analysis to determine if the SNP allele was ancestral or derived in each case. Interestingly, the mean iHS for B*40:01 associated SNPs was also significantly higher than the chromosome average (1.88, p = 7.7^-6). However, [...] which are KIR ligands (Figure 1B), we extended the analysis to these alleles (Figure 3C). [...] ligand. Together, these findings thus illustrate that HLA class I in the Chinese Southern Han has been targeted by natural selection and suggest that one major benefit has been to increase the number of KIR ligands present in the population.
We next examined whether the observed distributions of HLA class I encoded KIR ligands were consistent with modern human population dispersal. Cluster analysis shows there are five groups of HLA class I frequency spectra that correspond to the broad population groups of African, European, Asian, Oceanian and American origin (Figure 3D). By contrast, three distinct and strongly supported groups cluster according to their proportions of haplotypes encoding one, two or three KIR ligands (Figure 3E). Notable examples are the Ryukyu Japanese and Indigenous Australian populations, who group with Asian populations when analyzed by HLA class I haplotype distribution (Figure 3D). By contrast, Ryukyu Japanese and Australians appear more similar to Africans and other groups when analyzed according to the number of KIR ligands encoded by their HLA class I haplotypes (Figure 3E). In a counter example, Northern Indian and Tuvan populations (Figure S4C
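The nucleotide diversity (π) comparison used earlier in this section to contrast B*46, B*58 and B*40 haplotypes can be made concrete with a short sketch. It uses the standard definition of π as the mean number of pairwise differences per site among aligned sequences. The source does not say how π was computed, so the function name, the input format (equal-length phased haplotype strings) and the toy sequences are assumptions for illustration only.

```python
from itertools import combinations


def nucleotide_diversity(haplotypes):
    """Mean pairwise differences per site among aligned, equal-length sequences."""
    if len(haplotypes) < 2:
        raise ValueError("at least two haplotypes are required")
    length = len(haplotypes[0])
    if length == 0 or any(len(h) != length for h in haplotypes):
        raise ValueError("haplotypes must be non-empty and of equal length")
    total_differences = 0
    n_pairs = 0
    for a, b in combinations(haplotypes, 2):
        # Count mismatching positions between this pair of haplotypes.
        total_differences += sum(1 for x, y in zip(a, b) if x != y)
        n_pairs += 1
    return total_differences / (n_pairs * length)


if __name__ == "__main__":
    # Toy sequences, not real HLA-flanking haplotypes.
    toy = ["ACGTACGT", "ACGTACGA", "ACGTTCGA"]
    print(f"pi = {nucleotide_diversity(toy):.4f}")
```

The all-pairs loop is quadratic in the number of haplotypes, which is fine for a sketch. For megabase windows and large samples, a per-site allele-frequency formulation would be the more practical choice.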

High frequency of inhibitory KIR allotypes in Southern Han
The KIR locus comprises genes encoding the four inhibitory and six activating KIR known to bind polymorphic HLA class I ligands, and three that do not bind polymorphic HLA class I (Guethlein et al. 2015). In total, we identified 116 KIR alleles, representing 101 KIR allotypes (Figure S5). A total of 46 novel KIR alleles (39.7% of total KIR alleles detected) were characterized (Figure S6) and 24.8% of the Han individuals carried at least one novel allele. Correcting for the number of individuals tested showed that the Southern Han are more diverse than Amerindians and Oceanians, but less diverse than Europeans and Africans (Figure 4A [...]

Directional selection reduced centromeric KIR region diversity in the Southern Han
In Chinese Southern Han, the KIR3DL1/S1 and KIR3DL2 genes, which encode inhibitory NK cell receptors specific for polymorphic HLA class I ligands, have two or three high frequency alleles and multiple less frequent alleles (Figure S5). In contrast, KIR2DL1 and 2DL2/3 also encode inhibitory receptors but are each dominated by one high frequency allele (Figure 4D).
To explore this observation, we compared the observed homozygosity to that expected across populations representing major ancestry groups from Europe, Africa, Asia, South America, and Oceania, using the Ewens-Watterson test (Fnd). KIR2DL1 and KIR2DL2/3 are in the centromeric region of the KIR locus, whereas KIR3DL1/S1 and KIR3DL2 are telomeric KIR genes (Wilson et al. 2000). Overall, the Southern Han show greater homozygosity compared to other populations, which is more pronounced among centromeric than telomeric KIR genes and statistically significant for KIR2DL2/3 (Fnd = 3.1, P > 0.985, Figure 4E). This high-resolution analysis of KIR alleles complements recent analysis of genome-wide SNP data that identified [...]
[...] the USA (grant numbers: NIH R01 AI017892 to PP, and R56 AI151549 to PJN). We thank the Chinese blood donors for generously providing DNA samples for this study.
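The Ewens-Watterson comparison of observed and expected homozygosity reported for the KIR genes (Fnd, Figure 4E) can be sketched as follows. Fnd is the normalized deviate: observed homozygosity minus its neutral expectation, divided by the neutral standard deviation. The study cites an exact test (Salamon et al. 1999). The version below instead approximates the neutral distribution by Monte Carlo, simulating a Hoppe urn and keeping only samples with the observed number of alleles. That substitution, the function names and the allele counts in the example are all assumptions, not the authors' procedure or data.

```python
import random
import statistics


def homozygosity(counts):
    """Expected homozygosity F: sum of squared allele frequencies."""
    n = sum(counts)
    return sum((c / n) ** 2 for c in counts)


def solve_theta(k, n):
    """Find theta whose expected allele number equals k (improves acceptance only)."""
    lo, hi = 1e-6, 1e6
    for _ in range(80):
        mid = (lo * hi) ** 0.5
        if sum(mid / (mid + i) for i in range(n)) < k:
            lo = mid
        else:
            hi = mid
    return (lo * hi) ** 0.5


def hoppe_urn_counts(n, theta, rng):
    """Draw allele counts for n gene copies under the neutral infinite-alleles model."""
    counts = []
    for i in range(n):
        if rng.random() < theta / (theta + i):
            counts.append(1)  # a new allele enters the sample
        else:
            r, cum = rng.randrange(i), 0  # copy a uniformly chosen earlier gene copy
            for j, c in enumerate(counts):
                cum += c
                if r < cum:
                    counts[j] += 1
                    break
    return counts


def ewens_watterson_fnd(counts, n_sim=1000, seed=0):
    """Monte Carlo estimate of Fnd = (F_obs - mean F_null) / sd F_null, given n and k."""
    n, k = sum(counts), len(counts)
    if not 1 < k < n or min(counts) < 1:
        raise ValueError("need positive counts with 1 < number of alleles < sample size")
    rng = random.Random(seed)
    theta = solve_theta(k, n)
    null = []
    while len(null) < n_sim:
        sim = hoppe_urn_counts(n, theta, rng)
        if len(sim) == k:  # condition on the observed number of alleles
            null.append(homozygosity(sim))
    sd = statistics.pstdev(null)
    if sd == 0:
        raise ValueError("null homozygosity has no variance for this n and k")
    return (homozygosity(counts) - statistics.fmean(null)) / sd


if __name__ == "__main__":
    # Hypothetical allele counts for a gene dominated by one allele.
    print(f"Fnd = {ewens_watterson_fnd([85, 6, 4, 3, 2]):.2f}")
```

Conditional on the number of alleles, the neutral distribution of allele counts does not depend on theta, so theta here only affects how many simulated samples are accepted. A positive Fnd indicates more homozygosity than expected under neutrality, the direction reported for KIR2DL2/3.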

Web Resources and Accession Numbers
The URLs for data, material and programs used herein are as follows:
The scripts used in the study are located at https://github.com/n0rmski/Han_Study
[...] Alternative sequence motifs in the α1 domain of the HLA class I molecule determine the four epitopes recognized by different KIR, which are also called KIR ligands. The A3/11 epitope is carried by HLA-A3 and -A11 (yellow-colored pie segments); the Bw4 epitope is carried by subsets of HLA-A and -B allotypes (green-colored pie segments). The C1 epitope is carried by a majority of HLA-C allotypes, as well as by HLA-B*46 and HLA-B*73 (red-colored pie segments). The C2 epitope is carried by all HLA-C allotypes that do not carry C1 (blue-colored pie segments). Grey-colored pie segments correspond to allotypes that are not KIR ligands. Figure S2 lists all the HLA-A, -B and -C allotypes present in the study population and shows which KIR ligand motifs they carry.
[...] shaded in blue. At the right is shown for each haplotype the number of individuals carrying the haplotype, and its frequency. All the haplotypes are shown in Figure S7.
[...] heterozygosity. An asterisk denotes significance (P < 0.05 or > 0.95) using the exact test (Salamon et al. 1999).