Yuanqing Feng, Michael A McQuillan, Sarah A Tishkoff, Evolutionary genetics of skin pigmentation in African populations, Human Molecular Genetics, Volume 30, Issue R1, 1 March 2021, Pages R88–R97, https://doi.org/10.1093/hmg/ddab007
Abstract
Skin color is a highly heritable human trait, and global variation in skin pigmentation has been shaped by natural selection, migration and admixture. Ethnically diverse African populations harbor extremely high levels of genetic and phenotypic diversity, and skin pigmentation varies widely across Africa. Recent genome-wide genetic studies of skin pigmentation in African populations have advanced our understanding of pigmentation biology and human evolutionary history. For example, novel roles in skin pigmentation for loci near MFSD12 and DDB1 have recently been identified in African populations. However, due to an underrepresentation of Africans in human genetic studies, there is still much to learn about the evolutionary genetics of skin pigmentation. Here, we summarize recent progress in skin pigmentation genetics in Africans and discuss the importance of including more ethnically diverse African populations in future genetic studies. In addition, we discuss methods for functional validation of adaptive variants related to skin pigmentation.
Introduction
The hypothesis that modern humans originated in Africa was first raised based on the fossil record (1,2). The ‘Out-of-Africa’ theory, which posits a single Homo sapiens origin in Africa ~300 kya, has since been confirmed at the molecular level based on mitochondrial and nuclear DNA sequencing studies (3–6). Around 50–80 kya, a small number of humans migrated out of Africa and subsequently populated the rest of the globe (3,4). During this migration, humans encountered diverse geographical, environmental and climatic conditions, which shaped many adaptive traits, including skin pigmentation (7). Variation in skin pigmentation is, thus, largely attributable to the forces of natural selection and is almost perfectly correlated with ultraviolet radiation (UVR) globally (7,8). The most widely held hypothesis is that darkly pigmented skin protects against UV-induced degradation of folate in high UVR environments, while lighter skin facilitates the production of vitamin D at higher latitudes (9,10). These biological processes could directly affect fitness; folate deficiencies lead to a diversity of birth defects and fertility complications (11), while vitamin D plays a critical role in bone health and reproductive physiology (12).
During the migration out of Africa to higher latitudes, modern humans encountered lower and more variable UVR environments, and depigmented skin evolved independently multiple times (13). As a result, genetic variants that are associated with lighter skin, including in genes SLC24A5, SLC45A2, MC1R, TYR, TYRP1 and OCA2, display strong statistical signals of positive selection in European and Asian populations (13–22). However, this is not to suggest that the ancestral state of human skin pigmentation was dark. Recent genomic studies show that the ancestral alleles of many predicted functional pigmentation variants in Africa are associated with lighter skin, suggesting our human ancestors may have had light or moderately pigmented skin (23). Combined with the fact that our closest evolutionary relatives, chimpanzees, have light skin (24), these results suggest that dark skin may be a derived trait in the Homo genus, and that both light and dark skin have continued to evolve over hominid history (23,25).
The majority of studies examining the genetics of skin pigmentation have been conducted in European and Asian populations, and African populations are vastly underrepresented (26,27). This is despite the fact that modern Africans harbor very high levels of genetic and phenotypic diversity (6), and skin pigmentation is no exception. For example, the KhoeSan hunter-gatherers in Botswana have relatively light skin, while Nilo-Saharan-speaking populations from East Africa have some of the darkest pigmented skin on Earth (23,28). Further, genetic variants associated with pigmentation in European and Asian populations explain only a small amount of the phenotypic variation in Africa, suggesting the genetic architecture underlying skin pigmentation differs between Africans and Eurasians (23,28).
In this review, we summarize current knowledge of the genes and variants influencing human skin pigmentation, with a particular focus on African populations. We highlight recent research on the evolutionary genetics of African skin pigmentation and suggest how best to utilize experimental methods to identify and functionally validate novel pigmentation loci that may be under selection. Finally, we argue that the inclusion of more ethnically, phenotypically and geographically diverse African populations will be essential for a full understanding of the evolutionary genetics of human skin pigmentation.
Human Skin Pigmentation Genes
Human skin color is determined by the composition, abundance and distribution of melanin pigments, which are biopolymers derived from tyrosine (29). Melanins come in two main forms: black-brown eumelanin and yellow-red pheomelanin. Melanin is synthesized and stored in lysosome-like organelles called melanosomes, which are generated in melanocytes and transferred to surrounding keratinocytes (30). Skin melanocytes differentiate from melanoblasts, which originate from neural crest cells during embryonic development (31,32).
Skin color is regulated by genes underlying the development of melanocytes and melanosome biogenesis. Mice and other model organisms have been used to validate the function of genes associated with human pigmentation. A recent review summarized a cross-species list of 650 genes with verified pigmentation phenotypes in humans, mice or zebrafish (33). The International Mouse Phenotyping Consortium has identified 224 genes associated with pigmentation phenotypes (34). Among them, SOX10, PAX3, MITF, EDN3, EDNRB, KIT and KITLG are key genes that regulate the proliferation, migration and differentiation of melanocyte precursor cells (32,35). TYR, TYRP1, DCT, OCA2, SLC24A5, SLC45A2, MC1R, ASIP, GPR143 and PMEL are well-characterized genes involved in the synthesis of melanin in melanocytes (29,30,36). Although here we mainly focus on the genetics and biology of skin pigmentation, it is important to note that many skin pigmentation genes (e.g. OCA2, SLC45A2, KITLG) also affect hair or eye color (37,38).
Searching for Signatures of Skin Pigmentation Evolution
An important goal of evolutionary genetics is to identify the variants and genes underlying phenotypic evolution and local adaptation (39,40). The advent of microarray and high throughput sequencing technologies now enables researchers to perform genome-wide association studies for complex human traits, which have been successful in identifying multiple genes associated with human skin pigmentation (27). These methods, coupled with genome-wide scans of natural selection, can provide novel insights into the genetic mechanisms underlying the evolution of skin pigmentation across populations (Fig. 1) (41,42). For example, allele frequency shifts at trait-associated variants between populations are indicators of local adaptation and positive selection (43–46).

Figure 1. Methods of detecting adaptive variants related to skin pigmentation. (1) Genome-wide selection scans identify genomic signatures of natural selection, and some are near pigmentation genes. (2) GWAS localizes variants associated with pigmentation. (3) Overlapping GWAS variants with signatures of natural selection reveals candidate adaptive pigmentation loci.
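The third step in this workflow, intersecting GWAS hits with selection-scan regions, can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in any of the cited studies: the function name, the variant labels and all coordinates below are hypothetical placeholders.

```python
def overlap_hits_with_windows(gwas_hits, selection_windows):
    """Return IDs of GWAS hits that fall inside any selection-scan window.

    gwas_hits: list of (variant_id, chrom, pos)
    selection_windows: list of (chrom, start, end), inclusive coordinates
    """
    # Group windows by chromosome so each hit is only compared to relevant windows.
    windows_by_chrom = {}
    for chrom, start, end in selection_windows:
        windows_by_chrom.setdefault(chrom, []).append((start, end))

    candidates = []
    for variant_id, chrom, pos in gwas_hits:
        for start, end in windows_by_chrom.get(chrom, []):
            if start <= pos <= end:
                candidates.append(variant_id)
                break  # one overlapping window is enough
    return candidates


# Hypothetical inputs: labels and positions are invented for illustration only.
gwas_hits = [
    ("variant_A", "chr15", 1_500),
    ("variant_B", "chr15", 9_000),
    ("variant_C", "chr11", 2_200),
]
selection_windows = [
    ("chr15", 1_000, 2_000),
    ("chr11", 5_000, 6_000),
]
candidate_loci = overlap_hits_with_windows(gwas_hits, selection_windows)
print(candidate_loci)
```

Variants returned this way are only candidates; as discussed later in the review, the overlap itself does not establish which variant is causal.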
Natural selection has shaped the frequency of genetic variants in human populations, and many statistical methods have been developed to identify its footprints in the genome (Fig. 1). Several reviews have described the methods for detecting selection signatures extensively (47–50), so we briefly mention the widely used techniques and their principles. These methods are mainly based on changes in the site frequency spectrum (Tajima’s D (51), Fay and Wu’s H (52)), changes in linkage disequilibrium around a positively selected site (LRH (53), LDD (54), IBD (55), iHS (56), Rsb (57), nSL (58)), genetic differentiation between populations (XP-CLR (45), FST (46), LSBL (59), PBS (60), XP-EHH (61), GRoSS (62)), their combinations or derivatives (CLR (63), CMS (64,65)) and changes in the density of singleton mutations (SDS) (66). Many of these methods have been used to detect significant selection signals near pigmentation genes in European, East Asian (13,15–20,67) and African populations (discussed below). Although many variants found to be associated with pigmentation from GWAS or identified from selection scans have been functionally validated, a large proportion of these variants have currently unknown functions (27).
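As a concrete example of a site-frequency-spectrum statistic, the sketch below computes Tajima's D from a small set of aligned haplotypes using the standard formula. The input encoding (strings of "0" for ancestral and "1" for derived alleles) and the function name are assumptions made for illustration, and the toy haplotypes are invented. Real analyses would normally use a dedicated population-genetics package and sliding windows across the genome.

```python
import math


def tajimas_d(haplotypes):
    """Tajima's D for equal-length haplotypes encoded as '0'/'1' strings.

    Returns None when there are no segregating sites (D is undefined).
    """
    n = len(haplotypes)
    if n < 4:
        raise ValueError("need at least 4 haplotypes for a non-zero variance")
    length = len(haplotypes[0])
    if any(len(h) != length for h in haplotypes):
        raise ValueError("haplotypes must have equal length")

    seg_sites = 0
    pi = 0.0  # mean number of pairwise differences
    for site in range(length):
        derived = sum(1 for h in haplotypes if h[site] == "1")
        if 0 < derived < n:
            seg_sites += 1
            pi += 2.0 * derived * (n - derived) / (n * (n - 1))
    if seg_sites == 0:
        return None

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / math.sqrt(variance)


# Toy data: an excess of rare variants versus intermediate-frequency variants.
rare_excess = ["1000", "0100", "0010", "0001"]
intermediate = ["1111", "1111", "0000", "0000"]
print(tajimas_d(rare_excess), tajimas_d(intermediate))
```

An excess of rare variants gives a negative D, the pattern expected after a selective sweep, whereas an excess of intermediate-frequency variants gives a positive D, the pattern expected under balancing selection. Demographic history can produce the same signs, which is one reason such statistics are usually compared with a genome-wide distribution.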
Evolutionary genetics of skin pigmentation in Africans
The majority of evolutionary genetic studies on skin pigmentation have been conducted in European and Asian populations (68,69). However, some recent studies have examined pigmentation in African populations, providing novel insights into the evolution of skin color. In this section, we summarize these African studies on a gene-by-gene basis.
SLC24A5
SLC24A5 encodes a cation exchanger in melanosomes and was first confirmed to affect skin pigmentation in zebrafish (21). A derived, nonsynonymous mutation (rs1426654 (A), Ala111Thr) in SLC24A5 associated with light skin color has swept to near fixation in Europeans due to positive selection (13,22,70,71). Recent studies show that this SLC24A5 allele is also associated with light skin in Africans. Crawford et al. (23) performed a GWAS on 1570 ethnically diverse Africans that spanned the full spectrum of skin pigment variation in Africa and found the strongest association at SLC24A5. The authors show that rs1426654 is common in East African populations with high levels of Afroasiatic ancestry (Table 1), and that rs1426654 likely introgressed into these populations from a Eurasian source >5 kya. Further, SLC24A5 likely experienced positive selection in East Africa after this admixture event. The authors also find that rs1426654 is at moderate frequency (5–12%) in the KhoeSan from Botswana, who have substantially lighter skin than equatorial Africans (Table 1).
Table 1. Annotations and population allele frequencies of pigmentation-associated variants identified in African populations
All population columns (AFR 1KG through Ethiopia Nilo-Saharan*) report allele frequencies.
SNP | Allele | Gene | Function | AFR 1KG | EUR 1KG | SAS 1KG | EAS 1KG | AMR 1KG | Botswana KhoeSan* | Ethiopia Afroasiatic* | Ethiopia Nilo-Saharan* |
---|---|---|---|---|---|---|---|---|---|---|---|
rs1426654 | A | SLC24A5 | Ala111Thr | 0.074 | 0.997 | 0.685 | 0.012 | 0.589 | 0.124 | 0.441 | 0.127 |
rs12913832 | A | HERC2 | Intronic | 0.972 | 0.364 | 0.929 | 0.998 | 0.798 | 0.995 | 0.979 | 1 |
rs1800404 | T | OCA2 | Synonymous | 0.127 | 0.786 | 0.314 | 0.386 | 0.579 | 0.766 | 0.32 | 0.069 |
rs4932620 | T | HERC2 | Intronic | 0.085 | 0.016 | 0.165 | 0.017 | 0.014 | 0.027 | 0.276 | 0.48 |
rs6497271 | G | HERC2 | Intronic | 0.374 | 0.984 | 0.837 | 0.979 | 0.947 | 0.773 | 0.572 | 0.196 |
rs7948623 | T | DDB1 | Intergenic | 0.266 | 0.006 | 0.127 | 0.003 | 0.03 | 0.058 | 0.417 | 0.716 |
rs2240751 | G | MFSD12 | Tyr182His | 0.002 | 0.01 | 0.007 | 0.269 | 0.167 | 0 | 0.001 | 0 |
rs56203814 | T | MFSD12 | Synonymous | 0.208 | 0.008 | 0 | 0 | 0.017 | 0.024 | 0.273 | 0.51 |
rs10424065 | T | MFSD12 | Intronic | 0.298 | 0.009 | 0 | 0.001 | 0.022 | 0.044 | 0.256 | 0.55 |
AFR (African), EUR (European), SAS (South Asian), EAS (East Asian) and AMR (Admixed American) refer to super population codes from the 1000 Genomes (1KG) Project. Allele frequencies for populations marked with an asterisk were calculated from 295 Botswana KhoeSan, 451 Ethiopian Afroasiatic and 51 Ethiopian Nilo-Saharan individuals as described in Crawford et al. (23).
Martin et al. (28) also examined the genetic architecture of skin pigmentation in two KhoeSan-speaking populations: the ‡Khomani San and Nama. By examining detailed skin color measurements in 465 KhoeSan people, the authors found that previously known pigmentation-associated loci explain only a small percentage of the overall skin color variation, suggesting many pigmentation loci remain to be discovered. Despite this, they identified a significant association at SLC24A5 and reported the frequency of rs1426654 at 40% in the combined Nama and ‡Khomani dataset, which has ~11% European admixture. Lin et al. (72) further examined the evolution of SLC24A5 in the Nama and ‡Khomani populations by performing haplotype analysis and demographic modeling, finding that SLC24A5 experienced strong positive selection after being introduced from a non-African source within the last ~2k years. Further, rs1426654 likely arrived in the KhoeSan indirectly through a southern pastoralist migration from East Africa (72).
OCA2/HERC2
The OCA2 gene impacts skin pigmentation by regulating melanosome pH (73,74), and mutations in this gene cause oculocutaneous albinism type II (OCA2) (75). HERC2 is adjacent to OCA2 on chromosome 15, and although HERC2 does not directly impact pigmentation, regulatory elements in HERC2 can influence expression of OCA2 (76). For example, multiple studies report that the intronic HERC2 mutation rs12913832 is associated with skin (77–79), hair (37,80) and eye (81–83) pigmentation in Eurasians (84) and is located in a melanocyte-specific enhancer that regulates OCA2 expression (76). Within Africans, several novel variants in this region associated with skin pigmentation have recently been described. The derived allele of a synonymous variant in OCA2 (rs1800404 (T)), associated with light pigmentation, is at high frequency (>70%) in Europeans and the KhoeSan from Botswana (Table 1) (13,23). This variant is associated with alternative splicing of OCA2 at exon 10, which may impact African pigmentation by influencing the amount of functional OCA2 protein produced. This OCA2 region also displays a significantly elevated Tajima’s D value in both Africans and Eurasians, suggesting long-term balancing selection at this locus. Finally, two novel noncoding variants (rs4932620 and rs6497271) in the introns of HERC2 were also recently identified by Crawford et al. (23). They found that the derived allele of rs4932620 (T) is significantly associated with dark skin pigmentation and at highest frequency in Ethiopian Nilo-Saharan populations, while the derived allele of rs6497271 (G) is associated with light skin pigmentation and is common in the KhoeSan from Botswana (Table 1) (23).
DDB1
The damage-specific DNA binding protein 1 (DDB1) gene functions in DNA repair after UV-induced damage and is associated with the disease xeroderma pigmentosum complementation group E (85). This gene, which underlies the fruit pigmentation phenotype in tomatoes (86), is located in a broad genomic region that was found to be significantly associated with skin color in a GWAS of an African-European admixed population (87). Variants in and around this gene were also significantly associated with African skin pigmentation in a GWAS of ethnically diverse African people from Tanzania, Botswana and Ethiopia (23). Specifically, three noncoding genome-wide significant variants (rs7948623, rs1377457, rs148172827) were shown to change the activity of enhancers near the DDB1 gene in melanoma cell lines. The derived allele of rs7948623 (T), associated with dark pigmentation, is at highest global frequency in East African populations with Nilo-Saharan ancestry, who inhabit high UVR environments and have darkly pigmented skin. This allele is also at moderate to high frequency in South Asians and Austro-Melanesians with dark skin, while the light pigmentation allele is nearly fixed in Europeans (Table 1). Finally, the FST and Tajima’s D statistics show that DDB1 is under differential selection in Africans versus non-Africans. Specifically, the alleles associated with light pigmentation have undergone a strong selective sweep in European and Asian populations (23).
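To make the idea of allele-frequency differentiation concrete, the sketch below computes a simple two-population FST from allele frequencies, using the Table 1 values for rs7948623 (T) in the EUR 1KG and Ethiopia Nilo-Saharan samples. This is a simplified heterozygosity-based formula with equal population weights and no sample-size correction; it is an illustrative assumption and not necessarily the estimator used by Crawford et al. (23). The function name is hypothetical.

```python
def fst_two_populations(p1, p2):
    """Simple FST = (HT - HS) / HT for one biallelic site in two populations.

    p1, p2: frequency of the same allele in each population.
    Populations are weighted equally; no correction for sample size.
    """
    p_mean = (p1 + p2) / 2.0
    h_total = 2.0 * p_mean * (1.0 - p_mean)  # expected heterozygosity, pooled
    if h_total == 0.0:
        return 0.0  # site is monomorphic in both populations
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_total - h_within) / h_total


# rs7948623 (T) frequencies taken from Table 1.
freq_eur = 0.006
freq_nilo_saharan = 0.716
fst_ddb1 = fst_two_populations(freq_eur, freq_nilo_saharan)
print(round(fst_ddb1, 3))
```

With these inputs the statistic is roughly 0.55, far from the value of 0 obtained when two populations share the same allele frequency. A single-site value like this is only suggestive; selection scans compare it with the genome-wide distribution.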
MFSD12
The above-mentioned Crawford et al. (23) study also identified a significant association with skin color in Africans at MFSD12, a gene that had not previously been characterized. Specifically, the derived alleles at the synonymous variant rs56203814 (T) and the intronic variant rs10424065 (T) are significantly associated with dark pigmentation and are at highest frequency in East Africans with Nilo-Saharan ancestry (Table 1). Silencing MFSD12 expression in mouse skin melanocytes resulted in an increase in eumelanin production and darker pigmentation. This observation is consistent with the finding that melanocyte MFSD12 expression levels are lower in people of African ancestry. Further, CRISPR-Cas9 knockouts of MFSD12 in Agouti mice result in decreased pheomelanin production (23). Recently, MFSD12 was shown to affect pigmentation by mediating the import of cysteine, a key precursor in pheomelanin production, into melanosomes (88). Increased MFSD12 expression also promotes melanoma cell proliferation and may be an important therapeutic target for melanoma (89). The fact that completely novel, functionally validated skin pigmentation loci were identified by Crawford et al. (23) using sample sizes that are small by today’s standards reinforces the need to include more ethnically diverse African populations in modern genetic studies. Variants at MFSD12 also have highly divergent allele frequencies between East Africans and Europeans as measured by FST, consistent with a signature of local adaptation (23). Finally, a novel nonsynonymous variant in MFSD12 (rs2240751 (G) Tyr182His) was recently reported to be associated with skin pigmentation in GWAS of Latin American (90), East Asian (91) and European populations (Table 1) (92).
MC1R
Some early studies of skin pigmentation genes focused on the melanocortin 1 receptor (MC1R), which controls the type of melanin produced (eumelanin or pheomelanin) by melanocytes (93). Harding et al. (94) examined nucleotide variation at MC1R in European, Asian and African populations, finding that nonsynonymous variants are largely absent in African populations. In contrast, European and Asian populations harbor more diversity at MC1R, with many segregating nonsynonymous variants present (94). These data suggest strong purifying selection acting at MC1R within Africa, where any deviation from the eumelanin-producing form of the gene, which leads to darker pigmentation, is strongly selected against. This functional constraint has been relaxed in other global populations, where UVR is lower and lighter skin is adaptive. However, another study identified a small number of nonsynonymous MC1R mutations present in lightly pigmented people from Southern Africa, suggesting this functional constraint may also have been relaxed in low UVR regions of Africa (95). Several recent GWAS confirmed that the MC1R locus is significantly associated with skin (96–98) or hair color (37,38,98,99) in European populations. No GWAS have reported significant associations with pigmentation at the MC1R locus in African populations thus far, likely due to a lack of variation at the locus in Africans.
TYRP1 and Other Genes
TYRP1 was the first gene found to be associated with oculocutaneous albinism in Africans, and the rs387906560 frameshift mutation (Lys368SerfsTer17) in TYRP1 causes oculocutaneous albinism 3 (OCA3 or rufous albinism) (100–102). This mutation is at a higher frequency in Africans (0.5%, gnomAD) than in non-Africans (0.017%, gnomAD). TYRP1 shows a signature of strong positive selection in a Senegalese population, as indicated by a significantly negative Tajima’s D value (103). In addition, tests of extended haplotype homozygosity (EHH), which measure levels of linkage disequilibrium around a positively selected locus, find three pigmentation loci (LYST, TP53BP1, RAD50) showing signatures of positive selection in the Yoruban population from Nigeria (104). Another analysis of an African American population used the composite likelihood ratio (CLR) test to identify signals of recent positive selection near KITLG and OCA2 (63,105).
From association to causation: fine-mapping and functional validation
Signals of selection do not provide information about which trait is under selection or what the driving force of selection is (17,106). Similarly, although GWAS hits are statistically associated with a trait, their biological effects are unclear. Most genome-wide selection scans report top loci based on an empirical distribution and assume that targets of selection are outliers in the distribution of summary statistics, which is not always true. Thus, these empirical statistics (including FST, Tajima’s D and iHS) have high false-positive and false-negative rates (106,107) and cannot accurately pinpoint the causal variants. Prioritizing candidate causal variants in GWAS can be accomplished by performing computational fine-mapping based on summary statistics (108,109). However, to identify causal variants from GWAS and selection scans, it is necessary to evaluate and validate their biological functions experimentally (107,110,111).
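The limitation of the empirical outlier approach can be illustrated with a short simulation. In the sketch below, every window score is drawn from the same neutral distribution, yet taking the top 1% of the empirical distribution still reports a fixed number of "outliers". The score distribution, the number of windows and the 1% cutoff are all arbitrary assumptions chosen for illustration, and the function name is hypothetical.

```python
import random


def empirical_outliers(scores, top_fraction=0.01):
    """Return indices of the highest-scoring windows in the empirical distribution."""
    n_top = max(1, round(len(scores) * top_fraction))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:n_top]


# Simulate 1000 genomic windows with NO selection: all scores are neutral noise.
random.seed(0)
neutral_scores = [random.gauss(0.0, 1.0) for _ in range(1000)]

outliers = empirical_outliers(neutral_scores, top_fraction=0.01)
print(len(outliers))  # 10 windows are flagged even though none is under selection
```

Because the cutoff is relative, the number of reported loci is set by the chosen fraction rather than by the number of true targets of selection: neutral windows are flagged when true targets are few, and true targets are missed when they are many or only weakly differentiated.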
Here, we use skin pigmentation as an example to illustrate how to prioritize and validate causal variants from GWAS or selection scans (Fig. 2). It is well known that melanocytes are the key cell type involved in skin pigmentation, and epidermal keratinocytes and fibroblasts play a secondary role (112,113). To study the genetic basis of human skin pigmentation, there are two levels of functional validation: (1) testing the role of candidate genes in skin pigmentation and (2) validating the effects of allelic variants on skin pigmentation. Testing the function of pigmentation-associated genes or coding variants is relatively straightforward. One could use CRISPR/Cas9-based tools to edit the coding region of a gene directly and then examine the effect on the pigmentation phenotype in cell lines or animal models. In addition, comparative genetics offers a powerful approach for predicting the function of genes or variants related to human skin pigmentation. For example, coding variants in MFSD12 (discussed above) are associated with coat color in dogs (114) and horses (115). Similarly, genetic studies of pigmentation in stickleback fish helped elucidate the function of a conserved regulatory variant in the human KITLG gene (116). However, validating the function of noncoding variants is more challenging, and comparative genetics is less informative for variants located in unconserved noncoding regions.

Figure 2. An integrative functional genomics approach for evolutionary studies. During the dispersal of modern humans, admixture between different populations, archaic introgression and natural selection have shaped the allele frequencies in human populations. The functional alleles (arising de novo or from standing variation) influence an individual’s phenotype (morphological or physiological) and fitness by modulating gene expression (noncoding variants) or changing the protein function (coding variants). The individuals with adaptive variants have a higher probability of survival or reproduction in certain environments (adaptation). The selection signals could be detected by genomic scans of selection and the causal variants could be identified and validated with experimental methods. GWAS could link the potential causal variants with phenotypes and eQTLs could be used to link variants with their target genes. However, both GWAS and eQTL analyses only indicate association (dashed lines), and experimental validation is necessary for identifying causal variants.
The majority of GWAS hits (~93%) and selection scan outliers (~90%) are located in noncoding regions (65,117–119). Noncoding variants in regulatory elements (REs) could impact the expression of target genes by altering the activities of enhancers or promoters, including some that act at long distances by chromatin looping (120,121). Overlapping candidate pigmentation variants with REs is an effective way to prioritize potential causal variants (122–124). However, the effects of non-coding causal variants can be highly cell-type specific. For example, rs12913832 in the HERC2 gene has been shown to be located in a melanocyte-specific enhancer that modulates nearby OCA2 expression by chromatin looping (76). Thus, to prioritize the functional variants related to pigmentation, one should overlap them with REs (e.g. annotations from the ENCODE (125) and/or RoadMap databases (126)) from pigmentation-related cells such as melanocytes, fibroblasts and keratinocytes. Another approach to localize potential functional variants is to overlap candidate pigmentation variants with expression quantitative trait loci (eQTL), which are genetic markers associated with the expression of one or more genes. However, one should be aware that about 50% of detected eQTL are cell-type specific and some African-specific variants may not be present in eQTL datasets such as GTEx (127–129).
To experimentally test the effects of regulatory variants in vitro, one could use luciferase reporter assays. With the development of massively parallel reporter assays (MPRA) (130), it is possible to screen thousands of candidate variants simultaneously in trait-related cells. Both methods are based on plasmid vectors in which a reporter gene is driven by the enhancer of interest carrying the candidate variant. Luciferase assays use the light output of luciferase as an indicator of enhancer activity, while MPRAs use read counts from high throughput sequencing as a measure of enhancer activity (130). Because enhancers have cell-type-dependent activity, it is important to perform these assays in trait-related cells (e.g. melanocytes for pigmentation studies). If the two alleles of a variant show different enhancer activity, the next step is to determine the target gene of the variant. Most studies link regulatory variants with their nearest gene based on genomic distance; however, this assumption is not always true. A recent multiplexed CRISPRi screen tested the target genes of 470 enhancers and found that only 67% of enhancers are linked to their most proximal gene (131). Two methods could refine the potential target genes: chromatin interaction capture-based approaches, which capture the interactions between regulatory elements (132–134), and CRISPR/Cas9 perturbations (131,135). The final step is to use CRISPR/Cas9 tools to modify the variant in cell lines (e.g. melanocytes for skin pigmentation) or animal models, and then compare the phenotypic effects of the two alleles (136).
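The comparison of enhancer activity between two alleles in a read-count-based assay can be sketched as follows. The sketch assumes that activity is summarized as reporter RNA reads divided by plasmid DNA reads for each allele, which is a common normalization but is an assumption here rather than a detail given above; the function name and all counts are hypothetical. Real MPRA analyses use replicates and a statistical model rather than a single ratio.

```python
import math


def allelic_log2_ratio(ref_rna, ref_dna, alt_rna, alt_dna):
    """log2 of (alt activity / ref activity), with activity = RNA reads / DNA reads."""
    if min(ref_rna, ref_dna, alt_rna, alt_dna) <= 0:
        raise ValueError("all read counts must be positive")
    ref_activity = ref_rna / ref_dna
    alt_activity = alt_rna / alt_dna
    return math.log2(alt_activity / ref_activity)


# Hypothetical read counts for one candidate variant in one cell type.
effect = allelic_log2_ratio(ref_rna=100, ref_dna=100, alt_rna=200, alt_dna=100)
print(effect)
```

A value near 0 indicates similar activity for the two alleles, while positive or negative values indicate that the alternative allele increases or decreases reporter expression in the tested cell type.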
Taken together, there is an urgent need to validate candidate pigmentation variants discovered using GWAS and scans of natural selection, using functional experiments in vitro and in vivo. This will help to characterize the true causal variants underlying pigmentation variation, deepen our understanding of the selective processes acting on this trait, and provide guidance for the treatment of human skin diseases.
Conclusions and Future Directions
Skin pigmentation is a classic example of an adaptive trait and was historically thought to be determined by a small number of large-effect variants (13–22,67). However, recent GWAS in Africans and other populations have discovered multiple novel loci related to the evolution of human skin color, and at the same time revealed the complexity of this phenotype (23,28). The broad range of skin color in ethnically diverse Africans, which cannot be explained by previously identified large-effect alleles from Eurasian populations, indicates that African skin color is regulated by many other unknown alleles (23,28). As current African GWAS typically contain small sample sizes, the inclusion of more diverse populations from more variable African environments is needed in order to discover novel skin pigmentation loci.
One promising avenue of future research in skin pigmentation genomics is to utilize the increasing availability of ancient DNA, which has had an important impact on our understanding of skin pigmentation evolution. Recent studies examining ancient DNA samples from the last ~40k years in Europe and Asia find strong signatures of recent positive selection acting at loci underlying light skin color, including at SLC45A2, SLC24A5, TYR and HERC2 (14,137,138). While environmental conditions in Africa are not conducive to ancient DNA preservation, recent technical advances resulting in the generation of more ancient African genomes may improve our understanding of the evolutionary history of skin pigmentation in Africa (139).
Another avenue of future research is to examine the impact of structural genetic variation (SV) on skin pigmentation. Structural variants are polymorphisms larger than 50 bp, including insertions, deletions, inversions, duplications and translocations, which are difficult to detect using standard short-read sequencing methods (140,141). The recent development of high-fidelity long read sequencing technology has opened the door for SV analysis in human genomics (142). We anticipate that these long-read technologies will help to reveal the role of structural variants in skin pigmentation evolution and biology.
In conclusion, to decipher the full genetic map of skin pigmentation in Africa, it is necessary to include more African populations in genomic studies, develop novel methods to detect selection on standing genetic variation and validate candidate variants with functional experiments. Furthermore, analyzing ancient genomes, exploring the role of structural variation and combining all of these together in an integrated way will deepen our understanding of human phenotypic evolution.
Conflict of Interest statement. None declared.
Funding
National Institutes of Health (R35GM134957, R01AR076241-01A1 to S.A.T.); the Center of Excellence in Environmental Toxicology (CEET) T32 (T32ES019851 to M.A.M.).
References
Rinchik, E.M., Bultman, S.J., Horsthemke, B., Lee, S.T., Strunk, K.M., Spritz, R.A., Avidano, K.M., Jong, M.T. and Nicholls, R.D. (1993) A gene for the mouse pink-eyed dilution locus and for human type II oculocutaneous albinism.
Adelmann, C.H., Traunbauer, A.K., Chen, B., Condon, K.J., Chan, S.H., Kunchok, T., Lewis, C.A. and Sabatini, D.M. (2020) MFSD12 mediates the import of cysteine into melanosomes and lysosomes.
Shin, J.G., Leem, S., Kim, B., Kim, Y., Lee, S.G., Song, H.J., Seo, J.Y., Park, S.G., Won, H.H. and Kang, N.G. (2020) GWAS Analysis of 17,019 Korean Women Identifies the Variants Associated with Facial Pigmented Spots.
Ju, D. and Mathieson, I. (2021) The evolution of skin pigmentation-associated variation in West Eurasia.
Author notes
Yuanqing Feng and Michael A. McQuillan contributed equally.