Abstract

When the ancestors of modern Eurasians migrated out of Africa and interbred with Eurasian archaic hominins, namely, Neanderthals and Denisovans, DNA of archaic ancestry integrated into the genomes of anatomically modern humans. This process potentially accelerated adaptation to Eurasian environmental factors, including reduced ultraviolet radiation and increased variation in seasonal dynamics. However, whether these groups differed substantially in circadian biology and whether archaic introgression adaptively contributed to human chronotypes remain unknown. Here, we traced the evolution of chronotype based on genomes from archaic hominins and present-day humans. First, we inferred differences in circadian gene sequences, splicing, and regulation between archaic hominins and modern humans. We identified 28 circadian genes containing variants with potential to alter splicing in archaics (e.g., CLOCK, PER2, RORB, and RORC) and 16 circadian genes likely divergently regulated between present-day humans and archaic hominins, including RORA. These differences suggest the potential for introgression to modify circadian gene expression. Testing this hypothesis, we found that introgressed variants are enriched among expression quantitative trait loci for circadian genes. Supporting the functional relevance of these regulatory effects, we found that many introgressed alleles have associations with chronotype. Strikingly, the strongest introgressed effects on chronotype increase morningness, consistent with adaptations to high latitude in other species. Finally, we identified several circadian loci with evidence of adaptive introgression or latitudinal clines in allele frequency. These findings identify differences in circadian gene regulation between modern humans and archaic hominins and support the contribution of introgression via coordinated effects on variation in human chronotype.

Significance

Interbreeding between humans and Neanderthals created the potential for adaptive introgression as humans moved into environments that had been populated by Neanderthals for hundreds of thousands of years. Here, we discover lineage-specific genetic differences in circadian genes and their regulatory elements between humans and Neanderthals. We show that introgressed alleles are enriched for effects on circadian gene regulation, consistently increase the propensity for morningness in Europeans, and show evidence of adaptive introgression or associations between latitude and frequency. These results expand our understanding of how the genomes of humans and our closest relatives responded to environments with different light/dark cycles and demonstrate a coordinated contribution of admixture to human chronotype in a direction that is consistent with adaptation to higher latitudes.

Introduction

All anatomically modern humans (AMHs) trace their origin to the African continent around 300 ka (Stringer 2016; Hublin et al. 2017), where environmental factors shaped many of their biological features. Approximately 70 ka (Bae et al. 2017), the ancestors of modern Eurasian AMH began to migrate out of Africa, where they were exposed to diverse new environments. In Eurasia, the novel environmental factors included greater seasonal variation in temperature and photoperiod.

Changes in the pattern and level of light exposure have biological and behavioral consequences in organisms. For example, Drosophila melanogaster that is native to Europe harbors a polymorphism in “timeless,” a key gene in the light response of the circadian system, that follows a latitudinal cline in allele frequency (Sandrelli et al. 2007; Tauber et al. 2007). The ancestral haplotype produces a short TIM (S-TIM) protein that is sensitive to degradation by light because of its strong affinity to cryptochromes (CRYs), photoreceptor proteins involved in the entrainment of the circadian clock. Insertion of a G nucleotide in the 5′ coding region of the gene originated approximately 10 kyr in Europe and created a start codon that produces a new long TIM (L-TIM) isoform. The L-TIM variant has a lower affinity to CRY, creating a change in photosensitivity and altering the length of the period. L-TIM flies are at a higher frequency in southern Europe, while S-TIM flies are more prevalent in northern Europe. Another example is found in pacific salmon. Chinook salmon (Oncorhynchus tshawytscha) populations show a latitudinal cline in the frequency and length of repeat motifs in the gene OtsClock1b, strongly suggesting that this locus is under selection associated with latitude and photoperiod (O’Malley and Banks 2008; O’Malley et al. 2010). The evolution of circadian adaptation to diverse environments has also been widely studied in insects, plants (Michael et al. 2003; Zhang et al. 2008), and fishes, but it is understudied in humans. Adaptive processes could have helped to align human biology and chronotype to new natural conditions.

Previous studies in humans found a correlation between latitude and chronotype (morningness vs. eveningness) variation (Leocadio-Miguel et al. 2017; Randler and Rahafar 2017; Lowden et al. 2018) and a latitudinal cline in some circadian allele frequencies (Dorokhov et al. 2018; Putilov et al. 2018, 2019), highlighting the contribution of the environment to behavior and circadian biology. Many human health effects are linked to the misalignment of chronotype (Knutson and von Schantz 2018), including cancer, obesity (Gyarmati et al. 2016; Papantoniou et al. 2016, 2017; Gan et al. 2018; Shi et al. 2020; Yousef et al. 2020), and diabetes (Gan et al. 2015; Larcher et al. 2015, 2016). There is also evidence of a correlation between evening chronotype and mood disorders, most notably seasonal affective disorder, depression, and worsening of bipolar disorder episodes (Srinivasan et al. 2006; Kivelä et al. 2018; Taylor and Hasler 2018). Thus, we hypothesize that the differences in geography and environment encountered by early AMH populations moving into higher latitudes created potential for circadian misalignment and health risk.

Although AMHs arrived in Eurasia ∼70 ka, other hominins (e.g., Neanderthals and Denisovans) lived there for more than 400 ka (Arnold et al. 2014; Meyer et al. 2014, 2016). These archaic hominins diverged from AMHs around 700 ka (Meyer et al. 2012; Prüfer et al. 2014, 2017; Nielsen et al. 2017; Gómez-Robles 2019; Mafessoni et al. 2020), and as a result, the ancestors of AMHs and archaic hominins evolved under different environmental conditions. While there was substantial variation in the latitudinal ranges of each group, the Eurasian hominins largely lived at consistently higher latitudes and, thus, were exposed to higher amplitude seasonal variation in photoperiods. Given the influence of environmental cues on circadian biology, we hypothesized that these separate evolutionary histories produced differences in circadian traits adapted to the distinct environments.

When AMH migrated into Eurasia, they interbred with the archaic hominins that were native to the continent, initially with Neanderthals (Green et al. 2010; Villanea and Schraiber 2019) around 60 ka (Sankararaman et al. 2012; Skoglund and Mathieson 2018) and later with Denisovans (Jacobs et al. 2019). Due to this, a substantial fraction (>40%) of the archaic variation remains in present-day Eurasians (Vernot and Akey 2014; Skov et al. 2020), although each human individual carries only ∼2% DNA of archaic ancestry (Vernot et al. 2016; Prüfer et al. 2017). Most of the archaic ancestry in AMH was subject to strong negative selection, but some of these introgressed alleles remaining in AMH populations show evidence of adaptation (Racimo et al. 2015; Gower et al. 2021). For example, archaic alleles have been associated with differences in hemoglobin levels at higher altitude in Tibetans, immune resistance to new pathogens, levels of skin pigmentation, and fat composition (Huerta-Sánchez et al. 2014; Racimo et al. 2015; Racimo, Gokhman, et al. 2017; Dannemann and Kelso 2017; Racimo, Marnetto, and Huerta-Sánchez 2017; McArthur et al. 2021). Previous work also suggests that introgressed alleles could have adaptively influenced human chronotype. First, a phenome-wide association study in the UK Biobank found loci near ASB1 and EXOC6 with introgressed variants that were significantly associated with self-reported sleeping patterns (Dannemann and Kelso 2017). One of these alleles showed a significant association between frequency and latitude. Second, summarizing effects genome-wide, introgressed alleles are also moderately enriched for heritability of chronotype compared to nonintrogressed alleles (McArthur et al. 2021). These results suggest a potential role for introgressed alleles in adaptation to pressures stemming from migration to higher latitudes.

Motivated by the potential for a role of archaic introgression in AMH circadian variation, we explore two related questions: 1) Can comparative genomic analysis identify differences in AMH and archaic hominin circadian biology? 2) Do introgressed archaic alleles influence human circadian biology? Understanding the ancient history and evolution of chronotypes in humans will shed light on human adaptation to high latitudes and provide context for the genetic basis for the modern misalignment caused by the development of technology and night shiftwork.

Results

Did Archaic Hominins and Modern Humans Diverge in Circadian Biology?

Following divergence ∼700 ka (Nielsen et al. 2017; Gómez-Robles 2019), archaic hominins and AMH were geographically isolated, resulting in the accumulation of lineage-specific genetic variation and phenotypes (fig. 1). In the next several sections, we evaluate the genomic evidence for divergence in circadian biology between archaic hominin and modern human genomes.

Did the sharing of functionally diverged alleles from archaic hominins influence human circadian biology? (A) AMHs and archaic hominins evolved separately at different latitudes for hundreds of thousands of years. The ancestors of modern Eurasian humans left Africa approximately 70 ka and admixed with archaics, likely in southwestern Asia. The shaded range in Eurasia represents the approximate Neanderthal range. The dot in Asia represents the location of the sequenced Denisovan individual in the Altai Mountains; the full range of Denisovans is currently unknown. Silhouettes from phylopic.org. (B) After the split between the human and archaic lineages, each group accumulated variation and evolved in their respective environments for approximately 700 ka. We first test for evidence for divergent circadian evolution during this time. Humans acquired introgressed alleles from Neanderthals and from Denisovans around 60 and 45 ka, respectively. These alleles experienced strong selective pressures; however, ∼40% of the genome retains archaic ancestry in some modern populations. The second question we explore is whether introgression made contributions to human circadian biology. (C) The core circadian clock machinery is composed of several transcription factors (ovals) that dimerize and interact with E-box enhancer elements and each other to create a negative feedback loop. We defined a set of 246 circadian genes through a combination of literature search, expert knowledge, and existing annotations (supplementary table S1 and fig. S1, Supplementary Material online; Materials and Methods). Lines with arrows represent activation, and lines with bars represent suppression.
Fig. 1.

Did the sharing of functionally diverged alleles from archaic hominins influence human circadian biology? (A) AMHs and archaic hominins evolved separately at different latitudes for hundreds of thousands of years. The ancestors of modern Eurasian humans left Africa approximately 70 ka and admixed with archaics, likely in southwestern Asia. The shaded range in Eurasia represents the approximate Neanderthal range. The dot in Asia represents the location of the sequenced Denisovan individual in the Altai Mountains; the full range of Denisovans is currently unknown. Silhouettes from phylopic.org. (B) After the split between the human and archaic lineages, each group accumulated variation and evolved in their respective environments for approximately 700 ka. We first test for evidence for divergent circadian evolution during this time. Humans acquired introgressed alleles from Neanderthals and from Denisovans around 60 and 45 ka, respectively. These alleles experienced strong selective pressures; however, ∼40% of the genome retains archaic ancestry in some modern populations. The second question we explore is whether introgression made contributions to human circadian biology. (C) The core circadian clock machinery is composed of several transcription factors (ovals) that dimerize and interact with E-box enhancer elements and each other to create a negative feedback loop. We defined a set of 246 circadian genes through a combination of literature search, expert knowledge, and existing annotations (supplementary table S1 and fig. S1, Supplementary Material online; Materials and Methods). Lines with arrows represent activation, and lines with bars represent suppression.

Identifying Archaic–Hominin-Specific Circadian Gene Variation

With the sequencing of several genomes of archaic hominins, we now have a growing, but incomplete, catalog of genetic differences specific to modern and archaic lineages. Following recent work (Kuhlwilm and Boeckx 2019), we defined archaic-specific variants as genomic positions where archaic hominins (Altai Neanderthal, Vindija Neanderthal, and Denisovan) all have the derived allele while, in humans, the derived allele is absent or present at such an extremely low frequency that it is likely an independent occurrence. We defined human-specific variants as positions where all individuals in the 1000 Genomes Project carry the derived allele and all the archaics carry the ancestral allele.

We evaluated archaic-specific variants for their ability to influence proteins, splicing, and regulation of 246 circadian genes (Materials and Methods). The circadian genes were identified by a combination of literature search, expert knowledge, and existing annotations (supplementary table S1 and fig. S1, Supplementary Material online; Materials and Methods). The core circadian clock machinery is composed of a dimer between the CLOCK and ARNTL (BMAL1) transcription factors, which binds to E-box enhancer elements and activates the expression of the Period (PER1/2/3) and Cryptochrome (CRY1/2) genes (fig. 1C). PERs and CRYs form heterodimers that inhibit the positive drive of the CLOCK-BMAL1 dimer on E-boxes, inhibiting their own transcription in a negative feedback loop. CLOCK-BMAL1 also drives the expression of many other clock-controlled genes, including nuclear receptor subfamily 1 group D members 1 and 2 (NR1D1/2), RAR-related orphan receptors A and B (RORA/B), and D-Box-binding PAR BZIP transcription factor. ROR and REV-ERB are transcriptional regulators of BMAL1. CK1 binds to the PER/CRY heterodimer, phosphorylating PER and regulating its degradation. Similarly, FBXL3 marks CRY for degradation. Beyond the core clock genes, we included other upstream and downstream genes that are involved in the maintenance and response of the clock (supplementary table S1 and fig. S1, Supplementary Material online).

We identified 1,136 archaic-specific variants in circadian genes, promoters, and distal candidate cis-regulatory elements (cCREs). The circadian genes with the most archaic-specific variants are CLDN4, NAMPT, LRPPRC, ATF4, and AHCY (125, 112, 110, 104, and 102, respectively) (supplementary table S2, Supplementary Material online).

Fixed Human- and Archaic-Specific Variants Are Enriched in Circadian Genes and Associated Regulatory Elements

After the archaic and AMH lineages diverged, each accumulated genetic variation specific to each group. Variants fixed in each lineage are likely to be enriched in genomic regions that influence traits that experienced positive selection. We tested whether human- and archaic-specific fixed variants are enriched compared with other variants in circadian genes and their promoters and in annotated cCREs within 1 Mb (fig. 2). We found that human- and archaic-specific fixed variants are enriched in circadian genes (Fisher's exact test; human: odds ratio [OR] = 1.84, P = 7.06e−12; archaic: OR = 1.13, P = 0.023) and distal regulatory elements (Fisher's exact test; human: OR = 1.25, P = 8.39e−4; archaic: OR = 1.16, P = 6.15e−5) compared with variants derived on each lineage but not fixed. Promoter regions have a similar enrichment pattern as that in gene and regulatory regions, but the P-values are high (Fisher's exact test; human: OR = 1.21, P = 0.65; archaic: OR = 1.09, P = 0.63). This is likely due to the small number of such variants in promoters. These results suggest that both groups had a greater functional divergence in genomic regions related to circadian biology than expected.

Human- and archaic-specific fixed variants are enriched in circadian regulatory, promoter, and gene regions. Human-specific fixed variants are significantly enriched compared with variants that are not fixed in circadian regulatory elements (Fisher's exact: OR = 1.25, P = 8.39e−4) and gene regions (Fisher's exact: OR = 1.84, P = 7.06e−12). Promoters show a similar enrichment, but the higher P-value is the result of the small number of variants (Fisher's exact test: OR = 1.21, P = 0.65). Likewise, archaic-specific variants are enriched in circadian regulatory regions (Fisher's exact: OR = 1.16, P = 6.15e−5) and gene regions (Fisher's exact: OR = 1.13, P = 0.023), with the promoters showing a similar trend (Fisher's exact test: OR = 1.09, P = 0.63). The numbers in parentheses give the counts of fixed variants observed in each type of element. Regulatory elements were defined based on the ENCODE cCREs.
Fig. 2.

Human- and archaic-specific fixed variants are enriched in circadian regulatory, promoter, and gene regions. Human-specific fixed variants are significantly enriched compared with variants that are not fixed in circadian regulatory elements (Fisher's exact: OR = 1.25, P = 8.39e−4) and gene regions (Fisher's exact: OR = 1.84, P = 7.06e−12). Promoters show a similar enrichment, but the higher P-value is the result of the small number of variants (Fisher's exact test: OR = 1.21, P = 0.65). Likewise, archaic-specific variants are enriched in circadian regulatory regions (Fisher's exact: OR = 1.16, P = 6.15e−5) and gene regions (Fisher's exact: OR = 1.13, P = 0.023), with the promoters showing a similar trend (Fisher's exact test: OR = 1.09, P = 0.63). The numbers in parentheses give the counts of fixed variants observed in each type of element. Regulatory elements were defined based on the ENCODE cCREs.

Several Core Circadian Genes Have Evidence of Alternative Splicing between Humans and Archaic Hominins

We find only two archaic-specific coding variants in circadian genes: one missense and one synonymous. The missense variant (hg19: chr17_46923411_A_G) is in the gene CALCOCO2, calcium-binding and coiled-coil domain-containing protein 2. SIFT, PolyPhen, and CADD all predict that the variant does not have damaging effects. The second variant (hg19: chr7_119914770_G_T) is in the gene KCND2, which encodes a component of a voltage-gated potassium channel that contributes to the regulation of the circadian rhythm of action potential firing, but it is synonymous and the variant effect predictors suggest it is tolerated.

To explore potential splicing differences in circadian genes between humans and archaics, we applied SpliceAI to predict whether any sequence differences between modern humans and archaics are likely to modify splicing patterns. Four archaic individuals were included in this analysis (the Altai, the Vindija, the Chagyrskaya Neanderthals, and the Altai Denisovan) (Meyer et al. 2012; Prüfer et al. 2014, 2017; Mafessoni et al. 2020). We found that 28 genes contained at least one archaic-specific variant predicted to result in alternative splicing in archaics. These included several of the core clock genes CLOCK, PER2, RORB, RORC, and FBXL13 (fig. 3A and C; supplementary table S3, Supplementary Material online). For example, the variant (hg19: chr2:239187088–239187089) in the first intron of PER2 is predicted to result in a longer 5′ UTR. The splice-altering variants were largely specific to the two different archaic lineages (fig. 3A), with 13 specific to the Denisovan, eight shared among the three Neanderthals, and only one shared among all four archaic individuals.

Many circadian genes have evidence of alternative splicing and divergent regulation between modern and archaic hominins. (A) The distribution of the 28 predicted archaic-specific splice-altering variants (SAVs) in circadian genes across archaic individuals. Most are specific to either the Denisovan or Neanderthal lineage (supplementary table S3, Supplementary Material online). (B) The sharing of predicted DR gene/tissue pairs across three archaic individuals. Predictions were not available for the Chagyrskaya Neanderthal. Seventeen DR gene/tissue pairs were present in all three archaics (representing 16 unique genes). Additionally, seven gene/tissue DR pairs are shared between the Altai Neanderthal and the Denisovan individual. One pair is shared between the Vindija Neanderthal and the Denisovan (supplementary table S4, Supplementary Material online). (C) The proportion of circadian genes containing archaic SAVs predicted by SpliceAI (11.4%) or DR circadian genes predicted by PrediXcan (6.5%). Thus, 17.9% of the circadian genes are predicted to contain differences to AMH via these mechanisms.
Fig. 3.

Many circadian genes have evidence of alternative splicing and divergent regulation between modern and archaic hominins. (A) The distribution of the 28 predicted archaic-specific splice-altering variants (SAVs) in circadian genes across archaic individuals. Most are specific to either the Denisovan or Neanderthal lineage (supplementary table S3, Supplementary Material online). (B) The sharing of predicted DR gene/tissue pairs across three archaic individuals. Predictions were not available for the Chagyrskaya Neanderthal. Seventeen DR gene/tissue pairs were present in all three archaics (representing 16 unique genes). Additionally, seven gene/tissue DR pairs are shared between the Altai Neanderthal and the Denisovan individual. One pair is shared between the Vindija Neanderthal and the Denisovan (supplementary table S4, Supplementary Material online). (C) The proportion of circadian genes containing archaic SAVs predicted by SpliceAI (11.4%) or DR circadian genes predicted by PrediXcan (6.5%). Thus, 17.9% of the circadian genes are predicted to contain differences to AMH via these mechanisms.

Circadian Gene Regulatory Divergence between Humans and Archaic Hominins

Given the enrichment of variants in regulatory regions of circadian genes, we sought to explore the potential for differences in circadian gene regulation between humans and archaics with causes beyond single lineage-specific variants. We leveraged an approach we recently developed for predicting gene regulatory differences between modern and archaic individuals from combinations of genetic variants (Colbran et al. 2019). The approach uses PrediXcan, an elastic net regression method, to impute gene transcript levels in specific tissues from genetic variation. Previous work demonstrated that this approach has a modest decrease in performance when applied to Neanderthals, but that it can accurately applied between humans and Neanderthals for thousands of genes. Here, we quantify differences in the predicted regulation of the 246 circadian genes between 2,504 humans in the 1000 Genomes Project (1000 Genomes Project Consortium 2010) and the archaic hominins. The predicted regulation values are normalized to the distribution in the training set from the Genotype Tissue Expression Atlas (GTEx).

We first analyzed gene regulation predictions in the core circadian clock genes. Archaic gene regulation was at the extremes of the human distribution for many core clock genes including PER2, CRY1, NPAS2, RORA, and NR1D1 (fig. 4; supplementary fig. S2, Supplementary Material online). For example, the regulation of PER2 in the Altai and Vindija Neanderthals is lower than 2,491 of the 2,504 (99.48%) modern humans considered. The Denisovan has a predicted PER2 regulation that is lower than 2,410 (96.25%). Expanding to all circadian genes and requiring archaic regulation to be more extreme than all humans (Materials and Methods), we identified 24 circadian genes across 23 tissues with strong divergent regulation between humans and at least one archaic hominin (fig. 3B; supplementary table S4, Supplementary Material online). For example, all archaic regulation values for RORA, a core clock gene, are lower than for any of the 2,504 modern humans. We found that 16 of these genes (supplementary fig. S3 and table S4, Supplementary Material online), including RORA, MYBBP1A, and TIMELESS, were divergently regulated (DR) in all archaic individuals. This represents 6.5% of all the circadian genes (fig. 3C). Surprisingly, the two Neanderthals only shared one DR gene not found in the Denisovan, while the Altai Neanderthal and Denisovan shared seven not found in Vindija (fig. 3B). The Altai and Vindija Neanderthals represent deeply diverging lineages, and this result suggests that they may have experienced different patterns of divergence in the regulation of their circadian genes.

Many circadian genes are divergently regulated between modern humans and archaic hominins. Comparison of the imputed regulation of core circadian genes between 2,504 humans in 1000 Genomes Phase 3 (histogram) and three archaic individuals (vertical lines). For each core circadian gene, the tissue with the lowest average P-value for archaic difference from humans is plotted. Archaic gene regulation is at the extremes of the human distribution for several core genes: CRY1, PER2, NPAS2, NR1D1, and RORA. See supplementary figure S2, Supplementary Material online for all core clock genes and supplementary figure S3, Supplementary Material online for all DR circadian genes.
Fig. 4.

Many circadian genes are divergently regulated between modern humans and archaic hominins. Comparison of the imputed regulation of core circadian genes between 2,504 humans in 1000 Genomes Phase 3 (histogram) and three archaic individuals (vertical lines). For each core circadian gene, the tissue with the lowest average P-value for archaic difference from humans is plotted. Archaic gene regulation is at the extremes of the human distribution for several core genes: CRY1, PER2, NPAS2, NR1D1, and RORA. See supplementary figure S2, Supplementary Material online for all core clock genes and supplementary figure S3, Supplementary Material online for all DR circadian genes.

Given these differences in circadian gene regulation between humans and archaics, we tested whether circadian genes are more likely to be DR than other gene sets. Each archaic individual shows nominal enrichment for divergent regulation of circadian genes, and the enrichment was stronger (∼1.2×) in the Altai Neanderthal and Denisovan individuals. However, given the small sample size, the P-values are moderate (permutation test; Altai: OR = 1.21, P = 0.19; Vindija: OR = 1.05, P = 0.43; Denisovan: OR = 1.20, P = 0.24).

Did Introgressed Archaic Variants Influence Modern Human Circadian Biology?

The previous sections demonstrate lineage-specific genetic variation in many genes and regulatory elements essential to the function of the core circadian clock and related pathways. Given this evidence of functional differences between archaic hominins and AMH in these systems, we next evaluated the influence of archaic introgression on AMH circadian biology.

Introgressed Variants Are Enriched in Circadian Gene eQTL

Given the differences between archaic and modern sequences of circadian genes and their regulatory elements, we investigated whether Neanderthal introgression contributed functional circadian variants to modern Eurasian populations. We considered a set of 863,539 variants with evidence of being introgressed from archaic hominins to AMH (Browning et al. 2018). These variants were identified using the Sprime algorithm, which searches for regions containing a high density of alleles in common with Neanderthals and not present or at very low frequency in Africans. Since many approaches have been developed to identify introgressed variants, we also considered two stricter sets: 47,055 variants that were supported by all of six different introgression maps (Sankararaman et al. 2014; Vernot et al. 2016; Browning et al. 2018; Steinrücken et al. 2018; Skov et al. 2020; Schaefer et al. 2021) and 755,653 variants that were supported by Sprime and at least one other introgression map. As described below, our main results replicated on both of these stricter sets.

We first tested whether the presence of introgressed variants across modern individuals is associated with the expression levels of any circadian genes, that is, whether the introgressed variants are expression quantitative trait loci (eQTL). We identified 3,857 introgressed variants associated with the regulation of circadian genes in modern non-Africans (supplementary table S5, Supplementary Material online). The genes PTPRJ, HTR1B, NR1D2, CLOCK, and ATOH7 had the most eQTL (304, 273, 262, 256, and 252, respectively). We found introgressed circadian eQTL for genes expressed in all tissues in GTEx, except the kidney cortex. Notably, several of these circadian genes (e.g., NR1D2 and CLOCK) with introgressed eQTL were also found to be DR in our comparison of modern and archaic gene regulation. This indicates that some of the archaic-derived variants that contributed to divergent regulation were retained after introgression and continue to influence circadian regulation in modern humans.

Introgressed variants are significantly more likely to be eQTL for circadian genes than expected by chance from comparison to all eQTL (fig. 5A; Fisher's exact test: OR = 1.45, P = 9.71e−101). The stricter set of introgressed variants identified by Browning et al. plus at least one other introgression map had similar levels of eQTL enrichment for circadian genes (OR = 1.47; P = 2.4e−103). The highest confidence set of introgressed variants that were identified by all six maps considered had even stronger enrichment (OR = 1.68; P = 6.5e−23).

Most core circadian genes are expressed broadly across tissues; the fraction expressed in each GTEx tissue ranges from 57% (whole blood) to 83% in testis and an average of 72% (supplementary table S6, Supplementary Material online). As a result, we anticipated that the enrichment of introgressed variants among eQTLs for circadian genes would hold across tissues. Examining the associations in each tissue, we found that introgressed eQTL showed significant enrichment for circadian genes in most tissues (34 of 49; fig. 5B; supplementary table S7, Supplementary Material online) and trended this way in all but five. Given that tissues in GTEx have substantial differences in sample size and cellular heterogeneity, statistical power to detect enrichment differs. We anticipate that this is the main driver of differences in enrichment across tissues.

Circadian genes are enriched for introgressed eQTL. (A) Archaic introgressed variants are more likely to be eQTL for circadian genes in GTEx than for noncircadian genes (Fisher's exact test: OR = 1.45, P = 9.71e−101). The ellipse at the top represents the set of introgressed variants, and the ellipse at the bottom represents the set of circadian variants. A total of 3,857 are introgressed eQTL in circadian genes. The rectangle represents the universe of all GTEx eQTLs lifted over to hg19. The overlaps are not to scale. (B) The enrichment for circadian genes among the targets of introgressed eQTLs in each GTEx tissue. Introgressed eQTL in most tissues show significant enrichment for circadian genes (Fisher's exact test; supplementary table S7, Supplementary Material online). Kidney cortex did not have any circadian introgressed eQTLs and thus is not shown. Numbers inside the parenthesis indicate the count of variants in each tissue. Light bars indicate a lack of statistical significance; mid-tone bars indicate nominal significance (P ≤ 0.05); and dark bars indicate significance at the 0.05 level after Bonferroni multiple testing correction (P ≤ 0.00102).
Fig. 5.

Circadian genes are enriched for introgressed eQTL. (A) Archaic introgressed variants are more likely to be eQTL for circadian genes in GTEx than for noncircadian genes (Fisher's exact test: OR = 1.45, P = 9.71e−101). The ellipse at the top represents the set of introgressed variants, and the ellipse at the bottom represents the set of circadian variants. A total of 3,857 are introgressed eQTL in circadian genes. The rectangle represents the universe of all GTEx eQTLs lifted over to hg19. The overlaps are not to scale. (B) The enrichment for circadian genes among the targets of introgressed eQTLs in each GTEx tissue. Introgressed eQTL in most tissues show significant enrichment for circadian genes (Fisher's exact test; supplementary table S7, Supplementary Material online). Kidney cortex did not have any circadian introgressed eQTLs and thus is not shown. Numbers inside the parenthesis indicate the count of variants in each tissue. Light bars indicate a lack of statistical significance; mid-tone bars indicate nominal significance (P ≤ 0.05); and dark bars indicate significance at the 0.05 level after Bonferroni multiple testing correction (P ≤ 0.00102).

These results suggest that circadian pressures were widespread across tissues. Given the previously observed depletion for introgressed variants in regulatory elements and eQTL (Petr et al. 2019; Rinker et al. 2020; Telis et al. 2020), this enrichment for circadian genes among introgressed eQTL is surprising and suggests that the archaic circadian alleles could have been beneficial after introgression.

Introgressed Variants Predominantly Increase Propensity for Morningness

After observing that circadian gene expression is influenced by archaic variants, we evaluated whether these effects are likely to result in a change in organism-level phenotype. To do this, we evaluated evidence that introgressed variants influence chronotype. The heritability of chronotype has been estimated in a range from 12% to 38% (Jones et al. 2016, 2019; Lane et al. 2016), and previous studies have identified two introgressed loci associated with sleep patterns (Dannemann and Kelso 2017; Putilov et al. 2019). We recently found modest enrichment for heritability of chronotype (morning/evening person phenotype in a GWAS of the UK Biobank) among introgressed variants genome wide using stratified linkage disequilibrium (LD) score regression (heritability enrichment: 1.58, P = 0.25) (McArthur et al. 2021). This analysis also suggested that introgressed variants were more likely to increase morningness.

To test for this proposed directional effect, we calculated the cumulative fraction of introgressed loci associated with chronotype in the UK Biobank that increase morningness (after collapsing based on LD at R2 > 0.5 in EUR). The introgressed loci most strongly associated with chronotype increase the propensity for morningness (fig. 6; supplementary tables S8 and S9, Supplementary Material online). As the strength of the association with morningness decreases, the bias begins to decrease, but the effect is maintained well past the genome-wide significance threshold (P < 5e−8). When focusing the analysis on introgressed variants in proximity (<1 Mb) to circadian genes, the pattern becomes even stronger. The bias toward morningness remains above 80% at the genome-wide significance threshold. This result also held when limiting to introgressed variants found in Browning plus one or all other introgression maps considered (supplementary fig. S4, Supplementary Material online). This suggests that introgressed variants act in a consistent direction on chronotype, especially when they influence circadian genes.

Introgressed variants associate with increased morningness. For varying strengths of association in the UK Biobank (x axis), the cumulative fraction of introgressed loci significantly associated with chronotype that increase morningness (y axis) is plotted. Most introgressed loci associated with chronotype increase morningness, and this effect is greatest at the most strongly associated loci (bottom line with circles). Introgressed variants nearby (<1 Mb) circadian genes are even more strongly shifted toward increasing morningness (top line with triangles). Each point (dot or triangle) represents an associated locus; variants were clumped by LD for each set (R2 > 0.5 in EUR). The dotted vertical line represents the genome-wide significance threshold (P < 5e−8).
Fig. 6.

Introgressed variants associate with increased morningness. For varying strengths of association in the UK Biobank (x axis), the cumulative fraction of introgressed loci significantly associated with chronotype that increase morningness (y axis) is plotted. Most introgressed loci associated with chronotype increase morningness, and this effect is greatest at the most strongly associated loci (bottom line with circles). Introgressed variants nearby (<1 Mb) circadian genes are even more strongly shifted toward increasing morningness (top line with triangles). Each point (dot or triangle) represents an associated locus; variants were clumped by LD for each set (R2 > 0.5 in EUR). The dotted vertical line represents the genome-wide significance threshold (P < 5e−8).

Circadian rhythms are involved in a wide variety of biological systems. To explore other phenotypes potentially influenced by the introgressed circadian variants, we evaluated evidence for pleiotropic associations. First, we retrieved all the genome-wide associations reported for introgressed variants in the Open Targets Genetics (https://genetics.opentargets.org) database, which combines GWAS data from the GWAS Catalog, UK Biobank, and several other sources. Introgressed circadian variants are associated with traits from a diverse range of categories (supplementary table S10, Supplementary Material online). Associations with blood-related traits are by far the most common; however, this is likely because they have more power in the UK Biobank. Overall, circadian introgressed variants are significantly more likely to have at least one trait association than introgressed variants not in the circadian set (Fisher's exact test: OR = 1.25, P = 7.03e−25) (supplementary fig. S5A, Supplementary Material online). The circadian variants also associate with significantly more traits per variant than the noncircadian set (Mann–Whitney U test: P = 9.93e−14) (supplementary fig. S5B and table S11, Supplementary Material online). These results suggest effects for introgressed circadian variants beyond chronotype.

Evidence for Adaptive Introgression at Circadian Loci

The gene flow from Eurasian archaic hominins into AMH contributed to adaptations to some of the new environmental conditions encountered outside of Africa (Racimo et al. 2015). The above analyses demonstrate the effects of introgressed variants on circadian gene regulation and chronotype. To explore whether these circadian regions show evidence of adaptive introgression, we considered three sets of introgressed regions predicted to have contributed to AMH adaptation: one from an outlier approach based on allele frequency statistics (Racimo, Marnetto, and Huerta-Sánchez 2017) and two from recent machine learning algorithms: genomatnn (Gower et al. 2021) and MaLAdapt (Zhang et al. 2023). We intersected the circadian introgressed variants with the adaptive introgression regions from each method.

We identified 47 circadian genes with evidence of adaptive introgression at a nearby variant from at least one of the methods (supplementary table S12, Supplementary Material online). No region was supported by all three methods; however, six were shared between Racimo and MaLAdapt and three were shared by Racimo and genomatnn. The relatively small overlap between these sets underscores the challenges of identifying adaptive introgression. Nonetheless, these represent promising candidate regions for further exploration of the effects of introgressed variants on specific aspects of circadian biology. For example, an introgressed haplotype on chr10 tagged by rs76647913 was identified by both MaLAdapt and Racimo. This introgressed haplotype is an eQTL for the nearby ATOH7 gene in many GTEx tissues. ATOH7 is a circadian gene that is involved in retinal ganglion cell development, and mice with this gene knocked out are unable to entrain their circadian clock based on light stimuli (Brzezinski et al. 2005).

Latitudinal Clines for Introgressed Circadian Loci

Motivated by the previous discovery of an introgressed haplotype on chr2 that is associated with chronotype and increases in frequency with latitude (Dannemann and Kelso 2017; Putilov et al. 2019), we also tested each introgressed circadian variant for a correlation between allele frequency and latitude in modern non-African populations from the 1000 Genomes Project.

The strongest association between latitude and frequency was a large chromosome 2 haplotype that contains the previously discovered introgressed single nucleotide polymorphism (SNP) (rs75804782, R = 0.85) associated with the chronotype. This haplotype is present in all non-African populations, and rs61332075 showed the strongest latitudinal cline (R = 0.87). The second strongest consisted of a smaller haplotype of introgressed variants a few kilobases of upstream of the previous haplotype (tagged by rs35333999 and rs960783) that overlaps the core circadian gene PER2. These variants have a correlation between latitude and frequency of ∼0.68. They are also in moderate LD (R2 of ∼0.35 in EUR) with an additional introgressed variant (rs62194932) that has a similar latitudinal cline of 0.70 (supplementary fig. S6 and table S13, Supplementary Material online). These variants are each in very low LD with the previously discovered haplotype (R2 of ∼0.01) and are each supported by multiple introgression maps. Moreover, these introgressed variants are absent in all EAS populations, absent or at very low frequency in SAS (<3%), and at higher frequency in EUR populations (∼13%).

The EUR-specific introgressed variant rs35333999 causes a missense change in the PER2 protein (V903I) that overlaps a predicted interaction interface with PPARG. PER2 controls lipid metabolism by directly repressing PPARG's proadipogenic activity (Grimaldi et al. 2010). The rs62194932 variant is an eQTL of HES6 in the blood in the eQTLGen cohort (Võsa et al. 2021). HES6 encodes a protein that contributes to circadian regulation of LDLR and cholesterol homeostasis (Lee et al. 2012).

Thus, this genomic region that includes circadian genes and introgressed variants associated with chronotype has a population-specific structure and at least two distinct sets of introgressed variants with latitudinal clines and functional links to lipid metabolism. PER2 is also predicted to have lower gene regulation in archaic hominins than most humans (fig. 4), and the Vindija Neanderthal carries a lineage-specific variant in this gene that has splice-altering effects. These results together suggest that PER2 may have experienced multiple functional changes in different modern and archaic lineages, with potential adaptive effects mediated by introgression.

We did not discover any other significant associations between latitude and frequency for other introgressed circadian loci. The rapid migration and geographic turnover of populations in recent human history are likely to obscure many latitude-dependent evolutionary signatures, so we did not anticipate many circadian loci would have a strong signal.

Discussion

The Eurasian environments where Neanderthals and Denisovans lived for several hundred thousand years are located at higher latitudes with more variable photoperiods than the landscape where AMH evolved before leaving Africa. Evaluating genetic variation that arose separately in each of the archaic and AMH lineages after their split ∼700 Ma, we identified lineage-specific genetic variation in circadian genes, their promoters, and flanking distal regulatory elements. We found that both archaic- and human-specific variants are observed more often than expected in each class of functional region. This result suggests that, while each group evolved separately during hundreds of thousands of years in divergent environments, both experienced pressure on circadian-related variation. Leveraging sequence-based machine learning methods, we identified many archaic-specific variants likely to influence circadian gene splicing and regulation. For example, core clock genes (CLOCK, PER2, RORB, RORC, and FBXL13) have archaic variants predicted to cause alternative splicing compared with AMH. Several core genes were also predicted in archaics to be at the extremes of human gene regulation, including PER2, CRY1, NPAS2, RORA, and NR1D1. Surprisingly, the Altai Neanderthal shared more divergent regulation in the circadian genes with the Denisovan individual than the Vindija Neanderthals. The two Neanderthals represent populations that were quite distantly diverged with substantially different histories and geographical ranges. The Denisovan and Altai Neanderthal also come from the same region in Siberia, while the Vindija Neanderthal came from a region in Croatia with slightly lower latitude.

Introgression introduced variation that first appeared in the archaic hominin lineage into Eurasian AMH. While most of this genetic variation experienced strong negative selection in AMH, a smaller portion is thought to have provided adaptive benefits in the new environments (Racimo et al. 2015). Given the divergence in many circadian genes’ regulation, we explored the landscape of introgression on circadian genes. We first looked at introgressed circadian variants that are likely to influence gene regulation in AMH. Variants in this set are observed more often than expected, suggesting the importance of maintaining circadian variation in the population. We also verified that these results held over variants identified by different methods for calling archaic introgression.

We then evaluated the association of these introgressed variants with variation in the circadian phenotypes of Eurasians. We previously reported a modest enrichment among introgressed variants for heritability of the morning/evening person phenotype (McArthur et al. 2021). Here, we further discovered a consistent directional effect of the introgressed circadian variants on chronotype. The strongest associated variants increase the probability of being a morning person in Eurasians.

While it is not immediately clear why increased morningness would be beneficial at higher latitudes, considering this directional effect in the context of clock gene regulation and the challenge of adaptation to higher latitudes suggests an answer. In present-day humans, behavioral morningness is correlated with a shortened period of the circadian molecular clockworks in individuals. This earlier alignment of sleep/wake with external timing cues is a consequence of a quickened pace of the circadian gene network (Brown et al. 2008). Therefore, the morningness directionality of introgressed circadian variants may indicate selection toward a shortened circadian period in the archaic populations living at high latitudes. Supporting this interpretation, shortened circadian periods are required for synchronization to the extended summer photoperiods of high latitudes in Drosophila, and selection for shorter periods has resulted in latitudinal clines of decreasing period with increasing latitude, as well as earlier alignment of behavioral rhythms (Hut et al. 2013). In addition, Drosophila populations exhibit decreased amplitude of behavioral rhythms at higher latitudes, which is also thought to aid in synchronization to long photoperiods (Hut et al. 2013).

Our finding that introgressed circadian variants generally decrease gene regulation of circadian genes suggests that they could lead to lower amplitude clock gene oscillations. However, when assayed in present-day humans, there is not a strong correlation between the overall expression level of NR1D1 and the transcriptional amplitudes of other clock genes within individuals (Brown et al. 2008), and quantitative modeling of the mammalian circadian clockworks suggests that stable clock gene rhythms can result across a wide range of absolute levels of gene expression as long as the stoichiometric ratios of key positive and negative clock genes are reasonably conserved (Kim and Forger 2012). Interestingly, the lower transcriptional amplitude of NR1D1 does confer greater sensitivity of the present-day human clockworks to resetting stimuli, a potentially adaptive characteristic for high latitudes (Brown et al. 2008).

Thus, given the studies of latitudinal clines and adaptation from Drosophila and the nascent understanding of clock gene contributions to behavioral phenotypes in present-day humans, the directional effects of introgressed circadian gene variants toward early chronotype and decreased gene regulation we observed can be viewed as potentially adaptive. More complex chronotype phenotyping and mechanistic studies of the variants of interest are needed to fully understand these observations.

Finally, to explore evidence for positive selection on introgressed variants in AMH, we analyzed results from three recent methods for detecting adaptive introgression. All methods identified circadian loci as candidates for adaptive introgression. However, we note that the predictions of these methods have only modest overlap with one another, underscoring the difficulty of identifying adaptive introgression. Nonetheless, many of these loci, especially those supported by both Racimo and MaLAdapt, are good candidates for adaptive introgression given their functional associations with circadian genes

Several limitations must be considered when interpreting our results. First, it is challenging to quantify the complexity of traits with a large behavioral component (like chronotype) and infer their variation from genomic information alone. Nevertheless, we believe our approach of focusing on molecular aspects (splicing and gene regulation) of genomic loci with relevance to circadian biology, in parallel to GWAS-based associations, lends additional support to the divergence in chronotype between archaic hominins and modern humans. Second, we also note that circadian rhythms contribute to many biological systems, so the variants in these genes tend to be associated with a variety of phenotypes. Thus, there is also the potential that selection acted on other phenotypes influenced by circadian variation than those related directly to chronotype. Third, given the complexity of circadian biology, there is no gold standard set of circadian genes. We focus on the core clock genes and a broader set of expert-curated genes relevant to circadian systems, but it is certainly possible that other genes with circadian effects are not considered. Fourth, recent adaptive evolution is challenging to identify, and this is especially challenging for introgressed loci. Nonetheless, we find several circadian loci with evidence of adaptive introgression from more than one method. Finally, given the many environmental factors that differed between African and non-African environments, it is difficult to definitively determine whether selection on a particular locus was the result of variation in light levels versus other related factors, such as temperature. Nonetheless, given the observed modern associations with chronotype for many of these variants, we believe it is a plausible target.

In conclusion, studying how humans evolved in the face of changing environmental pressures is necessary to understanding variation in present-day phenotypes and the potential trade-offs that influence the propensity to different diseases in modern environments (Benton et al. 2021). Here, we show that genomic regions involved in circadian biology exhibited substantial functional divergence between separate hominin populations. Furthermore, we show that introgressed variants contribute to variation in AMH circadian phenotypes today in ways that are consistent with an adaptive benefit.

Materials and Methods

Circadian Gene Selection

Circadian biology is a complex system due to its high importance in the functioning of biological timing in diverse biological systems. For that reason, determining which genes are crucial for selection to environmental response related to light exposure is not a straightforward process. To address this issue, we look at different sources of genome annotation databases and searched for genes and variants associated with circadian-related phenotypes. We considered all human protein-coding genes in the Gene Ontology database annotated with the GO: 0007623 (“circadian rhythm”) term or terms annotated with the relationship “is_a,” “part_of,” “occurs_in,” or “regulates” circadian rhythm. We also considered genes containing experimental or orthologous evidence of circadian function in the Circadian Gene Database (CGDB), the GWAS Catalog genes containing “chronotype” or “circadian rhythm” associated variants, and a curated set of genes available in WikiPathways (Martens et al. 2021). The final set of circadian genes was curated by Dr. Douglas McMahon.

To select the candidate circadian genes with the highest confidence, we defined a hierarchy system where genes annotated by McMahon or annotated in three out of four other sources receive a “high” level of confidence. Genes with evidence from two out of four of the sources are assigned a “medium” level of confidence. Genes annotated as circadian only in one out of four sources are assigned to “low” level of confidence and not considered in our circadian gene set. We then defined our set of circadian variants from the 1000 Genomes Project using the list of circadian genes. The variants are included in the analysis of coding, noncoding, regulatory, eQTL, human-specific, archaic-specific, and introgressed variants.

Definition of Lineage-Specific Variants

To identify candidate variants that are specific to the human and the archaic lineages, we used a set of variants published by Kuhlwilm and Boeckx (2019). The variants were extracted from the high-coverage genomes of three archaics: a 122,000-year-old Neanderthal from the Altai Mountains (52× coverage), a 52,000-year-old Neanderthal from Vindija in Croatia (30× coverage), and a 72,000-year-old Denisovan from the Altai Mountains (30× coverage). The variants were called in the context of the human genome hg19/GRCh37 reference. The total number of variant sites after applying filters for high coverage sites and genotype quality is 4,437,803. A human-specific variant is defined as a position where all the humans in the 1000 Genomes Project carry the derived allele and all the archaics carry the ancestral allele. An archaic specific is defined as a position where all the archaics carry the derived allele and the derived allele is absent or extremely rare (≤0.00001) across all human populations. Note that introgressed archaic alleles are not included in the “archaic-specific” set. These criteria resulted in 9,424 human-specific and 33,184 archaic-specific variants.

Enrichment of Lineage-Specific Variants among Functional Regions of the Genome

We intersected the sets of lineage-specific variants with several sets of annotated functional genomic regions. Inside circadian gene regions (Gencode v29), we found 156 human-specific variants and 341 archaic-specific variants. In circadian promoter regions, we found six human-specific variants and 19 archaic-specific variants. Promoters were defined as regions 5-kb upstream to 1-kb downstream from a transcription start site (TSS). In distal regulatory elements, we found 247 human-specific variants and 807 archaic-specific variants. For this last set, we considered cCREs published by ENCODE (Moore et al. 2020) within 1 Mb of the circadian genes. To compute whether lineage-specific variants are more abundant than expected in circadian genes, we applied a Fisher's exact test to the sets of human- and archaic-specific variants in regulatory, promoter, and gene regions.

Genes Containing Archaic Variants with Evidence of Alternative Splicing

We used a set of archaic variants annotated with the splice-altering probabilities to identify circadian genes that may be differentially spliced between archaic hominins and AMH (Brand et al. 2023). We considered variants from four archaic individuals: the Altai, Chagyrskaya, and Vindija Neanderthals and the Altai Denisovan. These archaic variants were annotated using SpliceAI (Jaganathan et al. 2019), and we considered any variant with a maximum delta or splice-altering probability > 0.2. We identified 36 archaic-specific splice-altering variants, defined as those variants absent from 1000 Genomes Project, among 28 circadian genes. Next, we tested for enrichment among this gene set using an empirical null approach (McArthur et al. 2022; Brand et al. 2023). We shuffled the maximum deltas among 1,607,350 variants 10,000 times and counted the number of circadian genes with a splice-altering variant in each iteration. Enrichment was calculated as the number of observed genes (N = 28) divided by the mean gene count among 10,000 shuffles. In addition to all genes with archaic-specific variants, we considered six other subsets among these variants: 1) genes with variants private to the Altai Neanderthal, 2) genes with variants private to the Chagyrskaya Neanderthal, 3) genes with variants private to the Altai Denisovan, 4) genes with variants private to all Neanderthals, 5) genes with variants shared among all archaic individuals, and 6) genes with variants private to the Vindija Neanderthal. Finally, we considered a subset of splice-altering variants that were identified as tag SNPs by Vernot et al. (2016).

PrediXcan

To understand the difference in circadian biology between present-day humans and archaic hominins, we analyzed predictions on gene regulation. We considered the results from PrediXcan gene regulation predictions across 44 tissues from the PredictDB Data Repository (http://predictdb.org/). The models were trained on GTEx V6 using variants identified in 2,504 present-day humans in the 1000 Genomes Project Phase 3 within 1 Mb of each circadian gene. The original analysis includes predictions for 17,748 genes for which the models explained a significant amount of variance in gene expression in each tissue (false discovery rate < 0.05). The prediction models were also applied to the Altai and Vindija Neanderthals and the Denisovan. The resulting predictions are normalized values of the distribution observed in GTEx individuals used to train the original prediction models. Each prediction contains an empirical P-value, which was calculated for each gene and tissue pair to define genes that are DR between archaic hominins and humans. The P-value is obtained by calculating the proportion of humans from the 1000 Genomes Project that have predictions more extreme compared to the human median than the archaic individual. Significantly, DR genes are defined as those where the archaic prediction falls outside the distribution of humans in the 1000 Genomes Project predictions.

We tested whether the circadian genes in our set are more likely to be DR compared with an empirical null distribution from random gene sets of the same size. We account for the fact that some genes are modeled in more tissues than others by matching the distribution of tissues in which each gene could be modeled in the random sets to our set. Among 1,467 DR genes in the Altai Neanderthal, we find 23 DR circadian genes out of the total 236 genes in the circadian set. We iterate through the permutation analysis 1,000,000 times and find an enrichment of 1.21 (P = 0.19). A similar analysis is done in the Vindija Neanderthal (1,536 total DR, 21 circadian DR, enrichment of 1.05, P = 0.43) and the Denisovan individual (1,214 total DR, 19 circadian DR, enrichment of 1.20, P = 0.24). In this study, we define a set of DR genes as the intersection between DR genes in all three archaics, resulting in a set of 16 genes.

Enrichment of Introgressed Variants in eQTL

We performed an enrichment analysis using Pearson's chi-squared test to evaluate if there is over-representation of introgressed alleles in our set of circadian variants using the GTEx dataset. We did a liftOver of the GTEx v8 data set from hg38 to hg19. The original hg38 set contains 4,631,659 eQTLs across 49 tissues. After the LiftOver, 4,608,446 eQTLs remained, with the rest not mapping. We used the archaic introgressed variants data set from Browning et al. (2018). The set contains 863,539 variants that are introgressed in humans originating in archaic hominins. We performed an intersection between the set of genes containing evidence for eQTLs and our set of 246 circadian genes to retrieve a subset of variant sites with evidence of being eQTL in circadian genes. The resulting subset contained 97,441 circadian eQTLs in 49 tissues and 239 genes. We further intersected the introgressed variants and the set of eQTL, resulting in 128,138 introgressed eQTLs. The final set of eQTLs that are circadian and also introgressed is 3,857.

Direction of Effect of Chronotype Associations

To explore the effect of archaic introgression in circadian genes on human chronotype, we quantified the direction of the effect of variants associated to a morning/evening person trait in a GWAS analysis of the UK Biobank (http://www.nealelab.is/uk-biobank/). The variants were LD clumped using PLINK v1.9 (R2 > 0.5). We generated cumulative proportion values on the beta values assigned to each associated variant on an ascending order of P-values.

Detection of Latitudinal Clines in Chronotype Associations

To evaluate latitudinal clines in chronotype-associated variants, we assigned a latitude to each of the Eurasian 1000 Genomes Project populations. The latitude of diaspora populations was set to their ancestral country (GIH Gandhinagar in Gujarat: 23.223; STU Sri Jayawardenepura Kotte: 6.916667; ITU Amaravati in Andhra Pradesh: 16.5131; CEU: 52.372778). CEU was assigned a latitude in Amsterdam, following an analysis that shows that this group is more closely related to Dutch individuals (Lao et al. 2008). We then used the LDlink API to retrieve allele frequencies for each introgressed morningness variant in Eurasian individuals (Machiela and Chanock 2015). Variants that follow a latitudinal cline were identified using linear regression statistics requiring correlation coefficient (R ≥ 0.65) and P-value (P ≤ 0.5).

Detection of Pleiotropy in the Set of Introgressed Circadian Variants

To understand the extent of different phenotypes associated with the introgressed circadian variants, we first extracted genome-wide associations from Open Targets Genetics (https://genetics.opentargets.org/) for each of the variants with evidence of introgression (Browning et al. 2018). Only the variants with significant P-values were analyzed. The P-value threshold was set at the genome-wide significance level (P = 5e−8). We split the variants into two sets: introgressed circadian and introgressed noncircadian. Many of these variants are not associated with any phenotype. We performed a Fisher's exact test to analyze which of the two sets contains a higher ratio of SNPs with at least one association versus SNPs with no association. The result showed that the circadian set had a significantly higher ratio (OR = 1.36, P = 5e−29). Then, we calculated the total of unique traits associated with each of the variants, given that the SNP has at least one association. We used a Mann–Whitney U test to understand which set is represented by a higher level of traits per SNP. The circadian set was slightly more pleiotropic, and the result is supported by a significant P-value (P = 5.4e−3).

Identifying Introgressed Circadian Variants with Evidence of Adaptive Introgression

We sought to identify circadian variants that contain evidence of adaptive introgression. To achieve this, we collected adaptive introgression predictions from a method that applied various summary statistics on 1000 Genomes Project data (Racimo, Marnetto, and Huerta-Sánchez 2017) and two sets of genomic regions that were measured for their likelihood to be under adaptive introgression by two machine learning methods: genomatnn and MaLAdapt. genomatnn is a convolutional neural network trained to identify adaptive introgression based on simulations (Gower et al. 2021). MaLAdapt is a machine learning algorithm trained to find adaptive introgression based on simulations using an extra-trees classifier (Zhang et al. 2023). Following the thresholds used in each paper, a region is considered to be under adaptive introgression if the prediction value assigned to it meets a threshold of 0.5 or 0.9, respectively. To find the variants of interest that fall into adaptive introgression regions, we intersected the set of introgressed circadian SNPs with the Racimo et al. (2015), genomatnn, and MaLAdapt regions individually. The set of introgressed circadian variants contains variants inside circadian genes, in circadian promoter regions (5-kb upstream and 1-kb downstream of the TSS), and variants with regulatory function (cCREs) flanking circadian genes by 1 Mb.

Supplementary Material

Supplementary data are available at Genome Biology and Evolution online (http://www.gbe.oxfordjournals.org/).

Acknowledgments

We thank members of the Capra Lab for their helpful comments on this work. This work was conducted in part using the resources of the Advanced Computing Center for Research and Education at Vanderbilt University, Nashville, TN, USA. This work was supported by the National Institutes of Health (R35GM127087 to J.A.C., R01GM117650 to D.M., F30HG011200 to E.M., T32GM080178 to Vanderbilt University [E.M.], and T32HG009495 to the University of Pennsylvania [L.L.C.]).

Author Contributions

Conceptualization: K.V.-A. and J.A.C.; methodology: K.V.-A., L.C., E.M., C.B., D.R., J.S., D.M., and J.A.C.; investigation: K.V.-A., L.C., E.M., C.B., and J.A.C.; writing—original draft: K.V.-A., and J.A.C.; writing—review & editing: K.V.-A., L.C., E.M., C.B., D.R., D.M., and J.A.C.; funding acquisition: J.A.C.; resources: J.A.C.; and supervision: J.A.C.

Data Availability

The data underlying this article are available in the article and in its online Supplementary material.

Literature Cited

1000 Genomes Project Consortium
.
2010
.
A map of human genome variation from population-scale sequencing
.
Nature
 
467
(
7319
):
1061
.

Arnold
 
LJ
, et al.  
2014
.
Luminescence dating and palaeomagnetic age constraint on hominins from Sima de los Huesos, Atapuerca, Spain
.
J Hum Evol.
 
67
(
1
):
85
107
.

Bae
 
CJ
,
Douka
 
K
,
Petraglia
 
MD
.
2017
.
On the origin of modern humans: Asian perspectives
.
Science
 
358
(
6368
):
eaai9067
.

Benton
 
ML
, et al.  
2021
.
The influence of evolutionary history on human health and disease
.
Nat Rev Genet
.
22
(
5
):
269
283
.

Brand
 
CM
,
Colbran
 
LL
,
Capra
 
JA
.
2023
.
Resurrecting the alternative splicing landscape of archaic hominins using machine learning
.
Nat Ecol Evol
.
7
(
6
):
939
953
.

Brown
 
SA
, et al.  
2008
.
Molecular insights into human daily behavior
.
Proc Natl Acad Sci USA.
 
105
(
5
):
1602
1607
.

Browning
 
SR
, et al.  
2018
.
Analysis of human sequence data reveals two pulses of archaic Denisovan admixture
.
Cell
 
173
(
1
):
53
61
.

Brzezinski
 
JA
, et al.  
2005
.
Loss of circadian photoentrainment and abnormal retinal electrophysiology in Math5 mutant mice
.
Invest Ophthalmol Visual Sci
.
46
(
7
):
2540
2551
.

Colbran
 
LL
, et al.  
2019
.
Inferred divergent gene regulation in archaic hominins reveals potential phenotypic differences
.
Nat Ecol Evol
.
3
:
1598
1606
.

Dannemann
 
M
,
Kelso
 
J
.
2017
.
The contribution of Neanderthals to phenotypic variation in modern humans
.
Am J Hum Genet.
 
101
(
4
):
578
589
.

Dorokhov
 
VB
, et al.  
2018
.
An hour in the morning is worth two in the evening: association of morning component of morningness—eveningness with single nucleotide polymorphisms in circadian clock genes
.
Biol Rhythm Res.
 
49
(
4
):
622
642
.

Gan
 
Y
, et al.  
2015
.
Shift work and diabetes mellitus: a meta-analysis of observational studies
.
Occup Environ Med.
 
72
(
1
):
72
78
.

Gan
 
Y
, et al.  
2018
.
Association between shift work and risk of prostate cancer: a systematic review and meta-analysis of observational studies
.
Carcinogenesis
 
39
(
2
):
87
97
.

Gómez-Robles
 
A
.
2019
.
Dental evolutionary rates and its implications for the Neanderthal–modern human divergence
.
Sci Adv.
 
5
(
5
):
eaaw1268
.

Gower
 
G
, et al.  
2021
.
Detecting adaptive introgression in human evolution using convolutional neural networks
.
eLife
 
10
:
e64669
.

Green
 
RE
, et al.  
2010
.
A draft sequence of the neandertal genome
.
Science
 
328
(
5979
):
710
722
.

Grimaldi
 
B
, et al.  
2010
.
PER2 Controls lipid metabolism by direct regulation of PPARγ
.
Cell Metab.
 
12
(
5
):
509
520
.

Gyarmati
 
G
, et al.  
2016
.
Night shift work and stomach cancer risk in the MCC-Spain study
.
Occup Environ Med.
 
73
(
8
):
520
527
.

Hublin
 
JJ
, et al.  
2017
.
New fossils from Jebel Irhoud, Morocco and the pan-African origin of Homo sapiens
.
Nature
 
546
(
7657
):
289
292
.

Huerta-Sánchez
 
E
, et al.  
2014
.
Altitude adaptation in Tibetans caused by introgression of Denisovan-like DNA
.
Nature
 
512
(
7513
):
194
197
.

Hut
 
RA
, et al.  
2013
.
Latitudinal clines: an evolutionary view on biological rhythms
.
Proc R Soc B: Biol Sci
.
280
(
1765
):
20130433
.

Jacobs
 
GS
, et al.  
2019
.
Multiple deeply divergent Denisovan ancestries in Papuans
.
Cell
 
177
(
4
):
1010
1021.e32
.

Jaganathan
 
K
, et al.  
2019
.
Predicting splicing from primary sequence with deep learning
.
Cell
 
176
(
3
):
535
548.e24
.

Jones
 
SE
, et al.  
2016
.
Genome-wide association analyses in 128,266 individuals identifies new morningness and sleep duration loci
.
PLoS Genet.
 
12
(
6
):
e1006125
.

Jones
 
SE
, et al.  
2019
.
Genome-wide association analyses of chronotype in 697,828 individuals provides insights into circadian rhythms
.
Nat Commun.
 
10
(
1
):
343
.

Kim
 
JK
,
Forger
 
DB
.
2012
.
A mechanism for robust circadian timekeeping via stoichiometric balance
.
Mol Syst Biol.
 
8
(
1
):
630
.

Kivelä
 
L
,
Papadopoulos
 
MR
,
Antypa
 
N
.
2018
.
Chronotype and psychiatric disorders
.
Curr Sleep Med Rep.
 
4
(
2
):
94
103
.

Knutson
 
KL
,
von Schantz
 
M
.
2018
.
Associations between chronotype, morbidity and mortality in the UK Biobank cohort
.
Chronobiol Int.
 
35
(
8
):
1045
1053
.

Kuhlwilm
 
M
,
Boeckx
 
C
.
2019
.
A catalog of single nucleotide changes distinguishing modern humans from archaic hominins
.
Nat Sci Rep
.
9
(
1
):
8463
.

Lane
 
JM
, et al.  
2016
.
Genome-wide association analysis identifies novel loci for chronotype in 100,420 individuals from the UK Biobank
.
Nat Commun.
 
7
(
1
):
1
10
.

Lao
 
O
, et al.  
2008
.
Correlation between genetic and geographic structure in Europe
.
Curr Biol.
 
18
(
16
):
1241
1248
.

Larcher
 
S
, et al.  
2015
.
Sleep habits and diabetes
.
Diabet Metabol
 
41
(
4
):
263
271
.

Larcher
 
S
, et al.  
2016
.
Impact of sleep behavior on glycemic control in type 1 diabetes: the role of social jetlag
.
Eur J Endocrinol
.
175
(
5
):
411
419
.

Lee
 
YJ
, et al.  
2012
.
Circadian regulation of low density lipoprotein receptor promoter activity by CLOCK/BMAL1, Hes1 and Hes6
.
Exp Mol Med
.
44
(
11
):
642
652
.

Leocadio-Miguel
 
MA
, et al.  
2017
.
Latitudinal cline of chronotype
.
Sci Rep.
 
7
(
5437
):
2
7
.

Lowden
 
A
, et al.  
2018
.
Delayed sleep in winter related to natural daylight exposure among Arctic day workers
.
Clocks & Sleep
.
1
(
1
):
105
116
.

Machiela
 
MJ
,
Chanock
 
SJ
.
2015
.
LDlink: a web-based application for exploring population-specific haplotype structure and linking correlated alleles of possible functional variants
.
Bioinformatics
 
31
(
21
):
3555
3557
.

Mafessoni
 
F
, et al.  
2020
.
A high-coverage Neandertal genome from Chagyrskaya Cave
.
Proc Natl Acad Sci USA.
 
117
(
26
):
15132
15136
.

Martens
 
M
, et al.  
2021
.
WikiPathways: connecting communities
.
Nucleic Acids Res
.
49
(
D1
):
D613
D621
.

McArthur
 
E
, et al.  
2022
.
Reconstructing the 3D genome organization of Neanderthals reveals that chromatin folding shaped phenotypic and sequence divergence
.
bioRxiv [Preprint]
.

McArthur
 
E
,
Rinker
 
DC
,
Capra
 
JA
.
2021
.
Quantifying the contribution of Neanderthal introgression to the heritability of complex traits
.
Nat Commun.
 
12
(
1
):
1
14
.

Meyer
 
M
, et al.  
2012
.
A high-coverage genome sequence from an archaic Denisovan individual
.
Science
 
338
(
6104
):
222
226
.

Meyer
 
M
, et al.  
2014
.
A mitochondrial genome sequence of a hominin from Sima de los Huesos
.
Nature
 
505
:
7483
.

Meyer
 
M
, et al.  
2016
.
Nuclear DNA sequences from the Middle Pleistocene Sima de los Huesos hominins
.
Nature
 
531
(
7595
):
504
507
.

Michael
 
TP
, et al.  
2003
.
Enhanced fitness conferred by naturally occurring variation in the circadian clock
.
Science
 
302
(
5647
):
1049
1053
.

Moore
 
JE
, et al.  
2020
.
Expanded encyclopaedias of DNA elements in the human and mouse genomes
.
Nature
 
583
(
7818
):
699
710
.

Nielsen
 
R
, et al.  
2017
.
Tracing the peopling of the world through genomics
.
Nature
 
541
(
7637
):
302
310
.

O’Malley
 
KG
,
Banks
 
MA
.
2008
.
A latitudinal cline in the Chinook salmon (Oncorhynchus tshawytscha) clock gene: evidence for selection on PolyQ length variants
.
Proceedings of the Royal Society B: Biological Sciences
.
275
(
1653
):
2813
2821
.

O’Malley
 
KG
,
Ford
 
MJ
,
Hard
 
JJ
.
2010
.
Clock polymorphism in Pacific salmon: evidence for variable selection along a latitudinal gradient
.
Proc R Soc Lond B Biol Sci
.
277
(
1701
):
3703
3714
.

Papantoniou
 
K
, et al.  
2016
.
Breast cancer risk and night shift work in a case–control study in a Spanish population
.
Eur J Epidemiol.
 
31
(
9
):
867
878
.

Papantoniou
 
K
, et al.  
2017
.
Shift work and colorectal cancer risk in the MCC-Spain case–control study
.
Scand J Work Environ Health
.
43
(
3
):
250
259
.

Petr
 
M
, et al.  
2019
.
Limits of long-term selection against Neandertal introgression
.
Proc Natl Acad Sci USA.
 
116
(
5
):
1639
1644
.

Prüfer
 
K
, et al.  
2014
.
The complete genome sequence of a Neanderthal from the Altai Mountains
.
Nature
 
505
(
7481
):
43
49
.

Prüfer
 
K
, et al.  
2017
.
A high-coverage Neandertal genome from Vindija Cave in Croatia
.
Science
 
358
(
6363
):
655
658
.

Putilov
 
AA
, et al.  
2019
.
Genetic-based signatures of the latitudinal differences in chronotype
.
Biol Rhythm Res.
 
50
(
2
):
255
271
.

Putilov
 
AA
,
Dorokhov
 
VB
,
Poluektov
 
MG
.
2018
.
How have our clocks evolved? Adaptive and demographic history of the out-of-African dispersal told by polymorphic loci in circadian genes
.
Chronobiol Int.
 
35
(
4
):
511
532
.

Racimo
 
F
, et al.  
2015
.
Evidence for archaic adaptive introgression in humans
.
Nat Rev Genet
.
16
(
6
):
359
371
.

Racimo
 
F
,
Gokhman
 
D
, et al.  
2017
.
Archaic adaptive introgression in TBX15/WARS2
.
Mol Biol Evol.
 
34
(
3
):
509
524
.

Racimo
 
F
,
Marnetto
 
D
,
Huerta-Sánchez
 
E
.
2017
.
Signatures of archaic adaptive introgression in present-day human populations
.
Mol Biol Evol.
 
34
(
2
):
296
317
.

Randler
 
C
,
Rahafar
 
A
.
2017
.
Latitude affects morningness–eveningness: evidence for the environment hypothesis based on a systematic review
.
Sci Rep
.
7
:
39976
.

Rinker
 
DC
, et al.  
2020
.
Neanderthal introgression reintroduced functional ancestral alleles lost in Eurasian populations
.
Nat Ecol Evol
.
4
(
10
):
1332
1341
.

Sandrelli
 
F
, et al.  
2007
.
A molecular basis for natural selection at the timeless locus in Drosophila melanogaster
.
Science
 
316
(
5833
):
1898
1900
.

Sankararaman
 
S
, et al.  
2012
.
The date of interbreeding between Neandertals and modern humans
.
PLoS Genet.
 
8
(
10
):
e1002947
.

Sankararaman
 
S
, et al.  
2014
.
The genomic landscape of Neanderthal ancestry in present-day humans
.
Nature
 
507
(
7492
):
354
359
.

Schaefer
 
NK
,
Shapiro
 
B
,
Green
 
RE
.
2021
.
An ancestral recombination graph of human, Neanderthal, and Denisovan genomes
.
Sci Adv.
 
7
(
29
):
776
792
.

Shi
 
Y
, et al.  
2020
.
Night-shift work duration and risk of colorectal cancer according to IRS1 and IRS2 expression
.
Cancer Epidemiol Biomarkers Prev
.
29
(
1
):
133
140
.

Skoglund
 
P
,
Mathieson
 
I
.
2018
.
Ancient genomics of modern humans: the first decade
.
Annu Rev Genomics Hum Genet.
 
19
:
381
404
.

Skov
 
L
, et al.  
2020
.
The nature of Neanderthal introgression revealed by 27,566 Icelandic genomes
.
Nature
 
582
(
7810
):
78
83
.

Srinivasan
 
V
, et al.  
2006
.
Melatonin in mood disorders
.
World J Biol Psychiatry
.
7
(
3
):
138
151
.

Steinrücken
 
M
, et al.  
2018
.
Model-based detection and analysis of introgressed Neanderthal ancestry in modern humans
.
Mol Ecol.
 
27
(
19
):
3873
3888
.

Stringer
 
C
.
2016
.
The origin and evolution of Homo sapiens
.
Philos Trans R Soc B Biol Sci .
 
371
(
1698
):
20150237
.

Tauber
 
E
, et al.  
2007
.
Natural selection favors a newly derived timeless allele in Drosophila melanogaster
.
Science
 
316
(
5833
):
1895
1898
.

Taylor
 
BJ
,
Hasler
 
BP
.
2018
.
Chronotype and mental health: recent advances
.
Curr Psychiatry Rep.
 
20
(
8
):
59
.

Telis
 
N
,
Aguilar
 
R
,
Harris
 
K
.
2020
.
Selection against archaic hominin genetic variation in regulatory regions
.
Nat Ecol Evol
.
4
(
11
):
1558
1566
.

Vernot
 
B
, et al.  
2016
.
Excavating Neandertal and Denisovan DNA from the genomes of Melanesian individuals
.
Science
 
352
(
6282
):
235
239
.

Vernot
 
B
,
Akey
 
JM
.
2014
.
Resurrecting surviving Neandertal lineages from modern human genomes
.
Science
 
343
(
6174
):
1017
1021
.

Villanea
 
FA
,
Schraiber
 
JG
.
2019
.
Multiple episodes of interbreeding between Neanderthal and modern humans
.
Nat Ecol Evol
.
3
(
1
):
39
44
.

Võsa
 
U
, et al.  
2021
.
Large-scale cis- and trans-eQTL analyses identify thousands of genetic loci and polygenic scores that regulate blood gene expression
.
Nat Genet.
 
53
(
9
):
1300
1310
.

Yousef
 
E
, et al.  
2020
.
Shift work and risk of skin cancer: a systematic review and meta-analysis
.
Sci Rep.
 
10
(
1
):
1
11
.

Zhang
 
Q
, et al.  
2008
.
Association of the circadian rhythmic expression of GmCRY1a with a latitudinal cline in photoperiodic flowering of soybean
.
Proc Natl Acad Sci USA.
 
105
(
52
):
21028
21033
.

Zhang
 
X
, et al.  
2023
.
MaLAdapt reveals novel targets of adaptive introgression from Neanderthals and Denisovans in worldwide human populations
.
Mol Biol Evol.
 
40
(
1
):
msad001
.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
Associate Editor: Emilia Huerta-Sanchez
Emilia Huerta-Sanchez
Associate Editor
Search for other works by this author on:

Supplementary data