Time-varying effects are common in genetic control of gestational duration

Abstract Preterm birth is a major burden to neonatal health worldwide, determined in part by genetics. Recently, studies discovered several genes associated with this trait or its continuous equivalent—gestational duration. However, their effect timing, and thus clinical importance, is still unclear. Here, we use genotyping data of 31 000 births from the Norwegian Mother, Father and Child cohort (MoBa) to investigate different models of the genetic pregnancy ‘clock’. We conduct genome-wide association studies using gestational duration or preterm birth, replicating known maternal associations and finding one new fetal variant. We illustrate how the interpretation of these results is complicated by the loss of power when dichotomizing. Using flexible survival models, we resolve this complexity and find that many of the known loci have time-varying effects, often stronger early in pregnancy. The overall polygenic control of birth timing appears to be shared in the term and preterm, but not very preterm, periods and exploratory results suggest involvement of the major histocompatibility complex genes in the latter. These findings show that the known gestational duration loci are clinically relevant and should help design further experimental studies.


Introduction
Preterm birth-birth before completing 37 weeks of gestationis a major cause of neonatal mortality and morbidity, affecting about 10% of deliveries worldwide (1). Various environmental factors associated with preterm birth have been reported, but the biological mechanisms leading to it are still largely unknown. As a result, there is no robust strategy for prevention of preterm birth, and its rates continue to increase in many countries (1)(2)(3).
Genetics is clearly important for this trait: around 20-40% of the variation in pregnancy duration is attributed to maternal genetic factors and another 10-30% to fetal genetics (4). Although the specific variants or genes underlying these effects are still mostly unknown, several loci were recently identified in large genome-wide association studies (GWASs) (5)(6)(7). Understanding the biological effects of variants at these loci could lead to broader understanding of pregnancy maintenance and suggest possible interventions for preterm birth.
Various statistical models have been used for discovering the variants, and their comparison could provide some clues about the mechanism of the pregnancy 'clock'. Typically, the variants are tested for association with pregnancy (gestational) duration in days as a continuous outcome using linear models, or preterm birth as a binary outcome, using e.g. logistic regression. Other dichotomizations of pregnancy duration have also been used, such as early (5), extreme (8) or very preterm birth (9) or postterm birth (5). So far, almost all of the associated variants have been identified using the continuous outcome and have weaker or no associations with preterm birth (5)(6)(7).
This contrast between results for pregnancy duration and preterm birth raises questions about the action and clinical relevance of the discovered genes. On one hand, if their variants have uniform effects throughout pregnancy (as modeled in the standard linear regression), such results are expected, as the continuous model then has much higher detection power (10). On the other hand, pathways initiating preterm birth are considered to be distinct from those acting near term (11). This is supported by epidemiological associations that differ by gestational duration groups (12), and registry data also suggest that genetic effects change throughout pregnancy (13). It is possible that the loci discovered for gestational duration have little relevance for preterm birth-or they might act uniformly throughout pregnancy and contradict this view.
In this study, we aim to resolve this uncertainty. We analyze a genotyping cohort of more than 30 000 parent-offspring trios using models explicitly incorporating the timing of genetic effects. With these, we aim to answer the following: (1) Are the pregnancy duration genes discovered so far clinically relevant for preterm birth? (2) Do the genes support the view that preterm and term birth timing is etiologically different? (3) Is there a period of gestation where the gene effects are particularly strong, weak or otherwise unusual?
We start by conducting GWASs of gestational duration and preterm birth in this cohort and directly showing the challenges in comparing the results of the two outcomes. We then select the top variants known for these traits and re-analyze them in timeto-event models, specifically designed to investigate their effect dynamics. We also repeat this analysis on a polygenic score (PGS) of gestational duration and further explore possibly different mechanisms acting in the very preterm range.

GWAS results for gestational duration and preterm delivery
We ran maternal and fetal GWASs of gestational duration and preterm delivery on 22 247 mothers and 21 262 fetuses (610 and 529 preterm cases, respectively) from the Norwegian MoBa cohort, using standard association methods (linear and logistic regression). In the maternal GWAS of gestational duration, we identified two loci associated with gestational duration at genome-wide significance (Fig. 1A), located in or near the KCNAB1 and ADCY5 genes. Both of these loci have been reported previously in (6); however, KCNAB1 was not identified in another large GWAS (7). In the fetal GWAS of gestational duration, one single genomewide significant variant was identified, rs34207099, nearest to the gene LRATD2 (FAM84B) (beta=−2.4 days, SE = 0.44 days, P = 3.0e-8, imputation INFO score = 0.75, Fig. 1C). We note, however, that the variant has a low frequency (MAF = 1.1%), is close to the significance threshold and has no linked proxies, and this association is not observed in the previous, larger fetal metaanalysis (P = 0.20) (5). For preterm delivery, no genome-wide significant associations were detected either in the maternal ( The null results of the preterm-delivery GWAS at first glance suggest differences from the continuous phenotype. Of the 165 preterm delivery loci at suggestive association (P < 10 −5 ) none were (genome-wide) significant in the GWAS of gestational duration. However, the regression coefficients of the suggestive variants across the two maternal GWASs were concordant in direction and size, i.e. variants that shorten gestational duration also increase preterm delivery risk (Fig. 1D). If the genetic effects are indeed shared between term and preterm periods, then dichotomization incurs a great loss of power: we estimate that such analysis would then require a 6.7× larger cohort to reach equal power with the continuous analysis (see also Fig. 1B; calculations based on 2.7% prevalence and the liability model, see Materials and Methods). For reference, the current maternal GWAS had 80% power to detect effects of at least 0.74 days per allele (linear model) or OR > 1.53 (logistic model), assuming MAF = 0.2 and 5e-8 significance threshold. In light of this power difference, comparison between the two GWAS is complicated, and we could not make definite conclusions about the effect dynamics or clinical relevance of the findings.

Top gestational duration variants have time-varying effects
From previous meta-analyses, we identified 25 maternal and 3 fetal genetic variants involved in delivery timing. One additional fetal locus was identified in the present study. Most (26) of these were initially discovered in analyses of gestational duration, although one (maternal) variant was detected only for preterm delivery and two (fetal) only for early preterm birth.
We then re-analyzed these 29 variants with a f lexible survival model (PAMM) and found that many of them have time-varying effects. PAMM (see Materials and Methods for detailed presentation) explicitly models the time to event, i.e. delivery. The effect of a variant is the change in instantaneous risk to deliver, relative to the major allele, and is allowed to change during gestation. Figure 2 shows the estimated effect dynamics for 20 variants that were nominally associated with gestational duration in this cohort, and 14 of these were significantly changing over time (P < 0.05 for the non-linear term in PAMM). Notably, even though all of the 20 variants were discovered in linear models and cohorts with few preterm births, their effects generally do not appear to be limited to the term period. In fact, for 10 loci (KCNAB1, FBXO32, LRATD2, ADCY5 and others), the estimated effect was largest early in the pregnancy and decreased toward the term. In three other cases, the effect peaked in the preterm birth range, around week 34 (TET3, EBF1 and COL27A1). For the remaining loci, the confidence intervals are wide and we are cautious about interpreting the estimated shapes, but most still show effects around or before 37 weeks (Fig. 2). One exception is the IL1A locus, which only had a significantly non-zero effect during weeks 38-41, consistently with the analyses performed in (5), which also concluded that its effects are mostly seen around the term. We emphasize that all of these variants were initially identified from other cohorts, except for the LRATD2 hit, which was discovered here and hence may show inf lated effect size.
The remaining 9 out of 29 variants are shown in Supplementary Material, Figure S3. Eight of these variants did not have any detectable effects on delivery timing in this cohort (P > 0.05 both in the gestational duration GWAS and for the non-linear effects in PAMM). This includes all three loci discovered in dichotomized analyses (LRP5, SPATA6 and LPP). The EEFSEC locus had borderline associations (P = 0.06 for the linear effects in GWAS, P = 0.02 for the non-linear effects in PAMM) and is also shown in Supplementary Material, Figure S3. Details of all 29 variants are listed in Supplementary Material, Table S1.
As diagnostics of our PAMM setup, we verified that it maintains the type I error rate in this setting by simulations (P < 0.05 observed in 4% of simulations) and that if only constant effects are included, the estimates are very similar between PAMM and standard Cox regression (Supplementary Material, Fig. S4).
As a sensitivity analysis, we applied two other methods for testing time-varying effects. These methods assume different (linear) shapes of the effects and produce larger P-values for the nonlinear effect loci (Supplementary Material, Table S1). The ranking of most other loci and the overall number of significant results are similar across the methods.
We also confirmed that almost all of the 28 top loci genes were actively transcribed in the placenta during early gestation (note that there were 29 loci, but 2 map to the EBF1 gene). In the data from (14), only AGTR2 and LEKR1 had few detections, while all other genes had at least five counts in 95% of the samples. Overall, the expression level of our top loci genes (median count = 326, taken over samples and then over genes) was at least as high as of other protein-coding genes (median count = 139; see also Supplementary Material, Fig. S5). This shows that the genes discovered in term-centric analyses are already present in the transcriptome and could plausibly have effects, much earlier in gestation.
Testing for enrichment of gene sets among the 28 genes mostly showed generic terms, with the top three being (all 'positive regulation of'): biosynthetic, macromolecule biosynthetic and nucleobase-containing compound metabolic processes (Supplementary Material, Table S2). Results were almost identical using only the 21 genes with significant time-varying effects detected here. However, restricting the set to 17 genes with effects strongest in early pregnancy (i.e. excluding TET3, EBF1, COL27A1 and IL1A), primary ovarian insufficiency appeared as one of top three enriched pathways (WikiPathways WP5316; Supplementary Material, Table S2). Notably, this pathway also The variants marked in red, and the corresponding loci, were reported as significant for the respective phenotype in the largest previous GWAS meta-analysis (6). Under the liability model, the GWAS in (B) has lower power: for a variant that can be detected genome-wide with 80% power in (A), the significance thresholds in (B) should be reduced to the dotted white lines to achieve 50% or 80% detection power. (C) LocusZoom plot showing the region around the only genome-wide significant association identified in the fetal GWAS. Shown P-values are for gestational duration. The significant variant is the purple diamond while all other variants are colored based on their linkage disequilibrium with it. Variants not available in the reference panel are colored in gray. (D) Effect size comparison for the suggestive variants in the maternal GWASs (P-value < 10 −5 for either outcome). Shown are the log odds ratios (ORs) and linear regression betas in the same position. The identity line is marked.
includes two of the three strongest GWAS hits overall-WNT4 and AGTR2 genes. Robust analysis would require determining the effect dynamics of each hit more precisely, but it could thus highlight particular pathways to birth, which has been difficult using all associated genes unselectively.

Polygenic effects are shared at term and early, but not very early, gestation
We constructed a maternal PGS for gestational duration based on the summary statistics of a previous large meta-analysis. The score includes ∼ 1.1 million variants and, on its own, explains 2.0% of the variance of gestational duration in the present cohort (or 2.8% in combination with the clinical covariates). For comparison, if this score is limited to 50-kbp regions surrounding the 25 top hits, the variance explained drops to 0.6%, or 1.5% with clinical covariates.
Firstly, we confirmed that the estimated time-varying effects of each variant remain after adjusting for this score; they were also robust to adding or removing the clinical covariates ( Supplementary Material, Fig. S6).
We then analyzed the effects of this score in time-to-event models (Fig. 3A). Similar to the individual variants, the PGS shows Effects of 20 genetic variants on delivery timing throughout gestation, estimated by a f lexible PAMM. For each week of gestation, the line is the estimated instantaneous log hazard ratio (positive = risk of birth is currently increased) and its 95% confidence interval is shaded. The preterm limit of 37 weeks is highlighted. Note that the minor allele count was used as a continuous covariate in the analysis, so the effects of 1 and 2 minor alleles are constrained to be proportional. effects extending well into the preterm range, even though it was derived based on analyses of continuous gestational age. Surprisingly, the 'protective' PGS groups appear to have neutral or even slightly harmful effects in the very early gestation. While there is large uncertainty in the estimates, the pattern is still seen with different groupings of the score (Supplementary Material, Fig. S7) or even by just directly observing the risks of different outcomes in each group (Fig. 3B): the risk for preterm delivery decreases smoothly with the PGS, but for delivery before 32 weeks (or very preterm delivery, vPTD), the risk is similar across four PGS groups and only elevated in one.
We tested some possible explanations of this pattern (although we caution that the data contain only 59 vPTD cases). For example, it is biologically plausible that vPTD risk is determined mostly by rare alleles with strong deleterious effects, but the frequency of such alleles should decrease smoothly across all PGS groups, and indeed, we observed that for rare deleterious alleles in the PGS (Fig. 4A). Similarly, the number of homozygous deleterious genotypes decreased smoothly with PGS, not enriched in any particular group (Fig. 4A). Neither of these counts was predictive of vPTD (i.e. no association after adjusting for PGS, P = 0.37 and 0.41). We then tested the top principal components, seeking to identify any broader genetic features related to vPTD, and found that only the third PC was significantly associated with it (P = 0.016). Also, this PC was not associated with all PTDs (P = 0.16). Inspecting it revealed that its loadings were higher in chromosome 6 and specifically in the major histocompatibility complex (MHC) region, 6p22.1-6p21.3 (Fig. 4B). This suggests that immune pathways may be particularly relevant to the genetic risk of vPTD, possibly through gene interactions or other complex mechanisms producing the observed non-linear relation with PGS.

Discussion
In summary, we conducted GWASs of gestational duration, noting three associated loci, and of preterm birth, which did not find any significant associations. Together with existing genome-wide studies, this brings the total number of loci known to be involved in pregnancy length to 29. Re-analyzing these loci, as well as a gestational duration PGS, in time-to-event models showed that time-varying effects are common and are often strongest in the preterm period. The PGS results and exploratory analyses also suggest that very preterm birth may be controlled differently, possibly involving the major histocompatibility gene region.
Our maternal GWAS results, in terms of the top variants, are largely similar to previous ones obtained from a direct-toconsumer genotyping cohort (7). This cohort substantially differed from ours in phenotype ascertainment, exclusion criteria and preterm delivery rates (which in Norway are relatively low (1)), so the agreement between results is encouraging for the generalizability of GWASs. The weak fetal genetic associations are also consistent with literature (5). Still, cohorts focusing on non-European ancestries, and especially high-case-rate populations, are largely missing and will be crucial to further progress in the genetics of pregnancy.
Our primary finding of practical importance is that many of the variants discovered for gestational duration also have strong effects in the preterm period. In other words, the lack of associations with preterm delivery does not mean lack of effects-rather, the power loss when dichotomizing is just too great (this is difficult to quantify directly under time-varying effects, but we showed a large loss under the simple liability model). Researchers thus should be encouraged to use gestational duration as a continuous phenotype that leads to clinically relevant findings. Stronger early effects in survival models can also be a technical artifact of frailty-unaccounted heterogeneity between the individuals, typically owing to genetics or other covariates (15). However, as we observed several different effect shapes across the loci, and the effects were unchanged after accounting for known clinical covariates or the genetic background (PGS), we do not believe that frailty was a major factor in this study. Alternatively, such data could be analyzed with accelerated failure time models, which are not impacted by frailty (16), but they still require some development of significance tests and computational tools before practical use.
We also find that, while the gene effects are often present at some level throughout pregnancy, their strength greatly varies over time. This suggests that the biological clock controlling pregnancy length does not 'tick' evenly but rather has distinct phases at which different genes become relevant. A similar theory has been proposed for miscarriage timing-namely, that there are distinct checkpoint times at which fetal development is evaluated and the pregnancy continued or not (17). In relation to preterm birth, different, simpler clock models have been discussed in (18), but here, we present the first data from human genomics directly showing the complexity of this regulation.
Further functional work will still be needed to clarify the mechanisms behind the associations observed here. In particular, it is possible that the relevant genes directly act very early in pregnancy, e.g. in the trophoblast invasion (9), but the impact of that only becomes apparent later. This is compatible with the decreasing effects that we observe for many genes (19). On the other hand, we do show that our loci are actively expressed in early-to-mid-gestation placenta, which supports a more persistent role throughout pregnancy and is also notable as many transcripts are missing in the placenta (20). In any case, we expect that follow-up studies can use our results to prioritize the most relevant genes and periods for each gene.
Finally, the analysis framework presented here can be applied to other exposures and outcomes as well. Similar questions about the timing of effects have been investigated with other factors of preterm birth as well, but the studies so far used ad hoc techniques, such as separate logistic regression models within subgroups (12). Moreover, there is great interest in applying time-to-event models to the age of onset of various adulthood diseases (19,21,22). In both cases, adopting the procedures presented here could provide easier interpretation, likely more precise effect estimates (owing to the use of continuous data across the full range of the phenotype) and more direct testing of clinically and biologically relevant questions.

Data source
The Norwegian Mother, Father and Child Cohort Study (MoBa) is a population-based pregnancy cohort study conducted by the Norwegian Institute of Public Health (23). Participants were recruited from all over Norway in 1999-2008. The women consented to participation in 41% of the pregnancies. The cohort includes ∼114 500 children, 95 200 mothers and 75 200 fathers. The data used in this study are version 12 of the quality-assured data files released by MoBa and include genotype data, questionnaires filled out by the parents and linked records from Medical Birth Registry (MBRN), a national health registry containing information about all births in Norway.
The establishment of MoBa and initial data collection were based on a license from the Norwegian Data Protection Agency and approval from The Regional Committees for Medical and Health Research Ethics. The MoBa cohort is currently regulated by the Norwegian Health Registry Act. The current study was approved by the Norwegian Regional Committee for Medical and Health Research Ethics South-East (2015/2425) and Swedish Ethical Review Authority (Dnr 2022-03248-01).

Study population and genotyping
The present study uses a subset of this cohort: pregnancy outcomes, maternal and fetal genotypes from 31 496 parent-offspring trios that were genotyped over 2012-2018. We note that these samples have been used as a small part of a previous metaanalysis of gestational duration (6). Genotyping included only singleton, live-birth pregnancies with complete birth registry data and at least the first MoBa questionnaire answered and individuals alive at time of genotyping. Further, we excluded pregnancies that used IVF, where the mother had diabetes, the child's APGAR score was 0 at both 1 and 5 min, recorded gestational duration was an extreme outlier and kept one random pregnancy for each mother that had repeated pregnancies in the cohort.
Pre-phasing was conducted locally using Shapeit v2.790 (24). Imputation was performed at the Sanger Imputation Server with Positional Burrows-Wheeler Transform and HRC version 1.1 reference panel (25). All coordinates are provided in reference to the human genome build GRCh37.

Genome-wide association studies
Four GWASs were performed in this data, using maternal or fetal genomes and gestational duration (in days, determined by ultrasound) or preterm delivery (gestational duration < 259 days) as the outcome. The analyses were performed in PLINK v2.00 alpha 3.3 (26) using an additive regression model (linear or logistic) with minor allele dosage and batch covariates. Nonspontaneous deliveries (initiated by induction or cesarean) and genetic variants with minor allele frequency < 1% were excluded here. Standard genome-wide significance threshold of 5e-8 was used. Lambda and intercept parameters for genomic inf lation were calculated using LD score regression (27) and pre-computed LD scores obtained in subjects of recent European ancestry from the 1000 Genomes Project. The regional scatter plot for the fetal locus was done using the LocusZoom's web application (28). The LD reference panel was the 1000 Genomes Mar 2012 EUR cohort.
For comparing the effect directions, variants at suggestive association in both of the maternal GWASs (P < 1e-5) were selected and pruned by distance (i.e. removing other variants within 1.5 Mbp).
For comparing the GWAS power for the two phenotypes, we used formulas from (29). Specifically, we assume that the gestational age distribution is normal, and variants only have additive, constant effects on its mean, not in any way specific to preterm or other time periods. This is the liability model, and then, the sample size required to achieve equal power in continuous and dichotomized analysis (namely, Wald tests of simple linear or logistic regression coefficients) is related by a factor of t 2 , where t = φ Φ −1 (K) / √ K (1 − K), K is the prevalence of cases, and φ, Φ −1 -the pdf and quantile function of the standard normal distribution, respectively. The same factor relates the continuous and dichotomized effect sizes and SEs: β d /SE d = tβ c /SE c . The significance level allowing power f in the dichotomous analysis is Φ −Φ −1 1 − f − β d /SE d ; we calculated this for effect size corresponding to the minimum standardized effect detectable in the continuous analysis at genome-wide significance and 80% power, i.e. β c /SE c = Φ −1 (0.8)−Φ −1 5 × 10 −8 /2 , converted as before. Power for the conducted GWASs was also calculated using the genpwr R package, with observed sample size, prevalence and SD.

Time-to-event analysis of top variants
For detailed analysis of timing, we selected all loci identified as genome-wide significant for pregnancy duration or preterm birth in the largest available maternal (6) and fetal (5) meta-analyses (all loci discovered in the third large study by (7) are represented by at least one variant in these results). From each locus, the top variant as initially reported was chosen. One new fetal variant identified in the present study was added to this set. The name of each locus here is the primary gene assigned to it in the source study.
The variants were re-analyzed in time-to-event models based on the Cox proportional hazards framework (see (30) for a general introduction to such models, or (13) about their use with pregnancy duration). Our event of interest is spontaneous delivery. The hazard h(t) then is defined as the instantaneous risk of delivery at time t for a mother who has not delivered before t. The covariates x are modeled to change this risk log-linearly: where h 0 (t) is a baseline hazard, and β(t) are the effects of covariates. Note that this effect may vary in time, e.g. a covariate may be only relevant during some particular periods of pregnancy.
Inspecting the shape of β(t) for each selected variant and testing whether it is constant are the main aims of our analysis. Our primary approach for computing this uses an approximation of (1) the piecewise exponential additive mixed model, PAMM (31). Its idea is to partition the range of observed event times into short intervals, within which the hazard is assumed to be constant and the number of events is Poisson-distributed. Given sufficiently short intervals, the likelihood of this model is proportional to that of (1), so this way, existing tools for Poisson regression and generalized additive mixed models can be used to fit model (1). In particular, the baseline hazard and the effect β(t) can be expanded using a spline basis to obtain smooth non-linear forms (31,32).
Specifically, for each variant, we fit the following hazard rate model: where T is the end time of the interval containing t, B m is the basis functions of a cubic regression spline and g is the minor allele dosage and other covariates (with constant effects) z. The model is fit using R packages mgcv (32) and pammtools (31). Our null hypothesis is that the variant has only constant effects, i.e. all β 1,m = 0. We test its significance and extract the estimated effects over time and their confidence intervals using standard mgcv functionality (32). We also checked that this test maintains the type I error rate in our setting: we simulated the phenotype by bootstrap from the observed distribution in the sample, simulated a locus with no effect and minor allele frequency of 0.3 and fitted the model, in 2000 replicates. The outcome and covariates were prepared as follows. The pregnancy durations were shifted to start at 1 (by subtracting 169 days), so that the period where no live deliveries are expected would not affect the estimation. Non-spontaneous deliveries were treated as censored. The PAMM interval endpoints were set to 0, 20 and then every 7 days. Minor (within this dataset) allele dosage was used as retrieved from the imputation, i.e. a number between 0 and 2, with occasional non-integer values representing uncertain imputation. Maternal dosage data were used for variants discovered in maternal GWASs and fetal dosages for fetal variants. The other covariates were genotyping batch, maternal age (in years) as a pair of degree-2 orthogonal polynomials, the year of the delivery, maternal height (in centimeters), fetal sex (reference level male), presence of congenital malformations (reference level no) and parity (reference 1, other categories 0, 2 or ≥3 previous deliveries). For maternal height, missing or extreme outlier values were mean-imputed.
To verify that our PAMM setup approximates the Cox model (1), at least when the effects are constant, we ran a simplified PAMM, and a Cox regression with the same covariates, as implemented in the R survival package (33), and compared their β g estimates. As a sensitivity analysis, two other tests for non-proportionality of hazards were applied. First, a PAMM that includes a time × genetic effect interaction as a specified form (linear), instead of the spline, i.e.
The significance of the interaction term, as calculated by mgcv, is reported. The second test was the cox.zph test implemented in the survival package; it again reports the significance of the interaction term, but here the times are replaced with the corresponding Kaplan-Meier survival estimates S(t) to reduce the effect of outliers, i.e. the term is β gt g (1 − S(t)) (33). Among the top loci, gene pathway enrichment was checked using the gprofiler2 R package (34). The package tests pathways from multiple sources, including GeneOntology and WikiPathways, and reports a P-value for overrepresentation (the package applies a custom 'g:SCS' multiple testing adjustment, determined empirically).
As a functional support for the observed effects of the top loci, we also inspected the expression levels of the corresponding genes in RNA sequencing data from (14). The data were obtained from 125 placenta samples, collected at pregnancy termination between 6 and 23 weeks of gestation. The read counts for each gene were retrieved from GEO (GSE150830), and we removed nonprotein-coding genes based on Ensembl annotations. We report the median of sample counts for each gene and the median of these over genes in each set (top loci or all other).

Construction and analysis of the gestational-duration PGS
We calculated a gestational-duration PGS for each mother included in the study. The score weights were previously trained in a maternal GWAS meta-analysis on gestational duration (6). The MoBa cohort of the present study was excluded, resulting in n = 173 162 mothers for the GWAS, and the score weights were trained using LDpred2. Each individual's PGS is the product of the weighted effects of the variants from the training and the number of corresponding effect alleles. In this section, we derived hardcall genotypes for the study population and calculated the individual scores using PLINK 1.9. Note that fetal scores were not used here owing to low overall heritability. The phenotype was prepared as described in the previous sections, with induced births censored in the PAMMs and excluded in all other models.
The variance explained by the PGS was reported as the Rsquared from a linear regression model of gestational duration with PGS as a covariate (with or without the batch and clinical covariates described before). For comparison, we reran this model but excluding all variants further than 50 kb away from any of the top maternal hits (weights were unchanged for the retained variants). We also refitted the PAMM model for each variant with PGS as an additional constant-effect covariate.
The PGS was then categorized into quintiles and a PAMM fitted with a smooth term for each category (except the middle one as the reference). This was repeated with tertile or septile categories of PGS instead. The genotyping batch and the clinical covariates were kept as constant-effect covariates. We also show the risks of preterm delivery and vPTD (fraction of spontaneous births in this sample < 259 or < 224 days of gestation, respectively) and their confidence intervals obtained with R prop.test.
In the exploratory analyses, we first inspected the total perindividual counts of rare deleterious alleles, which we defined as having a negative effect of at least 0.001 days in the PGS and frequency < 10%. We similarly plotted the counts of deleterious recessive genotypes-minor homozygous genotypes for alleles with < −0.001 days effect in the PGS. We then ran a principal component (PC) analysis on the maternal genetic data, using only autosomal PGS variants, pruned to reduce dependency with PLINK's -indep-pairwise 100 10 0.2 command. Top 10 PCs were extracted. All PCs and the two counts were tested for association with preterm delivery or vPTD in logistic regression models, adjusting for the PGS and genotyping batch. Variant loadings on PC3 were taken as reported by PLINK, and the Genome Reference Consortium boundaries of the MHC region were retrieved from https://www.ncbi.nlm.nih.gov/grc/human/regions/MHC?asm= GRCh37.

Supplementary Material
Supplementary Material is available at HMG online.