Natural variation in reproductive timing and X-chromosome nondisjunction in Caenorhabditis elegans

Abstract Caenorhabditis elegans hermaphrodites first produce a limited number of sperm cells, before their germline switches to oogenesis. Production of progeny then ensues until sperm is depleted. Male production in the self-progeny of hermaphrodites occurs following X-chromosome nondisjunction during gametogenesis, and in the reference strain increases with age of the hermaphrodite parent. To enhance our understanding of the reproductive timecourse in C. elegans, we measured and compared progeny production and male proportion during the early and late reproductive periods of hermaphrodites for 96 wild C. elegans strains. We found that the two traits exhibited natural phenotypic variation with few outliers and a similar reproductive timing pattern as previous reports. Progeny number and male proportion were not correlated in the wild strains, implying that wild strains with a large brood size did not produce males at a higher rate. We also identified loci and candidate genetic variants significantly associated with male-production rate in the late and total reproductive periods. Our results provide an insight into life-history traits in wild C. elegans strains.


Introduction
Reproductive timing is a temporarily regulated pattern of reproduction. In a selfing androdioecious nematode Caenorhabditis elegans, progeny-production rate of unmated hermaphrodites is the highest on the second day of adulthood and dramatically declines thereafter, whereas adult worms mostly die at day 15-20. Thus, these individuals produce approximately 90% of their progeny during the first 3 days of adulthood (Klass 1977;Huang et al. 2004;Zhang et al. 2021). This transient reproduction is a result of the sperm-limited fecundity of protandrous C. elegans. In C. elegans, spermatogenesis occurs during the last larval stage, and a shared pool of germ-cell precursors turns irreversibly into oogenesis after the completion of spermatogenesis (Kimble and Ward 1988;Ellis and Schedl 2007). The number of sperms is modulated by the duration of spermatogenesis: the longer spermatogenesis, the larger brood size, and the later sexual maturation. As the number of sperms is much lower than that of oocytes, it acts as a limiting factor of total brood size, resulting in a trade-off between total brood size and generation time (Hodgkin and Barnes 1991;Cutter 2004).
Meiotic X-chromosome nondisjunction (X-nondisjunction) rate increases with maternal age (Rose and Baillie 1979;Luo et al. 2010). The increased X-nondisjunction results in the increased number of male progeny, as C. elegans has an XX/XO sexdetermination system, where worms with two sex chromosomes (XX) develop into hermaphrodites, but worms with single-sex chromosome (XO) develop into males (Hodgkin 1987). Thus, more sperm production in hermaphrodites may result in an enhancement of progeny production in the late reproductive period, subsequently increasing the brood size and male-production rate, although this idea has not been tested. Brood size, reproductive timing, and male-production rates have thoroughly been investigated in wild strains, natural populations, and laboratory mutants (Huang et al. 2004;Barriè re and Fé lix 2005;Teotó nio et al. 2006;Harvey and Viney 2007;Hughes et al. 2007;Luo et al. 2009;Fé lix and Braendle 2010;Luo et al. 2010;Diaz and Viney 2014;Cutter 2015;Fré zal and Fé lix 2015;Poullet et al. 2015;Wharam et al. 2017;Zhang et al. 2021). However, these traits have been measured individually rather than collectively in the same experiment. Thus, to understand the relationship between sperm production and X-nondisjunction, their phenotypic values should be simultaneously quantified.
In this study, we used unmated hermaphrodites of 96 wild C. elegans strains and measured their natural variations in progeny production and male proportion during early and late reproductive periods. Specifically, we counted the numbers of males and hermaphrodites during early and late reproductive periods. This brood size can serve as a proxy of sperm production, although reproductive timing in C. elegans can be modulated by many other biological processes, such as the ovulation rate (Kadandale and Singson 2004;Greenstein 2005). The majority of the wild strains exhibited a similar reproductive timing pattern as reported earlier (Klass 1977;Huang et al. 2004;Zhang et al. 2021), and the two traits were not correlated. Our findings imply that although more self-sperm production may increase progeny-production in the late reproductive period, the rate of X-nondisjunction of meiosis could be maintained in the period. We also conducted genomewide association mapping to identify candidate genetic variants for these phenotypic variations. Our study provides further insights into the natural history of C. elegans.

Caenorhabditis elegans culture
All wild strains were maintained at 20 C on NGM lite plates seeded with OP50. All strains and their phenotypic values are listed in Supplementary Table S1.
Measuring brood size and male-production rate Four virgin hermaphrodites (L4 stage) of each wild strain were transferred separately to new plates, then transferred again every 12-24 h. Three plates were used on the same day, and each experiment was replicated five times. After about 2 days, the offspring could be distinguished by secondary sex characteristics, such as the shape of the tail tip in males. Time constraints prevented examination of all 96 strains at once, so about 15 wild strains were measured in blocks, with strain CB4856 as an internal control. The 96 wild strains were chosen by considering their genetic diversity (Cook et al. 2017). The brood size was calculated as the ratio of the number of progeny to the number of parents, and the male-production rate was calculated as the ratio of the number of males to the number of progeny. These phenotypic values were measured for each period (early, late, and total) and each replicate. Phenotypic change was calculated as (lateÀearly)/(lateearly), with the average of all replicates for each strain in the late and early positions. Since NIC260 was an outlier for the maleproduction rate and ECA36 was an outlier for both the maleproduction rate and the brood size, these strains were excluded from the following procedures.

Comparing standard deviations and estimating heritability
To compare the standard deviations of the phenotypic values for each trait and period, tests of equality of standard deviations were conducted using Minitab 19 (https://www.minitab.com). First, the Anderson-Darling test was implemented to test the normality of the distributions of the phenotypic values of brood size and male-production rate. Normality was not rejected for brood size, but was rejected for male-production rate. Therefore, an Ftest was performed to compare the standard deviations of brood size, but Bonett's test and Levene's test were performed for the standard deviations of the male-production rate. Broad-sense heritability was calculated as described in Gimond et al. (2019) using the lmer() function in the lme4 package (Bates et al. 2014).

Analysis and correction of block effects
Because we divided 96 strains into blocks, we used the CB4856 strains as internal control to test block effects caused by environmental factors between the blocks. To test if the environmental factors made a difference among the blocks, one-way ANOVA was conducted on phenotypic values of CB4856 from different blocks. To determine which block pairs were different, we conducted multiple comparisons t-test using Bonferroni correction. We corrected phenotypic values for block effects using lm() function in the R stats package (R Core Team 2013). The average phenotypic value of CB4856 was used as an independent variable for linear regression, and residuals for strains were calculated.

Correlation analysis and genome-wide association mapping
Correlations were quantified using Pearson's correlation using DatFrame.corr() from the pandas package in Python (McKinney 2010) and scipy.stats.pearsonr() from SciPy (http://www.scipy. org/). Genome-wide association mapping was conducted with the web-based Genetic Mapping in CeNDR (Cook et al. 2017) using the mean phenotypic values of five replicates of 94 wild strains. For phenotypic changes in the early and late periods, the mapping was conducted with the cegwas package, as in the CeNDR webpage (https://github.com/AndersenLab/cegwas2 last accessed: February 11, 2020). The variance explained by each QTL and the environmental conditions of the wild strains that we used were obtained from the mapping in CeNDR. JU311 was excluded from the male-production rate change and correlation analysis because the total male-production rate of the strain was zero.

Filtering and categorizing interval variants in associated loci
The interval variants of the male-production rates in the late and total reproductive periods were analyzed only for protein-coding genes in all associated loci. Lists of interval variants were acquired by Genetic Mapping in CeNDR (Cook et al. 2017). These data included the variants' P-values for the phenotypes using Spearman's correlation tests and their putative impact on the gene, as estimated by SnpEff (Cingolani et al. 2012). The variants of each trait were filtered by P-value <0.05, and then among the filtered variants, only those common to both traits were chosen. The common variants were categorized according to their putative impact, and whether they were variants of a gene that is associated with a high incidence of male progeny phenotype. The same process was carried out for genes that had physical, genetic, or regulatory interactions with TGF-b Sma/Mab ligand DLB-1 or insulin/IGF-1 receptor DAF-2 (Harris et al. 2020).

Calculating linkage disequilibrium between QTL peaks
We measured the linkage disequilibrium between pairs of QTL peaks by calculating the Pearson's correlation coefficient (Hill and Robertson 1968).

Results
To understand natural variation in reproductive timing phenotypes, we measured the brood size and the male-production rate, parameters that reflect sperm production and X-nondisjunction rates, respectively. We counted all progeny of 96 wild strains, and checked their sexes in two different reproductive periods: 0-36 (early) and 36-72 (late) hours after adulthood, as C. elegans worms produce around 90% of their progeny during these periods (Supplementary Figure S1, Supplementary Table S1). Almost all wild strains exhibited continuous phenotypic distributions for all the traits, but we identified a few outliers (Supplementary Figure S2). ECA36 had the smallest total brood size, of less than 50, which was one-third of the second-lowest brood size. ECA36 and NIC260 were outliers in the rate of male production over the total reproductive period. The male-production rates of ECA36 and NIC260 were around 6% and 0.6%, respectively, which were 30 and 3 times higher than the third-highest rate. We excluded these two outliers from further analysis. In addition, as we divided 96 strains into blocks to measure their phenotypes, we tested the block effects in our data using one-way ANOVA and multiple comparisons t-test with Bonferroni correction (Supplementary Figure S3, Supplementary Table S2). Male-production rates exhibited no significant difference among blocks regardless of reproductive timing, but the brood size did (Supplementary Figure S3,  Supplementary Table S2).
We then compared the phenotypes in the early and late reproductive periods to identify how parental age affects progeny production and male-production rate in wild strains. We found that the average brood size was 148 in the early period and 96 in the late periods ( Figure 1A). The average male-production rate was 0.05% in the early period and 0.12% in the late periods ( Figure 1B). We also estimated the broad-sense heritability of these traits. Heritability values of brood size were 36%, 62%, and 66% for early, late, and total reproductive periods, respectively ( Figure 1A), and the heritability of the total brood size was similar to the previously reported value (63%) (Zhang et al. 2021). The maleproduction rate exhibited lower heritability values than those of the brood size (9%, 12%, and 18% for male-production rate of early, late, and total reproductive periods, respectively; Figure 1B). Of the 94 wild strains, 87 (92.5%) laid more progeny, and 82 (87.2%) produced fewer male progeny in the early period than in the late period (Figure 1, C and D), suggesting that reproductive timing and male-production patterns in wild strains may be similar to those of the reference strain (Klass 1977;Rose and Baillie 1979;Huang et al. 2004). We tested whether these reproductive timing traits share an underlying genetic architecture by analyzing correlations by genotype for the following eight traits: brood size and maleproduction rate in early, late and total reproductive periods, and changes in these values between the early and late periods. The brood size and male-production rate were not significantly correlated (Figure 2A, Supplementary Table S3), implying that different genetic mechanisms may mediate the two traits. In contrast, the total brood size was weakly positively correlated with the early brood size, but strongly positively correlated with the late brood size, as the phenotypic values of the early brood size were similar among all wild strains ( Figure 1A). Phenotypic variations in the late brood size may therefore have primarily affected the total brood size. The total male-production rate was highly positively correlated with the early and late male-production rates, but the early and late male-production rates were weakly correlated to each other, suggesting that they might also have different genetic architectures.
We conducted genome-wide association mapping to investigate the underlying genetic architecture of these traits. We found that several loci were significantly associated with the maleproduction rates of late and total periods ( Figure 2B), but could not find any significant loci for the other six traits (Supplementary Figure S4). The significant loci for maleproduction rates in the late and total periods were located on chromosomes IV and V, and largely overlapped, as they were highly correlated (Figure 2, A and B), suggesting that they might share significant components of the genetic architecture. We investigated further, using correlations between phenotypic variations and genetic variants at the loci, and found that 3050 variants were significantly correlated with phenotypes, and were shared between the two traits ( Figure 3A).
We classified these shared variants according to the known phenotypes that appeared when the functions of the genes containing the variants were impaired by mutation or RNA interference. Among these genes, 11 were known to be associated with the high incidence of male progeny (Him) phenotype ( Figure 3B) (Harris et al. 2020). We categorized the shared variants by their putative impacts using SnpEff, and found that zim-2 contains a high-impact variant and that coh-3, him-17, him-5, him-8, lex-1, rad-51, sao-1, srgp-1, zim-1, zim-2, and zim-3 contain moderateimpact, low-impact or modifier variants (Supplementary Table  S4) (Cingolani et al. 2012;Cook et al. 2017).
We divided wild strains according to their alleles, and calculated the phenotypic variances that could be explained by each locus (Figure 4). We found that for every locus, only five to eight wild strains contained high-male-production rate alleles, and the other approximately 90 strains had low-male-production rate alleles (Figure 4, A and B). The variances explained by each locus ranged from 11% to 17% for late male-production rates, and 15%-19% for total male-production rates, values which are similar (Figure 4, A and B), and are also similar to the heritability values for those traits (12% and 18%, respectively; Figure 1B). These loci T o t a l C h a n g e L a t e E a r l y T o t a l C h a n g e L a t e E a r l y T o t a l C h a n g e L a t e E a r l y Male-production rate (late period) Chr. X B Figure 2 Correlation matrix and genome-wide association mapping for brood size and male-production rate. (A) Heatmap of Pearson correlations between brood size and male-production rate of early, late, and total reproductive periods and their changes over time (*P < 0.05; **P < 0.01; ***P < 0.001).
(B) Manhattan plots for genome-wide association mapping of male-production rates in the late and total periods. Red horizontal lines represent the adjusted Bonferroni cut-off for 5% (ÀlogP ¼ 5.44) and red dots indicate SNVs beyond that threshold. Cyan bars show 95% confidence intervals. exhibited strong linkage disequilibrium, even between loci on two different chromosomes (Figure 4, C and D).

Discussion
The reproductive timing in C. elegans is associated with sperm production, generation time, and X-nondisjunction. As C. elegans is a sequential hermaphrodite, longer sperm-production may delay the generation time and also increase the X-nondisjunction rate owing to the fact that oocyte quality control mechanisms become dysfunctional with age. Intriguingly, we found no significant correlation between progeny number and male-production rate, which were proxies of sperm-production and X-nondisjunction rate, respectively. Most strains produced more progeny and males in the late reproductive period, but strains with larger brood sizes did not always exhibit a higher male-production rate.
Furthermore, the male-production rate remained low throughout not only the early but also the total period, implying that spontaneous male production and chromosome segregation were restricted to reduce the high cost of males and aneuploidy (Smith 1978;Cutter et al. 2019). There were two outliers with several or tens of times higher male-production rates in C. elegans wild strains, and one of these outliers, ECA36, had a very small brood size. How this outlier survived in the natural environment and which genetic components underlie these extreme phenotypic variations remain elusive. Although the male-production and outcrossing rates were reported not to be correlated in wild strains of C. elegans (Teotó nio et al. 2006), dramatic increases in the male-production rate, for example by tens of times, may lead to routine production of males, and may possibly enhance outcrossing rates. These outliers may be selected under outcrossing-favoring environments such as those involving co-evolving parasites (Morran et al. 2009).
Different genetic variations may underlie phenotypic variations in sperm production and X-nondisjunction rates, as these traits were only weakly correlated to each other. TGF-b signaling, insulin/IGF-1 signaling, dietary restriction and other pathways modulate reproductive aging in C. elegans. TGF-b signaling and insulin/IGF-1 signaling pathways have been reported to be associated with oocyte quality control mechanisms, such as chromosome segregation (Luo et al. 2009(Luo et al. , 2010. Notch signaling pathway may affect the oocyte-production rate, as it controls the fates of germline stem cells (Kocsisova et al. 2019). These signaling pathways have genetic variations in wild strains, and they may affect phenotypic variations in reproductive aging phenotypes, but we could not find any loci significantly associated with any of the traits except late and total male-production rates. These loci contain some components of the insulin/IGF-1 pathway and genes known to be related to the X-nondisjunction rate. The genes and variants which mediate the phenotypic differences are still poorly understood.
Our genome-wide association mapping experiments identified only a few associated loci for two out of eight traits. These associated loci contain over 10,000 variants, and such low resolution, in addition to our negative results for other traits, could arise because the number of wild strains that we used was not large enough for fine mapping. It is also possible that the wild strains that we used do not have enough genetic diversity, as C. elegans has experienced a chromosome-scale selective sweep, and its hermaphroditism prevents the mixture of different genetic backgrounds (Barriè re and Fé lix 2005; Dolgin et al. 2007;Andersen et al. 2012;Gimond et al. 2013). These characteristics of C. elegans might result in low genetic diversity in wild strains. Our data showed that more than 90% of the wild strains that we used contained the same allele at each peak of the associated locus (Figure 4, A and B). A recent collection of C. elegans from Hawaii exhibits much higher genetic diversity than the set we used, which could possibly be used to resolve other loci associated with the remaining traits (Crombie et al. 2019). In addition, the correction of block effects in brood size and the utilization of more variants than those we used for the mapping ($14,000 variants) may improve the mapping resolution. We also added the corrected phenotypic values in Supplementary Table S1.
There were several limitations in our experimental procedure. In the previous study, spontaneous male production may have skewed the natural variation in X-nondisjunction rate as mixed worms were possibly used (Teotó nio et al. 2006 I  II III IV  V  X  I  II III IV V X IIS Figure 3 Filtering candidate variants in loci significantly associated with male-production rates in the late and total reproductive periods. (A) Filtering scheme. All interval variants in the loci were filtered according to their correlation with the traits. The loci for the two traits shared thousands of common variants. (B) The number of overlapped and filtered variants were categorized by their putative impacts on genes, as estimated by SnpEff (gray boxes and black lines) (Cingolani et al. 2012). These variants were further categorized by gene functions, including: genes known to interact with the IIS receptor, DAF-2 (blue), or to exhibit Him phenotypes (red), as annotated in the WormBase. Him: high incidence male progeny.  However, they require $12 h to reach adulthood; therefore, a maximum 12-h difference in their age may skew the precise estimation of reproductive timing. Specifically, if sperm production timing delays the onset of laying eggs, in strains with larger brood size, early and late periods would not be precisely divided. We also did not consider the embryonic lethality and larval arrest, which possibly skews brood size or male-production rate. Moreover, our experimental procedure to maintain and select worms for experiments also generated uncontrolled variations, as population density and maternal age affect nutrient availability and brood size. Wild strains have a CB4856-type npr-1 allele, and it reduces brood size under high population density (Andersen et al. 2014). In addition, 1-day adulthood hermaphrodites produce offspring with smaller brood size than that of 3-day adulthood hermaphrodites because older mothers can devote more resources to their eggs (Perez et al. 2017). We did not consider these various sources as we maintained worms with high density ($100 worms/plate) and selected them regardless of maternal ages. Other factors also could affect the brood size or reproductive timing in C. elegans, such as pheromones or temperature differences (Harvey and Viney 2007;Poullet et al. 2015;Wharam et al. 2017), but our experimental design may not fully consider these possibilities. As a result, we may have found that the brood size of our internal control strain, CB4856, was significantly variable among experimental blocks (Supplementary Figure S3). Our result demonstrates that reproductive timing is an important factor of brood size and male-production rate in C. elegans wild strains and that different genetic mechanisms may modulate the two traits. Our associated loci and variants for the maleproduction rate may act as candidate genetic variants for the trait. It could help in understanding the natural history of C. elegans.

Data availability
All genotypes, marker information, and mapping process are available in the C. elegans Natural Diversity Resource (https:// www.elegansvariation.org/). All phenotypes are listed in Supplementary Table S1 and the whole procedure was detailed in Materials and Methods.
Supplementary material is available at G3 online.