Limits to Genomic Divergence Under Sexually Antagonistic Selection

Since the autosomal genome is shared between the sexes, sex-specific fitness optima present an evolutionary challenge. While sexually antagonistic selection might favor different alleles within females and males, segregation randomly reassorts alleles at autosomal loci between sexes each generation. This process of homogenization during transmission thus prevents between-sex allelic divergence generated by sexually antagonistic selection from accumulating across multiple generations. However, recent empirical studies have reported high male-female FST statistics. Here, we use a population genetic model to evaluate whether these observations could plausibly be produced by sexually antagonistic selection. To do this, we use both a single-locus model with nonrandom mate choice, and individual-based simulations to study the relationship between strength of selection, degree of between-sex divergence, and the associated genetic load. We show that selection must be exceptionally strong to create measurable divergence between the sexes and that the decrease in population fitness due to this process is correspondingly high. Individual-based simulations with selection genome-wide recapitulate these patterns and indicate that small sample sizes and sampling variance can easily generate substantial male-female divergence. We therefore conclude that caution should be taken when interpreting autosomal allelic differentiation between the sexes.


KEYWORDS
antagonistic selection sexual conflict male-female divergence intersexual F ST genetic load Females and males use largely the same genome to produce distinct phenotypes and behaviors. This ubiquitous phenomenon requires an association between dimorphic phenotypes and their sexual environment (Kasimatis et al. 2017;Mank 2017b). Genes residing on a sex chromosome have a physical link to sex determination. Particularly, on heteromorphic sex chromosomes, the lack of recombination allows for selection to act in a sex-specific manner to optimize beneficial genes within each sex (Rice 1984(Rice , 1987Charlesworth and Charlesworth 1980). Conversely, the shared genetic basis of autosomal genes constrains such sex-specific optimization of fitness. When autosomal-based traits have different optimal fitness values in each sex, then selection acts in a sexually antagonistic manner to push females and males in opposing directions in phenotype space (Rice and Holland 1997;Bonduriansky and Chenoweth 2009). However, recombination and meiotic segregation uncouple beneficial alleles from their sexual environment every generation, constraining the resolution of antagonism via the creation of separate female and male genomic pools. This homogenization process tethers together the evolutionary responses of the sexes and creates an inherent intersexual genomic conflict (reviewed in Bonduriansky and Chenoweth 2009;Kasimatis et al. 2017).
Identifying sexually antagonistic loci-particularly using reverse genomics approaches-has proved challenging. Initial studies calculated differentiation between females and males using Wright's fixation index (F ST ), and interpreted high values as evidence of sexually antagonistic selection. Empirical data from multiple taxonomic groups (Lucotte et al. 2016;Flanagan and Jones 2017;Wright et al. 2018;Dutoit et al. 2018) suggest that hundreds to thousands of SNPs have elevated male-female autosomal differentiation with outliers exceeding F ST ¼ 0:01 (Lucotte et al. 2016;Wright et al. 2018) and even approaching F ST ¼ 0:2 (Flanagan and Jones 2017). (Cheng and Kirkpatrick differentiation are striking. However, these results are difficult to evaluate as they suggest that there must be quite a large amount of selective mortality within each sex to create such high divergence within a generation (as noted by Cheng and Kirkpatrick 2016).
Several different processes could in principle generate divergence (or apparent divergence) between the sexes. First, sex biases in chromosome segregation through associations with the sex determining region could distort allele frequencies between the sexes. Over time, this segregation distortion can contribute to the generation of neo-sex chromosomes (Jaenike 2001;Kozielska et al. 2010)particularly heteromorphic sex chromosomesleading to sex-specific differentiation in the trivial sense that the locus is completely absent in one sex. Second, gametic selection resulting in a sex-specific fertilization bias could also distort allele frequencies (Joseph and Kirkpatrick 2004). Both of these processes occur during the gametic phase of the lifecycle and have long been recognized for their potential ability to distort segregation ratios within the sexes (reviewed in Immler and Otto 2018). In contrast, sexually antagonistic viability selection that occurs post-fertilization is a fundamentally different mechanism because there is no direct co-segregation of sex with the alleles under selection. Previous work on sexually antagonistic viability selection has largely focused on its potential role in maintaining genetic variation due to the sex-specific pleiotropic effects of the locus. In particular, Kidwell et al. (1977) laid out a framework for analyzing sexual antagonism that has widely been used in the field (Arnqvist 2011;Connallon et al. 2010;Connallon and Clark 2011;Patten and Haig 2009;Fry 2010). A little appreciated feature of the Kidwell model is that it tracks allele frequencies (rather than diploid genotype frequencies) in adults from each generation to the next. Although the model incorporates diploid selection, this sampling paradigm is sufficient because the "random union of gametes" model of mating only requires allele frequencies to generate diploid genotype frequencies in the next generation. However, this model simplification prevents the inclusion of other models of mating, such as assortative mating among genotypes.
In this paper, we will first build a model of sexually antagonistic viability selection, segregation, and transmission, extending the model of Kidwell et al. (1977) to include assortative mating. We use this model to evaluate how much between-sex differentiation is produced across a range of selection, dominance, and assortative mating parameters. Second, we use these results to evaluate the claims that the observed between-sex allelic differentiation is caused by sexually antagonistic viability selection. We then use simulation to test the conclusions of our deterministic model, as well as the role of sampling variance in generating loci with high between-sex differentiation. Both our single locus model and individual-based simulations with antagonistic loci distributed genome-wide indicate that antagonistic selection must be remarkably strong to produce non-negligible divergence between the sexes. Instead, simulations indicate that sampling variance is much more likely to account for extreme between-sex divergence and must therefore be explicitly included in any analyses of putative signatures of male-female divergence.

Model
Consider an autosomal locus in which are found two alleles: one femalebeneficial (A 1 ) and one male-beneficial (A 2 ). Sexual antagonism results in a fitness cost to individuals carrying the allele favored in the other sex (Kidwell et al. 1977;Bodmer 1965). The life cycle is shown in Figure 1. Each generation begins with zygotic frequencies equal in each sex, but then genotype-dependent survival results in different genotype frequencies in each sex at time of mating. The relative fitnesses of genotypes A 1 A 1 , A 1 A 2 , and A 2 A 2 in females are 1::1 2 h f s f ::1 2 s f , where s f is the cost of a female having the male-favorable allele and h f is the dominance coefficient in females. (This is a model of viability selection, so here and below "fitness" refers to viability.) Writing the frequencies of the three genotypes in zygotes as p 11 ðtÞ, p 12 ðtÞ, and p 22 ðtÞ at the start of generation t, the genotype frequencies in females after selection will then be proportional to p 11 ðtÞ, p 12 ðtÞð1 2 h f s f Þ, and p 22 ðtÞð1 2 s f Þ, respectively. Similarly, the relative fitnesses of the genotypes A 1 A 1 , A 1 A 2 , and A 2 A 2 in males are 1 2 s m ::1 2 h m s m ::1, and the genotype frequencies in males after selection are proportional to p 11 ðtÞð1 2 s m Þ, p 12 ðtÞð1 2 h m s m Þ, and p 22 ðtÞ, respectively. Therefore, the frequency of the female-beneficial allele in females post-selection, which we denote p f ðtÞ, is The same quantity for males is: Figure 1 Lifecycle of the model. Zygotes are subject to sexually antagonistic viability selection (s m and s f ), perturbing allele frequencies in adults in a sex-specific manner. Sex-specific adult allele frequencies are given in Equations 1 and 2, where w f ¼ w f Á p t and w m ¼ w m Á p t . Surviving adults produce gametes of each allele type in frequencies corresponding to Equations 1 and 2. At this time meiotic segregation breaks the association between the locus and sex. Females and males mate with frequencies proportional to the mate choice matrix (M) to produce the zygote pool in the next generation. Kidwell et al. (1977) gives the recursion for the allele frequencies in gametes (p m , p f ), under the assumption of random, genotype-independent mating. p m ðtÞ ¼ p 11 ðtÞð1 2 s m Þ þ 1 2 p 12 ðtÞð1 2 h m s m Þ p 11 ðtÞð1 2 s m Þ þ p 12 ðtÞð1 2 h m s m Þ þ p 22 ðtÞ : In a deterministic model of non-overlapping generations without gamete-specific selection, the genotype frequencies in the next generation are determined by the frequency of gametes joining from each of the nine possible mating combinations weighted according to mate choice. We parameterize mate choice using a matrix whose rows are indexed by male genotypes and columns by female genotypes, such that M ij is the frequency of pairings of male genotype i with female genotype j relative to that expected under random mating. We focus on three common mating scenarios by structuring the mate choice matrix as: Under random mating, each pairing occurs with equal likelihood (m 1 ¼ m 2 ¼ m 3 ). Positive assortative mating by genotype occurs when females and males with the same genotype mate more frequently than those with different genotypes (m 1 . m 2 ¼ m 3 ). Conversely, disassortative mating by genotypeor positive assortative mating by fitnessoccurs when A 1 A 1 individuals mate with A 2 A 2 individuals (m 1 ¼ m 2 , m 3 ). The genotype frequencies in the next generation can be concisely calculated with some matrix algebra. Let w f ¼ ð1; 1 2 h f s f ; 1 2 s m Þ and w m ¼ ð1 2 s m ; 1 2 h m s m ; 1Þ be the vectors of relative fitnesses in females and males respectively. Then, define the 3 · 3 matrix of fitness-weighted mate pairings, F, so that for each pair of genotypes a and b, the entry F ab ¼ w m ðaÞM ab w f ðbÞ. In other words, where diagðw m Þ denotes the matrix with w m on the diagonal and zeros elsewhere. Finally, define b ¼ diagð1; 1=2; 0Þ and g ¼ diagð0; 1=2; 1Þ. Then, the vector of frequencies of each genotype among zygotes (before selection) in the next generation can be calculated using the current frequencies as a weighted sum over possible mating pairs: Here, pðtÞ ¼ ðp 11 ðtÞ; p 12 ðtÞ; p 22 ðtÞÞ is the column vector of genotype frequencies and pðtÞ T is its transpose. This set of equations can be derived by noting that the relative frequencies of A 1 A 1 , A 1 A 2 , and A 2 A 2 genotypes produced in the next generation are pðtÞ T bFbpðtÞ, pðtÞ T ðbFg þ gFbÞpðtÞ, and pðtÞ T gFgpðtÞ, respectively; since b þ g ¼ I, the identity matrix, these sum to pðtÞ T FpðtÞ, the denominator in equations (3). We used Mathematica v11.1.1.0 (Wolfram Research, Inc.) to find the equilibria of this system and determine stability of those equilibria. The complete notebook is provided in File S1.

Within-generation statistics
Sex-specific viability selection creates differences in allele frequencies between the sexes each generation. We can therefore quantify the effects of sexually antagonistic selection using the male-female F ST statistic, which we calculate as the squared difference in allele frequencies between sexes, normalized by the total heterozygosity across sexes (Cheng and Kirkpatrick 2016;Wright 1951): Sex-specific selection creates divergence between the sexes by increasing the frequency of the beneficial allele in each sex. Therefore, at the population level, this opposing action of section skews genotype frequencies away from Hardy-Weinberg equilibrium. The deviation from Hardy-Weinberg equilibrium within the population due to sex-specific effects can be quantified using Wright's F IS statistic (Wright 1951): À p 11 ðtÞ þ p 12 ðtÞ 2 ÁÀ p 22 ðtÞ þ p 12 ðtÞ 2 Á 2 1: A population fitness cost due to sexual antagonism (i.e., genetic load), is generated each generation. Within each sex, the genetic load is the difference between the maximum possible fitness and the mean fitness (Haldane 1937). The population's average genetic load (L) is the average of the loads for each sex (assuming an equal sex ratio), which is given by:

Simulations
We used R (R Core Team 2018) (File S2) to simulate allele frequency dynamics at a single locus in a population subject to selection and drift. During viability selection each generation, each individual survived with probability equal to their (sex-and genotype-dependent) fitness. Then, the genotype frequencies within each sex were multiplied to give the matrix of relative frequencies of possible mating pairs, which was further weighted by the mate choice matrix. To generate the next generation, a fixed number of mating pairs are sampled from this distribution, and offspring are produced by random choice of parental alleles.
We also implemented simulations with sexually antagonistic selection acting at many loci, genome-wide with SLiM v3.1, an evolution simulation framework (Haller and Messer 2019) (recipes in File S3). Individuals each had a genome of 100 Mb, a uniform recombination rate of 10 28 per nucleotide, and a mutation rate of 10 210 . All mutations are sexually antagonistic (we do not simulate neutral variation): each new mutation was beneficial in a randomly chosen sex and detrimental in the other, with selection coefficients drawn independently for each sex from a Gaussian distribution with mean zero and standard deviation 0.01. Each mutation also had dominance coefficients drawn independently for each sex from a uniform distribution between 0 and 1. The model had overlapping generations: each time step, first viability selection occurred (with probability of survival equal to fitness), followed by reproduction by random mating. The number of new offspring was chosen so that the population size fluctuated around 10,000 diploids, and simulations were run for 1,000 time steps. For a neutral comparison, we also simulated from the same scenario but with no fitness effects. We ran 5 independent simulations of each scenario (i.e., neutral and sexually antagonistic).
After the final generation, genetic load and male-female F ST at each locus were calculated. F ST values were calculated both using all individuals within the population as well as using smaller subsamples of 100 individuals and 50 individuals with equal numbers of each sex. Subsample sizes were chosen to reflect sample sizes currently used in the literature (Dutoit et al. 2018;Flanagan and Jones 2017 (Bhatia et al. 2013).)

Data availability
The model equations, equilibrium analyses, and stability analyses are given in the Mathematica notebook in File S1. The single locus simulations and SLiM statistical analyses are given in File S2. The SLiM code and simulation data are available in Files S3-S5. Supplemental material available at FigShare: https://doi.org/10.25387/g3.9785951.

RESULTS
We first examine the conditions under which our model supports a stable polymorphism, and then examine the degree of between-sex divergence and genetic load expected under both equilibrium and nonequilibrium (selective sweep) conditions. Our single locus results expand upon previous the results of previous studies (Kidwell et al. 1977;Arnqvist 2011;Cheng and Kirkpatrick 2016;Kasimatis et al. 2017). We then verify these results using simulations, which also provide an opportunity to explore the effects of statistical sampling on inferences of sex-specific differentiation from genomic samples.

Transmission dynamics at a sexually antagonistic locus
Maintenance of polymorphism usually requires symmetric selection between the sexes under random mating: We will quantify the strength and degree of asymmetry between the sex-specific allelic effects using the overall strength (s) and the ratio of selection coefficients (a), so that s m ¼ s and s f ¼ as. The full solution for the maintenance of polymorphism under arbitrary patterns of dominance can be solved by setting pðt þ 1Þ ¼ pðtÞ in the recursion equations above (Equations (3); File S1). Under general conditions, this system yields a fifth-order polynomial that does not readily generate a closed form solution in symbolic form, although the equilibria can be easily found numerically. Symbolic solutions are possible under some specific conditions.
Assuming random mating and additivity of allelic effects (h m ¼ h f ¼ 0:5), the frequency of the A 1 allele at equilibrium (denotedp A1 ) can be expressed in terms of the strength of selection and asymmetry in selection: When the strength of selection is equal between the sexes (a ¼ 1), an equilibrium frequency ofp A1 ¼ 0:5 is always predicted. This theoretical solution is well supported by the stochastic simulations as well ( Figure 3A-B). The bounds on the non-trivial equilibrium frequency can be found by settingp A1 to zero or one. By solving these equations for a in terms of the strength of selection (s), we find that for the equilibrium to be stable, a and s must satisfy the condition: These bounds can also be found by calculating the Jacobian matrix for the full set of transition equations (File S1) and agree with those identified by Kidwell et al. (1977). In general, the equilibrium conditions describe an expanding envelope in parameter space that allows more asymmetry in the pattern of antagonistic selection as the absolute strength of selection increases (Figure 2A & B). To a first order approximation in s, equation (7) shows that the equilibrium is stable only if asymmetry is not larger than the strength of selection, such that ja 2 1j , s, as shown in Figure 2A. Thus, when selection is weak or moderate, the maintenance of a polymorphism requires approximately equal selection between the sexes. However, the permissible degree of asymmetry increases with the strength of selection ( Figure 2B). For example, when s $ 0:4 a stable polymorphism can be maintained so long as the asymmetry in fitness (j1 2 aj) is less than 50%. Selection coefficients of this magnitude mean mortality rates of 40% or higher each generation due to a single incorrect sexually antagonistic allele, which seems biologically implausible. Therefore, under additivity, any stable antagonistic polymorphisms in natural populations must have approximately equal fitness effects in the two sexes, while less balanced antagonistic loci will quickly be fixed or lost.
On the other hand, if dominance is allowed to vary between the sexes but selection is equally antagonistic across the sexes (a ¼ 1), there is always a single real, non-trivial equilibrium ( Figure 2C), whose stability depends on the sum of the dominance coefficients between the sexes (Kidwell et al. 1977). When h m þ h f # 1 the equilibrium is stable (File S1). This stability boundary makes sense as the mean fitness of homozygous individuals is lower than that of heterozygous individuals (assuming equal sex ratios): In other words, the equilibrium remains stable if the deleterious effects of dominance in one sex do not outweigh the benefits in the other sex. Interestingly, weak selection at a locus with sex-beneficial dominance (h m ¼ h f ¼ 0) can maintain a stable polymorphism despite greater asymmetry in selection than can an additive model (File S1). This expansion of the region of stability is likely a result of heterozygotes being shielded from antagonistic selection and suggests that modifying dominance can act to maintain sexual antagonism at a locus (Connallon and Chenoweth 2019). Conversely, when 1 , h m þ h f # 2, dominance favors the deleterious allele in each sex, pushing the population to an unstable state and leading to the fixation of the less costly allele. Allowing for asymmetry in the strength of selection narrows the equilibrium space and reduces the range of dominance coefficients resulting in stability ( Figure 2D).
Assortative mating by fitness expands the polymorphism space: Under positive assortative mating by fitness, high fitness matings occur between disparate genotypes and therefore produce an excess of heterozygotes each generation. In this situation (i.e., m 3 . m 2 ¼ m 1 ), up to three real non-trivial equilibria can exist depending on the selection and dominance parameters (File S1). However, as with random mating, at most one equilibrium is stable. When selection is symmetrically antagonistic across the sexes (a ¼ 1), an A 1 allele frequency of approximately 0.5 is always predicted, regardless of dominance. This prediction is borne out by the single locus simulation results, which further show that assortative mating by fitness tends to make the stable equilibrium more robust to the effects of genetic drift ( Figure 3C). Increasing the asymmetry of selection can introduce an additional unstable equilibrium, and increasing the strength of sex-deleterious dominance (toward h m ¼ h f ¼ 1) can introduce a second unstable equilibrium (File S1). These theoretical predictions agree with previous simulations of assortative mating (Arnqvist 2011). As with random mating, the relationship between the strength and asymmetry in selection is the critical factor in determining when equilibria are stable. Specifically, when the asymmetry in selection is sufficiently large, fixation of the more favored allele is expected. Fixation only tends to occur under unrealistically large viability costs, however, and so the predominant outcome of assortative mating by fitness is the maintenance of heterozygotes and an expansion of the equilibrium space relative to random mating.
Assortative mating by genotype leads to fixation: In contrast to assortative mating by fitness, if assortative mating is by genotype (m 1 . m 2 ¼ m 3 ), there is only a single non-trivial equilibrium (File S1). This equilibrium is always unstable, regardless of dominance, as shown by the leading eigenvalue of the Jacobian matrix. Figure 3D shows allele frequency trajectories that start at this unstable equilibrium rapidly go to loss or fixation (with the choice determined by random genetic drift). Thus, these mating dynamics shrink the parameter space for maintaining a stable polymorphism and lead to the loss of the weaker antagonistic allele.  (2017)  When selection and dominance coefficients are chosen such that a stable equilibrium is maintained, divergence between the sexes tends to be exceptionally low ( Figure 4A). For example, a 10% viability cost (s ¼ 0:1) results in a between-sex F ST value of 0.0007 at equilibrium Figure 2 The equilibrium space for the A 1 allele under differing selection and dominance conditions. A) The equilibrium space at an additive locus (p A1 , Equation 10), when selection is weak and related between the sexes by the ratio a. Here the equilibrium space is symmetric around a ¼ 1 and confined to approximately equal selection between the sexes. The solid black line represent the permissible bounds on a (7) and the dashed gray line represents the first order Taylor series approximation. B) The equilibrium space at an additive locus increases as the strength of selection increases. The solid black line represents the bounds on a and the dashed gray line represents the second order Taylor series approximation. C) The equilibrium space across all dominance conditions when selection is equal between the sexes (s ¼ 0:1; a ¼ 1). When the dominance coefficients between the sexes sum to no greater than one (h m þ h f # 1), then the equilibrium is stable. However, when the sum is greater than one the equilibrium is unstable. D) Strong, asymmetric selection (s ¼ 0:4; a ¼ 1:5) narrows the equilibrium space and range of stable conditions (h m þ h f # 0:8).

Male-female divergence is exceptionally low
(assuming an additive locus and random mating). An equilibrium male-female F ST value of 0.0016 (as in flycatchers) requires at least a 15% viability cost within each sex (s ¼ 0:15, a ¼ 1). To produce equilibrium F ST values an order of magnitude larger (as reported for the largest loci in the other taxa) requires a 30-65% viability cost (s ¼ 0:30 2 0:65, a ¼ 0:8 2 2:0). For these values to be a product of viability selection, the field would need to have overlooked as much as 50% genotype-dependant mortality (or infertility) for each sex every generation, which seems implausible in these taxa.
Greater divergence can be generated across a broader range of selection values when an antagonistic locus transiently sweeps through a population. Here a viability cost of 10% produces higher divergence than at equilibrium, although divergence is still low in absolute terms (F ST , 0:002 across dominance values, under random mating). Again, at least a 30% viability cost would be required to produce F ST values above 0.05. Sex-specific beneficial dominance (h m ¼ 0, h f ¼ 0) is expected to generate the lowest levels of between-sex divergence, while sex-specific deleterious dominance (h m ¼ 1, h f ¼ 1) yields the greatest levels of divergence, though such a scenario seems biologically unstable ( Figure 4B). Importantly, under weak selection dominance has only a negligible effect on divergence. In fact, varying dominance does not generate quantitative changes in F ST unless selection is remarkably strong (s . 0:5). Rather, asymmetry in selection seems a more important driver of divergence in non-equilibrium populations, as this asymmetry is precisely the factor that moves populations away from equilibrium conditions to a state in which the least costly allele sweeps to fixation. Across the range of a values with a fixed mean strength of selection between the sexes, divergence slightly increases as asymmetry between the sexes increases ( Figure 4C). However, the male-female F ST values are of the same magnitude despite strong asymmetry when the mean strength of selection in confined in this manner. When the strength of selection varies independently between the sexes, increasing a yields much greater divergence, though this result is confounded by overall stronger selection in one sex. Overall, substantial divergence between the sexes still requires strong selection in non-equilibrium populations.
Sexual antagonism generates a substantial genetic load Since the mean relative fitness for females and the mean relative fitness for males are each maximized under fixation for different alleles at an antagonistic locus, sexually antagonistic selection generates a genetic load within the population at both a polymorphic equilibrium and during a selective sweep. At equilibrium under random mating, the load is maximized if the strengths of selection in each sex are equal Figure 3 The change in the frequency of a newly derived sexually antagonistic allele (A 1 ) over time. The black line represents the predicted allele frequency from the recursion equation. The overlaid pink and blue lines represent the simulated population (N = 20,000) of females and males, respectively. The strength of selection (s), ratio of selection between the sexes (a), dominance relationship (h m and h f ), and mate choice coefficients (m 1 , m 2 , and m 3 ) are given in each panel. A) Random mating with additive dominance and symmetric selection between the sexes maintains a stable polymorphism. B) Random mating with complete male dominance and symmetric selection between the sexes maintains a stable polymorphism. C) Assortative mating by fitness with additive dominance maintain a stable polymorphism. D) Assortative mating by genotype with additive dominance has an unstable equilibrium. Multiple simulated populations show how drift will quickly lead to fixation or loss of the A 1 allele.
( Figure 5A), and dominance has little to no effect. Importantly, across strengths of selection up to s ¼ 0:5, the load generated at equilibrium exceeds F ST between males and females by nearly a factor of 10 ( Figure  5B). For example, a 10% viability cost (s ¼ 0:1) results in a reduction of population fitness up to 5%, with a maximum F ST value of 0.0007. The load produced by a single antagonistic locus with F ST equal to the mean male-female F ST reported in human HAPMAP data (Lucotte et al. 2016) would exceed 20% ( Figure 5B). This relationship indicates that even weak selection driving low-and probably undetectablelevels of divergence can generate a substantial fitness reduction due to the sex-specific nature of selection.
An alternative way to examine load is by quantifying the excess of heterozygosity due to sexually antagonistic selection, using the F IS statistic (the inbreeding coefficient). Here sex-specific selection creates homozygous pools of each sex within a generation, which leads to an excess of heterozygotes at the start of the next generation. Under weak selection, these departures from Hardy-Weinberg equilibrium are of similar magnitude as male-female F ST ( Figure 5C). However, under strong sexual antagonismsuch as that required to generate the empirically observed divergence values-F IS can approach 10%.
Antagonistic loci that do not have a polymorphic equilibrium tend to produce even greater load while sweeping than a locus at stable equilibrium under similar selective conditions (File S1). Unless selection was very weak (s , 0:05), load tended to exceed 10%. Under strong, asymmetric selection load can approach 70% during a sweep. Additionally, the fitness cost of sexual antagonism remains after an allele fixes. The load generated during a sweep is affected by dominance, with additive loci generating loads that are intermediate to the other dominance scenarios. Beneficial dominance within each sex can apparently resolve some of the underlying antagonism by shielding selection on heterozygotes and therefore reducing the load. In contrast, sex-specific deleterious dominance generated the greatest load.
Genome-wide antagonistic selection also produces low divergence Our analytical results are based on a single-locus model, yet empirical studies report averages across large numbers of loci. To complement the single-locus theory, we quantified the effects of sexually antagonistic selection throughout the genome using individual-based simulations in SLiM Haller and Messer (2019). First, we analyzed male-female F ST values calculated using true population-wide allele frequencies Figure 4 Divergence between the sexes due to a single generation of sexually antagonistic selection. A) Male-female F ST for an additive locus (h m ¼ h f ¼ 0:5) at equilibrium, where the strength of selection between the sexes is related by the ratio a. B) Malefemale F ST as a function of selection for three dominance regimes: sex- The sexspecific deleterious dominance curve is at an unstable equilibrium, while the additive and beneficial dominance curves are at a stable equilibrium (for all curves a ¼ 1 and the equilibrium frequency is 0.5). Sex-specific beneficial dominance always results in the lowest divergence between the sexes. The inset graph highlights the similarly low divergence values generated under weak and moderately weak selection. C) Male-female F ST at a sexbeneficial locus (h m ¼ h f ¼ 0: the polymorphic equilibrium is stable only if a ¼ 1) as a function of A 1 allele frequency for varying degrees of asymmetry in selection (0:8 # a # 2) with a fixed mean selection coefficient (0:5ðs m þ s f Þ ¼ 0:2).
(i.e., whole-population F ST ). Simulations in which every new mutation was sexually antagonistic in a population of 10,000 individuals resulted in a mean male-female F ST of 0.00005 and a between-replicate standard deviation of 0.0001, consistent with the single-locus theory (since s was around 0.01). However, entirely neutral simulations (equal mutation rates but no selection) resulted in the same mean and SD of malefemale F ST values. Both the sexually antagonistic and neutral simulations averaged around 1,400 SNPs after 1,000 generations of evolution. Although qualitatively similar, the distribution of whole-population male-female F ST values across loci was statistically significantly different between the neutral and sexually antagonistic simulations (Kolmogorov-Smirnov test: D ¼ 0:11, p , 0:001; Figure 6A). However, this difference in distributions was driven by the larger number of intermediate frequency alleles in the sexually antagonistic simulations. In particular, neutral simulations across all five replicates had only two SNPs with a frequency above 10%, while the sexually antagonistic simulations had over 200 SNPs with a frequency above 10%. Despite there being true differences between the neutral and sexually antagonistic simulations, the male-female divergences observed were still exceptionally low. In fact, neither model had any loci with male-female F ST greater than 0.001 ( Figure 6A). Sexually antagonistic simulations had an average 21% decrease in population fitness (L ¼ 0:2160:02) after 1000 generations of evolution, again consistent with single-locus calculations. Even the minimum load observed under the sexually antagonistic scenario corresponded to a 18% decrease in population fitness.

Sampling variance can generate spurious signals of male-female divergence
Both the single locus model and genome-wide simulations indicate that, while theoretically possible, we would need strong sexually antagonistic viability selection to maintain high divergence between the sexes. Alternatively, the large observed F ST statistics might be due to sampling variance. The empirical studies we cite have relatively small sample sizes (between 15 and 200). The male-female F ST values we reported above from simulation were calculated from the entire population. To evaluate the effect of sampling, we calculated male-female  Figure 6B). Additionally, the tail of the distribution was lengthened under both neutral and sexually antagonistic simulation, including in both cases estimated values of F ST several orders of magnitude larger than the values calculated from actual population allele frequencies. In other words, ignoring the noisiness of the F ST estimator would lead us to conclude that between-sex divergence is much higher than it actually is. There was a significant difference in the distribution of male-female F ST values between the neutral and sexually antagonistic simulations both when subsampling at 100 individuals (Kolmogorov-Smirnov test: D ¼ 0:054, p , 0:01) and 50 individuals (D ¼ 0:096, p , 0:001). However, there was no correlation between the F ST values calculated from the full population and those obtained from samples of either 100 individuals (r ¼ 0:003) or the 50 individual subset (r ¼ 2 0:012). This lack of correlation also holds true for the neutral model (100 individuals: r ¼ 0:034; 50 individuals: r ¼ 0:025).
Empirical studies often use estimators of F ST , such as Weir and Cockerham's F ST to account for population size and sample allele frequencies, rather than population allele frequencies. Therefore, in addition to using equation 4, we calculated male-female F ST values in subsamples using Weir and Cockerham's method (Weir and Cockerham 1984;Bhatia et al. 2013). However, this did not qualitatively affect the tail of the distribution of F ST ( Figure 6B). This is not surprising, because Weir and Cockerham's method is designed to obtain unbiased estimates of F STand so re-centers the mean of the distributionbut does not reduce sampling variance.
Although there were more high male-female F ST sites in samples from the sexually antagonistic simulations ( Figure 6B), this did not seem to be a direct result of selection, but was due to the fact that there are many more intermediate frequency alleles in the sexually antagonistic simulations because of balanced polymorphisms. (If the larger number of high F ST sites were due to selection on those sites, then we'd expect to see a correlation between estimated F ST and population F ST , which, as noted above, we do not.) To test this hypothesis, we calculated F ST between two random samples of size 50 drawn from each simulation independently of sex, and also between random samples of size 25. If the enrichment of high between-sex F ST sites in the antagonistic simulations are in fact due to the difference in allele frequency distribution rather than the direct result of selection, then the enrichment Figure 5 The genetic load created by sexually antagonistic selection. A) The genetic load generated at equilibrium for an additive locus across strengths (s) and asymmetries (a) of selection. B) A comparison of malefemale divergence and genetic load for a locus at equilibrium across a gradient of selection coefficients with varying asymmetry. The load generated at a locus exceeds the degree of divergence between the sexes. Each curve corresponds to a different fixed strength of selection from s ¼ 0 to s ¼ 0:5 and each point along the curves corresponds to a different value of a from 0.6 to 2. C) The population inbreeding coefficient F IS for an additive locus (h m ¼ h f ¼ 0:5) at equilibrium. The excess of heterozygous individuals in the population represents the departures from Hardy-Weinberg equilibrium due to sex-specific selection.
should persist even in these samples drawn after randomizing sex. Indeed this enrichment persists, as shown in Figure 6C. Thus, the higher number of intermediate frequency sites in the sexually antagonistic model creates a higher sampling variance of F ST , as expected based on theory (Jakobsson et al. 2013). In particular, the tail of the F ST distributions show many higher values, such that an increase by two orders of magnitude relative to the full population was observed ( Figure 6B). These results suggest that separating signals of weak antagonistic selection from sampling noise will be extremely difficult.

DISCUSSION
Sexually antagonistic viability selection creates allelic divergence between the sexes because the proportions of each genotype that die before reproduction differs between the sexes. This between-sex divergence for non-sex-linked elements is created anew each generation because chromosomal segregation re-assorts autosomal associations across the sexes during sexual reproduction. An emerging trend in sexual antagonism research is the use of male-female genomic comparisons to identify sexually antagonistic loci. These recent studies identified hundreds of sexually divergent autosomal loci with mean divergence between the sexes in the range of 2-7% (Lucotte et al. 2016;Flanagan and Jones 2017;Wright et al. 2018). Taken as reported, these studies suggest the extent and strength of sexually antagonistic selection is far greater than might be anticipated. To assess these claims, we used a population genetic model to determine the magnitude of divergence generated by sexually antagonistic viability selection, the strength of selection required to drive such divergence, and the population fitness costs generated by this process.
Although sexual antagonism has been a topic of particular interest over the last few decades (Arnqvist and Rowe 2005), some of the early  4) and Weir and Cockerham's F ST . Subsampling increases the tail of the distribution substantially. The sexually antagonistic simulations are significantly different from the neutral simulations due to an increased sampling variance in the sexually antagonistic scenario. C) Cumulative distribution curves of per-locus male-female F ST values, both between random samples from the two sexes (solid lines) and between sets of individuals chosen randomly independently of sex (dotted lines). Male-female F ST distributions differed between neutral (gray) and antagonistic (teal) simulations but were not higher for between-sex comparisons, showing that higher F ST values in the antagonistic simulation was not directly due to selection.
investigations of sex-specific selection were largely motivated as part of a general attempt to elucidate all possible means by which the large amounts of segregating polymorphisms observed within natural populations could be maintained (Lewontin 1974). In this context, Kidwell et al. (1977) focused on how sex-specific selection, could maintain a polymorphism at an autosomal locus when alleles had opposing effects in the sexes. Our analysis agrees with Kidwell et al. (1977), though we highlight that maintenance of such polymorphisms would create a substantial genetic load. Additionally, with weaker selection, the parameter space allowing a stable polymorphism becomes quite narrow. These results highlight the necessity for considering biologically relevant conditionsas discussed by Smith and Hoekstra (1980) particularly when theory is is informing signatures of selection within the genome. Our analysis also allows for non-random mate choice, a potentially important underlying component of sexual conflict (Arnqvist and Rowe 2005). We build on Bodmer (1965) to generate a fully generalizable mate choice matrix and find that assortative mating can indeed have a large impact on the conditions for the maintenance of polymorphism. Supporting previous simulations (Arnqvist 2011), we found positive assortative mating by fitness maintained polymorphisms. In particular, we show that the combination of asymmetrical selection between the sexes and deleterious dominance conditions expanded the equilibrium space relative to random mating. However, such deleterious sex-specific dominance would likely be selected against, suggesting that the strength of selection is the more relevant parameter in natural populations.
While the maintenance of polymorphism may have been a primary motivation for previous work, a goal of modern genomics is to use specific signals of genomic differentiation to identify the loci underlying sexually antagonistic genetic effects (Mank 2017a). Building on our previous work (Kasimatis et al. 2017) allowed us to consider the expected degree of between-sex divergence both when an antagonistic polymorphism is maintained at equilibrium in the population, and when no such stable equilibrium exists, so one of the two alleles sweeps toward fixation to the detriment of one sex. Our model and accompanying simulations highlight several potential limitations of detecting sex-specific differentiation in empirical studies.
First, detectable quantitative divergence between the sexes requires exceptionally strong sexually antagonistic selection. Previous work indicates that F ST values between populations is of order s 2 (Charlesworth and Charlesworth 2010;Cheng and Kirkpatrick 2016), but we find that substantially lower values of F ST are often obtained in practice. Even a 10% viability cost in each sex resulted in between-sex F ST values of less than 0.001 (Figure 4), a signal that is unlikely to be distinguishable from noise without sampling many thousands of individuals within each sex. Critically, to achieve divergence values greater than 0.03such as seen in estimates from human datawould require a 30-60% viability cost in each sex under our model. These remarkably high sex-specific mortality rates are, to the best of our knowledge, not observed in nature (see Singh and Punzalan 2018). (However, exceptionally high fecundity animals might withstand such high sex-specific mortality (see Williams 1975).) Cheng and Kirkpatrick (2016) also pointed out that even small male-female F ST values would require unrealistic amounts of genetic load, but still argue in favor of ongoing sex-specific selection at many genes. Strong sex-specific gametic selection is perhaps more plausible than viability selection on adults, but such high levels of genotype-dependent gamete "mortality" in these organisms still do not seem consistent with empirical observations. Additionally, gametic selection requires an epistatic association between the autosomal locus and the sex determining region, which seems highly implausible across so many loci.
Second, asymmetry in the strength of selection between the sexes is critical in determining the degree of divergence generated. When the strength of selection is weak and approximately the same between the sexes, polymorphisms may be stably maintained, but between-sex divergence is small. However, there is no a priori reason to expect that antagonistic mutations should be perfectly symmetrical in their effects and therefore that polymorphic loci should be stable over time. Alleles with more asymmetric effects will often sweep, producing larger but transient between-sex divergences, although again only under moderate to strong selection. Of course, other types of selection (e.g., spatially varying selection) could contribute to maintenance of more asymmetric polymorphisms, but the observations about genetic load should still hold. Here, we found that dominance has little quantitative effect on male-female divergence, particularly when selection is weak. In general, understanding what the distribution of sex-specific effects underlying antagonistic selection looks like will provide important information on the potential for sexually antagonistic loci to contribute to genetic variation and genome evolution.
Thus, the theoretical predictions from our single locus model seem at odds with the empirical patterns reported to date. Taken as true measurements of sexually antagonistic selection, the empirical data could be described by two, non-exclusive genomic patterns. Divergent loci could either be stable polymorphisms or could be arising and sweeping to fixation through a constant genomic churn of antagonistic interactions. Either of these explanations require an exceptionally high genetic load. Again, there is currently no indication that mortality occurs in such a high, sex-specific manner, particularly in some of the vertebrate species that have been examined.
Individual-based simulations with many linked selected loci genome-wide recapitulate the predictions of the single-locus model, finding again that even in this more complex situation, weak selection can only produce very low levels of divergence. Most importantly, however, we found that estimating male-female F ST from samples of the sizes used in the literature (hundreds or less) produced distributions with larger means and longer tails, even in the complete absence of antagonistic selection. Even in simulations with antagonistic selection, any high divergence values were a result of random sampling noise, and did not correlate with the true divergence values or strength of selection. These simulations highlight the sensitivity of F ST statistics to sampling variance, which is a major obstacle for identification of antagonistic loci from sex-specific differentiation. Most existing empirical studies have not taken these effects fully into account. Our simulations are not intended to be comprehensive, but demonstrate that sampling variance can be more important than selection itself in driving high estimates of divergence, and highlight the need for proper sampling theory.
At the very least, studies analyzing male-female F ST values should compare values to empirical distributions found by random permutation of sex labels, as done by Dutoit et al. (2018). Connecting significant SNPs to a phenotype such as sex-biased expression (Cheng and Kirkpatrick 2016;Dutoit et al. 2018) can provide additional evidence for selection, but it is difficult to control for all confounding factors (e.g., overall expression level). Furthermore, population substructure remains a concern even when comparing to permuted data if there is a cryptic correlation of sampling with sex. For instance, suppose that the sampled population is composed of a mixture of two diverged subpopulations, and that the sex and admixture coefficients of the sampled individuals are correlated. As noted by Cheng and Kirkpatrick (2016), this will create spurious male-female F ST . (The samples in Dutoit et al. (2018) were composed of mating pairs from a single island, so this seems unlikely to explain their results, unless one sex is much more likely to disperse between islands than the other.) Other issues beyond sampling variance may well play a role in the large observed male-female F ST values. Reads from the sex chromosome that are wrongly aligned to an autosome, particularly in the heterogametic sex, have the potential to generate spurious F ST peaks (see Tsai et al. (2019)), an issue that may affect some classes of genessuch as those with sex-biased expressionmore than others.
Furthermore, some studies report large numbers of loci with high average F ST . Should we interpret this as evidence of antagonistic selection across many loci simultaneously, or at just a few loci that affect others through linkage? This is not clear, because each generation's sexspecific selection on a single antagonistic allele will also cause betweensex frequency differences at other loci to the extent they are in linkage disequilibrium with the locus under selection. More work is needed to quantify this effect.
More generally, our investigation calls into question whether it is even possible, at any sample size, to identify the action of realistically strong sex-specific antagonistic selection using male-female F ST . Many of the loci previously suggested as contributing to sex-specific antagonistic selection seem likely to be spurious signals resulting from poor statistical inference. While we believe sexually antagonistic selection does contribute to genomic evolution, we strongly caution against the use and over-interpretation of male-female F ST statistics, especially with small sample sizes, and until the potential bioinformatic confounders are better understood.