Mutations involving the SRY-related gene SOX8 are associated with a spectrum of human reproductive anomalies

Abstract
SOX8 is an HMG-box transcription factor closely related to SRY and SOX9. Deletion of the gene encoding Sox8 in mice causes reproductive dysfunction but the role of SOX8 in humans is unknown. Here, we show that SOX8 is expressed in the somatic cells of the early developing gonad in the human and influences human sex determination. We identified two individuals with 46,XY disorders/differences in sex development (DSD) and chromosomal rearrangements encompassing the SOX8 locus and a third individual with 46,XY DSD and a missense mutation in the HMG-box of SOX8. In vitro functional assays indicate that this mutation alters the biological activity of the protein. As an emerging body of evidence suggests that DSDs and infertility can have common etiologies, we also analysed SOX8 in a cohort of infertile men (n = 274) and two independent cohorts of women with primary ovarian insufficiency (POI; n = 153 and n = 104). SOX8 mutations were found at increased frequency in oligozoospermic men (3.5%; P < 0.05) and POI (5.06%; P = 4.5 × 10⁻⁵) as compared with fertile/normospermic control populations (0.74%). The mutant proteins identified showed altered biological activity as compared with the wild-type protein. These data demonstrate that SOX8 plays an important role in human reproduction and SOX8 mutations contribute to a spectrum of phenotypes including 46,XY DSD, male infertility and 46,XX POI.


Introduction
Human sex determination is a tightly controlled and highly complex process, where the bipotential gonad anlage develops as one of two mutually antagonistic fates: the ovary or the testis. SRY is well established as the primary testis-determining gene on the Y chromosome. In the XY gonad, SRY acts during embryonal development to upregulate the downstream effector SOX9 beyond a critical threshold, which promotes testis development and in turn represses 'ovarian pathways' in the male gonad through multiple mechanisms (1,2). In mice, up-regulation of Sox9 in males is driven by the synergistic action of Sry with Nr5a1 [also known as Steroidogenic factor-1 (Sf1)], through binding to multiple elements within a testis-specific Sox9 enhancer (Tesco) (1). Sox9 expression is required to establish Sertoli cell identity in the developing testis (2). Once formed, foetal Sertoli cells produce anti-Müllerian hormone (AMH) and coordinate the cellular and morphogenetic events leading to primary sex determination (2).
Although the SOX genes SRY and SOX9 play essential roles in driving early mammalian testis-determination, emerging evidence suggests the contribution of another Sox gene, Sox8, in murine testis-determination as well as in the maintenance of gonadal function. Sox8 shows an overlapping expression pattern with Sox9 in the foetal and adult mouse gonad and functional redundancy between Sox8 and Sox9 occurs during testis development (3-6). Sox8 XY null mutants show normal testis development but develop post-natal progressive spermatogenic failure (7).
Development and maintenance of the mammalian gonad is regulated by a double repressive system, where an equilibrium of mutually antagonistic pathways must be attained for normal development of either the testis or ovaries (2-6). In humans, changes in this delicate balance can lead to Disorders of (or Differences in) Sex Development (DSD), which are defined as congenital conditions with discordant development of chromosomal, gonadal or anatomical sex (8). The incidence of DSD has been estimated at 1:2500 to 1:5000 births (9,10). Excluding inborn errors in steroidogenesis, a molecular diagnosis will be reached in only 20% of all individuals with 46,XY DSD (9). Around 40% of the cases with 46,XY complete gonadal dysgenesis (CGD) can be explained by mutations involving three genes: SRY, NR5A1 and MAP3K1 (4,5). Pathogenic variants in other sex-determining genes such as CBX2, GATA4 or FOG2/ZFPM2 are found in a very small percentage of cases (9,10). Hence, the aetiology in the majority of individuals with DSD remains poorly understood.
An emerging body of evidence suggests that DSD and infertility can have common aetiologies. For example, mutations involving NR5A1 (SF-1), a key player in many aspects of reproductive function including sex determination, are associated not only with a spectrum of DSD such as 46,XY CGD, 46,XY undervirilised males with testes, or 46,XX (ovo)testicular DSD but also with more prevalent forms of human infertility, i.e. 46,XY men with spermatogenic failure and 46,XX women with primary ovarian insufficiency (POI) (11-14). Human infertility is a major human health issue; one in seven couples worldwide have problems conceiving and both men and women are affected equally (15,16). POI is characterised by primary or secondary amenorrhea, high gonadotropin levels (FSH above 40 IU/l on two occasions at least a month apart) and estrogen deficiency in women under the age of 40 years (17). Male infertility includes azoospermia, characterized by complete absence of spermatozoa in the ejaculate, and oligozoospermia, defined as sperm concentrations below the World Health Organization reference level of 15 × 10⁶ sperm/ml (18). The underlying basis for either male or female infertility is complex, including both physiological and environmental factors as well as gene mutations. Although mouse models have revealed hundreds of genes that are associated with fertility, only a few single-gene defects that cause male and/or female infertility have been identified in humans (12,13,19-24). Here, for the first time we demonstrate that mutations involving the human SOX8 gene are associated with a range of phenotypes including 46,XY DSD as well as male infertility and ovarian insufficiency in females.

Results
Rearrangements involving the SOX8 locus and 46,XY DSD
Patient 1 is a phenotypic female who presented at the age of 27 years with primary amenorrhea (Patient 1, Table 1). Chromosome analysis showed a 46,XY karyotype with a paracentric inversion [46,XY,inv(16)(p13.3p13.1); Fig. 1A and B], which was confirmed by FISH analysis (Fig. 1C). Her parents and unaffected sister were not available for analysis. Hence, it is unknown whether this rearrangement is de novo; however, it has not been reported in control databases. Array-CGH did not reveal any chromosomal imbalances, including on the chr16 short arm, associated with the phenotype. The centromeric breakpoint was mapped within the BAC clone RP11-609N14 (chr16: 10 428 838-10 600 163) (Fig. 1D). The telomeric breakpoint was mapped within the BAC clone RP11-728H8 between chr16: 814 190-962 809 (Fig. 1E). Whole genome sequencing confirmed these results and indicated a centromeric breakpoint at chr16: 10 522 104-10 522 114 and a telomeric breakpoint at chr16: 881 929-881 965. The transcription initiation site of the SOX8 gene is at chr16: 1 031 808. Therefore, the telomeric breakpoint is located ~150 kb upstream of SOX8 (Fig. 1A). A bilateral gonadectomy revealed two small gonads, which on histological examination were streak-like with no germ cells or differentiated tissue (Fig. 2A).
Patient 2 is a 46,XY girl with mild post-natal anaemia, sleepiness, poor feeding and tachycardia (Patient 2, Table 1 and Supplementary Material). Physical examination showed a prominent forehead and ophthalmological evaluation revealed small bilateral chorioretinal colobomas. At birth the genitalia consisted of a midline clitorophallic structure with bilateral labioscrotal folds, which were fused at the midline. On pelvic ultrasound, there was no uterus or vagina noted and gonads and epididymides were identified in the labia bilaterally. Pituitary hormone levels were within normal limits.
The child was assigned male and at 10 months of age underwent the first stage of surgery for hypospadias, chordee repair and posterior urethral mobilization, and scrotoplasty. At 2 years 10 months the patient underwent bilateral orchidopexy, bilateral inguinal hernia repair, and bilateral gonadal biopsy. Gonadal histology showed seminiferous cords, variable germ cells and reduced Leydig cell numbers by calretinin staining (Fig. 2B-E). aCGH detected a two copy DNA gain in the 16p13.3 region [arr(hg19) 16p13.3(137 893-992 302)×4, log2 = +1], spanning at least 854 kb (Fig. 1F), resulting in tetrasomy of that DNA segment. This region comprises at least 27 genes, including the α-globin gene cluster (Supplementary Material, Table S1). FISH analyses using the RP11-598I20 BAC probe on cultured peripheral blood samples from the patient and both parents indicated that the tetrasomy was a de novo event (Fig. 1G and H). The centromeric breakpoint of this triplication was located within intron 2 of the LMF1 (lipase maturation factor 1) gene, ~39.5 kb upstream of the SOX8 gene. Homozygous, loss-of-function LMF1 gene mutations are responsible for hypertriglyceridemia and decreased lipase activity and hence this gene was not considered to be associated with the phenotype (25). Interestingly, SOX8-specific enhancer elements are included in the triplication (26). Together, these data suggest that rearrangements at the SOX8 locus resulting in dysregulation of SOX8 expression could negatively impact testis-determination and may result in 46,XY DSD including disorders of testis-determination. Consistent with this postulation, we observed that SOX8 is co-expressed with NR5A1 and SOX9 in the early stages of human testis-determination in Sertoli cells and Leydig cells as well as in Sertoli and Leydig cells in adult men (Fig. 3A, Supplementary Material, Fig. S1).
Single cell RNA sequencing on XY mouse gonads during sex determination has also demonstrated that Sox8 is co-expressed with other genes involved in sex determination including Nr5a1, Dmrt1 and Gata4 (27).

A missense mutation in the HMG-box domain of SOX8 is associated with a lack of testis-determination
Based on the above observations, we extended the SOX8 mutation screen to a large cohort of 46,XY DSD individuals (Table 2). Three heterozygous missense SOX8 variants were identified in a screen of 204 cases of unexplained 46,XY DSD. Two of these changes were observed in the control cohort and their contribution to the phenotype is considered unlikely. The third mutation, a heterozygous c.468G>C, p.Glu156Asp amino acid substitution, was identified in a 46,XY phenotypic female who presented at 16 years of age with primary amenorrhea (Patient 3, Table 1; Fig. 4A). The Glu156 residue is located in the HMG-box and is evolutionarily conserved not only within the SOX8 protein but also in other members of groups E and F of the SOX protein family (Fig. 4B). The mode of inheritance of this mutation is unknown as the parents were unavailable for study. This mutation was not observed in our control cohort, but it is present as a rare variant in the ExAC database (2/121 228 alleles; rs764098477). The SOX8 p.Glu156Asp mutation is not expected to disrupt DNA binding per se because this residue does not map to the DNA interaction interface (Fig. 4C). Furthermore, exchanging the equivalent residues in Sox2 (Sox2Lys to Glu) and Sox17 (Sox17Glu to Lys) is reported not to show any effect on DNA binding by the SOX HMG box in the absence of partner proteins (28). However, this mutation could affect how SOX8 dimerizes with its partner factors on DNA, since changing the same residue in Sox2 and Sox17 affected the ability of these proteins to interact with OCT4 [Fig. 4C (28)].
A series of in vitro analyses were performed to assess the effect of this mutant on the biological activity of the SOX8 protein. Transient transfection assays were performed since a number of SOXE group-responsive genomic elements have been identified that, when placed upstream of a reporter gene, can be used to quantitate the transcriptional regulatory function of protein variants (29). The function of SOX genes involves a complex interaction with many other transcriptional co-regulators, including other SOX proteins (29), and therefore we considered that the consequence of the SOX8 mutation may be promoter or cell context dependent. Our data show that the mutant SOX8 protein exhibits a context-specific loss-of-function activity. The WT and mutant SOX8 proteins can activate a series of gonadal promoters (Fig. 5A). However, the SOX8p.Glu156Asp specifically fails to synergize with NR5A1 to transactivate the Sox9 Tesco enhancer element (Fig. 5A), yet both WT-SOX8 and SOX8p.Glu156Asp proteins physically interact with the NR5A1 protein (Fig. 6, Supplementary Material, Fig. S2). The mutation also impacted on the functional interaction between SOX8 and SOX9. Both SOXE proteins have the ability to heterodimerize and combinatorially regulate their target gene expression (30,31). We show that the WT-SOX8 protein binds to SOX9; however, the SOX8p.Glu156Asp protein cannot (Fig. 6, Supplementary Material, Fig. S2). The SOX8p.Glu156Asp mutant also exerts a repressive effect by negatively affecting the synergistic activation of the Tesco enhancer by NR5A1 and SOX9 (Fig. 5B). Finally, the SOX8p.Glu156Asp protein also has a dominant negative effect on the WT-SOX8 protein since it reduced, in a dose-dependent manner, the synergistic activation of the Tesco reporter by WT-SOX8 and NR5A1 (Supplementary Material, Fig. S3). All these results are consistent with the hypothesis that the mutant SOX8 protein has an altered biological activity that impairs testis-determination.

SOX8 mutations associated with human infertility
To investigate whether mutations in SOX8, like NR5A1, could also contribute to a wider spectrum of male and female infertility, we sequenced the coding region of SOX8 in 427 men and women with infertility. The incidence of mutations in the control cohort of fertile and/or normospermic men was 0.74%. In contrast, rare or novel SOX8 variants were observed in 1.47% of azoospermic men, 3.5% of oligozoospermic men (P = 0.0151, Fisher's t-test, two-tailed) and 9/153 women with POI in an initial cohort (5.9%; P = 0.000107) and in 4/104 (3.85%; P = 0.0191) of women with POI in a replication cohort (Tables 2 and 3).
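The cohort frequencies quoted above can be recomputed from the reported counts, and pooling the two POI cohorts gives 13 carriers among 257 women. The sketch below does that arithmetic and runs a two-sided Fisher's exact test on a 2×2 table, using only the Python standard library. The control counts (6 carriers among 813 men) are an assumption chosen to be consistent with the stated 0.74% frequency; the actual counts are in Tables 2 and 3, which are not reproduced here, and this is not claimed to be the authors' exact procedure.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers falling in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Counts taken from the text: POI carriers / cohort size.
poi_initial = (9, 153)
poi_replication = (4, 104)

# ASSUMPTION: hypothetical control counts consistent with the stated 0.74%.
control_carriers, control_total = 6, 813

poi_carriers = poi_initial[0] + poi_replication[0]
poi_total = poi_initial[1] + poi_replication[1]
pooled_freq = poi_carriers / poi_total

p_value = fisher_exact_two_sided(
    poi_carriers, poi_total - poi_carriers,
    control_carriers, control_total - control_carriers,
)

print(f"pooled POI frequency: {pooled_freq:.2%}")
print(f"two-sided Fisher P (assumed controls): {p_value:.2e}")
```

Because the control table is assumed, the printed P-value should be read as an order-of-magnitude check rather than a reproduction of the published figure.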
Combining the data from both cohorts, we found a significant association of mutations in SOX8 with POI (5.06%; P = 4.5 × 10⁻⁵). The majority of mutations associated with infertility flank the HMG-box and fall within one of the two transactivation domains. Although many of these missense mutations are predicted by computational methods to have a deleterious effect on protein function, the interpretation of the significance of mutations in human disease remains very challenging. We therefore sought to test the consequence of the SOX8 mutations associated with infertility in a series of functional assays. As expected, the transient transfection assays using the AMH, Dmrt1 and NR5A1 promoters as reporters in HEK 293-T and mES cells indicated biological differences between mutated and wild-type proteins that, in some cases, were promoter specific, e.g. the p.Asp382Asn mutation shows a specific loss-of-function with the AMH promoter (Supplementary Material, Fig. S4). Furthermore, some mutant proteins show changes in cellular localization (Supplementary Material, Fig. S5).
Although a contribution of Sox8 to male infertility in the mouse has been established for some time, the association between human POI and SOX8 variants is a novel finding since Sox8−/− female mice are fertile (7). We therefore sought to examine the profile of SOX8 expression in the human ovary to see if this is consistent with a role for mutations in the protein contributing to female infertility. We observed that the SOX8 protein is expressed in the 40-week foetal ovary and in the adult ovary (Fig. 3B and Supplementary Material, Fig. S1). The SOX8 protein is highly expressed in granulosa cells at both stages.

Discussion
Our data provide the first evidence that SOX8 plays a role in human sex determination and in the function of the adult gonad. Mouse models also suggest that Sox8 plays a key role in early testis-determination as well as maintaining male fertility. We identified three patients with 46,XY DSD who carried rearrangements/mutations involving the SOX8 gene. One patient had a paracentric inversion with a breakpoint ~150 kb upstream of the SOX8 gene. The only other gene within the region is LMF1. Since homozygous, loss-of-function mutations in the LMF1 gene are associated with hypertriglyceridemia and decreased lipase activity, this gene was not considered to be responsible for the phenotype (25).
A second patient carried four copies of an 854 kb region immediately 5′ to the SOX8 gene including SOX8 enhancer elements as well as the α-globin gene cluster (26). Duplication of α-globin genes is a rare cause of anaemia that may lead to imbalances of α- and β-chains in the haemoglobin tetramer, especially in β-thalassemia carriers (32,33). Alternatively, the 16p13.3 chromosomal rearrangement may result in disruption of the long-range regulation leading to a partial silencing of gene expression consistent with prenatal and postnatal anaemia observed in patient 2 (34). The other somatic anomalies seen in this patient may be a result of gene dosage of one or more of the other genes located in the duplicated region (Supplementary Material, Table S1). Indeed, a patient with 46,XY gonadal dysgenesis, skeletal and cardiac anomalies and developmental delay was reported to carry a similar 560 kb duplication located approximately 18 kb upstream of SOX8 (35). These rearrangements at the SOX8 locus may cause dysregulation of SOX8 expression. This hypothesis is consistent with the multiple reports of 46,XY gonadal dysgenesis associated with chromosomal rearrangements even up to several Mb upstream of genes involved in sex determination including SOX9 and NR0B1 (36-39). Furthermore, the centromeric breakpoint in patient 2 falls between the conserved E1 and E2 SOX8 enhancer elements, which are required for murine Sox8 gene expression (26). Dysregulation of SOX8 expression could negatively impact on male gonadal development and lead to various degrees of 46,XY DSD. Indeed, 46,XY individuals carrying deletions that include SOX8 (ATR16 syndrome) occasionally present with mild anomalies such as hypospadias or cryptorchidism and Patients 1 and 2 may represent part of this broad DSD spectrum (40).
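The distances quoted for Patients 1 and 2 follow directly from the chr16 coordinates reported in the Results. As a quick check, the sketch below recomputes them; the variable names are illustrative only, and the coordinates are copied from the text without any liftover or reannotation.

```python
# Coordinates on chr16 as reported in the Results.
SOX8_TSS = 1_031_808  # transcription initiation site of SOX8

# Patient 1: telomeric inversion breakpoint interval (whole genome sequencing).
p1_breakpoint = (881_929, 881_965)
p1_distance_kb = (SOX8_TSS - p1_breakpoint[1]) / 1000

# Patient 2: segment gained on aCGH.
p2_gain = (137_893, 992_302)
p2_span_kb = (p2_gain[1] - p2_gain[0]) / 1000
p2_distance_kb = (SOX8_TSS - p2_gain[1]) / 1000

print(f"Patient 1 breakpoint to SOX8: {p1_distance_kb:.1f} kb")  # 149.8 kb
print(f"Patient 2 gained segment:     {p2_span_kb:.1f} kb")      # 854.4 kb
print(f"Patient 2 gain to SOX8:       {p2_distance_kb:.1f} kb")  # 39.5 kb
```

These values agree with the approximately 150 kb, 854 kb and 39.5 kb figures given in the text.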
The third patient presenting with 46,XY gonadal dysgenesis had a missense mutation involving a highly conserved glutamic acid residue within helix 3 of the SOX8 HMG-box. The severe gonadal phenotype in this patient, which is due to lack of appropriate testis-determination, may be due to the specific dominant negative activity of the SOX8p.Glu156Asp protein on the WT-SOX8 protein and/or the repressive effect of this mutant protein on SOX9/NR5A1 synergy at the Sox9 Tesco enhancer that we observed using in vitro assays (Fig. 7). These data, which provide evidence for a role of SOX8 in early human testis development, are supported by murine studies. The absence of both Sox9 (pro-testis) and Rspo1 (pro-ovary) in XX foetuses results in the development of ovotestes and hypoplastic testis (4). The SoxE group genes Sox10 and Sox8, which are normally repressed by Rspo1, are activated in the double knockout gonads, suggesting that Sox8 may serve as a driver of testis formation in the absence of Sox9 (4). Recent data suggest that Sox8 may also be involved in repressing the ovarian pathway, via the repression of Foxl2 expression during early testis formation (41).
In this study, we have also identified mutations in SOX8 associated with cases of male and female infertility. Although the mechanism(s) leading to infertility in either sex is unclear, all mutations associated with infertility showed differences in biological activity compared with the WT protein and all mutations flank the HMG-box. Transient transfection assays using as a target the promoters of genes important for gonad function showed a wide range of effects including promoter specific loss-of-function or alterations in the cellular localization of the mutated protein. Inferring precisely how these mutations cause infertility is difficult since the gene regulatory pathways downstream of SOX8 have not been defined in either the XY or XX gonad. In mice, Sox8 is essential for the maintenance of male fertility beyond the first wave of spermatogenesis (10). At 5 months of age Sox8−/− mice show a progressive degeneration of the seminiferous epithelium through perturbed physical interactions between Sertoli cells and the developing germ cells (10). Here, we identified rare or novel SOX8 mutations in 3.5% (P < 0.05) of men with unexplained reduced sperm counts as compared with a frequency of 0.78% in ancestry-matched normospermic/fertile control cohorts. It is important to compare our data with a cohort of individuals who are known to be fertile and/or normospermic since infertility is a common phenotype and public databases, such as ExAC, are likely to contain a background pool of rare variants that can cause infertility and/or DSD. These cases of male infertility may represent mild forms of testicular dysgenesis or, similar to the mouse model, the SOX8 protein may also be a regulator of Sertoli-germ cell adhesion independently of its role in primary testis-determination (10).
Our analyses also revealed mutations in SOX8 in 5.06% of all women with POI from two replication studies (P = 4.5 × 10⁻⁵). This suggests that mutations in human SOX8 may have a greater impact on ovarian rather than testis function. Although Sox8−/− female mice are fertile, Sox8 expression has been reported in preantral follicles, preovulatory follicles, cumulus granulosa cells and at high levels in mural granulosa cells, which line the wall of the follicle and are critical for steroidogenesis and ovulation (7,42). We have shown that SOX8 protein is expressed specifically in the granulosa cells of the developing and adult ovary in the human. The observation that Sox8 expression is higher in the mural cells that line the follicle wall than in the cumulus cells that surround the oocyte suggests that SOX8 may play a role in granulosa cell differentiation (42). We can hypothesize that in females, in the absence of SOX9, SOX8 may be an important regulator of AMH expression (required for maintenance of the germ cell pool) in the adult ovary and, therefore, mutations in SOX8 may result in POI.
The results presented in this study provide novel insight into the genetic mechanisms of human gonadal development and function and provide further evidence that a spectrum of reproductive phenotypes from DSD to infertility can be associated with variations in a single genetic factor.

Materials and Methods
Detailed study methods are provided in SI methods.

Subjects and samples
All patients with 46,XY DSD met the revised criteria of the Pediatric Endocrine Society (LWPES)/European Society for Paediatric Endocrinology (ESPE). This study was approved by the local French ethical committee (2014/18NICB - registration number IRB00003835) and consent to genetic testing was obtained from adult probands or from the parents when the patient was under 18 years. Patient ancestry was determined by self-reporting, based on responses to a personal questionnaire, which asked questions pertaining to the birthplace, languages and self-reported ethnicity of the participants, their parents and grandparents. Genes known to be involved in 46,XY DSD were screened for mutations in the XY DSD cohort and high-resolution aCGH was performed on all cases and indicated normal ploidy in each. An extended description of patient 2 is provided in SI methods. The infertile male population consisted of 274 men of European or North African ancestry. For each man, ejaculates were obtained by masturbation after 2-7 days of sexual abstinence. All underwent an andrological work-up, which included medical history, physical examination, hormonal evaluation (FSH, LH, and testosterone) and semen analysis. Men with known clinical (cryptorchidism, infections, varicocele) or genetic (karyotype anomalies, Y chromosome microdeletions) causes of infertility were excluded. Oligozoospermia was defined as having less than 15 × 10⁶ sperm/ml. Primary amenorrhea was diagnosed when pubertal development was absent despite the patient being of pubertal age (greater than 13 years) with increased basal plasma FSH concentration (>9 IU/l). Cases of secondary amenorrhea (no menstruation after six cycles) and premature ovarian failure (amenorrhea, hypoestrogenism, and elevated serum gonadotropin levels in women younger than 40 years of age) were included.
The control panel consisted of 280 unrelated normospermic 46,XY males of French ethnic origin with known fertility and no history of testicular anomalies (determined by self-reporting) from the Biobank for Research on Human Reproduction (GERMETHEQUE). An additional 180 in-house normospermic men and 130 fertile men of Arab/North African ancestry were included as controls. European ancestry-matched in-house controls of known fertility status were also included (n = 20). Other control populations included fathers of European ancestry from the 1000 genomes project (Iberian and Northern Europeans from Utah, n = 103), the Danish Genome Project (n = 50) and from the University of Dundee (n = 50).

Genomic analysis
Chromosome analysis, ArrayCGH, genomic, exome and Sanger sequencing are all described in SI Materials and Methods.
Details of the plasmids, cell lines, cellular localization assays, structural modelling and co-immunoprecipitation are provided in the SI methods.

Statistics
The results of the luciferase assays were compared for statistical significance using a two-tailed Student's t-test in GraphPad Prism software. The 95% confidence interval was calculated using the mean difference between the two groups being tested. The data for each of the groups tested and their t values, along with degrees of freedom (df), standard error of difference, P-values and predicted statistical significance are summarized in Supplementary Material, Table S2. Based on the P-values, the GraphPad Prism software classifies the difference as not significant, not likely to be significant, significant, very significant or extremely significant.
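For readers who wish to see how the quantities listed in Table S2 relate to one another, the following is a minimal sketch of an unpaired Student's t-test using only the Python standard library. It assumes the equal-variance (pooled) form of the test, which may or may not match the exact Prism settings used, and the luciferase readings are hypothetical values for illustration, not data from this study.

```python
from math import sqrt
from statistics import mean, variance


def unpaired_t(group_a, group_b):
    """Mean difference, standard error of difference, t and df (pooled variance)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled sample variance across the two groups.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    se_diff = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    diff = mean(group_a) - mean(group_b)
    return {"mean_difference": diff, "se_difference": se_diff, "t": diff / se_diff, "df": df}


# HYPOTHETICAL relative luciferase units for wild-type and mutant constructs.
wt_readings = [10.2, 9.8, 10.5]
mutant_readings = [6.1, 5.7, 6.4]

result = unpaired_t(wt_readings, mutant_readings)
print(result)
```

The two-tailed P-value and the 95% confidence interval additionally require the Student's t distribution with `df` degrees of freedom; a statistics package (for example `scipy.stats.ttest_ind`) can supply these.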