Contributions of Function-Altering Variants in Genes Implicated in Pubertal Timing and Body Mass for Self-Limited Delayed Puberty

Context: Self-limited delayed puberty (DP) is often associated with a delay in physical maturation, but although highly heritable the causal genetic factors remain elusive. Genome-wide association studies of the timing of puberty have identified multiple loci for age at menarche in females and voice break in males, particularly in pathways controlling energy balance. Objective/Main Outcome Measures: We sought to assess the contribution of rare variants in such genes to the phenotype of familial DP. Design/Patients: We performed whole-exome sequencing in 67 pedigrees (125 individuals with DP and 35 unaffected controls) from our unique cohort of familial self-limited DP. Using a whole-exome sequencing filtering pipeline one candidate gene [fat mass and obesity–associated gene (FTO)] was identified. In silico, in vitro, and mouse model studies were performed to investigate the pathogenicity of FTO variants and timing of puberty in FTO+/− mice. Results: We identified potentially pathogenic, rare variants in genes in linkage disequilibrium with genome-wide association studies of age at menarche loci in 283 genes. Of these, five genes were implicated in the control of body mass. After filtering for segregation with trait, one candidate, FTO, was retained. Two FTO variants, found in 14 affected individuals from three families, were also associated with leanness in these patients with DP. One variant (p.Leu44Val) demonstrated altered demethylation activity of the mutant protein in vitro. Fto+/− mice displayed a significantly delayed timing of pubertal onset (P < 0.05). Conclusions: Mutations in genes implicated in body mass and timing of puberty in the general population may contribute to the pathogenesis of self-limited DP.


Supplementary Methods: Patient Details
For family members diagnosis of DP was based on PHV occurring 1.5 SD beyond the mean, i.e. age at takeoff exceeding 12.9 and 11.3 yr, or age at PHV exceeding 14.8 and 12.8 yr in males and females. This 1.5SD cutoff has sensitivity of 98% in identifying boys with Tanner stages G2 later than 14.0 years and sensitivity of 97% in identifying girls with Tanner stage B2 later than 13.0 years.

Genetic Analysis
Genetic analysis was performed in 160 individuals from the 67 most extensive families from our cohort with DP. These included 67 probands (male n=57, female n=10), 58 affected family members (male n=36, female n=22) and 35 unaffected family members (male, n=13, female n=22). Whole exome sequencing (WES) was performed on DNA extracted from peripheral blood leukocytes, using a Nimblegen V2 or Agilent V5 platform and Illumina HiSeq 2000 sequencing. The exome sequences were aligned to the UCSC hg19 reference genome. Picard tools and the genome analysis toolkit were used to mark PCR duplicates, realign around indels, recalibrate quality scores and call variants.
Variants were analyzed and filtered for potential causal variants in Ingenuity Variant Analysis (Qiagen) using filters for quality control, predicted functional annotation, minor allele frequency (MAF), and GWAS relevance ( Figure 1).
Quality control included thresholds for call quality, read depth and upstream pipeline filtering. Predicted functional annotation involved prioritizing nonsense, exonic missense, splice site variants, structural or promoter changes, or variants deleterious to a microRNA. Filtering by MAF entailed including those variants with minor allele frequency (MAF) <1% in the 1000 Genomes database, the NHLBI exome variant server, and ExAC and gnoMAD databases. GWAS relevance filtering allowed identification of those remaining variants that lay within genes in linkage disequilibrium with 106 GWAS loci associated with AAM 1 . All genes in linkage disequilibrium with these GWAS AAM loci (using inclusive limits: D' > 0.8; r 2 : no limit) were selected using the Broad institute SNAP tool (SNP annotation and proxy search). Linkage disequilibrium data was calculated using Haploview 4.0, based on phased genotype data from the International HapMap Project and the 1000 Genomes Project. A total of 760 genes were selected using this SNAP tool, and 'GWAS relevance filtering' allowed identification of those remaining variants that lay within these 760 genes 2 . Filters for genes implicated in body mass regulation were applied using a biological context filter with pathway analysis. Variants were then filtered for segregation with trait: variants present in ≥ n-1 affected individuals (where n = number of affected individuals in a given pedigree) and not present in more than one unaffected individual being retained. Family members were screened using conventional Sanger sequencing.
Targeted exome sequencing using a Fluidigm array of the remaining candidate gene identified post-filtering was then performed in a further 42 families from our cohort (288 individuals, 178 with DP; male=106, female=69 and 110 controls; male=55, female=58, Figure 1). Variants post targeted resequencing were filtered using the same criteria as the WES data: quality control, predicted functional annotation, minor allele frequency and segregation with trait.
Whole gene rare variant burden testing was performed post sequencing.
Fisher's exact test was used to compare the prevalence of deleterious variants in our cohort with the Finnish population, using the ExAC Browser  Orthologous sequences from other species were retrieved from Ensembl. Species are classified in "Placental mammals", "Birds & Reptiles" and "Fish" according to Ensembl classification. For each species, FTO Uniprot accession number (in blue) and entry name (in black) are presented at the beginning of the row. The position of deleterious mutations on the human FTO sequence is indicated by a red arrow. L44 and other surrounding residues part of the same alpha helix form a motif, which is highly conserved across placental mammals but not in reptiles, birds and fish. Supplementary Fig. 7 -Tertiary structure of FTO local to L44 residue. Residue L44 is presented in blue and other residues located on the same alpha helix and conserved across placental mammals are presented in green. FTO structure is shown as cartoon on the left and as surface on the right. 3---methylthymidine is presented as yellow spheres in the cartoon representation.  Supplementary Fig. 8 -p.A163T sequence and structural analysis. Panel A: a multiple sequence alignment shows that alanine at position 163 is not conserved. Panel B: the position of Ala163 (indicated by a red arrow) is presented in relation to FTO secondary structure (H, alpha helix; C, residues that are not part of an alpha helix or a beta strand). Residues predicted to be disordered are indicated with an asterisk. Panel C: FTO tertiary structure. Ala163 (presented in red) is at the end of FTO alpha helix H4 and is not predicted to disrupt FTO function. FTO structure is visualized using visualisation program (http://www.pymol.org/). FTO  GNPDA2  LEKR1  SEC16B  BDNF  DLK1  TMEM18  LIN28B  LEPR-LEPROT  NEGR1  SIX6  TNN13K CENPW-NCO7

Gene Symbol
Supplementary Table 1 -Genes involved in energy metabolism and growth pathways implicated in the timing of puberty in the general population from genome wide association studies (adapted from Perry et al 1 ).