The ectodysplasin-A receptor is a candidate gene for lateral plate number variation in stickleback fish

Abstract Variation in lateral plating in stickleback fish represents a classical example of rapid and parallel adaptation in morphology. The underlying genetic architecture involves polymorphism at the ectodysplasin-A gene (EDA). However, lateral plate number is influenced by additional loci that remain poorly characterized. Here, we search for such loci by performing genome-wide differentiation mapping based on pooled whole-genome sequence data from a European stickleback population variable in the extent of lateral plating, while tightly controlling for the phenotypic effect of EDA. This suggests a new candidate locus, the EDA receptor gene (EDAR), for which additional support is obtained by individual-level targeted Sanger sequencing and by comparing allele frequencies among natural populations. Overall, our study illustrates the power of pooled whole-genome sequencing for searching phenotypically relevant loci and opens opportunities for exploring the population genetics and ecological significance of a new candidate locus for stickleback armor evolution.


Supplementary Information to
The EDA receptor (EDAR) is a candidate gene for lateral plate number variation in stickleback fish by Telma G. Laurentino, Nico Boileau, Fabrizia Ronco, Daniel Berner


Analysis S1 Sanger sequencing of the EDAR locus in the validation panel p. 2  Figure S1 Justification of read depth cutoffs used for SNP calling p. 4  Figure S2 Differentiation between Complete CC and Low LL across all chromosomes p. 5  Figure S3 Differentiation between Complete CL and Partial CL across all chromosomes p. 8  Figure S4 Concordance in differentiation at the EDAR locus between two independent genome assemblies p. 11  Figure S5 Gene annotation and functional evaluation of the regions containing the ten top-AFD SNPs in the Complete CL vs. Partial CL comparison p. 12  Supplementary Literature p. 14 Analysis S1

Exploration of an association between EDAR-associated alleles and the partial lateral plate morph using individual Sanger sequencing
Our sequencing considered a 'validation panel' of 46 individuals from the Lake Constance basin not used for the main genome scans. The individuals and their genotypes are described in the table below. PCR was conducted using the forward primer GTTTCTCGGACATGGCTTTTATG and the reverse primer GTCACTACACACAGCACACA (both at 10 uM), producing a 247 (or 245) bp amplicon. The PCR used RedTaq and the following amplification conditions: Initialization 94°C for 3 minutes; cycling (35x) 94°C for 20 seconds, 56°C for 20 seconds and 72°C for 20 seconds; final elongation 42°C for 4 minutes. The product was sequenced on an ABI3130xl sequencer (Applied Biosystems). The sequence below shows the primer binding sites (forward shaded blue, reverse shaded yellow), the position of the top-AFD SNP near the EDAR gene (shaded gray) and an associated 2-bp indel (purple). The two haplotypes associated with partial and complete plating are also visualized. Across our validation panel, the SNP and indel alleles showed perfect linkage.

Specimen ID
Population Genotype (P=partial allele, C=complete allele)

Figure S1
Figure S1 Read depth thresholds used for SNP calling. In the Complete CC vs. Low LL comparison, a read depth between 40 and 130x was required within each group for a SNP to be accepted for further analysis. In the Complete CL vs. Partial CL comparison, the thresholds were 40 and 200x. The histograms show the genome-wide distribution of read depth per base position for each of the four groups. The blue vertical lines indicate the read depth thresholds applied to each group. The upper read depth thresholds were chosen visually to exclude repeated genomic regions with excessive coverage. The lower read depth threshold was chosen to exclude positions at which allele frequency estimation would have been relatively imprecise (Ferretti et al. 2013;, although this slightly truncated the read depth distribution toward the lower end (especially in the Complete CC vs. Low LL comparison).     highlighting potential functional relevance (i.e., known to be involved in bone or ectodermal development, or in the TNF or Wnt/beta-catenin pathways). Otherwise, the graphing conventions follow Figures 2 and 3. Two of the ten total high-differentiation SNPs located on scaffolds not anchored to chromosomes (i.e., located on ChrUn) are not presented. The cluster of four SNPs on ChrXX (bottom right) emerged in the main genome scan (using a relatively liberal minimum read depth threshold) only, and inspection of the nucleotide pileup confirmed read depth just passing the threshold. Hence, although the annotation for this chromosome region is presented, the strong differentiation at these SNPs likely represents sampling stochasticity. Note that only the EDAR locus (top-left) shows substantial differentiation across a broader region around the high-AFD SNP (see also Figures 3 and S3). For this locus, methodological artifacts and sampling stochasticity can be ruled as causes of the strong differentiation between Complete CL and Partial CL stickleback. Combined with functional evidence on the role of EDAR in the development of the dermal skeleton in other fish species (see main text), this locus clearly emerges as the top candidate in our investigation, and was therefore subject to further analysis.
As a resource for future work, however, we here also discuss the evidence for the other annotated genes of potential functional relevance according to our criteria. On ChrXVI there is, beside EDAR, the gene SH3RF3 associated with variation in human bone mineral density (Kim 2018) and hair thickness (Adhikari et al. 2016). In the candidate region on ChrII, ATP6V-OD1 is noteworthy for its association with bone mineral density in humans (Kim 2018), while PACSIN3 has effects on the development of the fish lateral line organ (Thisse and Thisse 2004;Chen et al. 2012). The latter trait has previously been shown to be developmentally linked to lateral plates (Mills et al. 2014). A third candidate gene on ChrII is SAMD4a known to be essential for correct tooth development in mice: mutants exhibit ectopic osteoblast differentiation and bone formation via Wnt-pathway upregulation (Li et al. 2011). This phenotype resembles the ectopic expression of Wnt3a in stickleback, which upregulates EDA and leads to the development of extra lateral plates (O'brown et al. 2015). On ChrIII, the SNP showing the strongest differentiation is near the DSPa gene. In humans, DSPa is associated with Skin fragility-woolly hair syndrome (Whittock et al. 2002), including skin and hair phenotypes similar to those caused by EDA mutations (Cui and Schlessinger 2006;Mikkola 2009). Moreover, DSPa is expressed in the ectoderm during zebrafish development and interacts with the Wnt/β-catenin signaling pathway (Giuliodori et al. 2018) implicated in stickleback plate development (Indjeian et al. 2016).