Negative Selection on a SOD1 Mutation Limits Canine Degenerative Myelopathy While Avoiding Inbreeding

Abstract
Several hundred disease-causing mutations are currently known in domestic dogs. Breeding management is therefore required to minimize their spread. Recently, genetic methods such as direct-to-consumer testing have gained popularity; however, their effects on dog populations are unclear. Here, we aimed to evaluate the influence of genetic testing on the frequency of mutations responsible for canine degenerative myelopathy and assess the changes in the genetic structure of a Pembroke Welsh corgi population from Japan. Genetic testing of 5,512 dogs for the causative mutation in superoxide dismutase 1 (SOD1) (c.118G>A (p.E40K)) uncovered a recent decrease in frequency, plummeting from 14.5% (95/657) in 2019 to 2.9% (24/820) in 2022. Weir and Cockerham population differentiation (FST) analysis based on genome-wide single-nucleotide polymorphisms (SNPs) in 117 selected dogs showed that the SNP with the highest FST was located in the intron of SOD1 adjacent to the c.118G>A mutation, supporting a selection signature on SOD1. Further genome-wide SNP analyses revealed no obvious changes in inbreeding levels and genetic diversity between the 2019 and 2022 populations. Our study highlights that genetic testing can help inform improved mating choices in breeding programs to reduce the frequency of risk variants and avoid inbreeding. This combined strategy could decrease the genetic risk of canine degenerative myelopathy, a fatal disease, within only a few years.


Introduction
Genetic testing for disease-causing mutations in companion animals is increasingly performed by veterinarians for diagnosis, by breeders to reduce the incidence of inherited disease, and even by pet owners to determine the genetic background of their pets (Moses et al. 2018). Genetic tests employed by pet owners and breeders are termed direct-to-consumer (DTC) testing and can be broadly classified into two categories: (i) detection of specific mutations using sequencing or probes and (ii) high-throughput genotyping using genome-wide marker sets designed to detect multiple mutations simultaneously. In addition to their convenience, these DTC tests also yield data with substantial implications for genetic research. For example, a recent study using DTC test samples revealed breed-specific genetic mutations associated with hypertrophic cardiomyopathy in several cat breeds (Akiyama et al. 2023). Another study, employing over 10,000 DTC samples, determined the allele frequencies of 12 genes associated with canine coat color and the physical characteristics of different dog lineages (Dreger et al. 2019). The results indicated that random mating between certain dog breeds can produce unexpected phenotypes, including embryonic lethality (Dreger et al. 2019). Moreover, although at least 775 disease-associated mutations are already known in dogs (Rokhsar et al. 2021), a recent genome-wide association study using DTC genetic testing data revealed a novel and unexpected in-frame deletion that causes deafness in Rhodesian ridgebacks (Kawakami et al. 2022). Thus, the widespread adoption of animal genetic testing greatly benefits genetic studies.
Genetic testing for adult dogs was introduced in Japan in 2017, followed by testing for puppies in 2019. The results of such widespread testing could affect dog populations by preventing carriers from breeding, thus limiting the number of genetically affected animals. However, the actual effect of widespread testing on mutation frequency remains unclear. In addition, while breeding to avoid mutant alleles could affect the genetic structure and inbreeding levels in dog populations, only a few studies have investigated this possibility.
Since the initial report of the draft genome of domestic dogs in 2005 (Lindblad-Toh et al. 2005), significant advancements have been made in canine genetic research. Inbreeding and genetic structures have been evaluated at the population level based on microsatellite and genome-wide single-nucleotide polymorphism (SNP) analyses (Boyko et al. 2009; Mellanby et al. 2013; Dreger et al. 2016; Chu et al. 2019). Accordingly, genome-wide SNP analyses can be employed to evaluate the influence of expanded DTC genetic testing on genetic structure in dog populations.
Canine degenerative myelopathy (DM) is a fatal neurodegenerative disease prevalent in several dog breeds, including the Pembroke Welsh corgi (PWC), German shepherd, and boxer (Neeves and Granger 2015). The c.118G>A mutation in the superoxide dismutase 1 (SOD1) gene (p.E40K on chromosome 31; 26,540,342 bp, based on CanFam 3.1) is reportedly a causative factor of DM in PWCs (Awano et al. 2009). SOD1 is one of the two antioxidant isozymes responsible for specifically eliminating free superoxide radicals in mammals. Homozygosity for the A allele is strongly associated with DM onset (Awano et al. 2009; Chang et al. 2013; Zeng et al. 2014), indicating that it is an autosomal recessive variant. The homozygous mutant genotype (A/A) is fairly common in PWCs and is not geographically restricted (e.g. 48.4% or 59/122 in Japan (Chang et al. 2013) and 83% or 14/21 in Mexico (Ayala-Valdovinos et al. 2018)).
In this study, we evaluated the allelic frequency of the SOD1: c.118G>A (p.E40K) mutation in a population of over 5,500 PWCs from Japan, analyzing DTC genetic testing data across three years (2019 to 2022). PWCs born from 2012 to 2022 were included in this dataset, allowing us to examine the impact on the PWC population of genetic testing for adult dogs and for puppies, which were introduced in 2017 and 2019, respectively. We also performed a genome-wide analysis to detect and compare selection signatures between the populations in 2019 and 2022. Finally, we assessed inbreeding and population structures based on genome-wide SNPs to determine the effects of genetic testing on dog breeding. The findings of this study could provide valuable insights into how widespread genetic testing controls the spread of genetic disorders among dogs.
To further investigate the allele frequency of the SOD1: c.118G>A mutation, we performed a simulation based on random genetic drift (10,000 replicates) for two models (i.e. scenarios in which the effective population size (Ne), which refers to the number of bred dogs, was larger (Ne = 540) or smaller (Ne = 49); see Materials and Methods for details). Deviation from genetic drift would indicate selection. The observed allele frequency was significantly lower than the simulated one after 2018 in both models (P < 0.05; Fig. 1B, supplementary table S1, Supplementary Material online).
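The logic of such a drift test can be illustrated with a minimal Wright-Fisher sketch. This is not the simulation used in the study: the function names, the starting frequency, the observed value, and the treatment of each step as one discrete generation are all assumptions made for illustration, and only the small-model Ne of 49 is taken from the text.

```python
import random


def simulate_drift(p0, ne, generations, replicates, seed=0):
    """Simulate allele-frequency trajectories under pure Wright-Fisher drift.

    Each generation, the 2 * ne allele copies of the next generation are
    drawn independently, using the current allele frequency as probability.
    """
    rng = random.Random(seed)
    trajectories = []
    for _ in range(replicates):
        p = p0
        path = [p]
        for _ in range(generations):
            copies = sum(1 for _ in range(2 * ne) if rng.random() < p)
            p = copies / (2 * ne)
            path.append(p)
        trajectories.append(path)
    return trajectories


def empirical_p(trajectories, generation, observed):
    """One-sided P: fraction of drift replicates at or below the observed value."""
    hits = sum(1 for path in trajectories if path[generation] <= observed)
    return hits / len(trajectories)


if __name__ == "__main__":
    # Hypothetical starting frequency and observed value, for illustration only.
    sims = simulate_drift(p0=0.40, ne=49, generations=6, replicates=1000)
    print(empirical_p(sims, generation=6, observed=0.15))
```

An observed frequency that falls below nearly all drift replicates (a small empirical P) is the pattern that would be read as evidence of selection rather than drift.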

Selection Signature
To detect selection signatures at the genome-wide level, we compared the 2019 and 2022 groups. Genome-wide SNP-based analyses on 117 PWC pups were performed after determining their genetic backgrounds and separating them into four groups according to the SOD1: c.118G>A genotype (supplementary table S2, Supplementary Material online). The Weir and Cockerham population differentiation (FST)-based analyses revealed that the SNP "BICF2G630738971" had the highest FST (0.25) of the 143,013 tested SNPs, and this SNP was located in the intron of SOD1 on canine chromosome 31 (Fig. 2A and B), 10,529 bp downstream of SOD1: c.118G>A. We then calculated the extended haplotype homozygosity (EHH) of each population (2019 vs. 2022) for BICF2G630738971 (Fig. 2C). The 2019 group had longer EHH haplotypes than the 2022 group, although the position of SOD1: c.118G>A was closely linked to the top SNP in the 2022 group (0.47 < EHH < 1).
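The idea behind an FST scan can be sketched per SNP as follows. Note that this uses the basic Wright definition of FST from heterozygosities, with two equally weighted groups, and not the Weir and Cockerham estimator applied in the study, which additionally corrects for sample size. The function names and SNP identifiers in the usage example are hypothetical.

```python
def fst_two_pops(p1, p2):
    """Wright's FST for one biallelic SNP from two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population.
    """
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within groups
    if h_total == 0:
        return 0.0  # SNP is monomorphic in both groups
    return (h_total - h_within) / h_total


def top_snp(freqs_a, freqs_b):
    """Return (snp_id, fst) for the SNP with the highest FST."""
    scores = {snp: fst_two_pops(freqs_a[snp], freqs_b[snp]) for snp in freqs_a}
    best = max(scores, key=scores.get)
    return best, scores[best]


if __name__ == "__main__":
    # Hypothetical allele frequencies in two sampling years.
    year_a = {"snp_1": 0.50, "snp_2": 0.20}
    year_b = {"snp_1": 0.52, "snp_2": 0.60}
    print(top_snp(year_a, year_b))
```

Ranking all SNPs by this score and plotting them by genomic position gives a Manhattan-style view in which a locus under recent selection stands out from the genome-wide background.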

Inbreeding Levels and Genetic Structure
We assessed inbreeding levels in the dog population. Observed heterozygosity (Ho) per group was calculated using 142,510 SNPs; the Ho was 0.305 and 0.306 for the Wild 2019 and Wild 2022 groups, respectively. For mutant PWCs, the Ho was 0.296 and 0.314 in 2019 and 2022, respectively. We then compared the inbreeding coefficients between groups based on runs of homozygosity (FROH) (Fig. 3). The mean FROH was 0.29 ± 0.067 (standard deviation), 0.30 ± 0.082, 0.28 ± 0.061, and 0.27 ± 0.038 for Wild 2019, Mutant 2019, Wild 2022, and Mutant 2022 groups, respectively. The four groups did not differ significantly regarding inbreeding estimates (supplementary table S3, Supplementary Material online).
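Both statistics reduce to simple ratios, sketched below under stated assumptions: genotypes are coded as 0, 1, or 2 copies of one allele with None for missing calls, runs of homozygosity are given as (start, end) positions in bp, and the genome length used as the denominator is a placeholder supplied by the caller. The function names are illustrative.

```python
def observed_heterozygosity(genotypes):
    """Fraction of called genotypes that are heterozygous (coded 1)."""
    called = [g for g in genotypes if g is not None]
    return sum(1 for g in called if g == 1) / len(called)


def f_roh(runs, genome_length_bp):
    """Inbreeding coefficient: total length in runs of homozygosity
    divided by the length of the genome covered by the SNP panel."""
    return sum(end - start for start, end in runs) / genome_length_bp


if __name__ == "__main__":
    # Toy data, for illustration only.
    print(observed_heterozygosity([0, 1, 2, 1, None]))
    print(f_roh([(0, 1_000_000), (2_000_000, 2_500_000)], 10_000_000))
```

In this framing, an FROH of about 0.29 means that roughly 29% of the covered genome lies inside runs of homozygosity.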
We analyzed the genetic structure through clustering using ADMIXTURE (Alexander et al. 2009) and principal component analysis (PCA). Clustering results showed that the cross-validation error was lowest when K = 3 (supplementary fig. S1, Supplementary Material online), with no obvious structure in the model with K = 3 (Fig. 4A). Likewise, PCA did not identify any obvious components that explained variation in the dogs (Fig. 4B).
We applied the neighbor-joining method for each PWC and Nei's genetic distance to infer genetic relationships between groups. The phylogenetic tree revealed one clade that included all Mutant dogs and another clade including all Wild dogs (Fig. 4C). The population-based Nei's genetic distance indicated genetic similarities within the Mutant groups and within the Wild groups (Fig. 4D). Both analyses revealed that PWCs homozygous for the SOD1 mutation were more common in certain lineages.
To estimate the potential number of dogs bred, we used the linkage disequilibrium method to determine the contemporary Ne (Do et al. 2014). The contemporary Ne of Wild 2019 (n = 42) was 48.9, lower than that of Wild 2022 (73.1, n = 49).

Discussion
Our analysis revealed that the availability of DTC genetic testing coincided with a decrease in the frequency of the homozygous SOD1: c.118G>A mutation among PWCs, reflecting negative selection against the mutation. Our study provides valuable empirical evidence that genetic testing coupled with selective breeding can lower mutation frequency in the span of a few years. With the widespread, global availability of commercial genetic testing for SOD1 mutations (Neeves and Granger 2015), breeding programs can apply the test results and make informed systematic mate selection decisions to decrease the frequency of this deleterious variant.
Genetic testing for adults and puppies started in 2017 and 2019, respectively. In this study, the simulation, encompassing both large and small models, revealed a significant decrease in the allele frequency of the mutation between 2017 and 2018. In addition, FST analysis suggested a selective pressure on the mutation between 2019 and 2022. These results indicate that genetic testing for adults and puppies has led to a decrease in the frequency of the mutation. This decrease could be attributed to the introduction of large-scale genetic testing for dogs by Japanese pet shops and breeders since 2017. Subsequently, breeders may have avoided mating parents carrying the mutation. This prevented the production of puppies with the mutation and continuously reduced its frequency between 2017 and 2022.
Studies investigating selection signatures during dog domestication (Akey et al. 2010; Wang et al. 2013; Plassais et al. 2019) have identified an influence on phenotypes such as body size, coat color, and behavior. To date, selection scans have mainly focused on a given region over 10,000 yr or similarly long periods (Akey et al. 2010). In contrast, selection scans over short periods (a few years) in mammals are rare. Therefore, our results provide insights into the genome evolution of mammals.
EHH is a widely used statistic in genome biology and evolutionary genetics to detect regions of recent or ongoing positive selection (Sabeti et al. 2002). This statistic quantifies a haplotype that quickly sweeps toward fixation, making it effective for detecting hard selective sweeps. Theoretically, this observation can be detected in populations with random mating; however, our study revealed longer haplotypes in the SOD1 region in the 2019 group, as opposed to the 2022 group, where a strong selection occurred. A potential reason for the shorter haplotypes in the 2022 group could be genotype-based selection involving breeding among a larger number of dogs from multiple lineages. First, given that the PWC is an inbred breed, the longer haplotypes observed in the 2019 group could be considered normal and not indicative of selection. Second, our SNP-based Ne estimation suggests that approximately 1.5 times more dogs were introduced and bred over the course of three years. Different lineages are employed during mating to prevent pairing with dogs carrying the DM mutation, a practice substantiated by our genetic analyses. Based on the genetic testing results, breeders mate dogs without the mutation, leading to a high number of recombinations around the mutation site. Future research, potentially employing model-based and/or observation-based approaches, will be required to address this possibility.
Our research also demonstrated that the mutation can be selected against without lowering the Ne (i.e. without generating inbred animals). Inbreeding avoidance is essential for effective animal breeding (Sams and Boyko 2019). Our genetic analysis revealed no differences in Ho and inbreeding levels between the 2019 and 2022 groups. In addition, we estimated a larger Ne for the 2022 group (73.1) than for the 2019 group (48.9). The phylogenetic analysis (Fig. 4C) further implied that the genetic origin of some Wild dogs in 2022 was from other lineages. Taken together, our findings suggest that PWC breeders used genetic testing results to limit inbreeding while mating dogs from different families or lineages.
Notably, our study focused on the genotype of the SOD1 mutation rather than the phenotype. The median onset of DM in PWCs is 11 yr (Coates et al. 2007). Considering the start of DTC genetic testing in PWCs in Japan, the corresponding decreases in the prevalence of SOD1-associated disease should be noticeable around the 2030s. Accordingly, further research using comprehensive phenotypic datasets, such as those available from pet insurance companies, is required to monitor DM onset in PWCs.
In conclusion, this study highlights the value of genetic testing as a tool to lower the risk of canine DM while avoiding animal inbreeding. We conducted a genome-wide analysis of short-term selection in PWCs and found that only a few years were required to reduce the number of dogs homozygous for the mutant allele. Our results highlight that genetic testing could reduce the prevalence of predictable genetic conditions, thus contributing to improved animal welfare.

Materials and Methods
Wild 2019 and 2022 populations were phased using Beagle 5.4 (version 22Jul22.46e) (Browning et al. 2018) with default settings. After phasing, EHH for the target SNP was estimated using a 1 Mbp window in Selscan version 2.0.0 (Szpiech and Hernandez 2014).
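For readers unfamiliar with the statistic, EHH at a given marker can be sketched as the probability that two randomly chosen haplotypes carrying the core allele are identical across the whole interval between the core SNP and that marker (Sabeti et al. 2002). The sketch below is not Selscan; the function name and the representation of phased haplotypes as strings of "0"/"1" alleles are assumptions for illustration.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core_index, core_allele, upto_index):
    """EHH from the core SNP out to the marker at upto_index.

    haplotypes: phased haplotypes, each an indexable sequence of alleles.
    Only haplotypes carrying core_allele at core_index are considered.
    """
    carriers = [h for h in haplotypes if h[core_index] == core_allele]
    if len(carriers) < 2:
        return float("nan")  # homozygosity is undefined for fewer than 2 carriers
    lo, hi = sorted((core_index, upto_index))
    # Group carriers by their extended haplotype over the interval.
    classes = Counter(tuple(h[lo:hi + 1]) for h in carriers)
    identical_pairs = sum(comb(n, 2) for n in classes.values())
    return identical_pairs / comb(len(carriers), 2)


if __name__ == "__main__":
    # Toy phased haplotypes; the core SNP is at index 0.
    haps = ["1100", "1100", "1101", "1110", "0100"]
    print([ehh(haps, 0, "1", i) for i in range(4)])
```

EHH starts at 1 at the core SNP and decays with distance as recombination breaks up the haplotypes, so a slower decay corresponds to longer shared haplotypes.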

Inbreeding Levels
We inferred inbreeding levels using specimens from genome-wide SNP genotyping. The R package DetectRuns was used to obtain the proportion of times each SNP fell inside a run per population, corresponding to locus homozygosity or heterozygosity in the respective population. We used the following DetectRuns parameters: minSNP = 41, maxGap = 10^6, minLengthBps = 50000, and minDensity = 1/5000.
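To make the role of these thresholds concrete, the following is a simplified consecutive-SNP scan for runs of homozygosity on one chromosome of one dog. It is not the R package's algorithm: whether the study used a sliding-window or consecutive method is not stated here, the minDensity filter is omitted, and the function name and genotype coding (0 or 2 for homozygotes, 1 for heterozygotes, None for missing) are assumptions.

```python
def find_roh(positions, genotypes, min_snp=41, max_gap=1_000_000,
             min_length_bp=50_000):
    """Return runs of homozygosity as (start_bp, end_bp, n_snps) tuples.

    A run is a stretch of consecutive homozygous SNPs; it is broken by a
    heterozygous or missing call, or by a gap between adjacent SNPs that
    exceeds max_gap. Runs failing min_snp or min_length_bp are discarded.
    """
    runs = []
    start = None  # index of the first SNP in the current run

    def close(first, last):
        n_snps = last - first + 1
        length = positions[last] - positions[first]
        if n_snps >= min_snp and length >= min_length_bp:
            runs.append((positions[first], positions[last], n_snps))

    for i, (pos, g) in enumerate(zip(positions, genotypes)):
        is_hom = g in (0, 2)
        if is_hom and start is not None and pos - positions[i - 1] <= max_gap:
            continue  # the current run extends through this SNP
        if start is not None:
            close(start, i - 1)  # the run ended at the previous SNP
        start = i if is_hom else None
    if start is not None:
        close(start, len(positions) - 1)
    return runs


if __name__ == "__main__":
    # Toy chromosome: 100 SNPs spaced 2 kb apart, one heterozygous call.
    pos = list(range(0, 200_000, 2_000))
    geno = [0] * 100
    geno[50] = 1
    print(find_roh(pos, geno))
```

Summing the lengths of the returned runs across chromosomes and dividing by the covered genome length gives the FROH statistic reported per dog.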

Population Genetics
To clarify the genetic structure and phylogenetic relationships in the PWC population, we performed PCA using PLINK version 1.9 with default settings and maximum-likelihood ancestry analysis using ADMIXTURE version 1.3 (Alexander et al. 2009). For ADMIXTURE, we set the number of populations (K) between 2 and 10. We used the --cv option for cross-validation error calculation to select the optimal K value based on the lowest error.
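Selecting K from the cross-validation output can be sketched as below, assuming the ADMIXTURE logs for each K have been concatenated into one text and contain the usual "CV error (K=...)" summary lines. The function name and the error values in the usage example are hypothetical.

```python
import re

# Matches summary lines such as "CV error (K=3): 0.54800".
CV_LINE = re.compile(r"CV error \(K=(\d+)\): ([0-9.]+)")


def best_k(log_text):
    """Return (K with the lowest CV error, dict of K -> CV error)."""
    errors = {int(k): float(err) for k, err in CV_LINE.findall(log_text)}
    if not errors:
        raise ValueError("no CV error lines found")
    return min(errors, key=errors.get), errors


if __name__ == "__main__":
    # Hypothetical log excerpt, for illustration only.
    example = (
        "CV error (K=2): 0.56000\n"
        "CV error (K=3): 0.54800\n"
        "CV error (K=4): 0.55100\n"
    )
    print(best_k(example))
```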
For genetic relationships, we first converted the PLINK PED format into FASTA format. We then converted the FASTA file to NEXUS format in MEGA X (Kumar et al. 2018). We constructed a phylogenetic tree using the neighbor-joining algorithm with p-distance in MEGA X.
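The p-distance underlying such a tree is simply the proportion of compared sites at which two sequences differ. A minimal sketch follows, assuming aligned sequences of equal length in which "N", "-", and "?" mark missing data; the function name and the handling of missing sites are illustrative choices, not a description of MEGA X internals.

```python
def p_distance(seq_a, seq_b, missing="N-?"):
    """Proportion of differing sites among sites called in both sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a in missing or b in missing:
            continue  # skip sites with missing data in either sequence
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


if __name__ == "__main__":
    # Toy sequences, for illustration only.
    print(p_distance("ACGT", "ACGA"))
```

Computing this value for every pair of dogs yields the distance matrix that the neighbor-joining algorithm takes as input.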
We calculated Nei's standard genetic distance D (Nei 1987) between the four populations (Wild 2019, Wild 2022, Mutant 2019, and Mutant 2022). Nei's D was estimated using GenoDive version 3.06 with default settings (Meirmans and Van Amsterdam 2019).
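Nei's standard distance is D = -ln(Jxy / sqrt(Jx Jy)), where Jx and Jy are the average within-population probabilities that two alleles are identical and Jxy is the corresponding between-population probability. A minimal sketch for biallelic SNPs is given below; it takes one allele frequency per locus for each population, applies no sample-size correction, and uses an illustrative function name, so it should not be read as GenoDive's implementation.

```python
import math


def nei_standard_distance(freqs_x, freqs_y):
    """Nei's standard genetic distance from per-locus allele frequencies.

    freqs_x and freqs_y hold the frequency of the same allele at each
    biallelic locus in populations X and Y.
    """
    if len(freqs_x) != len(freqs_y) or not freqs_x:
        raise ValueError("need the same non-empty set of loci in both populations")
    n = len(freqs_x)
    j_x = sum(p * p + (1 - p) ** 2 for p in freqs_x) / n
    j_y = sum(p * p + (1 - p) ** 2 for p in freqs_y) / n
    j_xy = sum(px * py + (1 - px) * (1 - py)
               for px, py in zip(freqs_x, freqs_y)) / n
    if j_xy == 0:
        return math.inf  # the populations share no alleles at any locus
    return -math.log(j_xy / math.sqrt(j_x * j_y))


if __name__ == "__main__":
    # Toy frequencies, for illustration only.
    print(nei_standard_distance([1.0, 1.0], [1.0, 0.0]))
```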

Estimating Ne
The contemporary Ne of PWC was estimated from genome-wide SNP data using a linkage disequilibrium method in NeEstimator version 2.1 (Do et al. 2014). The lowest allele frequency was set at 0.01.
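The principle of the linkage disequilibrium method can be sketched with the classical approximation for unlinked loci under random mating, E[r^2] ≈ 1/(3Ne) + 1/S, where S is the number of sampled individuals. The function below simply inverts that relation; it omits the bias corrections and allele-frequency filtering that dedicated software applies, and the function name and input values are assumptions for illustration.

```python
def ne_from_ld(mean_r2, sample_size):
    """Approximate contemporary Ne from the mean squared correlation (r^2)
    between unlinked loci, after removing the sampling term 1 / S."""
    drift_signal = mean_r2 - 1 / sample_size
    if drift_signal <= 0:
        return float("inf")  # LD is no larger than expected from sampling alone
    return 1 / (3 * drift_signal)


if __name__ == "__main__":
    # Hypothetical mean r^2 and sample size, for illustration only.
    print(ne_from_ld(mean_r2=0.03, sample_size=50))
```

The smaller the excess linkage disequilibrium beyond sampling noise, the larger the estimated Ne, which is why a broader set of breeding dogs shows up as a higher estimate.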
FIG. 1.-Trends in diploid genotype and allele frequencies of the SOD1: c.118G>A mutation. A) Diploid genotype frequencies of the SOD1: c.118G>A mutation from 2012 to 2022. The ratio of allele frequency is based on the birth year of tested dogs. The lower chart shows the number of tested dogs divided by each diploid genotype (wild-type homozygotes (G/G, Wild), heterozygous carriers (G/A, Hetero), and variant homozygotes (A/A, Mutant)). B) Real and simulated allele frequencies of the SOD1: c.118G>A mutation for six years after 2016 based on two models: large (Ne = 540, left) and small (Ne = 49, right) models (see Materials and Methods for details). The solid orange line indicates the observed allele frequency, and the dashed orange lines indicate 95% confidence intervals. The light-blue lines indicate 10,000 simulated allele frequencies starting from 2016, and the dashed blue lines indicate 95% thresholds.

FIG. 2.-Selection signature observed in SOD1. A) Manhattan plot based on FST. B) Relationships of FST per SNP and their locations around SOD1 on chromosome 31. SNP BICF2G630738971 had the highest FST value. Red dots indicate SNPs in the upper 0.1% of FST values. C) EHH of each group from the top FST SNP (BICF2G630738971) for the derived allele. Dark-blue and light-blue lines indicate the 2019 and 2022 groups, respectively. The red dotted line indicates the position of SOD1: c.118G>A.

FIG. 3.-Inbreeding estimates based on FROH and genome-wide SNP data. No significant difference was observed between populations (Welch's t-test, P > 0.05).

FIG. 4.-Genetic structure and relatedness of PWCs. A) ADMIXTURE. B) PCA. C) Neighbor-joining phylogenetic tree. The lineage marked with an asterisk includes only PWCs tested in 2022, indicating dogs from lineages different from those in the 2019 group. D) Dendrogram based on Nei's genetic distance.