A Genetic Screen for Genes That Impact Peroxisomes in Drosophila Identifies Candidate Genes for Human Disease

Peroxisomes are subcellular organelles that are essential for proper function of eukaryotic cells. In addition to being the sites of a variety of oxidative reactions, they are crucial regulators of lipid metabolism. Peroxisome loss or dysfunction leads to multi-system diseases in humans that strongly affect the nervous system. In order to identify previously unidentified genes and mechanisms that impact peroxisomes, we conducted a genetic screen on a collection of lethal mutations on the X chromosome in Drosophila. Using the number, size and morphology of GFP tagged peroxisomes as a readout, we screened for mutations that altered peroxisomes based on clonal analysis and confocal microscopy. From this screen, we identified eighteen genes that cause increases in peroxisome number or altered morphology when mutated. We examined the human homologs of these genes and found that they are involved in a diverse array of cellular processes. Interestingly, the human homologs from the X-chromosome collection are under selective constraint in human populations and are good candidate genes particularly for dominant genetic disease. This in vivo screening approach for peroxisome defects allows identification of novel genes that impact peroxisomes in vivo in a multicellular organism and is a valuable platform to discover genes potentially involved in dominant disease that could affect peroxisomes.

adult flies that that exhibit locomotor defects and reduced longevity (Nakayama et al. 2011;Faust et al. 2014;Wangler et al. 2017a), a number of essential genes that regulate peroxisome biogenesis and homeostasis may remain unstudied. To identify mechanisms that impact peroxisome number and size, we conducted a forward genetic screen on the X chromosome in Drosophila using lines that we had previously generated and analyzed representing a large collection of recessive lethal mutant lines on Drosophila X-chromosome (Yamamoto et al. 2014). We report an analysis of 215 lines from this collection that correspond to 100 genes and screened for peroxisomal phenotypes using GFP tagged peroxisomes (Chao et al. 2016;Wangler et al. 2017a) in conjunction with clonal analysis, allowing generation of homozygous mutant cells within Drosophila larval fat body in an otherwise heterozygous animal to bypass early lethality.
Our screen identified a number of genes not previously implicated in peroxisome dynamics or regulation. In previous studies, we've shown that the genes from this collection are enriched for human disease genes (Yamamoto et al. 2014). Based on this we propose our screen results as identifying candidate human disease genes particularly for dominant disease that may impact peroxisomes.

METHODS & MATERIALS
Drosophila X-Chromosome peroxisome (X-Pex) screen All X-linked recessive lethal mutant alleles utilized in this paper listed in Table 1 and Supplemental Material, Table 1 were generated on an isogenized y 1 w Ã FRT19A chromosome using ethyl methanesulfonate (EMS) mutagenesis as described (Haelterman et al. 2014;Yamamoto et al. 2014;Deal and Yamamoto 2019). These fly strains are publically available from the Kyoto Stock Center (https://kyotofly.kit.jp/stocks/documents/EMS_X_lethals.html) or the Bloomington Drosophila Stock Center (https://bdsc.indiana.edu/ stocks/chemically_induced_mutations/xlethals.html). Heterozygous females (y 1 w Ã mut Ã FRT19A/FM7c Kr-GAL4, UAS-GFP, mut Ã indicates the mutation of interest) from these lethal lines were crossed to males of the genotype hsFLP, Ubi-RFP FRT19A; Actin-GAL4,UAS-GFP-SKL/CyO, and their embryonic progeny were heat shocked at 0-4 hr after egg laying at 37°for 1 hr. Third larval instar wandering larvae were dissected in PBS and fixed in 4% paraformaldehyde for 20-30 min. Fat bodies were mounted in DAPI (4',6-diamidino-2-phenylindole) containing mounting media (Vectashield), confocal microscopy images were captured on a LSM 710 laser scanning confocal microscope (Zeiss), and processed in Photoshop (Adobe). Dissection of the lines shown in Supplemental Table 1 were performed by a team of three co-authors who each dissected and examined and imaged multiple clones. All three also examined each other's images and a group consensus on the classification was reached. This was then independently verified by two independent co-authors through imaging review. Image quantification on select groups was performed using Image J to ensure reproducibility. All the "hits" presented were also independently dissected and imaged. We did observe in the course of our screen that there were isolated cells with very low GFP in our dissections and this effect appeared random and we attributed it to somatic events, for example a recombination or mutation that mutates the UAS or GAL4. Other possibilities were changes in protein produrence, new somatic mutations that impact the UAS-GAL4 system or somatic mutations in peroxisomal genes occurring in single cells. We dealt with these "background" effects in the fat body through screening multiple clones and having independent observers select the hits.
The fat body experiments were a peroxisome focused secondary screen on a subset of mutants of the larger X-lethal collection reported n■ The Fly gene and specific allele are listed along with the phenotype observed in the screen (Category A, B and C). The human homologs of each gene were identified using DIOPT or HCOP (Hu et al. 2011;Braschi et al. 2019). Known biological function of the fly protein is listed according to the annotation in UniProt (Uniprot Consortium 2019). (Yamamoto et al. 2014), similar to an Atg8 and LAMP1 based screen to identify novel autophagy regulators conducted on the same collection (Fang et al. 2016). Similar secondary screens were performed on the same X-lethal collection to identify regulators of other biological processes such as ring canal formation and somatic stem cell maintenance during oogenesis (Yamamoto et al. 2013;Cook et al. 2017) demonstrating the value of this collection in screening for genes involved in diverse cellular processes. We named our screen the "X-Pex" putting the fly X-chromosome together with an abbreviation of peroxisome. We will refer to the gene set and the screen as "X-Pex" throughout the manuscript.
ImageJ Protocol for peroxisomal area counting Different scans of the original Z-stack images were used to calculate the area of the peroxisomes. The individual scan is divided in GFP-SKL(green layer) and RFP(red layer), converted to black and white and then by finding the edges the area of each peroxisome is calculated with freehand selection tool or wand tool. The detailed protocol is presented in the supplementary file.
Human gene candidate analysis Human homologs from the fly genes were determined using the Human Gene Nomenclature Orthology Prediction (HCOP, https://www.genenames.org/tools/hcop/) and the Drosophila RNAi Screening center Integrative Ortholog Prediction Tool (DIOPT, https://www.flyrnai.org/ cgi-bin/DRSC_orthologs.pl) tools (Hu et al. 2011;Gray et al. 2015). These genes were further examined in a series of public human and model organism databases using the MARRVEL tool (http://marrvel.org/) to gather information about the homologous proteins in human and other model organisms (Wang et al. 2017). Human gene nomenclature was confirmed using the HGNC (HUGO Gene Nomenclature Committee, https://www.genenames.org) database (Braschi et al. 2019). Mendelian disease links were explored in the OMIM (https:// www.omim.org/) database (Amberger et al. 2015), and each gene was Figure 1 Overall study design and outcomes from the Drosophila X-Pex screen. A. Procedure for the X-Pex screen. Drosophila Crossing Scheme shown in-detail. "lethal mut Ã " represents the different X-chromosome recessive lethal alleles used for the screen as listed in Supplemental Table 1. Males and females used for the experiment were crossed at room temperature. Females were allowed to lay eggs for 4 hr and the embryos were then heat-shocked at 37°C in a water-bath for 1hr and then kept at 25°C. The Fat bodies of the wandering third instar larvae were dissected, fixed and imaged by confocal microscopy. The homozygous mutant cells were identified through the absence of RFP (RFP-). B. Schematic representation of fat bodies expressing the GFP tagged peroxisome marker (GFP-SKL) with clones of mutant and wild-type cells. While homozygous mutant cells are marked by the absence of RFP, the sibling homozygous wild-type cell are marked by two doses of RFP (dark magenta). Heterozygous cells are marked with one dose of RFP (pale magenta) Category A represents an increase in peroxisomal numbers, Category B represents enlargement of peroxisomes, and Category C represents a loss of mislocalization of peroxisomal markers. C. Representative images of the categories explained in the Fig. B are shown in C' (Usp16-45) and C'' (fs(h)1). Detailed analysis and each individual channel for these two images are shown in Figure 3. D. Table  representing the overall results from the screen. 215 total lines were screened, 37 total allele hits from 18 genes were identified.

Identification of genes involved in peroxisomal dynamics
To identify new regulators of peroxisomal morphology and dynamics, we performed a screen with the collection of recessive lethal X-chromosome mutants that have been extensively studied by our group (Yamamoto et al. 2014). We had previously generated a large collection of recessive lethal mutant lines on an isogenized chromosome and these lines have been extensively screened for developmental and neurological phenotypes (Yamamoto et al. 2014;Deal and Yamamoto 2019). Moreover, we previously utilized this collection of mutations to uncover new human disease genes and showed that human orthologs of these fly essential genes are enriched for  (1)h clones (from B-B"), Rbcn-3B clones (from C-C"), Coq8 clones (from D-D") and Usp16-45 clones (from E-E"). Scale bars represent 50mm. Images are taken on Zeiss710-LSM microscope with 10X objective of NA 0.3 with 8 bit depth and 2200x2200 resolution.
genes listed in the Online Mendelian Inheritance in Man (OMIM) disease database (Yamamoto et al. 2014;Yoon et al. 2017;Tan et al. 2018). We took 215 lines from this collection that correspond to 100 genes (98 mapped genes and 2 unmapped complementation group) and screened for peroxisomal phenotypes using GFP tagged peroxisomes (Chao et al. 2016;Wangler et al. 2017a) in conjunction with clonal analysis, allowing generation of homozygous mutant cells within Drosophila larval fat body in an otherwise heterozygous animal to bypass early lethality. It is important to note that X-chromosome contains $15% of protein coding genes in Drosophila, and there is no correlation between X-linked genes in flies and humans. In this screen, we created homozygous mutant clones in the fat body of developing Drosophila larvae in an otherwise heterozygous animal and assayed for changes in the distribution pattern of a peroxisomal reporter, GFP-SKL ( Figure 1A). GFP-SKL is a GFP with a C-terminal peroxisomal localization signal, which we have previously shown to be an accurate marker for peroxisomal dynamics (Chao et al. 2016;Wangler et al. 2017a). We hypothesized that our screen could uncover three major categories of peroxisomal impact: Category A-an increase in the number of peroxisomes, Category B-an increase in the size of peroxisomes, and Category C-decrease or loss of peroxisomes or the marker ( Figure 1B-C).
We have previously observed examples of Category B in mutants with peroxisomal fission defects (Chao et al. 2016) and Category C in biogenesis defects (Wangler et al. 2017a) in Drosophila. We screened 215 lethal mutant lines from the already available collection that was mapped to a complementation group or to a gene ( Figure 1D, Supplemental Table 1). We considered a hit to be positive when multiple independent co-authors could differentiate a clear difference in the mutant clone compared to the surrounding (heterozygous or homozygous wild-type) cells ( Supplementary Figure 1). In total we identified 37 alleles corresponding to 18 genes (Table 1). For these hits, when possible we assayed two or more alleles per gene to confirm a change in GFP-SKL if possible. Some of these mutations led to an inconsistent increase of the peroxisome reporter among the fat body clones even within the same tissue, suggesting that perdurance of the wild type protein or other unmeasured factors may mask a change in peroxisomal dynamics in some clones. Interestingly, an overwhelming majority of the lines fell into Category A (Table 1). Although there are a few known peroxisomal related genes that are located on the Drosophila Xchromosome including Pex5 (homolog of human PEX5) and Mfe2 (homolog of human HSD17B4) (Faust et al. 2012), these were not part of the collection we screened here (Yamamoto et al. 2014). We  Figure 1C' to illustrate the Category A phenotype, and A' is the RFP channel while A'' is the GFP channel. B. Higher magnification images of fat bodies of fs(h)1 Fig 3B-B'' were taken on Zeiss710-LSM microscope with 63X objective of NA 1.4 with 16 bit depth and 512x512 resolution. The scale bars noted in the image are 10mm. B is the image shown in Figure 1C''to illustrate the Category B phenotype, and B' is the RFP channel while B''is the GFP channel. C. Quantification of number of peroxisomes present per mm 2 of Clonal and non-clonal area. This data represents the number of peroxisomes present per mm 2 of clonal/non-clonal area. Peroxisomes from five different stacks were counted with ImageJ and then divided with the total clonal/non-clonal area in that individual stack to get these data points. D. Quantification of percent area of total peroxisomes. The areas covered by peroxisomes in clonal as well as non-clonal regions from five different stacks were counted with ImageJ. The total of those peroxisomal areas in the same stack is divided by clonal/non-clonal area as shown in A' and B' of the same stack. The percent value of this is counted as one data point. note that none of the genes we isolated in this screen encode proteins that have been localized to peroxisome but are instead suggested to be involved in a wide variety of cellular processes (Table 1).
In the majority of the lines screened, there was no difference between the appearance of the peroxisomal (GFP-SKL) marker in the mutant clone vs. the surrounding sister cells, similar to the pattern seen in FRT19A controls (Figure 2A-A"). In one line, we observed enlargement and increased number of peroxisomes qualitatively ( Figure 2B-B"). This hit, fs(1)h, produced a Category A and B phenotype, and encodes a protein that has been reported to be involved in regulating proper expression of homeotic genes involved in pattern formation, such as Ultrabithorax (Florence and Faller 2008). In other hits, including Rbcn-3B, Coq8, and Usp16-45, we observed a Category A phenotype with more GFP-SKL punctae in the clones (Figure 2 C-E", Supplementary Figure 1). Some of the hits from this screen have been studied in other biological contexts. Rbcn-3B, which produced a Category A phenotype, encodes a protein involved in Notch signaling during oogenesis through its role in endocytic trafficking and lysosomal function (Yan et al. 2009). Coq8, which produced a Category A phenotype, encodes a mitochondrial inner membrane protein that is predicted to be involved in electron transport (Zhu et al. 2017) In contrast, many of the genes we identified, including Usp16-45 which is predicted to encode an ubiquitin specific protease, have not been extensively characterized in vivo and it is likely that these proteins also have additional functions as most genes are pleiotropic (Wangler et al. 2017b). We also quantified the number and the area of peroxisomes in the Usp16-45 and fs(1)h clones ( Figure 3A-B). We see a significant increase in the number of peroxisomes per mm 2 in the clones of both Usp16-45 and fs(1)h compared to surrounding cells consistent with the Category A phenotype ( Figure 3C). This increase in the number of peroxisomes and changes in size led to the increase in the GFP-SKL signal ( Figure 3D). These results indicate that screening for the GFP-SKL peroxisomal marker is an effective method to identify new genes that can impact peroxisome dynamics or morphology.
Human homologs of genes identified in the fly peroxisome screen are candidates for human disease In our previous studies from the X-screen collection we showed that the process of identification and screening for lethal mutations in Drosophila enriches for human Mendelian disease genes (Yamamoto et al. 2014). Indeed, our subsequent studies on this collection continued to yield novel human disease genes (Deal and Yamamoto 2019). We therefore wanted to assess this potential for genes from our peroxisome screen within public human databases.
First, we determined the human homolog of each fly gene with the highest predicted score using the Drosophila RNAi Screening center Integrative Ortholog Prediction Tool (DIOPT) tool (Hu et al. 2011;Gray et al. 2015) (Supplemental Table 2). We also utilized the MARR-VEL tool, MARRVEL allows for simultaneous display of public human genomic data and model organism phenotypes and conservation (Wang et al. 2017). For the eighteen fly genes that we considered hits, we found that sixteen of the eighteen had human homologs (88.9%). Thirteen out of sixteen fly genes had a single human homolog while remaining three of the genes had multiple human homologs (Table 1). We also wanted to know if any of the genes were already linked to a human single gene disorder. We examined the twenty human homologs from the X-Pex screen in a Mendelian disease using the Online Mendelian Inheritance in Man (OMIM) database (Amberger et al. 2015). Of the twenty-three human genes, nine were listed in relation to at least one Mendelian phenotype where the gene is causative for a n■ The human homologs of the X-Pex genes were examined for known Mendelian disease association (OMIM # entries) with genes that are not known to cause disease shown in red (Amberger et al. 2015). These are further sorted using data from the public human database gnomAD and the DOMINO scoring system for dominant disease. "pLI" score shows the probability (from 0-1) of a gene having intolerance to loss-of-function variation in the population of individuals represented in gnomAD data. "Missense z-score" show a z-score value for rates of missense variation in a gene. "pLI-o/e" is the observed / expected for loss-of-function variants in a gene, while "Missense o/e" is a similar ratio for missense variants. For DOMINO scores the code shows A = "Very likely dominant (0.8-1)", B = "Likely dominant (0.6-0.7)", C = "Either dominant or recessive (0.4-0.5)", D = "Likely recessive (0.2-0.3), E = "Very Likely recessive (0-0.1)" described disease (Supplemental Table 3). Fourteen human genes have no known single gene disorder (Table 2 and Supplementary Table 3). In our previous work in this collection, we observed that by screening for lethality, we enrich for essential genes in fly that are homologs of disease genes in humans (Wangler et al. 2015). We therefore hypothesized that these fourteen human genes from our screen, not currently associated with human disease could be considered good candidates for undiagnosed cases. One way to assess whether these genes could be good candidates for undiagnosed disease is to examine whether damaging or deleterious variants in the genes occur in the population at large. If variants in this gene have a deleterious effect prior to reproduction these variants will be selected against in the general population. In order to explore evidence for this we examined public human genomic databases, we were looking for evidence of selective constraint, or lack of damaging variation indicating that gene is under selection in humans, in all twenty-three human genes identified by our screen. To do this we examined the gnomAD database which is a large genome and exome aggregation largely selected for healthy or adult-onset disease cases (Lek et al. 2016). For each gene we examined the constraint metrics or evidence that damaging variants in the gene are absent from these "control" individuals (Table 2, Supplemental Table 3). This type of information would point to a gene being under strong selection twelve of the twenty-three genes had evidence of being under selection. We noted that twelve genes had "observed over expected" (o/e) numbers for loss-of-function variants at 0.2 (20%) or less indicating fewer loss-of function alleles than anticipated by chance. This data suggests that there may be some selection against loss of function alleles for approximately half of the genes possibly due to a haploinsufficient mechanisms. In summary, the genes from our screen could be considered good candidates for undiagnosed cases as they exhibit selective constraint in healthy individuals.
The constraint metrics of the gnomAD dataset are in fact most valuable for showing selective constraint for heterozygous alleles, such as de novo mutations or dominant inherited disorders, particularly with early onset or an impact on reproduction (Lek et al. 2016). This is because it is much easier to observe a single damaging allele than homozygous or compound heterozygous. We therefore wondered if the X-Pex gene set could be considered good candidates for dominant disease and we examined this through two strategies: 1) comparing the gene metrics to known recessive peroxisomal genes and 2) use of the DOMINO tool for predicting dominant disorders. Figure 4 Comparison of known human peroxisomal disease genes to the new X-Pex candidates. A. The Probability of Loss of Function intolerance score (pLi) calculated from human data from the gnomAD database (Lek et al. 2016). The X-Pex genes displayed a mean pLi score of 0.55 6 0.11, n = 20, while the known peroxisomal disease genes had a mean pLi of 0.14 6 0.06, n = 25, which was statistically significant (P = 0.0016) ÃÃ . This was also compared to all the homologs of the X screen genes. B. The observed over expected (o/e) loss of function scores calculated from public human data from the gno-mAD database. The X-Pex genes had a mean o/e score of 0.29 6 0.06, n = 20, while the known peroxisomal disease genes had an o/e score of 0.50 6 0.06, n = 25, which was statistically significant (P = 0.0218) Ã . This was also compared to all the homologs of the X screen genes. C. The missense constrain z-score calculated from public human data from the gnomAD database. The X-Pex genes had mean missense constrain z-scores of 2.16 6 0.34, n = 20, while the known peroxisomal genes had z-scores of 0.676 0.23, n = 25, which was statistically significant (P = 0.0005) ÃÃÃ . This was also compared to all the homologs of the X screen genes. D. Missense constraint o/e scores calculated from public human data from the gnomAD database. The X-Pex genes had a mean o/e for missense variants of 0.73 6 0.04, n = 20, compared to the known peroxisomal disease genes o/e score of 0.90 6 0.03, n = 25, also statistically significant (P = 0.0025) ÃÃ . This was also compared to all the homologs of the X screen genes. E. DOMINO scores calculated for the gene sets. The X-Pex gene set had a DOMINO score of 0.53 6 0.08, n = 20, while the known peroxisomal disease genes had a mean DOMINO score of 0.17 6 0.04, n = 24, and the difference was statistically significant (P , 0.0001) ÃÃÃ . This was also compared to all the homologs of the X screen genes.
In the first approach we compared these same characteristics of these genes to two other sets of genes, first we compared the X-Pex gene set to all the other homologs of the genes that we screened. We also compared to a group of twenty-five well known human peroxisomal disease genes that encode proteins in the peroxisome biogenesis machinery and enzymes involved in very-long-chain fatty acid oxidation, plasmalogen synthesis and reactive oxygen species (Supplemental Table 4). Comparing these three gene sets we found that the human homologs of the essential fly genes had significantly higher probability of loss of function intolerance ( Figure 4A-B), and higher missense constraint ( Figure 4C-D) in the human databases compared to the known peroxisomal disease genes. This was consistent across both genes that were positive in the peroxisome secondary screen as well as negative. These intolerance scores apply more to dominant disorders than autosomal recessive. Consistent with that, all known PEX gene-related Peroxisome biogenesis disorders are autosomal recessive (Braverman et al. 2016). (Supplemental Table 4). We therefore hypothesized that the selection of lethals in our original fly screen pointed us to a set of human genes that are more likely to underlie dominant disease.
In order to test this, we used the DOMINO tool to assess the probability of dominant disease for each gene (Quinodoz et al. 2017). The DOMINO score, indicating the likelihood of dominant disorders also differed between the X-Pex gene set and the known peroxisomal disease gene set ( Figure 4E). The X-Pex gene set had a higher DOMINO score, while the known peroxisomal disease genes had lower DOMINO scores, thus more likelihood of relating to recessive disease and the difference was statistically significant (P , 0.0001). As noted, for the known peroxisomal genes this is indeed the case, as twenty-two of the twenty-five genes are disease genes for autosomal recessive disorders (Supplemental Table 4).
With this data we predict that some X-Pex genes could underlie dominant phenotypes, we sought evidence for dominant phenotypes related to alleles in the set of X-Pex genes using public databases of de novo events from individuals with disease and controls (Turner et al. 2017). This dataset primarily focuses on neurodevelopmental phenotypes and the de novo events from diverse cohorts. Strikingly, we observed suggestive results for six genes from the X-Pex gene set with high DOMINO scores (GSK3A, BRD4, UPF1, BRD3, GSK3B and SMAD3) and at least one individual with developmental delay, Autism, or congenital heart disease. These cases are not definitively linked to these loci and are noted to have a missense de novo event. Interestingly no missense de novo events are observed in control individuals for these genes (Supplemental Table 5).
Whether these genes are ultimately good candidates remains to be explored in undiagnosed cases as these database searches do not definitely link the specific gene with disease. It is also not known whether the peroxisomal phenotype that was observed in our screen would be conserved in humans. Peroxisomes are not routinely examined in clinical samples, so this data provides a key starting point for these genes of interest. Even a secondary impact on peroxisomes without direct interaction could aid in exploring these candidate genes further.
Taken together we propose the X-Pex provides a good list of candidate genes, in particular, de novo events in these genes from patients with neurodevelopmental phenotypes should be explored. Considering that peroxisomal disease classically relates to autosomal recessive conditions, the X-Pex gene list may provide an entry point to study the role of de novo events in genes that impact peroxisomes that have been missed in previous screens. This study therefore provides additional support for the use of forward genetic screens in model organisms in the study and identification of human disease genes (Yamamoto et al. 2014;Wangler et al. 2015).