A functional genomics screen identifying blood cell development genes in Drosophila by undergraduates participating in a course-based research experience

Abstract Undergraduate students participating in the UCLA Undergraduate Research Consortium for Functional Genomics (URCFG) have conducted a two-phased screen using RNA interference (RNAi) in combination with fluorescent reporter proteins to identify genes important for hematopoiesis in Drosophila. This screen disrupted the function of approximately 3500 genes and identified 137 candidate genes for which loss of function leads to observable changes in the hematopoietic development. Targeting RNAi to maturing, progenitor, and regulatory cell types identified key subsets that either limit or promote blood cell maturation. Bioinformatic analysis reveals gene enrichment in several previously uncharacterized areas, including RNA processing and export and vesicular trafficking. Lastly, the participation of students in this course-based undergraduate research experience (CURE) correlated with increased learning gains across several areas, as well as increased STEM retention, indicating that authentic, student-driven research in the form of a CURE represents an impactful and enriching pedagogical approach.


Introduction
The Undergraduate Research Consortium for Functional Genomics (URCFG) was established at UCLA in 2003 as an entity representing the collaborative research effort of undergraduates, typically first- and second-year students, participating in a discovery-based laboratory course called Biomedical Research 10H (formerly Life Sciences 10H). Since that time, the URCFG has conducted several large-scale genetic research projects that have yielded publishable data and research resources (Chen et al. 2005; Liao et al. 2006; Call et al. 2007; Evans et al. 2009; Olson et al. 2019).
The current URCFG research project centers on the discovery of new genes controlling hematopoiesis (blood formation) in the fruit fly, Drosophila melanogaster. Over the last two decades, the fly has become an increasingly popular model for investigating the molecular mechanisms regulating blood cell specification, development, and function (Evans et al. 2003; Gold and Brückner 2014; Letourneau et al. 2016; Banerjee et al. 2019). This is due in large part to the established strength of Drosophila genetics and many developmental and functional parallels between human and fly blood systems. From a relative perspective, the development of the human blood system is extremely well understood, owing to a long history of observational and functional studies ex vivo, the development of blood and bone marrow transplant technologies in medicine, and the creation and analyses of a variety of highly relevant models such as the mouse and, more recently, zebrafish. Nevertheless, the human blood system is highly complex, and much is still to be learned about the genes that control development and, when disrupted, cause disease.
In both flies and humans, mature blood cell types are derived from progenitor cells through highly regulated differentiation. In humans, multipotent hematopoietic stem cells (HSCs) give rise to blood progenitors that belong to either the myeloid or the lymphoid lineage, which further differentiate into a variety of mature forms (Orkin and Zon 2008). Likewise, multipotent progenitor cells give rise to the mature blood cell types in Drosophila (Jung et al. 2005), although it is still unclear whether true blood stem cells are present in the fly. Drosophila blood cells (also called hemocytes) originate in two separate specification events that differ in space and time. The first wave of hematopoiesis occurs in the embryonic head mesoderm and creates blood cells that quickly mature and migrate throughout the developing embryo, eventually becoming the circulating blood cells of the larva. A subset of these cells, many of which appear to retain progenitor characteristics, become sessile, attaching to the lateral body wall around the chordotonal organs and to various internal organs (Márkus et al. 2009; Makhijani et al. 2011; Leitão and Sucena 2015). The second, independent wave of blood cell specification begins slightly later in the embryonic cardiogenic mesoderm and contributes early blood progenitors that collectively form a specialized, multi-lobed organ called the lymph gland. During the larval stages, the lymph gland grows in size as these blood progenitors proliferate, and in the mid-second instar, a subset of these cells begins to differentiate (Jung et al. 2005). By the late third instar, the lymph gland primary lobes (the largest and the most anterior) contain organized, spatially restricted populations of mature and progenitor blood cells that occupy the Cortical Zone (CZ) and Medullary Zone (MZ), respectively (Jung et al. 2005).
Additionally, a small group of dedicated regulatory cells, called the Posterior Signaling Center (PSC), is located at the posterior end of the primary lobes and influences progenitor cell maintenance and differentiation (Lebestky et al. 2003; Sinenko and Mathey-Prevot 2004; Jung et al. 2005; Krzemień et al. 2007; Mandal et al. 2007; Tokusumi et al. 2010, 2012, 2015). Drosophila has three defined terminally differentiated blood cell types called plasmatocytes, crystal cells, and lamellocytes (Olson et al. 2019). Plasmatocytes are professional phagocytes, similar to human macrophages and neutrophils, and are by far the most prevalent blood cell type (~95%) produced. Crystal cells make up most of the remainder and have roles in blood coagulation, sclerotization, and melanization, reminiscent of the role of megakaryocytes and derivative platelets in clotting. Lamellocytes are large, flat cells that are rare under normal developmental conditions, but can be induced to develop upon immune challenge. In the wild, fly larvae are the targets of parasitoid wasps that inject their embryos into the body cavity. In response, Drosophila larvae produce lamellocytes that, in conjunction with plasmatocytes and crystal cells, isolate and kill the wasp embryo through encapsulation, much like granuloma formation by specialized macrophages in humans (Rizki and Rizki 1992; Cronan et al. 2016). Thus, Drosophila blood cells exhibit key functional similarities to cells of the human myeloid lineage (Bidla et al. 2007; Buchon et al. 2014; Gold and Brückner 2014, 2015).
With regard to the genetic control of hematopoietic development, numerous studies have highlighted the conserved function of important signaling systems and gene expression regulators between Drosophila and humans (Evans et al. 2003; Banerjee et al. 2019). For example, mesodermal formation of the Drosophila lymph gland and the mammalian aorta-gonadal-mesonephros (AGM) region, from which early blood cells are derived, both require FGF, BMP, and Wnt signaling (Mandal et al. 2004). Additionally, blood cell specification and lineage commitment in both flies and mammals require the function of GATA and Runx family transcriptional regulators (Daga et al. 1996; Rehorn et al. 1996; Lebestky et al. 2000; Han and Olson 2005). Other conserved transcription factors, including HOX, FOG, and EBF homologs (Fossett et al. 2001; Crozatier et al. 2004; Mandal et al. 2007), have also been shown to share regulatory roles. The activity of such factors is itself regulated by an assortment of signaling pathways, such as the Pvr, FGF, and EGF receptor tyrosine kinase (Brückner et al. 2004; Jung et al. 2005; Mondal et al. 2011; Sinenko et al. 2012; Dragojlovic-Munther and Martinez-Agosto 2013), JAK/STAT (Harrison et al. 1995; Luo et al. 2002), Notch (Duvic et al. 2002; Lebestky et al. 2003), Wingless (Sinenko et al. 2009), and Hedgehog pathways (Mandal et al. 2007), which are also conserved.
Though our understanding of the genetic control of hematopoietic development in Drosophila continues to grow, what is known is extremely limited from a genomic perspective. Most of the hematopoietic genes that have been identified to date stem from trial-and-error analysis of important genes known from other contexts, and a small number of forward genetic screens that produced discernible hematopoietic phenotypes. Sequencing of the fly genome has identified almost 14,000 protein coding genes, but which subset of the genome regulates hematopoietic development is largely unknown. Thus, the URCFG initiated a functional genomics project, in which reverse genetic analysis was used to link Drosophila genes to hematopoiesis. Moreover, by engaging in authentic research experiences, students show compelling learning outcomes, even when compared with students in traditional laboratory courses or summer laboratory apprenticeships.

RNAi lines
Transgenic RNAi lines for screening were obtained from the Vienna Drosophila RNAi Center (VDRC, Vienna, Austria; GD and KK collections), the National Institute of Genetics (Kyoto, Japan; NIG-R lines), and the Bloomington Drosophila Stock Center (BDSC, Bloomington, Indiana; TRiP lines). Acquired RNAi lines were randomly assigned to students participating in the primary screen and the secondary screen, and each RNAi line was assigned to a minimum of two students. Each RNAi line was continually screened until two complete data sets (see below) were acquired. For target gene validation, the BDSC was searched for alternate RNAi lines targeting 24 candidate genes identified by HmlΔ-GAL4 in our secondary screen (those causing strong increases in HmlΔ-DsRed fluorescence); 14 alternative RNAi lines were available, obtained, and screened (Supplementary Figure S1).
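The assignment scheme described above can be sketched in a few lines of Python. This is only an illustration: the helper names `assign_lines` and `needs_rescreening`, the student and line labels, and the sampling strategy are assumptions, since the text does not say how randomization was actually carried out.

```python
import random

def assign_lines(lines, students, min_students=2, seed=None):
    """Randomly assign each RNAi line to `min_students` distinct students.

    Hypothetical helper: names and strategy are illustrative assumptions.
    """
    if len(students) < min_students:
        raise ValueError("not enough students to cover each line")
    rng = random.Random(seed)
    assignments = {student: [] for student in students}
    for line in lines:
        # Sample without replacement so one student never gets the same line twice.
        for student in rng.sample(students, min_students):
            assignments[student].append(line)
    return assignments

def needs_rescreening(complete_datasets, required=2):
    """Return the lines that stay in the pool until `required` complete data sets exist."""
    return [line for line, n in complete_datasets.items() if n < required]
```

A line with fewer than two complete data sets is simply returned to the pool and handed out again in the next round.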

Crossing conditions
Virgin GAL4 females were crossed to males from individual UAS-hpRNA lines or to males from w1118 (BDSC 5905) as a control. Crosses to HHLT-GAL4 and HmlΔ-GAL4 were reared at 29 °C to maximize RNAi-based phenotypes. Crosses to Antp-GAL4 and domePG14-GAL4 were placed directly at 29 °C or reared for one day at 18 °C before shifting to 29 °C. Crosses to Antp-GAL4 and domeMESO-GAL4 with elav-GAL80 were reared at room temperature for one day before shifting to 29 °C.

Processing and imaging of larvae
Wandering third-instar larvae (non-Tb) were collected, washed with water, and placed into glass spot well plates (Fisher) on ice to minimize movement. Depending upon balancer chromosomes present in the parental GAL4 driver line, larvae were sometimes prescreened for the presence of GFP and DsRed expression. Four immobile larvae were aligned dorsal side up along the anterior/posterior axis on the bottom (flat surface) of a glass spot well plate that was chilled on ice. Larvae were then imaged for GFP or DsRed fluorescence using a Zeiss Stemi SV11 fluorescence stereo dissection microscope (1.0× objective lens, 0.8× magnification) equipped with an AxioCam MRm camera, controlled by Zeiss AxioVision imaging software. Imaging 12 larvae (three sets of four larvae) for each cross was considered a complete data set.

Phenotype screening in whole animals
Reporter gene expression (fluorescence) in progeny larvae activating RNAi within the hematopoietic system was compared with that of progeny larvae in which RNAi was absent (from control crosses). For the primary (HHLT-GAL4 UAS-GFP) screen, students noted changes to fluorescence associated with the lymph gland region, including the posterior pericardial cells, and the circulating blood cell population. Changes noted were varying levels of increased or decreased fluorescence for lymph glands (including missing or partially missing), whether pericardial cells were absent, and increased or decreased circulating cell density (including clumps and melanotic tumors). For the secondary screen with HmlΔ-DsRed as a marker, students noted changes to fluorescence associated with the lymph gland region and the circulating blood cell population. Changes noted were varying levels of increased or decreased fluorescence for lymph glands (including missing or partially missing) and increased or decreased circulating cell density (including clumps and melanotic tumors). RNAi phenotypes were scored by two or more students in both the primary and the secondary screens, with "hits" selected on the basis of reproducible phenotype scores at each stage. Because circulating cell phenotypes varied in several ways, scoring was more subjective. Thus, RNAi lines reproducibly causing circulating cell phenotypes were consolidated into a single group comprising lines that cause any relative change (Supplementary Table S3).
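The requirement that a hit be scored reproducibly by two or more students can be expressed as a simple consensus rule. The sketch below is a hypothetical formalization: the labels ("increase", "decrease", "none"), the function name `call_hit`, and the treatment of conflicting scores as unresolved are assumptions, not the procedure used in the course.

```python
from collections import Counter

def call_hit(scores, min_agree=2):
    """Return the consensus phenotype if exactly one label reaches `min_agree`, else None.

    `scores` holds one label per student: "increase", "decrease", or "none".
    """
    # Ignore "no phenotype" calls; only positive observations can define a hit.
    counts = Counter(s for s in scores if s != "none")
    agreed = [label for label, n in counts.items() if n >= min_agree]
    # Conflicting reproducible calls (e.g. two "increase" and two "decrease")
    # are treated as unresolved rather than forced into one category.
    return agreed[0] if len(agreed) == 1 else None
```

For example, `call_hit(["decrease", "none", "decrease"])` yields a "decrease" call, whereas a single unreplicated score yields no call.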

Bioinformatic analysis
For RNAi lines causing a developmental phenotype, associated target genes were identified through their respective stock center databases. Gene information and protein sequences were retrieved from FlyBase (Attrill et al. 2016). Potential human homologs were identified using the Basic Local Alignment Search Tool (BLAST; National Center for Biotechnology Information) featuring the protein-protein BLAST (blastp) algorithm. Functional annotation of genes was performed using the STRING protein-protein interaction database (v11.0; Szklarczyk et al. 2019), which also includes the Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathway database (Kanehisa and Goto 2000) and Reactome database (Fabregat et al. 2018) as analysis tools.
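For readers who wish to repeat this kind of functional annotation programmatically, STRING also exposes a public REST interface. The sketch below only builds a request URL and makes no network call; it assumes the current STRING API layout (an `enrichment` method taking `identifiers` and `species`), which may differ from the v11.0 release cited here, and the function name is hypothetical. The analysis in this study may well have been run through the web interface instead.

```python
from urllib.parse import urlencode

def string_enrichment_url(genes, species=7227,
                          base="https://string-db.org/api", fmt="tsv"):
    """Build a STRING functional-enrichment request URL (no request is sent).

    species=7227 is the NCBI taxonomy ID for Drosophila melanogaster.
    The base URL and output format are assumptions about the current API.
    """
    # STRING expects multiple identifiers separated by carriage returns.
    params = {"identifiers": "\r".join(genes), "species": species}
    return f"{base}/{fmt}/enrichment?{urlencode(params)}"

# Example with two gene symbols named in the text.
url = string_enrichment_url(["Chc", "Rab5"])
```

Gene symbols may need to be mapped to STRING identifiers before submission.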

Assessment of learning gains
Learning gains were assessed using the Survey of Undergraduate Research Experiences (SURE) II (Lopatto 2004), which offers both the Classroom Undergraduate Research Experiences (CURE) survey and the Summer Undergraduate Research Experience (SURE) survey. The CURE and the SURE surveys include identical items that permit comparisons; URCFG students and the "All students" group took the CURE survey, while the "All summer research students" group took the SURE survey. A total of 308 UCLA undergraduates participating in this URCFG RNAi CURE project identified as follows: 53.9% female (n = 166), 46.1% male (n = 142); of 294 respondents, 10.1% were URM (n = 31), where URM includes American Indian/Alaskan Native, Black/African American, or Hispanic/Latinx; student make-up by year: first-year, 33.1% (n = 102), second-year, 41.6% (n = 128), third-year, 20.8% (n = 64), and fourth-year, 4.5% (n = 14).

Identification of new hematopoietic genes
To identify hematopoietic genes, 339 URCFG students used RNA interference (RNAi) to disrupt the function of approximately 3500 genes within the developing blood system. In our experimental approach, pseudo-double-stranded hairpin RNAs (hpRNAs) are produced within cells from a transgene containing an inverted-repeat DNA sequence corresponding to a specific target gene (Ni et al. 2008). Subsequently, these hpRNAs are recognized and processed into an active RNA-induced silencing complex (RISC), initiating the RNAi response and the eventual degradation of target gene mRNAs (Mohr et al. 2014). Restriction of hpRNA production to blood cells was achieved by using the GAL4/UAS gene expression system derived from yeast (Elliott and Brand 2008). Students crossed GAL4-expressing lines with RNAi lines in which target-gene inverted-repeat sequences are under the control of the GAL4-responsive UAS enhancer. The primary RNAi screen made use of the HHLT-GAL4 line (Mondal et al. 2014), in which GAL4 is expressed throughout the blood system. The HHLT-GAL4 line also contains a UAS-GFP transgene, allowing for direct observation of the hematopoietic tissues (the lymph gland and circulating cells) in whole animals using fluorescence microscopy. An overview of the experimental design is shown in Figure 1. Using this line for screening over the course of several years, URCFG students ultimately identified 137 candidate genes (148 RNAi lines) involved in hematopoiesis (Table 1; see Figure 2 for examples).
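The scale of the primary screen can be summarized with a little arithmetic, using only figures stated in the text: roughly 3500 genes screened, a genome of almost 14,000 protein-coding genes, and 137 candidate genes represented by 148 RNAi lines. The multi-line counts (seven genes hit by two lines, two genes hit by three) are taken from the validation results; the variable and function names are illustrative, and the rounded genome size is an approximation.

```python
GENES_SCREENED = 3500     # "approximately 3500 genes"
GENOME_SIZE = 14000       # "almost 14,000 protein coding genes" (rounded)
CANDIDATE_GENES = 137
CANDIDATE_LINES = 148

# Genes recovered by more than one RNAi line in the primary screen.
GENES_WITH_TWO_LINES = 7
GENES_WITH_THREE_LINES = 2

def percent(part, whole, digits=1):
    """Percentage of `whole` represented by `part`, rounded to `digits` places."""
    return round(100 * part / whole, digits)

def expected_line_count(genes, two_line_genes, three_line_genes):
    """One line per gene, plus one extra per two-line gene and two extra per three-line gene."""
    return genes + two_line_genes * 1 + three_line_genes * 2

coverage = percent(GENES_SCREENED, GENOME_SIZE)       # share of the genome screened
hit_rate = percent(CANDIDATE_GENES, GENES_SCREENED)   # share of screened genes scored as hits
lines = expected_line_count(CANDIDATE_GENES, GENES_WITH_TWO_LINES, GENES_WITH_THREE_LINES)

print(f"coverage: {coverage}%  hit rate: {hit_rate}%  lines: {lines}")
```

The reconstructed line count agrees with the 148 RNAi lines reported, so the gene and line totals are internally consistent.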

Cell-type specific RNAi and the effect on blood cell maturation
The primary RNAi screen with HHLT-GAL4 UAS-2XEGFP was useful in identifying candidate hematopoietic genes due to the relative ease of discerning gross defects in the lymph gland and the circulating blood cells through changes in GFP fluorescence. However, this screen could indicate neither a cell-type-specific function for the identified gene (as HHLT-GAL4 is expressed in mature, progenitor, and signaling cells) nor the specific impact on blood lineage development. To address these limitations and further delineate the functions of the identified candidate genes, a secondary screen was conducted in which RNAi was directed to differentiating cells using HmlΔ-GAL4, to progenitor cells using dome-GAL4, or to PSC cells using Antp-GAL4, with HmlΔ-DsRed serving as a reporter of blood cell maturation. In this way, candidate genes with developmental roles in specific blood cell populations could be identified.
We compiled a collection of 202 RNAi lines comprising the 148 lines identified in the primary screen, as well as 54 lines (Supplementary Table S1) that target either the primary screen candidate genes redundantly (20 genes) or genes predicted to function in related processes or pathways. Over the course of five academic quarters (Winter 2015-Spring 2016), students crossed RNAi lines from the 202-line collection with the three GAL4 drivers described above and analyzed DsRed fluorescence (HmlΔ-DsRed expression) in whole, wandering third-instar larvae. Each RNAi line was assigned to two or more students, with the goal of collecting at least two complete data sets for each GAL4 driver/RNAi line cross combination. The collection of imaging data for 12 progeny larvae from a given cross was considered a complete data set, and individual RNAi lines remained within the assignment pool until two complete image data sets were obtained. The duplicate completion rate for the entire RNAi line collection was 41% (83 lines) for all three GAL4 drivers, 78% (158 lines) for at least two of three GAL4 drivers, and 95% (191 lines) for at least one GAL4 driver. If single-complete data sets are included, the completion rate increases to 75% (151 lines) across all three drivers, to 99% (199 lines) for at least two of three GAL4 drivers, and to 100% (202 lines) for at least one GAL4 driver (Supplementary Table S2). With respect to RNAi in the PSC (HmlΔ-DsRed; Antp-GAL4 with or without elav-GAL80; Rideout et al. 2010), 188 of the 202 RNAi lines were analyzed (93%), eight of which were found to be lethal (presumably due to GAL4 activity outside of the lymph gland that is not suppressed by elav-GAL80). Of the 180 viable lines, 160 (79% of the 202-line collection) were completed in duplicate. For RNAi screening in progenitor cells (HmlΔ-DsRed; dome-GAL4 or elav-GAL80; HmlΔ-DsRed; domeMESO-GAL4), students successfully screened 186 RNAi lines (of 202; 92%), of which 137 (68%) were completed in duplicate.
Similar screening in the maturing blood cell population (HmlΔ-DsRed HmlΔ-GAL4) was successful for 182 RNAi lines (of 202; 90%), 135 (67%) of which were completed in duplicate (Supplementary Table S2).
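The completion rates quoted above all use the 202-line collection as the denominator, which can be checked directly. The short sketch below recomputes each percentage from the reported line counts; the dictionary keys and the `rate` helper are illustrative names, not part of the original analysis.

```python
TOTAL_LINES = 202  # 148 primary-screen lines + 54 added lines

# Lines completed in duplicate, by number of GAL4 drivers covered.
duplicate = {"all three drivers": 83, "at least two drivers": 158, "at least one driver": 191}
# Lines with at least a single complete data set.
single = {"all three drivers": 151, "at least two drivers": 199, "at least one driver": 202}
# Per-driver counts: (lines screened, lines completed in duplicate).
per_driver = {"Antp-GAL4": (188, 160), "dome-GAL4": (186, 137), "HmlΔ-GAL4": (182, 135)}

def rate(n, total=TOTAL_LINES):
    """Percentage of the full collection, rounded to the nearest whole number."""
    return round(100 * n / total)

for label, n in duplicate.items():
    print(f"duplicate, {label}: {n} lines = {rate(n)}%")
for label, n in single.items():
    print(f"single or better, {label}: {n} lines = {rate(n)}%")
for driver, (screened, dup) in per_driver.items():
    print(f"{driver}: screened {rate(screened)}%, duplicate {rate(dup)}%")
```

Note that the 79% duplicate rate for the PSC driver is relative to all 202 lines, not to the 180 viable lines.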
As described previously, phenotypic analysis in the secondary screen involved discerning the variance of HmlΔ-DsRed reporter expression between RNAi and control (non-RNAi) backgrounds as viewed in whole, third-instar larvae. While this is a highly specific and, therefore, powerful molecular genetic tool, the usefulness of HmlΔ-DsRed in this RNAi screen is offset by variability in lymph gland phenotypes, possibly due to incomplete phenotypic penetrance and/or expressivity, within the 12 RNAi-larvae sample group. Additionally, the relative inexperience of the undergraduate researchers, with Drosophila in general and the hematopoietic system in particular, sometimes made their identification of subtle phenotypic changes difficult. Therefore, to increase the likelihood that RNAi lines (candidate genes) are identified correctly, the developmental phenotype caused by each RNAi line was independently scored by two or more students. Scoring consisted of first determining whether a phenotype was present and, if so, then describing and categorizing the nature of the phenotype. RNAi lines identified more than once and causing similar hematopoietic phenotypes (the majority of lines identified) were subdivided into those causing an increase in lymph gland HmlΔ-DsRed expression and those causing a decrease in lymph gland HmlΔ-DsRed expression. Though not our focus, changes in HmlΔ-DsRed expression among circulating cells were also noted (the vast majority of which also had a lymph gland phenotype; Supplementary Table S3).
Directing RNAi to the PSC using Antp-GAL4 identified 20 RNAi lines (representing 19 genes) that cause an increase in HmlΔ-DsRed lymph gland fluorescence and 15 RNAi lines (representing 15 genes) that cause a decrease in HmlΔ-DsRed lymph gland fluorescence (see Figure 3 for examples; Table 2). This analysis also identified 13 RNAi lines (representing 13 genes) associated with a change in the circulating cell population (Supplementary Table S3). Of these 13 RNAi lines, three overlap with RNAi lines increasing lymph gland DsRed fluorescence and four overlap with RNAi [...] Table 3). Another 17 RNAi lines (representing 16 genes) were identified that cause a change in the circulating blood cell population (Supplementary Table S3).
Figure 1. A functional genomics screen for new hematopoietic genes in Drosophila. In the primary screen, RNAi occurred throughout the larval hematopoietic system, which specifically expressed GFP. Briefly, HHLT-GAL4 UAS-GFP flies (Mondal et al. 2014) were crossed to flies carrying different UAS-hairpin RNA (hpRNA) transgenes, each targeting a unique gene. Progeny third-instar larvae expressed both hpRNAs (eliciting an RNAi response) and GFP throughout the blood system. Expression of GFP was monitored by fluorescence microscopy in whole larvae, four at a time. RNAi lines causing a discernible increase or decrease in GFP fluorescence, relative to control larvae lacking RNAi, were selected for use in the secondary screen. In the secondary screen, RNAi line "hits" from the primary screen were crossed to population-specific GAL4 driver lines (HmlΔ-GAL4 for maturing cells, dome-GAL4 for progenitor cells, and Antp-GAL4 for Posterior Signaling Center cells). These GAL4 driver lines also carried HmlΔ-DsRed as a reporter of blood cell maturation. Expression of DsRed was monitored by fluorescence microscopy in whole larvae, four at a time.
Table 1 Identified

Validation of RNAi line gene targets
The use of RNAi is a well-established experimental approach to quickly link genes with developmental functions, and our results with the reported lines are highly reproducible. However, it is possible that off-target effects of RNAi may be responsible for some of the observed phenotypes and may also account for differential RNAi effects between primary and secondary screens. A common genetic approach to validating RNAi phenotypes is to use additional lines targeting the same gene. Replication of the phenotype with multiple RNAi lines increases the likelihood that functional disruption of the target gene is the cause. While it was not feasible for us to do this type of cross-validation for the entire RNAi line collection, we attempted to validate a subset of lines in this manner. We obtained 14 new RNAi lines targeting genes that, when disrupted in mature blood cells (HmlΔ-GAL4) using screen RNAi lines, cause an increase in HmlΔ-DsRed fluorescence (Supplementary Figure S1). Several such RNAi line cross-validations also appeared within the screen itself. For example, in the primary screen using HHLT-GAL4, seven genes (zip, AP-2α, Rep, αSnap, Bap60, eIF3m, and CSN1b) were identified by two different RNAi lines, and two genes (Nup93-1 and Chc) by three different RNAi lines (Table 1). Secondary, cell-type-specific screening also identified multiple RNAi lines targeting CG14230, crn, and Chc (Tables 2-4). Additional evidence pointing to the validity of the RNAi lines identified in the screen is the independent identification of genes with linked functions. For example, the primary screen (HHLT-GAL4) identified several components of the COP9 Signalosome (CSN), an important negative regulator of cullin-RING ubiquitin ligases (Dubiel et al. 2020), namely CSN1b (identified twice), CSN2 (alien), CSN5, and CSN6 (Table 1).
Likewise, the screen uncovered Nup88 (mbo), Nup93-1 (found twice), Nup214, and Nup358, genes encoding subunits (nucleoporins) of the Nuclear Pore Complex (NPC), which regulates nucleocytoplasmic shuttling, gene expression, and a variety of other cellular processes (Mondal et al. 2014; Kuhn and Capelson 2019; Cho and Hetzer 2020). Beyond multisubunit complexes, many screen-identified genes delineated functional pathways or systems within the cell. For example, the collective identification of the genes Clathrin heavy chain, shibire (encoding Dynamin), Amphiphysin, three of four AP-2 adapter genes (AP-2α, AP-2σ, and AP-2μ), Rab5, and Syntaxin7 suggests an important regulatory role for endosome formation and trafficking in hematopoiesis. The secondary screening, which examined the primary screen RNAi lines as well as additional RNAi lines targeting the same or related genes, identified CSN5 and CSN6 again (with dome-GAL4; Table 3), but also CSN4, CSN7, CSN8, and Nup154 (with HmlΔ-GAL4; Table 4), among others. Thus, while RNAi off-target effects may account for some of the hematopoietic phenotypes we have observed, the results described above collectively point to the general validity of our RNAi line collection and reinforce our association of target gene function with hematopoiesis.

Bioinformatic analysis of identified gene sets
To better understand the nature of the genes identified in the primary and the secondary screens, we analyzed each gene set using the online STRING protein interaction database (v11.0; Szklarczyk et al. 2019). Examination of the set of 137 genes identified in the primary screen using HHLT-GAL4 revealed a significant enrichment of protein-protein interactions (PPIs) within this group (P-value < 1.0e-16; STRING 11.0). Not surprisingly, a large number of Gene Ontology (GO) terms, many of which are defined very broadly, were also found to be enriched within the Biological [...] Table S4). Comparison of our genes with the KEGG pathway database (Kanehisa and Goto 2000) offered a more refined view, identifying enrichment in eight defined functional pathways (Table 5), the most significant of which is RNA transport (KEGG dme03013; 13 of 139 genes, q = 1.35e-07). KEGG analysis also identified the Spliceosome (KEGG dme03040; 8 of 117 genes, q = 1.1e-04) and mRNA surveillance (KEGG dme03015; 6 of 72, q = 1.1e-03) groups, which collectively indicate an important role for RNA processing regulation during hematopoietic development. Gene set analysis by the Reactome pathway database (Fabregat et al. 2018), which defines almost three times the number of functional pathways as KEGG, identified 157 pathways to be enriched (Supplementary Table S5), 11 of which coincide with RNA regulation (Table 6). Another major functional theme uncovered by KEGG and Reactome analysis is vesicular trafficking. Three of the eight identified KEGG pathways were Endocytosis (KEGG dme04144; 9 of 119, q = 1.1e-04), Phagosome (KEGG dme04145; 7 of 83, q = 4.4e-04), and SNARE interaction in vesicular transport (KEGG dme04130; 3 of 20, q = 9.7e-03), while 14 related pathways were identified by Reactome (Table 6 and Supplementary Table S5).
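To give a feel for why finding 13 of the 139 RNA transport genes in a 137-gene list counts as enrichment, the overlap can be evaluated with a hypergeometric upper-tail test. The sketch below is a simplified illustration rather than STRING's actual procedure: the background of 14,000 genes is an assumption based on the approximate protein-coding gene count, and no multiple-testing correction is applied, so the result is not expected to reproduce the reported q-value.

```python
from math import comb

def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn from N, of which K belong to the pathway."""
    total = comb(N, n)
    # Sum the exact probabilities of observing k, k+1, ... pathway genes in the list.
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return favorable / total

BACKGROUND = 14000   # assumed background size (approximate protein-coding gene count)
GENE_LIST = 137      # primary-screen candidate genes
PATHWAY = 139        # genes annotated to RNA transport (KEGG dme03013)
OVERLAP = 13         # candidate genes found in that pathway

expected = GENE_LIST * PATHWAY / BACKGROUND   # overlap expected by chance
p_value = hypergeom_upper_tail(OVERLAP, PATHWAY, GENE_LIST, BACKGROUND)
print(f"expected overlap: {expected:.2f}, observed: {OVERLAP}, p = {p_value:.2e}")
```

Under these assumptions only about one or two pathway genes would be expected by chance, which is why an overlap of 13 stands out.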
The numbers of candidate hematopoietic genes identified by the secondary screening, using cell-type-specific RNAi along with the HmlΔ-DsRed maturation marker, were fewer than the number of genes identified in the primary screening. However, when gene enrichment within the secondary screen rose to significance, it was very often in functional groups that were also enriched in the primary screen. Indeed, of the eight KEGG enrichment groups identified by the primary screening, all but one group (dme03015, mRNA surveillance pathway) were found to be enriched among the secondary screening gene subsets (Table 5). Three additional functional groups (dme04330, Notch signaling pathway; dme04068, FoxO signaling pathway; and dme04141, Protein processing in endoplasmic reticulum) were also found to be enriched specifically among the secondary screen gene subsets. Notch signaling pathway gene enrichment shows up twice, identified by RNAi knockdown in PSC cells by Antp-GAL4, and in maturing blood cells by HmlΔ-GAL4 (Table 5). FoxO signaling pathway enrichment, like Notch signaling pathway enrichment, was identified by RNAi knockdown in PSC cells by Antp-GAL4, while enrichment for Protein processing in endoplasmic reticulum was identified by dome-GAL4-mediated RNAi in blood progenitor cells. With regard to phenotype, Notch signaling pathway gene enrichment identified by HmlΔ-GAL4-mediated RNAi was associated with an increase in HmlΔ-DsRed fluorescence, while the enrichment observed with Antp-GAL4-mediated RNAi was associated with a decrease in HmlΔ-DsRed fluorescence. As for FoxO signaling pathway (Antp-GAL4) and Protein processing in endoplasmic reticulum (dome-GAL4), both enrichments were associated with a decrease in HmlΔ-DsRed fluorescence. The HmlΔ-DsRed phenotypes associated with the seven functional enrichment groups overlapping with the primary screen (HHLT-GAL4) can be found in Table 5.

Discussion
To find new genes regulating fly hematopoiesis, we have conducted a reverse-genetic screen using RNAi. The primary phase of our screen examined the role of approximately 3500 genes, representing about 25% of the genome. Functional gene disruption was achieved using the GAL4/UAS gene expression system, specifically the HHLT-GAL4 driver, the activity of which is highly restricted to the lymph gland and circulating blood cell populations (Mondal et al. 2014). In any experimental context, direct examination of affected tissues is best; however, the dissection of lymph glands for this purpose is relatively difficult and time consuming, especially for undergraduates without prior experience. Thus, we elected to circumvent dissections by taking advantage of the translucent nature of larvae and screening whole animals for GFP expression in the cells of the hematopoietic system (via HHLT-GAL4 UAS-GFP). While indirect, this approach was advantageous because general lymph gland morphologies and circulating cell densities could be easily evaluated and compared across genotypes in situ, while also increasing the analytical throughput (Figure 2). Ultimately, this screen identified 137 candidate genes, corresponding to 148 different RNAi lines, which broadly regulate hematopoietic development (Table 1). With the 148 RNAi lines identified in the primary screen, we set out to refine our understanding of where each candidate gene was functioning in the lymph gland (i.e., whether in PSC cells, the progenitor cells, or the mature cells), and of how its functional disruption impacted lymph gland development. To achieve this, we first added RNAi lines (54; Supplementary Table S1) that were either redundant or targeted genes that were functionally related to candidate hits from our primary screen. Second, we generated new GAL4 driver lines that (1) target RNAi to lymph gland sub-populations and (2) report on the development of mature blood cells.
Our collection of 202 RNAi lines was then screened using these driver lines (Antp-GAL4, dome-GAL4, and HmlΔ-GAL4, each with HmlΔ-DsRed in the background), and DsRed fluorescence was evaluated in whole animals, similar to GFP in the primary screen (see Figures 3-5). Each GAL4 driver identified target gene subsets that, when disrupted in their respective cell types, increase or decrease DsRed fluorescence (Tables 2-4), changes that typically appeared to correlate with lymph gland size. However, for RNAi backgrounds with reduced fluorescence, we cannot rule out the possibility that lymph glands were normal in size or even enlarged but exhibited reduced HmlΔ-DsRed expression.
Previous work by several groups has demonstrated that the PSC communicates with both lymph gland progenitor cells and differentiating/mature cells to regulate development (Krzemień et al. 2007; Mandal et al. 2007; Mondal et al. 2011; Benmimoun et al. 2012; Pennetier et al. 2012; Tokusumi et al. 2012), and our findings here are consistent with this role. Reduction of gene function in PSC cells (Antp-GAL4) identified 34 genes regulating blood cell maturation and/or proliferation in the lymph gland, 19 causing an increase and 15 causing a decrease in HmlΔ-DsRed fluorescence (Table 2). Since PSC cells support blood development but never contribute to the blood cell pool (Jung et al. 2005), each of the genes identified presumably plays a direct or an indirect role in signaling mechanisms regulating hematopoiesis. Perhaps not surprisingly, RNAi directly in lymph gland blood progenitor cells (dome-GAL4, domeMESO-GAL4) also identified a number of candidate genes regulating blood cell maturation. Specifically, progenitor-cell RNAi identified 50 genes, 33 that cause an increase and 17 that cause a decrease in HmlΔ-DsRed fluorescence in the lymph gland. Previous work has shown lymph gland progenitor cells to be regulated by several paracrine and metabolic signaling mechanisms (Owusu-Ansah and Banerjee 2009; Sinenko et al. 2009, 2011; Mondal et al. 2011; Tiwari et al. 2020), so it will be interesting to address potential connections to our candidate genes in future work. Targeting gene knockdown to differentiating and mature blood cells (HmlΔ-GAL4) identified the largest cell-type-specific subset with 56 candidate genes, 48 of which increase and 8 of which decrease HmlΔ-DsRed fluorescence in the lymph gland. Previous work has demonstrated that interaction between mature cells and progenitor cells, via the "equilibrium signaling pathway," is important for balancing progenitor cell maintenance and differentiation (Mondal et al. 2011, 2014).
Blocking equilibrium signaling in maturing cells leads to a compensatory proliferation and differentiation of progenitor cells (Mondal et al. 2011, 2014). Thus, the increase in HmlΔ-DsRed fluorescence in whole animals, upon functional disruption of genes in the mature blood cell population, suggests that many of them may play a role in equilibrium signaling. Although we cannot be certain of the specific roles of the identified genes in each cell population, our dataset provides a valuable starting point for asking these questions. It is interesting to note that many of the RNAi lines identified in the primary screen using HHLT-GAL4 were not identified (did not cause a phenotype) by any single GAL4 driver in the secondary, cell type-specific screen. One possible reason is that HHLT-GAL4 phenotypes for most RNAi lines are complex, arising only because of functional disruption in multiple cell types simultaneously. Another possibility is a threshold effect owing to differences in GAL4 driver strength, i.e., the individual cell-type GAL4 drivers may not induce RNAi as robustly as HHLT-GAL4. For RNAi lines that do cause phenotypes both with HHLT-GAL4 and with a cell type-specific GAL4 driver, it is not clear that these are equivalent phenotypes. The absence of a hematopoietic marker in the HHLT-GAL4 screen and the differences in GAL4 expression levels and patterns contribute to this uncertainty. Thus, while we are confident that the candidate hematopoietic genes identified by HHLT-GAL4 in the primary phase of the screen are valid, it seems that determining the functional specificities of candidate genes may be more straightforward for those causing phenotypes when disrupted in a single hematopoietic cell type.
Our analysis of the primary screen candidate genes using the online STRING database helped to reveal important gene subsets. The protein-protein interaction (PPI) network for our 137-gene dataset is composed of 599 edges (known or predicted interactions), a number significantly greater than the 350 edges expected for a randomly selected network of the same size (P-value = 1.0e-16). Likewise, large numbers of Gene Ontology terms were also enriched for this network (Supplementary Table S4), though many of the terms are broad and overlapping. However, network analysis using the KEGG Pathway database identified a smaller number of enriched functional groups or pathways. Of the eight groups identified by KEGG analysis (Table 5), three pointed to mRNA maturation (RNA transport, KEGG dme03013; Spliceosome, KEGG dme03040; and mRNA surveillance, KEGG dme03015) and another three pointed to vesicular trafficking (Endocytosis, KEGG dme04144; Phagosome, KEGG dme04145; and SNARE interaction in vesicular transport, KEGG dme04130) as having major hematopoietic roles.
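To make the idea of pathway enrichment concrete, the following minimal Python sketch computes a one-sided hypergeometric P-value, a test commonly used for over-representation analysis. It is offered only as an illustration and is not a reproduction of the STRING or KEGG calculations reported here. The function name `hypergeom_enrichment_p`, the background of 3500 genes (chosen because the screen covered approximately that many), and the pathway counts are all assumptions introduced for the example.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """Return P(X >= hits) for a hypergeometric draw.

    population: number of genes in the background (assumed value in the example)
    annotated:  background genes belonging to the pathway
    sample:     number of candidate genes drawn (e.g., screen hits)
    hits:       candidate genes that belong to the pathway
    """
    max_hits = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability mass of observing `hits` or more pathway members.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / total


# Hypothetical counts for illustration only; they are NOT taken from Table 5.
p = hypergeom_enrichment_p(population=3500, annotated=60, sample=137, hits=10)
print(f"one-sided enrichment P-value: {p:.3g}")
```

With these hypothetical counts, roughly two pathway members would be expected by chance among 137 candidates, so observing ten yields a small P-value. In practice, the choice of background gene set and correction for multiple testing both affect the result.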
Despite smaller gene sets from the secondary screening, seven of the eight primary screen KEGG enrichment pathways were identified again in these genes (Table 5), underscoring the relevance of these functional groups. KEGG analysis of the secondary screen candidate gene subsets also identified three additional enriched functional groups, Notch signaling pathway (dme04330), FoxO signaling pathway (dme04068), and Protein processing in endoplasmic reticulum (dme04141). It is interesting that Notch signaling   (Table 5). Insulin signaling has been shown to regulate both lymph gland progenitor cell and PSC cell populations (Benmimoun et al. 2012; Shim et al. 2012; Tokusumi et al. 2012; Kaur et al. 2019), though Chico function itself has not been previously analyzed. While the evidence for TGF-β/Activin signaling is lacking, the PSC population is known to be regulated by TGF-β/Dpp signaling (Pennetier et al. 2012). Others have shown that the gene dawdle, encoding an Activin-like ligand that activates Babo, is directly regulated by FoxO (Bai et al. 2013), raising the possibility that the Insulin and TGF-β/Activin pathways converge in PSC cells. Our screening and bioinformatic analyses have identified candidate hematopoietic genes but have also brought to light what appear to be broader realms of hematopoietic regulatory control. We have found that the areas of endosomal trafficking, mRNA regulation, and the ubiquitin-ligase system each have a number of constituent genes that control blood cell development in some way, including a smaller number of genes that are uniquely positioned at functional interfaces between these larger realms. The case for endosomal trafficking was made previously, in part, in the discussion of our gene set validation; however, a number of other genes belonging to this group were not mentioned, including those encoding a variety of other Rab and Rab effector proteins, syntaxins (SNAREs), and a multifunctional chaperone called Hsc70-4.
It is well established that functional disruption of early endosomal trafficking (e.g., mutation of Syx7 or Rab5) can cause a variety of cellular defects including loss of apicobasal polarity, increased proliferation, and aberrant activation of signaling pathways such as Notch and EGFR (Vieira et al. 1996; Lu and Bilder 2005; Vaccari and Bilder 2005; Fortini and Bilder 2009; Reimels and Pfleger 2015). The finding of Hsc70-4 stands out because it is a known regulator of Notch signaling (Hing et al. 1999), important in hematopoiesis (Duvic et al. 2002; Lebestky et al. 2003; Mandal et al. 2004; Mukherjee et al. 2011; Ferguson and Martinez-Agosto 2014; Small et al. 2014; Blanco-Obregon et al. 2017), but has also been functionally linked to clathrin-mediated vesicle formation and mRNA splicing (Chang et al. 2002; Herold et al. 2009).
Our screen identified an abundance of mRNA regulatory proteins involved in splicing, transport, translation initiation, and translation termination (Tables 5 and 6). The genes crn (the Drosophila homolog of the yeast Clf1p splicing factor) and Prp19 are interesting because both encode components of the  Nucleoporins have been shown to mediate many important functions, including the production, transport, and translation of mRNAs (Kuhn and Capelson 2019; Cho and Hetzer 2020). In the context of Drosophila hematopoiesis specifically, the nucleoporin Nup98 has been shown to regulate expression of Pvr, the receptor tyrosine kinase controlling equilibrium signaling in the lymph gland (Mondal et al. 2014). In humans, the normal hematopoietic roles of nucleoporins remain elusive; however, several chromosomal translocations into nucleoporin genes, Nup98 in particular, are known to cause a variety of hematopoietic defects and leukemias (Gough et al. 2011; Takeda and Yaseen 2014). Thus, the identification of several different nucleoporins in our screen confirms and extends the finding that these are important regulatory genes in the context of blood cell development.
The secondary phase of our screen began the work of identifying the specific cell types in which these genes function, as well as indicating whether the genes normally promote or limit the blood cell maturation process. Our findings also indicate that many of these candidate hematopoietic genes control cellular proliferation, as lymph gland size and circulating cell densities were often changed. In the future, it will be important to examine these RNAi phenotypes again with additional hematopoietic markers, as many are likely to impact the differentiation of the crystal cell and lamellocyte lineages. For phenotypes with enlarged lymph glands and strong increases in HmlΔ-DsRed expression, our experience suggests that progenitor cells are likely reduced or perhaps even missing. Thus, it will also be important in future analyses to test this hypothesis by using a progenitor cell marker, such as domeMESO-GFP, to directly assess these RNAi phenotypes. Characterization of the RNAi phenotypes described here will also benefit significantly from direct observation of lymph glands through dissection and higher-magnification microscopy. This is critical because small cell populations in the lymph gland (for example, remnant progenitor cells expressing domeMESO-GFP) have correspondingly low fluorescence levels and are impossible to see in whole-animal analyses. Dissection analysis will also provide insight into lymph gland structural changes and abnormal morphologies that arise in these RNAi phenotypes.
The genetic screen reported here was conducted by the UCLA Undergraduate Research Consortium for Functional Genomics (URCFG; Chen et al. 2005), which consists of students participating in Biomedical Research 10H, a course-based undergraduate research experience (CURE) offered by the UCLA Minor in Biomedical Research. This RNAi-based screen for new hematopoietic genes represents the third iteration of a CURE-based pedagogical approach to teaching UCLA URCFG undergraduates about science and scientific research. The two previous research projects completed by the URCFG were mosaic analysis of lethal P-element insertional mutants in the fly eye (Chen et al. 2005; Call et al. 2007) and in vivo cell lineage tracing during Drosophila development using G-TRACE (Evans et al. 2009; Olson et al. 2019).
As an educational tool, this screen featured several design aspects that made its implementation as a CURE research project possible. CUREs strive to provide an authentic research experience for undergraduates, but this can be difficult to achieve if students work as research apprentices cultivating individual projects. We have found that research authenticity is much more manageable when students work in parallel, performing the same kind of experimental work, but collecting unique data, and that genetic screens reflect this approach well. The use of RNA interference (RNAi) as the basis for the genetic screen was particularly beneficial. Using RNAi in the context of the GAL4/UAS system enabled students to conduct an F1 screen, allowing for more throughput within the UCLA 10-week academic quarter. It also allowed us to take advantage of the thousands of transgenic GAL4-responsive RNAi fly lines that were already available to the fly research community. RNAi-based screening also provided students with a direct link to target gene identities and known functions. While screening was ongoing, students learned how to identify target genes associated with their RNAi fly stocks, how to mine FlyBase for information about their target genes, and how to use NCBI BLAST to identify human homologs. Lastly, the selection and use of the highly specific HHLT-GAL4 UAS-GFP and HmlΔ-DsRed reporter lines was advantageous, as it allowed students to screen for hematopoietic phenotypes directly in translucent larvae, bypassing difficult and time-consuming dissection and tissue processing procedures.
To explore how students might benefit from participating in the RNAi screen, we used the SURE II survey (Lopatto 2004), which assesses learning across 21 different areas for students participating in undergraduate research pedagogies. We find that URCFG students participating in our RNAi screen for hematopoietic genes reported increased learning gains in almost every area (20/21, as compared to national benchmarks; Figure 6A), a finding that is similar to the increased learning gains reported by undergraduates participating in our previous URCFG research pedagogies (Chen et al. 2005; Call et al. 2007; Olson et al. 2019). It is also noteworthy that URCFG students who participated in this project reported relative increases in their interest in science and scientific research (Figure 6B).
An increasingly important measure of the effectiveness of science pedagogies, including CUREs, is the impact that these pedagogies have on the retention of students in science, technology, engineering, and mathematics (STEM) majors. It has been previously reported that the STEM retention rate nationally (through degree completion) is approximately 40%, dropping to as low as 25% among underrepresented minority (URM) students (Hurtado et al. 2009; National Academies 2011; PCAST 2012). As recently reported (Olson et al. 2019), student participation in a URCFG CURE experience, including the one described here, correlates with an overall persistence of students in STEM majors at a rate that is greater than twice the national average (to 95%, n = 626).

Figure 6
Impact of the URCFG experience on learning gains. (A) Categorical data plot comparing reported learning gains between URCFG students (green triangles), students, nationally, completing summer research apprenticeships (all summer research students; blue diamonds), and students, nationally, completing introductory to advanced biology courses containing some research component (all students; red squares). Students participating in the URCFG who responded to the survey (n = 265) reported increased gains across 20 of 21 different areas compared to students in the other groups. Scale: 1 = little to no gain, 2 = small gain, 3 = moderate gain, 4 = large gain, and 5 = very large gain. Error bars represent two times the standard error, representing greater than a 95% confidence interval. (B) Average responses of URCFG students (green bars, top), when asked if they agreed with each of the statements on the left, regarding the impact of the course on their interest in science, ability to learn the process of scientific research, and ability to learn the subject matter. Students scored each statement on a 5-point Likert scale, where 1 is "strongly disagree" and 5 is "strongly agree." Scores are compared to those from students nationally in biology courses with a research component (red bars, bottom). See Materials and Methods for additional details.
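The error-bar convention stated in the Figure 6 caption can be checked with a short calculation. The Python sketch below assumes an approximately normal sampling distribution of the mean, under which an interval of plus or minus two standard errors covers slightly more than 95%. The helper names and the example scores are hypothetical and are not survey data.

```python
from math import erf, sqrt
from statistics import mean, stdev


def normal_coverage(z):
    """Probability that a normal variable lies within +/- z standard deviations of its mean."""
    return erf(z / sqrt(2))


def mean_with_error_bar(scores, multiplier=2.0):
    """Return (mean, half_width), where half_width = multiplier * standard error."""
    standard_error = stdev(scores) / sqrt(len(scores))
    return mean(scores), multiplier * standard_error


# Hypothetical responses on the 1-5 gain scale (illustration only, not survey data).
scores = [4, 5, 3, 4, 4, 5, 3, 4]
m, half = mean_with_error_bar(scores)
print(f"mean = {m:.2f} +/- {half:.2f}")
print(f"coverage of +/- 2 SE under normality: {normal_coverage(2):.4f}")
```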
For URM students in particular, the increase in STEM retention is even greater (to 91%, n = 46). Our findings add to a growing body of evidence that authentic research experiences in the classroom context create highly effective learning environments for undergraduate students and can improve engagement and persistence in STEM (Chen et al. 2005; Call et al. 2007; Lopatto et al. 2008; Graham et al. 2013; Jordan et al. 2014; Shaffer et al. 2014; Rodenbusch et al. 2016; Olson et al. 2019).
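The retention comparison can be made explicit with simple arithmetic. The Python sketch below divides the reported URCFG persistence rates by the national figures quoted in the text. Using 25% as the URM baseline is an assumption, since that value is described as the low end of the national range, and the function name `fold_change` is introduced here only for illustration.

```python
def fold_change(observed_pct, baseline_pct):
    """Ratio of an observed retention rate to a baseline rate (both in percent)."""
    return observed_pct / baseline_pct


# Rates quoted in the text; the 25% URM baseline is the reported lower bound (assumption).
overall = fold_change(95, 40)
urm = fold_change(91, 25)
print(f"overall: {overall:.2f}-fold, URM: {urm:.2f}-fold")
```

Under these assumptions the overall ratio is about 2.4-fold and the URM ratio about 3.6-fold, consistent with the statements that retention exceeds twice the national average and that the relative increase is larger for URM students.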