Gene-specific cell labeling using MiMIC transposons

Binary expression systems such as GAL4/UAS, LexA/LexAop and QF/QUAS have greatly enhanced the power of Drosophila as a model organism by allowing spatio-temporal manipulation of gene function as well as cell and neural circuit function. Tissue-specific expression of these heterologous transcription factors relies on random transposon integration near enhancers or promoters that drive the binary transcription factor embedded in the transposon. Alternatively, gene-specific promoter elements are directly fused to the binary factor within the transposon followed by random or site-specific integration. However, such insertions do not consistently recapitulate endogenous expression. We used Minos-Mediated Integration Cassette (MiMIC) transposons to convert host loci into reliable gene-specific binary effectors. MiMIC transposons allow recombinase-mediated cassette exchange to modify the transposon content. We developed novel exchange cassettes to convert coding intronic MiMIC insertions into gene-specific binary factor protein-traps. In addition, we expanded the set of binary factor exchange cassettes available for non-coding intronic MiMIC insertions. We show that binary factor conversions of different insertions in the same locus have indistinguishable expression patterns, suggesting that they reliably reflect endogenous gene expression. We show the efficacy and broad applicability of these new tools by dissecting the cellular expression patterns of the Drosophila serotonin receptor gene family.


INTRODUCTION
The continued development of novel molecular genetic technologies has been critical for the staying power of Drosophila melanogaster as a model system in biology. The early use of P-element transposons to generate transgenic flies jumpstarted a molecular genetic revolution still ongoing today (1,2). A subsequent technological milestone was the development of the first binary gene expression system that uses the yeast transcription factor GAL4 to activate any gene of interest cloned downstream of the Upstream Activating Sequence (UAS) (3). Both components of the binary system are integrated separately into a fly's genome through transposon-mediated transgenesis and 'activated' by a genetic cross of the two transgenic strains.
The expression pattern of GAL4 and by extension its target downstream of the UAS promoter is either driven by a cloned promoter fragment (promoter-GAL4) or by a local enhancer (enhancer-GAL4). Promoter-GAL4s drive expression of GAL4 based on a defined promoter fragment cloned into a GAL4 expression vector, which is typically inserted into the Drosophila genome through random transposition. Such promoter-GAL4 lines do not always accurately reflect endogenous expression of a gene for two reasons. First, the cloned fragment may lack enhancer and/or repressor elements necessary for correct regulation of the gene. Second, the insert may be affected by the genomic context surrounding the integration site (4). Enhancer-GAL4s are GAL4-containing transposons that express GAL4 in the pattern of local enhancers in the vicinity of the integration site of the GAL4-containing transposon. These lines also do not always accurately recapitulate endogenous expression due to several possible mechanisms, such as size and orientation of the transposon, and distance to the promoter (5).
One strategy to generate gene-specific GAL4 lines that faithfully reproduce endogenous gene expression is to replace the first coding exon of a gene with a GAL4-encoding exon through homologous recombination (6–8). This strategy is genetically cumbersome but is somewhat easier when implemented in large genomic fragments that are then inserted into specific predesigned docking sites in the fly genome (9). This type of site-specific integration relies on a viral binary system composed of the bacteriophage φC31 integrase and its complementary DNA attachment recognition sites, attP and attB (10). Once attP sites had been introduced into the fly genome by transposition, embryos injected with integrase and plasmid DNA containing an attB site could be efficiently transformed by specific integration into the attP sites engineered into the fly's genome.
Minos-Mediated Integration Cassette (MiMIC) is a specialized transposon that carries two inverted attP sites that allow flexible conversion of resident loci through φC31 recombinase-mediated cassette exchange (RMCE) (11,12). This transposon contains a dominant body color marker and a stop cassette with a splice acceptor that can mutate a gene when it lands in the right orientation in an intron. Many thousands of MiMIC insertions have been generated and are publicly available from the stock center, designated as MI lines (11,13). What sets this transposon apart from other mutagenic transposons, however, is that it can be locally modified once inserted in a gene: because of the inverted attP sites, the content of the transposon can be exchanged with a new cassette, allowing limitless modification of the locus (11). Two examples of the versatility of this transposon system are protein- and gene-traps. A protein-trap is made by converting a MiMIC insertion in a coding intron into an artificial exon encoding a protein tag (e.g. superfolder Green Fluorescent Protein (GFP)) to visualize endogenous protein localization. A gene-trap, in contrast, is based on conversion of 5′ non-coding intronic insertions into an artificial terminal exon. Such insertions can be used to document the endogenous cellular expression pattern of a host gene when a binary factor (e.g. GAL4) is inserted (11), but only ∼13% of MiMIC insertions are located in 5′ non-coding introns. That means that this strategy is not feasible for ∼87% of MiMIC insertions (11). In order to make all intragenic intronic insertions available for conversion (∼46% of all MiMIC insertions) (11), we designed a set of protein-trap cassettes for the conversion of coding intronic MiMIC insertions into gene-specific binary factors.
In addition, we created three new gene-trap cassettes for the binary transcription factor LexA (14), the drug-inducible transcription factor GeneSwitch (15,16) and GAL80 (17), a negative regulator of GAL4. We have tested these novel protein- and gene-trap conversion cassettes on 16 different MiMIC insertions in 10 different genes and show that this conversion strategy reliably reflects endogenous gene expression. These novel tools will be useful for gene-specific manipulations of gene function as well as cell and neural circuit function.

Construction of the protein-trap and gene-trap cassettes
Protein-trap cassettes with T2A fused to binary factors were made by cloning polymerase chain reaction (PCR) fragments into the previously generated protein-trap vectors for the three different intron phases: pBS-KS-attB1-2-PT-SA-SD-0, 1 or 2 (11). To generate the T2A-GAL4 plasmid, we amplified the GAL4 sequence from the GAL4 gene-trap cassette by using a forward primer matching the first 22 bp of the GAL4 sequence connected to the T2A sequence and a BamHI site, and a reverse primer to the Hsp70 3′ UTR sequence followed by a BamHI site. The resulting PCR product was digested with BamHI and cloned into the BamHI-digested protein-trap cassettes for all three frames. All clones were verified by sequencing. The same procedure was used to generate T2A-GeneSwitch, T2A-LexA and T2A-GAL80 plasmids. To generate the corresponding gene-trap plasmids, the same strategy was used but the fragments were cloned into the gene-trap vector, pBS-KS-attB1-2-GT-SA (11) (Supplementary Figure S1).
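The need for three cassette variants follows from intron phase: an intron can interrupt the reading frame between two codons (phase 0), after the first base of a codon (phase 1) or after the second base (phase 2). The following is a minimal sketch of how the phase of a target intron could be computed from the number of coding bases upstream of it. The function names are hypothetical, and it is an assumption of this sketch that the numeric suffix of each protein-trap vector corresponds directly to the intron phase.

```python
def intron_phase(upstream_coding_bases: int) -> int:
    """Return the phase (0, 1 or 2) of an intron.

    upstream_coding_bases counts coding nucleotides from the first base of
    the start codon up to the exon/intron boundary. Non-coding (UTR) bases
    are not counted.
    """
    if upstream_coding_bases < 0:
        raise ValueError("number of coding bases cannot be negative")
    return upstream_coding_bases % 3


def cassette_label(phase: int) -> str:
    """Hypothetical label for the cassette variant matching a given phase."""
    if phase not in (0, 1, 2):
        raise ValueError("intron phase must be 0, 1 or 2")
    return f"phase-{phase} protein-trap cassette"


if __name__ == "__main__":
    # Illustrative numbers only, not taken from any locus in this study.
    for coding_bases in (300, 301, 302):
        phase = intron_phase(coding_bases)
        print(coding_bases, "coding bases upstream ->", cassette_label(phase))
```

In practice the coding-base count would come from the annotated transcript model of the host gene; insertions in 5′ non-coding introns have no upstream coding bases and are handled by the gene-trap vector instead.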

φC31 integrase-mediated RMCE
All the genes and alleles that were used in the conversion experiments are listed in Table 1 and Supplementary Figure S2. Conversions were performed as previously described (11). Briefly, we injected plasmid DNA of the above-described exchange cassettes into fertilized embryos (before they were cellularized) that were derived from flies with each MiMIC insertion crossed to flies with a φC31 integrase source on the X chromosome and an appropriate balancer chromosome (i.e. for chromosome 2 or 3), so that the MiMIC insertion remains balanced during the conversion process and successful conversion events can be recovered by scoring for the absence of the yellow+ (y+) dominant body color marker. Adults that emerged after injection were crossed to y w stocks with the appropriate balancer, and y− offspring from these crosses were selected to establish a new stock with the successful conversion event. These flies were subsequently analyzed molecularly to verify correct integration of the exchange cassettes.

Table 1. List of all the genes and alleles converted with the different gene-trap and protein-trap cassettes described in this manuscript. All 16 alleles converted with GAL4 (G4) in protein- and/or gene-trap configuration showed reproducible and internally consistent expression patterns when crossed to UAS-GFP. GAL4 (G4) and GAL80 (G80) conversions were most successful and worked reliably in both protein-trap and gene-trap configurations. EGFP cassette conversions showed strong expression in only two of the seven conversions that we attempted. GeneSwitch (GS) conversions worked reliably in the gene-trap configuration (all four) but not in the protein-trap configuration. LexA (LA) conversions only worked for some gene-traps and not for protein-traps.

Molecular characterization of the conversion events
The conversion cassettes can recombine into the locus in a forward or reverse orientation relative to the direction of the locus, so the orientation of each integration must be screened. PCR-based verification of RMCE events was performed as previously described (11). Briefly, DNA was extracted from a small number of adult flies using the PureLink Genomic DNA Mini kit (Life Technologies) and PCR was performed with cassette-specific primers and MiMIC-specific primers to determine the orientation of the conversion event. The primers that were used for PCR confirmation of conversion events are listed in Supplementary Table S1. PCR conditions for the conversion events were as follows: denaturation at 94°C for 10 min, 40 cycles at 94°C for 30 s, 60°C for 30 s and 72°C for 60 s, and post-amplification extension at 72°C for 10 min.
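For planning purposes, the cycling program above can be written down as data and its total hold time added up. This is only an illustrative sketch: the class and function names are hypothetical, and the calculation ignores ramp times between temperatures, which depend on the thermocycler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int   # hold temperature in degrees Celsius
    seconds: int  # hold time in seconds


# Values transcribed from the PCR conditions stated in the text.
INITIAL_DENATURATION = Step(94, 10 * 60)
CYCLE = (Step(94, 30), Step(60, 30), Step(72, 60))
N_CYCLES = 40
FINAL_EXTENSION = Step(72, 10 * 60)


def total_hold_seconds() -> int:
    """Sum of all programmed hold times, excluding temperature ramping."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return (INITIAL_DENATURATION.seconds
            + N_CYCLES * per_cycle
            + FINAL_EXTENSION.seconds)


if __name__ == "__main__":
    seconds = total_hold_seconds()
    print(f"Programmed hold time: {seconds} s ({seconds / 60:.0f} min)")
```

Each cycle holds for 120 s, so the 40 cycles account for 4800 s and the two 10 min steps for a further 1200 s, giving 6000 s (100 min) of programmed hold time.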

Expression analysis of the gene-specific binary factor conversion strains
Flies with confirmed conversion events were crossed to 10xUAS-IVS-syn10-GFPp10 (18). Staining and imaging were performed as previously described (19) with the following modifications. Adult brains were dissected and fixed in ice-cold 4% paraformaldehyde-phosphate-buffered saline (PBS) for a total of 1 h. Next, the brains were rinsed two times with PBS-0.5% Triton X-100 (PBT) and then washed twice for 30 min in PBT at room temperature. The brains were then blocked in 5% normal goat serum (NGS) in PBT for 1 h at room temperature. Samples were incubated in 5% NGS/PBT with primary antibody for 48 h at 4°C. After two 30 min washes with PBT, the brains were incubated in 5% NGS/PBT with secondary antibody for 48 h at 4°C. The brains were then washed two times for 30 min at room temperature and then for two days at 4°C. Finally, brains were mounted in SlowFade mounting medium (Invitrogen) and covered with a no. 0 glass coverslip that was separated from the slide by two strips of Scotch tape. Immunos-

A novel T2A-based GAL4 exchange cassette for RMCE
MiMIC insertions contain two inverted attP sites that allow swapping of the transposon content using φC31-mediated RMCE (Figure 1a) (11). To expand the existing GAL4 conversion of 5′ non-coding intronic insertions (∼20% of all intragenic MiMIC insertions) to include coding intronic MiMIC insertions (∼50% of the intragenic MiMIC insertions) (Figure 1b), we designed a novel exchange cassette that contains a splice acceptor followed by a self-cleaving T2A peptide sequence fused to the GAL4 coding sequence (20), ending in a stop codon and a stabilizing 3′ UTR (Supplementary Figure S1). We made conversion cassettes for all three frames to accommodate conversion of any coding intronic insertion regardless of the intron phase. A gene-trap cassette in a 5′ non-coding intron is spliced onto the upstream non-coding exon of the host gene and leads to translation of GAL4, which can then activate an effector gene downstream of the UAS promoter (Figure 1c). In contrast, the newly designed protein-trap cassette splices to the upstream coding exon, fusing the GAL4 gene sequence to a piece of the host gene separated by the T2A sequence. During translation of this hybrid mRNA, the GAL4 protein part is released due to failure of peptide-bond synthesis at the last codon of the T2A 'self-cleaving' peptide (21). This generates a truncated version of the native protein attached to T2A and a GAL4 transcription factor molecule that can mark the cellular expression pattern of the host gene through activation of a UAS-reporter (Figure 1d). The C-terminal proline of the T2A peptide will become the first amino acid of GAL4.
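The outcome of T2A-mediated peptide-bond skipping can be illustrated with a small string model. The sketch below assumes the commonly cited 18-residue T2A peptide from Thosea asigna virus (EGRGSLLTCGDVEENPGP); whether the cassettes described here use exactly this sequence, with or without a linker, is an assumption. The host fragment and the residues standing in for GAL4 are toy placeholders, not real sequences, and the function name is hypothetical.

```python
from typing import Tuple

# Commonly cited T2A peptide; skipping occurs before its final proline.
T2A = "EGRGSLLTCGDVEENPGP"


def split_at_t2a(polypeptide: str, t2a: str = T2A) -> Tuple[str, str]:
    """Split a translated fusion at the T2A skipping site.

    Returns (upstream, downstream): the upstream product keeps all of T2A
    except its last proline, and the downstream product starts with that
    proline.
    """
    start = polypeptide.find(t2a)
    if start == -1:
        raise ValueError("T2A sequence not found in polypeptide")
    cut = start + len(t2a) - 1  # index of the C-terminal proline of T2A
    return polypeptide[:cut], polypeptide[cut:]


if __name__ == "__main__":
    host_fragment = "MSTAV"      # toy placeholder for the truncated host protein
    binary_factor = "MKLLSS"     # toy placeholder standing in for GAL4
    fusion = host_fragment + T2A + binary_factor
    upstream, downstream = split_at_t2a(fusion)
    print("host fragment + T2A remnant:", upstream)
    print("released binary factor:     ", downstream)
```

The model reproduces the two products described in the text: a truncated host protein that retains the T2A residues up to the penultimate glycine, and a binary factor whose first residue is the T2A proline.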

Gene- and protein-trap GAL4 conversions in the same gene produce similar patterns
We next tested whether the T2A-GAL4 protein-trap conversions generate the same expression pattern as gene-trap GAL4 cassettes (11). To do so, we compared the two types of conversion events for 5-HT2A (Supplementary Figure S2c). Gene-trap and protein-trap GAL4 conversion of both 5-HT2A MiMIC insertions (Mi{MIC}5-HT2A MI00459 and Mi{MIC}5-HT2A MI03299) produced very similar expression patterns, staining the ellipsoid body (EB), the lateral triangle (LTR) with the associated R-cells, the dorsal fan-shaped body (FSB) and the F-cells with characteristic dendritic tufts radiating to the top of the dorsal protocerebrum (33) (Figure 4a-b and Supplementary Video 3). This pattern of expression is distinct from the enhancer trap insertion-based pattern, which showed expression in the EB but not in the FSB, with the caveat that the expression pattern was generated with a LacZ reporter (26). Our results confirm that enhancer trap lines do not necessarily recapitulate endogenous expression accurately (5). We next tested the two non-coding and two coding intronic MiMIC insertions in the 5-HT2B locus (Supplementary Figure S2d). All four produced very similar expression patterns regardless of whether we converted a gene-trap or protein-trap, staining strongly in the EB, LTR, R-cells and PI (Figure 4c-d and Supplementary Figure S3). The consistency of the expression patterns generated by GAL4 conversion of different MiMIC insertions in the same locus (separated by as much as 40 kb) strongly suggests that they faithfully recapitulate the full expression pattern of the host gene.
To complete the conversions of all the 5-HT receptor encoding genes, we converted the single gene-trap MiMIC allele in the 5-HT7 locus (Mi{MIC}5-HT7 MI00215 ) and found that it is expressed in the EB and R-cells similar to the pattern of the promoter-based GAL4 (5-HT7-GAL4 Prom (28)) (Figure 4e-f and Supplementary Figure S2e).
Together these results suggest that GAL4 conversions of both coding and non-coding intronic MiMIC insertions can be used as an alternative method to identify the cellular expression pattern of genes for which no antibodies exist and for which GFP protein-traps do not produce visible patterns or only weak expression patterns. While these GAL4 conversions do not provide information on the sub-cellular localization of the endogenous protein, they do make functional manipulation of the neurons possible.

GAL4 conversions recapitulate expression of GFP-tagged proteins
To further test how accurately these constructs capture the expression patterns of the native genes, we converted several MiMIC insertions in genes for which we did succeed in GFP-tagging the protein-traps. Videos 4,5). These data provide strong evidence that the MiMIC-based GAL4 conversion patterns faithfully represent the expression patterns of the endogenous locus. Further support for this conclusion is the observation that lines derived from different insertions in the same gene have expression patterns that are in many cases indistinguishable from each other, but different from those of promoter-based GAL4 or enhancer trap lines in those genes. This is likely the case because MiMIC conversion lines are generated in the context of the endogenous transcription unit.

GeneSwitch, LexA and GAL80 exchange cassettes
We next created three new binary factor gene-trap cassettes: a GeneSwitch (GS) (15,16) cassette for locus-specific inducible expression, a LexA (LA) (14,37) cassette for intersectional (i.e. overlapping) expression and a GAL80 (17,38) cassette allowing cell-specific inhibition through GAL4 repression (39). The expression pattern of the GS conversion in the 5-HT2A locus (Mi{MIC}5-HT2A MI03299 ) showed inducible expression indistinguishable from the GAL4-mediated conversions (Figure 6a and b). Similarly, the 5-HT7 gene-trap converted with GS produced strong inducible expression (Figure 6c and d). We then tested LA conversions in the 5-HT2A and 5-HT7 loci. While the LA conversion of 5-HT7 showed strong expression in the EB and R-cells (albeit weaker than the GAL4 and inducible GS patterns) (Figure 6e), the LA conversion of 5-HT2A had little to no staining (data not shown). The lack of 5-HT2A-LA expression and the weaker 5-HT7-LA expression that resembles 5-HT7-GAL4 Prom expression (Figure 4e) suggest that the LA construct is weaker than GAL4 and GS. We next tested whether GAL80 conversions in the 5-HT2A and broadly expressing Ubp64E genes could block expression from the 5-HT2A-GAL4 conversions. Both mostly inhibited GAL4 expression (Figure 6f and g). Finally, we created T2A protein-trap cassettes for the same binary factors (T2A-GS, T2A-LA and T2A-GAL80). Like the GAL80 gene-trap conversions, T2A-GAL80 conversion of 5-HT2A inhibited 5-HT2A-T2A-GAL4 expression although it did not completely block expression in the PI neurons (Figure 6h). However, neither the T2A-GS nor the T2A-LA constructs showed detectable expression in any of the loci that we tested (data not shown, summarized in Table 1), suggesting that peptide bond skipping may be ineffective in these specific contexts.
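The logic of combining these binary factors can be summarized with an idealized Boolean model. The sketch below is a toy illustration only: it treats repression and induction as all-or-nothing, whereas the data above show that GAL80 inhibition was incomplete in some cells. All function names and cell labels are hypothetical.

```python
from typing import Set


def uas_reporter_on(gal4: bool, gal80: bool = False) -> bool:
    """UAS reporter is active where GAL4 is present and GAL80 is absent."""
    return gal4 and not gal80


def geneswitch_reporter_on(gs_expressed: bool, inducer_present: bool) -> bool:
    """GeneSwitch drives the reporter only in the presence of its inducer."""
    return gs_expressed and inducer_present


def labeled_cells(gal4_cells: Set[str], gal80_cells: Set[str]) -> Set[str]:
    """Cells labeled by a UAS reporter: GAL4-positive minus GAL80-positive."""
    return gal4_cells - gal80_cells


if __name__ == "__main__":
    # Hypothetical cell populations, for illustration only.
    gal4_cells = {"cell_A", "cell_B", "cell_C"}
    gal80_cells = {"cell_B", "cell_C", "cell_D"}
    print("labeled:", sorted(labeled_cells(gal4_cells, gal80_cells)))
    print("GS without inducer:", geneswitch_reporter_on(True, False))
    print("GS with inducer:   ", geneswitch_reporter_on(True, True))
```

Under this idealization, a GAL80 conversion in the same locus as the GAL4 conversion would leave no labeled cells; residual expression such as that seen in the PI neurons reflects the limits of the all-or-nothing assumption.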

MiMIC conversion utility and expansion potential
Taken together, we have developed and tested a new set of binary factor conversion cassettes that take advantage of the T2A polycistronic strategy to convert coding intron MiMIC insertions into reliable gene-specific GAL4 and GAL80 binary factors that can be used in a range of applications. This new strategy will significantly expand the utility of the growing number of publicly available MiMIC insertions because it more than triples the number of MiMICs that can be converted into reliable GAL4 and GAL80 binary factors.
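The claim that the number of convertible insertions more than triples can be checked against the fractions quoted in the Introduction (∼13% of all MiMIC insertions in 5′ non-coding introns, ∼46% in intragenic introns overall). The short calculation below is a sketch using those approximate values; the variable and function names are hypothetical.

```python
# Approximate fractions of all MiMIC insertions, as quoted in the text.
GENE_TRAP_FRACTION = 0.13     # insertions in 5' non-coding introns
ALL_INTRONIC_FRACTION = 0.46  # all intragenic intronic insertions


def coding_intronic_fraction() -> float:
    """Fraction made newly convertible by the protein-trap cassettes."""
    return ALL_INTRONIC_FRACTION - GENE_TRAP_FRACTION


def fold_increase() -> float:
    """Convertible insertions with both strategies relative to gene-traps alone."""
    return ALL_INTRONIC_FRACTION / GENE_TRAP_FRACTION


if __name__ == "__main__":
    print(f"Newly convertible fraction: {coding_intronic_fraction():.2f}")
    print(f"Fold increase: {fold_increase():.2f}")
```

With these inputs, coding intronic insertions account for roughly a third of all MiMIC insertions, and the convertible pool grows by a factor of about 3.5.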
Comparison of gene-trap and protein-trap binary factor conversions in the same locus and known expression patterns of some of the converted loci suggests that these new tools faithfully reflect the endogenous expression of the locus in which the MiMIC transposon is inserted, irrespective of its original orientation (i.e. the original transposon can be in the forward or reverse orientation, Figure 1a). It is important to note that insertions capturing all splice variants may be required to report the full expression pattern of the host gene and that some splice variants may not be separable. In this study we mostly used insertion sites that capture all transcript variants. The only exceptions were the 5′ non-coding and coding intronic MiMIC insertions in the 5-HT2A locus. However, these two 5-HT2A insertions together do capture all splice variants and produce very similar expression patterns. In theory, splice-variant specific insertions should nonetheless be usable to reveal expression of a subset of variants. This will depend on the saturation of the growing MiMIC library and may become an additional useful component of this strategy. Alternatively, attP sites could be introduced into specific introns of a gene using targeted CRISPR/Cas9 nuclease strategies (40,41) to selectively capture specific splice variants.
Similarly, CRISPR/Cas9-based knock-in strategies could be used to introduce attP sites in specific genes in mammalian genomes. In combination with the strategy that we developed here, such an approach could then be used to generate allelic series in mammalian genes to better dissect expression, structure and function.
In addition to the novel protein-trap configuration tools, we created three additional binary factors that can be used to convert genes into non-overlapping or inducible binary factors. Together these new tools expand the repertoire and flexibility of the MiMIC transposon platform to allow further gene-specific manipulations such as expression pattern identification, expression-specific rescue experiments and manipulation of neuronal function (42–45). Given the large number of genes in the Drosophila genome that already contain MiMIC insertions in coding and non-coding introns, these new tools complement and improve upon the large collections of GAL4 lines generated by enhancer analysis (46), and should allow many investigators to better dissect the function of their genes of interest. Future incorporation of in vivo remobilization features would eliminate microinjection. Recently, the integrase swappable in vivo targeting element system was developed to convert different binary factors in vivo by using a vector with non-overlapping site-specific recombinase target sequences (47). Addition of these sequence elements to our conversion cassettes may allow expansion of the gene-specific approach presented here into a large-scale in vivo format.

NOTE ADDED IN PROOF
While our manuscript was in the review process, two other manuscripts came to our attention describing a very similar technology (48) and an expanded version of the MiMIC library (49).

SUPPLEMENTARY DATA
Supplementary Data are available at NAR Online.