A Versatile Transposon-Based Activation Tag Vector System for Functional Genomics in Cereals and Other Monocot Plants

Transposon insertional mutagenesis is an effective alternative to T-DNA mutagenesis when transformation through tissue culture is inefficient, as is the case for many crop species. When used as activation tags, transposons can be exploited to generate novel gain-of-function phenotypes without transformation, and are of particular value in the study of polyploid plants where gene knockouts will not have phenotypes. We have developed an in cis activation tagging Ac-Ds system in which a T-DNA vector carries a Ds element containing 4x CaMV enhancers along with the Ac transposase gene. Stable Ds insertions were selected using dual GFP/RFP fluorescence marker genes driven by promoters that are functional in maize (Zea mays) and rice (Oryza sativa). The system has been tested in rice, where 638 stable Ds insertions were selected from an initial set of 26 primary transformants. By analysis of 311 flanking sequences mapped to the rice genome, we could demonstrate the wide distribution of the elements over the rice chromosomes. Enhanced expression of rice genes adjacent to Ds insertions was detected in the insertion lines using semi-quantitative RT-PCR. The in cis two-element vector system requires a minimal number of primary transformants and eliminates the need for crossing, while the use of fluorescent markers instead of antibiotic or herbicide resistance increases the applicability to other plants and eliminates problems with escapes. Since Ac-Ds has been shown to transpose widely in the plant kingdom, the activation vector system developed in this study should be of utility more generally to other monocots.


INTRODUCTION
Genetic mutants have always played a central role as tools for functional analysis of plant genes. Many plant genes have been isolated by insertional mutagenesis. In the model plants Arabidopsis and rice, large-scale T-DNA and transposon insertion libraries and flanking sequence tag (FST) databases have been generated, which serve plant biologists worldwide for both forward and reverse genetics studies (Parinov et al., 1999; Tissier et al., 1999; Jeon et al., 2000; Ito et al., 2002; Kuromori et al., 2004; Ito et al., 2005). Transposon insertional mutagenesis is an effective alternative to T-DNA mutagenesis when transformation through tissue culture is inefficient, as is the case for many crop species. Transposon mutagenesis requires only a limited number of primary transformants, with insertions being generated through propagation.
On the other hand, although the classic insertional mutagenesis strategies play an important role in plant functional genomics studies, a limitation of such methods is the difficulty of identifying genes that are redundant in plant genomes and whose knockouts do not induce phenotypes. Activation tagging is an effective approach for overcoming this limitation. Activation tagging involves random introduction of a T-DNA containing a regulatory sequence, such as the enhancer of the CaMV 35S promoter, into a plant genome to enhance expression of nearby genes, potentially resulting in a gain-of-function phenotype (Kardailsky et al., 1999; Borevitz et al., 2000; Weigel et al., 2000; Jeong et al., 2002; Mathews et al., 2003; Jeong et al., 2006; Mori et al., 2007; Hsing et al., 2007).
For the functional genomics of cereals and other monocot plants, maize transposable elements are useful tools for generating large collections of gene knockouts because high-throughput T-DNA transformation is not a viable approach for those plant species. A number of maize genes have been isolated from transposon-tagged mutants. Several studies have shown that the maize Ac-Ds transposable elements function in barley, and mapped Ds launching pads were developed for functional genomics studies (Koprek et al., 2000; Cooper et al., 2004; Singh et al., 2006; Zhao et al., 2006). Ayliffe et al. (2007) reported a novel Ac-Ds system in which a modified Ds element (UbiDs) carrying two maize ubiquitin 1 promoters (Christensen et al., 1996) was utilized for read-through transcription of adjacent flanking sequences.
We have been developing vectors using maize transposable elements for plant functional genomics. In rice, we previously developed an Ac-Ds insertional mutagenesis system in which a GFP fluorescence gene functioned as a negative selection marker against the immobilized Ac and a BASTA resistance marker served for selection of transposed Ds elements (Kolesnik et al., 2004). In the development of an En/Spm tagging system in rice, our transposition selection scheme was improved based on GFP and RFP double fluorescence markers (Kumar et al., 2005). In this study, we developed an activation tagging Ac-Ds in cis system that takes advantage of GFP and RFP double fluorescence markers. The system was shown to be effective for distributing a large number of activation-tagging Ds elements in the rice genome. As Ac-Ds has been shown to transpose widely in the plant kingdom, we anticipate the applicability of our activation vector system for studies in monocot plants including other cereals.

Vector Construction and Production of Starter Lines
The T-DNA vector pSQ5 carrying a non-autonomous Ds element along with an immobilized Ac element in cis is shown in Figure 1. The immobilized Ac contains the Ac transposase gene under the control of the cauliflower mosaic virus (CaMV) 35S promoter (Kolesnik et al., 2004). The Ds element has the 1785-bp 5' terminus and 222-bp 3' terminus of the wild-type Ac element (Sundaresan et al., 1995). A tetramer of the transcriptional enhancer of the CaMV 35S promoter and the Discosoma sp. red fluorescence protein (DsRed) gene were cloned between the 5' and 3' Ds termini. The DsRed (or RFP) gene, encoding a red fluorescence protein (Clontech, CA; Baird et al., 2000), is driven by the maize ubiquitin 1 promoter (Christensen and Quail, 1996). The vector carries a hygromycin phosphotransferase gene for plant transformation selection. A synthetic green fluorescent protein (sGFP) gene was cloned next to the immobilized Ac transposase source in the T-DNA. The RFP and GFP fluorescence markers make it possible to visually track the Ds element and the immobilized Ac element, respectively, in transgenic progeny.
The pSQ5 Ac-Ds vector was introduced into Oryza sativa ssp. japonica cv. Nipponbare via Agrobacterium tumefaciens-mediated rice transformation. Eighty fertile transformants that were double fluorescent (GFP⁺ and RFP⁺) were produced. In the T2 generation, fifty seeds of each transformant were germinated and assayed for GFP and RFP. For GFP segregation, T2 plants of fifty-eight transformants showed a 3:1 ratio and therefore carried a single T-DNA locus in the rice genome. Eight transformants had multiple T-DNA integration loci based on the GFP segregation ratios (>3:1) of T2 plants. The T2 generation of fourteen transformants did not have any plants showing GFP or RFP, possibly because of Ds transposition into the GFP gene or because of transgene silencing (see Discussion).
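The 3:1 criterion used above to infer a single T-DNA locus can be illustrated with a chi-square goodness-of-fit test. The following is a minimal sketch, assuming a chi-square test at the 5% level; the text does not state which test was applied, and the function names and seedling counts are hypothetical illustrations, not data from this study.

```python
# Sketch: test whether GFP segregation in a T2 population fits the 3:1 ratio
# expected for a single T-DNA locus. The chi-square test and the 5% level are
# assumptions; the counts used below are hypothetical.

CRITICAL_CHI2_1DF_5PCT = 3.841  # chi-square critical value, 1 degree of freedom, alpha = 0.05


def chi_square_3_to_1(gfp_positive: int, gfp_negative: int) -> float:
    """Return the chi-square statistic for a 3:1 (GFP+ : GFP-) expectation."""
    total = gfp_positive + gfp_negative
    expected_pos = 0.75 * total
    expected_neg = 0.25 * total
    return ((gfp_positive - expected_pos) ** 2 / expected_pos
            + (gfp_negative - expected_neg) ** 2 / expected_neg)


def fits_single_locus(gfp_positive: int, gfp_negative: int) -> bool:
    """True when the observed counts do not deviate significantly from 3:1."""
    return chi_square_3_to_1(gfp_positive, gfp_negative) <= CRITICAL_CHI2_1DF_5PCT


if __name__ == "__main__":
    # Hypothetical populations of 50 seedlings each.
    for pos, neg in [(38, 12), (47, 3)]:
        print(pos, neg, round(chi_square_3_to_1(pos, neg), 3), fits_single_locus(pos, neg))
```

A population with a clear excess of GFP⁺ seedlings over the 3:1 expectation would be flagged as a candidate for multiple T-DNA loci under this sketch.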

High Frequency of Germinal Transposition in T1 and T2 Transgenic Plants
Because the activation-tagging Ds element and the T-DNA-based Ac transposase can be tracked by RFP and GFP, respectively, we determined the GFP and RFP phenotype of each T2 plant from 53 single-T-DNA-locus populations (Table I). An average of 48 T2 plants in each population were assayed for GFP and RFP fluorescence. The phenotype of individual plants was determined as shown in Figure 2A. Four different phenotypes were observed in these T2 populations. Because each T2 population was generated from a single primary transformant (T1) and the Ds element carrying the RFP gene can transpose in the T1 generation, the RFP gene and the T-DNA-anchored GFP gene may have different chromosomal locations and may segregate in the T2 plants (Fig. 2B). Therefore, it is likely that the GFP⁻RFP⁺ or GFP⁺RFP⁻ plants in the T2 generation were derived from transposition events in the T1 generation. In GFP and RFP assays of T2 plants of 53 transformants (Table I), GFP⁻RFP⁺ plants were identified from T2 populations of 23 transformants. These GFP⁻RFP⁺ plants indicated that 43.4% (23/53) of the T1 transformants carried transposed Ds elements that were germinally transmitted to the T2 generation. Among the 53 T2 populations, 12 populations (No. 42-53) had GFP⁺RFP⁻ plants but no GFP⁻RFP⁺ plants, which suggested that a fraction of excised Ds elements did not reinsert in the rice genome (see Discussion).
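The scoring logic described above can be sketched in a few lines. This is an illustrative sketch only: the function names and the example populations are hypothetical, and the reading of the GFP⁻RFP⁻ class as plants carrying neither marker is an assumption added here. The final calculation mirrors the reported 23 of 53 populations.

```python
# Sketch: classify T2 plants by fluorescence phenotype and estimate the
# fraction of T1 transformants with germinally transmitted Ds transposition.
# Function names and the example data are hypothetical.

def classify_plant(gfp: bool, rfp: bool) -> str:
    """Map a (GFP, RFP) observation to one of the four phenotype classes."""
    if gfp and rfp:
        return "GFP+RFP+"   # T-DNA (Ac) present and Ds present
    if rfp:
        return "GFP-RFP+"   # Ds segregated away from Ac: putative stable transposant
    if gfp:
        return "GFP+RFP-"   # T-DNA present without Ds, e.g. excision without reinsertion
    return "GFP-RFP-"       # assumption: neither marker inherited


def germinal_transposition_frequency(populations: list[list[tuple[bool, bool]]]) -> float:
    """Fraction of populations containing at least one GFP-RFP+ plant."""
    with_transposant = sum(
        any(classify_plant(gfp, rfp) == "GFP-RFP+" for gfp, rfp in plants)
        for plants in populations
    )
    return with_transposant / len(populations)


if __name__ == "__main__":
    # Hypothetical populations arranged to reproduce the reported 23/53 ratio.
    positive = [[(True, True), (False, True)]] * 23
    negative = [[(True, True), (False, False)]] * 30
    freq = germinal_transposition_frequency(positive + negative)
    print(f"{100 * freq:.1f}%")
```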
To detect transposition in the T2 generation, we investigated T3 populations of 37 transformants. For the 37 transformants, 18 transformants ( (Table I, 30-32, 35, 42-43, 46, 48-51) showed transposition, and T2 families of 5 transformants (No. 54-58) were not analyzed. The T3 families of the 37 transformants were obtained by selfing heterozygous T2 plants. As GFP-heterozygous T2 seedlings gave a lower level of GFP fluorescence than GFP-homozygous seedlings (Kumar et al., 2005), we selected the heterozygous T2 plants based on their lower GFP intensity. In this way we were able to reduce the T2 homozygotes to 8.2% according to the results of GFP assays of the T3 generation (data not shown). In each T3 family, 200 to 400 plants were assayed to identify GFP⁻RFP⁺ transposant plants.

Sequence Tags
We grew a total of 3057 T2 plants that were derived from 37 primary transformants. An average of 300 seeds (T3) were produced from each T2 plant. The GFP negative selection marker and the RFP positive selection marker were utilized to screen T3 families to obtain putative stable transposants (GFP⁻RFP⁺ plants). Using the scheme shown in Figure 3, we initially screened 905 T3 families that were derived from the 37 transformants (Table II). In the next screening step, we did not continue with the T3 families derived from transformants for which transposition frequencies in the T3 generation were less than 10%. Instead, we focused on the 1181 T3 families of the 26 transformants which generated transposition frequencies in the T3 of 10% to 83.3% (Table II). Among the 1181 T3 families, 343 families were found to have at least one GFP⁻RFP⁺ plant per family. Taken together, a total of 2086 T3 families were screened and GFP⁻RFP⁺ transposants were selected from 638 (30.6%) of those families.
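The screening totals reported above can be cross-checked with simple arithmetic. The sketch below uses only the counts given in the text; the number of positive families in the initial round is not stated explicitly and is derived here by subtraction, which is an assumption about how the totals combine. The variable and function names are hypothetical.

```python
# Sketch: cross-check the two-step T3 screening totals reported in the text.
# Names are hypothetical; the first-round positive count is derived, not reported.

INITIAL_FAMILIES = 905        # first screening round, 37 transformants
SECOND_FAMILIES = 1181        # second round, 26 transformants
SECOND_POSITIVE = 343         # second-round families with >= 1 GFP-RFP+ plant
TOTAL_POSITIVE = 638          # families yielding transposants over both rounds


def screening_summary() -> dict:
    total_families = INITIAL_FAMILIES + SECOND_FAMILIES
    initial_positive = TOTAL_POSITIVE - SECOND_POSITIVE  # derived by subtraction
    return {
        "total_families": total_families,
        "initial_positive": initial_positive,
        "overall_percent": round(100 * TOTAL_POSITIVE / total_families, 1),
    }


if __name__ == "__main__":
    print(screening_summary())
```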
The putative stable transposants were subjected to adaptor-ligation PCR to determine Ds flanking sequences (Fig. 3). We analyzed 559 transposants that were derived from 26 primary transformants. A total of 463 sequences were obtained after adaptor-ligation PCR, sequencing, and filtration against the T-DNA sequence (to eliminate the Ds donor locus). In a BLAST search against the RiceGE Functional Genomics Database (http://signal.salk.edu/cgi-bin/RiceGE), 311 (67.2%) sequences were mapped on the rice chromosomes (Table III). The 152 remaining sequences comprised 73 (15.8%) redundant FSTs from siblings carrying the same insertions and 79 (17.1%) sequences that could not be mapped because the insertions were in rice repeat sequence regions. Our explanation for the redundant FSTs is that a small number of T2 siblings carried the same Ds insertion due to transpositions in the T1 parent (Table I), which were propagated by T2 siblings to the T3 generation (see Discussion). In further analysis of the FSTs, the TIGR rice genome database (http://www.tigr.org/tdb/e2k1/osa1/pseudomolecules/info.shtml) was searched for homology between the FSTs and rice cDNA sequences. There were 186 mapped FSTs with hits to rice cDNA sequences. These FSTs accounted for 40.2% (186/463) of the FSTs obtained and represented Ds insertions in genic regions.
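The three-way accounting of FSTs described above (uniquely mapped, redundant sibling insertions, unmapped) can be sketched as follows. This is a hypothetical illustration: the record layout, the rule that two FSTs from the same transformant at the same coordinate count as one insertion, and the example records are assumptions, not the pipeline used in this study.

```python
# Sketch: tally FSTs into mapped, redundant (same insertion recovered from
# siblings), and unmapped classes. Record layout and example data are hypothetical.
from typing import NamedTuple, Optional


class FST(NamedTuple):
    line: str                   # transposant line identifier
    transformant: int           # primary transformant the line derives from
    chromosome: Optional[int]   # None when no unique genome hit was found
    position: Optional[int]     # insertion coordinate on the chromosome


def tally_fsts(fsts: list[FST]) -> dict[str, int]:
    """Count unique mapped insertions, redundant sibling FSTs, and unmapped FSTs."""
    seen: set[tuple[int, int, int]] = set()
    counts = {"mapped": 0, "redundant": 0, "unmapped": 0}
    for fst in fsts:
        if fst.chromosome is None or fst.position is None:
            counts["unmapped"] += 1
            continue
        key = (fst.transformant, fst.chromosome, fst.position)
        if key in seen:
            counts["redundant"] += 1   # same insertion already recorded
        else:
            seen.add(key)
            counts["mapped"] += 1
    return counts


if __name__ == "__main__":
    example = [
        FST("line_a", 1, 3, 8_200_000),
        FST("line_b", 1, 3, 8_200_000),   # sibling carrying the same insertion
        FST("line_c", 2, 11, 16_900_000),
        FST("line_d", 2, None, None),     # e.g. insertion in a repeat region
    ]
    print(tally_fsts(example))
```

With the counts reported in the text, the three classes (311, 73, and 79) sum to the 463 sequences obtained.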
We analyzed the distribution of Ds insertions among the rice chromosomes. For each single-T-DNA-locus transformant, in the T3 generation, the Ds element was found to be distributed among different rice chromosomes (Table III). Taking into account the size of each chromosome, the Ds insertions appear to be evenly distributed in the rice genome.

Enhanced Expression of Rice Genes Adjacent to the Activation-Tagging Ds
To examine whether expression of the Ds-tagged rice genes was altered, reverse transcriptase (RT)-PCR was performed on transposant lines with Ds insertions adjacent to rice genes. We randomly selected twenty-four transposant lines and examined gene transcript levels using semi-quantitative RT-PCR. Initially, we analyzed the closest rice gene for each Ds insertion, which was at a distance of 1 to 7 kb upstream or downstream of the Ds element. RNA was extracted from the leaves or roots of 60-d-old rice plants. RT-PCR was performed to compare gene expression levels in the twenty-four candidate lines to those in wild-type rice Nipponbare. Two pairs of primers specific to each gene were tested and each RT-PCR experiment was repeated at least three times under the same conditions. As shown in Figure 4A, gene expression was enhanced in the line ADS247, whose Ds insertion was 1609 bp upstream of Os03g15050, as compared to Nipponbare. In ADS248, whose Ds insertion was 1880 bp upstream of a plasma membrane-type ATPase gene (Os11g29490), the transcript level was significantly higher than in Nipponbare as suggested by the RT-PCR results (Fig. 4B).
However, the other twenty-two transposant lines showed the same results as Nipponbare in semi-quantitative RT-PCR when just one gene was analyzed. We further tested eight of these transposant lines by amplifying other genes in the vicinity of the Ds insertion. Three Ds lines with enhanced gene expression were identified through RT-PCR examination of the additional genes (Fig. 4C, 4D and 4E). In ADS305, whose Ds insertion was 12.2 kb upstream of the gene Os09g30439, the transcript level was slightly higher than in Nipponbare in both leaf and root tissues (Fig. 4C). In ADS427, where the Ds was inserted 1349 bp upstream of Os04g42840, the transcript level was increased significantly based on the amount of the RT-PCR product in ADS427 and wild-type rice (Fig. 4D). In the RT-PCR analysis of ADS210, we examined expression levels of four genes that are 1.8 to 11.8 kb from the Ds insertion (Fig. 4E). Only for the gene Os10g41410, which is 11.8 kb upstream of the Ds element, was the transcript level significantly enhanced in ADS210 as compared to Nipponbare.
The CaMV enhancers in Ds in ADS210 activated the farthest gene, Os10g41410, but did not affect expression of the other, closer genes (Fig. 4E). In ADS305, Os09g30458 is closer to the Ds insertion than Os09g30439 but was not activated (Fig. 4C). On the other hand, in ADS427, the CaMV enhancers in Ds only affected expression of Os04g42840, which is closer to the Ds element than the gene Os04g42860 (Fig. 4D). Therefore, the likelihood of gene activation by the CaMV enhancers was not directly related to the distance from the enhancers to the gene.
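The RT-PCR observations above can be collected into a small summary of activated genes and their distances from the Ds insertion. The values below are taken from the text; the data structure and function name are hypothetical, and the sketch only tabulates the reported results without adding any new measurement.

```python
# Sketch: summarize the activated genes reported in the text and their
# distances from the Ds insertion. Structure and names are hypothetical.

# (line, activated gene, distance upstream of the Ds insertion in bp)
ACTIVATED = [
    ("ADS247", "Os03g15050", 1609),
    ("ADS248", "Os11g29490", 1880),
    ("ADS305", "Os09g30439", 12200),
    ("ADS427", "Os04g42840", 1349),
    ("ADS210", "Os10g41410", 11800),
]
LINES_TESTED = 24  # transposant lines examined by semi-quantitative RT-PCR


def activation_summary(activated, lines_tested):
    """Return the count, percentage, and distance range of activated lines."""
    distances = [distance for _, _, distance in activated]
    return {
        "activated_lines": len(activated),
        "percent_activated": round(100 * len(activated) / lines_tested, 1),
        "min_distance_bp": min(distances),
        "max_distance_bp": max(distances),
    }


if __name__ == "__main__":
    print(activation_summary(ACTIVATED, LINES_TESTED))
```

The distance range, from about 1.3 kb to 12.2 kb, is consistent with the observation that activation was not limited to the gene nearest the insertion.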

DISCUSSION
We have developed an in cis activation tagging Ac-Ds system.
In the design of a transposon tagging system, it is also essential to prevent large numbers of transposant siblings from entering the pipeline. In this study, an important step was to avoid using transformants whose T2 generation exhibited several transposants.
We performed semi-quantitative RT-PCR of rice genes in twenty-four transposant lines and observed enhanced expression of Ds-tagged genes in five lines. The RT-PCR results were confirmed in repeated experiments under the same conditions. In no case was the RT-PCR product of a transposant line less abundant than that of wild-type rice.
Therefore, the CaMV enhancers in the Ds element were capable of activating rice genes adjacent to the Ds insertion, which was similar to the result of T-DNA-based activation tagging in rice (Jeong et al., 2002, 2006). However, the frequency of activation-tagged lines was 20.8% (5/24) in this study, which is lower than the frequency of 52.7% in the previous report using T-DNA in rice (Jeong et al., 2006). One reason might be that we analyzed just one gene in semi-quantitative RT-PCR for most (16 of 24) of the candidate lines. For the other eight lines, two to four rice genes close to the Ds insertion were examined in each line by RT-PCR, and enhanced expression of Ds-tagged genes was observed in three lines. Therefore, our rate of activation-tagged lines could be increased if more genes in the candidate lines were analyzed. Regarding the function of the CaMV enhancers, it was reported that no clear relationship was found between frequency of activation and distance from the CaMV enhancers to the gene, and that no correlation was observed between degree of activation and distance (Jeong et al., 2006). We observed similar results in RT-PCR analysis of three activation-tagged lines.
Although large T-DNA insertion libraries have been generated in rice, an efficient T-DNA transformation system is available only for the japonica subspecies. For the indica rice subspecies, which is more widely grown in rice farming areas around the world, T-DNA transformation is still difficult. Also, most T-DNA transformation methods involve a tissue culture step that generates a high frequency of somaclonal variation, which interferes with forward mutant screens. Therefore, transposon mutagenesis is an effective approach for plant functional genomics, and can serve as an alternative to T-DNA when transformation through tissue culture is inefficient, as is the case for the indica rice subspecies as well as many other crops. Transposable elements can be mobilized or et al., 1996) that is functional in maize, rice, barley, wheat and many grasses, the principles developed here are applicable to many other monocot plants. We find that the Ds element preferentially transposes into genic regions, which is similar to previous reports about the maize transposable elements (Enoki et al., 1999; Cowperthwaite et al., 2002; Kolesnik et al., 2004; van Enckevort et al., 2005). When utilized as the carrier of the CaMV enhancers, the Ds element offers the advantage of both activation tagging and knockout mutations. These features make transposon-based activation tagging particularly useful for large genomes with many duplicated genes such as maize, as well as polyploid crop plants. The Ac-Ds based activation vector system developed in this study is publicly available without IP restrictions, and should be applicable to the functional genomics of a range of plants, especially cereals and other monocot plants.

Vector Construction
In construction of the activation tagging Ac-Ds vector pSQ5, the Ubi-DsRed-Nos cassette was from pSK62 (Kumar et al., 2005), the 1785-bp 5' Ds and 222-bp 3' Ds were from pWS32 (Sundaresan et al., 1995), and the 4x CaMV 35S enhancers were from the AcREH construct (Suzuki et al., 1999 (Yin and Wang, 2000; Sallaud et al., 2003). GFP and RFP positive (GFP⁺RFP⁺) calluses were transferred to pre-regeneration medium (Yin and Wang, 2000) and cultured at 25 °C in the dark for 10 to 15 days. The GFP⁺RFP⁺ calluses were further cultured on regeneration medium (Yin and Wang, 2000) under light for 2 to 3 weeks. GFP⁺RFP⁺ transformant plantlets were finally transferred to the greenhouse.

Figure 4. Genes that were examined in semi-quantitative RT-PCR are marked with dotted-line boxes. Right, RT-PCR analysis of activation-tagged genes. Rice Act1 transcript was amplified as control. WT, the wild-type rice Nipponbare. The source of RNA (leaf or root) used for RT-PCR is indicated. A, Line ADS247 was tagged by the activation-tagging Ds element 1609 bp upstream of Os03g15050, which encodes phosphoenolpyruvate carboxykinase. B, Line ADS248 was tagged by a Ds 1880 bp upstream of Os11g29490, which encodes a plasma membrane-type ATPase. C, Line ADS305 was tagged by a Ds 12.2 kb upstream of Os09g30439, which encodes a heat shock protein. D, Line ADS427 was tagged by a