LINE1 family member is negative regulator of HLA-G expression

Class Ia molecules of human leucocyte antigen (HLA-A, -B and -C) are widely expressed and play a central role in the immune system by presenting peptides derived from the lumen of the endoplasmic reticulum. In contrast, class Ib molecules such as HLA-G serve novel functions. The distribution of HLA-G is mostly limited to foetal trophoblastic tissues and some tumour tissues. The mechanism required for the tissue-specific regulation of the HLA-G gene has not been well understood. Here, we investigated the genomic regulation of HLA-G by manipulating one copy of a genomic DNA fragment on a human artificial chromosome. We identified a potential negative regulator of gene expression in a sequence upstream of HLA-G that overlapped with the long interspersed element (LINE1); silencing of HLA-G involved a DNA secondary structure generated in LINE1. The presence of a LINE1 gene silencer may explain the limited expression of HLA-G compared with other class I genes.


INTRODUCTION
Class Ia molecules of human leucocyte antigen (HLA-A, -B and -C) and class Ib molecules (HLA-E, -F and -G) are members of the human major histocompatibility complex (MHC), a group of cell surface molecules encoded by a gene family. Class Ia molecules are widely expressed in tissues and play a central role in the immune system by presenting peptides derived from the lumen of the endoplasmic reticulum (1). In contrast, the class Ib molecule HLA-G contributes to maternal tolerance of the allogeneic foetus and also to novel functions (2,3). Accordingly, the distribution of HLA-G is mostly limited to foetal trophoblastic tissues and numerous tumour tissues (4,5).
HLA family genes were derived from gene duplication, accompanied by the insertion of retrotransposons (6,7). Although the sequences upstream (5′) and downstream (3′) of the coding region are analogous among HLA genes, they are often interrupted by highly abundant retrotransposons, including long interspersed elements (LINE1 or L1) (8,9). Previously, the HLA-G promoter was reported to be located in a region upstream of HLA-G that was analogous to the upstream regions of class Ia genes, with some polymorphisms (10,11). However, tissue-specific regulation of the HLA-G gene is not well understood.
Transgene expression in stable cell lines is often repressed by unexpected epigenetic silencing around the integrated position, complicating the search for regulatory elements. The human artificial chromosome (HAC), which replicates and segregates once per cell division cycle to be stably maintained in cells, is an alternative platform of gene expression (12). We previously developed a HAC vector system in which one copy of the DNA fragment of interest can be manipulated by Cre/lox insertion in any cell line of interest (13). Compared with viral or integrating vectors, this HAC vector is advantageous for evaluating gene expression without unexpected gene silencing.
Here, we used a HAC vector to investigate regulation of the HLA-G gene. With this approach, we identified a negative regulator of gene expression in a sequence upstream of HLA-G that overlapped with a LINE1 sequence.
Each LINE fragment was inserted into the AseI site at the 5′-end of the CMV-EGFP sequence in pEGFP-C1.
These LINE-CMV-EGFP cassettes were digested with AseI and MluI, then blunt-end ligated into the EcoRI site of the pLox66-puro vector. The gL1 fragment was inserted into a SnaBI site at the 5′-end of HLA-EGFP, and gL1-HLA-EGFP cassettes were also cloned into the EcoRI site of the pLox66-puro vector.

DNA transfection
For insertion of the constructs into the HAC vector, 1 µg of the pLox66-puro vector containing the HLA-G gene or the EGFP expression cassette was co-transfected with 0.5 µg of CAGGS-Cre into HT1080, MEF or JEG3 cells (5 × 10^5) with Lipofectamine 2000 (Invitrogen) or FuGENE HD (Roche) according to the manufacturer's instructions. Cell lines were selected with puromycin (Sigma) at 0.25 µg/ml (HT1080), 2 µg/ml (MEF) or 0.5 µg/ml (JEG3).

RT-PCR
Total RNA was isolated with the SV Total RNA Isolation System (Promega). cDNA was synthesized using the Transcriptor First Strand cDNA Synthesis Kit (Roche Applied Science); 25 ng aliquots were used for PCR. The EGFP gene was amplified with PCR primers EGFP1 (5′-CGACGTAAACGGCCACAAGTTCAG-3′) and EGFP2 (5′-CAGGACCATGTGATCGCGCTTCTC-3′). The PCR protocol was 95 °C for 4 min and 23 cycles of 95 °C for 15 s, 68 °C for 15 s and 72 °C for 30 s. For the HLA-G gene, the primers were HLA-G1 (5′-AGGCGGCCAATGTGGCTGAACAAA-3′) and HLA-G2 (5′-CAGGGTGGCCTCATAGTCAAAGACA-3′), and the PCR protocol was 95 °C for 4 min and 23 cycles of 95 °C for 15 s, 68 °C for 15 s and 72 °C for 30 s.

Western blot
Whole cell extracts were separated by 12.5% SDS-PAGE and transferred to polyvinylidene difluoride membranes (Millipore). Anti-HLA-G antibody (1:500 dilution; MBL) was used for immunodetection. Images were captured and detected with a LAS3000.

Analysis of HLA-G regulation using the HAC vector
To examine the tissue-specific expression of HLA-G, transcripts from cell lines were measured by RT-PCR with HLA-G-specific primers (Figure 1A). HLA-G was repressed in HT1080 cells derived from a fibrosarcoma, but was expressed in JEG3 cells from a placental source (Figure 1B). To investigate regulation of the HLA-G gene, we manipulated one copy of various genomic fragments using the HAC vector in HT1080 cell lines (13) (Figure 1C). The product of the HLA-G transgene was detected in HT1080 cells via western blotting with anti-HLA-G antibodies after introduction of a CMV-driven HLA-G cDNA using the HAC vector (Figure 1D).
We investigated four constructs of HLA-G genomic fragments containing the eight exons of HLA-G and associated upstream regions to identify the HLA-G regulatory element (Figure 2A). HLA-G protein was detected via western blot of stable cell lines (eight cell lines per series). Despite the presence of the tissue-specific promoter (10), HLA-G expression driven by the 1.4-kb promoter in a 5.9-kb fragment was not repressed but was fully expressed in non-placental HT1080 cells (Figure 2B).
HLA-G expression from longer genomic fragments was compared with that from the standard 5.9-kb genomic HLA-G sequence. The average ratios were 0.22, 0.13 and 0.88 for the 11.4-kb, 10.1-kb and 8.7-kb regions, respectively (Figure 2B). We reasoned that the remarkable difference in expression levels between the 8.7-kb and 10.1-kb HLA-G fragments was due to a repressive element present in fragments of 10.1 kb or longer. Sequence analysis indicated that the region between 8.7 and 10.1 kb did not contain any known elements essential to gene expression, but this region harboured part of a LINE1 element (8,9,15) (gL element; Figure 1A).
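The comparison above amounts to normalizing each cell line's signal to the standard 5.9-kb cell line and averaging within a construct series. The following is a minimal sketch of that arithmetic, not the authors' analysis pipeline; the function names and the band intensities are hypothetical and serve only as an illustration.

```python
from statistics import mean


def relative_expression(signals, standard_signal):
    """Divide each cell line's signal by the standard cell line's signal."""
    if standard_signal <= 0:
        raise ValueError("standard signal must be positive")
    return [signal / standard_signal for signal in signals]


def average_ratio(signals, standard_signal):
    """Average the per-cell-line ratios for one construct series."""
    return mean(relative_expression(signals, standard_signal))


# Hypothetical band intensities in arbitrary units (NOT data from the study).
standard = 100.0
example_series = [20.0, 25.0, 15.0, 30.0]

print(round(average_ratio(example_series, standard), 3))
```

A ratio near 1 would indicate expression similar to the standard fragment, whereas a ratio well below 1, such as the 0.13 reported for the 10.1-kb fragment, would indicate repression.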

Gene silencing by the gL1 sequence
To examine whether the 8.7-10.1 kb region of HLA-G possessed a dominant gene silencing element, we created EGFP reporter constructs driven by an HLA-G minimal promoter with or without the 1.4-kb fragment in the LINE1 insertion (SnaBI-NheI: gL1) at the 5′ position (gL1-HLA-EGFP and HLA-EGFP; Figures 1A and 3A). These reporter cassettes were investigated using a HAC vector in HT1080 cells (HT/HLA and HT/gL1-HLA; Figure 3A). The level of EGFP fluorescence from stable cell lines indicated that the standard HLA-EGFP construct was expressed in nearly all cells at uniform levels; however, when fused to the gL1 sequence, EGFP expression was partially or completely repressed in HT/gL1-HLA cells (Figure 3B). Real-time RT-PCR revealed that the level of EGFP transcript in HT/gL1-HLA cells decreased to 14% of that in the HT/HLA cells (Figure 3B).
To examine whether the gL1 sequence was specific or universal to promoters or cell lines, we replaced the HLA-G minimal promoter with the CMV promoter or the human β-actin promoter (Figure 3A). When the gL1 sequence was fused to the CMV or β-actin promoter, the level of EGFP expression in HT1080 cells also decreased to 4% and 37% of that in the cell lines without gL1, respectively (Figure 3C and D). Next, we evaluated MEF cells and JEG3 cells that possessed the HAC vector. Expression from the CMV promoter decreased to 27% of that in the cell lines without gL1 in MEF cells (MEF/gL1-CMV) (Figure 4A and B), and expression from the HLA promoter or the actin promoter decreased to 27% and 41%, respectively, in JEG3 cells (JEG3/gL1-HLA and JEG3/gL1-actin, respectively) (Figure 4C and D).
Taken together, these observations indicated that the gL1 sequence exhibited gene silencing activity under the control of the HLA-G, CMV and actin promoters in HT1080, MEF and JEG3 cells. However, the silencing level varied among promoters and cell lines (Figures 3 and 4). The expression level of all EGFP reporters in JEG3 cells was two orders of magnitude lower than that in HT1080 or MEF cells.

An essential element for gene silencing
To identify the elements essential for gene silencing, we compared the sequence 8 kb upstream of HLA-G to the sequence upstream of HLA-A; HLA-A expression is not repressed in most tissues. Another LINE1 sequence was found in the upstream region of HLA-A (aL; Figures 1A and 5A). Alignment with a 6-kb LINE1 sequence (L1RE1) revealed that the gL element was a 1553-bp insertion and the aL element a 1005-bp insertion of the non-Ta subfamily of human LINE1 sequences (16,17) (Supplementary Figure S1). To determine whether the general LINE1 sequence exhibited gene silencing activity, the complete 1.5-kb gL element in HLA-G (gL2; longer than gL1), the corresponding sequence in L1RE1 (L1B) and the 1.5-kb fragment next to L1B in L1RE1 (L1A) were fused to the CMV-EGFP reporter and assayed for gene silencing in HT1080 cells (Figure 5A). EGFP expression from HT1080 cell lines (gL2-CMV, L1A-CMV and L1B-CMV) was compared with that of the control CMV-EGFP cell lines (CMV; Figure 5B). The average levels of EGFP transcripts from the L1A-CMV and L1B-CMV cell lines were similar to that of the control cell lines. The gL2 sequence specifically silenced gene expression driven by the CMV promoter (Figure 5B).
To determine which portion of the gL2 sequence conferred the silencing function, we divided the gL2 sequence into 5′ (gL3) and 3′ (gL4) halves and assayed CMV-EGFP expression. Neither gL3 nor gL4 repressed CMV-EGFP expression; cell lines harbouring these constructs had EGFP expression levels similar to control CMV-EGFP levels (gL3-CMV and gL4-CMV; Figure 5B). Thus, gene silencing required all parts of the gL element (gL1 or gL2) that were part of the LINE1 sequence, with variations from the L1RE1 sequence.
We inspected the gene silencing function of the gL element in the genomic HLA-G region via four series of HLA-G genomic fragments (Figure 6A): a fragment in which the gL element was deleted from the 11.4-kb fragment (ΔgL), ΔgL plus the aL element (ΔgL+aL), ΔgL plus the gL element in the opposite direction [ΔgL+gL(R)] and ΔgL with the addition of the gL sequence at a distal site [ΔgL+gL(dis)]. HLA-G expression from these four series of fragments was measured by real-time RT-PCR and compared with expression from the 11.4-kb (negative control) and 5.9-kb HLA-G fragments (positive control, asterisk: standard cell line; Figure 6B). The average ratios of HLA-G expression for ΔgL, ΔgL+aL and ΔgL+gL(R) (0.74, 0.36 and 0.44, respectively) were high, as with the positive control (0.73), and the ratio of ΔgL+gL(dis) expression (0.059) was as low as that of the negative control (0.059). We concluded that the gL element in HLA-G exhibited its silencing function when located at an upstream position (2.8 kb or 5.5 kb from the minimal promoter) and in the forward direction.

Participation of a hairpin loop in gene silencing
Most of the LINE1 sequences, including the gL element, were characteristically AT-rich (~60%); however, a database search did not reveal any typical gene silencers in the gL sequence, nor any specific differences between gL and the general LINE1 sequence. Interestingly, the gL sequence possessed more sites with a higher probability of forming hairpin loops than the general LINE sequence (Figures 5A and 7A). Of these, 10 or 12 of the sites with >70% probability of forming hairpin loops were identified in the gL1 and gL2 fragments that were active for gene silencing, whereas these sites did not frequently occur in the L1A, L1B, gL3 or gL4 sequences that were negative for gene silencing (Figure 7A). The mechanism of gene silencing may thus involve hairpin loops in addition to recruitment of silencing factors (18,19).
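The AT-richness mentioned above is simply the fraction of A and T bases in a sequence. A minimal sketch of that calculation follows; the function name and the toy sequence are hypothetical, and the sketch does not reproduce the hairpin-probability prediction, whose method is not specified in this section.

```python
def at_fraction(sequence):
    """Return the fraction of A/T bases among the unambiguous bases (A, C, G, T)."""
    bases = [base for base in sequence.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T bases")
    at_count = sum(1 for base in bases if base in "AT")
    return at_count / len(bases)


# Toy sequence for illustration only (NOT taken from the gL element).
toy_sequence = "ATTAGCATTA"
print(f"AT fraction: {at_fraction(toy_sequence):.2f}")
```

Ambiguous symbols such as N are ignored in this sketch, so the fraction is computed over called bases only.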
To examine whether a hairpin loop was present in the gL element, and to characterize the potential participation of such a loop in gene silencing, genomic DNA from five cell lines in three reporter series (gL2-CMV, L1A-CMV and L1B-CMV) was treated with MBN, a single-strand-specific DNA and RNA endonuclease that cleaves hairpin loops (Figure 7B). We then amplified three sites in the expression cassettes corresponding to the 1.5-kb LINE, the 0.6-kb CMV promoter and the 0.6-kb EGFP sequence (Figure 7A). Single-strand DNAs that included hairpin loops resulted in a reduced quantity of amplified product, because cleaved sites between the primer annealing sites were not amplified (Figure 7B). Three conditions for MBN digestion of DNA from the gL2-CMV20 cell line showed a gradual decrease in amplification of LINE and CMV sequences (Figure 7C). The second MBN digestion condition was applied in further analyses of cell lines.
The possibility of hairpin loops was scored as the ratio of the quantity of DNA measured after MBN cleavage to that measured before cleavage (MBN+/MBN−; Figure 7D and E). The average scores for the LINE sequences (gL2: 0.145, L1A: 0.394 and L1B: 0.309) and the CMV promoter region (0.383, 0.484 and 0.397) were lower than those observed for EGFP (0.676, 0.763 and 0.737). These scores corresponded to the frequency of predicted sites for hairpin loop formation as revealed by sequence analysis (Figure 7A and E). Lower scores from the LINE and CMV sequences were distinguishable from high scores from the EGFP sequence. The high score for EGFP indicated that the main targets of MBN cleavage were not transient single-strand DNA molecules produced during replication or transcription, but were single strands formed in DNA secondary structures, including hairpin loops. Importantly, the score for the LINE sequence in the gL2 cell lines was lower than the scores from the L1A and L1B cell lines (Figure 7E). The differences in score likely reflect the probability of formation or the number of hairpin loops. The cell lines that were exceptional in terms of EGFP expression (asterisks in Figures 5B and 7E) also exhibited corresponding values in the MBN analysis; a relatively high score (0.481) was measured in the EGFP-active cell line (gL2-CMV-24), and a low score (0.084) was measured in the L1A-CMV-20 cell line that did not express EGFP (Figure 7E). Thus, MBN analysis revealed that a hairpin loop was more likely to form in the gL2 sequence than in the L1A and L1B sequences or the CMV promoter, and that the gL2 cell lines that were silenced for EGFP expression contained a DNA secondary structure, including a hairpin loop, at the gL element.
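The MBN score described above is a simple ratio, and a low value indicates that more template was cleaved. The sketch below illustrates the calculation under the assumption that the PCR quantities are available as plain numbers; the function name and the example quantities are hypothetical, whereas the three average LINE scores are those reported in the text.

```python
def mbn_score(quantity_after, quantity_before):
    """Return the MBN+/MBN- ratio for one amplified site.

    A low ratio means that little template survived digestion, consistent
    with a higher probability or frequency of hairpin loops at that site.
    """
    if quantity_before <= 0:
        raise ValueError("quantity before digestion must be positive")
    return quantity_after / quantity_before


# Hypothetical PCR quantities in arbitrary units (NOT data from the study).
print(round(mbn_score(quantity_after=1.5, quantity_before=10.0), 3))

# Average LINE scores as reported in the text above.
reported_line_scores = {"gL2": 0.145, "L1A": 0.394, "L1B": 0.309}
lowest = min(reported_line_scores, key=reported_line_scores.get)
print(f"Lowest average score: {lowest}")
```

Ranking the reported averages in this way picks out gL2, the element for which the text infers the most hairpin formation.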

Formation of hairpin loops in the genome
The MBN analysis of the CMV promoter in the EGFP cassettes uncovered the possibility of DNA secondary structures, including hairpin loops, at promoter regions in addition to gL2 sequences (Figure 7E). To understand the mechanism of gene silencing mediated by the gL element, the MBN assay was performed on LINE insertions (gL and aL) and promoter regions in the genomic HLA-A and HLA-G loci (Figure 8A). The MBN score in HT1080 cells was low for the gL element (0.042) and intermediate for the aL, HLA-A and HLA-G promoter sequences (0.515, 0.467 and 0.473, respectively; Figure 8B). The promoters of the constitutive genes β-actin and EF1α were also investigated in HT1080 cells as controls for the MBN assay in the genomic context. The MBN score was intermediate at the actin promoter (0.378) and high at the EF1α promoter (0.699; Figure 8B). The scores from sequences on the genome corresponded to the predicted frequency of hairpin formation: low scores for gL elements and high scores for EF1α promoters (Figure 8A). Intermediate scores for the HLA-A, HLA-G and β-actin promoters were distinguishable from the high scores of the EF1α promoters. Therefore, the HLA-G promoter in HT1080 cells retains a target for the hairpin loops in the gL element that is absent from the EF1α promoter (Figure 8D).
To confirm the target specificity of the gL element, gene silencing was examined using a reporter cassette (gL1 fused to EF1α-EGFP; gL1-EF1α) in HT1080 cells. EGFP expression from gL1-EF1α was not silenced relative to that from EF1α (Figure 8C), indicating that the EF1α promoter is not a target of the gL element. Thus, the gL element forms a hairpin loop to specify target promoters for silencing.

DISCUSSION
We investigated the genomic regulation of HLA-G using a HAC vector, identifying a negative regulator of gene expression in a sequence upstream of HLA-G that overlapped with a LINE1 sequence (gL element). LINE1 insertions are abundant in the human genome; accordingly, LINE1s with different lengths, sequences or directions were also found within 10 kb upstream of the promoters of HLA-A, HLA-B and HLA-C, but these loci were not silenced. Gene silencing by the gL element may be characteristic of HLA-G expression. A mechanism for transcriptional disruption by LINE1 insertion was demonstrated previously, in which poor LINE1 expression caused inadequate transcriptional elongation of genes when the LINE1 was inserted inside the transcribed region (e.g. an intron) of the genes (20). However, the insertion length of the gL element was 1.5 kb and the position of insertion was upstream of the HLA-G promoter.
We demonstrated that a hairpin loop can form in the gL element, and may participate in the silencing of HLA-G (Figures 7 and 8). The correlation of gene silencing with the presence of a hairpin loop was demonstrated in EGFP reporters fused with or without the gL element (Figure 7), and in changes of the gL2-24 and L1A-20 cell lines, which switched from EGFP-active to EGFP-repressed and from repressed to active, respectively (Supplementary Figure S2).
We deduced the presence of a hairpin loop at the gL element via MBN analysis, which also revealed potential structural differences between the promoters; CMV, HLA-A, HLA-G and β-actin promoters were associated with intermediate scores, whereas the EF1α promoter received a high score in this analysis (Figure 8). Therefore, gene silencing by the gL element may be characteristic for target promoters. The gL element effectively silenced the native HLA-G promoter, a synthetic CMV promoter and the actin promoter in HT1080 cells, but the gL element did not silence the EF1α promoter in the reporter construct gL1-EF1α (Figure 8C).
Hairpin loops at the gL element may interact with target promoters directly or indirectly to prevent the loading of transcription factors or enhancer factors onto promoters, in addition to the activity of conventional repressors (21). In this model, a hairpin at the gL element may be a highly reactive trigger of gene silencing (Figure 8D). The specificity of silencing may depend not only on the structure of the loop, but also on its sequence, because the gL element lost its silencing activity when placed in a reversed orientation relative to the HLA-G promoter (Figure 6).
Although hairpin loops may form all over the genome, the MBN analysis revealed a much higher probability of hairpin formation on the gL element than on other sequences. Surrounding sequences or chromatin architecture may supply the conditions necessary for hairpin formation on the gL element. For example, hairpin formation may be facilitated by the AT- or A-rich regions in LINE1, which previously exhibited low average nucleosome occupancy both in vivo and in vitro (22). The hairpin loop may thus provide a simple and flexible trigger for the complex biological process of gene regulation. Despite the presence of hairpin loops on the gL element in JEG3 cells, HLA-G expression was not silenced (Figure 8B). We suggest that HLA-G expression in JEG3 cells may be regulated in a tissue-specific manner that is epistatic to the hairpin-mediated silencing mechanism.
Assuming that HLA-G gene silencing depends on a flexible, DNA structure-based mechanism, we speculate that HLA-G expression may be activated in the foetal placenta or in some tumours via positive regulation. The presence of a LINE1 silencer may explain the limited expression of HLA-G compared with other class I genes.

Figure legend (fragment): The frequency of hairpin loops was scored by determining the quantity of DNA measured before and after MBN cleavage. Genomic DNA from cell lines harbouring gL2-CMV, L1A-CMV and L1B-CMV constructs (n = 5 cell lines each) was amplified before (−) and after (+) MBN cleavage with primers that targeted the LINE elements (gL2, L1A and L1B), the CMV promoter and the EGFP sequence. (E) Histograms of the ratios of PCR products of DNA amplified after or before MBN cleavage (MBN +/−). A low ratio indicates a high probability or high frequency of hairpin loops. Asterisks correspond to exceptional cell lines in Figure 5B.