RNA-binding proteins (RBPs) regulate gene expression at many post-transcriptional levels, including mRNA stability and translation. The RBP nucleolin, with four RNA-recognition motifs, has been implicated in cell proliferation, carcinogenesis and viral infection. However, the subset of nucleolin target mRNAs and the influence of nucleolin on their expression had not been studied at a transcriptome-wide level. Here, we globally identified nucleolin target transcripts, many of which encoded cell growth- and cancer-related proteins, and used them to find a signature motif on nucleolin target mRNAs. Surprisingly, this motif was very rich in G residues and was not only found in the 3′-untranslated region (UTR), but also in the coding region (CR) and 5′-UTR. Nucleolin enhanced the translation of mRNAs bearing the G-rich motif, since silencing nucleolin did not change target mRNA stability, but decreased the size of polysomes forming on target transcripts and lowered the abundance of the encoded proteins. In summary, nucleolin binds G-rich sequences in the CR and UTRs of target mRNAs, many of which encode cancer proteins, and enhances their translation.
In mammalian cells, RNA-binding proteins (RBPs) robustly modulate gene expression by controlling post-transcriptional processes like pre-mRNA splicing and maturation, mRNA transport, stability and translation (1–5). Some RBPs are specialized in one particular aspect of mRNA metabolism; for example, the RBPs tristetraprolin (TTP) and KH-type splicing regulatory protein (KSRP) promote mRNA degradation (6–8). However, most RBPs influence the fate of target transcripts in multiple ways; for example, the embryonic lethal abnormal vision (elav)/Hu protein HuR stabilizes some target mRNAs, but modulates the translation of other targets (9), AUF1 (AU-binding factor 1)/hnRNP D (heterogeneous nuclear ribonucleoprotein D) modulates the stability and translation of several target transcripts (10–13), T-cell intracellular antigen-1 (TIA-1) and TIA-1-related protein (TIAR) participate in the splicing and translational repression of target transcripts (14–16), while the polypyrimidine tract-binding protein (PTB) can modulate splicing, stability and translation of target RNAs (17,18). Yet other RBPs, such as the nuclear factor (NF)90 (also named NFAR, DRBP76 and ILF3), not only interact with mRNAs and modulate their post-transcriptional fate, but are also capable of interacting with DNA (19,20).
Nucleolin is another multifunctional protein capable of interacting with DNA and RNA. With an apparent molecular weight of 100 kDa and a length of 710 amino acids, nucleolin has several different domains: an N-terminal segment with multiple phosphorylation sites, a central domain with four RNA-recognition motifs (RRMs) and a C-terminal arginine–glycine-rich (RGG) domain (21–24). Among its functions associated with binding DNA, nucleolin can induce chromatin decondensation by the remodeling complex SWI/SNF (switch/sucrose non-fermentable in yeast), facilitates transcription and modulates DNA replication (23,25,26). However, nucleolin is a prominent RBP with a strong presence in the nucleolus, where it interacts with precursor ribosomal (r)RNA and is essential for rRNA biogenesis and rRNA transport to the cytoplasm (21,27–29). Accordingly, downregulation of nucleolin caused nucleolar disruption and defects in cell cycle progression and centrosome duplication (30). Nucleolin was also found on the plasma membrane, where it functions in signal transduction, wound repair and viral infection (31–34); it also affects other aspects of viral RNA metabolism, including the translation and replication of viral RNAs (35,36).
The remainder of nucleolin is found in the nucleoplasm and the cytoplasm, where it is increasingly recognized as a pivotal regulator of mature mammalian mRNAs (22,23,37–40). However, its influence on target mRNAs differs depending on the target transcript and the experimental system. Nucleolin was reported to interact with the 3′-untranslated region (UTR) of numerous mRNAs, enhancing their stability, as shown for mRNAs encoding β-globin, amyloid precursor protein (APP), gastrin, B-cell leukemia/lymphoma 2 (Bcl-2), Bcl-xL, interleukin 2 (IL-2) and the growth arrest- and DNA damage-inducible 45 (Gadd45α) (38,40–44). On the other hand, nucleolin interacted with the 5′-UTR of the TP53 mRNA and inhibited p53 translation following DNA damage (45) and with the 5′-UTR of prostaglandin endoperoxide H synthase-1 (PGHS1) mRNA, also leading to the repression of PGHS1 translation (46). In addition, nucleolin associated with the 3′-UTR of MMP9 mRNA and promoted MMP9 translation (47), and with the 3′-UTR of several selenoprotein mRNAs, similarly promoting their translation (48).
Here, we sought to identify systematically the collection of mammalian nucleolin target mRNAs. Immunoprecipitation (IP) of nucleolin ribonucleoprotein (RNP) complexes was followed by microarray analysis to elucidate target mRNAs. These targets encoded proteins involved in several key cellular processes such as translation, viral infection, metabolism, carcinogenesis and cell proliferation. Computational analysis of the target RNAs revealed a G-rich signature sequence present in the coding regions (CRs), and the 5′- and 3′-UTRs of a majority of target mRNAs. In vitro binding assays confirmed that both endogenous nucleolin and recombinant purified nucleolin were capable of binding biotinylated transcripts spanning the 5′-UTR, CR and 3′-UTR of different targets analyzed, all of which contained the G-rich signature motif. Functionally, nucleolin enhanced the translation of target mRNAs, as assessed by polysome profiling, nascent translation and reporter construct analyses. In sum, we have identified a large subset of nucleolin target mRNAs, found a signature G-rich sequence present in coding and non-coding regions of these mRNAs and discovered that nucleolin can function as a translation enhancer for this group of target mRNAs.
MATERIALS AND METHODS
Cell culture and transfection
Human cervical carcinoma HeLa cells were cultured in DMEM containing 5% fetal bovine serum (FBS) supplemented with glutamine and antibiotics. For silencing nucleolin, cells were transfected with either control (Ctrl) siRNA (Qiagen) or nucleolin (NCL)-directed siRNA (Santa Cruz). Plasmid pGEX-4T2-Nuc-C (284–707 amino acids) was used to engineer a plasmid lacking the RGG domain [pGEX-4T2-Nuc-C (284–644 amino acids)]; these plasmids were used to express GST-NCL and GST-NCL(ΔRGG), respectively, each lacking the N-terminal 283 amino acids that rendered the proteins insoluble. For reporter analyses, cells were transfected with pGFP or with the vectors engineered to express wild-type or mutant nucleolin motif hits (M1, M2, M3), inserted in the CR or 3′-UTR: pGFP-M1, pGFP-M2, pGFP-M3, pGFP-3′M1, pGFP-3′M2, pGFP-3′M3, pGFP-3′M1mut, pGFP-3′M2mut and pGFP-3′M3mut; 100 ng plasmid was used per transfection. All transfections were carried out using Lipofectamine-2000 (Invitrogen) following the manufacturer’s protocol.
RNP IP and microarray analysis
Endogenous mRNA–protein complexes were precipitated as previously described (49). Briefly, HeLa cytoplasmic lysates were prepared in polysome lysis buffer (PLB) [100 mM KCl, 5 mM MgCl2, 10 mM HEPES, pH 7.0, 0.5% Nonidet P-40, 1 mM dithiothreitol (DTT)] containing 100 U/ml RNase OUT (Invitrogen) and a protease inhibitor cocktail (Roche). For microarray analysis, 3 mg of lysate were incubated (1 h, 4°C) with 100 μl of a 50% (v/v) suspension of protein-A Sepharose beads precoated with 20 μg each of mouse anti-nucleolin or mouse IgG (Santa Cruz). Beads were washed with NT2 buffer (50 mM Tris–HCl [pH 7.4], 150 mM NaCl, 1 mM MgCl2 and 0.05% NP-40) and then incubated with 100 μl of NT2 buffer containing RNase-free DNase I (20 U, 15 min, 30°C), washed with NT2 buffer and further incubated in 100 μl NT2 buffer containing 0.1% SDS and 0.5 mg/ml Proteinase K (15 min, 55°C) to digest proteins bound to the beads. RNA was extracted using phenol and chloroform, precipitated in the presence of glycoblue (Applied Biosystems) and used for further analysis.
For Illumina microarray analysis, the RNA obtained after IP reactions using either anti-nucleolin or IgG antibodies was assessed using an Agilent 2100 bioanalyzer and RNA 6000 nanochips. The RNA was used to generate biotin-labeled cRNA using the Illumina TotalPrep RNA Amplification Kit (Ambion; Austin, TX, USA cat # IL1791), which was then hybridized to Illumina’s Sentrix HumanRef-8 Expression BeadChips (Illumina, San Diego, CA, USA), containing 24 000 well-annotated RefSeq transcripts with ~30-fold redundancy. The arrays were scanned using an Illumina BeadStation 500X Genetic Analysis Systems scanner and the image data extracted using Illumina BeadStudio software, version 1.5, normalized by Z-score transformation and used to calculate differences in signal intensities. Significant values were calculated from three independent experiments, using a two-tailed Z-test and P < 0.01. The complete list of mRNAs enriched in nucleolin IP identified on the arrays is shown in Supplementary Table S1.
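The normalization and significance steps described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the log10 transformation before Z-scoring, the function names and the probe intensities are all assumptions made for illustration, and the text does not specify how the test statistic was derived from the three experiments.

```python
import math
from statistics import mean, stdev

def z_transform(intensities):
    """Z-score transform the log10 signal intensities of one array.

    Assumption: intensities are log10-transformed before standardizing;
    the source only states that data were 'normalized by Z-score transformation'.
    """
    logs = [math.log10(v) for v in intensities]
    mu, sd = mean(logs), stdev(logs)
    return [(v - mu) / sd for v in logs]

def two_tailed_p(z):
    """Two-tailed P-value for a standard normal test statistic z."""
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical raw intensities for five probes (not data from the study)
ncl_ip = [1200.0, 150.0, 90.0, 3000.0, 110.0]
igg_ip = [300.0, 140.0, 95.0, 400.0, 120.0]

z_ncl = z_transform(ncl_ip)
z_igg = z_transform(igg_ip)

# Per-probe difference in normalized signal between nucleolin IP and IgG IP
z_diff = [a - b for a, b in zip(z_ncl, z_igg)]
```

Here `two_tailed_p` would be applied to whatever standardized statistic the analysis yields for each probe, with probes retained when P < 0.01.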
Computational identification of nucleolin signature motif
The top 335 human transcripts enriched in nucleolin IP (Supplementary Table S1) served as the experimental data set for the computational identification of the nucleolin motif. Transcript sequences (UniGene) were scanned with RepeatMasker (www.repeatmasker.org) to remove repetitive sequences, and complete, high-quality sequences were first divided into 100-base-long subsequences with a 50-base overlap between consecutive sequences and were organized into 50 data sets. Common RNA motifs were then elucidated from each of the 50 random data sets. The top 10 candidate motifs from each random data set were selected and used to build the stochastic context-free grammar (SCFG) model, which summarizes the folding, pairing and additional secondary structure features. The SCFG model of each candidate motif was then used to search against the experimental data set as well as the entire human UniGene data set (5′-UTR, CR, 3′-UTR) to obtain the hits. The motif with the highest enrichment in the experimental dataset compared with the entire UniGene dataset was considered to be the top nucleolin candidate motif. The identified RNA motif for nucleolin forms a stem loop. The identification of the RNA motif in unaligned sequences was conducted using FOLDALIGN software and the identified motif was modeled by the SCFG algorithm and searched against the transcript data set using the COVE and COVELS software packages (50,51). The motif logo was constructed using WebLogo (http://weblogo.berkeley.edu/). RNAplot was used to depict the secondary structure of the representative RNA motifs. The computation was performed using the NIH Biowulf computer farm. Both UniGene and RefSeq datasets were downloaded from NCBI.
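Two steps of this procedure, the division of transcripts into overlapping subsequences and the ranking of candidate motifs by enrichment, can be sketched in Python as follows. This is an illustrative sketch only: the handling of the final, shorter window and the exact definition of enrichment (ratio of hit frequencies) are assumptions, and the function names are hypothetical. The motif discovery itself (FOLDALIGN, COVE/COVELS) is not reproduced here.

```python
def sliding_windows(seq, size=100, overlap=50):
    """Split a sequence into `size`-base windows that overlap by `overlap` bases.

    Assumption: a trailing window shorter than `size` is kept so that the
    3' end of the transcript is still covered.
    """
    if not seq:
        return []
    step = size - overlap
    last_start = max(len(seq) - overlap, 1)
    return [seq[i:i + size] for i in range(0, last_start, step)]

def enrichment(hits_experimental, n_experimental, hits_background, n_background):
    """Ratio of motif hit frequency in the experimental set versus the background set."""
    return (hits_experimental / n_experimental) / (hits_background / n_background)

# Hypothetical 250-base transcript built from a repeating unit
transcript = "ACGGU" * 50
windows = sliding_windows(transcript)
```

For a 250-base transcript this yields four 100-base windows starting at positions 0, 50, 100 and 150; the candidate motif with the largest `enrichment` value would be taken as the top motif.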
RNA isolation and RT–qPCR analysis
Total RNA was isolated using Trizol (Invitrogen) from intact cells or from RNP IP samples and was used to measure gene expression or to validate microarrays, respectively. After reverse transcription (RT) using random hexamers and SSII reverse transcriptase (Invitrogen), real-time, quantitative (q)PCR analysis was performed using gene-specific primer pairs (below) and SYBR Green PCR master mix (Kapa Biosystems). The oligomer pairs (each forward and reverse) used for the amplification of PCR products were as follows: AGGCCCTTTTGGATCTTCAT and CAGGTGGTCACCCATCTTCT for FTL, AGCGGAAGGAGGAGAAAAAG and GTACTCTTGGGCAGGTGAGC for EEF1G, ATCTTCAAGTGGGTGCCAGT and AGATCCAGCAGGATGAGAGG for BCL7C, GAGATGGTGTCTGGGAGCAT and CTGGGTGTGGTCCATCTCTT for STX10, CACCCAAACACAAGGTCTCA and GGGGCAGAGAACATCACATT for CTAG2, GACGCCATTGACCTGAACTT and GCACGTGACAGGAACAGAGA for DUS1L, TGGGTGGAGGACTACTGAGG and CAGTCCAGAGTCCAACAGCA for STUB1, CAAGACGGGAGCGAGTAAAG and GGCCTCTTTGAAGGTCTCCT for USF2, TGAAGCCATCAACTCACAGC and CTGGCAGAACTGCTTGAACA for MAF1, GCTGCAGGAGTCTGTCATCA and AGTGGGGTACGAATGGAGGT for METRN, ATTCTAACTCGCCTGCCAGA and CTATGAGGCTGGGCATCTGT for FLOT1, CCCAGAGTTCCTCTGAGCAC and CTTCAGCGTTATGCCTGTCA for AGBL5, TACGCCAAGAAGCTGAGACA and TCTGGGAAGAGTGAGCAGGT for PGLS, TCTATGGCGCTGAGATTGTG and CTTAATGTGCCCGTCCTTGT for AKT1, CCCCAGAAACAGGAGAATCA and TATGCTTTGTGGCATCTGGA for PDCD2, AGAGCCAAGCTGCACAATTT and AGCCAGACAGGAAGAGACCA for NBL1, GCTGGCCAAACTCAAGTACC and TTTTGGATGAGCCTTTACGG for CCNI, ATATGCCTTCCCCCACTACC and CGTGAGTGCTCACTCCAGAA for CDKN2A, TAGTTGACAATCGGGGCTTC and GGGTCAGCTGCAGTTTAAGG for MKNK2, TGTGCAGAGAGTTTGGCAAC and GGGAAGGGGAGCAGGTATAG for MGAT1, TGCAACAACCAGAAAAGCTG and TCTCGAAGATGCACAGGTTG for LRP3, CTTTGTCAGCCAAGGAAAGC and GCTCACTGGGCACTTTTCTC for MG21, GTCCAGGACACCTCCAAGAA and TATGCCAAACCCATCTCCTC for AP1S1, and TGCACCACCAACTGCTTAGC and GGCATGGACTGTGGTCATGAG for GAPDH.
Assessment of translation by polysomal mRNA analysis and 35S labeling
HeLa cells were transfected with siRNA and 48 h later cells were incubated with 0.1 mg/ml cycloheximide for 10 min. Cytoplasmic extracts (1 mg each) were prepared and fractionated through linear sucrose gradients [10–50% (w/v)], as previously reported (52). Twelve fractions were collected using a fraction collector (Brandel) and monitored by optical density measurement (A254). The RNA in each fraction was isolated using Trizol (Invitrogen). Following RT, qPCR analysis was performed using specific primer pairs; primer sequences are shown above. To assess the levels of nascent translation, de novo synthesis of Usf2, Akt1, Flot1 and GAPDH was measured by incubating HeLa cells briefly (15 min) with L-[35S]methionine and L-[35S]cysteine (Easy Tag™ EXPRESS; NEN/Perkin Elmer, Boston, MA). Cells were lysed in RIPA buffer (10 mM Tris-HCl [pH 7.4], 150 mM NaCl, 1% NP-40, 1 mM EDTA, 0.1% SDS, and 1 mM dithiothreitol), and the IP reactions were carried out in 1 ml TNN buffer (50 mM Tris-HCl [pH 7.5], 250 mM NaCl, 5 mM EDTA, 0.5% NP-40) for 16 h at 4°C, using IgG or antibodies recognizing Usf2, Akt1, Flot1 and GAPDH (Santa Cruz Biotechnology). After extensive washing in TNN buffer, the IP samples were resolved by SDS-PAGE, transferred onto polyvinylidene difluoride membrane filters, and visualized with a PhosphorImager (Molecular Dynamics).
Western blot and biotin pull-down analyses
Whole-cell lysates were prepared with RIPA buffer [10 mM Tris–HCl (pH 7.4), 150 mM NaCl, 1% NP-40, 1 mM EDTA, 0.1% SDS and 1 mM dithiothreitol]. Proteins were resolved by SDS–polyacrylamide gel electrophoresis and transferred to polyvinylidene difluoride membranes (Invitrogen). After incubation with primary antibodies recognizing nucleolin, Flot1, Usf2, Akt1, GFP or α-tubulin (Santa Cruz), or recognizing β-actin (Abcam), membranes were incubated with the appropriate secondary antibodies and signals were detected by ECL Plus (Amersham).
For biotin pull-down assays, PCR fragments containing the T7 RNA polymerase promoter sequence [(T7), CCAAGCTTCTAATACGACTCACTATAGGGAGA] were used as templates for in vitro transcription. Biotinylated transcripts were incubated with cytoplasmic lysates (100 μg lysate, 3 μg biotinylated RNA) or with recombinant purified protein [GST, GST-NCL or GST-NCL(ΔRGG), 1 μg protein and 2 μg biotinylated RNA per reaction] for 30 min at room temperature, and complexes were isolated with streptavidin-coated magnetic Dynabeads (Dynal) and analyzed using western blot analysis to detect nucleolin or GST-NCL. The following primer pairs (forward and reverse, respectively) were used:
(T7)CCCGCGAGCGGACGCG and GATATCCTTTGGATCTGCCTGCTAC for CCNI 5′,
(T7)ATGAAGTTTCCAGGGCCTTT and AATTGCAGAAGTTGGTTGCAG for CCNI CR1,
(T7)AACTACTTCACTGTATGGCC and CTACATGACAGAAACAGGCT for CCNI CR2,
(T7)TTTCAACAAGTGCTACCTTTGAG and GGTCTTTATGTGCTTAAATAACGC for CCNI 3′,
(T7)CGGCAGGACCGAGCG and GGTGCCCGAGGCTCCCG for AKT1 5′,
(T7)ATGAGCGACGTGGCTATTG and TCGGAGAACACACGCTCC for AKT1 CR1,
(T7)AGCTGTTCTTCCACCTGTCC and TCAGGCCGTGCCGCTG for AKT1 CR2,
(T7)GGCGGCGGTGGACTG and CTGGGGGGCTGCTGTG for AKT1 3′1,
(T7)ACCCTCTCCTGGGGG and GAAAAGCAACTTTTATTGAAGAATTT for AKT1 3′2,
(T7)ACCCTCTCCTGGGGG and GGTTCAGGCTGGAGCTTCC for FLOT1 5′,
(T7)ATGTTTTTCACTTGTGGCCC and CTTCTTCAGTTCGTAATCTCT for FLOT1 CR1,
(T7)CGAGATGGCCAAGGCACA and TCAGGCTGTTCTCAAAGGCT for FLOT1 CR2, and
(T7)GCCTTCAGCCCTCACAG and AGTACTTACTTACAGCAATTTATTTG for FLOT1 3′.
The predicted nucleolin binding motifs were synthesized from longer complementary oligomer pairs containing the T7 polymerase promoter sequence. Complementary oligomers were annealed and used as template for in vitro transcription using biotin-conjugated cytidine triphosphate (CTP). After isolating the biotinylated transcripts, binding assays with whole-cell lysate (using 3 μg biotinylated RNA) or GST fusion proteins (1 μg biotinylated RNA) were done as described above. The oligomer pairs were as follows:
(T7)GAGGAAGGGAGGGGCTGGGGGCTACGCCCCCTCC and GGAGGGGGCGTAGCCCCCAGCCCCTCCCTTCC for BRD2 5′-UTR1,
(T7)AGATGTGGCGGGTTGCCACTTCCCTGTGGGTCTCT and AGAGACCCACAGGGAAGTGGCAACCCGCCACATCT for BRD2 5′-UTR2,
(T7)CCCTGGGGAAGGGAATGCAGGGTTGCTGGGGCTGG and CCAGCCCCAGCAACCCTGCATTCCCTTCCCCAGGG for BRD2 CR,
(T7)CGTGGGCGTGGCCGGCGTGGCTGCTCGGGACCA and TGGTCCCGAGCAGCCACGCCGGCCACGCCCACG for PEX10 5′-UTR,
(T7)AGGAGGGCGCTGCTGCGGGCGGTCTTCGTCCTCA and TGAGGACGAAGACCGCCCGCAGCAGCGCCCTCCT for PEX10 CR,
(T7)CTCTGGGGGCCGTGGGGTGGGAGCTGGGGCGAG and CTCGCCCCAGCTCCCACCCCACGGCCCCCAGAG for BCL2 5′-UTR,
(T7)GGGTGTGGCTGGGCCTGTCACCCTGGGGCCCTCC and GGAGGGCCCCAGGGTGACAGGCCCAGCCACACCC for BCL2 3′-UTR,
(T7)GCCTGGCCCCGAGCCCCGAGCGGGCGTCGCTCA and TGAGCGACGCCCGCTCGGGGCTCGGGGCCAGGC for PDK1 5′-UTR,
(T7)TGCTGGGCCTGGCTCCCTGGCTGGCCAGCCTCT and AGAGGCTGGCCAGCCAGGGAGCCAGGCCCAGCA for PDK1 CR1,
(T7)GCCTGGGCGCGGTGGCTGCTGGTGGCGCTGA and TCAGCGCCACCAGCAGCCACCGCGCCCAGGC for PDK1 CR2,
(T7)GGCCGGGCCGGGGGCGGGGGGGCCGGGGCCCG and CGGGCCCCGGCCCCCCCGCCCCCGGCCCGGCC for ZNF219 5′-UTR,
(T7)AGGCGAGGCCGGGCCTGGGGGTGCCCTCCACCG and CGGTGGAGGGCACCCCCAGGCCCGGCCTCGCCT for ZNF219 CR,
(T7)GCTCGGGCCCGGGCCCGGGCCCCGCAGGCCTGC and GCAGGCCTGCGGGGCCCGGGCCCGGGCCCGAGC for FOXD2 CR,
(T7)GTCTGGGCCAGGGACTGGAGAGGTGGGGGTGGA and TCCACCCCCACCTCTCCAGTCCCTGGCCCAGAC for FOXD2 3′-UTR,
(T7)GGATGGGGATGGGGACTTTGGGGCCGGGGTCAA and TTGACCCCGGCCCCAAAGTCCCCATCCCCATCC for MMP15 CR, and
(T7)GGCTTGGCCACAGCCAGGGGAGCAGAGGGGCA and TGCCCCTCTGCTCCCCTGGCTGTGGCCAAGCC for MMP15 3′-UTR.
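Because each motif template is made by annealing two complementary oligomers, the second oligomer of a pair is expected to be the reverse complement of the first (leaving aside the T7 promoter sequence). A minimal Python sketch of that check follows; the function names are hypothetical, and the two example pairs are copied from the list above.

```python
# Translation table for DNA base complementation
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def is_complementary_pair(forward, reverse):
    """True if `reverse` is exactly the reverse complement of `forward`."""
    return reverse_complement(forward) == reverse.upper()

# Motif-bearing portions of two oligomer pairs (T7 promoter sequence omitted)
pairs = {
    "BRD2 5'-UTR2": ("AGATGTGGCGGGTTGCCACTTCCCTGTGGGTCTCT",
                     "AGAGACCCACAGGGAAGTGGCAACCCGCCACATCT"),
    "BRD2 CR": ("CCCTGGGGAAGGGAATGCAGGGTTGCTGGGGCTGG",
                "CCAGCCCCAGCAACCCTGCATTCCCTTCCCCAGGG"),
}

# Map each template name to whether its two oligomers anneal end to end
checks = {name: is_complementary_pair(f, r) for name, (f, r) in pairs.items()}
```

A pair that fails this check is not necessarily wrong (one oligomer may be shorter than the other by design), but it is worth inspecting before ordering or annealing.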
RESULTS
Identification of nucleolin target mRNAs
HeLa cells were used to isolate nucleolin-mRNA RNP complexes using a specific antibody against nucleolin, while parallel control IP reactions were performed using IgG (Figure 1A). The visualization of proteins present in the anti-nucleolin IP was performed by silver stain (Figure 1B), revealing a major nucleolin band and several minor polypeptides that copurified with nucleolin (asterisks), and by immunoblot detection of nucleolin (Figure 1C). The IP reactions were carried out using conditions that preserved protein–mRNA complexes as described (19). The RNA in nucleolin IP and control IgG IP samples was isolated and analyzed using Illumina microarrays (‘Materials and Methods’ section). A partial list of nucleolin target transcripts that were highly enriched in nucleolin IP relative to IgG IP is shown (Figure 1D); see Supplementary Table S1 for a complete list of nucleolin target mRNAs on arrays. Given that nucleolin was shown to interact with RBPs such as hnRNP K (43), we tested whether other hnRNPs were detected in nucleolin IP samples. As shown in Supplementary Figure S1, there was weak interaction of hnRNP K with nucleolin, but neither hnRNP D (also known as AUF1) nor hnRNP C1/C2 was detected in the nucleolin IP samples. Furthermore, the interaction of hnRNP K and nucleolin did not appear to influence the association of nucleolin with target mRNAs, since these transcripts generally showed comparable enrichments in nucleolin IP samples regardless of hnRNP K abundance (Supplementary Figure S2).
A comparative analysis of nucleolin target mRNAs using Ingenuity Pathways Analysis (IPA) enabled a global view of the involvement of these genes in biological processes. For example, a large number of nucleolin target mRNAs encoded proteins that were involved in cancer, genetic disorders and cellular growth, proliferation and infection; a smaller number of transcripts encoded proteins involved in other processes such as cell division, DNA replication, recombination and repair (Figure 1E). A complete list of the functional analysis is provided in Supplementary Table S2. These data indicate that the proteins encoded by nucleolin target mRNAs are involved in important cellular processes.
Validation of endogenous and recombinant nucleolin target transcripts
A subset of nucleolin target mRNAs (Figure 1D) was validated by RT–qPCR analysis employing gene-specific primer pairs, as described (19). Transcripts such as BCL7C, syntaxin 10 (STX10), STIP1 homology and U-box containing protein 1 (STUB1), cyclin I (CCNI), AKT1, flotillin 1 (FLOT1) and upstream transcription factor 2, c-fos interacting (USF2) mRNAs were highly enriched in nucleolin IP compared to IgG IP samples, as measured by RT–qPCR analysis (Figure 2). Binding of nucleolin to GAPDH mRNA, which encodes a housekeeping protein, was used for normalization, since GAPDH mRNA was not a target of nucleolin. Since GAPDH mRNA is highly abundant, it bound non-specifically to IP components (including beads, antibody and reaction tube), and assessment of the GAPDH qPCR product helped to confirm that sample input was even. As shown, nucleolin target transcripts were more abundant in nucleolin (NCL) IP compared with IgG IP.
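The GAPDH-normalized enrichment of a transcript in nucleolin IP relative to IgG IP can be computed along the following lines. This Python sketch assumes the widely used comparative Ct (2^-ΔΔCt) calculation with 100% amplification efficiency; the text does not state which quantification formula was used, and the Ct values shown are hypothetical.

```python
def fold_enrichment(ct_target_ip, ct_gapdh_ip, ct_target_igg, ct_gapdh_igg):
    """Fold enrichment of a target mRNA in nucleolin IP versus IgG IP.

    Assumption: comparative Ct method, normalizing each sample to GAPDH
    and assuming the product doubles every PCR cycle.
    """
    delta_ip = ct_target_ip - ct_gapdh_ip      # target relative to GAPDH, NCL IP
    delta_igg = ct_target_igg - ct_gapdh_igg   # target relative to GAPDH, IgG IP
    return 2 ** -(delta_ip - delta_igg)

# Hypothetical Ct values (not data from the study)
example = fold_enrichment(ct_target_ip=24.0, ct_gapdh_ip=20.0,
                          ct_target_igg=27.0, ct_gapdh_igg=20.0)
```

With these illustrative values the target is detected three cycles earlier in the nucleolin IP at equal GAPDH background, which corresponds to an 8-fold enrichment.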
Binding was also confirmed by using biotinylated transcripts which spanned the 3′-UTR, CR and 5′-UTR of target mRNAs (Figure 3 and Supplementary Figure S3). Biotinylated RNAs were incubated with HeLa cytoplasmic lysates, biotinylated RNA–protein complexes were pulled down using streptavidin-coated beads and the presence of nucleolin in the RNPs was examined by western blot analysis (‘Materials and Methods’ section). Unexpectedly, binding assays using biotinylated transcripts that spanned the entire length of AKT1, FLOT1 and CCNI mRNAs showed that nucleolin associated with the CR of these transcripts, with the 3′-UTR of AKT1 mRNA (but much less with the 3′-UTRs of CCNI or FLOT1 mRNAs) and with the 5′-UTR of FLOT1 and CCNI mRNAs (and only weakly with the 5′-UTR of AKT1 mRNA; Figure 3). The biotinylated GAPDH 3′-UTR, included as a negative control, did not show binding to nucleolin (Figure 3C).
Further evidence of the direct interaction of nucleolin with these RNAs was obtained using recombinant purified proteins. Two fusion proteins were prepared (each lacking the acidic N-terminal domain that renders nucleolin insoluble): GST-NCL and GST-NCL(ΔRGG), the latter lacking the RGG domain through which nucleolin can associate with other molecules, including RNA (Figure 4A). Incubation of both GST fusion proteins with the partial biotinylated transcripts (Figure 3) revealed an association of GST-NCL with the biotinylated CCNI and AKT1 RNAs tested (Figure 4B and C) and with the CR segments of FLOT1 RNA. On the other hand, GST-NCL(ΔRGG) interacted with some fragments of CCNI and AKT1 mRNAs, although less strongly overall than GST-NCL, and did not interact with FLOT1 RNAs. GST alone did not interact with these biotinylated RNAs (data not shown), nor did it interact with shorter nucleolin target mRNAs (see Figure 6 below). Although the patterns of association of the biotinylated RNAs with endogenous nucleolin were often similar to those seen with GST-NCL, they were not identical. These discrepancies are likely due to the fact that endogenous nucleolin associates with the biotinylated RNAs in competition (or perhaps in cooperation) with other RBPs and other molecules present in the lysate. Therefore, the net binding of endogenous nucleolin and biotinylated RNAs reflects the combined influence of other surrounding factors. In contrast, when the biotinylated RNA is incubated with GST-NCL, there are no competing or cooperating RBPs or other molecules in the binding reaction, so the pattern of binding can differ from that seen with the endogenous nucleolin. Taken together, the data in Figures 3 and 4 indicate that both endogenous and recombinant nucleolin can associate with cellular target mRNAs and with in vitro synthesized partial transcripts. They also suggest that the four RNA-binding domains of nucleolin cooperate with the RGG domain to enhance these interactions.
Transcriptome-wide identification and validation of a G-rich nucleolin binding motif
Since nucleolin exhibited affinity for different mRNA regions besides the 3′-UTR (Figures 3 and 4), we did not restrict our search for nucleolin signature motifs among target mRNAs to the 3′-UTRs but included CRs and 5′-UTRs. Nucleolin target mRNAs were used in computational analyses to identify common RNA signature motifs based on shared primary RNA sequences and secondary structures (‘Materials and Methods’ section). The resulting motif logo (relative frequency of nucleotides) was highly enriched in G residues (Figure 5A). The sequences of eight specific motif hits and their secondary structures are shown in Figure 5B and C.
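The G-richness of an individual motif hit can be quantified with a simple base-composition calculation, sketched below in Python. This is only an illustration of what "enriched in G residues" means for a single sequence; it does not reproduce the position-specific frequencies of the motif logo, which require aligned hits. The function name is hypothetical, and the example sequence is the BCL2 5′-UTR motif template listed in 'Materials and Methods'.

```python
from collections import Counter

def base_fractions(seq):
    """Return the fraction of A, C, G and U in a sequence (DNA T is read as U)."""
    rna = seq.upper().replace("T", "U")
    counts = Counter(rna)
    return {base: counts.get(base, 0) / len(rna) for base in "ACGU"}

# BCL2 5'-UTR motif template from the oligomer list (T7 promoter omitted)
bcl2_5utr = "CTCTGGGGGCCGTGGGGTGGGAGCTGGGGCGAG"
fractions = base_fractions(bcl2_5utr)
```

For this 33-nt sequence, G accounts for 20 of the 33 positions, well above the 25% expected for a sequence with uniform base composition.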
Using the entire UniGene database, the G-rich nucleolin motif was found to be abundant within the CR and both UTRs (Figure 5D); in fact, its frequency was comparable among the lists of motif predictions in the 5′-UTR, CR and 3′-UTR (Figure 5E; complete lists are available in Supplementary Tables S3–S5). Four mRNAs that were predicted to be nucleolin target mRNAs based on the presence of the G-rich motif (the MGAT1, MG21, LRP3 and AP1S1 mRNAs) were tested using RT–qPCR and specific primer pairs. All four were enriched in nucleolin IP (Figure 5F), indicating that the G-rich motif has good predictive value for identifying nucleolin target mRNAs.
Next, we tested the ability of nucleolin to bind to the specific RNA motif hits in several mRNAs. Short biotinylated RNAs comprising the sequences shown were tested by biotin pull-down assay using HeLa cytoplasmic lysates (Figure 6A). Nucleolin was found to bind the short RNA motifs from different regions (5′-UTR, CR and 3′-UTR). For the FOXD2 mRNA, nucleolin appeared to bind the CR motif more strongly than the 3′-UTR motif. Nucleolin bound to motif CR1 (but not to motifs CR2 or 5′-UTR) of the PDK1 mRNA and also interacted with hits in the 5′-UTR and the CR of BRD2 and PEX10 mRNAs, as well as with hits in the 3′-UTR and 5′-UTR of BCL2 mRNA. All the hits tested within the ZNF219 and MMP15 mRNAs also associated positively with nucleolin. The intensity of these interactions varied among the transcripts (Figure 3); biotinylated GAPDH RNA segments did not show interaction in this assay (data not shown). These short biotinylated RNAs also showed strong in vitro interaction with recombinant purified GST-NCL, but less with GST-NCL(ΔRGG), as observed with longer biotinylated RNA (Figure 4); there was no interaction of biotinylated RNA with GST (Figure 6B). Together, these data indicate that nucleolin is able to bind G-rich motif elements within the 3′-UTR, CR and 5′-UTR and that the RGG domain likely cooperates with the RRMs to form these interactions.
Nucleolin promotes mRNA translation without influencing mRNA stability
Many RBPs that associate with mature mRNAs regulate their stability and/or translation. To investigate the functional influence of nucleolin upon target mRNAs bearing the G-rich motif, we specifically silenced nucleolin using small interfering (si)RNA directed to the nucleolin mRNA. Downregulation of nucleolin did not influence the steady-state levels of most of the mRNAs tested (Figure 7A), nor did it change the half-lives of these mRNAs, as measured by RT–qPCR after treatment with actinomycin D (Supplementary Figure S4 and data not shown). Thus, for mRNAs whose levels declined moderately after nucleolin silencing (including AKT1, MAF1 and CCNI), it is likely that nucleolin affects the levels of transcription factors or other proteins that modulate their transcriptional expression. Testing of a subset of nucleolin targets (those encoding Flot1, Usf2, Dus1l, Akt1 or Cyclin I) following nucleolin silencing revealed a greater magnitude of reduction in protein level (Figure 7B) than was seen at the mRNA level (Figure 7A). As previously reported, nucleolin silencing increased the abundance of p53 and reduced the abundance of Bcl-2, included here as positive controls (Figure 7C). Among the reported nucleolin targets, the mRNAs encoding APP, Bcl-2, p53, PGHS1 and several selenoproteins were found to have hits in the 3′-UTR; BCL2 and PGHS1 mRNAs had coding region hits; and one selenoprotein (SEP15) mRNA had hits in the 5′-UTR (Supplementary Tables S3–S5).
Since nucleolin downregulation did not markedly affect the steady-state levels of most target mRNAs (Figure 7A), we investigated whether mRNA translation was altered by studying the sizes of polysomes associated with nucleolin target mRNAs (‘Materials and Methods’ section). Nucleolin downregulation did not alter the global profiles of polysomes in cells transfected with control or nucleolin-directed siRNAs, as observed by comparing the relative levels of free mRNAs (-), small (40S) and large (60S) ribosomal subunits, monosomes (80S) and low- and high-molecular weight polysomes (LMWP, HMWP, respectively) (Figure 8A). Although nucleolin promotes polysome biogenesis and cell cycle progression, there was no global inhibition of protein synthesis by 48 h after silencing nucleolin (Supplementary Figure S5). However, when individual mRNAs present in the different polysome gradient fractions were compared between Ctrl siRNA and NCL siRNA populations, MGAT1, AKT1, FLOT1, LRP3, MG21 and CCNI mRNAs were all found in relatively lighter polysomal fractions after silencing nucleolin; this was manifested as leftward shifts in the distribution peaks, indicative of smaller polysomes. For example, FLOT1 mRNA was most abundant in polysomes peaking at fraction 9 in control cells, but the peak was in fraction 7 in nucleolin-silenced cells; CCNI, LRP3 and AKT1 mRNAs shifted peaks from fraction 8 to fraction 7 after silencing nucleolin, MG21 mRNA shifted from a peak in fraction 6 to a peak in fraction 3, and MGAT1 mRNA from fraction 6 to 5 (Figure 8B). Additional evidence that translation of these proteins was reduced came from experiments to measure nascent translation of specific proteins.
Following incubation of HeLa cells with 35S-Met/Cys for 20 min (to faithfully detect newly synthesized protein and have a negligible contribution of protein decay), lysates were subjected to IP reactions in the presence of antibodies recognizing the proteins in Figure 7B, and the immunoreactive, 35S-labeled proteins were visualized. As shown, the de novo translation of Usf2, Akt1 and Flot1 was reduced in nucleolin-silenced cells, while incorporation of label into a control housekeeping protein (nascent 35S-GAPDH) was equal and overall signals in both lanes were similar (IgG IP, Figure 8C). Other proteins (e.g. Cyclin I or Dus1l) were not detectable using this low-efficiency de novo translation assay. These results underscore nucleolin’s function as an enhancer of the translation of a subset of target mRNAs.
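The leftward shift in polysome distribution described for the gradient fractions can be expressed numerically by locating the fraction in which each mRNA peaks. The Python sketch below illustrates the idea; the per-fraction abundances are hypothetical values chosen to mimic a peak moving from fraction 9 to fraction 7, not measurements from the study, and the function names are assumptions.

```python
def percent_distribution(abundance):
    """Convert per-fraction mRNA abundance into percent of the gradient total."""
    total = sum(abundance)
    return [100 * a / total for a in abundance]

def peak_fraction(abundance):
    """Return the 1-based index of the gradient fraction with the highest abundance."""
    return max(range(len(abundance)), key=lambda i: abundance[i]) + 1

# Hypothetical relative mRNA levels across 12 sucrose gradient fractions
ctrl_sirna = [1, 1, 2, 3, 5, 8, 12, 18, 25, 15, 7, 3]
ncl_sirna = [1, 2, 3, 6, 10, 18, 26, 16, 9, 5, 3, 1]

# Negative values indicate a shift toward lighter (smaller) polysomes
shift = peak_fraction(ncl_sirna) - peak_fraction(ctrl_sirna)
```

In this example the peak moves from fraction 9 to fraction 7, giving a shift of -2 fractions; converting to percent of total with `percent_distribution` makes profiles from different samples comparable before plotting.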
To further ascertain if the nucleolin signature motif elicited its influence from the CR and the 3′-UTR, we studied the effect of inserting the motif in different sites of a heterologous reporter construct. A reporter plasmid expressing GFP (pGFP) was used to generate three separate constructs bearing motifs M1, M2 and M3 (from AP1S1, LRP3 and MGAT1 mRNAs, respectively) inserted in frame in the CR before the GFP stop codon (pGFP-M1, pGFP-M2 and pGFP-M3), three constructs with the same motif sequences inserted in the 3′-UTR, after the stop codon (pGFP-3′M1, pGFP-3′M2 and pGFP-3′M3) and three additional constructs with mutated versions of M1, M2 and M3, in which all Gs were replaced with Cs (pGFP-3′M1mut, pGFP-3′M2mut and pGFP-3′M3mut; Figure 9A). Transfection of these plasmids into HeLa cells that expressed either normal or reduced levels of nucleolin was followed by western blot analysis to assess GFP expression. As shown in Figure 9B, nucleolin silencing strongly reduced GFP expression of all constructs bearing nucleolin motifs in the 3′-UTR. When nucleolin motifs were inserted in the CR of the reporter construct, GFP expression was variable: after silencing nucleolin, GFP expressed from pGFP-M1 remained elevated, GFP expressed from pGFP-M2 was moderately downregulated and GFP expressed from pGFP-M3 was markedly reduced (Figure 9B). All the mutant reporters were refractory to nucleolin silencing. GFP mRNA levels in these transfection groups remained unchanged by nucleolin silencing (Supplementary Figure S6). Together, these findings indicate that the presence of the nucleolin signature motif in the 3′-UTR of target mRNAs strongly supports nucleolin-enhanced translation, while its presence in the CR has a less robust influence.
DISCUSSION
We report the global identification of nucleolin target mRNAs. These mRNAs encoded proteins involved in key mammalian processes such as cell proliferation, viral infection and carcinogenesis. Computational analysis of the target mRNAs revealed the presence of a consensus G-rich RNA signature motif that was detected in all regions of the mRNA—5′-UTR, CR and 3′-UTR. Nucleolin exhibited an affinity for RNAs bearing the G-rich signature motif, both endogenous mRNAs and recombinant biotinylated RNAs assessed in vitro. Similarly, recombinant purified nucleolin (GST-NCL) associated in vitro with these biotinylated RNA targets, indicating that nucleolin was capable of interacting with these mRNAs in the absence of other proteins. Nucleolin did not globally influence the stability of target mRNAs bearing the G-rich motif, but instead enhanced target mRNA translation. Nucleolin’s preference for G-rich elements adds to an expanding repertoire of turnover- and translation-regulatory mRNA sequences, which now includes motifs such as those rich in AU, U, CA, GU and CU (53).
Translational regulation vs mRNA turnover
The finding that nucleolin broadly enhanced the translation of this subset of target mRNAs was unexpected. Nucleolin was previously reported to increase the stability of some target mRNAs. For example, nucleolin associated with AU-rich elements in the BCLXL 3′-UTR, enhancing BCLXL mRNA stability and Bcl-xL protein levels in human keratinocytes following irradiation with ultraviolet light (UVA), and it also enhanced the stability of BCL2 mRNA (39,54). Treatment with the nucleolin-directed aptamer AS1411 (presently in phase II clinical trials) was recently found to inhibit nucleolin binding to BCL2 mRNA, in turn increasing the association of BCL2 mRNA with AUF1, an RBP that can destabilize some target mRNAs (55,56). Thus, at least for BCL2 mRNA, nucleolin can stabilize a target mRNA by competing with AUF1; perhaps similar competition paradigms explain the effect of nucleolin in the stabilization of other ligand RNAs. It is interesting to note that the nucleolin-targeted aptamer AS1411 is a G-rich oligodeoxynucleotide (26-nt long) (55,57), supporting the notion that nucleolin indeed has affinity for G-rich sequences.
Nucleolin and hnRNP E interacted with the 3′-UTR of human β-globin mRNA and enhanced its half-life (41), and nucleolin, hnRNP K and PCBP1 associated with and stabilized gastrin mRNA in human gastric adenocarcinoma cells (43). Whether nucleolin competed with binding of decay-promoting RBPs in these instances was not reported. On the other hand, some RBPs that influence mRNA stability and translation (e.g. HuR, NF90, TTP, BRF1 and KSRP) have affinity for U/AU-rich sequences in the 3′-UTR of target transcripts (8,12,40,58–61). The fact that nucleolin can bind to G-rich sequences suggests that it may not always compete with other RBPs, and may instead cooperate with or recruit other RBPs to shared mRNA targets. The specific functional interactions of nucleolin and other RBPs (as well as non-coding RNAs) warrant future study.
As mentioned above, nucleolin was also reported to modulate the translation of a handful of mRNAs. Both nucleolin and the ribosomal protein L26 (RPL26) bound the 5′-UTR of TP53 mRNA and controlled p53 translation in breast carcinoma cells: RPL26 enhanced it, while nucleolin suppressed it (45). Similarly, nucleolin interacted with the 5′-UTR of PGHS1 mRNA and repressed PGHS-1 production (46). In contrast, nucleolin binding to the 3′-UTR of MMP9 mRNA enhanced MMP9 translation in human fibrosarcoma HT1080 cells treated with 2,2-dipyridyl (2,2-DP) (47). Hypoxia treatment of HT1080 cells also increased expression of C-P4H-α(I) (collagen prolyl 4-hydroxylase) mediated by a ~64-kDa cleavage product of nucleolin which bound to the C-P4H-α(I) 5′-UTR (62). Similarly, nucleolin associated with the 3′-UTR of several selenoprotein mRNAs and promoted mRNA translation (48). Whether nucleolin represses translation through the 5′-UTR but enhances it through the 3′-UTR should also be examined systematically.
Controlling gene expression from the CR
Although traditionally the CR has not been reported to harbor many stability/translation regulatory elements, it is now clear that much of the cellular mRNA is not bound to ribosomes, including nuclear mRNAs and cytoplasmic mRNAs being transported or in storage (63). Other mRNAs are occupied sparsely by ribosomes (64,65), indicating that a large pool of untranslated mRNA and CR mRNA segments are available for interaction with RBPs like nucleolin and with non-coding RNAs. It will be important to study systematically the association of nucleolin with nuclear and cytoplasmic mRNAs. Whether cytoplasmic nucleolin interacts with the CR of mRNAs devoid of ribosomes, with lightly translated mRNAs (mRNAs forming low-molecular weight polysomes), or with mRNAs localized in the plasma membrane also deserves further study.
Nucleolin binding under basal and stress-stimulated conditions
The association of nucleolin with mRNAs bearing G-rich motifs, and the ensuing translational upregulation, may only occur in unstimulated conditions such as those studied here. Exposure to growth factors, mitogens, stressors or other stimuli could alter nucleolin’s affinity for mRNAs, allowing it to bind different sequences and possibly affecting other post-transcriptional processes, like mRNA turnover. For example, treatment with EGF increased nucleolin binding to gastrin mRNA and enhanced its interaction with hnRNP K; these events were linked to changes in gastrin mRNA half-life (43).
It is also worth noting that nucleolin was identified as a genotoxic stress-responsive RBP, whose levels, localization and interaction with other RBPs were influenced by UV irradiation and other stresses (66,67). In this regard, nucleolin is post-translationally modified by phosphorylation [via casein kinase II (CKII), protein kinase C (PKC) and cell division cycle kinase 2 (CDC2) (68–70)], by methylation and by ADP-ribosylation (21,30). Further studies are needed to understand in detail how these modifications affect nucleolin function. Phosphorylation and other post-translational alterations of RBPs such as HuR, BRF1, TTP, KSRP and TIAR/TIA-1 have been linked to changes in their subcellular localization and their association with and influence upon target mRNAs (reviewed in 5,71).
Functional consequences, concluding remarks
In closing, many of nucleolin’s target mRNAs, including those that bear the G-rich signature, encode proteins involved in cell growth and proliferation, cancer and viral infection. Whether nucleolin’s influence upon these target transcripts is modulated by its interactions with other RBPs and/or non-coding RNA awaits systematic study. Another open question is how post-translational modification of nucleolin alters its influence on target mRNAs. Finally, it remains to be tested whether the promising clinical influence of AS1411, a nucleolin-directed oligonucleotide, involves nucleolin’s effect on targets that bear G-rich motifs. Addressing these questions in depth will provide critical information about the role of nucleolin in cell proliferation and its potential usefulness in cancer therapy.
Supplementary Data are available at NAR Online.
National Institute on Aging-Intramural Research Program of the National Institutes of Health; NIH (RO1 1CA116491-01 to F.C.). Funding for open access charge: National Institute on Aging-Intramural Research Program, National Institutes of Health.
Conflict of interest statement. None declared.