On the rules of engagement for microRNAs targeting protein coding regions

Abstract MiRNAs post-transcriptionally repress gene expression by binding to mRNA 3′UTRs, but the extent to which they act through protein coding regions (CDS regions) is less well established. MiRNA interaction studies show that a substantial proportion of binding occurs in CDS regions; however, sequencing studies show much weaker effects on mRNA levels than from 3′UTR interactions, presumably due to competition from the translating ribosome. Consequently, most target prediction algorithms consider only 3′UTR interactions. However, the consequences of CDS interactions may have been underestimated, with the reporting of a novel mode of miRNA-CDS interaction requiring base pairing of the miRNA 3′ end, but not the canonical seed site, leading to repression of translation with little effect on mRNA turnover. Using extensive reporter, western blotting and bioinformatic analyses, we confirm that miRNAs can indeed suppress genes through CDS interaction in special circumstances. However, in contrast to what was previously reported, we find that repression requires extensive base pairing, including of the canonical seed, but does not strictly require base pairing of the 3′ miRNA terminus and is mediated through reducing mRNA levels. We conclude that suppression of endogenous genes can occur through miRNAs binding to CDS, but the requirement for extensive base pairing likely limits the regulatory impacts to modest effects on a small subset of targets.


INTRODUCTION
MicroRNAs (miRNAs) are small non-coding RNAs that constitute the target recognition component of the RNA-induced silencing complex (RISC). In this role, they facilitate gene repression through their recruitment of RISC to their target mRNA transcripts, resulting in translational inhibition or destabilisation of the target mRNA (1). The canonical mechanism through which miRNAs work is well established. This involves the binding of a miRNA to an Argonaute (AGO) protein in such a way as to make a region of the miRNA (nucleotides 2-8, known as the 'seed') accessible for complementary base pairing with target transcripts. Initial interactions with the seed then bring about conformational changes in the AGO protein, exposing additional sites within the miRNA (especially nucleotides 12-17) for interaction that can further stabilize binding and facilitate more effective target repression (2-6). In the rare instances of extensive sequence complementarity between the miRNA and its target, or in the case of short interfering RNAs (siRNAs), the interaction across the whole binding interface activates the enzymatic function of AGO2, directly cleaving the transcript at the bases bound to the central region of the miRNA (7-11).
Efforts to understand miRNA-mediated repression initially analysed the effects of miRNA perturbation on the transcriptome. Such studies revealed 'rules' of functional sites (also known as microRNA response elements, MREs). For example, longer and fully complementary seed interactions (across 8 contiguous nucleotides) are most effective (12-14). Also, MREs located within 3′UTRs are more effective than sites within protein coding sequences (CDS) (12, 13, 15, 16), presumably due to avoidance of competition from transiting ribosomes (15, 17). These observations have been widely replicated and have set the landscape for miRNA target prediction to such an extent that potential sites within coding exons are often entirely ignored (12, 18-26).
Despite this, individual examples of functional miRNA targeting within CDS regions have been reported (27-29). AGO cross-linking and immunoprecipitation studies reveal extensive miRNA interaction sites within coding regions, with the frequency of binding to the CDS often similar to that within the 3′UTR (30). Of particular interest is the report that miRNAs can target the CDS via a mechanism that is independent of the seed, but that is dependent upon extensive binding elsewhere, particularly at the 3′ end of the interaction site (31). The mechanism of gene repression for these CDS sites was reported to occur through aborted translation and to not affect the level of the target transcript. Such interactions are therefore likely to have been missed in most studies because typically only the effects on mRNA levels are examined. If such a mechanism is substantiated, the breadth of functional CDS targeting by miRNAs may be vastly larger than previously recognised.
In this study, we have sought to definitively determine the capacity of miRNAs to exert their repressive effects within protein coding regions. Using multiple reporter constructs, miRNAs and cell lines, we find that miRNAs are capable of repressing their targets within protein coding regions; however, in contrast to observations from the Zhang et al. study (31), the mode of target repression is canonical (seed-dependent), is not especially dependent on the binding of the miRNA 3′ terminus and operates at the level of transcript stability. We find that an extensive binding interface between the miRNA and its target is required for functionality within the CDS, which involves direct cleavage of the target transcript. Bioinformatic assessment of both predicted and experimentally identified binding sites reveals that CDS targeting is likely to occur endogenously, but the requirement for extensive base pairing will limit this mechanism to a relatively small number of genes.

Luciferase reporter constructs
The psiCHECK-2 reporter vector (Promega) was initially digested with NruI and NotI, into which were cloned double stranded DNA oligonucleotides (G-blocks, IDT) to introduce an AgeI restriction site 6 nucleotides upstream of the stop codon. The modified psiCHECK-2 plasmid was then digested with AgeI and NotI to insert the desired miRNA target sequences by T4 PNK (NEB) treatment and annealing of single stranded oligonucleotides. XhoI and NotI sites present in the original Renilla luciferase vector were used for the cloning of miRNA binding sites into the 3′UTR. Sequences of G-blocks and single stranded oligonucleotides are listed in Supplemental Table S4.

Plasmid transfection and dual luciferase assay
Cells were seeded at 5 × 10^4 cells per well in 24-well plates and transfected the next day with 5 ng psiCHECK-2 or modified psiCHECK-2 along with either 5 nM of control or miRNA mimic or 20 nM of control or miRNA inhibitor diluted in Opti-MEM (Invitrogen) and in combination with Lipofectamine 2000 (Invitrogen). The transfection reagent was replaced with fresh cell growth media 6 h post transfection. Luciferase activity was measured 48 h post transfection using a dual luciferase kit (Promega).
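As an illustration of how dual luciferase readings of this kind are commonly summarised, the following minimal sketch divides the targeted Renilla signal by the co-expressed firefly signal in each well and expresses the result relative to the mean ratio of the control wells. The function name and the readings are hypothetical and are not taken from this study; the exact normalisation used by the authors is not stated in this section.

```python
from statistics import mean


def relative_renilla_activity(renilla, firefly, control_renilla, control_firefly):
    """Renilla/firefly ratio per well, relative to the mean ratio of the control wells."""
    control_ratio = mean(r / f for r, f in zip(control_renilla, control_firefly))
    return [(r / f) / control_ratio for r, f in zip(renilla, firefly)]


# Hypothetical luminescence readings (arbitrary units), for illustration only.
control_renilla, control_firefly = [1000.0, 1100.0], [500.0, 550.0]
mimic_renilla, mimic_firefly = [400.0, 450.0], [500.0, 500.0]

relative = relative_renilla_activity(
    mimic_renilla, mimic_firefly, control_renilla, control_firefly
)
print(relative)  # values below 1 indicate repression of the targeted reporter
```

Normalising to the non-targeted firefly signal corrects for well-to-well differences in transfection efficiency and cell number, which is the purpose of the dual reporter design.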

RNA isolation, cDNA synthesis and qRT-PCR
Cells were seeded at 8 × 10^4 cells per well in 6-well plates. The following day, cells were transfected with 10 nM miRNA mimic or 50 nM miRNA inhibitor diluted in Opti-MEM (Invitrogen) and in combination with Lipofectamine RNAiMAX (Invitrogen) using the recommended protocol. Media was replaced 6 h post transfection. At 72 h post transfection, total RNA was harvested using TRIzol (Invitrogen), following the standard manufacturer's protocol. cDNA was synthesised using the QuantiTect Reverse Transcription Kit (Qiagen) from 1 µg of RNA. qRT-PCR was performed on a Rotor-Gene 6000 series thermocycler (Qiagen) with Master SYBR Green reagent (Qiagen). Analysis was performed using the comparative quantitation feature in the Rotor-Gene software, with each gene measured being normalized to the mean of GAPDH and RPL32. All miRNA mimics, miRNA inhibitors, transfected pseudo-miRs and qPCR primers are listed in Supplemental Table S4.

Protein purification and western blotting
Cells were seeded at 8 × 10^5 per well in 6-well plates. The following day, cells were transfected with 10 nM miRNA mimic using Lipofectamine RNAiMAX (Invitrogen) using the recommended protocol. The transfection reagent was removed 6 h post transfection and cells were cultured in fresh media. Seventy-two hours post transfection, cells were treated with ice cold 1 × RIPA lysis buffer prepared by the recommended combination of cOmplete Mini, EDTA-free Protease-inhibitor Cocktail tablet (PIC; Roche), PhosSTOP EASYpack (Roche) and 10 × RIPA buffer (Abcam). The concentration of protein in purified lysate was estimated using the Pierce BCA Protein Assay Kit (Thermo Scientific). 20 µg of protein was loaded onto Bolt Bis-Tris Plus gels (gel type based on protein size) using 1 × Bolt MOPS SDS Running Buffer (Invitrogen). Proteins were transferred to nitrocellulose membrane at 4 °C using 1 × Bolt Transfer Buffer (Invitrogen) with 10% methanol by volume. Membranes were incubated with Ponceau stain for total protein visualization using ChemidocTouch. Membranes were blocked in 5% skim milk for 1 h at room temperature and incubated overnight in the recommended dilution (generally 1:1000) of primary antibody at 4 °C. Protein visualization using Near Infrared (NIR) was achieved by incubation for 1 h at 4 °C in secondary antibody (1:20 000; PBST), IRDye 800, of the correct species. The same membrane was reprobed with α-tubulin (1:2500 dilution) for 1 h at 4 °C followed by 1 h of secondary antibody (1:20 000; PBST), IRDye 680.

Lentivirus production and infection
For lentivirus production, HEK293T cells were plated at 2 × 10^6 cells in T25 flasks. The following day, cells were transfected with 1 µg pLP1, 1 µg pLP2, 1 µg pLP-VSVG, 1 µg pTAT and either 4 µg of the pLV4301-enhanced GFP transfer vector (32) or 4 µg of the pLX301-mCherry transfer vector (33). DNA was mixed in 500 µl Opti-MEM and transfected in combination with 12 µl Lipofectamine 2000. Transfection reagent was removed 6 h post transfection and viral supernatant of either pLV4301-eGFP or pLX301-mCherry was collected after 72 h. MDA-MB-231 cells were seeded at 2 × 10^6 cells in T25 flasks. The following day, cells were transduced (1:4) with viral supernatant of pLV4301-eGFP in the presence of polybrene (4 mg/ml). MDA-MB-231-eGFP-positive cells were re-transduced with pLX301-mCherry viral supernatant to generate a pool of MDA-MB-231-eGFP + mCherry cells. The transduced pool of cells was selected using puromycin (1 µg/ml) and grown for at least 48 h before further analysis.

Single cell sorting and flow cytometry
MDA-MB-231-eGFP-mCherry cells were washed twice with warm washing buffer (1 × PBS + 10 mM EDTA) followed by short incubation with 3 mM EDTA in 1 × PBS. Semi-detached cells were treated with TrypLE followed by dilution in 1 × PBS + 10 mM EDTA. Cells were centrifuged at 350 × g for 5 min, washed with 1 × PBS + 10 mM EDTA and resuspended in sorting buffer (ice cold 1 × PBS, 5 mM EDTA, 1% FCS and 25 mM HEPES; pH 7.0). Cells were then filtered using a 30 µm Filcon sterile filter (BD Biosciences) and sorted on the basis of fluorescence intensity compared to control cells including parental MDA-MB-231 (no colour), MDA-MB-231-eGFP (single colour) and MDA-MB-231-mCherry (single colour). Single cells separated in 96-well plates were grown in conditioned media before transferring into larger 6-well plates. Flow cytometry sorting was performed on the MoFlo Astrios EQ High Speed Cell Sorter using Summit Software version 6.3.1 (Beckman Coulter, Miami, FL, USA). Experiments utilised the 488 nm (150 mW) and 561 nm (200 mW) laser lines and the 100 micron nozzle at 30 PSI. Laser and light filter usage are displayed on plot axes. No forward scatter masks were used. Flow cytometry data were analysed using the Apple Macintosh version of FCS Express 6 (De Novo Software, Los Angeles, CA, USA).

Primer extension assays
15 pmoles of primers P1 and P2 were 5′ end-labelled with equimolar amounts of 32P-γ-ATP using T4 PNK and purified through G-25 columns (GE Healthcare 27-5325-01). The Rps12 control primer was similarly labelled using a 2:1 ratio of cold:hot ATP. 10 µg of total RNA extracted from MDA-MB-231-eGFP-mCherry cells transfected with pseudo-miRs was mixed with 0.5 pmole each of Rps12 and P1 or P2 32P-labelled primers, denatured at 75 °C for 5 min, then reverse transcribed using Superscript III (Invitrogen) according to the manufacturer's instructions. Sanger DNA sequencing of the mCherry reporter with the same radiolabelled primers and Klenow DNA polymerase (NEB M0210) was used as a ladder to map the cleavage sites to nucleotide resolution. Products were separated by large format 5% acrylamide, 7 M urea PAGE, exposed to a phosphor screen and imaged using a Typhoon. Primer sequences were mCherry P1: TTGACCTCAGCGTCGTAGTG, P2: TACTTGTACAGCTCGTCCATG, Rps12: GCAGTCTTCAGAACCTCTTG.

Statistical analysis
Each experiment was performed across multiple biological replicates as indicated in the individual figure legends. Data are presented as mean ± s.e.m., and P values were determined by two-tailed Student's t test.
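The summary statistics named above can be sketched as follows, assuming the equal-variance (pooled) form of Student's t test and using only the Python standard library. The function names and replicate values are hypothetical. The two-tailed P value would then be read from the t distribution with the returned degrees of freedom, a step left to a statistics package and not computed here.

```python
from math import sqrt
from statistics import mean, stdev


def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of replicate values."""
    return mean(values), stdev(values) / sqrt(len(values))


def student_t(a, b):
    """Pooled-variance Student's t statistic and degrees of freedom for two groups."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2


# Hypothetical normalised reporter activities from three biological replicates.
control = [1.0, 1.1, 0.9]
mimic = [0.4, 0.5, 0.6]

control_mean, control_sem = mean_sem(control)
t_stat, dof = student_t(control, mimic)
print(control_mean, control_sem, t_stat, dof)
```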

Bioinformatic prediction of miRNA binding
For the prediction of miRNA binding shown in Supplemental Table S1, analysis was restricted to a subset of 560 miRNAs (either annotated as 'high confidence' in miRBase (34) and expressed at ≥10 rpm, or annotated as 'low confidence' and expressed at ≥1000 rpm). Sequences (3′UTR and CDS) were extracted from ENSEMBL Biomart (35).
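The selection rule described above can be written as a short filter. This is a minimal sketch: the record layout, the confidence labels "high" and "low", and the example miRNA names and expression values are assumptions made for illustration, and only the two thresholds come from the text.

```python
def passes_filter(confidence, rpm):
    """High-confidence miRNAs need >= 10 rpm; low-confidence miRNAs need >= 1000 rpm."""
    if confidence == "high":
        return rpm >= 10
    if confidence == "low":
        return rpm >= 1000
    return False


# Hypothetical records: (name, miRBase confidence label, expression in reads per million).
mirnas = [
    ("miR-A", "high", 250.0),
    ("miR-B", "high", 4.0),
    ("miR-C", "low", 1500.0),
    ("miR-D", "low", 200.0),
]

selected = [name for name, confidence, rpm in mirnas if passes_filter(confidence, rpm)]
print(selected)
```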

Bioinformatic analysis of AGO-CLASH
Processed CLEAR-CLIP data from mouse keratinocytes were obtained as supplemental files from GEO (Accession GSE102716 (36)), which comprised, for each read, the mapped location of the target RNA part of the read and the name of the associated microRNA. Analyses were performed using the Mus musculus genome version 'mm10' and UCSC gene transcripts in Python using the pyreference (https://pypi.org/project/pyreference/), HTSeq (37) and seaborn (38) libraries. Using the 'Set1 Control' sample file (161692 CLEAR-CLIP reads) and all the UCSC transcripts, we annotated each read target region as overlapping intronic, exonic, 3′UTR and/or 5′UTR gene regions, discarding unannotated (i.e. intergenic) reads. To compensate for nuclease 'nibbling', the genomic interval of the target RNAs was expanded on both ends by three bases. The RNAduplex method from the Vienna RNA package (39) was used to analyse the binding affinities of the resulting target RNA and miRNA sequences. The resulting dot-bracket annotation and delta-G values for the predicted RNA duplexes and other annotation were used to produce the table of CLEAR-CLIP read annotations provided as Supplemental Table S1. Where present, duplicate reads (reads with identical RNA target intervals and microRNAs) were counted (in the 'counts' column) and collapsed into a single entry. For visualisations, reads are classified as in '3′UTR' if they overlap 3′UTRs and 'coding region' if they are in exons and not in 3′ or 5′UTRs.
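Three of the steps described above (interval expansion, duplicate collapsing and read classification) can be illustrated with a minimal, library-free sketch. It does not reproduce the pyreference, HTSeq or RNAduplex calls used in the actual analysis. The tuple layout, the region labels, the 0-based coordinates and the example reads are all assumptions for illustration.

```python
from collections import Counter

PAD = 3  # bases added to each end to compensate for nuclease 'nibbling'


def expand(start, end, pad=PAD):
    """Widen a target interval by `pad` bases on both sides (start clamped at 0)."""
    return max(0, start - pad), end + pad


def classify(regions):
    """Label a read for visualisation from the set of gene regions it overlaps."""
    if "3UTR" in regions:
        return "3'UTR"
    if "exon" in regions and "5UTR" not in regions:
        return "coding region"
    return "other"


def collapse(reads):
    """Count reads with identical target interval and miRNA; one key per unique read."""
    return Counter(reads)


# Hypothetical reads: (chromosome, start, end, strand, miRNA name).
reads = [
    ("chr1", 100, 120, "+", "miR-X"),
    ("chr1", 100, 120, "+", "miR-X"),
    ("chr2", 2, 30, "-", "miR-Y"),
]
expanded = [(c, *expand(s, e), strand, m) for c, s, e, strand, m in reads]
counts = collapse(expanded)
print(counts)
print(classify({"exon"}), classify({"exon", "3UTR"}))
```

The expanded target sequences would then be passed, together with the miRNA sequence, to a duplex prediction tool to obtain the dot-bracket structure and delta-G values mentioned in the text.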

MiRNAs frequently interact with coding regions
The Ago-HITS-CLIP procedure identifies the locations of miRNA interaction within their target mRNAs in live cells. Although miRNAs are generally assumed to act through interaction with mRNA 3′UTRs, Ago-HITS-CLIP in a range of different cell types reveals a substantial amount of binding in protein coding sequences (CDSs) (Figure 1A), consistent with reports showing that miRNAs can target CDS regions (27, 36, 40-42). Furthermore, when we designed artificial miRNAs to target three different regions within the CDS of a Renilla luciferase reporter mRNA, containing in each case a mismatch at position 12 of the miRNA to minimise direct cleavage of the target, we found these miRNAs all substantially reduced activity of the targeted Renilla luciferase relative to activity of the co-expressed but non-targeted firefly luciferase (Figure 1B). One of the artificial miRNAs was less effective than the other two, but this was likely due to the target sequence being within a predicted hairpin structure (Supplemental Figure 1). These data support the contention that miRNAs binding within the CDS can repress expression.
A non-canonical form of binding in CDS regions that requires the 3′ end of the miRNA to be base paired to the target mRNA has been reported (31). To assess how common this mode of binding is in CDS and 3′UTR regions, we performed a broad survey of the base pairing interactions of miRNAs by analysing the interactions found by CLEAR-CLIP (covalent ligation of endogenous Argonaute-bound RNAs, also known as cross-linking ligation and sequencing of hybrids (CLASH)), using data from Yi and colleagues (36). The CLEAR-CLIP procedure ligates the miRNA to the fragment of target mRNA to which it is bound, thereby identifying both the miRNA and the target sequence. We compiled the base pairing interactions for CDS and 3′UTRs, and because different miRNAs are not all of identical length, we performed separate alignments with anchoring at the 5′ end of the miRNA (thereby aligning the seed regions) or at the 3′ end to determine whether base pairing of miRNA 3′ ends is especially prominent in CDS interactions. This analysis indicated that base pairing patterns in the CDS are similar to those observed in 3′UTRs and that the non-canonical mode of binding, with base pairing at the miRNA 3′ end, occurs at a similarly low frequency in CDS and 3′UTRs (Figure 1C).

Repression by binding to the CDS requires seed complementarity but not 3′ end pairing
Given the frequency of CDS interaction sites indicated by CLIP studies, we sought to further interrogate the base pairing requirements for active repression of expression via binding to CDS regions. We introduced potential binding sites for various endogenous miRNAs into the coding region of Renilla luciferase, using the strategy employed by Zhang et al. (31), in which a unique restriction site (encoding two additional amino acid residues) is inserted upstream of the stop codon, allowing subsequent insertion of additional target sequences for selected miRNAs. We first examined whether repression by miR-20a requires seed region and/or 3′ end base pairing. In contrast to the findings of Zhang et al., we found that the criteria for the miR-20a repressive effect were similar whether the binding site was in the CDS (Figure 2A, B) or the 3′UTR (Figure 2A, C), with seed region binding being required, but not binding by the miRNA 3′ end. When the seed region was base paired there was effective repression (CDS3, CDS3a, CDS3b and CDS3c in Figure 2), whereas complete base pairing of the miRNA 3′ end (CDS1) did not compensate for seed region mismatches, and even a minor imperfection in the seed region (CDS2 and CDS2a) eliminated repression by the miRNA, regardless of 3′ end base pairing.
As there was no rationale why miR-20a would specifically target CDS, we selected additional miRNAs to assess whether the base pairing criteria for repression within the CDS were similar. These miRNAs were selected on the basis that they are well studied, widely expressed and represented in CLASH data. Starting with miR-342, we found the criteria for miR-342 targeting within the luciferase CDS were similar to those seen with miR-20a (Figure 3 compared to Figure 2). Disruption of base pairing of the seed region impaired activity (Figure 3A, B), whether the 3′ end was base paired (CDS1, CDS2) or not (CDS1a, CDS2a), while the miRNA inhibited luciferase activity if the seed region was perfectly base paired, whether or not the 3′ end was also base paired (CDS3, CDS3a, CDS3b, CDS3c). To check that these key criteria of functionality applied at physiological levels of miRNA, we measured the effect of inhibition of the endogenous miR-342 in MCF7 cells, a cell line in which miR-342 is naturally expressed. Inhibition of miR-342 did not affect expression of the luciferases with seed region mismatches (Figure 3C; CDS1, CDS1a, CDS2, CDS2a), regardless of the degree of base pairing at the miRNA 3′ end. The miR-342-CDS3 luciferase (which was strongly inhibited by transfected miR-342 in MDA-MB-231 cells) was strongly activated by inhibition of endogenous miR-342 in MCF7 cells, indicating that the anti-miR inhibitor is effective and that endogenous miR-342 targets the reporter, as expected (Figure 3C). The miR-342-CDS3c luciferase, which has seed pairing but not 3′ end pairing, was not activated by the miR-342 inhibitor (Figure 3C), despite being inhibited to a degree by miR-342 in transfected MDA-MB-231 cells (Figure 3B), while the CDS3b luciferase, which has a two base mismatch at the miRNA 3′ end, was only slightly activated on inhibition of the miRNA (Figure 3C). Taken together, these data indicate that seed region base pairing is essential for inhibition by miR-342, and can be augmented by 3′ end binding, but without a specific requirement for base pairing of the 3′ terminal bases as previously reported (31).
To further check the generality of the base pairing requirements we also tested equivalent luciferase constructs with CDS sites for miR-200a (Figure 3D-F), miR-200b (Supplemental Figure S2A-C) and miR-194 (Supplemental Figure S2D, E). All of these gave similar results, demonstrating that for miRNAs to repress genes via CDS sites, extensive base pairing is required that includes the seed, but does not necessarily include the 3′ terminal nucleotides. We rationalize that if such sites are significant in biology, one would expect similar 'rules' to operate across different cells and different miRNAs. However, as we did not find these sites to be functional as was reported previously, we sought to exactly replicate the reporters used in the prior study.
One difference between the miR-20a targeting presented in Figure 2 and the Zhang et al. study is the cell line in which the assays were conducted. We therefore repeated our reporter assays in HeLa cells, as the prior study had used, but again found no seed-independent CDS repression (Figure 4A). To check whether the difference between our observations and those of Zhang et al. might be due to the seed region mismatch bases in our DAPK3-derived miR-20a reporter being different from those of Zhang et al., we created an additional reporter with identical sequence to that of Zhang et al., but we found it too was not inhibited by miR-20a (DAPK3, Figure 4B). However, restoring base pairing in the seed region resulted in inhibition of the reporter (CDS3, Figure 4B). A reporter with near perfect complementarity to miR-194 was inhibited by miR-194 but not miR-20a. Similarly, miR-20a but not miR-194 inhibited a miR-20a reporter, confirming the specificity of these assays (Figure 4B, Supplemental Figure 3A). To compare further with the Zhang study, four let-7b target reporters that were previously reported to be strongly suppressed upon let-7b transfection were also cloned into the Renilla luciferase CDS (Figure 4C). Again, we report no equivalent finding, though let-7b itself was functional as it effectively repressed a complementary reporter (Supplemental Figure 3B). Collectively, these results all lead to the same conclusions: for miRNA binding in the CDS to be repressive, seed region base pairing is essential, as is extensive base pairing beyond the seed; however, specific base pairing of the very 3′ end is not essential.
To check that the base pairing requirements we identified as being necessary for repression within the CDS were not restricted to the luciferase reporter system, we created a dual colour reporter system with constitutive mCherry and GFP expression in MDA-MB-231 cells and measured the effects of artificial miRNAs targeting the mCherry CDS. An advantage of this system is that the effect of the transfected miRNAs is measured in every individual cell by flow cytometry, giving thousands of data points per transfection for both the targeted mCherry and the non-targeted GFP control. We assessed the effects of miRNA mimics targeting three different regions in the CDS of mCherry (Figure 5A). As expected, none of these mCherry-targeting miRNA mimics (called C-miRs) affected GFP expression. Two of the miR mimics (C-miR2 and C-miR3) strongly reduced mCherry expression, further demonstrating the potential for highly complementary miRNAs to target CDS regions (Figure 5B).
To assess the role of miRNA 3′ end binding in this reporter context, we disrupted base pairing of the 3′ end of C-miR2. C-miR2-20-21, with two bases mismatched at the 3′ end, and C-miR2-19-21, with three bases mismatched, had similar effects, reducing the efficacy of the miRNA but not eliminating miRNA function (Figure 5C). To indicate whether the reduced efficacy was due to reduced binding or to the presence of single-stranded bases at the end of the miRNA, we made a longer version of the miRNA that still had three unpaired bases at the 3′ end, but retained 8 of the 9 base pairs in the 3′ half of the miRNA (C-miR2-21-23 in Figure 5C). This miRNA was more effective than the shorter C-miR2-19-21, which also has three mismatched bases at the 3′ end, indicating that the stability of the duplex is a major criterion for efficacy, rather than the presence or absence of unpaired bases at the 3′ end of the miRNA. Moreover, in the context of complete base pairing of the miRNA across the central region (which allows direct target cleavage by Ago2), the presence or absence of two or three unpaired bases at the 3′ end was also of no consequence (Figure 5D), again indicating that base pairing per se at the 3′ end is not necessary for productive interaction in CDS MREs.

CDS-targeting miRNAs promote mRNA degradation
To investigate whether the inhibitory effect of the miRNAs targeting the CDS was primarily through mRNA destabilisation or inhibition of translation, we compared the effect of the mCherry-targeting miR mimics on the mCherry mRNA and protein levels. For all of the miR mimics tested, the effect on protein level was closely matched by the effect on mRNA level (Figure 5E). Thus, the predominant effect of the miRNA targeting was on mRNA stability, with little additional effect on translation efficiency. This was the case both when the 3′ end of the miRNA mimic was unpaired (C-miR2-20-21, C-miR2-19-21, C-miR2-21-23, siR2-20-21 and siR2-19-21 in Figure 5E) and when the 3′ end was base paired (C-miR2, C-miR3 and siR2 in Figure 5E), indicating that the translational mechanism of repression by CDS-targeting miRNAs reported by Zhang et al. (31) did not have a role in any of these instances.
Although the mismatch in the central region of miRNAs is expected to reduce direct cleavage of target mRNA, we wished to assess the contribution of direct cleavage to the mRNA level. The mismatch in the central region of C-miR2, expected to affect cleavage activity of Ago2 but not miRNA binding (43), reduced the inhibitory effect compared to a fully complementary siRNA (Figure 6A). Single base mismatches at position 12, 11 or 10 of the miRNA (C-miR2, C-miR2/11 and C-miR2/10) all had a similar effect, reducing the mCherry level to approximately one third of the level in control cells (Figure 6A). Increasing the size of the bulge in the central region to 2 bases (C-miR2-11-12) severely reduced efficacy, while increasing the size to 4 bases (C-miR2-10-13) eliminated activity of the miRNA (Figure 6B). Similar results were found in the luciferase system when the bulge in the miR-194 interaction was increased (Figure 6C).
Since these data are consistent with the prime mode of inhibition by the miRNAs being AGO2-mediated direct cleavage, we performed primer extension assays to detect mCherry mRNA that was cleaved at the midpoint of miRNA binding. We found that a primer extension product of the size expected from direct cleavage was present at a level that correlated with the extent of inhibition of mCherry expression, and correlated inversely with the level of uncleaved mRNA, indicated by the full length primer extension product (Figure 6D, Supplemental Figure S4). Taken together, these data suggest that in the context of the extensively base-paired interactions that are required for CDS-mediated inhibition, direct cleavage of the target is prominent. Non-cleavage mechanisms such as deadenylation and translational suppression are likely to play lesser roles.

CDS-mediated targeting of endogenous genes
Our reporter gene experiments indicated that miRNAs targeting CDS regions can be inhibitory so long as there is extensive complementarity and limited bulge size in the central section of the miRNA. To assess whether endogenous miRNAs may target endogenous mRNAs in this manner, we first bioinformatically searched for candidates among human miRNAs and mRNA CDS regions, identifying dozens of candidates with full complementarity within the seed and cleavage region (nucleotides 2-12) and with no more than 2 mismatches throughout the remaining binding interface. Moreover, hundreds of candidate targets are present when 3 mismatches are permitted, with numbers increasing a further 10-fold when the requirement for perfect binding within the three 3′-terminal nucleotides is also removed (Supplemental Table S1). To identify candidates for validation experiments we searched the extensive mouse keratinocyte AGO-CLASH data of Hoefert et al. (36) to identify candidate in vivo miRNA-mRNA (CDS) interactions and focused on those with extensive interaction interfaces that are conserved in sequence in humans. In each case that we selected, AGO-CLASH showed binding in the CDS but not the 3′UTR of the respective miRNAs. Based on antibody availability we chose a number of candidates to test by western blotting after the transfection of cells with the respective targeting miRNAs, but we found little to no repression of the target in most cases. We did observe some repression of MET (by miR-25-3p), NOTCH2 (by miR-221/222-3p) and RTN4 (by miR-320a-3p), but in each of these cases it was primarily a minor isoform of the protein that was affected (Figure 7A, B, Supplemental Figure 4). MET and NOTCH2 are processed into more abundant lower weight forms by protein cleavage, while different isoforms of RTN4 arise from alternative splicing (Figure 7C). Consistent with previous data (Figure 5E), the repression of these minor isoforms was observable at both the protein and RNA level (Figure 7D). These data, along with our extensive reporter approach, indicate that miRNAs can exert effects through coding regions, but only if there is extensive base pairing to the target which includes full complementarity with the miRNA seed. We suggest these strict requirements severely limit the impact of CDS sites in all but the most extreme of cases.
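The candidate search criteria described above (full complementarity across miRNA nucleotides 2-12 and a capped number of mismatches elsewhere) can be sketched as a simple ungapped scan. This is an illustration under stated assumptions, not the authors' pipeline: sites are taken to be the same length as the miRNA, only Watson-Crick pairs count as matches (G:U wobbles are treated as mismatches), nucleotide 1 is counted with the remainder of the interface, and the miRNA and CDS sequences are arbitrary examples.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def pairing(mirna, site):
    """Watson-Crick pairing per miRNA nucleotide (5'->3') against an equal-length site (5'->3')."""
    if len(mirna) != len(site):
        raise ValueError("miRNA and site must be the same length")
    return [COMPLEMENT[m] == s for m, s in zip(mirna, reversed(site))]


def is_candidate(mirna, site, max_mismatches=2):
    """True if nucleotides 2-12 are fully paired and the rest has few enough mismatches."""
    paired = pairing(mirna, site)
    core = paired[1:12]              # miRNA nucleotides 2-12 (seed and cleavage region)
    rest = paired[:1] + paired[12:]  # nucleotide 1 and nucleotides 13 onwards
    return all(core) and rest.count(False) <= max_mismatches


def scan(mirna, cds, max_mismatches=2):
    """0-based start positions of candidate sites in a CDS sequence (RNA alphabet)."""
    n = len(mirna)
    return [
        i for i in range(len(cds) - n + 1)
        if is_candidate(mirna, cds[i:i + n], max_mismatches)
    ]


# Arbitrary example sequences, for illustration only.
mirna = "UAGCUUAUCAGACUGAUGUUGA"
perfect_site = reverse_complement(mirna)
cds = "AUGGCC" + perfect_site + "GGCUAA"
hits = scan(mirna, cds)
print(hits)
```

Raising `max_mismatches` to 3 corresponds to the relaxed search mentioned in the text. A fuller implementation would also allow bulges and score duplex stability, which this ungapped comparison does not attempt.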

DISCUSSION
The de v elopment of high-throughput methodologies to profile miRNA binding has revolutionized the field.Techniques such as A GO HITS-CLIP ( 40 ), A GO Par-CLIP ( 44 ), AGO-CLASH ( 41 ) and AGO CLEAR-CLIP ( 42 ) isolate AGO-containing complexes from cells and enable the identification of binding sites en masse .AGO 'bind-and-seq' assesses all potential binding sites of synthesized oligonucleotides in vitro ( 14 ).These techniques consistently demonstrate an abundance of miRN A interaction, not onl y across 3 UTRs where miRNAs are well known to function, but also across coding regions and e v en introns ( 30 , 36 , 40-42 , 45 , 46 ).This suggests miRNAs may impact genes more frequently than is currently appreciated and / or may target  genes that are often ignored by 3 UTR-centric target prediction algorithms.
The mere identification of a binding site howe v er does not necessarily indicate function ( 47 ).This is because AGOpulldown approaches are capable of capturing transient interactions between miRNAs and their targets and e v en if an interaction is stable, the stoichiometry between miRNA and target might be such that the interaction is of little functional consequence.Even so, the observation remains: miRN As frequentl y interact within coding regions and multiple studies report examples where miRNAs binding within coding regions have an impact on gene expression and cell behaviour ( 27-29 , 31 ).For example, it was recently reported that the transfection of miRNAs could increase AGO-occupancy within the CDS and post-transcriptionally downregulate gene expression in an additi v e manner with increasing numbers of CDS sites ( 28 ).The specific importance of one CDS site was also recently demonstrated in the context of granulosa cell tumours, w here somatic m utation within the coding region of the tumour suppressor FOXL2, caused FOXL2 haploinsufficiency through the creation of a novel target site for miR-1236 ( 29 ).
Of particular interest was a report that miRNAs can bind to coding regions and abort translation in a manner that is dependent upon the 3′ end of the miRNA but not the seed (31). This may represent a pool of miRNA targets that have previously gone unrecognized, because the seed-less interaction will not be predicted by most algorithms, and the translation-only mechanism will make gene repression invisible in RNA-sequencing and qPCR experiments. Elements of this observation are echoed in other findings. For example, the enhancement of gene repression by the introduction of non-optimal codons suggests competition exists between RISC and the ribosome (15), and transfection experiments have revealed that sites located in both the 3′UTR and CDS are capable of inhibiting translation (27). Furthermore, RNAi in C. elegans functions at the translational level in addition to target cleavage, and generates stalled ribosome-mRNA complexes that are observable in the absence of the factors (SKI and PELOTA) that will otherwise clear them (17). We have sought to clarify whether miRNAs are able to repress genes by binding within coding regions and, if so, what the sequence requirements are for this to happen. One would anticipate these requirements to be more extensive in coding regions than 3′UTRs given the necessity of RISC to compete with transiting ribosomes.
By constructing multiple variants of MREs within reporter constructs, we confirmed that miRNAs are capable of repressing gene expression through sites located within coding regions if base pairing is extensive. However, in all examples tested we found targeting to be of the canonical, seed-dependent type. This is not strictly dependent upon perfect complementarity at the 3′ end, but is heavily dependent upon extensive binding across the rest of the interaction interface. This includes binding across the central region, where mutations that bias against direct target cleavage decrease the efficiency of repression, whilst mutations sufficient to abrogate cleavage eliminated the suppressive capacity of the miRNA altogether. Accordingly, miRNA-CDS interactions that are repressive cause a reduction in mRNA levels and the production of fragments with termini exactly coincident with the products of miRNA-directed, AGO-mediated cleavage. Although 3′ pairing is not a strict requirement, increasing the number of 3′ mismatches (Figures 3B,E, 4A, Supplemental Figures 2B,E) does generally reduce the degree of repression, but this may simply be due to an overall reduction in the strength of target binding as opposed to any special significance of the 3′ terminus. This conclusion is supported by mismatched nucleotides at the 3′ terminus being compensated for by the presence of a longer miRNA:target interaction interface (compare 'c-miR2 21-23' with 'c-miR2 19-21', Figure 5C). Of note, suppression is efficient even when the MRE is situated close to the start codon ('RL1' in Figure 1B). This indicates that no-go decay (48) is not associated with repression, because the distance of the site from the start codon is insufficient to allow the requisite build-up of stalled ribosomes that leads to transcript turnover.
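The pairing categories discussed above (seed, central region, 3′ terminus) can be made concrete with a small sketch. This is not the analysis pipeline used in this study; it is a minimal illustration that assumes an ungapped, antiparallel alignment with Watson-Crick pairs only (no G:U wobble, no bulges). The miRNA sequence is hypothetical, the function names are invented for illustration, and the choice of nucleotides 9-12 as the "central region" and of the last three nucleotides as the "3′ terminus" are assumptions made for the example.

```python
# Minimal sketch: classify miRNA:MRE pairing under an ungapped,
# Watson-Crick-only model (assumption; wobble pairs and bulges are ignored).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def to_rna(seq):
    """Upper-case the sequence and convert DNA T to RNA U."""
    return seq.upper().replace("T", "U")


def perfect_site(mirna):
    """Return the fully complementary target site, written 5'->3'."""
    return "".join(COMPLEMENT[b] for b in reversed(to_rna(mirna)))


def pairing_profile(mirna, site):
    """One boolean per miRNA nucleotide (5'->3'): True if it is paired.

    miRNA nucleotide 1 is placed opposite the last nucleotide of the site,
    because the two strands are antiparallel.
    """
    mirna, site = to_rna(mirna), to_rna(site)
    if len(mirna) != len(site):
        raise ValueError("the ungapped model needs equal-length sequences")
    return [COMPLEMENT[m] == s for m, s in zip(mirna, reversed(site))]


def mismatch_at(mirna, site, positions):
    """Mutate the site opposite the given 1-based miRNA positions.

    The site base is replaced by the miRNA base itself, which can never be
    its own Watson-Crick complement, so a mismatch is guaranteed.
    """
    mirna = to_rna(mirna)
    bases = list(to_rna(site))
    for p in positions:
        bases[len(bases) - p] = mirna[p - 1]
    return "".join(bases)


def summarize(profile):
    """Summarise a pairing profile by region (region bounds are assumptions)."""
    return {
        "seed_paired": all(profile[1:8]),        # miRNA nucleotides 2-8
        "central_paired": all(profile[8:12]),    # nucleotides 9-12 (assumed)
        "three_prime_mismatches": profile[-3:].count(False),  # last 3 nt
        "total_paired": sum(profile),
    }


mirna = "UACGGAUCCAGUUCGAACGUAC"  # hypothetical 22-nt miRNA, not a real one
site = perfect_site(mirna)
seed_mutant = mismatch_at(mirna, site, [4])            # disrupt the seed
tail_mutant = mismatch_at(mirna, site, [20, 21, 22])   # disrupt the 3' end

for name, s in [("perfect", site), ("seed", seed_mutant), ("tail", tail_mutant)]:
    print(name, summarize(pairing_profile(mirna, s)))
```

In this toy model, the tail mutant corresponds to the class of sites described here as still repressive (seed and central pairing intact, 3′ terminus unpaired), whereas the seed mutant corresponds to the class found to be ineffective; the code only labels the pairing pattern and makes no prediction of repression strength.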
Whilst we are not able to discern whether additional, non-cleavage mechanisms also contribute to repression, our findings of reduced levels of target transcripts and of seed-dependent, 3′ terminus-independent binding directly contradict what was previously reported (31). They are, however, supported by the remarkable consistency of our results between different MRE-reporters, using multiple microRNAs, multiple cell lines and two entirely separate reporter systems.
In spite of extensive binding requirements, hundreds, if not thousands, of candidate interactions between miRNAs and coding regions are either possible (Supplemental Table S1) or experimentally demonstrated (mouse keratinocyte AGO-CLASH data (36), Supplemental Tables S2 and S3). However, we find that if target repression is to be mediated through CDS sites, extensive complementarity along the length of the miRNA:target interface is required. Such interactions are exceedingly rare within CLASH data (Figure 1C), and no trends are observable to suggest that CDS interactions are generally more extensive than 3′UTR sites (Supplemental Figure 5a), nor are predicted CDS binding sites conserved beyond the constraints imposed by the requirements of protein coding (Supplemental Figure 5b). We examined a number of miRNA:mRNA (CDS) candidates identified by mouse CLASH data that retained conserved sequences in humans, but only found modest levels of repression for three out of ten endogenous genes, and in each case the effect of the miRNA was only apparent for the less abundant isoform (Figure 7). It is possible that for one of these (RTN4), structural RNA differences could explain the differential responsiveness to the targeting miRNA, as different exons are present between isoforms even if the putative MRE is present in all isoforms. However, this cannot be the case for MET and NOTCH2, where isoforms arise from post-translational processing, yet miRNA-mediated suppression could only be detected for the lowly abundant form. Taken together with our other data, the most likely explanation is that the modest effects of an imperfectly paired miRNA within a CDS are simply swamped by high levels of target expression. If so, CDS targeting may not only require unusually extensive binding, but may also be more specifically relevant for modestly expressed genes. It is noteworthy that a modest degree of repression was observed for miR-221/222-mediated targeting of NOTCH despite seed mispairing (there are 2 extra nucleotides in the seed-binding region of NOTCH2 mRNA that otherwise pairs with miR-221 and miR-222CUC). Whether this is the result of repression being mediated via indirect means, or whether such a bulge does not substantially reduce the affinity of binding in this case (which is still sufficient for robust interaction), is unclear.
In agreement with both Zhang et al. and other reports (27-29), our data demonstrate that suppression by MREs located within the CDS can occur, although in contrast to Zhang et al. we find repression requires extensive complementarity that involves the seed region and leads to target cleavage. Although we cannot rule out exceptions to this, we conclude that the extensive binding that is required for efficient MRE-CDS function will likely limit its influence to only a small number of modestly expressed genes.

Figure 1. Locations of miRNA interaction within mRNAs and of base pairing within the miRNAs. (A) Locations of miRNA interaction as determined by HITS-CLIP. The references of the relevant published studies from left to right are as follows (36, 40, 42, 45, 46). (B) Effects of artificial miRNAs targeting three different regions in the Renilla luciferase reporter gene. The locations of binding within the mRNA are shown schematically and base pairings are shown with the miRNA seed regions in red. Luciferase reporter assays were performed in MDA-MB-231 cells. Quantitative data are based on three biological replicates, with each experiment containing 6 technical replicates. Data are expressed as mean ± s.e.m. Statistical significance (* P < 0.05, ** P < 0.01, *** P < 0.001 and **** P < 0.0001) was determined by two-tailed Student's t test. (C) Patterns of base pairings for miRNAs binding in 3′UTRs (left panels) and CDS regions (right panels) using data from the CLEAR-CLIP study of Hoefert et al. (36). Paired bases are in dark blue. Alignments are from the 5′ end of the miRNA in the upper panels and from the 3′ end of the miRNA in the lower panels.

Figure 2. Base-pairing interactions required in the CDS or 3′UTR for miR-20a to effectively target a luciferase reporter gene. (A) Binding models indicating base pairing interactions between miR-20a (bottom) and the miRNA-response element (MRE, top) that is cloned within the luciferase CDS or 3′UTR. The miRNA seed region is shown in red. (B, C) Results of Renilla luciferase reporter assays after co-transfection of miR-20a mimic in MDA-MB-231 cells. Data are expressed as mean ± s.e.m., n = 18. Statistical significance (* P < 0.05, ** P < 0.01, *** P < 0.001 and **** P < 0.0001) was determined by two-tailed Student's t test.

Figure 3. Base-pairing interactions required in the CDS or 3′UTR for miR-342 and miR-200a to effectively target the CDS of a luciferase reporter gene. (A, D) Binding models indicating base pairing interactions between the miRNA and the miRNA-response element cloned within the luciferase CDS. The miRNA seed region is shown in red. (B, E) Results of Renilla luciferase reporter assays after co-transfection of (B) miR-342 or (E) miR-200a mimic in MDA-MB-231 cells. (C, F) Results of Renilla luciferase reporter assays after co-transfection of miRNA inhibitors in MCF7 cells. Data are expressed as mean ± s.e.m., n = 18. Statistical significance (* P < 0.05, ** P < 0.01, *** P < 0.001 and **** P < 0.0001) was determined by two-tailed Student's t test.

Figure 4. Non-canonical targeting of CDS-luciferase reporter genes is ineffective. (A-C) Binding models indicating base pairing interactions between the miRNA and the miRNA-response element cloned within the luciferase CDS are shown. The miRNA seed region is shown in red. Results of Renilla luciferase reporter assays after co-transfection of (A) miR-20a mimic, (B) miR-20a or miR-194 mimic or (C) let-7b mimic with the reporters indicated. In (B), the DAPK3 reporter is an exact nucleotide match of the reporter used in Zhang et al. (31). The CDS3 constructs are relevant for miR-20a or miR-194 as indicated. In (C), potential let-7b response elements derived from the genes shown again match those reported by Zhang et al. All transfections were performed in HeLa cells. Data are expressed as mean ± s.e.m. Statistical significance (* P < 0.05, ** P < 0.01, *** P < 0.001 and **** P < 0.0001) was determined by two-tailed Student's t test.

Figure 5. Effects of artificial miRNAs targeting the CDS of an mCherry reporter. (A) Predicted base pairing interactions of artificial miRNAs targeting the mCherry CDS in three different locations. (B) MDA-MB-231 cells stably expressing mCherry and GFP were transiently transfected with the indicated miRNAs and the mCherry and GFP protein levels quantitated by flow cytometry. The vertical black line is located at the mean fluorescence value of cells transfected with control miR (top), which does not target either reporter gene. Histograms moving to the left indicate repression of the reporter gene (which results in reduced fluorescence in the cell). Mean levels of fluorescence are indicated. (C) Effects of disruption of base pairing at the 3′ end of the artificial miRNA. Peak fluorescence values are shown for mCherry transfected with each miRNA. (D) Effects of disruption of base pairing at the 3′ end of otherwise perfectly complementary miRNA mimics. (E) Relative mCherry mRNA and protein levels in cells transfected with the indicated artificial miRNAs. Data are expressed as mean ± s.e.m., n = 3. Statistical significance (* P < 0.05, ** P < 0.01, *** P < 0.001 and **** P < 0.0001) was determined by two-tailed Student's t test.

Figure 6. Effects of base pairing disruptions in the centre of the miRNA and primer extension mapping of the cleavage site. (A) Effects of single base bulges at miRNA base 12, 11 and 10. (B) Effects of larger bulges in the miRNA central region. (C) Effects of miR-20a on Renilla luciferase harbouring a CDS site for miR-20a with mismatches in the central region. Data are expressed as mean ± s.e.m., n = 18. **** P < 0.0001 as determined by two-tailed Student's t test. (D) Primer extension assays performed on RNA extracted from MDA-MB-231-eGFP-mCherry cells transfected with the indicated miRNAs (previously featured in Figures 5C, 6A and B). Locations of products consistent with full length mCherry transcript and with miR-directed AGO2-mediated cleavage are indicated. RPS12 primer extension is shown as a loading control.

Figure 7. There is minimal repression of endogenous genes by microRNAs targeting coding regions. (A) Negative control or a relevant targeting miRNA (10 nM unless otherwise specified) were co-transfected into MDA-MB-231 cells (unless otherwise specified) and western blotting performed to probe expression of the genes indicated. (B) Binding models indicating base pairing interactions between the miRNA and the miRNA-response element cloned within the luciferase CDS are shown. The miRNA seed region is shown in red. (C) Alternative splicing is responsible for RTN4A and B isoforms. (Pre-Met/Met and NOTCH2/NICD (Notch intracellular domain) are produced via proteolytic processing.) (D) Quantitation of mRNA levels normalized to the mean of GAPDH and RPL32 are shown. Data are expressed as mean ± s.e.m. Statistical significance (* P < 0.05, ** P < 0.01, *** P < 0.001 and **** P < 0.0001) was determined by two-tailed Student's t test.