Epigenetic CpG duplex marks probed by an evolved DNA reader via a well-tempered conformational plasticity

Abstract 5-methylcytosine (mC) and its TET-oxidized derivatives exist in CpG dyads of mammalian DNA and regulate cell fate, but how their individual combinations in the two strands of a CpG act as distinct regulatory signals is poorly understood. Readers that selectively recognize such novel ‘CpG duplex marks’ could be versatile tools for studying their biological functions, but their design represents an unprecedented selectivity challenge. By mutational studies, NMR relaxation, and MD simulations, we here show that the selectivity of the first designer reader for an oxidized CpG duplex mark hinges on precisely tempered conformational plasticity of the scaffold adopted during directed evolution. Our observations reveal the critical aspect of defined motional features in this novel reader for affinity and specificity in the DNA/protein interaction, providing unexpected prospects for further design progress in this novel area of DNA recognition.


INTRODUCTION
Cellular differentiation to stable, tissue-specific phenotypes despite identical genetic material is a prerequisite for the development of multicellular organisms. This is achieved by coordinated regulation of gene expression via chromatin modification, such as the epigenetic modification of DNA nucleobases. In mammals, 5-methylation of cytosine by DNA methyltransferases (DNMTs) plays essential roles in differentiation, development, X-chromosome inactivation, and genomic imprinting; consequently, aberrant DNA methylation has been linked to multiple diseases, including cancer (1, 2). Enzymatic oxidation of mC (Figure 1A) to 5-hydroxymethylcytosine (hmC), 5-formylcytosine (fC) and 5-carboxycytosine (caC) is catalyzed by Ten-Eleven-Translocation (TET) dioxygenases and results in particularly high levels of oxidized mCs in embryonic stem cells and the brain (3). These oxidized mC derivatives have been shown to exert regulatory functions in multiple contexts (4-6). Mammalian cytosine modification by DNMTs and TETs occurs predominantly in palindromic CpG dyads and can theoretically give rise to 15 different symmetric and asymmetric combinations of cytosine 5-modifications across the two CpG strands (7). However, despite the established general roles of oxidized mCs as chromatin regulators, it is poorly understood how their individual combinations in CpGs act as distinct regulatory signals, for example, by differentially modulating interactions with the large number of double-stranded DNA-binding chromatin proteins. Reader proteins that selectively interact with novel CpG duplex marks could serve as fundamental tools for studying their biological functions (8-11), but their design opens a new aspect in the field of DNA recognition that poses formidable selectivity challenges. We have recently reported the first designer reader for such a TET-associated CpG duplex mark.
This protein has been evolved from a methyl-CpG-binding domain (MBD) (12) and selectively recognizes the asymmetric combination hmC/mC in the context of all fifteen possible CpG duplex marks (13). The 'core' MBD family proteins share a conserved domain of 70-80 residues and include the proteins MBD1-4 and methyl-CpG-binding protein 2 (MeCP2) (12). The latter represents a largely disordered DNA-binding protein for which loss-of-function mutations are associated with the neurological developmental disease Rett syndrome (RTT) (14, 15). Its high-affinity interaction with mC/mC DNA hinges on two Arg fingers that both form two H-bonds to the guanosine of the CpG (Figure 1C) (16). Previous work on MBDs has shown that the stability of the three-dimensional fold is exceptionally susceptible to simple point mutations in the center of the hydrophobic core (17). Directed evolution of MBDs has recently been suggested as a viable path to generating reader proteins that specifically target previously inaccessible combinations of epigenetic DNA modifications in CpG dyads, with the prospect of providing a platform for their genome-wide identification and mapping (13, 18).
Directed-evolution experiments for selecting hmC/mC readers from an MeCP2 mutant library in a previous study from our labs revealed the replacement of a hydrophobic core residue (Val122) with Ala as a critical mutation, emerging in addition to a modified DNA binding interface (K109T/S134N, Figure 2A) (13). The V122A mutation in the K109T/V122A/S134N (TAN) triple mutant established high hmC/mC DNA-binding affinity (∼10 nM) and specificity in electrophoretic mobility shift assays (EMSA). In contrast, a second mutant selected in those directed-evolution experiments, in which Val122 was replaced with Cys (TCN), exhibited more promiscuous binding of both mC/mC and hmC/mC, even though it differed only by one nonpolar Cys residue at a core position that does not interact with DNA in the wt protein. This surprising role of a core residue in designed CpG duplex readers as a potent determinant of selective target recognition reveals a fundamental lack of molecular-level understanding and exposes a considerable pitfall of structure-based approaches for the design of this novel class of epigenetic reader proteins.
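The figure of 15 CpG duplex marks quoted in this introduction can be reproduced by simple counting. The sketch below assumes five cytosine states per strand (C, mC, hmC, fC and caC, i.e. including unmodified cytosine) and treats the palindromic dyad as an unordered pair of strands. Both the inclusion of unmodified C and the labels used are assumptions made here for illustration and are not taken from the cited enumeration (7).

```python
from itertools import combinations_with_replacement

# Assumed cytosine states per strand (illustrative labels).
CYTOSINE_STATES = ["C", "mC", "hmC", "fC", "caC"]


def cpg_duplex_marks(states=CYTOSINE_STATES):
    """Enumerate unordered strand combinations of a palindromic CpG dyad."""
    return list(combinations_with_replacement(states, 2))


marks = cpg_duplex_marks()
symmetric = [m for m in marks if m[0] == m[1]]
asymmetric = [m for m in marks if m[0] != m[1]]

# Total, symmetric and asymmetric counts.
print(len(marks), len(symmetric), len(asymmetric))
```

Under these assumptions, the hmC/mC target of the TAN reader is one of the ten asymmetric combinations (listed here as the pair `("mC", "hmC")`).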

Protein expression for EMSA studies
MeCP2 variants with mutated residue 122 were generated by QuikChange site-directed mutagenesis and were expressed and purified as described earlier (13). EMSAs were conducted as described (13). See the SI for details of these experiments (extended Materials and Methods and Supplementary Figure S1), protein and DNA constructs (Supplementary Table S1A and B), and details of fitting and error estimation (Supplementary Table S2).

Protein expression and sample preparation for NMR spectroscopy
The MBD proteins were expressed and purified largely as described earlier (13) and as discussed in more detail in the SI. In brief, cells were grown in ¹³C, ¹⁵N-labeled M9 medium, and the proteins were purified using Ni affinity chromatography. After TEV cleavage and a second Ni affinity column, pure protein solutions for NMR experiments were obtained via size-exclusion chromatography and concentration. For protein:DNA complex samples, the proteins were mixed in a 1:1 ratio with DNA (a 12-mer with 4 nucleotides to form a hairpin, see Supplementary Table S1B) carrying central hmC/mC CpG dyads.

NMR sample preparation and assignments
Purified, uniformly ¹³C/¹⁵N-labeled wild-type protein (KVS) and its double (TVN) and triple (TAN) mutants were prepared in a mixed solvent of 90% H₂O and 10% ²H₂O (50 mM sodium phosphate, 50 mM NaCl, pH 6). All NMR experiments were carried out with protein concentrations of ∼0.5 mM on a Bruker Avance 800 MHz NMR spectrometer using a triple-resonance cryoprobe. The near-complete ¹H, ¹³C and ¹⁵N resonance assignments of the MeCP2 MBD mutant TAN and its complex with hmC/mC DNA were deposited in the BMRB under the accession numbers 51020 and 34745, respectively. The chemical-shift perturbations were measured as $[(\Delta H)^2 + (\Delta N/10)^2]^{1/2}$, where ΔH and ΔN signify the changes in ¹Hᴺ and ¹⁵N chemical shifts, respectively. A suite of 3D double- and triple-resonance NMR experiments was performed for sequence-specific ¹H, ¹³C, and ¹⁵N backbone resonance assignments largely as discussed earlier (19, 20). In addition, we recorded 3D HCCH-TOCSY, [¹⁵N, ¹H]-NOESY-HSQC, as well as aliphatic and aromatic [¹³C, ¹H]-NOESY-HSQCs for almost complete assignment of ¹H, ¹³C and ¹⁵N side-chain resonances, dihedral-angle restraints, and NOE-derived distance restraints for 3D structure calculation of the protein.
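The combined chemical-shift perturbation defined in this section (the square root of the squared amide ¹H shift change plus the squared ¹⁵N shift change scaled by 1/10) can be sketched as a small helper. The function name, argument names and the example shift changes are hypothetical and serve only to illustrate the arithmetic.

```python
import math


def chemical_shift_perturbation(delta_h_ppm, delta_n_ppm, n_scale=10.0):
    """Combined amide CSP: sqrt(dH^2 + (dN / n_scale)^2), all values in ppm."""
    return math.sqrt(delta_h_ppm ** 2 + (delta_n_ppm / n_scale) ** 2)


# Hypothetical shift changes for one residue (not measured values).
csp = chemical_shift_perturbation(delta_h_ppm=0.03, delta_n_ppm=0.4)
print(round(csp, 3))
```

The factor of 10 puts ¹⁵N shift changes on a scale comparable to ¹H shift changes, so that neither nucleus dominates the combined value.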

¹⁵N relaxation experiments
Backbone ¹⁵N relaxation measurements were acquired at 800 MHz and generally 291 K as described earlier (23). T₁ measurements employed eight recovery delays between 50 and 1100 ms. ¹⁵N T₂ measurements were carried out using a CPMG pulse sequence (24) with relaxation delays of 5, 20, 35, 50, 70 and 90 ms. Steady-state [¹⁵N, ¹H] heteronuclear NOE measurements were carried out with and without proton saturation during the relaxation delay, using either 5 s of relaxation delay and 3 s of proton saturation or 8 s of relaxation delay only, respectively. Constant-time ¹⁵N Carr-Purcell-Meiboom-Gill (CPMG) relaxation dispersion experiments (25) were measured at 291 K, using a constant-time delay of 40 ms and nine variable CPMG frequencies (ν_CPMG) ranging from 50 to 2000 Hz, in addition to a reference spectrum without delay (τ_CPMG = 0). For each data set, the frequencies 750 and 50 Hz were repeated for estimation of errors in R₂,eff. Data for each residue (Supplementary Table S4) were analyzed individually using the NESSY software package (26) to obtain the kinetic parameters of interest, corresponding to either a no-exchange model or a two-site exchange process, depending on the corrected Akaike information criterion (see the SI for details).
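The per-residue choice between a no-exchange and a two-site exchange model via the corrected Akaike information criterion can be illustrated as follows. This is a minimal sketch assuming Gaussian errors, so that the χ² of the fit stands in for −2 ln L; it is not the NESSY implementation, and the χ² values and parameter counts below are hypothetical.

```python
def aicc(chi2, n_params, n_points):
    """Corrected Akaike information criterion for a chi-square fit."""
    if n_points - n_params - 1 <= 0:
        raise ValueError("AICc needs n_points > n_params + 1")
    aic = chi2 + 2 * n_params
    return aic + (2 * n_params * (n_params + 1)) / (n_points - n_params - 1)


def select_model(fits, n_points):
    """Return the model name with the lowest AICc.

    fits maps a model name to a (chi2, n_params) tuple.
    """
    return min(fits, key=lambda name: aicc(fits[name][0], fits[name][1], n_points))


# Hypothetical fit results for one residue measured at nine CPMG frequencies.
fits = {"no_exchange": (40.0, 1), "two_site": (6.0, 3)}
best = select_model(fits, n_points=9)
print(best)
```

The correction term penalizes the extra parameters of the exchange model more strongly when few dispersion points are available, which is why a flat profile is assigned to the no-exchange model even if the two-site fit has a slightly lower χ².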

NMR structure calculations
The solution structures of the MBD triple mutant and its complex were determined with ARIA (29) using the following NMR constraints: (i) dihedral-angle constraints derived from TALOS-N (30) using individual ¹Hᴺ, ¹⁵N, ¹³Cα, ¹³Cβ and ¹³CO chemical-shift values as inputs (a total of 108 and 114 φ and ψ dihedral-angle constraints were used for apo TAN and the TAN:hmC/mC complex, respectively) and (ii) cross peaks in NOESY spectra, identified and automatically assigned using ARIA 2.3 (29), with upper distance bounds set to 6.0 Å. The NMR structural statistics for the ensembles of 10 refined conformers of the MBD in apo and complex form are summarized in Supplementary Table S3. The 10 lowest-energy conformers were further refined in explicit water using NMR-derived distance restraints, angle restraints and, for the apo form, 25 residual dipolar couplings (RDCs) (31), with an alignment tensor calculated via the PALES (32) software. The program PSVS-1.4 was used to validate the quality of the selected ensemble of lowest-energy structures of TAN in apo and complex form. The 3D coordinates thus obtained were deposited in the PDB (PDB IDs: 8AJR and 8ALQ). The structure figures were prepared using PyMOL and UCSF Chimera.

Molecular dynamics simulations
All MD simulations were performed using the GROMACS simulation package. Six different systems were modelled from the previously reported X-ray coordinates (PDB ID: 3C2I) (16), namely apo wt, apo double mutant (TVN), and apo triple mutant (TAN), as well as the wt:mC/mC, TVN:hmC/mC and TAN:hmC/mC complexes. The ff99bsc0 Amber force field (33, 34) was used for describing the proteins and nucleic acids. Each system was solvated using TIP3P water. Then, energy minimization, thermalization, and step-wise equilibrations were carried out. Subsequently, a total of five production MD simulations for each system were initiated using different initial velocities. Each MD trajectory was 500 ns long, thus yielding a total simulation time of 2.5 µs for each of the six systems. The equations of motion were integrated using the leap-frog algorithm with a time step of 2 fs. The temperature and pressure were kept constant at 300 K and 1 bar using the velocity-rescaling thermostat (35) and the Berendsen barostat (36), respectively. More details on the MD simulations are provided in the SI.

RESULTS AND DISCUSSION
To study the peculiar influence of hydrophobic-core residues at the functional level, we generated additional mutants with different smaller (Gly) or larger (Val, Ile, Leu) residues at position 122, maintaining the mutations required in the DNA binding interface (Supplementary Table S1A), in addition to TAN and TCN. We expressed the respective MBDs (residues 87-190) from human MeCP2 and measured dissociation constants (K_D) for mC/mC- and hmC/mC-containing dsDNA by electrophoretic mobility shift assays (EMSA, Figure 2B, Supplementary Figure S1 and Supplementary Table S2). Only the selected TAN mutant exhibits a very high affinity (8 ± 2 nM) and selectivity for its new, hmC/mC-containing target. Strikingly, any deviation from this narrow steric space leads to a dramatic loss of selectivity, most pronounced for the TGN mutant. Importantly, both MBDs with the wild-type residue Val at position 122 (TVN and wt KVS) exhibit equally poor (∼100 nM) binding of hmC/mC, but differ drastically in their binding to the canonical wild-type target mC/mC, which is no longer bound by TVN. Notably, this clear requirement for a specific steric demand at position 122 is observed even though the residue lies in a secluded element that does not interact with DNA (Figure 2C).
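To give a feeling for what the quoted dissociation constants mean in an EMSA titration, the following sketch evaluates a simple 1:1 binding isotherm. It assumes that the protein is in large excess over the labeled DNA, so that the free protein concentration equals the total; the actual fitting model used for the EMSA data is described in the SI and may differ. The function name and the chosen protein concentration are hypothetical.

```python
def fraction_bound(protein_nM, kd_nM):
    """Fraction of DNA bound for 1:1 binding with protein in excess."""
    if protein_nM < 0 or kd_nM <= 0:
        raise ValueError("protein concentration must be >= 0 and K_D > 0")
    return protein_nM / (kd_nM + protein_nM)


# K_D values for hmC/mC DNA as quoted in the text: about 8 nM for TAN
# and about 100 nM for TVN and wt; 10 nM protein is an arbitrary example.
for name, kd in [("TAN", 8.0), ("TVN/wt", 100.0)]:
    print(name, round(fraction_bound(10.0, kd), 3))
```

At 10 nM protein, the isotherm predicts that a reader with K_D of 8 nM shifts more than half of the probe, whereas one with K_D of 100 nM shifts less than a tenth of it.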
To explore the molecular underpinnings of the modulating role of the protein core for selectivity of its interface, we conducted NMR and MD simulation studies. Wild-type MBD, the hmC/mC reader TAN, as well as the TVN mutant were overexpressed in doubly labeled ¹⁵N, ¹³C [...] To our surprise, at a temperature of 293 K or above, many protein resonances of the hmC/mC reader TAN reversibly broaden beyond detection, indicating the presence of alternate conformational states on the µs-to-ms timescale (Figure 3E). By contrast, no significant indication of temperature-dependent exchange broadening is observed over a wide range of temperatures in the wt protein or in TVN (which carries only the mutations directly involved in the DNA interactions), whose binding affinity to hmC/mC is almost 10-fold lower than that of TAN (Figure 2B and Supplementary Figure S1). Supplementary Figure S4 also shows the susceptibility of TAN amide shifts to temperature changes (δN/T and δH/T) as well as its correlation with wt behavior. Again, the strong temperature dependence of several residues in TAN deviates from wt behavior, the most strongly deviating residues being V145, the C-terminal residue of the α1 helix, and D121, next to the central mutation site. To assess the details of conformational rearrangements in the high-affinity TAN reader, we closely examined motion occurring on the ps-ns timescale as well as in the µs-ms regime, employing a large array of ¹⁵N relaxation, relaxation dispersion (RD), and chemical-exchange saturation transfer (CEST) data. [...] Figures S3 and S4, respectively. The distributions of hetNOE and R₁ rates, reporting on ps-ns timescale dynamics, confirm the architecture of the domain with respect to its expected mobile N-terminus, extended C-terminus, and loop L1. More interestingly, transverse relaxation (R₂) is, in addition, sensitive to slower motions and reflects conformational-exchange processes on the µs-ms timescale.
Whereas R₂ rates in wt simply mirror the fast-timescale mobility observed in R₁ and hetNOE, significant deviations in TAN reveal robust conformational exchange throughout the sequence. To assess the timescale of motion for the exchange, we carried out ¹⁵N constant-time CPMG relaxation dispersion experiments. A dispersive profile is the signature of conformational exchange on the µs-ms timescale between states with different chemical shifts. For wt, a small number of residues show modest conformational exchange (Figure 4A and Supplementary Figure S6), with global RD on a timescale of around 200 µs (k_ex of 5207 ± 356 s⁻¹). In the hmC/mC reader TAN, strikingly, these exchange contributions are fourfold slower and more pronounced than in the wild type, with a timescale of about 800 µs (k_ex of 1240 ± 10 s⁻¹) at 291 K, higher total R₂ rates of up to 60 s⁻¹, and substantial exchange contributions R_ex of up to more than 40 s⁻¹, spread widely across the structural elements in loop L1, the β1, β2 and β3 strands, and α1 residues. Figure 4A and B display R_ex mapped onto the structure and exemplary dispersion profiles (selecting the three mutation sites), respectively; Supplementary Figure S7 provides further dispersion profiles for TAN. RD data were fitted individually, assuming either a two-site exchange model or the absence of exchange, depending on the corrected Akaike information criterion. See Supplementary Figure S8 for an overview of the residues with significant exchange contributions in both wt and TAN. Interestingly, neither the V122 nor the A122 backbone sites show dispersion themselves, reflecting the role of the side chain as a lever for the dynamics, with the amide itself not being exposed to differential chemical environments (Figure 4A, B, center). Finally, the exchange dynamics cease upon DNA binding. (See details regarding complex formation below.)
Apparently, in the apo protein, an exchange occurs between different conformations, of which only one is relevant within the complex.
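The relation between the fitted exchange rates and the quoted timescales (roughly 200 µs for wt and 800 µs for TAN) is simply the inverse of k_ex. The sketch below shows this conversion and, purely for illustration, the shape of a dispersion profile in the fast-exchange limit (Luz-Meiboom expression). Whether the fast-exchange limit applies to TAN is not stated in the text, and the values of `r20` and `phi` are hypothetical; only the two k_ex values are taken from the text.

```python
import math


def exchange_timescale_us(k_ex_per_s):
    """Exchange timescale 1/k_ex in microseconds."""
    return 1e6 / k_ex_per_s


def r2eff_fast_exchange(nu_cpmg_hz, r20, phi, k_ex):
    """R2,eff in the fast-exchange limit (Luz-Meiboom).

    phi = p_a * p_b * d_omega^2 in rad^2 s^-2; r20 is the exchange-free rate.
    """
    x = k_ex / (4.0 * nu_cpmg_hz)
    return r20 + (phi / k_ex) * (1.0 - math.tanh(x) / x)


print(round(exchange_timescale_us(5207)))  # wt
print(round(exchange_timescale_us(1240)))  # TAN

# Hypothetical residue: r20 = 15 s^-1 and a maximal R_ex of 40 s^-1.
k_ex = 1240.0
phi = 40.0 * k_ex
profile = [r2eff_fast_exchange(nu, 15.0, phi, k_ex) for nu in (50.0, 500.0, 2000.0)]
print([round(v, 1) for v in profile])
```

In this limit, R₂,eff decays from r20 + phi/k_ex at low CPMG frequency towards r20 at high frequency, which is the dispersive behavior used above as the signature of exchange.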

The apo hmC / mC reader accesses alternate conformational states
To more closely assess the conformational states sampled by the apo hmC/mC reader during the exchange, we used chemical-exchange saturation transfer (CEST) (28). Here, a weak radiofrequency field is applied to capture chemical shifts of minor conformations for each amide site. We recorded ¹⁵N [...] Figure S10), provides an estimate for the activation energy of 9.5 kcal/mol, which is in line with the timescale of exchange observed in the RD data. Many residues with substantial discrepancies between ground- and excited-state chemical shifts are located in or near the binding loop and α1 and denote changes in particular for residues just before and after the first helix (Figure 4E-G). This agrees with the contradictory RDCs for this helix and hence the ambiguous relative orientation of the helical residues with respect to the β-sheet.
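To illustrate what an activation energy of 9.5 kcal/mol implies for the temperature dependence of the exchange, the sketch below evaluates the ratio of two rates under simple Arrhenius behavior with a temperature-independent prefactor. This is an assumption made here for illustration; how the activation energy was extracted from the data is described with the original figure and the SI. The temperature pair is an arbitrary example.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def arrhenius_rate_ratio(ea_kcal_mol, t_from_k, t_to_k):
    """Ratio k(t_to) / k(t_from) for Arrhenius kinetics with constant prefactor."""
    return math.exp(-ea_kcal_mol / R_KCAL * (1.0 / t_to_k - 1.0 / t_from_k))


# Activation energy from the text; 291 K is the measurement temperature,
# 298 K is an arbitrary comparison temperature.
ratio = arrhenius_rate_ratio(9.5, 291.0, 298.0)
print(round(ratio, 2))
```

Under this assumption, a warming of 7 K speeds up the exchange by somewhat less than a factor of 1.5.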
The absolute change in ¹⁵N shifts for the excited state tends to be downfield for helical residues, in particular at the beginning (N134) and end (V145) of α1, and upfield for β1, consistent with a temporary, partial release of the secondary structure at these sites (Figure 4F, G). Note that V145 was also the residue with the strongest temperature dependence (see above). Accordingly, the excited state represents a partly molten conformation, whose temporary adoption becomes possible due to altered interactions of the hydrophobic side chain in position 122 in the interface between the long β-sheet, the α-helix and C-terminal residues. On the other hand, in line with its temperature-dependent HSQC spectra, the double mutant TVN does not show the strong conformational exchange in ¹⁵N CEST or CPMG experiments (Supplementary Figures S11 and S12, respectively). The onset of vast chemical exchange by (and only by) V122A in the (otherwise identical) triple mutant protein supports the notion that the V122A mutation modulates the conformational-exchange dynamics, which, when incorporated in addition to the constitutive changes in the DNA binding interface (K109T, S134N), ultimately enables the reader to achieve high-affinity binding to hmC/mC.
To shed further light on the structural fluctuations with atomic resolution, we interrogated dynamics in the apo proteins (wt, TVN and TAN) and their DNA complexes (see section on DNA binding below) in MD simulations, which provided a detailed description of motion up to 2.5 µs. Even though slower motions on the µs timescale and beyond cannot be faithfully sampled (unless enhanced-sampling techniques are used, which render the interpretation of timescales challenging (37)), the tendencies seen in the MD simulations qualitatively match those observed experimentally. Figure 4H and I show the distribution of RMSDs to the X-ray structure (over all residues) and the residue-resolved root-mean-square fluctuations (RMSFs) of the apo proteins, respectively. Whereas the region around residues 132 and 138 shows increased plasticity over wt for both the TVN and TAN mutants, TAN shows an additional systematic increase of backbone fluctuations between residues 119 and 132. A similar increase is observed between residues 103 and 109, preceding mutation K109T. Interestingly, the above-described local unfolding seen experimentally is also observed in one of our five MD trajectories of TAN. Even though this is only a single event (due to the limited MD timescale), and hence has to be interpreted very carefully, it may shed further light on the destabilization of the structure by the V122A mutation. Supplementary Figure S13 shows an overlay of the structures, in which β1/β2 (harboring T109) is slightly reoriented relative to the rest of the structure, the L1 loop becomes extended, and α1 is shortened by one turn (reaching only up to Y141), nicely in line with the strong shift perturbation of the C-terminal helix residues due to temperature (see above), upon DNA binding (see below), and when transitioning to the excited state (CEST data).
Extended simulations, combined with enhanced-sampling methods, will be required to further characterize the nature of the putative locally unfolded state.
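The two quantities used above to summarize the MD trajectories, the RMSD to a reference structure and the per-residue RMSF, can be sketched in a few lines. The example assumes that all frames have already been superimposed on the reference (no fitting step is included) and uses a toy two-atom trajectory with hypothetical coordinates; it is not the analysis pipeline used for the simulations.

```python
import math


def rmsd(frame, reference):
    """RMSD between one frame and a reference (lists of (x, y, z) tuples)."""
    sq = sum((a - b) ** 2 for p, q in zip(frame, reference) for a, b in zip(p, q))
    return math.sqrt(sq / len(reference))


def rmsf(trajectory):
    """Per-atom RMSF about the time-averaged position."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for i in range(n_atoms):
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        msf = sum(
            (frame[i][k] - mean[k]) ** 2 for frame in trajectory for k in range(3)
        ) / n_frames
        result.append(math.sqrt(msf))
    return result


# Toy two-atom, two-frame trajectory (hypothetical coordinates, in angstrom).
traj = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
rmsd_values = [rmsd(frame, traj[0]) for frame in traj]
rmsf_values = rmsf(traj)
print(rmsd_values, rmsf_values)
```

RMSD reports how far a whole frame has moved from the reference, whereas RMSF localizes the mobility to individual sites, which is why the latter is the quantity that highlights the stretch between residues 119 and 132 in TAN.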

The structure of the hmC / mC reader in complex with DNA
In addition to the apo proteins, the complex between TAN and hmC/mC DNA was subjected to NMR investigation of structure and dynamics. To more closely investigate the binding of the hmC/mC reader to its DNA, ¹⁵N-labeled TAN was titrated and equilibrated with an unlabeled, double-stranded hairpin oligonucleotide carrying asymmetric hmC/mC modifications in the central CpG dyad, in a 1:1 molar ratio. We completed sequence-specific backbone and side-chain assignments of the TAN-DNA complex by similar experimental strategies as for the apo proteins (Supplementary Figure S14, deposited under BMRB accession code 34745). We then assessed chemical-shift perturbations (CSPs, see the SI for details) upon complex formation, which are shown as a function of sequence and depicted on the X-ray structure of the wt reader (PDB 3C2I) in Figure 5A and B, respectively. CSPs largely match the positions expected from the wt complex (PDB 3C2I), with perturbations seen in particular for the poorly structured loop L1, which slides into the DNA major groove (also compare Supplementary Figure S15). In addition, CSPs are found at allosteric positions distant from the interaction sites (e.g. at F142 at the end of α1 or at L100), which corroborate the overall reshuffling of the dynamic conformational ensemble upon binding. The tertiary structure of the complex was elucidated using a similar set of restraints as for the apo structure; however, RDCs for the complex could not be obtained due to sample instability in the presence of alignment media. Also, due to the absence of intermolecular NOEs, we did not specifically include the DNA in the structure determination.
Within the precision of this assessment, all individual structural elements of the TAN:hmC/mC complex, deposited as PDB 8ALQ, seem highly reminiscent of the wt reader in complex with a symmetrical mC/mC dyad as observed in 3C2I (see an overlay of lowest-energy NMR conformers in Figure 5C; also compare Supplementary Figure S16).
We asked whether the bound form resembled either the compact or the destabilized excited state of the TAN conformational ensemble. All secondary-structural features of the complex are essentially identical to those of the ground-state apo protein (Figure 5D and E). More interestingly, knowing the residue-resolved ¹⁵N chemical shifts of the protein in the complex, we sought correlations between these and the ¹⁵N shifts of either the ground or the excited state obtained from CEST data of apo TAN (Figure 5F and G, respectively). Again, a correlation coefficient R² of around 0.93 for the ground-state shifts, in contrast to much larger deviations from the partially unfolded state (R² of 0.76), shows the ground-state secondary structure of the apo reader to be reconstituted upon binding. This speaks against conformational preselection via a defined excited state as the main mechanism and instead points to facilitation of induced-fit binding of the apo protein to its target by decreased rigidity. Being associated with a high energy barrier, this plasticity, however, incurs low entropic costs upon complex formation.
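The comparison of complex shifts with ground-state and excited-state shifts rests on a correlation coefficient R². A minimal sketch of such a calculation is given below, assuming that R² denotes the squared Pearson correlation between two lists of per-residue ¹⁵N shifts; the exact definition used for the figures is not stated in the text. The shift values are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical 15N shifts (ppm) for a few residues.
complex_shifts = [118.2, 121.5, 109.8, 125.1, 115.0]
ground_shifts = [118.0, 121.9, 110.1, 124.6, 115.3]
print(round(r_squared(complex_shifts, ground_shifts), 3))
```

A value close to 1 for one state and a clearly lower value for the other is the pattern interpreted above as the bound form resembling the ground state rather than the excited state.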

Increased plasticity facilitates DNA binding
We closely inspected the MD simulations of the complex to complement the experimental data on structural properties, dynamics and interactions and to further elucidate the specifics of the interaction of TAN, as the first hmC/mC reader, with its target DNA. As an expected source of the newly adopted selectivity in addition to the known interactions of the wt protein (38), the S134N mutation is confirmed to allow specific H-bonding between 5-hmC and the reader (Figure 6A and Supplementary Figure S17). (The SI also contains a short movie depicting this interaction.) The H-bonding involves both the hmC hydroxyl group and the phosphate backbone (Figure 6B). At the same time, the two-sided polar character of Asn disfavors interaction with mC/mC DNA, which in the wt hinges on a hydrophobic cluster formed by the mC methyl group, the deoxyribose CH₂, and the aliphatic backside of Ser. In addition, we again inspected how complex formation is facilitated by specific structural features of the TAN mutant. Indeed, the formation of the TAN:hmC/mC DNA complex in the simulation is associated with slight conformational rearrangements. Whereas some rearrangement also occurs upon wt:mC/mC complex formation, the specific characteristics differ. [...] (see above) increases the discrepancy that its ¹Hε proton has in the apo state compared with other Arg side-chain moieties (Supplementary Figure S20A). This deviation, both for wt and TAN, confirms a preorganization of the salt bridge with neighboring D121 in the apo state that eventually also characterizes the complex of the wt mC/mC reader with DNA in crystals. Supplementary Figure S20B shows strips of a ¹⁵N-edited NOESY experiment, in which the preformed intraresidual contacts between R111 and D121 in the apo form of the TAN mutant are apparent. MD data for R111, which maintains a stable salt bridge with D121 in both the apo and complex states of all readers in the simulations, are shown in Figure 6F, left panels.
Importantly, in contrast to R111 and to what has been proposed in the framework of a possible selectivity mechanism (6), the salt bridge preorganizing Arg133 for its H-bond to guanosine in the complex is observed in the apo wt protein, but it is released in our simulations of the wt:mC/mC complex (Figure 6 [...] Overall, a high affinity of the new MBD for the asymmetric, hydroxymethylated dyad seems to depend on smoothening of the free-energy landscape of the reader towards enabling conformational adaptations. This key property is fine-tuned by interactions between the three central protein secondary-structural elements (helix α1, the extended β-sheet, and the domain connecting the two), defined by the central residues in the hydrophobic core, rather than by the binding interface itself.
The above results show that redirecting the specificity of MBDs as a naturally existing scaffold to a new epigenetic CpG duplex modification hinges on tailored modulation of the thermodynamics of binding, derived not only from new intermolecular contacts but also from tuning the characteristics of conformational plasticity in the interface. The enabling plasticity, adjusted via central residues within the protein core, derives from changes in the steric matches in the inter-domain interaction surfaces. The extent of this mobility, apparent from a tendency towards rare local unfolding, is strongly enhanced in the new hmC/mC reader. The adoption of an excited state via motion on the µs-ms timescale, and hence the presence of a similar tendency, albeit less pronounced, also characterizes the natural mC/mC reader. Suitably adjusted plasticity of the binding interface thus seems an important general aspect of target recognition for the MBD fold. At first glance, such 'disorder' in the apo scaffold seems like an entropically disadvantageous property for a high-affinity binder in which plasticity decreases upon binding. However, with features of this plasticity being very modestly tuned (i.e. with maintained, well-defined secondary-structural elements, tight conformational restrictions, and a high activation barrier), the penalty at physiological temperatures is minor, while still redefining a 25 Å wide binding interface, and can be largely compensated by maximized enthalpic gains due to optimized H-bond formation. The design of reader proteins that can serve as probes for the analysis of postsynthetic modifications of nucleic acids constitutes a current key aim of the soaring fields of epigenetics and epitranscriptomics. We showed that sought-after new properties of relevant protein-nucleic acid interfaces can be induced by directed evolution based on natural progenitors.
Whereas the design and selection of mutant libraries with well-defined randomization sites, guided by visual inspection of crystallographic structures that report on local interactions, is the most intuitive approach, our data show that interrogating nucleic acids by designed reader proteins can also critically hinge on correctly adjusting protein plasticity as a modulator of selective complex formation. As such, central mutation sites far from interaction surfaces but relevant for inter-domain connectivity can allosterically enable high affinity and selectivity of readers, hence opening unexpected new perspectives for progress, particularly in the new field of CpG duplex mark recognition. We believe that the interrogation and understanding of dynamic networks in epigenetic readers, writers, and erasers can represent a fundamental element for designing future probes to decipher and effectively modulate the layer of epigenetic control of cell fate and function. These findings will be interesting for design problems in other contexts of target recognition, as, despite active research towards understanding of dynamic networks (39-41), dedicated allosteric optimization of large-scale dynamics as a lever for any desired functionality still tends to escape awareness in the creation of new molecular tools.

CONCLUSION
Here, we have demonstrated that the high affinity and selectivity of the first designed epigenetic reader for oxidized CpG dyads are leveraged by well-defined conformational plasticity of the DNA binding interface, remotely orchestrated by interactions in the hydrophobic core. The observed intermediate-timescale conformational exchange towards a partial melting of secondary-structural strains, elucidated via extensive NMR and MD interrogation of protein structure and dynamics, demonstrates adapted plasticity as a lever for specific reader:DNA interactions within the vast landscape of differential epigenetic modifications. Albeit to a lower extent, exchange dynamics on the same timescale are also visible for the natural, mC/mC-specific MBD. Our study suggests that tailored conformational plasticity may both help in understanding physiological reader selectivities and facilitate the design of novel readers as specific molecular probes for different CpG duplex marks. The findings will propel advances in the emerging field of DNA recognition and thus in deciphering the elusive roles of individual CpG duplex modifications in chromatin regulation.

DATA AVAILABILITY
NMR chemical shifts have been deposited to the BMRB ( www.bmrb.com ) under accession codes 51548, 51547 and 51020. Structural data have been deposited to the PDB ( www.rcsb.org ) under 8AJR and 8ALQ.

SUPPLEMENTARY DATA
Supplementary Data are available at NAR Online.