In vitro selection targeting an anti-polyhistidine monoclonal antibody was performed using mRNA display with a random, unconstrained 27-mer peptide library. After six rounds of selection, epitope-like peptides were identified that contain two to five consecutive, internal histidines and are biased for arginine residues, without any other identifiable consensus. The epitope was further refined by constructing a high-complexity, unidirectional fragment library from the final selection pool. Selection by mRNA display minimized the dominant peptide from the original selection to a 15-residue functional sequence (peptide Cmin: RHDAGDHHHHHGVRQ; KD = 38 nM). Other peptides recovered from the fragment library selection revealed a separate consensus motif (ARRXA) C-terminal to the histidine track. Kinetics measurements made by surface plasmon resonance, using purified Fab (antigen-binding fragment) to prevent avidity effects, demonstrate that the selected peptides bind with 10- to 75-fold higher affinities than a hexahistidine peptide. The highest affinity peptides (KD ≈ 10 nM) encode both a short histidine track and the ARRXA motif, suggesting that the motif and other flanking residues make important contacts adjacent to the core polyhistidine-binding site and can contribute >2.5 kcal/mol of binding free energy. The fragment library construction methodology described here is applicable to the development of high-complexity protein or cDNA expression libraries for the identification of protein–protein interaction domains.
Edited by Dario Neri
Epitope mapping, the identification of regions of an antigen recognized by an antibody, is an important subset of protein–protein interaction analysis that is relevant in a wide range of disciplines where antibodies are used as molecular reagents. Conventional methods for epitope mapping involve the synthesis or expression of numerous overlapping polypeptides followed by probing for antibody reactivity (Geysen et al., 1984; Lenstra et al., 1990; Frank, 1992; Frank and Overwin, 1996; Kramer et al., 1999; Reineke et al., 1999). Although these methods can achieve very fine mapping (single amino acid resolution), they involve tedious, time-consuming and often cost-intensive steps. These techniques also typically require a priori knowledge of one of the interacting partners (i.e. the antigen sequence).
Display technologies such as phage (Scott and Smith, 1990) and cell surface display on Escherichia coli or yeast (Boder and Wittrup, 1997; Georgiou et al., 1997) permit the assay of millions of polypeptides simultaneously for the identification of functional properties. In these systems, each display vehicle expresses multiple copies of a single polypeptide sequence on its surface. Active peptides are recovered by affinity selection (e.g. by biopanning or fluorescence-activated cell sorting) and identified by DNA sequencing of the library inserts. Random peptide libraries (Miceli et al., 1994; Parhami-Seren et al., 1997; Murthy et al., 1998), antigen- or gene-fragment libraries (Kuwabara et al., 1999; Christmann et al., 2001; Mullaney et al., 2001) or a combination of both (Stephen et al., 1995; Fack et al., 1997; Coley et al., 2001) have previously been used for the epitope mapping of a wide variety of monoclonal antibodies (mAbs) (reviewed by Irving et al., 2001). Generally, these libraries suffer from low starting complexities and do not always achieve fine mapping of antibodies unless the epitope is short (about five residues) and well defined.
More recently, entirely in vitro techniques for protein selection such as ribosome (Mattheakis et al., 1994; Hanes and Plückthun, 1997; He and Taussig, 1997) and mRNA display (Roberts and Szostak, 1997) have emerged. In mRNA display, peptides are covalently attached to the 3′-end of their encoding mRNA via a tethered puromycin moiety. Pools of RNA–peptide fusions are selected for binding via their attached peptides and recovered fusions are RT-PCR amplified for the next round of selection and/or cloned for DNA sequencing (Figure 1). The mRNA display system generates libraries that are robust (functional in a wide variety of conditions), encode high complexities (>10¹³ unique sequences, compared with ∼10⁸–10⁹ for techniques requiring an in vivo transformation step) and lack avidity effects as only one peptide is displayed per mRNA sequence. By accessing larger libraries, extremely rare sequences (such as long, discontinuous epitopes or peptides with better functional properties) can be selected and amplified (Takahashi et al., 2003). Epitope-like consensus motifs that define the core determinants of binding for the anti-c-Myc antibody, 9E10, have previously been identified using mRNA display with a random peptide library (Baggio et al., 2002).
Published methods for generating gene or fragment libraries from DNA typically involve degenerate oligonucleotide priming (Whitcomb et al., 1993; Hampson et al., 1996; Santi et al., 2000), random fragmentation of DNA (Gupta et al., 1999) or the iterative removal of bases from either end of a gene (Henikoff, 1984; Milavetz, 1992; Pues et al., 1997), followed by amplification of the library. These techniques have been employed for a variety of purposes, including epitope mapping and the determination of protein interaction domains (Fack et al., 1997; Kuwabara et al., 1999; Christmann et al., 2001; Coley et al., 2001; Mullaney et al., 2001; McPherson et al., 2002). Because of the random nature of library construction, the majority of sequences in these libraries are non-viable owing to frameshifts and ligations in the anti-sense orientation. Techniques have been described to maintain gene orientation using a pair of degenerate primers with constant 5′ sequences used sequentially in the amplification of cDNA (Hampson et al., 1996) or mRNA (Hammond et al., 2001; McPherson et al., 2002). However, these methods are technically challenging and may be prone to poor library coverage owing to biased hybridization to target sequences (Telenius et al., 1992; Zhang and Byrne, 1999).
Here we describe a further advancement of mRNA display technology: a robust method for generating unidirectional, nested deletion libraries. As mRNA display facilitates selection from pool sizes larger than previously possible, improvements are needed for generating libraries with broad coverage while maintaining high sequence complexity. In this method, parent DNA sequences are partially digested with DNase I. The resulting fragments are then directionally amplified, maintaining the sense orientation, and used to generate an mRNA display library. We used this method to identify a 15-mer peptide that binds with high affinity to an anti-polyhistidine mAb from an initially enriched population of 35-mer, epitope-like sequences. The fragment library selection also revealed a new motif important for high affinity, demonstrating how sequence length may be an important factor in delineating specific binding requirements. The methods described here should be highly applicable towards the isolation of minimal protein interaction domains from cDNA or protein expression libraries using mRNA display.
Materials and methods
l-[35S]Methionine (35S-Met) was purchased from PerkinElmer Life Sciences. Other reagents and solvents were obtained from Sigma-Aldrich or VWR International, unless noted otherwise. All buffer components for RNA and RNA–peptide fusions were made with diethyl pyrocarbonate-treated doubly distilled water. DNA oligos were synthesized at the Caltech Biopolymer Synthesis and Analysis Facility and were desalted by OPC purification, with the exception of DNA template 130.2, which was obtained from the W. M. Keck Foundation Biotechnology Resource Laboratory (http://keck.med.yale.edu) and purified by urea–PAGE (Ellington and Pollard, 2001). Oligo and peptide concentrations were determined by UV spectrophotometry using a calculated extinction coefficient (http://paris.chem.yale.edu/extinct.html). Protein concentrations were determined by measuring the UV absorbance at 205 nm (Scopes, 1974). The values obtained with this method were within 5% of those obtained using a calculated extinction coefficient at 280 nm for the protein (http://paris.chem.yale.edu/extinct.html).
mRNA display library construction
Construction of a random peptide library for mRNA display selections has been described in detail previously (Liu et al., 2000; Keefe, 2001; Baggio et al., 2002). Briefly, the anti-sense DNA oligo 130.2 [5′-AGC GCA AGA GTT ACG CAG CTG (SNN)27 CAT TGT AAT TGT AAA TAG TAA TTG TCC C; S = C or G, N = A, C, G or T] was PCR amplified with primers 47T7FP (5′-GGA TTC TAA TAC GAC TCA CTA TAG GGA CAA TTA CTA TTT ACA ATT AC) and mycRP (5′-AGC GCA AGA GTT ACG CAG CTG) to produce the initial template containing a T7 promoter, a 5′-untranslated region (UTR), an ATG methionine start codon, 27 random amino acids and a constant 3′-end that encoded the peptide, QLRNSCA. In vitro transcription, purification of the mRNA and splint-mediated ligation of the puromycin linker oligo (pF30P: 5′-A21[S9]3ACC-P, S9 = spacer phosphoramidite 9, P = puromycin, 5′-phosphorylated with phosphorylation reagent II, Glen Research; splint: 5′-TTT TTT TTT TTN AGC GCA AGA GT) were performed as described (Ja and Roberts, 2004) to produce puromycin-conjugated templates (mRNA–F30P).
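The coding properties of the random region follow from the (SNN)27 anti-sense design, which reads as 27 NNS codons on the sense strand. The following minimal sketch enumerates those codons using the standard genetic code. The function names are illustrative only, and the calculation assumes equal representation of every allowed base at each position, which a synthesized oligo may not achieve.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def nns_codons():
    """Sense-strand codons encoded by an anti-sense SNN triplet (S = C or G)."""
    return ["".join(codon) for codon in product("ACGT", "ACGT", "CG")]


def summarize_nns(n_codons=27):
    """Return codon count, encoded residues, stop codons and the open fraction."""
    codons = nns_codons()
    stops = [c for c in codons if CODON_TABLE[c] == "*"]
    residues = sorted({CODON_TABLE[c] for c in codons} - {"*"})
    # Probability that n_codons independent NNS codons contain no stop codon.
    p_open = (1 - len(stops) / len(codons)) ** n_codons
    return len(codons), residues, stops, p_open


if __name__ == "__main__":
    total, residues, stops, p_open = summarize_nns()
    print(f"{total} codons, {len(residues)} amino acids, stop codons: {stops}")
    print(f"fraction of 27-codon regions without a stop: {p_open:.2f}")
```

Under these assumptions the NNS scheme covers all 20 amino acids with a single possible stop codon, so a sizeable fraction of the 27-codon random regions would be expected to carry at least one stop.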
RNA–peptide fusion preparation and selection
Purified mRNA–F30P templates were translated in rabbit reticulocyte lysate (Red Nova lysate, Novagen) according to the manufacturer's instructions with optimized conditions (100 mM KOAc, 0.5 mM MgOAc and 0.5 µM mRNA–F30P) and additional l-Met (0.5 mM final, 1 ml total reaction volume) or 35S-Met labeling (150 µl reaction). Following the incubation step at 30°C, KOAc and MgCl2 were added to 585 and 50 mM (final), respectively, and the reactions were incubated on ice for 15 min to facilitate RNA–peptide fusion formation. Radioactively labeled and non-labeled RNA–peptide fusions were pooled and subsequently purified with oligo dT-cellulose (New England Biolabs) as described (Ja and Roberts, 2004). Purified fusions were concentrated (Microcon YM-30, Millipore) and reverse transcribed according to the manufacturer's instructions (Superscript II, Invitrogen) with excess mycRP primer in a 100 µl reaction.
The matrix preparation and all selection steps were performed at 4°C. The reverse-transcribed fusions, in 1 ml of selection buffer (50 mM HEPES–KOH, pH 7.5, 100 mM NaCl, 10 mM MgCl2, 10 mM NaF, 30 µM AlCl3, 0.05% Tween 20, 1 mM β-mercaptoethanol (β-ME) and 5 µM GDP), were pre-cleared by rotating with 20 µl of protein G–Sepharose (4B Fast Flow, Sigma) for >1 h. The supernatant was transferred to the target matrix [80 µg of His6–TEV–Giα1 (Lee et al., 1994) immobilized by 40 µg of anti-polyhistidine mAb (clone HIS-1, catalog No. H1029, Sigma) on 20 µl of protein G–Sepharose] and rotated for 1 h. The matrix was washed with 3 × 1 ml of selection buffer and the bound RNA–peptide fusions were eluted with 2 × 200 µl of 4% acetic acid through a 0.45 µm spin filter (SpinX, Costar). Washes and an aliquot of the elution were scintillation counted (LS 6500, Beckman Coulter) to determine the amount of bound fusions.
The eluted fusions were either desalted and concentrated by ultrafiltration (Microcon YM-30) or frozen and dried by vacuum centrifugation. After resuspension in doubly distilled water or 10 mM Tris–HCl, pH 8, samples were PCR amplified for the next cycle of selection and/or for DNA sequencing (TOPO TA cloning, Invitrogen). Subsequent selection rounds were performed similarly, except that smaller translation reactions were used (300 µl non-labeled, 100 µl 35S-Met-labeled). Unblocked mAb (without the His6-tagged protein) was used as the target in the sixth round of selection, when it was realized that the peptides were specific for the mAb.
RNA–peptide fusion binding assay
Aliquots of purified 35S-Met-labeled RNA–peptide fusions were treated with RNase (DNase-free, Roche) and added to ∼15 µl of protein G–Sepharose matrix (with or without ∼10 µg of anti-polyhistidine mAb) in 1 ml of selection buffer. Mixtures were rotated at 4°C for 1 h and washed with 3 × 1 ml of selection buffer. The percentage binding was determined by scintillation counting of the washes and the matrix.
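The percentage binding calculation can be sketched as follows. This is a minimal illustration that assumes unbound material is counted together with the washes and that no background subtraction is applied; the counts shown are hypothetical and are not data from this study.

```python
def percent_bound(matrix_cpm, wash_cpm):
    """Percentage of total counts retained on the matrix after washing."""
    total = matrix_cpm + sum(wash_cpm)
    if total <= 0:
        raise ValueError("total counts must be positive")
    return 100.0 * matrix_cpm / total


# Hypothetical scintillation counts (cpm) for matrix with and without mAb.
with_mab = percent_bound(4000, [5000, 700, 300])
without_mab = percent_bound(100, [9500, 300, 100])
print(f"+mAb: {with_mab:.0f}%  -mAb: {without_mab:.0f}%")
```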
Fragment library construction and selection
First-strand cDNA was produced from the mRNA of a selected library by reverse transcription with dUTP instead of dTTP nucleotides (Superscript II). The mRNA was subsequently removed with RNase H (Roche) and the single-stranded cDNA was purified by spin-column (QIAquick, Qiagen). To generate cDNA fragments, a partial digest was performed on 30 pmol of cDNA (∼1.2 µM final concentration) with 0.25 U of DNase I (Invitrogen) in 1× DNase I buffer (10 mM Tris–HCl, pH 7.4, 2.5 mM MgCl2 and 0.1 mM CaCl2) at 15°C for 10 min. DNase I was removed using DNase Removal Reagent (Ambion), according to the manufacturer's instructions.
Second-strand cDNA was generated by random hexamer priming and fill-in with a processive polymerase (Sequenase v2.0, Amersham Biosciences). cDNA fragments were mixed with 125 pmol of myc6-N6-FP (5′-ATC TCT GAA GAG GAC CTG NNN NNN) in T7 reaction buffer (Amersham Biosciences). After heating the sample to 60°C and cooling on ice to anneal the primers, dNTP (200 µM each nucleotide, final), DTT (10 mM, final) and Sequenase v2.0 (13 U) were added and the reaction was incubated at 37°C for 20 min. The enzyme was heat inactivated at 90°C for 5 min. First-strand cDNA was removed by adding uracil–DNA glycosylase (UDG, 30 U, New England Biolabs) and incubating at 37°C for 20 min. After heat inactivating the UDG, ssDNA longer than ∼50 bases was gel purified with QiaEX II (Qiagen) from a 4% agarose gel (Frohlich and Parker, 2001). A second fill-in reaction was performed with 3myc-N6-RP (5′-AAA TGC ACA AGA GTT GCC CTC GNN NNN N) as before. The dsDNA was subsequently purified on a 2% agarose gel (QIAquick).
PCR using primers T7mycFP (5′-GGA TTC TAA TAC GAC TCA CTA TAG GGA CAA TTA CTA TTT ACA ATT ACA ATG GAA CAG AAA CTG ATC TCT GAA GAG GAC CTG) and psn3mycRP (5′-AAA TGC ACA AGA GTT GCC CTC G) produces the initial library (containing a T7 promoter, a 5′-UTR, an ATG methionine start codon, the fragment domain and a constant 3′-end) suitable for mRNA display selection. The PCR resulted in a smear of products ranging from 100 to 200 bp and DNA corresponding to 150–200 bp was gel purified (QIAquick) and used as the starting template. RNA–peptide fusions were prepared as described above, except that the puromycin moiety was coupled to the mRNA by UV photo-crosslinking with oligo psn-mycF15P (5′-[Ps]-UGC ACA AGA GUU G-dA15-[S9]2-dCdC-P, where unlabeled bases are 2′-OMe RNA, Ps = psoralen C6, S9 = spacer phosphoramidite 9, P = puromycin, Glen Research) as published previously (Kurz et al., 2000). A modified selection buffer [1× PBS, 1 mM β-ME, 1 mM EDTA, 0.05% Tween 20, 0.2% (w/v) BSA and 1 µg/ml yeast tRNA (Roche)] was used in the fragment library selection. In the second and third rounds of selection, the matrix was more stringently washed by incubating the mAb-bound RNA–peptide fusions in buffer containing poly-l-His (0.15 mg/ml, P2534, Sigma) and His6 peptide (60 µM, Covance Research Products) for ∼40 min at 4°C (Boder and Wittrup, 1998).
Direct binding assay of in vitro translated peptides in lysate
Individual clones (in pCR4-TOPO vector, Invitrogen) were PCR amplified with primers 47T7FP and mycRP, in vitro transcribed, urea–PAGE purified and in vitro translated (Red Nova Lysate) with 35S-Met labeling according to the manufacturer's instructions. An aliquot of the translation reaction (4 µl) was added directly to an assay tube (600 µl of fragment selection buffer, 10 µl of protein G–Sepharose, 5 µg of anti-polyhistidine mAb). After rotating at 4°C for 1 h, the Sepharose was washed with 6 × 600 µl of fragment selection buffer in a 0.45 µm spin filter (SpinX) and bound peptides were eluted with 2 × 20 µl of 0.05% SDS. Half of the sample was analyzed via tricine SDS–PAGE along with 2 µl of the original translation reaction for comparison. After electrophoresis, gels were destained (40% methanol and 10% acetic acid) for 20 min, dried under vacuum and imaged via autoradiography (Storm Phosphorimager, Amersham Biosciences). Peptide band intensities were analyzed with ImageQuant software (Amersham Biosciences).
Peptide synthesis/protein purification
Peptides were synthesized on an ABI 432A Synergy peptide synthesizer (Applied Biosystems) using Fmoc chemistry. Peptides included the sequence GGYK-NH2 at their C-terminus, where K is biotinyllysine (biocytin, Bachem) and -NH2 represents C-terminal amidation. The tyrosine residue, used for quantitation by UV absorbance, was omitted from the synthesis for peptides that already contained a tryptophan and/or tyrosine. Crude peptides were deprotected in TFA–thioanisole–1,2-ethanediol (450, 25, 25 µl, 2 h at room temperature), precipitated with methyl tert-butyl ether, purified to >95% purity by reversed-phase HPLC on a semi-preparative C18 column (250 × 10 mm i.d., Vydac) and confirmed by MALDI-TOF mass spectrometry.
Several peptide sequences were expressed in E.coli as in vivo biotinylated maltose-binding protein (MBP) fusions using a vector derived from pDW363 (Tsao et al., 1996). The MBP gene from pDW363 was amplified by successive PCR (primers 35.3 5′-GGA CTA GTA AAA TCG AAG AAG GTA AAC TGG TAA TC and 35.4 5′-CCA TTG GAT CCT TAA TTA GTC TGC GCG TCT TTC AG, then primers 84.1 5′-GAG CAC TCG AGC GGT GCG AAT TCA AAC AAC ATC GAG GGG CGC GCC GGT GGC ACT AGT AAA ATC GAA GAA GGT AAA CTG GTA ATC and 29.3 5′-CCA TTG GAT CCT TAA TTA GTC TGC GCG TC). The PCR-amplified fragment and pDW363 were digested with XhoI/BamHI, purified and ligated to produce the pDW363B vector.
DNA templates encoding peptides B and C were amplified by PCR using the universal forward primer 29.4 (5′-TGA AGT CTG GAG TAT TTA CAA TTA CAA TG) and a template-specific reverse primer that added an SpeI site. BpmI/SpeI-digested dsDNA was co-ligated into XhoI/SpeI digested pDW363B with DNA linkers (XhoI linker 5′-TCG AGC TCT GGA GGC ATC GAG GGT CGC AT and BpmI linker 5′-GCG ACC CTC GAT GCC TCC AGA GC) to produce the expression vector. Inserts contained an N-terminal bio-tag (MAGGLNDIFEAQKIEWHEDTGGSS), peptide B or C and a C-terminal MBP fusion. The vectors produce a dicistronic mRNA which encodes the bio-tag–peptide–MBP fusion and biotin holoenzyme synthetase (birA), an enzyme that attaches biotin to the single lysine in the bio-tag in vivo. Protein expression with 30 ml cultures of E.coli BL21 cells was performed as described (Tsao et al., 1996). Cells were lysed with B-PER (Pierce) and MBP fusions were purified on monomeric avidin–agarose (Pierce) according to the manufacturer's instructions. Purified proteins were concentrated and desalted into 1× PBS by ultrafiltration (Centriprep YM-10, Millipore).
The His6–TEV–Giα1 was expressed and purified as described previously (Lee et al., 1994). Briefly, E.coli expression cultures were lysed by French press and successively purified by FPLC on metal chelate affinity, anion-exchange and size-exclusion columns. Protein purity was assayed by SDS–PAGE. MALDI-TOF analyses on TEV protease-treated His6–TEV–Giα1 were consistent with the epitope tag being removed. For the surface plasmon resonance (SPR) experiments, His6–TEV–Giα1 was biotinylated using EZ-Link Sulfo-NHS-LC-biotin (5-fold excess, Pierce) according to the manufacturer's instructions. Excess reagent was quenched with ethanolamine and the biotinylated protein was desalted on an NAP-10 column (Amersham Biosciences).
Anti-polyhistidine mAb in ascites fluid was affinity purified on protein G–Sepharose in 1× PBS/0.1% Triton X-100, eluted with 0.1 M citric acid buffer, pH 3, and immediately neutralized with buffer. After concentration and buffer exchange (Centriprep YM-50) into papain buffer (20 mM phosphate, pH 7, 10 mM EDTA), the antigen-binding fragment (Fab) was generated and purified using an ImmunoPure Fab Preparation Kit (Pierce), according to the manufacturer's instructions.
Surface plasmon resonance
SPR measurements were made at 25°C on a Biacore 2000 (Biacore) equipped with either SA (streptavidin) sensor chips or research-grade CM5 sensor chips (Biacore) with amine-coupled streptavidin (ImmunoPure, Pierce). The CM5–streptavidin chips were prepared in-house by standard NHS/EDC amine coupling (Biacore) and achieved >1100 RU of immobilized streptavidin per flow cell. HBS-EP [20 mM HEPES, pH 7.4, 150 mM NaCl, 3 mM EDTA and 0.005% (v/v) surfactant P20 (Tween 20)] was used as the running buffer for all experiments. Biotinylated ligands were diluted in HBS-EP to 1 nM and immobilized to individual flow cells (∼10 RU for peptides and ∼100 RU for proteins). Flow cell 1 was left as a streptavidin negative control in all sensor chips. To collect kinetic data, a concentration series of Fab in HBS-EP was injected for 2 min at 35 µl/min over all flow cells and dissociation was observed for 3 min. The Fab samples were injected in random order, interspersed with a number of buffer blank injections for double referencing (Myszka, 2000). Flow cells were regenerated between Fab injections with a 0.5 min wash of 2.5 M NaCl at 100 µl/min. Raw data were processed with Scrubber and analyzed with CLAMP using a 1:1 bimolecular interaction model (Myszka and Morton, 1998). KD values were calculated (kd/ka) from the on and off rates determined by CLAMP. Standard free energies of binding were calculated from the KD values [ΔG° = −RT ln(C/KD), R = 1.987 × 10⁻³ kcal/(mol·K), T = 298.15 K and C = 1 mol/l].
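The two calculations at the end of this section (KD from the fitted rate constants and the standard free energy from KD) can be written out as a short sketch. The constants R, T and C are those stated above; the rate constants in the example are hypothetical values chosen only to illustrate the arithmetic, and the function names are not taken from CLAMP or any other package.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal/(mol*K), as stated in the text
T_KELVIN = 298.15   # 25 degrees C
C_STANDARD = 1.0    # standard-state concentration, mol/l


def dissociation_constant(ka, kd):
    """KD (M) from association rate ka (1/(M*s)) and dissociation rate kd (1/s)."""
    return kd / ka


def standard_free_energy(kd_molar):
    """Standard free energy of binding, dG = -RT ln(C/KD), in kcal/mol."""
    return -R_KCAL * T_KELVIN * math.log(C_STANDARD / kd_molar)


# Hypothetical rate constants, not values measured in this study.
ka_example = 1.0e5    # 1/(M*s)
kd_example = 1.0e-3   # 1/s
KD_example = dissociation_constant(ka_example, kd_example)
dG_example = standard_free_energy(KD_example)
print(f"KD = {KD_example:.1e} M, dG = {dG_example:.2f} kcal/mol")
```

With this sign convention a tighter interaction (smaller KD) gives a more negative ΔG°.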
Peptide selection against an anti-polyhistidine mAb
The peptide selection experiment, originally designed to target a His6-tagged protein immobilized by an anti-polyhistidine mAb, utilized a random, unconstrained 27-mer peptide library. During PCR and transcription the complexity of the library was maintained by having at least 7 × 10¹³ sequences at the start of each reaction. The initial mRNA display pool contained at least 10¹² unique peptide sequences, estimated from the initial mRNA and methionine concentrations in the translation reaction, out of a maximum complexity of 20²⁷ peptides (∼1.3 × 10³⁵).
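The scale of these numbers can be checked with a few lines of arithmetic. The sketch below computes the theoretical sequence space for a 27-residue random region and the fraction of it covered by 10¹² unique peptides. It also counts the template molecules in the first-round translation, assuming the 0.5 µM mRNA–F30P and 1 ml volume given in Materials and methods; that figure is only an upper bound on input molecules, not the authors' complexity estimate.

```python
AVOGADRO = 6.022e23  # molecules per mole


def sequence_space(alphabet_size=20, length=27):
    """Number of distinct peptides of the given length."""
    return alphabet_size ** length


def molecules(conc_molar, volume_litres):
    """Number of molecules in a solution of given concentration and volume."""
    return conc_molar * volume_litres * AVOGADRO


space = sequence_space()              # 20**27
sampled = 1e12                        # lower bound on unique peptides (from the text)
coverage = sampled / space            # fraction of sequence space sampled
mrna_input = molecules(0.5e-6, 1e-3)  # assumed: 0.5 uM template in 1 ml

print(f"sequence space: {space:.2e}")
print(f"coverage:       {coverage:.1e}")
print(f"mRNA templates: {mrna_input:.1e}")
```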
Five rounds of selection were performed on the immobilized anti-polyhistidine mAb, pre-saturated with an N-terminal His6-tagged protein (Figure 2A). Bound RNA–peptide fusions were eluted with acetic acid, which generally recovered >80% of the remaining 35S counts. To determine the progress of the selection, a separate 35S-Met-labeled RNA–peptide fusion pool from the fifth round was purified, RNase-treated and assayed for binding (Figure 2B). This assay revealed specific binding of the peptide pool (now modified only at the C-terminus with puromycin and a short DNA linker) to the antibody rather than to the immobilization matrix (protein G–Sepharose) or to the His6-tagged protein. The reduced binding observed when a His6 peptide competitor was added further evinced that the selected sequences specifically targeted the antigen-binding region of the mAb. A sixth round of selection, performed with unblocked mAb as the target, demonstrated that the enrichment for active peptides against the mAb was essentially complete (Figure 2A).
DNA sequencing of the final sixth round pool revealed a variety of peptides containing two to five consecutive His residues with no other apparent consensus except a bias for Arg residues C-terminal to the His-track (Table I). The His-track was seen in various positions in the random region of the library, suggesting that specific locations in the random domain were not favored for mAb recognition. One sequence, peptide C, emerged as the dominant member of the selected library (Table I). Further rounds of selection using His6 peptide and/or poly-l-His as competitors in the selection buffer generally resulted in changes in the percentage of peptide C in the pool rather than the emergence of new, beneficial mutations or peptides defining a single consensus (data not shown). Peptide C remained the most prevalent sequence in all subsequent selection rounds, with a collective frequency of 20 out of 53 sequences (Table I).
Only the random domain is shown. Sequences contained between two and five consecutive histidines and were aligned at the C-terminal end of the His-track. A consensus was not observed except for a strong bias for Arg several residues C-terminal to the His-track. His and Arg residues are shown in bold. The frequency (out of 53) is shown for peptides that appeared more than once from DNA sequencing of individual clones. For these sequences, amino acids that differed between clones are in italics, with the most common residue at that position shown. Several sequences contained multiple deletions that shortened the random domain but left the C-terminal constant region intact and in-frame. The sequence marked with an asterisk contained a 2 bp insertion which resulted in a frameshift of the C-terminal constant region (not shown). Peptides A, B and C are named.
Selection with an mRNA display fragment library
To identify the minimal peptide sequence necessary for high-affinity binding, a nested deletion library was constructed from the peptide C-dominated library. This library is composed of fragments of DNA that encode shorter stretches of the parent peptides. Initial attempts to generate nested deletions using random priming on cDNA resulted in nearly full-length sequences (unpublished observations), possibly due to the strand-displacement abilities of the tested polymerases (I.N.Hampson, personal communication; Hamilton et al., 2001). This attribute was exploited in the final fragmentation scheme (Figure 3A). DNase I was used to generate random fragments from the cDNA of a functional library. Various dilutions of DNase I were used to find the optimal conditions for producing a range of ssDNA products from ∼50 to 130 bases (data not shown). Successive random priming and fill-in reactions with a modified T7 polymerase and primers containing 3′-random hexamers produced the initial DNA pool. PCR-amplified dsDNA was purified to retain fragments between 150 and 200 bp, corresponding to peptides ∼10–30 amino acids long.
Because stop codons hinder RNA–peptide fusion formation, the 3′-constant sequence of the fragment library was chosen such that TAA, TAG and TGA codons did not exist in any frame. The 5′-constant region added a c-Myc epitope tag and provided a primer site for subsequent PCR amplification (for additional attachment of the T7 promoter and UTR sequence). This method resulted in a unidirectional fragmented pool; all transcribed RNA maintained the sense orientation. DNA sequencing of the initial pool demonstrated reasonable representation of the dominant sequence (peptide C) and confirmed that one-third of the sequences were in-frame, as expected (Figure 3B). DNA alignments with peptide C derivatives typically contained several mismatches at the beginning and end of the fragment region, most likely due to imperfect annealing of the random hexamer primers.
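The stop codon requirement on the 3′-constant sequence can be verified computationally. The sketch below takes the reverse primer psn3mycRP from Materials and methods, derives the sense strand by reverse complementation and scans all three reading frames. It assumes that the primer-derived sequence is the whole 3′-constant region referred to here; the function names are illustrative.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def stops_by_frame(seq):
    """Map each reading frame (0, 1, 2) to the stop codons found in it."""
    found = {}
    for frame in range(3):
        codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
        found[frame] = [c for c in codons if c in STOP_CODONS]
    return found


# Reverse primer psn3mycRP (spaces removed); the sense strand is its reverse complement.
PSN3MYC_RP = "AAATGCACAAGAGTTGCCCTCG"
sense_3prime = reverse_complement(PSN3MYC_RP)
print(sense_3prime, stops_by_frame(sense_3prime))
```

Because no frame of this sequence contains TAA, TAG or TGA, a fragment joined in any reading frame can still be read through to the 3′-end of the template.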
The nested deletion library was used for selection against the anti-polyhistidine mAb (Figure 3C). Poly-l-His and His6 peptide were used as competitors in the second and third rounds. Although the binding of the second and third round pools was similar, more RNA–peptide fusions were retained after the stringent, competitive wash in the third round, suggesting that the washes were indeed enriching the pool for the highest affinity peptides. DNA sequencing of the final pool revealed three distinct classes of peptides (Table II). Class 1 sequences were fragments corresponding to N- and C-terminal deletions of peptide C. A sequence alignment of the fragments identified RHDAGDHHHHHGVRQ (peptide Cmin) as a minimal functional sequence for peptide C.
Only the fragment domain of the peptides is shown. Class 1 peptides are derived from peptide C (Table I) and the putative minimal active sequence is underlined. Class 2 sequences contain portions of the ARRXA motif. Conserved residues are in bold. Sequences derived from parent peptides A and B, as well as new peptides D, E and F, are labeled. The C-terminal RGQ in the sequence derived from peptide A is encoded by part of the 3′-constant region. Class 3 peptide sequences were aligned using CLUSTALW (http://npsa-pbil.ibcp.fr) with key residues (bold) determined automatically. Clone frequency (out of 20) is shown and differing residues are italicized as described in Table I. Peptide sequences translated from alternate start codons are marked with an asterisk.
The majority of fragments recovered after the selection came from parent sequences other than peptide C (Table II, Class 2). An alignment of peptides D and E (which collectively represented 40% of the final, third round selection pool) revealed the consensus motif ARRHA. This exact motif was not seen in the original selection, although three peptide sequences contained ARRXA [X = R, G (peptide A) or K (peptide B)] two residues C-terminal to the His-track (Table I), as in peptide D. Additional N- and C-terminal deletions for peptides D and E were not observed. Hence these sequences may already represent minimal high-affinity binding epitopes. Alternatively, there may have been an insufficient number of clones sequenced to find other corresponding fragments. Other recovered sequences in this peptide class retained at least part of the ARRXA motif, suggesting that the first few residues of the consensus motif are more critical for mAb interaction.
Several additional peptides were discovered that encoded a weak consensus sequence unrelated to the mAb-binding peptides (Table II, Class 3). Binding assays with two of these peptides revealed significantly weaker affinity for the mAb than a His6-containing peptide control (data not shown). These peptides may bind to an alternate interaction site and were consequently enriched when high stringency, competitive washes were introduced for the last rounds of selection. Site-specific, competitive washes (e.g. with poly-l-histidine) would result in the enrichment of peptides with higher affinity for the antigen-binding region, as well as of peptides with any affinity for other sites.
Immunoprecipitation of selected peptides
Selected clones were qualitatively assessed for binding by immunoprecipitation with the anti-polyhistidine mAb. 35S-Met-labeled peptides were assayed directly from the in vitro translation reactions. The selected peptides demonstrated significantly increased binding compared with a C-terminal His6-tagged peptide control (Figure 4). Non-specific binding was shown to be minimal with a c-Myc epitope control peptide. The fragment-selected peptides and the Myc control were immunoprecipitated with the 9E10 anti-c-Myc mAb to confirm that they were correctly translated (data not shown).
Kinetics by surface plasmon resonance
Various peptides from the fragment selection were synthesized or expressed for kinetic analysis by surface plasmon resonance (SPR). In an SPR experiment, one binding partner (the ligand) is immobilized on a sensor chip while the other reactant (the analyte) is in solution. Binding of the analyte to the ligand is observed as a refractive index change on the sensor chip surface and is measured in real time in resonance units (RU). Biotinylated ligands were immobilized on streptavidin sensor chips for the SPR analyses.
Absolute and relative binding constants obtained from mAb interactions with immobilized antigens are unreliable owing to the effects of rebinding and bivalency (Nieba et al., 1996). To avoid these problems, Fab was prepared from anti-polyhistidine mAb and used as the analyte. Using the peptides as the immobilized ligands and Fab as the analyte ensured fair comparisons between the kinetics measurements, avoiding bias in protein quantitation since all Fab concentrations were prepared from a single stock solution. Kinetic parameters were determined using a simple 1:1 bimolecular interaction model (Table III).
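The 1:1 bimolecular interaction model has a closed-form solution that is easy to simulate. The sketch below generates an idealized sensorgram for a 2 min injection followed by 3 min of dissociation, the timings given in Materials and methods. It is an illustration of the model only, not the global fitting procedure performed in CLAMP, and the rate constants, analyte concentration and Rmax are hypothetical values rather than entries from Table III.

```python
import math


def association(t, conc, ka, kd, rmax):
    """Response (RU) at time t (s) during analyte injection, 1:1 model."""
    kobs = ka * conc + kd
    req = rmax * ka * conc / kobs  # response at equilibrium
    return req * (1.0 - math.exp(-kobs * t))


def dissociation(t, r0, kd):
    """Response (RU) at time t (s) after the injection ends, starting from r0."""
    return r0 * math.exp(-kd * t)


def sensorgram(conc, ka, kd, rmax, t_on=120.0, t_off=180.0, step=1.0):
    """Simulate injection (t_on seconds) then dissociation (t_off seconds)."""
    points = []
    for i in range(int(t_on / step) + 1):
        t = i * step
        points.append((t, association(t, conc, ka, kd, rmax)))
    r_end = points[-1][1]
    for i in range(1, int(t_off / step) + 1):
        t = i * step
        points.append((t_on + t, dissociation(t, r_end, kd)))
    return points


# Hypothetical parameters: 100 nM Fab, ka = 1e5 1/(M*s), kd = 1e-3 1/s, Rmax = 10 RU.
curve = sensorgram(conc=100e-9, ka=1.0e5, kd=1.0e-3, rmax=10.0)
print(f"end of injection: {curve[120][1]:.2f} RU, end of run: {curve[-1][1]:.2f} RU")
```

In this model the dissociation phase depends only on kd, which is why peptides can be grouped by their off rates independently of the Fab concentration injected.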
SPR experiments monitored binding between immobilized peptides and purified Fab. On and off rates were determined by global fit analysis on CLAMP using a 1:1 bimolecular interaction model (Myszka and Morton, 1998). KD values were calculated from kd/ka. Synthetic peptides include a short, C-terminal biotin-containing sequence (not shown). Full-length peptides B and C were assayed as MBP fusion proteins.
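The 1:1 bimolecular interaction model can be illustrated with a minimal sketch of the standard Langmuir binding equations. This is not the global fitting performed in CLAMP; the function names and the rate constants in the usage note are hypothetical and are not values from Table III.

```python
import math

def equilibrium_kd(ka, kd):
    """Equilibrium dissociation constant KD (M) from the association rate
    constant ka (1/(M*s)) and the dissociation rate constant kd (1/s)."""
    return kd / ka

def association_response(t, conc, ka, kd, rmax):
    """Response (RU) at time t (s) after the start of an analyte injection
    at concentration conc (M), for a simple 1:1 interaction.
    rmax is the response at full saturation of the immobilized ligand."""
    k_obs = ka * conc + kd                   # observed rate constant (1/s)
    r_eq = rmax * conc / (conc + kd / ka)    # plateau response at this conc
    return r_eq * (1.0 - math.exp(-k_obs * t))

def dissociation_response(t, r0, kd):
    """Response (RU) at time t (s) after the end of the injection,
    starting from response r0; decay depends only on kd."""
    return r0 * math.exp(-kd * t)
```

For example, with assumed values ka = 1e5 M^-1 s^-1 and kd = 3.8e-3 s^-1, `equilibrium_kd` gives 38 nM, and an analyte concentration equal to KD yields a plateau at half of `rmax`.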
The assayed peptides could be categorized by their dissociation rates from the Fab (Figure 5). The cited epitope, His6, bound the Fab most weakly; the His6 peptide and the His6-tagged protein used in the original selection exhibited KD values of 0.6 and 3 µM, respectively. Additional His residues (His10 peptide) increased the association rate 6-fold without changing the dissociation rate significantly. Peptides from the selection demonstrated KD values of <75 nM, ∼10- to 75-fold better than the control His6 sequence, with increased affinities as a result of faster association (up to 5-fold) and considerably slower (6- to 21-fold) dissociation rates (Table III). Class 2 peptides with the ARRXA motif demonstrated the highest affinities, with ∼3-fold slower dissociation rates compared with sequences derived from peptide C (Figure 5C). While the flanking residues on peptide Cmin contribute at least 1.6 kcal/mol to the binding free energy compared with the His6 peptide, sequences with the ARRXA motif demonstrate 2.6 (peptide B) and 2.2 (peptide D) kcal/mol improvements. The contribution from these flanking residues is likely even greater, as these calculations do not account for any loss of binding free energy from having shorter (<6) stretches of His residues in the core site.
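The free energy differences quoted above follow from the ratio of KD values through the standard relation ΔΔG = RT ln(KD,ref/KD). The sketch below assumes a temperature of 25 °C (298.15 K), which is not stated in the text; the function names are illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def binding_ddg(kd_ref, kd, temp_k=298.15):
    """Gain in binding free energy (kcal/mol) of a ligand with dissociation
    constant kd relative to a reference ligand with kd_ref.
    temp_k defaults to 25 degrees C, an assumption not taken from the text."""
    return R_KCAL * temp_k * math.log(kd_ref / kd)

def fold_from_ddg(ddg, temp_k=298.15):
    """Fold improvement in KD corresponding to a free energy gain ddg."""
    return math.exp(ddg / (R_KCAL * temp_k))

# His6 peptide (KD = 0.6 uM) versus peptide Cmin (KD = 38 nM)
print(round(binding_ddg(0.6e-6, 38e-9), 1))  # about 1.6 kcal/mol
```

Under the same assumption, a 10-fold improvement in KD corresponds to roughly 1.4 kcal/mol, and a gain of 2.6 kcal/mol corresponds to an approximately 80-fold tighter KD.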
During an in vitro selection experiment against a His6-tagged target protein immobilized using an anti-polyhistidine antibody, mAb-binding peptides were inadvertently enriched. The His6-tagged fusion protein was of high quality and purity (see Materials and methods) and was previously used in functional assays (Ja and Roberts, 2004). Additionally, the presence of the target protein was confirmed in the elutions from each selection round (data not shown). Hence the enrichment of peptides that bind the mAb was most likely due to the existence of short sequences that confer significantly higher affinity than a hexahistidine tag. A preclearing step that included the mAb may not have been totally effective in preventing the selection of antibody-specific peptides, as even the final selection round resulted in an incomplete, ∼40% pull-down of the RNA–peptide fusions.
Although the cited mAb epitope is hexahistidine, the recovered peptides surprisingly each contained a shorter (≤5) stretch of consecutive His residues and a bias for Arg. Because His is clearly a key determinant in mAb recognition, we consider the selected peptides to be ‘epitope-like’ rather than mimotopes [linear peptides that mimic the binding mode of an epitope, a term previously reserved for distinct, alternative ligand structures (Stephen et al., 1995)]. The original immunogen used to develop the mAb was an N-terminal His6-tagged fusion protein of unknown sequence and identity (proprietary information, Sigma-Aldrich). Hence whether the regions flanking the His-tracks in the selected peptides have homology to the original immunogen cannot be determined. Had the antigen sequence been available, however, the epitope (centered on the His6 tag) would clearly have been recognizable.
To characterize the mAb epitope better and demonstrate the feasibility of gene-fragment mRNA display, a nested deletion library was constructed from the final selection pool. A previously described protocol, directional random oligonucleotide primed (DROP) synthesis of cDNA (Hampson et al., 1996), was modified to maintain as many viable library fragments as possible. Owing to the difficulty in obtaining a broad size distribution of sequences with degenerate oligos, DNase I was used for the random fragmentation of cDNA. DROP synthesis using a highly processive DNA polymerase, capable of potent strand displacement, yielded intact copies of the cDNA fragments while maintaining the sense strand (Figure 3A).
In vitro selection with the fragment library resulted in the identification of a 15-mer functional sequence (peptide Cmin: RHDAGDHHHHHGVRQ; KD = 38 nM) derived from the full-length 35-mer, peptide C. Because the initial fragment library was produced from a pool dominated by peptide C, we expected to recover and identify numerous overlapping peptides that defined a minimal epitope for this sequence. Surprisingly, the majority of recovered sequences came from unknown parents. The enrichment of these peptides implies that these fragments were more highly favored after truncation. The flanking regions of the original peptides may have hindered access to the epitope by the mAb, suggesting that peptide length may be an important attribute in the fine-tuning of affinity and/or function. Alternatively, these particular sequences may have been negatively biased by the constant C-terminal peptide used in the original random peptide library. The three-frame constant sequence used in the fragment library construction increases the sensitivity of the selection when one of the translation frames causes negative bias. Additionally, a random distribution between the three translation frames in the selected peptides would indicate that the 3′-constant region does not affect functional selectability or bias RNA–peptide fusion formation. The six independent clones of peptide D, for example, had all three frames represented in the 3′-constant region (Table II and data not shown).
Based on the selected peptide sequences, two major protein interaction motifs were identified: a core epitope consisting of at least three consecutive His residues and a second interaction site encoded by the consensus motif, ARRXA. SPR experiments demonstrated a significant increase in the association rate of His10 compared with His6, suggesting that additional His residues present a more accessible core interaction rather than slow dissociation by enhancing rebinding from multivalency effects. Only additional contacts, made by the addition of interacting residues such as the ARRXA motif, result in significantly slower dissociation rates. These flanking residues can contribute significantly to the binding free energy, at least 2.6 kcal/mol in the case of peptide B in comparison with His6, which assumes that the loss of two out of six histidines in the core has no effect. The two interaction cassettes we have identified here may be juxtaposed sites from the original fusion protein antigen.
Our results also highlight the importance of flanking residues outside of the two consensus motifs and their contribution to binding affinity with antibodies. Residues adjoining core amino acids in an epitope can substantially influence antibody binding, the effects of which can only be assessed through quantitative affinity measurements (Choulier et al., 2001; Coley et al., 2001). This is demonstrated in our experiments, where the rank order of binding in the immunoprecipitation assay did not entirely correspond with quantitative kinetic measurements. Epitope tags are often appended to proteins and used as molecular handles for detection, isolation and analysis of protein–protein interactions. Their functionality in this context, however, is highly variable. Tandem repeats of tags (e.g. the popular c-Myc or FLAG epitopes) have been used to ensure robust affinity and recognition by antisera (Nakajima and Yaoita, 1997; Hernan et al., 2000). By identifying longer functional peptides with appropriate flanking residues, high affinity can be maintained with less variability depending on the linker region and the protein to which the epitope is attached.
The fragment library selection resulted in a disproportionate number of peptides that did not contain an N-terminal deletion. Because of the 5′-UTR on the mRNA used to make the fragment library, more fragments containing the first start codon (with varying lengths of UTR sequence) were probably present in the initial fragment pool. 5′-UTR and/or promoter sequences most likely do not hinder the fragment selection process, as ribosome scanning can initiate translation at the correct start codon, regardless of which frame was amplified. This was seen in several of the selected fragment sequences (Table II). This property increases the number of viable (i.e. translatable) templates, but introduces some bias favoring intact N-terminal sequences.
Although not utilized in this experiment, the c-Myc tag introduced in the fragmentation library can be used to generate and purify a fragment library enriched with in-frame sequences. Although the tag is at the N-terminus of the library, in general RNA–peptide fusions will form only when the ribosome can translate most of the sequence and reach the end of the mRNA (unpublished results). Hence only sequences that lack stop codons (and therefore are most likely in-frame) will form fusions and be purified and amplified after a Myc-epitope pre-selection. Another improvement to the protocol includes using Exonuclease I to remove excess degenerate primers during DROP synthesis, preventing the amplification of sequences without ‘inserts,’ as DNA size fractionation by agarose gel is not completely effective in removing these smaller fragments (data not shown).
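The reasoning that fusion formation selects against stop codons has a simple in silico analogue that could be applied to sequenced clones. The sketch below is only an illustration of that logic and not part of the protocol; the function names are hypothetical, and the input is assumed to be a sense-strand DNA sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def has_stop_codon(dna, frame=0):
    """Return True if reading the sense strand in the given frame (0, 1 or 2)
    encounters a stop codon before the end of the sequence."""
    dna = dna.upper()
    for i in range(frame, len(dna) - 2, 3):
        if dna[i:i + 3] in STOP_CODONS:
            return True
    return False

def open_frames(dna):
    """List the reading frames (0, 1, 2) that contain no stop codon and
    could therefore be translated through to the 3' end."""
    return [frame for frame in range(3) if not has_stop_codon(dna, frame)]
```

For instance, `open_frames("ATGTAACAT")` returns `[1, 2]`, because only frame 0 reads through the TAA codon. Short fragments frequently lack stop codons in more than one frame, which is consistent with the text's caution that stop-free sequences are only "most likely" in-frame.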
The ability to access high-complexity libraries is a great advantage for mRNA display over other selection systems. Library construction methods that involve PCR and DNA reassembly are better suited for the mRNA display format, thereby avoiding cloning steps that are required in techniques such as phage display. A comparative study on epitope mapping using random 6-mer and 15-mer peptide phage display libraries successfully identified consensus motifs for only two of the four mAbs examined (Fack et al., 1997). For one of the mapped mAbs, the random peptide selection succeeded only with the 6-mer library, identifying a short consensus motif that was not discovered with the 15-mer library, which the authors attributed to a statistical lack of representation. In contrast, an mRNA display selection with a random 27-mer library identified epitope-like consensus motifs for the anti-c-Myc antibody, 9E10 (Baggio et al., 2002). The complete, 10-mer wild-type epitope was also selected from the library. The selection revealed the core determinants and some of the allowed flanking residues for mAb interaction. By using high-complexity, long peptide libraries, mRNA display selections can identify extremely rare sequences such as discontinuous or conformational epitopes, as well as novel mimotopes. The full-length consensus peptide, Hm–X2–ARRXA, found here, for example, may not have been identified with more traditional X6 or X10 phage display libraries.
While selection with biological libraries is an inexpensive and technically straightforward approach for epitope mapping, synthetic combinatorial libraries (SCLs) offer an alternative method for generating peptide ligands (Houghten et al., 1992; Pinilla et al., 1994a). Screens of mixture-based SCLs have resulted in the delineation of linear and discontinuous epitopes a priori (Geysen et al., 1986; Pinilla et al., 1993; 1994b). In these experiments, peptide SCLs are composed of mixtures containing positions defined with a specific amino acid while the remaining positions contain mixtures of residues. Various libraries are screened to determine the most active amino acids at each position of a peptide sequence. The optimal residues can be determined in parallel (‘positional scanning’ approach; Pinilla et al., 1994a) or iteratively, where each subsequent library is synthesized to expand on the most active libraries determined in the previous screen (Houghten et al., 1992). The construction of SCLs can be very specifically controlled and library members can be screened free in solution, avoiding any bias from a fusion partner.
While selection methods employ an amplification step for the enrichment of functional library members, individual peptides in an SCL must exist at sufficient concentrations to produce a signal (e.g. through ELISA screens). As the library size increases, the concentration of individual peptides decreases significantly. Additionally, larger libraries exponentially increase the effort required for the deconvolution step (e.g. with positional scanning SCLs) after a screen because individual peptides made up of combinations of the optimal amino acids at each position in a library need to be synthesized and analyzed. The interdependence of individually determined optimal amino acids at each position may cause significant deviations from the optimum full-length peptide sequence. This contrasts greatly with peptide selections, where complete peptide information is generated by DNA sequencing and each library member is presumably active for binding and already optimized in the context of other positions. These problems may not be significant in epitope mapping experiments, however, as most antibodies recognize linear epitopes composed of fewer than six residues. While SCLs seem appropriate for generating short peptide ligands, biological libraries can easily search hexamer sequence space, with techniques such as mRNA and ribosome display permitting exhaustive searches of decapeptide libraries.
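The scale of the hexamer and decapeptide searches mentioned above can be made concrete with a short calculation of sequence space and expected library coverage. The sketch assumes 20 amino acids sampled uniformly and independently (a Poisson approximation that ignores codon bias and stop codons). The library sizes in the usage example are assumed order-of-magnitude figures and are not taken from the text.

```python
import math

def sequence_space(length, alphabet=20):
    """Number of distinct peptides of the given length."""
    return alphabet ** length

def expected_coverage(library_size, length, alphabet=20):
    """Expected fraction of all possible peptides present at least once in a
    library of library_size independent, uniformly random members."""
    mean_copies = library_size / sequence_space(length, alphabet)
    # expm1 keeps precision when mean_copies is very small
    return -math.expm1(-mean_copies)

# Assumed library sizes, for illustration only
for size in (1e9, 1e13):
    for length in (6, 10, 27):
        print(f"library {size:.0e}, {length}-mer: "
              f"coverage {expected_coverage(size, length):.3g}")
```

Hexamer space contains 6.4 × 10^7 sequences and decapeptide space about 10^13, whereas a random 27-mer library samples only a vanishing fraction of its roughly 10^35 possible sequences.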
Owing to the higher efficiency of synthesizing the nested deletion library completely in vitro, the fragment library construction described here maintains a higher number of unique sequences, in contrast to DNA libraries produced by enzymatic ligation and cloning, which are limited by in vivo transformation efficiencies. Additionally, the library construction method we used is unidirectional for all amplified sequences so that the sense orientation is maintained and only the minimal two-thirds of the fragments are non-viable due to frameshifts. This protocol produces a well-distributed library and is technically less challenging as the random oligonucleotide priming is used only to ‘copy’ the cDNA fragments produced by DNase digestion and need not be optimized for generating a fragment distribution. mRNA display with fragment libraries combines the ease and versatility of working with cDNA in vitro with the benefits of expression cloning. The method permits the minimization of functional domains, as well as the isolation of optimal binding contexts through the removal of negative-acting flanking regions. Although the technique may not be sufficiently processive for the fine mapping of short peptide sequences, it should be highly applicable for constructing cDNA or tissue-specific expression libraries and the subsequent determination of minimal binding domains and novel protein–protein interactions.
We thank Dr David S.Waugh (National Cancer Institute at Frederick) for the pDW363 in vivo biotinylation vector, Professor Pamela J.Bjorkman and Anthony M.Giannetti for time and support on the Biacore 2000, William Hunter (Biacore, Piscataway, NJ) for technical advice on SPR, Cindy I.Chen and Christopher T.Balmaseda for preparative and technical assistance with library construction and protein purification and Professor David G.Myszka (University of Utah) for generously providing the kinetic analysis software, Scrubber and CLAMP. We greatly appreciate Dr Yuri Peterson (Duke University) for suggestions on the paper. We are indebted to Dr Ian N.Hampson (St Mary's Hospital, Manchester, UK) for valuable discussions and technical expertise on random priming and the synthesis of the fragment library. This work was supported by grants from the NIH (RO1 GM 60416) and the Beckman Foundation to R.W.R. W.W.J. was supported in part by a DOD National Defense and Engineering Graduate Fellowship.
1Division of Chemistry and Chemical Engineering, California Institute of Technology, M/C 147-75, Pasadena, CA 91125, USA
2Present address: Program in Cellular and Molecular Biology, University of Wisconsin, Madison, WI 53706, USA