Design and characterization of structured protein linkers with differing flexibilities

Engineered fusion proteins containing two or more functional polypeptides joined by a peptide or protein linker are important for many fields of biological research. The separation distance between functional units can impact epitope access and the ability to bind with avidity; thus the availability of a variety of linkers with different lengths and degrees of rigidity would be valuable for protein design efforts. Here, we report a series of designed structured protein linkers incorporating naturally occurring protein domains and compare their properties to commonly used Gly4Ser repeat linkers. When incorporated into the hinge region of an immunoglobulin G (IgG) molecule, flexible Gly4Ser repeats did not result in detectable extensions of the IgG antigen-binding domains, in contrast to linkers including more rigid domains such as β2-microglobulin, Zn-α2-glycoprotein and tetratricopeptide repeats. This study adds a set of linkers with varying lengths and rigidities to the available linker repertoire, which may be useful for the construction of antibodies with enhanced binding properties or other fusion proteins.


Introduction
Fusion proteins are engineered biomolecules containing parts from two or more genes synthesized as a single multifunctional construct. These have been critical in many areas of biological research including affinity purification (Lichty et al., 2005) and protein stabilization for structure determination (Zou et al., 2012). Bi-specific fusion proteins have also been utilized as biopharmaceuticals, with an active drug domain fused to a carrier domain, allowing for the drug's proper transport (Chen et al., 2013). Such proteins have been designed to penetrate epithelial membranes including the blood-brain barrier, as well as to target a specific cell population (Pardridge, 2010). Due to the modularity of protein domains in the generation of functional constructs, fusion proteins will likely have increasing importance in research and drug design.
The successful construction of fusion proteins relies on the proper choice of a protein linker, as direct fusion of two domains can lead to compromised biological activity (Bai et al., 2005; Zhang et al., 2009). Several studies have utilized existing databases to compile and characterize linkers in naturally occurring multi-domain proteins (Argos, 1990; George and Heringa, 2002). These studies have yielded amino acid sequence propensities for natural linkers of various lengths, as well as information on rigidity and secondary structure. This information has helped the empirical design of linkers that are customized for particular applications.
Linkers can be classified into three groups: flexible, rigid and cleavable (Chen et al., 2013). Flexible linkers are generally composed of small, non-polar or polar residues such as Gly, Ser and Thr. The most common is the (Gly4Ser)n linker, (Gly-Gly-Gly-Gly-Ser)n, where n indicates the number of repeats of the motif. Polyglycine linkers have also been evaluated, but the addition of a polar residue such as serine can reduce linker-protein interactions and preserve protein function. Due to their flexibility, these linkers are unstructured and thus provided limited domain separation in a previous study (Evers et al., 2006). As a result, more rigid linkers including polyproline motifs (Schuler et al., 2005) and an all α-helical linker, A(EAAAK)nA (Arai et al., 2001), have been developed.
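The repeat notation above can be made concrete with a minimal sketch that expands (Gly4Ser)n and A(EAAAK)nA into one-letter amino acid sequences. The function names are hypothetical and chosen for illustration only; the output is the generic motif and is not claimed to match the full linker sequences listed in Table II.

```python
def flexible_linker(n: int, glycines: int = 4) -> str:
    """Expand (Gly_k Ser)_n into one-letter code; k defaults to 4 (Gly4Ser)."""
    if n < 1 or glycines < 1:
        raise ValueError("n and glycines must be positive integers")
    return ("G" * glycines + "S") * n


def helical_linker(n: int) -> str:
    """Expand the alpha-helical motif A(EAAAK)_n A into one-letter code."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return "A" + "EAAAK" * n + "A"


# Illustrative use: three Gly4Ser repeats and two EAAAK repeats.
print(flexible_linker(3))
print(helical_linker(2))
```

Setting `glycines=2` gives the (Gly2Ser)n variant of the same motif.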
We are interested in using relatively rigid protein linkers to separate anti-HIV binding proteins at distances that would permit bi- or multivalent binding to HIV Env glycoproteins, with the objective of creating reagents capable of cross-linking epitopes within a single Env trimer (intra-spike cross-linking). Such reagents would take advantage of avidity effects to minimize HIV's ability to evade neutralizing antibodies by rapidly mutating to lower the affinity between the HIV epitopes and the antigen recognition fragment (Fab) of the antibody (Klein et al., 2009). Although the architecture of the HIV spike trimer does not permit intra-spike cross-linking by most natural antibodies (Zhu et al., 2006; Klein and Bjorkman, 2010), it may be possible to create reagents capable of bivalent binding to an HIV Env trimer by fusing two identical reagents or two different reagents with a linker of appropriate length. Here we report the design, construction and characterization of a series of structured protein linkers incorporating both rigid and flexible domains that can be used to achieve a variety of different desired separations. The linkers were incorporated into the hinge region of an intact immunoglobulin G (IgG) antibody and evaluated for their relative lengths and rigidities by dynamic light scattering (DLS).

Plasmid construction and protein purification
Genes encoding designed linkers were synthesized (Blue Heron Bio) with restriction sites for the enzymes NheI (5′-end) and either NgoMIV or HindIII (3′-end). These sites were also introduced into the gene encoding the heavy chain of the HIV-neutralizing antibody b12 (Roben et al., 1994) such that the insert would be located between hinge region residues His235 and Thr236. Constructs encoding the b12 heavy chain gene with a linker inserted in the hinge region were subcloned into the pTT5 mammalian expression vector. The b12-linker IgGs were expressed transiently in HEK 293-6E cells by co-transfecting the b12-linker heavy chain genes with the b12 light chain gene as described (Diskin et al., 2011).
† These authors contributed equally to this work.
© The Author 2014. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
IgG-linker fusion constructs were purified by protein A affinity chromatography (GE Healthcare) followed by purification and analysis by size-exclusion chromatography (SEC) using a Superdex 200 10/300 GL column (GE Healthcare) in phosphate-buffered saline, 0.05% w/v sodium azide, pH 7.4.

Dynamic light scattering
Fractions corresponding to the center of the SEC elution peak were concentrated using Amicon Ultra-15 Centrifugal Filter Units (Millipore) with a molecular weight cutoff of 100 kDa to a volume of 80-400 μl and concentrations of 0.5-1 mg/ml. Concentration differences within this range were not observed to affect the hydrodynamic radius values determined by DLS (data not shown). Sample volumes ranging from 80 to 350 μl were loaded into a disposable cuvette, and measurements were performed on a DynaPro® NanoStar™ (Wyatt Technology) using the manufacturer's suggested settings. A fit of the second-order autocorrelation function to a globular protein model was used to derive the hydrodynamic radius.

Design and identity of designed linkers
In order to design potential structured linkers, we surveyed the Protein Data Bank (PDB) to find structures that were relatively elongated and rigid, or represented small globular proteins. We chose Zn-α2-glycoprotein (ZAG; PDB code: 1ZAG) as an example of a relatively elongated and rigid structure (Sanchez et al., 1999), and β2-microglobulin (β2m; PDB code: 1LDS) and ubiquitin (Ub; PDB code: 1UBQ) as examples of small globular proteins (Fig. 1A). ZAG is a 31.5 kDa protein with a class I major histocompatibility complex heavy chain-like fold and a separation distance between the N- and C-termini of 45 Å. β2m is a stable 12 kDa protein with an immunoglobulin constant region-like fold that forms a rigid structure with a separation distance between the N- and C-termini of 35 Å (Trinh et al., 2002). Likewise, Ub is a compact, stable 8.5 kDa protein with an N- and C-terminal separation distance of 37 Å (Vijay-Kumar et al., 1987). In addition to the structured linkers chosen from the PDB, proline-rich linkers were designed from the hinge sequence of IgA1 (polyPro and polyPro(Glyc)). This glycosylated region confers rotational flexibility of the Fab relative to the Fc in the context of wild-type dimeric IgA1 (Bonner et al., 2008). In addition, glycosylation has been shown to potentially increase the stability of polypeptide linkers (Imperiali and O'Connor, 1999). ZAG, β2m and Ub proteins were joined in various combinations with short linker regions, either (Gly2Ser)n repeats, glycosylated proline-rich sequences (polyPro(Glyc)) or unglycosylated proline-rich sequences (polyPro), to create linkers L1-L12 (Table I).
Fig. 1. … (1UBQ). The cTPR structure shown contains eight tandem repeats. N- and C-terminal residues are shown as sticks, color-coded blue for the N-terminus and red for the C-terminus.
We also created linkers using tetratricopeptide repeat domains (TPRs; PDB code: 2AVP; L13-L16; Table I; Fig. 1A) (Kajander et al., 2007) that are found in natural proteins such as HSP70/90 (Scheufler et al., 2000). These domains are well suited for use as potential structured linkers because the length of a set of tandem TPR domains corresponds predictably with the number of repeats. Each repeat consists of 34 amino acids with a defined sequence motif that forms two α-helices (D'Andrea and Regan, 2003). Seven to eight TPRs form a complete superhelical turn with a pitch of 72 Å. For our TPR linkers, we used a consensus sequence defined by the amino acid of the greatest global propensity in the natural database of the TPR domains at each position, which was shown to form a stable superhelix and was therefore named the consensus TPR sequence or cTPR (Main et al., 2003).
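The predictable scaling of TPR length with repeat number can be illustrated with a rough back-of-the-envelope estimate. The sketch below assumes that the axial length grows linearly with the number of repeats, at 72 Å per seven to eight repeats; this linear model, the function names and the printed ranges are illustrative assumptions, not values measured in this study. The repeat counts 3, 6, 9 and 12 are those of the cTPR constructs.

```python
PITCH_ANGSTROM = 72.0        # superhelical pitch quoted in the text
REPEATS_PER_TURN = (7, 8)    # "seven to eight TPRs" per complete turn
RESIDUES_PER_REPEAT = 34     # amino acids per TPR


def tpr_axial_length(n_repeats: int) -> tuple[float, float]:
    """Return a (low, high) estimate in Å of the length along the superhelix axis.

    Assumes a simple linear model: each repeat contributes pitch / repeats-per-turn.
    """
    if n_repeats < 1:
        raise ValueError("n_repeats must be a positive integer")
    low = n_repeats * PITCH_ANGSTROM / max(REPEATS_PER_TURN)
    high = n_repeats * PITCH_ANGSTROM / min(REPEATS_PER_TURN)
    return low, high


def tpr_residue_count(n_repeats: int) -> int:
    """Number of residues in n tandem repeats (flanking sequences excluded)."""
    return n_repeats * RESIDUES_PER_REPEAT


for n in (3, 6, 9, 12):
    low, high = tpr_axial_length(n)
    print(f"{n:2d} repeats: {tpr_residue_count(n)} residues, ~{low:.0f}-{high:.0f} Å along the axis")
```

Because the repeats wind around a superhelix, the end-to-end distance of a construct can differ from this axial estimate, so the numbers are best read as a scaling guide.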
Finally, for comparison, we constructed a series of (Gly4Ser)n linkers (L17-L24; Table I) in order to determine the effect of increasing the number of flexible Gly4Ser repeats on the hydrodynamic radius of the IgG. The complete sequence of each linker is given in Table II.
As a scaffold for comparing the designed structured linkers, we inserted each into the hinge region of an intact IgG antibody (the anti-HIV antibody b12) (Roben et al., 1994). We chose the hinge region of an IgG, which encompasses the amino acids between the C-terminus of the heavy chain portion of the antigen-binding fragment (Fab) and the N-terminus of the Fc, to insert the linkers because it can tolerate large protein insertions (Redpath et al., 1998). In addition, extension in the hinge region could potentially increase the separation distance of the Fab arms (Fig. 1B).

Characterization of the IgGs containing structured linkers
The b12 IgG proteins containing linkers L1-L24 were expressed by transient transfection in HEK 293-6E mammalian cells and purified by affinity and size-exclusion chromatography. Visualization by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) for IgGs containing the L1-L8 linkers showed that all proteins were purified to >95% homogeneity (Fig. 2). Under reducing conditions, two heavy chain bands were observed for b12-L1, whose linker contained three potential N-linked glycosylation sites, indicating the presence of multiple glycosylated isoforms. An overlay of the chromatograms derived from SEC showed that the IgGs containing the L1-L8 structured linkers all exhibited a decrease in retention volume relative to wild-type IgG, consistent with the expected increases in the radius of gyration (R_g) of each of the constructs due to the addition of a structured linker (Fig. 3).
We next derived the hydrodynamic radii using DLS for wild-type b12 and the b12 proteins containing designed linkers. DLS measures fluctuations over time in the intensity of light scattered by a protein solution, which can be used to calculate an autocorrelation function of intensity (Nobbmann et al., 2007). Typical monodisperse samples (including our hinge-linked antibodies) generate an exponential decay in the autocorrelation. A least squares fit can be performed to calculate the decay constant, which directly relates to the diffusion coefficient. The diffusion coefficient is in turn inversely related to the characteristic hydrodynamic radius R_H, which reflects the radius of a hypothetical solid sphere that would diffuse at the same rate as the protein. The R_H value is not a direct measurement of the length that the linker contributes to the size of the IgG. However, comparative analysis can yield rank order differences for the relative lengths and rigidity of the various linkers. For example, if the separation between the IgG Fc and Fab domains were increased by the addition of a designed hinge linker, we would expect an observable increase in the R_H of the fusion construct compared with the parental b12 IgG due to the increased size of the diffusion sphere.
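The chain from autocorrelation decay to hydrodynamic radius can be sketched as follows. This is a minimal illustration under stated assumptions, not the instrument's analysis software: it assumes a single-exponential (monodisperse) decay obeying the Siegert relation, g2(τ) = 1 + β·exp(−2Γτ), with Γ = D·q² and the Stokes-Einstein relation R_H = k_B·T / (6π·η·D). The wavelength, scattering angle, refractive index, temperature, viscosity and the 5.5 nm test radius are hypothetical placeholder values, and all function names are invented for the example.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def scattering_vector(wavelength_m, refractive_index, angle_deg):
    """Magnitude of the scattering vector q in 1/m."""
    return (4 * math.pi * refractive_index / wavelength_m) * math.sin(math.radians(angle_deg) / 2)


def fit_decay_rate(taus, g2, baseline=1.0):
    """Least-squares line through ln(g2 - baseline) vs. tau; the slope equals -2*Gamma."""
    ys = [math.log(g - baseline) for g in g2]
    n = len(taus)
    mean_t = sum(taus) / n
    mean_y = sum(ys) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(taus, ys))
    sxx = sum((t - mean_t) ** 2 for t in taus)
    return -(sxy / sxx) / 2


def hydrodynamic_radius(gamma, q, temperature_k, viscosity_pa_s):
    """Stokes-Einstein radius (m) from the decay rate Gamma = D * q**2."""
    diffusion = gamma / q ** 2
    return K_B * temperature_k / (6 * math.pi * viscosity_pa_s * diffusion)


# Synthetic, noise-free demonstration with hypothetical (assumed) parameters.
Q = scattering_vector(658e-9, 1.333, 90.0)   # assumed laser wavelength, water, 90 degrees
T, ETA = 298.15, 8.9e-4                      # assumed 25 degrees C, viscosity of water in Pa*s
TRUE_RH = 5.5e-9                             # hypothetical radius used to generate the data
true_gamma = K_B * T / (6 * math.pi * ETA * TRUE_RH) * Q ** 2
taus = [i * 2e-6 for i in range(1, 51)]      # delay times in seconds
g2 = [1.0 + 0.9 * math.exp(-2 * true_gamma * t) for t in taus]

rh = hydrodynamic_radius(fit_decay_rate(taus, g2), Q, T, ETA)
print(f"recovered R_H = {rh * 1e9:.2f} nm")
```

Because R_H is inversely proportional to the decay rate, a construct that diffuses more slowly (smaller Γ) yields a larger R_H, which is the basis for the rank-order comparison of linkers.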
The hydrodynamic radii were measured by DLS for each of the b12 IgG-linker fusion proteins and compared with an internal wild-type b12 IgG control (Fig. 4). By comparing constructs containing elongated or small protein domain linkers, cTPR repeat linkers and flexible (Gly4Ser)n linkers of various lengths (L17-L24), we could directly compare the effects of incorporating different lengths of flexible vs. structured protein linkers.
We observed a consistent trend for the R_H values between glycosylated and non-glycosylated linkers (L1, L2, L3 and L12 vs. L4). The incorporation of three potential N-linked glycosylation sites in proline-rich linkers derived from the hinge region of IgA1 (L1) appeared to increase the R_H relative to constructs containing similar linker sequences with only one (L2) or no (L4) N-linked glycosylation sites, possibly through stabilization of the folded state, leading the linker to adopt a more extended conformation (Shental-Bechor and Levy, 2008). While the addition of only a single potential N-linked glycosylation site did not seem to affect the diffusion rate of proline-rich linkers (compare L2 and L4), a single potential N-linked glycosylation site in the GGSG-NSS-GSGG region of a combination proline-rich and Gly2Ser linker (L3) increased its R_H beyond the R_H of a proline-rich linker with three potential N-linked glycosylation sites (L1). These data are consistent with the observation that N-linked glycosylation confers rigidity in the backbone of a flexible linker (Liu et al., 2000), suggesting these reagents contained linkers with a more extended conformation. Thus incorporating potential N-linked glycosylation sites within flexible linkers may be a general method to increase linker rigidity.
Adding a single β2m domain to a linker increased the R_H of the b12-linker protein to a similar degree as a proline-rich repeat relative to IgG (compare L5 to L2, L4 and IgG),

cTPR linker series
cTPR constructs were generated with 3, 6, 9 or 12 tandem repeats. All cTPR linkers were flanked by (Gly4Ser)3 sequences (Table II). The constructs exhibited a consistent decrease in elution volume on SEC as a function of the number of repeats (Fig. 3). These constructs also predictably increased the R_H of the linked IgG with increased number of tandem repeats (Fig. 4). The hydrodynamic radius of the cTPR12 construct corresponded to approximately the size of L4, which contained a proline-rich linker. These data suggested that, unlike with repeated domains of the structured linkers, the increase in separation between the Fab and Fc correlated predictably with the number of cTPR repeats despite the presence of Gly4Ser peptides flanking the N- and C-termini.

(Gly4Ser)n linker series
In order to compare our structured linkers to the typical unstructured Gly-Ser linkers commonly used in protein design and engineering, we constructed, expressed and purified eight IgG-(G4S)n variants. In contrast to the SEC profiles for the structured linker constructs, there were only small differences in elution volume for the IgGs including Gly4Ser linkers (L17-L24). These differences often did not correlate with molecular mass, as IgG-GS9, the IgG with the largest linker, eluted at approximately the same volume as wild-type IgG, which eluted after some of the constructs with shorter linkers (Fig. 3).
Fig. 3. Overlay of size-exclusion chromatograms for IgGs containing flexible and structured protein linkers. Structured linkers (L1-L8) exhibited larger decreases in retention volume with respect to wild-type compared with Gly4Ser linkers, which exhibited little to no decrease depending on the number of repeats. Structured cTPR linkers also exhibited consistent decreases in retention volume as a function of the number of repeats.
Unlike proline-rich linkers and rigid linkers consisting of natural protein domains such as β2m, Gly4Ser linkers that did not contain a potential N-linked glycosylation site did not detectably increase the hydrodynamic radius of the IgG, suggesting that these linkers did not provide increased separation between the Fab and Fc domains (Fig. 4). These results were consistent with the observation that Gly4Ser linkers did not provide significant separation between the joined domains in the context of other fusion proteins (Arai et al., 2001). Measurements of IgG-GS9 from two preparations showed only a slight difference in R_H (0.1 nm), indicating that these measurements were quite robust and that relatively small differences in R_H may be significant.
Optimized linkers are important for the construction of multi-functional fusion proteins, in terms of both immunogenicity and conformational dynamics. Different linker compositions can alter their effective length and rigidity. In this study, we used SEC and DLS to characterize designed linkers in the context of an IgG to determine whether these linkers could increase the distance between the antigen-binding fragments. We found that flexible Gly4Ser linkers did not increase the R_H of fused reagents, suggesting these linkers did not provide increased separation between the Fab and Fc domains even with up to nine Gly4Ser repeats, in agreement with previous studies (Arai et al., 2001). By contrast, the structured helical cTPR linkers provided consistent increases in R_H and decreases in SEC elution volume as a function of repeat number, indicating that these repeats can be used to increase the separation distance between two proteins or domains. Our other designed linkers, including those containing naturally occurring proteins such as β2m and ZAG, yielded increases in the observed R_H by as much as twice the R_H of a naturally occurring IgG. The systematic characterization of the lengths and rigidity properties of the structured protein linkers and a range of (Gly4Ser)n linkers reported here adds a new set of tools to the available linker repertoire for engineering fusion proteins.