Metalation calculators for E. coli strain JM109 (DE3): aerobic, anaerobic, and hydrogen peroxide exposed cells cultured in LB media

Abstract Three Web-based calculators, and three analogous spreadsheets, have been generated that predict in vivo metal occupancies of proteins based on known metal affinities. The calculations exploit estimates of the availabilities of the labile buffered pools of different metals inside a cell. Here, metal availabilities have been estimated for a strain of Escherichia coli that is commonly used in molecular biology and biochemistry research, e.g. in the production of recombinant proteins. Metal availabilities have been examined for cells grown in Luria-Bertani (LB) medium aerobically, anaerobically, and in response to H2O2 by monitoring the abundance of a selected set of metal-responsive transcripts by quantitative polymerase chain reaction (qPCR). The selected genes are regulated by DNA-binding metal sensors that have been thermodynamically characterized in related bacterial cells enabling gene expression to be read out as a function of intracellular metal availabilities expressed as free energies for forming metal complexes. The calculators compare these values with the free energies for forming complexes with the protein of interest, derived from metal affinities, to estimate how effectively the protein can compete with exchangeable binding sites in the intracellular milieu. The calculators then inter-compete the different metals, limiting total occupancy of the site to a maximum stoichiometry of 1, to output percentage occupancies with each metal. In addition to making these new and conditional calculators available, an original purpose of this article was to provide a tutorial that discusses constraints of this approach and presents ways in which such calculators might be exploited in basic and applied research, and in next-generation manufacturing.


Introduction
Metalation is difficult to comprehend based on metal affinities because proteins commonly bind wrong metals more tightly than those required for activity. [1][2][3] Metalation in vivo is sometimes assisted by metallochaperones and chelatases, 4,5 but this raises the question of how the correct metals partition onto the delivery proteins. The order of binding of essential and exchangeable cytosolic metals to proteins typically follows the Irving-Williams series. 3,6 Cells contain vast surpluses of binding sites for most metals and a subset of these sites are labile. 7 Metals can transfer from labile binding sites by ligand-exchange reactions. [7][8][9] To obtain a specific metal a protein must compete with these labile binding sites. To make metalation predictable it is necessary to know how tightly the labile metals are bound. Metal sensors have evolved to respond to changes in metal availability and have been used to estimate how tightly the exchangeable metals are bound. 10 A set of bacterial DNA-binding metal sensors (from Salmonella enterica serovar Typhimurium strain SL1344) was thermodynamically characterized (determining K metal for the tightest allosteric site, K DNA of apo-sensor, K DNA of holo-sensor, number of sensor molecules per cell in the absence and presence of elevated metal, number of promoter DNA targets) to relate DNA occupancy to intracellular metal availability. These data confirmed that the available labile pools of the tighter binding metals such as Ni2+, Zn2+, and Cu+ are maintained at the lowest free energies for forming metal complexes, while the weaker binding metals such as Mg2+, Mn2+, and Fe2+ are at the highest. 10 In short, metal availabilities in cells also follow the Irving-Williams series, 10 and metalation can be understood by reference to the respective free energy values. 10,11
A metalation calculator for an idealized Escherichia coli cell was previously created based on metal availabilities at the midpoints of the ranges for each metal of the similar set of Salmonella sensors. 11 This calculator first determines the difference between these availabilities and the free energy for forming a metal complex with a protein of interest, the latter determined from metal affinities using the standard relationship ΔG = −RT ln K A . The metal with the largest favorable free energy gradient, from the available labile pool to the protein, becomes the predominant metal bound. The calculator computes the free energy differences for all metals such that the total amount of metal bound to a site does not exceed a stoichiometry of 1. 11 Here we create calculators for conditional, rather than idealized, cells based on the status of the metal sensors, and hence metal availabilities, during standard growth of E. coli in Luria-Bertani (LB) medium. E. coli strain JM109 (DE3) has been used for this work since it is widely exploited for molecular biology and contains a full complement of metal sensors (unlike strain BL21, for example, which is aberrant in Ni2+ and Co2+ sensing and homeostasis). 12 Previous work calibrated the availabilities of cobalt and zinc in conditional cells of a strain of E. coli that had been engineered to produce vitamin B12, 11 and here we replicate this approach in JM109 (DE3) for all metals. Calculators have been generated for cells grown under aerobic conditions, anaerobic conditions, and after exposure to H2O2. These calculators can be used to predict and optimize the metalation of recombinant proteins overexpressed in E. coli. They can also be used to explore fundamental questions, e.g. related to the effects of oxygen status on metalation, and to identify disparities that illuminate the contributions of more elaborate mechanisms to the specificity of metalation.
An estimated 47% of enzymes require metals, and it is intended that accessible calculators will assist the optimization of metallo-enzyme-dependent sustainable manufacturing in industrial biotechnology. Here we discuss constraints associated with this approach and set out ways in which outputs of the calculators might be interpreted.

Bacterial strain maintenance/growth and reagents
Escherichia coli strain JM109 (DE3) was purchased from Promega. Liquid growth media and cultures were prepared in acid washed glassware or sterile plasticware to minimize metal contamination. Overnight cultures were inoculated into 400 mL LB (10 g/L tryptone, 5 g/L yeast extract, 10 g/L NaCl) at a 1 in 100 dilution and grown at 37°C to an OD 600 nm of 0.2-0.3, then split into 5 mL aliquots in 12 mL capped culture tubes with the indicated treatment and grown as indicated. All aerobic cultures were incubated with shaking at 180 rpm on a 45° angle while anaerobic cultures were incubated statically in an air-tight box with AnaeroGen anaerobic gas generating sachets purchased from Thermo Fisher Scientific, for 2 or 3 h. OD 600 nm measurements were made using a Thermo Scientific Multiskan spectrophotometer and percentage growth of n = 3 biological replicates (unless stated otherwise) calculated relative to untreated control cultures (n = 3 biological replicates).
All metal stocks were quantified by inductively coupled plasma mass spectrometry (ICP-MS). ICP-MS analysis was performed using the Durham University Bio-ICP-MS Facility. MnCl2, CoCl2, NiSO4, CuSO4, and ZnSO4 were prepared in ultrapure water. FeSO4 was prepared in 0.1 N HCl and diluted when required using ultrapure water. 13 Dimethylglyoxime (DMG) was dissolved in 100% ethanol. Metal solutions and ethylenediaminetetraacetic acid (EDTA) were filter sterilized prior to addition to bacterial cultures.

Determination of transcript abundance
Aliquots (1-1.2 mL) of culture were added to RNAProtect Bacteria Reagent (Qiagen) (2-2.4 mL), vortexed and incubated at room temperature for 5 min before pelleting by centrifugation (10 min, 3900 × g, 10°C). Supernatant was decanted and cell pellets stored at −80°C prior to processing. RNA was extracted using an RNeasy Mini Kit (Qiagen) according to the manufacturer's instructions. Samples were treated with DNaseI (Fermentas) following the manufacturer's instructions but excluding those samples with a 260/280 nm ratio of < 2. The cDNA was generated using the ImProm-II Reverse Transcriptase System (Promega) or SuperScript IV Reverse Transcriptase System (Invitrogen) with control reactions lacking reverse transcriptase prepared in parallel.
Transcript abundance was determined using primers 1 and 2 for mntS , 3 and 4 for fepD , 5 and 6 for rcnA , 7 and 8 for nikA , 9 and 10 for znuA , 11 and 12 for zntA , 13 and 14 for copA , 15 and 16 for rpoD , and 17 and 18 for gyrA , with each pair designed to amplify ∼100 bp ( Supplementary Table S1 ) . The quantitative polymerase chain reaction ( qPCR ) was performed in 20 μL reactions containing 5 ng of cDNA, 400 nM of each primer and PowerUP SYBR Green Master Mix ( Thermo Fisher Scientific ) . Three technical replicates of each biological sample were analysed using a Rotor-Gene Q 2plex ( Qiagen, Rotor-Gene-Q Pure Detection version 2.3.5 ) . Control reactions without cDNA template ( qPCR grade water used instead ) were run for each primer pair and -RT ( reverse transcriptase ) control reactions were run for the reference gene primer pair ( rpoD or gyrA ) . The qPCR was performed on n = 3 biological replicates for each treatment other than samples treated with 1 mM EDTA for 60 min, with primers specific to znuA and rcnA , where n = 5 biological replicates were analysed. Where an analysis was run more than once a mean of the determined C q values was used subsequently. C q values were calculated with LinReg PCR ( version 2021.1 ) after correcting for amplicon efficiency, 14 with each primer pair and treatment condition considered as an amplicon. Samples were rejected where the C q value for no template or -RT control was < 10 ( to the nearest integer ) greater than the equivalent value for cDNA-containing sample. These samples were either rerun or DNaseI treatment and cDNA synthesis performed again on RNA samples before running qPCR.
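The rejection rule for contaminated samples can be illustrated with a minimal Python sketch. The function name and the handling of controls that never amplify are assumptions made for illustration; they are not part of the published workflow, which used LinReg PCR outputs directly.

```python
import math
from typing import Optional


def control_is_acceptable(sample_cq: float,
                          control_cq: Optional[float],
                          min_gap: int = 10) -> bool:
    """Return True when a no-template or -RT control is clean enough.

    A sample is rejected when the control Cq is fewer than `min_gap`
    cycles (to the nearest integer) above the Cq of the cDNA-containing
    sample. A control with no amplification (None) is assumed to pass.
    """
    if control_cq is None:
        return True
    # Round half up so that the behaviour at x.5 is explicit.
    gap = math.floor((control_cq - sample_cq) + 0.5)
    return gap >= min_gap


# Hypothetical Cq values, for illustration only.
print(control_is_acceptable(18.0, 30.2))  # gap of 12 cycles
print(control_is_acceptable(18.0, 27.4))  # gap of 9 cycles
```

Samples failing this check would be rerun, or taken back through DNaseI treatment and cDNA synthesis, as described above.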

Determination of boundary conditions for the expression of each transcript
Boundary conditions for the calibration of sensor response curves were defined by the minimum and maximum abundance of the regulated transcript. Supplementary Table S2 gives the rationale behind the choice of growth conditions. Cell pellets from cultures exposed to 1.5 and 4 mM FeSO4 were brown, difficult to resuspend in RNA extraction buffer, discolored the buffer, and generated RNA samples with ratios of absorbance at 260/280 nm < 2, so were not processed further.
Following completion of all qPCR reactions, qPCR data for the control gene (rpoD) in each sample were reanalysed in a single LinReg PCR analysis along with collated data for rpoD expression in control conditions. The difference between rpoD C q in the condition of interest and control condition was determined (average rpoD C q for control condition minus average rpoD C q for treated sample) (Supplementary Table S3). All aerobic and anaerobic samples were compared against untreated aerobic 2 h controls and anaerobic samples were additionally compared against the most relevant anaerobic control condition (2 or 3 h treatment). Where a difference of > 2 was found for rpoD C q values, data were either not further analysed or investigated using a second control gene (gyrA, Supplementary Table S4), before deciding whether or not to proceed. Notably, cells cultured in the presence of 1 mM ZnSO4 showed a significant increase in rpoD transcripts relative to the control condition (Supplementary Table S3), which was not observed with gyrA (Supplementary Table S4), and these samples were not used further. Likewise, this was the case for samples from cells cultured anaerobically for 3 h either untreated or supplemented with 0.5 mM FeSO4 (Supplementary Tables S3 and S4). NiSO4 treatments where rpoD C q values changed by more than two (relative to control condition) and one treatment with EDTA (20 min) were also excluded.
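The reference-gene screen described above might be sketched as follows. The function names are hypothetical, and treating the threshold as an absolute difference (a shift in either direction) is an assumption of this sketch.

```python
from statistics import mean
from typing import Sequence


def reference_gene_shift(control_cq: Sequence[float],
                         treated_cq: Sequence[float]) -> float:
    """Average control Cq minus average treated Cq for the reference gene."""
    return mean(control_cq) - mean(treated_cq)


def needs_second_reference(control_cq: Sequence[float],
                           treated_cq: Sequence[float],
                           threshold: float = 2.0) -> bool:
    """Flag a treatment whose reference-gene Cq moved by more than `threshold`.

    Flagged treatments would either be dropped or checked against a
    second reference gene (gyrA in the text) before proceeding.
    """
    return abs(reference_gene_shift(control_cq, treated_cq)) > threshold


# Hypothetical rpoD Cq replicates, for illustration only.
control = [18.0, 18.2, 17.8]
treated = [15.5, 15.4, 15.6]
print(reference_gene_shift(control, treated))    # positive: more rpoD transcript
print(needs_second_reference(control, treated))
```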

Intracellular metal availability in conditional cells
The fold change in transcript abundance, relative to the mean of the control condition (lowest expression) for each sensor, was calculated using the 2^(−ΔΔCT) method, 15 with rpoD as the reference gene. Fractional responses of metal sensors (θ D for de-repressors and co-repressors, θ DM for activators) were calculated via equations 1 and 2, calibrating sensor fractional occupancy between 0.01 and 0.99:

$$\theta_{D} = 0.99 - 0.98 \left( \frac{\text{fold change}_{obs} - 1}{\text{fold change}_{max} - 1} \right) \qquad (1)$$

$$\theta_{DM} = 0.01 + 0.98 \left( \frac{\text{fold change}_{obs} - 1}{\text{fold change}_{max} - 1} \right) \qquad (2)$$

where fold change obs is the fold change in the condition of interest and fold change max is the maximum observed fold change. Values of 0.01 and 0.99 were selected as the dynamic range within which changes in gene expression as detected by qPCR will coincide with changes in metal availability: outside this range changes in metal availability will occur without detectable change in gene expression. Fractional responses of sensors were converted to available metal concentrations using the Excel spreadsheet (Supplementary Dataset 1) and MATLAB code (Supplementary Note 3) from Osman and coworkers, 10 along with known sensor metal affinities, DNA affinities and protein abundances determined for Salmonella sensors (Osman and coworkers 10), with numbers of DNA-binding sites for E. coli sensors (Supplementary Table S5). Sensor response curves were also determined with these values and materials.
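As an illustration of this calibration step, the sketch below computes a fold change by the 2^(−ΔΔCT) method and maps it onto a fractional sensor response. It assumes that equations 1 and 2 are linear interpolations between 0.99 and 0.01 based on the proportion of the maximum fold change, with a fold change of 1 corresponding to the minimum-expression boundary; it also assumes an amplification efficiency of exactly 2, whereas the reported analysis corrected for amplicon efficiency in LinReg PCR. All function names and example values are hypothetical.

```python
def fold_change(cq_target: float, cq_ref: float,
                cq_target_ctrl: float, cq_ref_ctrl: float) -> float:
    """Fold change by the 2^(-ddCt) method (efficiency assumed to be 2)."""
    delta_sample = cq_target - cq_ref             # dCt in condition of interest
    delta_control = cq_target_ctrl - cq_ref_ctrl  # dCt in control condition
    return 2.0 ** -(delta_sample - delta_control)


def fractional_response(fc_obs: float, fc_max: float, sensor_type: str) -> float:
    """Map an observed fold change onto theta_D or theta_DM.

    sensor_type is "repressor" (de-repressors and co-repressors, theta_D)
    or "activator" (theta_DM). The response is confined to 0.01-0.99.
    """
    proportion = (fc_obs - 1.0) / (fc_max - 1.0)
    proportion = min(max(proportion, 0.0), 1.0)   # clamp to the calibrated range
    if sensor_type == "activator":
        return 0.01 + 0.98 * proportion           # minimum expression -> 0.01
    if sensor_type == "repressor":
        return 0.99 - 0.98 * proportion           # minimum expression -> 0.99
    raise ValueError("sensor_type must be 'repressor' or 'activator'")


# Hypothetical Cq values: target gene 3 cycles earlier than in the control.
fc = fold_change(cq_target=20.0, cq_ref=18.0, cq_target_ctrl=23.0, cq_ref_ctrl=18.0)
print(fc)
print(fractional_response(fc, fc_max=50.0, sensor_type="repressor"))
```

Converting the resulting θ values into buffered metal concentrations requires the sensor-specific thermodynamic model of Osman and coworkers 10 and is not reproduced here.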
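Intracellular available ΔG Metal was calculated using equation 3:

$$\Delta G_{\mathrm{Metal}} = RT \ln [\mathrm{Metal}] \qquad (3)$$

where [Metal] is the intracellular available metal concentration.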
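A short numerical sketch of this conversion follows. It assumes that the free energy takes the form ΔG = RT ln [Metal], which follows from ΔG = −RT ln K A for a site whose dissociation constant equals the available metal concentration. The temperature of 298.15 K and the example concentration are assumptions chosen for illustration and are not values from this work.

```python
import math

R_KJ = 8.314e-3   # gas constant in kJ K^-1 mol^-1
T_K = 298.15      # assumed temperature in K (not taken from the text)


def available_free_energy(metal_conc_molar: float, temperature: float = T_K) -> float:
    """Free energy (kJ/mol) for forming a complex that is 50% metalated
    at the given buffered available metal concentration (in M)."""
    return R_KJ * temperature * math.log(metal_conc_molar)


def concentration_from_free_energy(delta_g: float, temperature: float = T_K) -> float:
    """Inverse conversion: available metal concentration (M) from kJ/mol."""
    return math.exp(delta_g / (R_KJ * temperature))


# Hypothetical buffered concentration of 1 nM.
dg = available_free_energy(1e-9)
print(round(dg, 2))  # about -51.37 kJ/mol under the stated assumptions
```

More negative values correspond to more tightly buffered, and hence less available, metals.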

Simulated metalation of molecules under bespoke conditions
Three calculators have been created for prediction of molecule metalation in JM109 (DE3) under aerobic and anaerobic (after 2 h to give 0.1 to 1% O2 with a coincident indicative change in nikA expression; Supplementary Table S6) conditions, and in response to H2O2 treatment, accounting for multiple inter-metal competitions plus competition from the intracellular buffer as described by Young and coworkers 11 (https://mib-nibb.webspace.durham.ac.uk/metalation-calculators/ and Supplementary Spreadsheets 1-3). Estimations of molecule metal affinities were taken from cited references.
The Web-based calculators were created by converting the Excel spreadsheet created by Young and coworkers 11 into HTML and JavaScript code that could be run in a Web browser. This code was then turned into a plugin for Wordpress, to allow Wordpress page authors to create calculators with different default values for metal availabilities. The source code alone underlying the operation of the Web-based calculators has additionally been made available on GitHub and Zenodo. 16

Results
Identifying fold change in abundance of mntS, fepD, rcnA, nikA, znuA, zntA, and copA transcripts by qPCR
We previously defined the range of buffered metal concentrations over which each metal sensor responds on selected promoters (response curves) based on their DNA- and metal-binding affinities, protein molecules per cell under low and high metal exposures, and number of promoter targets within the cell. 10 The E. coli metal-regulated genes equivalent to those studied in Salmonella were used here since their responses had been characterized. 10 Notably, their promoters were previously selected in Salmonella because the cognate metallo-regulator has a single DNA target or because other targets are subject to additional tiers of regulation, e.g. by both MntR and Fur. More recently the position on the response curve under bespoke conditions was used to determine Co2+ availability in E. coli cells engineered to produce vitamin B12. 11 To calculate metal availability under specific conditions, minimum transcript abundance observed by qPCR was first defined to enable fold change in expression under specific conditions to be related to this boundary condition. Calculations of metal availability secondarily require identification of maximum transcript abundance by qPCR to define the opposite boundary.
To select conditions for RNA isolation, JM109 (DE3) was cultured in metals and EDTA to identify ∼15% growth inhibition relative to untreated cells after 2 h exposure (Supplementary Table S7); where necessary, more inhibitory concentrations were used along with shorter or longer exposure times. Anaerobic culture, exposure to H2O2 and to the Ni-specific chelator DMG followed established protocols. 17,18 The control gene rpoD was used throughout, plus gyrA to validate or eliminate samples where rpoD expression (C q) changed by more than two, relative to control conditions (Supplementary Tables S3 and S4).
Expression from the nikA promoter is repressed in response to rising Ni2+ availability by NikR but dependent upon activation by Fnr under anaerobic conditions. [19][20][21][22] We observed Ni2+-dependent regulation of nikA expression both aerobically and anaerobically (Supplementary Table S6). Boundary conditions were therefore independently defined for anaerobically and aerobically grown cells.
Supplementary Tables S6 and S8-13 show all of the resulting C q values. Change in gene expression relative to the lowest abundance for each transcript was calculated as a series of ΔΔC q values (Supplementary Tables S14-21). In turn ΔΔC q values were expressed as fold increase in gene expression (Fig. 1).

Calibrated responses of MntR, Fur, RcnR, NikR, Zur, ZntR and CueR to Mn2+, Fe2+, Co2+, Ni2+, Zn2+, and Cu+
The fold changes in gene expression shown in Fig. 1 need to be calibrated to DNA occupancies (conditional θ D for repressors or θ DM for activators) and in turn DNA occupancies related to buffered metal concentration. Minimum transcript abundance defines maximum DNA occupancy for co-repressors and de-repressors (θ D assigned as 0.99) while defining minimum DNA occupancy for metalated activators (θ DM assigned as 0.01). To set the opposite end of the response range for the co-repressors and de-repressors (θ D = 0.01) and for activators (θ DM = 0.99) we selected the highest expression from Fig. 1. Conditional θ D or θ DM values were calculated based on the proportion of the maximum fold change observed for each promoter using common equations for repressors (Equation 1) and a separate relationship for activators (Equation 2).
The relationships between intracellular metal availabilities and θ D or θ DM for each sensor were calculated essentially as described previously using measured metal affinities, DNA affinities, protein abundance, numbers of DNA targets ( Supplementary Table S5 ) , via Excel spreadsheet ( Supplementary Dataset 1 ) available in Osman and coworkers. 10 Notably, the numbers of target promoters for CueR and NikR in E. coli differ from Salmonella ( Supplementary Table S5 ) . The values θ D and θ DM at 0.01 and 0.99 were related to available metal concentration as above and using MATLAB code ( Supplementary Note 3 ) available in Osman and coworkers 10 ( red symbols on Fig. 2 ) .

Availabilities of Mn2+, Fe2+, Co2+, Ni2+, Zn2+, and Cu+ in cells cultured in LB media
To estimate the buffered concentrations of available metal inside E. coli strain JM109 (DE3) cells grown in LB media, fold changes in abundance of transcripts encoded by the seven metal-responsive genes were calculated in RNA isolated from cells grown aerobically, anaerobically, and in response to H2O2. Values were converted to DNA occupancies for the respective metal-sensor proteins (θ D and θ DM) (green symbols on Fig. 2). Buffered concentrations of available metals were thus derived from their established relationships to θ D or θ DM for each sensor using MATLAB code (Supplementary Note 3) available in Osman and coworkers 10 (Table 1).

Free energies for forming half-saturated metal complexes at intracellular metal availabilities in E. coli
Inside cells available metals are mostly bound to labile sites rather than fully hydrated. For clarity, and to assist subsequent data manipulations, metal availabilities were expressed in terms of free energies for forming complexes with proteins (or other types of molecules) that will be 50% metalated at the respective buffered available metal concentrations shown in Table 1. The dissociation constant (K D) of such a molecule matches the metal concentration and free energy (ΔG) is calculated using the relationship shown in Equation 3. These data reveal how tightly labile metals are bound and hence the magnitude of competition for each metal inside E. coli JM109 (DE3) under specific growth conditions (red symbols on Fig. 3). Standard deviations were calculated based upon the triplicated determinations of buffered concentration that have been averaged in Table 1. For comparison, previously used metal availabilities at the midpoints of the ranges of each sensor for each metal in idealized cells are also shown (gray symbols in Fig. 3).

Three Web-based metalation calculators
To predict the metalation states of proteins of known metal affinities, three metalation calculators were developed based on the intracellular metal availabilities estimated in cells grown in LB media under aerobic, anaerobic, and H2O2 exposed conditions (Fig. 3). Initially calculators were produced as spreadsheets (Supplementary Spreadsheets 1-3), as described previously. 11 The spreadsheets complete two operations, firstly calculating the difference in free energy for metal binding to the protein versus competing sites of the intracellular milieu, secondly accounting for inter-metal competition as described previously. 11 Web-based versions of the three calculators have been generated, each prepopulated with the metal availabilities determined under each of the three growth conditions (https://mib-nibb.webspace.durham.ac.uk/metalation-calculators/). Toggle switches exclude metals from the calculations, enabling simulations for proteins where some affinities are unknown (toggle switches are visible on the far left in Fig. 4). Metal affinities are entered as dissociation constants, K D, and the calculators output free energies for forming the respective metal complexes as well as occupancies.
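The two operations performed by the calculators can be illustrated with a minimal Python sketch. It is not the published spreadsheet or JavaScript code: it assumes standard competitive binding at a single site, in which each metal is weighted by exp(−ΔΔG/RT), equivalent to [Metal]/K D, and the unoccupied site contributes a weight of 1. The temperature, function names, and the availabilities and affinities in the example are hypothetical placeholders, not the values of Table 1.

```python
import math
from typing import Dict

R_KJ = 8.314e-3   # gas constant in kJ K^-1 mol^-1
T_K = 298.15      # assumed temperature in K


def free_energy_from_kd(kd_molar: float) -> float:
    """Free energy (kJ/mol) for forming the protein-metal complex."""
    return R_KJ * T_K * math.log(kd_molar)


def occupancies(protein_kd: Dict[str, float],
                available_dg: Dict[str, float]) -> Dict[str, float]:
    """Fractional occupancy of one site by each metal.

    protein_kd:   metal -> K_D of the protein for that metal (M).
                  Leaving a metal out mimics switching its toggle off.
    available_dg: metal -> free energy of the available pool (kJ/mol).
    """
    weights = {}
    for metal, kd in protein_kd.items():
        # Step 1: free energy difference, protein versus intracellular buffer.
        ddg = free_energy_from_kd(kd) - available_dg[metal]
        weights[metal] = math.exp(-ddg / (R_KJ * T_K))
    # Step 2: inter-metal competition; the "1" is the metal-free site, so the
    # summed occupancy can never exceed a stoichiometry of 1.
    denominator = 1.0 + sum(weights.values())
    return {metal: w / denominator for metal, w in weights.items()}


# Hypothetical availabilities (kJ/mol) and affinities (M), for illustration.
availability = {"Mn": -30.0, "Fe": -35.0, "Zn": -60.0}
protein = {"Mn": 1e-6, "Fe": 1e-7, "Zn": 1e-11}
for metal, fraction in occupancies(protein, availability).items():
    print(f"{metal}: {100 * fraction:.1f}%")
```

In this formulation a protein whose K D for a metal equals the available concentration of that metal is half-occupied when no other metal competes, which is the definition of availability used here.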

Metal availability follows the Irving-Williams series under all three conditions
Metals tend to associate with nascent proteins with the following order of preference (from weakest to tightest): Mg2+ < Mn2+ < Fe2+ < Co2+ < Ni2+ < Cu2+ (Cu+) > Zn2+, as set out in the original Irving-Williams series and noting the subsequent addition of monovalent copper. 3,6 The availabilities of these metals (as divalent forms except copper, which is predominantly monovalent in the cytosol) in E. coli JM109 (DE3) follow this series under all three conditions (Fig. 3, Table 1). Notably the series is ambiguous about the exact position of Zn2+, only specifying that binding is weaker than copper, 3 and in conditional cells the free energy for forming complexes with available Zn2+ is indeed less negative than copper, but also slightly less negative than Ni2+ (Fig. 3, Table 1). By maintaining the available forms of the tightest binding metals at the lowest free energies for forming metal complexes the challenge to correctly metalate proteins in cells is substantively overcome. 3,7,10,11

In LB media Fe2+ is more, and Cu+ and Co2+ less, available than in idealized cells
Comparisons of the calculated free energies for forming complexes with available metals in the cytosol of idealized cells (where sensors are assigned to the midpoints of their ranges; gray symbols on Fig. 3) with those inside conditional E. coli JM109 (DE3) (where the status of the sensors has been determined via qPCR; red symbols on Fig. 3) reveal that Fe2+ is more available than previously suggested. 10 Notably, the availability of Fe2+ is only slightly less than Mn2+ (Fig. 3 Table S5). (Fig. 2). In some bacteria Fe2+ sensors have been discovered that detect highly elevated Fe2+, including most recently riboswitches. 23,24 These observations raise the intriguing possibility that Fe2+ availability may become even greater than detected here, perhaps even exceeding the availability of Mn2+.
Enhanced binding of Fe2+-Fur has been documented at some E. coli promoters under anaerobic conditions, but not the fepD promoter. 25 Moreover, the number of Fur molecules per cell is insufficient to populate all Fur-target promoters in Salmonella (and by inference E. coli) cells grown aerobically but becomes sufficient after iron supplementation, and hence some target promoters with weaker affinities for Fur could remain vacant in E. coli cells cultured aerobically in LB media. 10 Calibration of a different Fur-target promoter could allow the detection of some further increase in available Fe2+ in anaerobically cultured cells, although it is noted that change in Fe2+ availability may be difficult to observe by these methods in the upper responsive range of Fur in part due to hysteretic effects of changes in Fur abundance. 10 Counterintuitive, apparently elevated, available Fe2+ in H2O2-treated cells may also be a function of this limitation or due to displacement of Fe2+ from binding sites. Iterative improvements in estimated metal availabilities will be updated on the Web-based versions of the calculators. The number of promoter targets for CueR and NikR differs in E. coli relative to Salmonella (Supplementary Table S5). The modelled responses of these sensors are consequently altered in E. coli, slightly changing availabilities in idealized cells as well as conditional cells (Figs. 2 and 3). Previous data established in another strain of E. coli that the availability of Co2+ was considerably less than predicted in idealized cells and this observation is confirmed here for JM109 (DE3). 11 The availability of copper is also significantly less than in idealized cells, defining the lower boundary condition in response to H2O2 (Figs. 2 and 3). Copper is an extremely potent pro-oxidant catalyzing the Fenton reaction and a reduction in available Cu+ in the presence of H2O2 will limit production of the deadly hydroxyl radical. The free energy of exchangeable and available Ni2+ does not significantly change under anaerobic conditions where it is known that the metal is imported to supply hydrogenase. 26 In anaerobic cells the magnitude of additional import by the Nik system and the magnitude of flux through the Hyp metallochaperones into nascent protein, notably hydrogenase, may be approximately matched.

Fig. 3 Free energies of available metal in E. coli JM109 (DE3). Intracellular available free energies for metal binding to molecules that would be 50% saturated at the available metal concentration (Table 1).

H2O2 increases Mn2+ availabilities
Mn2+ is an effective antioxidant. 27 Detection of H2O2 by OxyR triggers Mn2+ import and activates manganese superoxide dismutase (SodA) with evidence that SodA is otherwise mis-metalated with iron and inactive. 28,29 Here we similarly detect elevated Mn2+ in response to H2O2 (Fig. 3 and Table 1). Some treatments, including potentially exposure to H2O2, may directly modify the sensors independently of effects on metal availability, raising a caveat that readouts of free energies for metalation may become less accurate for some metals under these conditions. The increase in available Mn2+ is accompanied by a doubling in the total cellular manganese quota (Supplementary Fig. S1A). Intriguingly, in 4 mM exogenous manganese the manganese quota increases by 3 orders of magnitude and under these conditions H2O2 does not further increase the quota, perhaps reflecting differences in the localization of the excess manganese, e.g. in the periplasm versus cytosol (Supplementary Fig. S1B).
Metals are kinetically trapped within SodA but attempts have been made to measure affinities via thermal unfolding of metalated and un-metalated protein. 32 This approach generates affinities of 3.1 × 10^-9 M for Mn2+ and 2.5 × 10^-8 M for Fe2+. 32 SodA becomes correctly metalated after exposure to H2O2 and indeed the H2O2 calculator predicts 99.5% occupancy with Mn2+ and 0.5% occupancy with Fe2+ in this condition. However, using these affinities the aerobic calculator predicts only slightly reduced occupancy to 96.6% with Mn2+ and only slightly increased to 3.4% occupancy with Fe2+. The reported affinities do not follow the Irving-Williams series, raising a tantalizing possibility that they do not reflect flexible sites at which exchangeable metals partition onto the nascent unfolded SodA protein prior to kinetic trapping. Simulations using an affinity for Mn2+ which is 10-fold weaker than the reported affinity for Fe2+, abiding by the Irving-Williams series, do flip occupancy from predominantly Fe2+ in the absence of H2O2 to predominantly Mn2+ in the presence of H2O2, by using the respective two calculators (26.1% Mn2+ and 72.9% Fe2+ aerobically to 72.0% Mn2+ and 27.8% Fe2+ in H2O2).
A spectroscopically active variant of a czcD riboswitch, previously designated as a Co2+/Ni2+ sensor, was recently shown to respond to Fe2+ when studied in E. coli. 23 This experimental observation was shown to align with predictions from calculations based on idealized cells and the thermodynamically calibrated ranges of Salmonella, and by inference E. coli, metal sensors. 10,23 The predictions are also supported here when using the aerobic calculator to give occupancies of 74.6% Fe2+, negligible occupancy with either Ni2+ or Co2+ (0.01% Co), also zero occupancy with Zn2+ and a modest 9.7% occupancy with Mn2+. An alteration in the documented selectivity of this riboswitch from Ni2+ and Co2+ to Fe2+ seems apposite.

Uses, constraints, and future prospects for metalation calculators
Predictions of the calculators can be used to: identify and/or confirm metal specificity; infer mis-metalation; reveal erroneous metal affinity measurements; suggest where metal availabilities may differ from calculator values; and indicate where additional mechanisms assist metalation. Several of these are exemplified by simulations in the preceding section.
The calculators assume that the molecule of interest does not deplete the buffered available metal pool. This suggests a potential constraint associated with the widespread use of E. coli overexpression systems to generate recombinant proteins for research purposes, and here the outputs of metal sensors might reveal where overexpression of a metalloprotein depletes available metals. The calculators also assume a 1:1 metal complex with a preformed metal site on the molecules of interest. They cannot directly predict occupancies for metal-dependent assemblies in which the ligands are derived from more than one molecule: notably, the degree of metal binding to such complexes will depend on the intracellular ligand concentrations and hence be prone to occur when proteins are overexpressed in E. coli. It is anticipated that future iterations of the calculator could be developed for such metal-dependent assemblies where affinities are reported as β values in per square molar values (M^-2).
As noted earlier some strains of E. coli commonly used for protein overexpression are aberrant in metal homeostasis. For example, strain BL21 (DE3) lacks the rcn genes, is mutated in fnr and lacks the mod operon for molybdate uptake. 33 It is anticipated that bespoke versions of the calculator might be generated for such strains. Additionally, E. coli unlike Salmonella lacks a dedicated Co2+ uptake system. 34 Under- or mis-metalation of Co2+- or Ni2+- (and indeed molybdenum-) requiring proteins has been documented and may be common following overexpression in E. coli. 11,35 The calculators predict metal occupancies based upon a population of intracellular metal-buffering sites at thermodynamic equilibrium with the molecule of interest. The magnitude of disparities from these predictions can thus be used to establish the magnitude of additional contributions to metal specificity, e.g. where the distribution of metalated products is, at least in part, kinetically determined. Localization of a nascent metalloprotein proximal to a metal importer may favor metalation in a niche where metal availability is greater than average for the compartment: there is also limited evidence of direct metal transfer by ligand exchange from importers to docked metalloproteins. [36][37][38][39] Kinetic bias could theoretically occur where a buffering molecule preferentially accesses a specific metal site, with dedicated metallochaperones representing an extreme example.
Previous use of an earlier iteration of a metalation calculator revealed that the Co2+-chaperone CobW alone would be unable to acquire Co2+ in E. coli. 11 Binding of GTP and Mg2+ increased the Co2+ affinity sufficiently to enable metalation. In contrast, after hydrolysis, it was previously noted that binding of GDP would lead to Co2+ release, illustrating how use of such a calculator can uncover the contributions of crucial molecular interactions to metalation, and hence mechanisms of action. 11 Simulations for proteins of other organisms but using these calculators for E. coli strain JM109 (DE3) (https://mib-nibb.webspace.durham.ac.uk/metalation-calculators/ and Supplementary Spreadsheets 1-3) may indicate where availabilities could depart from those estimated here. It is anticipated that future iterations of these calculators may be generated with metal availabilities established for other cell types, either via bespoke calibration of metal sensors from other strains or through use of other approaches to estimate metal availability. This could include the use of small molecule metal probes or genetically encoded probes of metal availability. Furthermore, subtle differences between the metal sensors of E. coli and Salmonella might have led to disparities in estimating metal availabilities that could emerge in simulations of metalation of E. coli proteins (Supplementary Table S22). This could be resolved by thermodynamically characterizing the respective E. coli sensor in the manner performed for Salmonella. 10 The Web-based version of the calculators will catalogue such updates in estimates of metal availability (see note added post review).
There is a view that many reported metal affinities of proteins are not correct, and a recent article provides a guide to such