Functional Cyanobacterial β-Carboxysomes Have an Absolute Requirement for Both Long and Short Forms of the CcmM Protein

Carboxysomes are an essential part of the cyanobacterial CO2-concentrating mechanism, consisting of a protein shell and an interior of Rubisco. The β-carboxysome shell protein CcmM forms two peptides via a proposed internal ribosomal entry site (IRES) within the ccmM transcript in Synechococcus PCC7942. The abundant short form (35 kD, M35) consists of Rubisco small subunit-like repeats and binds Rubisco. The lower abundance long form (58 kD, M58) also contains a γ-carbonic anhydrase-like domain, which binds the carboxysomal carbonic anhydrase, CcaA. We examined whether these CcmM forms arise via an IRES or by other means. Mutations of a putative internal start codon (GTG) and Shine-Dalgarno sequence within ccmM, along with a gene coding for M35 alone, were examined in the high-CO2-requiring (HCR) carboxysomeless mutant, ΔccmM. Expression of wild-type ccmM in ΔccmM restored the wild-type phenotype, while mutation of the putative start and Shine-Dalgarno sequences led to as much as a 20-fold reduction in M35 content with no recovery from the HCR phenotype. These cells also contained small electron-dense structures. Cells producing little or no M58, but sufficient M35, were found to contain large electron-dense structures and no CcaA, and had a HCR phenotype. Large subcellular aggregates can therefore form in the absence of M58, suggesting a role for M35 in internal carboxysome Rubisco packing. The results confirm that M35 is independently translated via an IRES within ccmM. Importantly, the data reveal that functional carboxysomes require both M35 and M58 in sufficient quantities and with a minimum stoichiometry

Carboxysomes of cyanobacteria and some proteobacteria are protein-bound microcompartments that enclose the majority of a cell's complement of Rubisco (Price et al., 1992; McKay et al., 1993). Rubisco is the prime CO2-fixing enzyme in the biosphere and is responsible for the vast majority of biologically acquired CO2 (Ellis, 1979). Approximately 50% of global primary productivity occurs in the oceans, where cyanobacteria, and therefore carboxysomal Rubisco, account for a significant proportion of global carbon acquisition (Field et al., 1998). Cyanobacteria are also significant primary producers in terrestrial and freshwater environments. The carboxysome is an essential component of the CO2-concentrating mechanism (CCM) in cyanobacteria, being the site where CO2 levels are elevated around Rubisco following the accumulation of HCO3− in the cytoplasm due to active uptake systems for CO2 and HCO3− (Price et al., 2008). Furthermore, the cyanobacterial CCM and the carboxysome are obligate requirements for efficient photosynthetic CO2 fixation and growth. Carboxysome-based CO2 fixation therefore represents a singularly significant contribution to global carbon fixation. There are two types of cyanobacterial carboxysomes (α- and β-carboxysomes), whose composition differs by the phylogenetic grouping of their Rubiscos (form 1A or 1B Rubisco, respectively) and the proteins that contribute to the carboxysome shell (cso or ccm type; Badger et al., 2002; Cannon et al., 2002; Badger and Price, 2003).
Our work focuses on β-carboxysomes, which are present in cyanobacteria from both freshwater and marine habitats (Badger et al., 2006). This group of carboxysomes is characterized by their ccm-type shell proteins, the complement of which is currently the subject of a number of characterization studies; present indications are that β-carboxysomes could be composed of as few as six shell proteins (CcmKLMNO and CcaA) and the two Rubisco proteins (RbcLS; Kerfeld et al., 2005; Price et al., 2008; Yeates et al., 2008; Tanaka et al., 2009; Cannon et al., 2010). The smaller shell components (CcmK, L, and O) are members of the bacterial microcompartment family of proteins and have homologs in α-carboxysomes. The larger CcmM and N proteins, however, are unique to β-carboxysomes and are likely to have particular structural or functional roles in the carboxysome.
We have previously proposed that the CcmM proteins of β-carboxysomes are essential Rubisco-organizing components that may coordinate the initial nucleation of the carboxysome shell facets by binding Rubisco and the carboxysomal carbonic anhydrase (CA), CcaA, as well as organizing an outer layer of interlocking CcmK hexamers (Long et al., 2007). CcmM is characterized by two specific domains: an N-terminal γ-CA-like domain and a C-terminal domain of Rubisco small subunit (SSU)-like repeats (Price et al., 1993; Ludwig et al., 2000). The crystal structure of the CA-like domain of Thermosynechococcus elongatus BP-1 CcmM has recently been elucidated and shown to be an active CA in that and other species lacking ccaA (Peña et al., 2010). In addition, there is consistent evidence that CcmM exists as at least two forms in a number of β-cyanobacteria; our studies show that in Synechococcus PCC7942 these are a full-length 58-kD form (M58) and a shorter 35-kD form (M35) that contains three SSU-like repeats (Price et al., 1998; Long et al., 2005).
In Synechococcus PCC7942, M35 possibly results from a putative internal ribosomal entry site (IRES) within the ccmM gene transcript (Price et al., 1998; Ludwig et al., 2000; Long et al., 2005, 2007; Fig. 1). N-terminal sequencing of the short form of CcmM (M35) suggested the N terminus to be Ser 217, possibly resulting from a putative GTG start codon downstream of the putative Shine-Dalgarno (SD) ribosomal binding sequence (GAGG), and subsequent N-terminal Met loss (Long et al., 2007). This corresponds to an equivalent peptide from the CcmM of Nostoc (Anabaena) PCC7120 and matches likely short forms from a suite of related CcmM sequences (Long et al., 2007). In support of these findings, Cot et al. (2008) showed that CcmM from Synechocystis PCC6803 is present in carboxysomes as a number of peptides resulting from multiple putative IRESs and translation initiation codons within the gene transcript. In all cases, it appears that short forms of CcmM maintain complete Rubisco SSU repeats (Ludwig et al., 2000), which are responsible for Rubisco binding and play an important role in carboxysome structure (Long et al., 2007). However, the true origin of CcmM short forms is unknown.
In a recent review we highlighted the potential to utilize components of the cyanobacterial CCM to enhance nitrogen- and water-use efficiency in C3 crop plants (Price et al., 2008). A long-term part of this approach could incorporate carboxysome-like structures into C3 chloroplasts to generate a compartmentalized Rubisco with an elevated CO2 concentration. To make progress in this area, however, further comprehensive information on the interactions of cyanobacterial Rubiscos and their carboxysomal component proteins is required. Here we investigate the complicated expression of two important forms of CcmM to better understand their roles in carboxysome assembly. We show that an IRES and translation start codon are responsible for the production of two peptides from the ccmM gene and that there is an absolute requirement for both M35 and M58 in functional carboxysome formation. The results indicate specialized roles for M35 and M58 in both carboxysome structure and function.

RESULTS
The carboxysome production capabilities, requirement for high CO2, and distribution of carboxysome proteins from each ccmM mutation (Fig. 1) are summarized in Table I. Mutation of the putative internal start codon or putative SD sequence in ccmM, in the carboxysomeless deletion mutant ΔccmM (Woodger et al., 2005; Emlyn-Jones et al., 2006), led to a significant reduction in the CcmM short form (M35) in Synechococcus PCC7942 cells (Table I; Fig. 2). Immunoblots of cell lysates and Mg2+-precipitable proteins from wild-type and mutant cell types (Fig. 2) showed that both forms of CcmM were present in wild-type cell extracts and those of ΔccmM + ccmM, ΔccmM + gtc, ΔccmM + NL, and ΔccmM + SD mutants. However, M35 amounts were significantly diminished in the ΔccmM + gtc, ΔccmM + NL, and ΔccmM + SD mutants (Fig. 2). Neither M58 nor M35 was present in ΔccmM cell extracts. M58 could not be detected in ΔccmM + M35 or ΔccmM + frame shift (FS) cells, where an introduced FS was designed to eliminate M58 production but maintain M35 via the uninterrupted IRES (Fig. 1). However, both of these cell types contained greater relative quantities of M35 than those found in wild-type cells (Fig. 2). The absence of M58 from ΔccmM + FS cells suggests successful termination of M58 translation at the introduced stop codon (TGA), while M35 production within these cells confirms independent translation of the shorter peptide via the putative GTG start codon.
Figure 1. Wild-type and mutant Synechococcus PCC7942 ccmM gene sequence constructs. Partial wild-type Synechococcus PCC7942 ccmM gene sequence (WT) indicating the positions of the putative internal SD sequence (shaded) and putative internal start codon (underlined) is shown. Partial amino acid sequences of the resulting polypeptides (M58, M35, and M23) are shown with the confirmed N-terminal amino acid of M35 (boxed; Long et al., 2007). N-terminal sequence analysis suggests that the initial Met of M35 is lost posttranslationally (Long et al., 2007). The mutated read-through sequences, ccmM-NL and ccmM-gtc, are indicated with individual base changes from the wild-type sequence (underlined) and individual amino acid changes (bold). The resulting CTG (Leu) codon in the ccmM-NL mutation is also a rare initiation codon in some instances (Missiakas et al., 1993; Sazuka et al., 1999; Tichy and Vermaas, 1999). The ccmM-FS mutation introduces a stop codon (shaded) while maintaining the M35 sequence by introducing a frame shift (FS) through removal of a single base (indicated by an arrowhead). The ccmM-SD mutation eliminates the putative internal SD sequence by a single codon mutation (underlined). The construct coding for M35 alone (ccmM-M35) has the translation start codon immediately downstream of the nirA promoter in plasmid pSE4 (Maeda et al., 1998; Price et al., 2004).
While expression of the mutated forms of ccmM, or the M35 form alone, within the ΔccmM background did not alleviate the high-CO2-requiring (HCR) phenotype, expression of the wild-type gene (ΔccmM + ccmM) recovered wild-type physiology (Table I). The ΔccmM + SD cell type displayed an intermediate inorganic carbon (Ci) requirement for photosynthesis, with a slightly lower K0.5 Ci (the concentration at which the half-maximum photosynthetic rate occurs) than the other ccmM mutants, but it still required high CO2 for growth (Table I; Supplemental Fig. S1).
Magnesium precipitation of carboxysome-related complexes from each cell type showed that the carboxysomal CA CcaA was undetectable in ΔccmM and ΔccmM + M35 cells (Fig. 2). However, CcaA was detected in all other cell types (Fig. 2). Nonetheless, all the cell types producing M58 or the 23-kD (M23) γ-CA domain (ΔccmM + FS cells) contained relatively low CcaA quantities compared to wild type (Fig. 2). Detection of M23 in ΔccmM + FS cells was not achieved by western-blot analysis due to both the extremely poor immunogenicity of this region of CcmM in polyclonal antisera raised against M58 and the probable low abundance of the peptide in this cell line (although, in the absence of specific proteases, M23 can be detected in high abundance in Escherichia coli expression cell lines; Supplemental Fig. S2). Because of the known interaction between CcaA and the γ-CA domain of CcmM (Long et al., 2007; Cot et al., 2008), the detection of stable CcaA in ΔccmM + FS cells (Fig. 2) was presumed to indicate the presence of M23 at below-detectable quantities in these cells.
Immunoblots of lysates, Mg2+ supernatants, and pellets revealed that both RbcL and CcmK1 (the homolog of CcmK2 from Synechocystis PCC6803) were present in all cell types and were successfully precipitated from cell lysates with Mg2+ in most cases (Fig. 2). Several additional bands cross-reacting with the CcmK1 antibody were observed in ΔccmM cell lysates but not in any other cell lines (Fig. 2), indicating potential proteolysis of this protein in this cell line; these protein bands appeared to be too small to be the homologs CcmK3 and CcmK4. With regard to CcmM, there was no evidence of any peptides smaller than 35 kD that might arise from proteolytic cleavage of M58 into two distinct products, as evidenced by probing western blots with a polyclonal antibody specifically raised against the entire mature M58 polypeptide (data not shown). Specifically, no protein corresponding to a 23-kD or smaller M58 proteolysis fragment was observed. In ΔccmM + M35 and ΔccmM + FS cells, where M58 was absent, carboxysomal proteins were primarily found in the supernatant after Mg2+ treatment (Table I). This is not the case for ΔccmM cells, however. CcmK1 was confined to the supernatant fraction after Mg2+ precipitation of lysates from ΔccmM + SD, ΔccmM + FS, and ΔccmM + M35 cells (Fig. 2).
Table I. Summary of carboxysome protein characteristics and CO2 requirement in ccmM mutants. The presence of each of the carboxysome proteins identified in each Mg2+ fraction is subjectively identified from western blots (Fig. 2) according to the following scheme: present but low abundance (+), present and high abundance (+++), present and equivalent abundance in each fraction (++), or absent (−). In wild-type cells, carboxysome proteins are predominantly associated with the Mg2+ pellet fraction. HCR in each cell type is indicated.
a Photosynthetic half-saturation concentrations of Ci for wild-type and mutant cell types. Photosynthetic parameters were determined by membrane inlet mass spectrometry as described in the text. Data presented are means of triplicate measurements ± SD. Maximum photosynthetic rates measured were in the range 330 to 513 µmol O2 evolved (mg Chl)−1 h−1. A representative set of mass spectrometric measurements of Ci-dependent O2 evolution in wild-type Synechococcus PCC7942 and ΔccmM mutants is presented in Supplemental Figure S1.
b EDB, Electron-dense bodies.
c ΔccmM + ccmM-NL cells contain small electron-dense bodies with apparent planar surfaces (Fig. 3).
Electron-dense structures (possibly rhomboid and appearing rectangular in cross section) were observed in ΔccmM + FS and ΔccmM + M35 cells, where M35 was the only CcmM polypeptide present (Fig. 3). Electron microscopy revealed that wild-type and ΔccmM + ccmM cell types contained carboxysomes while ΔccmM did not (Fig. 3). Cells of ΔccmM + NL, ΔccmM + gtc, and ΔccmM + SD mutants contained electron-dense structures although, for the most part, properly formed carboxysomes were not observed in sections of these cell types (Fig. 3). The ΔccmM + NL cells contained electron-dense regions with planar surfaces, reminiscent of carboxysome edges. The ΔccmM + gtc and ΔccmM + SD cells predominantly contained small dense structures that have the appearance of very small carboxysome structures and might also be carboxysome related (Fig. 3).

M35 Arises from an IRES and GTG Start Codon in ccmM
Mutations of a putative IRES and start codon within the Synechococcus PCC7942 ccmM gene have enabled us to determine a translational origin for the carboxysomal protein M35. Our data reveal a significant reduction in M35 production in the ΔccmM + gtc mutant (mutation of a GTG start codon to a frequently used but nonstart GTC codon), indicating that this protein arises via a mechanism other than M58 proteolysis. Because it is a silent mutation that retains the internal Val, ΔccmM + gtc should result in wild-type or at least ΔccmM + ccmM levels of both CcmM peptides if M35 results from M58 proteolysis. This is clearly not the case, with M35 approximately 20-fold lower in ΔccmM + gtc cells compared to wild type (Fig. 2). In addition, M35 production still occurs (albeit at low levels) in the ΔccmM + NL mutant, which was designed to eliminate any potential proteolytic cleavage site within M58 by varying the amino acid residues at the known M35 N terminus while retaining a known CTG start codon. Furthermore, immunoblots of cell extracts of all cell types using a polyclonal antibody (that recognizes the entire M58 polypeptide) were not successful in identifying any M58-like peptides (potential cleavage products) smaller than M35 (B.M. Long, unpublished data). This antibody shows relatively poor, yet sufficient, detection of the N-terminal (M23) γ-CA-like domain of CcmM when expressed in E. coli (Supplemental Fig. S2), suggesting that, if present, M23 is below detection limits in Synechococcus PCC7942 cells. Previous analysis by ourselves and others has not identified a peptide corresponding to the N-terminal domain of CcmM in β-carboxysomes (Long et al., 2005, 2007; Cot et al., 2008). Although N-terminal sequence analysis of M35 and the 36-kD homolog from Nostoc PCC7120 suggests that in both cases they might arise from a potential full-length CcmM cleavage at the carboxy side of an internal Val residue (Long et al., 2007), the data presented here clearly indicate a nonproteolytic origin for M35.
Figure 2. Carboxysomal proteins in wild-type Synechococcus PCC7942 and ccmM mutants. Western blots showing the presence or absence of carboxysomal proteins in the whole-cell lysates (L), Mg2+ supernatants (S), and Mg2+ pellets (P) from wild-type (WT), ΔccmM, and ΔccmM complementation mutant cell types. Cells were lysed and crude carboxysome-rich preparations formed by precipitation with 25 mM Mg2+ as described in the text. In all cases samples equivalent to 0.5 mg Chl in the lysate fraction were loaded onto SDS-PAGE gels. Separated proteins were transferred onto PVDF membranes and probed as described in the text. The proteins identified are indicated. NS signifies a nonspecific band that is routinely visible in CcmM western blots in our hands and corresponds to RbcL. This possibly results from M58 fragments comigrating with RbcL during electrophoresis. M35 to M58 ratios and relative CcaA content in lysate samples are indicated and were determined from Attophos-based immunoblots as described in the text. Relative CcaA amounts are expressed relative to the amount in wild-type cell lysates (set at 100%) ± SD (n = 3). NA, Not applicable; ND, not detected. A complete dataset is available in Supplemental Table S1.
The ccmM gene of Synechococcus PCC7942, and related β-cyanobacteria, sits within a carboxysomal shell gene operon (ccmKLMNO; Omata et al., 2001) and codes for a 58-kD protein (Price et al., 1993). However, a short form of CcmM is routinely identified in greater quantities than the full-length form in both cell lysates and enriched carboxysome fractions from this and related cyanobacteria (Price et al., 1998; Long et al., 2005, 2007; Cot et al., 2008). A putative translation start site and IRES have been identified within the ccmM sequence (Ludwig et al., 2000; Long et al., 2007) that could account for independent M35 production. While IRESs are rare in cyanobacteria (the petH gene of Synechocystis PCC6803 being the only known cyanobacterial example; Thomas et al., 2006), such complex translation processes have also been described in a number of organisms including E. coli (Sanatinia et al., 1995) and Salmonella typhimurium (Schoenhals et al., 1998).
In wild-type cells, M35 is produced in greater quantities than M58 (Fig. 2; Long et al., 2007), suggesting that the IRES and internal start codon of ccmM dominate translation of the ccmM mRNA on the ccmKLMNO transcript. A possible explanation for this is that the IRES might result in both relatively high rates of M35 initiation and a greater rate of paused M58 translation (i.e. greater residence time of the ribosome at the IRES). Our analysis of local mRNA folding of hairpin loops at the translation initiation regions of both M58 and M35 within the ccmM transcript gives little indication as to which of these sites, if any, is likely to result in greater levels of translation (Supplemental Fig. S3; De Smit and Van Duin, 1994). It is worth noting that both out-of-context plasmid expression of ccmM (as in ΔccmM + ccmM) and expression from the wild-type operon lead to similar absolute quantities and ratios of M35 and M58 (Fig. 2). This confirms that competitive translation of the M58 or M35 regions of the ccmM message is independent of the ccm operon, and that the processing signals are present within the ccmM gene itself.

Carboxysome Formation Minimally Requires Close to Equal Quantities of M35 and M58
The data presented here and in a previous study (Long et al., 2007) for ccmM mutants suggest that functional carboxysomes require M35 and M58 in at least an approximate 1:1 ratio. While M35 quantities are greater than M58 in wild-type cells (Fig. 2; Long et al., 2007), we have found that functional carboxysomes can form when M35:M58 stoichiometries are significantly perturbed by production of excess M58 (Long et al., 2007). However, the IRES-interrupting ΔccmM + NL, ΔccmM + gtc, and ΔccmM + SD mutants appear to contain far too little M35 to enable sufficient carboxysome formation. Notably, however, small electron-dense structures were found to form in ΔccmM + NL, ΔccmM + gtc, and ΔccmM + SD cells (Fig. 3), although these are apparently not fully functional carboxysomes (Table I). Relative quantitative analysis of M35 and M58 amounts in these mutants suggests approximately 3.5-fold and 20-fold decreases in M35 to M58 ratios in these cells (Table I).
In a previous study we found that the M35:M58 ratio of enriched carboxysome fractions from high-CO2-grown wild-type cells was approximately 4:1, while M35:M58 ratios in His6-tagged CcmM mutants, capable of producing functional carboxysomes, were in the range of approximately 1:1 to 11:1 (Long et al., 2007). Thus, approximately equal quantities, at least, of M35 and M58 are required for functional carboxysome formation. The ΔccmM + NL, ΔccmM + gtc, and ΔccmM + SD mutants in this study produce M35:M58 ratios that are substantially lower (0.1-0.6:1), yet M58 quantities are comparable with wild-type cells (Fig. 2). This suggests that M35 quantities are insufficient to allow functional carboxysome formation. Interestingly, though maintaining a HCR phenotype, the ΔccmM + SD mutant displayed an intermediate Ci requirement for photosynthesis (Table I), had a M35:M58 ratio just below 1:1 (Fig. 2), and formed occasional electron-dense bodies (Fig. 3). In this case it is plausible that functional carboxysomes sometimes form due to near-adequate M35 to M58 ratios, but not sufficiently at a population level to overcome HCR for growth. Thus, appropriate absolute and relative quantities of M35 and M58 are essential for functional carboxysome formation. It follows, then, that fine control of the putative IRES and internal translation event within the ccmM transcript is important in the operation of the cyanobacterial CCM.
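The stoichiometric argument above can be illustrated with a minimal sketch. The function name and the default threshold of 1.0 are assumptions made for illustration only: the text reports an approximate 1:1 minimum, not a measured cutoff, and the sketch ignores the additional requirement for adequate absolute quantities of each form.

```python
def supports_functional_carboxysomes(m35: float, m58: float,
                                     min_ratio: float = 1.0) -> bool:
    """Hypothetical check of the two conditions suggested in the text.

    m35, m58: relative amounts of the short and long CcmM forms.
    min_ratio: assumed minimum M35:M58 ratio (approximately 1:1 in the text).
    """
    # Both forms must be present: M58 is absent in the M35-only and FS
    # mutants, and M35 is strongly reduced in the gtc, NL, and SD mutants.
    if m35 <= 0 or m58 <= 0:
        return False
    # The M35:M58 ratio must reach the assumed minimum.
    return m35 / m58 >= min_ratio


# Ratios quoted in the text, expressed as (M35, M58) pairs.
examples = {
    "wild-type carboxysome fraction (4:1)": (4.0, 1.0),
    "His6-tagged CcmM, lower bound (1:1)": (1.0, 1.0),
    "His6-tagged CcmM, upper bound (11:1)": (11.0, 1.0),
    "start codon/SD mutants, upper bound (0.6:1)": (0.6, 1.0),
    "start codon/SD mutants, lower bound (0.1:1)": (0.1, 1.0),
    "M35 only, no M58": (1.0, 0.0),
}

for label, (m35, m58) in examples.items():
    print(label, supports_functional_carboxysomes(m35, m58))
```

A hard threshold is a simplification: the ΔccmM + SD mutant, with a ratio just below 1:1, shows an intermediate phenotype rather than a clean failure.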
The absence of any M58 in the ΔccmM + M35 and ΔccmM + FS mutants (that did not alleviate the HCR phenotype; Table I) reveals an absolute requirement for the long form of CcmM in functional carboxysome formation. Notably, however, cells of these mutants contain large electron-dense structures (Fig. 3) that are for the most part similar to wild-type carboxysomes but relatively low in abundance (discussed below). While M35 provides an essential backbone for carboxysome formation, M58 must be present for CcaA recruitment and for organized shell facet formation (Long et al., 2007). In the ΔccmM + M35 mutant, not only is M58 absent but CcaA levels are essentially undetectable, as observed in ΔccmM cells (Fig. 2) and in the P-N carboxysomeless mutant (Price and Badger, 1991; Price et al., 1993). Thus, not only is carboxysome shell formation compromised in ΔccmM + M35, but CcaA is rapidly degraded, presumably to prevent extra-carboxysomal bicarbonate dehydration and CO2 loss. This indicates a specific proteolytic degradation of CcaA when it is not complexed with M58, a process that is likely to be relevant to the functioning of the cyanobacterial CCM but is outside the scope of this report.

M35 or M58 Alone Can Form Electron-Dense Subcellular Aggregates
In cell types producing M35 but no M58 (ΔccmM + M35 and ΔccmM + FS), relatively large electron-dense structures can be observed (Fig. 3), but cells still maintain a HCR phenotype (Table I). This suggests that M35 may form the structural basis for Rubisco holoenzyme interactions within the carboxysome and is not strictly limited to the shell layer as we have previously suggested (Long et al., 2007). Conversely, the small structures found in cell types producing predominantly M58 (ΔccmM + NL, ΔccmM + gtc, and ΔccmM + SD) could be carboxysome shell complexes with little internal Rubisco due to low M35 content. These hypotheses are supported by the regularly observed higher proportion of M35 over M58 (Fig. 2; Long et al., 2007) that would be required if M35 was distributed throughout the carboxysome and M58 confined to the shell layer. Indeed, analysis of absolute quantities of various carboxysomal component proteins from Synechococcus PCC7942 suggests that there is sufficient M35 to enable cross-linking of all Rubisco holoenzymes within the carboxysome and sufficient M58 to suggest that it may be confined to the shell layer (B.M. Long, unpublished data). Thus, specific roles for M35 as an internal carboxysomal Rubisco cross-linking protein and M58 as a shell-organizing scaffold protein can be hypothesized. This requires further investigation and may lead to important advances in developing coordinated Rubisco complexes in C3 plants (e.g. pseudocarboxysomes or pyrenoids).

IRESs and Start Codons within ccmM Genes Are Conserved
Cot et al. (2008) found multiple CcmM polypeptides in Synechocystis PCC6803, each corresponding to putative IRES sequences and translation start codons, although the polyclonal antibody used was raised against the Synechococcus PCC7942 M58 protein (i.e. the same antibody used in this study). Analysis of ccmM from Synechococcus PCC7942 indicates additional putative IRES and translation initiation codons corresponding to doublet and singlet rbcS-like domains within the gene, corresponding to additional 23- and 12-kD polypeptides (Supplemental Figs. S3 and S4). However, unlike the findings of Cot et al. (2008), we have not yet found evidence for these smaller peptides being produced in Synechococcus PCC7942. Indeed, immunoblot analysis with polyclonal antibodies raised against both M35 and M58 polypeptides has provided no evidence for their existence thus far (B.M. Long, unpublished data). Nonetheless, the same putative GTG start codons and very similar SD sequences appear to be associated with these rbcS-like domains (Supplemental Figs. S3 and S4). The putative SD sequences at these sites are all commonly recognized sites in Synechocystis PCC6803 (Sazuka and Ohara, 1996). It is therefore possible that mRNA secondary or tertiary structure could play a role in preventing these sequences from being independently translated (De Smit and Van Duin, 1990; Boni, 2006), although the close proximity of the putative small SSU-3 SD sequence to the putative start may be less than optimal (Supplemental Fig. S4).
Our own attempts at determining secondary structure characteristics of the known and putative ccmM mRNA translation initiation regions provided no clear indication as to why one region may be favored over the others (Supplemental Fig. S3), suggesting that mRNA tertiary structure in particular may play a role. While it is plausible that the putative shorter CcmM forms (SSU-2 and SSU-3; Supplemental Fig. S4) arise from duplication and conservation of rbcS sequences, their independent translation would be redundant when triplet SSU-like repeats enable Rubisco cross-linking to occur. This is of particular relevance to the hypothesis that M35 plays a role in internalized carboxysomal Rubisco cross-linking, since three SSU-like domains would allow strong and uniform cross-linking both within and between Rubisco layers. Fewer domains would provide either weak cross-links (on only one plane) or none at all. Thus, the three SSU-like domains in M35 might allow the formation of carboxysome-like structures in the absence of M58 (such as those observed in the ΔccmM + M35 and ΔccmM + FS mutants), whereas fewer domains are less likely to do so. The retention of the putative IRESs and start codons of the additional SSU-like domains is therefore unexpected, and the means by which one operates as an IRES but not the others is not clear.

Why Is CcmM a Two-Domain Protein?
It is well established that ccmM codes for two distinct protein domains (Price et al., 1993; Ludwig et al., 2000), and results presented here show that it can be translated into either a two-domain protein (M58) or a single-domain protein (M35) via an IRES. We have previously presented a model of how M35 and M58 may interact within the carboxysome shell to provide an essential structural framework, with the N-terminal γ-CA domain of M58 forming an important structural trimer (Long et al., 2007). The N-terminal γ-CA domain of M58 is also required for binding the specific carboxysomal β-CA, CcaA (Long et al., 2007; Cot et al., 2008), and recruiting the enzyme into the carboxysome. In species that do not contain ccaA, it had been proposed that the N-terminal γ-CA domain of CcmM might act as the functional carboxysomal CA (Long et al., 2007; Cot et al., 2008; Cannon et al., 2010). A recent study of CcmM from T. elongatus BP-1, which does not have a ccaA gene, revealed that the N-terminal region of CcmM from this species is indeed an active trimeric CA under oxidizing conditions (Peña et al., 2010). The authors also showed that conserved regions of the T. elongatus polypeptide suggest this is the case in other ccaA-lacking β-cyanobacteria. Our model also suggests that M35 plays a structural role in binding Rubisco complexes to the shell layer via a cross-linking or Rubisco-organizing mechanism. This is based on our suggestion that multiple Rubisco SSU-like domains within a single M35 or M58 polypeptide enable cross-linking of Rubisco complexes (Ludwig et al., 2000; Long et al., 2007) and is supported by our experimental observation that M35 can bind to L8 Rubisco cores in an E. coli expression system (Long et al., 2007).
The data presented here are consistent with M35 playing a specific role in Rubisco cross-linking within the carboxysome (perhaps within and between Rubisco layers). Thus, the multiple SSU-like domains are required for Rubisco interaction and cross-linking. The fusion of the γ-CA domain and SSU-like domains therefore ensures the recruitment and close proximity of CcaA to Rubisco and carboxysome facet formation via M58. However, retention of the M35 gene start site allows independent SSU domain cross-linking with Rubisco where only structural interactions (and not CcaA recruitment) are required. Interestingly, while Synechococcus PCC7942 ccmM codes for a protein containing three Rubisco SSU-like domains, many β-cyanobacteria contain four or even five such repeats. While there is evidence for two CcmM forms in Nostoc PCC7120 (Long et al., 2007) and possibly up to four forms in Synechocystis PCC6803 (Cot et al., 2008), the number of CcmM short forms actually expressed in most β-cyanobacterial species is, as yet, unknown.

SD Sequence, Start Codon, and Short Form Mutants of ccmM
Recombinant ccmM gene constructs (Fig. 1) were used to examine the putative IRES and internal start codon within the gene. Mutation of the putative internal start codon (GTG to Val 216 ) to a nonstart GTC generates a silent, read-through mutation of the IRES is genuine. This mutation, ccmM + gtc, should result in one gene product (M58) if the IRES is genuine, or two gene products (M35 and M58) if a protease recognition site is required to produce both forms of CcmM. The ccmM + NL mutation is a GTG to CTG codon change at the putative start codon site. CTG is recognized as a rare start codon in cyanobacteria (Sazuka et al., 1999;Tichy and Vermaas, 1999) and was expected to result in a down-regulation of M35 production if CTG can operate as a start codon and if the IRES is genuine. The ccmM + NL mutation also contains an additional base change (ACC to AAC) that introduces T 215 V 216 to N 215 L 216 peptide sequence change, aimed at disrupting any putative protease recognition site within M58 that may lead to M35 production. The ccmM + FS mutation introduces a FS and subsequent termination codon immediately prior to the putative M35 GTG start while maintaining the putative IRES (Fig.  1). This mutation was designed to allow M35 production in the absence of M58 if the IRES is genuine. It was also designed to produce the N-terminal g-CA region of CcmM (M23; Fig. 1) independently of M35, thus enabling assessment of carboxysome formation without M58 but in the presence of both CcmM protein domains. Finally, a mutation in the putative SD sequence (GAGG to TCTG, DccmM + SD) was designed to test the hypothesis that ribosomal binding at this site is required for M35 translation. It was predicted that this mutation should lead to M58 production alone. 
All recombinant ccmM constructs were created via site-directed mutagenesis and expressed in the markerless deletion mutant ΔccmM (Woodger et al., 2005), along with the wild-type gene, to assess M35 and M58 production and carboxysome formation capabilities. It should also be noted that, based on translational codon usage in the Synechococcus PCC7942 genome, GTG, CTG, and GTC are all frequently used codons.
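The codon changes described above can be checked against the standard genetic code. The following is a minimal Python sketch, not part of the original methods: the function name `describe_change` and the mutation labels are hypothetical shorthand, and the codon table covers only the codons named in the text.

```python
# Minimal codon table covering only the codons discussed in the text
# (standard genetic code). This is an illustrative subset, not a full table.
CODON_TO_AA = {
    "GTG": "Val",
    "GTC": "Val",
    "CTG": "Leu",
    "ACC": "Thr",
    "AAC": "Asn",
}


def describe_change(original: str, mutant: str) -> str:
    """Return 'silent' or an amino acid substitution such as 'Val->Leu'."""
    aa_before = CODON_TO_AA[original.upper()]
    aa_after = CODON_TO_AA[mutant.upper()]
    if aa_before == aa_after:
        return "silent"
    return f"{aa_before}->{aa_after}"


# Labels are shorthand for this sketch only (hypothetical names).
changes = {
    "gtc, codon 216": ("GTG", "GTC"),
    "NL, codon 216": ("GTG", "CTG"),
    "NL, codon 215": ("ACC", "AAC"),
}

for label, (before, after) in changes.items():
    print(f"{label}: {before} -> {after}: {describe_change(before, after)}")
```

Under the standard code, GTG to GTC leaves Val 216 unchanged in M58, whereas the NL changes give the Thr-Val to Asn-Leu substitution described in the text.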

Cloning and Site-Directed Mutagenesis of ccmM
The full-length (M58) ccmM gene from Synechococcus PCC7942 was PCR adapted with primers: forward 5′-ttagcatatgccgagcccaacaac (adding an NdeI site at the underlined start codon) and reverse 5′-atctagattactcgagcggcttttgaatcaacagttc (adding an in-frame XhoI site upstream of the underlined stop codon, and XbaI downstream). M35 was cloned with the forward primer 5′-aacatggtgagcgcttataacgc and the same reverse primer as used for M58 (above). The resulting M58 fragment was cloned into the NdeI and XbaI sites of a variant of the shuttle vector pSE4 (Maeda et al., 1998; Price et al., 2004) in which an NdeI site was removed from the plasmid backbone and one engineered into the multiple cloning site (pSE4-Nde-Del). The M35 fragment was cloned into the NcoI and XbaI sites of pSE4, and both sequences were verified by DNA sequencing; these shuttle plasmids are known as pSE4-ccmM and pSE4-M35. The shuttle vector possesses a nirA promoter from Synechococcus PCC7942 that is repressed in the presence of ammonium and derepressed in the presence of nitrate (Maeda et al., 1998; Price et al., 2004).
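As a quick sanity check on the M58 primers quoted above, the engineered restriction sites can be located by a simple string search. This is a minimal Python sketch, not part of the original protocol; the helper name `find_sites` is hypothetical. The recognition sequences used (NdeI CATATG, XhoI CTCGAG, XbaI TCTAGA) are the standard ones and are palindromic, so searching the primer as written is sufficient.

```python
# Standard recognition sequences for the enzymes named in the text.
SITES = {"NdeI": "CATATG", "XhoI": "CTCGAG", "XbaI": "TCTAGA"}

# M58 primer sequences as given in the text (5' to 3').
M58_FORWARD = "ttagcatatgccgagcccaacaac"
M58_REVERSE = "atctagattactcgagcggcttttgaatcaacagttc"


def find_sites(primer: str) -> dict:
    """Map each enzyme whose site occurs in the primer to its 0-based start index."""
    seq = primer.upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


print("forward:", find_sites(M58_FORWARD))
print("reverse:", find_sites(M58_REVERSE))
```

In the reverse primer the XbaI site lies 5′ of the XhoI site, which corresponds to XbaI being downstream of the stop codon once the primer is read on the coding strand.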

Culture Conditions
Wild-type and mutant strains of Synechococcus PCC7942 were grown in modified BG-11 medium (Price and Badger, 1989a) containing 20 mM HEPES-KOH, pH 8.0, at 30°C and approximately 80 μmol photons m⁻² s⁻¹. Cultures were grown in 100-mL culture vessels and sparged with air enriched with 2% CO 2 to circumvent the HCR phenotype. For mutants that harbor a spectinomycin resistance marker, the antibiotic was added to liquid cultures at 7 μg/mL.

Carboxysome Isolation, Protein Electrophoresis, and Western Blotting
The presence of M35, M58, and other carboxysomal proteins was determined in cell extracts by western blotting. Cells from 3-d-old 50-mL cultures were collected by centrifugation (5,000g, 5 min) and lysed by lysozyme and French Press treatment as described (Long et al., 2005). Centrifugally clarified cell lysates were treated with 25 mM MgSO 4 to generate carboxysome-enriched pellet fractions (Price and Badger, 1991; Price et al., 1998). Prior to SDS-PAGE, whole-cell lysate samples containing 500 ng chlorophyll (Chl) and equivalent volumes of resuspended Mg 2+ pellets and Mg 2+ supernatants were diluted in 4× SDS-PAGE sample buffer (Invitrogen) containing 4 M urea and boiled for 10 min. In our hands, the use of 4 M urea greatly enhanced the formation of monomeric CcmK1, which is otherwise observed predominantly as multimeric forms on SDS-PAGE. Proteins were separated on 4% to 12% NuPAGE Bis-Tris gels (Invitrogen) according to the manufacturer's instructions, using the MES-based buffer system. Electrophoresis of monomeric CcaA on SDS-PAGE was also enhanced by addition of sodium bisulfite to a final concentration of 1 mM in the cathode buffer. Separated proteins were transferred to Immobilon-P polyvinylidene difluoride (PVDF) membrane (Whatman) using a semidry blot apparatus (1 mA/cm 2 , 60 min) and probed with polyclonal rabbit antiserum raised against Synechococcus PCC7942 Rubisco, M35, CcmK1 (the homolog of CcmK2 from Synechocystis PCC6803), and CcaA. Blots were also probed using a polyclonal antibody raised against M58 in an attempt to identify cleavage products of the parent protein, other than M35, in the event that M35 arises through proteolytic cleavage. Bound antibodies were detected using the AP-conjugate substrate kit (Bio-Rad) after secondary probing with a sheep-anti-rabbit/alkaline phosphatase conjugated antibody (ICN).
Relative M35, M58, and CcaA quantities in mutant and wild-type cells were determined from western blots using the Attophos (Promega) detection system (ICN). Briefly, a dilution series of wild-type cell lysates (0.05-0.3 μg Chl on gel) and 0.15 μg Chl of each mutant were separated on SDS-PAGE gels. Proteins were blotted onto PVDF membranes as above and probed with polyclonal rabbit serum raised specifically against M35 (thus the quantitation of M35 and M58 was on an equivalent basis) or polyclonal rabbit serum raised against CcaA. Bound antibodies were detected with sheep-anti-rabbit alkaline phosphatase conjugated antibodies. Blots were visualized using a Bio-Rad VersaDoc system and relative quantities determined using QuantityOne software (Bio-Rad). For CcmM polypeptides, data were expressed relative to M58 found in wild-type extracts (set at 100%). Quantities of CcaA were expressed relative to that found in wild-type extracts (set at 100%).
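One plausible way to turn such a wild-type dilution series into relative quantities is to fit a linear standard curve of band signal against Chl loaded and read each mutant band off that curve. The Python sketch below illustrates this under stated assumptions: the text does not specify the calculation performed in QuantityOne, the function names are hypothetical, and all signal values are invented purely for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def relative_percent(signal, slope, intercept, chl_loaded):
    """Express a band signal as a percentage of the wild-type level.

    The signal is converted to the amount of wild-type Chl that would give
    the same signal, then divided by the Chl actually loaded for the sample.
    """
    equivalent_chl = (signal - intercept) / slope
    return 100.0 * equivalent_chl / chl_loaded


# Hypothetical wild-type dilution series: Chl loaded vs. band signal.
wt_chl = [0.05, 0.1, 0.2, 0.3]
wt_signal = [50.0, 100.0, 200.0, 300.0]  # invented values, arbitrary units

slope, intercept = fit_line(wt_chl, wt_signal)

# Hypothetical mutant band loaded at 0.15 Chl with a signal of 75 units.
print(f"{relative_percent(75.0, slope, intercept, 0.15):.1f}% of wild type")
```

This approach assumes the detection signal is linear over the range of the dilution series; bands falling outside that range would need to be reloaded at a different amount.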

Mass Spectrometric Measurements
Cells were prepared and analyzed in a membrane inlet mass spectrometer as previously described (Price and Badger, 1989b; Sültemeyer et al., 1995). Cells were assayed at a Chl density of 2 μg mL⁻¹ in BG-11 medium buffered with 50 mM BisTrisPropane-HCl (pH 7.9), containing 20 mM NaCl instead of NaNO 3 . Assays were performed in a thermostatted (30°C) mass spectrometer cuvette allowing membrane inlet analysis of O 2 (mass 32) and CO 2 (mass 44). The irradiance used was 700 μmol photons m⁻² s⁻¹.

Transmission Electron Microscopy
Cells were prepared for electron microscopy essentially as described by Price and Badger (1989b). Stained sections were viewed using an Hitachi H-7100FA TEM at 75 kV.
Sequence data from this article can be found in the GenBank/EMBL data libraries under accession number M96929.

Supplemental Data
The following materials are available in the online version of this article.
Supplemental Figure S1. C i -dependent photosynthesis in wild-type Synechococcus PCC7942 and ccmM mutants.
Supplemental Figure S2. Detection of M35 and M23 peptide products of ccmM-FS expressed in Escherichia coli.