Reconstruction of Metabolic Pathways, Protein Expression, and Homeostasis Machineries across Maize Bundle Sheath and Mesophyll Chloroplasts: Large-Scale Quantitative Proteomics Using the First Maize Genome Assembly

Chloroplasts in differentiated bundle sheath (BS) and mesophyll (M) cells of maize (Zea mays) leaves are specialized to accommodate C4 photosynthesis. This study provides a reconstruction of how metabolic pathways, protein expression, and homeostasis functions are quantitatively distributed across BS and M chloroplasts. This yielded new insights into cellular specialization. The experimental analysis was based on high-accuracy mass spectrometry, protein quantification by spectral counting, and the first maize genome assembly. A bioinformatics workflow was developed to deal with gene models, protein families, and gene duplications related to the polyploidy of maize; this avoided overidentification of proteins and resulted in more accurate protein quantification. A total of 1,105 proteins were assigned as potential chloroplast proteins, annotated for function, and quantified. Nearly complete coverage of primary carbon, starch, and tetrapyrrole metabolism, as well as excellent coverage for fatty acid synthesis, isoprenoid, sulfur, nitrogen, and amino acid metabolism, was obtained. This showed, for example, quantitative and qualitative cell type-specific specialization in starch biosynthesis, arginine synthesis, nitrogen assimilation, and initial steps in sulfur assimilation. An extensive overview of BS and M chloroplast protein expression and homeostasis machineries (more than 200 proteins) demonstrated qualitative and quantitative differences between M and BS chloroplasts and BS-enhanced levels of the specialized chaperones ClpB3 and HSP90 that suggest active remodeling of the BS proteome. The reconstructed pathways are presented as figures that include quantitative protein information.

Peak lists were searched against the maize genome data set with organellar genes (54,380 entries) and in parallel against ZmGI version 16.0 (56,364 entries), each including sequences for known contaminants (e.g. keratin, trypsin) and concatenated with a decoy database in which all the sequences were randomized.
Each of the peak lists was searched using Mascot version 2.2 (maximum P of 0.01) for full tryptic peptides using a precursor ion tolerance window set at ±6 ppm, variable Met oxidation, fixed Cys carbamidomethylation, and a minimal ion score threshold of 30 for maize genome and 44 for ZmGI; this yielded a peptide false discovery rate below 1%, with the peptide false-positive rate calculated as 2 × (decoy_hits)/total_hits. The false protein identification rate of proteins identified with two or more peptides was zero. To reduce the false protein identification rate of proteins identified by one peptide, the Mascot search results were further filtered as follows: the ion score threshold was increased to 40 for maize genome and 50 for ZmGI, and mass accuracy on the precursor ion was required to be within ±3 ppm. Precursor ion masses below 700 D were discarded.
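The filtering logic described above can be sketched as follows. This is a minimal illustration rather than the pipeline actually used: the `PeptideHit` record, its field names, and both function names are hypothetical, and the false discovery rate is assumed to follow the common estimate for a concatenated target-decoy search, 2 × decoy hits / total hits.

```python
from dataclasses import dataclass


@dataclass
class PeptideHit:
    """Hypothetical record for one peptide-spectrum match."""
    ion_score: float
    ppm_error: float   # signed precursor mass error in ppm
    is_decoy: bool     # True if the match is to a randomized (decoy) sequence


def peptide_fdr(hits):
    """Estimate the peptide false discovery rate as 2 * decoy_hits / total_hits."""
    total = len(hits)
    if total == 0:
        return 0.0
    decoys = sum(1 for hit in hits if hit.is_decoy)
    return 2.0 * decoys / total


def passes_single_peptide_filter(hit, database="maize_genome"):
    """Stricter criteria for proteins identified by a single peptide.

    Ion score of at least 40 (maize genome) or 50 (ZmGI), and precursor
    mass error within +/-3 ppm.
    """
    min_score = 40 if database == "maize_genome" else 50
    return hit.ion_score >= min_score and abs(hit.ppm_error) <= 3.0
```

For example, one decoy match among 200 accepted matches gives an estimated rate of 1%, which is the boundary the text describes.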

Plants can be classified as C3 or C4 species based on the primary product of carbon fixation in photosynthesis. The primary product of carbon fixation is a four-carbon compound (oxaloacetate [OAA]) in C4 plants but a three-carbon compound (3-phosphoglycerate [3PGA]) in C3 plants. In leaves of C4 grasses such as maize (Zea mays), photosynthetic activities are partitioned between two anatomically and biochemically distinct cell types, bundle sheath (BS) and mesophyll (M) cells. A single ring of BS cells surrounds the vascular bundle, followed by a concentric ring of specialized M cells, creating the classical Kranz anatomy. Active carbon transport (in the form of C4 organic acids) from M cells to BS cells and specific expression of Rubisco in the BS cells allow Rubisco, the carboxylating enzyme in the Calvin cycle, to operate in a high CO2 concentration. The high CO2 concentration suppresses the oxygenation reaction by Rubisco (and the subsequent energy-wasteful photorespiratory pathway), resulting in increased photosynthetic yield and more efficient use of water and nitrogen. The history of C4 research has been described (Nelson and Langdale, 1992; Sage and Monson, 1999; Edwards et al., 2001). At present, there is renewed interest in C4 photosynthesis, stimulated in part by the potential use of C4 plants as a source of biofuels (Carpita and McCann, 2008) and the genetic engineering of C4 rice (Oryza sativa; Sheehy et al., 2007; Hibberd et al., 2008; Taniguchi et al., 2008). The use of new genomics and/or proteomics tools has resulted in new insights into cellular differentiation in C4 plants.
Proteins are responsible for most cellular functions, and knowing their abundance, cell type-specific expression patterns, and subcellular localization is essential to understand C4 differentiation. Previously, we published a quantitative analysis of purified M and BS chloroplast (soluble) stromal proteomes in which BS-M protein accumulation ratios for 125 accessions were determined; this covered a limited range of plastid functions, although it enabled the integration of information from previous studies (Majeran et al., 2005). A subsequent complementary quantitative proteomics study, using nano-liquid chromatography (LC)-LTQ-Orbitrap mass spectrometry (MS) and label-free spectral counting complemented with other techniques, identified proteins in BS and M thylakoid and envelope membranes of maize chloroplasts and determined cell type-specific differences in (1) the protein assembly state and composition of the four photosynthetic complexes and of a new type of NADPH dehydrogenase (NDH) complex; (2) the auxiliary functions of the thylakoid proteome; and (3) protein and metabolite transport functions of M and BS chloroplast envelopes (Majeran et al., 2008). Comparative MS analysis of chloroplast envelope membranes from leaves of pea (Pisum sativum), a C3 species, and from M chloroplasts of maize showed an enrichment of several known and putative translocators in the maize M envelopes (Brautigam et al., 2008). The conclusions of these proteome analyses are summarized by Majeran and van Wijk (2009).
Whereas these proteomics studies provide significant progress in understanding the organization of C4 metabolism in maize, three aspects have not been adequately addressed: (1) the stromal proteomes of BS and M chloroplasts likely each contain more than 1,500 proteins, but the BS-M ratios for only approximately 125 proteins were quantified, resulting in very limited coverage of several important secondary metabolic pathways such as sulfur, fatty acid, amino acid, and nucleotide metabolism; (2) information about relative concentrations of stromal proteins in BS and M chloroplasts is lacking but is needed as a basis for quantitative modeling and metabolic engineering of C4 photosynthesis and other metabolic pathways; the growing "toolbox" of proteomics and MS now allows for such quantitative analyses (Bantscheff et al., 2007; Kumar and Mann, 2009); (3) the soluble (Majeran et al., 2005) and membrane (Majeran et al., 2008) proteome data sets were analyzed by different techniques and mass spectrometers, mostly due to the improvement of commercial mass spectrometers in that time frame. Therefore, it is difficult to understand the quantitative relationships between these data sets. This study addresses these three aspects.
So far, maize proteome analyses have essentially used ZmGI maize assemblies (for the Z. mays Gene Index) based on ESTs, combined with a limited amount of additional DNA sequence information. The ZmGI was originally generated by The Institute for Genomic Research and subsequently supported by the Computational Biology and Functional Genomics Laboratory (http://compbio.dfci.harvard.edu/index.html). This ZmGI database did not have annotated gene models (for proteome analysis, the DNA sequences were searched in all six reading frames), and genes expressed at low levels were likely underrepresented. In our most recent BS-M chloroplast analyses (Majeran et al., 2008) as well as a maize envelope analysis (Brautigam et al., 2008), the MS data were searched against ZmGI version 16.0 or 17.0. Since that time, the maize genome has been sequenced (using a bacterial artificial chromosome approach), a physical map was created (the maize accessioned golden path AGP version 1), and its first assembly with gene coordinates and predicted proteins was very recently released (June 2009; http://ftp.maizesequence.org/release-4a.53/sequences/) and published (Schnable et al., 2009). This release contains 32,540 genes with 53,764 gene models; most of the gene models are evidence based. The new maize genome assembly is expected to improve maize proteome analysis with more accurate protein identification and quantitative assessment of protein expression patterns. This also allows for the determination of N-terminal localization signals, which was rarely possible from EST assemblies, as N termini were often lacking.
This study presents a quantitative protein expression atlas of differentiated maize leaf M and BS chloroplasts using high-resolution and mass-accuracy MS (using an LTQ-Orbitrap) and the new maize genome assembly. Three biological replicates of stromal proteomes of isolated BS and M chloroplasts were analyzed. Quantification was carried out based on the "spectral counting" method (Zybailov et al., 2005; Bantscheff et al., 2007; Choi et al., 2008) using a sophisticated bioinformatics "workflow" in particular to deal with gene duplications and extended gene families observed in polyploids such as maize. These new stromal data sets were combined with a reanalysis of our recent BS and M membrane proteome data sets (Majeran et al., 2008) against genome 4a53. Compared with previous maize leaf proteome analyses, this study provides an integrated overview of both primary and especially secondary metabolism, as well as chloroplast gene expression and protein biogenesis, in far greater depth. The reconstructed pathways are presented as figures that include quantitative protein information; pathways include primary carbon metabolism, starch metabolism, nucleotide metabolism, fatty acid and lipid biosynthesis, chlorophyll, heme, and carotenoid synthesis, and nitrogen assimilation. We briefly comment on the use of the new maize genome assembly for proteome analysis. All matched peptides are projected on the predicted protein models via the Plant Proteomics Database (PPDB; http://ppdb.tc.cornell.edu/). Interactive functional annotation, chloroplast localization assignments, as well as details of protein identification are also available via PPDB.
Previously, we determined the BS and M chloroplast membrane proteomes using protein separation by native gels, followed by tryptic digestion and extensive MS/MS analysis by LTQ-Orbitrap (Majeran et al., 2008). Since no genome assembly was available at that time, we searched the data against the ZmGI assembly (version 16.0). Here, we re-searched these MS data against the maize genome release 4a53. We identified 1,219 protein models corresponding to 882 protein accessions when counting only one model per gene (see below; Supplemental Table S1).

Integration of Stromal and Membrane Data Sets and Selection of the Best Gene Model
The workflow of the proteome analysis is summarized in Figure 2. Combining the stromal and membrane data sets identified 2,439 maize gene models and 1,429 protein accessions when counting only one model per gene (Supplemental Table S1). In the new maize genome annotation, many genes have more than one gene model. In such cases, we selected the protein form (gene model) that had the highest number of matched spectra (spectral counts [SPC]) across all experiments; if two gene models had the same number of matched spectra, the model with the lowest digit was selected. Within each protein report page in PPDB, "pop-up windows" display the gene models for each protein, with the peptide count projected on the exons and details for matched peptides projected on the primary amino acid sequence. This will allow the user to determine the significance of the different gene models. BLAST search of the 1,429 maize proteins against the Arabidopsis (Arabidopsis thaliana) proteome and rice genome resulted in 1,017 Arabidopsis and 1,079 rice homologues; 47 and 48 are chloroplast encoded in Arabidopsis and rice, respectively.
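The gene model selection rule described above (highest number of matched spectra, ties broken by the lowest model number) can be sketched as follows. This is an illustrative sketch only: the function name is hypothetical, and the assumption that the model number is the trailing run of digits in the identifier (for example a `_P01` suffix) is ours, not a statement about the actual identifier scheme.

```python
import re


def select_best_model(spc_by_model):
    """Pick one gene model per gene.

    spc_by_model maps a gene model ID to its matched spectra (SPC) summed
    across all experiments. The model with the most SPC wins; ties go to
    the model with the lowest trailing number.
    """
    def model_number(model_id):
        # Assumption: the model number is the trailing run of digits.
        match = re.search(r"(\d+)$", model_id)
        return int(match.group(1)) if match else 0

    return min(spc_by_model, key=lambda m: (-spc_by_model[m], model_number(m)))


# Hypothetical spectral counts for three models of one gene.
models = {"GRMZM2G011507_P01": 10, "GRMZM2G011507_P02": 25, "GRMZM2G011507_P03": 25}
best = select_best_model(models)  # models P02 and P03 tie; P02 is chosen
```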

Relative Amounts and Concentrations of Proteins in BS or M Cells
The relative amount (mass) of each identified protein within each replicate was calculated based on the adjusted number of matched MS/MS spectra (adjSPC), normalized by the sum of adjusted SPC in the replicate, yielding nadjSPC. The adjSPC are the sum of unique SPC and a proportional distribution of shared SPC, using the ratio of unique SPC to determine this distribution; we previously developed and tested this strategy for Arabidopsis proteomes. The relative concentration for each identified protein was calculated as the normalized spectral abundance factor (nSAF), which was calculated from adjSPC weighted for the number of theoretical tryptic peptides with a relevant length ("observable peptides"; Zybailov et al., 2006).
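The three quantities defined above can be illustrated with a short sketch. The function names, the input layout, and the even split used when no protein in a group has unique spectra are assumptions made for illustration; the final normalization of the spectral abundance factors to a sum of 1 within a replicate is likewise assumed from the term "normalized".

```python
def adjusted_spc(unique_spc, shared_groups):
    """adjSPC = unique SPC + a share of shared SPC proportional to unique SPC.

    unique_spc: {protein: unique spectral count}
    shared_groups: list of (proteins, shared_count) tuples, one per set of
    proteins that share spectra.
    """
    adj = {protein: float(count) for protein, count in unique_spc.items()}
    for proteins, shared in shared_groups:
        total_unique = sum(unique_spc.get(p, 0) for p in proteins)
        for p in proteins:
            if total_unique > 0:
                weight = unique_spc.get(p, 0) / total_unique
            else:
                # Assumption: split evenly when no member has unique spectra.
                weight = 1.0 / len(proteins)
            adj[p] = adj.get(p, 0.0) + shared * weight
    return adj


def normalize(values):
    """Scale values so that they sum to 1 (used for nadjSPC and nSAF)."""
    total = sum(values.values())
    return {k: v / total for k, v in values.items()} if total else dict(values)


def nsaf(adj_spc, observable_peptides):
    """nSAF: adjSPC divided by the number of observable peptides, normalized."""
    saf = {p: adj_spc[p] / observable_peptides[p] for p in adj_spc}
    return normalize(saf)


# Hypothetical replicate: A and B share 8 spectra; C has only unique spectra.
adj = adjusted_spc({"A": 30, "B": 10, "C": 20}, [(("A", "B"), 8)])
nadj = normalize(adj)                              # relative mass (nadjSPC)
conc = nsaf(adj, {"A": 12, "B": 6, "C": 10})       # relative concentration (nSAF)
```

In this example the 8 shared spectra are split 6:2 between A and B, following their 30:10 ratio of unique spectra.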
To better understand the distribution and expression range of the identified protein population, we determined the frequency distribution of their relative concentrations using Log10(nSAF) values and a bin size of 0.50 (Fig. 1B). Bins in the range of -1.5 to -3, with the approximately 110 most abundant proteins, were strongly dominated by proteins involved in photosynthesis and the C4 shuttle. Bins in the range of -5.5 to -7, with the approximately 205 proteins of lowest abundance, contained mostly proteins with fewer than five matched spectra, frequently with most of the spectra shared with other proteins; thus, this group of low-abundance proteins is enriched in low expressed members of groups of homologues, possibly indicating that these are pseudogenes as well as false-positive identifications (see below).
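A frequency distribution of this kind can be computed in a few lines. The sketch below is illustrative only; the function name and the choice to key each bin by its lower edge are assumptions.

```python
import math
from collections import Counter


def log_nsaf_histogram(nsaf_values, bin_size=0.5):
    """Count proteins per Log10(nSAF) bin; keys are the lower bin edges."""
    counts = Counter()
    for value in nsaf_values:
        if value <= 0:
            continue  # log10 is undefined for zero or negative values
        lower_edge = math.floor(math.log10(value) / bin_size) * bin_size
        counts[lower_edge] += 1
    return dict(sorted(counts.items()))


# Hypothetical nSAF values for four proteins.
hist = log_nsaf_histogram([0.02, 0.05, 0.0005, 0.03])
```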

Removal of Low Scoring Proteins
Our workflow has a strictly controlled false-positive peptide identification rate, and we use high-resolution (100,000), high-mass accuracy (6 ppm) MS data; this workflow helps to avoid false-positive identifications. However, the high complexity of the maize genome, the presence of repetitive DNA, and the fact that we are using the first draft genome sequence could lead to identifications of less meaningful protein accessions (e.g. misassemblies, pseudogenes, etc.). Therefore, for further analysis of BS and M chloroplast functions, we removed those protein accessions with only one matched MS/MS spectrum (total of 100 accessions). In addition, we removed those protein accessions that were identified with only one amino acid sequence (irrespective of charge state and possible posttranslational modifications) if the sequence had fewer than 10 amino acid residues and contained Ile (I) or Leu (L); these two residues have the same mass (they are isobaric) and thus cannot be distinguished by MS. These steps removed 149 proteins, mostly representing pseudogenes, false-positive identifications, and proteins expressed at very low levels; they represented 0.07% of the calculated protein mass (percentage of total nadjSPC; Supplemental Table S2).
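The two removal rules can be expressed as a single predicate. This is a minimal sketch with a hypothetical function name and input layout; it assumes the peptide sequences have already been collapsed across charge states and modifications.

```python
def keep_protein(total_spc, peptide_sequences):
    """Return True if a protein accession survives both removal rules.

    total_spc: number of matched MS/MS spectra for the accession.
    peptide_sequences: set of distinct amino acid sequences identified for it.
    """
    # Rule 1: remove accessions with only one matched MS/MS spectrum.
    if total_spc <= 1:
        return False
    # Rule 2: remove accessions identified by a single short sequence
    # (fewer than 10 residues) that contains the isobaric residues I or L.
    if len(peptide_sequences) == 1:
        sequence = next(iter(peptide_sequences))
        if len(sequence) < 10 and ("I" in sequence or "L" in sequence):
            return False
    return True
```

For instance, an accession supported by five spectra that all match the single hypothetical sequence `AVLGEK` would be removed, whereas the same accession with a second distinct sequence would be kept.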

Assignment of Chloroplast Localization
Since we were interested in BS and M chloroplast differentiation and function, we evaluated the remaining identified proteins for chloroplast localization. For assignment of a protein to the chloroplast, we used a combination of MS-based scores (chloroplast proteins should generally have higher scores in chloroplast-enriched fractions than in total leaf fractions) and known localization of the predicted best homologue in Arabidopsis or rice, in addition to the known or predicted maize protein function. We paid careful attention to localization assignment for proteins that are members of groups of closely related identified homologues.
We assigned 974 proteins to the chloroplast, 175 proteins were assigned to other subcellular locations and were considered contaminants, and for the remaining proteins we could not assign a subcellular location (Supplemental Table S2). These three categories had respectively 99%, 0.6%, and 0.3% of the total nadjSPC, indicating that assigned contaminations made up less than 0.6% of the total protein mass. The actual contamination was lower, since highly expressed (chloroplast) proteins are underestimated in their abundance using the spectral counting technique.

Figure 1. Protein analysis of chloroplast fractions and protein distribution according to function. A, One-dimensional SDS-Tricine-PAGE (12% acrylamide) of stroma proteins from isolated BS and M maize chloroplasts. Three independent BS-M replicates are shown; 150 μg of protein was loaded in each lane. B, Frequency distribution of relative protein abundance in BS and M chloroplasts, calculated from Log10(nSAF) values. Each bin on the x axis corresponds to 0.50 orders of magnitude, with the total protein population spanning about 6 orders of magnitude. Four protein populations are displayed: all identified proteins (gray bars), proteins assigned a chloroplast location (black bars), proteins with unassigned subcellular localization (striped bars), and nonchloroplast proteins (white bars). C, Distribution of proteins with assigned locations to chloroplast or with unclear locations based on the number of proteins (top) or protein mass based on nadjSPC (bottom) over primary carbon metabolism and photosynthesis (I), secondary metabolism (II), membrane transport (III), miscellaneous and unknown functions (IV), and plastid gene expression and protein homeostasis (V). D, Distribution of proteins with assigned locations to chloroplast or with unclear locations to the different functions according to the number of proteins. The main functional groups I, II, III, IV, and V are as defined for C.

We then analyzed the frequency distribution of relative protein concentration [based on Log10(nSAF)] for these three groups of proteins. This showed that the chloroplast proteins spanned 4 orders of magnitude, peaking at -3.5 to -4, whereas the contaminant proteins spanned about 3.5 orders of magnitude in abundance, peaking at -5.5 to -6 (Fig. 1B). For further analysis, we removed the contaminant proteins.

The Presence of Organellar Genes in the Nuclear Genome Assembly
In addition to the nuclear genome, plastids and mitochondria also have genomes, and we merged these sequences with the 4a53 nuclear genes when we searched the mass spectral data. Indeed, we identified 47 chloroplast-encoded proteins and one mitochondria-encoded protein (Supplemental Table S2). However, we noted that several of these chloroplast-encoded proteins had many shared MS/MS spectra matching to maize 4a53 (nuclear genome) accessions. BLAST searching of these maize accessions against Arabidopsis and rice showed they mostly matched to chloroplast-encoded Arabidopsis and rice proteins. These 4a53 genes could represent pseudogenes from fragments of plastid genome inserted into the nuclear chromosomes or could result from DNA sequencing of contaminating chloroplast DNA. Examples are the observation of seven maize 4a53 genome accessions for the chloroplast-encoded Rubisco large subunit in addition to the chloroplast accession (NP_043033) and four maize 4a53 genome accessions for the chloroplast-encoded PSII subunit cytochrome b559α in addition to the chloroplast accession (NP_043041). For quantification purposes, we grouped these redundant accessions, as will be described further below.

Functional Annotation of Proteins, Differentiating between C3 and C4 Enzymes and Splicing
The remaining proteins were assigned a function using the MapMan bin classification system (Thimm et al., 2004), similar to its use in previous maize studies (Majeran et al., 2005, 2008). Moreover, the MapMan classification system is well integrated in our PPDB for both Arabidopsis and maize proteome data. We added a new bin (assigned 1.5 PS.C4 malate shuttle) for proteins involved in the C4 shuttle (e.g. phosphoenolpyruvate [PEP] carboxylase, pyruvate phosphate dikinase [PPDK], malate dehydrogenase [MDH], and PPDK regulatory protein [PPDK-RP]); importantly, based on quantitative BS-M expression patterns, we were also able to differentiate between proteins specialized in the C4 functions as compared with non-C4 functions (e.g. PPDK and MDH). For instance, we identified two PPDK accessions, GRMZM2G011507 (PPDK-C4) and GRMZM2G097457 (PPDK-C3). PPDK-C4 is extremely abundant, with a BS-M ratio of 0.56, whereas less abundant PPDK-C3 is nearly exclusively localized in BS cells (BS-M ratio of 9.7). Interestingly, we identified only one maize gene accession for PPDK-RP (GRMZM2G004880). PPDK-RP is enriched in M chloroplasts (BS-M ratio of 0.28), indicating its specificity toward the regulation of the M-localized C4-type PPDK. There are no obvious homologues in the 4a53 genome, and indeed only one maize PPDK-RP was identified by bioinformatics analysis and cloning (Burnell and Chastain, 2006). In Arabidopsis, there are two genes for PPDK-RP (RP1 and RP2, At4g21210 and At3g01200, respectively). RP1 is similar to the maize C4-type PPDK-RP and is plastid localized, whereas RP2 has no detectable activity and is localized to the cytosol (Chastain et al., 2008). For more details on C4 and non-C4 NADP-MDH enzymes, see Maurino et al. (2001) and Tausta et al. (2002).

The Identified Proteome Provides High Coverage of Secondary Metabolism, Plastid Gene Expression, and Chloroplast Protein Homeostasis
Figure 1C shows the distribution of the number of proteins (total of 1,105 proteins) or protein mass involved in (1) primary carbon metabolism and photosynthesis, (2) secondary metabolism, (3) miscellaneous and unknown functions, (4) plastid gene expression and protein homeostasis, and (5) membrane transport. Each of the first four groups involved 23% to 25% of the identified proteins, whereas about 4% of identified proteins were involved in transport of metabolites across the chloroplast envelopes (Fig. 1C, top). In terms of protein mass, the majority (approximately 66%) was invested in primary metabolism, with equal distribution across secondary metabolism and plastid protein homeostasis (each 13%; Fig. 1C, bottom).
A more detailed breakdown of chloroplast functions is shown in Figure 1D, which shows high coverage of functions rarely observed in maize chloroplasts in previous studies. In particular, the protein synthesis components involved in expression of chloroplast-encoded proteins were very well covered; this included most of the ribosomal proteins, many of the tRNA synthetases, as well as initiation and elongation factors (Fig. 1D). Even more rewarding was the identification of 132 proteins involved in various aspects of posttranslational chloroplast protein homeostasis, including about 50 processing peptidases, aminopeptidases, and proteases, as well as soluble methionine sulfoxide reductases involved in protein repair, peptide deformylase, and several phosphatases and kinases. Most if not all members of the Clp protease system in maize were identified; while they are now well characterized in Arabidopsis chloroplasts (Sjogren et al., 2006; Kim et al., 2009; Zybailov et al., 2009), they appeared elusive in maize.
Excellent coverage was obtained for redox regulators and reactive oxygen species (ROS) detoxification enzymes (e.g. peroxiredoxins), nitrogen, sulfur, and amino acid metabolism, and other secondary metabolic pathways, including fatty acid synthesis and isoprenoid, tetrapyrrole, and nucleotide metabolism (Fig. 1D). Whereas only a very small percentage of protein mass was invested in isoprenoid synthesis and derived products, as well as cofactor and vitamin biosynthesis (less than 1%), we obtained good coverage of several of these pathways; for instance, most enzymes of the methylerythritol phosphate (MEP) pathway and multiple enzymes in thiamine (vitamin B1) and riboflavin (vitamin B2) synthesis were identified. Thus, the combined analysis of membrane and soluble fractions of isolated BS and M chloroplasts provided good coverage of many of the chloroplast functions. The quantitative comparison of the BS and M profiles should therefore provide meaningful and new insights into C4-driven differentiation and fulfill our objective to obtain an integrated, quantitative overview of the differentiated state of BS and M chloroplast functions in the maize leaf.

The Differentiated Functional State of BS and M Chloroplasts
Table I compares protein mass investment (based on nadjSPC) of BS and M chloroplasts in the different chloroplast functions. About 39% (M) to 38% (BS) of the total chloroplast proteome was invested in the photosynthetic apparatus and cyclic electron flow located in the thylakoid membrane (Table I). Between 23% (M) and 31% (BS) was invested in the Calvin cycle, the C4 shuttle, and carbonic anhydrases. About 11% (M) and 10% (BS) was invested in plastid gene expression, protein folding, processing, and proteolysis. About 5% (M) and 3% (BS) was invested in redox regulation. About 1.9% (M) to 1.6% (BS) was invested in transport functions. Some 6% (M) and 5% (BS) was invested in proteins for which the function is unknown. About 14% (M) and 11% (BS) was invested in all remaining functions (Table I). To better appreciate the cell type-specific differences in molecular functions, we further broke down these protein investments for the membrane and soluble fractions (Table I; for a detailed description that also highlights the most abundant proteins for the various functions, see Supplemental Text S1).

Grouping of Identified Protein Accessions to Deal with Gene Duplications and Extended Gene Families in Maize for Quantification
Maize is a polyploid with a far larger genome size (2,800 Mb) than Arabidopsis (130 Mb) and rice (430 Mb), and maize has a significantly larger amount of repetitive sequences (Schnable et al., 2009). The complexity of the maize genome and its predicted proteome requires an extra effort for maize proteomics, in particular to deal with closely related identified proteins. These closely related proteins could represent true duplicated genes or pseudogenes, or could be artifacts of the genome assembly.
To deal with this complexity of the maize genome for quantification of BS and M protein expression, we grouped proteins that shared more than approximately 80% of their matched SPC. Grouping was done by generation of a similarity matrix through calculation of the Dice coefficient between each pair of identified proteins based on matched SPC, followed by clustering of the proteins using MCL software (Enright et al., 2002), followed by manual evaluation (see "Materials and Methods"). In total, 313 proteins were placed in 131 groups (Fig. 2). This grouping avoids overinterpretation of differences observed between M and BS chloroplast proteomes; this strategy was also very useful for Arabidopsis (Zybailov et al., 2009). The "relationships" between proteins that shared matched MS/MS spectra are visible through PPDB. Quantification of proteins with a low number of adjSPC within a replicate (i.e. below approximately 10 adjSPC) is generally less accurate (Zybailov et al., 2009). A total of 378 proteins had fewer than 40 adjSPC, 179 had between 40 and 100 adjSPC, and 366 had more than 100 adjSPC (Fig. 2), and we generally consider these three groups as low-, medium-, and high-confidence quantifications, respectively (Supplemental Table S3).
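The similarity step can be illustrated with a small sketch. Two simplifications are made and should be read as assumptions: each protein is represented by the set of spectrum identifiers matched to it, and the MCL clustering step is replaced here by a plain connected-components grouping (union-find) over pairs whose Dice coefficient reaches the threshold. It therefore shows the idea of grouping by shared spectra, not the behavior of the MCL software, and it omits the manual evaluation.

```python
from itertools import combinations


def dice(a, b):
    """Dice coefficient between two sets of matched spectrum IDs."""
    if not a and not b:
        return 0.0
    return 2.0 * len(a & b) / (len(a) + len(b))


def group_proteins(spectra_by_protein, threshold=0.8):
    """Group proteins linked by a Dice coefficient >= threshold.

    Stand-in for MCL clustering: proteins are merged into connected
    components using union-find. Only groups with two or more members
    are returned.
    """
    parent = {p: p for p in spectra_by_protein}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    for p, q in combinations(spectra_by_protein, 2):
        if dice(spectra_by_protein[p], spectra_by_protein[q]) >= threshold:
            parent[find(p)] = find(q)

    groups = {}
    for p in spectra_by_protein:
        groups.setdefault(find(p), set()).add(p)
    return [members for members in groups.values() if len(members) > 1]


# Hypothetical spectra: A and B share 9 of their 10 spectra; C shares none.
spectra = {
    "A": set(range(1, 11)),
    "B": set(range(1, 10)) | {11},
    "C": {20, 21, 22},
}
groups = group_proteins(spectra)
```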
In the remaining sections, metabolic pathways are reconstructed, and the distribution of specific proteins in the various processes and metabolic pathways over the BS and M chloroplasts is discussed. We will first discuss the chloroplast gene expression and protein homeostasis machineries to understand the assembly and maintenance of the differentiated chloroplast proteomes. Furthermore, we focus on those metabolic pathways that were not (or were poorly) covered in previous maize studies, in particular starch metabolism, nucleotide metabolism, sulfur and nitrogen assimilation, and isoprenoid and tetrapyrrole metabolism. We will also analyze the distribution of relative concentrations of the proteins, as this will be valuable for future modeling of metabolic fluxes. In these various plots and analyses, we will use nSAF values, as this is the best approximation of relative protein concentration, as well as BS-M accumulation ratios. The data will be presented as figures for plastid gene expression, protein synthesis, and homeostasis, thylakoid light reactions (Supplemental Fig. S1), envelope transporters, primary carbon metabolism, starch metabolism, nucleotide metabolism, fatty acid and lipid biosynthesis, chlorophyll, heme, and carotenoid synthesis, nitrogen assimilation, and sulfur assimilation, as well as tables for redox regulation and ROS defense and for amino acid metabolism. More details regarding accession numbers, groupings of accessions, MapMan bin numbers, nSAF values, matched MS/MS spectra, and more are available in Supplemental Table S3.

Plastid Gene Expression, Protein Synthesis, and Homeostasis
The contribution of plastid gene expression and protein homeostasis to the control of BS and M differentiation is entirely unclear. For instance, levels of chloroplast-encoded (and nucleus-encoded) PSII subunits are lower in BS than in M chloroplasts, whereas chloroplast-encoded NDH subunits are higher in BS chloroplasts. However, it is not clear if levels of these chloroplast-encoded proteins are regulated through control of transcription (or even chromosome copy number), mRNA processing and stability, translation, or posttranslationally through proteolysis. Furthermore, levels of imported nucleus-encoded proteins within the BS or M chloroplast could be controlled through proteolysis. Both chloroplast- and nucleus-encoded proteins share intraplastid protein sorting, folding, and assembly machineries.
After careful evaluation for function and location, we quantified 221 proteins (and protein groups) involved in chloroplast biogenesis and protein homeostasis, assembled in five categories, namely protein synthesis (77 proteins), proteolysis and processing (42 proteins), import and posttranslational modifications (34 proteins), folding (38 proteins), and RNA-DNA interaction (31 proteins). Figure 3 displays the BS-M ratio (based on nSAF), the number of matched adjSPC, and relative abundance of proteins (as a color scale) in these various functional classes. A total of 114 proteins were only found in the stroma, whereas 72 were found in both membrane and stromal fractions, but with a wide range of distribution (i.e. between 3,000-fold enriched in the membrane fraction and 100-fold enriched in the stromal fraction). The systematic quantitative comparison of BS and M chloroplast proteins in these processes did identify several strongly differentially expressed proteins (marked with an asterisk in Fig. 3); these may contribute to BS and M specialization. We will highlight selected candidate proteins, with emphasis on soluble proteins, and comment on functional implications and follow-up analyses (for details, see Supplemental Table S3).

Interactors with the Plastid Chromosome
There is very little understanding of how plastid chromosome copy number and organization influence transcription and contribute to BS-M chloroplast differentiation. Therefore, we were pleased to identify eight DNA-interacting proteins (pTAC or nucleoid proteins) and two DNA-repair enzymes. The abundance levels of these proteins spanned more than 2 orders of magnitude (based on nSAF), with two membrane-bound nucleoid-associated proteins, pTAC16 and MFP1, being by far the most abundant, with 1,679 and 656 matched MS/MS spectra, respectively. pTAC16, of unknown function, was 2-fold enriched in the M chloroplasts, whereas MFP1 was 2-fold enriched in the BS chloroplasts. MFP1 is a coiled-coil DNA-binding protein and is believed to play a role in anchoring the plastid DNA to the envelope and thylakoid membrane; its expression is tightly correlated with the accumulation of thylakoid membranes (Jeong et al., 2003). Two soluble putative DNA-repair enzymes, a uvrB/uvrC motif-containing protein and a deoxyribodipyrimidine photolyase, follow in abundance and have BS-M ratios of 0.51 and 0.99, respectively. The next three abundant proteins were homologues of TCP34, pTAC5, and pTAC17; TCP34 and pTAC5 homologues were only found in the membrane fractions, and pTAC17 was mostly found in the soluble phase. The Arabidopsis homologues of these pTAC proteins were all identified in a highly enriched plastid chromosome preparation, but their precise function is not clear (Pfalz et al., 2006; Weber et al., 2006). Targeted nucleoid and functional analyses will be needed to determine the contribution of transcriptional regulation to chloroplast BS and M differentiation.

Regulation of Transcription and RNA Metabolism
We identified 21 proteins in this category, including six RRM domain proteins and four ribonucleases, in the soluble stromal fractions. Their expression levels covered 2 to 3 orders of magnitude, with the number of matched MS/MS spectra ranging from 1,400 to just 3 (Fig. 3). The most abundant proteins were homologues of CP33, CP31, CP29, CSP41A, and CSP41B. Homologues for several of these proteins have been characterized in Arabidopsis; examples are RRM protein CP31A (Tillich et al., 2009) and CSP41 (Beligni and Mayfield, 2008; Bollenbach et al., 2009). CSP41B-2 (BS-M = 2.5), CSP41A (BS-M = 2.1), the 3′ to 5′ exoribonuclease RIF10 (BS-M = 4.9), and a DUF740 protein (BS-M = 3.5) were all more highly expressed in BS chloroplasts, whereas several other proteins, such as two SET domain proteins and an S1 RNA-binding protein, were much more highly expressed in M chloroplasts (BS-M = 0.2, 0.1, and 0.06, respectively).

The Chloroplast Translational Machinery
We quantified 76 proteins (and protein groups) that are part of the chloroplast translational machinery, including 18 tRNA synthetases, two initiation factors (IF2 and IF3) and five elongation factors (BipA, TU, G, and P types), a ribosome-recycling factor, and a peptide chain-release factor, as well as 14 subunits of the 30S ribosomal particle, 32 subunits of the 50S particle, and three plastid-specific ribosomal proteins (PSRP1, -2, and -3). Relative protein concentrations spanned 3 orders of magnitude, with the most abundant protein being EF-TU-1, with both elongation and chaperone functions (3,163 matched MS/MS spectra). Chloroplast ribosome levels were about 3-fold higher in M chloroplasts than in BS chloroplasts (median BS-M ratio for individual subunits was 0.32). Interestingly, the initiation and elongation factors were more equally distributed across BS and M, with the exception of the typA/BipA elongation factor-like protein. TypA/BipA EF is a specialized ribosome-associated translation factor (Wang et al., 2008) suggested to be required in stress responses; interestingly, we observed BipA EF to be strongly induced in Arabidopsis chloroplast Clp protease mutants (Zybailov et al., 2009). The 18 tRNA synthetases were more abundant in M chloroplasts, with a median BS-M ratio of 0.14. Overall, these data suggest higher translational activity in M chloroplasts than in BS chloroplasts, particularly at the thylakoid surface.
When comparing the stromal and membrane proteomes, we calculated that nearly 40% of ribosome protein mass (based on nadjSPC) was found in the membrane fractions. Interestingly, the M membrane fractions contain 20-fold more ribosomal protein than the BS membranes, which is very striking when compared with the approximately 2-fold difference between the M and BS stroma. Chloroplast ribosomes are known to associate with the thylakoid membranes; indeed, many chloroplast-encoded thylakoid proteins are synthesized at the membrane surface and cotranslationally inserted (Margulies and Michaels, 1975; Klein et al., 1988; van Wijk et al., 1996; Rohl and van Wijk, 2001). We suggest that the very low BS-M ratio of thylakoid-bound ribosomes reflects a much higher demand for synthesis of thylakoid proteins in M chloroplasts, most likely due to the high M abundance of PSII subunits and the relatively short lifetime of PSII reaction center proteins due to light-induced damage. The BS-enriched NDH complex also contains many chloroplast-encoded proteins, but the relative concentration of the NDH complex, and likely also its turnover rate, is severalfold lower than that of the PSII complex, thus contributing much less to overall chloroplast translation.

Protein Processing and Proteolysis
We quantified 42 proteases and processing peptidases, 26 of which were quantified with medium or high confidence. These include 22 stromal proteases, six thylakoid lumen proteins, nine integral thylakoid proteases, and three proteases that we assigned to the inner envelope membrane. As most membrane proteins were discovered and discussed previously when searching the ZmGI database (Majeran et al., 2008), we focus here on the 22 soluble proteases and peptidases. Twelve of the soluble proteases were members of the Clp protease system, and based on a multialignment analysis with the Arabidopsis Clp family, we annotated the maize Clp proteins. The Clp protease system is the most abundant soluble protease in Arabidopsis chloroplasts (Peltier et al., 2004) and in pea etioplasts (Kanervo et al., 2008) and consists of a proteolytic tetradecameric barrel-structured ClpPR complex to which two small ClpT subunits tightly associate (Peltier et al., 2004). ClpC1 and -C2 chaperones are assumed to deliver substrate to the core protease complex (Adam et al., 2006). We were surprised to find that the most abundant ClpPR subunit in maize chloroplasts was a homologue of Arabidopsis mitochondrial ClpP2. Clearly, this maize ClpP2 protein is not mitochondrial, and this finding warrants a more in-depth phylogenetic analysis of the Clp protease family in plants. The average and median BS-M ratio of the Clp system was 0.8 and 0.7, respectively; this rather equal distribution across BS and M chloroplasts is consistent with the idea that the Clp system serves as a general housekeeping protease, unlike some of the proteases with narrow functions and extreme BS-M ratios, such as SPPA. The slight bias (25%-40%) toward M accumulation may relate to specific control functions by the Clp protease system of plastid gene expression; this is certainly worth investigating further.
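The 25% to 40% bias toward M accumulation quoted above can be read as the reciprocal of the mean and median BS-M ratios. The sketch below shows that conversion; the function name is hypothetical, and this is one plausible reading of how the percentage follows from the ratios rather than a documented step of the analysis.

```python
def m_bias_percent(bs_m_ratio):
    """Percent excess in M over BS chloroplasts implied by a BS-M ratio."""
    if bs_m_ratio <= 0:
        raise ValueError("BS-M ratio must be positive")
    # A BS-M ratio of r means M holds 1/r times the BS level.
    return (1.0 / bs_m_ratio - 1.0) * 100.0


mean_bias = m_bias_percent(0.8)    # mean Clp BS-M ratio from the text
median_bias = m_bias_percent(0.7)  # median Clp BS-M ratio from the text
```

With the ratios given in the text, the mean corresponds to an M excess of about 25% and the median to roughly 40%.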
Other abundant proteases were Prep1 (likely involved in degradation of cleaved cTPs), several aminopeptidases (leucyl, glycyl, glutamyl, M24), and a Lon protease (LON2). Several of these soluble proteases showed a strong preferential distribution for either cell type; in particular, glutamyl endopeptidase (cGEP) and the abundant glycyl aminopeptidase (M1) were 4-fold and 2-fold higher, respectively, in the BS stroma, whereas stromal DegP2, the very abundant leucyl aminopeptidase LAP1, and the M24 aminopeptidase APP2 were 3-, 4-, and 5-fold higher, respectively, in M chloroplasts. The substrates of these proteases are unknown, but the preferential accumulation in one cell type suggests a narrow set of substrates.

Protein Sorting
We identified the major soluble and membrane components that control protein import, sorting, and translocation in the thylakoid (Tic110/40/55, SRP43/54, SecA/Y, TatC, and HCF106). Tic110, Tic40, SecY, and SecA were each identified with more than 50 SPC, and their BS-M ratios could be determined with confidence. The envelope import components, as well as both subunits of the SRP particle and soluble SecA, were clearly higher in M chloroplasts, whereas SecY was equally distributed. Since most of the abundant SecA-dependent lumenal proteins are components of the M-enriched PSII complex, the strong M accumulation of SecA is logical. SecY is the general thylakoid import channel for membrane proteins, and its equal distribution across the two cell types is consistent with its general role.

Protein Assembly Factors
We also identified 13 soluble and membrane proteins involved in the assembly of thylakoid complexes; these included several factors that function for very specific complexes (e.g. for PSII, HCF136 and LPA1; for PSI, PYG7 and YCF4) and at least three factors involved in biogenesis of 2Fe-2S complexes (NFU1,2,3; see also the section on sulfur assimilation below). Factors for PSII-specific complexes were severalfold higher in M chloroplasts, and those involved in PSI biogenesis were 30% to 40% higher in BS (Majeran et al., 2008). Thus, the BS-M accumulation ratios of the PSI- and PSII-specific assembly factors correlate well with the BS-M ratios of PSI and PSII themselves, indicating that expression of these assembly factors must be well coordinated with demand. Strikingly, the maize homologue of Arabidopsis stromal protein HCF101 (quantified with 69 adjSPC) involved in biogenesis of 4Fe-4S clusters (not 2Fe-2S) in PSI (Stockel and Oelmuller, 2004), as well as ferredoxin (Fd)-thioredoxin reductase (FTR; Lezhneva et al., 2004), showed 10-fold higher M accumulation. Unlike PSI, FTR is more highly expressed in M chloroplasts than in BS chloroplasts, and further studies on the regulation of cell-specific expression and accumulation of HCF101 may elucidate regulatory networks for chloroplast biogenesis and differentiation.

Protein (Un)folding and Maturation
We quantified 38 proteins involved in (un)folding and maturation; none had predicted transmembrane domains. These proteins included lumenal protein isomerases, general high-abundance stromal chaperones (several HSP70s and their GrpE nucleotide-exchange factors and CPN60/20/10 proteins), as well as low-abundance factors (e.g. BSD2 implicated in Rubisco assembly). As discussed previously (Majeran et al., 2008), several thylakoid lumen isomerases showed distinct preferential accumulation in BS or M thylakoids, suggesting specific adaptation or substrates. General chaperones were quite equally distributed, consistent with their broad array of substrates. Clear exceptions were the stromal chaperones HSP90 and ClpB3, involved in protein maturation and unfolding, respectively; they were 2- to 6-fold higher in BS chloroplasts. Considering the much lower chloroplast translation rates in BS chloroplasts, this is surprising and potentially very important for understanding the BS-M differentiation pathways; we speculate that this relates specifically to the specialization of the BS chloroplast (see "Conclusion").

Posttranslational Modifiers
Nucleus-encoded chloroplast proteins can be modified within the chloroplast after import, whereas chloroplast-encoded proteins can be modified during and after synthesis. We identified two different types of protein-repair proteins, Met sulfoxide type A4 and ribulosamine/erythrulosamine 3-kinase; both were only slightly more abundant in M chloroplasts, possibly because the antioxidative systems within the chloroplast have sufficient capacity to prevent protein damage. We identified peptide deformylase 1A (PDF1a), involved in cotranslational removal of the N-terminal formyl group of Met, with a BS-M ratio of 0.68; this higher M accumulation is again consistent with higher translation rates in M chloroplasts. The function of the very abundant and soluble methyltransferase (246 MS/MS spectra) is unknown, and it has a BS-M ratio similar to that of PDF1a. Phosphorylation plays an important role in chloroplast metabolism and adaptation, but more systematic studies on stromal kinases and phosphatases are only now beginning to emerge (in Arabidopsis; Schliebner et al., 2008; Reiland et al., 2009). We identified two protein phosphatases 2C (PP2C), a protein Tyr phosphatase, and a putative protein kinase inhibitor. The most abundant PP2C (246 MS/MS spectra) was more than 5-fold enriched in BS chloroplasts, whereas the zinc-binding domain protein and putative kinase inhibitor (68 MS/MS spectra) were more than 5-fold enriched in M chloroplasts. Reversible phosphorylation, in addition to redox regulation (see below), plays an important role in regulation of plant metabolism, and the study of BS and M chloroplast-specific (de)phosphorylation networks will be an important complement to this study.

The Photochemical Apparatus in the Thylakoid Membrane
We obtained extensive coverage (121 proteins and protein groups) of the linear electron transport chain, including PSII (33 proteins), PSI (23 proteins), the cytochrome b6f complex (five proteins), the ATP synthase (nine proteins), lumenal plastocyanin, five Fd proteins, three FNR1 isoforms, and six light stress proteins with chlorophyll-binding domains of the Ohp and Lil families (Supplemental Fig. S1). We also identified cyclic electron flow components of the NDH complex and NDH-specific biogenesis factors (23 proteins), NDH-independent cyclic electron flow components of the PGR complex (three PGRL1 and two PGR5 homologues), as well as alternative thylakoid terminal oxidases (PTOX or IMMUTANS; two homologues) and PIFI. We note that homologues of the new NDH subunits that we discovered previously in maize (Majeran et al., 2008) have now also been identified in Arabidopsis chloroplasts (Peng et al., 2009). Finally, we identified two state transition kinases (Stn7 and Stn8), the thylakoid phosphoprotein TSP9, and the Ca2+ phosphoprotein (Supplemental Fig. S1, DE-P). Most of these proteins were also identified based on our previous search against the ZmGI EST assembly database (Majeran et al., 2008); therefore, we will not discuss these proteins any further. BS-M ratios for these proteins per complex or function are shown in Supplemental Figure S1.

Envelope Transporters with Known and Unknown Functions
From our recent BS-M membrane study (Majeran et al., 2008) and general literature, combined with functional suggestions from the study of Weber and colleagues (Brautigam et al., 2008), we (tentatively) assigned functions for 26 chloroplast envelope transporters; these included M-enriched transporters MEP1,2,3,4 with unknown substrates, members of the DIT family (Dit1, Dit2, and OMT1), phosphate/triosephosphate translocators (TPT and PPT), the maltose exporter MEX1, ATP/ADP translocator AATP1 or NTT1, and anion transporter ANTR2. A number of envelope transporters (e.g. various porins, ATP-binding cassette transporters) could not be assigned to specific functions. Searching our membrane data again, this time against the maize genome, identified 46 transporters (52 proteins in 46 groups). We summarize the information for those transporters with (putative) substrates in Figure 4. Log2 BS-M ratios are displayed in small bar diagrams, with the bars shown in black, dark gray, and light gray, which indicate proteins identified with high (more than 100), medium (40-100), and low (less than 40) numbers of adjSPC, respectively. Proteins that are quantified with at least 40 adjSPC and enriched more than 1.5-fold in the M chloroplasts are marked in blue, whereas proteins enriched more than 1.5-fold in the BS chloroplasts are marked in red. Relative protein abundance (based on nSAF) across both cell types is shown as colored squares.
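The display rules described for Figure 4 amount to a small classification scheme. The following is a minimal sketch of those rules, using the thresholds stated in the text. The function names and return labels are hypothetical, and applying the 40 adjSPC minimum to the BS (red) marking as well as to the M (blue) marking is an assumption, since the text states it explicitly only for M enrichment.

```python
import math


def confidence(adj_spc):
    """Bar shading class from the number of adjSPC (thresholds from the text)."""
    if adj_spc > 100:
        return "high"    # black bars
    if adj_spc >= 40:
        return "medium"  # dark gray bars
    return "low"         # light gray bars


def enrichment(bs_m_ratio, adj_spc, fold=1.5, min_spc=40):
    """Return 'M' (blue), 'BS' (red), or None for a quantified protein.

    Assumption: the min_spc requirement is applied to both cell types.
    """
    if adj_spc < min_spc:
        return None
    if bs_m_ratio < 1.0 / fold:
        return "M"
    if bs_m_ratio > fold:
        return "BS"
    return None


def log2_ratio(bs_m_ratio):
    """Log2 BS-M ratio as plotted in the bar diagrams."""
    return math.log2(bs_m_ratio)
```

For example, a protein with a BS-M ratio of 0.45 quantified with 245 adjSPC would fall in the high-confidence class and be marked as M enriched under these rules.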
The transporter with the highest relative abundance was MEP1-1 (equally distributed between BS and M), followed by MEP3/4 (M enriched), PPT (strongly M enriched), and TPT (M enriched). Some of these transporters are also integrated with information on specific pathways, in particular primary carbon metabolism and the C 4 -malate shuttle (Fig. 5), starch metabolism (Fig. 6), and nucleotide and nitrogen metabolism (Figs. 7 and 9). These assignments are based on previous literature, and in the case of the Calvin cycle intermediates (3PGA and dihydroxyacetone phosphate [DHAP]) and malate shuttle substrates (PEP and OAA), these assignments are speculative. Identification of substrates for many of these transporters should be of highest priority.

The Calvin Cycle, the C 4 Malate Shuttle, the Oxidative Pentose Phosphate Pathway, and Glycolysis
Whereas our previous stromal analysis, employing two-dimensional gels and quadrupole time of flight-based MS analysis, identified several of the enzymes involved in the Calvin cycle and the C 4 malate shuttle (Majeran et al., 2005), the new stromal analysis using spectral counting, the LTQ-Orbitrap, and the new maize genome gives a nearly complete overview. Figure 5 shows an integrated overview with BS-M ratios of the Calvin cycle, the C 4 shuttle (PPDK, PPDK-RP, MDH, ME), the (irreversible) oxidative pentose phosphate pathway (G6PDH, Lact, 6PGDH) and the reversible pentose phosphate pathway (TKL, RPE, RPI, TA), and enzymes leading to starch biosynthesis (PGM1, PGM2, Glc6PI). Furthermore, Figure 5 shows C 3 forms of abundant C 4 shuttle enzymes (NADP-MDH-C3 and PPDK-C3). The inset shows the relative abundance of proteins for both cell types combined. We did not find evidence for accumulation of the specific chloroplast glycolytic enzymes (PyrK, ENI, PGlyM), nor did we find evidence for PPI-PFK or ATP-PFK involved in the conversion of F6P to F16BP. We did identify PGP, one of the two chloroplast-localized enzymes involved in photorespiration, only in the BS chloroplast. Chloroplast glycerate kinase, involved in the conversion of glycerate imported from peroxisomes into 3PGA, is likely GRMZM2G054663, but we never identified it.
In agreement with our initial analysis in 2005 (Majeran et al., 2005), our data strongly suggest that the reductive phase of the Calvin cycle, represented by GAPDHB and TPI, is enriched in the M chloroplast. The third enzyme in the reductive phase, PGK1, was also slightly higher in the M chloroplast, in contrast to the homologue PGK2, which was much higher in the BS chloroplast. Moreover, two of the reversible pentose phosphate pathway enzymes (RPI and TA; marked with asterisks) are also (somewhat) higher in M chloroplasts. The three enzymes of the oxidative PPP (G6PDH, Lact, and 6PGDH) are more highly expressed in the M chloroplasts. A second low-abundance isoform of Lact was higher in BS chloroplasts, but the significance is unclear, since it was only quantified by eight MS/MS spectra. The increased levels of the OPPP enzymes in M chloroplasts suggest that carbohydrates imported from the BS cell feed the OPPP in M chloroplasts, likely as a source of carbon intermediates for various pathways. Alternatively, the OPPP may also be higher to provide precursors to the shikimate pathway; indeed, as discussed further below (Table III), most enzymes in the shikimate pathway were more abundant in M chloroplasts than in BS chloroplasts. The preferential BS accumulation of PGM1, PGM2, and Glc6PI is not a reflection of increased rates of glycolysis but instead reflects higher rates of starch synthesis in BS chloroplasts (see next section). The C 3 -type PPDK is strongly enriched in BS chloroplasts, whereas C 3 -type NADP-MDH is equally distributed across both cell types.

Pathways for Starch Metabolism Show Strong Quantitative and Qualitative Differences between BS and M Chloroplasts
Expression and distribution of starch metabolic enzymes in C 4 leaves have not been systematically studied. We identified and quantified the relative abundances of 21 chloroplast-localized enzymes involved in starch synthesis and degradation. These 21 proteins were assigned names, functions, and positions in the starch metabolic pathway, based on BLAST alignments and extensive literature analysis (Smith et al., 2005; Zeeman et al., 2007; Fulton et al., 2008; Fig. 6). We used the nomenclature developed for Arabidopsis (Smith et al., 2004). For functional interpretation, we connected the starch pathway to enzymes (PGM1,2 and Glc6PI) involved in conversion between Glc1P and Glc6P and fructoses (F16BP and F6P), as well as the maltose and Glc transporters (MEX1 and GlcT1; Fig. 7). BS-M protein ratios are indicated as bar diagrams, and the total protein abundance (based on nSAF) across both cell types is shown as colored squares. The majority of enzymes are more abundant in BS chloroplasts than in M chloroplasts, and the total investment of enzymes in starch metabolism is nearly 3-fold higher in BS than in M chloroplasts (Table I); this is consistent with several previous observations (Spilatro and Preiss, 1987; Lunn and Furbank, 1997; Majeran et al., 2005, 2008) and with the much higher presence of starch particles in BS chloroplasts than in M chloroplasts. ADP-Glc-pyrophosphorylase (AGPase), a heterotetramer of large and small subunits, catalyzes the first committed step in starch biosynthesis and is controlled by a combination of allosteric control by 3PGA and inorganic phosphate, redox regulation, and trehalose (Kolbe et al., 2005; Zeeman et al., 2007). We identified two isoforms of the large subunit of AGPase (AGPL1 and -2) and one isoform for the small subunit (AGPS).
AGPL1 is nearly 10-fold more abundant than AGPL2; AGPL1 is about 2-fold higher in BS than in M chloroplasts, whereas AGPL2 is likely a specific isoform adapted to M chloroplast conditions, since it is more than 10-fold higher in M than in BS chloroplasts. AGPS was approximately 2-fold higher in BS than in M chloroplasts and likely serves both large isoforms.
The product of AGPase, ADP-Glc, is used by a family of starch synthases (SS) to generate linear Glc polymers named amylose and branched Glc polymers named amylopectin. We identified four starch SS, namely granule-bound SS (GSS), which is needed for synthesis of long linear chains of amylose, and SSI, SSIIa, and SSIIIb, which generate the linear chains of amylopectin. GSS was 4.4-fold higher in BS than in M chloroplasts, consistent with a higher production of starch in BS and the requirement of GSS for amylose synthesis. It is thought that SSI is needed upstream of SSII and that SSII operates upstream of SSIII; SSI is primarily responsible for the synthesis of short chains, and SSII and SSIII lengthen these chains further. SSI was equally distributed between BS and M chloroplasts, whereas the more abundant SSIIa was more than 2-fold higher in BS chloroplasts. The low-abundance SSIIIb was 1.4-fold higher in BS chloroplasts.
Together, this strongly suggests that starch synthesized in M chloroplasts has shorter amylose chains as compared with BS chloroplasts.
We identified three starch (de)branching enzymes (BEIIb1 and -2 and ISA2) involved in the synthesis of branched glycans. BEIIb introduces α-1,6 branch points and is a so-called class II branching enzyme. Absence of class II BE results in the production of only long-chain glucans and no amylopectin. ISA2 is a type of debranching enzyme involved in synthesis (rather than degradation), and it has been suggested that ISA2 has a more regulatory function (for discussion, see Zeeman et al., 2007). BEIIb2 was about 100 times more abundant than BEIIb1 and ISA2. BEIIb2 was more than 6-fold enriched in BS chloroplasts, whereas BEIIb1 and ISA2 were not detected in M chloroplasts. Together, these findings suggest that the starch produced in BS chloroplasts is more highly branched than starch in M chloroplasts.
We identified two kinases (PWD and GWD) that catalyze the phosphorylation of a glucosyl residue of amylopectin; these two enzymes stimulate starch degradation, possibly by loosening the surface of the granule structure and/or increasing the solubility of starch (Smith et al., 2005; Zeeman et al., 2007). Both have a similar relative concentration, and both were enriched (3.5- to 5-fold) in BS chloroplasts; these BS-enriched levels are indicative of and/or consistent with the reduced accumulation of amylopectin in M chloroplasts (see below). We also identified the maize homologue of Arabidopsis phosphoglucan phosphatase DSP4 (or SEX4; Kotting et al., 2009); DSP4 was 3-fold enriched in M chloroplasts. DSP4 dephosphorylation activity is needed to allow effective hydrolysis of glucans by BAM3 and ISA3. The preferential M accumulation of DSP4 is puzzling and suggests a specific adaptation of starch degradation.
There are several starch breakdown pathways in leaves, and they result in maltose, Glc, or Glc1P (Zeeman et al., 2007). We were able to assign maize homologues to each of these pathways in BS chloroplasts. In C 3 leaves such as those of Arabidopsis, maltose is the main breakdown and export product of transient starch during the dark period.
Based on the relatively high levels of ISA3 as compared with DPE1 (approximately 15-fold higher), it is likely that BS chloroplasts also use maltose as the main export product in the night. However, our observations of high levels of DPE1 (comparable to ISA3 in BS chloroplasts) and its low BS-M ratio (Fig. 6) suggest that in M chloroplasts the dominant end product of starch degradation is Glc. Furthermore, no evidence was found for the phosphorolytic pathway in M chloroplasts involving PSH1, while PSH1 was identified in BS chloroplasts very confidently with 57 MS/MS spectra. The maltose exporter MEX1 was 2-fold more abundant in BS chloroplasts, whereas the Glc transporter was 2-fold more abundant in M chloroplasts; both expression patterns are consistent with these different preferences for starch degradation. We identified three β-amylase homologues, BAM3, BAM6, and BAM9 (Smith et al., 2004; Fulton et al., 2008), types II, I, and III, respectively. BAM3 and BAM9 were only detected in BS chloroplasts, whereas BAM6 was nearly 2-fold higher in M than in BS chloroplasts. The functions of maize or Arabidopsis BAM6 and BAM9 have not been studied. Since we did not identify BAM3 in M chloroplasts, it is tempting to speculate that BAM6 represents the major β-amylase activity in maize M chloroplasts; it should be noted that we only found BAM6 in chloroplast membrane fractions and not in the stroma (possibly in association with starch granules).
Starch synthesis in the BS chloroplast begins with condensation of the triose phosphates DHAP and GAP by Fru-bisP aldolase-2 (SFBA-2), leading to F16BP, followed by dephosphorylation by F16BPase into F6P (in chloroplasts, SFBA is part of both the Calvin cycle and, in the dark, glycolysis, whereas F16BPase is unique to the Calvin cycle). In M chloroplasts, the possible sources for starch synthesis are 3PGA directly imported from the BS cells, DHAP generated within the M chloroplasts through the reductive part of the Calvin cycle, and Glc6P imported from the cytosol and supplied by the BS cells. M chloroplast import of 3PGA from the BS chloroplast occurs through the envelope transporter TPT, whereas it is known that in nonphotosynthetic plastids in sink tissues, Glc6P is imported by the envelope transporter GPT in exchange for TP and/or inorganic phosphate. We did not observe GPT in the leaf chloroplasts, indicating that the source for starch synthesis in M chloroplasts is not imported Glc6P, but we observed very significant levels of TPT (245 matched MS/MS spectra). Therefore, it is most likely that the limited amount of starch produced in M chloroplasts is produced through the reductive phase of the Calvin cycle, followed by the activity of SFBA and F16BPase and subsequent conversions by Glc6PI and PGM, similar to BS chloroplasts. We note that PGM2 is mostly localized in the BS chloroplast (BS-M ratio is 3.3), whereas the 10-fold more dominant form PGM1 is equally distributed over both cell types (Fig. 6).
The key step in the regulation of starch synthesis occurs at the level of AGPase through allosteric regulation by 3PGA (stimulation) and inorganic phosphate (inhibition) as well as redox regulation, in addition to the availability of ATP and Glc1P (Zeeman et al., 2007). Regulation of starch synthesis in BS chloroplasts most likely follows the C 3 -type regulation, which is well studied in Arabidopsis leaves (Zeeman et al., 2007). Starch synthesis in M chloroplasts is likely limited by the capacity to generate sufficient Glc1P. Functional characterization of BAM6 and BAM9, as well as analysis of the composition of M chloroplast-localized starch, is needed to fully understand the role of transient starch storage in C 4 leaves.

Redox Regulation and Defense against Oxidative Stress
Redox regulation and defense against ROS are of key importance for optimal functionality of the chloroplast. We identified 45 proteins and protein groups (total of 54 genes) that we classified as being involved in chloroplast redox regulation and/or oxidative stress defense; most proteins were soluble proteins identified in the stroma (Table II). Whereas we did identify and quantify some of those in our initial stromal BS-M analysis (Majeran et al., 2005), the newly quantified set is far more exhaustive, but it correlates well with our initial observations. Table II shows BS-M accumulation ratios as well as relative protein abundance for each of the annotated proteins.
We identified and quantified the known key chloroplast protein components involved in detoxification of superoxide and hydrogen peroxide as well as the glutathione defense system (Table II). In addition, we identified five different glutaredoxins (total of seven genes), four peroxiredoxins, and one rubredoxin. The most abundant proteins (based on nSAF) were 2-Cys peroxiredoxin A,B (with 4,999 MS/MS spectra), copper/zinc-superoxide dismutase, thioredoxins m2 and m4, and peroxiredoxin IIE; these were all soluble proteins, identified in the stromal fractions. Except for a low-abundance ascorbate peroxidase (33 matched MS/MS spectra), all components of this detoxification system were between 2- and 5-fold enriched in the M chloroplasts; this is consistent with the much higher linear electron transport rates and water-splitting activity by PSII and the associated risk of generation of oxygen radicals (for more discussion and references, see Majeran et al., 2005; Majeran and van Wijk, 2009).
We identified three FNR proteins (all three with high numbers of MS/MS spectra) as well as five Fd proteins. These Fd proteins match to Arabidopsis Fd1 (AT1G10960.1), Fd2 (AT1G60950.1), and Fd3 (AT2G27510.1) and to an uncharacterized Fd protein (AT4G14890.1). Maize Fd3 was only identified with five matched MS/MS spectra, whereas the others were identified with many more spectra, ranging from 44 (Fd1) to 518 (Fd2-1). Fd2-2 was BS enriched, whereas the others were enriched in M chloroplasts. We mention these proteins here since the Fd-FNR system connects thylakoid electron transport activity to metabolic activity via NADPH (Table II).
Redox regulation through the thioredoxin system is important in chloroplasts and can activate and deactivate metabolic pathways (Michelet et al., 2006; Lemaire et al., 2007; Schurmann and Buchanan, 2008). We identified and quantified 10 different chloroplast thioredoxins (total of 13 genes). Interestingly, we also identified and quantified the α- and β-subunits of the Fd-dependent thioredoxin reductases (FTR-A/B; Schurmann and Buchanan, 2008) as well as the NADPH-dependent thioredoxin reductase C (NTRC). NTRC contains both an NADP-thioredoxin reductase (NTR) domain and a thioredoxin domain, and it is able to conjugate both NTR and thioredoxin activities to reduce 2-Cys peroxiredoxin using NADPH as a source of reducing power (rather than Fd). NTRC may also more generally help to integrate the regulation of metabolic processes, including regulation of starch synthesis (Lepisto et al., 2009; Michalska et al., 2009). NTRC was very strongly (10-fold) enriched in M chloroplasts. Two SOUL domain proteins, possibly involved in heme delivery or degradation, were also identified. Strikingly, except for thioredoxin (CDSP32), thioredoxin h-2 (Trx-H2), thylakoid-bound APX-2, and SOUL heme binding, which were enriched in the BS chloroplasts, all other proteins were preferentially located in the M chloroplast. NTRC (with 80 matched MS/MS spectra) and two of the glutaredoxins (with 137 and 23 MS/MS spectra) showed the most extreme BS-M ratios of 0.09, 0.08, and 0.05 (Table II).
The differential accumulation of redox and ROS defense components shows that BS and M chloroplast metabolism operates under different redox and oxygen radical stress conditions. Cell type-specific quantitative and qualitative differences in redox regulators and ROS defense components are in place to cope with these different environments.

Nucleotide Metabolism and Homeostasis of Nucleotides
Nucleotides are critical molecules in nearly every aspect of plant life and function in primary and secondary metabolism as well as gene expression. Purine nucleotides (ATP and GTP) and pyrimidine nucleotides (UTP and CTP) are predominantly synthesized in plastids (Zrenner et al., 2006). Their synthesis requires high amounts of energy. Phosphotransfer reactions by kinases and phosphatases convert mononucleotides and dinucleotides to trinucleotides and also equilibrate different pools of nucleotides; this is important to balance the activity of different chloroplast metabolic pathways and to reequilibrate between chloroplasts and the cytosol. In addition, conversion of the pyridine nucleotide cofactor NAD or NAD+ to NADP(H) by NAD(H) kinases is important to accommodate the activity of NAD- or NADP-specific enzyme activities. Since BS and M chloroplasts in C 4 plants have such different roles in photosynthesis and carbon metabolism, we were particularly interested to determine specific adaptations of these two chloroplast types in terms of nucleotide metabolism and homeostasis. However, this is challenging, since many of the biosynthetic enzymes are known to accumulate at very low levels, and often they are underrepresented in proteomics studies.
Table II. Distribution of redox regulators and the ROS defense machinery across BS and M chloroplasts. Data include annotations for function and subchloroplast location, membrane-stroma ratios (based on nadjSPC) and BS-M ratios (based on nadjSPC), number of matched adjSPC, and relative protein accumulation levels (based on nSAF).
Despite the low abundance of many enzymes, we identified 23 proteins/groups of proteins (total of 33 genes) involved in nucleotide metabolism and homeostasis (Fig. 7). All proteins were found in the soluble stromal fraction, with the exception of one of the inorganic pyrophosphatases, adenylate monophosphate kinase 5, and the envelope nucleotide translocator (AATP1/NTT1); these were detected in the chloroplast membrane fractions. The most abundant proteins were stromal adenylate monophosphate kinase 2 (with more than 5,000 matched MS/MS spectra), nucleoside diphosphate kinase 2, and soluble inorganic pyrophosphatase. The membrane-bound ATP/ADP translocator, AATP1, involved in import of ATP in exchange for ADP, has a BS-M ratio of 0.45, which is consistent with higher photosynthetic electron transport and ATP synthase capacity in M than in BS chloroplasts. We note that we did not detect the plastid envelope-localized uniporter BRT1; there are two isoforms (BRT1-1 and -2) in maize, and the Arabidopsis homologue of one of them was recently shown to be involved in the export of AMP, ADP, and ATP. The other maize isoform serves to export ADP-Glc in endosperm.
A greater portion of the enzymes in nucleotide metabolism are preferentially located in M cells. One possible explanation for this is that de novo biosynthesis of nucleotides is very energy consuming (Zrenner et al., 2006). M chloroplasts are able to generate ATP by linear and cyclic electron flow (Edwards et al., 2001) and therefore may be more capable of driving biosynthesis of nucleotides. However, two enzymes that have high SPC, formylglycinamidine ribonucleotide synthase (FGAMS) and carbamoylphosphate synthetase (CPS), are more concentrated in BS cells (BS-M = 3.32 and 3.14 for FGAMS and CPS, respectively). As discussed in nitrogen assimilation, CPS may be responsible for recycling photorespiratory ammonium, thus explaining its preferred localization in BS cells. We have no functional explanation for the preferred BS location of FGAMS.

Fatty Acid and Lipid Metabolism
We identified 27 proteins and protein groups (total of 36 genes) involved in fatty acid and lipid metabolism, most of them in the soluble fraction (Fig. 8; Supplemental Table S3). Nineteen enzymes were involved in fatty acid synthesis and elongation, and two stearoyl-ACP desaturases (SSI2/FAB2) and linoleate desaturases (FAD7/FAD8) were involved in fatty acid desaturation. Furthermore, we identified UDP-sulfoquinovose synthase (SQD1), which is responsible for the production of sulfolipids. Finally, three enzymes were involved in lipid degradation and four proteins were assigned to lipid transport. We were able to annotate and functionally assign most of these proteins, and we reconstructed the pathway for fatty acid synthesis (Fig. 8). We did not observe the ACP-thioesterases (Fat-A and -B) that terminate the elongation reactions; since none of the many proteomics studies in Arabidopsis identified these either, we conclude that they must have a lower abundance than the other fatty acid synthesis enzymes. Most of the proteins (18 out of 27) were exclusively identified in the soluble stromal fractions; proteins exclusively identified in the membrane fraction were components of the pyruvate dehydrogenase (PDH) complex and phosphatidic acid-binding protein (TGD2) associated at the inner envelope membrane, two of the lipases, linoleate desaturases (FAD7/FAD8), and a lipocalin domain protein. Protein abundance of the identified proteins (based on nSAF) spanned nearly 4 orders of magnitude, with two acyl carrier proteins (ACP3 and -4) being the most abundant proteins. The high abundance of these carrier proteins is consistent with their functions.
In our initial stromal analysis (Majeran et al., 2005), we identified only four proteins involved in fatty acid and lipid metabolism; three were preferentially expressed in M chloroplasts. Our analysis here greatly expands the coverage of these pathways, which allowed us to better assess differences between M- and BS-localized fatty acid and lipid metabolism. The average and median BS-M ratios for all proteins involved in fatty acid synthesis were 0.91 and 0.94, respectively, indicating comparable distribution of fatty acid synthesis across both cell types. This suggests that demand for fatty acids is similar in both BS and M cells. However, we observed clear differential BS-M expression for two of the lipases (Fig. 8; see below).
Fatty acids are strictly synthesized in the chloroplast (and nongreen plastids), and synthesis occurs by sequential addition of two carbon units to acyl groups attached to a soluble acyl carrier protein (ACP). Acetyl-CoA is the initial carbon precursor and the building block for elongation. In C 3 chloroplasts, most acetyl-CoA is generated from pyruvate by pyruvate dehydrogenase, rather than through acetyl-CoA synthase using imported acetate as a source (Fig. 8). Pyruvate was clearly not generated through chloroplast glycolysis, since we did not identify any of the three plastid glycolytic enzymes. Through the C 4 carbon cycle, high levels of pyruvate are generated in BS chloroplasts and redistributed to M chloroplasts; this should provide a good resource for the synthesis of acetyl-CoA, even if it is in direct competition with the Calvin cycle. The pyruvate pool could be replenished by import from the cytosol, but our data do not allow us to assess that contribution. It was interesting that acetyl-CoA synthetase (also named acetate-CoA ligase) was only found in M chloroplasts at significant levels (59 adjSPC). This provides an alternative route for acetyl-CoA production, independent of pyruvate and pyruvate dehydrogenase.
The glycerol precursor phosphatidic acid can be synthesized within the plastid from the Calvin cycle intermediate DHAP through the prokaryotic pathway and involves DHAP reductase and acyl-ACP-glycerol 3-phosphate acyltransferase (ATS1 and -2). Alternatively, phosphatidic acid is imported from the endoplasmic reticulum via envelope proteins TGD1 and TGD2. Phosphatidic acid is then dephosphorylated by an inner envelope phosphatase (PAP) to produce diacylglycerol followed by enzymatic transfer of one or two molecules of Gal, resulting in monogalactosyldiacylglycerol and digalactosyldiacylglycerol. Alternatively, sulfoquinovose is transferred from UDP-sulfoquinovose to the diacylglycerol portion of phosphatidic acid to produce the sulfolipid SQDG. We identified SQD1 and the subsequent enzyme Fd-Glu synthetase (GLU1; see nitrogen assimilation), leading to the production of UDP-sulfoquinovose but none of the actual enzymes that catalyze the conversion of phosphatidic acid to these various lipid classes (i.e. SQD2, MGD1, DGD1). An indirect alternative source for glycerol is the breakdown of phosphatidylglycerol by lipases. We identified three lipases in the chloroplast membrane fractions, but their precise functions are unknown. Two of these lipases, in particular the very abundant DAD1 hydrolase, showed preferential M accumulation (Fig. 8). The TGD2 gene encodes a phosphatidic acid-binding protein tethered to the inner chloroplast envelope membrane facing the outer envelope membrane. It is proposed that TGD2 represents the substrate-binding or regulatory component of a phosphatidic acid/lipid transport complex in the chloroplast inner envelope membrane. We identified TGD2 with 21 adjSPC in M membranes but not in BS membranes, suggesting differential regulation of lipid transport between chloroplasts and the endoplasmic reticulum in BS compared with M membranes.

An Integrative Overview of Cell Type-Specific Expression Patterns of Isoprenoid and Tetrapyrole Metabolism
Isoprenoids and tetrapyroles play a central and absolutely essential role in functioning of the chloroplast, and they include tocopherols, quinones, carotenoids, chlorophyll, and heme. The enzymes of these pathways are distributed over the soluble and membrane phases, with the upstream steps located in the stroma and the downstream steps in the chloroplast membranes. Twenty-two proteins involved in the isoprenoid pathway were identified, 20 of which we could assign to specific steps (Fig. 9). We also identified 21 proteins in tetrapyrole synthesis and two involved in chlorophyll degradation, and this information was integrated with the isoprenoid pathway (Fig. 10). Except for 1-deoxy-D-xylulose-5-phosphate synthase, we identified all enzymes in the MEP pathway, and except for urogen III synthase, magnesium protomethyltransferase, and the D-subunit of magnesium chelatase, we identified the complete pathway leading to chlorophyllide, including the regulators FLU and GUN4 (Fig. 9). Finally, we also identified two of the three enzymes (iron chelatase and heme oxygenase) of the branch leading to the production of phytochromobilin. For some of the enzymes, we identified two homologues. Finally, VTE1 (tocopherol cyclase in the plastoglobules), VTE3 (MPBQ/MSBQ methyltransferase in the inner envelope), and VTE4 (tocopherol methyltransferase involved in tocopherol synthesis) were also identified. Geranylgeranyl pyrophosphate also serves as the precursor for carotenoid synthesis. We identified six enzymes in carotenoid synthesis, and with the exception of phytoene desaturase, these proteins were of low abundance (Fig. 9).
These pathways were integrated in Figure 9, and the relative protein concentrations (based on nSAF values) in BS and M chloroplasts are displayed to obtain a unique overview of abundance of the various steps, as well as BS-M chloroplast-specific accumulation patterns.
Chlorophylls and heme are required in both BS and M chloroplasts, with a higher demand for chlorophyll b in M chloroplasts due to the higher level of PSII light-harvesting complexes. The average and median BS-M ratios for chlorophyll and heme pathway proteins were 1.2 and 0.67, respectively; this suggests a quite equal distribution across the two cell types, possibly with some preferential accumulation in M chloroplasts. For discussion of some of the membrane-bound enzymes, see Majeran et al. (2008).
Isoprenoids can be synthesized in the cytosolic mevalonate pathway or the chloroplast-localized MEP pathway (Phillips et al., 2008; Cordoba et al., 2009). Isopentenyl diphosphate is an end product of both pathways and can possibly be transported across the chloroplast envelope. However, it is generally believed that the MEP pathway provides most, if not all, precursors for the synthesis of carotenoids, chlorophyll, quinones, and tocopherol. The MEP pathway and the downstream enzymes geranyl diphosphate synthase (GPPS) and geranylgeranyl diphosphate synthase (GGPS) accumulated at higher levels in the M than in BS chloroplasts. This is likely due to the much higher demands for carotenoids in the M chloroplasts, as indicated by the low BS-M ratios for several of the enzymes (ZEP, CCD, LUT1, LCY-b; Fig. 9). This higher demand for carotenoids must relate to their role in quenching of excess light and detoxification of triplet chlorophyll and singlet oxygen, in particular generated during linear electron transport. Similarly, we observed somewhat preferred M accumulation of the VTE enzymes involved in tocopherol and plastoquinone synthesis (Fig. 10); both components are needed in the thylakoid membrane, in particular during linear electron transport.
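The gap between an average of 1.2 and a median of 0.67 is what one expects when a few strongly BS-enriched proteins pull the arithmetic mean upward. The following minimal Python sketch illustrates this effect; the ratio values and the function name `summarize_ratios` are hypothetical and are not taken from the study's data. It also reports the geometric mean (the mean on a log scale), which treats a 2-fold BS enrichment and a 2-fold M enrichment symmetrically.

```python
import math
import statistics

def summarize_ratios(ratios):
    """Return arithmetic mean, median, and geometric mean of BS-M ratios."""
    logs = [math.log(r) for r in ratios]
    return {
        "mean": statistics.mean(ratios),
        "median": statistics.median(ratios),
        # Geometric mean: average on the log scale, then transform back.
        "geometric_mean": math.exp(statistics.mean(logs)),
    }

# Hypothetical BS-M ratios (NOT values from the study): most proteins are
# mildly M enriched, and one protein is strongly BS enriched.
example_ratios = [0.4, 0.5, 0.6, 0.67, 0.8, 1.0, 4.4]

summary = summarize_ratios(example_ratios)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

In this illustrative set, the single high ratio raises the arithmetic mean above 1 while the median and geometric mean stay below 1, which is the same qualitative pattern as the reported values.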

Nitrogen Assimilation from Inorganic and Organic Nitrogen Sources
Chloroplasts play a central role in nitrogen assimilation. During nitrogen assimilation, nitrogen is incorporated into amino acids in the form of ammonium (Fig. 10; Table III). Plants obtain ammonium from two sources. The primary source originates from inorganic nitrogen, either by reduction of nitrate or by direct uptake of ammonium from soil or symbiotic rhizobia. The secondary source is derived from organic compounds within the plant through processes such as photorespiration. Upon transport of nitrate through the vascular system into the BS and M cells, nitrate is reduced to nitrite by nitrate reductase in the cytosol (Lillo, 2008). Nitrite is then imported into the chloroplast by the nitrite transporter (Sugiura et al., 2007), followed by further reduction into ammonia by Fd-nitrite reductase (NiR) and subsequent incorporation of ammonia into Gln in the GOGAT/GS cycle and its redistribution to other amino acids through Asp transaminase. An alternative route of ammonium assimilation is the formation of CP for the synthesis of Arg (Fig. 11) and pyrimidine nucleotides (Fig. 8).
We did not identify the nitrite transporter in maize, most likely due to its low abundance; the Arabidopsis homolog has not (yet) been identified in any of the published proteomics studies either. However, we did identify and quantify the major nitrogen assimilation enzymes within chloroplasts (Fig. 10; Table III): NiR (BS-M = 0.18), Gln synthase 2 (BS-M = 1.33), and Fd-dependent and NADH-dependent Glu synthases (Fd-GOGAT and NADH-GOGAT; BS-M = 1.00 and 5.33, respectively). Previous studies have indicated that primary nitrogen assimilation takes place in M cells, with the Fd-NiR and Fd-GOGAT predominantly localized to M cells, while GS activity is present in both cell types (Rathnam and Edwards, 1976; Harel et al., 1977; Becker et al., 1993). Our proteomic data of NiR and Gln synthase 2 are consistent with these findings. In the case of Fd-GOGAT, we found equal abundance in BS and M cells. The explanation for this apparent discrepancy likely lies in its participation in both primary (preferentially in M) and secondary (preferentially in BS from photorespiration) nitrogen assimilation. NADH-GOGAT is strongly BS enriched (BS-M = 5.33), but its overall abundance is 100-fold lower (based on nSAF values) than Fd-GOGAT. NADH-GOGAT is believed to be generally higher expressed in plant roots than in leaves, but unlike Fd-GOGAT, it has not been studied in much detail. The strong preferential cell type-specific accumulation of NADH-GOGAT likely reflects a metabolic adaptation to the BS chloroplast environment.
Our results are consistent with the notion that inorganic nitrogen assimilation is tightly correlated with photosynthesis, and its distribution between BS and M must depend on the availability of reducing equivalents for Fd-NiR activity.
We detected three related chloroplast envelope transporters: OMT1 (BS-M = 0.09), DiT1 (BS-M = 0.26), and DiT2 (BS-M = 1.16). DiT1 transports 2-oxoglutarate into the chloroplast, whereas DiT2 exports Glu into the cytosol; both transporters use malate as the countertransport molecule (Fig. 10). The precise role of OMT1 is unclear, but it is clearly preferentially accumulating in M chloroplasts, similar to DiT1. OMT1 and DiT1 are strongly (4- to 10-fold) enriched in the M chloroplast (BS-M = 0.09 and 0.26, respectively), likely because of the high flux rate of malate from M chloroplasts in the C 4 cycle and low export from BS chloroplasts due to their high rate of malate consumption by ME. Moreover, photorespiration is confined to the BS cells (but operates at lower flux rates than in C 3 plants). DiT2 is only slightly higher in BS chloroplasts; using the ZmGI database, we found that DiT2 had a higher BS-M ratio (1.7; Majeran et al., 2008; Majeran and van Wijk, 2009). Careful verification of the full-length sequences of the DiT family is needed to clarify this difference.
Asp aminotransferase catalyzes the reversible transamination between Glu and OAA to give rise to Asp and 2-oxoglutarate. The rate of OAA production in the cytosol (from PEP by PEP carboxylase) and subsequent import into M chloroplasts is high, since this is part of the C 4 cycle. Asp aminotransferase is approximately 2-fold higher in M chloroplasts (BS-M = 0.54), consistent with our initial observation (Majeran et al., 2005). The production of Asp does compete with production of malate and must involve several regulatory steps. In addition, Asp aminotransferase may provide a link between nitrogen and carbon pathways in maize, if Asp is used to transport carbon to the BS cells, as in NAD-ME and PEP carboxykinase types of C 4 . CPS condenses ammonium (or the Gln amide group) and HCO3− to make CP, a precursor for Arg and pyrimidine biosynthesis (Fig. 10). It was indicated that, in C 3 plants, mitochondrial CPS is involved in recycling photorespiratory nitrogen (Taira et al., 2004; Potel et al., 2009). Here, we observed higher levels of the CPS large subunit-2 and small subunit in BS cells (BS-M = 4.16 and 2.43, respectively), while the CPS large subunit-1 level is comparable (BS-M = 1.03). This implies that the chloroplast-targeted CPS may also play a role in the recovery of photorespiratory nitrogen. Unlike CPS, most enzymes in the Arg synthesis pathway are preferentially located in M cells (see section on amino acid biosynthesis below); this is best explained by a lower availability of Glu in BS chloroplasts, as Glu is needed in the GOGAT cycle to remove ammonium released from photorespiration.
In Arabidopsis, nitrogen assimilation is integrated with carbon metabolism via the chloroplast-localized nitrogen sensor protein PII (Smith et al., 2003), and it interacts with N-acetyl Glu kinase (Chen et al., 2006). The PII protein senses concentrations of ATP and 2-oxoglutarate and relieves Arg feedback inhibition of N-acetyl Glu kinase. A similar scenario seems in place in rice (Sugiyama et al., 2004). We identified PII in Arabidopsis chloroplasts with many MS/MS spectra (see PPDB). Surprisingly, there is no obvious maize homolog of the rice or Arabidopsis PII protein, which suggests that integration of carbon and nitrogen metabolism is organized differently in maize and possibly other C 4 species (possibly this integration occurs through regulating the activity of Asp aminotransferase, as mentioned above).
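A BS-M ratio below 1 is read as M enrichment by taking its reciprocal, which is how ratios of 0.09 and 0.26 translate into roughly 10-fold and 4-fold enrichment in M chloroplasts. The short Python sketch below makes that conversion explicit for the three transporter ratios quoted above; the helper name `fold_enrichment` is an illustrative choice and not part of the study's workflow.

```python
from typing import Tuple

def fold_enrichment(bs_m_ratio: float) -> Tuple[str, float]:
    """Convert a BS-M ratio into (enriched cell type, fold enrichment)."""
    if bs_m_ratio <= 0:
        raise ValueError("BS-M ratio must be positive")
    if bs_m_ratio >= 1.0:
        return "BS", bs_m_ratio
    # Ratios below 1 indicate M enrichment; the fold value is the reciprocal.
    return "M", 1.0 / bs_m_ratio

# BS-M ratios as reported in the text for the envelope transporters.
transporter_ratios = {"OMT1": 0.09, "DiT1": 0.26, "DiT2": 1.16}

for name, ratio in transporter_ratios.items():
    cell_type, fold = fold_enrichment(ratio)
    print(f"{name}: {fold:.1f}-fold higher in {cell_type} chloroplasts")
```

For the values above, this gives about 11-fold (OMT1) and about 3.8-fold (DiT1) enrichment in M chloroplasts, in line with the rounded 4- to 10-fold range stated in the text.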

Sulfur Assimilation and Synthesis of Met, Cys, and Glutathione
Sulfur is absorbed by plants in the form of sulfate, which is first reduced to sulfite and then to sulfide before being incorporated into Cys. Sulfate assimilation requires high amounts of reducing equivalents and therefore occurs mostly in the photosynthetically active leaves; reduced sulfur compounds are then distributed to sink tissues via the phloem (Hopkins et al., 2004;Kopriva and Koprivova, 2005). Sulfate reduction takes place exclusively in chloroplasts, whereas Cys can be synthesized in chloroplasts, cytosol, and mitochondria (Krueger et al., 2009, and refs. therein). But since sulfate reduction takes place exclusively in plastids in both C 3 and C 4 species, cytosolic or mitochondrial Cys synthesis is limited by transport of sulfide out of the chloroplast (but see below).
We obtained good coverage of enzymes involved in the step-wise reduction of sulfate to sulfite and then to sulfide (Fig. 11; Table III). Coverage of subsequent synthesis of Cys, Met, and glutathione was more scattered, likely relating to the distribution of these synthetic pathways across multiple subcellular compartments (Fig. 11). Older studies using other assays than proteomics revealed that the activities of ATP sulfurylase (ATPS) and adenosine 5′-phosphosulfate sulfotransferase (APR) in maize were confined to BS cells, while Cys synthase (O-acetyl-Ser thiol lyase [OASTL]) activity is found in both cell types but at a higher level in M cells (Gerwick and Black, 1979; Burnell, 1984; Schmutz and Brunold, 1984; Burgener et al., 1998). These previous observations are consistent with our current observations for ATPS2 and APR3 and -4, which showed BS-M ratios of 3.86, 9.92, and 19.75, respectively, whereas two OASTL isoforms (also named Cys synthase) were 40% to 100% higher in M chloroplasts than in BS chloroplasts. OASTL was very abundant (identified with 504 MS/MS spectra), but we did not identify Ser acetyl-CoA transferase (SAT), which is only active when associated with OASTL. The lack of detectable levels of SAT is consistent with a previous report indicating that OASTL is far more abundant than SAT (Hopkins et al., 2004). In addition, SAT activity is also reported to be high in mitochondria and cytosol, and the possibility of OAS transport between subcellular compartments is suggested by experimental data (Krueger et al., 2009). Interestingly, we found that the highly abundant sulfite reductase (identified with 172 adjSPC) was equally distributed across BS and M chloroplasts (BS-M = 1.08), suggesting that sulfite is transported from BS to M chloroplasts, in addition to well-established Cys transport (Burgener et al., 1998). Reduction of sulfite into sulfide in M chloroplasts does decrease the demand for NADPH in BS, which should
Table III. Distribution of enzymes involved in amino acid metabolism, nitrogen and sulfur assimilation, and iron-sulfur cluster assembly across BS and M chloroplasts. Data include annotations for function and subchloroplast location, membrane-stroma ratios (based on nadjSPC) and BS-M ratios (based on nadjSPC), number of matched adjSPC, and relative protein accumulation levels (based on nSAF).
Cys is used for the synthesis of Met in a three-step reaction, involving cystathionine γ-synthase, cystathionine β-lyase (CBL), and Met synthase (MetS); we did not observe cystathionine γ-synthase, but we found CBL and MetS preferentially located in BS cells respectively). CBL clearly appears to be a bona fide chloroplast-localized enzyme (also when comparing with other maize samples; W. Majeran, G. Friso, and K.J. van Wijk, unpublished data). However, MetS is quite possibly a contamination from the cytosol. MetS subcellular localization was believed to be cytosolic, but recently it was shown for Arabidopsis that one of the isoforms of MetS (MS3) is localized in the plastid, and MetS activity in the plastid was supported by experimental evidence (Hacham et al., 2008). The MetS protein that we identified was very abundant in total maize leaf extract and isolated BS strands, and even if this MetS protein is not within the chloroplast, it is highly enriched in the BS cells, as indicated in Figure 11.
Cys is also the source for sulfur atoms for proteins with iron-sulfur clusters; these include several enzymes in sulfur assimilation (APR and sulfite reductase) as well as enzymes in nitrogen assimilation (NiR and GOGAT) and photosynthetic electron transport (Ye et al., 2006; Xu and Moller, 2008). We identified eight proteins involved in iron-sulfur formation: Cys desulfurase (cpSufS/cpNifS) and its activator cpSufE, which extract elemental sulfur from Cys; cpSufB, -C, and -D; and three NFU proteins that contribute to iron-sulfur cluster assembly (Fig. 11; Table III). These five proteins were on average 3-fold enriched in the M chloroplast; given the complexity of sulfur metabolism and its integration with iron homeostasis, there is no simple explanation for this M enrichment.
The tripeptide glutathione is an important thiol functioning in defense and ROS scavenging (see earlier section on redox and ROS), and glutathione is also a source of electrons for the sulfate reduction (Fig. 11). Glutathione is synthesized from Glu, Cys, and Gly in a two-step reaction, involving γ-glutamyl-Cys synthetase (γ-ECS) and glutathione synthase. γ-ECS is exclusively localized in plastids in all plant species, whereas glutathione synthase also (or predominantly) occurs in the cytosol (as studied in various C 3 species). We detected γ-ECS (with 36 adjSPC) with a larger portion of the total protein located in M chloroplasts (BS-M = 0.66), consistent with feeding experiments that showed preferential glutathione synthesis in M cells (Burgener et al., 1998). We did not detect glutathione synthase in the BS or M chloroplasts, which is consistent with a report that it is localized in both cytosol and plastids, thus possibly lowering the plastid glutathione synthase concentration (Gomez et al., 2004), but we did detect glutathione reductase at high levels (230 adjSPC), responsible for converting oxidized glutathione to reduced glutathione. Glutathione reductase was 3-fold higher in M chloroplasts, consistent with the higher rate of ROS production (see section on redox regulation and ROS defense). Our observations clarify several of the controversies for the distribution of sulfur metabolism in maize BS and M cells (Kopriva and Koprivova, 2005).

Amino Acid Metabolism
Amino acid synthesis in plants is distributed across chloroplasts, cytosol, and mitochondria. Some amino acids (e.g. Gln) act primarily to assimilate and transport nitrogen. Amino acids are also precursors of many nitrogen-containing compounds. Therefore, amino acid biosynthesis is regulated through various mechanisms and with various alternative pathways and is tightly coordinated with carbon metabolism. Given this complexity, it is perhaps not surprising that very little is known about the distribution of amino acid biosynthetic pathways between BS and M cells in C 4 plants. In our previous BS-M stromal analysis, we could only identify a handful of enzymes involved in amino acid biosynthesis (Majeran et al., 2005). In this study, we identified and functionally assigned some 60 proteins/protein groups (about 70 genes) involved in amino acid metabolism (with an overlap to sulfur and nitrogen assimilation and photorespiration), and they covered 3 orders of magnitude in relative concentration (based on nSAF values; Table III). Amino acid biosynthetic pathways can be separated based on the precursor, but connections between these pathways exist. We have grouped the identified proteins in different pathways in Table III, and a number of proteins and pathways are integrated in the figures for nitrogen and sulfur assimilation (e.g. for Arg and Cys biosynthesis).
Asp aminotransferase was the most abundant protein (based on nSAF; 1,687 MS/MS spectra; BS-M = 0.54), followed by shikimate kinase SKL1 (488 MS/MS spectra; BS-M = 0.7) involved in biosynthesis of aromatic amino acids. Among the proteins with at least 40 adjSPC, we observed BS-M ratios ranging from 0.06 (greater than 10-fold enriched in M chloroplasts) to 5.96 (6-fold enriched in BS chloroplasts), with the majority of proteins being enriched in M chloroplasts (Table III). In the remainder of this section, we will briefly describe our findings, with an emphasis on the link to C 4 metabolism.
For the Glu family (Glu, Arg, and Pro), we identified seven out of the eight enzymes needed for Arg biosynthesis. Arg is synthesized from Glu and CP and is thus closely linked to the GOGAT/GS cycle. All seven identified Arg biosynthetic enzymes were enriched in the M chloroplasts (on average 2.5-fold; Table III; Fig.  10). There must be competition between Arg and nucleotide synthesis, as both require CP as a precursor. We suggest that Arg synthesis and accumulation serve to redistribute excess assimilated nitrogen to other parts of the plant. It is interesting that in the Arabidopsis glu1 mutant, excess photorespiratory ammonium was transported in the form of Arg (and other amino acids; Potel et al., 2009).
Asp is the precursor for homoserine, Thr, Ile, and Lys and also contributes to the synthesis of Met. We identified nine enzymes in these pathways, with the Lys synthesis branch being well covered (Table III). The pathway was M enriched, which is understandable given the high rates of Asp synthesis and high energy requirements (ATP and NADPH).
We identified most of the enzymes involved in synthesis of the branched amino acids, Val and Leu (Table III). The precursor of this pathway is pyruvate, and the last step in their synthesis requires Glu as an amine source. The later steps in Leu synthesis are clearly not M enriched but either equally distributed across both cell types or enriched in the BS cell; the reason is not clear.
As shown in Figure 11, we identified several of the enzymes involved in Ser and Cys synthesis (Table III), as explained in the section on sulfur assimilation. We did not identify proteins involved in Gly synthesis, which is reassuring, since Gly is primarily produced in mitochondria from Thr, Ser, or photorespiration.
The shikimate pathway for the synthesis of the aromatic residues (Phe, Tyr, and Trp) was well covered with 11 proteins, even if the number of SPC was low for several of the steps (see "Conclusion"; Table III). The entire shikimate pathway takes place in plastids and starts with the condensation of erythrose 4-phosphate (a product of the pentose phosphate pathway and of the Calvin cycle) and PEP, followed by incorporation of a second molecule of PEP in the penultimate step of the central pathway. This central part of the pathway produces chorismate in seven steps, followed by two separate pathways for the synthesis of Trp and of Tyr plus Phe. We identified three steps (1, 5, and 6) in the central pathway and multiple steps in the Trp- and Tyr/Phe-specific pathways. All but one of the enzymes were more highly expressed in the M chloroplasts (the BS-enriched exception was identified by only four SPC and therefore is not reliable). EPSP synthase (for 3-phosphoshikimate 1-carboxyvinyltransferase), in the central part of the shikimate pathway, is the target of the well-known and extremely "popular" herbicide glyphosate; EPSP synthase was 5-fold enriched in M chloroplasts. The preferential accumulation in M chloroplasts is in line with the high levels of PEP synthesis in the M chloroplasts for the C 4 carbon cycle. The availability and origin of erythrose 4-phosphate in the M chloroplasts are less clear, but it must be generated by the pentose phosphate pathway rather than the Calvin cycle. Indeed, this is consistent with the observed accumulation of OPPP enzymes in the M chloroplasts (G6PDH, lactonase, and 6PGDH; Fig. 5). The last step in Trp synthesis is carried out by subunits of the Trp synthase complex; these subunits were 3- to 4-fold higher in M than in BS chloroplasts (Table III).
The synthesis of His follows a linear pathway, originating with phosphoribosyl pyrophosphate (Fig.  8). We identified enzymes for six out of eight steps of the pathway, even if the number of SPC was low for all but the last step, histidinol dehydrogenase (59 matched MS/MS spectra). The BS-M ratio of histidinol dehydrogenase was 0.66, but the distribution of the complete His pathway across both cell types is not clear. Clearly, the His pathway in C 4 species requires more attention.

Many Proteins of Unknown Function Show Differential BS-M Accumulation
In addition to the proteins discussed above, we identified and quantified over 200 proteins (and groups) with either a miscellaneous function (e.g. TPR and PPR proteins, rhodanese, and DnaJ domain proteins) or without any obvious function (Supplemental Table  S3). Ninety-six proteins were only detected in the stromal fractions, and 25 were identified in both stroma and membrane fractions. These proteins spanned 3 orders of abundance (based on nSAF) and up to 1,556 matched spectra; a significant number of proteins showed strong preferential BS or M accumulation and are an excellent resource for further exploration of specialized BS or M chloroplast functions.
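Relative abundance throughout this section is expressed as nSAF. The exact normalization used in the study is defined in its methods and is not restated here, so the Python sketch below only illustrates the general idea under an assumption: it uses the commonly used form of a normalized spectral abundance factor, in which each protein's spectral count is divided by its length and the resulting values are scaled to sum to 1. The function name `nsaf`, the protein identifiers, and all numbers are hypothetical.

```python
def nsaf(spectral_counts, lengths):
    """Length-normalized spectral counts, scaled so that all values sum to 1.

    Assumption: generic NSAF-style definition, not necessarily the exact
    normalization used in the study.
    """
    # Spectral abundance factor: spectra per unit of protein length.
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    return {p: value / total for p, value in saf.items()}

# Hypothetical example: protein_A has 10x more spectra than protein_B,
# but is also 5x longer, so its relative abundance is only 2x higher.
counts = {"protein_A": 100, "protein_B": 10}
protein_lengths = {"protein_A": 500, "protein_B": 100}

for protein, value in nsaf(counts, protein_lengths).items():
    print(f"{protein}: {value:.3f}")
```

Dividing by length before normalizing keeps long proteins, which yield more peptides and therefore more spectra, from appearing more abundant than they are.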

CONCLUSION
Large-scale MS analysis with a high-resolution and high-sensitivity mass spectrometer has allowed pathways and activities to be analyzed and reconstructed across the differentiated BS and M chloroplasts in the leaf of maize. Quantitative information about relative protein accumulation levels and cell type-specific protein accumulation patterns provided new insights into the functions and structures of differentiated BS and M chloroplasts. These results provide a major step forward from previous analyses and also give many new entry points for studying specific aspects of chloroplast biogenesis, differentiation, and function. We did note that a number of proteins, partic-ularly in the functional categories of nucleotide metabolism and DNA interactions and transcription, were quantified with only a few spectra counts. This is likely due to a down-regulation of these functions when chloroplasts begin to reach the end point of biogenesis and differentiation. For instance, nonphotosynthetic chloroplasts (etioplasts) at the leaf base will be expected to contain much higher protein levels for these functions.
The availability of the first maize genome sequence has provided an excellent template for protein identification and quantification, even if imperfections surely must have resulted in missed proteins and inaccuracies of protein quantification. Through this study, we have added a wealth of information to these maize protein accession numbers, including assignment of protein names, functions, and subcellular localizations; this will all be made available online through the PPDB. From PPDB, this information can be freely distributed to the various maize databases (e.g. http://maizesequence.org/index.html and http://www.maizegdb.org/) and other resources. Moreover, all mass spectral data are made available through the public depository Proteomics Identifications (PRIDE; http://www.ebi.ac.uk/pride/).
A central conclusion from this study is that differentiated BS chloroplasts at the maize leaf tip appear to have strongly reduced plastid protein expression and protein import. Moreover, they are likely actively remodeling their proteome, as evidenced by increased levels of ClpB and HSP90. Furthermore, this study provides strong experimental support for a number of specialized BS and M metabolic functions, including starch biosynthesis (BS), Arg synthesis (M), MEP activity (M), nitrogen assimilation (M), initial steps in sulfur assimilation (BS), and more. Functional and genetic studies, as well as in vitro enzyme activity assays, will be needed to further determine their significance. In a forthcoming study, we will explore the protein accumulation of the various pathways and processes along the leaf developmental gradient and strengthen and expand on the observations presented here for the differentiated BS and M chloroplasts.

Maize Genotype, Plant Growth, and Purification of BS and M Chloroplast Fractions
WT-T43 maize (Zea mays) plants were grown for 12 to 14 d in a growth chamber (16 h of light/8 h of dark, 400 μmol photons m⁻² s⁻¹) until the fourth leaf was emerging. M and BS chloroplasts were purified from the top 4-cm section of the third leaf, harvested about 2 h after the onset of the light period, using several hundred leaf tips and following procedures described previously (Majeran et al., 2005). Purified M and BS chloroplasts were broken with a Dounce homogenizer, and thylakoid and envelope membranes were collected by 20 min of centrifugation at 80,000g. The supernatant, representing the enriched soluble stromal fraction, was collected. Cross-contamination of M and BS chloroplast fractions was assessed from the presence of the M and BS markers (PPDK and Rubisco, respectively) as visualized on stained one-dimensional SDS-PAGE gels, as described (Majeran et al., 2005). Protein concentrations were determined with the Bradford assay (Bradford, 1976). Out of more than eight BS-M chloroplast preparations, the best three were selected.

Chloroplast Stroma and Membrane Proteome Analysis by NanoLC-LTQ-Orbitrap
A total of 150 μg of each BS and M stromal preparation was separated by SDS-Tricine-PAGE (12% acrylamide). Each gel lane was then cut into 12 slices, proteins were digested with trypsin, and the extracted peptides were analyzed by nanoLC-LTQ-Orbitrap MS using data-dependent acquisition and dynamic exclusion, as described (Majeran et al., 2008). Each sample was analyzed twice with different amounts injected to ensure maximum protein coverage. The complete analysis was carried out in three independent biological replicates. In total, 144 MS runs were carried out, with extensive blanks between each sample analysis to avoid carryover of peptides that could bias quantification. Purification and MS analysis of the BS and M membranes have been described (Majeran et al., 2008).

Processing of the MS Data, Database Searches, and Upload into PPDB
Peak lists (.mgf format) were generated using DTA supercharge (version 1.19) software (http://msquant.sourceforge.net/) and searched with Mascot version 2.2 (Matrix Science) against maize genome release 4a.53 (with 53,764 models) from http://www.maizesequence.org/ supplemented with the plastid-encoded proteins (111 protein models) and mitochondria-encoded proteins (165 protein models). For off-line calibration, first a preliminary search was conducted with the precursor tolerance window set at ±30 ppm. Peptides with ion scores above 40 were chosen as benchmarks to determine the offset for each LC-MS/MS run. This offset was then applied to adjust precursor masses in the peak lists of the respective .mgf file for recalibration using a Perl script (B. Zybailov, unpublished data). The recalibrated peak lists were searched against the maize genome data set with organellar genes (with 54,380 entries) and in parallel against ZmGI version 16.0 (with 56,364 entries), including sequences for known contaminants (e.g. keratin, trypsin) concatenated with a decoy database where all the sequences were randomized. Each of the peak lists was searched using Mascot version 2.2 (maximum P of 0.01) for full tryptic peptides using a precursor ion tolerance window set at ±6 ppm, variable Met oxidation, fixed Cys carbamidomethylation, and a minimal ion score threshold of 30 for maize genome and 44 for ZmGI; this yielded a peptide false discovery rate below 1%, with the peptide false positive rate calculated as 2 × (decoy_hits)/total_hits. The false protein identification rate of proteins identified with two or more peptides was zero. To reduce the false protein identification rate of proteins identified by one peptide, the Mascot search results were further filtered as follows: the ion score threshold was increased to 40 for maize genome and 50 for ZmGI, and mass accuracy on the precursor ion was required to be within ±3 ppm. Precursor ion masses below 700 D were discarded.
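The off-line calibration and the decoy-based error estimate described above can be illustrated with a minimal Python sketch. This is not the unpublished Perl script used in the study. The function names are hypothetical, and taking the median ppm error of the benchmark peptides as the per-run offset is an assumption, since the text does not state how the offset was derived from the benchmarks.

```python
from statistics import median


def ppm_error(observed_mass, theoretical_mass):
    """Signed precursor mass error in parts per million."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def run_offset_ppm(matches, min_ion_score=40.0):
    """Estimate the systematic mass offset of one LC-MS/MS run.

    `matches` holds (observed_mass, theoretical_mass, ion_score) tuples from
    the preliminary search. Peptides scoring above `min_ion_score` serve as
    benchmarks. ASSUMPTION: the offset is the median benchmark error.
    """
    errors = [ppm_error(obs, theo) for obs, theo, score in matches
              if score > min_ion_score]
    if not errors:
        raise ValueError("no benchmark peptides above the ion score cutoff")
    return median(errors)


def recalibrate(observed_mass, offset_ppm):
    """Remove a run-wide ppm offset from an observed precursor mass."""
    return observed_mass / (1.0 + offset_ppm * 1e-6)


def peptide_false_positive_rate(decoy_hits, total_hits):
    """Rate as given in the text: 2 x decoy_hits / total_hits."""
    if total_hits == 0:
        return 0.0
    return 2.0 * decoy_hits / total_hits
```

For example, a run whose benchmark peptides all read 10 ppm too high would have every precursor mass scaled down by that offset before the final search with the narrower tolerance window.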
All filtered results were uploaded into PPDB (http://ppdb.tc.cornell.edu/; Sun et al., 2009). All mass spectral data (the .mgf files reformatted as PRIDE XML files) are available via the PRIDE database at http://www.ebi.ac.uk/pride/.

Selection of the Best Gene Models and Post-Mascot Filter to Assign Shared and Unique Peptides, and Creation of Protein Groups with a High Percentage of Shared Matched Spectra
Many genes have more than one gene model, thus leading in many cases to different predicted proteins. In such cases, we selected the protein form (gene model) that had the highest number of matched spectra; if two gene models had the same number of matched spectra, the model with the lowest digit was selected. For quantification by spectral counting, each protein accession was scored for total SPC, unique SPC (uniquely matching to an accession), and adjusted SPC. The latter assigns shared peptides to accessions in proportion to their relative abundance using unique SPC for each accession as a basis. Proteins that shared more than approximately 80% of their matched adjusted peptides with other proteins across the complete data set were grouped into clusters by generating a similarity matrix through calculation of the Dice coefficient between each pair of identified proteins. The Dice coefficient is defined as s = 2|X ∩ Y|/(|X| + |Y|), where |X| and |Y| represent the number of SPC for each protein and |X ∩ Y| the number of SPC shared between them. The similarity cutoff was 0.80. The MCL software (Enright et al., 2002) was used to cluster the proteins into groups, with the inflation value set at 5. In some cases, a group contained one protein that had a high percentage of unique SPC (e.g. more than 80%), indicating that this was the most abundant member of the identified protein family. All groups were manually verified and ungrouped if needed. Additional proteins were grouped manually as needed, in particular if they had a low number of adjusted SPC (e.g. less than 20). The linkage of homologous proteins that were identified by the same set of MS/MS spectra was recorded (their identification was marked as "ambiguous"), and the matched MS/MS spectra were marked as "shared" spectra (or "not unique"). For proteins that were identified with unique spectra, as well as shared spectra, a linkage with the related identified protein was recorded (marked as "related" protein).
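The pairwise similarity step can be sketched in Python as follows. This is only an illustration under stated assumptions: each protein is represented by a set of hypothetical spectrum identifiers, the function names are invented for this example, and the sketch stops at listing pairs above the cutoff. The subsequent MCL clustering and the manual verification described above are not reproduced.

```python
from itertools import combinations


def dice(x, y):
    """Dice coefficient s = 2|X n Y| / (|X| + |Y|) for two sets of spectra."""
    if not x and not y:
        return 0.0
    return 2.0 * len(x & y) / (len(x) + len(y))


def similar_pairs(spectra_by_protein, cutoff=0.80):
    """Return {(protein_a, protein_b): s} for every pair with s >= cutoff.

    `spectra_by_protein` maps an accession to the set of spectrum ids
    matched to it (ASSUMPTION: ids are comparable across accessions).
    """
    pairs = {}
    for a, b in combinations(sorted(spectra_by_protein), 2):
        s = dice(spectra_by_protein[a], spectra_by_protein[b])
        if s >= cutoff:
            pairs[(a, b)] = s
    return pairs
```

Two accessions with 10 matched spectra each, 9 of them shared, give s = 2 × 9/(10 + 10) = 0.9 and would therefore be candidates for the same group.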

Calculation of Relative Abundance and Protein BS-M Ratios Using nSAF Values
Relative abundance for each identified protein accession (or cluster) was calculated by the nSAF (Zybailov et al., 2006) within each technical replicate. SAF was calculated based on the number of adjusted SPC for a protein, normalized by the number of predicted tryptic peptides (for that protein) within a mass range of 700 to 3,500 D, since shorter peptides were excluded from the Mascot search results, while the longer peptides were beyond the mass-to-charge ratio window of MS acquisition. The SAF for each protein was then normalized for the sum of all SAF in the technical replicate, resulting in nSAF. BS-M protein accumulation ratios were calculated based on average nSAF values. Average relative abundance for each protein was calculated for BS membranes, M membranes, BS soluble, M soluble, BS total, M total, and total BS+M. Within the chloroplast, the protein mass in the membrane-enriched fractions as compared with the soluble fractions is similar; therefore, to calculate total BS chloroplast and total M relative abundance, average nSAF values for membrane and soluble fractions were summed (Supplemental Tables S2 and S3).
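The nSAF normalization and the BS-M ratio can be illustrated with a short Python sketch. The function names and the dictionary layout are hypothetical. Treating a protein absent from a replicate as nSAF = 0 and returning infinity when the M average is zero are assumptions made for this example, not choices documented in the text.

```python
def nsaf(adjusted_spc, n_tryptic_peptides):
    """Normalized spectral abundance factors for one technical replicate.

    SAF = adjusted SPC / number of predicted tryptic peptides (700-3,500 D);
    nSAF = SAF / sum of all SAF in the replicate.
    """
    saf = {p: adjusted_spc[p] / n_tryptic_peptides[p] for p in adjusted_spc}
    total = sum(saf.values())
    return {p: value / total for p, value in saf.items()}


def average_nsaf(replicates, protein):
    """Mean nSAF over replicates; absent proteins count as 0 (ASSUMPTION)."""
    return sum(rep.get(protein, 0.0) for rep in replicates) / len(replicates)


def bs_m_ratio(bs_replicates, m_replicates, protein):
    """BS-M accumulation ratio from average nSAF values."""
    bs = average_nsaf(bs_replicates, protein)
    m = average_nsaf(m_replicates, protein)
    # ASSUMPTION: a protein seen only in BS is reported as infinity.
    return bs / m if m > 0 else float("inf")
```

As a usage example, a replicate with adjusted SPC of 10 and 30 for two proteins that each have five predicted tryptic peptides yields nSAF values of 0.25 and 0.75.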

The PPDB and Functional Assignment of Identified Proteins
MS-based information of all identified proteins was extracted from the Mascot search pages and filtered for significance (e.g. minimum ion scores, etc.), ambiguities, and shared spectra as described. This information includes Mowse scores, number of matching peptides, number of matched MS/MS spectra (counts), number of unique and adjusted counts, highest peptide score, highest peptide error (ppm), lowest absolute error (ppm), sequence coverage, and tryptic peptide sequences. This information is available in the PPDB using the search function "proteome experiments" and selecting the desired output parameters; this search can be restricted to specific experiments. Alternatively, information for specific accessions (either individually or a group) can be extracted using the search function "accessions"; if desired, this search can be limited to specific experiments. Finally, information for a particular accession can also be found on each "protein report page." Assignment of protein names of ZmGI and maize genome protein accessions was based on a combination of best BLAST hits in the predicted rice (Oryza sativa) proteome (OsGI version 5; from http://rice.plantbiology.msu.edu/), the predicted Arabidopsis (Arabidopsis thaliana) proteome (ATH version 8 from The Arabidopsis Information Resource [http://www.arabidopsis.org/]), and our manual annotations for ZmGI version 16 (from http://compbio.dfci.harvard.edu/) homologues (Majeran et al., 2005, 2008; Majeran and van Wijk, 2009). Pair-wise BLAST search results between ATH version 8, OsGI version 5, and ZmGI version 16 are available via PPDB. Each identified protein was assigned to a molecular function using the hierarchical, nonredundant classification system developed for MapMan (Thimm et al., 2004; http://gabi.rzpd.de/projects/MapMan/), adjusted after manual verification and information from the literature, and incorporated into PPDB.
Predicted PFAM domains for all predicted maize genome proteins are also available in PPDB.

Supplemental Data
The following materials are available in the online version of this article.