Shared characteristics underpinning C4 leaf maturation derived from analysis of multiple C3 and C4 species of Flaveria

We identify transcription factors that show conserved patterns of expression in multiple C4 species, both within the Flaveria genus and also in more distantly related C4 plants.


Introduction
Photosynthesis powers life on earth by harvesting energy from sunlight to fix carbon dioxide. In seed plants, three types of photosynthesis known as C 3 , C 4 , and Crassulacean acid metabolism (CAM) have been defined. C 3 photosynthesis is considered the ancestral form and is found in the majority of species. All three photosynthetic pathways use Ribulose-1,5-Bisphosphate Carboxylase Oxygenase (RuBisCO) to catalyse CO 2 fixation in the Calvin-Benson-Bassham (CBB) cycle and, in so doing, two molecules of the three-carbon molecule 3-phosphoglycerate are formed. However, RuBisCO is not completely substrate specific, catalysing an oxygenation reaction that produces 2-phosphoglycolate as well as 3-phosphoglycerate (Bowes et al., 1971). As phosphoglycolate is toxic and removes carbon from the CBB cycle, it is rapidly metabolized by the photorespiratory pathway to ensure it does not accumulate, but also to retrieve carbon. Photorespiration uses energy inputs and is not completely effective in terms of carbon retrieval so some CO 2 is lost (Bauwe et al., 2010). The rate of oxygenation at the active site of RuBisCO increases with temperature, and so in lower latitudes multiple plant lineages have evolved either CAM or C 4 photosynthesis (Sage et al., 2011), which in both cases initially fix CO 2 by an alternative carboxylase, and then subsequently release high concentrations of CO 2 around RuBisCO to limit oxygenation.
Almost all C 4 plants use a two-celled system in which phosphoenolpyruvate carboxylase (PEPC) initially fixes carbon in mesophyll (M) cells, and then, after release of CO 2 in the bundle sheath (BS) cells, RuBisCO re-fixes CO 2 in the CBB cycle. The production of CO 2 within the BS leads to high concentrations of CO 2 around RuBisCO and minimizes oxygenation. PEPC generates oxaloacetate, which is rapidly metabolized to malate and/or aspartate prior to their diffusion to the BS. Three C 4 acid decarboxylase enzymes known as NADP-malic enzyme (NADP-ME), NAD-malic enzyme (NAD-ME), and phosphoenolpyruvate carboxykinase (PEPCK) have been shown to release CO 2 around RuBisCO and, for many years, based on their relative abundance, these decarboxylase enzymes were used to define three C 4 biochemical subtypes. More recently it has been shown that this division into three types is less rigid than previously thought (Furbank, 2011;Bellasio and Griffiths, 2014;Wang et al., 2014). The use of distinct decarboxylases in separate C 4 lineages indicates that C 4 photosynthesis is underpinned by convergent evolution. In addition to the 60 groups of plants known to have evolved the C 4 pathway (Sage et al., 2011), there are some genera containing both C 3 and C 4 species. Probably the most studied of these clades with closely related C 3 and C 4 species is the genus Flaveria found in the Asteraceae (Drincovich et al., 1998;Gowik et al., 2011;Schulze et al., 2013).
Most species of Flaveria are native to Central and North America and grow as annual or perennial herbs or shrubs with decussate leaves (Powell, 1978). C 4 species of Flaveria show high activities of the NADP-ME C 4 acid decarboxylase in chloroplasts of the BS, and leaf anatomy conforms to the Atriplicoid type (McKown and Dengler, 2007). Phylogenetic reconstruction of this group has been conducted using morphological as well as life history and gene sequence data for 21 of the 23 known species (McKown et al., 2005;Lyu et al., 2015). The consensus from this work is that the ancestral condition in Flaveria is C 3 photosynthesis (McKown et al., 2005;Lyu et al., 2015) and that the occurrence of multiple C 3 and C 4 species within the same genus provides an interesting system to study processes associated with the C 4 phenotype (see Supplementary Fig. S1 at JXB online).
In this study, we used two pairs of C 3 and C 4 Flaveria species, and set out to link the gradual maturation of C 3 and C 4 characteristics in leaves to underlying alterations in transcript abundance. By linking the development of the C 4 phenotype to changes in gene expression in two C 4 species, and comparing these findings with equivalent data from two C 3 species in the same genus, we aimed to identify common traits associated with C 4 photosynthesis, and to remove species-specific characteristics from our data sets. Using the maturing leaf as a dynamic system, we show that in both C 4 species studied, the induction of Kranz anatomy occurs along a base to tip developmental gradient in leaves of >2 cm length. We sampled this maturation gradient and undertook RNA sequencing to correlate the underlying patterns of gene expression with anatomical development.

Plant growth
Flaveria bidentis (L.) Kuntze, Flaveria pringlei Gandoger, Flaveria robusta Rose, and Flaveria trinervia (Spreng.) C. Mohr were grown in a glasshouse on the roof of the Department of Plant Sciences, Cambridge. Temperature was maintained above 20 °C and supplemental lighting was provided to ensure at least 250-350 µmol m -2 s -1 photon flux density for 16 h d -1 . Seeds were sown directly onto soil (Levington's M3 potting compost; Scotts Miracle-Gro Company, Godalming, UK) in covered pots as seeds need high humidity for germination. Covers were removed when seedlings were 1 cm high.

Analysis of leaf anatomy
Samples of 4 mm 2 to 1 cm 2 were fixed in 4% (w/v) formaldehyde at 4 °C overnight and then placed on ice and dehydrated prior to being placed in 100% (v/v) ethanol, followed by 1:1 ethanol/Technovit mix and then 100% Technovit 7100 (Heraeus Kulzer, Germany). Samples were subsequently left in Technovit solution plus hardener I (1 g of hardener per 100 ml) for at least 1 h. Disposable plastic resin embedding moulds (Agar Scientific, UK) were filled with a mixture of Technovit plus hardener I and II (15 ml of Technovit plus hardener I were mixed with 1 ml of hardener II). Samples were arranged within the embedding moulds which were then covered with unstretched Parafilm ® M to seal them from air, and left to harden overnight. Samples were removed, heated to 80 °C, and trimmed for sectioning. Sections of 2 µm thickness were produced using a Thermo Scientific Microm HM340E microtome. Ribbons were mounted onto SuperFrost ® white microscope slides (VWR, Leuven, The Netherlands), left to dry, and then stained with 0.1% (w/v) toluidine blue. All sections were analysed with a BX41 light microscope (Olympus, Center Valley, PA, USA), usually using the bright-field setting.
To clear leaves, they were placed into 70% (v/v) ethanol and heated to 80 °C. The next day samples were placed in 5% (w/v) NaOH for ~15 min to clear leaves further, and then mounted in water and analysed by light microscopy.
To quantify leaf anatomical characteristics, Photoshop CS5 was used. The program was calibrated with scale bars and the lasso tool was used to measure cell area for both BS and M cells. Measurements of M cells were taken between BS cells, with 30 BS and 30 M cells being measured per leaf section. This was done for all six leaf stages and for three biological replicates for each species. Images derived from leaf sections were assembled using Photoshop. The background of sections was averaged and in some cases the contrast was increased to improve visibility of cell borders. Vein density was determined using images of cleared leaves. In mature sections of the leaf there was no ambiguity in collecting these data, whereas in the most basal sections of some leaves, the estimates may be underestimates of the extent of venation as immature veins are not always visible. Three independent leaves were measured per species. The length of the veins was measured using Q-Capture Pro7, and vein density was expressed in mm mm -2 .
Samples for TEM were processed by the Multi-Imaging Centre in the Department of Physiology, Development and Neuroscience (Cambridge). Sections of ~1 mm thickness of all six stages along a developing leaf were used for embedding. Tissues were fixed in 4% (w/v) glutaraldehyde in 0.1 M HEPES buffer with a pH of 7.4 for 12 h at 4 °C. Subsequently they were rinsed in 0.1 M HEPES buffer five times, and then treated with 1% osmium ferricyanide at room temperature for 2 h and rinsed in deionized water five times before being treated with 2% (w/v) uranyl acetate in 0.05 M maleate buffer with a pH of 5.5 for 2 h at room temperature. They were rinsed again in deionized water and dehydrated in an ascending series of ethanol solutions from 70% to 100% (v/v). This was followed by treatment with two changes of dry acetonitrile and infiltration with Quetol 651 epoxy resin. Sections 50-70 nm thick were cut with a Leica Ultracut UCT and stained with lead citrate and uranyl acetate. Images were taken with a Tecnai G2 transmission electron microscope (FEI, Hillsboro, USA) and operated at 120 kV using an AMT XR60B digital camera (Advanced Microscopy Techniques, Woburn, USA) running Deben software (Deben UK Limited, Bury St. Edmunds, UK). Images were taken at ×1700, ×3500, and ×5000 magnification.

RNA extraction and sequencing
Deep sequencing was carried out to analyse transcriptomes associated with the leaf developmental gradient defined by the anatomical analysis. Opposite leaves from the third leaf pair after the cotyledons were harvested in the morning between 10:00 h and 11:00 h (4 h after the start of the photoperiod in the glasshouse). The leaves were harvested when they had reached 1.8-2 cm length (from base to tip). Leaves were measured on an ice-cold glass plate and cut into six portions of equal length (3-3.3 mm). Each sample was then immediately placed into liquid nitrogen. Sections of 12 leaves were collected before extraction of RNA using the Qiagen Plant RNeasy Mini Kit. The optional DNA digestion step with RNase-free DNase was performed. Three biological replicates were analysed for each of the four Flaveria species. Total leaf RNA was sent to the Genomics & Transcriptomics Laboratory (GTL) in Düsseldorf (Germany) for paired-end sequencing. RNA samples were prepared following the TrueSeq RNA sample preparation v2 guide (Revision F) and sequenced using an Illumina/Solexa HiSeq 2000 machine. Prior to assembly, reads were trimmed using Trimmomatic v0.32 (Bolger et al., 2014), corrected using BayesHammer v2.5.1 (Nikolenko et al., 2013), and then digitally normalized using the khmer package (Crusoe et al., 2015). SOAPdenovo-trans (Xie et al., 2014), IDBA_ tran v1.0.13 (Peng et al., 2013), and SGA v0.10.12 (Simpson and Durbin, 2012) were used for de novo assembly. For each species, the individual assemblies were combined and clustered using CD-HIT-EST (Simpson and Durbin, 2012;Li and Godzik, 2006). Annotation of contigs used Conditional Reciprocal Best Blast (github.com/ cboursnell/crb-blast) to the Arabidopsis thaliana peptide sequences as previously described (Aubry et al., 2014). For principal components analysis, Z-scores were calculated from transcripts per million (TPM) reads from for each gene across the four species. Expression patterns of genes of interest were analysed using a custom-made R script to extract the data of interest and to visualize expression patterns. Gene Ontology (GO) term analysis was performed according to Burgess et al. (2016).

Leaf maturation in C 3 and C 4 Flaveria species
We first confirmed that fully expanded mature leaves of C 3 F. pringlei and F. robusta as well as C 4 F. bidentis and F. trinervia showed characteristic C 3 and C 4 anatomy under our growth conditions. In all cases, the third leaf pair was chosen as the first and second leaf pairs have previously been described as juvenile (McKown and Dengler, 2009). Mature third leaves of C 3 and C 4 Flaveria ranged from 6 cm to 10 cm in length, but the C 3 species had narrower leaves with an entire margin whereas the C 4 leaves were wider and the margin was slightly dentate (Fig. 1A). Both C 4 species had closer veins than the C 3 species ( Fig. 1B; Supplementary Fig. S2). In addition, analysis of transverse leaf sections indicated that C 3 leaves were thicker because of larger cells and more cell layers (Fig.  1C). Clear differences in BS and M cell arrangement were visible between the C 3 and C 4 pairs, with the BS in both C 4 species being more uniform in shape than in the C 3 species (Fig.  1C). In addition, compared with the C 3 species, M cells in the C 4 leaves were smaller ( Supplementary Fig. S2) and appeared to show increased contact with the BS. In agreement with previous analysis (McKown and Dengler, 2007), leaves of the two C 4 and two C 3 species typically contained five and eight ground cell layers between the adaxial and abaxial epidermis, respectively.
In all four Flaveria species, leaves of 1.8-2 cm length showed a basipetal maturation gradient with differentiating tissues at the base and fully differentiated tissues at the tip ( Fig. 2A-C; Supplementary Fig. S3). This base-to-tip maturation programme is typical of dicotyledons, but it is notable that in Flaveria maturation occurred in larger leaves than in Gynandropsis gynandra, where the maturation gradient was detected in leaflets of 0.4 cm length (Aubry et al., 2014). This prolonged development of leaf maturation in Flaveria thus provides an excellent system to analyse the induction of C 4 photosynthesis as the larger leaf allows these leaves to be divided into more stages for functional analysis.
To better understand differences in leaf maturation in C 3 and C 4 Flaveria, leaves from each species were divided into six portions, and RNA was extracted, quantified, integrity determined ( Supplementary Fig. S4) and then subjected to deep sequencing. On average, 20 million reads were recovered from each replicate (Supplementary Table S1). After de novo assembly of these reads, the average number of annotated transcripts per species was 12 475 (Supplementary Table S1).
Read mapping was used to quantify transcript abundance in TPM (Supplementary Data S1). By grouping the two base, mid, and tip stages, it was possible to analyse transcript behaviour quickly across the maturation gradient. This revealed that on average 40% of annotated transcripts showed descending behaviours, meaning that they were expressed more strongly at the base of the leaf than at the tip (Supplementary Table S1).
Using the 8000 annotations found in all four species, correlation analysis showed that patterns of gene expression first clustered by species and then by photosynthetic type, meaning that the correlation between neighbouring developmental stages was highest within species and that the correlation was higher between species of the same photosynthetic type than between species of different photosynthetic types ( Supplementary Fig. S5). A gradient from base to tip was clear in all four species, but was more pronounced in the two C 4 species ( Supplementary Fig. S5).
The six portions from the base to the tip of the leaf showed a clear induction of genes known to encode components of the C 4 cycle (Fig. 3). An important step in the evolution of C 4 photosynthesis is thought to be a co-ordinate alteration in expression of genes encoding the photorespiratory cycle and nitrogen metabolism (Mallman et al., 2014). Consistent with a reduction in photorespiration in the C 4 species, genes of photorespiration increased along the developmental gradient in C 3 species but to a lesser extent in the C 4 Flaveria species. The main exception was DIT1, a transporter gene which is involved in the C 4 cycle and is more abundant in both C 4 species (Supplementary Data S1). Around 200 genes are consistently up-regulated in the upper tip tissue using C 4 compared with C 3 photosynthesis ( Fig. 4A; Supplementary Data S2) and, in young tissue that is still undergoing differentiation, the number of genes up-regulated in the C 4 species is around double this number. These estimates are significantly lower than those reported from analysis of whole leaves at various developmental stages from C 3 Tarenaya hassleriana and C 4 G. gynandra in the Cleomaceae (Külahoglu et al., 2014), where between 2500 and 3500 genes were estimated to be up-regulated in the C 4 species. The reduction in differential gene expression in the C 4 compared with C 3 Flaveria species in this study compared with the estimates from the Cleomaceae may be due to the use of two C 3 and two C 4 species, removing species-specific differences, but also to the reduced phylogenetic distance between the Flaveria species (Sage et al., 2011). Principal components analysis indicated that the first two dimensions contributed to 37% and 19% of the variation, respectively, and that whereas the first dimension was associated with the leaf maturation signal from all species, the second dimension was associated with the photosynthetic pathway being used (Fig. 4B). This finding implies that C 4 metabolism has a large impact on the differences in gene expression associated with C 3 and C 4 leaves within one genus. When MapMan categories associated with the top decile of differentially expressed genes were assessed it was notable that photosynthesis genes were over-represented for a prolonged period along the gradient in the C 4 species (Fig. 4C). This was also the case for genes related to DNA and transport. In contrast, the MapMan category associated with RNA was overrepresented for longer in the C 3 species (Fig. 4C). Equivalent analysis for GO terms indicated that cell wall loosening was over-represented for longer in both C 4 species compared with the C 3 species (Supplementary Fig. S6). Overall, these data indicate that the maturation of Flaveria leaves captures changes in transcript abundance associated with induction of C 3 or C 4 photosynthesis, and provide insight into the broader alterations in gene expression. We next aimed to relate the underlying anatomical changes associated with leaf maturation in these C 3 and C 4 species of Flaveria to changes in gene expression determined from the deep sequencing.

Kranz anatomy during leaf maturation
Using cleared leaves, veins were traced to visualize their development. In the most basal part of the leaf, major veins were present but, in the two C 4 species, the highest order veins were still being laid down. Some of these developing higher order veins could be detected by the specific patterns of periclinal cell divisions associated with their production (Supplementary Fig. S7). Vein density was determined in each of the six sections along the leaf developmental gradient, and this showed markedly different developmental patterns in the C 3 and C 4 pairs of Flaveria (Fig. 5A). Vein density was lower at the base in C 4 compared with C 3 leaves of Flaveria, but, during the transition from the base to the middle of the leaf, density increased dramatically in the C 4 species (Fig. 5B). In both C 3 and C 4 pairs, vein density then decreased from the middle to the tip of the leaf. The reduction in vein density as leaves matured is probably associated with expansion of M and BS cells. In the two most basal sections of the leaf, cell division was still ongoing and procambial strands were initiated first with a periclinal and then an anticlinal division, to derive BS cells ( Supplementary Fig. S7). These basal portions of the leaf therefore appear most interesting to interrogate for candidate genes associated with these processes. The de novo Fig. 3. Transcripts encoding the C 4 cycle increase dramatically during leaf maturation of the C 4 species. The schematic of major components of the C 4 pathway located in mesophyll and bundle sheath cells of C 4 leaves is surrounded by plots of transcript abundance for key genes of the NADP-ME subtype in transcripts per million (TPM) for the four species. RBCS1A transcripts are, as expected, higher in the C 3 species. All C 4 transcripts other than CARBONIC ANHYDRASE1 (CA1) are barely detectable in the C 3 species. assembled transcriptomes from the four Flaveria species were therefore analysed for homologues of genes known to impact on vein formation in others species.
Six genes that effect vein formation in A. thaliana showed different behaviours in the C 4 Flaveria species compared with the C 3 species (Fig. 6). In all cases, absolute transcript abundance tended to be higher at the base of the C 4 leaves compared with the base of the C 3 species. However, the most consistent differences in expression were found for A. thaliana HOMEOBOX-GENE-8 (ATHB8). SHR and SCR have been implicated in the development of C 4 Kranz anatomy (Slewinski et al., 2012;Slewinski, 2013). SHR was detected in three of the species with a descending pattern, but with no clear (B) Principal components analysis of the differentially expressed genes in all four species shows that the first dimension is associated with leaf maturation, but the second dimension is associated with the photosynthetic pathway. C 3 species are depicted in red, C 4 species in blue, and each stage is numbered from 1 (base) to 6 (tip). (C) Summary of MapMan categories associated with the top decile of differentially expressed genes in each species. Fp, Flaveria pringlei; Fr, Flaveria robusta; Fb, Flaveria bidentis; Ft, Flaveria trinervia. In all species, vein density was highest in the middle and decreased towards the tip of leaves. A rapid increase in vein density occurs between the upper base and lower mid in both C 4 species. Error bars represent 1 SE. Scale bars represent 100 μm for the vein traces and 0.5 cm for the leaf outline. difference between C 3 and C 4 . SCR also showed a descending pattern, but again with no clear difference between photosynthetic types. Of Scarecrow-like genes, only SCARECROW-LIKE 14 was clearly higher in C 4 and showed a parabolic expression pattern, being most highly expressed at the base and the tip (Supplementary Data S3). SCARECROW-LIKE 14 was recently identified from cross-species selection scans in the grasses (Huang et al., 2016). Transcripts encoding the auxin response factors ARF3, ARF8, and also IAA7 showed differences in the timing of expression in C 4 compared with C 3 species (Supplementary Data S3).
The same six regions from leaf base to tip used to define vein density were next used to quantify maturation of BS and M cells (Fig. 7). Leaf thickness was higher in the C 3 compared with the C 4 species, and cell expansion continued for longer in the C 3 leaves (Fig. 7A). The BS in the C 3 species was less regular than in the C 4 species, and individual cell size within the BS was more variable (Fig. 7A). However, it was notable that the cross-sectional area of BS cells in the C 3 species was larger than that of C 4 species, particularly towards the tip.
It has been proposed that a large BS cell size is a key early event associated with the evolution of C 4 photosynthesis (Christin et al., 2013;Griffiths et al., 2013;Williams et al., 2013), and so it appears that within Flaveria this is a pre-existing trait. It was noticeable that the cross-sectional area of M cells from the two C 3 species was around five times greater than that of M cells of the C 4 species (Fig. 7B). This strongly implies that compared with C 3 species, the reduced leaf depth of C 4 species is associated with an inhibition of M cell expansion as well as a reduced number of cell layers. A number of genes that have been annotated as having roles in cell proliferation or expansion showed behaviours that may explain the reduced cell expansion in leaves of both C 4 species. Most notable was DWF4, a 22α-hydroxylase that catalyses the rate-limiting step of brassinosteroid synthesis and so controls cell expansion. In both C 3 species, DWF4 transcript abundance increased along the leaf gradient but in the C 4 species its transcripts were barely detectable (Supplementary Data S4). These data imply that DWF4 is a strong candidate for the reduced expansion associated with maturation of the C 4 leaf. In sorghum, Fig. 6. Patterns of transcript abundance in the four Flaveria species for genes known to be involved in vein formation. Transcript abundance is shown in transcripts per million (TPM), and the parts of the leaf are annotated on the x-axis. The two C 4 species are shown with solid lines, and the two C 3 species with dashed lines. Fp, Flaveria pringlei; Fr, Flaveria robusta; Fb, Flaveria bidentis; Ft, Flaveria trinervia. mutation of CYP450D2, a Cyt P450 involved in brassinosteroid synthesis, leads to perturbed vein spacing (Rizal et al., 2015). Although the orthologue to CYP450D2 is not present in A. thaliana, transcripts encoding CYP90D1 and CYP90C1 appear to be slightly higher in the base of the C 4 compared with the C 3 Flaveria species (Supplementary Data S4).
A prolonged rate of cell division in the C 4 Flaveria species is also supported by transcript profiles. A number of genes implicated in cell division are found at higher levels in basal sections of C 4 than C 3 leaves including PLE (Müller et al., 2004), MOR1 (Whittington et al., 2001), ATK1 (Marcus et al., 2003), and ATK3 (Mitsui et al., 1994) which are involved in microtubule organization, and SCC3 (Chelysheva et al., 2005) and SYN1 (Cai et al., 2003) which are implicated in sister chromatid pairing (Supplementary Data S4). Transcripts derived from genes associated with DNA replication and repair including SOG1 (Yoshiyama et al., 2009), TOP6B (Gilkerson and Callis, 2014), MutS homologues MSH (MSH3, MSH4, and MSH7) (Culligan and Hays, 2000), and multiple components of the mini-chromosome maintenance (MCM) complex (MCM2, MCM3, MCM5, and MCM6) (Tuteja et al., 2011) were also more abundant in the C 4 species at the base of the leaf (Supplementary Data S4). Good candidate regulators of cell division include MYB3R4 which forms a complex with E2FB and RETINOBLASTOMA-RELATED PROTEIN (RBR1) to activate cell cycle progression (Haga et al., 2011), and DEL1 which maintains cell division by repressing entry into the endocycle (Vlieghe et al., 2005), both of which are higher in the base of C 4 leaves and maintain expression longer into the leaf gradient.
A reduction along the leaf gradient in the abundance of transcripts encoding CPP SYNTHASE 1 (CPS1), which catalyses the first step of gibberellin synthesis, and a spike in transcripts of GA2ox2 (Supplementary Data S4), which degrades gibberellin, between the upper base and lower mid in C 4 leaves is in keeping with findings that a decrease in gibberellin levels controls the transition between cell division and expansion in maize (Nelissen et al., 2012). A corresponding pattern cannot be seen in C 3 leaves, which may mean that the transition between cell division had already occurred at the base of C 3 lineages, or is mediated by another mechanism. We did not undertake analysis of sink-to-source transitions within the Flaveria leaves. However, as sugar signalling impacts on the transition from cell division to expansion, it is possible that the altered maturation of C 4 compared with C 3 leaves is related to sink-to-source transition as well as the C 4 paradigm per se .

Chloroplast maturation within the leaf gradient
Species that primarily use NADP-ME to decarboxylate malate in the C 4 BS commonly develop dimorphic chloroplasts with M cells containing stacked thylakoids (grana) and the BS cells containing fewer grana (Edwards and Walker, 1983;Edwards and Voznesenskaya, 2011). To provide insight into the dynamics of chloroplast development and maturation in C 3 and C 4 Flaveria species, TEM was used to investigate chloroplast structure in each of the six stages along the leaf gradient. Dimorphic chloroplasts were observed in the C 4 species but not in the C 3 species (Fig. 8A-D). Towards the base of the C 4 leaves, both BS and M chloroplasts contained 2-3 granal lamellae per stack (Supplementary Fig. S8). In more mature parts of the leaf, levels of stacking were reduced in BS cells but increased in M cells (Fig. 8A-D). In both C 3 species, a developmental gradient in granal stacking was also visible from base to tip, with more mature parts of the leaf showing increased levels of granal stacking in both cell types ( Supplementary Fig. S8). Thus, a key difference between the C 3 and C 4 leaves was the reduction in granal stacking in C 4 BS cells from the middle of the leaf.
The control of photosynthesis gene expression and chloroplast development has previously been linked to a pair of transcription factors known as GOLDEN-LIKE 1 and 2 (GLK1 and GLK2) (Supplementary Data S5). Unfortunately, de novo assembly did not generate a complete set of contigs for either GLK1 or GLK2 from all four species, so we were unable to determine if their abundance correlated with the observed changes in chloroplast development. However, other candidate genes that can be associated with chloroplast maturation were identified. For example, cytokinin is known to play a role in chloroplast maturation, and increased abundance of transcripts encoding the cytokinin receptors AHK2 and AHK3, along with the downstream effector ARR1 in C 4 species suggests that enhanced cytokinin signalling may play a role in these alterations to the C 4 chloroplast. Additionally, it was also notable that the behaviour of two nuclear-encoded chloroplast RNA polymerases (RpoTp and RpoTmp) (Swiatecka-Hagenbruch et al., 2008) showed clear differences between the C 3 and C 4 Flaveria species. Transcripts encoding RpoTp, which controls ycf1 that is involved in plastid protein import, were very low in both C 4 species. Further indications to changes in chloroplast protein import related to components of the TIC/ TOC complex (Supplementary Data S5). Transcripts encoding the outer envelope receptor proteins TOC34 and TOC159 were higher in the base and peaked later in the C 4 than the C 3 species. Additionally, TOC75, the major channel constituent for protein import into the chloroplast (Li and Chiu, 2010), was lower at the base of C 4 leaves. These data would be consistent with protein import playing a role in the establishment of dimorphic chloroplasts. Chloroplast maturation also correlated with increased abundance of nuclear-encoded chloroplast sigma factors, which direct the nuclear-encoded chloroplast RNA polymerase (NEP) to promoters.
Previous studies have demonstrated that C 4 plants alter photosystem activity to adjust ADP/ATP and NADPH ratios to ensure proper functioning of the carbon pump. Consistent with this was the C 4 -specific up-regulation of SIG3, which initiates transcription of psbN (Zghidi et al., 2007) that is involved in repair of PSII complexes (Torabi et al., 2014). It was also notable that transcripts encoding components of the NDH complex (NDF1, NDF6, and PIFI), as well as genes involved in cyclic electron transport (PGR5 and PGRL1A), were more abundant in both C 4 species, and these differences became more apparent from base to tip (Supplementary Data S5). Chloroplast dimorphism also coincided with an increase in transcript abundance of CRR1, which is proposed to be involved in NDH complex assembly.
Analysis of electron micrographs also revealed that chloroplast length in the BS increased from base to tip in the C 4 species more than in the C 3 species (Fig. 9). Two members of the REDUCED CHLOROPLAST COVERAGE (REC) gene family, REC2 and REC3, knockout of which causes a reduction in chloroplast size (Larkin et al., 2016), were up-regulated in the C 4 species. Further, FtsZ1-1, which is involved in chloroplast division, was significantly down-regulated in the C 4 compared with the C 3 species (Supplementary Data S5). As reduced division of chloroplasts leads to increased size (Pyke and Leech, 1994;Pyke, 1997), the reduced expression of chloroplast division genes in C 4 plants could lead to their increased size in BS cells.

Convergence in transcript abundance between independent C 4 lineages
By combining our data from the four Flaveria species with publicly available data sets, we next sought to investigate whether separate lineages of C 4 plants shared common changes in transcript abundance compared with their C 3 congeners. A recent study compared transcriptomes derived from leaf developmental gradients of T. hassleriana (C 3 ) and G. gynandra (C 4 ) from the Cleomaceae (Külahoglu et al., 2014). In contrast to our work, whole leaves were staged by age (Külahoglu et al., 2014) and changes in transcript abundance compared with non-photosynthetic tissues such as roots. This analysis identified a set of 33 genes that could be annotated with unique A. thaliana identifiers. This set of genes showed root expression in the C 3 species T. hassleriana but leaf expression in the C 4 species G. gynandra (Külahoglu et al., 2014), and so it was proposed that these were key genes that neofunctionalize to become involved in C 4 photosynthesis. Of these 33 genes, homologues to 15 were detected in all four Flaveria species, but only three showed up-regulation in the C 4 compared with C 3 species in our study and three showed higher expression in the two C 3 species (Supplementary Data S6).
Previously, transcript abundance in developing leaves of the C 4 dicot G. gynandra (Aubry et al., 2014) has been compared with that in the C 4 monocot maize Pick et al., 2011). Despite the very wide phylogenetic distance between these species, 18 transcription factors that showed the same patterns of behaviour in both C 4 G. gynandra and C 4 maize were identified (Aubry et al., 2014). Ten of these Fig. 9. Quantitation of chloroplast length and width from mesophyll and bundle sheath cells. Length and width were taken from thin sections used in TEM. For each species, five chloroplasts were measured per stage and cell type. Error bars represent 1 SE. The two C 4 species are shown with solid lines, and the two C 3 species with dashed lines.
18 candidates were also detected in all four Flaveria species. Five of those showed ascending behaviour and higher transcript abundance in the two C 4 species. These were RAD-LIKE6 (RL6, AT1G75250); AT2G05160, a CCCH-type zing finger family protein with an RNA-binding domain; AT2G21530, a SMAD/FHA domain-containing protein; RNA Polymerase Sigma Subunit C (SIGC, AT3G53920), and BEL1-Like Homeodomain1 (BLH1, AT5G67030) (Supplementary Data S7). The five genes have therefore been shown to exhibit the same behaviour along leaf developmental gradients in C 4 species in the monocots and dicots [both rosids (Cleomaceae) and asterids (Asteraceae)]. This strongly supports the notion that independent lineages of C 4 plants have recruited homologous trans-factors to induce the C 4 system.

Summary and conclusions
Most previous analysis derived from comparisons of RNAseq of C 3 and C 4 species have relied on either sampling one congeneric C 3 and C 4 species pair, or analysis of mature leaf tissues. The number of genes that are differentially expressed between any two species can be very high simply due to species-specific differences, which do not relate to their photosynthetic pathway. In this work, we identify genes that are consistently differentially expressed in multiple C 3 and C 4 species from the same genus along a leaf maturation gradient. In the Flaveria genus, the data set therefore indicates that ~200 genes are consistently up-regulated in mature tissue using C 4 compared with C 3 photosynthesis. In young tissue that is still undergoing differentiation, the number of genes up-regulated in the C 4 species is around double this number. Leaf maturation and the photosynthetic pathway were responsible for the greatest amount of variation in transcript abundance. In Flaveria, the evolution of C 4 photosynthesis is not associated with an expansion of BS cell size, but rather a reduction in the cross-sectional area of M cells in C 4 compared with C 3 species. Overall, the analysis therefore provides quantitation of changes in gene expression associated with C 3 and C 4 photosynthesis, candidate genes underlying the alterations in leaf characteristics associated with these pathways, and, through comparison with independent C 4 lineages outside of the Asteraceae, insight into the extent to which parallel evolution underlies the convergent and complex C 4 trait.

Supplementary data
Supplementary data are available at JXB online. Fig. S1. Phylogeny of the Flaveria genus. Fig. S2. Mature leaf vein density and cell size. Fig. S3. Leaf anatomy maturation gradient in C 3 and C 4 Flaveria. Fig. S4. RNA quality from sections of each species. Fig. S5. TPM correlation matrix. Fig. S6. GO term analysis presented as heatmaps for each Flaveria species.  Data S1. TPM values for all four species for all stages and all replicates.
Data S2. Summary of transcripts that were up-regulated in both C 3 species or both C 4 species at each stage.
Data S3. Transcript abundance of SHR, SCR, and SCRlike genes as well as auxin response genes.
Data S4. Transcript abundance of genes associated with cell division.
Data S5. Transcript abundance of genes associated with chloroplast maturation and development.
Data S6. Comparison of data derived from this study with genes identified by Külahoglu et al. (2014).
Data S7. Comparison of data derived from this study with genes identified by Aubry et al. (2014). Table S1. Number of reads and transcripts, per stage and species.

Author contributions
BMCK and JMH designed the study. BMCK grew the plants, undertook the anatomical analysis, and isolated the RNA; RS-U and CB undertook the de novo transcriptome assembly and annotation of contigs; BMCJ, SJB, and IR-L analysed the RNA-seq data; BMCK, SJB, and JMH wrote the paper.