The landscape of transcription factor promoter activity during vegetative development in Marchantia

Abstract Transcription factors (TFs) are essential for the regulation of gene expression and cell fate determination. Characterizing the transcriptional activity of TF genes in space and time is a critical step toward understanding complex biological systems. The vegetative gametophyte meristems of bryophytes share some characteristics with the shoot apical meristems of flowering plants. However, the identity and expression profiles of TFs associated with gametophyte organization are largely unknown. With only ∼450 putative TF genes, Marchantia (Marchantia polymorpha) is an outstanding model system for plant systems biology. We have generated a near-complete collection of promoter elements derived from Marchantia TF genes. We experimentally tested reporter fusions for all the TF promoters in the collection and systematically analyzed expression patterns in Marchantia gemmae. This allowed us to build a map of expression domains in early vegetative development and identify a set of TF-derived promoters that are active in the stem-cell zone. The cell markers provide additional tools and insight into the dynamic regulation of the gametophytic meristem and its evolution. In addition, we provide an online database of expression patterns for all promoters in the collection. We expect that these promoter elements will be useful for cell-type-specific expression, synthetic biology applications, and functional genomics.


Introduction
Embryophytes evolved around 470 million years ago and started covering the Earth's land surface.A common feature of the body plan of land plants is the alternation of generations between the sporophyte and the gametophyte during vegetative to reproductive development (Bowman et al. 2016;Bowman 2022a).The major lineages display two contrasting forms of vegetative body: tracheophytes (vascular plants) display a dominant sporophyte generation (diploid), while the vegetative body of bryophytes is gametophytic (haploid).Both tissues are characterized by polar growth with apical dominance and maintenance of a stem-cell population.Developmental programs controlling meristem organization in the sporophyte of vascular plants are relatively well known (Lodha et al. 2008;Uchida and Torii 2019).It is expected that the vegetative body of bryophytes has an equivalent meristem organization, but the regulatory programs associated with the bryophyte gametophyte and how it evolved are not fully understood (Bowman et al. 2019;Hata and Kyozuka 2021).
During the last decade, evo-devo studies in models such as Marchantia (Marchantia polymorpha) and Physcomitrella (Physcomitrium patens) have provided exceptional insights into the molecular mechanisms regulating developmental programs in bryophytes.Several aspects of hormonal and peptide signaling follow strikingly similar rules to flowering plants (Blázquez et al. 2020;Hirakawa 2022).However, less is known about the identity of transcription factors (TFs) regulating vegetative development of the apical meristem of bryophytes (Romani and Moreno 2021).A better understanding of the nature of these two forms of vegetative growth is likely to shed light on the early evolution of land plants.
TFs are key determinants of genetic programs operating during cellular development, and their cell-type specific patterns of expression provide indicators for regulatory processes that underpin cell differentiation during the vegetative body formation.M. polymorpha shows many advantages as an experimental system and has become a significant model organism for plant science (Kohchi et al. 2021;Bowman et al. 2022;Bowman 2022b).Marchantia not only widens our knowledge of plant molecular biology outside of flowering plants (angiosperms) but is also an exceptional model for synthetic biology (Boehm et al. 2017;Sauret-Güeto et al. 2020).The Marchantia genome features only about ∼450 TF genes (Bowman et al. 2017), about a fifth of the number of TFs in Arabidopsis (Arabidopsis thaliana), with several subfamilies containing a single gene.This greatly simplifies the study of complex gene regulatory networks (GRN), which is afflicted by the problem of gene redundancy in other systems (Wagner 1996;Panchy et al. 2016).Combined with a short haploid life cycle and efficient Agrobacterium (Agrobacterium tumefaciens)-mediated transformation protocols (Ishizaki et al. 2016), fast modular growth, and simple morphology (Boehm et al. 2017), Marchantia allows system-wide experimental approaches that are infeasible in other plant species.
The mapping of temporal and spatial gene expression patterns is essential for understanding regulatory networks underlying developmental processes.In the last few years, different techniques have been developed to explore gene expression using single-cell (scRNA-seq) and spatial transcriptomics (Giacomello 2021;Seyfferth et al. 2021;Wang et al. 2023).These techniques can provide gene expression information at a single-cell level for an entire transcriptome but associating that to cell identities presents some challenges and limitations (Yuan et al. 2017).On the other hand, traditional tools, such as using transgenic lines with reporters fused to predicted promoter regions, can deliver a more detailed map of expression patterns at the cellular level.This approach can provide insight into the dynamics of gene expression as well as useful tools for tissue-specific expression.However, understanding the landscape of gene expression in an organism requires exploring the expression of hundreds to thousands of genes.The generation of stable transgenic lines is laborious and time-consuming, making such an endeavor infeasible for many model organisms, particularly in plant species.Yet, comprehensive expression pattern databases have been established for several metazoan species using transcriptional reporters and in situ hybridization (Visel et al. 2007;Gallo et al. 2011;Bessa et al. 2014;Alonso-Barba et al. 2016).

IN A NUTSHELL
Background: Transcription factors are essential for the regulation of gene expression, cell behavior, and growth of plants.Characterizing the cellular activity of these regulatory genes in space and time is a critical step toward understanding the architecture and self-organization of plant tissues.The vegetative gametophyte meristems of bryophytes are understudied compared with the meristems of flowering plants.Marchantia polymorpha is a compelling model system for studying the dynamic relationships between gene expression and cellular organization.The liverwort plant has a simple and streamlined genome with minimal redundancy, with only ∼450 putative transcription factor genes.
Question: Which cellular identities characterize the development of the vegetative meristem in Marchantia?How can one visualize individual cell states during development?What is the identity and expression profile of transcription factors associated with them?

Findings:
We have generated a near-complete collection of promoter elements derived from these genes.We used these to make transformed plant lines expressing fluorescent proteins in patterns that reflected properties of the corresponding regulatory genes.This allowed us to build a map of specific expression domains in early development and identify a set of transcription factor-derived promoters that are active in the stem cell zone of the plant.The cell markers provide additional tools and insight into the dynamic regulation of the plant gametophytic meristem and its evolution.In addition, we now have a library of synthetic promoters that can be used as a toolkit to engineer gene expression.

Next steps:
We expect that these modular promoter elements will be useful for a wide range of experiments.The ability to precisely control gene expression is important for functional studies of the natural feedback between genetic and cellular processes that underpins growth, and for reprogramming these interactions in engineering approaches.
In this work, we aimed to systematically explore the behavior of promoter elements from TF genes in Marchantia, and to map the resulting expression patterns.We hypothesize the gametophytic meristem is also characterized by the specific expression patterns of TFs in Marchantia and they could provide clues to understand underlying GRNs.We characterized a near-complete collection of promoter elements derived from TFs encoded in the genome of Marchantia.These transcriptional reporters were used to survey regulatory landscape in the vegetative gametophytes of Marchantia.The approach offers an unbiased and comprehensive way to explore TF expression patterns.Comparative analysis of the reporters allowed us to recognize expression domains and cell types in Marchantia gemmae and provide important insights into the genetic programs underpinning the organization of Marchantia stem cells.We also identified cell-type specific promoters across different stages of gemma development.Surprisingly, the set of TF reporters found in the stem-cell zone (SCZ) is largely evolutionary unrelated to TFs known from the sporophyte meristem of vascular plants.The imaging data for all tested promoters is available via a webbased database to accelerate functional genomics studies and cell-type specific engineering.

A comprehensive collection of putative promoters from Marchantia TF genes
In order to create a rich library of active cell-specific promoter elements, we extracted the 5′ untranslated regions together with adjacent upstream sequences from putative TF genes.The rationale for this design choice was based on a number of widely seen observations: (i) regulatory proteins such as TFs often show spatially and temporally restricted patterns of expression, (ii) proximal upstream regions are rich in cis-regulatory elements that control transcription, and (iii) these can be independently transplanted to create independent and reliable promoters.
Promoters are defined as a region of DNA upstream of a gene where relevant regulatory regions, such TF binding sites, are contained and control gene transcription.This region can have different length, ranging from the translation initiation site to the start/end of the next downstream transcript of a given gene (Fig. 1A).In Arabidopsis and other plant species, regulatory elements are enriched in regions up to 300 bp upstream the transcription start site (TSS) (Yamamoto et al. 2007) and 500 bp can be sufficient to capture the promoter region (Hiratsuka et al. 2022).However, additional DNA sequences can be important for gene regulation, at least 1 kb is generally used for bioinformatic analysis (Grau and Franco-Zorrilla 2022).Previous experiments to generate large promoter collections in Arabidopsis have employed sequences 1 kb upstream the TSS (patent WO 2005/098007) or 2 kb upstream of the translation start site (Tang et al. 2021).
According to annotation of the M. polymorpha Tak-1 v5.1 genome (Bowman et al. 2017;Montgomery et al. 2020) the intergenic regions in Marchantia are usually longer than Arabidopsis (median At = 906 bp, Mp = 4,042 bp).The same can be said of the length of the 5′ UTRs (median At = 88 bp, Mp = 293 bp) (Bowman et al. 2017;Montgomery et al. 2020) (Fig. 1A).In addition, chromatin accessibility profiled by ATAC-seq (Karaaslan et al. 2020) showed that ∼50% of the peaks are >1 kb from the TSS (Fig. 1, B and C).This contrasts with ∼75% in Arabidopsis (Lu et al. 2016) and suggested that Marchantia promoters could be substantially longer than Arabidopsis.Therefore, we chose to include the entire 5′ UTR (5UTR) and a putative promoter region (PROM) that is at least 1.8 kb upstream from the annotated TSS.This was balanced with other practical limitations, such as the technical limits and increasing costs of DNA synthesis for longer constructs.On average, the length of the promoters used is about 2.5 kb from the translation start site (Fig. 1D), a length that is well above the standard criteria for Arabidopsis and reflects the different genomic architecture in Marchantia.(Bowman et al. 2017;Montgomery et al. 2020).
Putative promoter elements were domesticated, synthesized and cloned in standard Phytobrick format (Patron et al. 2015).5UTRs shorter than 500 bp were synthesized as a single unit (PROM5), while longer than 500 bp and smaller than 3 kb, were made as separate 5UTR and PROM parts (Fig. 1E).The collection is widely representative of all major plant putative TF protein families (Fig. 1F), with a total coverage of around ∼82% of all putative TFs annotated in the Marchantia genome.In addition, the collection includes promoters for other relevant genes in Marchantia that serve as references (Supplementary Data Set 1).Promoter sequences were domesticated following the standards for Type IIS cloning and inserted as L0 parts for Loop assembly (Patron et al. 2015;Pollak et al. 2019) to facilitate the reuse of the synthetic parts.
We subsequently cloned the promoter elements into a binary vector to drive expression of an mVenus yellow fluorescent protein with an N7 nuclear localization signal.The use of mVenus-N7 provides a stable protein with good sensitivity of detection, which is suitable for visualizing cell identities, as shown before in previous studies (Sauret-Güeto et al. 2020).The vector T-DNA also contained a plasma membrane gene reporter (mScarlet-Lti6b) controlled by the pro MpUBE2 (BIQUITIN-CONJUGATING ENZYME E2) constitutive promoter (Sauret-Güeto et al. 2020) (Fig. 1G).This marker works as a positive control for transformation, an internal reference for any artifacts associated with the insertion site of the construct and helped to visualize different cell shapes and arrangements and to classify patterns.To speed the cloning, we built a custom vector with Type IIS sites for cloning of PROM5 or PROM and 5UTR L0 parts in a backbone with preassembled parts, obtaining the desired final construct for stable expression in a single step (Fig. 1, G to H).Finally, we implemented a high-throughput transformation protocol based on Agrobacterium-mediated transformation in multiwell plates (Ishizaki et al. 2008;Sauret-Güeto et al. 2020) to obtain six to seven independent stable transgenic lines for each plasmid.in cups.Gemmae provide a stereotypically conserved stage of development, poised at the beginning of the Marchantia vegetative life cycle.Each gemma typically has two opposing apical notches containing stem-cells, differentiated cells, two planes of symmetry and no predefined abaxial/adaxial polarity (Zheng et al. 2020;Kato et al. 2020a).At the gemma stage, the structure of the stem-cell niche and the entire body is accessible for microscopy and differentiating cells can be recognized easily without the need for staining or clearing.

Characterizing TF reporters in planta
We imaged several lines for each promoter using confocal microscopy and selected images that best represent the consensus expression patterns.From the collection of 367 promoters, we initially classified the expression patterns into five nonexclusive categories: 104 lines showed no detectable signal (28%), 41 presented a dim signal (11%), 121 a ubiquitous expression pattern across the gemma (33%), 127 a pattern stronger or specific to the notch area (35%), and, finally, 50 (13%) had some specificity for specialized cells (Fig. 2A, Supplementary Data Set 1).
To test whether these expression patterns correspond with endogenous expression, we compared each group with transcript levels from the corresponding genes analyzed by RNA-seq (Mizuno et al. 2021) and scRNA-seq (Wang et al. 2023) analysis of the whole gemmae in publicly available datasets (Fig. 2, B and C).As expected, ubiquitous promoters showed the highest average transcripts per million (TPM) values, followed by genes associated with specialized cells and notch biased expression (Fig. 2B).On the other hand, reporters with no expression had the lowest TPM values, followed by the group with dim expression (poor signal-to-noise ratio).A similar correspondence is observed when comparing the number of cells with detectable expression in scRNAseq (Fig. 2D).Among the TFs with undetectable levels of expression, several are expressed in other developmental stages (Kawamura et al. 2022).Only around ∼15% presented clearly inconsistent expression patterns compared to RNA-seq.
The microscopy data collected during the screening of promoter activities have been organized in a database accessible online (Fig. 1E, https://mpexpatdb.org/).The collection can be searched and filtered by expression profiles, gene IDs, names, and families.The database links promoters with functional information about the adjacent gene available in the MarpolBase (Ishizaki et al. 2016).For each reporter construct tested we recorded a maximum projection image with three separate channels (gene of interest, chlorophyll autofluorescence, and the constitutive plasma membrane marker) for identification of cell types.We have also developed an original feature to visualize the channels independently.The user can select which channels are actively visualized and download the appropriate composite picture.

Identifying expression domains in Marchantia gemmae
The variability between individuals is relatively low and the dimensions of the tissue follow a normal distribution (Supplementary Fig. S1).This simple morphology makes the gemma stage convenient for systematic comparisons between reporters.Excluding promoters with dim or no expression levels, for each representative reporter we orientated the image to align the two apical notches to the horizontal axis, subtracted the background, and made a profile of the fluorescence intensity along the notch axis.The length of the profile was normalized by the distance between notches and then smoothed to reduce the noise of the signal.To avoid small variations between left and right notch, we averaged them.Finally, we normalized the signal to the maximum of each image (Fig. 3A).This allowed us to generate a linear vector that represented expression patterns from different transgenic lines in a comparable way.
In total, we analyzed reporters for 218 different genes.We used hierarchical clustering and identified five clusters representing distinct expression domains (Fig. 3, B to E).Most expression patterns follow a skewed distribution with the apical notch position as the mode (clusters 1 to 3).Others instead followed a normal distribution with the central zone (CS) as the mode (cluster 4) or were evenly distributed across the gemma (cluster 5).Only a few expression patterns did not match these broad classes, and these were mostly associated with expression in differentiated scattered cells.
Within cluster 1, we distinguished two populations, one with a peak in the apical notch and a second includes a broader area around it (Fig. 3C).These correspond to the SCZ and dividing and differentiating cell zone (DDCZ), respectively, as recognized earlier (Kohchi et al. 2021).The SCZ includes a single apical cell and subapical cell anticlinal derivatives located at the center of the notch (Kohchi et al. 2021).The DDCZ covers a population of two rows of derivative cells precisely arranged around the SCZ.Cluster 2 is a broader area of small cells radially distributed along the SCZ that we named transition zone (TZ).Cluster 3 also includes the previous domains but extends over a group of cells distant to the apical notches and fades along the axis.We named this domain of larger cells peripheral zone (PZ).Finally, we named clusters 4 and 5 that correspond to two populations of ubiquitous promoters with different strengths between the apical region and the central zone (CZ).Most known constitutive promoters ( pro MpUBE2, pro MpEF1/ELONGATION FACTOR 1α, pro CaMV35S) belong to cluster 4 (Althoff et al. 2014;Sauret-Güeto et al. 2020).Finally, based on clustering analysis and incorporating literature information about cell types in Marchantia, we generated a schematic model of a gemma that described cellular arrangements and cell populations that could be distinguished (Fig. 3D).
We selected reporters representative of expression domains and cell types to obtain a more precise map of the expression domains at a cellular level.We built transgenic lines with different combinations of promoters driving the expression of two or three compatible fluorescent reporters (mVenus, mScarlet, and mTurquoise) localized in the nucleus as part of the same T-DNA.In all cases, the domains could be clearly distinguished in the different combinations (Fig. 4).This demonstrates that the expression patterns could be used in an independent and additive fashion to mark multiple cell states simultaneously and allowed us to differentiate between promoters active in the SCZ and the apical cell in the middle (Fig. 4, A and D).

Mapping TF expression patterns in specific cell types
The global analysis of expression profiles along the apical axis can provide a systematic account of organism-wide patterns but may not capture the local cell patterning important for cell differentiation.To get a more precise map of cell types, we manually inspected each reporter and identified promoters with specificity for specialized cells such as rhizoids, oil body cells, and mucilage papillae (Fig. 5B).In addition, two other expression domains (border and attachment) do not form regular distributions along the apical axis as most of the other domains (see below).These cell types and domains match descriptions in the published literature on cellular analysis in the Marchantia gemmae (Shimamura 2016) and can be included in the schematic model gemma (Fig. 5A).The classification of cell types was defined in a way that any observed expression pattern could be classified as active in one or a combination of cell types.The corresponding TF gene families associated with the expression patterns are distributed across different cell types (Fig. 5C).We did not find a clear association of a particular TF family with specific cell types.Finally, clustering analysis of the expression domains and cell types enabled the reconstruction of putative developmental pathways for cell differentiation during gemma development (Fig. 5D).
We identified promoters specific for cell lineages of specialized cells in Marchantia gemmae (Fig. 5B).Mucilage papillae are tip-growing cells covering the SCZ (Galatis and Apostolakos 1977).We showed that pro MpBZIP14 (BASIC LEUCINE ZIPPER 14) and pro MpBHLH28 (BASIC HELIX-LOOP-HELIX 28) were specifically active in the mucilage papillae (Fig. 5B, Suppl.Dataset S1).Oil body cells are idiotypic cells scattered across the thallus and are distributed in regular fashion along the edges of gemmae (Romani et al. 2022).Our screening also led to the rediscovery of oil bodyspecific promoters derived from the genes MpERF13 (ETHYLENE RESPONSIVE FACTOR 13), MpC1HDZ (CLASS I HOMEODOMAIN LEUCINE-ZIPPER), and MpR2R3-MYB2 (Fig. 5B, Supplementary Data Set 1), which have been described as important regulators of oil body development (Kubo et al. 2018;Kanazawa et al. 2020;Romani et al. 2020Romani et al. , 2022)).The patterns of expression were consistent with earlier published reporters (Kanazawa et al. 2020;Romani et al. 2020) despite the shorter length of the promoters in our collection (30%, 49%, and 46% the length of the published promoters, respectively).
Having a comparable set of reporters allowed us to spot some differences between the expression patterns of each of them: pro MpERF13 seems to be more active in oil body cells closer to the apical cell while pro MpR2R3-MYB2 is more evenly expressed in all oil body cells.In contrast, pro MpC1HDZ expression is not restricted to only oil body cells (Romani et al. 2020).In addition, we observed that the reporters derived from MpBHLH34, MpWRKY10, MpREM2 (REPRODUCTIVE MERISTEM 2), MpBHLH10, MpTRIHELIX8, MpASLBD11 (LOB DOMAIN-CONTAINING PROTEIN 11), and MpC2H2-8 displayed degrees of cell-type specificity, but their functions in Marchantia are largely unknown (Supplementary Data Set 1).Among them, MpC1HDZ, MpR2R3-MYB2, MpERF13, and MpWRKY10 mRNAs were also shown to be specifically expressed in oil body cells in scRNA-seq experiments (Wang et al. 2023).We also identified a set of promoters specifically active in rhizoid precursor cells (Fig. 5B).Of these, pro MpBHLH33/MpRSL3 (ROOTHAIR DEFECTIVE SIX-LIKE 3) has been described before (Sauret-Güeto et al. 2020) and is strongly expressed in all rhizoid cells (Fig. 5B).Some promoters were active in the rhizoid precursors near the apical region but not in those located in the center of the gemma (e.g.pro MpAP2L2/APETALA2-LIKE 2 and pro MpERF2), suggesting there are two populations of rhizoid precursor cells (proximal and distal) in the gemma (Fig. 5B, Supplementary Data Set 1).Last, we observed a series of other promoters displaying seemingly random expression patterns that do not match any of the cell types or expression domains that we have described here (Supplementary Data Set 1).

Marker expression reveals the dynamics of cell fates in the SCZ
The availability of this prolific collection of highly precise cellular markers allows new approaches to visualizing the dynamics cell fates in planta.We followed the expression profile of a set of promoters active in the notch to better understand patterns of cell differentiation.We found five TF reporters ( pro MpBZIP15, pro MpBZIP7, pro MpC2H2-26, pro MpC2H2-22, pro MpERF20/LAXR/LOW-AUXIN RESPONSIVE, pro MpCDC5/CELL DIVISION CYCLE 5) with high specificity for the SCZ at the gemma stage.It is proposed that the SCZ is composed of a central apical cell and a pair of immediate derivatives called subapical cells (Kohchi et al. 2021).During the first days of gemmaling development, it is possible to observe two stacked apical cells (Miller and Alvarez 1965;Miller 1966; Bowman 2016).We followed the expression pattern of these candidates after the germination of gemmae and only pro MpERF20 remained expressed in the apical cells (Fig. 6, Supplementary Fig. S2).In contrast, pro MpBZIP15, pro MpBZIP7, pro MpC2H2-26, pro MpCDC5, and pro MpC2H2-22 are expressed in a subset of differentiated cells after gemmae germination (Fig. 6, Supplementary Fig. S2).The expression of these reporters in subapical cells in the gemma provides evidence for the initiation of cell differentiation processes immediately adjacent to the stem cell, and additional tools for studying these.
We verified the expression of MpERF20 transcript by in situ hybridization.(Supplementary Fig. S3).Tissue-specific expression of MpERF20 in the SCZ was confirmed, however mRNA transcript signal corresponds to a larger area than it was observed in the transcriptional reporter.It was recently shown that MpERF20 plays a fundamental role in regeneration, has the capacity to induce cellular reprogramming to generate undifferentiated cells and it is sufficient to generate new apical stem cells (Ishida et al. 2022).After the ablation of the notches in the gemma, a strong response of pro MpERF20 is induced in the whole tissue after just 5 h consistently with data from RNA-seq experiments after ablation (Ishida et al. 2022).A previous longer version (4.3 vs 1.8 kb) of this promoter also displayed similar induction after ablation (Ishida et al. 2022).Following the induction of pro MpERF20, cells start dividing and de-differentiate until a new apical region is formed.Subsequently, the expression activity of pro MpERF20 diminishes in epidermal cells and only remains in the new SCZ (Fig. 6, Supplementary movie S1).

Dynamic expression of reporters during gemmaling development
Outside of the SCZ, we identified 20 TF promoter-driven reporters specifically expressed in the DDCZ.Interestingly, this later group of TFs also included stress-related genes such as MpABI3a (ABSCISIC ACID-INSENSITIVE 3A), MpMYCY, MpHY5 (ELONGATED HYPOCOTYL 5) (Clayton et al. 2018;Eklund et al. 2018;Penuelas et al. 2019), suggesting that stress signal transduction pathways are specifically active in the DDCZ at this stage of gemma development.We found only nine TF reporters specific to the TZ, and all of them are also active in the DDCZ and SCZ.This logic is seen for other TFs expressed in the PZ.Altogether, the Marchantia meristem is characterized by more than 200 TFs active in the SCZ and this number diminishes as cells mature and are displaced distally from the apical growth direction.
Two types of expression pattern do not follow a regular profile along the apical axis and are not associated with known specialized cells.The first corresponds to cells around the perimeter of gemmae that we called "border cells".Such cells were not well described in the literature.In a transverse section, the border cells form a layer of two to three cells at the margins of the gemma.Among the promoters observed, MpNAC2 (NAM/ATAF/CUC 2) and MpARF2 (AUXIN RESPONSIVE FACTOR 2) show higher specificity for expression in border cells.A similar expression pattern was shown before, using a knock-in reporter of MpARF2 (Kato et al. 2020b).We observed the expression of both genes after gemma germination (Fig. 7A, Supplementary Fig. S4A) and the expression maximum migrates from the border to the CZ after 2 d.We believe the border expression pattern could be associated with the establishment of abaxial/adaxial polarity or auxin accumulation during gemma formation.This interpretation is supported by the role of MpARF2 and auxin signaling in gemmae development (Rousseau 1953;Eklund et al. 2015).
The second special expression domain corresponds to a group of elongated cells referred to in the literature as the "attachment point" (Solly et al. 2017) which correspond to the cells connected to the cup base before the detachment of the gemma from the stalk cell (Kato et al. 2020a).
pro MpASLBD17 is the best reporter with high specificity for the attachment cells (Fig. 5).After gemma germination, pro MpASLBD17 signal remains in the attachment region but the signal diminishes (Supplementary Fig. S4).Interestingly, pro MpASLBD17 is also active in the base of the cup in both gemma initials and mucilage cells (Fig. 7B).These terminal cells do not divide after germination suggesting this cell identity could be a remnant of interaction between the gemma and cup.Later in development, pro MpASLBD17 is also strongly expressed in slime papillae (Supplementary Fig. S4B).This is consistent with the notion that the mucilage papillae, slime papillae, and gemma initials share common properties and some similar genetic programs (Proust et al. 2016).
We followed the expression patterns of 27 other promoters active in the different expression domains in the notch across the course of vegetative development in Marchantia gemmalings for 7 d.During this developmental period, gemmalings start maturing and proliferating and undergo drastic morphological changes.Still, the relationship between most patterns remained consistent (19/27) during this period of growth (Fig. 7C).Examples of expression patterns in 0-, 3-, 5-and 7-d-old gemmalings are shown in Fig. 7C.After the first 2 d of growth, cells rapidly expand and form a mature epidermis while the first bifurcation of the thallus takes place.Proximal rhizoids (Fig. 5A) of the dorsal surface can still undergo cell divisions and de-differentiate into epidermal cells, while distal rhizoids are committed to elongate even at the dorsal surface.The mature thallus is characterized by the complete formation of air chambers and air pore structures (Shimamura 2016).These structures are formed by a very precise pattern of cell divisions that occur very close to the SCZ and form a boundary between the mature thallus and the gemmae epidermis visible after 3 to 4 d.
The DDCZ drastically expands during the first days and covers most of the newly formed mature thallus, displacing the TZ and PZ (Fig. 7C).This contrasts with TF promoters expressed in the SCZ of the gemma which remain limited to subdomains of the mature thallus (Fig. 6).The DDCZ, TZ, and PZ maintain a high rates of cell expansion and division during the first days (Boehm et al. 2017;Ishida et al. 2022) but only the DDCZ is active during the differentiation of cells.Both the TZ and PZ expand to form the boundary and heartshaped morphology that separate both apical notches, acting as a supportive tissue to the forming mature epidermis (Fig. 7C).The CZ remains unaltered while the rhizoid precursors in the dorsal region de-differentiate.It is only after 5 to 7 d that the DDCZ forms a gradient of expression focused on the SCZ forming a boundary between developing and mature air pores (Fig. 7C).This structure is repeated in a similar pattern during vegetative growth (Solly et al. 2017).We synthesized these observations and expanded our model of expression domains to later developmental stages (Fig. 7C).
The mature thallus is later characterized by the presence of air pores and gemma cups.We observed that other promoters showing tissue-specific expression in organs in the mature thallus, such as gemma cup ( pro MpNAC1 and pro MpERF11) and air pores ( pro MpC3H8, pro MpNFYC4/ NUCLEAR TRANSCRIPTION FACTOR Y SUBUNIT ALPHA 4), are also active in the DDCZ of the developing gemma (Fig. 8).Among them, MpNFYC4 was also found to be air pore specific in scRNA-seq experiments (Wang et al. 2023).These observations are in agreement with classical morphological models of cell differentiation in bryophytes where most cell differentiation processes occur in the cells surrounding the apical region (Apostolakos et al. 1982;Bowman 2016;Shimamura 2016).

Discussion
We built and tested a comprehensive library of promoters derived from the genes of regulatory TFs in Marchantia.
These promoters contain native recognition sequences and are competent to respond precisely to endogenous transcriptional regulators.The promoter parts are of relatively compact size with standardized modular format to allow simple DNA engineering and facilitate their use for construction of new genetic circuits.These reporters can be combined to recognize or target virtually any cell type in Marchantia, providing a toolset that rivals any other plant system.
We have used these promoters to systematically map patterns of gene expression during early gemmaling development in Marchantia.We exploited nuclear-localized fluorescent cell markers and the regular cellular architecture of gemmae to normalize and compare patterns of gene expression with cellular resolution.These could be registered using microscopic features of cellular anatomy and compared with published knowledge of cellular differentiation to enable construction of a stereotypical map of cell states in the Marchantia gemma.This atlas will provide a guide for further use of the promoter collection, and a template for more detailed studies of the interactions between genome and cellular development in Marchantia.
The activity of putative promoter regions has some limitations in accurately reflecting the transcriptional patterns of the corresponding endogenous genes.For example, they may be missing important downstream or upstream regulatory regions, contain minor alterations due to domestication during cloning, or miss post-transcriptional regulatory mechanisms associated with the native transcripts.Post-transcriptional regulation has been shown to be important for several developmental regulators in Marchantia (e.g.MpRSL1, MpFGMYB/ FEMALE GAMETOPHYTE MYB) (Honkanen et al. 2018;Hisanaga et al. 2019).At the same time, the use of a fluorescent protein as reporter also poses limitations due to high stability of the protein.The reporters may not accurately reflect transient expression patterns that could be relevant during growth, such as the cell cycle.We cannot fully exclude other sources of artifacts, for example errors in annotation of TSS and translation start sites.
Nevertheless, we found broad and consistent associations between the observed patterns of promoter activity and independently measured levels and distribution of transcripts, and documented properties of longer versions of the promoters from the literature.In addition, cell-type specific expression patterns were associated with known gene functions, as described for MpERF13, MpR2R3-MYB2, MpC1HDZ, MpERF20.Our approach is complementary to transcriptomebased efforts to map Marchantia development.Moreover, it could capture precise features of cellular organization and gene regulation in the apical meristem that were not discernible by time-resolved scRNA-seq (Wang et al. 2023).Further, these promoters can be used to drive the expression of (i) fluorescent proteins to deliver spatially precise and sensitive markers for visualizing the dynamics of cell states in living tissues, or (ii) regulatory genes to genetically reprogram the system.
Reconstructing the evolution of morphological traits requires defining the relationship between tissues and cell types and how genetic programs evolved (Delaux et al. 2019;Zeng 2022).Previous models suggested that the vegetative gametophyte meristem of bryophytes is analogous or homologous to the vegetative sporophyte meristem in tracheophytes (vascular plants), both as a deeply conserved trait or by the co-option of several TFs from one generation to the other (Bowman et al. 2019).To reconstruct the history of the evolution of meristems in land plants, the expression patterns of TFs play a crucial role.Looking at conserved factors across embryophytes may have generated constraints in the comparisons between the functional architectures of these two forms of multicellular polar growth.Our approach of testing a near-complete collection of TF reporters has the potential to revisit this question, sidestepping selection bias.
Morphological studies suggest that stem cells of the vegetative body of bryophytes are comprised of single apical cell (Menand et al. 2007;Shimamura 2016;Suzuki et al. 2020).This simple structure is likely the ancestral state of the land plant meristem, while the more complex meristem observed in vascular plants is likely a derived trait (Harrison and Morris 2018;Fouracre and Harrison 2022).Our observations provide genetic evidence for the identity of such cells in Marchantia.MpERF20 is expressed in the center of the SCZ (as verified by the fluorescent reporter and in situ hybridization) and accompanied by subapical cells where other TF derive promoters are specifically expressed ( pro MpBZIP15, pro MpBZIP7, pro MpC2H2-26, pro MpC2H2-22, pro MpCDC5).In addition, we found a set of TF promoters active in the DDCZ that completes the arrangement of cells forming the Marchantia notch, that constitute the building blocks of Marchantia vegetative development.
Thus, there appears to be a hierarchical order to the patterns for gene expression in the Marchantia thallus.Many TFs are expressed in the SCZ and expression patterns are progressively pruned along the longitudinal axis as distal daughter lineages take up specific cell fates (Fig. 5D).However, we also observed complex gene expression patterns which are active in broad domains but excluded from specific cell types (e.g.pro MpERF21, pro MpBZR2, pro MpBBX3) that could also be important for developmental processes.
The classical model of stem cell organization in the sporophyte of vascular plants involves WUSCHEL (WUS/WOX), Class I KNOX (KNOX1), Class III HD-ZIP (C3HDZ), AINTEGUMENTA/ PLETHORA/BABYBOOM (APB), SCARECROW (SCR), SHORTROOT (SHR), and HAIRY MERISTEM (HAM) TFs.In Marchantia, the reporters for MpWOX, MpAPB, MpKNOX1, and MpC3HDZ are not specific to an analogous region of the apical notch in the Marchantia gametophyte (Supplementary Data Set 1).This is in line with functional evo-devo studies in bryophytes showing that MpWOX does not play a critical role in the gametophyte of Marchantia (Hirakawa et al. 2020), that MpKNOX1 only participates in the sporophyte generation (Sano et al. 2005;Sakakibara et al. 2008;Dierschke et al. 2021;Hisanaga et al. 2021), and C3HDZ mutant does not affect the gametophytic meristem in the model moss P. patens (Yip et al. 2016).As observed in other cases, the function of TFs could be only conserved in the sporophyte generation (Romani and Moreno 2021).In the case of GRAS TFs such as HAM and SCR, they seem to play a prominent role in the gametophytic stem cell organization in Physcomitrium, but putative orthologues for some do not exist in Marchantia (Beheshti et al. 2021;Ge et al. 2022;Ishikawa et al. 2023).
The singular set of TFs expressed in the SCZ is largely unrelated to known TFs associated with meristem organization in other species.For example, MpBZIP15 has no true orthologue in angiosperms (Bowman et al. 2017) and characterized C2H2 TFs are largely associated with stress responses (Han et al. 2020).Interestingly, CDC5 has been associated with shoot apical meristem organization in Arabidopsis upstream of STM and WOX and loss-of-function plants are embryo lethal, but its expression is not meristem specific (Lin et al. 2007).As a possible exception, in Arabidopsis, AtESR1/DRN the orthologue of MpERF20, was described to be involved in regulation of the shoot apical meristem organization and regeneration, suggesting that this role could be conserved across land plants, or co-opted in the opposite generation (Banno et al. 2001;Kirch et al. 2003;Ikeda et al. 2021).However, unlike MpERF20, AtESR1/DRN is expressed in the leaf primordia and not in the SCZ (Kirch et al. 2003).
In summary, the evidence presented here supports the notion that GRNs governing the formation of an apical meristem in the vegetative body of bryophytes and embryophytes are not analogous.One scenario is that both forms of multicellular polar growth evolved to a large degree independently in contrasting generations.The fact that the only conserved factor is associated with regeneration, indicates that the bryophyte meristem GRNs may be built on top of an ancestral capacity of ancestral land plants to regenerate.On the other hand, the more complex body plans of vascular plant may have recruited de novo GRNs during evolution to support organ development and more sophisticated patterning.In contrast, most of the differentiation events in Marchantia development are observed immediately after formation of the first derivatives of the apical cell (DDCZ) and there is not a comparable PZ as in the sporophyte of vascular plants.Nevertheless, other aspects of the molecular machinery regulating the meristem formation and maintenance, such as peptide signaling, biosynthesis and polar transport of auxin, and cytokinin signaling; seem to work in a similar fashion in both forms of vegetative bodies (Whitewoods et al. 2018;Aki et al. 2019;Hirakawa et al. 2019Hirakawa et al. , 2020;;Blázquez et al. 2020;Kato et al. 2020b;Bowman et al. 2021).Future work on hormone control of growth and their interaction with TFs in bryophytes and streptophyte algae will help to fill the gaps in how the cell types are defined and maintained across development.
This atlas of TF expression patterns will provide a valuable resource for the plant science community.As we showed for air pores and gemmae cups, the collection is a rich source of tissue-specific promoters for Marchantia tissues at later developmental stages.We expect this collection of promoters will help to accelerate studies in Marchantia for a wide range of applications: markers for cell identities, ratiometric quantification (Federici et al. 2012), isolation of nuclei tagged in specific cell types (INTACT) (Deal and Henikoff 2011), celltype specific expression, among many other functional genomics and synthetic biology applications.

Genomic bioinformatic analysis
Genomic analysis to quantify the length of intergenic regions and 5′UTR were extracted from M. polymorpha Tak-1 genome v5.1 using Genomic Features package (Lawrence et al. 2013).ATAC-seq, reads were extracted from accessions SRR10879463 and SRR10879464 (Kaaraslan et al. 2020), mapped in M. polymorpha Tak-1 genome v5.1 using HISAT2 (Kim et al. 2019) and ATAC-seq were identified using MACS2 (Liu et al. 2014) using default settings.Peaks were converted to bigwig format and analyzed using ChiPseeker package in R (Yu et al. 2015) and DeepTools v2 (Ramirez et al. 2016) using default parameters.

Synthesis of L0 parts
5′UTR and promoter regions from genes were extracted from M. polymorpha Tak-1 genome version 3.1 (Bowman et al. 2017) genome.DNA sequences were domesticated to remove internal BsaI and SapI sites using Recode2S (https:// github.com/bpollakw/recode2s),and gene IDs were translated to M. polymorpha Tak-1 genome v5.1 (Montgomery et al. 2020) primary transcripts.The sequences of synthetic L0 parts used in this work are available in Supplementary Data Set 1. L0 parts were synthesized either by GENEWIZ or Twist Bioscience following the standard syntax for plant synthetic biology with PROM5 or PROM and 5UTR overhangs and cloned into the plasmid pUAP1 (Addgene #63674) (Patron et al. 2015) by homology recombination.Promoter sequences with repeated Ns in first 1,000 bp or 5UTR longer than 3 kbp, were omitted.Additional L1 and L0 parts were obtained from the OpenPlant toolkit (Sauret-Güeto et al. 2020) (Supplementary Data Set 1).

Laser scanning confocal microscopy
Images of Marchantia were acquired on a Leica SP5 confocal microscope upright system equipped with Argon ion gas laser with emitted wavelengths of 458, 476, 488, and 514 nm, 405 nm diode laser, 594 nm HeNe laser, 633 nm HeNe laser, and 561 DPSS laser.The laser power for argon laser was 30% with a gain of 600 V for all detectors.For higher resolution and time lapse studies, images were acquired on a Leica SP8X spectral confocal microscope upright system equipped with a 460 to 670 nm super continuum white light laser (80% laser power), two CW laser lines 405, and 442 nm, and 5 Channel Spectral Scanhead (four hybrid detectors and one photomultiplier).For slides, imaging was conducted using either a 10× air objective (HC PL APO 10×/0.40CS2), a 20× air objective (HC PL APO 20×/0.75CS2).When observing fluorescent protein with overlapping emission spectra, sequential scanning mode was selected.Excitation laser wavelength and captured emitted fluorescence wavelength window were: for mTurquoise2 (442 nm, 460 to 485 nm), for eGFP (488 nm, 498 to 516 nm), for mVenus (514 nm, 527 to 552 nm), for mScarlet (561 nm, 595 to 620 nm), and for chlorophyll autofluorescence (633, 687 to 739 nm), respectively.
When imaging time-courses, plants grown under normal culture conditions in small petri dishes, removed the lid for imaging, and returned the plants to the growth chamber and imaged as described above.For live imaging, six stacked Gene Frames (ThermoFisher #AB0578) were placed on a glass slide and filled halfway with molten Gamborg B-5 agar medium.Plants were then placed on the solidified agar surface and meristems were removed using a Laser Microdissection Leica LMD7000.Samples were mounted in perfluorodecalin (Littlejohn et al. 2010) with a glass coverslip on top.The slides were then continuously imaged on the Leica SP8X confocal microscope for 1 to 4 d. and DRR284686 (Mizuno et al. 2021) were used to compare reporter expression patterns with RNA-seq.Data was subsequently analyzed with R. Hyperbolic arc-sin was calculated for each corresponding transcript (base package v4.1.3)and plotted with the density function (stats package v4.1.3).

Image analysis and clustering
Image processing was performed in Fiji (Schindelin et al. 2012) to perform maximum intensity projections of the Z-stacks.For fluorescence intensity analysis, background was subtracted with parameters by default, images were rotated to align the notches in the X-axis, and the histogram was done using the plot profile function of the mVenus channel covering the entire gemma, using the chlorophyll channel as a reference.Raw intensity data and distance of the notches were exported for further analysis in R. The smooth.splinefunction (spar = 0.4) was used to reduce noise from cell-to-cell signal, and approxfun function from the stats package was used to interpolate the distance from the start to the first notch, and then to the second and end of the plot using fixed values.The average distance values of all images taken was used as a reference to align all profiles.Intensity was normalized to the maximum value.The hclust and cutree functions from stats package were used to perform the clustering and extract the groups.The pheatmap function for ComplexHeatmap package v2.10.0 (Gu et al. 2016) was used to plot the heatmap.For calculating the mode, the mlv function from modeest package v2.4.0 with the Grenander method (Grenander 1965).Default parameters were used, and plots were made using the base package v4.1.3unless specifically stated.

Figure 1 .
Figure 1.Overview of the TF promoter collection.A) Schematic representation of regulatory regions in a typical gene in Marchantia highlighting relevant features and average distances (noncoding regions in white and coding regions shaded).B) ATAC-seq peak coverage in Marchantia genome centered in TSS and scaled to gene regions.C) Distribution of ATAC-seq peaks according to gene features.D) Boxplot showing the length distribution of all promoters in the collection.The box extends from the 25th to 75th percentiles.Whiskers cover the entire range of the values.The line in the middle of the box marks the median.E) The architecture of the synthetic parts.F) Distribution of tested TFs (lower shaded portion of bars) across TF families in the Marchantia genome.G) Overview of the cloning and transformation strategy implemented to characterize the promoters, including (H) an example ( pro MpNFYA) of the imaging output for each promoter deposited in an accessible database.Scale bar 100 μm.a.u., arbitrary units; TSS, transcription start site; TES, transcription end site; ATAC-seq, transposase-accessible chromatin with sequencing; N, number of observations.

Figure 2 .Figure 3 .
Figure 2. Quality control for the promoter collection.A) Number of unique promoters tested in each class.B, C) Density plot of RNA of initial expression pattern TF classifications: no expression, dim expression, specialized, notch, and ubiquitous.asinh of TPM values from whole tissue RNA-seq (A) and pseudo-bulk of scRNA-seq (B) of the gemma are shown.D) asinh of cell counts with detectable expression in scRNA-seq of the gemma per class of expression pattern.E) Examples of promoters belonging to each class ( pro MpERF1, pro MpNFYB1, pro MpREM2, pro MpBPC2,

Figure 4 .
Figure 4. Combination of multiple fluorescent reporters.Confocal images of the apical region of Marchantia gemmae transformed with multiple fluorescent reporters of TFs in the same plasmid.A) Combo 1: pro MpBZIP9, pro MpBZIP7, pro MpERF7.B) Combo 2: pro MpC3H30, pro MpHY5, pro MpC2H2-21.C) Combo 3: pro MpC3H28, pro MpTRIHELIX31.D) Combo 4: pro MpERF20, pro MpBHLH42.Schematic map and legend of the expression domains and cell types in the gemma notch is shown (bottom left).Shaded squares indicate the domains where each selected promoter is active.Individual channels and composite images are shown.Scale bar 100 μm.SCZ, stem-cell zone.

Figure 5 .
Figure 5.A model for promoter activity in the Marchantia gemmae.A) Schematic representation of cell types identified in the Marchantia gemma and (b) detailed view of the notch area.Examples of representative fluorescent reporters displaying cell-type specific expression patterns B) pro MpBZIP14, pro MpR2R3-MYB2, pro MpNAC2 pro MpASLBD17, pro MpERF2, and pro MpBHLH33/MpRSL3.Marked cell types are shown (left) with confocal images (right) of the promoter of interest driving a yellow fluorescent protein and a constitutive plasma membrane marker ( pro MpUBE2: mScarlet-Lti6b).Scale bars: 100 μm.C) The number of reporters with expression across cell-types are shaded according to different TF gene families.D) Principal component analysis (PCA) of cell types based on the expression of TF reporters.Putative developmental trajectories are highlighted with lines.DDCZ, dividing and differentiating cell zone; SCZ, stem-cell zone; TZ, transition zone; PZ, peripheral zone; CZ, central zone; PC, principal components.

Figure 6 .
Figure 6.Dynamic expression of reporters in the SCZ.A) A selection of promoters specifically active in the SCZ ( pro MpERF20, pro MpBZIP15, pro MpC2H2-22).Cell types of the SCZ are shown on the left.Confocal images of the promoter of interest driving expression of a yellow fluorescent protein and a constitutive plasma membrane marker ( pro MpUBE2:mScarlet-Lti6b). Asterisks point the apical notch.B) Time lapse of pro MpERF20 expression after laser ablation of the notches and until re-establishment of the new SCZ.Ablated regions are marked as dotted lines.Arrowheads point the first cells with signal and the forming apical notch (see also Video 1).Scale bars: 100 μm.

Figure 7 .
Figure 7. Promoter activity during gemmaling development.A) Schematic representation of the gemma border and time-course of expression for a representative cell-type specific marker ( pro MpNAC2).B) Schematic representation of the attachment point of the gemma and expression of a representative cell-type specific marker ( pro MpASLBD17) in a gemma cup in a mature thallus (view from the top and cross-section).Confocal images of the promoter of interest driving expression of a yellow fluorescent protein and a constitutive plasma membrane marker ( pro MpUBE2:mScarlet-Lti6b). C) Schematic models of expression domain dynamics during the first days of gemmaling development, with (below) examples of confocal images of time-courses of fluorescent reporters ( pro MpBHLH42, pro MpNFYC5, pro MpC2H2-9) illustrating the different expression domains.The dashed line represents the boundary between the mature epidermis and the supportive tissue of the gemmae.Scale bar: 100 μm.DDCZ, dividing and differentiating cell zone; SCZ, stem-cell zone; TZ, transition zone; PZ, peripheral zone; CZ, central zone.

Figure 8 .
Figure 8. Promoters specifically active in mature thallus tissues.A) Expression pattern of reporters with specific expression in air pores ( pro MpC3H8, pro MpNFYC4) and B) gemma cups ( pro MpNAC1 and pro MpERF11).Schematic representation of a Marchantia adult plant and the correspondence to images shown.Scale bars: 100 μm.