Diatom Molecular Research Comes of Age: Model Species for Studying Phytoplankton Biology and Diversity

Diatoms are the world's most diverse group of algae, comprising at least 100,000 species. Contributing ~20% of annual global carbon fixation, they underpin major aquatic food webs and drive global biogeochemical cycles. Over the past two decades, Thalassiosira pseudonana and Phaeodactylum tricornutum have become the most important model systems for diatom molecular research, ranging from cell biology to ecophysiology, due to their rapid growth rates, small genomes, and the cumulative wealth of associated genetic resources. To explore the evolutionary divergence of diatoms, additional model species are emerging, such as Fragilariopsis cylindrus and Pseudo-nitzschia multistriata. Here, we describe how functional genomics and reverse genetics have contributed to our understanding of this important class of microalgae in the context of evolution, cell biology, and metabolic adaptations. Our review will also highlight promising areas of investigation into the diversity of these photosynthetic organisms, including the discovery of new molecular pathways governing the life of secondary plastid-bearing organisms in aquatic environments.


INTRODUCTION
Diatoms (Bacillariophyceae) are unicellular chlorophyll a/c-containing microalgae belonging to the division Heterokontophyta, also known as stramenopiles (e.g., Stiller et al., 2014). Diatoms represent a major class of phytoplankton in both marine and freshwater environments (Field et al., 1998; Pierella Karlusich et al., 2020) and are the most species-rich class of microalgae, with at least 100,000 extant species according to conservative estimates (Malviya et al., 2016). Diatoms are distributed worldwide, from tropical and subtropical regions to polar ecosystems (Figures 1 and 2). Thus, the diversity of diatom lifestyles and survival strategies can likely be attributed to their exceptional ability to adapt to highly dynamic aquatic environments. Diatoms and other aquatic protists contribute less than 1% to the Earth's total biomass (550 gigatons of carbon [Gt C], 80 to 90% of which is derived from plants; Bar-On et al., 2018; Pierella Karlusich et al., 2020), but diatoms alone contribute an estimated 20% to annual C fixation. This role in global C cycling is comparable to that of all the terrestrial rain forests combined (Field et al., 1998).
The most distinctive macroscopic feature of diatoms is the cell wall, which is made of hydrated silicon dioxide (silica; Figure 2) comprising two halves (thecae) that fit together like a Petri dish (Round et al., 1990). Subcellular features include a plastid surrounded by four membranes, loosely stacked thylakoids with no apparent subdomains, and a rough endoplasmic reticulum (ER) connected to the nucleus and outer plastid membrane. Unlike plants, diatoms store energy in the form of chrysolaminarin (β-1,3-glucan; Gruber and Kroth, 2017; Huang et al., 2018) and lipids (Armbrust et al., 2004). The metabolic strategies of different diatom species range from photoautotrophy to heterotrophy, including parasitism (Gruber and Kroth, 2017; Supplemental Table 1). Some diatoms live as free-floating single cells, whereas others form colonies (Figure 2) and participate in mutualistic or symbiotic relationships (Foster et al., 2011).
In the last 50 years, major advances in plant biology have been propelled by the development of genomic and genetic resources for experimentally controllable model systems such as Arabidopsis (Arabidopsis thaliana; Arabidopsis Genome Initiative, 2000; Koornneef and Meinke, 2010). In parallel, the green alga Chlamydomonas reinhardtii has emerged as a model system to study photosynthesis, chloroplast biology, sexual reproduction, flagellar motility, and micronutrient homeostasis (Merchant et al., 2007; Salomé and Merchant, 2019). Approximately 25 years ago, the diatom research community established the first genetic transformation protocols for microparticle bombardment of various species (Dunahay et al., 1995; Apt et al., 1996). This work prompted the nascent molecular marine community to start sequencing the genomes of phytoplankton organisms, first Thalassiosira pseudonana (Armbrust et al., 2004) and then Phaeodactylum tricornutum, which are both diatom species.
In this review, we discuss the continuing rise of diatoms as model organisms. To date, T. pseudonana and P. tricornutum (Figures 2A and 2B) are the two species most amenable to laboratory studies, from simple growth experiments to the development of the latest genome-editing tools (Tables 1 to 4;  Supplemental Tables 1 and 2). During the past 15 years, many genomic, metabolic, and cellular features of these algae have been discovered and are likely to be conserved in many diatom species. Nevertheless, many pertinent questions about the life cycles and environmental adaptation of ecologically relevant diatom species cannot be addressed with these two models alone, pointing to the pros and cons of having model species. This need has driven the implementation of new genome sequencing projects for Fragilariopsis cylindrus, Pseudo-nitzschia multistriata, and other species (Table 1; Supplemental Table 1). Information from these emerging models has extended our understanding of the evolution and diversification of diatom genomes. Thus, these diatoms may also represent the dawn of novel models that could be used to address biological questions that might have been missed by focusing on only a few models among the more than 100,000 extant diatom species.
In the following sections, we review diatom evolution and phylogeny and aspects of the cell and life cycle, discuss emerging areas of importance ("Selected Research Highlights" section) for diatom biology from genome organization to basic metabolism, and highlight the genomic and genetic resources and technologies that have been developed over the past decade ("Genomic and Genetic Resources" section). By placing current and future diatom research in the context of green algae and plants, we hope to make diatom research accessible to plant biologists, especially because most of the fascinating molecular secrets of diatoms still await discovery.

EVOLUTION AND PHYLOGENY
Photosynthetic eukaryotes evolved more than 1.5 billion years ago in the Proterozoic oceans. The progenitor of the plastid is considered to have been a free-living cyanobacterium that was engulfed by a heterotrophic host via primary endosymbiosis (Shih and Matzke, 2013). This event gave rise to the supergroup Archaeplastida, which diversified into three major lineages: the glaucophytes, rhodophytes, and Viridiplantae (Petersen et al., 2014). The rhodophytes gave rise to the stramenopiles (Figure 3) as well as the cryptophytes, alveolates, and haptophytes, together comprising the chromalveolate group (Stiller et al., 2014). Most chromalveolate species containing a red-algal-derived plastid or at least a remnant of it (e.g., Plasmodium falciparum) were thought to have originated from a single secondary endosymbiotic event (Cavalier-Smith, 1999), which is consistent with phylogenetic evidence based on plastid genes. However, phylogenomic studies based on nuclear genes do not support the proposed monophyly of chromalveolates (e.g., Stiller et al., 2014). Large-scale comparative genomic analyses have even suggested that a cryptic green-algal endosymbiont in chromalveolates predated the symbiosis with a red alga (Moustafa et al., 2009). Based on the in silico prediction of 770 plastid-targeted proteins that are conserved across photosynthetic stramenopiles and on a smaller set of experimentally verified plastid-localized proteins, there is initial evidence that approximately 25% of the ancestral diatom plastid proteome consisted of nucleus-encoded proteins from green algae (Dorrell et al., 2017). Horizontal gene transfer (HGT) also contributed to the complex history of gene acquisitions in the chromalveolates (e.g., Nosenko and Bhattacharya, 2007; Chan et al., 2012; Basu et al., 2017).
Despite the uncertain origins of many diatom genes, calibrated molecular clocks, multigene and taxon phylogeny, and the fossil record provide evidence that diatoms arose around the Triassic-Jurassic boundary ~190 million years ago. Diatoms can be divided into two major groups (Kooistra and Medlin, 1996): the radially symmetrical centric diatoms (e.g., T. pseudonana) and the pennate diatoms, with an elongated bilateral symmetry (e.g., P. tricornutum). Pennate diatoms can be subdivided into nonmotile araphid species and motile raphid species (Figures 2 and 3). Raphid species (e.g., P. tricornutum, F. cylindrus, P. multistriata) are motile due to the presence of the raphe, a longitudinal slit in the cell wall through which mucilage containing actin-myosin protein complexes is secreted (Poulsen et al., 1999). Interestingly, the evolution of the raphe and the accompanying changes in life history traits (such as active motility of vegetative cells) approximately 120 million years ago appear to have facilitated range expansions or colonization of previously unavailable habitats (e.g., sediments) with varying rates of diversification. Species diversity in the raphid group exceeds that estimated for both centric and araphid pennate diatoms, which might reflect the propensity of raphid diatoms to occupy and adapt to a variety of habitats, from sea ice in polar oceans to the epibiotic lifestyle in freshwater ecosystems. Most of the accelerated diversification took place in the Cenozoic over the past 75 million years.

Figure 1. Model of the global distribution of diatom biomass in the ocean surface layer in April to June (left) and October to December (right) of 2000 (courtesy of Oliver Jahn, Stephanie Dutkiewicz, and Mick Follows). Biomass values, reported in log scale in units of mmol C m⁻³, were derived from the MIT ecosystem model, which simulates ocean circulation and key biogeochemical processes, e.g., nutrient fluxes, plankton growth, and death occurring in the ocean, at a global scale (Follows et al., 2007; Tréguer et al., 2018). Physiological data were derived from laboratory and field experiments. Because diatom biomass varies between ~0.4 (P. tricornutum) and ~7000 (Coscinodiscus wailesii) pmol C per cell, a biomass of 1 mmol C m⁻³ corresponds to a range of ~150 cells L⁻¹ to ~2.5 × 10⁶ cells L⁻¹ (Marañón et al., 2013).
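The cell-density range quoted in the legend of the modeled biomass distribution follows from a unit conversion: 1 mmol C m⁻³ equals 10⁶ pmol C L⁻¹, which is then divided by the carbon content per cell. The short Python sketch below reproduces that arithmetic using the two per-cell carbon values given in the legend. The function and variable names are illustrative assumptions and are not part of the MIT ecosystem model.

```python
def cells_per_litre(biomass_mmol_c_per_m3: float, pmol_c_per_cell: float) -> float:
    """Convert a carbon biomass concentration into an equivalent cell density."""
    # 1 mmol C per m^3 = 1 umol C per L = 1e6 pmol C per L
    pmol_c_per_litre = biomass_mmol_c_per_m3 * 1e6
    return pmol_c_per_litre / pmol_c_per_cell


# Per-cell carbon contents (pmol C per cell) as quoted in the legend.
CARBON_PER_CELL_PMOL = {
    "P. tricornutum": 0.4,
    "Coscinodiscus wailesii": 7000.0,
}

for species, carbon in CARBON_PER_CELL_PMOL.items():
    density = cells_per_litre(1.0, carbon)
    print(f"{species}: about {density:.3g} cells per litre at 1 mmol C per m^3")
```

The same biomass value can therefore correspond to cell densities that differ by about four orders of magnitude, depending on which species dominates.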

DIATOM CELL DIVISION AND LIFE CYCLE
Diatoms often display a "bloom and bust" life cycle, in which rapid mitosis under favorable environmental conditions is followed by the demise of the bloom due to the depletion of nutrients, grazing of herbivores, sinking of cells, sexual reproduction, or even cell death if environmental conditions become unfavorable (Gross, 2012). Diatom cell division is complex and has unique aspects, as it resembles a patchwork of processes from plants and animals (Pickett-Heaps and Tippit, 1978; De Martino et al., 2009; Tanaka et al., 2015a). For instance, diatom mitosis commences at the microtubule-organizing center, similar to that in animal cells. Similar to most plants, diatoms do not have centrioles but rather possess a highly organized central spindle embedded in a matrix called the collar that enables chromosome attachment (Pickett-Heaps and Tippit, 1978). Diatom cytokinesis involves the centripetal development of a cleavage furrow, resembling cytokinesis in animals, but the process of centrifugal cell wall neosynthesis is more similar to plant cell wall deposition (Pickett-Heaps and Tippit, 1978).

Figure 2. (B) to (I) Light microscopy of the pennate model species P. tricornutum (B). This diatom exists in three interconvertible morphotypes: the fusiform, oval, and triradiate morphotypes, as shown in the figure. Scanning electron micrograph images of (C) Skeletonema tropicum, (D) a valve of a raphid pennate diatom, (E) Shionodiscus oestrupii var venrickiae, and (F) F. cylindrus. Light microscopy of (G) Chaetoceros sp, (H) Fragilariopsis kerguelensis, and (I) Pseudo-nitzschia sp from a natural phytoplankton sample collected at the Long Term Station MareChiara in the Gulf of Naples, Italy. (J) Flipped ice floe in the Southern Ocean with dense population of sea-ice diatoms (e.g., F. cylindrus) at the ice-water interface. Images were kindly provided by Diana Sarno and Marina Montresor (Stazione Zoologica Anton Dohrn, Napoli, Italy) and James A. Raymond (University of Nevada, United States).
Cyclins, which regulate the cell cycle together with cyclin-dependent kinases, have expanded extensively in diatoms (Huysman et al., 2010). Some diatom-specific cyclins (dsCYCs) induced under specific growth conditions may control diatom growth in particular environments. dsCYC2 is a key regulator of the onset of cell division in P. tricornutum (Huysman et al., 2013). dsCYC2 controls the light-dependent G1-to-S cell cycle checkpoint, and its expression is induced by the blue-light sensor AUREOCHROME1a (described in more detail below).
The silica cell wall undergoes characteristic changes during cell division (Round et al., 1990). In general, the diatom cell wall (frustule) is composed of two identically shaped but slightly different sized thecae: the larger epitheca and the smaller hypotheca (Figure 4). The two thecae are fastened together by bands of silica, the girdle bands. When diatom cells divide, each daughter cell retains one theca from the original frustule, meaning that one new theca must be synthesized per daughter cell in addition to new girdle bands in order to keep the original and new thecae together. The synthesis of the silica structures is coordinated with the cell cycle, and when silicate is scarce in seawater, the cell cycle is halted (Brzezinski et al., 1990). It is generally assumed that intracellular silica deposition and patterning depend on both self-assembly processes and controlled silica polymerization in specialized silica deposition vesicles (Reimann et al., 1966), which deliver polymerized and macromolecule-complexed silica to specific cellular sites by exocytosis. Numerous organic molecules are involved in these processes, including soluble and transmembrane proteins (e.g., Kröger and Poulsen, 2008; Hildebrand et al., 2018). Proteins involved in cell wall biogenesis were initially identified by biochemical analysis and nanoscale resolution imaging. Putative novel regulators of silicon bioprocesses have been identified by whole-genome expression profiling of T. pseudonana grown under different conditions (Supplemental Table 2). Newly developed genome-editing tools (see section Genomic and Genetic Resources) have been used to define their functions in vivo. For instance, knockdown and knockout of T. pseudonana silacidin production revealed that this protein influences the size of thecae (Belshaw et al., 2017; Kirkham et al., 2017), whereas knockout of silicanin-1 suggested that this protein is important for ensuring that the diatom cell wall is mechanically robust (Görlich et al., 2019).
Figure 3. The initial primary endosymbiosis occurred when a heterotrophic host engulfed a cyanobacterium (represented in blue). Over time, a large proportion of the cyanobacterial genome was transferred to the nucleus of the host, as indicated by the blue arrow. The endosymbiotic process generated the plastids of the Archaeplastida, a major group including Glaucophyta (pale blue), Rhodophyta (pink), and the Viridiplantae (the model green alga C. reinhardtii and plant Arabidopsis are shown). During secondary endosymbiosis, a different heterotrophic cell acquired a red alga and potentially also a green alga. The algal endosymbiont became the plastid (in brown) of the Stramenopila, a group including diatoms, but also other algae such as pelagophytes or the multicellular kelps. Algal nuclear genomes were transferred to the heterotrophic nucleus, as represented by green and red curved arrows, while the algal nucleus and mitochondria were lost. Other bacterial genes in the Stramenopila genome were derived by horizontal gene transfer events from bacterial donors (violet arrow). The figure also shows the approximate dates of diatom evolution and separation between centric and pennate diatoms based on Nakov et al. (2018). Some pennate species further diversified and acquired the capacity to move by secreting mucilage through a longitudinal slit in the cell wall called the raphe, hence the division between raphid (motile) and araphid (nonmotile) pennate diatoms. Mya, million years ago.

For many diatoms, a progressive reduction in cell size occurs during each cell division because the hypotheca is synthesized inside the cells of the previous generation (Macdonald, 1869; Montresor et al., 2016). Restoration of the original cell size is thought to involve sexual recombination. For sexual reproduction in centric diatoms, eggs and flagellated sperm are usually produced from a clonal strain (homothallic), whereas pennate diatoms require strains of opposite mating types (heterothallic; Montresor et al., 2016). In many diatom species, sexual reproduction can only be induced below a specific cell size threshold. Several species (such as Seminavis robusta) emit pheromones as a signal to find appropriate mating types for sexual reproduction (Moeys et al., 2016). Little is known about how sex determination is controlled in diatoms, partly because this issue is not pertinent to the two most-studied models, T. pseudonana and P. tricornutum, as they appear to have lost the ability to reproduce sexually and do not go through cell size reduction. Important information has recently been obtained from studies of the pennate diatom P. multistriata by taking advantage of novel reverse genetics tools developed for this species (Table 1; Supplemental Table 1). By comparing gene expression profiles of opposing mating types in P. multistriata, Russo et al. (2018) identified five genes with a mating-type-specific expression pattern, including MRP3, which is expressed exclusively in one mating type in a monoallelic manner. Overexpression of MRP3 in the opposite mating type induced sex reversal such that this transformant was able to mate with its own mating type. These findings indicate that a single allele of MRP3 is responsible for sex determination in P. multistriata. F. cylindrus contains an MRP3 homolog, but a population genomics study estimated that the recombination frequency in F. cylindrus is only approximately four times higher than the mutation rate, suggesting that sexual recombination is very infrequent or even lacking in this species. If MRP3 is part of a larger mating-type locus, techniques such as linkage mapping would reveal whether this locus is nonrecombining, as seen in sexually reproducing multicellular algae such as Ectocarpus siliculosus (Ahmed et al., 2014). Asexuality appears to have emerged several times in diatoms (Montresor et al., 2016), suggesting it might confer a selective advantage under certain conditions. In the open ocean where cell population density is low, the chance of encountering opposite mating types is very limited, so sex might not be the ideal way to recombine and diversify, as suggested for the coccolithophore Emiliania huxleyi (von Dassow et al., 2015). Other mechanisms such as allelic divergence and changes in ploidy might be as effective as sexual reproduction for diatom evolution and adaptation.
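The progressive cell size reduction described in this section can be illustrated with an idealized simulation. The Python sketch below assumes that all cells divide synchronously, that the daughter inheriting the parental epitheca keeps the parental size while the daughter inheriting the parental hypotheca is exactly one size step smaller, and that there is no cell death or sexual size restoration. These simplifications and all names in the code are assumptions made for illustration and are not taken from the cited studies.

```python
from collections import Counter
from math import comb


def divide(population: Counter) -> Counter:
    """One synchronous division; keys are the number of size steps lost."""
    next_generation = Counter()
    for steps_lost, n_cells in population.items():
        # Daughter that inherits the parental epitheca: same size as the parent.
        next_generation[steps_lost] += n_cells
        # Daughter that inherits the parental hypotheca: one size step smaller.
        next_generation[steps_lost + 1] += n_cells
    return next_generation


def simulate(generations: int) -> Counter:
    """Size-class distribution after a number of divisions from one cell."""
    population = Counter({0: 1})
    for _ in range(generations):
        population = divide(population)
    return population


def mean_steps_lost(population: Counter) -> float:
    """Average number of size steps lost per cell in the population."""
    total_cells = sum(population.values())
    return sum(steps * n for steps, n in population.items()) / total_cells


population = simulate(10)
for steps_lost in sorted(population):
    # In this idealized model the counts follow binomial coefficients.
    assert population[steps_lost] == comb(10, steps_lost)
print("mean size steps lost after 10 divisions:", mean_steps_lost(population))
```

Under these assumptions the mean cell size of the population declines steadily while its variance grows, which is consistent with the need for a size-restoring process in species that undergo this reduction.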

Genome Organization and Regulation of Gene Expression
T. pseudonana (Armbrust et al., 2004) and P. tricornutum were the first diatoms to have their genomes sequenced (Table 1). They were chosen as model systems because of their small genome sizes, rapid growth under controlled laboratory conditions (one division or more per day), their extensively characterized physiology, and the increasing availability of tools for reverse genetics (see section on "Genomic and Genetic Resources"). In the last decade, the hyperdiversity of diatoms, particularly the nested clade of pennate diatoms, has stimulated an interest in comparative genomics approaches to study the drivers of diversification and adaptation to a variety of aquatic habitats. New genome sequencing initiatives have been complemented by efforts to extend the existing molecular toolkit to a wider range of diatoms (Table 1; Supplemental Table 1). The rationale for deciding subsequent genome projects was either based on the importance of the habitats the diatom species occupy or their specific physiology: (1) F. cylindrus to reveal adaptations to polar oceans, the preferred habitat of diatoms based on abundance; (2) Thalassiosira oceanica (Lommer et al., 2012) to study adaptation to iron-limited waters, which cover approximately 35% of the ocean surface; (3) P. multistriata (Basu et al., 2017) to reveal genes involved in sexual reproduction and recombination; (4) Pseudo-nitzschia multiseries (Basu et al., 2017) and Skeletonema costatum (Ogura et al., 2018) for the metabolic pathway underpinning the synthesis of a potent diatom toxin and to study harmful algal blooms; and (5) Fistulifera solaris (Tanaka et al., 2015b) and Cyclotella cryptica (Traller et al., 2016) for the production of biofuels.
The haploid nuclear genomes of T. pseudonana (32.1 Mbp) and P. tricornutum (27.4 Mbp) are relatively small, with 11,766 and 12,233 predicted genes, respectively. At least 50% of the nuclear genome encodes proteins or smaller peptides, according to estimates based on P. tricornutum (e.g., Yang et al., 2018). The other sequenced species have larger genomes, ranging from approximately 60 Mbp for P. multistriata and F. cylindrus to over 200 Mbp for P. multiseries (Table 1).
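To put the genome sizes and gene counts of the two model species into perspective, the following Python sketch derives two simple summary values from the numbers quoted above: the average genomic span per predicted gene and the number of genes per Mbp. The dictionary layout and function names are illustrative assumptions. The calculation is a back-of-the-envelope estimate, not an annotation-based measurement.

```python
# Haploid genome sizes and predicted gene counts as quoted in the text.
GENOMES = {
    "T. pseudonana": {"size_bp": 32.1e6, "genes": 11_766},
    "P. tricornutum": {"size_bp": 27.4e6, "genes": 12_233},
}


def bp_per_gene(size_bp: float, genes: int) -> float:
    """Average genomic span available per predicted gene, in base pairs."""
    return size_bp / genes


def genes_per_mbp(size_bp: float, genes: int) -> float:
    """Number of predicted genes per million base pairs."""
    return genes / (size_bp / 1e6)


for species, genome in GENOMES.items():
    print(
        f"{species}: {bp_per_gene(**genome):.0f} bp per gene, "
        f"{genes_per_mbp(**genome):.0f} genes per Mbp"
    )
```

On these figures, both genomes contain roughly one predicted gene every 2.2 to 2.7 kb, which illustrates how compact they are.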
All diatom species sequenced so far are thought to be diploid in their vegetative states. However, the results of several k-mer analyses suggest that F. cylindrus is either triploid or aneuploid (T.M., unpublished data). F. solaris has a confirmed allodiploid genome structure indicative of interspecies hybridization (Tanaka et al., 2015b, and below). Plastid and mitochondrial genomes have also been analyzed for most of the species listed in Table 1 in addition to the organellar genomes of other diatoms (Kowallik et al., 1995;Oudot-Le Secq et al., 2007;Lommer et al., 2010;Tanaka et al., 2011;Brembu et al., 2014). Organellar genomes are similarly compact, with few protein-coding genes (Table 1), and most organellar proteins are encoded in the nucleus, as in other eukaryotic phototrophs. Unlike the plastid genomes of green algae, diatom plastid genomes lack introns, except for S. robusta (Brembu et al., 2014). Overall, there appears to be no synteny among plastid genomes in diatoms.
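The k-mer analyses mentioned above rest on a simple idea: every substring of length k in the sequencing reads is counted, and the distribution of those counts (the k-mer spectrum) is inspected for peaks. As generally understood, the relative positions of the peaks reflect how many genome copies carry a given k-mer, which is why the spectrum can hint at ploidy. The Python sketch below shows only the counting step on toy reads. The read strings are made up, reverse complements are not merged, and real analyses rely on dedicated k-mer counters and statistical models that are not shown here.

```python
from collections import Counter
from typing import Iterable


def kmer_counts(reads: Iterable[str], k: int) -> Counter:
    """Count every k-mer occurring in a collection of reads."""
    counts = Counter()
    for read in reads:
        for start in range(len(read) - k + 1):
            counts[read[start:start + k]] += 1
    return counts


def kmer_spectrum(reads: Iterable[str], k: int) -> Counter:
    """Map each multiplicity to the number of distinct k-mers seen that often."""
    return Counter(kmer_counts(reads, k).values())


# Hypothetical toy reads, used only to show the shape of the output.
toy_reads = ["ACGTACGT", "CGTACGTT"]
for multiplicity, n_kmers in sorted(kmer_spectrum(toy_reads, k=4).items()):
    print(f"{n_kmers} distinct k-mers occur {multiplicity} time(s)")
```

With real data, the spectrum would be plotted as a histogram and compared with the peak pattern expected for diploid, triploid, or aneuploid genomes.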
The complex evolutionary history of diatoms is the main reason their genomes are considered to be chimeric (Armbrust et al., 2004;Bowler et al., 2008). The extant gene repertoire common to all diatom genomes was recruited from (1) the ancient heterotrophic host, (2) a red algal (and potentially also a green algal) endosymbiont, and (3) diverse bacteria via HGT. As green and red algae are subgroups of the Archaeplastida that arose from the primary endosymbiosis event, diatom genomes also contain cyanobacterial genes and genes from their hosts (Figure 3; Petersen et al., 2014).
A comparison of several diatom genomes with those of other taxonomic groups confirmed that they include a significant proportion of species-specific genes (e.g., 26% of P. tricornutum genes) among smaller proportions of "core diatom-specific genes" (not present in nondiatom genomes, 15.8%) and genes shared between diatoms and other stramenopiles (5.3%; Rastogi et al., 2018). Comparative genomics in the green lineage revealed similar proportions of genes across different species. For instance, ~15% of genes of the model chlorophyte C. reinhardtii are considered to be species specific (Merchant et al., 2007).
Repetitive elements such as transposable elements (TEs) are some of the main evolutionary drivers affecting genome structure and gene expression in diatoms. For instance, TEs make up less than 3% of the T. pseudonana genome but more than 30% of the F. cylindrus genome and 73% of the P. multiseries genome (Basu et al., 2017), possibly reflecting differences in diversification and adaptation. Chlorophyte genomes show less variability in TE content, which varies between 6 and 7% (Philippsen et al., 2016). Retrotransposons flanked with long terminal repeats appear to be the most abundant TEs in diatom genomes. Some P. tricornutum-specific Ty1/copia-like retrotransposons are strongly induced when nitrate is depleted, and cytosine methylation is thought to control the mobility of these TEs, as they are hypomethylated when expressed (Veluchamy et al., 2015). Long terminal repeat retrotransposons might induce homologous recombination between highly similar elements, leading to homologous and nonhomologous recombinant genomic loci, including gene-containing flanking regions. The detection of de novo insertions of these elements into genomic loci in different P. tricornutum accessions supports the hypothesis that transposition contributes to rearrangements and diversification of diatom genomes, which might be adaptive, as there is conditional activation of TEs. How TE-mediated recombination contributes to the evolution of genomes in terms of the accumulation and spread of de novo single nucleotide polymorphisms in a population depends on the selective advantage conferred by the mutation. Depending on the strength of selection and the population structure, this might lead to selective sweeps. Experimental evolution under different types of selection pressure combined with population genomics based on single-cell genomes will shed light on how mutations come to prominence in diatom populations and therefore become established in the genome. Such work will also reveal the temporal dynamics of the fixation of mutations and provide fundamental insights into the pace of genetic adaptation (e.g., Krasovec et al., 2019) under changing environmental conditions in the ocean.
An additional influence on diatom genomes is the acquisition of genes via HGT. Recent data have estimated that 2 to 6% of diatom genes are of bacterial origin (Basu et al., 2017). However, long-read sequencing technology, which was not applied to T. pseudonana or P. tricornutum, will reveal how many of these genes were derived from contaminating bacteria. A recent genome project with 10 red algal species based on long-read sequencing technology revealed that only ~1% of genes were horizontally transferred (Rossoni et al., 2019). However, based on the currently available HGT data from diatoms, ~50% of these genes were acquired before the separation of the main lineages ~150 million years ago. How genes are transferred into diatom genomes is still unknown. Foreign DNA with a lower GC content can be maintained in P. tricornutum as episomes. Since diatom centromeres have a low GC content (Diner et al., 2017), perhaps low GC foreign DNA in diatoms could recruit specific chromosome maintenance proteins such as histones. This hijacking mechanism might thus favor the acquisition of foreign DNA (Diner et al., 2017). Diatoms have a long history of entangled relationships with bacteria for the acquisition of essential nutrients including vitamins, and bacterial conjugation is effective in transferring genetic material by direct cell-to-cell contact between bacteria and diatoms. Therefore, it is reasonable to assume that their inter-kingdom co-evolution contributed to shaping the structure and organization of diatom genomes.

In addition to TEs and bacterial genes acquired via HGT, gene duplication accompanied by neofunctionalization significantly contributes to the level of innovation in diatom genomes. Recent comparative analyses (Basu et al., 2017) have provided the most comprehensive insights into the role of gene family expansions in the evolution of diatom genomes. In general, diatoms contain significantly more genes encoding high mobility group boxes, heat-shock transcription factors, and cyclins compared to other phototrophs (Basu et al., 2017). Species-specific gene family expansions have also been detected. For example, the expansion of ice binding protein family genes in F. cylindrus is thought to help these diatoms cope with frequent freezing in polar oceans. Whole-genome duplication appears to play a role in diatom evolution, diversification, and adaptation, especially in Thalassiosirales and several pennate diatoms, such as those in the genus Gyrosigma (Parks et al., 2018).
Genome duplication also plays a significant role in plant evolution, but not in the evolution of green algae, as they appear to be haploid or diploid depending on their life cycle stage (Qiao et al., 2019). The mechanisms inducing whole-genome duplication are less clear. The pennate diatom F. solaris appears to be allodiploid, which may be a common mode of polyploidization based on hybridization events in distant parental lineages (Tanaka et al., 2015b). However, for other pennate lineages, low hybrid viability is commonly observed, which minimizes the evolutionary success of allopolyploidy (Amato and Orsini, 2015). Much higher hybrid viability is seen with autopolyploidy, which is often caused by meiotic nonreduction (Mann, 1994; von Dassow et al., 2008). However, there is insufficient genomic evidence to reliably assess the ploidy levels in different diatom species. We currently know little about the occurrence and potential significance of aneuploidy in diatoms. If it occurs in addition to polyploidy, it could potentially generate exceptional allelic diversity, which could be exploited experimentally to test the potential fitness effects of mutations. To reveal the significance of any type of ploidy, it will be crucial to investigate how the different alleles are expressed and if specific conditions drive the evolution of ploidy. The recent sequencing of the F. cylindrus genome showed that ~25% of genetic loci have highly divergent alleles, which are differentially expressed under various environmental conditions. Alleles with the highest ratios of nonsynonymous to synonymous substitutions showed the most pronounced levels of differential expression, which suggests they resulted from adaptive evolution. Results from recent genome projects with several Picochlorum (Chlorophyta) species (Foflonker et al., 2018) suggest that significant haplotype divergence is not restricted to polar diatoms. It may therefore have evolved independently in some algal lineages as a mechanism to adapt to variable environments.
The mechanisms controlling gene expression in diatoms are still largely unknown. Although diverse transcription factor (TF) families have been identified from diatom gene sequence analysis, to date, only a few studies have confirmed TF activity and TF DNA binding sites in diatoms (Banerjee et al., 2016; Matthijs et al., 2017). More generally, cis- and trans-acting regulatory elements controlling gene expression are still largely unknown in diatoms due to poor sequence conservation. Some novel information is currently emerging from computational analyses of large transcriptome data sets (Ashworth et al., 2016).
In plants and green algae, epigenetic mechanisms regulating gene expression, including DNA methylation and interfering small non-coding RNAs (sRNAs), significantly contribute to acclimation and evolution (Law and Jacobsen, 2010; Lopez et al., 2015; Kronholm et al., 2017). Although diatom epigenetics is still in its infancy, there is preliminary evidence that cytosine methylation contributes to the regulation of gene expression and silencing of repetitive sequences such as TEs (Tirichine et al., 2017). Canonical microRNAs (miRNAs) that control gene expression in plants (Law and Jacobsen, 2010) and the green alga Chlamydomonas (Molnár et al., 2007) have not been found in diatoms (Lopez-Gomollon et al., 2014; Rogato et al., 2014). However, a large and diverse group of canonical miRNAs has been discovered in the multicellular brown alga E. siliculosus (Tarver et al., 2015), suggesting that miRNAs independently evolved in different heterokont lineages. Although diatoms lack canonical miRNAs, various classes of 21- to 30-nucleotide sRNAs have been characterized in both P. tricornutum and T. pseudonana (Rogato et al., 2014; Lopez-Gomollon et al., 2014). These include known noncoding RNA classes such as very abundant tRNA-derived sRNAs, splicing sRNAs (e.g., U2 small nuclear RNA), and highly expressed transcripts of undefined origin. In P. tricornutum, most sRNAs are 25 to 30 nucleotides long and map to repetitive and silenced TEs marked by DNA methylation. This suggests that epigenetic mechanisms drive RNA-dependent DNA methylation of TEs, as observed in plants and metazoans. However, unlike in plants, sRNAs in diatoms target DNA methylated protein-coding genes (Rogato et al., 2014). A multitude of long intergenic noncoding RNAs have also been identified in P. tricornutum that respond to phosphorus depletion (Cruz de Carvalho et al., 2016).
The mechanisms underpinning the biogenesis and regulation of noncoding RNAs and the dynamic control of chromatin marks in diatoms are still largely unknown. Studies aimed at characterizing key components of plant epigenetic pathways identified in diatom genomes, such as the Dicer-, Argonaute-, and Polycomb-based systems (De Riso et al., 2009; Tirichine et al., 2017), should help to reveal how epigenetics influences the adaptation and evolution of diatoms in their aquatic environment.

Photosynthesis and Energy Production
Diatoms usually dominate algal blooms (Boyd et al., 2007), and their photosynthesis is highly efficient under dynamic light regimes (Wagner et al., 2006). Although the general biochemical structures and functions of the complexes involved in oxygenic photosynthesis are conserved in most phototrophs, important differences have been identified in diatoms, including differences in pigment composition, the spatial organization of photosynthetic complexes, the response to high-light conditions, and the integration of photosynthetic activity with cellular metabolism. We are only starting to discover unique features of the photosynthetic process in diatoms in terms of how they acclimate to the dynamic physicochemical environment of aquatic systems. On the other hand, the shared photosynthetic features between diatoms and plants or green algae tell us about the universal constraints that apply to the photosynthetic electron transfer chain.
Due to their origins from secondary endosymbiosis (Cavalier-Smith, 2003), diatom plastids are enclosed by four membranes, unlike plastids in the green lineage, which are surrounded by only two membranes. The outermost chloroplast membrane in diatoms is connected to the ER (Figure 4). Diatom plastids contain stacks of three thylakoids that run through the entire organelle. Immunolocalization, three-dimensional plastid reconstruction, and in vivo spectroscopy in P. tricornutum have revealed the segregation of PSI and PSII: PSI is found mostly within the peripheral membranes facing the stroma, and PSII is located within the core membranes at the junction between two thylakoid membranes (Pyszniak and Gibbs, 1992; Flori et al., 2017). This structural organization, in contrast to that found in plants (with grana stacks interspaced by stroma lamellae), has important implications for how light capture and photosynthetic electron flow are regulated in these microalgae. The distance between the two photosystems is short, as they face each other in the two external thylakoids, and there are physical connections between thylakoids (Flori et al., 2017). Therefore, the diffusion of mobile electron carriers from PSII to PSI does not limit photosynthetic electron flow even under high-light irradiance, which is not the case in plants (Kirchhoff et al., 2004). This unusual segregation of the two photosystems prevents the loss of efficiency of the photosystems via possible spillover of excitons from PSII to PSI (Flori et al., 2017), as found in cyanobacteria and red algae (Biggins and Bruce, 1989). This organization might also explain why state transitions, which are observed in the green lineage when light harvesting complexes move from PSII to PSI, are absent in diatoms (Owens, 1986).
In terms of pigments, diatoms use chlorophyll (Chl) c and Chl a, and the main carotenoids are β-carotene, fucoxanthins (Fx), and diadinoxanthin/diatoxanthin. The fucoxanthin Chl a/c binding protein (FCP), which belongs to the family of the Light Harvesting Complex (LHC) proteins, binds seven Chl a, two Chl c, seven Fx, and likely one diadinoxanthin within the protein scaffold. FCPs are divided into three groups (Gundermann and Büchel, 2014): LHCF (the main light-harvesting antenna), LHCR (specific to PSI), and LHCX (involved in photoprotection, see below). A specific member of the FCP family (LHCF15) is involved in acclimation to far-red light conditions, at least in P. tricornutum (Herbstová et al., 2015, 2017). PSII-FCP forms a homodimer, with two FCP homotetramers and three FCP monomers per monomer, forming a complicated protein-pigment network that is different from its counterpart in the green lineage (Nagao et al., 2019; Pi et al., 2019). The structural arrangement between pigments in FCPs ensures efficient energy transfer from Fx to Chl, optimizing the use of the blue-green light spectrum, which is enriched in the aquatic environment and absorbed by Fx. This arrangement might also allow for the rapid conversion to a dissipative state under high light stress.
Indeed, diatoms continuously cope with highly dynamic light environments and the risk of PSII inactivation under excessive light. Diatoms respond to PSII photoinactivation by using a PSII repair cycle mediated by FtsH protease complexes (Campbell et al., 2013). Diatoms accumulate significant pools of PSII repair cycle intermediates (Wu et al., 2012) even in darkness (Li et al., 2017). In addition, diatoms can rapidly adjust PSII activity to light conditions by dissipating the energy from excess photons as heat (Lavaud et al., 2002; Wilhelm et al., 2006; Lepetit et al., 2012). This process is usually visualized in vivo as a quenching of Chl a fluorescence, hereafter called nonphotochemical quenching (NPQ), which is up to five times more efficient in diatoms than in land plants (Ruban et al., 2004). Less is known about the regulation of NPQ in diatoms than in green algae and land plants, but there have been some interesting findings (Goss and Lepetit, 2015). NPQ capacity in diatoms is dependent on a reversible xanthophyll cycle, which converts diadinoxanthin to diatoxanthin under high light in a single de-epoxidation step, as opposed to the two steps in plants (Stransky and Hager, 1970). Specialized members of the LHC family, LHCX proteins, are required for NPQ in diatoms (Bailleul et al., 2010; Lepetit et al., 2017; Buck et al., 2019), as demonstrated for the PsbS protein in Arabidopsis (Li et al., 2000) and LHCSR (light harvesting complex stress related) proteins in Chlamydomonas (Peers et al., 2009). LHCX1 is essential for responses to high-light conditions (Bailleul et al., 2010; Buck et al., 2019), acting as a modulator of NPQ, and it contributes to the phenotypic plasticity in diatoms underpinning adaptive evolution (Bailleul et al., 2010). The constitutive expression of LHCX1 enables a high basal NPQ capacity in P. tricornutum, as also observed in T. pseudonana and C. cryptica/meneghiniana (Zhu and Green, 2010).
In addition to LHCX1, diatoms possess high-light-induced LHCX2 and LHCX3 proteins, which, like Chlamydomonas LHCSRs (Peers et al., 2009), help regulate NPQ under high-light stress (Buck et al., 2019). The quenching of Chl fluorescence (the dissipation of excess energy in the form of heat) is thought to occur in two different locations within PSII (Chukhutsina et al., 2014; Kuzminov and Gorbunov, 2016; Taddei et al., 2018). The various LHCX proteins might contribute differently to these two quenching sites. Diatom LHCX family members are differentially expressed in response to nutrient stress, and more intriguingly, under prolonged periods of darkness (Taddei et al., 2016; Lepetit et al., 2017). Thus, perhaps the expansion of the LHCX family, especially in the polar diatom F. cylindrus (containing 11 LHCXs; Mock et al., 2017), reflects functional diversification, which could help diatoms respond to variable environmental conditions, such as strong seasonal changes (Taddei et al., 2016). The pioneering work on diatom LHCXs lays the foundation for more general studies investigating the roles of these proteins in the ecophysiological adaptation of marine microalgae to dynamic environments, as LHCX or LHCX-like genes have been identified in other clades with secondary plastids, such as haptophytes and alveolates (Giovagnetti and Ruban, 2018).
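As a brief illustration of how the NPQ values discussed above are typically derived from Chl a fluorescence, the sketch below uses the widely used Stern-Volmer form, NPQ = (Fm - Fm') / Fm', where Fm is the maximal fluorescence of a dark-acclimated sample and Fm' the maximal fluorescence under actinic light. The function name and the fluorescence readings are hypothetical and chosen only for illustration; they are not measurements from any of the cited studies.

```python
def npq(fm_dark: float, fm_light: float) -> float:
    """Stern-Volmer NPQ = (Fm - Fm') / Fm'.

    fm_dark  : maximal fluorescence after dark acclimation (Fm)
    fm_light : maximal fluorescence under actinic light (Fm')
    """
    if fm_light <= 0:
        raise ValueError("Fm' must be positive")
    return (fm_dark - fm_light) / fm_light


# Hypothetical readings in arbitrary units (illustrative, not measured data).
fm = 1000.0
fm_prime_series = [1000.0, 500.0, 250.0, 200.0]

# NPQ rises as Fm' is quenched during high-light exposure.
npq_series = [npq(fm, fm_prime) for fm_prime in fm_prime_series]
print(npq_series)
```

Because NPQ is a ratio of fluorescence yields, it is unitless and can be compared between strains or mutants as long as Fm is recorded under equivalent dark-acclimated conditions.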
Light-dependent photosynthetic reactions lead to the biosynthesis of ATP and NADPH at a ratio lower than that required for C-fixation, resulting in an ATP deficit. In the green lineage, this deficit is mostly overcome by the production of additional ATP through cyclic electron transport around PSI (Allen, 2002). Diatoms cope with this deficit by "meta-cyclic" electron flow, where the ATP/NADPH ratio is increased by two processes: (1) the export of NADPH to mitochondria and (2) the subsequent import of ATP from mitochondria to plastids (Bailleul et al., 2015). While exchanges of NADPH and ATP between the two organelles have also been observed in the green lineage (e.g., Lemaire et al., 1988; Yoshida et al., 2007; Joliot and Joliot, 2008), the observation that this process is constitutive in diatoms is unprecedented. One recent hypothesis suggests that this energetic coupling involves a strong functional connection between mitochondrial glycolysis, an unusual metabolic configuration in stramenopiles (Liaud et al., 2000; Kroth et al., 2008; Río Bártulos et al., 2018), and the Calvin-Benson-Bassham cycle (Smith et al., 2016). Additional characterization of mitochondrial alternative oxidase (AOX) knockdown lines also showed a hypersensitivity to stress, indicating that the coupling between the two organelles is essential for diatom sensitivity to changing environments (Murik et al., 2018). Regardless of its exact mechanism, the coupling between mitochondria and plastids requires the displacement of metabolites between the two organelles. How this exchange takes place despite the presence of four membranes surrounding the plastid remains an enigma.
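The size of the ATP deficit mentioned above can be estimated with simple bookkeeping. The sketch below assumes textbook stoichiometries that are not stated in this review: 12 protons translocated per 2 NADPH formed by linear electron flow, a chloroplast ATP synthase with 14 c-subunits (the value known from plants, assumed here to apply to diatoms), and a demand of 3 ATP and 2 NADPH per CO2 fixed by the Calvin-Benson-Bassham cycle. All constant names are hypothetical.

```python
from fractions import Fraction

# Assumed textbook stoichiometries (not taken from this review).
H_PER_2_NADPH = 12          # protons translocated per 4 electrons (2 NADPH)
C_SUBUNITS = 14             # c-ring size of plant chloroplast ATP synthase (assumed for diatoms)
ATP_PER_ROTATION = 3        # ATP made per full rotation of the c-ring
ATP_DEMAND_PER_CO2 = 3      # Calvin-Benson-Bassham cycle demand per CO2
NADPH_DEMAND_PER_CO2 = 2

# Protons needed per ATP, and ATP yield of linear electron flow per 2 NADPH.
h_per_atp = Fraction(C_SUBUNITS, ATP_PER_ROTATION)   # 14/3
atp_per_2_nadph = Fraction(H_PER_2_NADPH) / h_per_atp

supply_ratio = atp_per_2_nadph / NADPH_DEMAND_PER_CO2
demand_ratio = Fraction(ATP_DEMAND_PER_CO2, NADPH_DEMAND_PER_CO2)
deficit_per_co2 = ATP_DEMAND_PER_CO2 - atp_per_2_nadph

print(f"ATP/NADPH supplied by linear flow: {float(supply_ratio):.3f}")
print(f"ATP/NADPH required for C fixation: {float(demand_ratio):.3f}")
print(f"ATP shortfall per CO2 fixed:       {float(deficit_per_co2):.3f}")
```

Under these assumptions, linear electron flow alone falls short of the ATP demand of C fixation, which is the gap that cyclic electron transport (green lineage) or plastid-mitochondrion exchange (diatoms) is proposed to close. Different c-ring sizes or proton stoichiometries would change the numbers but not the direction of the imbalance.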
Another fundamental issue is the role of luminal pH in regulating photosynthetic electron flow in diatoms. In the green lineage, the proton concentration in the lumen is a master regulator that adjusts photosynthetic electron transfer based on the availability of light and metabolic requirements. An acidic lumen can slow electron transport at the cytochrome b6f level through a process called photosynthetic control (Foyer et al., 1990). In the green lineage, luminal pH also controls PSII activity by regulating the two main actors of NPQ: the xanthophyll cycle, which regulates the concentration of the quencher diatoxanthin, and PSII subunits PsbS and LHCSR, whose protonation induces conformational changes (Horton, 2012). In diatoms, as in plants, de-epoxidase activity (and therefore NPQ) is modulated by the proton gradient (Caron et al., 1987). However, a role for pH-induced conformational changes to PSII subunits in NPQ is unlikely given the linear relationship between diatoxanthin and NPQ (Lavaud et al., 2002; Goss et al., 2006; Barnett et al., 2015).
The biogenesis of the diatom photosynthetic apparatus remains largely unexplored. The proteins of the photosynthetic apparatus are composed of subunits encoded in the plastid and nucleus, like those of the green lineage. It is well established that tight cooperation between the nuclear and chloroplast genetic systems is critical for the stoichiometric biogenesis and complex assembly of the photosynthetic machinery in green algae and plants (Woodson and Chory, 2008; Choquet and Wollman, 2009). No such information is currently available for secondary endosymbionts such as diatoms. Genomic analysis has suggested that plastid-targeting systems function in the delivery of nuclear-encoded plastid proteins in diatoms, first through the ER and then across the four plastid membranes (Gruber et al., 2015). Plastid proteins are transported from the ER to the periplastidial compartment via an ER-associated protein degradation-derived translocon (Hempel et al., 2009) and to the stroma via systems similar to the plant Toc and Tic translocons (reviewed in Gruber and Kroth, 2017; Maier et al., 2015). Plastid-targeted proteins possess N-terminal bipartite presequences consisting of a signal peptide and transit peptide, with a conserved ASAFAP motif at the signal peptide cleavage site (Table 2; Gruber et al., 2015), which are relatively well conserved in diatoms compared to the corresponding plant presequences. However, our knowledge of the mature N-terminal sequences of these plastid-targeted proteins is far less advanced (Huesgen et al., 2013). The protein import mechanisms to different plastid compartments are still largely undefined, and novel systems regulating the photosynthetic process are expected to be found.
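To make the idea of a conserved presequence motif concrete, the following minimal sketch scans the N-terminal region of a protein sequence for the window that best matches the ASAFAP consensus. It is a deliberately simplified stand-in for dedicated prediction tools: the identity-count scoring, the 60-residue search window, the function name, and the example sequence are all hypothetical choices made for illustration, and a real predictor would also evaluate the signal peptide and transit peptide themselves.

```python
CONSENSUS = "ASAFAP"  # motif at the signal peptide cleavage site (from the text)


def best_motif_hit(protein: str, search_window: int = 60) -> tuple[int, int]:
    """Return (start_index, identities) of the window in the N-terminal
    region that best matches CONSENSUS, or (-1, -1) if the region is
    shorter than the motif. Ties are resolved in favor of the earliest hit."""
    region = protein[:search_window].upper()
    width = len(CONSENSUS)
    best_pos, best_score = -1, -1
    for start in range(len(region) - width + 1):
        window = region[start:start + width]
        score = sum(a == b for a, b in zip(window, CONSENSUS))
        if score > best_score:
            best_pos, best_score = start, score
    return best_pos, best_score


# Invented toy sequence (not a real diatom protein).
toy_presequence = "MKFALLSTLALVGSASAFAPSTRSSLNMAEEK"
position, identities = best_motif_hit(toy_presequence)
print(position, identities)
```

A hit with all six identities marks a candidate cleavage-site motif; lower scores would need a threshold that is best calibrated against experimentally validated plastid proteins.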
We propose two research directions that might advance our understanding of diatom photosynthesis. First, setting up appropriate genetic resources in diatom strains that can be grown both autotrophically and heterotrophically, such as C. cryptica and Cylindrotheca fusiformis, could be a powerful way to dissect plastid regulation, as has been achieved in C. reinhardtii (Salomé and Merchant, 2019). Genetic investigations into photosynthetic processes have been limited by the obligate phototrophy of P. tricornutum and T. pseudonana, in which mutations that abolish photosynthesis are lethal, so the corresponding mutants cannot be characterized. Second, the increasing amount of genomic information from diverse diatom species and from other phytoplankton can be used to identify proteins associated with photosynthetic functions in these organisms (Dorrell et al., 2017). Inspired by the GreenCut inventory that used the Chlamydomonas genome as a reference to identify proteins conserved in green lineage organisms (Merchant et al., 2007), a catalog using diatoms as the reference could be exploited to reconstruct the plastid proteomes of stramenopiles. This would in turn help us identify regulators of diatom photosynthesis, thereby extending the current view of photosynthetic constraints in eukaryotes.

Nutrient Metabolism
Nutrient uptake and assimilation by prominent primary producers such as diatoms are particularly relevant because they influence the entire marine food web and global biogeochemical cycling of key elements such as carbon and nitrogen (Falkowski, 2015). Furthermore, due to their silicified cell walls, diatoms are also major players in the silicon cycle, and silicon is one of the most abundant elements in the Earth's crust. Despite its relevance, the regulation of silica metabolism is still largely uncharacterized. Diatoms adapt to changes in silicic acid availability in part by altering the expression of genes encoding silicon transporter proteins (Hildebrand et al., 1997; Thamatrakoln and Hildebrand, 2008). Novel putative regulators of this important process have been identified by whole-genome expression profiling of T. pseudonana (Supplemental Table 2). Common sets of genes are induced by both silicate and nutrient limitation (Supplemental Table 2; Mock et al., 2008), suggesting that the pathways involved in cell cycle regulation, silica deposition, and nutrient metabolism are strongly coupled. In this section, we focus on the regulation of carbon, nitrogen, and iron metabolism, as recent studies on these topics have helped unveil some peculiar adaptations of diatoms to the marine environment (Figure 4).

Carbon Fixation
Like other aquatic phototrophs, diatoms must overcome the problem of low CO2 concentrations in water (10 to 30 µM); these concentrations are much lower than those required to saturate the CO2-fixing enzyme ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) in diatoms (Reinfelder, 2011). Physiological and molecular investigations have shown that the carbon concentration mechanism (CCM) in diatoms is critical for the efficient uptake and assimilation of inorganic carbon (Ci; both CO2 and HCO3−) from the environment (see Hopkinson et al., 2011; Tsuji et al., 2017; Schoefs et al., 2017, for a detailed overview on this topic; and Figure 4). Except for some highly conserved proteins, the CCM in diatoms is different from that of cyanobacteria and green algae (Wang et al., 2015; Kaplan, 2017; Tomar et al., 2017), supporting the hypothesis that many of the proteins and metabolic pathways of the CCM independently evolved in different lineages of unicellular photosynthetic organisms. Diatoms do not contain homologs of Ci transporters from green algae (e.g., the Chlamydomonas LCI1 and HLA3; Ohnishi et al., 2010; Yamano et al., 2015). However, 10 putative HCO3− transporters phylogenetically related to those from metazoans are encoded in the genomes of model diatom species, including the solute carrier SLC4, a major plasmalemma bicarbonate transporter of functional relevance under low-CO2 conditions (Nakajima et al., 2013; Shen et al., 2017). The existence of additional plastid-localized HCO3− transporters has been hypothesized based on the CCM model (Hopkinson, 2014), but no such transporters have thus far been characterized.
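The gap between ambient CO2 and Rubisco saturation can be illustrated with a simple Michaelis-Menten estimate, v/Vmax = [CO2] / (Km + [CO2]). In the sketch below, the half-saturation constant of 40 µM is a purely illustrative assumption rather than a value reported in this review, the ambient range quoted above is read as micromolar, and the effects of O2 competition are ignored. The function and constant names are hypothetical.

```python
def rubisco_saturation(co2_um: float, km_um: float) -> float:
    """Fraction of Vmax reached at a given CO2 concentration
    (simple Michaelis-Menten kinetics, no O2 competition)."""
    if co2_um < 0 or km_um <= 0:
        raise ValueError("concentrations must be non-negative and Km positive")
    return co2_um / (km_um + co2_um)


KM_ASSUMED_UM = 40.0  # illustrative half-saturation constant, not a measured value

# Ambient range quoted in the text, plus a hypothetical CCM-elevated level.
for co2_um in (10.0, 30.0, 400.0):
    fraction = rubisco_saturation(co2_um, KM_ASSUMED_UM)
    print(f"{co2_um:6.1f} uM CO2 -> {fraction:.0%} of Vmax")
```

With these assumed numbers, Rubisco would operate well below half of its maximal rate at ambient CO2, whereas an elevated CO2 concentration around the enzyme, as a CCM is thought to provide, brings it close to saturation.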
Like other photosynthetic organisms, diatoms possess multiple carbonic anhydrase (CA) subclasses that convert (1) HCO3− to CO2 at the cell surface to facilitate diffusive uptake, (2) CO2 to HCO3− in the cytosol to limit CO2 diffusion out of the cells and to facilitate its transport to the plastid, and (3) HCO3− in the stroma to CO2 close to Rubisco in the pyrenoid. CA variants have also been found in some diatom species (Young and Hopkinson, 2017; e.g., Figure 4 for P. tricornutum and T. pseudonana), an indication of how the CCM process has further diversified in these microalgae. The θ-type CA in P. tricornutum (Kikutani et al., 2016), which is targeted to the lumen of the pyrenoid-penetrating thylakoid, is essential for efficient photosynthesis and growth. Interestingly, in green algae, the final conversion of HCO3− to CO2 by the α-type CAH3 CA also occurs within the thylakoids that penetrate the pyrenoid (Mitra et al., 2004), suggesting that the tight association between CCM in the pyrenoid and thylakoid membrane function was a driving force in the independent evolution of photosynthesis in distant aquatic phototrophs. New CA metallovariants that use cadmium or cobalt as cofactors, rather than the commonly used zinc, have also been described in diatoms (Lane et al., 2005; Lionetto et al., 2016). The ability to use different metal cofactors may facilitate diatom growth in the ocean, where large areas contain low levels of trace metals. This hypothesis is supported by the recent discovery of the diatom ι-CA, which uses manganese as a cofactor (Jensen et al., 2019). The sequence of this CA variant shares only low identity with known CAs, but its ecological relevance is supported by its wide distribution in diatoms and other phytoplanktonic algae, bacteria, and archaea and its prevalence in metagenomic samples from marine environments (Jensen et al., 2019).
Overall, comparative analysis of diatom genomes has revealed that diatoms have more genes that take part in C metabolism than green algae (Kroth et al., 2008; Smith et al., 2012). Moreover, diatom species differ considerably in terms of the intracellular localization, gene duplication, or deletion of some enzymes of the C-partitioning pathway (e.g., some pyruvate hub enzymes; Smith et al., 2012). It is likely that these enzymes acquired specialized metabolic functions in different species as diatoms diversified. Diatoms also contain genes that are potentially involved in C4-like photosynthesis (Kroth et al., 2008). Well described in land and aquatic plants, this pathway favors the formation of organic molecules with four C atoms, which increase local CO2 concentrations when decarboxylated near Rubisco. The existence of the C4 pathway in diatoms was originally proposed in Thalassiosira weissflogii (Reinfelder et al., 2000) and then in T. pseudonana (Kustka et al., 2014) based on physiological and biochemical analyses, but functional characterization of the putative C4 genes does not support these hypotheses. The silencing of the essential pyruvate-orthophosphate dikinase (PPDK) gene of C4 metabolism (Haimovich-Dayan et al., 2013) did not alter Ci acquisition in P. tricornutum. However, perhaps this pathway aids in the dissipation of excess energy and pH homeostasis. The intracellular localization of all putative C4-like proteins in P. tricornutum (Ewe et al., 2018) and T. pseudonana (Tanaka et al., 2014) is quite different from the distribution of the core C4 pathway enzymes in plants. In particular, the lack of decarboxylases in the diatom plastid would prevent a functional C4 pathway from operating as a CCM. The roles of diatom C4-like genes and many other genes putatively involved in C acquisition and metabolism are still poorly understood. We anticipate that their functional characterization in different model species will reveal novel regulatory processes that explain how diatoms became such successful primary producers in the contemporary ocean.

Nitrogen Assimilation
Diatoms quickly assimilate NO3−, the most stable, abundant ionic form of inorganic N in the ocean. They also quickly assimilate NO2−, NH4+, and organic forms of N such as urea and diverse amino acids, making them highly competitive in nitrate-rich environments such as coastal upwelling systems and the Southern Ocean.
Diatoms contain many genes encoding N transporters, including high- and low-affinity nitrate (NRT2/NPF) and ammonium (AMT) transporters and a urea transporter that is not found in green algae or plants (Rogato et al., 2015). Meta-omics analysis has unveiled complex genetic diversification of these transporters, which is likely shaped by environmental N abundance (Busseni et al., 2019). Recent characterization of plasma membrane-type aquaporins (PtAQP1 and PtAQP2) from P. tricornutum also defined novel CO2/NH3 channels (Figures 4A and 4B) (Matsui et al., 2018), which control the cell's permeability to these nutrients. PtAQP1 and PtAQP2 also participate in regulating NPQ under high light, likely mitigating the potentially toxic effects of high concentrations of ammonia on the photosynthetic apparatus by facilitating NH3 efflux.
As observed in other algal lineages (Breuer et al., 2012; Merchant et al., 2012), N starvation triggers intracellular N recycling in diatoms and redirects C from storage carbohydrates to C-rich compounds such as lipids. A negative effect of N depletion is the degradation of pigments, which is detrimental to photosynthetic capacity, C fixation, and ultimately cell division and growth (Kolber et al., 1988; Allen et al., 2008; Levitan et al., 2015a). Transcriptomic, proteomic, and metabolic profiling of diatoms responding to N limitation revealed an extremely pronounced remodeling of lipid metabolism compared to other phototrophs (Supplemental Table 2). Functional characterization of key regulators of N metabolism such as the nitrate reductase gene (NR) helped confirm this finding. Initial analysis of P. tricornutum NR knockdown lines revealed that under N starvation, 40% more C is redirected toward the biosynthesis of storage triacylglycerols in the mutants compared to wild-type lines (Levitan et al., 2015b), suggesting that these important metabolic changes are regulated by the N assimilation pathway. NR knockout lines cannot grow in medium with NO3− as the sole N source because N assimilation is abolished (McCarthy et al., 2017), but they continue to import NO3− despite growth arrest, leading to NO3− accumulation inside the vacuole, likely via the action of recently identified vacuolar NO3− transporters (McCarthy et al., 2017). NO3− accumulation in the vacuole is thought to help diatoms cope with periods of nitrate scarcity and even suboptimal light conditions.
The identification of a metazoan-type ornithine-urea cycle in diatoms, previously only known for its role in N excretion, is one of the most surprising discoveries from diatom genomic analysis (Armbrust et al., 2004). The study of one of these proteins, carbamoyl phosphate synthase III (unCPS; Allen et al., 2011), which is targeted to the mitochondria in P. tricornutum, has elucidated some of the functions of the urea cycle in these algae. Metabolic analysis of CPSIII knockdown cell lines showed that intermediate products of the urea cycle are particularly depleted in these lines. In particular, CPSIII regulates the incorporation of NH4+ and HCO3− into carbamoyl phosphate, which is subsequently converted into other urea cycle intermediates and products, such as Arg and the signaling molecule nitric oxide (NO). Down-regulation of CPSIII also impairs the response of N-limited diatoms to N addition, suggesting that the urea cycle contributes to the rapid recovery from prolonged N limitation. It has therefore been proposed that the urea cycle plays a key role in controlling and redistributing C and N fluxes into the cell based on N availability. This pathway is strongly linked to the tricarboxylic acid (TCA) and glutamine synthetase/glutamate synthase cycles due to the shared pools of precursor metabolites. Since their discovery in diatoms, CPS genes have been found in multicellular brown algae and haptophytes, supporting the hypothesis that a urea cycle was inherited from the common exosymbiont of the cryptophyte, alveolate, stramenopile, and haptophyte lineage. In diatoms, CPSIII down-regulation also affects the biosynthesis of the major osmolyte Pro and the precursors of long-chain polyamines required for cell wall formation (Kröger et al., 2000). Interestingly, some of these metabolites are generated from ornithine-urea cycle intermediates by the products of genes of bacterial origin. From an evolutionary perspective, the coupling of proteins derived from bacteria and exosymbionts likely allowed the functionality of the urea cycle to expand in diatoms toward processes that are critical for their growth and physiological acclimation to variable environments. It is important to mention that homologs of ornithine-urea cycle enzymes such as CPS and Orn carbamoyltransferase are also found in green algae and plants. However, these proteins are localized to the plastid as part of the Arg biosynthesis pathway and are phylogenetically distinct from the diatom enzymes (Winter et al., 2015).

Iron
Iron, which functions as an electron carrier in iron-sulfur proteins and hemoproteins, plays essential roles in photosynthesis (PSI, ferredoxin, cytochrome b6f) and other metabolic processes (TCA cycle, respiration, nutrient assimilation, and so on) in all phototrophs. In marine ecosystems, iron is the prime limiting factor for phytoplankton growth, as large phytoplankton blooms have been induced in high-nutrient low-chlorophyll regions in several iron fertilization experiments (Boyd et al., 2007). Diatoms often dominated these blooms, suggesting they have efficient strategies for iron uptake and a subsequent growth response that enable them to outcompete other algal groups.
Different coastal and oceanic diatom species employ a wide variety of physiological strategies to deal with iron availability. Diatoms possess diverse high-affinity reductive uptake systems, and like vascular plants, they can reduce ferric iron, the predominant form of iron in the environment, before or during uptake into the cell (Allen et al., 2008; Marchetti et al., 2009; Groussman et al., 2015). Many diatom species contain high-affinity ferric reductases (FREs) that dissociate iron (III) from ligands, multicopper oxidases that oxidize Fe2+ to Fe3+, and permease (FTR) systems that receive iron (II) for translocation across membranes (Figure 4B; Groussman et al., 2015), as observed in green algae such as Chlamydomonas (Glaesener et al., 2013). Several species can also take up ferric iron without reducing it. The direct intracellular uptake of siderophores was recently demonstrated in several different diatom species (Kazamia et al., 2018), as Lesuisse et al. (1998) and Chu et al. (2010) previously described in bacteria and fungi. In P. tricornutum, siderophores are taken up by interacting with the cell surface, followed by endocytosis and delivery to the plastid. This siderophore uptake system is highly efficient, with an affinity thought to be higher than in any other eukaryote (Kazamia et al., 2018). Therefore, it is likely that this system strongly contributes to the assimilation of iron, especially in regions where iron is very scarce.
Once iron enters the cytosol of diatoms, it is allocated to iron-based metabolism, but its intracellular content must be tightly regulated because free iron can trigger the production of harmful reactive oxygen species (ROS). ROS can irreversibly damage essential Fe-S-containing proteins such as those that function in photosynthesis. Ferritin is thought to be crucial for maintaining iron homeostasis in diatoms (Groussman et al., 2015) by sequestering iron and providing an efficient iron storage system. For instance, ferritin in the pennate diatom Pseudo-nitzschia granii was thought to confer a competitive advantage over non-ferritin-containing algae, as the stored iron in P. granii corresponded with a prolonged growth phase and extended diatom blooms (Marchetti et al., 2009). However, recent data indicate that diatom ferritin is optimized for initial Fe2+ oxidation, pointing to an additional role for this protein in buffering iron availability (Pfaffen et al., 2015).
When iron is scarce in the environment, diatoms adjust their metabolism by reducing their cellular requirements for iron. For instance, as observed in most cyanobacteria and other algae (Pierella Karlusich and Carrillo, 2017), diatoms can replace the plastid-localized ferredoxin with flavodoxin, as the latter uses flavin as a cofactor instead of iron (La Roche et al., 1996; Groussman et al., 2015). Unlike most diatoms, green algae also possess other interchangeable soluble electron carriers: the copper-containing plastocyanin, which is abundant in copper-supplemented medium and remains expressed under iron limitation; and the iron-containing cytochrome c6, which replaces plastocyanin under copper deficiency (Merchant and Bogorad, 1986). So far, plastocyanin has been reported only in T. oceanica (Peers and Price, 2006), which appears to have adapted to low-iron environments. The constitutive expression of plastocyanin observed in this species may reduce the requirement for iron, which is less abundant than copper in the open ocean.
Novel candidate proteins specific to diatoms and other phytoplanktonic organisms have been put forward as regulators of acclimation to iron limitation. These include a protein previously annotated as "death-specific", which regulates photosynthesis under limiting iron and light conditions (Thamatrakoln et al., 2013), and the iron-starvation-induced proteins (ISIPs), whose expression is induced by iron limitation (Allen et al., 2008). ISIP1 was originally identified as a putative iron receptor on the cell surface (Lommer et al., 2012), but more recent studies in P. tricornutum showed that ISIP1 plays a key role in the endocytic uptake of siderophores (Kazamia et al., 2018). Functional characterization of ISIP2a revealed its influence on ferric-iron concentration and high-affinity uptake (Morrissey et al., 2015). ISIP2a is an algal transferrin-like protein named phytotransferrin that is functionally equivalent to its human homologue (McQuaid et al., 2018), as iron uptake was recovered in a P. tricornutum ISIP2a mutant complemented with the human transferrin gene. These findings, together with the characterization of a transferrin-like protein in the green alga Dunaliella salina (Fisher et al., 1997), highlight the importance of this iron acquisition mechanism for a wide variety of microalgae in aquatic environments (Morrissey et al., 2015; McQuaid et al., 2018). Moreover, the discovery that carbonate ions control iron uptake by ISIP2 uncovered a relationship between carbonate availability and iron uptake in diatoms. Regulators of iron metabolism can therefore be used as molecular markers for assessing iron status in natural diatom populations. For example, ISIP proteins are highly expressed in iron-limited oceanic regions (Bertrand et al., 2015; Carradec et al., 2018), which is consistent with the results of laboratory-based experiments examining iron uptake (Kazamia et al., 2018). Furthermore, the wide occurrence of carbonate-sensitive phytotransferrin sequences in metagenomic data sets suggests that this system is of ecological importance and raises the possibility that ocean acidification has a negative impact on iron uptake by diatoms (McQuaid et al., 2018).

Perception and Responses to Environmental Signals
Massive diatom blooms in the marine environment are controlled by light and nutrient availability, as well as interactions with other organisms, including partners, competitors, or predators (Brodie et al., 2017). Studies of diatom responses to specific abiotic and biotic signals or stresses are revealing how diatoms acclimate to complex environments and survive adverse conditions.
Diatoms possess a palette of photoreceptors that use information on the changing light spectrum and intensity to modulate critical photophysiological responses (Jaubert et al., 2017). As expected from the abundance of blue light wavelengths through the entire water column, diatoms have a multitude of blue light sensors (Jaubert et al., 2017). The animal-like protein PtCPF1, a member of the expanded cryptochrome/photolyase family (CPF) in P. tricornutum, shows both blue light-dependent transcriptional regulation and DNA repair activity (Coesel et al., 2009). This double function is thought to allow PtCPF1 to regulate different cell cycle phases, which are strongly synchronized with periodic light/dark variations. Following their discovery in diatoms, bifunctional cryptochromes have also been found in green algae (Franz et al., 2018). Diatoms also possess a protein distantly related to plant-type cryptochrome (CryP) that regulates transcript abundance in the light as well as in darkness (Juhas et al., 2014; König et al., 2017). Diatoms do not possess blue-light-absorbing phototropins, which control responses such as phototropism and chloroplast movement in plants (Briggs, 2014) and photoprotection in green algae (Petroutsos et al., 2016). However, multiple aureochromes (AUREOs) have been found in diatoms. AUREOs are light-induced TFs specific to stramenopiles. Like phototropins, AUREOs contain a light-oxygen-voltage-sensing domain for blue light perception, in a novel combination with a bZIP domain responsible for DNA binding. PtAUREO1a functions synergistically with the basic leucine zipper TF bZIP10 to control the expression of dsCYC2, thus rendering the regulation of cell cycle progression dependent on blue light (Huysman et al., 2013). Both PtAUREO1a and PtAUREO1b contribute to photoacclimation in diatoms, as their corresponding knockout mutants show lower photosynthetic activity than the wild type (Mann et al., 2017). The NPQ process was modified only in the AUREO1a mutant, indicating that these gene family members have diversified functions.
Rhodopsins (RHOs), a diverse group of light-dependent reactive membrane proteins with retinal as the chromophore, have been identified in several diatom species. The first diatom RHO gene discovered was found in F. cylindrus (Strauss, 2012). Preliminary data suggest that RHOs from F. cylindrus and P. granii are likely xanthorhodopsins acquired from bacteria via HGT. Xanthorhodopsins are light-driven proton pumps that generate a proton motive force for ATP biosynthesis (Strauss, 2012). F. cylindrus RHO is targeted to the plastid (Strauss, 2012), suggesting it contributes to ATP biosynthesis in this organelle. Alternatively, the proton motive force generated by RHOs might enhance iron uptake when iron is limiting. This might explain why RHO transcripts are abundant in the metatranscriptomes of diatom communities from iron-limited ocean surfaces (Marchetti et al., 2012).
Light sensing in diatoms is further augmented by bacterial-like phytochrome photoreceptors. Diatom phytochromes (DPHs) absorb at longer red and far-red wavelengths than the phytochromes of plants, green algae, and glaucophyte algae (Rockwell et al., 2014) and can trigger far-red light signaling (Fortunato et al., 2016). These solar radiation wavelengths are quickly absorbed in the water column, and DPHs might therefore act as ocean surface light sensors. However, since DPH genes have been found in the genomes of many diatom species and in metatranscriptomic data from subsurface layer samples (M.J. and A.F., unpublished data), their roles in light sensing in deeper waters cannot be excluded. Most far-red-light-induced genes regulated by DPH encode proteins of unknown function. Characterizing these genes and proteins should reveal novel far-red-light-sensing processes and their importance for diatom biology.
Diatoms also experience the daily light-dark cycle, and basic biological processes such as the cell cycle (Huysman et al., 2010), gene expression (Ashworth et al., 2013; Chauton et al., 2013; Smith et al., 2016), and pigment biosynthesis (Ragni and d'Alcalà, 2007) show daily oscillations. In P. tricornutum, these rhythms persist under constant light conditions, providing evidence that they are regulated by an endogenous circadian clock (Annunziata et al., 2019). The bHLH-PAS protein RITMO1 was the first regulator of circadian rhythms identified in diatoms, as deregulating its expression strongly altered cellular rhythmicity (Annunziata et al., 2019). No obvious homologs of bacterial, green algal, plant, or animal circadian clock components have thus far been identified in diatoms. Therefore, RITMO1 is a valuable entry point for characterizing circadian regulation in diatoms and in many marine microalgae possessing RITMO1-like proteins.
Diatoms also perceive and respond to changes in nutrient availability, which can occur rapidly in turbulent waters, as evidenced by the massive changes in gene expression that occur under different nutrient stresses (Supplemental Table 2). Based on the important metabolic changes observed in NR mutants compared to wild-type cells under N stress, as mentioned above (Levitan et al., 2015b; McCarthy et al., 2017), NR might function in sensing cellular N fluxes and remodeling intermediate metabolism (Levitan et al., 2015b). Regarding C sensing, CO₂ itself might elicit the modifications in cell physiology observed in response to changes in external CO₂ levels. CO₂ activates a signaling cascade involving cAMP as a second messenger and the TF PtbZIP11, which binds to CO₂/cAMP-responsive elements in CO₂-regulated genes such as PtCA1 and PtCA2. Interestingly, light-driven signals and metabolic signals act on the same pathway (Kikutani et al., 2012; Tanaka et al., 2016). Therefore, a complex interconnection of regulatory networks controls diatom physiology under dynamic environmental conditions (Supplemental Table 2).
Diatoms also respond to intracellular metabolic signals that are strongly dependent on environmental changes. A retrograde plastid-to-nucleus signaling pathway was identified in P. tricornutum by artificially modifying the redox state of the plastoquinone pool (Lepetit et al., 2013). As shown in plants and green algae, plastid signals induce changes in the expression of nuclear genes, such as LHCX genes, and the biosynthesis of photoprotective pigments (Lepetit et al., 2013;Taddei et al., 2016). ROS normally produced by oxygen-based metabolism (photosynthesis, photorespiration, and oxidative phosphorylation) can act as central secondary messengers in diatoms, as in many other organisms including plants and green algae (Mittler et al., 2011). A compartmentalized redox-sensitive signaling network (redoxome) was characterized in P. tricornutum by analyzing the redox proteome and redox-sensitive biosensors localized to different cellular compartments (e.g., roGFP; green fluorescent protein) (Rosenwasser et al., 2014) and by monitoring phenotypic variability within diatom populations at the single-cell level (Mizrachi et al., 2019). When oxidative stress is induced by other stresses (e.g., high light), the redoxome regulates redox metabolism and cell fate decisions toward acclimation or cell death.
The life of diatoms in the ocean is also strongly influenced by surrounding organisms. Some diatoms can assimilate N via symbiosis with N₂-fixing cyanobacteria (Foster et al., 2011). As observed in a variety of marine and freshwater microalgae (Croft et al., 2005), some bacteria can provide diatoms with vitamin B12, whose deficiency would otherwise limit growth in certain environments (Bertrand et al., 2015). P. multiseries and diverse co-occurring bacteria interact via complex communication systems (Amin et al., 2015) involving the exchange of hormones, macro- and micronutrients, and signaling molecules released by both the diatoms and bacteria. Diatoms and their grazers also interact in an elaborate way. In the presence of copepods, some diatoms produce unsaturated aldehyde oxylipins that inhibit the growth of the predator (Ianora et al., 2004). These aldehydes can act as sensing molecules to control the diatom population itself by activating a signaling cascade involving calcium, NO, and oxidation of the glutathione pool (Vardi et al., 2006; van Creveld et al., 2015). NO production was initially thought to be controlled by a calcium-dependent NO synthase-like protein (NOA; Vardi, 2008), but subsequent studies revealed that NO is likely produced by a nitrite-dependent pathway catalyzed by the side activity of NR. Consequently, NO appears to play a role in a nitrite-sensing and acclimation system (Dolch et al., 2017), as previously demonstrated in Chlamydomonas (Wei et al., 2014). In Chlamydomonas, however, NO is also produced in NR-defective strains by other pathways: via the polyamine pathway, from Arg by an unknown NO synthase-like pathway, or from nitrite by molybdenum cofactor-containing enzymes (De Mia et al., 2019). Considering the interconnectivity of signaling networks in diatoms, it is conceivable that NO produced from different N sources participates in both cell acclimation and cell fate regulation, depending on the environmental context and the physiological state of the cells. A recent study in Skeletonema marinoi supports this scenario. In this species, another pathway activated by intracellular sterol sulfates triggers programmed cell death by inducing ROS and NO production during cell aging (Gallo et al., 2017).
Currently, information about diatom signaling pathways, from stimulus to response, remains largely fragmentary, and many aspects of diatom interaction networks remain to be clarified. None of the signaling components involved in light sensing downstream of photoreceptors of plants and green algae have thus far been identified in diatom genomes. As in other eukaryotes, calcium appears to be a critical second messenger for detecting fluid motion, osmotic stress, iron, and infochemicals in diatoms (Falciatore et al., 2000; Vardi et al., 2006). A Ca²⁺ signaling mechanism was recently discovered in diatoms when EukCats, a novel class of single-domain voltage-gated channels, were characterized (Helliwell et al., 2019). EukCats, which are widely distributed in eukaryotic phytoplankton, are involved in regulating cellular motility in P. tricornutum.

Genomic Resources
P. tricornutum and T. pseudonana are the most advanced diatom model species for molecular studies. Strains of both P. tricornutum and T. pseudonana have been isolated from locations worldwide and are curated in various culture collections (Supplemental Figure 1). The ecological relevance of T. pseudonana and its obligate requirement for silica were strong considerations when the CCMP1335 strain was chosen for the first sequencing project of a centric diatom (Armbrust et al., 2004). P. tricornutum Pt1.1 8.6 Bohlin (CCMP2561) was then chosen as a representative diatom of the pennate lineage. The availability of many transcriptomic and proteomic data sets has greatly facilitated genome annotation efforts and therefore gene discovery in both species (Table 2; Supplemental Table 2). Resequencing of seven additional T. pseudonana strains (Koester et al., 2018) and 10 P. tricornutum accessions (Supplemental Figure 1; Rastogi et al., 2019) has revealed polymorphisms, providing evidence for genome variation, evolution, and adaptation. The following diatom genome sequences have recently become available, as previously mentioned (Table 1; Supplemental Table 1): F. cylindrus, T. oceanica, P. multistriata, P. multiseries, F. solaris, C. cryptica, and S. costatum. Most diatom genome projects are accompanied by the production of extensive RNA-sequencing data sets and other omics data obtained under a variety of growth conditions relevant to the individual species (see Table 2 and Supplemental Table 2 for P. tricornutum and T. pseudonana). However, a database centralizing all of these omics resources is still lacking.
The availability of genomic information from P. tricornutum has made it possible to begin reconstructing the metabolic networks of diatoms, as done in other models of the green lineage (e.g., Dal'Molin et al., 2011). Diatom metabolism databases (Table 2) include DiatomCyc and a genome-scale metabolic model based on improved genome annotation and protein subcellular localization predictions (Fabris et al., 2012; Levering et al., 2016). These resources are instrumental in charting intracellular metabolic fluxes in diatoms grown under different sources of energy and C (Kim et al., 2016), in exploring the roles of novel pathways (e.g., lower glycolysis in mitochondria), and in predicting the existence of new regulators of energy fluxes between organelles in relation to diatom acclimation (Kim et al., 2016; Broddrick et al., 2019). Although the biotechnological exploitation of diatoms is beyond the scope of this review, it is important to mention that these genomic resources and genetic tools (Tables 1 to 4) also provide the basis for metabolic engineering of these algae (e.g., Daboussi et al., 2014; Kim et al., 2016; Levering et al., 2016).
Large amounts of metagenomics and metatranscriptomics data are now being generated from the sequencing of natural phytoplankton populations (Table 3). These data represent powerful resources for exploring the distribution of diatom genes and species in an environmental context.

Genetic Resources
The breakthrough that made reverse genetics in diatoms possible was the biolistic delivery of transgenes to C. cryptica and Navicula saprophila cells (Dunahay et al., 1995). Many transformation experiments in other diatom species, including P. tricornutum and T. pseudonana (Table 4; Supplemental Table 1), were subsequently performed using the original protocol from 1995 with minor modifications, which gives reproducible results even though the transformation efficiency is relatively low (~10⁻⁶ to 10⁻⁸ transformants per µg of DNA). DNA can also be delivered to P. tricornutum via electroporation (Table 4), but the transformation efficiency is more variable. A drawback of both approaches is that the transgene is randomly integrated into the genome, with multiple integration events. Chromosomal position effects and variable transgene copy numbers cause significant variability in transgene expression levels among independent lines. Biolistic gene transfer also affects genome integrity, as the physical forces involved produce double-strand breaks that are repaired by nonhomologous end joining. To overcome these issues, new vectors containing a yeast-derived sequence have been designed that replicate as episomes in diatom cells (Karas et al., 2015). These episomes can be delivered to P. tricornutum and T. pseudonana through bacterial conjugation using Escherichia coli (Table 4). This transformation strategy is more efficient than previously described methods (~10⁻⁴ transformants per µg of DNA for P. tricornutum and T. pseudonana) but is less stable, as episomes are eventually lost in the absence of selective pressure. On the positive side, this feature can be exploited to achieve transient expression, which could be particularly useful for potentially deleterious transgenes.
Vectors have also been constructed to express foreign genes in P. tricornutum plastids transformed by electroporation (Table 4). There is also evidence that plastid mutagenesis by homologous recombination is feasible in this species (Table 4).
In parallel, vectors have been designed for high-throughput protein tagging and promoter-reporter/target transgenes to study  (Table 4). In P. tricornutum and T. pseudonana, Lhcf promoters are most commonly used to drive strong expression of the target gene, while the Nitrate reductase promoter allows gene expression to be induced by shifting the transgenic cells from nitrate-deficient to nitrate-sufficient medium. Examples of reporter genes include the beta-glucuronidase gene uidA for monitoring gene expression and GFP variants for subcellular localization of proteins (Table 4). Down-regulation of gene expression has been achieved in P. tricornutum and T. pseudonana by expressing antisense or inverted repeat sequences of target genes (Table 4). This strategy often results in the reduction of mRNA and protein levels (De Riso et al., 2009), but it is not yet known how gene silencing works in diatoms. A recent major achievement was the development of genome-editing tools such as TALEN and CRISPR/Cas9 in P. tricornutum and T. pseudonana (Table 4; Kroth et al., 2018). Nucleotide insertions or deletions resulting from nonhomologous end joining have been described, as well as gene replacement mediated by homology-directed repair in the presence of template DNA (Table 4). More recently, transgene-free editing of the P. tricornutum genome was achieved by delivering Cas9/single guide RNA ribonucleoprotein complexes. With ribonucleoprotein delivery, endogenous counter-selectable markers have been used to facilitate the identification of mutated genes.

PERSPECTIVES
In the last two decades, major international initiatives such as Marine Genomics Europe, the European Marine Biological Resource Centre, and the Marine Microbiology Initiative have promoted diatom genetics and genomics research. Since diatoms include the most advanced molecular model systems among phytoplankton organisms, previously unknown cellular and metabolic processes have been discovered in these species, broadening our awareness of the diversity of photosynthetic organisms. Molecular insights into the life of diatoms obtained from studies of established and emerging model species can now be transferred to the ~100 nonmodel diatom species whose transcriptomes have been sequenced in the framework of the Marine Microbial Eukaryote Transcriptome Sequencing Project (Keeling et al., 2014). Diatom models also provide a reference system for genomic and phylogenomic comparisons (Blaby-Haas and  with other microalgae and macroalgae such as E. siliculosus (Cock et al., 2010) of the heterokont clade that are still recalcitrant to genetic investigations.
Many molecular secrets of diatoms remain to be revealed. The unveiling of the chimeric nature of diatom genomes was a step forward, but it prompted the following question: How did regulators derived from different ancestors coevolve and integrate their functionalities to cooperatively fine-tune diatom genome expression and metabolism while responding to environmental changes? Further characterization of genetic and epigenetic processes in diatoms will undoubtedly uncover novel regulatory pathways, their significance for diatom success in contemporary oceans, and their evolutionary past.
The lack of functional information for roughly half of the diatom gene repertoire greatly limits our understanding of their biology, as is the case for many other algae (Blaby-Haas and . The discovery of gene products with no known protein domain hints at functional capabilities unique to diatom species that cannot be surmised by comparative genomics. In other model organisms, such as Arabidopsis and Chlamydomonas, forward genetics has been among the most successful strategies to determine gene functions by inducing and hybridizing mutants and carrying out large-scale phenotypic screens. These approaches are precluded in diatoms that do not reproduce sexually in a way that can be controlled in the laboratory. Forward genetics could possibly be established in sexually reproducing diatom species. However, in the coming years, the generation of CRISPR/Cas9 knockout libraries targeting all genes in diatom genomes may allow us to establish loss-of-function screens in any diatom species, paving the way for major breakthroughs. One of the biggest challenges for the future will be to assess the significance of diatom biology in the marine environment. Being the most advanced phytoplanktonic model species, diatoms offer unique opportunities to address fundamental questions through integrative gene-to-ecosystem approaches (e.g., Guidi et al., 2016; Carradec et al., 2018). Functional investigations of diatom model species have enabled us to identify genes with strong phenotypic impacts that might be targets for selection and therefore drive the evolution and adaptation of diatoms to their highly variable environment. Links between diatom ecology, genomics, and genetics will intensify in the future due to the tangible potential to learn more about the factors shaping phytoplanktonic communities and the capacity of marine ecosystems to tolerate, adapt to, and contribute to a changing climate.

Supplemental Data
Supplemental Figure 1. Map of the globe indicating the geographical origin of T. pseudonana and P. tricornutum isolates used for molecular and cellular investigations.
Supplemental Table 1. Characteristics of the established and emerging model species.
Supplemental Table 2. Available omics data recently generated in T. pseudonana and P. tricornutum under different conditions.