Advances and prospects of orchid research and industrialization

Abstract
Orchidaceae is one of the largest and most diverse families of angiosperms, with significant ecological and economic value. Orchids have long fascinated scientists with their complex life histories, exquisite floral morphology, and pollination syndromes that exhibit more exclusive specializations than those of any other plants on Earth. These intrinsic factors, together with human influences, also make the family a keystone group in biodiversity conservation. The advent of sequencing technologies and transgenic techniques represents a quantum leap in orchid research, enabling molecular approaches to be employed to resolve long-standing puzzles in basic and applied orchid biology. To date, 16 different orchid genomes covering four subfamilies (Apostasioideae, Vanilloideae, Epidendroideae, and Orchidoideae) have been released. These genome projects have generated massive data that greatly empower studies of key innovations and evolutionary mechanisms across the breadth of orchid species. The extensive exploration of transcriptomics and comparative genomics, together with recent advances in genetic engineering, has linked important traits of orchids with a multiplicity of gene families and their regulatory networks, providing great potential for genetic enhancement and improvement. In this review, we summarize the progress and achievements in fundamental research and industrial application of orchids, with a particular focus on molecular tools, and outline future prospects for orchid molecular breeding and post-genomic research, providing a comprehensive assemblage of state-of-the-art knowledge in orchid research and industrialization.


Introduction
With over 750 genera and 28 000 species, Orchidaceae constitutes one of the largest and most species-rich families of flowering plants and has successfully colonized all continents except Antarctica [1,2]. Orchids display extreme specializations, particularly in the tropics, presenting several distinctive features such as highly diversified flowers driven by orchid-pollinator interactions, dust-like, non-endospermic seeds, obligate association with mycorrhizal fungi, crassulacean acid metabolism (CAM), and unrivaled reproductive strategies that contribute to a wide range of adaptations [3,4]. Although orchids exhibit enormous diversity, their intrinsic pollinator and mycorrhizal specificity makes them particularly vulnerable to environmental change, and unsustainable harvest is a major additional risk. As a result, orchids now feature prominently among lists of threatened plant species, including all species in Appendices I and II of CITES [5] and 1601 species on the IUCN Global Red List [6]. Apart from their scientific fascination and significance, orchids are renowned as cut flowers and ornamental potted plants, accounting for a large part of the global floriculture trade [7]. Orchids have also long been commercialized as medicinal products and food [8], yet breeding and mass production of commercially valuable orchids have long been hindered by a lack of scientific information on the mechanisms involved in orchid growth and flower induction. Therefore, from the time of Darwin to the present day, a great deal of attention has been paid to this family in terms of its conservation, evolutionary relationships, and flower diversification. With the scientific tools now available for systematic research, emerging studies are providing in-depth analyses of open questions in orchids, with an increasing focus on comparative genomics.
By identifying and validating functional genes, reconstructing phylogenetic relationships, and performing genome-wide exploration, researchers have fundamentally reshaped our understanding of the genetic basis of the unique characteristics of orchids. The wealth of available genomic information also stimulates further studies on molecular breeding, selection of new varieties, and production of secondary components that could not only protect and restore wild orchids but also develop orchid industrialization in a way that sustainably satisfies commercial demand.
This review summarizes the prominent strategies and approaches used in orchid research, including molecular markers, sequencing technologies, and transgenic and gene editing techniques, and enumerates the studies that led to major breakthroughs in orchid biology, such as orchid phylogeny, flower development, and the evolution of life forms. In addition, recent advances in biotechnology applied to the orchid industry, together with future prospects for post-genomic research, conservation, and utilization of orchids, are briefly discussed.

Orchid conservation
Over the past 10 years, novel technologies have advanced our understanding of orchids and promoted a revolution in orchid research. As modern molecular biology and computer science have become more commonplace, the rapid development of genetic techniques has facilitated the unprecedented resolution of uncertainties in orchid biology. These techniques have been heavily applied to studies with conservation purposes, including taxonomy, systematics, and population genetics, namely the examination of levels of genetic variation, effective population size, and mating patterns [9]. To visualize the latest knowledge domains and emerging trends in the orchid conservation literature, CiteSpace [10] was employed to conduct a scientometric investigation (Fig. 1). Figure 1 is arranged as a time series and depicts the keywords of research on orchid conservation. It is clear from the timelines that orchid conservation research has gradually shifted from descriptive theories and conventional practices to a technology-based quantitative science, particularly in the direction of molecular systematics. Notably, high-frequency keywords such as 'phylogenetic analysis', 'biogeography' and 'conservation genetics' have occurred since 2013 (Fig. 1). Indeed, new molecular approaches have emerged in the past decades and will be among the most powerful tools for providing new insights into orchid conservation.

Orchid classification and systematics
Due to frequent hybridization and introgression, delineating the taxonomic boundaries of orchids can be very difficult [11,12]. The resulting uncertainties in the taxonomy and conservation status of orchids can severely hamper effective conservation strategy-making. Nowadays, markers including amplified fragment length polymorphism (AFLP), microsatellite DNA, chloroplast genomes, single nucleotide polymorphisms (SNPs), and restriction-site-associated DNA sequencing (RADseq) are widely used in systematic and taxonomic studies [13]. These approaches are likely to be most important for delimiting species and intergeneric relationships. More than 10 newly recorded genera or new genera with dozens of new species have been published with molecular evidence for their distinctiveness, including the new genera Danxiaorchis [14], Hsenhsua [15], Shizhenia [16], and Yunorchis [17], the newly recorded genus Thaia [18], and genera with phylogenetic replacement, Cymbilabia and Mengzia [19,20]. Newly discovered species supported by phylogenetic trees based on nuclear and plastid DNA markers have been found in several genera, including Bulbophyllum, Paphiopedilum, Gastrodia, Dendrobium, Liparis, and Cymbidium [21][22][23][24][25][26][27]. Although challenges remain, there have been great strides in the use of molecular tools for the description of new taxa, and the rapid development of new markers can be expected to further expedite orchid systematic studies.
Along with species-level identification, orchid phylogenetics now employs next-generation sequencing (mostly Illumina techniques) for understanding higher-level relationships. Given its small size, uniparental inheritance, conserved gene content, and high copy number in cells, the plastid genome (plastome) has so far been the most important source of data for plant phylogeny [28]. Since the first orchid plastid genome was reported in Phalaenopsis aphrodite subsp. formosana [29], 642 complete plastid genomes for Orchidaceae have become available in GenBank (accessed 3 May 2022). Extensive work has also gone into orchid plastid phylogenomics, because a well-resolved, strongly supported, time-calibrated phylogeny is fundamental for circumscribing species, tribes, and genera. Since the first DNA data-based phylogenetic classification of Orchidaceae was published [30], there has been a great deal of progress in resolving relationships and problematic placements at higher levels. Plastid phylogenomic studies that resolve and clarify relationships across orchids have grown exponentially, including a supermatrix tree based on 75 chloroplast genes for 39 species covering all orchid subfamilies and 16 of 17 tribes [31]; reconstruction of the phylogeny and temporal evolution of Orchidaceae based on 76 and 38 coding genes of plastid and mitochondrial genomes, respectively [32]; a first-time phylogenetic placement for Codonorchideae (Orchidoideae), Podochilieae and Collabieae (Epidendroideae) based on 78 plastid coding genes [33]; plastid phylogenomic resolution of subtribes Aeridinae [19] and Goodyerinae [34]; and genus-level phylogenetics for genera such as Dendrobium [35], Cymbidium [36], Holcoglossum [37] and Paphiopedilum [38], suggesting that the plastome has been a mainstay for addressing finer-scale phylogenetic questions in orchid evolutionary studies.
In parallel with the examination of organelle genomes, opportunities are arising to use transcriptomic data for orchid phylogenetics. Several orchid phylotranscriptomic studies have been performed to address either broad-scale or shallow-scale orchid relationships, including species diversification and genome evolution, as well as key traits such as sexual deception [39][40][41][42][43]. These studies also give insights beyond phylogenetic analysis by uncovering morphological evolution alongside the evolutionary histories of thousands of unlinked genes. Therefore, repurposing transcriptomic studies provides an additional approach and perspective for orchid systematic studies. Collectively, drawing on these informative data for orchid classification and systematics has yielded important insights into the mechanisms driving the extraordinary species diversity of orchids and serves as an invaluable resource for conservation.
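The supermatrix approach mentioned above for plastid genes can be illustrated with a minimal Python sketch that concatenates per-gene alignments into one matrix and pads taxa lacking a gene with gap characters. The function name `build_supermatrix`, the nested-dictionary input layout, and the toy sequences are illustrative assumptions; they are not taken from the cited studies, which relied on dedicated alignment and tree-building software.

```python
def build_supermatrix(gene_alignments):
    """Concatenate per-gene alignments into one supermatrix.

    gene_alignments: dict mapping gene name -> dict mapping taxon -> aligned
    sequence (hypothetical layout chosen for this illustration).
    Taxa missing from a gene are padded with '-' for that gene's full width.
    """
    # Collect every taxon that appears in at least one gene alignment.
    taxa = sorted({taxon for aln in gene_alignments.values() for taxon in aln})
    parts = {taxon: [] for taxon in taxa}

    # Process genes in a fixed (alphabetical) order so columns are reproducible.
    for gene in sorted(gene_alignments):
        aln = gene_alignments[gene]
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to one length")
        width = lengths.pop()
        for taxon in taxa:
            parts[taxon].append(aln.get(taxon, "-" * width))

    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


if __name__ == "__main__":
    # Toy data only; real plastid matrices span dozens of genes and taxa.
    toy = {
        "matK": {"sp1": "ATG", "sp2": "ATA"},
        "rbcL": {"sp1": "CC"},  # sp2 lacks this gene and is gap-padded
    }
    for taxon, seq in build_supermatrix(toy).items():
        print(taxon, seq)
```

Padding with gaps keeps every row the same length, which is what downstream tree-inference programs generally expect; how missing data are actually handled varies between studies.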

Population genetics
At the population level, identifying species that should be treated as a high priority for conservation remains a struggle in conservation planning [44], highlighting the need for population genetic analysis. In orchids, unique traits such as deceptive pollination and dust-like, wind-dispersed seeds have promoted gene flow between populations [45]. These reproductive strategies have led to significant heterogeneity in genetic structure among orchid populations [46]. To investigate the population genetics of endangered orchids, marker-assisted approaches have been widely employed to assess genetic diversity, genetic drift, and the level of inbreeding, and to gain insights into contemporary and historical dispersal for populations identified as being at risk. Genetic diversity studies of populations of two critically endangered Amitostigma species were conducted to identify which entities to conserve preferentially and to determine the optimal conservation strategy [47,48]. Studies on Cypripedium showed that random genetic drift and limited gene flow have a great impact on its genetic diversity and population structure [49,50], while distinct levels of genetic diversity occur in Cypripedium populations with different types of habitats and climates [51,52]. The accumulation of ancestral variation and genetic admixture from multiple post-glacial colonization routes may explain the high genetic diversity of Cypripedium in central Europe, enabling the long-term stability of the species in different biogeographical regions [53,54]. Anthropogenic disturbance is a major factor driving orchid habitat fragmentation and deterioration, especially for species thriving as epiphytes [55].
[Figure caption fragment: Each node represents a keyword that first appears in the analysed data set. The published articles from 2010 to 2022 were retrieved from the Web of Science core collection.]
Population genetic structure analysis of an epiphytic orchid, Bulbophyllum occultum, showed that self-pollination and genetic drift have contributed to its high population genetic and phenotypic differentiation, despite it being severely impacted by deforestation [56]. In another case, in the dominant clonal orchid B. bicolor, low genotypic diversity and a lack of spatial genetic structure caused by skewed clonal reproduction have contributed to the loss of sex and an extinction debt, which demands urgent conservation attention such as ex situ collection [57]. Orchids have specific adaptations to pollinators that could promote outcrossing and avoid inbreeding depression [58]. Several studies have used molecular markers to investigate the relative fitness of populations resulting from selfing and outcrossing. The genetic diversity and structure of Cattleya were evaluated with ISSR markers, and the results showed that Cattleya populations are undergoing a drastic decline, while deceptive pollination and long-distance seed dispersal may help maintain higher genetic variability [59,60]. Blambert et al. [61] reported the selfing rates and levels of genetic diversity for two closely related Jumellea species with different reproductive systems. The results showed that the selfing rate and the magnitude of inbreeding depression are negatively correlated. A study that examined the population genetic patterns of Platanthera praeclara indicated that its small population size could lead to severe inbreeding depression [62]. The genetic diversity and population structure of Calibrachoa showed that species exhibiting inbreeding are more likely to suffer from a loss of alleles, resulting in a low level of population structure and diversity [63].
In a study that used double-digest restriction-site associated DNA (ddRAD) sequencing to evaluate the conservation status of three threatened species of Corybas, the genetic structure results provided evidence for hybridization and introgression within the Corybas complex, which leads to blurred taxonomic boundaries [64].
Taken together, the reliability and efficiency of molecular tools have provided more opportunities to examine a wider spectrum of orchid population genetics, facilitating genetic studies on more endangered orchids and the possibility of answering unresolved biological questions, thereby guiding conservation practices.
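As an illustration of the diversity and inbreeding measures discussed in this section, the following minimal Python sketch computes observed heterozygosity, expected heterozygosity under Hardy-Weinberg proportions, and the inbreeding coefficient F_IS = 1 - H_obs / H_exp for a single locus. The function name `heterozygosity_stats` and the toy genotypes are assumptions made for this example; the cited studies used multilocus marker data and dedicated population genetics software, often with sample-size corrections that are omitted here.

```python
from collections import Counter


def heterozygosity_stats(genotypes):
    """Return (H_obs, H_exp, F_IS) for one locus in a diploid sample.

    genotypes: list of (allele, allele) tuples, one per individual
    (hypothetical input format chosen for this illustration).
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("need at least one genotype")

    # Observed heterozygosity: share of individuals with two different alleles.
    h_obs = sum(1 for a, b in genotypes if a != b) / n

    # Allele frequencies over all 2n gene copies.
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = 2 * n

    # Expected heterozygosity under Hardy-Weinberg: 1 - sum of squared frequencies.
    h_exp = 1.0 - sum((c / total) ** 2 for c in counts.values())

    # F_IS is undefined when the locus is monomorphic (H_exp == 0).
    f_is = None if h_exp == 0 else 1.0 - h_obs / h_exp
    return h_obs, h_exp, f_is


if __name__ == "__main__":
    # Toy sample with a deficit of heterozygotes, as expected under selfing.
    sample = [("A", "A")] * 4 + [("A", "B")] * 2 + [("B", "B")] * 4
    h_obs, h_exp, f_is = heterozygosity_stats(sample)
    print(f"H_obs={h_obs:.2f} H_exp={h_exp:.2f} F_IS={f_is:.2f}")
```

A positive F_IS indicates fewer heterozygotes than expected under random mating, which is the pattern usually associated with selfing or other forms of inbreeding.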

Deceptive pollination
Orchids have evolved diverse pollination mechanisms, with an overrepresentation of floral mimicry (deceptive pollination) compared to other plants [65]. For a long time, ecological investigations were the main approach to understanding the driving forces of orchid deception, and a knowledge gap remains between the underlying mechanisms of the evolution of mimicry and species diversification [66]. Several studies have applied novel and multidisciplinary approaches to provide new insights into this research area. For example, a genus-level phylogeny suggested that the unidirectional transition in nectar presence/absence is associated with food deception, which is an 'evolutionarily stable strategy' in Epidendrum [67]. A study that incorporated field experiments and chemical findings into phylogenetic analysis concluded that the sexually deceptive orchid Chiloglottis undergoes pollinator-driven speciation [68]. A similar attempt was reported in the genera Serapias and Iris to provide evidence for evolutionary transitions between two different pollination strategies [69]. Stearoyl-acyl carrier protein desaturase (SAD) was the first gene found to be involved in the biosynthesis of semiochemicals for sexual deception; it is responsible for alkene differences between Ophrys species and for pollinator-driven 'genic' speciation [70]. Despite recent advances in our understanding of orchid deception, it is evident that much remains to be learnt. In light of this methodological progress, our capability to examine the basis of pollination deception at an unparalleled depth will be greatly enhanced.

Mycorrhizal associations
A common feature of orchids is their obligate mycorrhizal associations, which play crucial roles in the orchid life cycle and population dynamics [71,72]. The identification of orchid mycorrhizal fungi (OMF) has been studied extensively in at least 200 genera [73]. Most OMF belong to three Basidiomycota lineages, namely Ceratobasidiaceae, Tulasnellaceae, and Serendipitaceae [74]. Moreover, orchid distribution is considered to be associated with OMF diversity [75], which is jointly regulated by environmental factors, spatiotemporal variations, biogeography, and the phylogenetic constraints of host orchids [72]. Multifaceted evidence from radiocarbon (14C), molecular markers, and genomic data suggests that both fully and partially mycoheterotrophic orchids obtain carbon from fungi [76,77]. The concept that OMF receive nothing in return has been overturned by a study that examined the expression of fungal and plant nitrogen (N) transport and assimilation genes; the results showed that a mycoheterotrophic orchid and its fungal partner have a mutualistic association [78]. However, disentangling the factors that determine OMF specificity and the molecular mechanisms of this symbiosis still represents a major challenge. A broader investigation of OMF is therefore needed to decode more orchid-fungus crosstalk, and the availability of fungal reference genomes [79] is expected to enable metagenomic studies that address these questions. Such studies, which link orchids and OMF, will be especially important for the regeneration of rare and endangered orchids and for informing conservation efforts.

Crassulacean acid metabolism
CAM is a key innovation for orchids, representing a specialized drought-adapted photosynthetic pathway that enables plants to grow in arid environments [80]. Several transcriptomic studies have been carried out to illuminate the origin and evolution of the CAM pathway. Phosphoenolpyruvate carboxylase (PEPC) is a key enzyme in the CAM pathway; phylogenetic analysis of the PEPC family based on transcriptome data revealed that orchid PEPC genes originated from an ancient duplication in monocots, with CAM having developed earlier than C4 [81]. Zhang et al. [82] reported a comprehensive comparison of carbon fixation pathway genes based on transcriptomic data from 13 orchid species and revealed that CAM may have evolved primarily through changes in the transcription level of key carbon fixation pathway genes. Genes differentially expressed between circadian day and night were identified in Pha. equestris; these genes were mostly enriched in carbon fixation, circadian clock regulation, photosynthesis, and signal transduction pathways [83]. Other studies that employed physiological and biochemical measures to examine the diel dynamics of carbon gain and metabolites in CAM have provided more opportunities to study orchids' adaptation to dry environments, especially in response to projected global warming [84][85][86].

Whole genome sequencing
With the advent of sequencing techniques, we have witnessed a paradigm shift from microarray-based genotyping studies to whole genome sequencing. The adoption of third-generation sequencing approaches that generate longer reads (e.g. Pacific Biosciences HiFi [87], Pacific Biosciences SMRT [88], Oxford Nanopore [89]) and methods for chromosome conformation capture (e.g. Hi-C) has ushered in a revolution in the contiguity and accuracy of plant genome sequencing and assembly, even for orchids, which have large and complex genomes. To date, 16 orchid genomes encompassing four subfamilies (Apostasioideae, Vanilloideae, Epidendroideae, and Orchidoideae) have been published, some of them with updated versions of assemblies and annotations (Fig. 2; Table 1). These genome studies, together with high-quality reference transcriptomic data, have given rise to major conceptual and technical breakthroughs in orchid research and overturned many classical views of orchid biology.
Pha. equestris was the first orchid genome to be reported and contributed to a greater understanding of orchid morphological evolution and physiological adaptation [3]. The highly sophisticated floral structures, epiphytism, and CAM were considered pivotal traits that jointly promoted the rapid radiation of orchids. A chromosome-level assembly of Pha. aphrodite showed that lineage-specific expansion of the FRS-like subclade might be an adaptation to the unstable light conditions associated with epiphytism [90]. The genome sequence of the primitive orchid Apostasia shenzhenica represents a milestone in research on orchid genomics [91]. By reconstructing an ancestral orchid gene toolkit and the evolutionary history of orchids within the angiosperms, together with an exhaustive investigation of Apo. shenzhenica genome characteristics and transcriptome data covering all orchid subfamilies, this study provided significant insights into the key innovations, origins, adaptation, and diversification of orchids. Absolute dating of the whole-genome duplication (WGD) event in Apo. shenzhenica showed a lineage-specific WGD in orchids that occurred shortly before the divergence of the five subfamilies. Another interesting feature of orchids is that all of them associate with mycorrhizal fungi (partial mycoheterotrophy) during seed germination and the early protocorm stage [92]. Genomes of obligate mycoheterotrophic orchids have been reported for two species of Gastrodia [93][94][95] and for Platanthera guangdongensis [77]. These studies revealed contracted plastid genomes and substantial loss of genes involved in photosynthesis in fully mycoheterotrophic orchids, while elevated expression of trehalase could be a critical adaptation for mycoheterotrophy, enabling orchids to hijack trehalose from fungi and resynthesize it as sucrose for internal use [77].
For thousands of years, the active compounds in orchids have been used in traditional medicine or as health food supplements. These ancient remedies have now been scrutinized scientifically at the molecular level. Dendrobium is the second largest genus in Orchidaceae and is renowned for its medicinal use in Asia. Expansion of resistance-related genes and highly expressed polysaccharide synthase gene families in the Dendrobium genome suggested a powerful immune system and high tolerance to environmental stress that contribute to its wide distribution [96]. An updated chromosome-level Dendrobium genome combined with genome-wide association studies (GWAS) revealed several genes associated with the biosynthesis of active ingredients and stem production in Dendrobium species [97]. Another high-quality chromosome-level Dendrobium genome, that of D. chrysotoxum, provided important insights into the interplay among carotenoid, abscisic acid (ABA), and ethylene biosynthesis, demonstrating the regulatory mechanisms behind the relatively short flowering period of yellow-flowered orchids [98]. The first haplotype-resolved orchid genome assembly using PacBio HiFi sequencing was completed for Bletilla striata, a traditional medicinal herb in China [99]. The study reconstructed the ancestral karyotype (18 chromosomes) of seven orchids and functionally verified the key transcription factor (MYB) involved in polysaccharide biosynthesis, laying a foundation for molecular-assisted breeding of orchids with high medicinal value. Vanilla, a prized spice derived mostly from the pods of Vanilla planifolia, is now ubiquitous in food and beverage production worldwide. A chromosome-scale genome assembly of V. planifolia was reported to provide a better understanding of the genes associated with the vanillin pathway, which will enable accelerated breeding of vanilla pods with higher quality and productivity [100]. Another chromosome-level, haplotype-phased V. planifolia genome based on PacBio HiFi sequencing was completed to address the partial endoreplication of V. planifolia and to take a further step toward elucidating this complex genome [101].
Orchids exhibit specialized and complex floral structures that have led to marvelous species richness; these unique traits have played an instrumental role in the course of orchid evolution. A comprehensive inspection of floral shape, color, and scent has been conducted in Cymbidium, a genus of vital commercial importance in the world floriculture industry. Three Cymbidium genomes (C. ensifolium, C. sinense, and C. goeringii) unveiled important genetic clues to phenotypic traits such as colorful leaves, diversified flowers, and fragrance, which are primarily regulated by the MADS-box, MYB, and TPS gene families, respectively [102][103][104]. Furthermore, changes in gene number and gene expression can affect flower morphogenesis by altering floral structure and color, fostering a variety of mutants and providing great potential for orchid molecular breeding and floriculture.
In a nutshell, new sequencing technologies are producing genome assemblies of increasing quality. Deciphering orchid genomes can provide a comprehensive catalog of genomic information that could empower studies of important traits and evolutionary mechanisms across a breadth of species with significant ecological and evolutionary importance.

Orchid database
The emergence and availability of massive sequence data have opened new interfaces with computer science, allowing the establishment of multiple orchid databases. OrchidBase was established in 2011 and has now been updated to version 5.0, which accommodates whole-genome sequencing and transcriptomic data for Apo. shenzhenica, D. catenatum and Pha. equestris, and floral transcriptomic sequences from 10 orchid species covering all five subfamilies [105] (http://orchidbase.itps.ncku.edu.tw/). Orchidstra 2.0 [106] (http://orchidstra2.abrc.sinica.edu.tw) is another orchid database of transcriptomic resources that includes transcriptome assemblies and gene annotations of 18 orchid species belonging to 12 genera across five subfamilies. Other databases such as OOGB (http://predictor.nchu.edu.tw/oogb) and PhalDB [107] also provide large amounts of genome, transcriptome, and miRNA sequence data for Oncidium and Phalaenopsis, respectively. These databases provide the capacity and platform for rapid data mining, generation, and analysis. The aggregation and navigation of different types of orchid data enable the broader biology community to access orchid bioinformatics and perform genome-wide analyses with a lower level of computational skill.

Genome-wide studies
On the basis of large-scale whole-genome data, genome-wide identification and comparative study of a multiplicity of gene families have been conducted in several orchids. Recent publications have reported gene families such as YABBY, terpene synthase (TPS), R2R3-MYB, KNOTTED1-like homeobox (KNOX), WRKY, autophagy-related genes (ATGs), and MADS-box [108][109][110][111][112][113][114][115], along with their expression analysis in Dendrobium, Cymbidium, and Phalaenopsis. Phylogenetic, physicochemical, and comparative transcriptomic analyses were performed to identify the key genetic basis and molecular mechanisms determining the organogenesis and morphogenesis of orchids. Also, from the perspective of floriculture development, most economically important traits, such as floral scent and color, are usually inherited in a quantitative manner. Therefore, genome-wide association studies can contribute to the discovery of candidate genes associated with these important traits. A genotyping-by-sequencing (GBS) approach based on Illumina sequencing has been applied to study four sexually deceptive orchids of the genus Ophrys. Highly differentiated polymorphisms were found in genes involved in floral scent, including several SNPs linked to pseudo-pheromone-related genes [116]. GWAS was performed by combining SNPs and floral aesthetic traits to identify quantitative trait loci (QTLs) associated with color-related traits in Phalaenopsis [117]. The results indicated 10 QTLs and 35 candidate genes that were associated with color-related traits and anthocyanin biosynthesis, respectively, providing important selection markers for Phalaenopsis breeding.
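The core idea behind a single-marker association test can be sketched in a few lines of Python: a quantitative trait is regressed on allele dosage (0, 1, or 2 copies of the alternative allele) at one SNP, and the slope and the proportion of variance explained are reported. This is only a simplified illustration under stated assumptions. The function name `snp_association` and the toy data are hypothetical, and the sketch omits the corrections for population structure, kinship, and multiple testing that real GWAS pipelines apply; it does not reproduce the methods of the cited Ophrys or Phalaenopsis studies.

```python
def snp_association(dosages, phenotypes):
    """Least-squares regression of a quantitative trait on allele dosage.

    dosages: list of 0/1/2 alternative-allele counts, one per individual.
    phenotypes: list of trait values in the same individual order.
    Returns (slope, r_squared), or None if the SNP or trait does not vary.
    """
    n = len(dosages)
    if n != len(phenotypes) or n < 3:
        raise ValueError("need matching lists with at least three individuals")

    mean_x = sum(dosages) / n
    mean_y = sum(phenotypes) / n

    # Sums of squares and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in phenotypes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, phenotypes))

    # A monomorphic SNP or a constant trait carries no association signal.
    if sxx == 0 or syy == 0:
        return None

    slope = sxy / sxx                 # additive effect per allele copy
    r_squared = sxy ** 2 / (sxx * syy)  # share of trait variance explained
    return slope, r_squared


if __name__ == "__main__":
    # Toy data: a hypothetical color score that rises with each allele copy.
    dosage = [0, 0, 1, 1, 2, 2]
    color_score = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]
    print(snp_association(dosage, color_score))
```

In practice this test would be repeated for every SNP and the resulting statistics compared against a genome-wide significance threshold, with candidate genes then sought near the strongest signals.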

Genes associated with important traits
The development of whole genome sequencing, de novo transcriptomics, and molecular toolkits for functional analysis has allowed the identification, quantitative comparison, and functional characterization of genes related to important traits in orchids. Key genes and transcription factors (TFs) involved in floral patterning, flowering time, coloration, floral scent, colorful leaves, and other important properties have been identified and validated (Fig. 3). These studies open up broad possibilities in orchid research that could translate complex orchid biology into quantitative information and lay a solid foundation for orchid molecular breeding.

Floral patterning
Floral homeotic genes that encode MADS-box transcription factors play crucial roles in flower development, and type II MADS-box genes are renowned for their roles in the specification of floral organ identity [118]. Most orchids display a zygomorphic flower with a distinctive lip in the second whorl; this particular floral architecture presents an exciting opportunity to examine the classical ABCDE model of flower development. Since the 'Orchid Code model' [119], 'HOT model' [120] and 'Perianth code' (P code) [121] were proposed, a wide-ranging exploration of the transcriptomes and expression profiles of Cymbidium species and their mutants has been conducted to investigate flower formation in orchids [122][123][124][125]. The results of the transcriptome-based analyses were further verified by whole genome data for C. ensifolium [104], C. sinense [102] and C. goeringii [103], which suggested that B- and E-class MADS genes play fundamental and dualistic roles in determining perianth formation, while C- and D-class genes are involved in carpel and gynostemium (column) development (Fig. 4). To better understand the characteristics of ABCDE genes among different lineages, we identified and classified the MADS-box genes for all available orchid genomes (Table 2). Gene duplication in type II MADS genes seems common in orchids [3,104], and most ABCDE clades have similar gene numbers in closely related species, while some have markedly higher or lower copy numbers (e.g. Bs in C. ensifolium and C/D in G. elata). In Apostasia, fewer B-AP3 and E-class genes are thought to result in an undifferentiated lip and a partially fused column, giving the flower an actinomorphic appearance [91]; however, owing to the dearth of knowledge about orchid flower development, the evolutionary pattern of gain and loss of ABCDE genes remains to be investigated.
Nevertheless, an increasing number of functional studies have been carried out to elucidate the function of these floral organ identity genes. Ectopic expression and virus-induced gene silencing (VIGS) were used to investigate the role of SEPALLATA (SEP) genes in determining floral organ identity in Phalaenopsis [126]. The model of perianth formation in orchids has been validated by the suppression of L complex activity in lips of Oncidium and Phalaenopsis, demonstrating that AGAMOUS-LIKE6 (AGL6)-like MADS-box genes are exclusively required for development of the lip, a specialized petal in most orchids [121]. Ectopic expression in Arabidopsis of PeMADS28, a Bs gene of the MADS-box gene family, can lead to abnormal ovule development, indicating the conserved function of Bs genes in ovule integument development [127]. Transgenic Dendrobium was generated to elucidate the functions of C- and D-class MADS-box genes, in which two AGAMOUS (AG) genes, DOAG1 and DOAG2, were downregulated by microRNA interference [128]. The results showed that the two genes play different roles in specifying reproductive organ identity: DOAG1 affects floral meristem determinacy and floral organ development, while DOAG2 regulates perianth and gynostemium (column) development. A study based on VIGS and transient overexpression in Pha. equestris revealed that two DROOPING LEAF/CRABS CLAW (DL/CRC) genes play an important role in the innovation of orchid reproductive organs [129]. Transcriptome analysis combined with yeast two-hybrid assays of a greenish-flower mutant of Habenaria radiata showed that this phenotype is caused by loss of function of an E-class MADS-box gene (HrSEP-1), which plays an important role in column, lip, and petal development [130]. In addition to MADS-box genes, it was also found that overexpression in transgenic Arabidopsis of miR390 from C. goeringii, a miRNA involved in diverse processes of plant growth and development, can alter normal reproductive organ development [131]. Valoroso et al. [132] found that DIV, RAD, and DRIF in the MYB gene family might be involved in the establishment of flower bilateral symmetry in Orchis italica, an orchid with resupinate flowers. Other regulatory mechanisms, including the coordinated signaling of phytohormones such as ethylene, auxin, GA, and ABA, also play important roles in orchid floral organ development [133,134]. All these studies indicate that the diverse roles of MADS-box genes and other floral identity regulators have contributed to the specialized flower morphogenesis of orchids.

Floral color
Anthocyanins and carotenoids are the major pigments that determine flower color, and key TFs such as MYBs play prominent roles in their biosynthesis [135]. In Phalaenopsis cultivars, the differential expression of PeMYB2, PeMYB11, and PeMYB12 was concomitant with complex floral pigmentation patterning [136]. Transient overexpression of these R2R3-MYBs in Phalaenopsis verified that they promote anthocyanin accumulation and can activate the expression of downstream structural genes. VIGS and transient overexpression of PeMYB4L, an R2R3-MYB, verified its role in regulating anthocyanin biosynthesis in Phalaenopsis [137]. In Cattleya, transient overexpression of two R2R3-MYBs (RcPCP and RcPAP) in a Phalaenopsis hybrid elevated the expression of structural genes involved in carotenoid and anthocyanin biosynthesis, respectively, resulting in yellow or red pigmentation in the Phalaenopsis perianth [138]. The insertion of a retrotransposon, Harlequin Orchid RetroTransposon 1 (HORT1), in PeMYB11 enhanced its expression and resulted in high accumulation of anthocyanins in harlequin flowers [139].
It is also widely acknowledged that MYB-basic-helix-loop-helix (bHLH)-WD40 repeat (WDR) (MBW) ternary complexes are key regulators of floral pigmentation and patterning [140]. Comparative transcriptomics of Pleione showed that an MBW protein complex formed by PlMYB10, PlbHLH20 or PlbHLH26, and PlWD40-1 can repress the expression of flavonol synthase (FLS), a key structural gene in the anthocyanin biosynthesis pathway that contributes to color formation, resulting in color variations in wild populations [141]. Transient overexpression of a bHLH protein, PeMYC4, together with an R2R3-MYB, PeMYB4L, negatively regulated anthocyanin accumulation by inhibiting the expression of the structural gene chalcone synthase (CHS) [137]. The structural genes involved in anthocyanin biosynthesis have also been extensively characterized in orchids. Metabolite profiling and transcriptomic analysis of sexually deceptive Chiloglottis showed downregulation and upregulation of FLS at the young and mature bud stages, respectively, contributing to the stark color contrast between callus and lip, which enhances the visibility of the mimicry [142]. In C. kanran, transient expression of three structural genes, CHS, dihydroflavonol reductase (DFR), and anthocyanidin synthase (ANS), caused purple-red pigmentation in white C. kanran flowers [143]. In Paphiopedilum hirsutissimum, two structural genes (F3H and CHS) involved in anthocyanin biosynthesis and major carotenoid biosynthesis genes including VDE, NCED, and ABA2 were strongly downregulated in the albino phenotype compared with normal flowers, suggesting that the color variation of Pap. hirsutissimum results from changes in anthocyanin and carotenoid contents [144]. Anthocyanin biosynthesis is also modulated by several phytohormones including ethylene, cytokinins, and ABA [145][146][147].
In orchids, two transcriptome-based analyses in Dendrobium showed that auxin, ABA, and ethylene play various modulating roles in flower color formation [148,149]. However, the direct association between color formation and phytohormones, in the presence of other regulators that activate anthocyanin biosynthesis, remains unknown in orchids. Apart from MYB TFs, on the basis of the P code in orchids, Hsu et al. [121] found that class B and AGL6 MADS-box genes have functions beyond determining floral identities: B and AGL6 proteins form L (OAP3-2/OAGL6-2/OPI) and SP (OAP3-1/OAGL6-1/OPI) complexes that regulate pigmentation of the Phalaenopsis perianth. Also, AP3/AGL6 genes may act together with RcPAP1/2 and RcPCP1 in shaping the spatiotemporal pattern of color formation in Cattleya, further supporting a potential role of MADS-box genes in regulating color differentiation of flower segments [138].

Floral scent
Floral scent is one of the most important signals for attracting pollinators, especially for some sexually and food-deceptive orchids [150,151]. Fragrant orchids are also coveted commodities in the global orchid market. Studies on orchid floral scents have mainly focused on characterizing genes that encode the key enzymes responsible for the synthesis of volatile compounds such as terpenes and methyl jasmonate (MeJA) [152]. Volatile terpenoid metabolism genes including farnesyl diphosphate synthase (FDPS), acetyl-CoA C-acetyltransferase (AACT), hydroxymethylglutaryl-CoA reductase (HMGR), linalool synthase (LIS), and 3-hydroxy-3-methylglutaryl-CoA synthase (HMGS) were found to be responsible for the floral scent of C. goeringii [103,153]. In C. faberi, jasmonic acid carboxyl methyltransferase (JMT), an important gene in the MeJA biosynthetic pathway, was highly upregulated in blooming flowers [154]. Similarly, MeJA biosynthesis-related genes were found to exhibit maximal expression levels in opened flowers of C. ensifolium [155]. In terms of floral scent regulation, the CIRCADIAN CLOCK ASSOCIATED1 (CCA1) TF was reported to regulate diurnal emission of floral scent in Oncidium [156], and diurnal scent emission of Pha. violacea was positively regulated by circadian clock and light factors that are associated with structural genes and TFs involved in monoterpene biosynthesis [157]. Functional characterization of floral scent genes has been carried out in Dendrobium, Phalaenopsis, and Cymbidium. In D. officinale, genes encoding geraniol synthase (GES) were transiently expressed in Nicotiana benthamiana leaves, resulting in the accumulation of geraniol in vivo [158]. Functional analysis of several TFs (bHLH, Ethylene Response Factor (ERF), and NAM, ATAF, and CUC (NAC)) and TPS genes involved in monoterpene biosynthesis was performed for Pha. bellina, indicating that they can induce monoterpene production in scentless orchids [159,160]. TPS genes of C. faberi have also been ectopically expressed in Escherichia coli, with β-myrcene, geraniol, and α-pinene as the main products catalyzed by CfTPS18 [112]. Still, the types, regulation, and function of volatile compound biosynthesis require further study in orchids. Further progress in transcriptomic and functional studies may lead to more solid findings on the molecular basis of orchid floral scent.
Table 2. Numbers and categories of MADS-box genes in published orchid genomes.

Colorful leaf
Colorful leaves are another ornamental character of orchids. Knowledge of the mechanisms that give rise to colorful leaves could help with the effective selection of desirable leaf traits. Leaf color is primarily determined by pigments such as chlorophyll and anthocyanin, and mutation or differential expression of genes associated with their biosynthesis and metabolism may lead to leaf color variations [161]. Differential expression of genes involved in chlorophyll biosynthesis and degradation was found to be responsible for the yellow- or silver-leaf phenotype in Cymbidium [104,162,163]. Comparative transcriptomic analysis between normal and yellow color-mutant leaves of C. sinense 'Dharma' showed that chlorophyllase (CHL2) and red Chl catabolite reductase (RCCR), key genes involved in chlorophyll degradation, were highly expressed in the yellow-leaf mutant [164]. In C. sinense 'Red Sun', metabolic changes during the red-to-green leaf color transition were qualitatively and quantitatively analysed [165]. UPLC-MS/MS and PCR results showed that decreasing levels of 15 metabolites associated with anthocyanin biosynthesis, together with the downregulation of anthocyanin synthesis genes, contribute to the leaf variegation of C. sinense 'Red Sun'. The purple-leaf mutant of D. bigibbum is considered to be associated with increased expression of MYB2, and transient expression of DbMYB2 in N. benthamiana elevated the expression of endogenous anthocyanin genes and thereby increased anthocyanin levels [166]. A transcriptomic study of two Paphiopedilum species with green and tessellated leaves revealed that their differentially expressed genes were mostly enriched in terms such as chloroplast, cytoplasm, thylakoid membrane, and nucleus [167]. For a Pha. aphrodite mutant with leaf variegation, functional deficiency of the PHOTOSYSTEM II SUBUNIT P (PsbP) protein, an extrinsic subunit of photosystem II, is considered to play an important role in the formation of leaf variegation [168].

Flowering time
The vegetative-to-reproductive transition is a key phase of orchid development, and flowering regulation is critical in commercial orchid production. The floral transition of orchids is influenced by environmental cues such as ambient temperature and photoperiod [169], endogenous signals including phytohormones like gibberellic acid (GA) [170] and ABA [171], as well as genetic factors. The regulatory mechanism of orchid flowering varies among species owing to differing growth conditions, whereas several flowering-related genes, either flowering promoters or repressors, that have been functionally identified in Phalaenopsis, Dendrobium, Oncidium, and other genera affect the floral transition in a similar pattern [172][173][174][175][176]. It has been reported for C. goeringii and Phalaenopsis that ectopic overexpression of the FLOWERING LOCUS T (FT) gene can lead to an earlier flowering phenotype in transgenic Arabidopsis, with significant upregulation of other flowering time-related genes [177,178]. The floral meristem identity gene LEAFY accumulates high transcript levels in the floral meristem primordia of Pha. aphrodite, and its overexpression in rice results in precocious heading [174]. Also, VIGS-PhapLFY Pha. aphrodite plants showed abnormal flower phenotypes such as reduced pigmentation and morphological alterations in epidermal cells. Other orchid flowering promoters, including CONSTANS (CO), SUPPRESSOR OF OVEREXPRESSION OF CONSTANS 1 (SOC1), and APETALA1 (AP1), can also lead to early-flowering phenotypes in the different types of transgenic plants examined [179][180][181]. In contrast, overexpression of SHORT VEGETATIVE PHASE (SVP) in C. sinense can significantly inhibit the expression of FT, SOC1, and AP1 [102]. Phalaenopsis SVP also negatively regulated the floral transition by repressing the expression of PhaFTs [182].
Another flowering repressor, TERMINAL FLOWER1 (TFL1), acts antagonistically to FT in repressing flowering, and the overexpression of TFL1 in Dendrobium causes delayed flowering and defective floral traits [175,183]. Moreover, TFs such as MADS-box and zinc finger proteins are thought to play an important role in mediating multiple signaling pathways in orchid flowering [184,185]. miRNAs are also suggested to be involved in flowering regulation by targeting key genes that function in the vegetative-to-reproductive transition [186][187][188].
In sum, the current understanding of the regulatory network of orchid flowering is based on well-characterized genes from model plants, and novel regulators of the floral transition remain to be investigated. The next challenge is to establish reliable protocols for altering the flowering time of orchids to meet market demand.

Other properties
Genes related to other phenotypic traits or biological processes have also been characterized in orchids. For example, secondary metabolites are critical in plant physiology in terms of stress and disease resistance. Functional verification was conducted for C-glycosyltransferase (CGT) genes of D. catenatum, which could specifically catalyze O-glycosylation (OGT) that contributes to drought resistance [189]. Phalaenopsis is distinguished by its long-lasting flowers, a feature determined by the presence of cuticular waxes in the perianth [190]. Ectopic overexpression of R2R3-MYB genes (PaMYB9A1 and PaMYB9A2) of Pha. aphrodite in transgenic tobacco gives rise to a shiny-leaf phenotype, indicating that these two genes regulate the biosynthesis of cuticular wax [191]. The WRKY and ARF gene families have been functionally studied in C. goeringii, and their expression in transgenic Arabidopsis was highly upregulated after ABA and indoleacetic acid (IAA) treatment, respectively, suggesting roles in the response to abiotic stress [192,193]. These functional genomics studies have made it increasingly possible to verify genes associated with numerous regulatory networks and link them with desired traits, thus providing valuable resources for orchid molecular breeding and industrialization.

Biotechnology-aided orchid production
Orchids have several merits, including an elegant appearance, extended longevity, and a wide variety of cultivars, which have allowed them to thrive in floriculture and gain widespread popularity among growers and consumers. Orchids hold a significant position in the flower market, constituting more than 10% of the international pot plant trade, with a 3.0% average annual growth in global imports of cut flowers [7]. In addition to their ornamental value, orchids such as Dendrobium, Gastrodia, and Bletilla have long been cultivated for medicinal use in Asia and Europe [194], and some (Vanilla) have been used for human consumption as flavorings and in beverages [8].
Although still on a small scale, orchid extracts have also been gradually applied and traded in the cosmetic, personal care, and fragrance industries [195]. The significant increase in the production and consumption of orchids is largely owing to advanced breeding and cultivation techniques. Over the past two decades, tissue culture techniques combined with greenhouse cultivation have greatly sped up orchid industrialization. More importantly, advanced biotechnologies such as Agrobacterium-mediated transformation, particle bombardment, and CRISPR/Cas9 genome editing for molecular breeding and large-scale propagation are increasingly likely to be applied in the orchid industry to produce novel varieties with a longer shelf life and a shortened production period, contributing to successful commercial orchid farming.

Orchid propagation
Propagation techniques have gained major industrial importance for ornamental plants like orchids, for which commercial demand far exceeds natural regeneration. Asymbiotic seed germination is a common method of orchid multiplication, but it is difficult for some terrestrial orchids (e.g. Cypripedium) and less favorable for large-scale production owing to the long juvenile period and heterozygosity of the progeny [196]. To obtain propagated orchids with genetic stability and uniformity, micropropagation, usually through diverse forms of tissue culture (e.g. protocorm-like bodies (PLBs)), has been routinely used for mass commercial production and for regeneration of endangered orchids [197]. Propagation protocols have been established for many species of commercial value or conservation concern [198,199]. Despite these advantages, somaclonal variations induced by prolonged clonal propagation and cryptic genetic effects under microenvironments have become the major limitations to maintaining certain desired traits [200]. Therefore, molecular markers have been applied to examine the genetic variability of in vitro raised plants to ensure clonal fidelity [201,202]. In this context, biotechnological approaches have been highlighted as promising tools for orchid micropropagation and, in turn, commercial production.
Conventionally, it takes 3 to 13 years to produce a flowering plant from seed, and the transition from the juvenile period to reproductive development is restricted by environmental conditions such as vernalization [203]. Intriguingly, under certain combinations of growing conditions, orchid protocorms are able to bypass the vegetative phase and flower directly without leaf initiation and root development. This process is reported to be co-regulated by key TFs such as KNOX, R2R3-MYB, and Ovate Family Protein (OFP) that can induce a different flowering program [204]. Such rapid in vitro flowering has great potential for commercial use, especially for species with a long juvenile period. In addition to shortening the reproduction cycle, enhancing the production of useful secondary metabolites has become an important objective of in vitro orchid propagation. Micropropagated orchids treated with illumination, abiotic and biotic stress, bioreactors, or precursors showed an accumulation of secondary metabolites, indicating the positive effects of these treatments on end-product yield [205][206][207]. Overall, technical innovations have built the path from laboratory to commercialization for orchid micropropagation, which is expected to gain momentum and give rise to more elite orchid varieties and byproducts.

Genetic transformation
There are two main types of gene manipulation techniques: overexpression of exogenous genes (genetic transformation) and silencing of endogenous genes (gene editing/silencing). Genetic transformation is a sought-after technology for introducing agronomically useful genes into target plant species to modify or recombine existing traits, expanding the gene pool beyond what is present in the species of interest or available to conventional breeding. Traditional breeding strategies such as crossbreeding to improve particular traits are also time-consuming for orchids, which have prolonged reproductive cycles and require multiple backcrossing attempts. Therefore, genetic transformation holds tremendous potential for efficient genetic enhancement of important ornamental plants like orchids. Agrobacterium-mediated transformation and particle bombardment are the two major strategies employed to produce transgenic orchids [208]. These two methods have been increasingly applied in studies that aim to transfer into orchids genes that could give rise to commercially important traits, such as novel flower color, altered flowering time, pathogen and virus resistance, and cold tolerance (Table 3). These attempts demonstrated efficient, possibly heritable, and promising gene transformation for modifying specific characteristics of orchids. However, although bioinformatics analyses have implicated a wide spectrum of genes in orchid characteristics, only a small proportion of these have been functionally verified by gene transformation studies. Besides, it can take many years of painstaking research to develop transformation methods for different species, because some orchids (e.g. Cypripedium) are recalcitrant to in vitro propagation, as mentioned earlier, and because of the long reproduction cycle and the limited availability of genomic data.
Even after a transgenic plant carrying the target gene is successfully generated, the desired phenotype and associated traits may be altered by somaclonal variation during propagation [208]. So far, transgenic studies have been done exclusively within the subfamily Epidendroideae, with Cymbidium, Dendrobium, Oncidium, Vanda, and Phalaenopsis as the representative genera for fundamental research and breeding applications. Routine genetic transformation protocols for other taxa still lag considerably behind. Nevertheless, with the ongoing release of high-quality whole-genome sequences for more orchids and the advent of cutting-edge techniques in molecular biology, our understanding of the intrinsic mechanisms of orchid traits, and our ability to modify the genes underlying desired traits, will be fundamentally improved.

Genome editing
Genome editing refers to a group of recently developed genetic engineering technologies in which programmed nucleases containing sequence-specific DNA-binding domains are employed to induce targeted DNA double-strand breaks (DSBs) in the host, which stimulate cellular DNA repair mechanisms [209]. These site-specific nuclease technologies enable specific target DNA to be added, removed, or altered at particular locations in the genome, providing great opportunities for plant genome engineering [210]. Before the emergence of sequence-specific nucleases, RNAi-mediated gene knock-down represented an efficient, low-cost, and high-throughput alternative to targeted gene silencing by homology-directed recombination [211]. Several flower color/pigmentation-related studies used RNAi-mediated gene knock-down in orchids, including Dendrobium Sonia [212], Oncidium Gower Ramsey [213], and Oncidium Honey Angel [214]. However, RNAi has drawbacks, namely unpredictable off-target effects and only temporary loss of function, which could hinder its practical application to a certain extent because of the limited linkage between phenotype and genotype [209]. The clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein 9 (Cas9) system represents a newly developed, efficient tool for introducing site-specific DSBs [215], by which the target gene can be completely knocked out and the genomic change can be maintained and inherited by the progeny. Although well-established transformation systems are far from complete in Orchidaceae, this technology has been successfully applied in both natural species and their cultivars. Kui et al. [216] reported a successful knockout of five target genes (C3H, C4H, 4CL, CCR, and IRX) involved in the lignocellulose biosynthesis pathway in D. officinale by CRISPR/Cas9 gene editing.
By examining mutation rates at the locus of each target gene, they concluded that the established CRISPR/Cas9 system can efficiently edit endogenous genes in D. officinale and introduce mutations. Three MADS-box genes (MADS44, MADS36, and MADS8) of Pha. equestris were edited by CRISPR/Cas using several single-guide RNAs (sgRNAs), generating different combinations of mutants [217]. This study illustrated the possibility that several sgRNAs can be assembled into a library and transformed to create mutants with multiple phenotypes. However, whether the editing procedures reported in these two studies can actually produce knockout phenotypes in orchids remains unknown. So far, only two subsequent studies have provided validated phenotypes associated with the knockout of target genes using CRISPR/Cas9 technology. In Pha. amabilis, the phytoene desaturase (PDS3) gene, which encodes a rate-limiting enzyme in carotenoid synthesis, was edited with the CRISPR/Cas9 system, and the transformants showed an albino phenotype [218]. Li et al. [219] generated a kilobase-scale genomic deletion at the DOTFL1 locus of Dendrobium Chao Praya Smile. The resulting dotfl1 mutants exhibited earlier occurrence of flowering, pseudobulb formation, and termination of inflorescence apices, which are more explicit phenotypes related to the floral transition than those of the previously reported DOTFL1 knockdown lines generated by RNAi-mediated silencing [183]. However, none of these studies has yet verified whether the edited phenotypes are retained in F2 and F3 selfed and crossed progeny, so the gap between research experiments and industrial application remains to be filled.
Although they started late and the field is still in its infancy, these studies show the great potential of CRISPR/Cas9 systems for efficient and accurate modification of orchid traits, and for breeding new and improved orchid varieties without the lengthy procedures and unexpected variables associated with crossbreeding.
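To make the sequence-specific targeting step concrete, the following is a minimal sketch, not taken from any of the cited studies, of how candidate sgRNA target sites can be enumerated in a gene sequence. It assumes the widely used Streptococcus pyogenes Cas9, which requires an NGG protospacer-adjacent motif (PAM) immediately 3′ of a 20-nt protospacer. The function names and the toy sequence are hypothetical, and a real design workflow would additionally screen each candidate for off-target matches across the genome.

```python
import re

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))

def find_protospacers(seq: str, length: int = 20):
    """List candidate Cas9 target sites on both strands.

    Assumption: SpCas9 with an NGG PAM directly 3' of the protospacer.
    Returns (strand, start, protospacer, pam) tuples; for the "-" strand,
    start is an index into the reverse-complemented sequence.
    """
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % length)
    hits = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for m in pattern.finditer(s):
            hits.append((strand, m.start(), m.group(1), m.group(2)))
    return hits

if __name__ == "__main__":
    # Hypothetical toy sequence, not a real orchid gene fragment.
    toy = "ATGCATGCATGCATGCATGCTGGAAAA"
    for strand, start, protospacer, pam in find_protospacers(toy):
        print(strand, start, protospacer, pam)
```

Each reported protospacer is only a candidate; which candidates actually yield heritable edits in a given orchid still has to be established experimentally, as the studies above illustrate.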

Future prospects
Over the past two decades, the explosion of genomic sequencing technologies combined with recent advances in transgenic techniques has revolutionized basic and applied orchid biology. With the wealth of phylogenetic information available, the controversial taxonomy of more and more orchid groups has been resolved, framing a reliable evolutionary relationship for this extraordinarily diverse family. Conservation genetics based on molecular markers enables the effective translation of underlying threats into practical conservation measures tailored to different orchid communities. The ongoing release of orchid whole genomes has expanded the possibilities for multi-omics studies in orchids in terms of genome mapping, gene expression, comparative genomics, and functional validation. Progress in these studies provides new insights into the origin, evolution, and diversification of Orchidaceae, as well as a molecular view of the regulatory mechanisms underlying exquisite floral morphology, complex life histories, and unrivaled reproductive strategies. As genetic transformation and genome editing have become increasingly routine techniques in plant genome engineering, precise gene characterization and modification are feasible in orchids with complex genomes. Besides, tools like the CRISPR/Cas9 system offer new opportunities for gene stacking, in which adding genes of interest with desired traits and knocking out genes associated with undesirable traits can be achieved simultaneously in a single step [217,220]. Although current achievements in orchid genome engineering are limited, these swift and robust approaches pave the way for efficient genetic improvement of important traits in orchids in the foreseeable future. Evidently, we are now entering a post-genomic era for orchid research and industrialization.
Although recent advances have unveiled many facets of this intriguing family, its scientific significance has not yet been fully realized. The lack of genome data for most orchid species still hampers in-depth comparative genomic analysis. Genetic investigation and functional analysis mainly focus on well-characterized gene families such as MADS-box and MYB that have been extensively studied in model plants, with less effort put into identifying novel regulators. Last but not least, the genetic transformation protocols currently available are derived from a few model orchids such as Dendrobium and Phalaenopsis and may not be applicable to many other orchid species with specialized features; moreover, transformation efficiency in orchids is relatively low compared with model plants such as Arabidopsis and rice. These pending problems will require multifaceted investment to guide future work. Taken together, the ongoing accomplishments in developing novel genomic tools and techniques have shed new light on orchid research and production. Our critical next step is to realize their full potential in orchid conservation, breeding, and industrialization.