Chromosome-scale genome assembly of Prunus pusilliflora provides novel insights into genome evolution, disease resistance, and dormancy release in Cerasus L.

Abstract Prunus pusilliflora is a wild cherry germplasm resource distributed mainly in Southwest China. Despite its ornamental and economic value, a high-quality assembled P. pusilliflora genome is unavailable, hindering our understanding of its genetic background, population diversity, and evolutionary processes. Here, we de novo assembled a chromosome-scale P. pusilliflora genome using Oxford Nanopore, Illumina, and chromosome conformation capture sequencing. The assembled genome size was 309.62 Mb, with 76 scaffolds anchored to eight pseudochromosomes. We predicted 33 035 protein-coding genes, functionally annotated 98.27% of them, and identified repetitive sequences covering 49.08% of the genome. We found that P. pusilliflora is closely related to Prunus serrulata and Prunus yedoensis, having diverged from them ~41.8 million years ago. A comparative genomic analysis revealed that P. pusilliflora has 643 expanded and 1128 contracted gene families. Furthermore, we found that P. pusilliflora is more resistant to Colletotrichum viniferum, Phytophthora capsici, and Pseudomonas syringae pv. tomato (Pst) DC3000 infections than cultivated Prunus avium. P. pusilliflora also has considerably more nucleotide-binding site-type resistance gene analogs than P. avium, which explains its stronger disease resistance. The cytochrome P450 and WRKY families of 263 and 61 proteins were divided into 42 and 8 subfamilies respectively in P. pusilliflora. Furthermore, 81 MADS-box genes were identified in P. pusilliflora, accompanying expansions of the SVP and AGL15 subfamilies and loss of the TM3 subfamily. Our assembly of a high-quality P. pusilliflora genome will be valuable for further research on cherries and molecular breeding.


Introduction
The Rosaceae family consists of ∼3000 species, distributed across 90 genera with abundant fruit types [1]. It contains most of the temperate fruit species categorized as stone and pome fruits depending on their fruit morphology, with fleshy fruits that are abundant in organic acids, carbohydrates, vitamins, carotene, and minerals. The stone fruit Prunus pusilliflora (Ppus) Card. belongs to the subgenus Cerasus in the Rosaceae family and might be the parent of several flowering and fresh Chinese cherry germplasm resources. Widely distributed in Yunnan and Sichuan provinces of Southwest China, endemic Ppus is a wild woody plant that grows naturally on the sides of ravines and sunny mountain slopes at altitudes of 1400-2600 m. The plant has dark-green leaves with acuminate serrate teeth, corymbose-racemose inflorescences with three to seven flowers, white single suborbicular petals, Prunus yedoensis (Pyed), and Prunus serrulata (Pser). Insufficient systematic classification and biological evidence have generated confusion regarding the taxonomic groups of Ppus and other Cerasus species. Complicating matters, few investigations have been launched on Ppus, resulting in a lack of specimen collections, mining and utilization of morphological and molecular markers, and phylogenetic analyses. Considering that genomic studies have contributed to solving these issues to a certain extent, we conducted de novo genome assembly of Ppus, with the aim of providing a scientific basis for investigating the evolutionary processes in this species.
Commercial cherry production is confronted with many challenges from biotic and abiotic stresses. Bacterial canker caused by Pseudomonas syringae is one of the most devastating diseases in cherries [4,5], having caused tremendous losses in the global yield of cherries [6][7][8]. Other microorganisms, such as Phytophthora, Colletotrichum, and Botrytis cinerea, inflict diseases that limit cherry productivity [9][10][11][12]. Most plant resistance genes (R genes) belong to the nucleotide-binding site leucine-rich repeat (NLR) receptor family, which confers resistance to various pathogens, including bacteria, viruses, oomycetes, and fungi [13,14]. The WRKY gene family also plays crucial roles in pathogen defense and environmental stress responses [15][16][17]. The cytochrome P450 monooxygenases (CYP450), a family of heme-thiolate proteins, protect plants from diseases and insect infestations [18,19]. Therefore, a major objective of cherry breeding programs worldwide is to improve the disease resistance and abiotic stress tolerance of cultivated cherries by investigating the resistance and tolerance genes.
Bud dormancy, a complex process comprising many biological events, is essential for cherry growth and development. Its release is triggered by long-term exposure to cold, and cold accumulation in winter is commonly referred to as the chilling requirement. Global warming has led to inadequate chilling accumulation in winter, which has caused physiological disorders along with some negative effects on flowering, bud sprouting, and fruit production [20]. The Dormancy-Associated MADS-Box (DAM) genes, belonging to the SHORT VEGETATIVE PHASE (SVP)/AGAMOUS 24 (AGL24) subfamily of the MADS-box family, are involved in dormancy regulation [20,21]. The large fragment deletion involving DAM1-4, which also eliminates DAM5 and DAM6 expression, stops bud growth cessation in the evergrowing (evg) peach mutant [22]. Therefore, elucidating the mechanism controlling dormancy release that involves MADS-box family members might help address some issues caused by climate change.
The combined use of Oxford Nanopore Technologies (ONT) sequencing, next-generation sequencing (NGS), and chromosome conformation capture (Hi-C) sequencing has been particularly fruitful for genome assembly [23][24][25]. Owing to the highly heterozygous genetic background of Cerasus species, their genome sequencing and assembly are challenging. Nevertheless, high-quality genome assemblies have contributed to clarifying the phylogenetic relationships and resolving taxonomic controversies in this subgenus. In fact, whole genomes have been sequenced in various Prunus crops, including P. avium (Pavi) [3], Pser [26], Pyed [27], P. fruticosa [28], P. dulcis (Pdul) [29], P. domestica [30], P. salicina [31], and P. mume [32]. As such data were previously unavailable for Ppus, we generated a high-quality chromosome-level genome assembly. We then compared the Ppus genome with the publicly available Cerasus L. genomes and investigated gene family evolution, positive selection, and disease resistance in Ppus. This study provides a solid foundation for elucidating the genetic diversity, variation, phylogenetic hierarchy, and mechanism underlying the strong disease resistance of Ppus. The sequenced genome will be a valuable resource for basic research on cherries and molecular breeding.

Genome sequencing and assembly
The Ppus genome is estimated to be 303.03 Mb based on k-mer frequencies of Illumina short reads (Supplementary Data Table S2). We generated the Ppus genome by integrating NGS, ONT, and Hi-C sequencing. We obtained 93 Table S5). The genome size of Ppus resembled that of Pyed var. nudiflora [27], was larger than that of Pser [26], and was smaller than that of Pavi cv. Tieton [3] (Table 1). A Hi-C interaction heat map indicated that the Ppus genome had no obvious assembly errors and comprised eight clusters at the chromosomal level (Supplementary Data Fig. S1).
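A k-mer-based size estimate of the kind mentioned above usually rests on dividing the total number of k-mer occurrences by the depth of the main peak in the k-mer frequency histogram. The sketch below illustrates only that arithmetic. The histogram values are invented, the function name `estimate_genome_size` and the `min_depth` error cutoff are hypothetical, and this is not the tool or procedure used in this study.

```python
def estimate_genome_size(histogram, min_depth=5):
    """Estimate genome size (bp) from a k-mer depth histogram.

    histogram maps k-mer depth -> number of distinct k-mers at that depth.
    Depths below min_depth are treated as sequencing-error k-mers and ignored
    (min_depth is an illustrative assumption, not a value from the paper).
    """
    usable = {d: n for d, n in histogram.items() if d >= min_depth}
    # Main peak: the depth at which the most distinct k-mers occur.
    peak_depth = max(usable, key=usable.get)
    # Total k-mer occurrences = sum over depths of (depth * distinct k-mers).
    total_kmers = sum(d * n for d, n in usable.items())
    return total_kmers / peak_depth


# Toy histogram (invented numbers): low-depth error k-mers plus a peak at depth 10.
toy_histogram = {1: 900, 2: 300, 9: 50, 10: 100, 11: 50}
size = estimate_genome_size(toy_histogram)
print(size)
```

Real estimates additionally model heterozygous and repeat peaks, which this simplified sketch ignores.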
To evaluate genome quality and completeness, Illumina paired-end short reads were aligned to the final assembled genome using the Burrows-Wheeler Alignment-Maximal Exact Match (BWA-MEM) software. Approximately 96.73% of the reads mapped to the assembly (Supplementary Data Table S6). Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis of the final assembly indicated 98.30% completeness, with only 1.70% missing single-copy orthologs (Supplementary Data Table S7). In addition, as the long terminal repeat (LTR) assembly index (LAI) is often used to evaluate the quality of a genome assembly, we compared the LAI of Ppus with those of several Rosaceae species. We found that the LAI of Ppus (17.35) was only slightly lower than that of the well-assembled Pavi (19.68) but higher than the LAI of Prunus armeniaca (Parm) (16.29), Pyed (6.87), and P. domestica (2.27), indicating that it has a superior assembly quality (Supplementary Data Table S8).

Gene prediction and annotation
We identified 142.44 Mb repetitive sequences (∼49.08% of the genome) including simple repeats and transposable elements (Supplementary Data Table S9). Among these repetitive sequences, ∼121.25 Mb (∼41.79% of the genome) were classified into different types of transposable elements (Supplementary Data Table S10). The repeat-masked genome was used as input data for gene predictors. We annotated 33 035 protein-coding genes in Ppus, supported by homologous and de novo predictions (Supplementary Data Table S11). The BUSCO completeness of the genome (98.30%) and that of the annotated gene set (96.2%) were close, indicating the successful annotation of most genes in the Ppus genome (Supplementary Data Table S7). We functionally annotated 32 463 genes using the non-redundant (NR) (32 453 genes), eggNOG (27 458), Swiss-Prot (22 172), Pfam (21 905), Clusters of Orthologous Groups of proteins (COG) (27 458), Gene Ontology (GO) (10 174), and Kyoto Encyclopedia of Genes and Genomes (KEGG) (12 888) databases (Supplementary Data Tables S12 and S13). We also identified 149 micro-, 756 transfer, 1247 ribosomal, and 276 small nuclear (sn) RNAs in the Ppus genome (Supplementary Data Table S14).
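As a quick consistency check on the figures above, the share of functionally annotated genes can be recomputed from the reported counts. The snippet is only a worked arithmetic illustration; the variable names are ours, and the counts are those stated in the text.

```python
# Counts as reported in the text.
predicted_genes = 33035
annotated_genes = 32463

# Genes with a hit in each database, as reported in the text.
database_hits = {
    "NR": 32453,
    "eggNOG": 27458,
    "Swiss-Prot": 22172,
    "Pfam": 21905,
    "COG": 27458,
    "GO": 10174,
    "KEGG": 12888,
}

# Overall share of predicted genes with any functional annotation.
annotated_pct = round(100 * annotated_genes / predicted_genes, 2)

# Share of predicted genes annotated by each individual database.
coverage = {db: round(100 * n / predicted_genes, 2) for db, n in database_hits.items()}

print(annotated_pct)
for db, pct in sorted(coverage.items(), key=lambda item: -item[1]):
    print(db, pct)
```

The overall value matches the 98.27% annotation rate given in the abstract.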

Syntenic analysis between P. pusilliflora and three Prunus species
To gain a better understanding of the relationship between Ppus and several Prunus species, we performed a syntenic analysis and drew synteny maps after comparing the Ppus genome with Prunus persica (Pper), Pavi, and Pser genomes (Fig. 2A Table S18). Meanwhile, several chromosome inversions were present in the Ppus versus Pavi synteny map (Fig. 2B). Gene syntenic blocks derived from comparing the Ppus

Evolution and gene family expansion analysis
Orthologous clustering was conducted on the Ppus, Pavi, Pper, and Pyed genomes (Supplementary Data Tables S23-S26). We identified 17 767 gene families in the Ppus genome, which was more than the number in the Pper and Pyed genomes and slightly less than that in the Pavi genome (Fig. 3A). Moreover, 11 534 gene families were common to all four Prunus plants, whereas more unique gene families (1144) were found in the Ppus than in the Pper genome (Fig. 3A). To elucidate evolutionary relationships, we performed a comparative genomic analysis of all the identified families based on our BLASTP and Pfam results (Supplementary Data Table S27). We then compared the numbers of single- and multiple-copy orthologs, other orthologs, unique paralogs, and unclustered genes among Arabidopsis thaliana (Atha), Pavi, Pser, Pyed, Ppus, Pper, Pdul, Parm, Rosa chinensis (Rchi), and Vitis vinifera (Vvin), and selected 1938 high-quality single-copy orthologs for phylogenetic reconstruction (Fig. 3B). Detailed statistics of unique, expanded, and contracted gene families in the Ppus genome are shown in Supplementary Data Tables S28-S30. Among 11 824 gene families common to the 10 species, 643 and 1128 gene families expanded and contracted, respectively, in Ppus after speciation from Pavi (Fig. 3C). Ppus contained fewer expanded gene families than did Pavi, Pser, and Pyed (Fig. 3C). However, the number of contracted gene families was greater in Ppus than in Pser. The expanded, contracted, and unique family genes were significantly enriched (P < 0.05) in 37, 102, and 109 GO terms, respectively (Supplementary Data Tables S31-S33) Table S33).

Comparative genomic analysis of P. pusilliflora
We assessed the divergence times of Ppus and nine other species based on the phylogenetic tree (Fig. 3C Fig. S3). The Ks distribution among the seven Prunus genomes revealed that these species diversified very recently.

P. pusilliflora was more resistant to Colletotrichum viniferum, Phytophthora capsici, and Pseudomonas syringae pv. tomato DC3000 than P. avium
We evaluated the disease resistance and susceptibility of Ppus and Pavi leaves inoculated with Colletotrichum viniferum, Phytophthora capsici, Pseudomonas syringae pv. tomato (Pst) DC3000, and B. cinerea (Fig. 4; Supplementary Data Fig. S5). Symptoms were monitored every 3 days. An obvious lesion area persisted on all Ppus and Pavi leaves infected with C. viniferum (Fig. 4A). ImageJ measurements showed a larger lesion area in infected Pavi than in Ppus leaves at 6 and 9 days post-inoculation (dpi) (Fig. 4B). Lactophenol Trypan Blue (TB) staining identified more necrotic cells in Pavi leaves infected with C. viniferum than in Ppus leaves (Fig. 4C). In addition, the ratio of pathogen DNA to plant DNA increased continuously from 1 to 6 dpi, indicating that C. viniferum could infect both Ppus and Pavi (Supplementary Data Fig. S4A). Meanwhile, a significantly greater ratio was observed in Pavi than in Ppus at 6 dpi, indicating that Ppus was more resistant to the fungus C. viniferum than Pavi. Similarly, the lesions on Ppus and Pavi leaves infected with the oomycete P. capsici increased continuously from 3 to 9 dpi but were larger on infected Pavi leaves (Fig. 4D and E). Lactophenol TB staining identified more necrotic cells in Pavi leaves than in Ppus leaves (Fig. 4F). The ratio of pathogen to plant DNA increased continuously from 1 to 9 dpi and was greater in Pavi than in Ppus (Supplementary Data Fig. S4B), suggesting that P. capsici can infect both tested species and that Ppus is more resistant to P. capsici than Pavi.
P. syringae pv. tomato is a vital model pathogen for plant-pathogen interactions [34]. This pathogen caused less severe disease symptoms (water-soaking) in Ppus than in Pavi leaves (Fig. 4G). We investigated whether differences in water-soaking size and disease severity in Ppus and Pavi leaves mirrored the differences in bacterial growth by counting bacteria numbers in the leaves of Ppus and Pavi at 3 and 6 dpi. A slightly lower bacterial number was observed in Ppus than in Pavi leaves at 3 dpi, but the difference was not obvious. The bacterial number in Pavi leaves reached a maximum of 6.5 × 10⁷ colony-forming units (CFU)/cm² at 6 dpi, which exceeded that in Ppus leaves (Fig. 4H). Meanwhile, the ratio of pathogen to plant DNA increased continuously from 3 to 6 dpi and was significantly higher in Pavi than in Ppus at 3 and 6 dpi (Supplementary Data Fig. S4C), indicating that Pst DC3000 can infect both tested species and that Ppus is more resistant to Pst DC3000 than Pavi. In addition, lesion size was larger in Ppus than in Pavi leaves infected with B. cinerea at 3 and 6 dpi (Supplementary Data Fig. S5). Finally, lactophenol TB staining identified more necrotic cells in Ppus leaves than in Pavi leaves (Supplementary Data Fig. S5). These results showed that Ppus is less resistant to the fungus B. cinerea than Pavi. These data together suggested that Ppus was more resistant to C. viniferum, P. capsici, and Pst DC3000 than cultivated Pavi, which might be associated with the natural selection of Ppus in the wild.
Pavi genomes (Fig. 5F) and found 17 collinear pairs on Chr 2 (Supplementary Data Fig. S6B; Supplementary Data Table S46). Some TNL-type proteins participate in the recognition of specific pathogens and play crucial roles in P. syringae resistance [23,36]. Some TN- and TX-type proteins that participate in plant defenses might cooperate with TNL proteins to facilitate pathogen recognition or downstream signaling [37]. Thus, Ppus has evolved more TNL-, TN-, and TX-type transcripts than Pavi, which explains to some degree why Ppus is more resistant to Pst DC3000.

WRKY family in P. pusilliflora, P. persica, P. serrulata, and P. yedoensis
The WRKY family of proteins, first discovered in plants, is characterized by a sequence of 60 amino acids that includes the WRKY domain [38]. These proteins play vital roles in pathogen defense, environmental stress responses, and development [15,16,39]. Genome-wide analysis of WRKY has been performed in several plants, including maize [40], peaches [41], Camellia sinensis [42], and strawberry [17]. We identified 61 WRKY genes in the Ppus genome, 58 in Pper, 60 in Pser, and 78 in Pyed (Supplementary Data Fig. S7B). Notably, Giardia lamblia and Dictyostelium discoideum each only contain one known WRKY gene, whereas gene duplication in Physcomitrella patens has resulted in an increase of 37 WRKY proteins [40]. The WRKY genes rapidly duplicated before monocots and dicots diverged [43]. Monocots also have larger WRKY families than most dicotyledons (Supplementary Data Fig. S7B). For instance, maize contains the largest WRKY family of 136 genes, whereas Ppus has 61. The rapid duplication of WRKY genes, as vital TFs, might contribute to enhancing disease resistance and environmental stress adaptability, and to establishing a better stress-resistance signaling network.
Figure 5. Distribution of RGAs in P. pusilliflora (Ppus) and P. avium (Pavi) chromosomes. Distribution of RGAs along P. pusilliflora (A) and P. avium (C) chromosomes, showing the absolute number of genes homologous to nucleotide-binding site-leucine-rich repeat (NBS-LRR-encoding) proteins, RLKs, RLPs, resistance to powdery mildew 8 (RPW8), and TM-CC proteins along each of the eight chromosomes. Distribution of NBS-LRR-encoding proteins along the P. pusilliflora (B) and P. avium (D) chromosomes. (E) Microsynteny analysis of TNL-type genes between P. pusilliflora and P. avium chromosomes, as indicated by red (representing collinear gene pairs on Chr 8) and green curves, respectively. (F) Microsynteny analysis of NL-type genes between the P. pusilliflora and P. avium chromosomes, as indicated by red (representing collinear gene pairs on Chr 2) and blue curves, respectively. The yellow and purple curves represent Chr1-Chr8 of P. pusilliflora and P. avium, respectively.

Cytochrome P450 family in P. pusilliflora, P. persica, P. serrulata, and P. yedoensis
The cytochrome P450 (CYP450) family catalyzes the biosynthesis of numerous important plant compounds, which are categorized into A- and non-A-types and further subdivided into clans [44,45] Data Fig. S8). We further explored CYP450 evolution and divergence between Ppus and 14 other species. The CYP701, CYP84, CYP72, CYP714, CYP704, and CYP88 genes were most abundant in Ppus.
The CYP99 and CYP723 subfamilies were found only in monocots, whereas CYP82 and CYP716 were found only in dicots (Fig. 6). Twenty-four CYP subfamilies (e.g. CYP89, CYP77, CYP71, CYP81, and CYP76) were common to all 15 species. CYP719 was found only in Nelumbo nucifera. These results showed that some CYP subfamilies (i.e. CYP79, CYP93, and CYP74) were lost only in a single species. The CYP71, CYP72, CYP76, CYP81, and CYP94 subfamilies expanded massively (Fig. 6). Notably, the CYP71 family converts aldoximes to nitriles that participate in resistance to biotic stress [47]. Thus, the rapid duplication of some CYP subfamilies might contribute to improved stress tolerance in plants.

MADS-box family in P. pusilliflora, P. persica, P. serrulata, and P. yedoensis
MADS-box family genes are vital to plant development, especially during dormancy release and the development of flowers and fruits [20,48]. In plants, MADS-box genes are divided into type I and type II lineages based on protein domain structures [49]. Type II MADS-box genes have a conserved MADS-box domain, intervening (I) and keratin-like (K) domains, and a C-terminal (C) region that are sequentially arranged from the N- to the C-termini; these genes are also called MIKC-type genes [48,50]. These genes are further subdivided into MIKC^C and MIKC* types. In Arabidopsis, MIKC^C-type genes are categorized into 12 subfamilies [51,52]. Type I MADS-box genes are classified into Mα, Mβ, Mγ, and Mδ groups based on phylogeny. The Mδ group in Arabidopsis and rice corresponds to the MIKC* type [53].
MADS-box family genes are reported in multiple Prunus species [49,54] Table S49). In accordance with the Atha classification, we divided type I MADS-box genes into Mα (18), Mβ (11), Mγ (11), and Mδ (4) groups in Ppus (Fig. 7). We also categorized Ppus type II MADS-box (Fig. 7). Among these subfamilies, 14 were grouped with their Arabidopsis counterparts. We used grapevine TM8, poplar TM8 (XP_002321711.1), P. mume PmMADS26 (a homologous gene of TM8), and Coffea arabica TOMATO MADS-box 3 (TM3) for phylogenetic analysis because the Arabidopsis genome lacks the TM8 and TM3 subfamilies [49,52,55]. The MADS-box PAV05G034890 of Ppus unambiguously grouped with these three TM8 genes (Fig. 7), indicating that the Ppus genome has only one TM8 member, similar to P. mume and grapevine. The Ppus, Arabidopsis, Pper, Pyed, and Pser genomes notably have no genes that are homologous to TM3, suggesting that this subfamily might be unique to C. arabica. The most expanded type II MADS-box subfamilies were SVP and AGL15, with four each in Ppus and two in Atha. Considering that SVP is important for early flowering during spring, we speculated that the expansion of this subfamily in Ppus is correlated with the control of flowering time. Evolutionary analysis showed that four Ppus MADS-box TFs (i.e. PAV06G025430, PAV08G027210, PAV01G095740, and PAV01G095790) clustered with AtSVP in a clade, suggesting that these four proteins play important roles in the regulation of bud endodormancy.

Discussion
We successfully addressed a gap in knowledge about wild cherry genomes and generated a high-quality chromosomal-level Ppus genome using NGS, ONT, and Hi-C sequencing technologies [56,57]. Because of the strong disease resistance of this species, it is an important germplasm resource for cherry breeding programs, and decoding its genome sequence is of great significance. Due to data limitations, we did not generate a haplotype-resolved assembly in this study, but assembly quality is comparable to or even better than that of other published cherry species [27]. Our Ppus assembly will serve as a high-quality reference genome for further investigations regarding cherries. Moreover, this genome assembly provides necessary data for clarifying the genetic background and evolution of Ppus, Pyed, Pser, and Pavi and the independent domestication of cherries.
Phylogenetic analysis with high-quality single-copy orthologs revealed that the Cerasus species Ppus, Pavi, Pser, and Pyed clustered on a branch with the shortest divergence time and were separate from the Prunus species Pper, Pdul, and Parm. Both the phylogenetic tree and Ks analysis indicated that Pavi diverged earlier than Pser and Pyed, which was consistent with previous findings [26]. Our results showed that Ppus was more closely related to Pser than to Pavi, which is supported by a perfect collinear relationship between Ppus and Pser. We also found that all syntenic blocks between Ppus and Pser matched on the same chromosome, compared with only 65.77% of those between Ppus and Pavi (Supplementary Data Table S19). We did not find any large-scale chromosome inversions or translocations between the Ppus and Pser genomes but did for the Ppus and Pavi genomes (Fig. 2). One potential reason is that Pavi underwent more artificial selection than Pser, resulting in inversions and other chromosome structure variations, though further research is needed to confirm these inversions. Ppus is more distantly related to Pper than to Pavi, yet Ppus showed good collinearity with Pper, possibly because fewer gene syntenic blocks were available with which to build the collinear relationship between Ppus and Pper.
Disease resistance strongly depends on R genes in plants [13]. We found that wild Ppus was more resistant to C. viniferum, P. capsici, and Pst DC3000 than was cultivated Pavi. Although the number of RGAs was comparable between Ppus and Pavi, the former species had considerably more NBS-type RGAs. The majority of NBS-type genes confer resistance to pathogenic viruses, bacteria, oomycetes, and fungi [14]. Thus, we speculated that NBS-type expansion partially enhanced Ppus resistance to some pathogens.
The TNL-type proteins RPS4 and RPS6 recognize P. syringae effectors and confer resistance to P. syringae [36]. The TN- and TX-type proteins cooperate with TNL proteins to facilitate pathogen recognition or downstream signaling [37]. Here, Ppus contained more TNL-, TN-, and TX-type transcripts than did Pavi, which indicated that the expansion of TNL, TN, and TX types might confer stronger resistance to Pst DC3000 on Ppus. Some RLP-type RGAs, such as RLP30 and RLP42, are essential for resistance to B. cinerea [59,60]. We found that Ppus had fewer RLP-type proteins than Pavi, which might explain why Ppus is somewhat less resistant to B. cinerea. Some RLKs play important roles in plant resistance to pathogens [61]. In fact, RLK1 is involved in the hypersensitivity response signaling pathway and functions in P. capsici resistance [62]. We found that Ppus had fewer RLK-type transcripts than Pavi but was more resistant to P. capsici, implying that more dominant genes are involved in resistance to P. capsici in Ppus. Further identification of RGAs will enable us to determine the resistance traits of various types of R genes and apply these findings to breeding programs.
WRKY protein PAV02G008190 was closely related to AtWRKY38 and AtWRKY62 and that PAV01G017570 was closely associated with AtWRKY48, suggesting that they have similar functions in P. syringae defense.
AtCYP76C2 is associated with hypersensitive rapid cell death, which is a defense mechanism for Pst DC3000 infection [67]. Evolutionary analysis showed that several Ppus CYP450s (such as PAV01G058410 and PAV01G058390) were closely associated with AtCYP76C2, implying that they function in resistance to Pst DC3000. A pathogen-induced CYP82C2 gene and other possible CYPs are involved in the biosynthesis of 4-hydroxyindole-3-carbonyl nitrile with cyanogenic functionality against P. syringae [68]. In soybean, GmCYP82A3 is associated with strong resistance to B. cinerea [69]. Our results revealed that 15 Ppus CYP450s and five AtCYP82 genes clustered together, suggesting that they play important roles in resistance to B. cinerea and P. syringae.
As Ppus is an important ornamental tree species that grows during early spring, we focused on the MADS-box family in this study because of its involvement in dormancy release and floral organ development. We identified 81, 77, 97, and 131 MADS-box genes in Ppus, Pper, Pser, and Pyed, respectively. These gene numbers indicate that flowering cherry Pyed has well-developed floral organs, whereas Pper does not, probably because some MADS-box genes had been deleted over a long period of artificial selection. Loss of the TM3 subfamily might affect the transition from vegetative to reproductive growth in the four Prunus species Ppus, Pper, Pser, and Pyed. Because the SVP subfamily is associated with early flowering, its expansion suggests a need for better control of flowering time during the evolution of Ppus. Finally, DAM has been verified to have functions related to the inhibition of bud break in pears [20]; DAM genes, usually named SVP or SVP-like (SVL), mainly participate in the regulation of endodormancy [21]. Our results revealed that four Ppus MADS-box genes were closely related to AtSVP, implying that they function in regulating Ppus bud endodormancy.

Plant materials and DNA extraction
We used DNeasy Plant Mini Kits (Tiangen Biotech Co. Ltd, Beijing, China) to extract high-purity genomic DNA from the fresh and young leaves of an endemic wild Ppus tree aged ∼120 years preserved in its natural habitat (Binchuan County, Dali District, Yunnan Province, China). The concentration and purity of extracted DNA were evaluated using a Nanodrop 2000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA) and a Qubit 3.0 fluorometer (Thermo Fisher Scientific, Waltham, MA, USA). DNA completeness was assessed by 0.8% agarose gel electrophoresis using pulsed-field techniques.

Genomic DNA sequencing
We constructed a paired-end library using GenElute Plant Genomic DNA Miniprep Kits (Sigma-Aldrich, St Louis, MO, USA) for short-read sequencing based on the Illumina HiSeq X Ten (Illumina, San Diego, CA, USA) platform. In addition, an ONT library was constructed for long-read sequencing using an Oxford Nanopore PromethION 48 platform (Oxford Nanopore Technologies, Oxford, UK) at Novogene Co., Ltd (Beijing, China). A Hi-C library was generated as follows. First, fresh and young leaves fixed in formaldehyde were lysed, and the cross-linked DNA was digested overnight using the single four-cutter restriction enzyme DpnII. Then, digested fragments were ligated and biotinylated to form chimeric rings, which were enriched, sheared, and further processed. The Hi-C library was also sequenced based on the Illumina HiSeq X Ten platform. Raw reads were subjected to quality control procedures that involved adapter trimming and removal of low-quality reads. The resultant clean reads were used for subsequent analysis.

Genome assembly evaluation
We used the Burrows-Wheeler Aligner with default parameters [75] to align Illumina reads to the assembly for estimating the coverage ratio. Additionally, we evaluated the completeness and quality of the genome and annotated proteins using BUSCO v3.0.1 with default parameters by mapping them to the embryophyta_odb10 database [76].
Gene families that had contracted or expanded were identified based on family size and phylogeny using CAFE v2.1 (parameters: number of threads = 10, P = 0.05, number of random = 1000, and search for lambda) [103]. Each gene module was subjected to functional enrichment analysis with GO and KEGG.
Seed files corresponding to the CYP450 (PF00067), WRKY (PF03106), and MADS-box (PF00319) gene families were obtained at the website of the Pfam database (http://pfam.janelia.org/). The domain file was used as the first template for scanning the gene families, and output genes with an E-value of less than 1e−10 were retained. The retained genes were taken as templates for a second scan, and the output genes were then selected in the same way. Putative genes were identified in each gene family. The resulting sequences were aligned using MUSCLE v3.8.3 and the phylogenetic trees of gene families were constructed using FastTree v2.1.11 (http://www.microbesonline.org/fasttree/).
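The two-pass scan described above can be sketched as a simple threshold filter applied twice. This is a minimal illustration that assumes the E-value threshold is the criterion for keeping a hit; the gene identifiers and E-values are invented, the helper `retain_hits` is hypothetical, and the actual domain search and profile rebuilding steps are not shown.

```python
E_VALUE_CUTOFF = 1e-10  # threshold stated in the text


def retain_hits(hits, cutoff=E_VALUE_CUTOFF):
    """Return sorted gene IDs whose best hit has an E-value below cutoff.

    hits is an iterable of (gene_id, e_value) pairs, e.g. parsed from a
    tabular domain-search report (parsing is omitted in this sketch).
    """
    best = {}
    for gene_id, e_value in hits:
        # Keep the smallest (most significant) E-value seen for each gene.
        if gene_id not in best or e_value < best[gene_id]:
            best[gene_id] = e_value
    return sorted(g for g, e in best.items() if e < cutoff)


# Hypothetical first-pass hits against the Pfam seed profile.
first_pass = [("geneA", 3e-40), ("geneB", 2e-5), ("geneC", 1e-12), ("geneA", 1e-3)]
# Hypothetical second-pass hits against a profile rebuilt from first-pass genes.
second_pass = [("geneA", 1e-45), ("geneC", 5e-15), ("geneD", 4e-11), ("geneB", 1e-6)]

templates = retain_hits(first_pass)    # genes used as templates for the second scan
candidates = retain_hits(second_pass)  # putative family members after the second scan
print(templates, candidates)
```

In this toy case the second pass recovers one additional member (geneD) that a species-specific profile detects, which is the usual motivation for rescanning.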

Polyploidization analysis
We adopted Ks to explore WGD and divergence events between Prunus and other species. Homologous amino acid sequences were aligned using MUSCLE v3.8.3 [101] and then converted into codon alignments using PAL2NAL v14 [104]. Finally, Ka and Ks were calculated via the Nei-Gojobori method using the NG86 program of PAML as described previously [105]. We used the median Ks between homologous genes to classify collinear blocks caused by duplication events. Ks was indicated via different colors on collinear blocks in WGDI [33]. Curves of Ks density distribution were created with Kspeaks (-kp). Multipeak fitting was conducted using the PeaksFit (-pf) software. Multiple fitted density curves were converted into one graph using KsFigures (-kf).
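The block-level summary described above, a median Ks per collinear block, can be illustrated with a short sketch. The block identifiers and Ks values are invented, the function `block_median_ks` is hypothetical, and this is not the WGDI implementation.

```python
from statistics import median


def block_median_ks(gene_pairs):
    """Group gene-pair Ks values by collinear block and return each block's median.

    gene_pairs is an iterable of (block_id, ks) tuples.
    """
    by_block = {}
    for block_id, ks in gene_pairs:
        by_block.setdefault(block_id, []).append(ks)
    return {block: median(values) for block, values in by_block.items()}


# Invented example: block1 contains one outlying pair (Ks = 0.50).
pairs = [
    ("block1", 0.10), ("block1", 0.12), ("block1", 0.50),
    ("block2", 1.40), ("block2", 1.60),
]
medians = block_median_ks(pairs)
print(medians)
```

Using the median rather than the mean keeps a single saturated or misaligned gene pair from shifting a block toward an older apparent duplication event.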

Pathogen inoculation and disease development
P. syringae pv. tomato DC3000 was cultured on King's B (KB) medium supplemented with 50 μg/ml rifampicin at 30 °C [106]. Log-phase cultures were resuspended with a buffer [10 mM MgCl2 and 10 mM 2-(N-morpholino) ethanesulfonic acid (MES)] to obtain an OD600 of 0.1 and then diluted 100-fold before spray inoculation. The sprayed leaves were monitored every 3 days for symptoms, and bacterial proliferation was measured in extracts of leaf tissues collected at 3 and 6 dpi. Three leaf disks with 5 mm diameter were collected from three independent leaves at 3 and 6 dpi and ground in 1 ml of 10 mM MgCl2 and 10 mM MES. Bacterial colonies were counted 2 days after plating 60 μl from serial dilutions on KB plates supplemented with rifampicin. To determine bacterial proliferation, we determined CFU/cm² on each leaf at 3 and 6 dpi. In addition, the amount of Pst DC3000 DNA in plant DNA (%) was estimated using a previously described method [107,108].
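The conversion from a plate count to CFU/cm² follows from the volumes and disk size given above. The sketch below shows one way to do that arithmetic. It assumes the three 5 mm disks are pooled in the 1 ml extract, the colony count and dilution factor are invented, and the function name `cfu_per_cm2` is ours rather than part of the published protocol.

```python
import math


def cfu_per_cm2(colonies, dilution_factor, plated_ul=60.0, extract_ul=1000.0,
                n_disks=3, disk_diameter_mm=5.0):
    """Convert a plate count to CFU per cm^2 of leaf tissue.

    colonies: colonies counted on one plate
    dilution_factor: fold dilution of the extract that was plated (e.g. 1000)
    Default volumes and disk size follow the text; pooling of the three
    disks in one extract is an assumption of this sketch.
    """
    radius_cm = disk_diameter_mm / 10.0 / 2.0
    leaf_area_cm2 = n_disks * math.pi * radius_cm ** 2
    # Scale the plated aliquot back to the whole extract, then undo the dilution.
    cfu_in_extract = colonies * dilution_factor * (extract_ul / plated_ul)
    return cfu_in_extract / leaf_area_cm2


# Hypothetical count: 120 colonies on the plate from the 1000-fold dilution.
value = cfu_per_cm2(colonies=120, dilution_factor=1000)
print(f"{value:.3e} CFU/cm^2")
```

With these invented inputs the sampled leaf area is about 0.59 cm², so the result is on the order of 10⁶ CFU/cm².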
The P. capsici isolate LT263 was cultured on oatmeal agar at 25 °C for 7 days. C. viniferum and B. cinerea were routinely cultured on potato dextrose agar at 25 °C for 7-10 days. Agar disks (diameter 7.5 mm) were cut using a cork borer and then inoculated onto the abaxial surfaces of Ppus and Pavi cv. Tieton leaves that were maintained at 25 °C. We collected leaf disks (5 mm diameter) 5 mm away from the agar disks and verified, using qPCR, whether these pathogens had infected Ppus and Pavi [107,108]. The primers used for qPCR are listed in Supplementary Data Table S51. Lesions were photographed at 3, 6, and 9 dpi, stained with lactophenol TB as previously described [109], and measured using ImageJ (National Institutes of Health, Bethesda, MD, USA).

Statistical analysis
All data were statistically analyzed using SAS software (SAS Institute, Cary, NC, USA). We calculated statistical differences among all datasets by conducting a two-tailed Student's t test, where P < 0.05 was used to denote significant differences. The results of the pathogen inoculation assays are shown as the mean ± standard deviation of values from more than nine replicates in each independent experiment.
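For readers unfamiliar with the test, the pooled-variance Student's t statistic can be computed in a few lines. This is a generic illustration, not the SAS procedure used in this study; the two samples are invented and far smaller than the replicate numbers reported above, and the function name `student_t` is ours.

```python
from math import sqrt
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for a pooled-variance t test."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error, df


# Hypothetical lesion areas (mm^2) for two groups; not data from this study.
group_1 = [10.0, 12.0, 14.0]
group_2 = [20.0, 22.0, 24.0]
t_stat, df = student_t(group_1, group_2)
print(round(t_stat, 4), df)
```

The two-tailed P value is then obtained by comparing the absolute t statistic with the t distribution at the returned degrees of freedom, a step normally delegated to a statistics package.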