Phylotranscriptomics of the Pentapetalae Reveals Frequent Regulatory Variation in Plant Local Responses to the Fungal Pathogen Sclerotinia sclerotiorum.

Quantitative disease resistance (QDR) is a conserved form of plant immunity that limits infections caused by a broad range of pathogens. QDR has a complex genetic determinism. The extent to which molecular components of the QDR response vary across plant species remains elusive. The fungal pathogen Sclerotinia sclerotiorum, causal agent of white mold diseases on hundreds of plant species, triggers QDR in host populations. To document the diversity of local responses to S. sclerotiorum at the molecular level, we analyzed the complete transcriptomes of six species spanning the Pentapetalae (Phaseolus vulgaris, Ricinus communis, Arabidopsis [Arabidopsis thaliana], Helianthus annuus, Solanum lycopersicum, and Beta vulgaris) inoculated with the same strain of S. sclerotiorum. About one-third of plant transcriptomes responded locally to S. sclerotiorum, including a high proportion of broadly conserved genes showing frequent regulatory divergence at the interspecific level. Evolutionary inferences suggested a trend toward the acquisition of gene induction relatively recently in several lineages. Focusing on a group of ABCG transporters, we propose that exaptation by regulatory divergence contributed to the evolution of QDR. This evolutionary scenario has implications for understanding the QDR spectrum and durability. Our work provides resources for functional studies of gene regulation and QDR molecular mechanisms across the Pentapetalae.


INTRODUCTION
The plant immune system includes multiple molecular mechanisms for pathogen detection and defense. Gene-for-gene resistance involves single dominant resistance genes often belonging to the nucleotide binding domain and leucine-rich repeat (NLR) family of proteins (Dodds and Rathjen, 2010). This form of resistance typically protects plants against a few genotypes of biotrophic pathogens, which keep host cells alive during infection. There are however no dominant resistance genes described to function against many necrotrophic pathogens (which actively kill host cells during infection), such as the fungus Sclerotinia sclerotiorum (Guyon et al., 2014;Mbengue et al., 2016). S. sclerotiorum is the causal agent of white and stem mold diseases on a broad range of dicot plants (Bolton et al., 2006). The host range of S. sclerotiorum covers all major groups of the Pentapetalae, a subclade of the core eudicots including Rosidae, Asteridae, and Caryophyllaceae (Moore et al., 2010;Navaud et al., 2018). This fungus notably causes severe damage on oil crops such as soybean (Glycine max) and rapeseed (Brassica napus) when conditions are favorable (Derbyshire and Denton-Giles, 2016). Plants typically exhibit quantitative disease resistance (QDR) against S. sclerotiorum, a form of immunity resulting in a full continuum of disease symptom severity in plant populations (Roux et al., 2014). The distribution of disease symptoms on plants challenged by S. sclerotiorum follows an approximately Gaussian distribution in populations of sunflower (Helianthus annuus, Asterales; Fusari et al., 2012), soybean (Fabales; Bastien et al., 2014), rapeseed (Brassicales; Wu et al., 2013), and thale cress, Arabidopsis (Arabidopsis thaliana, Brassicales; Perchepied et al., 2010), for instance. QDR is a complex type of resistance involving genes of very diverse families . Because of the functional diversity and complexity of QDR, the genetic bases and evolution of QDR remain largely unexplored.
How many genes contribute to the QDR response to a given pathogen is an unresolved question of fundamental and practical interest . This information is indeed critical to understand the evolutionary dynamics of plant pathogens with a broad host range and to optimize breeding for disease resistance in crops. Genome-wide association (GWA) mapping in Arabidopsis natural populations and the analysis of fungal small RNA putative targets identified five QDR genes against S. sclerotiorum (Badet et al., 2017(Badet et al., , 2019Derbyshire et al., 2019;Barbacci et al., 2020). The inactivation of these genes caused 16 to 36% variation in the QDR phenotype each, in line with previous quantitative trait loci analyses for QDR in crops (Micic et al., 2004;Poland et al., 2011;Bonhomme et al., 2014). Together with other approaches (Perchepied et al., 2010;Stotz et al., 2011;Mbengue et al., 2016), these studies bring to a few dozen the number of known Arabidopsis genes contributing to QDR against S. sclerotiorum. Global transcriptomic analyses revealed that the number of genes differentially expressed upon S. sclerotiorum challenge adds up to 4703 in Arabidopsis and 3513 in tomato (Solanum lycopersicum, Solanales; Badet et al., 2017), suggesting that the number of QDR genes may exceed by far the current validated list. In agreement, GWA mapping for resistance against four distinct Botrytis cinerea strains in 110 Arabidopsis accessions identified 3504 genes as candidates for controlling QDR (Corwin et al., 2016). Extending the analysis to several parameters describing the disease lesion (size, shape, color) increased the number of candidate QDR genes to 7940 in the Arabidopsis-B. cinerea interaction (Fordyce et al., 2018).
These findings provide molecular foundations for an infinitesimal model (or polygenic model) of plant QDR. The infinitesimal model of quantitative genetics postulates that continuous variation in a trait is often controlled by environmental factors and a large number of loci, each making a very small (infinitesimal) contribution to the phenotype (Fisher, 1919;Nelson et al., 2013). The existence of very large numbers of causal genetic variants is proposed to explain the small proportion of phenotypic variance resulting from variants identified in GWA studies, a problem known as missing heritability (Manolio et al., 2009;Barton et al., 2017). Because the infinitesimal model notably assumes linkage equilibrium and additive phenotypic effect of loci, the genetic architecture of disease resistance probably involves an additional layer of complexity (Barton et al., 2017;Turelli, 2017). In a recent variant of the infinitesimal model coined the omnigenic model, Boyle et al. (2017) suggest that any gene differentially regulated in infected tissues has a non-null impact on the disease outcome. Following this view, a modest number of core genes would directly and strongly affect QDR and could each be modulated by numerous small effect genes as a result of their interactions in global gene networks (Boyle et al., 2017;Liu et al., 2019). Although the most efficient way to move forward in understanding the genetics of complex traits is debatable, the omnigenic model emphasizes the relevance of large-scale transcriptomics for studying diseases (Boyle et al., 2017;Wray et al., 2018).
A global knowledge of gene regulation is also key to study complex traits from an evolutionary point of view according to the cis-regulatory hypothesis. This theory predicts that because cisregulatory mutations are likely to have reduced pleiotropic effects, they should lead to phenotypic evolution more frequently than mutations in coding regions Orgogozo, 2008, 2009). A literature survey covering multicellular plants and animals provided support for the cis-regulatory hypothesis over long time scales (beyond the species level; Stern and Orgogozo, 2008). In the harlequin ladybird beetle (Harmonia axyridis), allelic variation in the cis-regulatory region of the transcription factor gene pannier is responsible for more than 200 distinct patterns of red and black on the elytra (the hardened forewing), providing a striking example of phenotypic variation caused by cis-regulatory variants at the intraspecific level (Ando et al., 2018;Gautier et al., 2018). In our previous work, we showed that the ARPC4 gene encoding an actin-related protein complex member controls the dynamics of the actin filament network with an impact on QDR against S. sclerotiorum (Badet et al., 2019). Two major alleles of ARPC4 were detected in Arabidopsis populations, associated with contrasted levels of QDR. Remarkably, the two ARPC4 alleles encoded identical proteins, but they differed markedly in their induction upon S. sclerotiorum inoculation. The Arp2/3 complex involving ARPC4 is conserved in all eukaryotes (Deeks and Hussey, 2005), indicating that transcriptional regulation of conserved pleiotropic genes can underlie variation in plant QDR phenotypes. Similarly, the POQR gene showed conserved expression pattern and function in QDR upon S. sclerotiorum challenge in Arabidopsis and tomato (Badet et al., 2017). These results suggest that some QDR determinants are conserved between distantly related plants and that gene expression polymorphisms could be drivers of QDR evolution.
S. sclerotiorum has one of the broadest host ranges among plant pathogens (Derbyshire et al., 2017;Navaud et al., 2018), providing a rare opportunity to analyze the diversity and long-term evolution of transcriptional programs in the interaction of various plants with a single fungal pathogen species. To document how diverse the local plant responses to S. sclerotiorum are at the molecular level, we analyzed the complete transcriptomes of six plant species spanning the Pentapetalae group inoculated with the same strain of S. sclerotiorum. Differentially expressed genes (DEGs) were enriched with core ortholog genes (conserved in all six species) but depleted from lineage-specific genes. Among core orthologs, patterns of gene expression varied significantly across species and deviated significantly from a null model of expression heritability. Only 4.34% of core orthogoups contained upregulated genes from all six species, including a group of ATP Binding Cassette type G (ABCG) transporters. Functional analysis of AtABCG40 supports a scenario of exaptation in which ancestral (E) Proportion of DEGs in each species (colors) and overall (gray, % of transcripts). Plain bars correspond to the percentage of of upregulated genes, empty bars to the percentage of downregulated genes, and labels indicate the total number of DEGs per species. Error bars show SD. Boxplots show first and third quartiles (box), median (thick line), and the most dispersed values within 1.5 times the interquartile range (whiskers). stress response genes gained a function in local disease resistance to S. sclerotiorum recently in plant evolution.

Common Features of S. sclerotiorum-Responsive Gene Pools across Distant Species
To document the transcriptional changes induced locally by S. sclerotiorum infection in taxonomically diverse eudicot plant species, we performed RNA sequencing (RNA-seq) of healthy and inoculated leaves from six plant species, chosen to cover a broad diversity in the Pentapetalae group of angiosperms (Chanderbali et al., 2017;Zeng et al., 2017). Common bean (Phaseolus vulgaris, Fabales) and castor bean (Ricinus communis, Malpighiales) were chosen as representatives of the fabid group of the rosids clade and Arabidopsis (Brassicales) as a representative of the malvid group of the rosids clade. Sunflower (Asterales) and tomato (Solanales) were chosen as representatives of the asterids clade and sugar beet/beetroot (Beta vulgaris, Caryophyllales) as a representative of basal Superasteridae lineages ( Figure 1A; Supplemental Data Set 1.1; Zeng et al., 2017). A reference genome is available for all of these species, to which we aligned RNA-seq reads to quantify plant gene expression. In noninoculated plants, a minimum of 90% of the total sequence reads were mapped to the corresponding plant reference genome, except for R. communis in which only 74.26% of sequence reads mapped to annotated transcripts.
With an aim to study comparable stages of infection on all six plant species, we harvested the edge of disease lesions before they reached 25 mm in diameter (Supplemental Methods; Badet et al., 2017;Peyraud et al., 2019). As expected, due to the presence of sequence reads corresponding to S. sclerotiorum RNAs, an average 32.5% of sequence reads mapped to their corresponding plant genome in infected samples ( Figure 1B; Supplemental Data Set 1.2). To estimate the degree of fungal colonization in these samples, we mapped RNA-seq reads to the S. sclerotiorum reference genome (Derbyshire et al., 2017). Our results suggest that S. sclerotiorum colonization was consistent across replicates and across species, with an average 58.25% 6 4.83% of reads aligned to the fungal genome ( Figure 1B).
The number of genes expressed both in noninoculated and infected leaves ranged from 19,353 (R. communis) to 42,351 (H. annuus), reaching a total 156,497 genes across the six plant species ( Figure 1C; Table 1; Supplemental Data Set 1.3). The proportion of expressed genes was remarkably consistent across all six plant species, corresponding to 78.48% 6 2.47% of complete transcriptomes ( Figure 1C).
To determine the number of plant genes misregulated upon S. sclerotiorum infection in each genome, we calculated log 2 fold change (LFC) of expression in healthy and inoculated plant samples. We used LFCs shrinkage by the empirical Bayes approach implemented in DESeq2 to avoid bias in genes with low read coverage (Supplemental Methods; Love et al., 2014). Genomic median for LFC ranged from -0.19 in Arabidopsis to 0.07 in B. vulgaris ( Figure 1D; Supplemental Data Sets 1.4 to 1.9). The distribution of LFC values was very similar between genomes, with a stronger dispersion from the median in rosid species. We considered as DEGs showing a log 2 fold change higher than 1.5 and a P value lower than 0.01. We detected 6148 DEGs in B. vulgaris, 6843 in P. vulgaris, 7510 in R. communis, 8069 in S. lycopersicum, 9844 in H. annuus, and 10,496 in Arabidopsis (  Figure 1E). The lowest proportion of DEGs corresponded to species from the Superasteridae clade (average 26.7% 6 1.65%), whereas species from the rosids clade showed 38.6% 6 3.75% of DEGs on average. In all species, the proportion of downregulated genes exceeded that of upregulated genes (Student's t test, P value 5 0.0013), with an upregulated/downregulated ratio of 0.82 6 0.03 ( Figure 1E; Table 1 To compare local transcriptional responses to S. sclerotiorum in the six selected plant species, we performed sequence-based clustering of all predicted proteins from the six species using OrthoMCL (Li et al., 2003). This identified 14,983 orthogroups shared between two or more species and containing from 2 to 110 expressed genes, for a total of 97,763 genes ( Figure 2A; Table 1; Supplemental Data Set 1.10). Of these genes, 62,838 genes (40.2%) classified into 7918 orthogroups we designated as core Pentapetalae that contained genes from all six plant species and could represent ancestral gene groups. There were 58,726 expressed genes (37.5%) unique to one species and not assigned to orthogroups, classified as lineage specific (see Methods). The 7065 other ortholog groups, shared between two to five species, included a total of 34,925 genes (22.3%; Figure 2A; Supplemental Data Set 1.11). The proportion of core Pentapetalae genes represented 41.5% 6 2.9% per species; other ortholog genes represented 23.1% 6 2.1% per species ( Figure 2B; Table 1; Supplemental Data Set 1.12). The proportion of lineage-specific genes was more variable, ranging from 23.8% in R. communis to 48.1% in H. annuus (average 35.6% 6 4.8%).
To test for the relative contribution of conserved genes in the plant transcriptional responses to S. sclerotiorum, we calculated for each species the proportion of DEGs that were core Pentapetalae, other ortholog, and lineage-specific genes ( Figure 2C). On average, 46.7% 6 2.6% of DEGs were core Pentapetalae genes, 24.7% 6 1.7% were other ortholog genes, and 28.6% 6 3.6% were lineage-specific genes. The percentage core Pentapetalae in DEGs was significantly higher than that of other ortholog genes (Student's t test, P value 5 4.2e 208 ) and that of lineage-specific genes (P value 5 2.3e 206 ). The proportion of upregulated and downregulated genes was similar among the other ortholog and lineage-specific classes. It showed a slight difference among core Pentapetalae genes in which 43.15% 6 1.7% were upregulated and 49.63% 6 3.6% were downregulated (P value 5 0.058). The percentage of DEGs being lineage specific showed the strongest variation between species, especially regarding downregulated genes that included 17.6% of lineage-specific genes in P. vulgaris and up to 42.1% in H. annuus.
To determine whether the prevalence of core Pentapetalae genes in DEGs was conserved at different phylogenetic levels, we analyzed separately the distribution of DEGs in the rosid and Superasteridae clades ( Figure 2D; Supplemental Data Set 1.13). In both clades, DEGs were significantly more abundant in core Pentapetalae genes than in any other class (P value < 5.5e 210 for rosids, P value < 0.005 for Superasteridae). In the rosids, other orthologs and lineage-specific genes contributed to a similarly low proportion of DEGs (on average 26.9 and 22.9%, respectively; P value 5 0.14). By contrast, in the Superasteridae, DEGs were significantly less abundant among other orthologs than among lineage-specific genes (on average 22.7 and 34.7%, respectively; P value 5 7.0e 205 ). As a complementary analysis, we used chi-squared tests to show that core Pentapetalae genes were significantly enriched with DEGs in all six species, between 1.041fold (P value 5 1.74e 204 ) in R. communis and 1.196-fold (P value 5 3.11e 246 ) in B. vulgaris (Table 2; Supplemental Data Set 1.13). Other ortholog genes were significantly enriched with DEGs in superasterids species only, between 1.129-fold (P value 5 6.91e 212 ) in S. lycopersicum and 1.157-fold (P value 5 3.97e 217 ) in H. annuus. Conversely, lineage-specific genes were depleted with DEGs in all six species, between 0.751-fold (P value 5 5.46e 281 ) in B. vulgaris and 0.885 (P value 5 9.79e 212 ) in R. communis.
We conclude that the plant transcriptome is enriched with core Pentapetalae genes during local responses to S. sclerotiorum. Because of a relatively high proportion of lineage-specific genes differentially regulated in species from the Superasteridae clade, this trend is less pronounced in the Superasteridae than in rosids. Comparing the transcriptome of resistant and susceptible genotypes at the species level, together with further functional assays, will be required to determine which of these responsive genes contribute to the QDR phenotype.

Accumulation of S. sclerotiorum-Induced Genes during the Evolution of the Pentapetalae
To document global patterns of evolution in the repertoire of S. sclerotiorum-responsive genes across plants, we inferred gene gains and losses along the Pentapetalae phylogeny. Following a simple parsimony hypothesis, we considered a gain along a branch of the tree when genes were present in all derived clades (among ortholog groups or lineage specific), but not in three or more paraphyletic clades ( Figure 3A; Supplemental Data Sets 1.14 and 1.15). Reciprocally, we considered a loss along a branch when genes were present in all paraphyletic clades but absent from all derived clades. This identified a total of 5131 loss events associated with 10,143 DEGs and 10,630 gain events associated with 16,046 DEGs.
Considering core Pentapetalae genes as the ancestral state, we calculated the change in gene content along each branch of the tree relative to the ancestral state ( Figure 3B). There were 3148 orthogroups (including 4083 DEGs) that could not be associated unambiguously to one branch of the tree, and they were excluded from this analysis. We used estimated dates of divergence in the Pentapetalae phylogeny ( Figure 3B; Zeng et al., 2017) to calculate the proportion of genes gained and lost per million year (Mya) along branches of the tree. Except the two most ancestral branches (Pentapetalae to Superasteridae and Pentapetalae to rosids), all branches showed higher rates of gene gain than gene losses. Along intermediate and terminal branches of the tree, whole genome losses were 0.95% 6 0.53%/Mya and whole (A) Distribution of orthologous gene groups across species. Groups including genes from all six species are designated as core Pentapetalae, groups including genes from two to five species are designated as other orthologs, and genes not included in multi-species ortholog groups are designated as genome gains were 0.35% 6 0.21%/Mya (paired t test, P value 5 0.012; Figure 3C; Supplemental Data Set 1.15). For DEGs, losses were 0.12% 6 0.06%/Mya and gains were 0.25% 6 0.12% (P value 5 0.009), indicating that S. sclerotiorum-responsive gene pools were less variable than whole genomes and that all lineages accumulated DEGs since ;120 Mya.
Next, we calculated the ratio between the number of genes upregulated and downregulated upon S. sclerotiorum inoculation among orthogroups containing genes gained and lost along intermediate and terminal branches of the phylogeny (Figures 3A and 3D; Supplemental Data Set 1.15). The up/down ratio was ;1.08 6 0.20 in gained DEGs but only ;0.80 6 0.12 in lost DEGs (paired t test, P value 5 0.0098), suggesting a trend to favor the retention of upregulated genes rather than downregulated genes. In agreement, the up/down ratio was 0.72 in Core Pentapetalae genes and ;0.82 6 0.02, on average, in the six plant species. These analyses suggest the existence of mechanisms leading to the progressive accumulation of genes upregulated upon S. sclerotiorum inoculation during the evolution of Pentapetalae plants.

Local Transcriptomic Responses to S. sclerotiorum Involve Core and Lineage-Specific Functions
To document the functional diversity of genes differentially expressed upon S. sclerotiorum inoculation, we collected Gene Ontology (GO) and Pfam domains annotations for the six plant species. Between 47.1% (H. annuus) and 74.1% (Arabidopsis) of expressed genes were annotated with GO, and between 76.1% (H. annuus) and 89.7% (Arabidopsis) of expressed genes were annotated with at least one Pfam domain per species (Table 1; Supplemental Data Sets 1.16 to 1.18; Supplemental Files 1 and 2). These annotations included between 2381 (B. vulgaris) and 3792 (Arabidopsis) distinct GOs and between 4128 (R. communis) and 5913 (H. annuus) distinct Pfams represented at least three times (Table 1). To highlight gene functions associated with the local response to S. sclerotiorum, we analyzed GO and Pfam annotation enrichment with upregulated and downregulated genes from each species. We mapped GOs to enriched Pfam domains to summarize annotations as GOs. We found 262 GOs significantly enriched with upregulated genes (chi-squared adjusted P value < 0.01) in at least one species, including 103 Biological Process ontologies, 28 Cellular Component ontologies, and 131 Molecular Function ontologies ( Figure 4A; Supplemental Data Sets 1.19 and 1.20). Six GOs (4.65%) were enriched with upregulated genes in all species, including protein phosphorylation and kinase activities (GO:0006468, GO:0004674, GO:0004672). Two hundred and four GOs were enriched with upregulated genes in a single species, revealing the prevalence of abscisic acid signaling (GO:0009738) in R. communis and jasmonic acid signaling (GO:0009753, GO: 0016629) in Arabidopsis and P. vulgaris. These GOs also point toward lineage-prevalent defense mechanisms, such as the biosynthesis of terpenes (GO:0010333) and inhibition of endopeptidases (GO:0004866) in S. lycopersicum, flavonoid glucuronidation (GO:0052696) in H. annuus, and vitamin B6 metabolism (GO:0042816) and ceramide biosynthesis (GO:0046513) in P. vulgaris. Fifty-two GOs were enriched with upregulated genes in two to five species, including defense response (GO: 0,006,952), response to wounding (GO:0009611), exocytosis (GO:0006887), response to chitin (GO:0010200), toxin catabolic process (GO: 0009407), and flavonoid biosynthesis (GO:0009813). We found 411 GOs significantly enriched with downregulated genes (chisquared adjusted P value < 0.01) in at least one species, including 187 Biological Process ontologies, 57 Cellular Component ontologies, and 167 Molecular Function ontologies ( Figure 4B; Supplemental Data Set 1.19). Fifteen GOs were enriched with downregulated genes in all species, mostly related to the photosynthesis process (GO:0015979, GO:009654, GO:009522, GO: 009507, GO:0016168). Three hundred forty GOs were enriched with downregulated genes in a single species, such as response to auxin (GO:0009733) in B. vulgaris and amine metabolism (GO: 0009038) in H. annuus. GOs enriched with downregulated genes also included several GOs related to RNA processing (GO: 0009451, GO:0001510), the regulation of protein translation (GO: 0006400, GO:0006417), and DNA maintenance (GO:0004519, GO:0006298, GO:0006260, GO:0006508) in several species. Thirty-one GOs were enriched both with upregulated and downregulated genes, leading to a total 642 GOs enriched with DEGs.
To visualize hierarchical relationships between GOs enriched with DEGs, we constructed a network of Biological Process ontologies in which nodes are color coded according to enrichment with upregulated or downregulated genes ( Figure 4C). Biological Process ontologies harboring an excess of upregulated over downregulated genes notably included response to toxin (up/ down ratio 2.19), response to chitin (up/down ratio 5.80), response lineage specific. The number of gene group is indicated in bold, number of genes is in italics, and the proportion (% of all expressed genes) is indicated between square brackets. (B) Proportion of core Pentapetalae (Penta.) genes, lineage-specific (spe.) genes, and other ortholog (ortho.) genes in each plant genome (% of expressed genes). (C) Relative proportion of DEGs belonging to core Pentapetalae groups (dark blue), other ortholog groups (green), and lineage-specific genes (yellow). Branches of the phylogenetic tree in inset are color coded according to corresponding gene groups. Results are shown for downregulated genes only (Down), upregulated genes only (Up), or both together (All). Filled dots show proportions for Superasteridae species; open dots are for rosid species. Letters define groups of significance as determined by pairwise Student t tests (P value < 0.01). (D) Relative proportion of DEGs, either downregulated (open circles) or upregulated (filled circles), in rosids and Superasteridae species. Genes belonging to core Pentapetalae groups (dark blue); other ortholog groups (green) and lineage-specific genes (yellow) are shown separately. Letters define groups of significance as determined by pairwise Student t tests (P value < 0.01). Boxplots show first and third quartiles (box), median (thick line), and the most dispersed values within 1.5 times the interquartile range (whiskers). At, Arabidopsis; Bv, B. vulgaris; Ha, H. annuus; Pv, P. vulgaris; Rc, R. communis; Sl, S. lycopersicum.
to wounding (up/down ratio 2.14), and cell surface receptor signaling (up/down ratio 1.19; Figure 4C). Ontologies with an excess of upregulated genes tended to cluster in defined sectors of the GO network related to secondary metabolism and detoxification on one side and signaling and response to stress on the other side ( Figure 4C). Biological Process ontologies harboring an excess of downregulated over upregulated genes included response to light stimulus (up/down ratio 0.25), starch biosynthetic process (up/ down ratio 0.13), and RNA methylation (up/down ratio 0.03). DEGenriched ontologies spanned a broad range of the GO network, highlighting the diversity of processes involved in the response to S. sclerotiorum.
To study the relationship between conservation and regulation among major gene functions responsive to S. sclerotiorum infection, we classified GOs according to their content in core genes and their content in upregulated genes, focusing on the 282 Biological Process GOs enriched in DEGs. To this end, we determined the ratio between the number of core Pentapetalae genes and the number of lineage-specific DEGs as well as the ratio between the number of upregulated and downregulated DEGs for all GOs (Supplemental Data Set 1.19). For instance, ontology toxin catabolic process (GO:0009407) was identified in 71 upregulated genes, 12 downregulated genes, and 51 genes not differentially regulated ( Figure 5A, up/down ratio 5.92), including 23 core Pentapetalae DEGs and 21 lineage-specific genes (core/lineage specific ratio 1. 09). Ontology chlorophyll biosynthesis process (GO:0015995) was identified in 79 downregulated genes, 11 upregulated genes, and 48 genes not differentially regulated across six species ( Figure 5B, up/down ratio 0.14), including 76 core Pentapetalae DEGs and 9 lineage-specific DEGs (core/lineage specific ratio 15.2). Fifty-one GOs did not contain upregulated or core genes and were excluded from the analysis. We observed a correlation between the content in core Pentapetalae genes and the content in downregulated genes (Spearman's r 5 0.325, P 5 4.41e -07 ), suggesting a high degree of conservation or convergence in gene functions downregulated upon S. sclerotiorum infection ( Figure 5C). A number of ontologies were exceptions to this trend, harboring a high content in core genes and high content in upregulated genes, such as callose deposition (GO:0052542), chorismate biosynthesis (GO:0009423), ceramide biosynthesis (GO:0046513), hypersensitive response (GO: 0009626), and systemic acquired resistance (GO:0009862). Gene families lacking annotation, not covered by this analysis, may also contribute to the repertoire of genes responsive to S. sclerotiorum.

Transcriptional Response of Core Genes to S. sclerotiorum Varies Significantly between Species
We found only a few GOs enriched with DEGs in all species (Figures 4A and 4B). This could be due to (1) GOs enriched with DEGs being weakly conserved across species or to (2) conserved GOs harboring variable content in DEGs across species. To test for the second alternative, we analyzed the distribution of DEGs among genes, ontologies, and Pfam domains conserved in all six species. Genes, GOs, and Pfams retrieved in a maximum of five species were excluded from the following analyses. The core Pentapetalae class includes 62,838 genes conserved in all six species and distributed in 7918 orthogroups of 6 to 132 genes. Among those, 9520 genes (15.2%) were upregulated upon S. sclerotiorum infection, distributed between 3659 orthogroups (46.2%). This included 159 orthogroups (2.01%) containing upregulated genes from all six plant species ( Figure 6A; Supplemental Data Sets 1.11 and 1.21). There were 13,201 core Pentapetalae genes downregulated (21.0%), distributed in 4962 orthogroups (62.7%), including 282 orthogroups (3.56%) containing downregulated genes from all six plant species. Overall, 1039 core Pentapetalae groups did not contain DEGs (13.1%), 438 contained DEGs in all six plant species (5.53%), and 6441 contained DEGs in one to five plant species (81.3%; Figure 6A; Supplemental Figures 1 and 2), indicating that a majority of conserved gene groups showed divergent regulation upon S. sclerotiorum inoculation in distinct plant species.
To test whether this observation holds true for conserved gene functions, we conducted a comparative analysis of GOs and Pfam domains expressed in all six species (corresponding to 3536 GOs and 5919 Pfam domains; Figures 6B and 6C; Supplemental Data Sets 1.22 and 1.23). We found that 71.5% of GOs and 64.5% of Pfam domains were associated with DEGs from one to five plant species. In agreement with frequent species-specific regulatory patterns for conserved genes, although detected in all six plant species, only 23% of GOs and 33.6% of Pfam domains were associated with DEGs in all six plant species (Figures 6B and 6C).
To characterize species-specific regulation across GOs, we color coded GO networks according to the number of plant species in which they are associated with upregulated genes (zero to six; Figure 6D). Ontologies associated with upregulated genes in five or six species where highly connected at the center of the network. Exceptions included the ontologies ATP transport, Significance of enrichment (fold >1) or depletion (fold <1) was assessed using a chi-squared test followed by a Bonferroni correction for multiple testing.
negative regulation of apoptosis, vacuolar acidification, lipid homeostasis, Cys biosynthesis from Ser, removal of superoxide radicals, and negative regulation of endopeptidase that were terminal nodes of the network associated with upregulated genes in five or six species. This prompted us to test for a relationship between association with upregulated genes in multiple species and the level of GOs in the ontology hierarchy. We found that the number of species in which GOs associate with upregulated genes decreased with ranks in the GO hierarchy, from an average rank 12.96 for GOs associated with upregulated genes in all six plant species to an average rank 17.66 for GOs associated with upregulated genes in a single species (Student's t test, P value 5 3.7e 209 ; Figure 6E; Supplemental Data Set 1.22). This highlights significant variations in genes and biological functions regulated locally upon S. sclerotiorum inoculation across diverse plant species.
To evaluate further the divergence in biological functions regulated upon S. sclerotiorum inoculation across six plant species, we analyzed 893 GOs represented in all six species and including at least one DEG and one core Pentapetalae gene in each species. As a reference, we calculated pairwise correlations for GOs content in core Pentapetalae genes between plant species. We found an average Spearman's r of 0.601 6 0.078 ( Figure 6F; Supplemental Data Set 1.24). We then calculated pairwise correlations for GOs content in DEGs between plant species and found an average Spearman's r of 0.370 6 0.064 ( Figure 6G; Supplemental Data Set 1.24). Next, we analyzed the top 100 GOs enriched in DEGs in each plant species. In total, the selection included 410 GOs for the six species, and indicated a limited overlap in individual top 100 lists. Indeed, only seven GOs related to the downregulation of photosynthesis belonged to the top 100 enriched in DEGs in all six species ( Figure 6H; Supplemental Data Set 1.25). Forty-four (S. lycopsersicum) to 64 (P. vulgaris) GOs were part of the top 100 enriched in DEGs in a single species, including for instance structural constituent of cytoskeleton in Arabidopsis, carotenoid biosynthesis in B. vulgaris, flavonoid biosynthesis in H. annuus, and glucosylceramidase activity in P. vulgaris. For comparison purposes, we analyzed top 100 GOs depleted in DEGs in each plant species ( Figure 6I; Supplemental Data Sets 1.25 and 1.26). Eleven GOs were highly depleted in DEGs in all six species; between 14 (H. annuus) and 38 (Arabidopsis and P. vulgaris) GOs were highly depleted in GOs in a single species. This supports the view that the repertoire of core GOs enriched with DEGs is rather diverse at the interspecific level.

Patterns of Core Genes Expression upon S. sclerotiorum Inoculation Deviate Significantly from a Null Model of Heritability
Of 3659 core Pentapetalae orthogroups including upregulated genes, only 159 (4.3%) included upregulated genes from all six species analyzed (Supplemental Data Set 1.21), suggesting a high degree of regulatory divergence at the interspecific level. Since genes from a given core Pentapetalae orthogroup most likely derive from a common ancestor, we considered the null hypothesis that their expression pattern is completely inherited, leading to similar expression patterns within orthogroups. We compared observed distributions of gene expression upon S. sclerotiorum inoculation to modeled distributions to test this hypothesis and estimate the degree of regulatory variation within core Pentapetalae genes.
First, we analyzed the distribution of the number of species harboring upregulated genes in each core Pentapetalae orthogroup. For this, we shuffled 100 times core Pentapetalae genes among orthogroups. Under the null hypothesis of complete inheritance, upregulation is inherited in six species. We conducted our bootstrap analysis by constraining identical expression (upregulated or not) in six species within an ortholog group (inheritance [h] 5 6) to test this hypothesis. We next considered relaxed degrees of expression pattern inheritance (h 5 5 to 2), down to a complete random assignment of expression patterns (h 5 1; Figure 7A; Supplemental Data Set 1.27; Supplemental File 3). We used the sum of squared residuals (SSR) to estimate the deviation between observed and simulated distributions. The lowest SSR value (38.0) is obtained when gene expression is constrained as identical in two species only (h 5 2). A similar analysis on the distribution of downregulated genes also identified h 5 2 as the best approximation to observed distribution. This suggests that constraints on core genes upregulation are weak, allowing switches between upregulated and not upregulated states in four of six species on average. Second, we clustered genes based on their LFC using the silhouette method (Charrad et al., 2014) within each orthogroup, and we determined the size of the largest expression cluster relative to the size of each orthogroup (Supplemental Data Set 1; Supplemental File 4). Under the null hypothesis of complete inheritance, gene expression patterns should be highly consistent within each Pentapetalae core orthogroup, and clusters should tend to cover complete orthogroups. By contrast, under the hypothesis that genes forming an orthogroup are fully independent (expression patterns are randomly distributed), expression clusters would correspond to one single gene. To determine the extent to which genes expression departs from independence in each core Pentapetalae orthogroup, we computed a consistency of gene expression index (CEI) corresponding to the difference between the relative size of the largest expression cluster observed and the relative size of the largest theoretical expression cluster under gene independence ( Figure 7B; Supplemental Data Set 1.28). For instance, orthogroup no. 56 containing 50 genes (theoretical expression cluster size under gene independence was 1/50) included 17 expression clusters, the largest of which contained 6 genes (12% of the orthogroup; CEI 5 0.12 -0.02 5 0.1; Figure 7C). In our data set, CEI ranged from 0.0602 (92.8% of genes showed independent expression pattern) to 0.8824 (94.1% of genes showed consistent expression pattern). The CEI average was 0.46, meaning that, on average, about one-half of the genes composing a core orthogroup had consistent patterns of expression ( Figure 7B). There were only 831 orthogroups (10.5%) with CEI $ 0.67 (largest expression cluster including at least 67. 42% of genes in the orthogroup). One of them is orthogroup no. 1067 including 13 genes distributed into two expression clusters, the largest containing 10 genes (CEI 5 0.692; Figure 7D). There were 2500 orthogroups (31.6%) with CEI # 0.33 (largest expression cluster including <34.1% of genes in the orthogroup). We conclude that core orthogroups with weakly consistent expression patterns are abundant, suggesting that expression consistency is weakly constrained in many orthogroups. Our bootstrap and clustering approaches both lead to the conclusion that the expression of core genes upon S. sclerotiorum inoculation is only weakly heritable at the interspecific level.

Evidence for Exaptation into QDR in a Group of ABCG/ PDR Transporters
To illustrate regulatory divergence in response to S. sclerotiorum at the interspecific level, we focused on the orthogroup no. 4. This group is remarkable for including upregulated genes in each of the six plant species, a property restricted to 2.01% of the core Pentapetalae class. Gene expression LFC ranges from -6.2 (XM_015728348.1) to 12.7 (PHASIBEAM10F011181T1) in orthogroup no. 4, ranking it 14/15,685 for intragroup LFC variation. Orthogroup no. 4 contains 96 expressed genes encoding class G ATP binding cassette (ABC) transporters (pleotropic drug resistance [PDR] class), among which 28 are upregulated and 14 are downregulated upon S. sclerotiorum inoculation ( Figure 8A; Supplemental Data Set 1.29; Supplemental File 5). A phylogenetic analysis of orthogroup no. 4 reveals six monophyletic clades each containing genes from all six plant species, suggesting the divergence of these clades early in land plants, in agreement with a previous report (Hwang et al., 2016). Upregulated genes were restricted to one species in clades 1 and 3, three species in clade 2, and four species in clade 4. Clade 5 did not include upregulated genes, while clade 6 contained upregulated genes from each of the six plant species, including Arabidopsis ABCG40/PDR12 (AT1G15520, LFC 12.48; Figure 8A). Because the six plant lineages analyzed here diverged before the emergence of the Sclerotinia genus (Navaud et al., 2018), we hypothesized that the ABCG clade induced in all species was exapted into QDR (Gould and Vrba, 1982). In this evolutionary scenario, an ancestor of ABCG clade 6 would have been responsive to signals of the ancestral environment, in which the Sclerotinia genus had not emerged yet, and underwent regulatory and functional changes associated with enhanced QDR against S. sclerotiorum. Because of the extensive resources available for this plant, we focused on Arabidopsis clade 6 gene ABCG40 to test this exaptation scenario. For this, we first analyzed the expression of Arabidopsis ABCG genes from orthogroup no. 4 in 12 stress conditions reported in the literature, including challenge by the fungal pathogens S. sclerotiorum, B. cinerea, Alternaria brassicicola, Verticillium dahlia, and Colletotrichum incanum. The other treatment surveyed covered inoculation by the fungal endophyte Colletotrichum tofeldiae, inoculation by avirulent (carrying the AvrRPS4 avirulence gene) and virulent (DC3000 cor-) strains of the bacterial pathogen Pseudomonas syringae pv tomato (Pst), inoculation by the nematode Heterodera schachtii, inoculation by    the cabbage leaf curl virus, and upon heat and salt stress ( Figure 8B; Supplemental Data Set 1.30). Six ABCG genes were significantly induced by two or more stresses. Salt treatment, inoculation by C. tofeldiae endophytic fungus, the virulent Pst DC3000 cor-strain, H. schachtii nematodes, and cabbage leaf curl virus did not significantly induce any the ABCG genes tested. The ABCG40/PDR12 gene was significantly induced by the broadest range of treatments (6/12), including fungal pathogens S. sclerotiorum, B. cinerea (LFC 8.08), A. brassicicola (LFC 4.15), Verticillium dahliae (LFC 1.58), avirulent Pst (LFC 9.04), and heat (LFC 8.16). Second, we analyzed the disease resistance phenotype of four T-DNA mutant lines altered in the AtABCG40 gene. To characterize mutant lines at the transcript level, we first assessed AtABCG40 expression in healthy plants ( Figure 8C; Supplemental Data Set 1.31). The abcg40-2 line (SALK_005635; Campbell et al., 2003) was strongly impaired in ABCG40 expression (<4% of the wild-type expression, P value 5 1.4e 204 ), while ABCG40 was only weakly reduced in abcg40-3 (SAIL_885_E09) and abcg40-4 (SALK_148565; ;62 and ;54% of the wild-type, respectively) lines, and abcg40-5 (SALK_013945) showed an increased accumulation of ABCG40 transcripts (;165% of the wild type). We inoculated leaves of these plants with a GFP-expressing strain of S. sclerotiorum to measure the area colonized by the fungus 24 h after inoculation, using the rlp30-1 mutant ) as a susceptible control ( Figure 8D). A strong genotype effect with no significant experiment replicate effect was detected on lesion size (ANOVA P value 5 9.81e -06 and 0.05, respectively). Average area colonized by S. sclerotiorum was 85.0 6 9.7 mm 2 in Arabidopsis ecotype Columbia (Col-0) and 123.7 6 16.9 mm 2 in rlp30-1 (Student's t test adjusted P value 5 9.9e -05 ; Figure 8E; Supplemental Data Set 1.32). Decrease in ABCG40 expression was associated with an increase in susceptibility in abcg40-3 (108. 1 6 13.0 mm 2 colonized, adjusted P value 5 0.02), and abcg40-2 (130.5 6 17.4 mm 2 colonized, adjusted P value 5 2.5e -05 ). The abcg40-4 and abcg40-5 mutants were weakly, but not significantly, impaired in resistance to S. sclerotiorum (106.5 6 10.1 and 98.4 6 9.7 mm 2 colonized, respectively; adjusted P value > 0.1), suggesting that ABCG40 transcripts may be functional in these lines. Overall, these results demonstrate a role for AtABCG40 in disease resistance to S. sclerotiorum and suggest that AtABCG40 was exapted into QDR from an ancestral stress response function through reinforcement of its transcription upon fungal pathogen challenge.

DISCUSSION
Our phylotranscriptomics analyses of local responses to S. sclerotiorum in plants from six Pentapetalae orders provide an unprecedented insight into the broad diversity of transcriptional reprograming upon challenge with a single fungal pathogen strain. Comparative transcriptomics studies of plant interactions with eukaryotic pathogens typically focused on gene reprogramming in closely related plant cultivars (Zhao et al., 2009) and species (Powell et al., 2017) and on the transcriptome of several parasite strains (Cooke et al., 2012;Palma-Guerrero et al., 2016;Zhang et al., 2017) or species (Hacquard et al., 2016) infecting a given plant genotype. The remarkably wide range of plants that S. sclerotiorum can infect allowed us to document molecular responses to this fungus in host species that diverged more than 140 Mya.

An Estimate of the Pan-Genomic Diversity of S. sclerotiorum-Responsive Genes
Recent sequencing efforts defined the complement of NLR genes across Arabidopsis accessions (Van de Weyer et al., 2019). Here, we provide an estimate of the QDR complement responding to S. sclerotiorum locally in the Pentapetalae. Together, we identified 48,910 genes differentially expressed upon S. sclerotiorum across six species, enriched in 642 GOs. Despite their evolutionary distance and strong variation in genome size, all species analyzed in this work upregulated ;15% of their genome and downregulated ;18% of their genome upon infection, suggesting the existence of a conserved optimal pool of S. sclerotiorumresponsive transcripts. On average, we found that ;47% of genes differentially expressed in a species upon S. sclerotiorum inoculation were conserved in the five others (core QDR orthogroups), while ;38% were specific to this species. Similarly, the core NLR orthogroups, conserved in >50 of 65 accessions, represent 53% of NLR genes ( Van de Weyer et al., 2019). This suggests that a significant fraction of S. sclerotiorum-responsive pan-transcriptome is likely to be functional in multiple plant species, as shown for instance for the POQR putative peptidase in Arabidopsis and tomato (Badet et al., 2017). In this context, functional analyses of QDR genes in model species may offer relatively straightforward perspectives for improving QDR in crops.
Experimental evidence for a role in QDR is only available for a limited number of DEGs. Nevertheless, a diverse set of Arabidopsis DEGs were shown to function in resistance against S. sclerotiorum (Table 3). According to the omnigenic model (Boyle et al., 2017), any DEG in our pan-transcriptome is expected to impact the QDR phenotype. Our work therefore suggests that ;32.4% of plant genes could have a non-null contribution to QDR against S. sclerotiorum. Because network connectivity is an important determinant of gene essentiality and conservation, it is   (A) Simulated (gray) and observed (red) distributions of the number of upregulated genes in each core Pentapetalae group. Similar expression (upregulated or not) has been constrained in h 5 6 to 1 species (no constraint). The deviation between observed and simulated distribution was assessed using the SSR. The lowest SSR value corresponded to h 5 2 species, suggesting that constraints to maintain similar gene expression across species are weak. tempting to speculate that core genes in the omnigenic model, that directly and strongly affect disease, would likely be among the 43% DEGs broadly conserved across species. Species-specific DEGs would modulate core gene activity according to genetic backgrounds, pathogen genotypes, and other environmental variables and may underlie the frequent variation in core gene regulation that we observed. Our enrichment analyses are consistent with a set of genes that could directly control QDR through their protective or antifungal activity (e.g., terpene and flavonoid biosynthesis genes, endopeptidase inhibitors, toxin catabolism genes), the activity of which could be modulated by numerous transcriptional and posttranscriptional regulators (e.g., WRKY DNA binding, ERF/AP2-type transcription factor, and kinase domain proteins). We identified several DEGs encoding proteins with domains of unknown function (DUF2870, DUF969, DUF3403) representing promising candidates for novel defense proteins. Studies with B. cinerea in Arabidopsis and tomato indicated that genes associated with QDR varied according to the fungal strain being inoculated (Corwin et al., 2016;Soltis et al., 2019), suggesting that the pan-genomic repertoire of QDR-associated transcripts may depend on pathogen genotypes and species and may exceed current estimates.

Molecular Bases of Regulatory Variation and QDR Phenotype Evolution
Although 47% of DEGs belonged to core orthogroups conserved in six plant species, only 2.01% of core orthogroups contained genes upregulated in six species (and 3.56% genes downregulated in six species), highlighting frequent regulatory variation within orthogroups. In our previous work, we identified allelic variants of the ARPC4 gene differing in their promoter region, induction upon pathogen challenge, and associated with resistance to S. sclerotiorum (Badet et al., 2019). These findings are in line with a prominent role of regulatory variation in the evolution of QDR. It is also consistent with the alteration of host gene transcription being a major function of small RNA secreted by several fungal pathogens (Weiberg et al., 2013;Derbyshire et al., 2019). Species-specific gene expression patterns may result from directional selection on gene regulation or specific environmental influences (Romero et al., 2012). Recent progress in genomics allowed disentangling genetic and environmental factors in a number of cases, pointing toward several molecular mechanisms that may be responsible for expression polymorphisms in QDR genes. Molecular mechanisms underlying the evolution of gene expression include variations in epigenetic marks, gene copy number variation, and cis-and trans-regulatory variation.
Epigenetic marks such as DNA methylation control Arabidopsis stress-responsive gene expression upon challenge by bacterial and fungal pathogens (Yu et al., 2013;Le et al., 2014). Histone acetylation, methylation, and ubiquitination and the activity of chromatin remodeling complexes are other epigenetic mechanisms involved in the control of plant defense genes transcription (Ramirez-Prado et al., 2018). Transcription is also controlled by the effect of cis-regulatory elements and trans-acting factors that may drive phenotypic evolution. For instance, the homeobox transcription factor RCO underwent regulatory changes, gene duplication, and loss, leading to leaf shape diversity among Brassicaceae species (Vlad et al., 2014). This cis-regulatory variation results from the emergence of novel enhancer sequences (Long et al., 2016) or from genome shuffling recruiting new genes to transcriptionally active regions. In both cases, whole-genome duplications and transposable elements (TEs) can play prominent roles. Insertion of TE acting as an enhancer of the teosinte branched (tb1) transcription factor gene was a key molecular event in the acquisition of apical dominance during maize (Zea mays) domestication (Studer et al., 2011). In some Arabidopsis accessions, insertion of a TE-derived enhancer mediated WRKY33 binding to the promoter of the CYP82C2 gene, making it inducible upon bacterial pathogen inoculation, activating biosynthesis of the antimicrobial 4-hydroxyindole-3-carbonylnitrile metabolite (Barco et al., 2019). A similar mechanism could have driven the emergence of an S. sclerotiorum-inducible allele of ARPC4 associated with enhanced QDR in Arabidopsis (Badet et al., 2019). The pangenomic repertoire of QDR transcripts conserved across species but differentially regulated upon S. sclerotiorum inoculation provides an original resource for the identification of new transcription enhancer motifs. Since transcription factors often regulate multiple target genes, while cisregulatory motifs act rather locally in the genome, cis-regulatory regions are considered less pleiotropic and therefore more likely to contribute to phenotypic evolution (Wray, 2007). Detailed promoter sequence analyses and cross-species functional validations will be required to estimate the relative contribution of cisand trans-regulatory divergence in the evolution of QDR. The central role of gene expression in QDR phenotypes suggests that engineering the expression or copy number of endogenous QDR genes, rather than introgressing new genes or alleles, could be an efficient strategy to control diseases in the field.

An Exaptation Scenario for the Recruitment of Genes into QDR
Under the Red Queen hypothesis, plant immunity genes represent true adaptations, evolved under selection pressure associated  (A) Phylogenetic relationship of 96 ABCG transporters from six Pentapetalae species covering orthogroup no. 4. The tree obtained by a maximum likelihood analysis is shown, with the number of substitutions per site used as branch length and branch support determined by an approximate likelihood-ratio test (red labels). Terminal nodes are color coded according to plant species. The circular bar plot shows gene expression log 2 fold change (LFC) upon inoculation with specific pathogen populations (Van Valen, 1973). In agreement, signatures of diversifying selection were detected in Arabidopsis and Solanum NLR genes (Mondragón-Palomino et al., 2017;Stam et al., 2019). In natural populations of Arabidopsis, polymorphism at the GS-ELONG locus controlling the structure of aliphatic glucosinolate defense compounds was associated with the presence of specific aphid species in the environment (Züst et al., 2012). In selection experiments, aphid species drove selection for contrasted glucosinolate profiles in five plant generations, demonstrating that herbivores drive the rapid evolution of some plant defense genes. Alternatively, plant genes could have emerged prior to exposure to pathogens to serve nonimmune functions, before being exapted into defense with the rise of pathogen populations. The glucosinolate 3-hydroxypropylglucosinolate produced by Brassicaceae plants has a role in defense and inhibits root growth in diverse dicot species (Malinovsky et al., 2017). Responses to 3-hydroxypropylglucosinolate involve the target of rapamycin pathway conserved across eukaryotes, suggesting that the target of rapamycin pathway was exapted into glucosinolate response early in plant evolution. Similarly, Arabidopsis CYP82C2 gene originated from the duplication the iron-stress response gene CYP82C4 and was exapted into defense through cis-regulatory variation (Barco et al., 2019).
We identified 159 orthogroups featuring genes upregulated by S. sclerotiorum in all six plant lineages, which diverged long before the emergence of the Sclerotinia genus (Navaud et al., 2018). For these genes, responsiveness to S. sclerotiorum likely evolved convergently in multiple plant lineages, through exaptation of ancient stress responses, or a combination of both processes. Here, we provide evidence for exaptation of the ABCG40/PDR12 transporter into QDR against S. sclerotiorum. The indirect pathway for abscisic acid synthesis probably evolved with the colonization of land (Hauser et al., 2011), similar to the ABCG/PDR class of transporters (Hwang et al., 2016). It is therefore probable that the abscisic acid importer function of ABCG40/PDR12 (Kang et al., 2010) evolved with the colonization of land, ;450 Mya, to control drought tolerance, seed germination, and lateral root formation (Campbell et al., 2003;Kang et al., 2010). Consistent with an ancestral role in acclimation to abiotic stress, AtABCG40 was induced upon heat stress and downregulated by salt stress (Suzuki et al., 2016). AtABCG40 is also induced upon inoculation by the fungal pathogens A. brassicicola, Fusarium oxysporum, and more strongly and more rapidly by S. sclerotiorum (Campbell et al., 2003). In line with suppression of abscisic acid synthesis leading to susceptibility to S. sclerotiorum (Perchepied et al., 2010;Zhou et al., 2015;Mbengue et al., 2016), mutants defective in AtABCG40 were more susceptible to S. sclerotiorum than the wild type. Exaptation of AtABCG40 into QDR against S. sclerotiorum associates with strong transcriptional activation of the gene upon challenge with S. sclerotiorum, through a mechanism that remains to be elucidated.

Broad Spectrum and Durability in the Context of Exaptation into Resistance
The evolution of QDR genes through exaptation of ancestral stress responses and developmental pathways could be related to QDR often being relatively broad spectrum and durable (Poland et al., 2009;Roux et al., 2014;Corwin and Kliebenstein, 2017). Signals created by pathogen attack on plants are either molecular signals that can be specific to the pathogen genotype such as pathogen effectors (Guyon et al., 2014;Guo et al., 2018), or abiotic signals. Abiotic signals generated by pathogens during host colonization include altered pH (Xu et al., 2015;Fernandes et al., 2017), water status (Xin et al., 2016;Aung et al., 2018), and physical strain (Wilson and Talbot, 2009). Since similar abiotic signals may be generated by diverse pathogens, as for alkalinization by Pseudomonas phytotoxins (Bender et al., 1999) and by Fusarium RALF-like peptides (Masachis et al., 2016), plant responses mediating acclimation to these signals are likely to be exapted into QDR and to function against a broad spectrum of pathogens. A rapid transcriptional reprogramming was proposed to be critical for broad-spectrum efficiency of plant immune responses, supporting a central role of cis-regulatory divergence in adaptation to pathogens (Mine et al., 2018). In pathogens were single effector molecules are major determinants of the disease outcome, single amino acid substitution, gene presence/absence, and expression polymorphism are frequent mechanisms associated with evasion of recognition by the plant immune system (Raffaele and Kamoun, 2012). Abiotic signals are not directly encoded by pathogen genomes; therefore, evasion of the corresponding plant responses may require very complex genetic alterations to pathogen genomes.
Upon escape from adaptive conflict, genes with exapted function may retain their ancestral functions (Conant and Wolfe, 2008). In some animal lineages, structural proteins forming the ocular lens retained their ancestral argininosuccinate lyase  . Significance of the difference to the wild type were assessed by Student's t tests with Bonferroni correction for multiple testing, with P values shown in green. Boxplots show first and third quartiles (box), median (thick line), and the most dispersed values within 1.5 times the interquartile range (whiskers). enzymatic activity (Piatigorsky et al., 1988). A class of fungal dicarboxylic acid transporters were shown to have retained their ancestral activity and gained the ability to transport tricarboxylic acids after horizontal transfer to oomycetes (Savory et al., 2018). Plant enzymes in the dihydroflavonol-4-reductase and the SA-BATH methyltransferase families evolved novel substrate specificities during the diversification of dicots while retaining weak ancestral activity (Des Marais and Rausher, 2008;Huang et al., 2012). Because of such pleiotropic activity of some QDR genes, the suppression of their activity by pathogens may have adverse effects on availability of plant-derived nutrients or lead to pathogen detection and therefore increase the durability of disease resistance. The reconstruction of immune gene networks may facilitate the systematic investigation of pleiotropy in the QDR gene pool and elucidate how genomic backgrounds affect the activity of disease resistance genes (Mine et al., 2014;Peyraud et al., 2017;Smakowska-Luzan et al., 2018;Wu et al., 2018).

Plant Materials and Sclerotinia sclerotiorum Infection
Arabidopsis (Arabidopsis thaliana) Col-0 genotype obtained from the Nottingham Arabidopsis Stock Centre (accession number N1093), Solanum lycopersicum cv Ailsa Craig, Helianthus annuus cv XRQ, Ricinus communis cv Hale (accession number PI 642000), and Beta vulgaris subsp vulgaris (accession number PI 355961) obtained from the USDA-ARS Germplasm Resources Information Network, and Phaseolus vulgaris (accession number G19833) obtained from the International Center for Tropical Agriculture were used for inoculations. Plants were grown in Jiffy pots under controlled conditions at 23°C, with a 9-h light period at intensity of 120 mmol/m 2 /s under Osram OSLON SSL 80 light-emitting diodes. Fiveweek-old plants were inoculated on fully developed leaves by 0.5-cm-wide plugs of potato dextrose agar (Fluka) colonized by S. sclerotiorum strain 1980 (ATCC18683; Supplemental Methods). Plugs of agar containing the fungal pathogen were placed on the adaxial surface of leaves, and plants were incubated in trays closed with plastic wrap to maintain 80% relative humidity and placed in a Percival AR-41L3 chamber under the same day/ light conditions as for plant growth.

Tissue Sampling, RNA Extraction, and Sequencing
RNA-seq reads from Arabidopsis and S. lycopersicum were obtained from the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database accession GSE106811 (Badet et al., 2017). For all infected plant species, the edge of 25-mm-wide developed necrotic lesions (Peyraud et al., 2019) was isolated with a scalpel blade from leaves placed on a glass slide cooled down by liquid nitrogen and immediately frozen in liquid nitrogen. Samples were harvested before lesions reached 25 mm in width, at various times after inoculation, owing to the different susceptibility level of each species. All inoculations were performed 5 h after lights were turned on; samples were harvested at 24 h postinoculation for H. annuus, between 47 and 50 h postinoculation for Arabidopsis, P. vulgaris, R. communis, and S. lycopersicum, and 72 h postinoculation for B. vulgaris. Samples from noninoculated plants were harvested simultaneously to serve as controls and exclude biases due to synchrony of the plants' circadian clock. To determine the extent to which circadian synchrony varies across species in our data set, we analyzed normalized read counts corresponding to 10 genes of the circadian clock (LHY, TOC1, LUX, RVE8, PRR9, PRR3, PRR5, ZTL, GI, and ELF3) in the six species, in inoculated and healthy plants (Supplemental Methods). In a principal component analysis, the first two dimensions summarized 86.8% of the total inertia. All species variables appeared very well correlated, excepted for Arabidopsis, which differed slightly in a higher TOC1 expression. Correlations between inoculated and healthy samples were significant for all species (r $ 0.66; P value # 0.04). Material obtained from leaves of three plants were pooled for each sample; all samples were collected in triplicates. Samples were ground with glass beads (2.5 mm) in a Retschmill apparatus (24 Hz for 2 3 1 min) before RNA extraction using NucleoSpin RNA extraction kits (Macherey-Nagel) following the manufacturer's instructions. A 30-min DNase treatment (TURBO DNase, Ambion) was applied following manufacturer's instructions to eliminate genomic DNA. The quality and concentration of RNA preparations were assessed with an Agilent 2100 Bioanalyzer and the RNA 6000 Nano kit. For H. annuus, P. vulgaris, R. communis, and B. vulgaris samples, mRNA sequencing was outsourced to Fasteris SA to produce Illumina paired reads (2 3 125 bp) using a HiSeq 2500 instrument. Three independent biological samples were sequenced per condition. The corresponding raw RNA-seq reads data were deposited in NCBI's GEO under accession number GSE138039.

Mapping of Sequence Reads and Analysis of DEGs
Reads were mapped to reference genomes listed in Supplemental Data Set 1.1 using the RNA-seq analysis tool of the CLC Genomics Workbench 11.0. 1 software (Qiagen). The following mapping parameters were used: mismatch cost 2, insertion cost 3, deletion cost 3, length fraction 0.8, similarity fraction 0.8, both strands mapping, and 10 hits maximum per read, with expression value given as total counts. All reads were mapped independently to S. sclerotiorum genome version 2 (Derbyshire et al., 2017) to estimate the fraction of fungal reads per sample. Number of uniquely mapped reads per transcript were exported in tab delimited text format. Differential gene expression analysis was conducted with the DESeq2 Bioconductor package version 1.8.2 (Love et al., 2014) in R 3.4.0 with total uniquely mapped read count per gene as input. Differential expression was calculated using expression in uninfected plants as a reference with ;replicates 1 treatment as the design formula. For interspecific comparisons, LFCs were quantile normalized using the preprocessCore package in R. Genes with LFC $ 1.5 and less than or equal to -1.5 and a Benjamini-Hochberg adjusted P value < 0.01 in DESeq2 Wald test were considered significant for differential expression. For the analysis of AtPDR genes expression in multiple stress conditions, total read count per gene were retrieved from the NCBI GEO repository (accession numbers GSE66290, GSE83478, GSE104590, GSE70094, GSE116269, GSE90075, GSE72548, GSE56922, and GSE72806), and differential gene expression analysis performed in DESeq2 as described above.

Orthologous Gene Groups Analyses
To define groups of orthologous genes, the complete proteomes of the six species were used in a Markov clustering analysis through OrthoMCL version 1.4 via the Family Companion server (Cottret et al., 2018). Parameters were set to pi_cutoff 5 0, pv_cutoff 5 0.00001, pmatch_cutoff 5 80, and inflation 5 1.5. Genes not expressed in our RNA-seq data (read count 5 0) were then discarded from orthogroups. Orthogroups including at least one expressed gene from each of the six plant species were designated as core Pentapetalae groups/genes. Orthogroups including one or more expressed gene from two to five plant species were designated as other orthologous groups/genes. Orthogroups including expressed genes from one single species together with expressed genes not included in orthogroups were designated as lineage specific. The largest 100 orthogroups were manually inspected for consistency in protein sequence, GO, and Pfam annotation. R scripts used for the generation of figures in this manuscript are provided in Supplemental File 6.

Gene Annotation and Enrichment Analyses
GO annotations for Arabidopsis and S. lycopersicum were obtained from the Ensembl Plant database release 37 (Kersey et al., 2018). GO annotations were mapped onto complete plant proteomes of H. annuus, P. vulgaris, B. vulgaris, and R. communis using the Blast2GO 5.2.5 (Conesa et al., 2005). Pfam domain annotations were obtained using the hmmscan function of the HMMER version 3.1b2 software against the Pfam31.0 database (Schaeffer et al., 2017). For enrichment analyses, the number of DEGs and genes not differentially expressed was summed across the six species for each GO and Pfam. Several occurrences of a same Pfam domain within a gene were counted once. GOs were mapped onto Pfam domains using the PFAM2GO file version 2019/12/14 provided by Interpro. Counts were used in a chi-squared test performed with R 3.5.0, and P values were adjusted with Bonferroni correction for multiple testing.
Enrichment fold was defined as the ratio between proportion of DEGs in genes harboring a given GO or Pfam over the overall proportion of DEGs among expressed genes in each species. Annotations harboring enrichment fold > 1 and adjusted P value < 0.01 were considered significantly enriched in DEGs. Annotations harboring enrichment fold <1 and adjusted P value < 0.01 were considered significantly depleted in DEGs. GO network rendering was performed using the BiNGO plugin in Cytoscape 3.6.1. To visualize enrichment across all species for the complete GO network, the following parameters were used: statistical test-, significance level: 1.00, categories to be visualized: All categories, reference set: Use whole annotation as reference set, select ontology file: GO_Full, select organism/ annotation: custom. As a custom annotation file, a list of 143,807 GO annotations from DEGs of the six species, including 6962 distinct GO terms, was provided (Supplemental File 7). Using the 'Import table from file' option, an attribute table containing DEG enrichment score (-log of adjusted P value from chi-squared test), up/down ratio, and number of species with upregulated genes for each of 6962 GO (Supplemental File 8) was mapped onto the full GO network. Rank in the GO hierarchy were determined using the GOMFANCESTOR, GOCCANCESTOR, and GOB-PANCESTOR functions from the GO.db package version 3.8 from Bio-cManager in R 3.5.0. Violin plot rendering was performed using the easyGgplot2 package in R 3.4.0.

Bootstrapping and Expression Consistency Analyses
Bootstrapping and consistency analysis was performed using homemade scripts in R 3.5.0, provided in Supplemental Files 3 and 4. To simulate distributions of upregulated genes among core orthogroups, a total of 9520 upregulated gene states and 53,318 not upregulated gene states were shuffled among 7918 core orthogroups containing at least one DEG. Orthogroups in this simulation had the same number of genes from each species as experimentally determined core orthogroups. For each orthogroup, a minimal gene count is determined, corresponding to the number of genes for the species with the least genes in the group. Gene states are drawn first for minimal gene counts in each orthogroup, with h species assigned the same draw. The remaining gene states are drawn randomly for each species in each orthogroup. The percentage of orthogroups featuring zero to six species with at least one upregulated gene state was calculated of 100 bootstrap replicates. The procedure was repeated using 13,201 downregulated gene states and 49,637 not downregulated gene states. Residual sum of squares was calculated using the ssq function of the hydroGOF package in R 3.4.0. The CEI was computed on the basis of the number of clusters of core ortholog genes with similar expression. The number of clusters was computed by silhouette analysis (Rousseeuw, 1987) performed by the factoextra R library (Charrad et al., 2014). The index of conservation expression was then computed as the difference between the proportion of genes in the largest cluster and the proportion associated to one gene of the cluster. By construction, an index of conservation of expression equal to 0 indicates a group composed by genes with independent level of expression. An index tending toward 1 indicates a unique pattern of genes expression.

Molecular and Phenotypic Characterization of abcg40 Mutants
Arabidopsis T-DNA mutant lines SALK_005635 (abcg40-2), SAIL_885_E09 (abcg40-3), SALK_148565 (abcg40-4), and SALK_013945 (abcg40-5) were obtained from the Nottingham Arabidopsis Stock Centre. Plants were grown as described above. Homozygous T-DNA insertions were verified by PCR performed with primers given in Supplemental Data Set 1.33. The accumulation of ABCG40 transcripts in abcg40 mutants was assessed by reverse transcription quantitative PCR performed on a LightCycler480 device (Roche) following the manufacturer's recommendation. RNA extraction, cDNA synthesis, and quantitative PCR reactions were performed as described in Badet et al. (2015) using primers given in Supplemental Data Set 1.33 and the At2g28390 gene as a housekeeping reference. A strain of S. sclerotiorum 1980 expressing GFP fused to the endogenous OAH1 gene (Badet et al., 2017) was inoculated on 4week-old plants as described above. Lesions were imaged 24 h later with a Zeiss Axio Zoom V16 microscope under bright light and fluorescent illumination. Lesion areas were determined using the ImageJ 1.51k software. For this, fully developed lesions that have not reached the leaves borders were manually set as regions of interest. The area of the agar plug used for inoculation served as reference to normalize lesion measurements across images. Statistical analyses of disease phenotypes were performed in R via type III two-way ANOVA for unbalanced designs, and significance of the difference between genotype was assessed using Student's t test with Bonferroni correction for multiple testing since no experiment replicate effect was detected.

Accession Numbers
Sequencing data generated in this work has been deposited in the NCBI's GEO and are accessible through GEO series accession number GSE138039.  Figure 1C)

SUPPLEMENTAL
Supplemental Data Set 1.4. Log fold change, differential expression analysis and orthogroup assignment for Arabidopsis thaliana genes. (Related to Figure 1D) Supplemental Data Set 1.5. Log fold change, differential expression analysis and orthogroup assignment for Phaseolus vulgaris genes. (Related to Figure 1D) Supplemental Data Set 1.6. Log fold change, differential expression analysis and orthogroup assignment for Beta vulgaris genes. (Related to Figure 1D) Supplemental Data Set 1.7. Log fold change, differential expression analysis and orthogroup assignment for Ricinus communis genes. (Related to Figure 1D) Supplemental Data Set 1.8. Log fold change, differential expression analysis and orthogroup assignment for Helianthus annuus genes.
(Related to Figure 1D) Supplemental Data Set 1.9. Log fold change, differential expression analysis and orthogroup assignment for Solanum lycopersicum genes. (Related to Figure 1D) Supplemental Data Set 1.10. List of genes assigned to orthogroups. (Related to Figure 2A) Supplemental Data Set 1.11. Number of species, of expressed genes and DEGs per orthogoup. (Related to Figure 2A) Supplemental Data Set 1.12. Distribution of orthogroups per species. (Related to Figure 2B) Supplemental Data Set 1.13. Complete statistical analysis for enrichment in DEGs among core, ortho and lineage-specific genes.
(Related to Figure 2D and Table 1) Supplemental Data Set 1.14. Parsimonious inference of orthogoups evolution. (Related to Figure 3A)