Resistance Gene Analogs in the Brassicaceae: Identification, Characterization, Distribution, and Evolution

The Brassicaceae consists of a wide range of species, including important Brassica crop species and the model plant Arabidopsis (Arabidopsis thaliana). Brassica spp. crop diseases impose significant yield losses annually. A major way to reduce susceptibility to disease is the selection in breeding for resistance gene analogs (RGAs). Nucleotide-binding site leucine-rich repeats (NLRs), receptor-like kinases (RLKs), and receptor-like proteins (RLPs) are the main types of RGAs; they contain conserved domains and motifs and play specific roles in resistance to pathogens. Here, all classes of RGAs have been identified using annotation and assembly-based pipelines in all available genome annotations from the Brassicaceae, including multiple genome assemblies of the same species where available (total of 32 genomes). The number of RGAs, based on genome annotations, varies within and between species. In total, 34,065 RGAs were identified, with the majority being RLKs (21,691), then NLRs (8,588) and RLPs (3,786). Analysis of the RGA protein sequences revealed a high level of sequence identity, whereby 99.43% of RGAs fell into several orthogroups. This study establishes a resource for the identification and characterization of RGAs in the Brassicaceae and provides a framework for further studies of RGAs for an ultimate goal of assisting breeders in improving resistance to plant disease.

Several members of this family are also employed as model species, including Arabidopsis (Arabidopsis thaliana), which is the most widely used plant in research; Arabidopsis halleri, a model species for heavy metal tolerance and hyperaccumulation (Palmer et al., 2001); Lepidium meyenii, a model species for floral structure (Lee et al., 2002); Eutrema salsugineum, a model species for salinity stress (Wu et al., 2012); and Barbarea vulgaris, a model species for insect resistance (Nielsen et al., 2010). Huang et al. (2016) grouped tribes from the Brassicaceae into six major clades (A to F) using nuclear markers from newly sequenced transcriptomes of 32 Brassicaceae species. Clade A includes plants from the genera Lepidium, Arabidopsis, Capsella, and Boechera; the genera Brassica, Eutrema, Raphanus, Arabis, and Thlaspi are examples of clade B; the tribes Cochlearieae, Anastaticeae, and Biscutelleae are from clade C; clade D includes the tribe Alysseae; clade E includes species from four tribes, including Chorisporeae and Hesperideae; and clade F includes the Aethionemeae tribe. Guo et al. (2017) also reported a similar phylogeny of Brassicaceae using the plastome of 53 species of Brassicales. However, clade D was not identified as a result of the limited taxon sampling.
Crop species from the Brassicaceae are often affected by several diseases, including blackleg (Leptosphaeria maculans), Sclerotinia stem rot (Sclerotinia sclerotiorum), downy mildew (Hyaloperonospora parasitica), clubroot (Plasmodiophora brassicae), and Turnip mosaic virus (Rimmer et al., 2007; Neik et al., 2017). Plant resistance gene analogs (RGAs) play a role in host resistance to disease and consist of genes with conserved domains and motifs and diverse structure, function, and evolution (Sekhwal et al., 2015). The nucleotide-binding-site leucine-rich repeat (NLR) gene family is the best-known family of RGAs, with a major role in plant disease resistance (Meyers et al., 1999; McHale et al., 2006). NLR genes are known as Resistance genes (R genes). R gene-induced resistance occurs in a gene-for-gene manner whereby an R gene in the host plant has a corresponding avirulence gene in the pathogen (Flor, 1971). In a typical NLR gene, the nucleotide-binding site (NBS) and leucine-rich repeat (LRR) domains are located in the middle and in the C terminus of the gene, respectively (Meyers et al., 1999; Xiao et al., 2001; Shao et al., 2014). The remaining structure of NLR proteins consists of one of three main domains at the N terminus, which are also used to classify R genes: the TIR-NBS-LRR (TNL) class is characterized by a Toll/IL-1 receptor (TIR) domain, the CC-NBS-LRR (CNL) class contains the coiled-coil domain, and the RPW8-NBS-LRR (RNL) class contains the RESISTANCE TO POWDERY MILDEW8 (RPW8) domain.
Receptor-like protein kinases (RLKs) or membrane-associated receptor-like proteins (RLPs), known as pattern recognition receptors (PRRs), are another class of RGA and the main component of the first line of plant immunity (Zipfel, 2014; Sekhwal et al., 2015). In plants, RLKs are the most abundant RGAs, and their structure is very similar to that of RLPs. They have an extracellular domain at the beginning of their N terminus involved in the perception of the microbial pattern and a transmembrane helix domain that can anchor the RLP and RLK in the plasma membrane. However, RLKs differ from RLPs by having a cytoplasmic kinase domain such as Ser/Thr protein kinases and Tyr kinase (STTK) at their C termini (Walker, 1994; Shiu and Bleecker, 2003; Sekhwal et al., 2015). Plant PRRs are classified according to their extracellular N-terminal domain. The major PRR subclasses involved in pathogen recognition carry a Lys motif or an LRR domain (Couto and Zipfel, 2016). RLPs do not possess any signaling domain in their intracellular region, and their domain structure is similar to extracellular domains of RLKs, suggesting that they might function in conjunction with one or several RLKs (Shiu and Bleecker, 2003; Zipfel, 2014). Along with defense mechanisms, RLKs and RLPs are also involved in developmental processes (Sekhwal et al., 2015), including meristem and stomatal development (Jeong et al., 1999; Nadeau and Sack, 2002).
RGAs play an important role in plant defense and have been widely used in breeding programs to improve crop disease resistance. A better understanding of their function, structure, and distribution is extremely valuable for breeders to enhance crop disease resistance. Several studies have analyzed the diversity and evolution of RGAs among the Brassicaceae (Fritz-Laylin et al., 2005; Wang et al., 2008; Shao et al., 2016; Yu et al., 2016; Zhang et al., 2016b); however, most focus on one or a few members of the Brassicaceae, and no study includes all sequenced species from this family. Moreover, the analyses have been undertaken using different methodologies, making the results between studies harder to compare. In this research, we used a unified methodology for RGA identification, phylogeny, and distribution analysis in all currently available genome annotations for 25 species of the Brassicaceae, including both wild and domesticated species, and one species from the Cleomaceae, the Brassicaceae sister family (Cheng et al., 2013). The results provide a valuable resource of candidate R genes in the Brassicaceae, which can be employed by breeders to improve disease resistance in Brassica spp. crops.
RLKs and RLPs were subdivided into three classes based on their N-terminal domain, namely LRR, LysM, and Other. RLKs were further divided into RD and non-RD classes, where RD refers to the kinases that are involved in phosphorylation with positively charged Arg (R) and negatively charged Asp (D) amino acid residues at the activation site and non-RD refers to kinases that are lacking these amino acids (Dardick et al., 2012). In total, the majority of candidate RLKs and RLPs were of class LRR, 9,066 and 3,702, respectively, whereas only 157 LysM-RLKs and 84 LysM-RLPs were identified. Fewer than 10 LysM-RLPs were identified in each of the Brassicaceae genomes. C. sativa and L. meyenii contained the highest number of LysM-RLKs, 15 and 10, respectively. The additional contigs of the B. napus and B. oleracea pangenomes were not found to contain any LysM-RLK or LysM-RLP RGAs (Supplemental Table S1).
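The subdivision of RLKs and RLPs by N-terminal domain can be sketched as a simple rule. The following is a minimal illustration only and not the script used in this study: the function name `prr_subclass`, the domain labels, and the choice to give LRR precedence over LysM when both are present are assumptions.

```python
def prr_subclass(gene_class, domains):
    """Return a label such as 'LRR-RLK' for a receptor gene.

    gene_class: 'RLK' or 'RLP', as assigned by an upstream pipeline.
    domains: iterable of domain names detected in the protein.
    The domain names and the LRR-before-LysM precedence are assumptions.
    """
    if gene_class not in ("RLK", "RLP"):
        raise ValueError(f"not a PRR class: {gene_class}")
    found = {d.upper() for d in domains}
    if "LRR" in found:
        prefix = "LRR"
    elif "LYSM" in found:
        prefix = "LysM"
    else:
        prefix = "Other"
    return f"{prefix}-{gene_class}"


# Hypothetical domain compositions, for illustration only
examples = [
    ("RLK", ["LRR", "TM", "STTK"]),
    ("RLP", ["LysM", "TM"]),
    ("RLK", ["TM", "STTK"]),
]
labels = [prr_subclass(c, d) for c, d in examples]
print(labels)
```

Counting the resulting labels per genome would give a table of the kind summarized above.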
Despite the greater overall total number of TNLs in the Brassicaceae, six of the genomes were found to contain more CNLs than TNLs, including B. juncea (TNLs, 30; CNLs, 49), L. meyenii (TNLs, 18; CNLs, 86), and Raphanus raphanistrum (TNLs, 15; CNLs, 18). The number of RNLs and RNs identified was lower than that of TNLs and CNLs, with 14 genomes not containing any RNLs. Furthermore, no RNLs were found in the additional contigs of B. napus and B. oleracea pangenomes, even though these extra contigs contained all other classes of RGAs (Supplemental Table S1).

Identification of NLRs in the Brassicaceae Using NLR-Annotator
To evaluate the impact of different assembly and annotation methods on RGA prediction, NLR genes were also identified using the assembly-based NLR-Annotator, in contrast to the annotation-based RGAugury. The total numbers of NLRs for both methods were comparable: RGAugury (8,588 NLR genes) and NLR-Annotator (8,889 NLR genes; Fig. 4; Supplemental Table S1).
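As a quick check of how close the two totals are, the relative difference can be computed directly from the reported numbers. The variable names are illustrative.

```python
# Total NLR counts reported for the two methods
rgaugury_nlrs = 8588
nlr_annotator_nlrs = 8889

difference = nlr_annotator_nlrs - rgaugury_nlrs
relative_difference = difference / rgaugury_nlrs

print(f"difference: {difference} NLRs "
      f"({relative_difference:.1%} of the RGAugury total)")
```

NLR-Annotator therefore reports 301 more NLRs, about 3.5% of the RGAugury total.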

RGA Distribution on Chromosomes/Pseudochromosomes and Subgenomes
We investigated the distribution and density of RGAs in the annotations of the closely related genomes B. oleracea, B. rapa, and B. napus. Overall, RGAs were unevenly distributed along the chromosomes. The RGA distribution pattern in the C subgenome (B. oleracea) and the A subgenome (B. rapa) is comparable with their counterpart subgenome in B. napus (Fig. 5). In general, a similar distribution and density pattern of RLKs was found between subgenome A in B. napus and B. rapa. However, some regions, such as the end of chromosome A3 and the beginning of chromosome A8 in B. rapa, do not contain any RGAs, which is not the case in A3 and A8 of B. napus. Similarly, in the C subgenome, there are some similarities and differences between B. napus and B. oleracea. For example, neither species contains any RGAs in the middle of chromosome C9, and chromosome C7 has a similar distribution and density pattern in both species (Fig. 5).
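Gene density along a chromosome, and regions devoid of RGAs, can be summarized by counting genes in fixed-size windows. The sketch below is one way to do this and is not taken from the study: the function name `window_counts`, the window size, and the coordinates are hypothetical.

```python
from collections import Counter


def window_counts(positions, chrom_length, window):
    """Count genes per fixed-size window along one chromosome.

    positions: gene start coordinates (0-based, in bp).
    Returns one count per window, including windows with no genes.
    """
    n_windows = -(-chrom_length // window)  # ceiling division
    counts = Counter(pos // window for pos in positions)
    return [counts.get(i, 0) for i in range(n_windows)]


# Hypothetical RGA start coordinates on a 3-Mb chromosome
rga_starts = [120_000, 450_000, 480_000, 2_300_000, 2_350_000, 2_900_000]
counts = window_counts(rga_starts, chrom_length=3_000_000, window=500_000)
empty = [i for i, c in enumerate(counts) if c == 0]

print(counts)  # genes per 500-kb window
print(empty)   # indices of windows without any RGA
```

Comparing such count vectors between homologous chromosomes of two assemblies highlights windows that are empty in one genome but populated in the other.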
RGA analyses were carried out for the A and C subgenomes in B. napus 'Darmor-bzh V8', 'Tapidor', 'Darmor-bzh V4', and 'ZS11'. The total number of RLKs was found to be similar between the A and C subgenomes in each of the cultivars and assembly versions of B. napus (Supplemental Table S1). However, the total number of NLRs was found to be noticeably higher in the C subgenome of all cultivars and assembly versions of B. napus than in subgenome A (Supplemental Table  S1). For example, subgenomes A and C in B. napus 'Darmor-bzh V8' contained 479 and 486 RLKs and 119 and 157 NLRs, respectively. For RLPs, there was no specific pattern: whereas both subgenomes contained similar numbers of RLPs in B. napus 'Darmor-bzh V8' and 'Darmor-bzh V4', in B. napus 'Tapidor' and 'ZS11', the C subgenomes have noticeably more RLPs than the A subgenome (Supplemental Table S1).
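The contrast between RLKs and NLRs in 'Darmor-bzh V8' is easier to see as a C-to-A ratio computed from the counts given above. The dictionary layout is illustrative.

```python
# Counts reported for B. napus 'Darmor-bzh V8': (subgenome A, subgenome C)
darmor_v8 = {"RLK": (479, 486), "NLR": (119, 157)}

ratios = {cls: c / a for cls, (a, c) in darmor_v8.items()}
for cls, ratio in ratios.items():
    print(f"{cls}: C/A = {ratio:.2f}")
```

The RLK ratio is close to 1 (about 1.01), whereas the C subgenome carries roughly 1.32 times as many NLRs as the A subgenome.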

Orthogroup Clustering and Phylogenetic Analysis
We investigated the protein sequence similarity of all candidate RGAs across all genomes using OrthoFinder. All RGAs were grouped into 1,396 multigene clusters (orthogroups). The cluster with the largest number of RGAs contained 2,969 genes, including 917 non-RGAs, 2,028 NLRs, zero RLKs, and 24 RLPs. Of the 2,028 NLR-RGAs, the majority were of class TNL (1,098), followed by 428 NLs and 141 TNs.
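The composition of this largest cluster can be tabulated from the reported counts to confirm that the parts sum to the cluster size and to express each class as a share. The variable names are illustrative.

```python
# Reported composition of the largest orthogroup
largest_cluster = {"non-RGA": 917, "NLR": 2028, "RLK": 0, "RLP": 24}

total = sum(largest_cluster.values())
shares = {k: v / total for k, v in largest_cluster.items()}
dominant = max(largest_cluster, key=largest_cluster.get)

print(total)
for cls, share in shares.items():
    print(f"{cls}: {share:.1%}")
print("dominant class:", dominant)
```

The counts sum to 2,969 genes, of which NLRs make up about 68.3% and non-RGAs about 30.9%.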
CNLs and RNLs, similar to TNLs, mostly grouped together, and their clusters also contained high numbers of their respective atypical genes. For instance, the two clusters with the largest number of CNLs (572 and 168) contained the highest number of CNs, 108 and 56, respectively (Supplemental Table S2). The number of non-RGAs in clusters varied from zero to 2,195. Out of 1,396 clusters, 12 contained at least 1,000 non-RGAs and 256 contained at least 100 non-RGAs. The non-RGA domain structure was examined among 20 clusters with the highest number of candidate RGAs, and the results showed that most of the non-RGAs in each cluster contained some conserved domains of the dominant RGAs in the cluster (Supplemental Table S2).
To gain insight into the dynamic evolution of NLR genes, phylogenetic analysis based on the genome annotation was performed for all 30 genomes and two extra contigs of pangenomes. The tree formed two main clades: one clade included only species from the Brassicaceae family and the other clade included the Tarenaya genus from the Cleomaceae family. All the genomes from the same species grouped together: all species from the Brassica genus were in one clade and all species from the Arabidopsis genus grouped together (Figs. 1 and 2).
The phylogenetic signal was also tested for RLKs, RLPs, and NLRs across the Brassicaceae using phylosignal (Keck et al., 2016). Significant positive autocorrelation (P < 0.05) was detected in clade B of the phylogenetic tree for RLKs and NLRs and in clades A and B for RLPs. There was a strong phylogenetic signal for RLPs among three species (Leavenworthia alabamica, B. vulgaris, and C. hirsuta) in clade A that was not present in species that are distanced phylogenetically. For all three subclasses of RGAs there were strong positive phylogenetic signals for B. napus 'Darmor-bzh V4' and B. napus 'ZS11'; however, there were no positive phylogenetic signals for other B. napus cultivars and their other surrounding species in the clade (Fig. 6).
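Positive autocorrelation of a trait along a phylogeny is often summarized with Moran's I. The sketch below computes a global Moran's I for a per-genome value (for example, an RGA count) given a matrix of phylogenetic proximity weights. It is not the phylosignal implementation and includes no significance test; the function name `morans_i`, the toy weights, and the toy values are assumptions chosen for illustration.

```python
def morans_i(values, weights):
    """Global Moran's I for values given a symmetric weight matrix.

    values: one numeric value per taxon.
    weights: n x n matrix; weights[i][j] is the proximity of taxa i and j
             (diagonal entries are ignored).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    s0 = sum(weights[i][j] for i, j in pairs)
    cross = sum(weights[i][j] * dev[i] * dev[j] for i, j in pairs)
    var_sum = sum(d * d for d in dev)
    if s0 == 0 or var_sum == 0:
        raise ValueError("Moran's I is undefined for zero weights "
                         "or constant values")
    return (n / s0) * (cross / var_sum)


# Toy tree: two pairs of sister taxa; weight 1 within a pair, 0 otherwise
weights = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1, 1, 5, 5], weights)    # sisters share values
alternating = morans_i([1, 5, 1, 5], weights)  # sisters differ

print(clustered, alternating)
```

Values near +1 indicate that closely related taxa carry similar values, whereas values near -1 indicate that close relatives differ. In practice the weights would be derived from branch lengths, and significance would be assessed by permutation.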

Distribution of RGAs in the Brassicaceae Family
In this study, the whole-genome distribution of RGAs was studied among different members of the Brassicaceae. The results revealed variation in RLK, RLP, and NLR gene numbers among different species of this family. The large number of RLKs compared with the other RGAs has been previously reported in several species, including Arabidopsis (Shiu and Bleecker, 2001;Meyers et al., 2003), rice (Oryza sativa; Shiu et al., 2004;Zhou et al., 2004;Fritz-Laylin et al., 2005), and Fragaria vesca (Li et al., 2018b). This is likely to be due to RLKs also having a signaling function and their involvement in different plant processes, such as growth and development, as well as defense (Shiu et al., 2004), whereas NLR genes are primarily involved with plant resistance responses (McHale et al., 2006).  trichocarpa; LRR-RLP:82; Petre et al., 2014). Similarly, in this study, the majority of RLKs and RLPs were classified as LRR-RLK/RLP. Here, a large number of non-RD RLKs were identified across the Brassicaceae, and it has been suggested that non-RD RLKs are mainly involved in immunity responses (Dardick et al., 2012;Rameneni et al., 2015).
Different numbers of NLR genes have also been reported within and between plant species, including Arachis duranensis (Song et al., 2017), Manihot esculenta (Lozano et al., 2015), Arabidopsis and B. rapa (Mun et al., 2009), B. rapa, C. rubella, and Arabidopsis lyrata (Zhang et al., 2016b), Arabidopsis (Meyers et al., 2003; Kong et al., 2018), B. napus, B. rapa, and B. oleracea (Alamery et al., 2018), the B. oleracea pangenome (Golicz et al., 2016), tomato (Solanum lycopersicum), pepper (Capsicum annuum), and potato (Seo et al., 2016), Fragaria × ananassa, Fragaria iinumae, Fragaria nipponica, Fragaria nubicola, and Fragaria orientalis (Zhong et al., 2018), and wheat (Gu et al., 2015). The use of different gene prediction pipelines and different versions of a genome assembly leads to discrepancies in the reported numbers of NLR genes among studies, including our study. It has been confirmed that different approaches in genome annotation and masking of repetitive elements have a major impact on gene prediction (Slotkin, 2018). In our study, the quality of different assembly versions of the genera Arabidopsis and Raphanus was comparable, and no significant differences were observed between the percentages of candidate RGAs. However, there were noticeable differences among different assembly versions of B. napus, potentially due to different repeat-masking approaches.
The higher number of TNLs than CNLs and RNLs observed in our study is similar to the findings in B. rapa and B. napus (Alamery et al., 2018) and Arabidopsis (Meyers et al., 2003). However, some of the genomes contain more CNLs than TNLs, such as Aethionema arabicum and R. raphanistrum, which was also reported in potato (Seo et al., 2016) and Medicago truncatula (Ameline-Torregrosa et al., 2008). The reports of RNLs being the least prevalent among different classes of NLRs in 25 species of angiosperms (Shao et al., 2014; Zhang et al., 2016b) are consistent with our findings. RNLs are not directly involved in pathogen recognition, and they assist other NLR genes to accomplish a resistance response (Xiao et al., 2001). Therefore, plants do not need a high number of RNLs, whereas a wide range of TNLs and CNLs is necessary for plants to expand their ability for pathogen recognition and counteract rapid pathogen evolution. The observed variation of identified RGAs and their uneven distribution along the genomes indicate that the genome evolutionary events, namely whole-genome duplication (WGD), whole-genome triplication (WGT), transposon-mediated gene duplication, and tandem and segmental gene duplication, have an important role in generating different numbers of RGAs (Walker et al., 1995; Mun et al., 2009; Franzke et al., 2011; Lisch, 2013). Loss of function, subfunctionalization, and neofunctionalization play major roles in the evolution of duplicated genes (Lynch and Conery, 2000) by increasing the retention rate of these genes (Rastogi and Liberles, 2005). The involvement of these processes in the evolution of RGAs has also been reported in other plant families such as Rosaceae (Zhong et al., 2015, 2018).

NLR-Annotator and RGAugury Comparison
Different annotation approaches can affect the prediction of NLR genes. To evaluate the results from RGAugury, a pipeline for RGA identification using all predicted proteins, we used NLR-Annotator, a tool for de novo genome annotation of NLR loci. The total number of candidate NLRs was very similar between the RGAugury and NLR-Annotator methods. Most of the assemblies that have been used in this study are Illumina based; however, a few assemblies are based on other sequencing methods; for example, B. rapa and B. napus 'Darmor-bzh' are PacBio and Illumina based (Supplemental Table S1).
Only one assembly showed substantial differences between the two methods. The results from RGAugury show that B. napus 'Darmor-bzh V8' contains substantially fewer RGAs in comparison with B. napus 'Darmor-bzh V4'; however, using NLR-Annotator, B. napus 'Darmor-bzh V8' and 'Darmor-bzh V4' contain exactly the same number of RGAs. Of note, B. napus 'Darmor-bzh V8' and 'Darmor-bzh V4' are different in both annotation and assembly. As NLR-Annotator depends only on the assembly and not on the annotation, this observation suggests that the differences between B. napus 'Darmor-bzh V4' and 'Darmor-bzh V8' result from the use of different annotation methods and that RGA prediction is not affected by assembly methods. The stringent repeat-masking method used for both B. napus 'Darmor-bzh V8' and 'Tapidor' (Bayer et al., 2017) can explain the lower number of RGAs identified by RGAugury in comparison with that identified for other B. napus cultivars and assembly versions. Using NLR-Annotator, the number of RGAs in cv Tapidor is comparable with other B. napus cultivars. These observations show that the results have not been significantly affected by sequencing and assembly methods.

RGA Distribution on Pseudomolecules
The distribution of RGAs among and within the pseudomolecules of B. napus and its progenitors B. rapa and B. oleracea was not even, which is similar to observations made in other studies (Chalhoub et al., 2014; Yu et al., 2014; Golicz et al., 2016; Zhang et al., 2016b). The high similarity of distribution and density pattern of RGAs in the C subgenome (B. oleracea) and the A subgenome (B. rapa) with their counterpart subgenome in B. napus shows that RGAs have been conserved during genome evolutionary events. The results also show that a chromosome might be rich in one class of RGAs whereas it has only a few or no genes of another class. Different NBS-encoding gene numbers were also reported in C. rubella, Thellungiella salsuginea, A. lyrata, Arabidopsis, and B. rapa (Zhang et al., 2016b). The observed differences of distribution and density pattern of RGAs between different assembly versions of B. napus and B. rapa reflect the importance of the genome assembly quality, whereby the accuracy of gene prediction mainly depends on the quality of genome assembly.

Orthogroup Clustering and Phylogenetic Analysis
Based on Orthogroup clustering, most of the genes from the same class of RGAs, across all genomes, grouped together. This observation confirms a significant homology between RGA protein sequences among all species. Non-RGAs in a cluster carry some of the key domains of the dominant RGA class in the cluster. Non-RGAs in RLK-dominated clusters consisted of different types of kinase domain, such as STTK, and other types of domains, including PAN/Apple domain, a type of RLK receptor (Shiu and Bleecker, 2001); LRR; legume lectin domain; and Gnk2-homologous domain. Non-RGAs in NBS-dominated clusters mostly include a winged helix-like DNA-binding domain, which is a subdomain of the NBS domain (McHale et al., 2006), and LRR and TIR domains. The non-RGAs might be an incomplete form of the RGAs that have not been identified with the applied pipeline (RGAugury).
Our phylogenetic analysis is consistent with the proposed Brassicaceae phylogeny and evolutionary history presented by Guo et al. (2017) and Huang et al. (2016). Species from the Brassicaceae family formed a separate clade from the Cleomaceae family. The clade of the Brassicaceae family also divided into three subclades: one subclade includes all species from clade A, the second one includes all species from clade B, and the third one includes the Aethionema genus from clade F.
NLR distribution among the species shows that species with a larger genome size (such as Brassica spp.) contain a higher number of NLR genes; however, they contain a lower percentage of typical NLR genes and a higher percentage of atypical NLR genes compared with species with a small genome size (such as Arabidopsis). Additionally, the difference between the percentages of typical and atypical NLRs is more significant in species with a larger genome size. This suggests that the large number of R genes must be costly, so that in the Brassicaceae family, despite the increased genome size during genome evolutionary events, the R gene number has not increased proportionally with genome size. The biological cost of containing a high copy number of R genes is increasing energy consumption under stress conditions, when plant adaptation relies on energy-saving responses (Tomé et al., 2014), in order to balance the transcription and translation of R genes where the high expression of R genes can be lethal for plants (Li et al., 2012). The lower percentage of NLRs among Brassica spp., those species with larger genome sizes, could also be a consequence of extensive gene loss, which occurred during WGD, WGT, and polyploidization events in these species. Gene loss is reported to be common during polyploidization (Town et al., 2006), and extensive gene and segmental loss have been frequently reported in B. oleracea (Town et al., 2006;Liu et al., 2014;Alamery et al., 2018), B. napus (Parkin et al., 2005;Chalhoub et al., 2014;Alamery et al., 2018), B. rapa (Alamery et al., 2018), and B. juncea (Yang et al., 2016). Consequently, both genome evolutionary events and the biological cost of the high copy number of R genes lead plants to keep the number of R genes at a definite range, regardless of genome size. Phylogenetic signals were observed among species with close phylogenetic distance for all three classes of RGA (i.e. RLKs, RLPs, and NLRs). 
However, the strong positive phylogenetic signals that were detected for B. napus 'Darmor-bzh V4' and B. napus 'ZS11' were not detected for other B. napus cultivars and their other surrounding species in the phylogenetic tree, which suggests that the observed signals for B. napus 'Darmor-bzh V4' and B. napus 'ZS11' might be false-positive signals due to technological artifacts. This is more likely linked to the different annotation methods, as we show that RGA prediction is not affected by the sequencing technologies and assembly approaches in this study.
In summary, we identified more than 34,000 RGAs across Brassicaceae wild and domesticated species. In Brassica spp., despite their large genome size and WGD, WGT, and polyploidization events, the number of R genes has not expanded widely. Comparative analysis indicates that the number and distribution of RGAs greatly vary among species, whereas orthogroup clustering confirmed a high homology of the RGA proteome across all genomes. Despite many studies that have identified RGAs across different plant species, only a few R genes have been cloned due to the complexity of fine mapping of R genes, which is partially due to the lack of information about their genomic structure and distribution. These complications are further intensified in plants that experienced WGD and WGT, like Brassica spp., which harbor many copies of RGAs. Different methods of RGA identification make further fine mapping and comparative genomic analysis between species more complicated. Here, by using a unified methodology, we performed a comparative analysis of RGAs among all sequenced wild and cultivated species in addition to two sets of extra contigs from two Brassica spp. pangenomes. These comparative analyses provide a better insight into the genomic distribution and variation of RGAs across this plant family, which can be used to assist the identification and cloning of RGAs from previously untapped sources and their subsequent application in breeding programs for producing resistant cultivars.

Genomic Resources
Whole-genome RGA identification was performed on 32 sequenced and annotated genomes, including 30 genomes from the Brassicaceae in addition to two sets of extra contigs of Brassica napus and Brassica oleracea pangenomes and one genome from Cleomaceae, included as an outlier. To minimize the assembly impact on RGA prediction, different assembly versions for identical species were included (Table 1).

Identification and Classification of RGA Genes in the Brassicaceae Family
RGAs were identified using RGAugury, a pipeline for genome-wide RGA prediction. Three main classes of RGAs were identified and classified: RLK, RLP, and NLR genes. The RGAugury pipeline also divided the NLR gene family members into several subgroups according to their domain architecture, namely NBS, CNL, TNL, TN, CN, NL, TX (TIR with unclassified domains), and Other. Genes carrying RPW8 domains were manually reassigned based on their original domains: genes classified as NBS with an RPW8 were reclassified as RN, genes classified as NL with an additional RPW8 domain were reclassified as RNL, and all other remaining genes (TNL, CNL, CN, TN, and TX) carrying an additional RPW8 domain were reclassified as Other. Among NLRs, the TNL, CNL, and RNL subgroups were named typical NLRs, and the remaining subgroups, which contain partial or disordered domains, were named atypical NLRs. Using a Python script, RLKs and RLPs were divided into three subclasses: PRRs containing an LRR domain (LRR-RLK/RLP), PRRs containing Lys motifs (LysM-RLK/RLP), and PRRs with any other domain (other-RLK/RLP). RLKs were further divided into RD and non-RD classes by performing BLAST between RLK candidates and previously published RD and non-RD RLK candidates in Brassica rapa (Rameneni et al., 2015). To evaluate the effect of annotation on gene prediction, we also used NLR-Annotator (Steuernagel et al., 2018) for the identification of NLR genes based on motifs present in the genome assembly.
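The manual RPW8 reassignment and the typical/atypical split described above amount to a small set of rules, which can be written down as follows. This is a sketch under stated assumptions rather than the procedure used in the study: the function names `reassign_rpw8` and `is_typical` are hypothetical, and the subgroup labels are assumed to arrive as plain strings together with a flag for the presence of an RPW8 domain.

```python
TYPICAL_NLRS = {"TNL", "CNL", "RNL"}


def reassign_rpw8(subgroup, has_rpw8):
    """Apply the RPW8 reassignment rules to an NLR subgroup label."""
    if not has_rpw8:
        return subgroup
    if subgroup == "NBS":
        return "RN"
    if subgroup == "NL":
        return "RNL"
    if subgroup in {"TNL", "CNL", "CN", "TN", "TX"}:
        return "Other"
    return subgroup  # labels not covered by the rules stay unchanged


def is_typical(subgroup):
    """Typical NLRs are TNL, CNL, and RNL; all other subgroups are atypical."""
    return subgroup in TYPICAL_NLRS


# Hypothetical genes: (pipeline subgroup, RPW8 domain present?)
genes = [("NL", True), ("NBS", True), ("CNL", True), ("TNL", False)]
final = [reassign_rpw8(s, r) for s, r in genes]
typical_flags = [is_typical(s) for s in final]

print(final)
print(typical_flags)
```

Applying `is_typical` after the reassignment matters, because a CNL or TNL that carries an additional RPW8 domain ends up in Other and is therefore counted as atypical.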

Supplemental Data
The following supplemental materials are available.
Supplemental Figure S1. Phylogeny of Brassicaceae genomes based on NLR R genes.
Supplemental Figure S2. Phylogeny of Brassicaceae genomes based on RLP and RLK R genes.
Supplemental Table S1. RGA classification and clustering analysis.