Plastid NDH Pseudogenization and Gene Loss in a Recently Derived Lineage from the Largest Hemiparasitic Plant Genus Pedicularis (Orobanchaceae)

Abstract The plastid genome (plastome) is highly conserved in both gene order and content and has a lower mutation rate than the nuclear genome. However, the plastome is more variable in heterotrophic plants. To date, most such studies have investigated just a few species or only holoheterotrophic groups, and few have examined plastome evolution in recently derived lineages at an early stage of transition from autotrophy to heterotrophy. In this study, we investigated the evolutionary dynamics of plastomes in the monophyletic and recently derived Pedicularis sect. Cyathophora (Orobanchaceae). We obtained 22 new plastomes, 13 from the six recognized species of section Cyathophora, six from hemiparasitic relatives and three from autotrophic relatives. Comparative analyses of gene content, plastome structure and selection pressure showed dramatic differences among species in section Cyathophora and in Pedicularis as a whole. In comparison with autotrophic relatives and other Pedicularis spp., we found that the inverted repeat (IR) region in section Cyathophora had expansions to the small single-copy region, with a large expansion event and two independent contraction events. Moreover, NA(D)H dehydrogenase, accD and ccsA have lost function multiple times, with the function of accD being replaced by nuclear copies of an accD-like gene in Pedicularis spp. The ccsA and ndhG genes may have evolved under selection in association with IR expansion/contraction events. This study is the first to report high plastome variation in a recently derived lineage of hemiparasitic plants and therefore provides evidence for plastome evolution in the transition from autotrophy to heterotrophy.


Introduction
Plastids are key organelles for photosynthesis in green plants. The plastid possesses its own genome in a single circular molecule. The plastid genome (or plastome) is 140-180 kb in size and has a quadripartite structure, including two inverted repeat (IR) regions separated by a large single-copy (LSC) region and a small single-copy (SSC) region (Palmer 1983, Palmer et al. 1987, Wicke et al. 2011, Ruhlman and Jansen 2014. The IR regions are highly conserved in size and gene content among angiosperms, but IR expansions and contractions (and even complete losses) have been reported in several distant lineages (Zhu et al. 2016, Mower andVickrey 2018). Some studies have demonstrated that genes in the IR region have lower substitution rates than genes in the single-copy regions, which may help to stabilize the plastome structure (Wicke et al. 2011, Zhu et al. 2016. Generally, the plastome contains about 114 unique genes, including 80 coding sequence (CDS) genes, 30 transfer RNA (tRNA) genes and four ribosomal RNA (rRNA) genes (Wicke et al. 2011). However, heterotrophic plants, including haustorial parasites and mycoheterotrophs, no longer need to photosynthesize and may have degraded plastomes resulting from gene loss and structural changes (Graham et al. 2017, Wicke andNaumann 2018).
From a comprehensive survey of land plants, Wicke and Naumann (2018) have documented that heterotrophic plants occur in at least 23 orders. According to their carbon acquisition pathways, heterotrophic plants can be classified into mycoheterotrophs, which acquire carbon from associated mycorrhizal fungi (Leake 1994), and parasites, which acquire carbon from other land plants using a specialized haustorium (Yoshida et al. 2016). To date, parasitic plants have been found in 12 orders of angiosperms. Around 95% of species belong to the Orobanchaceae (Lamiales) and Santalales, and the remaining 5% belong to Apodanthaceae (Cucurbitales), Cassytha L. (Laurales), Cuscuta L. (Solanales), Cynomoriaceae (Saxifragales), Cytinaceae (Malvales), Hydnoraceae (Piperales), Krameriaceae (Zygophyllales), Lennoaceae (Boraginales), Mitrastemonaceae (Ericales) and Rafflesiaceae (Malpighiales) (Yoshida et al. 2016). Plant parasitism is characterized by a transition from autotrophy to partial (hemiparasite) or full parasitism (holoparasite). The plastomes of parasitic plants degrade as they transition from hemiparasitism to holoparasitism; this transition is accompanied by pseudogenization or by complete loss of photosynthesis-related genes (Wicke and Naumann 2018). It has been demonstrated that plastid genes experienced relaxed functional constraints during the transition from autotrophy to heterotrophy Orobanchaceae (Wolfe et al. 1992, but the impact of the transition from autotrophy to hemiparasitism has not been well documented. Plastome degradation in parasitic plants may include structural reconfiguration, IR expansion or contraction, or complete loss. For example, plastomes of Pedicularis ishidoyana Koidz. & Ohwi, Buchnera americana L., Schwalbea americana L. and Striga spp. (S. aspera Benth., S. forbessii Benth. and S. hermonthica Benth.) have experienced IR expansion to the LSC and/or SSC regions (Wicke et al. 2013, Cho et al. 2018, Frailey et al. 2018; Cassytha has lost one IR region (Song et al. 2017) and some Santalales species have a large inversion in the LSC region . Published plastome data from distantly related lineages of hemiparasitic plants show that some plastid NA(D)H dehydrogenase-like (NDH) genes have been independently lost or reside as nonfunctional pseudogenes (see Supplementary Table S1). For example, plastomes of Cassytha spp. (Laurales) have five NDH pseudogenes (ndhB, ndhD, ndhE, ndhF and ndhH) and have completely lost the other six NDH genes (Song et al. 2017, Wu et al. 2017. Similarly, plastomes of many hemiparasitic Orobanchaceae have ndhB, ndhD, ndhE, ndhF and ndhH pseudogenes , Frailey et al. 2018, and plastomes of hemiparasitic Santalales have completely lost ndhA, ndhC, ndhG, ndhI, ndhJ and ndhK (Petersen et al. 2015). It is therefore not possible to trace the evolutionary history of functional or physical losses of plastid NDH genes by comparative analyses of distantly related hemiparasitic lineages. In contrast, gradual pseudogenization, fragmentation and loss of plastid NDH genes can be observed by comparative analyses of plastomes from the same genus or a recently derived lineage.
Pedicularis L. is the largest hemiparasitic genus of Orobanchaceae (including approximately 700 species) (Li 1951, Fischer 2004 and is well-known for its floral variation. It is widely distributed in the north temperate zone with more than twothirds of the species represented in the Hengduan Mountains of south-central China (Yang et al. 1998). The roots form haustoria through xylem connections with host roots (Ren et al. 2010, Li et al. 2012. Recent molecular phylogenetic studies show that the genus Pedicularis is monophyletic and includes 13 major clades (Ree 2005, Tkach et al. 2014, Yu et al. 2015. Pedicularis sect. Cyathophora Li (1948) is a monophyletic group in clade IV, fully supported by both morphological characters and molecular phylogenies, which diverged at around 7.14 Mya (Eaton and Ree 2013, Yu et al. 2013. This recently derived lineage is characterized by having cup-like bracts around the stem at each node, and includes all corolla types found in Pedicularis, making it an ideal group for investigation of species diversification and plastome evolution in the genus (Eaton andRee 2013, Yu et al. 2013).

Characteristics of the plastid genomes
The synonymous codon usage (RSCU) value for all species is shown in Supplementary Fig. S3. Usage was often biased towards one of several codons that encode the same amino acid over the other. For example, the AGA codon had the highest preference for encoding Arginine (R) (RSCU value: 1.7-2.0) among the six synonymous codons. The synonymous codons ending with A or U had higher RSCU values than ending with C or G, such as Tyrosine (Y), Cysteine (C) and Aspartic acid (D).

Structural variations of the IR boundary
IR regions commonly had seven complete CDS genes (rpl2, rpl23, ycf2, ycf15, ndhB, rps7, rps12), six tRNA genes (trnI-CAU, trnL-CAA, trnV-GAC, trnL-GAU, trnR-ACG, trnN-GUU), and four rRNA genes (rrn16, rrn23, rrn4.5, rrn5). For all species the boundary separating LSC and IRb was between rps19 and rpl2 and the boundary separating IRa and LSC was between between rpl2 and trnH-GUG. Boundaries separating IR and SSC regions varied dramatically, especially in section Cyathophora (Fig. 1B). Few recognize four main types of boundary between these regions. In type I, the ycf1 gene occurs on the SSC/IRb boundary, with 210-1628 bp located in the IRb region. Type I characterizes all plastomes treated here except for those in section Cyathophora. In types II-IV, genes from the SSC have been captured by the IR; all three IR types have captured ycf1. In type II plastomes, the IR region includes ndhH + ndhA + ndhF, characterizing P. connata, P. cyathophylla, P. cyathophylloides and P. rex var. rockii. In type III, the IR region extends to ndhH + ndhA alone, characterizing P. superba. In type IV, found in series Reges except P. rex var. rockii, includes ndhH with ndhA inverted in the SSC (Fig. 1B). The ancestral state of the Pedicularis is type I. A transition to type II occurred in the ancestor of section Cyathophora, with types III and IV evolving in the ancestors of clade III and clade I, respectively (Fig. 1A).

Evolutionary selection on plastid genes
Ten of 13 functional gene groups experienced relaxed selection in the hemiparasites and in Pedicularis (k < 1), although not all these were significant; cemA (hemiparasites: P > 0.1; Pedicularis: P < 0.05) and rps (both P < 0.05) experienced Fig. 2 Heatmap of the GC content by gene groups or genes in all species (A), the deviation values from the average GC content of each gene group or gene (B) and correlation analysis among the GC content and plastome/region size (C). Significance levels: ***P < 0.001; **P < 0.01; *P < 0.05. increased selection (k > 1), and ycf genes experienced neutral or nearly neutral selection (k = 0.98-1.0) ( Table 2). In section Cyathophora, five genes (cemA, psb, rps, rpo and clpP) showed increased selection, but none of these was significant (P > 0.1).

Evolutionary analyses of pseudogenization and gene loss
The 'ER' likelihood model analyses showed that the ancestral states of ndhE, ndhG and ndhI may have been pseudogenes in   Table S4). Moreover, both Dollop and the 'ER' likelihood model analyses showed that seven NDH genes (ndhA, ndhD, ndhE, ndhF, ndhG, ndhH and ndhI) lost function at the fourth node ( Fig. 3 and Supplementary  Fig. S11), and ndhJ and ndhK lost function at the fifth node (Fig. 3). Therefore, nine NDH genes had already lost function in the ancestor of section Cyathophora. Based on Dollop and 'ER' likelihood model analyses, ndhI, which had been previously lost, was regained in P. superba and P. rex var. rockii, and the truncated ndhA recovered its full length in the P. superba clade (Fig. 3). The Dollop analysis showed that the functional loss of ndhK happened at the fifth node and, then, reversed to full function in clade I of section Cyathophora (Fig. 3B), but the 'ER' likelihood model analysis showed ndhK underwent no functional reversal but rather experienced three independent functional losses at the sixth node, and in clades II and III of section Cyathophora (Supplementary Fig. S11). The Dollop analysis showed that ndhG lost function at the fourth node, followed by three reversals in series Reges except P. rex var. purpurea (Fig. 3B). However, the 'ER' likelihood model analysis showed that ndhG had no significantly functional reversals with independent fragmentation and physical loss (Fig. 3A).

Pedicularis (Supplementary
Both analyses showed that the functional loss of ccsA occurred in the basal branches of Pedicularis ( Fig. 3 and Supplementary Table S4). This was followed by a reversal to full function in the eighth node (91.4% support) of clade III and a second loss of function in P. superba 1 (Fig. 3). Both analyses showed that accD lost function at the fourth node ( Fig. 3 and Supplementary Fig. S11). Dollop analysis showed that accD regained its full length in P. przewalskii at the seventh node (Fig.  3B).

Plastid phylogenomics of section Cyathophora
Plastid phylogenomics recovered the monophyly of section Cyathophora as reported in previous studies (e.g. Eaton and Ree 2013, Yu et al. 2013. Monophyly of series Reges was also recovered. The relationship among the other three series has been controversial using nuclear and plastid datasets (Eaton and Ree 2013, Yu et al. 2013. Phylogenies based on RAD-seq, transcriptome, CRABS CLAW and nrITS dastasets supported a wellsupported clade including P. cyathophylla, P. cyathophylloides and P. superba. However, phylogenies based on LEAFY and plastid data provided poor and conflicting resolution among these species. In this study, plastid phylogenomics strongly supported a clade composed of P. cyathophylloides + P. superba and a different clade composed of P. cyathophylla + P. connata. The P. cyathophylloides + P. superba clade was the first diverging clade in section Cyathophora, as sister to the P. cyathophylla + P. connata clade and series Reges. Noteworthily, P. connata is included in phylogenetic analyses for the first time and is sister to P. cyathophylla, corresponding to series Cyathophyllae in the classification system of Li (1948).

Plastome characteristics of Pedicularis
In this study, the plastomes of hemiparasitic Pedicularis and the outgroups have the typical quadripartite architecture, but the IR/SSC boundaries show variation in the recently derived section Cyathophora. Generally, the plastome structure and gene contents are highly conserved at the genus level or in recently derived lineages. IR regions in section Cyathophora can be classified into three types. These types are associated with the species delimitation supported by plastid phylogenies (Yu et al. 2013, with the exception in series Reges of P. rex var. rockii, in which the IR region differed from other varieties of P. rex (Fig. 1). IR regions may have experienced a large expansion of the SSC region in the common ancestor of section Cyathophora, followed by two independent contractions, in P. superba and series Reges. The expansions and contractions of the IR regions in section Cyathophora might have destabilized the plastome. This, along with changes in selective pressures on the NDH and ccsA genes located at the junctions between the IR and SSC regions, may have caused more frequent parallel losses or reversals of these genes as opposed to others in the LSC and IR regions (Fig. 3).
IR expansion in section Cyathophora (Table 1, Fig. 1) increased the length of the plastome despite gene fragmentation. It is worth noting that the sizes of the IR and SSC correlate with their GC contents, possibly due to IR capture of SSC sequences. The GC content of NDH genes (except ndhC, ndhJ and ndhK) in section Cyathophora (35.3 to 36.9%) was lower than that of the IR (40.5 to 41.1%). Consequently, the GC content of the IR expanded region in section Cyathophora was significantly lower than that of both other Pedicularis and non-Pedicularis. IR expansion may have altered the structural conservatism of the plastome (Sabir et al. 2014, Zhu et al. 2016. The size of the LSC was negatively correlated with its own GC content (Fig. 2). Due to the expansion of the IR and reduction of GC content in section Cyathophora, whole plastome size is negatively correlated with the total GC content. Therefore, the total GC content of the section Cyathophora was lower than that of other Pedicularis. Interestingly, the average GC content of Pedicularis (38.2-38.6%) and Phtheirospermum japonicum (38.3%) were higher than those of three closely related autotrophs (37.7-38.0%) ( Table 1), suggesting that the GC content might be increased or not significantly decreased in the fully photosynthetic hemiparasitic plants in an early stage of the plastome degradation. Some studies have demonstrated that the shift from non-parasite to parasite is accompanied by reduction in plastome GC content, especially in holoheterotrophic plants Naumann 2018, Su et al. 2019).

Pseudogenization and loss of NA(D)H dehydrogenase-like genes in Pedicularis
The 11 NDH genes were pseudogenized or completely lost in the plastomes of Pedicularis (Fig. 3), as has been observed in other members of Orobanchaceae (e.g. Wicke et al. 2013, Frailey et al. 2018. Based on the comprehensive phylogeny of Pedicularis (Yu et al. 2015), the two species in the basal clades, P. kangdingensis and P. tongolensis, have functional NDH genes, as well as the sister lineage Phtheirospermum japonicum. Ancestral state reconstruction indicated that the common ancestor of Pedicularis had functional NDH genes, with subsequent pseudogenization and loss (Table 3). This evolutionary pattern indicated that the degradation of the NDH genes in Pedicularis was a gradual process from functional to non-functional. Moreover, gene loss and pseudogenization of NDH genes likely occurred independently across the 13 major clades in Pedicularis (Zhang et al. 2020), which is similar to the pattern of evolution proposed for vegetative and corolla characters (Ree 2005, Yu et al. 2015. In photosystem II, the NDH proteins act by adjusting the redox level of the cyclic electron transport machineries, allowing the fine tuning of photosynthesis (Peltier et al. 2016, Shikanai 2016, Laughlin et al. 2019. The mechanism of NDH gene loss remains unclear, and published studies have suggested that some NDH genes might be dispensable or their lost function compensated by other factors (Suorsa et al. 2012, Strand et al. 2019). Functional and/or physical loss of NDH genes occurs in some non-parasitic plants, including gymnosperms (Braukmann et al. 2009), Geraniaceae (Blazier et al. 2011) and Lentibulariaceae (Wicke et al. 2014), besides heterotrophic plants (Graham et al. 2017, Wicke andNaumann 2018). Thus, the loss of NDH genes might be a random phenomenon in photosynthetic lineages, though common in the heterotrophic plants (Wicke and Naumann 2018). In mycoheterotrophic orchid lineages (including leafy and photosynthetic orchids), NDH gene loss may have not been so disadvantageous for the lineages that live in low-light canopy habitats as epiphytes, or in dark, understory habitats (Barrett et al. 2014, Feng et al. 2016. In parasitic plants, loss of NDH genes may be associated with increasing dependence on host-derived carbon and decreasing dependence upon photosynthetic carbon Naumann 2018). For example, Bao (2020) found that the net photosynthetic rate of Pedicularis kansuensis Maxim decreased significantly in plants grown on hosts rather than those without hosts.

Pseudogenization of accD and ccsA genes
The plastid accD gene encodes a beta subunit of Acetyl-CoA carboxylase (ACCase), which is a key enzyme for fatty acid biosynthesis and is crucial for leaf development (Kode et al. 2005, Bock 2007). Functional loss of accD has occurred in autotrophic angiosperms (Sasaki and Nagano 2004, Rousseau-Gueutin et al. 2013, Ruhlman and Jansen 2014 and gymnosperms (Sudianto and Chaw 2019), as well as Volvocales (Smith and Lee 2014), in which the function might be taken over by the nuclear copy (Sasaki andNagano 2004, Wicke et al. 2011). The accD gene maintains intact reading frames in most heterotrophic plants (Wicke and Naumann 2018). In this study, the accD is highly divergent in Pedicularis, with long insertions/deletions and high rates of substitution. Function of accD may be taken over by nuclear copies (Sasaki and Nagano 2004, Rousseau-Gueutin et al. 2013, Sudianto and Chaw 2019, because nuclear copies of an accD-like gene were isolated from transcriptome data of Pedicularis (Supplementary Fig. S12).
The ccsA gene encodes a cytochrome C biogenesis protein that mediates the attachment of heme to c-type cytochromes (Bock 2007, Ruhlman andJansen 2014). This gene is localized in the plastid SSC region and functional in most photosynthetic plants (Wicke et al. 2011). However, it is lost or pseudogenized in most heterotrophic plants (Graham et al. 2017, Wicke andNaumann 2018). In Pedicularis, the ccsA gene is not truncated but has experienced frameshift in most species resulting in a premature stop codon ( Supplementary  Fig. S5, S6). Exceptions include members of clade III of section Cyathophora which have functional ccsA genes (Fig. 3B). Two of three individuals of P. superba had functional ccsA. This polymorphism might be recovered in other species of Pedicularis if more individuals are genotyped in the future. Unexpectedly, evolutionary reconstruction of the ancestral state suggests that there may have been one or three reversals from nonfunctional to functional in P. cyathophylloides and P. superba (Fig. 3). The RELAX (Detecting Relaxed Selection) analysis indicated that the ccsA gene experienced increased selection in the hemiparasites, Pedicularis, and section Cyathophora, which suggests that ccsA gene function is necessary in these lineages.

Evolutionary analyses of plastid genes
Relaxed selective pressure on plastid genes is one important driver of plastome evolution in heterotrophic plants . The RELAX selection analyses detected relaxed selection in 11 functional group genes ( Table 2) and up to nine NDH genes in hemiparasitic Pedicularis (Table 3), which might have caused variation in the accD and NDH genes in Pedicularis, even within a species. We detected significant relaxation of selection on fully functional photosynthesis and housekeeping genes in Pedicularis. Relaxation of selective pressure on photosynthesis and housekeeping genes has also been found in other heterotrophic plants, including mistletoes , Cuscuta (Banerjee and Stefanovic 2019), Orobanchaceae , Frailey et al. 2018 and Orchidaceae (Barrett et al. 2014, Feng et al. 2016, which also show variation in accD, ccsA and NDH genes. However, this study is the first to report similar variation in accD, ccsA and NDH genes in a recently derived lineage, section Cyathophora, which diverged only 7.14 Mya . As discussed above, the full functions of the accD, ccsA and NDH genes in the plastome are not clear, but these genes normally play critical roles in photosynthesis in green plants. The relaxed selection on these genes is an important indicator for plants transition from non-parasite to parasite . The question remains, however, how hemiparasitic plants carry on photosynthesis without functional plastid ccsA and NDH genes. Of course, hemiparasites can use haustoria to steal nutrients from the host plants to compensate the loss of the function of these genes , Yoshida et al. 2016, Wicke and Naumann 2018 and perhaps survive as well with less efficient photosynthetic apparatus.
It is interesting that some photosynthesis-related genes showed increased selection in the RELAX selection tests. Of these, cemA, ccsA and ndhG showed consistent intensification in the three tests, while others experienced increased selection pressure in section Cyathophora. The functional ccsA gene in section Cyathophora is presumably associated with the intensification of selection detected in these genes.

Plant materials, DNA extraction and sequencing
In total, 22 new plastomes were de novo assembled and analyzed in this study (Supplementary Table S2). Of these, 13 samples represent the six recognized species of Pedicularis sect. Cyathophora (including one individual each of P. connata and P. cyathophylla, two of P. thamnophila and three of P. cyathophylloides, P. rex and P. superba), and five represent selected Pedicularis species, including two species (P. insignis and P. przewalskii) sister to section Cyathophora, and P. kangdingensis, P. lyrata and P. tongolensis, representing three recognized clades of Pedicularis (Yu et al. 2015). The hemiparasitic Phtheirospermum japonicum and the autotrophic Lindenbergia muraria (Orobanchaceae), Rehmannia glutinosa (Orobanchaceae) and Paulownia tomentosa (Paulowniaceae) were sampled as outgroups. Voucher specimens of 20 newly sequenced taxa were deposited in the herbarium of Kunming Institute of Botany, Chinese Academy of Science (KUN). The raw reads of Phtheirospermum japonicum (DRR082673) (Kado and Innan 2018) and Paulownia tomentosa (SRR6940033) (Mint Evolutionary Genomics Consortium 2018) were downloaded from the Sequence Read Archive (SRA) data of NCBI.
Total genomic DNA was extracted from silica gel-dried leaf tissues or nitrogen-frozen young buds using a modified CTAB method. Purified DNA was fragmented to approximately 500 bp in size for library construction following standard protocols (NEBNext® Ultra II™DNA Library Prep Kit for Illumina®). The 150 bp pair-end reads were generated using the Illumina Hi-Seq 2500.

Plastome assembly and annotation
Circular plastomes were de novo assembled using the GetOrganelle toolkit (Jin et al. 2020). The final circular plastid graphs were also checked using Bandage (Wick et al. 2015). The quality of the plastome assemblies was assessed by reads mapping using the GetOrganelle toolkit. Plastomes of four non-Pedicularis species and P. tongolensis were automatically annotated using CPGAVAS2 (Shi et al. 2019) and, then, manually adjusted in Geneious (Biomatters, Auckland, New Zealand) using Nicotiana tabacum L. (accession number: Z00044) as the reference. The remaining plastomes of Pedicularis were manually annotated by Geneious using P. tongolensis as the reference. Fragmented genes were checked and confirmed using BLAST. The plastome maps were drawn by Organellar Genome DRAW tool (Greiner et al. 2019).

Statistical analysis of plastomes
We analyzed the relative synonymous codon usage (RSCU) value using MEGA-X (Kumar et al. 2018) and then drew the heatmap in the R statistical environment. RSCU > 1: codons used more frequently than expected; RSCU = 1: codons used as frequently as expected; RSCU < 1: codons used less frequently than expected. We used Spearman's rank correlation (Best and Roberts 1975) for correlation analyses and the Wilcoxon test (Bauer 1972) for difference comparisons among genomic characters (i.e. GC content, genome size, etc.) in the R statistical environment.

Phylogenomic analysis
The whole sequence of 22 plastomes with one IR region was aligned using MAFFT (Katoh and Standley 2013) using the default options and then manually adjusted in Geneious. The aligned matrix was used to reconstruct the phylogeny using the maximum likelihood (ML) method. ML analysis was conducted using RAxML (Stamatakis et al. 2008), with GTR + GAMMA + I model to find the best-scoring ML tree. One thousand bootstrap replicates were performed to obtain clade support. ML analysis was performed at the CIPRES Science Gateway (http://www.phylo.org). The tree was viewed and edited with FigTree (http://tree.bio.ed.ac.uk/software/figtree/).

Testing selection of plastid genes
First, we used MACSE v2 (Ranwez et al. 2011) to align CDS regions with pseudogenes. Then, we aligned functional CDS genes in Geneious using the Translation Align option. The alignments for each CDS gene and 13 functional gene groups (see Table 2) were used to test for potential relaxed selection on the Datamonkey Adaptive Evolution Server (https:// www.datamonkey.org/) with the HyPhy software package (Pond and Muse 2005) using the hypothesis testing framework RELAX (Wertheim et al. 2014). We examined three test groups, i.e. hemiparasites (vs. non-parasites), Pedicularis (vs. non-Pedicularis) and Pedicularis sect. Cyathophora (vs. others).
RELAX calculates the ratio of nonsynonymous (dN) to synonymous (dS) substitutions. Given two subsets of branches in a phylogeny, RELAX can determine whether selective strength was relaxed or intensified in one of these subsets relative to the other. Rate changes are summarized using a relaxation coefficient (k), where k < 1 indicates the relaxation of selection and k > 1 indicates the intensification of selection.

Evolutionary analyses of IR structure, pseudogenization and gene loss
Because the IR regions showed dramatic variations in Pedicularis, especially in Cyathophora, four types of IR structure were recognized (see the 'Results' section). To infer the evolutionary history of the four types, ancestral states were reconstructed based on the plastome ML tree using the 'phytools' (phylogenetic tools for comparative biology-and other things) package (Revell 2012) in the R statistical environment with the 'ER' likelihood model (Schluter et al. 1997).
The plastome tree was also used to infer ancestral states of the 11 NDH genes, accD and ccsA, with functional and physical losses inferred using the 'phytools' package (Revell 2012) and the Dollop program of PHYLIP (http://evolution.genetics.washington.edu/phylip.html), respectively. The four states for each gene specified were functional (0, an ancestral state), full sequence with premature stop codon (1), truncated gene (2) and complete loss (3). First, the ancestral state for each gene was estimated by Brownian evolution using likelihood with the 'ER' model (Schluter et al. 1997). We then used the Dollo parsimony method to infer the evolutionary history of the 13 genes. The ancestral state (0) was known in each gene. Once state 1 was reached, the reoccurrence of state 0 is very improbable, much less probable than multiple retentions of polymorphism. For the Dollop parsimony analyses, we specified a transition matrix allowing changes from functional (0) → nonfunctional (1,2,3), complete (0,1) →