Taxonomic Sampling and Rare Genomic Changes Overcome Long-Branch Attraction in the Phylogenetic Placement of Pseudoscorpions

Abstract Long-branch attraction is a systematic artifact that results in erroneous groupings of fast-evolving taxa. The combination of short, deep internodes in tandem with long-branch attraction artifacts has produced empirically intractable parts of the Tree of Life. One such group is the arthropod subphylum Chelicerata, whose backbone phylogeny has remained unstable despite improvements in phylogenetic methods and genome-scale data sets. Pseudoscorpion placement is particularly variable across data sets and analytical frameworks, with this group either clustering with other long-branch orders or with Arachnopulmonata (scorpions and tetrapulmonates). To surmount long-branch attraction, we investigated the effect of taxonomic sampling via sequential deletion of basally branching pseudoscorpion superfamilies, as well as varying gene occupancy thresholds in supermatrices. We show that concatenated supermatrices and coalescent-based summary species tree approaches support a sister group relationship of pseudoscorpions and scorpions, when more of the basally branching taxa are sampled. Matrix completeness had demonstrably less influence on tree topology. As an external arbiter of phylogenetic placement, we leveraged the recent discovery of an ancient genome duplication in the common ancestor of Arachnopulmonata as a litmus test for competing hypotheses of pseudoscorpion relationships. We generated a high-quality developmental transcriptome and the first genome for pseudoscorpions to assess the incidence of arachnopulmonate-specific duplications (e.g., homeobox genes and miRNAs). Our results support the inclusion of pseudoscorpions in Arachnopulmonata (new definition), as the sister group of scorpions. Panscorpiones (new name) is proposed for the clade uniting Scorpiones and Pseudoscorpiones.


Introduction
The advent of current generation sequencing technologies has greatly benefitted the practice of molecular systematics. However, certain recalcitrant nodes in the Tree of Life remain staunchly unresolved despite the quantity of sequence data deployed to address phylogenetic relationships. Among the most intractable empirical problems in phylogenetics are nodes characterized by the combination of (1) ancient and rapid diversification and (2) accelerated evolution of multiple ingroup lineages, exacerbating long-branch attraction artifacts (Bergsten 2005;Rokas and Carroll 2006;King and Rokas 2017). The combination of these characteristics is difficult to overcome even with genome-scale data sets, due to homoplasy accrued over millions of years of evolutionary history, conflicting evolutionary signals in data partitions, systematic bias, and the lack of external arbiters to evaluate appropriateness of substitution and rate heterogeneity models. Within animals, examples of such problematic nodes include the base of Metazoa, Bilateria, the superclades Lophotrochozoa and Ecdysozoa, and internal relationships of many diverse phyla (Borner et al. 2014;Kocot et al. 2016;Feuda et al. 2017;Simion et al. 2017;Laumer et al. 2019;Marl etaz et al. 2019).
The basal phylogeny of the arthropod subphylum Chelicerata remains particularly recalcitrant to resolution despite the application of genome-scale phylogenomic data sets (Sharma, Kaluziak, et al. 2014;Lozano-Fern andez et al. 2019). Initial diversification of this group and the crown age of many orders dates to the early Paleozoic (Lozano . Within chelicerates, at least three orders exhibit the characteristics of long-branch taxa (Acariformes, Parasitiformes, and Pseudoscorpiones), with Solifugae and Palpigradi also prone to unstable placement, as inferred from taxon deletion experiments and assessments of topological stability (Sharma, Kaluziak, et al. 2014;). Moreover, extinction has asymmetrically affected different branches in the chelicerate tree, resulting in both relictual orders such as horseshoe crabs, as well as several extinct orders. As a result, basic questions about the evolutionary history of Chelicerata remain controversial, namely, the monophyly of Arachnida (the terrestrial chelicerates; Lozano-Fern andez et al. 2019). Even in data sets that support arachnid monophyly, relationships between chelicerate orders are highly unstable from one data set to the next, with the exception of the basal split between Pycnogonida (sea spiders) and the remaining chelicerates, Tetrapulmonata (a group of arachnid orders that bear four book lungs; Pepato et al. 2010;Regier et al. 2010;Sharma, Kaluziak, et al. 2014;Lozano-Fern andez et al. 2019;Howard et al. 2020), and the robust recovery of Arachnopulmonata (Scorpiones þ Tetrapulmonata; Sharma, Kaluziak, et al. 2014;Lozano-Fern andez et al. 2019;Howard et al. 2020).
In addition to these phylogenomic analyses, Arachnopulmonata is also supported by analyses of genome architecture, as both spiders and scorpions share partial or whole-genome duplication (WGD). This inference is evidenced by retention of duplicated copies of numerous developmental patterning genes and microRNAs, to the exclusion of groups like Opiliones (harvestmen) and Acari (Schwager et al. 2007;Sharma, Schwager, et al. 2014;Leite et al. 2016;Sharma, Santiago, et al. 2015;Schwager et al. 2017;Leite et al. 2018). Moreover, exploratory analyses of gene trees and embryonic gene expression patterns in spiders, scorpions, and harvestmen have shown that the duplicated copies of arachnopulmonate leg-patterning genes also retain expression domains that reflect the evolutionary history of shared WGD Nolan et al. 2020). The systemic duplication of developmental patterning genes and gene expression patterns together constitute a highly complex character that unites Arachnopulmonata (Leite et al. 2016;Nolan et al. 2020), but the putative incidence of this phenomenon has not been assessed in many chelicerate orders, most of which lack genomic and functional genetic resources (Garb et al. 2018).
One potential solution to overcome long-branch attraction includes the expansion of taxonomic sampling, which serves to "break" long branches and improve the estimation of parameters of substitution models. Although recent efforts have targeted improving taxonomic representation of the acarine orders in phylogenetic data sets (Acariformes and Parasitiformes; Arribas et al. 2019;Charrier et al. 2019), only recently has phylogenomic sampling of Pseudoscorpiones successfully sampled all major extant lineages (Benavides et al. 2019). Intriguingly, in phylogenetic studies that have broadly sampled pseudoscorpions and scorpions, pseudoscorpions are frequently recovered as either sister group to Arachnopulmonata (Sharma, Fern andez, et al. 2015) or as sister group to scorpions Benavides et al. 2019), although these works lacked complete representation of all chelicerate orders ( fig. 1). In works assessing chelicerate phylogeny broadly, pseudoscorpion placement has proven unstable or unsupported, either clustering with the Acari or with arachnopulmonates (Sharma, Kaluziak, et al. 2014;Arribas et 1). In these works, taxonomic representation of Pseudoscorpiones has nevertheless been limited, often to a subset of derived lineages.
To evaluate these competing hypotheses for pseudoscorpion placement in the chelicerate tree of life, we established a phylogenomic data set of Chelicerata broadly sampling all major lineages of Pseudoscorpiones. We assessed the effect of an incomplete taxonomic sampling by sequentially pruning basally branching lineages of pseudoscorpions and gauged the effect on the inferred tree topology using different analytical approaches to phylogenetic reconstruction. Furthermore, we reasoned that if Pseudoscorpiones is nested within Arachnopulmonata, then they should share the systemic duplications of developmental patterning genes previously demonstrated for scorpions and spiders (Leite et al. 2016(Leite et al. , 2018. The advantage of WGDs as phylogenetic characters is that even if an affected lineage exhibits broad scale loss of the resulting ohnologs (the duplicate gene copies resulting from WGD) over time, the signature of this event can be discerned using patterns of synteny across genomes as well via the ensuing gene trees (i.e., a retained single-copy homolog of an originally ohnologous pair is still expected to cluster with their orthologs of other taxa that share the genome duplication). WGD events shared across an array of taxa can be further discerned from lineage-specific duplications, using gene tree topologies (i.e., ohnologs clustering across shared WGD events vs. in-paralogs clustering by lineage). Such dynamics have been especially well studied at the base of the vertebrates (Dehal and Boore 2005;Putnam et al. 2008;Simakov et al. 2020).
Here we show that expanded taxonomic sampling of pseudoscorpions, systemic homeobox gene duplications, tree topologies of benchmarked ohnologs of developmental patterning genes, and duplications of miRNAs, all support the hypothesis that pseudoscorpions are nested within Arachnopulmonata as the sister group of scorpions. Empirical Solutions to Long Branch Attraction . doi:10.1093/molbev/msab038

Phylogenomics with Partitioned Models
To assess matrix completeness and denser taxonomic sampling as explanatory processes for the unstable phylogenetic placement of pseudoscorpions, we assembled a data set of 132 Panarthropoda, including 40 pseudoscorpion libraries previously generated by Benavides et al. (2019), which represent all pseudoscorpion superfamilies ( fig. 2a; supplementary table S1, Supplementary Material online). Orthologs analyzed in this study consisted of the Benchmarked Universal Single Copy Orthologs of Arthropoda (BUSCO-Ar) derived from OrthoDB v.9.1 (Simão et al. 2015;Zdobnov et al. 2017;Waterhouse et al. 2018). Each library was analyzed with the OrthoDB pipeline to identify available homologs of 1,066 arthropod-specific BUSCO genes. Duplicated BUSCOs were discarded to retain only validated, single-copy loci. We constructed six matrices ranging in gene occupancy thresholds of 80% (248 BUSCO loci) to 55% (1002 BUSCO loci); we denote these as G 1 to G 6 , in order of increasing matrix length ( fig. 2b). For each of these six matrices, we additionally pruned basally branches lineages within Pseudoscorpiones, with reference to Cheliferoidea. This superfamily was selected as the distal-most taxon, because it was represented by the most exemplars of any pseudoscorpion superfamily (12 transcriptomes), ensuring that the order would be well represented across all supermatrices, despite the pruning of other lineages (supplementary table S2, Supplementary Material online). For each matrix, we performed maximum likelihood (ML) searches and assessed phylogenetic placement of pseudoscorpions as sister group to scorpions, sister group to Arachnopulmonata (sensu Sharma, Kaluziak, et al. 2014) or sister group to one or both of the long-branch acarine orders (Acariformes and Parasitiformes). Six branches were sequentially pruned; we denote these data sets as T -1 to T -6 , in order of increasing branch pruning ( fig. 2a and b).
Matrices retaining all superfamilies of pseudoscorpions (i.e., unpruned data sets) consistently recovered the relationship Pseudoscorpiones þ Scorpiones, regardless of matrix completeness. ML tree topologies of pruned taxon subsets T -1 and T -2 similarly recovered the relationship  Inversely, matrices exhibiting pruning of the three most basally branching pseudoscorpion lineages (Chthonioidea, Feaelloidea, and Neobisioidea) recovered ML tree topologies wherein pseudoscorpions were sister group to either Parasitiformes or Acariformes, regardless of matrix completeness (T -3 matrices). Further pruning of basally branching pseudoscorpions generally also incurred this tree topology (T -4 to T -6 matrices), with the exception of matrix G 1 • T -3 ( fig. 2b). No matrix recovered the relationship of pseudoscorpions as sister group to Arachnopulmonata (sensu Sharma, Kaluziak, et al. 2014).
Relationships among other chelicerate taxa largely reflected the outcomes of previous works  and are not discussed in detail here ( fig. 2d). Notably, we never recovered the monophyly of Acari or Arachnida.

Phylogenomics with Site Heterogeneous Models
Partitioned model ML analyses have sometimes been criticized as less accurate than site heterogeneous models, although these inferences have often been grounded in assumptions of true relationships based on traditional phylogenetic hypotheses (e.g., Wang et al. 2019). Simulations have previously shown that CATþGTR and partitioned ML analyses are comparably accurate, with both of these outperforming CAT-F81 (sometimes referred to as CAT-Poisson) with respect to topological accuracy (Whelan and Halanych 2016). However, CATþGTR models are notoriously difficult to implement in a Bayesian framework, due to excessive computational times for real data sets (i.e., >100 taxa, >500 genes), and numerous published analyses using PhyloBayes-mpi have exhibited failure to converge (defined as ESS >200; maxdiff <0.10), especially for chelicerate phylogeny (Sharma, Kaluziak, et al. 2014;Lozano-Fern andez et al. 2019;Howard et al. 2020). As a workaround, we assessed the performance of the posterior mean site frequency (PMSF) model LG þ C20 þ F þ C, a mixture model alternative to the CAT Empirical Solutions to Long Branch Attraction . doi:10.1093/molbev/msab038 MBE implementation. This model was implemented for the G 1 and G 6 family of matrices, which constitute the densest and the largest matrices we analyzed, respectively (248 and 1002 genes, respectively). For the G 6 matrices, the pattern of tree topologies recovered reflected the same outcome as the partitioned model analyses, with T -3 to T -6 matrices recovering Pseudoscorpiones as clustering with one of the acarine orders, and T 0 to T -2 matrices recovered Pseudoscorpiones þ Scorpiones ( fig. 2c).
Notably, all the G 1 matrices analyzed using the LG þ C20 þ F þ C recovered Pseudoscorpiones þ Parasitiformes with support (BS ¼ 94-100%). To assess the impact of a more parameter-rich site heterogenous model on phylogenomic inference, we repeated the analyses of the G 1 matrices under the LG þ C60 þ F þ C model. Despite the use of a model with additional rate categories, these analyses also uniformly recovered the relationship Pseudoscorpiones þ Parasitiformes with high support (BS ¼ 95-100%).
Analyses using site heterogeneous models never recovered the monophyly of Arachnida or Acari.

Nodal Support Dynamics
Ultrafast bootstrap resampling frequencies were used to estimate support for competing hypotheses for the phylogenetic placement of Pseudoscorpiones, across the 42 concatenated matrices analyzed with partitioned modelfitting ( fig. 3a). Across all levels of matrix completeness, support for Pseudoscorpiones þ Scorpiones was negligible (<10%) for T -3 to T -6 matrices, but increased dramatically upon including Neobisioidea (T -2 matrices). Increase in nodal support for Pseudoscorpiones þ Scorpiones was not monotonic, as sampling of Feaelloidea and Chthonioidea resulted in some variability in bootstrap frequency ( fig. 3a). The nodal support trajectories were identical for the hypotheses Pseudoscorpiones þ Scorpiones and Pseudoscorpiones þ Arachnopulmonata. This result reflects in part the nestedness of the two hypotheses (i.e., Scorpiones is nested within Arachnopulmonata).
By contrast, support for Pseudoscorpiones as the sister group of either Parasitiformes or Acariformes showed the opposite trend, with better representation of basally branching pseudoscorpion groups resulting in lower nodal support for pseudoscorpions clustering with either of these groups. For taxon subsets with the least representation of basally branching pseudoscorpions (T -4 to T -6 matrices), the most complete matrices recovered high support values for Pseudoscorpiones þ Parasitiformes, whereas matrices with intermediate gene occupancy thresholds (60-70%, or G 3 to G 5 matrices) recovered high support values for Pseudoscorpiones þ Acariformes.

Gene Trees, DGLS, and Species Tree Reconstruction
Approaches to inferring species tree using gene trees have been shown to be powerful predictors of phylogenetic accuracy, but these methods are predicated on the accuracy of the underlying gene tree set. To assess whether improving taxonomic sampling of a long-branch taxon also affects phylogenetic signal at the level of gene trees, we calculated gene-wise log-likelihood scores (DGLS) on gene trees corresponding to each of the 42 matrices. DGLS assesses the likelihood of each gene given two competing tree topologies, across all genes in a data set (Shen et al. 2017). We generated DGLS distributions for the two competing hypotheses of pseudoscorpion placement (clustering with scorpions vs. clustering with either acarine order).
We observed minimal effects of taxon pruning in the largest matrices (G 5 and G 6 ), and no consistent trends in the distribution of genes favoring either competing hypothesis, across the DGLS distributions of 42 analyses ( fig. 3b). Magnitudes of log likelihood favoring either hypothesis were also not consistently affected (supplementary fig. S1, Supplementary Material online). These results suggest that increasing taxonomic sampling of a long-branch lineage does not greatly alter the distribution of phylogenetic signal at the level of individual gene trees.
Although gene and site concordance factors were trialed (Minh et al. 2020), these were invariably low for all competing placements of pseudoscorpions, as well as interordinal relationships, reflecting well-known conflicting signal in basal chelicerate phylogeny (Sharma, Kaluziak, et al. 2014;. To assess whether the intransigence of DGLS distributions to taxonomic sampling has downstream effects on methods of phylogenetic reconstruction, especially those that use the multispecies coalescent model, we reconstructed species trees from gene trees using ASTRAL v.5.14.2 (Zhang et al. 2018). We discovered no clear difference between the performance of ASTRAL versus concatenation-based approaches, with respect to the tree topology recovered as a function of the number of basal branches pruned ( fig. 4). Generally, T 0 to T -2 matrices recovered the relationship Pseudoscorpiones þ Scorpiones, whereas T -3 to T -6 matrices again recovered Pseudoscorpiones as clustering with the acarine orders. The exceptions were matrices G 2 • T -1 and G 4 • T 0 , which recovered pseudoscorpions as the sister group of Parasitiformes or in a grade at the base of Chelicerata, respectively. ASTRAL analyses never recovered the monophyly of Arachnida or Acari.

Filtering by Evolutionary Rate
It has been previously shown that support for some chelicerate relationships is strongly affected by evolutionary rate. As examples, support for Pseudoscorpiones þ Scorpiones and Arachnida was initially shown to be restricted to slowevolving genes by Sharma, Kaluziak, et al. (2014), a result partly corroborated by Howard et al. (2020), but reproduced with variable success in other analyses . To dissect the interaction of taxon sampling and evolutionary rate, we partitioned the G 2 and G 3 families of matrices into tertiles based on mean pairwise sequence identity (MPSI) of loci. The G 2 and G 3 matrices were selected for a tradeoff between high number of loci per tertile and low quantity of missing data.
For G . 5a). Pseudoscorpiones were recovered as the sister group of Arachnopulmonata (sensu Sharma, Kaluziak, et al. 2014) by the intermediate rate tertiles of the T 0 and T -1 matrices. All other analyses of G 2 matrices recovered pseudoscorpions as sister group to an acarine order or in an unresolved position. For G 3 , Pseudoscorpiones þ Scorpiones was recovered for only the slow-evolving tertile of T 0 and T -2 matrices, with all other analyses recovering pseudoscorpions as the sister group to an acarine order or in an unresolved position.
ASTRAL analyses never recovered Pseudoscorpiones þ Scorpiones; pseudoscorpions were recovered as the sister group of Arachnopulmonata (sensu Sharma   Empirical Solutions to Long Branch Attraction . doi:10.1093/molbev/msab038 MBE G 2 data sets ( fig. 5b). Taken together, these analyses suggest that slow-evolving genes alone cannot resolve long-branch taxa consistently in the absence of dense taxonomic sampling.
Analyses of data sets filtered for evolutionary rate never recovered the monophyly of Arachnida or Acari.

Duplications of Homeobox Genes
As an external arbiter of the two competing hypotheses of pseudoscorpion relationships, we generated a developmental transcriptome of the West Australian chernetid Conicochernes crassus. Homeobox gene surveys of developmental transcriptomes and/or genomes have previously been

Scorpiones + Tetrapulmonata
Other (basal grade)  MBE shown to be faithful readouts of WGD in Chelicerata. WGDs are inferred to have occurred in the common ancestor of Arachnopulmonata (one event) and independently in the Xiphosura (2-fold or 3-fold WGD); groups like mites, ticks, and harvestmen do not exhibit these shared duplications (Sharma, Schwager, et al. 2014;Sharma, Santiago, et al. 2015;Kenny et al. 2016;Leite et al. 2016;Schwager et al. 2017;Leite et al. 2018;Shingate et al. 2020). A previous comprehensive analysis of homeobox genes by Leite et al. (2018) showed that the retention of duplicates is systemic in two arachnopulmonate lineages (spiders and scorpions), an inference subsequently supported by the first whip spider developmental transcriptomes ) and by embryonic gene expression data Nolan et al. 2020). However, this survey of homeobox duplications omitted key groups, such as Xiphosura and Parasitiformes (Leite et al. 2018). Curiously, Leite et al. (2018) had indeed sampled two pseudoscorpion species, but recovered few homeobox genes for these taxa, likely owing to the sampling of postembryonic stages rather than embryos; in scorpions, developmental transcriptomes have been shown to recover far more duplicated homeobox genes than adult transcriptomes (Sharma, Schwager, et al. 2014;Sharma, Santiago, et al. 2015). We therefore assembled a data set of 26 Panarthropoda, sampling genomes or developmental transcriptomes of all three major lineages of Arachnopulmonata sensu Sharma, Kaluziak, et al. (2014) (i.e., spiders, scorpions, and Pedipalpi [Amblypygi þ Uropygi þ Schizomida]), as well as mites, ticks, harvestmen, horseshoe crabs, and sea spiders. This data set leveraged recent developmental genetic resources generated by us for several non-model chelicerate groups, such as mygalomorph spiders, whip spiders, harvestmen, and sea spiders (Sharma et al. 2012;Setton et al. 2019;Ballesteros et al. 2020;. We included in our analysis two adult transcriptomes of pseudoscorpions previously analyzed by Leite et al. (2018), which had been shown to harbor few homeobox genes and exhibited short contigs for many homeobox homologs. Outgroup data sets consisted of an onychophoran embryonic transcriptome and genomes of Mandibulata.
In contrast to the previous analyses of adult pseudoscorpion transcriptomes (Hesperochernes sp. and Neobisium carcinoides in Leite et al. 2018), our analysis of the first pseudoscorpion developmental transcriptome recovered homologs of 56 homeobox genes in C. crassus ( fig. 6). Of these, 26 exhibited duplications in at least one of the three pseudoscorpion exemplars that were also found in at least one scorpion or one tetrapulmonate, with clear evidence of paralogy (i.e., overlapping peptide sequences exceeding 100 amino acids in length that exhibited multiple substitutions between duplicate pairs).
All ten Hox genes ancestral to Panarthropoda are known to be duplicated in scorpions and spiders, with embryonic expression patterns reflecting the shared duplication (Schwager et al. 2007;Sharma, Schwager, et al. 2014;Schwager et al. 2017). Recent work has shown that the common ancestor of Amblypygi (whip spiders) likely also exhibited two copies of each Hox gene . However, the previous homeobox survey of adult pseudoscorpion transcriptomes had only recovered five of the ten Hox genes, with none of these duplicated (Leite et al. 2018). By contrast, we discovered eight of the ten Hox homologs in the developmental transcriptome of C. crassus (all but Hox3 and Sex combs reduced). Of these eight, five exhibited duplications: labial, Deformed, fushi tarazu, Antennapedia, and abdominal-A.
By comparison to pseudoscorpions, we did not detect systemic duplications of homeobox genes (i.e., suggestive of shared WGD with arachnopulmonates) in Acariformes, Parasitiformes, Opiliones, or Pycnogonida. As a key example, among these groups of arachnids, duplicates of only two Hox genes were detected in the genome of the mite Tetranychus urticae (with these being tandem duplicates on a single Hox cluster; figure 4a of Grbi c et al. 2011). By contrast, tetrapulmonate exemplars new to this analysis (the mygalomorph Aphonopelma hentzi; the three Amblypygi species) exhibited the expected trend of retention of homeobox duplicates. Taken together, this survey of homeobox genes suggests that pseudoscorpions were included in the shared WGD at the base of Arachnopulmonata.

Gene Tree Analysis of Benchmarked Embryonic Patterning Genes
Whereas embryonic expression data are abundant for spiders, and principally for the model system Parasteatoda tepidariorum, they are comparatively few for non-spider chelicerate groups (e.g., Blackburn et al. 2008;Jager et al. 2006;Grbic et al. 2007;Sharma et al. 2012;Sharma, Schwager, et al. 2014;Sharma, Tarazona, et al. 2015;Barnett and Thomas 2013;. In the recent comparative work, it was shown that four appendage patterning genes known to be duplicated in spiders and scorpions exhibited shared expression patterns that reflected the history of the species tree (i.e., ohnologs of P. tepidariorum and the scorpion C. sculpturatus exhibited shared, unique expression patterns, by comparison to the expression domains of their paralogs or of single-copy homologs of outgroups like harvestmen, mites, and mandibulates) (Nolan et al. 2020). These four genes (dachshund, homothorax, extradenticle, and optomotor blind) constitute benchmarked cases of arachnopulmonate ohnologs that have been validated via gene expression surveys, with additional and recent corroboration of this pattern in two of the four genes in the whip spider P. marginemaculatus

MBE
We therefore investigated whether duplicates of these four genes also occurred in the developmental transcriptome of C. crassus. To the surveys previously generated by Nolan et al. (2020), we searched for and added homologs of these genes from developmental transcriptomes of the pseudoscorpion, the whip spider species Phrynus marginemaculatus (Gainett and Sharma 2020), five sea spider species (Setton and Sharma 2018;Ballesteros et al. 2020), and the tarantula A. hentzi (Setton et al. 2019). We discovered two copies of all four genes in the developmental transcriptome of the pseudoscorpion, except for dachshund, wherein three putative homologs were discovered ( fig. 7). However, two of these pseudoscorpion dachshund fragments were non-overlapping, suggesting that only two copies of dachshund are present in this transcriptome (comparable to the case of Mesobuthus martensii; Nolan et al. 2020). Similarly, we discovered two copies of these genes in the new arachnopulmonate data sets (whip spiders and the tarantula). By contrast, only one copy of these four genes was discovered in the sea spiders, as with mites, ticks, and harvestmen.
Gene tree analysis of these four genes had previously shown sufficient signal to resolve monophyletic clusters of arachnopulmonate dac and hth ohnologs (Nolan et al. 2020). Upon reconstructing these two gene trees after adding the pseudoscorpion, the whip spiders, the tarantula, and the sea spiders, we observed each pseudoscorpion paralog clustering with an arachnopulmonate ohnolog, rather than with the single copy orthologs of acarine taxa. For dac, the arachnopulmonate (including pseudoscorpion) clusters were recovered as monophyletic; as previously reported, the horseshoe crab duplications are unrelated to those of Arachnopulmonata (Nolan et al. 2020;Shingate et al. 2020 . 7). Gene trees of extradenticle and optomotor blind showed insufficient phylogenetic signal for testing phylogenetic placement, as previously reported (Nolan et al. 2020 systemic duplication in Pseudoscorpiones, we sequenced and analyzed the draft genome of the species Cordylochernes scorpioides for Hox gene clusters and miRNAs. Due to the fragmentation of the assembly, we were unable to recover more than one Hox gene per scaffold. Nevertheless, we discovered 18 Hox genes in the C. scorpioides genome, corresponding to two ohnologs of all Hox genes except for Hox3 ( fig. 8).
Together with the homeobox duplications in C. crassus, these results are consistent with a shared genome duplication uniting arachnopulmonates and pseudoscorpions. MicroRNAs (miRNAs) have been leveraged as rare genomic changes across the metazoan tree of life, with their effectiveness as phylogenetic markers being closely tied to the quality of genomic resources used for miRNA surveys (Tarver et al. 2013;Thomson et al. 2014;Tarver et al. 2018). In Chelicerata, Leite et al. (2016) previously surveyed miRNAs in the genomes of four spiders, a scorpion, a horseshoe crab, five Parasitiformes, and one Acariformes, as well as several outgroup taxa. This survey revealed lineage-specific duplications in Limulus polyphemus consistent with 2-fold WGD in Xiphosura; duplicated clusters of miRNAs in the spider P. tepidariorum, as well as tandem duplications; and a subset of duplicated miRNAs that were shared across spiders and scorpions.
To elucidate if pseudoscorpions exhibit miRNA duplications shared by arachnopulmonates, we expanded the survey of Leite et al. (2016) and searched for miRNAs in the draft genome of the pseudoscorpion, C. scorpioides and the genome of the scorpion, Mesobuthus martensii. Twenty-six conserved miRNA families were identified in the C. scorpioides genome, and another 35 in M. martensii. Among them, families iab-4, mir-71, and mir-276 had two or more ortholog copies in Arachnopulmonata, Pseudoscorpiones and Xiphosura ( fig. 9). Similarly, we found two members of the families bantam and mir-1 in Scorpiones, Pseudoscorpiones, two spiders, and Xiphosura.
Two miRNAs, mir-190 and pte-bantam, were found duplicated only in Scorpiones and C. scorpioides (with inferred independent duplications in Xiphosura, fig. 9). Our survey did not recover the presence of miRNA sequences from the families mir-210, mir-275, mir-315, mir-981, mir-277, and mir-11960 (previously reported in genomes of spiders and scorpions). We cannot rule out that these absences are attributable to the incompleteness of the pseudoscorpion genome assembly.
We found no miRNAs unique to Arachnida, nor patterns of duplication consistent with arachnid monophyly.
Taken together, these surveys of miRNA duplication revealed four miRNA duplications supporting the inclusion of pseudoscorpions within arachnopulmonates, and two further duplications supporting the sister relationship of pseudoscorpions and scorpions.

Consilience of Phylogenetic Data Classes in the Placement of Pseudoscorpions
Chelicerate higher-level phylogeny is plagued by topological uncertainty, with a subset of orders exhibiting long-branch attraction artifacts, as elucidated by taxon deletion experiments . Barring the monophyly of Euchelicerata (Xiphosura and arachnids), Arachnopulmonata (previously defined as Scorpiones þ Tetrapulmonata), and relationships within Tetrapulmonata, ordinal relationships in the chelicerate tree of life are highly unstable across phylogenomic data sets. Here, we leveraged the previous discovery of a WGD subtending the common ancestor of spiders and scorpions to assess competing hypotheses for the placement of pseudoscorpions (Sharma, Schwager, et al. 2014;Schwager et al. 2017). Taxonrich analyses of supermatrices as well as reconciliation of gene trees consistently recovered pseudoscorpions as the sister group of scorpions, the hypothesis supported by genome and miRNA duplication. Our taxon deletion experiments reveal that the sampling of basally branching lineages in the pseudoscorpion tree of life is key to overcoming long-branch attraction artifacts that draw pseudoscorpions together with the acarine orders.
Our results are also consistent with the variance of tree topologies in the previous chelicerate phylogenetics. Studies that have omitted basally branching pseudoscorpion families, or insufficiently sampled outgroup lineages, recovered Pseudoscorpiones as sister group to, or nested within, Acari (e.g., Sharma, Kaluziak, et al. 2014;Arribas et al. 2019). By contrast, phylogenomic works that sampled basal splits within Pseudoscorpiones have recovered support for their placement within Arachnopulmonata (e.g., Benavides et al. 2019;Howard et al. 2020). Our analyses further demonstrate that taxonomic sampling outweighs matrix completeness and analytical approach (supermatrix vs. gene tree reconciliation approaches) in achieving phylogenetic accuracy when long-branch attraction is incident.
To date, no morphological data matrix has ever recovered the monophyly of Arachnopulmonata (with or without pseudoscorpions), with both older and recent morphological cladistic studies continuing to recover the archaic grouping of Lipoctena (scorpions as the sister group to the remaining arachnid orders; Legg et al. 2013;Lamsdell 2016;Aria and Caron 2019;Bicknell et al. 2019; reviewed by Nolan et al. 2020). Shultz (1990Shultz ( , 2007 presented the first compelling cladistic analyses demonstrating that scorpions are derived within the arachnid tree, a result reflected in another body of recent paleontological investigations (e.g., Garwood and Dunlop 2014;Huang et al. 2018;). In such works, pseudoscorpions have typically been recovered as the sister group of Solifugae (as the clade Haplocnemata), an another order exhibiting topological instability ). Nevertheless, a sister group relationship of scorpions and pseudoscorpions has previously been tenuously supported by some morphological analyses, namely, the cladistic analysis of Garwood and Dunlop (2014). Subsequent expansion and reuse of this matrix also recovered this relationship (Huang et al. 2018;). However, the recovery of the clade Pseudoscorpiones þ Scorpiones as a sister group of Opiliones in those studies is refuted by phylogenomic analyses, developmental gene expression, and genomic architecture ( Nolan et al. 2020). We therefore observe only partial concordance between our analyses and inferences based on morphological matrices.
By contrast to morphology, we identified clear and systemic evidence for a shared WGD in the first developmental transcriptome and genome of two pseudoscorpion exemplars, which is concordant with the hypothesis that pseudoscorpions are derived arachnopulmonates. Surveys of homeobox gene duplication, gene tree topologies of benchmarked arachnopulmonate-specific ohnologs with a known spatiotemporal subdivision of embryonic expression domains, and patterns of miRNA duplication all support the inclusion of Pseudoscorpiones within arachnopulmonates, with further evidence from two miRNA families for the clade Pseudoscorpiones þ Scorpiones, a clade we term Panscorpiones (new name). Henceforth, we redefine Arachnopulmonata to include Pseudoscorpiones (new definition).
Due the unanticipated large size of the C. scorpioides genome (3.6 Gb), and the ensuing fragmentation of the assembly, we were not able to assess the number of Hox clusters in Pseudoscorpiones, which would constitute an independent test of the hypothesized shared WGD (but see Hoy et al. 2016 for a case of atomized Hox clusters in a mite). A forthcoming long-read, proximity ligation-based genome assembly of this species is anticipated to inform the ancestral architecture of arachnopulmonate genomes. One additional line of evidence that would support this phylogenetic inference would be embryonic gene expression patterns of ohnologs known to exhibit shared spatiotemporal dynamics in developing appendages of spiders and scorpions (e.g., dac; hth; Nolan et al. 2020). More recently, evidence from whip spiders (Amblypygi) has additionally supported the inference of conserved expression domains of ohnologs that correspond to gene tree topologies . Although we endeavored to generate expression data for the two copies of the appendage patterning transcription factors dac, hth, exd, and omb in C. crassus, we encountered technical challenges incurred by cuticle deposition early in pseudoscorpion development, as well as paucity of embryonic tissue. Whole mount in situ hybridization in pseudoscorpion embryos likely requires modified in situ hybridization protocols previously developed for highly sclerotized chelicerate embryos (e.g., sea spiders; Jager et al. 2006). Future efforts must establish a

Ancient Origins of Courtship Behavior and Brood Care in Arachnopulmonata
The recovery of Pseudoscorpiones as the sister group of scorpions markedly alters the reconstruction of several key character in the chelicerate tree of life ( fig. 10). Regarding their respiratory system, pseudoscorpions are reconstructed as arachnopulmonates that have secondarily lost their book lungs; instead, pseudoscorpions typically exhibit two pairs of tracheal tubules opening as spiracles on the third and the fourth opisthosomal segments. The evolutionary transition of book lungs to tracheal tubules is broadly associated with miniaturization in other arachnopulmonate orders (Dunlop 2019). For example, in derived spiders, the posterior pair of book lungs is replaced by openings of the tracheal tubules as well, which in turn have a complex evolutionary history within this order (Ram ırez et al. 2021). In Schizomida, the posterior pair of respiratory organs is lost altogether (Hansen and Sørensen 1905;Shultz 1990). Separately, an arachnopulmonate affinity for pseudoscorpions suggests that both a courtship behavior and a mode of parental care are ancient across this group. Like scorpions, Amblypygi, Uropygi, and Schizomida, pseudoscorpions of the superfamily Cheliferoidea perform a characteristic courtship dance (the promenade a deux), wherein the male clasps the female using the pedipalps and the pair navigate over a substrate ( fig. 10b, d, and g; Gravely 1915). The inferred purpose of this behavior is to guide the female to the spermatophore deposited by the male onto the substrate. The promenade a deux behavior is secondarily lost in spiders, which exhibit other, often complex, courtship behaviors. In addition, spiders do not produce an external spermatophore during mating; typically, sperm are passed to specialized copulatory bulbs on the distal palps, which are used for internal fertilization. Given the tree topology supported by analyses (reciprocally monophyletic Panscorpiones þ Tetrapulmonata), and under accelerated transformation of character states ( fig. 10), the promenade a deux appears to be a possible synapomorphy of Arachnopulmonata that was secondarily lost in spiders as well as in the common ancestor of Pseudoscorpiones, with a secondary regain in Cheliferoidea, or its retention in Cheliferoidea may represent a plesiomorphy Empirical Solutions to Long Branch Attraction . doi:10.1093/molbev/msab038 MBE that reflects arachnopulmonate affinity. An equally parsimonious scenario (under delayed transformation; not shown) constitutes independent gains in Pedipalpi and Panscorpiones, with the same sequence of loss and regain of this character within Pseudoscorpiones. A less ambiguous reconstruction is the presence of a stalked spermatophore attached to the substrate is found across pseudoscorpion superfamilies, as well as in scorpions, Amblypygi, Uropygi, and Schizomida (Shultz 2007). A similarity of spermatophore structure in scorpions and pseudoscorpions has previously been noted as well (Francke 1979).
Many pseudoscorpion superfamilies will produce a brood sac on the underside of the female's opisthosoma that is secreted by gonoporal glands, wherein embryos develop until hatching ( fig. 10c). A condition unique to brooding pseudoscorpion lineages is that developing embryos are additionally provisioned by nutritive secretions of the female (Weygoldt 1969). The production of a brood sac from genitalic glands is shared by Amblypygi, Uropygi, and Schizomida, which also brood embryos on the underside of the opisthosoma (Gravely 1915;Rowland 1972). The incidence of this mode of development in pseudoscorpions was previously thought to represent a morphological convergence (Shultz 1990). Scorpions exhibit a derived state in this regard, with all extant Scorpiones bearing live young ( fig. 10d). Upon birth or hatching from the egg, postembryos of scorpions, Amblypygi, Uropygi, and Schizomida will climb onto the female's back until they advance to additional instar stages ( fig. 10e, f, h, and  i). A similar form of brood care (carrying of the eggs) occurs in some acarine groups as well, as exemplified by argasid ticks (Pienaar et al. 2018), though the hatchlings are not known to be carried by the adult females.
Pseudoscorpion postembryonic care is variable across this order, but can take the form of females forming brood chambers and cohabiting these with offspring (Weygoldt 1969). As with insemination, spiders again bear a derived form of brood care within arachnopulmonates, with the female typically enveloping egg masses in silk. Brood care in spiders is variable; egg sacs may be guarded by females in burrows until juveniles achieve a later instar and disperse (e.g., mesotheles and mygalomorphs), attached to the substrate (e.g., some Ctenidae, Corinnidae, Selenopidae, and Hersiliidae), attached to webs (most araneomorphs), or carried on the female's back (e.g., Lycosoidea; fig. 10h). In addition, brood care consisting of egg guarding has independently evolved in Solifugae and several times within laniatorean Opiliones (Punzo 1998;Machado and Mac ıas-Ord oñez 2007).
Given the distribution of the promenade a deux, the stalked spermatophore, the production of the maternal brood sac from gonoporal glands, and comparable forms of maternal brood care across Chelicerata, we infer these four characters to be ancestral to Arachnopulmonata. As the the oldest known arachnopulmonate, Parioscorpio venator, is Silurian in age (439 Ma; Wendruff et al. 2020), the promenade a deux may constitute the oldest known courtship behavior.
The recovery of Panscorpiones precipitates reevaluation of other characters, whose homology in now in question. Key among these are venoms of Iocheirata (a clade of venomous pseudoscorpions, which excludes Chthonioidea and Feaelloidea), scorpions, and spiders. As the venom glands of each of these groups do not share positional homology (pedipalpal fingers in pseudoscorpions; posterior-most somite in scorpions; chelicerae in spiders), it is most likely that each group has undergone independent recruitment of housekeeping genes to serve as venom peptides, though striking similarities exist in some toxins of these three groups and may constitute a deep homology (Santib añez-L opez et al. 2018;Kr€ amer et al. 2019). On the other hand, the evolution of silks, which occur in spiders, some pseudoscorpions, and some Acariformes (once again, with no shared positional homology of silk-producing organs), is most likely to reflect independent evolutionary gains.
Prospects for a Resolved "Arachnid" Phylogeny Topological uncertainty in chelicerate phylogeny extends to the traditionally accepted monophyly of Arachnida, with an array of phylogenomic analyses recovering the derived placement of Xiphosura as the sister group of Ricinulei . In this study, Xiphosura was recovered as the sister group of Ricinulei (118/189 analyses), as part of a clade with Ricinulei and Solifugae (50/189 analyses), or sister group to a larger clade of derived arachnids, such as Arachnopulmonata (21/189 Given the unstable support for an arachnid monophyly across phylogenomic data sets, it has been contended that the morphological result of arachnid monophyly should be accepted as the most likely evolutionary scenario (Howard et al. 2020).
As we have previously shown, the most taxon-rich phylogenomic data set of chelicerates-and the sole analysis sampling all extant chelicerate orders-does not support arachnid monophyly, including under the CAT model . Recent reanalyses of data sets that had previously recovered arachnid monophyly under certain models (e.g., Regier et al. 2010; 500-slowest evolving genes in Sharma, Kaluziak, et al. 2014), showed that higher support for Arachnida could be obtained if these were analyzed under site heterogenenous models (Howard et al. 2020). Howard et al. , as well as the slowly evolving putative sister group of Parasitiformes (Opilioacariformes). We then computed topologies under the same three likelihood models implemented by Howard et al. (2020) Separately, we performed this same family of analyses, after removing 10 loci that represent duplicated genes in the 200locus data set. Duplicates in this context refers to identical or nearly identical alignments that recur in the same supermatrix. These are typically the result of failing to reduce input transcriptomes to single isoforms per Trinity gene, prior to analysis with OMA (Altenhoff et al. 2013). A list of these erroneously duplicated alignments is provided in the Dryad Digital Repository.
As shown in supplementary figures S2 and S3, Supplementary Material online, the inclusion of just two phylogenetically significant lineages (with or without the removal of the duplicated loci) to the analyses of Howard et al. (2020) is sufficient to break arachnid monophyly, as well as the monophyly of Acari, with significant nodal support (90-99% ultrafast bootstrap resampling frequency), under all three substitution models. The consistent recovery of a non-monophyletic Acari in matrices that sample the basally branching parasitiform lineage Opilioacariformes (e.g., this study) suggests that Acari monophyly is an another long-branch attraction artifact in chelicerate phylogeny. Taken together with the analyses of , as well as our analyses of BUSCO genes (this study), our reanalyses of OMA-inferred orthologs from Howard et al. (2020) suggest that neither Arachnida nor Acari are substantiated by dense taxonomic sampling, slowly evolving genes, site heterogeneous models, approach to orthology inference, or various combinations thereof.
Across Chelicerata, a subset of genes supporting arachnid monophyly, as identified by a DGLS framework, were previously shown to be statistically indistinguishable from the majority (which supported Xiphosura as derived), with respect to 70 parameters, including evolutionary rate, compositional heterogeneity, and alignment length (figure 3 of . In the present study, of the 189 phylogenetic analyses we performed using an independent orthology criterion for locus selection (BUSCO genes), not one analysis recovered arachnid monophyly. In addition, surveys of miRNAs revealed no support for Arachnida, either in the form of miRNAs unique to arachnids, or evidence of an arachnid-specific duplication (note that although not all chelicerate orders are represented by genomes, this should not hinder the recovery of putative arachnid-specific miRNAs in our analysis; Garb et al. 2018). Recovering arachnid monophyly in molecular data sets appears to require a concerted, and largely contrived, effort to circumscribe taxa, loci, models, and algorithms that will recover this preconceived relationship. As we have previously shown, this practice is questionable (if not outright unscientific) because it can be used to justify nonsensical groupings (figure 8 of . The attribution of arachnid non-monophyly to unspecified systematic biases or artifacts remains an unsubstantiated notion. Strong arguments in favor of arachnid monophyly remain the domain of morphological and paleontological data sets; these span the nature of mouthparts, eyes, respiratory systems, and stratigraphic distributions of marine versus terrestrial lineages, among others (reviewed by Howard et al. 2020). Such discussions eerily echo arguments once advanced in support of Tracheata (Myriapoda þ Hexapoda, or the terrestrial mandibulates), a group revealed by molecular phylogenetics to be an artifact of morphological convergence in another subset of terrestrial arthropods. As the history of hypotheses like Pulmonata (Gastropoda) and Tracheata has repeatedly shown, terrestrial lineages are highly prone to convergence, often to an astonishing degree (Friedrich and Tautz 1995;Shultz and Regier 2000;Giribet et al. 2001;Jörger et al. 2010). Shared reduction of the appendage-less intercalary segment (third head segment), the incidence of uniramous appendages, the gnathobasic architecture of the mandible, and the organization of the tracheal tubules in hexapods and myriapods serve as powerful examples of how parallel adaptations to life on land can confound interpretations of synapomorphies. More generally, the "holistic" approach of Howard et al. (2020) simply fails to reconcile its dependence upon the validity of morphology to support one questionable node (Arachnida) with its simultaneous dismissal of morphological data sets' consistent inability to recover the only higher-level chelicerate relationships that are robustly and independently supported by other data classes (Arachnopulmonata and Euchelicerata; figure 1 of Nolan et al. 2020).
We submit that an objective approach to testing phylogenetic hypotheses of terrestrialization in arthropods must regard traditional groupings with skepticism, rather than querying molecular sequence data for genes and data sets supporting preconceived relationships. Such investigations must also account for new neurophylogenetic characters that have recently suggested morphological support for a Empirical Solutions to Long Branch Attraction . doi:10.1093/molbev/msab038 MBE closer relationship of Xiphosura to Arachnopulmonata Melzer 2019a, 2019b). Due to the lack of genomes for Ricinulei (the putative Xiphosura sister group in some phylogenies) as well as other poorly studied arachnid groups (e.g., Palpigradi and Solifugae), we were not able to assess miRNAs or other rare genomic changes to test the competing hypothesis of Ricinulei þ Xiphosura. However, the incidence of WGDs in horseshoe crabs proffers the tantalizing possibility of applying the approaches used herein to assess this competing hypothesis, as at least one of the two WGD events in Xiphosura is thought to be ancient (Roelofs et al. 2020). The discovery of shared duplications of gene families, miRNAs, and syntenic blocks between different sets of chelicerate orders could be used to evaluate independently the monophyly of Arachnida, as well as the placement of the unstable apulmonate orders. Future efforts should therefore target the generation of genomic resources for Ricinulei, Palpigradi, and Solifugae to reevaluate such hypotheses as Haplocnemata (Solifugae þ Pseudoscorpiones), Megoperculata (Palpigradi þ Tetrapulmonata), and Arachnida itself.

Conclusions
Consilience in phylogenetics is the outcome of multiple, independent topological tests recovering support for the same hypothesis (e.g., Rota-Stabelli et al. 2011;Fröbius and Funch 2017;Marl etaz et al. 2019). Here, we demonstrated that analyses of sequence data, gene family duplications, gene tree topologies of arachnopulmonate-specific paralogs, and miRNA duplications independently support a nested placement of pseudoscorpions within Arachnopulmonata. Our results reinforce that topological accuracy in the placement of long-branch taxa is most affected by dense sampling of basally branching lineages, rather than algorithmic approach (supermatrix vs. coalescent-based summary methods), matrix completeness, evolutionary rate, or model choice alone. Improvements to chelicerate phylogeny must therefore focus on the identification of basally branching groups within orders whose internal relationships remain poorly understood, such as Solifugae, Amblypygi, Uropygi, and Schizomida. Leveraging rare genomic changes stemming from the genome duplications exhibited by a subset of chelicerate orders may be a key to resolving some of the most obdurate nodes in the chelicerate tree of life.

Species Sampling
For phylogenetic reconstruction, we generated a data set of 117 chelicerates (40 pseudoscorpions, 12 scorpions, 17 spiders, 4 Pedipalpi, 13 Opiliones, 5 Ricinulei, 3 Xiphosura, 2 Solifugae, 9 Parasitiformes, 10 Acariformes, 2 Pycnogonida) and 15 outgroups (3 Onychophora, 4 Myriapoda, 8 Pancrustacea) . Taxon selection prioritized the representation of basal splits in all major groups (Sharma, Fern andez, et al. 2015;Fern andez et al. 2017Fern andez et al. , 2018Ballesteros et al. , 2020Benavides et al. 2019; Santib añez-L opez et al. 2019, 2020). Libraries of high quality were additionally selected such that all chelicerate orders were represented in >95% of loci by at least one terminal, in all matrices constructed. Although we trialed the inclusion of a palpigrade library recently generated by us , the low representation of BUSCO genes for this taxon across data sets (46-70%) prohibited the inclusion of this order in downstream analyses. A list of taxa and sequence accession data is provided in supplementary table S1, Supplementary Material online.

Orthology Inference and Phylogenomic Methods
Candidate ORFs were identified in transcripts using TransDecoder (Haas et al. 2013). Loci selected for phylogenomic analysis consisted of the subset of 1066 Benchmarked Universal Single Copy Orthologs identified for Arthropoda (BUSCO-Ar). For each library, these were discovered using a hidden Markov model approach, following the procedure detailed in Leite et al. (2018). Multiple sequence alignment was performed using MAFFT 7.3.8 ( To assess the tradeoff between data completeness and the number of loci per data set, six matrices were constructed by setting taxon occupancy thresholds to 55% (1002 loci), 60% (945 loci), 65% (846 loci), 70% (693 loci), 75% (480 loci), and 80% (248 loci) of total taxa. These thresholds were selected to represent broadly commonly occurring values for matrix completeness in phylotranscriptomic studies of metazoans. Representation of each terminal and ordinal lineage per matrix is provided in supplementary table S2, Supplementary Material online.
To assess the effect of denser taxonomic sampling on the placement of Pseudoscorpiones, basally branching lineages of pseudoscorpions (corresponding to superfamilies or families) were sequentially pruned until only Cheliferoidea (Cheliferidae þ Chernetidae) was retained. Thus, six additional matrices were constructed, with sequential pruning of Chthonioidea (six terminals), Feaelloidea (two terminals), Neobisioidea (ten terminals), Garypoidea (five terminals), Garypinoidea (three terminals), and Cheridoidea þ Sternophoroidea (two terminals). Pruning was performed for each of the six matrices constructed according to taxon occupancy thresholds, resulting in 42 matrices in total.
Tree topologies for individual loci and for concatenated data sets were computed with IQ-TREE 1.6.8 (Nguyen et al. 2015;Chernomor et al. 2016), coupled with model selection of substitution and rate heterogeneity based on the Bayesian Information Criterion (Kalyaanamoorthy et al. 2017) and 1000 ultrafast bootstraps to assess branch support (-m MFP -mset LG, JTT, WAG -st AA -bb 1000; Hoang et al. 2018). For the subset of the least complete matrices (55% taxon occupancy), we additionally performed model selection under the posterior mean site frequency (PMSF), a mixture model that approximates the CAT model in a maximum likelihood framework (Lartillot and

MBE
Analyses were performed using the LG þ C20 þ F þ C and LG þ C60 þ F þ C models.
To assess the interaction between evolutionary rate and taxon sampling, we selected the 70% complete (693 loci) and 75% complete (480 loci) matrices to optimize the tradeoff between sufficient sampling of genes and low quantity of missing data. These matrices were divided into tertiles of slow-, intermediate-and fast-evolving genes using mean percent pairwise identity as a metric of evolutionary rate, following the approach of Sharma, Fern andez, et al. (2015). Subsequent pruning of basally branching pseudoscorpion taxa was performed as in other analyses. Tree inference was performed with a partitioned model-fitting and ASTRAL.
For phylogenetic analyses using multispecies coalescent methods, species trees were estimated with ASTRAL v. 5.14.2 (Mirarab and Warnow 2015;Zhang et al. 2018), using gene trees from IQ-TREE analyses as inputs. Phylogenetic signal at the level of individual genes was quantified using the gene-wise log-likelihood score (DGLS) for the unconstrained tree versus a competing hypothesis (Pseudoscorpiones þ Acariformes; Pseudoscorpiones þ Parasitiformes; Pseudoscorpiones þ Scorpiones) (Shen et al. 2017). This metric maps the relative support for each of two competing hypotheses, for every locus in the data set; the amplitude of the log-likelihood indicates the degree of support for either hypothesis.

Embryo Collection, Sequencing, and Mapping of Homeodomains
Given that transcriptomes of adult tissues have been shown to sample poorly transcription factors relevant for developmental patterning in arachnids (Sharma, Santiago, et al. 2015), assessment of homeodomain duplications was performed only for genomes and developmental transcriptomes. The genome of Cordylochernes scorpioides was excluded from this analysis, due to the fragmentation of the assembly.
Conicochernes crassus (Pseudoscorpiones: Chernetidae) were hand collected from underneath the bark of karri trees in Denmark, Western Australia (-34.963640, 117.359720). Individuals were reared in plastic containers containing damp paper towels at room temperate to simulate living conditions between bark and sapwood. Adult pseudoscorpions were fed a combination of cricket nymphs and ap -/fruit flies. Females of C. crassus carry developing embryos in a brood sac on the underside of the opisthosoma; individuals were checked for the presence of embryos. Females carrying embryos were separated from the colony for 12-72 h to prevent cannibalism and allow embryos to mature to a range of developmental time points. Entire brood sacs were then separated from the opisthosoma using forceps wetted with distilled water to prevent damage to the females before being returned to the colony.
Establishment of Phrynus marginemaculatus (Amblypygi: Phrynidae) for the study of developmental genetics and the comparative development was previously described by . Embryos of the whip spiders Charinus ioanniticus and Charinus israelensis were obtained by hand collecting brooding females from two cave sites in Israel, Hribet Hruba (31.913280, 34.960830) and Mimlach (32.858150, 35.44410). Two stages of deutembryos were obtained and sequenced for each species. Further details are provided in .
Field collection of embryos of the tarantula Aphonopelma hentzi (Araneae: Theraphosidae) for developmental genetics and transcriptomics was previously described by Setton et al. (2019).
Embryos were transferred to Trizol Tri-reagent (Ambion Life Technologies, Waltham, MA, USA) for RNA extraction, following manufacturer's protocols. Library preparation and stranded mRNA sequencing were performed at the University of Wisconsin-Madison Biotechnology Center on an Illumina HiSeq 2500 platform (paired-end reads of 125 bp). Raw sequence reads are deposited in NCBI Sequence Read Archive. Filtering of raw reads and strandspecific assembly using Trinity v. 2.8.3 followed our previous approaches .
All initial BLAST hits were retained. Next, the full protein sequences of the BLAST hits were predicted with TransDecoder v. 5.5.0 (Haas et al. 2013) with default parameters (-m 100; predicted transcripts with less than 100 amino acids were not retained) and thereafter analyzed using the Conserved Domain Database (CDD) (Marchler-Bauer et al. 2015) to confirm the presence of homeodomains and annotate other functional domains. BLAST hits that did not have homeodomains identified by CDD were removed. Transcripts within a species that had identical protein sequences Empirical Solutions to Long Branch Attraction . doi:10.1093/molbev/msab038 MBE predicted to encode homeodomains were manually checked. Because this approach conservatively emphasized retention of complete homeobox genes with conserved sequences, we cannot rule out the exclusion of partial transcripts of homeobox genes that lack homeodomains or orthologs with highly divergent sequences. Multiple sequence alignment, trimming to retain only the homeodomain, and classification of verified homologs followed procedures described by Leite et al. (2018).

Analysis of Appendage Patterning Ohnologs
Homologs of four appendage patterning genes were retrieved from the C. crassus transcriptome using approaches described above. Multiple sequence alignment of peptide sequences and alignment trimming followed the approach of Nolan et al. (2020). Maximum likelihood inference of tree topologies was performed using IQ-TREE under an LG þ I þ C substitution model. Nodal support was estimated using ultrafast bootstrapping.

Cordylochernes scorpioides Genome Sequencing
Illumina fragment libraries (insert sizes 270 and 420 bp) and mate-pair libraries (insert sizes 2, 4, and 8 kb) were constructed by Lucigen Corporation (Middleton, WI, USA). Fragment libraries were constructed from genomic DNA extracted from single individual inbred males; to meet DNA input requirements for mate-pair library construction, genomic DNA from 12 fourth generation inbred individuals was pooled. Fragment libraries were sequenced on HiSeq X with 150 b paired-end sequencing (Hudson Alpha Genomic Services Lab, Huntsville AL), and mate-pair libraries were sequenced on MiSeq with 150 b paired-end sequencing at Lucigen Corporation. The read data was assembled de novo at 125X coverage using MaSuRCA v. 3.2.3 (Zimin et al. 2013), with additional scaffolding using SSPACE Standard v 3.0 (BaseClear BV, Netherlands) followed by gap-filling using GapFiller v1.12 (BaseClear). The draft C. scorpioides genome assembly was submitted to GenBank (GenBank: QEEW00000000.1) and read data were deposited in NCBI SRA (SRA: SRP144365; BioProject: PRJNA449764). Global statistics for assessment of draft genome quality and completeness are provided on NCBI (https://www.ncbi.nlm.nih.gov/ assembly/GCA_003123905.1).

MicroRNA and Hox Genes Orthology Search
Previous work on miRNA occurrence in the genome of the house spider Parasteatoda tepidariorum identified 40 miRNA families shared across Arthropoda, and a further 31 either unique to spiders (n ¼ 30) or unique to arachnopulmonates (n ¼ 1) (Leite et al. 2016). To extend this survey to new taxa, we searched for miRNA families in the draft genome assembly of C. scorpioides (GCA_003123905.1), as well as the genome of Mesobuthus martensii (GCA_000484575.1). All miRNA reported from P. tepidariorum were retrieved from the miRBASE and used as query sequences (Kozomara et al. 2019). An initial BLAST search was performed (blastn -word_size 4 -reward 2 -penalty -3 -evalue 0.05) and sequences with e-value <0.05 and percentage identity >70% were retained. To accommodate the fragmentation of the C. scorpioides genome, as well as heterozygosity, putative hits were retained only if both the ELEKEF and KIWFQN motifs were discovered in the peptide translation, and peptide sequences were unique (i.e., pairs of sequences with only synonymous substitutions were considered putative alleles). Putative homologs were verified by multiple sequence alignment using MAFFT v. 7.407 (Katoh and Standley 2013). The structure and the minimum free energy of these selected miRNAs were analyzed with RNAfold v. 2.4.13 (as part of the ViennaRNA Package 2.0; Lorenz et al. 2011) and with The Vienna RNA WebServer (http://rna.tbi.univie.ac.at/cgibin/RNAWebSuite/RNAfold.cgi) using default settings. Regarding the previous survey of miRNA families in 16 ecdysozoan taxa by Leite et al. (2016), we corroborated all reported results, except for the discovery that the mygalomorph spider A. hentzi exhibits only a single copy of the miRNA ptebantam.
Permitting Specimens of C. crassus were collected in Western Australia under permit number 08-000214-6 from the Department of Parks and Wildlife. Specimens of C. scorpioides were collected in Panam a under permits SE/A-92-05 (collecting) and SEX/A-142-05 (export), from the Autoridad Nacional del Ambiente, Rep ublica de Panam a; and permit number 68818 (quarantine) from the Ministerio de Desarrollo Agropecuario, Rep ublica de Panam a.

Supplementary Material
Supplementary data are available at Molecular Biology and Evolution online.