SPEARS: Standard Performance Evaluation of Ancestral Reconstruction through Simulation

Ancestral haplotype maps provide useful information about genomic variation and biological processes. Reconstructing the descendent haplotype structure of homologous chromosomes, particularly for large numbers of individuals, can help with characterizing the recombination landscape, elucidating genotype-to-phenotype relationships, improving genomic predictions and more. Inferring haplotype maps from sparse genotype data is an efficient approach to whole-genome haplotyping, but this is a non-trivial problem. A standardized approach is needed to validate whether haplotype reconstruction software, conceived population designs and existing data for a given population provides accurate haplotype information for further inference. Results We introduce SPEARS, a pipeline for whole simulation-based appraisal of genome-wide ancestral haplotype inference. The pipeline generates virtual genotypes (truth data) with real-world missing data structure. It then proceeds to mimic analysis in practice, capturing sources of error due to imputation and reconstruction of ancestral haplotypes. Standard metrics allow researchers to assess which features of haplotype structure or regions of the genome are sufficiently accurate for analysis and reporting. Haplotype maps for 1,000 outcross progeny from a multi-parent population of maize is used to demonstrate SPEARS. Availability https://github.com/maizeatlas/spears


Introduction
The genome is a mosaic of ancestral haplotypes that capture the evolutionary or breeding history of an individual. Reconstructing ancestral blocks is important for imputing untyped regions of the genome (1), mapping quantitative trait loci (2), investigating the recombination landscape (3), and inferring the evolutionary history and structure of haplotypes (4,5). Accurately inferring ancestral haplotypes is non-trivial, but several approaches for this have been developed: MERLIN (6); HAPPY (2); GAIN (7); DOQTL (8); R/qtl2 (9). These tools have been shown to perform well, but are mostly limited to specific population types or breeding schemes and can be computationally intensive with complex pedigrees or large numbers of markers. RABBIT (Reconstructing Ancestry Blocks BIT by bit) is a flexible tool that uses a Markovian model to reconstruct haplotypes for complex pedigrees involving various mating scenarios (10). Compared to current tools, it has shown the highest accuracy for inbred lines from multiparent pedigrees (including the mouse CC design [11] and Arabidopsis thaliana MAGIC design [12]) (10). Here, tailored for RAB-BIT but extensible for other tools, we present a standard performance evaluation of ancestral reconstruction through simulation (SPEARS) pipeline. SPEARS is designed to determine expectations for the accuracy of haplotype reconstruction software and different population designs. As proof-ofconcept and a new demonstration of RABBIT, we develop a detailed picture on the variation in accuracy for genome-wide haplotype maps for progeny from a multi-parent population of maize.

Materials and methods
A. Simulation Data. This study introduces SPEARS (Figure S1) which incorporates SAEGUS (https://github.com/maizeatlas/saegus) as the genome simulator. Using SAEGUS, we first generated a virtual multiparent outcross population of 1, 000 progeny ( Figure S2). Real genotype data on inbred line parents of the population was used to initiate the simulation (n = 47, 078 markers obtained from both genotyping-by-sequencing (GBS) [13] and the MaizeSNP50 BeadChip [14]; filtered to remove residual heterozygous sites in the parents for compatibility with MaCH imputation [15]). We also examined the impact of using fewer markers (n = 23, 584 GBS markers). In order to represent analysis in practice, the level of missing data per marker observed in a real maize population (not shown) was reproduced for the virtual progeny. That is, at each marker, virtual genotype scores were randomly converted to missing according to the missing data rate observed at the corresponding marker. These missing data were then imputed using MaCH (15) and filtered to exclude markers with imputation accuracy < 0.5, resulting in 47, 074 markers (or 23, 584 GBS markers) being retained for RABBIT.

B. Reconstruction of Ancestral Haplotypes.
To build haplotype maps for the simulated population ( Figure 1, Figure S3) the joint model of RABBIT was used to assign an optimal Viterbi path using the origViterbiDecoding algorithm (10). Custom scripts in R were used to reformat output from RABBIT into phased genotypes for SPEARS.  Figure S3). Colors correspond to the seven founding parents of the admixed population ( Figure S2). The centromere is represented as a white circle.
SPEARS uses four metrics to assess distinct features of haplotype maps ( Figure S4): (i) Ancestral Assignment Accuracy (AAA); (ii) Genotype Assignment Accuracy (GAA); (iii) Switch-Error Rate (SER); and (iv) Correlation between Crossover Counts (CCC). AAA is calculated as the proportion of markers that have the correct inferred parent assigned on each homologue (given as a percentage). To calculate GAA, genotypes are assigned based on the inferred ancestry and corresponding parental genotype data used as input, from which the proportion of genotype matches are calculated (given as a percentage). To assess phasing accuracy, SER is calculated only among markers with correctly inferred genotype scores (phasing cannot be assessed for markers with genotyping errors) and is defined as: 1-(n-1-sw)/(n-1) where n is the number of heterozygous sites and sw is the number of switches required to obtain the true phase based on known data (16,17). The CCC is calculated as the correlation coefficient for the total number of crossovers across both homologues of all chromosomes per individual. For additional analysis, we also calculated "parent certainty" as the difference between the posterior probabilities of the two most likely parents at each marker (obtained from the origPosteri-orDecoding algorithm within RABBIT). Table S1. The average per sample AAA was 97.0%. Regions with lower accuracy showed decreases in parent certainty alone or in combination with a lower density of markers ( Figure S5), indicating that both identity-by-state among parents and marker density, but neither RABBIT nor the Viterbi algorithm per se were main sources of error in reconstructing haplotype maps. Additionally, the average GAA (99.3%) was higher than AAA (Table S1), suggesting shared parental haplotypes were indeed a contributing factor to error. RABBIT performed very well in phasing among markers that had correctly assigned genotypes with an average SER of 0.0090 across all samples. The CCC was positive and showed high correlation between known and RABBIT-inferred crossover counts (r = 0.89, p < 2.2e − 16) ( Figure S6). However, the number of crossover counts per sample was downward biased with an average of 260 ± 16 versus 228 ± 14 for known and RABBIT data, respectively.

Conclusions
Reconstruction of ancestral haplotypes from genomic data is useful for a number of applications. SPEARS showed that RABBIT produced highly accurate genome-wide haplotype maps for an admixed non-inbred population. Genomic segments with minimal differentiation between the parents resulted in localized reductions in the accuracy of ancestral assignment ( Figure S5). This may be overcome with a higher marker density unless parental haplotypes are truly identicalby-descent. For this study, using approximately half the number of markers had minimal influence on the average genomewide AAA, GAA, CCC, and SER (Table S1); some local regions showed slight reductions in accuracy ( Figure S7). This exemplifies how SPEARS estimates of the expectation for different types of error can help guide investigations on haplotype structure. It allows analysis of certain features of haplotype data to be included/excluded in a study based on a corresponding metric, and the expectations can be reported. Moreover, one can assess whether specific chromosomes or regions of the genome, but not others, are sufficiently accurate for downstream analysis. SPEARS, the protocol and suite of scripts, has been made publicly available at https://github.com/maizeatlas/spears.