Neighbor QTL: an interval mapping method for quantitative trait loci underlying plant neighborhood effects

Abstract Phenotypes of sessile organisms, such as plants, rely not only on their own genotypes but also on those of neighboring individuals. Previously, we incorporated such neighbor effects into a single-marker regression using the Ising model of ferromagnetism. However, little is known regarding how neighbor effects should be incorporated in quantitative trait locus (QTL) mapping. In this study, we propose a new method for interval QTL mapping of neighbor effects, designated “neighbor QTL,” the algorithm of which includes: (1) obtaining conditional self-genotype probabilities with recombination fraction between flanking markers; (2) calculating conditional neighbor genotypic identity using the self-genotype probabilities; and (3) estimating additive and dominance deviations for neighbor effects. Our simulation using F2 and backcross lines showed that the power to detect neighbor effects increased as the effective range decreased. The neighbor QTL was applied to insect herbivory on Col × Kas recombinant inbred lines of Arabidopsis thaliana. Consistent with previous results, the pilot experiment detected a self-QTL effect on the herbivory at the GLABRA1 locus. Regarding neighbor QTL effects on herbivory, we observed a weak QTL on the top of chromosome 4, at which a weak self-bolting QTL was also identified. The neighbor QTL method is available as an R package (https://cran.r-project.org/package=rNeighborQTL), providing a novel tool to investigate neighbor effects in QTL studies.


Introduction
Sessile organisms, such as land plants, lack the active mobility required to withdraw from neighboring individuals. Field studies have shown that the phenotypes of an individual plants depend not only on their own genotypes but also on those of neighboring plants ( Barbosa et al. 2009). Such neighbor effects are mediated by direct (e.g., competition and volatile communication) and indirect (e.g., herbivore and pollinator movements) interactions that modulate complex traits throughout a plant life cycle, including growth (Subrahmaniam et al. 2018), defense (Schuman et al. 2015;Sato 2018;Tamura et al. 2020), and reproduction (Underwood et al. 2020). Moreover, increasing evidence suggests that plantplant interactions within a species may increase population biomass and pest resistance (Zeller et al. 2012;Wuest and Niklaus 2018;Yang et al. 2019). However, it remains unclear how to best analyze the quantitative trait locus (QTL) underlying key traits that lead to plant neighborhood effects.
QTL mapping is a well-established approach for the analysis of loci responsible for complex traits (Broman et al. 2003(Broman et al. , 2019Broman and Sen 2009). Although genome-wide association studies (GWAS) have now been developed, there are several limitations associated with this approach, including the occurrence of false positive signals due to the population structure (Hayes 2013), as well as the omission of rare smalleffect variants in a sample population (Korte and Farlow 2013). While recombination events are limited in experimental crosses, QTL mapping can compensate for GWAS disadvantages in population structure and rare variant detection among unrelated individuals. Once GWAS identifies a pair of target accessions, its biparental population is then subject to QTL mapping (Sonah et al. 2015;Crowell et al. 2016;Han et al. 2018). For example, the joint approach using GWAS and QTL mapping has thus far enabled plant researchers to validate loci controlling traits of ecological or agricultural interest in Arabidopsis thaliana (Brachi et al. 2010), maize (Crowell et al. 2016), soya bean (Sonah et al. 2015), and peppers (Han et al. 2018). Therefore, QTL mapping provides a complementary analysis for GWAS to further dissect complex traits in plant genetics and breeding (Sonah et al. 2015;Rishmawi et al. 2017;Han et al. 2018;Marchadier et al. 2019).
Regarding QTL mapping of plant neighborhood effects, Wuest and Niklaus (2018) have recently conducted mixed planting studies with various combinations of near-isogenic lines in A. thaliana and uncovered that a single QTL increased the biomass when a pair of lines having different alleles at the QTL marker were cultivated together. Although the first discovery of neighbor QTL is intriguing, it is not yet possible to detect neighbor QTLs in the absence of exhaustive pairwise experiments comprising combinations of many accessions. In this context, our previous study proposed "neighbor GWAS" that screened neighbor effects from random spatial arrangements of multiple genotypes (Sato et al. 2021). The core concept of neighbor GWAS was to consider the Ising model of statistical physics as an inverse problem of singlemarker regression, thereby estimating the effects of neighbor genotypic identity on a trait. However, QTL mapping of neighbor effects is more complicated than single-marker analysis as QTL studies employ the maximum likelihood method for interval mapping between flanking markers (Haley and Knott 1992;Jansen 1993;Broman and Sen 2009). Such interval mapping requires a stepwise inference from genotype imputation to phenotype prediction. First, conditional genotype probabilities are obtained from the observed marker genotypes and recombination fractions between flanking markers. Second, phenotypes are inferred using the conditional genotype probabilities and marker effects (Haley and Knott 1992). To adopt interval mapping for neighbor effects, it is necessary to define the effects of neighbor genotypic identity on a quantitative trait.
In this study, we developed an interval mapping method for detecting QTLs that led to neighbor effects. The primary aim of the developed method was to detect such neighbor QTLs from spatial variation in phenotypes rather than partialing out the spatial heterogeneity as a nuisance. The proposed method, "neighbor QTL," was applied to simulated data and recombinant inbred lines (RILs) of A. thaliana. Furthermore, the new QTL method was built into an R package.

Model
First, we developed a basic regression model and subsequently defined QTL effects for interval mapping. To combine neighbor effects and a linear model, we focused on a well-known statistical physics model, Ising model (McCoy and Maillard 2012). The Ising model defines the magnetic energy arising from physical interactions among neighboring magnets. By analogy, we regarded an individual as a magnet, genotypes as dipoles, and a trait as energy. Given the observed traits (or energy), we estimated the interaction coefficients of the Ising model to infer neighbor effects.

Joint regression for self and neighbor effects
To incorporate neighbor effects into linear regression, we developed a joint model following the single-marker regression of neighbor GWAS (Sato et al. 2021). We considered a situation where a number of inbred lines occupied finite sites in a twodimensional space and assumed that an individual is represented by a magnet, whereby two homozygotes at each marker, AA or BB, correspond to the north or south dipole (Figure 1). We defined x i or x j as the genotype at a focal marker for i-th focal individual or j-th neighbor, respectively, where x iðjÞ 2fAA, BBg ¼ f1, À1g. We then used multiple regression to model the effects of self-genotype and neighbor genotypic identity on a trait of i-th individual y i as where b 0 , b 1 , and b 2 indicated intercept, self-genotype effects, and neighbor effects, respectively. The residual for a trait value of the focal individual i was denoted as e i . The neighbor genotypic identity was represented by P L < i;j> x i x ðsÞ j , which indicated the sum of products for all combinations between the i-th focal individual and the j-th neighbor at the s-th scale of spatial distance from the focal individual i (Figure 1). The total number of neighbors L varied in response to the spatial scale s to be referred. The coefficient of neighbor effects b 2 was scaled by L. If two individuals shared the same genotype at a given locus, the product x i x j became positive, whereas the product x i x j became negative if two individuals had different genotypes. Thus, the effects of neighbor genotypic identity on a trait y i was dependent on the neighbor effect coefficient b 2 and the number of two genotypes in a neighborhood.
Notably, the multiple regression model Equation (1) was posed as an inverse problem of the Ising model. When summing y i for all individuals and substituting coefficients as E ¼ Àb 2 =L; H ¼ Àb 1 and I ¼ P ðy i À b 0 Þ, Equation (1) could be transformed into the total magnetic energy of a two-dimensional Ising model as McCoy and Maillard 2012). In such a case, the neighbor effect b 2 and self-genotype effect b 1 could be interpreted as the interaction coefficient E and external magnetic force H, respectively.

QTL effects of neighbor genotypic identity
To convert a linear regression into a QTL model, we defined QTL effects for self-genotypes and neighbor genotypic identity. With heterozygosity incorporated, we redefined x i and x j by a marker genotype for an i-th focal individual and j-th neighbor as g i and g j , respectively. Self-QTL effects expected from these genotypes were denoted as g iðjÞ 2fAA, AB, BBg ¼ fa 1 , d 1 , Àa 1 g, where a 1 and d 1 indicated additive and dominance deviation for self-QTL effects, respectively. We then assumed that neighbor QTL effects were not prominent until the two alleles interacted at an Figure 1 Assumption regarding neighbor effects in a two-dimensional space. A white or black point indicates an individual having AA or BB genotype, respectively. A gray circle represents an effective area of neighbor effects at the spatial distance s from the focal individual i. Neighbor effects then occur depending on genotype similarity between the focal individual i and the j-th neighbors within the spatial distance s. individual level and accordingly defined different coefficients of the additive and dominance deviation a 2 and d 2 for neighbor QTL effects. The neighbor QTL effects caused by neighbor genotypic identity between the individual i and j were defined by its product among nine genotype combinations (Table 1). Given the QTL effects of self-genotype and neighbor genotypic identity, we decomposed a trait of i-th individual y i as, where y and e i indicate a population mean of traits and a residual for the focal individual i, respectively. If QTL effects were completely additive (i.e., a 1 ¼ a 2 ¼ 1 and d 1 ¼ d 2 ¼ 0), the QTL model Equation (2) had the same structure as the linear regression (Equation 1). In such an additive model, the coefficients b 1 and 6 ffiffiffiffiffi b 2 p represent additive QTL effects. Notably, the 6 ffiffiffiffiffi b 2 p sign indicates positive or negative effects of sharing the same alleles on a trait.

Interval mapping for conditional neighbor genotypic identity
Finally, we extended the single-marker QTL model Equation (2) to interval mapping. In particular, we modified the Haley-Knott regression that approximated the maximum likelihood method by a simple regression (Haley and Knott 1992;Broman and Sen 2009). The proposed algorithm consisted of three steps: (1) obtaining conditional self-genotype probabilities, (2) calculating conditional neighbor genotypic identity from the conditional selfgenotype probabilities, and (3) regressing trait values on the conditional self-genotype probabilities and neighbor genotypic identity.
The first step to obtain conditional self-genotype probabilities was the same as that of standard QTL mapping. Let p iðjÞ be the probability for the focal individual i or neighbor j to have a certain genotype at an interval pseudo-marker. We defined the conditional self-genotype probability for the individual i as p i ¼ Pr(g i ¼ fAA, AB, BBgjM) and obtained p i from the number of observed markers Â n individuals matrix M and its recombination fraction following hidden Markov models (Lander and Green 1987;Broman et al. 2003). Based on the products of the conditional selfgenotype probabilities, we further calculated the conditional neighbor genotypic identity P L < i;j> p i p ðsÞ j =L. We then defined g i g j as the combination of marker genotypes generating neighbor QTL effects; and p i p j as the expected probability for two genotypes to interact. The expected neighbor QTL combination was accordingly designated as p i p j g i g j . These probabilities were summed for all possible combinations of the genotypes as P 3 v P 3 w ½ðp i;v p ðsÞ j;w Þ ðg i;v g ðsÞ j;w Þ, where the subscript v and w indicate the three genotype states AA, AB, and BB.
Similar to Haley-Knott regression, we finally estimated the QTL effects g i and g i g j by regressing the trait values y i on p i and P L < i;j> p i p ðsÞ j =L, respectively. The additive and dominance deviations for the self-QTL effects a 1 and d 1 were considered as average differences in trait values among AA, AB, or BB genotypes, such that a 1 ¼ ðy AA À y BB Þ=2 and d 1 ¼ y AB À ðy AA þ y BB Þ=2 (Broman and Sen 2009). In such a case, the regression coefficient b 1 gave 2â 1 when -1, 0, and 1 dummy groups were assigned for the AA, AB, and BB genotypes, respectively, or gaved 1 when 0, 1, and 0 were assigned for the AA, AB, and BB genotypes, respectively (Broman et al. 2003). For neighbor QTL effects, the additive and dominance deviation a 2 and d 2 were also considered as the average differences in trait values among the nine possible combinations (Table 1) as In this case, trait values y i could be fitted by a quadratic regression on the group of nine genotype combinations ( Figure 2). Suppose that j =LÞ 2 represents such a quadratic regression, where the linear or quadratic coefficient b 2 or b 2 3 provides estimates for the additive or dominance deviation 62â 2 2 ord 2 2 , respectively. Practically, we could estimate a 2 and d 2 by the quadratic regression of the trait values y i on the conditional neighbor genotypic identity P L < i;j> p i p ðsÞ j =L, with nine genotype combinations encoded as AA/AA, BB/BB, AA/ AB, AB/AA, AB/AB, AB/BB, BB/AB, AA/BB, BB/AA ¼ f1, 1, 0.25, 0.25, 0.0, À0.25, À0.25, À1, À1g. The estimated sign of 6a 2 2 inferred positive or negative effects of sharing same alleles on a trait.
Based on the linear and quadratic regression, we decomposed a trait y i into self and neighbor QTL effects. Notably, the self and neighbor QTL effects are inevitably correlated because the self-QTL component g i appears in the second and third term in Equation (2). Specifically, when the effective scale s was large, less spatial variation occurred in the neighbor condition P g ðsÞ j , and stronger correlations arose between self and neighbor QTL effects. If a 1 and a 2 are simultaneously estimated by a one-step regression, there is a risk that the neighbor QTL effects a 2 may be overrepresented when fitted to correlated components. In addition, given that this study aimed to test neighbor effects, it was important to avoid type I errors against neighbor QTL effects. Thus, neighbor QTL effects were tested in comparison with the residuals of self-QTL effects. We estimated a 1 , d 1 , a 2 , and d 2 by following the six-step iterations.
1) Estimate a 1 by a linear regression on self-genotype probabilities, with À1, 0, and 1 encoded for the AA, AB, and BB genotypes, respectively. 2) Estimate d 1 by a linear regression on self-genotype probabilities, with 0, 1, and 0 encoded for the AA, AB, and BB genotypes, respectively. 3) Calculate self-QTL effects withâ 1 andd 1 . 4) Include the self-QTL effect as a covariate at a focal marker. 5) Estimate 6a 2 and d 2 by a quadratic regression on the conditional neighbor genotypic identity, with [À1, 1] dummy groups assigned for nine genotype combinations. 6) Calculate joint QTL effects withâ 1 ;d 1 ;â 2 andd 2 .
Based onâ 1 ;â 2 ;d 1 andd 2 , we inferredŷ i and derived log elikelihood (LL) using model deviance. LOD score for the self or neighbor QTL effects were designated as LOD self ¼ log 10 [exp(LL self À LL null )] or LOD nei ¼ log 10 [exp(LL nei À LL self )], which could be obtained in steps 3 and 6, respectively.
When there were only two genotypes, the quadratic regression was replaced by a linear regression to estimate the additive neighbor QTL effects. For the case of inbred lines lacking AB heterozygotes, we estimated the additive deviation a 2 by a linear Table 1 QTL effects expected by genotypic identity between the individuals i and j with AA, AB, or BB genotypes The additive and dominance deviation for the neighbor QTL effects is denoted by a 2 and d 2 , respectively. regression of trait values y i on the conditional neighbor genotypic identity P L < i;j> p i p ðsÞ j =L, with the 1 and À1 dummy groups assigned for the AA and BB genotypes, respectively. In the case of backcross lines lacking BB homozygotes, the additive deviation corresponded to the dominance deviation so that d 2 ¼ Àa 2 . The additive deviation a 2 could be estimated by a linear regression with the AA and AB genotypes encoded as -1 and 0, respectively. These two linear models were equivalent as both the inbred and backcross lines had two genotypes with additive effects.

Variation partitioning with the QTL model
Prior to the genome scan, we estimated the effective spatial scale s by calculating the proportion of phenotypic variation explained (PVE) by neighbor effects. Incorporating two random effects into a linear mixed model, we could partition the phenotypic variation into PVE by polygenic self-effects, polygenic neighbor effects, and residuals (Sato et al. 2021). According to previous studies (Henderson et al. 1959;Kang et al. 2008), the linear mixed model can be expressed as: where y indicates a phenotype vector as y i 2 y; Xb indicates fixed effects with a matrix including a unit vector and all covariates X and a coefficient vector b; Zu indicates random effects with u i 2 u and a design matrix Z; and e indicates residuals where e i 2 e. The random effects and residuals were further decomposed as Var(u) ¼ r 2 1 K 1 þ r 2 2 K 2 and Var(e) ¼ r 2 e I, where the n Â n individual similarity matrix for self-genotype or neighbor identity was scaled by the number of markers q as K 1 ¼ P T 1 P 1 =ðq À 1Þ or K 2 ¼ P T 2 P 2 =ðq À 1Þ, respectively. Given that one of two alleles is similar between heterozygotes and homozygotes, here, we defined the additive polygenic effects for self-QTLs as g i 2fAA, AB, BBg ¼ fÀ1, 0, 1g and for neighbor QTLs as g i g j 2fAA/AA, BB/BB, AA/AB, AB/AA, AB/AB, AB/BB, BB/AB, AA/BB, BB/AAg ¼ f1, 1, 0.5, 0.5, 0.0, À0.5, À0.5, À1, À1g. In these cases, the q Â n matrix P 1 included expected self-genotype values as elements P 1 ¼ ð P 3 v p i g i Þ, and K 1 represented a kinship matrix that was calculated from all the pseudo-markers (Broman et al. 2019). Similarly, the q Â n matrix P 2 included the conditional neighbor genotypic identities as elements j;w Þ ðg i g ðsÞ j ÞÞ, and K 2 represented a genome-wide structure of conditional neighbor genotypic identity. Based on the three variance component parameters, we calculated PVE by polygenic self or neighbor effects as PVE self ¼ r 2 1 =ðr 2 1 þ r 2 2 þ r 2 e Þ or PVE nei ¼ r 2 2 =ðr 2 1 þ r 2 2 þ r 2 e Þ. In addition, a marker heritability that represented additive genetic variance was defined as h 2 ¼ r 2 1 =ðr 2 1 þ r 2 e Þ when r 2 2 and s were set at 0 in Equation (3).
Using the linear mixed model (Equation 3), our previous simulations revealed that increasing patterns of PVE nei from s ¼ 0 to a large s (Sato et al. 2021) could inform the effective spatial scale of neighbor effects. If the effective range was narrow, PVE nei approached a plateau at a small value of s. In contrast, PVE nei linearly increased with s if the effective range was broad. To generalize these results for a continuous two-dimensional space, here we introduced DPVE metric as differences in PVE from s to s þ 1 such that DPVE ¼ PVE nei,s þ 1 -PVE nei,s. Using such differential metrics, we quantified how PVE nei approached to a plateau across s as follows: 1) Categorize spatial scales as s 2 S based on the percentiles for pairwise Euclidean distance between individuals. 2) Calculate PVE nei from s ¼ 1 to the maximum elements of S.

3) Calculate DPVE nei and determine s ¼ argmaxDPVE nei
The proposed algorithm using a differential PVE is referred to as "DPVE method" hereafter.

An R package, "rNeighborQTL"
In addition, the neighbor QTL method was built into an R package, "rNeighborQTL." The rNeighborQTL inherited input objects from the R/qtl package (Broman et al. 2003), allowing us to save phenotype and genotype data as common "cross" objects. Because of the stepwise testing, the self-QTL effects yielded the same results as standard QTL mapping. For the DPVE method, the mixed models Equation (3) were solved using the algorithm of average information restricted maximum likelihood (AI-REML) (Gilmour et al. 1995) implemented in the Gaston package (Perdry and Dandine-Roulland 2020). An additional but necessary input file was a spatial map describing the positions of individuals at the x-and y-axes. The rNeighborQTL package is available via the Comprehensive R Archive Network (CRAN) at https://cran.r-proj ect.org/package¼rNeighborQTL.
The rNeighborQTL package included several options to analyze a variety of QTL data. Instead of linear (mixed) models (Equations 1 and 3), logistic (mixed) models could also be selected to handle a binary phenotype (Chen et al. 2016;Faraway 2016). Because the logistic mixed model did not provider 2 e (Chen et al. 2016; Perdry and Dandine-Roulland 2020), PVE nei was substituted by the ratio of phenotypic variation explained (RVE) to polygenic neighbor effects as RVE nei ¼r 2 2 =r 2 1 when a binary trait was subject to the DPVE method. In addition, the neighbor QTL allowed additional covariates when conducting a genome scan. This option enabled composite interval mapping (Jansen 1993) if genetic markers other than a focal locus were considered covariates. When a significant marker was detected by the single-QTL analysis, it was also possible to test two-way interactions, namely epistasis, between the neighbor QTL effects across a genome.

Simulation
Furthermore, we performed a benchmark test using simulated data on F2 and backcross lines. With a random spatial map  (Table 1). The additive or dominance deviation a or d is represented by the linear or quadratic term, respectively. A negative linear coefficient indicates that sharing of the same alleles (i.e., AA/AA or BB/BB combinations) had negative effects on traits. generated, we simulated neighbor effects based on "fake.f2" and "fake.bc" autosome genotypes implemented in the R/qtl package (Broman et al. 2003). The spatial positions were sampled from a uniform distribution Unif(1, 100) across a continuous twodimensional space. We estimated a 1 for self-phenotypes of "fake.f2" and "fake.bc" data after the trait values were scaled to have a mean of zero and variance of 1 and assigned maxâ 1 to a randomly selected marker. In contrast to the major-effect marker, small coefficients, i.e., 10 À3 Â maxâ 1 , were assigned to the other markers to simulate polygenic effects. Additive (a 2 ¼ maxâ 1 and d 2 ¼ 0:25 Â maxâ 1 ), dominant (a 2 ¼ d 2 ¼ maxâ 1 ), and overdominant (a 2 ¼ maxâ 1 and d 2 ¼ 1:25 Â maxâ 1 ) scenarios were analyzed for the F2 lines, while only additive scenario (a 2 ¼ maxâ 1 and d 2 ¼ Àmaxâ 1 ) was applicable for the backcross lines. In total, 30 traits were simulated for true effective distances given at 10-th to 50-th percentiles of pairwise Euclidean distance among individuals. The trait values of simulated neighbor effects were added to the selfphenotypes of "fake.f2" or "fake.bc" data set, with 50% of phenotypic variation being attributable to the neighbor effects. The conditional self-genotype probabilities and conditional neighbor genotypic identity at each marker were standardized to have mean of zero and variance of 1. We then applied the DPVE method and a genome scan for the joint traits and calculated LOD nei at s ¼ argmaxDPVE nei to evaluate the power to detect neighbor effects.
To further examine the model performance, we compared the proportion of PVE by a model including polygenic self-effects alone and a joint model including both the polygenic self and neighbor effects. In addition, we calculated the marker heritability h 2 ¼ r 2 1 =ðr 2 1 þ r 2 e Þ by setting r 2 2 and s at 0 in Equation (3). Moreover, the total PVE by the full model was defined by PVE self þ PVE nei ¼ ðr 2 1 þ r 2 2 Þ=ðr 2 1 þ r 2 2 þ r 2 e Þ. The PVE unexplained by the heritability but explained by the full model, namely (PVE self þ PVE nei) Àh 2 , could be considered a net contribution of polygenic neighbor effects to phenotypic variation. In addition, to test whether neighbor phenotypes rather than neighbor genotypes explained greater phenotypic variation, we ran the same simulation, with the neighbor genotype g ðsÞ j replaced by the neighbor phenotype y ðsÞ j in Equation (3). To assess false positive detection of neighbor genotypic effects due to neighbor phenotypic effects, we also simulated traits involving the neighbor phenotypic effects, the trait values of which were added to the self-phenotypes of "fake.f2" or "fake.bc" data set, with 50% of phenotypic variation attributed to the neighbor phenotypic effects. The simulated traits were then fitted by the neighbor QTL model (i.e., model incorporating neighbor genetic effects).

Data
To apply the neighbor QTL on real data, we conducted a pilot QTL experiment using the yellow-striped flea beetle Phyllotreta striolata and RILs of A. thaliana (Supplementary Figure S1A). Adults of flea beetles access host plants by jumping like a "flea." The adults have a small mouthpart and make small holes on leaves when eating. The number of leaf holes can be used as an indicator of herbivory by flea beetles. These flea beetles are known to prefer glabrous A. thaliana to hairy accessions (Sato et al. 2019). We selected RILs derived from hairy and glabrous accessions in this study to observe large phenotypic variation in leaf holes.

Plants and insects
We used 130 accessions, including parental lines and RILs between Col(gl1) and Kas-1 accession (Wilson et al. 2001). Col(gl1) plants produce no trichomes, while Kas has sparse trichomes on leaves and stems. The RILs are known to vary in the trichome production, disease resistance (Wilson et al. 2001), and flowering time (Li et al. 2006). We used the marker data based on simple sequence length polymorphism and cleaved amplified polymorphic sequences (Wilson et al. 2001). Although SNP markers are available from Li et al. (2006), Li et al. (2006) analyzed 96 out of the 130 accessions. Furthermore, subsampling of individuals or accessions proved more problematic for neighbor QTL than for standard QTL mapping, as neighbor effects, if any, caused nonindependence of individual data points, and thus, the real influence of neighboring plants on observed traits could not be eliminated from an experiment by subsampling. Therefore, to assess interval mapping capability, we applied the sparse marker data provided by Wilson et al. (2001)

Experimental procedure
To investigate neighbor effects in herbivory, we allowed adult beetles to feed on RIL seedlings grown in a plastic cell tray. Three seeds for each accession were sown on each compartment of the cell tray (13 Â 10 cells composed of 20 Â 20 mm 2 compartment) with the accessions randomized. The seeds were acclimated under a constant dark condition with 4 C for 7 days, and then allowed to germinate under a long-day condition (16:8 h light: dark cycles with a 20 C constant air temperature). The seedlings were grown under the long-day condition for 24 days, with 2000fold diluted liquid fertilizer (N:P:K ¼ 6:10:5; Hyponex, Hyponex Japan, Osaka) supplied once. On day 14 post-germination, the seedlings were thinned out to leave one seedling per compartment. Prior to the feeding experiment, we recorded the presence or absence of leaf trichomes and the occurrence of bolting by direct observation and determined the rosette diameter (mm) by analyzing seedling images using the Image J software (Abrà moff et al. 2004). The cell tray was enclosed by a white mesh cage (length 29.2 cm Â width 41.0 cm Â height 27.0 cm: Supplementary Figure S1B). Thereafter, 30 adult beetles were released into the cage and allowed to feed on plants for 72 h. We counted leaf holes as a measure of herbivory for each plant as flea beetles left small holes when they fed on leaves (Supplementary Figure S1C). The final sample size was 126 individuals; out of 130 accessions, 4 accessions (CS84877, CS84873, CS84950, and CS84894) were not germinated, CS84898 lacked genotype data, and CS84958 had two replicates of individuals.

Data analysis
We used R version 3.6.0 (R Core Team 2019) for all statistical analyses. A genetic map for the Col Â Kas RILs was estimated using the est.map() function in the R/qtl package (Broman et al. 2003). Self-genotype probabilities were calculated using the calc.genoprob() function implemented in the R/qtl package (Broman et al. 2003). The number of leaf holes was logtransformed and analyzed using linear models. The presence of trichomes and bolting was analyzed using logistic models. When analyzing the number of leaf holes, we incorporated the presence or absence of bolting, the rosette diameter, and the edge (or not) of the cell tray into covariates. The neighbor QTL was performed using the rNeighborQTL package developed in this study. A genome-wide significance level was determined by empirical percentiles of the maximum LOD score among 999 permuted traits. We considered p < 0.1 and p < 0.05 a suggestive and significant level, respectively. In addition, we set an arbitrary threshold at LOD score of 1.5 when discussing the results. To determine whether neighbor phenotypic effects more accurately explained the leaf holes compared to neighbor genotypic effects, we calculated PVE by incorporating the neighbor phenotype y

Simulation using F2 and backcross lines
We simulated neighbor effects based on "fake.f2" and "fake.bc" data implemented in the R/qtl package (Broman et al. 2003). The maximum additive deviation of self-QTL effects, maxâ 1 , was 0.56 and 0.28 for F2 and backcross lines, respectively. These values were assigned for neighbor QTL effects to achieve a similar signal strength between self and neighbor effects, while minor effects were allocated to other loci. Considering the polygenic self and neighbor effects as random effects, we applied the DPVE method for simulated traits. The estimated distance given by s ¼ argmaxDPVE increased as the true distance increased (Figure 3), indicating that the DPVE method was effective. Even when neighbor phenotypes were incorporated instead of neighbor genotypes in the model fitting, PVE nei did not increase but decreased in all the four scenarios (mean of the difference in PVE nei ¼ À0.4 for the additive scenario, À0.20 for the dominant scenario, À0.19 for the overdominant scenario, and À0.46 for the backcross lines among 30 iterations from 10-th to 50-th percentiles of the spatial distance class). Incorporating neighbor phenotypes instead of genotypes during the simulation, we applied neighbor QTL to simulated traits involving neighbor phenotypic effects. However, the mean LOD score of a major-effect marker did not exceed 1.60 in any scenario; hence, the neighbor phenotypic effects were unlikely to result in false positive detection of neighbor QTL effects.
When the effective distance of neighbor effects was limited, such short-range neighbor effects were well detected using DPVE method and the quadratic approximation (median LOD nei > 4 at the 10-th percentile of pairwise Euclidean distance: Figure 3). Although the power to detect long-range neighbor effects was lowered, the LOD score was still larger than the Bonferroni threshold up to 40-th percentiles of the distance class (median LOD nei > 4: Figure 3). These results indicated that short-range neighbor effects could be detected in any scenario, although it was relatively difficult to detect long-range effects. For backcross lines, both short-and long-range neighbor effects were well detected (median LOD nei > 4 for all s: Figure 3D). The backcross lines had two genotypes with the additive deviation alone and were well fitted using linear approximation ( Figure 3D), whereas the additive traits for F2 lines were less likely fitted using the quadratic model assuming three genotypes with additive and dominance deviation ( Figure 3A). Given the model structure underlying F2 lines, it is plausible that the quadratic term was unnecessary for the additive F2 traits, and it decreased the power to detect neighbor effects.
Linear mixed models including both polygenic self and neighbor effects accounted for a greater proportion of phenotypic variation than the marker heritability that considered polygenic selfeffects alone (lower panels in Figure 3). However, the net contribution of polygenic neighbor effects to phenotypic variation, namely (PVE self þ PVE nei) Àh 2 , decreased as the true distance of neighbor effects became larger. In contrast, the marker heritability of simulated traits increased as the true distance of simulated neighbor effects became larger (Supplementary Figure S2). These results were likely due to the correlation between self and neighbor QTL effects, given that the self-QTL component g i appeared in both the second and third term in Equation (2). Such a correlation became stronger when the effective space broadened and less variation in neighborhood conditions among individuals remained. The loss of PVE by neighbor effects made it difficult to detect long-range neighbor effects as observed in the LOD score simulation (middle panels in Figure 3). To anticipate the underlying correlation structure, we should note that (1) the significance of neighbor QTL effects should be tested using an iterative regression in comparison with the self-QTL model, and (2) PVE nei can be useful for the DPVE method but does not indicate the model performance over the self-QTL model.

Self-QTL effects in Col 3 Kas RILs
The observed number of leaf holes ranged from 0 to 38 with a median of 4 (Supplementary Figure S1D). The total variation in the number of leaf holes was explained at 5% by the trichome production; 2% by bolting; 10% by the rosette diameter; and 22% by the edge effects (Analysis-of-Variance, F ¼ 9.1, 3.7, 20.7, and 43.4; p ¼ 0.003, 0.06, 10 À4 , and 10 À8 , respectively). With the polygenic self-effects considered a single random effect, a linear mixed model estimated the marker heritability as 5.2% for the leaf holes, though it was not significant (Likelihood ratio test, v 2 1 ¼ 1:85; p ¼ 0:17). To scan self-QTL effects, we conducted standard QTL mapping of the trichome production, the number of leaf holes, and bolting ( Figure 4; Table 2). For self-QTL effects on the trichome production, we detected a strong peak near the GLABRA1 locus (> 20 LOD self score: Figure 4B). Considering the rosette diameter and bolting covariates, we observed a suggestive (0:05 < p < 0:1) but the largest self-QTL effect on the leaf holes at the GLABRA1 locus (LOD self ¼ 1.97: Figure 4C). For the bolting, we observed the largest significant peak on the bottom of chromosome 1 (> 4 LOD self) , and the second largest and suggestive (0:05 < p < 0:1) peak on the top of chromosome 4 (LOD self ¼ 1.92: Figure 4D). The self-QTL effects were also analyzed using the R/qtl package. Neighbor QTL yielded the same results regarding the self-QTL effects as the Haley-Knott regression implemented in the scanone() function (Supplementary Figure S3).
Several studies reported the same QTLs or a particular gene function for the self-effects on trichomes, defense, and flowering. Remarkably, GLABRA1 gene on chromosome 3 is known to encode a myb transcription factor regulating leaf trichome developments (Ishida et al. 2008). Our previous study showed that gl1 mutants were more likely attacked by the flea beetles than hairy wild-types in A. thaliana (Sato et al. 2019). Other studies on Brassica cultivars also documented that leaf trichomes deterred herbivory by flea beetles (Soroka et al. 2011;Alahakoon et al. 2016). The present finding of the self-QTL effects agrees with the previous evidence for roles of plant trichomes in resistance to flea beetles. Furthermore, two self-bolting QTLs on chromosomes 1 and 4 were located near flowering time QTLs in Col Â Kas RILs (Li et al. 2006). Thus, our pilot experiment supports previous evidence for the loci responsible for flowering, trichome production, and anti-herbivore defense.

Neighbor QTL effects in Col 3 Kas RILs
To estimate the effective distance of neighbor effects, we applied DPVE method with every 10-th percentile categories for pairwise Euclidean distance ( Figure 5). For the number of leaf holes, the DPVE nei was peaked at a 7.81 distance scale from a focal individual ( Figure 5A), covering almost all the experimental arena from the center plant. The adults of flea beetles jump and access host plants like a flea. Our DPVE method likely reflected such moving behaviors of flea beetles, suggesting that the experimental cage used in this pilot study was too small for flea beetles to mediate neighbor effects among plant individuals. At the estimated distance, 6.3% of total variation in leaf holes was attributable to PVE nei ( Figure 5A). The sum of PVE self and PVE nei explained 6.5% of the total variation, although its additional 1.3% fraction compared to 5.2% of the marker heritability was not significant (Likelihood ratio test, v 2 1 ¼ 0:23; p ¼ 0:62). Even when neighbor phenotypes were incorporated in place of neighbor genotypes, the PVE for the number of leaf holes did not increase (PVE nei ¼ 0.052, the sum of PVE self and PVE nei ¼ 0.052). Meanwhile, the DPVE became the largest at the second nearest scale for the bolting and explained one-third of variation compared to the self-QTL effects (RVE nei ¼ 0:32 at s ¼ 3.16: Figure 5B), suggesting that the bolting was unlikely to be affected by distant neighbors. For the trichome production, DPVE method revealed that the slight variation can be explained using polygenic neighbor effects (RVE nei % 0 for s from 10-th to 60-th percentiles; or models failed to converge for s over 70-th percentiles). A genome scan for neighbor effects was performed using the estimated spatial distance ( Figure 5; Table 2). Regarding the neighbor QTL effects on the leaf holes, we observed a weak, but the largest, QTL on the top of chromosome 4 at the nga8 marker (LOD nei ¼ 1.68: Figure 5A; Table 2), which was also the position of the second largest self-bolting QTL. This neighbor QTL had no significant epistasis as shown by <1.1 LOD scores for all the twoway interactions between the nga8 and other markers (Supplementary Figure S4). Neither the neighbor QTL nor GLABRA1 locus detected above was overlapped with known self-QTLs of powdery mildew resistance (Wilson et al. 2001), suggesting independence of the herbivory QTLs on the disease resistance loci. At the second nearest scale for bolting, we found a very weak neighbor QTL at the R30025 marker on the chromosome 3 (LOD nei ¼ 1.38: Figure 5B); however, we did not detect any neighbor QTLs having an LOD score >1.5.
Ecological studies have shown that plant apparency, which defines how easily an individual plant can be identified, drives neighbor effects through visual crypsis against herbivores (Hambä ck et al. 2000;Castagneyrol et al. 2013;Strauss et al. 2015). In this study, the neighbor QTL involved in the leaf holes was located near a self-bolting QTL at the top of chromosome 4, suggesting the potential importance of plant apparency in neighbor effects in anti-herbivore defense. In addition, the positive sign of the additive neighbor effects a 2 at that marker indicated that the number of leaf holes decreased when neighbors had different genotypes (Table 2). This implies that the mixture of flowering and vegetative plants may acquire population-wide resistance to flea beetles since the effective distance of neighbor effect was sufficiently large to encompass almost the entire experimental arena. These results led us to hypothesize that the self-QTL

Further applicability and limitation
The theoretical advantage of the Ising model lies in its inference of spatial arrangements that optimize total magnetic energy. Once the self and neighbor coefficients are estimated by a marker-based regression, these coefficients may be able to infer which genotype distributions can minimize or maximize the population-sum of trait values (Sato et al. 2021). In the context of neighbor QTL, additive effects suggest that positive and negative a 2 favors clustered or mixed patterns for maximizing the sum of trait values, respectively. However, in cases where dominance effects and epistasis are involved, how such a complex genetic basis affects the optimal spatial arrangement remains unexplored. These potential effects of genetic architecture on a population-level outcome of neighbor effects would be of theoretical as well as empirical interest for future studies.
Superior to the previous neighbor GWAS that assumes additive effects alone, the present neighbor QTL offers the flexibility to deal with heterozygosity. While the neighbor QTL analyzes crossed progeny, the neighbor GWAS analyzes unrelated individuals with a population structure incorporated into a random effect of a linear mixed model (Sato et al. 2021). Considering the complementary usage of GWAS and QTL mapping in plant genetics (Sonah et al. 2015;Crowell et al. 2016;Han et al. 2018), the neighbor QTL may be useful to analyze crossed progeny of interacting pairs nominated by the neighbor GWAS. However, the use of neighbor QTL is still restricted to autosomes because the sexdependent inheritance of neighbor effects remains unknown.
Standard QTL mapping on sex chromosomes require from one to three degrees of freedom (Broman et al. 2006), and thus, its extension to neighbor effects may be more complex than that of self-QTL effects. In addition, the neighbor QTL approximated the maximum likelihood method by a quadratic regression, in which phenotype variance was assumed to be equal among the nine combinations among three QTL genotypes. Our simulations revealed that the quadratic approximation could handle overdominance but was outperformed by linear approximation if additive effects alone governed a trait. We should thus be aware of statistical models behind the neighbor QTL. Practically, both the intercross and the inbred models might be utilized if a sample population is partially inbred.
While this study involved simulations and a laboratory experiment, environmental similarity other than neighbor genotypic identity might also shape spatial patterns of trait values in largescale cultivation under outdoor conditions. Such environmental autocorrelation matters when genotype distribution and abiotic conditions (e.g., light and soil nutrients) are clustered together in space. Although allele frequencies are unlikely biased in QTL populations, genotypes may be clustered in a large field where complete randomization is hard. When applying the neighbor QTL to field data, joint modeling with a random effect of spatial autocorrelation, such as SpATS (Rodríguez-Á lvarez et al. 2018), would allow us to distinguish neighbor QTL effects from environmental similarity.

Conclusion
The present neighbor QTL, together with the previous neighbor GWAS (Sato et al. 2021), provides a novel tool to incorporate neighbor effects into quantitative genetics. These methods may provide insights into the genetic architecture underlying neighbor Proportion, or ratio, of PVE by polygenic neighbor effects (PVE nei or RVE nei) plotted against the pairwise distance among individuals. A closed point indicates the distance at which DPVE peaked. Right: LOD nei score for neighbor QTL effects at the distance at which DPVE peaked. Colors correspond to chromosome numbers, and dots indicate observed markers. A solid and dashed horizontal line indicates a significant (P < 0.05) and suggestive (P < 0.1) LOD threshold with 999 permutations, respectively. effects, as exemplified by the pilot study of insect herbivory on A. thaliana. Once the neighbor GWAS screens candidate accessions, their crossed progeny can be inspected by the neighbor QTL. The line of R packages, "rNeighborQTL" and "rNeighborGWAS," help investigate neighbor effects using a complementary set of GWAS and QTL data.