The impact of paramutations on the invasion dynamics of transposable elements

Abstract: According to the prevailing view, the trap model, the activity of invading transposable elements (TEs) is greatly reduced when a TE copy jumps into a piRNA cluster, which triggers the emergence of piRNAs that silence the TE. Paramutations are one crucial component of the host defence. Mediated by maternally deposited piRNAs, paramutations convert TE insertions into piRNA-producing loci, thereby transforming selfish TEs into agents of the host defence. Despite this significant effect, the impact of paramutations on the dynamics of TE invasions remains unknown. To address this issue, we performed extensive forward simulations of TE invasions with piRNA clusters and paramutations. We found that paramutations significantly affect TE dynamics by accelerating the silencing of TE invasions, reducing the number of insertions accumulating during the invasions and mitigating the fitness cost of TEs. We also demonstrate that piRNA production induced by paramutations, an epigenetically inherited trait, may be positively selected. Finally, we show that paramutations may account for three important open problems with the trap model. Firstly, paramutated TE insertions may compensate for the insufficient number of insertions in piRNA clusters observed in previous studies. Secondly, paramutations may explain the discrepancy between the observed and the expected abundance of different TE families in Drosophila melanogaster. Thirdly, piRNA clusters may be crucial to trigger the host defence, but paramutations render the clusters dispensable once the defence has been established. This could account for the lack of TE activation when three major piRNA clusters were deleted in a previous study.

Figure S1: Validation that transposition is accurately simulated with InvadeGO. We tested whether the expected exponential growth of TE copy numbers in populations without host defence is accurately reproduced. We simulated 500 TE invasions with a transposition rate of u = 0.1 in populations of size N = 1000. We started each invasion with 10 randomly distributed TE copies in the population. Note that the observed TE copy numbers (c) closely follow the expectation ($c_t = c_0 (1 + u)^t$; bold line).
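The expectation plotted in Figure S1 can be computed directly. The following is a minimal sketch, not part of InvadeGO; the function name and the starting value c0 = 10 are illustrative assumptions, and c0 should be given in whatever unit the copy number c is measured in.

```python
def expected_copy_number(c0: float, u: float, t: int) -> float:
    """Expected TE copy number after t generations without host defence.

    Implements c_t = c_0 * (1 + u)^t from the caption of Figure S1.
    """
    return c0 * (1.0 + u) ** t


if __name__ == "__main__":
    # u = 0.1 as in Figure S1; c0 = 10.0 is only an illustrative start value.
    for generation in (0, 10, 20, 50):
        print(generation, expected_copy_number(10.0, 0.1, generation))
```

With u = 0, the function returns c0 for every generation, which matches the setting used to switch off transposition in the drift and recombination validations.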
Figure S2: Validation that genetic drift is accurately simulated with InvadeGO. We relied on the basic population genetics insight that the probability of fixation of a neutral singleton is $p = 1/(2N)$. When n loci are segregating as singletons, the expected number of loci fixed by drift is $e = np$. To test this, we distributed 10,000 neutral TE insertions at random positions in populations of different sizes (N = 250, N = 500, N = 1000). Importantly, we used a transposition rate of u = 0 to prevent the emergence of additional TE copies. For each population size we used 300 replicates, and populations were simulated for 20,000 generations. In our scenario the expected number of fixed TE insertions is $e = 10000/(2N)$ (dashed blue line). Note that the observed number of fixed TE insertions fits well with the expectations.
Figure S4: Validation that negative selection is accurately simulated with InvadeGO. We tested whether the observed trajectories of selected TEs match the theoretical expectations. For a negatively selected TE with a population frequency of $p_t$ and a negative effect x, the allele frequency in the next generation can be computed as $p_{t+1} = (p_t^2 w_{AA} + p_t (1 - p_t) w_{Aa}) / \bar{w}$, where $\bar{w}$ is the average fitness and $w_{AA} = 1 - 2x$ and $w_{Aa} = 1 - x$ are the fitnesses of individuals homozygous and heterozygous for the TE insertion, respectively. We simulated a TE insertion with an allele frequency of $p_0 = 0.5$ (in Hardy-Weinberg equilibrium) in a population of size N = 1000. We simulated five different negative effects of TEs (x) and used 100 replicates for each. The simulations with x = 0 represent the neutral scenario. According to population genetic theory, an allele should behave similarly to the neutral scenario when $N_e \cdot x < 1$ (i.e. x = 0.0001). Note that the trajectories of the negatively selected TE insertions agree with theoretical expectations.
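The two expectations used in Figures S2 and S4 can be sketched in a few lines. This is an illustrative sketch rather than the InvadeGO implementation. The function names are hypothetical, and the mean fitness assumes Hardy-Weinberg proportions with a fitness of 1 for individuals that do not carry the insertion, which the caption implies but does not state.

```python
def expected_fixed(n_loci: int, pop_size: int) -> float:
    """Expected number of neutral singleton loci fixed by drift: e = n / (2N)."""
    return n_loci / (2 * pop_size)


def next_frequency(p: float, x: float) -> float:
    """One generation of selection against a TE insertion at frequency p."""
    w_AA = 1.0 - 2.0 * x  # homozygous for the insertion
    w_Aa = 1.0 - x        # heterozygous for the insertion
    w_aa = 1.0            # assumption: insertion-free individuals have fitness 1
    q = 1.0 - p
    # Mean fitness under Hardy-Weinberg genotype proportions (assumption).
    w_bar = p * p * w_AA + 2.0 * p * q * w_Aa + q * q * w_aa
    return (p * p * w_AA + p * q * w_Aa) / w_bar


def trajectory(p0: float, x: float, generations: int) -> list[float]:
    """Deterministic frequency trajectory, starting at p0."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], x))
    return freqs


if __name__ == "__main__":
    for pop_size in (250, 500, 1000):
        print(pop_size, expected_fixed(10_000, pop_size))
    print(trajectory(0.5, 0.01, 5))
```

The recursion is deterministic, so it describes only the expected trajectory; the simulated replicates additionally scatter around it because of drift.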
Figure S5: Validation that the effect of piRNA clusters is accurately simulated with InvadeGO. We simulated a single chromosome of 1Mb and a piRNA cluster at the end of the chromosome. piRNA clusters accounted for either 0%, 50% or 100% of the chromosome. For each scenario we simulated 100 TE invasions with a transposition rate of u = 0.1 in populations of size N = 1000. We started the TE invasions with 10 randomly distributed TE copies in the populations. As expected, TE copy numbers increased exponentially in the simulations without piRNA clusters. Note that piRNA clusters markedly slowed down the copy number increase of the TE.
… on selection of paramutation-dependent piRNA production (PDPP). We simulated different negative effects of TEs (x) and different transposition rates (u). Fractions of populations where PDPP is fixed (red) or lost (blue) after 5000 generations are shown. If PDPP is fixed, all individuals in a population will produce piRNAs. With negative fitness effects of TEs (x > 0), the fraction of populations where PDPP was fixed increases, suggesting that the epigenetically inherited trait, PDPP, is positively selected. Note that our finding that PDPP is positively selected is robust to different assumptions about the recombination rate and the fitness function.
Table S1: Fraction of stopped TE invasions after 5,000 and 10,000 generations, assuming that TE insertions in piRNA clusters trigger the host response. We simulated TE invasions and counted the number of populations with a fixed piRNA-producing locus (i.e. stopped invasion). Results are shown for different numbers of paramutable loci (%para). We used 100 replicates for each scenario. piRNA clusters had a size of 3% of the genome (%cluster).
%para   %cluster   5000   10000
0       3          0.92   1.00
1       3          0.97   1.00
10      3          0.94   1.00
100     3          1.00   1.00

Figure S3: Validation that recombination is accurately simulated with InvadeGO. We tested whether the observed decay of linkage disequilibrium (D; grey lines) matches the theoretical expectation $D_t = D_0 (1 - c)^t$ (blue line), where c is the recombination rate. For the base populations we simulated two neutral TE insertions with a population frequency of f = 0.5 in perfect linkage ($D = p_{AB} - p_A p_B = 0.5 - 0.5 \cdot 0.5 = 0.25$). Furthermore, we used a population size of N = 10000 and a transposition rate of u = 0. We simulated four different recombination rates and used 100 replicates for each. The LD decay observed in the simulations agrees well with theoretical expectations.
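The expected decay of linkage disequilibrium in Figure S3 can be reproduced with a short sketch. The function names are illustrative and not taken from InvadeGO, and the recombination rate c = 0.1 in the usage example is an arbitrary value, since the caption does not list the four rates that were simulated.

```python
def linkage_disequilibrium(p_ab: float, p_a: float, p_b: float) -> float:
    """D = p_AB - p_A * p_B for two biallelic loci."""
    return p_ab - p_a * p_b


def expected_ld(d0: float, c: float, t: int) -> float:
    """Expected LD after t generations: D_t = D_0 * (1 - c)^t."""
    return d0 * (1.0 - c) ** t


if __name__ == "__main__":
    # Base population of Figure S3: both insertions at frequency 0.5,
    # in perfect linkage.
    d0 = linkage_disequilibrium(0.5, 0.5, 0.5)
    # c = 0.1 is an arbitrary illustrative recombination rate.
    for generation in (0, 1, 10, 50):
        print(generation, expected_ld(d0, 0.1, generation))
```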

Figure S6: Validation that paramutations are correctly simulated with InvadeGO. A) Overview of the simulations. We simulated 1000 individuals, 1 chromosome of length 1Mb without piRNA clusters, 100 replicates and a fixed TE at a single paramutable locus (position 1). The transposition rate was u = 0.1. Different fractions of the individuals in the base populations were initiated with maternal piRNAs (0%, 50% and 100%). B) Average number of TE insertions per diploid individual during TE invasions, dependent on the fraction of individuals with maternal piRNAs in the base population (top panel). Note that maternal piRNAs reduce the number of TEs accumulating during the invasions.

Figure S7: Validation that the genomic landscape and the random distribution of novel insertion sites are accurately simulated with InvadeGO. Simulations were performed for a diploid organism with two chromosomes of unequal size (500kb and 1000kb; top panel). The population size was N = 1000 and the transposition rate was u = 0.1. All plots show the number of insertion sites in 10kb windows at different generations during the TE invasion (right panel). Colors indicate the different genomic regions (blue: piRNA cluster, green: paramutable loci, red: rest of the genome). A) Simulation without host defence. Note that insertion sites are equally distributed over the two chromosomes. B) Simulation with piRNA clusters (100kb) at the 5' end of each chromosome. Note that the cluster insertions are located at the end of the two chromosomes. C) Simulations with piRNA clusters and paramutable loci (at each 10th position outside of piRNA clusters). Note that paramutable loci are solely found outside of piRNA clusters and that paramutable loci are equally distributed over the chromosomes. D) Simulation with piRNA clusters and negative selection. We assumed that solely TE insertions outside of piRNA clusters are negatively selected. Note that all TE insertions, except for the insertions in piRNA clusters, are gradually lost from the population.

Figure S8: Population frequency of TE insertions in piRNA clusters (clu), paramutated TEs (para) and the rest of the genome (noe; none of either). We simulated piRNA clusters accounting for 3% of the genome and 10% paramutable loci. We used a transposition rate of u = 0.1 and assumed neutral TE insertions (x = 0). Simulations were performed for 5000 generations and results are provided for every 1000th generation. Note that the frequency of all TE insertions increases gradually during the experiment due to genetic drift. The significance was estimated with Wilcoxon rank sum tests. NS: not significant.

Figure S10: Invasion dynamics with shifting proportions of piRNA clusters (clu) and paramutable loci (para) such that the total sum of piRNA-producing loci (clu+para) remains constant at 3%. We show 100 replicates for each scenario. A) Abundance of TE insertions during the invasions. B) Abundance of TE insertions at the beginning of a phase.

Figure S11: Invasion dynamics with neutral (left panels) and negatively selected (right panels) TE insertions. Data are shown for TE invasions with (bottom panels) and without (top panels) paramutable loci.

Figure S12: Paramutations reduce the fitness cost of TE invasions. We simulated TE invasions without and with 10% paramutable loci (para) for many different sizes of piRNA clusters (clu). The significance was estimated with Wilcoxon rank sum tests. ***: p < 0.001.

Figure S13: Paramutations reduce the fitness cost of TE invasions under a multiplicative ($w = (1 - x)^n$) and a linear fitness function ($w = 1 - xn$). Fitness cost = 1 - minimum fitness. w: fitness of an individual, x: negative fitness effect of a single TE insertion, n: number of TE insertions per diploid individual. The significance was estimated with Wilcoxon rank sum tests. NS: not significant, ***: p < 0.001.
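The two fitness functions and the fitness cost defined in the caption of Figure S13 can be written out as follows. This is a minimal sketch with hypothetical function names. Clamping the linear fitness at zero is an assumption made here to avoid negative values, and the caption does not specify over which set of values the minimum fitness is taken, so the sketch simply accepts any list of fitness values.

```python
def fitness_multiplicative(x: float, n: int) -> float:
    """Multiplicative fitness: w = (1 - x)^n."""
    return (1.0 - x) ** n


def fitness_linear(x: float, n: int) -> float:
    """Linear fitness: w = 1 - x * n.

    Assumption: values are clamped at 0 so that fitness is never negative.
    """
    return max(0.0, 1.0 - x * n)


def fitness_cost(fitness_values: list[float]) -> float:
    """Fitness cost = 1 - minimum fitness over the supplied values."""
    return 1.0 - min(fitness_values)


if __name__ == "__main__":
    x = 0.01  # illustrative negative effect of a single TE insertion
    for n in (0, 10, 50):
        print(n, fitness_multiplicative(x, n), fitness_linear(x, n))
    print(fitness_cost([1.0, 0.95, 0.9]))
```

For small values of x * n the two functions are nearly identical; they diverge as insertions accumulate, where the linear function declines faster.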

Figure S15: Influence of the recombination rate (r) and the transposition rate (u) on the number of TEs at the start of the phase. We simulated piRNA clusters accounting for 3% of the genome and 10% paramutable loci. Default parameters are shown in bold. Note that both parameters, the transposition rate and the recombination rate, have little influence on the abundance of TEs.

Figure S16: Dynamics of TE invasions under a siRNA model inspired by Luo et al. [2022]. We assumed that TE insertions in trigger sites (top panel) lead to the production of siRNAs that drive the conversion of TE insertions in paramutable loci into piRNA-producing loci. The three phases of TE invasions can also be recognized under this siRNA model. After a rapid increase in TE copy numbers (green; rapid invasion phase), a TE invasion is initially controlled by segregating paramutated TEs (yellow; shotgun phase) and later by fixed paramutated TEs (red; inactive phase).

Figure S18: A siRNA model (second and third columns) predicts the observed abundance of TE families in D. melanogaster more accurately than the classic trap model (first column). For the siRNA model we simulated no piRNA cluster, 10% paramutable loci and 3% or 30% trigger sites. A) Abundance of TEs during simulated TE invasions. B) Observed and expected abundance of TEs in D. melanogaster. Bar plots indicate the abundance of each TE family. The colors of the bars indicate the average population frequency (blue = 0.1, red = 1.0). The grey shades indicate expectations from the simulations. Note that simulations with paramutable loci capture the observed TE abundance more accurately than simulations without paramutable loci. Data are from Kofler et al. [2015].

Table S2: Fraction of stopped TE invasions after 5,000 and 10,000 generations, when insertions into siRNA trigger sites initiate the host response. Results are shown for different numbers of siRNA trigger loci (%trigger). We simulated 30% paramutable loci and used 100 replicates for each scenario. No piRNA clusters were simulated (%cluster).