Is cooperation favored by horizontal gene transfer?

Abstract It has been hypothesized that horizontal gene transfer on plasmids can facilitate the evolution of cooperation, by allowing genes to jump between bacteria, and hence increase genetic relatedness at the cooperative loci. However, we show theoretically that horizontal gene transfer only appreciably increases relatedness when plasmids are rare, where there are many plasmid-free cells available to infect (many opportunities for horizontal gene transfer). In contrast, when plasmids are common, there are few opportunities for horizontal gene transfer, meaning relatedness is not appreciably increased, and so cooperation is not favored. Plasmids, therefore, evolve to be rare and cooperative, or common and noncooperative, meaning plasmid frequency and cooperativeness are never simultaneously high. The overall level of plasmid-mediated cooperation, given by the product of plasmid frequency and cooperativeness, is therefore consistently negligible or low.


Supplementary information for 'Is cooperation favoured by horizontal gene transfer?'
This supplementary document contains:

Appendix A: Mathematical model.
4. Individuals with a cooperation allele encode a public good, at cost CG, which generates a benefit B that is shared between all members of the patch. Individuals bearing a plasmid (either cooperative or defective) suffer a cost of plasmid carriage CC.
5. Individuals survive according to their fitness, which is determined by the costs and benefits of public goods and the cost of plasmid carriage.
6. Surviving, plasmid-bearing individuals lose their plasmid with probability s.
7. Individuals disperse to form an infinite pool of potential founders.
We note that, in our lifecycle, clonal reproduction, plasmid transfer, and plasmid loss occur in three distinct lifecycle stages. This is an unrealistic simplifying assumption, commonly made in theoretical models of plasmid dynamics to keep them analytically tractable (Rankin et al., 2011). Of course, in nature, the three lifecycle features (reproduction, plasmid transfer, plasmid loss) occur continuously and in tandem, but treating them as discrete lifecycle stages, as we do here, is unlikely to lead to artificial results (Rankin et al., 2011; Mc Ginty et al., 2013; Birch, 2014, 2017). Furthermore, the ordering of clonal reproduction, plasmid transfer, and plasmid loss is unlikely to qualitatively affect the results. For instance, Birch showed that the particular ordering of plasmid transfer and public goods production does not qualitatively influence the course of evolution (Birch, 2014, 2017).
where W denotes mean population fitness, and is given by: and where mA gives the expected proportion of the patch who are cooperators (either plasmid or chromosomal). The subscript A denotes a set. Specifically, A denotes the set of "known genotypes". The patch cooperator frequency (mA) varies depending on the genotype of the focal individual, and who the focal individual pairs up with in the plasmid transmission lifecycle stage, as we will now explain with a few examples.
As a first example, if the focal individual has a given genotype ab, and it pairs up with an individual who has descended from the same founder cell as itself (this occurs with probability 1/N), then we can be sure that one of the original founder cells had the genotype ab (and gave rise to both the focal individual and its partner), but we cannot be sure about the identities of the remaining N-1 founder cells. The patch cooperator frequency varies depending on the genotypes of the founder cells that gave rise to the patch. Therefore, in this scenario, given that we only know the identity of one founder cell, ab, we write the patch cooperator frequency as m{ab} (we derive m{ab} explicitly in terms of model parameters in the next section).
As a second example, if the focal individual has a given genotype ab, and it pairs up with an individual who has descended from a different founder cell to itself (this occurs with probability (N-1)/N), who has the genotype cd, then we can be sure that one of the original founder cells had the genotype ab (and gave rise to the focal individual), and one of the other founder cells had the genotype cd (and gave rise to the focal individual's partner). We cannot be sure about the identities of the remaining N-2 founder cells. This leads to patch cooperator frequency of m{ab,cd} (we derive m{ab,cd} explicitly in terms of model parameters in the next section).
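The pairing probabilities used in the two examples above (1/N for a partner from the same founder lineage, (N-1)/N otherwise) can be checked with a short Monte Carlo sketch. It assumes, as an illustration rather than as part of the original derivation, that all N founder lineages are equally represented on the patch and that the partner is drawn uniformly at random; the function name `same_founder_pairing_rate` is hypothetical.

```python
import random


def same_founder_pairing_rate(n_founders, n_trials=100_000, seed=0):
    """Estimate how often a focal cell's partner shares its founder lineage.

    Assumption (illustrative): every founder lineage makes up an equal share
    of the patch, so the lineage of a random cell is uniform over founders.
    """
    rng = random.Random(seed)
    same = 0
    for _ in range(n_trials):
        focal_lineage = rng.randrange(n_founders)
        partner_lineage = rng.randrange(n_founders)
        if focal_lineage == partner_lineage:
            same += 1
    return same / n_trials


if __name__ == "__main__":
    N = 4
    # The estimate should be close to 1/N; its complement to (N-1)/N.
    print(same_founder_pairing_rate(N), 1 / N)
```

With a single founder (N = 1) every pairing is within the same lineage, which is the limiting case in which no between-lineage plasmid transfer can occur.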
Our mean fitness expression makes intuitive sense. Writing $x_{ab}$ for the frequency of genotype ab, it is expressed in terms of the post-transmission cooperator frequency, $x_{20} + x_{21} + x_{22} + x_{12} + (x_{22} + x_{12})\,x_{10}$.

A fraction of these individuals, given by 1/N, pair up with an individual who has descended from the same founder cell as themselves (and therefore also has the 22 genotype), and as a result, no plasmid transfer occurs. This gives rise to the second multiplier of 1/N.
The next three terms are arrived at through similar reasoning to the previous three, so we do not go over these explicitly. This takes us to our final term. To explain this term, we need to focus on pairings in the population that are between type-20 and type-12 individuals.

However, if our focal cell pairs up with an individual who has descended from a different founder cell to itself (and has genotype cd, which may or may not be the same as ab), we gain further information about the identities of the N founder cells. Specifically, in this case, there are two 'known' founder cells, which are type-ab and type-cd, and N-2 'unknown' founder cells.
Therefore, in a given scenario, there are either N-1 unknown founder cells (1 known), or N-2 unknown founder cells (2 known). The task now is to calculate, for each scenario, the (expected) genotypes of the 'unknown' founder cells. Then, using this information about the genotypes of the founder cells, we can calculate, for each scenario, the resulting (expected) patch cooperator frequency.
To reiterate, we take A as the set of known founder cells. Therefore, to take an arbitrary example, for a focal individual who has descended from a type-20 founder cell, and who pairs with an individual who has descended from a type-12 founder cell, our set A would equal {20,12}.
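The bookkeeping of 'known' and 'unknown' founders can be illustrated with a small sketch. It assumes that unknown founders are drawn independently from the global pool with probabilities equal to the global genotype frequencies, and it computes only the expected proportion of founders that carry a cooperation allele at founding. This is a simpler quantity than the patch cooperator frequency mA, which also accounts for later lifecycle stages such as plasmid transfer. The function names and the example frequencies are hypothetical.

```python
import random

GENOTYPES = ["10", "11", "12", "20", "21", "22"]
# Genotype "ab": a = chromosomal allele (1 defector, 2 cooperator),
# b = plasmid state (0 none, 1 defector plasmid, 2 cooperator plasmid).
# Cooperators carry a cooperation allele at either locus.
COOPERATORS = {"12", "20", "21", "22"}


def expected_founder_cooperators(known, freqs, n_founders):
    """Analytic expectation of the cooperator proportion among the N founders."""
    unknown = n_founders - len(known)
    known_coop = sum(g in COOPERATORS for g in known)
    global_coop = sum(freqs[g] for g in COOPERATORS)
    return (known_coop + unknown * global_coop) / n_founders


def simulated_founder_cooperators(known, freqs, n_founders,
                                  n_patches=50_000, seed=1):
    """Monte Carlo estimate of the same quantity, sampling unknown founders."""
    rng = random.Random(seed)
    weights = [freqs[g] for g in GENOTYPES]
    unknown = n_founders - len(known)
    total = 0.0
    for _ in range(n_patches):
        founders = list(known) + rng.choices(GENOTYPES, weights=weights, k=unknown)
        total += sum(g in COOPERATORS for g in founders) / n_founders
    return total / n_patches


if __name__ == "__main__":
    # Hypothetical global genotype frequencies (they sum to 1).
    freqs = {"10": 0.4, "11": 0.2, "12": 0.1, "20": 0.1, "21": 0.1, "22": 0.1}
    A = ["20", "12"]  # the example set of known founders from the text
    print(expected_founder_cooperators(A, freqs, 5))
    print(simulated_founder_cooperators(A, freqs, 5))
```

The analytic function uses linearity of expectation: each known founder contributes its certain cooperator status, and each unknown founder contributes the global cooperator frequency.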
We now let i be the number of chromosomal cooperators (genotypes 20 & 21) in the set of known founder genotypes (A). We let j be the number of plasmid cooperators (genotypes 22 & 12) in the set of known founder genotypes (A). We let k be the number of plasmid-free defectors (genotype 10) in the set of known founder genotypes (A). We let l be the number of double-locus defectors (genotype 11) in the set of known founder genotypes (A). The total number of known founders is then i + j + k + l, meaning there are N − (i + j + k + l) unknown founders.

Now, we let $P(n_{20}, n_{21}, n_{22}, n_{10}, n_{11}, n_{12})$ denote the probability that, in addition to our set (A) of 'known' individuals, a patch is founded by exactly $n_{20}$ type-20 individuals, $n_{21}$ type-21 individuals, $n_{22}$ type-22 individuals, $n_{10}$ type-10 individuals, $n_{11}$ type-11 individuals, and $n_{12}$ type-12 individuals. We can then write this probability as the probability mass function of a multinomial random variable.

We condition this on the numbers of the various types of 'known' founders, and divide through by N. This gives us an expression for the patch cooperator frequency (dividing through by N ensures that cooperator frequency is given relative to patch size, i.e. as a proportion of the patch). We can write this expression in terms of expectations and covariances, and evaluate it to obtain a general-form expression, which we can use to obtain the specific mA terms in the above equations. To obtain the specific mA term for a given set A, we simply substitute in the i, j, k and l values associated with the set. For instance, for the set {20,11}, we have i=1, j=0, k=0, l=1, which gives the patch cooperator frequency m{20,11}.

Having now obtained "closed" equations to describe how each genotype changes in frequency from the start of the generation to just before the 'plasmid loss' lifecycle stage (lifecycle stage 6) ($x_{ab}$ to $x_{ab}'$, where $x_{ab}$ denotes the frequency of genotype ab), we now write equations to describe how each genotype changes in frequency from just before the 'plasmid loss' lifecycle stage (lifecycle stage 6) to the end of the generation ($x_{ab}'$ to $x_{ab}''$). These equations specifically capture the evolutionary consequences of plasmid loss.
We take z to be 1 or 2 (z ∈ {1,2}). This means that az refers to a given plasmid-carrying genotype (i.e., it includes the genotypes a1 and a2), and a0 refers to a given plasmid-free genotype. Writing $x_{ab}'$ and $x_{ab}''$ for the frequency of genotype ab just before plasmid loss and at the end of the generation, respectively, the frequency of a given plasmid-carrying genotype at the end of the generation ($x_{az}''$), and the frequency of a given plasmid-free genotype at the end of the generation ($x_{a0}''$), can then be written:

$$x_{az}'' = (1 - s)\, x_{az}', \tag{14}$$

$$x_{a0}'' = x_{a0}' + s\,\left(x_{a1}' + x_{a2}'\right),$$

where s denotes the probability that a given plasmid-bearing individual loses its plasmid. Note that Equation 14 applies to genotypes with cooperative plasmids (a2) as well as genotypes with defector plasmids (a1).

Now, we can take our closed equations describing how genotype frequencies change from the start of the generation to just before plasmid loss ($x_{ab}$ to $x_{ab}'$), alongside our equations describing how genotype frequencies change from just before plasmid loss to the end of the generation ($x_{ab}'$ to $x_{ab}''$). This gives us a series of dynamically sufficient recursions, which describe how genotype frequencies change over an entire generation (from $x_{ab}$ to $x_{ab}''$).
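The plasmid-loss stage can be sketched in code as follows. The sketch assumes that each plasmid-bearing genotype az passes a fraction s of its frequency to the plasmid-free genotype a0 with the same chromosomal allele, which is how we read lifecycle stage 6. The function name `plasmid_loss` and the dictionary representation of genotype frequencies are illustrative choices, not part of the original model description.

```python
def plasmid_loss(freqs, s):
    """Apply one round of plasmid loss to genotype frequencies.

    freqs maps two-character genotype labels "ab" to frequencies, where a is
    the chromosomal allele ("1" or "2") and b the plasmid state
    ("0" = no plasmid, "1" = defector plasmid, "2" = cooperator plasmid).
    s is the probability that a plasmid-bearing individual loses its plasmid.
    """
    after = {}
    for a in ("1", "2"):
        carried = freqs[a + "1"] + freqs[a + "2"]
        # Plasmid-free class gains everyone who lost a plasmid.
        after[a + "0"] = freqs[a + "0"] + s * carried
        # Each plasmid-bearing class keeps the fraction that retained it.
        for z in ("1", "2"):
            after[a + z] = (1 - s) * freqs[a + z]
    return after


if __name__ == "__main__":
    before = {"10": 0.5, "11": 0.1, "12": 0.1, "20": 0.1, "21": 0.1, "22": 0.1}
    print(plasmid_loss(before, s=0.25))
```

Because plasmid loss only moves individuals between classes that share a chromosomal allele, total frequency and the frequency of each chromosomal allele are both conserved by this step.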

Solving the model
Having obtained dynamically sufficient recursions, we can now solve the model, to see what genotypes evolve. To do so, we assume that, initially, cooperation is absent from the population, and there are no plasmids. We then introduce the remaining five genotypes from rarity. To be specific, we take the following initial genotype frequencies: 0.999 for genotype 10, and 0.0002 for each of genotypes 11, 12, 20, 21 and 22. We then track the evolutionary process by numerically iterating our recursions over successive generations. We continue tracking evolution for 100,000 generations, at which point genotype frequencies are changing negligibly, meaning evolutionary equilibrium has been reached (either approximately or exactly).
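The numerical procedure described above can be sketched as a generic iteration loop. The initial frequencies are those stated in the text; everything else is an assumption for illustration. In particular, `toy_step` is a hypothetical placeholder that applies plasmid loss only, and it is not the model's full recursion (which also includes reproduction, plasmid transfer and selection).

```python
# Initial genotype frequencies as stated in the text (they sum to 1).
INITIAL = {"10": 0.999, "11": 0.0002, "12": 0.0002,
           "20": 0.0002, "21": 0.0002, "22": 0.0002}


def iterate(step, freqs, generations=100_000):
    """Iterate a one-generation recursion `step` for a fixed number of generations.

    Returns the final frequencies and the largest per-genotype change seen in
    the last generation, which indicates how close the system is to equilibrium.
    """
    max_change = 0.0
    for _ in range(generations):
        new = step(freqs)
        max_change = max(abs(new[g] - freqs[g]) for g in freqs)
        freqs = new
    return freqs, max_change


def toy_step(freqs, s=0.1):
    """Placeholder recursion: plasmid loss only. NOT the model's full recursion."""
    new = {}
    for a in ("1", "2"):
        new[a + "0"] = freqs[a + "0"] + s * (freqs[a + "1"] + freqs[a + "2"])
        new[a + "1"] = (1 - s) * freqs[a + "1"]
        new[a + "2"] = (1 - s) * freqs[a + "2"]
    return new


if __name__ == "__main__":
    final, change = iterate(toy_step, INITIAL, generations=2000)
    print(final, change)
```

Reporting the last-generation change alongside the final frequencies makes it easy to confirm that frequencies are changing negligibly by the end of the run; substituting the full one-generation recursion for `toy_step` would leave the driver unchanged.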

Appendix B: Derivation of plasmid relatedness in terms of model parameters.
This appendix derives a fully explicit expression for plasmid relatedness in terms of model parameters. Doing so reveals that the general relatedness expression derived by Mc Ginty et al.

General definition of relatedness
Relatedness defined at a plasmid locus is given by the following formula (Grafen, 1985; Frank, 1998; Gardner et al., 2011; Rousset, 2015). It measures genetic similarity at the plasmid locus. This is a "whole-group" relatedness and is interpretable as a regression (Pepper, 2000).
We will now derive an explicit version of this relatedness, expressed in terms of model parameters. To do so, we will gradually rewrite our relatedness expression, evaluating higher-level factors in terms of lower-level ones, until it is expressed completely in terms of model parameters.
It is convenient to write our relatedness expression in a slightly different form.
Specifically, we can write covariance and variance in terms of expectations (using standard identities). We can also recognise that the expected proportion of the cooperator plasmid on a patch, at a given point in time, will be equal to the expected proportion of the cooperator plasmid in an individual drawn from the patch. Next, we can substitute in explicit expressions for $I_1'$ (the frequency of the cooperator plasmid amongst descendants of founder 1), $I_1'^2$ and the remaining product term.

The frequency of the cooperator plasmid amongst descendants of founder 1 ($I_1'$) can be written explicitly. To interpret this expression, note that, if $I_1 = 1$, meaning the founder (1) had the cooperator plasmid, all descendants of founder 1 will inherit the cooperator plasmid by vertical transmission, leading to $I_1' = I_1$ (the second term is eliminated because $1 - I_1 - J_1 = 0$ whenever $I_1 = 1$, owing to the I / J relationship described above). Conversely, if $J_1 = 1$, meaning the founder (1)

It is convenient to write the fifth term of the right-hand side of Equation 20 slightly differently.
First, we expand it. Next, we will derive $\mathrm{E}[I_1'^2]$, which denotes the square of the expected proportion of lineage 1 descendants who have the cooperator plasmid, where the expectation is taken over all patches.
By taking an expectation over both the left- and right-hand sides of Equation 19, we obtain Equation 24. We will now evaluate the five terms comprising the right-hand side of Equation 24. The first term, $\mathrm{E}[I_1'^2]$, has already been evaluated (Equation 23).
We can use properties of the binomial distribution to write the second term explicitly. The third term can be expanded and then, using properties of the binomial distribution and simplifying, written as an explicit expression. The fourth and fifth terms can likewise be written explicitly using properties of the binomial distribution.

Full relatedness expression in terms of model parameters
By evaluating the terms on the right-hand side of Equation 17 using Equations 22-28, we arrive at a fully explicit relatedness expression (Equation 29).

We check, using simulation, that our relatedness expression (Equation 29) is correct. To do so, first we assume specific values for pC, pD, N and β. We obtain a simulated combined I+J array by drawing N values from a binomial distribution centred at pC+pD. We obtain a simulated plasmid cooperativeness array by drawing N values from a binomial distribution centred at pC/(pC+pD). We obtain a simulated I array by multiplying our I+J array by our plasmid cooperativeness array. We obtain a simulated J array analogously, by multiplying our I+J array by 1 − plasmid cooperativeness. We obtain a simulated I′ array by manipulating our I array using the following relation: I′ = I + …
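The first steps of this simulation check (construction of the I and J arrays) can be sketched as below. The sketch assumes that "drawing N values from a binomial distribution" means N independent 0/1 draws with the stated success probability, and it stops before the I′ array, whose defining relation is not reproduced here. The function name `simulate_founder_plasmids` and the example parameter values are hypothetical.

```python
import random


def simulate_founder_plasmids(p_c, p_d, n_founders, rng):
    """Simulate cooperator (I) and defector (J) plasmid indicators for N founders.

    p_c and p_d are the global frequencies of the cooperator and defector
    plasmid; p_c + p_d must be positive. Each founder carries at most one plasmid.
    """
    # Combined I+J array: 1 if the founder carries any plasmid.
    has_plasmid = [1 if rng.random() < p_c + p_d else 0 for _ in range(n_founders)]
    # Plasmid cooperativeness array: 1 if a carried plasmid would be cooperative.
    cooperative = [1 if rng.random() < p_c / (p_c + p_d) else 0
                   for _ in range(n_founders)]
    I = [h * c for h, c in zip(has_plasmid, cooperative)]
    J = [h * (1 - c) for h, c in zip(has_plasmid, cooperative)]
    return I, J


if __name__ == "__main__":
    rng = random.Random(42)
    I, J = simulate_founder_plasmids(p_c=0.2, p_d=0.3, n_founders=5, rng=rng)
    print(I, J)
```

Averaged over many simulated patches, the mean of the I array should approach pC and the mean of the J array should approach pD, which is a quick sanity check before using the arrays to estimate relatedness.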

Appendix C: Invasion criteria.
These derivations were first given in Dewar et al. (2021).
1) Invasion of a defector plasmid against non-cooperative hosts.
2) Invasion of a cooperator plasmid against cooperative hosts.
3) Invasion of a defector plasmid against cooperative hosts.

5) Region of parameter space where a cooperator plasmid has a (transient) selective advantage over a cheat plasmid (i.e., a selective advantage whilst plasmids are rare).
When plasmids are rare, relatedness at the plasmid locus is given by I

Appendix D: Plasmids versus greenbeards.
Greenbeard genes are genes that can recognise copies of themselves in other individuals, and then help and receive help from those individuals (Gardner & West, 2010;Madgwick et al., 2019). Greenbeard genes provide a possible way for cooperation to be favoured, in addition to standard genealogical relatedness (kin selection) and horizontal gene transfer. By only interacting with other individuals who also have the greenbeard gene, relatedness for greenbeard-encoded helping may be maximal (=1), meaning Hamilton's Rule has a good chance of being satisfied, leading to the greenbeard being favoured.
Greenbeards may help obligately or facultatively (Gardner & West, 2010). However, the "positive frequency dependence" argument for the rarity of obligate helping greenbeards is undermined by the possibility of population structure (Rousset, 2004; Gardner & West, 2010). Specifically, population structure can allow obligate helping greenbeards to be locally common, even though they are globally rare, removing the positive frequency dependent selection, and allowing them to be selected even at low global frequencies.
A reviewer wondered whether our argument, for why plasmid-mediated cooperation should be generally low, is analogous to the "positive frequency dependence" argument for why obligate helping greenbeards are rare. Furthermore, they wondered whether, like the "positive frequency dependence" argument, our argument about plasmid-mediated cooperation may be undermined by the possibility of population structure.
Fortunately, the possibility of population structure does not undermine our argument, for two reasons. Firstly, unlike obligate helping greenbeards, there is no positive frequency dependent selection on cooperative plasmids: the cooperative plasmids do not need to be above a certain frequency before they are favoured. In fact, the opposite is true: cooperative plasmids are most favoured at low frequencies, where plasmid relatedness is highest. It is negative frequency dependence that stops cooperative plasmids from reaching high frequencies and having an appreciable impact on bacterial populations. Secondly, we have already examined population structure (population structure is given by the inverse of the number of founders per patch, 1/N), and found that it does not qualitatively affect results in general.

Appendix E: Equilibria and non-equilibria.
We have often discussed evolutionary equilibria, even though in nature many plasmids will be in continuous flux, and never settle down to an evolutionary equilibrium. An additional reason for focusing on equilibria is that it is more natural to make points about the long-term steady state of a system (which is unique), rather than short-term states (which are unpredictable, being contingent on initial conditions). This is a pragmatic argument for referring to equilibria (which are knowable) rather than nonequilibrium states (which are generally not knowable, unless initial conditions are fixed).
However, we stress that we did not exclusively focus on equilibria. Indeed, an important part of our argument is about evolutionary trends that occur away from equilibrium states. For instance, we showed that cooperative plasmids often gain a transient advantage, allowing them to invade, before losing this advantage as plasmids reach high population frequency, causing them to ultimately be lost from the population. This is a detailed characterisation of nonequilibrium dynamics, and gives insight into why the final equilibrium (loss of the cooperative plasmid) comes about.