The following argument shows how a negative relation between ωna and synonymous diversity could arise. First, ωna is a component of KA/Kna and hence of KA (SI Appendix, Fig. S3, shows that genes with lower KA have lower ωna). Second, stronger selection can also reduce the strength of BGS for a single gene, leading to higher synonymous diversity (SI Appendix, Eq. S5). This pattern arises because weakly deleterious mutations reach higher equilibrium frequencies than more strongly selected mutations, so that a closely linked neutral mutation has a higher chance of association with a mutation that is destined to be eliminated from the population (5).
To make this analysis quantitative, we used both an exact summation formula and a more tractable, but approximate, integral method. Each method included the BGS effects of both NS and UTR sites; they are described by Eq. 1 of Materials and Methods and by SI Appendix, Eqs. S10b and S12, respectively. The equations take both gene conversion and crossing over into account and determine the mean value of E over all synonymous sites in a gene, where E is the negative of the natural logarithm of the ratio of the predicted diversity at a site, πS, to its value in the absence of BGS, π0. The larger E, the greater the reduction in diversity due to BGS. A subsidiary question is the extent to which the two methods for determining E agree.
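As a small illustration of the definition of E given above, the following sketch converts between E and the diversity ratio πS/π0 and averages per-site values over a gene. The function names and the example per-site values are hypothetical; the per-site E values themselves would come from Eq. 1 or SI Appendix, Eqs. S10b and S12, which are not reproduced here.

```python
import math

def e_from_diversity(pi_s, pi_0):
    """E = -ln(pi_S / pi_0); a larger E means a larger reduction in diversity."""
    return -math.log(pi_s / pi_0)

def diversity_ratio(e_value):
    """Invert the definition: pi_S / pi_0 = exp(-E)."""
    return math.exp(-e_value)

def mean_e(per_site_e):
    """Arithmetic mean of E over the synonymous sites of a gene."""
    return sum(per_site_e) / len(per_site_e)

# Hypothetical per-site E values, for illustration only.
example_e = [0.05, 0.10, 0.15]
gene_e = mean_e(example_e)
print(f"mean E = {gene_e:.3f}, exp(-mean E) = {diversity_ratio(gene_e):.3f}")
```

Because E is a logarithm, exp(-mean E) is the geometric mean of the per-site ratios πS/π0, not their arithmetic mean; the two are close when E varies little across sites.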
We calculated the mean E value for a gene, using a broad range of assumed ωna values of NS mutations. For the summation results, we assumed the “standard” D. melanogaster gene model (27), described in Materials and Methods before Eq. 1. This model has five exons of 100 codons each, interrupted by four introns of 100 bp. For the integral method, we assumed 500 codons without introns. We also assumed gamma distributions of the selective effects of deleterious mutations, with a shape parameter, β, of 0.3 for both UTRs and NS sites, because this is a typical value from estimates of the DFE (SI Appendix, Tables S2 and S3). We assumed ωna = 0.15 for 3′- and 5′-UTRs, regardless of the value of ωna for NS sites; this value is also consistent with the DFE-α results. We assumed u = 4.5 × 10⁻⁹ for the mean mutation rate per base pair, which is in the midrange of values from direct estimates for single nucleotide mutations in D. melanogaster (28, 29). We then used SI Appendix, Eq. S13, to calculate the mean selection coefficients for NS and UTR mutations from the assigned values of ωna for NS and UTR sites, assuming an effective population size of 10⁶, by applying equation 23 of ref. 30.
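The step from an assigned ωna to a mean selection coefficient can be sketched numerically as follows. This is not SI Appendix, Eq. S13, nor equation 23 of ref. 30, neither of which is reproduced here. It is an illustrative stand-in that assumes semidominant deleterious mutations whose fixation probability relative to a neutral mutation is S/(exp(S) − 1), with S = 4Nes, and it replaces the analytical integral over the gamma distribution with a fixed set of Monte Carlo draws. All function names, the sample size, and the search bounds are hypothetical.

```python
import math
import random

BETA = 0.3   # gamma shape parameter (from the text)
NE = 1e6     # effective population size (from the text)

def relative_fixation(S):
    """Fixation probability of a deleterious mutation relative to a neutral
    one, S / (exp(S) - 1), where S = 4*Ne*s (assumed scaling)."""
    if S < 1e-12:
        return 1.0   # neutral limit
    if S > 700.0:
        return 0.0   # effectively zero; also avoids floating-point overflow
    return S / math.expm1(S)

def unit_mean_gamma_sample(shape, n=5000, seed=1):
    """Fixed draws from a gamma distribution with the given shape and mean 1."""
    rng = random.Random(seed)
    return [rng.gammavariate(shape, 1.0 / shape) for _ in range(n)]

UNIT_SAMPLE = unit_mean_gamma_sample(BETA)

def omega_na(mean_S, sample=UNIT_SAMPLE):
    """Monte Carlo estimate of the mean relative fixation rate when S is
    gamma distributed with mean mean_S (the draws are rescaled by mean_S)."""
    return sum(relative_fixation(mean_S * x) for x in sample) / len(sample)

def mean_S_from_omega(target, lo=1e-3, hi=1e7, iterations=60):
    """Bisection on log(mean S); omega_na decreases as mean S increases."""
    log_lo, log_hi = math.log(lo), math.log(hi)
    for _ in range(iterations):
        mid = 0.5 * (log_lo + log_hi)
        if omega_na(math.exp(mid)) > target:
            log_lo = mid   # selection too weak: raise mean S
        else:
            log_hi = mid
    return math.exp(0.5 * (log_lo + log_hi))

mean_S = mean_S_from_omega(0.15)   # omega_na value assumed for UTRs in the text
mean_s = mean_S / (4.0 * NE)
print(f"mean S = {mean_S:.1f}, mean s = {mean_s:.2e}")
```

Reusing the same rescaled draws for every trial value of the mean makes the estimated ωna a deterministic, monotonically decreasing function of the mean, so plain bisection is sufficient. The numbers it prints depend on the assumed fixation formula and on the random seed, and should not be read as the values used in the paper.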
The crossing-over rate per base pair, rc, was set to the standard value of 1 × 10⁻⁸ for D. melanogaster in regions with nonzero rates of crossing over, averaging over the two sexes (19). A recent whole-genome sequencing analysis of a single cross (23) gave estimates of the rate of initiation of noncrossover gene conversion events (gc) and of the mean tract length (dg) of 1 × 10⁻⁸ per base pair and 440 bp, respectively, whereas a recombinant inbred line experiment (22) gave gc = 5 × 10⁻⁸ and dg = 500 bp. Because these estimates differ considerably, we generated results for both sets of values.
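To give a feel for how much the two sets of gene conversion estimates differ, the sketch below compares the net recombination rate between two sites d base pairs apart under each set. It assumes one commonly used approximation for exponentially distributed tract lengths, r(d) = rc·d + 2·gc·dg·[1 − exp(−d/dg)]. Both this functional form and the factor of 2 are assumptions made for illustration; the expression in Eq. 1 of Materials and Methods may be parameterized differently. The function and dictionary names are hypothetical.

```python
import math

RC = 1e-8  # crossing-over rate per bp (from the text)

# (gc, dg) pairs quoted in the text
GC_PARAMS = {
    "single cross (ref. 23)": (1e-8, 440.0),
    "recombinant inbred lines (ref. 22)": (5e-8, 500.0),
}

def net_recombination(d, rc, gc, dg):
    """Assumed net recombination rate between sites d bp apart:
    crossing over (rc*d) plus gene conversion, 2*gc*dg*(1 - exp(-d/dg))."""
    return rc * d + 2.0 * gc * dg * -math.expm1(-d / dg)

for label, (gc, dg) in GC_PARAMS.items():
    for d in (100, 1000, 10000):
        r = net_recombination(d, RC, gc, dg)
        print(f"{label}: d = {d} bp, net rate = {r:.3e}")
```

Under this approximation the gene conversion term saturates at 2·gc·dg once d greatly exceeds the tract length, so gene conversion matters most for sites within a gene, which is the scale relevant to the single-gene BGS calculations described here.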
Fig. 2 shows the BGS effects caused by NS mutations alone, as well as the joint BGS effects of NS and UTR mutations. Mean E increases with ωna for NS sites, implying that πS declines with ωna, consistent with the properties of SI Appendix, Eq. S5. The relation between mean E and ωna is close to linear, tailing off somewhat at high values of ωna. The integral model gives slightly larger estimates of mean E for a gene than the summation model. Both models are sensitive to the gene conversion parameters, with the smaller gc and dg values giving substantially stronger effects than the larger values, as would be expected from the lower net recombination rates. These results show that BGS can indeed have larger effects on genes with larger values of ωna and hence KA, and is therefore a possible contributory factor to the negative relation between synonymous site diversity and KA. An intuitive explanation for this pattern was given at the beginning of this section.