Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
[…]
(1) The authors claim that the negative frequency dependence that maintains polymorphism in their model results from a non-linear relationship between the display trait and sexual success [...] Maybe I missed something, but the authors do not provide support for their claim about the negative frequency-dependence of sexual selection in their simulations. To do so they could (1) extract the relationship between the relative mating success of the two male types from the simulations and (2) demonstrate that polymorphism is not maintained if the relationship between male display trait and mating success is linear.
We believe that there is a confusion of terminology here. We agree that for the two alleles at a locus impacting male display in our model, the allele conferring inferior display quality will have a fitness that increases as its frequency increases, so this allele displays positive frequency dependent fitness. And, the alternate, display-favoring allele at the locus does display negative frequency dependence. Our use of the terminology ‘negative frequency dependence’ was meant to refer to the negative dependence of the fitness of the display-favoring allele with respect to its own frequency. However, a significant body of literature instead discusses models in which both an allele and its alternate(s) are beneficial when at low frequency and deleterious when at high frequency under the same selective challenge, entailing negative frequency dependence of fitness for all alleles involved. This benefit-when-rare model of a single trait is often described simply as negative frequency dependence, and generates balancing selection at the locus, but is not the model we are presenting here, and does not encompass all models involving negative frequency dependent fitness. This lexical expectation may make the interpretation of our work more difficult, and we have amended the manuscript to make our model clearer (lines 227-231). In this model, we have a negative frequency dependence for the fitness of the display-favoring allele in mate competition, but the net selective disadvantage of this allele at high frequency is due to a cost in another, pleiotropic, fitness challenge: the constant survival effect. So, the alleles are under balancing selection where alternate alleles are favored by selection when rare, but not due solely to selection during mate competition. Instead, our model relies on pleiotropy for an emergent form of frequency-dependent balancing selection (in the sense that each allele is predicted to be beneficial on balance when rare).
In the reviewer’s model of the success of two alleles at one locus, the ratio of success is vaguely linear with allele frequency for n=3, though it starts quite convex and has an inflection point between convex and concave segments (for the disfavored allele) at p≈0.532. This is visualized easily by plotting the function and its derivatives in Wolfram-Alpha. For n>=4, the fitness function with respect to the display-favoring/disfavoring allele becomes increasingly concave/convex respectively, and this specific nonlinearity is needed to act along with the antagonistic pleiotropy to maintain balancing selection, rather than being maintained by a model that favors any rare allele on the basis of its rarity in some manner. In an attempt to make the importance of the encounter number parameter clearer, we’ve generated new panels for Figure S1 which simulate encounter numbers 2, 3, and 4, and we have updated corresponding text and figure references in lines 335-338.
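The reviewer's model is not reproduced in this response, so the following is only a sketch under an assumed rule: each female mates with the best of n randomly encountered males, and a display-favored male always beats a disfavored one. The function names and the bisection bracket are hypothetical, and the bracket has only been chosen with n=3 in mind. Under this assumption, the inflection of the success ratio for the disfavored type should land near the quoted p≈0.532.

```python
def success_ratio(q, n):
    """Per-capita mating success of the disfavored type divided by that of the
    favored type, assuming the best of n randomly encountered males always wins.
    q is the frequency of the disfavored type (hypothetical best-of-n rule)."""
    disfavored = q ** (n - 1)          # wins only if all n males are disfavored: q**n / q
    favored = (1 - q ** n) / (1 - q)   # wins otherwise, shared among favored males
    return disfavored / favored


def second_derivative(f, x, h=1e-4):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h ** 2


def inflection_point(n, lo=0.3, hi=0.8, steps=60):
    """Bisect on the sign change of the second derivative within [lo, hi]."""
    def curvature(q):
        return second_derivative(lambda x: success_ratio(x, n), q)

    for _ in range(steps):
        mid = (lo + hi) / 2
        if curvature(lo) * curvature(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


print(round(inflection_point(3), 3))
```

Calling `success_ratio` over a grid of q for n=2, 3, and 4 gives a quick view of how the curvature changes with encounter number, again only for this assumed mating rule.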
For (1-2), it is not clear how to modify the simulation such that the relationship between the trait value and mating success can be perfectly linear - either linear with respect to allele frequency in a one locus model or linear with respect to trait value at a specific population composition, without removing the simulation of mate competition altogether. While it may be of interest to explore a more comprehensive range of biological trade-offs in future studies, we are not able to meaningfully do so within the context of the present manuscript.
(2) The authors only explore versions of the model where the survival costs are paid by females or by both sexes. We do not know if polymorphism would be maintained or not if the survival cost only affected males, and thus if sexual antagonism is crucial.
We now present simulations with male costs only as added panels to Figure S1 and mention these results in the main text (lines 334-335). Maintenance of the polymorphism is significantly reduced or completely absent in such simulations.
(3) The authors assume no cost to aneuploidy, with no justification. Biologically, investment in aneuploid eggs would not be recoverable by Drosophila females and thus would potentially act against inversions when they are rare.
We did offer some discussion and justification of our decision to model no inherent fitness cost of the inversion mutation itself, specifically from aneuploidy, in lines 36-39 and 78-80 of the original reviewed preprint. Previous research suggests that D. melanogaster females may not actually invest in aneuploid eggs generated from crossover within paracentric inversions. While surprising, and potentially limited to a subset of clades, many ‘r-selected’ taxa or those in which maternal investment is spread out over time may have some degree of reproductive compensation for non-viable offspring, which can reduce the costs of generating aneuploids significantly (for example, t-haplotypes in mice). We have added this example and citation to lines 34ff in the current draft.
(4) The authors appear to define balanced polymorphism as a situation in which the average allele frequency from multiple simulation runs is intermediate between zero and one (e.g., Figure 3). However, a situation where 50% of simulation runs end up with the fixation of allele A and the rest with the fixation of allele B (average frequency of 0.5) is not a balanced polymorphism. The conditions for balanced polymorphism require that selection favors either variant when it is rare.
We originally chose mean final frequency for presenting the single locus simulations based on the ease of generating a visual plot that included information on fixation vs loss and equilibrium frequency. Figure 3 and related supplemental images have been changed to now also represent the proportion of simulations retaining polymorphism at the locus in the final generation.
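The distinction raised in comment (4) can be illustrated with a small sketch. The run outcomes below are made-up values for illustration, not simulation results from the manuscript, and the function name is hypothetical. It shows that two sets of runs can share a mean final frequency of 0.5 while differing completely in the proportion of runs that retain polymorphism.

```python
def summarize_runs(final_freqs, tol=0.0):
    """Return (mean final frequency, proportion of runs still polymorphic).

    A run counts as polymorphic if its final allele frequency lies strictly
    between tol and 1 - tol.
    """
    n = len(final_freqs)
    mean_freq = sum(final_freqs) / n
    polymorphic = sum(1 for p in final_freqs if tol < p < 1 - tol)
    return mean_freq, polymorphic / n


# Hypothetical outcomes: half of the runs lose the allele, half fix it.
fixed_or_lost = [0.0, 1.0] * 50
# Hypothetical outcomes: every run ends at an intermediate frequency.
balanced = [0.5] * 100

print(summarize_runs(fixed_or_lost))
print(summarize_runs(balanced))
```

Reporting both numbers side by side, as the revised Figure 3 is described as doing, separates these two cases.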
(5) Possibly the most striking result of the experiment is the fact that for 14 out of 16 combinations of inversion x maternal background, the changes in allele frequencies between embryo and adult appear greater in magnitude in females than in males irrespective of the direction of change, being the same in the remaining two combinations. The authors interpret this as consistent with sexually antagonistic pleiotropy in the case of In(3L)Ok and In(3R)K. The frequencies of adult inversion frequencies were, however, measured at the age of 2 months, at which point 80% of flies had died. For all we know, this may have been 90% of females and 70% of males that died at this point. If so, it might well be that the effects of inversion on longevity do not systematically differ between the ages and the difference in Figure 9B results from the fact that the sample includes 30% longest-lived males and 10% longest-lived females.
This critique deserves some consideration. The aging adults were separated by sex during aging, but while we recorded the number of survivors, we did not record the numbers of eclosed adults and their sexes initially collected out of an interest in maintaining high throughput collection. We therefore cannot directly calculate the associated survival proportions, but we can estimate them. We collected 1960 females and 3156 males, and we can very roughly estimate survival if we assume that equal numbers of each sex eclosed, and that the survivors represent 20% of the original population. That gives 12790 individuals per sex, or 84.7% female mortality and 75.3% male mortality.
So, we have added a qualification discussing the possibility of stronger selection on females and its influence on observed sex-specific frequency changes, on lines 602-605.
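The rough mortality estimate above can be reproduced as follows. The counts are those quoted in the response; the 20% overall survival and the equal sex ratio at eclosion are the stated assumptions, and the variable names are ours.

```python
# Survivor counts quoted in the response.
females_surviving = 1960
males_surviving = 3156

# Assumptions stated in the response: survivors are 20% of eclosed adults,
# and equal numbers of each sex eclosed.
assumed_survival = 0.20

total_survivors = females_surviving + males_surviving
total_eclosed = total_survivors / assumed_survival
per_sex_eclosed = total_eclosed / 2

female_mortality = 1 - females_surviving / per_sex_eclosed
male_mortality = 1 - males_surviving / per_sex_eclosed

print(f"{per_sex_eclosed:.0f} eclosed per sex")
print(f"female mortality {female_mortality:.1%}, male mortality {male_mortality:.1%}")
```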
(6) Irrespective of the above problem, survival until the age of 2 months is arguably irrelevant from the viewpoint of fitness consequences and thus maintenance of inversion polymorphism in nature. It would seem that trade-offs in egg-to-adult survival (as assumed in the model), female fecundity, and possibly traits such as females resistance to male harm would be much more relevant to the maintenance of inversion polymorphisms.
Adult Drosophila will continue to reproduce in good conditions until mortality, and the estimated age of a mean reproductive event for a Drosophila melanogaster individual is 24 days (Pool 2015), and likewise for D. simulans (Turelli and Hoffman 1995). Given that reproduction is centered around 24 days, we expect sampling at 2 months of age to still be relevant to fitness. In seasonally varying climates, either temperate or with a long dry season, survival through challenging conditions is expected to require several months. In many such cases, females are in reproductive diapause, and so longevity is the main selective pressure. See lines 931-936 in the revised manuscript.
As we agreed above, it would be of interest to investigate a wider range of trade-offs in future studies. We focused here on the balance between survival and male reproductive success because the latter trait generates negative frequency dependence for display-favoring alleles and a disproportionate skew towards higher quality competitors, whereas many other fitness-relevant traits lack that property.
(7) The experiment is rather minimalistic in size, with four cages in total; given that each cage contains a different female strain, it essentially means N=1. The lack of replication makes statements like "In(2L)t and In(2R)NS each showed elevated survival with all maternal strains except ZI418N" (l. 493) unsubstantiated because the claimed special effect of ZI418N is based on a single cage subject to genetic drift and sampling error. The same applies to statements on inversion x female background interaction (e.g., l. 550), as this is inseparable from residual variation. It is fortunate that the most interesting effects appear largely consistent across the cages/female backgrounds. Still, I am wondering why more replicates had not been included.
Our experimental approach might be described as “diversity replication”. Essentially, the four maternal genetic backgrounds are serving dual purposes – both to assess experimental consistency and to ensure that our conclusions are not solely driven by a single non-representative genotype (which in so many published studies, can not be ruled out). It would indeed be interesting if we could have quadrupled the size of our experiment by having four replicates per maternal background. However, we suspect the reviewer may not recognize the substantial effort involved in our four existing experiments. Each of these involved collecting 500+ virgin females, hand-picking thousands of embryos during the duration of egg-laying, and repeatedly transferring offspring to maintain conditions during aging, such that cages had to be staggered by more than a month. These four cages took a year of benchwork just to collect frozen samples, before any preparation and quality control of the associated amplicon libraries for sequencing. Adding a further multiplier would take it well beyond the scope of a single PhD thesis. Fortunately, we were able to obtain the key results of interest without that additional effort, even if clearer insights into the role of maternal background would also be of strong interest.
We do agree that no firm conclusions about maternal background can be reached without further replication, and so we have qualified or removed relevant statements accordingly (lines 568ff, 620-622).
Reviewer #1 (Recommendations For The Authors):
The description of the model is confusing and incomplete, e.g., the values of several parameters used to obtain the numerical results are not given. It is first stated (l. 223) that the model is haploid, but text elsewhere talks about homozygotes and heterozygotes. If the model is diploid (this in itself is not clear), what is assumed about dominance?
We are not presenting results for a mathematical model estimated numerically. We have now clarified our transition from a conceptual depiction of our model, in which we use haploid representations for simplified presentation, to our forward population genetic simulations, which are entirely diploid. More broadly, we have improved our communication of the assumptions and parameters used in our simulations. The scenarios we investigate involve purely additive trait effects within and between loci (except that survival probabilities are multiplicative to avoid negative values). We think that considering other dominance scenarios would be a worthy subject for a follow-up study, whereas the present manuscript is already covering a great deal of ground.
Similarly, it is hard to understand the design (l.442ff). I was confused as to whether a population was set up for each inversion or for all of them and what the unit or replication was. I found the description in Methods (l. 763-771) much clearer and only slightly longer; I suggest the authors transfer it to the Results. Also, Figure 8 should contain the entire crossing scheme; the current version is misleading in that it implies males with only two genotypes.
All four tested inversions were segregating within the same karyotypically diverse population of males, and were assayed from the same experiments. We have attempted to improve the relevant description. For Figure 8, we had trouble conceiving a graphic update that contained a more complete cross scheme without seeming much more confused and cluttered. We have tried to clarify in the relevant text and the figure caption instead.
There are a number of small issues that should be addressed:
- No epistasis for viability assumed - what would be the consequence?
We explored a model in which we intentionally included no terms for epistatic effects on phenotype. All epistasis with regard to fitness is emergent from competition between individuals with phenotypes composed of non-epistatic, non-dominant genetic effects. So, the simplest model of antagonism would have no epistasis for viability whatsoever. One could explore a model that has emergent viability epistasis in a similar way, by implementing stabilizing selection on a quantitative trait with a gaussian or similar non-linear phenotype-to-fitness map, but that might be better served as a topic for a future study. We have, however, tried to make this intent clearer in the text.
l. 750 implies that aneuploidy generated by the inversion has no cost (aneuploid gametes are resampled).
Yes, as addressed in public review item (3). Alternately see lines 34ff, 293, 369, 392 for in-text edits.
l. 24-25: unclear; is this to mean that there is haplotype x sex interaction for survival?
l. 25: success in what? (I assume this will be explained in the paper, but the abstract should stand on its own).
l. 193-4: "producing among most competitive males": something missing or a word too much??
Figure 1B,C: a tiny detail, but the plots would be more intuitive if the blue (average) bars were after (i.e., to the right) of the male and female ones, given that the average is derived from the two sex-specific values.
Each of the above has been edited or implemented as suggested.
l. 205. It is convex function, but I do not understand what the authors mean by "convex distribution".
Hopefully the updated text is clearer: “yielding a distribution of male reproductive output that follows a relatively convex trend”.
l. 223ff: some references to Fig 1 panels in this paragraph seem off by one letter (i.e., A should be B, etc.).
l. 231 "fitness...are equally fit": rephrase
l. 260: maybe "thrown out" is not the most fortunate term, maybe "eliminated" would be better?
Each of the above has been edited or implemented as suggested.
Figure 3: I do not understand the meaning of "additive" and "multiplicative" in the case of a single locus haploid model
All presented simulations are diploid, and these refer to the interactions between the two alleles at the locus. Hopefully the language is overall clearer in this draft.
l. 274: "Mutation of new nucleotide" meaning what? Or is it mutation _to_ a new nucleotide?
Hopefully the revised text is clearer.
Figure 5. The right panel of figure 5A implies that, with the inversion, the population evolves to an extreme display trait that is so costly that it kills 95% of all individuals (or of all females? What is assumed about this here?). Apart from the biological realism of this result, what does it say about the accumulation of polymorphism and maintenance of the inversion? The graphs in fig 5B do plot a divergence between haplotypes, but it is not clear how they relate to those in panel A - the parameter values used to generate these plots are again not listed. Furthermore, from the viewpoint of the polymorphism, it would be good to report the frequencies at the steady-state.
We have now clarified the figure description, including the parameter values used. The distribution of frequencies at the end of the simulation is represented in figure 6. Given that we set up the simulation with assumptions that are otherwise common to population models, what biological process would prevent this extreme? Why isn’t this extreme observed in natural populations? One possible explanation is that they become sex chromosomes, with increasing likelihood as the cost increases. Or other compensatory changes may occur that we don’t simulate, like regulatory evolution giving a complementary phenotype. Maybe genetic constraints in natural populations prevent the mutation of the kind of pleiotropic mutations that drive this dynamic. The populations still survive, though they are parameterized by relative fitness. What would an absolute fitness population function be? Would it go extinct or not? It would be of interest to explore a wider range of models, but it is the purpose of this paper to establish that this is a viable model for the maintenance of sexually antagonistic polymorphism and association with inversions. We have added a paragraph motivated by this comment to the Discussion starting on line 765.
l. 401-2: Z-like, W-like : please specify you are talking about patterns resembling sex chromosomes.
l. 738: "population calculates"?
l. 743-4 and 746-7: is this the same thing said twice, or are there two components of noise?
l. 357: there is no figure 5C.
Each of the above has been addressed with text edits.
L. 473-5: Yes, the offspring did not contain inversion homozygotes, but the sire pool did, didn't it? So homozygous inversions may have affected male reproductive success. Anyway, most of this paragraph (from line 473) seems to belong in Discussion rather than Results.
We have revised this sentence to focus on offspring survival.
We can understand the reviewer’s suggestion about Results vs. Discussion text. While this can often be a challenging balance, we find that papers are often clearer if some initial interpretation is offered within the Results text. However, we moved the portion of this paragraph relating our findings to the published literature to the Discussion.
l. 516: " In(3L)Ok favored male survival": this is misleading/confusing given the data, " In(3L)Ok reduced female survival more strongly than male survival..."
Hopefully the phrasing is clearer now.
l. 663ff: I did not have an impression that this section added anything new and could safely be cut.
We have done some editing to make this more concise and emphasize what we think is essential, but we believe that the model of an autosomal, sexually antagonistic inversion differentiating before contributing to the origin of a sex chromosome is novel and interesting, and that this additional emphasis is worthwhile to encourage consideration of this idea in future research.
l. 751: "flat probability per locus": do the authors mean a constant probability?
Edited.
Reviewer #2 (Public Review):
The manuscript lacks clarity of writing. It is impossible to fully grasp what the authors did in this study and how they reached their conclusions. Therefore, I will highlight some cases that I found problematic.
Hopefully the revised manuscript improves writing clarity.
Although this is an interesting idea, it clearly cannot explain the apparent influence of seasonal and clinal variation on inversion frequencies.
We do not believe that our model predicts a non-existence of temporal and spatial dependence of the fitness of inverted haplotypes, nor do we seek to identify the manner in which seasonal and clinal differences affect fitness of inverted haplotypes. Rather, we argued that the influence of seasonal and clinal selection on inversions does not on its own predict the observed maintenance of inversions at low to intermediate frequencies across such a diverse geographic range, along with the higher frequencies of many derived inversions in more ancestral environments.
We might imagine that trade-offs between life history traits such as mate competition and survival should be universal across the range of an organism. But in practice, the fitness benefits and costs of a pleiotropic variant (or haplotype) may be heavily dependent on the environment. A harsh environment such as a temperate winter may both reduce the number of females that a male encounters (decreasing the benefit of display-enhancing variants) and also increase the likelihood that survival-costly variants lead to mortality (thus increasing their survival penalty). In light of such dynamics, our model would predict that equilibrium inversion frequencies should be spatially and temporally variable, in agreement with a number of empirical observations regarding D. melanogaster inversions.
We have edited the introduction to emphasize that inversion frequencies vary spatially as well as seasonally, on lines 144ff. We also note relevant discussion of the potential interplay between the environment and trade-offs such as those we investigate, on lines 153-155.
The simulations are highly specific and make very strong assumptions, which are not well-justified.
We respond to all specific concerns expressed in the Recommendations For The Authors section below. We also note that we have made further clarifications throughout the text regarding the assumptions made in our analysis and their justification.
Reviewer #2 (Recommendations For The Authors):
I think that the manuscript would greatly benefit from a major rewrite and probably also a reanalysis of the empirical data.
In particular, a genome-wide analysis of differences in SNP frequencies between sexes and developmental stages would help the reader to appreciate that inversions are special.
[moved up within this section for clarity] We are lacking a genomic null model: how often do the authors see similar allele frequency differences when looking at the entire genome? This could be easily done with whole genome Pool-Seq and would tell us whether inversions are really different from the genomic background. I think that this information would be essential given the many uncertainties about the statistical tests performed.
We expect that autosome-wide SNP frequencies will be heavily influenced by the frequencies of inversions, which occur on all four major autosomal chromosome arms. These inversions often show moderate disequilibrium with distant variants (e.g. Corbett-Detig & Hartl 2012).
Furthermore, the limited number of haplotypes present, given that the paternal population was founded from 10 inbred lines, would further enhance associations between inversions and distant variants. Therefore, we do not expect that whole-genome Pool-Seq data would provide an appropriate empirical null distribution for frequency changes. Instead, we have generated appropriate null predictions by accounting for both sampling effects and experimental variance, and we have aimed to make this methodology clearer in the current draft.
Some basic questions:
why start at a frequency of 50% (line 287)?
Isn't it obvious that in this scenario strong alleles with sexually antagonistic effects can survive?
The initial goal of the associated Figure 4 was not to show that a strongly antagonistic variant could persist. Instead, we wanted to test the linkage conditions in which a second, relatively weaker antagonistic variant survived – which did not occur in the absence of strong linkage.
We have now added simulations with lower initial frequencies, in which the weaker variant and the inversion both start at 0.05 frequency, while the stronger variant is still initialized at 0.5 to reflect the initial presence of one balanced locus with a strongly antagonistic variant. Here, the weaker antagonistic variant is still usually maintained when it is close to the stronger variant. While the inversion-mediated maintenance of the weaker variant at greater distance from the stronger variant becomes less frequent than in the originally investigated case, it still happens often enough to hypothetically allow for such outcomes over evolutionary time-scales.
Still, we should also emphasize that the goals of this proof-of-concept analysis are to establish and convey some basic elements of our model. Subsequently, analyses such as those presented in Figures 5 and 6 provide clearer evidence that the hypothesized dynamics of inversions facilitating the accumulation of sexual antagonism actually occur in our simulations.
The experiments seem to be conducted in replicate (which is of course essential), but I could not find a clear statement of how many replicates were done for each maternal line cross.
How did the authors arrive at 16 binomial trials (line 473)? 4 inversions, 4 maternal genotypes?
How were replicates dealt with?
In Figure 9, it would be important to visualize the variation among replicates.
Unfortunately, we did not have the bandwidth to perform replicates of each maternal line. Instead, we use four maternal backgrounds to simultaneously establish consistency across independent experiments and genetic backgrounds (see our response to Reviewer 1, point 7). We’ve edited the draft to make this clearer and more clearly delineate what is supported and not supported by our data. Replicate variation for the control replicates of the extraction and sequencing process, and the exact read counts of the experiment, are available in Supplemental Tables S5, S6, and S7.
The statistical analysis of trade-off is not clear: which null model was tested? No frequency change? In my opinion, two significances are needed: a significant difference between parental and embryo and then embryo and adult offspring. The issue with this is, however, that the embryo data are used twice and an error in estimating the frequency of the embryos could be easily mistaken as antagonistic selection.
Hopefully the description of our null model is clearer in the text, now starting around line 967 in the Methods. We are aware of the positive dependence when performing tests comparing the paternal to embryo and then embryo to offspring frequencies, and this is accounted for by our analysis strategy - see lines 1009-1012.
It was not clear how the authors adjusted their chi-squared test expectations. Were they reinventing the wheel? There is an improved version of the chi-squared test, which accounts for sampling variation.
We did not actually perform chi-square tests. Instead, we used the chi statistic from the chi-squared test as a quantitative summary of the differences in read counts between samples. We compared an observed value of chi to values for this statistic obtained from simulated replicates of the experiment. Sampling from this simulation generated our ‘expected’ distribution of read counts, sampled to match sources of variance introduced in the experimental procedure, but without any effect of natural selection, per lines 825ff in the original submission. Hence, we are approximating the likelihood of observing an empirical chi statistic by generating random draws from a model of the experiment and comparing values calculated from each draw to the experimental value: a Monte Carlo method of approximating a p-value for our data. We have attempted to make the structure of these simulations and their use as a null-model clearer in this draft.
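The logic of this Monte Carlo comparison can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in the manuscript: the function names, the two-stage binomial null (individuals, then reads), and the add-one p-value estimator are all hypothetical choices, and the summary statistic shown is the Pearson sum, which may differ from the exact chi statistic we computed. The expected frequency must lie strictly between 0 and 1.

```python
import random


def chi_statistic(observed, expected):
    """Pearson-type summary: sum over categories of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def binomial_draw(n, p, rng):
    """Number of successes in n Bernoulli(p) trials."""
    return sum(rng.random() < p for _ in range(n))


def simulate_null_counts(n_reads, freq, n_individuals, rng):
    """One null replicate without selection: sample individuals, then reads."""
    sampled = binomial_draw(n_individuals, freq, rng)
    inverted_reads = binomial_draw(n_reads, sampled / n_individuals, rng)
    return [inverted_reads, n_reads - inverted_reads]


def monte_carlo_p_value(observed, freq, n_individuals, n_sims=2000, seed=1):
    """Share of null replicates whose statistic is at least the observed one."""
    rng = random.Random(seed)
    n_reads = sum(observed)
    expected = [n_reads * freq, n_reads * (1 - freq)]
    observed_stat = chi_statistic(observed, expected)
    hits = 0
    for _ in range(n_sims):
        simulated = simulate_null_counts(n_reads, freq, n_individuals, rng)
        if chi_statistic(simulated, expected) >= observed_stat:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_sims + 1)


# Hypothetical read counts [inverted, standard] against a 0.5 starting frequency.
print(monte_carlo_p_value([70, 30], freq=0.5, n_individuals=200))
```

Because the null replicates carry the same sampling layers as the experiment, the observed statistic is compared against a distribution that already includes that variance, which is the reason for not relying on the tabulated chi-squared distribution.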
It is not sufficiently motivated why the authors model differences in the extraction procedure with a binomial distribution.
Adding a source of variance here seemed necessary as running control sequencing replicates revealed that there was residual variance not fully recapitulated by sample-size-dependent resampling. Given that we were still sampling a number of draws from a binomial outcome (the read being from the inverted or standard arrangement), a binomial distribution seemed a reasonable model, and we fit the level of this additional noise source to an experiment-wide constant, read-count or genome-count independent parameter that best fit the variance observed in the controls (lines 830ff in the original draft). Clarification is made in this manuscript draft, lines 979-989.
How many reads were obtained from each amplicon? It looks like the authors tried to mimic differences between technical replicates by a binomial distribution, which matches the noise for a given sample size, but this depends on the sequence coverage of the technical replicates.
We provide read counts in Supplemental Tables S6 and S7. The relevant paragraph in the methods has been edited for clarity, lines 972ff. Accounting for sampling differences between replicates used a hypergeometric distribution for paternal samples to account for paternal mortality before collection, and the rest were resampled with a binomial distribution. There were two additional binomial samplings, to account for resampling the read counts and to capture further residual variance in the library prep that did not seem to depend on either allele or read counts.
It would be good to see an estimate for the strength of selection: 10% difference in a single generation appears rather high to me.
Estimates of selection strength based on solving for a Wright-Fisher selection coefficient for each tested comparison can now be found in Table S8, mentioned in text on lines 589-590. The mean magnitude of selection coefficients for all paternal to embryo comparisons was 0.322, and for embryo to all adult offspring it was 0.648. For In(3L)Ok the mean selection coefficients were 0.479 and -0.53, and for In(3R)K they were -0.189 and 1.28, respectively. Some are of quite large magnitude, but we emphasize that the coefficients for embryo to adult are based on survival to old age, rather than developmental viability. That factor, in addition to the laboratory environment, makes these estimates distinct from selection coefficients that might be experienced in natural populations.
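To show what solving for a selection coefficient from a single frequency change can look like, here is a minimal sketch. It assumes a deterministic one-generation update with relative fitnesses 1+s and 1 for the two arrangements, p' = p(1+s)/(1+ps); this parameterization and the function names are assumptions for illustration, and the exact form used for Table S8 may differ.

```python
def next_frequency(p, s):
    """Frequency after one round of selection with relative fitnesses 1+s and 1."""
    return p * (1 + s) / (1 + p * s)


def selection_coefficient(p_before, p_after):
    """Solve p_after = p_before*(1+s) / (1 + p_before*s) for s.

    Requires 0 < p_before and p_after < 1.
    """
    return (p_after - p_before) / (p_before * (1 - p_after))


# Hypothetical frequencies, not values from the experiment.
p_before, p_after = 0.30, 0.36
s = selection_coefficient(p_before, p_after)
print(s, next_frequency(p_before, s))
```

Under this form s is bounded below by -1 but unbounded above, so a coefficient larger than 1 in magnitude can only be positive.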
Reviewer #3 (Public Review):
Strengths:
(1) …the authors developed and used a new simulator (although it was not 100% clear as to why SLiM could not have been used as SLiM has been used to study inversions).
Before SLiM 3.7 or so (and including when we did the bulk of our simulation work), we do not think it would have been feasible to use SLiM to model the mutation of inversions with random breakpoints and recombination between them without altering the SLiM internals. Separately, needing to script custom selection, mutation, and recombination functions in Eidos would have slowed SLiM down significantly. Given our greater familiarity with Python and NumPy, and the ability to implement a similarly efficient simulator more quickly than through learning C++ and Eidos, we chose to write our own.
It should be a fair bit easier to implement comparable simulations in SLiM now, but it will still require scripting custom mutation, selection, and recombination functions and would still result in a similarly slow runtime. The current script recipe recommended by SLiM for simulating inversions uses constants to specify the breakpoints of a single inversion, without the ability to draw multiple inversions from a mutational distribution, or model recombination between more complicated karyotypes. Hence, our simulator still seems to be a more versatile and functional option for the purposes of this study.
Weaknesses:
[Comments 1 through 4 on Weaknesses included numerous citation suggestions, and some discussion recommendations as well. In our revised manuscript, we have substantially implemented these suggestions. In particular, we have deepened our introduction of mechanisms of balancing selection and prior work on inversion polymorphism, integrating many suggested references. While especially helpful, these suggestions are too extensive to completely quote and respond to in this already-copious document. Therefore, we focus our response on two select topics from these comments, and then proceed to comment 5 thereafter.]
(2) The general reduction principle and inversion polymorphism. In Section 1.2., the authors state that "there has not been a proposed mechanism whereby alleles at multiple linked loci would directly benefit from linkage and thereby maintain an associated inversion polymorphism under indirect selection." Perhaps I am misunderstanding something, but in my reading, this statement is factually incorrect. In fact, the simplest version of Dobzhansky's epistatic coadaptation model (see Charlesworth 1974; also see Charlesworth and Charlesworth 1973 and discussion in Charlesworth & Flatt 2021; Berdan et al. 2023) seems to be an example of exactly what the authors seem to have in mind here: two loci experiencing overdominance, with the double heterozygote possessing the highest fitness (i.e., 2 loci under epistatic selection, inducing some degree of LD between these loci), with subsequent capture by an inversion; in such a situation, a new inversion might capture a haplotype that is present in excess of random expectation (and which is thus fitter than average)…
We agree that the quoted statement could be misleading and have rewritten it. We intended to point out that we are presenting a model in which all loci contribute additively (with respect to display) or multiplicatively (with respect to survival probability), without any dominance relationships or genetic interaction terms. And yet, the model generates epistatic balancing selection in a panmictic population under a constant environment. This represents a novel mechanism by which (the life-history characteristics of) a population would generate epistatic balancing selection as an emergent property, instead of assuming a priori that there is some balancing mechanism and representing frequency dependence, dominance effects, or epistatic interactions directly using model parameters. We have therefore refined the scope of the statement in question (lines 155-158).
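The additive and multiplicative contributions described above can be sketched in a few lines of Python. This is only a minimal illustration with hypothetical effect values and function names, not the simulator's code; it shows that no dominance or interaction terms enter either calculation.

```python
import math


def display_value(display_effects):
    """Display trait: the sum of the effects of all carried alleles."""
    return sum(display_effects)


def survival_probability(survival_multipliers):
    """Survival: the product of the per-allele survival multipliers."""
    return math.prod(survival_multipliers)


# Hypothetical individual carrying three display-favoring alleles,
# each with a display gain and a survival multiplier below 1.
display_effects = [0.2, 0.1, 0.3]
survival_multipliers = [0.95, 0.98, 0.90]

total_display = display_value(display_effects)
total_survival = survival_probability(survival_multipliers)
```

Any epistasis or frequency dependence in realized fitness therefore has to emerge from how display values are compared among competing males, rather than from the per-locus parameters themselves.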
(4) Hearn et al. 2022 on Littorina saxatilis snails.
A good reference. There is considerable work on ecotype-associated inversions in L. saxatilis, but we previously cut some discussion of this and of other populations with high gene flow but identifiable spatial structure for inversion-associated phenotypes (e.g. butterfly mimicry polymorphisms, Mimulus, etc.). Due to the spatially discrete environmental preferences and sampled ranges of the inversions in these populations, we considered these examples to be somewhat distinct from explaining inversion polymorphism in a potentially homogeneous and panmictic environment.
(4) cont. A very interesting paper that may be worth discussing is Connallon & Chenoweth (2019) about dominance reversals of antagonistically selected alleles (even though C&C do not discuss inversions): AP alleles (with dominance reversals) affecting two or more life-history traits provide one example of such antagonistically selected alleles (also see Rose 1982, 1985; Curtsinger et al. 1994) and sexually antagonistically selected alleles provide another. The two are of course not necessarily mutually exclusive, thus making a conceptual connection to what the authors model here.
We had removed a previously drafted discussion of dominance reversal for brevity’s sake, but this topic is once again represented in the updated draft of the manuscript with a short reference in the introduction, lines 76-80. We also mention ‘segregation lift’ (Wittmann et al. 2017) involving a similar reversal of dominance for fitness between temporally fluctuating conditions, as opposed to between sexes or life history stages.
(5) The model. In general, the description of the model and of the simulation results was somewhat hard to follow and vague. There are several aspects that could be improved: [5](1) it would help the reader if the terminology and distinction of inverted vs. standard arrangements and of the three karyotypes would be used throughout, wherever appropriate.
We have attempted to do so, using the suggested heterokaryotypic/homokaryotypic terminology.
[5](2) The mention of haploid populations/situations and haploid loci (e.g., legend to Figure 1) is somewhat confusing: the mechanism modelled here, of course, requires suppressed recombination in the inversion/standard heterokaryotype; and thus, while it may make sense to speak of haplotypes, we're dealing with an inherently diploid situation.
While eukaryotes with haploid-dominant life history may still experience similar dynamics, we do expect that most male display competition is in diploid animals, and we are only simulating diploid fitnesses and experimenting with diploid Drosophila. We have tried to minimize the discussion of haploids in this draft.
[5](3) The authors have a situation in mind where the 2 karyotypes (INV vs. STD) in the heterokaryotype carry distinct sets of loci in LD with each other, with one karyotype/haplotype carrying antagonistic variants favoring high male display success and with the other karyotype/haplotype carrying non-antagonistic alternative alleles at these loci and which favor survival. Thus, at each of the linked loci, we have antagonistic alleles and non-antagonistic alleles - however, the authors don't mention or discuss the degree of dominance of these alleles. The degree of dominance of the alleles could be an important consideration, and I found it curious that this was not mentioned (or, for that matter, examined).
In this study, our goal was to show that the investigated model could produce balanced and increasing antagonism without the need to invoke dominance. We think there would be a strong case for a follow-up study that more thoroughly investigates how dominance and other variables impact the parameter space of balanced antagonism, but this goal is beyond our capacity to pursue in this initial study. We’ve added several lines clarifying the absence of dominance from our investigated models, and pointing out that dominance could modulate the predictions of these models (lines 211-213, 278-282).
[5](4) In many cases, the authors do not provide sufficient detail (in the main text and the main figures) about which parameter values they used for simulations; the same is true for the Materials & Methods section that describes the simulations. Conversely, when the text does mention specific values (e.g., 20N generations, 0.22-0.25M, etc.), little or no clear context or justification is being provided.
We have sought to clarify in this draft that 20N was chosen as an ample time frame to establish equilibrium levels and frequencies of genetic variation under neutrality. We present a time sequence in Figure 5, and these results indicate that antagonism has stabilized in models without inversions or with higher recombination rates, whereas its rate of increase has slowed in a model with inversions and lower levels of crossing over.
The inversion breakpoints and the position of the locus with stronger antagonistic effects in Figure 4 were chosen arbitrarily for this simple proof of concept demonstration, with the intent that this locus was close to one breakpoint. Hopefully these and other parameters are clearer in the revised manuscript.
[5](5) The authors sometimes refer to "inversion mutation(s)" - the meaning of this terminology is rather ambiguous.
Edited; hopefully the wording is clearer now. The quoted phrase had uniformly referred to the origin of new inversions by a mutagenic process.
(6) Throughout the manuscript, especially in the description and the discussion of the model and simulations, a clearer conceptual distinction between initial "capture" and subsequent accumulation / "gain" of variants by an inversion should be made. This distinction is important in terms of understanding the initial establishment of an inversion polymorphism and its subsequent short- as well as long-term fate. For example, it is clear from the model/simulations that an inversion accumulates (sexually) antagonistic variants over time - but barely anything is said about the initial capture of such loci by a new inversion.
We do not have a good method of assessing a transition between these two phases for the simulations in which both antagonistic alleles and inversions arise stochastically by a mutagenic process. However, we have tried to be clearer on the distinction in this draft: we have included simulations in Figure 4 with variants starting at lower frequencies, and we have tried to better contextualize the temporal trajectories in Figure 5 as (in part) modeling the accumulation of variants after such an origin.
Reviewer #3 (Recommendations For The Authors):
- In general: the whole paper is quite long, and I felt that many parts could be written more clearly and succinctly - the whole manuscript would benefit from shortening, polishing, and making the wording maximally precise. Especially the Introduction (> 8 pages) and Discussion (7.5 pages) sections are quite long, and the description of the model and model results was quite hard to follow.
We have attempted to condense some portions of the manuscript, but inevitably added to others based on important reviewer suggestions. Regarding the length of the Introduction and Discussion, we are covering a lot of intellectual territory in this study, and we aim to make it accessible to readers with less prior familiarity. At this point, we have well over 100 citations, far more than a typical primary research paper, in part thanks to the relevant sources provided by this reviewer. We are therefore optimistic that our text will provide a valuable reference point for future studies. We have also made significant efforts to clarify the Results and Methods text in this draft without notably expanding these sections.
- In general: the conceptual parts of the paper (introduction, discussion) could be better connected to previous work - this concerns e.g. the theoretical mechanisms of balancing selection that might be involved in maintaining inversions; the general, theoretical role of antagonistic pleiotropy (AP) and trade-offs in maintaining polymorphisms; previously made empirical connections between inversions and AP/trade-offs; previously made empirical connections between inversions and sexual antagonism.
In the revised manuscript, we have improved the connection of these topics to prior work.
- L3: "accumulate". A clearer distinction could be made, throughout, between initial capture of alleles/haplotypes by an inversion vs. subsequent gain.
Please see point 6 in the response to the Public Review, above.
- L29: I basically agree about the enigma, however, there are quite many empirical examples in D. melanogaster / D. pseudoobscura and other species where we do know something about the nature of selection involved, e.g., cases of NFDS, spatially and temporally varying selection, fitness trade-offs, etc.
At least for our focal species, we have emphasized that geographic (and now temporal) associations have been found for some inversions. For the sake of length and focus, we probably should not go down the road of documenting each phenotypic association that has been reported for these inversions, or say too much about specific inversions found in other species. As indicated in our response to reviewer 2, some previously documented inversion-associated trade-offs may be compatible with the model presented here. However, we did locate and add to our Discussion one report of frequency-dependent selection on a D. melanogaster inversion (Nassar et al. 1973).
- L43: it is actually rather unlikely, though not impossible, that new inversions are ever completely neutral (see the review by Berdan et al. 2023).
This line was intended to convey that, in line with Said et al. 2018’s results, the structural alterations involved in common segregating inversions are not expected to contribute significantly to the phenotype and fitness (as indicated by lack of strong regulatory effects), and that their phenotypic consequences are instead due to linked variation. We have rewritten this passage to better communicate this point, now lines 44-52. Interpreting Section 2 and Figure 1 of Berdan et al. 2023, the linked variation may be what is in mind when saying that inversions are almost never neutral. We have also added a line referencing the expected linked variation of a new inversion (lines 49-52).
- L51-73: I felt this overview should be more comprehensive. The model by Kirkpatrick & Barton (2016) is in many ways less generic than the one of Charlesworth (1974) which essentially represents one way of modeling Dobzhansky's epistatic coadaptation. Also, the AOD mechanism is perhaps given too much weight here as this mechanism is very unlikely to be able to explain the establishment of a balanced inversion polymorphism (see Charlesworth 2023 preprint on bioRxiv). NFDS, spatially varying selection and temporally varying selection (for all of which there is quite good empirical evidence) should all be mentioned here, including the classical study of Wright and Dobzhansky (1946) which found evidence for NFDS (also see Chevin et al. 2021 in Evol. Lett.)
On reflection, we agree that we put too much emphasis on AOD and have edited the section to be more representative.
- L57. Two earlier Dobzhansky references, about epistatic coadaptation, would be: Dobzhansky, T. (1949). Observations and experiments on natural selection in Drosophila. Hereditas, 35(S1), 210-224. https://doi.org/10.1111/j.1601-5223.1949.tb03334.x; Dobzhansky, T. (1950). Genetics of natural populations. XIX. Origin of heterosis through natural selection in populations of Drosophila pseudoobscura. Genetics, 35, 288-302. https://doi.org/10.1093/genetics/35.3.288
- In general, in the introduction, the classical chapter by Lemeunier and Aulard (1992) should be cited as the primary reference and most comprehensive review of D. melanogaster inversion polymorphisms.
- L101: this is of course true, though there are some exceptions, such as In(3R)Mo.
- L110: the papers by Knibb, the chapter by Lemeunier and Aulard (1992), and the meta-analysis of INV frequencies by Kapun & Flatt (2019) could be cited here as well.
Citation suggestions integrated.
- L123 and elsewhere: the common D. melanogaster inversions are old but perhaps not THAT old - if we take the Corbett-Detig & Hartl (2012) estimates, then most of them do not really exceed an age of Ne generations, or at least not by much. I mean: yes, they are somewhat old but not super-old (cf. discussion in Andolfatto et al. 2001).
Edited to curb any hyperbole. We agree that there are much more ancient polymorphisms in populations.
- L133-135. This needs to be rewritten: this claim is incorrect, to my mind (Charlesworth 1974; also see Charlesworth and Charlesworth 1973; discussion in Charlesworth & Flatt 2021).
Edited. See public review response (2).
- L154: the example of inversion polymorphism is actually explicitly discussed in Altenberg's and Feldman's (1987) paper on the reduction principle.
Edited to mention this. Inversions are also mentioned in Feldman et al. 1980, Feldman and Balkau 1973, Feldman 1972, and have been in discussion since the origins of the idea.
- L162ff: see Connallon & Chenoweth (2019).
Citation suggestion integrated, along with Cox & Calsbeek 2009 which seems more directly applicable, now line 185ff.
- L169: why? There is much evidence for other important trade-offs in this system.
Reworded.
- L178-179: other studies have found that trade-offs/AP contribute to the maintenance of inversion polymorphisms, e.g. Mérot et al. 2020 and Betrán et al. 1998, etc.
Added Betrán et al. 1998 - a good reference. Moved up mention of Mérot et al. 2020 from later in the text and directed readers to the Discussion, lines 202-205.
- L198. "alternate inversion karyotypes" - you mean INV vs. STD? It would be good to adopt a maximally clear, uniform terminology throughout.
Edited to communicate this better.
- L215-217: this is a theoretically well-known result due to Hazel (1943); Dickerson (1955); Robertson (1955); e.g., see the discussion in the quantitative genetics book by Roff (1997) or in the review of Flatt (2020).
Citations integrated, now lines 232ff.
- L223 and L245: "haploid" - somewhat confusing (see public review).
- L259-260: This may need some explanation.
- L261-262: simply state that there is no recombination in D. melanogaster males.
Edited for increased clarity.
- L274 (and elsewhere): the meaning of "mutation...of new..inversion polymorphisms" is ambiguous - do you mean a polymorphic inversion and hence a new inversion polymorphism or do you mean polymorphisms/variants accumulating in an inversion?
- L275: maybe better heterokaryotypic instead of heterozygous? (note that INV homokaryotypes or STD homokaryotypes can be homo- or heterozygous, so when referring to chromosomal heterozygotes instead of heterozygous chromosomes it may be best to refer to heterokaryotypes).
Per [5](1) and [5](5) in the public review, we have edited our terminology.
- L276: referral to M&M - I found the description of the model/simulation details there to be somewhat vague, e.g. in terms of parameter settings, etc.
Further described.
- L281-282: would SLiM not have worked?
See public review response.
- L286-287: why these parameters?
Further described.
- L296ff: it is not immediately clear that the loci under consideration are polymorphic for antagonistic alleles vs. non-antagonistic alternative alleles - maybe this could be made clear very explicitly.
Edited to be explicit as suggested.
- L341, 343: "inversion mutation" - meaning ambiguous.
- L348, 352: "specified rate" - vague.
- L354-357: initial capture and/or accumulation/gain?
- L401, 402, 404: Z-, W- and Y- are brought up here without sufficient context/explanation.
The above have been addressed by edits in the text.
- L523, 557, 639, 646, and elsewhere: not the first evidence - see the paper by Mérot et al. (2020) (and e.g. also by Yifan Pei et al. (2023)).
Citations integrated in the introduction and discussion. Mérot et al. (2020) was cited (L486 in original) but discussion was curtailed in the previous draft.
- L558-559. I agree but it is clear that there are many mechanisms of balancing selection that can achieve this, at least in principle; for some of them (NFDS, etc.) we have pretty good evidence.
- L576-577. This is correct but for In(3R)C that study did find a differential hot vs. cold selection response.
Addressed with text edit.
- L584-L586: cf. Betrán et al. (1998), Mérot et al. (2020), Pei et al. (2023), etc.
- L591. "other forms of balancing selection": yes! This should be stressed throughout. Multiple forms of balancing selection exist and they are not mutually exclusive.
- L593: consider adding Dobzhansky (1943), Machado et al. (2021)
- L596-597: this is rather unlikely, at least in terms of inversion establishment (see Charlesworth 2023; https://www.biorxiv.org/content/10.1101/2023.10.16.562579v1).
- L608: consider adding Kapun & Flatt (2019).
- L611-612: see studies by Mukai & Yamaguchi, 1974; and Watanabe et al., 1976.
- L639, 646: AP - see general literature on AP as a factor in maintaining polymorphism (Rose 1982, 1985; Curtsinger et al. 1994; Charlesworth & Hughes 2000 chapter in Lewontin Festschrift; Connallon & Chenoweth 2019 - this latter paper is particularly relevant in terms of AP effects in the context of sexual antagonism)
Citation suggestions integrated.
- L657: inversion polymorphism is explicitly discussed in Altenberg's and Feldman's (1987) paper on the reduction principle.
Hopefully this is better communicated.
- L724-755: I felt that this section generally lacks sufficient details, especially in terms of parameter choices and settings for the simulations.
- L732: why not state these rates?
Parameter values are now given a fuller description in figure legends and in the methods.
- L746: but we know that mutational effect sizes are not uniformly distributed (?).
We made this choice for simplicity and to avoid invoking a seemingly arbitrary distribution, but one could instead simulate trait effects with some gamma distribution. Display values would still have variable fitness effects that fluctuate with population composition, but we agree that a distribution shifted toward small effects would be more realistic.
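For readers who wish to explore the alternative, the following Python sketch contrasts uniform effect sizes with gamma-distributed ones. The shape and scale values are arbitrary assumptions chosen so that both distributions share a mean of 0.5; they are not parameters from our simulations, and the function names are hypothetical.

```python
import random
import statistics


def draw_uniform_effects(rng, n, upper=1.0):
    """Effect sizes spread evenly between 0 and upper."""
    return [rng.uniform(0.0, upper) for _ in range(n)]


def draw_gamma_effects(rng, n, shape=0.5, scale=1.0):
    """Effect sizes from a gamma distribution; shape < 1 favors small effects."""
    return [rng.gammavariate(shape, scale) for _ in range(n)]


rng = random.Random(42)  # fixed seed so the comparison is repeatable
uniform_effects = draw_uniform_effects(rng, 10_000)
gamma_effects = draw_gamma_effects(rng, 10_000)

# With equal means, the gamma draws have a lower median because most
# effects are small and a few are large.
uniform_median = statistics.median(uniform_effects)
gamma_median = statistics.median(gamma_effects)
```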
- L765: In(3R)P is not mentioned elsewhere - is this really correct?
That was incorrect, fixed.