Heterozygote advantage can explain the extraordinary diversity of immune genes

  1. Mattias Siljestam  Is a corresponding author
  2. Claus Rueffler
  1. Department of Ecology and Genetics, Animal Ecology, Uppsala University, Sweden

Abstract

The majority of highly polymorphic genes are related to immune functions and with over 100 alleles within a population, genes of the major histocompatibility complex (MHC) are the most polymorphic loci in vertebrates. How such extraordinary polymorphism arose and is maintained is controversial. One possibility is heterozygote advantage (HA), which can in principle maintain any number of alleles, but biologically explicit models based on this mechanism have so far failed to reliably predict the coexistence of significantly more than 10 alleles. We here present an eco-evolutionary model showing that evolution can result in the emergence and maintenance of more than 100 alleles under HA if the following two assumptions are fulfilled: first, pathogens are lethal in the absence of an appropriate immune defence; second, the effect of pathogens depends on host condition, with hosts in poorer condition being affected more strongly. Thus, our results show that HA can be a more potent force in explaining the extraordinary polymorphism found at MHC loci than currently recognised.

Editor's evaluation

This important theoretical and numerical study deals with a contemporary topic in evolutionary biology, immunology and population genetics. The structure of the models and the analytic framework used are relevant and sound, and the combination of two types of models is a powerful approach that produces compelling evidence to support the hypothesis on the role of heterozygote advantage in maintaining MHC gene polymorphism. The description of the models is easy to follow, and the paper would be of interest to specialists in evolution, immunology, and the general eLife readership.

https://doi.org/10.7554/eLife.94587.sa0

Introduction

Heterozygote advantage (HA) is a well-established explanation for single locus polymorphism, with the sickle cell locus as a classical textbook example (Allison, 1954). However, whether HA is generally important for the maintenance of genetic polymorphism is questioned (Hedrick, 2012; Sellis et al., 2016). Genes of the major histocompatibility complex (MHC), responsible for inducing immune defences by recognising the agretopes of the pathogenic antigens, are the most polymorphic loci among vertebrates (Duncan et al., 1979; Apanius et al., 1997; Penn, 2002; Sommer, 2005; Eizaguirre and Lenz, 2010). HA as an explanation for this high level of polymorphism was introduced almost 50 years ago by Doherty and Zinkernagel, 1975. The idea suggests that individuals with MHC-molecules from two different alleles are capable of recognising a broader spectrum of pathogens, resulting in higher fitness. This is especially evident when the MHC-molecules of the two alleles have complementary immune profiles (Pierini and Lenz, 2018), a phenomenon known as divergent allele advantage (Wakeland et al., 1990), and Stefan et al., 2019 show that this allows for the coexistence of alleles with larger variation in their immune efficiencies. Early theoretical work suggested that HA can maintain an arbitrarily high number of alleles if these alleles have appropriately fine-tuned homo- and heterozygote fitness values (Kimura and Crow, 1964; Wright, 1966; Maruyama and Nei, 1981). However, later work suggests that such genotypic fitness values are unlikely to emerge through random mutations (Lewontin et al., 1978). More mechanistic models have also failed to reliably predict very high allele numbers (Spencer and Marks, 1988; Hedrick, 2002; De Boer et al., 2004; Borghans et al., 2004; Stoffels and Spencer, 2008; Trotter and Spencer, 2008; Trotter and Spencer, 2013; Ejsmond and Radwan, 2015; Lau et al., 2015). As a result, HA plays only a minor role in current explanations of polymorphism at MHC loci (Hedrick, 1998; Gould et al., 2004; Wegner, 2008; Kekäläinen et al., 2009; Eizaguirre and Lenz, 2010; Lenz, 2011; Loiseau et al., 2011), despite empirical evidence for its existence (Doherty and Zinkernagel, 1975; Hughes and Nei, 1989; Jeffery and Bangham, 2000; Penn et al., 2002; McClelland et al., 2003; Froeschke and Sommer, 2005; Kekäläinen et al., 2009; Oliver et al., 2009; Lenz, 2011). Consequently, other mechanisms are suggested to be important for the maintenance of allelic diversity, such as Red-Queen dynamics, fluctuating selection, and disassortative mating (Apanius et al., 1997; Hedrick, 1998; Penn, 2002; Borghans et al., 2004; Wegner, 2008; Spurgin and Richardson, 2010; Loiseau et al., 2011; Ejsmond and Radwan, 2015; Ejsmond et al., 2023).

Our study challenges this status quo by demonstrating that HA is a potent force that can drive the evolution and subsequent maintenance of more than 100 alleles. To demonstrate that it is indeed HA that is responsible for allelic diversity in our model, we deliberately keep all aspects of the pathogen community fixed to exclude any Red-Queen dynamics. The novelty of our approach lies in the fact that we do not rely on hand-picked genotypic fitness values. Instead, these fitness values emerge from our eco-evolutionary models, where the allelic values that allow for extraordinary polymorphism are found by evolution in a self-organised process. We do not claim that HA is the only mechanism responsible for the diversity of MHC-alleles in nature. However, our results show that HA can be more important than currently recognised.

Model

We investigate the evolution at an MHC locus using mathematical modelling and computer simulations. In the following sections, we describe how genotypes map to immune response and ultimately to survival, followed by a description of our evolutionary algorithm.

We assume that the MHC-molecules produced by the two alleles at a diploid MHC locus determine the immune response based on antigen recognition against multiple pathogens present in the environment. Our approach is based on the following two key assumptions regarding the relationship between pathogen virulence and host fitness:

  1. Virulent pathogens are lethal in the absence of an appropriate immune defence.

  2. The effect of pathogens on host survival depends on host condition, with hosts in poorer condition being affected more strongly.

An implication of the second assumption is that the combined effect of multiple pathogens on host survival exceeds the sum of the effects of each pathogen alone.

To incorporate these two assumptions, we assume that the effect of pathogen attacks on host survival acts through the intermediary step of the host’s ‘condition’, which is a proxy for a suit of measurements describing an individual’s body composition and physiology (Wilder et al., 2016). In the absence of an adequate immune response, a pathogen attack reduces the condition of a host to zero, causing its death (assumption a). More generally, we assume that the probability to survive is an increasing function of condition and that a host clearing a pathogen is in a weaker condition afterward. Since the survival probability cannot exceed one, the function that maps condition to survival has to be saturating. Consequently, for high values of conditions, where the survival function has saturated, pathogens reducing condition have small effects on survival. As condition decreases, pathogen-induced reductions have larger effects on survival (assumption b). A natural biological intuition for assumption (b) can be drawn from examples like COVID-19 or influenza, where it is well known that these pathogens do not pose a high mortality risk to individuals in good condition, but can significantly increase mortality risk for individuals in poor condition (Thompson, 2004; Zhou et al., 2020).

A further assumption of our model is the existence of a trade-off between the efficiencies of MHC-molecules to induce a defence against different pathogens. Thus, no MHC-molecule can perform optimally with respect to all pathogens and an improved efficiency against one set of pathogens can only be achieved at the expense of a decreased efficiency against another set of pathogens. Under such trade-offs, an MHC-molecule can be specialised to detect a few pathogens with high efficiency, or, alternatively, be a generalist molecule that can detect many pathogens but with low efficiency. There is empirical support for the existence of such trade-offs. First, many MHC-molecules can detect only a certain set of antigens (Wakeland et al., 1990; Froeschke and Sommer, 2012; Eizaguirre et al., 2012; Chappell et al., 2015; Pierini and Lenz, 2018) and therefore provide different degrees of protection against different pathogens (Wakeland et al., 1990; Apanius et al., 1997; Eizaguirre and Lenz, 2010; Froeschke and Sommer, 2012; Eizaguirre et al., 2012; Cortazar Chinarro et al., 2022). Second, it has also been found that specialist MHC-molecules are expressed at higher levels at the cell surface while generalist MHC-molecules that bind less selectively are expressed at lower levels (Chappell et al., 2015), potentially to reduce the harm of binding self-peptides. This could explain the lower efficiency of generalist MHC-molecules.

We employ two approaches to model this trade-off. First, we use unimodal functions to model the match between MHC-molecules and pathogens. This approach has a long history in evolutionary ecology (e.g. Levins, 1968; Sheftel et al., 2018), and, when using Gaussian functions, the model becomes amenable to mathematical analysis. We envisage that these pathogen optima represent distinct pathogen species from diverse taxonomic groups such as fungi, viruses, bacteria, protists, helminths, and prions, among others (Schmid-Hempel, 2021). Hence, we expect these pathogen optima to remain approximately constant over the time scales considered in our model. By keeping all aspects of the pathogen community fixed, we exclude Red-Queen dynamics and ensure that the observed allelic polymorphism is driven solely by HA.

To demonstrate that the allelic diversity evolving in the Gaussian model does not dependent on the specificities of this model but rather results from the model fulfilling the above assumptions (a) and (b), we implement an alternative and more mechanistic approach to model pathogen recognition. Inspired by Borghans et al., 2004, in this approach, while keeping assumptions (a) and (b) intact, immune defence is based on the match between two binary strings (or bit-strings), one representing the MHC-molecule and the other a peptide of the pathogen. In this model, a single MHC-allele has the potential to detect several pathogens, which could be interpreted as the different pathogens being more closely related.

By explicitly modelling MHC efficiencies against various pathogens – rather than assuming a fixed proportion of pathogens detected per MHC-molecule (as, e.g., De Boer et al., 2004) – our model accounts for the possibility that MHC-molecules can have complementary immune profiles. When paired, complementary alleles produce fit heterozygotes able to detect an increased number of pathogen peptides (Pierini and Lenz, 2018), exemplifying the concept of divergent allele advantage in the sense of Wakeland et al., 1990.

Gaussian model

In this approach, we use Gaussian functions to model the ability of MHC-molecules to recognise m different pathogens, as illustrated in Figure 1. Here, MHC-alleles and pathogens are represented by vectors x=(x1,x2,,xh) and pk=(p1k,p2k,,phk), respectively. The MHC-alleles code for MHC-molecules, and the ability of an MHC-molecule to recognise the kth pathogen is maximal if x=pk. This ability decreases with increasing distance between x and pk. The decrease is modelled using an h-dimensional Gaussian function, as detailed in Equation A14. The nature of the trade-off can be varied by adjusting the positions of the pathogen optima and the shape of the Gaussian functions.

Efficiency against two pathogens (coloured lines in A–B) and three pathogens (coloured cones in C–D) as a function of allelic values x.

Efficiencies are modeled with Gaussian functions with pathogen optima at equal distances d=1 (indicated by p1 and p2 in A, B). The width of the Gaussian functions, which determine how severely pathogens affect hosts with suboptimal major histocompatibility complex (MHC) molecules, is given by the virulence parameter v. With high virulence (v=7, narrow Gaussians in B, D), alleles away from the optima have a low efficiency, while for a low virulence (v=2.5, wide Gaussians in A, C) efficiency is higher. Grey lines and cones give the condition c of homozygote individuals. The generalist allele, maximising condition, is located at the centre with equal distance to all pathogen optima (indicated by x in A, B).

Without loss of generality, we can reduce the dimension of the vectors x and pk to h=m1, such that x=(x1,x2,,xm1) and pk=(p1k,p2k,,pm1,k). For example, in Figure 1A and B, where m=2, the x-axis represents the unique line passing through two pathogen optima in a trait space of potentially much higher dimension. Similarly, in Figure 1C and D, where m=3, the two-dimensional coordinate system represented by the grey surfaces describes the unique plane passing through three pathogen optima. Mathematically speaking, m linearly independent pathogen optima form the basis of a vector space of dimension m1, which we choose as the coordinate system for the vectors x and p. Allelic vectors outside this set are necessarily maladapted for all pathogens along at least one dimension, and owing to our dimensionality reduction we ignore such trait vectors.

We examine two versions of the Gaussian model. The first one is based on two symmetry assumptions and shown in Figure 1: pathogen optima are placed symmetrically such that the distance between any two pathogens equals 1, and the Gaussian functions ek(x) are isotropic (rotationally symmetric) and of equal width. This allows to simplify the covariance matrix in the Gaussian function ek(x) (Equation A14) such that it can be replaced with a single parameter v (Appendix ‘Model description’),

(1) ek(x)=exp(v22(xpk)T(xpk)),

where the superscript T indicates vector transposition. The parameter v, to which we refer as virulence, is the inverse of the width of the Gaussian function. If the Gaussian function is narrow, corresponding to a high virulence v, a pathogen causes significant harm if MHC-molecules are not well adapted against it (Figure 1B and D). On the other hand, if the Gaussian function is wide, corresponding to a low virulence v, a pathogen causes less harm (Figure 1A and C).

We relax these symmetry assumptions in the second version, where we allow for Gaussian functions with arbitrary shape and position. Since the results for the two versions are similar, we here focus on the case with symmetry and refer to Appendices ‘Deviations from symmetry in the Gaussian model’, ‘Mode description’, and ‘Analytical results for the Gaussian model’ for results based on general Gaussian functions.

Bit-string model

Our second approach is inspired by Borghans et al., 2004, and commonly referred to as a bit-string model. Pathogens are assumed to produce npep peptides, and for a pathogen to cause virulence, all of its peptides have to avoid detection by the host’s MHC-molecules. We here equate MHC-alleles with the MHC-molecule they code for, and both MHC-molecules and pathogen peptides are represented by binary strings (or bit-strings) of, following Borghans et al., 2004, length 16.

The probability that an MHC-molecule detects a pathogen peptide increases with the maximum match length of consecutive matches between their binary strings. For an MHC-molecule x and the kth peptide of the ith pathogen, this match length is denoted Lki(x), or L for short (see Figure 2A). The corresponding detection probability, denoted D(Lki(x)), is then given by the logistic function

(2) D(Lki(x))=11+exp[a(vLki(x))].

Here, v denotes the required match length L for a 50% chance of detection. The parameter v has again the interpretation of virulence, with higher values indicating pathogen peptides that are harder to detect by MHC-molecules. The positive parameter a governs the steepness of the function D. We choose a=log(9), which results in D(L) equalling 10% when L=v1 and 90% when L=v+1 (Figure 2B). Finally, the realised efficiency of an MHC-molecule x against the kth pathogen is given by the probability of detecting at least one of its npep peptides, which equals

(3) ek(x)=1i=1npep(1D(Lki(x))).
Detection probability in the bit-string model.

(A) Major histocompatibility complex (MHC) bit-string matching against a pathogen with npep=10 peptides. Yellow indicates a match between MHC and peptide bits. The longest consecutive match per peptide (L) is indicated with a black box. The longest match over all peptides occurs for the last peptide, marked in green, with match length L=7. (B) Detection probability for peptides as a function of match length L (Equation 2 with a=log(9)). The dashed lines indicate, from left to right, 10%, 50%, and 90% detection probability.

From immune defence to survival

For both versions of our model, we assume that MHC-alleles are co-dominantly expressed (Eizaguirre and Lenz, 2010; Abbas et al., 2014), and an individual’s efficiency to recognise pathogens of type k is given by the arithmetic mean of the efficiencies from its two alleles. We want to note that assuming co-dominance gives more conservative results in terms of the number of coexisting alleles, as dominance would increase the degree of HA.

For each pathogen attack, an individual’s condition c is reduced by a certain fraction that depends on the efficiency of the defence e against that pathogen. Since each individual is exposed to all pathogens during their lifetime, the condition c is determined by the product of its defences against all pathogens,

(4) c(xi,xj)=cmaxk=1mek(xi)+ek(xj)2,

where xi and xj represent the MHC-alleles the host carries at the focal locus, and cmax is the condition of a hypothetical individual with perfect defence against all pathogens (see Appendix ‘Model description’ for more details). Because ek(x)<1, condition is reduced with each additional pathogen in a proportional manner. The multiplicative nature of Equation 4 has the effect that a poor defence against a single pathogen is sufficient to severely compromise condition, and therefore survival (see next paragraph), fulfilling assumption (a) above.

Finally, survival s is an increasing but saturating function of an individual’s condition c,

(5) s(xi,xj)=c(xi,xj)K+c(xi,xj).

Here, K is the survival half-saturation constant, giving the condition c required for a 50% chance of survival. This function fulfils assumption (b) above as long as K is not too large. Individuals in good health then have a condition c far above K, and a decrease in condition only has a small effect on survival. If c is lower than K, then the host is in bad health and any additional pathogen causes a large reduction in survival s (orange lines in bottom panel of Figure 3 and 6).

Evolution of allelic values under the Gaussian model in the presence of three pathogens (arranged as in Figure 1D) for four different values of the survival half-saturation constant K (A: K=10, B: K=1, C: K=0.1, D: K=0.01; dashed line in lower panel).

The top panel shows individual-based simulations. The two horizontal axes give the two allelic values x=(x1,x2) that characterise an allele, while the vertical axis shows evolutionary time. The thickness of the blue tubes is proportional to allele frequencies. Allelic values at the last generation are projected as red dots on the top as well as on the bottom plane. Coloured circles represent the contour lines of the Gaussian efficiency functions ek(x) as shown in Figure 1D. In all simulations, gradual evolution leads towards the generalist allele x=(0,0) and branching occurs in its neighbourhood, as predicted by our analytical derivations (Appendix ‘The evolutionarily singular point’). In (A) there are three consecutive branching events with the second branching event marked by the grey plane (ne=4.0; for details regarding ne, see the legend of Figure 4). (B and C) show that, as K decreases, the number of branching events increases, resulting in more coexisting alleles (ne=7.8 and ne=16.5, respectively). Finally, (D) reveals that, as K decreases even further such that already low condition values result in high survival, the number of branching events decreases again, resulting in a set of alleles closely clustered around the generalist allele (ne=10.2). The bottom panel shows survival s as a function of condition c as defined by Equation 5 on a log-log scale (orange line, left vertical axis) and the frequencies of individual conditions at the final generation (dark blue bars for homozygotes and light blue bars for heterozygotes, right vertical axis; conditions from 0 to 104 are incorporated into the first bar). These panels show that increased allelic diversity results in a lower proportion of homozygote individuals, which have lower survival. Other parameter values: v=7, N=2×105, μ=106, and δ=0.016.

In summary, Equations 4 and 5 entail that assumptions (a) and (b), as formulated above, are satisfied. Using two distinct models to describe the interaction of hosts and pathogens, which both impose a trade-off between the ability to detect different pathogens – namely the Gaussian and the bit-string model – we demonstrate below that HA emerges as a potent force capable of driving the evolution of a very high number of coexisting alleles.

Analysis

To study the evolutionary dynamics of allelic values x in both the Gaussian and the bit-string model, we simulate a diploid Wright-Fisher model with mutation and selection (Fisher, 1930; Wright, 1931). Thus, we consider a diploid population of fixed size N with non-overlapping generations and random mating. Individuals produce, independent of their genotype, a large number of offspring, resulting in deterministic Hardy-Weinberg proportions before viability selection. After viability selection, which is based on Equation 5 and adjusts the proportion of genotypes accordingly, stochasticity is introduced by random multinomial sampling of N surviving offspring, which constitute the adult population of the next generation. Using this model, we follow the fate of recurrent mutations that occur with a per capita mutation probability µ. The long-term evolutionary dynamics is obtained by iterating this procedure (Figure 3, top panel) until the number of alleles equilibrates. This procedure can result in high numbers of coexisting alleles, where the emerging allelic polymorphism is driven by increasing the alleles’ expected survival (or marginal fitness, see Equations A5–A6 in Appendix ‘Adaptive dynamics and invasion fitness’).

For the Gaussian model, mutations are drawn from an isotropic normal distribution with expected mutational effect size δ (Appendix ‘Varying the expected mutational step size in the Gaussian model’). We here focus on mutations of small effect (δ=0.016 in Figure 3 and δ=0.03 in Figure 4) and thus near-gradual evolution. The effect of smaller and larger effect sizes is investigated in Appendix ‘Varying the expected mutational step size in the Gaussian model’. To minimise computation time, simulations (other than those in Figure 3) are initialised at the trait vector that is given by the mean of the vectors describing the pathogens. In the bit-string model, the m pathogens are each given npep randomly drawn bit-strings at the beginning of a simulation and the host population is initialised with a single MHC-allele given by a randomly drawn bit-string. Mutations change a random bit of the MHC-allele. The bit-string model can indeed only be analysed with computer simulations. In contrast, for the Gaussian model we can analytically derive conditions under which to expect either a single generalist allele or the build-up of allelic diversity through gradual evolution in a process known as evolutionary branching (Metz et al., 1992; Geritz et al., 1998; Kisdi and Geritz, 1999; Doebeli, 2011) (see Appendices ‘Mathematical analysis of the Gaussian model: Preliminaries’ and ‘Analytical results for the Gaussian model’ for details).

Number of coexisting alleles under the Gaussian model for m pathogens as a function of pathogen virulence v and the survival half-saturation constant K.

Figures are based on a single individual-based simulation per pixel and run for 106 generations, assuring that the equilibrium distribution of alleles is reached. Results are reported in terms of the effective number of alleles ne, which is a conservative measure for the number of alleles, discounting for rare alleles present at mutation-drift balance (see Appendix ‘Effective number of alleles’). The clear pattern in the figures indicates a high degree of determinism in the simulations. Using population size N=105 and per capita mutation probability μ=5×107, the expected ne under mutation-drift balance alone equals 1.2 (see Appendix ‘Effective number of alleles’). Dashed and solid lines give the contours for ne=50 and ne=100, respectively. Red arrows indicate v=2m, the threshold for polymorphism to emerge from branching (Equation A46). Accordingly, simulations in the dark blue area result in a single abundant allele with ne close to one. Other parameters: expected mutational step size δ=0.03.

Results

Gaussian model

In the simulations of the Gaussian model, the evolutionary dynamics first proceed towards a generalist allele with an intermediate efficiency against all pathogens, to which we refer to as x. This generalist allele maximises the condition c for homozygote genotypes (grey lines and cones in Figure 1, Appendix ‘Absolute convergence stability’). Once this generalist allele is reached, the evolutionary dynamics either stops (Appendix 2—figure 1), resulting in a population where all individuals are homozygous for x, or allelic diversification ensues (Figure 3), resulting in the coexistence of specialist and generalist alleles. Based on the adaptive dynamics approximation, we show analytically (Appendix ‘The evolutionarily singular point’) that x is given by the arithmetic mean of the vectors p1,,pm describing the m pathogen optima (see Equation A26 in Appendix ‘The evolutionarily singular point’) and an attractor of any sequence of allelic substitutions. Whether x is an evolutionary stable endpoint or an evolutionary branching point where diversification ensues depends on the covariance matrix Σp2 of the pathogen optima relative to the covariance matrices ΣG2 of the Gaussian efficiency functions (Appendices ‘Derivation of the Hessian matrix of invasion fitness’, ‘Special case: identically shaped Gaussian efficiency functions’, and ‘Special case: maximal symmetry’). For the special case of identically shaped Gaussian functions, diversification occurs if and only if

(6) Σp22ΣG2>0,

(Appendix ‘Special case: identically shaped Gaussian efficiency functions’). Note, that this expression is independent of the number of pathogens m. Under the additional assumption of equally distant pathogens and isotropic Gaussian functions, these covariance matrices are diagonal matrices with identical diagonal entries σp and σG, respectively, and Condition 6 simplifies to σp22σG2>0. When pathogen optima have an equal distance of 1, the variance among the optima σp2 decreases with an increasing number of pathogens m, and the condition for evolutionary branching can be rewritten as

(7) v>2m,

where v=σG1 (Appendix ‘Special case: maximal symmetry’).

Figure 4 presents the final number of coexisting alleles as derived from individual-based simulations. It shows that the number of coexisting alleles increases with the number of pathogens m and their virulence v, but also depends on the survival half-saturation constant K (Equation 5). For a large part of the parameter space, more than 100 (solid contour lines in Figure 4) and up to over 200 alleles can emerge and coexist.

In order to better understand the process of allelic diversification, it is useful to inspect our analytical results in more detail. Evolutionary diversification occurs if mutant alleles x exist that can invade a population that is monomorphic for the generalist allele x. Initially, while still rare, such mutant alleles will always occur in heterozygous individuals, where they are paired with the generalist allele. Thus, our condition for evolutionary diversification, v>2m, is equivalent to s(x,x)>s(x,x). Since, as homozygotes, the generalist allele maximises condition and therefore survival (Appendix ‘Absolute convergence stability’), we also have s(x,x)>s(x,x). In conclusion, individuals heterozygous for x and x have higher survival than either homozygote, s(x,x)>s(x,x)>s(x,x), and a polymorphism of these two alleles is maintained by HA, as suggested by Doherty and Zinkernagel, 1975. Furthermore, the generalist allele is an evolutionary branching point in the sense of adaptive dynamics theory (Geritz et al., 1998; Kisdi and Geritz, 1999).

The left-hand side of the diversification condition given byEquation 7 indicates that invasion of more specialised alleles is favoured when pathogen virulence v is large (narrow Gaussian functions, see Figure 1A and C). In this case, homozygotes for the generalist allele x are relatively poorly protected against pathogens and more specialised alleles enjoy a fitness advantage while invading. The opposite is true when v is small (wide Gaussian functions, see Figure 1A and C). The right-hand side of the diversification criterion indicates that the benefit of specialisation decreases with an increasing number of pathogens (compare position of red arrows in Figure 4), because different pathogens require different adaptations. Thus, counter to intuition, initial allelic diversification is disfavoured in the presence of many pathogens.

If initial allelic diversification occurs, it leads to a dimorphism from which new mutant alleles can invade if they are more specialised than the allele from which they originated. Then, two allelic lineages emerge from the generalist allele x and subsequently diverge (Figure 3A, up to t=3×104 below grey plane). Increasing the difference between the two alleles present in such a dimorphism has two opposing effects. The condition and thereby the survival of the heterozygote genotype increases because the MHC-molecules of the two more specialised alleles provide increasingly better protection against complementary sets of pathogens, i.e., these alleles are subject to a divergent allele advantage (Wakeland et al., 1990; Pierini and Lenz, 2018). On the other hand, survival of the two homozygote genotypes decreases because they become increasingly more vulnerable to the set of pathogens for which their MHC-molecules do not offer protection. Note that, due to random mating and assuming equal allele frequencies, half of the population are high survival heterozygotes and the remaining half homozygotes with low survival. Since survival is a saturating function of condition c (Equation 5), it follows that the increase in survival of heterozygotes slows down with increasing condition (plateau of the orange curves in Figure 3), and the two opposing forces eventually balance each other such that divergence comes to a halt. At this point, our simulations show that the allelic lineages can branch again, resulting in three coexisting alleles. As a result, the proportion of low survival homozygotes decreases, assuming equal allele frequencies, from one-half to one-third. Subsequently, the coexisting alleles diverge further from each other because the increase in heterozygote survival once again outweighs the decreased survival of the (now less frequent) homozygotes (see Figure 3A, at time t=3×104, grey plane). In Figure 3A, this process of evolutionary branching and allelic divergence repeats itself one more time, resulting in four coexisting alleles. Consequently, 10 genotypes emerge: four homozygotes and six heterozygotes. The homozygotes with specialist alleles have a condition, and thereby a survival, close to zero (two left bars in bottom panel). Conversely, the homozygote for the generalist allele x has an intermediate condition (middle bar), and all heterozygote genotypes have a survival close to 1 (right bar).

In Figure 3B–D, the process of evolutionary branching and allelic divergence continues to recur. As a consequence, allelic diversity continues to increase while simultaneously the proportion of vulnerable homozygote genotypes decreases (Figure 3, lower panel). Thus, in contrast to prior approaches (e.g. Kimura and Crow, 1964; Wright, 1966; Lewontin et al., 1978; Maruyama and Nei, 1981), we do not rely on hand-picked genotypic fitness values. Instead, in our approach, fitness values emerge from an eco-evolutionary model where evolution can be viewed as a self-organising process finding large sets of alleles that can coexist (Figure 3, upper panel).

We note that the half-saturation constant K does not appear in the branching condition and thus does not affect whether polymorphism evolves. However, K does affect the final number of alleles, which is maximal for intermediate values of K. This can be understood as follows. If K is very large (right-hand side of the panels in Figure 4), then heterozygote survival saturates more slowly with increased allelic divergence so that continued allelic divergence is less counteracted. This hinders repeated branching (compare A and C in Figure 3). On the other hand, if K is very small (left-hand side of the panels in Figure 4), then homozygous individuals can have high survival, which decreases the selective advantage of specialisation, leading to incomplete specialisation and a reduced number of branching events (compare D and C in Figure 3).

In summary, high virulence v promotes allelic diversification. Increasing the number of pathogens m has a dual effect: it hinders initial diversification but facilitates a higher number of coexisting alleles if diversification occurs, especially, for intermediate values of the half-saturation constant K.

We perform several robustness checks. First, Appendix 4—figure 1 shows simulations in which we vary the expected mutational step size. These simulations show that the gradual build-up of diversity occurs most readily as long as the mutational step size is neither very small, since then the evolutionary dynamics becomes exceedingly slow, nor very large, since a large fraction of the mutants are then deleterious and end up outside the simplex made up of the pathogen optima (e.g. outside the triangle made up by the three pathogen optima in Figure 1C and D) so that they perform worse against all pathogens.

Second, the results presented in Figures 3 and 4 are based on the assumptions of equally spaced pathogen optima and equal width and isotropic Gaussian functions ek(x) as shown in Figure 1. In Appendices Analytical results for the Gaussian model’ and ‘Deviations from symmetry in the Gaussian model’, we present analytical and simulation results, respectively, for the non-symmetric case. In particular, Appendix 5—figure 1 shows that the predictions for the number of coexisting alleles presented here are qualitatively robust against deviations from symmetry. This is in line with Condition 6 and its simplification under full symmetry, σp22σG2>0, showing that the more general condition for the evolution of allelic polymorphism is structurally identical to the condition under full symmetry.

Bit-string model

Evolutionary diversification of MHC-alleles in the bit-string model is analysed with individual-based simulations, and the results are summarised in Figure 5. Similar to the Gaussian model, we find high levels of allelic polymorphism, with over 100 alleles coexisting in a significant portion of the parameter space. Note that we here keep the half-saturation constant K fixed at 1. With this choice, the realised conditions occur both in the range where survival changes drastically with condition and where the survival function saturates (Figure 6), fulfilling assumption (b). This allows us to focus on the effect of the number of peptides npep per pathogen.

Number of coexisting alleles for the bit-string model for four values of virulence v as a function of the number of pathogens m (increased in steps of 7) and the number of peptides per pathogen npep.

Figures are based on a single individual-based simulation per pixel and run for 106 generations. Results are reported in terms of the effective number of alleles ne, which discounts for rare alleles present at mutation-drift balance (see Appendix ‘Effective number of alleles’). Using population size N=105 and per capita mutation probability μ=5×106, the expected ne under mutation-drift balance alone equals 3. Dashed and solid lines give the contours for ne=50 and ne=100, respectively. Evolution started from populations monomorphic for a random allele, and run for 2×106 generations, assuring that the equilibrium distribution of alleles is reached. Other parameters: half-saturation constant K=1.

A simulation run showing the evolution of allelic diversity under the bit-string model in the presence of m=12 pathogens.

(A) shows the number of alleles n and the effective number of alleles ne as a function of time (on a log-log scale). (B–D) give survival s as a function of condition c as defined by Equation 5 on a log-log scale (orange line, left vertical axis) and the distribution of conditions at three time points (B t=100, C t=3900, D t=106; vertical green dashed lines in A), with dark blue bars for homozygotes and light blue bars for heterozygotes (right vertical axis; conditions from 0 to 104 are incorporated into the first bar, and conditions from 104 and greater are incorporated in the last bar). This shows that as allelic diversity increases, the frequency of homozygotes with low survival decreases. The black dashed lines indicate the value of K=1. Other parameter values: v=7, m=12, npep=3, N=105, μ=5×106.

Our results can be understood as follows. The likelihood that an MHC-molecule can recognise all pathogens is high in the following regions of the parameter space. Firstly, if virulence v is low, then peptide recognition is more likely (Equation 2). Secondly, if the number of pathogens m is low, then detection of all pathogens is a simpler task. Thirdly, if the number of peptides npep per pathogen is high, then the potential for successful pathogen detection increases (Equation 3). Although our model is not sufficiently mechanistic to be directly related to parameters observed in nature, it suggests that when pathogens have a high number of peptides, maintaining allelic polymorphism requires a larger number of pathogens under conditions of low virulence (v7). For higher virulence (v9), the effect of npep weakens, and allelic polymorphism evolves seemingly independent of the number of pathogens. Each of these three circumstances facilitates the existence of a single best allele whose MHC-molecule recognises all pathogens with a high probability (dark blue regions in Figure 5).

As virulence or the number of pathogens increases, or as the number of peptides decreases, the task of recognising all pathogens with high probability becomes progressively more challenging. This leaves homozygous individuals vulnerable to an increasing array of pathogens. As homozygotes get more vulnerable, there is a growing advantage for heterozygotes carrying alleles with complementary immune profiles, as these are able to detect up to twice as many pathogens as either homozygote. This increasingly stronger HA, in turn, facilitates coexistence of an increasing number of alleles, illustrated by increasingly warmer colours in Figure 5, and thereby decreases the proportion of vulnerable homozygotes. Thus, similar to the Gaussian model, increasing either the virulence v or the number of pathogens m enables a higher number of alleles to coexist. However, unlike the Gaussian model, increasing m actually facilitates initial diversification rather than hindering it.

Importantly, in the bit-string model, a point mutation, switching the value of an arbitrary bit in the bit-string, can drastically alter the efficiencies against a large set of pathogens. Because of this, and in contrast to the Gaussian model, a polymorphism maintained by HA can emerge from many different alleles. On the other hand, gradual evolution in the Gaussian model is more efficient in finding the evolutionary endpoint of complementary alleles (Figure 3), while for the bit-string model, as the number of alleles increases, this becomes slower due to the lack of fine-tuning as mutations have large effect. To compensate for this, we use, compared to the Gaussian model, a higher mutation probability µ and run simulations for more generations.

Figure 6A shows the build-up of allelic diversity over time in an exemplary simulation run, and Figure 6B–D show the distribution of condition values at three time points, as indicated by green hatched lines in Figure 6A. In Figure 6B the population is dimorphic. Due to random mating, half of the population consists of homozygotes with low condition (dark blue bar), while the remaining half are heterozygotes with high condition (light blue bar). As time proceeds, the number of coexisting alleles increases. Figure 6C depicts a stage with five coexisting alleles (with at least 1% frequency) and an effective number of alleles (ne) of 4.4. Ultimately, evolution results in 19 coexisting alleles (with at least 1% frequency), and an ne of 16.1, as shown in Figure 6D. In this process, the number of low condition homozygotes decreases, as indicated by the dark blue bars.

Discussion

HA as an explanation for the coexistence of a large number of alleles at a single locus has a long history in evolutionary genetics. Kimura and Crow, 1964, and subsequently Wright, 1966 showed that HA can in principle result in the coexistence of an arbitrary number of alleles at a single locus if two conditions are met: (1) all heterozygotes have a similarly high fitness, and (2) all homozygotes have a similarly low fitness. One special class of genes fulfilling these assumptions occur at self-incompatibility loci, where mating partners need to carry different alleles for fertilisation to be successful (Wright, 1939; Castric and Vekemans, 2004), or loci where homozygosity is lethal (Ding et al., 2021). However, more generally these conditions were deemed unrealistic by Kimura, Crow, and Wright themselves. This assessment was subsequently confirmed by Lewontin et al., 1978, who investigated a model in which the exact fitnesses are determined by drawing random numbers in a manner that all heterozygotes are more fit than all homozygotes. They found that the proportion of fitness arrays that leads to a stable equilibrium of more than six or seven alleles is vanishingly small. Similarly, the idea that the high allelic diversity found at MHC loci can be explained by HA was initially accepted by theoreticians (e.g. Maruyama and Nei, 1981; Takahata and Nei, 1990), but several later authors studying models based on more mechanistic assumptions were unable to reliably predict the coexistence of significantly more than 10 alleles (Spencer and Marks, 1988; Hedrick, 2002; De Boer et al., 2004; Borghans et al., 2004; Stoffels and Spencer, 2008; Trotter and Spencer, 2008; Trotter and Spencer, 2013; Ejsmond and Radwan, 2015; Lau et al., 2015). Thus, currently HA is largely dismissed as an explanation for highly polymorphic loci (Gould et al., 2004; Eizaguirre and Lenz, 2010; Lenz, 2011; Hedrick, 2012).

Our study, while not meant to be a highly realistic mechanistic representations of the interaction between MHC genes and pathogens, serves as a proof of principle that a high number of alleles, matching those found at MHC loci in natural populations, can indeed arise in an evolutionary process driven by HA. Our results thus revive the idea that HA has the potential to explain extraordinary allelic diversity. Importantly, and in contrast to several of the above-mentioned studies, this is achieved without making direct assumptions about homozygote and heterozygote fitnesses. Instead, our results emerge from two assumptions about how pathogens affect a host’s condition and how this, in turn, affect survival. Assumption (a) states that pathogens are lethal in the absence of an appropriate immune response. This assumption is implemented in our model by assuming that each pathogen decreases a host’s condition in a proportional manner (Equation 4), rather than by a fixed amount. Assumption (b) states that the effect of pathogens depends on host condition, with hosts in poorer condition being affected more strongly. Then, the combined effect of multiple pathogens on host survival exceeds the sum of the effects of each pathogen alone. Thus, many pathogens against which a host has an imperfect immune response can collectively push a host’s condition below a threshold where mortality becomes rather high (orange lines in Figures 3 and 6). In our model, this assumption is fulfilled rather naturally. Since the probability to survive can logically not exceed 1, the function that maps condition to survival has to be saturating (Equation 5).

In the following, we detail how assumptions (a) and (b) can result in the emergence of well over 100 alleles such that heterozygotes have similarly high fitness (condition (1) of Kimura and Crow) and homozygotes have similarly low fitness (condition (2) of Kimura and Crow). We start with the observation that the survival probabilities in evolved polymorphic populations vary between individuals (lower panels in Figures 4 and 5B–D). Part of the population consists of individuals that have very low survival probabilities. These are individuals with a condition value considerably less than K and they are almost exclusively homozygotes. This is because, whenever polymorphism is favoured, homozygotes are poorly defended against some pathogens and the fact that pathogens affect condition multiplicatively (Equation 4). The remaining part of the population consists of individuals with condition values considerably above K. Although the condition of these individuals can differ by several orders of magnitude, their survival is close to 1, which results from the fact that the function that maps condition to survival is saturating. These individuals are almost exclusively heterozygotes. This is because alleles that protect against complementary sets of pathogens, when paired together, offer at least a decent protection against all pathogens. In summary, our assumptions (a) and (b) lead to a set of alleles such that their survival probabilities fall into two clusters as required for conditions (1) and (2) of Kimura and Crow, 1964 to be fulfilled. The larger the number of alleles, the lower becomes the proportion of vulnerable homozygotes, and the population consists increasingly of almost equally fit heterozygotes.

Borghans et al., 2004 use a bit-string model similar to ours with m=50 pathogens, npep=20 peptides, a virulence of v=7 and a step function for the probability that an MHC-molecule detects a peptide (a in Equation 2). In contrast to our model, they assume that an individual’s condition equals the proportion of detected pathogens, meaning that each pathogen can reduce fitness by only 2% (thereby not fulfilling our assumption a). Additionally, they assume that survival is proportional to the squared condition (not fulfilling our assumption b). Appendix 6—figure 1 shows a run of our bit-string model with the parameter values used by Borghans et al., 2004, resulting in more than 100 coexisting alleles. In contrast, they find only up to seven coexisting alleles, demonstrating that assumptions (a) and (b) in our model drive the high number of coexisting alleles found by us.

Currently, there are several mechanisms proposed to explain the diversity observed at MHC loci. First, in the presence of an HA, each allele has an advantage when rare because it almost always occurs in heterozygotes. Thus, there is negative frequency-dependent selection acting at the level of the allele. In addition, negative frequency-dependent selection can arise from, for example, Red-Queen dynamics, fluctuating selection, and disassortative mating (Apanius et al., 1997; Hedrick, 1998; Penn, 2002; Borghans et al., 2004; Wegner, 2008; Spurgin and Richardson, 2010; Loiseau et al., 2011; Ejsmond and Radwan, 2015; Lighten et al., 2017; Ejsmond et al., 2023). These mechanisms are similar to HA in the sense that the selective advantage of an allele increases with decreasing frequency. However, they do not result in heterozygotes being more fit than the homozygotes carrying the rare allele. In addition, neutral diversity can be enhanced by recombination (Klitz et al., 2012; Linnenbrink et al., 2018; Robinson et al., 2017). If many individuals are heterozygous, the particularly high levels of gene conversion found at MHC genes can be effective in creating new allelic variants. For instance, for urban human populations with a large effective population size of Ne=106 and a per capita gene conversion probability of r=104 an effective number of alleles as high as ne=1+4rNe=401 can theoretically be maintained by gene conversion (Klitz et al., 2012). However, it is important to point out that for gene conversion to increase allelic diversity, some genetic polymorphism due to balancing selection has to exist to start with. We do not claim that the mechanisms listed here do not play an important role in maintaining allelic diversity at MHC loci. Rather, our results show that, contrary to the currently widespread view, HA should not be dismissed as a potent force. In any real system, different mechanisms will jointly affect allelic diversity. For instance, Lighten et al., 2017 present a model in which, for Red-Queen co-evolution to maintain allelic polymorphism, HA in the form of a divergent allele advantage (Wakeland et al., 1990) seems to be a necessary ingredient. Similarly, Borghans et al., 2004 show that pathogen co-evolution can further increase the number of coexisting alleles compared to HA alone.

The aim of our study is to understand how HA on its own can result in allelic polymorphism. For this reason, we kept all aspects concerning pathogens fixed, focusing on a scenario where pathogen optima represent diverse taxonomic groups that remain approximately constant over the time scales considered in our model. This approach excludes Red-Queen dynamics and fluctuating selection. Models of Red-Queen dynamics are based that pathogens evolve to avoid detection by the host’s immune system (Borghans et al., 2004; Ejsmond and Radwan, 2015; Ejsmond et al., 2023). In our model, this would correspond to moving pathogen optima (in the Gaussian model) or changes in the pathogen peptides (in the bit-string model). We expect that incorporating this would hamper the build-up of allelic MHC diversity when driven solely by HA if pathogens evolve quickly. Alleles previously maintained as beneficial would then become disadvantageous and go extinct more rapidly than new advantageous alleles can appear.

Another component of pathogens that can evolve in response to host immune defence is their virulence (Frank and Schmid-Hempel, 2008). The transmission-virulence trade-off hypothesis (Anderson and May, 1982; Frank, 1996; Alizon et al., 2009) predicts that pathogens that cause relatively little harm to their host (i.e. pathogens with low virulence) may evolve towards higher virulence to increase their transmission rate. In line with this hypothesis, we speculate that incorporating virulence evolution leads to higher virulence whenever pathogens inflict little harm on their hosts. This scenario applies in the dark blue parameter regions in Figures 4 and 5, where host populations possess a single effective generalist allele. In these regions, the evolution of increased virulence would shift pathogens into parameter regions where allelic polymorphism becomes adaptive. The ensuing build-up of allelic polymorphism decreases the harm inflicted by pathogens through HA, which, in turn, increases the selection pressure acting on pathogens for an even further increase in virulence. This suggests, in contrast to evolving pathogen optima, a positive feedback loop between virulence evolution and the evolution of allelic diversity.

Our Gaussian model is not restricted to MHC genes, but can apply to any gene that affects several functions important for survival. Examples are genes that are expressed in different ontogenetic stages or different tissues with competing demands on the optimal gene product. However, gene duplication is expected to reduce the potential number of coexisting alleles per locus and eventually lead to a situation where the number of duplicates equals the number of functions (Proulx and Phillips, 2006). Under this scenario, the high degree of polymorphism reported here would be transient. However, for MHC genes evidence exist that other forces limit the number of MHC loci (Penn, 2002; Wegner, 2008; Eizaguirre and Lenz, 2010; Spurgin and Richardson, 2010). But it is important to point out that, while our model focuses on evolution at a single MHC locus, many vertebrates have more than one MHC locus with similar functions (Wegner, 2008; Eizaguirre and Lenz, 2010; Spurgin and Richardson, 2010). The diversity generating mechanism described here still applies if the different loci are responsible for largely non-overlapping sets of pathogens, indicating that the mechanism presented here can in principle explain the high number of coexisting MHC-alleles.

In summary, our research offers a fresh view that can help us to understand allelic diversity at MHC loci. We identify two crucial assumptions related to pathogen-host interactions, under which we show that HA emerges as a potent force capable of driving the evolution of a very high number of coexisting alleles.

Appendix 1

Effective number of alleles

Here, we provide the calculations for the effective number of alleles ne reported in Figures 4 and 5. The effective number of alleles is given by the reciprocal of the population homozygosity G=fi2, where fi denotes the frequency of allele i in the population (Kimura and Crow, 1964). Under mutation-drift balance, the expected homozygosity is approximated by 1/(1+4Nμ) (Gillespie, 2004), where N is population size and µ the per capita mutation probability.

Thus, under mutation-drift balance, the expected value of ne equals 1+4Nμ. For Figure 4, where N=105 and μ=5×107, the expected value of ne is 1.2. In Figure 5, with N=105 and μ=5×106, the expected value for ne is 3. Hence, ne-values significantly higher than these expectations indicate the presence of alleles maintained by balancing selection.

It is worth noting that when alleles are at equal frequencies fi=1/n, ne is equal to n. In our model, both conditions (1) and (2) of Kimura and Crow, 1964, are approached at evolutionary equilibrium (i.e. heterozygote having similar and high fitness while homozygote having similar and low fitness), as elaborated in the Discussion. As a result, alleles maintained by HA are maintained at roughly similar frequencies. Consequently, ne gives a good estimate for the number of alleles that coexist in a protected polymorphism due to HA, rather than being maintained in a balance between mutation and drift.

Appendix 2

Evolutionary dynamics without diversification

Appendix 2—figure 1
Evolution of allelic values in the presence of three pathogens.

This figure is analogous to Figure 3 (see that legend for details) but with wider Gaussians (v=2.5, as in Figure 1C). As a consequence, the condition for evolutionary branching (v>2m) is not fulfilled and the evolutionary dynamics result in a monomorphic population consisting essentially of only the generalist allele x=(0,0). This result is independent of the half-saturation constant K, here chosen to be K=10.

Appendix 3

Table of mathematical notation

List of all mathematical symbols used in the Supplementary Information. Bold italic font indicates vectors (e.g. x) while normal italic font indicates numbers or scalar-valued functions. Capital letters in sans serif font indicate matrices (e.g. H).

NotationExplanation
ccondition function
cmaxmaximum condition
Cmutational covariance matrix
Dpeptide detection probability (bit-string model)
δexpected mutational step size (Gaussian model)
ekefficiency function for pathogen k
fafrequency of allele xa
hdimensionality of the allelic trait space
HHessian matrix
Iidentity matrix
JJacobian matrix
Khalf-saturation constant of survival function
mnumber of pathogens
μper capita mutation probability
nnumber of alleles
neeffective number of alleles
Npopulation size
pkvector describing the kth pathogen (Gaussian model)
ssurvival function
smaxmaximum survival
Σkcovariance matrix of the Gaussian efficiency function ek for pathogen k (Equation A14)
ΣGcovariance matrix of the Gaussian efficiency function ek, assuming equal covariance matrices for all pathogens
σG2variance of the Gaussian efficiency function ek, assuming full symmetry matrices for all pathogens
Σpcovariance matrix of the position of the pathogen vectors (Gaussian model)
σp2variance of the position of the pathogen vectors, assuming equally distant pathogen optima (Gaussian model)
vvirulence; given by σG1 for the Gaussian model and the detection threshold value in the bit-string model
xaallelic trait vector of allele a
xallelic trait vector of a resident allele
yallelic trait vector of rare mutant allele

Appendix 4

Varying the expected mutational step size in the Gaussian model

For the Gaussian model, mutations are drawn from an isotropic normal distribution, i.e., a matrix with covariance matrix σμI of dimension h. The expected mutational step size δ is given by σμ times the expected value of the Chi-distribution (Equation 18.14 in Johnson et al., 1994),

(A1) δ=σμ2Γ(h+12)Γ(h/2).
Appendix 4—figure 1
Number of coexisting alleles as they emerge in individual-based simulations for different expected mutational step sizes δ and eight pathogens (m=8).

Parameters are chosen such that up to 200 alleles can evolve (K=0.5, v=20; see bottom right panel in Figure 4 in the main text). Solid orange lines and dotted blue lines give the effective number ne and the absolute number n of alleles, respectively. The number of alleles increases fastest and saturates earliest for an intermediate expected mutational step size of δ=0.03 (D; pathogen vectors are 1/δ=1/0.0330 average mutational steps apart) as used in Figure 4. Decreasing the average mutational step size slows down the build-up of allelic diversity (E–G). In the extreme case shown in G (pathogen vectors are 1/0.001 = 1000 average mutational steps apart), the evolutionary dynamics is strongly limited by the rate of phenotypic change due to the small step size and the number of alleles after 107 time steps has reached only 10% the number reached in D. Increasing the average mutational step size also slows down the build-up of allelic diversity (A–C). In the extreme case shown in A (pathogen vectors are 1.25 average mutational steps apart), the evolutionary dynamics are strongly limited by the very large proportion of maladapted mutants. Other parameters (as in Figure 4): N=105, μ=5×107.

Appendix 5

Deviations from symmetry in the Gaussian model

The number of coexisting alleles for different parameter combinations are shown in Figure 4 in the main text. These results are based on two symmetry assumptions. First, the m points describing by the pathogen vectors are placed equidistantly with d=1, resulting in a regular (m1)-simplex. Second, the multivariate Gaussian functions ek describing the MHC-molecule’s efficiencies against the different pathogens are isotropic and have equal width, as shown in Figure 1. Thus, the covariance matrices Σk in Equation A14 are equal to σG2I, where I is the identity matrix. Here, we test the robustness of the outcome shown Figure 4 with respect to violations of these symmetry assumptions. We focus on the case with eight pathogens, and the results are summarised in Appendix 5—figure 1. Panel A is identical to the bottom right panel in Figure 4, and shown here for comparison. Panels BD show the final number of coexisting alleles for increasing deviations from symmetry, as explained in the following. Note that each pixel in the figure is based on a single simulation with a unique random perturbation from symmetry.

In Appendix 5—figure 1B the assumption of symmetrically placed pathogen vectors is perturbed while the Gaussian functions ek are kept rotationally symmetric with equal width. Section ‘Random placement of pathogen vectors’ describes the procedure how the positions of the pathogen vectors are randomised. The similarity between panels A and B indicates that deviations from a symmetric placement of pathogen vectors has a minor effect on the number of coexisting alleles. The slightly decreased smoothness of the contours corresponding to 50 and 100 coexisting alleles stems from the fact that each simulation (corresponding to a pixel) is based from a unique perturbation. Note that polymorphism can emerge for values of v such that the branching condition v>2m derived for the symmetric case is not fulfilled (below the red arrow). This can be understood based on the expression for the Hessian matrix given in Equation A43. This Hessian matrix is more likely to be positive definite for asymmetrically placed pathogen vectors.

Appendix 5—figure 1
Number of coexisting alleles for eight pathogens as a function of pathogen virulence v and the half-saturation constant K for symmetrically (A) and non-symmetrically placed pathogen vectors (B–D).

Figures are based on a single individual-based simulation per pixel and run for 106 generations, assuring that the equilibrium distribution of alleles is reached. (A) shows results for equally spaced pathogen vectors and isotropic functions ek (Equation A14). It is identical to the bottom right panel in Figure 4 and shown here for comparison. (B–D) show the result for increasing perturbations from symmetry. In (B), pathogen vectors are placed randomly (see Section ‘Random placement of pathogen vectors’ for details) while the functions ek are kept rotationally symmetric. In (C) and (D), additionally to the non-symmetric placement of pathogen vectors, the functions ek are independently perturbed from rotational symmetry (see Appendix ‘Random covariance matrices for the pathogen efficiencies’ for details). In (C) the deviations from rotational symmetry are moderate, while in (D) they are strong. Note that in (B–D) pathogen vectors are no longer at a constant distance 1, but instead have the mean variance calculated from the pathogen optima corresponds to the variance of symmetrically placed pathogens optima with distance 1. Results are reported in terms of the effective number of alleles ne, which discounts for alleles arising from mutation-drift balance (see Appendix ‘Effective number of alleles’). Dashed and solid lines give the contours for ne=50 and ne=100, respectively. Red arrows indicate v=28, the minimal value for polymorphism to emerge from branching under full symmetry (Equation A46). Accordingly, simulations in the dark blue area result in a single abundant allele with ne close to one. Other parameters: population size N=105, per capita mutation probability μ=5×107, expected mutational step size δ=0.03.

In Appendix 5—figure 1C and D we, additionally to the non-symmetric placement of pathogen vectors, allow for Gaussian functions ek that are not isotropic. The variances of the perturbed covariance matrices are drawn from the interval (σp2(1ε),σp2(1+ε)) and constrained such that the average variance is equal to 1, and then rotated randomly. Section ‘Random covariance matrices for the pathogen efficiencies’ describes this procedure in detail. Panel C shows the result for modest (ε=0.2) and panel D for strong (ε=0.5) deviations from rotational symmetry. Comparing panels C to B indicates that modest deviations from rotational symmetry have a relatively minor effect on the final number of coexisting alleles. In contrast, in panel D configurations exist where significantly fewer alleles are able to coexist. Interestingly, configurations resulting in a high number of alleles are more likely to occur in combination with high K-values. The highly irregular pattern results from each pixel corresponding to a single simulation with a unique random perturbation from symmetry. Furthermore, the threshold for polymorphism decreases even more because the Hessian matrix given in Equation A42 is even more likely to be positive definite with perturbations in Σk.

Random placement of pathogen vectors

We here describe how we randomly place eight pathogen vectors in trait space. In order to keep the results comparable to the symmetric case, we keep the average variance calculated from the position of their mid-points constant. The distribution of eight pathogen vectors can be described by their seven dimensional covariance matrix Σp calculated from the coordinates p1,,p8. Since each diagonal element of Σp describes the variance of the pathogen vectors along a different dimension of the trait space, the average variance equals tr(Σp)/7, where tr(Σp) denotes the trace. This measure is unaffected by rotation of the points p1,,p8. For symmetrically placed pathogen vectors Σp,sym=d2/(2m)I, where I denotes the identity matrix, and therefore tr(Σp,sym)=d2(m1)/(2m). For the pathogens with perturbed placements (with covariance matrix Σp,per), we demand tr(Σp,per)=tr(Σp,sym). We achieve this by first choosing eight preliminary points p^1,,p^8 that are placed randomly within a unit 7-sphere, having the covariance matrix Σ^p. By subsequently multiplying the coordinates of these points by a scalar α, the variances in Σ^p are multiplied by α2. By setting

(A2) α=tr(p,sym)tr(^p)=d2(m1)2mtr(^p),

with m=8, we obtain the final set of pathogen vectors p1,,p8 with a covariance matrix Σp,per fulfilling tr(Σp,per)=tr(Σp,sym).

Random covariance matrices for the pathogen efficiencies

We here describe how we create random covariance matrices Σk. In order to keep the results between the symmetric and asymmetric case comparable, we fix the mean variance over all Σk to σ2=v2. We obtain the eight random covariance matrices Σ1,,Σ8 in the following manner. First, eight random diagonal matrices D1,,D8 are determined (one per pathogen vectors) with entries drawn from a uniform distribution U(1ε,1+ε). These matrices are then multiplied with the scalar

(A3) β=v2m11mk=1mtr(Dk),

with m=8 to obtain the set of matrices M1,,M8 obeying v2=18i=18(tr(Mi)/7). In a final step, we draw eight random rotation matrices R1,,R8 and calculate our final covariance matrices P1,,P8 as Pk=RkMkRkT.

Appendix 6

Simulation run of the bit-string model with parameter values as in Borghans et al., 2004

Appendix 6—figure 1
The number of alleles n and the effective number of alleles ne as a function of time (on a log-log plot) for a simulation run of our bit-string model.

Parameters values: N=105 and μ=5×106. Other parameters as in Borghans et al., 2004: v=7, m=50, npep=20.

We here present a comparison of our bit-string model with that of Borghans et al., 2004. These authors analyse a bit-string model with m=50 pathogens (that are allowed to mutate but are not subject to selection), with npep=20 peptides each, a virulence of v=7 (with a step function for the probability that an MHC-molecule detects a peptide), a population size of N=103, and a per capita mutation probability of μ=105. In contrast to our model, they assume that condition equals the proportion of detected pathogens, such that each pathogen can lower fitness by only 2% (not fulfilling our assumption a) and that survival is proportional to the squared condition (not fulfilling our assumption b). With these parameters and parameter values, their simulation results in up to seven alleles. We note, that the effective number of alleles in these simulations is likely lower, but no allele frequencies are given.

We contrast their results with those from our model, which, as detailed in the main part, fulfils assumptions (a) and (b). To approximate the step function for the detection probability, we use

(A4) D(Lki(x))=11+exp[2log(99)(vLki(x)1/2)].

For this function, v is the required match length L for a 99% chance of detection, while a match length L=v1 gives only 1% detection probability. Note, that compared to Equation 2, we here subtract 1/2 in the denominator and a=2log(99). Then, our model with the exact same parameters (omitting pathogen mutations) results in 18 alleles and ne=16.7, clearly exceeding the number of alleles found by Borghans et al., 2004.

Based on Kimura and Crow, 1964, for the above N and μ the effective number of alleles that can be maintained by HA cannot exceed ne=17.6 at mutation-drift-selection balance. This suggests that the allelic diversity found by Borghans et al., 2004, is likely not limited by the parameters affecting mutation and drift, μ and N. In contrast, our final number of alleles (being 95% of the maximum) is likely limited by these parameters. To demonstrate that this is indeed the case, we simulate our model with N=105 and μ=5×106, as shown in Appendix 6—figure 1. We find well over 100 alleles (n=157 and ne=140). This demonstrates that the ecological parameter values used by Borghans et al., 2004, m=50, npep=20, and v=7, under our model allows for more than a 20-fold higher allelic diversity.

Appendix 7

Mathematical analysis of the Gaussian model: preliminaries

Adaptive dynamics and invasion fitness

For the Gaussian model presented in the main part, we investigate with an evolutionary invasion analysis using the adaptive dynamics formalism (Metz et al., 1992; Dieckmann and Law, 1996; Geritz et al., 1998) whether selection favours a single generalist allele or a polymorphic population. In the language of adaptive dynamics, we ask whether a monomorphic population evolves towards an evolutionary branching point, where two coexisting allelic lineages emerge.

Let us consider a large population of N individuals with two segregating alleles x1 and x2 under Wright-Fisher population dynamics (Fisher, 1930; Wright, 1931). The allelic frequencies at time t are denoted fx1,t and fx2,t, respectively.

The recurrence equation for the change of frequency of an allele xa{x1,x2} is then given by

(A5) fxa,t+1=fxa,t(fxa,ts(xa,xa)s¯t+fxb,ts(xa,xb)s¯t),

where s(xa,xb) is the survival of an individual carrying the alleles xa and xb (see Equation A12) and

s¯t=fx1,t2s(x1,x1)+fx1,tfx1,ts(x1,x2)+fx2,t2s(x2,x2)

is the population mean survival at time t. Note, that the expression within brackets on the right-hand side of Equation A5 describes the marginal fitness of allele xa.

Consider a resident population carrying allele x to which a mutant allele y=x+ϵ is introduced. In the limit of a mutant allele frequency close to zero, its marginal fitness is given by

(A6) w(y,x) = s(y,x)s(x,x).

We refer to w(y,x) as invasion fitness, which is the expected long-term exponential growth rate of an infinitesimally rare mutant allele y in a resident population with allele x (Metz et al., 1992; Metz, 2008). Allele y has a positive probability to invade and increase in frequency if w(y,x)>1 and disappears otherwise.

We denote the gradient of invasion fitness with respect to the mutant allele y, evaluated at y=x, with w(x,x). It has the entries

(A7) w(x,x)i=w(y,x)yi|py=x

and gives the direction in the h-dimensional allelic trait space in which deviations from x result in the fastest increase of invasion fitness.

If mutations rarely occur, a mutant allele y will either go extinct or reach an equilibrium frequency before the next mutant appears. If, additionally, w(x,x)0 and mutational effects are sufficiently small (i.e. y=x+ϵ for ϵ small), then invasion of y implies extinction of x (Dercole and Rinaldi, 2008; Priklopil and Lehmann, 2020).

In the limit of small mutational steps, the evolutionary dynamics of an allelic lineage becomes gradual and is given by

(A8) dxdt=μNCw(x,x)

(Dieckmann and Law, 1996; Champagnat et al., 2006; Durinx et al., 2008; Metz and de Kovel, 2013). Here, μ is the per capita mutation probability and C the covariance matrix for the distribution of mutational effects on the trait x.

We note that Equation A8 is structurally similar to the gradient equation of quantitative genetics, which is based on the assumption of weak selection or, equivalently, small genetic variances (Lande, 1979; Iwasa et al., 1991; Abrams et al., 1993; Débarre et al., 2014). In this case, x characterises the mean of the phenotype distribution, the covariance matrix describes the distribution of the standing genetic variation, and the factor μN is replaced with a constant.

Allelic trait values x where w(x,x)=0 are of special interest, and such x are referred to as evolutionarily singular points x.

Evolutionarily singular points can be either attractors or repellers of the evolutionary dynamics described by Equation A8. Furthermore, an evolutionarily singular point can be either invadable or uninvadable by nearby mutants. For a resident allele with a one-dimensional trait x=x, a classification of singular strategies is straightforward (Geritz et al., 1998). Evolutionarily singular points that are not approached, irrespective of whether they are invadable or uninvadable, act as repellers, and we do not expect to ever find resident alleles with such values. Evolutionarily singular strategies that are attractors and uninvadable are endpoints of the evolutionary dynamics. Finally, evolutionarily singular points that are attractors and invadable are known as evolutionary branching points. In this case, any nearby mutant can invade the singular point and coexist with it in a protected dimorphism. Further evolution leads to divergence of the alleles present in the dimorphism. Thus, evolutionary branching points are points in trait space at which diversity emerges (Geritz et al., 1998; Rueffler et al., 2006).

The classification of singular points becomes more complicated in multivariate trait spaces or when several strategies coexist in an evolutionarily singular point (Leimar, 2009; Doebeli, 2011; Geritz et al., 2016). First, in multivariate trait spaces or polymorphic populations, whether a singular point is an attractor does not only depend on the direction of the fitness gradient in the vicinity of the singular point but also on the mutational input (Leimar, 2009). Second, in multivariate trait spaces or polymorphic populations, for evolutionary branching it is necessary that a singular point is an attractor and invadable. However, in the multidimensional case, this is generally not sufficient any more (Geritz et al., 2016).

In Section ‘The evolutionarily singular point’, we show for our model that a unique singular point x exists. This allele is uninvadable if it is a minimum of w(y,x) as a function of y. This is the case if the h-dimensional Hessian matrix H with entries

(A9) hij=2w(y,x)yiyj|py=x=x.

is negative definite (Leimar, 2009; Doebeli, 2011). In Section ‘Derivation of the Hessian matrix of invasion fitness’ we derive an explicit expression for H for the fully general case of our model that allows to determine invadability of x as a function of the positions of the pathogen vectors, the half-saturation constant K, and the covariance matrices Σk that determine the shape of the efficiency functions ek.

Whether the singular point x is an attractor of the evolutionary dynamics can be evaluated based on the Jacobian matrix J of the fitness gradient w(x,x) (Leimar, 2009), which is given by

(A10) J = H+Q

and where Q is the h-dimensional matrix of mixed derivatives with entries

(A11) qij=2w(y,x)yixj|py=x=x.

Leimar, 2009, shows that if the symmetric part of J, i.e., (J+JT)/2, is negative definite, then the singular point is an attractor of the evolutionary dynamics described by Equation A8 independent of the mutational covariance matrix C and he refers to this case as strong convergence stability. For the case that the Jacobian matrix is a symmetric negative definite matrix, a stronger result holds, to which he refers to as absolute convergence stability (Leimar, 2002; Leimar, 2009). In this case, all conceivable gradualistic, adaptive paths starting near the point x converge to it. Furthermore, he shows that the condition for absolute convergence stability is equivalent to the existence of a function g(x) having a maximum at x and a positive function α(x) such that the gradient of invasion fitness can be expressed as

w(x,x)=α(x)g(x).

In Section ‘Absolute convergence stability’, we show for our model that w(y,x)>1 is indeed absolutely convergence stable.

For the case of two-dimensional trait spaces, results in Geritz et al., 2016, allow us to conclude that if x is invadable, then it is indeed an evolutionary branching point. For trait spaces of dimension three or higher, whether convergence stability and invadability imply evolutionary branching is an open problem (Geritz et al., 2016). Individual-based simulations indicate, however, that for our model this is indeed the case.

Model description

In this section, we describe the model ingredients. Survival s(xa,xb) of a genotype carrying alleles xa and xb is a saturating function of condition c and described by the well-known Michaelis-Menten equation

(A12) s(xa,xb)=smaxc(xa,xb)K+c(xa,xb).

Here, the half-saturation constant K gives the condition c at which half of the maximum survival is reached and smax is the maximum survival probability that is approached when c becomes large.

The condition of a genotype is given by

(A13) c(xa,xb)=cmaxk=1mek(xa)+ek(xb)2,

where cmax is the condition of a hypothetical individual with perfect defence against all m pathogens and ek(x) is the efficiency of an allele’s MHC-molecule against pathogen k in an environment with m pathogens.

Without loss of generality, c(x,x) is standardised to 1 (by choosing cmax in Equation A13 appropriately). This is helpful because it allows us to choose an interval of K-values where individuals homozygous for the generalist allele have either a condition in the range where survival changes rapidly (K>>1) or slowly (K<<1) with condition. In Figure 4, Appendix 5—figure 1, the x-axis can be translated into survival s(x,x) of the generalist genotype using Equation A12, which then varies between 0.01 for K=102 and 0.99 for K=102.

We assume that the efficiencies ek(x) of inducing immune defence against the m different pathogens are traded off. This trade-off emerges by describing the efficiencies against different pathogens with multivariate Gaussian functions (see Figure 1) that have pathogen-dependent optima,

(A14) ek(x)=exp(12(xpk)Tk1(xpk)).

These function describes how the efficiency of an allele characterised by the h-dimensional vector x decreases with increasing distance from the pathogen vector pk. The closer an allelic trait vector is to a pathogen vector, the higher is the efficiency of the MHC-molecule against that pathogen. The magnitude of the decrease in efficiency with increasing distance to the kth pathogen is determined by the shape and width of the Gaussian function as determined by the h-dimensional covariance matrix Σk.

In the main part, we consider the special case of rotationally symmetric Gaussian functions ek(x). These matrices are thus specified by an inverse matrix-covariance matrix Σk1 (see Equation A14) that takes the form of a scalar matrix, i.e., a scalar multiple of the identity matrix I. Furthermore, we assume that all Gaussians are of equal width. Hence, we have a common scalar for all Gaussians that we denote with v2, i.e., v is the inverse of the width of the Gaussian function. We refer to v as virulence (see Section ‘Special case: maximal symmetry’, below).

(A15) ek(x)=exp(v22(xpk)T(xpk))

Appendix 8

Analytical results for the Gaussian model

The evolutionarily singular point

In this section, we analyse the evolutionary dynamics of a monomorphic resident population in full generality. By subsequently applying several symmetry assumptions, we then derive the analytical results presented in the main text (see Sections ‘Special case: identically shaped Gaussian efficiency functions’ and ‘Special case: maximal symmetry’). Invasion fitness of a rare mutant allele y in a resident population with allele x is given by its marginal fitness,

(A16) w(y,x)=s(y,x)s(x,x)

(see derivation of Equation A6). Note, that smax cancels out. It is therefore omitted from all further calculations. The direction of the evolutionary dynamics is governed by the selection gradient. Its ith entry calculates to

(A17) w(x,x)i=w(y,x)yi|py=x=1s(x,x)s(y,x)yi|py=x.

Using the definitions for s (Equation A12) and c (Equation A13) and their derivatives,

(A18) s(y,x)yi=K(K+c(y,x))2c(y,x)yi

and

(A19) c(y,x)yi=c(y,x)k=1m1ek(y)+ek(x)ek(y)yi,

where Equation A19 is obtained by applying the generalised product rule

(A20) x(i=1nfi(x))=(i=1nfi(x))i=1nfi(x)fi(x),

we obtain

(A21) w(x,x)i=Kc(x,x)(K+c(x,x))c(y,x)yi|py=x=K2(K+c(x,x))k=1m1ek(x)ek(x)xi.

In the next step, we calculate the derivative of the function ek(x) (Equation A14). Applying the chain rule and simplifying results in

(A22) ek(x)xi=12ek(x)xi((xpk)TΣk1(xpk))=12ek(x)xij=1hl=1hσkjl1(xjpkj)(xlpkl)=12ek(x)(j=1hl=1hσkjl1(xlpkl)dxjdxi+j=1hl=1hσkjl1(xjpkj)dxldxi),

where the entries of the matrix Σk1 are denoted by σkjl1. Using that dxj/dxi=0 for ij and dxj/dxi=1 for i=j this further simplifies to

(A23) ek(x)xi=ek(x)j=1hσkij1(pkjxj).

Substituting Equation A23 into Equation A21 finally results in

(A24) w(x,x)i=K2(K+c(x,x))k=1mj=1hσkij1(pkjxj)

and

(A25) w(x,x)=K2(K+c(x,x))k=1mΣk1(pkx).

As mentioned in Section ‘Adaptive dynamics and invasion fitness’, singular points x are allelic trait vectors for which Equation A25 equals zero. From Equation A25 follows that in our model singular points have to fulfil

(A26) (k=1mΣk1)x=k=1mΣk1pk.

Solving for x yields

(A27) x=(k=1mΣk1)1k=1mΣk1pk:=p¯w.

Thus, the unique singular point x equals the arithmetic mean of the pathogen vectors p1,...pm, each weighted by the inverse of their Gaussian covariance matrices Σ1,,Σm. For a one-dimensional trait space (h=1) this simplifies to

(A28) x = k=1mσk2pkk=1mσk2=:p¯w,

which is the well-known weighted average for scalars. If Σ1==Σm, then Equation A27 simplifies to the arithmetic mean pathogen vector

(A29) x = 1mk=1mpk:=p¯

as stated in the main text.

Absolute convergence stability

Below, we prove that the unique singular point x (Equation A27) is absolutely convergence stable. To this end, we first demonstrate that the gradient of invasion fitness can be expressed as w(x,x)=α(x)c(x,x), where α(x) is a positive function, and then show that x is a maximum of c(x,x).

Gradient of condition

An individual homozygote for x has a condition given by

(A30) c(x,x)=cmaxk=1mek(x),

and the ith entry of the gradient c(x,x) is given by

(A31) c(x,x)i=c(x,x)xi=c(x,x)k=1m1ek(x)ek(x)xi.

Substituting Equation A23 into Equation A31 gives

(A32) c(x,x)i=c(x,x)k=1mj=1hσkij1(pkjxj),

and

(A33) c(x,x)=c(x,x)k=1mj=1hΣk1(pkx).

By comparing Equation A33 with Equation A25 we see that

(A34) w(x,x)=K2(K+c(x,x))k=1mΣk1(pkx)=K2c(x,x)(K+c(x,x))c(x,x),

where the fraction before c(x,x) is positive. Thus, w(x,x)=α(x)c(x,x).

Hessian matrix of condition

The Hessian matrix of a homozygote individual’s condition, evaluated at the singular point, is given by the second-order partial derivative of c(x,x) with respect to the ith and jth entry of x.

We obtain this by differentiating Equation A33 with respect of xj, evaluated at x=x, resulting in

(A35) hcij=2c(x,x)xixj=xj(c(x,x)k=1ml=1hσkil1(pklxl)),

and applying the product rule and using that the first derivative at x is zero gives

(A36) hcij=c(x,x)k=1ml=1hσkil1(pklxl)xj=c(x,x)k=1mσkil1,

where the last simplification uses that dxj/dxi=0 for ij and dxj/dxi=1 for i=j, and

(A37) Hc=c(x,x)k=1mΣk1,

which is always negative definite. Thus, c(x,x) is a maximum and x is absolute convergence stable.

Derivation of the Hessian matrix of invasion fitness

As stated in Equation A9, the entries of the Hessian matrix of invasion fitness are given by

(A38) hij=2w(y,x)yiyj|y=x = 1s(x,x)2s(y,x)yiyj|y=x.

The second derivative of the function s is obtained by differentiating Equation A18 with respect to yj, resulting in

(A39) 2s(y,x)yiyj|py=x=Kyj(1(K+c(y,x))2c(y,x)yi)|py=x=K(K+c(x,x))4(2c(y,x)yiyj|py=x(K+c(x,x))2c(y,x)yi|py=xc(y,x)2yj|py=x)=K(K+c(x,x))22c(y,x)yiyj|py=x.

In the final simplification step we use the conclusion drawn from Equation A21 that 0=c(y,x)/yi|y=x and therefore the term after the minus sign disappears.

The second derivative for the function c is obtained by differentiating Equation A19 with respect to yj, resulting in

(A40) 2c(y,x)yiyj|py=x=yj(c(y,x)k=1m(1ek(y)+ek(x)ek(y)yi))|py=x=c(x,x)k=1myj(1ek(y)+ek(x)ek(y)yi)|py=x=c(x,x)k=1m14ek(x)2(22ek(y)yiyj|py=xek(x)ek(y)yi|py=xek(y)yj|py=x).

Here, the one but last simplification step again follows from the fact that 0=c(y,x)/yi|y=x.

The second derivative for the function ek is obtained by differentiating Equation A23 with respect to yj, resulting in

(A41) 2ek(y)yiyj=yjek(y)l=1hσkil1(pklyl)=ek(y)yjl=1hσkil1(pklyl)ek(y)(l=1hσkil1ylyj)=1ek(y)ek(y)yiek(y)yjek(y)σkij1,

where the last simplification uses that dyj/dyi=0 for ij and dyj/dyi=1 for i=j.

By recursively substituting Equations A39–A41 into Equation A9 we obtain

hij=KK+c(x,x)k=1m14ek(x)2(ek(y)yi|py=xek(y)yj|py=x2ek(x)2σkij1)=K4(K+c(x,x))k=1m((l=1hσk il1(pklxl))(l=1hσk jl1(pklxl))2σk ij1),

where in the last step we substituted Equation A23. This result can be rewritten as a matrix

𝖧=K4(K+c(𝒙*,𝒙*))(k=1m(Σk-1(𝒑k-𝒙*)(𝒑k-𝒙*)TΣk-1)-2k=1mΣk-1).

Finally, substituting x with Equation A27, we obtain

(A42) H=K4(K+c(x,x))(k=1m(Σk1(pkp¯w)(pkp¯w)TΣk1) 2k=1mΣk1).

Special case: identically shaped Gaussian efficiency functions

For the special case that the Gaussian covariance matrices Σk are equal (Σ1=Σ2==Σm=ΣG), fulfilled in Appendix 5—figure 1B, the Hessian matrix simplifies to

(A43) H=Km4(K+c(x,x))ΣG1(1mk=1m((pkp¯)(pkp¯)T)2ΣG)ΣG1=Km4(K+c(x,x))ΣG1(Σp2ΣG)ΣG1,

where we used that Σp=(k=1m(pkp¯)(pkp¯)T)/m is the covariance matrix of the positions pk of the pathogen vectors. Since the fraction in Equation A43 is positive and ΣG1 is positive definite, it follows that the definiteness of H is given by the definiteness of the matrix Σp2ΣG. Hence, the singular point x is uninvadable whenever

(A44) Σp2ΣG<0,

where the inequality indicates negative definiteness.

Special case: maximal symmetry

For the even more special case that ΣG=σG2I and that the pathogen vectors are arranged symmetrically, resulting in Σp=σp2I, we obtain

(A45) H = Km4σG4(K+c(x,x))(σp22σG2)I.

In this case, the singular point x is a branching point whenever σp2>2σG2. Assuming that all m pathogen vectors have an equal distance d to each other implies that they are arranged in an m1-dimensional regular simplex (see Figure 1). For this case, σp2=d2/(2m), and we obtain the branching condition d/σG>2m. Assuming pathogen distance d=1 and substitute σG1 as virulence v, we obtain

(A46) v>2m.

Data availability

All data presented and analysed in this study were generated through individual based simulations using Matlab, with code authored by the first author. The corresponding Matlab script is available at Dryad with DOI: https://doi.org/10.5061/dryad.69p8cz98j.

The following data sets were generated
    1. Siljestam M
    (2024) Dryad Digital Repository
    Heterozygote advantage can explain the extraordinary diversity of immune genes.
    https://doi.org/10.5061/dryad.69p8cz98j

References

  1. Book
    1. Abbas AK
    2. Lichtman AH
    3. Pillai S
    (2014)
    Basic Immunology: Functions and Disorders of the Immune System
    Elsevier Health Sciences.
  2. Book
    1. Gillespie JH
    (2004)
    Population Genetics: A Concise Guide
    The Johns Hopkins University Press.
  3. Book
    1. Johnson NL
    2. Kotz S
    3. Balakrishnan N
    (1994)
    Continuous Univariate Distributions
    John wiley & sons.
    1. Leimar O
    (2009)
    Multidimensional convergence stability
    Evolutionary Ecology Research 11:191–208.
  4. Book
    1. Metz JAJ
    (2008) Fitness
    In: Jorgensen SE, Jorgensen B, editors. Encyclopedia of Ecology Oxford. Elsevier. pp. 1599–1612.
    https://doi.org/10.1016/B978-008045405-4.00792-8
    1. Sheftel H
    2. Szekely P
    3. Mayo A
    4. Sella G
    5. Alon U
    (2018) Evolutionary trade-offs and the structure of polymorphisms
    Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences 373:20170105.
    https://doi.org/10.1098/rstb.2017.0105

Article and author information

Author details

  1. Mattias Siljestam

    Department of Ecology and Genetics, Animal Ecology, Uppsala University, Uppsala, Sweden
    Contribution
    Conceptualization, Formal analysis, Visualization, Methodology, Writing – original draft, Writing – review and editing
    For correspondence
    m@siljestam.com
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-3720-4926
  2. Claus Rueffler

    Department of Ecology and Genetics, Animal Ecology, Uppsala University, Uppsala, Sweden
    Contribution
    Conceptualization, Supervision, Funding acquisition, Methodology, Writing – original draft, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0001-9836-2752

Funding

No external funding was received for this work.

Acknowledgements

We thank Yvonne Meyer-Lucht and Tobias Lenz for helpful discussions and Göran Arnqvist, Helena Westerdahl, Joachim Hermisson, and Sophie Karrenberg for comments on an earlier version of the manuscript.

Copyright

© 2024, Siljestam and Rueffler

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 16
    downloads
  • 0
    citations

Views, downloads and citations are aggregated across all versions of this paper published by eLife.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

  1. Mattias Siljestam
  2. Claus Rueffler
(2024)
Heterozygote advantage can explain the extraordinary diversity of immune genes
eLife 13:e94587.
https://doi.org/10.7554/eLife.94587

Share this article

https://doi.org/10.7554/eLife.94587

Further reading

    1. Computational and Systems Biology
    2. Evolutionary Biology
    Pierre Barrat-Charlaix, Richard A Neher
    Research Article

    As pathogens spread in a population of hosts, immunity is built up, and the pool of susceptible individuals are depleted. This generates selective pressure, to which many human RNA viruses, such as influenza virus or SARS-CoV-2, respond with rapid antigenic evolution and frequent emergence of immune evasive variants. However, the host’s immune systems adapt, and older immune responses wane, such that escape variants only enjoy a growth advantage for a limited time. If variant growth dynamics and reshaping of host-immunity operate on comparable time scales, viral adaptation is determined by eco-evolutionary interactions that are not captured by models of rapid evolution in a fixed environment. Here, we use a Susceptible/Infected model to describe the interaction between an evolving viral population in a dynamic but immunologically diverse host population. We show that depending on strain cross-immunity, heterogeneity of the host population, and durability of immune responses, escape variants initially grow exponentially, but lose their growth advantage before reaching high frequencies. Their subsequent dynamics follows an anomalous random walk determined by future escape variants and results in variant trajectories that are unpredictable. This model can explain the apparent contradiction between the clearly adaptive nature of antigenic evolution and the quasi-neutral dynamics of high-frequency variants observed for influenza viruses.

    1. Chromosomes and Gene Expression
    2. Evolutionary Biology
    Timothy Fuqua, Yiqiao Sun, Andreas Wagner
    Research Article

    Gene regulation is essential for life and controlled by regulatory DNA. Mutations can modify the activity of regulatory DNA, and also create new regulatory DNA, a process called regulatory emergence. Non-regulatory and regulatory DNA contain motifs to which transcription factors may bind. In prokaryotes, gene expression requires a stretch of DNA called a promoter, which contains two motifs called –10 and –35 boxes. However, these motifs may occur in both promoters and non-promoter DNA in multiple copies. They have been implicated in some studies to improve promoter activity, and in others to repress it. Here, we ask whether the presence of such motifs in different genetic sequences influences promoter evolution and emergence. To understand whether and how promoter motifs influence promoter emergence and evolution, we start from 50 ‘promoter islands’, DNA sequences enriched with –10 and –35 boxes. We mutagenize these starting ‘parent’ sequences, and measure gene expression driven by 240,000 of the resulting mutants. We find that the probability that mutations create an active promoter varies more than 200-fold, and is not correlated with the number of promoter motifs. For parent sequences without promoter activity, mutations created over 1500 new –10 and –35 boxes at unique positions in the library, but only ~0.3% of these resulted in de-novo promoter activity. Only ~13% of all –10 and –35 boxes contribute to de-novo promoter activity. For parent sequences with promoter activity, mutations created new –10 and –35 boxes in 11 specific positions that partially overlap with preexisting ones to modulate expression. We also find that –10 and –35 boxes do not repress promoter activity. Overall, our work demonstrates how promoter motifs influence promoter emergence and evolution. It has implications for predicting and understanding regulatory evolution, de novo genes, and phenotypic evolution.