Optimal cancer evasion in a dynamic immune microenvironment generates diverse post-escape tumor antigenicity profiles

Abstract

The failure of cancer treatments, including immunotherapy, continues to be a major obstacle in preventing durable remission. This failure often results from tumor evolution, both genotypic and phenotypic, away from sensitive cell states. Here, we propose a mathematical framework for studying the dynamics of adaptive immune evasion that tracks the number of tumor-associated antigens available for immune targeting. We solve for the unique optimal cancer evasion strategy using stochastic dynamic programming and demonstrate that this policy results in increased cancer evasion rates compared to a passive, fixed strategy. Our foundational model relates the likelihood and temporal dynamics of cancer evasion to features of the immune microenvironment, where tumor immunogenicity reflects a balance between cancer adaptation and host recognition. In contrast with a passive strategy, optimally adaptive evaders navigating varying selective environments result in substantially heterogeneous post-escape tumor antigenicity, giving rise to immunogenically hot and cold tumors.

Editor's evaluation

This study presents a valuable mathematical model for the adaptive dynamics of cancer evolution in response to immune recognition. The mathematical analysis is rigorous and convincing, and overall the framework presented could be used in the future as a solid base for analytically tracking tumor evasion strategies. The work will be of interest to evolutionary cancer biologists and potentially may also have implications for the design of clinical interventions.

https://doi.org/10.7554/eLife.82786.sa0

Introduction

Cancer dynamics, encompassing both genotypic evolution and phenotypic progression, lies at the heart of treatment failure and disease recurrence, and therefore represents a significant and stubborn therapeutic hurdle. Prior research efforts have made substantial progress in detailing the mathematics of acquired drug resistance (Iwasa et al., 2006; Michor et al., 2004; Komarova, 2006) and the complementary roles of phenotypic and genotypic changes (Gupta et al., 2019). Recently, there has been much renewed interest in therapies that utilize the adaptive immune system to confer durable remission (Couzin-Frankel, 2013; Waldman et al., 2020). These latter breakthroughs have generated considerable interest in quantifying the cancer-immune interaction (Mayer et al., 2019; Sontag, 2017; George et al., 2017). As with targeted therapeutic resistance via compensatory evolution or adaptive rewiring (Bergholz and Zhao, 2021), tumors can similarly evade the immune system via either elimination or downregulation of tumor-associated antigens (TAAs) normally detectable by the T cell repertoire (Rosenthal et al., 2019). However, several key features distinguish immune-specific evasion from classical drug resistance (Komarova, 2006). Dynamical changes in cancer genotypes and phenotypes, while problematic for conventional therapies, create additional TAAs that may subsequently be recognized by distinct T cells (Yarchoan et al., 2017). Thus, the evolving diversity of the T cell repertoire, consisting of billions of unique clones each with a distinct T cell receptor, provides adaptive immunity and immunotherapy the unique advantage of repeated tumor recognition opportunities (George and Levine, 2021; Lakatos et al., 2020; Qi et al., 2014), making long-term evasion more challenging.

Previous research efforts have investigated the diversity of evolutionary trajectories and the extent of cancer-immune co-evolution occurring in early disease progression (George and Levine, 2018; George and Levine, 2020). These works were based on increasing evidence of significant and sustained tumor evolution driven by immune surveillance (Turajlic et al., 2018; Jamal-Hanjani et al., 2017). Immunosurveillance via distinct T cell clones imposes an adaptive, stochastic recognition environment on developing cancer populations (Desponds et al., 2016) that can result either in cancer elimination, escape, or equilibrium (Schreiber et al., 2002; Dunn et al., 2004). Equilibrium results in cancer co-existence with the immune system over large time scales (Turajlic et al., 2018), thereby motivating the need for a more complete understanding of the interplay between immune recognition and cancer evolution for effective therapeutic design. In addition to parsing this complexity, the precise extent to which a cancer population may actively evade repeated immune recognition attempts is at present unknown.

Previous modeling efforts have assumed that cancer adaptation occurs passively, that is, without behavior predicated on knowledge of the current immune microenvironment (IME). However, it is well known that cancer populations commonly undergo phenotypic changes capable of altering their immunogenicity (Tripathi et al., 2016); these changes could be coupled to sensing of the IME in a manner similar to cancer mechanical, chemical, and stress sensing (Lee et al., 2019; Damaghi et al., 2013; Rosenberg, 2001). Moreover, direct experimental evidence demonstrates genetic adaptation in bacterial systems capable of sensing stress and consequently varying the per-cell mutation rate (Al Mamun et al., 2012; Rosenberg and Queitsch, 2014); there appear to be similar stress pathways in cancer (Bindra et al., 2007). Therefore, an alternative to passive evolution is for cancer populations to actively sense and evade recognition in the current environment en route to metastasis in a manner that maximally benefits survival, which we refer to henceforth as the ‘optimal escape hypothesis.’ Understanding the extent and associated features of optimized tumor evasion is a crucial first step to identifying the best therapeutic approach, particularly for T cell immunotherapies that may be temporally varied.

Here, we introduce a mathematical framework, which we call ‘Tumor Evasion via adaptive Antigen Loss’ (TEAL), to quantify the aggressiveness of an evolutionary strategy executed by a cancer population faced with a varying recognition environment. This framework enables a dynamical analysis of both passive and optimized evasion strategies. The TEAL model describes a discrete-time stochastic process tracking the number of targets available to a recognizing adaptive immune system. We apply dynamic programming (Bellman and Dreyfus, 1959; Ross, 2014) to solve the corresponding time-homogeneous Bellman equation detailing the tumor's optimal evasion strategy for a specific example of the assumed penalty for attempting to avoid immune detection. In doing so, we obtain an exact analytical characterization of the evasion policy that maximizes long-run population survival, which we show is the unique solution. We can then quantify the enhancement in survival for optimal threats relative to their passive counterparts under a variety of temporally varying recognition environments. Surprisingly, we find that optimized strategies exhibit substantial diversity in their dynamical behavior, distinguishing them from threats with a fixed evolutionary strategy. Notably, immune recognition efficiency and the IME are predicted to influence the likelihood for tumors to either accumulate or lose therapeutically actionable TAAs prior to their escape. The TEAL model is a first attempt to explicitly represent, and in the future test, the optimal escape hypothesis in order to frame cancer evasion as a dynamic and informed strategy aimed at maximizing population survival.

Model development

In greatest generality, our model consists of an evading clonal population that may be targeted over time by a recognizing system. We assume henceforth that the recognition-evasion pair consists of the T cell repertoire of the adaptive immune system and a cancer cell population, recognizable by a minimal collection of s_n TAAs present on the surface of cancer cells in sufficient abundance for recognition to occur over some time interval $n$ . Our focus is on a clonal population, recognizing that subclonal TAA distributions in this model may be studied by considering independent processes in parallel for each clone.

Experimental evidence and prior modeling suggest that tumors may be kept in an ‘equilibrium’ state of small population size prior to either escape or elimination, with repeated epochs of recognition and evasion (Dunn et al., 2004; Turajlic et al., 2018; George and Levine, 2020). We adopt a coarse-grained strategy and assume that during each epoch, the immune system has an opportunity to independently recognize each of the s_n TAAs with probability $q$ , and also the cancer population can lose recognized TAAs, each with probability $π_{n}$ , which we refer to as the antigen loss rate. The antigen loss rate is either fixed or chosen by the cancer population using information available in the current period. If the immune system cannot detect any of the available TAAs in a given period, then the cancer population escapes detection. On the other hand, if $r_{n} > 0$ antigens are detected by the adaptive immune system in this time frame, then the cancer population is effectively targeted. This leads to cancer elimination unless the population is able to lose each of the r_n recognized antigens during the same period. This loss of recognition would presumably arise in a subpopulation that would then expand at the expense of the successfully targeted cells. If evasion balances recognition and all detected antigens are lost, then equilibrium (non-escape, non-elimination) ensues, and the process repeats in the next period with a new number of target antigens given by a state transition equation

s_{n + 1} = s_{n} - r_{n} + β + f_{n}

where $β$ represents the basal rate of new antigen accumulation, and f_n represents the addition of new TAA targets dependent on the rate of escape $π_{n}$ in the current state. We shall refer to f_n as the (intertemporal) penalty term, the idea being that changes that lead to antigen loss will out of necessity give rise to the creation of new TAAs, in the form of either overexpressed/mislocalized self-peptides or tumor neo-antigens.

The model therefore defines a discrete time process that involves changes to both the tumor and the immune system. The process ends in cancer elimination if the cancer population is unable to match all of the r_n recognized antigens at any period. The process ends in cancer escape if at any period the number of recognized antigens is zero ( $r_{n} = 0$ ). This framework mirrors the outcomes resulting from known tumor-immune interactions, a process that leads via immunoediting to cancer escape, elimination, or equilibrium (Schreiber et al., 2002; Dunn et al., 2002; Dunn et al., 2004; Koebel et al., 2007). Here, tumor antigenicity is represented by the total number of post-escape TAAs. We do not distinguish between different types of TAA loss, which may occur through a number of mechanisms, including somatic mutation, epigenetic regulation, or phenotypic alteration.

Passive evader

In the passive case, the cancer population does not change its evasion rate so that $π_{n} = p$ is fixed and independent of any of the parameters governing the recognition landscape. For this case, we shall also use the simple assumption that the net antigen accumulation and penalty $β + f$ is a fixed constant.

Optimal evader

In the optimized case, $π_{n}$ is chosen to maximize the overall evasion probability as a function of parameters realizable to the cancer at period $n$. We assume that s_n, the number of TAAs, and r_n, the size of the recognized subset, are both knowable by the cancer prior to strategy selection. In addition, we postulate that the intertemporal penalty scales directly with $π_{n}$, a reasonable assumption given, for example, the direct relationship between mutagenesis and passenger mutation accumulation (Pon and Marra, 2015; McFarland et al., 2014). While many functional forms of $f_{n} (s_{n}, r_{n}, π_{n})$ would be reasonable, we assume in general that the penalty is linear in $π_{n}$:

f_{n} (s_{n}, r_{n}, π_{n}) = h_{m} (s_{n}, r_{n}) π_{n} .

To make our system analytically solvable, we use a specific choice in which h_m scales monotonically as a function of both r_n and s_n and $h_{m} \propto r_{n}$ in the large r_n limit (see ‘Methods’). Since the number of recognizable (and thus actively targeted) TAAs reflects, all else being equal, an active IME hostile to cancer, we assume that the subsequent total TAA addition, $β + f_{n}$, is dependent on the current level of immune detection, thereby taking into account the increased cost of surviving in, for example, an inflammatory IME. The temporal dynamics of the TEAL process are illustrated in Figure 1A and Figure 1—figure supplement 1.

Figure 1

Tumor Evasion via adaptive Antigen Loss (TEAL) model.

(A) Illustration of tumor antigen detection and downregulation in the TEAL model of cancer-immune interaction. (B) The directed graph with nodes representing the states of the TEAL model and edges labeled based on the probability of their occurrence. The interaction leads to elimination, equilibrium, or escape. Both evasion and elimination are absorbing states, and the equilibrium state results in repeated interaction. (C) Single-period optimal antigen loss rates $π^{*}$ given by Equation 8 are plotted as a function of recognition rate $q$ for various numbers of recognized antigens $0 < r_{n} \leq s_{n}$ with $s_{n} = 5$.

Varying environments

Using the above framework, we subject both passive and active cancer evasion tactics to temporally varying recognition profiles. We partition pre-escape dynamics into four cases based on immune recognition $q$ and basal TAA arrival $β$ , from which we characterize the distribution of escape time, cumulative mutational burden, and predicted post-escape tumor immunogenicity.

Results

The following section presents the main findings of our analysis (full mathematical details are provided in the ‘Methods’ section). For s_n available and r_n recognized TAAs, we have that $r_{n} \sim Binom (s_{n}, q)$ . Conditional on recognition ( $r_{n} > 0$ ), the number of downregulated antigens, $ℓ_{n}$ , is given by $ℓ_{n} \sim Binom (r_{n}, π_{n})$ . Recognition therefore occurs with probability $P (r_{n} > 0) = 1 - (1 - q)^{s_{n}}$ . Similarly, non-elimination occurs following recognition with probability $P (ℓ_{n} = r_{n}) = π_{n}^{r_{n}}$ . A decision tree for the TEAL process is illustrated in Figure 1B (passive and active decision trees used in the analysis are depicted in Figure 1—figure supplements 2–4).
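The single-period logic described above can be sketched in code. The following Python snippet is only an illustration under stated assumptions, not the authors' implementation: the function name `teal_period` and its return convention are introduced here for convenience, and each binomial draw is realized as a sum of independent Bernoulli trials.

```python
import random


def teal_period(s, q, pi, rng):
    """Simulate one TEAL period (hypothetical helper, not from the paper).

    s   : number of available TAAs (s_n)
    q   : per-antigen recognition probability
    pi  : per-antigen loss probability (pi_n)
    rng : a random.Random instance
    Returns (outcome, r) where r is the number of recognized TAAs.
    """
    # r_n ~ Binom(s_n, q): each TAA is recognized independently.
    r = sum(rng.random() < q for _ in range(s))
    if r == 0:
        # No antigen was detected, so the population escapes.
        return "escape", 0
    # l_n ~ Binom(r_n, pi_n): each recognized TAA is lost independently.
    lost = sum(rng.random() < pi for _ in range(r))
    if lost == r:
        # Every recognized antigen was lost: the interaction continues.
        return "equilibrium", r
    # At least one recognized antigen remains: the population is eliminated.
    return "elimination", r
```

Averaging the outcomes of many calls at fixed $s_{n}$, $q$, and $π_{n}$ should approach the recognition probability $1 - (1 - q)^{s_{n}}$ and the conditional non-elimination probability $π_{n}^{r_{n}}$ stated above.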

Passive evasion strategy

For a passive evader, the TAA loss rate is fixed so that $π_{n} = p$. It can be shown (see Methods section ‘Distribution of lost antigens’) that the dynamics governed by Equation 1 in the passive case can be represented by their mean trajectories while the cancer population is in equilibrium, given by

E_{n} [s_{n + 1}] = s_{n} - \frac{p (1 - γ) η^{s_{n} - 1}}{η^{s_{n}} - γ^{s_{n}}} s_{n} + (β + f),

where $η \equiv 1 - q (1 - p)$ is the probability of equilibrium (non-escape, non-elimination) between the cancer and immune compartments for a single TAA given the existence of at least one available TAA. These dynamics may be approximated by

E_{n} [s_{n + 1}] \approx (1 - q) s_{n} + (β + f),

where $E_{n} [\cdot]$ is the conditional expectation given the information available at time $n$ . The approximation given by Equation 4 is a lower estimate of tumor antigenicity and is accurate as long as $p$ and $q$ are not both small and in particular for choices that give rise to large tie probability (Figure 1—figure supplements 6 and 10).
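As a rough numerical companion to Equations 3 and 4, the sketch below evaluates the single-period outcome probabilities and the two mean updates for a passive evader. It assumes $γ = 1 - q$ (the symbol is not defined in this section; this reading makes the denominator $η^{s_{n}} - γ^{s_{n}}$ equal to the single-period equilibrium probability), and all function names are hypothetical.

```python
def _check(p, q):
    # The exact formula divides by eta**s - gamma**s, which vanishes if p*q == 0.
    if not (0.0 < p <= 1.0 and 0.0 < q <= 1.0):
        raise ValueError("p and q must lie in (0, 1]")


def passive_period_probabilities(s, p, q):
    """Return single-period (escape, equilibrium, elimination) probabilities."""
    _check(p, q)
    gamma = 1.0 - q            # assumed: one TAA goes unrecognized
    eta = 1.0 - q * (1.0 - p)  # one TAA does not trigger elimination
    escape = gamma ** s                  # no TAA is recognized
    equilibrium = eta ** s - gamma ** s  # some recognized, all of them lost
    elimination = 1.0 - eta ** s         # some recognized TAA is retained
    return escape, equilibrium, elimination


def passive_mean_exact(s, p, q, arrival):
    """Mean next-period TAA count from Equation 3; arrival = beta + f."""
    _check(p, q)
    gamma = 1.0 - q
    eta = 1.0 - q * (1.0 - p)
    expected_loss = p * (1.0 - gamma) * eta ** (s - 1) / (eta ** s - gamma ** s) * s
    return s - expected_loss + arrival


def passive_mean_approx(s, q, arrival):
    """Mean next-period TAA count from the approximation in Equation 4."""
    return (1.0 - q) * s + arrival
```

One quick sanity check: for $s_{n} = 1$ the exact expression removes exactly one antigen, because equilibrium with a single TAA requires that it be both recognized and lost.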

Optimal evasion strategy

In contrast to the above case where $π_{n}$ was fixed at $p$, here the antigen loss rate is variable and selected optimally given the current state of total s_n and recognized r_n antigens. The use of dynamic programming to address the optimal long-term evasion policy relies on a defined value function (Bellman and Dreyfus, 1959). We shall focus on the case where the cancer population is assigned normalized values of 1 at any period resulting in escape and 0 otherwise. The corresponding stationary Bellman equation takes the form

J_{n} = \max_{π_{n}} E_{n} [π_{n}^{r_{n}} [(1 - q)^{s_{n + 1}} + (1 - (1 - q)^{s_{n + 1}}) J_{n + 1}]],

where the value function $J_{n} = J (s_{n}, r_{n}, π_{n})$ represents the maximal attainable value at period $n$ (see Methods section ‘Dynamic programming solution’). It can be shown that

J_{n} = \frac{A_{n} γ^{s_{n}}}{1 - (1 - q)^{s_{n}}}

with

A_{n} = \frac{δ_{n} q (1 - q)^{β + r_{n} / c - r_{n}}}{1 - δ_{n} q (1 - q)^{β + r_{n} / c - r_{n}}}

satisfies Equation 5. Here, $0 < δ_{n} \leq 1$ is a free parameter that varies inversely with the risk aversion of the evader (larger values imply a bolder strategy). One advantage of the dynamic programming approach is that it reduces an infinite-period optimization problem to a sequence of single-period optimizations. The corresponding optimal policy is given by the sequence

π_{n}^{*} = {(\frac{δ_{n} q}{1 - (1 - q)^{s_{n}}})}^{1 / r_{n}} .

Plots of $π_{n}^{*}$ are given for various r_n in Figure 1C and Figure 1—figure supplement 11. As expected, the optimal antigen loss rate $π_{n}^{*}$ given by this closed-form strategy increases with both $q$ and r_n. We take $δ_{n} = 1$ in subsequent analysis (so that the optimal strategy when $s_{n} = r_{n} = 1$ is $π_{n}^{*} = 1$).
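The closed-form policy in Equation 8 is simple to evaluate numerically. The snippet below is a minimal sketch; the function name `optimal_loss_rate` and its argument checks are assumptions made for illustration rather than part of the original analysis.

```python
def optimal_loss_rate(s, r, q, delta=1.0):
    """Evaluate Equation 8: pi* = (delta*q / (1 - (1-q)**s)) ** (1/r).

    s     : available TAAs (s_n)
    r     : recognized TAAs (r_n), with 0 < r <= s
    q     : recognition probability, with 0 < q <= 1
    delta : risk parameter, with 0 < delta <= 1 (the text uses delta = 1)
    """
    if not (0 < r <= s):
        raise ValueError("require 0 < r <= s")
    if not (0.0 < q <= 1.0 and 0.0 < delta <= 1.0):
        raise ValueError("require 0 < q <= 1 and 0 < delta <= 1")
    # Ratio of delta*q to the probability that at least one TAA is recognized.
    base = delta * q / (1.0 - (1.0 - q) ** s)
    return base ** (1.0 / r)


if __name__ == "__main__":
    # Tabulate pi* for s_n = 5, as in Figure 1C, over a few recognition rates.
    for q in (0.2, 0.5, 0.8):
        print(q, [round(optimal_loss_rate(5, r, q), 3) for r in range(1, 6)])
```

With $δ_{n} = 1$ the base of the power never exceeds 1, since $1 - (1 - q)^{s_{n}} \geq q$ for $s_{n} \geq 1$, so the returned value is a valid probability.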

Active evasion strategies enhance population survival rates

For a fixed TAA arrival, Equations 3 and 4 describe a mean-reverting process. Consequently, the mean number of TAAs approaches a stable equilibrium

\lim_{n \to \infty} E_{n} [s_{n + 1}] \approx (β + f) / q

as long as the cancer neither escapes nor is eliminated. In the optimal case, a similar equilibrium value $s_{\infty}$ may be calculated:

s_{\infty} = \frac{β}{q | 1 / \ln (1 - q) + 1 |} .

In this case, stability is more complex: if immune recognition is sufficiently effective, meaning $q > q^{*} = 1 - e^{- 1}$, then Equation 10 is a stable equilibrium exhibiting mean reversion similar to that of the passive case. On the other hand, recognition impairment ($q < q^{*}$) gives rise to an instability, which results in a system harboring an initial number of targets $s_{0}$ being driven either to escape if $s_{0} < s_{\infty}$ or to large accumulations (and likely elimination) if $s_{0} > s_{\infty}$ (Figure 5—figure supplement 2).
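The two equilibrium expressions can be checked against the parameter values quoted in the figure captions. The sketch below is illustrative only and uses hypothetical function names. It evaluates the optimal-case equilibrium with $| 1 / \ln (1 - q) + 1 |$ in the denominator, the form that vanishes at $q^{*} = 1 - e^{- 1}$ and that reproduces the quoted values (for example, $β = 0.88$ at $q = q^{*} + 0.1$ gives $s_{\infty} \approx 5$, and $β = 0.529$ gives $| s_{\infty} | \approx 3$).

```python
import math

# Critical recognition rate separating stable from unstable optimal dynamics.
Q_STAR = 1.0 - math.exp(-1.0)


def passive_equilibrium(arrival, q):
    """Fixed point of s -> (1 - q) * s + arrival, i.e. (beta + f) / q."""
    if not (0.0 < q <= 1.0):
        raise ValueError("require 0 < q <= 1")
    return arrival / q


def optimal_equilibrium(beta, q):
    """Equilibrium TAA level for the optimal evader (assumed sign convention).

    The denominator tends to zero as q approaches Q_STAR, so values very
    close to the critical rate are numerically unreliable.
    """
    if not (0.0 < q < 1.0):
        raise ValueError("require 0 < q < 1")
    return beta / (q * abs(1.0 / math.log(1.0 - q) + 1.0))


def is_stable(q):
    """Per the text, the optimal equilibrium is stable only when q > q*."""
    return q > Q_STAR
```

For the passive case, the fixed point follows from setting $s = (1 - q) s + (β + f)$ in Equation 4; deviations from it shrink by a factor of $1 - q$ per period, which is the mean reversion described above.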

We proceed by contrasting active and passive escape rates assuming no recognition impairment, and discuss the implications of immune impairment in a later section. Simulations of passive and optimized strategies with passive evasion rates matching mean optimal evasion rates ( $p = E [π_{n}^{*}] |_{s_{\infty}}$ ) are compared in Figure 2. Despite identical mean TAA evolution (Figure 2A) and comparable intertemporal penalties, the optimized strategy results in substantially higher cancer escape probability (150%) compared to the passive case. Moreover, optimized strategies generate wider escape time distributions, thus illustrating an adaptive evader’s sustained effort to thwart elimination prior to escape (Figure 2B).

Figure 2

Passive and optimized evasion strategies against stationary threats.

(A) Comparisons of the temporal dynamics of passive (green) and active (blue) strategies with parameter selections giving equal mean behavior. In the active case, $q = q^{*} + 0.1$ yields stable dynamics, giving mean antigen arrival $β + f = 3.44$. In the passive case, $p = 0.90$ was selected to match the mean optimal evasion rate and the expected s_n of the active case. Also, $f = 3.51$ and $β = 0.88$ were both chosen so that $s_{\infty} = 5$, and the results are plotted for $s_{0} \in \{2, 5, 8\}$. (B) $10^{6}$ replicates of this process were used to calculate distributions of stopping times conditioned on escape. This distribution generates passive (resp. optimized) $p_{escape}$ of 5.37 (resp. 8.44).

Figure 2—source data 1 Source data contains a spreadsheet of data for Figure 2B.: https://cdn.elifesciences.org/articles/82786/elife-82786-fig2-data1-v1.xlsx

Arbitrary recognition landscape

The above describes the dynamics of passive and optimized cancer co-evolution during adaptive immune recognition with constant governing parameters. We can more generally apply this approach to understand how an evasion strategy affects the likelihood and timing of cancer escape under a variety of temporally varying recognition landscapes. Such landscapes could, for example, be imposed by a clinician temporally modulating an immunotherapeutic intervention and are routinely proposed in the setting of traditional therapies, where attempted strategies have included a variety of cyclical burst approaches (Foo and Michor, 2009; Eigl et al., 2005). A similar approach could be taken with regard to timing and dosage of adoptive T cell immunotherapy. An advantage of our dynamic programming approach is the ability to study optimal evasion strategies for arbitrary recognition landscapes (Figure 3A). We simulate TEAL dynamics and find that optimized immune evaders are more successful in evading detection than their passive counterparts across various recognition landscapes (Figure 3B). Evasion, when it occurs in the optimized case, does so largely after a sustained interaction with the recognizing threat (Figure 3C). Collectively, our results detail the dynamics of sustained cancer-immune co-evolution via TAA loss in threats capable of adopting adaptive evasion strategies in the presence of complex treatment modulation (George and Levine, 2020; Turajlic et al., 2018).
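To make the idea of a temporally varying recognition landscape concrete, the sketch below runs a passive evader against a user-supplied schedule $q (n)$. It is a simplified illustration under several assumptions that are not taken from the source: arrivals are made integer-valued by stochastic rounding of $β + f$, trajectories are capped at `max_periods`, and the two example profiles (`step_profile`, `cyclic_profile`) use invented values. The optimized evader is not simulated here because the penalty function $h_{m}$ is specified only in the Methods.

```python
import math
import random


def simulate_passive(s0, p, arrival, q_of_n, rng, max_periods=10_000):
    """Run one passive trajectory; return (outcome, period, TAA count)."""
    s = s0
    for n in range(max_periods):
        q = q_of_n(n)
        # Recognition: each available TAA is detected with probability q.
        r = sum(rng.random() < q for _ in range(s))
        if r == 0:
            return "escape", n, s
        # Evasion: elimination occurs if any recognized TAA fails to be lost.
        if any(rng.random() >= p for _ in range(r)):
            return "elimination", n, s
        # Assumption: integer arrivals via stochastic rounding of beta + f.
        whole = math.floor(arrival)
        new = whole + (rng.random() < arrival - whole)
        s = s - r + new
    return "equilibrium", max_periods, s


def step_profile(n):
    # Hypothetical step landscape: recognition improves after period 10.
    return 0.5 if n < 10 else 0.8


def cyclic_profile(n):
    # Hypothetical cyclical landscape oscillating between 0.5 and 0.8.
    return 0.65 + 0.15 * math.sin(2.0 * math.pi * n / 20.0)


if __name__ == "__main__":
    rng = random.Random(1)
    for name, profile in (("step", step_profile), ("cyclic", cyclic_profile)):
        runs = [simulate_passive(5, 0.9, 4.39, profile, rng)[0] for _ in range(2000)]
        print(name, runs.count("escape") / len(runs))
```

Passing the schedule as a function keeps the simulator independent of any particular landscape, so step, cyclical, increasing, or decreasing profiles can be swapped in without changing the core loop.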

Figure 3

Passive and optimized evasion strategies for temporally varying recognition profiles.

(A) Temporally varying recognition functions are selected and applied to threats employing passive (blue) and optimized (red) evasion strategies. (B) The mean and standard deviation of escape probabilities are compared across recognition profiles for each strategy (pairwise significance was assessed using a two-sample t-test at significance level $α = 0.05$, with $p < 10^{-5}$). (C) Escape time distributions are generated for step, cyclical, increasing, and decreasing recognition environments (solid line: mean). In each case, the mean total new antigen arrival $β + E [f_{n}]$ for passive (resp. optimized) evasion was 4.39 (resp. 4.75), and $10^{3}$ simulations of $10^{3}$ replicates each were used for statistical comparison; all samples were aggregated for the escape time violin plots.

Figure 3—source data 1 Source data contains a spreadsheet of data for Figure 3B, C.: https://cdn.elifesciences.org/articles/82786/elife-82786-fig3-data1-v1.xlsx

Optimal evaders under effective immune recognition accrue mutations at a fixed rate

One consequence of mean reversion is that mutation accumulation over time, $λ (n)$, is linear in $n$, that is, mutations accrue at a constant rate (see Methods section ‘Mean optimal transitions’):

λ (n) = \frac{2 β \ln (1 - q)}{1 - \ln (1 - q)} n, q > q^{*} = 1 - e^{- 1} .

The prediction of constant accumulation is consistent with empirically observed cancer mutation behavior (Lawrence et al., 2013; Alexandrov et al., 2013). This is not what holds in the impaired case (as will be discussed later), thus suggesting that early cancer progression often proceeds in an environment with effective immune recognition. Additionally, our formula shows that larger mutation rates can be caused by large evasion penalties or by reduced immune recognition. Of course, the TEAL model does not consider any specific features that determine the values of the effective parameters. Instead, its utility is in quantifying the overall effect of reducing antigen detection resulting from, for example, transitions to an immune impaired microenvironment.

Post-escape tumor antigenicity determined by a balance between recognition aggressiveness and local penalties in the immune microenvironment

The prior section related recognition and penalty to observed mutation rates. We now consider their combined effects on tumor immunogenicity following immune escape. The TEAL model represents immunogenicity by the number of available TAAs at the time of cancer detection, an important predictor of immunotherapeutic efficacy (Martin et al., 2016; Samstein et al., 2019; Goodman et al., 2017). We apply the TEAL model to simulate evading cancer populations, focusing exclusively on trajectories that result in tumor escape, to characterize the distribution of available TAAs. This is performed first for increasing immune recognition rates $q$ (Figure 4A) and then for increasing penalty term $β$ (Figure 4B). Our results demonstrate that larger penalties result in higher post-escape TAA levels, while efficient immune recognition depletes available TAAs. The presumptive reason for this latter observation is that escape in the presence of strong immune recognition biases the tumor to have low numbers of TAAs. This prediction agrees with recent empirical observations that strong immune selective pressure in early cancer development results in tumor neoantigen depletion and is prognostic of poor clinical outcome (Rosenthal et al., 2019; Lakatos et al., 2020).

Figure 4

Distribution of available post-escape tumor antigens.

The distribution of tumor-associated antigens (TAAs) was estimated from simulations of optimized cancer evasion resulting in escape and plotted for (A) increasing recognition probability $q \in {0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95}$ and (B) increasing evasion penalty $β \in {0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8}$ . For (A), $β = 0.59$ . For (B), $q = 0.7 > q^{*}$ . In both cases, $s_{\infty} = 5$ and $n = 10^{6}$ simulations were performed for each histogram.

Figure 4—source data 1 Source data contains a spreadsheet of data for Figure 4A, B.: https://cdn.elifesciences.org/articles/82786/elife-82786-fig4-data1-v1.xlsx

Variation in the tumor microenvironment drives the generation of immune hot vs. cold tumors under optimal evasion

In the passive evader case, antigenicity fluctuates around a stable equilibrium that varies directly with penalty and inversely with recognition. The adaptive case gives rise to more complex behavior resulting from impairments in immune recognition or changes in penalty (Figure 5—figure supplements 1 and 2). These changes are important manifestations of disease progression, which may alter the immunogenic landscape via impairments in immune recognition, such as MHC downregulation, co-stimulation alteration, T cell exclusion, or the establishment of a pro-tumor IME via, for example, M2 macrophage polarization (Liu et al., 2021; Goswami et al., 2017). Although many factors may affect recognition rates, for simplicity we shall refer to larger vs. smaller immune recognition rates $q$ as infiltrated vs. excluded.

On the other hand, the generation of new TAA targets is expected to vary substantially across tumor type, for example, due to differing somatic mutation rates. Within a given tumor subtype, variations in the hostility of the IME, resulting from a large variety of possible mechanisms (metabolic, mechanical, cytokine, environment), require cancer populations to undergo greater degrees of adaptation to survive; in our approach, this greater degree of adaptation comes with a greater penalty. Consequently, we relate large vs. small local penalty terms $β$ to anti-tumor vs. pro-tumor IMEs. Conceptually, the baseline state (infiltrated anti-tumor IME) may give rise to three alternative states (excluded anti-tumor IME, infiltrated pro-tumor IME, or excluded pro-tumor IME), based on progression.

Toward this end, we simulate the TEAL model under the above conditions and record post-escape TAA distributions. As already explained, our results predict that infiltrated ($q > q^{*}$) environments lead to an absorbing equilibrium state in the intervening period prior to escape, while exclusion ($q < q^{*}$) results in unstable equilibria. Interestingly, the sign of this equilibrium, and hence the long-term immunogenic trajectory, depends on the sign of $β$ (Equations 88 and 89). The baseline infiltrated anti-tumor case ($q > q^{*}$, $β > 0$) yields a positive and stable, mean-reverting TAA steady state, generating immunogenically ‘warm’ tumors. Excluded anti-tumor IMEs ($q < q^{*}$, $β > 0$) exhibit low recognition and large TAA arrival, giving an unstable TAA steady state that leads to increased immunogenicity over time and hence to ‘hot’ tumors. Furthermore, the infiltrated pro-tumor case ($q > q^{*}$, $β < 0$) demonstrates preserved recognition with low TAA arrival and generates an unphysiological negative stable steady state, thereby predicting that trajectories reduce immunogenicity to zero over time, yielding ‘cold’ tumors. Lastly, excluded pro-tumor IMEs ($q < q^{*}$, $β < 0$), having compromises in both recognition and TAA arrival rate, result in an unstable state, above which trajectories accumulate additional TAAs over time, becoming immunogenically ‘hot,’ and below which the populations are predicted to reduce the number of recognizable TAAs over time, becoming ‘cold’ (Figure 5A and B). Substantial heterogeneity in the distributions of escape time predicts sustained interactions in the unimpaired case (Figure 5—figure supplement 3). Tumor exclusion leads to hot tumors so that escape, should it occur, must do so on average prior to the accumulation of many TAAs. Conversely, a pro-tumor IME with immune recognition drives TAA depletion, so escape occurs relatively early. These results are summarized in Figure 5C.

Figure 5

Active Evader dynamics.

Violin plots of the distribution of post-immune escape. (A) Cumulative mutation burden. (B) Post-escape immunogenicity (available tumor-associated antigens [TAAs]) as a function of time for a variety of tumor immune microenvironment (IME) conditions. (Anti-tumor-infiltrated: $q = q^{*} + 0.1$, $β = 0.529$; anti-tumor-excluded: $q = q^{*} - 0.1$, $β = 0.505$; pro-tumor-infiltrated: $q = q^{*} + 0.1$, $β = - 0.529$; pro-tumor-excluded: $q = q^{*} - 0.1$, $β = - 0.505$. In all cases, $β$ was chosen to give $| s_{\infty} | = 3$ [$s_{\infty} = - 3$ for the pro-tumor-infiltrated case], giving strictly positive penalties. Simulations were run until $n = 10^{6}$ escape events occurred for each case.) (C) The number of recognizable TAAs over time along with equilibrium states is depicted assuming (left) anti-tumor IME, $β > 0$, and efficient immune recognition. Compromises in (top right) recognition, $q < q^{*}$; (bottom right) the establishment of a pro-tumor IME, $β < 0$; or (middle right) both affect the predicted dynamical behavior of tumor immunogenicity. A phase plot partitions each case as a function of relevant critical parameter values.

Figure 5—source data 1 Source data contains a spreadsheet of data for Figure 5A, B.: https://cdn.elifesciences.org/articles/82786/elife-82786-fig5-data1-v1.xlsx

Discussion

The underlying evolutionary dynamics of adaptive populations lies at the heart of many important clinical challenges, including antibiotic resistance, acquired drug resistance, immunotherapy failure, and tumor immune escape. Quantitative analytic modeling will continue to provide improved insight into these complex issues by generating fast and affordable predictions and a convenient theoretical framework for hypothesis testing. To date, virtually all of the current models of cancer evolution and the tumor-immune interaction have assumed passive acquired evolution without allowing the tumor to sense and optimally respond to the current fitness landscape in order to maximize future survival. The ‘optimal escape hypothesis’ is, in our opinion, worth exploring in light of the myriad examples of treatment failure and adaptive resistance.

Our analysis centered on the ability of cancer populations to adaptively respond to a measured immune state, and we have primarily focused on studying subsequent mutations resulting in the disruption of existing (targeted) tumor-associated antigenic targets and on the generation of new ones. It is important to note that independent empirical observations support the ability of cancer cells to sense their IME, and perhaps even the level of CD8+ killing that occurs therein. At the signaling level, IL-6 secreted by CTLs, macrophages, and dendritic cells in response to immune recognition has been shown to directly activate ataxia-telangiectasia mutated (ATM), a factor implicated in response to DNA damage, and this has been associated with increased metastasis and multi-drug resistance in lung cancer (Jiang et al., 2015; Yan et al., 2014). IFN-gamma released by activated CD8+ tumor-infiltrating lymphocytes activates the cell-intrinsic STING pathway in response to DNA damage in cancer, implicating an altered TME from activated CD8+ T cells that is measurable by the cancer (Xiong et al., 2022). Lastly, at the level of individual TCR interactions with recognized tumor cells, granzyme B release has been directly linked to DNA damage and associated CHK2 and p53 stress responses, and studies have demonstrated hSMG-1 stress-activated proteins upregulated in cancer cells following granzyme B treatment (Meslin et al., 2011). Moreover, granzyme released into the microenvironment serves as a signaling molecule promoting a pro-inflammatory response from other immune cells (Cullen et al., 2010). The relatively acute response and short half-lives of downstream effectors (e.g., minutes for p53 and hours for CHK1) provide a tunable response based on the current level of immune targeting through stress-induced mutagenesis (Bindra et al., 2007; Rosenberg, 2001; Rosenberg and Queitsch, 2014) that in our analysis directly influences tumor-associated antigen availability.

Toward this end, we propose and analyze the TEAL model for studying and comparing passive and optimal escape mechanisms in the tumor-immune interaction. We focused our dynamic programming approach on a particular set of relations to provide analytical insight into this process. We do note, however, that the Bellman function approach to dynamic programming can be numerically implemented to obtain solutions for arbitrary functional forms of the penalty function, thereby enabling analysis of more complex assumptions where analytic progress becomes intractable. As expected, threats adopting optimal evasion strategies largely outperform their passive counterparts by increasing the rate of immune escape over prolonged cycles of cancer-immune co-evolution. In the setting of the tumor-immune interaction, the resulting TAAs available for targeting, a proxy for clinical post-detection immunotherapeutic efficacy, are augmented when cancer populations accrue large penalties for evasion and, perhaps surprisingly, when immune recognition is impaired.

Evasion dynamics of passive and active evaders are similar in some respects and different in others. Similarities include the mean-reverting stationary dynamics of both strategies under efficient immune recognition. However, the TEAL model predicts, for adaptive threats in an excluded pro-tumor IME, the emergence of an unstable state, resulting in either accrual or depletion of TAAs in a manner that depends on the current TAA abundance. This splitting behavior into ‘hot’ and ‘cold’ tumors offers insight into the microenvironmental features generating spatial immunogenic diversity within solid tumors and is consistent with prior observations (Huss et al., 2021; Jia et al., 2022; Meiller et al., 2021; Lakatos et al., 2020). This argues that TAA-depleted tumors share the tendency for their evasion strategies to incur fewer antigenic penalties. Our results suggest the possibility of altering the tumor IME to increase the immunogenicity of immune-cold tumors by making evasion more costly in a manner reminiscent of mutational meltdown (Gabriel et al., 1993). We remark that these dynamics are worth considering in the case of adoptive T cell-based immunotherapies, marked by their potential for exerting substantial co-evolutionary pressure on a developing malignancy (George and Levine, 2021). We also predict that impaired immune recognition leads to TAA accumulation, consistent with experimental observations in lung cancer wherein patients with HLA loss of heterozygosity harbored larger mutational burdens, an indirect measure of the TAAs in our model (McGranahan and Swanton, 2017). Lastly, the variable mutation rates of active evaders also distinguish them from passive evaders with fixed mutation rates, and this feature is analogous to that observed in bacterial colonies faced with antibiotic selective pressure (Windels et al., 2019).

More generally, the TEAL framework provides a mechanistic basis for several empirical observations. First, our results would suggest that the lower observed TAA availability of hematological malignancies vs. immune-protected solid tumors, such as melanoma (Lawrence et al., 2013), occurs as a result of greater immune accessibility and possible immunoediting of liquid cancers. Second, our model predicts enhanced immune interactions, both natural and treatment-derived, resulting from increasing the cost of immune evasion in the evading cancer population in order to enrich the TAAs following escape. This supports the utility of neo-adjuvant radiation therapy (McGranahan et al., 2016) or chemotherapy (Mouw et al., 2017) in inducing immunogenicity. Orthogonal efforts to quantify cancer evolution have similarly predicted the benefit of larger evasion rates resulting in mutational meltdown (McFarland et al., 2014). Integrated together, the TEAL model can predict the balance of generated TAAs given the relative influences of recognition and evasion penalty.

Tumor antigen depletion is a concerning consequence of immunotherapy since increased recognition is desirable and required for tumor elimination. In solid tumors, one contributor to this problem is T cell exclusion (Pai et al., 2020). However, should effective treatment and robust tumor recognition lead to relapse, the resulting tumor has a greater chance of being TAA-depleted (Rosenthal et al., 2019). Other strategies that fall in this group include those that effectively reduce recognition, like the presence of T-regulatory cells. Our results suggest that this detrimental effect of targeting can be offset by increasing the ‘hostility of the IME.’ Such strategies make tumor adaptation more penalizing, for example by fostering an anti-tumor environment through M1 macrophage polarization or the inactivation of tumor-associated macrophages (Liu et al., 2021; Goswami et al., 2017).

Of course, this foundational model is not without limitation. At present, we have assumed that the recognition agent is not employing an optimized strategy informed by optimal cancer evasion. Instead, we have detailed our results for arbitrarily imputed recognition landscapes, which is useful for predicting the response of an aggressive evader like cancer to particular immunotherapeutic interventions, such as hematopoietic stem cell transplant and adoptive T cell therapy, where the clinician has temporal control over treatment. Identification of such optimal treatment strategies upon quantification of disease evasion aggressiveness is of paramount importance. In this foundational model, we demonstrated the dynamics of immune recognition of an adaptive population of cancer cells expressing a purely clonal pattern of antigens. Our model implicitly treats antigen loss and the progression of a subpopulation currently adapted to evade immune targeting – either by direct pruning of the fittest subclone or by stochastic emergence and subsequent growth of a new one lacking the targeted antigens – as equivalent. Here, we tracked the fittest clone represented by a core set of clonal antigens. We remark that heterogeneous populations each having a distinct subclonal signature can also be tracked, but the corresponding antigen-driven selection and fitness cost to each clone would be coupled through shared antigens (see ‘Methods’). Finally, we note that this extended approach implicitly assumes that antigen detection rates over a given period are subclone size-independent, given that antigens are tracked over a period where each of the clones with comparable fitness would be detectable by the immune system during their growth trajectory en route to attempted escape.

Lastly, cancers characterized by co-evolutionary dynamics resulting in large variability in population size prior to escape or elimination would require in general that recognition and evasion parameters depend on the current period. While possible to incorporate, we have for foundational understanding assumed these to be constant. In this discrete-time evolutionary model, the intertemporal period considered represents the time period between the earliest moment that the adaptive immune system may identify a cancer clone and the latest point after which such a recognition event would no longer be able to prevent cancer escape (George and Levine, 2020). This effectively gives $q$ a probabilistic representation for the total rate of opportunity to recognize a given TAA during cancer progression. Implementing this model in cancer subtype-specific contexts thus requires a consideration of the per-cell division rates, for example.

We detailed strategies that affect the number of TAAs present following escape. In addition to quantity, variations in individual TAA antigenicity could affect overall immunogenicity, but we do not as yet take this into account. In future work, individual antigenicities could be built in by allowing individual TAA contributions to $s_{n}$ and $q$ to depend on the particular TAA. Many additional features contribute to the immune landscape. Here, we focused on TAA availability and effects of general immune recognition rates and IME hostility on TAA accrual. Future efforts may incorporate additional cancer-specific features, including antigen presentation, immunomodulatory gene expression, and measured immune signatures present in the IME.

These optimized dynamics are proposed in the absence of the precise mechanistic details of cancer decision-making. Further studies linking changes in the evasion rates to cell signaling are necessary next steps at elucidating a possible mechanism of optimal evasion. Our framework serves as a tool for evaluating the extent of evasion aggressiveness in a variety of observed disease contexts, including cancer. Differentiating dynamics of passive and adaptive evasion mechanisms is a first step to understanding this difference, its importance underscored by the large implications such an understanding would have on our approach to treatment.

The TEAL model represents a framework broadly applicable for studying population behavior consistent with optimized collective decision-making, and subsequent experimental validation or refutation is of highest priority. Future work aims to apply this framework to personalize optimal interventions that maximize disease elimination probabilities. Consequently, stochastic analysis and optimal control theory are indispensable tools for better understanding the complex cancer-immune interaction. Defeating an evolving cancer population has posed a persistent challenge to researchers and clinicians, with the majority of progress heralded by fundamental discoveries on cancer behavior, and additional insights require a more detailed understanding of cancer evasion. The possibility that cancer population-level strategies are somewhat informed by the present recognition threat would have a radical effect on our own optimal treatment approach.

Methods

Passive evader in an adaptive environment

Let $S_{n}$ denote the set of tumor antigens recognizable by the immune system and present at period $n$ on a population of cancer cells, and let $s_{n} = | S_{n} |$ count their number ( $| A |$ denotes the cardinality of set $A$ ). From one period to the next, each of the $s_{n}$ detectable antigens may be independently and identically detected by the immune system with probability $q$ per antigen. We let $R_{n} \subseteq S_{n}$ denote the collection of antigens that are recognized by the immune system at time $n$ . As the immune system targets and begins to eliminate cells via the $R_{n}$ antigens, the cancer population has an opportunity to lose or downregulate each of the $r_{n} = | R_{n} |$ recognized antigens in a similarly independent and identical manner. The rate of antigen loss $π_{n}$ may in general vary as a function of time and environmental features (considered in Section Active evader in an adaptive environment). In this section, we assume it is passively fixed and denote this rate as $p$ . We denote the collection of antigens that are lost by the cancer population at time $n$ by $L_{n} \subseteq S_{n}$ . We track the number of recognized and lost antigens at time $n$ by $r_{n}$ and $ℓ_{n} = | L_{n} |$ , respectively, so that $ℓ_{n} \leq r_{n} \leq s_{n}$ .

The system evolves as follows (Figure 1—figure supplements 1 and 2): If $R_{n} = \emptyset$ , then the immune system is unable to recognize any tumor antigen at time $n$ and so the process ends in cancer escape. Since in this case the immune system loses, we denote this event by $L_{n}$ . If $R_{n} \neq \emptyset$ , then the immune system recognizes the threat by at least one TAA and one of two outcomes results: The first possibility is that the cancer population successfully downregulates or loses all of the targeted antigens, expressed as $L_{n} = R_{n}$ , and survives to the next time step. We call this a tie and denote the event by $E_{n}$ . Alternatively, the cancer population is unable to lose every recognized antigen and subsequently becomes eliminated. This means the immune system has won so we denote this event by $W_{n}$ . Although the recognition and evasion probabilities may in general be clonally and temporally dependent, we assume fixed probabilities for the recognition, $q$ , and evasion, $p$ , of individual antigens. In the event of a tie, $s_{n} - r_{n}$ antigens remain, with the addition of a basal antigen arrival rate $β$ and a possibly noisy penalty term $f_{n}$ to reflect the production of new antigens as the population evolves. For simplicity, we assume the $β$ to be constant and the $f_{n}$ a sequence of independent, identically distributed (IID) random variables with mean $f$ . While it is in general possible that the distributions of $r_{n}$ and $ℓ_{n}$ be both state- and time-dependent, we focus on the foundational example above.

This process is identical to the following game between two players, hereafter referred to as the ‘Recognizer’ (immune system) and the ‘Evader’ (threat): the Recognizer starts off with a collection, $S_{0}$ , of $s_{0}$ coins and begins her turn by flipping each coin with IID success probability $q$ . If she has no success ( $R_{0} = \emptyset$ ), she loses (denoted by event $L_{0}$ ) and the game ends. If $r_{0} > 0$ of her coins land on heads, then the next turn goes to the Evader, who proceeds to flip his $r_{0}$ coins with IID success probability $p$ in an attempt to match the Recognizer’s successful coin flips. The Evader must succeed in all coin flips ( $L_{0} = R_{0}$ ) for the turn to end in a tie (equilibrium between Evader and Recognizer), given by event $E_{0}$ . Otherwise, he loses and the game ends with a Recognizer win (event $W_{0}$ ). If a tie occurs then both players restart the game, but only after the removal from $S_{0}$ of the $r_{0}$ coins that landed on heads for both players as well as the addition of a random number $f_{0}$ of new coins. The Evader wins by default if a new turn begins and there are no longer any remaining coins to flip.
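The coin game described above can be sketched as a short simulation. This is a minimal illustration rather than the authors' simulation code: the function names, the integer-valued basal arrival `beta`, and the user-supplied `draw_penalty` callable (returning an integer penalty for each tie) are all assumptions made here for the sake of the example.

```python
import random


def play_round(s, q, p, rng):
    """Play one period with s coins; return (outcome, number recognized)."""
    # Recognizer flips every available coin with success probability q.
    r = sum(rng.random() < q for _ in range(s))
    if r == 0:
        return "L", 0  # no recognition: the Recognizer loses (escape)
    # Evader must match every recognized coin with success probability p.
    lost = sum(rng.random() < p for _ in range(r))
    if lost < r:
        return "W", r  # unmatched recognition: the Recognizer wins
    return "E", r      # all recognized coins matched: tie (equilibrium)


def play_game(s0, q, p, beta, draw_penalty, rng, max_periods=10_000):
    """Repeat rounds until escape ('L') or elimination ('W')."""
    s = s0
    for n in range(max_periods):
        if s <= 0:
            return "L", n  # no coins left: the Evader wins by default
        outcome, r = play_round(s, q, p, rng)
        if outcome != "E":
            return outcome, n
        # After a tie, remove matched coins and add new ones (assumed integers).
        s = s - r + beta + draw_penalty(rng)
    return "E", max_periods  # still in equilibrium when the horizon ends
```

For example, `play_game(10, 0.5, 0.9, 1, lambda rng: rng.randint(0, 2), random.Random(1))` plays one game with a hypothetical uniform penalty on {0, 1, 2}; repeating it over many seeds estimates the escape and elimination frequencies.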

Probability of equilibrium

It is immediately apparent that this game is unfair to the Evader if $s_{0}$ is much larger than 1, unless the recognition probability $q$ is low and the evasion probability $p$ is high. We motivate the following analysis with this in mind and proceed to characterize the dynamics of this stochastic process. Clearly, the number of recognized and lost antigens during each period is binomially distributed, their respective distributions given by

r_{n} \sim Binom (s_{n}, q); ℓ_{n} \sim Binom (r_{n}, p) .

The event that the immune and cancer systems are in equilibrium (non-escape and non-extinction) may be written as

E_{n} = [L_{n} = R_{n} ⊋ \emptyset] = [ℓ_{n} = r_{n} > 0] .

One might expect that the number of antigens lost at time $n$ is affected by knowledge of whether or not the game continues to be played. The distribution of $ℓ_{n}$ conditioned on equilibrium may be characterized by conditioning on the number of recognized antigens at time $n$ . To this end, let $F_{n, r} = [r_{n} = r]$ denote the event that $r$ antigens are recognized at period $n$ , with

\begin{matrix} ℙ (F_{n, r}) = (\binom{s_{n}}{r}) q^{r} {(1 - q)}^{s_{n} - r} . \end{matrix}

We remark that events $\{F_{n, r}\}_{r}$ are disjoint and exhaustive; in other words, for sample space $Ω$ ,

⋃_{r = 0}^{s_{n}} F_{n, r} = Ω; F_{n, i} \cap F_{n, j} = \emptyset, for i \neq j .

Additionally, we note that equilibrium cannot occur if no antigens are recognized (i.e., $F_{n, 0} = [R_{n} = \emptyset]$ ). Lastly,

\begin{matrix} ℙ (E_{n} | F_{n, r}) = p^{r}, \end{matrix}

since if $r$ antigens are recognized then $L_{n} = R_{n}$ occurs if and only if each of the $ℓ_{n} = r_{n}$ recognition positions is exactly matched with $r_{n}$ evasions. We will make use of the following variables to simplify subsequent results:

η \equiv (1 - q) + q p = [1 - q (1 - p)]; γ \equiv 1 - q .

Here, $η$ may be interpreted as the probability of the complement of the following event: ‘recognition occurs without matched evasion for a single antigen.’ In other words, $η$ is the probability that equilibrium exists at one antigen position provided that there is at least one available antigen for immune targeting. This event occurs in one of two disjoint ways for a single antigen: either there is no recognition, and so equilibrium occurs regardless of evasion, or there is recognition that must also be matched by evasion. The joint distribution of recognized and lost antigens is given by the probability mass function

\begin{array}{lr} m (r, ℓ) = P ([r_{n} = r] \cap [ℓ_{n} = ℓ]) \\ = P (ℓ_{n} = ℓ | r_{n} = r) P (r_{n} = r) \\ = \binom{r}{ℓ} p^{ℓ} (1 - p)^{r - ℓ} \cdot \binom{s_{n}}{r} q^{r} (1 - q)^{s_{n} - r} . \end{array}

The probability that equilibrium occurs and the process continues at period $n$ is given by

\begin{array}{lr} P (E_{n}) = \sum_{r = 1}^{s_{n}} m (r, r) \\ = \sum_{r = 1}^{s_{n}} (\binom{s_{n}}{r}) (p q)^{r} (1 - q)^{s_{n} - r} \\ = (1 - q)^{s_{n}} [(\frac{q - p q - 1}{q - 1})^{s_{n}} - 1] \\ = [1 - q (1 - p)]^{s_{n}} - (1 - q)^{s_{n}} \\ = η^{s_{n}} - γ^{s_{n}}, \end{array}

which is equal to the probability of equilibrium occurring at every position minus the probability that all of the $s_{n}$ antigens are not recognized, since at least one recognized antigen is required for equilibrium to occur.
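As a quick numerical check of the closed form above, the following sketch compares $η^{s_{n}} - γ^{s_{n}}$ against the direct binomial sum. The function names are illustrative and are not taken from the original work.

```python
from math import comb


def p_equilibrium(s, q, p):
    """Closed form: eta**s - gamma**s."""
    gamma = 1.0 - q
    eta = 1.0 - q * (1.0 - p)
    return eta**s - gamma**s


def p_equilibrium_by_sum(s, q, p):
    """Direct sum of m(r, r) over r = 1, ..., s."""
    return sum(
        comb(s, r) * (p * q) ** r * (1.0 - q) ** (s - r)
        for r in range(1, s + 1)
    )
```

The two functions should agree to floating-point precision for any $0 \leq p, q \leq 1$ and integer $s \geq 1$ .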

Break-even probability

The process is usually more favorable for the Recognizer. The Recognizer loses at period $n$ if there are zero recognition events, and this occurs with probability

\begin{matrix} ℙ (L_{n}) = γ^{s_{n}} . \end{matrix}

The Recognizer wins at period $n$ if she does not lose or tie, which occurs with probability

\begin{matrix} ℙ (W_{n}) = 1 - (ℙ (E_{n}) + ℙ (L_{n})) = 1 - η^{s_{n}} . \end{matrix}

If $q$ and $s_{n}$ are given, then the evasion probability $p$ required for equal probabilities of Recognizer failure and success, or the break-even probability, is given by

\begin{aligned} p_{even} = \frac{(1 - γ^{s})^{1 / s} - γ}{1 - γ}, \end{aligned}

and exists whenever $p_{even} > 0$ . We plot $p_{even}$ as a function of recognition probability $q$ for various numbers of TAAs, $s$ (Figure 1—figure supplement 5A). The ‘fair-game’ line indicates where the break-even evasion probability is always equal to the recognition probability. Regions where the break-even probability localizes above the fair-game line favor the Recognizer since there the evasion rates $p$ must be higher than recognition rates $q$ for the game to be fair. Alternatively, areas below the break-even curve favor the Evader. It is clear from Figure 1—figure supplement 5B that the process favors recognition for a majority of parameter choices $(p, q)$ in all cases except for when $s = 1$ . Thus, the process is largely unfair and mostly favors the Recognizer over the Evader when $p = q$ so long as $s$ is not small. In order for the Evader to have a reasonable chance of success, either the evasion probability must be very large or the number of TAAs must remain small.
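The break-even expression can be evaluated directly. The sketch below (with an illustrative function name, and assuming $q > 0$ ) computes $p_{even}$ ; a non-positive return value indicates that no break-even evasion probability exists for that choice of $s$ and $q$ .

```python
def break_even_evasion(s, q):
    """Evasion probability p at which P(L_n) equals P(W_n), for q > 0."""
    gamma = 1.0 - q
    # Setting gamma**s = 1 - eta**s gives eta = (1 - gamma**s)**(1/s),
    # and eta = gamma + q * p then yields the expression below.
    return ((1.0 - gamma**s) ** (1.0 / s) - gamma) / (1.0 - gamma)
```

For a single antigen ( $s = 1$ ) this reduces to $(2 q - 1) / q$ , so a break-even point exists only for $q > 1 / 2$ in that case.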

Tracking distinct clones

The above describes a clonal population harboring a core minimal set of TAAs for which recognition and downregulation ultimately determine cancer escape, elimination, or equilibrium. Our model can however be adapted to study the more general scenario involving a clonal hierarchy of heterogeneous cancer cells. We illustrate this by considering a population of cells with a set $C$ of $c = | C |$ core clonal TAAs, together with distinct groups of cells with subclonal collections of TAAs $S_{1}$ and $S_{2}$ (having size $s_{1} = | S_{1} |$ and $s_{2} = | S_{2} |$ , respectively). The relevant populations therefore have antigen sets given by $P_{1} = C \cup S_{1}$ and $P_{2} = C \cup S_{2}$ . The basic event considered in the foundational model, $[r_{n} > 0]$ , must now be replaced by the event that recognition occurs in both $P_{1}$ and $P_{2}$ ; in the absence of recognition of both subclones, the cancer escapes. Recognition happens either if there is a recognition event $r$ in $C$ or if there are simultaneous recognition events $r_{1}$ in $S_{1}$ and $r_{2}$ in $S_{2}$ . Assuming that TAA recognition occurs independently as before with probability $q$ , the total probability of relevant recognition, originally $(1 - γ^{s_{n}})$ , is now given by $(1 - γ^{c}) + γ^{c} (1 - γ^{s_{1}}) (1 - γ^{s_{2}})$ . The first term characterizes the coupling of the fate of both subclones should a common TAA be recognized, while the latter term represents the parallel recognition process required to control each subclone separately via subclonal TAA recognition. Lastly, assuming that recognition proceeds either by a shared TAA in $C$ or instead by subclonal TAAs in both $S_{1}$ and $S_{2}$ , then the probability of elimination and progression proceed identically as before. In the remainder of the discussion, we will, for baseline understanding, only track a core set of clonal antigens on the fittest clone.

Distribution of lost antigens

The process transitions at period $n$ if and only if equilibrium occurs, which means that the number of lost antigens match those recognized and are strictly positive. In other words,

E_{n} = [ℓ_{n} = r_{n} > 0] .

The survival probability as a function of $q$ and $p$ is plotted for various choices of $s$ in Figure 1—figure supplement 6. From this, we find that equilibrium occurs with high probability for large evasion rates, $p$ , as well as for recognition rates $q$ that vary inversely with the number of recognizable antigens. This coincides with conditions that do not disadvantage the Evader so that the equilibrium probability is maintained. We remark that recognition and evasion rates in general vary with the IME. We shall subsequently restrict our attention to large evasion probabilities ( $p > 1 / 2$ ).

Exact dynamics

Let $I_{F}$ denote the usual indicator random variable on event $F$ :

I_{F} (ω) = \begin{cases} 1, & ω \in F; \\ 0, & ω \notin F . \end{cases}

If $r_{n}$ is unknown, then the distribution of $ℓ_{n}$ follows that of $r_{n}$ on a strictly positive outcome normalized to the probability of surviving:

\begin{array}{lr} P (ℓ_{n} = ℓ | E_{n}) = P ([ℓ_{n} = ℓ] \cap [ℓ_{n} = r_{n} > 0]) / P (E_{n}) \\ = P (r_{n} = ℓ_{n} = ℓ > 0) / (η^{s_{n}} - γ^{s_{n}}) \\ = \begin{cases} m (ℓ, ℓ) / P (E_{n}), & 0 < ℓ \leq s_{n}; \\ 0, & ℓ = 0 \end{cases} \\ = I_{[ℓ > 0]} \binom{s_{n}}{ℓ} {[p (1 - γ)]}^{ℓ} \frac{γ^{s_{n} - ℓ}}{η^{s_{n}} - γ^{s_{n}}} . \end{array}

In this case, the mean number of lost antigens conditioned on a tie becomes

\begin{array}{lr} E [ℓ_{n} | E_{n}] = \sum_{ℓ = 0}^{s_{n}} ℓ P (ℓ_{n} = ℓ | E_{n}) \\ = {(η^{s_{n}} - γ^{s_{n}})}^{- 1} \sum_{ℓ = 1}^{s_{n}} ℓ \binom{s_{n}}{ℓ} {[p (1 - γ)]}^{ℓ} γ^{s_{n} - ℓ} \\ = \frac{p (1 - γ) η^{s_{n} - 1}}{η^{s_{n}} - γ^{s_{n}}} s_{n} . \end{array}
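The closed-form conditional mean can be checked against the defining sum. The following is a minimal sketch with illustrative function names; it relies on the identity $\sum_{ℓ} ℓ \binom{s}{ℓ} a^{ℓ} b^{s - ℓ} = s a (a + b)^{s - 1}$ with $a = p q$ and $b = γ$ , so that $a + b = η$ .

```python
from math import comb


def mean_lost_given_tie(s, q, p):
    """Closed form for E[l_n | E_n] with s available antigens."""
    gamma = 1.0 - q
    eta = 1.0 - q * (1.0 - p)
    return p * q * eta ** (s - 1) * s / (eta**s - gamma**s)


def mean_lost_given_tie_by_sum(s, q, p):
    """Same quantity from the conditional probability mass function."""
    gamma = 1.0 - q
    weights = [
        comb(s, l) * (p * q) ** l * gamma ** (s - l) for l in range(1, s + 1)
    ]
    p_tie = sum(weights)  # equals eta**s - gamma**s
    return sum(l * w for l, w in zip(range(1, s + 1), weights)) / p_tie
```

With a single antigen the conditional mean is exactly 1, since a tie then requires that the one antigen be both recognized and lost.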

Of course, for any realized number of recognized antigens $r_{n}$ at period $n$ (event $F_{n, r} = [r_{n} = r]$ ), the number of lost antigens conditional on equilibrium $ℓ_{n}$ is completely determined since

\begin{array}{lr} P (ℓ_{n} = ℓ | E_{n} \cap F_{n, r}) = P (ℓ_{n} = ℓ | ℓ_{n} = r_{n} = r > 0) \\ = I_{[ℓ = r]}, \end{array}

so that the conditional mean number of lost antigens must match exactly those recognized:

\begin{array}{lr} E [ℓ_{n} | E_{n} \cap F_{n, r}] = \sum_{ℓ = 0}^{s_{n}} ℓ \cdot P (ℓ_{n} = ℓ | E_{n} \cap F_{n, r}) \\ = \sum_{ℓ = 0}^{s_{n}} ℓ I_{[ℓ = r]} = r . \end{array}

Mean transition behavior

The state transition equation for this process is given by Equation 1:

s_{n + 1} = s_{n} - ℓ_{n} + β + f_{n},

where $β + f_{n}$ represents the arrival of new antigens through a basal production rate $β$ plus additional antigens $\{f_{n}\}_{n}$ that possibly depend on the evasion strategy employed. In our model, we will assume that the $\{f_{n}\}_{n}$ are IID random penalties with mean $E [f_{n}] = f$ and finite variance (e.g., Poisson-distributed). Given this, we will now characterize the mean transition behavior conditioned on equilibrium and the information available at the present moment. We write $E_{n} [\cdot]$ to denote the conditional expectation with respect to date- $n$ information.

Exact dynamics

The mean number of detectable antigens evolves according to the difference equation (Equation 3):

\begin{array}{lr} E_{n} [s_{n + 1} | E_{n}] = E_{n} [s_{n} - ℓ_{n} + β + f_{n} | E_{n}] \\ = E_{n} [s_{n}] - E_{n} [ℓ_{n} | E_{n}] + β + E [f_{n}] \\ = s_{n} - \frac{p (1 - γ) η^{s_{n} - 1}}{η^{s_{n}} - γ^{s_{n}}} s_{n} + (β + f), \end{array}

which follows since $s_{n}$ is measurable at period $n$ and independent from $E_{n}$ , while $f_{n}$ is independent from period $n$ and $E_{n}$ . This process is mean stationary at $s_{n} = μ$ whenever

Δ s_{n} \equiv E_{n} [s_{n + 1} | E_{n}] - s_{n} = 0

giving

μ = (\frac{β + f}{q}) (\frac{η^{μ} - γ^{μ}}{p η^{μ - 1}}) .

Plots of fixed points of Equation 3 are illustrated in Figure 1—figure supplement 7 for $p > 1 / 2$ and $q$ away from zero for small total mean antigen accumulation rates $β + f$ . As expected, increases in $(β + f)$ result in higher equilibria. In the large $ℙ (E_{n})$ region of interest, increased $q$ results in a lower number of detectable antigens at equilibrium since more are recognized during each period.
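Because the stationarity condition defines $μ$ only implicitly, it has to be solved numerically. One possible approach, sketched below under the assumption that a positive fixed point exists, is plain fixed-point iteration; the function name and the parameter `arrival` (standing for $β + f$ ) are illustrative choices, not the authors' implementation.

```python
def exact_fixed_point(q, p, arrival, tol=1e-12, max_iter=10_000):
    """Iterate mu <- (arrival / q) * (eta**mu - gamma**mu) / (p * eta**(mu - 1))."""
    gamma = 1.0 - q
    eta = 1.0 - q * (1.0 - p)
    scale = arrival / (p * q)
    # The right-hand side equals scale * eta * (1 - (gamma / eta)**mu), which
    # is increasing in mu and bounded above by scale * eta, so start there.
    mu = scale * eta
    for _ in range(max_iter):
        nxt = scale * (eta**mu - gamma**mu) / eta ** (mu - 1.0)
        if abs(nxt - mu) < tol:
            return nxt
        mu = nxt
    return mu  # may be near zero if only the trivial fixed point exists
```

For instance, with $q = 0.6$ , $p = 0.9$ , and $β + f = 2$ , the iteration settles slightly below the value $(β + f) / q \approx 3.33$ .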

Approximate dynamics

If $r_{n}$ is explicitly given, then the mean transition equation simplifies to

\begin{array}{lr} E_{n} [s_{n + 1} | E_{n} \cap F_{n, r}] = E_{n} [s_{n} - ℓ_{n} + β + f_{n} | E_{n} \cap F_{n, r}] \\ = s_{n} - E_{n} [ℓ_{n} | E_{n} \cap F_{n, r}] + β + E [f_{n}] \\ = s_{n} - r_{n} + β + f, \end{array}

since $s_{n}$ is measurable at period $n$ , while $f_{n}$ is independent from period $n$ and $E_{n} \cap F_{n, r}$ . We can use this to approximate the exact recognition dynamics described above by assuming $r_{n} = E_{n} [r_{n}] = q s_{n}$ . In this case, we have Equation 4:

E_{n} [s_{n + 1} | E_{n} \cap F_{n, r}] = (1 - q) s_{n} + β + f .

The equilibrium may be given explicitly as

\tilde{μ} = (β + f) / (1 - γ) = (β + f) / q .

We distinguish the approximate equilibrium $\tilde{μ}$ from that of the exact case $μ$ , the latter incorporating a correction term arising from the fact that knowledge of equilibrium occurring requires a larger average value of $r_{n}$ above $q s_{n}$ since equilibrium occurs only when $r_{n} > 0$ . We remark that the steady states given by Equations 30 and 32 are close to one another for small penalty (Figure 1—figure supplement 8) and parameter regions that overlap with those having large equilibrium probabilities ( $p \sim 1$ , $q > 0.5$ ; Figure 1—figure supplement 6), which intuitively suggests that a process driven by its mean overlaps well with one conditional on equilibrium provided the escape and elimination probabilities are small. We obtain good agreement between averages of large-scale simulations of the process, together with the predicted exact and approximate steady states for $p, q > 0.5$ and small penalty (Figure 1—figure supplement 9). Of course, the mean dynamics are also approximate since $q s_{n}$ is in general non-integer-valued. With this in mind, we focus on the dynamics given by Equation 31.

Here, $r_{n}$ is binomially distributed conditional on the number of current antigens, so that

E_{n} [r_{n}] = q s_{n}; \mathrm{Var}_{n} [r_{n}] = q (1 - q) s_{n} .

We define the following zero-mean noise variable

ε_{n} \equiv (f_{n} - f) - (r_{n} - q s_{n}),

and rewrite Equation 1 as

s_{n + 1} = γ s_{n} + β + f + ε_{n} .

This is none other than a first-order autoregressive, or AR(1), process with innovation terms $ε_{n}$ comprised of endogenous noise due to the variance in the number of recognized antigens and exogenous noise generated by fluctuations in the random penalty term.

The process is stable for all but trivial choices of probability $γ$ . The mean behavior evolves according to

E_{n} [s_{n + 1}] = E_{n} [γ s_{n} + β + f + ε_{n}] = γ s_{n} + β + f,

which ultimately gives Equation 9:

\begin{array}{lr} E [s_{n}] = γ^{n} s_{0} + (β + f) \sum_{j = 0}^{n - 1} γ^{j} \\ = γ^{n} s_{0} + (\frac{1 - γ^{n}}{1 - γ}) (β + f) \\ \to (β + f) / q as n \to \infty, \end{array}

thus showing agreement in mean with the fixed point given by Equation 32. Of course, $s_{n} = \tilde{μ} = (β + f) / q$ satisfies the martingale property:

E [s_{n + 1}] = γ (β + f) / q + (β + f) = (β + f) / q = s_{n},

and the process tends toward steady state with expected intertemporal difference

| E [s_{n + 1}] - E [s_{n}] | = γ^{n} | (β + f) - q s_{0} | .
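The mean recursion and its closed-form solution can be compared numerically. The sketch below uses illustrative function names, with `arrival` standing for $β + f$ ; it iterates $E [s_{n + 1}] = γ E [s_{n}] + β + f$ and evaluates the geometric-series solution given above.

```python
def mean_trajectory(s0, q, arrival, n_periods):
    """Iterate the mean AR(1) recursion starting from s0."""
    gamma = 1.0 - q
    means = [float(s0)]
    for _ in range(n_periods):
        means.append(gamma * means[-1] + arrival)
    return means


def mean_closed_form(s0, q, arrival, n):
    """gamma**n * s0 + (1 - gamma**n) / (1 - gamma) * arrival."""
    gamma = 1.0 - q
    return gamma**n * s0 + (1.0 - gamma**n) / (1.0 - gamma) * arrival
```

Both approach $(β + f) / q$ geometrically at rate $γ$ , so convergence is faster when recognition is more efficient.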

The variance at stationarity, $\mathrm{Var} (s_{n})$ , can be calculated by solving for the fixed point of

\begin{matrix} \mathrm{Var} (s) = γ^{2} \mathrm{Var} (s) + σ_{f}^{2}, \end{matrix}

giving

\mathrm{Var} (s_{n}) = σ_{f}^{2} / (1 - γ^{2}) .

Recognizer success probability

For the event $W_{n}$ (resp. $L_{n}$ ) that the Recognizer wins (resp. loses) at period $n$ , and for the event $E_{n}$ of equilibrium at period $n$ , we have

P (W_{n}) = P (E_{n - 1}) (1 - η^{s_{n}})

\begin{aligned} P (L_{n}) = P (E_{n - 1}) γ^{s_{n}}, \end{aligned}

\begin{aligned} P (E_{n}) = P (E_{n - 1}) (η^{s_{n}} - γ^{s_{n}}) . \end{aligned}

These relationships, along with the implicit evolution given by Equation 32, are used to approximate ultimate Recognizer success probabilities for all possible $p$ and $q$ against several choices of initial antigen number $s_{0}$ and mean antigen arrival rate $β + f$ , and are compared with simulations using actual transitions via Equation 29 (Figure 1—figure supplement 10). We find good agreement between these methods in characterizing the final outcome over a variety of parameter choices, where accuracy is highest in the relevant parameter region of interest. In particular, the left column of Figure 1—figure supplement 10 details the likelihood that a (static) threat is controlled in the special case where no penalty is assumed.
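The recursions above can be accumulated over periods to approximate the ultimate outcome probabilities. The following is a hedged sketch, not the authors' code: it assumes that the antigen count is propagated by the approximate mean map $s_{n + 1} = γ s_{n} + β + f$ (allowing non-integer $s_{n}$ ), and the function name and the parameter `arrival` (standing for $β + f$ ) are illustrative.

```python
def ultimate_outcome_probabilities(s0, q, p, arrival, n_periods=500):
    """Return (P(win), P(lose), P(still in equilibrium)) for the Recognizer."""
    gamma = 1.0 - q
    eta = 1.0 - q * (1.0 - p)
    alive = 1.0   # probability that equilibrium held in all earlier periods
    win = 0.0     # cumulative probability of elimination
    lose = 0.0    # cumulative probability of escape
    s = float(s0)
    for _ in range(n_periods):
        win += alive * (1.0 - eta**s)
        lose += alive * gamma**s
        alive *= eta**s - gamma**s
        s = gamma * s + arrival  # assumed mean-field update of antigen number
    return win, lose, alive
```

By construction the three returned values sum to one after every period, which provides a simple consistency check on any implementation of these recursions.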

Mutation accumulation rate and tumor antigen availability

The above analysis was motivated by a desire to explain both genetic and non-genetic possibilities leading to recognition evasion. We can consider applying this model to strictly describe genetic evasion in the form of somatic mutations leading either to the generation of (recognizable) tumor-associated antigens or to escape via the removal of these antigens. Using the above framework, mutations, denoted by $λ$ , accumulate across each period in proportion to the sum of antigens downregulated to enhance escape and antigens gained via basal arrival and penalty. Thus their rate of accumulation may be expressed by

ν (n) \equiv \frac{Δ λ (n)}{Δ n} \propto ℓ_{n} + β + f_{n} .

Together with the fact that $ℓ_{n} = r_{n}$ during progression, we have for the mean rate of mutant accumulation

\begin{array}{rl} E [ν (n)] & \propto E [E [r_{n} | s_{n}] + (β + f)] \\ & = q E [s_{n}] + β + f \\ & \to 2 (β + f) \quad \text{as } n \to \infty, \end{array}

ultimately giving

λ (n) \propto 2 (β + f) n ,

which predicts that the rate of mutational acquisition is linear in time, consistent with empirical observation (Alexandrov et al., 2013; Lawrence et al., 2013). Heuristically, tumors that survive while accumulating an average of $β + f$ targetable alterations must balance those gains by $β + f$ additional evasion events. This theory predicts, perhaps surprisingly, that the mutation rate is a direct reflection of the penalty paid for cancer progression as a function of the basal antigen arrival rate and contributions from the local environment. Tumors having a more difficult time surviving in a hostile or restrictive environment would be predicted to have higher rates of mutation. In this context, high mutational signatures are predicted to be correlated with tumors that are more susceptible to recognition. For a passive Evader, our theory predicts that the observed mutation rate depends only on basal arrival and mean penalty term for cancer progression, unaffected by recognition rate. On the other hand, the stationary number of available antigens, approximated by $\tilde{μ} = (β + f) / q$ , varies directly with evasion penalty and inversely with antigen recognition rate. Moreover, mutation or adaptation accumulation is expected to converge to a stable steady state for all allowable recognition, evasion, and penalty rates.
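The limit used in this argument can be illustrated numerically. The sketch below iterates the mean of the passive transition, $E [s_{n + 1}] = (1 - q) E [s_{n}] + β + f$, which follows from $E [r_{n} | s_{n}] = q s_{n}$, and then evaluates the mean per-period accumulation $q E [s_{n}] + β + f$ with the proportionality constant set to one. The function name and default values are illustrative assumptions.

```python
def mean_mutation_rate(q, beta_plus_f, s0=10.0, steps=200):
    """Mean per-period accumulation q*E[s_n] + (beta + f) after `steps` periods."""
    if not 0.0 < q <= 1.0:
        raise ValueError("q must lie in (0, 1]")
    mean_s = s0
    for _ in range(steps):
        # mean antigen number: lose q*s on average, gain beta + f
        mean_s = (1.0 - q) * mean_s + beta_plus_f
    # evasion events (q * E[s]) plus newly arriving antigens (beta + f)
    return q * mean_s + beta_plus_f
```

For any starting value the result approaches $2 (β + f)$, independent of $q$, which is the statement that the passive mutation rate is unaffected by the recognition rate.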

Active evader in an adaptive environment

In the previous section, we considered the predicted dynamical behavior when the Evader is assumed to adopt a fixed strategy. In that case, if the number of detectable antigens is moderately large ($s_{0} \sim 10$), then the game is biased against the Evader for most combinations of evasion and recognition success probabilities (Section Break-even probability). Additionally, mean transitions in the number of recognizable antigens obey an AR(1) process tending toward the quotient of the mean penalty and recognition rate (Section Mean transition behavior). Moreover, this behavior predicts that the observed mutation accumulation rate is linear in time and proportional to the sum of basal antigen creation rate and mean penalty term (Section Mutation accumulation rate and tumor antigen availability). Here, we allow for the Evader to optimally select his evasion rate $π_{n}$ at each period (Figure 1—figure supplement 3). Larger success rates come at the cost of adding back more recognition opportunities in the subsequent time step, so that the Evader employs a strategy to maximize his survival or likelihood of escape. This framework is motivated by the observation that cancer threats are known to accumulate perhaps mildly deleterious mutations that occur passively during evolution to obtain rare ‘driver’ mutations (McFarland et al., 2014). The novelty here is that we propose a unifying theoretical framework to investigate the resulting strategy employed by a cancer population if the choice of evasion is planned based on knowledge of the current antigen landscape and hostility, or number of recognized targets.

In contrast with the prior section, which considered temporal evolution as a function of fixed evasion rate $p$ and random penalty $f_{n}$, here the evasion rate $π_{n}$ may depend on time, and for simplicity we consider deterministic penalties. In order to frame this problem in a manner suitable for dynamic programming, we define the necessary parameters, expectation, and value functions below. We assume that the process evolves according to the state transition equation,

s_{n + 1} = s_{n} - r_{n} + β + f_{n},

and that conditional expectations are taken with respect to $ℱ_{n}$ , the natural filtration (Karatzas and Shreve, 1998) with respect to the underlying process.

If at time $n$ the total number of targets $s_{n}$ and the number of recognized targets $r_{n}$ are known, then the Evader’s objective is to select a policy $π \equiv \{π_{n}, π_{n + 1}, \dots\}$ that maximizes the sum of present and future rewards, $R (s_{n}, r_{n}, π_{n})$, which in general depend on the current state, $s_{n}$, as well as the Recognizer, $r_{n}$, and Evader, $π_{n}$, actions. The value function is defined to be the maximal attainable sum of expected future rewards, given by

J_{n} (s_{n}) = \sup_{π} E_{n} [\sum_{m = n}^{\infty} R (s_{m}, r_{m}, π_{m})] .

Problems that may be framed in this context have been well studied and utilize a rich theory of stochastic dynamic programming, originally proposed by Bellman (Bellman, 1954; Bellman and Dreyfus, 1959). Bellman’s Principle of Optimality and the Bellman equation for a stationary solution (independent of starting time) are given via backward induction by

J (s_{n}) = R (s_{n}, r_{n}, π_{n}) + J (s_{n + 1}) .

Equation 49 states that the maximal attainable value at period $n$ is given by the sum of the maximal attainable value at the next time step, $J (s_{n + 1})$, and the $n$-period reward of strategy $π_{n}$ obeying Equation 48. For the problem at hand, we assume that the Evader receives a normalized reward of either $R_{n} = 1$ if it escapes at any time period (there is no temporal discount for escape at later periods), or $R_{n} = 0$ if it is eliminated. In this case, we may draw a decision tree for the $n$-period problem in terms of the value function $J$, current antigen number $s_{n}$, Recognizer antigen recognition miss probability $γ = 1 - q$, number of recognized antigens $r_{n}$, and Evader strategy, $π_{n}$ (Figure 1—figure supplement 4). Here, $π_{n}$ represents the $n$-period probability of antigen loss by the Evader.

Using the dynamic programming principle, the Bellman equation under uncertainty takes the form given by Equation 5:

J (s_{n}) = \max_{π_{n}} \{E_{n} [π_{n}^{r_{n}} [γ^{s_{n + 1}} + (1 - γ^{s_{n + 1}}) J (s_{n + 1})]]\} .

Under a particular choice of assumed penalty and transition equation, we can calculate an exact, closed-form solution to the dynamic program in Equation 5. This solution generates an optimal policy, given by $π^{*} = \{π_{1}^{*}, π_{2}^{*}, \dots, π_{n}^{*}, \dots\}$, a sequence of optimal decisions, in addition to the maximal value at each time assuming the optimal policy, given by ${J (s_{n}) |}_{π_{n}^{*}}$.

Constitutive relations for intertemporal penalty

We make the following assumptions in our setting to make this problem more tractable. The first assumption is that the penalty function is time-homogeneous and deterministic:

f (s_{n}, r_{n}, π_{n}), \quad π_{n} \in [0, 1], \quad s_{n}, r_{n} \in \mathbb{Z}^{+} .

Conditional on progressing to the next period, the transition equation takes the following form:

s_{n + 1} = s_{n} - r_{n} + β + f (s_{n}, r_{n}, π_{n}) .

In cases where we wish to emphasize the dependence of the transition equation on $π_{n}$, we will denote $s_{n + 1}$ by $g (s_{n}, π_{n})$ so that

g (s_{n}, π_{n}) = s_{n + 1} .

The second assumption is that this penalty is $π_{n}$-linear, given by Equation 2:

f (s_{n}, r_{n}, π_{n}) = h_{m} (s_{n}, r_{n}) π_{n}

for positive $h_{m}$.

In order to analytically characterize the solution, we assume that $r_{n}$ is known prior to choosing $π_{n}$ ($r_{n} \in ℱ_{n}$). In the analogous coin game, the Evader is allowed to see the success of his opponent, the Recognizer, prior to choosing a strategy. In this case, the dynamic program has a solution if we also assume that the linear penalty term can be represented by

h_{m} (s_{n}, r_{n}) = \frac{r_{n}}{c} (\frac{1}{δ_{n}} \cdot \frac{1 - γ^{s_{n}}}{1 - γ})^{1 / r_{n}}

with $c \equiv - \ln γ > 0$ and $0 < δ_{n} \leq 1$ . This assumption implies that the marginal penalty of increasing $π_{n}$ is asymptotically proportional to the number of recognized antigens. This is reasonable to assume, for example, in cases where significant immune system recognition and tumor killing create an environment that makes subsequent adaptation more costly, resulting possibly from increased inflammation. The constant $δ_{n}$ , a free variable, is inversely related to aversion of the Evader strategy so that larger values imply a bolder evasion strategy for all else held constant. This parameter may in general vary temporally and as a function of disease subtype.

Dynamic programming solution

In the above case, we may find an exact solution to the optimal programming problem. Since $r_{n} \in ℱ_{n}$ (the filtration generated by the evolution of $s_{n}$ and the Recognizer action at time $n$), the stationary Bellman equation takes the form

J (s_{n}) = \max_{0 \leq π_{n} \leq 1} \{π_{n}^{r_{n}} [γ^{s_{n + 1}} + (1 - γ^{s_{n + 1}}) J (s_{n + 1})]\} .

For simplicity in the subsequent definition, we drop the period index, rewriting Equation 54 as

J (s) = \max_{0 \leq π \leq 1} \{π^{r} [γ^{g (s, π)} + (1 - γ^{g (s, π)}) J (g (s, π))]\} .

Using $c \equiv - \ln γ$ , the first-order condition (FOC) is

\frac{\partial}{\partial π} \{π^{r} [e^{- c g (s, π)} + (1 - e^{- c g (s, π)}) J (g (s, π))]\} = 0 .

In expanded form, the FOC becomes

\begin{array}{ll} 0 = π^{r - 1} \{r [e^{- c g} + (1 - e^{- c g}) J (g)] \\ \quad + π [- c \frac{\partial g}{\partial π} e^{- c g} + c \frac{\partial g}{\partial π} e^{- c g} J (g) + (1 - e^{- c g}) \frac{\partial J}{\partial g} \frac{\partial g}{\partial π}]\} . \end{array}

From Equation 2, we have that

\frac{\partial f}{\partial π} = \frac{\partial g}{\partial π} = h_{m} .

We postulate that the solution takes the form of Equation 6:

J (s) = \frac{A γ^{s}}{1 - γ^{s}},

so that

\frac{\partial J}{\partial s} = - \frac{c J (s)}{(1 - e^{- c s})} .

This, together with Equation 59, reduces Equation 58 to

π^{r - 1} [e^{- c g} + (1 - e^{- c g}) J (g)] (r - c h_{m} π) = 0 .

Thus, the optimal Evader success probability, $π^{*}$ , is given by

π^{*} = \frac{r}{c h_{m}} .

Under Evader optimal strategy, the transition equation in Equation 51 becomes

\begin{array}{rl} g^{*} \equiv g (s, π^{*}) & = s - r + β + f (s, r, π^{*}) \\ & = s - r + (β + r / c) . \end{array}
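The chain from the assumed penalty slope $h_{m}$ to the optimal evasion probability, the optimal penalty, and the next state can be traced numerically. The following is a minimal sketch under the stated constitutive assumption, with $γ = 1 - q$ and $c = - \ln γ$; the function names and the example parameter values are hypothetical.

```python
import math


def penalty_slope(s, r, q, delta=1.0):
    """h_m(s, r) = (r / c) * ((1 / delta) * (1 - gamma^s) / (1 - gamma))^(1 / r), for r >= 1."""
    gamma = 1.0 - q
    c = -math.log(gamma)
    return (r / c) * ((1.0 - gamma**s) / (delta * (1.0 - gamma))) ** (1.0 / r)


def optimal_step(s, r, q, beta, delta=1.0):
    """Return (pi*, f*, g*) for one period with s antigens, r of them recognized."""
    if not (0.0 < q < 1.0 and 1 <= r <= s):
        raise ValueError("requires 0 < q < 1 and 1 <= r <= s")
    c = -math.log(1.0 - q)
    h = penalty_slope(s, r, q, delta)
    pi_star = r / (c * h)            # first-order condition: r - c * h_m * pi = 0
    f_star = h * pi_star             # pi-linear penalty evaluated at the optimum
    g_star = s - r + beta + f_star   # next-period antigen number
    return pi_star, f_star, g_star
```

The optimal penalty $f^{*}$ returned here equals $r / c$ for every $s$ and $δ$, which is why the optimal transition depends on the state only through $s - r$.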

We next confirm that this satisfies the Bellman equation (Equation 55). The above solution implies

J (s) = {π^{*}}^{r} [γ^{g^{*}} + (1 - γ^{g^{*}}) J (g^{*})],

which ultimately yields

A γ^{s} = δ (1 - γ) (1 + A) γ^{β + r / c - r} γ^{s} .

Equating coefficients and applying this logic to each policy gives Equation 7:

A_{n} = \frac{δ_{n} (1 - γ) γ^{β + r / c - r}}{1 - δ_{n} (1 - γ) γ^{β + r / c - r}} .
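The closed form can be checked numerically for a single state. The sketch below builds $A$ from the expression above, evaluates $J (s) = A γ^{s} / (1 - γ^{s})$, and compares it with the right-hand side of the Bellman equation both at $π^{*} = r / (c h_{m})$ and on a grid of $π \in (0, 1]$. It assumes fixed $r$, $β$, and $δ$ and parameter values for which $δ (1 - γ) γ^{β + r / c - r} < 1$; the function name and grid size are illustrative choices.

```python
import math


def check_bellman(s, r, q, beta, delta=1.0, grid=2001):
    """Return (J(s), objective at pi*, best objective on a grid over (0, 1])."""
    gamma = 1.0 - q
    c = -math.log(gamma)
    x = delta * (1.0 - gamma) * gamma ** (beta + r / c - r)
    if x >= 1.0:
        raise ValueError("closed form requires delta*(1-gamma)*gamma^(beta+r/c-r) < 1")
    a = x / (1.0 - x)
    h = (r / c) * ((1.0 - gamma**s) / (delta * (1.0 - gamma))) ** (1.0 / r)

    def value(t):
        # postulated solution J(t) = A gamma^t / (1 - gamma^t)
        return a * gamma**t / (1.0 - gamma**t)

    def objective(pi):
        g = s - r + beta + h * pi  # transition with pi-linear penalty
        return pi**r * (gamma**g + (1.0 - gamma**g) * value(g))

    pi_star = r / (c * h)
    best_on_grid = max(objective(k / (grid - 1)) for k in range(1, grid))
    return value(s), objective(pi_star), best_on_grid
```

For admissible parameters the first two returned numbers agree to floating-point precision, and no grid point exceeds the objective at $π^{*}$.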

The optimal policy (Figure 1—figure supplement 11) is given by (Equation 8) the sequence

π_{n}^{*} = (\frac{δ_{n} (1 - γ)}{1 - γ^{s_{n}}})^{1 / r_{n}} .

We henceforth refer to $δ_{n}$ as the aversion parameter. Large values of $δ_{n}$ imply low aversion. It can be interpreted as the selected strategy in the simplest case where $δ_{n} = δ > 0$ and $s_{n} = r_{n} = 1$, since then

π_{n}^{*} = δ .

Rearranging Equation 8 gives

\frac{1 - γ^{s_{n}}}{1 - γ} = \frac{δ_{n}}{(π_{n}^{*})^{r_{n}}} .

Solution uniqueness

Proposition

The above value function is unique.

Proof

We consider value functions $V (s)$ in the space of functions that are continuous in $π$ and bounded in $s$. We take $\| V \|_{\infty} \equiv \sup_{s} | V (s) |$. From the previous section, we have identified such a function $J$ so that

J (s_{n}) = \max_{0 \leq π \leq 1} π^{r} [γ^{s_{n + 1}} + (1 - γ^{s_{n + 1}}) J (s_{n + 1})] .

Assume that $V (s)$ is another solution. For fixed $s_{n}$, let $π^{*}$ be such that

V (s_{n}) = {π^{*}}^{r} [γ^{s_{n + 1}} + (1 - γ^{s_{n + 1}}) V (s_{n + 1})] .

We can rewrite the following term:

γ^{s_{n + 1}} = γ^{s_{n} - r + h_{m} π + β} = γ^{s_{n} - r + β} (γ^{h_{m}})^{π} \equiv γ^{k_{s}} {\tilde{γ}}^{π},

where $\tilde{γ}, γ^{k_{s}} < 1$ . Then

\begin{array}{ll} V - J = {π^{*}}^{r} [γ^{k_{s}} {\tilde{γ}}^{π^{*}} + (1 - γ^{k_{s}} {\tilde{γ}}^{π^{*}}) V (s_{n + 1})] \\ \quad - \max_{0 \leq π \leq 1} π^{r} [γ^{k_{s}} {\tilde{γ}}^{π} + (1 - γ^{k_{s}} {\tilde{γ}}^{π}) J (s_{n + 1})] \end{array}

\leq {π^{*}}^{r} [γ^{k_{s}} {\tilde{γ}}^{π^{*}} + (1 - γ^{k_{s}} {\tilde{γ}}^{π^{*}}) V (s_{n + 1})] - {π^{*}}^{r} [γ^{k_{s}} {\tilde{γ}}^{π^{*}} + (1 - γ^{k_{s}} {\tilde{γ}}^{π^{*}}) J (s_{n + 1})]

= {π^{*}}^{r} (1 - γ^{k_{s}} {\tilde{γ}}^{π^{*}}) (V (s_{n + 1}) - J (s_{n + 1}))

\leq {π^{*}}^{r} (1 - γ^{k_{s}} {\tilde{γ}}^{π^{*}}) | V (s_{n + 1}) - J (s_{n + 1}) |

\leq {π^{*}}^{r} (1 - γ^{k_{s}} {\tilde{γ}}^{π^{*}}) \| V - J \|_{\infty} .

Note that

C (π) \equiv π^{r} (1 - γ^{k_{s}} {\tilde{γ}}^{π}) \leq 1 - γ^{k_{s}} {\tilde{γ}}^{π}

is increasing in $π$ (since $\tilde{γ} < 1$) so that $C (π) \leq 1 - γ^{k_{s}} \tilde{γ} \equiv K < 1$. Thus,

V - J \leq K \| V - J \|_{\infty} .

By an identical argument with the roles of $V$ and $J$ reversed,

J - V \leq K \| V - J \|_{\infty},

and so

| V (s_{n}) - J (s_{n}) | \leq K \| V - J \|_{\infty} < \| V - J \|_{\infty} \quad \text{for all } s_{n} .

Therefore,

\| V - J \|_{\infty} = \sup_{s_{n}} | V (s_{n}) - J (s_{n}) | < \| V - J \|_{\infty} .

Thus,

\| V - J \|_{\infty} = 0 .

□

Mean optimal transitions

From Equation 63, the mean optimal transitions are

E_{n} [s_{n + 1} | E_{n}] = s_{n} + (1 / c - 1) r_{n} + β .

The mean increment, $Δ s_{n}$ , assuming the process is driven by $r_{n} \sim Binomial (s_{n}, q)$ , becomes

Δ s_{n} = (1 / c - 1) q s_{n} + β .

We next consider two cases. In the first case, the basal antigen creation rate $β$ scales linearly with the number of currently recognized antigens, and in the second case we instead assume that it is fixed.

r_n-linear basal antigen creation rate

This case considers $β = α r_{n}$. Here, larger recognition in the current period results in larger exogenous penalty, and hence easier targeting, in the next period. Consequently, the number of detectable antigens in the future is directly influenced by both the tumor evasion strategy $π^{*}$ and the extent of recognition resulting from immune targeting, $r_{n}$. In this case (Figure 5—figure supplement 1), we have that

E [(1 / c - 1 + α) r_{n} | s_{n}] = (1 / c - 1 + α) q s_{n},

so that the process satisfies the martingale condition

E [s_{n + 1} | s_{n}] = s_{n}

for the critical value

α_{c} = \frac{\log γ^{- 1} - 1}{\log γ^{- 1}} .
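The critical value can be evaluated directly from $q$. The sketch below computes $α_{c} = (c - 1) / c$ with $c = \ln γ^{- 1}$ and the per-period mean growth factor $1 + (1 / c - 1 + α) q$ implied by the conditional expectation above; the function names are hypothetical.

```python
import math


def critical_alpha(q):
    """alpha_c = (ln(1/gamma) - 1) / ln(1/gamma), with gamma = 1 - q."""
    c = -math.log(1.0 - q)
    return (c - 1.0) / c


def mean_growth_factor(q, alpha):
    """E[s_{n+1} | s_n] / s_n = 1 + (1/c - 1 + alpha) * q when beta = alpha * r_n."""
    c = -math.log(1.0 - q)
    return 1.0 + (1.0 / c - 1.0 + alpha) * q
```

A growth factor above (resp. below) one corresponds to $α > α_{c}$ (resp. $α < α_{c}$), and it equals one exactly at $α = α_{c}$.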

Mutation accumulation rate

In the trivial case where $α = α_{c}$, $s$ is constant and so mutation accumulation is predicted to be linear. Contributions by optimal evasion to the mutation rate are expected to exponentially decrease (resp. increase) over time if $α < α_{c}$ (resp. $α > α_{c}$).

In this case, the dynamics and resultant mutation accumulation are determined by $α$ relative to $α_{c}$, and only those $α$ close to the threshold generate behavior resembling linear mutation accumulation. Given this, the added penalty $β (r_{n}) = α r_{n}$ due to the number of recognized antigens appears to be a less reasonable assumption based on empirical mutation rates (Lawrence et al., 2013; Alexandrov et al., 2013). We next consider the case for which the basal antigen creation rate is independent of $r$.

r_n-independent basal antigen creation rate

In this case, $Δ s_{n}$ from Equation 83 becomes

Δ s_{n} = (1 / c - 1) q s_{n} + β .

The recognition dynamics of this case are more complex and partition into three regimes based on recognition relative to a critical threshold $q^{*} = 1 - 1 / e$ (for which $c = 1$ and Equation 87 reduces to $Δ s_{n} = β$): effective immune recognition, critical recognition, and impaired recognition.

Effective immune recognition

Here, $q > q^{*}$, giving $c > 1$. In this case, the Recognizer exerts a large recognition rate on the evading tumor. If $β \leq 0$, then the equilibrium $s^{*}$, for which $Δ s_{n} = 0$, is non-positive, and $s_{n}$ is driven to 0. If $β$ is positive, then there exists a stable, positive antigen state:

s^{*} = \frac{β}{q (1 - 1 / c)}

Trajectories assuming a variety of initial conditions are given with $s^{*} = 10$ in Figure 5—figure supplement 2A.
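Convergence of the mean dynamics to $s^{*}$ under effective recognition can be reproduced with a few lines. The sketch below iterates the mean increment $Δ s_{n} = (1 / c - 1) q s_{n} + β$; the function names and the parameter values in the usage note are illustrative assumptions and are not those used to generate the figure supplement.

```python
import math


def equilibrium(q, beta):
    """s* = beta / (q * (1 - 1/c)); undefined at the critical rate where c = 1."""
    c = -math.log(1.0 - q)
    return beta / (q * (1.0 - 1.0 / c))


def mean_trajectory(s0, q, beta, steps):
    """Iterate s <- s + (1/c - 1) * q * s + beta and return the whole path."""
    c = -math.log(1.0 - q)
    path = [float(s0)]
    for _ in range(steps):
        s = path[-1]
        path.append(s + (1.0 / c - 1.0) * q * s + beta)
    return path
```

For example, with $q = 0.8$ one may choose $β = 10 q (1 - 1 / c)$ so that $s^{*} = 10$; mean trajectories started above or below this value then contract toward it because the per-period multiplier $1 + (1 / c - 1) q$ lies between 0 and 1 when $c > 1$.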

Impaired immune recognition

In contrast with effective recognition, here $q < q^{*}$ and $c < 1$, and the equilibrium points are unstable. Moreover, if $β \geq 0$, then by similar reasoning as above, $s^{*} \leq 0$ so that $s_{n}$ is driven to become very large. Alternatively, if $β < 0$, then setting $Δ s_{n} = 0$ gives the positive equilibrium state

s^{*} = \frac{- β}{q (1 / c - 1)} = \frac{β}{q (1 - 1 / c)},

so that collectively the equilibrium value is given by Equation 10.

Critical immune recognition

At criticality $q = q^{*}$ , $c = 1$ , and Equation 83 simplifies to

Δ s_{n} = β .

In this special case, all randomness imparted to the process by $r_{n}$ is eliminated by a critical offset in the number of recognized antigens and the net addition of new antigens so that the long-term behavior of the process is completely determined by $β$. Predictably, $β > 0$ (resp. $β < 0$) results in net expansion (resp. depletion) of antigens over time, and $β = 0$ is stationary. The sign of $β$ may change as a function of the tumor IME. For example, immune exclusion and the resulting attenuated inflammation may both decrease $q$ and $β$ as well as genetic aberrations involving mismatch repair (MMR) deficiency and microsatellite instability. Other alterations, such as modulated MHC expression, or MHC loss of heterozygosity (LOH), may affect $q$ in isolation (Rosenthal et al., 2019).

Mutation accumulation rate

Critical and impaired immune recognition dynamics follow a similar behavior to that detailed in Section Mean optimal transitions. The effective recognition case bears a resemblance to the approximate dynamics of the informed Evader in Section Mean transition behavior. Here, by a similar argument in Section Mutation accumulation rate and tumor antigen availability once equilibrium is achieved, we have that

ν (n) \equiv \frac{Δ λ (n)}{Δ n} \propto r_{n} + β + f_{n} .

Studying the process at $s_{0} = s^{*}$ given by Equation 88, and $f_{n}^{*} = r_{n} / c$ , we have that

\begin{array}{ll} E_{n} [ν (n) | E_{n}] \propto E_{n} [r_{n} + r_{n} / c + β | E_{n}] \\ = (1 + 1 / c) E_{n} [r_{n} | E_{n}] + β \\ = (1 + 1 / c) q s^{*} + β \\ = β [\frac{(1 + 1 / c)}{(1 - 1 / c)} + 1] \\ = (\frac{2 c}{c - 1}) β . \end{array}

This implies Equation 11:

λ (n) \propto 2 β c n / (c - 1) .

Therefore, linear mutation accumulation as a function of time ensues for an effective Recognizer as in the passive Evader case (Equation 46), this time as a function not only of the basal antigen creation rate $β > 0$ but also of $q$ through $c$ . We recall that under effective recognition, $q^{*} < q < 1$ (equivalently $1 < c < \infty$ ), which ultimately gives via Equation 11

2 β n < λ (n) .

Dynamics summary

The assumption that the basal antigen production depends on recognition $β = α r_{n}$ results in exponential growth or decay in the number of recognizable antigens (and therefore mutation rate), and it was only for a very narrow parameter value $α \sim α_{c}$ for which linear mutation accumulation could occur. It is for this reason that the r_n-linear constitutive assumption is less realistic.

For basal antigen rates $β$ that are $r_{n}$-independent, mutations are predicted to accumulate linearly under effective immune recognition, in a similar manner to that observed in the passive Evader case. In contrast with that case, however, an active Evader executes an optimal strategy to maximize the overall escape probability. This predicts that one effect of dynamic evasion that maximizes escape probability is a concomitant increase in the mutation accumulation rate relative to the passive case via a correction term $c / (c - 1)$. This enhancement becomes negligible when recognition is very aggressive ($q \to 1$) and becomes large when $q$ approaches the critical detection rate.

Interestingly, the active evasion strategy predicts that mutation accumulation rates vary as a function of recognition pressure, in contrast with the passive evasion model. Additionally, disease progression may affect immune recognition (changes in $q$) and tumor evasion penalty (changes in $β$). While the number of recognizable TAAs for the passive case continues to evolve according to the mean-reverting process, there is a dramatic discontinuity in active systems whereby recognition rates below a critical threshold may result in unstable behavior prior to escape (Figure 5—figure supplement 2).

Optimal evasion strategy

From Equations 6–8, we have

\begin{aligned} J (s_{n}, r_{n}) = \frac{A_{n} e^{- c s_{n}}}{1 - e^{- c s_{n}}}, \end{aligned}

\begin{aligned} A_{n} = \frac{δ_{n} q e^{- (1 - c) r_{n} - c β}}{1 - δ_{n} q e^{- (1 - c) r_{n} - c β}}, \end{aligned}

and

\begin{aligned} π_{n}^{*} = {(\frac{δ_{n} q}{1 - (1 - q)^{s_{n}}})}^{1 / r_{n}} . \end{aligned}

Thus,

J (s_{0}, r_{0}) = \frac{δ_{0} q e^{- (1 - c) r_{0} - c β}}{1 - δ_{0} q e^{- (1 - c) r_{0} - c β}} \cdot \frac{e^{- c s_{0}}}{1 - e^{- c s_{0}}} .

We note that $s_{n} = s_{n - 1} + (1 / c - 1) r_{n - 1} + β$; therefore

\begin{array}{lr} e^{- c s_{n}} = γ^{s_{n - 1} + (1 - c) r_{n - 1} / c + β} \\ = γ^{s_{n - 2} + (1 - c) (r_{n - 1} + r_{n - 2}) / c + 2 β} \\ = . . . = γ^{s_{0} + n β + C_{γ} R_{n - 1}}, \end{array}

where

C_{γ} \equiv \frac{1 - \ln γ^{- 1}}{\ln γ^{- 1}}

and

R_{n} \equiv \sum_{j = 1}^{n} r_{j} .

By iteratively applying Equation 98, we ultimately obtain the value function in terms of the history of the environmental landscape, $\{r_{n}\}_{n}$:

J (s_{n}, r_{n}) = \frac{δ_{n} q (1 - q)^{C_{γ} r_{n} + β}}{1 - δ_{n} q (1 - q)^{C_{γ} r_{n} + β}} \cdot \frac{(1 - q)^{s_{0} + n β + C_{γ} R_{n - 1}}}{1 - (1 - q)^{s_{0} + n β + C_{γ} R_{n - 1}}} .

We remark that this simplifies for constant $δ_{n} = δ$ , which we will typically take as 1.
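To make the dependence on the recognition history concrete, the sketch below evaluates the value function period by period, using the exponential forms of $J$ and $A_{n}$ given at the start of this section together with the optimal transition $s_{n + 1} = s_{n} + (1 / c - 1) r_{n} + β$. It assumes constant $δ$ and rejects parameter combinations for which $δ q e^{- (1 - c) r_{n} - c β} \geq 1$, where $A_{n}$ would no longer be positive. The function name is hypothetical.

```python
import math


def value_along_history(s0, history, q, beta, delta=1.0):
    """Return [J(s_n, r_n)] for a recognition history r_0, r_1, ... starting from s_0."""
    c = -math.log(1.0 - q)
    s, values = float(s0), []
    for r in history:
        x = delta * q * math.exp(-(1.0 - c) * r - c * beta)
        if x >= 1.0:
            raise ValueError("A_n is not positive for these parameters")
        a_n = x / (1.0 - x)
        values.append(a_n * math.exp(-c * s) / (1.0 - math.exp(-c * s)))
        s = s + (1.0 / c - 1.0) * r + beta  # optimal transition to the next state
    return values
```

At $q = 1 - 1 / e$ the factor $(1 / c - 1)$ vanishes, so two histories of equal length give the same sequence of values regardless of how many antigens were recognized in each period.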

Critical recognition

At the critical value of recognition $q^{*} = 1 - 1 / e$ ($c = 1$), $C_{γ} = 0$ and the dynamics become deterministic. Here, the value of the present state depends only on the initial number of detectable antigens and the number of periods that have elapsed and is independent of the history of recognized antigens $\{r_{n}\}_{n}$:

J (s_{n}, r_{n}) = \frac{δ_{n} q (1 - q)^{β}}{1 - δ_{n} q (1 - q)^{β}} \cdot \frac{(1 - q)^{s_{0} + n β}}{1 - (1 - q)^{s_{0} + n β}} .

Non-critical recognition

We recall that the value function carries meaning as the maximal attainable expected future value. Under effective recognition ($c > 1$), $C_{γ} < 0$ and $γ^{C_{γ} r}$ is increasing in $r$, so that the corresponding term of the value function (Equation 101) increases with the number of recognized antigens.

We are motivated to consider either mild or aggressive recognition of Section 5.2.4. We will assume that there is minimal aversion so that $δ_{n} = 1$ .

Predicted dynamical behavior

From Section Mean optimal transitions, the dynamical behavior of the number of recognizable TAAs, or immunogenicity, of an active Evader is determined by $β$ and $q$. Disease progression may ultimately affect immune recognition (reducing $q$) and/or basal tumor antigen creation (reducing $β$). $β$ is expected to vary widely across tumor types. Within a given tumor subtype, the extent of environmental hostility is expected to require additional tumor adaptation that may manifest as additional TAA targets. Therefore, larger (resp. smaller) evasion penalties $β$ correspond with anti-tumor (resp. pro-tumor) IME. Similarly, larger (resp. smaller) $q$ corresponds to infiltrated (resp. excluded) environments, and from this we model four possible states: anti-tumor-infiltrated, anti-tumor-excluded, pro-tumor-infiltrated, and pro-tumor-excluded. The model predicts that infiltrated ($q > q^{*}$) environments lead to an absorbing equilibrium state in the intervening period prior to escape, while excluded ($q < q^{*}$) environments result in unstable equilibria. Interestingly, the sign of the equilibrium, and hence the behavior, depends on $β$, and leads to dramatically diverse behavior in the antigenicity of a dominant tumor clone as it progresses via immune recognition. This case is meaningful as long as the intertemporal penalty under the optimal strategy, $β + f_{n}^{*}$, remains non-negative whenever there is at least one recognition event. This is equivalent to the condition that $f_{n}^{*} + β \geq 1 / \ln γ^{- 1} + β > 0$, which is assumed in all examples that follow. These results are summarized in Figure 5 and organized below. The corresponding immunogenicity and cumulative mutations following escape are given by Figure 4, with the timing of escape and example trajectories given by Figure 5—figure supplement 3.

Anti-tumor-infiltrated ($q > q^{*}$, $β > 0$): This stable steady state is positive, so that the process is mean-reverting, and generates immunogenically ‘warm’ tumors.
Anti-tumor-excluded ( $q < q^{*}$ , $β > 0$ ): Here, recognition is low, while the arrival of new TAAs is large. This unstable steady state is negative, so that all trajectories tend to increase their immunogenicity over time, resulting in ‘hot’ tumors.
Pro-tumor-infiltrated ( $q > q^{*}$ , $β < 0$ ): In this case, recognition is large while the arrival of new TAAs is low. This stable steady state is negative, so that all trajectories tend to reduce their immunogenicity to zero over time, yielding ‘cold’ tumors.
Pro-tumor-excluded ( $q < q^{*}$ , $β < 0$ ): Lastly, if both recognition and new TAA arrival rates are low, then there is a positive unstable state, above which trajectories accumulate additional TAAs over time, becoming ‘hot,’ and below which the populations are predicted to reduce the number of recognizable TAAs over time, becoming ‘cold.’
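The four states listed above can be summarized as a small classifier on $(q, β)$. The sketch below returns the state label, the equilibrium $s^{*} = β / (q (1 - 1 / c))$, and whether that equilibrium is stable; the function name, the dictionary keys, and the decision to reject the boundary cases $β = 0$ and $q = q^{*}$ are assumptions made for illustration.

```python
import math

Q_CRIT = 1.0 - math.exp(-1.0)  # critical recognition rate q* = 1 - 1/e


def classify_ime(q, beta):
    """Classify the immune microenvironment and report the mean-dynamics equilibrium."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    if beta == 0.0 or math.isclose(q, Q_CRIT):
        raise ValueError("boundary cases (beta = 0 or q = q*) are not classified here")
    c = -math.log(1.0 - q)
    s_star = beta / (q * (1.0 - 1.0 / c))
    side = "anti-tumor" if beta > 0 else "pro-tumor"
    access = "infiltrated" if q > Q_CRIT else "excluded"
    # the equilibrium is stable exactly when recognition is effective (c > 1)
    return {"state": f"{side}-{access}", "s_star": s_star, "stable": q > Q_CRIT}
```

The sign of the returned `s_star` matches the descriptions in the list: positive for the anti-tumor-infiltrated and pro-tumor-excluded states, negative for the other two.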

These predicted dynamics parallel the observation that tumors under active immunosurveillance via effective recognition undergo significant immunoediting. Our results predict that the resulting tumor becomes ‘warm’ or ‘cold’ depending on the extent of new TAA arrival during active evasion. On the one hand, impaired recognition leads to diverse behavior dependent on the rate at which new TAAs are acquired during active evasion. If this acquisition rate is large, then the tumor accumulates TAAs over time to become ‘hot.’ On the other hand, tumors subject to reduced selection pressures may evolve as immune-hot or immune-cold tumors, consistent with previous observations (Lakatos et al., 2020). Moreover, the effect of reducing immune recognition leads to an accumulation of TAAs over time, consistent with experimental observations in lung cancer wherein patients with HLA loss of heterozygosity harbored larger mutational burdens, an indirect measure of TAA number in our model (McGranahan and Swanton, 2017). Our predictions suggest that immunogenicity ultimately depends on the number of detectable TAAs at the time of impaired immune recognition, suggesting that TAA-depleted tumors share in common the tendency for their evasion strategies to incur less antigenic penalties. Our results would predict the utility of altering the tumor microenvironment to increase the immunogenicity of immune-cold tumors by making evasion more costly in a manner reminiscent of mutational meltdown (Gabriel et al., 1993). We remark that these dynamics are worth considering in the case of adoptive T cell-based immunotherapies, which have a large potential for exerting substantial co-evolutionary pressure on a developing malignancy (George and Levine, 2021).

Survival benefit of active evasion

From the above analysis, immunogenicity dynamics of an active Evader are closest to those of a mean-reverting passive Evader under the anti-tumor-infiltrated case. Given this, we study the dynamics under active and passive evasion as well as the distribution of escape times and probability of escape (Figure 2). For a reasonable comparison, we fix $q$ and $s^{*}$ for each case, and the passive evasion rate $p$ is chosen to match the stationary mean optimal evasion rate $π^{*}$. Our simulations result in escape occurring 1.6 times more frequently under active evasion. Moreover, active evasion exhibits a broader distribution of elimination and escape times (Mean Passive Escape = 6.0, Var Passive Escape = 25.0, Mean Passive Elimination = 6.1, Var Passive Elimination = 30.1; Mean Active Escape = 7.2, Var Active Escape = 35.8, Mean Active Elimination = 6.7, Var Active Elimination = 38.0). Our results demonstrate that active evasion allows an Evader to adapt to the observed recognition and, despite continual penalty, allows an Evader to ‘out-wait’ a Recognizer in order to undergo escape.
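A comparison of this kind can be prototyped with a short Monte Carlo routine. The sketch below is not the code used for the reported results (that is available in the cited repository); it is a simplified illustration under several assumptions that are ours: each period draws $r_{n} \sim Binomial (s_{n}, q)$, the Evader escapes when $r_{n} = 0$, survives recognition with probability $π^{r_{n}}$ and is eliminated otherwise, non-integer antigen numbers are rounded before the binomial draw, and the passive penalty is a fixed number rather than a random variable. All function names are hypothetical.

```python
import math
import random


def simulate(s0, q, beta, policy, rng, max_periods=1000):
    """Run one trajectory; policy(s, r) returns (evasion probability, penalty f)."""
    s = float(s0)
    for n in range(1, max_periods + 1):
        k = max(int(round(s)), 0)                    # assumption: round to an integer count
        r = sum(rng.random() < q for _ in range(k))  # r ~ Binomial(k, q)
        if r == 0:
            return "escape", n                       # nothing recognized
        pi, f = policy(k, r)
        if rng.random() >= pi**r:
            return "elimination", n                  # at least one recognized antigen not evaded
        s = s - r + beta + f                         # transition to the next period
    return "equilibrium", max_periods


def passive_policy(p, f):
    """Fixed evasion rate p and fixed penalty f (the latter is a simplifying assumption)."""
    def policy(s, r):
        return p, f
    return policy


def active_policy(q, delta=1.0):
    """Optimal rate pi* = (delta*q / (1 - gamma^s))^(1/r) with optimal penalty r / c."""
    gamma = 1.0 - q
    c = -math.log(gamma)

    def policy(s, r):
        return (delta * q / (1.0 - gamma**s)) ** (1.0 / r), r / c
    return policy
```

Escape frequencies and timing distributions under either strategy can then be estimated by calling `simulate` repeatedly with a seeded `random.Random` instance and tallying the returned outcomes and periods.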

Exogenous recognition

One powerful advantage of this approach is that the theoretical predictions are not limited by the underlying distribution of $r_{n}$ driving the process. In fact, the optimal policies and value function can handle any temporally varying recognition landscape, $\{r_{n}\}_{n}$, so long as $0 \leq r_{n} \leq s_{n}$. We consider the effects of step, cyclical, increasing, and decreasing recognition landscapes on the relative evasion probability for populations adopting either a passive or active strategy (Figure 3).

In addition to arbitrary recognition landscapes, our dynamic programming approach may be applied to understand the effects of immunotherapeutic intervention, whereby immune escape can be modeled as a range of possible behavior on the spectrum of passive evasion to the most aggressive (active) evasion. For example, the active evasion dynamics assuming an anti-tumor-infiltrated case are similar to those of passive evasion. In both cases, the process escapes with immunogenicity values that fluctuate around a stationary $s^{*}$. We can recover the relationship between $s^{*}$ and mutation rate $ν (n) = Δ λ / Δ n$ via Equations 32 and 46 for the passive case and Equations 88 and 11 for the active case. In both cases, the result is similar:

s^{*} = \frac{ν}{2 q},

demonstrating that immunogenicity, and thus the success likelihood of immunotherapeutic intervention, varies directly with mutation rate and inversely with recognition rate. This theory predicts that escape to a cold tumor is more likely when $s^{*}$ is close to 0 and is akin to complete evasion as modeled previously (George and Levine, 2018), contrasting with temporary evasion that may be recognized subsequently (George and Levine, 2020). All else equal, higher mutational rates can lead to higher predicted efficacy via higher $s^{*}$, but this is not the only way, as concomitantly high rates of recognition can drive $s^{*}$ down, thereby reducing predicted efficacy. In Equation 103, it is clear that a better immunotherapy prognosis occurs when the mutational rate is higher and the recognition rate is also low since $s^{*}$ is predicted large in this case. Figure 5—figure supplement 4 summarizes the behavior of an adaptive Evader subject to a temporally varying recognition pressure.
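The common relationship between $s^{*}$, $ν$, and $q$ can be verified from the two sets of closed-form expressions. In the sketch below the proportionality constant in the mutation rate is taken to be one, the passive case uses $s^{*} = (β + f) / q$ with $ν = 2 (β + f)$, and the active case uses $s^{*} = β / (q (1 - 1 / c))$ with $ν = 2 β c / (c - 1)$ under effective recognition; the function names are hypothetical.

```python
import math


def passive_relation(q, beta_plus_f):
    """Return (s*, nu / (2q)) for the passive Evader."""
    s_star = beta_plus_f / q
    nu = 2.0 * beta_plus_f
    return s_star, nu / (2.0 * q)


def active_relation(q, beta):
    """Return (s*, nu / (2q)) for the active Evader under effective recognition (c > 1)."""
    c = -math.log(1.0 - q)
    if c <= 1.0:
        raise ValueError("effective recognition requires q > 1 - 1/e")
    s_star = beta / (q * (1.0 - 1.0 / c))
    nu = 2.0 * beta * c / (c - 1.0)
    return s_star, nu / (2.0 * q)
```

In both functions the two returned numbers coincide, which is the content of the identity relating $s^{*}$ to $ν / 2 q$.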

Data availability

All data generated or analyzed in this study are included in the supplementary data files. Source code is publicly available as a git repository (George, 2022).

References

Alexandrov LB, Nik-Zainal S, Wedge DC, Aparicio SAJR, Behjati S, Biankin AV, Bignell GR, Bolli N, Borg A, Børresen-Dale A-L, Boyault S, Burkhardt B, Butler AP, Caldas C, Davies HR, Desmedt C, Eils R, Eyfjörd JE, Foekens JA, Greaves M, Hosoda F, Hutter B, Ilicic T, Imbeaud S, Imielinski M, Jäger N, Jones DTW, Jones D, Knappskog S, Kool M, Lakhani SR, López-Otín C, Martin S, Munshi NC, Nakamura H, Northcott PA, Pajic M, Papaemmanuil E, Paradiso A, Pearson JV, Puente XS, Raine K, Ramakrishna M, Richardson AL, Richter J, Rosenstiel P, Schlesner M, Schumacher TN, Span PN, Teague JW, Totoki Y, Tutt ANJ, Valdés-Mas R, van Buuren MM, van ’t Veer L, Vincent-Salomon A, Waddell N, Yates LR, Australian Pancreatic Cancer Genome Initiative, ICGC Breast Cancer Consortium, ICGC MMML-Seq Consortium, ICGC PedBrain, Zucman-Rossi J, Futreal PA, McDermott U, Lichter P, Meyerson M, Grimmond SM, Siebert R, Campo E, Shibata T, Pfister SM, Campbell PJ, Stratton MR
(2013) Signatures of mutational processes in human cancer
Nature 500:415–421.

https://doi.org/10.1038/nature12477
1. Al Mamun AAM
2. Lombardo M-J
3. Shee C
4. Lisewski AM
5. Gonzalez C
6. Lin D
7. Nehring RB
8. Saint-Ruf C
9. Gibson JL
10. Frisch RL
11. Lichtarge O
12. Hastings PJ
13. Rosenberg SM
(2012) Identity and function of a large gene network underlying mutagenic repair of DNA breaks
Science 338:1344–1348.

https://doi.org/10.1126/science.1226683
1. Bellman R
(1954) The theory of dynamic programming
Bulletin of the American Mathematical Society 60:503–515.

https://doi.org/10.1090/S0002-9904-1954-09848-8
1. Bellman R
2. Dreyfus S
(1959) Functional approximations and dynamic programming
Mathematical Tables and Other Aids to Computation 13:247.

https://doi.org/10.2307/2002797
1. Bergholz JS
2. Zhao JJ
(2021) How compensatory mechanisms and adaptive rewiring have shaped our understanding of therapeutic resistance in cancer
Cancer Research 81:6074–6077.

https://doi.org/10.1158/0008-5472.CAN-21-3605
(2007) Regulation of DNA repair in hypoxic cancer cells
Cancer Metastasis Reviews 26:249–260.

https://doi.org/10.1007/s10555-007-9061-3
1. Couzin-Frankel J
(2013) Cancer immunotherapy
Science 342:1432–1433.

https://doi.org/10.1126/science.342.6165.1432
(2010) Granzymes in cancer and immunity
Cell Death and Differentiation 17:616–623.

https://doi.org/10.1038/cdd.2009.206
(2013) pH sensing and regulation in cancer
Frontiers in Physiology 4:370.

https://doi.org/10.3389/fphys.2013.00370
(2016) Fluctuating fitness shapes the clone-size distribution of immune repertoires
PNAS 113:274–279.

https://doi.org/10.1073/pnas.1512977112
1. Dunn GP
2. Bruce AT
3. Ikeda H
4. Old LJ
5. Schreiber RD
(2002) Cancer immunoediting: from immunosurveillance to tumor escape
Nature Immunology 3:991–998.

https://doi.org/10.1038/ni1102-991
(2004) The three Es of cancer immunoediting
Annual Review of Immunology 22:329–360.

https://doi.org/10.1146/annurev.immunol.22.012703.104803
1. Eigl BJC
2. Eggener SE
3. Baybik J
4. Ettinger S
5. Chi KN
6. Nelson C
7. Wang Z
8. Gleave ME
(2005) Timing is everything: preclinical evidence supporting simultaneous rather than sequential chemohormonal therapy for prostate cancer
Clinical Cancer Research 11:4905–4911.

https://doi.org/10.1158/1078-0432.CCR-04-2140
1. Foo J
2. Michor F
(2009) Evolution of resistance to targeted anti-cancer therapies during continuous and pulsed administration strategies
PLOS Computational Biology 5:e1000557.

https://doi.org/10.1371/journal.pcbi.1000557
(1993) Muller’s ratchet and mutational meltdowns
Evolution; International Journal of Organic Evolution 47:1744–1757.

https://doi.org/10.1111/j.1558-5646.1993.tb01266.x
(2017) Effects of thymic selection on T cell recognition of foreign and tumor antigenic peptides
PNAS 114:E7875–E7881.

https://doi.org/10.1073/pnas.1708573114
1. George JT
2. Levine H
(2018) Stochastic modeling of tumor progression and immune evasion
Journal of Theoretical Biology 458:148–155.

https://doi.org/10.1016/j.jtbi.2018.09.012
1. George JT
2. Levine H
(2020) Sustained coevolution in a stochastic model of cancer-immune interaction
Cancer Research 80:811–819.

https://doi.org/10.1158/0008-5472.CAN-19-2732
1. George JT
2. Levine H
(2021) Implications of tumor-immune coevolution on cancer evasion and optimized immunotherapy
Trends in Cancer 7:373–383.

https://doi.org/10.1016/j.trecan.2020.12.005
Software
1. George JT
(2022) TumorEvasionAntigenLoss, version v1.0
Zenodo.

https://doi.org/10.5281/zenodo.7145286
1. Goodman AM
2. Kato S
3. Bazhenova L
4. Patel SP
5. Frampton GM
6. Miller V
7. Stephens PJ
8. Daniels GA
9. Kurzrock R
(2017) Tumor mutational burden as an independent predictor of response to immunotherapy in diverse cancers
Molecular Cancer Therapeutics 16:2598–2608.

https://doi.org/10.1158/1535-7163.MCT-17-0386
1. Goswami KK
2. Ghosh T
3. Ghosh S
4. Sarkar M
5. Bose A
6. Baral R
(2017) Tumor promoting role of anti-tumor macrophages in tumor microenvironment
Cellular Immunology 316:1–10.

https://doi.org/10.1016/j.cellimm.2017.04.005
(2019) Phenotypic plasticity: driver of cancer initiation, progression, and therapy resistance
Cell Stem Cell 24:65–78.

https://doi.org/10.1016/j.stem.2018.11.011
1. Huss R
2. Schmid C
3. Manesse M
4. Thagaard J
5. Maerkl B
(2021) Immunological tumor heterogeneity and diagnostic profiling for advanced and immune therapies
Advances in Cell and Gene Therapy 4:e113.

https://doi.org/10.1002/acg2.113
(2006) Evolution of resistance during clonal expansion
Genetics 172:2557–2566.

https://doi.org/10.1534/genetics.105.049791
1. Jamal-Hanjani M
2. Wilson GA
3. McGranahan N
4. Birkbak NJ
5. Watkins TBK
6. Veeriah S
7. Shafi S
8. Johnson DH
9. Mitter R
10. Rosenthal R
11. Salm M
12. Horswell S
13. Escudero M
14. Matthews N
15. Rowan A
16. Chambers T
17. Moore DA
18. Turajlic S
19. Xu H
20. Lee S-M
21. Forster MD
22. Ahmad T
23. Hiley CT
24. Abbosh C
25. Falzon M
26. Borg E
27. Marafioti T
28. Lawrence D
29. Hayward M
30. Kolvekar S
31. Panagiotopoulos N
32. Janes SM
33. Thakrar R
34. Ahmed A
35. Blackhall F
36. Summers Y
37. Shah R
38. Joseph L
39. Quinn AM
40. Crosbie PA
41. Naidu B
42. Middleton G
43. Langman G
44. Trotter S
45. Nicolson M
46. Remmen H
47. Kerr K
48. Chetty M
49. Gomersall L
50. Fennell DA
51. Nakas A
52. Rathinam S
53. Anand G
54. Khan S
55. Russell P
56. Ezhil V
57. Ismail B
58. Irvin-Sellers M
59. Prakash V
60. Lester JF
61. Kornaszewska M
62. Attanoos R
63. Adams H
64. Davies H
65. Dentro S
66. Taniere P
67. O’Sullivan B
68. Lowe HL
69. Hartley JA
70. Iles N
71. Bell H
72. Ngai Y
73. Shaw JA
74. Herrero J
75. Szallasi Z
76. Schwarz RF
77. Stewart A
78. Quezada SA
79. Le Quesne J
80. Van Loo P
81. Dive C
82. Hackshaw A
83. Swanton C
84. TRACERx Consortium
(2017) Tracking the evolution of non-small-cell lung cancer
The New England Journal of Medicine 376:2109–2121.

https://doi.org/10.1056/NEJMoa1616288
1. Jia Q
2. Wang A
3. Yuan Y
4. Zhu B
5. Long H
(2022) Heterogeneity of the tumor immune microenvironment and its clinical relevance
Experimental Hematology & Oncology 11:24.

https://doi.org/10.1186/s40164-022-00277-y
1. Jiang YN
2. Yan HQ
3. Huang XB
4. Wang YN
5. Li Q
6. Gao FG
(2015) Interleukin 6 trigged ataxia-telangiectasia mutated activation facilitates lung cancer metastasis via MMP-3/MMP-13 up-regulation
Oncotarget 6:40719–40733.

https://doi.org/10.18632/oncotarget.5825
1. Karatzas I
2. Shreve SE
(1998) Brownian motion
In: Brownian Motion and Stochastic Calculus. New York, NY: Springer. pp. 47–127.

https://doi.org/10.1007/978-1-4612-0949-2
1. Koebel CM
2. Vermi W
3. Swann JB
4. Zerafa N
5. Rodig SJ
6. Old LJ
7. Smyth MJ
8. Schreiber RD
(2007) Adaptive immunity maintains occult cancer in an equilibrium state
Nature 450:903–907.

https://doi.org/10.1038/nature06309
1. Komarova N
(2006) Stochastic modeling of drug resistance in cancer
Journal of Theoretical Biology 239:351–366.

https://doi.org/10.1016/j.jtbi.2005.08.003
1. Lakatos E
2. Williams MJ
3. Schenck RO
4. Cross WCH
5. Househam J
6. Zapata L
7. Werner B
8. Gatenbee C
9. Robertson-Tessi M
10. Barnes CP
11. Anderson ARA
12. Sottoriva A
13. Graham TA
(2020) Evolutionary dynamics of neoantigens in growing tumors
Nature Genetics 52:1057–1066.

https://doi.org/10.1038/s41588-020-0687-1
1. Lawrence MS
2. Stojanov P
3. Polak P
4. Kryukov GV
5. Cibulskis K
6. Sivachenko A
7. Carter SL
8. Stewart C
9. Mermel CH
10. Roberts SA
11. Kiezun A
12. Hammerman PS
13. McKenna A
14. Drier Y
15. Zou L
16. Ramos AH
17. Pugh TJ
18. Stransky N
19. Helman E
20. Kim J
21. Sougnez C
22. Ambrogio L
23. Nickerson E
24. Shefler E
25. Cortés ML
26. Auclair D
27. Saksena G
28. Voet D
29. Noble M
30. DiCara D
31. Lin P
32. Lichtenstein L
33. Heiman DI
34. Fennell T
35. Imielinski M
36. Hernandez B
37. Hodis E
38. Baca S
39. Dulak AM
40. Lohr J
41. Landau D-A
42. Wu CJ
43. Melendez-Zajgla J
44. Hidalgo-Miranda A
45. Koren A
46. McCarroll SA
47. Mora J
48. Crompton B
49. Onofrio R
50. Parkin M
51. Winckler W
52. Ardlie K
53. Gabriel SB
54. Roberts CWM
55. Biegel JA
56. Stegmaier K
57. Bass AJ
58. Garraway LA
59. Meyerson M
60. Golub TR
61. Gordenin DA
62. Sunyaev S
63. Lander ES
64. Getz G
(2013) Mutational heterogeneity in cancer and the search for new cancer-associated genes
Nature 499:214–218.

https://doi.org/10.1038/nature12213
1. Lee G
2. Han SB
3. Lee JH
4. Kim HW
5. Kim DH
(2019) Cancer mechanobiology: microenvironmental sensing and metastasis
ACS Biomaterials Science & Engineering 5:3735–3752.

https://doi.org/10.1021/acsbiomaterials.8b01230
1. Liu J
2. Geng X
3. Hou J
4. Wu G
(2021) New insights into M1/M2 macrophages: key modulators in cancer progression
Cancer Cell International 21:389.

https://doi.org/10.1186/s12935-021-02089-2
1. Martin SD
2. Brown SD
3. Wick DA
4. Nielsen JS
5. Kroeger DR
6. Twumasi-Boateng K
7. Holt RA
8. Nelson BH
(2016) Low mutation burden in ovarian cancer may limit the utility of neoantigen-targeted vaccines
PLOS ONE 11:e0155189.

https://doi.org/10.1371/journal.pone.0155189
(2019) How a well-adapting immune system remembers
PNAS 116:8815–8823.

https://doi.org/10.1073/pnas.1812810116
(2014) Tug-of-war between driver and passenger mutations in cancer and other adaptive processes
PNAS 111:15138–15143.

https://doi.org/10.1073/pnas.1404341111
1. McGranahan N
2. Furness AJS
3. Rosenthal R
4. Ramskov S
5. Lyngaa R
6. Saini SK
7. Jamal-Hanjani M
8. Wilson GA
9. Birkbak NJ
10. Hiley CT
11. Watkins TBK
12. Shafi S
13. Murugaesu N
14. Mitter R
15. Akarca AU
16. Linares J
17. Marafioti T
18. Henry JY
19. Van Allen EM
20. Miao D
21. Schilling B
22. Schadendorf D
23. Garraway LA
24. Makarov V
25. Rizvi NA
26. Snyder A
27. Hellmann MD
28. Merghoub T
29. Wolchok JD
30. Shukla SA
31. Wu CJ
32. Peggs KS
33. Chan TA
34. Hadrup SR
35. Quezada SA
36. Swanton C
(2016) Clonal neoantigens elicit T cell immunoreactivity and sensitivity to immune checkpoint blockade
Science 351:1463–1469.

https://doi.org/10.1126/science.aaf1490
1. McGranahan N
2. Swanton C
(2017) Clonal heterogeneity and tumor evolution: past, present, and the future
Cell 168:613–628.

https://doi.org/10.1016/j.cell.2017.01.018
1. Meiller C
2. Montagne F
3. Hirsch TZ
4. Caruso S
5. de Wolf J
6. Bayard Q
7. Assié J-B
8. Meunier L
9. Blum Y
10. Quetel L
11. Gibault L
12. Pintilie E
13. Badoual C
14. Humez S
15. Galateau-Sallé F
16. Copin M-C
17. Letouzé E
18. Scherpereel A
19. Zucman-Rossi J
20. Le Pimpec-Barthes F
21. Jaurand M-C
22. Jean D
(2021) Multi-site tumor sampling highlights molecular intra-tumor heterogeneity in malignant pleural mesothelioma
Genome Medicine 13:113.

https://doi.org/10.1186/s13073-021-00931-w
1. Meslin F
2. Hamaï A
3. Mlecnik B
4. Rosselli F
5. Richon C
6. Jalil A
7. Wemhoff G
8. Thiery J
9. Galon J
10. Chouaib S
(2011) hSMG-1 is a granzyme B-associated stress-responsive protein kinase
Journal of Molecular Medicine 89:411–421.

https://doi.org/10.1007/s00109-010-0708-0
(2004) Dynamics of cancer progression
Nature Reviews. Cancer 4:197–205.

https://doi.org/10.1038/nrc1295
(2017) DNA damage and repair biomarkers of immunotherapy response
Cancer Discovery 7:675–693.

https://doi.org/10.1158/2159-8290.CD-17-0226
Book
(2020) The paradox of cancer immune exclusion: Immune oncology next frontier
In: Marincola FM, Lee PP, editors. Tumor Microenvironment. Cham: Springer. pp. 173–195.

https://doi.org/10.1007/978-3-030-38862-1
1. Pon JR
2. Marra MA
(2015) Driver and passenger mutations in cancer
Annual Review of Pathology 10:25–50.

https://doi.org/10.1146/annurev-pathol-012414-040312
1. Qi Q
2. Liu Y
3. Cheng Y
4. Glanville J
5. Zhang D
6. Lee JY
7. Olshen RA
8. Weyand CM
9. Boyd SD
10. Goronzy JJ
(2014) Diversity and clonal selection in the human T-cell repertoire
PNAS 111:13139–13144.

https://doi.org/10.1073/pnas.1409155111
1. Rosenberg SM
(2001) Evolving responsively: adaptive mutation
Nature Reviews. Genetics 2:504–515.

https://doi.org/10.1038/35080556
1. Rosenberg SM
2. Queitsch C
(2014) Medicine: combating evolution to fight disease
Science 343:1088–1089.

https://doi.org/10.1126/science.1247472
1. Rosenthal R
2. Cadieux EL
3. Salgado R
4. Bakir MA
5. Moore DA
6. Hiley CT
7. Lund T
8. Tanić M
9. Reading JL
10. Joshi K
11. Henry JY
12. Ghorani E
13. Wilson GA
14. Birkbak NJ
15. Jamal-Hanjani M
16. Veeriah S
17. Szallasi Z
18. Loi S
19. Hellmann MD
20. Feber A
21. Chain B
22. Herrero J
23. Quezada SA
24. Demeulemeester J
25. Van Loo P
26. Beck S
27. McGranahan N
28. Swanton C
29. TRACERx Consortium
(2019) Neoantigen-directed immune escape in lung cancer evolution
Nature 567:479–485.

https://doi.org/10.1038/s41586-019-1032-7
Book
1. Ross SM
(2014) Introduction to Stochastic Dynamic Programming
Academic Press.
1. Samstein RM
2. Lee C-H
3. Shoushtari AN
4. Hellmann MD
5. Shen R
6. Janjigian YY
7. Barron DA
8. Zehir A
9. Jordan EJ
10. Omuro A
11. Kaley TJ
12. Kendall SM
13. Motzer RJ
14. Hakimi AA
15. Voss MH
16. Russo P
17. Rosenberg J
18. Iyer G
19. Bochner BH
20. Bajorin DF
21. Al-Ahmadie HA
22. Chaft JE
23. Rudin CM
24. Riely GJ
25. Baxi S
26. Ho AL
27. Wong RJ
28. Pfister DG
29. Wolchok JD
30. Barker CA
31. Gutin PH
32. Brennan CW
33. Tabar V
34. Mellinghoff IK
35. DeAngelis LM
36. Ariyan CE
37. Lee N
38. Tap WD
39. Gounder MM
40. D’Angelo SP
41. Saltz L
42. Stadler ZK
43. Scher HI
44. Baselga J
45. Razavi P
46. Klebanoff CA
47. Yaeger R
48. Segal NH
49. Ku GY
50. DeMatteo RP
51. Ladanyi M
52. Rizvi NA
53. Berger MF
54. Riaz N
55. Solit DB
56. Chan TA
57. Morris LGT
(2019) Tumor mutational load predicts survival after immunotherapy across multiple cancer types
Nature Genetics 51:202–206.

https://doi.org/10.1038/s41588-018-0312-8
1. Schreiber H
2. Wu TH
3. Nachman J
4. Kast WM
(2002) Immunodominance and tumor escape
Seminars in Cancer Biology 12:25–31.

https://doi.org/10.1006/scbi.2001.0401
1. Sontag ED
(2017) A dynamic model of immune responses to antigen presentation predicts different regions of tumor or pathogen elimination
Cell Systems 4:231–241.

https://doi.org/10.1016/j.cels.2016.12.003
1. Tripathi SC
2. Peters HL
3. Taguchi A
4. Katayama H
5. Wang H
6. Momin A
7. Jolly MK
8. Celiktas M
9. Rodriguez-Canales J
10. Liu H
11. Behrens C
12. Wistuba II
13. Ben-Jacob E
14. Levine H
15. Molldrem JJ
16. Hanash SM
17. Ostrin EJ
(2016) Immunoproteasome deficiency is a feature of non-small cell lung cancer with a mesenchymal phenotype and is associated with a poor outcome
PNAS 113:E1555–E1564.

https://doi.org/10.1073/pnas.1521812113
1. Turajlic S
2. Xu H
3. Litchfield K
4. Rowan A
5. Chambers T
6. Lopez JI
7. Nicol D
8. O’Brien T
9. Larkin J
10. Horswell S
11. Stares M
12. Au L
13. Jamal-Hanjani M
14. Challacombe B
15. Chandra A
16. Hazell S
17. Eichler-Jonsson C
18. Soultati A
19. Chowdhury S
20. Rudman S
21. Lynch J
22. Fernando A
23. Stamp G
24. Nye E
25. Jabbar F
26. Spain L
27. Lall S
28. Guarch R
29. Falzon M
30. Proctor I
31. Pickering L
32. Gore M
33. Watkins TBK
34. Ward S
35. Stewart A
36. DiNatale R
37. Becerra MF
38. Reznik E
39. Hsieh JJ
40. Richmond TA
41. Mayhew GF
42. Hill SM
43. McNally CD
44. Jones C
45. Rosenbaum H
46. Stanislaw S
47. Burgess DL
48. Alexander NR
49. Swanton C
50. PEACE
51. TRACERx Renal Consortium
(2018) Tracking cancer evolution reveals constrained routes to metastases: TRACERx Renal
Cell 173:581–594.

https://doi.org/10.1016/j.cell.2018.03.057
(2020) A guide to cancer immunotherapy: from T cell basic science to clinical practice
Nature Reviews. Immunology 20:651–668.

https://doi.org/10.1038/s41577-020-0306-5
(2019) Bacterial persistence promotes the evolution of antibiotic resistance by increasing survival and mutation rates
The ISME Journal 13:1239–1251.

https://doi.org/10.1038/s41396-019-0344-9
1. Xiong H
2. Xi Y
3. Yuan Z
4. Wang B
5. Hu S
6. Fang C
7. Cai Y
8. Fu X
9. Li L
(2022) IFN-γ activates the tumor cell-intrinsic STING pathway through the induction of DNA damage and cytosolic dsDNA formation
Oncoimmunology 11:2044103.

https://doi.org/10.1080/2162402X.2022.2044103
1. Yan HQ
2. Huang XB
3. Ke SZ
4. Jiang YN
5. Zhang YH
6. Wang YN
7. Li J
8. Gao FG
(2014) Interleukin 6 augments lung cancer chemotherapeutic resistance via ataxia-telangiectasia mutated/NF-kappaB pathway activation
Cancer Science 105:1220–1227.

https://doi.org/10.1111/cas.12478
1. Yarchoan M
2. Johnson BA
3. Lutz ER
4. Laheru DA
5. Jaffee EM
(2017) Targeting neoantigens to augment antitumour immunity
Nature Reviews. Cancer 17:209–222.

https://doi.org/10.1038/nrc.2016.154

Article and author information

Author details

Jason T George
1. Department of Biomedical Engineering, Texas A&M University, Houston, United States
2. Engineering Medicine Program, Texas A&M University, Houston, United States
3. Center for Theoretical Biological Physics, Rice University, Houston, United States
Contribution
Conceptualization, Formal analysis, Supervision, Funding acquisition, Investigation, Methodology, Writing – original draft, Project administration, Writing – review and editing

For correspondence
jason.george@tamu.edu

Competing interests
No competing interests declared

ORCID iD: 0000-0002-8248-2888
Herbert Levine
1. Center for Theoretical Biological Physics, Rice University, Houston, United States
2. Department of Physics, Northeastern University, Boston, United States
3. Department of Bioengineering, Northeastern University, Boston, United States
Contribution
Supervision, Funding acquisition, Investigation, Writing – original draft, Project administration, Writing – review and editing

For correspondence
h.levine@northeastern.edu

Competing interests
No competing interests declared

Funding

Cancer Prevention Research Institute of Texas (RR210080)

Jason T George

National Science Foundation (PHY-2019745)

Herbert Levine

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

JTG thanks Kerry E Back, Philip A Ernst, Thomas J George, and Richard A Tapia for their helpful discussions on stochastic dynamic programming and optimization. JTG was supported by the Cancer Prevention Research Institute of Texas (RR210080). JTG is a CPRIT Scholar in Cancer Research. HL is supported by the National Science Foundation (NSF) grant NSF PHY-2019745.

Copyright

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.