Basepairing is predictive of synonymous substitution frequency.

Distributions of synonymous substitution frequencies for the most common substitution types (each accounting for approximately 5% or more of observed substitutions), expressed as estimated mutational fitness, a logarithmic comparison of the observed versus expected number of occurrences of each substitution in the SARS-CoV-2 phylogenetic tree (Bloom and Neher, 2023). Distributions are grouped by substitution type and by whether positions are basepaired in a full-genome secondary structure model of SARS-CoV-2 in Huh7 cells (Lan et al., 2022). The median relative substitution rate (the ratio of the median substitution rates at unpaired versus paired positions) is shown above each substitution type.
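As an illustration of the two quantities named in this legend, the following minimal sketch computes a log-ratio fitness estimate and a median relative rate. The natural logarithm, the pseudocount of 0.5, the conversion from fitness to rate by exponentiation, the function names, and the toy counts are all assumptions made for this example and are not taken from the legend.

```python
import math
from statistics import median

PSEUDOCOUNT = 0.5  # assumed value; the legend does not state a pseudocount


def mutational_fitness(observed, expected, pseudocount=PSEUDOCOUNT):
    """Natural-log ratio of observed to expected counts (log base assumed)."""
    return math.log((observed + pseudocount) / (expected + pseudocount))


def median_relative_rate(unpaired_fitness, paired_fitness):
    """Median rate at unpaired positions divided by median rate at paired ones.

    Each fitness value is converted to a relative rate with exp() before the
    median is taken, so the result is a ratio of median rates.
    """
    unpaired_rates = [math.exp(f) for f in unpaired_fitness]
    paired_rates = [math.exp(f) for f in paired_fitness]
    return median(unpaired_rates) / median(paired_rates)


# Hypothetical (observed, expected) counts for one substitution type.
paired_counts = [(10, 20), (5, 20), (20, 20)]
unpaired_counts = [(40, 20), (80, 20), (20, 20)]

paired_fitness = [mutational_fitness(o, e) for o, e in paired_counts]
unpaired_fitness = [mutational_fitness(o, e) for o, e in unpaired_counts]
ratio = median_relative_rate(unpaired_fitness, paired_fitness)
```

A ratio above 1 in this sketch means the substitution is observed more often at unpaired than at paired positions, relative to expectation.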

Estimated mutational fitness correlates with secondary structure for nonsynonymous C→U substitutions.

Scatter plots compare mutational fitness to average DMS reactivity for positions with potential nonsynonymous C→U substitutions. The minimum observed DMS reactivity value is assigned to positions lacking data. Points are colored by basepairing in the full-genome secondary structure model. Highlighted are nonsynonymous C→U substitutions at basepaired positions that rank highly for mutational fitness and characterize major SARS-CoV-2 lineages. Synonymous C29095U at an unpaired position is also highlighted. Left: Estimated mutational fitness based only on observed versus expected occurrences of C→U at each position. Dashed lines indicate the median estimated mutational fitness for synonymous substitutions at paired and unpaired positions. Arrows indicate the magnitudes of the adjustments applied to mutational fitness at paired (+0.32) and unpaired (−1.13) positions, which bring the median fitness of synonymous substitutions in each group to the unadjusted median for all synonymous substitutions (−0.46). Right: Mutational fitness adjusted by constants derived from the medians of mutational fitness for synonymous substitutions at basepaired, unpaired, and all potential C→U positions.
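The adjustment described for the right panel can be sketched as follows, assuming each constant is the overall synonymous median minus the synonymous median of the relevant pairing group, and that the constant is then added to every position in that group. The function name, argument layout, and toy values are hypothetical, and the legend does not confirm that the constants are applied in exactly this way.

```python
from statistics import median


def adjust_for_pairing(fitness, paired, synonymous):
    """Shift fitness so each pairing group's synonymous median equals the
    median over all synonymous positions (assumed reading of the legend).

    fitness    : list of float, one estimate per position
    paired     : list of bool, True if the position is basepaired
    synonymous : list of bool, True if the C->U change is synonymous
    """
    target = median(f for f, s in zip(fitness, synonymous) if s)
    shift = {}
    for state in (True, False):
        group = [f for f, p, s in zip(fitness, paired, synonymous)
                 if s and p == state]
        shift[state] = target - median(group)
    # The constants come from synonymous positions only, but are applied to
    # every position, including nonsynonymous ones.
    adjusted = [f + shift[p] for f, p in zip(fitness, paired)]
    return adjusted, shift


# Toy data: six synonymous positions followed by two nonsynonymous ones.
fitness = [-1.0, -0.8, -0.6, 0.5, 0.7, 0.9, 1.2, 0.1]
paired = [True, True, True, False, False, False, True, False]
synonymous = [True] * 6 + [False] * 2

adjusted, shift = adjust_for_pairing(fitness, paired, synonymous)
```

Under this reading, the constants quoted in the legend (+0.32 and −1.13, with an overall median of −0.46) would correspond to unadjusted synonymous medians of about −0.78 at paired positions and 0.67 at unpaired positions.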

Basepairing dependence of the effect of 5′ context on synonymous C→U substitution frequency.

Distributions of synonymous C→U substitution frequencies, expressed as estimated mutational fitness, a logarithmic comparison of the observed versus expected number of occurrences of each substitution in the SARS-CoV-2 phylogenetic tree. Distributions are grouped by the identity of the nucleotide at the 5′ position and by whether positions are basepaired in a full-genome secondary structure model of SARS-CoV-2.
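The grouping described here can be sketched as below, assuming the context is the single nucleotide immediately adjacent to the substituted C and that positions have already been restricted to synonymous C→U sites. The function name, the 0-based indexing, and the toy sequence are hypothetical.

```python
from collections import defaultdict


def group_by_context(sequence, fitness_by_pos, paired_positions, offset=-1):
    """Group fitness values at C positions by neighbor identity and pairing.

    sequence         : RNA sequence as a string
    fitness_by_pos   : dict mapping 0-based position to fitness
    paired_positions : set of 0-based positions that are basepaired
    offset           : -1 for the 5' neighbor, +1 for the 3' neighbor
    """
    groups = defaultdict(list)
    for pos, fit in fitness_by_pos.items():
        neighbor = pos + offset
        # Skip non-C positions and positions with no neighbor on that side.
        if sequence[pos] != "C" or not 0 <= neighbor < len(sequence):
            continue
        key = (sequence[neighbor], pos in paired_positions)
        groups[key].append(fit)
    return dict(groups)


# Toy example: four C positions, two of them basepaired.
sequence = "AUCGUCACC"
fitness_by_pos = {2: 0.9, 5: -0.4, 7: 0.2, 8: -1.0}
paired_positions = {5, 8}

five_prime = group_by_context(sequence, fitness_by_pos, paired_positions)
three_prime = group_by_context(sequence, fitness_by_pos, paired_positions,
                               offset=1)
```

Each key is a (neighbor nucleotide, is paired) pair, so the values under one key form one distribution in the plot; passing `offset=1` gives the corresponding grouping by 3′ context.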

Basepairing dependence of the effect of 3′ context on synonymous C→U substitution frequency.

Distributions of synonymous C→U substitution frequencies, expressed as estimated mutational fitness, a logarithmic comparison of the observed versus expected number of occurrences of each substitution in the SARS-CoV-2 phylogenetic tree. Distributions are grouped by the identity of the nucleotide at the 3′ position and by whether positions are basepaired in a full-genome secondary structure model of SARS-CoV-2.

Estimated mutational fitness correlates with secondary structure for synonymous C→U substitutions.

Scatter plots compare mutational fitness to average DMS reactivity for positions with potential synonymous C→U substitutions. The minimum observed DMS reactivity value is assigned to positions lacking data. Points are colored by basepairing in the full-genome secondary structure model. Left: Estimated mutational fitness based only on observed versus expected occurrences of C→U at each position. Dashed lines indicate the median estimated mutational fitness for synonymous substitutions at paired and unpaired positions (identical to Figure 2). Arrows indicate the magnitudes of the adjustments applied to mutational fitness at paired (+0.32) and unpaired (−1.13) positions, which bring the median fitness of synonymous substitutions in each group to the unadjusted median for all synonymous substitutions (−0.46). Right: Mutational fitness adjusted by constants derived from the medians of mutational fitness for synonymous substitutions at basepaired, unpaired, and all potential C→U positions.
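The handling of positions without DMS data can be sketched as follows, assuming missing values are represented as `None` and that the minimum is taken over all observed values in the plotted set. The function name and the example values are hypothetical.

```python
def fill_missing_reactivity(reactivity):
    """Replace missing DMS reactivity values with the minimum observed value.

    reactivity : list of float or None, one entry per position
    """
    observed = [r for r in reactivity if r is not None]
    if not observed:
        raise ValueError("no observed reactivity values to take a minimum from")
    floor = min(observed)
    return [floor if r is None else r for r in reactivity]


# Toy example: two of five positions lack data.
raw = [0.12, None, 0.03, None, 0.40]
filled = fill_missing_reactivity(raw)
```

Positions filled this way all sit at the lowest observed reactivity on the plot, so they should be read as lacking data rather than as measured low reactivity.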