Selection analysis identifies significant mutational changes in Omicron that are likely to influence both antibody neutralization and Spike function (Part 2 of 2)

Part 1 of this post can be found here:

Omicron mutations at neutral or negatively selected S-gene sites might only be adaptive when they co-occur

Given both the apparent selective constraints on mutations arising at the cluster region 1, 2 and 3 sites in SARS-CoV-2 and other Sarbecoviruses, and the rarity of observed mutations at these sites among the millions of assembled SARS-CoV-2 genomes (despite evidence that individually such mutations do regularly occur during within-host evolution; Figure 3), it is very likely that Omicron mutations at cluster region 1, 2 and 3 sites are maladaptive when present on their own. Nevertheless, the presence of mutations at these sites in Omicron, a virus that is clearly highly adapted, suggests that these mutations might interact with one another such that, when present together, they become adaptive. Therefore, while individually the mutations might decrease the fitness of any genome in which they occur, collectively they might compensate for one another’s deficits to yield a fitter virus genotype.

Positive epistasis of this type has, in fact, already been demonstrated between the cluster 2 mutation, S/Q498R, and the pivotal mutation of the 501Y SARS-CoV-2 lineages, S/N501Y. Whereas S/498R only marginally impacts the affinity of Spike for human ACE2 when present with S/501N (18), it increases ACE2 binding affinity approximately four-fold when present with S/501Y (19).
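The size of such a background-dependent effect can be expressed on a log scale, as in the minimal sketch below. The function name and the fold-change value of 1.0 used for the "marginal" effect in the S/501N background are illustrative assumptions, not measurements reported in (18) or (19); only the approximately four-fold figure comes from the text above.

```python
import math

def epistasis_log2(fold_effect_background_1, fold_effect_background_2):
    """Return the log2 ratio of a mutation's fold-effect on binding affinity
    in background 2 relative to background 1.

    A value of 0 means the mutation has the same multiplicative effect in both
    backgrounds (no epistasis on this scale); a positive value means the
    mutation is more beneficial in background 2 (positive epistasis).
    """
    if fold_effect_background_1 <= 0 or fold_effect_background_2 <= 0:
        raise ValueError("fold effects must be positive")
    return math.log2(fold_effect_background_2 / fold_effect_background_1)

# Hypothetical placeholder: treat the "marginal" effect of 498R with 501N as ~1-fold.
effect_with_501N = 1.0
# From the text: approximately four-fold increase when 498R occurs with 501Y.
effect_with_501Y = 4.0

print(epistasis_log2(effect_with_501N, effect_with_501Y))  # 2.0
```

Under these assumed inputs the interaction amounts to two doublings of the affinity gain; a different value for the marginal effect would shift the result accordingly.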

If mutations in the three cluster regions do epistatically interact with one another, then one might expect that selection would favour their co-occurrence either within individual SARS-CoV-2 genome sequences that have so far been sampled, or as minor variants within unassembled intra-patient sequence data. We failed to detect such associations in any systematic manner (Figure 6). While there are individual pairs of Omicron mutations that co-occur more frequently than expected by chance (e.g. 440K in the presence of T95I), they do not involve cluster 1, 2 and 3 mutations. Furthermore, many of the Omicron mutation pairs occur together less frequently than expected by chance (e.g. 478K and 501Y). Rather than reflecting an absence of epistasis between the cluster 1, 2 and 3 mutation sites, our failure to detect the co-occurrence of Omicron mutation pairs at these sites simply reflects the rarity of these mutations within both assembled SARS-CoV-2 genome sequences and raw intra-patient sequence datasets (Figure 3).

Whether or not epistasis is extensively operating between mutations in the three cluster regions, the amino acid changes caused by these mutations in Omicron likely represent a substantial remodelling of two functionally important regions of Spike: the receptor binding domain and the fusion domain. If epistasis is operating between the mutations, then one might expect that one or two major functionally important mutations in each of the clusters had adverse structural impacts on Spike that were compensated for by other mutations in the clusters.

Figure 6. Patterns of co-occurrence of Omicron amino-acid residues in circulating SARS-CoV-2 S-gene haplotypes from other lineages (data up to October 15, 2021). Only mutations occurring in at least 10 haplotypes are shown. All sequences having exactly the same S sequence count as a single unique haplotype; instead of counting raw sequence numbers, this approach focuses on the number of unique genetic backgrounds in which pairs of codons co-occur. Circles show odds ratios for finding the mutation on the X axis when the mutation on the Y axis is also present (vs when it is not present). Red circles depict OR > 1, while blue circles depict 1/OR for OR < 1. Black circles on the right show the fraction of globally sampled SARS-CoV-2 S-gene haplotypes which carry the corresponding mutation.
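The haplotype-based odds ratio described in the Figure 6 caption can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used to produce the figure: the function names, the toy data, and the 0.5 pseudocount (a common continuity correction for empty cells) are all hypothetical choices, since the post does not say how zero counts were handled.

```python
def unique_haplotypes(sequences):
    """Collapse sequences carrying identical sets of S mutations into unique
    haplotypes, so each genetic background is counted once."""
    return {frozenset(mutations) for mutations in sequences}

def cooccurrence_odds_ratio(haplotypes, x, y, pseudocount=0.5):
    """Odds ratio for finding mutation x when mutation y is present,
    versus when y is absent, counted over unique haplotypes."""
    both = x_only = y_only = neither = 0
    for hap in haplotypes:
        has_x, has_y = x in hap, y in hap
        if has_x and has_y:
            both += 1
        elif has_x:
            x_only += 1
        elif has_y:
            y_only += 1
        else:
            neither += 1
    # Assumed continuity correction; set pseudocount=0 for the raw ratio.
    a, b, c, d = (n + pseudocount for n in (both, x_only, y_only, neither))
    # (odds of x given y) / (odds of x given not y) = (a/c) / (b/d)
    return (a * d) / (b * c)

# Toy data only: each entry is the set of S mutations in one sampled sequence.
sequences = [
    {"440K", "T95I"},
    {"440K", "T95I"},           # identical to the first, so counted once
    {"440K", "T95I", "D614G"},
    {"440K"},
    {"T95I"},
    {"D614G"},
    set(),
]

haps = unique_haplotypes(sequences)
print(len(haps))                                                   # 6
print(cooccurrence_odds_ratio(haps, "440K", "T95I", pseudocount=0))  # 4.0
```

Counting unique haplotypes rather than raw sequences keeps a single heavily sampled lineage from dominating the ratio, which is the motivation given in the caption.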

The cluster region 3 encoded amino acid changes in the part of Spike that is responsible for membrane fusion suggest that the membrane fusion machinery of the Omicron Spike has been overhauled. The mutations in cluster regions 1 and 2 fall within the receptor binding domain (RBD) encoding part of the S-gene. These mutations, together with those at S/417, S/440 and S/446, might be indicative of an extensive remodelling of the ACE2 receptor binding surface, possibly to accommodate a major change in the way that Spike interacts with ACE2 and/or other host cell receptors.

Of the cluster 2 sites, all of which fall within the receptor binding motif encoding part of the RBD, only S/498 and S/505 show signs of the Wuhan-Hu-1 encoded amino acid state having been selectively favoured in the past (S/498 in SARS-CoV-2 and S/505 in nCoV). The absence of any signs of positive selection at the other cluster 2 sites in SARS-CoV-2 implies that changes at these and the negatively selected sites in cluster 2 have likely not individually contributed to effective immune evasion since the start of the pandemic. Deep mutational scans (Figure 7; (20)) have found little evidence that individual substitutions at S/505 have antigenic effects; S/496R and S/498R have only moderate antigenic effects, similar to those of the 501Y mutation. The exception that proves the rule that sites in this region might not be free to change in response to immune pressures is 493R. Given that 493R has a moderate antigenic effect, if it were not under selective constraints to sustain optimal degrees of ACE2 interaction (16) it should (but does not) display at least intermittently detectable signs of positive selection.

Figure 7. Experimentally measured effects of RBD mutations on binding of monoclonal antibodies at sites that differ between the Omicron variant and Wuhan-Hu-1. The line plot shows antibody binding escape measured by deep mutational scanning of the Wuhan-Hu-1 RBD (21), averaged across 36 monoclonal antibodies (8 class 1, 13 class 2, 7 class 3, and 8 class 4 antibodies). Sites that are mutated in the Omicron variant relative to Wuhan-Hu-1 are indicated and colored according to the predicted antigenic effect of mutations at that site (strong, moderate, or minimal). An interactive version of this plot is available at

How and why have so many apparently maladaptive mutations been assembled within Omicron?

Given the manifest viability of Omicron, there is a pressing need to understand how and why it accumulated so many mutations that, on their own at least, are apparently either selectively neutral or maladaptive. The sheer number of mutations in Spike and the genetic distance between Omicron and its nearest known SARS-CoV-2 relatives imply that the Omicron progenitor accumulated its unprecedented number of mutations during an extended period of undetected replication. When accurate molecular clock estimates are obtained of the time when Omicron last shared a common ancestor with other SARS-CoV-2 lineages, we will have an upper bound on the amount of time it took for Omicron to assemble its complement of mutations.

Omicron could have spent this period of intensive or prolonged evolution in a region that carries out minimal genomic surveillance or where access to, or utilization of, health care resources is low (the surveillance failure hypothesis). Alternatively, this viral evolution could have taken place within a long-term infection (or possibly serial long-term infections; the chronic infection hypothesis), or during spread within a non-human host population (the reverse-zoonosis hypothesis). Combinations of these evolutionary modes are also a possibility. We will only be able to distinguish between these hypotheses with more data. For example, if one or more SARS-CoV-2 lineages are discovered that are close relatives of Omicron then this would support the surveillance failure hypothesis, whereas if similarly divergent SARS-CoV-2 variants are discovered in either long-term human infections or in other animal species, these would support the other hypotheses.

Relative to evolution during normal SARS-CoV-2 person-to-person transmission, evolution within the context of either long-term infections or an alternative animal host could potentially have occurred at an accelerated pace (22,23). In these contexts purifying selection may have been relaxed somewhat relative to that occurring during normal human-to-human transmission: enough so for genomes carrying suboptimal combinations of epistatically interacting mutations to remain viable while fitter combinations were discovered via additional mutations and genetic recombination. In addition, chronic infections are not impacted by the tight transmission bottlenecks that can stochastically purge nascent adaptive mutations during normal transmission (24,25).
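The stochastic purging of nascent mutations at a transmission bottleneck can be illustrated with a simple sampling sketch. It assumes that each founding virion is drawn independently from the donor's viral population and ignores selection during transmission; the function name, the variant frequency and the bottleneck sizes are illustrative assumptions, not estimates from (24,25).

```python
def prob_variant_lost(freq, bottleneck_size):
    """Probability that a variant at within-host frequency `freq` is absent
    from all `bottleneck_size` independently sampled founding virions."""
    if not 0.0 <= freq <= 1.0:
        raise ValueError("freq must be between 0 and 1")
    if bottleneck_size < 1:
        raise ValueError("bottleneck_size must be at least 1")
    return (1.0 - freq) ** bottleneck_size

# Hypothetical example: a new mutation present in 5% of the donor's virions.
for n_founders in (1, 2, 10, 100):
    lost = prob_variant_lost(0.05, n_founders)
    print(f"bottleneck of {n_founders:>3} virions: P(lost) = {lost:.3f}")
```

Under this simplified model, a low-frequency variant is lost at most transmission events when only a handful of virions found the new infection, whereas a chronic infection with no bottleneck imposes no such sampling step.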

Sequential cycles of immune surveillance and viral immune escape within a long-term infection could also potentially explain the mutation clusters without the need to invoke compensatory epistatic interactions between mutations. Specifically, the clustered mutation patterns in the Omicron Spike are reminiscent of those seen in the HIV envelope protein as a consequence of sequentially acquired virus mutations that evade the progressively broadening neutralization potential of a maturing antibody lineage (26). While signs of negative selection at 9/13 of the mutated codons in the three cluster regions of Omicron are not entirely consistent with this hypothesis, the overwhelming contributors to these negative selection signals are the selective processes operating during normal short-term SARS-CoV-2 infections, where the antibody-pathogen dynamics simply do not have time to develop. It is possible that if purifying selection is relaxed at these sites during unusually prolonged infections, then neutralizing antibody evasion mutations might have been tolerable. Even if purifying selection were not relaxed, however, during a chronic infection the potential long-term fitness costs that are incurred by highly effective immune evasion mutations might frequently be offset by the immediate fitness benefits of evading neutralization.

Whatever the process that yielded the three clusters of rarely seen mutations in Omicron, now that it is being transmitted among people, any mildly deleterious immune evasion mutations it has accumulated might be substantially less tolerable. Likewise, some of the mutations it may have accumulated during its adaptation to transmission in an alternative animal species would now also potentially be somewhat maladaptive. If the rarely seen mutations at negatively selected sites in the Omicron RBD that are known to be targeted by neutralizing antibodies begin reverting or acquiring clear second-site compensatory mutations over the coming months, it would best support the chronic infection hypothesis, in that such reversions would imply a trade-off between intra-host replicative and/or movement fitness and immune evasion. Alternatively, if reversion mutations occur at Omicron RBM sites that are known to impact human ACE2 binding but which have minor antigenic impacts, this would better support the reverse-zoonosis hypothesis.

If, however, the rarely seen mutations in Omicron show no signs of reverting, rather than supporting one origin hypothesis over another, it would support the hypothesis that these mutations are broadly adaptive when they occur in the combinations found in Omicron. It would also indicate that rapid and substantial remodelling of important functional SARS-CoV-2 Spike domains is not just possible, but will likely recur in other lineages. The phenotypic impacts of such extensive genetic changes are very difficult to predict but, in the case of Omicron at least, the proximity of these changes to functionally important genome sites suggests the aspects of Spike function that are likely involved. The effects of the mutations in the three cluster regions on Omicron Spike function might be as similar to those caused by “normal” stepwise mutational changes as antigenic shifts are to antigenic drifts. Rather than just small tweaks in the antigenicity of Spike, its ACE2 binding properties or its membrane fusion functions, the clustered rarely seen mutations in Omicron’s RBD and fusion domain could cause quite big shifts in the way that Spike works.


We gratefully acknowledge all of the authors from the originating laboratories responsible for obtaining the specimens and the submitting laboratories where genetic sequence data were generated and shared via the GISAID Initiative, on which this research is based.

1. Rambaut A, Loman N, Pybus O, Barclay W, Barrett J, Carabelli A, et al. Preliminary genomic characterisation of an emergent SARS-CoV-2 lineage in the UK defined by a novel set of spike mutations. Virological. 2020 Dec 19.

2. Scheepers C, Everatt J, Amoako DG, Mnguni A, Ismail A, Mahlangu B, et al. The continuous evolution of SARS-CoV-2 in South Africa: a new lineage with rapid accumulation of mutations of concern and global detection. medRxiv. 2021 Aug 24.

3. Karim F, Moosa MYS, Gosnell BI, Cele S, Giandhari J, Pillay S, et al. Persistent SARS-CoV-2 infection and intra-host evolution in association with advanced HIV infection. medRxiv. 2021 Jun 4.

4. Escalera-Zamudio M, Pond SLK, de la Viña NM, Gutiérrez B, Thézé J, Bowden TA, et al. Identification of site-specific evolutionary trajectories shared across human betacoronaviruses. bioRxiv. 2021 May 25.

5. Elbe S, Buckland-Merrett G. Data, disease and diplomacy: GISAID’s innovative contribution to global health. Global Challenges. 2017 Jan;1(1):33–46.

6. Kosakovsky Pond SL, Frost SDW. Not so different after all: a comparison of methods for detecting amino acid sites under selection. Mol Biol Evol. 2005 May;22(5):1208–22.

7. Kosakovsky Pond SL, Poon AFY, Velazquez R, Weaver S, Hepler NL, Murrell B, et al. HyPhy 2.5-A Customizable Platform for Evolutionary Hypothesis Testing Using Phylogenies. Mol Biol Evol. 2020 Jan 1;37(1):295–9.

8. Martin DP, Weaver S, Tegally H, San JE, Shank SD, Wilkinson E, et al. The emergence and ongoing convergent evolution of the SARS-CoV-2 N501Y lineages. Cell. 2021 Sep 30;184(20):5189-5200.e7.

9. Xu C, Wang Y, Liu C, Zhang C, Han W, Hong X, et al. Conformational dynamics of SARS-CoV-2 trimeric spike glycoprotein in complex with receptor ACE2 revealed by cryo-EM. Sci Adv. 2021 Jan 1;7(1).

10. Maier W, Bray S, van den Beek M, Bouvier D, Coraor N, Miladi M, et al. Ready-to-use public infrastructure for global SARS-CoV-2 monitoring. Nat Biotechnol. 2021 Oct;39(10):1178–9.

11. Barnes CO, Jette CA, Abernathy ME, Dam K-MA, Esswein SR, Gristick HB, et al. SARS-CoV-2 neutralizing antibody structures inform therapeutic strategies. Nature. 2020 Dec;588(7839):682–7.

12. Weisblum Y, Schmidt F, Zhang F, DaSilva J, Poston D, Lorenzi JC, et al. Escape from neutralizing antibodies by SARS-CoV-2 spike protein variants. eLife. 2020 Oct 28;9.

13. Choi B, Choudhary MC, Regan J, Sparks JA, Padera RF, Qiu X, et al. Persistence and Evolution of SARS-CoV-2 in an Immunocompromised Host. N Engl J Med. 2020 Dec 3;383(23):2291–3.

14. Lytras S, Hughes J, Xia W, Jiang X, Robertson DL. Exploring the natural origins of SARS-CoV-2. bioRxiv. 2021 Jan 22.

15. Murrell B, Wertheim JO, Moola S, Weighill T, Scheffler K, Kosakovsky Pond SL. Detecting individual sites subject to episodic diversifying selection. PLoS Genet. 2012 Jul 12;8(7):e1002764.

16. Starr TN, Zepeda SK, Walls AC, Greaney AJ, Veesler D, Bloom JD. ACE2 binding is an ancestral and evolvable trait of sarbecoviruses. bioRxiv. 2021 Jul 19.

17. Kang L, He G, Sharp AK, Wang X, Brown AM, Michalak P, et al. A selective sweep in the Spike gene has driven SARS-CoV-2 human adaptation. Cell. 2021 Aug 19;184(17):4392-4400.e4.

18. Starr TN, Greaney AJ, Hilton SK, Ellis D, Crawford KHD, Dingens AS, et al. Deep Mutational Scanning of SARS-CoV-2 Receptor Binding Domain Reveals Constraints on Folding and ACE2 Binding. Cell. 2020 Sep 3;182(5):1295-1310.e20.

19. Zahradník J, Marciano S, Shemesh M, Zoler E, Harari D, Chiaravalli J, et al. SARS-CoV-2 variant prediction and antiviral drug design are enabled by RBD in vitro evolution. Nat Microbiol. 2021 Sep;6(9):1188–98.

20. Greaney AJ, Loes AN, Crawford KHD, Starr TN, Malone KD, Chu HY, et al. Comprehensive mapping of mutations in the SARS-CoV-2 receptor-binding domain that affect recognition by polyclonal human plasma antibodies. Cell Host Microbe. 2021 Mar 10;29(3):463-476.e6.

21. Greaney AJ, Starr TN, Gilchuk P, Zost SJ, Binshtein E, Loes AN, et al. Complete Mapping of Mutations to the SARS-CoV-2 Spike Receptor-Binding Domain that Escape Antibody Recognition. Cell Host Microbe. 2021 Jan 13;29(1):44-57.e9.

22. Kemp SA, Collier DA, Datir RP, Ferreira IATM, Gayed S, Jahun A, et al. SARS-CoV-2 evolution during treatment of chronic infection. Nature. 2021 Apr;592(7853):277–82.

23. Lu L, Sikkema RS, Velkers FC, Nieuwenhuijse DF, Fischer EAJ, Meijer PA, et al. Adaptation, spread and transmission of SARS-CoV-2 in farmed minks and associated humans in the Netherlands. Nat Commun. 2021 Dec;12(1):6802.

24. Braun KM, Moreno GK, Wagner C, Accola MA, Rehrauer WM, Baker DA, et al. Acute SARS-CoV-2 infections harbor limited within-host diversity and transmit via tight transmission bottlenecks. PLoS Pathog. 2021 Aug 23;17(8):e1009849.

25. Lythgoe KA, Hall M, Ferretti L, de Cesare M, MacIntyre-Cockett G, Trebes A, et al. SARS-CoV-2 within-host diversity and transmission. Science. 2021 Apr 16;372(6539).

26. Landais E, Murrell B, Briney B, Murrell S, Rantalainen K, Berndsen ZT, et al. HIV envelope glycoform heterogeneity and localized diversity govern the initiation and maturation of a V2 apex broadly neutralizing antibody lineage. Immunity. 2017 Nov 21;47(5):990-1003.e9.

27. Zahradník J, Marciano S, Shemesh M, Zoler E, Chiaravalli J, Meyer B, et al. SARS-CoV-2 RBD in vitro evolution follows contagious mutation spread, yet generates an able infection inhibitor. bioRxiv. 2021 Jan 6.
