The molecular clock of variants of concern

The emergence of SARS-CoV-2 variants of concern is driven by episodic
acceleration of the genomic rate of molecular evolution

John Tay, Ashleigh F. Porter, Wytamma Wirth, and Sebastian Duchene*.

Peter Doherty Institute for Infection and Immunity, University of
Melbourne, Melbourne, Australia.



The ongoing SARS-CoV-2 pandemic has seen an unprecedented amount of
rapidly generated genome data. These data have revealed the emergence of
lineages with mutations associated with transmissibility and antigenicity,
known as variants of concern (VOCs). A striking aspect of VOCs is that
many of them involve a high number of defining mutations. Current
phylogenetic estimates of the evolutionary rate of SARS-CoV-2 suggest
that its genome accrues around 2 mutations per month. However, VOCs can
have around 15 defining mutations and it is hypothesised that they
emerged over the course of a few months, implying that the evolutionary
rate would be several-fold higher. A plausible scenario that is
difficult to demonstrate empirically is that such rapid evolution has
occurred within immunocompromised patients over a short period of time.
We analysed genome sequence data from the GISAID database to assess
whether the emergence of VOCs can be attributed to changes in the
evolutionary rate of the virus and whether this pattern can be detected
at a phylogenetic level using genome data. We fit a range of molecular
clock models and assessed their statistical fit. Our analyses indicate
that the emergence of VOCs is driven by an episodic increase in the
evolutionary rate, to around 6-fold the background phylogenetic rate
estimate. Our results underscore the importance of monitoring the
molecular evolution of the virus as a means of understanding the
circumstances under which VOCs may emerge.
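
The back-of-envelope reasoning above (around 2 mutations per month in the background, around 15 defining mutations accrued over "a few months") can be sketched as a short calculation. This is only an illustration of that arithmetic, not the clock-model analysis itself: the function name and the 1 to 4 month emergence windows are assumptions chosen for the example, and the abstract does not specify the window.

```python
def implied_fold_increase(defining_mutations, months, background_per_month=2.0):
    """Fold change over the background rate needed to accrue
    `defining_mutations` within `months`.

    background_per_month defaults to the ~2 mutations/month quoted in the
    abstract; `months` is a hypothetical emergence window.
    """
    if months <= 0 or background_per_month <= 0:
        raise ValueError("months and background_per_month must be positive")
    episodic_rate = defining_mutations / months  # mutations per month
    return episodic_rate / background_per_month


# Hypothetical emergence windows, in months (assumed for illustration only).
for months in (1, 2, 3, 4):
    fold = implied_fold_increase(15, months)
    print(f"{months} month(s): {fold:.2f}-fold the background rate")
```

Shorter assumed windows imply larger fold increases, which is why the length of the VOC stem branches matters for interpreting the rate estimates.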

This is ongoing work and we will update it as we conduct further analyses.

Keywords: SARS-CoV-2 molecular evolution, variants of concern,
molecular clock, Bayesian model selection.

VOC molecular clocks ms draft (280.5 KB)

Very nice and relevant work Sebastian.
One question is about the impact of the different clock models on the TMRCA estimates of VOCs when VOC and non-VOC sequences are combined in the same analysis. Did you test whether using an SC model produces older TMRCA estimates than the UCLN or FL-stems models?
I agree that long-term infections could be a source of episodic selection and generation of VOCs. Don’t you think that an alternative hypothesis for episodic selection could be short-term infections in immunocompetent individuals with partial immunity due to previous infections?


Hey Gonzalo,
Thanks for the feedback. We did not report the TMRCA estimates, but I pulled them below and they are quite consistent, with the UCLN having a bit more uncertainty. The mean rate estimates are also similar between models, as we reported here. In that respect, the local clock model framework was very useful to tease out which branches have higher rates, but it doesn’t appear to substantially impact other estimates (mean rate and timescale).

You make an important point about the potential mechanisms that could have increased the rate along the VOC stem branches. We cannot distinguish them here, but we can certainly report the length of these branches. This should give us an idea of whether the process occurred over weeks or months. We will include this in our ongoing analyses.



Thanks Sebastian for your reply. That the TMRCA estimates were not substantially affected by the molecular clock model is an excellent result, because many published studies were conducted using an SC model. Reporting the length of the VOC stem branches, mainly under the FL-stems model, is a very good idea.

Best regards.