Further musings on the tMRCA

Louis du Plessis and I have written a short report of our calculations. They use the same logic as the estimates Richard Neher shared on Twitter, with a couple of extensions.

CAVEAT: The numbers of sequences and mutations are small, and the sequencing error rate is unknown, so all current tMRCA estimates should be taken with a pinch of salt. I’d wager a pint on our calculations but not a whole night out.

Thanks to everyone who has shared genomic data (go Jing!). Please let us know if you find any errors or oversights.

Oli & Louis

nCoVtimes_20genomes.pdf (1.4 MB)


I further developed this approach to treat each branch in the star tree as an independent observation. That gives a slightly different likelihood, and slightly different dates.
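For readers who want to see how the two likelihoods differ, here is a minimal sketch in Python. It is not the code behind the attached notes. It assumes mutations accumulate as a Poisson process along each branch of the star tree, and that sampling times are given in days. The function names, the toy data, the rate of 1e-3 subst/site/year and the genome length of 29903 nt are all assumptions chosen for illustration.

```python
# Illustrative sketch only: two ways to estimate the TMRCA on a star tree.
# Assumption: branch i has length (t_i - T) and carries m_i ~ Poisson(r * (t_i - T))
# mutations, where r is the genome-wide substitution rate per day.

def tree_length_tmrca(times, muts, rate_per_day):
    """Tree-length model: the total mutation count is one Poisson observation.

    Setting sum(m_i) = r * sum(t_i - T) gives a closed-form estimate of T.
    """
    n = len(times)
    return sum(times) / n - sum(muts) / (n * rate_per_day)


def branch_tmrca(times, muts, rate_per_day, tol=1e-9):
    """Branch model: each branch is an independent Poisson observation.

    The log-likelihood is sum_i [m_i * log(r * (t_i - T)) - r * (t_i - T)] + const.
    Its derivative with respect to T (the score) decreases monotonically in T,
    so the maximum can be found by bisection.
    """
    n = len(times)

    def score(T):
        return n * rate_per_day - sum(m / (t - T) for t, m in zip(times, muts))

    hi = min(times) - 1e-9      # T cannot be later than the earliest sample
    if score(hi) > 0:           # likelihood still rising at the boundary
        return hi
    lo = hi - 1.0
    while score(lo) < 0:        # widen the bracket until the score changes sign
        lo = hi - 2.0 * (hi - lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical inputs, not taken from the genomes discussed in this thread.
RATE_PER_SITE_PER_YEAR = 1e-3   # assumed clock rate
GENOME_LENGTH = 29903           # assumed genome length in nucleotides
rate_per_day = RATE_PER_SITE_PER_YEAR * GENOME_LENGTH / 365.0

sample_days = [20, 25, 30]      # made-up sampling times, days after a reference date
mutations = [1, 0, 2]           # made-up mutation counts per genome

print("tree length model:", tree_length_tmrca(sample_days, mutations, rate_per_day))
print("branch model:     ", branch_tmrca(sample_days, mutations, rate_per_day))
```

The two estimates coincide when every genome has the same sampling time and mutation count, and drift apart as the branches become more uneven, which is one way to see why the dates differ slightly.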

Using the same range of rates as in Figure 3, I estimate the following dates, which point to a TMRCA in mid rather than late December:

Estimated TMRCA, tree length and branch model:
[1] "2019-12-22" "2019-12-17"
CI for tree length model:
[1] "2019-12-14" "2019-12-23"
CI for branch model:
[1] "2019-12-11" "2019-12-21"

Here is the corresponding density plot:

My notes, which also include an analysis with a lower rate bound of 0.005 subst/site/year, are here:
ncov2019starTree.pdf (259.3 KB)

Thanks Erik. I think your estimate is better if we choose to assume a star topology from the outset. I suggest we keep updating these estimates until that assumption is untenable, and compare them with future phylogenetic clock estimates.

How do y’all reconcile these tMRCAs in early to mid December with the cases reported in early December? If these onset dates are correct, and infections occurred some number of days prior, then the true tMRCA should be in November at the latest, right?

Screenshot 2020-01-26 16.47.40

We discuss that briefly (see Fig 5). The TMRCA is the date of common ancestry of the sample, not of the index case. There can be an unobserved lineage from the index case to the TMRCA, either because infections along that lineage left <2 secondary cases or because their descendants haven’t been sampled.

It’s interesting to consider the implications of finding a TMRCA before or after the first human infection. Supposing that infection indeed took place on December 1, a TMRCA after that date is consistent with Oli’s interpretation. A TMRCA before that date would imply multiple spillovers from an animal reservoir with non-negligible diversity.