Clock and TMRCA based on 27 genomes

Three more high-quality full-length genomes were added to the China National GeneBank. I updated the analyses to include these new genomes and, at the suggestion of Oliver Pybus, also added an analysis that randomizes tip dates to show that the estimates are informed more by the data than by the choice of priors. The main conclusions remain the same, although we now obtain substantially better mixing of the MCMC and narrower 95% HPDs. Even so, please note that this is still work in progress: caution should be taken when interpreting the values, and all estimates should be judged by their intervals rather than their point values.

Data

As of January 27, 2020, 30 full-length, high-quality nCoV-2019 genomes are available. The final dataset contains 36 SNPs after removing 9 SNPs that are likely due to sequencing errors. Acknowledgements for the genome sequences used in this analysis are in the table at the end of this document.
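As an illustration of how SNPs in an alignment can be counted and suspect sites masked, here is a minimal Python sketch. It is not the procedure used for this dataset (the post does not describe how SNPs were called or filtered). The function names `variable_sites` and `mask_sites`, the toy alignment, and the choice to ignore `N`, `-` and `?` are all assumptions made for the example.

```python
from typing import Dict, Iterable, List


def variable_sites(alignment: Dict[str, str], ignore: str = "N-?") -> List[int]:
    """Return 0-based columns where at least two distinct unambiguous bases occur."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    sites = []
    for col in range(lengths.pop()):
        # Collect the bases seen in this column, dropping gaps and ambiguity codes.
        bases = {seq[col].upper() for seq in alignment.values()} - set(ignore)
        if len(bases) > 1:
            sites.append(col)
    return sites


def mask_sites(alignment: Dict[str, str], sites: Iterable[int], mask_char: str = "N") -> Dict[str, str]:
    """Return a copy of the alignment with the given columns replaced by mask_char."""
    to_mask = set(sites)
    return {
        name: "".join(mask_char if i in to_mask else base for i, base in enumerate(seq))
        for name, seq in alignment.items()
    }


# Toy alignment (hypothetical sequences, not real data).
toy = {"a": "ACGTAC", "b": "ACGTAC", "c": "ATGTAN", "d": "ACGAAC"}
snps = variable_sites(toy)          # columns 1 and 3 vary; column 5 only differs by an N
masked = mask_sites(toy, [3])       # pretend column 3 is a likely sequencing error
print(len(snps), len(variable_sites(masked)))
```

Masking rather than deleting columns keeps the alignment coordinates unchanged, which makes it easier to compare SNP positions across updates of the dataset.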

Phylogenetic Tree

A phylogenetic tree was created using PhyML and, in agreement with previous analyses, still shows limited genetic variation in the sampled viruses, which is consistent with a recent common ancestor.

The genetic data are still highly suggestive of a single-point introduction into the human population followed by sustained human-to-human transmission, with no evidence of further zoonotic transmissions.

Evolutionary rate

To estimate the substitution rate of nCoV-2019, I used BEAST with a simple model consisting of HKY+Γ, a strict clock with a CTMC rate prior, and a constant tree prior. The median estimate of the substitution rate is very similar to that of other RNA viruses, including SARS-CoV, Ebola virus, and Zika virus, at ~1E-3 subs/site/year. Compared to previous analyses, we're now starting to see tighter estimates of the rate.

| Median (subs/site/year) | 95% HPD (subs/site/year) |
|---|---|
| 1.05E-3 | 3.29E-4 - 2.03E-3 |
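To put the per-site rate in more intuitive terms, it can be converted to an expected number of substitutions per genome per year. The sketch below assumes a genome length of roughly 29,900 nucleotides, which is not stated in this post; the function name `subs_per_genome_per_year` is likewise just for illustration.

```python
# Assumed approximate genome length in nucleotides (not given in the post).
GENOME_LENGTH = 29_900


def subs_per_genome_per_year(rate_per_site_per_year: float,
                             genome_length: int = GENOME_LENGTH) -> float:
    """Scale a per-site substitution rate to a whole-genome rate."""
    return rate_per_site_per_year * genome_length


# Median and 95% HPD bounds from the table above.
estimates = {
    "median": 1.05e-3,
    "95% HPD lower": 3.29e-4,
    "95% HPD upper": 2.03e-3,
}

for label, rate in estimates.items():
    print(f"{label}: {subs_per_genome_per_year(rate):.1f} substitutions/genome/year")
```

Under that assumed length, the median corresponds to roughly 31 substitutions per genome per year, with the interval spanning about 10 to 61.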

Date of the MRCA

I next estimated the date of the MRCA of the sampled nCoV-2019 genomes and the results were in agreement with previous estimates.

| Median | 95% HPD |
|---|---|
| 01 Dec 2019 | 20 Oct 2019 - 20 Dec 2019 |
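Dated phylogenetic analyses typically work with dates expressed as decimal years, so it can help to see the calendar dates above in that form. The helper `decimal_year` below is a hypothetical convenience function, using one common convention (fraction of the year elapsed at the start of the day); other tools may differ by a fraction of a day.

```python
from datetime import date


def decimal_year(d: date) -> float:
    """Convert a calendar date to a decimal year, accounting for leap years."""
    start = date(d.year, 1, 1)
    end = date(d.year + 1, 1, 1)
    return d.year + (d - start).days / (end - start).days


# Median and 95% HPD bounds of the MRCA date from the table above.
mrca = {
    "median": date(2019, 12, 1),
    "95% HPD lower": date(2019, 10, 20),
    "95% HPD upper": date(2019, 12, 20),
}

for label, d in mrca.items():
    print(f"{label}: {d.isoformat()} = {decimal_year(d):.3f}")
```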

Randomization analysis

To test whether the estimates could have been strongly influenced by the priors, I created three independent alignments with shuffled tip dates and analyzed them with BEAST using the same model as described above. In all three cases, these analyses failed to recover meaningful rate and date estimates, suggesting that the estimates above are primarily informed by the data and not by the choice of priors. Even so, all caveats described above still hold: the dataset is still small and sampling may be biased, so the addition of more sequencing data could well change these estimates.

Note: ideally these analyses should be done 100+ times and include ‘leave-one-out’ analyses. I will try to do those at a later date.
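For readers unfamiliar with the approach, tip-date randomization amounts to permuting the sampling dates among the sequences while leaving the sequences themselves untouched. The following is a minimal sketch of that shuffling step only, not the scripts used here; the function name `shuffle_tip_dates`, the taxon names, and the dates are all hypothetical.

```python
import random
from typing import Dict


def shuffle_tip_dates(tip_dates: Dict[str, str], seed: int) -> Dict[str, str]:
    """Randomly reassign the observed sampling dates among the taxa.

    The set of dates is preserved; only the taxon-to-date pairing changes.
    """
    rng = random.Random(seed)  # seeded so each replicate is reproducible
    names = list(tip_dates)
    dates = [tip_dates[name] for name in names]
    rng.shuffle(dates)
    return dict(zip(names, dates))


# Hypothetical taxa and sampling dates, for illustration only.
tip_dates_example = {
    "taxon_A": "2019-12-24",
    "taxon_B": "2019-12-30",
    "taxon_C": "2020-01-05",
    "taxon_D": "2020-01-13",
    "taxon_E": "2020-01-22",
}

# Three independent randomized replicates, one per seed.
replicates = [shuffle_tip_dates(tip_dates_example, seed) for seed in (1, 2, 3)]
for i, rep in enumerate(replicates, start=1):
    print(f"replicate {i}: {rep}")
```

Each shuffled mapping would then replace the true tip dates in an otherwise identical analysis. If the real data carry temporal signal, the randomized replicates should yield rate and date estimates that differ clearly from those of the real dataset.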

Rate (three randomized alignments)

TMRCA (three randomized alignments)

Log files can be downloaded here. Please contact me directly for XMLs and alignments.

Acknowledgements and Genome Availability

| Strain | Authors | Source | Lab |
|---|---|---|---|
| EPI_ISL_402119 | Wenjie Tan, et al. | GISAID | National Institute for Viral Disease Control and Prevention, China CDC |
| EPI_ISL_402120 | Wenjie Tan, et al. | GISAID | National Institute for Viral Disease Control and Prevention, China CDC |
| EPI_ISL_402121 | Wenjie Tan, et al. | GISAID | National Institute for Viral Disease Control and Prevention, China CDC |
| EPI_ISL_402123 | Lili Ren, et al. | GISAID | Institute of Pathogen Biology, Chinese Academy of Medical Sciences & Peking Union Medical College |
| EPI_ISL_402124 | Peng Zhou, et al. | GISAID | Wuhan Institute of Virology, Chinese Academy of Sciences |
| EPI_ISL_402125 | Zhang, et al. | GISAID | National Institute for Communicable Disease Control and Prevention (ICDC) Chinese Center for Disease Control and Prevention (China CDC) |
| EPI_ISL_402127 | Peng Zhou, et al. | GISAID | Wuhan Institute of Virology, Chinese Academy of Sciences |
| EPI_ISL_402128 | Peng Zhou, et al. | GISAID | Wuhan Institute of Virology, Chinese Academy of Sciences |
| EPI_ISL_402129 | Peng Zhou, et al. | GISAID | Wuhan Institute of Virology, Chinese Academy of Sciences |
| EPI_ISL_402130 | Peng Zhou, et al. | GISAID | Wuhan Institute of Virology, Chinese Academy of Sciences |
| EPI_ISL_402132 | Bin Fang, et al. | GISAID | Hubei Provincial Center for Disease Control and Prevention |
| EPI_ISL_403929 | Lili Ren, et al. | GISAID | Institute of Pathogen Biology, Chinese Academy of Medical Sciences & Peking Union Medical College |
| EPI_ISL_403930 | Lili Ren, et al. | GISAID | Institute of Pathogen Biology, Chinese Academy of Medical Sciences & Peking Union Medical College |
| EPI_ISL_403931 | Lili Ren, et al. | GISAID | Institute of Pathogen Biology, Chinese Academy of Medical Sciences & Peking Union Medical College |
| EPI_ISL_403932 | Min Kang, et al. | GISAID | Department of Microbiology, Guangdong Provincial Center for Diseases Control and Prevention |
| EPI_ISL_403933 | Min Kang, et al. | GISAID | Department of Microbiology, Guangdong Provincial Center for Diseases Control and Prevention |
| EPI_ISL_403934 | Min Kang, et al. | GISAID | Department of Microbiology, Guangdong Provincial Center for Diseases Control and Prevention |
| EPI_ISL_403935 | Min Kang, et al. | GISAID | Department of Microbiology, Guangdong Provincial Center for Diseases Control and Prevention |
| EPI_ISL_403936 | Min Kang, et al. | GISAID | Department of Microbiology, Guangdong Provincial Center for Diseases Control and Prevention |
| EPI_ISL_403937 | Min Kang, et al. | GISAID | Department of Microbiology, Guangdong Provincial Center for Diseases Control and Prevention |
| EPI_ISL_403962 | Pilailuk, et al. | GISAID | Department of Medical Sciences, Ministry of Public Health, Thailand |
| EPI_ISL_403963 | Pilailuk, et al. | GISAID | Department of Medical Sciences, Ministry of Public Health, Thailand |
| EPI_ISL_404227 | Yin Chen, et al. | GISAID | Department of Microbiology, Zhejiang Provincial Center for Disease Control and Prevention |
| EPI_ISL_404228 | Yanjun Zhang, et al. | GISAID | Department of Microbiology, Zhejiang Provincial Center for Disease Control and Prevention |
| EPI_ISL_404253 | Ying Tao, et al. | GISAID | IL Department of Public Health Chicago Laboratory |
| EPI_ISL_404895 | Queen, et al. | GISAID | Division of Viral Diseases, Centers for Disease Control and Prevention |
| MN975262 | Chan et al. | GenBank | State Key Laboratory of Emerging Infectious Diseases |
| CNA0007332 | Chen et al. | China National GeneBank | BGI |
| CNA0007334 | Chen et al. | China National GeneBank | BGI |
| CNA0007335 | Chen et al. | China National GeneBank | BGI |