Novel 2019 coronavirus genome

Brian_Foley · February 2, 2020, 7:17am

A coronavirus was recently found in short-read sequence data from a pangolin viromics data set. I am attaching an alignment of the complete genomes, a maximum likelihood tree built from just the spike glycoprotein gene that does not include the pangolin sequence, and a tree built from complete genomes that does include the pangolin sequence. The 2013 Yunnan bat virus sequence is still closer to nCoV than the pangolin virus sequence is, but the pangolin virus is quite close. The 2013 Yunnan bat virus has been removed from the alignment uploaded here because it is GISAID data.

https://www.ncbi.nlm.nih.gov/sra/SRR10168377

SARS_SARSlikeCodonAligned.FASTA.gz (630.7 KB)

_SARSlike_PlusWuhan_YunnanSPIKE_CodonAlignedTreePDF.pdf (7.1 KB)

BetaCoronaviruses_114_WuhanCladeHandAlignedPlusPangolin2_IQtreePDF.pdf (7.2 KB)
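
For anyone who wants a quick numeric check of how close two genomes in the attached alignment are, here is a minimal sketch (not the method used to build the trees above) that computes pairwise nucleotide identity from an aligned FASTA file. It assumes the alignment is plain or gzipped FASTA with all sequences the same length, and it only counts columns where both sequences have A, C, G, or T. The function names and the record names in the usage comment are hypothetical; check the headers in the file for the real ones. Since the Yunnan bat virus was removed from the uploaded alignment, that comparison would need the sequence added back from GISAID.

```python
import gzip
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA text into {header: sequence}, upper-casing the sequence."""
    records: Dict[str, list] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(parts).upper() for key, parts in records.items()}


def load_alignment(path: str) -> Dict[str, str]:
    """Read an aligned FASTA file; gzip is assumed when the name ends in .gz."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return read_fasta(handle)


def pairwise_identity(a: str, b: str) -> float:
    """Fraction of matching bases over columns where both rows are A/C/G/T."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    valid = set("ACGT")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x in valid and y in valid:
            compared += 1
            if x == y:
                matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return matches / compared


# Hypothetical usage; the two record names below are placeholders:
# aln = load_alignment("SARS_SARSlikeCodonAligned.FASTA.gz")
# print(pairwise_identity(aln["nCoV_record_name"], aln["pangolin_record_name"]))
```

Raw identity like this ignores the substitution model and the tree topology, so treat it as a rough sanity check next to the maximum likelihood trees rather than a replacement for them.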

Credit for the discovery of the pangolin virus sequence: