Tackling Rumors of a Suspicious Origin of nCoV2019

I have run a similar breakpoint analysis on the reference sequence for Middle East Respiratory Syndrome (MERS) virus in the spike (S) protein region. This is the protein, especially the S1 attachment subunit, that is the principal determinant of host range, pathogenesis via cell fusion, and communicability between human subjects.

The specificity of S is responsible in large part for the widespread and rapid communicability of SARS-CoV-2, with relatively low virulence in most of those infected (mortality varies but is much less than 10%). In contrast, the spread of MERS is more easily contained, but its virulence is much higher, with mortality estimated at 35%.

A virus with the high communicability of SARS-CoV-2 and the virulence of MERS would be potentially apocalyptic in its impact.

Against that prospect, below is an annotated breakpoint sequence map for the reference sequence of MERS, delineating each occurrence of CAGAC or CAGAT, or its equivalent on the “minus” template strand, GTCTG. There are many fewer instances of these sequences in the MERS genome than in SARS-CoV-2, and most do not occur in a comparable location between the two viral genomes.

MERS S breakpoints.pdf (91.1 KB)
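For readers who want to reproduce this kind of map, here is a minimal sketch of the scan in Python. It is not the procedure used to build the pdf. The function name `find_breakpoints`, the `PATTERNS` table, and the 1-based coordinate of the first base of each motif are assumptions made for illustration. Only the three pentanucleotides named above are searched; the reverse complement of CAGAT (ATCTG) is not listed in the post and is therefore left out, though it could be added to the table.

```python
import re

# Motifs named in the post, with the strand each is read on.
# GTCTG is the reverse complement of CAGAC ("minus" template strand).
PATTERNS = {"CAGAC": "+", "CAGAT": "+", "GTCTG": "-"}

def find_breakpoints(genome, patterns=PATTERNS):
    """Return sorted (position, strand, motif) tuples.

    position is the 1-based coordinate of the first base of the motif
    (an assumed convention, chosen to resemble labels like 21564CAGAC).
    """
    # Normalise case and treat RNA input (U) as DNA (T).
    seq = genome.upper().replace("U", "T")
    hits = []
    for motif, strand in patterns.items():
        # A lookahead reports every start position, including overlaps.
        for match in re.finditer(f"(?={motif})", seq):
            hits.append((match.start() + 1, strand, motif))
    return sorted(hits)

if __name__ == "__main__":
    toy = "AACAGACTTGTCTGCAGAT"  # made-up sequence, not viral data
    for position, strand, motif in find_breakpoints(toy):
        print(f"{position}{motif} ({strand} strand)")
```

Run on a full reference genome loaded as a plain string, the same function would list every candidate site, which could then be restricted to the S region by filtering on position.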

However, there are four exceptions.

First of all, as noted by Graham and Baric (2010), each of the independently transcribed mRNAs is preceded by an identical hexanucleotide, ACGAAC, that serves as the transcriptional regulatory sequence (TRS). Each could theoretically serve as a breakpoint bracketing any of the ORFs from S onward.

Second, there is a breakpoint sequence in MERS S, 21564CAGAC, that is very close to the relative position of 21691CAGAT in SARS-CoV-2. Homologous recombination at this point would be unlikely to have a significant effect on protein structure.

Third, skipping one for the moment, there is a breakpoint sequence in MERS S, 25215CAGAT, that is similarly very close in relative position to 25047CAGAT in SARS-CoV-2.

Most importantly, there is a fourth breakpoint sequence, between the receptor binding domain of MERS and the S1/S2 junction, that is precisely conserved in sequence and position with the corresponding location of SARS-CoV-2. This breakpoint, 23577CAGAC in MERS and 23300CAGAC in SARS-CoV-2 (double-underlined in the pdf), is identical in both viruses at the same relative position in the S1 protein sequence, where it defines an identical dipeptide, QT.
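The QT dipeptide follows directly from the standard genetic code, provided the motif is read in frame with CAG as a complete codon, which is the reading implied above. A small sketch makes this explicit; the helper name `possible_dipeptides` and the partial codon table are illustrative assumptions. CAG encodes glutamine (Q), and every codon beginning with AC encodes threonine (T), so CAGAC gives QT whatever base follows. CAGAT read the same way would give QI or QM.

```python
# Partial standard genetic code: only the codons needed here.
CODON_TABLE = {
    "CAG": "Q",
    "ACA": "T", "ACC": "T", "ACG": "T", "ACT": "T",
    "ATA": "I", "ATC": "I", "ATT": "I", "ATG": "M",
}

def possible_dipeptides(motif):
    """Dipeptides a 5-base motif can encode when its first base starts a codon.

    The second codon is missing its third base, so all four completions
    are tried and the distinct results collected.
    """
    first = CODON_TABLE[motif[:3]]
    return {first + CODON_TABLE[motif[3:] + base] for base in "ACGT"}

if __name__ == "__main__":
    print(possible_dipeptides("CAGAC"))
    print(sorted(possible_dipeptides("CAGAT")))
```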

There is potential, therefore, based on closely apposed or identical breakpoint sequences, that the bulk of the S1A and S1B domains of the S1 attachment subunit could be exchanged between MERS and SARS-CoV-2 in any mixed infection. While there is no indication that this has occurred in the wild, during the SARS-CoV-2 pandemic we face the unique situation of SARS-CoV-2 being present simultaneously in a substantial number of human beings worldwide. Any simultaneous outbreak of MERS, within or without those areas where it has previously been found prevalent, could produce the kind of mixed infection in humans that we know has resulted in frequent recombination among coronaviruses in the wild or in captive populations of animals. To the viruses, there is no known theoretical difference.

Public health authorities should especially guard against the simultaneous spread of more than one coronavirus in the same human population and in the same locations.

Bill Gallaher