Preliminary in silico assessment of the specificity of published molecular assays and design of new assays using the available whole genome sequences of 2019-nCoV

Authors:

Mitchell Holland1, Daniel Negrón1, Shane Mitchell1, Mychal Ivancich1, Katharine W. Jennings1, Bruce Goodwin2, and Shanmuga Sozhamannan2,3

1Noblis, Reston, VA 20191

2Defense Biological Product Assurance Office, Frederick, MD 21702

3Logistics Management Institute, Tysons, VA 22102

Acknowledgement: All the investigators and labs that published sequence data at GISAID; see also the table at the end (Appendix 1).

Similar analyses/References: Marion Koopmans; Initial assessment of the ability of published coronavirus primers sets to detect the Wuhan coronavirus: http://virological.org/t/initial-assessment-of-the-ability-of-published-coronavirus-primers-sets-to-detect-the-wuhan-coronavirus/321

WGS: 25 whole genome sequences were downloaded from GISAID; all other data are from NCBI BLAST databases (nt, gss, and env_nt; last updated December 2019)

Attached zip file with data visualizations: nCoV_pset_report.zip (459.5 KB)

Analyses:

BioLaboro is an application for rapidly designing de novo assays and validating existing PCR detection assays. It is a user-friendly assay discovery pipeline composed of three tools: BioVelocity®, Primer3, and PSET.

BioVelocity® uses a rapid, accurate hashing algorithm to align sequencing reads to a large set of references (e.g. GenBank) (Sozhamannan et al., 2015). It creates a k-mer index to determine all possible matches between query sequences and references simultaneously using a large-RAM system (i.e. an IBM Power8). This algorithm makes it possible to very quickly identify sequences that are conserved within, or omitted from, a set of target references.

Primer3 (http://primer3.sourceforge.net/) is a tool for designing primers and probes for real-time PCR reactions. It considers a range of criteria such as oligonucleotide melting temperature, size, GC content, and primer-dimer possibilities. We use Primer3 along with our signature detection process to identify potential new primer sets.

PSET (PCR Signature Erosion Tool) tests PCR assays in silico against the latest versions of public sequence repositories, or other reference datasets, to determine whether primers and probes match only their intended targets. With this information, an assay provider can be better aware of potential false hits and better prepared to design new primers when false hits become intractable.
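
The idea of using a k-mer index to find sequences conserved within a target set and absent from other references can be illustrated with a toy sketch. This is not BioVelocity's algorithm, which is not described in detail here; the function names and the short sequences below are invented purely for illustration, and a real run would use whole genomes and a much larger k.

```python
def kmers(seq, k):
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def conserved_unique_kmers(targets, background, k):
    """Return k-mers found in every target and in no background sequence."""
    shared = set.intersection(*(kmers(t, k) for t in targets))
    for ref in background:
        shared -= kmers(ref, k)
    return shared


# Toy sequences, made up for this example (not real genome data).
targets = ["ACGTACGGTTCA", "TTACGTACGGTA", "GACGTACGGTTT"]
background = ["ACGTAAAACCCC", "GGGGTACGGTAA"]

signatures = conserved_unique_kmers(targets, background, k=6)
print(sorted(signatures))
```

Candidate regions found this way would still need primer and probe design (the Primer3 step) and specificity checks against public databases (the PSET step).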

Results:

The BioLaboro application detected four highly specific signature regions that hit all 25 Wuhan coronavirus genomes available at the time of analysis (Table 1). The signatures occur in disparate locations on the genome (Figure 1). Three of the four signature regions did not sufficiently align to any other known coronavirus or other organism in the NCBI BLAST databases (Table 2).

Table 1. List of PCR assays evaluated in this analysis. The first four assays were newly created using BioLaboro to be specific to Wuhan coronavirus. The last four assays are from "Diagnostic detection of Wuhan coronavirus 2019 by real-time RT-PCR" by Corman et al. 2020.

Identifier length forward probe reverse
2019-nCoV-noblis_1 165 TGATGGTGGTGTCACTCGTG TGGTTTAGCCAGCGTGGTGGT GAAGTGGGTTTTGTCGTGCC
2019-nCoV-noblis_2 168 GCCGCTGTTGATGCACTATG ACGTGCTCGTGTAGAGTGTTTTGAT ATGCATTGCCTGAGACGACA
2019-nCoV-noblis_3 272 CGGATGGCTTATTGTTGGCG TGCTCGTTGCTGCTGGCCTT TTGGCTTTGCTGGAAATGCC
2019-nCoV-noblis_4 218 TGTCGTTGACAGGACACGAG TTCGTCCGTGTTGCAGCCGA CGTACGTGGCTTTGGAGACT
ncov_e_gene 113 ACAGGTACGTTAATAGTTAATAGCGT ACACTAGCCATCCTTACTGCGCTTCG TGTGTGCGTACTGCTGCAATAT
ncov_n_gene 128 CACATTGGCACCCGCAATC ACTTCCTCAAGGAACAACATTGCCA CAAGCCTCTTCTCGTTCCTC
ncov_rdrp_1 100 GTGARATGGTCATGTGTGGCGG CCAGGTGGWACRTCATCMGGTGATGC TATGCTAATAGTGTSTTTAACATYTG
ncov_rdrp_2 100 GTGARATGGTCATGTGTGGCGG CAGGTGGAACCTCATCAGGAGATGC TATGCTAATAGTGTSTTTAACATYTG
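
Several oligos in the table above contain IUPAC ambiguity codes (R, W, M, S, Y), so a plain string comparison is not enough to count mismatches. The sketch below shows one way to compare a degenerate oligo with a target of equal length. The function name is hypothetical, and the target is constructed for illustration by placing T at the S and Y positions; it is not taken from a genome file.

```python
# IUPAC nucleotide codes mapped to the bases each one allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def mismatches(oligo, target):
    """Return positions where the target base is not allowed by the oligo code."""
    if len(oligo) != len(target):
        raise ValueError("oligo and target must have equal length")
    return [i for i, (o, t) in enumerate(zip(oligo, target))
            if t not in IUPAC[o]]


rdrp_reverse = "TATGCTAATAGTGTSTTTAACATYTG"
# Illustrative target (assumption): T at both degenerate positions.
target = rdrp_reverse.replace("S", "T").replace("Y", "T")

print(mismatches(rdrp_reverse, target))
```

Here Y (C or T) accepts the T, while S (C or G) does not, so the only mismatch is at the S position.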

Figure 1. Map of the Wuhan genome (NCBI Accession: MN908947.3) with assay signature locations (created using DNA Features Viewer Python library). Corman assays in blue, Noblis assays in red, and gene regions in green.

Table 2. Results from PSET analysis. The four new Noblis assays were compared alongside the four assays from Corman. Each assay was tested using Wuhan coronavirus (25 genomes) as the intended target. All off-target hits (PF, FP) are to entries in NCBI BLAST databases (nt, gss, and env_nt).

Assay Confusion
identifier type target PT TP TN PF FP FN
2019-nCoV-noblis_1 Probe Wuhan 22 3 415 NA NA NA
2019-nCoV-noblis_2 Probe Wuhan 25 NA 691 NA NA NA
2019-nCoV-noblis_3 Probe Wuhan 25 NA 656 NA NA NA
2019-nCoV-noblis_4 Probe Wuhan 25 NA 356 NA 3 NA
ncov_e_gene Probe Wuhan 25 NA 75 353 15 NA
ncov_n_gene Probe Wuhan 25 NA 73 NA 339 NA
ncov_rdrp_1 Probe Wuhan NA* 25 169 433 85 NA
ncov_rdrp_2 Probe Wuhan NA* 25 586 1 61 NA
PT = Perfect True Positive = All assay components hit with 100% identity to the correct target
TP = True Positive = All assay components hit with >=90% identity over >=90% of the component length to the correct target
TN = True Negative = Partial hit to assay amplicon but insufficient assay component alignments to an incorrect target
PF = Perfect False Positive = All assay components hit with 100% identity to an incorrect target
FP = False Positive = All assay components hit with >=90% identity over >=90% of the component length to an incorrect target
FN = False Negative = Partial hit to assay amplicon but insufficient assay component alignments to the correct target
* The ncov_rdrp_1 and ncov_rdrp_2 assays have one mismatch in the reverse primer to all current Wuhan coronavirus genomes, which prevents them from being called perfect hits. The mismatch is between the ambiguous base code “S” (G or C) and the reference base “T” in the middle of the primer, and it will likely not affect binding.
NA = 0.
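
The outcome labels defined above can be written as a small decision rule. This is only a sketch of the definitions as stated in the legend, not PSET's implementation. The `ComponentHit` class and `classify` function are hypothetical names, and treating a perfect hit as 100% identity over 100% of the component length is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ComponentHit:
    identity: float  # percent identity of the aligned region
    coverage: float  # percent of the component length that aligned


def classify(components, on_target, threshold=90.0):
    """Label one subject sequence from its per-component alignments.

    components holds one ComponentHit per assay component (forward,
    probe, reverse), or None where a component did not align.
    """
    found = [c for c in components if c is not None]
    complete = len(found) == len(components)
    perfect = complete and all(
        c.identity == 100.0 and c.coverage == 100.0 for c in found)
    passing = complete and all(
        c.identity >= threshold and c.coverage >= threshold for c in found)
    if perfect:
        return "PT" if on_target else "PF"
    if passing:
        return "TP" if on_target else "FP"
    # Partial hit: not every component aligned well enough.
    return "FN" if on_target else "TN"


full = ComponentHit(100.0, 100.0)
one_mismatch = ComponentHit(96.2, 100.0)
print(classify([full, one_mismatch, full], on_target=True))
```

A single mismatch in one component, as in the rdrp reverse primer case, moves a target genome from PT to TP under this rule.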

Conclusions and Caveats:

These are preliminary analyses based on the 25 sequences available at the time.

New sequences are being added on an hourly basis, and these Noblis signatures need to be tested against them to verify that no signature erosion is occurring, as described for Ebola sequences in Sozhamannan et al. 2015. These designs were generated entirely in silico and have yet to be tested in the lab. Although the BioLaboro pipeline is built on sound scientific principles, and its in silico components have been demonstrated in analyses of Ebola and Lassa viruses (Sozhamannan et al. 2015; Wiley et al. 2019), these assays await validation before conclusions regarding their use for clinical testing can be made.

Our intent in publishing these nCoV real-time PCR assays is to make the community aware of the existence of these potentially unique signature regions, as well as the availability of BioLaboro for rapid evaluation of existing assays and design of new assays.

References:

  1. Corman, Victor, Tobias Bleicker, Sebastian Brünink, Christian Drosten (Charité Virology, Berlin, Germany), Olfert Landt (Tib-Molbiol, Berlin, Germany), Marion Koopmans (Erasmus MC, Rotterdam, The Netherlands), and Maria Zambon (Public Health England, London), with additional advice by Malik Peiris (University of Hong Kong). “Diagnostic detection of Wuhan coronavirus 2019 by real-time RT-PCR: protocol and preliminary evaluation as of Jan 13, 2020.” Institut für Virologie (Institute of Virology).

  2. Sozhamannan, Shanmuga, et al. “Evaluation of signature erosion in Ebola virus due to genomic drift and its impact on the performance of diagnostic assays.” Viruses 7.6 (2015): 3130-3154.

  3. Wiley, Michael R., et al. “Lassa virus circulating in Liberia: a retrospective genomic characterisation.” The Lancet Infectious Diseases 19.12 (2019): 1371-1378.

Appendix 1: Acknowledgement for 2019-nCoV genome sequences

The following table is from this post at Virological.org: Phylogenetic analysis of 23 nCoV-2019 genomes, 2020-01-23.

Table 3. nCoV-2019 genome sequences used in this analysis, with GISAID accession numbers and submitting labs.

GISAID Accession Strain Location Collection date Lab
EPI_ISL_404227 BetaCoV/Zhejiang/WZ-01/2020 Zhejiang, China 2020-01-16 1
EPI_ISL_404228 BetaCoV/Zhejiang/WZ-02/2020 Zhejiang, China 2020-01-17 1
EPI_ISL_402132 BetaCoV/Wuhan/HBCDC-HB-01/2019 China/Hubei Province 2019-12-30 2
EPI_ISL_402127 BetaCoV/Wuhan/WIV02/2019 China / Hubei Province / Wuhan City 2019-12-30 3
EPI_ISL_402128 BetaCoV/Wuhan/WIV05/2019 China / Hubei Province / Wuhan City 2019-12-30 3
EPI_ISL_402129 BetaCoV/Wuhan/WIV06/2019 China / Hubei Province / Wuhan City 2019-12-30 3
EPI_ISL_402130 BetaCoV/Wuhan/WIV07/2019 China / Hubei Province / Wuhan City 2019-12-30 3
EPI_ISL_403963 BetaCoV/Nonthaburi/74/2020 Thailand/ Nonthaburi Province 2020-01-13 4
EPI_ISL_403962 BetaCoV/Nonthaburi/61/2020 Thailand/ Nonthaburi Province 2020-01-08 4
EPI_ISL_402120 BetaCoV/Wuhan/IVDC-HB-04/2020 China / Hubei Province / Wuhan City 2020-01-01 5
EPI_ISL_402119 BetaCoV/Wuhan/IVDC-HB-01/2019 China / Hubei Province / Wuhan City 2019-12-30 5
EPI_ISL_402121 BetaCoV/Wuhan/IVDC-HB-05/2019 China / Hubei Province / Wuhan City 2019-12-30 5
EPI_ISL_402124 BetaCoV/Wuhan/WIV04/2019 China / Hubei Province / Wuhan City 2019-12-30 6
EPI_ISL_402123 BetaCoV/Wuhan/IPBCAMS-WH-01/2019 China / Hubei Province / Wuhan City 2019-12-24 7
EPI_ISL_402125 BetaCoV/Wuhan-Hu-1/2019 China 2019-12 8
EPI_ISL_403931 BetaCoV/Wuhan/IPBCAMS-WH-02/2019 China / Hubei Province / Wuhan City 2019-12-30 9
EPI_ISL_403928 BetaCoV/Wuhan/IPBCAMS-WH-05/2020 China / Hubei Province / Wuhan City 2020-01-01 9
EPI_ISL_403930 BetaCoV/Wuhan/IPBCAMS-WH-03/2019 China / Hubei Province / Wuhan City 2019-12-30 9
EPI_ISL_403929 BetaCoV/Wuhan/IPBCAMS-WH-04/2019 China / Hubei Province / Wuhan City 2019-12-30 9
EPI_ISL_403937 BetaCoV/Guangdong/20SF040/2020 Guangdong, China 2020-01-18 10
EPI_ISL_403936 BetaCoV/Guangdong/20SF028/2020 Guangdong, China 2020-01-17 10
EPI_ISL_403935 BetaCoV/Guangdong/20SF025/2020 Guangdong, China 2020-01-15 10
EPI_ISL_403934 BetaCoV/Guangdong/20SF014/2020 Guangdong, China 2020-01-15 10
EPI_ISL_403933 BetaCoV/Guangdong/20SF013/2020 Guangdong, China 2020-01-15 10
EPI_ISL_403932 BetaCoV/Guangdong/20SF012/2020 Guangdong, China 2020-01-14 10

[1] Department of Microbiology, Zhejiang Provincial Center for Disease Control and Prevention

[2] Hubei Provincial Center for Disease Control and Prevention

[3] Wuhan Institute of Virology, Chinese Academy of Sciences

[4] Department of Medical Sciences, Ministry of Public Health, Thailand & Thai Red Cross Emerging Infectious Diseases - Health Science Centre & Department of Disease Control, Ministry of Public Health, Thailand

[5] National Institute for Viral Disease Control and Prevention, China CDC

[6] Wuhan Institute of Virology, Chinese Academy of Sciences

[7] Institute of Pathogen Biology, Chinese Academy of Medical Sciences & Peking Union Medical College

[8] National Institute for Communicable Disease Control and Prevention (ICDC) Chinese Center for Disease Control and Prevention (China CDC)

[9] Institute of Pathogen Biology, Chinese Academy of Medical Sciences & Peking Union Medical College

[10] Department of Microbiology, Guangdong Provincial Center for Diseases Control and Prevention


I couldn’t match the primers from the Lancet article (Huang et al.) to the nCoV sequences.
Can anybody check these primers?

https://doi.org/10.1016/S0140-6736(20)30183-5
forward primer 5′-TCAGAATGCCAATCTCCCCAAC-3′;
reverse primer 5′-AAAGGTCCACCCGATACATTGA-3′;

https://www.thelancet.com/action/showPdf?pii=S0140-6736%2820%2930183-5

Reported in Methods section, in Procedures subsection on page 2 of article:

“The primers and probe target to envelope gene of CoV were used and the sequences were as follows: forward primer 5′-TCAGAATGCCAATCTCCCCAAC-3′; reverse primer 5′-AAAGGTCCACCCGATACATTGA-3′; and the probe 5′CY5-CTAGTTACACTAGCCATCCTTACTGC-3′BHQ1.”

The forward and reverse primers reported here do not seem to correspond to nCoV sequences. However, I found them in another paper, listed below, and they are from Saffold cardiovirus.

Saffold Cardiovirus in Children with Acute Gastroenteritis, Beijing, China. Lili Ren, Richard Gonzalez, Yan Xiao, Xiwei Xu, Lan Chen, Guy Vernet, Gláucia Paranhos-Baccalà, Qi Jin, and Jianwei Wang

“Because VP1 genes of 2 SAFV-positive samples could not be amplified in this way, a newly designed primer pair (cardioVP1Fn: TCAGAATGCCAATCTCCCCAAC and cardioVP1Rn: AAAGGTCCACCCGATACATTGA) was used in combination with cardioVP1-2F/3R to amplify the VP1 gene based on the sequences obtained from our positive samples.”

Emerging Infectious Diseases • www.cdc.gov/eid • Vol. 15, No. 9, September 2009


I got it. Thank you so much for your nice explanation and nice work!
I checked Saffold virus sequences.
The primer pair perfectly matches Saffold virus sequences.

PSET results updated with three new 2019-nCoV genomes; 28 total genomes. Also adding three assays for comparison from the CDC (https://www.cdc.gov/coronavirus/2019-ncov/downloads/rt-pcr-panel-primer-probes.pdf). New genomes have resulted in mismatches in previously perfect alignments for some assays.

Table 1. Results from PSET analysis. The four Noblis assays were compared alongside the four assays from Corman and three assays from the CDC. Each assay was tested using 2019-nCoV (28 genomes) as the intended target. All off-target hits (TN, PF, FP) are to entries in NCBI BLAST databases (nt, gss, and env_nt).

identifier provider PT TP TN PF FP FN
2019-nCoV-noblis_1 Noblis 23 5 415 NA NA NA
2019-nCoV-noblis_2 Noblis 27 1 691 NA NA NA
2019-nCoV-noblis_3 Noblis 27 1 638 NA NA NA
2019-nCoV-noblis_4 Noblis 28 NA 356 NA 3 NA
ncov_e_gene Corman et al 28 NA 75 353 15 NA
ncov_n_gene Corman et al 27 NA 73 NA 339 1
ncov_rdrp_1 Corman et al NA 28 169 433 85 NA
ncov_rdrp_2 Corman et al 1 27 586 1 61 NA
2019-nCoV_N1 CDC 27 1 381 NA NA NA
2019-nCoV_N2 CDC 27 1 371 NA NA NA
2019-nCoV_N3 CDC 28 NA 48 NA 346 NA

PSET results updated with new 2019-nCoV genomes; 46 total genomes.

Table 1. Results from PSET analysis. The four Noblis assays were compared alongside the four assays from Corman and three assays from the CDC. Each assay was tested using 2019-nCoV (46 genomes) as the intended target. All off-target hits (TN, PF, FP) are to entries in NCBI BLAST databases (nt, gss, and env_nt).

identifier provider PT TP TN PF FP FN
2019-nCoV-noblis_1 Noblis 36 10 415 NA NA NA
2019-nCoV-noblis_2 Noblis 45 1 689 NA NA NA
2019-nCoV-noblis_3 Noblis 45 1 637 NA NA NA
2019-nCoV-noblis_4 Noblis 46 NA 356 NA 3 NA
ncov_e_gene Corman et al 46 NA 75 353 15 NA
ncov_n_gene Corman et al 44 1 73 NA 339 1
ncov_rdrp_1 Corman et al NA 46 169 433 85 NA
ncov_rdrp_2 Corman et al 1 45 586 1 61 NA
cdc_n1 CDC 44 2 381 NA NA NA
cdc_n2 CDC 45 1 371 NA NA NA
cdc_n3 CDC 45 1 48 NA 346 NA

Figure 1. Updated genome map showing locations of all assays. Map of the Wuhan genome (NCBI Accession: MN908947.3) with assay signature locations (created using DNA Features Viewer Python library). Corman assays in blue, Noblis assays in red, CDC assays in purple, and gene regions in green.

PSET results updated with new 2019-nCoV genomes; 66 total genomes.

Table 1. Results from PSET analysis. The four Noblis assays were compared alongside the four assays from Corman and three assays from the CDC. Each assay was tested using 2019-nCoV (66 genomes) as the intended target. All off-target hits (TN, PF, FP) are to entries in NCBI BLAST databases (nt, gss, and env_nt).

identifier provider PT TP TN PF FP FN
2019-nCoV-noblis_1 Noblis 47 19 417 NA NA NA
2019-nCoV-noblis_2 Noblis 65 1 688 NA NA NA
2019-nCoV-noblis_3 Noblis 65 1 682 NA NA NA
2019-nCoV-noblis_4 Noblis 65 NA 357 NA 3 1
ncov_e_gene Corman et al 66 NA 76 353 15 NA
ncov_n_gene Corman et al 64 1 74 NA 339 1
ncov_rdrp_1 Corman et al NA 66 169 433 85 NA
ncov_rdrp_2 Corman et al 1 65 586 1 61 NA
cdc_n1 CDC 64 2 381 NA NA NA
cdc_n2 CDC 65 1 371 NA NA NA
cdc_n3 CDC 65 1 48 NA 346 NA

PSET results updated with new 2019-nCoV genomes; 96 total genomes. Now just using the subset of genomes on GISAID marked as high quality. Sequence IDs tested in this analysis listed here: ncov_ids_tested.zip (386 Bytes)

Assays continue to perform well against new genome submissions. Only one assay, 2019-nCoV-noblis_4, is showing potential false negatives. Assay 2019-nCoV-noblis_1 has a single mismatch in the probe for 27 genomes, resulting in fewer perfect matches, but it is still likely functional for all 96 genomes.

Table 1. Results from PSET analysis. The four Noblis assays were compared alongside the four assays from Corman and three assays from the CDC. Each assay was tested using 2019-nCoV (96 genomes) as the intended target. All off-target hits (TN, PF, FP) are to entries in NCBI BLAST databases (nt, gss, and env_nt).

identifier provider PT TP TN PF FP FN
2019-nCoV-noblis_1 Noblis 69 27 372 NA NA NA
2019-nCoV-noblis_2 Noblis 96 NA 568 NA NA NA
2019-nCoV-noblis_3 Noblis 96 NA 439 NA NA NA
2019-nCoV-noblis_4 Noblis 94 NA 343 NA 3 2
ncov_e_gene Corman et al 96 NA 42 353 15 NA
ncov_n_gene Corman et al 95 1 55 NA 339 NA
ncov_rdrp_1 Corman et al NA 96 75 433 87 NA
ncov_rdrp_2 Corman et al NA 96 526 1 66 NA
cdc_n1 CDC 95 1 363 NA NA NA
cdc_n2 CDC 95 1 361 NA NA NA
cdc_n3 CDC 94 2 17 NA 346 NA

PSET results updated with new 2019-nCoV genomes; 129 total genomes. Now just using the subset of genomes on GISAID marked as high quality. Sequence IDs tested in this analysis listed here: ncov_ids_tested.zip (437 Bytes)

Assays have more potential false negatives, although the majority of FNs (38 out of 42) are from samples collected from pangolins. Only assays ncov_rdrp_1 and cdc_n3 currently have no potential FNs.

Table 1. Results from PSET analysis. The four Noblis assays were compared alongside the four assays from Corman and three assays from the CDC. Each assay was tested using 2019-nCoV (129 genomes) as the intended target. All off-target hits (TN, PF, FP) are to entries in NCBI BLAST databases (nt, gss, and env_nt).

identifier provider PT TP TN PF FP FN
2019-nCoV-noblis_1 Noblis 91 32 372 NA NA 6
2019-nCoV-noblis_2 Noblis 123 NA 568 NA NA 6
2019-nCoV-noblis_3 Noblis 122 2 439 NA NA 5
2019-nCoV-noblis_4 Noblis 121 5 343 NA 3 3
ncov_e_gene Corman et al 127 1 42 353 15 1
ncov_n_gene Corman et al 122 2 55 NA 339 5
ncov_rdrp_1 Corman et al NA 129 75 433 87 NA
ncov_rdrp_2 Corman et al NA 124 526 1 66 5
cdc_n1 CDC 122 2 363 NA NA 5
cdc_n2 CDC 122 1 361 NA NA 6
cdc_n3 CDC 121 8 17 NA 346 NA
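
As a quick check of the FN summary for this update, the FN column of the table above can be tallied directly. The counts are copied from the table, with NA read as 0; the variable names are just for illustration.

```python
# FN counts per assay for the 129-genome update (NA treated as 0).
fn_counts = {
    "2019-nCoV-noblis_1": 6,
    "2019-nCoV-noblis_2": 6,
    "2019-nCoV-noblis_3": 5,
    "2019-nCoV-noblis_4": 3,
    "ncov_e_gene": 1,
    "ncov_n_gene": 5,
    "ncov_rdrp_1": 0,
    "ncov_rdrp_2": 5,
    "cdc_n1": 5,
    "cdc_n2": 6,
    "cdc_n3": 0,
}

total_fn = sum(fn_counts.values())
no_fn = sorted(name for name, n in fn_counts.items() if n == 0)
print(total_fn, no_fn)
```

The total matches the 42 potential FNs quoted for this update, and the two assays with no FNs are ncov_rdrp_1 and cdc_n3.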

PSET results updated with new 2019-nCoV genomes; 152 total genomes. Now just using the subset of genomes on GISAID marked as high quality. Sequence IDs tested in this analysis listed here: ncov_ids_tested.zip (475 Bytes)

Table 1. Results from PSET analysis. The four Noblis assays were compared alongside the four assays from Corman and three assays from the CDC. Each assay was tested using 2019-nCoV (152 genomes) as the intended target. All off-target hits (TN, PF, FP) are to entries in NCBI BLAST databases (nt, gss, and env_nt).

identifier provider PT TP TN PF FP FN
2019-nCoV-noblis_1 Noblis 106 39 373 NA NA 7
2019-nCoV-noblis_2 Noblis 145 NA 568 NA NA 7
2019-nCoV-noblis_3 Noblis 144 2 439 NA NA 6
2019-nCoV-noblis_4 Noblis 144 5 343 NA 3 3
ncov_e_gene Corman et al 149 2 42 353 15 1
ncov_n_gene Corman et al 144 3 55 NA 339 5
ncov_rdrp_1 Corman et al NA 152 75 433 87 NA
ncov_rdrp_2 Corman et al NA 147 526 1 66 5
cdc_n1 CDC 144 3 363 NA NA 5
cdc_n2 CDC 144 1 361 NA NA 7
cdc_n3 CDC 143 9 17 NA 346 NA

PSET results updated with new 2019-nCoV genomes; 190 total genomes. Now just using the subset of genomes on GISAID marked as high quality sampled from humans (pangolin samples removed). Sequence IDs tested in this analysis listed here: ncov_ids_tested.zip (534 Bytes)

Table 1. Results from PSET analysis. The four Noblis assays were compared alongside the four assays from Corman and three assays from the CDC. Each assay was tested using 2019-nCoV (190 genomes) as the intended target. All off-target hits (TN, PF, FP) are to entries in NCBI BLAST databases (nt, gss, and env_nt).

identifier provider PT TP TN PF FP FN
2019-nCoV-noblis_1 Noblis 134 56 373 0 0 0
2019-nCoV-noblis_2 Noblis 188 2 568 0 0 0
2019-nCoV-noblis_3 Noblis 189 1 439 0 0 0
2019-nCoV-noblis_4 Noblis 184 0 343 0 3 6
ncov_e_gene Corman et al 187 2 42 353 15 1
ncov_n_gene Corman et al 189 1 55 0 339 0
ncov_rdrp_1 Corman et al 190 0 75 433 87 0
ncov_rdrp_2 Corman et al 190 0 526 1 66 0
cdc_n1 CDC 188 2 363 0 0 0
cdc_n2 CDC 189 1 361 0 0 0
cdc_n3 CDC 183 7 17 0 346 0

PSET results updated with new 2019-nCoV genomes; 432 total genomes. Just using the subset of genomes on GISAID marked as high quality sampled from humans. Sequence IDs tested in this analysis listed here: 432_ids.zip (833 Bytes)

Noblis assay 4 is showing some potential FNs, but most of these appear to result from gaps or sequencing errors at the very start of the genome in some sequences. This assay falls within the first 500 bp of the genome. All other assays are still performing very well in silico against new sequences.

Table 1. Results from PSET analysis. The four Noblis assays were compared alongside the four assays from Corman and three assays from the CDC. Each assay was tested using 2019-nCoV (432 genomes) as the intended target. All off-target hits (TN, PF, FP) are to entries in NCBI BLAST databases (nt, gss, and env_nt).

identifier provider PT TP TN PF FP FN
2019-nCoV-noblis_1 Noblis 325 105 373 0 0 2
2019-nCoV-noblis_2 Noblis 422 10 568 0 0 0
2019-nCoV-noblis_3 Noblis 431 1 439 0 0 0
2019-nCoV-noblis_4 Noblis 409 4 343 0 3 19
ncov_e_gene Corman et al 428 3 42 353 15 1
ncov_n_gene Corman et al 431 1 55 0 339 0
ncov_rdrp_1 Corman et al 0 432 75 433 87 0
ncov_rdrp_2 Corman et al 0 432 526 1 66 0
cdc_n1 CDC 429 3 363 0 0 0
cdc_n2 CDC 430 2 361 0 0 0
cdc_n3 CDC 412 20 17 0 346 0

PSET results updated with new 2019-nCoV genomes; 655 total genomes. Just using the subset of genomes on GISAID marked as high quality sampled from humans. Sequence IDs tested in this analysis listed here: 655_ids.zip (1.4 KB)

Previous Noblis assays showed some false negatives due to Ns, sequence gaps, and in one case the assay’s position at the very start of the genome. These have been replaced with five new assays generated at a later date using 96 complete genomes. The Noblis.57 assay and the German ncov_e_gene assay each have one FN that’s due to a stretch of Ns. All other assays still performing very well in silico against new sequences.

Table 1. Results from PSET analysis. The five Noblis assays were compared alongside the four assays from Corman and three assays from the CDC. Each assay was tested using 2019-nCoV (655 genomes) as the intended target. All off-target hits (TN, PF, FP) are to entries in NCBI BLAST databases (nt, gss, and env_nt).

identifier provider PT TP TN PF FP FN
Noblis.12 Noblis 653 2 2 0 0 0
Noblis.40 Noblis 635 20 316 0 0 0
Noblis.42 Noblis 655 0 275 0 0 0
Noblis.44 Noblis 655 0 277 0 0 0
Noblis.57 Noblis 654 0 359 0 0 1
ncov_e_gene Corman et al 651 3 42 353 15 1
ncov_n_gene Corman et al 652 3 55 0 339 0
ncov_rdrp_1 Corman et al 0 655 75 433 87 0
ncov_rdrp_2 Corman et al 1 654 526 1 66 0
cdc_n1 CDC 647 8 363 0 0 0
cdc_n2 CDC 653 2 361 0 0 0
cdc_n3 CDC 628 27 17 0 346 0

Figure 1. Map of the SARS-CoV-2 genome (NCBI Accession: MN908947.3) with assay signature locations (created using DNA Features Viewer Python library). Noblis assays in red, Corman assays in blue, CDC assays in purple, and gene regions in green.

PSET results updated with new 2019-nCoV genomes; 1620 total genomes. Just using the subset of genomes on GISAID marked as high quality sampled from humans. Sequence IDs tested in this analysis listed here: 1620_ids.zip (12.2 KB)

The Noblis.57 assay and the German ncov_e_gene assay still each have one FN that’s due to a stretch of Ns. All other assays still performing very well in silico against new sequences.

Table 1. Results from PSET analysis. The five Noblis assays were compared alongside the four assays from Corman and three assays from the CDC. Each assay was tested using 2019-nCoV (1620 genomes) as the intended target. All off-target hits (TN, PF, FP) are to entries in NCBI BLAST databases (nt, gss, and env_nt).

identifier provider PT TP TN PF FP FN
Noblis.12 Noblis 1616 4 2 0 0 0
Noblis.40 Noblis 1546 74 316 0 0 0
Noblis.42 Noblis 1620 0 275 0 0 0
Noblis.44 Noblis 1619 1 277 0 0 0
Noblis.57 Noblis 1605 14 359 0 0 1
ncov_e_gene Corman et al 1616 3 42 353 15 1
ncov_n_gene Corman et al 1608 12 55 0 339 0
ncov_rdrp_1 Corman et al 0 1620 75 433 87 0
ncov_rdrp_2 Corman et al 1 1619 526 1 66 0
cdc_n1 CDC 1591 29 363 0 0 0
cdc_n2 CDC 1616 4 361 0 0 0
cdc_n3 CDC 1560 60 17 0 346 0

Dear S. Sozhamannan,

please can you share the new primer and probe sequences of Noblis.12-57 as you did with 2019-nCoV-noblis_1-4? That would be great.

BTW: Do you know the primer sequences used by doi:10.1128/JCM.00310-20? They are not provided with their paper and the author did not respond within the last 12 days.

Thanks!


Sure thing! Here they are below. The full amplicon sequences are provided with the primers in brackets and probes in parentheses. We do not know the primers sequences used in that paper, but if we happen to find them we will let you know.

Identifier Amplicon
Noblis.12 [ACGGCAGTGAGGACAATCAG]ACAACTACTATTCAAACAATTGTTGAGGTTCAACCTCAATTAGAGATGGAACTTACACCAGTTGTTCAGACTATTGAAGTGAATAGTTTTAGTGGTTATTTAAAACTTACTGACAATGTATACATTAAAAATGCAGACATTGTGGAAGAAGCTAAAAAGGTAAAA(CCAACAGTGGTTGTTAATGCAGCCA)ATGTTTACCTTAA[ACATGGAGGAGGTGTTGCAG]
Noblis.40 [GCCGCTGTTGATGCACTATG]TGAGAAGGCATTAAAATATTTGCCTATAGATAAATGTAGTAGAATTATACCTGC(ACGTGCTCGTGTAGAGTGTTTTGAT)AAATTCAAAGTGAATTCAACATTAGAACAGTATGTCTTTTGTACTGTAA[ATGCATTGCCTGAGACGACA]
Noblis.42 [TGTACGTGCATGGATTGGCT](TCGATGTCGAGGGGTGTCATGCT)ACTAGAGAAGCTGTTGGTACCAATTTACCTTTACAGCTAGGTTTTTCTACAGGTGTTAACCTAGTTGCTGTACCTACAGGTTATGTTGATACACCTAATAATACAGATTTTTCCAGAGT[TAGTGCTAAACCACCGCCTG]
Noblis.44 [CAGGCACCTACACACCTCAG]TGTTGACACTAAATTCAAAACTGAAGGTTTATGTGTTGACATACCTGGCATACCTAAGGACATGACCTATAGAAGACTCATCTCTATGATGGGTTTTAAAATGAATTATCAAGTTAATGGTTACCCTAACATGTTTATCACCCGCGAAGAAGCTATAAGACATGTACGTGCATGGAT(TGGCTTCGATGTCGAGGGGTGT)CATGCTACTAGAGAAGCTGTTGGTACCAATTTACCTTTACAGCTAGGTTTTTCTACAGGTGTTAACCTAGTTGCTGTACCTACAGGTTATGTTGATACACCTAATAATACAGATTTTTCCAGAGT[TAGTGCTAAACCACCGCCTG]
Noblis.57 [TGCAGATGCTGGCTTCATCA]AACAATATGGTGATTGCCTTGGTGATATTGCTGCTAGAGACCTCATTTGTGCACAAAAGTTTA(ACGGCCTTACTGTTTTGCCACCT)TTGCTCACAGATGAAATGATTGCTCAATACACTT[CTGCACTGTTAGCGGGTACA]
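
For anyone who wants to pull the primers and probe out of the annotated amplicons above programmatically, a minimal parsing sketch follows. The `parse_amplicon` helper is a hypothetical name, and the code assumes exactly two bracketed primers and one parenthesized probe per amplicon, all written on the same strand as the amplicon itself.

```python
import re


def parse_amplicon(annotated):
    """Split an annotated amplicon into primers, probe, and plain sequence."""
    primers = re.findall(r"\[([ACGT]+)\]", annotated)
    probes = re.findall(r"\(([ACGT]+)\)", annotated)
    if len(primers) != 2 or len(probes) != 1:
        raise ValueError("expected two [primers] and one (probe)")
    # Remove the annotation characters to recover the bare amplicon.
    sequence = re.sub(r"[\[\]()]", "", annotated)
    return {
        "forward": primers[0],
        "probe": probes[0],
        "reverse": primers[1],
        "sequence": sequence,
        "length": len(sequence),
    }


noblis_40 = (
    "[GCCGCTGTTGATGCACTATG]"
    "TGAGAAGGCATTAAAATATTTGCCTATAGATAAATGTAGTAGAATTATACCTGC"
    "(ACGTGCTCGTGTAGAGTGTTTTGAT)"
    "AAATTCAAAGTGAATTCAACATTAGAACAGTATGTCTTTTGTACTGTAA"
    "[ATGCATTGCCTGAGACGACA]"
)
parsed = parse_amplicon(noblis_40)
print(parsed["forward"], parsed["probe"], parsed["reverse"], parsed["length"])
```

Noblis.40 uses the same primers and probe as 2019-nCoV-noblis_2 from the original post, and the parsed amplicon length agrees with the 168 listed for that assay.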

Small error identified in the results Table 1 from the previous post on 4/10; fixed below (4/15/2020).

PSET results updated with new 2019-nCoV genomes; 3996 total genomes. Just using the subset of genomes on GISAID marked as high quality sampled from humans. Sequence IDs tested in this analysis listed here: 3996_ids.zip (29.1 KB)

Table 1. Results from PSET analysis. The five Noblis assays were compared alongside the four assays from Corman and three assays from the CDC. Each assay was tested using 2019-nCoV (3996 genomes) as the intended target. All off-target hits (TN, PF, FP) are to entries in NCBI BLAST databases (nt, gss, and env_nt).

identifier provider PT TP TN PF FP FN
Noblis.12 Noblis 3984 12 2 0 0 0
Noblis.40 Noblis 3855 140 316 0 0 1
Noblis.42 Noblis 3990 6 275 0 0 0
Noblis.44 Noblis 3988 8 277 0 0 0
Noblis.57 Noblis 3967 25 359 0 0 4
ncov_e_gene Corman et al 3983 7 42 353 15 6
ncov_n_gene Corman et al 3965 31 55 0 339 0
ncov_rdrp_1 Corman et al 0 3995 75 433 87 1
ncov_rdrp_2 Corman et al 1 3995 526 1 66 0
cdc_n1 CDC 3919 77 363 0 0 0
cdc_n2 CDC 3982 14 361 0 0 0
cdc_n3 CDC 3883 113 17 0 346 0

Table 2. False negative hits along with affected assay and sequence identifiers.

num identifier accession outcome
1 ncov_e_gene France_ARA10968_2020_EPI_ISL_418431_2020-03-18 FN
2 ncov_e_gene USA_NY-NYUMC33_2020_EPI_ISL_418979_2020-03-17 FN
3 ncov_e_gene Italy_TE4959_2020_EPI_ISL_418259_2020-03-14 FN
4 ncov_e_gene France_ARA12626_2020_EPI_ISL_420612_2020-03-23 FN
5 ncov_e_gene France_ARA13074_2020_EPI_ISL_420622_2020-03-24 FN
6 ncov_e_gene France_Lyon_06042_2020_EPI_ISL_417333_2020-03-04 FN
7 ncov_rdrp_1 Australia_QLDID922_2020_EPI_ISL_418799_2020-02-28 FN
8 Noblis.40 USA_WA-S7_2020_EPI_ISL_416462_2020-02-24 FN
9 Noblis.57 Beijing_IVDC-BJ-005_2020_EPI_ISL_408485_2020-01-18 FN
10 Noblis.57 Canada_ON_PHL2223_2020_EPI_ISL_418381_2020 FN
11 Noblis.57 Canada_ON_PHL2273_2020_EPI_ISL_418383_2020 FN
12 Noblis.57 Canada_ON_PHL6884_2020_EPI_ISL_418343_2020-03-10 FN

Big update: added 14 new assays to monitor. Updated NCBI databases (nt, env_nt), 4/16/2020. Updated to 6,835 COVID-19 WGS from GISAID. Sequence IDs tested in this analysis listed here: 6835_ids.zip (44.5 KB)

Table 1. Results from PSET analysis. The five Noblis assays were compared alongside the four assays from Corman, three assays from the CDC, and new assays from China, Hong Kong, Thailand, Japan, and France. Each assay was tested using 2019-nCoV (6,835 genomes) as the intended target. All off-target hits (TN, PF, FP) are to entries in NCBI BLAST databases (nt, gss, and env_nt).

identifier type PT TP FN Percent True PF FP TN
Noblis.12 Probe 6793 41 1 99.99% 0 0 4
Noblis.40 Probe 6524 310 1 99.99% 0 1 318
Noblis.42 Probe 6825 8 2 99.97% 0 1 0
Noblis.44 Probe 6823 11 1 99.99% 0 1 31
Noblis.57 Probe 6779 50 6 99.91% 0 1 7
ncov_e_gene Probe 6813 13 9 99.87% 361 15 3
ncov_n_gene Probe 6784 51 0 100.00% 0 340 27
ncov_rdrp_1 Probe 0 6834 1 99.99% 430 59 239
ncov_rdrp_2 Probe 1 6834 0 100.00% 2 37 677
cdc_n1 Probe 6689 146 0 100.00% 0 2 0
cdc_n2 Probe 6787 26 22 99.68% 0 1 330
cdc_n3 Probe 6683 145 7 99.90% 1 3 1
China_N Probe 5606 47 1182 82.71% 0 2 2
China_ORF1ab Probe 6753 63 19 99.72% 1 0 348
France_nCoV_IP2 Probe 6820 14 1 99.99% 1 0 1
France_nCoV_IP4 Probe 6816 19 0 100.00% 0 0 3
HKU-N Probe 6762 48 25 99.63% 320 9 4
HKU-ORF1b-nsp14 Probe 6807 27 1 99.99% 1 0 0
Japan_NIID_2019-nCOV_N Probe 0 6812 23 99.66% 0 0 360
Japan_NIID_WH-1_F24381 Outer 6777 58 0 100.00% 0 1 455
Japan_NIID_WH-1_F501 Outer 6723 100 12 99.82% 0 1 3
Japan_NIID_WH-1_F509 Outer 6769 52 14 99.80% 1 0 3
Japan_NIID_WH-1_Seq_F24383 Outer 6778 57 0 100.00% 0 0 456
Japan_NIID_WH-1_Seq_F519 Outer 6795 22 18 99.74% 0 1 3
Japan_WuhanCoV-spk1 Outer 6786 49 0 100.00% 1 1 454
Thailand_WH-NIC_N Probe 6760 75 0 100.00% 0 2 0
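
The "Percent True" column is not defined in the post. The values in the table above are consistent with (PT + TP) / (PT + TP + FN), so the sketch below uses that formula as an assumption and checks it against three rows copied from the table. The function name is hypothetical.

```python
def percent_true(pt, tp, fn):
    """Percent of target genomes detected: (PT + TP) / (PT + TP + FN)."""
    total = pt + tp + fn
    return 100.0 * (pt + tp) / total


# (PT, TP, FN) values copied from the 6,835-genome table.
rows = {
    "Noblis.12": (6793, 41, 1),
    "cdc_n2": (6787, 26, 22),
    "China_N": (5606, 47, 1182),
}
for name, (pt, tp, fn) in rows.items():
    print(f"{name}: {percent_true(pt, tp, fn):.2f}%")
```

Under this reading, an assay with many TP but no PT hits (such as ncov_rdrp_1) still scores close to 100%, because imperfect matches above the 90% thresholds count as detections.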

Updated to 7,903 COVID-19 WGS from GISAID. Just using the subset of genomes on GISAID marked as high quality sampled from humans.

Table 1. Results from PSET analysis. The five Noblis assays were compared alongside the four assays from Corman, three assays from the CDC, and new assays from China, Hong Kong, Thailand, Japan, and France. Each assay was tested using 2019-nCoV (7,903 genomes) as the intended target. All off-target hits (TN, PF, FP) are to entries in NCBI BLAST databases (nt, gss, and env_nt).

identifier type PT TP FN Percent True PF FP TN
Noblis.12 Probe 7859 43 1 99.99% 0 0 680
Noblis.40 Probe 7569 333 1 99.99% 0 1 1420
Noblis.42 Probe 7892 9 2 99.97% 0 2 1136
Noblis.44 Probe 7890 12 1 99.99% 0 1 1786
Noblis.57 Probe 7837 59 7 99.91% 0 1 853
ncov_e_gene Probe 7878 16 9 99.89% 361 15 53
ncov_n_gene Probe 7841 60 2 99.97% 0 340 72
ncov_rdrp_1 Probe 0 7902 1 99.99% 433 105 5967
ncov_rdrp_2 Probe 1 7902 0 100.00% 2 73 5916
cdc_n1 Probe 7681 221 1 99.99% 0 2 381
cdc_n2 Probe 7848 32 23 99.71% 0 1 372
cdc_n3 Probe 7725 176 2 99.97% 1 353 36
China_N Probe 6514 58 1331 83.16% 0 2 691
China_ORF1ab Probe 7819 65 19 99.76% 1 0 367
France_nCoV_IP2 Probe 7866 36 1 99.99% 1 0 367
France_nCoV_IP4 Probe 7877 25 1 99.99% 0 0 1190
HKU-N Probe 7812 65 26 99.67% 327 33 34
HKU-ORF1b-nsp14 Probe 7872 30 1 99.99% 324 29 452
Japan_NIID_2019-nCOV_N Probe 0 7879 24 99.70% 0 0 384
Japan_NIID_WH-1_F24381 Outer 7836 67 0 100.00% 0 1 601
Japan_NIID_WH-1_F501 Outer 7762 128 13 99.84% 0 1 370
Japan_NIID_WH-1_F509 Outer 7826 62 15 99.81% 1 0 369
Japan_NIID_WH-1_Seq_F24383 Outer 7838 65 0 100.00% 0 0 606
Japan_NIID_WH-1_Seq_F519 Outer 7852 29 22 99.72% 0 1 369
Japan_WuhanCoV-spk1 Outer 7845 58 0 100.00% 1 1 594
Thailand_WH-NIC_N Probe 7822 80 1 99.99% 0 2 375

Updated to 15,411 COVID-19 WGS from GISAID. Using all genomes sampled from humans regardless of quality.

Table 1. Results from PSET analysis. The five Noblis assays were compared alongside the four assays from Corman, three assays from the CDC, and assays from China, Hong Kong, Thailand, Japan, and France. Each assay was tested using 2019-nCoV (15,411 genomes) as the intended target. All off-target hits (TN, PF, FP) are to entries in NCBI BLAST databases (nt, gss, and env_nt).

identifier type PT TP FN Percent True PF FP TN
Noblis.12 Probe 15251 71 89 99.42% 0 0 680
Noblis.40 Probe 14742 606 63 99.59% 0 1 1420
Noblis.42 Probe 15327 23 61 99.60% 0 2 1136
Noblis.44 Probe 15130 36 245 98.41% 0 1 1786
Noblis.57 Probe 15165 117 129 99.16% 0 1 853
ncov_e_gene Probe 15278 48 85 99.45% 361 15 53
ncov_n_gene Probe 15222 116 73 99.53% 0 340 72
ncov_rdrp_1 Probe 0 15366 45 99.71% 433 105 5967
ncov_rdrp_2 Probe 1 15369 41 99.73% 2 73 5916
cdc_n1 Probe 15034 339 38 99.75% 0 2 381
cdc_n2 Probe 14958 120 333 97.84% 0 1 372
cdc_n3 Probe 15059 312 40 99.74% 1 353 36
China_N Probe 12170 153 3088 79.96% 0 2 691
China_ORF1ab Probe 14330 132 949 93.84% 1 0 367
France_nCoV_IP2 Probe 15295 60 56 99.64% 1 0 367
France_nCoV_IP4 Probe 15326 53 32 99.79% 0 0 1190
HKU-N Probe 14947 129 335 97.83% 327 33 34
HKU-ORF1b-nsp14 Probe 15278 70 63 99.59% 324 29 452
Japan_NIID_2019-nCOV_N Probe 0 15079 332 97.85% 0 0 384
Japan_NIID_WH-1_F24381 Outer 15184 152 75 99.51% 0 1 601
Japan_NIID_WH-1_F501 Outer 15109 195 107 99.31% 0 1 370
Japan_NIID_WH-1_F509 Outer 15156 140 115 99.25% 1 0 369
Japan_NIID_WH-1_Seq_F24383 Outer 15180 151 80 99.48% 0 0 606
Japan_NIID_WH-1_Seq_F519 Outer 15004 232 175 98.86% 0 1 369
Japan_WuhanCoV-spk1 Outer 15210 122 79 99.49% 1 1 594
Thailand_WH-NIC_N Probe 15226 142 43 99.72% 0 2 375