Tracking SARS-CoV-2 VOC 202012/01 (lineage B.1.1.7) dissemination in Portugal: insights from nationwide RT-PCR Spike gene drop out data
Vítor Borges*,1, Carlos Sousa*,2, Luís Menezes3, António Maia Gonçalves4, Miguel Picão5, José Pedro Almeida6, Margarida Vieita6, Rafael Santos6, Ana Rita Silva2, Mariana Costa2, Luís Carneiro2, Joana Isidro1, Sílvia Duarte7, Luís Vieira7, Raquel Guiomar8, Susana Silva9, Baltazar Nunes9, João P Gomes1,#
Affiliations
National Institute of Health Dr. Ricardo Jorge (INSA):
1 Bioinformatics Unit, Department of Infectious Diseases;
7 Innovation and Technology Unit, Department of Human Genetics;
8 National Reference Laboratory for Influenza and other Respiratory Viruses, Department of Infectious Diseases;
9 Epidemiological Research Unit, Department of Epidemiology.
Unilabs:
2 Molecular Diagnostics Laboratory
3 Executive Office
4 Medical Office
5 IT Office
6 Data Intelligence
*These authors equally contributed to this work.
Contact
#Corresponding author:
João Paulo Gomes
Department of Infectious Diseases, National Institute of Health Doutor Ricardo Jorge, Av. Padre Cruz, 1649-016 Lisbon, Portugal. Tel.: (+351) 217 519 241; fax: (+351) 217 526 400.
E-mail: [email protected]
Summary
The SARS-CoV-2 lineage B.1.1.7, also designated Variant of Concern (VOC) 202012/01, has shown a pronounced frequency increase in the United Kingdom (where it likely emerged by late summer 2020) and is rapidly expanding its geographic range worldwide. Among the several mutations harbored by this variant, the Spike Δ69/70 deletion affects the detection of the S gene by some RT-PCR assays (e.g., TaqPath COVID-19 RT-PCR assay, ThermoFisher), leading to what has been termed “Spike gene target failure” (SGTF) or “Spike gene drop out”. In this context, SGTF is being successfully applied as a proxy to indicate carriage of the highly transmissible B.1.1.7 lineage and monitor its geotemporal dissemination.
In this brief report, we present comprehensive data supporting that the proportion of SGTF samples is greatly increasing in Portugal. We used data from 27096 cases confirmed positive by the ThermoFisher TaqPath RT-PCR assay, collected in 287 settings of a large laboratory (Unilabs) distributed throughout the country mainland since December 1st, 2020. We observed that the proportion of SGTF cases increased from ~1% in weeks 49-50 (2020) to 11.4% in week 2 (2021). Unexpectedly, we also detected a considerable proportion of TaqPath positive samples having Cycle threshold (Ct) values for S gene >5 units higher than the maximum Ct value obtained for the other two targets (N and ORF1ab) of the assay. So far, SARS-CoV-2 genome sequencing of samples showing this RT-PCR profile, here tentatively designated as “Spike gene target late amplification” (SGTL), identified the VOC 202012/01 (lineage B.1.1.7) in all of them. SGTL cases present an increasing trend in frequency (about 7-fold increase in relative proportion from week 49/2020 to week 2/2021) similar to that observed for SGTF cases. Also, both SGTF and SGTL samples had median Ct values of ORF1ab and N-gene targets within the same range, and these values were significantly lower than the ones observed for samples where S gene was unbiasedly detected. These results suggest that SGTL observation can constitute an additional proxy to detect and monitor the VOC 202012/01 variant. Current data on age distribution of SGTF+SGTL cases (median=35; IQR=21-48) does not seem to indicate a relevant shift in the age distribution when comparing with non-SGTF/SGTL cases (median=38; IQR=23-53).
In total, SGTF/SGTL cases represented 5.8% of all TaqPath positive cases detected since week 49 (2020), with this proportion reaching 13.3% at the end of week 2 (2021). The SGTF/SGTL proportion increased at a 70% (63%–76%, CI 95%) rate per week, with our forecast for the next three weeks, assuming no change in the increasing rate, indicating that the proportion of SGTF/SGTL cases can reach up to 60% of TaqPath positive cases by week 5.
In order to facilitate the assessment (and real-time monitoring) of SGTF/SGTL cases, their frequency and geographical dispersion, an interactive dashboard and data cube were developed and shared with the national public health authority. This approach may support timely public-health decisions by enabling early identification of geographical regions with increased incidence and circulation of Δ69-70-bearing SARS-CoV-2, in particular the VOC 202012/01 variant (lineage B.1.1.7).
Background
The SARS-CoV-2 lineage B.1.1.7, also designated Variant of Concern 202012/01 (VOC) by Public Health England, likely originated in the UK in late Summer to early Autumn 2020 [1-3] and has since shown a rampant frequency increase in the UK. It has been already detected in multiple countries worldwide [2,4], with the European Centre for Disease Prevention and Control (ECDC) assessing its impact in terms of hospitalisations and deaths in EU/EEA as high on December 29th, 2020 [2]. This variant possesses a large number of non-synonymous substitutions of biological / immunological significance [1,5] that are under active investigation, in particular Spike mutations Δ69-70, N501Y and P681H, as well as ORF8 Q27stop [1,5]. Importantly, the deletion of the 21765-21770 genome region, which deletes Spike amino acids 69 and 70 (Δ69-70), causes Spike gene target failure (SGTF) in some RT-PCR assays, such as the TaqPath COVID-19 RT-PCR assay (ThermoFisher). As only a small fraction of all new cases can be timely subjected to whole-genome sequencing, this coincidental occurrence has provided a good proxy for monitoring trends in VOC 202012/01, as SGTF has been shown to highly correlate with presence of Δ69-70 (at least when it already reached considerable prevalence in the community) [1,6,7]. Concordantly, in Portugal, 90% of positive samples with known SGTF status successfully sequenced at the Portuguese National Institute of Health (INSA) since December 1st were identified as B.1.1.7 lineage. Notwithstanding, Δ69-70 has independently arisen multiple times, so SGTF is logically a proxy for any SARS-CoV-2 lineage with that mutation [8,9].
Here, we assume that SGTF can also be a reliable indicator of B.1.1.7 circulation in Portugal, as currently observed in the UK, considering that: i) Portugal was among the top destinations of UK air travellers from major international London airports during early Autumn [10]; ii) INSA detected multiple independent introductions of B.1.1.7 lineage in Portugal since early December (SARS-CoV-2 Portugal); and, iii) the current epidemiological situation in Portugal places it among the EU/EEA countries with the highest 14-day notification rate of newly reported COVID-19 cases per 100 000 population, as of January 18th.
Results
In the present study, we investigated the proportion of SGTF cases as a means to get insights on VOC 202012/01 (B.1.1.7 lineage) frequency and geographic spread in Portugal. To achieve this, we took advantage of massive SARS-CoV-2 RT-PCR data comprehensively collected by a large laboratory (Unilabs) in 287 sample collection settings distributed throughout the country. Between week 49 (2020) and week 2 (2021), Unilabs performed more than 134 000 SARS-CoV-2 RT-PCR tests using the ThermoFisher TaqPath assay (targeting three regions of the SARS-CoV-2 genome: ORF1ab, N and S genes), which roughly corresponds to 7% of all SARS-CoV-2 RT-PCR tests done in Portugal during the same period.
Of the 27096 positive results, 1279 (4.72%) corresponded to SGTF tests, as defined by a positive test with non-detectable S gene and <=30 Cycle threshold (Ct) for N and ORF1ab targets. During the same period, we also unexpectedly detected a non-negligible proportion (296/27096; 1.09%) of TaqPath positive samples having Ct values for S gene >5 units higher than the maximum Ct value obtained for the other two targets (N and ORF1ab) of the assay (again, analysis only included positive samples with <=30 Ct for N and ORF1ab targets). We tentatively designate this RT-PCR profile as “Spike gene target late amplification” (SGTL). So far, SARS-CoV-2 genome sequencing of SGTL samples identified the VOC 202012/01 (lineage B.1.1.7) in all of them, which suggests that SGTL can also be considered a proxy to identify this variant. The Ct difference between the S gene and the N/ORF1ab genes for SGTL samples was consistently around 7 Ct units (median=6.7, IQR=5.8-7.6). We hypothesize that the “late amplification” might be due, for example, to probe misannealing. The technical reason behind the SGTL effect is still under investigation.
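The SGTF and SGTL definitions above can be expressed as a simple per-sample rule. The following Python sketch is illustrative only: the function name, the labels and the handling of samples outside the Ct cutoff (returned here as "unclassified") are assumptions, not the laboratory's actual pipeline.

```python
from typing import Optional

CT_CUTOFF = 30.0   # N and ORF1ab must both be <= 30 Ct (threshold stated in the text)
LATE_DELTA = 5.0   # S-gene Ct must exceed max(N, ORF1ab) by more than 5 units


def classify_s_gene_profile(ct_n: float, ct_orf1ab: float,
                            ct_s: Optional[float]) -> str:
    """Classify one TaqPath-positive sample from its three Ct values.

    ct_s is None when the S gene was not detected.
    """
    # Assumption: samples above the cutoff are left unclassified, because the
    # text does not state how they were counted.
    if ct_n > CT_CUTOFF or ct_orf1ab > CT_CUTOFF:
        return "unclassified"
    if ct_s is None:
        return "SGTF"  # S gene target failure
    if ct_s - max(ct_n, ct_orf1ab) > LATE_DELTA:
        return "SGTL"  # S gene target late amplification
    return "non-SGTF/SGTL"
```

For example, a sample with N and ORF1ab Ct values of 17.0 and 17.5 and an S-gene Ct of 24.2 (a gap of 6.7, the median reported for SGTL samples) would be labelled SGTL under this rule.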
Remarkably, the proportion of both SGTF and SGTL cases has been continuously rising since the beginning of December (Figure 1), reaching 11.4% and 1.9%, respectively, of all positive samples detected by this assay in week 2 (2021). In particular, since week 53 (2020), the aggregated proportion of SGTF/SGTL cases increased 4-fold, reaching a total of 13.3% in week 2 (2021). We thus extrapolate that, during week 2 (2021), about 8000 out of the 59951 COVID-19 confirmed cases in Portugal were caused by the VOC 202012/01 variant (lineage B.1.1.7). As shown in Figure 2, SGTF and SGTL cases are dispersed throughout mainland Portugal, indicating that this variant is widely spread and under active community transmission.
Figure 1. Proportion of SGTF and SGTL positive samples among all TaqPath positive samples detected, between week 49 (2020) and week 2 (2021), by a large laboratory distributed throughout the Portuguese territory. SGTF (S gene target failure) = positive test with non-detectable S gene and <=30 Cycle threshold (Ct) for N and ORF1ab targets; SGTL (S gene target late amplification) = positive test having Ct values for S gene >5 units higher than the maximum Ct value obtained for the other two targets (N and ORF1ab) of the assay (exclusively includes positive samples with <=30 Ct for N and ORF1ab targets).
Figure 2. Location of Unilabs sample collection points where SGTF and SGTL were detected between week 49 (2020) and week 2 (2021).
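The proportions and the national extrapolation reported above follow from simple arithmetic on the published counts. The short Python sketch below reproduces them; the helper names are hypothetical, and the extrapolation assumes, as the text does, that the week 2 SGTF/SGTL share among TaqPath positives applies to all confirmed cases nationwide.

```python
def share(count: int, total: int) -> float:
    """Percentage of `count` within `total`."""
    return 100.0 * count / total


def extrapolate_cases(proportion: float, confirmed_cases: int) -> int:
    """Estimated number of cases, assuming the proportion holds nationally."""
    return round(proportion * confirmed_cases)


TOTAL_POSITIVE = 27096  # TaqPath positives, week 49 (2020) to week 2 (2021)
SGTF_COUNT = 1279
SGTL_COUNT = 296

print(round(share(SGTF_COUNT, TOTAL_POSITIVE), 2))               # 4.72
print(round(share(SGTL_COUNT, TOTAL_POSITIVE), 2))               # 1.09
print(round(share(SGTF_COUNT + SGTL_COUNT, TOTAL_POSITIVE), 1))  # 5.8

# Week 2 (2021): 13.3% SGTF/SGTL among 59951 confirmed cases in Portugal
print(extrapolate_cases(0.133, 59951))                           # 7973, i.e. about 8000
```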
The SGTF/SGTL proportion increased at a 70% (63%–76%, CI 95%) rate per week, with our forecast for the next three weeks, assuming no change in the increasing rate, indicating that the proportion of SGTF/SGTL cases can reach up to 60% of positive cases by week 5 (Figure 3). A general lockdown was imposed during week 2, which is likely to mitigate this concerning increasing trend. It will be of utmost interest to assess whether the expected reduction in the number of positive cases in the next weeks will have an impact on the increasing relative frequency of SGTF/SGTL cases.
Figure 3. Estimate of weekly frequency time trend of SGTF/SGTL cases using a log-binomial model with 95% prediction interval from week 2 to 5.
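To give a feel for what a constant weekly growth rate implies, the sketch below compounds a starting proportion forward week by week. This is a naive illustration, not the log-binomial model used for Figure 3: it starts from the observed week 2 value (13.3%) rather than a fitted one, and the function name and the cap at 100% are assumptions.

```python
from typing import List


def project_share(start: float, weekly_growth: float, weeks: int) -> List[float]:
    """Compound a proportion by (1 + weekly_growth) per week, capped at 1.0."""
    shares = []
    current = start
    for _ in range(weeks):
        current = min(1.0, current * (1.0 + weekly_growth))
        shares.append(current)
    return shares


# Point estimate and 95% CI bounds of the weekly growth rate reported in the text
for growth in (0.63, 0.70, 0.76):
    week3, week4, week5 = project_share(0.133, growth, weeks=3)
    print(f"growth {growth:.0%}: week 5 share = {week5:.1%}")
```

Under these simplifying assumptions the week 5 share lands between roughly 58% and 73%, in the same neighbourhood as the reported forecast; the exact figure in the text comes from the fitted model and its prediction interval.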
We also investigated cycle threshold (Ct) values (a proxy for viral load) in SGTF and SGTL versus non-SGTF/SGTL positive samples. Notably, both SGTF and SGTL samples had significantly lower median Ct values of ORF1ab and N-gene targets compared to samples where S gene was unbiasedly detected within the same range of Ct values as the other detected genes (i.e., non-SGTF/SGTL positive samples) (Figure 4). Median Ct values observed for SGTL (gene N: median=17.3, IQR=14.5-21.1; gene ORF: median=17.6, IQR=14.9-20.8) and SGTF (gene N: median=16.9, IQR=13.1-23.8; gene ORF: median=17.3, IQR=13.6-22.8) positive samples were about 2-4 Cts lower than those observed for the control group (gene N: median=21.5, IQR=16.5-26.0; gene ORF: median=19.6, IQR=15.5-23.6). This observation not only corroborates previous findings that SGTF samples (associated with VOC 202012/01) are more likely to present higher viral loads [2,6], but also consolidates that SGTL samples might be an additional surrogate to identify Δ69-70-bearing SARS-CoV-2 variants.
Figure 4. Scatter, violin and box plot of the ORF1ab and N-gene Cycle threshold (Ct) values obtained for SGTL and SGTF samples compared with samples where S gene was unbiasedly detected within the same order of Ct values as the other detected genes (i.e., non-SGTF/SGTL positive samples). Median Ct is shown by a black horizontal bar and the results of tests for significant differences are shown above both plots with conventional representation. The Kruskal-Wallis one-way ANOVA non-parametric test was used to assess the existence of statistically significant differences in Ct values between groups. Differences in Ct values for each pair of groups were assessed using the Dunn test adjusted for multiple comparison tests with Bonferroni correction.
Current data on age distribution of SGTF+SGTL cases does not seem to indicate a relevant shift in the age composition when comparing with non-SGTF/SGTL cases (Figure 5 and Figure 6), as observed in previous reports in the UK [11,12]. For both groups, the highest frequencies are observed for individuals aged 20 to 49 years. Although statistically different (p<0.001), the age distribution of the SGTF+SGTL group (median=35; IQR=21-48) did not differ substantially from that of the non-SGTF/SGTL group (median=38; IQR=23-53) (Figure 6).
Figure 5. Age distribution of SGTF / SGTL-associated individuals compared with cases where S gene was unbiasedly detected within the same order of Ct values as the other detected genes (i.e., non-SGTF/SGTL positive samples).
Figure 6. Violin and box-plot depicting the age distribution of SGTF / SGTL-associated patients compared with cases where S gene was unbiasedly detected within the same order of Ct values as the other detected genes (i.e., non-SGTF/SGTL positive samples). The Mann–Whitney–Wilcoxon test was used to assess the existence of statistically significant differences in age (years) between groups (p = 5.71e-09).
In order to facilitate real-time monitoring of SGTF/SGTL cases, their frequency and geographical dispersion, an interactive dashboard and data cube were developed by Unilabs and shared with the national public health authority (Figure 7). This dashboard relies on Unilabs Intelli4Covid, a BigData platform that constantly captures and correlates data from multiple data sources associated with Unilabs Portugal Covid-19 operations, in real-time, allowing Unilabs to immediately obtain actionable insights on changing patterns of pandemic evolution. We anticipate that this tool may support timely and tailored public-health decisions by enabling early identification of geographical regions with increased incidence and circulation of Δ69-70-bearing SARS-CoV-2, in particular the VOC 202012/01 variant (lineage B.1.1.7).
Figure 7. Screenshots of Unilabs dashboard for real-time monitoring of SGTF/SGTL cases.
Conclusions
In the present study, we investigated the proportion of SGTF/SGTL cases as a means to get insights on VOC 202012/01 (B.1.1.7 lineage) frequency and geographic spread in Portugal. We used massive SARS-CoV-2 RT-PCR data comprehensively collected by a large laboratory spread throughout the country. The main observations were:
- Besides S gene target failure (SGTF) in ThermoFisher TaqPath RT-PCR assay, we observed that Spike gene target late amplification (SGTL) samples can be an additional proxy to identify the highly transmissible VOC 202012/01 (B.1.1.7 lineage) and monitor its geotemporal dissemination.
- To our knowledge, the SGTL effect on RT-PCR has not been reported yet. Still, the technical reason behind SGTL effect is under investigation.
- SGTF/SGTL cases represented 5.8% (1575 / 27096) of all TaqPath positive cases detected since week 49 (2020), with this proportion reaching 13.3% at the end of week 2 (2021).
- The SGTF/SGTL proportion increased at a 70% (63%–76%, CI 95%) rate per week, with our forecast for the next three weeks indicating that the proportion of SGTF/SGTL cases can reach up to 60% of TaqPath positive cases by week 5.
- Although discordant results have been reported regarding the association between VOC 202012/01 (lineage B.1.1.7) and high viral loads [2,6,12], our data collected so far suggests that patients whose samples exhibit either the SGTF or the SGTL effect in the TaqPath test are more likely to have high viral loads at the time of sampling.
- Current data on age distribution of SGTF+SGTL cases does not seem to indicate a relevant shift in the age composition when comparing with non-SGTF/SGTL cases.
Portugal is facing a highly concerning epidemic situation, being among the countries with the highest 14-day notification rate of newly reported COVID-19 cases per 100 000 population, as of January 18th. Our data, indicating that the highly transmissible VOC 202012/01 (B.1.1.7 lineage) is widely spread and progressively increasing in frequency, reinforce the need to implement robust public health measures in Portugal to mitigate the impact of COVID-19 disease in terms of hospitalizations and deaths.
Acknowledgments
This study is partially co-funded by Fundação para a Ciência e Tecnologia and Agência de Investigação Clínica e Inovação Biomédica (234_596874175) on behalf of the Research 4 COVID-19 call. Some infrastructural resources used in this study come from GenomePT project (POCI-01-0145-FEDER-022184), supported by COMPETE 2020 - Operational Programme for Competitiveness and Internationalisation (POCI), Lisboa Portugal Regional Operational Programme (Lisboa2020), Algarve Portugal Regional Operational Programme (CRESC Algarve2020), under the PORTUGAL 2020 Partnership Agreement, through the European Regional Development Fund (ERDF), and by Fundação para a Ciência e a Tecnologia (FCT).
References
- Public Health England (PHE). (2020). Investigation of novel SARS-COV-2 variant: Variant of Concern 202012/01: Technical briefing document on novel SARS-CoV-2 variant. Available from: Investigation of SARS-CoV-2 variants of concern: technical briefings - GOV.UK
- European Centre for Disease Prevention and Control. Risk related to spread of new SARS-CoV-2 variants of concern in the EU/EEA – 29 December 2020. ECDC: Stockholm; 2020. https://www.ecdc.europa.eu/sites/default/files/documents/COVID-19-risk-related-to-spread-of-new-SARS-CoV-2-variants-EU-EEA.pdf
- Volz E, Mishra S, Chand M, Barrett JC, Johnson R, Geidelberg L, et al. (2021). Transmission of SARS-CoV-2 Lineage B.1.1.7 in England: Insights from linking epidemiological and genetic data. medRxiv. doi: Transmission of SARS-CoV-2 Lineage B.1.1.7 in England: Insights from linking epidemiological and genetic data
- Galloway SE, Paul P, MacCannell DR, Johansson MA, Brooks JT, MacNeil A, et al. (2021). Emergence of SARS-CoV-2 B.1.1.7 Lineage — United States, December 29, 2020–January 12, 2021. MMWR. Morbidity and Mortality Weekly Report, 70(3), 1–5. Emergence of SARS-CoV-2 B.1.1.7 Lineage — United States, December 29, 2020–January 12, 2021 | MMWR
- Kemp SA, Datir RP, Collier DA, Ferreira IATM, Carabelli A, et al. (2020). Recurrent emergence and transmission of a SARS-CoV-2 Spike deletion ΔH69/V70. bioRxiv. doi: https://doi.org/10.1101/2020.12.14.422555
- Kidd M, Richter A, Best A, Mirza J, Percival B, Mayhew M, et al. (2020). S-variant SARS-CoV-2 is associated with significantly higher viral loads in samples tested by ThermoFisher TaqPath RT-QPCR. medRxiv. doi: https://doi.org/10.1101/2020.11.10.20228528
- Gravagnuolo AM, Faqih L, Cronshaw C, Wynn J, Burglin , Klapper P, Wigglesworth M. (2021). Epidemiological Investigation of New SARS-CoV-2 Variant of Concern 202012/01 in England. medRxiv. doi: Epidemiological Investigation of New SARS-CoV-2 Variant of Concern 202012/01 in England | medRxiv
- Public Health England (PHE). (2020). Investigation of novel SARS-COV-2 variant. Variant of Concern 202012/01: Technical briefing 2. Available from: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/949639/Technical_Briefing_VOC202012-2_Briefing_2_FINAL.pdf
- Bal A, Destras G, Gaymard A, Regue H, Semanas Q, d’Aubarede C, et al. (2020). Two-step strategy for the identification of SARS-CoV-2 variants co-occurring with spike deletion H69-V70, Lyon, France, August to December 2020. medRxiv. doi: https://doi.org/10.1101/2020.11.10.20228528
- O’Toole Á, Hill V, Pybus OG, Watts A, Bogoch II, Khan K, et al. (2021). Tracking the international spread of SARS-CoV-2 lineages B.1.1.7 and B.1.351/501Y-V2. Virological.org. Available from: https://pando.tools/t/tracking-the-international-spread-of-sars-cov-2-lineages-b-1-1-7-and-b-1-351-501y-v2/592/1
- Public Health England (PHE). (2021). Investigation of novel SARS-COV-2 variant. Variant of Concern 202012/01: Technical briefing 3. Available from: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/950823/Variant_of_Concern_VOC_202012_01_Technical_Briefing_3_-_England.pdf
- Walker AS, Vihta KD, Gethings O, Pritchard E, Jones J, House T, et al. (2020). Increased infections, but not viral burden, with a new SARS-CoV-2 variant. medRxiv. doi: Increased infections, but not viral burden, with a new SARS-CoV-2 variant