Novel sublineage within B.1.1.1 currently expanding in Peru and Chile, with a convergent deletion in the ORF1a gene (Δ3675-3677) and a novel deletion in the Spike gene (Δ246-252, G75V, T76I, L452Q, F490S, T859N)

C.37: Novel lineage expanding in Peru and Chile, with a convergent deletion in the ORF1a gene (Δ3675-3677) and a novel deletion in the Spike gene (Δ246-252, G75V, T76I, L452Q, F490S, T859N)

Pedro E. Romero1, Alejandra Dávila-Barclay1, Luis Gonzáles1, Guillermo Salvatierra1, Diego Cuicapuza1, Luis Solis1, Pool Marcos1, Janet Huancachoque1, Dennis Carhuaricra2, Raúl Rosadio2, Luis Luna2, Lenin Maturrano2, Pablo Tsukayama1,3,4*

  1. Facultad de Ciencias y Filosofía, Universidad Peruana Cayetano Heredia. Lima, Perú.
  2. Facultad de Medicina Veterinaria, Universidad Nacional Mayor de San Marcos. Lima, Perú.
  3. Instituto de Medicina Tropical Alexander von Humboldt. Lima, Perú.
  4. Wellcome Sanger Institute. Hinxton, UK.

*Correspondence: pablo.tsukayama@upch.pe

Summary

We report a novel sublineage of B.1.1.1 (now C.37) that presents a deletion in the ORF1a gene (Δ3675-3677), also present in variants of concern (VOCs) B.1.1.7, B.1.351, and P.1. It displays a novel deletion and multiple nonsynonymous mutations in the Spike gene (Δ246-252, G75V, T76I, L452Q, F490S, T859N). Initially reported in Lima, Peru, in late December, it now accounts for 50 of 105 (47.6%) genomes in Lima between January 1 and March 18. Further RT-qPCR screening for VOCs suggests that it is widespread in other regions of Peru. It is also expanding in Chile and has been reported in Argentina, Australia, Brazil, Ecuador, Germany, Spain, the UK, and the US.

Background

The recent emergence of SARS-CoV-2 VOCs with potentially higher transmissibility and reduced sensitivity to antibody neutralization poses new challenges for the control of COVID-19 (Wang et al., 2021). Latin America has seen a steep increase in COVID-19 cases in 2021, accounting for almost 40% of daily global deaths in March and April (FT Visual & Data Journalism team, 2021). Peru has been severely hit by the pandemic: as of April 2021, it has had the highest number of excess deaths globally relative to its population (approx. 160 300 out of 32.5 million: 0.5%) (FT Visual & Data Journalism team, 2021; OpenCovid-Peru, 2021).

Currently, 1211 genome sequences from Peru are available on GISAID, comprising 58 PANGO lineages circulating since March 2020 (see our Peru Nextstrain build). VOCs B.1.1.7 (n=3) and P.1 (n=25) were first reported in Peru in January 2021. Since then, due to limited resources to scale genomic sequencing efforts, the Peruvian National Institute of Health (INS) implemented an RT-qPCR protocol to screen for VOCs based on deletions ORF1a:Δ3675-3677 and S:Δ69-70 (Vogels et al., 2021). Testing of 579 SARS-CoV-2-positive samples from 10 out of 25 regions in Peru showed that 39.7% of samples from Lima and 33.7% across Peru had a result consistent with P.1 or B.1.351 (ORF1a gene target failure, S+, N+; Table 1) (INS, 2021a). This led to the assumption that P.1 is expanding rapidly in Peru due to multiple unchecked introductions via our extensive land border with Brazil. In addition, it has been proposed that the high prevalence of P.1 in Peru is associated with the sharp increase in case numbers and deaths since January (INS, 2021b).


Table 1. Potential detections of VOCs from the Peruvian Institute of Health (INS), based on the protocol by Vogels et al. (2021).

The authors of this RT-qPCR protocol indicate that S gene target failure due to S:Δ69-70 is not sufficient evidence to identify B.1.1.7, because this deletion is also present in non-VOC lineages such as B.1.358 and B.1.375. Likewise, other lineages such as B.1.525 and B.1.526 carry the ORF1a:Δ3675-3677 deletion. Thus, qPCR-based identification of VOCs is only presumptive and should be confirmed by genome sequencing, particularly in countries with limited knowledge of locally circulating lineages.
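The interpretation logic described above can be sketched as a small lookup. This is only an illustration of why the screen is presumptive: the function name, argument names, and result labels are hypothetical and are not taken from the Vogels et al. protocol or from INS reports.

```python
def presumptive_call(n_detected: bool, orf1a_detected: bool, s_detected: bool) -> str:
    """Map a three-target RT-qPCR pattern to a presumptive interpretation.

    Hypothetical helper: target "failure" (not detected) is read as the
    presence of the corresponding deletion (ORF1a:del3675-3677 or S:del69-70).
    """
    if not n_detected:
        # Without the control target there is nothing to interpret.
        return "not interpretable"
    if not orf1a_detected and not s_detected:
        return "consistent with B.1.1.7"
    if not orf1a_detected:
        # Any lineage carrying ORF1a:del3675-3677 but not S:del69-70
        # gives this pattern, not only the two VOCs.
        return "consistent with P.1 or B.1.351"
    if not s_detected:
        return "S:del69-70 lineage other than B.1.1.7"
    return "no screened deletion detected"


# ORF1a target failure, S+, N+ (the pattern reported in Table 1)
print(presumptive_call(n_detected=True, orf1a_detected=False, s_detected=True))
```

A lineage that carries ORF1a:Δ3675-3677 without S:Δ69-70 falls into the same branch as P.1 and B.1.351, which is why sequencing is needed to tell them apart.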

Routine genome sequencing of samples between January and March 2021 revealed a deep-branching clade within B.1.1.1 (Figure 1) that shared a novel deletion (Δ246-252) and multiple nonsynonymous mutations in the Spike gene. It shares the ORF1a Δ3675-3677 deletion with VOCs, which may produce the same result as P.1 and B.1.351 in the RT-qPCR screening test. It was first reported in Lima in late December and now accounts for 50 of 105 genomes (47.6%) from Lima between January 1 and March 18. In contrast, only three P.1 genomes (2.9%) were identified in Lima over the same period (see our Peru Nextstrain build).

C.37 appears to be expanding in Chile since January and has also been reported in Ecuador, Argentina, Brazil, Germany, Spain, the US, and the UK. We discuss the implications of our findings for public health policies in Peru and the need for improved genomic surveillance programs in low- and middle-income countries.


Figure 1. Identification of a deep-branching B.1.1.1 sublineage in Peru. Source: Nextstrain.

The Study

We used the Nextstrain global build to identify lineages with the ORF1a:Δ3675-3677 deletion. One such lineage was a branch of B.1.1.1 that included samples from Chile and Peru (Figure 2). We then used the Nextstrain south-america build to identify additional samples from the region and a set of mutations shared by this sublineage (Figure 3).

We downloaded 357 genomes available on GISAID (April 21, 2021) filtered by substitutions Spike_L452Q, Spike_F490S, Spike_T859N. We used the Nextstrain pipeline to align the genomes and obtain a calibrated tree (Figure 4). The pipeline discarded one sequence from Ecuador.
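The substitution filter described above can be approximated offline. The sketch below assumes each metadata record carries a substitutions string formatted like `(Spike_L452Q,Spike_F490S,...)`; that format, the record layout, and the helper names are assumptions for illustration, not a description of the GISAID interface.

```python
# Substitutions used as the search filter in the text.
REQUIRED = {"Spike_L452Q", "Spike_F490S", "Spike_T859N"}


def parse_substitutions(field: str) -> set:
    """Split a '(A,B,C)'-style string into a set of substitution names."""
    return {item.strip() for item in field.strip("() ").split(",") if item.strip()}


def has_required(field: str, required=REQUIRED) -> bool:
    """True when every required substitution is present in the record."""
    return required <= parse_substitutions(field)


# Hypothetical records, for illustration only.
records = [
    {"name": "sample_A", "aa_substitutions": "(Spike_G75V,Spike_L452Q,Spike_F490S,Spike_T859N)"},
    {"name": "sample_B", "aa_substitutions": "(Spike_D614G,Spike_L452R)"},
]
selected = [r["name"] for r in records if has_required(r["aa_substitutions"])]
print(selected)
```

Requiring all three substitutions at once keeps the filter specific; as the later paragraphs note, L452R and T859N also occur individually in other lineages.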


Figure 2. Lineages that present the ORF1a:Δ3675-3677. C.37 is highlighted in the red box. Source: Nextstrain.


Figure 3. Defining mutations in C.37. Samples share five nonsynonymous mutations in the Spike gene (G75V, T76I, L452Q, F490S, T859N) and the Δ246-252 deletion. Source: Nextstrain.


Figure 4. Calibrated tree of the 356 genomes available on GISAID (April 21, 2021). Source: C.37 Nextstrain build.

The geographical distribution of C.37 was as follows, North America: US (n=113); South America: Argentina (n=1), Brazil (n=1), Chile (n=160), Ecuador (n=1, discarded), Peru (n=51); Europe: Germany (n=22), Spain (n=4), United Kingdom (n=3); Oceania: Australia (n=1). One sample was classified as lineage C.4 (Chile), three as C.8 (Germany and Peru), and one as C.30 (US).
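As a quick consistency check, the per-country counts listed above can be tallied against the 357 downloaded genomes and the 356 retained in the tree. The dictionary simply transcribes the numbers from the text; the variable names are illustrative.

```python
# Per-country genome counts as listed in the text.
counts = {
    "North America": {"US": 113},
    "South America": {"Argentina": 1, "Brazil": 1, "Chile": 160, "Ecuador": 1, "Peru": 51},
    "Europe": {"Germany": 22, "Spain": 4, "United Kingdom": 3},
    "Oceania": {"Australia": 1},
}

per_continent = {continent: sum(by_country.values()) for continent, by_country in counts.items()}
total_downloaded = sum(per_continent.values())
# The pipeline discarded the single Ecuador sequence.
total_in_tree = total_downloaded - counts["South America"]["Ecuador"]

print(per_continent)
print(total_downloaded, total_in_tree)
```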

Table 2 summarizes the mutations and deletions shared by members of this novel lineage. Most samples (94%) have the ORF1a deletion (Δ3675-3677), except for a subset from Brazil (n=1), Peru (n=21), and the US (n=1).

Gene Amino acid
N P13L, R203K, G204R, G214C
ORF1a T1246I, P2287S, F2387V, L3201P, T3255I, G3278S, S3675-, G3676-, F3677-, K3678R
ORF1b P314L
ORF9b P10S
S G75V, T76I, R246-, S247-, Y248-, L249-, T250-, P251-, G252-, L452Q, F490S, D614G, T859N

Table 2. Nonsynonymous mutations in C.37 compared to the reference genome NC_045512.2. Deletions in the ORF1a and Spike genes are shown in bold.
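The Spike row of Table 2 can be used as a checklist when inspecting an individual genome. The following is a minimal sketch, assuming a sample's amino acid changes are already available as a list of strings in the same notation as the table; the function name and the example sample are hypothetical.

```python
# Spike changes transcribed from Table 2, split by type.
C37_SPIKE = {
    "substitutions": ["G75V", "T76I", "L452Q", "F490S", "D614G", "T859N"],
    "deletions": ["R246-", "S247-", "Y248-", "L249-", "T250-", "P251-", "G252-"],
}


def missing_defining_changes(sample_changes):
    """Return the Table 2 Spike changes that a sample does not carry."""
    observed = set(sample_changes)
    return {
        kind: [change for change in changes if change not in observed]
        for kind, changes in C37_SPIKE.items()
    }


# Hypothetical sample with every substitution but without the 246-252 deletion.
sample = ["G75V", "T76I", "L452Q", "F490S", "D614G", "T859N"]
report = missing_defining_changes(sample)
print(report)
```

A report like this makes it easy to flag genomes that match the substitution profile but lack the deletion, which would then call for a look at the raw reads.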

Members of this lineage also share a distinct seven-amino-acid deletion in the Spike gene (S:Δ246-252). However, we found 23 samples that do not present this deletion; these were the same samples that lacked the ORF1a deletion (Δ3675-3677), and 21 of them were submitted by one INS laboratory in Peru. We have requested access to the raw reads to confirm whether the apparent absence of both deletions in these samples is due to an assembly error. We observed an additional deletion in the Spike gene (Δ60-72) in samples from Chile (n=22), Peru (n=4), and the US (n=2), and a similar one (Δ63-76) in one sample from Spain.

We observed a novel combination of nonsynonymous mutations in Spike: G75V, T76I, L452Q, F490S, T859N. Mutations L452Q and F490S both map to the Spike protein’s receptor-binding domain (RBD). While L452Q is almost exclusive to C.37, L452R has arisen independently in VOIs B.1.427/B.1.429 (spreading in California) and B.1.617 (spreading in India) and is associated with reduced antibody neutralization (Li et al., 2020; Liu et al., 2021). In vitro studies show that F490S is also associated with reduced susceptibility to antibody neutralization (Koenig et al., 2021; Liu et al., 2021; Weisblum et al., 2020). T859N is present in the B.1.526 variant of interest (VOI) spreading in New York City (Zhou et al., 2021).

Chile and Peru differ significantly in their vaccination efforts: as of April 20, 40.7% of adults in Chile have received at least one vaccine dose, compared to 2.3% in Peru. Yet, both countries have experienced a sharp resurgence of cases since January (Figure 5). Media articles have speculated that the second wave of COVID-19 in these countries is associated with the increased prevalence of P.1 (Chambers, 2021). Based on our preliminary findings, it is plausible that the COVID-19 epidemics in Peru and Chile are instead driven by the increasing prevalence of C.37.


Figure 5. Daily new confirmed COVID-19 cases per million people in Chile and Peru. Source: ourworldindata.org

The ORF1a deletion (Δ3675-3677) in this lineage could lead to misidentification when using the Vogels et al. assay, resulting in samples being incorrectly assigned to P.1. Reports by the INS in Peru based on this assay suggest that the prevalence of P.1 is 39.7% in Lima and 33.7% across Peru. In light of our results, it is possible that a significant fraction of these samples corresponds to C.37 instead of P.1.

As far as we know, the S:Δ246-252 deletion has not been reported previously. The Spike deletions Δ137-149 and Δ241-249 are thought to have emerged through extended infections in immunosuppressed hosts and might be associated with increased resistance to neutralizing antibodies (McCarthy et al., 2021). The B.1.351 variant harbors the Δ242-244 deletion and is associated with decreased neutralizing antibody activity (Wang et al., 2021). The S:Δ60-72 deletion found in a subset of samples within this lineage warrants further study.

Conclusions

Since January, Latin America has become the new epicenter of the COVID-19 pandemic, with several countries experiencing a rapid rise in cases and deaths. It is therefore likely that novel variants beyond P.1 will emerge and spread quickly in the region. A novel sublineage within B.1.1.1 has emerged since late 2020, is rapidly expanding in Peru and Chile, and is now present in multiple countries in the Americas, Europe, and Oceania. It is defined by ORF1a:Δ3675-3677 (shared with VOCs B.1.1.7, B.1.351, and P.1), S:Δ246-252, and S:G75V, T76I, L452Q, F490S, T859N. The novel S:Δ246-252 deletion should be taken into consideration for validation of diagnostic tests and vaccine design efforts. Additional deletions in the Spike protein should also be assessed to understand their effects on viral fitness and host interaction. Our findings emphasize the need to strengthen genomic surveillance programs in Latin America and to balance tracking the importation of VOCs with monitoring locally circulating variants.

References

Chambers, Jane. “Chile sees Covid surge despite vaccination success.” BBC News (2021). Accessed April 17 2021.

FT Visual & Data Journalism team. “Coronavirus tracker: the latest figures as countries fight the Covid-19 resurgence.” Financial Times (2021). Accessed April 22 2021.

Instituto Nacional de Salud (Peruvian National Institute of Health) “INS: variante brasilera tiene una amplia circulación en varios distritos de Lima” (2021a) https://web.ins.gob.pe/es/prensa/noticia/ins-variante-brasilera-tiene-una-amplia-circulacion-en-varios-distritos-de-lima Accessed April 16 2021.

Instituto Nacional de Salud (Peruvian National Institute of Health) “INS detectó la presencia de la variante brasileña del coronavirus en Loreto, Huánuco y Lima” (2021b) https://web.ins.gob.pe/es/prensa/noticia/ins-detecto-la-presencia-de-la-variante-brasilena-del-coronavirus-en-loreto-huanuco Accessed April 21 2021.

Koenig, Paul-Albert, et al. “Structure-guided multivalent nanobodies block SARS-CoV-2 infection and suppress mutational escape.” Science 371.6530 (2021).

Li, Qianqian, et al. “The impact of mutations in SARS-CoV-2 spike on viral infectivity and antigenicity.” Cell 182.5 (2020): 1284-1294.

Liu, Zhuoming, et al. “Identification of SARS-CoV-2 spike mutations that attenuate monoclonal and serum antibody neutralization.” Cell Host & Microbe 29.3 (2021): 477-488.

McCarthy, Kevin R., et al. “Recurrent deletions in the SARS-CoV-2 spike glycoprotein drive antibody escape.” Science 371.6534 (2021): 1139-1142.

OpenCovid-Peru. https://opencovid-peru.com (2021) Accessed April 22 2021.

Vogels, Chantal BF, et al. “PCR assay to enhance global surveillance for SARS-CoV-2 variants of concern.” medRxiv (2021).

Wang, Pengfei, et al. “Antibody resistance of SARS-CoV-2 variants B.1.351 and B.1.1.7.” Nature (2021): 1-6.

Weisblum, Yiska, et al. “Escape from neutralizing antibodies by SARS-CoV-2 spike protein variants.” Elife 9 (2020): e61312.

Zhou, Hao, et al. “B.1.526 SARS-CoV-2 variants identified in New York City are neutralized by vaccine-elicited and therapeutic monoclonal antibodies.” bioRxiv (2021).

Data Availability

Peru’s Nextstrain custom build. / C.37 Nextstrain custom build.

Acknowledgements

The Universidad Peruana Cayetano Heredia’s Institutional Review Board approved the project in June 2020. We are supported by Fondo Nacional de Ciencia y Tecnología (FONDECYT) grants #046-2020 and #022-2021. We thank our collaborators at Laboratorio de Virus Respiratorios at INS (Maribel Huaringa, Priscilla Lope, Nancy Rojas) and Instituto de Medicina Tropical Alexander von Humboldt (Giovanni Lopez, David Durand, Theresa Ochoa) for providing clinical samples for sequencing.

We acknowledge the authors from the originating laboratories responsible for obtaining the specimens and the submitting laboratories where genetic sequence data were generated and shared via the GISAID Initiative, on which this research is based. The complete information of the originating lab teams can be found here.


On April 23, the novel B.1.1.1 sublineage was designated as PANGO lineage C.37 (see this thread on GitHub). Within a couple of days, the new lineage name should be updated in GISAID, Nextstrain, and PANGO lineages.

To date, 377 genomes in GISAID will be assigned to lineage C.37. The current Nextstrain C.37 build can be found here.


I also prepared a Nextstrain build using 1896 genomes from Chile. 161 out of 698 genomes correspond to the new C.37 lineage.