Genomic Surveillance of SARS-CoV-2 in the State of Rio de Janeiro, Brazil: technical briefing

Luiz G P de Almeida¹, Alessandra P Lamarca¹, Ronaldo da Silva F Jr¹, Liliane Cavalcante¹, Alexandra L Gerber¹, Ana Paula de C Guimarães¹, Douglas Terra Machado¹, Cassia Alves², Diana Mariani², Thais Felix Cruz², Mario Sergio Ribeiro³, Silvia Carvalho³, Flávio Dias da Silva⁴, Marcio Henrique de Oliveira Garcia⁴, Leandro Magalhães de Souza⁵, Cristiane Gomes Da Silva⁵, Caio Luiz Pereira Ribeiro⁴, Andréa Cony Cavalcanti⁵, Claudia Maria Braga de Mello³, Amilcar Tanuri²ª, Ana Tereza R Vasconcelos¹ª*

ªBoth authors coordinated this work
*Corresponding author : atrv@lncc.br
1 Laboratório de Bioinformática, Laboratório Nacional de Computação Científica, Petrópolis, Brazil.
2 Departamento de Genética, Instituto de Biologia, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil.
3 Secretaria Estadual de Saúde do Rio de Janeiro, Rio de Janeiro, Brazil.
4 Secretaria Municipal de Saúde Rio de Janeiro, Rio de Janeiro, Brazil.
5 Laboratório Central de Saúde Pública Noel Nutels, Rio de Janeiro, Brazil.

The Genomic Surveillance Program of the state of Rio de Janeiro is part of a task force composed of the Secretary of Health of the state of Rio de Janeiro, the Secretary of Health of the city of Rio de Janeiro, the Noel Nutels Central Laboratory (LACEN-RJ) and the Corona-Ômica-RJ Network, through a partnership with the Bioinformatics Laboratory of the National Laboratory for Scientific Computing (LNCC) and the Laboratory of Molecular Virology (LVM) at the Federal University of Rio de Janeiro (UFRJ).
This project is funded by the Research Support Foundation of the State of Rio de Janeiro (FAPERJ). The Program integrates data on viral genetic alterations, transmission and the course of infection among cities in Rio de Janeiro. This report aims to: (i) present the epidemiological situation of the SARS-CoV-2 genomic surveillance at the state level, and (ii) describe the emergence of possible new mutations and viral variants.

Genomic Epidemiology
The state of Rio de Janeiro, Brazil, contains the third largest city of South America, which extends into a metropolitan area containing multiple industrial cities. Hence, not only do workers commonly commute daily between neighbouring cities, but Rio de Janeiro also receives a high volume of interstate corporate travel. The large population, coupled with these daily and seasonal travel dynamics, favours the introduction and spread of new SARS-CoV-2 lineages in the state.
We analysed, in both spatio-temporal and phylogenetic contexts, 90 SARS-CoV-2 samples collected between March 24, 2021 and March 28, 2021 from 17 municipalities of the state of Rio de Janeiro. Previous sequencing of SARS-CoV-2 samples from the state available in GISAID [1–3] indicated that the B.1.1.33 lineage had a high prevalence from April to October 2020. The P.2 lineage was first detected in September 2020 and by October had already replaced B.1.1.33 as the prevalent lineage in the state. P.2 gradually increased in frequency until February 2021, when it reached 64% of the samples sequenced (Figure 1). By January 2021 there was a rapid replacement of P.2 by the P.1 lineage, the latter being present in 94% of the samples sequenced for this report (March 24, 2021 to March 28, 2021; Figure 1 and Figure 2A). Currently, P.1 is the dominant lineage and is widespread across all regions of the state (Figure 2B-C).

Figure 1. Dispersion of SARS-CoV-2 lineages from April 2020 to March 2021 in the state of Rio de Janeiro.

Figure 2. Frequency and spatial analysis of SARS-CoV-2 lineages found in samples from March 2021. A) Donut plot showing the total proportion of lineages identified in the 90 samples sequenced in March 2021. B) Proportion of lineages in each macro-region of the state of Rio de Janeiro. C) Proportion of lineages across the 17 cities investigated in this study.

Local transmission of a previously undefined cluster of the P.1 lineage was identified in the municipality of Conceição de Macabu, in the North region of the state. This cluster contains up to nine mutations in addition to the P.1 lineage-defining ones: four on ORF1ab (synC1150T, synC1912T, D762G, T1820I), two on ORF3a (D155Y, S180F), one on the M protein (synC26954T), one on the N protein (synC28789T) and one on the S protein (A262S) (Figure 3, black dots in tree). We observed that these sequences formed a monophyletic clade with samples from the states of Rio Grande do Sul and São Paulo in Brazil, and from Spain, the Netherlands, Australia and the United States. This clade consistently bears five of these mutations (synC1912T, D762G, T1820I, D155Y, synC28789T). The D155Y mutation has been proposed to lower the viral affinity for the host's caveolin-1 protein, avoiding cell apoptosis and extending the asymptomatic phase of the infection [4]. Therefore, we suggest that this clade be considered a new Variant of Interest (VOI), with the proposed lineage name P.1.2. The origin of this lineage was estimated at around late January (Figure 4).

Figure 3. Genotype table (center) and Maximum Likelihood tree (left) of the new proposed lineage P.1.2 (highlighted in blue). Positions of mutations are indicated by column names and base substitutions are indicated by a color change from gray. Bootstrap values are indicated on the tree.

Figure 4. Time tree of the newly sequenced samples from Rio de Janeiro, Brazil, indicated by the points colored according to locality.

In addition to the P.1 and P.2 lineages, we also found the B.1.1.7 lineage in three samples from the northwestern and southern regions of the state. These genomes harbored the E484K and N501Y mutations in the Spike protein, which are associated with immune escape [5,6]. The B.1.1.7 lineage was first identified in the state in a sample from January 1, 2021 and its frequency has remained low (total n=8).

Conclusion
The rapid replacement of P.2 by the P.1 lineage, identified through the sequencing of samples, reinforces the importance of continuous genomic surveillance in the state of Rio de Janeiro to monitor and prevent the dispersion of strains. The adoption of new restrictive measures by the state and municipalities must consider the spatio-temporal distribution of the strains and the presence of mutations associated with escape from the adaptive immune response.

Methods
The 90 analyzed genomes were obtained from samples collected between March 24, 2021 and March 28, 2021 from 17 municipalities in the state of Rio de Janeiro, Brazil. Samples from patients with a SARS-CoV-2 positive nasopharyngeal RT-PCR were collected at the Noel Nutels Central Laboratory (LACEN-RJ). The study was approved by the Ethics Committee (30161620.0.1001.5257 and 34025020.0.0000.5257). Patients were aged between 11 and 78 years; 50% were men and 50% were women. Extraction of the genetic material was performed at the Molecular Virology Laboratory (LVM-UFRJ) with QIAamp or MagMAX Viral/Pathogen Nucleic Acid Isolation kits and the KingFisher automatic platform. Annealing of cDNA was conducted with 8.5 µl of viral RNA extracted from each sample. Libraries were constructed at the DFA/LNCC Genomics Unit with the Illumina COVIDSeq Test (Illumina), according to the manufacturer's protocol. Purification was then conducted on a pool combining 5 µl of each library, and the TapeStation (Agilent) system was used for quality control. We employed the MiSeq Reagent Kit v3 (600-cycle) to generate reads of 2x150 bp with a MiSeq sequencer. Sequence analysis, consensus building and variant calling were performed with DRAGEN COVID Lineage v3.5.1. Pango lineages were attributed to the newly assembled genomes using the PangoLEARN database v2021-04-14. The genomes were then aligned with MAFFT 7.475 [7] to P.1 sequences and to sequences from Rio de Janeiro state regardless of lineage, all available in the GISAID database. The resulting alignment was used to construct the maximum likelihood tree with IQ-TREE 2 [8], with the substitution model selected by the built-in algorithm ModelFinder [9]. Bootstrap support was calculated with 10,000 tree replicates. We also inferred the divergence dates of the new sequences using a Bayesian approach with BEAST 1.10.4 [10]. Divergence dates were estimated by tip-dating, employing the strict clock model, the coalescent exponential growth tree prior and the GTR substitution model. Bioinformatic and phylogenetic analyses were performed using the computational infrastructure of the Bioinformatics Laboratory (LABINFO-LNCC).
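The lineage assignment, alignment and tree inference steps described above can be sketched as command lines assembled in Python. This is only an illustration under stated assumptions: the post does not report the exact options used, the file names and the build_commands helper are hypothetical, and the use of ultrafast bootstrap (-B) for the 10,000 replicates is an assumption rather than something the Methods confirm.

```python
from pathlib import Path


def build_commands(genomes_fasta, workdir="analysis"):
    """Return illustrative command lines (as argument lists) for three steps.

    All file names are hypothetical; the options shown are standard ones for
    each tool, not necessarily those used by the authors.
    """
    out = Path(workdir)
    aligned = (out / "aligned.fasta").as_posix()
    return {
        # pangolin assigns a Pango lineage to each consensus genome
        "pangolin": ["pangolin", genomes_fasta, "--outdir", out.as_posix()],
        # MAFFT prints the alignment to stdout, so the caller redirects it
        # into the (hypothetical) aligned.fasta file
        "mafft": ["mafft", "--auto", genomes_fasta],
        # -m MFP runs ModelFinder and then the tree search with the best model;
        # -B 10000 requests ultrafast bootstrap replicates (an assumption here)
        "iqtree2": ["iqtree2", "-s", aligned, "-m", "MFP", "-B", "10000"],
    }


commands = build_commands("rj_genomes.fasta")
for step, argv in commands.items():
    print(step, "->", " ".join(argv))
```

Each list can be passed to subprocess.run once the tools are installed; for MAFFT the standard output would need to be redirected to the alignment file that IQ-TREE 2 then reads.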

Acknowledgments
We would like to thank all the authors and administrators of the GISAID database, which allowed this study of genomic epidemiology to be conducted properly. A full list acknowledging the authors publishing data used in this study can be found in the following file: gisaid_hcov-19_acknowledgement_table_2021_04_21_Post_Virological_02.pdf (226.3 KB)
This work was developed in the framework of Corona-ômica-RJ (FAPERJ E-26/210.179/2020). A.T.R.V. is supported by CNPq (303170/2017-4) and FAPERJ (E-26/202.903/20); A.T. is supported by FAPERJ (E-26/010.002434/2019 and E-26/210.178/2020); R.S.F.J. is a recipient of a graduate fellowship from CNPq; and A.P.L. is granted a post-doctoral scholarship (DTI-A) from CNPq.
We acknowledge the support from the Rede Corona-ômica BR MCTI/FINEP affiliated to RedeVírus/MCTI (FINEP 01.20.0029.000462/20, CNPq 404096/2020-4).

References

  1. Candido, D. S. et al. Evolution and epidemic spread of SARS-CoV-2 in Brazil. Science 369, 1255–1260 (2020).
  2. Voloch, C. M. et al. Genomic characterization of a novel SARS-CoV-2 lineage from Rio de Janeiro, Brazil. J. Virol. (2021) doi:10.1128/JVI.00119-21
  3. Lamarca, A. P. et al. Genomic surveillance of SARS-CoV-2 tracks early interstate transmission of P.1 lineage and diversification within P.2 clade in Brazil. bioRxiv (2021) doi:10.1101/2021.03.21.21253418
  4. Gupta, S. et al. D155Y Substitution of SARS-CoV-2 ORF3a Weakens Binding with Caveolin-1: An in silico Study. bioRxiv (2021) doi:10.1101/2021.03.26.437194
  5. Collier, D. A. et al. Sensitivity of SARS-CoV-2 B.1.1.7 to mRNA vaccine-elicited antibodies. Nature (2021) doi:10.1038/s41586-021-03412-7
  6. Singh, J. et al. Structure-Function Analyses of New SARS-CoV-2 Variants B.1.1.7, B.1.351 and B.1.1.28.1: Clinical, Diagnostic, Therapeutic and Public Health Implications. Viruses 13, (2021).
  7. Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).
  8. Minh, B. Q. et al. IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era. Mol. Biol. Evol. 37, 1530–1534 (2020).
  9. Kalyaanamoorthy, S., Minh, B. Q., Wong, T. K. F., von Haeseler, A. & Jermiin, L. S. ModelFinder: fast model selection for accurate phylogenetic estimates. Nat. Methods 14, 587–589 (2017).
  10. Drummond, A. J. & Rambaut, A. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol. Biol. 7, 214 (2007).

The proposed lineage P.1.2 was designated in Pango v1.1.21 (pango-designation release v1.1.21, cov-lineages/pango-designation on GitHub).

[Updated information - Genomic surveillance of the state of Rio de Janeiro, Brazil, in the period between March 24th, 2021 and April 16th, 2021]

We are happy to announce the sequencing, assembly and analysis of 376 new SARS-CoV-2 genomes from 57 cities in the state of Rio de Janeiro. We have expanded the sampling temporally and geographically: 127 novel sequences from March 2021 and 249 from April 2021 were included and are now available in GISAID. The methods followed are described in the original post.

The cluster previously identified in the city of Conceição de Macabu (northern region of Rio de Janeiro) has now been accepted by PANGO (v1.1.21) as a novel lineage named P.1.2 (Figure 1).

Figure 1. Evolutionary tree of P.1.2 genomes available in GISAID database. Sequences from the state of Rio de Janeiro are colored according to the city of origin and new samples are indicated by black circles at the tip.

We have detected an increase in the frequency of the new lineage P.1.2, from 4.95% in March 2021 to 8.43% in April 2021 (Figure 2). This is possibly due to more intensive sampling in cities from the northern region. We also detected this novel lineage in other regions, indicating an initial dissemination across the state of Rio de Janeiro. P.1 nevertheless remains the dominant lineage in the state (91.49%). The other two lineages detected were B.1.1.7 (2.13%) and P.2 (0.53%), both with decreasing frequencies when compared to the previous report.

Figure 2. Relative frequencies of SARS-CoV-2 lineages in the state of Rio de Janeiro between March, 2020 and April, 2021.

We also found a single P.1 genome sequence in the city of Vassouras with two deletions, in ORF7a and ORF7b, of 157 and 75 nucleotides, respectively (Figure 3).

Figure 3. Genomic representation of the two deletions found in the P.1 lineage of a sample from the city of Vassouras. Deletion 01 comprises part of ORF7a, and Deletion 02 covers the terminal region of ORF7a and the beginning of ORF7b. The virus genome is represented in gray and the region (enlarged) containing ORF7a, ORF7b and the deletions is represented in red.

We would like to thank all the authors and administrators of the GISAID database, which allowed this study of genomic epidemiology to be conducted properly. A full list acknowledging the authors publishing the data used in this update can be found in the following file:
gisaid_hcov-19_acknowledgement_table_2021_05_04_19.pdf (19.3 KB)

[Update information - Genomic surveillance of the state of Rio de Janeiro, Brazil, in the period between March 24th, 2021 and April 26th, 2021]

On May 14th, 2021, we performed the sequencing, assembly and analysis of 362 new SARS-CoV-2 genomes from 66 cities in the state of Rio de Janeiro and five samples that originated in other states of Brazil but were collected in Rio de Janeiro. So far, the Genomic Surveillance Program of the state of Rio de Janeiro has sequenced and reported 828 new genomes.

The samples were collected between March 24th and April 26th, 2021. The methods followed are described in the original post. The overall frequency of the circulating lineages was similar to the previous report, with most of the samples assigned to P.1 (93.09%), followed by B.1.1.7 (2.76%), P.1.2 (2.76%), B.1.1.28 (0.83%) and P.2 (0.55%).
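As a minimal sketch of how relative lineage frequencies such as those above can be derived from per-genome lineage assignments, the Python snippet below counts assignments and converts them to percentages. The lineage_frequencies helper is hypothetical, and the counts in the example are illustrative values back-calculated so that they reproduce the reported percentages for 362 genomes; they are not taken from the source data.

```python
from collections import Counter


def lineage_frequencies(assignments):
    """Return {lineage: percentage}, rounded to two decimals, most common first."""
    counts = Counter(assignments)
    total = sum(counts.values())
    return {lineage: round(100 * n / total, 2) for lineage, n in counts.most_common()}


# Illustrative counts only (assumption): chosen to be consistent with the
# percentages reported in this update, not extracted from the real dataset.
example_assignments = (
    ["P.1"] * 337
    + ["B.1.1.7"] * 10
    + ["P.1.2"] * 10
    + ["B.1.1.28"] * 3
    + ["P.2"] * 2
)

for lineage, pct in lineage_frequencies(example_assignments).items():
    print(f"{lineage}: {pct:.2f}%")
```

Rounding each percentage independently means the values may not sum to exactly 100%.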

We identified three samples classified as B.1.1.28 from the southern region of the state harboring 12 nonsynonymous mutations in addition to the defining mutation of this lineage (Figure 1). We found 10 additional genomes in GISAID, all of them from the state of São Paulo, harboring 10 of the 12 mutations found in our samples (ORF1ab: T1637I, A3209V, Q3729K and P4337L; S: F2L, Q14K, T95I, E484Q and N501T; N: G215V). We are proposing to the PangoLEARN system that this clade be classified as a novel lineage.
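The search for database genomes sharing the clade's mutations can be illustrated with set operations in Python. This is a sketch under assumptions, not the procedure used for the update: the genomes_with_clade_mutations helper and the genome identifiers are hypothetical, and only the ten mutations listed above are encoded.

```python
# The ten shared mutations listed in the text, written as "gene:substitution"
CLADE_MUTATIONS = frozenset({
    "ORF1ab:T1637I", "ORF1ab:A3209V", "ORF1ab:Q3729K", "ORF1ab:P4337L",
    "S:F2L", "S:Q14K", "S:T95I", "S:E484Q", "S:N501T",
    "N:G215V",
})


def genomes_with_clade_mutations(genome_mutations, required=CLADE_MUTATIONS, min_shared=None):
    """Return sorted IDs of genomes carrying at least min_shared of the required mutations.

    genome_mutations maps a genome ID to an iterable of mutation labels.
    By default every required mutation must be present.
    """
    if min_shared is None:
        min_shared = len(required)
    return sorted(
        genome_id
        for genome_id, mutations in genome_mutations.items()
        if len(required & set(mutations)) >= min_shared
    )


# Hypothetical genomes for illustration only
example = {
    "hypothetical_SP_01": CLADE_MUTATIONS | {"S:D614G"},
    "hypothetical_SP_02": {"S:D614G", "S:E484Q"},
}
print(genomes_with_clade_mutations(example))
```

Lowering min_shared relaxes the filter, which can help when some sites are missing from a consensus sequence because of low coverage.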

Figure 1. Evolutionary tree of the lineage B.1.1.28, indicating with a bar the new proposed lineage and its defining mutations. The three newly-sequenced genomes are evidenced by black circles and branch color indicates the Brazilian state in which the genome was sampled.

We have also identified that eight samples attributed to the lineage P.1 by the Pango classification system were positioned within the proposed lineage “P.1-like II” (New SARS-CoV-2 P.1-related lineages proposal - P.1-like-I and P.1-like-II in Brazil, Issue #77, cov-lineages/pango-designation on GitHub; Gräf et al., 2021) (Figure 2). The first evidence of this clade in the state of Rio de Janeiro was from January, and the most recent genome previously sequenced was from March 10th. Therefore, we confirm in this work that the lineage was still circulating in at least four different cities in the state between March 24th and April 26th, 2021.

Figure 2. Evolutionary tree of P.1 lineage. Black circles indicate the eight newly-sequenced samples contained in the proposed lineage “P.1-like II”. Branch color indicates the Brazilian state in which the genome was sampled.

We would like to thank all the authors and administrators of the GISAID database, which allowed this study of genomic epidemiology to be conducted properly. A full list acknowledging the authors publishing the data used in this update can be found in the following file:
Acknowledgement_table_GISAID.pdf (118.9 KB)

Reference
Gräf, T. et al. Identification of SARS-CoV-2 P.1-related lineages in Brazil provides new insights about the mechanisms of emergence of Variants of Concern. (2021).

[Update information - Genomic surveillance of the state of Rio de Janeiro, Brazil, in the period between April 26th, 2021 and June 18th, 2021]

Since the last update, we have sequenced an additional 1,705 SARS-CoV-2 genomes from the state of Rio de Janeiro, Brazil, as part of the Genomic Surveillance Program of the Rede Corona-ômica-RJ project. The program was able to sequence samples from all cities in the state and we have, so far, sequenced, assembled and analyzed 2,533 genomes.

The frequency of the circulating lineages has remained stable since the last update, with most of the samples assigned to P.1 (92.1% in June). Other lineages with notable frequencies were B.1.1.7 (0.98%), P.1.2 (2.44%) and the new lineage described in the last update, P.5 (0.98%).

We have identified for the first time the presence of the Delta variant (lineage B.1.617.2) in Rio de Janeiro. Two sequences sampled on June 16th and 17th from patients living in the cities of Seropédica and São João de Meriti were classified as belonging to the lineage by the PangoLEARN v.3.1.5 algorithm. Both sequences contain 19 lineage-defining mutations of B.1.617.2 and share an additional 21 mutations. To evaluate the evolutionary relationship of these two samples to all other South American genomes, we reconstructed the phylogeny using 300 worldwide genomes randomly obtained from the GISAID database as background sequences. The resulting tree (Figure 1) suggests that both sequences from Rio de Janeiro are closely related and derived from the same introduction event. This event is, in turn, not related to other introductions in the country. Further analyses are being conducted to identify the origin of the Rio de Janeiro Delta clade.
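A reproducible random draw of background sequences, like the 300 worldwide genomes mentioned above, could look like the following Python sketch. The sample_background helper, the fixed seed and the identifiers are assumptions for illustration; the update does not describe how the random selection was actually implemented.

```python
import random


def sample_background(candidate_ids, focal_ids, n=300, seed=0):
    """Return the focal genome IDs followed by n randomly chosen background IDs.

    Focal genomes are excluded from the random pool so they are never drawn
    twice. The seed only makes the illustration reproducible.
    """
    focal = set(focal_ids)
    pool = sorted(set(candidate_ids) - focal)  # sorted for a stable draw order
    rng = random.Random(seed)
    background = rng.sample(pool, min(n, len(pool)))
    return sorted(focal) + background


# Hypothetical identifiers standing in for database accessions
candidates = [f"background_{i:04d}" for i in range(2000)]
focal = ["RJ_delta_1", "RJ_delta_2"]
selection = sample_background(candidates + focal, focal)
print(len(selection))
```

Uniform random sampling is the simplest choice; stratifying the draw by country or collection date is a common alternative when the database is unevenly sampled.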

Figure 1. Evolutionary relationships of the two B.1.617.2 genomes from the state of Rio de Janeiro suggest that they originated from a new introduction of the lineage in Brazil. Brazilian sequences are indicated by points, which are colored according to sampling location.

We would like to thank all the authors and administrators of the GISAID database, which allowed this study of genomic epidemiology to be conducted properly. A full list acknowledging the authors publishing the data used in this update can be found in the following file:
Acknowledgement_table_gisaid.pdf (48.5 KB)

[Update information - Genomic surveillance of the state of Rio de Janeiro, Brazil, in the period between June 22nd, 2021 and July 19th, 2021]

Since the last update, we have sequenced an additional 749 SARS-CoV-2 genomes from the state of Rio de Janeiro, Brazil, as part of the Genomic Surveillance Program of the Rede Corona-ômica-RJ project. The frequency of B.1.617.2 (Delta) has increased from 0.57% to 26.37% of samples in a month, and Delta is expected to replace P.1 (currently at 65.66%) as the dominant lineage in the near future. These new sequences were grouped within three different clades, suggesting a new introduction of B.1.617.2 in the country and transmission from the state of Goiás to Rio de Janeiro (see the update on Genomic surveillance tracks the first communitary outbreak of Delta (B.1.617.2) variant in Brazil). We have also detected the appearance of lineages P.1.7 (1.65%) and P.1.4 (0.82%) in Rio de Janeiro. The frequencies of other lineages in the state were as follows: P.1.2 (2.75%), B.1.1.7 (1.10%), P.1.1 (0.82%), B.1.1 (0.27%), P.4 (0.27%) and P.5 (0.27%).

Figure 1. Relative frequency of SARS-CoV-2 lineages in Rio de Janeiro since first detection in the state.

[Update information - Genomic surveillance of the state of Rio de Janeiro, Brazil, in the period between June 24th, 2021 and July 31st, 2021]

Continuing the Genomic Surveillance Program of the Rede Corona-ômica-RJ project in the state of Rio de Janeiro, Brazil, we have sequenced 335 new SARS-CoV-2 samples obtained in the state. The frequency of the Delta variant (B.1.617.2, AY.4 and AY.12) has continued to increase, now representing 61.8% of samples. In contrast, P.1 has now declined to a frequency of 20.66%. Other lineages found in Rio de Janeiro were P.1.1 (5.39%), P.1.8 (4.79%), P.1.9 (4.19%), P.1.2 (1.50%), B.1.1 (0.90%), B.1 (0.60%), B.1.1.7 (0.60%) and P.1.7 (0.30%).

Figure 1. Change of frequency of SARS-CoV-2 lineages in the state of Rio de Janeiro showing the Delta variant surpassing Gamma in July 2021

[Update information - Genomic surveillance of the state of Rio de Janeiro, Brazil, in the period between August 4th, 2021 and August 16th, 2021]

Since the last update, we have sequenced 377 new SARS-CoV-2 genomes from the state of Rio de Janeiro, Brazil, as part of the Genomic Surveillance Program of the Rede Corona-ômica-RJ project. We have sampled, so far, 3,952 genomes from the state. As expected, the Delta variant (lineages B.1.617.2, AY.4, AY.5, AY.6, AY.7.1, AY.7.2, AY.12, AY.20 and AY.25) has become dominant in Rio de Janeiro and now corresponds to ~89.2% of the samples collected in August. We are investigating whether such high lineage diversity within Delta in Rio de Janeiro results from new introductions in Brazil or from misclassification by PangoLEARN. While the Gamma variant is still present in the sample (10.8%), Alpha has completely disappeared. No other lineages were identified in the sample.

Figure 1. Frequency of SARS-CoV-2 variants of concern in the state of Rio de Janeiro, Brazil.