Vibrio Cholerae Genomics

Genomic analysis of selected Vibrio cholerae isolates from the 2023 Cholera outbreak in Zambia: A use-case for integration of whole-genome sequencing in cholera outbreak response


Vibrio cholerae is a Gram-negative bacterium responsible for causing the infectious disease cholera. It is estimated that V. cholerae causes millions of cases of cholera each year, resulting in thousands of deaths globally. To combat this disease, the isolation and serotyping of V. cholerae are crucial for understanding the epidemiology and transmission of cholera outbreaks. A Global Roadmap to 2030 operationalizes the new global strategy for cholera control at the country level and provides a concrete path toward a world in which cholera is no longer a threat to public health. By implementing the strategy between now and 2030, the Global Task Force on Cholera Control (GTFCC) partners are supporting countries to reduce cholera deaths by at least 90 percent. In 2021, GTFCC reported twenty-one countries with cholera outbreaks globally. The roadmap focuses on controlling the disease at the country level, but a critical element for outbreak prevention is establishing an efficient mechanism that ensures routine monitoring and identifying sources of V. cholerae.

Whole-genome sequencing (WGS) has been a critical tool for infectious disease surveillance and outbreak investigations during the COVID-19 pandemic. When combined with epidemiological and environmental investigations, WGS provides the ultimate resolution for detecting and analyzing transmission routes, tracing outbreak sources, and assessing pathogen immune escape properties, virulence, and antimicrobial resistance determinants. WGS of V. cholerae is now providing unprecedented insights into where cholera strains emerge and then spread to other areas. It is time to integrate these developments into routine public-health measures, especially in outbreak-prone areas of Africa; however, cholera genomic surveillance remains largely underutilized.

The first case of the 2023 cholera outbreak was recorded on January 21, 2023, in Vubwi District, Eastern Province. Eastern Province shares a long border with Malawi, which has been experiencing cholera outbreaks since June 2022. A further 3 districts in Eastern Province have so far recorded cases. In Luapula Province's Mwansabombwe and Nchelenge districts, a separate outbreak with no epidemiological link to the Eastern Province outbreak was confirmed on January 26 and February 9, 2023, respectively. These two districts border the Democratic Republic of the Congo, where reports indicate a potential ongoing cholera outbreak, particularly in fishing camps. Nsama district in Northern Province, one of the 20 cholera hotspots in Zambia, confirmed a cholera outbreak on March 12, 2023. As of March 23, 2023, a total of 276 cases, including 7 deaths, had been recorded in the 7 affected districts. Three of the seven affected districts received preemptive oral cholera vaccination in 2021 as part of the cholera elimination effort.

For the 2023 cholera outbreak in Zambia, we present a use-case for integrating WGS into cholera outbreak response, with the following objectives:

  1. Identification of the specific V. cholerae strain responsible for the outbreak: We will use WGS to determine the specific strain of V. cholerae responsible for the outbreak and track its transmission pattern.
  2. Investigation of the source of the outbreak: Once the specific strain of V. cholerae responsible for the outbreak has been identified, we will compare the genome sequence of the outbreak strain to other V. cholerae strains found in the environment, such as in water sources or food samples. This will help to identify the source of the outbreak and inform prevention strategies for future outbreaks.
  3. Detection of antimicrobial resistance genes: We will use WGS data analysis tools to detect antimicrobial resistance genes in the V. cholerae isolates. This information can be used to determine which antibiotics will be effective in treating severe cases of infection and help guide treatment decisions.
  4. Identification of virulence factors of V. cholerae: We will analyze the generated sequences to identify virulence genes and review clinical data at the treatment centers to link these with disease severity.
  5. Phylogenetic relatedness: We will analyze genomes collected from the different districts affected by the outbreak to determine the phylogenetic relatedness of the isolates and infer transmission patterns.


Outbreak Response Deployment

The cholera outbreak in Zambia in 2023 occurred in districts that lacked laboratory capacity for bacterial culture, identification, and antibiotic sensitivity testing. The Zambia National Public Health Institute (ZNPHI) deployed a team of public health emergency rapid responders (PHERRT) to assist affected districts in responding to and controlling the outbreak. The PHERRT was composed of experts in epidemiology, case management, public health laboratory science, environmental health, and public health risk communication. The Zambia National Public Health Reference Laboratory (ZNPHRL) anchored the laboratory function of the response to the cholera outbreak through deployment of materials, consumables, and equipment for V. cholerae culture, identification, phenotypic antibiotic susceptibility testing (AST), and WGS in the field.

V. cholerae bacterial culture and identification

Stool samples were collected from patients presenting with acute diarrhea at the cholera treatment centers in the affected districts. The specimens were inoculated into alkaline peptone water (APW) and enriched for 6 hours, then subcultured onto thiosulfate-citrate-bile-sucrose agar (TCBS), a medium selective for Vibrio species that inhibits the growth of most other bacteria.

Water samples were collected from the sources where individuals with cholera drew water for drinking. Water samples were filtered through the membrane with a flex pump before being inoculated on eosin methylene blue agar for coliform identification. Non-lactose fermenting coliforms were scooped and inoculated on TCBS.

The TCBS plates were incubated at 37°C for 18–24 hours. Yellow colonies with a characteristic halophilic (salt-loving) appearance were scooped from incubated TCBS plates for identification of V. cholerae using the triple sugar iron agar (TSI), lysine iron agar (LIA), sulfide indole motility agar (SIM), citrate, and oxidase biochemical tests.

V. cholerae Serotyping

Serotyping involves the use of antisera to identify and classify V. cholerae based on specific surface antigens, such as the O antigen, which is found in the lipopolysaccharide layer of the cell wall, and the H antigen, which is present in the flagella. Colonies on TCBS suggestive of V. cholerae were plated on Mueller-Hinton agar for purity and incubated at 37°C for 16–18 hours. One drop of physiological saline was placed on each half of a clean glass slide divided into two parts. A match-head-sized portion of a fresh bacterial colony from Mueller-Hinton agar was emulsified in the saline on each half of the slide. One to two drops of serotyping antiserum were added to the emulsified colony and gently mixed by tilting the slide back and forth for 1 minute, and the agglutination pattern was observed. Agglutination, or the clumping of bacterial cells, indicated a positive reaction with the serotyping antiserum. V. cholerae Polyvalent O1, V. cholerae O139 Bengal, V. cholerae Inaba, and V. cholerae Ogawa antisera were used for serotyping.
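The reading of the slide agglutination results can be sketched as a small decision function. This is an illustration only, not the laboratory's SOP: the function name, the result labels, and the rule that clumping in saline alone makes the test unreadable are assumptions, as is the conventional reading that an O1 isolate reacting with both Ogawa and Inaba antisera is called Hikojima.

```python
def interpret_agglutination(o1_poly: bool, o139: bool, ogawa: bool, inaba: bool,
                            saline_control: bool = False) -> str:
    """Map slide-agglutination results (True = visible clumping) to a call.

    Hypothetical helper for illustration; labels and rules are assumptions.
    """
    if saline_control:
        # Clumping in saline without antiserum: the reaction cannot be read.
        return "uninterpretable (autoagglutination)"
    if o1_poly and o139:
        # Both serogroup antisera positive is not expected; repeat the test.
        return "inconclusive (repeat test)"
    if o139:
        return "O139"
    if not o1_poly:
        return "non-O1/non-O139"
    # Serogroup O1: the monovalent antisera assign the serotype.
    if ogawa and inaba:
        return "O1 Hikojima"
    if ogawa:
        return "O1 Ogawa"
    if inaba:
        return "O1 Inaba"
    return "O1, serotype undetermined"


if __name__ == "__main__":
    print(interpret_agglutination(o1_poly=True, o139=False, ogawa=True, inaba=False))
```

Keeping the saline check first mirrors the bench practice of reading the control before any antiserum reaction.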

Phenotypic Antibiotic Susceptibility Testing

A bacterial suspension standardized to 0.5 McFarland was used to perform phenotypic AST on Mueller-Hinton agar using both disk diffusion and minimum inhibitory concentration (MIC) methods. Clinical and Laboratory Standards Institute (CLSI) M45 guidelines were used to interpret the MICs and the measured zones of inhibition around the AST disks.

Bacterial DNA Extraction, Quantification, and Normalization

Fresh colonies from Mueller-Hinton agar plates were suspended in phosphate buffer solution in a 1.5 mL Eppendorf tube for genomic DNA extraction. Bacterial DNA was extracted using the QIAamp DNA Mini Kit as per the manufacturer’s instructions. The DNA extract from each sample was quantified on the Thermo Fisher Qubit Flex fluorometer using the Invitrogen Qubit dsDNA HS Assay kit following the manufacturer’s protocol. All samples were normalized to 60 ng by diluting with nuclease-free water so that samples with a high DNA concentration would not be sequenced preferentially.
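The normalization arithmetic can be sketched as follows. The 60 ng target comes from the text above; the 10 µL final volume and the function name are assumptions chosen for illustration, so substitute the volume your library preparation protocol specifies.

```python
def dilution_volumes(stock_ng_per_ul: float, target_ng: float = 60.0,
                     final_ul: float = 10.0) -> tuple[float, float]:
    """Return (DNA volume, water volume) in µL giving target_ng in final_ul.

    final_ul is a hypothetical value, not taken from the protocol.
    """
    if stock_ng_per_ul <= 0:
        raise ValueError("stock concentration must be positive")
    dna_ul = target_ng / stock_ng_per_ul  # volume of extract holding target_ng
    if dna_ul > final_ul:
        # The extract is too dilute to reach the target mass in this volume.
        raise ValueError("extract too dilute for the requested final volume")
    return round(dna_ul, 2), round(final_ul - dna_ul, 2)


if __name__ == "__main__":
    # A Qubit reading of 30 ng/µL needs 2 µL of extract plus 8 µL of water.
    print(dilution_volumes(30.0))
```

Extracts that raise the "too dilute" error would need to be concentrated or re-extracted rather than diluted.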

Whole Genome Sequencing for V. cholerae

WGS was performed using Oxford Nanopore sequencing kits on the GridION. Library preparation was performed using the SQK-RBK110.96 rapid barcoding kit.

Bioinformatic Data Analysis

Quality assessment was performed using FastQC (LaMar, 2015). De novo assembly was performed with Flye (Kolmogorov et al., 2019), which produced contigs for each sample. The resultant contigs were screened with abricate to detect antimicrobial resistance genes and virulence factors, using the CARD and virulence factor (VFDB) databases respectively.
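A minimal sketch of how the per-sample steps above might be chained is shown here. It only builds the command lines and does not run them. The directory layout, output file names, thread count, and the choice of Flye's --nano-raw read mode are assumptions; the post does not state the exact parameters used.

```python
import shlex
from pathlib import Path


def assembly_and_screening_commands(sample: str, reads: Path, outdir: Path,
                                    threads: int = 4):
    """Build (argv, stdout_file) pairs for QC, assembly and gene screening.

    stdout_file is None when the tool writes its own output files.
    """
    qc_dir = outdir / sample / "fastqc"      # FastQC expects this directory to exist
    asm_dir = outdir / sample / "flye"
    contigs = asm_dir / "assembly.fasta"     # Flye writes its contigs here
    return [
        (["fastqc", str(reads), "-o", str(qc_dir)], None),
        (["flye", "--nano-raw", str(reads), "--out-dir", str(asm_dir),
          "--threads", str(threads)], None),
        # abricate prints a tab-separated report to stdout, one run per database.
        (["abricate", "--db", "card", str(contigs)], outdir / sample / "card.tsv"),
        (["abricate", "--db", "vfdb", str(contigs)], outdir / sample / "vfdb.tsv"),
    ]


if __name__ == "__main__":
    steps = assembly_and_screening_commands(
        "barcode01", Path("reads/barcode01.fastq.gz"), Path("results"))
    for argv, stdout_file in steps:
        redirect = f" > {stdout_file}" if stdout_file else ""
        print(shlex.join(argv) + redirect)
```

Each argv list can be passed to subprocess.run once the output directories have been created; building the commands separately makes them easy to log and review before a run.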

To investigate the relatedness of the samples, phylogenetic analysis and multilocus sequence typing (MLST) were performed. For phylogenetic analysis, the contigs from Flye were passed to snippy and snippy-core (Seemann, 2022) to produce a core genome alignment of all the samples. The core alignment was cleaned to remove ambiguous characters, and the clean alignment was passed to Gubbins (Croucher et al., 2015) to remove recombinant sites. The Gubbins output was passed to FastTree to produce a phylogenetic tree in Newick format, which was later visualized in iTOL (Letunic and Bork, 2007; Price, Dehal and Arkin, 2010). For MLST, the contigs from Flye were searched for sequence types using mlst against the vcholerae and vcholerae_2 schemes from PubMLST (Jolley and Maiden, 2010; PubMLST - Public databases for molecular typing and microbial genome diversity, no date; tseemann/mlst: Scan contig files against PubMLST typing schemes, no date).
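The phylogenetic and MLST steps above can be sketched in the same way, again only building command lines. The reference genome path is a placeholder (the post does not name the reference used), and the output names such as core, gubbins, and tree.nwk are assumptions. The commands assume they are run from a single working directory so that each tool finds the previous tool's output.

```python
import shlex
from pathlib import Path


def phylogeny_and_mlst_commands(contigs_by_sample: dict, reference: Path,
                                workdir: Path):
    """Build (argv, stdout_file) pairs for the core-genome tree and MLST."""
    steps, snippy_dirs = [], []
    for sample, contigs in sorted(contigs_by_sample.items()):
        out = workdir / "snippy" / sample
        snippy_dirs.append(str(out))
        # --ctgs lets snippy call variants from assembled contigs.
        steps.append((["snippy", "--outdir", str(out), "--ref", str(reference),
                       "--ctgs", str(contigs)], None))
    # snippy-core writes core.full.aln when the prefix is "core".
    steps.append((["snippy-core", "--ref", str(reference), "--prefix", "core",
                   *snippy_dirs], None))
    # Replace ambiguous characters before recombination filtering.
    steps.append((["snippy-clean_full_aln", "core.full.aln"], "clean.full.aln"))
    steps.append((["run_gubbins.py", "--prefix", "gubbins", "clean.full.aln"], None))
    steps.append((["FastTree", "-gtr", "-nt",
                   "gubbins.filtered_polymorphic_sites.fasta"], "tree.nwk"))
    all_contigs = [str(c) for _, c in sorted(contigs_by_sample.items())]
    for scheme in ("vcholerae", "vcholerae_2"):
        steps.append((["mlst", "--scheme", scheme, *all_contigs],
                      f"mlst_{scheme}.tsv"))
    return steps


if __name__ == "__main__":
    demo = {"barcode01": Path("asm/barcode01.fasta"),
            "barcode02": Path("asm/barcode02.fasta")}
    for argv, stdout_file in phylogeny_and_mlst_commands(
            demo, Path("reference.fasta"), Path("work")):
        print(shlex.join(argv) + (f" > {stdout_file}" if stdout_file else ""))
```

The resulting tree.nwk file is the Newick tree that would be uploaded to iTOL for display.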


Detection of antimicrobial resistance genes:

Antimicrobial resistance profiling detected the same 9 antimicrobial resistance genes in every sample, each at 99% coverage or above (Table 1). Screening for virulence factors detected 51, all of which were shared by all the samples.

SAMPLE APH(3'')-Ib APH(6)-Id CRP Vibrio_cholerae_varG almG catB9 dfrA1 floR sul2
Barcode01 99.88 100.00 99.68 100.00 99.88 100.00 100.00 99.92 99.88
Barcode02 99.88 100.00 99.68 100.00 99.88 100.00 100.00 99.92 99.88
Barcode03 99.88 100.00 99.68 100.00 100.00 100.00 100.00 100.00 99.88
Barcode04 100.00 100.00 99.68 100.00 100.00 100.00 100.00 100.00 99.88
Barcode05 100.00 100.00 99.68 100.00 100.00 100.00 100.00 100.00 99.88
Barcode06 99.88 100.00 99.68 100.00 100.00 100.00 100.00 100.00 99.88
Barcode07 99.88 100.00 99.68 100.00 100.00 100.00 100.00 100.00 100.00
Barcode08 100.00 100.00 99.68 100.00 100.00 100.00 100.00 100.00 99.88
Barcode09 99.88 100.00 99.68 100.00 99.98 100.00 100.00 99.92 99.88
Barcode10 99.88 100.00 99.68 100.00 100.00 100.00 100.00 99.92 99.88

Table 1: Percentage coverage of each resistance gene in each sample.
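A coverage table like the one above can be checked programmatically. The sketch below parses a whitespace-separated table and lists any gene falling below a threshold. The three rows are copied from Table 1 as an excerpt; the function names and the 99% default threshold are illustrative choices.

```python
TABLE_EXCERPT = """\
SAMPLE APH(3'')-Ib APH(6)-Id CRP Vibrio_cholerae_varG almG catB9 dfrA1 floR sul2
Barcode01 99.88 100.00 99.68 100.00 99.88 100.00 100.00 99.92 99.88
Barcode04 100.00 100.00 99.68 100.00 100.00 100.00 100.00 100.00 99.88
Barcode09 99.88 100.00 99.68 100.00 99.98 100.00 100.00 99.92 99.88
"""


def parse_coverage(text: str) -> dict:
    """Return {sample: {gene: percent coverage}} from a whitespace table."""
    rows = [line.split() for line in text.strip().splitlines()]
    genes = rows[0][1:]
    table = {}
    for row in rows[1:]:
        if len(row) != len(genes) + 1:
            raise ValueError(f"unexpected number of columns for {row[0]}")
        table[row[0]] = {g: float(v) for g, v in zip(genes, row[1:])}
    return table


def genes_below(table: dict, threshold: float = 99.0) -> list:
    """List (sample, gene, coverage) entries under the threshold."""
    return [(sample, gene, cov)
            for sample, per_gene in table.items()
            for gene, cov in per_gene.items()
            if cov < threshold]


if __name__ == "__main__":
    coverage = parse_coverage(TABLE_EXCERPT)
    print("entries below 99%:", genes_below(coverage))
```

An empty list from genes_below is consistent with the statement that every gene was detected at 99% coverage or above.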

Multilocus sequence typing

The resultant contigs were checked for sequence types using both the vcholerae and vcholerae_2 schemes.

With the vcholerae scheme, novel alleles of the pntA and gyrB genes were found in some of the samples, and some samples had alleles that only partially matched known pntA and pyrC alleles (Table 2).

Sample ID adk gyrB mdh metE pntA purM pyrC
barcode01 7 11 4 37 ~12 1 20
barcode02 7 11 4 37 12 1 20?
barcode03 7 11 4 37 227? 1 20?
barcode04 7 ~11 4 37 227? 1 20?
barcode05 7 11 4 37 12 1 20?
barcode06 7 ~11 4 37 227? 1 20
barcode07 7 11 4 37 227? 1 20?
barcode08 7 11 4 37 227? 1 20?
barcode09 7 11 4 37 227? 1 20?
barcode10 7 11 4 37 227? 1 20?

Table 2: Alleles found in the vcholerae scheme for each sample. ~ marks a novel full-length allele similar to that allele number. ? marks a partial match to a known allele.

With the vcholerae_2 scheme, a novel allele of the lap gene was found in all of the samples, and all samples had alleles that only partially matched known asd, dnaE, and recA alleles (Table 3).

Sample ID asd dnaE lap pgm recA
barcode01 26? 54? ~2 2 38?
barcode02 26? 54? ~2 2 38?
barcode03 26? 54? ~2 2 38?
barcode04 26? 54? ~2 2 38?
barcode05 26? 54? ~2 2 38?
barcode06 26? 54? ~2 2 38?
barcode07 26? 54? ~2 2 38?
barcode08 26? 54? ~2 2 38?
barcode09 26? 54? ~2 2 38?
barcode10 26? 54? ~2 2 38?

Table 3: Alleles found in the vcholerae_2 scheme for each sample. ~ marks a novel full-length allele similar to that allele number. ? marks a partial match to a known allele.
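The allele notation used in the MLST tables can be decoded with a short helper. This is a sketch for reading such tables, not part of the mlst tool itself: the function names are hypothetical, and the handling of "-" as a missing allele and the rule that a sequence type can only be assigned when every locus is an exact match are assumptions about the tool's conventions.

```python
import re

# "~12" = novel full-length allele similar to 12; "20?" = partial match to 20.
_CALL = re.compile(r"^(?:~(\d+)|(\d+)\?|(\d+))$")


def classify_allele(call: str) -> tuple:
    """Return (kind, allele number) for one MLST allele call."""
    if call == "-":
        return ("missing", None)  # assumed notation for an absent locus
    match = _CALL.match(call)
    if match is None:
        raise ValueError(f"unrecognised allele call: {call!r}")
    novel, partial, exact = match.groups()
    if novel is not None:
        return ("novel", int(novel))
    if partial is not None:
        return ("partial", int(partial))
    return ("exact", int(exact))


def all_exact(calls: list) -> bool:
    """True when every locus is an exact match to a known allele."""
    return all(classify_allele(c)[0] == "exact" for c in calls)


if __name__ == "__main__":
    # barcode03 row from the vcholerae scheme table above.
    row = "7 11 4 37 227? 1 20?".split()
    print([classify_allele(c) for c in row], all_exact(row))
```

Under these assumptions none of the ten samples has an all-exact profile in either scheme, which is why the tables report alleles rather than sequence types.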

Identification of virulence factors of V. cholerae:

Phylogenetic relatedness:

All the samples clustered in the same clade, as seen in the tree below.


Croucher, N.J. et al. (2015) ‘Rapid phylogenetic analysis of large samples of recombinant bacterial whole genome sequences using Gubbins’, Nucleic Acids Research, 43(3), p. e15. Available at:

Jolley, K.A. and Maiden, M.C. (2010) ‘BIGSdb: Scalable analysis of bacterial genome variation at the population level’, BMC Bioinformatics, 11(1), p. 595. Available at:

Kolmogorov, M. et al. (2019) ‘Assembly of long, error-prone reads using repeat graphs’, Nature Biotechnology, 37(5), pp. 540–546. Available at:

LaMar, D. (2015) ‘FastQC’. Available at:

Letunic, I. and Bork, P. (2007) ‘Interactive Tree Of Life (iTOL): an online tool for phylogenetic tree display and annotation’, Bioinformatics (Oxford, England), 23(1), pp. 127–128. Available at:

Price, M.N., Dehal, P.S. and Arkin, A.P. (2010) ‘FastTree 2 – Approximately Maximum-Likelihood Trees for Large Alignments’, PLOS ONE, 5(3), p. e9490. Available at:

PubMLST - Public databases for molecular typing and microbial genome diversity (no date) PubMLST. Available at: (Accessed: 30 March 2023).

Seemann, T. (2022) ‘Snippy’. Available at: (Accessed: 25 March 2022).

tseemann/mlst: Scan contig files against PubMLST typing schemes (no date). Available at: (Accessed: 30 March 2023).

Partners and collaborators

Mpanga Kasonde (ZNPHRL/ZNPHI, Zambia)

Joseph Mutale (ZNPHRL/ZNPHI, Zambia)

Nchimunya Siabenzu (ZNPHRL/ZNPHI, Zambia)

Otridah Kapona (ZNPHRL/ZNPHI, Zambia)

Dr. Kunda Musonda (ZNPHRL/ZNPHI, Zambia)
Prof. Roma Chilengi (ZNPHI, Zambia)

Peter Mwansa (ZNPHRL/ZNPHI, Zambia)

Frazer Mtine (ZNPHRL/ZNPHI, Zambia)

Stephen Kanyerezi (NHLDS/CPHL/UNHLS, Uganda)

Ivan Sserwadda (NHLDS/CPHL/UNHLS, Uganda)

Sofonias Tessema (Africa CDC, Ethiopia)

Gerald Mboowa (Africa CDC, Ethiopia)

Statement on continuing work and analyses prior to publication

These genomes are being shared before publication. Please note that this data is based on work in progress and should be considered preliminary. Our analysis of these data is ongoing, and a publication communicating our findings is in preparation. If you intend to use these sequences prior to our publication, please communicate with Mr. Mpanga Kasonde for coordination.
