Monkeypox virus genome sequences from multiple lesions indicate co-infection of a UK returning traveller

Ana da Silva Filipe1, Lily Tong1*, Vattipally B Sreenu1*, Alasdair Maclean2, Rory Gunson2, Matthew TG Holden3,4, David Barr5, Antonia Ho1, Massimo Palmarini1, Andrew Rambaut6, David L Robertson1, Emma C Thomson1

1MRC-University of Glasgow Centre for Virus Research, Glasgow, UK; 2West of Scotland Specialist Virology Centre, Glasgow Royal Infirmary, Glasgow, UK; 3Public Health Scotland, Glasgow, UK; 4School of Medicine, University of St Andrews, St Andrews, UK; 5Queen Elizabeth University Hospital, NHS Greater Glasgow and Clyde, Glasgow, UK; 6Institute of Ecology and Evolution, University of Edinburgh, Edinburgh, UK.
*These authors contributed equally.

Monkeypox virus (MPXV) is endemic in wildlife in Central and Western Africa, resulting in frequent zoonotic transmissions to humans in this part of the world. Monkeypox is a self-limited disease with a duration of 2-4 weeks and symptom onset between 5 and 21 days after exposure. Recent estimates of the case fatality rate range from 3% to 6%, but this can be higher in younger children and immunocompromised individuals. Symptoms include a rash, which appears 1-5 days after the first symptoms (fever, swollen glands, headache, muscle ache and shivering) and can spread to other parts of the body, including the genitals.

From 2018 to 2021, seven cases of monkeypox were detected in the UK (Monkeypox outbreak: technical briefings - GOV.UK). Since May 2022, 780 cases have been reported to WHO by 27 member states (Multi-country monkeypox outbreak: situation update). Phylogenetic analysis has linked these to sequences sampled in 2018/19. The recent finding of another lineage in the US that sits deeper in the tree (sharing a most recent common ancestor with sequences sampled in 2017) is consistent with monkeypox having been present in humans for the last five years (see Update to observations about putative APOBEC3 deaminase editing in the light of new genomes from USA). The majority of reported 2022 cases have been male, and many (although not all) affected people self-identify as gay, bisexual and other men who have sex with men (GBMSM).

Here we report five MPXV genome sequences from an infected person presenting with rash and fever in May 2022, and reporting a single exposure event following recent travel to Germany. The patient was recruited to the International Severe Acute Respiratory and emerging Infections Consortium (ISARIC) WHO Clinical Characterisation Protocol UK (CCP-UK) study. Ethical approval was given by the South Central–Oxford C Research Ethics Committee in England (13/SC/0149), the Scotland A Research Ethics Committee (20/SS/0028), and the WHO Ethics Review Committee (RPC571 and RPC572).

Genome sequencing was carried out on samples from five different lesions from the same patient: CVR_MPXV1a, CVR_MPXV1b, CVR_MPXV1c, CVR_MPXV1d and CVR_MPXV1e. All had Ct values in the 15-16 range. Following total nucleic acid extraction with easyMAG, all samples underwent host depletion using the NEBNext® Microbiome DNA Enrichment Kit, which enriches viral DNA by selectively binding and removing CpG-methylated host DNA. Fragmentation was performed with a Covaris S220. Libraries were prepared with the KAPA LTP Library Preparation Kit (Roche) and amplified with 10 cycles of PCR. Samples were pooled at equimolar concentrations and sequenced alongside a negative control using a NextSeq mid-output cartridge (Illumina).

Short and low-quality sequence reads (length <75 nucleotides, Phred score <30) were filtered from the datasets using Trim Galore (version 0.6.6). BWA-MEM (version 0.7.17-r1188) was used to map the filtered reads to the MPXV sequence ON602722.1; this MPXV sequence was chosen as a close reference using a k-mer analysis of all available MPXV full genome sequences. iVar (version 1.3.1) was used to generate consensus sequences (minimum depth 10 and consensus frequency threshold 0.6) from the aligned BAM files. See Table 1 for read mapping statistics.
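To illustrate how the two consensus parameters interact, here is a minimal Python sketch of depth- and frequency-thresholded consensus calling. It is a simplification written for this post, not iVar's implementation: the function name `call_consensus`, the toy pileup, and the choice to emit N whenever a threshold is not met are all assumptions.

```python
from collections import Counter


def call_consensus(pileup, min_depth=10, min_freq=0.6):
    """Call one consensus base per reference position.

    pileup: list of strings, one per position, each holding the bases
    observed in the reads covering that position (hypothetical input format).
    Positions below min_depth, or whose most common base falls below
    min_freq, are reported as N (a simplifying assumption).
    """
    consensus = []
    for bases in pileup:
        depth = len(bases)
        if depth < min_depth:
            consensus.append("N")
            continue
        base, count = Counter(bases).most_common(1)[0]
        consensus.append(base if count / depth >= min_freq else "N")
    return "".join(consensus)


# Toy example: four positions with different depths and base mixtures.
toy_pileup = [
    "A" * 12,            # depth 12, unanimous
    "C" * 7 + "T" * 5,   # depth 12, top base at 7/12 (below 0.6)
    "G" * 4,             # depth 4 (below minimum depth)
    "T" * 9 + "C" * 1,   # depth 10, top base at 0.9
]
print(call_consensus(toy_pileup))
```

With these thresholds, a position is only called when at least 10 reads cover it and at least 60% of them agree, which is why mixed sites in a co-infected sample would tend to be left uncalled rather than resolved arbitrarily.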

Table 1 - Mapping statistics of next-generation sequence data.

Sample       Total reads    QC passed      Mapped reads   Ref coverage (%)   Average depth
CVR_MPXV1a   146,229,822    117,947,252    633,176        99.88              427.6
CVR_MPXV1b   136,896,642    126,331,776    676,579        99.88              410.7
CVR_MPXV1c   91,462,518     80,997,026     461,363        99.97              267.0
CVR_MPXV1d   90,340,102     79,815,084     526,895        99.92              304.5
CVR_MPXV1e   76,246,684     67,298,502     409,778        99.99              240.5
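The proportions implied by Table 1 can be worked out directly. The short Python sketch below computes, for each sample, the percentage of reads passing QC and the percentage of QC-passed reads that mapped to the reference; the variable and function names are illustrative only, and the numbers are copied from the table.

```python
# (total reads, QC-passed reads, mapped reads), copied from Table 1
table = {
    "CVR_MPXV1a": (146_229_822, 117_947_252, 633_176),
    "CVR_MPXV1b": (136_896_642, 126_331_776, 676_579),
    "CVR_MPXV1c": (91_462_518, 80_997_026, 461_363),
    "CVR_MPXV1d": (90_340_102, 79_815_084, 526_895),
    "CVR_MPXV1e": (76_246_684, 67_298_502, 409_778),
}


def summarize(total, qc_passed, mapped):
    """Return QC pass rate and mapped fraction as percentages."""
    return {
        "qc_pass_pct": 100 * qc_passed / total,
        "mapped_pct": 100 * mapped / qc_passed,
    }


stats = {name: summarize(*row) for name, row in table.items()}

for name, s in stats.items():
    print(f"{name}: {s['qc_pass_pct']:.1f}% passed QC, "
          f"{s['mapped_pct']:.2f}% of QC-passed reads mapped")
```

In every sample, fewer than 1% of QC-passed reads mapped to the MPXV reference, yet this was still enough for near-complete reference coverage at an average depth in the hundreds.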

Consensus sequences were manually checked and repeat regions were curated based on majority read consensus. All consensus sequences were aligned with available MPXV genomes using MAFFT (version 7.475). The phylogenetic tree was generated with IQ-TREE using the default ModelFinder and tree reconstruction options.

Figure 1 - Genomic positions of variations observed in the five MPXV lesion samples. CVR_MPXV1a is used as a reference.

This Scottish case was linked to other sequences from continental Europe, indicating that this is an epidemiologically related outbreak of infections. Interestingly, the sequences from the five lesions were consistent with the presence of two variants, related to sequences previously identified in Germany and Italy, indicating MPXV co-infection. Compared to CVR_MPXV1b, CVR_MPXV1a has two single nucleotide differences (Figure 1): a C->T (in an APOBEC3 preferred dinucleotide context, TC) at position 22,742, resulting in an E->K replacement at residue 141 of the MPXVgp026 protein, and a G->A (not in an APOBEC3 preferred dinucleotide context) at position 74,363, resulting in an R->H replacement at residue 194 of the MPXVgp079 protein. No minor variants were detected at either site (22,742 or 74,363) in CVR_MPXV1a, where coverage was 252 and 71 reads, respectively. Amino acid positions are given with reference to the 2018 “MPXV-M5312_HM12_Rivers” reference genome sequence (GenBank accession NC_063383).
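The dinucleotide-context check described above can be sketched in a few lines of Python. This is an illustrative sketch, not the method used in the study: the function name and the toy sequence are hypothetical, and it assumes the usual convention that APOBEC3-type editing appears as C->T in a TC context on the reported strand, or as G->A in a GA context when the edit occurred on the opposite strand.

```python
def in_apobec3_context(seq, pos, ref, alt):
    """Return True if a substitution matches an APOBEC3-preferred context.

    seq: sequence carrying the ancestral base (string, A/C/G/T)
    pos: 1-based position of the substitution
    ref, alt: ancestral and derived bases

    Assumed convention: TC -> TT on the reported strand, or its reverse
    complement GA -> AA.
    """
    i = pos - 1
    if seq[i] != ref:
        raise ValueError(f"expected {ref} at position {pos}, found {seq[i]}")
    if ref == "C" and alt == "T":
        # C must be preceded by T
        return i > 0 and seq[i - 1] == "T"
    if ref == "G" and alt == "A":
        # G must be followed by A
        return i + 1 < len(seq) and seq[i + 1] == "A"
    return False


# Toy sequence (not MPXV): positions are 1-based.
toy = "ATCGGAC"
print(in_apobec3_context(toy, 3, "C", "T"))  # C preceded by T
print(in_apobec3_context(toy, 4, "G", "A"))  # G followed by G
print(in_apobec3_context(toy, 5, "G", "A"))  # G followed by A
```

Applied to the two differences reported here, the C->T at position 22,742 would fall into the first branch, while the G->A at position 74,363 would return False because it is not followed by an A.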

Figure 2 - Phylogenetic tree showing the relationships of CVR_MPXV1a-e to recently sampled MPXV genome sequences.

While MPXV genome sequence comparisons can be used to infer epidemiological linkage, detecting divergent sequences in the same individual has implications for linking MPXV cases. Specifically, the potential presence of different variants in one person (due to co-infection) means that nucleotide differences cannot be used to definitively rule out epidemiological linkage between infections.

Data. The CVR_MPXV1a-e sequences have GenBank accession numbers ON808413-ON808417.

Acknowledgements. The study was funded by the Medical Research Council (MRC, MC UU 1201412).


Fantastic work! I thought it would be useful to show where these sequences lie on the current Nextstrain MPXV tree which masks terminals and variable regions.

Sample a is part of the largest overall lineage/cluster identified so far, consisting of 18 German sequences and 1 Swiss sequence.

Samples b, c, and e are identical to the outbreak root. Sample d is attached directly to the outbreak root, differing by 1 mutation that has not otherwise been seen in the other 160 outbreak sequences in the Nextstrain tree.

In the Nextstrain tree, the following sites are masked:

from beginning: 1500
from end: 7000
6400-7500       very diverse region
133050-133250   indel variation and long homopolymers
173250-173460   indel variation and long repetitive elements
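For readers who want to apply the same masking to their own aligned sequences, a minimal Python sketch follows. It is not the Nextstrain implementation: the function name is hypothetical, and it assumes the listed coordinates are 1-based, inclusive positions in reference-aligned space and that masked sites are replaced with N.

```python
def mask_sequence(
    seq,
    from_start=1500,
    from_end=7000,
    ranges=((6400, 7500), (133050, 133250), (173250, 173460)),
    mask_char="N",
):
    """Mask the first `from_start` and last `from_end` sites plus each range.

    Coordinates are assumed to be 1-based and inclusive.
    """
    masked = list(seq)
    n = len(masked)
    spans = [(1, from_start), (n - from_end + 1, n)] + list(ranges)
    for start, end in spans:
        # Clamp each span to the sequence and convert to 0-based indices.
        for i in range(max(start, 1) - 1, min(end, n)):
            masked[i] = mask_char
    return "".join(masked)


# Toy input: a 197,000-base placeholder sequence, not a real genome.
toy_genome = "A" * 197_000
masked = mask_sequence(toy_genome)
print(masked.count("N"), "sites masked out of", len(masked))
```

Masking before tree building means that differences falling only in these regions do not contribute to branch lengths, so a sample can appear identical to the outbreak root in the masked tree even if its unmasked consensus differs there.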

Link to the auspice tree with sequences discussed in this post highlighted
