Clade 2 in Liberia

The CDC recently sequenced an EBOV lineage from an American expat who had returned from Liberia in July/August:

http://www.ncbi.nlm.nih.gov/nuccore/KP178538.1

This particular lineage belongs to ‘clade 2’ as described in our Gire et al. paper. The EVD outbreak in Liberia has been somewhat unusual: the country was first infected early in the outbreak, in April 2014, yet few cases were reported until the disease hit the capital, Monrovia, around June - at which point the number of cases started rising exponentially (R0 ~2, doubling-time ~2-3 weeks). I can think of two different scenarios that can explain the transmission chain in Liberia:

  1. A ‘slow-fuse’ scenario where the original introduction event in April 2014 led to a dramatic increase in cases once the outbreak hit Monrovia. This scenario suggests that the lineage that hit Monrovia around June was a direct descendant of the lineage that was introduced in April.

  2. A second introduction into Liberia from Sierra Leone or Guinea. This would suggest that the ‘Monrovia lineage’ isn’t a direct descendant of the April lineage.

Since the lineage that the CDC published belongs to clade 2 - which is too young to have been introduced in April (see the Gire et al. paper) - the sequence alone is consistent with scenario 2. However, a reference-based approach was used to generate the Liberian EBOV sequence, so it’s possible that an analysis bias could inadvertently have caused this sequence to ‘look’ like clade 2 without it actually belonging to that clade. We could address this question by gaining direct access to the raw data produced by the CDC, but unfortunately these data are not yet available.

To try to get a better understanding of this, I have looked more closely into the early transmission chain in Monrovia, to see if any information could be gained from it. As it turns out, it appears that the outbreak in Monrovia was caused by two travelers who had recently come from Sierra Leone, stayed in the ‘Sierra Leone quarter’ of the city, and later died from Ebola at ELWA.

Taking the epidemiological and sequence data into consideration, I think scenario 2 is the most likely explanation. In other words, Liberia was ‘infected’ in April 2014, but this relatively small-scale outbreak did not lead to very many cases. It was only when Monrovia got hit in June 2014 with a lineage from Sierra Leone that the outbreak really got started in this country.

EBOV sequencing from additional samples in Liberia would be very helpful to understand if the outbreak in this country today is fueled entirely by a single point-introduction into Liberia from Sierra Leone in June, or whether multiple introductions have occurred.

Thomas Hoenen, David Safronetz and Heinz Feldmann of NIH/NIAID have just published 4 genomes from the Mali outbreak:

http://www.ncbi.nlm.nih.gov/nuccore/KP260799.1

Very interestingly, these also group with the Sierra Leone 2 cluster. All 4 genomes cluster together, but 1 is genetically distinct from the other 3.

The WHO reports 2 introductions into Mali: a 2-year-old infant who died on the 24th of October and a 70-year-old man who died on the 27th of October. The former case had no reported secondary cases, whereas the latter had 5 reported secondary cases.

This suggests that some of the Guinea outbreak was also seeded from the Sierra Leone outbreak and established local diversity in Guinea before the 2 independent introductions into Mali. This would also be concordant with the resurgence in cases in Guinea in May/June after the initial outbreak was thought to be under control.

I had a quick look at the SNPs. There are 14 mutations in the clade: 4 shared by all isolates (1 intergenic, 1 non-synonymous and 2 synonymous, one of which is homoplasic with the 1995 Zaire/1996 Gabon clade), 5 shared by 3 isolates (no intergenic, 1 non-synonymous, 4 synonymous) and 5 unique to individual isolates (2 intergenic, 1 non-synonymous and 2 synonymous). I suspect the homoplasic synonymous mutation (at site 14254) might be genuine since all 4 genomes were Sanger-sequenced.
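
As a sanity check on the tally above, the counts can be laid out as a small table and summed by sharing pattern and by mutation type. This is only a minimal Python sketch: the numbers are transcribed from the post, while the variable names and layout are illustrative assumptions, not part of the original analysis.

```python
from collections import Counter

# Counts transcribed from the post: sharing pattern -> mutation type -> count.
# The dictionary name and keys are illustrative, not from any published pipeline.
mali_clade_mutations = {
    "shared by all 4 isolates": {"intergenic": 1, "non-synonymous": 1, "synonymous": 2},
    "shared by 3 isolates": {"intergenic": 0, "non-synonymous": 1, "synonymous": 4},
    "unique to one isolate": {"intergenic": 2, "non-synonymous": 1, "synonymous": 2},
}

# Number of mutations in each sharing pattern.
per_pattern = {
    pattern: sum(types.values()) for pattern, types in mali_clade_mutations.items()
}

# Number of mutations of each type, summed across sharing patterns.
per_type = Counter()
for types in mali_clade_mutations.values():
    per_type.update(types)

total = sum(per_pattern.values())

print(per_pattern)
print(dict(per_type))
print("total mutations:", total)
```

The three sharing patterns should add up to the 14 mutations quoted for the clade, which is a quick way to catch a transcription slip.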

@evogytis - you’re referring to Mali, right? Are any of those SNPs shared with Sierra Leone / Liberia?

@arambaut - are you suggesting clade 2 in SL -> GIN -> Mali? Unless any of these SNPs are shared with Sierra Leone, it’s also possible that it was just GIN -> Mali from the same Guinean clade 2 pool that seeded SL in May.

@evogytis - the homoplasic-with-Kikwit SNP could still be a technical artifact (doesn’t have to be, but I wouldn’t rule it out). The original Baize sequences (accessions ending in .1) had such imputation artifacts in a dozen regions of the genome, and those were Sanger-sequenced.

@dpark Yes, those SNPs are from the Mali sequences. Sorry, should’ve made that clearer. All other mutations are the same ones that define the SL2 clade.
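
The question of which SNPs are shared between the Mali genomes and the Sierra Leone / Liberia sequences comes down to a set comparison of variant positions. A minimal Python sketch follows, assuming each genome set has already been reduced to a list of variant positions against a common reference. The function name `partition_snps` is hypothetical, and every position below except 14254 (mentioned earlier in the thread) is a placeholder, not a real site.

```python
def partition_snps(query_sites, reference_sites):
    """Split query SNP positions into those also present in the
    reference set (shared) and those found only in the query (private)."""
    query, reference = set(query_sites), set(reference_sites)
    return sorted(query & reference), sorted(query - reference)


# Placeholder positions for illustration only. Apart from 14254, which is
# mentioned in the thread, none of these are real sites from the Mali or
# Sierra Leone genomes.
mali_sites = [100, 200, 14254]
sierra_leone_sites = [100, 300]

shared, private = partition_snps(mali_sites, sierra_leone_sites)
print("shared with Sierra Leone:", shared)
print("Mali only:", private)
```

With real data, positions falling in the shared list would argue for SL -> GIN -> Mali, while an empty shared list (beyond the clade-defining sites) would leave GIN -> Mali from a common Guinean pool on the table.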