ZIKV sequence from Florida

Data here.
Protocol here.
Blog post here.

In collaboration with Diogo Magnani from the University of Miami, we (the Andersen Lab) recently received plasma and saliva from two people with Zika virus infections living in the Miami area. Using the amplicon-based approach we previously used to sequence Zika virus from travel-related cases in Florida, we sequenced the entire coding sequence from the saliva of one individual believed to have been infected in or near Miami. This sequence (ZL2_Hu0015) is related to other Zika viruses recently sequenced from the Americas (Figure 1), indicating that the current Zika virus outbreak in Florida was the result of an imported infection (human or mosquito) from the ongoing epidemic in the Americas. It is not the result of a separate introduction from Africa or of earlier circulating Asian strains. ZL2_Hu0015 is 99.95% identical (5 mismatches) to a Zika virus that we recently sequenced from a traveler coming into Florida from Cuba (ZF10_010U) on June 2nd, 2016. We did not detect any contaminating reads from ZF10_010U in our ZL2_Hu0015 data set (libraries prepared separately with different reagents), demonstrating that these are independent samplings of related viruses. Although the data are suggestive, we do not have enough data at this time to conclusively determine the direct origin of the virus. Further sequencing of Zika viruses recovered from autochthonous transmission will help to resolve whether more than one Zika virus introduction contributed to the outbreak in Florida.

We did, however, detect some contaminating reads in our sequencing data belonging to PRVABC59, our commonly used lab control. While we are working to remove this contamination, we believe that it did not alter the consensus sequences we are reporting here.
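
For readers who want to check the identity figure quoted above, it follows directly from the mismatch count and the compared length. The sketch below is only illustrative: the compared length of 10,272 nt (an approximate Zika virus coding sequence length) is an assumption, since the exact aligned length is not given in this post.

```python
def percent_identity(mismatches: int, aligned_length: int) -> float:
    """Percent identity between two aligned sequences of equal length."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return 100.0 * (1.0 - mismatches / aligned_length)


# ASSUMPTION: about 10,272 nt were compared (approximate Zika virus CDS length).
ASSUMED_CDS_LENGTH = 10_272

# 5 mismatches is the count reported for ZL2_Hu0015 vs. ZF10_010U.
identity = percent_identity(5, ASSUMED_CDS_LENGTH)
print(f"{identity:.2f}%")  # prints 99.95%
```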

In collaboration with Scott Michael and Sharon Isern from Florida Gulf Coast University, we also recently sequenced ZIKV from travel-associated cases in Florida, including the first sequence from Cuba. Data here. Blog post here.

Update #1 (11Sep16)

Data here.
Blog post here.

Three pools of Aedes aegypti mosquitoes collected in Miami Beach on August 22nd and 23rd, 2016, were found to be infected with Zika virus. Through our collaborators, Scott Michael and Sharon Isern from Florida Gulf Coast University, we recently received samples of these mosquitoes for sequencing using our amplicon-based protocol for MiSeq. We obtained ~7.5 million 250 bp Zika virus reads per sample, giving us a mean coverage depth of ~160,000 reads per site. Previous cross-contamination issues have been mostly resolved, as we detected only 346 reads aligning to Zika virus in our water controls. The Zika virus genomes obtained from the mosquitoes formed a distinct clade (strong bootstrap support) with a Zika virus sequenced from a Miami infection and one from a traveler returning to Miami from Cuba. The close relationship between the sequences suggests that 1) there is Ae. aegypti-borne Zika virus transmission in the Miami area, 2) the traveler from Cuba may actually have been infected in Miami, and 3) the outbreak was initiated by a single Zika virus introduction from the ongoing epidemic in the Americas. At this point, however, it is difficult to determine the exact origin of the introduced virus. More Zika virus data are needed to help resolve the tree.
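
As a rough sanity check on the depth figure above, mean depth can be approximated as total sequenced bases divided by genome length. This is a back-of-the-envelope sketch, not the calculation actually used for the post: the genome length of 10,800 nt is an assumed round value, and the result is an upper bound because primer trimming and clipping reduce the number of aligned bases.

```python
def mean_depth(n_reads: float, read_length: int, genome_length: int) -> float:
    """Approximate mean per-site depth: total sequenced bases / genome length."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return n_reads * read_length / genome_length


# ASSUMPTION: Zika virus genome length rounded to 10,800 nt.
ASSUMED_GENOME_LENGTH = 10_800

# ~7.5 million reads of 250 bp per sample, as reported in the post.
upper_bound = mean_depth(7.5e6, 250, ASSUMED_GENOME_LENGTH)
print(f"upper bound on mean depth: ~{upper_bound:,.0f}x")
```

Under these assumptions the upper bound comes out a little above 170,000x, which is in line with the reported ~160,000 once some bases are lost to trimming.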

The Zika virus tree was created using RAxML. Blue = Zika virus-infected Ae. aegypti collected in Miami Beach, red = suspected local human infection, and purple = suspected travel-associated infection.

We will be sequencing additional clinical samples from Florida Zika virus infections this week. Data will be released ASAP.

Feedback encouraged and openly welcome.


Just added new Zika virus sequencing data from 3 pools of adult Ae. aegypti mosquitoes collected in Miami Beach. More clinical data coming soon…

In collaboration with the Florida Department of Health, we (Center for Genome Sciences, USAMRIID) have generated 11 Zika virus (ZIKV) genomes from urine samples collected from patients in Florida with locally acquired ZIKV infections. All genomes were sequenced using Illumina in combination with RNA Access targeted enrichment. Illumina reads were aligned to BeH819015 (KU365778.1 with missing terminal UTR regions filled with sequence from MR766) and new consensus sequences were generated. A minimum of 3x read depth (in support of the consensus base) was required to make a consensus call. Across the different samples, genome coverage ranged from 57% to 99.8%. All samples were collected from patients in the Miami region between July 28 and August 31, 2016.
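
The consensus rule described above (at least 3 reads supporting the called base) can be sketched as follows. This is an illustration rather than the pipeline used for these genomes: the pileup-as-strings input, the function names, and the choice to mask uncalled positions with N are all assumptions made for the example.

```python
from collections import Counter

MIN_SUPPORT = 3  # minimum reads supporting the consensus base (from the post)


def call_consensus(pileup_columns, min_support=MIN_SUPPORT):
    """Call one base per reference position.

    pileup_columns: iterable of strings, one per position, holding the bases
    observed in aligned reads at that position (hypothetical input format).
    Positions without enough support are masked with 'N' (an assumption).
    """
    consensus = []
    for column in pileup_columns:
        counts = Counter(b for b in column.upper() if b in "ACGT")
        if not counts:
            consensus.append("N")
            continue
        base, support = counts.most_common(1)[0]
        consensus.append(base if support >= min_support else "N")
    return "".join(consensus)


def coverage_fraction(consensus):
    """Fraction of positions with a called (non-N) base."""
    if not consensus:
        return 0.0
    return sum(1 for b in consensus if b != "N") / len(consensus)


# Toy pileup: positions 2 and 4 lack 3 supporting reads and are masked.
example = ["AAAA", "CC", "GGGT", "", "TTTA"]
seq = call_consensus(example)
print(seq, coverage_fraction(seq))  # prints ANGNT 0.6
```

Counting only reads that agree with the majority base, rather than total depth, means a position covered by three discordant reads stays uncalled.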

All of these sequences form a phylogenetic clade together with the 4 Florida-derived genomes generated by the Andersen Lab [1 from a locally acquired human infection (http://andersen-lab.com/zika-sequence-local-florida-transmission/) and 3 from Aedes aegypti mosquitoes (http://andersen-lab.com/zika-sequences-miami-mosquitoes/)], 1 genome generated from a patient in Florida with a history of travel to Cuba (also from the Andersen Lab; http://andersen-lab.com/travel-related-cases-florida/) and 1 genome from a patient in Guadeloupe (Atkinson et al.; https://www.ncbi.nlm.nih.gov/nuccore/KX673530). However, the genomes generated from the locally acquired cases in Florida form two distinct subclades within this group, which are separated from each other by 18 substitutions. Ten human cases and two mosquito isolates belong to subclade #1 and two human cases and one mosquito isolate belong to subclade #2. The genome from Guadeloupe is basal to subclade #2, while the genome from the patient with a history of travel to Cuba is basal to subclade #1 in the MCC BEAST tree, but nested within subclade #1 in the ML tree.

Based on epidemiological data combined with tMRCA estimates, we believe that these results are most consistent with at least two introductions resulting in local transmission in Florida. Under this hypothesis, we estimate that both introductions likely occurred during the first half of 2016 (median estimates during April 2016). Alternatively, there could have been a single introduction followed by diversification within Florida (and export from Florida). However, based on the molecular clock analysis, this would require an introduction to Florida sometime in 2015 (median estimate during July 2015).

MLtree_10-2-16.pdf (8.9 KB)

Figure 1. Maximum-likelihood phylogeny constructed using PhyML (GTR+G) with 100 bootstrap replicates.

BEAST_MCCtree_10-2-16.pdf (193.0 KB)

Figure 2. BEAST MCC tree with lognormal uncorrelated relaxed clock. Magenta bar represents the 95% HPD estimate for the tMRCA of all locally acquired Florida genomes. Red and blue bars represent tMRCA estimates for the two Florida subclades. Our genomes are highlighted in green; we only included those with >75% genome coverage. KX673530 from Guadeloupe was not included because this sequence is known to have been determined from a virus grown in cell culture.

Sequences available for download here: https://github.com/jtladner/ZIKA_Florida/tree/master/sequences

This is very interesting. Does anyone know how the Florida cases were being detected - was it more severe cases presenting to medical care or systematic surveillance? If the former, then it is possible there are many sub-clinical infections and an introduction date going back to early 2016 seems plausible. The diversity in Florida does seem striking.

BEAST is pushing the Cuban traveller basal in the cluster due to the much earlier sampling time of the genome (and the largely equivocal relationship with the rest of the cluster). I don’t know which of the two trees is correct though, and this has implications for whether the Cuban traveller was infected in Miami or not. The two introductions seem pretty reasonable (or a much older hidden outbreak in Miami).

Dear Jason
Thanks for sharing. Those results agree with our most recent analyses of an expanded set of genomes generated by Kristian et al. Current hypothesis is that the “Cuban” patient was infected locally and previous travel to Cuba was coincidental. That isolate jumps around the cluster as sample composition changes, so its phylogenetic position isn’t well resolved.

Hi Jason (& Gus), welcome to the fray!

Just a quick note: did you pull Kristian’s FL sequences recently, or a while ago? The initial versions of those genomes had some sequencing errors which have very recently been corrected after comparing against independently generated data from the Broad on the same samples. Early versions of those genomes will certainly have inflated branch lengths on the trees and will mess up the tMRCA. Maybe @grubaugh can comment on when exactly those genomes were corrected online.

Hi Andrew, I’m pretty sure that all of the cases we sequenced are from individuals that were symptomatic and sought medical care. Therefore, there have likely been many undocumented cases.

Hi Danny, thanks for the heads up! It’s probably been over a week since I downloaded the genomes from the Andersen lab. I’ll grab the updated genomes and rerun the analysis.

Hi Oli,

That makes sense to me. The “Cuban” isolate shares all eight substitutions that are diagnostic for subclade 1 based on the genomes we sequenced (i.e., present in all subclade 1 genomes). So, as Andrew said, the placement in the BEAST analysis is really just driven by sampling date. And actually, I didn’t have the exact date for that case when I ran the analysis, so I only provided BEAST with the sampling year and then utilized the variable precision option.
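
For anyone wanting to reproduce this kind of check, here is a minimal sketch of how diagnostic sites could be listed from an alignment. The definition used (every subclade genome carries the same base, and that base differs from the reference) is one reading of "present in all subclade 1 genomes"; the function names and toy sequences are hypothetical and are not the actual data.

```python
def diagnostic_sites(reference, subclade_genomes):
    """Return 0-based alignment positions where all subclade genomes share
    one unambiguous base that differs from the reference."""
    if any(len(g) != len(reference) for g in subclade_genomes):
        raise ValueError("all sequences must be aligned to the same length")
    sites = []
    for pos, ref_base in enumerate(reference):
        bases = {g[pos] for g in subclade_genomes}
        if len(bases) != 1:
            continue  # subclade genomes disagree, or no genomes were given
        (base,) = bases
        if base in "ACGT" and base != ref_base:
            sites.append(pos)
    return sites


def shares_all(query, reference, subclade_genomes):
    """True if the query carries the subclade base at every diagnostic site."""
    sites = diagnostic_sites(reference, subclade_genomes)
    return all(query[p] == subclade_genomes[0][p] for p in sites)


# Toy alignment (hypothetical sequences, not real ZIKV data).
ref = "ACGTACGT"
subclade = ["ACTTACGA", "ACTTACGA", "ACTTGCGA"]
print(diagnostic_sites(ref, subclade))        # prints [2, 7]
print(shares_all("ACTTACGA", ref, subclade))  # prints True
```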

Sorry, I forgot to update here with the new sequences: https://github.com/andersen-lab/zika-florida

The issue was that not all of the primers were removed from the reads (so technically a bioinformatics error, not a sequencing error). We are now simply cutting the first 22 nt from each read instead of specifically removing the primer sequences from the 5’ end. Also included in this update are additional human and mosquito-derived ZIKV sequences. All but 2 have complete CDS. We will finish filling in those gaps this week and will post here with those updates.
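
The fixed-length trim described above could look something like the sketch below. It is only an illustration of the idea, not the tool or script actually used: it assumes plain four-line FASTQ records, and dropping reads that are 22 nt or shorter is a choice made for the example.

```python
import io

TRIM_LENGTH = 22  # number of 5' bases removed from each read (from the post)


def trim_fastq(in_handle, out_handle, n=TRIM_LENGTH):
    """Remove the first n bases and quality values from every FASTQ record.

    Assumes four-line records. Reads with n or fewer bases are dropped,
    which is an assumption made for this sketch.
    """
    while True:
        header = in_handle.readline()
        if not header:
            break  # end of input
        seq = in_handle.readline().rstrip("\n")
        plus = in_handle.readline()
        qual = in_handle.readline().rstrip("\n")
        if len(seq) <= n:
            continue  # nothing left after trimming
        out_handle.write(header)
        out_handle.write(seq[n:] + "\n")
        out_handle.write(plus)
        out_handle.write(qual[n:] + "\n")


# Toy record: 22 primer-like bases followed by 8 bases of insert.
record = "@read1\n" + "A" * 22 + "CGTACGTA\n" + "+\n" + "I" * 30 + "\n"
out = io.StringIO()
trim_fastq(io.StringIO(record), out)
trimmed = out.getvalue()
print(trimmed, end="")
```

A fixed cut is blunter than matching primer sequences, but it cannot miss a primer because of a sequencing error or an off-by-one in the primer list.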



Thanks so much for sharing this data. Together with @richard.neher, I’ve been maintaining the website nextstrain.org/zika/. I’ve added the Florida USAMRIID sequences to the website. You can see these samples by selecting “Ladner et al” under “Authors”:

Please let me know if you’d like the attribution “Ladner et al” changed.

With the Broad + USAMRIID + Scripps data, we’re now getting a fair amount of resolution of the Florida outbreak: