Transparent analysis of raw COVID-19 data: lack and low quality of raw data

The Galaxy Project and HyPhy/Datamonkey teams are partnering in making analyses of COVID-19 data globally available:

Most of the analyses we describe begin with raw data (sequencing reads). The bottom line so far is:

  1. There are very few raw datasets
  2. Some raw datasets contain no COVID-19 data at all
  3. Lack of raw data prevents assessing the extent of viral heterogeneity within a sample, for example
    an A-to-C substitution (MAF 38%) at position 24,323 (resulting in Lys921Gln in protein S) in sample “wuhan2”
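
For anyone wondering how numbers like "MAF 38%" and "Lys921Gln" relate to the reads, here is a minimal sketch and not the pipeline we actually ran. The read counts are hypothetical, chosen only so the result is 38%. The S coding-sequence start of 21,563 is an assumption based on the NC_045512.2 reference annotation; the function names are made up for illustration.

```python
from collections import Counter


def minor_allele_frequency(base_counts):
    """Return (base, frequency) for the second most common base at a site."""
    total = sum(base_counts.values())
    if total == 0:
        raise ValueError("no reads cover this site")
    ranked = Counter(base_counts).most_common()
    if len(ranked) < 2:
        # Only one base observed: no minor allele at this site.
        return None, 0.0
    minor_base, minor_count = ranked[1]
    return minor_base, minor_count / total


def codon_number(genome_pos, cds_start):
    """Return (1-based codon index, 1-based position within that codon)."""
    offset = genome_pos - cds_start
    return offset // 3 + 1, offset % 3 + 1


# Hypothetical per-base read counts at genome position 24,323.
counts = {"A": 62, "C": 38}
print(minor_allele_frequency(counts))

# Assumed start of the S coding sequence (NC_045512.2 coordinates).
S_CDS_START = 21563
print(codon_number(24323, S_CDS_START))
```

Under these assumptions the site falls on the first base of codon 921, which is consistent with an A-to-C change turning a lysine codon (AAA or AAG) into a glutamine codon (CAA or CAG). Without the raw reads there are no per-base counts to feed into a calculation like this, which is the point of item 3.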

New high-quality raw reads from the University of Wisconsin. This is the first truly high-quality set of raw reads.

I had to figure out how to use the :blush: emoji