WHO Statement on Data Sharing

It was great to see this statement about data sharing coming out of the recent WHO consultation that I know Andrew was involved with:


I wonder if those present could provide any more insights into the discussion. Wish I could have been there!

Developing Global Norms for Sharing Data and Results during Public Health Emergencies

WHO Consultation 1-2 September 2015: Summary and Key Conclusions

In line with open access policies, the timely sharing of information on clinical, epidemiologic and genetic features of emerging infectious diseases as well as information on experimental diagnostics, therapeutics and vaccines, is critical for actions during a rapid public health response.

WHO held a consultation in Geneva, Switzerland, on 1-2 September 2015 to advance the development of global norms on data and results sharing in public health emergencies. Government representatives, public health agencies, scientists, research funders, ethicists and industry representatives attended the consultation. Acknowledging the years of work that many groups have engaged in to support data sharing in health research, the following consensus emerged from the meeting specific to the emergency perspective.

It was recognized that epidemiologic data belong to the countries where they are generated, but there was consensus that the default option is that data should be shared (i.e. opt-out policy) to ensure that the knowledge generated becomes a global public good.

It was also recognized that pathogen genetic sequence and associated clinical and epidemiologic data are of greatest value if made openly available, in as close to real-time as possible, during a public health emergency.

It was unequivocally agreed by representatives from leading biomedical journals that public disclosure of important information of potential relevance to public health emergencies should not be delayed by publication timelines, and that pre-publication disclosure must not and will not prejudice journal publication. It was agreed that pre-publication information sharing should become the global norm in the context of public health emergencies. Researchers should take the responsibility to ensure that results – even when preliminary – are adequately robust and have undergone quality control, prior to public disclosure to enable an evidence-based dialogue with the media and communities.

Outside emergencies, 12 months is often considered an appropriate timeframe from study completion to public disclosure. In the emergency context, there was unanimity that 12 months should be greatly shortened from the time that interim results are available for public disclosure and that a specific expedited timeline commitment for results sharing should be made in protocols and analysis plans before trial commencement.

There was consensus that the risks and potential harms to individuals of non-disclosure of important information provide a strong ethical rationale for rapid sharing of data.

The meeting reaffirmed that funders and sponsors have a crucial role to play in requiring that expedited timelines for sharing of data and interim results in public health emergencies are a precondition for approval of study initiation, disbursement of funds and in monitoring compliance.

Participants called upon all researchers from public and private sectors to make data publically available, including results from studies that are inconclusive or have not led to the expected results.

The meeting agreed on a series of short-, medium- and long-term actions that can be implemented to build on achievements and lessons learned during the Ebola crisis to improve data and results sharing during the next emergency. These will be developed and submitted to the WHO emergency R&D preparedness blueprint Advisory Group for endorsement and action by WHO and partners.

The meeting recognized the imperative for capacity strengthening and creation of an enabling environment in low- and middle-income countries that is conducive to locally led research and structures for data sharing.

A longer version of the meeting report will be made available in the next weeks.

@nyozwiak - any comments from Geneva? Sounds like it went pretty well!

Yeah, I think the WHO outbreak data sharing meeting went really well!

First of all, it was really great for Bronwyn and me to spend such quality time with @arambaut, @pk5 and @c_fraser, and I thank them all so much for joining us in Geneva.

At risk of violating the meeting’s Chatham House rules, I’ll just say that I think all of our talks complemented each other nicely, were very positively received by the meeting organizers, and worked to reinforce the main takeaways we wanted to share. Following our panel, we broke out into working groups tasked with identifying guiding principles, major roadblocks, possible working models, and next steps/recs, in our case for the specific issue of viral genetic data in outbreaks. We summarized the goal as: “In an outbreak there should be a commitment to turn a portion of RESIDUAL diagnostic nucleic acid into publicly available pathogen genomes at NO additional cost to the country(s) experiencing the outbreak & that the data leads to OPEN, ACTIONABLE & INTERPRETABLE INFORMATION.”

Overall, I think the meeting practically helped advance the urgency of developing pathogen genome sequence sharing guidelines during emergencies and - importantly - with a larger eye to the future of expanding the role of sequencing to inform outbreak response.

The WHO should be preparing a much more detailed meeting report soon and may be incorporating some text we prepare summarizing our breakout session efforts. Going forward, I hope and suspect the WHO may then convene a standing working group to advise and formalize outbreak genetic data sharing guidelines.

It is great to see this statement and that the WHO and others are taking this seriously. I am still skeptical about its implementation, however, and I’m unsure about some of the language in the statement. Specifically:

  1. What defines an emergency? The Ebola outbreak wasn’t declared an emergency until August 2014 and I’m not sure that e.g. MERS has been declared an emergency (yet). Hence no need for data sharing until the WHO declares an outbreak an emergency?

  2. The wording allows for ‘opting out’ of this agreement - in other words, while ‘opting in’ is default, individual groups can still choose not to share information.

  3. What is the proposed timeline for release of data? The wording mentions ‘significantly less than 12 months’, but could that be 6 months? 8 months? 4 months? 2 weeks?

  4. A comment on publications - I agree that the release of data shouldn’t preclude publication. In fact, I’d like a much stronger statement (from journals, NIH, etc) stating that public deposition of data prior to publication is required for publication in X journal. Plus if work is funded by NIH, PHE, other government/non-profit organizations, then the data must be released immediately for continued funding.

Overall though, it’s a very good thing to get a statement from the WHO on the importance and requirement for data sharing. While I remain skeptical of whether we’ll see real change, I’m hopeful that this will pave the way for more openness and sharing.

This was discussed and in the feedback from the sequence data breakout it was pointed out that a lower threshold was needed (or additional levels of emergency with lower thresholds).

See also the final paragraph of Margaret Chan’s speech to the Institute of Medicine Ebola workshop:


In the one-page report there was some conflation of the discussion about the various different types of data. We pointed out that most sequence data was made public at publication and that this was a requirement of publication (not the case for other data). The issue for sequence data was making it open quickly enough for it to be useful.

Most Ebola papers made data public before the papers were published - they just waited until they heard the paper was accepted. I think there is one time where making data public has utility to Public Health and that is as soon as it is obtained. Anything later and it becomes academic data and may as well be submitted to Genbank at publication.

The one big question remains which is how to give incentives and rewards for doing this.

Thanks for the clarifications Andrew.

Absolutely agree! I would say that no journal should agree to review a paper unless the data is already publicly available. In fact, how can you as a reviewer even review a paper properly if you don’t have access to the data? This really has to change - no other way around it. Funding agencies are already pushing this, but the push needs to be bigger and the journals need to get behind it (and yes, once accepted, such papers would also have to be open access… that’s another battle).

Agreed! I’m more hopeful about this though - I feel like the way papers and data (especially in the genomics space) are being discussed online these days, people will quickly know who’s the one sharing the data, and who isn’t. Give it a bit of time and that information should trickle down to funding committees, reviewers, journalists, etc. At the end of the day, that’s where we make our livelihood. That said, having formal guidelines and incentive mechanisms (e.g. being able to cite data depositions directly - though I don’t believe that’s good enough) will be critical for this to succeed as well.

Things will change when generating genome data is a routine event that diagnostic labs can do themselves. The rather laborious set of steps currently required means academic labs need to get involved, which is perhaps when issues of priority and hoarding raise their ugly heads (although I quite agree with Kristian’s view that journals and the community should punish such behaviour).

The question for me is: how do we establish a set of practices where genome data is open by default and only closed when there is a good reason (there have been a few examples during Ebola where sharing too early may pose ethical concerns)? This will require good systems and an early established culture of sharing, and academics need to be at the vanguard here to demonstrate real-life examples of how data integration from many sources can positively influence public health efforts.

I should clarify too that there is a difference between sharing between stakeholders (e.g. everyone involved in outbreak response, which should be done immediately) and putting things in Genbank for unfettered access by anyone (which should be done as soon as is practical).

I guess overall this is really the key question. Besides patient confidentiality, I don’t really see any reason why data should ever be behind lock and key? Publicly funded, publicly available - or at least that’s how it should be. My take on it is fairly simple - although massively naive:

  1. Data have to be published immediately
  2. Data should be considered equivalent to a publication - i.e. citable
  3. Journals should require data be available upon production of the data - not once the paper has been submitted, not once it has been accepted
  4. Funding agencies should require data be made available immediately upon production for continued funding
  5. Poaching of data should be discouraged - i.e. if you intend to publish (in a journal - not blogs, twitter, etc) on a dataset before the data producers, then you’d have to contact the data producers directly and get their permission (within a certain time period - and yes, we’re on a slippery slope here)

Overall I really don’t think this is super complicated stuff, or that it requires a lot of complicated rejigging of already established procedures, establishment of intricate practices, etc. - I think it just requires the big players to step up to the plate and show that open data is possible. We have had several outbreaks now showing that this is actually possible and is very helpful to outbreak response. We also have several other (many more…) examples of how not to do it - let’s make that a thing of the past.

And yes, I realize that I sound hopelessly naive, but having been part of the team releasing data during the Ebola outbreak, in hindsight I only see pros and absolutely no cons ;).

Possibly. What was interesting in the WHO meeting was the suggestion that this should possibly be considered an ethical issue, i.e. one needs to consider whether it was unethical to retain the data because it had a potential public health benefit. So perhaps a journal could consider whether the data was ethically problematic (because it had been withheld but could have had benefits) or not (because there were never any greater benefits to be gained by early release - or other ethical issues outweighed them). Thus data from, say, seasonal influenza is very unlikely to have any direct public health benefits from immediate release. The journal policy would then remain release at submission or release at acceptance.

The slight complication then comes from the fact that a researcher could have the samples but choose not to sequence until later at which point the sequences have no direct public health value.

Great to hear this directly from the WHO:

“Without the timely exchange of information on clinical, epidemiologic, and molecular features of an infectious disease, informed decisions about appropriate responses cannot be made, particularly those that relate to fielding new interventions or adapting existing ones. Failure to share information in a timely manner can have disastrous public health consequences, leading to unnecessary suffering and death.”

Yes, it is a strong statement which is good.

Personally I would be keen to translate some of these ideas into practical suggestions – not the dreaded word standards – for those generating, analysing and reporting sequence data during outbreaks in real-time. Is anyone else out there interested in this?

Oh, you mean making actual practical recommendations that we all have to stick to? ;-).

Yes, I think we should do that and I’m game. We have a great meeting coming up in June where I think a lot of the players will be around, so maybe organize a little side hoolie to come up with an actionable plan?

Whatever: just don’t say standards :)

Yes, that would be cool. Let’s aim to do something there then. The obvious venue is the Red Lion Pub!

Good idea about the Red Lion - would be interesting to have a session about this at the meeting too. Will suggest it.

Sheesh, I lived in Cambridge for five years and never went to the Red Lion Pub! I’d always hang out at the crummy places… Or Wrestlers (which is kinda crummy too…) - excellent Thai food!

Yes, Andrew, having a breakout session on this would be excellent!

Nothing new here, but the WHO is reiterating the importance of data sharing, so just FYI:

Thanks Kristian.

This phrasing is interesting:

WHO will advocate that pathogen genome sequences be made publicly available as rapidly as possible through relevant databases and that benefits arising out of the utilization of those sequences be shared equitably with the country from where the pathogen genome sequence originates. This refers only to the public sharing of sequences, not to biological samples, which will be subject to a separate WHO policy (in preparation).

How do we interpret the term benefits?

Also what does ‘shared equitably’ mean?

Wait, to generate pathogen genomic sequence data you need samples?

Yes, all very fluffy and not very actionable - my take is that WHO should focus on actually getting samples flowing (e.g. via reference labs) and then the genomic community will solve the other hurdle.