Data sharing

Sharing the full data sets underlying the results in your article brings many benefits. It enables reuse, reduces research waste, and promotes collaboration. Greater transparency increases trust in research results by allowing results to be independently verified. These benefits lead to a more reliable evidence base and a healthier world.

Many funders now require that the data sets from the studies they fund be shared. A list of research funders that mandate data archiving is provided by SHERPA/JULIET.

BMJ data sharing policy

BMJ has three tiers of data sharing policy:

Tier 3: 

  • We strongly encourage that data generated by your research that supports your article be made available as soon as possible, wherever legally and ethically possible.

Tier 2: 

  • We strongly encourage that data generated by your research that supports your article be made available as soon as possible, wherever legally and ethically possible
  • We require data from clinical trials to be made available upon reasonable request
  • We require that a data sharing plan be included with trial registration for clinical trials that begin enrolling participants on or after 1 January 2019. Changes to the plan must be noted in the Data Availability Statement and updated in the registry record (to comply with ICMJE recommendations)

Tier 1:

  • We require that the data generated by your research that supports your article be made openly and publicly available upon publication of your article. Where it is not possible or viable to make data openly available (due to confidentiality or sensitivity issues), they should be shared through a controlled access repository.

 

What data should be shared?

We encourage you to make available as much of the underlying data from your article as possible (without compromising participant privacy), but at least the minimum data required to reproduce the results presented in the associated article.

We consider any files generated by your research as constituting relevant data. This may be raw or processed data. Examples include (but are not limited to):

  • Individual-level deidentified patient data
  • Survey results
  • Interview transcripts
  • Statistical code
  • Images
  • Videos
  • Spreadsheets
  • Audio files
  • Text files
  • Imaging and scan files

To enable reuse and enhance reproducibility, all data should be shared in the file format in which they were originally generated, for example:

  • Images should be provided as .png, .jpg, .eps, etc.
  • Text files should be provided as .txt, .doc, .rtf, etc.
  • Spreadsheets should be provided as .csv, .xls, .tsv, etc.
  • Videos should be provided as .mp4, .avi, .mov, etc.
  • Imaging and scan files should be provided as .img, .dcm, .hdr, etc.

Data should not be shared in any way that could compromise participant anonymity or privacy, and data should not be shared if that would require the authors to break any laws or licensing agreements. If the data used were licensed from a third party, the data availability statement should explain how to obtain a licence for that data.

Where a research community in a particular field has established standards for what, where, and how data should be shared, we expect authors to meet those.

While data sharing is not mandatory in most of our journals, BMJ reserves the right to request at any time confidential access to any primary data needed to reproduce the article so that the results reported can be verified. Editors may also use data availability statements to inform their editorial decisions.

How to share data

For clinical data (Individual Participant Data) we request that you use controlled access repositories, such as clinicalstudydatarequest.com, the YODA project, or Vivli. Please see this article for current practical guidance on clinical trial data sharing.

For pre-clinical data we recommend using recognised subject-specific repositories, such as GenBank, where relevant and available.

There are also a number of recognised, general repositories in which to deposit data, for example, DRYAD, OSF, FigShare and Zenodo. FAIRsharing and re3data.org provide a curated list of repositories.

Data from clinical trials

Where required, reports of clinical trials must include a commitment to make relevant anonymised patient-level data available on reasonable request (see this editorial in The BMJ for further explanation). This policy applies to any research article that reports the outcomes of a clinical trial of any type of intervention.

The International Committee of Medical Journal Editors (ICMJE) requires that clinical trials that begin enrolling participants on or after 1 January 2019 include a data sharing plan in the trial’s registration. We encourage all authors to follow this best practice; for The BMJ it is compulsory. The ICMJE’s policy regarding trial registration is explained here. Read more at BMJ 2017;357:j2372

What and how to cite data

We recommend that all data used in the writing of an article are cited in the reference list – whether they are data generated by the author(s) or by other researchers. Where data are publicly available these should be cited in the reference list in line with the recommendations of DataCite. Please use the following format:

Creator (PublicationYear). Title. Version. Publisher. ResourceType. Identifier
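
As an illustration only, the format above can be assembled with a short Python sketch. The class name, field names, and all example values below are hypothetical placeholders, not a real data set or an official tool. The sketch also assumes that Version and ResourceType may be left out when they are not known; please check the DataCite recommendations before relying on that.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetCitation:
    """Hypothetical helper holding the elements of a data citation."""
    creator: str
    publication_year: int
    title: str
    publisher: str
    identifier: str
    version: Optional[str] = None        # assumption: treated as optional
    resource_type: Optional[str] = None  # assumption: treated as optional

    def format(self) -> str:
        # Order: Creator (PublicationYear). Title. Version. Publisher. ResourceType. Identifier
        parts = [f"{self.creator} ({self.publication_year})", self.title]
        if self.version:
            parts.append(self.version)
        parts.append(self.publisher)
        if self.resource_type:
            parts.append(self.resource_type)
        # The identifier goes last, with no trailing full stop after it.
        return ". ".join(parts) + ". " + self.identifier


# Placeholder values for illustration; this is not a real data set or DOI.
citation = DatasetCitation(
    creator="Example A, Sample B",
    publication_year=2020,
    title="Hypothetical trial dataset",
    publisher="Example Repository",
    identifier="https://doi.org/10.0000/example",
    version="Version 1",
    resource_type="Dataset",
)
print(citation.format())
```

With these placeholder values the sketch prints: Example A, Sample B (2020). Hypothetical trial dataset. Version 1. Example Repository. Dataset. https://doi.org/10.0000/example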

Data availability statements

Authors are asked to pick from the following standardised Data Availability Statements and are able to add additional information as free text. Please see the ICMJE recommendations for how to compose a rich statement.

  • Data are available in a public, open access repository
    Please give the repository name, the persistent URL, and any conditions of reuse (e.g. licence, embargo).
  • Data are available upon reasonable request
    Please state what the data are (e.g. deidentified participant data), who the data are available from, their publishable contact details (e.g. a generic lab email address or an individual’s ORCID identifier – please ensure you have permission) and under what conditions reuse is permitted. Is there additional information available (e.g. protocols, statistical analysis plans)?
  • Data may be obtained from a third party and are not publicly available
    Please state what the data are (e.g. deidentified participant data), who the data are available from, their publishable contact details (e.g. a generic lab email address or an individual’s ORCID identifier – please ensure you have permission), and under what conditions reuse is permitted. Is there additional information available (e.g. protocols, statistical analysis plans)?
  • There are no data in this work
    This option is usually for editorials or opinion pieces.
  • All data relevant to the study are included in the article or uploaded as supplementary information
    Please ensure this does not include patient identifiable data.
  • No data are available
    This option is for Tier 3 journals only, non-clinical trials in Tier 2 journals (unless agreed in advance by Editor), or only as agreed in advance by an Editor in Tier 1 journals.

How to access data that are available upon request

Data requesters should do the following:

  • Email the corresponding author for the paper to request the relevant data.
  • For requests to The BMJ, a Rapid Response must also be submitted to the article. (Please see The BMJ’s specific instructions)
  • Provide the authors of the article with a detailed protocol for the proposed study, and supply information about the funding and resources you have to carry out the study.
  • If appropriate, invite the original author[s] to participate in the re-analysis.
  • If a month elapses without a response from the authors, please email the editorial office of the relevant journal.
  • The journal will assess the request and, if appropriate, will encourage the authors or their institution to share the data, although BMJ is not in a position to compel data release or broker agreements.