Publishing and sharing

Copyright and Research Data

What is research data?

There are many types of research data, and they differ between disciplines. Traditional research data may be thought of as data in spreadsheets, lab or field work, or interview answers from participants. Research data can also be multimedia or audio-visual files, annotations or sketchbook work.


Raw data, or facts, are technically not copyrightable. However, datasets fall under the category of a literary work in the Copyright, Designs and Patents Act 1988: the way the data is arranged and presented in a dataset gives it copyright protection.


  • Copyright considerations for research data are similar to those for research outputs (articles, book chapters, monographs etc.). Specifically, the educational exceptions apply to datasets. This means datasets can be used for teaching, research and private study for non-commercial purposes. However, using third-party data beyond teaching and private study (for example, to underpin a published work) will require checking permissions.


    If a dataset is found within a research output or other medium (such as a blog or webpage), the “patchworking” approach to licensing may need to be considered. Essentially, a research output or content on a webpage may be given a copyright licence or terms, but this doesn’t necessarily mean that all the datasets or figures included in the output are under the same licensing terms. A dataset may be a secondary source that the author of the output has been granted permission, or a non-exclusive licence, to use for the purposes of that output only.

  • As a creator of research data, you are encouraged to give your datasets a copyright licence, whether that is an open licence (such as a Creative Commons licence) or a statement of All Rights Reserved, meaning direct permission is required. In some cases, creators may give their datasets a CC0 licence, placing them in the public domain.

    However, given the nature of some data types, and to align with good research data management practices, researchers should spend time considering copyright and making sure all datasets are suitable for access. This includes ensuring all datasets are clear of identifiable participant information and that all future use of the work is ethical and aligns with the creators' intentions or the original purpose of the research.

    Examples of licensing considerations include:

    • If a dataset requires an open licence, should it allow modifications/derivatives for future use?
    • Should commercial reuse of the data be allowed?

    Open licences such as Creative Commons licences are non-exclusive licences. Creators of datasets can grant additional permissions to users under different terms. In some cases, such as medical research and other sensitive topics, datasets may not be allowed to be made public. Most research requires ethical clearance and a data management plan, which should be discussed with academic supervisors, project leads or funders in the first instance.

  • As the research landscape evolves, more publishers and research funders require datasets to be made open on repositories. This is often supported by Data Access statements. A research dataset can underpin or supplement a research output, or it can be an independent dataset. Either way, a dataset deposited on a repository will gain its own DOI and, depending on the publishing contract or terms*, the creator can assign a specific open licence to it.

    In most cases this means that open datasets can be treated as standalone works. Depending on the open licence applied, the dataset may be reused, developed and applied to different research outputs. For example, a research article may be licensed as CC BY-ND, or its copyright may have been transferred to the publisher. The article may include a dataset, created by the same author, which is uploaded to a data repository. That dataset may be licensed under CC BY, with the author as the copyright holder. This allows the creator and other users of the dataset to use it for additional purposes without impacting or infringing the connected research output.

    *It is important to check all publishing contracts for their terms on open data and data access statements.