The important thing to remember about data as you reach the end of a project is that sharing your data openly can have brilliant outcomes, both for you as a researcher and for the development of your research area, but sometimes it isn’t possible to be truly open. This doesn’t mean that there is nothing you can do.
The core principles you should consider when planning and managing your data are those of FAIR. These principles apply whether your data is being made open or not – FAIR and Open are not mutually exclusive.
- Your outputs should be Findable – this means they should be discoverable by the wider academic community and the public. If your outputs can be made open, this is a case of choosing an appropriate repository, but this still applies if your data cannot be shared. For example, if your data is commercially sensitive or cannot be fully anonymised, it should still be made known that the data exists so that interested parties can still find out about the project. They may approach you for further details or to collaborate.
- They should be Accessible – you should make sure that you, or the system you choose, use unique identifiers, high-quality metadata, clear language and clear access protocols. This applies to whatever you might want to share, which could be the whole dataset, a derived subset of the data, the data that underlies a specific publication or even just a record of the project, as discussed above.
- They should be Interoperable – this means that any data structures or file types you use can be opened and used by others. This is an important consideration, as your data, code, annotations and any other files need to remain usable into the future, when the original software you used may no longer be available.
- Finally, they should be Reusable – this means enabling the repurposing of research outputs to maximise their potential. File types need to remain usable long after publication, as discussed above, and files also need to be annotated in such a way that someone else can use them accurately to either reproduce or build upon your research. If data cannot be shared, you can still take action. If you have used commercial or clinical datasets, or personally identifiable data that needs to remain restricted, you can share how you processed the data to reach your published conclusions, or the code you used. This will enable researchers who access the same data, or collect the same kind of data from different participants, to reproduce your research without access to the exact same data.
When combined, these four elements help lower barriers to research outputs and facilitate secondary researchers finding, understanding, reusing and repurposing your research to realise additional research opportunities and maximise existing resources, even if you can’t share the data in full.
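To make the idea of a discoverable record for restricted data a little more concrete, here is a minimal sketch of what a metadata-only description might look like. It is an illustration only: the function name, the field names and the example values (including the contact address) are all hypothetical and do not follow any particular repository's schema, so check what your chosen repository actually requires.

```python
import json

# Hypothetical access levels, chosen for this sketch only.
ALLOWED_ACCESS = {"open", "restricted", "closed"}


def build_dataset_record(title, description, access, contact,
                         identifier=None, file_formats=()):
    """Return a plain dict describing a dataset.

    The record can be published even when the data itself cannot,
    so that others can discover that the dataset exists.
    """
    if access not in ALLOWED_ACCESS:
        raise ValueError(f"access must be one of {sorted(ALLOWED_ACCESS)}")

    record = {
        "title": title,
        "description": description,
        "access": access,        # tells readers whether the data can be obtained
        "contact": contact,      # who to approach for details or collaboration
        "file_formats": sorted(file_formats),
    }
    # A persistent identifier is normally assigned by a repository,
    # so it is only included once one actually exists.
    if identifier is not None:
        record["identifier"] = identifier
    return record


# Example: a dataset that cannot be shared but should still be findable.
record = build_dataset_record(
    title="Interview transcripts (illustrative example)",
    description="Transcripts that cannot be fully anonymised.",
    access="restricted",
    contact="data-steward@example.org",  # placeholder address
    file_formats=["txt", "csv"],
)
print(json.dumps(record, indent=2))
```

Even a small record like this states that the data exists, why access is limited and who to contact, which is the minimum needed for an interested researcher to get in touch.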
Whatever happens to your data at the end of a project should have been in the plan from the beginning. You should know from the start how much of your data can be shared, and you should already have thought about where and how you are going to share it. Of course, if you aren’t sure, there are teams that can help you make these decisions, and a wide range of advice is available, even on less common topics such as finding data to form part of your project, negotiating a Material Transfer Agreement in order to use it, securely destroying sensitive data or citing data appropriately, to name just a few!
With this post we bring our Love Data Week full circle, back to the teams that can support you, and the importance of a good plan. Thank you for joining our activities this week; we hope you enjoyed them.